How to Use a SQL Window to Create a Result Set
Windows and window functions were first introduced in SQL:1999. With a window, a user can optionally partition a data set, optionally order the rows in each partition, and specify a collection of rows (the window frame) that is associated with a given row.
The window frame of a row R is some subset of the partition containing R. For example, the window frame may consist of all the rows from the beginning of the partition up to and including R, based on the way rows are ordered in the partition.
A window function computes a value for a row R, based on the rows in the window frame of R.
For example, suppose you have a SALES table that has columns of CustID, InvoiceNo, and TotalSale. Your sales manager may want to know what the total sales were to each customer over a specified range of invoice numbers. You can obtain what she wants with the following SQL code:
SELECT CustID, InvoiceNo, SUM (TotalSale) OVER ( PARTITION BY CustID ORDER BY InvoiceNo ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) FROM SALES;
The OVER clause determines how the rows of the query are partitioned before being processed, in this case by the SUM function. A partition is assigned to each customer. Within each partition will be a list of invoice numbers, and associated with each of them will be the sum of all the TotalSale values over the specified range of rows, for each customer.
SQL:2011 has added several major enhancements to the original window functionality, incorporating new keywords.
How to partition a window into buckets with NTILE
The NTILE window function apportions an ordered window partition into some positive integer number n of buckets, numbering the buckets from 1 to n. If the number of rows in a partition m is not evenly divisible by n, then after the NTILE function fills the buckets evenly, the remainder of m/n, called r, is apportioned to the first r buckets, making them larger than the other buckets.
Suppose you want to classify your employees by salary, partitioning them into five buckets, from highest to lowest. You can do it with the following code:
SELECT FirstName, LastName, NTILE (5) OVER (ORDER BY Salary DESC) AS BUCKET FROM Employee;
If there are, for example, 11 employees, each bucket is filled with two except for the first bucket, which is filled with three. The first bucket will contain the three highest paid employees, and the fifth bucket will contain the two lowest paid employees.
How to nest window functions
Sometimes to get the result you need, the easiest way is to nest one function within another. SQL:2011 added the capability to do such nesting with window functions.
As an example, consider a case where a stock investor is trying to determine whether it is a good time to buy a particular stock. To get a handle on this, she decides to compare the current stock price to the price it sold for on the immediately previous 100 trades. She wonders, how many times in the previous 100 trades it sold for less than the current price.
To reach an answer, she makes the following query:
SELECT SaleTime, SUM ( CASE WHEN SalePrice < VALUE OF (SalePrice AT CURRENT ROW) THEN 1 ELSE 0 ) OVER (ORDER BY SaleTime ROWS BETWEEN 100 PRECEDING AND CURRENT ROW ) FROM StockSales;
The window encompasses the 100 rows preceding the current row, which correspond to the 100 sales immediately prior to the current moment. Every time a row is evaluated where the value of SalePrice is less than the most recent price, 1 is added to the sum.
The end result is a number that tells you the number of sales out of the previous hundred that were made at a lower price than the current price.
How to evaluate groups of rows
Sometimes the sort key you have chosen to place a partition in order will have duplicates. You may want to evaluate all rows that have the same sort key as a group. In such cases you can use the GROUPS option. With it you can count groups of rows where the sort keys are identical.
Here’s an example:
SELECT CustomerID, SaleDate, SUM (InvoiceTotal) OVER ( PARTITION BY CustomerID ORDER BY SaleDate GROUPS BETWEEN 2 PRECEDING AND 2 FOLLOWING ) FROM Customers;
The window frame in this example consists of up to five groups of rows: two groups before the group containing the current row, the group containing the current row, and two groups following the group containing the current row. The rows in each group have the same SaleDate, and the SaleDate associated with each group is different from the SaleDate values for the other groups.