Thursday, 9 March 2006
The RANK and DENSE_RANK Ranking Functions
posted by Stefan Delmarco
RANK and DENSE_RANK are the last of the new ranking (window) functions. The best analogy of how these functions work is to think of how you'd solve the following problem: You've been asked to write a report that will generate a list of the best selling products. You're required to produce a result set that contains the product name, total quantity sold and assign each product a ranking (i.e. 1st, 2nd, 3rd, etc.). So you might write something like this:
sum(TH.Quantity) as [TotalQuantity],
row_number() over(order by sum(quantity) desc) as [Ranking]
from Production.TransactionHistory TH
inner join Production.Product P
on TH.ProductID = P.ProductID
group by P.[Name]
order by [TotalQuantity] desc;
| Name | TotalQuantity | Ranking |
| ----------------- | ------------- | -------------------- |
| BB Ball Bearing | 374090 | 1 |
| Seat Stays | 187376 | 2 |
| Blade | 93688 | 3 |
| Chain Stays | 93688 | 4 |
| Fork End | 93688 | 5 |
| HL Crankarm | 64900 | 6 |
| HL Hub | 56264 | 7 |
| Decal 1 | 56250 | 8 |
| Decal 2 | 56250 | 9 |
| LL Mountain Pedal | 50922 | 10 |
| ... |
Whilst this looks to have fulfilled the requirements, it has a very subtle deficiency. If you look closely you'll see that the top 10 products actually include a couple of ties. Blade, Chain Stays, and Fork End all come in at an equal 93,688 sold. By using ROW_NUMBER to assign the ranking we have failed to take account the possibility of ties and assigned lower rankings to products that have sold equal amounts. Imagine if this query was for assigning prize money in a golf tournament! I'm sure the competitors would not be very happy about this 'feature'!
This is where the RANK function comes into play. In the same manner as all the previous windowed functions we looked at, RANK requires an OVER clause. RANK uses the ORDER BY in the OVER clause to assign a monotonically increasing value to each row, starting from 1. However, RANK looks for duplicates in the range. Where duplicate rows are found, the same RANK value is assigned to each row. In addition, when parsing duplicates, RANK counts the number of duplicates read and assigns the next row, that has a differing value, a RANK equal to the RANK of the duplicates plus the number of duplicates read. Using RANK our query then becomes:
sum(TH.Quantity) as [TotalQuantity],
rank() over(order by sum(quantity) desc) as [Ranking]
from Production.TransactionHistory TH
inner join Production.Product P
on TH.ProductID = P.ProductID
group by P.[Name]
order by [TotalQuantity] desc;
| Name | TotalQuantity | Ranking |
| ----------------- | ------------- | -------------------- |
| BB Ball Bearing | 374090 | 1 |
| Seat Stays | 187376 | 2 |
| Blade | 93688 | 3 |
| Chain Stays | 93688 | 3 |
| Fork End | 93688 | 3 |
| HL Crankarm | 64900 | 6 |
| HL Hub | 56264 | 7 |
| Decal 1 | 56250 | 8 |
| Decal 2 | 56250 | 8 |
| LL Mountain Pedal | 50922 | 10 |
| ... |
As you can see, RANK has correctly identified the 'ties' and assigned Blade, Chain Stays and Fork End an identical RANK. In addition, it has correctly assigned HL Crankarm a RANK of sixth as we have 3 products that came in third.
DENSE_RANK behaves identical to RANK except that it does not increment the rank each time a duplicate is read. Using RANK we have no fourth or fifth placed products as the 3 third placed products have hidden the fourth and fifth placed ranks (i.e. we have non-consecutive ranks). DENSE_RANK does not skip ranks in this manner. Changing our query to use DENSE_RANK becomes:
sum(TH.Quantity) as [TotalQuantity],
dense_rank() over(order by sum(quantity) desc) as [Ranking]
from Production.TransactionHistory TH
inner join Production.Product P
on TH.ProductID = P.ProductID
group by P.Name
order by [TotalQuantity] desc;
| Name | TotalQuantity | Ranking |
| ------------------ | ------------- | -------------------- |
| BB Ball Bearing | 374090 | 1 |
| Seat Stays | 187376 | 2 |
| Blade | 93688 | 3 |
| Chain Stays | 93688 | 3 |
| Fork End | 93688 | 3 |
| HL Crankarm | 64900 | 4 |
| HL Hub | 56264 | 5 |
| Decal 1 | 56250 | 6 |
| Decal 2 | 56250 | 6 |
| LL Mountain Pedal | 50922 | 7 |
| ... |
Next we'll explore the OVER clause in a bit more detail to see how partitions work.
