SUMMARY
This article describes how to dynamically rank rows when
you perform a SELECT statement by using a flexible method, which may be the
only possible solution and which is faster than the procedural solution. Row
numbering or ranking is a typical procedural issue. The solutions are typically
based on loops and temporary tables; therefore, they are based on SQL Server
loops and cursors. This technique is based on an auto join. The chosen
relationship is typically "is greater than." Count how many times each element
of a particular set of data fulfills the relationship "is greater than" when
the set is compared to itself.
Note The following examples are based on the
pubs database. By default, the
Northwind sample database and the
pubs sample database are not installed in SQL Server 2005. These databases can be downloaded from the Microsoft Download Center. For more information, visit the following Microsoft Web site:
After you download SQL2000SampleDb.msi, extract the sample database scripts by double-clicking SQL2000SampleDb.msi. By default, SQL2000SampleDb.msi will extract the database scripts and a readme file into the following folder:
C:\SQL Server 2000 Sample Databases
Follow the instructions in the readme file to run the installation scripts.
If you are using SQL Server 2005
We recommend that you use ranking functions that are provided as a new feature in SQL Server 2005. For more information about the ranking functions, visit the following Microsoft Developer Network (MSDN) Web site:
back to the top
Example 1
In this example:
- Set 1 is authors.
- Set 2 is authors.
- The relationship is "last and first names are greater
than."
- You can avoid the duplicate problem by comparing the first
+ last names to the other first + last names.
- Count the number of times the relationship is fulfilled by
count(*).
Query:
select rank=count(*), a1.au_lname, a1.au_fname
from authors a1, authors a2
where a1.au_lname + a1.au_fname >= a2.au_lname + a2.au_fname
group by a1.au_lname, a1.au_fname
order by rank
Use the following code in SQL Server 2005.
select rank() OVER (ORDER BY a.au_lname, a.au_fname) as rank, a.au_lname, a.au_fname
from authors a
order by rank
Result:
Rank Au_Lname Au_Fname
---- -------------- -----------
1 Bennet Abraham
2 Blotchet-Halls Reginald
3 Carson Cheryl
4 DeFrance Michel
5 del Castillo Innes
6 Dull Ann
7 Greene Morningstar
8 Green Marjorie
9 Gringlesby Burt
10 Hunter Sheryl
11 Karsen Livia
12 Locksley Charlene
13 MacFeather Stearns
14 McBadden Heather
15 O'Leary Michael
16 Panteley Sylvia
17 Ringer Albert
18 Ringer Anne
19 Smith Meander
20 Straight Dean
21 Stringer Dirk
22 White Johnson
23 Yokomoto Akiko
(23 row(s) affected)
back to the top
Example 2
In this example:
- Rank stores by the number of books sold.
- Set 1 is the number of books sold by store: select stor_id,
qty=sum(qty) from sales group by stor_id.
- Set 2 is the number of books sold by store: select stor_id,
qty=sum(qty) from sales group by stor_id.
- The relationship is "the number of books is greater
than."
- To avoid duplicates, you can (as an example) compare
price*qty instead of qty.
Query:
select rank=count(*), s1.stor_id, qty=sum(s1.qty)
from (select stor_id, qty=sum(qty) from sales group by stor_id) s1,
(select stor_id, qty=sum(qty) from sales group by stor_id) s2
where s1.qty >= s2.qty
group by s1.stor_id
order by rank
Result:
Rank Stor_Id Qty
---- ------- ---
1 6380 8
2 7896 120
3 8042 240
4 7067 360
5 7066 625
6 7131 780
(6 row(s) affected)
Note The values in the
Qty column are incorrect. However, the ranking of stores based on the quantity of books sold is correct. This is a defect of this method. You can use this method to return the ranking of stores if you do not care about the wrong quantity in the result.
Use the following code in SQL Server 2005.
select row_number() over (order by qty desc) as rank,s1.stor_id,s1.qty
from (select stor_id, qty=sum(qty) from sales group by stor_id) as s1
Result:
rank stor_id qty
------- ------- ------
1 7131 130
2 7066 125
3 7067 90
4 8042 80
5 7896 60
6 6380 8
(6 row(s) affected)
Note In SQL Server 2005, you can receive the correct result of the ranking and the quantity when you use the ranking functions.
back to the top
Example 3
In this example:
- Rank the publishers by their earnings.
- Set 1 is the total sales by publisher:
select t.pub_id, sales=sum(s.qty*t.price)
from sales s, titles t
where s.title_id=t.title_id
and t.price is not null
group by t.pub_id
- Set 2 is the total sales by publisher:
select t.pub_id, sales=sum(s.qty*t.price)
from sales s, titles t
where s.title_id=t.title_id
and t.price is not null
group by t.pub_id
- The relationship is "earns more money than."
Query:
select rank=count(*), s1.pub_id, sales=sum(s1.sales)
from (select t.pub_id, sales=sum(s.qty*t.price)
from sales s, titles t
where s.title_id=t.title_id
and t.price is not null
group by t.pub_id) s1,
(select t.pub_id, sales=sum(s.qty*t.price)
from sales s, titles t
where s.title_id=t.title_id
and t.price is not null
group by t.pub_id) s2
where s1.sales>= s2.sales
group by s1.pub_id
order by rank
Result:
Rank Pub_Id Sales
---- ------ --------
1 0736 1,961.85
2 0877 4,256.20
3 1389 7,760.85
(3 row(s) affected)
Note The values in the
Sales column are incorrect. However, the ranking of publishers based on the earnings is correct.
Use the following code in SQL Server 2005.
select rank() over (order by sales desc) as rank,s1.pub_id,s1.sales
from (select t.pub_id, sales=sum(s.qty*t.price)
from sales s inner join titles t
on s.title_id=t.title_id
where t.price is not null
group by t.pub_id) as s1
Result:
rank pub_id sales
------- ------ ---------
1 1389 2586.95
2 0877 2128.10
3 0736 1961.85
(3 row(s) affected)
Note You receive the correct result of the ranking and the earning when you use the ranking functions.
back to the top
Drawbacks
- Because of the cross join, this is not designed for working
with a large number of rows. It works well for hundreds of rows. On large
tables, make sure to use an index to avoid large scans.
- This does not work well with duplicate values. When you
compare duplicate values, discontinuous row numbering occurs. If this is not
the behavior that you want, you can avoid it by hiding the rank column when you
insert the result in a spreadsheet; use the spreadsheet numbering
instead.
Note If you are using SQL Server 2005, you can use the row_number() function to return the sequential number of a row, regardless of the duplicate rows.
Example:
select rank=count(*), s1.title_id, qty=sum(s1.qty)
from (select title_id, qty=sum(qty) from sales group by title_id) s1,
(select title_id, qty=sum(qty) from sales group by title_id) s2
where s1.qty >= s2.qty
group by s1.title_id
order by rank
Result:
Rank Title_Id Qty
---- -------- ----
1 MC2222 10
4 BU1032 60
4 BU7832 60
4 PS3333 60
7 PS1372 140
7 TC4203 140
7 TC7777 140
10 BU1111 250
10 PS2106 250
10 PS7777 250
11 PC1035 330
12 BU2075 420
14 MC3021 560
14 TC3218 560
15 PC8888 750
16 PS2091 1728
(16 row(s) affected)
back to the top
Benefits
- You can use these queries in views and result
formatting.
- You can shift the lower-ranked data more to the
right.
Example 1:
CREATE VIEW v_pub_rank
AS
select rank=count(*), s1.title_id, qty=sum(s1.qty)
from (select title_id, qty=sum(qty) from sales group by title_id) s1,
(select title_id, qty=sum(qty) from sales group by title_id) s2
where s1.qty >= s2.qty
group by s1.title_id
Query:
select publisher=convert(varchar(20),replicate (' ', power(2,rank)) +
pub_id +
replicate(' ', 15-power(2,rank))+': '),
earnings=qty
from v_pub_rank
Result:
Publisher Earnings
------------- --------
0736 : 1,961.85
0877 : 4,256.20
1389 : 7,760.85
Use the following code in SQL Server 2005.
CREATE VIEW v_pub_rank
AS
select rank() over (order by sales) as rank,s1.pub_id,s1.sales
from (select t.pub_id, sales=sum(s.qty*t.price)
from sales s, titles t
where s.title_id=t.title_id
and t.price is not null
group by t.pub_id) as s1
GO
select publisher=convert(varchar(20),replicate (' ', power(2,rank)) +
pub_id + replicate(' ', 15-power(2,rank))+': '),
earnings=sales
from v_pub_rank order by rank
GO
Result:
publisher earnings
-------------------- ---------------------
0736 : 1961.85
0877 : 2128.10
1389 : 2586.95
(3 row(s) affected)
Example 2:
CREATE VIEW v_title_rank
AS
select rank=count(*), s1.title_id, qty=sum(s1.qty)
from (select title_id, qty=sum(qty) from sales group by title_id) s1,
(select title_id, qty=sum(qty) from sales group by title_id) s2
where s1.qty >= s2.qty
group by s1.title_id
Query:
select Book=convert(varchar(45),replicate (' ', 2*rank) +
title_id +
replicate(' ', 35-2*rank)+': '),
qty
from v_title_rank
order by rank
Result:
Book Qty
------------------------------------------- ----
MC2222 : 10
BU1032 : 60
BU7832 : 60
PS3333 : 60
PS1372 : 140
TC4203 : 140
TC7777 : 140
BU1111 : 250
PS2106 : 250
PS7777 : 250
PC1035 : 330
BU2075 : 420
MC3021 : 560
TC3218 : 560
PC8888 : 750
PS2091 : 1728
(16 row(s) affected)
Use the following code in SQL Server 2005.
CREATE VIEW v_title_rank
AS
select rank() over (order by qty) as rank, s1.title_id,s1.qty
from (select title_id, qty=sum(qty) from sales group by title_id) as s1
GO
select Book=convert(varchar(45),replicate (' ', 2*rank) +
title_id + replicate(' ', 35-2*rank)+': '), qty
from v_title_rank
order by rank
GO
Result:
Book qty
--------------------------------------------- -----------
MC2222 : 10
BU1032 : 15
BU7832 : 15
PS3333 : 15
TC4203 : 20
TC7777 : 20
PS1372 : 20
BU1111 : 25
PS7777 : 25
PS2106 : 25
PC1035 : 30
BU2075 : 35
MC3021 : 40
TC3218 : 40
PC8888 : 50
PS2091 : 108
(16 row(s) affected)
back to the top