PRB: Windows Collations Ignore Single Quote and Hyphen When Ordering (305704)



The information in this article applies to:

  • Microsoft SQL Server 2000 (all editions)

This article was previously published under Q305704

SYMPTOMS

A single quote (') or an apostrophe (') and hyphen (-) are considered last when SQL Server 2000 sorts data stored with Microsoft Windows collations. To illustrate this behavior consider the following example:
CREATE TABLE t1(c1 VARCHAR(10) COLLATE Latin1_General_CI_AI)

INSERT INTO t1 VALUES ('car''s')
INSERT INTO t1 VALUES ('cars')
INSERT INTO t1 VALUES ('carps')
INSERT INTO t1 VALUES ('car-s')
GO
SELECT * FROM t1 ORDER BY c1 ASC
GO
				
The output is:
c1         
---------- 
carps
cars
car's
car-s

(4 row(s) affected)
				

CAUSE

In collation comparisons that use Windows collations, characters like a single quote (') or hyphen (-) are compared last, only after the regular alphabet characters are compared. For example, when SQL Server compares "car's" and "cars", SQL Server compares the alphabet characters first:

"cars and "cars"

Because those two are equal, the punctuation character (') is included in the comparison. The first string contains punctuation characters, whereas the second string does not. Thus, "car's" is greater than "cars". Similarly, "car's" is greater than "carps" but here SQL Server does not account for the punctuation character at all because the regular alphabet characters are already different:

"cars" is greater than "carps"

This behavior is the same as the "word sort" comparison routine in Microsoft Windows NT and is by design.

WORKAROUND

To work around this behavior, use a binary comparison collation. For example, use collation Latin1_General_Bin. Then, in the ORDER BY clause wrap the field in the LOWER or UPPER function. Use of the LOWER or UPPER function allows a binary ordering of the apostrophe and hyphen characters in an alphabetic/dictionary order of characters, while retaining letter case in the result set.

For example:
CREATE TABLE t1 (c1 varchar (10) COLLATE Latin1_General_Bin NULL )
GO
INSERT INTO t1 VALUES ('CARs')
INSERT INTO t1 VALUES ('car''s')
INSERT INTO t1 VALUES ('carps')
GO
SELECT * FROM t1 ORDER BY LOWER(c1) ASC
				
Here is the output:
c1         
---------- 
car's
car-s
cars
CARs

(4 row(s) affected)
				

Modification Type:MajorLast Reviewed:10/3/2003
Keywords:kbprb KB305704