FIX: LIKE Operator with Japanese Repeating Character That Follows a Wildcard Does Not Return Expected Results (315649)



The information in this article applies to:

  • Microsoft SQL Server 2000 (all editions)

This article was previously published under Q315649
BUG #: 223518 (SHILOH_BUGS)

SYMPTOMS

A query that uses the LIKE operator with a wildcard and searches for a particular Japanese character, may not return expected results if all of these conditions are true:
  • The LIKE search criteria includes a Katakana or Hiragana prolonged sound mark, Cho-On, or other iteration mark (a character that extends the sound of the preceding character or indicates that the preceding character should repeat). For example, any of the following characters:

    • Katakana-Hiragana Prolonged Sound Mark (Unicode character code U+30FC, or 0x815B in code page 932)
    • Halfwidth Katakana-Hiragana Prolonged Sound Mark (Unicode character code U+FF70, or 0xB0 in code page 932)
    • Hiragana Iteration Mark (Unicode character code U+309D, or 0x8154 in code page 932)
    • Hiragana Voiced Iteration Mark (Unicode character code U+309E, or 0x8155 in code page 932)
    • Katakana Iteration Mark (Unicode character code U+30FD, or 0x8152 in code page 932)
    • Katakana Voiced Iteration Mark (Unicode character code U+30FE, or 0x8153 in code page 932)
    • Ideographic Iteration Mark (Unicode character code U+3005, or 0x8158 in code page 932)
    NOTE: In the remainder of this article, the letter "P" represents one of the characters described in the preceding list.
  • P immediately follows a wildcard character in the search criteria.
  • P also occurs in the string being searched (for example in a column in a table); however, P is not the first character in the string.

RESOLUTION

To resolve this problem, obtain the latest service pack for Microsoft SQL Server 2000. For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

290211 INF: How To Obtain the Latest SQL Server 2000 Service Pack

NOTE: The following hotfix was created before the release of Microsoft SQL Server 2000 Service Pack 3.

The English version of this fix should have the following file attributes or later:
   Version       File name        Description
   ----------------------------------------------

   8.00.0558     Sqlsort.dll      SP2-based hotfix

   8.00.0558     Sqlservr.exe     SP2-based hotfix
				
NOTE: Because of file dependencies, the most recent hotfix or feature that contains the files may also contain additional files.


STATUS

Microsoft has confirmed that this is a problem in Microsoft SQL Server 2000.
This problem was first corrected in Microsoft SQL Server 2000 Service Pack 3.

MORE INFORMATION

In the text that follows:

P = A Kana Cho-On character, prolonged sound mark, or other iteration character.

K = Any character other than one of the characters that P may represent.

The character P generally extends the sound or the meaning of the preceding character. When you use character P, when SQL Server performs pattern matching, SQL Server expands P into a reoccurrence of the preceding character. In other words, KP=KK is a true expression, which is expected behavior.

Behavior of SQL Server without the Hotfix

Without the fix described in the "Resolution" section, SQL Server does not correctly handle a case where P immediately follows a wildcard character. If the character P immediately follows a wildcard character in a LIKE expression, the predicate only matches other strings that contain P if P is the first character in the string. KP LIKE %P% is considered to be untrue, although PK LIKE %P% returns true because P happens to be the first character in the string being searched. Without the fix, wildcards do not prevent the expansion of P to a duplicate of the preceding character. In other words, the expression KK LIKE K%P is considered to be true because P is expanded to the value of the previous (non-wildcard) character (KK LIKE K%P is true because KK LIKE K%K is true).

Behavior of SQL Server with the Hotfix

The hotfix only affects LIKE comparisons where P follows a wildcard. In LIKE expression evaluations that occur after you apply the fix, P no longer expands to the value of the preceding character if it immediately follows a wildcard. Also, in those cases where P is not expanded to the preceding character, it only matches another P in the string being searched. For example, with the search criteria K%P, because P follows the wildcard character it no longer expands to K. This means that KK LIKE K%P becomes false after the fix, although KP LIKE K%P remains true. Note that P still expands to a duplicate of the preceding character in cases where P follows a non-wildcard character. In other words, KP LIKE KK is still true after you apply the fix.

Here is a table that summarizes the behavior with and without the fix for a variety of expressions that involve P. The behavior of the first row changes because P is no longer expanded to a duplicate of the preceding character when P follows a wildcard. The behavior in the last row changes because P now correctly matches with another P when it follows a wildcard.

ExpressionWithout FixWith Fix
KK LIKE K%PTrueFalse (changed)
KP LIKE K%PTrueTrue
KP = (or LIKE) KKTrueTrue
PK LIKE PKTrueTrue
PK LIKE P%TrueTrue
PK LIKE %P%TrueTrue
KP LIKE %P%FalseTrue (changed)


Note that where P appears twice in the preceding expressions, each occurrence of P may be a different character of the type described in the "Symptoms" section. For example, after you apply the fix, KP LIKE '%P%' is considered to be a true expression even if the first P is Unicode character U+309D (Hiragana Iteration Mark) and the second P is U+30FD (Katakana Iteration Mark).

Modification Type:MinorLast Reviewed:9/27/2005
Keywords:kbHotfixServer kbQFE kbSQLServ2000sp3fix kbbug kbfix kbSQLServ2000preSP3fix KB315649