Q&A: SQL Server Driver 2.50 is DBCS-Enabled (136269)



The information in this article applies to:

  • Microsoft SQL Server 6.0
  • Microsoft SQL Server 6.5
  • Microsoft Open Database Connectivity 2.5

This article was previously published under Q136269

SUMMARY

This article answers general questions and provides background information on how the SQL Server Driver is DBCS-enabled. It is divided into the following sections:
  • What is DBCS?
  • What does "DBCS-enabled" imply?
  • Questions and Answers

MORE INFORMATION

What is DBCS?

A double-byte character set (DBCS) is a character encoding mechanism that accommodates the ideographic characters used in Far Eastern languages. A single-byte character set (SBCS) can represent at most 256 characters, because each character occupies one byte. In a DBCS, characters can be addressed with a 16-bit notation, that is, with two bytes (a double byte). A 16-bit notation can represent up to 65,536 (2^16) characters.

DBCS code pages contain both single-byte and double-byte characters. The single-byte characters in a DBCS code page conform to the 8-bit national standard for each country and correspond closely to the ASCII character set.

In a double-byte character set, certain ranges of byte values are designated as leading bytes (lead bytes). A leading byte, together with the byte that follows it, represents a single character. The second byte is called the trailing byte or trail byte. Each DBCS has its own set of lead-byte ranges and trail-byte ranges. Unlike leading bytes, trail bytes in some DBCSs can overlap with the 7-bit ASCII character set.

For example, the Shift JIS (Japanese Industrial Standard) character set has a trail-byte range of 0x40-0xFC (excluding 0x7F). That means a byte holding the value 0x7D can represent the second half of a Kanji character, not necessarily a close brace character (}).
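
The lead-byte and trail-byte scan described above can be sketched in a few lines of Python. This is only an illustration, not driver code: the function names are hypothetical, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are those of Shift JIS as implemented by Windows code page 932, which is an assumption of this sketch. Other DBCS code pages use different ranges.

```python
# Assumed lead-byte ranges for Shift JIS (Windows code page 932).
# Other DBCS code pages define different ranges.
SJIS_LEAD_RANGES = ((0x81, 0x9F), (0xE0, 0xFC))


def is_lead_byte(byte):
    """Return True if this byte value starts a double-byte character."""
    return any(low <= byte <= high for low, high in SJIS_LEAD_RANGES)


def classify_bytes(data):
    """Label each byte 'single', 'lead', or 'trail', scanning from the start."""
    labels = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i]) and i + 1 < len(data):
            # A lead byte consumes the next byte as its trail byte,
            # whatever value that byte has.
            labels += ["lead", "trail"]
            i += 2
        else:
            labels.append("single")
            i += 1
    return labels


# 0x83 0x7D is one double-byte character; the last byte is a real close brace.
sample = b"\x83\x7d}"
print(classify_bytes(sample))
```

Both 0x7D bytes in the sample have the same value, but only the second is labeled as a single-byte character. The scan has to start at a known character boundary, because a byte such as 0x7D cannot be classified in isolation.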

What does "DBCS-enabled" imply?

A program that is described as DBCS-enabled meets the following conditions when it runs on a DBCS platform:
  1. It can distinguish a trail byte from an ASCII character. For example, it can determine whether 0x7D is the trail byte of a Kanji character or a close brace when it runs on Japanese versions of Windows or Windows NT.
  2. It differentiates character-based semantics from byte-based semantics. For example, a function such as "CharCount" should return the number of characters in a DBCS string rather than the number of bytes, and a function such as "CharNext" should move to the next character rather than the next byte.
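
The second condition can be illustrated with a minimal Python sketch of character-based helpers. The names char_next and char_count are hypothetical stand-ins for the "CharNext" and "CharCount" functions mentioned above, and the lead-byte ranges again assume Shift JIS (code page 932).

```python
# Assumed lead-byte ranges for Shift JIS (Windows code page 932).
SJIS_LEAD_RANGES = ((0x81, 0x9F), (0xE0, 0xFC))


def is_lead_byte(byte):
    """Return True if this byte value starts a double-byte character."""
    return any(low <= byte <= high for low, high in SJIS_LEAD_RANGES)


def char_next(data, pos):
    """Return the byte offset of the character after the one at pos.

    pos must already point at a character boundary.
    """
    if pos >= len(data):
        return len(data)
    step = 2 if is_lead_byte(data[pos]) and pos + 1 < len(data) else 1
    return pos + step


def char_count(data):
    """Count characters, not bytes, by walking the string with char_next."""
    count = 0
    pos = 0
    while pos < len(data):
        pos = char_next(data, pos)
        count += 1
    return count


text = b"a\x83\x7d}"      # 'a', one double-byte character, '}'
print(len(text))           # byte-based length
print(char_count(text))    # character-based length
```

For this sample the byte-based length is 4 and the character-based length is 3, which is the difference a DBCS-enabled program has to respect.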

Questions and Answers

The following answers are based on connections to the English version of Microsoft SQL Server version 6.0.
  1. CAN I PUT A DBCS STRING INTO CHAR OR VARCHAR COLUMNS? CAN I RETRIEVE A DBCS STRING FROM THE SQL SERVER AND DISPLAY IT?

    Yes, if the SQL Server code page is the correct DBCS code page. Storing DBCS characters in a non-DBCS SQL Server (or, more generally, storing data from one code page in a SQL Server that uses a different code page) is not supported and may result in data loss in some circumstances.

    To display DBCS strings, your client application must run on a DBCS platform such as the Japanese version of Windows.
  2. CAN I USE A DBCS CHARACTER OR STRING IN A LIKE CLAUSE?

    Yes. Because the driver is DBCS-enabled, it parses trail bytes correctly. For example, it does not interpret a trail byte that has the same value as the percent sign (%) or underscore character (_) as a wildcard, and it ignores a trail byte that has the same value as the single quotation mark (') or close brace character (}).

    ODBC provides two wildcards in a LIKE clause: the percent sign matches zero or more of any character, and the underscore character matches any one character. When you connect to the English version of SQL Server version 6.0, however, the underscore character actually matches one byte rather than one character.
  3. CAN I USE DBCS CHARACTERS TO NAME MY TABLES, COLUMNS AND OTHER OBJECTS?

    Yes, if the SQL Server code page selected during installation is the correct DBCS code page.
  4. I AM TOLD THAT DBCS ISSUES WILL BE ADDRESSED IN THE ODBC 3.0 TIME FRAME. SINCE THE SQL SERVER DRIVER 2.50 HAS ALREADY BEEN DBCS-ENABLED, WHAT WILL BE NEW IN ODBC 3.0?

    ODBC 3.0 will address DBCS issues at the specification level. For example, in Kyle Geiger's book, "Inside ODBC," Chapter 9, section "ODBC 3.0", page 453, you can see two fields in a descriptor record: LENGTH and OCTET_LENGTH. Here, LENGTH specifies the number of characters in the column and OCTET_LENGTH gives the length of the column in bytes.
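
The byte-versus-character distinction behind answers 2 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the driver's parser: scan_pattern is a hypothetical helper, the lead-byte ranges assume Shift JIS (code page 932), and ESCAPE clauses are ignored. The sketch reports which bytes of a LIKE pattern are real wildcards and returns both the character length and the octet length of the pattern.

```python
# Assumed lead-byte ranges for Shift JIS (Windows code page 932).
SJIS_LEAD_RANGES = ((0x81, 0x9F), (0xE0, 0xFC))
WILDCARDS = (0x25, 0x5F)  # byte values of '%' and '_'


def is_lead_byte(byte):
    """Return True if this byte value starts a double-byte character."""
    return any(low <= byte <= high for low, high in SJIS_LEAD_RANGES)


def scan_pattern(pattern):
    """Return (wildcard_offsets, character_length, octet_length)."""
    wildcard_offsets = []
    character_length = 0
    i = 0
    while i < len(pattern):
        if is_lead_byte(pattern[i]) and i + 1 < len(pattern):
            # Double-byte character: its trail byte is data, never a wildcard.
            i += 2
        else:
            if pattern[i] in WILDCARDS:
                wildcard_offsets.append(i)
            i += 1
        character_length += 1
    return wildcard_offsets, character_length, len(pattern)


# One double-byte character whose trail byte is 0x5F, then a real '_' and '%'.
pattern = b"\x83\x5f_%"

# A scan that is not DBCS-aware treats every 0x5F or 0x25 byte as a wildcard.
naive = [i for i, b in enumerate(pattern) if b in WILDCARDS]

print(naive)
print(scan_pattern(pattern))
```

The naive scan reports a wildcard at offset 1, inside the double-byte character, while the DBCS-aware scan reports only offsets 2 and 3. The same pattern has a character length of 3 and an octet length of 4, which mirrors the LENGTH and OCTET_LENGTH distinction. Because the underscore matches one byte against the English version of SQL Server 6.0, it is the octet length, not the character length, that determines how many underscores span a given string there.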

Modification Type: Minor
Last Reviewed: 3/14/2005
Keywords: kbProgramming KB136269 kbAudDeveloper