BUG: Certain Field Terminators May Cause the BCP Utility or a BULK INSERT Statement to Import DBCS Data Incorrectly (323323)



The information in this article applies to:

  • Microsoft SQL Server 2000 (all editions)
  • Microsoft SQL Server 7.0

This article was previously published under Q323323
BUG #: 350831 (SHILOH_BUGS)
BUG #: 100728 (SQLBUG_70)

SYMPTOMS

Certain field terminators may cause the BCP utility or a BULK INSERT statement to import double-byte character set (DBCS) data incorrectly. In particular, using the vertical bar character (0x7C) as the field terminator is prone to this problem in a DBCS environment.

For example, the problem may occur when both of the following conditions are true:
  • The import file contains a DBCS character whose trailing byte is 0x7C, such as 0x907C.
  • The field terminator is the vertical bar character (0x7C).

CAUSE

This problem occurs because the field terminator may match the trailing byte of some DBCS characters. In such situations, BCP or BULK INSERT cannot determine whether that byte is data or a field terminator.

A DBCS character is made up of two bytes, a leading byte and a trailing byte. For example, the Japanese character for the word "vinegar" is represented by two bytes as 0x90 and 0x7C.
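The following Python sketch illustrates the ambiguity. It is not how BCP or BULK INSERT is implemented; it only assumes a reader that scans the file byte by byte for the terminator. The function name split_bytewise and the sample row are hypothetical.

```python
# Illustrative sketch only: mimics a byte-oriented scan for the field
# terminator. It does not reproduce the internals of BCP or BULK INSERT.

FIELD_TERMINATOR = b"\x7c"  # vertical bar

# One row that is meant to hold two fields:
#   field 1: the DBCS character 0x90 0x7C
#   field 2: the ASCII text "abc"
row = b"\x90\x7c" + FIELD_TERMINATOR + b"abc"


def split_bytewise(data: bytes, terminator: bytes) -> list:
    """Split on every occurrence of the terminator, ignoring DBCS boundaries."""
    return data.split(terminator)


fields = split_bytewise(row, FIELD_TERMINATOR)
print(len(fields))  # 3 fields instead of the intended 2
print(fields)       # [b'\x90', b'', b'abc']
```

In this sketch the trailing byte of the DBCS character is consumed as a terminator, so the first field is left with only the leading byte 0x90 and an extra empty field appears.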

WORKAROUND

To work around this problem, select a field terminator in the range 0x01 through 0x3F (inclusive) so that the terminator is not misinterpreted as part of a DBCS character.

For example: comma (0x2C), tab (0x09), or semicolon (0x3B).
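As a minimal sketch, a candidate terminator can be checked against the range given in this workaround before an import is run. The helper name is_safe_terminator is hypothetical, and requiring every byte of a multi-byte terminator to fall in the range is a conservative assumption that this article does not state.

```python
# Range taken from the workaround in this article: 0x01 through 0x3F inclusive.
SAFE_MIN = 0x01
SAFE_MAX = 0x3F


def is_safe_terminator(terminator: bytes) -> bool:
    """Return True if the terminator is non-empty and every byte is in 0x01-0x3F.

    Applying the check to every byte of a multi-byte terminator is an
    assumption made for this sketch.
    """
    return len(terminator) > 0 and all(SAFE_MIN <= b <= SAFE_MAX for b in terminator)


candidates = [
    ("comma", b","),
    ("tab", b"\t"),
    ("semicolon", b";"),
    ("vertical bar", b"|"),
]

for name, term in candidates:
    # Print the name, the hexadecimal value of the first byte, and the result.
    print(name, hex(term[0]), is_safe_terminator(term))
```

With these candidates, the comma, tab, and semicolon pass the check, and the vertical bar (0x7C) does not.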

STATUS

Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.

Modification Type: Major
Last Reviewed: 10/16/2003
Keywords: kbbug kbpending KB323323