Gatherer Logs Do Not Show Double-Byte Character Set Strings Correctly (328040)



The information in this article applies to:

  • Microsoft SharePoint Portal Server 2001

This article was previously published under Q328040

SYMPTOMS

The gatherer logs that are generated by the Microsoft SharePoint Portal Server indexing process do not show double-byte character set (DBCS) strings correctly.

CAUSE

This issue occurs because the module that processes the gatherer logs does not handle double-byte character set strings correctly.

MORE INFORMATION

Gatherer log files are created during every index update. The log files are stored in the following folder:

Data\FTData\SharePointPortalServer\GatherLogs\workspace_name

This folder is typically located in ProgramFiles\SharePointPortalServer on the drive where you installed the SharePoint Portal Server program files. For additional information about how to use the Gthrlog.vbs utility to view gatherer logs, click the following article number to view the article in the Microsoft Knowledge Base:

289653 How to Use the Gthrlog.vbs Utility to View Gatherer Logs

A double-byte character set is a character-encoding mechanism that is used to handle ideographic characters in Asian languages. A single-byte character set (SBCS) uses only one byte to represent each character, so it can represent at most 256 characters. A DBCS uses up to two bytes (16 bits) for each character, so it can represent up to 65,536 (2^16) characters.

DBCS code pages contain both single-byte and double-byte characters. The DBCS single-byte characters comply with the 8-bit national standards for each country and correspond closely to the ASCII character set.
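
As an illustration of the mixed single-byte and double-byte layout, the following minimal Python sketch encodes one ASCII letter and one hiragana letter and compares the encoded lengths. It assumes Python's built-in "shift_jis" codec as a stand-in for a Japanese DBCS code page; the sample characters are arbitrary and are not taken from the gatherer logs.

```python
# Sketch only: the "shift_jis" codec and the two sample characters are
# assumptions chosen for illustration.
samples = ["A", "\u3042"]  # Latin capital A and HIRAGANA LETTER A

# Map each sample character to the number of bytes in its encoded form.
lengths = {ch: len(ch.encode("shift_jis")) for ch in samples}

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} encodes to {n} byte(s)")
```

The ASCII letter stays one byte long, while the hiragana letter takes two bytes, so a string's byte count and character count can differ.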

In a double-byte character set, certain ranges of code points are designated as lead bytes. A lead byte, together with the byte that follows it, represents a single character. This second byte is named the trail byte. Each DBCS has a different set of lead-byte ranges and trail-byte ranges. Unlike lead bytes, trail bytes in some DBCSs can overlap with the 7-bit ASCII character set. For example, the Shift JIS (Japanese Industrial Standards) character set has a trail-byte range of 0x40-0xFC (excluding 0x7F). Therefore, a byte that holds the value 0x7D can be the second half of a Kanji character and is not necessarily a closing brace (}) as it would be in the 7-bit ASCII set.
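
The lead-byte and trail-byte behavior can be sketched in Python as follows. This is a hypothetical illustration, not the code that the gatherer log module uses. The names LEAD_RANGES, is_lead_byte, and split_characters are invented for this example, and the lead-byte ranges are the ones commonly documented for Shift JIS.

```python
# Sketch only: lead-byte ranges commonly documented for Shift JIS
# (an assumption of this example, not taken from this article).
LEAD_RANGES = ((0x81, 0x9F), (0xE0, 0xFC))


def is_lead_byte(b):
    """Return True if byte value b falls in an assumed lead-byte range."""
    return any(lo <= b <= hi for lo, hi in LEAD_RANGES)


def split_characters(data):
    """Split a byte string into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        # A lead byte takes the following byte as its trail byte.
        step = 2 if is_lead_byte(data[i]) and i + 1 < len(data) else 1
        chars.append(data[i:i + step])
        i += step
    return chars


# 'A', one double-byte character whose trail byte is 0x7D, then a real '}'.
data = b"A\x89\x7d}"

naive_count = data.count(b"}")                    # byte-wise search
aware_count = split_characters(data).count(b"}")  # character-wise search
print(naive_count, aware_count)
```

The byte-wise search counts the trail byte 0x7D as a closing brace, while the character-wise search counts only the real one. Code that scans DBCS text one byte at a time can misread strings in this way.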

RESOLUTION

A supported fix is now available from Microsoft, but it is only intended to correct the problem that is described in this article. Apply it only to computers that are experiencing this specific problem. This fix may receive additional testing. Therefore, if you are not severely affected by this problem, Microsoft recommends that you wait for the next SharePoint Portal Server 2001 service pack that contains this hotfix.

To resolve this problem immediately, contact Microsoft Product Support Services to obtain the fix. For a complete list of Microsoft Product Support Services phone numbers and information about support costs, visit the following Microsoft Web site:

NOTE: In special cases, charges that are ordinarily incurred for support calls may be canceled if a Microsoft Support Professional determines that a specific update will resolve your problem. The typical support costs will apply to additional support questions and issues that do not qualify for the specific update in question.

The Japanese version of this fix has the file attributes (or later) that are listed in the following table. The dates and times for these files are listed in coordinated universal time (UTC). When you view the file information, it is converted to local time. To find the difference between UTC and local time, use the Time Zone tab in the Date and Time tool in Control Panel.
 

Date     Time     Version     Size     File name 
------------------------------------------------------- 
27-Sep-2002  18:26  10.145.4629.35  843,776  msstools.dl
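
As an alternative to the Time Zone tab, the UTC-to-local conversion for the timestamp in the table can be sketched in Python. The UTC+9 offset (Japan Standard Time) is an assumption chosen for illustration. Substitute the offset for your own time zone.

```python
from datetime import datetime, timedelta, timezone

# File timestamp from the table above, expressed in UTC.
utc_stamp = datetime(2002, 9, 27, 18, 26, tzinfo=timezone.utc)

# Assumed local zone for this sketch: Japan Standard Time (UTC+9).
jst = timezone(timedelta(hours=9), "JST")

local_stamp = utc_stamp.astimezone(jst)
print(local_stamp.isoformat())  # 2002-09-28T03:26:00+09:00
```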

 
NOTE: Because of file dependencies, this update requires Microsoft SharePoint Portal Server 2001. For more information, visit the following Microsoft Web site:

STATUS

Microsoft has confirmed that this is a problem in Microsoft SharePoint Portal Server 2001.

Modification Type: Minor
Last Reviewed: 10/6/2005
Keywords: kbbug kbfix kbQFE kbSharePtPortalSvr2001preSP2fix KB328040