How to Use Site Server Search with Non-Standard Languages (248366)



The information in this article applies to:

  • Microsoft Site Server 3.0

This article was previously published under Q248366

SUMMARY

The word-breakers, word stemmers, and noise word lists that are included with Index Server and Site Server support only English (US and UK versions), German, French, Spanish, Japanese, Dutch, and Swedish. For all other languages, a neutral word-breaker is used. With planning and preparation, however, you can still catalog and search content in a non-standard language. To do this, follow the steps in the "More Information" section of this article.

MORE INFORMATION

NOTE: The following example uses Turkish; however, the principle is the same for all languages.
  1. All content must be tagged as Turkish, so that the language is clearly identified when the Gatherer service runs. To do this, use the following META tag:
    <meta http-equiv="content-language" content="TR">

    Refer to RFC2616 "Hypertext Transfer Protocol -- HTTP/1.1" and RFC1766 "Tags for the Identification of Languages" for details about valid tags.

  2. Search.asp must force the input to the page to be interpreted as Turkish, so that characters are matched correctly. To do this, use the following ASP statement at the top of the page:
    <% Session.CodePage = 1254 %>

    For additional information on code pages, see the Platform SDK.

  3. All searches must explicitly set the language of the search to Turkish; otherwise, the character matches will fail. To do this, set the LocaleID property of the search object to Turkish:
    Set Q = Server.CreateObject("MSSearch.Query")
    Q.LocaleID = 1055

    For a complete list of the Locale IDs, see the "International Features" topic in the "Windows Base Services" section of the Platform SDK.
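
Steps 2 and 3 can be combined in Search.asp roughly as follows. This is a minimal sketch, not a complete search page. The catalog name "TurkishCatalog", the form field "q", and the column list are hypothetical placeholders, and the Catalog, Query, Columns, MaxRecords, and CreateRecordSet members are assumed to behave as described in the Site Server Search documentation. The META tag from step 1 belongs in the content being cataloged, not in this page.
    <%@ Language=VBScript %>
    <%
    ' Step 2: interpret form input using the Turkish code page.
    Session.CodePage = 1254

    ' Step 3: create the query object and set the Turkish locale.
    Dim Q, RS
    Set Q = Server.CreateObject("MSSearch.Query")
    Q.LocaleID = 1055

    ' Hypothetical values: replace with your own catalog, field, and columns.
    Q.Catalog = "TurkishCatalog"
    Q.Query = Request("q")
    Q.Columns = "DocTitle, DocAddress"
    Q.MaxRecords = 50

    ' Run the query and list the address of each matching document.
    Set RS = Q.CreateRecordSet("sequential")
    Do While Not RS.EOF
        Response.Write Server.HTMLEncode(RS("DocAddress") & "") & "<br>"
        RS.MoveNext
    Loop
    %>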

Modification Type: Major
Last Reviewed: 3/28/2001
Keywords: kbprogramming KB248366