How to Modify the Noise Word List (293438)



The information in this article applies to:

  • Microsoft SharePoint Portal Server 2001

This article was previously published under Q293438

SUMMARY

This article describes how to modify the list of noise words. Noise words are words that do not add value to a query, such as "a," "an," "and," "the," and single letters of the alphabet. The indexing engine filters these words out to save index space and increase performance.

MORE INFORMATION

The shipped version of SharePoint Portal Server includes a predefined list of noise words for each language that Search supports. You can add or remove words from the list to suit the needs of your organization. Noise word lists are customizable, language-specific text files. They are stored in the Data\Ftdata\SharePointPortalServer\Config folder on the drive where the SharePoint Portal Server data files are installed. Each file contains a list of words, with one word on each line. You can use any text editor, such as Microsoft Notepad, to edit these files.

There is one file for each language; for example, the noise word list for US English is Noiseenu.txt. There is also a language-neutral list named Noiseneu.txt. Both the language-specific file and the Noiseneu.txt file are checked for each query; therefore, when you make changes to one list, you must make the same changes to the other list. You must also add any common case and accent variations of a word to each noise word list. For example, if you want the word "Microsoft" to be filtered out as a noise word, add all common variations that users may type in queries (Microsoft, microsoft, and MICROSOFT). You must perform a full update of the index to incorporate any changes.
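The manual edit described above can also be scripted. The following Python sketch is only an illustration and is not part of SharePoint Portal Server: the function names case_variations and add_noise_word are hypothetical, the file encoding is an assumption (check how your noise word files are actually encoded before you write to them), and accent variations are not generated and must still be added by hand. The sketch appends the missing case variations of a word to every noise word file that you pass in, one word per line.

```python
from pathlib import Path


def case_variations(word):
    """Return the word as typed plus lower, upper, and capitalized forms, without duplicates."""
    variations = []
    for candidate in (word, word.lower(), word.upper(), word.capitalize()):
        if candidate not in variations:
            variations.append(candidate)
    return variations


def add_noise_word(word, noise_files, encoding="utf-8"):
    """Append missing case variations of `word` to each noise word file.

    `noise_files` is a list of paths to existing noise word files
    (for example, the language-specific file and the language-neutral file).
    The encoding default is an assumption, not a documented value.
    Returns a dict that maps each file path to the list of words added to it.
    """
    added = {}
    for path in map(Path, noise_files):
        # The files hold one word per line; a missing file raises FileNotFoundError.
        existing = path.read_text(encoding=encoding).splitlines()
        known = {line.strip() for line in existing}
        new_words = [v for v in case_variations(word) if v not in known]
        if new_words:
            path.write_text("\n".join(existing + new_words) + "\n", encoding=encoding)
        added[str(path)] = new_words
    return added
```

To use the sketch, call add_noise_word with the word and the paths to both Noiseenu.txt and Noiseneu.txt in the Config folder, so that the two lists stay in step. Back up the files first, and perform a full update of the index afterward, because the script only edits the text files.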

For more information about editing noise word lists and a listing of the file names for each language, refer to the Advanced Topics section in Administrators Help.

Modification Type: Minor
Last Reviewed: 4/25/2005
Keywords: kbinfo KB293438