Large Text Files Are Not Fully Indexed (318747)
The information in this article applies to:
- Microsoft SharePoint Portal Server 2001
This article was previously published under Q318747
IMPORTANT: This article contains information about modifying the registry. Before you
modify the registry, make sure to back it up and make sure that you understand how to restore
the registry if a problem occurs. For information about how to back up, restore, and edit the
registry, click the following article number to view the article in the Microsoft Knowledge Base:
256986 Description of the Microsoft Windows Registry
SYMPTOMS
If you crawl documents on a computer that is running SharePoint Portal Server, large text files may not be fully indexed.
The Microsoft Search service may log an error message in the Microsoft Windows Event Viewer Application event log that is similar to:
Event Type: Warning
Event Source: Microsoft Search
Event Category: Gatherer
Event ID: 3035
Date: 1/1/2002
Time: 12:00:00 PM
User: N/A
Computer: COMPUTERNAME
Description:
One or more warnings or errors were logged to file <C:\Program Files\SharePoint Portal Server\Data\FTData\SharePointPortalServer\GatherLogs\WORKSPACE\WORKSPACE.1.gthr>. If you are interested in these messages, please, look at the file using the gatherer log query object (gthrlog.vbs, log viewer web page).
Context: SharePointPortalServer Application, WORKSPACE Catalog
The Content Source log may also contain error messages that are similar to:
Time: 1/1/2002 12:00:00 PM
Type: Document Added
Message: Error fetching URL, (8004173e - The document was too large to filter in its entirety. Portions of the document were not emitted.)
URL: file://./backofficestorage/localhost/sharepoint portal server/workspaces/HOME/Do...
Time: 1/1/2002 12:00:00 PM
Type: Document Added
Message: Error fetching URL, (8004173e - The document was too large to filter in its entirety. Portions of the document were not emitted.)
URL: \\.\backofficestorage\localhost\sharepoint portal server\workspaces\HOME\documen...
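The entries above follow a repeating Time / Type / Message / URL pattern, so a short script can list the documents that were cut off. The following is a minimal sketch. It assumes the Content Source log has been saved as plain text in the layout shown above. The function names are hypothetical and are not part of SharePoint Portal Server.

```python
import re

# Error code reported when a document is too large to filter completely.
TOO_LARGE_CODE = "8004173e"

# Field labels as they appear in the sample log entries.
FIELD_PATTERN = re.compile(r"^(Time|Type|Message|URL):\s*(.*)$")


def parse_entries(log_text):
    """Group consecutive Time/Type/Message/URL lines into one dict per entry."""
    entries = []
    current = {}
    for line in log_text.splitlines():
        match = FIELD_PATTERN.match(line.strip())
        if not match:
            continue  # skip anything that is not a recognized field line
        field, value = match.groups()
        # A new "Time" line starts a new entry.
        if field == "Time" and current:
            entries.append(current)
            current = {}
        current[field] = value
    if current:
        entries.append(current)
    return entries


def too_large_urls(log_text):
    """Return the URL of every entry whose message carries the 8004173e code."""
    return [
        entry.get("URL", "")
        for entry in parse_entries(log_text)
        if TOO_LARGE_CODE in entry.get("Message", "").lower()
    ]
```

Calling too_large_urls() on the saved log text returns the URLs exactly as logged. Truncated URLs (ending in "...") stay truncated.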
NOTE: To view the Content Source log for a workspace, browse to the following URL on your SharePoint Portal Server computer (where computer_name is the name of your SharePoint Portal Server computer, and workspace is the name of your workspace):
http://computer_name/workspace/portal/resources/updatelog.asp?Workspace=workspace
CAUSE
This issue can occur if some text files are too large for the server to index by using the default SharePoint Portal Server settings, which are configured for performance reasons.
RESOLUTION
Indexing Large Text Files
If you are indexing large text files (.txt), to resolve this issue, change the MaxTextFilterBytes registry value.
WARNING: If you use Registry Editor incorrectly, you may cause serious problems that may
require you to reinstall your operating system. Microsoft cannot guarantee that you can solve
problems that result from using Registry Editor incorrectly. Use Registry Editor at your own
risk.
To change the MaxTextFilterBytes registry value:
- Start Registry Editor (Regedt32.exe).
- Locate the following key in the registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex
- Double-click the MaxTextFilterBytes value, change the value to Decimal, and then type the new value. The value is the maximum size (in bytes) for files that the text filter indexes. (The default value is 25,000,000 bytes, or approximately 25 megabytes.)
See the "More Information" section of this article for a description of the MaxTextFilterBytes value.
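As an alternative to typing the value by hand, the same change can be written out as registration (.reg) file text and reviewed before it is imported. The following is a minimal sketch. It assumes that MaxTextFilterBytes is stored as a REG_DWORD value, which the Decimal option in the steps above suggests but this article does not state. The helper names and the sample limit of 50,000,000 bytes are hypothetical.

```python
# Key that holds MaxTextFilterBytes, as given in the steps above.
CONTENT_INDEX_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex"
)


def reg_dword(value):
    """Format an integer the way .reg files write DWORD data (8 hex digits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in a 32-bit DWORD")
    return "dword:%08x" % value


def build_reg_file(key_path, values):
    """Return .reg file text that sets each name in `values` under key_path."""
    lines = ["Windows Registry Editor Version 5.00", "", "[%s]" % key_path]
    for name, value in values.items():
        lines.append('"%s"=%s' % (name, reg_dword(value)))
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Hypothetical new limit: 50,000,000 bytes (the default is 25,000,000).
    new_limit = 50000000
    print(build_reg_file(CONTENT_INDEX_KEY, {"MaxTextFilterBytes": new_limit}))
```

The same warning applies to importing generated text as to editing the registry directly: back up the registry first, and check the generated values before you import them.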
Indexing Other Types of Large Documents
You can fully index most other document types by changing the MaxDownloadSize and MaxGrowFactor registry values:
- Start Registry Editor (Regedt32.exe).
- Locate the following key in the registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Search\1.0\Gathering Manager
- Double-click the MaxDownloadSize value, change the value to Decimal, and then type the new value. The value is the maximum size (in megabytes) for files that the gatherer downloads.
- Double-click the MaxGrowFactor value, change the value to Decimal, and then type the new value. The value is the size of the output for the index filter.
- Quit Registry Editor.
See the "More Information" section of this article for a description of the MaxDownloadSize and MaxGrowFactor values.
NOTE: After you make these changes, restart the Microsoft Search service. If you want your documents to be re-indexed immediately, do a full update on the content source that contains the large files.
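The Gathering Manager changes and the service restart can also be scripted. The following is a minimal sketch rather than a supported procedure. It assumes that both values are REG_DWORD values, that Python is available on the server, and that the short name of the Microsoft Search service is MSSearch. The function names are hypothetical.

```python
import subprocess

# Subkey of HKEY_LOCAL_MACHINE given in the steps above.
GATHERING_MANAGER_KEY = r"SOFTWARE\Microsoft\Search\1.0\Gathering Manager"

# Assumption: short service name of the Microsoft Search service.
SEARCH_SERVICE = "MSSearch"


def check_dword(name, value):
    """Return value if it is an integer that fits in a DWORD, else raise."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("%s must be an integer" % name)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("%s is out of range for a DWORD" % name)
    return value


def set_gatherer_values(values):
    """Write each name/value pair under the Gathering Manager key."""
    import winreg  # available on Windows only

    # Validate everything first so a bad value leaves the registry untouched.
    checked = {name: check_dword(name, v) for name, v in values.items()}
    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, GATHERING_MANAGER_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        for name, value in checked.items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)


def restart_search_service():
    """Stop and then start the search service so that it reads the new values."""
    for action in ("stop", "start"):
        subprocess.run(["net", action, SEARCH_SERVICE], check=True)
```

For example, set_gatherer_values({"MaxDownloadSize": 64, "MaxGrowFactor": 8}) followed by restart_search_service() would apply two illustrative values. The numbers 64 and 8 are placeholders, not recommendations from this article.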
Modification Type: Major
Last Reviewed: 1/3/2003
Keywords: kbprb KB318747