MORE INFORMATION
Q: How does Site Server Search differ from Index Server?
A: Site Server Search has a different purpose than Index Server. Index Server runs as a background service on a single computer. It monitors the file system for file change notifications and updates the catalog as it receives them. Site Server Search, on the other hand, is a powerful, distributable, multi-threaded crawler that can gather content from many sources, both local and remote, on the Internet as well as on an intranet. It can crawl both file systems and HTML hyperlinks in Web documents. Search has a fully configurable schema, while Index Server's is essentially static. Search can direct queries against multiple catalogs; Index Server cannot. Other content sources available to Search that Index Server does not support are Exchange Server public folders and (through ASP pages) any ODBC database.
Q: How do I troubleshoot Site Server Search?
A: Troubleshooting Search is best accomplished through the two available log files: the Gatherer Log and the Application Log in the Windows NT Event Viewer. A new Gatherer Log is created each time the crawler conducts a catalog build. By default it contains only failed tasks, but can be configured to log successful tasks and/or tasks that have failed due to restrictions you placed on the crawler or Robots.txt exclusions. The Event Viewer's Application Log contains messages for all major events that occur in the Site Server Search system. Another tool for troubleshooting is Windows NT Performance Monitor: Search adds many counters to allow you to oversee the performance of your system.
Q: When I try to build a catalog, nothing seems to happen. It doesn't seem to be indexing anything. What's wrong?
A: The most common cause of a failed catalog build is that the crawler cannot access the start page. There are several possible reasons for this: a misspelled start page, lack of network connectivity, or insufficient rights to the content source. For HTTP crawls, a good way to confirm that the start page is spelled correctly and that it can be reached is to connect to it with your browser. When you have successfully connected, copy the URL from the browser and paste it into your catalog definition. For file crawls, also confirm the spelling of the start page and that you can connect to the source. If the build is failing because of insufficient rights, configure the "Default content access account" in the Catalog Build Server properties with a Windows NT account that has adequate rights. For more fine-grained control, you can specify an account for the site in question only, by adding the site to the catalog definition's Sites List and configuring that site with an appropriate Windows NT account.
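The browser check for HTTP start pages can also be scripted. The following is a minimal Python sketch and is not part of Site Server; the function name check_start_page and the 10-second timeout are illustrative assumptions. It runs under the account of whoever executes it, so a success here does not prove that the crawler's content access account has the same rights.

```python
import urllib.error
import urllib.request


def check_start_page(url, timeout=10):
    """Try to fetch a start page and return (ok, detail).

    Hypothetical helper for illustration only. It separates the three
    failure causes named above as far as a plain HTTP client can.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, f"fetched, status {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but refused or could not find the page
        # (for example 401/403 for insufficient rights, 404 for a typo).
        return False, f"HTTP {err.code}: {err.reason}"
    except urllib.error.URLError as err:
        # No usable answer at all: name resolution or connectivity problem.
        return False, f"could not connect: {err.reason}"
    except ValueError as err:
        # The address is not a valid URL (for example, the scheme is missing).
        return False, f"malformed URL: {err}"


if __name__ == "__main__":
    import sys

    # Pass the start address exactly as it appears in the catalog definition.
    for address in sys.argv[1:]:
        print(address, check_start_page(address))
```

A False result with an HTTP 401 or 403 detail points toward the access account, while a "could not connect" detail points toward spelling of the host name or network connectivity.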
Q: How do I catalog file types other than the defaults (text, HTML, Microsoft Word, Microsoft Excel, and Microsoft PowerPoint)?
A: Site Server Search supports file formats through the use of filters. Using the extensible architecture of Site Server Search, developers can create custom filters (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/indexsrv/html/ixufilt_912d.asp) and wrap them with the IFilter COM interface. When the new filter is installed, Index Server will then index content of that document type. Adobe (http://www.adobe.com/main.html) is a notable third-party vendor that has made an IFilter available for its PDF file format.
Note: You must have Acrobat Reader installed on the same computer as Search for the filter to work properly.
Q: I've built a catalog, now how can I search it?
A: You can conduct a simple search against a catalog immediately after building it by using the MMC Administration tool. Open the Site Server MMC Administration tool, expand "Site Server Search," and select the host that contains the catalog you want to search. Under the host, expand "Search Server," expand the catalog, and then select the Search page. This page is also available through Web Administration.
Q: How do I know if the content I wanted is actually in my catalog?
A: If you enable "Log successful accesses" for a catalog, the gatherer makes an entry in the log for each request for content that receives a successful response from the server. You can then search the log for the content you want to verify.
Q: What is URL mapping?
A: URL mapping is a feature of Site Server Search that lets you crawl a body of content using one address, but store a different address in the catalog for accessing that same content. For example, consider a company with an Internet site that stages content on a mirrored server behind its corporate firewall. Using URL mapping, the company can crawl the site on its staging server, but store in the catalog the Internet address that users will use to display the content.
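The idea behind URL mapping can be illustrated with a short Python sketch. This is not Site Server's implementation or configuration format; the function name map_url, the prefix-replacement rule, and both host names are assumptions made for the example.

```python
def map_url(crawled_url, mappings):
    """Return the address to store in the catalog for a crawled address.

    `mappings` is a list of (crawl_prefix, stored_prefix) pairs. The first
    pair whose crawl prefix matches (ignoring case) is applied. Addresses
    that match no pair are stored unchanged. Hypothetical illustration only.
    """
    for crawl_prefix, stored_prefix in mappings:
        if crawled_url.lower().startswith(crawl_prefix.lower()):
            # Keep the path below the prefix, swap only the prefix itself.
            return stored_prefix + crawled_url[len(crawl_prefix):]
    return crawled_url


# Hypothetical staging server behind the firewall and its public mirror.
MAPPINGS = [("http://staging.example.local/", "http://www.example.com/")]

print(map_url("http://staging.example.local/products/index.htm", MAPPINGS))
# prints http://www.example.com/products/index.htm
```

The crawler reads from the staging address, while search results point users at the public address.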
Q: What languages are currently supported by Search?
A: Auto-detected:
Arabic
Chinese
Czech
Danish
German *
Greek
English */**
Spanish *
Finnish
French *
Hebrew
Hungarian
Italian *
Japanese *
Korean
Dutch *
Norwegian
Polish
Portuguese
Russian
Swedish *
Thai
Lithuanian
Hindi
* Languages fully supported for search and indexing
** Includes UK English
Q: Why do I get an error from my proxy server saying it couldn't locate my server when accessing the gatherer logs through MMC?
A: Your proxy server may be attempting to resolve your local computer name on the Internet. To prevent this, add your server name to the "Do not use proxy server for addresses beginning with:" box in the Advanced proxy settings for Internet Explorer, on the Connection tab under View Internet Options.
Q: Is there any way to prevent catalog builds from automatically restarting when I restart my computer?
A: Yes. The VBScript file <drive>:\Microsoft Site Server\Bin\Recovery.vbs lets you disable and re-enable automatic recovery for catalogs. The syntax is as follows:
Cscript <path>\recovery.vbs <project-name> [Enable/Disable]
This script is only provided as a temporary measure when troubleshooting a problem catalog. Permanently disabling catalog recovery is not recommended.
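Because the setting is meant to be temporary, it can help to wrap the two calls so that recovery is always re-enabled afterward. The following Python sketch is one way to do that, assuming the literal words Enable and Disable are what the script expects, as the syntax line suggests. The names recovery_command and recovery_disabled are hypothetical, and the script path and project name must be supplied by the caller.

```python
import subprocess
from contextlib import contextmanager


def recovery_command(script_path, project_name, enable):
    """Build the cscript command line shown in the syntax above."""
    # Assumption: the final argument is the literal word Enable or Disable.
    action = "Enable" if enable else "Disable"
    return ["cscript", script_path, project_name, action]


@contextmanager
def recovery_disabled(script_path, project_name, run=subprocess.run):
    """Disable automatic recovery, then re-enable it on exit.

    The re-enable step runs even if the troubleshooting code inside the
    `with` block raises an exception. `run` can be replaced for testing.
    """
    run(recovery_command(script_path, project_name, enable=False), check=True)
    try:
        yield
    finally:
        run(recovery_command(script_path, project_name, enable=True), check=True)
```

A troubleshooting session would then be written as `with recovery_disabled(script_path, project_name): ...`, with the restart and investigation steps placed inside the block.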