Configuring the F5 Networks Big-IP Load Balancer to allow successful SharePoint Portal Server 2003 crawling (889652)



The information in this article applies to:

  • Microsoft Office SharePoint Portal Server 2003

SUMMARY

Background

In a Microsoft Office SharePoint Portal Server 2003 farm topology with separate Web front-end and index server computers, the index server computers index portal content by calling a Web service that is hosted on the Web front-end computers. If the network that hosts the farm is configured so that portal content indexing traffic goes through an F5 Networks Big-IP Load Balancer, you may see multiple "address not found" errors in the SharePoint Portal Server 2003 gatherer logs. The following example of this error is taken directly from a SharePoint Portal Server 2003 gatherer log:

DateTime Add sps://hlbtest.testnet.net/site$$$people/bucketid=3/itemid=22857

The address could not be found, (0x80041208 - The address appears to be in error. Check that the address is valid. )

If you examine the network traffic associated with the crawling process, you may also see multiple HTTP 400 (bad request) errors being returned from the SharePoint Portal Server 2003 Web front-end computers. These same HTTP 400 (bad request) errors will also be present in the IIS HTTP logs. These errors typically correspond to indexing failures for user profiles in the SharePoint Portal Server 2003 People database. As a result, even though there is a user profile record for a particular user in the People database, a portal search for that user will return no results.
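
To gauge how many items are affected, you can scan the gatherer log for this error code. The following Python sketch is only an illustration. It assumes that the log has been exported to plain text and that each error line follows the address line it refers to, as in the example above. The function name and the sample text are hypothetical.

```python
import re

ERROR_CODE = "0x80041208"  # error code shown in the gatherer log example


def failed_addresses(log_text):
    """Return the sps:// addresses that are followed by the 0x80041208 error.

    Assumption: a plain-text export in which the error line appears after
    the address line it refers to, possibly separated by blank lines.
    """
    failures = []
    last_address = None
    for line in log_text.splitlines():
        match = re.search(r"sps://\S+", line)
        if match:
            # Remember the most recent address until its outcome is known.
            last_address = match.group(0)
        elif ERROR_CODE in line and last_address is not None:
            failures.append(last_address)
            last_address = None
    return failures


sample = """DateTime Add sps://hlbtest.testnet.net/site$$$people/bucketid=3/itemid=22857

The address could not be found, (0x80041208 - The address appears to be in error. Check that the address is valid. )
"""
print(failed_addresses(sample))
```

Addresses that contain "people" in the path, as in the sample, correspond to user profiles in the People database.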

MORE INFORMATION

If the HTTP or HTTPS requests used during portal content indexing resolve to a load-balancing virtual IP address on an F5 Networks Big-IP Load Balancer, and the Big-IP Load Balancer "OneConnect" engine or any kind of Layer 7 load-balancing method is being used, you may see these errors and experience poor performance while the crawl is in progress unless specific Big-IP configuration steps are taken.

The Big-IP "OneConnect" engine is a performance enhancement feature of the Big-IP Load Balancer that "pools" multiple "front-end" client TCP connections into a single "back-end" server TCP connection. The "address not found" errors result from an interoperability issue between the Microsoft .NET Framework and either the Big-IP "OneConnect" engine or any kind of Layer 7 load-balancing method.

Note In this article, Layer 7 load balancing means any method that uses cookie-related persistence (active, passive, and others), SSL Persistence, SIP Persistence, or packet inspection at Layer 4 and higher.

If your SharePoint Portal Server 2003 deployment requires a Layer 7 load-balancing method to be in place, there are several options for correcting these errors:

Option 1

  • On the Big-IP pool containing the IP addresses of your SharePoint Portal Server 2003 deployment front-end Web server computers, configure the Header to Insert property to insert a Connection: Close header. This effectively disables keep-alives for this pool.
  • Configure the Big-IP to auto-increment the TCP ephemeral ports used by the SNAT address.
Note This option has the disadvantage of disabling keep-alives for the pool for all portal traffic, not just crawling traffic. While the crawl is in progress, you may experience extremely poor performance when you try to browse your portal site.
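
The effect of the inserted header on a single request can be sketched in Python as follows. This is not how Big-IP implements the Header to Insert property. It only illustrates the request that a Web front-end computer would receive, assuming that the resulting request carries a single Connection: close header. The function name and the sample request are hypothetical.

```python
def insert_connection_close(raw_request: bytes) -> bytes:
    """Return the request with a single 'Connection: close' header.

    Assumption: any Connection header already present is dropped so that
    only one value remains. Big-IP's actual behavior is not modeled here.
    """
    head, separator, body = raw_request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    request_line, headers = lines[0], lines[1:]
    headers = [h for h in headers if not h.lower().startswith(b"connection:")]
    headers.append(b"Connection: close")
    return b"\r\n".join([request_line] + headers) + separator + body


request = (
    b"GET /default.aspx HTTP/1.1\r\n"
    b"Host: hlbtest.testnet.net\r\n"
    b"Connection: Keep-Alive\r\n"
    b"\r\n"
)
print(insert_connection_close(request).decode("ascii"))
```

Under HTTP/1.1, a server that receives Connection: close closes the connection after it sends the response, which is why the header disables keep-alives for every request that passes through the pool.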

Option 2

  • On the Big-IP pool containing the IP addresses of your SharePoint Portal Server 2003 deployment front-end Web Server computers, configure the Header to Insert property to insert a Connection: Close header. This effectively disables keep-alives for this pool.
  • Create a second pool on your Big-IP load balancer that contains the IP addresses of the same SharePoint Portal Server 2003 front-end Web server computers, but do not configure a Header to Insert property for this pool.
  • Create a Big-IP address class, and add the IP address of all index servers in your SharePoint Portal Server deployment to this class.
  • Write a load-balancing rule that examines the IP address of the client making a request that comes to the Big-IP load balancer. This rule would then:
    • Compare that address to the address class that contains the IP addresses of the index servers in your SharePoint Portal Server 2003 deployment. If there is a match between the client IP address and any one of the IP addresses in the address class, direct the request to the pool that contains the Connection: Close Header Insert. Otherwise, direct the request to the pool that does not contain the Connection: Close Header Insert.
  • Configure the Big-IP virtual server to use the new rule (instead of a pool).
  • Configure the Big-IP to auto-increment the TCP ephemeral ports used by the SNAT address.
Note This option has the advantage of disabling keep-alives only for crawling traffic. All other portal traffic uses a pool for which keep-alives are still enabled. You should see no degradation in performance when you browse or navigate your portal site while the crawl is in progress.
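
The decision that the load-balancing rule makes can be sketched in Python as follows. This is not Big-IP rule syntax. It only models the logic described in the steps above. The IP addresses and pool names are hypothetical placeholders for the values in your own deployment.

```python
import ipaddress

# Hypothetical address class: the IP addresses of the index servers.
INDEX_SERVER_CLASS = {
    ipaddress.ip_address("10.0.0.11"),
    ipaddress.ip_address("10.0.0.12"),
}

# Hypothetical pool names.
CRAWL_POOL = "sps_wfe_connection_close"  # pool with the Connection: Close header insert
BROWSE_POOL = "sps_wfe_keepalive"        # pool without the header insert


def select_pool(client_ip: str) -> str:
    """Choose the pool for a request based on the client IP address."""
    if ipaddress.ip_address(client_ip) in INDEX_SERVER_CLASS:
        return CRAWL_POOL   # crawler traffic: keep-alives disabled
    return BROWSE_POOL      # all other portal traffic: keep-alives enabled


print(select_pool("10.0.0.11"))     # request from an index server
print(select_pool("192.168.1.50"))  # request from a browsing client
```

Because both pools contain the same Web front-end computers, the rule changes only whether the Connection: Close header is inserted, not which servers can handle the request.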

Option 3 - If you require persistence, but no cookies:

  • You do not need an additional pool.
  • Configure your Big-IP pool that contains the IP addresses of your SharePoint Portal Server 2003 deployment front-end Web server computers to use Simple Persistence. You do not need to specify the Connection: Close Header to Insert property.
  • You do not need to auto-increment the SNAT ephemeral ports.
  • You do not need a load-balancing rule that examines the IP address of the client making a request that comes to the Big-IP load-balancer.
Note This kind of persistence will not work if your clients (including your crawler) connect to your SharePoint Portal Server 2003 deployment through a proxy server. This is because Simple Persistence uses the IP address of your clients for persistence and a proxy server typically masks the client IP addresses.
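
The proxy limitation can be illustrated with a toy Python model of persistence that is based on the source address. This is a sketch under simplifying assumptions, not a description of the Big-IP implementation. Timeouts and the actual load-balancing method are not modeled, and the class name, member names, and IP addresses are hypothetical.

```python
from typing import Dict, List


class SimplePersistence:
    """Toy model: each source IP address is pinned to one pool member."""

    def __init__(self, members: List[str]) -> None:
        self.members = members
        self.table: Dict[str, str] = {}
        self.next_index = 0

    def pick(self, source_ip: str) -> str:
        if source_ip not in self.table:
            # New source address: assign pool members in round-robin order.
            self.table[source_ip] = self.members[self.next_index % len(self.members)]
            self.next_index += 1
        # Known source address: reuse the remembered member.
        return self.table[source_ip]


# Three requests from two clients that connect directly.
direct = SimplePersistence(["wfe1", "wfe2"])
print([direct.pick(ip) for ip in ["10.0.1.5", "10.0.1.6", "10.0.1.5"]])

# The same requests through a proxy: every request shows the proxy address.
proxied = SimplePersistence(["wfe1", "wfe2"])
print([proxied.pick("10.0.9.1") for _ in range(3)])
```

In the proxied case, every client is pinned to the same pool member, because the load balancer sees only the address of the proxy server.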

If your SharePoint Portal Server 2003 deployment does not require a Layer 7 load-balancing method to be in place, then set your Persistence to "None" and do not configure the Big-IP to auto-increment the ephemeral ports used by the SNAT address.

Option 4

  • Add another SharePoint Portal Server 2003 Web front-end computer to your SharePoint Portal Server 2003 deployment.
  • Do not include this server in your load-balanced pool on the Big-IP load-balancer.
  • Create a hosts file entry on all the SharePoint Portal Server 2003 index servers in your server farm that resolves the host name in the portal URL to the new Web front-end computer instead of to the load-balancing virtual IP address. By doing this, all portal crawling traffic will be directed to this one Web front-end computer instead of to the Big-IP load-balancer.
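
To confirm that an index server has the intended entry, you can check the contents of its hosts file (on Windows, %SystemRoot%\system32\drivers\etc\hosts). The following Python sketch parses hosts-file text and returns the address that is mapped to a host name. The function name and the IP address 10.0.0.25 are hypothetical, and the host name is taken from the gatherer log example earlier in this article.

```python
def resolve_from_hosts(hosts_text: str, hostname: str):
    """Return the first IP address mapped to hostname in hosts-file text, or None."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split the entry on whitespace.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and wanted in (name.lower() for name in entry[1:]):
            return entry[0]
    return None


sample_hosts = """
# Hypothetical entry: send crawl traffic to the dedicated Web front-end computer
10.0.0.25    hlbtest.testnet.net
"""
print(resolve_from_hosts(sample_hosts, "hlbtest.testnet.net"))
```

If the function returns None or the load-balancing virtual IP address, crawling traffic from that index server still goes through the Big-IP load balancer.
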
The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, regarding the performance or reliability of these products.
Microsoft provides third-party contact information to help you find technical support. This contact information may change without notice. Microsoft does not guarantee the accuracy of this third-party contact information.

Modification Type: Major
Last Reviewed: 1/5/2005
Keywords: kb3rdparty kbConfig kbtshoot kberrmsg kbprb KB889652 kbAudITPRO