How documents are opened from a Web site in Office 2003 (838028)



The information in this article applies to:

  • Microsoft Office Excel 2003
  • Microsoft Office PowerPoint 2003
  • Microsoft Office Word 2003

INTRODUCTION

This article describes the process that is used by Microsoft Office 2003 to open Microsoft Office Word 2003 documents, Microsoft Office Excel 2003 spreadsheets, and Microsoft Office PowerPoint 2003 presentations by using hyperlinks or Web folders in Microsoft Internet Explorer. The process involves several additions that have been made to enhance Web collaboration. These additions may affect existing Web solutions that rely on previous Office behavior. The information that is provided is for Web solution developers who want a better understanding of the technical process that Office uses to handle document downloading and editing from an HTTP resource.

MORE INFORMATION

Office 2003 is designed to make a more collaborative workspace. Therefore, several changes have been made to how Office 2003 works with Web content. These changes help to build Web solutions that make Office documents fully compatible with the Office 2003 system. This article describes these changes from a technical perspective. These changes provide better authoring features for the following Web servers that support Office 2003:
  • Microsoft Windows SharePoint Services
  • Microsoft SharePoint Portal Server
  • Microsoft Exchange Web Store
Note The term "Office" applies to the following products:
  • Microsoft Office Word 2003
  • Microsoft Office Excel 2003
  • Microsoft Office PowerPoint 2003
The term "document" applies to any file or any template that can be opened in Word 2003, in Excel 2003, or in PowerPoint 2003, regardless of the file format.

Hyperlinking in Office 2003 by using HLINK and URLMON

Like earlier versions of Office, Office 2003 implements hyperlinking behavior by using the publicly exposed OLE interfaces of the URL Moniker component (Urlmon.dll) from Internet Explorer. The API that is provided by URLMON lets Office treat a URL resource like any other OLE link source. Additionally, the URLMON API provides methods for asynchronous navigation, for redirection, and for content sharing between processes.

To handle navigation history and backward navigation, Office uses the public interfaces of the Microsoft Hyperlink Library (Hlink.dll) to create hyperlinks, to bind to hyperlinks, and to move to hyperlinks. HLINK is a high-level wrapper for the features that are exposed by URLMON. HLINK gives Office applications a common framework to handle the basic tasks of hyperlink behavior.

Opening an Office document from Internet Explorer

When you click a hyperlink to an Office document from a Web page in Internet Explorer, the host frame navigates to the hyperlink resource by using URLMON. URLMON downloads the file content by using an HTTP GET command. After URLMON obtains the resource, URLMON looks at any one of the three following locations to identify the content type:
  • The associated MIME type that is specified in the HTTP header
  • The CLSID as it is saved in a structured storage document
  • The file name extension, if it is preserved in the URL string
If the type is associated with an Office application, URLMON creates an OLE instance of the target application. URLMON prompts the OLE instance to load the content by using the IPersistMoniker interface of the OLE object. URLMON passes the URL Moniker that URLMON creates for the resource to Office. Office then wraps the URL Moniker in a new HLINK object. After the URL Moniker is bound to the HLINK object, Office can load the file and then display the file to the user.
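
The content-type lookup that is described in this section can be sketched as follows. This is a minimal illustration in Python, not the URLMON implementation. The function name, the abbreviated lookup tables, and the order in which the three locations are checked are assumptions that are made for this example. The article states only that URLMON looks at any one of the three locations.

```python
import posixpath
from urllib.parse import urlparse

# Abbreviated, illustrative lookup tables (assumption: a real client
# reads these associations from the system registry).
MIME_TO_APP = {
    "application/msword": "Word",
    "application/vnd.ms-excel": "Excel",
    "application/vnd.ms-powerpoint": "PowerPoint",
}
CLSID_TO_APP = {
    "{00020906-0000-0000-C000-000000000046}": "Word",   # Word.Document.8
    "{00020820-0000-0000-C000-000000000046}": "Excel",  # Excel.Sheet.8
}
EXT_TO_APP = {
    ".doc": "Word", ".dot": "Word",
    ".xls": "Excel", ".xlt": "Excel",
    ".ppt": "PowerPoint", ".pot": "PowerPoint",
}


def resolve_application(content_type, storage_clsid, url):
    """Return the name of the associated application, or None.

    The check order (MIME type, then CLSID, then extension) is an
    assumption of this sketch.
    """
    # 1. MIME type from the HTTP Content-Type header, without parameters.
    if content_type:
        mime = content_type.split(";", 1)[0].strip().lower()
        if mime in MIME_TO_APP:
            return MIME_TO_APP[mime]
    # 2. CLSID that is saved in a structured storage document.
    if storage_clsid and storage_clsid.upper() in CLSID_TO_APP:
        return CLSID_TO_APP[storage_clsid.upper()]
    # 3. File name extension, if it is preserved in the URL path.
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return EXT_TO_APP.get(extension)
```

For example, a response that is served as application/octet-stream from a URL that ends in .xls falls through to the extension check and resolves to Excel.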

The full process of loading from a moniker and then using HLINK and URLMON to bind to Web content is beyond the scope of this article. For more detail on the programming aspects of this process, see the documentation on the Microsoft Developer Network.

For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

178853 HLINKAXD demonstrates a hyperlinking active document

There is one critical drawback to this approach. URL monikers that are provided by Internet Explorer are typically read-only. You can open content and modify content, but you cannot save content back to the server. When you save content back to the storage that is provided by the moniker, the modifications are applied to the content in the Internet Explorer Temporary Internet Files cache. However, the modifications are not applied to the content on the Web server. To resolve this drawback, the concept of the publishing moniker was introduced in Office 2000 and is used in later versions.

Making a URL moniker have read access and write access by using MSDAIPP

With the introduction of Office 2000, the capabilities of URLMON are extended to support full write access to a publishing server that supports either FrontPage Server Extensions (FPSE) or the HTTP 1.1 command extensions for Web Distributed Authoring and Versioning (DAV).

Support for full write access is completed by using a protocol provider extension to URLMON. The protocol provider extension to URLMON permits binding through a component that is named the Microsoft OLE DB Provider for Internet Publishing (Msdaipp.dll). By using a set of flags to URLMON, a host can request binding by using a specialized URL moniker type that uses MSDAIPP. Office refers to this as a publishing moniker. The publishing moniker uses MSDAIPP to open and to save the content directly on the server. This is an important step to extend the capabilities of URLMON.

However, there is a drawback. The MSDAIPP component uses its own session of the Windows Internet (WININET) API, not the session in use by Internet Explorer itself. Therefore, non-persisted session information, such as server cookies, is unavailable in MSDAIPP requests. This makes some servers require re-authentication or re-navigation to the URL for MSDAIPP to communicate with those servers. Additionally, to avoid obtaining "stale" data that may have been changed by another user, MSDAIPP re-acquires the Web content after successfully locking the Web content for write access. This causes a second HTTP GET request or a second FPSE POST request to the Web server for the document content.

To work around this drawback, a modified approach is introduced in Office 2000 Service Release 1. Instead of trying to bind by using a publishing moniker at load time, Office binds to the document by using the typical read-only URL moniker that is provided by Internet Explorer. When you want to save the file, Office tries to switch to the publishing moniker to perform a save back to the server, if the server supports Web publishing. If re-authentication is required because of the change in session, you are prompted for credentials on save instead of on open. If you want to read the file without saving the file, Office avoids the costly switch-of-context to a publishing moniker. Office also avoids a server lock on the resource. This is a compromise approach.
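
The deferred switch that is described in the previous paragraph can be modeled with a small state machine. The following Python sketch is a toy model only. The class name, the attribute names, and the event strings are hypothetical, and the sketch does not call URLMON or MSDAIPP.

```python
class WebDocument:
    """Toy model of the Office 2000 SR-1 open-then-switch behavior."""

    def __init__(self, url, server_supports_publishing):
        self.url = url
        self.server_supports_publishing = server_supports_publishing
        self.context = "read-only"  # bound through the browser's URL moniker
        self.events = []

    def open(self):
        # Content comes from the browser download: no server lock,
        # no second GET, and no new session at this point.
        self.events.append("load from read-only URL moniker")

    def save(self):
        if not self.server_supports_publishing:
            self.events.append("save to server refused")
            return False
        if self.context != "publishing":
            # The switch of context happens only on the first save.
            # This is where a credentials prompt may appear.
            self.context = "publishing"
            self.events.append("switch to publishing moniker")
        self.events.append("write content to server")
        return True
```

A document that is only read never leaves the "read-only" context in this model, which mirrors how Office avoids the switch and the server lock when no save is requested.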

For additional information about some of the changes that have been made to Office 2000 Service Release 1 to mitigate the effects of opening Web documents by using the publishing moniker context, click the following article numbers to view the articles in the Microsoft Knowledge Base:

185978 Double GET requests and cookies are lost with Word 2000 or Excel 2000

266263 BUG: Word 2000 and Excel 2000 display ASP source when using MIME type to stream data

247318 BUG: Word 2000 and Excel 2000 do not redirect correctly when using Response.Redirect

264143 FIX: ASP session variables empty when Office 2000 MIME types are streamed with Internet Explorer

Recognizing drawbacks with the approaches that are used by previous Office versions

The compromise approach that is used by Office 2000 Service Release 1 and by Office XP is well suited for browsing documents and for saving these documents to the server. However, the compromise approach has drawbacks. The drawbacks become more noticeable as Web developers build more sophisticated Web-based document management systems that are meant to more seamlessly integrate with Microsoft Office.

The most important drawback is delaying the switch-of-context until after a user tries to save or to perform some explicit action that requires write access. The document resource is not locked and may be changed by another user or another process during the time that the first user has the file open. If the first user then tries to save, the changes of the second user are lost. Alternatively, the first user is faced with the choice of discarding their changes without knowing what the second user has changed.

Another drawback occurs because the author permissions of the user are unknown until the switch-of-context occurs. The user is not notified that they do not have permission to save the file until the user makes the actual request to save the file. The user must be notified that they do not have permissions to save the file before the file is opened for editing. This is the drawback that led to the approach that is taken in Office 2000 Service Release 1.

Identifying changes to the hyperlink process for Office 2003

There are a growing number of users who are using Office as a front end for document collaboration over HTTP intranets. Therefore, the drawbacks of the previous approach are acute. Changes are required to detect the difference between a shared document and a browsed document. Office 2003 introduces new features to the hyperlinking process to work around the drawbacks.

Understanding Microsoft Office Protocol Discovery

When an Office application receives a request to open a Web resource, the Office application has to make the following decisions about how to open the Web resource:
  • Open the resource as read-only from the content that is downloaded by Internet Explorer. This content is opened in browse mode.
  • Open the resource as read/write with a document lock on the server for exclusive access. This content is opened in edit mode.
The decision about how to open the Web resource is resolved by investigating the folder path where the document comes from and by investigating the capabilities of the server that manages that path. To determine what capabilities the server supports, Office 2003 issues an HTTP 1.1 standard OPTIONS command. The OPTIONS command requests that the server identify what commands and what methods that the server supports for the folder where the document is located. The server identification is done according to the rules that are outlined in RFC 2616. An HTTP 1.1-compatible Web server responds to the OPTIONS request with the list of methods that are supported for the uniform resource identifier (URI). Office evaluates the response and then looks for the following:
  • Web authoring protocol

    If the server response provides either an MS-AUTHOR-VIA header value or a list of methods that are consistent with Distributed Authoring and Versioning, Office notes that the document can be saved back to the Web server by using the protocol that is specified.

    The protocols that are currently available are Web Extender Client (WEC) and Web DAV. If the server does not provide a protocol, the file is considered read-only. The client can perform a SaveAs to save a copy locally. However, a copy cannot be saved back to the folder where the file came from.
  • Web-server type

    Office also tries to determine the Web-server type. This determination is based on header information that is returned by the OPTIONS call. Specifically, Office looks for header values that indicate communications with a SharePoint document library or an Exchange WebStore folder. If communications are detected, Office performs additional communication to the server to enable the following Web collaboration features:
    • Web discussion
    • Task list updates
    • Document check-out
    • Document check-in
    The previous Web collaboration features are supported by certain Web-server types. To identify the Web-server type, Office looks for the following headers:
    • MicrosoftSharePointTeamServices
    • MicrosoftTahoeServer
    • MicrosoftOfficeWebServer
    • MS-WebStore
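
The evaluation of the OPTIONS response can be sketched as follows. This Python example is an approximation under stated assumptions and is not the logic that Office uses. The function names, the set of methods that is treated as evidence of DAV support, the use of the Allow and Public headers, and the preference for WEC when both protocols are advertised are all assumptions of this sketch. The four server-type header names are taken from the list in this section.

```python
import http.client

# Assumption: these methods are treated as evidence of DAV support.
DAV_METHODS = {"PROPFIND", "LOCK", "UNLOCK"}

# Header names from this article that identify the Web-server type.
SERVER_TYPE_HEADERS = (
    "MicrosoftSharePointTeamServices",
    "MicrosoftTahoeServer",
    "MicrosoftOfficeWebServer",
    "MS-WebStore",
)


def fetch_options_headers(host, folder_path, timeout=10.0):
    """Send an HTTP 1.1 OPTIONS request and return the response headers."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("OPTIONS", folder_path)
        response = conn.getresponse()
        response.read()
        return dict(response.getheaders())
    finally:
        conn.close()


def classify_options_response(headers):
    """Derive the authoring protocol and server-type markers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    author_via = lowered.get("ms-author-via", "").upper()
    methods = lowered.get("allow", "") + "," + lowered.get("public", "")
    allowed = {m.strip().upper() for m in methods.split(",") if m.strip()}

    if "MS-FP" in author_via:
        protocol = "WEC"  # assumption: WEC wins if both are advertised
    elif "DAV" in author_via or DAV_METHODS <= allowed:
        protocol = "DAV"
    else:
        protocol = None   # no authoring protocol: treat as read-only

    markers = [h for h in SERVER_TYPE_HEADERS if h.lower() in lowered]
    return {
        "protocol": protocol,
        "read_only": protocol is None,
        "server_type_headers": markers,
    }
```

In this sketch, a response that carries an MS-Author-Via value of "MS-FP/4.0,DAV" together with a MicrosoftSharePointTeamServices header is classified as a WEC-capable folder with one server-type marker. A response that lists only GET, HEAD, and OPTIONS is classified as read-only.
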
If the Web server requires authentication to successfully complete the OPTIONS request, you may be prompted for credentials to complete the call. After the call is completed, the information that is gathered is cached in your registry hive so that the call does not have to be repeated for this folder. The Office Protocol Discovery Cache is located under the following registry key:

HKEY_CURRENT_USER\Software\Microsoft\Office\11.0\Common\Internet\Server Cache

The Server Cache contains sub-key entries for each Web folder that is opened and that has successfully returned an OPTIONS call. Each entry contains the following values set to the appropriate setting for that folder:
  • Protocol

    This is a 32-bit DWORD value that contains the Web authoring protocol to use for the document. The currently-defined values follow:
    • 0 for read-only HTTP
    • 1 for WEC to an FPSE-enabled Web folder
    • 2 for DAV to a DAV-extended Web folder
  • Type

    This is a DWORD value that indicates the type of Web document collaboration server that manages the folder. The currently-defined values follow:
    • 0 for no collaboration
    • 1 for SharePoint Team Server
    • 2 for Exchange 2000 Server
    • 3 for SharePoint Portal 2001 Server
    • 4 for SharePoint 2001 enhanced folder
    • 5 for Windows SharePoint Services and SharePoint Portal Server 2003
  • Expiration

    This is a 64-bit QWORD value that contains an expiration time. The value is a Win32 FILETIME structure that contains the expiration time in Coordinated Universal Time (UTC). After the expiration, Office re-queries the Web server with another OPTIONS call to make sure that the server configuration has not changed since the values were last cached. The length of the expiration time varies based on a random seed. The length of the expiration time is typically 2 weeks or longer.

    Important The registry key is provided for informational purposes only. Do not edit the registry key or values directly. Office clears the cache periodically. Therefore, saved information is temporary.
The maximum number of cache entries may be set by the MaxCount registry value under the same Server Cache key. Office removes old entries to make space if the maximum count is reached. If no space can be cleared, the results of the OPTIONS call are not cached.
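
The values of a cache entry can be interpreted as shown in the following Python sketch. The sketch takes the three values as plain integers and does not read or write the registry. The function names and the shape of the returned dictionary are hypothetical. The value tables repeat the lists in this section, and the FILETIME conversion relies on the documented definition of FILETIME as a count of 100-nanosecond intervals since January 1, 1601 (UTC).

```python
from datetime import datetime, timedelta, timezone

PROTOCOL_NAMES = {
    0: "read-only HTTP",
    1: "WEC (FPSE-enabled Web folder)",
    2: "DAV (DAV-extended Web folder)",
}
TYPE_NAMES = {
    0: "no collaboration",
    1: "SharePoint Team Server",
    2: "Exchange 2000 Server",
    3: "SharePoint Portal 2001 Server",
    4: "SharePoint 2001 enhanced folder",
    5: "Windows SharePoint Services and SharePoint Portal Server 2003",
}

# FILETIME counts 100-nanosecond intervals since this instant.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME value to a timezone-aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def describe_cache_entry(protocol, server_type, expiration, now=None):
    """Summarize the Protocol, Type, and Expiration values of one entry."""
    now = now or datetime.now(timezone.utc)
    expires = filetime_to_datetime(expiration)
    return {
        "protocol": PROTOCOL_NAMES.get(protocol, "unknown"),
        "type": TYPE_NAMES.get(server_type, "unknown"),
        "expires": expires,
        "expired": expires <= now,  # an expired entry triggers a new OPTIONS call
    }
```

For example, the FILETIME value 116444736000000000 corresponds to midnight on January 1, 1970 (UTC).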

Identifying known drawbacks that are caused by Office Protocol Discovery

Office Protocol Discovery resolves the most important drawback: it determines whether the document must be opened as a read-only document or as a read/write document on the server. However, Office Protocol Discovery introduces some potential drawbacks of its own. The following problems are known side effects of the current design:
  • Office Protocol Discovery uses a standard HTTP 1.1 OPTIONS command. Web servers that do not handle this command cannot support full read/write access in Office 2003. This is expected and is by design.
  • You may be prompted for authentication when you open Office files. This behavior occurs if the Web server requires authentication to process an OPTIONS call to the URI of the folder. Changes to the server configuration can typically be made to avoid this problem by giving anonymous users browse permissions to the folder. Browse permissions are also known as list permissions. The prompt for authentication is expected if the server requires authentication.
  • You may be prompted to select a client certificate or to select a trust-a-server certificate on open. This behavior may occur even if this certificate information is previously provided to Internet Explorer for the same navigation. Because Office makes a new request for its own process space to the server, a new session is created every time. This new session may produce additional security warnings or additional prompts to complete the OPTIONS call successfully.
  • Cookie information that is used for gathering the document is not used in the OPTIONS request. If the server does not permit direct calls to the folder URL without this cookie information, the OPTIONS call may not be successful. If this problem occurs, the user may be prompted repeatedly for authentication, but the user may not be able to provide authentication. This is not because of missing authentication. This problem occurs because of missing session cookies for the Web server. This problem is specific to certain Web-server designs that depend on cookie information instead of authentication information or that depend on cookie information plus authentication information.
  • There is a known problem with network configurations that use a Cisco Content Services Switch (CSS) load balancer with layer 5 filtering in their intranet environment. The CSS software does not correctly handle the HTTP 1.1 OPTIONS command. The CSS software does not forward the call to the Web server. The CSS software also does not return an error response to the client and then close the TCP connection.

    Because the TCP packet is never acknowledged by the server, the client believes that the server has not received the message. Therefore, the client resends the message. Office continues sending this message and waiting for a response until the TCP connection eventually times out. This can cause a client to stop responding when opening an Office file. The Office application waits for the server response. The server response is never received because the CSS load balancer drops the TCP packet.

    Cisco knows about this problem. Cisco is working on an update to resolve the issue. To work around this problem without the update, you can lower the CSS filtering to layer 3 rules or to layer 4 rules. You can also bypass the load balancer by changing the URL that is opened so that the URL points directly to the Web server that holds the content.
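
To check whether a device on the network path silently drops OPTIONS requests, an administrator can send the request with an explicit time-out instead of waiting indefinitely. The following Python sketch is a diagnostic illustration only and is unrelated to how Office issues the call. The function names are hypothetical. The demonstration function uses a local listener that accepts connections but never answers. This only approximates the behavior that is described in the last item of the list and does not reproduce the Cisco device.

```python
import http.client
import socket


def probe_options(host, port, folder_path, timeout=5.0):
    """Return the HTTP status of an OPTIONS request, or None on time-out."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("OPTIONS", folder_path)
        return conn.getresponse().status
    except socket.timeout:
        # No response arrived in time: something on the path is silent.
        return None
    finally:
        conn.close()


def demo_silent_listener():
    """Probe a local socket that accepts connections but never replies."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the system pick a free port
    listener.listen(1)
    try:
        port = listener.getsockname()[1]
        return probe_options("127.0.0.1", port, "/docs/", timeout=0.5)
    finally:
        listener.close()
```

A result of None from probe_options for a folder URL, combined with a normal response when the same request is sent directly to the Web server, suggests that an intermediate device is not handling the OPTIONS command.
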
The benefits that are gained by Office Protocol Discovery outweigh the currently known drawbacks. We believe these issues will decrease over time. We will continue to follow the last two problems to make sure that solutions are available if the existing network design cannot be adjusted. We believe the choice to use Office Protocol Discovery is the correct long-term strategy for Web collaboration.

Understanding HTTP conversion for UNC redirector files

Clients that are running Windows XP Professional can create Network Places to DAV Web folders by using the Web Client service. The Web Client service is also known as the WebDAV mini-redirector. This Web Client service lets DAV-enabled folders appear as UNC shares.

An application can open the file, edit the file, and save to the file because the application typically saves to a UNC path. However, document collaboration requires more functions than are provided by the Web Client service. Therefore, Office 2003 has added code to determine if a file is opened by the Web Client service. If a file is opened by the Web Client service, Office 2003 re-maps the path back to a full URL and then opens the file separately by using the protocol that is appropriate for the server type. This lets an Office 2003 application perform full-document collaboration features, as if the file is opened directly from the URL in Office. The information that is provided previously, including Office Protocol Discovery, applies to documents that are opened from a Web Client-enabled UNC share.
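
The path conversion that is described in the previous paragraph can be illustrated with the following Python sketch. The exact mapping rules that Office uses are not documented in this article. The function name, the default "http" scheme, and the handling of the DavWWWRoot path segment are assumptions of this example.

```python
from urllib.parse import quote


def unc_to_url(unc_path, scheme="http"):
    """Map a Web Client UNC path such as \\\\server\\folder\\file to a URL."""
    if not unc_path.startswith("\\\\"):
        raise ValueError("not a UNC path: %r" % unc_path)
    parts = [part for part in unc_path[2:].split("\\") if part]
    if not parts:
        raise ValueError("UNC path has no server name")
    server, segments = parts[0], parts[1:]
    # Assumption: DavWWWRoot stands for the root of the Web server
    # and has no counterpart in the URL path.
    if segments and segments[0].lower() == "davwwwroot":
        segments = segments[1:]
    # Percent-encode each segment so that spaces survive in the URL.
    return "%s://%s/%s" % (scheme, server, "/".join(quote(s) for s in segments))
```

For example, unc_to_url(r"\\server\docs\Q1 report.doc") returns "http://server/docs/Q1%20report.doc". After the conversion, the folder URL can go through the same protocol discovery as a document that is opened directly from a URL.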

Understanding Hyperlink zone security and security prompts

Office 2003 uses enhanced-security measures for Internet hyperlinking from links in the Office document. This includes passing security credential information under a more restrictive security zone policy so that Internet Explorer can permit or can deny passing credentials to the server. Permission or denial is based on the zone settings that are set for the user.

Also, Office 2003 makes sure that when navigation is under user control, WININET has a correct window handle. This means that WININET can raise security prompts to the user if prompts are required to perform an action. This enhances Web security in Office. However, tighter restrictions for Internet Explorer security zones may cause alerts to appear that did not appear in older versions of Office. The alerts appear during hyperlink navigation.

Office 2003 also displays a warning prompt under the following circumstances:
  • A user clicks a hyperlink in an Office document
  • A document contains content that is based on a URL resource that may perform navigation
The additional warning prompt makes sure that the user wants to move to the Web site, and that the site is trusted. You can control this prompt behavior through a registry setting.

For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

829072 How to disable hyperlink warning messages in Office 2003

REFERENCES

For additional information about the OPTIONS command and the HTTP 1.1 protocol, see the HTTP Working Group Request for Comments (RFC) specification 2616, which is published by the Internet Engineering Task Force.

For additional information about hyperlink issues in earlier versions of Office, click the following article numbers to view the articles in the Microsoft Knowledge Base:

297891 Performance slows, low memory problems when hyperlinking between programs

810360 BUG: Word 2000 and Excel 2000 do not retain cookie information when you move to a hyperlink in the same session

225234 You are prompted for a password when you open an Office document in a browser

314400 You are unnecessarily prompted for your password when you follow hyperlink on an Office document

218153 Error message: "Cannot locate the Internet server or proxy server" when clicking hyperlink

280680 Cannot follow hyperlink to Office document


Modification Type: Major
Last Reviewed: 7/2/2004
Keywords: kbOfficeAuto kbinfo KB838028 kbAudDeveloper