BUG: IPersistStreamInit::Load() Displays HTML Files as Text (323569)



The information in this article applies to:

  • Microsoft Internet Explorer (Programming) 6 (SP1)
  • Microsoft Internet Explorer (Programming) 5.5 SP2

This article was previously published under Q323569

SYMPTOMS

If you host the WebBrowser control, you may want to load HTML files from memory by calling the IPersistStreamInit::Load() method on MSHTML. Sometimes, however, this method causes the WebBrowser control to display your HTML file as plain text (that is, you see the raw HTML tags instead of rendered content). The problem is most noticeable in HTML pages that contain large SCRIPT blocks or large chunks of plain text with no intervening markup.

CAUSE

When MSHTML loads an HTML document through Load(), it must perform MIME sniffing (that is, it must detect the MIME type by inspecting the leading bytes of the data). However, a bug in MSHTML causes it to make a second attempt at MIME detection without first resetting its IStream pointer in the memory buffer. The second attempt therefore inspects the next chunk of the document instead of the beginning. If that chunk contains little or no markup, MSHTML classifies it as text/plain and overwrites the result of the first sniff.
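To gauge whether a document is at risk, one rough heuristic is to measure the longest stretch of bytes that contains no '<' character. The following is a minimal sketch, not MSHTML's actual detection logic: the function name LongestMarkupFreeRun is hypothetical, and any threshold you compare against (for example, the 512-1024 byte range mentioned in the reproduction steps) is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical diagnostic helper (not part of MSHTML or any Microsoft API).
// Returns the length, in bytes, of the longest run that contains no '<'.
// A long run suggests a chunk that a MIME sniffer could mistake for plain text.
std::size_t LongestMarkupFreeRun(const std::string& html)
{
    std::size_t longest = 0;
    std::size_t current = 0;
    for (std::size_t i = 0; i < html.size(); ++i)
    {
        if (html[i] == '<')
        {
            current = 0;              // a tag opener ends the current run
        }
        else
        {
            ++current;
            if (current > longest)
                longest = current;
        }
    }
    return longest;
}
```

A host could, for example, log a warning when LongestMarkupFreeRun(html) exceeds 512 before it calls Load(). A '<' that appears inside script code (such as a comparison operator) also resets the count, so treat the result as a hint only.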

RESOLUTION

The surest workaround is to save your file to a temporary location on disk, and then to load it by using either the MSHTML IPersistFile interface or the IWebBrowser2::Navigate() method.
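The temporary-file workaround can be sketched as follows. This is a minimal illustration under stated assumptions: SaveHtmlToFile and NavigateToFile are hypothetical helper names, the caller chooses the file path, and the Windows-only part assumes an ATL project that already holds an IWebBrowser2 pointer.

```cpp
#include <fstream>
#include <string>

#ifdef _WIN32
#include <atlbase.h>   // CComBSTR, CComVariant
#include <exdisp.h>    // IWebBrowser2
#endif

// Hypothetical helper: writes the in-memory HTML to 'path' byte for byte.
// Binary mode keeps the bytes on disk identical to the buffer in memory.
bool SaveHtmlToFile(const std::string& html, const std::string& path)
{
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    out.write(html.data(), static_cast<std::streamsize>(html.size()));
    out.close();
    return !out.fail();
}

#ifdef _WIN32
// Hypothetical helper: points the hosted WebBrowser control at the saved file.
HRESULT NavigateToFile(IWebBrowser2* pWebBrowser, const std::string& path)
{
    if (pWebBrowser == NULL)
        return E_POINTER;
    CComBSTR bstrUrl(path.c_str());   // converts the ANSI path to a BSTR
    CComVariant vEmpty;               // VT_EMPTY for the optional arguments
    return pWebBrowser->Navigate(bstrUrl, &vEmpty, &vEmpty, &vEmpty, &vEmpty);
}
#endif
```

A host would call SaveHtmlToFile first and, if it succeeds, pass the same path to NavigateToFile. A location returned by GetTempPath() is one reasonable choice for the path. Remember to delete the file when the document is no longer needed.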

Another technique is to change your HTML files to defeat bogus data sniffing. IPersistStreamInit is frequently used to load documents that are produced through an XSL Transform, and XSL templates sometimes do not use white space and carriage returns to separate tags. Judicious use of white space and carriage returns can help MSHTML properly detect the MIME type of your document. You can also break up large SCRIPT blocks into several smaller blocks, or break up large passages of text with <P>, <DIV>, or <SPAN> tags.
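As an illustration of the white-space technique, the following sketch inserts a carriage return and line feed between directly adjacent tags in generated markup, such as the output of an XSL transform. The function name SeparateAdjacentTags is hypothetical. The sketch does not parse HTML, so it also splits any "><" sequence that occurs inside script code or preformatted text, and it is not guaranteed to prevent the misdetection.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of 'html' with "\r\n" inserted
// wherever a '>' is immediately followed by a '<'.
std::string SeparateAdjacentTags(const std::string& html)
{
    std::string out;
    out.reserve(html.size() + html.size() / 8);   // leave room for inserted breaks
    for (std::size_t i = 0; i < html.size(); ++i)
    {
        out += html[i];
        if (html[i] == '>' && i + 1 < html.size() && html[i + 1] == '<')
            out += "\r\n";
    }
    return out;
}
```

You would apply the helper to the transformed document before you write it to the stream that you pass to Load().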

STATUS

Microsoft has confirmed that this is a bug in the Microsoft products that are listed in the "Applies to" section.

MORE INFORMATION

Steps to reproduce the behavior

  1. Create a WebBrowser host in Microsoft Visual C++ 6.0 or 7.0. For a sample implementation that uses the Active Template Library (ATL), see the following Microsoft Developer Network (MSDN) Web site:
  2. Sink DWebBrowserEvents2 on the WebBrowser control. For more information about sinking WebBrowser events, see the following MSDN Web site:
  3. When your application is created, call the IWebBrowser2::Navigate() method to move the WebBrowser to "about:blank".
  4. Add code to your DWebBrowserEvents2::DocumentComplete() event handler to load a file from memory:
        // m_bInitial keeps this handler from running again when the Load() call
        // below fires DocumentComplete a second time. Set it to TRUE after the
        // Navigate() call.
        if (m_bInitial)
        {
            HRESULT hr = E_FAIL;

            CComPtr<IStream> spStream;
            DWORD dwWritten = 0;

            // CreateFileA matches the narrow string literal in both ANSI and UNICODE builds.
            HANDLE handle = CreateFileA("c:\\temp\\test.htm", GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
            if (handle != INVALID_HANDLE_VALUE)
            {
                char szBuffer[10240];   // raw file bytes, so char rather than TCHAR
                DWORD dwBytesRead = 0;

                BOOL bRead = ReadFile(handle, szBuffer, sizeof(szBuffer), &dwBytesRead, NULL);
                CloseHandle(handle);    // close the file even if the read failed

                if (bRead && SUCCEEDED(CreateStreamOnHGlobal(NULL, TRUE, &spStream)))
                {
                    // Copy only the bytes that were read, and then rewind the stream for Load().
                    hr = spStream->Write(szBuffer, dwBytesRead, &dwWritten);
                    LARGE_INTEGER li = {0, 0};
                    spStream->Seek(li, STREAM_SEEK_SET, NULL);

                    CComPtr<IDispatch> spDispatch;
                    hr = m_pWebBrowser->get_Document(&spDispatch);

                    if (SUCCEEDED(hr))
                    {
                        CComQIPtr<IPersistStreamInit> spPersistStreamInit(spDispatch);
                        if (spPersistStreamInit != NULL)
                        {
                            hr = spPersistStreamInit->InitNew();
                            if (SUCCEEDED(hr))
                            {
                                hr = spPersistStreamInit->Load(spStream);
                            }
                        }
                    }
                }
            }
        }
        m_bInitial = FALSE;
    					
  5. Write an HTML file that has a large amount of raw text in the body, or a sizable (around 512-1024 bytes) SCRIPT block inside the <HEAD> tag. Save the file as Test.htm in the C:\Temp folder.
  6. Run your application. The raw HTML tags load as plain text in the WebBrowser window.
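If you prefer to generate the test file instead of writing it by hand, the following sketch builds an HTML document whose HEAD contains a SCRIPT block of at least a requested size. The function name MakeReproHtml and the filler statement are hypothetical, and the sketch does not guarantee that any particular size triggers the problem.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns an HTML document whose SCRIPT block holds
// at least 'scriptBytes' bytes of harmless filler statements.
std::string MakeReproHtml(std::size_t scriptBytes)
{
    const std::string line = "var filler = 0;\r\n";
    std::string script;
    while (script.size() < scriptBytes)
        script += line;               // pad the script body up to the requested size

    return "<HTML><HEAD><SCRIPT>\r\n" + script +
           "</SCRIPT></HEAD>\r\n<BODY><P>Test</P></BODY></HTML>\r\n";
}
```

For example, you could write the result of MakeReproHtml(1024) to C:\Temp\Test.htm with a std::ofstream that is opened in binary mode, and then run the host application.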

REFERENCES

For more information, visit the following MSDN Web site:

Modification Type: Major
Last Reviewed: 6/29/2004
Keywords:kbbug kbnofix KB323569