PRB: Encoding Attribute Is Not Returned in DOMDocument XML Property (282287)



The information in this article applies to:

  • Microsoft XML 2.0
  • Microsoft XML 2.5
  • Microsoft XML 2.6
  • Microsoft XML 3.0
  • Microsoft XML 3.0 SP1
  • Microsoft XML 4.0

This article was previously published under Q282287

SYMPTOMS

The xml property of the DOMDocument object does not return the encoding attribute for the XML data, even if a specific encoding is specified in the XML.

CAUSE

The xml property always returns the data as a Unicode (UTF-16) string. Because the original encoding declaration no longer describes that string, it is omitted from the output.

STATUS

This behavior is by design.

MORE INFORMATION

If a newer version of MSXML has been installed in side-by-side mode, you must explicitly use the Globally Unique Identifiers (GUIDs) or ProgIDs for that version to run the sample code. For example, MSXML version 4.0 can only be installed in side-by-side mode. For additional information about the code changes that are required to run the sample code with the MSXML 4.0 parser, click the following article number to view the article in the Microsoft Knowledge Base:

305019 INFO: MSXML 4.0 Specific GUIDs and ProgIds

Steps To Reproduce Behavior

  1. Create an XML file ("test.xml") similar to the following text that specifies a particular encoding, in this case "windows-1252":

    <?xml version="1.0" encoding="windows-1252"?>
    <root>Hello</root>
    						

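  2. Create an HTML page in the same folder as test.xml that contains the following script:

    <HTML>
    <BODY>
      <script language="vbscript">
        ' Load test.xml synchronously, and then display the serialized document.
        Set xmldoc = CreateObject("Msxml2.DOMDocument")
        xmldoc.async = False
        xmldoc.load("test.xml")
        MsgBox xmldoc.xml
      </script>
    </BODY>
    </HTML>
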
  3. Execute the script, and note the XML that is displayed.

Results

The XML data that is displayed in the message box looks similar to the following:

    <?xml version="1.0"?>
    <root>Hello</root>

Note that the encoding attribute has been removed.

However, the original value of this attribute is still stored in the DOMDocument, and can be retrieved by using an XMLDOMProcessingInstruction object. The encoding is specified in the XML declaration at the beginning of the XML file, which the DOMDocument usually exposes as its first child node.

To retrieve the encoding information, retrieve the first node (item 0) of the DOMDocument object, which, in this case, is a processing instruction node, and then get the text value of the corresponding "encoding" attribute.

The following Microsoft VBScript example displays the value "windows-1252" if xmldoc refers to a DOMDocument object that was created by using the XML data from the preceding example:

    Dim encoding
    encoding = xmldoc.childNodes(0).Attributes.getNamedItem("encoding").Text
    MsgBox encoding
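
The one-line lookup above assumes that the document begins with an XML declaration that carries an encoding attribute. The following is a minimal sketch of a more defensive lookup, assuming the same test.xml file from the preceding steps. GetDeclaredEncoding is a hypothetical helper name that is used for illustration only; it is not part of MSXML.

```vbscript
Option Explicit

' Hypothetical helper (not an MSXML API): returns the encoding that is
' declared in the XML declaration, or an empty string if the document has
' no XML declaration or the declaration has no encoding attribute.
Function GetDeclaredEncoding(doc)
    Const NODE_PROCESSING_INSTRUCTION = 7   ' DOMNodeType value for processing instructions
    Dim first, attr

    GetDeclaredEncoding = ""
    If doc.childNodes.length = 0 Then Exit Function

    ' The XML declaration, if present, is the first child node.
    Set first = doc.childNodes(0)
    If first.nodeType <> NODE_PROCESSING_INSTRUCTION Then Exit Function
    If first.nodeName <> "xml" Then Exit Function

    ' getNamedItem returns Nothing when the attribute is absent.
    Set attr = first.attributes.getNamedItem("encoding")
    If Not attr Is Nothing Then GetDeclaredEncoding = attr.text
End Function

Dim xmldoc
Set xmldoc = CreateObject("Msxml2.DOMDocument")
xmldoc.async = False

' load returns True on success; parseError describes any failure.
If xmldoc.load("test.xml") Then
    MsgBox GetDeclaredEncoding(xmldoc)
Else
    MsgBox "Load failed: " & xmldoc.parseError.reason
End If
```

With the test.xml file from step 1, this sketch is expected to display the same value as the preceding example. For a document that has no XML declaration, it displays an empty string instead of raising a run-time error.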
				
The following is an example of how to retrieve the value in Microsoft Visual C++:

    // Assumes that pXMLDoc is an MSXML document smart pointer that was generated by #import.
    IXMLDOMProcessingInstructionPtr pInst = pXMLDoc->GetchildNodes()->Getitem(0);
    _bstr_t bstrEncoding = pInst->Getattributes()->getNamedItem("encoding")->Gettext();
				


Modification Type: Major
Last Reviewed: 10/12/2001
Keywords: kbDSupport kbprb KB282287