HOW TO: Use the Encoding Property with System.Xml Classes in Visual Basic .NET (317169)



The information in this article applies to:

  • Microsoft Visual Basic .NET (2003)
  • Microsoft Visual Basic .NET (2002)
  • Microsoft XML Classes (included with the .NET Framework 1.0)
  • Microsoft XML Classes (included with the .NET Framework 1.1)

This article was previously published under Q317169
For a Microsoft Visual C# .NET version of this article, see 308061.

The following .NET Framework Class Library namespace is referenced in this article:
  • System.Xml

SUMMARY

This step-by-step article explains how to use the Encoding property with XML in the .NET Framework. The encoding declaration in the XML declaration identifies the encoding format of the XML data. In the System.Xml namespace, several classes expose this encoding format through an Encoding property.

The System.Xml namespace in the .NET Framework includes the following classes that have the Encoding property:
  • XmlDeclaration: represents the XML declaration node
  • XmlParserContext: provides the context information required by the XML reader classes to parse an XML fragment
  • XmlTextReader: pull model parser; provides forward-only, fast, non-cached access to XML data
  • XmlValidatingReader: validates XML documents against XSD, XDR, or DTD
With the XmlDeclaration class, the Encoding property is a case-insensitive string such as "UTF-8" or "ISO-8859-1". With the other classes, the Encoding property is of type System.Text.Encoding.
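The difference between the two property types can be illustrated with a minimal sketch. The module name and variable names below are chosen only for this illustration, and the XmlParserContext is built with empty (Nothing) name table and namespace manager arguments purely to keep the example short.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module EncodingPropertyTypes

   Sub Main()
      ' XmlDeclaration.Encoding is a String that holds the encoding name.
      Dim doc As XmlDocument = New XmlDocument()
      Dim decl As XmlDeclaration = doc.CreateXmlDeclaration("1.0", "UTF-8", Nothing)
      Dim declEncoding As String = decl.Encoding
      Console.WriteLine("XmlDeclaration.Encoding (String): " & declEncoding)

      ' XmlParserContext.Encoding is a System.Text.Encoding object.
      ' The Nothing arguments are placeholders for this illustration.
      Dim context As XmlParserContext = _
         New XmlParserContext(Nothing, Nothing, Nothing, XmlSpace.None, Encoding.UTF8)
      Dim contextEncoding As Encoding = context.Encoding
      Console.WriteLine("XmlParserContext.Encoding (Encoding): " & contextEncoding.WebName)
   End Sub

End Module
```

Because the reader classes return a System.Text.Encoding object, use a member such as WebName or EncodingName when a string is needed.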

The Encoding Declaration

Specify the encoding declaration in the XML declaration section of the XML document. For example, the following declaration indicates that the document is in UTF-16 Unicode encoding format:
<?xml version="1.0" encoding="UTF-16"?>
				
Note that this declaration only specifies the encoding format of an XML document and does not modify or control the actual encoding format of the data.

For example, if you have the actual XML data in UTF-8 encoding format, but the encoding declaration is set to ISO-8859-1, you receive error messages similar to the following when you parse the document:
There is invalid data at the root level. Line n, position n.
To convert the actual encoding format of the data, you must use the classes in the System.Text namespace, such as System.Text.Encoding.
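As a rough sketch of such a conversion, the following code re-encodes a byte array with the Encoding.Convert method. The Transcode helper, the module name, and the sample string are assumptions made for this illustration and are not part of the original sample.

```vbnet
Imports System
Imports System.Text

Module ConvertEncodingSketch

   ' Hypothetical helper: re-encodes a byte array from one named encoding to another.
   Function Transcode(ByVal data As Byte(), ByVal fromName As String, ByVal toName As String) As Byte()
      Dim source As Encoding = Encoding.GetEncoding(fromName)
      Dim target As Encoding = Encoding.GetEncoding(toName)
      Return Encoding.Convert(source, target, data)
   End Function

   Sub Main()
      ' "caf" followed by U+00E9; the character is built from its code point
      ' so that the source file encoding does not matter.
      Dim sample As String = "caf" & ChrW(233)
      Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(sample)
      Dim latin1Bytes As Byte() = Transcode(utf8Bytes, "UTF-8", "ISO-8859-1")

      Console.WriteLine("UTF-8 length: {0} bytes", utf8Bytes.Length)
      Console.WriteLine("ISO-8859-1 length: {0} bytes", latin1Bytes.Length)

      ' Decode the converted bytes and compare them with the original text.
      Dim roundTrip As String = Encoding.GetEncoding("ISO-8859-1").GetString(latin1Bytes)
      Console.WriteLine("Round trip matches: {0}", roundTrip = sample)
   End Sub

End Module
```

After the bytes are converted, the encoding declaration of the document still has to be updated separately so that it matches the new byte encoding.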

The following example shows how to set or get the encoding declaration of an XML document.
  1. Save the following data in a new XML document named "Q308061.xml".
    <?xml version='1.0' encoding='ISO-8859-1'?>
    <Collection>
       <Book>
          <Title>Principle of Relativity</Title>
          <Author>Albert Einstein</Author>
          <Genre>Physics</Genre>
       </Book>
       <Book>
          <Title>Cosmos</Title>
          <Author>Carl Sagan</Author>
          <Genre>Cosmology</Genre>
       </Book>
    </Collection>
    					
  2. Create a new Visual Basic .NET console application and paste the following code in the Module1.vb file.
    Imports System
    Imports System.Xml
    
    Module Module1
    
       Sub Main()
    
          Try
             ' Load the XML document.
             Dim doc As XmlDocument = New XmlDocument()
             doc.Load("Q308061.xml")
    
             ' The XML declaration, when present, is the first child of the document.
             ' Check the node type before casting the first child to XmlDeclaration.
             If (doc.FirstChild.NodeType = XmlNodeType.XmlDeclaration) Then
    
                ' Get the encoding declaration.
                Dim decl As XmlDeclaration
                decl = CType(doc.FirstChild, XmlDeclaration)
                Console.WriteLine("Encoding declaration:" & vbNewLine & vbNewLine & " Before = " & decl.Encoding)
    
                ' Set the encoding declaration.
                decl.Encoding = "UTF-16"
                Console.WriteLine(" After = " & (CType(doc.FirstChild, XmlDeclaration)).Encoding & vbNewLine)
    
             End If
    
          Catch xmlex As XmlException
             Console.WriteLine("{0}", xmlex.Message)
          Catch ex As Exception
             Console.WriteLine("{0}", ex.Message)
          End Try
    
       End Sub
    
    End Module
    					
  3. Compile and run the application.
NOTE: The Q308061.xml file must be in the same directory as the executable file.
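The sample above changes the declaration only in memory. The following sketch, which is not part of the original sample, writes the document back to disk. It relies on the documented behavior that XmlDocument.Save takes the output encoding from the XmlDeclaration.Encoding value. The SaveWithDeclaration helper, the module name, and the temporary file are assumptions made for this illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module SaveWithDeclaredEncoding

   ' Hypothetical helper: loads markup, rewrites the encoding declaration,
   ' saves to a temporary file, and returns the bytes that were written.
   Function SaveWithDeclaration(ByVal markup As String, ByVal encodingName As String) As Byte()
      Dim doc As XmlDocument = New XmlDocument()
      doc.LoadXml(markup)

      If doc.FirstChild.NodeType = XmlNodeType.XmlDeclaration Then
         Dim decl As XmlDeclaration = CType(doc.FirstChild, XmlDeclaration)
         decl.Encoding = encodingName
      End If

      ' Temporary path used only for this illustration.
      Dim outputPath As String = Path.GetTempFileName()
      Try
         doc.Save(outputPath)
         Dim fs As FileStream = New FileStream(outputPath, FileMode.Open, FileAccess.Read)
         Try
            Dim buffer(CInt(fs.Length) - 1) As Byte
            fs.Read(buffer, 0, buffer.Length)
            Return buffer
         Finally
            fs.Close()
         End Try
      Finally
         File.Delete(outputPath)
      End Try
   End Function

   Sub Main()
      Dim markup As String = "<?xml version='1.0' encoding='ISO-8859-1'?><Collection/>"
      Dim latin1 As Byte() = SaveWithDeclaration(markup, "ISO-8859-1")
      Dim utf16 As Byte() = SaveWithDeclaration(markup, "UTF-16")
      Console.WriteLine("Saved as ISO-8859-1: {0} bytes", latin1.Length)
      Console.WriteLine("Saved as UTF-16:     {0} bytes", utf16.Length)
   End Sub

End Module
```

Comparing the two byte counts is a quick way to confirm that the saved file follows the declaration. This applies only to data that XmlDocument writes; existing data is not re-encoded by editing its declaration.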

The Encoding Property of the Readers

The XmlTextReader and the XmlValidatingReader classes provide a read-only Encoding property. These classes only read the encoding declaration value and do not determine the actual encoding format of the data.

The following Visual Basic .NET code sample shows how to read the encoding of an XML document through a reader. To run this sample, paste the following code in the Try block of the previous code sample:
         ' Read the encoding by using the reader classes.
         Dim rdr As XmlTextReader = New XmlTextReader("Q308061.xml")
         ' Call Read first so that the reader has parsed the XML declaration.
         rdr.Read()
         Console.WriteLine("Encoding from the reader: {0}" & vbNewLine & vbNewLine, rdr.Encoding.EncodingName)
         ' Close the reader to release the file.
         rdr.Close()
				
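A self-contained variant of the reader sample, which does not need the file on disk, might look like the following sketch. It reads from an in-memory stream and guards against the Encoding property being Nothing, which can be the case before the first Read call. The ReportedEncoding helper, the module name, and the inline markup are assumptions made for this illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module ReaderEncodingSketch

   ' Hypothetical helper: returns the name of the encoding that the reader
   ' reports, or "(none)" when the reader has no encoding to report yet.
   Function ReportedEncoding(ByVal rdr As XmlTextReader) As String
      If rdr.Encoding Is Nothing Then
         Return "(none)"
      End If
      Return rdr.Encoding.WebName
   End Function

   Sub Main()
      Dim markup As String = "<?xml version='1.0' encoding='ISO-8859-1'?><Collection/>"
      Dim bytes As Byte() = Encoding.GetEncoding("ISO-8859-1").GetBytes(markup)
      Dim rdr As XmlTextReader = New XmlTextReader(New MemoryStream(bytes))
      Try
         Console.WriteLine("Before Read: " & ReportedEncoding(rdr))
         rdr.Read()
         Console.WriteLine("After Read:  " & ReportedEncoding(rdr))
      Finally
         ' Close the reader even if parsing fails.
         rdr.Close()
      End Try
   End Sub

End Module
```

The Try...Finally block closes the reader whether or not parsing succeeds, which matters more when the reader holds a file open.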

Modification Type: Major
Last Reviewed: 9/24/2003
Keywords: kbHOWTOmaster KB317169 kbAudDeveloper