SUMMARY
This step-by-step article explains how to use the Encoding property with XML
in the .NET Framework. The encoding declaration in the XML declaration
identifies the encoding format of the XML data. In the System.Xml namespace,
the Encoding property of several classes exposes this encoding format.
The following System.Xml classes have an Encoding property:
- XmlDeclaration: represents the XML declaration node
- XmlParserContext: provides the context information that the XML reader
classes require to parse an XML fragment
- XmlTextReader: a pull-model parser that provides forward-only, fast,
non-cached access to XML data
- XmlValidatingReader: validates XML documents against an XSD schema, an XDR
schema, or a DTD
With the XmlDeclaration class, the Encoding property is a case-insensitive
string such as "UTF-8" or "ISO-8859-1". With the other classes, the Encoding
property is of type System.Text.Encoding.
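The difference between the two property types can be sketched as follows. This is a minimal illustration, not part of the original sample: the class name EncodingPropertyTypes and its helper methods are hypothetical, and the sketch builds its objects in memory instead of loading a file.

```csharp
using System;
using System.Text;
using System.Xml;

public static class EncodingPropertyTypes
{
    // Returns the encoding stored in a newly created XML declaration node.
    // XmlDeclaration.Encoding is a plain string.
    public static string DeclarationEncoding(string name)
    {
        XmlDocument doc = new XmlDocument();
        XmlDeclaration decl = doc.CreateXmlDeclaration("1.0", name, null);
        return decl.Encoding;
    }

    // Returns the encoding held by a parser context.
    // XmlParserContext.Encoding is a System.Text.Encoding object.
    public static Encoding ContextEncoding(Encoding enc)
    {
        NameTable nt = new NameTable();
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(nt);
        XmlParserContext context =
            new XmlParserContext(nt, nsMgr, null, XmlSpace.None, enc);
        return context.Encoding;
    }

    public static void Main()
    {
        string declared = DeclarationEncoding("UTF-8");
        Encoding fromContext = ContextEncoding(Encoding.UTF8);

        Console.WriteLine("XmlDeclaration.Encoding:   " + declared);
        Console.WriteLine("XmlParserContext.Encoding: " + fromContext.WebName);

        // Encoding.GetEncoding bridges the two forms; the lookup ignores case.
        Encoding resolved = Encoding.GetEncoding(declared.ToLowerInvariant());
        Console.WriteLine("Same encoding: " +
            (resolved.WebName == fromContext.WebName));
    }
}
```

Encoding.GetEncoding is the usual way to turn the declaration string into an Encoding object when the rest of the code expects one.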
The Encoding Declaration
Specify the encoding declaration in the XML declaration section
of the XML document. For example, the following declaration indicates that the
document is in UTF-16 Unicode encoding format:
<?xml version="1.0" encoding="UTF-16"?>
Note that this declaration only states the encoding format of an XML
document; it does not modify or control the actual encoding of the data.
For example, if the actual XML data is in UTF-8 encoding format but the
encoding declaration is set to ISO-8859-1, the parser may misread
characters or report an error message similar to the following when you
parse the document:
There is invalid data at the root level. Line n, position n.
To convert the actual encoding format of the data, you must use the classes
in the System.Text namespace, such as the Encoding class.
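The following sketch illustrates a mismatch between the declared and the actual encoding, and one way to repair it with Encoding.Convert. The class name EncodingMismatchSketch, the TryLoad helper, and the sample document are hypothetical. The sketch assumes that the parser rejects bytes that are not valid in the declared encoding; the exact error text can differ between .NET versions.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingMismatchSketch
{
    // Returns true if the bytes load as an XML document, false on XmlException.
    public static bool TryLoad(byte[] data)
    {
        try
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(new MemoryStream(data));
            return true;
        }
        catch (XmlException ex)
        {
            Console.WriteLine("  XmlException: " + ex.Message);
            return false;
        }
    }

    public static void Main()
    {
        // Hypothetical sample: the declaration claims UTF-8.
        string xml = "<?xml version='1.0' encoding='UTF-8'?>" +
                     "<Author>Erwin Schr\u00F6dinger</Author>";

        // Encode the text as ISO-8859-1, so the bytes disagree with the declaration.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] latin1Bytes = latin1.GetBytes(xml);
        Console.WriteLine("Mismatched bytes load: " + TryLoad(latin1Bytes));

        // Convert the actual bytes to UTF-8 so that they agree with the declaration.
        byte[] utf8Bytes = Encoding.Convert(latin1, Encoding.UTF8, latin1Bytes);
        Console.WriteLine("Converted bytes load:  " + TryLoad(utf8Bytes));
    }
}
```

The conversion changes the bytes, not the declaration. The alternative repair is to leave the bytes alone and correct the declaration so that it names the encoding the data really uses.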
The following example shows how to get and set the encoding declaration of
an XML document.
- Save the following data in a new XML document named "Q308061.xml".
<?xml version='1.0' encoding='ISO-8859-1'?>
<Collection>
  <Book>
    <Title>Principle of Relativity</Title>
    <Author>Albert Einstein</Author>
    <Genre>Physics</Genre>
  </Book>
  <Book>
    <Title>Cosmos</Title>
    <Author>Carl Sagan</Author>
    <Genre>Cosmology</Genre>
  </Book>
</Collection>
- Create a new Visual C# .NET console application and paste
the following code in the Class1.cs file.
using System;
using System.Xml;

namespace ConsoleApplication1
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            try
            {
                // Load the XML document.
                XmlDocument doc = new XmlDocument();
                doc.Load("Q308061.xml");

                // When a document has an XML declaration, it is the first child node.
                // Check the node type before casting, because the declaration is optional.
                if (doc.FirstChild.NodeType == XmlNodeType.XmlDeclaration)
                {
                    // Get the encoding declaration.
                    XmlDeclaration decl = (XmlDeclaration) doc.FirstChild;
                    Console.WriteLine("Encoding declaration:\n\n Before = " + decl.Encoding);

                    // Set the encoding declaration. This changes the in-memory node only.
                    decl.Encoding = "UTF-16";
                    Console.WriteLine(" After = " + ((XmlDeclaration) doc.FirstChild).Encoding + "\n");
                }
            }
            catch (XmlException xmlex)
            {
                // The document could not be parsed.
                Console.WriteLine("{0}", xmlex.Message);
            }
            catch (Exception ex)
            {
                // Any other failure, such as a missing file.
                Console.WriteLine("{0}", ex.Message);
            }
        }
    }
}
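- Compile and then run the application.
NOTE: The code loads Q308061.xml by relative path, so the file must be in the
application's current working directory, which is typically the same directory
as the executable file.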
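The sample above changes the declaration in memory only. The following sketch shows what can happen when the document is saved afterward. It assumes that XmlDocument.Save, when given a file name, writes the file in the encoding named by the declaration node, so that the declaration and the bytes on disk agree. The class name SaveWithDeclaredEncoding, the SaveAs helper, and the output file name are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class SaveWithDeclaredEncoding
{
    // Changes the declared encoding, saves the document,
    // and returns the bytes that were written to disk.
    public static byte[] SaveAs(string xml, string encodingName, string path)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // The declaration is optional, so check before using it.
        XmlDeclaration decl = doc.FirstChild as XmlDeclaration;
        if (decl != null)
        {
            decl.Encoding = encodingName;
        }

        doc.Save(path);
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        string xml = "<?xml version='1.0' encoding='ISO-8859-1'?>" +
                     "<Collection><Book><Title>Cosmos</Title></Book></Collection>";

        // Hypothetical output location in the temporary directory.
        string path = Path.Combine(Path.GetTempPath(), "Q308061-utf16.xml");
        byte[] bytes = SaveAs(xml, "UTF-16", path);

        // A UTF-16 little-endian file normally starts with the bytes FF FE.
        bool hasUtf16Bom = bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE;
        Console.WriteLine("Saved file starts with a UTF-16 byte order mark: " + hasUtf16Bom);
    }
}
```

Inspecting the first bytes of the saved file is a quick way to confirm which encoding was actually written, independent of what the declaration says.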
The Encoding Property of the Readers
The XmlTextReader and XmlValidatingReader classes provide a read-only
Encoding property. These classes only read the encoding declaration value
and do not determine the actual encoding format of the data.
The following Visual C# .NET code sample shows how to access the encoding
attribute of an XML document. To run this sample, paste the following code in
the try block of the previous code sample:
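// Read the encoding by using a reader class.
XmlTextReader rdr = new XmlTextReader("Q308061.xml");
// The first Read call moves to the XML declaration node.
rdr.Read();
Console.WriteLine("Encoding from the reader: {0} \n\n", rdr.Encoding.EncodingName);
// Close the reader to release the file.
rdr.Close();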
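For experimenting without a file on disk, the same reader technique can be sketched against an in-memory stream. The class name ReaderEncodingSketch and the DeclaredEncodingName helper are hypothetical. The sketch returns the WebName of the encoding instead of EncodingName, because WebName is the registered name that appears in encoding declarations.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class ReaderEncodingSketch
{
    // Reads the first node of the document and returns the
    // registered name of the encoding that the reader reports.
    public static string DeclaredEncodingName(byte[] xmlBytes)
    {
        XmlTextReader rdr = new XmlTextReader(new MemoryStream(xmlBytes));
        try
        {
            // Move to the first node, which is the XML declaration here.
            rdr.Read();
            return rdr.Encoding.WebName;
        }
        finally
        {
            // Always close the reader, even if Read throws.
            rdr.Close();
        }
    }

    public static void Main()
    {
        // Encode the sample so that the bytes match the declaration.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] bytes = latin1.GetBytes(
            "<?xml version='1.0' encoding='ISO-8859-1'?><Collection/>");

        Console.WriteLine("Encoding from the reader: " + DeclaredEncodingName(bytes));
    }
}
```

The try and finally blocks ensure that the reader is closed even when parsing fails, which matters more when the reader holds a file open.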