SUMMARY
This article describes how to replace special characters in an Extensible Markup Language (XML) file by using Visual C# .NET.
Description of the technique
XML predefines the following five entity references for special characters that would otherwise be interpreted as part of markup language:
Character Name | Entity Reference | Character | Numeric Reference
Ampersand | &amp; | & | &#38;
Left angle bracket | &lt; | < | &#60;
Right angle bracket | &gt; | > | &#62;
Straight quotation mark | &quot; | " | &#34;
Apostrophe | &apos; | ' | &#39;
You can use entity and character references to escape the left angle bracket, the ampersand, and other delimiters. You can also use numeric character references. Numeric character references are expanded immediately when they are recognized. In addition, because numeric character references are treated as character data, you can use the numeric character references &#60; and &#38; to escape the left angle bracket and the ampersand when they occur in character data.
If you declare either of the following two entities:
- lt
- amp
you must declare them as internal entities whose replacement text is a character reference to the respective character (the left angle bracket or the ampersand) that is being escaped. This double escaping is required for these entities so that references to them produce a well-formed result.
If you declare any of the following three entities:
- gt
- apos
- quot
you must declare them as internal entities whose replacement text is the single character that is being escaped.
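As a minimal sketch of how these references behave when a document is parsed, the following console program loads small XML fragments and prints the decoded text. It is not part of the original sample: the class name EntityReferenceDemo and the method DecodeText are hypothetical, and the program assumes only the System.Xml classes that this article already uses.

```csharp
using System;
using System.Xml;

public static class EntityReferenceDemo
{
    // Parses an XML fragment and returns the text of its root element
    // after the parser has expanded entity and character references.
    public static string DecodeText(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    public static void Main()
    {
        // Predefined entity references yield the literal characters.
        Console.WriteLine(DecodeText("<r>Beer &amp; Ale</r>"));   // Beer & Ale
        Console.WriteLine(DecodeText("<r>test&lt;ing</r>"));      // test<ing

        // Numeric character references are expanded the same way.
        Console.WriteLine(DecodeText("<r>&#38; &#60; &#62; &#34; &#39;</r>"));
    }
}
```

Both forms produce the same characters in the parsed document, so either can be used when you write character data by hand.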
Determine whether you must replace a special character
Not required: XML file in which the data is retrieved from a database
When you use the Microsoft .NET Framework, data that is retrieved from a database is stored in a DataSet object. When you write data from a DataSet to an XML file by using the WriteXml method, the special characters that are described in the "Description of the technique" section are replaced with their respective entity references. Therefore, when you write XML files from a DataSet, no special replacement process is required.
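The following sketch illustrates this behavior without a database connection. It builds a one-row DataSet in memory as a stand-in for retrieved data and writes it with WriteXml. The class name DataSetEscapeDemo, the method WriteCustomerXml, and the table layout are assumptions made for illustration only.

```csharp
using System;
using System.Data;
using System.IO;

public static class DataSetEscapeDemo
{
    // Builds a one-row DataSet in memory (standing in for data retrieved
    // from a database) and returns the XML that WriteXml produces.
    public static string WriteCustomerXml(string companyName)
    {
        DataSet ds = new DataSet("Customers");
        DataTable table = ds.Tables.Add("Customer");
        table.Columns.Add("CompanyName", typeof(string));
        table.Rows.Add(new object[] { companyName });

        StringWriter writer = new StringWriter();
        ds.WriteXml(writer);
        return writer.ToString();
    }

    public static void Main()
    {
        // The ampersand in the value is written as &amp; in the XML output.
        Console.WriteLine(WriteCustomerXml("Split Rail Beer & Ale"));
    }
}
```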
Required: XML file that contains third-party XML data with special characters
Sometimes the XML file or the XML data that comes from a third party contains these special characters in unescaped form. In this scenario, the data generates errors when you load it into an XmlDocument object or an XmlReader object.
You receive the following error message when the ampersand character is encountered:
An Error occurred while parsing entity_name, line #, position #.
In this error message, line # and position # represent the exact position of the special character.
You receive the following error message when a left angle bracket is encountered:
The '<' character, hexadecimal value 0x3C, cannot be included in a name. Line #, position #.
In this error message, line # and position # do not indicate the position of the unescaped left angle bracket. They indicate the position where the next left angle bracket is encountered.
If the XML file contains a right angle bracket (>), a straight quotation mark ("), or an apostrophe ('), the XmlReader and the XmlDocument objects handle these characters without error because these characters require only single character replacement.
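To see these cases side by side, the following sketch loads three small fragments and reports the line and position that XmlException exposes. It is an illustration only: the class name ParseErrorDemo and the method TryLoad are hypothetical, and the exact message text may differ between .NET versions.

```csharp
using System;
using System.Xml;

public static class ParseErrorDemo
{
    // Returns null if the XML loads, or the XmlException if it does not.
    public static XmlException TryLoad(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return null;
        }
        catch (XmlException ex)
        {
            return ex;
        }
    }

    public static void Main()
    {
        string[] samples =
        {
            "<r>Beer & Ale</r>",        // unescaped ampersand: fails to load
            "<r>test<ing</r>",          // unescaped left angle bracket: fails to load
            "<r>a > b, \"c\", 'd'</r>"  // these characters load without error
        };

        foreach (string xml in samples)
        {
            XmlException ex = TryLoad(xml);
            if (ex == null)
            {
                Console.WriteLine("Loaded: " + xml);
            }
            else
            {
                Console.WriteLine("Line " + ex.LineNumber + ", position "
                    + ex.LinePosition + ": " + ex.Message);
            }
        }
    }
}
```

The LineNumber property is what the replacement code later in this article uses to find the line that must be repaired.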
Replace the special characters
To replace the ampersand and the left angle bracket characters:
- Create the XML file.
- Create the Visual C# .NET application, and then insert the code.
Create the XML file
Copy and paste the following code into Notepad, and then save the file as Customers.xml:
<?xml version="1.0" standalone="yes"?>
<Customers>
<Customer>
<CustomerID>BLAUS</CustomerID>
<CompanyName>Blauer See Delikatessen</CompanyName>
<ContactName>Hanna Moos</ContactName>
<Region>test<ing</Region>
</Customer>
<Customer>
<CustomerID>SPLIR</CustomerID>
<CompanyName>Split Rail Beer & Ale</CompanyName>
<ContactName>Art Braunschweiger</ContactName>
<Region>WY</Region>
</Customer>
</Customers>
Create the Visual C# .NET project
- Create a new Visual C# .NET Windows application as follows:
  - Start Microsoft Visual Studio .NET.
  - On the File menu, point to New, and then click Project.
  - In the New Project dialog box, click Visual C# Projects under Project Types, and then click Windows Application under Templates.
- Drag a TextBox control, two Button controls, and a DataGrid control from the toolbox to your default form, Form1.cs.
- Set the Multiline property of the TextBox to True.
- Import the following namespaces:
using System.Xml;
using System.IO;
using System.Data.SqlClient;
- Add the following code after the Main function:
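string filepath = "C:\\Customers.xml";

// Escapes one unescaped ampersand or left angle bracket on the given
// 1-based line of the file. Returns true if a character was replaced.
private bool ReplaceSpecialChars(long linenumber)
{
    string strline;
    string strreplace = "";
    string tempfile = "C:\\Temp.xml";
    try
    {
        // Work from a copy so that the original file can be rewritten.
        System.IO.File.Copy(filepath, tempfile, true);
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
        return false; // Without the copy, there is nothing safe to read from.
    }
    StreamReader strm = new StreamReader(tempfile);
    StreamWriter strmwriter = new StreamWriter(filepath);
    strmwriter.AutoFlush = true;
    // Copy the lines that precede the offending line unchanged.
    long i = 0;
    while (i < linenumber - 1)
    {
        strline = strm.ReadLine();
        strmwriter.WriteLine(strline);
        i = i + 1;
    }
    strline = strm.ReadLine();
    // Look for an ampersand first.
    int lineposition = strline.IndexOf("&");
    if (lineposition >= 0)
    {
        strreplace = "&amp;";
    }
    else if (strline.Length > 1)
    {
        // Start at index 1 to skip the "<" of the opening tag at the
        // start of the line, as in Customers.xml.
        lineposition = strline.IndexOf("<", 1);
        if (lineposition > 0)
        {
            strreplace = "&lt;";
        }
    }
    bool replaced = lineposition >= 0;
    if (replaced)
    {
        // Keep everything before the character, and then insert the entity reference.
        strline = strline.Substring(0, lineposition) + strreplace + strline.Substring(lineposition + 1);
    }
    strmwriter.WriteLine(strline);
    // Copy the rest of the file unchanged.
    strline = strm.ReadToEnd();
    strmwriter.Write(strline);
    strm.Close();
    strmwriter.Flush();
    strmwriter.Close();
    return replaced;
}

// Loads the XML file. On a parse error, escapes the offending character
// and then tries to load the file again.
public XmlDocument LoadXMLDoc()
{
    XmlDocument xdoc;
    try
    {
        xdoc = new XmlDocument();
        xdoc.Load(filepath);
    }
    catch (XmlException ex)
    {
        MessageBox.Show(ex.Message);
        // Rethrow if nothing could be replaced, to avoid retrying forever.
        if (!ReplaceSpecialChars(ex.LineNumber))
        {
            throw;
        }
        xdoc = LoadXMLDoc();
    }
    return xdoc;
}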
- Add the following code to the Button1_Click event:
XmlDocument xmldoc = LoadXMLDoc();
// FirstChild is the XML declaration; its next sibling is the Customers root element.
XmlNode nextnode = xmldoc.FirstChild.NextSibling;
this.textBox1.Text = nextnode.OuterXml;
- Add the following code to the Button2_Click event:
DataSet ds = new DataSet();
XmlDocument xdoc = new XmlDocument();
SqlConnection cnNwind = new SqlConnection("Data source=myServerName;user id=myUser;Password=myPassword;Initial catalog=Northwind;");
SqlDataAdapter daCustomers = new SqlDataAdapter("Select customerid,companyname,contactname, region from customers where region='WY'", cnNwind);
string filepath = "C:\\Customers.xml";
try
{
    // Fill the DataSet from the database, and then show it in the grid.
    daCustomers.Fill(ds, "Customers");
    this.dataGrid1.DataSource = ds.Tables["Customers"];
    // WriteXml escapes the special characters, so the file loads without errors.
    ds.WriteXml(filepath);
    xdoc.Load(filepath);
    XmlNode nextnode = xdoc.FirstChild.NextSibling;
    textBox1.Text = nextnode.OuterXml;
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
}
- Change the properties in the SqlConnection connection string as necessary for your environment.
- Build and run the project.
- Click Button1.
The errors that you receive are consistent with the errors that are described in the "Required: XML file that contains third-party XML data with special characters" section. The XML data appears in the TextBox; the ampersand is replaced with &amp;, and the left angle bracket is replaced with &lt;.
- Click Button2.
In the DataGrid, notice that companyname contains an ampersand and that the TextBox displays the XML data with &amp; in its place.