SUMMARY
This article describes how to replace special characters in
an Extensible Markup Language (XML) file by using Visual Basic
.NET.
Description of the Technique
XML predefines the following five entity references for special
characters that would otherwise be interpreted as part of markup language:
Character Name | Entity Reference | Character | Numeric Reference |
Ampersand | &amp; | & | &#38; |
Left angle bracket | &lt; | < | &#60; |
Right angle bracket | &gt; | > | &#62; |
Straight quotation mark | &quot; | " | &#34; |
Apostrophe | &apos; | ' | &#39; |
You can use entity and character references to escape the left angle bracket,
the ampersand, and other delimiters. You can also use numeric character
references; they are expanded immediately when they are recognized and they are
treated as character data, so you can use the numeric character references
&#60; and &#38; to escape the left angle bracket and the ampersand when they
occur in character data.
If you declare either of the two entities lt or amp in a document type
definition (DTD), you have to declare it as an internal entity whose replacement
text is a character reference to the respective character (the left angle
bracket or the ampersand) that is being escaped; the double escaping is
required for these entities so that references to them produce a well-formed
result.
If you declare any of the three entities gt, apos, or quot, you have to
declare it as an internal entity whose replacement text is the single
character being escaped.
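As a rough illustration of the five predefined entity references described in this section, the following Visual Basic .NET sketch escapes a string by plain text replacement. The module name EscapeSketch and the function name EscapeXmlText are hypothetical and are not part of the .NET Framework; the sketch assumes that the input contains no references that are already escaped.

```vbnet
Imports System

Module EscapeSketch
    ' Hypothetical helper (not a .NET Framework API): replaces each of the
    ' five special characters with its predefined entity reference.
    Function EscapeXmlText(ByVal text As String) As String
        ' Escape the ampersand first so that the ampersands introduced by
        ' the later replacements are not escaped a second time.
        Dim result As String = text.Replace("&", "&amp;")
        result = result.Replace("<", "&lt;")
        result = result.Replace(">", "&gt;")
        result = result.Replace("""", "&quot;")
        result = result.Replace("'", "&apos;")
        Return result
    End Function

    Sub Main()
        Console.WriteLine(EscapeXmlText("Split Rail Beer & Ale")) ' Split Rail Beer &amp; Ale
        Console.WriteLine(EscapeXmlText("test<ing"))              ' test&lt;ing
    End Sub
End Module
```

The order of the replacements matters: if the ampersand were handled last, the ampersand in each newly written &lt; or &gt; would itself be escaped.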
Determine Whether a Special Character Replacement Is Required
Not Required: XML Files in Which the Data Is Retrieved from a Database
When you use the Microsoft .NET Framework, data that is retrieved from a
database is typically stored in a DataSet. When you write data from a DataSet
to an XML file by using the WriteXml method, the special characters that are
listed in the "Description of the Technique" section are replaced with the
corresponding entity references wherever escaping is required; therefore, when
you write XML files from a DataSet, no special replacement process is required.
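The behavior described above can be checked without a database. The following sketch builds a one-row DataSet in memory and writes it to a string instead of a file; the module name WriteXmlSketch, the function name DataSetToXml, and the table and column names are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.Data
Imports System.IO

Module WriteXmlSketch
    ' Hypothetical helper: stores one company name in an in-memory DataSet
    ' and returns the XML text that WriteXml produces for it.
    Function DataSetToXml(ByVal companyName As String) As String
        Dim ds As New DataSet("Customers")
        Dim table As DataTable = ds.Tables.Add("Customer")
        table.Columns.Add("CompanyName", GetType(String))
        table.Rows.Add(New Object() {companyName})

        ' Write to a StringWriter so that no file is needed.
        Dim writer As New StringWriter()
        ds.WriteXml(writer)
        Return writer.ToString()
    End Function

    Sub Main()
        ' The CompanyName element in the output should contain "Beer &amp; Ale".
        Console.WriteLine(DataSetToXml("Split Rail Beer & Ale"))
    End Sub
End Module
```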
Required: An XML File with Special Characters
Sometimes an XML file or XML data that comes from a third party contains
these special characters in unescaped form; in this scenario, the data
generates errors when you load it into an XmlDocument object or an XmlReader
object.
The following error is generated when the ampersand character is encountered:
An error occurred while parsing entity_name, line #, position #.
where line # and position # represent the exact position of the special
character.
The following error occurs when a left angle bracket is encountered:
The '<' character, hexadecimal value 0x3C, cannot be included in a name. Line #, position #.
In this error message, line # and position # do not indicate the position of
the offending left angle bracket, but the position where the next left angle
bracket is encountered.
If the XML file contains a right angle bracket (>), a straight quotation
mark ("), or an apostrophe (') in character data, the XmlReader and the
XmlDocument objects handle these characters without error because only single
character replacement is required for them.
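To see how these errors surface in code, the following sketch loads XML text with XmlDocument.LoadXml and reports the LineNumber and LinePosition properties of the resulting XmlException. The module name ParseErrorSketch and the function name FirstErrorLine are hypothetical, and the sample strings are assumptions modeled on the data used later in this article.

```vbnet
Imports System
Imports System.Xml

Module ParseErrorSketch
    ' Hypothetical helper: returns the 1-based line number of the first
    ' parse error, or 0 if the text is well-formed XML.
    Function FirstErrorLine(ByVal xml As String) As Integer
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(xml)
            Return 0
        Catch ex As XmlException
            ' LineNumber and LinePosition locate the point where parsing failed.
            Console.WriteLine("{0} (line {1}, position {2})", _
                              ex.Message, ex.LineNumber, ex.LinePosition)
            Return ex.LineNumber
        End Try
    End Function

    Sub Main()
        ' The unescaped ampersand on the second line makes the load fail.
        Dim bad As String = "<Customers>" & Environment.NewLine & _
                            "<CompanyName>Split Rail Beer & Ale</CompanyName>" & Environment.NewLine & _
                            "</Customers>"
        Console.WriteLine(FirstErrorLine(bad))

        ' A right angle bracket and quotation marks in character data load without error.
        Console.WriteLine(FirstErrorLine("<Region>1 > 0 ""x"" 'y'</Region>"))
    End Sub
End Module
```

The first call should report line 2, and the second call should return 0 because none of the characters in that string needs to be escaped in character data.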
Replace the Special Characters
To replace the ampersand and the left angle bracket characters:
- Create the XML file.
- Create the Visual Basic .NET application, and then insert
the code.
Create the XML File
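Copy and paste the following code into Notepad, and then save the file as
customers.xml in the root folder of drive C (the sample code expects
C:\customers.xml). The file intentionally contains an unescaped left angle
bracket and an unescaped ampersand: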
<?xml version="1.0" standalone="yes"?>
<Customers>
<Customer>
<CustomerID>BLAUS</CustomerID>
<CompanyName>Blauer See Delikatessen</CompanyName>
<ContactName>Hanna Moos</ContactName>
<Region>test<ing</Region>
</Customer>
<Customer>
<CustomerID>SPLIR</CustomerID>
<CompanyName>Split Rail Beer & Ale</CompanyName>
<ContactName>Art Braunschweiger</ContactName>
<Region>WY</Region>
</Customer>
</Customers>
Create Visual Basic .NET Project
- Create a new Visual Basic .NET Windows application.
- Drag a TextBox control, two Button controls, and a DataGrid control onto the form.
- Set the Multiline property of the TextBox to True.
- Import the following namespaces:
Imports System.Xml
Imports System.IO
Imports System.Data.SqlClient
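- After the following line
Inherits System.Windows.Forms.Form
copy and paste the following code sample:
' Path of the XML file that is repaired and loaded.
Dim filepath As String = "C:\customers.xml"

' Escapes one unescaped ampersand or left angle bracket on the
' specified (1-based) line of the XML file.
Private Sub ReplaceSpecialChars(ByVal linenumber As Long)
    Dim strm As StreamReader
    Dim strline As String
    Dim strreplace As String = ""
    Dim tempfile As String = "C:\temp.xml"

    ' Copy the file so that the original can be rewritten from the copy.
    Try
        FileCopy(filepath, tempfile)
    Catch ex As Exception
        MessageBox.Show(ex.Message)
    End Try

    Dim strmwriter As New StreamWriter(filepath)
    strmwriter.AutoFlush = True
    strm = New StreamReader(tempfile)

    ' Copy the lines that precede the offending line unchanged.
    Dim i As Long = 0
    While i < linenumber - 1
        strline = strm.ReadLine
        strmwriter.WriteLine(strline)
        i = i + 1
    End While

    ' Find the special character on the offending line. This simple search
    ' assumes that the first "&" on the line is the unescaped one.
    strline = strm.ReadLine
    Dim lineposition As Int32
    lineposition = InStr(strline, "&")
    If lineposition > 0 Then
        strreplace = "&amp;"
    Else
        ' Search from position 2, which assumes that the line begins
        ' with the "<" of its start tag.
        lineposition = InStr(2, strline, "<")
        If lineposition > 0 Then
            strreplace = "&lt;"
        End If
    End If

    ' Replace the character only if one was found.
    If lineposition > 0 Then
        strline = Mid(strline, 1, lineposition - 1) + strreplace + Mid(strline, lineposition + 1)
    End If
    strmwriter.WriteLine(strline)

    ' Copy the rest of the file unchanged.
    strline = strm.ReadToEnd
    strmwriter.Write(strline)

    strm.Close()
    strm = Nothing
    strmwriter.Flush()
    strmwriter.Close()
    strmwriter = Nothing
End Sub

' Loads the XML file. Each time parsing fails, escapes a character on
' the reported line and then tries to load the file again.
Public Function LoadXMLDoc() As XmlDocument
    Dim xdoc As XmlDocument
    Dim lnum As Long
    Try
        xdoc = New XmlDocument()
        xdoc.Load(filepath)
    Catch ex As XmlException
        MessageBox.Show(ex.Message)
        lnum = ex.LineNumber
        ReplaceSpecialChars(lnum)
        xdoc = LoadXMLDoc()
    End Try
    Return (xdoc)
End Function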
- Copy and paste the following code into the Click event of Button1:
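' Load the file, repairing it as needed, and show the root element.
Dim xmldoc As XmlDocument = LoadXMLDoc()
' FirstChild is the XML declaration; its next sibling is the root element.
Dim nextnode As XmlNode = xmldoc.FirstChild.NextSibling
TextBox1.Text = nextnode.OuterXml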
- Copy and paste the following code into the Click event of Button2:
Dim ds As New DataSet()
Dim xdoc As New XmlDocument()
Dim cnNwind As New SqlConnection("Data source=myservername;user id=myuser;Password=mypassword;Initial catalog=Northwind;")
Dim daCustomers As New SqlDataAdapter("Select customerid,companyname,contactname, region from customers where region='WY'", cnNwind)
Try
    ' Fill the DataSet from the database and bind it to the grid.
    daCustomers.Fill(ds, "Customers")
    DataGrid1.DataSource = ds.Tables(0)
    ' WriteXml escapes the special characters, so the file loads without errors.
    ds.WriteXml("C:\Dataset.xml")
    xdoc.Load("C:\Dataset.xml")
    Dim nextnode As XmlNode
    nextnode = xdoc.FirstChild.NextSibling
    TextBox1.Text = nextnode.OuterXml
Catch ex As Exception
    MessageBox.Show(ex.Message)
End Try
- Change the server name, the user name, and the password to
connect to your server that is running Microsoft SQL Server.
- Build the project, and then run it.
- Click Button1.
The errors that you receive are consistent with the errors that are
described in the "Required: An XML File with Special Characters" section.
The XML data is then displayed in the TextBox; the ampersand is replaced with
&amp; and the left angle bracket is replaced with &lt;.
- Click Button2.
In the DataGrid, companyname contains an ampersand, and the TextBox shows the
XML data with &amp; in its place.