How To Locate and Replace Special Characters in an XML Document with Visual Basic (251354)



The information in this article applies to:

  • Microsoft Visual Basic Learning Edition for Windows 6.0
  • Microsoft Visual Basic Professional Edition for Windows 6.0
  • Microsoft Visual Basic Enterprise Edition for Windows 6.0
  • ActiveX Data Objects (ADO) 2.5
  • ActiveX Data Objects (ADO) 2.6
  • Microsoft XML 3.0
  • Microsoft XML 4.0

This article was previously published under Q251354

SUMMARY

XML predefines five entity references for special characters that would otherwise be interpreted as part of the markup language. These five are listed as follows:
Entity Reference                Character 

     &                        & 
  
     &lt:                         < 

     &gt                         > 

     &quot;                       " 

     &apos;                       '                    
				
When you create an XML Document, you must input these special characters appropriately to insure proper parsing, interpretation, and data display. When you create an XML Document from Visual Basic, one way to search for these special characters and replace them with the proper entity reference is to use the Replace function.

NOTE: Microsoft Visual Basic 6.0 and later products, and Microsoft Office 98 and later products come with a Replace function that is built-in.

MORE INFORMATION

The following sample opens an ADO Recordset, creates Document Elements for the records, and saves the data as an XML Document using the FileSystemObject object. Before you save the file, however, the content is searched for special characters and replaced with the appropriate XML Entity References for those characters utilizing the Replace Method.

NOTE: For simplicity of demonstration, the instructions are for a search and replace for the Ampersand (&) character in the following sample. The same function can be applied for the other four characters as well.
  1. To run this sample, start a new Visual Basic Standard Exe Project. Form1 is created by default.
  2. On the Project menu, under References, select the following: Microsoft ActiveX Data Objects Library (2.1 or later)
    Microsoft XML, Version 1.0 or later
    Microsoft Scripting Runtime*
    *The FileSystemObject Object Model is contained in the Scripting Type Library.

  3. Draw the following controls on Form1 and set their Properties as listed:
       Control                   Name			   Caption
    Label                  lblFSORecordCount           Record Count:
    Text Box               txtFSORecordCount
    Command Button         cmdOpenAdoFsoRs             Open ADO Recordset	
    Command Button         cmdSaveRSasXMLwithFSO       Save RS as XML using FSO
    Command Button         cmdFsoXmltoTreeView         Place XML Data Into TreeView
    TreeView Control*      xtvFSO
    *The TreeView Control is part of the Windows Common Controls. 
    Therefore, you want to check this off under Components in the Project menu.
    					
  4. From the Project menu, add a standard Module and copy the following to create the appropriate variables needed in the project:
    Option Explicit
    
    Public cn As New adodb.Connection
    Public rs As New adodb.Recordset
    Public myI As Integer
    Public StrTmp As String
    Public Errs1 As Errors
    Public myFSO As New FileSystemObject
    Public myTxtStream As TextStream
    Public myfld As adodb.Field
    Public domdoc As New MSXML.DOMDocument
    Public domdoc As New MSXML2.DOMDocument 'For use with XML 2.6
    
    					
  5. Copy the following code into the General Declarations Section of the code window.NOTE: Replace your Server Name, User ID, and Password information in the Connection String.
    Private Sub cmdOpenAdoFsoRs_Click()
    Screen.MousePointer = vbHourglass
    
    'Check to be sure the connection is not already opened:
    If cn.State = 1 Then
        cn.Close
        Set cn = Nothing
    End If
    
    cn.Open "driver={SQL Server};Server=YourServer;uid=YourUID;pwd=YourPWD;database=Northwind"
    
    'Check to be sure a Recordset is not already opened:
    If rs.State = 1 Then
        rs.Close
        Set rs = Nothing
    End If
    
    With rs
        .CursorLocation = adUseClient
        .CursorType = adOpenKeyset
        .LockType = adLockOptimistic
        .CacheSize = 50
        .Source = "Select CustomerID, " & _
        "CompanyName, ContactName, Region from Customers"
        .ActiveConnection = cn
        .Open
    End With
    
    rs.MoveLast
    rs.MoveFirst
    
    txtFsoRecordCount.Text = rs.RecordCount
    
    Screen.MousePointer = vbNormal
    
    End Sub
    
    Private Sub cmdExit_Click()
    If cn.State = 1 Then
        cn.Close
        Set cn = Nothing
    End If
    
    If rs.State = 1 Then
        rs.Close
        Set rs = Nothing
    End If
    
    Unload Me
    
    End Sub
    
    Private Sub cmdSaveRSasXMLwithFso_Click()
    Screen.MousePointer = vbHourglass
    
    'If the file already exists, first Kill it
    'and then create it.
    'If it does not already exist, simply create it:
    
    On Error GoTo mdkNoSeeEm
    Const mdkNoSeeEm = 53 'File Not Found
    Kill "Customers.xml"
    
    'Note: Unicode data is stored as 2 Bytes per character. This will allow
    'proper parsing of international language characters. However, either
    'omitting or explicitly setting the last 'argument in the CreateTextFile 
    'Method to False will create the file as Ascii. Subsequently, a '"An 
    'Invalid Character Was Found in Text Content" error will occur if these 
    'characters exist in 'the text. Therefore, you use Unicode (True). 
    
    'Create a Text file using the FileSystemObject:
    
    Set myTxtStream = myFSO.CreateTextFile(App.Path & "\Customers.xml", True, True)
    
    'After searching for and replacing special characters with
    'their respective entities (via the ReplaceEntXMLSpecChar Function),
    'write the data to Text File to compile XML Document using the TextStream:
    
    myTxtStream.WriteLine "<?xml version=""1.0"" standalone=""yes""?>"
    myTxtStream.WriteLine "<Customers>"
    
    rs.MoveFirst
    
        Do While Not rs.EOF
    myTxtStream.WriteLine "<Customer>"
    myTxtStream.WriteLine "<CustomerID>" & ReplaceEntXMLSpecChar(rs.Fields("CustomerID") & "", "&", "amp;") & "</CustomerID>"
    myTxtStream.WriteLine "<CompanyName>" & ReplaceEntXMLSpecChar(rs.Fields("CompanyName") & "", "&", "amp;") & "</CompanyName>"
    myTxtStream.WriteLine "<ContactName>" & ReplaceEntXMLSpecChar(rs.Fields("ContactName") & "", "&", "amp;") & "</ContactName>"
    myTxtStream.WriteLine "<Region>" & ReplaceEntXMLSpecChar(rs.Fields("Region") & "", "&", "amp;") & "</Region>"
    myTxtStream.WriteLine "</Customer>"
    rs.MoveNext
    Loop
    
    rs.MoveFirst
    
    myTxtStream.WriteLine "</Customers>"
    
    myTxtStream.Close
    
    Screen.MousePointer = vbNormal
    
    Exit Sub
    
    mdkNoSeeEm:
        If Err.Number = mdkNoSeeEm Then
            Resume Next
        End If
        
    Screen.MousePointer = vbNormal
    
    End Sub
    
    'When this function is applied to each record in the recordset,
    'simply provide the Source to perform the search on, the item
    'to be Searched For and the item to Replace with:
    
    Private Function ReplaceEntXMLSpecChar(ByVal strSource As String, ByVal strSearchFor As String, ByVal strReplace As String) _
        As String
        Dim lngPointer As Long, strNew As String
        lngPointer = InStr(strSource, strSearchFor)
        If lngPointer = 0 Then
            ReplaceEntXMLSpecChar = strSource
        Else
            strNew = Left$(strSource, lngPointer - 1) & strReplace & Mid(strSource, lngPointer + 1, Len(strSource))
            ReplaceEntXMLSpecChar = strNew
        End If
    End Function
    
    Private Sub cmdFsoXmltoTreeView_Click()
    'Fill The TreeView with data from the XML Document:
    
    With DOMdoc
        .async = False
        .Load App.Path & "\customers.xml"
    End With
    
    If DOMdoc.readyState = 4 Then
        xtvFSO.Nodes.Clear
        AddNode DOMdoc.documentElement
    End If
    End Sub
    Private Sub AddNode(ByRef myNode As IXMLDOMNode, Optional ByRef TreeNode As Node)
    
           Dim tvcusterrNode As Node
           Dim tvcusterrNodeList As IXMLDOMNodeList
           Dim i As Long
                    
           If TreeNode Is Nothing Then
               Set tvcusterrNode = xtvFSO.Nodes.Add
           Else
               Set tvcusterrNode = xtvFSO.Nodes.Add(TreeNode, tvwChild)
           End If
              
           tvcusterrNode.Expanded = True
           tvcusterrNode.Text = myNode.nodeName
              
           If tvcusterrNode.Text = "#text" Then
                  tvcusterrNode.Text = myNode.nodeTypedValue
           Else
                  tvcusterrNode.Text = "<" & tvcusterrNode.Text & ">"
      End If
    					
  6. Run the project.
  7. Click the Open ADO Recordset CommandButton to open the Connection and Recordset, displaying a RecordCount.
  8. Click the Save RS as XML Using FSO CommandButton to parse the data, replace special characters in the Recordset, and then save it as an XML document through the FileSystemObject object.
  9. Subsequently, click the Place FSO XML Data into TreeView CommandButton to open the XML File and display the data in a TreeView Control.

    If you open the XML document produced in Notepad or any other text editor and do a search for the ampersand (&), you find:
    Split Rail Beer &amp;
    						
    In the treeview, this appears as:
    Split Rail Beer &
    						
If a newer version of MSXML has been installed in side-by-side mode, you must explicitly use the Globally Unique Identifiers (GUIDs) or ProgIDs for that version to run the sample code. For example, MSXML version 4.0 can only be installed in side-by-side mode. For additional information about the code changes that are required to run the sample code with the MSXML 4.0 parser, click the following article number to view the article in the Microsoft Knowledge Base:

305019 INFO: MSXML 4.0 Specific GUIDs and ProgIds

REFERENCES

For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

308060 How To Locate and Replace Special Characters in an XML File with Visual Basic .NET


Modification Type:MinorLast Reviewed:7/1/2004
Keywords:kbDatabase kbhowto KB251354