How To Parse a Person's Name into Multiple Variables (168799)



The information in this article applies to:

  • Microsoft Visual Basic Control Creation Edition for Windows 5.0
  • Microsoft Visual Basic Learning Edition for Windows 5.0
  • Microsoft Visual Basic Professional Edition for Windows 5.0
  • Microsoft Visual Basic Enterprise Edition for Windows 5.0
  • Microsoft Visual Basic Standard Edition, 32-bit, for Windows 4.0
  • Microsoft Visual Basic Professional Edition, 16-bit, for Windows 4.0
  • Microsoft Visual Basic Professional Edition, 32-bit, for Windows 4.0
  • Microsoft Visual Basic Enterprise Edition, 16-bit, for Windows 4.0
  • Microsoft Visual Basic Enterprise Edition, 32-bit, for Windows 4.0
  • Microsoft Visual Basic Standard Edition for Windows 2.0
  • Microsoft Visual Basic Standard Edition for Windows 3.0
  • Microsoft Visual Basic Professional Edition for Windows 2.0
  • Microsoft Visual Basic Professional Edition for Windows 3.0
  • Microsoft Access 1.0
  • Microsoft Access 1.1
  • Microsoft Access 2.0
  • Microsoft Access for Windows 95 7.0
  • Microsoft Access 97
  • Microsoft Excel 97 for Windows
  • Microsoft Word 97 for Windows
  • Microsoft PowerPoint 97 for Windows

This article was previously published under Q168799

SUMMARY

This article provides a procedure for parsing a single variable containing a person's name into up to 6 parts:
   Title                (Mr., Ms., etc.)
   First Name
   Middle Initial(s)
   Last Name
   Pedigree             (Jr., Sr., III, etc.)
   Degree(s)            (B.S., PhD, etc.)
				

MORE INFORMATION

WARNING: ANY USE BY YOU OF THE CODE/MACRO PROVIDED IN THIS ARTICLE IS AT YOUR OWN RISK. Microsoft provides this code/macro "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

NOTE: In the following sample code, an underscore (_) at the end of a line is used as a line-continuation character. For versions of BASIC that don't support line-continuation characters, remove the underscore from the end of the line and merge with the following line when re-creating this code.

Fields are parsed and removed from the name in the following order:

Title - (if the first word matches the list of standard titles)
Degree - (anything after the first comma)
Pedigree - (if the last word matches the list of standard pedigrees)
Last Name - (must not contain spaces, but can be hyphenated)
First Name - (must not contain spaces)
Middle Initial(s) - (the remainder)

Step-by-Step Example

  1. Enter the following code:
          Function CutLastWord (ByVal S As String, Remainder As String) _
              As String
          ' CutLastWord: returns the last word in S.
          ' Remainder: returns the rest.
          '
          ' Words are separated by spaces
          '
          Dim  I As Integer, P As Integer
            S = Trim$(S)
            P = 1
            For I = Len(S) To 1 Step -1
              If Mid$(S, I, 1) = " " Then
                P = I + 1
                Exit For
              End If
            Next I
            If P = 1 Then
              CutLastWord = S
              Remainder = ""
            Else
              CutLastWord = Mid$(S, P)
              Remainder = Trim$(Left$(S, P - 1))
            End If
          End Function
    
          Function CutWord (ByVal S As String, Remainder As String) As String
          '
          ' CutWord: returns the first word in S.
          ' Remainder: returns the rest.
          '
          Dim P As Integer
            S = Trim$(S)
            P = InStr(S, " ")
            If P = 0 Then P = Len(S) + 1
            CutWord = Left$(S, P - 1)
            Remainder = Trim$(Mid$(S, P + 1))
          End Function
    
          Sub ParseName (ByVal S As String, Title As String, FName As String, _
                         MName As String, LName As String, _
                         Pedigree As String, Degree As String)
          Dim Word As String, P As Integer, Found As Integer
          Const Titles = "Mr.Mrs.Ms.Dr.Miss,Sir,Madam,Mayor,President"
          Const Pedigrees = "Jr.Sr.III,IV,VIII,IX,XIII"
            Title = ""
            FName = ""
            MName = ""
            LName = ""
            Pedigree = ""
            Degree = ""
          '
          ' Get Title
          '
            Word = CutWord(S, S)
            If InStr(Titles, Word) Then
              Title = Word
            Else
              S = Word & " " & S
            End If
          '
          ' Get Degree
          '
            P = InStr(S, ",")
            If P > 0 Then
              Degree = Trim$(Mid$(S, P + 1))
              S = Trim$(Left$(S, P - 1))
            End If
          '
          ' Get Pedigree
          '
            Word = CutLastWord(S, S)
            If InStr(Pedigrees, Word) Then
              Pedigree = Word
            Else
              S = S & " " & Word
            End If
          '
          ' Get the rest
          '
            LName = CutLastWord(S, S)   ' Last Name
            FName = CutWord(S, S)       ' First Name
            MName = Trim(S)             ' Initials/Middle Name(s)
          End Sub
    						
  2. To test, create a form with seven text boxes (txtName, txtTitle, txtFirstName, txtMI, txtLastName, txtPedigree, txtDegree), and a command button. Add the following code:
          Sub Command1_Click()
          Dim Title As String, FName As String, MI As String
          Dim LName As String, Pedigree As String, Degree As String
            ParseName txtName, Title, FName, MI, LName, Pedigree, Degree
            txtTitle = Title
            txtFirstName = FName
            txtMI = MI
            txtLastName = LName
            txtPedigree = Pedigree
            txtDegree = Degree
          End Sub
    						
  3. Display the form, type a name into txtName, and click the command button. The other six fields should contain the parsed values.

Additional Information

Other notes, limitations, and suggestions for improvement (left to the reader's devices):

  1. The parsing routine assumes a comma before the Degree but not before the Pedigree. If there is a comma before the Pedigree, it will be included as part of the Degree(s). If you want to allow the Pedigree to be preceded by a comma, you can add an additional check to see if the first word (comma separated) of the Degree is in the list of standard Pedigrees.
  2. The first and last names are assumed to contain no spaces. If they do (e.g. Mary Beth Saint John), part of the name (e.g. Beth Saint) will end up in the Middle Initial/Middle Name return value. Unfortunately, due to the high variability of this case, there is no good workaround other than manual intervention.
  3. The parsing routine is designed for English/American names. Other languages may have different conventions that may require adjustments to the parsing code.
  4. The list of standard Titles and Pedigrees may be easily extended by adding entries to the two Const declarations in ParseName.
  5. The check for Title can be made more generic by looking for a period at the end of the first word if it is not already in the list of standard Titles.

Modification Type:MinorLast Reviewed:10/11/2006
Keywords:kbhowto kbProgramming KB168799