Resting Anchor

The Anchorage

Personal website of Gregory K. Maxey, Commander USN (Retired)

Extract Data from Word Document File
(A Microsoft Word Help & Tip page by Gregory K. Maxey)

DISCLAIMER/TERMS OF USE

The information, illustrations and code contained in my "Microsoft Word Tips" are provided free and without risk or obligation.

Click to acces PayPal Verification Service Click to acces PayPal Verification Service

However, the work is mine. If you use it for commercial purposes or benefit from my efforts through income earned or time saved then a donation, however small, will help to ensure the continued availability of this resource.

If you would like to donate, please use the appropriate donate button to access PayPal. Thank you!

Click to donate British Pound Sterling                   Click to donate US dollars                   Click to donate EU euros

This Microsoft Word Tips & Microsoft Word Help page is is a companion to my Extract Batch Form Data page and shows you how to extract data from a targeted source file.  The data in the source file can be tabular, contained in the form fields of a protected form, or if you are using Word 2007 or later, contained in content controls. Using a source file as a data sheet in this manner you can build a collection of templates that draw or extract data from a common source data file.

I will leave the specific application of these methods up to you, but one application that I have often been asked about is how to a create master file of data about a certain person, client, or product, etc., and use that file as a source of data for creating one or more specific documents related to that person, client or product.

Site Note IconNote: The general concept presented in this page was applied to my DocBundler Word template add-in

In the examples that follow, I will show you how to extract data stored in a source document table, protected document form field, and a content control. I will then show you how to place that data in a new document bookmark, protected form field and content control. I understand that my examples may appear unrealistic or silly. Again, my objective is to demonstrate the methods and not a specific application.

Extracting Table Data:

The illustration below depicts a typical source document file with a table containing client data. You may have one or a thousand such files. Each one is similar, but unique to the specific person, client, product etc., and each one has a unique file name. I named this example file JSRagman.doc.

extract data from doc 1
Document: JSRagman.doc

After the source files are finished for each of my clients/products etc., I prepare one or more document templates to use each time I want to create a new document using JSRagman.doc (or similiar data sheet file) as a data source. For more on templates and why you should use them see Word MVP Suzanne Barnhill's: How to Create a Template.

The illustration below depicts a typical template.  I prepared it to create simple announcement documents the firm uses to invite a client and their spouse to an investment seminar. The grey "I" brackets represent bookmarks where data extracted from the source file will be placed.

extract data from doc 2
Template: Seminar Invite.dot

In the VBA project module of the template I added an AutoNew procedure. This procedure is executed each time a new document is create using the template. I also added two other procedures required for processing. All three procedures are contained in a standard project module in the template named "modMain" and are shown below.

VBA Script:
Sub AutoNew()
'Executes each time a new document is created from the template.
Dim oThisDoc As Word.Document
Dim oSourceDoc As Word.Document
Dim strTemp As String
Dim oTbl As Word.Table
  'Call a function to pick the source file
  strTemp = GetOpenFileName
  If strTemp = "" Then Exit Sub
  Set oThisDoc = ActiveDocument
  'Open the source file "invisibly to the user" to extract data
  Set oSourceDoc = Documents.Open(FileName:=strTemp, Visible:=False)
  Set oTbl = oSourceDoc.Tables(1)
  'Insert the text content of the appropriate source document table cell in the bookmarks
  'The "Left" method is used to strip the end of cell marker from the cell text.
  WriteDataToBM oThisDoc, "Name", Left(oTbl.Cell(2, 1).Range.Text, Len(oTbl.Cell(2, 1).Range.Text) - 2)
  WriteDataToBM oThisDoc, "Address", Left(oTbl.Cell(2, 2).Range.Text, Len(oTbl.Cell(2, 2).Range.Text) - 2)
  WriteDataToBM oThisDoc, "FirstName", Left(oTbl.Cell(2, 3).Range.Text, Len(oTbl.Cell(2, 3).Range.Text) - 2)
  WriteDataToBM oThisDoc, "SpouseName", Left(oTbl.Cell(2, 4).Range.Text, Len(oTbl.Cell(2, 4).Range.Text) - 2)
  'Close the source file and cleanup.
  oSourceDoc.Close wdDoNotSaveChanges
  Set oThisDoc = Nothing
  Set oSourceDoc = Nothing
  Set oTbl = Nothing
lbl_Exit:
  Exit Sub
End Sub

Function GetOpenFileName() As String
  With Dialogs(wdDialogFileOpen)
    If .Display = -1 Then
      GetOpenFileName = WordBasic.FileNameInfo$(.Name, 1)
    Else
      GetOpenFileName = ""
    End If
  End With
lbl_Exit:
  Exit Function
End Function

Sub WriteDataToBM(ByRef oDoc As Word.Document, bmName As String, strText As String)
Dim oRng As Word.Range
  If oDoc.Bookmarks.Exists(bmName) Then
    Set oRng = oDoc.Bookmarks(bmName).Range
    oRng.Text = strText
    oDoc.Bookmarks.Add bmName, oRng
  End If
lbl_Exit:
  Exit Sub
End Sub

Site Note icon See: Installing Macros for instructions on how to set up and use the macros provided in this Microsoft Word Help & Microsoft Word Tips page.

When AutoNew executes, it call a procedure "GetOpenFileName" which presents a dialog to the user. Using this dialog, the user selects the data source for the document. In this case the user only has one option (i.e., JSRagman.doc). You of course could have many similar files to choose from.

extract data from doc 3

When the new document is created, data is automatically extracted from the source file and placed in the proper bookmark placeholders.

extract data from doc 1
Finished Document

Extracting Protect Document Form Field Data:

Like with the table method shown above, you start with a collection files to use as the data source. The illustration below depicts a typical protected document used for this purpose.

extract data from doc 5
Document: EFudd.doc

Each of the source document formfields is identified with a unique bookmark name. For example, the loan balance field is bookmarked as "LoanBalance"

extract data from doc 6

Again, I prepare templates for the documents that I want to create that will use EFudd.doc (or similar file) as a data source. The illustration below depicts a typical template. This template illustrates both bookmarks and a form field used as placeholders for the extracted data.

extract data from doc 7
Protected Form Template: Loan Call.dot

Again, similar procedures are included in the template which are used to pick the source document and create the new document.

Site Note IconNote:  Only the AutoNew procedure is repeated below.  You can download a tips page demonstration file containing all of the documents, templates and procedures used in these examples using the link a the end of this page.

VBA Script:
Sub AutoNew()
Dim oThisDoc As Word.Document
Dim oSourceDoc As Word.Document
Dim strTemp As String
  'Call a function to pick the source file
  strTemp = GetOpenFileName
  If strTemp = "" Then Exit Sub
  Set oThisDoc = ActiveDocument
  'Open the source file "invisibly to the user" to extract data
  Set oSourceDoc = Documents.Open(FileName:=strTemp, Visible:=False)
  'The protected form must be unlocked
  oThisDoc.Unprotect
  'Insert the text content of the appropriate source document table cell in the bookmarks
  'The "Left" method is used to strip the end of cell marker from the cell text.
  WriteDataToBM oThisDoc, "Name", oSourceDoc.FormFields("Name").Result
  WriteDataToBM oThisDoc, "Addr1", oSourceDoc.FormFields("Addr1").Result
  WriteDataToBM oThisDoc, "Addr2", oSourceDoc.FormFields("Addr2").Result
  WriteDataToBM oThisDoc, "LastName", oSourceDoc.FormFields("LastName").Result
  'Relock the form
  oThisDoc.Protect wdAllowOnlyFormFields, True
  'Here the formfield value in the new document is set to the value of corresponding formfield in the source document
  oThisDoc.FormFields("LoanBalance").Result = oSourceDoc.FormFields("LoanBalance").Result
  'Close the source file and cleanup.
  oSourceDoc.Close wdDoNotSaveChanges
  Set oThisDoc = Nothing
  Set oSourceDoc = Nothing
lbl_Exit:
  Exit Sub
End Sub

The resulting document is a protected document with editable fields for supply information unique to the particular case.

extract data from doc 8
Finished Document

Extracting Word 2007/2010 Content Control Data:

The final example demonstrates using content controls both as individual data stores in the source document and data placeholders in the templates.

Content Controls are extremely versatile and offer several advantages over the previous methods shown:

Site Note IconNote: Advantage or disadvantage, the court is still out.  Content control titles and tags are not unique.  You can have multiple content controls in a document with the same title or tag.  This can add some complexity to VBA procedures when working with content controls.  While in our example each content control will have a unique title, I will show you how to use a common tag in the code to avoid traps.

The illustration below depicts a typical source document data sheet using content controls as individual data stores.

extract data from doc 9
Document: BBunny.docx

As before, I prepare templates for the documents that I want to create that will use BBunny.docx (or similar file) as a data source. The illustration below depicts a typical template.

Site Note IconNote:  When preparing the template, title all content content that you will use as placeholders using the same title used in the data sheet.  Tag the placeholder content controls using "DependentCC" 

extract data from doc 10
Template: Rate Reduction Deal.dotm

extract data from doc 11

As in the other examples, an AutoNew procedure stored in the template is executed when a new document is created.  Since content controls titles and tags are not unique identifiers, this procedure, while similar in purpose, has significant changes.

VBA Script:
Sub AutoNew()
Dim oThisDoc As Word.Document
Dim oSourceDoc As Word.Document
Dim strTemp As String
Dim matchingMasterCCs As ContentControls
Dim oDependentCCs As ContentControls
Dim oDependentCC As ContentControl
Dim bSame As Boolean
Dim bLocked As Boolean
Dim oCC As ContentControl
  bLocked = False
  strTemp = GetOpenFileName
  If strTemp = "" Then Exit Sub
  Set oThisDoc = ActiveDocument
  Set oSourceDoc = Documents.Open(FileName:=strTemp, Visible:=False)
  'Get the collection of DependentCCs
  Set oDependentCCs = oThisDoc.SelectContentControlsByTag("DependentCC")
  'Process each DependentCC
  For Each oDependentCC In oDependentCCs
    If oDependentCC.LockContents = True Then
      bLocked = True
      oDependentCC.LockContents = False
    End If
    'Get collection of CCs in data sheet that have a title matching the DependentCC
    Set matchingMasterCCs = oSourceDoc.SelectContentControlsByTitle(oDependentCC.Title)
    'Should only be one.
    If matchingMasterCCs.Count = 1 Then
      oDependentCC.Range.Text = matchingMasterCCs(1).Range.Text
    'If more than one then ...
    ElseIf matchingMasterCCs.Count > 1 Then
      bSame = True
      '... see if all like titled CCs contain the same content.
      For Each oCC In matchingMasterCCs
        If oCC.Range.Text <> matchingMasterCCs(1).Range.Text Then
          'If not then
          bSame = False
          MsgBox "The source document contains multiple CCs titled " _
                 & Chr(34) + oDependentCC.Title + Chr(34) _
                 & " that contain different content." & vbCr + vbCr _
                 & "The CC titled " & Chr(34) + oDependentCC.Title + Chr(34) _
                 & " in this document will not be updated"
          Set oCC = Nothing
          Exit For
        End If
      Next oCC
      'If so then
      If bSame Then
        oDependentCC.Range.Text = matchingMasterCCs(1).Range.Text
      End If
    End If
    If bLocked Then oDependentCC.LockContents = True
  Next oDependentCC
  oSourceDoc.Close wdDoNotSaveChanges
  Set oThisDoc = Nothing
  Set oSourceDoc = Nothing
  Set oDependentCCs = Nothing
  Set matchingMasterCCs = Nothing
  Set oDependentCC = Nothing
lbl_Exit:
  Exit Sub
End Sub

The resulting document is shown below.

extract doc data 12
Finished Document

That's it! I hope you have found this tips page useful and informative.  You can download all the files and templates used to create this Microsoft Word Help & Tips page here:
Extract Data from Documents.

Share

DISCLAIMER/TERMS OF USE

The information, illustrations and code contained in my "Microsoft Word Tips" are provided free and without risk or obligation.

Click to acces PayPal Verification Service Click to acces PayPal Verification Service

However, the work is mine. If you use it for commercial purposes or benefit from my efforts through income earned or time saved then a donation, however small, will help to ensure the continued availability of this resource.

If you would like to donate, please use the appropriate donate button to access PayPal. Thank you!

Click to donate British Pound Sterling                   Click to donate US dollars                   Click to donate EU euros

Search my site or the web using Google Search Engine

Google Search Logo