MORE INFORMATION
Automation is a process where one software component
communicates with and/or controls another software component using Microsoft's
Component Object Model (COM). It is the basis for most cross-component
communication used in languages such as Visual Basic or Visual Basic for
Applications, and has become a normal part of most programs.
Historically, an Automation object is any object that supports the IDispatch
interface. This interface allows clients to call methods and properties at run
time without having to know the exact object they are communicating with at
design time, a process called late binding. Today, however, the term
Automation object can be applied to virtually any COM object, even those that do
not support IDispatch (and therefore cannot be late bound). This article
assumes the object you are Automating supports both binding methods.
What is Binding?
Binding is a process of matching function calls written by the
programmer to the actual code (internal or external) that implements the
function. It is done when the application is compiled, and all functions called
in code must be bound before the code can be executed.
To understand
the process, think of "binding" in terms of publishing a book. Imagine your
code is like the text of the book where in a certain paragraph you have written
something like "see chapter 12, page x for more details." You don't know what
the page number is until the book is finished, so before the paragraph can be
read as intended, all the pages of the book must be bound together and the
correct page number inserted into the paragraph. You wait for the book to be
"bound" before you can reference other parts of the book.
Binding
software is similar. Your code is made up of parts that need to be pulled
together before the code can be "read." Binding is the act of replacing
function names with memory addresses (or memory offsets, to be more precise)
where the code will "jump to" when the function is called. For COM objects, the
address is a memory offset in a table of pointers (called the v-table) held by
the object. When a COM function is bound, it is bound through the
v-table.
The structure of a COM object is simple. When your code
holds a reference to an object, it holds an indirect pointer to the top of the
v-table. The v-table is an array of memory addresses where each entry is a
different function that can be called on that object. To call the third
function on a COM object, you move to the third entry in the table (an offset
of two entries from the top) and then jump to the memory location given there.
That executes the code for the function and, when complete, returns you to the
next line of code.
+-[Code]------------+  +.................................[COM Object]...+
|                   |  : +-------------+                                :
|Set obj = Nothing -|--->| obj pointer |                                :
|                   |  : +-|-----------+                                :
+-------------------+  :   |   +-----------------+                      :
                       :   +-->| v-table pointer |                      :
                       :       +--|--------------+                      :
                       :          |                                     :
                       :          |  +----------------------------+     :
                       :  (3rd)   |  | Function 1 Address pointer |     :
                       : (Offset) |  +----------------------------+     :
                       :          |  | Function 2 Address pointer |     :
                       :          |  +----------------------------+     :
                       :          +->| Function 3 Address pointer |     :
                       :             +----------------------------+     :
                       +................................................+
The example above shows what happens when a COM object is released.
Because all COM interfaces derive from IUnknown, the first three entries in the
table are the methods of IUnknown (QueryInterface, AddRef, and Release). When
you need to free an object, your code calls the third function in the v-table
(IUnknown::Release).
Fortunately, this work is done by Visual Basic behind the scenes. As a Visual
Basic programmer, you never have to deal with a v-table directly. However, this
structure is how all COM objects are bound, and you need to be familiar with it
to understand what binding is.
Early Binding
The example above is what is known as early (or v-table) binding.
For all COM objects, this form of binding takes place whenever a COM object's
IUnknown interface is called. But what about the other functions of the object?
How do you call its Refresh method or its Parent property? These are custom
functions that are typically unique to an object. If their locations in the
v-table cannot be assumed, how do you find the function addresses needed to
call them?
The answer, of course, depends on whether or not you know
in advance what the object's v-table looks like. If you do, you can perform the
same early-binding process to the object's custom methods as you did to its
IUnknown methods. This is what is generally meant by "early binding."
To use early binding on an object, you need to know what its v-table
looks like. In Visual Basic, you can do this by adding a reference to a type
library that describes the object, its interface (v-table), and all the
functions that can be called on the object. Once that is done, you can declare
an object as being a certain type, then set and use that object using the
v-table. For example, if you wanted to Automate Microsoft Excel using early
binding, you would add a reference to the "Microsoft Excel 8.0 Object Library"
from the Project|References dialog, and then declare your variable as being of
the type "Excel.Application." From then on, all calls made to your object
variable would be early bound:
' Set reference to 'Microsoft Excel 8.0 Object Library' in
' the Project|References dialog (or Tools|References for VB4 or VBA).
' Declare the object as an early-bound object
Dim oExcel As Excel.Application
Set oExcel = CreateObject("Excel.Application")
' The Visible property is called via the v-table
oExcel.Visible = True
This method works great most of the time, but what if you don't know
the exact object you will be using at design time? For example, what if you
need to talk to multiple versions of Excel, or possibly to an "unknown" object
altogether?
Late Binding
For these cases, COM provides the IDispatch interface. Objects that implement IDispatch are said
to have a dispinterface (if it is the only interface they support) or dual
interface (if they also have a custom interface that you can early bind to).
Clients that bind to IDispatch are said to be "late bound" because the exact
property or method they are calling is determined at run time using the methods
of IDispatch to locate them. Going back to the book example earlier, think of
it as being like a footnote that directs you to the table of contents where you
have to "look up" the page number at "read time" rather than having it already
printed there in the text.
The magic of the interface is controlled
by two functions: GetIDsOfNames and Invoke. The first maps function names
(strings) to an identifier (called a dispid) that represents the function. Once
you know the ID for the function you want to call, you can call it using the
Invoke function. This form of method invocation is called "late binding."
Again, in Visual Basic the way you specify how the object is bound
is by your object declaration. If you declare an object variable as "Object"
you are, in fact, telling Visual Basic to use IDispatch, and are therefore late
binding:
' No reference to a type library is needed to use late binding.
' As long as the object supports IDispatch, the method can
' be dynamically located and invoked at run-time.
' Declare the object as a late-bound object
Dim oExcel As Object
Set oExcel = CreateObject("Excel.Application")
' The Visible property is called via IDispatch
oExcel.Visible = True
As you can see, the rest of your code is the same. The only difference
between early binding and late binding (in terms of the code you write) is in
the variable declaration.
It is important to note that what is "late
bound" is the function being called and not the way it is called. From the
earlier discussion on binding in general, you should notice that IDispatch
itself is "early bound"; that is, Visual Basic makes the call to set
the Visible property through a v-table entry (IDispatch::Invoke) as it would
any COM call. The COM object itself is responsible for forwarding the call to
the correct function to make Excel visible. This indirection allows the Visual
Basic client to be compiled (that is, bound to a valid function address) but
still not know the exact function that will actually do the work.
Dispid Binding
Some Automation clients (most notably MFC and Visual Basic
3.0, but also Visual Basic 5.0 and 6.0 with respect to ActiveX Controls) use a
hybrid form of late binding called dispid binding. If the COM object is known
at design time, the dispids for the functions that are called can be cached and
passed directly to IDispatch::Invoke without the need to call GetIDsOfNames at
run time. This can greatly increase performance, because instead of making two
COM calls per function, you only need to make one.
Dispid binding is
not an option you can normally choose in Visual Basic 5.0 or 6.0. It is used
for objects that are referenced in a type library but do not contain a custom
interface (that is, for objects that have a dispinterface only) and for
aggregated ActiveX Controls. In general, however, Visual Basic uses early
binding any place you would normally use dispid binding.
Which Form of Binding Should I Use?
The answer to this question depends as much on the design of your
project as anything else. Microsoft recommends early binding in almost all
cases. However, there may be reasons for choosing late binding.
Early
binding is the preferred method. It is the best performer because your
application binds directly to the address of the function being called and
there is no extra overhead in doing a run-time lookup. In terms of overall
execution speed, it is at least twice as fast as late binding.
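The difference can be observed with a rough timing loop. The following is a minimal sketch rather than a benchmark. It assumes a reference to the Microsoft Excel Object Library has been set, and the procedure name CompareBindingSpeed and the iteration count are illustrative only. Because Excel runs out of process, cross-process call overhead may dominate both loops, so the ratio you measure can differ from the figure quoted above.

```vb
' Requires a reference to the Microsoft Excel Object Library.
Sub CompareBindingSpeed()
    Const ITERATIONS As Long = 5000   ' illustrative value
    Dim oEarly As Excel.Application   ' early bound (v-table)
    Dim oLate As Object               ' late bound (IDispatch)
    Dim i As Long
    Dim sName As String
    Dim tStart As Single

    Set oEarly = CreateObject("Excel.Application")
    Set oLate = oEarly                ' same object, different binding

    ' Time the early-bound calls
    tStart = Timer
    For i = 1 To ITERATIONS
        sName = oEarly.Name           ' called via the v-table
    Next i
    Debug.Print "Early bound:"; Timer - tStart; "seconds"

    ' Time the late-bound calls
    tStart = Timer
    For i = 1 To ITERATIONS
        sName = oLate.Name            ' called via GetIDsOfNames and Invoke
    Next i
    Debug.Print "Late bound: "; Timer - tStart; "seconds"

    oEarly.Quit
    Set oLate = Nothing
    Set oEarly = Nothing
End Sub
```

Assigning the same object to both variables keeps the comparison limited to how the call is bound, not which object receives it.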
Early
binding also provides type safety. When you have a reference set to the
component's type library, Visual Basic provides IntelliSense support to help
you code each function correctly. Visual Basic also warns you if the data type
of a parameter or return value is incorrect, saving a lot of time when writing
and debugging code.
Late binding is still useful in situations where
the exact interface of an object is not known at design time. If your
application seeks to talk with multiple unknown servers or needs to invoke
functions by name (using the Visual Basic 6.0 CallByName function, for
example), then you need to use late binding. Late
binding is also useful to work around compatibility problems between multiple
versions of a component that has improperly modified or adapted its interface
between versions.
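As a sketch of invoking members by name, the fragment below uses CallByName against a late-bound Excel object. The variable sMember stands in for a name that would only be known at run time; it is an assumption made for illustration.

```vb
' No reference to a type library is needed.
Dim oExcel As Object
Dim sMember As String

Set oExcel = CreateObject("Excel.Application")

' The member name is an ordinary string, so it can come from
' user input, a file, or a table rather than from source code.
sMember = "Visible"
CallByName oExcel, sMember, VbLet, True

' Read a property by name
Debug.Print CallByName(oExcel, "Version", VbGet)

oExcel.Quit
Set oExcel = Nothing
```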
The advantages given to early binding make it the
best choice whenever possible.
Maintaining Compatibility Across Multiple Versions
If you will be using a component that you do not redistribute
with your setup package, and cannot be assured of the exact version you will be
communicating with at run time, take special care to early bind
to an interface that is compatible with all versions of the component, or (in
some cases) use late binding to call a method that may exist in a particular
version and fail gracefully if that method is not present in the version
installed on the client system.
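One way to fail gracefully is to trap the run-time error that Visual Basic raises when a late-bound member is missing. The fragment below is a sketch under stated assumptions: NewerFeature is a hypothetical member name, not a real Excel method, and the messages printed are placeholders for whatever fallback your application needs.

```vb
Dim oExcel As Object
Set oExcel = CreateObject("Excel.Application")

On Error Resume Next
' NewerFeature is a hypothetical member used only for illustration.
oExcel.NewerFeature
If Err.Number = 438 Then
    ' Run-time error 438: Object doesn't support this property or method
    Debug.Print "Feature not available in Excel " & oExcel.Version
    Err.Clear
ElseIf Err.Number <> 0 Then
    ' Some other failure; report it rather than hide it
    Debug.Print "Unexpected error " & Err.Number & ": " & Err.Description
    Err.Clear
End If
On Error GoTo 0

oExcel.Quit
Set oExcel = Nothing
```

Checking for error 438 specifically keeps other failures visible instead of silently treating every error as a missing member.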
Microsoft Office applications provide
a good example of such COM servers. Office applications will typically expand
their interfaces to add new functionality or correct previous shortcomings
between versions. If you need to automate an Office application, it is
recommended that you early bind to the earliest version of the product that you
expect could be installed on your client's system. For example, if you need to
be able to automate Excel 95, Excel 97, Excel 2000, and Excel 2002, you should
use the type library for Excel 95 (XL5en32.olb) to maintain compatibility with
all four versions.
Office applications also demonstrate that object
models with large dual interfaces can suffer limitations in marshalling on some
platforms. For your code to work best across all platforms, use IDispatch.
For additional information on maintaining compatibility
when working with Office applications, see the following article in
the Microsoft Knowledge Base:
247579 INFO: Use DISPID Binding to Automate Office Applications Whenever Possible