"How to Pass Parameters Between Basic and Assembly" (Part 1/2) (51501)
The information in this article applies to:
- Microsoft QuickBASIC 4.0
- Microsoft QuickBASIC 4.0b
- Microsoft QuickBASIC 4.5
- Microsoft BASIC Compiler for MS-DOS and OS/2 6.0
- Microsoft BASIC Compiler for MS-DOS and OS/2 6.0b
- Microsoft Basic Professional Development System (PDS) for MS-DOS and MS OS/2 7.0
- Microsoft Basic Professional Development System (PDS) for MS-DOS and MS OS/2 7.1
- Microsoft Macro Assembler (MASM) 5.0
- Microsoft Macro Assembler (MASM) 5.1
This article was previously published under Q51501
SUMMARY
The article below gives Part 1 of 2 of a complete tutorial and
examples for passing all types of parameters between compiled Basic
and Assembly Language.
The examples in BAS2MASM (but not the tutorial section) are also
available in this database as multiple separate ENDUSER articles,
which can be found as a group by querying on the word BAS2MASM.
MORE INFORMATION
HOW TO PASS PARAMETERS BETWEEN Basic AND ASSEMBLY LANGUAGE
This document explains how Microsoft Basic compiled programs can pass
parameters to and from Microsoft Macro Assembler (MASM) programs. This
document assumes that you have a fundamental understanding of Basic
and assembly language.
Microsoft Basic supports calls to routines written in Microsoft Macro
Assembler, FORTRAN, Pascal, and C. This document describes the
necessary syntax for calling Microsoft assembly-language procedures
and contains a series of examples demonstrating the interlanguage
calling capabilities between Basic and assembly language. The sample
programs apply to the following Microsoft products:
- Microsoft QuickBasic versions 4.00, 4.00b, and 4.50 for MS-DOS
- Microsoft Basic Compiler versions 6.00 and 6.00b for MS-DOS and MS
OS/2
- Microsoft Basic Professional Development System (PDS) versions 7.00
and 7.10 for MS-DOS and MS OS/2
- Microsoft Macro Assembler (MASM) versions 5.00 and 5.10 for MS-DOS
and MS OS/2
- Microsoft QuickAssembler versions 2.01 and 2.51 (which are
integrated as part of Microsoft QuickC Compiler with QuickAssembler
versions 2.01 and 2.51) for MS-DOS
Microsoft Basic can be linked with all versions of MASM or
QuickAssembler. However, we recommend that you use the latest version
of MASM or QuickAssembler with the examples in this application note.
For more information about interlanguage calling, refer to the
"Microsoft Mixed-Language Programming Guide," which is available with
C 5.00 and 5.10 and MASM 5.00 and 5.10.
MAKING MIXED-LANGUAGE CALLS
===========================
Mixed-language programming always involves a call; specifically, it
involves a function or subprogram call. For example, a Basic main
module may need to execute a specific task that you would like to
program separately. Instead of calling a Basic subprogram, however,
you can call an assembly-language procedure.
Mixed-language calls require multiple modules. Instead of compiling
all of your source modules with the same compiler, you use different
compilers. In the example mentioned above, you would compile the main-
module source file with the Basic compiler, assemble another source
file (written in assembly language) with the assembler, and then link
together the two object files.
There are two types of routines that can be called. Their principal
difference is that some return values, and others do not. (Note: In
this document, "routine" refers to any function or subprogram
procedure that can be called from another module.)
Note: Basic DEF FN functions and GOSUB subroutines cannot be called
from another language.
Basic has a much more complex environment and initialization
procedure than assembly language. Because of this, Basic must be
the initial environment that the program starts in, and from there,
assembly-language routines can be called (which can in turn call
Basic routines). This means that a program cannot start in assembly
language and then call Basic routines.
THE Basic INTERFACE TO ASSEMBLY LANGUAGE
========================================
The Basic DECLARE statement provides a flexible and convenient
interface to assembly language. When you call a routine, the DECLARE
statement syntax is as follows:
DECLARE {FUNCTION | SUB} <name> [CDECL] [ALIAS "aliasname"]
[(<parameter-list>)]
The <name> is the name of the function or subprogram that you want to
call as it appears in the Basic source file. The following are the
recommended steps for using the DECLARE statement when calling
assembly language:
1. For each distinct assembly-language routine you plan to call, put a
DECLARE statement in your Basic source file before the routine is
called.
2. If you are calling a MASM routine with a name longer than 31
characters, use the ALIAS feature. The use of ALIAS is explained
below.
3. Use the parameter list to determine how each parameter is to be
passed. The use of the parameter list is explained below.
4. Once the routine is properly declared, call it just as you would a
Basic subprogram or function.
NAMING-CONVENTION REQUIREMENTS
==============================
The term "naming convention" refers to the way that a compiler alters
the name of the routine before placing it into an object file.
It is important that you adopt a compatible naming convention when you
issue a mixed-language call. If the name of the called routine is
stored differently in each object file, then the linker will not be
able to find a match. Instead, it will report an unresolved external.
Microsoft compilers place machine code into object files, but they
also place into object files the names of all routines and common
blocks that need to be accessed publicly. (Note: Basic variables are
never public symbols.) That way, the linker can compare the name of a
routine called in one module to the name of a routine defined in
another module, and recognize a match.
Basic and MASM use the same naming conventions. They both translate
each letter of public names to uppercase. Basic drops the type
declaration character (%, &, !, #, $). Basic recognizes the first 40
characters of a routine name, while MASM recognizes the first 31
characters of a name.
CALLING-CONVENTION REQUIREMENTS
===============================
The term "calling convention" refers to the way that a language
implements a call. The choice of calling convention affects the actual
machine instructions that a compiler generates to execute (and return
from) a function, procedure, or subroutine call.
The use of a calling convention affects programming in two ways:
1. The calling routine uses a calling convention to determine in what
order to pass arguments (parameters) to another routine. The
convention can usually be specified in a mixed-language interface.
2. The called routine uses a calling convention to determine in what
order to receive the parameters that were passed to it. In most
languages, this convention can be specified in the routine's
heading. Basic, however, always uses its own convention to receive
parameters.
Basic's calling convention pushes parameters onto the stack in the
order in which they appear in the source code. For example, the Basic
statement CALL Calc(A, B) pushes argument A onto the stack before it
pushes B. This convention also specifies that the stack is restored by
the called routine just before returning control to the caller. (The
stack is restored by removing parameters.)
USING ALIAS
===========
The use of ALIAS may be necessary because assembly language places the
first 31 characters of a name into an object file, whereas Basic
places up to 40 characters of a name into an object file.
Note: You do not need the ALIAS feature to remove type
declaration characters (%, &, !, #, $). Basic automatically
removes these characters when it generates object code. Thus,
Fact% in Basic matches FACT in assembly language.
The ALIAS keyword directs Basic to place aliasname into the object
file, instead of <name>. The Basic source file still contains calls to
<name>. However, these calls are interpreted as if they were actually
calls to aliasname. This is used when a Basic name is longer than 31
characters and must be called from assembly language, or when the
assembly-language routine name contains characters that are illegal
in a Basic subroutine name.
For example:
DECLARE FUNCTION QuadraticPolynomialFunctionLeastSquares%
ALIAS "QUADRATI" (a, b, c)
In the example above, QUADRATI, the aliasname, contains the first
eight characters of the name QuadraticPolynomialFunctionLeastSquares%.
This causes Basic to place QUADRATI into the object file instead of
the full name, which exceeds the 31 characters that MASM recognizes.
The assembly-language routine must be made public as QUADRATI.
USING THE PARAMETER LIST
========================
The <parameter-list> syntax is displayed below, followed by
explanations of each field:
[BYVAL | SEG] <variable> [AS <type>]...,
Note: You can use BYVAL or SEG, but not both.
Use the BYVAL keyword to declare a value parameter. In each subsequent
call, the corresponding argument will be passed by value.
Note: Basic provides two ways of "passing by value." The usual
method of passing by value is to use an extra set of parentheses,
as in the following:
CALL HOLM((A))
This method actually creates a temporary value, whose address is
passed. In contrast, BYVAL provides a true method of passing by
value, because the value itself is passed, not an address. Only by
using BYVAL will a Basic program be compatible with an
assembly-language routine that expects a value parameter.
Use the SEG keyword to declare a far reference parameter. In each
subsequent call, the far (segmented) address of the corresponding
argument will be passed.
You can choose any legal name for <variable>, but only the type
associated with the name has any significance to Basic. As with other
variables, the type can be indicated with a type declaration character
(%, &, !, #, $) or the implicit declaration.
You can use the "AS type" clause to override the type declaration of
<variable>. The type field can be INTEGER, LONG, SINGLE, DOUBLE,
STRING, a user-defined type, or ANY, which directs Basic to permit any
type of data to be passed as the argument.
For example:
DECLARE FUNCTION Calc2! (BYVAL a%, BYVAL b%, BYVAL c!)
In the example above, Calc2! is declared as an assembly-language
routine that takes three arguments: the first two are integers passed
by value, and the last is a single-precision real number passed by
value.
ALTERNATIVE Basic INTERFACES
============================
You can specify parameter-passing methods without using a DECLARE
statement or by using a DECLARE statement and omitting the parameter
list.
1. You can make the call with the CALLS statement. The CALLS statement
causes each parameter to be passed by far reference.
2. You can use the BYVAL and SEG keywords in the actual parameter list
when you make the call, as follows:
CALL Fun2(BYVAL Term1, BYVAL Term2, SEG Sum)
In the example above, BYVAL and SEG have the same meaning that they
have in a Basic DECLARE statement. When you use BYVAL and SEG this
way, however, you need to be careful because neither the type nor the
number of parameters will be checked as they would be in a DECLARE
statement.
SETTING UP THE ASSEMBLY-LANGUAGE PROCEDURE
==========================================
The linker cannot combine the assembly-language procedure with the
calling program unless compatible segments are used and the procedure
itself is declared properly. The following points may be helpful:
1. If you have version 5.00 or later of the Macro Assembler, use the
.MODEL directive at the beginning of the source file; this directive
automatically causes the appropriate return to be generated (NEAR
for small or compact model, FAR otherwise). Modules called from
Basic should be declared as .MODEL MEDIUM. If you have a version of
the assembler earlier than 5.00, declare the procedure FAR.
2. If you have version 5.00 or later of the Microsoft Macro Assembler
(MASM), use the simplified segment directives .CODE to declare the
code segment and .DATA to declare the data segment. (Having a code
segment is sufficient if you do not have data declarations.) If you
are using an earlier version of the assembler, the SEGMENT, GROUP,
and ASSUME directives must be used.
3. The procedure label must be declared public with the PUBLIC
directive. This declaration makes the procedure available to be
called by other modules. Also, any data you want to make public to
other modules must be declared as PUBLIC.
4. Global data or procedures accessed by the routine must be declared
EXTRN. The safest way to use EXTRN is to place the directive
outside any segment definition (however, near data must go inside
the data segment).
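The points above can be sketched as a minimal MASM 5.x module. The
names NextCount and Counter are hypothetical, and the example assumes
the usual Microsoft convention of returning a 2-byte integer result in
AX; treat it as an illustration rather than a definitive template.

```asm
; Basic side (for reference):
;   DECLARE FUNCTION NextCount% ()
;   PRINT NextCount%
.MODEL  MEDIUM                  ; far code, near data, as Basic expects
.DATA
Counter DW      0               ; private near data, placed in DGROUP
.CODE
        PUBLIC  NextCount       ; make the label visible to the linker
NextCount PROC                  ; FAR by default under .MODEL MEDIUM
        inc     Counter         ; DS already addresses DGROUP on entry
        mov     ax, Counter     ; assumed: integer result returned in AX
        ret                     ; far return; no parameters to remove
NextCount ENDP
        END
```

Because the routine takes no parameters and uses no local data, it
needs neither a framepointer nor a RET <n> operand.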
PRESERVING REGISTERS
====================
There are several registers that need to be preserved in a mixed-
language program. These registers are as follows:
CX, BX
BP, SI, DI, SP
CS, DS, SS, ES
The direction flag should also be preserved.
ENTERING THE ASSEMBLY-LANGUAGE PROCEDURE
========================================
The following two instructions begin the procedure:
push bp
mov bp, sp
This sequence establishes BP as the "framepointer." The framepointer
is used to access parameters and local data, which are located on the
stack. SP cannot be used for this purpose because it is not an index
or base register. Also, the value of SP may change as more data is
pushed onto the stack. However, the value of the base register BP will
remain constant throughout the procedure, so that each parameter can
be addressed as a fixed displacement off of BP.
The instruction sequence above first saves the value of BP because it
will be needed by the calling procedure as soon as the current
procedure terminates. Then BP is loaded with the value of SP to
capture the value of the pointer at the time of entry to the
procedure.
ALLOCATING LOCAL DATA (OPTIONAL)
================================
An assembly-language procedure can use the same technique for
implementing local data that is used by high-level languages. To set
up local data space, decrease the contents of SP in the third
instruction of the procedure. (To ensure correct execution, you should
always increase or decrease SP by an even amount.) Decreasing SP
reserves space on the stack for the local data. The space must be
restored at the end of the procedure. The following instructions
reserve the space:
push bp
mov bp, sp
sub sp, space
In the code above, space is the total size in bytes of the local data.
Local variables are then accessed as fixed, negative displacements off
of BP.
For example:
push bp
mov bp, sp
sub sp, 4
.
.
.
mov WORD PTR [bp-2], 0
mov WORD PTR [bp-4], 0
The example above uses two local variables, each of which is 2 bytes
in size. SP is decreased by 4, since there are 4 bytes of local data.
Later, each of the variables is initialized to 0 (zero). These
variables are never formally declared with any assembler directive;
the programmer must keep track of them manually.
Local variables are also called dynamic, stack, or automatic
variables.
EXITING THE PROCEDURE
=====================
Several steps may be involved in terminating the procedure:
1. If any of the registers SS, DS, SI, etc., have been saved, they
must be popped off the stack in the reverse of the order in which
they were saved.
2. If local data space was allocated at the beginning of the
procedure, SP must be restored with the instruction MOV SP, BP.
3. Restore BP with POP BP. This step is always necessary.
4. Finally, if you are not using CDECL and the C calling conventions,
return to the calling program with the RET <n> instruction (where
<n> is the number of bytes to pop off the stack) to adjust the
stack with respect to the parameters that were pushed by the
caller.
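The entry, local-data, and exit steps can be combined into one small
routine. The sketch below uses a hypothetical function named Triple
that receives one integer by near reference and is assumed to return
its result in AX; only AX is used as scratch, and SI is saved because
it is one of the registers that must be preserved.

```asm
; Basic side (for reference):
;   DECLARE FUNCTION Triple% (n%)
;   PRINT Triple%(x%)
.MODEL  MEDIUM
.CODE
        PUBLIC  Triple
Triple  PROC
        push    bp              ; save the caller's framepointer
        mov     bp, sp          ; establish the framepointer
        sub     sp, 2           ; reserve one 2-byte local at [bp-2]
        push    si              ; save a register that must be preserved
        mov     si, [bp+6]      ; SI = near address of the argument
        mov     ax, [si]        ; AX = value of the argument
        mov     [bp-2], ax      ; keep a copy in the local variable
        shl     ax, 1           ; AX = 2 * value
        add     ax, [bp-2]      ; AX = 3 * value
        pop     si              ; step 1: pop saved registers
        mov     sp, bp          ; step 2: release the local data space
        pop     bp              ; step 3: restore the caller's BP
        ret     2               ; step 4: far return, remove 2 bytes
Triple  ENDP
        END
```

The operand of RET is 2 because Basic pushed a single 2-byte near
address for the one parameter.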
ASSEMBLY-LANGUAGE CALLS TO Basic
================================
No Basic routine can be executed unless the main program is in Basic,
because a Basic routine requires the environment to be initialized in
a way that is unique to Basic. MASM will not perform this special
initialization.
However, a program can start up in Basic, call an assembly-language
function that does most of the work of the program, and then call
Basic subprograms and functions as needed.
The following rules are recommended when you call Basic from assembly
language:
1. Start up in a Basic main module. You must use the DECLARE statement
to provide an interface to the assembly-language module.
2. In the assembly-language module, declare the Basic routine as
EXTRN.
3. Make sure that all data is passed as a near pointer. Basic can pass
data in a variety of ways, but is unable to receive data in any
form other than near reference.
Note: With near pointers, the program assumes that the data is
in the default data segment. If you want to pass data that is
not in the default data segment, then first copy the data to a
variable that is in the default data segment.
Note: Microsoft Basic Professional Development System (PDS)
version 7.10 allows a Basic routine to be passed parameters by
value.
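The rules above might look like the following in practice. This is a
sketch under stated assumptions: ShowValue is a hypothetical Basic SUB
that takes one INTEGER parameter, CallBack and Number are hypothetical
names, and the program starts in a Basic main module.

```asm
; Basic side (for reference):
;   DECLARE SUB CallBack ()
;   CALL CallBack
;   SUB ShowValue (n AS INTEGER)
;     PRINT n
;   END SUB
.MODEL  MEDIUM
        EXTRN   ShowValue:FAR   ; the Basic routine, declared EXTRN
.DATA
Number  DW      42              ; data in the default data segment
.CODE
        PUBLIC  CallBack
CallBack PROC
        push    bp
        mov     bp, sp
        mov     ax, OFFSET DGROUP:Number  ; near pointer to the data
        push    ax              ; pass the argument by near reference
        call    ShowValue       ; Basic removes the parameter itself
        pop     bp
        ret                     ; CallBack has no parameters of its own
CallBack ENDP
        END
```

Because the Basic routine restores the stack before it returns, the
assembly-language caller does not adjust SP after the call.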
THE MICROSOFT SEGMENT MODEL
===========================
If you use the simplified segment directives by themselves, you do not
need to know the names assigned for each segment. However, versions of
the Macro Assembler earlier than 5.00 do not support these directives.
With earlier versions of the assembler, you should use the SEGMENT,
GROUP, ASSUME, and ENDS directives equivalent to the simplified
segment directives.
The following table shows the default segment names created by the
.MODEL MEDIUM directive used with Basic. Use of these segments ensures
compatibility with Microsoft languages and will help you access public
symbols. This table is followed by a list of three steps, illustrating
how to make the actual declarations, and a sample program.
Directive   Name        Align   Combine   Class     Group
---------   ----        -----   -------   -----     -----
.CODE       name_TEXT   WORD    PUBLIC    'CODE'
.DATA       _DATA       WORD    PUBLIC    'DATA'    DGROUP
.CONST      CONST       WORD    PUBLIC    'CONST'   DGROUP
.DATA?      _BSS        WORD    PUBLIC    'BSS'     DGROUP
.STACK      STACK       PARA    STACK     'STACK'   DGROUP
The directives in the table refer to the following kinds of segments:
Directive      Description of Segment
---------      ----------------------
.CODE          The segment containing all the code for the module.
.DATA          Initialized data.
.DATA?         Uninitialized data. Microsoft compilers store
               uninitialized data separately because it can be more
               efficiently stored than initialized data. (Note:
               Basic does not use uninitialized data.)
.FARDATA and
.FARDATA?      Data placed here will not be combined with the
               corresponding segments in other modules. The segment
               of data placed here can always be determined,
               however, with the assembler SEG operator.
.CONST         Constant data. Microsoft compilers use this segment
               for such items as string and floating-point
               constants.
.STACK         Stack. Normally, this segment is declared in the
               main module for you and should not be redeclared.
The following steps describe how to use this table to create
directives:
1. Refer to the table to look up the segment name, align type, combine
type, and class for your code and data segments. Use all of these
attributes when you define a segment. For example, the code segment
is declared as follows:
name_TEXT SEGMENT WORD PUBLIC 'CODE'
The name name_TEXT (where name is the name of the module) and all the
attributes are taken from the table.
2. If you have segments in DGROUP, put them into DGROUP with the GROUP
directive, as in the following:
DGROUP GROUP _DATA, _BSS
3. Use ASSUME and ENDS as you would normally. Upon entering routines
called directly from Basic, DS and SS will both point to DGROUP.
The following example shows an assembly-language program without the
simplified segment directives from version 5.00 of the Microsoft Macro
Assembler:
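test_TEXT SEGMENT WORD PUBLIC 'CODE'
ASSUME cs:test_TEXT
PUBLIC Power2
Power2 PROC FAR          ; Basic always makes far calls
push bp                  ; save the caller's framepointer
mov bp, sp               ; establish the framepointer
mov ax, [bp+6]           ; last parameter pushed (passed by value)
mov cx, [bp+8]           ; first parameter pushed (passed by value)
shl ax, cl               ; AX = AX * 2^CL
pop bp                   ; restore the caller's framepointer
ret 4                    ; far return; remove 4 bytes of parameters
Power2 ENDP
test_TEXT ENDS
END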
COMPILING AND LINKING
=====================
After you have written your source files and resolved the issues
raised in the above sections, you are ready to compile individual
modules and then link them together.
Before linking, each program module must be compiled or assembled with
the appropriate compiler or assembler.
ACCESSING PARAMETERS
====================
PARAMETER-PASSING REQUIREMENTS
==============================
Microsoft compilers support three methods for passing a parameter:
Method           Description
------           -----------
Near reference   Passes a variable's near (offset) address. This
                 method gives the called routine direct access to the
                 variable itself. Any change the routine makes to the
                 parameter will be reflected in the calling routine.
Far reference    Passes a variable's far (segmented) address. This
                 method is similar to passing by near reference,
                 except that a longer address is passed.
By value         Passes only the variable's value, not address. With
                 this method, the called routine knows the value of
                 the parameter, but has no access to the original
                 variable. Changes to the value parameter have no
                 effect on the value of the parameter in the calling
                 routine, once the routine terminates.
Because there are different parameter-passing methods, please note the
following:
1. Make sure that the called routine and the calling routine use the
same method for passing each parameter (argument). In most cases,
you will need to check the parameter-passing defaults used by each
language, and possibly make adjustments. Each language has keywords
or language features that allow you to change the parameter-passing
method.
2. You may want to use a particular parameter-passing method rather
than using the default for the language.
Basic ARGUMENTS
===============
The default for Basic is to pass all arguments by near reference. This
can be overridden by using the SEG directive or CALLS instead of CALL.
Both of these methods cause Basic to pass both the segment and offset.
These methods can be used only to call a non-Basic routine because
Basic receives all parameters by near reference.
Note: Although Basic can pass parameters to other languages by far
reference by using the SEG directive or CALLS, Basic routines can
be CALLed only from other languages when parameters are passed by
near reference. You cannot DECLARE or CALL a Basic routine with
parameters that have SEG or BYVAL attributes. SEG and BYVAL are
only used for parameters of non-Basic routines.
Note: Basic PDS version 7.10 allows a Basic routine to be passed
parameters by value.
Basic STACK FRAME
=================
The following diagram illustrates the Basic stack frame as it appears
upon entry to the assembly-language routine:
     +--------------------+
  A  |   Arg 1 address    | <-- BP + 8
     |--------------------|
  B  |   Arg 2 address    | <-- BP + 6
     |--------------------|
     |   Return address   |     BP + 4
     |     (4 bytes)      |     BP + 2
     |--------------------|
     |      Saved BP      | <-- BP
     +--------------------+
         Low Addresses
ASSEMBLY-LANGUAGE ARGUMENTS
===========================
Once you have established the procedure's framepointer, allocated
local data space (if desired), and pushed any registers that need to
be preserved, you can write the main body of the procedure. To write
instructions that can access parameters, consider the general picture
of the stack frame after a procedure call, as illustrated in the
following figure:
                   High Addresses
                +------------------+
                |    Parameter     |
                |------------------|
                |    Parameter     |
                |------------------|
                |        .         |
                |        .         |
                |        .         |
 Stack grows    |------------------|     Parameters above
 downward with  |    Parameter     |     this generated
 each push or   |------------------|     automatically by
 call           |  Return Address  | <-- the compiler.
                |------------------|
                |     Saved BP     | <-- Framepointer (BP)
                |------------------|     points here.
                | Local Data Space |     These parameters
                |------------------|     would be generated
                |     Saved SI     |     by your assembly-
                |------------------|     language code.
                |     Saved DI     | <-- SP points to last
                +------------------+     item placed on
                                         stack.
                   Low Addresses
The stack frame for the procedure is established by the following
sequence of events:
1. The calling program pushes each of the parameters on the stack,
after which SP points to the last parameter pushed.
2. The calling program issues a CALL instruction, which causes the
return address (the place in the calling program to which control
will ultimately return) to be placed on the stack. This address
may be either 2 bytes long (for near calls) or 4 bytes long (for
far calls). SP now points to this address. (Note: When dealing
with Basic, the return address will always be a far address [4
bytes].)
3. The first instruction of the called procedure saves the old value
of BP, with the instruction push bp. SP now points to the saved
copy of BP.
4. BP is used to capture the current value of SP, with the instruction
MOV BP, SP. Therefore, BP now points to the old value of BP.
5. Whereas BP remains constant throughout the procedure, SP may be
decreased to provide room on the stack for local data or saved
registers.
In general, the displacement (off of BP) for a parameter X is equal to
the following:
2 + size of return address
+ total size of parameters between X and BP
For example, consider a FAR procedure (all Basic procedures are FAR)
that has received one parameter, a 2-byte address. The displacement of
the parameter would be as follows:
Argument's displacement = 2 + size of return address
= 2 + 4
= 6
The argument can thus be loaded into BX with the following
instruction:
mov bx, [bp+6]
Once you determine the displacement of each parameter, you may want to
use string equates or structures so that the parameters can be
referenced with a single identifier name in your assembly-language
source code. For example, the parameter above at bp+6 can be
conveniently accessed if you put the following statement at the
beginning of the assembly-language source file:
Arg1 EQU [bp+6]
You could then refer to this parameter as Arg1 in any instruction. Use
of this feature is optional.
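As a worked case of the displacement rule, consider a routine that
receives two 2-byte near addresses. The second argument lies directly
above the return address (2 + 4 = 6), and the first lies one parameter
higher (2 + 4 + 2 = 8). The sketch below uses hypothetical names
(AddTwo, Arg1, Arg2) and assumes an integer result returned in AX.

```asm
; Basic side (for reference):
;   DECLARE FUNCTION AddTwo% (a%, b%)
;   PRINT AddTwo%(x%, y%)
.MODEL  MEDIUM
Arg1    EQU     [bp+8]          ; 2 + 4 (return address) + 2 (Arg2)
Arg2    EQU     [bp+6]          ; 2 + 4 (return address)
.CODE
        PUBLIC  AddTwo
AddTwo  PROC
        push    bp
        mov     bp, sp
        push    si              ; SI must be preserved
        mov     si, Arg1        ; SI = near address of first argument
        mov     ax, [si]        ; AX = first value
        mov     si, Arg2        ; SI = near address of second argument
        add     ax, [si]        ; AX = first value + second value
        pop     si
        pop     bp
        ret     4               ; two 2-byte addresses were pushed
AddTwo  ENDP
        END
```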
PASSING Basic ARGUMENTS BY VALUE
================================
An argument is passed by value when the called routine is first
declared with a DECLARE statement, and the BYVAL keyword is applied to
the argument. For example:
DECLARE SUB AssemProc (BYVAL a AS INTEGER)
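On the assembly-language side, a BYVAL integer arrives on the stack as
the value itself, so no extra level of indirection is needed. The body
of AssemProc below is hypothetical (it simply stores the value in a
module variable named LastValue) and is only meant to show where the
value is found.

```asm
; Basic side (for reference):
;   DECLARE SUB AssemProc (BYVAL a AS INTEGER)
;   CALL AssemProc(5)
.MODEL  MEDIUM
.DATA
LastValue DW    0               ; hypothetical storage for the argument
.CODE
        PUBLIC  AssemProc
AssemProc PROC
        push    bp
        mov     bp, sp
        mov     ax, [bp+6]      ; the value itself, not an address
        mov     LastValue, ax   ; use the value (here: remember it)
        pop     bp
        ret     2               ; one 2-byte value was pushed
AssemProc ENDP
        END
```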
PASSING Basic ARGUMENTS BY NEAR REFERENCE
=========================================
The Basic default is to pass by near reference. Use of SEG, BYVAL, or
CALLS changes this default.
PASSING Basic ARGUMENTS BY FAR REFERENCE
========================================
Basic passes each argument in a call by far reference when CALLS is
used to invoke a routine. Using SEG to modify a parameter in a
preceding DECLARE statement also causes a Basic CALL to pass
parameters by far reference.
Note: CALLS cannot be used to call a routine that is named in a
DECLARE statement. For this reason, the use of the SEG directive is
the preferred method of passing variables by far reference.
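A far reference occupies 4 bytes on the stack, with the offset at the
lower address and the segment above it, so the pair can be loaded with
a single LES instruction. The routine name ZeroIt below is
hypothetical; the sketch stores zero in the integer variable it is
given and preserves ES and DI.

```asm
; Basic side (for reference):
;   DECLARE SUB ZeroIt (SEG n AS INTEGER)
;   CALL ZeroIt(x%)
.MODEL  MEDIUM
.CODE
        PUBLIC  ZeroIt
ZeroIt  PROC
        push    bp
        mov     bp, sp
        push    es              ; ES and DI must be preserved
        push    di
        les     di, DWORD PTR [bp+6]    ; DI = offset, ES = segment
        mov     WORD PTR es:[di], 0     ; write through the far pointer
        pop     di
        pop     es
        pop     bp
        ret     4               ; segment + offset = 4 bytes
ZeroIt  ENDP
        END
```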
DATA TYPES
==========
NUMERICAL FORMATS
=================
Numerical data formats are the simplest kinds of data to pass between
assembly language and Basic. The following chart shows the equivalent
data types in each language:
Basic          Assembly Language
-----          -----------------
x%, INTEGER    DW
...            DB, DF, DT   <-- These are not available in Basic.
x&, LONG       DD
x!, SINGLE     DD
x#, DOUBLE     DQ
USER-DEFINED TYPES
==================
The elements in a user-defined type are stored contiguously in memory,
one after the other. When a Basic user-defined type appears in an
argument list, Basic passes the address of the beginning element of
the user-defined type.
The routine that receives the user-defined type must know the format
of the type beforehand. The assembly-language routine should then
expect to receive a pointer to a structure of this type.
Basic STRING FORMATS
====================
Near Variable-Length Strings
----------------------------
Variable-length strings in Basic have 4-byte string descriptors:
+-------------------------------------+
|     Length      |  Address (offset) |
+-------------------------------------+
    (2 bytes)          (2 bytes)
The first field of the string descriptor contains a 2-byte integer
indicating the length of the actual string text. The second field
contains the address of the text. This address is an offset into the
default data area (DGROUP) and is assigned by Basic's string-space
management routines. These management routines need to be available to
reassign this address whenever the length of the string changes, yet
the routines are available only to Basic. Therefore, an assembly-
language routine should not alter the length or address of a Basic
variable-length string.
Note: Fixed-length strings do not have a string descriptor.
Passing Variable-Length Strings from Basic
------------------------------------------
When a Basic variable-length string (such as A$) appears in an
argument list, Basic passes a string descriptor rather than the string
data itself.
Warning: When you pass a string from Basic to assembly language,
the called routine should under no circumstances alter the length
or address of the string.
The routine that receives the string must be aware that if any Basic
routine is called, Basic's string-space management routines may change
the location of the string data without warning. In this case, the
calling routine must note that the values in the string descriptor may
change.
The Basic functions SADD and LEN extract parts of the string
descriptor. SADD extracts the address of the actual string data, and
LEN extracts the length. The results of these functions can then be
passed to an assembly-language routine.
Basic should pass the result of the SADD function by value. Bear in
mind that the string's address, not the string itself, will be passed
by value. This amounts to passing the string itself by reference. The
Basic module passes the string address, and the other module receives
the string address. The address returned by SADD is declared as type
INTEGER, but is actually equivalent to a near pointer.
There are two methods for passing a variable-length string from Basic
to assembly language. The first method is to pass the string address
and string length as separate arguments, using the SADD and LEN
functions. The second method is to pass the string descriptor itself,
with a call statement such as the following:
CALL CRoutine(A$)
The assembly-language routine should then expect to receive a pointer
to a string descriptor of this type.
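A routine that receives a descriptor this way might be sketched as
follows. It assumes the near (4-byte) descriptor format described
above, not the far strings of Basic PDS, and it changes only the
string text, never the length or address fields. The name UpCase and
its labels are hypothetical.

```asm
; Basic side (for reference):
;   DECLARE SUB UpCase (s$)
;   CALL UpCase(A$)
.MODEL  MEDIUM
.CODE
        PUBLIC  UpCase
UpCase  PROC
        push    bp
        mov     bp, sp
        push    si              ; SI must be preserved
        mov     si, [bp+6]      ; SI = near address of the descriptor
        mov     dx, [si]        ; DX = length (first field)
        mov     si, [si+2]      ; SI = offset of the text (second field)
        or      dx, dx
        jz      UpDone          ; empty string: nothing to do
UpNext: mov     al, [si]
        cmp     al, 'a'
        jb      UpSkip          ; below 'a': leave unchanged
        cmp     al, 'z'
        ja      UpSkip          ; above 'z': leave unchanged
        sub     al, 20h         ; convert lowercase to uppercase
        mov     [si], al        ; modify the text in place
UpSkip: inc     si
        dec     dx
        jnz     UpNext
UpDone: pop     si
        pop     bp
        ret     2               ; one 2-byte descriptor address
UpCase  ENDP
        END
```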
Passing Near String Descriptors from Assembly Language
------------------------------------------------------
To pass an assembly-language string to Basic, first allocate a string
in assembly language. Then create a structure identical to a Basic
string descriptor. Pass this structure by near reference. Make sure
that the string originates in assembly language, not in Basic.
Otherwise, Basic may attempt to move the string around in memory.
Warning: Microsoft does not recommend creating your own string
descriptors in assembler functions because it is very easy to
inadvertently destroy portions of the data segment. The Basic
routine should not reassign the value or length of a string passed
from assembly language.
The preferred method is to create the strings in Basic and then modify
their contents in the assembler function without altering their string
descriptors.
Far Variable-Length Strings
---------------------------
Microsoft Basic Professional Development System (PDS) versions 7.00
and 7.10 allow for the use of far strings. Information on using far
strings with other languages is covered in the "Microsoft Basic 7.0:
Programmer's Guide," in Chapter 13, "Mixed-Language Programming with
Far-Strings."
Fixed-Length Strings
--------------------
Fixed-length strings in Basic are stored simply as contiguous bytes of
characters, with no terminating character. There is no string
descriptor for a fixed-length string.
To pass a fixed-length string to a routine, the string must be put
into a user-defined type. For example:
TYPE FixType
A AS STRING * 10
END TYPE
The string is then passed like any other user-defined type.
ARRAYS
======
There are several special problems that you need to be aware of when
passing arrays between Basic and assembly language:
1. Arrays are implemented differently in Basic than in other
languages, so you must take special precautions when passing an
array from Basic to assembly language.
2. Arrays are declared differently in assembly language and Basic.
3. Because Basic uses an array descriptor, passed arrays must be
created in Basic.
Passing Arrays from Basic
-------------------------
To pass an array to an assembly-language routine, pass only the
address of the base (first) element; the remaining elements are stored
contiguously after it.
Passed Arrays Must Be Created in Basic
--------------------------------------
Basic keeps track of all arrays in a special structure called an array
descriptor. The array descriptor is unique to Basic and is not
available in any other language. Because of this, to pass an array
from assembly language to Basic, the array must first be created in
Basic, then passed to the assembly-language routine. The assembly-
language routine may then alter the values in the array, but it cannot
change the length of the array.
The array descriptor is similar in some respects to a string
descriptor. The array descriptor is necessary because Basic may shift
the location of array data in memory. Therefore, you can safely pass
arrays from Basic only if you observe the following points:
1. Pass the array's address by applying the VARPTR function to the
first element of the array and passing the result by value. To pass
the far address of the array, apply both the VARPTR and VARSEG
functions and pass each result by value. The assembler gets the
address of the first element and considers it the address of the
entire array.
2. The routine that receives the array must not, under any
circumstances, make a call back to Basic. If it does, then the
location of the array may change, and the address that was passed
to the routine will become meaningless.
3. Basic can pass any member of an array by value. With this method,
the above precautions do not apply.
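Rule 1 can be sketched as follows. The function receives the near address of the first element and an element count, both by value, and sums the integers. It assumes the array data is in DGROUP (so VARPTR alone is sufficient) and makes no call back to Basic. The name SumArr and the Basic lines in the comments are hypothetical.

```asm
.MODEL MEDIUM
.CODE
        PUBLIC  SumArr
; Basic side (hypothetical):
;   DECLARE FUNCTION SumArr% (BYVAL Addr%, BYVAL Count%)
;   Total% = SumArr%(VARPTR(A%(0)), 10)
SumArr  PROC    FAR
        push    bp
        mov     bp, sp
        mov     bx, [bp+8]      ; first parameter: near address of A%(0)
        mov     cx, [bp+6]      ; second parameter: number of elements
        xor     ax, ax          ; running total
        jcxz    SumDone         ; count of zero: return 0
SumNext:
        add     ax, [bx]        ; add the current element
        add     bx, 2           ; advance to the next INTEGER
        loop    SumNext
SumDone:
        pop     bp
        ret     4               ; remove two 2-byte parameters
SumArr  ENDP
        END
```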
Array Ordering
--------------
There are two types of array ordering: row-major and column-major.
Basic uses column-major ordering, in which the leftmost subscript
changes fastest as you move through memory. When you compile with the
BC command line, you can select the /R option, which stores arrays in
row-major order instead of column-major order.
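To illustrate column-major ordering, the sketch below fetches element A%(I, J) from a two-dimensional integer array compiled without /R. It assumes DIM A%(2, 3) with a lower bound of 0 (3 rows), so the byte offset of an element is (J * 3 + I) * 2. The name GetElem, the ROWS constant, and the Basic lines in the comments are hypothetical.

```asm
.MODEL MEDIUM
.CODE
        PUBLIC  GetElem
ROWS    EQU     3               ; assumed: DIM A%(2, 3), base 0 -> 3 rows
; Basic side (hypothetical):
;   DECLARE FUNCTION GetElem% (BYVAL Addr%, BYVAL I%, BYVAL J%)
;   V% = GetElem%(VARPTR(A%(0, 0)), I%, J%)
GetElem PROC    FAR
        push    bp
        mov     bp, sp
        mov     ax, [bp+6]      ; J (last parameter pushed)
        mov     cx, ROWS
        mul     cx              ; AX = J * ROWS (DX is overwritten)
        add     ax, [bp+8]      ; + I (leftmost subscript changes fastest)
        shl     ax, 1           ; * 2 bytes per INTEGER
        mov     bx, [bp+10]     ; near address of A%(0, 0)
        add     bx, ax
        mov     ax, [bx]        ; return the element in AX
        pop     bp
        ret     6               ; remove three 2-byte parameters
GetElem ENDP
        END
```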
COMMON BLOCKS
=============
You can pass individual members of a Basic COMMON block in an argument
list, just as you can any data. However, you can also give an
assembly-language routine access to the entire COMMON block at once.
Assembly language can reference the items of a COMMON block by first
declaring a structure with fields that correspond to the COMMON block
variables. Having defined a structure with the appropriate fields, the
assembly-language routine must then get the address of the COMMON
block.
To pass the address of the COMMON block, pass the address of the first
variable in the block. The assembly-language routine should expect to
receive a structure by reference.
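A minimal sketch of this approach follows, assuming a blank COMMON block that holds three integers. The structure name CommonBlk, its field names, the function name SumCommon, and the Basic lines in the comments are all hypothetical.

```asm
.MODEL MEDIUM
CommonBlk STRUC                 ; mirrors COMMON a%, b%, c% (assumed)
cbA     dw      ?
cbB     dw      ?
cbC     dw      ?
CommonBlk ENDS
.CODE
        PUBLIC  SumCommon
; Basic side (hypothetical):
;   DEFINT A-Z
;   COMMON a, b, c
;   DECLARE FUNCTION SumCommon% (FirstVar%)
;   Total = SumCommon%(a)
SumCommon PROC  FAR
        push    bp
        mov     bp, sp
        mov     bx, [bp+6]      ; near address of first COMMON variable
        mov     ax, [bx].cbA    ; access the block through the structure
        add     ax, [bx].cbB
        add     ax, [bx].cbC
        pop     bp
        ret     2
SumCommon ENDP
        END
```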
For named COMMON blocks, there is an alternative method. In the
assembly-language program, a segment is set up with the same name as
the COMMON block and then grouped with DGROUP, as follows:
BNAME SEGMENT COMMON 'BC_VARS'
x dw 1 dup (?)
y dw 1 dup (?)
z dw 1 dup (?)
BNAME ENDS
DGROUP GROUP BNAME
The above assembler code matches with the following Basic code using a
named COMMON block:
DEFINT A-Z
COMMON /BNAME/ x,y,z
Passing arrays through the COMMON block is done in a similar fashion.
However, only static arrays can be passed to assembler through COMMON.
Note: Microsoft does not support passing dynamic arrays through
COMMON to assembler (since this depends upon a Microsoft
proprietary dynamic array descriptor format that changes from
version to version). Dynamic arrays can be passed to assembler only
as parameters in a CALL statement.
When static arrays are used, the entire array is stored in the COMMON
block.
Note that variables in COMMON that follow STRING*n variables, where n
is odd, are aligned on the next even (word) boundary. Thus, in the
assembler code you must define an extra dummy byte using db 1 after
each STRING*n variable where n is odd. A dummy byte is not necessary
after STRING*n variables when n is even.
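As a sketch of the padding rule, the segment below matches a named COMMON block that contains a 5-byte fixed-length string followed by an integer. The variable names s and n and the Basic line in the comment are hypothetical; the dummy byte is written in the same uninitialized style as the earlier BNAME example.

```asm
; Basic side (hypothetical):
;   COMMON /BNAME/ s AS STRING * 5, n AS INTEGER
BNAME   SEGMENT COMMON 'BC_VARS'
s       db      5 dup (?)       ; STRING * 5 (odd length)
        db      1 dup (?)       ; dummy byte so that n starts on an even offset
n       dw      1 dup (?)       ; INTEGER
BNAME   ENDS
DGROUP  GROUP   BNAME
```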
HOW TO RETURN VALUES FROM ASSEMBLY-LANGUAGE FUNCTIONS
=====================================================
Assembler "functions" are not called with the CALL statement; they are
invoked on the right-hand side of an equal sign (=) in compiled Basic.
When an assembly-language function is called from Basic, the return
value (or a near pointer to it) is returned in registers, as shown in
the following chart:
Data Type        How Value Is Returned
---------        ---------------------
INTEGER          The value is placed in AX.
LONG             The high-order portion is placed in DX. The low-order
                 portion is placed in AX.
SINGLE           The value is placed in the location provided by
                 Basic. The segment is DS. Basic will push an extra
                 parameter on the stack, after all the other
                 parameters, that contains the offset of the memory
                 location in which to store the return value. The
                 offset located in BP+6 should be placed in AX before
                 the function exits.
DOUBLE           The value is placed in the location provided by
                 Basic. The segment is DS. Basic will push an extra
                 parameter on the stack, after all the other
                 parameters, that contains the offset of the memory
                 location in which to store the return value. The
                 offset should be placed in AX before the function
                 exits.
VARIABLE-
LENGTH STRING    Pointer to a descriptor (offset in AX).
Note: Basic does not allow functions that return a fixed-length
string or a user-defined type.
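The LONG row of the chart can be sketched with a function that builds a long integer from two words passed by value. The name MakeLong and the DECLARE line in the comment are hypothetical.

```asm
.MODEL MEDIUM
.CODE
        PUBLIC  MakeLong
; Basic side (hypothetical):
;   DECLARE FUNCTION MakeLong& (BYVAL HiWord%, BYVAL LoWord%)
MakeLong PROC   FAR
        push    bp
        mov     bp, sp
        mov     dx, [bp+8]      ; first parameter -> high-order word in DX
        mov     ax, [bp+6]      ; second parameter -> low-order word in AX
        pop     bp
        ret     4               ; remove two 2-byte parameters
MakeLong ENDP
        END
```

With this sketch, MakeLong&(1, 0) returns DX = 1 and AX = 0, which Basic reads as the long value 65536.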
DEBUGGING MIXED-LANGUAGE PROGRAMS
=================================
Microsoft CodeView is very useful when trying to debug mixed-language
programs. With CodeView you can trace through the source code of both
assembly language and Basic and watch variables in both languages.
To compile programs for use with CodeView, use the /Zi switch on the
compile line for both the assembler and the Basic compiler. Then when
linking, use the /CO switch.
CodeView is a multilanguage source code debugger supplied with
Microsoft Basic Compiler versions 6.00 and 6.00b; Microsoft Basic
Professional Development System (PDS) versions 7.00 and 7.10;
Microsoft C Optimizing Compiler versions 5.00 and 5.10; Microsoft
Macro Assembler versions 5.00 and 5.10; and Microsoft FORTRAN Compiler
versions 4.00 and 5.00.
COMPILING AND LINKING THE SAMPLE PROGRAMS
=========================================
The following is a series of examples, demonstrating the interlanguage
calling capabilities between Basic and assembler.
When compiling the sample Basic programs, use the following compile
line:
BC /O Basicprogramname;
When compiling the sample MASM programs, use the following compile
line for MASM 5.00 or 5.10:
MASM Assemprogramname;
Or, use the following compile line for QuickAssembler 2.01:
QCL Assemprogramname;
To link the programs together, use the following LINK line:
LINK Basicprogramname Assemprogramname;
Note: All the examples using variable-length strings assume the use
of near variable-length strings. These examples will not work in
the QuickBasic Extended (QBX.EXE) environment, or when compiling
with the BC /Fs option, in Microsoft Basic Professional
Development System (PDS) versions 7.00 and 7.10.
APPENDIX A: MISCELLANEOUS TOPICS
================================
Basic SUPPORTS MASM 5.10 UPDATE .MODEL AND PROC EXTENSIONS
==========================================================
Microsoft Macro Assembler (MASM) version 5.10 includes several new
features (not found in MASM version 5.00 or earlier) that simplify
assembly-language routines linked with high-level-language programs.
Two of these features are as follows:
1. An extension to the .MODEL directive that automatically sets up
naming, calling, and return conventions for a given high-level
language. For example:
.MODEL MEDIUM,Basic
2. A modification of the PROC directive that handles most of the
procedure entry automatically. The PROC directive saves specified
registers, defines text macros for passed arguments, and generates
stack setup code on entry and stack tear-down code on exit.
Section 5 of the "Microsoft Macro Assembler Version 5.1 Update" manual
discusses the new features.
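The following is a hedged sketch of how the two extensions might be combined in MASM 5.10. The function name AddTwo, the argument names pA and pB, and the DECLARE line in the comment are hypothetical; consult the update manual for the authoritative syntax.

```asm
.MODEL  MEDIUM, BASIC
.CODE
        PUBLIC  AddTwo
; Basic side (hypothetical):
;   DECLARE FUNCTION AddTwo% (A%, B%)   ' both passed by near reference
AddTwo  PROC    USES si, pA:WORD, pB:WORD
        mov     si, pA          ; near address of the first argument
        mov     ax, [si]
        mov     si, pB          ; near address of the second argument
        add     ax, [si]        ; INTEGER result is returned in AX
        ret                     ; assembler supplies the stack tear-down
AddTwo  ENDP
        END
```

Compared with the MASM 5.00 style, the BP setup, the save and restore of SI, and the parameter count on the return are left to the assembler.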
PROBLEM CALLING ASSEMBLER ROUTINE WITH LABEL ON END DIRECTIVE
=============================================================
A QuickBasic .EXE program will hang at run time if it is LINKed to an
assembly-language routine that uses a label on the END directive. The
same programs execute successfully when run inside the QB.EXE editor
with the assembly-language routine in a Quick library.
Although versions of QuickBasic prior to version 4.00 allow a label on
the END directive in a LINKed assembly-language program, programs for
versions 4.00 and 4.00b require you to have no label on the assembly-
language END directive.
When the linker creates an executable program, it successively
examines each .OBJ file and determines whether that file has a
specified entry point. The first .OBJ file that specifies an entry
point is assumed by the linker to be the main program, and program
execution begins there.
In the assembly-language routines, the purpose of a label with an END
directive is to indicate to the linker the program's starting address
or entry point (where program execution is to start). Therefore, if no
entry point is found in the QuickBasic routine, program execution will
begin in the assembly-language routines (in effect, the Basic code is
totally bypassed).
In versions of the QuickBasic compiler earlier than 4.00, the
QuickBasic object code contains an entry-point specifier. Therefore,
by simply listing QuickBasic object files before the assembly-language
object files on the LINK command line, the linker recognizes that the
QuickBasic program is the main program.
However, in QuickBasic version 4.00, the entry-point information is no
longer in the object file; instead, it resides in the run-time module
(for example, BCOM40.LIB or BRUN40.LIB). Because these files are
LINKed after the Basic and assembly-language .OBJ files, if the
assembly-language routine specifies an entry point, the linker will
incorrectly assume that program execution is to begin in the assembly-
language routine.
Results of testing with previous versions of QuickBasic indicate that
the programs run successfully both inside the editor and as .EXE files
when compiled with versions 2.00, 2.01, and 3.00 of the QuickBasic
compiler.
There are two workarounds to correct this problem in version 4.00:
1. Remove the label on the END directive (that is, remove the entry-
point specification in your assembly-language routine) and
reassemble.
2. If the assembly-language routine cannot be changed, place its .OBJ
module into a .LIB file. The module can then be used successfully
without removing the label from the END directive.
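The first workaround can be sketched as follows. The routine name MyRoutine is hypothetical; the point is only that the END directive carries no label.

```asm
.MODEL MEDIUM
.CODE
        PUBLIC  MyRoutine
MyRoutine PROC  FAR
        ret
MyRoutine ENDP
; Writing "END MyRoutine" here would declare MyRoutine as the
; program's entry point; a bare END declares none.
        END
```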
ASSEMBLER ROUTINES MUST NOT ASSUME ES EQUALS DS
===============================================
If CALLed assembler routines do string manipulation and use the ES
register, then the results inside the QB.EXE editor may differ from
the executable .EXE program if the assembler routines assume the ES
and DS registers are equal.
The ES and DS registers should not be assumed to be equal in
QuickBasic versions 4.00 and later.
Generally, the ES and DS registers are equal for the executable
program; however, this is not always a valid assumption. The assembler
routines must explicitly set ES equal to DS, as shown in the code
example below.
The following assembler code sets ES equal to DS:
push bp
mov bp, sp
push es ;save the caller's ES
push ds ;these two lines copy
pop es ;DS into ES
.
. ;body of routine
.
pop es ;restore the caller's ES
pop bp ;restore the caller's BP
ret
QUICK LIBRARY WITH 0 (ZERO) BYTES IN FIRST CODE SEGMENT
=======================================================
A Quick library containing leading zeros in the first CODE segment is
invalid, causing the message "Error in loading file <name> - Invalid
format" when you try to load it in QuickBasic. For example, this error
can occur if an assembly-language routine puts data that is
initialized to 0 (zero) in the first CODE segment, and it is
subsequently listed first on the LINK command line when you make a
Quick library. If you have this problem, do either of the following:
1. Link with a Basic module first on the LINK command line.
-or-
2. In whatever module comes first on the LINK command line, make sure
that the first code segment starts with a nonzero byte.
This article is continued in the following article in the Microsoft
Knowledge Base:
ARTICLE-ID: Q71275
TITLE : "How to Pass Parameters Between Basic and Assembly" (Part 2/2)
Modification Type: Minor
Last Reviewed: 8/16/2005
Keywords: KB51501