nasmdoc.src

Things can be more complex than this: since 16-bit segments and
\i{groups} may \I{overlapping segments}overlap, you might occasionally
want to refer to some symbol using a different segment base from the
preferred one. NASM lets you do this, by the use of the \c{WRT}
(With Reference To) keyword. So you can do things like

\c           mov ax,weird_seg       ; weird_seg is a segment base
\c           mov es,ax
\c           mov bx,symbol wrt weird_seg

to load \c{ES:BX} with a different, but functionally equivalent,
pointer to the symbol \c{symbol}.

NASM supports far (inter-segment) calls and jumps by means of the
syntax \c{call segment:offset}, where \c{segment} and \c{offset}
both represent immediate values. So to call a far procedure, you
could code either of

\c           call (seg procedure):procedure
\c           call weird_seg:(procedure wrt weird_seg)

(The parentheses are included for clarity, to show the intended
parsing of the above instructions. They are not necessary in
practice.)

NASM supports the syntax \I\c{CALL FAR}\c{call far procedure} as a
synonym for the first of the above usages. \c{JMP} works identically
to \c{CALL} in these examples.

To declare a \i{far pointer} to a data item in a data segment, you
must code

\c           dw symbol, seg symbol

NASM supports no convenient synonym for this, though you can always
invent one using the macro processor.
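
As a minimal sketch of such a synonym, assuming a multi-line macro
(see \k{mlmacro}) with the hypothetical name \c{farptr}, which is
not something NASM itself provides, one could write

\c %macro  farptr 1
\c           dw %1, seg %1          ; offset first, then segment
\c %endmacro

after which \c{farptr symbol} would assemble to the same four bytes
as the \c{dw} line above.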

\H{crit} \i{Critical Expressions}

A limitation of NASM is that it is a \i{two-pass assembler}; unlike
TASM and others, it will always do exactly two \I{passes}\i{assembly
passes}. Therefore it is unable to cope with source files that are
complex enough to require three or more passes.

The first pass is used to determine the size of all the assembled
code and data, so that the second pass, when generating all the
code, knows all the symbol addresses the code refers to. So one
thing NASM can't handle is code whose size depends on the value of a
symbol declared after the code in question. For example,

\c           times (label-$) db 0
\c label:    db 'Where am I?'

The argument to \i\c{TIMES} in this case could equally legally
evaluate to anything at all; NASM will reject this example because
it cannot tell the size of the \c{TIMES} line when it first sees it.
It will just as firmly reject the slightly \I{paradox}paradoxical
code

\c           times (label-$+1) db 0
\c label:    db 'NOW where am I?'

in which \e{any} value for the \c{TIMES} argument is by definition
wrong!

NASM rejects these examples by means of a concept called a
\e{critical expression}, which is defined to be an expression whose
value is required to be computable in the first pass, and which must
therefore depend only on symbols defined before it. The argument to
the \c{TIMES} prefix is a critical expression; for the same reason,
the arguments to the \i\c{RESB} family of pseudo-instructions are
also critical expressions.
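
By contrast, a critical expression that refers only to symbols
already defined is accepted. The following is a sketch with
illustrative names (\c{padlen} and \c{start} are not taken from
anywhere else in this manual):

\c padlen    equ 16                 ; defined before it is used
\c start:    db 'abc'
\c           times padlen-($-start) db 0

Here both \c{padlen} and \c{start} are known by the time the
\c{TIMES} line is reached, so NASM can work out in the first pass
that the line occupies 13 bytes, padding the string to 16.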

Critical expressions can crop up in other contexts as well: consider
the following code.

\c           mov ax,symbol1
\c symbol1   equ symbol2
\c symbol2:

On the first pass, NASM cannot determine the value of \c{symbol1},
because \c{symbol1} is defined to be equal to \c{symbol2} which NASM
hasn't seen yet. On the second pass, therefore, when it encounters
the line \c{mov ax,symbol1}, it is unable to generate the code for
it because it still doesn't know the value of \c{symbol1}. On the
next line, it would see the \i\c{EQU} again and be able to determine
the value of \c{symbol1}, but by then it would be too late.

NASM avoids this problem by defining the right-hand side of an
\c{EQU} statement to be a critical expression, so the definition of
\c{symbol1} would be rejected in the first pass.
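
One way to make the example acceptable, sketched here on the
assumption that nothing else depends on the order of the lines, is
to move the \c{EQU} after the label it refers to:

\c           mov ax,symbol1
\c symbol2:
\c symbol1   equ symbol2

The right-hand side of the \c{EQU} now depends only on a symbol
defined before it, and the forward reference in the \c{MOV} is
harmless because the size of that instruction does not depend on
the value of \c{symbol1}.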

There is a related issue involving \i{forward references}: consider
this code fragment.

\c           mov eax,[ebx+offset]
\c offset    equ 10

NASM, on pass one, must calculate the size of the instruction \c{mov
eax,[ebx+offset]} without knowing the value of \c{offset}. It has no
way of knowing that \c{offset} is small enough to fit into a
one-byte offset field and that it could therefore get away with
generating a shorter form of the \i{effective-address} encoding; for
all it knows, in pass one, \c{offset} could be a symbol in the code
segment, and it might need the full four-byte form. So it is forced
to compute the size of the instruction to accommodate a four-byte
address part. In pass two, having made this decision, it is now
forced to honour it and keep the instruction large, so the code
generated in this case is not as small as it could have been. This
problem can be solved by defining \c{offset} before using it, or by
forcing byte size in the effective address by coding \c{[byte
ebx+offset]}.
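
Both remedies are sketched below; the two fragments are
alternatives, not parts of one program. Either define the symbol
first,

\c offset    equ 10
\c           mov eax,[ebx+offset]   ; value known in pass one

or keep the forward reference and state the size explicitly:

\c           mov eax,[byte ebx+offset]
\c offset    equ 10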

\H{locallab} \i{Local Labels}

NASM gives special treatment to symbols beginning with a \i{period}.
A label beginning with a single period is treated as a \e{local}
label, which means that it is associated with the previous non-local
label. So, for example:

\c label1    ; some code
\c .loop     ; some more code
\c           jne .loop
\c           ret
\c label2    ; some code
\c .loop     ; some more code
\c           jne .loop
\c           ret

In the above code fragment, each \c{JNE} instruction jumps to the
line immediately before it, because the two definitions of \c{.loop}
are kept separate by virtue of each being associated with the
previous non-local label.

This form of local label handling is borrowed from the old Amiga
assembler \i{DevPac}; however, NASM goes one step further, in
allowing access to local labels from other parts of the code. This
is achieved by means of \e{defining} a local label in terms of the
previous non-local label: the first definition of \c{.loop} above is
really defining a symbol called \c{label1.loop}, and the second
defines a symbol called \c{label2.loop}. So, if you really needed
to, you could write

\c label3    ; some more code
\c           ; and some more
\c           jmp label1.loop

Sometimes it is useful - in a macro, for instance - to be able to
define a label which can be referenced from anywhere but which
doesn't interfere with the normal local-label mechanism. Such a
label can't be non-local because it would interfere with subsequent
definitions of, and references to, local labels; and it can't be
local because the macro that defined it wouldn't know the label's
full name. NASM therefore introduces a third type of label, which is
probably only useful in macro definitions: if a label begins with
the \I{label prefix}special prefix \i\c{..@}, then it does nothing
to the local label mechanism. So you could code

\c label1:   ; a non-local label
\c .local:   ; this is really label1.local
\c ..@foo:   ; this is a special symbol
\c label2:   ; another non-local label
\c .local:   ; this is really label2.local
\c           jmp ..@foo             ; this will jump three lines up

NASM has the capacity to define other special symbols beginning with
a double period: for example, \c{..start} is used to specify the
entry point in the \c{obj} output format (see \k{dotdotstart}).
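
As a sketch of the macro use mentioned above, the hypothetical macro
\c{setretry} below (the name, and the label \c{..@retry}, are
invented for illustration) defines a \c{..@} label without
disturbing the local labels around the point of invocation:

\c %macro  setretry 0
\c ..@retry:
\c %endmacro
\c func:     ; some code
\c .again:   ; this is really func.again
\c           setretry               ; defines ..@retry here
\c           jne .again             ; still refers to func.again
\c           jmp ..@retry           ; reachable by its full name

Note that a macro written this way could only be invoked once, since
a second invocation would define \c{..@retry} again.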

\C{preproc} The NASM \i{Preprocessor}

NASM contains a powerful \i{macro processor}, which supports
conditional assembly, multi-level file inclusion, two forms of macro
(single-line and multi-line), and a `context stack' mechanism for
extra macro power. Preprocessor directives all begin with a \c{%}
sign.

\H{slmacro} \i{Single-Line Macros}

\S{define} The Normal Way: \I\c{%idefine}\i\c{%define}

Single-line macros are defined using the \c{%define} preprocessor
directive. The definitions work in a similar way to C; so you can do
things like

\c %define ctrl 0x1F &
\c %define param(a,b) ((a)+(a)*(b))
\c           mov byte [param(2,ebx)], ctrl 'D'

which will expand to

\c           mov byte [(2)+(2)*(ebx)], 0x1F & 'D'

When the expansion of a single-line macro contains tokens which
invoke another macro, the expansion is performed at invocation time,
not at definition time. Thus the code

\c %define a(x) 1+b(x)
\c %define b(x) 2*x
\c           mov ax,a(8)

will evaluate in the expected way to \c{mov ax,1+2*8}, even though
the macro \c{b} wasn't defined at the time of definition of \c{a}.

Macros defined with \c{%define} are \i{case sensitive}: after
\c{%define foo bar}, only \c{foo} will expand to \c{bar}: \c{Foo} or
\c{FOO} will not. By using \c{%idefine} instead of \c{%define} (the
`i' stands for `insensitive') you can define all the case variants
of a macro at once, so that \c{%idefine foo bar} would cause
\c{foo}, \c{Foo}, \c{FOO}, \c{fOO} and so on all to expand to
\c{bar}.

There is a mechanism which detects when a macro call has occurred as
a result of a previous expansion of the same macro, to guard against
\i{circular references} and infinite loops. If this happens, the
preprocessor will only expand the first occurrence of the macro.
Hence, if you code

\c %define a(x) 1+a(x)
\c           mov ax,a(3)

the macro \c{a(3)} will expand once, becoming \c{1+a(3)}, and will
then expand no further. This behaviour can be useful: see \k{32c}
for an example of its use.

You can \I{overloading, single-line macros}overload single-line
macros: if you write

\c %define foo(x) 1+x
\c %define foo(x,y) 1+x*y

the preprocessor will be able to handle both types of macro call,
by counting the parameters you pass; so \c{foo(3)} will become
\c{1+3} whereas \c{foo(ebx,2)} will become \c{1+ebx*2}. However, if
you define

\c %define foo bar

then no other definition of \c{foo} will be accepted: a macro with
no parameters prohibits the definition of the same name as a macro
\e{with} parameters, and vice versa.

This doesn't prevent single-line macros being \e{redefined}: you can
perfectly well define a macro with

\c %define foo bar

and then re-define it later in the same source file with

\c %define foo baz

Then everywhere the macro \c{foo} is invoked, it will be expanded
according to the most recent definition. This is particularly useful
when defining single-line macros with \c{%assign} (see \k{assign}).

You can \i{pre-define} single-line macros using the `-d' option on
the NASM command line: see \k{opt-d}.

\S{undef} Undefining Macros: \i\c{%undef}

Single-line macros can be removed with the \c{%undef} command.  For
example, the following sequence:

\c %define foo bar
\c %undef foo
\c           mov eax, foo

will expand to the instruction \c{mov eax, foo}, since after
\c{%undef} the macro \c{foo} is no longer defined.

Macros that would otherwise be pre-defined can be undefined using
the `-u' option on the NASM command line: see \k{opt-u}.

\S{assign} \i{Preprocessor Variables}: \i\c{%assign}

An alternative way to define single-line macros is by means of the
\c{%assign} command (and its \I{case sensitive}case-insensitive
counterpart \i\c{%iassign}, which differs from \c{%assign} in
exactly the same way that \c{%idefine} differs from \c{%define}).

\c{%assign} is used to define single-line macros which take no
parameters and have a numeric value. This value can be specified in
the form of an expression, and it will be evaluated once, when the
\c{%assign} directive is processed.

Like \c{%define}, macros defined using \c{%assign} can be re-defined
later, so you can do things like

\c %assign i i+1

to increment the numeric value of a macro.

\c{%assign} is useful for controlling the termination of \c{%rep}
preprocessor loops: see \k{rep} for an example of this. Another
use for \c{%assign} is given in \k{16c} and \k{32c}.
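
A minimal sketch of the \c{%rep} case, assuming the \c{%rep} and
\c{%endrep} directives described in \k{rep} and an illustrative
counter named \c{i}:

\c %assign i 0
\c %rep 4
\c           db i
\c %assign i i+1
\c %endrep

Because each \c{%assign} is evaluated when it is processed, this
should emit the four bytes 0, 1, 2 and 3.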

The expression passed to \c{%assign} is a \i{critical expression}
(see \k{crit}), and must also evaluate to a pure number (rather than
a relocatable reference such as a code or data address, or anything
involving a register).

\H{mlmacro} \i{Multi-Line Macros}: \I\c{%imacro}\i\c{%macro}

Multi-line macros are much more like the type of macro seen in MASM
and TASM: a multi-line macro definition in NASM looks something like
this.

\c %macro