📄 nasmdoc.src

📁 nasm早期的源代码,比较简单是学习汇编和编译原理的好例子
💻 SRC
📖 第 1 页 / 共 5 页
字号:
上一页 1 2 3 45
Note also that \c{TIMES} can't be applied to \i{macros}: the reason
for this is that \c{TIMES} is processed after the macro phase, which
allows the argument to \c{TIMES} to contain expressions such as
\c{64-$+buffer} as above. To repeat more than one line of code, or a
complex macro, use the preprocessor \i\c{%rep} directive.


\H{effaddr} Effective Addresses

An \i{effective address} is any operand to an instruction which
\I{memory reference}references memory. Effective addresses, in NASM,
have a very simple syntax: they consist of an expression evaluating
to the desired address, enclosed in \i{square brackets}. For
example:

\c wordvar dw      123
\c         mov     ax,[wordvar]
\c         mov     ax,[wordvar+1]
\c         mov     ax,[es:wordvar+bx]

Anything not conforming to this simple system is not a valid memory
reference in NASM, for example \c{es:wordvar[bx]}.

More complicated effective addresses, such as those involving more
than one register, work in exactly the same way:

\c         mov     eax,[ebx*2+ecx+offset]
\c         mov     ax,[bp+di+8]

NASM is capable of doing \i{algebra} on these effective addresses,
so that things which don't necessarily \e{look} legal are perfectly
all right:

\c     mov     eax,[ebx*5]             ; assembles as [ebx*4+ebx]
\c     mov     eax,[label1*2-label2]   ; ie [label1+(label1-label2)]

Some forms of effective address have more than one assembled form;
in most such cases NASM will generate the smallest form it can. For
example, there are distinct assembled forms for the 32-bit effective
addresses \c{[eax*2+0]} and \c{[eax+eax]}, and NASM will generally
generate the latter on the grounds that the former requires four
bytes to store a zero offset.

NASM has a hinting mechanism which will cause \c{[eax+ebx]} and
\c{[ebx+eax]} to generate different opcodes; this is occasionally
useful because \c{[esi+ebp]} and \c{[ebp+esi]} have different
default segment registers.

However, you can force NASM to generate an effective address in a
particular form by the use of the keywords \c{BYTE}, \c{WORD},
\c{DWORD} and \c{NOSPLIT}. If you need \c{[eax+3]} to be assembled
using a double-word offset field instead of the one byte NASM will
normally generate, you can code \c{[dword eax+3]}. Similarly, you
can force NASM to use a byte offset for a small value which it
hasn't seen on the first pass (see \k{crit} for an example of such a
code fragment) by using \c{[byte eax+offset]}. As special cases,
\c{[byte eax]} will code \c{[eax+0]} with a byte offset of zero, and
\c{[dword eax]} will code it with a double-word offset of zero. The
normal form, \c{[eax]}, will be coded with no offset field.

The form described in the previous paragraph is also useful if you
are trying to access data in a 32-bit segment from within 16 bit code.
For more information on this see the section on mixed-size addressing
(\k{mixaddr}). In particular, if you need to access data with a known
offset that is larger than will fit in a 16-bit value, if you don't
specify that it is a dword offset, nasm will cause the high word of
the offset to be lost.

Similarly, NASM will split \c{[eax*2]} into \c{[eax+eax]} because
that allows the offset field to be absent and space to be saved; in
fact, it will also split \c{[eax*2+offset]} into
\c{[eax+eax+offset]}. You can combat this behaviour by the use of
the \c{NOSPLIT} keyword: \c{[nosplit eax*2]} will force
\c{[eax*2+0]} to be generated literally.

In 64-bit mode, NASM will by default generate absolute addresses.  The
\i\c{REL} keyword makes it produce \c{RIP}-relative addresses. Since
this is frequently the normally desired behaviour, see the \c{DEFAULT}
directive (\k{default}). The keyword \i\c{ABS} overrides \i\c{REL}.


\H{const} \i{Constants}

NASM understands four different types of constant: numeric,
character, string and floating-point.


\S{numconst} \i{Numeric Constants}

A numeric constant is simply a number. NASM allows you to specify
numbers in a variety of number bases, in a variety of ways: you can
suffix \c{H}, \c{Q} or \c{O}, and \c{B} for \i{hex}, \i{octal} and \i{binary},
or you can prefix \c{0x} for hex in the style of C, or you can
prefix \c{$} for hex in the style of Borland Pascal. Note, though,
that the \I{$, prefix}\c{$} prefix does double duty as a prefix on
identifiers (see \k{syntax}), so a hex number prefixed with a \c{$}
sign must have a digit after the \c{$} rather than a letter.

Some examples:

\c         mov     ax,100          ; decimal
\c         mov     ax,0a2h         ; hex
\c         mov     ax,$0a2         ; hex again: the 0 is required
\c         mov     ax,0xa2         ; hex yet again
\c         mov     ax,777q         ; octal
\c         mov     ax,777o         ; octal again
\c         mov     ax,10010011b    ; binary


\S{chrconst} \i{Character Constants}

A character constant consists of up to four characters enclosed in
either single or double quotes. The type of quote makes no
difference to NASM, except of course that surrounding the constant
with single quotes allows double quotes to appear within it and vice
versa.

A character constant with more than one character will be arranged
with \i{little-endian} order in mind: if you code

\c           mov eax,'abcd'

then the constant generated is not \c{0x61626364}, but
\c{0x64636261}, so that if you were then to store the value into
memory, it would read \c{abcd} rather than \c{dcba}. This is also
the sense of character constants understood by the Pentium's
\i\c{CPUID} instruction.
\# (see \k{insCPUID})


\S{strconst} String Constants

String constants are only acceptable to some pseudo-instructions,
namely the \I\c{DW}\I\c{DD}\I\c{DQ}\I\c{DT}\i\c{DB} family and
\i\c{INCBIN}.

A string constant looks like a character constant, only longer. It
is treated as a concatenation of maximum-size character constants
for the conditions. So the following are equivalent:

\c       db    'hello'               ; string constant
\c       db    'h','e','l','l','o'   ; equivalent character constants

And the following are also equivalent:

\c       dd    'ninechars'           ; doubleword string constant
\c       dd    'nine','char','s'     ; becomes three doublewords
\c       db    'ninechars',0,0,0     ; and really looks like this

Note that when used as an operand to \c{db}, a constant like
\c{'ab'} is treated as a string constant despite being short enough
to be a character constant, because otherwise \c{db 'ab'} would have
the same effect as \c{db 'a'}, which would be silly. Similarly,
three-character or four-character constants are treated as strings
when they are operands to \c{dw}.


\S{fltconst} \I{floating-point, constants}Floating-Point Constants

\i{Floating-point} constants are acceptable only as arguments to
\i\c{DW}, \i\c{DD}, \i\c{DQ}, \i\c{DT}, and \i\c{DO}, or as arguments
to the special operators \i\c{__float16__}, \i\c{__float32__},
\i\c{__float64__}, \i\c{__float80m__}, \i\c{__float80e__},
\i\c{__float128l__}, and \i\c{__float128h__}.

Floating-point constants are expressed in the traditional form:
digits, then a period, then optionally more digits, then optionally an
\c{E} followed by an exponent. The period is mandatory, so that NASM
can distinguish between \c{dd 1}, which declares an integer constant,
and \c{dd 1.0} which declares a floating-point constant.  NASM also
support C99-style hexadecimal floating-point: \c{0x}, hexadecimal
digits, period, optionally more hexadeximal digits, then optionally a
\c{P} followed by a \e{binary} (not hexadecimal) exponent in decimal
notation.

Some examples:

\c       dw    -0.5                    ; IEEE half precision
\c       dd    1.2                     ; an easy one
\c	 dd    0x1p+2		       ; 1.0x2^2 = 4.0
\c       dq    1.e10                   ; 10,000,000,000
\c       dq    1.e+10                  ; synonymous with 1.e10
\c       dq    1.e-10                  ; 0.000 000 000 1
\c       dt    3.141592653589793238462 ; pi
\c       do    1.e+4000		       ; IEEE quad precision

The special operators are used to produce floating-point numbers in
other contexts.  They produce the binary representation of a specific
floating-point number as an integer, and can use anywhere integer
constants are used in an expression.  \c{__float80m__} and
\c{__float80e__} produce the 64-bit mantissa and 16-bit exponent of an
80-bit floating-point number, and \c{__float128l__} and
\c{__float128h__} produce the lower and upper 64-bit halves of a 128-bit
floating-point number, respectively.

For example:

\c       mov    rax,__float64__(3.141592653589793238462)

... would assign the binary representation of pi as a 64-bit floating
point number into \c{RAX}.  This is exactly equivalent to:

\c       mov    rax,0x401921fb54442d18

NASM cannot do compile-time arithmetic on floating-point constants.
This is because NASM is designed to be portable - although it always
generates code to run on x86 processors, the assembler itself can
run on any system with an ANSI C compiler. Therefore, the assembler
cannot guarantee the presence of a floating-point unit capable of
handling the \i{Intel number formats}, and so for NASM to be able to
do floating arithmetic it would have to include its own complete set
of floating-point routines, which would significantly increase the
size of the assembler for very little benefit.

The special tokens \i\c{__Infinity__}, \i\c{__QNaN__} (or
\i\c{__NaN__}) and \i\c{__SNaN__} can be used to generate
\I{infinity}infinities, quiet \i{NaN}s, and signalling NaNs,
respectively.  These are normally used as macros:

\c %define Inf __Infinity__
\c %define NaN __QNaN__
\c
\c       dq    +1.5, -Inf, NaN         ; Double-precision constants

\H{expr} \i{Expressions}

Expressions in NASM are similar in syntax to those in C.  Expressions
are evaluated as 64-bit integers which are then adjusted to the
appropriate size.

NASM supports two special tokens in expressions, allowing
calculations to involve the current assembly position: the
\I{$, here}\c{$} and \i\c{$$} tokens. \c{$} evaluates to the assembly
position at the beginning of the line containing the expression; so
you can code an \i{infinite loop} using \c{JMP $}. \c{$$} evaluates
to the beginning of the current section; so you can tell how far
into the section you are by using \c{($-$$)}.

The arithmetic \i{operators} provided by NASM are listed here, in
increasing order of \i{precedence}.


\S{expor} \i\c{|}: \i{Bitwise OR} Operator

The \c{|} operator gives a bitwise OR, exactly as performed by the
\c{OR} machine instruction. Bitwise OR is the lowest-priority
arithmetic operator supported by NASM.


\S{expxor} \i\c{^}: \i{Bitwise XOR} Operator

\c{^} provides the bitwise XOR operation.


\S{expand} \i\c{&}: \i{Bitwise AND} Operator

\c{&} provides the bitwise AND operation.


\S{expshift} \i\c{<<} and \i\c{>>}: \i{Bit Shift} Operators

\c{<<} gives a bit-shift to the left, just as it does in C. So \c{5<<3}
evaluates to 5 times 8, or 40. \c{>>} gives a bit-shift to the
right; in NASM, such a shift is \e{always} unsigned, so that
the bits shifted in from the left-hand end are filled with zero
rather than a sign-extension of the previous highest bit.


\S{expplmi} \I{+ opaddition}\c{+} and \I{- opsubtraction}\c{-}:
\i{Addition} and \i{Subtraction} Operators

The \c{+} and \c{-} operators do perfectly ordinary addition and
subtraction.


\S{expmul} \i\c{*}, \i\c{/}, \i\c{//}, \i\c{%} and \i\c{%%}:
\i{Multiplication} and \i{Division}

\c{*} is the multiplication operator. \c{/} and \c{//} are both
division operators: \c{/} is \i{unsigned division} and \c{//} is
\i{signed division}. Similarly, \c{%} and \c{%%} provide \I{unsigned
modulo}\I{modulo operators}unsigned and
\i{signed modulo} operators respectively.

NASM, like ANSI C, provides no guarantees about the sensible
operation of the signed modulo operator.

Since the \c{%} character is used extensively by the macro
\i{preprocessor}, you should ensure that both the signed and unsigned
modulo operators are followed by white space wherever they appear.


\S{expmul} \i{Unary Operators}: \I{+ opunary}\c{+}, \I{- opunary}\c{-},
\i\c{~}, \I{! opunary}\c{!} and \i\c{SEG}

The highest-priority operators in NASM's expression grammar are
those which only apply to one argument. \c{-} negates its operand,
\c{+} does nothing (it's provided for symmetry with \c{-}), \c{~}
computes the \i{one's complement} of its operand, \c{!} is the
\i{logical negation} operator, and \c{SEG} provides the \i{segment address}
of its operand (explained in more detail in \k{segwrt}).


\H{segwrt} \i\c{SEG} and \i\c{WRT}

When writing large 16-bit programs, which must be split into
multiple \i{segments}, it is often necessary to be able to refer to
the \I{segment address}segment part of the address of a symbol. NASM
supports the \c{SEG} operator to perform this function.

The \c{SEG} operator returns the \i\e{preferred} segment base of a
symbol, defined as the segment base relative to which the offset of
th
上一页 1 2 3 45
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -