📄 nasmdoc.src

📁 nasm的全套源代码,有些我做了些修改,以方便您更方便更容易调试成功,方便学习做编译器
💻 SRC
📖 第 1 页 / 共 5 页
字号:
上一页 1 2 3 45
addresses \c{[eax*2+0]} and \c{[eax+eax]}, and NASM will generally
generate the latter on the grounds that the former requires four
bytes to store a zero offset.

NASM has a hinting mechanism which will cause \c{[eax+ebx]} and
\c{[ebx+eax]} to generate different opcodes; this is occasionally
useful because \c{[esi+ebp]} and \c{[ebp+esi]} have different
default segment registers.

However, you can force NASM to generate an effective address in a
particular form by the use of the keywords \c{BYTE}, \c{WORD},
\c{DWORD} and \c{NOSPLIT}. If you need \c{[eax+3]} to be assembled
using a double-word offset field instead of the one byte NASM will
normally generate, you can code \c{[dword eax+3]}. Similarly, you
can force NASM to use a byte offset for a small value which it
hasn't seen on the first pass (see \k{crit} for an example of such a
code fragment) by using \c{[byte eax+offset]}. As special cases,
\c{[byte eax]} will code \c{[eax+0]} with a byte offset of zero, and
\c{[dword eax]} will code it with a double-word offset of zero. The
normal form, \c{[eax]}, will be coded with no offset field.

The form described in the previous paragraph is also useful if you
are trying to access data in a 32-bit segment from within 16 bit code.
For more information on this see the section on mixed-size addressing
(\k{mixaddr}). In particular, if you need to access data with a known
offset that is larger than will fit in a 16-bit value, if you don't
specify that it is a dword offset, nasm will cause the high word of
the offset to be lost.

Similarly, NASM will split \c{[eax*2]} into \c{[eax+eax]} because
that allows the offset field to be absent and space to be saved; in
fact, it will also split \c{[eax*2+offset]} into
\c{[eax+eax+offset]}. You can combat this behaviour by the use of
the \c{NOSPLIT} keyword: \c{[nosplit eax*2]} will force
\c{[eax*2+0]} to be generated literally.


\H{const} \i{Constants}

NASM understands four different types of constant: numeric,
character, string and floating-point.


\S{numconst} \i{Numeric Constants}

A numeric constant is simply a number. NASM allows you to specify
numbers in a variety of number bases, in a variety of ways: you can
suffix \c{H}, \c{Q} or \c{O}, and \c{B} for \i{hex}, \i{octal} and \i{binary},
or you can prefix \c{0x} for hex in the style of C, or you can
prefix \c{$} for hex in the style of Borland Pascal. Note, though,
that the \I{$, prefix}\c{$} prefix does double duty as a prefix on
identifiers (see \k{syntax}), so a hex number prefixed with a \c{$}
sign must have a digit after the \c{$} rather than a letter.

Some examples:

\c         mov     ax,100          ; decimal
\c         mov     ax,0a2h         ; hex
\c         mov     ax,$0a2         ; hex again: the 0 is required
\c         mov     ax,0xa2         ; hex yet again
\c         mov     ax,777q         ; octal
\c         mov     ax,777o         ; octal again
\c         mov     ax,10010011b    ; binary


\S{chrconst} \i{Character Constants}

A character constant consists of up to four characters enclosed in
either single or double quotes. The type of quote makes no
difference to NASM, except of course that surrounding the constant
with single quotes allows double quotes to appear within it and vice
versa.

A character constant with more than one character will be arranged
with \i{little-endian} order in mind: if you code

\c           mov eax,'abcd'

then the constant generated is not \c{0x61626364}, but
\c{0x64636261}, so that if you were then to store the value into
memory, it would read \c{abcd} rather than \c{dcba}. This is also
the sense of character constants understood by the Pentium's
\i\c{CPUID} instruction (see \k{insCPUID}).


\S{strconst} String Constants

String constants are only acceptable to some pseudo-instructions,
namely the \I\c{DW}\I\c{DD}\I\c{DQ}\I\c{DT}\i\c{DB} family and
\i\c{INCBIN}.

A string constant looks like a character constant, only longer. It
is treated as a concatenation of maximum-size character constants
for the conditions. So the following are equivalent:

\c       db    'hello'               ; string constant
\c       db    'h','e','l','l','o'   ; equivalent character constants

And the following are also equivalent:

\c       dd    'ninechars'           ; doubleword string constant
\c       dd    'nine','char','s'     ; becomes three doublewords
\c       db    'ninechars',0,0,0     ; and really looks like this

Note that when used as an operand to \c{db}, a constant like
\c{'ab'} is treated as a string constant despite being short enough
to be a character constant, because otherwise \c{db 'ab'} would have
the same effect as \c{db 'a'}, which would be silly. Similarly,
three-character or four-character constants are treated as strings
when they are operands to \c{dw}.


\S{fltconst} \I{floating-point, constants}Floating-Point Constants

\i{Floating-point} constants are acceptable only as arguments to
\i\c{DD}, \i\c{DQ} and \i\c{DT}. They are expressed in the
traditional form: digits, then a period, then optionally more
digits, then optionally an \c{E} followed by an exponent. The period
is mandatory, so that NASM can distinguish between \c{dd 1}, which
declares an integer constant, and \c{dd 1.0} which declares a
floating-point constant.

Some examples:

\c       dd    1.2                     ; an easy one
\c       dq    1.e10                   ; 10,000,000,000
\c       dq    1.e+10                  ; synonymous with 1.e10
\c       dq    1.e-10                  ; 0.000 000 000 1
\c       dt    3.141592653589793238462 ; pi

NASM cannot do compile-time arithmetic on floating-point constants.
This is because NASM is designed to be portable - although it always
generates code to run on x86 processors, the assembler itself can
run on any system with an ANSI C compiler. Therefore, the assembler
cannot guarantee the presence of a floating-point unit capable of
handling the \i{Intel number formats}, and so for NASM to be able to
do floating arithmetic it would have to include its own complete set
of floating-point routines, which would significantly increase the
size of the assembler for very little benefit.


\H{expr} \i{Expressions}

Expressions in NASM are similar in syntax to those in C.

NASM does not guarantee the size of the integers used to evaluate
expressions at compile time: since NASM can compile and run on
64-bit systems quite happily, don't assume that expressions are
evaluated in 32-bit registers and so try to make deliberate use of
\i{integer overflow}. It might not always work. The only thing NASM
will guarantee is what's guaranteed by ANSI C: you always have \e{at
least} 32 bits to work in.

NASM supports two special tokens in expressions, allowing
calculations to involve the current assembly position: the
\I{$, here}\c{$} and \i\c{$$} tokens. \c{$} evaluates to the assembly
position at the beginning of the line containing the expression; so
you can code an \i{infinite loop} using \c{JMP $}. \c{$$} evaluates
to the beginning of the current section; so you can tell how far
into the section you are by using \c{($-$$)}.

The arithmetic \i{operators} provided by NASM are listed here, in
increasing order of \i{precedence}.


\S{expor} \i\c{|}: \i{Bitwise OR} Operator

The \c{|} operator gives a bitwise OR, exactly as performed by the
\c{OR} machine instruction. Bitwise OR is the lowest-priority
arithmetic operator supported by NASM.


\S{expxor} \i\c{^}: \i{Bitwise XOR} Operator

\c{^} provides the bitwise XOR operation.


\S{expand} \i\c{&}: \i{Bitwise AND} Operator

\c{&} provides the bitwise AND operation.


\S{expshift} \i\c{<<} and \i\c{>>}: \i{Bit Shift} Operators

\c{<<} gives a bit-shift to the left, just as it does in C. So \c{5<<3}
evaluates to 5 times 8, or 40. \c{>>} gives a bit-shift to the
right; in NASM, such a shift is \e{always} unsigned, so that
the bits shifted in from the left-hand end are filled with zero
rather than a sign-extension of the previous highest bit.


\S{expplmi} \I{+ opaddition}\c{+} and \I{- opsubtraction}\c{-}:
\i{Addition} and \i{Subtraction} Operators

The \c{+} and \c{-} operators do perfectly ordinary addition and
subtraction.


\S{expmul} \i\c{*}, \i\c{/}, \i\c{//}, \i\c{%} and \i\c{%%}:
\i{Multiplication} and \i{Division}

\c{*} is the multiplication operator. \c{/} and \c{//} are both
division operators: \c{/} is \i{unsigned division} and \c{//} is
\i{signed division}. Similarly, \c{%} and \c{%%} provide \I{unsigned
modulo}\I{modulo operators}unsigned and
\i{signed modulo} operators respectively.

NASM, like ANSI C, provides no guarantees about the sensible
operation of the signed modulo operator.

Since the \c{%} character is used extensively by the macro
\i{preprocessor}, you should ensure that both the signed and unsigned
modulo operators are followed by white space wherever they appear.


\S{expmul} \i{Unary Operators}: \I{+ opunary}\c{+}, \I{- opunary}\c{-},
\i\c{~} and \i\c{SEG}

The highest-priority operators in NASM's expression grammar are
those which only apply to one argument. \c{-} negates its operand,
\c{+} does nothing (it's provided for symmetry with \c{-}), \c{~}
computes the \i{one's complement} of its operand, and \c{SEG}
provides the \i{segment address} of its operand (explained in more
detail in \k{segwrt}).


\H{segwrt} \i\c{SEG} and \i\c{WRT}

When writing large 16-bit programs, which must be split into
multiple \i{segments}, it is often necessary to be able to refer to
the \I{segment address}segment part of the address of a symbol. NASM
supports the \c{SEG} operator to perform this function.

The \c{SEG} operator returns the \i\e{preferred} segment base of a
symbol, defined as the segment base relative to which the offset of
the symbol makes sense. So the code

\c         mov     ax,seg symbol
\c         mov     es,ax
\c         mov     bx,symbol

will load \c{ES:BX} with a valid pointer to the symbol \c{symbol}.

Things can be more complex than this: since 16-bit segments and
\i{groups} may \I{overlapping segments}overlap, you might occasionally
want to refer to some symbol using a different segment base from the
preferred one. NASM lets you do this, by the use of the \c{WRT}
(With Reference To) keyword. So you can do things like

\c         mov     ax,weird_seg        ; weird_seg is a segment base
\c         mov     es,ax
\c         mov     bx,symbol wrt weird_seg

to load \c{ES:BX} with a different, but functionally equivalent,
pointer to the symbol \c{symbol}.

NASM supports far (inter-segment) calls and jumps by means of the
syntax \c{call segment:offset}, where \c{segment} and \c{offset}
both represent immediate values. So to call a far procedure, you
could code either of

\c         call    (seg procedure):procedure
\c         call    weird_seg:(procedure wrt weird_seg)

(The parentheses are included for clarity, to show the intended
parsing of the above instructions. They are not necessary in
practice.)

NASM supports the syntax \I\c{CALL FAR}\c{call far procedure} as a
synonym for the first of the above usages. \c{JMP} works identically
to \c{CALL} in these examples.

To declare a \i{far pointer} to a data item in a data segment, you
must code

\c         dw      symbol, seg symbol

NASM supports no convenient synonym for this, though you can always
invent one using the macro processor.


\H{strict} \i\c{STRICT}: Inhibiting Optimization

When assembling with the optimizer set to level 2 or higher (see
\k{opt-On}), NASM will use size specifiers (\c{BYTE}, \c{WORD},
\c{DWORD}, \c{QWORD}, or \c{TWORD}), but will give them the smallest
possible size. The keyword \c{STRICT} can be used to inhibit
optimization and force a particular operand to be emitted in the
specified size. For example, with the optimizer on, and in
\c{BITS 16} mode,

\c         push dword 33

is encoded in three bytes \c{66 6A 21}, whereas

\c         push strict dword 33

is encoded in six bytes, with a full dword immediate operand \c{66 68
21 00 00 00}.

With the optimizer off, the same code (six bytes) is generated whether
the \c{STRICT} keyword was used or not.


\H{crit} \i{Critical Expressions}

A limitation of NASM is that it is a \i{two-pass assembler}; unlike
TASM and others, it will always do exactly two \I{passes}\i{assembly
passes}. Therefore it is unable to cope with source files that are
complex enough to require three or more passes.

The first pass is used to determine the size of all the assembled
code and data, so that the second pass, when generating all
上一页 1 2 3 45
💿 文件大小 701 K
👤 上传用户 lujing200912345
📂 所属分类编译器/解释器
🏷️ 相关标签

#nasm #源代码 #修改 #调试
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -