\b \i\c{orphan-labels} covers warnings about source lines which
contain no instruction but define a label without a trailing colon.
NASM does not warn about this somewhat obscure condition by default;
see \k{syntax} for an example of why you might want it to.
\b \i\c{number-overflow} covers warnings about numeric constants which
don't fit in 32 bits (for example, it's easy to type one too many Fs
and produce \c{0x7ffffffff} by mistake). This warning class is
enabled by default.
\S{nasmenv} The \c{NASM} \i{Environment} Variable
If you define an environment variable called \c{NASM}, the program
will interpret it as a list of extra command-line options, which are
processed before the real command line. You can use this to define
standard search directories for include files, by putting \c{-i}
options in the \c{NASM} variable.
The value of the variable is split up at white space, so that the
value \c{-s -ic:\\nasmlib} will be treated as two separate options.
However, that means that the value \c{-dNAME="my name"} won't do
what you might want, because it will be split at the space and the
NASM command-line processing will get confused by the two
nonsensical words \c{-dNAME="my} and \c{name"}.
To get round this, NASM provides a feature whereby, if you begin the
\c{NASM} environment variable with some character that isn't a minus
sign, then NASM will treat this character as the \i{separator
character} for options. So setting the \c{NASM} variable to the
value \c{!-s!-ic:\\nasmlib} is equivalent to setting it to \c{-s
-ic:\\nasmlib}, but \c{!-dNAME="my name"} will work.
\H{qstart} \i{Quick Start} for \i{MASM} Users
If you're used to writing programs with MASM, or with \i{TASM} in
MASM-compatible (non-Ideal) mode, or with \i\c{a86}, this section
attempts to outline the major differences between MASM's syntax and
NASM's. If you're not already used to MASM, it's probably worth
skipping this section.
\S{qscs} NASM Is \I{case sensitivity}Case-Sensitive
One simple difference is that NASM is case-sensitive. It makes a
difference whether you call your label \c{foo}, \c{Foo} or \c{FOO}.
If you're assembling to DOS or OS/2 \c{.OBJ} files, you can invoke
the \i\c{UPPERCASE} directive (documented in \k{objfmt}) to ensure
that all symbols exported to other code modules are forced to be
upper case; but even then, \e{within} a single module, NASM will
distinguish between labels differing only in case.
\S{qsbrackets} NASM Requires \i{Square Brackets} For \i{Memory References}
NASM was designed with simplicity of syntax in mind. One of the
\i{design goals} of NASM is that it should be possible, as far as is
practical, for the user to look at a single line of NASM code
and tell what opcode is generated by it. You can't do this in MASM:
if you declare, for example,
\c foo equ 1
\c bar dw 2
then the two lines of code
\c mov ax,foo
\c mov ax,bar
generate completely different opcodes, despite having
identical-looking syntaxes.
NASM avoids this undesirable situation by having a much simpler
syntax for memory references. The rule is simply that any access to
the \e{contents} of a memory location requires square brackets
around the address, and any access to the \e{address} of a variable
doesn't. So an instruction of the form \c{mov ax,foo} will
\e{always} refer to a compile-time constant, whether it's an \c{EQU}
or the address of a variable; and to access the \e{contents} of the
variable \c{bar}, you must code \c{mov ax,[bar]}.
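As a minimal sketch of this rule, assuming 16-bit code and reusing the
illustrative names \c{foo} and \c{bar} from above, the NASM forms
might be written like this:
\c foo     equ 1
\c bar     dw 2
\c
\c         mov ax,foo          ; loads the constant 1
\c         mov ax,bar          ; loads the address of bar
\c         mov ax,[bar]        ; loads the word stored at bar (here, 2)
Only the bracketed form touches memory; the first two both assemble
to a move of an immediate value.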
This also means that NASM has no need for MASM's \i\c{OFFSET}
keyword, since the MASM code \c{mov ax,offset bar} means exactly the
same thing as NASM's \c{mov ax,bar}. If you're trying to get
large amounts of MASM code to assemble sensibly under NASM, you
can always code \c{%idefine offset} to make the preprocessor treat
the \c{OFFSET} keyword as a no-op.
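A short sketch of that workaround, again using the illustrative
variable \c{bar}, might look like this:
\c %idefine offset
\c
\c bar     dw 2
\c
\c         mov ax,offset bar   ; preprocessed to: mov ax,bar
\c         mov ax,OFFSET bar   ; %idefine ignores case, so this works too
Because the macro expands to nothing, the word simply disappears
before the assembler proper sees the line.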
This issue is even more confusing in \i\c{a86}, where declaring a
label with a trailing colon defines it to be a `label' as opposed to
a `variable' and causes \c{a86} to adopt NASM-style semantics; so in
\c{a86}, \c{mov ax,var} has different behaviour depending on whether
\c{var} was declared as \c{var: dw 0} (a label) or \c{var dw 0} (a
word-size variable). NASM is very simple by comparison:
\e{everything} is a label.
NASM, in the interests of simplicity, also does not support the
\i{hybrid syntaxes} supported by MASM and its clones, such as
\c{mov ax,table[bx]}, where a memory reference is denoted by one
portion outside square brackets and another portion inside. The
correct syntax for the above is \c{mov ax,[table+bx]}. Likewise,
\c{mov ax,es:[di]} is wrong and \c{mov ax,[es:di]} is right.
\S{qstypes} NASM Doesn't Store \i{Variable Types}
NASM, by design, chooses not to remember the types of variables you
declare. Whereas MASM will remember, on seeing \c{var dw 0}, that
you declared \c{var} as a word-size variable, and will then be able
to fill in the \i{ambiguity} in the size of the instruction \c{mov
var,2}, NASM will deliberately remember nothing about the symbol
\c{var} except where it begins, and so you must explicitly code
\c{mov word [var],2}.
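For illustration, assuming 16-bit code and a variable \c{var}
declared as below, the size has to be stated whenever no register
operand implies it:
\c var     dw 0
\c
\c         mov word [var],2    ; store 2 as a 16-bit word
\c         mov byte [var],2    ; store 2 as a single byte
\c         mov [var],ax        ; no size needed: ax implies a word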
For this reason, NASM doesn't support the \c{LODS}, \c{MOVS},
\c{STOS}, \c{SCAS}, \c{CMPS}, \c{INS}, or \c{OUTS} instructions,
but only supports the forms such as \c{LODSB}, \c{MOVSW}, and
\c{SCASD}, which explicitly specify the size of the components of
the strings being manipulated.
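As a sketch, in 16-bit code the explicitly sized forms might be used
like this:
\c         cld                 ; make string operations count upwards
\c         lodsb               ; load one byte from [ds:si] into al
\c         movsw               ; copy one word from [ds:si] to [es:di]
\c         scasd               ; compare eax with the doubleword at [es:di]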
\S{qsassume} NASM Doesn't \i\c{ASSUME}
As part of NASM's drive for simplicity, it also does not support the
\c{ASSUME} directive. NASM will not keep track of what values you
choose to put in your segment registers, and will never
\e{automatically} generate a \i{segment override} prefix.
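In practice this means any override has to be written out in the
effective address. A small sketch, assuming 16-bit code:
\c         mov ax,[bx]         ; uses the default segment, ds
\c         mov ax,[es:bx]      ; es override, written explicitly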
\S{qsmodel} NASM Doesn't Support \i{Memory Models}
NASM also does not have any directives to support different 16-bit
memory models. The programmer has to keep track of which functions
are supposed to be called with a \i{far call} and which with a
\i{near call}, and is responsible for using the correct form of
\c{RET} instruction in each (\c{RETN} or \c{RETF}; NASM accepts \c{RET}
itself as an alternate form for \c{RETN}). In addition, the
programmer is responsible for coding \c{CALL FAR} instructions where
necessary when calling \e{external} functions, and must also keep
track of which external variable definitions are far and which are
near.
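The following sketch shows how this bookkeeping might look. The
names \c{nearproc}, \c{farproc}, \c{caller} and \c{farfunc} are
hypothetical, and it assumes 16-bit code assembled to an output
format, such as \c{obj}, that can express far references to external
symbols:
\c extern  farfunc             ; hypothetical far routine in another module
\c
\c nearproc:
\c         retn                ; near return; plain ret means the same
\c
\c farproc:
\c         retf                ; far return, for routines reached by a far call
\c
\c caller:
\c         call nearproc       ; near call within this segment
\c         call far farfunc    ; far call to the external routine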
\S{qsfpu} \i{Floating-Point} Differences
NASM uses different names to refer to floating-point registers from
MASM: where MASM would call them \c{ST(0)}, \c{ST(1)} and so on, and
\i\c{a86} would call them simply \c{0}, \c{1} and so on, NASM
chooses to call them \c{st0}, \c{st1} etc.
As of version 0.96, NASM now treats the instructions with
\i{`nowait'} forms in the same way as MASM-compatible assemblers.
The idiosyncratic treatment employed by 0.95 and earlier was based
on a misunderstanding by the authors.
\S{qsother} Other Differences
For historical reasons, NASM uses the keyword \i\c{TWORD} where MASM
and compatible assemblers use \i\c{TBYTE}.
NASM does not declare \i{uninitialised storage} in the same way as
MASM: where a MASM programmer might use \c{stack db 64 dup (?)},
NASM requires \c{stack resb 64}, intended to be read as `reserve 64
bytes'. For a limited amount of compatibility, since NASM treats
\c{?} as a valid character in symbol names, you can code \c{? equ 0}
and then writing \c{dw ?} will at least do something vaguely useful.
\I\c{RESB}\i\c{DUP} is still not a supported syntax, however.
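As a sketch of the NASM equivalents: \c{RESB} covers the
uninitialised case, and repeated \e{initialised} data can be written
with the \c{TIMES} prefix listed in \k{pseudop}. The label \c{zeros}
is hypothetical:
\c stack   resb 64             ; uninitialised: replaces `db 64 dup (?)'
\c zeros   times 64 db 0       ; initialised: 64 zero bytes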
In addition to all of this, macros and directives work completely
differently to MASM. See \k{preproc} and \k{directive} for further
details.
\C{lang} The NASM Language
\H{syntax} Layout of a NASM Source Line
As in most assemblers, each NASM source line contains (unless it
is a macro, a preprocessor directive or an assembler directive: see
\k{preproc} and \k{directive}) some combination of the four fields
\c label: instruction operands ; comment
As usual, most of these fields are optional; the presence or absence
of any combination of a label, an instruction and a comment is allowed.
Of course, the operand field is either required or forbidden by the
presence and nature of the instruction field.
NASM places no restrictions on white space within a line: labels may
have white space before them, or instructions may have no space
before them, or anything. The \i{colon} after a label is also
optional. (Note that this means that if you intend to code \c{lodsb}
alone on a line, and type \c{lodab} by accident, then that's still a
valid source line which does nothing but define a label. Running
NASM with the command-line option
\I{orphan-labels}\c{-w+orphan-labels} will cause it to warn you if
you define a label alone on a line without a \i{trailing colon}.)
\i{Valid characters} in labels are letters, numbers, \c{_}, \c{$},
\c{#}, \c{@}, \c{~}, \c{.}, and \c{?}. The only characters which may
be used as the \e{first} character of an identifier are letters,
\c{.} (with special meaning: see \k{locallab}), \c{_} and \c{?}.
An identifier may also be prefixed with a \I{$prefix}\c{$} to
indicate that it is intended to be read as an identifier and not a
reserved word; thus, if some other module you are linking with
defines a symbol called \c{eax}, you can refer to \c{$eax} in NASM
code to distinguish the symbol from the register.
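A small sketch of the \c{$} prefix, assuming 16-bit code and a
hypothetical external symbol that really is called \c{eax}:
\c extern  $eax                ; the symbol eax from another module
\c
\c         mov ax,[$eax]       ; load from the symbol called eax
\c         mov [bx],eax        ; plain eax is still the register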
The instruction field may contain any machine instruction: Pentium
and P6 instructions, FPU instructions, MMX instructions and even
undocumented instructions are all supported. The instruction may be
prefixed by \c{LOCK}, \c{REP}, \c{REPE}/\c{REPZ} or
\c{REPNE}/\c{REPNZ}, in the usual way. Explicit \I{address-size
prefixes}address-size and \i{operand-size prefixes} \c{A16},
\c{A32}, \c{O16} and \c{O32} are provided - one example of their use
is given in \k{mixsize}. You can also use the name of a \I{segment
override}segment register as an instruction prefix: coding
\c{es mov [bx],ax} is equivalent to coding \c{mov [es:bx],ax}. We
recommend the latter syntax, since it is consistent with other
syntactic features of the language, but for instructions such as
\c{LODSB}, which has no operands and yet can require a segment
override, there is no clean syntactic way to proceed apart from
\c{es lodsb}.
A prefix does not have to be attached to an instruction: prefixes such
as \c{CS}, \c{A32}, \c{LOCK} or \c{REPE} can appear on a line by
themselves, and NASM will just generate the prefix bytes.
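As an illustrative sketch in 16-bit code, prefixes might appear in
any of the following ways:
\c         rep movsb           ; repeat movsb, cx times
\c         lock inc word [bx]  ; locked read-modify-write
\c         es lodsb            ; lodsb reading from [es:si], not [ds:si]
\c         mov [es:bx],ax      ; preferred form when there is an operand
\c
\c         repe                ; a prefix alone: just the prefix byte
\c         cmpsb               ; the instruction it applies to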
In addition to actual machine instructions, NASM also supports a
number of pseudo-instructions, described in \k{pseudop}.
Instruction \i{operands} may take a number of forms: they can be
registers, described simply by the register name (e.g. \c{ax},
\c{bp}, \c{ebx}, \c{cr0}: NASM does not use the \c{gas}-style
syntax in which register names must be prefixed by a \c{%} sign), or
they can be \i{effective addresses} (see \k{effaddr}), constants
(\k{const}) or expressions (\k{expr}).
For \i{floating-point} instructions, NASM accepts a wide range of
syntaxes: you can use the two-operand forms that MASM supports, or you
can use NASM's native single-operand forms in most cases. Details of
all forms of each supported instruction are given in
\k{iref}. For example, you can code:
\c fadd st1 ; this sets st0 := st0 + st1
\c fadd st0,st1 ; so does this
\c
\c fadd st1,st0 ; this sets st1 := st1 + st0
\c fadd to st1 ; so does this
Almost any floating-point instruction that references memory must
use one of the prefixes \i\c{DWORD}, \i\c{QWORD} or \i\c{TWORD} to
indicate what size of \i{memory operand} it refers to.
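For example, assuming 16-bit code with \c{bx} pointing at suitably
laid-out data, the size prefixes might be used like this:
\c         fld dword [bx]      ; load a 32-bit single-precision value
\c         fadd qword [bx+4]   ; add a 64-bit double-precision value
\c         fstp tword [bx+12]  ; store an 80-bit extended value and pop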
\H{pseudop} \i{Pseudo-Instructions}
Pseudo-instructions are things which, though not real x86 machine
instructions, are used in the instruction field anyway because
that's the most convenient place to put them. The current
pseudo-instructions are \i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ} and
\i\c{DT}, their \i{uninitialised} counterparts \i\c{RESB},
\i\c{RESW}, \i\c{RESD}, \i\c{RESQ} and \i\c{REST}, the \i\c{INCBIN}
command, the \i\c{EQU} command, and the \i\c{TIMES} prefix.
\S{db} \c{DB} and friends: Declaring Initialised Data
\i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ} and \i\c{DT} are used, much
as in MASM, to declare initialised data in the output file. They can
be invoked in a wide range of ways:
\I{floating-point}\I{character constant}\I{string constant}
\c db 0x55 ; just the byte 0x55
\c db 0x55,0x56,0x57 ; three bytes in succession
\c db 'a',0x55 ; character constants are OK
\c db 'hello',13,10,'$' ; so are string constants
\c dw 0x1234 ; 0x34 0x12
\c dw 'a' ; 0x61 0x00 (it's just a number)
\c dw 'ab' ; 0x61 0x62 (character constant)
\c dw 'abc' ; 0x61 0x62 0x63 0x00 (string)
\c dd 0x12345678 ; 0x78 0x56 0x34 0x12
\c dd 1.234567e20 ; floating-point constant
\c dq 1.234567e20 ; double-precision float
\c dt 1.234567e20 ; extended-precision float
\c{DQ} and \c{DT} do not accept \i{numeric constants} or string
constants as operands.
\S{resb} \c{RESB} and friends: Declaring \i{Uninitialised} Data
\i\c{RESB}, \i\c{RESW}, \i\c{RESD}, \i\c{RESQ} and \i\c{REST} are
designed to be used in the BSS section of a module: they declare
\e{uninitialised} storage space. Each takes a single operand, which
is the number of bytes, words, doublewords or whatever to reserve.
As stated in \k{qsother}, NASM does not support the MASM/TASM syntax
of reserving uninitialised space by writing \I\c{?}\c{DW ?} or
similar things: this is what it does instead. The operand to a
\c{RESB}-type pseudo-instruction is a \i\e{critical expression}: see
\k{crit}.
For example:
\c buffer: resb 64 ; reserve 64 bytes
\c wordvar: resw 1 ; reserve a word
\c realarray resq 10 ; array of ten reals
\S{incbin} \i\c{INCBIN}: Including External \i{Binary Files}
\c{INCBIN} is borrowed from the old Amiga assembler \i{DevPac}: it
includes a binary file verbatim into the output file. This can be
handy for (for example) including \i{graphics} and \i{sound} data
directly into a game executable file. It can be called in one of
these three ways:
\c incbin "file.dat" ; include the whole file
\c incbin "file.dat",1024 ; skip the first 1024 bytes
\c incbin "file.dat",1024,512 ; skip the first 1024, and