prefixed with a `$' to indicate that it is intended to be read as an
identifier and not a reserved word; thus, if some other module you
are linking with defines a symbol called `eax', you can refer to
`$eax' in NASM code to distinguish the symbol from the register.
Maximum length of an identifier is 4095 characters.
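As a minimal sketch of the `$' prefix, assume some other module
exports a variable that happens to be called `eax' (a hypothetical
symbol, chosen only to match the example above), and that the output
format in use supports external symbols:

```nasm
        bits 32
        extern  $eax            ; hypothetical external symbol named `eax'

        mov     ebx,eax         ; copies the register EAX into EBX
        mov     ebx,$eax        ; loads the address of the symbol `eax'
        mov     ebx,[$eax]      ; loads a doubleword from the symbol `eax'
```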
The instruction field may contain any machine instruction: Pentium
and P6 instructions, FPU instructions, MMX instructions and even
undocumented instructions are all supported. The instruction may be
prefixed by `LOCK', `REP', `REPE'/`REPZ' or `REPNE'/`REPNZ', in the
usual way. Explicit address-size and operand-size prefixes `A16',
`A32', `O16' and `O32' are provided - one example of their use is
given in chapter 9. You can also use the name of a segment register
as an instruction prefix: coding `es mov [bx],ax' is equivalent to
coding `mov [es:bx],ax'. We recommend the latter syntax, since it is
consistent with other syntactic features of the language, but for
instructions such as `LODSB', which has no operands and yet can
require a segment override, there is no clean syntactic way to
proceed apart from `es lodsb'.
A prefix does not have to be attached to an instruction: prefixes
such as `CS', `A32', `LOCK' or `REPE' can appear on a line by
themselves, and NASM will just generate the prefix bytes.
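The prefix forms described above can be sketched as follows. This is
an illustrative fragment for 16-bit code, not a complete program; the
choice of registers is arbitrary.

```nasm
        bits 16

        mov     [es:bx],ax      ; recommended form of a segment override
        es mov  [bx],ax         ; prefix form, equivalent to the line above
        es lodsb                ; LODSB has no operands, so the override
                                ; has to be written as a prefix
        rep movsb               ; repeat prefix, used in the usual way
        lock                    ; a prefix on a line by itself: NASM
                                ; generates just the prefix byte
```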
In addition to actual machine instructions, NASM also supports a
number of pseudo-instructions, described in section 3.2.
Instruction operands may take a number of forms: they can be
registers, described simply by the register name (e.g. `ax', `bp',
`ebx', `cr0': NASM does not use the `gas'-style syntax in which
register names must be prefixed by a `%' sign), or they can be
effective addresses (see section 3.3), constants (section 3.4) or
expressions (section 3.5).
For x87 floating-point instructions, NASM accepts a wide range of
syntaxes: you can use two-operand forms like MASM supports, or you
can use NASM's native single-operand forms in most cases. For
example, you can code:
        fadd    st1             ; this sets st0 := st0 + st1
        fadd    st0,st1         ; so does this
        fadd    st1,st0         ; this sets st1 := st1 + st0
        fadd    to st1          ; so does this
Almost any x87 floating-point instruction that references memory
must use one of the prefixes `DWORD', `QWORD' or `TWORD' to indicate
what size of memory operand it refers to.
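A short sketch of the size prefixes on x87 memory operands follows.
The labels `f32val', `f64val' and `f80val' are hypothetical names
invented for this example.

```nasm
        bits 32

section .data
f32val: dd      1.5             ; single precision, 4 bytes
f64val: dq      2.5             ; double precision, 8 bytes
f80val: dt      3.5             ; extended precision, 10 bytes

section .text
        fld     dword [f32val]  ; push the single-precision value
        fadd    qword [f64val]  ; st0 := st0 + the double-precision value
        fld     tword [f80val]  ; push the extended-precision value
        fstp    qword [f64val]  ; pop st0, storing it as a double
```

Without the size keyword, NASM has no way to tell from `[f32val]'
alone which of the three operand sizes is meant.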
3.2 Pseudo-Instructions
Pseudo-instructions are things which, though not real x86 machine
instructions, are used in the instruction field anyway because
that's the most convenient place to put them. The current pseudo-
instructions are `DB', `DW', `DD', `DQ', `DT', `DO' and `DY'; their
uninitialized counterparts `RESB', `RESW', `RESD', `RESQ', `REST',
`RESO' and `RESY'; the `INCBIN' command, the `EQU' command, and the
`TIMES' prefix.
3.2.1 `DB' and friends: Declaring Initialized Data
`DB', `DW', `DD', `DQ', `DT', `DO' and `DY' are used, much as in
MASM, to declare initialized data in the output file. They can be
invoked in a wide range of ways:
        db      0x55                ; just the byte 0x55
        db      0x55,0x56,0x57      ; three bytes in succession
        db      'a',0x55            ; character constants are OK
        db      'hello',13,10,'$'   ; so are string constants
        dw      0x1234              ; 0x34 0x12
        dw      'a'                 ; 0x61 0x00 (it's just a number)
        dw      'ab'                ; 0x61 0x62 (character constant)
        dw      'abc'               ; 0x61 0x62 0x63 0x00 (string)
        dd      0x12345678          ; 0x78 0x56 0x34 0x12
        dd      1.234567e20         ; floating-point constant
        dq      0x123456789abcdef0  ; eight byte constant
        dq      1.234567e20         ; double-precision float
        dt      1.234567e20         ; extended-precision float
`DT', `DO' and `DY' do not accept numeric constants as operands.
3.2.2 `RESB' and friends: Declaring Uninitialized Data
`RESB', `RESW', `RESD', `RESQ', `REST', `RESO' and `RESY' are
designed to be used in the BSS section of a module: they declare
_uninitialized_ storage space. Each takes a single operand, which is
the number of bytes, words, doublewords or whatever to reserve. As
stated in section 2.2.7, NASM does not support the MASM/TASM syntax
of reserving uninitialized space by writing `DW ?' or similar
things: this is what it does instead. The operand to a `RESB'-type
pseudo-instruction is a _critical expression_: see section 3.8.
For example:
        buffer:         resb    64      ; reserve 64 bytes
        wordvar:        resw    1       ; reserve a word
        realarray       resq    10      ; array of ten reals
        ymmval:         resy    1       ; one YMM register
3.2.3 `INCBIN': Including External Binary Files
`INCBIN' is borrowed from the old Amiga assembler DevPac: it
includes a binary file verbatim into the output file. This can be
handy, for example, for including graphics and sound data directly
into a game executable file. It can be called in one of these three
ways:
        incbin  "file.dat"              ; include the whole file
        incbin  "file.dat",1024         ; skip the first 1024 bytes
        incbin  "file.dat",1024,512     ; skip the first 1024, and
                                        ; actually include at most 512
`INCBIN' is both a directive and a standard macro; the standard
macro version searches for the file in the include file search path
and adds the file to the dependency lists. This macro can be
overridden if desired.
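One common pattern, sketched here under the assumption that a file
called `sprite.dat' exists (the file name and both labels are
hypothetical), is to label the included data and let the assembler
work out its length with `EQU':

```nasm
section .data
sprite:     incbin  "sprite.dat"        ; hypothetical file, included whole
spritelen   equ     $-sprite            ; number of bytes just included

header:     incbin  "sprite.dat",0,16   ; skip nothing, include at most 16
```

Because `spritelen' is computed by the assembler, it stays correct if
the file is later replaced by one of a different size.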
3.2.4 `EQU': Defining Constants
`EQU' defines a symbol to a given constant value: when `EQU' is
used, the source line must contain a label. The action of `EQU' is
to define the given label name to the value of its (only) operand.
This definition is absolute, and cannot change later. So, for
example,
        message         db      'hello, world'
        msglen          equ     $-message
defines `msglen' to be the constant 12. `msglen' may not then be
redefined later. This is not a preprocessor definition either: the
value of `msglen' is evaluated _once_, using the value of `$' (see
section 3.5 for an explanation of `$') at the point of definition,
rather than being evaluated wherever it is referenced and using the
value of `$' at the point of reference. Note that the operand to an
`EQU' is also a critical expression (section 3.8).
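The difference between the two kinds of definition can be sketched
as follows. The `%define' directive belongs to the preprocessor and
is not described in this section; it is used here only for contrast,
and `msglen2' is a hypothetical name.

```nasm
message     db      'hello, world'
msglen      equ     $-message           ; evaluated once, here: 12
%define     msglen2 ($-message)         ; text, re-evaluated at each use

            db      0,0,0,0             ; four more bytes move `$' on to 16
            dw      msglen2             ; `$' is 16 on this line: word 16
            dw      msglen              ; still the word 12
```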
3.2.5 `TIMES': Repeating Instructions or Data
The `TIMES' prefix causes the instruction to be assembled multiple
times. This is partly present as NASM's equivalent of the `DUP'
syntax supported by MASM-compatible assemblers, in that you can code
        zerobuf:        times 64 db 0
or similar things; but `TIMES' is more versatile than that. The
argument to `TIMES' is not just a numeric constant, but a numeric
_expression_, so you can do things like
        buffer: db      'hello, world'
                times 64-$+buffer db ' '
which will store exactly enough spaces to make the total length of
`buffer' up to 64. Finally, `TIMES' can be applied to ordinary
instructions, so you can code trivial unrolled loops in it:
        times 100 movsb
Note that there is no effective difference between
`times 100 resb 1' and `resb 100', except that the latter will be
assembled about 100 times faster due to the internal structure of
the assembler.
The operand to `TIMES', like that of `EQU' and those of `RESB' and
friends, is a critical expression (section 3.8).
Note also that `TIMES' can't be applied to macros: the reason for
this is that `TIMES' is processed after the macro phase, which
allows the argument to `TIMES' to contain expressions such as
`64-$+buffer' as above. To repeat more than one line of code, or a
complex macro, use the preprocessor `%rep' directive.
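A minimal sketch of the two mechanisms side by side follows. The
label `start' and the 512-byte target size are assumptions made for
the example: `%rep' repeats a group of several lines, and `TIMES'
then pads the output to a fixed length.

```nasm
        bits 16

start:
; Preprocessor repeat: the three lines below are emitted four times.
%rep 4
        lodsb
        inc     al
        stosb
%endrep

; TIMES repeats a single instruction: pad with zero bytes until the
; code following `start' is exactly 512 bytes long.
        times 512-($-start) db 0
```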
3.3 Effective Addresses
An effective address is any operand to an instruction which
references memory. Effective addresses, in NASM, have a very simple
syntax: they consist of an expression evaluating to the desired
address, enclosed in square brackets. For example:
        wordvar dw      123
                mov     ax,[wordvar]
                mov     ax,[wordvar+1]
                mov     ax,[es:wordvar+bx]
Anything not conforming to this simple system is not a valid memory
reference in NASM, for example `es:wordvar[bx]'.
More complicated effective addresses, such as those involving more
than one register, work in exactly the same way:
        mov     eax,[ebx*2+ecx+offset]
        mov     ax,[bp+di+8]
NASM is capable of doing algebra on these effective addresses, so
that things which don't necessarily _look_ legal are perfectly all
right:
        mov     eax,[ebx*5]             ; assembles as [ebx*4+ebx]
        mov     eax,[label1*2-label2]   ; ie [label1+(label1-label2)]
Some forms of effective address have more than one assembled form;
in most such cases NASM will generate the smallest form it can. For
example, there are distinct assembled forms for the 32-bit effective
addresses `[eax*2+0]' and `[eax+eax]', and NASM will generally
generate the latter on the grounds that the former requires four
bytes to store a zero offset.
NASM has a hinting mechanism which will cause `[eax+ebx]' and
`[ebx+eax]' to generate different opcodes; this is occasionally
useful because `[esi+ebp]' and `[ebp+esi]' have different default
segment registers.
However, you can force NASM to generate an effective address in a
particular form by the use of the keywords `BYTE', `WORD', `DWORD'
and `NOSPLIT'. If you need `[eax+3]' to be assembled using a double-
word offset field instead of the one byte NASM will normally
generate, you can code `[dword eax+3]'. Similarly, you can force
NASM to use a byte offset for a small value which it hasn't seen on
the first pass (see section 3.8 for an example of such a code
fragment) by using `[byte eax+offset]'. As special cases,
`[byte eax]' will code `[eax+0]' with a byte offset of zero, and
`[dword eax]' will code it with a double-word offset of zero. The
normal form, `[eax]', will be coded with no offset field.
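The offset-size keywords described above can be compared directly in
a short 32-bit fragment. The choice of `ebx' as the destination
register is arbitrary.

```nasm
        bits 32

        mov     ebx,[eax]               ; normal form: no offset field
        mov     ebx,[byte eax]          ; byte offset of zero
        mov     ebx,[dword eax]         ; double-word offset of zero
        mov     ebx,[eax+3]             ; NASM chooses a byte offset
        mov     ebx,[dword eax+3]       ; forced double-word offset
```

Assembling a fragment like this with a listing file is a simple way
to see how the length of each instruction changes with the keyword.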
The form described in the previous paragraph is also useful if you
are trying to access data in a 32-bit segment from within 16-bit
code. For more information on this see the section on mixed-size
addressing (section 9.2). In particular, if you need to access data
at a known offset that is too large to fit in a 16-bit value, you
must specify that it is a dword offset; otherwise NASM will discard
the high word of the offset.
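As a hedged sketch of that situation, assume `fs' selects a 32-bit
data segment and that `bigoffset' (a hypothetical constant) does not
fit in 16 bits:

```nasm
        bits 16

bigoffset equ   0x12345                 ; hypothetical offset, over 16 bits

        mov     ax,[dword fs:bigoffset] ; 32-bit offset field: the whole
                                        ; offset is kept
```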
Similarly, NASM will split `[eax*2]' into `[eax+eax]' because that
allows the offset field to be absent and space to be saved; in fact,
it will also split `[eax*2+offset]' into `[eax+eax+offset]'. You can
combat this behaviour by the use of the `NOSPLIT' keyword:
`[nosplit eax*2]' will force `[eax*2+0]' to be generated literally.
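The effect of `NOSPLIT' can be sketched with two otherwise identical
instructions. The comments restate the behaviour described above;
the destination register is arbitrary.

```nasm
        bits 32

        mov     ebx,[eax*2]             ; split: assembled as [eax+eax]
        mov     ebx,[nosplit eax*2]     ; kept literal: [eax*2+0]
```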
In 64-bit mode, NASM will by default generate absolute addresses.
The `REL' keyword makes it produce `RIP'-relative addresses. Since
this is frequently the desired behaviour, it can be made the default
with the `DEFAULT' directive (section 5.2). The keyword `ABS'
overrides `REL'.
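A minimal sketch of these keywords in 64-bit code follows. The
variable `counter' is a hypothetical name, and `DEFAULT REL' is used
as the way of making `RIP'-relative addressing the default.

```nasm
        bits 64
        default rel                     ; make RIP-relative the default

section .data
counter: dq     0                       ; hypothetical variable

section .text
        mov     rax,[counter]           ; RIP-relative, from DEFAULT REL
        mov     rax,[abs counter]       ; `ABS' overrides it: absolute
        mov     rax,[rel counter]       ; explicitly RIP-relative
```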
3.4 Constants
NASM understands four different types of constant: numeric,
character, string and floating-point.
3.4.1 Numeric Constants
A numeric constant is simply a number. NASM allows you to specify
numbers in a variety of number bases, in a variety of ways: you can
suffix `H' for hex, `Q' or `O' for octal, and `B' for binary, or you
can prefix `0x' for hex in the style of C, or you can prefix `$' for
hex in the style of Borland Pascal. Note, though, that the `$'
prefix does double duty as a prefix on identifiers (see section
3.1), so a hex number prefixed with a `$' sign must have a digit
after the `$' rather than a letter.
Numeric constants can have underscores (`_') interspersed to break
up long strings.
Some examples:
        mov     ax,100          ; decimal
        mov     ax,0a2h         ; hex
        mov     ax,$0a2         ; hex again: the 0 is required
        mov     ax,0xa2         ; hex yet again
        mov     ax,777q         ; octal
        mov     ax,777o         ; octal again
        mov     ax,10010011b    ; binary
        mov     ax,1001_0011b   ; same binary constant
3.4.2 Character Strings
A character string consists of up to eight characters enclosed in
either single quotes (`'...''), double quotes (`"..."') or
backquotes (``...`'). Single or double quotes are equivalent to NASM