(except of course that surrounding the constant with single quotes
allows double quotes to appear within it and vice versa); the
contents of those are represented verbatim. Strings enclosed in
backquotes support C-style `\'-escapes for special characters.
The following escape sequences are recognized by backquoted strings:
        \'          single quote (')
        \"          double quote (")
        \`          backquote (`)
        \\          backslash (\)
        \?          question mark (?)
        \a          BEL (ASCII 7)
        \b          BS  (ASCII 8)
        \t          TAB (ASCII 9)
        \n          LF  (ASCII 10)
        \v          VT  (ASCII 11)
        \f          FF  (ASCII 12)
        \r          CR  (ASCII 13)
        \e          ESC (ASCII 27)
        \377        Up to 3 octal digits - literal byte
        \xFF        Up to 2 hexadecimal digits - literal byte
        \u1234      4 hexadecimal digits - Unicode character
        \U12345678  8 hexadecimal digits - Unicode character
All other escape sequences are reserved. Note that `\0', meaning a
`NUL' character (ASCII 0), is a special case of the octal escape
sequence.
Unicode characters specified with `\u' or `\U' are converted to
UTF-8. For example, the following lines are all equivalent:
        db `\u263a`            ; UTF-8 smiley face
        db `\xe2\x98\xba`      ; UTF-8 smiley face
        db 0E2h, 098h, 0BAh    ; UTF-8 smiley face
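The equivalence can be checked outside NASM. The following is a minimal
Python sketch, used here only as a calculator; the helper name
`utf8_bytes' is made up for this illustration and is not part of NASM.
It encodes the code point U+263A as UTF-8 and prints the bytes in the
same notation as the last `db' line:

```python
def utf8_bytes(code_point):
    """Return the UTF-8 encoding of a Unicode code point as a list of byte values."""
    return list(chr(code_point).encode("utf-8"))


# U+263A is the code point written as `\u263a` in a backquoted string.
smiley = utf8_bytes(0x263A)
print(", ".join(f"0{b:02X}h" for b in smiley))   # 0E2h, 098h, 0BAh
```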
3.4.3 Character Constants
A character constant consists of a string up to eight bytes long,
used in an expression context. It is treated as if it were an
integer.
A character constant with more than one byte will be arranged with
little-endian order in mind: if you code
        mov eax,'abcd'
then the constant generated is not `0x61626364', but `0x64636261',
so that if you were then to store the value into memory, it would
read `abcd' rather than `dcba'. This is also the sense of character
constants understood by the Pentium's `CPUID' instruction.
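The byte ordering can be modelled in a few lines of Python. This is
only a sketch of the rule described above, not NASM's own code, and the
function name `char_constant' is hypothetical. The first character of
the constant becomes the least significant byte of the integer:

```python
def char_constant(text):
    """Model a character constant: the first character becomes the lowest byte."""
    data = text.encode("ascii")
    if len(data) > 8:
        raise ValueError("a character constant is at most eight bytes long")
    return int.from_bytes(data, "little")


value = char_constant("abcd")
print(hex(value))                    # 0x64636261
print(value.to_bytes(4, "little"))   # b'abcd', the order seen in memory
```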
3.4.4 String Constants
String constants are character strings used in the context of some
pseudo-instructions, namely the `DB' family and `INCBIN' (where the
string represents a filename). They are also used in certain
preprocessor directives.
A string constant looks like a character constant, only longer. It
is treated as a concatenation of character constants, each of the
maximum size for the pseudo-instruction in use. So the following are
equivalent:
        db 'hello'               ; string constant
        db 'h','e','l','l','o'   ; equivalent character constants
And the following are also equivalent:
        dd 'ninechars'           ; doubleword string constant
        dd 'nine','char','s'     ; becomes three doublewords
        db 'ninechars',0,0,0     ; and really looks like this
Note that when used in a string-supporting context, quoted strings
are treated as string constants even if they are short enough to
be a character constant, because otherwise `db 'ab'' would have the
same effect as `db 'a'', which would be silly. Similarly, three-
character or four-character constants are treated as strings when
they are operands to `DW', and so forth.
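The padding implied by the `dd 'ninechars'' example can be sketched in
Python. This is an illustrative model only; the function `emit_string'
and its `unit_size' parameter are assumptions made for the example and
do not correspond to anything inside NASM. The string is laid down byte
by byte and then padded with zero bytes up to a multiple of the data
unit size:

```python
def emit_string(text, unit_size):
    """Model a string operand to DB/DW/DD: zero-pad to a multiple of unit_size bytes."""
    data = text.encode("ascii")
    padding = -len(data) % unit_size
    return data + bytes(padding)


print(emit_string("ninechars", 4))   # b'ninechars\x00\x00\x00'
print(emit_string("hello", 1))       # b'hello'
print(emit_string("ab", 2))          # b'ab'
```

Under this model a two-character string given to a word-sized unit
fills exactly one word, and a three-character string fills two.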
3.4.5 Floating-Point Constants
Floating-point constants are acceptable only as arguments to `DB',
`DW', `DD', `DQ', `DT', and `DO', or as arguments to the special
operators `__float8__', `__float16__', `__float32__', `__float64__',
`__float80m__', `__float80e__', `__float128l__', and
`__float128h__'.
Floating-point constants are expressed in the traditional form:
digits, then a period, then optionally more digits, then optionally
an `E' followed by an exponent. The period is mandatory, so that
NASM can distinguish between `dd 1', which declares an integer
constant, and `dd 1.0' which declares a floating-point constant.
NASM also supports C99-style hexadecimal floating-point: `0x',
hexadecimal digits, period, optionally more hexadecimal digits, then
optionally a `P' followed by a _binary_ (not hexadecimal) exponent
in decimal notation.
Underscores to break up groups of digits are permitted in floating-
point constants as well.
Some examples:
        db -0.2                      ; "Quarter precision"
        dw -0.5                      ; IEEE 754r/SSE5 half precision
        dd 1.2                       ; an easy one
        dd 1.222_222_222             ; underscores are permitted
        dd 0x1p+2                    ; 1.0x2^2 = 4.0
        dq 0x1p+32                   ; 1.0x2^32 = 4 294 967 296.0
        dq 1.e10                     ; 10 000 000 000.0
        dq 1.e+10                    ; synonymous with 1.e10
        dq 1.e-10                    ; 0.000 000 000 1
        dt 3.141592653589793238462   ; pi
        do 1.e+4000                  ; IEEE 754r quad precision
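Some of these values can be cross-checked with a short Python sketch.
Python is used here only as a convenient calculator: its
`float.fromhex' accepts the same C99-style hexadecimal notation, and the
`struct' module can produce IEEE 754 half and single precision bit
patterns. The helper name `ieee_bits' is made up for this example, and
the bit patterns shown are those of the IEEE formats, not output
captured from NASM:

```python
import struct


def ieee_bits(value, fmt):
    """Return the IEEE 754 bit pattern of value as an integer.

    fmt is a struct format character: 'e' (half), 'f' (single), 'd' (double).
    """
    return int.from_bytes(struct.pack("<" + fmt, value), "little")


print(float.fromhex("0x1p+2"))     # 4.0
print(float.fromhex("0x1p+32"))    # 4294967296.0
print(hex(ieee_bits(-0.5, "e")))   # 0xb800
print(hex(ieee_bits(1.2, "f")))    # 0x3f99999a
```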
The 8-bit "quarter-precision" floating-point format is
sign:exponent:mantissa = 1:4:3 with an exponent bias of 7. This
appears to be the most frequently used 8-bit floating-point format,
although it is not covered by any formal standard. This is sometimes
called a "minifloat."
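To make the 1:4:3 layout concrete, here is a hedged Python sketch of a
decoder for such a byte. The field widths and the bias of 7 come from
the description above; the treatment of an all-zero exponent as
subnormal and of an all-ones exponent as infinity or NaN is an
ASSUMPTION borrowed from the IEEE 754 conventions, since the format has
no formal standard. The name `decode_minifloat' is hypothetical:

```python
def decode_minifloat(byte):
    """Decode an 8-bit float with 1 sign, 4 exponent and 3 mantissa bits, bias 7."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F:
        # Assumption: IEEE-style infinities and NaNs.
        return sign * float("inf") if mantissa == 0 else float("nan")
    if exponent == 0:
        # Assumption: IEEE-style subnormals (no implicit leading 1).
        return sign * (mantissa / 8) * 2.0 ** (1 - 7)
    return sign * (1 + mantissa / 8) * 2.0 ** (exponent - 7)


print(decode_minifloat(0x38))   # 1.0
print(decode_minifloat(0xA5))   # -0.203125
```

Under these assumptions -0.203125 is the representable value closest to
-0.2, which shows how coarse a three-bit mantissa is.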
The special operators are used to produce floating-point numbers in
other contexts. They produce the binary representation of a specific
floating-point number as an integer, and can be used anywhere integer
constants are used in an expression. `__float80m__' and
`__float80e__' produce the 64-bit mantissa and 16-bit exponent of an
80-bit floating-point number, and `__float128l__' and
`__float128h__' produce the lower and upper 64-bit halves of a 128-
bit floating-point number, respectively.
For example:
        mov rax,__float64__(3.141592653589793238462)
... would assign the binary representation of pi as a 64-bit
floating point number into `RAX'. This is exactly equivalent to:
        mov rax,0x400921fb54442d18
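The integer above can be reproduced independently. The following
Python sketch packs the same decimal literal as an IEEE 754 double and
reads the eight bytes back as an integer; the function name
`float64_bits' is invented for this illustration:

```python
import struct


def float64_bits(value):
    """Return the IEEE 754 double-precision bit pattern of value as an integer."""
    return int.from_bytes(struct.pack("<d", value), "little")


print(hex(float64_bits(3.141592653589793238462)))   # 0x400921fb54442d18
```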
NASM cannot do compile-time arithmetic on floating-point constants.
This is because NASM is designed to be portable - although it always
generates code to run on x86 processors, the assembler itself can
run on any system with an ANSI C compiler. Therefore, the assembler
cannot guarantee the presence of a floating-point unit capable of
handling the Intel number formats, and so for NASM to be able to do
floating arithmetic it would have to include its own complete set of
floating-point routines, which would significantly increase the size
of the assembler for very little benefit.
The special tokens `__Infinity__', `__QNaN__' (or `__NaN__') and
`__SNaN__' can be used to generate infinities, quiet NaNs, and
signalling NaNs, respectively. These are normally used as macros:
        %define Inf __Infinity__
        %define NaN __QNaN__
        dq +1.5, -Inf, NaN      ; Double-precision constants
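For reference, the IEEE 754 double-precision encodings of these special
values can be inspected with a small Python sketch. The constants below
are the standard encodings of the infinities plus one valid quiet NaN;
the exact NaN payload that NASM emits is not stated above, so the
`QUIET_NAN' pattern is an assumption chosen for illustration. All names
are hypothetical:

```python
import math
import struct

POS_INF = 0x7FF0000000000000     # exponent all ones, mantissa zero
NEG_INF = 0xFFF0000000000000     # same, with the sign bit set
QUIET_NAN = 0x7FF8000000000000   # assumption: one valid quiet NaN (top mantissa bit set)


def float64_from_bits(pattern):
    """Interpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", pattern.to_bytes(8, "little"))[0]


print(float64_from_bits(POS_INF))                 # inf
print(float64_from_bits(NEG_INF))                 # -inf
print(math.isnan(float64_from_bits(QUIET_NAN)))   # True
```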
3.5 Expressions
Expressions in NASM are similar in syntax to those in C. Expressions
are evaluated as 64-bit integers which are then adjusted to the
appropriate size.
NASM supports two special tokens in expressions, allowing
calculations to involve the current assembly position: the `$' and
`$$' tokens. `$' evaluates to the assembly position at the beginning
of the line containing the expression; so you can code an infinite
loop using `JMP $'. `$$' evaluates to the beginning of the current
section; so you can tell how far into the section you are by using
`($-$$)'.
The arithmetic operators provided by NASM are listed here, in
increasing order of precedence.
3.5.1 `|': Bitwise OR Operator
The `|' operator gives a bitwise OR, exactly as performed by the
`OR' machine instruction. Bitwise OR is the lowest-priority
arithmetic operator supported by NASM.
3.5.2 `^': Bitwise XOR Operator
`^' provides the bitwise XOR operation.
3.5.3 `&': Bitwise AND Operator
`&' provides the bitwise AND operation.
3.5.4 `<<' and `>>': Bit Shift Operators
`<<' gives a bit-shift to the left, just as it does in C. So `5<<3'
evaluates to 5 times 8, or 40. `>>' gives a bit-shift to the right;
in NASM, such a shift is _always_ unsigned, so that the bits shifted
in from the left-hand end are filled with zero rather than a sign-
extension of the previous highest bit.
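The difference from a sign-extending shift is easiest to see with a
negative operand. The Python sketch below models both shifts on 64-bit
values, the width in which the text above says expressions are
evaluated. The names `MASK64', `shl' and `shr' are made up for the
example, and shift counts of 64 or more are deliberately not modelled:

```python
MASK64 = (1 << 64) - 1


def shl(value, count):
    """Left shift, keeping only the low 64 bits of the result."""
    return (value << count) & MASK64


def shr(value, count):
    """Unsigned right shift on a 64-bit value: zeros enter from the left."""
    return (value & MASK64) >> count


print(shl(5, 3))         # 40
print(hex(shr(-8, 1)))   # 0x7ffffffffffffffc
print(-8 >> 1)           # -4, Python's own sign-extending shift, for contrast
```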
3.5.5 `+' and `-': Addition and Subtraction Operators
The `+' and `-' operators do perfectly ordinary addition and
subtraction.
3.5.6 `*', `/', `//', `%' and `%%': Multiplication and Division
`*' is the multiplication operator. `/' and `//' are both division
operators: `/' is unsigned division and `//' is signed division.
Similarly, `%' and `%%' provide unsigned and signed modulo operators
respectively.
NASM, like ANSI C, provides no guarantees about the sensible
operation of the signed modulo operator.
Since the `%' character is used extensively by the macro
preprocessor, you should ensure that both the signed and unsigned
modulo operators are followed by white space wherever they appear.
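The distinction between the signed and unsigned forms matters only when
an operand has its top bit set. The following Python sketch models the
operators on 64-bit values. It ASSUMES that signed division truncates
toward zero, as C99 specifies; the text above does not state the
rounding rule, and signed modulo is left out entirely because no
guarantee is given for it. All function names are hypothetical:

```python
MASK64 = (1 << 64) - 1


def to_signed(value):
    """Reinterpret the low 64 bits of value as a two's complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def unsigned_div(a, b):
    """Model of `/': both operands are taken as unsigned 64-bit values."""
    return (a & MASK64) // (b & MASK64)


def unsigned_mod(a, b):
    """Model of `%': unsigned remainder."""
    return (a & MASK64) % (b & MASK64)


def signed_div(a, b):
    """Model of `//', ASSUMING truncation toward zero."""
    a, b = to_signed(a), to_signed(b)
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


print(hex(unsigned_div(-7, 2)))   # 0x7ffffffffffffffc
print(signed_div(-7, 2))          # -3 under the truncation assumption
print(unsigned_mod(-7, 2))        # 1
```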
3.5.7 Unary Operators: `+', `-', `~', `!' and `SEG'
The highest-priority operators in NASM's expression grammar are
those which only apply to one argument. `-' negates its operand, `+'
does nothing (it's provided for symmetry with `-'), `~' computes the
one's complement of its operand, `!' is the logical negation
operator, and `SEG' provides the segment address of its operand
(explained in more detail in section 3.6).
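The arithmetic unary operators can be modelled on 64-bit values as in
the Python sketch below. `SEG' is omitted because it depends on the
output format rather than on arithmetic. The function names are
invented for the example, and the result of `!' is assumed to be 1 or
0, as in C:

```python
MASK64 = (1 << 64) - 1


def negate(value):
    """Model of unary `-': two's complement negation in 64 bits."""
    return (-value) & MASK64


def ones_complement(value):
    """Model of `~': invert every bit of the 64-bit value."""
    return ~value & MASK64


def logical_not(value):
    """Model of `!': 1 if the operand is zero, otherwise 0 (assumed C-like)."""
    return 1 if (value & MASK64) == 0 else 0


print(hex(negate(1)))            # 0xffffffffffffffff
print(hex(ones_complement(0)))   # 0xffffffffffffffff
print(logical_not(5))            # 0
print(logical_not(0))            # 1
```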
3.6 `SEG' and `WRT'
When writing large 16-bit programs, which must be split into
multiple segments, it is often necessary to be able to refer to the
segment part of the address of a symbol. NASM supports the `SEG'
operator to perform this function.
The `SEG' operator returns the _preferred_ segment base of a symbol,
defined as the segment base relative to which the offset of the
symbol makes sense. So the code
        mov ax,seg symbol
        mov es,ax
        mov bx,symbol
will load `ES:BX' with a valid pointer to the symbol `symbol'.
Things can be more complex than this: since 16-bit segments and
groups may overlap, you might occasionally want to refer to some
symbol using a different segment base from the preferred one. NASM
lets you do this, by the use of the `WRT' (With Reference To)
keyword. So you can do things like
        mov ax,weird_seg        ; weird_seg is a segment base
        mov es,ax
        mov bx,symbol wrt weird_seg
to load `ES:BX' with a different, but functionally equivalent,
pointer to the symbol `symbol'.
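Why the two pointers are functionally equivalent can be shown with a
little address arithmetic. The Python sketch below ASSUMES 16-bit real
mode, where the linear address is the segment value times 16 plus the
offset. The segment values and the symbol address are invented for the
example, and the functions `linear' and `offset_wrt' are hypothetical
helpers, not NASM features:

```python
def linear(segment, offset):
    """Real-mode address calculation (assumption): segment * 16 + offset."""
    return (segment << 4) + offset


def offset_wrt(linear_address, segment):
    """Offset of an address relative to a given segment base, if reachable."""
    offset = linear_address - (segment << 4)
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("address is not reachable from this segment base")
    return offset


symbol_address = 0x12345   # hypothetical linear address of `symbol'
preferred_seg = 0x1234     # hypothetical value of `seg symbol'
weird_seg = 0x1200         # hypothetical overlapping segment base

print(hex(offset_wrt(symbol_address, preferred_seg)))   # 0x5
print(hex(offset_wrt(symbol_address, weird_seg)))       # 0x345
print(linear(preferred_seg, 0x5) == linear(weird_seg, 0x345))   # True
```

Both segment:offset pairs resolve to the same linear address, which is
the sense in which the `WRT' form yields an equivalent pointer.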
NASM supports far (inter-segment) calls and jumps by means of the
syntax `call segment:offset', where `segment' and `offset' both
represent immediate values. So to call a far procedure, you could
code either of
        call (seg procedure):procedure
        call weird_seg:(procedure wrt weird_seg)
(The parentheses are included for clarity, to show the intended
parsing of the above instructions. They are not necessary in
practice.)
NASM supports the syntax `call far procedure' as a synonym for the
first of the above usages. `JMP' works identically to `CALL' in
these examples.
To declare a far pointer to a data item in a data segment, you must
code