📄 nasmdoc.src

📁 Open-source NASM assembler source code, very helpful for studying how compilers work
that allows the offset field to be absent and space to be saved; in
fact, it will also split \c{[eax*2+offset]} into
\c{[eax+eax+offset]}. You can combat this behaviour by the use of
the \c{NOSPLIT} keyword: \c{[nosplit eax*2]} will force
\c{[eax*2+0]} to be generated literally.

\H{const} \i{Constants}

NASM understands four different types of constant: numeric,
character, string and floating-point.

\S{numconst} \i{Numeric Constants}

A numeric constant is simply a number. NASM allows you to specify
numbers in a variety of number bases, in a variety of ways: you can
suffix \c{H}, \c{Q} or \c{O}, and \c{B} for \i{hex}, \i{octal} and \i{binary},
or you can prefix \c{0x} for hex in the style of C, or you can
prefix \c{$} for hex in the style of Borland Pascal. Note, though,
that the \I{$, prefix}\c{$} prefix does double duty as a prefix on
identifiers (see \k{syntax}), so a hex number prefixed with a \c{$}
sign must have a digit after the \c{$} rather than a letter.

Some examples:

\c         mov     ax,100          ; decimal
\c         mov     ax,0a2h         ; hex
\c         mov     ax,$0a2         ; hex again: the 0 is required
\c         mov     ax,0xa2         ; hex yet again
\c         mov     ax,777q         ; octal
\c         mov     ax,777o         ; octal again
\c         mov     ax,10010011b    ; binary

\S{chrconst} \i{Character Constants}

A character constant consists of up to four characters enclosed in
either single or double quotes. The type of quote makes no
difference to NASM, except of course that surrounding the constant
with single quotes allows double quotes to appear within it and vice
versa.

A character constant with more than one character will be arranged
with \i{little-endian} order in mind: if you code

\c           mov eax,'abcd'

then the constant generated is not \c{0x61626364}, but
\c{0x64636261}, so that if you were then to store the value into
memory, it would read \c{abcd} rather than \c{dcba}.
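
The byte ordering described above can be checked with a small model. The following is a minimal Python sketch, not NASM code and not NASM's implementation; the helper name `char_constant` is hypothetical and exists only for this illustration. It packs up to four ASCII characters so that the first character lands in the lowest-order byte.

```python
def char_constant(text: str) -> int:
    """Model a NASM-style character constant (hypothetical helper).

    The first character goes into the lowest-order byte, so storing the
    resulting integer to little-endian memory reproduces the text order.
    """
    data = text.encode("ascii")
    if len(data) > 4:
        raise ValueError("a character constant holds at most four characters")
    return int.from_bytes(data, "little")


value = char_constant("abcd")
print(hex(value))                   # 0x64636261
print(value.to_bytes(4, "little"))  # b'abcd'
```

Writing the value back out in little-endian order returns the original text, which is the property the `mov eax,'abcd'` example relies on.
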
This is also the sense of character constants understood by the
Pentium's \i\c{CPUID} instruction (see \k{insCPUID}).

\S{strconst} String Constants

String constants are only acceptable to some pseudo-instructions,
namely the \I\c{DW}\I\c{DD}\I\c{DQ}\I\c{DT}\i\c{DB} family and
\i\c{INCBIN}.

A string constant looks like a character constant, only longer. It
is treated as a concatenation of maximum-size character constants
for the conditions. So the following are equivalent:

\c       db    'hello'               ; string constant
\c       db    'h','e','l','l','o'   ; equivalent character constants

And the following are also equivalent:

\c       dd    'ninechars'           ; doubleword string constant
\c       dd    'nine','char','s'     ; becomes three doublewords
\c       db    'ninechars',0,0,0     ; and really looks like this

Note that when used as an operand to \c{db}, a constant like
\c{'ab'} is treated as a string constant despite being short enough
to be a character constant, because otherwise \c{db 'ab'} would have
the same effect as \c{db 'a'}, which would be silly. Similarly,
three-character or four-character constants are treated as strings
when they are operands to \c{dw}.

\S{fltconst} \I{floating-point, constants}Floating-Point Constants

\i{Floating-point} constants are acceptable only as arguments to
\i\c{DD}, \i\c{DQ} and \i\c{DT}. They are expressed in the
traditional form: digits, then a period, then optionally more
digits, then optionally an \c{E} followed by an exponent.
The period is mandatory, so that NASM can distinguish between
\c{dd 1}, which declares an integer constant, and \c{dd 1.0} which
declares a floating-point constant.

Some examples:

\c       dd    1.2                     ; an easy one
\c       dq    1.e10                   ; 10,000,000,000
\c       dq    1.e+10                  ; synonymous with 1.e10
\c       dq    1.e-10                  ; 0.000 000 000 1
\c       dt    3.141592653589793238462 ; pi

NASM cannot do compile-time arithmetic on floating-point constants.
This is because NASM is designed to be portable - although it always
generates code to run on x86 processors, the assembler itself can
run on any system with an ANSI C compiler. Therefore, the assembler
cannot guarantee the presence of a floating-point unit capable of
handling the \i{Intel number formats}, and so for NASM to be able to
do floating arithmetic it would have to include its own complete set
of floating-point routines, which would significantly increase the
size of the assembler for very little benefit.

\H{expr} \i{Expressions}

Expressions in NASM are similar in syntax to those in C.

NASM does not guarantee the size of the integers used to evaluate
expressions at compile time: since NASM can compile and run on
64-bit systems quite happily, don't assume that expressions are
evaluated in 32-bit registers and so try to make deliberate use of
\i{integer overflow}. It might not always work. The only thing NASM
will guarantee is what's guaranteed by ANSI C: you always have \e{at
least} 32 bits to work in.

NASM supports two special tokens in expressions, allowing
calculations to involve the current assembly position: the
\I{$, here}\c{$} and \i\c{$$} tokens. \c{$} evaluates to the assembly
position at the beginning of the line containing the expression; so
you can code an \i{infinite loop} using \c{JMP $}.
\c{$$} evaluates to the beginning of the current section; so you can
tell how far into the section you are by using \c{($-$$)}.

The arithmetic \i{operators} provided by NASM are listed here, in
increasing order of \i{precedence}.

\S{expor} \i\c{|}: \i{Bitwise OR} Operator

The \c{|} operator gives a bitwise OR, exactly as performed by the
\c{OR} machine instruction. Bitwise OR is the lowest-priority
arithmetic operator supported by NASM.

\S{expxor} \i\c{^}: \i{Bitwise XOR} Operator

\c{^} provides the bitwise XOR operation.

\S{expand} \i\c{&}: \i{Bitwise AND} Operator

\c{&} provides the bitwise AND operation.

\S{expshift} \i\c{<<} and \i\c{>>}: \i{Bit Shift} Operators

\c{<<} gives a bit-shift to the left, just as it does in C. So \c{5<<3}
evaluates to 5 times 8, or 40. \c{>>} gives a bit-shift to the
right; in NASM, such a shift is \e{always} unsigned, so that
the bits shifted in from the left-hand end are filled with zero
rather than a sign-extension of the previous highest bit.

\S{expplmi} \I{+ opaddition}\c{+} and \I{- opsubtraction}\c{-}:
\i{Addition} and \i{Subtraction} Operators

The \c{+} and \c{-} operators do perfectly ordinary addition and
subtraction.

\S{expmul} \i\c{*}, \i\c{/}, \i\c{//}, \i\c{%} and \i\c{%%}:
\i{Multiplication} and \i{Division}

\c{*} is the multiplication operator. \c{/} and \c{//} are both
division operators: \c{/} is \i{unsigned division} and \c{//} is
\i{signed division}.
Similarly, \c{%} and \c{%%} provide \I{unsigned
modulo}\I{modulo operators}unsigned and
\i{signed modulo} operators respectively.

NASM, like ANSI C, provides no guarantees about the sensible
operation of the signed modulo operator.

Since the \c{%} character is used extensively by the macro
\i{preprocessor}, you should ensure that both the signed and unsigned
modulo operators are followed by white space wherever they appear.

\S{expmul} \i{Unary Operators}: \I{+ opunary}\c{+}, \I{- opunary}\c{-},
\i\c{~} and \i\c{SEG}

The highest-priority operators in NASM's expression grammar are
those which only apply to one argument. \c{-} negates its operand,
\c{+} does nothing (it's provided for symmetry with \c{-}), \c{~}
computes the \i{one's complement} of its operand, and \c{SEG}
provides the \i{segment address} of its operand (explained in more
detail in \k{segwrt}).

\H{segwrt} \i\c{SEG} and \i\c{WRT}

When writing large 16-bit programs, which must be split into
multiple \i{segments}, it is often necessary to be able to refer to
the \I{segment address}segment part of the address of a symbol. NASM
supports the \c{SEG} operator to perform this function.

The \c{SEG} operator returns the \i\e{preferred} segment base of a
symbol, defined as the segment base relative to which the offset of
the symbol makes sense. So the code

\c         mov     ax,seg symbol
\c         mov     es,ax
\c         mov     bx,symbol

will load \c{ES:BX} with a valid pointer to the symbol \c{symbol}.

Things can be more complex than this: since 16-bit segments and
\i{groups} may \I{overlapping segments}overlap, you might occasionally
want to refer to some symbol using a different segment base from the
preferred one. NASM lets you do this, by the use of the \c{WRT}
(With Reference To) keyword.
So you can do things like

\c         mov     ax,weird_seg        ; weird_seg is a segment base
\c         mov     es,ax
\c         mov     bx,symbol wrt weird_seg

to load \c{ES:BX} with a different, but functionally equivalent,
pointer to the symbol \c{symbol}.

NASM supports far (inter-segment) calls and jumps by means of the
syntax \c{call segment:offset}, where \c{segment} and \c{offset}
both represent immediate values. So to call a far procedure, you
could code either of

\c         call    (seg procedure):procedure
\c         call    weird_seg:(procedure wrt weird_seg)

(The parentheses are included for clarity, to show the intended
parsing of the above instructions. They are not necessary in
practice.)

NASM supports the syntax \I\c{CALL FAR}\c{call far procedure} as a
synonym for the first of the above usages. \c{JMP} works identically
to \c{CALL} in these examples.

To declare a \i{far pointer} to a data item in a data segment, you
must code

\c         dw      symbol, seg symbol

NASM supports no convenient synonym for this, though you can always
invent one using the macro processor.

\H{strict} \i\c{STRICT}: Inhibiting Optimization

When assembling with the optimizer set to level 2 or higher (see
\k{opt-On}), NASM will use size specifiers (\c{BYTE}, \c{WORD},
\c{DWORD}, \c{QWORD}, or \c{TWORD}), but will give them the smallest
possible size. The keyword \c{STRICT} can be used to inhibit
optimization and force a particular operand to be emitted in the
specified size. For example, with the optimizer on, and in
\c{BITS 16} mode,

\c         push dword 33

is encoded in three bytes \c{66 6A 21}, whereas

\c         push strict dword 33

is encoded in six bytes, with a full dword immediate operand \c{66 68
21 00 00 00}.

With the optimizer off, the same code (six bytes) is generated whether
the \c{STRICT} keyword was used or not.

\H{crit} \i{Critical Expressions}

A limitation of NASM is that it is a \i{two-pass assembler}; unlike
TASM and others, it will always do exactly two \I{passes}\i{assembly
passes}.
Therefore it is unable to cope with source files that are
complex enough to require three or more passes.

The first pass is used to determine the size of all the assembled
code and data, so that the second pass, when generating all the
code, knows all the symbol addresses the code refers to. So one
thing NASM can't handle is code whose size depends on the value of a
symbol declared after the code in question. For example,

\c         times (label-$) db 0
\c label:  db      'Where am I?'

The argument to \i\c{TIMES} in this case could equally legally
evaluate to anything at all; NASM will reject this example because
it cannot tell the size of the \c{TIMES} line when it first sees it.
It will just as firmly reject the slightly \I{paradox}paradoxical
code

\c         times (label-$+1) db 0
\c label:  db      'NOW where am I?'

in which \e{any} value for the \c{TIMES} argument is by definition
wrong!

NASM rejects these examples by means of a concept called a
\e{critical expression}, which is defined to be an expression whose
value is required to be computable in the first pass, and which must
therefore depend only on symbols defined before it. The argument to
the \c{TIMES} prefix is a critical expression; for the same reason,
the arguments to the \i\c{RESB} family of pseudo-instructions are
also critical expressions.

Critical expressions can crop up in other contexts as well: consider
the following code.

\c                 mov     ax,symbol1
\c symbol1         equ     symbol2
\c symbol2:

On the first pass, NASM cannot determine the value of \c{symbol1},
because \c{symbol1} is defined to be equal to \c{symbol2} which NASM
hasn't seen yet. On the second pass, therefore, when it encounters
the line \c{mov ax,symbol1}, it is unable to generate the code for
it because it still doesn't know the value of \c{symbol1}. On the
next line, it would see the \i\c{EQU} again and be able to determine
the value of \c{