(functions or data) they define are formed by prefixing an underscore to the name as it appears in the C program. So, for example, the function a C programmer thinks of as <code><nobr>printf</nobr></code> appears to an assembly language programmer as <code><nobr>_printf</nobr></code>. This means that in your assembly programs, you can define symbols without a leading underscore, and not have to worry about name clashes with C symbols.
<p>If you find the underscores inconvenient, you can define macros to replace the <code><nobr>GLOBAL</nobr></code> and <code><nobr>EXTERN</nobr></code> directives as follows:
<p><pre>
%macro  cglobal 1

  global  _%1
  %define %1 _%1

%endmacro

%macro  cextern 1

  extern  _%1
  %define %1 _%1

%endmacro
</pre>
<p>(These forms of the macros only take one argument at a time; a <code><nobr>%rep</nobr></code> construct could solve this.)
<p>If you then declare an external like this:
<p><pre>
cextern printf
</pre>
<p>then the macro will expand it as
<p><pre>
extern  _printf
%define printf _printf
</pre>
<p>Thereafter, you can reference <code><nobr>printf</nobr></code> as if it was a symbol, and the preprocessor will put the leading underscore on where necessary.
<p>The <code><nobr>cglobal</nobr></code> macro works similarly. You must use <code><nobr>cglobal</nobr></code> before defining the symbol in question, but you would have had to do that anyway if you used <code><nobr>GLOBAL</nobr></code>.
<p>Also see <a href="nasmdoc2.html#section-2.1.21">section 2.1.21</a>.
<h4><a name="section-7.4.2">7.4.2 Memory Models</a></h4>
<p>NASM contains no mechanism to support the various C memory models directly; you have to keep track yourself of which one you are writing for. This means you have to keep track of the following things:
<ul>
<li>In models using a single code segment (tiny, small and compact), functions are near. 
This means that function pointers, when stored in data segments or pushed on the stack as function arguments, are 16 bits long and contain only an offset field (the <code><nobr>CS</nobr></code> register never changes its value, and always gives the segment part of the full function address), and that functions are called using ordinary near <code><nobr>CALL</nobr></code> instructions and return using <code><nobr>RETN</nobr></code> (which, in NASM, is synonymous with <code><nobr>RET</nobr></code> anyway). This means both that you should write your own routines to return with <code><nobr>RETN</nobr></code>, and that you should call external C routines with near <code><nobr>CALL</nobr></code> instructions.
<li>In models using more than one code segment (medium, large and huge), functions are far. This means that function pointers are 32 bits long (consisting of a 16-bit offset followed by a 16-bit segment), and that functions are called using <code><nobr>CALL FAR</nobr></code> (or <code><nobr>CALL seg:offset</nobr></code>) and return using <code><nobr>RETF</nobr></code>. Again, you should therefore write your own routines to return with <code><nobr>RETF</nobr></code> and use <code><nobr>CALL FAR</nobr></code> to call external routines.
<li>In models using a single data segment (tiny, small and medium), data pointers are 16 bits long, containing only an offset field (the <code><nobr>DS</nobr></code> register doesn't change its value, and always gives the segment part of the full data item address).
<li>In models using more than one data segment (compact, large and huge), data pointers are 32 bits long, consisting of a 16-bit offset followed by a 16-bit segment. 
You should still be careful not to modify <code><nobr>DS</nobr></code> in your routines without restoring it afterwards, but <code><nobr>ES</nobr></code> is free for you to use to access the contents of 32-bit data pointers you are passed.
<li>The huge memory model allows single data items to exceed 64K in size. In all other memory models, you can access the whole of a data item just by doing arithmetic on the offset field of the pointer you are given, whether a segment field is present or not; in huge model, you have to be more careful of your pointer arithmetic.
<li>In most memory models, there is a <em>default</em> data segment, whose segment address is kept in <code><nobr>DS</nobr></code> throughout the program. This data segment is typically the same segment as the stack, kept in <code><nobr>SS</nobr></code>, so that functions' local variables (which are stored on the stack) and global data items can both be accessed easily without changing <code><nobr>DS</nobr></code>. Particularly large data items are typically stored in other segments. However, some memory models (though not the standard ones, usually) allow the assumption that <code><nobr>SS</nobr></code> and <code><nobr>DS</nobr></code> hold the same value to be removed. Be careful about functions' local variables in this latter case.
</ul>
<p>In models with a single code segment, the segment is called <code><nobr>_TEXT</nobr></code>, so your code segment must also go by this name in order to be linked into the same place as the main code segment. In models with a single data segment, or with a default data segment, it is called <code><nobr>_DATA</nobr></code>.
<h4><a name="section-7.4.3">7.4.3 Function Definitions and Function Calls</a></h4>
<p>The C calling convention in 16-bit programs is as follows. 
In the following description, the words <em>caller</em> and <em>callee</em> are used to denote the function doing the calling and the function which gets called.
<ul>
<li>The caller pushes the function's parameters on the stack, one after another, in reverse order (right to left, so that the first argument specified to the function is pushed last).
<li>The caller then executes a <code><nobr>CALL</nobr></code> instruction to pass control to the callee. This <code><nobr>CALL</nobr></code> is either near or far depending on the memory model.
<li>The callee receives control, and typically (although this is not actually necessary, in functions which do not need to access their parameters) starts by saving the value of <code><nobr>SP</nobr></code> in <code><nobr>BP</nobr></code> so as to be able to use <code><nobr>BP</nobr></code> as a base pointer to find its parameters on the stack. However, the caller was probably doing this too, so part of the calling convention states that <code><nobr>BP</nobr></code> must be preserved by any C function. Hence the callee, if it is going to set up <code><nobr>BP</nobr></code> as a <em>frame pointer</em>, must push the previous value first.
<li>The callee may then access its parameters relative to <code><nobr>BP</nobr></code>. The word at <code><nobr>[BP]</nobr></code> holds the previous value of <code><nobr>BP</nobr></code> as it was pushed; the next word, at <code><nobr>[BP+2]</nobr></code>, holds the offset part of the return address, pushed implicitly by <code><nobr>CALL</nobr></code>. In a small-model (near) function, the parameters start after that, at <code><nobr>[BP+4]</nobr></code>; in a large-model (far) function, the segment part of the return address lives at <code><nobr>[BP+4]</nobr></code>, and the parameters begin at <code><nobr>[BP+6]</nobr></code>. The leftmost parameter of the function, since it was pushed last, is accessible at this offset from <code><nobr>BP</nobr></code>; the others follow, at successively greater offsets. 
Thus, in a function such as <code><nobr>printf</nobr></code> which takes a variable number of parameters, the pushing of the parameters in reverse order means that the function knows where to find its first parameter, which tells it the number and type of the remaining ones.
<li>The callee may also wish to decrease <code><nobr>SP</nobr></code> further, so as to allocate space on the stack for local variables, which will then be accessible at negative offsets from <code><nobr>BP</nobr></code>.
<li>The callee, if it wishes to return a value to the caller, should leave the value in <code><nobr>AL</nobr></code>, <code><nobr>AX</nobr></code> or <code><nobr>DX:AX</nobr></code> depending on the size of the value. Floating-point results are sometimes (depending on the compiler) returned in <code><nobr>ST0</nobr></code>.
<li>Once the callee has finished processing, it restores <code><nobr>SP</nobr></code> from <code><nobr>BP</nobr></code> if it had allocated local stack space, then pops the previous value of <code><nobr>BP</nobr></code>, and returns via <code><nobr>RETN</nobr></code> or <code><nobr>RETF</nobr></code> depending on memory model.
<li>When the caller regains control from the callee, the function parameters are still on the stack, so it typically adds an immediate constant to <code><nobr>SP</nobr></code> to remove them (instead of executing a number of slow <code><nobr>POP</nobr></code> instructions). Thus, if a function is accidentally called with the wrong number of parameters due to a prototype mismatch, the stack will still be returned to a sensible state since the caller, which <em>knows</em> how many parameters it pushed, does the removing.
</ul>
<p>It is instructive to compare this calling convention with that for Pascal programs (described in <a href="#section-7.5.1">section 7.5.1</a>). Pascal has a simpler convention, since no functions have variable numbers of parameters. 
Therefore the callee knows how many parameters it should have been passed, and is able to deallocate them from the stack itself by passing an immediate argument to the <code><nobr>RET</nobr></code> or <code><nobr>RETF</nobr></code> instruction, so the caller does not have to do it. Also, the parameters are pushed in left-to-right order, not right-to-left, which means that a compiler can give better guarantees about sequence points without performance suffering.
<p>Thus, you would define a function in C style in the following way. The following example is for small model:
<p><pre>
global  _myfunc

_myfunc:
        push    bp
        mov     bp,sp
        sub     sp,0x40         ; 64 bytes of local stack space
        mov     bx,[bp+4]       ; first parameter to function

        ; some more code

        mov     sp,bp           ; undo "sub sp,0x40" above
        pop     bp
        ret
</pre>
<p>For a large-model function, you would replace <code><nobr>RET</nobr></code> by <code><nobr>RETF</nobr></code>, and look for the first parameter at <code><nobr>[BP+6]</nobr></code> instead of <code><nobr>[BP+4]</nobr></code>. Of course, if one of the parameters is a pointer, then the offsets of <em>subsequent</em> parameters will change depending on the memory model as well: far pointers take up four bytes on the stack when passed as a parameter, whereas near pointers take up two.
<p>At the other end of the process, to call a C function from your assembly code, you would do something like this:
<p><pre>
extern  _printf

      ; and then, further down...

      push    word [myint]        ; one of my integer variables
      push    word mystring       ; pointer into my data segment
      call    _printf
      add     sp,byte 4           ; `byte' saves space

      ; then those data items...

segment _DATA

myint         dw    1234
mystring      db    'This number -> %d <- should be 1234',10,0
</pre>
<p>This piece of code is the small-model assembly equivalent of the C code
<p><pre>
    int myint = 1234;
    printf("This number -> %d <- should be 1234\n", myint);
</pre>
<p>In large model, the function-call code might look more like this. 
In this example, it is assumed that <code><nobr>DS</nobr></code> already holds the segment base of the segment <code><nobr>_DATA</nobr></code>. If not, you would have to initialise it first.
<p><pre>
      push    word [myint]
      push    word seg mystring   ; Now push the segment, and...
      push    word mystring       ; ... offset of "mystring"
      call    far _printf
      add     sp,byte 6
</pre>
<p>The integer value still takes up one word on the stack, since large model does not affect the size of the <code><nobr>int</nobr></code> data type. The first argument (pushed last) to <code><nobr>printf</nobr></code>, however, is a data pointer, and therefore has to contain a segment and offset part. The segment should be stored second in memory, and therefore must be pushed first. (Of course, <code><nobr>PUSH DS</nobr></code> would have been a shorter instruction than <code><nobr>PUSH WORD SEG mystring</nobr></code>, if <code><nobr>DS</nobr></code> was set up as the above example assumed.) Then the actual call becomes a far call, since functions expect far calls in large model; and <code><nobr>SP</nobr></code> has to be increased by 6 rather than 4 afterwards to make up for the extra word of parameters.
<h4><a name="section-7.4.4">7.4.4 Accessing Data Items</a></h4>
<p>To get at the contents of C variables, or to declare variables which C can access, you need only declare the names as <code><nobr>GLOBAL</nobr></code> or <code><nobr>EXTERN</nobr></code>. (Again, the names require leading underscores, as stated in <a href="#section-7.4.1">section 7.4.1</a>.) Thus, a C variable declared as <code><nobr>int i</nobr></code> can be accessed from assembler as
<p><pre>
extern _i

        mov ax,[_i]
</pre>
<p>And to declare your own integer variable which C programs can access as <code><nobr>extern int j</nobr></code>, you do this (making sure you are assembling in the <code><nobr>_DATA</nobr></code> segment, if necessary):
<p><pre>
global  _j

_j      dw      0
</pre>
<p>To access a C array, you need to know the size of the components of the array. 
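<p>As a minimal sketch of why the component size matters, the fragment below reads an element whose index is only known at run time. It is an illustration under stated assumptions, not part of the original text: a 16-bit compiler with a two-byte <code><nobr>int</nobr></code>, a single-data-segment model in which <code><nobr>DS</nobr></code> already addresses the segment holding the array, a C declaration <code><nobr>int a[10]</nobr></code>, and a hypothetical word variable <code><nobr>myindex</nobr></code> defined elsewhere in your own data.
<p><pre>
extern  _a                      ; assumed C declaration: int a[10];

        mov     bx,[myindex]    ; hypothetical word variable holding the element index
        shl     bx,1            ; scale the index by 2, the assumed size of an int
        mov     ax,[_a+bx]      ; AX = a[myindex]
</pre>
<p>For other element types the scale factor would change to match the component size.
<p>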
For example, <code><nobr>int</nobr></code> variables are two bytes long, so if a C program declares an array as <code><nobr>int a[10]</nobr></code>, you can access <code><nobr>a[3]</nobr></code> by coding <code><nobr>mov ax,[_a+6]</nobr></code>. (The byte offset 6 is obtained by