Copyright 1999, 2000, 2001 Free Software Foundation, Inc.

This file is part of the GNU MP Library.

The GNU MP Library is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at your
option) any later version.

The GNU MP Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.

You should have received a copy of the GNU Lesser General Public License
along with the GNU MP Library; see the file COPYING.LIB. If not, write to
the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA.



                      X86 MPN SUBROUTINES


This directory contains mpn functions for various 80x86 chips.


CODE ORGANIZATION

        x86               i386, generic
        x86/i486          i486
        x86/pentium       Intel Pentium (P5, P54)
        x86/pentium/mmx   Intel Pentium with MMX (P55)
        x86/p6            Intel Pentium Pro
        x86/p6/mmx        Intel Pentium II, III
        x86/p6/p3mmx      Intel Pentium III
        x86/k6            \ AMD K6
        x86/k6/mmx        /
        x86/k6/k62mmx     AMD K6-2
        x86/k7            \ AMD Athlon
        x86/k7/mmx        /
        x86/pentium4      \
        x86/pentium4/mmx  | Intel Pentium 4
        x86/pentium4/sse2 /

The top-level x86 directory contains blended style code, meant to be
reasonable on all x86s.


STATUS

The code is well-optimized for AMD and Intel chips, but there's nothing
specific for Cyrix chips, nor for actual 80386 and 80486 chips.


ASM FILES

The x86 .asm files are BSD style assembler code, first put through m4 for
macro processing. The generic mpn/asm-defs.m4 is used, together with
mpn/x86/x86-defs.m4. See comments in those files.

The code is meant for use with GNU "gas" or a system "as". There's no
support for assemblers that demand Intel style code.


STACK FRAME

m4 macros are used to define the parameters passed on the stack, and these
act like comments on what the stack frame looks like too.
For example, mpn_mul_1() has the following.

        defframe(PARAM_MULTIPLIER, 16)
        defframe(PARAM_SIZE,       12)
        defframe(PARAM_SRC,         8)
        defframe(PARAM_DST,         4)

PARAM_MULTIPLIER becomes `FRAME+16(%esp)', and the others similarly. The
return address is at offset 0, but there's not normally any need to access
that.

FRAME is redefined as necessary through the code so it's the number of bytes
pushed on the stack, and hence the offsets in the parameter macros stay
correct. At the start of a routine FRAME should be zero.

        deflit(`FRAME',0)
        ...
        deflit(`FRAME',4)
        ...
        deflit(`FRAME',8)
        ...

Helper macros FRAME_pushl(), FRAME_popl(), FRAME_addl_esp() and
FRAME_subl_esp() exist to adjust FRAME for the effect of those instructions,
and can be used instead of explicit definitions if preferred.
defframe_pushl() is a combination of FRAME_pushl() and defframe().

There's generally some slackness in redefining FRAME. If new values aren't
going to get used then the redefinitions are omitted to keep from cluttering
up the code. This happens for instance at the end of a routine, where there
might be just four pops and then a ret, so FRAME isn't getting used.

Local variables and saved registers can be similarly defined, with negative
offsets representing stack space below the initial stack pointer. For
example,

        defframe(SAVE_ESI,   -4)
        defframe(SAVE_EDI,   -8)
        defframe(VAR_COUNTER,-12)

        deflit(STACK_SPACE, 12)

Here STACK_SPACE gets used in a "subl $STACK_SPACE, %esp" to allocate the
space, and that instruction must be followed by a redefinition of FRAME
(setting it equal to STACK_SPACE) to reflect the change in %esp.

Definitions for pushed registers are only put in when they're going to be
used. If registers are just saved and restored with pushes and pops then
definitions aren't made.


ASSEMBLER EXPRESSIONS

Only addition and subtraction seem to be universally available, certainly
that's all the Solaris 8 "as" seems to accept.
If expressions are wanted then m4 eval() should be used.

In particular note that a "/" anywhere in a line starts a comment in Solaris
"as", and in some configurations of gas too.

        addl    $32/2, %eax           <-- wrong

        addl    $eval(32/2), %eax     <-- right

Binutils gas/config/tc-i386.c has a choice between "/" being a comment
anywhere in a line, or only at the start. FreeBSD patches 2.9.1 to select
the latter, and from 2.9.5 it's the default for GNU/Linux too.


ASSEMBLER COMMENTS

Solaris "as" doesn't support "#" commenting, using /* */ instead. For that
reason "C" commenting is used (see asm-defs.m4) and the intermediate ".s"
files have no comments.

Any comments before include(`../config.m4') must use m4 "dnl", since it's
only after the include that "C" is available. By convention "dnl" is also
used for comments about m4 macros.


TEMPORARY LABELS

Temporary numbered labels like "1:" used as "1f" or "1b" are available in
"gas" and Solaris "as", but not in SCO "as". Normal L() labels should be
used instead, possibly with a counter to make them unique, see jadcl0() for
instance. A separate counter for each macro makes it possible to nest them,
for instance movl_text_address() can be used within an ASSERT().

"1:" etc must be avoided in gcc __asm__ blocks too. "%=" for generating a
unique number looks like a good alternative, but is that actually a
documented feature? In any case this problem doesn't currently arise.


ZERO DISPLACEMENTS

In a couple of places addressing modes like 0(%ebx) with a byte-sized zero
displacement are wanted, rather than (%ebx) with no displacement. These are
either for computed jumps or to get desirable code alignment. Explicit
.byte sequences are used to ensure the assembler doesn't turn 0(%ebx) into
(%ebx). The Zdisp() macro in x86-defs.m4 is used for this.

Current gas 2.9.5 or recent 2.9.1 leave 0(%ebx) as written, but old gas
1.92.3 changes it.
In general changing would be the sort of "optimization" an assembler might
perform, hence explicit ".byte"s are used where necessary.


SHLD/SHRD INSTRUCTIONS

The %cl count forms of double shift instructions like "shldl %cl,%eax,%ebx"
must be written "shldl %eax,%ebx" for some assemblers. gas takes either,
Solaris "as" doesn't allow %cl, gcc generates %cl for gas and NeXT (which is
gas), and omits %cl elsewhere.

For GMP an autoconf test GMP_ASM_X86_SHLDL_CL is used to determine whether
%cl should be used, and the macros shldl, shrdl, shldw and shrdw in
mpn/x86/x86-defs.m4 pass through or omit %cl as necessary. See the comments
with those macros for usage.


IMUL INSTRUCTION

GCC config/i386/i386.md (cvs rev 1.187, 21 Oct 00) under *mulsi3_1 notes
that the following two forms produce identical object code

        imul    $12, %eax
        imul    $12, %eax, %eax

but that the former isn't accepted by some assemblers, in particular the SCO
OSR5 COFF assembler. GMP follows GCC and uses only the latter form.

(This applies only to immediate operands, the three operand form is only
valid with an immediate.)


DIRECTION FLAG

The x86 calling conventions say that the direction flag should be clear at
function entry and exit. (See iBCS2 and SVR4 ABI books, references below.)
Although this has been so since the year dot, it's not absolutely clear
whether it's universally respected. Since it's better to be safe than
sorry, GMP follows glibc and does a "cld" if it depends on the direction
flag being clear. This happens only in a few places.
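
As an illustration of the "cld" convention described above, here is a
minimal sketch in C using a GCC-style __asm__ block. It is not GMP code:
the function name copy_forward is hypothetical, and the sketch assumes a
GCC-compatible compiler targeting x86, falling back to memcpy() elsewhere.
The "cld" is issued immediately before the string instruction, so the copy
doesn't depend on the caller having left the direction flag clear.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only, not part of GMP.
   Copies n bytes from src to dst, lowest address first.  */
void
copy_forward (void *dst, const void *src, size_t n)
{
  assert (n == 0 || (dst != NULL && src != NULL));
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  __asm__ __volatile__ ("cld\n\t"       /* clear DF so movsb increments */
                        "rep; movsb"    /* copy the counted bytes */
                        : "+D" (dst), "+S" (src), "+c" (n)
                        :
                        : "memory", "cc");
#else
  /* Assumption: on other targets a plain memcpy stands in for the asm.  */
  memcpy (dst, src, n);
#endif
}
```

A call such as copy_forward (out, "hello", 6) copies the five letters and
the terminating zero into out. The cost of the extra "cld" is small next
to the risk of a string instruction running backwards through memory.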