📄 em86real.s
/*
 *  em86real.S
 *
 *  Copyright (C) 1998, 1999 Gabriel Paubert, paubert@iram.es
 *
 *  Modified to compile in RTEMS development environment
 *  by Eric Valette
 *
 *  Copyright (C) 1999 Eric Valette. valette@crf.canon.fr
 *
 *  The license and distribution terms for this file may be
 *  found in the file LICENSE in this distribution or at
 *  http://www.rtems.com/license/LICENSE.
 *
 * $Id: em86real.S,v 1.2.2.1 2003/09/04 18:45:20 joel Exp $
 */

/* If the symbol __BOOT__ is defined, a slightly different version is
 * generated to be compiled with the -m relocatable option
 */

#ifdef __BOOT__
#include "bootldr.h"
/* It is impossible to gather statistics in the boot version */
#undef EIP_STATS
#endif

/*
 *
 * Given the size of this code, it deserves a few comments on how it works,
 * and why it was implemented the way it is.
 *
 * The goal is to have a real mode i486SX emulator to initialize hardware,
 * mostly graphics boards, by interpreting ROM BIOSes. The choice of a 486SX
 * is logical since this is the lowest processor that PCI ROM BIOSes must run
 * on.
 *
 * The goal of this emulator is not performance, but a small enough memory
 * footprint to include it in a bootloader.
 *
 * It is actually likely to be comparable to a 25MHz 386DX on a 200MHz 603e !
 * This is not as serious as it seems since most of the BIOS code performs
 * a lot of accesses to I/O and non-cacheable memory spaces. For such
 * instructions, the execution time is often dominated by bus accesses.
 * Statistics of the code also show that it spends a large fraction of
 * the time in loops, either waiting for vertical retrace or programming
 * one of the timers and waiting for the count to go down to zero.
 * This type of loop runs emulated at the same speed as on a 5 GHz
 * Pentium IV++ ;)
 *
 */

/*
 * Known bugs or differences with a real 486SX (real mode):
 * - segment limits are not enforced (too costly)
 * - xchg instructions with memory are not locked
 * - lock prefixes are not implemented at all
 * - long divides implemented but perhaps still buggy
 * - miscellaneous system instructions not implemented
 *   (some probably cannot be implemented)
 * - neither control nor debug registers are implemented for the time being
 *   (debug registers are impossible to implement at a reasonable cost)
 */

/* Code options, put them on the compiler command line */
/* #define EIP_STATS */	/* EIP based profiling */
/* #undef EIP_STATS */

/*
 * Implementation notes:
 *
 * A) flags emulation.
 *
 * The most important decisions when it comes to obtaining a reasonable speed
 * are related to how the EFLAGS register is emulated.
 *
 * Note: the code to set up flags is complex, but it is only seldom
 * executed since cmp and test instructions use much faster flag evaluation
 * paths. For example the overflow flag is almost only needed for pushf and
 * int. Comparison results only involve (SF^OF) or (SF^OF)+ZF and the
 * implementation is fast in this case.
 *
 * Rarely used flags: AC, NT and IOPL are kept in a memory EFLAGS image.
 * All other flags are either kept explicitly in PPC cr (DF, IF, and TF) or
 * lazily evaluated from the state of 4 registers called flags, result, op1,
 * op2, and sometimes the cr itself. The emulation has been designed for
 * minimal overhead for the common case where the flags are never used. With
 * few exceptions, all instructions that set flags leave the result of the
 * computation in a register called result, and operands are taken from op1
 * and op2 registers. However a few instructions like cmp, test and bit tests
 * (bt/btc/btr/bts/bsf/bsr) explicitly set cr bits to short circuit
 * condition code evaluation of conditional instructions.
 *
 * As a very brief summary:
 *
 * - the result of the last flag setting operation is often either in the
 *   result register or in op2 after increment or decrement instructions
 *   because result and op1 may be needed to compute the carry.
 *
 * - compare instructions leave the result of the unsigned comparison
 *   in cr4 and of the signed comparison in cr6. This means that:
 *   - cr4[0]=CF (short circuit for jc/jnc)
 *   - cr4[1]=~(CF+ZF) (short circuit for ja/jna)
 *   - cr6[0]=(OF^SF) (short circuit for jl/jnl)
 *   - cr6[1]=~((SF^OF)+ZF) (short circuit for jg/jng)
 *   - cr6[2]=ZF (short circuit for jz/jnz)
 *
 * - test instructions set flags in cr6 and clear overflow. This means that:
 *   - cr6[0]=SF=(SF^OF) (short circuit for jl/jnl/js/jns)
 *   - cr6[1]=~((SF^OF)+ZF) (short circuit for jg/jng)
 *   - cr6[2]=ZF (short circuit for jz/jnz)
 *
 * All flags may be lazily evaluated from several values kept in registers:
 *
 *	Flag:	Depends upon:
 *	OF	result, op1, op2, flags[INCDEC_FIELD,SUBTRACTING,OF_STATE_MASK]
 *	SF	result, op2, flags[INCDEC_FIELD,RES_SIZE]
 *	ZF	result, op2, cr6[2], flags[INCDEC_FIELD,RES_SIZE,ZF_PROTECT]
 *	AF	op1, op2, flags[INCDEC_FIELD,SUBTRACTING,CF_IN]
 *	PF	result, op2, flags[INCDEC_FIELD]
 *	CF	result, op1, flags[CF_STATE_MASK, CF_IN]
 *
 * The order of the fields in the flags register has been chosen so that a
 * single rlwimi is necessary for common instructions that do not affect all
 * flags. (See the code for inc/dec emulation).
 *
 *
 * B) opcodes and prefixes.
 *
 * The register called opcode holds in its low order 8 bits the opcode
 * (second byte if the first byte is 0x0f). More precisely it holds the
 * last byte fetched before the modrm byte or the immediate operand(s)
 * of the instruction, if any. High order 24 bits are zero unless the
 * instruction has prefixes.
 * These higher order bits have the following meaning:
 *	0x80000000	segment override prefix
 *	0x00001000	repnz prefix (0xf2)
 *	0x00000800	repz prefix (0xf3)
 *	0x00000400	address size prefix (0x67)
 *	0x00000200	operand size prefix (0x66)
 * (bit 0x1000 and 0x800 cannot be set simultaneously)
 *
 * Therefore if there is a segment override the value will be very
 * negative (between 0x80000000 and 0x800016ff); if there is no segment
 * override, the value will be between 0 and 0x16ff. The reason for
 * this choice will be understood in the next part.
 *
 * C) addressing mode description tables.
 *
 * The encoding of the modrm bytes (especially in 16 bit mode) is quite
 * complex. Hence a table, indexed by the five useful bits of the modrm
 * byte, is used to simplify decoding. Here is a description:
 *
 *  bit mask	meaning
 *  0x80000000	use ss as default segment register
 *  0x00004000	means that this addressing mode needs a base register
 *		(set for all entries except sib and displacement-only)
 *  0x00002000	set if preceding is not set
 *  0x00001000	set if an sib follows
 *  0x00000700	base register to use (16 and 32 bit)
 *  0x00000080	set in 32 bit addressing mode table, cleared in 16 bit
 *		(so extsb mask,entry; ori mask,mask,0xffff gives a mask)
 *  0x00000070	kludge field, possible values are
 *		0: 16 bit addressing mode without index
 *		10: 32 bit addressing mode
 *		60: 16 bit addressing mode with %si as index
 *		70: 16 bit addressing mode with %di as index
 *
 * This convention leads to the following special values used to check for
 * sib present and displacement-only, which happen to be the three lowest
 * values in the table (unsigned):
 *	0x00003090	sib follows (implies it is a 32 bit mode)
 *	0x00002090	32 bit displacement-only
 *	0x00002000	16 bit displacement-only
 *
 * This means that all entries are either very negative, in the 0x80002000
 * range, if the segment defaults to ss, or higher than 0x2000 if it defaults
 * to ds.
 * Combined with the value in opcode this gives the following table:
 *	opcode		entry		entry>opcode ?	segment to use
 *	positive	positive	yes		ds (default)
 *	negative	positive	yes		overridden by prefix
 *	positive	negative	no		ss
 *	negative	negative	yes		overridden by prefix
 *
 * Hence a simple comparison is enough to check for the need to override
 * the current base with ss, i.e., when ss is the default base and the
 * instruction has no override prefix.
 *
 * D) BUGS
 *
 * This software is obviously bug-free :-). Nevertheless, if you encounter
 * an interesting feature, mail me a note, if possible with a detailed
 * instruction example showing where and how it fails.
 *
 */

/* Now the details of flag evaluation with the necessary macros */

/* Alignment check is togglable so the system believes it is a 486, but
CPUID is not, to avoid unnecessary complexities. However, alignment
is actually never checked (real mode is CPL 0 anyway). */
#define AC86	13		/* Can only be toggled */
#define VM86	14		/* Not used for now */
#define RF86	15		/* Not emulated precisely */
/* Actually NT and IOPL are kept in memory */
#define NT86	17
#define IOPL86	18		/* Actually 18 and 19 */
#define OF86	20
#define DF86	21
#define IF86	22
#define TF86	23
#define SF86	24
#define ZF86	25
#define AF86	27
#define PF86	29
#define CF86	31

/* Where the less important flags are placed in PPC cr */
#define RF	20		/* Suppress trap flag: cr5[0] */
#define DF	21		/* Direction flag: cr5[1] */
#define IF	22		/* Interrupt flag: cr5[2] */
#define TF	23		/* Single step flag: cr5[3] */

/* Now the flags which are frequently used */
/*
 * CF_IN is a copy of the input carry with PPC polarity,
 * it is cleared for add, set for sub and cmp,
 * equal to the x86 carry for adc and to its complement for sbb.
 * it is used to evaluate AF and CF.
 */
#define CF_IN		0x80000000

/* #define GET_CF_IN(dst)	rlwinm dst,flags,1,0x01 */

/* CF_IN_CR set in flags means that cr4[0] is a copy of carry bit */
#define CF_IN_CR	0x40000000

#define EVAL_CF		andis. r3,flags,(CF_IN_CR)>>16; beql- _eval_cf

/*
 * CF_STATE tells how to compute the carry bit.
 * CF_NOTRES16 and CF_NOTRES8 are never set explicitly,
 * but they may happen after a cmc instruction.
 */
#define CF		16		/* cr4[0] */
#define CF_LOCATION	0x30000000
#define CF_ZERO		0x00000000
#define CF_EXPLICIT	0x00000000
#define CF_COMPLEMENT	0x08000000	/* Indeed a polarity bit */
#define CF_STATE_MASK	(CF_LOCATION|CF_COMPLEMENT)
#define CF_VALUE	0x08000000
#define CF_SET		0x08000000
#define CF_RES32	0x10000000
#define CF_NOTRES32	0x18000000
#define CF_RES16	0x20000000
#define CF_NOTRES16	0x28000000
#define CF_RES8		0x30000000
#define CF_NOTRES8	0x38000000

#define CF_ADDL		CF_RES32
#define CF_SUBL		CF_NOTRES32
#define CF_ADDW		CF_RES16
#define CF_SUBW		CF_RES16
#define CF_ADDB		CF_RES8
#define CF_SUBB		CF_RES8

#define CF_ROTCNT(dst)	rlwinm dst,flags,7,0x18
#define CF_POL(dst,pos)	rlwinm dst,flags,(36-pos)%32,pos,pos
#define CF_POL_INSERT(dst,pos)	\
			rlwimi dst,flags,(36-pos)%32,pos,pos
#define RES2CF(dst)	rlwinm dst,result,8,7,15

/*
 * OF_STATE tells how to compute the overflow bit. When the low order bit
 * is set (OF_EXPLICIT), it means that OF is the exclusive or of the
 * two other bits. For the reason behind this choice, see rotate instructions.
 */
#define OF		1		/* Only after EVAL_OF */
#define OF_STATE_MASK	0x07000000
#define OF_INCDEC	0x00000000
#define OF_EXPLICIT	0x01000000
#define OF_ZERO		0x01000000
#define OF_VALUE	0x04000000
#define OF_SET		0x04000000
#define OF_ONE		0x05000000
#define OF_XOR		0x06000000
#define OF_ARITHL	0x06000000
#define OF_ARITHW	0x02000000
#define OF_ARITHB	0x04000000

#define EVAL_OF		rlwinm. r3,flags,6,0,1; bngl+ _eval_of; andis. r3,flags,OF_VALUE>>16

/* See _eval_of to see how this can be used */
#define OF_ROTCNT(dst)	rlwinm dst,flags,10,0x1c

/*
 * SIGNED_IN_CR means that cr6 is set as after a signed compare:
 * - cr6[0] is SF^OF for jl/jnl/setl/setnl...
 * - cr6[1] is ~((SF^OF)+ZF) for jg/jng/setg/setng...
 * - cr6[2] is ZF (ZF_IN_CR is always set if this bit is set)
 */
#define SLT		24		/* cr6[0], signed less than */
#define SGT		25		/* cr6[1], signed greater than */
#define SIGNED_IN_CR	0x00800000

#define EVAL_SIGNED	andis. r3,flags,SIGNED_IN_CR>>16; beql- _eval_signed

/*
 * ABOVE_IN_CR means that cr4 is set as after an unsigned compare:
 * - cr4[0] is CF (CF_IN_CR is also set)
 * - cr4[1] is ~(CF+ZF) (ZF_IN_CR is also set)
 */
#define ABOVE		17		/* cr4[1] */
#define ABOVE_IN_CR	0x00400000

#define EVAL_ABOVE	andis. r3,flags,ABOVE_IN_CR>>16; beql- _eval_above

/* SF_IN_CR means cr6[0] is a copy of SF. It implies ZF_IN_CR is also set */
#define SF		24		/* cr6[0] */
#define SF_IN_CR	0x00200000

#define EVAL_SF		andis. r3,flags,SF_IN_CR>>16; beql- _eval_sf_zf

/* ZF_IN_CR means cr6[2] is a copy of ZF. */
#define ZF		26
#define ZF_IN_CR	0x00100000

#define EVAL_ZF		andis. r3,flags,ZF_IN_CR>>16; beql- _eval_sf_zf
#define ZF2ZF86(s,d)	rlwimi d,s,ZF-ZF86,ZF86,ZF86
#define ZF862ZF(reg)	rlwimi reg,reg,32+ZF86-ZF,ZF,ZF

/*
 * ZF_PROTECT means cr6[2] is the only valid value for ZF. This is necessary
 * because some infrequent instructions may leave SF and ZF in an apparently
 * inconsistent state (both set): sahf, popf and the few (not implemented)
 * instructions that only affect ZF.
 */
#define ZF_PROTECT	0x00080000

/* The parity is always evaluated when it is needed */
#define PF		0		/* Only after EVAL_PF */
#define EVAL_PF		bl _eval_pf

/* This field gives the shift amount to use to evaluate SF
   and ZF when ZF_PROTECT is not set */
#define RES_SIZE_MASK	0x00060000
#define RESL		0x00000000
#define RESW		0x00040000
#define RESB		0x00060000

#define RES_SHIFT(dst)	rlwinm dst,flags,18,0x18

/* SUBTRACTING is set if the last flag setting instruction was sub/sbb/cmp,
   used to evaluate OF and AF */
#define SUBTRACTING	0x00010000
#define GET_ADDSUB(dst)	rlwinm dst,flags,16,0x01

/* rotates (rcl/rcr/rol/ror) affect CF and OF but not other flags */
#define ROTATE_MASK	(CF_IN_CR|CF_STATE_MASK|ABOVE_IN_CR|OF_STATE_MASK|SIGNED_IN_CR)
#define ROTATE_FLAGS	rlwimi flags,one,24,ROTATE_MASK

/*
 * INCDEC_FIELD has at most one bit set when the last flag setting instruction
 * was either inc or dec (which do not affect the carry). When one of these
 * bits is set, it affects the way OF, SF, ZF, AF, and PF are evaluated.
 */
#define INCDEC_FIELD	0x0000ff00

#define DECB_SHIFT	8
#define INCB_SHIFT	9
#define DECW_SHIFT	10
#define INCW_SHIFT	11
#define DECL_SHIFT	14
#define INCL_SHIFT	15

#define INCDEC_MASK	(OF_STATE_MASK|SIGNED_IN_CR|ABOVE_IN_CR|SF_IN_CR|\
			ZF_IN_CR|ZF_PROTECT|RES_SIZE_MASK|SUBTRACTING|\
			INCDEC_FIELD)

/* Operations to perform to tell where the flags are after inc or dec */
#define INC_FLAGS(BWL)	rlwimi flags,one,INC##BWL##_SHIFT,INCDEC_MASK
#define DEC_FLAGS(BWL)	rlwimi flags,one,DEC##BWL##_SHIFT,INCDEC_MASK

/* How the flags are set after arithmetic operations */
#define FLAGS_ADD(BWL)	(CF_ADD##BWL|OF_ARITH##BWL|RES##BWL)
#define FLAGS_SBB(BWL)	(CF_SUB##BWL|OF_ARITH##BWL|RES##BWL|SUBTRACTING)
#define FLAGS_SUB(BWL)	FLAGS_SBB(BWL)|CF_IN
#define FLAGS_CMP(BWL)	FLAGS_SUB(BWL)|ZF_IN_CR|CF_IN_CR|SIGNED_IN_CR|ABOVE_IN_CR

/* How the flags are set after logical operations */
#define FLAGS_LOG(BWL)	(CF_ZERO|OF_ZERO|RES##BWL)
#define FLAGS_TEST(BWL)	FLAGS_LOG(BWL)|ZF_IN_CR|SIGNED_IN_CR|SF_IN_CR

/* How the flags are set after bt/btc/btr/bts. */
#define FLAGS_BTEST	CF_IN_CR|CF_ADDL|OF_ZERO|RESL

/* How the flags are set after bsf/bsr. */
#define FLAGS_BSRCH(WL)	CF_ZERO|OF_ZERO|RES##WL|ZF_IN_CR

/* How the flags are set after logical right shifts */
#define FLAGS_SHR(BWL)	(CF_EXPLICIT|OF_ARITH##BWL|RES##BWL)

/* How the flags are set after double length shifts */
#define FLAGS_DBLSH(WL)	(CF_EXPLICIT|OF_ARITH##WL|RES##WL)

/* How the flags are set after multiplies */
#define FLAGS_MUL	(CF_EXPLICIT|OF_EXPLICIT)

#define SET_FLAGS(fl)	lis flags,(fl)>>16
#define ADD_FLAGS(fl)	addis flags,flags,(fl)>>16

/*
 * We are always off by one when compared with Intel's eip, this shortens
 * code by allowing the next byte to be loaded with lbzu x,1(eip). The
 * register called eip actually contains csbase+eip, and thus should be
 * called lip for linear ip.
 */

/*
 * Reason codes passed to the C part of the emulator, this includes all
 * instructions which may change the current code segment.
 * These definitions will soon go into a separate include file.
 * Codes 0 to 255 correspond directly to the interrupt/trap that has
 * to be generated.
 */

#define code_divide_err	0
#define code_trap	1
#define code_int3	3
#define code_into	4
#define code_bound	5
#define code_ud		6
#define code_dna	7	/* FPU not available */

#define code_iretw	256	/* Interrupt returns */
#define code_iretl	257
#define code_lcallw	258	/* Far calls and jumps */
#define code_lcalll	259
#define code_ljmpw	260
#define code_ljmpl	261
#define code_lretw	262	/* Far returns */
#define code_lretl	263
#define code_softint	264	/* int $xx */
#define code_lock	265	/* Lock prefix */

/* Codes 1024 to 2047 are used for I/O port access instructions:
 - The three LSB define the port size (1, 2 or 4)
 - bit of weight 512 means out if set, in if clear
 - bit of weight 256 means ins/outs if set, in/out if clear
 - bit of weight 128 means use 32 bit addresses if set, 16 bit if clear
   (only used for ins/outs instructions, always clear for in/out)
 */
#define code_inb	1024+1
#define code_inw	1024+2
#define code_inl	1024+4
#define code_outb	1024+512+1
#define code_outw	1024+512+2
#define code_outl	1024+512+4
#define code_insb_a16	1024+256+1
#define code_insw_a16	1024+256+2
#define code_insl_a16	1024+256+4
#define code_outsb_a16	1024+512+256+1
#define code_outsw_a16	1024+512+256+2
#define code_outsl_a16	1024+512+256+4
#define code_insb_a32	1024+256+128+1
#define code_insw_a32	1024+256+128+2
#define code_insl_a32	1024+256+128+4
#define code_outsb_a32	1024+512+256+128+1
#define code_outsw_a32	1024+512+256+128+2
#define code_outsl_a32	1024+512+256+128+4

#define state 31
/* r31 (state) is a pointer to a structure describing the emulated x86
processor, its layout is the following:

first the general purpose registers, they are in little endian byte order

offset	name
   0	eax/ax/al
   1	ah
   4	ecx/cx/cl
   5	ch
   8	edx/dx/dl
   9	dh
  12	ebx/bx/bl
  13	bh
  16	esp/sp
  20	ebp/bp
  24	esi/si
  28	edi/di
*/
#define AL	0
#define AX	0
#define EAX	0
#define AH	1
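
The I/O reason codes defined above (1024 to 2047) pack the port size, the direction, the string flag and the address size into one integer. The following is a minimal sketch in C of how the receiving side might unpack such a code, assuming only the bit weights stated in the comment. The names `struct io_request` and `decode_io_code` are hypothetical and do not appear in em86real.S; the actual C part of the emulator may decode these codes differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration type, not part of em86real.S. */
struct io_request {
	unsigned size;	/* port access width in bytes: 1, 2 or 4 */
	bool is_out;	/* bit of weight 512: out if set, in if clear */
	bool is_string;	/* bit of weight 256: ins/outs if set */
	bool addr32;	/* bit of weight 128: 32 bit addresses if set */
};

/*
 * Unpack an I/O reason code (hypothetical helper).
 * Returns false if the code is outside 1024..2047 or if the three
 * low order bits do not hold a valid size (1, 2 or 4).
 */
static bool decode_io_code(unsigned code, struct io_request *req)
{
	assert(req != NULL);
	if (code < 1024 || code > 2047)
		return false;
	req->size = code & 7;			/* three LSB */
	req->is_out = (code & 512) != 0;
	req->is_string = (code & 256) != 0;
	req->addr32 = (code & 128) != 0;
	return req->size == 1 || req->size == 2 || req->size == 4;
}
```

With this sketch, the value 1024+512+256+128+2 (`code_outsw_a32`) decodes to a 2 byte string output using 32 bit addresses, while a code such as 264 (`code_softint`) is rejected because it lies outside the I/O range.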