@section mmo backend

The mmo object format is used exclusively together with Professor
Donald E.@: Knuth's educational 64-bit processor MMIX.  The simulator
@command{mmix}, which is available at
@url{http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz},
understands this format.  That package also includes a combined
assembler and linker called @command{mmixal}.  The mmo format has
no advantages feature-wise compared to e.g.@: ELF.  It is a simple
non-relocatable object format with no support for archives or
debugging information, except for symbol value information and
line numbers (which are not yet implemented in BFD).  See
@url{http://www-cs-faculty.stanford.edu/~knuth/mmix.html} for more
information about MMIX.  The ELF format is used for intermediate
object files in the BFD implementation.

@c We want to xref the symbol table node.  A feature in "chew"
@c requires that "commands" do not contain spaces in the
@c arguments.  Hence the hyphen in "Symbol-table".
@menu
* File layout::
* Symbol-table::
* mmo section mapping::
@end menu

@node File layout, Symbol-table, mmo, mmo
@subsection File layout

The contents of an mmo file are not partitioned into named sections as
with e.g.@: ELF.  Memory areas are formed by specifying the
location of the data that follows.  Only the memory area
@samp{0x0000@dots{}00} to @samp{0x01ff@dots{}ff} is executable, so
it is used for code (and constants), and the area
@samp{0x2000@dots{}00} to @samp{0x20ff@dots{}ff} is used for
writable data.  @xref{mmo section mapping}.

Contents are entered as 32-bit words, xor:ed over the previous
contents, which are always zero-initialized.  A word that starts with the
byte @samp{0x98} forms a command called a @samp{lopcode}, where
the next byte distinguishes between the thirteen lopcodes.  The
two remaining bytes, called the @samp{Y} and @samp{Z} fields, or
the @samp{YZ} field (a 16-bit big-endian number), are used for
various purposes that differ for each lopcode.
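
As an illustration of the word layout just described, the following
Python sketch splits one 32-bit big-endian word into its fields.  It is
not part of BFD or @command{mmixal}; the names @code{split_word} and
@code{LOP_ESCAPE} are made up for this example.  The sketch also ignores
context: a real reader has to track separately whether a word is quoted
contents rather than a command.

```python
import struct

# Assumed name for illustration: the first byte that marks a lopcode word.
LOP_ESCAPE = 0x98


def split_word(word_bytes):
    """Split one 32-bit big-endian mmo word into its fields.

    Returns None for a plain contents word.  For a lopcode word,
    returns the tuple (lopcode, y, z, yz), where yz is the 16-bit
    big-endian combination of the Y and Z bytes.
    """
    if len(word_bytes) != 4:
        raise ValueError("an mmo word is exactly four bytes")
    escape, lopcode, y, z = struct.unpack(">BBBB", word_bytes)
    if escape != LOP_ESCAPE:
        return None
    return (lopcode, y, z, (y << 8) | z)
```

For instance, the word @samp{0x98070001} splits into lopcode 7 with a
@samp{YZ} field of 1, while @samp{0x00010203} does not start with
@samp{0x98} and is therefore reported as plain contents.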

There is provision for specifying ``special data'' of 65536
different types.  We use type 80 (decimal), arbitrarily chosen to be the
same as the ELF @code{e_machine} number for MMIX, filling it with
section information normally found in ELF objects.
@xref{mmo section mapping}.

As documented in
@url{http://www-cs-faculty.stanford.edu/~knuth/mmixal-intro.ps.gz},
the lopcodes are:

@table @code
@item lop_quote
0x98000001.  The next word is contents, regardless of whether it
starts with 0x98 or not.

@item lop_loc
0x9801YYZZ, where @samp{Z} is 1 or 2.  This is a location
directive, setting the location for the next data to the next
32-bit word (for @math{Z = 1}) or 64-bit word (for @math{Z = 2}),
plus @math{Y * 2^56}.  Normally @samp{Y} is 0 for the text segment
and 2 for the data segment.

@item lop_skip
0x9802YYZZ.  Increase the current location by @samp{YZ} bytes.

@item lop_fixo
0x9803YYZZ, where @samp{Z} is 1 or 2.  Store the current location
as 64 bits into the location pointed to by the next 32-bit
(@math{Z = 1}) or 64-bit (@math{Z = 2}) word, plus
@math{Y * 2^56}.

@item lop_fixr
0x9804YYZZ.  @samp{YZ} is stored into the current location plus
@math{2 - 4 * YZ}.

@item lop_fixrx
0x980500ZZ.  @samp{Z} is 16 or 24.  A value @samp{L} derived from
the following 32-bit word is used in a manner similar to
@samp{YZ} in lop_fixr: it is xor:ed into the current location
minus @math{4 * L}.  The first byte of the word is 0 or 1.  If it
is 1, then @math{L = (@var{lowest 24 bits of word}) - 2^Z}; if it is 0,
then @math{L = (@var{lowest 24 bits of word})}.

@item lop_file
0x9806YYZZ.  @samp{Y} is the file number, @samp{Z} is the count of
32-bit words.  Set the file number to @samp{Y} and the line
counter to 0.  The next @math{Z * 4} bytes contain the file name,
padded with zeros if the count is not a multiple of four.  The
same @samp{Y} may occur multiple times, but @samp{Z} must be 0 for
all but the first occurrence.

@item lop_line
0x9807YYZZ.  @samp{YZ} is the line number.
Together with
lop_file, it forms the source location for the next 32-bit word.
Note that for each non-lopcode 32-bit word, line numbers are
assumed to be incremented by one.

@item lop_spec
0x9808YYZZ.  @samp{YZ} is the type number.  Data until the next
lopcode other than lop_quote forms special data of type @samp{YZ}.
@xref{mmo section mapping}.

Types other than 80 (or type 80 with contents that do not
parse) are stored in sections named @code{.MMIX.spec_data.@var{n}}
where @var{n} is the @samp{YZ}-type.  The flags for such a
section say not to allocate or load the data.  The vma is 0.
Contents of multiple occurrences of special data @var{n} are
concatenated to the data of the previous lop_spec @var{n}s.  The
location in data or code at which the lop_spec occurred is lost.

@item lop_pre
0x980901ZZ.  The first lopcode in a file.  The @samp{Z} field forms the
length of header information in 32-bit words, where the first word
tells the time in seconds since @samp{00:00:00 GMT Jan 1 1970}.

@item lop_post
0x980a00ZZ.  @math{Z > 32}.  This lopcode follows after all
content-generating lopcodes in a program.  The @samp{Z} field
denotes the value of @samp{rG} at the beginning of the program.
The following @math{256 - Z} big-endian 64-bit words are loaded
into global registers @samp{$G} @dots{} @samp{$255}.

@item lop_stab
0x980b0000.  The next-to-last lopcode in a program.  Must follow
immediately after the lop_post lopcode and its data.  After this
lopcode follow all symbols in a compressed format
(@pxref{Symbol-table}).

@item lop_end
0x980cYYZZ.  The last lopcode in a program.  It must follow the
lop_stab lopcode and its data.  The @samp{YZ} field contains the
number of 32-bit words of symbol table information after the
preceding lop_stab lopcode.
@end table

Note that the lopcode ``fixups'', @code{lop_fixr}, @code{lop_fixrx} and
@code{lop_fixo}, are not generated by BFD, but are handled.
They are
generated by @code{mmixal}.

This trivial one-label, one-instruction file:

@example
 :Main TRAP 1,2,3
@end example

can be represented this way in mmo:

@example
 0x98090101 - lop_pre, one 32-bit word with timestamp.
 <timestamp>
 0x98010002 - lop_loc, text segment, using a 64-bit address.
              Note that mmixal does not emit this for the file above.
 0x00000000 - Address, high 32 bits.
 0x00000000 - Address, low 32 bits.
 0x98060002 - lop_file, 2 32-bit words for file-name.
 0x74657374 - "test"
 0x2e730000 - ".s\0\0"
 0x98070001 - lop_line, line 1.
 0x00010203 - TRAP 1,2,3
 0x980a00ff - lop_post, setting $255 to 0.
 0x00000000
 0x00000000
 0x980b0000 - lop_stab for ":Main" = 0, serial 1.
 0x203a4040   @xref{Symbol-table}.
 0x10404020
 0x4d206120
 0x69016e00
 0x81000000
 0x980c0005 - lop_end; symbol table contained five 32-bit words.
@end example

@node Symbol-table, mmo section mapping, File layout, mmo
@subsection Symbol table format

From mmixal.w (or really, the generated mmixal.tex) in
@url{http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz}:
``Symbols are stored and retrieved by means of a @samp{ternary
search trie}, following ideas of Bentley and Sedgewick.  (See
ACM--SIAM Symp.@: on Discrete Algorithms @samp{8} (1997), 360--369;
R.@:Sedgewick, @samp{Algorithms in C} (Reading, Mass.@:
Addison--Wesley, 1998), @samp{15.4}.)  Each trie node stores a
character, and there are branches to subtries for the cases where
a given character is less than, equal to, or greater than the
character in the trie.  There also is a pointer to a symbol table
entry if a symbol ends at the current node.''

So it's a tree encoded as a stream of bytes.  The stream of bytes
acts on a single virtual global symbol, adding and removing
characters and signalling complete symbol points.  Here, we read
the stream and create symbols at the completion points.

First, there's a control byte @code{m}.
If any of the listed bits
in @code{m} is nonzero, we execute what stands at the right, in
the listed order:

@example
   (MMO3_LEFT)
   0x40 - Traverse left trie.
          (Read a new command byte and recurse.)

   (MMO3_SYMBITS)
   0x2f - Read the next byte as a character and store it in the
          current character position; increment character position.
          Test the bits of @code{m}:

          (MMO3_WCHAR)
          0x80 - The character is 16-bit (so read another byte,
                 merge into current character).

          (MMO3_TYPEBITS)
          0xf  - We have a complete symbol; parse the type, value
                 and serial number and do what should be done
                 with a symbol.  The type and length information
                 is in j = (m & 0xf).

                 (MMO3_REGQUAL_BITS)
                 j == 0xf: A register variable.  The following
                           byte tells which register.
                 j <= 8:   An absolute symbol.  Read j bytes as the
                           big-endian number the symbol equals.
                           A j = 2 with two zero bytes denotes an
                           unknown symbol.
                 j > 8:    As with j <= 8, but add (0x20 << 56)
                           to the value in the following j - 8
                           bytes.

                 Then comes the serial number, as a variant of
                 uleb128, but better named ubeb128:
                 Read bytes and shift the previous value left 7
                 (multiply by 128).  Add in the new byte, repeat
                 until a byte has bit 7 set.  The serial number
                 is the computed value minus 128.

          (MMO3_MIDDLE)
          0x20 - Traverse middle trie.  (Read a new command byte
                 and recurse.)  Decrement character position.

   (MMO3_RIGHT)
   0x10 - Traverse right trie.  (Read a new command byte and
          recurse.)
@end example

Let's look again at the @code{lop_stab} for the trivial file
(@pxref{File layout}).

@example
 0x980b0000 - lop_stab for ":Main" = 0, serial 1.
 0x203a4040
 0x10404020
 0x4d206120
 0x69016e00
 0x81000000
@end example

This forms the trivial trie (note that the path between ``:'' and
``M'' is redundant):

@example
 203a     ":"
 40       /
 40      /
 10      \
 40      /
 40     /
 204d  "M"
 2061  "a"
 2069  "i"
 016e  "n" is the last character in a full symbol, and
       with a value represented in one byte.
 00    The value is 0.
 81    The serial number is 1.
@end example

@node mmo section mapping, , Symbol-table, mmo
@subsection mmo section mapping

The implementation in BFD uses special data type 80 (decimal) to
encapsulate and describe named sections, containing e.g.@: debug
information.  If needed, any datum in the encapsulation will be
quoted using lop_quote.  First comes a 32-bit word holding the
number of 32-bit words containing the zero-terminated zero-padded
section name.  After the name there's a 32-bit word holding flags
describing the section type.  Then comes a 64-bit big-endian word
with the section length (in bytes), then another with the section
start address.  Depending on the type of section, the contents
might follow, zero-padded to a 32-bit boundary.  For a loadable
section (such as data or code), the contents might follow at some
later point, not necessarily immediately, as a lop_loc with the
same start address as in the section description, followed by the
contents.  This in effect forms a descriptor that must be emitted
before the actual contents.  Sections described this way must not
overlap.

For areas that don't have such descriptors, synthetic sections are
formed by BFD.  Consecutive contents in the two memory areas
@samp{0x0000@dots{}00} to @samp{0x01ff@dots{}ff} and
@samp{0x2000@dots{}00} to @samp{0x20ff@dots{}ff} are entered in
sections named @code{.text} and @code{.data} respectively.  If an area
is not otherwise described, but would together with a neighboring
lower area be less than @samp{0x40000000} bytes long, it is joined
with the lower area and the gap is zero-filled.  For other cases,
a new section is formed, named @code{.MMIX.sec.@var{n}}.
Here,
@var{n} is a number, a running count through the mmo file,
starting at 0.

A loadable section specified as:

@example
 .section secname,"ax"
 TETRA 1,2,3,4,-1,-2009
 BYTE 80
@end example

and linked to address @samp{0x4}, is represented by the sequence:

@example
 0x98080050 - lop_spec 80
 0x00000002 - two 32-bit words for the section name
 0x7365636e - "secn"
 0x616d6500 - "ame\0"
 0x00000033 - flags CODE, READONLY, LOAD, ALLOC
 0x00000000 - high 32 bits of section length
 0x0000001c - section length is 28 bytes; 6 * 4 + 1 + alignment to 32 bits
 0x00000000 - high 32 bits of section address
 0x00000004 - section address is 4
 0x98010002 - 64 bits with address of following data
 0x00000000 - high 32 bits of address
 0x00000004 - low 32 bits: data starts at address 4
 0x00000001 - 1
 0x00000002 - 2
 0x00000003 - 3
 0x00000004 - 4
 0xffffffff - -1
 0xfffff827 - -2009
 0x50000000 - 80 as a byte, padded with zeros.
@end example

Note that the lop_spec wrapping does not include the section
contents.  Compare this to a non-loaded section specified as:

@example
 .section thirdsec
 TETRA 200001,100002
 BYTE 38,40
@end example

This, when linked to address @samp{0x200000000000001c}, is
represented by:

@example
 0x98080050 - lop_spec 80
 0x00000002 - two 32-bit words for the section name
 0x74686972 - "thir"
 0x64736563 - "dsec"
 0x00000010 - flag READONLY
 0x00000000 - high 32 bits of section length
 0x0000000c - section length is 12 bytes; 2 * 4 + 2 + alignment to 32 bits
 0x20000000 - high 32 bits of address
 0x0000001c - low 32 bits of address 0x200000000000001c
 0x00030d41 - 200001
 0x000186a2 - 100002
 0x26280000 - 38, 40 as bytes, padded with zeros
@end example

For the latter example, the section contents must not be
loaded in memory, and are therefore specified as part of the
special data.  The address is usually unimportant but might
provide information for e.g.@: the DWARF 2 debugging format.
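
The descriptor layout can be made concrete with a small Python sketch
that parses the special data of the loadable @samp{secname} example
above.  This is an illustration only, not BFD's code: the function name
@code{parse_section_descriptor} is hypothetical, and the input is
assumed to be the bytes that follow the @code{lop_spec} word, with any
@code{lop_quote} words already removed.

```python
import struct


def parse_section_descriptor(data):
    """Parse a type-80 section descriptor from already-unquoted bytes.

    Returns (name, flags, length, address, remaining_bytes), where
    remaining_bytes is whatever follows the descriptor (the section
    contents for a non-loaded section, or nothing for a loadable one).
    """
    # First word: number of 32-bit words holding the padded name.
    (name_words,) = struct.unpack_from(">I", data, 0)
    offset = 4
    raw_name = data[offset:offset + 4 * name_words]
    if len(raw_name) != 4 * name_words:
        raise ValueError("truncated section name")
    offset += 4 * name_words
    name = raw_name.rstrip(b"\0").decode("ascii")
    # 32-bit flags, then 64-bit big-endian length and start address.
    flags, length, address = struct.unpack_from(">IQQ", data, offset)
    offset += 4 + 8 + 8
    return name, flags, length, address, data[offset:]


# The eight words after "lop_spec 80" in the secname example.
descriptor = bytes.fromhex(
    "00000002" "7365636e" "616d6500" "00000033"
    "00000000" "0000001c" "00000000" "00000004"
)
print(parse_section_descriptor(descriptor))
# prints ('secname', 51, 28, 4, b'')
```

The empty trailing value reflects the note above: for a loadable
section the contents are not part of the special data, but arrive later
after a @code{lop_loc}.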