@c This summary of BFD is shared by the BFD and LD docs.

When an object file is opened, BFD subroutines automatically determine
the format of the input object file.  They then build a descriptor in
memory with pointers to routines that will be used to access elements of
the object file's data structures.

As different information from the object files is required,
BFD reads from different sections of the file and processes them.
For example, a very common operation for the linker is processing symbol
tables.  Each BFD back end provides a routine for converting
between the object file's representation of symbols and an internal
canonical format.  When the linker asks for the symbol table of an object
file, it calls through a memory pointer to the routine from the
relevant BFD back end, which reads and converts the table into a canonical
form.  The linker then operates upon the canonical form.  When the link is
finished and the linker writes the output file's symbol table,
another BFD back end routine is called to take the newly
created symbol table and convert it into the chosen output format.

@menu
* BFD information loss::	Information Loss
* Canonical format::		The BFD canonical object-file format
@end menu

@node BFD information loss
@subsection Information Loss

@emph{Information can be lost during output.} The output formats
supported by BFD do not provide identical facilities, and
information which can be described in one form has nowhere to go in
another format.  One example of this is alignment information in
@code{b.out}.  There is nowhere in an @code{a.out} format file to store
alignment information on the contained data, so when a file is linked
from @code{b.out} and an @code{a.out} image is produced, alignment
information will not propagate to the output file.  (The linker will
still use the alignment information internally, so the link is performed
correctly.)

Another example is COFF section names.  COFF files may contain an
unlimited number of sections, each one with a textual section name.
If the target of the link is a format which does not have many sections (e.g.,
@code{a.out}) or has sections without names (e.g., the Oasys format), the
link cannot be done simply.  You can circumvent this problem by
describing the desired input-to-output section mapping with the linker command
language.

@emph{Information can be lost during canonicalization.} The BFD
internal canonical form of the external formats is not exhaustive; there
are structures in input formats for which there is no direct
representation internally.  This means that the BFD back ends
cannot maintain all possible data richness through the transformation
from external to internal and back to external formats.

This limitation is only a problem when an application reads one
format and writes another.  Each BFD back end is responsible for
maintaining as much data as possible, and the internal BFD
canonical form has structures which are opaque to the BFD core,
and exported only to the back ends.  When a file is read in one format,
the canonical form is generated for BFD and the application.  At the
same time, the back end saves away any information which may otherwise
be lost.  If the data is then written back in the same format, the back
end routine will be able to use the canonical form provided by the
BFD core as well as the information it prepared earlier.  Since
there is a great deal of commonality between back ends,
there is no information lost when
linking or copying big endian COFF to little endian COFF, or @code{a.out} to
@code{b.out}.  When a mixture of formats is linked, the information is
only lost from the files whose format differs from the destination.

@node Canonical format
@subsection The BFD canonical object-file format

The greatest potential for loss of information occurs when there is the least
overlap between the information provided by the source format, that
stored by the canonical format, and that needed by the
destination format.
A brief description of the canonical form may help
you understand which kinds of data you can count on preserving across
conversions.
@cindex BFD canonical format
@cindex internal object-file format

@table @emph
@item files
Information stored on a per-file basis includes target machine
architecture, particular implementation format type, a demand pageable
bit, and a write protected bit.  Information like Unix magic numbers is
not stored here---only the magic numbers' meaning, so a @code{ZMAGIC}
file would have both the demand pageable bit and the write protected
text bit set.  The byte order of the target is stored on a per-file
basis, so that big- and little-endian object files may be used with one
another.

@item sections
Each section in the input file contains the name of the section, the
section's original address in the object file, size and alignment
information, various flags, and pointers into other BFD data
structures.

@item symbols
Each symbol contains a pointer to the information for the object file
which originally defined it, its name, its value, and various flag
bits.  When a BFD back end reads in a symbol table, it relocates all
symbols to make them relative to the base of the section where they were
defined.  Doing this ensures that each symbol points to its containing
section.  Each symbol also has a varying amount of hidden private data
for the BFD back end.  Since the symbol points to the original file, the
private data format for that symbol is accessible.  @code{ld} can
operate on a collection of symbols of wildly different formats without
problems.

Normal global and simple local symbols are maintained on output, so an
output file (no matter its format) will retain symbols pointing to
functions and to global, static, and common variables.  Some symbol
information is not worth retaining; in @code{a.out}, type information is
stored in the symbol table as long symbol names.
This information would
be useless to most COFF debuggers; the linker has command line switches
to allow users to throw it away.

There is one word of type information within the symbol, so if the
format supports symbol type information within symbols (for example, COFF,
IEEE, Oasys) and the type is simple enough to fit within one word
(nearly everything but aggregates), the information will be preserved.

@item relocation level
Each canonical BFD relocation record contains a pointer to the symbol to
relocate to, the offset of the data to relocate, the section the data
is in, and a pointer to a relocation type descriptor.  Relocation is
performed by passing messages through the relocation type
descriptor and the symbol pointer.  Therefore, relocations can be performed
on output data using a relocation method that is only available in one of the
input formats.  For instance, Oasys provides a byte relocation format.
A relocation record requesting this relocation type would point
indirectly to a routine to perform this, so the relocation may be
performed on a byte being written to a 68k COFF file, even though 68k COFF
has no such relocation type.

@item line numbers
Object formats can contain, for debugging purposes, some form of mapping
between symbols, source line numbers, and addresses in the output file.
These addresses have to be relocated along with the symbol information.
Each symbol with an associated list of line number records points to the
first record of the list.  The head of a line number list consists of a
pointer to the symbol, which allows finding out the address of the
function whose line number is being described.  The rest of the list is
made up of pairs: offsets into the section and line numbers.  Any format
which can simply derive this information can pass it successfully
between formats (COFF, IEEE and Oasys).
@end table
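
The relocation scheme described in the table above can be illustrated with
a small, self-contained C sketch.  Every name in it (@code{sk_section},
@code{sk_symbol}, @code{sk_reloc}, @code{sk_reloc_type}, and the helper
functions) is hypothetical and chosen only for this illustration; these are
not the real BFD declarations, and the byte relocation shown is an assumed,
simplified rule (add the low eight bits of the symbol's address to the byte
in place).  The point is only to show how a relocation record that points at
a type descriptor lets generic code apply a relocation it knows nothing
about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; NOT the real BFD declarations. */

struct sk_section {
    const char *name;
    uint32_t    output_address;  /* where the section lands in the output */
    uint8_t    *contents;        /* raw bytes of the section */
    size_t      size;
};

struct sk_symbol {
    const char        *name;
    struct sk_section *section;  /* containing section */
    uint32_t           value;    /* offset from the base of that section */
};

struct sk_reloc;

/* Relocation type descriptor: carries the routine that knows how to patch. */
struct sk_reloc_type {
    const char *name;
    void (*apply)(const struct sk_reloc *rel, struct sk_section *sec);
};

/* Canonical relocation record: symbol, offset, and type descriptor. */
struct sk_reloc {
    struct sk_symbol           *symbol;  /* symbol to relocate to */
    size_t                      offset;  /* offset of the data to patch */
    const struct sk_reloc_type *type;
};

/* Symbols are section-relative, so the final address is base plus value. */
static uint32_t sk_symbol_address(const struct sk_symbol *sym)
{
    return sym->section->output_address + sym->value;
}

/* Assumed byte relocation: add the low 8 bits of the symbol address. */
static void sk_apply_byte(const struct sk_reloc *rel, struct sk_section *sec)
{
    assert(rel->offset < sec->size);
    sec->contents[rel->offset] =
        (uint8_t)(sec->contents[rel->offset] + sk_symbol_address(rel->symbol));
}

static const struct sk_reloc_type sk_byte_reloc = { "byte", sk_apply_byte };

/* Generic code dispatches through the descriptor without knowing the type. */
static void sk_perform_reloc(const struct sk_reloc *rel, struct sk_section *sec)
{
    rel->type->apply(rel, sec);
}

/* Patch one byte of a text section with the address of a data symbol. */
uint8_t sk_demo_byte_reloc(void)
{
    uint8_t text_bytes[4] = { 0x00, 0x02, 0x00, 0x00 };  /* addend 0x02 at offset 1 */
    uint8_t data_bytes[1] = { 0 };
    struct sk_section text = { ".text", 0x1000, text_bytes, sizeof text_bytes };
    struct sk_section data = { ".data", 0x2010, data_bytes, sizeof data_bytes };
    struct sk_symbol  var  = { "var", &data, 0x04 };
    struct sk_reloc   rel  = { &var, 1, &sk_byte_reloc };

    sk_perform_reloc(&rel, &text);
    return text_bytes[1];
}
```

Because @code{sk_perform_reloc} only follows the pointer stored in the
record, the output side needs no knowledge of the relocation type; the
routine supplied by the format that defined the relocation does the work.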