@}
@end example

@end table

@node Libtcc
@chapter The @code{libtcc} library

The @code{libtcc} library enables you to use TCC as a backend for
dynamic code generation. See @file{libtcc.h} for an overview of the
API and @file{libtcc_test.c} for a very simple example.

The idea is to pass a C string containing the program you want to
compile directly to @code{libtcc}. You can then access any global
symbol (function or variable) that the program defines.

@chapter Developer's guide

This chapter gives some hints to understand how TCC works. You can skip
it if you do not intend to modify the TCC code.

@section File reading

The @code{BufferedFile} structure contains the context needed to read a
file, including the current line number. @code{tcc_open()} opens a new
file and @code{tcc_close()} closes it. @code{inp()} returns the next
character.

@section Lexer

@code{next()} reads the next token in the current
file. @code{next_nomacro()} reads the next token without macro
expansion.

@code{tok} contains the current token (see the @code{TOK_xxx}
constants). Identifiers and keywords are also tokens. @code{tokc}
contains additional information about the token (for example, the
constant value of a number or string token).

@section Parser

The parser is hand-written (yacc is not necessary). It does only one pass,
except:

@itemize

@item For initialized arrays with unknown size, a first pass is done to
count the number of elements.

@item For architectures where arguments are evaluated in reverse order,
a first pass is done to reverse the argument order.

@end itemize

@section Types

The types are stored in a single 'int' variable. This representation was
chosen in the first stages of development, when tcc was much simpler.
Now, it may not be the best solution.

@example
#define VT_INT        0  /* integer type */
#define VT_BYTE       1  /* signed byte type */
#define VT_SHORT      2  /* short type */
#define VT_VOID       3  /* void type */
#define VT_PTR        4  /* pointer */
#define VT_ENUM       5  /* enum definition */
#define VT_FUNC       6  /* function type */
#define VT_STRUCT     7  /* struct/union definition */
#define VT_FLOAT      8  /* IEEE float */
#define VT_DOUBLE     9  /* IEEE double */
#define VT_LDOUBLE   10  /* IEEE long double */
#define VT_BOOL      11  /* ISOC99 boolean type */
#define VT_LLONG     12  /* 64 bit integer */
#define VT_LONG      13  /* long integer (NEVER USED as type, only
                            during parsing) */
#define VT_BTYPE      0x000f /* mask for basic type */
#define VT_UNSIGNED   0x0010 /* unsigned type */
#define VT_ARRAY      0x0020 /* array type (also has VT_PTR) */
#define VT_BITFIELD   0x0040 /* bitfield modifier */

#define VT_STRUCT_SHIFT 16   /* structure/enum name shift (16 bits left) */
@end example

When a reference to another type is needed (for pointers, functions and
structures), the @code{32 - VT_STRUCT_SHIFT} high order bits are used to
store an identifier reference.

The @code{VT_UNSIGNED} flag can be set for chars, shorts, ints and long
longs.

Arrays are considered as pointers @code{VT_PTR} with the flag
@code{VT_ARRAY} set.

The @code{VT_BITFIELD} flag can be set for chars, shorts, ints and long
longs. If it is set, then the bitfield position is stored in bits
@code{VT_STRUCT_SHIFT} to @code{VT_STRUCT_SHIFT + 5} and the bitfield
size is stored in bits @code{VT_STRUCT_SHIFT + 6} to
@code{VT_STRUCT_SHIFT + 11}.

@code{VT_LONG} is never used except during parsing.

During parsing, the storage class of an object is also stored in the type
integer:

@example
#define VT_EXTERN  0x00000080  /* extern definition */
#define VT_STATIC  0x00000100  /* static variable */
#define VT_TYPEDEF 0x00000200  /* typedef definition */
@end example

@section Symbols

All symbols are stored in hashed symbol stacks.
Each symbol stack
contains @code{Sym} structures.

@code{Sym.v} contains the symbol name (remember
an identifier is also a token, so a string is never necessary to store
it). @code{Sym.t} gives the type of the symbol. @code{Sym.r} is usually
the register in which the corresponding variable is stored. @code{Sym.c} is
usually a constant associated with the symbol.

Five main symbol stacks are defined:

@table @code

@item define_stack
for the macros (@code{#define}s).

@item global_stack
for the global variables, functions and types.

@item local_stack
for the local variables, functions and types.

@item global_label_stack
for the local labels (for @code{goto}).

@item label_stack
for GCC block local labels (see the @code{__label__} keyword).

@end table

@code{sym_push()} is used to add a new symbol to the local symbol
stack. If no local symbol stack is active, it is added to the global
symbol stack.

@code{sym_pop(st,b)} pops symbols from the symbol stack @var{st} until
the symbol @var{b} is on the top of the stack. If @var{b} is NULL, the
stack is emptied.

@code{sym_find(v)} returns the symbol associated with the identifier
@var{v}. The local stack is searched first from top to bottom, then the
global stack.

@section Sections

The generated code and data are written in sections. The structure
@code{Section} contains all the necessary information for a given
section. @code{new_section()} creates a new section. ELF file semantics
are assumed for each section.

The following sections are predefined:

@table @code

@item text_section
is the section containing the generated code.
@var{ind} contains the
current position in the code section.

@item data_section
contains initialized data

@item bss_section
contains uninitialized data

@item bounds_section
@itemx lbounds_section
are used when bound checking is activated

@item stab_section
@itemx stabstr_section
are used when debugging is activated to store debug information

@item symtab_section
@itemx strtab_section
contain the exported symbols (currently only used for debugging).

@end table

@section Code generation
@cindex code generation

@subsection Introduction

The TCC code generator directly generates linked binary code in one
pass. This is rather unusual these days (gcc, for example,
generates text assembly), but it can be very fast and is surprisingly
uncomplicated.

The TCC code generator is register based. Optimization is only done at
the expression level. No intermediate representation of expressions is
kept except the current values stored in the @emph{value stack}.

On x86, three temporary registers are used. When more registers are
needed, one register is spilled into a new temporary variable on the stack.

@subsection The value stack
@cindex value stack, introduction

When an expression is parsed, its value is pushed on the value stack
(@var{vstack}). The top of the value stack is @var{vtop}. Each value
stack entry is the structure @code{SValue}.

@code{SValue.t} is the type. @code{SValue.r} indicates how the value is
currently stored in the generated code.
It is usually a CPU register
index (@code{REG_xxx} constants), but additional values and flags are
defined:

@example
#define VT_CONST     0x00f0
#define VT_LLOCAL    0x00f1
#define VT_LOCAL     0x00f2
#define VT_CMP       0x00f3
#define VT_JMP       0x00f4
#define VT_JMPI      0x00f5
#define VT_LVAL      0x0100
#define VT_SYM       0x0200
#define VT_MUSTCAST  0x0400
#define VT_MUSTBOUND 0x0800
#define VT_BOUNDED   0x8000
#define VT_LVAL_BYTE     0x1000
#define VT_LVAL_SHORT    0x2000
#define VT_LVAL_UNSIGNED 0x4000
#define VT_LVAL_TYPE     (VT_LVAL_BYTE | VT_LVAL_SHORT | VT_LVAL_UNSIGNED)
@end example

@table @code

@item VT_CONST
indicates that the value is a constant. It is stored in the union
@code{SValue.c}, depending on its type.

@item VT_LOCAL
indicates a local variable pointer at offset @code{SValue.c.i} in the
stack.

@item VT_CMP
indicates that the value is actually stored in the CPU flags (i.e. the
value is the consequence of a test). The value is either 0 or 1. The
actual CPU flags used are indicated in @code{SValue.c.i}.

If any code is generated which destroys the CPU flags, this value MUST be
put in a normal register.

@item VT_JMP
@itemx VT_JMPI
indicates that the value is the consequence of a conditional jump. For
@code{VT_JMP}, it is 1 if the jump is taken, 0 otherwise. For
@code{VT_JMPI} it is inverted.

These values are used to compile the @code{||} and @code{&&} logical
operators.

If any code is generated, this value MUST be put in a normal
register. Otherwise, the generated code won't be executed if the jump is
taken.

@item VT_LVAL
is a flag indicating that the value is actually an lvalue (left value of
an assignment). It means that the value stored is actually a pointer to
the wanted value.

Understanding the use of @code{VT_LVAL} is very important if you want to
understand how TCC works.

@item VT_LVAL_BYTE
@itemx VT_LVAL_SHORT
@itemx VT_LVAL_UNSIGNED
if the lvalue has an integer type, then these flags give its real
type. The type alone is not enough in case of cast optimisations.

@item VT_LLOCAL
is a saved lvalue on the stack.
@code{VT_LLOCAL} should be eliminated
ASAP because its semantics are rather complicated.

@item VT_MUSTCAST
indicates that a cast to the value type must be performed if the value
is used (lazy casting).

@item VT_SYM
indicates that the symbol @code{SValue.sym} must be added to the constant.

@item VT_MUSTBOUND
@itemx VT_BOUNDED
are only used for optional bound checking.

@end table

@subsection Manipulating the value stack
@cindex value stack

@code{vsetc()} and @code{vset()} push a new value on the value
stack. If the previous @var{vtop} was stored in a very unsafe place (for
example in the CPU flags), then some code is generated to put the
previous @var{vtop} in safe storage.

@code{vpop()} pops @var{vtop}. In some cases, it also generates cleanup
code (for example if stacked floating point registers are used, as on
x86).

The @code{gv(rc)} function generates code to evaluate @var{vtop} (the
top value of the stack) into registers. @var{rc} selects in which
register class the value should be put. @code{gv()} is the @emph{most
important function} of the code generator.

@code{gv2()} is the same as @code{gv()} but for the top two stack
entries.

@subsection CPU dependent code generation
@cindex CPU dependent
See the @file{i386-gen.c} file for an example.

@table @code

@item load()
must generate the code needed to load a stack value into a register.

@item store()
must generate the code needed to store a register into a stack value
lvalue.

@item gfunc_start()
@itemx gfunc_param()
@itemx gfunc_call()
should generate a function call

@item gfunc_prolog()
@itemx gfunc_epilog()
should generate a function prolog/epilog.

@item gen_opi(op)
must generate the binary integer operation @var{op} on the two top
entries of the stack, which are guaranteed to contain integer types.

The result value should be put on the stack.

@item gen_opf(op)
same as @code{gen_opi()} for floating point operations.
The two top
entries of the stack are guaranteed to contain floating point values of
the same type.

@item gen_cvt_itof()
integer to floating point conversion.

@item gen_cvt_ftoi()
floating point to integer conversion.

@item gen_cvt_ftof()
floating point to floating point of different size conversion.

@item gen_bounded_ptr_add()
@itemx gen_bounded_ptr_deref()
are only used for bounds checking.

@end table

@section Optimizations done
@cindex optimizations
@cindex constant propagation
@cindex strength reduction
@cindex comparison operators
@cindex caching processor flags
@cindex flags, caching
@cindex jump optimization

Constant propagation is done for all operations. Multiplications and
divisions are optimized to shifts when appropriate. Comparison
operators are optimized by maintaining a special cache for the
processor flags. &&, || and ! are optimized by maintaining a special
'jump target' value. No other jump optimization is currently performed
because it would require storing the code in a more abstract fashion.

@unnumbered Concept Index
@printindex cp

@bye

@c Local variables:
@c fill-column: 78
@c texinfo-column-for-description: 32
@c End: