📄 c4x.md
;; Machine description for the TMS320C[34]x for GNU C compiler
;; Copyright (C) 1994, 1995, 1996, 1997, 1998,
;; 1999, 2000, 2002 Free Software Foundation, Inc.

;; Contributed by Michael Hayes (m.hayes@elec.canterbury.ac.nz)
;;            and Herman Ten Brugge (Haj.Ten.Brugge@net.HCC.nl)

;; This file is part of GNU CC.

;; GNU CC is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2, or (at your option)
;; any later version.

;; GNU CC is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU CC; see the file COPYING.  If not, write to
;; the Free Software Foundation, 59 Temple Place - Suite 330,
;; Boston, MA 02111-1307, USA.

;
; TODO :
;        Try using PQImode again for addresses since C30 only uses
;        24-bit addresses.
;        Ideally GCC would emit different insns
;        for QImode and Pmode, whether Pmode was QImode or PQImode.
;        For addresses we wouldn't have to have a clobber of the CC
;        associated with each insn and we could use MPYI in address
;        calculations without having to synthesize a proper 32 bit multiply.

;        Additional C30/C40 instructions not coded:
;        CALLcond, IACK, IDLE, LDE, LDFI, LDII, LDM, NORM, RETIcond
;        ROLC, RORC, SIGI, STFI, STII, SUBC, SWI

;        Additional C40 instructions not coded:
;        LDEP, LDPE, LWRct, LAJcond, RETIcondD

;
; C4x MODES
;
; QImode                char, short, int, long (32-bits)
; HImode                long long              (64-bits)
; QFmode                float, double          (32-bits)
; HFmode                long double            (40-bits)
; CCmode
; CC_NOOVmode

;
; C4x PREDICATES:
;
; comparison_operator   LT, GT, LE, GE, LTU, GTU, LEU, GEU, EQ, NE
; memory_operand        memory                                     [m]
; immediate_operand     immediate constant                         [IKN]
; register_operand      register                                   [rf]
; general_operand       register, memory, constant                 [rfmI]

; addr_reg_operand      AR0-AR7, pseudo reg                        [a]
; sp_reg_operand        SP                                         [b]
; std_reg_operand       AR0-AR7, IR0-IR1, RC, RS, RE, SP, pseudo   [c]
; ext_reg_operand       R0-R11, pseudo reg                         [f]
; ext_low_reg_operand   R0-R7, pseudo reg                          [q]
; index_reg_operand     IR0-IR1, pseudo reg                        [x]
; st_reg_operand        ST                                         [y]
; dp_reg_operand        DP                                         [z]
; stik_const_operand    5-bit const                                [K]
; src_operand           general operand                            [rfHmI]
; par_ind_operand       indirect S mode (ARx + 0, 1, IRx)          [S<>]
; parallel_operand      par_ind_operand or ext_low_reg_operand
; symbolic_address_operand
; call_address_operand

; ADDI src2, src1, dst  three operand op
; ADDI src, dst         two operand op

;  Note that the predicates are only used when selecting a pattern
;  to determine if an operand is valid.

;  The constraints then select which of the possible valid operands
;  is present (and guide register selection).
;  The actual assembly
;  instruction is then selected on the basis of the constraints.

;  The extra constraint (valid_operands) is used to determine if
;  the combination of operands is legitimate for the pattern.

;
; C4x CONSTRAINTS:
;
; a   address reg          AR0-AR7
; b   stack pointer        SP
; c   other int reg        AR0-AR7, IR0-IR1, RC, RS, RE
; d   fp reg               R0-R11 (sets CC when dst)
; e
; f   fp reg               R0-R11 (sets CC when dst)
; g   general reg, memory, constant
; h   fp reg (HFmode)      R0-R11 (sets CC when dst)
; i   immediate int constant
; j
; k   block count          BK
; l
; m   memory
; n   immediate int constant with known numeric value
; o   offsettable memory
; p   memory address
; q   low fp reg           R0-R7 (sets CC when dst)
; r   general reg          R0-R11, AR0-AR7, IR0-IR1, RC, RS, RE
; s   immediate int constant (value not explicit)
; t                        R0-R1
; u                        R2-R3
; v   repeat count reg     RC
; w
; x   index reg            IR0-IR1
; y   status (CC) reg      ST
; z   data pointer         DP

; G   fp zero
; H   fp 16-bit constant
; I   signed 16-bit
; J   signed 8-bit    (C4x only)
; K   signed 5-bit    (C4x only)
; L   unsigned 16-bit
; M   unsigned 8-bit  (C4x only)
; N   ones complement of unsigned 16-bit
; O   16 bit high constant
; Q   ARx + 9-bit signed disp
; R   ARx + 5-bit unsigned disp  (C4x only)
; S   ARx + 0, 1, IRx disp
; T   direct memory operand
; V   non offsettable memory
; X   any operand
; <   memory operand with autodecrement addressing
; >   memory operand with autoincrement addressing
; {   memory operand with pre-modify addressing
; }   memory operand with post-modify addressing

;  Note that the 'd', 'f', and 'h' constraints are equivalent.
;  The m constraint is equivalent to 'QT<>{}'

;  Note we cannot use the 'g' constraint with Pmode (i.e., QImode)
;  operations since LEGITIMATE_CONSTANT_P accepts SYMBOL_REF.
;  So instead we use 'rIm' for signed operands or 'rLm' for unsigned operands.

;  Note that the constraints are used to select the operands
;  for a chosen pattern.
;  The constraint that requires the fewest
;  instructions to load an operand is chosen.

;  Note that the 'r' constraint is mostly only used for src integer register
;  operands, while 'c' and 'd' constraints are generally only used for dst
;  integer register operands (the 'r' constraint is the union of the 'c' and
;  'd' constraints).  When a register satisfying the 'd' constraint
;  is used as a dst operand, the CC gets clobbered (except for LDIcond).
;  This is not the case for a register satisfying the 'c' constraint.

;  The 'f' constraint is only for float register operands.  When
;  a register satisfying the 'f' constraint is used as a dst operand,
;  the CC gets clobbered (except for LDFcond).

;  The ! in front of the 'b' constraint tells GCC to disparage the
;  use of this constraint.  The 'b' constraint applies only to the SP.

;  Note that we deal with the condition code CC like some of the RISC
;  architectures (arm, sh, sparc) where it is stored in a general register,
;  in this case the hard register ST (21).  Unlike these other architectures
;  that do not set the CC with many instructions, the C[34]x architecture
;  sets the CC for many instructions when the destination register is
;  an extended precision register.  While it would have been easier
;  to use the generic cc0 register to store the CC, as with most of
;  the other ported architectures, this constrains the setting and testing
;  of the CC to be consecutive insns.  Thus we would reduce the benefit
;  of scheduling instructions to avoid pipeline conflicts and filling of
;  delayed branch slots.

;  Since the C[34]x has many instructions that set the CC, we pay the
;  price of having to explicitly define which insns clobber the CC
;  (rather than using the macro NOTICE_UPDATE_CC).
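
;  As an illustration of the explicit clobber described above, here is a
;  minimal sketch of an add pattern.  It is hypothetical and deliberately
;  left commented out: the pattern name, the choice of predicates and
;  constraints, and the output template are assumptions made for
;  illustration, not a pattern copied from this file.  The element of
;  interest is (clobber (reg:CC 21)), which tells GCC that ST (hard
;  register 21) may be overwritten by the insn.
;
;  (define_insn "*example_addqi3_clobber"
;    [(set (match_operand:QI 0 "ext_reg_operand" "=d")
;          (plus:QI (match_operand:QI 1 "ext_reg_operand" "0")
;                   (match_operand:QI 2 "src_operand" "rIm")))
;     (clobber (reg:CC 21))]
;    ""
;    "addi\\t%2,%0"
;    [(set_attr "type" "binarycc")])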

;  Note that many patterns say that the CC is clobbered when in fact
;  it may not be (depending on the destination register).
;  We have to cover ourselves if an extended precision register
;  is allocated to the destination register.
;  Unfortunately, it is not easy to tell GCC that the clobbering of CC
;  is register dependent.  If we could tolerate the ST register being
;  copied about, then we could store the CC in a pseudo register and
;  use constructs such as (clobber (match_scratch:CC N "&y,X")) to
;  indicate that the 'y' class (ST register) is clobbered for the
;  first combination of operands, but not with the second.
;  I tried this approach for a while but reload got unhappy since I
;  didn't allow it to move the CC around.

;  Note that fundamental operations, such as moves, must not clobber the
;  CC.  Thus movqi chooses a move instruction that doesn't clobber the CC.
;  If GCC wants to combine a move with a compare, it is smart enough to
;  choose the move instruction that sets the CC.

;  Unfortunately, the C[34]x instruction set does not have arithmetic or
;  logical operations that never touch the CC.  We thus have to assume
;  that the CC may be clobbered at all times.  If we define patterns
;  such as addqi without the clobber of CC, then GCC will be forced
;  to use registers such as the auxiliary registers which can cause
;  horrible pipeline conflicts.
;  The tradeoff is that GCC can no longer
;  sneak in an add instruction between setting and testing of the CC.

;  Most of the C[34]x instructions require operands of the following formats,
;  where imm represents an immediate constant, dir a direct memory reference,
;  ind an indirect memory reference, and reg a register:

;        src2 (op2)             src1 (op1)      dst (op0)
; imm  dir  ind  reg  |  imm  dir  ind  reg  |  reg      Notes
;---------------------+----------------------+------
; ILH   T   Q<>   r   |   -    -    -    0   |   r       2 operand
;  -    -   S<>   r   |   -    -   S<>   r   |   r
;  J    -    R    -   |   -    -    R    r   |   r       C4x

;  Arithmetic operations use the I, J constraints for immediate constants,
;  while logical operations use the L, J constraints.  Floating point
;  operations use the H constraint for immediate constants.

;  With most instructions the src2 and src1 operands are commutative
;  (except for SUB, SUBR, ANDN).  The assembler considers
;  ADDI 10, R0, R1 and ADDI R0, 10, R1 to be equivalent.
;  We thus match src2 and src1 with the src_operand predicate and
;  use valid_operands as the extra constraint to reject invalid
;  operand combinations, such as ADDI @foo, @bar, R0.

;  Note that we use the ? modifier so that reload doesn't preferentially
;  try the alternative where three registers are acceptable as
;  operands (whenever an operand requires reloading).  Instead it will try
;  the 2 operand form which will produce better code since it won't require
;  a new spill register.

;  Note that the floating point representation of 0.0 on the C4x
;  is 0x80000000 (-2147483648).
;  This value produces a warning
;  message on 32-bit machines about the decimal constant being so large
;  that it is unsigned.

;  With two operand instruction patterns having two sets,
;  the compare set must come first to keep the combiner happy.
;  While the combiner seems to cope most of the time with the
;  compare set coming second, it's best to have it first.

;
; C4x CONSTANT attributes
;
(define_attr "cpu" "c4x,c3x"
 (const
  (cond [(symbol_ref "TARGET_C3X") (const_string "c3x")]
         (const_string "c4x"))))

;
; C4x INSN ATTRIBUTES:
;
; lda           load address, non-clobber CC
; store         memory store, non-clobber CC
; load_load     parallel memory loads, non-clobber CC
; load_store    parallel memory load and store, non-clobber CC
; store_load    parallel memory store and load, non-clobber CC
; store_store   parallel memory stores, non-clobber CC
; unary         two operand arithmetic, non-clobber CC
; unarycc       two operand arithmetic, clobber CC
; binary        three operand arithmetic, non-clobber CC
; binarycc      three operand arithmetic, clobber CC
; compare       compare, clobber CC
; call          function call
; rets          return from subroutine
; jump          unconditional branch
; jmpc          conditional branch
; db            decrement and branch (unconditional)
; dbc           decrement and branch (conditional)
; ldp           load DP
; push          stack push
; pop           stack pop
; repeat        block repeat
; repeat_top    block repeat top
; laj           link and jump
; multi         multiple instruction
; misc          nop             (default)

;  The only real instructions that affect things are the ones that modify
;  address registers and ones that call or jump.  Note that the number
;  of operands refers to the RTL insn pattern, not the number of explicit
;  operands in the machine instruction.
;
(define_attr "type" "lda,store,unary,unarycc,binary,binarycc,compare,call,rets,jump,jmpc,db,dbc,misc,ldp,repeat,repeat_top,laj,load_load,load_store,store_load,store_store,push,pop,multi"
             (const_string "misc"))

; Some instructions operate on unsigned data constants, some on signed data
; constants, or the ones complement of unsigned constants.
; This differentiates them.
; Default to signed.  This attribute
; is used by the macro SMALL_CONST () (defined in c4x.h) to determine
; whether an immediate integer constant will fit within the instruction,
; or will have to be loaded using direct addressing from memory.
; Note that logical operations assume unsigned integers whereas
; arithmetic operations assume signed integers.  Note that the C4x
; small immediate constant (J) used as src2 in three operand instructions
; is always signed.  not_uint16 refers to a number that fits into 16-bits
; when one's complemented.
;
(define_attr "data" "int16,uint16,high_16,not_uint16" (const_string "int16"))

(define_asm_attributes
  [(set_attr "type" "multi")])

;
; C4x DELAY SLOTS
;
; Define delay slot scheduling for branch and call instructions.
; The C[34]x has three delay slots.  Note that none of the three instructions
; that follow a delayed branch can be a Bcond, BcondD, BR, BRD, DBcond,
; DBcondD, CALL, CALLcond, TRAPcond, RETIcond, RETScond, RPTB, RPTS, or IDLE.
;
; Annulled branches are a bit difficult because the next instructions
; are preprocessed.
; The table below shows what phase of the c4x is executed.
;        BccA[TF] label
;        op1             fetch, decode and read executed
;        op2             fetch and decode executed
;        op3             fetch executed
; This means that we can allow any instruction in the last delay slot
; and only instructions which modify registers in the first two.
; lda can not be executed in the first delay slot
; and ldpk can not be executed in the first two delay slots.

(define_attr "onlyreg" "false,true"
       (cond [(eq_attr "type" "unary,unarycc")
                       (if_then_else (and (match_operand 0 "reg_imm_operand" "")
                                          (match_operand 1 "reg_imm_operand" ""))
                                     (const_string "true") (const_string "false"))
              (eq_attr "type" "binary,binarycc")
                       (if_then_else (and (match_operand 0 "reg_imm_operand" "")