as a valid @code{FORMAT} statement specifying a twelve-character
Hollerith constant.

The implication here is that, since the new lexer is a zero-feedback one,
it won't know that the special case of a @code{FORMAT} statement being parsed
requires apparently distinct lexemes @samp{1} and @samp{2} to be treated as
a single lexeme.

(This is a horrible misfeature of the Fortran 90 language.
It's one of many such misfeatures that almost make me want
to not support them, and forge ahead with designing a new
``GNU Fortran'' language that has the features,
but not the misfeatures, of Fortran 90,
and provide utility programs to do the conversion automatically.)

So, the lexer must gather distinct chunks of decimal strings into
a single lexeme in contexts where a single decimal lexeme might
start a Hollerith constant.

(Which probably means it might as well do that all the time
for all multi-character lexemes, even in free-form mode,
leaving it to subsequent phases to pull them apart as they see fit.)

Compare the treatment of this to how

@smallexample
CHARACTER * 4 5 HEY
@end smallexample

and

@smallexample
CHARACTER * 12 HEY
@end smallexample

must be treated---the former must be diagnosed, due to the separation
between lexemes; the latter must be accepted as a proper declaration.

@subsubsection Hollerith Constants

Recognizing a Hollerith constant---specifically,
that an @samp{H} or @samp{h} after a digit string begins
such a constant---requires some knowledge of context.

Hollerith constants (such as @samp{2HAB}) can appear after:

@itemize @bullet
@item
@samp{(}

@item
@samp{,}

@item
@samp{=}

@item
@samp{+}, @samp{-}, @samp{/}

@item
@samp{*}, except as noted below
@end itemize

Hollerith constants don't appear after:

@itemize @bullet
@item
@samp{CHARACTER*},
which can be treated generally as
any @samp{*} that is the second lexeme of a statement
@end itemize

@subsubsection Confusing Function Keyword

While

@smallexample
REAL FUNCTION FOO ()
@end smallexample

must be a @code{FUNCTION} statement and

@smallexample
REAL FUNCTION FOO (5)
@end smallexample

must be a type-declaration statement,

@smallexample
REAL FUNCTION FOO (@var{names})
@end smallexample

where @var{names} is a comma-separated list of names,
can be one or the other.

The only way to disambiguate that statement
(short of mandating free-form source or a short maximum
length for names of external procedures)
is based on the context of the statement.

In particular, if the statement is known to be within an
already-started program unit
(but not at the outer level of the @code{CONTAINS} block),
it is a type-declaration statement.

Otherwise, the statement is a @code{FUNCTION} statement,
in that it begins a function program unit
(external, or, within @code{CONTAINS}, nested).

@subsubsection Weird READ

The statement

@smallexample
READ (N)
@end smallexample

is equivalent to either

@smallexample
READ (UNIT=(N))
@end smallexample

or

@smallexample
READ (FMT=(N))
@end smallexample

depending on which would be valid in context.

Specifically, if @samp{N} is type @code{INTEGER},
@samp{READ (FMT=(N))} would not be valid,
because parentheses may not be used around @samp{N},
whereas they may around it in @samp{READ (UNIT=(N))}.

Further, if @samp{N} is type @code{CHARACTER},
the opposite is true---@samp{READ (UNIT=(N))} is not valid,
but @samp{READ (FMT=(N))} is.

Strictly speaking, if anything follows

@smallexample
READ (N)
@end smallexample

in the statement, whether the first lexeme after the close
parenthesis is a comma could be used to disambiguate the two cases,
without looking at the type of @samp{N},
because the comma is required for the @samp{READ (FMT=(N))}
interpretation and disallowed for the @samp{READ (UNIT=(N))}
interpretation.

However, in practice, many Fortran compilers allow
the comma for the @samp{READ (UNIT=(N))}
interpretation anyway
(in that they generally allow a leading comma before
an I/O list in an I/O statement),
and much code takes advantage of this allowance.

(This is quite a reasonable allowance, since the
juxtaposition of a comma-separated list immediately
after an I/O control-specification
list, which is also comma-separated,
without an intervening comma,
looks sufficiently ``wrong'' to programmers
that they can't resist the itch to insert the comma.
@samp{READ (I, J), K, L} simply looks cleaner than
@samp{READ (I, J) K, L}.)

So, type-based disambiguation is needed unless strict adherence
to the standard is always assumed, and we're not going to assume that.

@node TBD (Transforming)
@subsection TBD (Transforming)

Continue researching gotchas, designing the transformational process,
and implementing it.

Specific issues to resolve:

@itemize @bullet
@item
Just where should @code{INCLUDE} processing take place?

Clearly before (or part of) statement identification (@file{sta.c}),
since determining whether @samp{I(J)=K} is a statement-function
definition or an assignment statement requires knowing the context,
which in turn requires having processed @code{INCLUDE} files.

@item
Just where should (if it was implemented) @code{USE} processing take place?

This gets into the whole issue of how @code{g77} should handle the concept
of modules.
I think GNAT already takes on this issue, but don't know more than that.
Jim Giles has written extensively on @code{comp.lang.fortran}
about his opinions on module handling, as have others.
Jim's views should be taken into account.

Actually, Richard M.
Stallman (RMS) also has written up
some guidelines for implementing such things,
but I'm not sure where I read them.
Perhaps the old @email{gcc2@@cygnus.com} list.

If someone could dig references to these up and get them to me,
that would be much appreciated!
Even though modules are not on the short-term list for implementation,
it'd be helpful to know @emph{now} how to avoid making them harder to
implement @emph{later}.

@item
Should the @code{g77} command become just a script that invokes
all the various preprocessing that might be needed,
thus making it seem slower than necessary for legacy code
that people are unwilling to convert,
or should we provide a separate script for that,
thus encouraging people to convert their code once and for all?

At least, a separate script to behave as old @code{g77} did,
perhaps named @code{g77old}, might ease the transition,
as might a corresponding one, named @code{g77oldnew},
that converts source code.

These scripts would take all the pertinent options @code{g77} used
to take and run the appropriate filters,
passing the results to @code{g77} or just making new sources out of them
(in a subdirectory, leaving the user to do the dirty deed of
moving or copying them over the old sources).

@item
Do other Fortran compilers provide a prefix syntax
to govern the treatment of backslashes in @code{CHARACTER}
(or Hollerith) constants?

Knowing what other compilers provide would help.

@item
Is it okay to drop support for the @samp{-fintrin-case-initcap},
@samp{-fmatch-case-initcap}, @samp{-fsymbol-case-initcap},
and @samp{-fcase-initcap} options?

I've asked @email{info-gnu-fortran@@gnu.org} for input on this.
Not having to support these makes it easier to write the new front end,
and might also avoid complicating its design.
@end itemize

@node Philosophy of Code Generation
@section Philosophy of Code Generation

Don't poke the bear.

The @code{g77} front end generates code
via the @code{gcc} back end.

@cindex GNU Back End (GBE)
@cindex GBE
@cindex @code{gcc}, back end
@cindex back end, gcc
@cindex code generator
The @code{gcc} back end (GBE) is a large, complex
labyrinth of intricate code
written in a combination of the C language
and specialized languages internal to @code{gcc}.

While the @emph{code} that implements the GBE
is written in a combination of languages,
the GBE itself is,
to the front end for a language like Fortran,
best viewed as a @emph{compiler}
that compiles its own, unique, language.

The GBE's ``source'', then, is written in this language,
which consists primarily of
a combination of calls to GBE functions
and @dfn{tree} nodes
(which are, themselves, created
by calling GBE functions).

So, @code{g77} generates code by, in effect,
translating the Fortran code it reads
into a form ``written'' in the ``language''
of the @code{gcc} back end.

@cindex GBEL
@cindex GNU Back End Language (GBEL)
This language will henceforth be referred to as @dfn{GBEL},
for GNU Back End Language.

GBEL is an evolving language,
not fully specified in any published form
as of this writing.
It offers many facilities,
but its ``core'' facilities
are those that correspond most directly
to those needed to support @code{gcc}
(compiling code written in GNU C).

The @code{g77} Fortran Front End (FFE)
is designed and implemented
to navigate the currents and eddies
of ongoing GBEL and @code{gcc} development
while also delivering on the potential
of an integrated FFE
(as compared to using a converter like @code{f2c}
and feeding the output into @code{gcc}).

Goals of the FFE's code-generation strategy include:

@itemize @bullet
@item
High likelihood of generation of correct code,
or, failing that, producing a fatal diagnostic or crashing.

@item
Generation of highly optimized code,
as directed by the user
via GBE-specific (versus @code{g77}-specific) constructs,
such as command-line options.

@item
Fast overall (FFE plus GBE) compilation.

@item
Preservation of source-level debugging information.
@end itemize

The strategies historically, and currently, used by the FFE
to achieve these goals include:

@itemize @bullet
@item
Use of GBEL constructs that most faithfully encapsulate
the semantics of Fortran.

@item
Avoidance of GBEL constructs that are so rarely used,
or limited to use in specialized situations not related to Fortran,
that their reliability and performance have not yet been established
as sufficient for use by the FFE.

@item
Flexible design, to readily accommodate changes to specific
code-generation strategies, perhaps governed by command-line options.
@end itemize

@cindex Bear-poking
@cindex Poking the bear
``Don't poke the bear'' somewhat summarizes the above strategies.
The GBE is the bear.
The FFE is designed and implemented to avoid poking it
in ways that are likely to just annoy it.
The FFE usually either tackles it head-on,
or avoids treating it in ways dissimilar to how
the @code{gcc} front end treats it.

For example, the FFE uses the native array facility in the back end
instead of the lower-level pointer-arithmetic facility
used by @code{gcc} when compiling @code{f2c} output.
Theoretically, this presents more opportunities for optimization,
faster compile times,
and the production of more faithful debugging information.
These benefits were not, however, immediately realized,
mainly because @code{gcc} itself makes little or no use
of the native array facility.

Complex arithmetic is a case study of the evolution of this strategy.
When originally implemented,
the GBEL had just evolved its own native complex-arithmetic facility,
so the FFE took advantage of that.

When porting @code{g77} to 64-bit systems,
it was discovered that the GBE didn't really
implement its native complex-arithmetic facility properly.

The short-term solution was to rewrite the FFE
to instead use the lower-level facilities
that'd be used by @code{gcc}-compiled code
(assuming that code, itself, didn't use the native complex type
provided, as an extension, by @code{gcc}),
since these were known to work,
and, in any case, if shown to not work,
would likely be rapidly fixed
(since they'd likely not work for vanilla C code in
similar circumstances).

However, the rewrite accommodated the original, native approach as well
by offering a command-line option to select it over the emulated approach.
This allowed users, and especially GBE maintainers, to try out
fixes to complex-arithmetic support in the GBE
while @code{g77} continued to default to compiling more code correctly,
albeit producing (typically) slower executables.

As of April 1999, it appeared that the last few bugs
in the GBE's support of its native complex-arithmetic facility
were worked out.
The FFE was changed back to default to using that native facility,
leaving emulation as an option.

Other Fortran constructs---arrays, character strings,
complex division, @code{COMMON} and @code{EQUIVALENCE} aggregates,
and so on---involve issues similar to those pertaining to complex arithmetic.

So, it is possible that the history
of how the FFE handled complex arithmetic
will be repeated, probably in modified form
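
As a rough illustration of the difference between the native and emulated
approaches to complex arithmetic described above, here is a minimal C sketch.
It is not taken from the FFE or the GBE.
The names @code{emu_complex}, @code{emu_mul}, and @code{native_mul} are
hypothetical, and the standard C99 complex type stands in for whatever
native facility the back end actually provides.
The emulated version spells out the multiply using only scalar operations
that vanilla C code would also exercise;
the native version hands the whole operation to the compiler.

```c
#include <complex.h>

/* Emulated approach (hypothetical names): a complex value is just a
   pair of reals, and the multiply is written out with ordinary
   scalar arithmetic. */
typedef struct { double re, im; } emu_complex;

static emu_complex emu_mul(emu_complex a, emu_complex b)
{
    emu_complex r;
    r.re = a.re * b.re - a.im * b.im;   /* real part */
    r.im = a.re * b.im + a.im * b.re;   /* imaginary part */
    return r;
}

/* Native approach: the compiler sees a complex-typed multiply and
   decides for itself how to expand it. */
static double complex native_mul(double complex a, double complex b)
{
    return a * b;
}
```

Both functions should agree on ordinary finite inputs;
the interesting differences are in which parts of the compiler each one
exercises, and in how much freedom the optimizer has.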