The Structure of the GNAT Compiler

Download Report

Transcript The Structure of the GNAT Compiler

The Structure of the GNAT Compiler
A target-independent Ada95 front-end for GCC


Ada components
C components
GCC tree
Syntax
Sem
AST
Expand
Annotated
AST
gigi
GCC
Your
Project!
Lexical Scanner






Hand-written for speed
Must handle different encodings for extended character
sets (ISO642).
Input file is read in full to minimize I/O
Data structure: global name table
Scanner is subroutine for parser, delivers tokens on
demand
Preprocessor may be integrated with scanner (but Ada
has no defined preprocessor).
Parser


Hand-written for speed and error recovery
Organized according to chapters of the
language reference (ARM):





par-ch3 handles declarations, par-ch9 handles
tasking, par-ch12 handles generics, etc.
Recursive-descent with lookahead
Follows exactly ARM grammar (ambiguous!)
Builds AST: main data structure for front-end
Symbol table is integrated into AST.
Semantic Analysis







Legality rules (1436 error and warning messages)
Tree annotations: add structure to AST to localize
semantic information
Name resolution
(which x?)
Type and overload resolution
(which “+”?)
Dispatching and polymorphism
Static expression evaluation
Simple optimizations: dead code elimination, constant
folding, static conditions.
Expansion



Replace complex constructs with simpler one
(map Ada semantics into C semantics)
Aggregates: (1..10 => 42, others => 0)
Equality on composite types:
if A (x .. y) = B (x . .y) then..



Tasking (thread communication)
Many others…
Expander builds AST fragments, calls semantics
to annotate them: process is recursive
Gigi: GNAT to GCC




Impractical to use GCC data structures in frontend
Gigi traverses GNAT tree fragments and calls
tree-building procedures (just like other GCC
front-ends)
Each tree fragment is immediately translated
into Register Transfer Language (RTL)
Gigi must reflect Ada semantics (syntax and
semantic information packages) and must
interface to GCC generators: written in C.
Other components

Library management: program is assembled
from compilation of multiple files. Need tools
to insure coherence (like make) gather object
files, record dependencies between units.


gnatbind, gnatlink
Runtime management: Input/Output,
numeric libraries, tasking, real time,
distributed computing.
The data structures of the AST






Nodes are variant records of fixed size. Each variant
corresponds to one non-terminal
A node has at most 4 syntactic descendants
A descendant is either a node or a list
Terminal nodes are identifiers and literals
Node kind is discriminant: determines meaning of
descendants for each node
220 kinds:


Compilation_unit, package_declaration…loop_statement…
Syntactic structure described in sinfo.ads
Semantic annotations


Each tree node carries semantic information
Node-specific:





expressions have types
procedures have scope
array types have index and component types
Literals have a value
Full description in einfo.ads
Symbol table is integrated into AST





Identifiers in declarations are special:
defining_occurrence accumulates semantic
information for an entity
References to an identifier are pointers to
corresponding defining_occurrence: Entity.
Symbol table is set of defining_occurrences.
Identifiers point to names table
Symbol table contains no strings