Chapter 5 - Basic Semantics
Download
Report
Transcript Chapter 5 - Basic Semantics
Chapter 5
Basic Semantics
Chapter 5
K. Louden, Programming Languages
1
Ways to specify semantics
Language reference manual – commonly
used, but lack of precision
By a defining translator – questions
cannot be answered in advance – only by
trying it
By formal definition – denotational
semantics – complex and abstract
A fundamental step in defining semantics
is to describe the meaning of each
identifier.
Chapter 5
K. Louden, Programming Languages
2
Attributes
Properties of language entities, especially
identifiers used in a program.
Important examples:
–
–
–
–
–
Value of an expression
Data type of an identifier
Maximum number of digits in an integer
Location of a variable
Code body of a function or method
Declarations ("definitions") bind attributes
to identifiers.
Different declarations may bind the same
identifier to different sets of attributes.
Chapter 5
K. Louden, Programming Languages
3
Binding times can vary widely:
Value of an expression: during execution or
during translation (constant expression).
Data type of an identifier: translation time
(Java) or execution time (Smalltalk, PERL).
Maximum number of digits in an integer:
language definition time or language
implementation time.
Location of a variable: load or execution
time.
Code body of a function or method:
translation time or link time or execution
time.
Chapter 5
K. Louden, Programming Languages
4
Two general categories of binding: static
(prior to execution) and dynamic
Interpreters will perform most bindings
dynamically
Concern is earliest time when it COULD be
bound, not when it actually is
Possible times
– Language definition
– Language implementation
– Translation time
– Link time
– Load time
– Execution time - dynamic
Chapter 5
}
static
K. Louden, Programming Languages
5
Many of the most important and subtle difference
between languages involve binding times.
Simple changes to the language (say adding
recursion) may drastically change binding times.
Languages in which types are dynamically bound
are dramatically different from those in which
types are statically bound.
For example: Dynamic Type Binding
– flexibility - can write a generic sort routine where type
isn't specified
– error detection is diminished
– is expensive as type checking must be done at run time;
often implemented using pure interpreters
Chapter 5
K. Louden, Programming Languages
6
Classes of Binding Times (listed from late to early)
1. Execution Time (Late Binding).
Variables to their values.
Variables to a particular storage location (termed dynamic storage allocation).
– At entry to a subprogram or block.
Example: formal to actual parameters and formal parameters to actual
locations.
– At arbitrary points during execution.
Example: variables to values. In some languages, variables to types.
Consider Prolog - variable type is determined dynamically
2. Load time: globals bound to location
3. Link time: body of external function bound to call instruction
4. Compile time (Early Binding).
–
–
Bindings chosen by programmer.
Variable names, types, names.
Bindings chosen by translator.
Example: variables to storage (load time) (termed static storage allocation).
Example: particular machine instruction for a statement.
Example: initial values of variables (if none specified)
Example: in C declaration defines type by gives no space
5. Language Implementation Time.
Example: Association of enumerated type values with integers.
Example: maxint
6. Language definition Time - probably the most important binding time.
Example: structure of language fixed, set of all basic data types, set of
statements: syntax and semantics fixed, predefined types.
Chapter 5
K. Louden, Programming Languages
7
Symbol table and environment
A dictionary or table is used to maintain the
identifier/attribute bindings.
It can be maintained either during translation
(symbol table) or execution (environment) or
both.
Pre-translation entities are entered into the initial
or default table.
If both are maintained, the environment can
usually dispense with names, keeping track only
of locations (names are maintained implicitly).
Chapter 5
K. Louden, Programming Languages
8
Declarations bind identifiers to attributes.
Examples of declarations (C)
int x = 0;
Explicitly specifies data type and initial value.
Implicitly specifies scope (see next slide) and
location in memory.
int f(double);
Explicitly specifies type (double int).
Implicitly specifies nothing else: needs another
declaration specifying code.
The former is called a definition (as specifies all
potential attributes) in C, the latter is simply a
declaration (as other attributes- like function
body - need to be specified later).
Chapter 5
K. Louden, Programming Languages
9
Scope
The scope of a declaration is the region of the
program to which the bindings established by the
declaration apply.
Informally - Scope of a variable: range of
statements in which the variable is visible
A variable is visible in a statement if it can be
referenced in that statement. (Scope holes caused
by new declarations)
In a block-structured language, the scope is
typically the code from the end of the declaration to
the end of the "block" (indicated by braces {…} in C
and Java) in which the declaration occurs.
Scope can extend backwards to the beginning of
the block in certain cases (class declarations in
Java and C++, top-level declarations in Scheme).
Chapter 5
K. Louden, Programming Languages
10
Lexical vs. dynamic scope
Scope is maintained by the properties of the lookup
operation in the symbol table or environment.
If scope is managed statically (prior to execution),
the language is said to have static or lexical scope
("lexical" because it follows the layout of the code
in the file).
If scope is managed directly during execution, then
the language is said to have dynamic scope.
It is possible to maintain lexical scope during
execution - via static links in the call stack.
Chapter 5
K. Louden, Programming Languages
11
Java scope example
public class Test
{ public static int x = 2;
public static void f()
{ System.out.println(x);
}
public static void main(String[] args)
{ int x = 3;
f();
}
}
This prints 2, but under dynamic scope it would
print 3 (the most recent declaration of x in the
execution path is found).
Chapter 5
K. Louden, Programming Languages
12
Dynamic scope evaluated
Almost all languages use lexical scope
With dynamic scope the meaning of a variable
cannot be known until execution time, thus there
cannot be any static checking.
Originally used in Lisp. Scheme could still use it, but
doesn't. Some languages still use it: VBScript,
Javascript, Perl (older versions).
Lisp inventor (McCarthy) now calls it a bug.
Still useful as a pedagogical tool to understand the
workings of scope.
Chapter 5
K. Louden, Programming Languages
13
Symbol table structure – how handle
multiple uses of same name?
A table of little stacks of declarations under each
name. For example the table for the Test class of
slide 13 would look as follows inside main (using
lexical scope):
name
bindings
x
f
public static
void method
main
public static
void method
args
Chapter 5
int local to
main
public
static int
String[]
parameter of main
K. Louden, Programming Languages
14
Symbol table structure (2)
Alternatively, a stack of little tables, one for each
scope. For example, the previous example would
look as follows (lexical scope):
Symbol table for Scope
name
x
f
main
bindings
public
static int
Can be deleted
after leaving f
public static
void method
Symtab:
name
(empty)
Current table
inside main
public static
void method
Symtab:
name
args
x
Chapter 5
bindings
K. Louden, Programming Languages
bindings
String[]
parameter
int local
15
Symbol table construction
Symbol table is constructed as declarations are
encountered (insert operation).
Lookups occur as names are encountered
In lexical scope, lookups occur either as names are
encountered in symbol table to that point
(declaration before use—C), or all lookups are
delayed until after the symbol table is fully
constructed and then performed (Java class—scope
applies backwards to beginning of class).
In dynamic scope, need links to tell you which
declarations to use
Chapter 5
K. Louden, Programming Languages
16
Symbol table structure evaluated
Which organization is better?
Table of little stacks is simpler (C, Pascal).
Stack of little tables is more versatile, and
helpful when you need to recover outer
scopes from within inner ones or from
elsewhere in the code (Ada, Java, C++).
Normally, no specific table structure is
part of a language specification: any
structure that provides the appropriate
properties will do.
Chapter 5
K. Louden, Programming Languages
17
Overloading
Overloading is a property of symbol tables that
allows them to successfully handle declarations
that use the same name within the same scope.
It is the job of the symbol table to pick the correct
choice from among the declarations for the same
name in the same scope. This is called overload
resolution.
Overloading typically applies only to functions or
methods.
Overloading must be distinguished from dynamic
binding in an OO language.
Chapter 5
K. Louden, Programming Languages
18
An example in Java:
public class Overload {
public static int max(int x, int y)
{ return x > y ? x : y;}
public static double max(double x, double y)
{ return x > y ? x : y;}
public static int max(int x, int y, int z)
{ return max(max(x,y),z);}
public static void main(String[] args)
{ System.out.println(max(1,2));
System.out.println(max(1,2,3));
System.out.println(max(4,1.3));
}}
Adding more max functions that mix double and
int parameters is ok. But adding ones that mix
double and int return values is not!
Chapter 5
K. Louden, Programming Languages
19
Allocation
Can be constructed entirely statically (Fortran):
all vars and functions have fixed locations for the
duration of execution.
Can also be entirely dynamic: functional
languages like Scheme and ML.
Names of constants may represent purely
compile time quantities.
Most language use a mix: C, C++, Java, Ada.
Consists of three components:
– A fixed area for static allocation
– A stack area for lifo allocation (usually the
processor stack)
– A "heap" area for on-demand dynamic
allocation (with or without garbage collection)
Chapter 5
K. Louden, Programming Languages
20
Typical
environment
organization
(possible C)
[Figure 5.25, p. 165)]
© 2003 Brooks/Cole - Thomson Learning™
Chapter 5
K. Louden, Programming Languages
21
The Runtime Stack
Used for:
– Procedure/function/method calls
– temporaries
– local variables
Temporaries: intermediate results that cannot be
kept in registers; not considered here further.
Procedure calls: (parameters and return values)
Local variables: part of calls, but can be
considered independently, showing LIFO
behavior for nested scopes (next slide).
Chapter 5
K. Louden, Programming Languages
22
Example of stack-based allocation in
C within a procedure:
(1)
(2)
(4)
(5)
(7)
(8)
(9)
(11)
(12)
(14)
(16)
(18)
Chapter 5
A: {
int x;
char y;
B: { double x;
int a;
} /* end B */
C: { char y;
int b;
D: { int x;
double y;
} /* end D */
} /* end C */
} /* end A */
K. Louden, Programming Languages
Point #1
Point #2
23
An alternative: a flat local space
All locations can be determined at compile time
All local variables allocated at onceWastes
some space, but not critically.
the primary structure of the stack is still
the call structure: a complete record of a
call on the stack is called an activation
record or frame, and the stack is referred
to as the call stack. (Chapter 8)
Java promotes a flat space by forbidding
nested redeclarations, but this is not an
essential property: a symbol table can
easily distinguish nested declarations as
A.x, A.y, A.B.x, A.B.a, etc.
Chapter 5
K. Louden, Programming Languages
24
Heap Allocation
In "standard" languages (C, C++, Java) heap
allocation requires a special operation: new.
Any kind of data can be allocated on the heap in
C/C++; in Java all objects and only objects are
allocated on the heap.
Even with heap allocation available in Java &
C/C++, the stack is still used to represent calls.
In C/C++, deallocation is typically by hand
(destructors), but it is hard to do right.
Java uses a garbage collector that periodically
sweeps the heap looking for data that cannot be
accessed any more by the program and adding it
back to free space.
Chapter 5
K. Louden, Programming Languages
25
Lifetime
The lifetime of a program entity is the duration of
its allocation in the environment.
Allocation is static when the lifetime is the
duration of the entire program execution.
Lifetime is related to but not identical to scope.
With scope holes, lifetime can extend to regions
of the program where the program entity is not
accessible.
It is also possible for scope to exceed lifetime
when a language allows locations to be
manipulated directly (as for example manual
deallocation). This is of course very dangerous!
Chapter 5
K. Louden, Programming Languages
26
Variables and Constants
A variable is an object whose stored value
can change during execution.
– x=y (we want value of y but address of x)
– referred to as l-value and r-value
A constant is an object whose value does
not change throughout its lifetime.
Constants are often confused with literals:
constants have names, literals do not.
Chapter 5
K. Louden, Programming Languages
27
Constants
Compile-time constant in Java:
static final int zero = 0;
Load-time constant in Java:
static final Date now = new Date();
Dynamic constant in Java:
any non-static final assigned in a constructor.
Java takes a very general view of constants,
since it is not very worried about getting rid of
them during compilation.
C takes a much stricter view of constants,
essentially forcing them to be capable of
elimination during compilation.
Chapter 5
K. Louden, Programming Languages
28
Aliases
An alias occurs when the same object is bound to
two different names at the same time. This is
fairly common with Java objects.
Side Effect: changes in value that persists after
execution.
Many side effects are intentional
– x=5 intent was to change x
– Obj.setT(2) – intent was to set value of t
– Swap(a,b) - intent was to change values
Sometimes a side effect is not intentional
– Sqrt(x) - what if it set x to zero?
– T=0
- what if an aliased variable was also changed?
Chapter 5
K. Louden, Programming Languages
29
Dangling References, and Garbage
A dangling reference is a location that has been
deallocated from the environment, but is still
accessible within the program. Dangling
references are impossible in a garbage-collected
environment with no direct access to addresses.
Garbage is memory that is still allocated in the
environment but has become inaccessible to the
program. Garbage can be a problem in a nongarbage collected environment, but is much less
serious than dangling references.
Chapter 5
K. Louden, Programming Languages
30