CSC321: Programming Languages

Download Report

Transcript CSC321: Programming Languages

CSC321: Programming Languages
Chapter 5: Types
5.1
5.2
5.3
5.4
5.5
5.6
5.7
5.8
5.9
5.10
Type Errors
Static and Dynamic Typing
Basic Types
NonBasic Types
Recursive Data Types
Functions as Types
Type Equivalence
Subtypes
Polymorphism and Generics
Programmer-Defined Types
CSC321: Programming Languages
5-1
Types
• A type is a collection of values and
operations on those values.
• Example: Integer type has values ..., -2, 1, 0, 1, 2, ... and operations +, -, *, /, <, ...
• The Boolean type has values true and
false and operations , , .
• Computer types have a finite number of
values due to fixed size allocation;
problematic for numeric types.
CSC321: Programming Languages
5-2
Type Problems
• Exceptions:
– Smalltalk uses unbounded fractions.
– Haskell type Integer represents unbounded
integers.
• Floating point problems?
– Even more problematic is fixed sized
floating point numbers:
• 0.2 is not exact in binary.
• So 0.2 * 5 is not exactly 1.0
• Floating point is inconsistent with real numbers
in mathematics.
CSC321: Programming Languages
5-3
Purpose of Types
• In the early languages, Fortran, Algol,
Cobol, all of the types were built in.
• If needed a type color, could use
integers; but what does it mean to
multiply two colors.
• Purpose of types in programming
languages is to provide ways of
effectively modeling a problem solution.
CSC321: Programming Languages
5-4
Data Interpretation
• Machine data carries no type information.
• Basically, just a sequence of bits.
• Example:
0100 0000 0101 1000 0000 0000 0000 0000
– The floating point number 3.375
– The 32-bit integer 1,079,508,992
– Two 16-bit integers 16472 and 0
– Four ASCII characters: @ X NUL NUL
CSC321: Programming Languages
5-5
Type Errors
• A type error is any error that arises
because an operation is attempted on a
data type for which it is undefined.
• Type errors are common in assembly
language programming.
• High level languages reduce the number
of type errors.
• A type system provides a basis for
detecting type errors.
CSC321: Programming Languages
5-6
Type Checking
• A type system imposes constraints such as the
values used in an addition must be numeric.
• Cannot be expressed syntactically in EBNF.
• Some languages perform type checking at
compile time (e.g., C).
• Other languages (e.g., Perl) perform type
checking at run time.
• Still others (e.g., Java) do both.
CSC321: Programming Languages
5-7
Static and Dynamic Typing
• A language is statically typed if the types
of all variables are fixed when they are
declared at compile time.
• A language is dynamically typed if the type
of a variable can vary at run time
depending on the value assigned.
• Can you give examples of each?
CSC321: Programming Languages
5-8
Strongly Typing
• A language is strongly typed if its type system
allows all type errors in a program to be
detected either at compile time or at run time.
• A strongly typed language can be either
statically or dynamically typed.
• Union types are a hole in the type system of
many languages.
• Most dynamically typed languages associate a
type with each value.
CSC321: Programming Languages
5-9
Basic Types
• Terminology in use with current 32-bit computers:
– Nibble: 4 bits
– Byte: 8 bits
– Half-word: 16 bits
– Word: 32 bits
– Double word: 64 bits
– Quad word: 128 bits
• In most languages, the numeric types are finite in size.
So a + b may overflow the finite range.
– Unlike mathematics: a+(b+c)(a+b)+c
• Also in C-like languages, the equality and relational
operators produce an int, not a Boolean
CSC321: Programming Languages
5-10
Overloading
• An operator or function is overloaded when
its meaning varies depending on the types
of its operands or arguments or result.
• Java: a + b (ignoring size)
– integer add
– floating point add
– string concatenation
• Mixed mode: one operand an int, the other
floating point, or one string and the other int
CSC321: Programming Languages
5-11
Type Conversion
• A type conversion is a narrowing
conversion if the result type permits fewer
bits, thus potentially losing information.
• Otherwise it is termed a widening
conversion.
• Should languages ban implicit narrowing
conversions? Why?
• C++/Java:
– Explicit narrowing conversion
– Implicit widening conversion
– How?
CSC321: Programming Languages
5-12
Non-basic Types
• Enumeration:
– enum day {Monday, Tuesday, Wednesday,
Thursday, Friday, Saturday, Sunday};
– enum day myDay = Wednesday;
• In C/C++ the above values of this
type are 0, ..., 6.
• More powerful in Java:
for (day d : day.values())
Sytem.out.println(d);
CSC321: Programming Languages
5-13
Pointers
•
•
•
•
•
•
C, C++, Ada, Pascal
Java???
Value is a memory address
Indirect referencing
Operator in C: *
Example: A Simple Linked List in C
struct Node {
int key;
struct Node* next;
};
struct Node* head;
CSC321: Programming Languages
5-14
Pointer Operations
• If T is a type and ref T is a pointer:
& : T → ref T
* : ref T → T
• For an arbitrary variable x:
*(&x) = x
• Increment/decrement
• String Copy
void strcpy(char *p, char *q) {
while (*p++ = *q++) ;
}
CSC321: Programming Languages
5-15
Problems with Pointers
•
•
•
•
Bane of reliable software development
Error-prone
Buffer overflow, memory leaks
Particularly troublesome in C
• float sum(float a[ ],int n) {
int i;
float s = 0.0;
for (i = 0; i<n; i++)
s += a[i];
return s;
• float sum(float *a,int n) {
int i;
float s = 0.0;
for (i = 0; i<n; i++) s += *a++;
return s;
• }
• }
CSC321: Programming Languages
5-16
Dangling References
• A dangling reference is a pointer to storage that
is being used for another purpose; typically, the
storage has been deallocated
• Dangling reference may cause a program either
crashing or giving the wrong results
• Dangling reference may be caused by
– Deallocation operations: a storage that is being
pointed to by a pointer has been deallocated.
– Procedure/function/method returns: the address of a
local variable in a procedure is returned as the return
value. Consider the following situation (C/C++):
CSC321: Programming Languages
5-17
Arrays
• A set of elements, but different languages
have different ways to deal with arrays
• Basically, an array has the following
attributes:
• Data type – the type of all elements
• Dimension – number of dimensions: 1dim, 2-dim, 3-dim, etc
• Range of dimensions – lower and upper
bound for each dimension
• Subscripts – index. Most languages use
integer, but others such as Pascal/Ada
allow any discrete types
CSC321: Programming Languages
5-18
Arrays Layout
• 1-dim array: consecutive locations
• High-dim array:
– Row-major layout: consecutive locations
arranged row by row
e.g. a[3][3]: a[0,0], a[0,1], a[0,2], a[1,0],
a[1,1], a[1,2], a[2,0], a[2,1], a[2,2]
– Column-major layout: arranged column by
column
e.g. a[3][3]: a[0,0], a[1,0], a[2,0], a[0,1],
a[1,1], a[2,1], a[0,2], a[1,2], a[2,2]
CSC321: Programming Languages
5-19
Arrays Element Accessing
• Element notation:
– C/C++, Java, Algol: a[i][j][k]
– Pascal: a[i, j, k]
– Fortran: a(i, j, k), same as a function.
Actually, Fortran treats accessing array
elements as performing function evaluation
• Pointer accessing:
– C/C++ can use pointers to access array
elements
CSC321: Programming Languages
5-20
Arrays and Pointers
• Equivalence between arrays and pointers: a =
&a[0]
• If either e1 or e2 is type: ref T:
e1[e2]=*((e1)+(e2))
• Example: a is float[ ] and i int: a[i] = *(a + i)
–
–
–
–
–
double a[16];
double *b;
b = &a[0];
// assign the base address of a to b
b++;
// move b to the next location: a[1]
b + i;
// move b to the next i-th location: a[i+1]
CSC321: Programming Languages
5-21
Arrays Indexing
• Index computation: Loc
• Assume: base – the location of the first element
low – the lower bound
upp – the upper bound
w – the length of one element
• 1-dim: Loc(a[i]) = base + w*(i-low) = (base – low *w) + i*w = c + i*w,
where c = base – low*w is a constant
• 2-dim: Loc(a[i,j])
= base + w*[ r*(i-low1) + (j-low2) ]
= base - w*( r*low1 + low2) + w*(r*i + j)
where r = the number of rows above a[i,j], which is (upp2-low2 + 1)
3-dim: Loc(a[i,j,k]) = base + w*[r2*(r1*(i-low1) + (j-low2)) + (k-low3)]
where r1 = (upp2-low2+1), r2 = (upp3-low3+1) ??
CSC321: Programming Languages
5-22
Arrays Binding
• Name-size binding – bound evaluation: how many storage
cells should be allocated to an array
• Static binding – static evaluation: fixed size, determined at
compile time, allocated at load time
– Advantages: easy; efficient
– Disadvantages: inflexible, waste storage
• Dynamic binding – dynamic evaluation: changeable size,
determined at run time, allocated dynamically
– E.g. Java:
Declare:
Create:
double a[];
a = new double[10];
a = new double[15];
• Extendable arrays: APL, SNOBOL, Perl
– Java also has a kind of extendable array: Vector v = new Vector(5);
CSC321: Programming Languages
5-23
Arrays Binding
• Semi-dynamic binding – evaluation upon procedure entry:
Ada, Pascal, C/C++, Java
– The size (range of dimension) is a parameter.
– Once the parameter is determined, the array size is fixed
E.g. C/C++: double fun(int n) { double a[n]; …}
Ada:
declare
m, n: Integer;
begin
declare
type Matrix is array(Integer range<>,
Integer range<>) of Real;
a: Matrix(1..m, 1..n);
begin … end
…
end
CSC321: Programming Languages
5-24
Array Attributes
• Some languages treat array as object/package, and
maintain its attributes
– E.g. Java: length --– Ada:
range:
Lower bound:
Upper bound:
a.length  the size of a
a’range
a’first
a’last
• With these attributes, when an array is passed as a
parameter, do not need to specify the number of elements
in the array
• However, some languages have to specify the length as a
parameter
– E.g. in C, array is passed by reference (call-by-reference) and does
not have the length attribute.
CSC321: Programming Languages
5-25
Strings
•
•
•
•
A special kind of array Now so fundamental, directly
supported.
In C, a string is a 1D array with the string value
terminated by a NUL character (value = 0).
In Java, Perl, Python, a string variable can hold an
unbounded number of characters.
Libraries of string operations and functions.
–
–
–
–
Comparison, selection of substring, searching, matching
(Most languages)
Replacement, appending, substring deletion (SNOBOL)
Concatenation: produce a new string
Java: String (static) and StringBuffer (dynamic)
CSC321: Programming Languages
5-26
Structures/Records
• Composed from simpler data items
• Each data item (field) has a name
• Data items may have different types (recall array, all
elements must be in the same type)
• Access to fields mostly uses .(dot) expression
Pascal:
type Date =
record
day: Integer;
month: Integer;
year: Integer;
end
var adate: Date;
Access: adate.day
C/C++:
struct Date {
int day;
int month;
int year;
}
C:
struct Date adate;
C++: Date adate;
Access: adate.day
CSC321: Programming Languages
5-27
Structures/Records
• The order of fields in the record has no effect
on the meaning
• When a variable of record is declared, it is
allocated storage
• Record assignment is to assign all fields in a
record
• Collection of elements of different types
• Used first in Cobol, PL/I
• Absent from Fortran, Algol 60
• Common to Pascal-like, C-like languages
• Omitted from Java as redundant
CSC321: Programming Languages
5-28
Comparison of Arrays and Records
• Component: all elements in an array have
the same type, while components (fields) of a
record may be different
• Access: array elements are accessed by
subscripts and element locations are
determine at run time, while record
components are accessed by dot expression
and component names are determined at
compile time
• Assignment: array assignment is to assign
the base location, while record assignment is
to assign all components
CSC321: Programming Languages
5-29
Unions
type union =
• C: union
record
• Pascal: case-variant
case b : boolean of
record
true : (i : integer);
• Logically: multiple
false : (r : real);
views of same
end;
var tagged : union;
storage
begin tagged :=
• Useful in some
(b => false, r => 3.375);
systems applications put(tagged.i); -- error
end
CSC321: Programming Languages
5-30
Simulated union type
class Value extends Expression {
// Value = int intValue | boolean
boolValue
Type type; int intValue; boolean boolValue;
Value(int i) { intValue = i;
type = new Type(Type.INTEGER);
}
Value(boolean b) { boolValue = b;
type = new Type(Type.BOOLEAN);
}
}
CSC321: Programming Languages
5-31
Recursive Data Type
• Type definition refers to another type that refers
to the current type definition
• E.g.
type link = cell;
// refer to cell
type cell = record
data: integer;
next: link;
// refer to link
end;
CSC321: Programming Languages
5-32
Dynamic Data Type
• With dynamic data structures, you can not only
operate on the data items in the structure, but
also change the structure
• Usually constructed with static data structures
(such as arrays and records) and pointers
• E.g. Linked lists, Trees
• The basic components of these structures are
records that contain pointers: nodes. Usually,
each node has two parts: data items, and links
(pointers/references)
CSC321: Programming Languages
5-33
Functions as Types
• Pascal example:
– function newton(a, b: real; function f: real): real;
– Know that f returns a real value, but the
arguments to f are unspecified.
• Java example
• public interface RootSolvable {
•
double valueAt(double x);
• }
• public double Newton(double a, double b,
RootSolvable f);
CSC321: Programming Languages
5-34
Type Names and Equivalence
• Type names
– Each type may or may not have a name,
such as int, Integer, Char, etc. are type
names; while array [0..9] of integer is an
anonymous type.Name equivalence
• Type equivalence
– Name equivalence
– Structural equivalence
CSC321: Programming Languages
5-35
Type Name Equivalence
• Name equivalence: have the same name type
• Pure name equivalence: exactly same name
• Transitive name equivalence:
– if S is equivalent to T which is equivalent to U, then
S is equivalent to U
• Type expression equivalence:
– A type name is equivalent to itself.
– Two type expressions are equivalent if they are
formed by applying the same constructor to
equivalent expressions.
CSC321: Programming Languages
5-36
Type Structural Equivalence
• Two type expressions are structurally equivalent if and
only if they are equivalent under the following three
rules
– A type name is structurally equivalent to itself
– Two types are structurally equivalent if they are formed by
applying the same type constructor to structurally equivalent
types
– After a type declaration, type n = T, the type name n is
structurally equivalent to T
• Examples
– integer is structurally equivalent to integer
– Types S and T are structurally equivalent if they are declared
as
• type S = array [0..9] of integer;
• type T = array [0..9] of integer;
– Types S and integer are structurally equivalent if S is declared
as:
type S = integer;
CSC321: Programming Languages
5-37
Type Equivalence
struct complex { float re, im; };
struct polar { float x, y; };
struct {
float re, im;
} a, b;
struct complex c, d;
struct polar e;
int f[5], g[10];
which are equivalent types: a, b, c, d, e, f,
g?
CSC321: Programming Languages
5-38
Type Equivalence Implementations
• Pascal uses name equivalence, but
ambiguous.
• Modula-2 (the successor of Pascal) defines
two types S and T to be compatible if
– S and T are the same name, or
– S = T is a type declaration: type S = T; or type T =
S; or
– One is a subrange of the other; or
– Both are subranges of the same basic types
• C uses structural equivalence for all types
except records (struct)
CSC321: Programming Languages
5-39
Subtypes
• A subtype is a type that has certain constraints
placed on its values or operations.
• In Ada subtypes can be directly specified.
subtype one_to_ten is Integer range 1 .. 10;
type Day is (Monday, Tuesday, Wednesday, Thursday, Friday,
Saturday, Sunday);
subtype Weekend is Day range Saturday .. Sunday;
type Salary is delta 0.01 digits 9
range 0.00 .. 9_999_999.99;
subtype Author_Salary is Salary digits 5
range 0.0 .. 999.99;
Integer i = new Integer(3);
...
Number v = i;
...
Integer x = (Integer) v;
//Integer is a subclass of Number,
// and therefore a subtype
CSC321: Programming Languages
5-40
Polymorphism and Generics
• A function or operation is polymorphic if it
can be applied to any one of several
related types and achieve the same
result.
• An advantage of polymorphism is that it
enables code reuse.
CSC321: Programming Languages
5-41
Polymorphism
• Comes from Greek, and means having many forms
• Example: overloaded built-in operators and functions
+ - * / == != ...
– Ada, C++: define + - ... for new types
– Java overloaded methods: number or type of parameters
• Example: class PrintStream
print, println defined for:
boolean, char, int, long, float, double, char[ ]
String, Object
– Java: + also used for string concatenation
– Java: instance variable, method: e.g., name, name( )
CSC321: Programming Languages
5-42
Generics
• Ada generics: generic sort (Figs. 5.10 and
5.11)
– parametric polymorphism
– type binding delayed from code
implementation to compile time
– procedure sort is new generic_sort(integer);
• Java generic classes
• C/C++ templates
CSC321: Programming Languages
5-43
Programmer-defined Types
• Recall the definition of a type:
– A set of values and a set of operations on
those values.
• Structures allow a definition of a
representation
• Problems:
– Representation is not hidden
– Type operations cannot be defined
• Solutions: ADT – Abstract Data Types
CSC321: Programming Languages
5-44