Intro to Data Structures and ADTs

Download Report

Transcript Intro to Data Structures and ADTs

Intro to Data Structures and
ADTs
Chapter 2
1
Goal of Data Structures
• Organize data
• Facilitate efficient …
– storage
– retrieval
– manipulation
of data
• Select and design appropriate data types
• This is the real essence of OOP
2
Simplicity Tradeoff
• Simplicity of data organization
versus
• Simplicity/elegance of algorithms
• Simple (unsophisticated) data structure
– may require much work for processing data.
• More complex data organization
– May yield nicer algorithms for the basic
operations
3
Issues
• Amount of data
– phone book lookup (Hallsville vs. Dallas)
– linear search?
• Number of accesses and use required
– compiler's lookup of an identifier's type in a
symbol table
– linear, binary, hash table?
• Static vs. dynamic nature of the data
– consider a text processor
– array, vector?
4
Abstract Data Types (ADT)
• Defn: collection of
– related data items … together with
– an associated set of operations
• Why "abstract?"
– Data, operations, and relations are studied
independent of implementation.
•
What not how is the focus.
5
Implementation of an ADT
• Defn: storage (data) structures which
– store the data items … and
– algorithms for the basic operations
• These data structures are
– provided in a language or
– built from the language constructs
(user defined)
6
Implementation of an ADT
• Successful software design uses
data abstraction
• We separate the
– definition of a data type
– the implementation
from
7
C++
Types
C++ Types
Fundamental Types
void
Arithmetic
Integral
Floating point
(reals)
Characters Enumerations Integers
char
unsigned char
signed char
pointers
bool complex
float
double
long double
int
short int
long int
unsigned
unsigned short
unsigned long
Structured Types
array
struct
union
class
istream
ostream
iostream
ifstream
ofstream
fstream
string
vector
deque
list
stack
queue
priority_queue
map
multimap
set
multiset
bitset 8
valarray
Memory
• 2-state devices « bits 0 and 1
• Organized into
– bytes (8 bits) and
– words (machine dependent — e.g., 4 bytes).
• Each byte (or word) has an address
– to store and retrieve contents of any given
memory location.
9
Simple Data Types
• The most basic form of data
sequences of bits
• Simple data types
– values are atomic — can't be subdivided
– are ADTs.
• Implementations have:
– Storage structures: memory locations
– Algorithms: system hardware/software to do
basic operations.
10
Simple Data Types
• Boolean
– values { false, true }
– could be stored in bits, usually use a byte
– operations &&, ||
• Character
– byte for ASCII, EBCDIC
– 2 bytes for Unicode (Java)
– operations ==, <, >, etc. using numeric code
11
Simple Data Types
• Unsigned Integer data
– non-negative unsigned integers
– stored in base-two in a fixed number of bits
• Signed integer
– stored in a fixed number of bits
• Representations
– sign-magnitude
– two's complement
12
Sign-magnitude Representation
• Save one bit (usually most significant) for
sign
(0 = +, 1 = – )
• Use base-two representation in the other
bits.
 88 = 0000000001011000
 -88 = 1 000000001011000
Cumbersome for
arithmetic
computations
13
Two's Complement Representation
•
For nonnegative n:
– Use ordinary base-two representation with
leading (sign) bit 0
•
For n < 0
1) Find w-bit base-2 representation of |n|
2) Complement each bit.
3) Add 1
14
Two's Complement Representation
•
Example: –88
1. 88 as a 16-bit base-two number
0000000001011000
2. Complement this bit string
1111111110100111
WHY?
3. Add 1
1111111110101000
15
Two's Complement Representation
• Works well for arithmetic computations
5 + –6:
0000000000000101
+1111111111111010
What gets done to
1111111111111111
the bits to give this
answer?
16
Biased Representation
•
Add a constant bias to the number
– typically 2w-1 (where w is number of bits)
– then find its base-two representation
•
Example: 88 using
w = 16 bits and bias of 215 = 32768
1. Add the bias to 88, giving 32856
2. Represent the result in base-two notation:
1000000001011000
17
Biased Representation
•
Example -88 using
w = 16 bits and bias of 215 = 32768
1. Add the bias to -88, giving 32680
2. Represent the result in base-two notation:
0111111110101000
•
Good for comparisons; so, it is commonly used
for exponents in floating-point representation
of reals.
18
Problems with Integer Representation
• Limited Capacity — a finite number of bits
• An operation can produce a value that requires
more bits than maximum number allowed.
This is called
overflow .
• None of these is a perfect representation of
(mathematical) integers
• Can only store a finite (sub)range of them.
19
Real Data
• Types float and double in C++
• Use single precision (IEEE Floating-Point)
• Store:
 sign of mantissa in leftmost bit (0 = +, 1 = – )
 biased binary rep. of exponent in next 8 bits
(bias = 127)
 bits b2b3 . . .b24 mantissa in rightmost 23 bits.
 Need not store b1 — know it's 1)
20
Real Data
• Example: 22.625 = 10110.1012
• Floating point form:
1.01101012 * 24
21
Problems with Real Representation
• Exponent overflow and underflow
• Round off error
– Most reals do not have terminating binary
representations.
Example:
0.7 = (0.10110011001100110011001100. . .)2
22
Problems with Real Representation
• Round off error may be compounded in a
sequence of operations.
– Recall the sums of calculated currency values
• Be careful in comparing reals
– with == and !=.
– Instead use comparison for closeness
if (abs (x – 12.34) < 0.001) …
23
C-Style Data Structures
Arrays
• Single dimension
• Multi dimension
int numList [30];
float realList [10][10];
int numTable [3][4][5];
• All elements of same type
• Elements accessed by
– name and [ ] operator
numList[5]
– name, offset, and dereference *(numlist + 5)
• Name of the array is a pointer constant
24
Arrays
• Arrays as parameters
Note you must
specify number of
elements used
– Formal parameter
void doIt (int list[ ], int count); / or
void toIt (int *list, int count);
– Actual parameter
doit (numList, numUsed);
Same call for either
style of parameter list
declaration
25
Problems with C-Style Arrays
• Capacity cannot change.
• Solution 1 (non-OOP)
Later
– Use a "run-time array"
– Construct B to have required capacity
– Copy elements of A into B
– Deallocate A
Solution 2 (OOP) Use vector
26
Problems with C-Style Arrays
• Virtually no predefined operations
for non-char arrays.
• The Deeper Problem:
– C-style arrays aren't self-contained.
27
Basic Principle of OOP:
• An object should be autonomous
(self-contained)
• Should carry within itself all of the
information needed to describe and
operate upon itself.
28
Aggregate Data Types
• Predefined types not always adequate to
model the problem
– When objects have multiple attributes
– When objects have collections of
heterogeneous elements
• C++ provides structs and classes
– Create new types with multiple attributes
29
Structures
• Characteristics
– has a fixed size
– is ordered
– elements may be of different size
– direct access of elements by name (not index)
struct Date {
int month, day, year;
char dayOfWeek [12];
};
30
FAQs about Structures
• structs can be nested (can contain struct
objects)
• Access members with
– name of struct object
– dot (member selector operator) .
– name of struct member
Date today = { 3, 4, 2005, "Tuesday");
cout << today.month;
31
A commercial for OOP: Two programming paradigms
• Procedural: ( C, FORTRAN,
and Pascal )
• Object-oriented: ( C++, Java, and
Smalltalk)
– Action-oriented — concentrates
on the verbs
– Focuses on the nouns of problem
specification
– Programmers:
• Identify basic tasks to solve
problem
• Implement actions to do tasks
as subprograms
(procedures/functions/
subroutines)
• Group subprograms into
programs/modules/libraries,
• together make up a complete
system for solving the problem
– Programmers:
• Determine objects needed for
problem
• Determine how they should
work together to solve the
problem.
• Create types called classes
made up of
– data members
– function members to operate
on the data.
• Instances of a type (class)
called objects.
32