Transcript Lecture 2 -

DCO20105 Data structures and algorithms
Lecture 2: Vector
- Array and vector
- Internal structure of a vector
- How data is stored in a vector
- Process on a vector
- Application considerations

-- By Rossella Lau
Rossella Lau
Lecture 2, DCO20105, Semester A,2005-6
Array
A traditional container allowing storage of multiple occurrences of data
- A chunk of memory is assigned and data can be stored into its slots.
  [Figure: a row of slots indexed 0, 1, 2, 3, 4, …]
- The number of slots is the size of the array; once that is defined, the size cannot be changed
- An array uses an index to identify (access) its elements
- Data can be stored in any slot of an array, but usually data are stored from the first slot and each new datum is appended at the end
C++ Arrays (Ford’s slide: 4-12)
An array is a fixed-size collection of values of the same
data type.
An array is a container that stores the n (size) elements
in a contiguous block of memory.
[Figure: a contiguous block holding arr[0], arr[1], arr[2], ..., arr[n-1] at indices 0, 1, 2, ..., n-1]
Typical operations on an array

- An array allows elements to be added/deleted at any arbitrary position
- “Shift” operations are required when an element is inserted between elements or deleted at a position before the last element (Text book slides: 1:7)
[Figure: Insert 25 at position 2 of vector 15 20 30 35 40: elements 30 35 40 shift right, giving 15 20 25 30 35 40. Erase 20 at position 1 of vector 15 20 30 35 40: elements 30 35 40 shift left, giving 15 30 35 40.]
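The shift operations described above can be sketched on a plain array as follows. This is only an illustration, not code from the lecture: the function names insertAt and eraseAt, and the convention of passing the count of used slots by reference, are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Insert item before slot pos in arr, which currently uses n of its
// capacity slots. Elements pos..n-1 are shifted one slot to the right.
// Returns false (and changes nothing) if the array is already full.
bool insertAt(int arr[], std::size_t& n, std::size_t capacity,
              std::size_t pos, int item) {
    assert(pos <= n);                 // pos == n appends at the end
    if (n == capacity) return false;  // overflow: no free slot left
    for (std::size_t i = n; i > pos; --i)
        arr[i] = arr[i - 1];          // shift right, starting from the end
    arr[pos] = item;
    ++n;
    return true;
}

// Erase the element at slot pos. Elements pos+1..n-1 are shifted one
// slot to the left.
void eraseAt(int arr[], std::size_t& n, std::size_t pos) {
    assert(pos < n);
    for (std::size_t i = pos; i + 1 < n; ++i)
        arr[i] = arr[i + 1];          // shift left
    --n;
}
```

Both functions touch every element after the given position, which is why inserting or erasing near the front of a long array is costly.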
Arrays in programming languages
- Usually supported by programming languages without using any library functions
- An array can be static or dynamic
- The size of a static array is determined at compilation time while the size of a dynamic array can be determined during execution
Examples of C arrays
- An array and a pointer are closely related: the name of an array can be used as a pointer to its first element
- A static array's id is a constant pointer, while a dynamic array is held through a pointer that can be made to point to any new value (array)
- The storage occupied by a dynamic array should be released before the program is terminated.

// static arrays
int arrayA[20];
int arrayB[]={1,2,3,4};

// dynamic array
int *array;
……
array = new int[size];
……
delete[] array;
Evaluating an Array as a Container
(Ford’s slide: 4-13)

- The size of an array is fixed at the time of its declaration and cannot be changed during the runtime.
  • An array cannot report its size. A separate integer variable is required in order to keep track of its size.
- C++ arrays do not allow the assignment of one array to another.
  • The copying of an array requires the generation of a loop structure with the array size as an upper bound.
Exception handling in array processing
- A programmer should take care to avoid overflow, which happens when more elements need to be stored than the array has slots
- A programmer should take care of all the “shift” operations required by an insert or a delete action
Vector
- As the O-O concept matured and O-O languages appeared, a class Vector usually comes with a language’s library for use as an array
- A vector encapsulates all of an array’s related housekeeping processes to save programmers some work in taking care of overflow (while doing an insertion), shift operations, etc.
A typical vector class – Structure
- A language supported array
- capacity stores the number of slots in the vector (the array size)
- size stores the number of slots used (usually the slots are occupied contiguously)

class Vector{
private:
   int *array;
   size_t size;
   size_t capacity;
public:
   ……
};
A typical vector class – Methods
- at(i) – returns the item at slot i
- insert(i,item) – inserts item before slot i and automatically resizes the array if the array is “full”
- erase(i) – removes the element at slot i and may shift elements from (i+1..n-1) to (i..n-2)
- resize(i) – makes the vector have an array with size i

class Vector{
private:
   ……
public:
   int at(size_t i);
   void insert(size_t i, int item);
   void erase(size_t i);
   void resize(size_t);
   size_t size();
   size_t capacity();
   ……
};
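One way the methods listed above might be implemented is sketched here. It is not the lecture's own code. The trailing underscores on the data members (a data member and a member function cannot share a name), the private helper reallocate, and the doubling growth policy are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

class Vector {
private:
    int*        array_;     // the underlying language-supported array
    std::size_t size_;      // number of slots used
    std::size_t capacity_;  // number of slots allocated

    // Move the elements into a new array with newCapacity slots.
    void reallocate(std::size_t newCapacity) {
        int* bigger = new int[newCapacity];
        for (std::size_t k = 0; k < size_; ++k) bigger[k] = array_[k];
        delete[] array_;
        array_ = bigger;
        capacity_ = newCapacity;
    }

    Vector(const Vector&);             // copying is left out of this sketch
    Vector& operator=(const Vector&);

public:
    Vector() : array_(0), size_(0), capacity_(0) {}
    ~Vector() { delete[] array_; }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

    int at(std::size_t i) const {
        assert(i < size_);
        return array_[i];
    }

    // Insert item before slot i; grow the array first if it is full.
    void insert(std::size_t i, int item) {
        assert(i <= size_);
        if (size_ == capacity_)
            reallocate(capacity_ == 0 ? 1 : 2 * capacity_);  // assumed policy
        for (std::size_t k = size_; k > i; --k) array_[k] = array_[k - 1];
        array_[i] = item;
        ++size_;
    }

    // Remove the element at slot i by shifting i+1..n-1 to i..n-2.
    void erase(std::size_t i) {
        assert(i < size_);
        for (std::size_t k = i; k + 1 < size_; ++k) array_[k] = array_[k + 1];
        --size_;
    }

    // Make the vector hold exactly n used slots; new slots are set to 0.
    void resize(std::size_t n) {
        if (n > capacity_) reallocate(n);
        for (std::size_t k = size_; k < n; ++k) array_[k] = 0;
        size_ = n;
    }
};
```

The caller never sees the overflow check or the shifting loops, which is the housekeeping a vector is meant to hide.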
However …
- Consider the sample class Vector
- If the objects stored in the array are not integers, but strings, rational numbers, or student records, we may need to create many classes.
- Absolutely terrible for maintenance

class Vector{
   int *array;
   size_t size;
   size_t capacity;
   ……
};
class VectorStr{
   string *array;
   size_t size;
   size_t capacity;
   ……
};
class VectorStd{
   Student *array;
   size_t size;
   size_t capacity;
   ……
};
Templates in C++
- In C++, a type parameter allows a template class / function to be written so that it can work with any data type
- Rewrite class Vector as a template class
- Whenever a Vector object is instantiated, a type must be specified

template <class T>
class Vector{
   T *array;          // T replaces int
   size_t size;
   size_t capacity;
   ……
};

Vector<int> intArray;
Vector<string> strArray;
Vector<Student> stdArray;
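A reduced template version that compiles is sketched below, to show one class definition serving several element types. Only push_back(), at(), and size() are included. The member names with trailing underscores and the doubling growth policy are assumptions, not part of the lecture.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

template <class T>
class Vector {
    T*          array_;
    std::size_t size_;
    std::size_t capacity_;

    Vector(const Vector&);             // copying is left out of this sketch
    Vector& operator=(const Vector&);

public:
    Vector() : array_(0), size_(0), capacity_(0) {}
    ~Vector() { delete[] array_; }

    std::size_t size() const { return size_; }

    // Append item after the last used slot, growing the array if full.
    void push_back(const T& item) {
        if (size_ == capacity_) {
            std::size_t newCapacity = capacity_ == 0 ? 1 : 2 * capacity_;
            T* bigger = new T[newCapacity];  // T must be default-constructible
            for (std::size_t k = 0; k < size_; ++k) bigger[k] = array_[k];
            delete[] array_;
            array_ = bigger;
            capacity_ = newCapacity;
        }
        array_[size_++] = item;
    }

    T& at(std::size_t i) {
        assert(i < size_);
        return array_[i];
    }
};
```

With this single definition, Vector<int> and Vector<std::string> are both usable; the compiler generates a separate class for each type that is actually instantiated.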
C++ Vector
- Vector in C++ is a template class which supports
  • typical array operations: getting space for an array (instantiating a vector object), identifying an element, storing data, removing data, etc.
  • housekeeping together with iterator operations in order to allow some generic algorithms to be applied on it

Getting a vector
- Syntax: Ford’s slide 4:15-16
- E.g.,
   // declare a vector with the default number of slots (it starts empty, size 0)
   vector<int> studentNumber;
- Ford’s slide 4:22
   // vector of size 5 containing the integer value 0
   vector<int> intVector(5);
   // vector of size 10; each element contains the empty string
   vector<string> strVector(10);
Identifying elements
- Syntax: Ford’s slide 4:18
- Elements in a vector are identified in the same way as in an array: Ford’s slide 14
[Figure: elements v[0], v[1], v[2], ..., v[n-1] at indices 0, 1, 2, ..., n-1, followed by room to grow]
Storing data to a vector

- push_back()
  • provides a quick action to store an item after the last occupied slot (the last datum)
  • can cause the vector to re-size if there is not enough room
- insert()
  • places an item before a specified position
  • is inefficient but necessary if a particular order is required
- []= (e.g., v[i]=23;)
  • places an item at a specified position
  • overwrites the value if the position is occupied
  • may cause an error if the specified position is invalid (operator[] does no range check)
Examples for storing data
- Storing data into an empty vector
   vector<int> numbers;
   for( int i = 0; i < SIZE; i++ )
      numbers.push_back(i);
- Storing data into a pre-defined vector:
   vector<int> numbers(SIZE);
   for( int i = 0; i < SIZE; i++ )
      numbers[i] = i;
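The storing styles above can be wrapped into small functions that compile on their own. This is a sketch for illustration: the function names fillByPushBack, fillByIndex, and insertBefore are assumptions, and the parameter n stands in for SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill an empty vector with 0..n-1 using push_back().
std::vector<int> fillByPushBack(std::size_t n) {
    std::vector<int> numbers;              // starts with size 0
    for (std::size_t i = 0; i < n; ++i)
        numbers.push_back(static_cast<int>(i));
    assert(numbers.size() == n);
    return numbers;
}

// Fill a pre-sized vector with 0..n-1 using operator[].
std::vector<int> fillByIndex(std::size_t n) {
    std::vector<int> numbers(n);           // n slots, all already "used"
    for (std::size_t i = 0; i < n; ++i)
        numbers[i] = static_cast<int>(i);
    return numbers;
}

// Place item before position pos with insert(); the elements from pos
// onward are shifted one slot to the right.
void insertBefore(std::vector<int>& v, std::size_t pos, int item) {
    assert(pos <= v.size());
    v.insert(v.begin() + pos, item);
}
```

Both fill functions produce the same contents; they differ only in whether the slots exist before the loop starts.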
Remove an item
- Remove an item:
  • pop_back() To remove the last item, shifting is not needed
  • erase() To remove an item through an iterator, but shifting is required
    E.g., OrderItem.h in application bookShop v2.0 (without a delete flag)
- Alternatively, each element can include a flag to indicate if an item is deleted to avoid shift operations
  • E.g., OrderItem.h in application bookShop (v 1.0)
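The two removal operations above can be sketched as follows for a vector<int>. The wrapper names removeLast and removeAt are assumptions for illustration; pop_back() and erase() are the STL calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Remove the last item; no shifting is needed.
void removeLast(std::vector<int>& v) {
    assert(!v.empty());
    v.pop_back();
}

// Remove the item at index pos through an iterator; erase() shifts the
// items after it one slot to the left.
void removeAt(std::vector<int>& v, std::size_t pos) {
    assert(pos < v.size());
    v.erase(v.begin() + pos);
}
```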
Elements with delete flags
- It is efficient to remove an item in the middle: just a mark rather than shifts
  [Figure: slots 0, 1, 2, 3, 4, …, with some slots carrying the delete flag T]
- The trade-off is that additional space is required and, subsequently, whenever a slot is visited, the slot must be checked
- The slot marked “delete” can be used again or be left “removed” forever
  • Re-use may cause more checking
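A minimal sketch of the delete-flag idea follows. The struct Slot and the functions markDeleted, countLive, and store are hypothetical names chosen for illustration; they are not taken from OrderItem.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Slot {          // hypothetical element type
    int  value;
    bool deleted;      // the extra space paid for each element
};

// "Remove" the item at index i by marking it; nothing is shifted.
void markDeleted(std::vector<Slot>& slots, std::size_t i) {
    assert(i < slots.size());
    slots[i].deleted = true;
}

// Every visit must check the flag; here, count the live items.
std::size_t countLive(const std::vector<Slot>& slots) {
    std::size_t live = 0;
    for (std::size_t i = 0; i < slots.size(); ++i)
        if (!slots[i].deleted) ++live;
    return live;
}

// Re-use: store value in the first slot marked deleted, else append.
void store(std::vector<Slot>& slots, int value) {
    for (std::size_t i = 0; i < slots.size(); ++i) {
        if (slots[i].deleted) {          // the extra checking re-use costs
            slots[i].value = value;
            slots[i].deleted = false;
            return;
        }
    }
    Slot s = { value, false };
    slots.push_back(s);
}
```

Note that size() no longer equals the number of live items once a slot has been marked, which is part of the checking cost mentioned above.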
Other vector operations
- Accessors:
  • back() to return the value of the last item
  • size() to return the number of contiguous used slots
  • capacity() to return the number of slots in the vector
- Housekeeping:
  • resize() to change the size of the vector, re-allocating its capacity when more slots are needed
  • Care should be taken when instantiating an object with the constructor Vector(SIZE). The system assumes that SIZE slots have been used. If that is not the case, remember to reset its size to 0 by using resize(0).
- ……
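The caution about the sized constructor can be seen in a short sketch using std::vector. The function names pitfall and resetFirst are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pitfall: vector(n) already counts n slots as used, so push_back()
// stores the new item after n zero-filled slots.
std::vector<int> pitfall(std::size_t n) {
    std::vector<int> v(n);
    v.push_back(7);
    return v;
}

// Following the advice above: reset the size to 0 first.
std::vector<int> resetFirst(std::size_t n) {
    std::vector<int> v(n);
    v.resize(0);                  // size() is now 0
    assert(v.capacity() >= n);    // in practice the n slots are kept
    v.push_back(7);               // stored in slot 0
    return v;
}
```

For n = 5, pitfall returns six items with the 7 at the end, while resetFirst returns a single item.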
Vector traversal with index
- It is similar to traversing an array; e.g.:
   ……
   vector<int> studentID;
   ……
   for (size_t i = 0; i < studentID.size(); i++)
      cout << studentID[i] << " ";
   ……
- Traversal: to “visit” each element in a data structure; typical operations include printing all elements of a data structure and finding an item in a data structure
Vector traversal with Iterator
   ……
   vector<int> studentID;
   vector<int>::iterator it;
   ……
   for (it=studentID.begin(); it != studentID.end(); it++)
      cout << *it << " ";
   ……
Remember that
- An iterator is similar to a pointer referring to an element
- each STL container supports an iterator to traverse the container itself
- begin() returns the iterator which points to the first element
- end() returns the iterator which refers to past-the-end, not the last element
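Index traversal and iterator traversal can be compared side by side in a sketch that compiles on its own. The function names are assumptions, and the output is collected into a string rather than sent to cout so that the result can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Traverse with an index and collect the elements as text.
std::string joinByIndex(const std::vector<int>& v) {
    std::ostringstream out;
    for (std::size_t i = 0; i < v.size(); i++)
        out << v[i] << " ";
    return out.str();
}

// Traverse with an iterator; *it is the element it refers to.
std::string joinByIterator(const std::vector<int>& v) {
    std::ostringstream out;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        out << *it << " ";
    return out.str();
}

// end() is past-the-end, so step back once to reach the last element.
int lastViaIterator(const std::vector<int>& v) {
    assert(!v.empty());
    std::vector<int>::const_iterator it = v.end();
    --it;
    return *it;
}
```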

Searching an item from a vector with find()
- find() is a generic algorithm which can be applied to every STL container
- Typical usage:
   typedef vector<int>::iterator It;      // for simpler declaration
   It it = find(v.begin(), v.end(), x);   // returns an iterator
   if ( it != v.end() )                   // found
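The typical usage above can be completed into a small function. This is a sketch: the name indexOf and the choice of returning -1 when the item is absent are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<int>::const_iterator It;   // for simpler declaration

// Return the index of the first x in v, or -1 if x is not present.
long indexOf(const std::vector<int>& v, int x) {
    It it = std::find(v.begin(), v.end(), x);   // linear search
    if (it == v.end()) return -1;               // not found
    assert(*it == x);
    return static_cast<long>(it - v.begin());   // distance from the start
}
```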
Example of find()
- The linear search we used to have: (w/o delete flag)
   int getItem(string publicationCode) const {
      int i=0;
      // advance i until the code matches or the items run out
      for (; i<items.size() &&
             (items[i].getPublication().getCode() != publicationCode); i++);
      return items.size() == 0 ? -1 :
             i < items.size() ? i : -1; }
- Use the generic algorithm find() with iterator:
   pair<bool, It> getItem(string const& publicationCode) {
      It result = find(items.begin(),items.end(), publicationCode);
      // first: whether the item was found; second: its position
      return result == items.end()? pair<bool, It>(false, result) :
                                    pair<bool, It>(true, result);
   }
Better searching method
- find() uses linear search, so its efficiency is not good: the search time grows in proportion to the number of elements. When a container can provide random (direct) position access, such as an array or a vector, sort can be applied first and then binary search can take place
   sort(v.begin(), v.end());
   binary_search(v.begin(), v.end(), target);   // returns boolean
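The sort-then-search pattern can be sketched as two steps. The function names prepare and contains are assumptions; the assertion inside contains only documents the precondition that the vector is already sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Sort once, before any searching is done.
void prepare(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
}

// Binary search; only valid on a sorted vector.
bool contains(const std::vector<int>& sorted, int target) {
    // No adjacent pair may be out of order (left element > right element).
    assert(std::adjacent_find(sorted.begin(), sorted.end(),
                              std::greater<int>()) == sorted.end());
    return std::binary_search(sorted.begin(), sorted.end(), target);
}
```

The sort is paid for once; every later call to contains then takes a number of comparisons that grows with the logarithm of the size rather than with the size itself.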
Binary search methods
- To return the iterator pointing to the position of the target being searched, the following can be used:
  • It lbIt = lower_bound(v.begin(), v.end(), target);
  • It ubIt = upper_bound(v.begin(), v.end(), target);
  • pair<It, It> range = equal_range(v.begin(), v.end(), target);
[Figure: target=25 in …… 13 13 25 25 25 28 33 ……; range.first = lbIt refers to the first 25; ubIt = range.second refers to 28, the element after the last 25]
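The relationship between the three calls can be checked with a small sketch. The function name countInSorted is an assumption; it relies on the vector being sorted already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<int>::const_iterator It;

// Count how many times target occurs in the sorted vector v.
std::size_t countInSorted(const std::vector<int>& v, int target) {
    It lbIt = std::lower_bound(v.begin(), v.end(), target);  // first element >= target
    It ubIt = std::upper_bound(v.begin(), v.end(), target);  // first element >  target
    std::pair<It, It> range = std::equal_range(v.begin(), v.end(), target);
    assert(range.first == lbIt && range.second == ubIt);     // same two positions
    return static_cast<std::size_t>(ubIt - lbIt);
}
```

When the target is absent, lbIt and ubIt refer to the same position (where the target would be inserted), so the count is 0.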
Operator Overload and generic algorithm
- Note that generic algorithms usually require operator overloads
- In the earlier find() example (getItem), two operator overload operations are required:
  • orderItem == publicationCode (in OrderItem.h)
  • publication == publicationCode (in Publication.h)
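How the two overloads let find() compare an element with a publication code can be sketched with heavily reduced classes. The class bodies below are hypothetical stand-ins, not the contents of OrderItem.h or Publication.h; only the names getPublication() and getCode() come from the lecture's example, and hasItem is an assumed helper.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

class Publication {                     // hypothetical, much reduced
    std::string code;
public:
    explicit Publication(const std::string& c) : code(c) {}
    const std::string& getCode() const { return code; }
};

// publication == publicationCode
bool operator==(const Publication& p, const std::string& publicationCode) {
    return p.getCode() == publicationCode;
}

class OrderItem {                       // hypothetical, much reduced
    Publication publication;
public:
    explicit OrderItem(const Publication& p) : publication(p) {}
    const Publication& getPublication() const { return publication; }
};

// orderItem == publicationCode, built on the overload above
bool operator==(const OrderItem& item, const std::string& publicationCode) {
    return item.getPublication() == publicationCode;
}

typedef std::vector<OrderItem>::const_iterator It;

// find() compares each element with the code using operator==.
bool hasItem(const std::vector<OrderItem>& items,
             const std::string& publicationCode) {
    It result = std::find(items.begin(), items.end(), publicationCode);
    assert(result == items.end() ||
           result->getPublication().getCode() == publicationCode);
    return result != items.end();
}
```

Without the OrderItem overload, the call to find() would not compile, because find() has no other way to compare an OrderItem with a string.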
Better? Not always!
- Storing data to a vector
  • push_back() provides a quick action to store an item
  • insert() is inefficient but necessary if a particular order is required (it saves the operations for a sort)
- Finding elements from a vector
  • Sequential search is inefficient but an order is not required; i.e., push_back() can be used for storing data
  • Binary search is efficient but requires a sorted order; i.e., insert() must be used for storing data
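Keeping a vector sorted at storing time, so that binary search stays usable, can be sketched as follows. The names insertSorted and containsSorted are assumptions; using lower_bound() to locate the insert position is one possible approach, not necessarily the one used in the BookShop programs.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Keep v sorted while storing: find the position with a binary search,
// then insert() there, which shifts the later elements right.
void insertSorted(std::vector<int>& v, int item) {
    std::vector<int>::iterator pos =
        std::lower_bound(v.begin(), v.end(), item);
    assert(pos == v.end() || !(*pos < item));  // first element not less than item
    v.insert(pos, item);
}

// Because v stays sorted, searching can use binary search.
bool containsSorted(const std::vector<int>& v, int target) {
    return std::binary_search(v.begin(), v.end(), target);
}
```

Each insertSorted call pays for the shift, which is the cost traded against faster searches later.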

Application considerations
- A stable and long data stream that requires a lot of searches but few insert/erase operations is better kept sorted for binary search
- A short data stream with few searches but many insert operations is better served by push_back() and sequential search
Summary
- Vector is a sequential storage container which encapsulates an array and its housekeeping process as a class to simplify programmers’ work
- It is the simplest storage structure to store data in contiguous slots, with the trade-off of inefficient insert/erase operations
- In C++, vector, in the STL, is a template class which allows data in a vector to be of any data type
- STL’s vector supports a variety of functions for data storage
- Together with generic algorithms, iterators, and operator overloads, many popular vector traversal functions, such as search, are ready for use
- The choice of operations should depend on the particular application
Reference
- Ford: 1.8, 2.4, 3.5, 4
- Lecture 12 of DCO10105
- STL online references
  • http://www.sgi.com/tech/stl/stl_introduction.html
  • http://www.sgi.com/tech/stl
  • http://www.cppreference.com/
- Example programs: OrderItem.h, Order.h, Catalog.h in the application BookShop v2.0
-- END --