DATA STRUCTURE

A data structure is a particular way of
storing and organizing data in
a computer so that it can be
used efficiently.
Kinds of Data Structures
• For example, B-trees are particularly well-suited for the implementation of databases, while compiler implementations usually use hash tables to look up identifiers.
Uses of Data Structures
Data structures are used in almost every
program or software system.
Data types
1. Primitive Types
2. Composite Types
3. Abstract Data Types (ADT)
Primitive Types
1. The Boolean or logical data type is the most primitive data type, having one of two values (true or false), intended to represent the truth values of logic and Boolean algebra.
2. A character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language.
3. An integer is a data type which represents some finite subset of the mathematical integers. These are also known as integral data types.
4. A string is a sequence of symbols that are chosen from a set or alphabet.
5. Double precision is a computer numbering format that occupies two adjacent storage locations in computer memory. A double precision number, sometimes simply called a double, may be defined to be an integer, fixed point, or floating point.
6. Floating point data types. The keyword float usually represents a single precision floating point data type, and double represents a double precision floating point data type. In TIGCC, both float and double (and even long double) are the same.
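As a brief, hedged illustration (not from the original transcript), these primitive types map onto C roughly as follows; the variable names are this sketch's own and exact sizes are implementation-defined:

    #include <stdbool.h>   /* bool, true, false (C99) */
    #include <stdio.h>

    int main(void)
    {
        bool   flag    = true;        /* Boolean: one of two truth values        */
        char   letter  = 'A';         /* character: one symbol of an alphabet    */
        int    count   = 42;          /* integer: finite subset of the integers  */
        const char *word = "hello";   /* string: a sequence of characters        */
        float  ratio   = 0.5f;        /* single precision floating point         */
        double precise = 3.141592653589793;   /* double precision floating point */

        printf("%d %c %d %s %f %f\n", flag, letter, count, word, ratio, precise);
        return 0;
    }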
Composite Types
1. A record (also called a tuple or struct) is one of the simplest data structures, consisting of two or more values or variables stored in consecutive memory positions, so that each component (called a field or member of the record) can be accessed by applying different offsets to the starting address.
2. A union is a value that may have any of several representations or formats, or a data structure that consists of a variable which may hold such a value. Some programming languages support special data types, called (somewhat confusingly) union types, to describe such values and variables.
3. A tagged union, also called a variant, variant record, discriminated union, or disjoint union, is a data structure used to hold a value that could take on several different, but fixed, types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use.
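A minimal C sketch (an illustration of the terms above, not code from the transcript) showing a record, a plain union, and a tagged union; the type and field names are assumptions of this example:

    #include <stdio.h>

    /* Record (struct): fields stored at different offsets from the start. */
    struct Point { int x; int y; };

    /* Union: one storage area that may hold either representation. */
    union Number { int i; double d; };

    /* Tagged union (variant): a tag field says which member is currently in use. */
    struct Variant {
        enum { TAG_INT, TAG_DOUBLE } tag;
        union { int i; double d; } value;
    };

    int main(void)
    {
        struct Point p = { 3, 4 };
        struct Variant v = { .tag = TAG_DOUBLE, .value.d = 2.5 };

        printf("point: (%d, %d)\n", p.x, p.y);
        if (v.tag == TAG_DOUBLE)                   /* check the tag before reading */
            printf("variant holds a double: %f\n", v.value.d);
        return 0;
    }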
Bubble Sort
Bubble sort is a simple and well-known sorting algorithm. It is rarely used in practice; its main application is as an introduction to sorting algorithms. Bubble sort belongs to the O(n²) sorting algorithms, which makes it quite inefficient for sorting large data volumes. Bubble sort is stable and adaptive.
Algorithm
1.Compare each pair of adjacent elements from the
beginning of an array and, if they are in reversed order,
swap them.
2. If at least one swap has been done, repeat step 1.
Imagine that on every step big bubbles float to the surface and stay there. At the step when no bubble moves, sorting stops. Let us see an example of sorting an array to make the idea of bubble sort clearer.
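A hedged C sketch of the algorithm above (the function name bubble_sort is this example's own); the swapped flag implements the "repeat while at least one swap was done" rule:

    #include <stdbool.h>
    #include <stddef.h>

    /* Bubble sort: repeatedly swap adjacent out-of-order elements.
       Stops early when a full pass makes no swaps (adaptive behaviour). */
    void bubble_sort(int a[], size_t n)
    {
        bool swapped = true;
        while (swapped) {
            swapped = false;
            for (size_t i = 1; i < n; i++) {
                if (a[i - 1] > a[i]) {
                    int tmp = a[i - 1];
                    a[i - 1] = a[i];
                    a[i] = tmp;
                    swapped = true;
                }
            }
        }
    }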
Complexity analysis
The average and worst case complexity of bubble sort is O(n²). It also makes O(n²) swaps in the worst case. Bubble sort is adaptive, which means that for an almost sorted array it runs in about O(n) time. Avoid implementations that don't check on every pass whether the array is already sorted (i.e., whether any swaps were made); this check is necessary in order to preserve the adaptive property.
See Sample Figure 1.0
TURTLES AND RABBITS
• One more problem of bubble sort is that its running time depends heavily on the initial order of the elements. Big elements (rabbits) go up fast, while small ones (turtles) go down very slowly. This problem is solved in the cocktail sort.
• Turtle example. Though the array {2, 3, 4, 5, 1} is almost sorted, it takes O(n²) iterations to sort it. Element {1} is a turtle.
See Sample Figure 1.1
• Rabbit example. The array {6, 1, 2, 3, 4, 5} is almost sorted too, but it takes O(n) iterations to sort it. Element {6} is a rabbit. This example demonstrates the adaptive property of bubble sort.
See Sample Figure 1.2
Selection Sort
Selection sort is one of the O(n²) sorting algorithms, which makes it quite inefficient for sorting large data volumes. Selection sort is notable for its programming simplicity and it can outperform other sorts in certain situations (see the complexity analysis for more details).
Algorithm
• The idea of the algorithm is quite simple. The array is conceptually divided into two parts - a sorted one and an unsorted one. At the beginning, the sorted part is empty, while the unsorted one contains the whole array. At every step, the algorithm finds the minimal element in the unsorted part and adds it to the end of the sorted one. When the unsorted part becomes empty, the algorithm stops.
When the algorithm sorts an array, it swaps the first element of the unsorted part with the minimal element, which is then included in the sorted part. This implementation of selection sort is not stable. If a linked list is sorted instead and, rather than swapping, the minimal element is relinked to the end of the sorted part, selection sort is stable.
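A minimal C sketch of the array-based selection sort described above (function and variable names are this illustration's own):

    #include <stddef.h>

    /* Selection sort: grow the sorted prefix by repeatedly selecting the
       minimal element of the unsorted part and swapping it into place. */
    void selection_sort(int a[], size_t n)
    {
        for (size_t i = 0; i + 1 < n; i++) {
            size_t min = i;                 /* index of minimal element so far */
            for (size_t j = i + 1; j < n; j++)
                if (a[j] < a[min])
                    min = j;
            if (min != i) {                 /* swap it to the end of the sorted part */
                int tmp = a[i];
                a[i] = a[min];
                a[min] = tmp;
            }
        }
    }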
Insertion Sort
Insertion sort belongs to the O(n²) sorting algorithms. Unlike many sorting algorithms with quadratic complexity, it is actually applied in practice for sorting small arrays of data. For instance, it is used to improve the quicksort routine. Some sources note that people use the same algorithm when ordering items, for example a hand of cards.
• Algorithm
The insertion sort algorithm somewhat resembles selection sort. The array is conceptually divided into two parts - a sorted one and an unsorted one. At the beginning, the sorted part contains the first element of the array and the unsorted one contains the rest. At every step, the algorithm takes the first element of the unsorted part and inserts it into the right place in the sorted one. When the unsorted part becomes empty, the algorithm stops. Sketched briefly, one step of the insertion sort algorithm looks like this:
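As a hedged illustration (not the transcript's own code), one way to write the algorithm in C:

    #include <stddef.h>

    /* Insertion sort: take the first element of the unsorted part and
       shift larger sorted elements right until its correct place is found. */
    void insertion_sort(int a[], size_t n)
    {
        for (size_t i = 1; i < n; i++) {
            int key = a[i];                 /* element being inserted */
            size_t j = i;
            while (j > 0 && a[j - 1] > key) {
                a[j] = a[j - 1];            /* shift the larger element right */
                j--;
            }
            a[j] = key;
        }
    }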
Insertion sort properties
• adaptive (performance adapts to the initial order of elements);
• stable (insertion sort retains the relative order of the same elements);
• in-place (requires a constant amount of additional space);
• online (new elements can be added during the sort).
Illustration
Quick Sort
Quicksort is a fast sorting algorithm, which is used not only for educational purposes but is widely applied in practice. On average, it has O(n log n) complexity, making quicksort suitable for sorting big data volumes. The idea of the algorithm is quite simple and once you realize it, you can write quicksort as fast as bubble sort.
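A short, hedged C sketch of quicksort; the Lomuto partition scheme with the last element as the pivot is an assumption of this example, since the transcript does not fix a particular partition method:

    #include <stddef.h>

    static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

    /* Quicksort: place the pivot in its final position, then sort both sides. */
    void quick_sort(int a[], int lo, int hi)
    {
        if (lo >= hi)
            return;
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot)
                swap(&a[i++], &a[j]);
        swap(&a[i], &a[hi]);          /* pivot into its final position */
        quick_sort(a, lo, i - 1);
        quick_sort(a, i + 1, hi);
    }

    /* Usage: quick_sort(array, 0, n - 1); */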
STACKS
The stack is one of the fundamental data structures in computer science and it is used in many algorithms and applications. As an example, a stack is used:
- implicitly in recursion;
- for expression evaluation;
- to check the correctness of a parentheses sequence;
- etc.
First of all, we will describe the Stack ADT and then show two different implementations.
• A stack is a last in, first out (LIFO) abstract data type and data structure. A stack can have any abstract data type as an element, but is characterized by only two fundamental operations: push and pop. The push operation adds an item to the top of the list, hiding any items already on the stack, or initializing the stack if it is empty. The pop operation removes an item from the top of the list and returns this value to the caller. A pop either reveals previously concealed items or results in an empty list.
Stack ADT
A good illustration from the real world is a stack of books, where you can operate only on the "top" of the stack. We will define six operations on the stack. You can put a book on top (push); look which one lies on top of the stack (peek); remove the topmost book (pop); and check whether the stack is empty (isEmpty). Also, there are two operations to create and to destroy a stack. Let us structure them in the following list.
Operations
• Stack create()
creates an empty stack
• boolean isEmpty(Stack s)
tells whether the stack s is empty
• push(Stack s, Item e)
puts e on top of the stack s
• Item peek(Stack s)
returns the topmost element of the stack s
Precondition: s is not empty
• pop(Stack s)
removes the topmost element from the stack s
Precondition: s is not empty
• destroy(Stack s)
destroys the stack s
• Note. In different sources the peek operation may be called top.
• Axioms
• a newly created stack is empty;
• after pushing an item onto a newly created stack, it becomes nonempty;
• peek returns the most recently pushed item;
• the stack remains untouched after a pair of push and pop commands executed one after another.
• This is a formal definition. Let us step forward to more intuitive examples before we turn to implementation issues.
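Before moving on, here is a minimal array-based sketch in C of the Stack ADT operations listed above (one possible implementation of the kind the text alludes to); the fixed capacity and the names are assumptions of this illustration, and error handling is omitted for brevity:

    #include <stdbool.h>
    #include <stdlib.h>

    typedef int Item;

    typedef struct {
        Item  *data;     /* storage for the elements        */
        size_t size;     /* number of elements on the stack */
        size_t capacity; /* allocated capacity              */
    } Stack;

    Stack *stack_create(size_t capacity)                          /* create()  */
    {
        Stack *s = malloc(sizeof *s);
        s->data = malloc(capacity * sizeof *s->data);
        s->size = 0;
        s->capacity = capacity;
        return s;
    }

    bool stack_is_empty(const Stack *s) { return s->size == 0; }  /* isEmpty() */

    void stack_push(Stack *s, Item e) { s->data[s->size++] = e; } /* push(); assumes room    */

    Item stack_peek(const Stack *s) { return s->data[s->size - 1]; } /* peek(); s not empty  */

    Item stack_pop(Stack *s) { return s->data[--s->size]; }       /* pop(); s not empty      */

    void stack_destroy(Stack *s) { free(s->data); free(s); }      /* destroy() */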
Linked list
A linked list is a data structure that consists of a sequence of data records such that in each record there is a field that contains a reference (i.e., a link) to the next record in the sequence.
• Singly-linked list
• The linked list is a very important dynamic data structure. Basically, there are two types of linked list: the singly-linked list and the doubly-linked list. In a singly-linked list, every element contains some data and a link to the next element, which holds the structure together. In addition, every node in a doubly-linked list also contains a link to the previous node. A linked list can be the underlying data structure used to implement a stack, queue or sorted list.
• Each cell is called a node of the singly-linked list. The first node is called the head, and it is a dedicated node: by knowing it, we can access every other node in the list. Sometimes the last node, called the tail, is also stored in order to speed up the add operation.
Singly-linked list. Traversal.
Assume that we have a list with some nodes. Traversal is the most basic operation, which appears as a part of almost every operation on a singly-linked list. For instance, an algorithm may traverse a singly-linked list to find a value, find a position for insertion, etc. For a singly-linked list, only forward traversal is possible.
Traversal algorithm
Beginning from the head:
• check if the end of the list hasn't been reached yet;
• do some actions with the current node, specific to the particular algorithm;
• the current node becomes the previous one and the next node becomes the current one. Go to step 1.
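A hedged C sketch of the node layout and the traversal algorithm above (names are this example's own):

    #include <stdio.h>
    #include <stddef.h>

    /* A node of a singly-linked list: data plus a link to the next node. */
    struct Node {
        int data;
        struct Node *next;
    };

    /* Traversal: start from the head and follow the links until NULL. */
    void traverse(struct Node *head)
    {
        for (struct Node *cur = head; cur != NULL; cur = cur->next) {
            /* "do some actions with the current node" -- here we just print it */
            printf("%d\n", cur->data);
        }
    }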
Binary Heap
• Binary heap. Array-based internal representation.
A heap is a binary tree and therefore can be stored inside the computer using links. This gives various benefits; one of them is the ability to vary the number of elements in a heap quite easily. On the other hand, each node stores two auxiliary links, which implies additional memory cost. As mentioned above, a heap is a complete binary tree, which leads to the idea of storing it using an array. By utilizing an array-based representation, we can reduce memory costs while tree navigation remains quite simple.
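As a small illustrative sketch (assuming 0-based indexing with the root at index 0; some presentations use 1-based indexing instead), tree navigation reduces to index arithmetic:

    /* With the root stored at index 0 of the array, tree navigation is
       simple index arithmetic -- no explicit links are needed. */
    static inline int parent(int i)      { return (i - 1) / 2; }
    static inline int left_child(int i)  { return 2 * i + 1; }
    static inline int right_child(int i) { return 2 * i + 2; }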
Complete binary tree
A binary tree is said to be complete if all its levels, except possibly the deepest, are complete. Though possibly incomplete, the bottom level can't have "holes", which means that it has to be filled from the very left node up to some node in the middle. See the illustrations below.
• Heap property
There are two possible types of binary heaps: the max heap and the min heap. The difference is that the root of a min heap contains the minimal element, and vice versa. Priority queues often deal with min heaps, whereas the heapsort algorithm, when sorting in ascending order, uses a max heap.
• Inserting an element into a heap
In this article we examine the idea lying at the foundation of the heap data structure. We call it sifting, but you may also meet other terms, like "trickle", "heapify", "bubble" or "percolate".
Insertion algorithm
Now, let us phrase the general algorithm to insert a new element into a heap.
1. Add the new element to the end of the array;
2. Sift up the new element while the heap property is broken. Sifting is done as follows: compare the node's value with its parent's value, and if they are in the wrong order, swap them.
(Example figure omitted: now the heap property is broken at the root node.)
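A hedged C sketch of the insertion algorithm for a min heap (array-based, 0-based indexing; the names and the "caller guarantees room" assumption are this example's own):

    /* Min heap insertion: append at the end, then sift the new value up
       while it is smaller than its parent. heap[] is 0-based, *size is the
       current number of elements; the caller guarantees there is room. */
    void heap_insert(int heap[], int *size, int value)
    {
        int i = (*size)++;
        heap[i] = value;
        while (i > 0 && heap[(i - 1) / 2] > heap[i]) {   /* parent larger: heap property broken */
            int p = (i - 1) / 2;
            int tmp = heap[p]; heap[p] = heap[i]; heap[i] = tmp;
            i = p;
        }
    }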
• Removing the minimum from a heap
The removal operation uses the same idea as was used for insertion. The root's value, which is minimal by the heap property, is replaced by the last value in the array. Then the new value is sifted down until it takes the right position.
Removal algorithm
1. Copy the last value in the array to the root;
2. Decrease the heap's size by 1;
3. Sift down the root's value. Sifting is done as follows:
– if the current node has no children, sifting is over;
– if the current node has one child: check whether the heap property is broken, and if so, swap the current node's value with the child's value and sift down the child;
– if the current node has two children: find the smaller of them. If the heap property is broken, swap the current node's value with the selected child's value and sift down the child.
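A matching hedged sketch of removal from a min heap (the caller must ensure the heap is not empty; names are this example's own):

    /* Min heap removal: move the last value to the root, shrink the heap,
       then sift the root down, always swapping with the smaller child. */
    int heap_remove_min(int heap[], int *size)
    {
        int min = heap[0];
        heap[0] = heap[--(*size)];            /* copy the last value to the root */
        int i = 0;
        for (;;) {
            int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
            if (left  < *size && heap[left]  < heap[smallest]) smallest = left;
            if (right < *size && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i)                /* no child breaks the heap property */
                break;
            int tmp = heap[i]; heap[i] = heap[smallest]; heap[smallest] = tmp;
            i = smallest;
        }
        return min;
    }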
Introduction to Binary Tree
A binary tree is made of nodes, where each node
contains a "left" pointer, a "right" pointer, and a
data element. The "root" pointer points to the
topmost node in the tree. The left and right
pointers recursively point to smaller "subtrees"
on either side. A null pointer represents a binary
tree with no elements -- the empty tree. The
formal recursive definition is: a binary tree is
either empty (represented by a null pointer), or is
made of a single node, where the left and right
pointers (recursive definition ahead) each point
to a binary tree.
A "binary search tree" (BST) or
"ordered binary tree" is a type of binary
tree where the nodes are arranged in
order: for each node, all elements in its
left subtree are less-or-equal to the
node (<=), and all the elements in its
right subtree are greater than the node
(>). The tree shown above is a binary
search tree -- the "root" node is a 5,
and its left subtree nodes (1, 3, 4) are
<= 5, and its right subtree nodes (6, 9)
are > 5. Recursively, each of the
subtrees must also obey the binary
search tree constraint: in the (1, 3, 4)
subtree, the 3 is the root, the 1 <= 3
and 4 > 3. Watch out for the exact
wording in the problems -- a "binary
search tree" is different from a "binary
tree".
Insert()
Insert() -- given a binary search tree and a number, insert a new node with the given number into the tree in the correct place. The insert() code is similar to lookup(), but with the complication that it modifies the tree structure. As described above, insert() returns the new tree pointer to use to its caller. Calling insert() with the number 5 on an example tree is shown in the sketch below.
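A hedged sketch of insert() in C, reusing struct node from the lookup sketch above; the newNode() helper is this example's own:

    #include <stdlib.h>

    /* newNode(): helper that allocates a node with empty subtrees. */
    static struct node *newNode(int data)
    {
        struct node *n = malloc(sizeof *n);
        n->data = data;
        n->left = n->right = NULL;
        return n;
    }

    /* insert(): return the new tree pointer; the caller must store it back,
       e.g. root = insert(root, 5); */
    struct node *insert(struct node *node, int data)
    {
        if (node == NULL)                       /* empty spot found: attach here */
            return newNode(data);
        if (data <= node->data)                 /* less-or-equal values go to the left */
            node->left = insert(node->left, data);
        else
            node->right = insert(node->right, data);
        return node;                            /* unchanged pointer on the way up */
    }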
hasPathSum()
We'll define a "root-to-leaf path" to be a sequence of nodes in a tree starting with the root node and proceeding downward to a leaf (a node with no children). We'll say that an empty tree contains no root-to-leaf paths. So, for example, the following tree has exactly four root-to-leaf paths:
For this problem, we will be concerned
with the sum of the values of such a path
-- for example, the sum of the values on
the 5-4-11-7 path is 5 + 4 + 11 + 7 = 27.
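A hedged C sketch of hasPathSum(), using the struct node above and following the convention that an empty tree contains no root-to-leaf paths:

    #include <stdbool.h>
    #include <stddef.h>

    /* hasPathSum(): true if some root-to-leaf path adds up exactly to sum. */
    bool hasPathSum(struct node *node, int sum)
    {
        if (node == NULL)
            return false;                              /* no path in an empty tree */
        int remaining = sum - node->data;
        if (node->left == NULL && node->right == NULL) /* leaf: the path ends here */
            return remaining == 0;
        return hasPathSum(node->left, remaining) ||
               hasPathSum(node->right, remaining);
    }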
• A binary tree is either empty or consists of a node called the root together with two binary trees called the left subtree and the right subtree.
Examples of Binary Trees
Binary Tree Traversal
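The transcript gives no body under this heading; as a hedged sketch using the struct node above, the three classic depth-first traversals are:

    #include <stdio.h>

    void inorder(struct node *n)    /* left, root, right -- visits a BST in sorted order */
    {
        if (n == NULL) return;
        inorder(n->left);
        printf("%d ", n->data);
        inorder(n->right);
    }

    void preorder(struct node *n)   /* root, left, right */
    {
        if (n == NULL) return;
        printf("%d ", n->data);
        preorder(n->left);
        preorder(n->right);
    }

    void postorder(struct node *n)  /* left, right, root */
    {
        if (n == NULL) return;
        postorder(n->left);
        postorder(n->right);
        printf("%d ", n->data);
    }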
Deletion in BST
Let x be a value to be deleted from the BST and let X denote the node containing the value x. Deletion of an element in a BST again uses the BST property in a critical way. When we delete the node X containing x, it creates a "void" that should be filled by a suitable existing node of the BST. There are two candidate nodes that can fill this void without violating the BST property: (1) the node containing the highest-valued element among all descendants of the left child of X; (2) the node containing the lowest-valued element among all descendants of the right child of X. In case (1), the selected node will necessarily have a null right link, which can be conveniently used in patching up the tree. In case (2), the selected node will necessarily have a null left link, which can be used in patching up the tree. The figure illustrates several scenarios for deletion in BSTs.
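A hedged C sketch of deletion, using the struct node above and following case (2) from the text (filling the void with the lowest value among the descendants of the right child); the function name is this example's own:

    #include <stdlib.h>

    /* delete_node(): remove one node containing x and return the new subtree root. */
    struct node *delete_node(struct node *node, int x)
    {
        if (node == NULL)
            return NULL;
        if (x < node->data) {
            node->left = delete_node(node->left, x);
        } else if (x > node->data) {
            node->right = delete_node(node->right, x);
        } else {
            if (node->left == NULL || node->right == NULL) {   /* zero or one child */
                struct node *child = node->left ? node->left : node->right;
                free(node);
                return child;
            }
            struct node *succ = node->right;                   /* leftmost node of the right subtree */
            while (succ->left != NULL)
                succ = succ->left;
            node->data = succ->data;                           /* fill the void with its value */
            node->right = delete_node(node->right, succ->data);
        }
        return node;
    }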