Transcript Tirgul 7

Tirgul no. 7

Topics covered:
- More Sorting and Searching + Complexity analysis
- Matrices: Definitions and basic operations.
- Gram-Schmidt Orthonormalization algorithm
Searching

Searching for a value in an unsorted list requires a linear scan
of all elements.

However, when the list is sorted we can use this fact to
significantly improve our search complexity.

This is done using the Binary Search algorithm which is a
simple example of the “Divide and Conquer” approach.

Divide and Conquer: given a problem – divide it into subproblems and solve each one of them separately.
Binary Search
// assume A is an array with n values
// return the index in which the value key is located,
// or -1 if not found in the array
BinarySearch(A, key) {
(1) lower ← 1, upper ← n
(2) middle ← (lower+upper)/2
(3) while ( (lower <= upper) and (A[middle] ≠ key) )
    (3.1) if (key < A[middle]) then
              upper ← middle-1
          else
              lower ← middle+1
    (3.2) middle ← (lower+upper)/2
(4) if (A[middle] = key) then
        return(middle)
    else
        return(-1)
}
Binary Search – complexity analysis

In each round the algorithm divides the search space into 2 equal parts and
uses the assumption that the numbers are sorted to choose the one of the 2
parts to continue searching in.

Why is the sorting assumption important?

In order to calculate the running time, we have to count the number of
comparisons that the algorithm makes, which equals the number
of values that the index middle takes:
Worst case analysis:
Let us assume that we are searching for the smallest value in the sorted array A.
For an array of size n, middle takes the values n/2, n/4, n/8, n/16, n/32, …, 1.

- Note that this is equivalent to searching for a non-existing value in the array!

Therefore, for an array of size n, the runtime is T(n) = O(log n).
In the average case we may find the key sooner.
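The same bound can be read off a simple recurrence for the number of comparisons, since each comparison of key with A[middle] halves the remaining search space:

$$
T(n) = T\!\left(\frac{n}{2}\right) + O(1), \qquad T(1) = O(1) \;\Longrightarrow\; T(n) = O(\log n).
$$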
Insertion Sort using Binary Search
Insertion sort - each time, insert the next value into an already sorted prefix of the array.
In order to find its insertion index - use BinarySearch!
Using a small modification of the BinarySearch algorithm, we can have it return the
expected position of a value in a sorted list:
// when insertFlag = true, we are searching for the insertion index of key in the list,
// i.e. we are not interested in knowing whether key is in the list or not, but rather where it
// should be positioned in order to keep the list sorted!
BinarySearch(A, key, insertFlag) {
(1) lower ← 1, upper ← n
(2) middle ← (lower+upper)/2
(3) while ( (lower <= upper) and (A[middle] ≠ key) )
    (3.1) if (key < A[middle]) then
              upper ← middle-1
          else
              lower ← middle+1
    (3.2) middle ← (lower+upper)/2
(4) if (A[middle] = key) or (insertFlag) then
        return(middle)
    else
        return(-1)
}
Insertion Sort using Binary Search
public static void insertionSort(int[] numbers) {
    boolean insertFlag = true;
    for (int index = 1; index < numbers.length; index++) {
        int key = numbers[index];
        int insertionPosition = binarySearch(numbers, key, insertFlag, 0, index);
        int currPosition = index;
        // shift larger values one position to the right
        while (currPosition > insertionPosition) {
            numbers[currPosition] = numbers[currPosition - 1];
            currPosition--;
        }
        numbers[insertionPosition] = key;
    }
}
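The call above assumes a five-argument binarySearch that searches only the already sorted prefix numbers[lower..upper-1] and, when insertFlag is true, returns the insertion index. That helper is not shown in the transcript; the following is a minimal sketch of it, adapted from the pseudocode above (the exact bounds convention is an assumption):

// Sketch of the helper assumed above: searches the sorted range numbers[lower..upper-1].
// When insertFlag is true it returns the position where key should be inserted to keep
// the range sorted; otherwise it returns the index of key, or -1 if key is not present.
public static int binarySearch(int[] numbers, int key, boolean insertFlag,
                               int lower, int upper) {
    int high = upper - 1;                // search the closed range [lower, high]
    while (lower <= high) {
        int middle = (lower + high) / 2;
        if (numbers[middle] == key) {
            return middle;               // exact match
        } else if (key < numbers[middle]) {
            high = middle - 1;           // continue in the left half
        } else {
            lower = middle + 1;          // continue in the right half
        }
    }
    // not found: lower is the first index whose value is greater than key
    return insertFlag ? lower : -1;
}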
Shell’s Sort


Shell sort is generally the fastest of the simple O(n²)-class sorting algorithms.
The Shell sort algorithm can be defined in two steps:
Step 1: divide the original list into smaller sub lists.
Step 2: sort the individual sub lists using any known sorting
algorithm (like bubble sort, insertion sort, selection sort, etc.).
Many questions arise:
- How should I divide the list?
- Which sorting algorithm should I use?
- How many times will I have to execute steps 1 and 2?
- And the most puzzling question: if I am using bubble,
insertion or selection sort anyway, how can I
achieve an improvement in efficiency?
Shell’s Sort



For dividing the original list into smaller lists, we choose a
value K, which is known as the increment.
Based on the value of K, we split the list into K sub lists.
For example: if our original list is x[0], x[1], x[2], x[3],
x[4], …, x[99] and we choose K=5 as the increment value,
then we get the following sub lists:
first_list = x[0], x[5], x[10], x[15], …, x[95]
second_list = x[1], x[6], x[11], x[16], …, x[96]
third_list = x[2], x[7], x[12], x[17], …, x[97]
fourth_list = x[3], x[8], x[13], x[18], …, x[98]
fifth_list = x[4], x[9], x[14], x[19], …, x[99]
So the i-th sub list contains every K-th element of the
original list, starting from index i-1.
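As a small illustration of this index mapping, here is a sketch of a helper that prints which indices fall into each sub list (the method name printSublists is ours, not from the transcript):

// Illustrative helper: for each sub list id 0..K-1, print the indices that belong to it.
// With n = 100 and K = 5 this reproduces the five sub lists listed above.
public static void printSublists(int n, int K) {
    for (int i = 0; i < K; i++) {
        System.out.print("sublist " + (i + 1) + ":");
        for (int index = i; index < n; index += K) {
            System.out.print(" x[" + index + "]");
        }
        System.out.println();
    }
}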
Shell’s Sort

According to the algorithm above, in each iteration the list is
divided and then sorted. If we use the same value of K every
time, we will get the same sub lists and will sort the same
sub lists every time, which will never produce the fully sorted
final list.

Note also that sorting the five sub lists independently does
not ensure that the full list is sorted! So we need to
change the value of K.

We also know that the number of sub lists we get is equal
to the value of K.
If we reduce the value of K after every iteration,
we also reduce the number of sub lists in every
iteration. Eventually, when K is set to 1, we will have
only one sub list: the entire list.

Shell’s Sort – why does it perform better – intuition




Let's assume that we use insertion sort to sort the
sub lists. If a list is either small or almost sorted, then
insertion sort is efficient, since only a small number of
elements will be shifted.
In Shell sort the original list is divided into smaller lists
based on the value of the increment, and each list is sorted.
Initially the value of K (the increment) is fairly large, giving a large
number of small sub lists.
As K decreases, the length of the sub lists
increases. But since the earlier sub lists have already been
sorted, we expect the full list to become more and more
sorted.
Thus the inefficiency of working with larger
lists is partly compensated by the lists being increasingly
sorted.
Choosing the best sequence of increments (K)

To date, no provably optimal sequence of increments is known. There have
been a few suggestions that have been shown to be efficient (both
theoretically and empirically).

It has been shown empirically that using a larger number of
increments gives more efficient results.

It is advisable not to use sequences whose increments are multiples of
one another, like 1, 2, 4, 8, … or 1, 3, 6, 9, …, because they may cause
almost the same elements to be compared for different increment values,
which results in poor performance.
Knuth’s suggestion:

• h_1 = 1
• h_{t+1} = 3·h_t + 1, stopping at h_t when h_{t+2} ≥ N,
which gives the increment series 1, 4, 13, 40, …, h_t.
Shell’s Sort – implementation:
public void ShellSort(int[] numbers) {
    int[] increments = generateIncrementSeq(numbers.length);
    // start with the largest increment and proceed down to 1
    for (int k = increments.length - 1; k >= 0; k--) {
        int currIncrement = increments[k];
        // here the list is divided into 'currIncrement' sublists
        for (int sublistid = 0; sublistid < currIncrement; sublistid++) {
            // perform insertion sort on each sublist
            // note that you can use any sorting algorithm here
            insertionSort(currIncrement, sublistid, numbers);
        } // for sublistid
    } // for k
}
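The three-argument insertionSort used above (sorting the sub list that starts at index sublistid and advances in steps of currIncrement) is not listed in the transcript; the following is a minimal sketch of what such a helper is assumed to do:

// Sketch of the assumed helper: insertion sort applied to the sub list
// numbers[start], numbers[start + gap], numbers[start + 2*gap], ...
public static void insertionSort(int gap, int start, int[] numbers) {
    for (int i = start + gap; i < numbers.length; i += gap) {
        int key = numbers[i];
        int j = i;
        // shift larger sub-list elements one gap-step to the right
        while (j >= start + gap && numbers[j - gap] > key) {
            numbers[j] = numbers[j - gap];
            j -= gap;
        }
        numbers[j] = key;
    }
}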
Shell’s Sort – implementation:
public int[] generateIncrementSeq(int arrayLength) {
    // count how many increments need to be generated
    // (Knuth's sequence: 1, 4, 13, 40, ... while smaller than the array length)
    int numOfIncs = 0;
    for (int h = 1; h < arrayLength; h = 3 * h + 1) {
        numOfIncs++;
    }
    // generate the increment sequence
    int[] increments = new int[numOfIncs];
    int h = 1;
    for (int i = 0; i < numOfIncs; i++) {
        increments[i] = h;
        h = 3 * h + 1;
    }
    return increments;
}
Shell’s Sort – example
int[] numbers =
{8, 2, 84, 6, 12, 5, 76, 9, 13, 67, 54, 0, 43, 20, -4, 78, 32, 62,
75, 106, 16, 4, 77, 45, 23, 31, 92, 7, 18};
When the increment is 13, we get the following 13 sub lists; the
elements are divided among the sub lists as described above,
i.e. every K-th element of the list, starting at index i-1,
goes into sub list i:
Shell’s Sort – example
After sorting the sub lists, the resulting list we get is as
follows:
{8, -4, 18, 6, 12, 5, 76, 9, 4, 67, 45, 0, 31, 20, 2, 78, 32,
62, 75, 106, 16, 13, 77, 54, 23, 43, 92, 7,84}
Compare this to:
{8, 2, 84, 6, 12, 5, 76, 9, 13, 67, 54, 0, 43, 20, -4, 78, 32,
62, 75, 106, 16, 4, 77, 45, 23, 31, 92, 7, 18};
Shell’s Sort – example (cont.)
We now reduce the increment value to 4 and get the
following 4 sub lists.
Shell’s Sort – example (cont.)
After sorting these sub lists, the resulting list is as follows:
{ 4, -4, 2, 0, 8, 5, 18, 6, 12, 13, 45, 7, 16, 20, 75, 9, 23,
43, 76, 54, 31, 62, 77, 78, 32, 67, 92, 106, 84 }
We now repeat for increment= 1 (which is simply insertion
sort) and get the sorted array!
Matrices

An m×n matrix is a mathematical object that is
defined by a table of numbers with m rows and n columns. For
example, a 2×3 matrix A and a 3×2 matrix B are defined as:

$$
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \in \mathbb{R}^{2 \times 3},
\qquad
B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix} \in \mathbb{R}^{3 \times 2}.
$$

The element in the i-th row and j-th column of the
matrix A is denoted by $A_{i,j}$.
Basic Operations on Matrices

Sum - we define the sum of 2 matrices as the sum of the
corresponding elements (subtraction is defined the same way):

$$
C = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ \vdots & & & \vdots \\ c_{m1} & c_{m2} & \cdots & c_{mn} \end{pmatrix} = A + B,
\qquad
C_{i,j} = A_{i,j} + B_{i,j} \;\;\forall\, i,j,
\qquad
A, B, C \in \mathbb{R}^{m \times n}.
$$

For example, if

$$
A = \begin{pmatrix} 1.2 & 4.5 & 8.9 \\ 2.0 & 1.1 & 7.0 \end{pmatrix},
\qquad
B = \begin{pmatrix} 3.5 & 4.8 & 7.5 \\ 3.1 & 4.0 & 5.5 \end{pmatrix},
$$

then

$$
C = A + B = \begin{pmatrix} 1.2+3.5 & 4.5+4.8 & 8.9+7.5 \\ 2.0+3.1 & 1.1+4.0 & 7.0+5.5 \end{pmatrix}
= \begin{pmatrix} 4.7 & 9.3 & 16.4 \\ 5.1 & 5.1 & 12.5 \end{pmatrix}.
$$
Matrix Multiplication

Matrix multiplication combines an m×n matrix A with an n×k matrix B
into a new m×k matrix C = AB, where each entry is the sum of products of a
row of A with a column of B (note: it is not an element-wise product), i.e. if

$$
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \in \mathbb{R}^{m \times n},
\qquad
B = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1k} \\ \vdots & & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nk} \end{pmatrix} \in \mathbb{R}^{n \times k},
$$

then

$$
C = AB \in \mathbb{R}^{m \times k},
\qquad
C_{i,j} = \sum_{l=1}^{n} a_{i,l}\, b_{l,j}.
$$

If the number of columns in A is not equal to the number of
rows in B, matrix multiplication is undefined.
Matrix Multiplication - Example
3, 4
1, 3, 2 
5, 2 then
A
,
B


 
2
,
5
,
7


1, 6 
3
20, 22
C  AB  
, C1,1   A1,l * Bl ,1  (1 * 3  3 * 5  2 * 1)  20

l 1
38, 60 
Multiplication By A Vector

Right multiplication of a matrix by a vector is defined when the
dimension of the vector equals the number of columns in the
matrix, i.e. if

$$
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \in \mathbb{R}^{m \times n},
\qquad
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{R}^{n},
$$

then

$$
y = Ax \in \mathbb{R}^{m},
\qquad
y_i = \sum_{l=1}^{n} a_{i,l}\, x_l.
$$
Multiplication By A Vector - example
3
1, 3, 2 
5 then
A
,
x


 
2
,
5
,
7


1 
3
20
y  Ax   , y1   A1,l * xl  (1 * 3  3 * 5  2 * 1)  20
l 1
38
Gram-Schmidt Orthonormalization

If we have a set of linearly independent but non-orthogonal
vectors x_1 … x_n, we often wish to turn them into an alternative
set of vectors q_1 … q_n that are mutually orthonormal:

$$
q_i^{T} q_j =
\begin{cases}
1 & \text{for } i = j \\
0 & \text{for } i \neq j
\end{cases}
$$

Achieving this orthonormalization is the purpose of the Gram-Schmidt procedure.
Gram-Schmidt Orthonormalization

Let's address the particular example where:

$$
x_1 = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix},
\qquad
x_2 = \begin{pmatrix} 1 \\ 6 \\ 12 \\ 4 \end{pmatrix},
\qquad
x_3 = \begin{pmatrix} 3 \\ 7 \\ 11 \\ 0 \end{pmatrix}.
$$

The first vector is simple, involving only a normalization:

$$
q_1 = \frac{x_1}{\sqrt{x_1^{T} x_1}} = \frac{1}{\sqrt{30}} \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}.
$$
Gram-Schmidt Orthonormalization

The second vector is the original second vector x_2 minus
its projection on the first orthonormal basis vector q_1, that is:

$$
\tilde{q}_2 = x_2 - \left(x_2^{T} q_1\right) q_1
= \begin{pmatrix} 1 \\ 6 \\ 12 \\ 4 \end{pmatrix}
- \frac{1}{\sqrt{30}}\left(1 \cdot 1 + 6 \cdot 2 + 12 \cdot 3 + 4 \cdot 4\right) \frac{1}{\sqrt{30}} \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}
$$

$$
= \begin{pmatrix} 1 \\ 6 \\ 12 \\ 4 \end{pmatrix} - \frac{65}{30} \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}
= \begin{pmatrix} 1 \\ 6 \\ 12 \\ 4 \end{pmatrix} - \frac{13}{6} \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}
= \frac{1}{6} \begin{pmatrix} -7 \\ 10 \\ 33 \\ -28 \end{pmatrix}.
$$
Gram-Schmidt: Normalization
We now normalize $\tilde{q}_2$ to unit norm:

$$
q_2 = \frac{\tilde{q}_2}{\left\| \tilde{q}_2 \right\|}
= \frac{1}{\sqrt{2022}} \begin{pmatrix} -7 \\ 10 \\ 33 \\ -28 \end{pmatrix}
\approx \begin{pmatrix} -0.156 \\ 0.222 \\ 0.734 \\ -0.623 \end{pmatrix}.
$$
Gram-Schmidt (cont.)

To construct the last basis vector, we need to subtract from x_3
its projections on both q_1 and q_2:

$$
\tilde{q}_3 = x_3 - \left(x_3^{T} q_1\right) q_1 - \left(x_3^{T} q_2\right) q_2 .
$$

Sparing some tedious math, after normalization we get:

$$
q_3 \approx \begin{pmatrix} 0.806 \\ 0.476 \\ -0.211 \\ -0.281 \end{pmatrix}.
$$
Gram-Schmidt (cont.)

In conclusion, we have transformed our original, non-orthonormal
basis set into an entirely equivalent set, whose
vectors are all mutually orthogonal and have unit norms.

Now we can express an arbitrary vector from the space
spanned by these two alternative sets as a linear combination of
the orthonormal vectors q_1, q_2 and q_3.

The projections (the weights) are then entirely independent of one
another, and the sum of their squares is exactly the squared norm of the
represented vector, and not more.
Gram-Schmidt algorithm:
Input: a matrix A_{n×n} = ( a_0 a_1 ... a_{n-1} ) whose columns form a basis of R^n (i.e. are
linearly independent).
Output: a matrix B_{n×n} whose columns are an orthonormal basis of R^n, i.e. for all
columns b_i, b_j with i ≠ j we have <b_i, b_j> = 0, and <b_i, b_i> = 1.

(1) $b_0 = \dfrac{a_0}{\left\| a_0 \right\|}$

(2) for k = 1 to n-1 do:

(2.1) $b_k = \dfrac{a_k - \sum_{i=0}^{k-1} \langle a_k, b_i \rangle\, b_i}{\left\| a_k - \sum_{i=0}^{k-1} \langle a_k, b_i \rangle\, b_i \right\|}$
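A compact sketch of this procedure in code, assuming the input columns are linearly independent so no norm is ever zero (the names gramSchmidt and dot are illustrative). Running it on x_1, x_2, x_3 from the example above reproduces q_1, q_2, q_3 up to rounding:

// Illustrative Gram-Schmidt: columns[k] holds the k-th input vector a_k.
// Returns orthonormal vectors b_0 ... b_{n-1}; the input is left unmodified.
public static double[][] gramSchmidt(double[][] columns) {
    int n = columns.length;
    double[][] b = new double[n][];
    for (int k = 0; k < n; k++) {
        double[] v = columns[k].clone();
        // subtract the projections on all previously computed basis vectors
        for (int i = 0; i < k; i++) {
            double proj = dot(columns[k], b[i]);
            for (int d = 0; d < v.length; d++) {
                v[d] -= proj * b[i][d];
            }
        }
        // normalize to unit norm (non-zero because the inputs are linearly independent)
        double norm = Math.sqrt(dot(v, v));
        for (int d = 0; d < v.length; d++) {
            v[d] /= norm;
        }
        b[k] = v;
    }
    return b;
}

private static double dot(double[] u, double[] v) {
    double sum = 0;
    for (int d = 0; d < u.length; d++) {
        sum += u[d] * v[d];
    }
    return sum;
}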