Lecture 10: Knapsack Problems and Public Key Crypto

Wayne Patterson
SYCS 654
Spring 2010
The Classical Knapsack Problem

Easily enough stated, this problem is one that
turns out to be extremely difficult.

First, in English: I have a knapsack, and I know it
(or I) can carry W pounds. I have a bunch of
things I would like to take on a trip that weigh w1,
w2, w3, …, wn pounds.

The problem: Is there a subset of {w1, w2, …,
wn} that adds up to exactly W, in other words,
that allows me to carry the maximum possible
weight?
Knapsack = Subset Sum

Sometimes this problem is also called the
“subset sum” problem.

Sometimes we are lucky and can find a
very quick solution to the problem.

For example, with knapsack weight W, and
objects that weigh { 1, 2, 4, 8, 16, 32, …, $2^n$ },
we can answer the question very easily.
The “Easy” Knapsack Sets

For the example given previously, of
weights { 1, 2, 4, 8, 16, 32, …, $2^n$ }, the
solution to the knapsack problem is
unique.

For every $0 \le W \le 2^{n+1} - 1$, there is a
unique solution, and for $W \ge 2^{n+1}$, there is
no solution.
Easy Knapsacks
Proof: (Binary string argument). For $W \le 2^{n+1} - 1$,
W has a binary representation
with n+1 bits. E.g., if n+1 = 4, $2^{n+1} - 1 = 15$,
and W = 15 is represented as 1111
(binary).
For arbitrary $W \le 2^{n+1} - 1$, represent W
in binary; then the weights
corresponding to the 1-bit positions
fit exactly into the knapsack.

Example

Suppose the knapsack set is {1, 2, 4, 8, 16,
32, 64, 128} and W = 173. (W can go up
to 255.)

Express W in binary: 173 = $10101101_2$.
Then the weights corresponding to the 1-bits will add to W:

Bit:      1    0    1    0    1    1    0    1
Weight:  128   64   32   16    8    4    2    1

128 + 32 + 8 + 4 + 1 = 173
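The binary argument above can be sketched in a few lines of Python. This is only an illustration: the function name `solve_power_of_two_knapsack` is made up for this example, and it assumes the knapsack set is exactly { 1, 2, 4, …, 2^n }.

```python
def solve_power_of_two_knapsack(W, n):
    """Return the subset of {1, 2, 4, ..., 2**n} that sums to W, or None."""
    if W < 0 or W >= 2 ** (n + 1):
        return None  # outside 0 .. 2**(n+1) - 1, so no subset can match
    # Bit i of W is 1 exactly when the weight 2**i belongs to the subset.
    return [2 ** i for i in range(n, -1, -1) if (W >> i) & 1]


print(solve_power_of_two_knapsack(173, 7))  # [128, 32, 8, 4, 1]
print(solve_power_of_two_knapsack(256, 7))  # None
```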
Super-increasing Knapsack Sets

Of course, in the preceding example,
there will NOT be a solution to the
knapsack problem if W > 255.

There is a more general class of “easy”
knapsack problems, and basically the same
algorithm will apply. We will call this class
of problems the “super-increasing
knapsack sets”.
Super-increasing …

Suppose now we have a set of weights
with the property that each weight is
greater than the sum of the weights of all
of its predecessors in order:
$w_2 > w_1$
$w_3 > w_1 + w_2$
$w_4 > w_1 + w_2 + w_3$

and so on …
Solving the Super-increasing
knapsack problem

Let’s take as an example a set of weights:
◦ { 3, 7, 19, 35, 72, 155, 367, 984 }


And suppose W = 1230.
The algorithm for solution is: set x = W and
process the weights in descending order. If
a weight is less than or equal to the current
value of x, subtract it from x and remember the
weight. After you have processed all the
weights, a remainder of 0 means the remembered
weights are a solution. If the remainder is not zero,
there is no solution.
The Computation















x = 1230   (984 < x, subtract it)        984
x = 246    (367 > x, don’t subtract)
           (155 < x, subtract it)        155
x = 91     (72 < x, subtract it)          72
x = 19     (35 > x, don’t subtract)
           (19 ≤ x, subtract it)          19
x = 0, done.
So the solution is:
{ 984, 155, 72, 19 }
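The same computation can be written as a short Python sketch. The function name `solve_superincreasing` is hypothetical, and the code assumes (without checking) that the weights really are super-increasing; for other sets the greedy choice can miss solutions.

```python
def solve_superincreasing(weights, W):
    """Greedy solver; assumes `weights` is a super-increasing set."""
    x = W
    chosen = []
    for w in sorted(weights, reverse=True):  # process in descending order
        if w <= x:
            x -= w
            chosen.append(w)
    # A zero remainder means the remembered weights add up to W.
    return chosen if x == 0 else None


weights = [3, 7, 19, 35, 72, 155, 367, 984]
print(solve_superincreasing(weights, 1230))  # [984, 155, 72, 19]
print(solve_superincreasing(weights, 1231))  # None
```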
General Knapsacks

So we’ve looked at the easy cases, where
there is a fast algorithm to determine a
solution.

Unfortunately, MOST knapsack sets are
not nearly so nice. Consider:

{ 347, 356, 387, 401, 422, 461, 479, 521 }
and W = 1635.
Brute Force

Now for this small a knapsack set (with only
8 weights), we can solve the problem by
brute force. This means one sum calculation
for every subset of the knapsack set. Since a
set with cardinality n has $2^n$ subsets, we can
solve this with $2^8 = 256$ tries.
But if the knapsack had 200 items, our brute
force approach would require an estimated
803,469,022,129,495,137,770,981,046,170,581,301,261,101,496,891,396,417,650,688
tries (that is, $2^{199}$, half of the $2^{200}$ subsets).
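For the 8-weight set above, brute force is easy to try out. The sketch below is one possible way to enumerate subsets (smallest subsets first); the function name `brute_force_subset_sum` and the enumeration order are choices made for this example, not something taken from the lecture.

```python
from itertools import combinations


def brute_force_subset_sum(weights, W):
    """Try every subset; return (subset, tries), with subset None if none fits."""
    tries = 0
    for size in range(len(weights) + 1):
        for subset in combinations(weights, size):
            tries += 1  # one sum calculation per subset
            if sum(subset) == W:
                return subset, tries
    return None, tries


weights = [347, 356, 387, 401, 422, 461, 479, 521]
subset, tries = brute_force_subset_sum(weights, 1635)
print(subset)  # (347, 387, 422, 479)
```

When no subset fits, all $2^8 = 256$ subsets are examined before the function gives up.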

I’m Still Working on it …

Unfortunately, despite the many years that
people have thought about this problem,
no algorithm is known that avoids exponential
running time in the worst case; for large, general
knapsack sets we are essentially back to brute force.

If you have studied complexity theory, you
would know that the knapsack problem
falls into the category of problems called
NP-complete, which are widely believed
to be intractable.
What’s That Got to Do with PKC?

Shortly after Diffie and Hellman (1976)
described the concept of Public-Key
Crypto with a public and private key,

Merkle and Hellman proposed the use of
the knapsack problem to create a Public
Key Cryptosystem.
The Merkle-Hellman Knapsack PKC


First, for my private key, I will define a super-increasing knapsack set.
To make it interesting, the knapsack set will
have n numbers, n = 100. To make sure the
numbers are large enough not to be guessed,
define $w_1$ to be chosen at random in the
interval $[2^{100}, 2^{101} - 1]$; then each successive $w_i$
will be chosen in the interval $[2^{100+i-1}, 2^{100+i} - 1]$,
above the sum of its predecessors; in this
way we guarantee that the knapsack set will
have the super-increasing property.
More Private Key then Public

So now we have our “easy” set $\{w_1, \dots, w_{100}\}$, and next
we find a prime number $p > 2^{201}$ (thus larger than the
sum of all the $w_i$’s), and choose at random some m < p,
and also compute $m^{-1} \pmod p$.

Now create a “hard” knapsack set $\{w_1^*, \dots, w_{100}^*\}$ by
computing

$w_i^* = m \cdot w_i \pmod p$.

The public key is the “hard” knapsack set $\{w_1^*, \dots, w_{100}^*\}$.
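The key construction can be sketched in Python at a much smaller scale. Everything named here (`generate_keys`, `is_prime`, `base_bits`) is hypothetical, the defaults use 8 weights of about 10 bits rather than 100 weights of about 100 bits, and trial division is only adequate for such toy sizes. Each weight is drawn from its doubling interval but kept above the running sum, so the super-increasing property is enforced explicitly. `pow(m, -1, p)` needs Python 3.8 or later.

```python
import random


def is_prime(n):
    """Trial division; adequate only for the small demo sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def generate_keys(n=8, base_bits=10, rng=None):
    """Return (easy, p, m, m_inv, hard) for a toy knapsack key."""
    rng = rng or random.Random()
    easy, total = [], 0
    for i in range(1, n + 1):
        # Interval [2^(base_bits+i-1), 2^(base_bits+i) - 1], kept above the running sum.
        low = max(2 ** (base_bits + i - 1), total + 1)
        high = 2 ** (base_bits + i) - 1
        w = rng.randint(low, high)
        easy.append(w)
        total += w
    p = 2 ** (base_bits + n + 1) + 1  # first odd candidate above 2^(base_bits+n+1)
    while not is_prime(p):
        p += 2
    m = rng.randint(2, p - 1)
    m_inv = pow(m, -1, p)             # modular inverse of m
    hard = [(m * w) % p for w in easy]
    return easy, p, m, m_inv, hard


easy, p, m, m_inv, hard = generate_keys(rng=random.Random(2010))
print(easy)
print(hard)
```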
Encryption and Decryption

As we well know, every user creates his or her
public key and publishes it. So to send a message
of length 100 bits to a user, find his or her public
knapsack, and add up the numbers corresponding
to the 1-bits in the message. I.e., if the message is
the bitstring $b_1 \dots b_{100}$ (b for bits), the encryption is:

$b_1 w_1^* + b_2 w_2^* + \dots + b_{100} w_{100}^* = c$

(which is just the sum of some subset of the public knapsack). Now send
c.
Decryption

When I receive c, I multiply it by $m^{-1}$ and reduce
mod p. This gives:

$m^{-1} (b_1 w_1^* + b_2 w_2^* + \dots + b_{100} w_{100}^*)$
$= b_1 m^{-1} w_1^* + b_2 m^{-1} w_2^* + \dots + b_{100} m^{-1} w_{100}^*$
$= b_1 w_1 + b_2 w_2 + \dots + b_{100} w_{100} \pmod p$

which is now a knapsack problem in our easy set,
so solve it to get the values of the $b_i$ and therefore
the message.

Example
Easy = { 1, 3, 7, 13, 26, 65, 119, 267 }
The complete sum is 501; choose p = 523
and m = 467. Then $m^{-1} = 28$.

The hard knapsack set, or public key, will
be 1 × 467 (mod 523), 3 × 467 (mod
523), etc., or:
Hard = Public = { 467, 355, 131, 318, 113,
21, 135, 215 }

Encrypt the Bitstring 01001011
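The encryption is:
c = 0 × 467 + 1 × 355 + 0 × 131 + 0 ×
318 + 1 × 113 + 0 × 21 + 1 × 135 + 1 ×
215
= 355 + 113 + 135 + 215 = 818


To decrypt, multiply $c \times m^{-1} \pmod p$:
818 × 28 = 415 (mod 523).
Solving the easy knapsack for 415 gives 267 + 119 + 26 + 3,
which recovers the bitstring 01001011.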
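The worked example can be replayed end to end with a small Python sketch. The names `encrypt` and `decrypt` are made up for this illustration; the numbers are the ones on the slide, and `pow(M, -1, P)` needs Python 3.8 or later.

```python
EASY = [1, 3, 7, 13, 26, 65, 119, 267]   # private super-increasing set
P, M = 523, 467                          # private modulus and multiplier
M_INV = pow(M, -1, P)                    # modular inverse of M
HARD = [(M * w) % P for w in EASY]       # public knapsack


def encrypt(bits, public):
    """Add up the public weights at the 1-bit positions."""
    return sum(w for b, w in zip(bits, public) if b == "1")


def decrypt(c, easy, m_inv, p):
    """Undo the multiplier, then solve the easy knapsack greedily."""
    x = (c * m_inv) % p
    bits = ["0"] * len(easy)
    for i in range(len(easy) - 1, -1, -1):
        if easy[i] <= x:
            x -= easy[i]
            bits[i] = "1"
    if x != 0:
        raise ValueError("ciphertext does not decode")
    return "".join(bits)


c = encrypt("01001011", HARD)
print(M_INV, HARD)                 # 28 [467, 355, 131, 318, 113, 21, 135, 215]
print(c, (c * M_INV) % P)          # 818 415
print(decrypt(c, EASY, M_INV, P))  # 01001011
```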
If That was the end of the story …

But unfortunately it isn’t. Within a few years,
it was discovered that Merkle-Hellman
knapsack systems were eminently breakable.
And not only the Merkle-Hellman systems,
but any knapsack approach that depended
on numbers in the knapsack set growing
very fast.

So the crypto community fell out of love
with knapsacks.
But there was one knapsack
approach left standing …

Let’s just remember good old Blaise
Pascal and his triangle …
Excursions in Computation
Wayne Patterson
Professor of Computer Science
Howard University
SYCS Colloquium Series, March 26, 2010
Pascal
PK Crypto
Goldbach
??????
• The author is reminded of the old
expression: “Something old, something
new; something borrowed, something
blue.” Although reluctant to suggest a
presentation anything like a wedding
ceremony, he will look anew at some old
computational concepts involving the
Pascal triangle; something new (to many)
in a related application revisiting a public
key crypto chestnut; borrowing some
ideas from what is now usually described
as “experimental mathematics”. Something
blue? You’ll have to wait and see.
                        1
                     1     1
                  1     2     1
               1     3     3     1
            1     4     6     4     1
         1     5    10    10     5     1
      1     6    15    20    15     6     1
   1     7    21    35    35    21     7     1
1     8    28    56    70    56    28     8     1

You will recall that each row in the Pascal
triangle is the sequence of coefficients in
the expansion of $(x + y)^n$.

Starting with the 0th row, $(x + y)^0 = 1$.

And the kth element in the nth row being $\binom{n}{k}$.
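As a quick check of the two descriptions of the triangle, the sketch below builds rows by adding neighbours and compares them with the binomial coefficients. The function name `pascal_rows` is hypothetical; `math.comb` needs Python 3.8 or later.

```python
from math import comb


def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle (row 0 first)."""
    rows = [[1]]
    for _ in range(count - 1):
        prev = rows[-1]
        # Each inner entry is the sum of the two entries above it.
        inner = [prev[j] + prev[j + 1] for j in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows


rows = pascal_rows(9)
print(rows[8])  # [1, 8, 28, 56, 70, 56, 28, 8, 1]
print(all(rows[n][k] == comb(n, k) for n in range(9) for k in range(n + 1)))  # True
```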

Often the best mathematical insights
come from an ability to visualize the same
phenomenon from multiple perspectives.

To illustrate this point, I am going to
describe an example wherein the same
underlying principle will have three
separate expressions: one in a geometric
representation, one in a combinatorial
representation, and one in a binary string
representation.
[Figure: Pascal's triangle, rows 0 through 8]
Consider a mouse that finds itself at the
cornerstone of the parallelogram.
 The mouse, whose name is “One”, wishes
to escape to freedom by emerging from
the top.
 When the mouse moves up and to the
right, the number “bypassed” is added to
the mouse’s value (starting at One!).
 If the mouse moves up and to the left,
nothing is added.

The sequence of moves:
[Figure: a path marked on Pascal's triangle, rows 0 through 8]
lets the mouse escape with a value of 1 + 35 + 20 + 10 + 4 = 70.
The sequence of moves:
[Figure: a path marked on Pascal's triangle, rows 0 through 8]
lets the mouse escape with a value of 1 + 15 + 10 + 3 + 2 = 31.






A sequence of moves could be written more compactly by
representing an “up to the right” by a “1” and
an “up to the left” by a “0”.
The result of this is a bitstring, and so the
path with value 31 becomes:
01101100
Since each mouse move goes up by one row,
all successful paths are of length 8.
And to go out the top, the mouse must make
an equal number of “up rights” and “up lefts”.
So our bitstring will always be of length 8
with 4 1-bits.
Clearly there is a 1-1 correspondence
between paths that escape through the
top and bitstrings of length 8 with 4 1-bits.
How many paths? How many such
bitstrings?
Each such bitstring results from picking 4
positions out of 8.
But this is the definition of

$\binom{8}{4} = \frac{8 \cdot 7 \cdot 6 \cdot 5}{4 \cdot 3 \cdot 2 \cdot 1} = 70$
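The count can be confirmed by listing the bitstrings directly. This is a small sketch; the helper name `balanced_bitstrings` is made up for the example, and `math.comb` needs Python 3.8 or later.

```python
from itertools import combinations
from math import comb


def balanced_bitstrings(length, ones):
    """All bitstrings of the given length with exactly `ones` 1-bits."""
    result = []
    for positions in combinations(range(length), ones):
        bits = ["0"] * length
        for pos in positions:  # pick the positions of the 1-bits
            bits[pos] = "1"
        result.append("".join(bits))
    return result


strings = balanced_bitstrings(8, 4)
print(len(strings), comb(8, 4))  # 70 70
print("01101100" in strings)     # True
```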
PBJ
Let P = set of all paths through the
parallelogram
Let B = set of all bitstrings of length 2n
with n 1-bits
Let J = subset of N, the natural numbers, = $[1, \binom{2n}{n}]$

[Diagram: maps among P, B, and J]
B to P: use the bits to tell the mouse where to go
P to B: track the mouse movements with bits
P to J: add the path values
We just need the map from J back to P, since then
the last piece will be a composition of the others.

Track back through the parallelogram for the value 52:

1    1    1    1    1
1    2    3    4    5
1    3    6   10   15
1    4   10   20   35
1    5   15   35   70

1 + 35 + 10 + 6 + 0 = 52
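The correspondence between paths, bitstrings, and numbers can be sketched in Python. This is a reading of the slides rather than the author's code: it assumes the mouse starts at the entry $\binom{8}{4} = 70$, that an "up to the right" move from $\binom{n}{k}$ bypasses $\binom{n-1}{k-1}$, and that a bypassed position outside the triangle counts as 0. Under those assumptions it reproduces the three values 70, 31, and 52 shown on the slides. The names `path_value` and `track_back` are hypothetical.

```python
from math import comb


def path_value(bits):
    """Value collected by the mouse for a path written as a bitstring."""
    n = len(bits)
    if n % 2 or bits.count("1") != n // 2 or set(bits) - {"0", "1"}:
        raise ValueError("need an even-length bitstring with equally many 0s and 1s")
    k = n // 2                 # the mouse starts at entry C(n, n/2)
    value = 1                  # the mouse's name is One
    for b in bits:
        if b == "1":           # up-right: add the bypassed entry C(n-1, k-1)
            value += comb(n - 1, k - 1) if k > 0 else 0
        else:                  # up-left: nothing is added
            k -= 1
        n -= 1
    return value


def track_back(value, length=8):
    """Recover the bitstring whose path value is `value`."""
    n, k = length, length // 2
    if not 1 <= value <= comb(n, k):
        raise ValueError("value out of range")
    r = value - 1
    bits = []
    while n > 0:
        bypassed = comb(n - 1, k - 1) if k > 0 else 0
        if r >= bypassed:      # enough left over: the mouse went up-right here
            r -= bypassed
            bits.append("1")
        else:                  # otherwise it went up-left
            k -= 1
            bits.append("0")
        n -= 1
    return "".join(bits)


print(path_value("11110000"))  # 70
print(path_value("01101100"))  # 31
print(track_back(52))          # 10110001
```

Every value from 1 to 70 tracks back to a distinct bitstring, so the two functions are inverse to each other on this range.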






Knapsacks are dead for public key
crypto, or mostly dead …
As Billy Crystal said in the “Princess
Bride”: “mostly dead is partly alive” …
All the knapsacks previously studied
were “low-density”
Methods of Brickell, Lagarias and
Odlyzko depended on this density
So here’s a knapsack modelled on the
Pascal parallelogram
that can’t be attacked by the low-density methods
For the first row, choose a number pseudorandomly in the
interval $[1, 2^{200}]$.
Second row: Each element pseudorandomly between
$[2^{200} + 1, 2^{201}]$
For each succeeding row (i), let the kth element be chosen
pseudorandomly in the interval

$\left[ \binom{i-1}{k-1} + \binom{i-1}{k},\; 2\left( \binom{i-1}{k-1} + \binom{i-1}{k} \right) \right]$

Create 200 rows
As with traditional knapsacks, find a large prime p and a
multiplier m, and multiply each element in the
parallelogram by m mod p.
The public key is the transformed
parallelogram
The private key is the original
parallelogram, as well as m.

In recent years, a number of mathematicians
have worked to develop new ways of thinking
about their subject. These approaches, often
described as "experimental mathematics," were
simply not available to earlier generations of
mathematicians, because they depend upon the
ability to analyze the results of computations
made feasible by appropriate mathematical
software tools in order to formulate previously
unthinkable hypotheses.

Number Theory
◦ Purest branch of mathematics
◦ Open problems can be explained to a non-mathematician
◦ Among the most difficult to solve

As Jim Arthur has said:
◦ “Andrew Wiles’s proof of Fermat’s Last Theorem, in a way
that we would not have expected, caught people’s
imagination. Books like the one on John Nash, A Beautiful
Mind, have also brought a good deal of attention to
mathematics. And of course in movies, mathematics has
been chic in the last five or ten years.”

We will look at two of the classical
computational number theoretic problems:

Goldbach conjecture

$n^2 + 1$ prime conjecture

One of the greatest remaining
conjectures in elementary number theory
is the Goldbach conjecture, which in its
most often quoted form is:

“Every even positive integer ≥ 4 is the sum
of two prime numbers”

“Every prime number > 11 is the sum of
two composite numbers.”

I have been able to prove the Goldfinger
Conjecture!
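Given any prime number p > 11,
either $p \equiv 1 \pmod 3$ or $p \equiv 2 \pmod 3$.
If $p \equiv 1 \pmod 3$, then p = 3k + 1 = 3(k-1)
+ 4 = 3 × (k-1) + 2 × 2
If $p \equiv 2 \pmod 3$, then p = 3k + 2 = 3(k-2)
+ 8 = 3 × (k-2) + 2 × 4.
Since p > 11, the factors k-1 and k-2 are at least 3,
so both multiples of 3 are composite.
Q. E. D.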
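The construction in the proof is easy to check by machine. The sketch below follows the two cases literally; the names `two_composites` and `is_prime` are hypothetical, and trial division is used only because the numbers are small.

```python
def is_prime(n):
    """Trial division; fine for the small values checked here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def two_composites(p):
    """Split a prime p > 11 into two composite numbers, as in the proof."""
    k, r = divmod(p, 3)
    if r == 1:
        return 3 * (k - 1), 4   # p = 3(k-1) + 2*2
    if r == 2:
        return 3 * (k - 2), 8   # p = 3(k-2) + 2*4
    raise ValueError("a prime p > 11 is never divisible by 3")


for p in (13, 17, 101):
    print(p, two_composites(p))
# 13 (9, 4)
# 17 (9, 8)
# 101 (93, 8)
```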

(a) Patterson will be heading to Norway
for the next Abel Prize ($980,000) or

(b) The result will be formally announced
on 01.04.2010 (European convention)




There was briefly a $1M prize for solving the Goldbach
Conjecture
Needless to say, it wasn’t claimed, and the conjecture (which
dates from 1742) has now gone more than 250 years without solution
Among the people who have recently looked at the
Goldbach Conjecture is John Nash
Also my Op-Ed in the Washington Post at the time of
the release of “A Beautiful Mind”, regarding my
interactions with him in my Princeton days, and other
musings.
My interest has been to try to determine
the difficulty of finding primes that add to
a given, pseudo-randomly selected even
number of varying magnitudes.
For a given n = 4k, where k is odd, the
first approach to the question involves
testing the numbers 2k-1 and 2k+1 (which
add to 4k) for primality.












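(* For each exponent l = 4..40, draw 500 random even numbers ev = 2*od *)
(* (od odd, of size about 10^l) and count how many pairs (p, q) with   *)
(* p + q = ev are tested before both p and q are prime.                *)
maxbyexp = Table[0, {m, 1, 37}, {n, 1, 2}];
Do[
  top = 10^l;
  resul = Table[0, {i, 1, 500}, {j, 1, 4}];
  Do[
    od = 2*RandomInteger[{1, top/2}] + 1; ev = 2*od;
    p = ev/2 + 2; q = ev/2 - 2; i = 1;
    (* step the pair until both members are prime, or p runs out *)
    While[!(PrimeQ[p] && PrimeQ[q]) && (p > 0),
      p = p - 2; q = q + 2; i = i + 1];
    resul[[k]] = {i, ev, p, q},
    {k, 1, 500}];
  tr = Transpose[resul];
  Print["For exponent ", l, ", the largest i is ", Max[tr[[1]]]];
  maxbyexp[[l - 3]] = {l, Max[tr[[1]]]},
  {l, 4, 40}];
Print[MatrixForm[maxbyexp]];
ListPlot[maxbyexp]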
[Figures: two plots of the largest number of tests against the exponent; horizontal axes run from 5 to 30, vertical axes to 10000 and to 14000]
Maximum number of tests is 457 for a
number of magnitude $10^{96}$
Vs. 41,177 (for 2k-1, 2k+1) and
46,317 for random odd numbers


There is also a result that the maximum
number of tests up to $10^{14}$ is 735.
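The small-primes approach can be sketched in Python. This is an illustration only: the function names are hypothetical, counting one test per small prime tried is an assumption about how the slide counts, and trial division restricts the sketch to small even numbers (the experiments above rely on a primality test such as Mathematica's PrimeQ for numbers near $10^{96}$).

```python
def is_prime(n):
    """Trial division; fine for the small values used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def goldbach_by_small_primes(n, limit=1000):
    """For even n >= 6, return (p, n - p, tests) for the first odd prime p that works."""
    tests = 0
    p = 3
    while p <= min(limit, n // 2):
        if is_prime(p):
            tests += 1               # one test per small prime tried
            if is_prime(n - p):
                return p, n - p, tests
        p += 2
    return None                      # no pair found within the limit


print(goldbach_by_small_primes(100))  # (3, 97, 1)
print(goldbach_by_small_primes(98))   # (19, 79, 7)
```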
[Figure: Test for Goldbach Primes Using Small Primes Only. Series: Small Primes; vertical axis 0 to 500; horizontal axis 4 to 99]
Given the seeming efficiency of finding
Goldbach pairs using lists of small primes, I
wondered whether, given a selected interval of
consecutive even numbers, the complete set of
members in that interval could be covered by
Goldbach pairs using the small primes
 And, if so, how large an interval, at what starting
point, and using how large a list of small primes
…

[Figure: Using List of 1000 Small Primes. Axes: Number of Cases Missed (0 to 250), Range to Cover 1000*s, Starting Point 10^100x. Bar labels in the order given: 246, 157, 113, 66, 12, 69, 6, 28, 2]
[Figure: Using List of 2000 Small Primes. Axes: Number of Cases Missed (0 to 16), Range to Cover 1000*s, Starting Point 10^100x]
[Figure: Using List of 3000 Small Primes. Axes: Number of Cases Missed (0 to 3), Range to Cover, Starting Point 10^100x]







This number theoretic conjecture asserts that
there are an infinite number of primes of the
form $n^2 + 1$.
$1^2 + 1 = 2$
$2^2 + 1 = 5$
$4^2 + 1 = 17$
$6^2 + 1 = 37$
$10^2 + 1 = 101$
$14^2 + 1 = 197$ …

It might be noted that the first case where $n^2 + 1$
is not prime for n even is $8^2 + 1 = 65$, and that
in general, $n^2 + 1$ will never be prime if $n \equiv 2 \pmod{10}$
or $n \equiv 8 \pmod{10}$, for $n \ge 8$, since the
last digit of $n^2$ will be 4, and the last digit of
$n^2 + 1$ will be 5.

Thus, for $n^2 + 1$ to be prime with n > 2, we can
limit ourselves to $n \equiv 0$, 4, or 6
(mod 10).
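A short Python check of the last-digit argument follows. The function names are hypothetical and trial division keeps the sketch to small n.

```python
def is_prime(n):
    """Trial division; fine for the small values used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def bases_with_prime_square_plus_one(limit):
    """All n in 1..limit for which n*n + 1 is prime."""
    return [n for n in range(1, limit + 1) if is_prime(n * n + 1)]


hits = bases_with_prime_square_plus_one(100)
print(hits[:8])                                        # [1, 2, 4, 6, 10, 14, 16, 20]
print(all(n % 10 in (0, 4, 6) for n in hits if n > 2)) # True
```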

As before, we selected numbers at
random of varying magnitudes up to $10^{500}$,
and tested the values of $n^2 + 1$ for primality

Numbers of the form $n^4 + 1$ form a proper
subset of those of the form $n^2 + 1$

Since numbers of the form $n^4 + 1$ are more
spread out along the number line than
those of the form $n^2 + 1$, it would be
reasonable to expect that it would be
harder to find primes of the form $n^4 + 1$.
1: Number of tries to find an $n^2 + 1$ prime

2: Number of tries to find an $n^4 + 1$ prime

3: Number of tries to find an $n^8 + 1$ prime

(abc): Least number of tries is a,
second least b, most c.
Order /      (123)  (132)  (213)  (231)  (312)  (321)  Total
Test Cases
1             16     11     12     29     10     22     100
2             15      8     18     28      8     23     100
3             10      9     14     25     16     26     100
4             15      8     17     29     10     21     100

To delve further into this, I thought
selecting a specific sequence of n’s would
be interesting in trying to find a sequence
of primes (or composites).

I was led to the sequence
◦ $\{10^k + 1 \mid k = 1, 2, \dots\}$
◦ 10000000000000 …… 0000000000000001

Observation 1: $10^k + 1$ is prime if k = 1 or 2.

Observation 2: $10^k + 1$ is divisible by 11 (and therefore
composite) if k is odd and k > 1. (Use the old trick of
computing the sums of the digits in the odd and even
positions; if their difference is divisible by 11, so is the
number.)

Observation 3: $10^k + 1$ is composite (k > 1) if k is not a
power of 2.

E.g. $10^{14} + 1 = (10^2 + 1)(10^{12} - 10^{10} + 10^8 - 10^6 + 10^4 - 10^2 + 1)$
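Observation 3 rests on the identity $x^m + 1 = (x + 1)(x^{m-1} - x^{m-2} + \dots + 1)$ for odd m, applied with x a power of 10. A minimal Python sketch, with the hypothetical name `algebraic_factor`, extracts the factor this identity gives:

```python
def algebraic_factor(k):
    """Return a proper factor of 10**k + 1 when k has an odd factor > 1, else None."""
    power_of_two = 1
    while k % 2 == 0:
        k //= 2
        power_of_two *= 2
    if k == 1:
        return None  # k is a power of 2; the identity gives nothing
    # Now the original k = power_of_two * (odd number > 1).
    return 10 ** power_of_two + 1


print(algebraic_factor(14))   # 101
print((10 ** 14 + 1) // 101)  # 990099009901
print(algebraic_factor(9))    # 11
print(algebraic_factor(16))   # None
```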

In studying three well-known number-theoretic outstanding conjectures, we
are able to discover some unexpected
phenomena, and thus shed new light
on these classical problems.
Furthermore, these investigations are
accessible to undergraduate
mathematics students.
This is Larry Bowa

He is the third-base
coach for the Los
Angeles Dodgers

He wears blue

He has a
cryptosystem
http://www.youtube.com/watch?v=x-S-eeInJVk&NR=1
An Important Cryptanalysis

Financial share for World Series Winners
(2008) = $18,417,358.

Bowa communicates one of 9 signals to a
runner or a batter:

Plaintext = { Steal, Hold, StealOnOverthrow,
Take, Bunt, Swing, HitAndRun, RunAndHit,
SqueezeBunt }
Base Coach Signals

Signals have two components, typically:


A number of body movements,
BODY = { Belt, Clap, Hat, Leg, Nose,
Shoulder, Wipe }

And a “hot sign” ∈ [1, 10]

So the key space is KEY = { (x, y) | x ∈ [1, 10],
y ∈ BODY }
Frequency of Occurrences in a Game
[Table: rows are the situations “1st, 2nd, 1st and 2nd”, “3rd, < 2 out”, and “2-0, 3-0, 3-1 counts”; columns are the plays Steal, Hold, StealOnOverthrow, Take, Bunt, Swing, HitAndRun, RunAndHit, Squeeze. Values in the order given: 116, 116, 78, 116, 116, 116, 32, 64, 64]
Cryptanalysis
Map succeeding signals in situation: runner
on 1st, 39% of time, or 116 times/game.
Example:

Position   1         2     3     4     5         6     7     8     9     10
Reading 1  Shoulder  Belt  Wipe  Belt  Clap      Nose  Belt  Clap  Hat   Nose
Reading 2  Belt      Wipe  Belt  Clap  Shoulder  Hat   Belt  Nose  Wipe  Leg
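Comparing two readings position by position is a one-liner. The sketch below uses the two example sequences above; the variable and function names are made up for this illustration, and treating a matching position as the candidate key (hot sign position, body sign) is the reading suggested by the slides.

```python
reading_1 = ["Shoulder", "Belt", "Wipe", "Belt", "Clap",
             "Nose", "Belt", "Clap", "Hat", "Nose"]
reading_2 = ["Belt", "Wipe", "Belt", "Clap", "Shoulder",
             "Hat", "Belt", "Nose", "Wipe", "Leg"]


def fixed_points(a, b):
    """Positions (numbered from 1) where both readings show the same sign."""
    return [(i, x) for i, (x, y) in enumerate(zip(a, b), start=1) if x == y]


print(fixed_points(reading_1, reading_2))  # [(7, 'Belt')]
```

With exactly one fixed point, the pair (7, Belt) is the only candidate key consistent with both readings.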
With One “Fixed Point”…

Can exactly determine key with two
readings.
Number of possible readings (10 positions, 7 body signs each) = $7^{10}$
= $(6+1)^{10} = 6^{10} + 10 \times 6^9 + 45 \times 6^8 + 120 \times 6^7 + \dots + 10 \times 6 + 1$
# of messages with exactly 1 fixed point = $10 \times 6^9$ =
100,776,960
# of messages with ≥ 1 fixed point = $7^{10} - 6^{10}$ = 222,009,073

Probabilities

Probability of exactly 1 fixed point, given at least one
= 100,776,960 / 222,009,073 ≈ 0.454

Probability of more than 1 fixed point, given at least one ≈
0.546

Probability of exactly 1 fixed point in 3
pitches = $1 - (0.546)^3$ ≈ 0.84

Probability of exactly 1 fixed point in 4
pitches = $1 - (0.546)^6$ ≈ 0.97
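The terms of the binomial expansion of $(6+1)^{10}$ count readings by their number of fixed points, which can be tabulated directly. This is a sketch under the stated model only (10 positions, 7 equally likely body signs, a fixed point being a position that matches a given earlier reading); the names are hypothetical and `math.comb` needs Python 3.8 or later.

```python
from math import comb

SIGNS, POSITIONS = 7, 10


def readings_with_fixed_points(k):
    """Number of readings that agree with a given one in exactly k positions."""
    return comb(POSITIONS, k) * (SIGNS - 1) ** (POSITIONS - k)


total = SIGNS ** POSITIONS
exactly_one = readings_with_fixed_points(1)
at_least_one = total - readings_with_fixed_points(0)

print(total)          # 282475249
print(exactly_one)    # 100776960
print(at_least_one)   # 222009073
print(sum(readings_with_fixed_points(k) for k in range(POSITIONS + 1)) == total)  # True
```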
References

Agrawal, M., N. Kayal and N. Saxena, PRIMES in P, (August 2002), http://www.cse.iitk.ac.in/users/manindra/primality.ps or
http://www.cse.iitk.ac.in/news/primality.pdf.

Bailey, David H. and Jonathan M. Borwein, Experimental Mathematics: Examples, Methods, and Implications, Notices of the American
Mathematical Society, vol. 52, no. 5, May 2005, pp. 502-514.

Brickell, E. F., "Solving low density knapsacks," Advances in Cryptology-Proc. Crypto 83, Plenum Press, New York, 1984, pp. 25-37.

Chen, J.-R. and T.-Z. Wang, On the Goldbach Problem, Acta Math. Sinica 32, 1918, pp. 702-718.

Cooper, Rodney H., Hunter-Duvar, Ron, and Patterson, Wayne, A More Efficient Public-Key Cryptosystem Using the Pascal Triangle,
ICC `89, Boston, June 1989, pp. 1165-1169.

Klamkin, M. S. , Problem 63—12, SIAM Review, 5 (1963) 275-276.

Lagarias, J. C. and A. M. Odlyzko, "Solving Low Density Subset Sum Problems," J. Assoc. Comp. Mach., vol. 32, 1985, pp. 229-246.
Proc. 24th IEEE Symposium on Foundations of Computer Science, IEEE, 1983, pp. 1-10.

Patterson, Wayne, An Exploration in ‘Experimental Mathematics’: Computing the Determinant Function, Proceedings of the Mid-Atlantic Consortium for Research in Mathematical Sciences, Ocean City, MD, 2004, to appear.

Patterson, Wayne, “Experimentation in Computational Number Theory,” Proceedings of the Mid-Atlantic Consortium for
Research in Mathematical Sciences, Orlando, FL, June 2005.

Patterson, Wayne, Mathematical Cryptology, Rowman and Littlefield, 318 pp., 1987.

Presidential Views: Interview with James Arthur, Notices of the American Mathematical Society, vol. 52, no. 3, March 2005, pp. 350-352.

Richstein, J., Verifying the Goldbach Conjecture up to $4 \cdot 10^{14}$, Math. Comp. 70:236 (2001) 1745-1749.