Coding Theory and Cryptography: Putting Mathematics to Work!
Download
Report
Transcript Coding Theory and Cryptography: Putting Mathematics to Work!
Secret Codes, CD players, and Missions to Mars:
An Introduction to Cryptology and Coding Theory
Sarah Spence Adams
Associate Professor of Mathematics
Franklin W. Olin College of Engineering
AKA
• Sit back, relax, and learn something new while
your combinatorics knowledge marinates for
the test
Cryptology
Cryptography
Inventing
cipher systems
Alice
Bob
Eve
Cryptanalysis
Breaking
cipher systems
Hidden Messages Through the Ages
440 BC – The Histories of Herodotus
messages
concealed beneath wax on wooden tablets
tattoos on a slave's head concealed by regrown hair
WWII – microdots
Modern day – hidden information within digitial
pictures
The Scytale of Ancient Greece
Used by the
Spartan military
5th Century B.C.
Caesar’s Substitution Cipher
1st Century B.C.
Example
Plaintext:
OLINCOLLEGE
= 3
Shift forward by 3
Ciphertext: ROLQFROOHJH
Decryption: Shift backwards by 3 (or forwards by 23)
Encryption:
Modular Arithmetic
15 mod 12 = 3
27 mod 26 = 1
90 mod 10 = 0
a mod m is the remainder obtained upon
dividing a by m
Can be obtained by subtracting off multiples of
m from a until the result is between 0 and m-1
Caesar Using Modular Arithmetic
Map A,B,C,…Z to 0,1,2,…,25
View cipher as function e(pi) = pi + k (mod 26)
Example with key = k = 15
Plaintext: OLINCOLLEGE=14,11,8,13,2,14,11,11,4,6,4
Encryption: e(pi)
Ciphertext:
3,0,23,2,17,3,0,0,19,21,19 = DAXCRDAATVT
Decryption:
= pi + 15 (mod 26)
d(ci) = ci -15 (mod 26) or
d(ci) = ci +11 (mod 26)
p
Kerckhoff’s Principle
Must assume the
enemy knows the
system
Also known as
Shannon’s Maxim
Cryptanalysis of Substitution Ciphers
Caesar’s 26 shifts to test
Generally 26! permutations to test
If parsed, guess at common words or
cribs
Frequency analysis
Frequency Analysis (Al-Kindi ~850A.D.)
www.wikipedia.com
The Letter E
Vigenère Cipher (1586)
Polyalphabetic cipher
Involves multiple Caesar shifts
Example
Plaintext:
Key:
Encryption:
O L I N C O L L E G E
S UN S U N S U N SU
e(pi) = pi + ki (mod 26)
Decryption: d(ci) = ci - ki (mod 26)
One-Time Pads:
The Ultimate Substitution Cipher
Plaintext:
MATHISUSEFULANDFUN
Key:
NGUJKAMOCTLNYBCIAZ
Encryption: e(pi) = pi + ki (mod 26)
Ciphertext: BGO…..
Decryption: d(ci) = ci - ki (mod 26)
One-Time Pads
Unconditionally secure
Problem: Exchanging the key
Public-Key Cryptography
Diffie & Hellman (1976)
Known at GCHQ years before
Uses one-way (asymmetric) functions, public
keys, and private keys
Public Key Algorithms
Based on two hard problems (traditionally)
Factoring
large integers (RSA)
The discrete logarithm problem (ElGamal)
WWII: The WeatherBeaten Enigma
3 x 10114 ciphering
possibilities; polyalphabetic
Destroyed frequency counts
Cracked thanks to traitors,
captured machines,
mathematicians, and human
error
Not only do you want secrecy…
…but you also want reliability!
Enter coding theory…..
Communication System
Digital Source
Digital Sink
Source
Encoding
Encryption
Source
Decoding
Decryption
Error Control
Encoding
Error Control
Decoding
Modulation
Channel
Demodulation
What is Coding Theory?
Coding theory is the study of error-control
codes, which are used to detect and
correct errors that occur when data are
transferred or stored
What IS Coding Theory?
A mix of mathematics, computer science,
electrical engineering, telecommunications
Linear
algebra
Abstract algebra (groups, rings, fields)
Probability&Statistics
Signals&Systems
Implementation issues
Optimization issues
Performance issues
General Problem
We want to send data from one place to another
over some channel
telephone
lines, internet cables, fiber-optic lines,
microwave radio channels, cell phone channels, etc.
or we want to write and later retrieve data…
channels:
hard drives, disks, CD-ROMs, DVDs, solid
state memory, etc.
BUT the data may be corrupted
hardware
malfunction, atmospheric disturbances,
attenuation, interference, jamming, etc.
General Solution
Introduce controlled redundancy to the
message to improve the chances of
recovering the original message
Trivial example: The telephone game
Introductory Example
We want to communicate YES or NO
Message 1 represents YES
Message 0 represents NO
If I send my message and there
is an error, then you will decode
incorrectly
Repetition Code of Length 2
Encode message 1 as codeword 11
Encode message 0 as codeword 00
If one error occurs, you can detect that
something went wrong
Repetition Code of Length 5
Encode message 1 as codeword 11111
Encode message 0 as codeword 00000
If up to two errors occur, you can correct
them using a majority vote
Evaluating and Comparing Codes
Important Code Parameters
Minimum
distance
Determines error-control capability
Code
rate
Ratio of information bits to codeword bits
Measure of efficiency
Length
of codewords
Number of codewords
Complicated Problem
Want
Large
minimum distance for reliability
Large number of codewords
High rate for efficiency
Conflicting goals
Require
trade-offs
Inspire more sophisticated mathematical
solutions
Inherent Trade-offs
What are the ideal trade-offs between rate,
error-correcting capability, and number of
codewords?
What is the biggest distance you can get given
a fixed rate or fixed number of codewords?
What is the best rate you can get given a fixed
distance or fixed number of codewords?
The ISBN-10 Code
x1 x2… x10
x10 is a check digit chosen so that
S = x1 + 2x2 + … + 9x9 + 10x10 = 0 mod 11
If check digit should be 10, use X instead
Can detect all single and all transposition
errors
ISBN-10 Example
Cryptology by Thomas Barr: 0-13-088976-?
Want 1(0) + 2(1) + 3(3) + 4(0) + 5(8) + 6(8) + 7(9)
+ 8(7) + 9(6) + 10(?) = multiple of 11
Compute 1(0) + 2(1) + 3(3) + 4(0) + 5(8) + 6(8) +
7(9) + 8(7) + 9(6) = 272
Ponder 272 + 10(?) = multiple of 11
The check digit must be 8
New ISBN-13
x1 x2… x13
x13 = 10-((1x1 + 3x2 + … + 1x11 + 3x12) mod 10)
If check digit should be 10, use 0 instead
Convert ISBN-10 to ISBN-13 using 978 prefix
and new check digit
Universal Product Code (UPC)
Universal Product Code (UPC)
First number: Type of product
0,
6, 7: Standard
2: Random-weight items (fruits, meats, etc)
4: In-store
5: Coupons
Next chunk: Manufacturer identification
Next chunk: Item identification
Last number: Check digit
UPC Check Digit
x1 x2… x12
x12 is a check digit chosen so that
S = 3x1 + 1x2 + … + 3x11 + 1x12 = 0 mod
10
Can detect all single and most
transposition errors
Which transposition errors go undetected?
Hadamard Code
Used in NASA Mariner Missions
Pictures divided into pixels
Pixels are assigned a level a darkness
on a scale of 0 to 63
Binary Representation of Messages
Express 64 levels of darkness (our
messages) using binary strings
000000
1 000001
2 000010
3 000011
….
63 111111
0
Hadamard Matrices
Map length 6 messages to length 32
codewords obtained from rows of
Hadamard matrices
Hadamard matrices (Sylvester, 1867)
have special properties that give these
codewords a minimum distance of 16
HHT = n I
Compare with Length 5 Repetition Code
Send a message of length 6
Probability of bit error p= .01
Length 5 Repetition Code
Sending
6 info bits requires 30 coded bits: Rate = 6/30
P(decode incorrectly) = .00006
Hadamard Code
Sending
6 info bits requires 32 coded bits: Rate = 6/32
P(decode incorrectly) = .0000000008
Voyager Missions (1980’s-90’s)
Reed-Solomon codes use abstract algebra to get
even better results
Ideals of polynomial rings (Dedekind,1876)
Same codes used to protect CDs from scratches!
Summary
Cryptology and coding theory are all around us
From Caesar to RSA…. from Repetition to Reed-Solomon Codes….
More sophisticated mathematics better ciphers/codes
New uses for old mathematics, motivation for new mathematics
Cryptology has existed for thousands of years… what ciphers and codes
will be next?
Jefferson Disk - Bazeries Cylinder
Used by US Army from early 1920’s to early 1940’s
Alice and Bob agree on order of disks – the key
Alice
rotates wheels to spell
message in one row
sends any other row of text
Bob
spells out the ciphertext on wheel
looks around the rows until he sees the (coherent)
message
The Code Talkers
Refers primarily to Navajo speakers in WWII
Only unbroken wartime cipher
Fast and efficient
Interesting connections between cracking secret
ciphers and cracking ancient languages
Major Questions
What are the ideal trade-offs between rate,
error-correcting capability, and number of
codewords?
What is the biggest distance you can get
given a fixed rate or fixed number of
codewords?
What is the best rate you can get given a
fixed distance or fixed number of
codewords?
A Parity Check Code
Suppose we want to send 4 messages 00, 01, 10, 11
Form codewords by appending a parity check bit to the end
of each message
00 000
01 011
10 101
11 110
Compare with length 2 repetition code
Both detect all single errors using minimum distance 2
Now rate 2/3 compared with rate 1/2
Now 4 codewords compared with 2 codewords