Language Modeling and Encryption on Packet Switched Networks

Download Report

Transcript Language Modeling and Encryption on Packet Switched Networks

Language Modeling and
Encryption on
Packet Switched Networks
Kevin McCurley
A.A. Markov
of St. Petersburg
1856-1922
Андрей Андреевич Марков
The science of cryptology
1. Devise a mathematical model of
communication
2. Devise a mathematical model of an
adversary
3. Construct a cryptosystem
4. Prove that the theoretical adversary cannot
break the cryptosystem (perhaps under
assumptions)
The application of cryptology
• Choose a construction for a cryptosystem
• Adapt it to your model of communication
• Watch it get broken by adversaries who
don’t fit the model.
The choice of a model is
at least as important as a proof.
Security models
Complexitytheoretic
security
Information
theoretic
security
The
real
world
Attacks that have fallen outside
the scope of security models
• Electromagnetic radiation measurements
• Timing analysis
• Power analysis
• Fault analysis
• Acoustic attacks
• Cache attacks
…
Micali and Reyzin
“Physical Observable Security”
TCC 2004
• Extension of security models to include
physical instantiation of computation.
• Better description of adversary results in a
more robust model.
Are these the only attacks?
• Micali & Reyzin address the physical act of
computation
• What about the physical act of communication?
How do we even define communication?
Claude E. Shannon
“The mathematical theory of communication”
(1948)
“The communication theory of secrecy systems”
(1949)
Shannon’s linear model
of communication (1948)
Information
source
Transmitter
Channel
Receiver
Destination
Noise
A discrete Markov process on a finite domain
Information theoretic
perfect secrecy
• Messages are drawn from an underlying
known probability distribution
• An encryption system has perfect secrecy if
P(ciphertext | plaintext) is independent of
plaintext.
One time pad
The one time pad has perfect secrecy.
BUT: messages must all be the same length or else
you lose perfect secrecy.
What about
infinite message spaces?
Theorem. (Chor-Kushilevitz, 1989) It is
impossible to construct an informationtheoretically secure cryptosystem on a
countably infinite message space.
Is leaking the message length
unavoidable?
Theorem (Goldreich) A semantically secure
encryption scheme cannot hide the length of the
plaintext against polynomial time adversaries.
Are these just theoretical concerns?
or do they show up in practice?
What can be learned from
the length of a message?
Example: communication to a stock broker with messages
of “buy IBM” or “sell IBM”
Example: in a file system, sizes of many files may provide
evidence that encrypted files are copies of known files
Example: in the military, voluminous communications may
indicate a command or intelligence center with multimedia
How can we deal with
message lengths?
 Pad the messages to be all the same length
 Keep talking at all times
In practical systems, bandwidth and storage are
not free!
What is this the encryption of?
i3or uqpcs hrt nbqpdn 0xcae opx
How does one approach the question?
is it encrypted or just gibberish?
what language is it?
what is the character set?
who said it?
when did they say it?
how was it encrypted?
A related problem:
communication is segmented
• Human spoken language is broken into
syllables, words, sentences.
• Human written communication is broken
into paragraphs, chapters, articles, books.
• Movies are broken into frames.
• Internet communication is packet switched.
Suppose I told you
where the spaces were
# usually
####### get
### my
## stuff
##### from
####
I
###### who
### promised
######## somebody
######## else
####
people
#### they
#### would
##### keep
#### it
## a
# secret.
#######
that
###### Winchell
########
Walter
An cryptanalyst would have a tremendous
advantage in guessing the message!
Word lengths in 3 translations
of a Tolstoy novel
Word length transitions
S8
S8
S7
S7
S6
S6
S5
S5
S4
S4
S3
S3
S2
S2
S1
1
2
3
4
5
6
7
S1
8
1
French
S8
S7
S6
S5
S4
German
S3
S2
S1
1
2
3
4
5
6
7
8
2
3
4
5
6
7
English
8
Internet Communications
Are Packet Switched
 Packets are formed and transmitted according
to the “language of the application”
 Complications arise from buffering, Nagle’s
algorithm, etc.
The “language” of ssh (simplified)
SYN
ACK
SYN-ACK
Banner
Login:
m
a
b
password:
w
h
...
Cryptanalytic attacks based on
packet timings and size
• Keystroke timings (Song, Wagner, Tiang)
• Probable plaintext (Bellovin)
• Several traffic classification studies:
– Moore and Zuev (2005) 95% accuracy with
supervised Bayesian analysis
Another looming attack:
voice over IP
• VoIP has high quality of service demands.
• VoIP packets are easy to recognize.
• VoIP supports “silence suppression” for
bandwidth savings.
IPSEC traffic padding (TFC)
• Omitted in early versions of IPSEC
• Added in recent versions of tunnel mode to
obfuscate traffic patterns.
• No theoretical basis for security arguments.
• No guidance on how to generate dummy
traffic.
Note: simple padding is
NOT enough
If a known packet is repeatedly encrypted,
then the padding distribution may be
recognizable, and can be subtracted.
Do we have to packetize data?
 Success of Internet depends on it





Fairness in a shared medium
Quality of service
Buffering
Error correction
retransmission
Can we afford to keep the
Internet channels full?
• Everyone depends on everyone else using
only what they need.
• Internet exists as a sparse graph:
– O(n) total nodes to connect n endpoints
n
– To support  2  circuits, we would need a much
 
denser graph
• Alternatives in onion routing, mixes
Shannon’s epic 1949 paper
“Communication Theory of Secrecy Systems”
1. Concealment systems, that obscure the
existence of communication.
2. Privacy systems, requiring special
communication hardware.
3. Secrecy systems, utilizing mathematical
transformations.
Shannon addressed only secrecy systems, calling
concealment a “psychological problem”
What is an appropriate definition
of communication?
Shannon’s communication model
revisited
Information
source
Transmitter
Channel
Receiver
Destination
Noise
“Frequently the messages have meaning; that is they refer
to or are correlated according to some system with certain
physical or conceptual entities. These semantic aspects of
communication are irrelevant to the engineering problem.”
Deficiencies in Shannon’s model
• Communication should be bidirectional.
• Communication is a physical process with
side effects.
• Communication is segmented.
• Communication has context.
Temporal aspects of
communication
Do nothing secretly; for time sees and
hears all things, and discloses all.
- Sophocles, 496-406 B.C.
Two versions of Shannon’s paper
• Bell System Technical Journal (1948)
– “A Mathematical Theory of Communication”
• University of Illinois Press (1949)
– “The Mathematical Theory of Communication”
– With a section by Warren Weaver: “Recent
Contributions to The Mathematical Theory of
Communication”
Weaver’s three levels of
communication
Level A. How accurately can the symbols of
communication be transmitted? (the technical
problem)
Level B. How precisely do the transmitted
symbols convey the desired meaning? (the
semantic
problem)
Level C. How effectively does the received
meaning effect conduct in the desired way?
(the effectiveness problem)
Definitions of communication
• Shannon:
– Reproducing the output from a stochastic process either
approximately or exactly.
• Lasswell’s definition:
– Who says what to whom in what channel with what
effect
• The exchange of messages that change the a priori
expectation of events.
• Griffin: the management of messages for the
purpose of creating meaning
• Schramm: a purposeful effort to establish a
commonness between a source and receiver
The DIKW Hierarchy
(from the 1980s)
• Data (bits or raw symbols)
– ’k’,51771
• Information (symbols with relationships)
– ‘e’ is more common than ‘z’
• Knowledge (useful information)
– ‘e’ is energy; ‘e’ is a mathematical constant
• Wisdom (understanding of knowledge)
– Extrapolative, non-deterministic process.
DIKW Hierarchy
(the origins)
• Where is the wisdom we have lost in
knowledge?
• Where is the knowledge we have lost in
information?
T.S. Eliot, 1934
“The Rock”
Moving up the DIWK Hierarchy
• Data  Information
– Understanding structure
• Information  Knowledge
– Understanding patterns, context, relationships
• Knowledge  Wisdom
– Understanding principles, applications,
meaning, interpretation.
At what layer does a
cryptanalyst work?
• Data: what is the fifth symbol in this message?
• Information: Are there more 1’s than 0’s in this
message? Is there ever a sequence 00010000000?
• Knowledge: what language is being spoken in this
communication?
• Wisdom: did the initiator just ask a question or
give a command? Should I expect a response?
Who is in command?
At what layer does the crypto
designer work?
• Data: make sure that this bit is
unrecognizable from a random bit.
• Information: make sure that they can’t
estimate the distribution of bits accurately.
• Knowledge: make sure that the adversary
cannot recover the encoded knowledge.
• Wisdom: make sure the cryptanalyst has no
understanding of what they observe.
Closing thoughts
• We need better models of communication in
order to advance cryptology.
• We need better definitions of knowledge
and wisdom in order to advance cryptology.
• Absolute security for internet
communication is probably impossible.
Security is mostly a superstition.
... Life is nothing if not a grand
adventure.
- Helen Keller
http://mccurley.org/papers/traffic/ for updates