Information Theory

exercise in the previous class
give proof for the discussion in p.19
see http://apal.naist.jp/~kaji/lecture/
1
chapter 4:
cryptography
2
what we do, and what we do not in this class
cryptography is discusses in many contexts
management
politics
history
philosophy
In this class, we focus on the technical aspects of cryptography.
3
terminology
p
D(c)
encryption (暗号化)
E
D
decryption (復号)
plaintexts(平文,ひらぶん);
make sense by themselves
E(p)
c
ciphertexts (暗号文);
make no sense by themselves
cryptography (暗号) = pair of E and D such that D(E(p)) = p
many variations and confusions on the words:
crypto  cipher, text  data, cryptography  encryption
4
three types of cryptography
key-less cryptography
E(p) (resp. D(c)) is solely determined by p (resp. c).
no key ... the algorithms must be kept secret
security relies on the “gap of wisdom” of the recipients
“O, draconian devil”  “Leonardo da Vinci”
common-key cryptography
E and D must use the same key
public-key cryptography
E and D use different keys which are in special relation
5
class plan
today: common-key cryptography
widely known algorithms
key agreement protocol
next: public-key cryptography
RSA
related algorithms
June 4 (MON): exercise
June 5 (TUE): test
6
common-key cryptography
symmetric-key ―, classic ―, ...
E (resp. D) takes two inputs: key and plaintext (resp. ciphertext)
E(k, p): the ciphertext of p encrypted with the key k
D(k, c): the plaintext of c decrypted with the key k
D(k, E(k, p)) = p, but D(k’, E(k, p))  p if k’  k
k1
p
E
k2
c
D
p, if k1 = k2
?, if k1  k2
7
substitution cipher
E
K
A
...
...
Y
Z

C

B

ciphertext
A

plaintext

substitution cipher (換字暗号):
encrypt: replace characters in plaintexts to different characters
decrypt: do the inverse replacement of encoding
key: the table of the character replacement
Z
G
the number of possible keys = 26! for English alphabet
... too many even for today’s computers
the statistics of the plaintexts can be observed in cipherexts
8
frequency attack
in a naive substitution cipher...
a character is always replaced to the identical character
in many data, there is bias on the frequencies of characters
in English...
characters such as “e”, “t”, “a”, and “s” occur frequently
characters which occur frequently in a ciphertext
= replacements of the above four frequent characters
A.C. Doyle, 1903,
The Adventure of the Dancing Men
9
sketch of the frequency attack
information as a
concept has many
meanings the
concept of
information is a
b
c
d
typical English texts
8.6%
1.4%
2.8%
3.8%
zpunim gt oncuit
ciphertext of
utqvgwp gw h
unknown text
antaubz spgap
nigqgthvvm
cuigluw eino
h
8.4% → a
x
a
c
theory in modern
english is a concept
which originally
derives from
classical greek
plaintext
1.5% → b
2.7% → c
3.8% → d
10
many improvements
The vulnerability (脆弱性) of the substitution cipher was
well-known to cryptographers from early days...
many improvements were considered...
one-to-many substitution
substitution of N-grams or words
use of multiple substitution tables
dynamically change the substitution table
 Enigma
11
Enigma
used by German military in the World War II
the substitution is determined by “rotor wheels”
the rotor wheels rotate as one character is processed
A
B
D
Enigma showed that
machine power >> human power
C
12
DES (Data Encryption Standard)
DES (Data Encryption Standard)
developed in the US in 70’s to secure classified data
not the “first-class” cryptography
“good security with reasonable cost”
insecure nowadays, but played important role in cryptology
1973
1974
1977
1997
NBS solicited (公募する) encryption algorithms
IBM submitted a candidate
published as federal standard
NIST (formerly NBS) solicited newer AES
13
RK16
RK2
48
round 1
round 2
round 16
ciphertext
L16
L1
L0
32
initial
permutation
64
IP-1
R15
L15
f
R2
L2
f
f
R1
48
round keys
R16
56
RK1
R0
IP
56...# of bits
64
plaintext
32
48
key
56
encryption of DES
14
Feistel structure
each round of DES has the Fesitel structure
Li
Ri
RKi+1
f
Li+1
Ri+1
the Fesitel structure is easy to
invert if RKi+1 is provided correctly
the inversion can be done with
the same Feistel mechanism
(with left and right exchanged)
Ri+1
Li+1
RKi+1
f
Ri
Li
15
L16
plaintext
IP-1
R16
key
RK1
R15
L15
f
R2
L2
f
L1
L0
f
IP
ciphertext
R0
R1
RK16
RK15
decryption of DES
inside this box is the same as the encryption
 one circuit is used for both of encryption and decryption
16
security of DES
theoretical attacks
differential analysis by Biham & Shamir (1990)
investigated at the design phase of DES...
linear analysis by Matsui (1993)
succeeded to break DES first time
exhaustive attacks
22hours, 100K computers connected by network (1999)
9days, FPGA-based parallel machine (2006)
DES is not secure anymore!
17
rumor of DES
rumor, or urban legend: “NSA must settle a back-door in DES”
NSA: National Security Agency
intelligence agency of the US
some activities not revealed
commitment to the Echelon system
evidence?
the key length is shortened from the IBM proposal
some substitution tables in DES is replaced by NSA
NSA did know the differential analysis
there is no way to verify what is true and what is not true...
18
AES and others
DES is no more secure
there is no way to deny the bad rumor
 the newer and stronger cryptography is needed
1997
1999
2000
2001
NIST solicited Advanced Encryption Standard (AES)
15 candidate algorithms from 12 countries
5 candidates passed the screening
Rijndael, from Belgium, was selected as winner
published as federal standard
There are many other algorithms: Blowfish, IDEA, Camellia...
19
key agreement
Any common-key cryptography faces to one serious problem:
How can we share a key with a person at remote place?
the sender and the receiver must have the same key
the key must not be known to anyone else
solution...
use an expensive but secure communication channel
secret agent, registered mail, pigeon, etc...
utilize mathematical trick  key agreement protocol
20
key agreement protocol
We consider a protocol between two users A and B:
the communication channel is not secure
an attacker C can wiretap (盗聴する) the communication,
but does not modify data in the channel
after the protocol execution...
A and B know a certain information in common
C does not know the information
21
Diffie-Hellman protocol
Diffie-Hellman protocol;
is proposed by Diffie & Hellman in 1976
makes use of the property that
it is difficult to solve the discrete logarithm problem
preliminary
Fq = {0, ..., q – 1} with q a big prime number
g, a generator of Fq
(any nonzero aFq is written as a = gx mod q)
discrete logarithm problem (DLP):
“given q, g and a, determine x with a = gx mod q”
22
example
F7 = {0, 1, 2, ..., 6}
g = 3 is a generator of F7
1 = 36 mod 7
2 = 32 mod 7
3 = 31 mod 7
4 = 34 mod 7
5 = 35 mod 7
6 = 33 mod 7
the answer of the DLP
log3 1 = 6
log3 2 = 2
log3 3 = 1
log3 4 = 4
log3 5 = 5
log3 6 = 3
6
5
4
3
2
1
x
0 1 2 3 4 5 6
a
no smart algorithm known today
... the only means to solve the problem is by exhaustive search
... nobody can solve the problem if q is large (> thousands bits)
23
the protocol
step 1: A and B agree the prime q and the generator g (in public)
step 2a: A chooses random x, and sends mA = gx mod q to B
step 2b: B chooses random y, and sends mB = gy mod q to A
step 3a: A computes (mB)x mod q = gxy mod q
step 3b: A computes (mA)y mod q = gxy mod q
determine q & g
x
mA = gx mod q
mB = gy mod q
gxy mod q
y
gxy mod q
24
example
q = 197, g = 3
71 = 351 mod 197
51
38 = 355 mod 197
122 = 3851 mod 197
55
122 = 7155 mod 197
How can we compute 3851 mod 197?
3851 mod 197
= (3832 mod 197) (3816 mod 197) (382 mod 197) (381 mod 197) mod 197
382n mod 197 = (38n mod 197)2 mod 197
381
382
384
388
3816
3832 mod 197
25
security
Is the protocol secure?
determine q & g
mA = gx mod q
x
mB = gy mod q
gxy mod q
y
gxy mod q
C finds q, g, mA and mB
C cannot know x and y unless he/she solves DLP
C cannot know the value of the shared gxy mod q
26
another security
What happens if the attacker do more than wiretapping?
C communicates with A pretending B
C communicates with B pretending A
A and B communicate with C, believing that
he/she is communicating with a valid opponent.
 man-in-the-middle attack (中間一致攻撃)
27
summary
classification of cryptography
key-less, common-key and public-key
common-key cryptography
substitution cipher
DES
key-agreement protocol
28
exercise
Decrypt the following ciphertext.
qiw aufmlyn gcmwz yz c mcxae yoqweocqyaocu wpwoq
jwcqkeyog zkmmwe cod vyoqwe zlaeqz, yo viyni qiakzcodz aj
cqiuwqwz lceqynylcqw yo c pceywqf aj namlwqyqyaoz. qiw
aufmlyn gcmwz icpw namw qa hw ewgcedwd cz qiw vaeud'z
jaewmazq zlaeqz namlwqyqyao viwew maew qico qva ikodewd
ocqyaoz lceqynylcqw. qiw gcmwz cew nkeewoquf iwud wpwef
qva fwcez, vyqi zkmmwe cod vyoqwe aufmlyn gcmwz
cuqweocqyog, cuqiakgi qiwf annke wpwef jake fwcez vyqiyo
qiwye ewzlwnqypw zwczaocu gcmwz.
29
about test
June 4(Mon), 9:20AM, exercise
June 5 (Tue), 9:20AM, this room
you can bring books, notes and copies of slides
you can bring a calculator and/or PC
PC must be disconnected from the network:
download all needed material before the test starts
本,ノート,資料,電卓,PC ...なんでも持ちこみ可
PC 等の通信機能は使用不可
必要な資料類は事前にダウンロードしておくこと
30