--- title: 密碼學導論筆記 Ch2 HISTORICAL CIPHERS tags: 密碼學導論,筆記,大學檔案 --- [TOC] # The one-time pad ![image](https://hackmd.io/_uploads/r1wQdiu-Je.png) - The system uses XOR function. C=P ⊕ K and P=C ⊕ K. - This system is unbreakable! - K must be truly random, and available to both sides - Its fatal to reuse a one-time pad # Symmetric Encryption - Also referred to as conventional encryption or single-key encryption ![image (1)](https://hackmd.io/_uploads/BJgEdidZkx.png) ## Ciphers - A cipher is a means of transforming plaintext into ciphertext - under the control of a secret key - We write c = e$_{k}$(m), where - m is the plaintext - e is the cipher function - k is the secret key - c is the ciphertext - Decipherment m = d$_{k}$(c) - e, d are public, the secrecy of m given c depends totally on the secrecy of k. ![image (2)](https://hackmd.io/_uploads/HJoEdsdWkl.png) # Cryptographic Systems - Characterized along three independent dimensions: 1. The type of operations used for transforming plaintext to ciphertext - Substitution - Transposition 2. The number of keys used - Symmetric, single-key, secret key, conventional encryption - Asymmetric, two-key, or public-key encryption 3. The way in which the plaintext is processed - Block cipher - Stream cipher # Cryptanalysis vs. Brute-Force Attack - Cryptanalysis - Attack relies on the nature of the algorithm plus some knowledge of the general characteristics of the plaintext - Attack exploits the characteristics of the algorithm to attempt to deduce a specific plaintext or to deduce the key being used - Brute-force attack - Attacker tries every possible key on a piece of ciphertext until an intelligible translation into plaintext is obtained - On average, half of all possible keys must be tried to achieve success # Keys - The number of keys must be large to prevent exhaustive search - Worst case assumptions - assume attacker has Full knowledge of the cipher algorithm 𝑒 - A number of plaintext/ciphertext pairs associated to the target key 𝑘 - The cipher designer must play the role of the cryptanalyst. - In practice ciphers are used which are believed to be strong - All this means is that the best attempts of experienced cryptanalysts cannot break them. # English plaintext - All of the examples in this lecture will have English plaintext - To break these early ciphers we make use of the statistics of English - In a later lecture on Information Theory, we shall study this relationship between - How easy the cipher is to break - The underlying plaintext distribution - English Letter Frequencies - The most common bigrams are, in decreasing order: TH, HE, IN, ER, AN, RE, ED, ON, ES, ST, EN, AT, TO, NT, HA - The most common trigrams are, in decreasing order: THE, ING, AND, HER, ERE, ENT, THA, NTH, WAS, ETH, FO ![image (3)](https://hackmd.io/_uploads/rktSOiOWke.png) # Shift Cipher - Each letter we identify with a number - 𝐴 = 0,𝐵 = 1, 𝐶 = 2, … , 𝑍 = 25 - The key 𝑘 is a number in the range 0~25 - Encryption is add 𝑘 onto each letter modulo 26. - Julius Caeser used the key 𝑘 = 3. - HELLO becomes KHOOR - We break a Shift cipher by using the statistics of the underlying language ## Example - Take the following example cipher text - cipher text BPMZM WVKM EIA IV COTG LCKSTQVO EQBP NMIBPMZA ITT ABCJJG IVL JZWEV IVL BPM WBPMZ JQZLA AIQL QV AW UIVG EWZLA OMB WCB WN BWEV OMB WCB, OMB WCB, OMB WCB WN BWEV IVL PM EMVB EQBP I YCIKS IVL I EILLTM IVL I YCIKS QV I NTCZZG WN MQLMZLWEV BPIB XWWZ TQBBTM COTG LCKSTQVO EMVB EIVLMZQVO NIZ IVL VMIZ JCB IB MDMZG XTIKM BPMG AIQL BW PQA NIKM VWE OMB WCB, OMB WCB, OMB WCB WN PMZM IVL PM EMVB EQBP I YCIKS IVL I EILLTM IVL I YCIKS IVL I DMZG CVPIXXG BMIZ - 計算頻率 - 尋找跟A和E位移幾個字母在ciphertext 一樣高 ![image (4)](https://hackmd.io/_uploads/SJ48uiOZ1e.png) - Hence the key is probably equal to 8 - We can now decrypt the cipher text to reveal - plain text There once was an ugly duckling With feathers all stubby and brown And the other birds said in so many words Get out of town Get out, get out, get out of town And he went with a quack and a waddle and a quack In a flurry of eiderdown That poor little ugly duckling Went wandering far and near But at every place they said to his face Now get out, get out, get out of here And he went with a quack and a waddle and a quack And a very unhappy tear # Substitution Cipher - The problem with the Shift cipher is that the number of keys is too small. - We only have 26 possible keys - To increase the number of keys a substituion cipher was invented. - Encryption involves replacing each letter by its permuted version. - Decryption involves use of the inverse permutation. ABCDEFGHIJKLMNOPQRSTUVWXYZ TMKGOYDSIPELUAVCRJWXZNHBQF - Hence HELLO encrypts to SOLLV - This is far too large a number to brute force search using modern computers. - Still we can break these ciphers using statistics of the underlying plaintext language. ## Example - cipher text XSO MJIWXVL JODIVA STW VAO VY OZJVCO’W LTJDOWX KVAKOAXJTXIVAW VY SIDS XOKSAVLVDQ IAGZWXJQ.KVUCZXOJW, KVUUZAIKTXIVAW TAG UIKJVOLOKXJVAIKW TJO HOLL JOCJOWOAXOG, TLVADWIGO GIDIXTL UOGIT, KVUCZXOJ DTUOW TAG OLOKXJVAIK KVUUOJKO. TW HOLL TW SVWXIAD UTAQ JOWOTJKS TAG CJVGZKX GONOLVCUOAX KOAXJOW VY UTPVJ DLVMTL KVUCTAIOW, XSO JODIVA STW T JTCIGLQ DJVHIAD AZUMOJ VY IAAVNTXINO AOH KVUCTAIOW. XSO KVUCZXOJ WKIOAKO GOCTJXUOAX STW KLVWO JOLTXIVAWSICW HIXS UTAQ VY XSOWO VJDTAIWTXIVAW NIT KVLLTMVJTXINO CJVPOKXW, WXTYY WOKVAGUOAXW TAG NIWIXIAD IAGZWXJITL WXTYY. IX STW JOKOAXLQ IAXJVGZKOG WONOJTL UOKSTAIWUW YVJ GONOLVCIAD TAG WZCCVJXIAD OAXJOCJOAOZJITL WXZGOAXW TAG WXTYY, TAG TIUW XV CLTQ T WIDAIYIKTAX JVLO IA XSO GONOLVCUOAX VY SIDS-XOKSAVLVDQ IAGZWXJQ IA XSO JODIVA. XSO GOCTJXUOAX STW T LTJDO CJVDJTUUO VY JOWOTJKS WZCCVJXOG MQ IAGZWXJQ, XSO OZJVCOTA ZAIVA, TAG ZE DVNOJAUOAX JOWOTJKS OWXTMLIWSUOAXW TAG CZMLIK KVJCVJTXIVAW. T EOQ OLOUOAX VY XSIW IW XSO WXJVAD LIAEW XSTX XSO GOCTJXUOAX STW HIXS XSO KVUCZXOJ, KVUUZAIKTXIVAW, UIKJVOLOKXJVAIKW TAG UOGIT IAGZWXJIOW IA XSO MJIWXVL JODIVA . XSO TKTGOUIK JOWOTJKS CJVDJTUUO IW VJDTAIWOG IAXV WONOA DJVZCW, LTADZTDOW TAG TJKSIXOKXZJO, GIDIXTL UOGIT, UVMILO TAG HOTJTMLO KVUCZXIAD, UTKSIAO LOTJAIAD, RZTAXZU KVUCZXIAD, WQWXOU NOJIYIKTXIVA, TAG KJQCXVDJTCSQ TAG IAYVJUTXIVA WOKZJIXQ. - We have the following statistics for single letters ![image (5)](https://hackmd.io/_uploads/By0vuiu-kg.png) - Most common bigrams : TA, AX, IA, VA, WX, XS, AG, OA, JO, JV - Most common trigrams : OAX, TAG, IVA, XSO, KVU, TXI, UOA, AXS - Since O occurs with frequency 11.479 we can guess - O = E - Three common trigrams are then - OAX = E * * - XSO = * * E - Common similar trigrams in English are (from ref. textbook) - ENT, ETH - THE - Hence likely to have - X = T - S = H - A = N - If we only look at the first two sentences - Before - XSO MJIWXVL JODIVA STW VAO VY OZJVCO’W LTJDOWX KVAKOAXJTXIVAW VY SIDS XOKSAVLVDQ IAGZWXJQ. - KVUCZXOJW, KVUUZAIKTXIVAW TAG UIKJVOLOKXJVAIKW TJO HOLL JOCJOWOAXOG, TLVADWIGO GIDIXTL UOGIT, KVUCZXOJ DTUOW TAG OLOKXJVAIK KVUUOJKO. - After the changes... - O = E, X = T, S = H, A = N - THE MJIWTVL JEDIVN HTW VNE VY EZJVCE’W LTJDEWT KVNKENTJTTIVNW VY HIDH TEKHNVLVDQ INGZWTJQ. - KVUCZTEJW, KVUUZNIKTTIVNW TNG UIKJVELEKTJVNIKW TJE HELL JECJEWENTEG, TLVNDWIGE GIDITTL UEGIT,KVUCZTEJ DTUEW TNG ELEKTJVNIK KVUUEJKE. ![image (6)](https://hackmd.io/_uploads/HyGKOsO-kg.png) - Since T occurs as a single letter we must have - T = I or T = A - T has probability of 8.07175, which is the highest probability left - Therefore, more likely to have - T = A - The most frequent - bigram is TA = AN - trigram is TAG = AN* - Therefore highly likely that - G = D - This was after the changes... - O = E, X = T, S = H, A = N, T = A, G = D ![image (7)](https://hackmd.io/_uploads/Bk6t_sdbyx.png) - We now look at two letter words... - IX = *T* - *Therefore I must be one of A,I due to English plaintext* - *Already have A.* - *Hence* - *I = I* - *XV = T** - Must have, due to English, V = O - More two letter words... - VY = O* - Hence Y must be one of F,N,R due to English - Already have N - Y has probability 1.6 (頻率較低) - (O) F has probability 2.2 (頻率相似) - (X) R has probability 6.0 (頻率太高) - Hence, Y =F - IW = I* - Therefore W must be one of F,N,S,T - Already have F,N,T - Hence, W = S - This was after the changes... - O = E, X = T, S = H, A = N, T = A, G = D - I = I, V = O, Y = F, W = S ![image (8)](https://hackmd.io/_uploads/HJgo_su-yl.png) - It is then easy to see what the plaintext is - HIDH → HIGH - TEKHNOLODQ → TECHNOLOGY - INDZSTJQ → INDUSTRY - UEDIA → MEDIA - KOUCZTEJ → COMPUTER ![image (9)](https://hackmd.io/_uploads/Hytj_ouZkx.png) - It is then easy to see what the plaintext is - plaintext THE BRISTOL REGION HAS ONE OF EUROPE’ S LARGEST CONCENTRATIONS OF HIGH TECHNOLOGY INDUSTRY. COMPUTERS, COMMUNICATIONS AND MICROELECTRONICS ARE WELL REPRESENTED, ALONGSIDE DIGITAL MEDIA, COMPUTER GAMES AND ELECTRONIC COMMERCE. # Polyalphabetic Ciphers - Polyalphabetic substitution cipher - Improves on the simple monoalphabetic technique by using different monoalphabetic substitutions as one proceeds through the plaintext message - All these techniques have the following features in common: - A set of related monoalphabetic substitution rules is used - A key determines which particular rule is chosen for a given transformation # Vigenère Cipher - The problem with the Caesar and Substitution cipher was that each plaintext letter always encrypted to the same ciphertext letter. - Hence underlying statistics of the language could be used to break the cipher. - The Vigenère cipher again identifies letters with the numbers 0~25 - The secret key is a short sequence of letters (e.g. a word). - Encryption involves adding the plaintext letter to a key letter. - With the key letters used in rotation. - A: shift 0, B: shift 1, C: shift 2… - Thus if the key is SESAME, encryption works as follows, ![image (10)](https://hackmd.io/_uploads/rkO3douWkg.png) - This is a polyalphabetic substitution cipher - A will encrypt to a different letter depending on where it is in the message - But Vigenère is easy to break. - Once we have found the length of the keyword then breaking the message is the same as breaking the Shift Cipher a number of times. - Stinson gives a more automated way of breaking the system. - We shall give a more down-to-earth way ![image (11)](https://hackmd.io/_uploads/Syk6uj_-1e.png) ## Example - cipher text CJ UT WFCN LTTF VF AAHGKEE DNH VYC IPSPKGTMV EVLINFA, NC HXS SLGIX QNVYGEM VYYI MVG ZLUHFORRXHB-RIMRXGUZLV TBF KCAXQQDKJGWERRXHBU ICKHZWKGDGG PFU JGRGIUPR KKCJ RHBVZLJX JKXMGHIUCW XGHQ KFT MKGERN-YWTJR. LX WPKCGTQV RLS MFCEQPVH DP BXKSEKGCZ TNFAZL CH UGVBHCC NPVYGKQ IHKCIBH XOEY MIAST KFGHIIY ANUSTJNPVS, ERPGRWPX JDOS PFRTL, RKXGITZ ERQW, TBF JCRKSV TMGICTRRT WCELKTGHU. FSG ISTJMCTZ CEB TVCPFKXV ZKMCH KSNP KDKS CEB BHFG FL DNF CSGABHA KM AXH ULAW XHJVPTTZ ERPGBST GGVXCPJ KTWWCKC PM O FZQITBEV UWTH YV SHXR VF BD PWVY DPVS-VF-DPVS OVCIBBIJ, NPIST UMRNAGERH, TBF R DXKA JRLSLVCBC. JGTQIRJGOVVJN, MVG KCRABKTYA PWBRPSKM GEYQEWPX PTFCVV ADEZCSMGTHKFLH BG HFSCWSF FL QKCCUAPLHKEE TOSTPRWBBI RQ HXEWVLRXG QW XTKCU RLS HBGJ RWTH QEC’H HKP UMV PCWCBC’M FGTMVGWBV. WTH KJ RD WWUKGCZIKJF P WWIZRPE RQCJPK KJVL XM WU RQ TTGKCW GXDTFBJVWDCC PL HJV QEHYGE UDKR? - Find key length(猜測兩個相同字母間距離是密碼長度的倍數) - First we need to look for repeated sequences of characters. - Recall English has a large repetition of certain bigrams - These are likely to match up to the same two letters in the key every so often - By looking for the distance between two repeated sequences we can guess the length of the keyword. - Each distance should be a multiple of the keyword. - Taking gcd of all distances between sequences should give the keyword length. - Kasiski examination - First sentence is - CJ UT WFCN LTTF VF AAHGKEE DNH VYC IPSPKGTMV EVLINFA, NC HXS SLGIX QNVYGEM VYYI MVG ZLUHFORRXHBRIMRXGUZLV TBF KCAXQQDKJGWERRXHBU ICKHZWKGDGG PFU JGRGIUPR KKCJ RHBVZLJX JKXMGHIUCW XGHQ KFT MKGERN-YWTJR. - Distance between occurrences of RR : 30 - Distance between occurrences of KG : 96, 46 - gcd(30, 96) = 6, gcd(30, 46) = 2, gcd(96, 46) = 2 - Unlikely to have a keyword of length two - Guess keyword is of length 6 - Now we take every sixth letter and look at the statistics just as we did for a Shift Cipher to deduce the first letter of the keyword.(每個位置各自計算頻率分析) - Then we take every sixth letter starting from the second one and repeat to find the second letter of the keyword. - And so on to determine the six letters of the keyword. - First Letter: We look for a blank and then three high ones, corresponding to Q, R, S, T - Key is 2 or C ![image (12)](https://hackmd.io/_uploads/BkGC_ouZyx.png) - Second Letter: We look for a blank and then three high ones, corresponding to Q, R, S, T - Key is 17 or R ![image (13)](https://hackmd.io/_uploads/HyKRdiOZJg.png) - Continuing in a similar way we find the keyword is: CRYPTO - The underlying plaintext is then found to be - Plaintext As we draw near to closing out the twentieth century, we see quite clearly that the information-processing and telecommunications revolutions now underway will continue vigorously into the twenty-first. We interact and transact by directing flocks of digital packets towards each other through cyberspace, carrying love notes, digital cash, and secret corporate documents. Our personal and economic lives rely more and more on our ability to let such ethereal carrier pigeons mediate at a distance what we used to do with face -to-face meetings, paper documents, and a firm handshake. Unfortunately, the technical wizardry enabling remote collaborations is founded on broadcasting everything as sequences of zeros and ones that one’s own dog wouldn’t recognize. What is to distinguish a digital dollar when it is a s easily reproducible as the spoken word? How do we converse privately when every syllable is bounced off a satellite and smeared over an entire continent? # Playfair Cipher - Best-known multiple-letter encryption cipher - Treats digrams in the plaintext as single units and translates these units into ciphertext digrams - Based on the use of a 5 x 5 matrix of letters constructed using a keyword - Fill in letters of keyword (minus duplicates) from left to right and from top to bottom, then fill in the remainder of the matrix with the remaining letters in alphabetic order - Using the keyword MONARCHY: ![image (14)](https://hackmd.io/_uploads/SyVktouZke.png) - Rule 1. 同一對中的重複明文字母 → 填充字母分隔,例如 x,Ex:balloon → ba lx lo on. 2. 同一行 → 右方字母替代,EX:ar is encrypted as RM. 3. 同一列 → 下方字母替代,EX:mu is encrypted as CM. 4. 不同列不同行 → 由對角的兩點替代,EX:hs becomes BP and ea becomes IM (or JM, as the encipherer wishes). ![image (15)](https://hackmd.io/_uploads/HkSxKoO-Jl.png) # Vernam Cipher - Originally - The key had not been generated in a truly random manner - The key was on a looped tape - The key was reused ![image (16)](https://hackmd.io/_uploads/HyOZYjdZyg.png) ## One-Time Pad - Improvement to Vernam cipher proposed by an Army Signal Corp officer, Joseph Mauborgne - Use a random key that is as long as the message so that the key need not be repeated - Key is used to encrypt and decrypt a single message and then is discarded - Each new message requires a new key of the same length as the new message - Scheme is unbreakable - Produces random output that bears no statistical relationship to the plaintext - Because the ciphertext contains no information whatsoever about the plaintext, there is simply no way to break the code - The one-time pad offers complete security but, in practice, has two fundamental difficulties: - There is the practical problem of making large quantities of random keys - Any heavily used system might require millions of random characters on a regular basis(成本昂貴) - Mammoth key distribution problem - For every message to be sent, a key of equal length is needed by both sender and receiver - Because of these difficulties, the one-time pad is of limited utility - Useful primarily for low-bandwidth channels requiring very high security - The one-time pad is the only cryptosystem that exhibits perfect secrecy. # Rail Fence Cipher - 最簡單的轉置密碼 - 明文被寫為對角線序列,然後作為行序列讀取 ![image (17)](https://hackmd.io/_uploads/Hko7KsdZ1e.png) # Row Transposition Cipher - 更複雜的轉置密碼 - 將訊息逐行寫入矩形中,然後逐列讀取訊息,但要排列列的順序 - 列的順序就成為演算法的key ![image (18)](https://hackmd.io/_uploads/HydvKsubyl.png) # Rotor Machines - 採用替換密碼然後旋轉它被視為理想的solution。 ### Example - To encrypt the first letter we use the substitutions ![image (19)](https://hackmd.io/_uploads/r1YVKs_W1x.png) - To encrypt the second letter we use the substitutions ![image (20)](https://hackmd.io/_uploads/HkBrtjubJg.png) - To encrypt the third letter we use the substitutions ![image (21)](https://hackmd.io/_uploads/BJxtFsu-ye.png) - and so on # Enigma - 由Plug board和Scrambler(3個rotor、1個startor、一個reflector) ![image (22)](https://hackmd.io/_uploads/S1DKKiObkg.png) - 可能性:$\binom{5}{3}⋅3!⋅26^{3}⋅26^{2}⋅25⋅10^{14} ≈ 10^{23}$