Recover plaintext attack to block ciphers


📝 Original Info

  • Title: Recover plaintext attack to block ciphers
  • ArXiv ID: 0807.3383
  • Date: 2009-03-21
  • Authors: Not stated in the paper; could not be verified.

📝 Abstract

We present an estimate of an upper bound on the number of 16-byte plaintext blocks of English text. The bound indicates that block ciphers with a block length of at most 16 bytes are subject to plaintext-recovery attacks in known-plaintext or chosen-plaintext settings.

📄 Full Content

There is a large body of research on the security of block ciphers, which can be found in most textbooks and papers on cryptography; see [1]. A block cipher encrypts plaintext block by block with a fixed encryption scheme, so for a given secret key, plaintext blocks are in one-to-one correspondence with ciphertext blocks. It follows that such ciphers are easily subjected to plaintext-recovery attacks if the number of possible plaintext blocks is not sufficiently large. Suppose that the number of all possible plaintext blocks is at most $2^m$, and that an adversary holds a dictionary of (ciphertext, plaintext) block pairs of size about $2^{m/2+2}$. Then, by the generalized birthday paradox, the adversary recovers a plaintext block with high probability after collecting $2^{m/2}$ ciphertext blocks. In most block ciphers currently in use, the output size, that is, the block length, is at most 16 bytes. In this paper, we show that for English text the number of 16-byte plaintexts is less than $2^{56}$, so block ciphers with an output size of 16 bytes are vulnerable to plaintext-recovery attacks in known-plaintext or chosen-plaintext settings.
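The birthday-style estimate above can be checked numerically. The following is a minimal sketch, not the paper's own calculation: it assumes that plaintext blocks are drawn uniformly from the $2^m$ possibilities (real English text is far from uniform) and uses the standard approximation $1 - e^{-DC/2^m}$ for a dictionary of $D$ entries and $C$ collected blocks. The function name `dictionary_hit_probability` is hypothetical.

```python
import math


def dictionary_hit_probability(m: int, dict_log2: float, collected_log2: float) -> float:
    """Approximate chance that at least one collected ciphertext block
    is already in the adversary's dictionary.

    Assumption (not from the paper): blocks are uniform over 2**m values,
    so the expected number of matches is D * C / 2**m.
    """
    # Work with exponents to avoid building huge integers.
    expected_matches = 2.0 ** (dict_log2 + collected_log2 - m)
    return 1.0 - math.exp(-expected_matches)


m = 56  # the bound claimed for 16-byte English plaintext blocks
p = dictionary_hit_probability(m, m / 2 + 2, m / 2)
print(f"m={m}: success probability ~ {p:.3f}")
```

With a dictionary of $2^{m/2+2}$ pairs and $2^{m/2}$ collected blocks, the expected number of matches is $4$ regardless of $m$, which gives a success probability of about $1 - e^{-4}$ under the uniformity assumption.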

In the rest of this section, we introduce some notions used in this paper.

Denote by $Q$ the vocabulary of the plaintexts, and let $N = |Q|$ be its size. For a word $w \in Q$, denote by $|w|$ its length, i.e., the number of letters contained in $w$.

An English phrase or a plaintext block $\alpha$ is said to be of $k$ terms if it consists of $k$ words or parts of words. There are four possible expressions for $k$-term plaintext blocks, (1.1) to (1.4), in which some terms, as in (1.3), are not complete English words but only parts of words. In addition, some blocks may contain punctuation marks such as ‘,’, ‘.’ or ‘;’. A punctuation mark is treated as a character rather than a term, except in the special case where a word in (1.1) or (1.2) is itself just a punctuation mark. In the following discussion we consider only the three frequently used punctuation marks ‘,’, ‘.’ and ‘;’.

For simplicity, this paper assumes that the words in the vocabulary $Q$ consist of English letters only; special characters such as @ and #, Arabic numerals, and abbreviations are excluded.
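The notion of a $k$-term block can be illustrated with a short sketch. It assumes ASCII text (one byte per character) and counts every maximal run of letters in a block as one term, so a word cut by a block boundary contributes a partial word to each side. The helper names `split_blocks` and `count_terms` are hypothetical, and the special case where a punctuation mark stands for a word is not handled.

```python
import re

BLOCK_SIZE = 16  # bytes per block; one byte per character for ASCII text


def split_blocks(text: str, size: int = BLOCK_SIZE) -> list:
    """Cut ASCII text into consecutive full blocks of `size` characters."""
    full = len(text) - len(text) % size  # drop a trailing partial block
    return [text[i:i + size] for i in range(0, full, size)]


def count_terms(block: str) -> int:
    """Count maximal runs of letters: whole words or parts of words."""
    return len(re.findall(r"[A-Za-z]+", block))


sample = "the block ciphers will be subject to attacks"
for block in split_blocks(sample):
    print(repr(block), count_terms(block))
```

In this sample, the first block is `'the block cipher'`, a 3-term block whose last term is only a part of the word "ciphers"; the second block begins with the leftover letter "s".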

In this section, we present an estimate of the number of 16-byte plaintexts.

Proposition 1. Suppose that $Q$ is a vocabulary consisting of English words, containing no special characters or Arabic numerals, with size $|Q| \le 60000$. Let $F$ be the set of all possible 16-byte blocks of English text over

where μ is a constant, 2,

Proof. Denote by F , F and ′ F the subsets of $F$ consisting of the 16-byte plaintexts whose first character is a lowercase letter, a capital letter, and a punctuation mark, respectively. We will see that F accounts for the main part of the total. For a positive integer ,

be the subsets of $F_k$ with the expression forms (1.1), (1.2), (1.3) and (1.4), respectively. We first calculate

To begin, we restrict ourselves to the case

(2.5)

By basic combinatorics, for any positive integer $s$, we have

So, with (2.5), (2.6), (2.7) and (2.4), we have

To get an estimate for

where Stirling’s formula has been applied.

It is likely that inequality (2.1) holds for the distribution of English words, but we have not verified it in full. We have therefore taken it as a condition, so that the constant $\mu$ may be adjusted to fit the actual case.

Remark 2. Moreover, for a $k$-letter word $w$ and a positive integer $i$ with $i \le k$, we call the segment formed by the first $i$ letters of $w$ the $i$-prefix of $w$ and, similarly, the segment formed by the last $i$ letters of $w$ the $i$-suffix of $w$. Denote by [ ] i Q and [ ] i Q the sets of all distinct $i$-prefixes and $i$-suffixes of the words in $Q$, respectively. Suppose that [ ] ( ) It is easy to see that the conjecture is true for $i = 1$ and $i > 5$, so the cases that remain to be verified are $2 \le i \le 5$.
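The prefix and suffix sets of Remark 2 are straightforward to compute for a given word list. The sketch below uses a toy vocabulary as a stand-in for $Q$; the function names `prefix_set` and `suffix_set` are hypothetical. It only counts distinct $i$-prefixes and $i$-suffixes and does not test the conjecture itself.

```python
def prefix_set(vocab, i):
    """All distinct i-prefixes of the words with at least i letters."""
    return {w[:i] for w in vocab if len(w) >= i}


def suffix_set(vocab, i):
    """All distinct i-suffixes of the words with at least i letters."""
    return {w[-i:] for w in vocab if len(w) >= i}


# Toy vocabulary (an assumption for illustration, not the paper's Q).
vocab = ["block", "blind", "cipher", "text", "next"]

for i in range(2, 6):  # the cases 2 <= i <= 5 left open in the remark
    print(i, len(prefix_set(vocab, i)), len(suffix_set(vocab, i)))
```

Running the same loop over a real word list of up to 60000 entries would give the set sizes needed to examine the open cases $2 \le i \le 5$.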

For simplicity of discussion, we have excluded Arabic numerals, special characters such as @ and $, and less common punctuation marks such as ‘!’ and ‘?’; they do appear in English texts, but only occasionally. The estimate above may therefore be viewed as one for the frequently occurring blocks. The calculations in this paper are almost purely combinatorial and take no account of English grammar, logic, or semantics, so the actual number of plaintext blocks is very likely much smaller than the one given in Proposition 1. In fact, our first idea came from considering English grammar, but that approach is somewhat cumbersome. The result indicates that block ciphers with a 16-byte block length, such as AES, are subject to plaintext-recovery attacks when used to encrypt English text in known-plaintext or chosen-plaintext settings. From the discussion above, we have seen that the amount of plaintext b

Reference

This content is AI-processed based on open access ArXiv data.
