Need to read up on cryptography?
See the links at the end of this page.
This page describes a block mode for block ciphers that has several features:
This mode is based on the Merkle-Damgard hash construct, which is used inside most modern cryptographic hash functions. With some minor modifications the hash construct is turned into a MAC.
The basic realisation is that the intermediate hashes inside the Merkle-Damgard hash construct form a pseudo random number stream. I have modified the construct to turn that pseudo random number stream into what I think is a cryptographically secure keystream. That keystream can then be used to encrypt the message the same way as many other block modes do. That is, XOR the keystream onto the cleartext to get the ciphertext.
This page describes the different components of the mode and the reasoning behind them step by step. A full diagram of the mode is at the end of this page.
Note that for most uses integrated modes like this that do both authentication and encryption at the same time are slower and perhaps also less secure than using for instance CTR mode combined with HMAC. The reason is that these integrated modes cause rekeying of the block cipher for each block processed. So this kind of integrated modes are normally only the better choice for embedded systems that needs to save code space.
Two bit-strings can be combined (compressed together) by using XOR (bitwise exclusive or).
If the result then is fed into a cryptographic compression function such as a block cipher then the two strings will be properly mixed together and thus get proper avalanche effect etc.
The two strings must not be similar in "behaviour" in a way that can cause them to interact. For instance combining a counter and an IV this way is bad if the application uses a message counter as IV.
The reason to use XOR as a "compression function" in this mode is to save block encrypt operations. If done right it should still be secure.
This is the Miyaguchi-Preneel one-way compression function.
It uses a block crypto and feeds one of the input bit-strings as key and the other as the block to be encrypted. It then does two XORs to turn the block operation into a one-way function. This function is cryptographically secure and collision resistant if the block crypto used is secure.
There are many ways to turn a block crypto into a one-way compression function.
If the block crypto has sufficient key and block size (at least 128 bits)
then this is the one-way compression method I recommend to use in
the authenticated encryption mode.
This is the Merkle-Damgard hash construct.
It is used inside most modern cryptographic hash functions. It starts out with an IV, which usually is standardised for the hash function. It then uses a one-way compression function [f] such as the Miyaguchi-Preneel above. It compresses in each block of the message to be hashed. It then also adds "length padding" to prevent an attacker from creating collisions.
Note that the intermediate hashes (H0 - Hn-1) form a pseudo random number stream. This is the stream that I modify to become the keystream in the authenticated encryption mode.
A little about length padding:
When using a hash as a MAC then the length padding is not that necessary since it is not usually considered a problem if the sender (which knows the key for the MAC) can create hash/MAC collisions. Although in the authenticated encryption mode I have chosen to still use length padding, no need to remove a good feature.
The length padding is just some bits starting with 1, then some 0's to fill out, and an integer stating the length of the message hashed. If the message hashed was say "Wikipedia" (9 bytes) and the compression function uses 8 byte blocks then the message + length padding would look like one of these two examples (depending on which kind of padding standard is used):
Wikipedi a1000000 00000009
Wikipedi a1000009
(+) means bitwise exclusive or (XOR).
[f] is a cryptographic one-way compression function. For instance the Miyaguchi-Preneel.
[E] is a block crypto (for instance AES) and together with the surrounding two arrows that are XORed below it it forms the one-way compression function [f].
Key is the symmetric key (shared secret) that both the sender and the receiver know. The same key value is used in each round in the diagram.
IV means "Initial Value" or "Initialisation Vector" and is used to make each message unique since otherwise using the same key would be insecure. The IV is either a big random number usually sent along with the message in cleartext, or a unique message counter that also might be sent along in cleartext.
m0 - m1 is the cleartext message blocks. The last block is Merkle-Damgard length padded.
H-1 - H1 is the intermediate hashes that are also used as keystream for the encryption.
C0 - C1 is the encrypted message.
The 1 - 3 counter that is XORed in with the key and the intermediate hash in each round is to prevent the structure to get internal loops. (To avoid getting repeating patterns of intermediate hashes.) This can happen if the message that is being MACed itself is repeating, for instance if it consists of a lot of "AAAAAA". The risk of such loops occurring is very small if the block cipher used has a big enough block size. However some systems might use a block cipher with a too small block size and such loops would completely destroy the security of the encryption. So adding a non-repeating counter in this way seems prudent and is a very cheap operation. And I think it also makes it a stronger hash/MAC.
Note that we could not use H0 to encrypt m0 since the receiver will not know m0 before he have decrypted C0 and thus can not calculate H0 until after decrypting C0. And we can not use the last hash (H1) to encrypt the last message block since usually we want to save one block operation by putting the whole length padding in the last message block if that block has room enough. Thus in that case H1 is HFinal and since HFinal is the MAC it will be sent in cleartext.
The keystream here is created in a way similar to OFB and CTR mode, but with some differences. The key, counter and intermediate hash (Hx) is combined using XOR and then fed as one input into the one-way compression function. From an encryption point of view it does not matter at all what is fed as the other input into the one-way function. All the message blocks fed can just as well contain all zeros. We will still get a good secure random looking keystream.
To mix in the intermediate hash (Hx) is of course necessary for the MACing of the message. But it is also necessary to retain the influence of the IV.
Since the intermediate hashes also contains the influence of the key from the first round it might at first glance seem unnecessary to mix in the key on each round. However if the key is not mixed in at each round AFTER the intermediate hash has been used as keystream there is a trivial attack that could completely decrypt the message. The attack works like this: The attacker snoops the encrypted message in transit (C0 - C1). The attacker guesses the contents of one of the cleartext blocks, say m0. That block might for instance contain some standard message header that is easy to guess. The attacker then does C0 XOR m0 and gets H-1. If we do not mix in the key after that then the attacker would know all the inputs to the one-way function since he then knows H-1, the counter value and he already have guessed m0. That means he could calculate the next hash (H0) and use that to decrypt C1 and so on. That is, he could then decrypt all the message blocks after the block he guessed. The simple XORing in of the key before each round stops this attack.
It seems that we don't really need a one-way function but can use the block cipher as is in each round. (Without the Miyaguchi-Preneel construct.) This would allow the sender to on purpose cause hash/MAC collisions but that is usually not considered a problem when MACing. However the Miyaguchi-Preneel one-way function just costs two extra XORs so it is very cheap. Using a one-way function seems more robust and it might be nice to even stop the sender from causing hash/MAC collisions. So it seems prudent to use the one-way function.
Authenticated encryption modes like this mode causes the block cipher to re-key for each block operation. In many block ciphers rekeying takes some time, which probably makes modes like this slower than just using CTR mode and HMAC together. So ideally a block mode should not cause rekeying. I don't know if it is possible to design a secure authenticated encryption mode that only uses one block operation per message block and no rekeying (and perhaps 1-2 additional block operations for set-up and finalisation), but it would be very neat.