|
Introduction
UniCipher's
encryption software features their homemade encryption algorithm to safe-guard
your files and private data. In this essay we'll explore this encryption algorithm
and reach the conclusion that this algorithm is far from secure. I'll also show
a successful attack against this algorithm. By the end of this essay I hope
you will realize this is a definite case of a "snake oil" - the product
is sold without consideration of its quality or its ability to fulfill its vendor's
claims. The majority of this essay will be technical, and will deal with cryptography,
yet I hope it will at least pass the following point: good cryptography is not
easy to build. It is, in fact, very difficult. So whenever you have a choice
between a well-known encrypting algorithm, and a new-home-made-super-secret
algorithm, be sure to choose the former.
C.R.O.E.S. Description
C.R.O.E.S.
is the name of UniCiphers's encryption algorithm.
This is the description you'll get from the help file:
C.R.O.E.S. is a stream cipher. The key array is fed into the shift register and serves as a rotor. A second, larger rotor is generated from the key prior to encryption and is also fed into the shift register during encryption. For those of you who study encryption, these rotors would be described to you as s-boxes that change with each use. Excellent results are achievable by using passwords of moderate length (8 characters and up). The values obtained from both rotors compound into the shift register with single byte of data encrypted. Both rotors are changed byte by byte with each operation and fed back into the shift register. The methods used to permute the rotor, the key and the shift register can produce patternless cipher with very short keys. Encryption can further be enhanced by activating CBC mode, whereby the key (rotor 1) is permuted by previously encrypted bytes of data. No initialization vector is created. C.R.O.E.S. is an algorithm developed in 1995 by NaillonWorks programming. |
Help files will get us nowhere. A plaintext encrypted with a simple
xor encryption and with the DES encryption will probably produce the same 'garbage-look'
in the ciphertext. In order to know how good the algorithm really is, we'll
have to reverse engineer (the RC4 algortihm, for example, was published after
it was reverse engineered).
I wont bother you with softice breakpoints and IDA disassembly. It takes only
a few hours of tracing to reach a good understanding of the algorithm. This
is it's real description:
1. C.R.O.E.S. encryption algorithm is a stream cipher. |
2.
A key stream of 500 bytes are produced each time. The first 500 keystream
bytes are generated from the password. The rest of the key stream (byte
501 and on) is generated from preceding keystream bytes. |
3. During encryption/decryption
there's a running key byte, hereonafter refered as keyValue, which
is the accumelative sum of all keystream bytes until the current position
(modulo 0xff). keyValue is updated after each encryption step. |
4. To encrypt, the current keyValue is added to the plaintext byte, and this results in the current ciphertext byte. |
5. To decrypt, the current keyValue is subtracted from the ciphertext byte, and this results in the plaintext byte. |
Points 4 and 5 are the engine of the algorithm, while points 2 and 3 are its
heart, and we'll explore them closely.
Keystream Generation
The keystream is a 500bytes (dwords acctually, but they are treated
as unsigned chars) array. The first keystream block is initialized with the
password:
// at start up seed = sum of password chars // seed0 key = seed i = 0 // build 500bytes keystream loop do // update seed seed *= 0x80884505 seed++ // update key key = key + ((key*seed) >> 0x20) + password[i % passwordLength] // make sure key is byte size key = key % 0xff // put it in the keystream buffer K[i] = key // next byte i++ loop until i == 500 |
For the record, the password string is rotated whenever (i
% passwordLength) has reached the last char of the password.
When you wish to encrypt a plaintext document bigger than 500bytes you'll have
to generate new keys in the stream. All the keystream blocks (other than the
first one) are generated after each encryption step, so keyValue is updated
after each byte encryption:
// KeyBlock0 is described above // KeyBlockn is the current 500bytes key stream block used for the encryption KeyBlockn+1[i] = KeyBlockn[i]+keyValue |
Encryption / Decryption
Basically, encryption is performed on a byte level. A plaintext
byte is fetched from the plaintext. Its value is added to the current keyValue,
and the result byte (hence and 0xff) is the ciphertext byte. After each encryption
step, keyValue is updated for both the next encryption step, and the next keystream
block.
// init keyValue with the last key from the first 500bytes keystream block keyValue = KeyBlock0[499] i = 0 n = 0 // encryption loop do // encryption C[i] = (P[i] + keyValue) & 0xff // update keyValue for next encryption step keyValue = keyValue + KeyBlockn[i%500] keyValue = keyValue % 0xff // update next keystream block as well KeyBlockn+1[i%500] = keyValue i++ // if we're out of keys, move to the next keystream block if i%500 == 0 n++ end if loop until no more plaintext bytes to encrypt |
Decryption is much like encryption. As this is a stream cipher,
the keystream and keyValue are generated like they would in encryption, only
that keyValue is subtracted from the ciphertext byte to give the original plaintext
byte:
// init keyValue with the last key from the first 500bytes keystream block keyValue = KeyBlock0[499] i = 0 n = 0 // decryption loop do // decryption P[i] = (C[i] - keyValue) & 0xff // update keyValue for next encryption step keyValue = keyValue + KeyBlockn[i%500] keyValue = keyValue % 0xff // update next keystream block as well KeyBlockn+1[i%500] = keyValue i++ // if we're out of keys, move to the next keystream block if i%500 == 0 n++ end if loop until no more ciphertext bytes to decrypt |
Analysis
Lets keep things simple and work with a small plaintext document
(much smaller than 500 bytes) so we'll only have to work with the first keystream
block. This means we can cut the 'update next keystream block' part, and we
are left with the following algorithm:
// Build KeyBlock0 using the password as described above buildKeyBlock0() // init keyValue with the last key from the first 500bytes keystream block keyValue = KeyBlock0[499] i = 0 // encryption loop - suppose plaintext document size is smaller than 500 bytes do // encryption C[i] = (P[i] + keyValue) & 0xff // update keyValue for next encryption step keyValue = keyValue + KeyBlock0[i] keyValue = keyValue % 0xff i++ loop until no more plaintext bytes to encrypt |
Bare in mind the keystream (KeyBlock0) depends only
on the password. keyValue depends only on the previous keyValue, and the keystream
bytes.
Now suppose we have a few known plaintext bytes
(this is well in reach when you encrypt files like EXEs and ZIPs that have a
known header with constant bytes). With every plaintext-ciphertext pair we can
deduct the keyValue used for the encryption. The beautiful thing is that when
we have a sequence of known plaintext bytes
(and hence a sequence of known keyValue bytes), we can reconstruct a sequence
of keystream bytes:
Known plaintext bytes: P0 , P1 , P2 , P3 , P4 , P5 Known ciphertext bytes: C0 , C1 , C2 , C3 , C4 , C5 Deducted keyValues: keyValuex = Cx - Px keyValue0 , keyValue1 , keyValue2 , keyValue3 , keyValue4 , keyValue5 During encryption we have: keyValuex+1 = keyValuex + Kx This means from every consecutive keyValue bytes, we can deduct the key of the keystream: Kx = keyValuex+1 - keyValuex K0 , K1 , K2 , K3 , K4 |
O.k. we had a sequence of known plaintext bytes from the file
(the bytes are located in the first 500 bytes block in the file), and we have
built a sequence of keystream bytes. Now lets take another look at how the stream
of keys is generated from the password:
// at start up seed = sum of password chars // seed0 key = seed i = 0 // build 500bytes keystream loop do // update seed seed *= 0x80884505 seed++ // update key key = key + ((key*seed) >> 0x20) + password[i % passwordLength] // make sure key is byte size key = key % 0xff // put it in the keystream buffer K[i] = key // next byte i++ loop until i == 500 |
Attack
We start with seed0 = 0. For every pair of the key sequence we
deducted, we reconstruct the seed value (it depends on seed0 and the file position),
and calculate the password char. If we get a typeable char, we try the next
key-pair, else we try another seed0 since we know this cant be a correct password.
Here is the attack formally:
1. Collect information about the stream of plaintext bytes, and build a set plaintext-ciphertext pairs. The known plaintext bytes must be consequtive, and all must be in the first 500bytes of the file. 2. Build the keyValue sequence using the P-C pairs. 3. Build the key sequence using the keyValue pairs. 4. Set seed0 = 0 5. Set testKeyPair to the first key pair in the sequence. 6. Calculate seed using seed0 and the known plaintext/ciphertext byte location. 7. Calculate a char of the password using the testKeyPair and the seed: passwordChar = keyx+1 - ( keyx + (keyx*seed)>>0x20) 8. If passwordChar is not typeable goto 12 9. Set testKeyPair to the next key pair in the sequence. 10. If we havent reached the end of the key sequence goto step 6 11. All the key sequence has been processed, and all password chars are typeable. Add this password as a possible match. 12. seed0++. Remember that seed0 is the sum of the password chars so a different seed0 is a different password. 13. If seed0 is smaller than the maximum seed0 allowed goto 5 |
Bottom line, from every 2 keys we can deduct, we "guess"
seed0, and calculate the password char. If the password char is not typeable,
we try a different seed0. Now if we have a stream of keys, and the each pair
producess a printable char - the odds are we just figured the encryption password.
If we process seed0 different from the correct seed0, the test is very likely
to give an un-typeable password char in one of the key pairs.
Attack Results
The PE file MS-DOS header has a field of 20bytes long buffer marked
as reserved - i.e. all null. This means that for every exe you encrypt with
this stream algorithm you know a sequence of 20bytes plaintext! Thats a lot!
A 19chars long password can be recovered, let alone the "moderate length
(8 characters and up)" they said in the help file. If you set maximum seed0
value to 2500 (the sum of 20 'z') the full attack takes only a couple of seconds.
Here's a sample output of the attack program. The password I used was "phySics".
[#] Encrypted file: calc.exe
[*] Please Enter (in hex) known plaintext stream location: 22
[*] Enter (in hex) known plaintext byte at location 0x22 : 0
[*] Enter (in hex) known plaintext byte at location 0x23 : 0
[*] Enter (in hex) known plaintext byte at location 0x24 : 0
[*] Enter (in hex) known plaintext byte at location 0x25 : 0
[*] Enter (in hex) known plaintext byte at location 0x26 : 0
[*] Enter (in hex) known plaintext byte at location 0x27 : 0
[*] Enter (in hex) known plaintext byte at location 0x28 : 0
[*] Enter (in hex) known plaintext byte at location 0x29 : 0
[*] Enter (in hex) known plaintext byte at location 0x2a : 0
[*] Enter (in hex) known plaintext byte at location 0x2b : 0
[*] Enter (in hex) known plaintext byte at location 0x2c : 0
[*] Enter (in hex) known plaintext byte at location 0x2d : q
[#] found pass phrase segment: 0wq;p|.ud
[#] found pass phrase segment: cl<x~w`y)
[#] found pass phrase segment: ^h>J~j}v>
[#] found pass phrase segment: Su[hw\6r>
[#] found pass phrase segment: S`=j~P,pf
[#] found pass phrase segment: Mq\:xOToS
[#] found pass phrase segment: ~]2Zh|MmZ
[#] found pass phrase segment: xnP*~{ulF
[#] found pass phrase segment: m{oHxm.yG
[#] found pass phrase segment: @`h={gVs4
[#] found pass phrase segment: csphySisp *****
[#] found pass phrase segment: 9{8rnt%vP
[#] found pass phrase segment: Ub}k|lkwQ
[#] found pass phrase segment: 4w8DngCsd
|
The password is rotated, but thats expected and it doesnt take
long to realize the correct password.
Final words
Sourcecode: I've included the source code of my crypt-analysis attack to the UniCipher algorithm. You are welcomed to experiment with it. You can find it here.
Snake Oil FAQ: Please have a look
at the snake-oil
faq ("Snake Oil Warning Signs: Encryption Software to Avoid")
- it was written in 1996. Its surprising to see how much of the warning signs
written there are well shown in UniCipher case. The sad truth is that there
are more such cases of selling poor cryptography. I personally encourage everyone
to study cryptography and try to invent algorithms, its great fun, but please
- dont sell such things.
Disclaimer: This essay and the attack program
source code is provided for educational purposes only. Any use, mis-use or illegal
activity is the sole responsibility of the reader.
Take care,
The+Q
(you can reach me at qster@oldleetos.net)