Swimming upstream...

an essay by:

j!m

09-19-2001

Published by Tsehp


Target:

zipcrack 2.0 by Paul Kocher.
It is a very, very old program, written in 1992 by a famous cryptographer (one of the authors of the Differential Power Analysis attacks) to demonstrate weaknesses in the encryption algorithm used by PKZIP 1.0.

The program is crippled and can only search for passwords beginning with the letter 'z' (seems simple..). Moreover, the 'optimized search feature', which relied on undocumented properties of the encryption header, no longer works with newer versions of the pkzip algorithm, so you can't use it.

But it doesn't matter: nowadays you can find a lot of efficient tools to crack zip passwords. These tools offer dictionary attacks, brute force, known plaintext attacks...try zipkey or advanced zip password recovery if you want to play with some of them..

My goal is not to provide you with the latest keygens or patches for the latest commercial releases. My goal is to learn and to share what I've learned!!

If you follow this tut, you should at least learn something about the pkzip stream cipher (I did!).

Tools used:

Turbo Debugger for DOS,
Hedit 2.11,
a crypted archive (password: reverse)
Windows calculator

Let's go now!

Launch turbo debugger, open the program and, in the run menu, choose the 'arguments...' entry to enter the full path of the crypted zip archive.
As you can read in the zipcrack documentation, the program can only search for passwords beginning with the letter 'z'. Without being a wizard I can feel it...a closer look at my ascii table informs me that the code for the letter 'z' is 0x7A, so let's search the disassembled program for the value '7A'.

You should find something like this:


cs:0B21 80BEFEFE7A      cmp byte ptr [bp-0102],7A
cs:0B26 7507            jne 0B2F
cs:0B28 80BEFEFD7A      cmp byte ptr [bp-0202],7A
cs:0B2D 7415            je 0B44
cs:0B2F 833E423300      cmp word ptr [3342],0000
cs:0B34 750E            jne 0B44
cs:0B36 B8EC3A          mov ax,3AEC
cs:0B39 50              push ax
cs:0B3A E81511          call 1C52
cs:0B3D 59              pop cx
cs:0B3E C70642330100    mov word ptr [3342],0001

cs:0B44 C686FEFE7A      mov byte ptr [bp-0102],7A
cs:0B49 C686FEFD7A      mov byte ptr [bp-0202],7A
cs:0B4E C6047A          mov byte ptr [si],7A
cs:0B51 837E0600        cmp word ptr [bp+06],0000
cs:0B55 7417            je 0B6E
cs:0B57 B8703F          mov ax,3F70
cs:0B5A 50              push ax
cs:0B5B 8BC6            mov ax,si
cs:0B5D 40              inc ax
cs:0B5E 50              push ax
cs:0B5F 8D86FFFD        lea ax,[bp-0201]
cs:0B63 50              push ax
cs:0B64 8D86FFFE        lea ax,[bp-0101]
cs:0B68 50              push ax
cs:0B69 E8EF05          call 115B
cs:0B6C EB15            jmp 0B83

cs:0B6E B8723F          mov ax,3F72
cs:0B71 50              push ax
cs:0B72 8BC6            mov ax,si
cs:0B74 40              inc ax
cs:0B75 50              push ax
cs:0B76 8D86FFFD        lea ax,[bp-0201]
cs:0B7A 50              push ax
cs:0B7B 8D86FFFE        lea ax,[bp-0101]
cs:0B7F 50              push ax
cs:0B80 E83900          call 0BBC
cs:0B83 83C408          add sp,0008
cs:0B86 5F              pop di
cs:0B87 5E              pop si
cs:0B88 8BE5            mov sp,bp
cs:0B8A 5D              pop bp
cs:0B8B C3              ret

Ok, it looks like we have found the trick! Put a breakpoint on the first cmp instruction and run the program (do not forget to give it the zip file as an argument).
Don't use the optimized search.
Type *reverse and press enter.

The program breaks. Dump memory at [bp-0102]...do you see the same thing as me? This is the password we typed!
Wonderful, the logic here is quite simple:

The program tests whether the string entered begins with the letter 'z'. If not, it jumps to cmp word ptr [3342],0000, which tests whether it has already displayed the advertisement saying that it can only find passwords beginning with the letter 'z'.
After this, it forces the first letter to 'z', tests whether you want the optimized search or not, and branches according to your choice.
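
The control flow of the listing can be modelled roughly like this. It is only a Python sketch of my reading of the disassembly: the names first, second, nag_shown and optimized are my own labels for [bp-0102], [bp-0202], [3342] and [bp+06], and I am assuming that call 1C52 prints the advertisement and that call 115B is the optimized search.

```python
Z = 0x7A  # ASCII code of 'z'

def dispatch(first, second, nag_shown, optimized):
    """Rough model of cs:0B21..0B83.

    first/second are bytearrays standing for the buffers at [bp-0102]
    and [bp-0202] (hypothetical names). Returns a tuple
    (nag_printed, nag_shown, routine).
    """
    nag_printed = False
    # 0B21-0B2D: only if both buffers start with 'z' is the nag skipped
    if not (first[0] == Z and second[0] == Z):
        # 0B2F-0B3E: the advertisement is printed once, then flagged
        if not nag_shown:
            nag_printed = True
            nag_shown = True
    # 0B44-0B49: the first letter is forced to 'z' whatever was typed
    # (0B4E also writes 'z' through si; that buffer is not modelled here)
    first[0] = Z
    second[0] = Z
    # 0B51-0B80: choose the search routine; both get buffer address + 1
    routine = "call 115B" if optimized else "call 0BBC"
    return nag_printed, nag_shown, routine

buf_a, buf_b = bytearray(b"abc"), bytearray(b"abc")
print(dispatch(buf_a, buf_b, nag_shown=False, optimized=False), bytes(buf_a))
```

Whatever you type, the buffers leave this routine starting with 'z', which is why skipping the nag alone cannot be enough.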

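It seems that all we have to do here is to replace the jne 0B2F with an unconditional jump that skips both the advertisement and the three mov instructions forcing the 'z', that is, a jmp 0B51.
So launch your favorite hex editor, open zipcrack.com, search for 750780BEFEFD7A and replace 7507 with EB29.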
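
If you prefer a script to a hex editor, the same patch can be sketched in Python. The helper names rel8_target and patch_once are mine, not part of any tool. The first helper shows where a two-byte short jump lands: EB 29 placed at 0B26 goes to 0B28 + 0x29 = 0B51, right after the three mov instructions that force the 'z'.

```python
def rel8_target(addr, disp):
    """Target of a 2-byte short jump at addr with an 8-bit signed displacement."""
    if disp >= 0x80:
        disp -= 0x100          # displacement is signed
    return (addr + 2 + disp) & 0xFFFF

def patch_once(image, find, old, new):
    """Return a copy of image where `old` (the start of `find`) becomes `new`.

    Refuses to patch unless `find` occurs exactly once, to avoid
    corrupting an unrelated spot in the file.
    """
    if not find.startswith(old) or len(old) != len(new):
        raise ValueError("old must prefix find and match new in length")
    data = bytearray(image)
    pos = data.find(find)
    if pos < 0 or data.find(find, pos + 1) >= 0:
        raise ValueError("pattern not found exactly once")
    data[pos:pos + len(new)] = new
    return bytes(data)

FIND = bytes.fromhex("750780BEFEFD7A")   # jne 0B2F + cmp byte ptr [bp-0202],7A
OLD = bytes.fromhex("7507")              # jne 0B2F
NEW = bytes.fromhex("EB29")              # jmp short

print(hex(rel8_target(0x0B26, 0x07)))    # 0xb2f
print(hex(rel8_target(0x0B26, 0x29)))    # 0xb51
```

To use it, read zipcrack.com in binary mode, pass the bytes to patch_once(image, FIND, OLD, NEW) and write the result to a copy of the file rather than over the original.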

Let's verify: open a dos window and launch zipcrack with the right argument. I say yes, you say no, I say yes, you say no la la lala la la lala oops, it's not time for the Beatles yet! Go on, type *reverse and smash the enter key to see the result of your hard work...!!
NOTHING HAPPENS....maybe this program doesn't work well...but no, it works: if you test it on an archive crypted with a password beginning with 'z', it prints 'MATCH'.

So, where is the problem?

To catch it, we have to trace deeper into the program and have a look at the pkzip stream cipher implementation.
Reload the program in turbo debugger, put a breakpoint on the cmp and run it!
When the program breaks, go on tracing step by step through the code until you reach the call 0BBC.
Have you noticed that the address pushed through ax is the address of the string you entered + 1? Seems odd!
Go on tracing and step into the called function until you reach these lines:

cs:0BFA 66C706783F7A55+ mov dword ptr [3F78],EE1C557A
cs:0C03 66C7067C3F1094+ mov dword ptr [3F7C],D2149410
cs:0C0C 66C706803FC476+ mov dword ptr [3F80],98E676C4

Do you feel it? It looks like crypto initialization or something like that. At this point it is time to have a closer look at the pkzip stream cipher specification.
I suggest a paper published by +Tsehp called 'ZIP Attacks with Reduced Known-Plaintext'. Paragraph 1.1 of this paper describes the cipher and says that:

'...The internal state of the cipher consists of three 32-bit words: key0, key1, key2. These values are initialized to 0x12345678, 0x23456789, 0x34567890, respectively...'

It doesn't look like the instructions we found in the program!!!!

'..The cipher is keyed by encrypting the user password and throwing away the corresponding stream bytes. The stream bytes produced after this point are XORed with the plaintext to produce the ciphertext...'

The important thing to understand here is that a stream cipher has an internal state, defined here by key0, key1 and key2, and that in order to decrypt something we first have to update this state with the password bytes.
Are you curious? I think so! What is the state of the cipher after it has processed the character 'z'? Good question, isn't it?
The answer? Let's compute it together..first the initialization:
Key(0) = 0x12345678
Key(1) = 0x23456789
Key(2) = 0x34567890

compute PKZIP_stream_byte('z') like this:

key(0) = crc32(key(0) , 0x7A) where
crc32(crc,b)=((crc >> 8) ^ crctab[(crc & 0xFF) ^ b])

I've searched some pkzip source code and found the crc32 table used; find it yourself or trust the values I give you below!
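
If you don't want to take my word for the table values, the table can be regenerated. Here is a minimal Python sketch, assuming the usual reflected CRC-32 polynomial 0xEDB88320 that zip tools use; make_crctab is just my name for the helper, and crc32 is the one-byte step written out above.

```python
def make_crctab():
    """Build the 256-entry CRC-32 table (reflected polynomial 0xEDB88320)."""
    tab = []
    for n in range(256):
        c = n
        for _ in range(8):
            # shift right, folding in the polynomial when a 1 bit falls out
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        tab.append(c)
    return tab

crctab = make_crctab()

def crc32(crc, b):
    """One byte step: (crc >> 8) ^ crctab[(crc & 0xFF) ^ b]."""
    return (crc >> 8) ^ crctab[(crc & 0xFF) ^ b]

print(f"crctab[0x02] = 0x{crctab[0x02]:08X}")  # crctab[0x02] = 0xEE0E612C
print(f"crctab[0x42] = 0x{crctab[0x42]:08X}")  # crctab[0x42] = 0x98D220BC
```

These are the only two entries needed for the computation that follows.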

key(0) = 0x00123456 ^ crctab[0x78 ^ 0x7A] = 0x00123456 ^ crctab[0x02] = 0x00123456 ^ 0xEE0E612C =
0xEE1C557A looks pretty good, no?

key(1) = (key(1) + (key(0) & 0xFF)) * 0x08088405 + 1 = (0x23456789 + 0x7A) * 0x08088405 + 1 (keeping only the low 32 bits) =
0xD2149410 good! let's go on...

key(2) = crc32(key(2), key(1) >> 24) = 0x00345678 ^ crctab[0x90 ^ 0xD2] = 0x00345678 ^ crctab[0x42] = 0x00345678 ^ 0x98D220BC =
0x98E676C4
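
The three values can be double-checked in a few lines of Python. This sketch borrows zlib.crc32 for the one-byte CRC step (zlib inverts the register on entry and on exit, so both inversions are undone here); update_keys, key_state and the sample password 'zebra' are my own choices for illustration. It also shows why starting from a forwarded state is legitimate: keying 'z' + rest from the standard constants gives the same state as keying rest alone from the forwarded constants.

```python
import zlib

INIT = (0x12345678, 0x23456789, 0x34567890)

def crc32_step(crc, b):
    """crc32(crc,b) = (crc >> 8) ^ crctab[(crc & 0xFF) ^ b], done via zlib."""
    return zlib.crc32(bytes([b]), crc ^ 0xFFFFFFFF) ^ 0xFFFFFFFF

def update_keys(keys, b):
    """Advance (key0, key1, key2) by one password byte."""
    k0, k1, k2 = keys
    k0 = crc32_step(k0, b)
    k1 = ((k1 + (k0 & 0xFF)) * 0x08088405 + 1) & 0xFFFFFFFF  # low 32 bits only
    k2 = crc32_step(k2, k1 >> 24)
    return (k0, k1, k2)

def key_state(password, start=INIT):
    """State of the cipher after processing every byte of password."""
    keys = start
    for b in password:
        keys = update_keys(keys, b)
    return keys

forwarded = key_state(b"z")
print([f"{k:08X}" for k in forwarded])
# ['EE1C557A', 'D2149410', '98E676C4']

# 'z' + rest from the standard state == rest from the forwarded state
print(key_state(b"zebra") == key_state(b"ebra", start=forwarded))  # True
```

This is also why the program hands the search routine the address of the string + 1: the 'z' has already been consumed by the hard-coded constants.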

That's it!! Paul Kocher has 'forwarded' the cipher to the state it reaches after processing 'z'. All we have to do is move the cipher state back to the beginning.
To do this, we have to modify the three mov instructions so that they load the standard initialization values. But that's not all: we also have to correct the addresses pushed before the call so that they point to the beginning of the password string. That means: nop the inc ax instruction, change lea ax,[bp-0201] to lea ax,[bp-0202] and change lea ax,[bp-0101] to lea ax,[bp-0102].

I'll let you do this with a hex editor. Don't forget that intel computers are little-endian ;-) so you have to search for the bytes 7A 55 1C EE and replace them with 78 56 34 12, and so on for key1 and key2. Everything should work well after that.
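
To avoid byte-order slips, the search and replace patterns can be generated instead of typed by hand. A small Python sketch using the standard struct module; le32, disp16 and hexbytes are hypothetical helper names of mine, and the instruction bytes in the comments are read off the listings above.

```python
import struct

def le32(value):
    """32-bit immediate as it is stored in the file (little-endian)."""
    return struct.pack("<I", value)

def disp16(offset):
    """Signed 16-bit displacement of a [bp+offset] operand, little-endian."""
    return struct.pack("<h", offset)

def hexbytes(raw):
    return " ".join(f"{b:02X}" for b in raw)

# forwarded state found in the program -> standard initial state
KEY_PATCHES = [
    (0xEE1C557A, 0x12345678),  # key0, mov dword ptr [3F78]
    (0xD2149410, 0x23456789),  # key1, mov dword ptr [3F7C]
    (0x98E676C4, 0x34567890),  # key2, mov dword ptr [3F80]
]
for old, new in KEY_PATCHES:
    print(hexbytes(le32(old)), "->", hexbytes(le32(new)))

# lea ax,[bp-0201] / [bp-0101] is 8D 86 followed by the displacement
for old, new in [(-0x0201, -0x0202), (-0x0101, -0x0102)]:
    print("8D 86", hexbytes(disp16(old)), "->", "8D 86", hexbytes(disp16(new)))

# inc ax is the single byte 40, nop is 90
print("40 -> 90")
```

The lea and inc ax patterns are short and may occur elsewhere in the file, so patch only the occurrences at the offsets shown in the listings.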

That's all folks!

Additional readings:

APPNOTE.TXT, the official pkzip format specification; find the latest version on the pkware site.
To learn more about stream ciphers I suggest this link..

Errors, comments, suggestions ?

write me!