***** ***** by crUsAdEr ***** ****
Published by +Tsehp 2002
This tutorial is mainly about the internal workings of AsProtect, rather than just unpacking it. So if you only want to unpack it and don't want to spend time on reverse engineering, forget the second part of this tutorial!
TOOLS used :
IDA 4.15
Soft Ice on Win2k
LordPE
Revirgin (for unpacking only)
WinHex (for unpacking only)
Targets : ReGet Deluxe 3.0 beta (build 117) (but I think any program protected with the same version of AsProtect will do)
Included : asprotect.dll
1. Unpacking AsProtect
This gets boring after a while, so here I only summarise the steps you should take when unpacking an AsProtected program (please read Spl/\j's tutorial on commview, not much has changed since!)
- Run the program.
- Run WinHex, open the program's memory space and search for the AsProtect signature bytes "61 FF E0".
- If you are on Win98, run super bpm. (Win2k users can use Solomon's trick of "bpx 80464C50" to prevent AsProtect from clearing your bpm.)
- Close the program, put a bpx on GetVersion and run the program again. Sice should break; press F12 and you should be in the AsProtect code. Trace with F8 and take note of where the result of GetVersion is stored in memory. AsProtect stores this result and uses it later to emulate the API. The other emulated APIs are GetModuleHandleA, GetCurrentProcess, GetCurrentProcessId and GetCommandLineA, so trace and watch where their results are stored. Write down these memory addresses. Tip : trace with F8.
- Do a bpm on the address of the "61 FF E0" you found earlier.
- Let the program run and sice will break again at "popad ; jmp eax", where eax is your OEP. Dump here.
- Run revirgin and let it resolve the IAT (read the revirgin manual on how to do this).
- There are a few missing APIs which are either emulated (remember the addresses we wrote down earlier?) or redirected. You need to fix these manually, and they should be dealt with case by case (again, read Spl/\j's tutorial).
- Sometimes AsProtect dips inside the program code before hitting the OEP to trick /tracex, set various flags and encrypt/decrypt some code, but these should be dealt with individually.
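The signature search from the steps above can be sketched in a few lines of Python. This is only an illustration, assuming you have saved the process memory to a file first; the function name find_signature is made up for the example, and WinHex does the same job interactively.

```python
def find_signature(dump, signature=bytes.fromhex("61FFE0")):
    """Return every offset in `dump` where the byte pattern occurs.

    61 = popad, FF E0 = jmp eax, the sequence used to reach the OEP.
    """
    offsets = []
    pos = dump.find(signature)
    while pos != -1:
        offsets.append(pos)
        # Continue one byte further so overlapping hits are not missed.
        pos = dump.find(signature, pos + 1)
    return offsets
```

Add the base address of the dumped region to each offset to get the address for the bpm.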
Phew, done with the boring part. Now, if you are interested in how AsProtect really works and are willing to work on your own to learn the art of reverse engineering, then read on. The next few sections discuss how AsProtect decrypts the program, how APIs are really emulated, how the APIs are mangled, how the dippings are done etc... and of course how your bpm are cleared!
2. AsProtect library
Yep, you are right! AsProtect has a dll that is used to perform all of its tasks of decrypting and loading the target. This dll is decrypted at runtime, so this section discusses how to obtain the dll from memory, rebuild it and use it to study AsProtect.
Use IDA to disassemble the program; you should be able to overcome most of the obfuscation code (grab IDA tutorials or read the IDA manual). My approach is to disassemble and debug bit by bit in parallel.
Trace with softice from the beginning of the protected program and you will soon be brought to the AsProtect code in the last section of the program. F8 a few steps and you will be in the first decryption loop :
0067B083 ; ----------------------------------------------------------------
0067B083
0067B083 loc_67B083:                         ; CODE XREF: 0067B06Fp
0067B083 8A DD               mov   bl, ch
0067B085 5E                  pop   esi               ; esi := 67B074
0067B086 8A C3               mov   al, bl
0067B088 81 C6 C2 07 00 00   add   esi, 7C2h         ; esi := 67B836
0067B08E 56                  push  esi
0067B08F 8A E2               mov   ah, dl
0067B091 5B                  pop   ebx               ; ebx := 67B836
0067B092 68 B5 01 00 00      push  1B5h
0067B097 59                  pop   ecx               ; ecx := 1B5
0067B097                                             ; number of times the loop is performed
0067B097                                             ; or number of dwords to be decrypted
0067B098
0067B098 loop1:                              ; CODE XREF: 0067B154j
0067B098 FF 36               push  dword ptr [esi]
0067B09A 66 B8 8B 7F         mov   ax, 7F8Bh
0067B09E 5A                  pop   edx               ; edx := [esi]
0067B09F 0F 89 0C 00 00 00   jns   loc_67B0B1
0067B09F ; ----------------------------------------------------------------
0067B0A5 0F                  db 0Fh
         <----useless code---->
0067B0B0 00                  db 0
0067B0B1 ; ----------------------------------------------------------------
0067B0B1
0067B0B1 loc_67B0B1:                         ; CODE XREF: 0067B09Fj
0067B0B1 81 C2 51 CF 9C 42   add   edx, 429CCF51h    ; add edx
0067B0B7 66 BF 0A 05         mov   di, 50Ah
0067B0BB 81 F2 B6 23 95 0D   xor   edx, 0D9523B6h    ; xor edx
0067B0C1 80 CB 2D            or    bl, 2Dh
0067B0C4 81 EA B7 D8 25 0E   sub   edx, 0E25D8B7h    ; sub edx
0067B0CA 68 B0 D5 A1 60      push  60A1D5B0h
0067B0CF E8 14 00 00 00      call  sub_67B0E8
0067B0CF ; ----------------------------------------------------------------
0067B0D4 DC                  db 0DCh ; _
         <----useless code---->
0067B0E7 5B                  db 5Bh ; [
0067B0E8
0067B0E8 ; =============== S U B R O U T I N E ===============
0067B0E8
0067B0E8
0067B0E8 sub_67B0E8 proc near                ; CODE XREF: 0067B0CFp
0067B0E8 66 BF A4 D9         mov   di, 0D9A4h
0067B0EC 58                  pop   eax               ; eax := 67B0D4
0067B0ED 5F                  pop   edi
0067B0EE 89 16               mov   [esi], edx        ; store back edx into [esi]
0067B0F0 80 F7 27            xor   bh, 27h
0067B0F3 81 EE 7D 99 CB 4F   sub   esi, 4FCB997Dh
0067B0F9 66 8B D9            mov   bx, cx
0067B0FC 81 C6 79 99 CB 4F   add   esi, 4FCB9979h    ; sub esi, 4
0067B102 66 B8 35 7B         mov   ax, 7B35h
0067B106 49                  dec   ecx               ; decrease counter
0067B107 0F 85 22 00 00 00   jnz   continue_decrypt
0067B10D 0F 8A 06 00 00 00   jp    loc_67B119
0067B10D ; --------------------------------------------------------------
0067B113 81                  db 81h ; ü
         <----useless code---->
0067B118 74                  db 74h ; t
0067B119 ; --------------------------------------------------------------
0067B119
0067B119 loc_67B119:                         ; CODE XREF: sub_67B0E8+25j
0067B119                                     ; 0067B190j
0067B119 E9 48 00 00 00      jmp   loc_67B166
0067B119 ; --------------------------------------------------------------
0067B11E A5                  db 0A5h ; Ñ
         <----useless code---->
0067B12E 15                  db 15h
0067B12F ; --------------------------------------------------------------
0067B12F
0067B12F continue_decrypt:                   ; CODE XREF: sub_67B0E8+1Fj
0067B12F E8 0D 00 00 00      call  loc_67B141
0067B12F ; --------------------------------------------------------------
0067B134 91                  db 91h ; æ
         <----useless code---->
0067B140 85                  db 85h ; à
0067B140 sub_67B0E8 endp
0067B140
0067B141 ; --------------------------------------------------------------
0067B141
0067B141 loc_67B141:                         ; CODE XREF: sub_67B0E8+47p
0067B141 E9 0D 00 00 00      jmp   loc_67B153
0067B141 ; --------------------------------------------------------------
0067B146 01                  db 1
         <----useless code---->
0067B152 F5                  db 0F5h ; )
0067B153 ; --------------------------------------------------------------
0067B153
0067B153 loc_67B153:                         ; CODE XREF: 0067B141j
0067B153 5F                  pop   edi               ; edi := 67B134
0067B154 E9 3F FF FF FF      jmp   loop1
Lots of obfuscation code, but I list it here once so that hopefully you will get accustomed to it. There is more to come!
I hope the dead listing with comments is good enough to understand. As you can see, AsProtect is decrypting something, and that something happens to be the VERY next block of code. Do trace through it once or twice and you will get the hang of it. This is important as it will help you to understand the AsProtect structure better as you go on. As you can see, the key is to use IDA to disassemble at the right place and ignore the obfuscation code. Also, just a personal opinion: comment like mad. I commented everything I saw, even though sometimes I didn't know what it was; I simply rename those offsets "some_shit", so the next time you see "some_shit" you'll know that this variable has been accessed before, and it helps....
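Stripped of the junk instructions, the first loop boils down to three operations per dword (add, xor, sub), walking backwards from 67B836 for 1B5h dwords. Here is a minimal Python sketch of that, using the constants from the listing above. The function names are made up for the example, and encrypt_dword is just the arithmetic inverse added so the transform can be checked; it is not taken from AsProtect.

```python
import struct

MASK = 0xFFFFFFFF  # keep every result inside 32 bits, like the registers do


def decrypt_dword(value):
    """add edx, 429CCF51h / xor edx, 0D9523B6h / sub edx, 0E25D8B7h"""
    value = (value + 0x429CCF51) & MASK
    value ^= 0x0D9523B6
    return (value - 0x0E25D8B7) & MASK


def encrypt_dword(value):
    """Inverse of decrypt_dword (illustration only, not from the listing)."""
    value = (value + 0x0E25D8B7) & MASK
    value ^= 0x0D9523B6
    return (value - 0x429CCF51) & MASK


def decrypt_block(data, count=0x1B5):
    """Decrypt `count` little-endian dwords, last dword first.

    `data` is assumed to start at the lowest address of the block, so the
    loop begins at offset (count - 1) * 4 and moves down by 4 each pass,
    mirroring esi in the listing.
    """
    out = bytearray(data)
    offset = (count - 1) * 4
    for _ in range(count):
        (dword,) = struct.unpack_from("<I", out, offset)
        struct.pack_into("<I", out, offset, decrypt_dword(dword))
        offset -= 4
    return bytes(out)
```

Each dword is transformed independently, so the backwards direction does not change the result; it is kept only to stay close to the original loop.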
Once you have understood how this loop works, bpx on the exit of the loop and you will soon see the next loop, with the same algorithm but a different key and size, used to decrypt the block after itself. This decryption is repeated a few times (I think 4), and then a block of data is copied to high memory and decrypted there (our dll). It is also quite interesting to watch how AsProtect searches for its imports, namely GetProcAddress, GetModuleHandleA, LoadLibraryA, VirtualAlloc and VirtualFree, by scanning the export directory of kernel32.dll instead of using the pre-loaded IAT.
Once the dll was loaded into some high memory, I made a dump, attached it to the end of the program and adjusted the section headers so that the virtual address is the same. At first glance, it looks like just a data block with some code in it, but once I started tracing this code I saw something fishy. Here comes the OS loader :
(I removed the relocation code as it is too long to list here. Only the import loading and OEP calculation are listed.)
00A4A488 loc_A4A488:
00A4A488                      mov   esi, dword ptr ss:unk_442A61[ebp]   ; esi = [A4A11D]
00A4A48E 8B 95 D8 30 44 00    mov   edx, dword ptr ss:unk_4430D8[ebp]   ; add image base
00A4A494 03 F2                add   esi, edx               ; esi now points to Import Directory
00A4A496
00A4A496 load_next_dl_import:
00A4A496 8B 46 0C             mov   eax, [esi+0Ch]         ; get dll name offset
00A4A499 85 C0                test  eax, eax
00A4A49B 0F 84 0A 01 00 00    jz    finish_import_loading  ; eax := [A4A121]
00A4A4A1 03 C2                add   eax, edx               ; add image base
00A4A4A3 8B D8                mov   ebx, eax
00A4A4A5 50                   push  eax
00A4A4A6 FF 95 EC 31 44 00    call  dword ptr ss:unk_4431EC[ebp]        ; GetModuleHandleA
00A4A4AC 85 C0                test  eax, eax
00A4A4AE 75 07                jnz   short library_loaded
00A4A4B0 53                   push  ebx
00A4A4B1 FF 95 F0 31 44 00    call  dword ptr ss:unk_4431F0[ebp]        ; LoadLibraryA
00A4A4B7
00A4A4B7 library_loaded:
00A4A4B7 89 85 4D 29 44 00    mov   dword ptr ss:unk_44294D[ebp], eax
00A4A4BD C7 85 51 29 44 00+   mov   dword ptr ss:unk_442951[ebp], 0     ; initialise Import Counter
00A4A4C7
00A4A4C7 next_first_thunk_entry:
00A4A4C7 8B 95 D8 30 44 00    mov   edx, dword ptr ss:unk_4430D8[ebp]   ; get image base
00A4A4CD 8B 06                mov   eax, [esi]             ; check Original_First_Thunk
00A4A4CF 85 C0                test  eax, eax
00A4A4D1 75 03                jnz   short original_first_thunk_found
00A4A4D3 8B 46 10             mov   eax, [esi+10h]         ; get first thunk offset
00A4A4D6
00A4A4D6 original_first_thunk_found:          ; CODE XREF: 00A4A4D1j
00A4A4D6 03 C2                add   eax, edx               ; add image base
00A4A4D8 03 85 51 29 44 00    add   eax, dword ptr ss:unk_442951[ebp]   ; add counter
00A4A4DE 8B 18                mov   ebx, [eax]             ; get Import ASCII
00A4A4E0 8B 7E 10             mov   edi, [esi+10h]
00A4A4E3 03 FA                add   edi, edx               ; edi => first thunk
00A4A4E5 03 BD 51 29 44 00    add   edi, dword ptr ss:unk_442951[ebp]   ; add counter
00A4A4EB 85 DB                test  ebx, ebx
00A4A4ED 0F 84 A2 00 00 00    jz    dll_done
00A4A4F3 F7 C3 00 00 00 80    test  ebx, 80000000h         ; import by ordinal?
00A4A4F9 75 04                jnz   short loc_A4A4FF
00A4A4FB 03 DA                add   ebx, edx               ; add image base
00A4A4FD 43                   inc   ebx
00A4A4FE 43                   inc   ebx                    ; add 2 to point to API ASCII
00A4A4FF
00A4A4FF loc_A4A4FF:                          ; CODE XREF: 00A4A4F9j
00A4A4FF 53                   push  ebx
00A4A500 81 E3 FF FF FF 7F    and   ebx, 7FFFFFFFh
00A4A506 53                   push  ebx
00A4A507 FF B5 4D 29 44 00    push  dword ptr ss:unk_44294D[ebp]        ; modulehandle
00A4A50D FF 95 E8 31 44 00    call  dword ptr ss:unk_4431E8[ebp]        ; get Proc address
00A4A513 85 C0                test  eax, eax
00A4A515 5B                   pop   ebx
00A4A516 75 6F                jnz   short API_add_found
00A4A518 F7 C3 00 00 00 80    test  ebx, 80000000h         ; import by ordinal ?
00A4A51E 75 19                jnz   short loc_A4A539
00A4A520 57                   push  edi
00A4A521 8B 46 0C             mov   eax, [esi+0Ch]
00A4A524 03 85 D8 30 44 00    add   eax, dword ptr ss:unk_4430D8[ebp]
00A4A52A 50                   push  eax
00A4A52B 53                   push  ebx
00A4A52C 8D 85 53 31 44 00    lea   eax, unk_443153[ebp]
00A4A532 50                   push  eax
00A4A533 57                   push  edi
00A4A534 E9 99 00 00 00       jmp   loc_A4A5D2
00A4A539 ; -------------------------------------------------------------
00A4A539
00A4A539 loc_A4A539:                          ; CODE XREF: 00A4A51Ej
00A4A539 81 E3 FF FF FF 7F    and   ebx, 7FFFFFFFh
00A4A53F 8B 85 DC 30 44 00    mov   eax, dword ptr ss:unk_4430DC[ebp]
00A4A545 39 85 4D 29 44 00    cmp   dword ptr ss:unk_44294D[ebp], eax
00A4A54B 75 24                jnz   short loc_A4A571
00A4A54D 57                   push  edi
00A4A54E 8B D3                mov   edx, ebx
00A4A550 4A                   dec   edx
00A4A551 C1 E2 02             shl   edx, 2
00A4A554 8B 9D 4D 29 44 00    mov   ebx, dword ptr ss:unk_44294D[ebp]
00A4A55A 8B 7B 3C             mov   edi, [ebx+3Ch]
00A4A55D 8B 7C 3B 78          mov   edi, [ebx+edi+78h]
00A4A561 03 5C 3B 1C          add   ebx, [ebx+edi+1Ch]
00A4A565 8B 04 13             mov   eax, [ebx+edx]
00A4A568 03 85 4D 29 44 00    add   eax, dword ptr ss:unk_44294D[ebp]
00A4A56E 5F                   pop   edi
00A4A56F EB 16                jmp   short API_add_found
00A4A571 ; -------------------------------------------------------------
00A4A571
00A4A571 loc_A4A571:                          ; CODE XREF: 00A4A54Bj
00A4A571 57                   push  edi
00A4A572 8B 46 0C             mov   eax, [esi+0Ch]
00A4A575 03 85 D8 30 44 00    add   eax, dword ptr ss:unk_4430D8[ebp]
00A4A57B 50                   push  eax
00A4A57C 53                   push  ebx
00A4A57D 8D 85 A4 31 44 00    lea   eax, unk_4431A4[ebp]
00A4A583 50                   push  eax
00A4A584 57                   push  edi
00A4A585 EB 4B                jmp   short loc_A4A5D2
00A4A587 ; -------------------------------------------------------------
00A4A587
00A4A587 API_add_found:                       ; CODE XREF: 00A4A516j
00A4A587                                      ; 00A4A56Fj
00A4A587 89 07                mov   [edi], eax             ; update first thunk
00A4A589 83 85 51 29 44 00+   add   dword ptr ss:unk_442951[ebp], 4
00A4A590 E9 32 FF FF FF       jmp   next_first_thunk_entry ; get image base
00A4A595 ; -------------------------------------------------------------
00A4A595
00A4A595 dll_done:                            ; CODE XREF: 00A4A4EDj
00A4A595 89 06                mov   [esi], eax             ; clear Import Directory Entry
00A4A597 89 46 0C             mov   [esi+0Ch], eax
00A4A59A 89 46 10             mov   [esi+10h], eax
00A4A59D 83 C6 14             add   esi, 14h               ; next Import Directories Entry
00A4A5A0 8B 95 D8 30 44 00    mov   edx, dword ptr ss:unk_4430D8[ebp]
00A4A5A6 E9 EB FE FF FF       jmp   load_next_dl_import    ; get dll name offset
00A4A5AB ; -------------------------------------------------------------
00A4A5AB
00A4A5AB finish_import_loading:               ; CODE XREF: 00A4A49Bj
00A4A5AB 8B 85 65 2A 44 00    mov   eax, dword ptr ss:unk_442A65[ebp]   ; eax := [A4A121]
00A4A5B1 50                   push  eax
00A4A5B2 03 85 D8 30 44 00    add   eax, dword ptr ss:unk_4430D8[ebp]   ; add image base
00A4A5B8 5B                   pop   ebx
00A4A5B9 0B DB                or    ebx, ebx
00A4A5BB 89 85 11 2F 44 00    mov   dword ptr ss:unk_442F11[ebp], eax   ; update instruction at A4A5CC
00A4A5C1 61                   popa
00A4A5C2 75 08                jnz   short OEP_OK
00A4A5C4 B8 01 00 00 00       mov   eax, 1
00A4A5C9 C2 0C 00             retn  0Ch
00A4A5CC ; -------------------------------------------------------------
00A4A5CC OEP_OK:                              ; CODE XREF: 00A4A5C2j
00A4A5CC 68 00 00 00 00       push  0                      ; this will be changed to push OEP
00A4A5D1 C3                   retn                         ; go to OEP of the Delphi prog
00A4A5D2 ; -------------------------------------------------------------
I know it is a long dead listing, but do read through it, or at least the comments, as I think I wrote enough of them for readers to get an idea of what is happening. I have removed most of the code (the decompression and relocation code). Basically, it decompresses the main block, then performs what seems to me a relocation of an image file, then loads the imports and finally gets the OEP and jumps to it. Hence I suspected this was either a dll or an exe file (packed with AsPack?) or something. So I decided to dump this image (before all the reloc and IAT loading is done). It was hard to build the file as the header is completely ripped off, and it took me some time to get it right so that IDA would disassemble it. First I noticed that it has 6 sections (most if not all Delphi apps have 6 sections), and then I remembered that AsProtect likes Delphi very much! Thus I opened some Delphi exe, ripped the header completely, pasted it into this dump, and fixed the section info, the reloc info and the import info. (If you are not familiar with the PE header, read some of the excellent PE tutorials by Iczelion and others at krobar's site.)
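For readers who find the import loading part of the listing hard to follow, here is a rough Python sketch of the same walk over the import descriptors. It assumes the image is held in a buffer where an RVA can be used directly as an offset (as it is in a memory dump), and the names walk_imports, u32 and cstring are made up for the example. It only lists what would be resolved; it does not call GetProcAddress or write the first thunk.

```python
import struct

ORDINAL_FLAG = 0x80000000   # high bit set => import by ordinal
DESCRIPTOR_SIZE = 0x14      # one import descriptor, same as "add esi, 14h"


def u32(image, rva):
    return struct.unpack_from("<I", image, rva)[0]


def cstring(image, rva):
    end = image.index(b"\x00", rva)
    return bytes(image[rva:end]).decode("ascii")


def walk_imports(image, import_dir_rva):
    """Return (dll name, first-thunk RVA, API name or ordinal) tuples."""
    found = []
    desc = import_dir_rva
    while True:
        name_rva = u32(image, desc + 0x0C)          # dll name offset
        if name_rva == 0:                           # empty descriptor: done
            break
        dll = cstring(image, name_rva)
        first_thunk = u32(image, desc + 0x10)
        # Use OriginalFirstThunk when present, else fall back to FirstThunk.
        lookup = u32(image, desc) or first_thunk
        counter = 0                                 # the "Import Counter"
        while True:
            entry = u32(image, lookup + counter)
            if entry == 0:                          # end of this dll's list
                break
            if entry & ORDINAL_FLAG:
                target = entry & 0x7FFFFFFF         # ordinal number
            else:
                target = cstring(image, entry + 2)  # skip the 2-byte hint
            found.append((dll, first_thunk + counter, target))
            counter += 4
        desc += DESCRIPTOR_SIZE
    return found
```

The real loader also zeroes the descriptor fields once a dll is done, which is one reason the import table in a late dump is useless.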
Finally, I got IDA to disassemble the file nicely and discovered that it was a console dll written in Delphi. IDA (great tool!! Really helpful here) can actually apply its FLIRT feature to detect lots of Delphi functions, and that saved me lots of time. I must say that if IDA had not been able to disassemble the dll properly, this tutorial would not have been done so easily. I attach the dll here so you can disassemble and study it yourself. You should also try to dump and rebuild one from any AsProtected program as practice.
Yep, once you obtain the dll, you will realise that all of AsProtect's mysteries are inside that dll... its seh clearing the debug registers, its crc check, IAT mangling etc... Here comes the fun!!!
3. AsProtect.dll seh tricks
Here is how a typical seh is set up in AsProtect, and you should know that seh is used no less than 30 times in this dll. It used to stop newbies like me from tracing AsProtect code, but not anymore once you understand what is going on. Before you read this, please read some essential information about seh (I suggest Jeremy Gordon's excellent paper on seh).
004106B2 E8 49 C6 FF FF        call  Clear_API_emu_code       ; upper limit
004106B7 E8 25 00 00 00        call  set_seh_3
004106BC
004106BC seh_3_handler:
004106BC 8B 44 24 0C           mov   eax, [esp+0Ch]           ; get context to eax
004106C0 83 80 B8 00 00 00+    add   dword ptr [eax+0B8h], 2  ; add context.eip by 2
004106C7 51                    push  ecx
004106C8 31 C9                 xor   ecx, ecx
004106CA 89 48 04              mov   [eax+4], ecx             ; clear debug register 0
004106CD 89 48 08              mov   [eax+8], ecx             ; clear debug register 1
004106D0 89 48 0C              mov   [eax+0Ch], ecx           ; clear debug register 2
004106D3 89 48 10              mov   [eax+10h], ecx           ; clear debug register 3
004106D6 C7 40 18 55 01 00+    mov   dword ptr [eax+18h], 155h ; context.dr7 := 155
004106DD 59                    pop   ecx
004106DE 31 C0                 xor   eax, eax                 ; exception handled, continue
004106E0 C3                    retn
004106E0 set_seh_1 endp ; sp = 4
004106E1
004106E1 ; =============== S U B R O U T I N E ===============
004106E1
004106E1 set_seh_3 proc near                  ; CODE XREF: set_seh_1+25p
004106E1 31 C0                 xor   eax, eax
004106E3 64 FF 30              push  dword ptr fs:[eax]       ; set up seh 3
004106E6 64 89 20              mov   fs:[eax], esp
004106E9 31 00                 xor   [eax], eax               ; cause seh 3
004106E9                                                      ; this seh clear debug register
004106EB 64 8F 05 00 00 00+    pop   large dword ptr fs:0     ; remove seh struc
004106F2 58                    pop   eax
004106F3 E8 CC 1F FF FF        call  @System@Randomize$qqrv   ; System::Randomize(void)
The seh is set up in a slightly different way which makes it harder to detect but with IDA everything becomes very clear!
004106B7 call set_seh_3 ; this is equivalent to push handler and move eip to 4106E1 same as
; push 4106BC (our seh handler)
004106E1 xor eax, eax
004106E3 push dword ptr fs:[eax] ; set up seh 3
004106E6 mov fs:[eax], esp
004106E9 xor [eax], eax ; cause seh 3
004106E9 ; this seh clear debug register
004106EB pop large dword ptr fs:0 ; remove seh struc
004106F2 pop eax
Thus when we trace over 4106E9, the seh is triggered and our context is retrieved into eax, where eip is increased by 2 to point to the next working instruction, and the debug registers are cleared along the way. (This was posted by R!sc before.)
This is interesting indeed! For example, when you are tracing and you are at 4106B7, if you trace into it with F8 you will meet the faulty instruction at 4106E9 and be lost in the kernel seh code! If you trace over it with F10, sice will place a breakpoint at the next instruction, which happens to be the seh handler, and you will soon meet the "ret" at 4106E0 and again be lost in the kernel seh code! Now that we know how this whole seh scheme works, we can trace anywhere we want: simply do a "r eip eip+2" at the faulty instruction and the seh handler will be skipped altogether!
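To make the effect of the handler concrete, here is a small Python model of what it does to the saved thread context. The offsets (4 to 10h for the debug registers, 18h for dr7, 0B8h for eip) are the ones used in the listing; the buffer size and the function name are assumptions made for the example, and this of course only models the writes, it is not a real exception handler.

```python
import struct

CONTEXT_SIZE = 0xCC                      # assumed: big enough for the fields below
OFF_DR0, OFF_DR1, OFF_DR2, OFF_DR3 = 0x04, 0x08, 0x0C, 0x10
OFF_DR7 = 0x18
OFF_EIP = 0xB8


def seh_3_handler(context):
    """Apply the same edits as seh_3_handler to a CONTEXT-like bytearray."""
    (eip,) = struct.unpack_from("<I", context, OFF_EIP)
    # Skip the 2-byte faulting instruction "xor [eax], eax".
    struct.pack_into("<I", context, OFF_EIP, (eip + 2) & 0xFFFFFFFF)
    # Wipe the four hardware breakpoint addresses.
    for offset in (OFF_DR0, OFF_DR1, OFF_DR2, OFF_DR3):
        struct.pack_into("<I", context, offset, 0)
    struct.pack_into("<I", context, OFF_DR7, 0x155)
    return 0                             # eax = 0: exception handled, continue
```

Doing "r eip eip+2" by hand performs only the first of these edits, which is exactly why your bpm survive when you skip the handler.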
Analysing the dll is slightly harder as it is a long, full-blown dll with a hell of a lot of seh and bloated Delphi code, but IDA eases the job quite a bit. You should try to analyse this slowly: always trace over a call and guess what it does first before actually stepping inside it.
4. AsProtect.dll internal
OK, I am not going to discuss the whole dll here; that would take too long, and it is pointless to paste long Delphi code. I will just roughly describe its structure. Here is the skeleton of the dll :
00410C32 68 4C E7 40 00   push  offset self_hash
00410C37 68 5C 0D 41 00   push  offset Dips_DriveHash_Date_Registry_CodeDecrypt
00410C3C 68 90 02 41 00   push  offset nothing_important     ; nothing really important here
00410C3C                                                     ; just some file header checking
00410C41 68 44 FF 40 00   push  offset complete_API_mangling
00410C46 68 1C F9 40 00   push  offset Fully_decrypt_file    ; decrypt the file fully,
00410C46                                                     ; IAT is untouched.. virgin!!!!!
00410C46                                                     ; import ASCII stripped but
00410C46                                                     ; hint are still there
00410C4B 68 10 F3 40 00   push  offset Decrypt_file_first_time
00410C50 68 2C 06 41 00   push  offset emu_API_n_file_hash
00410C55 C3               retn
Yep, so AsProtect will execute the API emulation routine (emu_API_n_file_hash) first, and then on return it will jump to the next routine, Decrypt_file_first_time, and so on down the list. API emulation is pretty easy to understand, especially if you have unpacked a few AsProtected programs. AsProtect also opens the exe and hashes the whole file as a CRC check. You should also read the threads by me, Mike and Dakien at Fravia's Board (crypto forum, "stream cipher??" and "Tutorial: finding encryption code") on the encryption routines used. You will find that in fact all the main decryption routines throughout this dll are similar to that one, with MD5 heavily used!
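If the push/ret trick is new to you, this toy Python model shows why the routines run in the reverse of the order in which they are pushed. It is only a simulation with a Python list standing in for the stack; run_push_ret_chain is a made-up name and the routines are plain callables.

```python
def run_push_ret_chain(routines):
    """`routines` is given in the order they should execute."""
    stack = []
    # The dll pushes the LAST routine first, so the first one ends up on top.
    for routine in reversed(routines):
        stack.append(routine)
    results = []
    while stack:
        # Each "retn" pops the next address off the stack and jumps to it.
        routine = stack.pop()
        results.append(routine())
    return results
```

With the seven routines of the skeleton, the pops yield emu_API_n_file_hash first and self_hash last.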
The third routine decrypts the exe fully and performs relocation on the code section if required. Here you will see how the call to the mysterious "ret" at 401014 (which appears in all AsProtected programs) is made, I guess to test whether the code section is executable.
It should be noted that after the 3rd routine we can actually dump the file and obtain a nice clean dump with the IAT first thunk intact. The import ASCII is stripped together with the dll names, but the hints are still there, so we can actually rebuild the program from there!
Disassembling the API mangling routine, we'll see exactly how it decrypts the import information, classifies the imports into different categories (6 if I am not wrong) and treats each of them differently. Again, if you have unpacked AsProtect you will know these routines, or at least have an idea of how imports are treated differently by AsProtect. I will not post the full code here as the offsets given above should be enough for you to find where the routines are.
00410033 A1 A4 49 41 00      mov   eax, ds:Encrypted_data_offset
00410038 8B 55 F0            mov   edx, [ebp-10h]
0041003B 89 42 04            mov   [edx+4], eax
0041003E 8B 15 94 32 41 00   mov   edx, ds:sign_13_import_data
00410044 8B 45 F0            mov   eax, [ebp-10h]
00410047 E8 A4 D0 FF FF      call  Decrypt_n_Find        ; eax points to mem_struc
00410047                                                 ; mem_struc+4 : original data
00410047                                                 ; mem_struc+8 : decrypted data
00410047                                                 ; it also traverse the decrypted data
00410047                                                 ; block to search for data block with
00410047                                                 ; signature in edx
0041004C 8B D8               mov   ebx, eax
0041004E 85 DB               test  ebx, ebx
00410050 74 1B               jz    short import_data_not_found
00410052 8B 43 04            mov   eax, [ebx+4]
00410055 E8 3A 25 FF FF      call  @System@@GetMem$qqrv  ; size in eax
0041005A 89 45 FC            mov   [ebp-4], eax          ; [ebp-4] points to import data
The call at 410047 decrypts a block of data in the exe containing all the information about the protected program, like the import data, the dips to be performed, various hash results and decryption keys, and each of these pieces of data is assigned a signature tag. In this case the signature tag was named "sign_13_import_data" by me. The procedure searches for the signature given in edx and returns eax pointing to the desired block of data. Rename this procedure and you will see that it is called everywhere in the dll!
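The search half of Decrypt_n_Find can be pictured as a walk over tagged blocks. The Python sketch below assumes each block is laid out as a signature dword, a length dword and then the payload, with the length counting only the payload bytes. That last detail is a guess made for the example (the listing only shows that the length sits at block+4), and find_block is a made-up name.

```python
import struct


def find_block(decrypted, signature):
    """Return (offset, payload) of the first block tagged `signature`, or None.

    Assumed layout per block: [signature dword][length dword][payload].
    """
    offset = 0
    while offset + 8 <= len(decrypted):
        sig, length = struct.unpack_from("<II", decrypted, offset)
        if sig == signature:
            return offset, decrypted[offset + 8:offset + 8 + length]
        offset += 8 + length        # jump over this block to the next tag
    return None
```

For instance, looking up a hypothetical tag value 13h in a buffer holding two blocks skips the first block and returns the second.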
Look into the second last routine and we'll see how the pre-OEP dipping is done. Using the call Decrypt_n_Find above, it searches the large chunk of decrypted data for data blocks with dip signatures. Each of these data blocks contains a dipping address in the main program's code section, and the signature represents the kind of duty the dip performs. Here are examples of the dips :
00411133 83 3D A4 45 41 00+   cmp   ds:dip_B_address, 0
0041113A 74 0B                jz    short loc_411147     ; dip_B : redirect TApplication$Initialise
0041113C 68 00 CE 40 00       push  offset @Forms@TApplication@Initialize$qqrv
00411141 FF 15 A4 45 41 00    call  ds:dip_B_address
00411147
00411147 loc_411147:                          ; CODE XREF: set_seh_2A+2C2j
00411147 C7 45 C0 F7 27 00+   mov   dword ptr [ebp-40h], 27F7h
0041114E 8D 85 B5 D7 FF FF    lea   eax, [ebp-284Bh]
OR here :
004111C9 83 3D 84 45 41 00+   cmp   ds:Dip_3_address, 0
004111D0 74 16                jz    short loc_4111E8
004111D2 68 54 CD 40 00       push  offset nullsub_3
004111D7 FF 15 84 45 41 00    call  ds:Dip_3_address     ; dip_3 : simple return
004111DD 68 54 CD 40 00       push  offset nullsub_3
004111E2 FF 15 88 45 41 00    call  ds:dip_4_address     ; dip_4 : simple return
004111E8
004111E8 loc_4111E8:                          ; CODE XREF: set_seh_2A+358j
004111E8 83 7D C0 00          cmp   dword ptr [ebp-40h], 0
004111EC 75 4C                jnz   short registry_data_found_already
As you can see, there are different types of dips, ranging from @Forms@TApplication@Initialize$qqrv (which is probably only applicable to Delphi apps) and decrypting parts of the code section to a simple "ret". I think there are about 11 types of dips, but some of them are very similar. ReGet only used 2 dips, to decrypt some parts of the code section, so I was not able to debug much of this D-D business and mainly analysed the dead listing. As you can see from the name of the routine, it uses your hard disk information as a hash key, stores the hash in the registry, accesses the system date etc... these dippings can get really wild :>
Finally, the last routine listed above is the self hash routine, which checks its own dll code for modifications: if you place a breakpoint somewhere there, the opcode is replaced by "CC", so AsProtect is able to detect it and exit. Here is one example :
0040E74C 68 54 C3 84 15      push  1584C354h      ; [esp] value
0040E751 68 AC 0F 00 00      push  0FACh          ; end of area to be hashed???
0040E756 68 9C D7 00 00      push  0D79Ch         ; start area to be hashed???
0040E75B 68 00 90 01 00      push  19000h         ; RVA of rsrc section (hash data)
0040E760 FF 35 14 40 41 00   push  ds:hInstance   ; base image of dll
0040E766 E8 31 E8 FF FF      call  self_hash
0040E76B 31 04 24            xor   [esp], eax     ; test hash ([esp] = 1584C354h)
0040E76E 8B 05 14 40 41 00   mov   eax, ds:hInstance
0040E774 01 04 24            add   [esp], eax     ; if hash is wrong, ret goes to wrong place
0040E774                                          ; seh is trigger and program quit
0040E777 C3                  retn                 ; go to 4117D4 if correct
I am not quite sure exactly how the parameters are used in the hash routine (too lazy to really trace into the routine in detail and figure everything out), but I do have a rough idea of what is going on. As you can see, the hash result is used to decide where the program goes next, so if you are tracing and the hash is wrong, you will not notice it at all and will continue tracing until you are caught in seh and the program exits!
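The arithmetic behind this check is simple enough to replay in Python: the pushed constant is xored with the hash result, the image base is added, and "retn" jumps to whatever comes out. In the sketch below the function names are made up, and the image base of 400000h is an assumption (it matches the addresses in the rebuilt dll's listing, but the real dll is loaded in high memory).

```python
MASK = 0xFFFFFFFF


def return_target(pushed_constant, hash_value, image_base):
    """xor [esp], eax / add [esp], hInstance / retn"""
    return ((pushed_constant ^ hash_value) + image_base) & MASK


def expected_hash(pushed_constant, target, image_base):
    """Hash value that makes the 'retn' land on `target` (inverse of the above)."""
    return pushed_constant ^ ((target - image_base) & MASK)
```

Under that assumption, the hash must come out as 1585D480h for the "retn" to reach 4117D4; any other value sends execution to a garbage address.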
The above routine is repeated 1 more time to test the other half of the dll, to make sure that no bpx escapes its grasp ... but now we know how to defeat it :)... Once the hash checks are OK, AsProtect proceeds to another memory area (loaded and decrypted by the dll) to perform the final task of calculating the OEP and the famous "popad ; jmp eax"! I did not bother tracing and dumping this routine as it looks like a long boring nonlinear MD5 ... nah it wasn't MD5, but I don't see that I can learn much from it. AsProtect is more or less fully reversed.
5. API Mangling, a closer analysis
Okie, this section discusses in more detail how AsProtect redirects APIs, so that hopefully you will be able to analyse other protectors the same way. After AsProtect decrypts the import data block, the block has a Delphi-like structure: the first dword is the signature (remember?), the next dword is the length, and then come the blocks for each library. A library block starts with the position of the first thunk offset, then the dll name, followed by the import entries. Each import entry consists of a first byte giving the group classification, a second byte giving the length of the entry, and then the import ASCII (encrypted of course).
If the first byte is 01 there is no mangling, 03 is GetProcAddress, 04 is import by ordinal, 05 is a redirected API and 06 is an emulated API!!! You can find this whole API mangling routine starting from 41011A.
I would like to discuss class 05 :> (it is quite interesting how AsProtect scans the first few instructions of the API, copies them to the redirected API location etc..)...
0040FD57 check_next_instruction:                ; CODE XREF: Mangle_IAT+76j
0040FD57 E8 84 FF FF FF   call  get_instruction_Table
0040FD5C 8B D8            mov   ebx, eax
0040FD5E C6 44 24 0C 00   mov   [esp+10h+copied_flag], 0   ; clear flag
0040FD63
0040FD63 check_instruction_start_byte:          ; CODE XREF: Mangle_IAT+6Fj
0040FD63 0F B6 33         movzx esi, byte ptr [ebx]        ; first byte to esi
0040FD66 8D 43 01         lea   eax, [ebx+1]               ; the second byte is stored in edi
0040FD66                                                   ; for later usage
0040FD69 0F B6 38         movzx edi, byte ptr [eax]
0040FD6C 8D 53 02         lea   edx, [ebx+2]               ; third byte onwards
0040FD6F 8B CE            mov   ecx, esi                   ; first byte
0040FD71 8B C5            mov   eax, ebp                   ; original API address
0040FD73 E8 E8 C6 FF FF   call  CompareBinary              ; compare binary string in eax and edx
0040FD73                                                   ; with length ecx
0040FD78 84 C0            test  al, al
0040FD7A 74 18            jz    short not_equal
0040FD7C 8B CF            mov   ecx, edi                   ; second byte
0040FD7E 8B D5            mov   edx, ebp                   ; original API address
0040FD80 8B 44 24 08      mov   eax, [esp+10h+new_mem_pointer] ; newly allocated memory address
0040FD84 E8 7F 47 FF FF   call  Move_memory                ; copy a few bytes from original API
0040FD84                                                   ; address to the newly allocated
0040FD84                                                   ; memory to redirect the API
0040FD89 01 7C 24 08      add   [esp+10h+new_mem_pointer], edi
0040FD8D 03 EF            add   ebp, edi                   ; calculate the position for
0040FD8D                                                   ; the redirected API to jump back
0040FD8D                                                   ; to the original API
0040FD8F C6 44 24 0C 01   mov   [esp+10h+copied_flag], 1   ; set the API redirected flag
0040FD94
0040FD94 not_equal:                             ; CODE XREF: Mangle_IAT+46j
0040FD94 83 C6 02         add   esi, 2
0040FD97 03 DE            add   ebx, esi                   ; point ebx to next API redirection
0040FD97                                                   ; data block
0040FD99 80 7C 24 0C 00   cmp   [esp+10h+copied_flag], 0
0040FD9E 75 05            jnz   short loc_40FDA5           ; is API already redirected?
0040FD9E                                                   ; jump if yes
0040FDA0 80 3B 00         cmp   byte ptr [ebx], 0          ; the end of the API redirection
0040FDA0                                                   ; block??
0040FDA3 75 BE            jnz   short check_instruction_start_byte
0040FDA5
0040FDA5 loc_40FDA5:                            ; CODE XREF: Mangle_IAT+6Aj
0040FDA5 80 7C 24 0C 00   cmp   [esp+10h+copied_flag], 0
0040FDAA 75 AB            jnz   short check_next_instruction
Basically, the routine above checks the first few bytes of each instruction in the original API routine, compares them with a pre-stored table of instructions and decides how it should copy the routine over to the redirected API. The first call of the routine, at 40FD57 "call get_instruction_Table", simply points eax to the beginning of the instruction table. The loop goes on until an "unknown" instruction is found, that is, an instruction that is not defined in the pre-stored table. The pre-stored table looks something like this :
0040FCFB 01   db 1          ; no. of bytes to compare
0040FCFC 01   db 1          ; number of bytes to copy
0040FCFD ; ---------------------------------------------------------------
0040FCFD 57   push edi
0040FCFD ; ---------------------------------------------------------------
0040FCFE 01   db 1          ; no. of bytes to compare
0040FCFF 02   db 2          ; number of bytes to copy
0040FD00 6A   db 6Ah ; j    ; 6Axx ==> push xx
0040FD01 01   db 1          ; no. of bytes to compare
0040FD02 05   db 5          ; number of bytes to copy
0040FD03 68   db 68h ; h    ; 68xxxxxxxx ==> push xxxxxxxx
0040FD04 02   db 2          ; no. of bytes to compare
0040FD05 03   db 3          ; number of bytes to copy
0040FD06 FF   db 0FFh       ; FF75xx ==> push dword ptr [ebp+xx]
0040FD07 75   db 75h ; u
Look back at the loop above again and you will understand how it scans through the table to find the right instruction. For example, when the routine reaches the table entry at 40FD04, the first byte is 02, which means it compares the first 2 bytes of the current instruction in the original API with the 2 bytes starting at 40FD06. If they are the same, the instruction is "push dword ptr [ebp+xx]", so it copies 3 bytes (the complete instruction) over to the redirected routine and moves on to the next instruction. This is how it can copy full instructions without using a disassembler.
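The table lookup is easy to replay in Python. The sketch below uses only the four table entries shown in the listing, followed by a 00 terminator as the loop expects; the real table is longer, and the names TABLE and copy_known_prologue are made up for the example.

```python
# Each entry: [bytes to compare][bytes to copy][pattern ...]; 00 ends the table.
# Only the four entries visible in the listing are included here.
TABLE = bytes.fromhex("01 01 57  01 02 6A  01 05 68  02 03 FF 75  00")


def copy_known_prologue(api_code, table=TABLE):
    """Copy leading instructions of `api_code` that the table recognises."""
    copied = bytearray()
    pos = 0                               # plays the role of ebp
    while True:
        entry = 0                         # plays the role of ebx
        matched = False
        while table[entry] != 0:
            cmp_len, copy_len = table[entry], table[entry + 1]
            pattern = table[entry + 2:entry + 2 + cmp_len]
            if api_code[pos:pos + cmp_len] == pattern:
                copied += api_code[pos:pos + copy_len]
                pos += copy_len           # "add ebp, edi"
                matched = True
                break
            entry += cmp_len + 2          # "add esi, 2 / add ebx, esi"
        if not matched:                   # unknown instruction: stop copying
            return bytes(copied)
```

Feeding it "push 10h / push 12345678h / push dword ptr [ebp+8] / push edi / mov ebp, esp" copies the first four instructions (11 bytes) and stops at the mov, which is not in this reduced table.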
After leeching as much as possible from the original API, AsProtect sets about creating the final jump that takes the redirected API back to the original API... this is again a mundane calculation of the number of bytes copied etc... The only interesting bit left is that it uses a random number to decide which kind of jump back to use, "push xxxxxxxx ; ret" or a long jump!
0040FDDA B8 02 00 00 00   mov   eax, 2
0040FDDF E8 D0 29 FF FF   call  @System@@RandInt$qqrv      ; System __linkproc__ RandInt(void)
0040FDE4 83 E8 01         sub   eax, 1                     ; randomize between 0 and 1 :>??
0040FDE4                                                   ; 2 options of returning to the original
0040FDE4                                                   ; API?? a push ret or a long jump?
0040FDE7 73 1A            jnb   short push_return
0040FDE9 8B 44 24 08      mov   eax, [esp+10h+new_mem_pointer]
0040FDED C6 00 E9         mov   byte ptr [eax], 0E9h       ; setting up the long jump E9xxxxxxxx
0040FDF0 83 C5 05         add   ebp, 5
0040FDF3 8B 04 24         mov   eax, [esp+10h+API_address]
0040FDF6 2B C5            sub   eax, ebp                   ; calculate the relative distance
0040FDF6                                                   ; to put in after E9
0040FDF8 03 F0            add   esi, eax
0040FDFA 8B 44 24 08      mov   eax, [esp+10h+new_mem_pointer]
0040FDFE 40               inc   eax
0040FDFF 89 30            mov   [eax], esi                 ; update the redirected API with the
0040FDFF                                                   ; distance found
0040FE01 EB 1B            jmp   short done
0040FE03 ; ------------------------------------------------------------------
0040FE03
0040FE03 push_return:                           ; CODE XREF: Mangle_IAT+B3j
0040FE03 8B 44 24 08      mov   eax, [esp+10h+new_mem_pointer]
0040FE07 C6 00 68         mov   byte ptr [eax], 68h        ; setting up a push, 68xxxxxxxx
0040FE0A 03 34 24         add   esi, [esp+10h+API_address] ; calculate the return address
0040FE0D 8B 44 24 08      mov   eax, [esp+10h+new_mem_pointer]
0040FE11 40               inc   eax
0040FE12 89 30            mov   [eax], esi                 ; update the redirected API
0040FE14 8B 44 24 08      mov   eax, [esp+10h+new_mem_pointer]
0040FE18 83 C0 05         add   eax, 5
0040FE1B C6 00 C3         mov   byte ptr [eax], 0C3h       ; finally put a "ret", C3
0040FE1B                                                   ; to go to the address pushed above
0040FE1E
0040FE1E done:                                  ; CODE XREF: Mangle_IAT+CDj
0040FE1E 8B C7            mov   eax, edi
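The two ways of getting back to the original API can be written out as byte builders in Python. This is a sketch of the encodings only (E9 + rel32, or 68 + imm32 + C3), not a transcription of the register juggling above; the function names and the use of random.randrange in place of Delphi's RandInt are assumptions for the example.

```python
import random
import struct


def build_long_jump(stub_address, target):
    """E9 rel32: the distance is measured from the end of the 5-byte jump."""
    rel32 = (target - (stub_address + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel32)


def build_push_ret(target):
    """68 imm32 / C3: push the absolute address, then 'ret' to it."""
    return b"\x68" + struct.pack("<I", target & 0xFFFFFFFF) + b"\xC3"


def build_return_stub(stub_address, target):
    """Pick one of the two forms at random, as the mangling routine does."""
    if random.randrange(2) == 0:
        return build_long_jump(stub_address, target)
    return build_push_ret(target)
```

Here stub_address is where the jump itself will be written (just after the copied instructions) and target is the original API address plus the number of bytes copied.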
Rather interesting! I hope everything is clear. There is one more routine that deals with type 06, emulated APIs and it is left as a practise for readers to locate and analyze this routine... it is not that simple though!
6. Finally
I hope you have learnt something from this long tutorial. This is my first, so there are bound to be mistakes. PLEASE contact me and help me correct them (I can be found at the RCE board most of the time). I hope that you will be able to use this info to unpack AsProtect better, or to inline patch it, to remove the CRC check and all...
--------------------------------------------------------
This tutorial would not have been possible without the following people : Spl/\j, evaluator, Solomon, SpeKKel, FoxThree, sv (for helping me unpacking my first AsProtected program), Tsehp (for revirgin), Clandestiny (for answering seh stuff), Daemon (for anti-debugging/tracing stuff on his site) and last but not least R!sc (the old genius) for his excellent tutorials on unpacking.... I am sorry if I miss someone out, but you know that I am always grateful for your help, it is the thought that counts heh :>
Special thanks to Kayaker for some analysis on API mangling and of course, give me more work to do :<... so if you find this essay too long, blame him :>>
Last Edited : 30 April 2002
Cheers,
crUsAdEr