,, MMP""MM""YMM `7MM P' MM `7 MM MM MMpMMMb. .gP"Ya MM MM MM ,M' Yb MM MM MM 8M"""""" MM MM MM YM. , .JMML. .JMML JMML.`Mbmmd' `7MMF' `7MF' `7MMF' `7MMF' `MA ,V MM MM VM: ,V `7M' `MF' MM MM .gP"Ya ,6"Yb.`7M' `MF'.gP"Ya `7MMpMMMb. MM. M' `VA ,V' MMmmmmmmMM ,M' Yb 8) MM VA ,V ,M' Yb MM MM `MM A' XMX MM MM 8M"""""" ,pm9MM VA ,V 8M"""""" MM MM :MM; ,V' VA. MM MM YM. , 8M MM VVV YM. , MM MM VF .AM. .MA..JMML. .JMML.`Mbmmd' `Moo9^Yo. W `Mbmmd'.JMML JMML. ,, ,, ,, .g8"""bgd `7MM `7MM mm db .dP' `M MM MM MM dM' ` ,pW"Wq. MM MM .gP"Ya ,p6"bo mmMMmm `7MM ,pW"Wq.`7MMpMMMb. MM 6W' `Wb MM MM ,M' Yb 6M' OO MM MM 6W' `Wb MM MM MM. 8M M8 MM MM 8M"""""" 8M MM MM 8M M8 MM MM `Mb. ,'YA. ,A9 MM MM YM. , YM. , MM MM YA. ,A9 MM MM `"bmmmd' `Ybmd9'.JMML..JMML.`Mbmmd' YMbmd' `Mbmo.JMML.`Ybmd9'.JMML JMML. -- Contact -- https://twitter.com/vxunderground vxug@null.net
kaze/FAT <kaze@lyua.org>

1. Introduction

This paper is a (partial and crapy) translation of the original paper written in french. I take no responsibility for the brain injuries caused by my awful english. You have been warned.

The main thing i really enjoy in virus writing is neither spreading nor weird target platform infection, it's just AV detection evading. And when I say stealth, i don't mean "kill any AV running on the victim's OS", I mean: not detected. But to be honest, writing a long-enough undectetd virus begins to be a real challenge. Nowadays, even the most advanced poly engines get detected in a few days. A few years ago, some little tricks like including big loops in decryptors, generating a lot of junk or using uncommon opcodes could fool some of the weakest emulators. But new techniques like code normalization can detect easily such tricky polymorphized decryptors.

I'll try to present here a new approach to evade av detection. Instead of increasing the complexity of the decryptor, as most of the actual poly engines tend to, we will try to build a decryptor that looks as common as possible, hopping for the AV to cancel emulation. We will try to increase the risk of false positive during virus detection. This approach has been implemented in my last virus, win32.leon, which can be found in the virus section of this emag.

2. The idea

To defeat or slow down actual emulators, I propose a new approach: not a breaking-through one, but I never saw it in a working virus. It's based on two hypothesis:

Those hypothesis may be a bit too strong. Some av products, like sandboxes, emulate the whole OS environnement. But those two hypothesis are likely to be the reality for desktop antivirus, that can't spend more than a bunch of seconds for an executable analyzis.

Usual virus decryptors use to be as complex and as hard to emulate as possible. What we will try here instead is to build a decryptor that looks like a common application, an harmless chunk of code. Our decryptor will only use standard win32 api calls. No xor nor uncommon opcodes. A very few number of viruses use APIs in their decryptor, and they're often used as junk code. If AVs don't emulate them, they can still ignore them: no good. And in most of the cases, the API sequence used by the virus is constant, leaving the possibility to perform behavioural detection.

3. The decryptor

There are several ways to build a 100% api-based decryptor, you just need to find a set of api able to perform simple encryption. I choosed the CryptoApis for leon, but other api sets should do the job. For example, the BitBlt API with it's xor capability may be a good choice too (even a better one, as the XOR operation is done by the graphic card chipset, not the cpu). But simpler, better, let's look at our candiate:

        csp     dd      ?
        hash    dd      ?
        key     db      48 dup (?)
        hkey    dd      ?

        call    CryptAcquireContext, offset csp,0,0,PROV_RSA_FULL,CRYPT_VERIFY_CONTEXT
        call    CryptCreateHash, csp,CALG_MD5,0,0,offset hash
        call    CryptHashData, hash, offset key,4,0
        call    CryptDeriveKey, csp,CALG_RC4,hash,48,offset hkey
        call    CryptDecrypt, hkey,0,1,0,start_of_virus,size_of_virus,start_of_virus

This simple bit of code will decrypt our virus through the RC4 symmetric algorithm. More info on the CryptoApis can be found easily on the msdn library. In fact, win32.leon has two decryptors, this one and a standard xor decryptor, but the xor decryptor is rarely used, and we'll focus here on the cryptoapis based one. We will first see how to build an api-based decryptor, and then how to make it as stealth as possible.

4. "Stealth" IAT patching

In order to use APIs in the decryptor, we need to gather the used APIs adresses, nothing new. Usually, in a virus, API adresses are obtained at runtime from a memory scan, by parsing the export table of the DLLs. As we want our decryptor to look as harmless as possible, looking for the api adresses in memory at runtime won't be considered. Instead, we will modify the infected host to make it import the wanted APis for us . This can be done by patching the host's IAT at infection time, in this way:

  1. Modify the IAT RVA in the host's data directory to point to some free space. For example, we can allocate an extra 4kb space in the last section by resizing the section. Copy the original host's IAT to this free space
  2. Add an extra Image Import Descriptor for the DLL hosting the APIs used in the decrpytor. For the CryptoApis, it will be an IID for advapi32.dll. If an IID for advapi32 was already in the host's IAT it won't matter: a PE containing two IIDs for the same DLL is still a valid PE.
  3. Add to this IID the names (in the First Thunk and Original First Thunk) of the APIs used in the decryptor: "CryptAcquireContext", "CryptCreateHash", etc.
  4. Remember the virtual adress of the First Thunk of this IID. It will contain the adresses of the wanted APIs when the infected host is loaded in memory by the windows loader. For example, if CryptAcquireContext is the second imported API, the call CryptAquireContext in the decryptor will be replaced by call [adress_of_first_thunk+4].

I won't show here the structure of the Import Image Descriptor, as it has been already explained in a bunch of papers: if you're lost, go get some documentation about the PE format. Source code for this task can be found in the functions add_iid_decrypteur and deal_imports of fusion_imports.asm in win32.leon source code.

Thanks to IAT patching, we can now call APIs in the decryptor without having to look for the API's adresses in memory, that is, just like a traditionnal application. But this algorithm can be improved a bit thought: the IID we just added in the IAT is too much constant. In fact, it could lead to potential signatures for the AVs. To make it a bit stealthier, I added two little tricks in win32.leon:

The first algorithm is very simple: the only thing you have to do is to find (or to make) some random located free spaces in the host and to store the API strings there. Those free spaces can be located in the virus body, but be careful to let enough space between the API strings. And make sure that the spaces between two API strings aren't constant bytes, as it could lead to a potential signature too.

The second trick isn't harder to implement. What you have to do is to keep a list of advapi32 API strings (or whatever dll your decryptor uses) inside your virus, and add a random set of thoses string at infection time into the extra IID (well, into the FirstChunk and OriginalFirstThunk of the IID). Don't hesitate to insert a lot of different APIs into the extra IID, but make sure:

Again, I won't show source code for this as this is a trivial task. If you're very curious, take a look to the other functions in fusion_imports.asm in win32.leon source code. The cross-compatible advapi32 APIs list is stored in the file apis_advapi_compat.inc. Of course this list has been computed, I didn't test APIs one by one ;)

5. Adding some semantic junk

Now, we have a 100% api-based decryptor that looks pretty harmless. It could be still detected in two ways:

Because we don't want our decryptor to be detected, even dynamically, we will try to introduce some randomness to the API calls sequence of the virus decryptor. We will insert random "safe API" calls between our effective api calls in the decryptor. By "safe API", I mean an API we can call with random-value parameters and that won't do anything besides returning an error code. For example the API CloseHandle: if we call CloseHandle(random()) under Windows XP or 2000, the API will just return (99.999% of the time) an harmless error code into eax, nothing more. No side effect, nothing. So the idea is to insert calls to such harmless APIs into our decryptor, at random locations.

5.1. Finding safe APIs

Before adding safe api calls into our decryptor, we have to find witch win32 API is a "safe" one, witch APIs can we execute without a risk. It can be a painful task if we test them by hand one by one. Would be nice if a program could compute that for us. That's why I wrote a little tool that tests all APIs on a given OS with random parameters, and write down the APIs that don't crash, i.e the safe ones. Again, as this tool is very simple, I won't paste source code. It can be found in this emag too (safeapisdetector). To be short, what it does is:

  1. List all of the APIs in the system dlls (kernel32, user32, advapi32, gdi32, etc.)
  2. For each API, find the number of parameters of the API. This is done by calling the API with just say 20 parameters on the stack, call the API, and look how many dword have been popped from the stack. This number is the number of parameteres for the API.
  3. Call the API many times, each time with different random parameters.
  4. If no exception has been thrown and if we still haven't crashed, this API is a safe one.

Then, this tool write the list of (crc of the name of safe api, number of parameters) into a .inc file that can be directly included into a .asm file. I ran it on several OS (win XP SP0, SP2, SP2 Pro and 2000) and kept the intersection of all the safe APIs that the tool found on each system. The result can be found into the file fake_apis.inc from win32.leon source code. Here is the beginning of that file:

        db      "kernel32.dll",0
        dd      034EEF5CFh, 1           ; AddAtomA
        dd      085870330h, 1           ; AddAtomW
        dd      0E8EE9923h, 3           ; AddConsoleAliasA
        dd      07B5E9926h, 3           ; AddConsoleAliasW
  dd    0E9FA4F67h, 0           ; AllocConsole
        dd      0E35DCCE1h, 0           ; AreFileApisANSI

For windows XP and 2000, something like 66% of the APIs are safe ones. Under Vista it's about 5-10% as most of the APIs throw an exception when called with wrong parameters. But I don't care, as win32.leon targets XP/2000 only.

5.2. Adding random api calls to the decryptor

We now have a list of cross-compatible safe apis, that we can call from our decryptor, embedded in our virus. The next step is to modify the decryptor. We will insert api calls to some of those safe apis at random locations into the decryptor. This is done at infection time, just before the decryptor gets polymorphized. But again, in order to call those APIs in the decryptor, we have to make sure the host import them. In win32.leon, I choosed to use only already-imported APIs. As my safe-API list is a big one (more than 500 cross-compatible safe APIs), we are nearly sure that the infected host will import at least two or three of them.

After semantic polymorphism, our api-based decryptor (for a given generation of the virus) may look like:

        call    CreateFiber, random(),random(),random()
        call    CloseHandle, random()
        call    CryptAcquireContext, offset csp,0,0,PROV_RSA_FULL,CRYPT_VERIFY_CONTEXT
        call    TlsAlloc
        call    CryptCreateHash, csp,CALG_MD5,0,0,offset hash

6. Decryptor fragmentation

In order to obfuscate a bit more the virus, Win32.leon's decryptor is fragmented into several chunks of code. Each chunk contains the code for a (fake or not) API call. Those chunks are written at random locations into the host, the first chunk being located at the entry point of the infected PE. They overwrite the host data (or code), data that are saved into the virus body. Those data are of course restored when the virus exits, just before jumping to the infected program.

The location of each chunk is choosed carefully, in order to avoid the host corruption: the main PE structures won't be overwritten by any chunk. In leon's source code, the module fragmentation.asm will list all the important structures of the PE (IAT, EAT, ressources, tls, etc.) and choose N safe locations for the chunks (where N=number of chunks=number of api calls in the decryptor).

In order to loose a bit the emulator, the jump from chunk #k to chunk #k+1 is a bit obfuscated. If the av doesn't know the number of parameters of the api used in chunk #k (and if the api itself is not emulated of course), it won't be able to locate chunk #k+1. Lets look at an example:

7. Polymorphism

Our api-based decryptor may be able to bypass emulation, but it is still vulnerable to signature-based detections. A quick solution is to polymorphize the decryptor code, but it should be done carefully. As we want our decryptor to look as harmless as possible, standard engines won't be considered: most of them produce "strange" code, uncommon opcodes and are quickly flagged as "suspicious" by AV emulators. Instead, the poly engine for such decryptors should focus on "standard" code production.

7.1. Kpasm

As i'm a lazy person, I didn't want to build the poly engine for leon entierly in asm. A good poly engine is an engine with a lot of obuscation rules, and writing all those rules in asm is a painful task. Instead, I created a tool to help me in the creation of my poly engine. This tool, kpasm, is like a specialized compiler designed to build poly engines. The obfuscation rules are written in a high level language that looks like C, with specialized instructions for polymorphism. To be short, an obfuscation rule may look like:

mov reg,cst <=> mov reg,0; add reg,cst

This is not Kpasm syntax, but the idea is there: rules are described in a short and elegant manner, while poly engine implementation is abstracted. Rules may be of course more complicated (use of random registers, jumps, loops, memory reads and writes etc.). From those rules, kpasm will produce the source code (asm source code) of the poly engine that will apply those user-defined obfuscation rules.

I won't describe kpasm here as it is a complex tool. User guide (in french) can be found on the FAT website, as well as binaries and examples. An english version may be avaible in the future. Using kpasm permits to build a lot of obfuscation rules very quickly, and that's what is important for the kind of decryptor we want to build. For example, the poly engine of win32.leon has been built in 2-3 hours (see regles.kpasm in source code).

7.2. The poly rules

As we want to increase the risk of false positive for AVs, poly rules must be written carefuly. The idea is to apply a lot of small obfuscation rules that produce short standard code. Uncommon operations like push reg / junk reg / pop reg should be avoided, as well as uncommon opcodes like stc, clc etc. The polymorphized decryptor of win32.leon only contains:

The decryptor should not be polymorphized too much tho, as unoptimized code always looks suspicious. In win32.leon, only a few steps of polymorphism are done: not all opcodes are polymorphized and polymorphized code is not too big. But two decryptors from different generations are totally differents, and that's what matter. All the rules are described in kpasm syntax in the file regles.kpasm. Below is an example of a polimorphised (fake) api chunk of the decryptor. Just infect some executable with win32.leon to look at more samples.

01008A61 > $ C705 6F3C0801 >MOV DWORD PTR DS:[1083C6F],kazerege.0100>
01008A6B   . FF35 6F3C0801  PUSH DWORD PTR DS:[1083C6F]
01008A71   . 8B35 D33C0801  MOV ESI,DWORD PTR DS:[1083CD3]           ;  kazerege.01069704
01008A77   . 2B1D 733C0801  SUB EBX,DWORD PTR DS:[1083C73]
01008A7D   . 56             PUSH ESI                                 ; /hTemplateFile => 01069704
01008A7E   . FF35 433D0801  PUSH DWORD PTR DS:[1083D43]              ; |Attributes = HIDDEN|SYSTE
01008A84   . BF 239C70DD    MOV EDI,DD709C23                         ; |
01008A89   . BB FD350801    MOV EBX,kazerege.010835FD                ; |
01008A8E   . 2B15 BB3C0801  SUB EDX,DWORD PTR DS:[1083CBB]           ; |
01008A94   . 8BB3 B2060000  MOV ESI,DWORD PTR DS:[EBX+6B2]           ; |
01008A9A   . 57             PUSH EDI                                 ; |Mode => DD709C23
01008A9B   . C705 E73D0801 >MOV DWORD PTR DS:[1083DE7],8BAE08D6      ; |
01008AA5   . BB 093D0801    MOV EBX,kazerege.01083D09                ; |
01008AAA   . 8B53 0A        MOV EDX,DWORD PTR DS:[EBX+A]             ; |
01008AAD   . FF35 E73D0801  PUSH DWORD PTR DS:[1083DE7]              ; |pSecurity = 00E2DA48
01008AB3   . 68 F73848DB    PUSH DB4838F7                            ; |ShareMode = FILE_
01008AB8   . 68 A2A36A41    PUSH 416AA3A2                            ; |Access = GENERIC_WRITE|16
01008ABD   . FF35 5B3D0801  PUSH DWORD PTR DS:[1083D5B]              ; |FileName = 0035DB23 ???
01008AC3   . FF15 B0110001  CALL DWORD PTR DS:[<&KERNEL32.CreateFile>; \CreateFileW
01008AC9   . 813D 233D0801 >CMP DWORD PTR DS:[1083D23],9DD8D33
01008AD3   .^0F84 9D9CFFFF  JE kazerege.01002776
01008AD9   . C3             RETN

As you can see, the polymorphized decryptor chunks don't look very suspicious: pushs, pops, memory access, that's all. Of course all registers, instructions and memory access change from one generation to another.

8. Obfuscation through relocations

Our stealth api-based decryptor can be even be more obfuscated, without looking too much suspicious. A good technique to obfuscate the decryptor (besides polymorphism) without introducing weird code ike xor loops, is the "encryption through relocations" technique. This technique is a old one, first presented by TCP in 29A#5 and used in his resur virus and my sankey (aka slicer) virus. I won't describe the original technique here, as it's not original anymore.

Though, some additional work has to be done in order to use this old technique under winXP/2000. The main difference between win9x and xp/2000 for the PE relocation process is that the default imagebase under xp/2000 is no longer 0x400000 but 0x10000. It implies that in order to use this technique, we first have to relocate the infected executable, because the imagebase of the process (often 0x400000) differs from the default imagebase (0x10000) under 2k/XP.

A consequence is that this technique will only be possible for executable which have a .reloc section (~5% of the executables). But 5% is better than nothing, isn't it ? I won't describe the whole technique here: the idea is the same as the old one, and for the differences under winxp/2000 (relocating the host), just take a look at the file relocation.asm in win32.leon source code (this is a trivial task).

This technique allow us to have some chosen dwords of the virus decryptor encrypted on the disk. Those dwords are decrypted by the windows loader when applying relocations : no need to use additional decryptor code, windows decrypts our code for us. Let's look at an example:

Finally, we have a 100% api-based polymorphic decrytor, fragmented, featuring fake api calls and sometimes encrypted through the reloc technique. It's likely that actual AVs engine won't be able to emulate the decryptor. The only flaw is see is located in the poly engine: it must feature enough obfuscation rules to avoid signature-based detections. Actual poly engine may be a bit "light", but thanks to kpasm, it shouldn't be hard to add a lot of polymorphism rules to the engine.

9. Results & Conclusion

The few techniques presented in this paper or not killer ones. But put together, they are likely to make life hard for the AVs. As leon is a proof of concept, i didn't spend a lot of time on the poly rules (well, the poly is still better than a lot of engines). But even with this reduced poly engine, it took more than three weeks for the AVs (after i sent them samples) to detect it, and with a bad detection rate (~80% for sophos, less than 20% for the others). After 6 months, the detection rate is about 95% for sophos and some others, while most of AVs, e.g KAV, stay near 15% (source: virustotal). Maybe the reason for those bad detection rates is because the virus hasn't been spread into the wild, so avers don't care ?

The main goal has been achieved though: the virus decryptor is not emulated (i did some tests with well-known malwares embedded into win32.leon: still not detected). Sophos detection seems to be signature based (the signature for win32.leon is ~40k big, seems that they detect win32.leon tanks to a lot of different decryptor signatures, but I may be wrong as I didn't ask them :). I won't say it's a win, infact further tests with a slightly improved poly engine should be done to draw conclusions. But I still believe that stealth api-based decryptor are the solution for undetectability in future viruses. Todays AVs engines are good, and what we have left is to play with the false positive rate.

Finally, I hope that you enjoyed reading this short paper. Greets go to all FAT members, as well as ex 29a & IKx members for their past work. Thanks to izee too, for taking the time to read my poor english ;) Don't hesitate to mail me comments or questions.

W32.Leon source

Last updated 2008-07-26 20:18:22 Paris, Madrid