This is a guide for those who already know how to make an engine but cannot work out why their viruses are still detectable.
The single purpose of a polymorphic virus is to avoid detection, and at its heart is the engine. The engine usually takes up 30-80% of the virus code size, so it is a very important component to have working properly.
This guide will tell you how polymorphic detectors work in order to help you design/make a better engine to defeat scanners.
Making a good engine takes a good amount of time. If you don't make it correctly you might as well leave it out completely, because its main purpose (avoiding detection) will not work!
Polymorphism covers many levels of skill.
According to Vesselin Bontchev (AV) these are:
#6 is not considered higher than #5; it is simply considered a different class.
I think there is now a 7th class. Highly advanced polymorphism which is designed to be better than #5. These ones have the following attributes:
All these attributes are not part of the virus but instead part of the polymorphic code produced by the virus.
There are many methods for detecting polymorphic viruses. Here are some popular ones:
Works by searching for a pattern of bytes in FIXED positions and a FIXED sequence. e.g.,
scan string: aa ?? bb ?? cc
virus text : aa xx bb xx cc
Works by searching for a pattern of bytes in VARIABLE positions but in a FIXED sequence. e.g.,
scan string: aa * bb * cc
virus text : 1. aa xx xx bb xx xx xx xx cc
             2. aa bb xx xx xx cc
             3. and so on...
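The two kinds of scan string above can be sketched as a small matcher. This is only a minimal illustration in Python, not how any particular scanner does it; the name compile_scan_string and the max_gap limit on `*` are assumptions made for the example.

```python
import re

def compile_scan_string(pattern, max_gap=16):
    """Turn a scan string such as 'aa ?? bb * cc' into a bytes regex.

    'hh' matches that exact byte, '??' matches exactly one arbitrary byte,
    '*' matches 0..max_gap arbitrary bytes (max_gap is an assumed limit).
    """
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")
        elif token == "*":
            parts.append(b".{0,%d}" % max_gap)
        else:
            parts.append(b"\\x%02x" % int(token, 16))
    # DOTALL so that '.' also matches the byte 0x0A
    return re.compile(b"".join(parts), re.DOTALL)

fixed = compile_scan_string("aa ?? bb ?? cc")
variable = compile_scan_string("aa * bb * cc")

# Fixed positions: exactly one unknown byte between the known ones.
print(bool(fixed.search(bytes.fromhex("aa11bb22cc"))))
# Variable positions: any number of bytes (up to max_gap) in between.
print(bool(variable.search(bytes.fromhex("aa1111bb11111111cc"))))
```

The fixed form fails as soon as one extra byte is inserted between the known bytes, while the variable form only cares about the order in which they appear.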
Works by finding part of the VIRUS BODY and then performing some very basic cryptanalysis on it and then decrypting it (if possible).
According to many AV people this method is not used anymore (due to the effectiveness of the Generic Decryptor), but I will tell you how to defeat it anyway just to be sure ;-) -- it's not hard to defeat.
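As a rough illustration of the "very basic cryptanalysis" a scanner can do, here is a sketch in Python of the simplest case: a body encrypted with one single-byte XOR key. It relies on the fact that XORing two neighbouring ciphertext bytes cancels the key. The names xor_delta and xray_find, and the sample data, are made up for the example; real products are not claimed to work exactly this way.

```python
def xor_delta(data):
    """XOR each byte with its neighbour; a single-byte XOR key cancels out."""
    return bytes(a ^ b for a, b in zip(data, data[1:]))

def xray_find(buffer, known_plain):
    """Look for known_plain inside buffer under an unknown single-byte XOR key.

    Returns (offset, key) on success, or None if nothing matches.
    """
    if len(known_plain) < 2:
        raise ValueError("need at least two known bytes")
    pos = xor_delta(buffer).find(xor_delta(known_plain))
    if pos == -1:
        return None
    # All deltas agree, so every byte at this offset shares the same key.
    return pos, buffer[pos] ^ known_plain[0]

# Hypothetical data: a known fragment hidden under key 0x5A.
plain = b"prefix" + b"KNOWN FRAGMENT" + b"suffix"
encrypted = bytes(b ^ 0x5A for b in plain)
print(xray_find(encrypted, b"KNOWN FRAGMENT"))
```

This only covers the single-XOR case; once several different operations are layered in one loop, the key no longer cancels out this simply.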
Works by emulating instructions in the polymorphic decryptor in order to make the virus decrypt itself and then it detects the virus by a standard scan string.
This has undeservedly been a virus buzz word for a long time. It has been the target of polymorph engine creators to beat the heuristics which shows how little they know of polymorphic detection.
This method involves searching for inconsistencies between the code being analysed and normal everyday code found in programs.
While it is important - it is not THAT important and will not help you stop being detected by anti virus software.
It is important to note that heuristics is not used very much (they do use a bit) in the most popular AV programs (F-PROT, McAfee and AVP) these are the programs you should target. Do not target programs which only hard core virus people use. Most of the hard core AV software could spot a virus anyway. -- in other words: target the less intelligent software users
This is really easy - avoid the use of code common to every decryptor just because some code isn't in the same position doesn't mean it cannot be scanned though. For example:
Your Decryptor #1 (as hexadecimal):
45 34 xx xx xx 54 80 xx xx xx 12 xx xx xx 34 32 xx xx xx 43 xx xx xx xx
Your Decryptor #2:
xx xx xx 45 30 xx xx xx xx xx 54 81 xx xx xx xx xx 12 xx xx 34 32 xx 43
Looking at this code you can see an obvious pattern. It can be scanned using this string:
45 3? * 54 8? * 12 * 34 32 * 43
Legend: ? - match 1 position only
        * - match up to N bytes, or as few as 0 bytes
This will identify this decryptor (not the virus) by looking for code common to each decryptor. So how do you combat it? Well try making sure that you always have at LEAST 1 alternative to every instruction your engine can generate.
NOTE: Make enough alternatives that it makes multiple variable scan strings not an option to AV!
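To see that the scan string shown above really does cover both sample decryptors, it can be checked with a few lines of Python. This is a sketch from the scanner's side only; the helper names, the max_gap value standing in for N, and the 00 filler used for the xx bytes are all assumptions for the example.

```python
import re

def compile_nibble_scan_string(pattern, max_gap=8):
    """Build a bytes regex from a scan string.

    'hh' is an exact byte, 'h?' fixes only the high nibble,
    '??' is any one byte, '*' is 0..max_gap bytes (assumed value of N).
    """
    parts = []
    for token in pattern.split():
        if token == "*":
            parts.append(b".{0,%d}" % max_gap)
        elif token == "??":
            parts.append(b".")
        elif token.endswith("?"):
            high = int(token[0], 16) << 4
            parts.append(b"[\\x%02x-\\x%02x]" % (high, high | 0x0F))
        else:
            parts.append(b"\\x%02x" % int(token, 16))
    return re.compile(b"".join(parts), re.DOTALL)

def sample(text, filler=0x00):
    """Convert a hex listing to bytes, replacing 'xx' with a filler byte."""
    return bytes(filler if t == "xx" else int(t, 16) for t in text.split())

SCAN = compile_nibble_scan_string("45 3? * 54 8? * 12 * 34 32 * 43")

DECRYPTOR_1 = sample(
    "45 34 xx xx xx 54 80 xx xx xx 12 xx xx xx 34 32 xx xx xx 43 xx xx xx xx")
DECRYPTOR_2 = sample(
    "xx xx xx 45 30 xx xx xx xx xx 54 81 xx xx xx xx xx 12 xx xx 34 32 xx 43")

for name, data in (("#1", DECRYPTOR_1), ("#2", DECRYPTOR_2)):
    print(name, "matched" if SCAN.search(data) else "missed")
```

One signature catches both samples because the bytes that never change still appear in the same order.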
This is very easy to defeat - simply add multiple encryption operations, for example:
A loop using a single XOR with byte/word is very easy to cryptanalyse but a loop using XOR b/w, ADD b/w, SUB b/w, ROL b/w in one loop is VERY hard to cryptanalyse.
The only problem with this is applying the encryptions in reverse order to that of your virus decryptor so that when the virus decryptor is run it will do it in the correct ordering.
There is an easy way to do this! -- There isn't really I was just joking there is no easy way =)
You can leave this bit out anyway because AVs are all using Generic Decryption as far as I know.
This method is very popular amongst AV and requires the cooperation of the virus to work. If a virus can detect it is being emulated and then throw an emulator off by some method then it will defeat this method.
Products known to be using this technique are: F-PROT AVP TBAV DSAV (and I would guess McAfee?).
How does generic decryption work? Each AV product has in it a little software emulator of the Intel CPU. It does not allow instructions to actually execute, but simulates them in certain controlled ways so that the virus decrypts itself in a safe environment. This way all they need is a scan string, even for a very complex polymorphic virus!
These controlled conditions avoid endless loops and other similar bugs in normal programs from making the emulator hang. I experimented and found that making a 10 KB decryptor on a virus will dramatically slow down scanning in DSAV and AVP because the emulator is actually simulating the code. I made 10 x 10KB samples and these took over 3 minutes to be examined by DSAV and AVP however each of these took only milliseconds to run normally.
This shows that the emulators in DSAV and AVP are really quite good and don't give up easily when trying to decrypt a virus (I used the /ANALYSE option on DSAV). F-PROT and TBSCAN did not emulate these samples correctly even with maximum heuristics enabled or if they did they must have discovered how to simulate INCREDIBLY quickly (even TBSCAN being written in assembly language cannot run them that fast).
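The generic decryption idea described above (emulate under controlled conditions, then scan with an ordinary scan string) can be shown with a toy model in Python. Everything here is hypothetical: the four-instruction "CPU" is invented for the example and is not x86, and the names emulate, generic_scan and max_steps are assumptions. The step limit stands in for the controlled conditions that keep an emulator from hanging on endless loops.

```python
def emulate(program, memory, max_steps):
    """Run a toy program against memory, stopping after max_steps instructions."""
    regs = {}
    pc = steps = 0
    while pc < len(program) and steps < max_steps:
        op, *args = program[pc]
        pc += 1
        steps += 1
        if op == "set":        # set reg, value
            regs[args[0]] = args[1]
        elif op == "xorm":     # memory[reg] ^= key
            memory[regs[args[0]]] ^= args[1]
        elif op == "add":      # reg += value
            regs[args[0]] += args[1]
        elif op == "jnz":      # jump to target if reg != 0
            if regs[args[0]] != 0:
                pc = args[1]
        else:
            raise ValueError(f"unknown op {op!r}")
    return steps

def generic_scan(program, image, signature, max_steps=10_000):
    """Emulate on a private copy of the image, then do a plain scan."""
    memory = bytearray(image)  # nothing touches the real data
    emulate(program, memory, max_steps)
    return signature in memory

# Hypothetical sample: a marker string hidden under XOR key 0x21.
KEY = 0x21
BODY = b"--EXAMPLE-SIGNATURE--"
IMAGE = bytes(b ^ KEY for b in BODY)
PROGRAM = [
    ("set", "ptr", 0),
    ("set", "cnt", len(BODY)),
    ("xorm", "ptr", KEY),      # loop start (index 2)
    ("add", "ptr", 1),
    ("add", "cnt", -1),
    ("jnz", "cnt", 2),
]

print(generic_scan(PROGRAM, IMAGE, b"EXAMPLE-SIGNATURE"))
```

The budget is the trade-off seen in the timing experiment: a generous limit makes scanning slow, while a tight one means the scanner gives up before the body is visible.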
So how do we stop this emulation taking place? or better put: How do we detect ourselves being emulated and throw the emulator off?
e.g., Imagine we know that the PSP contains a certain constant value at ALL times - but we also know the emulator doesn't emulate the PSP. With this knowledge we can construct some code in our polymorphic DECRYPTOR to detect this and throw the emulator off:
mov ax, sub ax,20CDh jz ok mov ah,4Ch int 21 ok:
Note: This code must be in the decryptor because its goal is to stop decrypting BEFORE we reach the virus body. This code must be generated with the same principles of variability that all other poly code requires - if you don't make this code variable also then you risk having the code used against you to detect the virus!!!
Possible methods to exploit for detecting and terminating emulation:
e.g., long winded version:
cli ;disable ints cld ;set data string copy direction push 6000h pop ds ;any segment which AV and virus don't own. push ds pop es ;es=ds=6000h sub si,si mov di,0002 lodsw ;save in ax, si=di=2 xor ds:,1234h ;write mov cx,0f000h ;some large amount L1: rep movsw ;write to memory (a large amount is better) cmp ds:,ax ;did the AV forget about the write? mov ds:,ax ;set it back to normal regardless jz not_emulated ;seems they messed up remembering where we ;wrote. mov ah,4Ch int 21h ;bye Mr Emulator. not_emulated:
Most emulators can emulate the PSP, MCB and so on but every single structure would take too much memory and processing so trying to exploit this possible weakness is a good idea.
TbClean a program which emulates viruses to disinfect programs only emulates certain small parts of the PSP leaving other parts to be exploited by emulation trapping. In fact one can trick TbClean into converting the virus infected file into an infected Trojan horse program for the person who runs it next.
NOTE: TbClean is good fun for testing your polymorphic decryptors it shows you how the emulator is going to go through your code like a hot knife through butter. Make sure to crack the registration on TbClean so you can use it properly <grin>.
Remember that many AV programs are built to be fast so by making your virus take a very long time (in AV program terms) to analyse your virus might make it quit thinking that it has encountered an endless loop.
However! running a time consuming decryptor normally takes next to no time. So we can see that resources of time, memory, processing power all contribute to methods for killing off an AV scanner emulator.
You must think how to detect and force the emulator to quit
AV researchers are the ones responsible for making your virus detectable so having some ways to hinder AV researchers doing analysis of your polymorphic virus and engine is always good to throw in.
The most common way to analyse a new polymorphic virus is to generate 1000's of samples of your virus. This involves activating the virus on a test computer and executing 1000's of goat programs.
The goal in generating these 1000's of copies is to get a good sampling of what the engine can generate and then test the detection method against it.
If your virus chooses only to show a certain sample then their detector may work in the Lab but not when it comes to "in the wild" situations.
Of course it is best to not make it obvious to AV that you are trying to do this or they might catch on and alter their methods.
It's always a good idea to have a plan of the engine structure.
Many coders spend their time byte-fiddling trying to optimise their code. This method of planning enables you to block-fiddle: each of these blocks can be shuffled and optimised, meaning every change for the better saves you lots of bytes instead of 1-2 bytes.
NOTE: NEVER place ANY code in a CALL/RET routine unless it is used more than once!
A polymorphic engine is very similar to the code generation phase of a compiler - most compiler writers use the word "emit"  as the word to say they're outputting code. So try to use the same because it's good to follow this standard when planning your engine.
"Emit" means "output" or "give off", for those bad at English.
e.g., Here is a very basic model of an engine plan (you may want to add more detail than this to any plan you make):
Engine: EmitDecryptor EmitDecryptor: repeat EmitGarbage & EmitAntiEmulation, random(5) times EmitSetupRegs repeat EmitGarbage & EmitAntiEmulation, random(5) times MarkLoopStart repeat EmitDecryptionCode, random(5) times repeat EmitGarbage & EmitAntiEmulation, random(5) times EmitEndLoop repeat EmitGarbage & EmitAntiEmulation, random(5) times End-EmitDecryptor EmitGarbage: Randomly Select 1 of: EmitFakeINT21 - randomly select some int 21 functions EmitFakeINT10 - randomly select some int 10 functions EmitCMPbmemXX - cmp byte ptr [xxxx],xx EmitCMPwmemXXXX - cmp word ptr [xxxx],cccc EmitMOVbmemXX - mov byte ptr [xxxx],cc EmitMOVwmemXXXX - mov word ptr [xxxx],cccc EmitMOVbregXX - mov rb,cc EmitMOVwregXXXX - mov rx,cccc EmitMOVbregMEM - mov rb,byte ptr [xxx] EmitMOVwregMEM - mov rw,byte ptr [xxxx] EmitCALL - CALL xxxx/garb/jmp yyyy/garb/xxxx:/garb/ret/yyyy: EmitJMP - JMP xxxx/garb/xxxx: End-EmitGarbage EmitAntiEmulation: Randomly Select 1 of: EmitFarCALL - place RETF into mem/CALL yyyy:xxxx EmitFarJMP - place Far JMP into mem/JMP yyyy:xxxx EmitWriteAndTest - write to known RAM mem, test it changes, if not crash EmitFakeExit - set int 21 = virus_cs:virus_return and call ah=4c, int 21 EmitPSPcheck - cmp ds:,21CDh/jnz crash: use better check! just an example. EmitDOScheck - dos call/check return value is consistent. End-EmitAntiEmulation EmitSetupRegs: If Boolean Then LoopType = Counter or Pointer Select Count Register from [AX,BX,CX,DX,SI,DI] Select Pointer Register from [SI,DI,BP] Else LoopType = Pointer Select Pointer Register from [SI,DI,BP] End If End-EmitSetupRegs MarkLoopStart: Save output pointer (usually DI register) to remember loop location. 
End-MarkLoopStart EmitDecryptionCode: Randomly select 1 of: EmitXORptr EmitADDptr EmitSUBptr End-EmitDecryptionCode EmitEndLoop: If LoopType=Counter and Counter=CX and Boolean Then EmitLoop Else If LoopType=Counter and Boolean Then EmitDECJNZ Else If LoopType=Counter and Boolean Then EmitDECJZJMP Else If LoopType=Pointer and Boolean Then EmitDECCMPJNZ Else If LoopType=Pointer and Boolean Then EmitDECCMPJZJMP End End-EmitEndLoop
This is just a simple example of a plan so you can see how to structure your engine - do not forget these parts:
This part is usually done while emitting.
If you are going to go to the trouble of making a polymorph engine then do it right and don't waste 1-3Kb of code on an engine which can be generically decrypted.
If you are going to make a good engine remember the following points:
The final pain in the ass
Some AV are obsessed with EXACT detection - even if they are able to detect your decryptor, like they do with many TPE based viruses - in the end they always want exact detection. So try to make your engine such a hard ass that it might allow detection of the actual "decryptor" part but NOT the virus body - This will be a great annoyance to the AV (even though they may say otherwise).
Remember inexact detection leads to inability to remove the virus and if your virus ever becomes common they will have to answer the customer's question "why can't you remove it?".