Tutorial Anti-Disassembly techniques used by malware (a primer) by Rahul Nair

Storm Shadow

Malware authors often add a bit of trolling to their code so that a malware analyst has a hard time figuring it out during static analysis (IDA Pro, anyone?). These cunning asm instruction sequences do not change the flow of the program, but they stop static analysis tools such as IDA Pro from interpreting the code correctly.
There are 2 kinds of disassembly algorithms: linear disassembly and flow-oriented disassembly. The former mostly shows up in tutorials and is not used that much in disassemblers.
What we are concerned about is the latter, which is used in IDA Pro and sometimes gamed by malware authors:
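To make the difference concrete, here is a minimal Python sketch of the two algorithms. It is only an illustration: the decoder knows four real x86 opcodes, the byte sequence in CODE is made up for this example, and the names decode, linear_sweep and flow_oriented are hypothetical. The flow-oriented version only follows unconditional short jumps, which is far less than what IDA Pro does.

```python
# Toy decoder: opcode byte -> (mnemonic, instruction length in bytes).
# Only four real x86 opcodes are modelled; everything else is shown as "db".
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("retn", 1),
    0xEB: ("jmp short", 2),  # EB rel8
    0xE8: ("call", 5),       # E8 rel32
}


def decode(code, off):
    """Return (mnemonic, length) for the byte at off."""
    name, length = OPCODES.get(code[off], ("db", 1))
    if off + length > len(code):  # truncated instruction: treat as data
        return ("db", 1)
    return (name, length)


def linear_sweep(code):
    """Decode one instruction after another, ignoring control flow."""
    listing, off = [], 0
    while off < len(code):
        name, length = decode(code, off)
        listing.append((off, name))
        off += length
    return listing


def flow_oriented(code, entry=0):
    """Decode along the path execution would take from entry."""
    listing, seen, off = [], set(), entry
    while 0 <= off < len(code) and off not in seen:
        name, length = decode(code, off)
        seen.add(off)
        listing.append((off, name))
        if name == "retn":
            break
        if name == "jmp short":
            rel = code[off + 1]
            rel = rel - 256 if rel > 127 else rel  # rel8 is signed
            off += length + rel
        else:
            off += length
    return listing


# Made-up bytes: jmp +1 skips over a rogue E8 byte and lands on the nops.
CODE = bytes([0xEB, 0x01, 0xE8, 0x90, 0x90, 0x90, 0xC3])

print(linear_sweep(CODE))   # the E8 byte swallows the real code as a call operand
print(flow_oriented(CODE))  # the jump is followed and the real code shows up
```

On these bytes the linear sweep never sees the nops or the retn, while the flow-oriented pass does. The tricks below are about making the flow-oriented pass go wrong as well.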
1. Jump Instructions to a location with constant value
This is the trick most used by malware writers and anti-disassembly programs: create a jump to the same location + 1 or 2 bytes. The processor then executes the bytes at the jump target as completely different instructions from the ones the disassembler shows.
img1.png

For instance, the jump here actually takes the flow of the program to the bytes shown above.
Since tools like IDA Pro are not that clever (no offense to the creator), they cannot make that judgement and instead decode starting from the E8 byte, which shows us a call instruction to some random crappy address followed by weird decrements and adds.
Now, we can fix this with ease in IDA Pro. Press D on the E8 byte and C on the 8B opcode and voila! You get what is actually being executed.
After playing around more with the C & D keys you get the following in IDA, which seems legit :p
img2-1.png
What has happened here is that the author has inserted something known as a rogue byte, which confuses IDA Pro and leads to a wrong interpretation of the opcodes that follow. This is a simple technique, and if you don't like to see that ugly E8 byte you could NOP it out :)
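NOPing out the rogue byte just means overwriting it with 0x90. A minimal sketch in Python, working on a copy of the bytes rather than through IDA; the helper name nop_out and the byte sequence are made up for illustration and are not taken from the sample in the screenshots.

```python
def nop_out(code: bytes, offset: int, count: int = 1) -> bytes:
    """Return a copy of code with count bytes at offset replaced by NOP (0x90)."""
    if offset < 0 or offset + count > len(code):
        raise ValueError("patch range is outside the buffer")
    patched = bytearray(code)
    patched[offset:offset + count] = b"\x90" * count
    return bytes(patched)


# Hypothetical bytes: jmp +1, rogue E8, then mov eax, ebx (8B C3) and retn.
original = bytes.fromhex("eb01e88bc3c3")
patched = nop_out(original, 2)
print(patched.hex())
```

Since the rogue byte is never executed, replacing it with a NOP does not change what the program does; it only keeps the disassembler from starting a bogus call there.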
2. Jump Instructions to the Same target
IDA Pro usually handles a conditional jump (jnz, for example) by disassembling the false branch first and then moving on to the true branch.
From the malware's POV, since both the jz and the jnz are present and go to the same target, the pair is equivalent to an unconditional jump.
img3.png

Once IDA Pro reaches the jz instruction, it first disassembles the false branch, which takes it to the jnz, where it does the same. A nice and dirty trick is to insert a rogue byte after the jnz so that the disassembler interprets the bytes that follow as a call.
If we do the C & D thingy in IDA Pro as mentioned in 1, we get the following code:
img4.png
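The false-branch-first behaviour can be sketched in Python. This is a toy model and not IDA's real algorithm: it decodes the fall-through path first, queues branch targets for later, and refuses to decode a target whose byte is already part of an earlier instruction. The bytes in CODE and the name disassemble are made up for this example, and the data argument stands in for pressing D on a byte.

```python
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("retn", 1),
    0x74: ("jz", 2),    # 74 rel8
    0x75: ("jnz", 2),   # 75 rel8
    0xE8: ("call", 5),  # E8 rel32
}


def disassemble(code, entry=0, data=frozenset()):
    """Toy flow-oriented pass that handles the fall-through path first."""
    listing = {off: "db" for off in data}  # offset -> mnemonic
    claimed = set(data)                    # bytes that already belong to something
    deferred = [entry]                     # branch targets to visit later
    while deferred:
        off = deferred.pop(0)
        while off < len(code) and off not in claimed:
            name, length = OPCODES.get(code[off], ("db", 1))
            listing[off] = name
            claimed.update(range(off, off + length))
            if name in ("jz", "jnz"):
                # Forward rel8 only, which is all this example needs.
                deferred.append(off + length + code[off + 1])
            if name in ("retn", "db"):
                break
            off += length
    return sorted(listing.items())


# Made-up bytes: jz +3 and jnz +1 both land on offset 5; offset 4 is a rogue E8.
CODE = bytes([0x74, 0x03, 0x75, 0x01, 0xE8, 0x90, 0x90, 0x90, 0x90, 0xC3])

print(disassemble(CODE))            # the rogue E8 eats the real code at offset 5
print(disassemble(CODE, data={4}))  # with offset 4 marked as data, the nops appear
```

The second call is the equivalent of the C & D fix: once the rogue byte is marked as data, the shared jump target gets decoded as code.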

3. Ping-Pong jumps
I have no idea what this technique is actually called, but it involves doing a lot of jumping around using the method mentioned in 1 and maybe even a bit of 2.
Let's look at this innocent jump below.
img5.png

This jump goes back to loc_4012E6+2, which is the EB opcode. If we ignore the 66 and B8 bytes and make IDA interpret the rest as code instead, we get the following:
img6.png
Yay more jumps.
Once again, ignoring the other E8 byte and treating the rest as code, the result is as follows:
img7.png

We can see how incorporating rogue bytes hides the real function call from static analysis.
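What makes this kind of jump nasty is that the same bytes get used twice, once as the operand of one instruction and once as an instruction of their own. The Python sketch below traces execution over a made-up byte sequence built from the 66, B8, EB and E8 bytes mentioned above. The exact bytes of the sample in the screenshots are not reproduced here, so treat the sequence, the offsets and the function name trace as assumptions.

```python
def rel8(b):
    """Interpret a byte as a signed 8-bit displacement."""
    return b - 256 if b > 127 else b


def trace(code, entry=0, max_steps=32):
    """Follow execution and return the (offset, mnemonic) pairs actually run."""
    executed, off, zf = [], entry, False
    for _ in range(max_steps):
        op = code[off]
        if op == 0x66 and code[off + 1] == 0xB8:    # 66 B8 imm16
            executed.append((off, "mov ax, imm16"))
            off += 4
        elif op == 0x31 and code[off + 1] == 0xC0:  # 31 C0
            executed.append((off, "xor eax, eax"))
            zf = True
            off += 2
        elif op == 0x74:                            # 74 rel8
            executed.append((off, "jz"))
            off += 2 + (rel8(code[off + 1]) if zf else 0)
        elif op == 0xEB:                            # EB rel8
            executed.append((off, "jmp short"))
            off += 2 + rel8(code[off + 1])
        elif op == 0x90:
            executed.append((off, "nop"))
            off += 1
        elif op == 0xC3:
            executed.append((off, "retn"))
            break
        else:
            raise ValueError(f"byte {op:#04x} is not modelled in this toy")
    return executed


# Assumed layout: mov ax, 5EBh / xor eax, eax / jz -6 / rogue E8 / nop / retn
CODE = bytes.fromhex("66b8eb0531c074fae890c3")

for off, name in trace(CODE):
    print(f"{off:2d}  {name}")
```

The bytes EB 05 at offset 2 are first consumed as the immediate of the mov and later executed as a jmp that hops over the rogue E8, which is why no single static listing can show both instructions at once.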
4. Usage of Function Pointers
Instead of a screenshot, here is a piece of code:
Code:
mov [ebp+var_8], offset sub_4211C1
push 4Ah
call [ebp+var_8]
What happens above is that a function is called through a reference to its address: the address of sub_4211C1 is stored in the local variable var_8 and the call goes through that variable. In real malware the function to call might first be worked out by some weird decoding subroutine and only then saved in such a variable. This makes static analysis really hard since IDA won't easily recognize the target of the call.
From a static analysis point of view this doesn't seem to cause massive harm on its own, but coupled with other anti-disassembly techniques it can become a real annoyance for an analyst.
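A rough Python sketch of why the indirect call hurts cross references. It works on a text listing, not on real IDA output: a scan for direct calls misses the call through var_8, and only a pass that remembers what was stored in the slot recovers the target. The listing adds one direct call to a hypothetical sub_401000 for contrast, and both function names are made up.

```python
import re

# The snippet from above plus one direct call (sub_401000 is hypothetical).
LISTING = """\
mov [ebp+var_8], offset sub_4211C1
push 4Ah
call [ebp+var_8]
call sub_401000
"""


def direct_call_targets(listing):
    """Targets a naive cross-reference pass would find."""
    return re.findall(r"^call (sub_[0-9A-F]+)$", listing, flags=re.M)


def resolve_indirect_calls(listing):
    """Track 'mov [slot], offset sub_X' and resolve later 'call [slot]' lines."""
    slots, targets = {}, []
    for line in listing.splitlines():
        stored = re.match(r"mov \[(.+?)\], offset (sub_[0-9A-F]+)$", line)
        if stored:
            slots[stored.group(1)] = stored.group(2)
            continue
        called = re.match(r"call \[(.+?)\]$", line)
        if called and called.group(1) in slots:
            targets.append(slots[called.group(1)])
    return targets


print(direct_call_targets(LISTING))
print(resolve_indirect_calls(LISTING))
```

This only works because the store and the call sit next to each other in plain sight. Once the address comes out of a decoding routine at run time, there is nothing for such a pass to match.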
There are a couple more annoying techniques which I will explore in another post, such as abusing the return pointer (for fun and profit :p), using your own Structured Exception Handler (SEH) and screwing around with the stack-frame construction in IDA Pro.


I couldn't quite finish up my last post on anti-disassembly so here are a few more which will be explored in this post
5. Abusing the return pointer (for fun and profit:p )
If you have played around with x86 asm a bit, you should know that there are ways other than a jmp or a call by which the control flow of a program can be redirected.
Short story: if you have an address on the top of the stack and execute a ret instruction, your program flow will go to that address.
For instance, the following is some code in IDA:
Code:
sub do_Evil:
push aScumbagAddresstoGoto
mov eax,ebp
push ebp
<.......random instructions...>
pop ebp
retn
* Also note that aScumbagAddresstoGoto might be calculated at execution time or saved somewhere within the program.
Assume that we have popped everything off the stack down to the first ebp, which was pushed at the start of the code. Executing the return will now pop aScumbagAddresstoGoto and transfer control of the program to that address. Usually malware authors will try to push this address onto the stack slightly earlier rather than do something like the following, as it would be detected more easily in static analysis:
push aScumbagAddresstoGoto
retn
This technique confuses static analysis tools like IDA, which will not generate cross references to the target and may not even disassemble that set of instructions, since nothing it can follow leads there and the bytes are treated as a bunch of irrelevant data (remember flow-oriented disassembly? Yeah, the disassembler only decodes what it can reach).
img8.PNG

Speculating from this ^^, what might happen is that the retn instruction gets executed (popping aScumbagAddresstoGoto) and the control flow transfers to memory address 0x04014C0, which executes a bunch of evil code. In an ideal situation this evil code would probably be scattered somewhere far off in the program to confuse the analyst, rather than sitting right there as in the above example (duh).
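The push/retn mechanics can be checked with a tiny stack machine in Python. This is a sketch that models only push, pop, mov and retn; the starting value of ebp is made up, the target 0x4014C0 is the address from the example above, and the function name run is hypothetical.

```python
def run(program, regs=None):
    """Execute a list of (mnemonic, operands...) tuples; return where retn lands."""
    regs = dict(regs or {"eax": 0, "ebp": 0x12FF70})  # made-up register values
    stack = []
    for op, *args in program:
        if op == "push":
            src = args[0]
            stack.append(regs[src] if src in regs else src)
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "mov":
            dst, src = args
            regs[dst] = regs[src] if src in regs else src
        elif op == "retn":
            return stack.pop()  # retn pops the top of the stack into EIP
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return None


A_SCUMBAG_ADDRESS = 0x4014C0

do_evil = [
    ("push", A_SCUMBAG_ADDRESS),
    ("mov", "eax", "ebp"),
    ("push", "ebp"),
    # ... random instructions would go here ...
    ("pop", "ebp"),
    ("retn",),
]

print(hex(run(do_evil)))
```

The blunt push/retn pair ends up in the same place. The longer version just puts more distance between the push and the retn so the pattern is harder to spot.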
6. Creating/Using your own SEH (Structured Exception Handler)
TL;DR picture version
802.jpg

Structured exception handlers were introduced by Microsoft [4] to give programs a way to handle error conditions gracefully (kinda like try/catch in Java). There are many ways an exception can be triggered: one of the easiest is to access an invalid memory address (0x41414141 :D), another is stuff like dividing by zero.
The SEH is like a linked list holding a chain of functions meant to handle exceptions. If a function doesn't know how to handle a certain exception, it is passed down the line until it either gets resolved by an exception handler or the program crashes and you get an unhandled exception. Using this feature, exceptions can be handled gracefully rather than annoying the user with crappy error codes.
To figure out the SEH chain, the OS refers to the FS segment register. This contains a selector which is used to access the TEB (Thread Environment Block).
Now the first structure within the TEB is the Thread Information Block, which points to the SEH chain, a linked list of records as shown below:
Code:
struct EXCEPTION_REGISTRATION   /* [2] */
{
    DWORD prev;     /* address of the previous record in the chain */
    DWORD handler;  /* address of the handler function */
};
image_thumb45.png
[3]
This works similarly to a stack, with prev acting as the link to the previous record and handler pointing to the handler function. So you could write your own custom exception handler which points to some code of your choice, and then forcefully do some illegal operation so that the control flow goes to the function you wrote.
Further reading on this: [4]
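The chain walk can be modelled in a few lines of Python. This is a simplified sketch, not the Windows dispatcher: records live in a dict keyed by made-up stack addresses, a handler just returns True when it handles the exception, and the names dispatch, default_handler, div_handler and evil_func are hypothetical. 0xFFFFFFFF is used as the end-of-chain marker.

```python
END_OF_CHAIN = 0xFFFFFFFF


def dispatch(records, head, exception):
    """Walk the chain from the newest record; return the address of the
    record whose handler took the exception, or None if nobody did."""
    addr = head
    while addr != END_OF_CHAIN:
        prev, handler = records[addr]
        if handler(exception):
            return addr
        addr = prev  # pass the exception down the line
    return None      # unhandled exception: the program crashes


def default_handler(exception):
    return False  # handles nothing in this toy


def div_handler(exception):
    return exception == "divide by zero"


def evil_func(exception):
    return True  # the malware's handler happily takes anything


# address -> (prev, handler); the addresses are made up.
records = {
    0x12FFE0: (END_OF_CHAIN, default_handler),
    0x12FF80: (0x12FFE0, div_handler),
}
head = 0x12FF80

unhandled = dispatch(records, head, "access violation")

# Install a custom record on top of the chain, then trigger the exception again.
records[0x12FF40] = (head, evil_func)
head = 0x12FF40
hijacked = dispatch(records, head, "access violation")

print(unhandled, hex(hijacked))
```

Installing a record at the head of the chain is all it takes for the next exception, deliberate or not, to end up in code the author picked.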
img9-1.png

What happens above is pretty darn neat: the location of the function to execute when the exception occurs is pushed, then the SEH record is set to point to esp, and an exception is forcefully generated as shown.
The printf function block will not run, as control is transferred to the evil_func loc.
Exploring the evil_func loc we see the following:
img10-1.PNG

In a nutshell, the assembly does the following: it unlinks the SEH record and removes the exception data which was pushed onto the stack. mov eax,[eax], which is a dereference (similar to C), is done twice because for some weird reason Windows puts its own crappy SEH record in place when our cool custom exception handler is created (hmph). Doing the operation twice dereferences our SEH record + the Windows one.
Once the dereferencing is done, execution simply continues with the rest of the code without any intention of returning. Since SEH chains are not followed by IDA, this becomes another smarty-pants way of obscuring the flow.
7. Screwing around the stack-frame construction in IDA Pro
This is mostly done with handwritten assembly, wherein an author might use the techniques mentioned in 1 & 2 (part 1) to take advantage of IDA Pro processing the false branch of a conditional jump first.
Consider the following: suppose your stack size was 8 and the following code is reached:
Code:
.....
	xor eax,eax
	jnz trick_loc
	add esp,4
	jmp real_code
	; ...arbitrary code...
trick_loc:
	add esp,200h
real_code:
	; ...evil stuff...
Since IDA does recursive disassembly, it tends to evaluate the false branch first. Since xor eax,eax always sets the zero flag, we can conclude that the code at trick_loc will never be executed, BUT IDA does not know this and gets confused. A symptom of this is IDA showing negative values for the stack pointer.
ALT+K (change stack pointer) is a good way to fix this issue if you ever run into it in IDA.
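The arithmetic behind those negative values can be sketched in a few lines of Python. This only illustrates the numbers in the example above (8 bytes on the stack, add esp,4 on the real path, add esp,200h at trick_loc). It is not how IDA actually tracks the stack pointer, and stack_depth_after is a made-up helper.

```python
def stack_depth_after(path, depth):
    """Apply 'add esp' / 'sub esp' adjustments to a stack depth in bytes."""
    for mnemonic, operand in path:
        if mnemonic == "add esp":
            depth -= operand  # add esp, N releases N bytes
        elif mnemonic == "sub esp":
            depth += operand  # sub esp, N reserves N bytes
    return depth


STACK_SIZE = 8

real_path = [("add esp", 4)]       # the fall-through that always runs
trick_path = [("add esp", 0x200)]  # trick_loc, which never runs

print(stack_depth_after(real_path, STACK_SIZE))   # what the program really has left
print(stack_depth_after(trick_path, STACK_SIZE))  # what an analysis trusting trick_loc sees
```

A tool that lets the add esp,200h count towards real_code ends up with a stack depth below zero, which is the kind of nonsense value that ALT+K lets you correct by hand.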
References:
  1. https://www.microsoft.com/msj/0197/exception/exception.aspx
  2. Shamelessly stolen from corelan.be
  3. https://www.exploit-db.com/docs/17505.pdf
  4. Thanks to Steven Reddie for pointing out my error, it's actually microsoft who has created the SEH
Source
http://malwinator.com/anti-disassembly-used-in-malware-a-primer/
 