When can we have some peace?
If you know about anti-viruses, you’ve probably heard about signatures too. At least our regular readers definitely know about them. Let's recall what signatures are.
A signature is a specific piece of malicious code that enables an anti-virus to identify malware.
So, in theory, one can select (manually or automatically) a code fragment (as short as possible, to keep the virus databases small), make sure it doesn’t match any of the code of other malicious programs and the most popular legitimate applications, and add it to the virus databases. After that, an anti-virus can detect an infection by simply looking through application code.
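To make the idea concrete, here is a minimal sketch of a signature scan in Python. The database, the signature names and the byte patterns are all made up for illustration; a real engine stores and matches its signatures in far more elaborate ways.

```python
def find_signatures(data, signatures):
    """Return the names of all signatures whose byte pattern occurs in data."""
    return [name for name, pattern in signatures.items() if pattern in data]


# Hypothetical database: the names and patterns are invented for this example.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef"),
    "Example.B": bytes.fromhex("cafebabe01"),
}

# A "file" that contains the first pattern surrounded by other bytes.
sample = b"\x00\x01" + bytes.fromhex("deadbeef") + b"\x02\x03"
detected = find_signatures(sample, SIGNATURES)
```

The scan is a plain substring search, which is exactly why changing a few bytes inside the matched fragment is enough to defeat it.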
But, because attackers also know that anti-viruses use signatures, they can determine which signature an anti-virus relies on to expose a certain malicious file. So they modify the respective piece of code and thus escape the attention of anti-viruses.
But if anti-viruses can easily expose malicious files by way of their signatures, can attackers design malware in such a way that no signatures are generated?
Polymorphic malicious programs contain no piece of code that can be singled out as pertaining to a specific malware species. Whenever a file gets infected, the malicious code is modified or distributed across the file so that no distinctive sign of infection can be detected.
The first known polymorphic virus was called 1260 and was written by Mark Washburn in 1990. Another, better-known polymorphic virus was deployed in 1992 by a Bulgarian hacker, going under the alias Dark Avenger, who also created an MtE (Mutation Engine) of his own.
There are several ways to prevent anti-viruses from detecting malware by its signature.
NOP. Or NOOP ("No Operation") – a specific instruction that tells the CPU to do nothing. Its applications are many. For example, it can be used to create a delay for a certain period of time.
In simple terms, any program (malware included) is a set of instructions and data. An application's code is a sequence of instructions. Since the NOP command does nothing, it can be inserted anywhere in a stream of other instructions or scattered randomly in code: it won't have any impact on the program's operation (unless, of course, it is essential that no delays occur in the course of its operation).
The trick is really simple, but so is the anti-viruses' response: when matching its signatures against the code being analysed, an anti-virus ignores NOP instructions.
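A rough sketch of that response could look as follows. It assumes the single-byte x86 NOP opcode (0x90) and simply drops every such byte before matching. This is a deliberate simplification: a real engine would disassemble the code first, because 0x90 can also appear as part of another instruction's operand. The byte values used here are arbitrary.

```python
NOP = 0x90  # single-byte x86 NOP opcode


def strip_nops(code):
    """Remove every 0x90 byte (naive: does not disassemble the code)."""
    return bytes(b for b in code if b != NOP)


def matches_ignoring_nops(code, signature):
    """Check whether the signature occurs in the code once NOPs are dropped."""
    return strip_nops(signature) in strip_nops(code)


signature = b"\x31\xc0\x40\xcd"              # arbitrary illustrative pattern
padded = b"\x31\x90\xc0\x90\x90\x40\xcd"     # same bytes with NOPs scattered in

plain_match = signature in padded                         # naive scan misses it
nop_aware_match = matches_ignoring_nops(padded, signature)
```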
Encryption is the second option. As you know, if you encrypt the same data with two different keys, you’ll end up with two different results. When a file is being infected, the malicious code is encrypted using a new key, which is added to the decryptor's code. The encrypted data is placed into the target file, the decryptor is told where the encrypted code begins, and the decryptor code is appended to the target file.
Thus an anti-virus can't expose encrypted code and can only look for the decryptor. Meanwhile, when a file gets infected, the decryptor code is modified too—the key and the branch instructions are added to it.
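The effect on signatures is easy to see with a toy example. The sketch below uses a single-byte XOR as a stand-in for whatever cipher is actually used, and a harmless placeholder string instead of real code; both choices are assumptions made purely for illustration.

```python
def xor_bytes(data, key):
    """XOR every byte with a one-byte key (applying it twice restores the data)."""
    return bytes(b ^ key for b in data)


body = b"SAME-BODY-EVERY-TIME"   # harmless placeholder for the encrypted part

copy_a = xor_bytes(body, 0x21)   # first "infection", key 0x21
copy_b = xor_bytes(body, 0x5A)   # second "infection", key 0x5A

# A signature taken from the plain body matches neither copy,
# and the two copies do not even resemble each other.
signature = body[:8]
found_in_a = signature in copy_a
found_in_b = signature in copy_b
restored = xor_bytes(copy_a, 0x21)
```

Only the decryption routine stays recognisable from one copy to the next, which is why the anti-virus has to concentrate on it.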
Unfortunately, when users see information about another anti-virus test, they are only interested in its results and completely disregard the testing methodology.
Meanwhile, the methodology is a crucial factor. For example, if the test only verifies how anti-viruses detect malicious files without launching them, an anti-virus can score well even without a polymorphic analyser, simply by looking for typical pieces of code that are regularly used in the tests.
However, all the techniques we've described above only make the work of anti-viruses more difficult, nothing more. Permutation is used to rid malicious programs of specific pieces of code that can be used to identify them.
Note once again that code is a sequence of instructions. Instructions can follow one another or be scattered among other data. The JMP (jump) instruction is used to navigate between the fragments.
In this case, malicious code will consist of multiple blocks that will be rearranged whenever a new file is infected and connected using the jump instruction. Sounds easy? Not by a long shot! Any program code incorporates function calls, branches, etc. So, if pieces of code get reshuffled, the code needs to be examined to make sure that all the calls point to appropriate locations. And to prevent anti-viruses from detecting the code by its signature, the code blocks must be very small, and yet code integrity also needs to be maintained.
How can anti-viruses counter permutations? Now they don’t just search for signatures; they also look through instructions and navigate between jumps and branches. Thus, by running through instruction sequences (effectively by executing files within its engine), an anti-virus puts pieces of code together and can compare the result against its signature database.
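Here is a toy model of that process. A "file" is a list of blocks in their on-disk order; each block has a label, a payload and the label of the block it jumps to. The representation, the labels and the payloads are hypothetical and only serve to show why following the jumps restores a stable byte sequence.

```python
def raw_bytes(layout):
    """Bytes as they lie on disk: payloads in physical order."""
    return b"".join(payload for _, payload, _ in layout)


def linearise(layout, entry):
    """Follow the jump chain from the entry block and join the payloads."""
    blocks = {label: (payload, nxt) for label, payload, nxt in layout}
    out = bytearray()
    seen = set()
    label = entry
    while label is not None and label not in seen:  # stop on loops
        seen.add(label)
        payload, label_next = blocks[label]
        out += payload
        label = label_next
    return bytes(out)


# The same three blocks, stored in two different physical orders.
layout_1 = [("a", b"AB", "b"), ("b", b"CD", "c"), ("c", b"EF", None)]
layout_2 = [("c", b"EF", None), ("a", b"AB", "b"), ("b", b"CD", "c")]

signature = b"ABCDEF"
raw_hit = signature in raw_bytes(layout_2)            # physical order hides it
linear_hit = signature in linearise(layout_2, "a")    # execution order reveals it
```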
That doesn't sound too difficult, but in reality this means that system resources must be allocated to launch the files, which can affect overall system performance. That's why polymorphic analysis can't be run for every file an anti-virus scans. But here the heuristic analyser comes to the rescue.
It can detect malicious programs by how similar they are to previously discovered malware. But this is only one of the heuristic analyser’s tasks.
These are just a few code properties that can help the heuristic analyser expose malware:
An entry point in the section for which the write permission is available (rwx). If control is transferred to a section containing executable code and marked as having the write permission, it is very likely that self-modifying code is being used. Sections of this sort are typical of viruses and protection programs.
A jump instruction at the entry point. There is no practical reason to place a jump instruction at the entry point. So when this happens, it indicates that self-modifying code is present in the file.
Entry point in the second half of a section. Virus code is usually appended to the end of a file section. This is not typical of normal files and appears to be suspicious.
Broken headers. A file may remain operational after it’s been infected, but its header may contain errors that couldn't occur during the linking process. So errors of this kind arouse suspicion too.
An unusual format in some special sections. Executable files contain such special sections as .ctors, .dtors, and .fini. They can be used to infect files. If any format changes in these sections are discovered, that’s also suspicious.
… and there are hundreds of other suspicious code properties.
The properties are many, and each property has a different degree of risk associated with it. Some may only indicate a threat when they occur together, but nonetheless they can be very helpful in exposing threats and deciding whether further analysis is required. Circumventing heuristic analysis is not an easy task (by that I mean preventing it from even issuing a warning). This can be achieved by employing platform-dependent solutions that utilise the features of specific compilers and frameworks (e.g., constructor or destructor overwrite), which get into heuristic databases rather quickly, or by using sophisticated infectors that are adroit at rearranging code across a file.
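The way such properties add up can be sketched as a simple weighted score. The property names mirror the list above, but the weights and the threshold are invented for this example and do not reflect how any particular product rates them.

```python
# Hypothetical weights: higher means more suspicious.
WEIGHTS = {
    "entry_point_in_writable_section": 3,
    "jump_at_entry_point": 2,
    "entry_point_in_second_half_of_section": 2,
    "broken_header": 2,
    "unusual_special_section_format": 1,
}
THRESHOLD = 4  # assumed cut-off for requesting deeper analysis


def heuristic_score(properties):
    """Sum the weights of all observed properties (unknown ones count as 0)."""
    return sum(WEIGHTS.get(p, 0) for p in properties)


def needs_further_analysis(properties):
    """True when the combined score reaches the threshold."""
    return heuristic_score(properties) >= THRESHOLD


single = {"jump_at_entry_point"}
combined = {"jump_at_entry_point", "broken_header"}
```

With these numbers, a jump at the entry point alone stays below the threshold, while the same property together with a broken header reaches it, which matches the observation that some properties only indicate a threat when they occur together.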
So, permutations don't prevent anti-viruses from detecting malicious programs by their signatures, but they do raise the bar in terms of the requirements placed on the anti-virus developer team.
What else do attackers have in their arsenal?
Garbage insertion. We already mentioned the NOP instruction. Attackers can also make sure that redundant instructions are inserted into an infected file to conceal the actual malicious code. This may also sound easy, but it isn't, because the code must remain operational despite multiple useless commands being added to it.
To counter this, the polymorphic analyser becomes even more complex: it doesn't only put pieces of code together but also cleans them of useless fragments (if we run the routine to examine Windows code, how much of it will be left?). In this case an analysis error can have dire consequences.
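A very small sketch of such a clean-up pass is shown below. Instructions are modelled as tuples, and only three obviously useless patterns are recognised; both the representation and the rules are assumptions made for illustration. Note that the toy model ignores CPU flags: on real hardware an instruction such as "add eax, 0" still updates them, which is one example of how a clean-up rule that looks safe can turn out to be wrong.

```python
def is_garbage(ins):
    """Recognise a few instructions that have no effect in this toy model."""
    op = ins[0]
    if op == "nop":
        return True
    if op == "mov" and ins[1] == ins[2]:          # mov reg, reg (same register)
        return True
    if op in ("add", "sub") and ins[2] == 0:      # add/sub reg, 0 (flags ignored)
        return True
    return False


def clean(code):
    """Return the instruction list with recognised garbage removed."""
    return [ins for ins in code if not is_garbage(ins)]


padded = [
    ("mov", "eax", 2),
    ("nop",),
    ("mov", "ebx", "ebx"),
    ("add", "eax", 4),
    ("sub", "eax", 0),
]
cleaned = clean(padded)
```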
Metamorphic malware. Unlike permutations, where blocks of code are merely rearranged, metamorphic malware code really changes. In theory, malware of this sort can't be detected because generating signatures for malicious code of this kind makes no sense.
But as many of our readers know, almost all programs are written in a certain programming language and subsequently are compiled into executable code. And the compiler can be set to generate different code for the same program by enabling such options as performance and size optimisation and so on. The metamorphic engine operates in a similar fashion. Whenever it’s launched, it will use basic code to generate new code.
Malicious code can be obfuscated even further by changing constants, registers, etc. Let's illustrate our point by drawing an analogy with mathematics: For example, 6 can be obtained as a result of 2+4, 3+3, and 8-2… Which operation can be regarded as typical?
To respond to the challenge, code analysis is not enough. So the analyser becomes an emulator. This means that a program is executed within an anti-virus engine, which no longer looks for signature matches but rather monitors the actions of the program being "launched".
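The 2+4, 3+3 and 8-2 analogy can be turned into a tiny emulator. The three-instruction "machine" below is entirely hypothetical; the point is only that three different instruction sequences, which share no common signature, leave the machine in the same state once they are executed.

```python
def emulate(code):
    """Run a list of (op, register, value) tuples and return the final registers."""
    regs = {}
    for op, reg, value in code:
        if op == "mov":
            regs[reg] = value
        elif op == "add":
            regs[reg] = regs.get(reg, 0) + value
        elif op == "sub":
            regs[reg] = regs.get(reg, 0) - value
        else:
            raise ValueError("unsupported instruction: " + op)
    return regs


variant_1 = [("mov", "eax", 2), ("add", "eax", 4)]   # 2 + 4
variant_2 = [("mov", "eax", 3), ("add", "eax", 3)]   # 3 + 3
variant_3 = [("mov", "eax", 8), ("sub", "eax", 2)]   # 8 - 2

states = [emulate(v) for v in (variant_1, variant_2, variant_3)]
```

Comparing what the code does, rather than what it looks like, is what lets the emulator treat all three variants as the same program.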
The arms race between malware makers and anti-virus developers made polymorphic malware a rarity. Making a program of this kind is so difficult and expensive that:
If you can write a good metamorphic engine that will occupy security researchers for at least several days or design an emulator that will determine what signatures an anti-virus will detect or come up with a high-quality crackme, just message me. I’m no recruiter, but if there are a lot of you folks out there, I will switch professions and start introducing you to information security companies. Rest assured that your income and stability in life will far outweigh any possible benefits you may get out of distributing malware or cracking legitimate applications.
A modern anti-virus is a very complex piece of software that has evolved through neutralising generations of malicious programs. Today, only a few companies have the capacity to create a real anti-virus.