5 April 2022
Author: Vladimir Meier / @plowsec
In our journey to try and make our payload fly under the radar of antivirus software, we wondered if there was a simple way to encrypt all the strings in a binary without breaking anything. We did not find any satisfactory solution in the literature, and the project looked like a fun coding exercise, so we decided it was worth a shot.
In the end, we only partly succeeded: the tool’s limitations prevent it from bypassing antivirus software on its own, so the approach is not directly suited to antivirus evasion. That’s why we then made avcleaner, which operates on source code directly.
Still, the tool presented in this blog post involves some binary hacking that we believe might be of some value to the community, and who knows, someone might end up doing something useful with it.
Currently, we plan to use it alongside another antivirus bypass tool in order to better target the strings to be encrypted.
Our idea was to encrypt in place all the strings in a PE file. To avoid breaking the software, each string must be decrypted as soon as it is needed. For that to work, one should inject a decryption routine into the binary, and somehow call it when the string is used.
The best approach would be to decompile the binary, locate string usages and wrap them in a decryption routine. However, frameworks such as ret-dec, rev.ng, mcsema and so on were not mature enough at the time.
In view of that, our solution relies on lief for the binary manipulation, radare2 / rizin for the program analysis, and keystone for code injection.
The process is as follows:
These last steps require storing the string’s size and the return address, so we use lief as well to build a kind of jump table.
Here is an artistic diagram for clarity:
This section shares the implementation details and demonstrates the use of keystone, lief and radare2 to accomplish our goal.
Strings can be enumerated with the iz command of radare2.
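As a rough sketch of what post-processing the string list could look like, the snippet below filters entries shaped like the JSON variant of the command (izj). The field names (vaddr, length, type, string) follow what radare2 usually reports but may differ between versions, the sample values are invented, and select_strings is a hypothetical helper, not part of the tool.

```python
import json

def select_strings(entries, min_length=4):
    """Keep ASCII strings of at least min_length characters, sorted by address."""
    kept = [
        e for e in entries
        if e.get("type") == "ascii" and e.get("length", 0) >= min_length
    ]
    return sorted(kept, key=lambda e: e["vaddr"])

# Hypothetical sample shaped like `izj` output (all values are invented).
sample = json.loads("""[
  {"vaddr": 5368725504, "paddr": 9216, "size": 12, "length": 11,
   "section": ".rdata", "type": "ascii", "string": "hello world"},
  {"vaddr": 5368725520, "paddr": 9232, "size": 3, "length": 2,
   "section": ".rdata", "type": "ascii", "string": "ok"},
  {"vaddr": 5368725472, "paddr": 9184, "size": 16, "length": 7,
   "section": ".rdata", "type": "utf16le", "string": "wide"}
]""")

selected = select_strings(sample)
```

With a live r2pipe session, the list would presumably come from a cmdj("izj") call instead of the hard-coded sample.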
For each recovered string, we should encrypt it in place and build the corresponding jump table (described in the subsequent sections).
The “encryption algorithm” for this Proof-of-Concept is actually a simple Vigenere cipher :D, but you can obviously roll your own crypto. Luckily for us, antivirus can be fooled with Vigenere, so let’s not waste time on this.
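For illustration, here is a minimal byte-wise Vigenere sketch that encrypts a region of a buffer in place. It is an assumption about what such a routine could look like (key bytes added modulo 256), not the exact variant used by the tool; the function names and the key are hypothetical.

```python
def vigenere_encrypt(data: bytes, key: bytes) -> bytes:
    """Add the repeating key to each byte, modulo 256."""
    return bytes((b + key[i % len(key)]) % 256 for i, b in enumerate(data))

def vigenere_decrypt(data: bytes, key: bytes) -> bytes:
    """Subtract the repeating key from each byte, modulo 256."""
    return bytes((b - key[i % len(key)]) % 256 for i, b in enumerate(data))

def encrypt_in_place(image: bytearray, offset: int, size: int, key: bytes) -> None:
    """Overwrite image[offset:offset+size] with its encrypted form.

    The length is unchanged, so nothing else in the buffer moves.
    """
    plain = bytes(image[offset:offset + size])
    image[offset:offset + size] = vigenere_encrypt(plain, key)

# Hypothetical buffer standing in for a section's raw bytes.
image = bytearray(b"\x00\x00hello world\x00\x00")
encrypt_in_place(image, 2, 11, b"key")
```

Because the ciphertext may contain null bytes, the decryption side cannot rely on a terminator and needs the string's size, which is why that size has to be stored somewhere.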
Cross-references to strings can be obtained with radare2’s axt command, issued through r2pipe. Appending a j to the command (axtj) and sending it with cmdj returns the result as JSON, already parsed into Python objects.
To simplify things, we do not handle strings with multiple xrefs, although that’s definitely doable.
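A sketch of the "single xref only" rule, applied to a list shaped like parsed axtj output. The keys shown (from, type, opcode) are the ones radare2 commonly emits, but treat them as an assumption; the sample entries are invented and single_xref is a hypothetical helper.

```python
def single_xref(xrefs):
    """Return the only cross-reference, or None when there are zero or several."""
    return xrefs[0] if len(xrefs) == 1 else None

# Hypothetical samples shaped like the result of cmdj("axtj @ <string address>").
one = [{"from": 5368713268, "type": "DATA", "opcode": "lea rcx, [rip + 0x2f3a]"}]
many = one + [{"from": 5368713300, "type": "DATA", "opcode": "lea rdx, [rip + 0x2f1a]"}]

ref = single_xref(one)       # the string is referenced once: keep it
skipped = single_xref(many)  # several references: leave the string alone
```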
First, we need to create a new section in the target binary. The section should be big enough to hold information about each identified string.
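To give an idea of the sizing, the sketch below packs one fixed-size record per string and rounds the total up to an alignment boundary. The record layout (two little-endian 64-bit fields: string size, then return address) and the 0x200 alignment are assumptions made for this example, not necessarily what the tool does.

```python
import struct

# Assumed record layout: string size, then return address, both 64-bit little-endian.
ENTRY = struct.Struct("<QQ")

def build_jump_table(entries):
    """Serialise (size, return_address) pairs into one contiguous blob."""
    return b"".join(ENTRY.pack(size, ret) for size, ret in entries)

def section_size(n_strings, alignment=0x200):
    """Bytes needed for n_strings records, rounded up to the alignment."""
    raw = n_strings * ENTRY.size
    return (raw + alignment - 1) // alignment * alignment

# Hypothetical entries: (string size, address to return to after decryption).
table = build_jump_table([(11, 0x140001234), (5, 0x140001300)])
```

The resulting blob could then be used as the content of the new section, with each trampoline pointing at its own record.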
Then, we use keystone to assemble the hook instructions, but let’s go over the process step-by-step.
Our trampoline should look as follows:
However, this does not account for the calling convention of the target binary, and sadly there are too many variations to cover. We thus decided to only support 64-bit ELF and PE files as a first step.
This sets up the parameters required by the decryption routine, performs the actual call and then returns to the original instruction. With that out of the way, let us define the blueprint for this trampoline. For a PE file, our actual trampoline would be:
Then, it is important to recover the original register used to reference the string, and update its value with the string’s new address:
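One possible way to recover that register is to parse the disassembly text of the referencing instruction. The sketch below assumes the reference is a lea whose first operand is the destination register (Intel syntax); other instruction forms are simply reported as unsupported. destination_register is a hypothetical helper.

```python
import re

# Matches "lea <reg>, ..." in Intel syntax and captures the destination register.
LEA_RE = re.compile(r"^\s*lea\s+([a-z0-9]+)\s*,", re.IGNORECASE)

def destination_register(opcode):
    """Return the register loaded by a lea instruction, or None if not a lea."""
    match = LEA_RE.match(opcode)
    return match.group(1).lower() if match else None

reg = destination_register("lea rcx, [rip + 0x2f3a]")
```

The returned name can then be substituted into the assembly template so that the register ends up holding the string's address once decryption is done.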
Now, it is simply a matter of returning to the original instruction. The final code can be assembled with keystone.
The goal here is to locate the decryption routine previously generated, carve it out, and then inject it into the target binary.
To carve it out, we will use symbols to locate the function by its name. For ELF files, the lief API get_static_symbol did the job, whereas it did not work for PE files. No worries though, using radare2 it is almost as easy. Then, lief offers the API get_content_from_virtual_address, which allows copying the bytes making up the decryption routine.
Then, inject it as follows:
In practice, it is not possible to encrypt 100% of the strings in a binary:
So, while we could encrypt around 2000 strings within mimikatz, Windows Defender still detected the binary statically. It’s quite a shame to encrypt that many strings and miss the only 5 strings that actually trigger the detection, but c’est la vie.
To improve this tool and allow it to actually circumvent antivirus software, more advanced analysis should be performed on the binary, in order to identify more cross-references and handle scenarios where a cross-reference points to a collection of strings rather than the string directly. There are some treasures in the floss codebase, and probably some of the problems they solved while making their tool could be helpful here as well.
Or, one can embrace the current limitations and only encrypt strings which are definitely going to trigger the antivirus, hoping they are not located within an array.