An approach similar to modifying assembly code to direct control flow can be used to de-obfuscate and reverse-engineer Java malware or any compiled Java classes for that matter. In this post we will look at one such instance where this technique proved useful.
TL;DR – Skip to Bytecode
During the second week of December 2021, news of the Log4Shell vulnerability (CVE-2021-44228) wreaked havoc in every sector that uses Log4j, a widely used Java logging framework, either directly or via third-party modules. We have all heard about and dealt with the fallout ad nauseam, so I will refrain from discussing the vulnerability itself. Instead, we will analyze a sample from the Khonsari malware family.
vx-ug has been collecting samples that leverage the Log4Shell vulnerability, and this sample (SHA-256: 86fc70d24f79a34c46ef66112ef4756639fcad2f2d7288e0eeb0448ffab90428) was obtained from that collection, where it is filed under Orcus RAT. I attribute this particular sample to Khonsari, even though it is filed under Orcus RAT, because of the package name of its classes, which we will see during the analysis. Additionally, according to this article by Bitdefender, the final payload injected into conhost.exe is Orcus RAT.
Java malware is seldom complicated, and most of it can be reverse-engineered either by analyzing the decompiled Java classes alone or by adding dynamic analysis in tandem to enrich the findings from static analysis.
After busting open the JAR sample, we can see three classes under the khonsari package. Notice also that it uses the JNA library, presumably for native calls. The manifest file tells us that the entry point of this JAR is khonsari.A.
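The manifest entry point and the class listing can also be pulled out programmatically, without running anything from the sample. The following is a minimal sketch using only java.util.jar; the class name JarTriage and its two helper methods are my own illustrative names, not part of the sample or of any tool mentioned in this post.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class JarTriage {
    /** Returns the Main-Class attribute of the JAR manifest, or null if there is none. */
    static String mainClass(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            return manifest == null ? null : manifest.getMainAttributes().getValue("Main-Class");
        }
    }

    /** Returns the names of all .class entries in the JAR, in archive order. */
    static List<String> classEntries(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            jar.stream()
               .map(JarEntry::getName)
               .filter(name -> name.endsWith(".class"))
               .forEach(names::add);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarTriage <path-to-jar>");
            return;
        }
        System.out.println("Main-Class: " + mainClass(args[0]));
        classEntries(args[0]).forEach(System.out::println);
    }
}
```

Reading entries this way only parses the archive; it does not load or execute any class from it, which is what we want when handling a live sample.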
We can use any decompiler to look at the classes. My usual go-to is jd-gui. However, because decompiling Java bytecode to Java source is not lossless, it failed on some code blocks, especially the blocks involved in decoding obfuscated strings.
Even with this minor setback we could figure out the tasks that this sample is carrying out just by doing some static analysis.
- A.R(Object paramArrayOfObject) is used for HTTP communication.
    // 34: invokestatic a : (III)Ljava/lang/String;
    // 37: invokespecial <init> : (Ljava/lang/String;)V
    // 40: invokevirtual openConnection : ()Ljava/net/URLConnection;
    // 43: checkcast java/net/HttpURLConnection
- The same method also carries out some sort of decryption.
    // 321: invokestatic a : (III)Ljava/lang/String;
    // 324: invokestatic getInstance : (Ljava/lang/String;)Ljavax/crypto/Cipher;
    // 327: astore #13
    // 329: aload #13
    // 331: iconst_2
    // 332: new javax/crypto/spec/SecretKeySpec
- a() in every class is used for string deobfuscation, since it is called wherever string values are assigned.
    invokestatic a : (III)Ljava/lang/String;
- Judging by the Windows API calls, P.k() injects code into another process.
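The iconst_2 pushed right after Cipher.getInstance is a useful tell: 2 is the value of Cipher.DECRYPT_MODE, so the listing most likely corresponds to cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(...)). A rough Java-level sketch of that pattern follows. The sample decodes its transformation string and key at runtime and neither is known at this stage, so the AES/ECB/PKCS5Padding string and the key in main are placeholders for the demo only, and DecryptSketch is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class DecryptSketch {
    /** Runs a cipher in the given mode; the key algorithm is the prefix of the transformation. */
    static byte[] run(int mode, String transformation, byte[] key, byte[] data)
            throws GeneralSecurityException {
        String keyAlgorithm = transformation.split("/")[0];
        Cipher cipher = Cipher.getInstance(transformation);
        cipher.init(mode, new SecretKeySpec(key, keyAlgorithm));
        return cipher.doFinal(data);
    }

    /** Same call shape as the bytecode: getInstance, then init with mode 2 and a SecretKeySpec. */
    static byte[] decrypt(String transformation, byte[] key, byte[] data)
            throws GeneralSecurityException {
        return run(Cipher.DECRYPT_MODE, transformation, key, data); // DECRYPT_MODE == 2 (iconst_2)
    }

    public static void main(String[] args) throws GeneralSecurityException {
        String transformation = "AES/ECB/PKCS5Padding";                       // placeholder, not from the sample
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);  // placeholder 16-byte key
        byte[] encrypted = run(Cipher.ENCRYPT_MODE, transformation, key,
                "payload".getBytes(StandardCharsets.UTF_8));
        byte[] decrypted = decrypt(transformation, key, encrypted);
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }
}
```

Spotting small integer constants such as this one is often enough to tell encryption from decryption before any string has been decoded.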
Yet when we execute the JAR after learning all of this, we do not observe any of that activity.
More Trials, More Errors
- I considered Bytecode Viewer as well, since it provides multiple decompilers to choose from. Unfortunately, none of them could generate valid source code. It was interesting to see static initialization blocks containing break, which is not allowed there, but that is expected because compilation <-> decompilation is not lossless.
- Debugging the sample was a futile endeavour too, since it requires visibility into the stack and local variables. Here is an old but decent article on how to debug bytecode using Dr. Garbage.
- Then I decided to edit the bytecode so that the relevant code blocks execute and de-obfuscate the strings. I came across JBE, but it reported a lot of syntax errors, so saving the bytecode after edits was out of the question.
- Finally, I landed on Recaf. Initially, it was a little disappointing to find that it uses the same decompilers as Bytecode Viewer, and as expected the same parsing errors stopped us from editing the classes. It turns out that Recaf has other Class Modes that do allow editing.
At this point I suspected that these classes had been written directly in bytecode to make analysis difficult, but that is mere speculation.
Bytecode is the instruction set executed by the Java Virtual Machine (JVM). All Java source is translated to bytecode when it is compiled. If we can modify the bytecode of these classes, we can control the entire flow of the malware.
We are going to modify the method epilogue of A.a(), which is the method of concern for decoding the string. Printing out the string to stdout before it is returned by the method will do the trick.
These four instructions are doing the following:
- Duplicating the final String value onto the stack, so that we do not lose the reference to it.
- Printing it out using our good old friend ‘
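The patched instructions themselves are not reproduced in the text here. One common sequence that matches the two steps is dup, getstatic System.out, swap, invokevirtual println, placed just before areturn; treat that as an assumption rather than the exact edit made in Recaf. The sketch below shows the Java-level effect on a hypothetical stand-in for A.a(). The XOR decoder and the ENCODED table are invented for illustration and are not the sample's algorithm.

```java
public class EpilogueSketch {
    // Hypothetical obfuscated data: "hello" with every character XORed with 42.
    private static final char[] ENCODED = {'B', 'O', 'F', 'F', 'E'};

    /** Stand-in for a decoder with the same (III)Ljava/lang/String; shape as A.a(). */
    static String a(int x, int y, int z) {
        int key = (x ^ y ^ z) & 0xFF;
        char[] decoded = new char[ENCODED.length];
        for (int i = 0; i < ENCODED.length; i++) {
            decoded[i] = (char) (ENCODED[i] ^ key);
        }
        String result = new String(decoded);

        // Patched epilogue, with the decoded String on top of the operand stack:
        //   dup                                                        stack: s, s
        //   getstatic java/lang/System.out : Ljava/io/PrintStream;     stack: s, s, out
        //   swap                                                       stack: s, out, s
        //   invokevirtual java/io/PrintStream.println : (Ljava/lang/String;)V   stack: s
        System.out.println(result);

        return result; // areturn still finds the original reference on the stack
    }

    public static void main(String[] args) {
        a(1, 2, 41); // 1 ^ 2 ^ 41 == 42, the key used for ENCODED
    }
}
```

The swap is needed because println expects the PrintStream receiver below its argument, while dup leaves the two String references on top; without the dup, the value the method is supposed to return would be consumed by the call.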
Dynamic + Static Analysis
As we proceed to execute the JAR again, four String values are printed out. The first three can be ignored, as they are the initial values of class variables. The fourth string is interesting: as inferred from the partially decompiled code, it is checked against the actual argument of main. Let’s supply it as a command-line argument when executing the JAR.
Presto! We are blind no more. If we follow the code and match it up with the occurrences of A.a() calls, it’s clear that the method A.y() is used for persistence. As is evident from the decoded string, a Run key has been added to the registry.
The next strings relate to the HTTP communication in A.R(). The URL suggests that the file dorflersaladreviews.bin.encrypted is fetched via an HTTP GET request. If we look at the other file under the Orcus RAT URL where vx-ug uploaded the samples, it’s the same encrypted file, with hash 295aa53d4f104ee8532593b17eaf6b31b8c065de922e4507879cecb13f0d3504. Simulating a fake network and hosting this encrypted file allows the code to execute further.
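One way to host the file inside an isolated lab is the small HTTP server that ships with the JDK. This is only a sketch under a few assumptions: FakePayloadHost is a hypothetical name, the URL path is assumed to be the bare file name, and the lab still needs its own DNS or hosts-file redirect so that the sample's domain resolves to the machine running this server.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;

public class FakePayloadHost {
    /** Starts a loopback HTTP server that answers requests under urlPath with the file's bytes. */
    static HttpServer serve(Path file, String urlPath, int port) throws IOException {
        byte[] body = Files.readAllBytes(file);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext(urlPath, exchange -> {
            exchange.getResponseHeaders().add("Content-Type", "application/octet-stream");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FakePayloadHost <encrypted-file> <port>");
            return;
        }
        // Assumed path: the file name directly under the server root.
        HttpServer server = serve(Path.of(args[0]), "/dorflersaladreviews.bin.encrypted",
                Integer.parseInt(args[1]));
        System.out.println("Serving on " + server.getAddress());
    }
}
```

Binding to 127.0.0.1 keeps the payload off any real network interface. Listening on port 80 usually needs elevated privileges, which is why the port is left as an argument.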
As indicated by the code alongside our stdout output, the encrypted payload is decrypted, supplied to the method P.k(), and eventually injected into
Sometimes simple tricks can go a long way while reverse-engineering malware. In our case, just a few instructions were enough to gain more insight into the workings of this sample.
|test.verble.rocks||Domain where both these files were hosted|