API address resolution (Emotet)

Emotet has been active again for the past two weeks, so I got my hands on one of the samples (3e269b0ba5c550cd0636355f2b8da977dac2dc4ad42bcf8b917322006ccf4745) that someone tweeted and started dissecting it. Glancing over some of the functions, I came across techniques that malware authors often use to make life difficult for novice analysts, and I saw this as an opportunity to blog about them.
Several articles already describe these techniques in detail; this post simply puts the tricks into context.

Note: I have renamed a few functions and variables, so I will mention addresses wherever possible to avoid confusion.

I am going to look at the procedure at 0x401BE0. This function is called to find the addresses of other APIs, and it accepts the module (DLL) and the procedure (API function) as its arguments.

functioncall.png

As we can see, it is called from several places.

The first thing it does is push an argument onto the stack for the call to the function at 0x401221. That function accepts a DLL name, which in this case is kernel32.dll.

resolvedlladdress.png
To find the API name behind the CRC hash 6FC49B7Ch, I used the IDA script shellcode_hash_search.py.
Below is the output of the script.
shellcode_hash: 0x00401c21: crc32:0x6fc49b7c kernel32.dll!LoadLibraryExW
shellcode_hash: 0x00401c74: crc32:0xc97c1fff kernel32.dll!GetProcAddress
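
The idea behind this kind of lookup can be sketched in a few lines of Python: hash every known export name ahead of time, then map a constant found in the binary back to a name. This is only a sketch under assumptions. It assumes the hash is the standard CRC-32 of the ASCII export name (which is what `zlib.crc32` computes), and the candidate list `CANDIDATE_EXPORTS` and the helper names are hypothetical; the real script works from a much larger prebuilt set of hashes.

```python
import zlib

# Hypothetical, deliberately tiny list of export names to hash.
CANDIDATE_EXPORTS = {
    "kernel32.dll": ["LoadLibraryExW", "GetProcAddress", "VirtualAlloc", "CreateFileW"],
}


def build_reverse_table(exports):
    """Map crc32(export name) -> 'dll!name' for every candidate export."""
    table = {}
    for dll, names in exports.items():
        for name in names:
            table[zlib.crc32(name.encode("ascii"))] = f"{dll}!{name}"
    return table


def lookup(table, hash_value):
    """Return the 'dll!name' string for a hash constant, or None if unknown."""
    return table.get(hash_value)
```

A constant pulled from the disassembly would be passed to `lookup`; a `None` result only means the name is not in the candidate list, not that the constant is not a hash.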
 
I also used FLARE's stackstrings.py to reassemble the strings built on the stack, which makes the analysis easier. The stackstrings script requires vivisect to run.
 
The function at 0x401AE0 returns the address of the Process Environment Block (PEB), which it reads from fs:[0x30].
 
getpeb.png
 
Since the PEB holds the list of loaded modules, one can enumerate them by following the appropriate structures. That is the purpose of the instructions below.
.text:00401B03 E8 9B F7 FF FF                                      call    GetPEB          ; FS:[0x30]
.text:00401B08 8B 40 0C                                            mov     eax, [eax+0Ch]
.text:00401B0B 8B 78 0C                                            mov     edi, [eax+0Ch]
.text:00401B0E 8B 5C 24 10                                         mov     ebx, [esp+0Ch+arg_0]
In the PEB structure, the member at offset 0x0C points to information about all the modules loaded in the process.
0x00C _PEB_LDR_DATA* Ldr;
peb.png
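If we look at the PEB_LDR_DATA structure, the members at offsets 0x0C and 0x14 hold the module lists (InLoadOrderModuleList and InMemoryOrderModuleList); the second mov above reads the first entry of the list at 0x0C. (ref: https://www.aldeid.com/wiki/PEB_LDR_DATA)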
 
The loop below then iterates through the loaded modules and checks each one against the DLL we want. When the right DLL is found, the module's base address is returned.
 
ldr_list.png
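
To make the pointer chasing concrete, here is a minimal Python sketch that walks the same chain (PEB, then Ldr, then the load-order list) over a fake 32-bit address space. The offsets are the documented 32-bit ones; everything else is an assumption for illustration: the `FakeMemory` class, the made-up addresses in `build_demo_process`, and the case-insensitive name comparison (the sample's own comparison is only visible in the screenshot, so it may well differ).

```python
import struct


class FakeMemory:
    """Tiny stand-in for a 32-bit address space (illustrative only)."""

    def __init__(self):
        self.regions = {}  # start address -> bytes

    def write(self, addr, data):
        self.regions[addr] = bytes(data)

    def read(self, addr, size):
        for start, blob in self.regions.items():
            if start <= addr and addr + size <= start + len(blob):
                return blob[addr - start:addr - start + size]
        raise ValueError(f"unmapped read at {addr:#x}")

    def u16(self, addr):
        return struct.unpack("<H", self.read(addr, 2))[0]

    def u32(self, addr):
        return struct.unpack("<I", self.read(addr, 4))[0]


def find_module_base(mem, peb, wanted_name):
    """Walk InLoadOrderModuleList and return DllBase of the named module."""
    ldr = mem.u32(peb + 0x0C)             # PEB.Ldr
    head = ldr + 0x0C                     # &PEB_LDR_DATA.InLoadOrderModuleList
    entry = mem.u32(head)                 # first LDR_DATA_TABLE_ENTRY (Flink)
    while entry != head:                  # the list is circular
        length = mem.u16(entry + 0x2C)    # BaseDllName.Length, in bytes
        buffer = mem.u32(entry + 0x30)    # BaseDllName.Buffer
        name = mem.read(buffer, length).decode("utf-16-le")
        if name.lower() == wanted_name.lower():
            return mem.u32(entry + 0x18)  # DllBase
        entry = mem.u32(entry)            # InLoadOrderLinks.Flink
    return None


def build_demo_process():
    """Lay out a PEB, loader data and two module entries at made-up addresses."""
    mem = FakeMemory()
    peb, ldr = 0x1000, 0x2000
    head = ldr + 0x0C
    modules = [(0x3000, 0x00400000, "sample.exe"), (0x3100, 0x76000000, "KERNEL32.DLL")]
    mem.write(peb, bytes(0x0C) + struct.pack("<I", ldr))
    mem.write(ldr, bytes(0x0C) + struct.pack("<II", modules[0][0], modules[-1][0]))
    for i, (addr, base, name) in enumerate(modules):
        flink = modules[i + 1][0] if i + 1 < len(modules) else head
        raw = name.encode("utf-16-le")
        name_addr = addr + 0x40
        entry = bytearray(0x34)
        struct.pack_into("<I", entry, 0x00, flink)
        struct.pack_into("<I", entry, 0x18, base)
        struct.pack_into("<HHI", entry, 0x2C, len(raw), len(raw), name_addr)
        mem.write(addr, entry)
        mem.write(name_addr, raw)
    return mem, peb
```

Calling `find_module_base(*build_demo_process(), "kernel32.dll")` returns the made-up base 0x76000000; the walk stops with `None` once the Flink pointer comes back around to the list head.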
 
Once the function at 0x401221 returns the address of the DLL, that address is pushed onto the stack as an argument for the next function, at 0x401019. This function accepts the CRC hash of the API whose address we want and the base address of the DLL.
 
It takes the DLL address, finds the export directory by traversing the PE headers as shown below, then iterates over the exported function names, calculating the CRC hash of each one and comparing it with the hash of the function we are looking for.
.text:00401B54 8B 74 24 14                                         mov     esi, [esp+10h+arg_0] ; module (DLL) base address
.text:00401B58 8B 46 3C                                            mov     eax, [esi+3Ch]  ; PE header pointer
.text:00401B5B 8B 44 30 78                                         mov     eax, [eax+esi+78h] ; export directory rva: .text section
.text:00401B5F 8B 5C 30 20                                         mov     ebx, [eax+esi+20h] ; AddressOfNames rva
.text:00401B63 8B 6C 30 1C                                         mov     ebp, [eax+esi+1Ch] ; AddressOfFunctions rva
.text:00401B67 8B 4C 30 24                                         mov     ecx, [eax+esi+24h] ; AddressOfNameOrdinals rva
.text:00401B6B 03 C6                                               add     eax, esi        ; export directory address
.text:00401B6D 8B 40 18                                            mov     eax, [eax+18h]  ; NumberOfNames
 
More about PE header here.
 
crc_loop.png
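
The header traversal in the listing can be mirrored in Python with `struct`. The sketch below is an illustration, not the sample's code: it assumes a 32-bit (PE32) image that is laid out as it would be in memory, so an RVA can be used directly as an index into the buffer. The function names and the tiny synthetic image produced by `build_demo_image` are hypothetical and exist only so the example runs without a real DLL.

```python
import struct


def read_export_tables(image: bytes) -> dict:
    """Read the export-directory fields that the listing loads into registers."""
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]               # [esi+3Ch]
    export_rva = struct.unpack_from("<I", image, e_lfanew + 0x78)[0]  # [eax+esi+78h]

    def field(offset):
        return struct.unpack_from("<I", image, export_rva + offset)[0]

    return {
        "NumberOfNames": field(0x18),
        "AddressOfFunctions": field(0x1C),
        "AddressOfNames": field(0x20),
        "AddressOfNameOrdinals": field(0x24),
    }


def build_demo_image() -> bytes:
    """Build a minimal fake image holding only the fields read above."""
    image = bytearray(0x200)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x80)           # e_lfanew -> PE header
    image[0x80:0x84] = b"PE\x00\x00"
    struct.pack_into("<I", image, 0x80 + 0x78, 0x100)   # export directory RVA
    struct.pack_into("<I", image, 0x100 + 0x18, 2)      # NumberOfNames
    struct.pack_into("<III", image, 0x100 + 0x1C, 0x140, 0x150, 0x160)
    return bytes(image)
```

Against a real module the same offsets apply only to 32-bit images; in a 64-bit PE the data directories sit at a different offset from the PE header.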
 

Below is Ghidra's decompiled pseudocode for the function that calculates the CRC of the API function names.

uint __cdecl CRCSomething(byte *param_1)
{
  byte bVar1;
  uint uVar2;
  uint uVar3;
  
  bVar1 = *param_1;
  uVar2 = 0xffffffff;
  while (bVar1 != 0) {
    uVar3 = uVar2 ^ (uint)bVar1;
    param_1 = param_1 + 1;
    uVar2 = (int)(uVar3 << 0x1e) >> 0x1f & 0xee0e612cU ^
            (int)(uVar3 << 0x1f) >> 0x1f & 0x77073096U ^ (int)(uVar3 << 0x1d) >> 0x1f & 0x76dc419U ^
            (int)(uVar3 << 0x19) >> 0x1f & 0x76dc4190U ^ (int)(uVar3 << 0x1a) >> 0x1f & 0x3b6e20c8U
            ^ (int)(uVar3 << 0x1b) >> 0x1f & 0x1db71064U ^ (int)(uVar3 << 0x1c) >> 0x1f & 0xedb8832U
            ^ uVar2 >> 8 ^ (int)(uVar3 << 0x18) >> 0x1f & 0xedb88320U;
    bVar1 = *param_1;
  }
  return ~uVar2;
}
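
The eight constants in the pseudocode are the standard CRC-32 table entries for the single-bit indices 1, 2, 4, up to 128, so the routine appears to be an ordinary CRC-32 computed without a lookup table. A short Python re-implementation makes this easy to check against `zlib.crc32`. This is my own reading of the decompiler output rather than code from the sample, and the names `SINGLE_BIT_ENTRIES` and `crc_something` are mine.

```python
import zlib

# CRC-32 table entries for indices 1, 2, 4, 8, 16, 32, 64, 128 (bit 0 .. bit 7).
SINGLE_BIT_ENTRIES = (
    0x77073096, 0xEE0E612C, 0x076DC419, 0x0EDB8832,
    0x1DB71064, 0x3B6E20C8, 0x76DC4190, 0xEDB88320,
)


def crc_something(name: bytes) -> int:
    """Mirror of the decompiled loop: one XOR per set bit of (crc ^ byte)."""
    crc = 0xFFFFFFFF
    for byte in name:
        mixed = crc ^ byte
        crc >>= 8
        for bit, entry in enumerate(SINGLE_BIT_ENTRIES):
            if (mixed >> bit) & 1:   # same test as the shift-left/shift-right mask
                crc ^= entry
    return crc ^ 0xFFFFFFFF          # final bitwise NOT
```

For any byte string, `crc_something(data)` should equal `zlib.crc32(data)`, which is why off-the-shelf CRC-32 hash lists can identify the API names.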
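Once the matching export name is found, its index is used to read the function's ordinal and then its offset (RVA) from the export tables. The offset is added to the base address of the loaded DLL, and the resulting address is returned.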
.text:00401BAA                                     loc_401BAA:                             ; CODE XREF: sub_401B50+47↑j
.text:00401BAA 8B 54 24 10                                         mov     edx, [esp+14h+var_4]
.text:00401BAE 0F B7 04 7A                                         movzx   eax, word ptr [edx+edi*2]
.text:00401BB2 8B 44 85 00                                         mov     eax, [ebp+eax*4+0]
.text:00401BB6 5F                                                  pop     edi
.text:00401BB7 03 C6                                               add     eax, esi        ; function rva + module base = address of the func
.text:00401BB9 5E                                                  pop     esi
.text:00401BBA 5D                                                  pop     ebp
.text:00401BBB 5B                                                  pop     ebx
.text:00401BBC 59                                                  pop     ecx
.text:00401BBD C3                                                  retn
.text:00401BBD                                     sub_401B50      endp
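
The final lookup (name index, then ordinal, then function RVA, then base plus RVA) can be sketched over plain Python lists standing in for the three export tables. The helper name `resolve_by_hash` and the table contents in the usage note are made up for illustration, and the sketch assumes the hash is standard CRC-32, as the constants in the decompiled routine suggest.

```python
import zlib


def resolve_by_hash(base, names, name_ordinals, function_rvas, wanted_hash):
    """Return the address of the export whose name hashes to wanted_hash."""
    for index, name in enumerate(names):
        if zlib.crc32(name) == wanted_hash:
            ordinal = name_ordinals[index]   # movzx eax, word ptr [edx+edi*2]
            rva = function_rvas[ordinal]     # mov   eax, [ebp+eax*4+0]
            return base + rva                # add   eax, esi
    return None                              # no export name matched
```

For example, with `names = [b"Alpha", b"Beta", b"Gamma"]`, `name_ordinals = [2, 0, 1]`, `function_rvas = [0x1000, 0x2000, 0x3000]` and a base of 0x10000000, asking for `zlib.crc32(b"Beta")` goes through ordinal 0 and yields 0x10001000. The indirection through the ordinal table matters because the name table and the function table are not in the same order.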
The function at 0x401BE0 uses the methods above to resolve the addresses of two APIs, LoadLibraryExW and GetProcAddress. As we know, these two APIs can in turn be used to resolve any other API.
 
 
This sample carries out many other concealed activities, as Emotet variants usually do. But the behavior described above is enough on its own to find similar samples from this campaign.

Similar samples:
25a4ae2a1ce6dbe7da4ba1e2559caa7ed080762cf52dba6c8b55450852135504
a892ec890794466a5d6285e15f316736191c59d2613f2925023e011c5829584d
a1146d2500065673896452578c39ea401c9648306cebadcd7c6f3421c058b5de
4fd4474eaa8d7631d01bd0893641e4b467238b6e6a3fc432d9a324edd635ca77
e92dd00b092b435420f0996e4f557023fe1436110a11f0f61fbb628b959aac99

PS: I apologize for this long hiatus. I will try to be consistent this time around and post more articles like these.
 
