Chapters Eight and Nine focused on dynamic analysis of programs. Once the basics were out of the way in Chapter Eight, we shifted focus to using OllyDbg to fulfil our dynamic analysis objectives. Let’s get to solving problems from this chapter!

Exercise 1

Hash | Name
b94af4a4d4af6eac81fc135abda1c40c | Lab09-01.exe
d6356b2c6f8d29f8626062b5aefb13b7fc744d54 | Lab09-01.exe
6ac06dfa543dca43327d55a61d0aaed25f3c90cce791e0555e3e306d47107859 | Lab09-01.exe

Preface: Analyze the malware found in the file Lab09-01.exe using OllyDbg and IDA Pro to answer the following questions. This malware was initially analyzed in the Chapter 3 labs using basic static and dynamic analysis techniques.

Analysis: Let’s take this particular sample through our standard malware analysis process. I’m going to statically analyze the binary and see what information can be gathered without interacting with it. Opening up the binary in PE Studio, we can find:

  • It’s a 32-bit console application
  • It was compiled on Oct 18, 2011 (as per the file-header)

Basic string analysis shows us the following strings:

  • GET/DOWNLOAD/UPLOAD: Functions might indicate the functionality embedded within the program (possibly a backdoor?)
  • http://www.practicalmalwareanalysis.com: URL might be used to communicate with the TA’s C2 server
  • cmd.exe: Launch command prompt on the compromised endpoint
  • SOFTWARE\Microsoft \XPS: Registry key might be used by the malware to persist (we’ll explore this relation later)
  • k:%s h:%s p:%s per:%s: A format string (needs more context)

Let’s shift our attention to advanced static analysis and fire this binary up in IDA. Main starts at 00402AF0 and this is where we’ll begin our hunt.

It kicks off with a simple argument count check, comparing argc to 1. If no argument was supplied, it queries the registry key: SOFTWARE\MICROSOFT\XPS\Configuration. If the value is 0, the handle is closed (actually, the handle is closed regardless of the value of the key). After this check, a call to the function at 00402B13 is made, in which an interesting operation takes place. For this, it’d be better if we switch our focus to dynamic analysis via OllyDbg.

Firing up the binary in OllyDbg, let’s quickly get to the function call by stepping into the main function, followed by a few steps in the function. Once you’re at the function call, step into it. First of all, it makes a call to GetModuleFileNameA which, when no module handle is provided, returns the full path of the executable of the current process. We can also see a few offsets, i.e. /c del and >> NUL. Using OllyDbg, step through this function and take special notice of the registers. See how the string is gradually assembled into the final parameter for the ShellExecute call, which references cmd.exe and the string offsets and goes on to delete the executing binary. The program won’t be able to delete the binary on my end, though, as it is open in IDA and OllyDbg.
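To make the string assembly a little more concrete, here is a minimal Python sketch of how the pieces might fit together. The function name, the example path, and the exact spacing around the fragments are my own assumptions; only cmd.exe, /c del and >> NUL come from the binary.

```python
def build_self_delete_command(module_path: str) -> tuple[str, str]:
    """Return (program, parameters) roughly as they would reach ShellExecute."""
    program = "cmd.exe"
    # Assumption: the two fragments are simply concatenated around the path
    # returned by GetModuleFileNameA.
    parameters = "/c del " + module_path + " >> NUL"
    return program, parameters


# Hypothetical location of the sample, for illustration only.
program, parameters = build_self_delete_command(r"C:\samples\Lab09-01.exe")
print(program, parameters)
```

Redirecting the output to NUL keeps the command prompt from showing anything while the file is removed.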

Okay, what happens if an argument is supplied, i.e. the argument count is greater than 1? Let’s explore that route now. While opening the binary in OllyDbg, there’s an option to pass an argument to it. Let’s pass a random argument to at least meet our argument count conditional.

Providing Arguments to Ollydbg

There’s one other way to pass arguments to the process: open the Debug menu and select Arguments. Simply add in your arguments and restart the program by pressing CTRL+F2.

Passing Arguments

Now, let’s quickly shift to the main function at 0x403945. Our argument check is now fulfilled. Let’s review the branch at 402B1D. It takes in the values of argc and argv and stores them in EAX and ECX respectively. Next, we see a slightly cryptic operation where the value at [ECX+EAX*4-4] is moved into EDX. Since EAX holds the number of arguments (2 in our case) and ECX points to the array of arguments passed to the program, this pointer arithmetic references the last element of the argv array.
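The pointer arithmetic is easier to see with numbers plugged in. This is a small sketch assuming a 32-bit process (4-byte pointers). The base address of argv is a made-up value, and the list stands in for the real array of string pointers.

```python
POINTER_SIZE = 4  # 32-bit process, so every argv entry is 4 bytes wide


def last_arg_slot(argv_base: int, argc: int) -> int:
    """Address read by [ECX+EAX*4-4] with ECX = argv_base and EAX = argc."""
    return argv_base + argc * POINTER_SIZE - POINTER_SIZE


argv = ["Lab09-01.exe", "aaaa"]  # program name plus one test argument
argv_base = 0x00AB0000           # hypothetical address of the argv array
slot = last_arg_slot(argv_base, len(argv))
index = (slot - argv_base) // POINTER_SIZE
print(hex(slot), index, argv[index])
```

With two arguments the slot lands 4 bytes past the start of the array, which is argv[1], the last element.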

This last element is then fed to a function at 0x00402510 which performs some operation on this argument. Alright, so let’s re-execute the binary in our debugger with a sample argument, aaaa. Going into the function’s disassembly, I’ve renamed the argument to lastArgumentOfArgv for it to make more sense to me. Next, we see ECX being OR’ed with a large number, effectively turning it into a counter. Then, the following code segment is executed:

    xor eax, eax
    repne scasb
    not ecx
    add ecx, 0FFFFFFFFh
    cmp ecx, 4

SCASB stands for ‘SCAn String (Byte)’; it compares the byte in AL with the byte at ES:EDI and then advances EDI. When paired with REPNE, the comparison is repeated until the zero flag is set (a match was found) or ECX reaches 0. Since EAX was just zeroed and ECX holds a very large number, the scan stops at the string’s NUL terminator long before ECX could run out. The NOT and ADD instructions that follow turn what is left in ECX into the length of the string, and ECX is then compared to 4 to see if the length of the parameter is 4. Let’s take a look at it in the debugger:

REPNE SCASB
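If the idiom still feels opaque, it can be emulated in a few lines. The sketch below assumes ECX starts at 0xFFFFFFFF (the usual value for this strlen pattern) and mimics the register updates one step at a time; the function name is mine.

```python
MASK32 = 0xFFFFFFFF


def scasb_strlen(buffer: bytes) -> int:
    """Emulate xor eax,eax / repne scasb / not ecx / add ecx,-1 on a NUL-terminated buffer."""
    ecx = MASK32  # assumption: counter preset to all ones
    al = 0        # xor eax, eax
    edi = 0       # offset into the buffer instead of a real pointer
    while ecx != 0:
        byte = buffer[edi]
        edi += 1
        ecx = (ecx - 1) & MASK32
        if byte == al:  # zero flag set, REPNE stops
            break
    ecx = ~ecx & MASK32            # not ecx
    ecx = (ecx + MASK32) & MASK32  # add ecx, 0FFFFFFFFh (same as subtracting 1)
    return ecx


print(scasb_strlen(b"aaaa\x00"))
```

Five bytes are scanned for aaaa (four characters plus the terminator), NOT turns the counter into 5, and the final ADD drops the terminator, leaving 4.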

Since our test argument was four characters long, we make the jump to loc_40252D. Since the last element is essentially a character array, the pointer to its characters can be incremented by 1 to move on to the next character. Here, the first character is compared against a. Since our argument passes this check, execution continues.

Alphabet Check

Here, the array is moved back to EAX, the pointer is incremented to point towards the second element, and the value is moved to VAR_4. The same array is moved back to EDX i.e. pointer to the first element and the two values (first and second character) are subtracted and later compared with 1. Their difference can only be 1 if the next character is b. Alright this doesn’t hold true for us but let’s take a look at our static disassembly.

Password Check

You will see that the next comparisons are for the characters ‘c’ and then ‘d’, after which the comparisons end (ultimately, the password here should be ‘abcd’). If these comparisons are true, EAX is set to 1 and the value is returned. That’s it! It’s more like a password provided to the program as an argument. But… we can completely bypass this check by patching the code and returning 1. These instructions can actually do it all:

    MOV EAX, 1
    RETN

Bypassing Password Check

Once patched, we can continue our analysis. Jumping to the branch at 402B3F, we can see the second element of the argv array being moved into a variable and later pushed to the stack for the function call at 40380F. After a short debugging session, I’ve come to the realization that this function simply checks whether the second argument matches one of the given commands (referenced as offsets).

  • IN
  • RE
  • C
  • CC

Let’s re-execute the binary with the second argument being -in. With this change, we’ve skipped the branch (to continue looking for potential command-line switches). Next, the argument count is compared to 3 and later to 4; here the third argument (after the switch and before the password) would indicate the lpServiceName variable (indicating the name of the service to be installed).

Service Control Manager API Calls

Let’s analyze the two in detail:

  • If three arguments are provided to the binary at run-time, the function call at 4025B0 is executed. I won’t be diving into this function as all it does is strip the file path (which it retrieves using a call to GetModuleFileName) and return the filename, minus its extension, as the potential name of the service.

  • Let’s re-execute the binary in our debugger by providing four arguments (I’ll be using program.exe -in ServiceX abcd as my argument). As expected, the third argument (ServiceX) in my case is sent as a parameter to the function at 04025B0.

ServiceX

Eventually, the function at 402600 is called with the lpServiceName as the name of the service to be installed on the system. Using a combination of REPNE SCASB operations, a path is stitched together such that it is: %SYSTEMROOT%\system32\{NAME_OF_THE_EXECUTABLE}.exe. A series of calls related to the Service Control Manager are made and the branch at 0040277D contributes to stitching together the DisplayName of the service, which is equal to {NAME_OF_EXECUTABLE} Manager Service. Finally, the service is created using the CreateServiceA API call. Since the referenced binary file path for the service was set to %SYSTEMROOT%\system32\{NAME_OF_THE_EXECUTABLE}.exe, the running binary is copied to that path using CopyFile and its file time is set to that of kernel32.dll (timestomping) so as to evade defenses.
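Here is how I picture the naming scheme, as a rough Python sketch. It assumes the same name feeds the service name, the display name, and the copied binary, and that %SYSTEMROOT% resolves to C:\Windows; the helper names and the example path are hypothetical.

```python
import ntpath  # Windows path rules, regardless of where the sketch runs


def default_service_name(module_path: str) -> str:
    """Strip the directory and the extension, as the function at 4025B0 appears to do."""
    filename = ntpath.basename(module_path)
    name, _extension = ntpath.splitext(filename)
    return name


def service_artifacts(service_name: str, system_root: str = r"C:\Windows") -> dict:
    """Derive the strings that end up in the service configuration."""
    return {
        "service_name": service_name,
        "display_name": f"{service_name} Manager Service",
        "binary_path": ntpath.join(system_root, "system32", f"{service_name}.exe"),
    }


print(default_service_name(r"C:\samples\Lab09-01.exe"))
print(service_artifacts("ServiceX"))
```

Running it with ServiceX mirrors the four-argument case from my debugging session, while the first call mirrors the three-argument case where the name is taken from the executable itself.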

The Service

It eventually calls out to the function at 401070 to create a new registry key at SOFTWARE\Microsoft\XPS and sets the value of Configuration to http://www.practicalmalwareanalysis[.]com:80 (I could be wrong here as it was a lot of string manipulation which I seem to have missed and the API call likely never succeeded on my system either). This is likely a beacon of some kind attempting to connect to the mentioned host using the ups mode (we might uncover other modes later).

Moving on to the second command-line switch, -re, we can see a similar argument check followed by an argument count check (used to check if a custom service name is provided or not), and the function call at 402900 is made. Digging into the function, we can easily see that it simply reverses the effect of -in, i.e. it deletes the service, any binary copied to SYSTEMROOT, and the registry key created earlier.

On to the second-last switch, let’s check out -cc. It checks to see if the argument count is equal to 3 and heads into the function which reads the current network configuration of the malware (set in the registry key discussed earlier). It acquires the values and prints them to the console using the format: k:%s h:%s p:%s per:%s\n. Here’s the output from my own system: k:ups h:http://www.practicalmalwareanalysis.com p:80 per:60.
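The format string maps onto four fields, which a couple of lines of Python can reproduce. The parameter names below are simply the letters from the format string; what each field really means is still a guess at this point.

```python
CONFIG_FORMAT = "k:%s h:%s p:%s per:%s\n"  # format string found in the binary


def format_config(k: str, h: str, p: str, per: str) -> str:
    """Render the configuration the way the -cc switch prints it."""
    return CONFIG_FORMAT % (k, h, p, per)


# Values taken from the output on my system.
line = format_config("ups", "http://www.practicalmalwareanalysis.com", "80", "60")
print(line, end="")
```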

Lastly, the -c switch checks to see if 7 arguments are provided to the program on runtime. I can imagine how it must be the mode of execution (e.g. UPS), host, port, and the time field. I’ll re-execute the program with 7 parameters i.e., -c down http://abc.com 90 60 90 (no clue what is what right now). It attempts to update the configuration in the registry. Since I wasn’t running the binary as an Administrator, the registry changes fail at my end.
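Putting the four switches side by side, the argument handling can be summarized as a small dispatcher. This is a sketch under a few assumptions: the password is always the last argument, the switches are matched exactly as typed, and the "rejected" label is a placeholder because I did not trace what happens on a wrong password or an unknown switch.

```python
PASSWORD = "abcd"  # recovered from the check at 0x402510


def dispatch(argv: list[str]) -> str:
    """Return a label for the branch a given argument list would reach."""
    argc = len(argv)
    if argc == 1:
        return "no-arguments"  # registry check, covered separately
    if argv[-1] != PASSWORD:
        return "rejected"      # placeholder, path not covered here
    switch = argv[1]
    if switch == "-in" and argc in (3, 4):
        return "install"
    if switch == "-re" and argc in (3, 4):
        return "remove"
    if switch == "-cc" and argc == 3:
        return "print-config"
    if switch == "-c" and argc == 7:
        return "update-config"
    return "rejected"


print(dispatch(["Lab09-01.exe", "-in", "ServiceX", "abcd"]))
print(dispatch(["Lab09-01.exe", "-cc", "abcd"]))
```

The argument counts include the program name itself, which is why -cc with just the password already adds up to 3.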

That’s the end of the command-line switches. Now, what happens if no command-line argument is passed? This is where one last branch comes in. It starts off with the function at 401000 and checks to see if the registry key has been configured to beacon to the host. If it is set, the function at 402360 is called.

Default Behavior (no arguments)

The function begins to read the values of the Configuration value (of the registry key) and fills in variables which are later used in functions. 00402020 is where we witness the different modes the backdoor can operate in. Now this one’s a bit too extensive. Let’s dig in!

Right off the bat, we see a function call to 401E60 which is followed by checks to see if any of the following modes of operations are selected by the malware:

  • SLEEP
  • DOWNLOAD
  • UPLOAD
  • NOTHING
  • CMD

We’ll discover their functionality later. Let’s first analyze the function at 401E60. Its first two function calls are quite similar: 401420 and 401470. They retrieve two values from the configuration set in the registry, namely the host and the port.

Registry Value Acquisition Functions

The third function call at 401D80 is a bit different in the sense that it generates a random alphanumeric string every time it’s run, and the resulting string is pushed to the stack. Since the malware did acquire the port and host, this random string might be part of the URL used to acquire a resource from the remote host. The next function call at 401AF0 shows several socket operations exchanging data with the remote host, which suggests the randomly generated resource is requested using a GET request. Follow-up branches compare the returned data against a string made up of backticks and single quotes.

URL Resource

HTTP GET Requests

Since this function has several sub-nested calls, this is where we’ll be concluding our analysis of it. Going back to our caller function, let’s finally take a look at the commands and what they do on the system. From our previous function, the actual command’s value (e.g. the number of seconds to sleep for the SLEEP command) is retrieved and sent to the command for which it was received. Here’s a breakdown of the commands and their functionality (sums up loads of sub-nested calls):

  • SLEEP: Sleep for given time
  • UPLOAD: Function at 4019E0 writes the received file to disk (contrary to what the UPLOAD command should do)

‘UPLOAD’ Functionality

  • DOWNLOAD: Function at 401870 opens up the handle to a file, reads its content, and sends it to the remote server (again, contrary to what DOWNLOAD should do)

‘DOWNLOAD’ Functionality

  • CMD: Executes an arbitrary command on the system using the command prompt

Command Execution (‘CMD’)
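The breakdown above boils down to a lookup from command keyword to behavior. The sketch below assumes the command arrives as a keyword followed by a space and an argument; I did not confirm that wire layout, so treat the parsing as hypothetical. The descriptions follow what the functions actually do, including the swapped UPLOAD and DOWNLOAD naming.

```python
COMMANDS = {
    "SLEEP": "sleep for the given time",
    "UPLOAD": "write the received file to disk",
    "DOWNLOAD": "read a local file and send it to the remote server",
    "CMD": "execute a command through the command prompt",
    "NOTHING": "do nothing",
}


def describe(command_line: str) -> tuple[str, str, str]:
    """Split a command into (keyword, argument, behavior)."""
    # Assumption: keyword and argument are separated by a single space.
    keyword, _, argument = command_line.strip().partition(" ")
    keyword = keyword.upper()
    if keyword not in COMMANDS:
        raise ValueError(f"unknown command: {keyword}")
    return keyword, argument, COMMANDS[keyword]


print(describe("SLEEP 60"))
print(describe("NOTHING"))
```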

Question Number 1: How can you get this malware to install itself?

Pass the -in command-line switch with or without another string to act as the name of the service which the malware installs to persist on the system.

Question Number 2: What are the command-line options for this program? What is the password requirement?

Potential command-line options for this program are:

  • IN (Install service)
  • RE (Remove service)
  • C (Update configuration)
  • CC (Display configuration)

The password of the program is abcd, though it can easily be bypassed (answered next).
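The password requirement can be condensed into a short Python sketch. The length check, the comparison of the first character with a, and the difference of 1 between the first two characters follow the disassembly walked through earlier. For the third and fourth characters I simply compare against c and d, which is a simplification on my part since I only read those checks statically.

```python
def check_password(candidate: str) -> int:
    """Return 1 when the candidate passes the checks at 0x402510, else 0."""
    if len(candidate) != 4:
        return 0
    if candidate[0] != "a":
        return 0
    if ord(candidate[1]) - ord(candidate[0]) != 1:  # second char must be 'b'
        return 0
    # Simplified: direct comparisons for the last two characters.
    if candidate[2] != "c" or candidate[3] != "d":
        return 0
    return 1


print(check_password("abcd"), check_password("aaaa"))
```

My test argument aaaa gets past the length and first-character checks but fails on the second character, which matches what I saw in the debugger.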

Question Number 3: How can you use OllyDbg to permanently patch this malware, so that it doesn’t require the special command-line password?

Assemble instructions in the function checking for the password (0x402510) to return 1 (True) in EAX. These instructions can do it:

    MOV EAX, 1
    RETN
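For anyone who prefers to apply the same patch outside the debugger, the two instructions assemble to six bytes on 32-bit x86: B8 01 00 00 00 for MOV EAX, 1 and C3 for RETN. The sketch below overwrites those bytes in a buffer. The file offset used in the example is made up, because mapping 0x402510 to an offset in the file depends on the section headers, which I am not working out here.

```python
PATCH = bytes([0xB8, 0x01, 0x00, 0x00, 0x00, 0xC3])  # mov eax, 1 ; retn


def apply_patch(image: bytes, file_offset: int, patch: bytes = PATCH) -> bytes:
    """Return a copy of the image with the patch written at file_offset."""
    if file_offset < 0 or file_offset + len(patch) > len(image):
        raise ValueError("patch does not fit inside the image")
    patched = bytearray(image)
    patched[file_offset:file_offset + len(patch)] = patch
    return bytes(patched)


# Dummy 16-byte buffer and a hypothetical offset, just to show the mechanics.
dummy = bytes(16)
result = apply_patch(dummy, 4)
print(result.hex())
```

In OllyDbg the equivalent is to assemble the instructions at the start of the function and then copy the modified code to the executable.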

Question Number 4: What are the host-based indicators of this malware?

  • Name of Service: “Executable Name”
  • Display Name of Service: “Executable Name” Manager Service
  • Registry Key: HKLM\SOFTWARE\Microsoft \XPS
  • Malware: C:\Windows\system32\{NAME_OF_SERVICE}.exe

Question Number 5: What are the different actions this malware can be instructed to take via the network?

  • Download a file to disk (from the remote host)
  • Upload a file to remote host
  • Sleep for X seconds
  • Do nothing
  • Execute an arbitrary command on the system and return output to the remote host

Question Number 6: Are there any useful network-based signatures for this malware?

  • Host: http://www.practicalmalwareanalysis.com
  • Port: 80
  • Protocol: HTTP 1.0
  • Method: GET
  • Resources: xxxx/xxxx.xxx

Exercise 2

Preface: Analyze the malware found in the file Lab09-02.exe using OllyDbg to answer the following questions.

Hash | Name
251f4d0caf6eadae453488f9c9c0ea95 | Lab09-02.exe
ea8e109eb3fbdb76623cf9522267345b19721e42 | Lab09-02.exe
f153dfacec09dd69809c3bbf68270a38ee3701f44220c7bf181c14a68c138133 | Lab09-02.exe

Quick Analysis:

  • The main function starts at: 401128
  • Uses the GetModuleFileName API call to get the name of the executable
    • Function 401550 returns the name of the executable in the EAX register along with a backslash (e.g. \ocl.exe)
    • Function 4014C0 takes in ocl.exe in three registers - EAX, ECX, EDX
      • Compares the previously acquired file name with ocl.exe
      • Program execution continues if both match
  • If they match, WSAStartup and other network configuration API calls are made including WSASocketA (sets up an IPv4, two-way stream, TCP connection)
  • Function at 401089 takes in an address of 19FD40 and a string (initially pushed into stack variables - 1qaz2wsx3edc)
    • Function at 401440 takes in the same string as parameter [returns 0xC]
  • Function continues XOR decoding to de-obfuscate and copy domain into address 19FB0C (byte by byte copy) - www.practicalmalwareanalysis.com
  • Socket connections continue. If successful, a reverse shell is launched. If not, the socket closes.

Question 1. What strings do you see statically in the binary?

Lack of interesting strings. Mostly API imports and junk strings.

Question 2. What happens when you run this binary?

The malware didn’t do anything at first. Changing the name to ocl.exe will trigger a network connection to practicalmalwareanalysis.com and fetch a command to execute on the system via cmd.exe.

Question 3. How can you get this sample to run its malicious payload?

  • Changing the name of the binary to ocl.exe.

Question 4. What is happening at 0x00401133?

Data is written to the memory address which can be seen in the dump as: 1qaz2wsx3edc µ¶·ocl.exe. It is later used to de-obfuscate the domain name.

Question 5. What arguments are being passed to subroutine 0x00401089?

  • Address: 19FD40
  • String: 1qaz2wsx3edc

Question 6. What domain name does this malware use?

practicalmalwareanalysis.com

Question 7. What encoding routine is being used to obfuscate the domain name?

Data at the buffer is XOR’ed with the string 1qaz2wsx3edc.
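A repeating-key XOR like this is symmetric, so the same routine encodes and decodes. The sketch below assumes the 12-byte key is reused cyclically over the buffer, which fits the 0xC length returned by the function at 401440 but is not something I verified byte by byte. The encoded bytes are produced here for the demonstration and are not the ones stored in the sample.

```python
from itertools import cycle

KEY = b"1qaz2wsx3edc"  # 12 bytes, i.e. 0xC


def xor_with_key(data: bytes, key: bytes = KEY) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(value ^ key_byte for value, key_byte in zip(data, cycle(key)))


domain = b"www.practicalmalwareanalysis.com"
encoded = xor_with_key(domain)   # stand-in for the obfuscated buffer
decoded = xor_with_key(encoded)  # applying the same routine restores it
print(decoded.decode("ascii"))
```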

Question 8. What is the significance of the CreateProcessA call at 0x0040106E?

Launches a shell with the input, output, and error handles configured to connect to the socket (sent as an argument to the function). The reverse shell is going to be connected to the opened socket soon after the CreateProcess call is made.

Exercise 3

Preface: Analyze the malware found in the file Lab09-03.exe using OllyDbg and IDA Pro. This malware loads three included DLLs (DLL1.dll, DLL2.dll, and DLL3.dll) that are all built to request the same memory load location. Therefore, when viewing these DLLs in OllyDbg versus IDA Pro, code may appear at different memory locations. The purpose of this lab is to make you comfortable with finding the correct location of code within IDA Pro when you are looking at code in OllyDbg.

Quick Analysis: Wasn’t really necessary as this was a rather easy exercise.

Question 1: What DLLs are imported by Lab09-03.exe?

  • NETAPI32
  • KERNEL32
  • DLL1
  • DLL2
  • USER32 [Dynamically via LoadLibrary]
  • DLL3 [Dynamically via LoadLibrary]

Question 2: What is the base address requested by DLL1.dll, DLL2.dll, and DLL3.dll?

All DLLs request the same base address of: 0x10000000

Question 3: When you use OllyDbg to debug Lab09-03.exe, what is the assigned base address for: DLL1.dll, DLL2.dll, and DLL3.dll?

  • DLL1: 10001000
  • DLL2: 00871000
  • DLL3: 00521000

Question 4: When Lab09-03.exe calls an import function from DLL1.dll, what does this import function do?

  • Calculates a random integer
  • Prints the integer to console with the string format: DLL1 Mystery Data is: %d

Question 5: When Lab09-03.exe calls WriteFile, what is the filename it writes to?

Temp.txt in the same directory

Question 6: When Lab09-03.exe creates a job using NetScheduleJobAdd, where does it get the data for the second parameter?

Question 7: While running or debugging the program, you will see that it prints out three pieces of mystery data. What are the following: DLL 1 mystery data 1, DLL 2 mystery data 2, and DLL 3 mystery data 3?

DLL1 Mystery Data: We can see two variables being passed to the function sub_10001038 in DLL1Print. One’s the string format and the second argument is the value of EAX which is a DWORD at a specific memory address. Looking at write cross-references of the DWORD, we can actually see it holding the return value of the GetCurrentProcessId API call.

Returns the PID

DLL1Print Function

DLL2 Mystery Data: Similarly, DLL2Print takes the return value of the CreateFile call which spawns the temp.txt file in the same directory. It’s the handle ID to the file which is printed by the function.

Returns the handle of the file

DLL2Print Function

DLL3 Mystery Data: Here, the MultiByteToWideChar API call is used to convert the ASCII string, ping www.malwareanalysisbook.com to UNICODE. Once converted, the address of the string in memory is printed to console.

ASCII to Unicode String Conversion

DLL3Print Function - Prints memory address of string

Question 8: How can you load DLL2.dll into IDA Pro so that it matches the load address used by OllyDbg?

  • Select Manual Load while opening the binary in IDA
  • Write the Image base address available in OllyDbg to ensure the two are synced
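If you would rather leave IDA at the preferred base, the two views can also be reconciled by hand, since an address keeps the same offset from the image base in both tools. The sketch below uses the requested base of 0x10000000 from Question 2. The runtime base and the sample address are hypothetical values picked for illustration.

```python
def rebase(address: int, from_base: int, to_base: int) -> int:
    """Translate an address between two load bases of the same image."""
    return address - from_base + to_base


PREFERRED_BASE = 0x10000000  # base requested by the DLLs
runtime_base = 0x00870000    # hypothetical base assigned at load time
olly_address = 0x00871234    # hypothetical address seen in OllyDbg

ida_address = rebase(olly_address, runtime_base, PREFERRED_BASE)
print(hex(ida_address))
```

The same function works in the other direction by swapping the two bases.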