According to the Yoroi annual cyber security report (available HERE), to Cyber Threat Trends (available HERE) and to many additional resources, Microsoft Office files (Word documents and Excel spreadsheets) are among the most used malware loaders today. Attackers lure victims into opening a specially crafted Office document, which loads (sometimes even downloads from external resources) malicious content and executes it on the victim’s host. Today I decided to write some personal notes on how to deal with them. What follows is a list of reverse engineering and malware analysis techniques that could help you analyze such droppers.
Many different file formats and methodologies, plus a lot of singular ways to hide malicious content, have been developed over the past years. I decided to group the techniques into sections, so that you can jump directly to the one you are interested in without reading everything.
I hope you find it interesting and useful. If so, please share it so that more professionals and practitioners can use it, or help improve it by sending me content to add!
Rich Text Format (.RTF)
Rich Text Format files are interesting documents since they can carry embedded objects.
Didier Stevens built a great tool named rtfdump.py (available HERE) which can be used to deal with RTF files. If you run it against an RTF file you will see its composition and the objects it includes and uses once opened. The following picture shows an example of such a run on an RTF document (b98b7be0d7a4004a7e3f22e4061b35a56f825fdc3cba29248cf0500beca2523d). I usually suggest starting the investigation from the heaviest item, in other words from the object holding the highest number of bytes.
rtfdump.py lets you select a specific section (-s), and you can either show it or dump it to a file for additional analysis. Selecting section 2 and showing its content through the following command reveals an interesting string.
python rtfdump.py -s 2 -H mal1.doc
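To make the hex-decoding step less of a black box, here is a minimal sketch of the kind of transformation the -H option performs on the selected item, as I understand it: whitespace is dropped, an odd number of digits is padded with a trailing 0, and the result is turned into raw bytes. The function name and the sample payload are made up for illustration; this is not rtfdump.py's own code.

```python
import binascii


def decode_rtf_hex(text: str) -> bytes:
    """Decode the hexadecimal payload of an RTF object into raw bytes."""
    # RTF writers may wrap the hex stream across lines, so drop all whitespace.
    digits = "".join(text.split())
    # Pad with a trailing 0 when the number of hex digits is odd.
    if len(digits) % 2:
        digits += "0"
    return binascii.unhexlify(digits)


if __name__ == "__main__":
    sample = "4571 7561\n7469 6f6e 2e33"  # made-up payload, not taken from mal1.doc
    print(decode_rtf_hex(sample))
```

Reading the decoded bytes as text is often enough to spot class names such as the Equation Editor one discussed next.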
In my personal experience, EquationEditor is always a red flag. Indeed, CVE-2017-11882 is often abused by attackers to run their own shellcode. If you keep looking at section 2’s hex view you will probably notice encoding patterns: recurring characters and symbols. This is typical of XOR/ROL/SHIFT encoding functions. Didier Stevens wrote another interesting tool for this, named xorsearch.
Before dealing with xorsearch (available HERE) we need to dump the EquationEditor section into an external file. Once you have the dump, move to Windows (we will need it later on) and run xorsearch.exe against the dumped binary.
“[..] XORSearch is a program to search for a given string in an XOR, ROL, ROT or SHIFT encoded binary file. An XOR encoded binary file is a file where some (or all) bytes have been XORed with a constant value (the key). A ROL (or ROR) encoded file has its bytes rotated by a certain number of bits (the key). A ROT encoded file has its alphabetic characters (A-Z and a-z) rotated by a certain number of positions. A SHIFT encoded file has its bytes shifted left by a certain number of bits (the key): all bits of the first byte shift left, the MSB of the second byte becomes the LSB of the first byte, all bits of the second byte shift left, … XOR and ROL/ROR encoding is used by malware programmers to obfuscate strings like URLs. [..]” (from Didier Stevens’ blog)
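The idea behind the quote can be sketched in a few lines of Python: try every single-byte XOR key and every ROL rotation, and report where a known string shows up. This is only an illustration of the brute-force principle (ROT and SHIFT are left out), not a reimplementation of xorsearch; the function names are my own.

```python
def rol(byte: int, bits: int) -> int:
    """Rotate an 8-bit value left by `bits` positions (1 to 7)."""
    return ((byte << bits) | (byte >> (8 - bits))) & 0xFF


def search_encoded(data: bytes, needle: bytes):
    """Yield (encoding, key, offset) for each key that reveals `needle`."""
    # Single-byte XOR: key 0 would be the plain file, so start from 1.
    for key in range(1, 256):
        decoded = bytes(b ^ key for b in data)
        offset = decoded.find(needle)
        if offset != -1:
            yield ("XOR", key, offset)
    # ROL: rotating every byte left by 1..7 bits undoes the matching ROR.
    for bits in range(1, 8):
        decoded = bytes(rol(b, bits) for b in data)
        offset = decoded.find(needle)
        if offset != -1:
            yield ("ROL", bits, offset)


if __name__ == "__main__":
    # Made-up blob: a URL prefix XORed with 0x5A after four bytes of filler.
    blob = b"junk" + bytes(b ^ 0x5A for b in b"http://example.test")
    for hit in search_encoded(blob, b"http"):
        print(hit)
```

Only the first occurrence per key is reported, which is enough to decide which key deserves a closer look.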
Once run, xorsearch gives us the offsets where a change of control flow is most likely to be found, in other words where you can start the shellcode without falling into unaligned instructions. From that point you can use another great and widely known piece of software, “The ShellCode Debugger”: scDbg (available HERE). Once you run it (the following picture shows the GUI) you need to make the emulator start from the offset found by xorsearch.exe; in my specific case it was 0x2c74c. I suggest checking “Unlimited steps”, so that the emulator follows the shellcode without stopping, and enabling the Reporting Mode, so that you get a summary view at the end of the execution.
Once run, here we go! We have our IoC straight out of the shellcode.
Sometimes the attacker uses a different API call: ExpandEnvironmentStringsW, which is not hooked by scDbg. In that case you can open the file you just dumped and patch the binary by replacing the string ExpandEnvironmentStringsW with the string ExpandEnvironmentStringsA. Once done, reload the patched version of your shellcode into scDbg and re-run it; you should obtain better results.
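The patch described above can also be scripted instead of done by hand in a hex editor. The following is a minimal sketch that assumes the API name is stored in the dump as a plain ASCII string; because both names have the same length, every offset in the shellcode stays where it was. The function name and the command-line handling are my own choices.

```python
import sys

OLD_NAME = b"ExpandEnvironmentStringsW"
NEW_NAME = b"ExpandEnvironmentStringsA"


def patch_api_name(shellcode: bytes, old: bytes = OLD_NAME, new: bytes = NEW_NAME) -> bytes:
    """Replace every occurrence of `old` with `new` without moving any byte."""
    # A length change would shift all following offsets and break the shellcode.
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    return shellcode.replace(old, new)


# Usage (hypothetical file names): python patch.py dumped.bin patched.bin
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as src:
        patched = patch_api_name(src.read())
    with open(sys.argv[2], "wb") as dst:
        dst.write(patched)
```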
- b98b7be0d7a4004a7e3f22e4061b35a56f825fdc3cba29248cf0500beca2523d (mal1.doc download HERE)
- eac70cabccac5b0bd493111ec238f287e129923c27d68e5bb126d2442a4bf8da (dumped binary)
- //yatesassociates[.co[.za/documentato/MLY.exe (download HERE)
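Before starting, it is worth confirming that the sample on your disk is the one listed above. A minimal sketch using only the standard library follows; the helper names are mine, and the expected value is the SHA-256 of mal1.doc from the list.

```python
import hashlib

# SHA-256 of mal1.doc as listed above.
EXPECTED_MAL1 = "b98b7be0d7a4004a7e3f22e4061b35a56f825fdc3cba29248cf0500beca2523d"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large samples do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `sha256_file("mal1.doc")` with `EXPECTED_MAL1` tells you whether you are looking at the same document.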
Office Encrypted Contents
Sometimes you will run into encrypted Office content. Running oleid you would see the encrypted content flag set to True. When an OLE file carries encrypted VBA you cannot access the macros, so you cannot reverse, study or understand what they do. In such a case you need to figure out the encryption key and decrypt the content.
Fortunately, even if the MACROs are encrypted, the client running them needs to know how to decrypt them in order to execute the MACRO code.
This protection seems to be relatively stable at first sight, but a more detailed analysis revealed that it is not the password that is entered (or its hash) which is used to encrypt the document, but rather a fixed key stored in the MS Excel program code. This key is generated from the password ‘VelvetSweatshop’. What a nice joke by Microsoft! Try to protect a MS Excel document with this password (or to use this password to open a document). The most surprising thing is that no password is required to open a document.
A great tool to check for this issue is msoffcrypto-crack.py (available HERE).
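The default-password trick can also be tried directly from Python. The sketch below assumes the third-party msoffcrypto-tool package is installed and uses its `OfficeFile`, `load_key` and `decrypt` calls; the helper names and the idea of passing extra guesses are mine. A failed attempt is assumed to raise an exception, but depending on the library version and the file format a wrong password may instead produce unusable output, so always inspect the result.

```python
DEFAULT_PASSWORD = "VelvetSweatshop"


def candidate_passwords(extra=()):
    """Yield the built-in default first, then any analyst-supplied guesses."""
    yield DEFAULT_PASSWORD
    for password in extra:
        if password != DEFAULT_PASSWORD:
            yield password


def decrypt_office_file(src_path, dst_path, extra=()):
    """Try each candidate password; return the one that worked, or None."""
    import msoffcrypto  # third-party: pip install msoffcrypto-tool

    for password in candidate_passwords(extra):
        with open(src_path, "rb") as src:
            office = msoffcrypto.OfficeFile(src)
            try:
                office.load_key(password=password)
                with open(dst_path, "wb") as dst:
                    office.decrypt(dst)
                return password
            except Exception:
                # Wrong password or unsupported format: move on to the next guess.
                continue
    return None
```

For example, `decrypt_office_file("sample.xls", "cracked.xls")` (hypothetical file names) would leave a decrypted copy ready for oledump.py if the default password applies.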
Once you have found the “Encryption Key” you can simply decrypt the file content (using the same msoffcrypto-crack.py), save it in clear text and run oledump.py over it. At this point you should see normal object contents. In this specific case Equation Editor is used once more. Let’s dump it:
python3 oledump.py -s B2 -d cracked.xlsx > out_b2.bin
Now let’s check whether xorsearch.exe finds common control flow patterns! If it does, continue the analysis with scDbg.exe (see the section above: Rich Text Format (.RTF)).
- 3f3c2a4cb476c76b8bf84d6d2b0ee1a0a589709ccc69e84ffe6b2afd2dadbb39 (XLS download from HERE)
- 03u.ru (D&C2)
Office With VBA Macro
One of the most classic scenarios is a document carrying a VBA Macro. By running oledump.py you can list the various VBA contents (the M tag marks where MACROs are) and focus on the “fattest” one. In other words, I definitely suggest starting the investigation where most of the content is (where the highest number of bytes is found, A11 in the following picture), since there is a high probability of finding interesting IoCs for blocking or detection purposes.
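Picking the “fattest” macro stream is easy to automate once the listing is saved as text. The sketch below assumes the usual oledump.py layout of index, optional indicator letter, size in bytes and quoted stream name; the sample listing is made up, and the regular expression may need adjusting for other versions of the tool.

```python
import re

# index ':' [indicator] size 'stream name'
LINE = re.compile(r"^\s*(\w+):\s+(?:([A-Za-z!])\s+)?(\d+)\s+'(.*)'\s*$")


def parse_listing(listing: str):
    """Return a list of (index, indicator, size, name) tuples."""
    streams = []
    for line in listing.splitlines():
        match = LINE.match(line)
        if match:
            index, flag, size, name = match.groups()
            streams.append((index, flag or "", int(size), name))
    return streams


def largest_macro_stream(listing: str):
    """Return the biggest stream flagged with M, or None if there is none."""
    macros = [s for s in parse_listing(listing) if s[1] == "M"]
    return max(macros, key=lambda s: s[2]) if macros else None


# Made-up listing for illustration only.
SAMPLE = """\
 A1:       414 'PROJECT'
 A9: M    1520 'VBA/Module1'
A10: m     938 'VBA/Sheet1'
A11: M   48213 'VBA/ThisWorkbook'
"""

if __name__ == "__main__":
    print(largest_macro_stream(SAMPLE))
```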
In that case olevba comes to help us (available HERE). It emulates the VBA engine and runs the MACRO script like a charm, without any big issue. The execution will end up like the following image.
The emulator engine keeps going until one of the known functions reaches the end. For example
- 84a07333851ed300b34b34a026a58636844861e2d5265f2faabddddf05815f21 (direct.07.20.doc download HERE)
- detayworx[.com/_vsnpNgyXp84Os8Xh.php (Dropper)
Office Excel Macro 4.0
Sometimes you open a malicious Microsoft Excel file and no MACROs show up in it. Excel 4.0 (XLM) macros provide attackers a simple and reliable method to get a foothold on a target network, as they simply represent an abuse of a legitimate feature of Excel and do not rely on any vulnerability or exploit. It is just an old feature (an almost 30-year-old Microsoft Excel feature) that has been abused only in the past few years. One of the best write-ups on this type of evasion is given by Lastline (HERE).
Once you run OLEVBA, you can check whether it finds something interesting. In that run it suggests that XLM Excel 4 macros were used in the document. There are many ways to deobfuscate them and analyze their contents, from a simple “find” to more complex tool-sets. In this note I describe what I did over the past months. Today there is a script which works quite well, made by DissectMalware: XLMDeobfuscator (you can find it HERE). We will cover this tool later in these notes.
To un-hide the obfuscated XLM MACRO I have successfully used the following technique.
Open the malicious file with macros disabled, open the Macro editor, copy the following reveal script, save it and re-open the file with macros enabled (credit HERE).
Sub ShowAllSheets()
    Dim sh As Worksheet
    ' Make every sheet of the active workbook visible
    For Each sh In ActiveWorkbook.Sheets
        sh.Visible = True
    Next
End Sub
If you can’t open the malicious file because the macro gets executed and you have no control over the execution (due to evasion), you can open another workbook, open the VBA editor and “import” the malicious document directly with VBA in the following way:
Public Sub Convert_XML_To_Excel_From_Local_Path()
    Dim xml_File_Path As String
    Dim wb As Workbook

    'Load XML Data into a New Workbook - Code from Officetricks.com
    Application.DisplayAlerts = False
    xml_File_Path = "c:/FileToOpen.xlm"
    Set wb = Workbooks.OpenXML(Filename:=xml_File_Path)

    'Copy Content from New workbook to current active Worksheet
    wb.Sheets(1).UsedRange.Copy ThisWorkbook.Sheets("Sheet2").Range("A1")

    'Close New Workbook & Enable Alerts
    wb.Close False
    Application.DisplayAlerts = True
End Sub
Now you should see the hidden sheet or the hidden cells. One more tip: to quickly find the cells that hold content, you can search for =. The following images show what I mean.
Now, by checking the top-left box (BG35344 in the following image) you can see where the starting point is. In this file Auto_Open is the first function to be called, and you find its reference there. Then you can see two main formats being used:
At this point you can deobfuscate the XLM by executing the MACRO 4 in a controlled way. In other words, you can replace the last GOTO with HALT: in that way control flow is never handed over to the deobfuscated MACRO, execution stops instead, and you can read the deobfuscated code on the sheet.
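If the formulas of the macro sheet have been exported as text, the same edit can be sketched in Python. The example below is a toy under clear assumptions: the formulas are given as a list in the order they appear in the column, “last” means last in that list, and the sample formulas are made up. It only rewrites text; it does not execute anything.

```python
import re

# A GOTO call whose argument contains no nested parentheses, e.g. GOTO(B1).
GOTO = re.compile(r"GOTO\([^()]*\)", re.IGNORECASE)


def neutralise_last_goto(formulas):
    """Return a copy of `formulas` with the final GOTO(...) replaced by HALT()."""
    patched = list(formulas)
    # Walk backwards so the first hit is the last GOTO in the column.
    for position in range(len(patched) - 1, -1, -1):
        matches = list(GOTO.finditer(patched[position]))
        if matches:
            last = matches[-1]
            cell = patched[position]
            patched[position] = cell[:last.start()] + "HALT()" + cell[last.end():]
            break
    return patched


if __name__ == "__main__":
    column = ['=GOTO(A5)', '=FORMULA("x",B1)', '=GOTO(B1)']  # made-up formulas
    print(neutralise_last_goto(column))
```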
- d864b4da58253cba29a8106b0727e81852a181f3ac59ec7dfb9b9dee5931b7cc (W2_tax.xls download HERE)
- 22.214.171.124/get.php (Dropper)
.CSV Interpreted by Excel
Sometimes you will find .csv files. They get imported into Microsoft Excel and become “true” OLE files. Indeed, running OleDump against a well-crafted csv you might discover interesting things, such as the CSV file holding VBA or Objects. For example, if you consider sha-256: d5db2034631e56d58dffd797d25d286469f56690a1b00d4e6a0a80c31dbf119e you will find the following stuff in there (even though opening it with a common editor shows normal text divided by commas). Running OleDump shows a bunch of interesting sections.
Now you can decide whether to dump the code and analyze it manually or to run a code emulator. By running OleVBA against that CSV you can figure out many interesting indicators (check the following image). For example, the tool points out that an AutoExec function is called as soon as you open the document with Microsoft Excel. Many suspicious calls would be performed, for example run, together with some base64 string obfuscation techniques. On this run it was even able to decode such strings and to recognize IoCs such as URLs and file names.
If the code emulator does not work you can dump the entire code by using OleDump. Once you have dumped the code you can analyze it through a debugger, or just read it if it is not obfuscated.
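When reading the dumped code by hand, base64 string literals are a good first thing to decode. Here is a minimal sketch, assuming the macro source has been saved as text and that the interesting strings are plain double-quoted base64 literals of at least 16 characters; the function names, the length threshold and the sample line are my own choices, not something OleVBA exposes.

```python
import base64
import binascii
import re

# Double-quoted literals that look like base64 (at least 16 characters).
B64_LITERAL = re.compile(r'"([A-Za-z0-9+/]{16,}={0,2})"')
URL = re.compile(rb"https?://[^\s\"'<>]+")


def decode_base64_literals(vba_source: str):
    """Return the decoded bytes of every literal that is valid base64."""
    decoded = []
    for literal in B64_LITERAL.findall(vba_source):
        try:
            decoded.append(base64.b64decode(literal, validate=True))
        except (binascii.Error, ValueError):
            # Long identifier or text that merely resembles base64: skip it.
            continue
    return decoded


def extract_urls(vba_source: str):
    """Return URLs found inside the decoded base64 literals."""
    urls = []
    for blob in decode_base64_literals(vba_source):
        urls.extend(m.decode("ascii", "replace") for m in URL.findall(blob))
    return urls


if __name__ == "__main__":
    encoded = base64.b64encode(b"http://example.test/payload.exe").decode()
    print(extract_urls('u = "' + encoded + '"'))  # made-up macro line
```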
- d5db2034631e56d58dffd797d25d286469f56690a1b00d4e6a0a80c31dbf119e (invoice.csv)
- omamontaggi.[it/bels.exe (Dropping WB)
XLMDeobfuscator (grab it HERE) is definitely a great tool developed by @DissectMalware. It can be used to decode obfuscated XLM macros (also known as Excel 4.0 macros). It utilizes an internal XLM emulator to interpret the macros, without fully performing the code. It supports both
Even before such a great tool, the mythical OleDump plugin plugin_biff has been able to look over every Microsoft Excel cell and find functions and formulas. By using the -x plugin option you can show the hidden XLM Macro, while with the -f plugin option the plugin tries to figure out external links by interpreting encodings (such as hex and base64) and printing out strings.
- 1e194edbb1f28b9ecc4dc6a9a1e289d1c404470724f5fb14dd01312ed75bc298 (File_457366.xls)
- p://45[.11[.183[.78/6f04e0be46qb4Zc[.php (Dropper)