Saturday 2 December 2017

Reversing a PyInstaller based ransomware

Occasionally, I get questions about how to unpack PyInstaller executables using pyinstxtractor, how to identify the script of interest among the bunch of extracted files etc. In this post, I intend to cover all of these. Let's get started.

The file for our purpose is a recently identified ransomware having the following SHA256 hash.

Sample Hash: 53854221c6c1fa513d6ecf83385518dbd8b0afefd9661f6ad831a5acf33c0f8e
Download from Mega (Password: infected)

Preliminary Analysis

The executable "hc6.exe" has the following icon. The icon itself is a tell-tale sign that it's a PyInstaller executable.
Figure 1: Icon
Another way we can identify such files is by dropping it in a hex editor. A PyInstaller generated executable has many strings referencing python, towards the end of the binary.

Figure 2: Strings

A PyInstaller executable consists of two parts - a bootloader and a zlib archive appended to it as an overlay. The purpose of the loader is to set up the Python environment for running the application. This includes loading the Python DLL from the filesystem or from memory when the DLL is bundled within the executable. After going through a series of operations, finally it executes the main script and the control is transferred to user code. The loader also sets up hooks for resolving imports which are embedded within, the details of which are beyond the scope of this post. You can refer to the source for more information.


Knowing that the sample is a PyInstaller generated executable we can proceed to extract its contents using pyinstxtractor as shown in Figure 3.

Figure 3: Running pyinstxtractor
The latest version (1.9) of pyinstxtractor shows which scripts are the possible entry points to the application. These are the python scripts which are run when the application is launched. Naturally, we want to begin our analysis from here. In this sample, it has identified  pyiboot01_bootstrap and hc6 as the entry points. Among the two, the former is PyInstaller specific and not of interest. The other one named hc6 does sound interesting. Let's have a look at the contents of the extracted directory before analyzing the file hc6.

Figure 4: The extracted contents
Within the extracted directory we can see a bunch of stuff - DLL files, Python C Extensions (PYD) and also a sub-directory out00-PYZ.pyz_extracted. This nested sub-directory just contains compiled python files (PYC) as shown in Figure 5. The pyc files are from the standard python library or from a 3rd party library such as PyCrypto. Hence, in this sample, we can exclude these files from analysis.

Figure 5: pyc files inside the pyz

Decompiling the main script

The main script or the entry script is named hc6, let's have a look in a hex editor.

Figure 6: hc6 in a hex editor
This does not look like python code, does it? However, this was not the case in earlier versions of PyInstaller, where the main script was left as-is, in plain text. Recent versions, compile the py source to bytecode before packaging it in the executable.

We now want to decompile this bytecode file back to python source, however, in its present form a decompiler wouldn't recognize this as a valid pyc file. The reason for this is that the magic value (i.e. the signature) is missing from this file header. A Python 2.7 pyc file begins with the bytes 03 F3 0D 0A followed by a four-byte timestamp indicating when this file was compiled. We can add these 8 bytes as shown in Figure 7.

Figure 7: Adding the missing header
With the above changes, we can now feed this file to a decompiler such as pycdc. In case you do not want to compile yourself, I have provided precompiled binaries at AppVeyor. Decompiling we get back the source.

Analyzing the ransomware

Finally, we can have a look at the ransomware in all its glory. It encrypts files from the following list of extensions.
.txt, .exe,  .php, .pl, .7z, .rar, .m4a, .wma, .avi, .wmv, .csv, .d3dbsp, .sc2save, .sie, .sum, .ibank, .t13, .t12, .qdf, .gdb, .tax, .pkpass, .bc6, .bc7, .bkp, .qic, .bkf, .sidn, .sidd, .mddata, .itl, .itdb, .icxs, .hvpl, .hplg, .hkdb, .mdbackup, .syncdb, .gho, .cas, .svg, .map, .wmo, .itm, .sb, .fos, .mcgame, .vdf, .ztmp, .sis, .sid, .ncf, .menu, .layout, .dmp, .blob, .esm, .001, .vtf, .dazip, .fpk, .mlx, .kf, .iwd, .vpk, .tor, .psk, .rim, .w3x, .fsh, .ntl, .arch00, .lvl, .snx, .cfr, .ff, .vpp_pc, .lrf, .m2, .mcmeta, .vfs0, .mpqge, .kdb, .db0, .mp3, .upx, .rofl, .hkx, .bar, .upk, .das, .iwi, .litemod, .asset, .forge, .ltx, .bsa, .apk, .re4, .sav, .lbf, .slm, .bik, .epk, .rgss3a, .pak, .big, .unity3d, .wotreplay, .xxx, .desc, .py, .m3u, .flv, .js, .css, .rb, .png, .jpeg, .p7c, .p7b, .p12, .pfx, .pem, .crt, .cer, .der, .x3f, .srw, .pef, .ptx, .r3d, .rw2, .rwl, .raw, .raf, .orf, .nrw, .mrwref, .mef, .erf, .kdc, .dcr, .cr2, .crw, .bay, .sr2, .srf, .arw, .3fr, .dng, .jpeg, .jpg, .cdr, .indd, .ai, .eps, .pdf, .pdd, .psd, .dbfv, .mdf, .wb2, .rtf, .wpd, .dxg, .xf, .dwg, .pst, .accdb, .mdb, .pptm, .pptx, .ppt, .xlk, .xlsb, .xlsm, .xlsx, .xls, .wps, .docm, .docx, .doc, .odb, .odc, .odm, .odp, .ods, .odt, .sql, .zip, .tar, .tar.gz, .tgz, .biz, .ocx, .html, .htm, .3gp, .srt, .cpp, .mid, .mkv, .mov, .asf, .mpeg, .vob, .mpg, .fla, .swf, .wav, .qcow2, .vdi, .vmdk, .vmx, .gpg, .aes, .ARC, .PAQ, .tar.bz2, .tbk, .bak, .djv, .djvu, .bmp, .cgm, .tif, .tiff, .NEF, .cmd, .class, .jar, .java, .asp, .brd, .sch, .dch, .dip, .vbs, .asm, .pas, .ldf, .ibd, .MYI, .MYD, .frm, .dbf, .SQLITEDB, .SQLITE3, .asc, .lay6, .lay, .ms11 (Security copy), .sldm, .sldx, .ppsm, .ppsx, .ppam, .docb, .mml, .sxm, .otg, .slk, .xlw, .xlt, .xlm, .xlc, .dif, .stc, .sxc, .ots, .ods, .hwp, .dotm, .dotx, .docm, .DOT, .max, .xml, .uot, .stw, .sxw, .ott, .csr, .key, wallet.dat
Encrypted files have an extension of .fucku appended to the original filename. This can be seen in the decompiled code as shown below.

Figure 8: Supported extensions
Files are encrypted with the AES cipher in CBC mode with a random IV generated per file.

Figure 9: Files are encrypted using AES

AES is no doubt a strong algorithm and infeasible to crack. However, the ransomware encrypts each file using a constant and hardcoded key which makes decryption feasible. This is shown in the figure below. The AES key used is j<L;G|hD*3CQk%I!g|Ei&#aQ6*;Vh,

Figure 10: Look, the password is hardcoded!

Decrypting encrypted files

Since we know that each files are encrypted with the same key we can develop a decrypter. However, our kind ransomware author has spared us the bother by providing the decrypter in the same code.

Figure 11: Bundled decrypter

The function decrypt decrypts an encrypted file. It's not called from anywhere, indicating it was there for testing purposes and was not removed in the final build.

There is no need to pay the ransom if someone is infected by this ransomware. A free decrypter is available from the malware hunter team. Kudos to them for their fabulous work!