Coding | Reversing

PyInstaller Extractor updated to v2.0

2020-03-28T09:08:00.002+00:00

PyInstaller Extractor has been updated to version 2.0. This has been long overdue. The project is now migrated to GitHub and all further development will take place over there.

https://github.com/extremecoders-re/pyinstxtractor

Version 2.0 includes support for Python 3.7 and above. Earlier, the script generated invalid pyc files when extracting executables built with Python 3.7. This was because, from Python 3.7 onward, the pyc header format changed slightly.
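For context, the header change comes from PEP 552: a Python 3.7+ pyc header is 16 bytes (magic number, a flags word, then the source mtime and size), whereas Python 3.3 to 3.6 used 12 bytes and Python 2 used 8. Below is a minimal sketch of building a 3.7-style header. The function name build_pyc_header is hypothetical and is not taken from pyinstxtractor.

```python
import importlib.util
import struct


def build_pyc_header(magic, mtime=0, source_size=0):
    """Return a 16-byte Python 3.7+ pyc header (timestamp-based variant)."""
    flags = 0  # 0 selects mtime/size validation, as described in PEP 552
    return magic + struct.pack('<III', flags, mtime, source_size)


# importlib.util.MAGIC_NUMBER is the 4-byte magic of the running interpreter.
header = build_pyc_header(importlib.util.MAGIC_NUMBER)
print(len(header), header.hex())
```

A zeroed mtime and size are usually good enough for a decompiler, which tends to care only about the magic number and the header length.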

Also starting from this version, I am moving away from traditional version numbering. Instead you can always find the latest version on GitHub.

(Finally) Solving the Weasel keygenme

2020-02-02T08:50:00.000+00:00

Back in 2016, kao posted the weasel keygenme on Tuts4you. It consisted of two parts - a custom VM in C# and some crypto. The VM implemented the main logic of the keygenme. In my previous writeup about the challenge, I succeeded in partially solving the challenge by devirtualizing the VM. The crypto part was left unsolved. This was mainly because the crypto logic was way too complex and without special algorithms there was no way to have a go at it. As described in my earlier post, even SAT solvers have no luck in cracking the crypto.

However this time after nearly 4 years I am happy to say that I did manage to break the crypto. In this blog post I will describe the process to solve this keygenme along with all the failed attempts.

Initially I planned to post the write-up on this blog but the blogger platform is not too good for handling equations and mathematical terms. Hence I have left the write-up as an IPython notebook on Google Colab which has been embedded as a GitHub Gist below. In case the preview below doesn't load properly on your browser you can always find the original notebook on Colab.

https://colab.research.google.com/gist/extremecoders-re/cc268753b6d25017fe0cff7299ad9be3/-finally-solving-the-weasel-keygenme.ipynb

The keygen can be found on GitHub

Go-dispatch-proxy: A load balancing proxy written in Go

2018-05-16T10:41:00.002+00:00

This will be my first post of 2018. I've been very busy these days and didn't have the time to devote to this blog, which explains the prolonged absence of activity. Anyway, in this post, I want to present a tool which I've been working on for some days. It's not related to RE or anything fancy. It's just a tool to load balance network connections across multiple interfaces.

To cut a long story short, I have multiple internet connections on my Windows desktop. I needed a tool which would help utilise all those connections effectively to give an improved speed. If multiple connections (i.e. routes) are available, the default behaviour of Windows is to select the one which has the lowest metric. If all routes have the same metric, Windows will select one and ignore the rest. [To change the interface metric manually have a look here]

To work around this problem we can use a load balancing proxy. The purpose of the proxy is to distribute our connections across the available interfaces, leading to increased throughput. There is a project on GitHub, dispatch-proxy, which is based on the same idea. It works, but I dislike the fact that it's written in Node.js. A seemingly innocuous command like npm install can wreak havoc on the hard disk by creating tons of tiny files and fragmenting the file system in the process. Naturally, I have an aversion towards desktop apps which are written using node and friends.

Based on the same idea, I've written a SOCKS5 load balancing proxy in pure Go with no additional dependencies.

https://github.com/extremecoders-re/go-dispatch-proxy

The binary consists of just a single file which you can get from the AppVeyor CI server. It works on Windows and fulfils my purpose. It has not been tested on macOS but will likely work. However, it won't work on Linux. This is because the tool works by changing the source IP address of the outgoing packets. On Windows, changing the source IP makes the outgoing packet use the corresponding interface. On Linux, the packet is transmitted according to the kernel's routing table and specifying the source IP has no effect.
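The source-IP trick can be sketched in a few lines. The snippet below is an illustration in Python rather than the Go code of go-dispatch-proxy; the helper names and the example addresses are hypothetical. It rotates through the local addresses and binds each new outgoing socket to the next one before connecting.

```python
import itertools
import socket


def round_robin(source_ips):
    """Yield the local source addresses in an endless rotation."""
    return itertools.cycle(source_ips)


def connect_from(source_ip, dest_host, dest_port, timeout=10):
    """Open a TCP connection whose source address is bound to source_ip."""
    # Port 0 lets the OS pick an ephemeral source port on that address.
    return socket.create_connection(
        (dest_host, dest_port), timeout=timeout,
        source_address=(source_ip, 0))


# Hypothetical addresses of two local interfaces.
rotation = round_robin(['192.168.1.10', '10.0.0.5'])
picked = [next(rotation) for _ in range(3)]
print(picked)
```

Each accepted client connection would call connect_from(next(rotation), host, port), so consecutive connections leave through different interfaces on systems where the source address decides the route.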

To conclude this short post, here are a couple of screenshots showing the tool in action.

Listing the available interfaces

SOCKS proxy at work

Reversing a PyInstaller based ransomware

2017-12-02T21:38:00.001+00:00

Occasionally, I get questions about how to unpack PyInstaller executables using pyinstxtractor, how to identify the script of interest among the bunch of extracted files etc. In this post, I intend to cover all of these. Let's get started.

The file for our purpose is a recently identified ransomware having the following SHA256 hash.

Sample Hash: 53854221c6c1fa513d6ecf83385518dbd8b0afefd9661f6ad831a5acf33c0f8e
Download from Mega (Password: infected)

Preliminary Analysis

The executable "hc6.exe" has the following icon. The icon itself is a tell-tale sign that it's a PyInstaller executable.

Figure 1: Icon

Another way to identify such files is to open them in a hex editor. A PyInstaller generated executable has many strings referencing python towards the end of the binary.

Figure 2: Strings

A PyInstaller executable consists of two parts - a bootloader and an archive with zlib-compressed entries appended to it as an overlay. The purpose of the bootloader is to set up the Python environment for running the application. This includes loading the Python DLL from the filesystem, or from memory when the DLL is bundled within the executable. After going through a series of operations, it finally executes the main script and control is transferred to user code. The loader also sets up hooks for resolving imports of the embedded modules, the details of which are beyond the scope of this post. You can refer to the source for more information.
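Besides the icon and the strings, the appended archive can be recognised programmatically: it ends with a small "cookie" structure that starts with the magic bytes MEI\x0c\x0b\x0a\x0b\x0e. The sketch below simply searches the tail of the file for that magic. The function name and the 8 KiB search window are assumptions made for illustration; a signed executable may carry extra trailing data and need a larger window.

```python
import os
import tempfile

# Magic bytes that start the PyInstaller archive cookie.
PYINST_COOKIE_MAGIC = b'MEI\x0c\x0b\x0a\x0b\x0e'


def looks_like_pyinstaller(path, tail_size=8192):
    """Return True if the cookie magic appears in the last tail_size bytes."""
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_size))
        return PYINST_COOKIE_MAGIC in f.read()


# Self-contained demo on two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    packed = os.path.join(tmp, 'packed.bin')
    plain = os.path.join(tmp, 'plain.bin')
    with open(packed, 'wb') as f:
        f.write(b'\x00' * 100 + PYINST_COOKIE_MAGIC + b'\x00' * 80)
    with open(plain, 'wb') as f:
        f.write(b'\x00' * 200)
    demo = (looks_like_pyinstaller(packed), looks_like_pyinstaller(plain))
print(demo)
```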

Extracting

Knowing that the sample is a PyInstaller generated executable we can proceed to extract its contents using pyinstxtractor as shown in Figure 3.

Figure 3: Running pyinstxtractor

The latest version (1.9) of pyinstxtractor shows which scripts are the possible entry points to the application. These are the python scripts which are run when the application is launched. Naturally, we want to begin our analysis from here. In this sample, it has identified pyiboot01_bootstrap and hc6 as the entry points. Among the two, the former is PyInstaller specific and not of interest. The other one named hc6 does sound interesting. Let's have a look at the contents of the extracted directory before analyzing the file hc6.

Figure 4: The extracted contents

Within the extracted directory we can see a bunch of stuff - DLL files, Python C Extensions (PYD) and also a sub-directory out00-PYZ.pyz_extracted. This nested sub-directory just contains compiled python files (PYC) as shown in Figure 5. The pyc files are from the standard python library or from a 3rd party library such as PyCrypto. Hence, in this sample, we can exclude these files from analysis.

Figure 5: pyc files inside the pyz

Decompiling the main script

The main script, or entry script, is named hc6. Let's have a look at it in a hex editor.

Figure 6: hc6 in a hex editor

This does not look like python code, does it? This was not the case in earlier versions of PyInstaller, where the main script was left as-is, in plain text. Recent versions compile the py source to bytecode before packaging it in the executable.

We now want to decompile this bytecode file back to python source, however, in its present form a decompiler wouldn't recognize this as a valid pyc file. The reason for this is that the magic value (i.e. the signature) is missing from this file header. A Python 2.7 pyc file begins with the bytes 03 F3 0D 0A followed by a four-byte timestamp indicating when this file was compiled. We can add these 8 bytes as shown in Figure 7.
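The same fix can be scripted instead of done by hand in a hex editor. This is a minimal sketch: the function name is hypothetical, and the timestamp is set to zero on the assumption that the decompiler ignores it.

```python
import struct

# Magic number of a Python 2.7 pyc file: 03 F3 0D 0A.
PY27_MAGIC = b'\x03\xf3\x0d\x0a'


def add_py27_header(code_bytes, timestamp=0):
    """Prepend the 8-byte Python 2.7 pyc header to a raw marshalled code object."""
    return PY27_MAGIC + struct.pack('<I', timestamp) + code_bytes


# Dummy payload for illustration; a real marshalled code object starts with b'c'.
patched = add_py27_header(b'c\x00\x00\x00')
print(patched[:8].hex())
```

For the sample, code_bytes would be the contents of the extracted hc6 file, and the result would be written to a new file with a .pyc extension.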

Figure 7: Adding the missing header

With the above changes, we can now feed this file to a decompiler such as pycdc. In case you do not want to compile yourself, I have provided precompiled binaries at AppVeyor. Decompiling we get back the source.

Analyzing the ransomware

Finally, we can have a look at the ransomware in all its glory. It encrypts files from the following list of extensions.

.txt, .exe, .php, .pl, .7z, .rar, .m4a, .wma, .avi, .wmv, .csv, .d3dbsp, .sc2save, .sie, .sum, .ibank, .t13, .t12, .qdf, .gdb, .tax, .pkpass, .bc6, .bc7, .bkp, .qic, .bkf, .sidn, .sidd, .mddata, .itl, .itdb, .icxs, .hvpl, .hplg, .hkdb, .mdbackup, .syncdb, .gho, .cas, .svg, .map, .wmo, .itm, .sb, .fos, .mcgame, .vdf, .ztmp, .sis, .sid, .ncf, .menu, .layout, .dmp, .blob, .esm, .001, .vtf, .dazip, .fpk, .mlx, .kf, .iwd, .vpk, .tor, .psk, .rim, .w3x, .fsh, .ntl, .arch00, .lvl, .snx, .cfr, .ff, .vpp_pc, .lrf, .m2, .mcmeta, .vfs0, .mpqge, .kdb, .db0, .mp3, .upx, .rofl, .hkx, .bar, .upk, .das, .iwi, .litemod, .asset, .forge, .ltx, .bsa, .apk, .re4, .sav, .lbf, .slm, .bik, .epk, .rgss3a, .pak, .big, .unity3d, .wotreplay, .xxx, .desc, .py, .m3u, .flv, .js, .css, .rb, .png, .jpeg, .p7c, .p7b, .p12, .pfx, .pem, .crt, .cer, .der, .x3f, .srw, .pef, .ptx, .r3d, .rw2, .rwl, .raw, .raf, .orf, .nrw, .mrwref, .mef, .erf, .kdc, .dcr, .cr2, .crw, .bay, .sr2, .srf, .arw, .3fr, .dng, .jpeg, .jpg, .cdr, .indd, .ai, .eps, .pdf, .pdd, .psd, .dbfv, .mdf, .wb2, .rtf, .wpd, .dxg, .xf, .dwg, .pst, .accdb, .mdb, .pptm, .pptx, .ppt, .xlk, .xlsb, .xlsm, .xlsx, .xls, .wps, .docm, .docx, .doc, .odb, .odc, .odm, .odp, .ods, .odt, .sql, .zip, .tar, .tar.gz, .tgz, .biz, .ocx, .html, .htm, .3gp, .srt, .cpp, .mid, .mkv, .mov, .asf, .mpeg, .vob, .mpg, .fla, .swf, .wav, .qcow2, .vdi, .vmdk, .vmx, .gpg, .aes, .ARC, .PAQ, .tar.bz2, .tbk, .bak, .djv, .djvu, .bmp, .cgm, .tif, .tiff, .NEF, .cmd, .class, .jar, .java, .asp, .brd, .sch, .dch, .dip, .vbs, .asm, .pas, .ldf, .ibd, .MYI, .MYD, .frm, .dbf, .SQLITEDB, .SQLITE3, .asc, .lay6, .lay, .ms11 (Security copy), .sldm, .sldx, .ppsm, .ppsx, .ppam, .docb, .mml, .sxm, .otg, .slk, .xlw, .xlt, .xlm, .xlc, .dif, .stc, .sxc, .ots, .ods, .hwp, .dotm, .dotx, .docm, .DOT, .max, .xml, .uot, .stw, .sxw, .ott, .csr, .key, wallet.dat

Encrypted files have an extension of .fucku appended to the original filename. This can be seen in the decompiled code as shown below.

Figure 8: Supported extensions

Files are encrypted with the AES cipher in CBC mode with a random IV generated per file.

Figure 9: Files are encrypted using AES

AES is no doubt a strong algorithm and infeasible to crack. However, the ransomware encrypts each file using a constant and hardcoded key which makes decryption feasible. This is shown in the figure below. The AES key used is j<L;G|hD*3CQk%I!g|Ei&#aQ6*;Vh,

Figure 10: Look, the password is hardcoded!

Decrypting encrypted files

Since we know that all files are encrypted with the same key, we can develop a decrypter. However, our kind ransomware author has spared us the bother by providing the decrypter in the same code.

Figure 11: Bundled decrypter

The function decrypt decrypts an encrypted file. It's not called from anywhere, indicating it was there for testing purposes and was not removed in the final build.

There is no need to pay the ransom if someone is infected by this ransomware. A free decrypter is available from MalwareHunterTeam. Kudos to them for their fabulous work!

Pyinstaller Extractor updated to v1.9

2017-11-29T10:39:00.000+00:00

PyInstaller Extractor has been updated to v1.9. The features of this release include:

Support for Pyinstaller 3.3
Display the scripts which are run at entry point

Support for Pyinstaller 3.3

Self-explanatory. Extending support to PyInstaller 3.3 required no major changes. The earlier script works as-is.

Display the scripts which are run at entry point

A PyInstaller executable has many files embedded in it. Naturally, users of this tool had difficulty identifying which of the extracted files are of interest. With this update, pyinstxtractor now shows a list of python scripts which are run by the executable at load time. An example is shown in the screenshot below.

pyiboot01_bootstrap and main are the scripts which are run at load time. Of these two, the former is PyInstaller specific and not interesting for our purpose. Hence you should start the analysis from the file named main located within the _extracted directory.

As usual, pyinstxtractor can be found at SourceForge.

TUCTF Write-up - RE track

2017-11-27T10:09:00.001+00:00

TU CTF is an introductory CTF for teams that want to build their experience. We will have the standard categories of Web, Forensics, Crypto, RE, and Exploit, as well as some other categories we don't want to reveal just yet. If you have any questions, our contact is at the bottom of each page, but please read the official rules before sending us any emails.

This is a write-up for the Reversing challenges in TU CTF 2017.

Funmail [25]

Figure 1: Challenge description

This is straightforward. The challenge requires a password which is hardcoded within the binary as shown in Figure 2.

Figure 2: Hardcoded password

Provide the password and get the flag TUCTF{d0n7_h4rdc0d3_p455w0rd5}.

Figure 3: Flag for #1

Funmail 2.0 [50]

Figure 4: Challenge description

Same drill as before. The password is hardcoded but the program is intentionally crippled and does not show the flag.

Figure 5: Deliberately crippled

The binary contains a function printFlag but it is not called from anywhere. We can just patch any call instruction such as the call puts shown in Figure 6.

Figure 6: The instruction to patch

to call printFlag as shown below.

Figure 7: After applying the patch
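Redirecting a near call comes down to recomputing its 32-bit relative operand: the displacement is measured from the end of the 5-byte instruction. The helper below is a sketch with a hypothetical name. The addresses in the example are not from Funmail 2.0; they come from the call check_letter instruction at 0x401c7d in the Unknown challenge later in this write-up, whose bytes e8 0e 02 00 00 imply a target of 0x401e90.

```python
import struct


def encode_call(call_addr, target_addr):
    """Encode an x86 near call (opcode E8) located at call_addr."""
    # The displacement is relative to the address of the next instruction.
    rel32 = target_addr - (call_addr + 5)
    return b'\xe8' + struct.pack('<i', rel32)


patch = encode_call(0x401c7d, 0x401e90)
print(patch.hex())
```

To retarget a call, keep call_addr, substitute the address of printFlag for target_addr, and overwrite the original five bytes with the result.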

Running the patched binary we get the flag TUCTF{l0c4l_<_r3m073_3x3cu710n}.

Figure 8: Flag for #2

Unknown [200]

Figure 9: Challenge description

The binary takes the flag as a command line argument. The length of the correct flag is 56 as evident in the disassembly listing below where it compares the result of strlen to 56.

Figure 10: Length of flag must be 56

Navigating down in the disassembly listing we have a function check_letter which takes in the provided flag and an index. The function checks whether the character at the specified index within the flag is correct and returns 0 if so.

Figure 11: Checking the flag letter by letter

check_letter is called from a loop, once for each of the 56 characters. If any check fails, the function returns one, which is stored in the variable named fail. Later on, the contents of this variable decide whether to print the success or the failure message.

Figure 12: To fail or not to fail

The flag checking algorithm can be attacked using a brute-force approach. Each character is checked individually, letter by letter, without regard to the other characters.
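Because every position is validated independently, the search cost is at most (number of positions) x (alphabet size) instead of (alphabet size) raised to the number of positions. The toy model below illustrates this with a hypothetical stand-in for check_letter; it does not touch the real binary, and the secret used is made up.

```python
import string

CHARSET = string.digits + string.ascii_letters + '_+!{}'


def make_oracle(secret):
    """Hypothetical stand-in for check_letter: 0 means the character matches."""
    def check_letter(candidate, index):
        return 0 if candidate[index] == secret[index] else 1
    return check_letter


def brute_force(check_letter, length):
    """Recover the string one position at a time, counting oracle calls."""
    flag = ''
    attempts = 0
    for pos in range(length):
        for ch in CHARSET:
            attempts += 1
            if check_letter(flag + ch, pos) == 0:
                flag += ch
                break
        else:
            raise ValueError('no character fits position %d' % pos)
    return flag, attempts


secret = 'TUCTF{demo_flag}'  # made-up value for the demonstration
recovered, attempts = brute_force(make_oracle(secret), len(secret))
print(recovered, attempts)
```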

To develop a brute-force tool, our approach is to set a breakpoint at 0x401c7d, the place where check_letter is called. At the breakpoint we modify the string and the index that are passed. If the function returns 0, we know the character is correct. Using this approach we can try out various letters at each of the 56 positions. The code for the brute forcer, developed in python using r2pipe, is shown below.

import r2pipe

"""
|    .----> 0x00401c71      8b55f4         mov edx, dword [local_ch]
|    :|||   0x00401c74      488b45f8       mov rax, qword [local_8h]
|    :|||   0x00401c78      89d6           mov esi, edx
|    :|||   0x00401c7a      4889c7         mov rdi, rax
|    :|||   0x00401c7d      e80e020000     call check_letter
"""

flag = ""

r2 = r2pipe.open('unknown')

# Run with a dummy string
r2.cmd('doo {}'.format('a'*56))

# Set breakpoint
r2.cmd('db 0x401c7d')

for pos in xrange(56):
    for ch in '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_+!{}':        
        # Run
        r2.cmd('dc')

        # Breakpoint at 0x00401c7d hits
        # Write the character index
        r2.cmd('dr esi={}'.format(pos))

        # Write the flag obtained so far
        r2.cmd('wz {} @ rdi'.format(flag + ch))

        # Step over call
        r2.cmd('dso')

        # Check function result
        rax = r2.cmdj('drj')['rax']

        # Set rip back to start
        r2.cmd('dr rip=0x401c71')

        # Success
        if rax == 0:
            flag += ch
            print '****************************************', flag
            break

To speed up execution, after the function returns we set rip back to 401c71. This way we do not need to re-execute the binary each time. Running the script we get the flag TUCTF{w3lc0m3_70_7uc7f_4nd_7h4nk_y0u_f0r_p4r71c1p471n6!}. Below is a demo of the bruteforce tool in action.

Future [250]

Figure 13: Challenge description

The last challenge of the RE track is a bit different from the rest. Quite unexpectedly we have been provided the C source code of the challenge. Hence there is no need to inspect the binary. The source looks like the image below.

Figure 14: Challenge source code

The program takes the flag as input, performs some calculations on it and compares the result to a hardcoded string. If it matches our flag is correct. Navigating up within the source code we can see two functions genMatrix and genAuthString which perform these calculations.

Figure 15: The calculations

We can employ a black box approach to solve this challenge. The entire system can be modelled in z3. To retrieve the flag, we can then query z3 if there is a possible input such that output matches the hardcoded string. The script is shown below.

from z3 import *

flag = [BitVec('ch'+str(i), 8) for i in xrange(25)]
mat = [[0 for i in xrange(5)] for i in xrange(5)]

# genMatrix
for i in xrange(25):
    m = (i * 2) % 25
    f = (i * 7) % 25
    mat[m/5][m%5] = flag[f]

#genAuthString
auth = [0 for i in xrange(18)]

auth[0] = mat[0][0] + mat[4][4]
auth[1] = mat[2][1] + mat[0][2]
auth[2] = mat[4][2] + mat[4][1]
auth[3] = mat[1][3] + mat[3][1]
auth[4] = mat[3][4] + mat[1][2]
auth[5] = mat[1][0] + mat[2][3]
auth[6] = mat[2][4] + mat[2][0]
auth[7] = mat[3][3] + mat[3][2] + mat[0][3]
auth[8] = mat[0][4] + mat[4][0] + mat[0][1]
auth[9] = mat[3][3] + mat[2][0]
auth[10] = mat[4][0] + mat[1][2]
auth[11] = mat[0][4] + mat[4][1]
auth[12] = mat[0][3] + mat[0][2]
auth[13] = mat[3][0] + mat[2][0]
auth[14] = mat[1][4] + mat[1][2]
auth[15] = mat[4][3] + mat[2][3]
auth[16] = mat[2][2] + mat[0][2]
auth[17] = mat[1][1] + mat[4][1]

correct_output = "\x8b\xce\xb0\x89\x7b\xb0\xb0\xee\xbf\x92\x65\x9d\x9a\x99\x99\x94\xad\xe4"

s = Solver()

s.add(flag[0] == ord('T'))
s.add(flag[1] == ord('U'))
s.add(flag[2] == ord('C'))
s.add(flag[3] == ord('T'))
s.add(flag[4] == ord('F'))
s.add(flag[5] == ord('{'))
s.add(flag[24] == ord('}'))


for pos, ch in enumerate(correct_output):
    s.add(auth[pos] == ord(ch))

if s.check() == sat:
    m = s.model()
    print ''.join([chr(m[e].as_long()) for e in flag])

Running the script we get the flag TUCTF{5y573m5_0f_4_d0wn!}
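As a sanity check, the recovered flag can be pushed forward through the same model and compared with the hardcoded bytes. The sketch below re-implements the genMatrix and genAuthString steps exactly as they appear in the z3 script above (not from the original C source), written for Python 3. The 8-bit masking mirrors the 8-bit bit-vectors used in the solver.

```python
# (row, col) cells summed for each output byte, copied from the z3 script.
AUTH_TERMS = [
    [(0, 0), (4, 4)], [(2, 1), (0, 2)], [(4, 2), (4, 1)],
    [(1, 3), (3, 1)], [(3, 4), (1, 2)], [(1, 0), (2, 3)],
    [(2, 4), (2, 0)], [(3, 3), (3, 2), (0, 3)], [(0, 4), (4, 0), (0, 1)],
    [(3, 3), (2, 0)], [(4, 0), (1, 2)], [(0, 4), (4, 1)],
    [(0, 3), (0, 2)], [(3, 0), (2, 0)], [(1, 4), (1, 2)],
    [(4, 3), (2, 3)], [(2, 2), (0, 2)], [(1, 1), (4, 1)],
]

# The hardcoded comparison string from the script, as hex.
EXPECTED = bytes.fromhex('8bceb0897bb0b0eebf92659d9a999994ade4')


def gen_auth(flag):
    """Scatter the 25 flag characters into a 5x5 matrix, then sum the cells."""
    mat = [[0] * 5 for _ in range(5)]
    for i in range(25):
        m = (i * 2) % 25
        f = (i * 7) % 25
        mat[m // 5][m % 5] = ord(flag[f])
    return bytes(sum(mat[r][c] for r, c in terms) & 0xFF
                 for terms in AUTH_TERMS)


matches = gen_auth('TUCTF{5y573m5_0f_4_d0wn!}') == EXPECTED
print(matches)
```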

Figure 16: The flag, finally!

bnpy - A python architecture plugin for Binary Ninja

2017-11-20T20:30:00.000+00:00

Recently I got a chance to try out Vector 35's Binary Ninja, and I must say the experience has been great so far. The good thing about Binary Ninja (binja henceforth) is its API: we can easily write custom plugins for various purposes, such as a disassembler for a foreign architecture. We can do the same in IDA, but developing processor plugins in IDA is not for the faint of heart. At the moment, binja is entirely a static analysis tool, but we do have plugins like binjatron that attempt to fill this void.

Playing with the binja API, I developed bnpy - a disassembler for python bytecode. In binja terminology this is called an Architecture plugin. At the moment it works for raw python bytecode, i.e. you must extract the instruction stream from a pyc file in order to use it.

In the near future, I plan to extend it so that it can disassemble a pyc (compiled python) file right out of the box. Right now, this is difficult due to certain limitations in the API. To understand this we need to know a bit more about the pyc format.

The pyc file is not a flat file format like a PE or ELF. It is a nested format with a tree-like structure. A pyc file contains a single top-level code object. Among other things, a code object stores an array of constants used by the code. This array is called co_consts. The constants can be integers, strings and even other nested code objects. The code object also stores the bytecode instructions in a string named co_code. At the moment, the bnpy plugin operates on this instruction string. To better describe the structure of pyc files we can refer to the following image taken from kaitai struct.

Fig. 1: The structure of a pyc file
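The nesting is easy to observe from Python itself. The sketch below compiles a small made-up source string and walks the resulting tree by following code objects found in co_consts. It runs on Python 3 and only illustrates the structure; bnpy does not work this way internally.

```python
import types


def walk_code(code, depth=0):
    """Yield (depth, name, bytecode length) for a code object and its children."""
    yield depth, code.co_name, len(code.co_code)
    for const in code.co_consts:
        # Nested functions and classes appear as code objects among the constants.
        if isinstance(const, types.CodeType):
            yield from walk_code(const, depth + 1)


source = (
    "def outer():\n"
    "    def inner():\n"
    "        return 1\n"
    "    return inner\n"
)
top = compile(source, '<example>', 'exec')
tree = list(walk_code(top))
for depth, name, size in tree:
    print('  ' * depth + name, size)
```

Each entry printed here has its own co_code string, which is the kind of raw instruction stream the plugin expects as input.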

You can see, the code objects within a pyc file are nested. The function view in binja is flat and thus not suitable for displaying a tree structure. As of now, the plugin can be used on the raw bytecode stream. Steps for extracting the bytecode along with other directions can be found on the plugin page at GitHub.

https://github.com/extremecoders-re/bnpy

To conclude this short post, here is a GIF of the plugin in action.

Flare-On Challenge 2017 Writeup

2017-10-15T21:49:00.001+00:00

Flare-On is an annual CTF-style challenge organized by FireEye with a focus on reverse engineering. The contest is in its fourth year this season. Taking part in these challenges gives us a nice opportunity to learn something new and this year was no exception. Overall, there were 12 challenges to complete. Official solutions to the challenges have already been published on the FireEye blog. Hence, instead of a detailed write-up, I will just cover the important parts.

#1 - Login.html

The first problem was as simple as it gets. There is an HTML file with a form. We need to provide a flag and check for its correctness.

Figure 1: Check thy flag

The code simply performs a ROT-13 of the input and compares it with another string. To get back the flag, re-apply ROT-13.

Figure 2: ROT-13 again
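ROT-13 is its own inverse, which is why applying it a second time recovers the input. A short sketch using the rot_13 codec from Python's standard library:

```python
import codecs


def rot13(text):
    """Rotate ASCII letters by 13 places; other characters are left alone."""
    return codecs.encode(text, 'rot_13')


flag = 'ClientSideLoginsAreEasy@flare-on.com'
encoded = rot13(flag)      # the form the page compares against
decoded = rot13(encoded)   # applying ROT-13 again undoes it
print(encoded)
print(decoded)
```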

Flag: ClientSideLoginsAreEasy@flare-on.com

#2 - IgniteMe.exe

This is one of the classical crackme challenges. A PE is provided. It takes a text as input and checks whether it's correct or not. The following script obtains the flag.

target = [13, 38, 73, 69, 42, 23, 120, 68, 43, 108, 93, 94, 69,
          18, 47, 23, 43, 68, 111, 110, 86, 9, 95, 69, 71, 115, 38, 10,
          13, 19, 23, 72, 66, 1, 64, 77, 12, 2, 105]

flag = []
v = 4

for x in reversed(target):
    v = x ^ v
    flag.append(v)

print ''.join(reversed(map(chr, flag)))

Flag: R_y0u_H0t_3n0ugH_t0_1gn1t3@flare-on.com

#3 - greek_to_me.exe

This is a PE which takes its input over the network. The input is a 32-bit integer. The most significant byte of this integer is used to decrypt a piece of encrypted code. A Fletcher checksum verifies the correctness of the decrypted data. Like others have done, I brute-forced the key in C.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

uint8_t data[] = {
 51, 225, 196, 153, 17, 6, 129, 22, 240, 50, 159, 196, 145, 23, 6, 129, 20, 240, 
 6, 129, 21, 241, 196, 145, 26, 6, 129, 27, 226, 6, 129, 24, 242, 6, 129, 25, 241, 
 6, 129, 30, 240, 196, 153, 31, 196, 145, 28, 6, 129, 29, 230, 6, 129, 98, 239, 6, 
 129, 99, 242, 6, 129, 96, 227, 196, 153, 97, 6, 129, 102, 188, 6, 129, 103, 230, 
 6, 129, 100, 232, 6, 129, 101, 157, 6, 129, 106, 242, 196, 153, 107, 6, 129, 104, 
 169, 6, 129, 105, 239, 6, 129, 110, 238, 6, 129, 111, 174, 6, 129, 108, 227, 6, 
 129, 109, 239, 6, 129, 114, 233, 6, 129, 115, 124
}; 

uint8_t buf[121];

uint16_t fletcher16(uint8_t *data, int count )
{
    uint16_t sum1 = 0;
    uint16_t sum2 = 0;
    int index;
 
    for( index = 0; index < count; ++index )
    {
       sum1 = (sum1 + data[index]) % 255;
       sum2 = (sum2 + sum1) % 255;
    }
 
    return (sum2 << 8) | sum1;
}

int main(void)
{
    uint16_t checksum;
    int i;
    int key;

    for (key = 1; key <= 255; key++)
    {
        memcpy(buf, data, 121);
        printf("[*] Trying key: %x -> ", key);
        /* Decrypt each of the 121 bytes; buf[121] would be out of bounds. */
        for (i = 0; i < 121; i++)
        {
            buf[i] ^= key;
            buf[i] += 0x22;
        }

        checksum = fletcher16(buf, 121);
        printf("%x\n", checksum);
        if (checksum == 0xFB5E)
        {
            printf("[!] Xor key: %x\n", key);
            break;
        }
    }
    return 0;
}

The decryption key is 0xa2. Running the program with this key we can retrieve the flag from memory.

Figure 3: Yes, even we too bruteforce!

Flag: et_tu_brute_force@flare-on.com

#4 - notepad.exe

This is one of those challenges which need a bit of spark to complete. The challenge searches for PE files in the directory %USERPROFILE%/flareon2016 challenge. The files must have a specific timestamp or else we fail. This threw me off for quite some time as I didn't have any idea what it meant. Fortunately, searching for one of the timestamps on Google led to a clue. The values are the timestamps of the first four PE files from last year's challenge. Dropping those files in the proper directory and a bit of fiddling gives the flag.

Flag: bl457_fr0m_th3_p457@flare-on.com

#5 - pewpewboat.exe

This is an x86_64 ELF although named as an exe. It's a hidden ships game. An 8x8 grid is provided. Some of the cells have a ship hidden beneath. The task is to complete all the levels. To play the game in a semi-automatic fashion I developed a script using the python bindings of radare.

#!/usr/bin/env python

import r2pipe
import sys

def get_ships(state):
    print 'Writing moves...'
    f = open('moves', 'w')
    for row in xrange(8):
        for col in xrange(8):
            bitmask = 1 << ((row * 8) + col)
            if state & bitmask != 0:
                f.write('%s%s' %(chr(65+row), chr(49+col)))
                f.write('\n')
    f.close()


def main():
    r2 = r2pipe.open('tcp://127.0.0.1:5555')
    # r2.cmd('aa')

    # r2.cmd('doo')
    """
    .text:0000000000403EB1 mov     rdi, rax
    .text:0000000000403EB4 call    play_map
    .text:0000000000403EB9 mov     [rbp+var_4C], eax
    .text:0000000000403EBC cmp     [rbp+var_4C], 1
    """
    # Set breakpoint on play_map
    r2.cmd('db 0x403EB4')

    # Resume execution
    r2 = r2pipe.open('tcp://127.0.0.1:5555')
    r2.cmd('dc')

    while True:
        # Breakpoint hit, get address of map in rdi
        r2 = r2pipe.open('tcp://127.0.0.1:5555')
        map_addr = r2.cmdj('drj')['rdi']

        # Get goal state
        r2 = r2pipe.open('tcp://127.0.0.1:5555')
        goal_state = r2.cmdj('pv8j @  %d' %map_addr)['value']

        get_ships(goal_state)

        # Resume execution
        r2 = r2pipe.open('tcp://127.0.0.1:5555')
        r2.cmd('dc')


if __name__ == '__main__':
    main()
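The decoding step in get_ships can be tried on its own, without a debugger attached. The sketch below is a Python 3 rewrite of that loop under a hypothetical name: bit (row * 8 + col) of the 64-bit state marks a ship, rows map to the letters A-H and columns to the digits 1-8. The sample states are made up.

```python
def ships_from_state(state):
    """Return the cell names (for example 'B2') whose bit is set in state."""
    cells = []
    for row in range(8):
        for col in range(8):
            if state & (1 << (row * 8 + col)):
                cells.append(chr(ord('A') + row) + chr(ord('1') + col))
    return cells


# Made-up example: bits 0, 9 and 63 set.
sample = ships_from_state((1 << 0) | (1 << 9) | (1 << 63))
print(sample)
```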

On completing all of the levels, it displays a message. Cleaning it up reveals:

Aye! You found some letters did ya? To find what you're looking for, you'll want to re-order them: 9, 1, 2, 7, 3, 5, 6, 5, 8, 0, 2, 3, 5, 6, 1, 4. Next you let 13 ROT in the sea! THE FINAL SECRET CAN BE FOUND WITH ONLY THE UPPER CASE.

As instructed, we ROT-13 the letters to get the string BUTWHEREISTHERUM. Feeding this to the application, we can get the flag.

Flag: y0u__sUnK_mY__P3Wp3w_b04t@flare-on.com

#6 - payload.dll

A PE32+ dll is provided which exports a single function named EntryPoint. However, trying to call this export via rundll32 fails. This was because DllMain modified the export table when called. Dumping the running dll from memory we were clearly able to see that the exported function name was indeed getting changed. The exported function took a name depending on the value of (year+month)%26. Using an x64dbg script, I was able to recover all of the possible names.

ctr = 0
@run:
init "C:\Documents and Settings\Administrator\Desktop\payload.dll"
doSleep 100
run 180005DDD
mov eax, ctr
run 180005D24
log =========================================================
log {d:ctr}
find rdx+33, 00
log function: {s:$result+1}
log =========================================================
stop
ctr = ctr + 1
cmp ctr, 1a
jl @run

Calling the obtained names using a batch script reveals the flag letter by letter.

@echo off
setlocal enabledelayedexpansion

set fn[0]=filingmeteorsgeminately
set fn[1]=leggykickedflutters
set fn[2]=incalculabilitycombustionsolvency
set fn[3]=crappingrewardsanctity
set fn[4]=evolvablepollutantgavial
set fn[5]=ammoniatesignifiesshampoo
set fn[6]=majesticallyunmarredcoagulate
set fn[7]=roommatedecapitateavoider
set fn[8]=fiendishlylicentiouslycolouristic
set fn[9]=sororityfoxyboatbill
set fn[10]=dissimilitudeaggregativewracks
set fn[11]=allophoneobservesbashfulness
set fn[12]=incuriousfatherlinessmisanthropically
set fn[13]=screensassonantprofessionalisms
set fn[14]=religionistmightplaythings
set fn[15]=airglowexactlyviscount
set fn[16]=thonggeotropicermines
set fn[17]=gladdingcocottekilotons
set fn[18]=diagrammaticallyhotfootsid
set fn[19]=corkerlettermenheraldically
set fn[20]=ulnacontemptuouscaps
set fn[21]=impureinternationalisedlaureates
set fn[22]=anarchisticbuttonedexhibitionistic
set fn[23]=tantalitemimicryslatted
set fn[24]=basophileslapsscrapping
set fn[25]=orphanedirreproducibleconfidences

for /l %%n in (0,1,25) do (
    set /a year=2001+%%n
    date 1-2-!year!
    rundll32 payload.dll !fn[%%n]! !fn[%%n]!
)
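The loop above changes the system year 26 times because the export name is selected by (year+month)%26, so 26 consecutive years with a fixed month reach every possible index. The sketch below checks that arithmetic. The month value of 1 is an assumption made for illustration only, since which field of 1-2-year is the month depends on the system's date format.

```python
def export_index(year, month):
    """Index of the export name, following the (year+month)%26 rule."""
    return (year + month) % 26


# Assumed month of 1; the years match the batch loop (2001 through 2026).
indices = [export_index(2001 + n, 1) for n in range(26)]
print(indices)
```

Under that assumption the index for year 2001+n is exactly n, which would line up with the fn[n] numbering used in the batch script.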

Flag: wuuut-exp0rts@flare-on.com

#7 - zsud.exe

The 7th challenge is a maze game implemented in PowerShell. The script was embedded inside a parent executable and run using CLR hosting. Dumping the script from the running process using dnSpy was easy.

Figure 4: Dumping the PS script

The ps script implementing the game was obfuscated. After deobfuscating it manually, I was able to draw the maze by following the code.

Figure 5: The Maze

We need to traverse the maze in a specific way or else we fail. The function which executed a move was as follows.

function Invoke-MoveDirection($char, $room, $direction, $trailing) {
 $nextroom = $null
 $movetext = "You can't go $direction."
 $statechange_tristate = $null

 $nextroom = Get-RoomAdjoining $room $direction
 if ($nextroom -ne $null) {
  $key = Get-ThingByKeyword $char 'key'
  if (($key -ne $null) -and ($script:okaystopnow -eq $false)) {
   $dir_short = ([String]$direction[0]).ToLower()

   ${N} = ${sCRiPt:MSVcRt}::("rand").Invoke()%6
   Write-Host $N

   if ($directions_enum[$dir_short] -eq ($n)) {
    $script:key_directions += $dir_short
    $newdesc = Invoke-XformKey $script:key_directions $key.Desc
    $key.Desc = $newdesc
    if ($newdesc.Contains("@")) {
     $nextroom = $script:map.StartingRoom
     $script:okaystopnow = $true
    }
    $statechange_tristate = $true
   } else {
    $statechange_tristate = $false
   }
  }

  $script:room = $nextroom
  $movetext = "You go $($directions_short[$direction.ToLower()])"

  if ($statechange_tristate -eq $true) {
   $movetext += "`nThe key emanates some warmth..."
  } elseif ($statechange_tristate -eq $false) {
   $movetext += "`nHmm..."
  }

  if ($script:autolook -eq $true) {
   $movetext += "`n$(Get-LookText $char $script:room $trailing)"
  }
 } else {
  $movetext = "You can't go that way."
 }

 return "$movetext"
}

At first glance, the use of a random function seemed a bit strange. The first few correct moves could be obtained by brute force. Inspecting the executable which hosted this script revealed that the rand function was actually hooked. Instead of returning random numbers, the function returned numbers from the following list in sequential order.

3, 0, 0, 2, 2, 1, 1, 1, 0, 2, 3, 0, 2, 2, 3, 3, 3, 5, 4, 0, 5, 4, 0, 5, 4, 0, 1, 4, 0, 2, 4, 0, 1, 2, 3, 5, 4, 0, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 4, 0

Mapping the numbers to the directions we get the following list of 53 moves.

w, n, n, e, e, s, s, s, n, e, w, n, e, e, w, w, w, d, u, n, d, u, n, d, u, n, s, u, n, e, u, n, s, e, w, d, u, n, s, e, w, s, e, w, s, e, w, s, e, w, d, u, n
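The translation can be reproduced with a small lookup table. The number-to-direction mapping below is inferred by lining up the two lists above; it is not copied from the script's $directions_enum definition, which is not shown here.

```python
# Values returned by the hooked rand(), already reduced modulo 6.
RAND_SEQUENCE = [
    3, 0, 0, 2, 2, 1, 1, 1, 0, 2, 3, 0, 2, 2, 3, 3, 3, 5, 4, 0,
    5, 4, 0, 5, 4, 0, 1, 4, 0, 2, 4, 0, 1, 2, 3, 5, 4, 0, 1, 2,
    3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 4, 0,
]

# Inferred mapping: north, south, east, west, up, down.
DIRECTION_BY_VALUE = {0: 'n', 1: 's', 2: 'e', 3: 'w', 4: 'u', 5: 'd'}

moves = [DIRECTION_BY_VALUE[value] for value in RAND_SEQUENCE]
print(len(moves), ', '.join(moves))
```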

I used a macro creator tool to send these moves to the game window. On performing the last move we are taken back to the starting point at the vestibule. From there it was just a matter of going to the office to get the flag.

Flag: mudd1ng_by_y0ur53lph@flare-on.com

#8 - flair.apk

Every year there is at least one challenge related to Android. This is it. An apk is provided. The challenge consists of four levels implemented by four different activities named Michael, Brian, Milton and Printer. We need to solve them sequentially in order to get back the flag.

The first activity can be easily solved statically to get the password MYPRSHE__FTW. The second activity compares our input password with a string generated at runtime. The last two activities perform a calculation on our input to generate an array which is again compared. To solve these three levels, I used Frida to hook the relevant functions.

import frida
import time

jscode = """
console.log("[+] Script loaded successfully...");
console.log("[+] Java available: " + Java.available);

Java.perform(function x() 
{   
    console.log("[+] Entered perform.."); 
    
    // Second level
    var brian = Java.use("com.flare_on.flair.Brian");    
    brian.teraljdknh.implementation = function(s1,s2) 
    {
        console.log("[+] Entered com.flare_on.flair.Brian.teraljdknh"); 
        console.log("[*] arg1=" + s1);
        console.log("[*] arg2=" + s2);
        return this.teraljdknh(s1, s2);
    };
    console.log("[+] Replaced com.flare_on.flair.Brian.teraljdknh");


    // Third level    
    var milton = Java.use("com.flare_on.flair.Milton");        
    milton.nbsadf.implementation = function () 
    {
        console.log("[+] Entered com.flare_on.flair.Milton.nbsadf");        
        retval = this.nbsadf();
        console.log("[*] retval="+retval);                  
        return retval;        
    };
    console.log("[+] Replaced com.flare_on.flair.flair.Milton.nbsadf");  
        
    
    //Fourth level
    var stapler = Java.use("com.flare_on.flair.Stapler");        
    
    //Hook string decryptor
    stapler.iemm.implementation = function (p) 
    {
        console.log("[+] Entered com.flare_on.flair.Stapler.iemm");        
        retval = this.iemm(p);
        console.log("[*] arg1="+p);                  
        console.log("[*] retval="+retval);                  
        return retval;        
    };
    console.log("[+] Replaced com.flare_on.flair.flair.Stapler.iemm");    
    
    //Hook hardcoded array
    stapler.poserw.implementation = function (p) 
    {
        console.log("[+] Entered com.flare_on.flair.Stapler.poserw");        
        retval = this.poserw(p);                          
        console.log("[*] retval="+retval);                  
        return retval;        
    };
    console.log("[+] Replaced com.flare_on.flair.flair.Stapler.poserw");    

});
"""

device =  frida.get_usb_device()
process = device.attach('com.flare_on.flair')
script = process.create_script(jscode)
script.load()
raw_input()

The script hooks the important functions and logs both the arguments and the return value. Running this we get a log.

[+] Script loaded successfully...
[+] Java available: true
[+] Entered perform..
[+] Replaced com.flare_on.flair.Brian.teraljdknh
[+] Replaced com.flare_on.flair.flair.Milton.nbsadf
[+] Replaced com.flare_on.flair.flair.Stapler.iemm
[+] Replaced com.flare_on.flair.flair.Stapler.poserw
[+] Entered com.flare_on.flair.Brian.teraljdknh
[*] arg1=hashtag_covfefe_Fajitas!
[*] arg2=hashtag_covfefe_Fajitas!
[+] Entered com.flare_on.flair.Milton.nbsadf
[+] Entered com.flare_on.flair.Stapler.poserw
[*] retval=16,-82,-91,-108,-125,30,11,66,-71,86,-59,120,-17,-102,109,68,-18,57,-109,-115
[*] retval=16,-82,-91,-108,-125,30,11,66,-71,86,-59,120,-17,-102,109,68,-18,57,-109,-115
[+] Entered com.flare_on.flair.Milton.nbsadf
[+] Entered com.flare_on.flair.Stapler.poserw
[*] retval=16,-82,-91,-108,-125,30,11,66,-71,86,-59,120,-17,-102,109,68,-18,57,-109,-115
[*] retval=16,-82,-91,-108,-125,30,11,66,-71,86,-59,120,-17,-102,109,68,-18,57,-109,-115
[+] Entered com.flare_on.flair.Stapler.iemm
[*] arg1=e.RP9SR8x9.GH.G8M9.GHkG
[*] retval=android.content.Context
[+] Entered com.flare_on.flair.Stapler.iemm
[*] arg1=LHG1@@!SxeGS9.M9.GHkG
[*] retval=getApplicationContext

--------------
snip
--------------

[+] Entered com.flare_on.flair.Stapler.iemm
[*] arg1=H?ye!v
[*] retval=equals
[+] Entered com.flare_on.flair.Stapler.poserw
[*] retval=95,27,-29,-55,-80,-127,-60,13,-33,-60,-96,35,-127,86,0,-114,-25,30,36,-92
[+] Entered com.flare_on.flair.Stapler.iemm
[*] arg1=e.RP9SR8x9.GH.G8M9.GHkG
[*] retval=android.content.Context
[+] Entered com.flare_on.flair.Stapler.iemm
[*] arg1=LHG1@@!SxeGS9.M9.GHkG
[*] retval=getApplicationContext

--------------
snip
--------------

[+] Entered com.flare_on.flair.Stapler.iemm
[*] arg1=,e}e8yGS!81PPe(v
[*] retval=java.util.Arrays
[+] Entered com.flare_on.flair.Stapler.iemm
[*] arg1=H?ye!v
[*] retval=equals
[+] Entered com.flare_on.flair.Stapler.poserw
[*] retval=95,27,-29,-55,-80,-127,-60,13,-33,-60,-96,35,-127,86,0,-114,-25,30,36,-92

From the trace, we can see that the password for the second activity is hashtag_covfefe_Fajitas!. The third and fourth activities required a bit of brute force to get their respective passwords.

Third Activity

public class Main
{
  public static void main(String args[])
  {
    byte[] b_arr = new byte[] {16,-82,-91,-108,-125,30,11,66,-71,86,-59,120,-17,-102,109,68,-18,57,-109,-115};
    char keyspace[] = "0123456789abcdef".toCharArray();
    
    for (byte b: b_arr)
    {
      next:
      for (char c1: keyspace)
      {
        for (char c2: keyspace)
        {
          byte x = (byte)((Character.digit(c1, 16) << 4) + Character.digit(c2, 16));
          if (b == x)
          {
            System.out.print(c1 + "" + c2);
            break next;
          }
        }
      }
    }
  }
}

Password: 10aea594831e0b42b956c578ef9a6d44ee39938d

Fourth Activity

public class Main
{
  public static void main(String args[]) throws Exception
  {

    byte[] b_arr = new byte[] {95,27,-29,-55,-80,-127,-60,13,-33,-60,-96,35,-127,86,0,-114,-25,30,36,-92};
    char keyspace[] = "0123456789abcdef".toCharArray();
    
    for (byte b: b_arr)
    {
      next:
      for (char c1: keyspace)
      {
        for (char c2: keyspace)
        {
          byte x = (byte)((Character.digit(c1, 16) << 4) + Character.digit(c2, 16));
          if (b == x)
          {
            System.out.print(c1 + "" + c2);
            break next;
          }
        }
      }
    }
  }
}

Password: 5f1be3c9b081c40ddfc4a0238156008ee71e24a4
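Since the nested loops simply find the two hex digits that encode each byte, the same passwords can be derived directly by hex-encoding the logged arrays. The following is a minimal Python sketch of that shortcut; the helper name signed_bytes_to_hex is my own and is not part of the challenge.

```python
def signed_bytes_to_hex(values):
    # Java bytes are signed (-128..127); mask to 0..255 before formatting.
    return ''.join('%02x' % (v & 0xff) for v in values)

# Arrays copied from the Frida log above.
third = [16, -82, -91, -108, -125, 30, 11, 66, -71, 86,
         -59, 120, -17, -102, 109, 68, -18, 57, -109, -115]
fourth = [95, 27, -29, -55, -80, -127, -60, 13, -33, -60,
          -96, 35, -127, 86, 0, -114, -25, 30, 36, -92]

print(signed_bytes_to_hex(third))
print(signed_bytes_to_hex(fourth))
```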

Flag: pc_lo4d_l3tt3r_gl1tch@flare-on.com

#9 - remorse.ino.hex

The challenge consists of a binary for an Arduino Uno in Intel Hex format. The microcontroller used on the Uno is the ATmega328P, an 8-bit RISC microcontroller. After consulting the pinout diagram it was evident that the state of Pin D was used as an input.

Figure 6: Pinout diagram of the ATmega328.
Source: https://www.arduino.cc/en/Hacking/PinMapping168

The 8-bit input from the pin is used as a key to xor a block of bytes. If the decrypted data contains an @ sign at the appropriate position, we have found the correct decryption key.

Figure 7: The xor loop in IDA

We can easily brute-force the decryption key, and thus the flag, in Python.

li = [0xB5, 0xB5, 0x86, 0xB4, 0xF4, 0xB3, 0xF1, 0xB0, 0xB0, 
0xF1, 0xED, 0x80, 0xBB, 0x8F, 0xBF, 0x8D, 0xC6, 0x85, 0x87, 0xC0, 0x94, 0x81, 0x8C]

output = [0]  * len(li)

for k in xrange(256):
  for i in xrange(len(li)):
    v = (li[i]^k)+i
    if v > 255:
      v -= 256
    output[i] = chr(v)
  if output[-3] == 'c' and output[-2] == 'o' and output[-1] == 'm':
    print 'Xor key=', hex(k)
    print ''.join(output)
    break

The decryption key i.e. the state of Pin D must be 0xdb. We can also emulate the binary in AVR Studio with the XOR key to get the flag.

Figure 8: Simulating in AVR Studio

Flag: no_r3m0rs3@flare-on.com

#10 - shell.php

This one required a lot of manual work to go through. The overall idea of this level is to decrypt data encrypted using multi-byte xor in a chained fashion. Suppose the length of the key is n, then the crypto system can be represented by the following image.

Figure 9: The crypto system

P stands for the plaintext, K for the key and C for the resulting ciphertext. The first n bytes of the plaintext are xored with the key; every later plaintext byte is xored with the plaintext byte n positions before it to form the ciphertext.
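To make the scheme concrete, here is a minimal Python sketch of the encryption and its inverse. It assumes the layout implied by the key-length script further down, namely C[i] = P[i] xor K[i] for i < n and C[i] = P[i] xor P[i-n] for the rest. The function names, sample plaintext and sample key are hypothetical and are not taken from shell.php.

```python
def chained_xor_encrypt(plaintext, key):
    pt, k = bytearray(plaintext), bytearray(key)
    n = len(k)
    ct = bytearray(len(pt))
    for i in range(len(pt)):
        # First n bytes use the key, later bytes use the plaintext n positions back.
        ct[i] = pt[i] ^ (k[i] if i < n else pt[i - n])
    return bytes(ct)


def chained_xor_decrypt(ciphertext, key):
    ct, k = bytearray(ciphertext), bytearray(key)
    n = len(k)
    pt = bytearray(len(ct))
    for i in range(len(ct)):
        # pt[i - n] has already been recovered by the time it is needed.
        pt[i] = ct[i] ^ (k[i] if i < n else pt[i - n])
    return bytes(pt)


sample = b"<?php echo 'hello'; ?>"   # made-up plaintext
key = b"0123abcd"                    # made-up key
assert chained_xor_decrypt(chained_xor_encrypt(sample, key), key) == sample
```

The useful property for the attack is that a guess for the plaintext byte at position i fixes every byte at positions i+n, i+2n and so on, which is what allows wrong guesses to be ruled out by checking printability.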

To break the crypto, the initial step is to find the length of the key. We know that the plaintext is a piece of PHP code. For the first piece of PHP code, the length of the key can vary from 32 to 64, and the key can only consist of the hexadecimal characters 0-9 a-f.

$key = isset($_POST['key']) ? $_POST['key'] : "";
$key = md5($key) . substr(MD5(strrev($key)) , 0, strlen($key));

To find the exact key length, I wrote another script which tried all the key lengths from 32 to 64 such that the resultant plain text consists of only printable characters.

import string

#Put the base64 blob here
encoded = ''
ct = map(ord, list(encoded.decode('base64')))


def is_char_possible(char, pos, keylen):
    char = ord(char)
    while True:
        if pos + keylen >= len(ct):
            return True
        pos += keylen
        char ^= ct[pos]
        if chr(char) not in string.printable:
            return False

def is_keylen_possible(keylen):
    for pos in xrange(0, keylen):
        possible = False

        # Iterate over all printable characters
        for ch in string.printable:
            if is_char_possible(ch, pos, keylen):
                # Char ch is possible at this position pos
                possible = True
                break

        # No printable char possible at position pos
        if not possible:
            return False

    # All positions have possible printable characters
    return True


def find_possible_key_lengths():
    for keylen in xrange(32,65):
        if is_keylen_possible(keylen):
            print '[+] Possible key length =', keylen


def find_possible_chars(pos, keylen):
    possible = []
    for ch in string.printable:
        if is_char_possible(ch, pos, keylen):
            possible.append(ch)
    return possible

def find_possible_keys(keylen=64):
    # Print the candidate characters for every position in the key
    for pos in xrange(0, keylen):
        print '[+] Position', pos, find_possible_chars(pos, keylen)

if __name__ == '__main__':
    find_possible_key_lengths()
    #find_possible_keys()

Running the code we immediately get the key length to be 64.

Figure 10: Calculating the key length

After finding the key length, the next step is to get the key. This was done in two steps. First, I wrote a script which calculated which characters were possible at each of the 64 positions. The code for it is included in the same script above. Just switch the comments in main().

$ python findcandidates.py 
[+] Position 0 ['#', '$', '%', ' ']
[+] Position 1 ['b', 'c', 'd', 'e', 'g']
[+] Position 2 ['0', '1', '2', '3', '4', '5', '7', '8', '9', '!', '"', '#', '$', '%', '(', ')', '*', '+', ',', ':', ';', '<', '=', '>', '?']
[+] Position 3 ['#', '&', "'", ' ']
[+] Position 4 ['!', '$', '&', "'", ' ']
[+] Position 5 ['1', '2', '3', '5', '6', '7', '8', '9', '!', '"', '#', '$', '&', ')', '*', '+', ',', '-', '.', '/', ':', ';', '=', '>', '?']
[+] Position 6 ['\t', '\r', '\x0b', '\x0c']
[+] Position 7 ['\t', '\n', '\r', '\x0b', '\x0c']
[+] Position 8 ['"', '$', '%', ' ']
[+] Position 9 ['h', 'j', 'k', 'l']
[+] Position 10 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'k', 'l', 'm', 'o', 'p', 'q', 's', 'u', 'v', 'w', 'z', '{', '|']
[+] Position 11 ['a', 'c', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '`', '{', '|', '}', '~']
[+] Position 12 ['!', '&', ' ']
[+] Position 13 [':', '<', '=']
[+] Position 14 ['!', '#', '&', ' ']
[+] Position 15 ['0', '2', '3', '4', '5', '6', '7', '!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', ':', ';', '<', '=', '?', ' ']
[+] Position 16 ['0', '2', '4', '5', '6', '7', '8', '9', '!', '"', '#', '&', "'", '(', '+', ',', '-', '/', ';', '=', '>', '?', ' ']
[+] Position 17 ['1', '2', '3', '4', '5', '6', '8', '9', '"', '#', '$', '%', '&', '(', ')', '*', '-', '.', ':', ';', '<', '=', '>', '?']
[+] Position 18 ['\t', '\r', '\x0b', '\x0c']
[+] Position 19 ['\t', '\n', '\r', '\x0b', '\x0c']
[+] Position 20 ['h', 'i', 'm']
[+] Position 21 ['e', 'f', 'g']
[+] Position 22 ['0', '5', '6', '7', '8', '9', '!', '"', '#', '$', '%', '&', '(', ')', ',', '-', '.', '/', ';', '=', '>', '?', ' ']
[+] Position 23 ['(', ')', ',', '.', '/']
[+] Position 24 ['h', 'i', 'j', 'o']
[+] Position 25 ['a', 'b', 'd', 'e', 'f', 'i', 'k', 'l', 'n', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '}', '~']
[+] Position 26 ['r', 's', 't', 'u', 'w']
[+] Position 27 ['b', 'd', 'e', 'f']
[+] Position 28 ['p', 't', 'u']
[+] Position 29 ['(', ')', '+', '.']
[+] Position 30 ['0', '1', '2', '4', '5', '8', '9', '!', '"', '#', '$', '%', '&', "'", '*', '+', ',', '-', '.', ':', ';', '<', '=', '?', ' ']
[+] Position 31 ['B', 'C', 'E', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '@', '[', '\\', ']', '^', '_']
[+] Position 32 ['A', 'B', 'F', 'G', 'I', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'W', 'Y', 'Z', '[', '\\', '^', '_']
[+] Position 33 ['A', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'Y', 'Z', '\\', ']', '^', '_']
[+] Position 34 ['R', 'S', 'T', 'U', 'W']
[+] Position 35 ['R', 'S', 'T', 'U', 'W']
[+] Position 36 ['B', 'D', 'F', 'I', 'K', 'L', 'N', 'Q', 'R', 'S', 'T', 'U', 'X', 'Y', 'Z', '@', '[', ']']
[+] Position 37 ['!', '#', '&', "'", ' ']
[+] Position 38 ['h', 'i', 'n', 'o']
[+] Position 39 ['s', 't', 'u', 'X', 'Y', '\\', '^', '_']
[+] Position 40 ['a', 'c', 'd', 'f', 'h', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'v', 'x', 'y', '`', '{']
[+] Position 41 ['1', '2', '3', '5', '7', '8', '9', '"', '#', '$', '%', '&', "'", ')', '*', '-', '.', ':', '<', '>']
[+] Position 42 ['Y', '[', '\\', ']']
[+] Position 43 ['(', ')', '.', '/']
[+] Position 44 ['(', ')', '*']
[+] Position 45 ['\t', '\n', '\r', '\x0b', '\x0c']
[+] Position 46 ['\t', '\n', '\x0b', '\x0c']
[+] Position 47 ['0', '1', '2', '5', '7', '8', '9', '!', '#', '$', '%', "'", '(', ')', '*', '+', ',', '-', '.', '/', ';', '<', '=', '?', ' ']
[+] Position 48 ['!', ' ']
[+] Position 49 ['#', '$', '%']
[+] Position 50 ['h', 'j', 'k', 'l', 'm']
[+] Position 51 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'k', 'l', 'm', 'o', 'p', 'q', 's', 't', 'u', 'v', 'w', 'z', '`', '{', '|', '}', '~']
[+] Position 52 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'l', 'p', 's', 'u', 'w', 'x', 'y', 'z', '`', '{', '|', '}']
[+] Position 53 ['0', '1', '2', '3', '4', '5', '6', '7', '!', '"', '#', '$', '%', '&', "'", ')', '*', '.', ';', '=', '>', '?', ' ']
[+] Position 54 ['0', '2', '3', '4', '5', '7', '8', '9', '!', '"', '$', '&', "'", '(', ')', '*', '+', ',', '-', ':', '<', '=', '>', '?']
[+] Position 55 ['0', '1', '3', '4', '5', '7', '8', '9', '!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '.', '<', '=', '>', '?', ' ']
[+] Position 56 ['1', '3', '4', '6', '8', '9', '!', '"', '$', '%', '&', "'", '(', '*', '+', '-', '.', '/', ':', ';', '=', ' ']
[+] Position 57 ['X', 'Y', '[', '^', '_']
[+] Position 58 ['P', 'Q', 'V', 'W']
[+] Position 59 ['H', 'I', 'L', 'N', 'O']
[+] Position 60 ['A', 'C', 'E', 'F', 'G', 'K', 'L', 'O', 'P', 'Q', 'R', 'S', 'V', 'W', 'Z', '\\', ']', '^', '_']
[+] Position 61 ['A', 'B', 'C', 'E', 'F', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'T', 'U', 'V', 'W', 'Y', 'Z', '@', '[', '\\', ']', '^']
[+] Position 62 ['Z', '[', '\\', '_']
[+] Position 63 ['$', '&', "'", ' ']

We now know which characters are possible at each of the 64 positions in the key. Next, I wrote an interactive GUI tool in Python which allowed me to construct the key letter by letter from the list above. At each position we need to choose a character such that the decrypted code is valid PHP. The tool showed the result of the decryption as the key was gradually built up. The source code of the tool can be found on GitHub: https://gist.github.com/extremecoders-re/f8258764d38133d435a0f1ae053a1a0d

Figure 11: A tool to decrypt interactively

There are four such sub-levels in this challenge, all based on the same idea. Solving them, i.e. finding the correct decryption keys, gives the flag.

Flag: th3_xOr_is_waaaay_too_w34k@flare-on.com

#11 - covfefe.exe

This was definitely the finest challenge of Flare-On 2017. The problem is based on a VM whose instruction set consists of a single instruction: subtract the contents of one memory location from another and branch if the result is less than or equal to zero. A search on Google reveals that this is a Subleq VM. The entire VM is implemented in a few lines of code.

Figure 12: The decompiled code implementing the VM
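For readers unfamiliar with Subleq, the following toy interpreter is a rough illustration of the single instruction. It is a sketch using the textbook semantics (operands a, b, c: mem[b] -= mem[a], jump to c if the result is <= 0, and a negative jump target halts) and is not the challenge's VM, which additionally never branches when c is 0. The function name run_subleq and the sample program are my own.

```python
def run_subleq(mem, pc=0, max_steps=1000):
    """Run a Subleq program in place until pc goes negative or leaves memory."""
    steps = 0
    while 0 <= pc and pc + 3 <= len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        # Branch to c when the result is <= 0, otherwise fall through.
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem


# Toy program computing Y += X with a scratch cell Z.
# Cells 0..8 hold three instructions, cells 9..11 hold X=7, Y=5, Z=0.
program = [9, 11, 3,     # Z -= X   (Z becomes -X)
           11, 10, 6,    # Y -= Z   (Y becomes Y + X)
           11, 11, -1,   # Z -= Z   (Z becomes 0), then halt via target -1
           7, 5, 0]
result = run_subleq(program)
print(result[10])
```

Even addition takes three instructions here, which gives a feel for why real programs compiled to Subleq produce very long execution traces.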

With so little code, I reimplemented the entire logic in Python.

from __future__ import print_function
from vm import code
import random

userinput = None
input_idx = -1

# printf("%c", char)
def printfc(char):
    print(chr(char), end='')


# scanf("%c", &returned)
def scanfc():
    global userinput, input_idx
    if userinput is None:
        userinput = raw_input()
        input_idx = 0
        if userinput == '':
            return 0xA
        else:
            return ord(userinput[input_idx])
    else:
        input_idx += 1
        if input_idx == len(userinput):
            input_idx -= 1
            return 0xA
        else:
            return ord(userinput[input_idx])


def dispatch(a, b, c):
    code[b] -= code[a]
    if c != 0:
        return code[b] <= 0
    else:
        return False


def exec_vm(entry, size):
    pc = entry
    while pc + 3 <= size:
        if dispatch(code[pc], code[pc+1], code[pc+2]):
            if code[pc+2] == -1:
                return 1
            pc = code[pc+2]

        else:
            pc += 3

        if code[4] == 1:
            printfc(code[2])
            code[4] = 0
            code[2] = 0

        if code[3] == 1:
            code[1] = scanfc()
            code[3] = 0


def main():
    code[273] = random.randint(0, 32767) % code[272]
    exec_vm(1123, 4352)


if __name__ == '__main__':
    main()

From here, I modified the VM code to print the instruction pointer during each loop. This did not help, as the total number of executed instructions was well over 100k. The trace looked like the screenshot below.

Figure 13: Execution trace of the VM

With such a huge trace, it was extremely difficult to figure out what was going on. After a bit of trial and error, I decided to run a taint analysis to track how the input flowed through the code. The VM is essentially an array of integers. The idea was to mark a cell dirty if it was controllable by user input. If there was a comparison, it would immediately show up. The VM code was modified to introduce the tainting features.

from __future__ import print_function
from vm import code
import random

scanf_count = 0
user_input = 'abcdefghijklmnopqrstuvwxyz_0123'[::-1]
tainted = []
track_taints = False
tainted_at_least_once = []

def taint(address, taint_src):
    if address not in tainted:
        tainted.append(address)
        print('[+] Tainted [{}] = {}'.format(address, code[address]))
    else:
        print('[+] Re-tainted [{}] = {}'.format(address, code[address]))

    if address not in tainted_at_least_once:
        tainted_at_least_once.append(address)


def is_tainted(address):
    return address in tainted

def untaint(address):
    if address in tainted:
        tainted.remove(address)
        print('[-] Untainted [{}]'.format(address))


# printf("%c", char)
def printfc(char):
    print(chr(char), end='')


def dispatch(a, b, c):
    global track_taints

    code[b] -= code[a]

    if track_taints:
        if a == b and is_tainted(b):
            untaint(b)

        elif is_tainted(a):
            taint(b, a)

        elif is_tainted(b):
            taint(b, a) # Retaint

    
    if c != 0:
        return code[b] <= 0
    else:
        return False


def exec_vm(entry, size):
    global scanf_count, track_taints
    pc = entry
    while pc + 3 <= size:
        if dispatch(code[pc], code[pc+1], code[pc+2]):
            if code[pc+2] == -1:
                return 1
            pc = code[pc+2]

        else:
            pc += 3

        if code[4] == 1:
            if track_taints:
                track_taints = False
                print('[!] Taint tracking OFF')
            printfc(code[2])
            code[4] = 0
            code[2] = 0

        if code[3] == 1:
            print('[*] Call scanf <<<<<<<<<<<<<<<<<<<<<<<<')
            if scanf_count == 0: 
                print('[!] Taint tracking ON')
                track_taints = True
                code[1] = ord(user_input[scanf_count])
                scanf_count += 1
                taint(1, 'external')

            elif scanf_count < len(user_input):
                code[1] = ord(user_input[scanf_count])
                scanf_count += 1
                taint(1, 'external')

            else:
                code[1] = 0xA

            code[3] = 0

def main():
    global tainted, tainted_at_least_once
    code[273] = 0 #random.randint(0, 32767) % code[272]
    exec_vm(1123, 4352)

    print(tainted)
    for t in tainted:
        print('[{}] = {}'.format(t, code[t]))
    print('Tainted at least once')
    print(tainted_at_least_once)


if __name__ == '__main__':
    main()

The taint trace was significantly reduced in length compared to the execution trace.

Figure 14: Taint trace

From the trace, it was clear that the characters at the odd positions were multiplied by 15 and left-shifted seven times. Inspecting the VM bytecode, there was a series of numbers which looked to be the result of this calculation on the characters at the odd positions.

Figure 15: A strange sequence of integers!

Reversing the calculation, the characters at the odd positions were found. The remaining characters at the even positions were found using Google and an English dictionary. The regex search feature in Notepad++ really helped here.
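The forward and reverse calculations can be sketched in Python as follows. This assumes the transformation is exactly "multiply by 15, then shift left by 7" as read from the taint trace; how the VM combines the result with other terms is not covered here, and the function names are hypothetical.

```python
def encode_char(ch):
    # Multiply the character code by 15, then shift left seven times.
    return (ord(ch) * 15) << 7


def decode_value(value):
    # Undo the shift first, then the multiplication.
    return chr((value >> 7) // 15)


encoded = encode_char('s')
print(encoded)
print(decode_value(encoded))
```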

Flag: subleq_and_reductio_ad_absurdum@flare-on.com

#12 - [missing]

The last challenge is so convoluted that an entire CTF contest could be built around it. A piece of malware ran on a machine and exfiltrated some files. The task is to reconstruct the exact set of events that occurred based on the network traffic capture. I am not going to describe everything in detail as it would easily fill many blog posts. Instead, I will only focus on my approach.

The network topology can be described by the next figure.

Figure 16: The network topology

The Command & Control communicates with the machine having an IP 192.168.221.91. This is used as a pivot to attack the other system 192.168.221.105. The pcap file which we have been provided contains the network traffic between C&C and the first system.

There are multiple stages in the malware. The first stage (coolprogram.exe) downloads the second stage (srv.exe) and executes it. The second stage is the principal malware. The functionality of this malware is built upon plugins. Plugins can be of three types - cryptography (CRYP), compression (COMP) and command (CMD). The traffic between the malware and its C&C is mostly encrypted except at the start when the CRYP plugins have not yet been loaded. Plugins are DLL files with a modified PE header.

To recreate the exact set of events, we need to replay the network traffic from the PCAP. I wrote a Python script using the pyshark library. Since we were replaying the packets, there was no need to listen for the responses from the malware. However, with this approach the malware froze after running for some time as the send buffer filled up. To remedy the situation I had to patch ws2_32.send to discard all the packets sent to it.

import pyshark
import socket
import time
import threading

server_ip = '52.0.104.200'

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('', 9443))
sock.listen(5)
print '[++] Waiting for connection...'
(clientsocket, address) = sock.accept()
print '[++] Accepting connection from', address

cap = pyshark.FileCapture('data-only.pcap')

# Replay the first 4144 packets of the capture in order
idx = 0
while idx < 4144:
    packet = cap[idx]
    data = packet.data.data.decode('hex')
    buflen = len(data)

    # Data packet from server to client
    if packet.ip.src == server_ip:
        clientsocket.send(data)
        print '[<-] Sent %d bytes from index %d' %(buflen, idx)
        time.sleep(0.1)

    idx += 1

When the malware is running under the control of a debugger, it's possible to intercept the process and dump the buffer containing the plugins from memory. Using another script I corrected the PE header of the dumped DLLs so that IDA and other tools can analyze them.

import sys
import struct

ifile = open(sys.argv[1], 'rb')
ofile = open(sys.argv[2], 'wb')

# MZ header
print '[+] Correcting MZ header'
assert ifile.read(2) == 'LM'
ofile.write('MZ')

# Write up to e_lfanew
ofile.write(ifile.read(0x3c-2))

e_lfanew = struct.unpack('<I', ifile.read(4))[0]
ofile.write(struct.pack('<I', e_lfanew))

ofile.write(ifile.read(e_lfanew - (0x3c + 4)))

# PE header
print '[+] Correcting PE header'
assert ifile.read(4) == 'NOP\0'
ofile.write('PE\0\0')

# Machine
print '[+] Correcting PE.Machine'
assert ifile.read(2) == '32'
ofile.write('\x4c\x01')

ofile.write(ifile.read(0x22))

print '[+] Correcting Address of entrypoint'
# Address of entrypoint
entrypoint = struct.unpack('<I', ifile.read(4))[0] ^ 0xabcdabcd
ofile.write(struct.pack('<I', entrypoint))

ofile.write(ifile.read())

ifile.close()
ofile.close()

print '[+] Done'

The second stage running on 192.168.221.91 loads 9 plugins. Each plugin has a unique 16-byte signature. The signature is also present in each of the packets to determine which plugin will process that particular data packet. Out of the 9 plugins, 4 deal with crypto, 1 with compression and the remaining are command plugins.

The second stage then uses psexec to copy itself over to the third system at 192.168.221.105 and execute it. One of the command plugins acts as a relay and forwards all traffic between the C&C and the third system in both directions. This third stage loads up 9 more plugins, all of which are relayed by the second system from the C&C.

Thus there are 18 plugins in total, out of which 8 deal with cryptography, 2 with compression and the remaining are command plugins. The crypto and compression plugins are shown in the table below.

Stage	Algorithm	Type	Key Size (bytes)	IV Size (bytes)	Mode
STAGE 2	RC4	Crypto	16	-	-
	Transposition cipher	Crypto	-	-	-
	Custom base64	Crypto	-	-	-
	XTEA	Crypto	16	8	CBC
	ZLIB	Comp	-	-	-
STAGE 3	Blowfish	Crypto	16	8	CBC
	XOR	Crypto	4	-	ECB
	Triple DES	Crypto	24	8	CBC
	Camellia	Crypto	16	-	ECB
	Aplib	Comp	-	-	-

The above plugins implement standard crypto/compression algorithms, so I reimplemented them in Python. The source code of all the reimplemented plugins is available here:
https://gist.github.com/extremecoders-re/b7caf1a5d2f884733a75dcdc80d8e384
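As an illustration of how small such a reimplementation can be, here is a minimal sketch of textbook RC4, the first algorithm in the table. It is the standard algorithm written from scratch, not the code from the gist, and the plugin's actual key handling may differ. The key and data in the usage lines are placeholders.

```python
def rc4(key, data):
    key, data = bytearray(key), bytearray(data)

    # Key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xff
        S[i], S[j] = S[j], S[i]

    # Pseudo-random generation algorithm (PRGA), xored with the data
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xff
        j = (j + S[i]) & 0xff
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) & 0xff])
    return bytes(out)


# RC4 is symmetric: applying it twice with the same key restores the input.
placeholder_key = b"0123456789abcdef"   # 16-byte key, made up for the example
ciphertext = rc4(placeholder_key, b"hello plugin")
assert rc4(placeholder_key, ciphertext) == b"hello plugin"
```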

Once the plugins were implemented in Python, decrypting the traffic was simple. The decrypted traffic contained a BMP image without a header. The entire image was split across multiple packets. After assembling them properly and adding the header we get the following image containing a password.

Figure 17: Decrypted BMP
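Adding the missing header amounts to prepending a 14-byte file header and a 40-byte BITMAPINFOHEADER to the reassembled pixel data. The sketch below shows one way to build such a header in Python for an uncompressed image. The width, height and bit depth are parameters that have to be determined for the actual image; the values used in the usage lines are purely hypothetical, as are the function and variable names.

```python
import struct


def bmp_header(width, height, bits_per_pixel=24):
    # Each row of pixel data is padded to a multiple of 4 bytes.
    row_size = ((width * bits_per_pixel + 31) // 32) * 4
    image_size = row_size * abs(height)
    offset = 14 + 40  # file header + BITMAPINFOHEADER, no palette

    # BITMAPFILEHEADER: magic, file size, two reserved words, pixel data offset
    file_header = struct.pack('<2sIHHI', b'BM', offset + image_size, 0, 0, offset)

    # BITMAPINFOHEADER: header size, width, height, planes, bpp, compression,
    # image size, x/y pixels per metre, palette colours, important colours
    info_header = struct.pack('<IiiHHIIiiII', 40, width, height, 1,
                              bits_per_pixel, 0, image_size, 2835, 2835, 0, 0)
    return file_header + info_header


# Hypothetical usage: a 2x2 24-bit image made of zero bytes.
pixels = b'\x00' * 16
bmp = bmp_header(2, 2) + pixels
print(len(bmp))
```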

The third stage running on larryjohnson-pc encrypted a file lab10.zip to lab10.zip.cry and exfiltrated it to the server via the stage2 relay. Decrypting the traffic using our plugins and reassembling the pieces we can reconstruct the cry file.

The encryptor named cf.exe is present in the captured traffic. Based on the decompiled C# code of the encryptor we can build a decrypter to get back the zip file.

using System;
using System.Text;
using System.IO;
using System.Security.Cryptography;

namespace cf_decrypter
{
    class Program
    {
        static void Main(string[] args)
        {
            using (FileStream fileStream = File.Open("lab10.zip.cry", FileMode.Open))
            {
                byte[] signature = new byte[4];
                
                fileStream.Read(signature, 0, 4);
                string sign = Encoding.ASCII.GetString(signature);

                if (sign.Equals("cryp"))
                {
                    byte[] IV = new byte[16];

                    // Read IV
                    fileStream.Read(IV, 0, IV.Length);

                    byte[] sha256_hash = new byte[32];

                    //Read SHA256 hash
                    fileStream.Read(sha256_hash, 0, sha256_hash.Length);

                    int ciphertext_len = (int)(fileStream.Length - fileStream.Position);

                    byte[] ciphertext = new byte[ciphertext_len];

                    // Read cipher text
                    fileStream.Read(ciphertext, 0, ciphertext_len);

                    byte[] key = Convert.FromBase64String("tCqlc2+fFiLcuq1ee1eAPOMjxcdijh8z0jrakMA/jxg=");
                    Aes aes = Aes.Create();
                    aes.KeySize = 256;
                    aes.Key = key;
                    aes.IV = IV;
                    aes.Padding = PaddingMode.PKCS7;
                    aes.Mode = CipherMode.CBC;

                    ICryptoTransform transform = aes.CreateDecryptor();

                    using (MemoryStream memoryStream = new MemoryStream())
                    {
                        using (CryptoStream cryptoStream = new CryptoStream(memoryStream, transform, CryptoStreamMode.Write))
                        {
                            cryptoStream.Write(ciphertext, 0, ciphertext.Length);
                            //cryptoStream.FlushFinalBlock();
                            File.WriteAllBytes("decrypted", memoryStream.ToArray());                            
                        }
                    }
                }
            }
        }
    }
}

The decrypted zip file was password protected. The password can be found in the BMP image. Inside the zip is an x86_64 ELF written in Golang. Running the ELF gives the flag.

Flag: n3v3r_gunna_l3t_you_down_1987_4_ever@flare-on.com

Final Words

Overall, I feel the challenges this year were harder than in previous years. Challenges #11 and #12 deserve special mention. Challenge #12, in particular, can get very tough and time-consuming if not approached in the proper way. With this, we come to the end of this blog post. I would like to thank Crystalboy, rand0m, Alexander Polyakov and Peter Huene for their tips.

Deobfuscating PjOrion using bytecode simplifier

2017-07-11T05:14:00.003+00:00

Bytecode simplifier is a tool to de-obfuscate PjOrion protected python scripts. This post is a short tutorial to show how to use this module to deobfuscate a protected python script.

I have used the sample code below to demonstrate its usage. This is a small program to calculate the factorial of a number.

# Python program to find the factorial of a number using recursion
 
def recur_factorial(n):
   """Function to return the factorial
   of a number using recursion"""
   if n == 1:
       return n
   else:
       return n*recur_factorial(n-1)
 
 
# take input from the user
num = int(input("Enter a number: "))
 
# check if the number is negative
if num < 0:
   print("Sorry, factorial does not exist for negative numbers")
elif num == 0:
   print("The factorial of 0 is 1")
else:
   print("The factorial of",num,"is",recur_factorial(num))

I first compiled the script and then protected it, as shown in Figures 1 and 2.

Figure 1: Compiling the script

Figure 2: Protecting the generated pyc file

After protection, we get a large file example.pyc about 27 KiB in size. This is the file we will be working on.

The stock Python interpreter does not have bytecode tracing facilities built in. Hence we have to use a modified version of Python which supports bytecode tracing. I have provided a precompiled version of Python 2.7.13 with bytecode tracing support on GitHub. The python27.dll file has to be copied to C:\Windows\System32\. Make sure to back up the existing DLL so that you can revert when finished.

Step - 1: Unwrapping the layers

The first step is to unwrap the protection layers to get hold of the actual obfuscated code object. For this, we will be using the pjunwrapper module as shown below.

C:\pj-dump>python pjunwrapper.py --ifile=example.pyc
XXX lineno: 1, opcode: 156
[*] Dumped 1 code object
XXX lineno: 1, opcode: 213
XXX lineno: 1, opcode: 184
XXX lineno: 1, opcode: 240
XXX lineno: 1, opcode: 240
XXX lineno: 1, opcode: 240
XXX lineno: 1, opcode: 240
[*] Dumped 1 code object
XXX lineno: 1, opcode: 7
XXX lineno: 1, opcode: 45
[*] Dumped 1 code object
XXX lineno: 1, opcode: 161
Enter a number: ^D
Error in module '__main__': unexpected EOF while parsing (<string>, line 1)

PjUnwrapper requires the pystack extension module. Make sure that the extension is present in the Python path. Running the command dumps several files whose names start with wrapper_. These are the wrapper layers over the actual obfuscated code. In our case, the obfuscated code is in the file wrapper_3.pyc as shown in Figure 3. In general, the highest numbered file contains the final obfuscated code.
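With only a few layers the final file is easy to spot by eye, but the selection can also be scripted. The following is a minimal sketch, assuming the dumped files follow the wrapper_<n>.pyc naming seen above. The helper name latest_wrapper is hypothetical and is not part of pjunwrapper.

```python
import re

# Matches names such as wrapper_3.pyc. The pattern is an assumption based on
# the file name shown in this post.
WRAPPER_RE = re.compile(r"^wrapper_(\d+)\.pyc$")


def latest_wrapper(filenames):
    """Return the wrapper_<n>.pyc name with the largest n, or None."""
    best_name, best_index = None, -1
    for name in filenames:
        match = WRAPPER_RE.match(name)
        if match and int(match.group(1)) > best_index:
            best_name, best_index = name, int(match.group(1))
    return best_name
```

Comparing the numeric suffix as an integer matters once there are ten or more layers, since a plain string sort would place wrapper_10.pyc before wrapper_3.pyc.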

Figure 3: Unwrapping the protection layers

Step - 2: Deobfuscating

The final step is to run bytecode_simplifier over wrapper_3.pyc as shown below.

C:\bytecode_simplifier\main.py --ifile=wrapper_3.pyc --ofile=wrapper_deobf.pyc
INFO:__main__:Opening file wrapper_3.pyc
INFO:__main__:Input pyc file header matched
DEBUG:__main__:Unmarshalling file
INFO:__main__:Processing code object \x0f\x1d\n\x00\x07\x0f\x0f
DEBUG:deobfuscator:Code entrypoint matched PjOrion signature v1
INFO:deobfuscator:Original code entrypoint at 269
INFO:deobfuscator:Starting control flow analysis...
DEBUG:disassembler:Finding leaders...
DEBUG:disassembler:Start leader at 269
DEBUG:disassembler:End leader at 272
DEBUG:disassembler:Start leader at 272
DEBUG:disassembler:End leader at 117
DEBUG:disassembler:Start leader at 117
DEBUG:disassembler:End leader at 82
DEBUG:disassembler:Start leader at 82
DEBUG:disassembler:End leader at 28
DEBUG:disassembler:Start leader at 28
DEBUG:disassembler:End leader at 177
DEBUG:disassembler:Start leader at 177
DEBUG:disassembler:End leader at 125
DEBUG:disassembler:Start leader at 125
DEBUG:disassembler:End leader at 155
DEBUG:disassembler:Start leader at 155
DEBUG:disassembler:End leader at 60
DEBUG:disassembler:Start leader at 60
DEBUG:disassembler:End leader at 165
DEBUG:disassembler:Start leader at 165
DEBUG:disassembler:End leader at 353
DEBUG:disassembler:Start leader at 353
DEBUG:disassembler:End leader at 303
DEBUG:disassembler:Start leader at 303
DEBUG:disassembler:End leader at 190
DEBUG:disassembler:Start leader at 190
DEBUG:disassembler:End leader at 235
DEBUG:disassembler:Start leader at 235
DEBUG:disassembler:Start leader at 235
DEBUG:disassembler:End leader at 51
DEBUG:disassembler:Start leader at 51
DEBUG:disassembler:End leader at 238
DEBUG:disassembler:Start leader at 238
DEBUG:disassembler:End leader at 313
DEBUG:disassembler:Start leader at 313
DEBUG:disassembler:End leader at 105
DEBUG:disassembler:Start leader at 105
DEBUG:disassembler:End leader at 246
DEBUG:disassembler:Start leader at 246
DEBUG:disassembler:End leader at 142
DEBUG:disassembler:Start leader at 142
DEBUG:disassembler:End leader at 71
DEBUG:disassembler:Start leader at 71
DEBUG:disassembler:End leader at 229
DEBUG:disassembler:Start leader at 229
DEBUG:disassembler:End leader at 33
DEBUG:disassembler:Start leader at 33
DEBUG:disassembler:Start leader at 33
DEBUG:disassembler:End leader at 44
DEBUG:disassembler:Start leader at 44
DEBUG:disassembler:End leader at 342
DEBUG:disassembler:Start leader at 342
DEBUG:disassembler:End leader at 36
DEBUG:disassembler:Start leader at 36
DEBUG:disassembler:End leader at 94
DEBUG:disassembler:Start leader at 94
DEBUG:disassembler:End leader at 17
DEBUG:disassembler:Start leader at 17
DEBUG:disassembler:End leader at 285
DEBUG:disassembler:Start leader at 285
DEBUG:disassembler:End leader at 295
DEBUG:disassembler:Start leader at 295
DEBUG:disassembler:End leader at 257
DEBUG:disassembler:Start leader at 257
DEBUG:disassembler:End leader at 197
DEBUG:disassembler:Start leader at 197
DEBUG:disassembler:End leader at 349
DEBUG:disassembler:End leader at 207
DEBUG:disassembler:Start leader at 207
DEBUG:disassembler:End leader at 361
DEBUG:disassembler:Start leader at 361
DEBUG:disassembler:End leader at 221
DEBUG:disassembler:Start leader at 221
DEBUG:disassembler:End leader at 332
DEBUG:disassembler:Start leader at 332
DEBUG:disassembler:End leader at 324
DEBUG:disassembler:Start leader at 324
DEBUG:disassembler:End leader at 134
DEBUG:disassembler:Start leader at 134
DEBUG:disassembler:End leader at 369
DEBUG:disassembler:Start leader at 369
DEBUG:disassembler:End leader at 6
DEBUG:disassembler:Start leader at 6
DEBUG:disassembler:End leader at 94
DEBUG:disassembler:Start leader at 94
DEBUG:disassembler:Found 81 leaders
DEBUG:disassembler:Constructing basic blocks...
DEBUG:disassembler:Creating basic block 0x24bd800 spanning from 5 to 6, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca558 spanning from 14 to 17, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca5d0 spanning from 25 to 28, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca648 spanning from 33 to 33, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca698 spanning from 36 to 36, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca6e8 spanning from 44 to 44, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca738 spanning from 51 to 51, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca788 spanning from 57 to 60, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca800 spanning from 68 to 71, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca878 spanning from 79 to 82, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca8f0 spanning from 93 to 94, end exclusive
DEBUG:disassembler:Creating basic block 0x24ca940 spanning from 94 to 94, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca990 spanning from 102 to 105, both inclusive
DEBUG:disassembler:Creating basic block 0x24caa30 spanning from 114 to 117, both inclusive
DEBUG:disassembler:Creating basic block 0x24caaf8 spanning from 122 to 125, both inclusive
DEBUG:disassembler:Creating basic block 0x24cabc0 spanning from 131 to 134, both inclusive
DEBUG:disassembler:Creating basic block 0x24cac88 spanning from 141 to 142, both inclusive
DEBUG:disassembler:Creating basic block 0x24cad50 spanning from 152 to 155, both inclusive
DEBUG:disassembler:Creating basic block 0x24cae18 spanning from 162 to 165, both inclusive
DEBUG:disassembler:Creating basic block 0x24caee0 spanning from 174 to 177, both inclusive
DEBUG:disassembler:Creating basic block 0x24cafa8 spanning from 187 to 190, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce0a8 spanning from 196 to 197, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce170 spanning from 204 to 207, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce238 spanning from 218 to 221, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce300 spanning from 228 to 229, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce3c8 spanning from 235 to 235, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce468 spanning from 238 to 238, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce508 spanning from 243 to 246, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce5d0 spanning from 254 to 257, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce698 spanning from 269 to 272, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce760 spanning from 282 to 285, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce828 spanning from 292 to 295, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce8f0 spanning from 300 to 303, both inclusive
DEBUG:disassembler:Creating basic block 0x24ce9b8 spanning from 310 to 313, both inclusive
DEBUG:disassembler:Creating basic block 0x24cea80 spanning from 321 to 324, both inclusive
DEBUG:disassembler:Creating basic block 0x24ceb48 spanning from 332 to 332, both inclusive
DEBUG:disassembler:Creating basic block 0x24cebe8 spanning from 342 to 342, both inclusive
DEBUG:disassembler:Creating basic block 0x24cec88 spanning from 349 to 349, both inclusive
DEBUG:disassembler:Creating basic block 0x24ced28 spanning from 350 to 353, both inclusive
DEBUG:disassembler:Creating basic block 0x24cedf0 spanning from 360 to 361, both inclusive
DEBUG:disassembler:Creating basic block 0x24ceeb8 spanning from 366 to 369, both inclusive
DEBUG:disassembler:41 basic blocks created
DEBUG:disassembler:Constructing edges between basic blocks...
DEBUG:disassembler:Adding explicit edge from block 0x24bd800 to 0x24ca8f0
DEBUG:disassembler:Adding explicit edge from block 0x24ca800 to 0x24ca648
DEBUG:disassembler:Adding explicit edge from block 0x24ce828 to 0x24cec88
DEBUG:disassembler:Adding explicit edge from block 0x24ca878 to 0x24ca5d0
DEBUG:disassembler:Adding explicit edge from block 0x24ce0a8 to 0x24cedf0
DEBUG:disassembler:Adding explicit edge from block 0x24ca940 to 0x24ce828
DEBUG:disassembler:Adding explicit edge from block 0x24ca6e8 to 0x24ca940
DEBUG:disassembler:Adding explicit edge from block 0x24ce170 to 0x24ce238
DEBUG:disassembler:Adding explicit edge from block 0x24ca990 to 0x24cac88
DEBUG:disassembler:Adding explicit edge from block 0x24ce9b8 to 0x24ce508
DEBUG:disassembler:Adding explicit edge from block 0x24caa30 to 0x24ca878
DEBUG:disassembler:Adding explicit edge from block 0x24cea80 to 0x24cabc0
DEBUG:disassembler:Adding explicit edge from block 0x24caaf8 to 0x24cad50
DEBUG:disassembler:Adding explicit edge from block 0x24ce300 to 0x24ca6e8
DEBUG:disassembler:Adding explicit edge from block 0x24ceb48 to 0x24ca940
DEBUG:disassembler:Adding implicit edge from block 0x24ce3c8 to 0x24ce468
DEBUG:disassembler:Adding explicit edge from block 0x24ce3c8 to 0x24ca738
DEBUG:disassembler:Adding explicit edge from block 0x24cabc0 to 0x24ceeb8
DEBUG:disassembler:Adding explicit edge from block 0x24cebe8 to 0x24ca558
DEBUG:disassembler:Adding explicit edge from block 0x24ce468 to 0x24ca990
DEBUG:disassembler:Adding explicit edge from block 0x24cac88 to 0x24ce300
DEBUG:disassembler:Adding explicit edge from block 0x24ce508 to 0x24ca800
DEBUG:disassembler:Adding explicit edge from block 0x24ced28 to 0x24ce8f0
DEBUG:disassembler:Adding explicit edge from block 0x24ce238 to 0x24cea80
DEBUG:disassembler:Adding explicit edge from block 0x24ca558 to 0x24ce5d0
DEBUG:disassembler:Adding explicit edge from block 0x24ce8f0 to 0x24cafa8
DEBUG:disassembler:Adding explicit edge from block 0x24ca5d0 to 0x24caee0
DEBUG:disassembler:Adding explicit edge from block 0x24ce5d0 to 0x24ce170
DEBUG:disassembler:Adding explicit edge from block 0x24cedf0 to 0x24ceb48
DEBUG:disassembler:Adding explicit edge from block 0x24cae18 to 0x24ced28
DEBUG:disassembler:Adding implicit edge from block 0x24ca648 to 0x24ca698
DEBUG:disassembler:Adding explicit edge from block 0x24ca648 to 0x24cebe8
DEBUG:disassembler:Adding explicit edge from block 0x24ca698 to 0x24ce760
DEBUG:disassembler:Adding explicit edge from block 0x24ceeb8 to 0x24bd800
DEBUG:disassembler:Adding explicit edge from block 0x24caee0 to 0x24caaf8
DEBUG:disassembler:Adding explicit edge from block 0x24ca738 to 0x24ce9b8
DEBUG:disassembler:Adding explicit edge from block 0x24ce760 to 0x24ce0a8
DEBUG:disassembler:Adding explicit edge from block 0x24ce698 to 0x24caa30
DEBUG:disassembler:Adding explicit edge from block 0x24ca788 to 0x24cae18
DEBUG:disassembler:Adding explicit edge from block 0x24cafa8 to 0x24ce3c8
DEBUG:disassembler:Adding explicit edge from block 0x24cad50 to 0x24ca788
INFO:deobfuscator:Control flow analysis completed.
INFO:deobfuscator:Starting simplication of basic blocks...
DEBUG:simplifier:Eliminating forwarders...
INFO:simplifier:Adding explicit edge from block 0x24ceb48 to 0x24ce828
INFO:simplifier:Adding explicit edge from block 0x24ca6e8 to 0x24ce828
INFO:simplifier:Adding implicit edge from block 0x24ca8f0 to 0x24ce828
DEBUG:simplifier:Forwarder basic block 0x24ca940 eliminated
INFO:simplifier:Adding explicit edge from block 0x24ce300 to 0x24ce828
DEBUG:simplifier:Forwarder basic block 0x24ca6e8 eliminated
INFO:simplifier:Adding explicit edge from block 0x24cedf0 to 0x24ce828
DEBUG:simplifier:Forwarder basic block 0x24ceb48 eliminated
INFO:simplifier:Adding explicit edge from block 0x24ca648 to 0x24ca558
DEBUG:simplifier:Forwarder basic block 0x24cebe8 eliminated
INFO:simplifier:Adding implicit edge from block 0x24ce3c8 to 0x24ca990
DEBUG:simplifier:Forwarder basic block 0x24ce468 eliminated
INFO:simplifier:Adding implicit edge from block 0x24ca648 to 0x24ce760
DEBUG:simplifier:Forwarder basic block 0x24ca698 eliminated
INFO:simplifier:Adding explicit edge from block 0x24ce3c8 to 0x24ce9b8
DEBUG:simplifier:Forwarder basic block 0x24ca738 eliminated
INFO:simplifier:7 basic blocks eliminated
DEBUG:simplifier:Merging basic blocks...
INFO:simplifier:Adding explicit edge from block 0x24ceeb8 to 0x24ca8f0
DEBUG:simplifier:Basic block 0x24bd800 merged with block 0x24ceeb8
INFO:simplifier:Adding explicit edge from block 0x24ce508 to 0x24ca648
DEBUG:simplifier:Basic block 0x24ca800 merged with block 0x24ce508
INFO:simplifier:Adding explicit edge from block 0x24caa30 to 0x24ca5d0
DEBUG:simplifier:Basic block 0x24ca878 merged with block 0x24caa30
INFO:simplifier:Adding explicit edge from block 0x24ce760 to 0x24cedf0
DEBUG:simplifier:Basic block 0x24ce0a8 merged with block 0x24ce760
INFO:simplifier:Adding implicit edge from block 0x24ceeb8 to 0x24ce828
DEBUG:simplifier:Basic block 0x24ca8f0 merged with block 0x24ceeb8
INFO:simplifier:Adding explicit edge from block 0x24ce5d0 to 0x24ce238
DEBUG:simplifier:Basic block 0x24ce170 merged with block 0x24ce5d0
INFO:simplifier:Adding explicit edge from block 0x24ce698 to 0x24ca5d0
DEBUG:simplifier:Basic block 0x24caa30 merged with block 0x24ce698
INFO:simplifier:Adding explicit edge from block 0x24ce238 to 0x24cabc0
DEBUG:simplifier:Basic block 0x24cea80 merged with block 0x24ce238
INFO:simplifier:Adding explicit edge from block 0x24caee0 to 0x24cad50
DEBUG:simplifier:Basic block 0x24caaf8 merged with block 0x24caee0
INFO:simplifier:Adding explicit edge from block 0x24cac88 to 0x24ce828
DEBUG:simplifier:Basic block 0x24ce300 merged with block 0x24cac88
DEBUG:simplifier:Basic block 0x24cec88 merged with block 0x24ce828
INFO:simplifier:Adding implicit edge from block 0x24cafa8 to 0x24ca990
INFO:simplifier:Adding explicit edge from block 0x24cafa8 to 0x24ce9b8
DEBUG:simplifier:Basic block 0x24ce3c8 merged with block 0x24cafa8
INFO:simplifier:Adding explicit edge from block 0x24ce238 to 0x24ceeb8
DEBUG:simplifier:Basic block 0x24cabc0 merged with block 0x24ce238
INFO:simplifier:Adding explicit edge from block 0x24ca990 to 0x24ce828
DEBUG:simplifier:Basic block 0x24cac88 merged with block 0x24ca990
INFO:simplifier:Adding explicit edge from block 0x24ce9b8 to 0x24ca648
DEBUG:simplifier:Basic block 0x24ce508 merged with block 0x24ce9b8
INFO:simplifier:Adding explicit edge from block 0x24cae18 to 0x24ce8f0
DEBUG:simplifier:Basic block 0x24ced28 merged with block 0x24cae18
INFO:simplifier:Adding explicit edge from block 0x24ce5d0 to 0x24ceeb8
DEBUG:simplifier:Basic block 0x24ce238 merged with block 0x24ce5d0
INFO:simplifier:Adding explicit edge from block 0x24cae18 to 0x24cafa8
DEBUG:simplifier:Basic block 0x24ce8f0 merged with block 0x24cae18
INFO:simplifier:Adding explicit edge from block 0x24ce698 to 0x24caee0
DEBUG:simplifier:Basic block 0x24ca5d0 merged with block 0x24ce698
INFO:simplifier:Adding explicit edge from block 0x24ca558 to 0x24ceeb8
DEBUG:simplifier:Basic block 0x24ce5d0 merged with block 0x24ca558
INFO:simplifier:Adding explicit edge from block 0x24ce760 to 0x24ce828
DEBUG:simplifier:Basic block 0x24cedf0 merged with block 0x24ce760
INFO:simplifier:Adding explicit edge from block 0x24ca788 to 0x24cafa8
DEBUG:simplifier:Basic block 0x24cae18 merged with block 0x24ca788
INFO:simplifier:Adding explicit edge from block 0x24ce9b8 to 0x24ca558
INFO:simplifier:Adding implicit edge from block 0x24ce9b8 to 0x24ce760
DEBUG:simplifier:Basic block 0x24ca648 merged with block 0x24ce9b8
INFO:simplifier:Adding implicit edge from block 0x24ca558 to 0x24ce828
DEBUG:simplifier:Basic block 0x24ceeb8 merged with block 0x24ca558
INFO:simplifier:Adding explicit edge from block 0x24ce698 to 0x24cad50
DEBUG:simplifier:Basic block 0x24caee0 merged with block 0x24ce698
INFO:simplifier:Adding explicit edge from block 0x24cad50 to 0x24cafa8
DEBUG:simplifier:Basic block 0x24ca788 merged with block 0x24cad50
INFO:simplifier:Adding implicit edge from block 0x24cad50 to 0x24ca990
INFO:simplifier:Adding explicit edge from block 0x24cad50 to 0x24ce9b8
DEBUG:simplifier:Basic block 0x24cafa8 merged with block 0x24cad50
INFO:simplifier:Adding implicit edge from block 0x24ce698 to 0x24ca990
INFO:simplifier:Adding explicit edge from block 0x24ce698 to 0x24ce9b8
DEBUG:simplifier:Basic block 0x24cad50 merged with block 0x24ce698
INFO:simplifier:28 basic blocks merged.
INFO:deobfuscator:Simplication of basic blocks completed.
INFO:deobfuscator:Beginning verification of simplified basic block graph...
INFO:deobfuscator:Verification succeeded.
INFO:deobfuscator:Assembling basic blocks...
DEBUG:assembler:Performing a DFS on the graph to generate the layout of the blocks.
DEBUG:assembler:Morphing some JUMP_ABSOLUTE instructions to make file decompilable.
DEBUG:assembler:Verifying generated layout...
DEBUG:assembler:Successfully verified layout.
DEBUG:assembler:Calculating addresses of basic blocks.
DEBUG:assembler:Calculating instruction operands.
DEBUG:assembler:Generating code...
INFO:deobfuscator:Successfully assembled. 
INFO:__main__:Successfully deobfuscated code object \x0f\x1d\n\x00\x07\x0f\x0f
INFO:__main__:Collecting constants for code object \x0f\x1d\n\x00\x07\x0f\x0f
INFO:__main__:Code object \x0f\x1d\n\x00\x07\x0f\x0f contains embedded code object recur_factorial
INFO:__main__:Processing code object recur_factorial
DEBUG:deobfuscator:Code entrypoint matched PjOrion signature v2
INFO:deobfuscator:Original code entrypoint at 161
INFO:deobfuscator:Starting control flow analysis...
DEBUG:disassembler:Finding leaders...
DEBUG:disassembler:Start leader at 161
DEBUG:disassembler:End leader at 164
DEBUG:disassembler:Start leader at 164
DEBUG:disassembler:End leader at 46
DEBUG:disassembler:Start leader at 46
DEBUG:disassembler:End leader at 141
DEBUG:disassembler:Start leader at 141
DEBUG:disassembler:End leader at 19
DEBUG:disassembler:Start leader at 19
DEBUG:disassembler:Start leader at 19
DEBUG:disassembler:End leader at 127
DEBUG:disassembler:Start leader at 127
DEBUG:disassembler:End leader at 22
DEBUG:disassembler:Start leader at 22
DEBUG:disassembler:End leader at 66
DEBUG:disassembler:Start leader at 66
DEBUG:disassembler:End leader at 105
DEBUG:disassembler:Start leader at 105
DEBUG:disassembler:End leader at 87
DEBUG:disassembler:Start leader at 87
DEBUG:disassembler:End leader at 126
DEBUG:disassembler:End leader at 154
DEBUG:disassembler:Start leader at 154
DEBUG:disassembler:End leader at 113
DEBUG:disassembler:Start leader at 113
DEBUG:disassembler:End leader at 93
DEBUG:disassembler:Start leader at 93
DEBUG:disassembler:End leader at 34
DEBUG:disassembler:Start leader at 34
DEBUG:disassembler:End leader at 11
DEBUG:disassembler:Start leader at 11
DEBUG:disassembler:End leader at 53
DEBUG:disassembler:Found 32 leaders
DEBUG:disassembler:Constructing basic blocks...
DEBUG:disassembler:Creating basic block 0x24ce3a0 spanning from 10 to 11, both inclusive
DEBUG:disassembler:Creating basic block 0x24bddc8 spanning from 19 to 19, both inclusive
DEBUG:disassembler:Creating basic block 0x24bda30 spanning from 22 to 22, both inclusive
DEBUG:disassembler:Creating basic block 0x24bdad0 spanning from 31 to 34, both inclusive
DEBUG:disassembler:Creating basic block 0x24bd9b8 spanning from 43 to 46, both inclusive
DEBUG:disassembler:Creating basic block 0x24bde68 spanning from 53 to 53, both inclusive
DEBUG:disassembler:Creating basic block 0x24bdc10 spanning from 63 to 66, both inclusive
DEBUG:disassembler:Creating basic block 0x24bdcb0 spanning from 84 to 87, both inclusive
DEBUG:disassembler:Creating basic block 0x24bd7d8 spanning from 92 to 93, both inclusive
DEBUG:disassembler:Creating basic block 0x24bdb98 spanning from 102 to 105, both inclusive
DEBUG:disassembler:Creating basic block 0x24bdf58 spanning from 110 to 113, both inclusive
DEBUG:disassembler:Creating basic block 0x24bdb48 spanning from 126 to 126, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca8a0 spanning from 127 to 127, both inclusive
DEBUG:disassembler:Creating basic block 0x24caf58 spanning from 138 to 141, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca7b0 spanning from 151 to 154, both inclusive
DEBUG:disassembler:Creating basic block 0x24ca670 spanning from 161 to 164, both inclusive
DEBUG:disassembler:16 basic blocks created
DEBUG:disassembler:Constructing edges between basic blocks...
DEBUG:disassembler:Adding explicit edge from block 0x24bdc10 to 0x24bdcb0
DEBUG:disassembler:Adding explicit edge from block 0x24ca7b0 to 0x24bdf58
DEBUG:disassembler:Adding explicit edge from block 0x24bda30 to 0x24bdb98
DEBUG:disassembler:Adding explicit edge from block 0x24ca670 to 0x24bd9b8
DEBUG:disassembler:Adding explicit edge from block 0x24ca8a0 to 0x24bdc10
DEBUG:disassembler:Adding explicit edge from block 0x24bdcb0 to 0x24ca7b0
DEBUG:disassembler:Adding explicit edge from block 0x24bdad0 to 0x24ce3a0
DEBUG:disassembler:Adding explicit edge from block 0x24bdf58 to 0x24bd7d8
DEBUG:disassembler:Adding explicit edge from block 0x24bdb98 to 0x24bdb48
DEBUG:disassembler:Adding explicit edge from block 0x24ce3a0 to 0x24bde68
DEBUG:disassembler:Adding explicit edge from block 0x24bd9b8 to 0x24caf58
DEBUG:disassembler:Adding implicit edge from block 0x24bddc8 to 0x24bda30
DEBUG:disassembler:Adding explicit edge from block 0x24bddc8 to 0x24ca8a0
DEBUG:disassembler:Adding explicit edge from block 0x24bd7d8 to 0x24bdad0
DEBUG:disassembler:Adding explicit edge from block 0x24caf58 to 0x24bddc8
INFO:deobfuscator:Control flow analysis completed.
INFO:deobfuscator:Starting simplication of basic blocks...
DEBUG:simplifier:Eliminating forwarders...
INFO:simplifier:Adding implicit edge from block 0x24bddc8 to 0x24bdb98
DEBUG:simplifier:Forwarder basic block 0x24bda30 eliminated
INFO:simplifier:Adding explicit edge from block 0x24bddc8 to 0x24bdc10
DEBUG:simplifier:Forwarder basic block 0x24ca8a0 eliminated
INFO:simplifier:2 basic blocks eliminated
DEBUG:simplifier:Merging basic blocks...
INFO:simplifier:Adding explicit edge from block 0x24bdcb0 to 0x24bdf58
DEBUG:simplifier:Basic block 0x24ca7b0 merged with block 0x24bdcb0
DEBUG:simplifier:Basic block 0x24bde68 merged with block 0x24ce3a0
INFO:simplifier:Adding explicit edge from block 0x24bdc10 to 0x24bdf58
DEBUG:simplifier:Basic block 0x24bdcb0 merged with block 0x24bdc10
INFO:simplifier:Adding explicit edge from block 0x24bd7d8 to 0x24ce3a0
DEBUG:simplifier:Basic block 0x24bdad0 merged with block 0x24bd7d8
DEBUG:simplifier:Basic block 0x24bdb48 merged with block 0x24bdb98
INFO:simplifier:Adding explicit edge from block 0x24bdc10 to 0x24bd7d8
DEBUG:simplifier:Basic block 0x24bdf58 merged with block 0x24bdc10
DEBUG:simplifier:Basic block 0x24ce3a0 merged with block 0x24bd7d8
INFO:simplifier:Adding explicit edge from block 0x24ca670 to 0x24caf58
DEBUG:simplifier:Basic block 0x24bd9b8 merged with block 0x24ca670
INFO:simplifier:Adding implicit edge from block 0x24caf58 to 0x24bdb98
INFO:simplifier:Adding explicit edge from block 0x24caf58 to 0x24bdc10
DEBUG:simplifier:Basic block 0x24bddc8 merged with block 0x24caf58
DEBUG:simplifier:Basic block 0x24bd7d8 merged with block 0x24bdc10
INFO:simplifier:Adding implicit edge from block 0x24ca670 to 0x24bdb98
INFO:simplifier:Adding explicit edge from block 0x24ca670 to 0x24bdc10
DEBUG:simplifier:Basic block 0x24caf58 merged with block 0x24ca670
INFO:simplifier:11 basic blocks merged.
INFO:deobfuscator:Simplication of basic blocks completed.
INFO:deobfuscator:Beginning verification of simplified basic block graph...
INFO:deobfuscator:Verification succeeded.
INFO:deobfuscator:Assembling basic blocks...
DEBUG:assembler:Performing a DFS on the graph to generate the layout of the blocks.
DEBUG:assembler:Morphing some JUMP_ABSOLUTE instructions to make file decompilable.
DEBUG:assembler:Verifying generated layout...
DEBUG:assembler:Successfully verified layout.
DEBUG:assembler:Calculating addresses of basic blocks.
DEBUG:assembler:Calculating instruction operands.
DEBUG:assembler:Generating code...
INFO:deobfuscator:Successfully assembled. 
INFO:__main__:Successfully deobfuscated code object recur_factorial
INFO:__main__:Collecting constants for code object recur_factorial
INFO:__main__:Generating new code object for recur_factorial
INFO:__main__:Generating new code object for \x0f\x1d\n\x00\x07\x0f\x0f
INFO:__main__:Writing deobfuscated code object to disk
INFO:__main__:Success

Running this, we get the deobfuscated code in the file wrapper_deobf.pyc. We can now run a Python decompiler on it to recover the original source, as shown in Figure 4.

Figure 4: Decompiling the deobfuscated code
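The log above starts with "Input pyc file header matched". As a rough illustration of what such a check involves, here is a minimal sketch that validates a Python 2.7 pyc header, which consists of a 4-byte magic value followed by a 4-byte modification time. The function name parse_py27_header is hypothetical, and this is not the code bytecode_simplifier uses.

```python
import struct

# Python 2.7 .pyc layout: 4-byte magic, 4-byte mtime, then the marshalled
# code object. The magic is the number 62211 (little-endian) plus CR LF.
PY27_MAGIC = b"\x03\xf3\r\n"


def parse_py27_header(data):
    """Split raw .pyc bytes into (mtime, marshalled_payload).

    Raises ValueError if the data does not start with the Python 2.7 magic.
    """
    if len(data) < 8:
        raise ValueError("file too short to hold a pyc header")
    if data[:4] != PY27_MAGIC:
        raise ValueError("not a Python 2.7 pyc file")
    (mtime,) = struct.unpack("<I", data[4:8])
    return mtime, data[8:]
```

The returned payload is only meaningful to marshal.loads under Python 2.7 itself, because the marshal format is specific to the interpreter version.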

Introducing bytecode simplifier

2017-07-10T10:55:00.000+00:00

Bytecode simplifier is a tool to deobfuscate PjOrion protected python scripts. It is a complete rewrite of my earlier tool PjOrion Deobfuscator. I have reimplemented the deobfuscation functionality from scratch and have used networkx specifically for this purpose. Using networkx made reasoning about the code much simpler.
The PjOrion version used is 1.3.2 (Filename: PjOrion_Uncompyle6_01.10.2016.zip)
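The tool's log output shows two simplification passes over the control flow graph: eliminating forwarders and merging basic blocks. Below is a minimal sketch of both passes on a toy graph, using plain dictionaries instead of networkx so that it stays self-contained. It is not the tool's implementation. The assumption that a forwarder is a block holding nothing but an unconditional jump, the symbolic instruction strings and the function names are all made up for illustration.

```python
def eliminate_forwarders(blocks, succ, entry):
    """Remove blocks that hold only a jump, rewiring their predecessors.

    blocks: dict mapping block name -> list of symbolic instruction strings
    succ:   dict mapping block name -> list of successor block names
    entry:  name of the entry block, which is never removed
    """
    for name in list(blocks):
        if name == entry or blocks[name] != ["JUMP"]:
            continue
        if len(succ[name]) != 1 or succ[name][0] == name:
            continue
        target = succ[name][0]
        # Every edge that pointed at the forwarder now points at its target.
        for other in succ:
            succ[other] = [target if s == name else s for s in succ[other]]
        del blocks[name], succ[name]


def merge_blocks(blocks, succ, entry):
    """Fold block b into block a when a -> b is a's only outgoing edge
    and b's only incoming edge."""
    changed = True
    while changed:
        changed = False
        for a in list(blocks):
            if len(succ[a]) != 1:
                continue
            b = succ[a][0]
            preds = [p for p in succ if b in succ[p]]
            if b == a or b == entry or preds != [a]:
                continue
            # Drop the jump that ends a, then append b's instructions.
            if blocks[a] and blocks[a][-1] == "JUMP":
                blocks[a].pop()
            blocks[a].extend(blocks[b])
            succ[a] = succ[b]
            del blocks[b], succ[b]
            changed = True
            break


# Toy graph: A jumps to forwarder F, which jumps to B, which jumps to C.
blocks = {
    "A": ["LOAD x", "JUMP"],
    "F": ["JUMP"],
    "B": ["LOAD y", "ADD", "JUMP"],
    "C": ["RETURN"],
}
succ = {"A": ["F"], "F": ["B"], "B": ["C"], "C": []}

eliminate_forwarders(blocks, succ, "A")
merge_blocks(blocks, succ, "A")
print(blocks)
```

After both passes the four scattered blocks collapse into a single straight-line block, which is the effect the obfuscator's jump spaghetti is meant to hide.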

The code is at https://github.com/extremecoders-re/bytecode_simplifier

A short tutorial can be found here: https://0xec.blogspot.com/2017/07/deobfuscating-pjorion-using-bytecode.html

Remote debugging in IDA Pro by http tunnelling

2017-03-31T20:34:00.000+00:00

IDA Pro provides a remote debugging capability that allows us to debug a target binary residing on a different machine over the network. This feature is very useful in situations such as debugging an executable for an ARM device, where installing IDA on the device is not possible. IDA can remotely debug a binary in two ways: through a gdbserver or through the provided debugger servers (located in the dbgsrv directory).

These debugging servers transport the debugger commands, messages and relevant data over a TCP/IP network through BSD sockets. So far so good, but what if the debugging server resided on a virtual host hosting multiple domain names? We cannot use sockets anymore.

A socket connection between two endpoints is characterized by a pair of socket addresses, one for each node. The socket address, in turn, comprises the IP address and a port number. For an incoming socket connection, a server hosting multiple domains on the same IP address cannot decide which domain to forward the request to based on the socket address alone. Thus remote debugging using sockets is not possible. Strictly speaking, this is not entirely true, since techniques such as port forwarding (also known as virtual server) can reroute incoming traffic to various private IPs based on a predefined table. Port forwarding is not available everywhere, however, so we can ignore it for now. Instead, it would be much better if sockets supported connections based on domain names, as described in the paper Name-based Virtual Hosting in TCP.

The Application Layer Protocol HTTP solves the virtual host problem by including the Host header in HTTP messages. It seems that if we can wrap the transport layer socket traffic in plain old HTTP messages our problem would be solved. The rest of the blog post describes this process in detail.
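To make the role of the Host header concrete, here is a minimal sketch of name-based routing: it pulls the Host header out of a raw HTTP/1.1 request and looks up a backend for it. The function names, the example host names and the backend labels are hypothetical, and the port stripping does not handle IPv6 literals.

```python
def host_from_request(raw_request):
    """Extract the Host header value from raw HTTP/1.1 request bytes."""
    head = raw_request.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            return value.strip().decode("ascii")
    return None


def route(raw_request, backends):
    """Pick a backend by host name. Returns None when nothing matches."""
    host = host_from_request(raw_request)
    if host is None:
        return None
    hostname = host.partition(":")[0].lower()  # drop an optional :port suffix
    return backends.get(hostname)


# Two made-up domains that would share one IP address on a virtual host.
backends = {
    "alpha.example.test": "container-1",
    "beta.example.test": "container-2",
}
request = b"GET / HTTP/1.1\r\nHost: beta.example.test:8081\r\n\r\n"
print(route(request, backends))
```

A bare TCP connection carries no equivalent of this header, which is why the server in the socket case has nothing to route on.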

The problem

A few days ago, I was trying a CTF challenge involving an ARM binary. The binary was loaded in IDA within a Windows XP VM. Debugging the binary would require a Linux box at the minimum with qemu-arm installed. Rather than powering up my Ubuntu VM, I decided to debug it remotely on cloud9. Cloud9 is a sort of VPS that provides Docker Ubuntu containers, called workspaces, where we can run whatever we want. The ARM binary can be debugged using qemu as follows:

$ qemu-arm-static -g 8081 ./challenge

We are using the user mode emulation capability of qemu to run non-native ELF binaries. The port on which qemu listens for incoming gdb connections is specified by the -g flag and is 8081 in this case. We have specified port 8081 as it is one of the few ports on which cloud9 allows incoming connections. Now, if we try to attach to the process in IDA with the remote GDB debugger configured as shown in Figure 1, IDA fails.

Figure 1: Remote debugger configuration (ignore the paths)

This is expected, as the container on which the debuggee is running is on a virtual host where multiple containers share the same IP address under different domain names. A socket connects by IP address and not by domain name, thus it is not possible to connect to our container using sockets. We can get a clearer picture using netcat.

Let us create a netcat server listening on port 8081 as shown in Figure 2.

Figure 2: Netcat server listening on port 8081

We can try to connect to this server from our Windows XP VM as shown in Figure 3.

Figure 3: Trying to connect to our netcat server

Unsurprisingly, this fails too for the same reason.

The workaround

We have seen that a socket connection is made using IP addresses. However, if we connect using HTTP we can use domain names. This is possible because of the Host header, as mentioned earlier. Let's test this concept.

We create a netcat server listening on port 8081 which replies with an HTTP "HELLO WORLD" message. This is done as shown in Figure 4.

Figure 4: Netcat server replying with HTTP message
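The exact reply used is only visible in the figure, so the following is just a sketch of what a minimal, well-formed HTTP reply of this kind could look like. The function name hello_response and the choice of headers are assumptions.

```python
def hello_response(body=b"HELLO WORLD"):
    """Build a minimal HTTP/1.1 response carrying the given body."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        "Content-Length: {}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).format(len(body))
    return head.encode("ascii") + body


print(hello_response().decode("ascii"))
```

The blank line after the headers and a correct Content-Length are what let a client such as curl recognise where the body starts and ends.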

For the client part in the Windows XP box, we use curl instead of netcat as shown in Figure 5. We choose curl over netcat as we are performing an HTTP transaction and not a socket connection.

Figure 5: Using curl to connect to the netcat server

The connection succeeds and we get the HELLO WORLD response. The netcat server running on cloud9 also displays the success status as in Figure 6.

Figure 6: Netcat server replied to the request

From the above experiments, it is clear that we must use HTTP in order to establish a connection to the remote container running on a virtual host. Similarly, if we intend to remotely debug an application using IDA, we must also use HTTP instead of plain sockets.

Using HTTP Tunnelling

We have seen that only connections over HTTP are possible. If we want to use sockets, the socket traffic must be wrapped in HTTP. This technique of encapsulating one protocol over HTTP is called HTTP tunnelling. Wikipedia explains this best. Primarily, HTTP tunnels are used to bypass restrictive network environments like firewalls where only connections to well-known ports are permitted. We can reuse the tunnelling technique for debugging in IDA as well.

An HTTP tunnel application consists of two parts, a server and a client, both communicating over HTTP. Before using an HTTP tunnel, the situation was like Figure 7.

Figure 7: Socket connection

After using an HTTP tunnel, the situation would look like Figure 8.

Figure 8: HTTP tunnelling

The debugger and the tunnel client reside on the same machine, though they are depicted as separate computers. Similarly, the tunnel server and the debuggee reside on the same cloud9 container. The tunnel client-server pair encapsulates the socket traffic in an HTTP connection. Using this mechanism we can remotely debug using IDA.
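The idea of encapsulation can be illustrated with a small sketch that wraps raw socket bytes in an HTTP request on one side and recovers them on the other. This is a conceptual illustration only and not how any particular tunnel is implemented. The function names, the /tunnel path and the host name are hypothetical.

```python
def wrap_in_http(payload, host, path="/tunnel"):
    """Wrap raw bytes in an HTTP POST so they can cross a name-based host."""
    head = (
        "POST {} HTTP/1.1\r\n"
        "Host: {}\r\n"
        "Content-Type: application/octet-stream\r\n"
        "Content-Length: {}\r\n"
        "\r\n"
    ).format(path, host, len(payload))
    return head.encode("ascii") + payload


def unwrap_from_http(message):
    """Recover the raw bytes from a message built by wrap_in_http."""
    head, _, body = message.partition(b"\r\n\r\n")
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return body[:length]


# "$g#67" is a gdb remote protocol packet asking for the register values.
packet = b"$g#67"
message = wrap_in_http(packet, "workspace.example.test:8081")
print(unwrap_from_http(message))
```

The tunnel client would do the wrapping and the tunnel server the unwrapping, with the Host header supplying the domain name that a bare socket connection lacks.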

Searching for an HTTP tunnelling application, I came across Chisel. It is open-source and written in Go. Compiling it from source is simple:

$ git clone https://github.com/jpillora/chisel.git
$ cd chisel
$ go build -o chisel # for compiling native linux binaries
$ GOOS=windows GOARCH=386 go build -o chisel.exe # cross compiling for windows x86

Remote configuration

We run the chisel server on cloud9 listening on port 8081 on all network interfaces:

$ ./chisel server --port=8081
2017/03/31 19:57:03 server: Fingerprint 07:4e:00:e4:82:9b:76:3a:3a:70:55:30:2e:1d:c2:82
2017/03/31 19:57:03 server: Listening on 8081...

qemu runs with the gdbserver listening on port 23946 for incoming gdb connections from IDA.

$ qemu-arm-static -g 23946 ./challenge

The connection between the chisel server and qemu is through sockets. The debugger traffic wrapped in HTTP will be passed to the chisel server at port 8081; chisel will extract the payload of the message and pass it to qemu at port 23946 over a socket.

Local configuration

In our Windows XP box we run chisel in client mode with the following command line:

C:\>chisel client qemu-extremecoders-re.c9users.io:8081 1234:23946
2017/04/01 01:28:36 client: Connecting to ws://qemu-extremecoders-re.c9users.io:8081
2017/04/01 01:28:38 client: Fingerprint 07:4e:00:e4:82:9b:76:3a:3a:70:55:30:2e:1d:c2:82
2017/04/01 01:28:39 client: Connected (Latency 203.125ms)

The first argument (qemu-extremecoders-re.c9users.io:8081) is the remote URL and port on which the chisel server listens.

The second argument (1234:23946) is a pair of ports separated by a colon which specifies the port mapping from local to remote. It means incoming traffic to the chisel client at local port 1234 will be forwarded to the chisel server which will, in turn, relay the traffic over a socket to port 23946 where qemu is listening.

Finally, we need to configure IDA to use the local chisel client as the remote host. This is done as per Figure 9.

Figure 9: IDA remote debugger configuration

The hostname is specified as 127.0.0.1 and the port as 1234. This is the address where the chisel client is accepting socket connections.

At this point, if we try to attach to the remote process, it succeeds with the following message as in Figure 10.

Figure 10: Attach successful

Mission accomplished!

Final words

HTTP tunnelling is a very effective technique in scenarios where only HTTP connections are allowed or possible. In this case of remote debugging, we used HTTP tunnelling since normal socket connections cannot be established. With this we come to the end of this post. Hope you find this useful. Ciao!

67,000 cuts with python-pefile

2017-03-26T10:30:00.000+00:00

EasyCTF featured an interesting reverse engineering challenge. The problem statement is shown in Figure 1.

Figure 1: Problem statement

A file 67k.zip was provided containing 67,085 PE files numbered from 00000.exe to 1060c.exe as shown in Figure 2.

Figure 2: 67k files to reverse!

The task was to reverse engineer each of them and combine their solutions to get the flag. All of the files were exactly 2048 bytes in size as shown in Figure 3.

Figure 3: 2048 all the way

Let's analyze one of the files, say the first one 00000.exe in IDA. The graph view is simple as in Figure 4.

Figure 4: Graph view of 00000.exe

The program accepts one integer input through scanf. This is compared with another number generated by a simple operation like sub on two hard-coded integers stored in registers eax and ecx. If they match, we go to the green basic block on the left. It does another calculation (sar - Shift Arithmetic Right at 402042) and finally prints this calculated value along with the success message at 40204F. This general pattern is followed by all of the 67,085 files with minor changes as enumerated below:

The imagebase and the entrypoint of the PE vary with each file.
The operation on the two hardcoded integers can be any of addition, subtraction or xor.
The address of the function (op_sub in the example) performing the operation varies.
The address of the hard coded integer (dword_403000 in the example) varies.
The amount of shift stored in byte_403007 also varies.
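
Before automating anything, it helps to see the arithmetic of a single binary in isolation. The sketch below emulates the 32-bit add/sub/xor and the sar in Python 3. The constants passed to solve are hypothetical and not taken from any of the real files, and the helper names are mine.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def sar32(value, count):
    """Arithmetic right shift on 32 bits; x86 uses only the low 5 bits of the count."""
    return to_signed32(value) >> (count & 31)


# The three operations seen in the binaries, wrapped to 32 bits
OPERATIONS = {
    'add': lambda a, b: (a + b) & MASK32,
    'sub': lambda a, b: (a - b) & MASK32,
    'xor': lambda a, b: (a ^ b) & MASK32,
}


def solve(eax, ecx, operation, shift):
    """Return (expected scanf input, output character) for one binary."""
    result = OPERATIONS[operation](eax, ecx)
    return to_signed32(result), chr(sar32(result, shift) & 0xFF)


# Hypothetical constants, chosen only to exercise the code
print(solve(0x12345678, 0x00005678, 'sub', 16))
```

The first element of the tuple is what scanf("%d") would have to read for the comparison to succeed; the second is the character the binary would print.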

Obviously, reversing 67k files by hand is not possible and requires automation. For this task, I chose the pefile module by Ero Carrera. First, we need to get the offsets of the individual instructions from the entry point. We can do this from OllyDbg as in the following listing. The offsets are in the leftmost column.

<Modul>/$  68 5E304000       push 0040305E                            ; /s = "Launch codes?"
$+5   >|.  FF15 44104000     call dword ptr [<&msvcrt.puts>]          ; \puts
$+B   >|.  58                pop eax
$+C   >|.  68 6C304000       push 0040306C
$+11  >|.  68 04304000       push 00403004                            ; /format = "%d"
$+16  >|.  FF15 48104000     call dword ptr [<&msvcrt.scanf>]         ; \scanf
$+1C  >|.  83C4 08           add esp,8
$+1F  >|.  A1 00304000       mov eax,dword ptr [403000]
$+24  >|.  B9 EDA7A8A1       mov ecx,A1A8A7ED
$+29  >|.  E8 CFFFFFFF       call <op_sub>
$+2E  >|.  3B05 6C304000     cmp eax,dword ptr [40306C]
$+34  >|.  75 1E             jnz short 0040205A
$+36  >|.  8A0D 07304000     mov cl,byte ptr [403007]
$+3C  >|.  D3F8              sar eax,cl
$+3E  >|.  25 FF000000       and eax,0FF
$+43  >|.  50                push eax                                 ; /<%c>
$+44  >|.  68 34304000       push 00403034                            ; |format = "Wow you got it. Here is the result: (%c)"
$+49  >|.  FF15 4C104000     call dword ptr [<&msvcrt.printf>]        ; \printf
$+4F  >|.  83C4 08           add esp,8
$+52  >|.  EB 0C             jmp short 00402066
$+54  >|>  68 08304000       push 00403008                            ; /s = "I think my dog figured this out before you."
$+59  >|.  FF15 44104000     call dword ptr [<&msvcrt.puts>]          ; \puts
$+5F  >|.  58                pop eax
$+60  >\>  C3                ret

The complete script is provided below.

import zipfile
import struct
import pefile
import cStringIO


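def rshift(val, n):
    """
    Implements arithmetic right shift (sar) on 32 bits
    """
    val %= 0x100000000
    # Reinterpret as a signed 32 bit value so that the sign bit is replicated
    if val & 0x80000000:
        val -= 0x100000000
    return val >> n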

def process(buf):
    # Load the Pe file
    pe = pefile.PE(data=buf, fast_load=True)

    # RVA of Entry Point
    ep = pe.OPTIONAL_HEADER.AddressOfEntryPoint

    imagebase = pe.OPTIONAL_HEADER.ImageBase

    # $+1F  >|.  A1 00304000       mov eax,dword ptr [403000]
    # $+24  >|.  B9 EDA7A8A1       mov ecx,A1A8A7ED
    eax = pe.get_dword_at_rva(pe.get_dword_at_rva(ep + 0x1f + 1) - imagebase)
    ecx = pe.get_dword_at_rva(ep + 0x24 + 1)

    # $+29  >|.  E8 CFFFFFFF       call <op_sub>
    fn_offs = struct.unpack('<i', pe.get_data(ep + 0x29 + 1, length = 4))[0]

    # offset of the function from the entry point = offset of the call
    # instruction + its length (5 bytes) + the signed rel32 displacement
    fn_rva = 0x29 + 5 + fn_offs

    # Get the first byte of the function (op_sub)
    func_byte = ord(pe.get_data(rva = ep+fn_rva, length=1))

    # Perform the operation based on the function byte

    # op_xor
    # 31C8            xor eax,ecx
    # C3              ret
    if func_byte == 0x31:
        eax ^= ecx

    # op_add
    # 01C8            add eax,ecx
    # C3              ret
    elif func_byte == 0x1:
        eax += ecx

    # op_sub
    # 29C8            sub eax,ecx
    # C3              ret
    elif func_byte == 0x29:
        eax -= ecx

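    else:
        # String exceptions are not allowed, raise a proper exception instead
        raise ValueError('Unknown operation byte: 0x%02x' % func_byte)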

    # $+36  >|.  8A0D 07304000     mov cl,byte ptr [403007]
    # $+3C  >|.  D3F8              sar eax,cl
    # $+3E  >|.  25 FF000000       and eax,0FF
    cl = ord(pe.get_data(pe.get_dword_at_rva(ep+0x36+2)-imagebase, 1))

    return chr(rshift(eax, cl) & 0xFF)

if __name__ == '__main__':
    output = cStringIO.StringIO()
    with zipfile.ZipFile('67k.zip') as f:
        for idx in xrange(67085):
            fname = format(idx, 'x').zfill(5) + '.exe'
            buf = f.read(fname)
            output.write(process(buf))
            # Fast divisibility check by 1024, 2^10 (last 10 bits must be zero)
            if idx & 0x3FF == 0:
                print 'Completed', idx

    open('output.txt', 'w').write(output.getvalue())
    print 'Done!!'

Instead of unpacking 67,085 files to the hard drive and fragmenting it in the process, I have used the zipfile module to access the files within the archive. However, zipfile throws an error on opening the archive and must be modified slightly as described in this Stack Overflow answer.
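
As a small illustration of reading members straight out of an archive, the Python 3 sketch below builds a tiny stand-in zip in memory and iterates over it by name. The archive contents are placeholders rather than real PE files, and the helper names are my own. It does not reproduce or work around the zipfile error mentioned above.

```python
import io
import zipfile


def member_name(index):
    """File name used inside the archive: five lowercase hex digits plus .exe."""
    return format(index, '05x') + '.exe'


def iter_members(archive_bytes, count):
    """Yield (name, data) for the first count members without extracting to disk."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for index in range(count):
            name = member_name(index)
            yield name, archive.read(name)


def build_demo_archive(count):
    """Create a small stand-in archive in memory (placeholder data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        for index in range(count):
            archive.writestr(member_name(index), b'MZ' + bytes([index]))
    return buffer.getvalue()


demo = build_demo_archive(3)
for name, data in iter_members(demo, 3):
    print(name, len(data))
```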

We access the instructions by using the offsets from the entry point. The address of the operation function and the values of the hard-coded integers, shift amount are also obtained similarly. We discern the type of operation performed by examining its first byte. With this information, we can find the correct output.
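
The only slightly tricky part is the call instruction, whose operand is a signed displacement relative to the next instruction rather than an absolute address. A minimal Python 3 sketch of that calculation, using the bytes of the call at $+29 from the OllyDbg listing (the function name is mine):

```python
import struct


def call_target_offset(call_offset, instruction_bytes):
    """Offset of the callee relative to the entry point for an E8 rel32 call."""
    if len(instruction_bytes) != 5 or instruction_bytes[0] != 0xE8:
        raise ValueError('not a 5-byte near call')
    # Little-endian signed 32-bit displacement follows the E8 opcode
    displacement = struct.unpack('<i', instruction_bytes[1:])[0]
    # The displacement is relative to the address of the next instruction
    return call_offset + len(instruction_bytes) + displacement


# Bytes of the call at $+29 in the listing: E8 CFFFFFFF
offset = call_target_offset(0x29, bytes.fromhex('E8CFFFFFFF'))
print(offset)
```

For this file the result is negative, which means the operation function sits just before the entry point. That is why the script adds the value to ep rather than treating it as an RVA on its own.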

Running the script on stock Python 2.7 takes close to 15 minutes. With PyPy, this is reduced to 2 minutes. We get a 66 KB output consisting of obfuscated JavaScript as shown in Figure 5.

Figure 5: Obfuscated javascript output

Running the obfuscated javascript on jsfiddle gives us the flag easyctf{wtf_67k_binaries?why_so_mean?} as also shown in Figure 6:

Figure 6: Finally we get the flag

Hacking the CPython virtual machine to support bytecode debugging

2017-03-16T09:44:00.000+00:00

As you may know, Python is an interpreted programming language. By Python, I am referring to the standard implementation, i.e. CPython. Being interpreted means that Python code is never directly executed by the processor. The Python compiler converts the source code into an intermediate representation called bytecode. The bytecode consists of instructions which at runtime are interpreted by the CPython virtual machine. To know more about the nitty-gritty details, refer to ceval.c.

Unfortunately, the standard Python implementation does not provide a way to debug the bytecode while it is being executed on the virtual machine. You may ask why that is even needed, as we can already debug Python source code using pdb and similar tools. Also, gdb 7 and above support debugging the virtual machine itself, so bytecode debugging may seem unnecessary.

However, that is only one side of the coin. Pdb can be used for debugging only when the source code is available. Gdb no doubt can debug without the source as we are dealing directly with the virtual machine, but it is too low level for our tasks. This is akin to finding bugs in your C code by using an In-Circuit Emulator on the processor. Sure, you would find bugs if you have the time and patience, but it is unusable for most of us. What we need is something in between, one which can not only debug without source but also is not too low-level and can understand the Python specific implementation details. Further, it would be icing on the cake if this system could be implemented directly in Python code.

Implementation details of a source code debugger

Firstly, we need to know how a source code debugger is implemented with respect to Python. The de facto Python debugger is the pdb module. This is basically a derived class from bdb. Pdb provides a command line user interface and is a wrapper around bdb. Now, both pdb and bdb are coded in Python. The main debugging harness in CPython is implemented right within the sys module.

Among the multifarious utility functions in the sys module, settrace allows us to provide a callback function which, just as its name suggests, can trace code execution. Python will call this function in response to various events like when a function is called, a function is about to return, an exception is generated or when a new line of code is about to be executed. Refer to the documentation of settrace for the specifics.

However, there are a couple of gotchas. Unlike a physical processor, the CPython virtual machine has no concept of breakpoints. There is no instruction like INT 3 on x86 or BKPT on ARM to automatically switch the processor to debug state. Instead, the breakpoint mechanism must be implemented in the trace callback function. The trace function will be called whenever a new line of code is about to be executed. We need to check whether the user has requested a break on this line and if so yield control. This mechanism is not without its downside. As the callback will be invoked for every line, and for every other important event, execution speed will be severely reduced. To speed things up, this may be implemented in C as an extension module like cpdb.
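
A minimal sketch of such a callback-based breakpoint is given below, written for Python 3. Instead of handing control to a user it merely records the local variables each time the "breakpoint" line is reached. The names make_tracer, target and run_with_breakpoint are mine and do not come from pdb or bdb.

```python
import sys


def make_tracer(breakpoints, hits):
    """Return a trace callback that records hits on (function name, line) pairs."""
    def tracer(frame, event, arg):
        if event == 'line':
            key = (frame.f_code.co_name, frame.f_lineno)
            if key in breakpoints:
                # A real debugger would stop here and hand control to the user
                hits.append((key, dict(frame.f_locals)))
        # Returning the callback keeps line tracing enabled for this frame
        return tracer
    return tracer


def target(n):
    total = 0
    for i in range(n):
        total += i
    return total


def run_with_breakpoint():
    first = target.__code__.co_firstlineno
    breakpoints = {('target', first + 3)}  # the "total += i" line
    hits = []
    previous = sys.gettrace()
    sys.settrace(make_tracer(breakpoints, hits))
    try:
        result = target(3)
    finally:
        sys.settrace(previous)
    return result, hits


result, hits = run_with_breakpoint()
print(result, len(hits))
```

Every line event goes through the Python callback whether or not a breakpoint is set on it, which is exactly the overhead described above.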

So far so good, and it seems line tracing is just the functionality we require. However, this works only at the source code level. The lowest granularity at which tracing works is the line level, and not the instruction level as we require.

How does line tracing work?

Python code objects have a special member called co_lnotab, also known as the line number table. It contains a series of unsigned bytes wrapped up in a string. This is used to map bytecode offsets back to the source code line from which a particular instruction originated.
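
The table is stored as pairs of (bytecode offset increment, line number increment). The sketch below decodes such a table in the same way as dis.findlinestarts does in Python 2, but is written for Python 3 and operates on a plain bytes object. The sample table is hand-written for illustration and was not dumped from a real code object.

```python
def find_line_starts(lnotab, first_lineno):
    """Decode a Python 2 style co_lnotab into (bytecode offset, line number) pairs."""
    starts = []
    last_line = None
    line = first_lineno
    offset = 0
    # The table is a flat sequence of (offset increment, line increment) pairs
    for offset_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if offset_incr:
            if line != last_line:
                starts.append((offset, line))
                last_line = line
            offset += offset_incr
        line += line_incr
    if line != last_line:
        starts.append((offset, line))
    return starts


# Hand-written table: a function defined on line 3 whose first statement is
# on line 6, followed by statements on lines 7 and 9
sample = bytes([0, 3, 12, 1, 4, 2])
print(find_line_starts(sample, 3))
```

Each pair in the result marks a bytecode offset at which a new source line begins. These are the only offsets at which a line event is normally generated.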

When the CPython virtual machine interprets the bytecode, before executing each instruction it checks whether the current bytecode offset is the start of some source code line; if so, it calls the trace function. An example trace function taken from the bdb module is shown below.

def trace_dispatch(self, frame, event, arg):
    if self.quitting:
        return # None
    if event == 'line':
        return self.dispatch_line(frame)
    if event == 'call':
        return self.dispatch_call(frame, arg)
    if event == 'return':
        return self.dispatch_return(frame, arg)
    if event == 'exception':
        return self.dispatch_exception(frame, arg)
    if event == 'c_call':
        return self.trace_dispatch
    if event == 'c_exception':
        return self.trace_dispatch
    if event == 'c_return':
        return self.trace_dispatch
    print 'bdb.Bdb.dispatch: unknown debugging event:', repr(event)
    return self.trace_dispatch

The trace function is provided with the currently executing frame as an argument. The frame is a data structure that encapsulates the context under which a code object is executing. We can query the frame using the inspect module. We can change the currently executing line by changing f_lineno of the frame object. Similarly, we can modify variables by using the eval function in the context of the globals and locals obtained from the frame.

Bytecode Tracing Techniques

Listed below are some existing techniques for tracing python bytecode execution.

Extending co_lnotab

We have seen that co_lnotab, the line number table, is used for determining when to call the trace function. Ned Batchelder (2008) showed that it is possible to modify the line number table to include an entry for each instruction offset in the bytecode. To the Python VM, this implies that every instruction corresponds to a different line of source, and hence it calls the trace function for every instruction executed. This technique is very easy to implement and requires no modification to Python. We only need to alter the line number table for each code object to include an entry for each instruction. The downside of this approach is that it increases the pyc file size, and more so if the bytecode is obfuscated, when we have no idea which bytes are instructions and which are junk. To be on the safer side, we can add an entry for each byte no matter if it is a real instruction or a junk byte.
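
One way to build such a "one entry per byte" table is sketched below in Python 3, assuming the Python 2 co_lnotab layout of (offset increment, line increment) byte pairs. The helper names are hypothetical, and actually swapping the table into a code object is left out.

```python
def per_byte_lnotab(code_length):
    """One (offset +1, line +1) pair for every bytecode byte after the first."""
    return b'\x01\x01' * max(code_length - 1, 0)


def expand(lnotab, first_lineno=1):
    """List the (offset, line) pairs described by a table of non-zero increments."""
    offset, line = 0, first_lineno
    pairs = [(offset, line)]
    for i in range(0, len(lnotab), 2):
        offset += lnotab[i]
        line += lnotab[i + 1]
        pairs.append((offset, line))
    return pairs


# A hypothetical 4-byte code string: every offset gets its own fake line
print(expand(per_byte_lnotab(4)))
```

The table grows by two bytes for every byte of bytecode, which is where the size increase mentioned above comes from.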

Compiling python with LLTRACE

An undocumented way to trace bytecode execution is to compile python from source with the LLTRACE flag enabled. At execution time, python prints every instruction it executes on the console. This method is not without its flaws. Printing every executed instruction on the console is an expensive operation slowing down execution speed. Further, we have no control over the execution, i.e. we cannot modify the operation of the code in any way and it is not possible to toggle off this feature when we do not need it.

Introducing a new opcode

Yet another way to implement tracing is to introduce a new opcode altogether (Rouault, 2015). This is a complicated process and requires a lot of modifications to Python. The entire process with all its gory details is described on this page. The gist of the approach is that we create a new opcode which Rouault (2015) calls DEBUG_OP. Whenever the Python VM encounters this opcode, it calls a function previously supplied by the user, passing the execution context consisting of the frame and the evaluation stack as the arguments.

Undoubtedly, this method is superior to the pre-existing methods, although it requires a lot of changes in the implementation of Python. However, the main drawback of this approach is that it requires modifying the instruction stream and slipping a DEBUG_OP opcode in between. This is feasible for normal bytecode generated by Python but definitely not for bytecode which is obfuscated. When the instructions are obfuscated, it is not possible to insert the DEBUG_OP opcode in advance as we cannot differentiate between normal instructions and junk instructions.

The proposed method

Keeping note of the limitations of the above techniques, our proposed method must overcome these. Specifically, it must be resistant to obfuscation and should not require any changes to the bytecode itself. It would be ideal if we could reuse or extend existing functionality to support bytecode tracing and debugging.

As said before, the Python VM consults co_lnotab, the line number table before execution of each instruction to determine when to call the trace function. It looks like we can somehow modify this to call our tracing function right before execution of the individual instructions without checking the line number table. This is the approach we will take.

The function responsible for calling the tracing function is maybe_call_line_trace at line #4054 within ceval.c.

/* See Objects/lnotab_notes.txt for a description of how tracing works. */
static int
maybe_call_line_trace(Py_tracefunc func, PyObject *obj,
                      PyFrameObject *frame, int *instr_lb, int *instr_ub,
                      int *instr_prev)
{
    int result = 0;
    int line = frame->f_lineno;

    /* If the last instruction executed isn't in the current
       instruction window, reset the window.
    */
    if (frame->f_lasti < *instr_lb || frame->f_lasti >= *instr_ub) {
        PyAddrPair bounds;
        line = _PyCode_CheckLineNumber(frame->f_code, frame->f_lasti,
                                       &bounds);
        *instr_lb = bounds.ap_lower;
        *instr_ub = bounds.ap_upper;
    }
    /* If the last instruction falls at the start of a line or if
       it represents a jump backwards, update the frame's line
       number and call the trace function. */
    if (frame->f_lasti == *instr_lb || frame->f_lasti < *instr_prev) {
        frame->f_lineno = line;
        result = call_trace(func, obj, frame, PyTrace_LINE, Py_None);
    }
    *instr_prev = frame->f_lasti;
    return result;
}

Those if statements are mostly checking whether the current bytecode instruction maps to the beginning of some line. We can simply remove them to make it call our trace function per executed instruction rather than per source line.

static int
maybe_call_line_trace(Py_tracefunc func, PyObject *obj,
                      PyFrameObject *frame, int *instr_lb, int *instr_ub,
                      int *instr_prev)
{
    int result = 0;
    result = call_trace(func, obj, frame, PyTrace_LINE, Py_None);
    *instr_prev = frame->f_lasti;
    return result;
}

After building Python from source with those teeny-tiny changes in-place, we have implemented an execution tracer re-using the existing settrace functionality. We now need to code the callback function which will be called by settrace. This can be realized either in Python or C as an extension (like cpdb), but we choose the former for ease of development.

The Tracer

The code of the tracer is listed below and can also be found on GitHub at https://github.com/extremecoders-re/bytecode_tracer

import sys
import dis
import marshal
import argparse

tracefile = None
options = None

# List of valid python opcodes
valid_opcodes = dis.opmap.values()

def trace(frame, event, arg):
    global tracefile, valid_opcodes, options
    if event == 'line':
        # Get the code object
        co_object = frame.f_code

        # Retrieve the name of the associated code object
        co_name = co_object.co_name

        if options.name is None or co_name == options.name:
            # Get the code bytes
            co_bytes = co_object.co_code

            # f_lasti is the offset of the last bytecode instruction executed
            # w.r.t the current code object
            # For the very first instruction this is set to -1
            ins_offs = frame.f_lasti

            if ins_offs >= 0:
                opcode = ord(co_bytes[ins_offs])

                # Check if it is a valid opcode
                if opcode in valid_opcodes:
                    if opcode >= dis.HAVE_ARGUMENT:
                        # Fetch the operand
                        operand = arg = ord(co_bytes[ins_offs+1]) | (ord(co_bytes[ins_offs+2]) << 8)

                        # Resolve instriction arguments if specified
                        if options.resolve:
                            try:
                                if opcode in dis.hasconst:
                                    operand = co_object.co_consts[arg]
                                elif opcode in dis

For demonstrating the usage I have chosen the following piece of code taken from programiz.

# Python program to find the factorial of a number using recursion

def recur_factorial(n):
   """Function to return the factorial
   of a number using recursion"""
   if n == 1:
       return n
   else:
       return n*recur_factorial(n-1)

# Change this value for a different result
num = 7

# check if the number is negative
if num < 0:
   print("Sorry, factorial does not exist for negative numbers")
elif num == 0:
   print("The factorial of 0 is 1")
else:
   print("The factorial of",num,"is",recur_factorial(num))

Suppose, we want to trace the execution of the recur_factorial function. We can do so, by running the following:

$ python tracer.py -t=only -n=recur_factorial -r factorial.pyc trace.txt

We are tracing the execution of only those code objects having a name of recur_factorial.
The -r flag means to resolve the operands of instructions. Instructions in python can take an argument. For some instructions like LOAD_CONST, the argument is an integer specifying the index of an item within the co_consts table which will be pushed on the evaluation stack. If resolution (-r flag) is enabled, the item will be written to the trace instead of the integer argument.
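
To illustrate what resolving an operand means, the sketch below uses dis.get_instructions from the Python 3 standard library. It is not the tracer's own code, which targets Python 2 and decodes the operands by hand. The function greet is just a stand-in, and the exact opcodes printed depend on the Python version.

```python
import dis


def greet():
    return 'hello'


def resolved_trace(func):
    """Return (offset, opcode name, resolved operand) for every instruction."""
    rows = []
    for ins in dis.get_instructions(func):
        # argval holds the operand after dis has resolved it where it can,
        # e.g. the constant itself rather than its index in co_consts
        rows.append((ins.offset, ins.opname, ins.argval))
    return rows


for offset, name, operand in resolved_trace(greet):
    print(offset, name, '' if operand is None else '(%r)' % (operand,))
```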

The input file name is factorial.pyc and the trace file name is trace.txt. Running this, we get an execution trace like the following:

recur_factorial> 0 LOAD_FAST (n)
recur_factorial> 3 LOAD_CONST (1)
recur_factorial> 6 COMPARE_OP (==)
recur_factorial> 16 LOAD_FAST (n)
recur_factorial> 19 LOAD_GLOBAL (recur_factorial)
recur_factorial> 22 LOAD_FAST (n)
recur_factorial> 25 LOAD_CONST (1)
recur_factorial> 28 BINARY_SUBTRACT
recur_factorial> 29 CALL_FUNCTION (1)
recur_factorial> 0 LOAD_FAST (n)
recur_factorial> 3 LOAD_CONST (1)
recur_factorial> 6 COMPARE_OP (==)
recur_factorial> 16 LOAD_FAST (n)
recur_factorial> 19 LOAD_GLOBAL (recur_factorial)
recur_factorial> 22 LOAD_FAST (n)
recur_factorial> 25 LOAD_CONST (1)
recur_factorial> 28 BINARY_SUBTRACT
recur_factorial> 29 CALL_FUNCTION (1)
recur_factorial> 0 LOAD_FAST (n)
recur_factorial> 3 LOAD_CONST (1)
recur_factorial> 6 COMPARE_OP (==)
recur_factorial> 16 LOAD_FAST (n)
recur_factorial> 19 LOAD_GLOBAL (recur_factorial)
recur_factorial> 22 LOAD_FAST (n)
recur_factorial> 25 LOAD_CONST (1)
recur_factorial> 28 BINARY_SUBTRACT
recur_factorial> 29 CALL_FUNCTION (1)
recur_factorial> 0 LOAD_FAST (n)
recur_factorial> 3 LOAD_CONST (1)
recur_factorial> 6 COMPARE_OP (==)
recur_factorial> 16 LOAD_FAST (n)
recur_factorial> 19 LOAD_GLOBAL (recur_factorial)
recur_factorial> 22 LOAD_FAST (n)
recur_factorial> 25 LOAD_CONST (1)
recur_factorial> 28 BINARY_SUBTRACT
recur_factorial> 29 CALL_FUNCTION (1)
recur_factorial> 0 LOAD_FAST (n)
recur_factorial> 3 LOAD_CONST (1)
recur_factorial> 6 COMPARE_OP (==)
recur_factorial> 16 LOAD_FAST (n)
recur_factorial> 19 LOAD_GLOBAL (recur_factorial)
recur_factorial> 22 LOAD_FAST (n)
recur_factorial> 25 LOAD_CONST (1)
recur_factorial> 28 BINARY_SUBTRACT
recur_factorial> 29 CALL_FUNCTION (1)
recur_factorial> 0 LOAD_FAST (n)
recur_factorial> 3 LOAD_CONST (1)
recur_factorial> 6 COMPARE_OP (==)
recur_factorial> 16 LOAD_FAST (n)
recur_factorial> 19 LOAD_GLOBAL (recur_factorial)
recur_factorial> 22 LOAD_FAST (n)
recur_factorial> 25 LOAD_CONST (1)
recur_factorial> 28 BINARY_SUBTRACT
recur_factorial> 29 CALL_FUNCTION (1)
recur_factorial> 0 LOAD_FAST (n)
recur_factorial> 3 LOAD_CONST (1)
recur_factorial> 6 COMPARE_OP (==)
recur_factorial> 12 LOAD_FAST (n)
recur_factorial> 15 RETURN_VALUE
recur_factorial> 32 BINARY_MULTIPLY
recur_factorial> 33 RETURN_VALUE
recur_factorial> 32 BINARY_MULTIPLY
recur_factorial> 33 RETURN_VALUE
recur_factorial> 32 BINARY_MULTIPLY
recur_factorial> 33 RETURN_VALUE
recur_factorial> 32 BINARY_MULTIPLY
recur_factorial> 33 RETURN_VALUE
recur_factorial> 32 BINARY_MULTIPLY
recur_factorial> 33 RETURN_VALUE
recur_factorial> 32 BINARY_MULTIPLY
recur_factorial> 33 RETURN_VALUE

That's pretty cool! Now we know the exact opcodes that are executing. Tracing obfuscated bytecode is no longer a problem.

Extending tracing to full-fledged debugging

The tracer developed does not have advanced debugging capabilities. For instance, we cannot interact with the operand stack, tamper with the values stored, modify the opcodes dynamically at run time etc. We do have access to the frame object but the evaluation stack is not accessible from Python. However, everything is accessible to a C extension. We can develop such a C extension which, when given a frame object, can allow Python code to interact with the objects stored on the operand stack.

This will be the topic for another blog post. I also intend to show, how we can use such an advanced tracer to unpack & deobfuscate the layers of a PjOrion protected python application.

References

Batchelder, N. (2008, April 11). Wicked hack: Python bytecode tracing. Retrieved March 15, 2017, from https://nedbatchelder.com/blog/200804/wicked_hack_python_bytecode_tracing.html

Rouault, C. (2015, May 7). Understanding Python execution from inside: A Python assembly tracer. Retrieved March 15, 2017, from http://web.archive.org/web/20160830181828/http://blog.hakril.net/articles/2-understanding-python-execution-tracer.html?

Extracting encrypted pyinstaller executables

2017-02-14T19:58:00.002+00:00

UPDATE: For recent PyInstaller versions, the script below won't work. Please visit the pyinstxtractor wiki for more information.

It has been more than a quarter since the last post, and in the meantime, I was very busy and did not have the time to write a proper post. The good news is at the moment, I am comparatively free and can put in a quick post.

As said earlier, PyInstaller provides an option to encrypt the embedded files within the executable. This feature can be used by supplying an argument --key=key-string while generating the executable.

Detecting encrypted pyinstaller executables is simple. If pyinstxtractor is used, it would indicate this as shown in Figure 1.

Figure 1: Trying to extract encrypted pyinstaller archive

The other tell-tale sign is the presence of the file pyimod00_crypto_key in the extracted directory as shown in Figure 2.

Figure 2: The file pyimod00_crypto_key indicates usage of crypto

If encryption is used, pyinstaller AES encrypts all the embedded files present within ZLibArchive i.e. the out00-PYZ.pyz file. When pyinstxtractor encounters an encrypted pyz archive, it would extract the contents as-is without decrypting the individual files as shown in Figure 3.

Figure 3: Contents of an encrypted pyz archive

To decrypt the files, you would need the key, and the key is present right within the file pyimod00_crypto_key. This is just a pyc file, and can be fed to a decompiler to retrieve the key.

With the key in hand, it is a matter of another script to decrypt.

from Crypto.Cipher import AES
import zlib

CRYPT_BLOCK_SIZE = 16

# key obtained from pyimod00_crypto_key
key = 'MySup3rS3cr3tK3y'

inf = open('_abcoll.pyc.encrypted', 'rb') # encrypted file input
outf = open('_abcoll.pyc', 'wb') # output file 

# Initialization vector
iv = inf.read(CRYPT_BLOCK_SIZE)

cipher = AES.new(key, AES.MODE_CFB, iv)

# Decrypt and decompress
plaintext = zlib.decompress(cipher.decrypt(inf.read()))

# Write pyc header
outf.write('\x03\xf3\x0d\x0a\0\0\0\0')

# Write decrypted data
outf.write(plaintext)

inf.close()
outf.close()

The above snippet can be used for decrypting the encrypted files. Afterward, you can run a decompiler to get back the source.
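
The eight bytes written as the pyc header in the script are the Python 2.7 magic number followed by a zeroed timestamp. The Python 3 sketch below shows how they are put together; the function name is mine, and the layout applies to Python 2.7 only, as later Python versions use longer headers.

```python
import struct

PY27_MAGIC_NUMBER = 62211  # magic number of CPython 2.7


def py27_pyc_header(mtime=0):
    """Build the 8-byte header that precedes the marshalled code in a 2.7 pyc."""
    # 16-bit little-endian magic number followed by "\r\n"
    magic = struct.pack('<H', PY27_MAGIC_NUMBER) + b'\r\n'
    # 32-bit little-endian source modification time, zeroed here
    timestamp = struct.pack('<I', mtime)
    return magic + timestamp


print(py27_pyc_header().hex())
```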

Flare-on Challenge 2016 Write-up

2016-11-09T06:41:00.000+00:00

The Flare-on challenge is an annual CTF style challenge with a focus on reverse engineering. Official solutions have already been published and there are other writeups available too, hence I will just skim through the parts.

Challenge #1

The first was simple. This is base64 encoding with a custom charset. This online tool does the job.

Flag: sh00ting_phish_in_a_barrel@flare-on.com

Fig 1: Challenge 1

Challenge #2 - DudeLocker

This is a file encrypting ransomware. An encrypted file (BusinessPapers.doc) is provided, the task is to decrypt it. As the encryption key is hardcoded in the binary, I simply changed the CryptEncrypt call to CryptDecrypt by modifying the IAT. This decrypts the file giving the following image.

flag: cl0se_t3h_f1le_0n_th1s_One@flare-on.com

Fig 2: Challenge 2

Challenge #3 - Unknown

The challenge is named unknown; we need to find its proper name. This can be found from the pdb file path embedded in the debug information. The binary implements a custom md5 hash algorithm which is used to calculate a table from the command line argument and the executable path. Since we already know the proper path, the command line argument can simply be brute forced, giving us the flag Ohs0pec1alpwd@flare-on.com

sss = map(ord, list('__FLARE On!'))

target = [0xEE613E2F, 0xDE79EB45, 0xAF1B2F3D, 0x8747BBD7, 
0x739AC49C, 0xC9A4F5AE, 0x4632C5C1, 0xA0029B24, 0xD6165059, 
0xA6B79451, 0xE79D23BA, 0x8AAE92CE, 0x85991A18, 0xFEE05899, 
0x430C7994, 0x1AB9F36F, 0x70C42481, 0x05BD27CF, 0xC4FF6E6F, 
0x5A77847C, 0xDD9277B3, 0x25843CFF, 0x5FDCA944, 0x8EE42896, 
0x2AE961C7, 0xA77731DA]

def charsprod(li):
 prod = 0
 for i in xrange(len(li)):
  prod = ((prod*37)&0xFFFFFFFF) + li[i] 
 return prod

email = ''

for i in xrange(26):
 sss[1] = ord('`') + i
 for j in xrange(1, 256):
  sss[0] = j
  if charsprod(sss) == target[i]:
   email += chr(j)
   break

print email

Challenge #4 - flareon2016challenge

A dll is provided which exports 51 functions by ordinal. Among them, functions 1 to 48 and 51 change the global state in some way or the other and must be called first. Ordinal 50 makes a call to Beep and also tries to decrypt some piece of data. The task is to call the functions in the proper order so that the decryption may succeed. Additionally, these functions return an integer byte value indicating the ordinal of the next function that must be called. We can use the return values to build up the call chain order using the following script.

import ctypes

dll = ctypes.windll.LoadLibrary('flareon2016challenge')
call_chain = {}

for i in range(1, 49):
 retval = dll[i]()
 call_chain[i] = retval

print sorted(call_chain.values())
# missing is ordinal 30, should be called first
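
The "which ordinal is missing" step can also be done programmatically. The Python 3 sketch below works on a made-up chain, not on the values returned by the real DLL, and its function names are hypothetical.

```python
def find_first(call_chain):
    """The starting ordinal is the only key that no function returns."""
    candidates = set(call_chain) - set(call_chain.values())
    if len(candidates) != 1:
        raise ValueError('expected exactly one starting ordinal')
    return candidates.pop()


def call_order(call_chain, last):
    """Follow the chain from the start until the last ordinal is reached."""
    order = [find_first(call_chain)]
    while order[-1] != last:
        order.append(call_chain[order[-1]])
    return order


# Made-up chain: calling 3 returns 2, calling 2 returns 1, calling 1 returns 4
demo_chain = {1: 4, 2: 1, 3: 2}
print(call_order(demo_chain, last=4))
```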

The call chain table thus found has no entry for ordinal 30, hence that is the function to be called first. After calling the functions in the correct order, an embedded executable is decrypted which just makes a series of calls to the Beep function. Setting a logging breakpoint on Beep allows us to recover the parameters passed. Calling export 50 using the same parameters gives us the flag: f0ll0w_t3h_3xp0rts@flare-on.com

import sys
import ctypes
import os

params =[(440, 500), (440, 500), (440, 500), (349, 350) , (523, 150), (440, 500), (349, 350), (523, 150), (440, 1000), (659, 500), 
(659, 500), (659, 500), (698, 350), (523, 150), (415, 500), (349, 350), (523, 150), (440 , 1000)]

dll = ctypes.cdll.LoadLibrary('flareon2016challenge')

# call first function
retval = dll[30]()

# do not call last function
while retval != 51:
 retval = dll[retval]()

# call last func
dll[51]()

for p in params:
 dll[50](p[0], p[1])

Challenge #5 - smokestack

The provided executable is a stack-based virtual machine. It takes in an argument and prints the flag if it is correct. I reimplemented the VM in Python and brute forced the flag A_p0p_pu$H_&_a_Jmp@flare-on.com

instructions = [0, 33, 2, 0, 145, 8, 0, 22, 0, 12, 9, 10, 11, 0, 0,
12, 2, 12, 0, 0, 29, 10, 11, 0, 0, 99, 2, 12, 0, 0,
24, 6, 0, 84, 8, 0, 51, 0, 41, 9, 10, 11, 0, 0, 44,
2, 12, 0, 0, 61, 10, 0, 14, 1, 11, 0, 0, 89, 2, 12,
0, 11, 0, 0, 0, 12, 1, 0, 9, 12, 0, 11, 1, 0, 2, 2,
12, 1, 11, 0, 0, 1, 3, 12, 0, 11, 0, 0, 0, 8, 0, 71,
0, 96, 9, 10, 12, 0, 11, 1, 3, 0, 93, 8, 0, 124, 0,
110, 9, 10, 11, 0, 0, 7, 3, 12, 0, 0, 91, 12, 1, 0,
135, 10, 0, 54, 12, 1, 11, 0, 11, 1, 2, 12, 1, 11, 1,
0, 88, 2, 6, 0, 249, 8, 0, 160, 0, 150, 9, 10, 11, 0,
0, 77, 6, 12, 0, 0, 174, 10, 0, 803, 0, 299, 3, 12,
1, 11, 0, 11, 1, 2, 12, 1, 12, 1, 11, 1, 11, 1, 0, 1,
3, 12, 1, 0, 3, 2, 11, 1, 0, 0, 8, 0, 178, 0, 199, 9,
10, 7, 0, 65143, 8, 0, 216, 0, 209, 9, 10, 11, 0, 0,
88, 2, 12, 0, 0, 3, 4, 0, 140, 2, 0, 24724, 8, 0, 238,
0, 231, 9, 10, 11, 0, 0, 231, 2, 12, 0, 11, 1, 2, 0,
12, 6, 0, 116, 8, 0, 263, 0, 253, 9, 10, 11, 0, 0, 9,
3, 12, 0, 0, 285, 10, 0, 10, 12, 1, 11, 1, 0, 1, 3,
12, 1, 11, 1, 0, 0, 8, 0, 267, 0, 285, 9, 10, 0, 6,
5, 0, 7616, 8, 0, 307, 0, 297, 9, 10, 11, 0, 0, 113,
2, 12, 0, 0, 317, 10, 11, 0, 0, 119, 2, 12, 0, 0, 317,
10, 0, 22, 2, 0, 14, 3, 0, 97, 8, 0, 339, 0, 332, 9,
10, 11, 0, 0, 44, 3, 12, 0, 12, 1, 11, 1, 0, 8492, 11,
1, 0, 1, 3, 12, 1, 0, 7, 3, 11, 1, 0, 0, 8, 0, 345,
0, 366, 9, 10, 0, 458, 6, 0, 8181, 8, 0, 385, 0, 378,
9, 10, 11, 0, 0, 18, 2, 12, 0, 13]

stack = []
sp = ctx1 = ctx2 = 0

def push(value):
 global sp
 sp += 1
 stack[sp] = value & 0xFFFF

def pop():
 global sp
 sp -= 1
 return stack[sp + 1]

def init_vm():
 global stack, sp, ctx1, ctx2
 stack = [ord(ch) for ch in 'kYwxCbJoLp']
 stack += [0,0,0,0]
 sp = 9
 ctx1 = ctx2 = 0

def exec_vm():
 global ctx1, ctx2
 ip = 0

 while ip < 386:
  #print 'loc_%d' %ip
  opcode = instructions[ip]
  
  if opcode == 0: # ins_load
   operand = instructions[ip + 1]
   push(operand)
   ip += 2

  elif opcode == 1: # ins_dec_sp
   pop()
   ip += 1

  elif opcode == 2: # ins_add
   v0 = pop()
   v1 = pop()
   push(v0 + v1)
   ip += 1

  elif opcode == 3: # ins_sub
   v0 = pop()
   v1 = pop()
   push(v1 - v0)
   ip += 1  

  elif opcode == 4: # ins_rotr
   v0 = pop()
   v1 = pop();
   push((v1 << (16 - v0)) | (v1 >> v0))
   ip += 1

  elif opcode == 5: # ins_rotl
   v0 = pop()
   v1 = pop();
   push((v1 >> (16 - v0)) | (v1 << v0))
   ip += 1

  elif opcode == 6: # ins_xor
   v0 = pop()
   v1 = pop();
   push(v1 ^ v0)
   ip += 1

  elif opcode == 7: # ins_not
   v0 = pop()
   push(~v0)
   ip += 1  

  elif opcode == 8: # ins_cmp
   v0 = pop()
   if v0 == pop():
    push(1)
   else:
    push(0)
   ip += 1    

  elif opcode == 9: # ins_cload
   v1 = pop()
   v0 = pop()
   if 1 == pop():
    push(v1)
   else:
    push(v0)
   ip += 1 

  elif opcode == 10: # ins_jmp
   ip = pop()

  elif opcode == 11: # ins_load_ctx
   operand = instructions[ip + 1]
   if operand == 0:
    push(ctx1)
   elif operand == 1:
    push(ctx2)
   ip += 2

  elif opcode == 12: # ins_set_ctx
   operand = instructions[ip + 1]
   v1 = pop()
   if operand == 0:
    ctx1 = v1
   elif operand == 1:
    ctx2 = v1
   ip += 2 

  elif opcode == 13: # ins_inc_ip
   ip += 1    


def main():
 global stack
 start = ord('A') - 1
 end = ord('z') + 1

 for pos in range(10):
  ret_vals = [None for i in xrange(start, end)]
  for i in range(start, end):
   init_vm()
   stack[pos] = i
   exec_vm()
   ret_vals[i-start] = tuple([i, ctx1])

  for e in ret_vals:
   if e[1] != ret_vals[0][1]:
    print chr(e[0]),
    break 


if __name__ == '__main__':
 main()

Challenge #6 - khaki

The challenge presents a piece of obfuscated python bytecode and is by far the best challenge. The file provided is a py2exe'd executable which can be easily unpacked to get the embedded pyc file. This pyc is obfuscated and cannot be easily decompiled. The reason for this is that it has been sprinkled with NOPs, two POP_TOP, two ROT_TWO, and three ROT_THREE instructions. I developed a peephole optimizer to remove these instructions and make the file decompilable, using the bytecode-graph library developed by FireEye.

import bytecode_graph
import marshal
import opcode

def remove_nops(bcg, nodes):
 for i in xrange(len(nodes) - 1):
  node = nodes[i]
  if node.opcode == opcode.opmap['NOP']:
   bcg.delete_node(node)
   return True
 return False


def peephole_load_const(bcg, nodes):
 for i in xrange(len(nodes) - 1):
  node = nodes[i]
  # Peephole optimization (remove sequence of load and pop instructions)  
  if node.opcode == opcode.opmap['LOAD_CONST'] and nodes[i+1].opcode == opcode.opmap['POP_TOP']:
   bcg.delete_node(node)
   bcg.delete_node(nodes[i+1])
   return True
 return False   

def peephole_rot_two(bcg, nodes):
 for i in xrange(len(nodes) - 1):
  node = nodes[i]

  # Peephole optimization (remove two consecutive ROT_TWO)  
  if node.opcode == opcode.opmap['ROT_TWO'] and nodes[i+1].opcode == opcode.opmap['ROT_TWO']:
   bcg.delete_node(node)
   bcg.delete_node(nodes[i+1])
   return True
 return False

def peephole_rot_three(bcg, nodes):
 for i in xrange(len(nodes) - 2):
  node = nodes[i]

  # Peephole optimization (remove three consecutive ROT_THREE)
  if node.opcode == opcode.opmap['ROT_THREE'] and nodes[i+1].opcode == opcode.opmap['ROT_THREE'] and nodes[i+2].opcode == opcode.opmap['ROT_THREE']:
   bcg.delete_node(node)
   bcg.delete_node(nodes[i+1]) 
   bcg.delete_node(nodes[i+2]) 
   return True

 return False  


def main():
 pyc_file = open('poc.pyc', 'rb').read()
 pyc = marshal.loads(pyc_file[8:])
 bcg = bytecode_graph.BytecodeGraph(pyc)

 nodes = [x for x in bcg.nodes()]

 while remove_nops(bcg, nodes) == True:
  nodes = [x for x in bcg.nodes()]

 while peephole_load_const(bcg, nodes) == True:
  nodes = [x for x in bcg.nodes()]
 
 while peephole_rot_two(bcg, nodes) == True:
  nodes = [x for x in bcg.nodes()]


 while peephole_rot_three(bcg, nodes) == True:
  nodes = [x for x in bcg.nodes()]

 deobf_code = bcg.get_code()
 f = open('poc-deobf.pyc', 'wb')
 f.write('\x03\xf3\x0d\x0a\0\0\0\0')
 marshal.dump(deobf_code, f)
 f.close()


if __name__ == '__main__':
 main()

Using this we can obtain the following deobfuscated code.

# Embedded file name: poc.py
import sys, random
__version__ = 'Flare-On ultra python obfuscater 2000'
target = random.randint(1, 101)
count = 1
error_input = ''
while True:
    print '(Guesses: %d) Pick a number between 1 and 100:' % count,
    input_num = sys.stdin.readline()
    try:
        input_num = int(input_num, 0)
    except:
        error_input = input_num
        print 'Invalid input: %s' % error_input
        continue

    if target == input_num:
        break
    if input_num < target:
        print 'Too low, try again'
    else:
        print 'Too high, try again'
    count += 1

if target == input_num:
    win_msg = 'Wahoo, you guessed it with %d guesses\n' % count
    sys.stdout.write(win_msg)
if count == 1:
    print 'Status: super guesser %d' % count
    #sys.exit(1)
if count > 25:
    print 'Status: took too long %d' % count
    sys.exit(1)
else:
    print 'Status: %d guesses' % count

if error_input != '':
    tmp = ''.join((chr(ord(x) ^ 66) for x in error_input)).encode('hex')
    if tmp != '312a232f272e27313162322e372548':
        sys.exit(0)
    stuffs = [67,139,119,165,232,86,207,61,
    import hashlib
    stuffer = hashlib.md5(win_msg + tmp).digest()
    for x in range(len(stuffs)):
        print chr(stuffs[x] ^ ord(stuffer[x % len(stuffer)])),

    print

Another python script to brute force the flag.

import hashlib

for i in xrange(100):
 win_msg = 'Wahoo, you guessed it with %d guesses\n' %i
 tmp = '312a232f272e27313162322e372548'

 stuffs = [67,139,119,165,232,86,207,61,79,67,45,58,230,190,181,74,65,148,71,243,246,67,142,60,61,92,58,115,240,226,171]
 stuffer = hashlib.md5(win_msg + tmp).digest()

 s = ''
 for x in range(len(stuffs)):
  s += chr(stuffs[x] ^ ord(stuffer[x % len(stuffer)]))
 if s.endswith('.com'):
  print s
  break

Flag 1mp0rt3d_pygu3ss3r@flare-on.com

Challenge #7 - hashes

The challenge is an x86 ELF (linux binary) developed in the go language. However, unlike binaries built with the standard go compiler gc, this one has been compiled with gccgo and requires libgo.so.7 in order to run. Now my local linux vm is ubuntu 14.04 and libgo7 is only available for ubuntu 16.04 and above. However I was not willing to download and install a complete new distro just for running this single binary. Hence a workaround was necessary. I powered on a cloud9 vm and wgetted the deb directly, bypassing the package manager. Although dpkg could not install the package, I got the much needed file libgo.so.7. Using it I could debug the binary in my local ubuntu 14.04 vm.

Fig 7 - Satisfying the dependencies

With that out of the picture, the objective of the challenge is to crack the SHA1 hash of the flag applied three times recursively. Since we know that the flag ends in @flare-on.com, all that is required is to bruteforce the first few characters. Taking the good boy message "You have hashed the hashes" as a cue, I quickly brute forced the flag h4sh3d_th3_h4sh3s@flare-on.com
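
The brute force idea can be sketched roughly as follows in Python 3. This is only an illustration under assumptions: it feeds the raw digest of each round into the next, and the character set, the prefix length and the toy secret are made up. How the real binary chains the rounds and splits its input is not covered here.

```python
import hashlib
from itertools import product


def triple_sha1(data):
    """Apply SHA1 three times, feeding each raw digest into the next round."""
    for _ in range(3):
        data = hashlib.sha1(data).digest()
    return data


def brute_prefix(target, suffix, charset, max_len):
    """Try every prefix of up to max_len characters in front of the known suffix."""
    for length in range(1, max_len + 1):
        for combo in product(charset, repeat=length):
            candidate = ''.join(combo).encode() + suffix
            if triple_sha1(candidate) == target:
                return candidate
    return None


# Toy demonstration with a made-up secret, not the challenge data.
suffix = b'@flare-on.com'
target = triple_sha1(b'h4x' + suffix)
print(brute_prefix(target, suffix, 'h4x3', 3))  # b'h4x@flare-on.com'
```

The known suffix shrinks the search space to the unknown prefix only, which is what makes the brute force feasible.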

Challenge #8 - chimera

The name of the challenge immediately reminded me of the movie Mission: Impossible II, wherein IMF agent Ethan Hunt must track and destroy a biological weapon, Chimera, along with its antidote Bellerophon, and prevent it from being misused. The actual challenge had nothing to do with the movie, but it was certainly equally engrossing. Instead of the chimera virus, here we have a PE executable with the relevant code hidden up its sleeve in the DOS stub. Once this is figured out, all that is left is to disassemble the obfuscated 16 bit code to understand its workings. Dosbox along with its debugger proved very helpful in solving this problem. To get the flag I used the following script.

table = [255, 21, 116, 32, 64, 0, 137, 236, 93, 195, 66, 70,
192, 99, 134, 42, 171, 8, 191, 140, 76, 37, 25, 49,
146, 176, 173, 20, 162, 182, 103, 221, 57, 216, 95,
63, 123, 92, 194, 178, 246, 46, 117, 155, 97, 148, 207,
206, 106, 152, 80, 242, 91, 240, 69, 48, 14, 56, 235,
59, 108, 102, 127, 36, 61, 223, 136, 151, 185, 179,
241, 203, 131, 153, 26, 13, 239, 177, 3, 85, 158, 154,
122, 16, 224, 54, 232, 211, 228, 50, 193, 120, 7, 183,
107, 199, 112, 201, 44, 160, 145, 53, 109, 254, 115,
94, 244, 164, 217, 219, 67, 105, 245, 141, 238, 68,
125, 72, 181, 220, 75, 2, 161, 227, 210, 166, 33, 62,
47, 163, 215, 187, 132, 90, 251, 143, 18, 28, 65, 40,
197, 118, 89, 156, 247, 51, 6, 39, 10, 11, 175, 113,
22, 74, 233, 159, 79, 111, 226, 15, 190, 43, 231, 86,
213, 83, 121, 45, 100, 23, 149, 167, 189, 124, 29, 88,
147, 165, 101, 248, 24, 19, 234, 188, 229, 243, 55,
4, 150, 168, 30, 1, 41, 130, 81, 60, 104, 31, 142, 218,
138, 5, 34, 114, 73, 250, 135, 169, 84, 98, 198, 170,
9, 180, 253, 214, 209, 172, 133, 17, 71, 58, 157, 230,
77, 27, 204, 82, 128, 35, 252, 237, 139, 126, 96, 205,
110, 87, 186, 222, 174, 202, 196, 119, 12, 78, 212,
208, 200, 225, 184, 249, 38, 144, 129, 52]

target = [56, 225, 74, 27, 12, 26, 70, 70, 10, 150, 41, 115, 115, 164, 105, 3, 0, 27, 168, 248, 184, 36, 22, 214, 9, 203]

flag = map(ord, list('A'*26))

def rol(n):
 b = (n >> 7) & 1
 n = ((n << 1) | b) & 0xFF
 return n

def ror(n):
 b = n & 1
 n = ((n >> 1) | (b << 7)) & 0xFF
 return n 


def calc():
 for i in xrange(len(flag)-1, -1, -1):
  if i == len(flag) - 1:
   v1 = rol(rol(rol(0x97)))
  else:
   v1 = rol(rol(rol(flag[i+1])))

  v2 = table[v1]
  v2 = table[v2]

  flag[i] ^= v2  

 for i in xrange(len(flag)):
  if i == 0:
   flag[i] ^= 0xC5
  else:
   flag[i] ^= flag[i-1]

 print map(hex, flag)


def reverse():
 for i in xrange(len(target)-1, -1, -1):
  if i == 0:
   target[i] ^= 0xC5
  else:
   target[i] ^= target[i-1]

 for i in xrange(len(target)-1, -1, -1):
  if i == len(target) - 1:
   v1 = rol(rol(rol(0x97)))
  else:
   v1 = rol(rol(rol(save)))

  v2 = table[v1]
  v2 = table[v2]
  save = target[i]
  target[i] ^= v2 

 print ''.join(map(chr, target))


if __name__ == '__main__':
 reverse()

flag retr0_hack1ng@flare-on.com

Challenge #9 - GUI

The challenge consists of a .net executable with ConfuserEx thrown in for a change. DnSpy along with NoFuserEx is sufficient to extract all the necessary strings for reconstructing the shared secret.

Share:1-d8effa9e8e19f7a2f17a3b55640b55295b1a327a5d8aebc832eae1a905c48b64
Share:2-f81ae6f5710cb1340f90cd80d9c33107a1469615bf299e6057dea7f4337f67a3
Share:3-523cb5c21996113beae6550ea06f5a71983efcac186e36b23c030c86363ad294
Share:4-04b58fbd216f71a31c9ff79b22f258831e3e12512c2ae7d8287c8fe64aed54cd
Share:5-5888733744329f95467930d20d701781f26b4c3605fe74eefa6ca152b450a5d3
Share:6-a003fcf2955ced997c8741a6473d7e3f3540a8235b5bac16d3913a3892215f0a

Flag Shamir_1s_C0nfused@flare-on.com

Challenge #10 - flava

This was the final challenge, and it is composed of many sub-challenges. The first part requires getting through three layers of obfuscated javascript with an obfuscated Diffie Hellman (courtesy of Angler EK) for more distress. So unless one figures out what the heck is going on with all the obfuscated javascript, it is a dead end. Even if one manages to figure that out, breaking the Diffie Hellman is more pain. Luckily, Kaspersky researchers have already done the hard work, and it only requires a bit of Googling to locate the code necessary to break the Diffie Hellman.
After three layers of javascript there are three more layers of actionscript. While the first layer is straightforward the second and third layers are obfuscated. The challenge in this part is to identify that the RC4 key is reused. Once we know that, we can simply xor the plain text and ciphertext to get the keystream, and xor the resultant keystream with the second ciphertext to get back the plain text. The third actionscript layer simply prints the flag angl3rcan7ev3nprim3@flare-on.com
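
The keystream reuse trick can be illustrated with a small Python 3 sketch. The key and both plaintexts below are made up for the demonstration; only the XOR relationship mirrors what was used on the challenge.

```python
def rc4(key, data):
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical key and plaintexts, for illustration only.
key = b'hypothetical key'
known_plain = b'a plaintext the attacker already knows'
secret_plain = b'the second layer payload'
c1 = rc4(key, known_plain)
c2 = rc4(key, secret_plain)

# The key itself is never needed: plaintext XOR ciphertext yields the keystream.
keystream = xor_bytes(known_plain, c1)
recovered = xor_bytes(keystream, c2)
print(recovered)  # b'the second layer payload'
```

Note that only as many bytes can be recovered as there are bytes of known plaintext.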

Final Words

Overall, the challenges this year were certainly more difficult than those of the preceding year. Some parts required bruteforcing hashes and guesswork which I detest. Another point of notice is that there were no 64 bit binaries. There were also no challenges involving kernel drivers. Finally, I would like to extend my thanks to everyone who helped me through the course of the challenges.

Hack the Vote 2016 CTF - APTeaser writeup

2016-11-07T09:41:00.003+00:00

Just for fun I decided to have a go at the Hack the Vote 2016 CTF, particularly the reversing challenges on Windows. There were two of them APTeaser & Trumpervisor. I managed to solve the first. I did try the second but it involved reversing a Win 10 kernel driver implementing a hypervisor using the Intel Virtualization Extensions (VT-x). Anyway, here is a somewhat detailed writeup for the first.

Initial Analysis

The provided file is a pcapng. Opening it in fiddler reveals an interesting http request for a supposed pdf file on the domain important.documents.trustme, but as indicated by the Content-Type, the response is actually an executable.

Fig 1: Serving an executable when all I want is a pdf

This can be further verified in wireshark as shown in Fig 2.

Fig 2: Cross checking in wireshark

We can save the response to a new file for further analysis. As expected, the file has a pdf icon and a dual extension to pass off as an innocuous pdf, waiting to be clicked.

Fig 3: An exe with a pdf icon and dual extensions

Dissecting the executable

The executable in question is a screen grabber. It takes a screenshot of the desktop at regular intervals, saves it to a jpeg file, "encrypts" the file, and sends it to a remote server. We can decompile it in hex-rays to get the following pseudo code.

int __cdecl main(int argc, const char **argv, const char **envp)
{
  char v4; // [sp+0h] [bp-2Ch]@1
  int v5; // [sp+10h] [bp-1Ch]@1
  int x; // [sp+14h] [bp-18h]@1
  int y; // [sp+18h] [bp-14h]@1
  int x1; // [sp+1Ch] [bp-10h]@1
  int y1; // [sp+20h] [bp-Ch]@1
  SOCKET sock; // [sp+24h] [bp-8h]@2
  int i; // [sp+28h] [bp-4h]@1

  sub_401BF0(0, 0, 0);
  GdiplusStartup(&v5, &v4, 0);
  x1 = 0;
  y1 = 0;
  x = GetSystemMetrics(SM_CXSCREEN);
  y = GetSystemMetrics(SM_CYSCREEN);
  i = 0;
  do
  {
    sock = create_socket();
    if ( sock == SOCKET_ERROR )
    {
      ++i;
      Sleep(300u);
    }
    else
    {
      i = 0;
      take_screenshot(x1, y1, x - x1, y - y1);
      send_file(sock, "a");
      destroy_socket(sock);
    }
  }
  while ( i < 5 );
  GdiplusShutdown(v5);
  return 0;
}

As shown in the preceding snippet, it takes screencaps at regular intervals, and gives up after 5 consecutive failed attempts at creating the socket. To get a better overview of the network traffic, we can analyse it using FakeNet-NG, an excellent dynamic network analysis tool from Fireeye.

A quick overview of the network traffic

Fig 4: FakeNet-NG capturing the outbound requests

FakeNet-NG is an immensely helpful tool for dealing with malware/apps that use the internet for communication. While tools like wireshark can only capture the traffic, fakenet additionally has the ability to mimic/emulate/tamper with the responses via custom listeners implemented as plugins. On this particular sample, it has redirected an outbound request to 128.213.48.117:27015 to its default tcp listeners. Additionally, the name/pid of the application that initiated the request is also shown. The next point of interest is the data that is actually being sent. It starts with the hex bytes 80 F4 FF E0. Now, a jpeg starts with FF D8 FF E0. Similarly the next 4 bytes, 43 4F 4A 46, bear a strong resemblance to the true jpg header bytes xx xx 4A 46. From this, it can be easily deduced that a jpeg is being sent after applying some sort of "encryption" on the first two bytes of each dword.
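
As a quick sanity check of that deduction, the observed bytes can be XORed with the expected jpeg header. A small Python 3 sketch using only the byte values quoted above:

```python
observed = bytes.fromhex('80F4FFE0')  # first dword seen in the captured data
expected = bytes.fromhex('FFD8FFE0')  # jpeg SOI marker followed by the APP0 marker

# XOR of ciphertext and known plaintext gives the value that was mixed in.
mask = bytes(o ^ e for o, e in zip(observed, expected))
value = int.from_bytes(mask, 'little')
print(mask.hex(), hex(value))  # 7f2c0000 0x2c7f
```

Only the two low-order bytes of the little-endian dword are affected, and the resulting value fits in 15 bits.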

The encryption algorithm

The encrypting functionality is implemented in the function send_file and is the subject of further analysis. The snippet of code that deals with the encryption is as follows.

  completed = 0;
  blocksize = 0;
  while ( completed < filesize )
  {
    seed = get_time(0);
    srand(seed);
    if ( filesize - completed < 2048 )
      blocksize = filesize - completed;
    else
      blocksize = 2048;
    for ( i = 0; i < blocksize / 4; ++i )
    {
      randnum = rand();
      if ( !i )
      {
        xored_val = randnum ^ *(jpeg_buf + completed);
        v4 = std::basic_ostream<char,std::char_traits<char>>::operator<<(std::cerr, randnum, sub_403880, " ", *(jpeg_buf + completed), " ");
        v5 = std::basic_ostream<char,std::char_traits<char>>::operator<<(v4);
        v6 = sub_401040(v5, xored_val);
        v7 = std::basic_ostream<char,std::char_traits<char>>::operator<<(v6, sub_401680, v13, *&v14, v15, v16);
        v9 = sub_401040(v7, v8);
        v11 = std::basic_ostream<char,std::char_traits<char>>::operator<<(v9, v10, *&v14, v15, v16, v17);
        std::basic_ostream<char,std::char_traits<char>>::operator<<(v11);
      }
      *(jpeg_buf + 4 * i + completed) ^= randnum;
    }
    send(sock, jpeg_buf + completed, blocksize, 0);
    completed += blocksize;
    Sleep(seed * completed % 3600);
  }

The code seeds the random number generator by passing the current time to the function srand. It then XORs the jpeg 4 bytes at a time, drawing a fresh value from rand for each dword and reseeding the generator at the start of every 2048 byte block. The range of the function rand lies between 0 and 32767 (0x7FFF), so the upper two bytes of each generated value are always zero. This explains why, for every 4 bytes of the encrypted data, 2 bytes were always left unencrypted, as was found in fakenet.

Breaking the crypto

To break the crypto, we need to predict the generated random numbers, which depend on the seed used. The seed in turn is just the current time as a Unix epoch timestamp. Since each captured packet has a timestamp, we can use that value as a seed to srand to regenerate the exact same sequence of random numbers.
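
To make the idea concrete, here is a rough Python 3 sketch of the per-block XOR. It assumes the sample was linked against the Microsoft C runtime, whose rand is a linear congruential generator with the constants 214013 and 2531011; the class and function names are made up for illustration. It also assumes each packet timestamp equals the seed exactly, which may be off by a second in practice.

```python
import struct


class MsvcRand:
    """Emulates srand/rand of the Microsoft C runtime (assumed to be the one in use)."""

    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def rand(self):
        self.state = (self.state * 214013 + 2531011) & 0xFFFFFFFF
        return (self.state >> 16) & 0x7FFF


def xor_blocks(data, seeds, block_size=2048):
    """XOR every whole dword of each block with rand(), reseeding once per block."""
    out = bytearray(data)
    for n, seed in enumerate(seeds):
        rng = MsvcRand(seed)
        start = n * block_size
        end = min(start + block_size, len(out))
        for off in range(start, end - 3, 4):
            (dword,) = struct.unpack_from('<I', out, off)
            struct.pack_into('<I', out, off, dword ^ rng.rand())
    return bytes(out)


# Round trip on dummy data; the seeds are the first three values recovered from the capture.
seeds = [1460329071, 1460329074, 1460329077]
plain = b'\xff\xd8\xff\xe0' + bytes(4096)
cipher = xor_blocks(plain, seeds)
print(xor_blocks(cipher, seeds) == plain)  # True
print(MsvcRand(1).rand())                  # 41
```

Because XOR is its own inverse, the same function serves for both encryption and decryption.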

First we need to get the timestamps of the relevant packets. This can be done by setting a filter and exporting the resultant packets to a new pcap file as shown in Fig. 5

Fig 5: Applying a display filter in wireshark

Then a python script to get the timestamps and to carve out the jpeg.

import pyshark
import cStringIO

cap = pyshark.FileCapture('filtered.pcap')
buf = cStringIO.StringIO()
rand_seeds = []

for pkt in cap:
    buf.write(pkt.data.data.decode('hex')) # pkt.layers[3].data
    rand_seeds.append(int(float(pkt.frame_info.time_epoch))) 

open('encrypted.jpg', 'wb').write(buf.getvalue())
print rand_seeds

Once we have the encrypted jpeg and the seed values, we can create the decrypter in C.

#include <stdlib.h>
#include <stdio.h>
#include <conio.h>

int main(void)
{
 int rand_seeds[] = {1460329071, 1460329074, 1460329077, 1460329080, 1460329084, 
  1460329084, 1460329087, 1460329090, 1460329091, 1460329093, 1460329096, 
  1460329100, 1460329102, 1460329105, 1460329109, 1460329111, 1460329112, 
  1460329114, 1460329115, 1460329117, 1460329121, 1460329121, 1460329123, 
  1460329125, 1460329128, 1460329131, 1460329133, 1460329135, 1460329136, 
  1460329140, 1460329141, 1460329142, 1460329145, 1460329148, 1460329150, 
  1460329154, 1460329154, 1460329154, 1460329158, 1460329158, 1460329160, 
  1460329161, 1460329161, 1460329165, 1460329165, 1460329167, 1460329168, 
  1460329170, 1460329171, 1460329172, 1460329173, 1460329174, 1460329175, 
  1460329178, 1460329180, 1460329182, 1460329182, 1460329184, 1460329185, 
  1460329186, 1460329187, 1460329190, 1460329191, 1460329192, 1460329194, 
  1460329197, 1460329197, 1460329198, 1460329199, 1460329200, 1460329202, 
  1460329202, 1460329203, 1460329203, 1460329205, 1460329207, 1460329210, 
  1460329210, 1460329213, 1460329215, 1460329217, 1460329218, 1460329220, 
  1460329220, 1460329222, 1460329225, 1460329227, 1460329230, 1460329232, 
  1460329234};

 FILE *inf = fopen("encrypted.jpg", "rb");
 FILE *outf = fopen("decrypted.jpg", "wb");
 fseek(inf, 0, SEEK_END);
 long size = ftell(inf);
 rewind(inf);

 for (int i = 0; i < 90; i++)
 {
  srand(rand_seeds[i]);
  for (int j = 0; j < (size < 2048?size/4:2048/4); j++)
  {
   unsigned int data;
   fread(&data, 4, 1,  inf);
   data ^= rand();
   fwrite(&data, 4, 1, outf);
  }
  size -= 2048;
 }
 /* flush and close both files before waiting for a key press */
 fclose(inf);
 fclose(outf);
 getch();
}

Running this we get the decrypted jpeg as shown in Fig 6. The flag is flag{1_n33d_my_t00Lb4r5}.

Fig 6: The flag

A punched card reader in javascript

2016-10-30T06:36:00.001+00:00

While trying out the Ekoparty CTF 2016, I came across a challenge which required decoding a series of punched card images. A punched card looks like this

Fig 1: A sample punch card

The small white blocks denote the punch. All of the card images were specifically generated by an online service - Card punch. However, on searching for a tool which reads such images, I found nothing, hence I developed a small web app which can read in these punched card images. Please do note the card image must have exact dimensions of 588 x 264 or else the tool won't work. You can try out the tool here or below. The source is on github.

Pyinstaller Extractor updated to v1.6

2016-09-07T08:51:00.003+00:00

PyInstaller Extractor is a tool to extract the contents of a windows executable file created by pyinstaller. This weekend I updated the tool to version 1.6. The new features which were incorporated include

Support for extracting pyinstaller 3.2
Extractor would now use a random name for extracting unnamed files within the CArchive
Preliminary support for handling encrypted pyz archives

The features are explained below.

Support for extracting pyinstaller 3.2

There have not been any format changes between pyinstaller 3.2 and the earlier versions. The previous version of the extractor works as is for pyinstaller 3.2.

Handling unnamed files within CArchive

A Pyinstaller exe file can be visualized as two nested archives. The outer layer is called the CArchive. It is so called because it is handled by C code, i.e. the decompression of this layer is handled by a native stub written in C. The CArchive in turn contains another archive called the PYZArchive along with other files. The PYZArchive is usually zlib compressed and is handled by python code, hence its name.

The CArchive usually contains the main script along with necessary dll files and python extension modules (pyd files). When running a pyinstaller exe, all dll and pyd files are written to a temporary directory to facilitate loading. (This behaviour is noticeably different from py2exe which loads dlls from memory.) The main script is never written to disk and hence it is possible to remove its name. If such is the case then the earlier versions of extractor would fail. The current version 1.6 will use a random name if it finds any unnamed files.

This feature has mainly been inspired while working on the PAN LabyREnth challenge (Threat #7).

Preliminary support for handling encrypted pyz archives

The files within the PYZ archive can be encrypted too. If such is the case, the tool would dump those files as is without attempting to process them. Previously the tool would fail trying to decompress encrypted data.

That's it. I would make a separate post to demonstrate how to extract encrypted pyz archives. It isn't difficult as the key to decrypt is present right in the CArchive.

LabyREnth CTF WriteUp - Random track

2016-08-20T10:43:00.002+00:00

Attempting the Labyrenth challenges was an interesting experience. I completed three tracks - Windows, Docs & Random, and the others were left halfway. Among all the tracks, the random track was more interesting particularly due to the last python challenge.

Level 1 - Java

The challenge consists of a jar file. So it seems, we have to reverse java bytecode. The interesting thing is the jar will only run with Java 9 (currently in beta) as it uses StringConcatFactory. Attempting to decompile the file with Jad, Procyon, or CFR fails. This is due to the fact that it uses a new opcode InvokeDynamic. This is a fairly new opcode and as of now, the java compiler does not emit this opcode. It exists to support other dynamic languages running on the JVM.

Only krakatau was able to decompile the file properly, although it too doesn't handle the InvokeDynamic opcode.

public class omg {
    String username;
    String levelFlag;
    
    public omg(String s)
    {
        super();
        this.levelFlag = "W5SSA2DPOBSSA6LPOUQGK3TKN54SA5DINFZSASTBOZQSAYLQOAQGC4ZAO5QXE3JNOVYC4ICUNBSSA4TFON2CA53JNRWCAYTFEBWXKY3IEBWW64TFEBZWK6DDNF2GS3THEEQFI2DFEBTGYYLHEBUXGICQIFHHWRBQL5MTA5K7IV3DG3S7IJQXGZJTGJ6Q!";
        this.username = s;
        if (s.contains((CharSequence)(Object)"-isDrunk"))
        {
            String[] a = s.split("-");
            int i = a[1].charAt(2);
            int i0 = Character.toUpperCase((char)i);
            int i1 = (char)(i0 ^ 15);
            this.levelFlag = /*invokedynamic*/null;
            StringBuilder a0 = new StringBuilder(this.levelFlag);
            int i2 = 0;
            while(i2 < 4)
            {
                Object[] a1 = new Object[1];
                int i3 = a[1].charAt(a[1].length() - 1);
                int i4 = (short)i3;
                a1[0] = Short.valueOf((short)i4);
                int i5 = (char)(Integer.parseInt(String.format("%04x", a1), 16) ^ 166);
                a0.append((char)i5);
                i2 = i2 + 1;
            }
            System.out.println(a0.toString());
            this.levelFlag = a0.toString();
        }
        else
        {
            int i6 = 0;
            while(i6 < s.length() / 2)
            {
                String s0 = this.levelFlag;
                int i7 = s.charAt(i6);
                int i8 = s.charAt(s.length() - 1);
                this.levelFlag = s0.replace((char)i7, (char)i8);
                i6 = i6 + 1;
            }
        }
    }
    
    public String getLevelFlag()
    {
        return this.levelFlag;
    }
    
    public static void main(String[] a)
    {
        omg a0 = new omg(System.getenv("Admin"));
        System.out.println(/*invokedynamic*/null);
    }
}

There is a whole bunch of decoy code. The actual flag can be found by base32 decoding the levelFlag:

W5SSA2DPOBSSA6LPOUQGK3TKN54SA5DINFZSASTBOZQSAYLQOAQGC4ZAO5QXE3JNOVYC4ICUNBSSA4TFON2CA53JNRWCAYTFEBWXKY3IEBWW64TFEBZWK6DDNF2GS3THEEQFI2DFEBTGYYLHEBUXGICQIFHHWRBQL5MTA5K7IV3DG3S7IJQXGZJTGJ6Q

Fig 1: Level 1 flag

This gives our flag: PAN{D0_Y0u_Ev3n_Base32}
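
For reference, the decoding step can be done with the Python 3 standard library. The helper below is a sketch: it drops the trailing "!" that appears in the decompiled string literal and restores any missing "=" padding. The sample string is generated on the spot and is not the challenge data.

```python
import base64


def b32decode_unpadded(text):
    """Decode base32 text that may lack '=' padding or carry a trailing '!'."""
    text = text.strip().rstrip('!')
    # base32 input must be a multiple of 8 characters long.
    text += '=' * (-len(text) % 8)
    return base64.b32decode(text)


# Made-up sample, encoded here just to exercise the helper.
sample = base64.b32encode(b'PAN{example}').decode().rstrip('=') + '!'
print(b32decode_unpadded(sample))  # b'PAN{example}'
```

Passing the levelFlag string from the decompiled class to this helper should yield the decoded message containing the flag.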

Level 2 - Regex

This challenge looks scary at first sight. A regular expression is given. Our task is to find a string that does not match the regex. Netcatting the string to 52.27.101.106 would give the flag.

Fig 2: That looks scary, doesn't it?

The regex seems to match everything ever thrown at it. It has 625 clauses making it unfit for manual analysis. A regex debugger like RegexBuddy or regex101 is handy for such situations.

Let's look at clause 1: .*[^0mglo8sc1enC3].*

This matches any run of characters, followed by one character which is not in the negation list, followed by any run of characters. In other words, it matches any string containing at least one character outside the list. Hence any string consisting of only characters present in the negation list will not match this clause.

Clause 2: .{,190}
Clause 3: .{192,}

The 2nd clause matches any string of length 190 or lower. The 3rd clause matches any string of length 192 or above. Combining these two clauses, for a non-match our string should be exactly 191 characters long, drawn from the list in clause 1.

The remaining clauses of interest are clauses 124 and 341.

Fig 3: Clause 124

Fig 4: Clause 341

Clause 124 matches a string of length 190 chars followed by one of e0nlCo3c8. Similarly clause 341 matches a string of length 190 followed by one of mg1.

Combining the above clauses the string which does not match the regex consists of 190 g followed by a solitary s.

ggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggs
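
The reasoning can be double checked with a few lines of Python 3. This sketch covers only the five clauses discussed here, not all 625, and the patterns for clauses 124 and 341 are reconstructed from their description, so their exact form is an assumption. Each clause is assumed to be tested against the whole string.

```python
import re

clauses = [
    r'.*[^0mglo8sc1enC3].*',  # clause 1
    r'.{,190}',               # clause 2
    r'.{192,}',               # clause 3
    r'.{190}[e0nlCo3c8]',     # clause 124 (assumed form, based on the description)
    r'.{190}[mg1]',           # clause 341 (assumed form, based on the description)
]


def matches_any(candidate):
    """True if at least one clause matches the entire candidate string."""
    return any(re.fullmatch(clause, candidate) for clause in clauses)


print(matches_any('g' * 190 + 's'))  # False
print(matches_any('g' * 191))        # True
```

The second call shows why the last character matters: a string of 191 g characters is caught by clause 341.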

Netcatting this to 52.27.101.106 gives the flag PAN{th4t5_4_pr311y_dum8_w4y_10_us3_r3g3x}

Fig 5: Level 2 flag

Level 3 - Pcap

A pcap file is provided. Loading it in Wireshark reveals something interesting with the tcp sequence numbers of the SYN packets. The sequence numbers start with PK\03\04 which is also the magic signature for zip files. We can set a display filter which will only display the SYN packets. The same can be used to save the filtered packets to a new pcap file.

Fig 6: ZIP signature in tcp sequence number

To assemble the zip file by combining the sequence numbers, we can use scapy.

from scapy.all import *
from binascii import unhexlify
import cStringIO

def main():
    buf = cStringIO.StringIO()
    with PcapReader('SYN-filtered.pcap') as pcap_reader:
        for pkt in pcap_reader:
            buf.write(unhexlify(format(pkt.seq, 'x').zfill(8)))
    open('extracted.zip', 'wb').write(buf.getvalue())
            
if __name__ == '__main__':
    main()

This creates a new file extracted.zip. This zip contains 853 files each containing base64 encoded text.

Fig 7: Contents of zip file

Our task is to join the files in the correct order to form a huge base64 text blob. Decoding that text blob should give us our flag. The padding character used in base64 is =. Grepping for the = character within the extracted BIN files gives one hit 339.bin. This should therefore be the last file in the combined blob.

Fig 8: 339.bin

The other point to note is that the contents of the files overlap. Fig 9 shows the part of 531.bin that overlaps with 339.bin.
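The overlap idea can be illustrated with a small Python 3 sketch. The function name overlap_len and the sample strings are made up for illustration; it returns the length of the longest suffix of one chunk that is also a prefix of the next.

```python
def overlap_len(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0

print(overlap_len('QUJDREVG', 'REVGR0hJ'))  # 4, the shared part is 'REVG'
```

With base64 text, very short overlaps can occur by chance, so preferring the longest match (as this sketch does) is the safer choice.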

Fig 9: Overlapped part

We can write a python script which would find the order of assembling the files by searching for the overlapped part.

import re

def findMatch(dct, txt):
    for i in xrange(len(txt)-1, 0, -1):
        reg = re.compile(re.escape(txt[0:i])+'$')

        for key in dct.keys():
            if reg.search(dct[key]):
                return (key, i)

    return (None, -1)


def main():
    contents = dict()

    for i in xrange(853):
        fname = '%d.txt' %i
        txt = open(fname).read()
        contents[fname] = txt

    nm = '339.txt'
    print nm

    while True:
        txt = contents[nm]
        result, matchlen = findMatch(contents, txt)

        if result is None:
            break
            
        print result, matchlen
        del contents[nm] 
        nm = result


if __name__ == '__main__':
    main()

Running this gives the order of joining the files in reverse, i.e. the first line (339.txt) is the last file and the last line (659.txt) is the first file. Every line after the first contains two space separated values: the file name and the number of overlapping characters. The full list is available as a gist.

Based on the join order, another python script can assemble the files.

f = open('combined.txt', 'w')
lines = open('random-lev3-joinorder.txt', 'r').readlines()

try:
    for line in reversed(lines):
        fname, matchlen = line.split(' ')
        matchlen = int (matchlen)

        txt = open(fname, 'r').read()
        f.write(txt[0:len(txt)-matchlen])

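except ValueError:
    # The 339.txt line of the join order carries no overlap count, so
    # unpacking it fails; it is the last file and is written whole.
    f.write(open('339.txt', 'r').read())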

f.close()

The assembled file, combined.txt is a base64 text blob. Decoding it results in a zip file. The zip file has several images within.
One of them troll_cloud.png contains the flag: PAN{YouDiD4iT.GREATjob}
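The decoding step can be sketched in Python 3 as follows. The helper name list_zip_from_base64 is hypothetical; it decodes a base64 blob in memory and lists the members of the resulting zip archive.

```python
import base64
import io
import zipfile

def list_zip_from_base64(b64_text):
    """Decode a base64 blob and return the member names of the zip inside it."""
    raw = base64.b64decode(b64_text)
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.namelist()
```

Calling it on the contents of combined.txt should list the images in the archive, assuming the chunks were joined in the right order.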

Fig 10: Level 3 flag

Level 4 - PHP

The challenge consists of solving a maze. The maze is coded in PHP and is heavily obfuscated. To run it, I had to set up a WAMP server in my Windows virtual machine. The actual PHP code which we are after is dynamically generated and eval'd. Getting around this obfuscation was easy as PHP was configured with display_errors enabled, which made it spit out the eval'd PHP code as a warning message.

Fig 11: Dynamically generated php code

To copy the de-obfuscated php code, we need to use a tool like fiddler. We cannot directly copy the php code from the warning message as the characters are not escaped properly. The deobfuscated source can be found here. From this point it is a matter of manual source code analysis to find the correct path for solving the maze. Solving the maze gives the flag:

PAN{Life is a maze of complications. Also, puppets are sometimes involved. Deal with it.}

Fig 12: Level 4 flag

Level 5 - Python

Fig 13: Level 5 challenge - Crack APT Maker Pro

This is the final challenge in the Random track and is the most interesting. As the description of the challenge points out, we really "have to be a snake charmer to crack the newest version of APT Maker Pro". The file is a 720 KB Python script which contains a zlib compressed marshalled code object that is dynamically executed by the exec statement as shown in Fig 14.

Fig 14: Dynamically executed payload

To proceed, we need the decompiled payload from the compressed code object. This is easy and my tool Easy Python Decompiler is sufficient for the task. First, decompress the zlib data into a new file, add the python magic header 03 F3 0D 0A 00 00 00 00 at the beginning, and finally feed it to the decompiler to get our decompiled source as shown in Fig 15.
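A minimal Python 3 sketch of that preparation step is given below. The function name make_pyc is hypothetical, and the 8-byte header is the Python 2.7 magic plus a zeroed timestamp as quoted above; how the compressed bytes are pulled out of the script is left to the reader.

```python
import zlib

# Python 2.7 magic number followed by a zeroed 4-byte timestamp
PY27_HEADER = b'\x03\xf3\x0d\x0a' + b'\x00' * 4

def make_pyc(compressed):
    """Decompress a zlib blob and prepend a Python 2.7 pyc header."""
    return PY27_HEADER + zlib.decompress(compressed)
```

Writing the returned bytes to a file in binary mode gives something a Python 2.7 decompiler can open.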

Fig 15: Decompiled payload

However that is not all. The decompiled payload contains a base64 encoded, zlib compressed code object which is executed by the exec statement just like the previous case. In addition to this there is a piece of RC4 encrypted data labelled as malware as shown in Fig 16. Thus the actual challenge is to find the correct RC4 decryption key. To find the key, we need to inspect the second level payload just found.

Fig 16: The malware blob

The difficulty at this stage increases rapidly. We cannot decompile the second level payload as it has been obfuscated. A deep knowledge of CPython internals is necessary to proceed. Since the payload is no longer decompilable, we need to inspect it manually. Let's inspect the code object.

>>> import marshal
>>> f = open('level2.pyc', 'rb')
>>> f.seek(8)
>>> co = marshal.load(f)
>>> co.co_consts
(<code object verify_license at 00AAB2F0, file "", line -1>, None)

The code object contains another nested code object, verify_license, which sounds interesting. Let's dump it to a new file.

>>> of = open('level3.pyc', 'wb')
>>> of.write('\x03\xF3\x0D\x0A' + '\x00' * 4)
>>> marshal.dump(co.co_consts[0], of)
>>> of.close()

Now we need to analyze the 3rd level payload, level3.pyc. Like level2.pyc, it is too obfuscated to decompile. Let's run some preliminary analysis.

>>> f = open('level3.pyc', 'rb')
>>> f.seek(8)
>>> co=marshal.load(f)
>>> len(co.co_consts)
37173
>>> len(co.co_names)
37124
>>> len(co.co_code)
144060

That's more than 37k constants and names! In addition, the size of the bytecode that actually gets executed is over 144k. That's insane! Nobody would want to analyze such a file manually without tools, and luckily we do have tools. Out of the 37173 constants, the first 37121 just store None and are redundant. We can write a quick Python script to find this.

>>> for i in xrange(len(co.co_consts)):
...     if co.co_consts[i] is not None:
...             print i
...             break
...
37121

The constant at index 37122 stores a PNG file, which looks interesting.

>>> co.co_consts[37122][:6]
'\x89PNG\r\n'

Fig 17: Embedded PNG file

The remaining constants store various integers and are not of much interest. Let's shift our focus to the 144k long co_code and disassemble the very first instruction.

>>> import opcode
>>> opcode.opname[ord(co.co_code[0])]
'EXTENDED_ARG'

That's the EXTENDED_ARG opcode. In normal Python code it is very rare to encounter this opcode. It is only generated if the operand of an instruction cannot fit in 2 bytes, which happens in rare situations such as a jump spanning more than 65,535 bytes or a code object with more than 65,536 constants. The actual opcode on which EXTENDED_ARG operates is located at an offset of +3. Let's see what it is.
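As a rough illustration of how the prefix works in Python 2.7 bytecode, the Python 3 sketch below folds one EXTENDED_ARG into the operand of the instruction that follows it. The helper names and the six-byte sample stream are made up; the opcode numbers are the CPython 2.7 values.

```python
EXTENDED_ARG = 145  # opcode value in CPython 2.7
JUMP_FORWARD = 110  # opcode value in CPython 2.7

def read_arg(code, offset):
    """Return the 16-bit little-endian operand following the opcode at offset."""
    return code[offset + 1] | (code[offset + 2] << 8)

def effective_arg(code, offset):
    """Fold one EXTENDED_ARG prefix at offset into the next instruction's operand."""
    assert code[offset] == EXTENDED_ARG
    high = read_arg(code, offset)      # upper 16 bits come from the prefix
    low = read_arg(code, offset + 3)   # lower 16 bits come from the real instruction
    return (high << 16) | low

# Hand-made stream: EXTENDED_ARG 0x0001 followed by JUMP_FORWARD 0x0002
sample = bytes([EXTENDED_ARG, 0x01, 0x00, JUMP_FORWARD, 0x02, 0x00])
print(hex(effective_arg(sample, 0)))  # 0x10002
```

A chain of prefixes repeats the same shift-and-or step, which is why only the last 32 bits matter once the value is masked.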

>>> opcode.opname[ord(co.co_code[3])]
'EXTENDED_ARG'

That's even stranger. We expected to see a real opcode here. If we continue this way, we will see a huge chain of EXTENDED_ARG opcodes, and the final instruction on which the chain operates is a JUMP_FORWARD which, as the name suggests, increments the IP by an offset.

>>> opcode.opname[ord(co.co_code[144051])]
'EXTENDED_ARG'
>>> opcode.opname[ord(co.co_code[144054])]
'EXTENDED_ARG'
>>> opcode.opname[ord(co.co_code[144057])]
'JUMP_FORWARD'

To find out the target offset of the jump we need to write a python script.

import marshal

f=open('level3.pyc', 'rb')
f.seek(8)
co=marshal.load(f)
f.close()

i = 0
arg = 0
while i < len(co.co_code):
 arg = (arg << 16) | ord(co.co_code[i+1]) | (ord(co.co_code[i+2]) << 8)
 arg = arg & 0xFFFFFFFF  
 i += 3

print hex(arg)

Running this gives us the target offset, which is 0xfffdcd45 or -144,059 when read as a signed 32-bit value. That is, instead of jumping forward it jumps backward within the instruction stream. The obfuscation applied is akin to the overlapping instruction obfuscation found in native x86 executables.

Now the size of the instruction stream (co_code) is 144060 and a 144059 long backward jump from the rear leads to the second byte. If we disassemble this we uncover a hidden series of instructions stitched together with JUMP_FORWARDs.
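The arithmetic can be checked with a short Python 3 sketch. The helper names are hypothetical, and it assumes that the interpreter treats the combined operand as a signed 32-bit integer and that a JUMP_FORWARD is relative to the instruction following it (3 bytes later in Python 2.7), which is what the result above implies.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def jump_forward_target(jump_offset, raw_arg):
    """Target of a JUMP_FORWARD located at jump_offset with the given raw operand."""
    return jump_offset + 3 + to_signed32(raw_arg)

print(to_signed32(0xfffdcd45))                  # -144059
print(jump_forward_target(144057, 0xfffdcd45))  # 1
```

The target of 1 is the second byte of co_code, matching the disassembly that follows.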

>>> opcode.opname[ord(co.co_code[1])]
'LOAD_NAME'
>>> opcode.opname[ord(co.co_code[4])]
'NOP'
>>> opcode.opname[ord(co.co_code[5])]
'JUMP_FORWARD'

We need to uncover these hidden instructions and join them into one stream after removing the NOP and JUMP_FORWARD instructions used for stitching them together. Another Python script to the rescue.

# level3 extract code

import marshal
import opcode
import types

cleaned_bytecode = []


def clean(opkode, arg1, arg2):
    if opcode.opname[opkode] == 'JUMP_FORWARD' or opcode.opname[opkode] == 'NOP':
        return

    else:
        if opkode >= opcode.HAVE_ARGUMENT:
            cleaned_bytecode.append(opkode)
            cleaned_bytecode.append(arg1)
            cleaned_bytecode.append(arg2)
        else:
            cleaned_bytecode.append(opkode)


def printline(offset, opname, arg):
    if opname == 'JUMP_FORWARD' or opname == 'NOP':
        return
    if arg is not None:
        print 'loc_%06d: %s %d' %(offset, opname, arg)
    else:
        print 'loc_%06d: %s' %(offset, opname)


def modifyCodeStr(code_obj):
    co_argcount = code_obj.co_argcount
    co_nlocals = code_obj.co_nlocals
    co_stacksize = code_obj.co_stacksize
    co_flags = code_obj.co_flags

    # new code string
    co_codestring = ''.join(map(chr, cleaned_bytecode))

    # Replace png file contents to facilitate decompiling
    co_constants = list(code_obj.co_consts)
    co_constants[37122] = 'PNG FILE HERE'
    co_constants = tuple(co_constants)

    co_names = code_obj.co_names
    co_varnames = code_obj.co_varnames
    co_filename = code_obj.co_filename
    co_name = code_obj.co_name
    co_firstlineno = code_obj.co_firstlineno
    co_lnotab = code_obj.co_lnotab

    return types.CodeType(co_argcount, co_nlocals, co_stacksize, \
                          co_flags, co_codestring, co_constants, co_names, \
                          co_varnames, co_filename, co_name, co_firstlineno, co_lnotab)

def main(): 
    f=open('level3.pyc', 'rb')
    f.seek(8)
    co=marshal.load(f)
    f.close()
    kode = map(ord, list(co.co_code))
    offset = 1

    while offset < len(kode):
        opkode = kode[offset]
        opname = opcode.opname[opkode]

        if opkode >= opcode.HAVE_ARGUMENT:
            arg1 = kode[offset+1]
            arg2 = kode[offset+2]
            arg = (arg2 << 8) | arg1 # Little endian
            printline(offset, opname, arg)
            offset += 3

        else:
            arg = arg1 = arg2 = None
            printline(offset, opname, arg)
            offset += 1

        clean(opkode, arg1, arg2)

        if opname == 'JUMP_FORWARD':
            offset += arg

        elif opname == 'RETURN_VALUE':
            break

    newCodeObj = modifyCodeStr(co)
    f = open('level3_deobf.pyc', 'wb')
    f.write('\x03\xF3\x0D\x0A' + '\x00'*4)
    marshal.dump(newCodeObj, f)
    f.close()


if __name__ == '__main__':
    main()

The hidden instruction stream can be found here. The Python script above stitches the hidden code together and replaces the PNG file contents with a dummy string to facilitate decompiling, as otherwise the decompiler would choke. Let's decompile the produced level3_deobf.pyc, selecting pycdc as the engine. It gives the following code.

# File: l (Python 2.7)

   = license_key[0]
    = 'PNG FILE HERE'[542]
exec       =    ==    

   = license_key[1]
    = 'PNG FILE HERE'[379]
exec       =    ==    

   = license_key[2]
    = 'PNG FILE HERE'[1020]
exec       =    ==    

   = license_key[3]
    = 'PNG FILE HERE'[457]
exec       =    ==    

   = license_key[4]
    = 'PNG FILE HERE'[203]
exec       =    ==    

   = license_key[5]
    = 'PNG FILE HERE'[203]
exec       =    ==    

   = license_key[6]
    = 'PNG FILE HERE'[39]
exec       =    ==    

   = license_key[7]
    = 'PNG FILE HERE'[379]
exec       =    ==    

   = license_key[8]
    = 'PNG FILE HERE'[65]
exec       =    ==    

   = license_key[9]
    = 'PNG FILE HERE'[54]
exec       =    ==    

   = license_key[10]
    = 'PNG FILE HERE'[379]
exec       =    ==    

   = license_key[11]
    = 'PNG FILE HERE'[40]
exec       =    ==    

   = license_key[12]
    = 'PNG FILE HERE'[262]
exec       =    ==    

   = license_key[13]
    = 'PNG FILE HERE'[54]
exec       =    ==    

   = license_key[14]
    = 'PNG FILE HERE'[379]
exec       =    ==    

   = license_key[15]
    = 'PNG FILE HERE'[250]
exec       =    ==    

   = license_key[16]
    = 'PNG FILE HERE'[704]
exec       =    ==    

   = license_key[17]
    = 'PNG FILE HERE'[1110]
exec       =    ==    

   = license_key[18]
    = 'PNG FILE HERE'[141]
exec       =    ==    

   = license_key[19]
    = 'PNG FILE HERE'[379]
exec       =    ==    

   = license_key[20]
    = 'PNG FILE HERE'[65]
exec       =    ==    

   = license_key[21]
    = 'PNG FILE HERE'[54]
exec       =    ==    

   = license_key[22]
    = 'PNG FILE HERE'[285]
exec       =    ==    

   = license_key[23]
    = 'PNG FILE HERE'[1215]
exec       =    ==    

   = license_key[24]
    = 'PNG FILE HERE'[840]
exec       =    ==    

  =   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &   &

The variable names are missing, but it is fairly evident what the code does. It compares the characters of the license key with some bytes of the PNG file. For success, each of these checks must succeed. Joining the characters we get the license key 1_W4nnA_b3_Th3_vERy_b3ST!. Feeding this, APT Maker Pro becomes registered as shown in Fig 18.
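Joining the characters can be automated. The Python 3 sketch below uses the indices copied from the decompiled listing; recover_key is a hypothetical helper, and png_data is assumed to hold the bytes of the embedded PNG, extracted separately from the Python 2 code object.

```python
# Byte offsets into the PNG, in license_key order, taken from the listing above
INDICES = [542, 379, 1020, 457, 203, 203, 39, 379, 65, 54, 379, 40, 262,
           54, 379, 250, 704, 1110, 141, 379, 65, 54, 285, 1215, 840]

def recover_key(png_data, indices=INDICES):
    """Build the license key from the PNG bytes at the given offsets."""
    return ''.join(chr(png_data[i]) for i in indices)
```

Note how repeated offsets such as 379 line up with the repeated underscores in the key.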

Fig 18: APT Maker Pro is licensed

Clicking on Generate APT drops the malware payload as EVIL_MALWARE_ CYBER_PATHOGEN .pyc. Decompiling it we get the file containing the flag for this level

PAN{l1Ke_n0_oN3_ev3r_Wa5}

Fig 19: The flag

Solving the weasel keygenme by kao

2016-08-16T08:44:00.001+00:00

Over at tuts4you kao posted a nice .net keygenme. The ultimate goal is developing a keygen which can generate keys for any name within a second.

Fig 1: Know thy goal

Although the post is titled "Solving the weasel keygenme", I did not get as far as developing a working keygen, as required for the much coveted gold medal. However, the algorithm that kao implements is quite interesting to document and that is the reason for this blog post.

Solving the keygenme consists of two parts -- First to devirtualize the virtual machine and second to reverse the crypto.

De-virtualizing the VM

Dropping the file in dnSpy reveals that method and variable names have been removed, as in Fig 2.

Fig 2: Wazzup with those names?

The obfuscation can be removed by processing the file with de4dot which leads to a slightly better readable code as in Fig 3.

Fig 3: Not anonymous anymore

Now with the obfuscation out of the way, it's time to deal with the vm. A vm based software protection works by transforming the original instructions to a new set of instructions which only a custom vm understands. Executing the new set of instructions under the supervision of the vm is expected to produce the same output as executing the original instructions without the vm. The vm in this keygenme is implemented no differently.

The virtual machine is implemented in a class named SillyVM. The constructor of this class initializes the opcode handler table as shown in Fig 4. There are 16 instructions that the vm understands.

Fig 4: Initializing the vm

Immediately, after setting up the opcode handler table the constructor initializes an array which contains the actual instructions of the vm as shown in Fig 5.

Fig 5: The instructions of the vm

The next task is to understand the semantics of each of the 16 handlers. This is a manual task but it cannot be avoided. Understanding the operation of the handlers is crucial in comprehending the operation of the vm. Luckily, this is not too difficult a task, as the handlers are short and simple. The handlers can then be renamed as per their semantics. As shown in Fig 6, we have renamed three of the handlers to JumpIfLess, JumpIfZero, Multiply.

Fig 6: Renaming the handlers

The complete set of handlers along with their associated semantics is listed in the following table.

INSTRUCTION	OPCODE	FORMAT	SEMANTICS
Add	0	Add b1, b2	arr[b2] += arr[b1]
JumpIfEqual	1	JumpIfEqual b1, b2, b3	If (arr[b1] == b2) then IP = b3
NewBitArray	2	NewBitArray b1, b2	bitarr[b1] = new BitArray(b2)
JumpIfLess	3	JumpIfLess b1, b2, b3	If (arr[b1] < b2) then IP = b3
JumpIfLargePassw	4	JumpIfLargePassw b1, b2	If (arr[b1] < passw.Length) then IP = b2
Jump	5	Jump b1	IP = b1
JumpIfZero	6	JumpIfZero b1, b2	If (arr[b1] == 0) then IP = b2
BitToInt	7	BitToInt b1, b2, b3	arr[b3] = (bitarr[b1][arr[b2]] == true ? 1 : 0)
Increment	8	Increment b1	arr[b1]++
IndexOf	9	IndexOf b1, b2	arr[b2] = hardc.IndexOf(passw[arr[b1]])
SumNumOne	10	SumNumOne b1, b2	arr[b2] = arr[b1] + 1
Multiply	11	Multiply b1, b2	arr[b2] *= arr[b1]
BitArrayCopy	12	BitArrayCopy b1	bitarr[b1].CopyTo(ans)
IntToBit	13	IntToBit b1, b2, b3	bitarr[b1][arr[b2]] = arr[b3] & 1
RShr	14	RShr b1	arr[b1] >>= 1
SetZero	15	SetZero b1	arr[b1] = 0
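The operand counts in the table can also be captured as data. The Python 3 sketch below is a hypothetical table-driven decoder, not the disassembler used in this post; it only splits the instruction list into (offset, name, operands) tuples and leaves the C# emission aside.

```python
# (handler name, number of operand bytes), indexed by opcode as in the table
HANDLERS = [
    ('Add', 2), ('JumpIfEqual', 3), ('NewBitArray', 2), ('JumpIfLess', 3),
    ('JumpIfLargePassw', 2), ('Jump', 1), ('JumpIfZero', 2), ('BitToInt', 3),
    ('Increment', 1), ('IndexOf', 2), ('SumNumOne', 2), ('Multiply', 2),
    ('BitArrayCopy', 1), ('IntToBit', 3), ('RShr', 1), ('SetZero', 1),
]

def decode(instr):
    """Yield (offset, name, operands) for each vm instruction in instr."""
    ip = 0
    while ip < len(instr):
        name, nargs = HANDLERS[instr[ip]]
        yield ip, name, instr[ip + 1:ip + 1 + nargs]
        ip += 1 + nargs

for entry in decode([2, 2, 56, 15, 41, 15, 15, 5, 36]):
    print(entry)
```

The sample list is the first nine bytes of the vm program, so the offsets printed are 0, 3, 5 and 7.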

As per the above table, we can code a disassembler in python which takes in the instruction list and emits the de-virtualized C# code.

instr = [2,2,56,15,41,15,15,5,36,9,15,8,1,8,255,34,15,1,
5,30,13,2,41,8,8,41,14,8,8,1,3,1,5,20,8,15,4,15,
9,2,1,40,15,61,15,55,15,24,5,134,7,0,61,36,8,61,
15,18,5,78,7,0,61,19,6,19,74,7,2,18,19,0,19,36,8,
61,8,18,3,18,50,60,15,11,5,122,10,11,47,5,116,7,0,
61,19,6,19,112,7,2,11,19,7,2,47,57,11,19,57,0,57,
36,8,61,8,47,3,47,50,91,8,11,3,11,49,86,13,1,55,
36,8,55,8,24,3,24,40,50,12,1]


def disassemble():
 ip = 0
 while ip < len(instr):
  opcode = instr[ip]
  print 'loc_%03d:\t' %(ip),
  ip += 1

  # Add
  if opcode == 0:
   b1 = instr[ip]
   b2 = instr[ip+1]
   ip += 2
   print 'arr[%d] += arr[%d];' %(b2, b1)
   
  # JumpIfEqual
  elif opcode == 1:
   b1 = instr[ip]
   b2 = instr[ip+1]
   b3 = instr[ip+2]
   ip += 3
   print 'if (arr[%d] == (int)((sbyte)%d)) goto loc_%03d;' %(b1, b2, b3)

  # NewBitArray
  elif opcode == 2:
   b1 = instr[ip]
   b2 = instr[ip+1]
   ip += 2
   print 'bitarr[%d] = new BitArray(%d);' %(b1, b2)
  
  # JumpIfLess
  elif opcode == 3:
   b1 = instr[ip]
   b2 = instr[ip+1]
   b3 = instr[ip+2]
   ip += 3
   print 'if ((long)arr[%d] < (long)((ulong)%d)) goto loc_%03d;' %(b1, b2, b3)
  
  # JumpIfLargePassw
  elif opcode == 4:
   b1 = instr[ip]
   b2 = instr[ip+1]
   ip += 2
   print 'if (arr[%d] < password.Length) goto loc_%03d;' %(b1, b2)

  # Jump
  elif opcode == 5:
   b1 = instr[ip]
   ip += 1
   print 'goto loc_%03d;' %(b1)

  # JumpIfZero
  elif opcode == 6:
   b1 = instr[ip]
   b2 = instr[ip+1]
   ip += 2
   print 'if (arr[%d] == 0) goto loc_%03d;' %(b1, b2)

  # BitToInt
  elif opcode == 7:
   b1 = instr[ip]
   b2 = instr[ip+1]
   b3 = instr[ip+2]
   ip += 3
   print 'arr[%d] = (bitarr[%d][arr[%d]] == true ? 1:0);' %(b3, b1, b2)

  # Increment
  elif opcode == 8:
   b1 = instr[ip]
   ip += 1
   print 'arr[%d]++;' %(b1)

  # IndexOf
  elif opcode == 9:
   b1 = instr[ip]
   b2 = instr[ip+1]
   ip += 2
   print 'arr[%d] = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ".IndexOf(password[arr[%d]]);' %(b2, b1)

  # SumNumOne
  elif opcode == 10:
   b1 = instr[ip]
   b2 = instr[ip+1]
   ip += 2
   print 'arr[%d] = arr[%d] + 1;' %(b2, b1)

  # Multiply
  elif opcode == 11:
   b1 = instr[ip]
   b2 = instr[ip+1]
   ip += 2
   print 'arr[%d] *= arr[%d];' %(b2, b1)

  # BitArrayCopy
  elif opcode == 12:
   b1 = instr[ip]
   ip += 1
   print 'bitarr[%d].CopyTo(ans, 0);' %(b1)

  # IntToBit
  elif opcode == 13:
   b1 = instr[ip]
   b2 = instr[ip+1]
   b3 = instr[ip+2]
   ip += 3
   print 'bitarr[%d][arr[%d]] = ((arr[%d] & 1) != 0);' %(b1, b2, b3)

  # RShr
  elif opcode == 14:
   b1 = instr[ip]
   ip += 1
   print 'arr[%d] = (int)((uint)arr[%d] >> 1);' %(b1, b1)

  # SetZero
  elif opcode == 15:
   b1 = instr[ip]
   ip += 1
   print 'arr[%d] = 0;' %(b1)

  else:
   print 'Error'
   break

if __name__ == '__main__':
 disassemble()

Running the disassembler produces the following C# code.

loc_000: bitarr[2] = new BitArray(56);
loc_003: arr[41] = 0;
loc_005: arr[15] = 0;
loc_007: goto loc_036;
loc_009: arr[8] = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ".IndexOf(password[arr[15]]);
loc_012: if (arr[8] == -1) goto loc_034;
loc_016: arr[1] = 0;
loc_018: goto loc_030;
loc_020: bitarr[2][arr[41]] = ((arr[8] & 1) != 0);
loc_024: arr[41]++;
loc_026: arr[8] = (int)((uint)arr[8] >> 1);
loc_028: arr[1]++;
loc_030: if ((long)arr[1] < (long)((ulong)5)) goto loc_020;
loc_034: arr[15]++;
loc_036: if (arr[15] < password.Length) goto loc_009;
loc_039: bitarr[1] = new BitArray(40);
loc_042: arr[61] = 0;
loc_044: arr[55] = 0;
loc_046: arr[24] = 0;
loc_048: goto loc_134;
loc_050: arr[36] = (bitarr[0][arr[61]] == true ? 1:0);
loc_054: arr[61]++;
loc_056: arr[18] = 0;
loc_058: goto loc_078;
loc_060: arr[19] = (bitarr[0][arr[61]] == true ? 1:0);
loc_064: if (arr[19] == 0) goto loc_074;
loc_067: arr[19] = (bitarr[2][arr[18]] == true ? 1:0);
loc_071: arr[36] += arr[19];
loc_074: arr[61]++;
loc_076: arr[18]++;
loc_078: if ((long)arr[18] < (long)((ulong)50)) goto loc_060;
loc_082: arr[11] = 0;
loc_084: goto loc_122;
loc_086: arr[47] = arr[11] + 1;
loc_089: goto loc_116;
loc_091: arr[19] = (bitarr[0][arr[61]] == true ? 1:0);
loc_095: if (arr[19] == 0) goto loc_112;
loc_098: arr[19] = (bitarr[2][arr[11]] == true ? 1:0);
loc_102: arr[57] = (bitarr[2][arr[47]] == true ? 1:0);
loc_106: arr[57] *= arr[19];
loc_109: arr[36] += arr[57];
loc_112: arr[61]++;
loc_114: arr[47]++;
loc_116: if ((long)arr[47] < (long)((ulong)50)) goto loc_091;
loc_120: arr[11]++;
loc_122: if ((long)arr[11] < (long)((ulong)49)) goto loc_086;
loc_126: bitarr[1][arr[55]] = ((arr[36] & 1) != 0);
loc_130: arr[55]++;
loc_132: arr[24]++;
loc_134: if ((long)arr[24] < (long)((ulong)40)) goto loc_050;
loc_138: bitarr[1].CopyTo(ans, 0);

The de-virtualized code thus obtained is semantically equivalent to the original code, but is splattered with numerous gotos. This is because during virtualization the code lost its original structure. Loops were converted to If-then goto statements. Although we can directly use the code as is, it is better if we try to restore it to its original form which should also improve readability.

Restoring the structure of the de-virtualized code

To restore the original structure of the de-virtualized code we will compile it followed by decompilation with dnSpy.

Original de-virtualized code

void doIt()
{
 bitarr = new BitArray[3];
 bitarr[0] = new BitArray(hardcoded);
 arr = new int[64];
 ans = new byte[5];
 
 loc_000: bitarr[2] = new BitArray(56);
 loc_003: arr[41] = 0;
 loc_005: arr[15] = 0;
 loc_007: goto loc_036;
 loc_009: arr[8] = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ".IndexOf(password[arr[15]]);
 loc_012: if (arr[8] == -1) goto loc_034;
 loc_016: arr[1] = 0;
 loc_018: goto loc_030;
 loc_020: bitarr[2][arr[41]] = ((arr[8] & 1) != 0);
 loc_024: arr[41]++;
 loc_026: arr[8] = (int)((uint)arr[8] >> 1);
 loc_028: arr[1]++;
 loc_030: if ((long)arr[1] < (long)((ulong)5)) goto loc_020;
 loc_034: arr[15]++;
 loc_036: if (arr[15] < password.Length) goto loc_009;
 loc_039: bitarr[1] = new BitArray(40);
 loc_042: arr[61] = 0;
 loc_044: arr[55] = 0;
 loc_046: arr[24] = 0;
 loc_048: goto loc_134;
 loc_050: arr[36] = (bitarr[0][arr[61]] == true ? 1:0);
 loc_054: arr[61]++;
 loc_056: arr[18] = 0;
 loc_058: goto loc_078;
 loc_060: arr[19] = (bitarr[0][arr[61]] == true ? 1:0);
 loc_064: if (arr[19] == 0) goto loc_074;
 loc_067: arr[19] = (bitarr[2][arr[18]] == true ? 1:0);
 loc_071: arr[36] += arr[19];
 loc_074: arr[61]++;
 loc_076: arr[18]++;
 loc_078: if ((long)arr[18] < (long)((ulong)50)) goto loc_060;
 loc_082: arr[11] = 0;
 loc_084: goto loc_122;
 loc_086: arr[47] = arr[11] + 1;
 loc_089: goto loc_116;
 loc_091: arr[19] = (bitarr[0][arr[61]] == true ? 1:0);
 loc_095: if (arr[19] == 0) goto loc_112;
 loc_098: arr[19] = (bitarr[2][arr[11]] == true ? 1:0);
 loc_102: arr[57] = (bitarr[2][arr[47]] == true ? 1:0);
 loc_106: arr[57] *= arr[19];
 loc_109: arr[36] += arr[57];
 loc_112: arr[61]++;
 loc_114: arr[47]++;
 loc_116: if ((long)arr[47] < (long)((ulong)50)) goto loc_091;
 loc_120: arr[11]++;
 loc_122: if ((long)arr[11] < (long)((ulong)49)) goto loc_086;
 loc_126: bitarr[1][arr[55]] = ((arr[36] & 1) != 0);
 loc_130: arr[55]++;
 loc_132: arr[24]++;
 loc_134: if ((long)arr[24] < (long)((ulong)40)) goto loc_050;
 loc_138: bitarr[1].CopyTo(ans, 0);
}

Decompiled code after recompilation

// Token: 0x06000005 RID: 5 RVA: 0x00003A94 File Offset: 0x00002A94
private void doIt()
{
 this.bitarr = new BitArray[3];
 this.bitarr[0] = new BitArray(this.hardcoded);
 this.arr = new int[64];
 this.ans = new byte[5];
 this.bitarr[2] = new BitArray(56);
 this.arr[41] = 0;
 this.arr[15] = 0;
 while (this.arr[15] < this.password.Length)
 {
  this.arr[8] = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ".IndexOf(this.password[this.arr[15]]);
  if (this.arr[8] != -1)
  {
   this.arr[1] = 0;
   while ((long)this.arr[1] < 5L)
   {
    this.bitarr[2][this.arr[41]] = ((this.arr[8] & 1) != 0);
    this.arr[41]++;
    this.arr[8] = (int)((uint)this.arr[8] >> 1);
    this.arr[1]++;
   }
  }
  this.arr[15]++;
 }
 this.bitarr[1] = new BitArray(40);
 this.arr[61] = 0;
 this.arr[55] = 0;
 this.arr[24] = 0;
 while ((long)this.arr[24] < 40L)
 {
  this.arr[36] = (this.bitarr[0][this.arr[61]] ? 1 : 0);
  this.arr[61]++;
  this.arr[18] = 0;
  while ((long)this.arr[18] < 50L)
  {
   this.arr[19] = (this.bitarr[0][this.arr[61]] ? 1 : 0);
   if (this.arr[19] != 0)
   {
    this.arr[19] = (this.bitarr[2][this.arr[18]] ? 1 : 0);
    this.arr[36] += this.arr[19];
   }
   this.arr[61]++;
   this.arr[18]++;
  }
  this.arr[11] = 0;
  while ((long)this.arr[11] < 49L)
  {
   this.arr[47] = this.arr[11] + 1;
   while ((long)this.arr[47] < 50L)
   {
    this.arr[19] = (this.bitarr[0][this.arr[61]] ? 1 : 0);
    if (this.arr[19] != 0)
    {
     this.arr[19] = (this.bitarr[2][this.arr[11]] ? 1 : 0);
     this.arr[57] = (this.bitarr[2][this.arr[47]] ? 1 : 0);
     this.arr[57] *= this.arr[19];
     this.arr[36] += this.arr[57];
    }
    this.arr[61]++;
    this.arr[47]++;
   }
   this.arr[11]++;
  }
  this.bitarr[1][this.arr[55]] = ((this.arr[36] & 1) != 0);
  this.arr[55]++;
  this.arr[24]++;
 }
 this.bitarr[1].CopyTo(this.ans, 0);
}

[Full Code]
This can further be hand optimized to convert all arrays to variables and while loops to for loops. The optimization will be done in a later step.

The key checking algorithm

As shown in Fig 7 the keygenme expects a username/password combination. It calculates a standard MD5 hash of the username, and compares some bytes of it to another hash computed from the password. The algorithm which implements the latter hash is the crux of this keygenme. For success, these comparisons must match.

Fig 7: Gimme the password!

The user supplied password is converted to a bit array of size 50 using the following algorithm.

calc = new BitArray(50);

for (int i = 0, j = 0; i < password.Length; i++)
{
 int pos = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ".IndexOf(password[i]);
 if (pos != -1)
 {
  for (int k = 0; k < 5; k++)
  {
   calc[j++] = (pos & 1) != 0;
   pos >>= 1;
  }
 }
}
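For experimenting outside of C#, the same conversion can be sketched in Python 3. The function name password_to_bits is hypothetical and the bits are returned as a list of 0/1 integers, least significant bit first for each character.

```python
ALPHABET = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ'

def password_to_bits(password):
    """Convert a password to a list of bits, 5 per valid character, LSB first."""
    bits = []
    for ch in password:
        pos = ALPHABET.find(ch)
        if pos == -1:
            continue  # characters outside the alphabet (such as '-') are skipped
        for _ in range(5):
            bits.append(pos & 1)
            pos >>= 1
    return bits

print(password_to_bits('3-Z'))  # [1, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

A password with ten valid characters, such as the sample QQRR9-DL6JF, therefore yields exactly 50 bits.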

Next, there is a huge hard coded byte array containing 6380 entries. This is also converted to a bit array. Since 8 bits make a byte, the size of the bit array is 6380 x 8 = 51,040.

Now, suppose the 50 bit password array is represented as p0, p1, p2, p3, ..., p49. Using these two bit arrays, the keygenme calculates the value of 40 statements of the following form. Which bits are summed and which pairs are multiplied is determined by flags read from the hard coded bit array. To understand how this is done refer to the decompiled code listing.

bit0 = (0 + p[1] + p[2] + p[9] + ... + p[49] + (p[0] * p[2]) + (p[0] * p[3]) + ... + (p[48] * p[49])) & 1;
bit1 = (1 + p[0] + p[2] + p[4] + ... + p[48] + (p[0] * p[2]) + (p[0] * p[3]) + ... + (p[48] * p[49])) & 1;
bit2 = (1 + p[0] + p[1] + p[2] + ... + p[49] + (p[0] * p[1]) + (p[0] * p[2]) + ... + (p[47] * p[49])) & 1;
...
37 more lines
...
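As a sanity check on the sizes, the Python 3 sketch below counts how many bits of the hard coded array each output bit consumes, based on the loop bounds in the decompiled listing: one initial bit, 50 flags for the linear terms and one flag for every pair of password bits. The function name is made up.

```python
def bits_per_output(n):
    """Flags consumed per output bit: 1 constant + n linear + n*(n-1)/2 pair terms."""
    return 1 + n + n * (n - 1) // 2

per_output = bits_per_output(50)
print(per_output)       # 1276
print(40 * per_output)  # 51040
print(6380 * 8)         # 51040
```

The totals agree, so the 40 statements consume the 6380 byte array exactly.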

The 40 calculated bits are then combined 8 at a time to form 5 bytes. For success, each byte thus obtained must match the corresponding byte obtained from the MD5 of the name as explained before.

Converting to a boolean system

The algorithm the keygenme implements is a sort of combination of a system of linear & bilinear equations having 50 unknowns (p0, ...p49). Note that in each of the equations only the last bit of the sum is retained, the remaining bits are discarded. Hence this can be converted to a boolean system, but before that we need to convert the arithmetic addition and multiplication operations to boolean logic.

Converting arithmetic addition

In a boolean system, variables can have two possible values viz 0 or 1. Further, since we are only interested in the last bit, the result of addition operation will either be 0 or 1. We can either use a Karnaugh map or a truth table. Here I used the latter.

Truth Table for Converting Bit Addition
A	B	F = A + B
0	0	0
0	1	1
1	0	1
1	1	0

A & B are two bits. Bit B is added to Bit A. The rightmost column denotes the sum. Using the above truth table to convert it to a canonical sum-of-products boolean expression we get the following, which is already in its simplest form,

$$F = \bar{A}B + A\bar{B}$$

Converting arithmetic product
Similar to addition, we can convert the product to a boolean expression,

Truth Table for converting Bit Product
A	B	C	B.C	F = A + B.C
0	0	0	0	0
0	0	1	0	0
0	1	0	0	0
0	1	1	1	1
1	0	0	0	1
1	0	1	0	1
1	1	0	0	1
1	1	1	1	0

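Bit B & C are multiplied and added with Bit A. The rightmost column denotes the result. The canonical sum-of-products obtained is,

$$F = \bar{A}BC + A\bar{B}\bar{C} + A\bar{B}C + AB\bar{C}$$

This can be further simplified to,

$$F = A\bar{B} + A\bar{C} + \bar{A}BC$$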

For simplification of boolean expressions, there are online tools available such as http://www.32x8.com/ and http://tma.main.jp/logic/index_en.html.
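The conversions can also be checked by brute force. The Python 3 sketch below (function names are made up) compares the two sum-of-products forms that appear later in the Z3 script against the arithmetic they replace, over every input combination.

```python
from itertools import product

def sop_add(a, b):
    """Sum-of-products form for the last bit of A + B."""
    return (not a and b) or (a and not b)

def sop_mul_add(a, b, c):
    """Simplified sum-of-products form for the last bit of A + B*C."""
    return (a and not b) or (a and not c) or (not a and b and c)

def verify():
    """Return True if both forms match the arithmetic for all inputs."""
    for a, b in product([False, True], repeat=2):
        if sop_add(a, b) != bool((a + b) & 1):
            return False
    for a, b, c in product([False, True], repeat=3):
        if sop_mul_add(a, b, c) != bool((a + b * c) & 1):
            return False
    return True

print(verify())  # True
```

Python treats True and False as 1 and 0 in arithmetic, which keeps the comparison short.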

An attempt at keygenning

The objective of the keygenme is thus to solve a system of equations having 50 unknowns. As I said before, the de-virtualized code can be further hand optimized to get rid of the excess clutter. Here it is.

private void doIt()
{
 huge = new BitArray(hardcoded);
 calc = new BitArray(50);
 output = new BitArray(40);
 arr = new int[64];
 ans = new byte[5];

 for (int i = 0, j = 0; i < password.Length; i++)
 {
  int pos = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ".IndexOf(password[i]);
  if (pos != -1)
  {
   for (int k = 0; k < 5; k++)
   {
    calc[j++] = (pos & 1) != 0;
    pos >>= 1;
   }
  }
 }

 for (int x = 0, y = 0; x < 40; x++)
 {
  bool bit = huge[y++];
  
  for (int a = 0; a < 50; a++)
   if (huge[y++])
    if (calc[a]) bit = !bit;

  for (int b = 0; b < 49; b++)
   for (int c = b + 1; c < 50; c++)
    if (huge[y++]) 
     if (calc[b] & calc[c]) bit = !bit;
  output[x] = bit;
 }
 output.CopyTo(ans, 0);
}
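For quick experiments the second loop can be ported to Python 3. The sketch below is a hypothetical port, not code from the keygenme: huge and calc are plain lists of 0/1 values, the number of password bits is taken from len(calc), and the number of outputs is a parameter so that tiny inputs can be tried by hand.

```python
def evaluate(huge, calc, n_out=40):
    """Compute the output bits from the flag array `huge` and password bits `calc`."""
    n = len(calc)
    y = 0
    out = []
    for _ in range(n_out):
        bit = int(huge[y])  # constant term
        y += 1
        for a in range(n):  # linear terms
            if huge[y] and calc[a]:
                bit ^= 1
            y += 1
        for b in range(n - 1):  # pair (bilinear) terms
            for c in range(b + 1, n):
                if huge[y] and calc[b] and calc[c]:
                    bit ^= 1
                y += 1
        out.append(bit)
    return out

# Two password bits: 1 + p0 + p0*p1 with p0 = p1 = 1
print(evaluate([1, 1, 0, 1], [1, 1], n_out=1))  # [1]
```

Packing the 40 resulting bits into 5 bytes, as BitArray.CopyTo does in the original, is left out of this sketch.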

While loops have been converted to for loops. Arrays have been converted to variables. The last step that remains is to develop a working keygen. For problems like this, I tend to prefer the Z3 SMT solver. Z3 has a nice Python API to code against. Without further ado, here is the code which tries to find a valid password for the name 0xec. It isn't a true keygen.

from z3 import *

from init_bits import initbits
from targ_bits import targbits
from sum_bits import sumbits
from prod_bits import prodbits

def main():
    bits = [Bool('b'+str(i)) for i in range(50)]
    s = SolverFor('sat') # Solver()
    keyspace = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ'
    finalkey = ''
    tempkey = ''

    for i in range(len(initbits)):
        A = BoolVal(initbits[i])

        li = sumbits[i]
        for x in li:
            B = bits[x]
            A = Or(And(Not(A), B), And(A, Not(B)))

        li = prodbits[i]
        for x, y in li:
            B, C = bits[x], bits[y]
            A = Or(And(A, Not(B)), And(A, Not(C)), And(Not(A), B, C))

        s.add(A == targbits[i])
        print i


    if s.check() == sat:
        print 'Key Found!'
        m = s.model()
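        for i in xrange(50):
            val = m[bits[i]]
            # A bit is 1 only if the model assigns it True; bits that are
            # False or left unassigned (None) are 0
            if val is not None and is_true(val):
                tempkey += '1'
            else:
                tempkey += '0'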

        # Reverse tempkey
        tempkey = tempkey[::-1]
        for i in xrange(0, 50, 5):
            finalkey +=  keyspace[int(tempkey[i:i+5], 2)]

        finalkey = finalkey[::-1]
        print finalkey[0:5] + '-' + finalkey[5:]

    else:
        print 'Could not find a key:('

if __name__ == '__main__':
    main()

Congrats if you have read this far; however, I have disappointing news in store. The keygen developed does not generate a key within a realistic time. For a gold medal, we need to generate a key within one second. I have run the keygen for more than an hour, and it was still running. However, I know the code is correct since it can verify the sample username / password combination (kao/QQRR9-DL6JF).
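For reference, verifying a candidate password only needs the decompiled loop evaluated on concrete bits. The sketch below is a direct Python port of the C# routine above. The function names and the n_out / n_in parameters (added so that it can be tried on a tiny table) are my own, and the real huge table is not reproduced here.

```python
ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ"

def password_to_bits(password):
    """Turn each valid character into 5 bits, least significant bit first."""
    bits = []
    for ch in password:
        pos = ALPHABET.find(ch)
        if pos != -1:              # characters such as '-' are skipped
            for _ in range(5):
                bits.append(bool(pos & 1))
                pos >>= 1
    return bits

def evaluate(huge, calc, n_out=40, n_in=50):
    """Evaluate the linear + bilinear system over GF(2), as the C# loop does."""
    out = []
    y = 0
    for _ in range(n_out):
        bit = bool(huge[y])        # constant term
        y += 1
        for a in range(n_in):      # linear terms
            if huge[y] and calc[a]:
                bit = not bit
            y += 1
        for b in range(n_in - 1):  # bilinear terms, b < c
            for c in range(b + 1, n_in):
                if huge[y] and calc[b] and calc[c]:
                    bit = not bit
                y += 1
        out.append(bit)
    return out
```

A password is accepted when evaluate(huge, password_to_bits(password)) equals the 40 target bits derived from the name.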

Further optimizations of the Python code are possible, but they do not improve the situation. For example, if you look closely at the boolean expression for converting bit addition, you can deduce that it is the same as a 2 bit half adder where we are discarding the carry bit.

Similarly, the expression for converting the bit product reduces to an XOR of the running bit with the AND of the two key bits, i.e. A XOR (B AND C).

However, in spite of all of the above optimizations, it still remains unsolved.
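The simplifications can be checked with a plain truth table, without involving Z3 at all. This is only a sanity check of the two boolean expressions used in the script; the helper names are made up for the example.

```python
from itertools import product

def sum_step(a, b):
    # Or(And(Not(A), B), And(A, Not(B)))
    return ((not a) and b) or (a and (not b))

def prod_step(a, b, c):
    # Or(And(A, Not(B)), And(A, Not(C)), And(Not(A), B, C))
    return (a and not b) or (a and not c) or ((not a) and b and c)

values = [False, True]

# The addition step is a plain XOR.
sum_is_xor = all(sum_step(a, b) == (a ^ b)
                 for a, b in product(values, repeat=2))

# The product step is an XOR with the AND of the two key bits.
prod_is_xor_and = all(prod_step(a, b, c) == (a ^ (b and c))
                      for a, b, c in product(values, repeat=3))

print(sum_is_xor, prod_is_xor_and)
```

In z3py terms the two steps could therefore be written as Xor(A, B) and Xor(A, And(B, C)). This makes the script shorter, although as noted it does not make the solver any faster.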

Trying with an arithmetic system

As said before, the implemented algorithm is a combination of linear and bilinear system having 50 unknowns. In an arithmetic system, the problem looks like,

In simple maths, the above represents an equation similar to the following,

There are about 8 such equations. Each equation has a linear and a bilinear part. Without the bilinear part, we could have applied Gaussian elimination to find a solution, but that is not the case here. Further, my mathematical fu is not strong enough to devise an automated way to solve this type of equation. Hence, I will wait for others to solve this.
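The equations were originally shown as images. Going only by the decompiled C# loop above, each output bit can be written as follows. The symbols are my own notation: k stands for the 50 key bits (calc), o for an output bit, and the alpha, beta and gamma coefficients are the bits read sequentially from huge.

```latex
% One equation per output bit x, all arithmetic over GF(2):
% \oplus is XOR, a product of bits is AND.
o_x \;=\; \alpha_x
      \;\oplus\; \bigoplus_{a=0}^{49} \beta_{x,a}\, k_a
      \;\oplus\; \bigoplus_{0 \le b < c \le 49} \gamma_{x,b,c}\, k_b\, k_c
```

Each equation thus has 1 constant, 50 linear and 50·49/2 = 1225 bilinear coefficients. Dropping the gamma terms would leave a linear system over GF(2), which is exactly the case Gaussian elimination can handle.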

Disk scheduling visualizer

2016-06-10T08:30:00.000+00:00

Over the weekend, I developed a small python utility to visualize disk scheduling algorithms, particularly the way in which requests are serviced.

The algorithms implemented are SCAN & LOOK (not the circular versions i.e. CSCAN & CLOOK).

You can provide input by modifying the following lines in the code.

requestQ = [request(2, 0), request(156, 0), request(78, 0), request(192, 0),
            request(19, 30), request(127, 30), request(90, 30), request(100, 150),
            request(140, 150), request(60, 200)]

Each entry is a named tuple consisting of the track number and the time of the request.

You can change the algorithm to LOOK or SCAN by modifying the corresponding line alg = AL_LOOK.

It will display a formatted table showing the way the requests are serviced.

Sample output

Track Range : 0 - 199
Starting Track : 89
Starting Direction : LOW
Algorithm : LOOK

      TIME      |     TRACK
----------------|----------------
       0        |       89
       11       |       78
       70       |       19
       87       |       2
      175       |       90
      185       |      100
      212       |      127
      225       |      140
      241       |      156
      277       |      192
      409       |       60
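The sample output above is consistent with a head that moves one track per time unit and only sees a request once its request time has passed. Under that assumption (it is inferred from the table, not taken from the repository), LOOK can be sketched in a few lines. Request and look are placeholder names, not the ones used in the actual utility.

```python
from collections import namedtuple

Request = namedtuple("Request", ["track", "time"])

def look(requests, start_track, direction=-1):
    """Simulate LOOK tick by tick; direction is -1 (LOW) or +1 (HIGH)."""
    pending = list(requests)
    track, now = start_track, 0
    served = [(now, track)]                 # starting position
    while pending:
        arrived = [r for r in pending if r.time <= now]
        here = [r for r in arrived if r.track == track]
        if here:
            # Service every arrived request for the current track.
            for r in here:
                pending.remove(r)
            served.append((now, track))
            continue
        if arrived:
            # Reverse only when nothing is waiting ahead of the head.
            ahead = any((r.track - track) * direction > 0 for r in arrived)
            if not ahead:
                direction = -direction
            track += direction
        now += 1                            # idle or move for one time unit
    return served

request_q = [Request(2, 0), Request(156, 0), Request(78, 0), Request(192, 0),
             Request(19, 30), Request(127, 30), Request(90, 30),
             Request(100, 150), Request(140, 150), Request(60, 200)]

for time, track in look(request_q, start_track=89, direction=-1):
    print("%8d | %8d" % (time, track))
```

Note how the request for track 60 arrives at time 200, when the head has already passed it on the way up, so it is only serviced after the reversal at track 192.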

Additionally, if you have matplotlib installed, it will render a graph too.

Fig: Sample Screenshot

The code is hosted on GitHub: https://github.com/extremecoders-re/disk-scheduling-visualizer

PjOrion Deobfuscator Open Sourced

2016-05-11T03:46:00.001+00:00

Update (11-July-2017)

The project PjOrion Deobfuscator has been discontinued. This is superseded by bytecode_simplifier.

PjOrion is a Python bytecode protector. While originally developed for obfuscating World of Tanks mods, it can be used for pretty much any Python code. What makes this protector special is that it works on the Python bytecode itself. It tampers with the bytecode, making it impossible to decompile or disassemble with standard tools. For scripted languages like Python, this is quite a significant improvement considering code protection in Python was just a myth.

An example

Some time in 2015 bomblader posted a crackme on tuts4you. I will be using the same crackme to demonstrate the protection offered by PjOrion.

PjOrion breaks existing disassemblers by tampering the bytecode. For example, using the standard dis module on the obfuscated pyc file results in the following output.

Not only did the disassembler fail but also there are several invalid opcodes in the listing. For example the opcode 144 is invalid and non existent. When cpython tries to execute an invalid opcode, it throws an exception. Without an exception handler installed the program would crash. This is precisely the reason why the very first instruction is SETUP_EXCEPT. The purpose of the instruction is to set up an exception handler at bytecode offset 102 which will be called when an exception is thrown.

It is clear that we need to follow the exception handler to understand the program flow. For this I developed a trivial program which could trace the program flow.

From the above listing, it is clear that in addition to the invalid instruction at the beginning the code is splattered with unconditional jumps. The result of this is a spaghetti control flow as shown in Fig 1 or an even more extreme example in Fig 2.

Fig 1: Too many jumps!

Fig 2: Devilish CFG

Deobfuscating and beyond

To deobfuscate such files, I developed a tool, PjOrion Deobfuscator (@github). It is currently in a pre-alpha stage and may not even work. There are many moving pieces involved which need refactoring to make this workable. With time I aim to improve this tool.

The tool removes redundant jumps between the basic blocks as in Fig 3. However, this is not as simple as it sounds, since it needs to recursively disassemble the code stream. I also incorporated some ideas borrowed from the LLVM project to optimize the CFG.

Fig 3: Redundant Jumps removal
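To give a rough idea of what redundant jump removal means, here is a toy model and not the code of the tool. It assumes every basic block ends in at most one unconditional jump and ignores conditional branches and exception handlers entirely. A block is folded into its predecessor whenever that predecessor is the only block jumping to it.

```python
def merge_jump_chains(blocks):
    """blocks maps a block name to (instructions, successor name or None)."""
    blocks = {name: (list(instrs), succ)
              for name, (instrs, succ) in blocks.items()}
    changed = True
    while changed:
        changed = False
        # Count how many blocks jump to each block.
        preds = {}
        for _, succ in blocks.values():
            if succ is not None:
                preds[succ] = preds.get(succ, 0) + 1
        for name, (instrs, succ) in list(blocks.items()):
            # The jump is redundant when its target has no other predecessor.
            if succ is not None and succ != name and preds[succ] == 1:
                succ_instrs, succ_succ = blocks.pop(succ)
                blocks[name] = (instrs + succ_instrs, succ_succ)
                changed = True
                break
    return blocks

# Three blocks scattered in memory and chained by unconditional jumps.
spaghetti = {
    "entry": (["LOAD_CONST 1"], "b2"),
    "b1": (["RETURN_VALUE"], None),
    "b2": (["STORE_NAME x"], "b1"),
}
print(merge_jump_chains(spaghetti))
```

The real problem is harder because the blocks first have to be discovered by recursive disassembly, and because the merged result has to be laid out again with fresh offsets.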

After removing the redundant jumps, we need to reassemble the modified cfg. This is also quite a task as we need to re-compute all instruction offsets and the position of the basic blocks within the reassembled instruction stream.

For now, you can use the tool to generate a CFG which should help to better understand the bytecode. Be sure to have pydotplus and graphviz installed before using it.

I would like to reiterate once again that the tool is in a pre-alpha stage and may not work for your files. However, I definitely aim to improve this tool with time.

PjOrion Deobfuscator: https://github.com/extremecoders-re/PjOrion-Deobfuscator

Reversing the petya ransomware with constraint solvers

2016-04-22T20:04:00.000+00:00

With the advent of anonymous online money transactions (read Bitcoin), ransomware has become a profitable business in the cybercrime industry. This, combined with the Tor network, hides the attacker's identity. Further, low infosec literacy makes social engineering really easy. All you need to do is send an email with an attached Word file for a failed FedEx delivery etc. The victim would download the attachment, open the Word file, enable macros, all without thinking a bit.

Minutes later he/she would be staring at a screen that may look like this

Fig 1: The Petya lock screen

In panic, the hapless user may call his tech friend or google (seriously?) and be convinced that he/she has really lost all data. Now he/she either heeds the attacker's demands and pays the ransom or just forgets it. Paying the ransom is however not too easy. There is no PayPal, no credit card. The attackers accept payments only in bitcoins and you also need to install the so-called Tor browser. This is the end of the road for many.

For us reversers, this is not the end. We try to find out if the user can have his/her data recovered WITHOUT paying the ransom. However this is not always possible particularly if the malware coders are experts. If not then we may have a way in just as was the case of the Petya ransomware which is the topic for today's blog post.

Preliminary Analysis

The sample on which the analysis has been done can be found here.

The malware differs from typical ransomware in that it does not encrypt files. Instead it encrypts the MFT (Master File Table) on NTFS volumes. In short, on NTFS the MFT is a table which contains information about each and every file on the partition. For small files, the file content may be stored entirely within the MFT. For larger files, it contains a pointer to the actual data.

Encrypting the MFT is advantageous in the sense that the operation is very fast. You do not need to recursively traverse the entire drive to find the files. The files are there on the disk but the system does not know where to find them. This is as good as having the individual files encrypted.

The downside of encrypting the MFT is that the malware needs to be low level & sophisticated. Since the system cannot boot the OS, a custom bootloader has to be developed. The code has to be 16-bit running in real mode. It has to use the BIOS interrupt services to communicate with the user. This is not an easy task considering we are used to developing in 32/64 bit with memory protection, segmentation and other niceties provided by the OS. In real mode we are responsible for everything.

Initial Triage

My favourite OS for reverse engineering tasks like this is old and trusty Windows XP. However for reasons unknown I could not get this sample running. Hence, I had to resort to a Windows 7 SP1 x86 VM. Running the sample leads to a BSOD after some seconds.

Fig 2: The BSOD

This BSOD is generated entirely from user mode by calling an undocumented API NtRaiseHardError. On the next boot a fake chkdsk scan starts to run.

Fig 3: The malware is hiding its action behind the fake scan.

In reality, the malware is doing its dirty work of encrypting the MFT. The chkdsk is just a decoy. When done we are presented with the redemption screen as in Fig 4 and then Fig 1. However it is not as scary as it looks :).

Fig 4: Danger Ahead!

Carving the mal-bootloader from disk

To perform static analysis we need the malicious bootloader. Since I have used vmware here, the easiest way would be to attach the vmware disk image (vmdk) to another virtual machine and use a tool like Active@ Disk Editor as in Fig 5. I have also developed a 010 Editor template for parsing vmware disk images directly, which can be used just in case.

Fig 5: Active Disk Editor in action

We need to extract the sectors (the first 10,000 sectors ~ 5 MiB should be more than enough) to a separate file.
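If the disk is available as a raw (flat) image, the same extraction can be done with a few lines of Python instead of a disk editor. This is a minimal sketch under that assumption: a sparse vmdk would have to be converted to a raw image first, and the file names in the usage line are placeholders.

```python
def carve_sectors(image_path, out_path, count=10000, sector_size=512):
    """Copy the first `count` sectors of a raw disk image to a new file."""
    with open(image_path, "rb") as src, open(out_path, "wb") as dst:
        data = src.read(count * sector_size)
        dst.write(data)
    return len(data)    # number of bytes actually written
```

For example, carve_sectors("disk.img", "petya_sectors.bin") would write the first 5,120,000 bytes of disk.img.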

IDA & Bochs

For static analysis we will be using IDA (no surprises). For dynamic analysis we will be using the Bochs debugger. Although vmware can debug BIOS code & bootloaders through its gdb stub, it is quite a pain to use efficiently. Hence we will stick with bochs. IDA provides first class support for bochs. Further, running bochs is a lot snappier than powering up a full-fledged vmware vm.

You can get the bxrc file (bochs config file) here. We can now load the bxrc file in IDA and it will automatically do the rest.

Fig 6: The initial malware code

The malware copies 32 sectors starting from sector 34 to 0x8000 and jumps to it as in Fig 6.

Fig 7: To encrypt or not

Among other things, it reads sector 54 to a buffer. If the first byte contains 0 it proceeds to encrypt. If not it displays the ransom screen as shown in Fig 7. Hence the first byte of sector 54 is used mainly as a flag to decide its further course of action.

Analyzing the decryption code

We will be focussing primarily on the decryption algorithm. After all we are more interested in getting our data back than figuring out how it got lost. The process is simple, it reads a string, checks it and if it is valid decrypts our data.

Fig 8: Read & Check key

It accepts a key with a maximum length of 73 characters, but only the first 16 of them are used. The accepted characters consist of 1-9, a-z, A-X. After this the 16 byte key is expanded to a 32 byte key: each character produces two consecutive bytes, obtained by adding 122 to it and by doubling it respectively. This is shown in Fig 9.

Fig 9: Expanding the key
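In Python the expansion could look roughly like this. It is a sketch of my reading of the disassembly in Fig 9, namely that every key character yields the pair (c + 122, c * 2). The function name is made up, and the masking to 8 bits is an assumption which never triggers for the accepted character set.

```python
def expand_key(key):
    """Expand the first 16 key characters into a 32 byte key."""
    out = bytearray()
    for ch in key[:16]:
        c = ord(ch)
        out.append((c + 122) & 0xFF)   # first byte: character + 122 ('z')
        out.append((c * 2) & 0xFF)     # second byte: character doubled
    return bytes(out)

print(expand_key("a" * 16).hex())
```

Since the largest accepted character is 'z' (122), both results stay below 256, so no information is lost in the expansion.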

Next we reach the crux of the malware. It reads some data from sectors 54 & 55 and passes them to the Crypt function. Using the 32 byte decryption key and an 8 byte Initialization Vector (nonce) from sector 54, it decrypts the 512 bytes of data in sector 55. If our key is correct, every byte in the decrypted data must equal 0x37.

Fig 10: Calling Crypt

Finding the encryption algorithm

The encryption algorithm used is a variant of the Salsa stream cipher. I call this a variant because properly implemented Salsa is quite secure. Well, how do we know this is Salsa? From magic constants, of course. See Fig 11.

Fig 11: expand 32-byte k

Searching for "expand 32-byte k" would directly lead you to Salsa. The exact code used in the malware can be found here. I am using the word exact in a broad sense. If it had been a ditto copy, we would have no chance of breaking it. The original Salsa implementation uses 32 bit (uint32_t) variables. This salsa implementation uses 16 bit variables for the same purpose borking it badly. Here is a snippet of the borked version. You can get the full version here. Compare this to the original version.

The primary reason for the mess up can be attributed to the fact that all of this is running in 16 bit real mode. So the authors decided to go easy and implement the exact same algorithm but with 16 bit variables.

Breaking the algorithm

We already have the entire algorithm in source code. We need to fire up our tools to go & break it. These days, my de facto tool for such analysis has mostly been angr. However, angr failed to work in this case. This is expected as the framework is in a continuous state of development. Not spending time on finding out why it failed, I decided to look at other options. I used KLEE. It did not fail but took a long time and never finished. Next, a wild idea cropped up and I decided to use a fuzzing based approach. For this I used the AFL framework. No luck here either.

Lastly I decided to use the tried and tested Z3 constraint solver and it did not disappoint :). We already have the source, we just need to implement it in Z3. The code is as follows.

The program has to be provided with the 8 byte nonce from sector 54 and 64 bytes from sector 55 after xoring with 0x37. The remainder of the program is a literal transcription of the C source and hence not explained. Running the program, we get our decryption key in a few milliseconds. Apply the decryption key & hope for the best.
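Collecting the two inputs from the carved sectors can be scripted as well. The sketch below works on the raw bytes of the carved image. Note that the offset of the nonce within sector 54 (0x21) is not stated in this post; it is an assumption on my part and should be verified against your own dump. The function name is made up too.

```python
SECTOR_SIZE = 512

def extract_solver_inputs(image, nonce_offset=0x21):
    """Return (nonce, keystream) from the raw bytes of the carved sectors."""
    sector54 = image[54 * SECTOR_SIZE:55 * SECTOR_SIZE]
    sector55 = image[55 * SECTOR_SIZE:56 * SECTOR_SIZE]
    # ASSUMPTION: the 8 byte nonce starts at nonce_offset inside sector 54.
    nonce = bytes(sector54[nonce_offset:nonce_offset + 8])
    # Sector 55 decrypts to all 0x37, so ciphertext ^ 0x37 is the keystream.
    keystream = bytes(b ^ 0x37 for b in sector55[:64])
    return nonce, keystream
```

The xor works because the plaintext of sector 55 is known in advance: whatever remains after removing 0x37 is exactly the keystream the cipher produced for the correct key.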

Fig 12: Mission accomplished!

Mission accomplished.

References

http://blog.trendmicro.com/trendlabs-security-intelligence/petya-crypto-ransomware-overwrites-mbr-lock-users-computers/
http://www.bleepingcomputer.com/news/security/petya-ransomware-skips-the-files-and-encrypts-your-hard-drive-instead/
https://github.com/leo-stone/hack-petya
http://pastebin.com/Zc16DfL1

Solving kao's toy project with symbolic execution and angr

2016-04-01T15:18:00.002+00:00

Kao's toy project is a nifty and small crackme and quite ideal for demonstrating the power of symbolic execution. Running the crackme provides us with an installation id. We need to enter an unlock code which shows the goodboy message.

Fig 1: The main window

The installation id is calculated from the hard disk serial number. We will not focus on the algorithm that generates the installation id but rather on developing the keygen which calculates an unlock code when given an install id.

Before discussing the function which checks the validity of the entered unlock code, it is important to mention that the installation id is 32 bytes (8 DWORDS) long and is displayed on the crackme screen in the form

D₁D₀-D₃D₂-D₅D₄-D₇D₆

i.e. within each QWORD the two DWORDS are stored in little endian order. We need to take this into account in our keygen program and convert the entered installation id to the proper form.

Previous Work

This crackme has previously been solved by Rolf Rolles, who used a technique similar to the one described in this blog post. While the method involving an SMT solver is similar, Rolf used semi-automatic techniques which translated the assembly code to IR and finally generated the constraints from the IR.

Before Rolf Rolles, this was solved by andrewl & Dcoder who used cryptanalysis techniques to reduce the keyspace. More recently, this was solved by Cr4sh who used the openreil framework.

The heart of the crackme

At the heart of the crackme lies this small function which checks whether a given unlock code is valid or not.

Fig 2: The checking function

The function takes two dwords (from the unlock code) as arguments which are then used to encode/encrypt the installation id (plaintext) to a given output buffer(ciphertext). For our entered unlock code to be valid, the encoded output must match the hardcoded string 0how4zdy81jpe5xfu92kar6cgiq3lst7.

Solving with Z3

At first we will try to model the system in Z3. Specifically, we will represent the encoding loop in Z3. Then we will use Z3 to solve the system and find the two dwords (unlock code) which encodes the installation id to the hardcoded string.

The script takes in the installation id as a command line argument. Let's walk through the code step by step.

install_id = getInstallIdFromString(sys.argv[1])

Here we convert the install id into its proper form, i.e. the order of the two DWORDs within each QWORD is reversed, and return it as a list of integers.

target = map(ord, list('0how4zdy81jpe5xfu92kar6cgiq3lst7'))

After encoding, the installation id must match the hardcoded string. Here we are converting that string to a list of characters where each character is represented by its ASCII value.

part1 = edx = BitVec('part1', 32) 
part2 = ebx = BitVec('part2', 32)

We declare two bit-vectors with a size of 32 bits each. These two bit-vectors represent the two DWORDS of the unlock code.

for i in xrange(32):
    # text:00401105 lodsb
    byte = install_id[i]
        
    # text:00401106 sub al, bl
    byte -= Extract(7, 0, ebx)
    
    # text:00401108 xor al, dl
    byte ^= Extract(7, 0, edx)

    # text:0040110B rol edx, 1
    edx = RotateLeft(edx, 1)
        
    # text:0040110D rol ebx, 1
    ebx = RotateLeft(ebx, 1)
        
    # Add constraint
    s.add(byte == target[i])

The above loop describes the encoding process. Each byte of the install_id is encoded in turn, and the encoded value must match the corresponding character in the target list. For this we use constraints.
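It helps to have a concrete (non-symbolic) version of the same loop around, for instance to double check a candidate that the solver returns. The following is a plain Python transcription of the four instructions. The function names are mine, and edx / ebx stand for the initial register values, i.e. part1 and part2 in the Z3 script.

```python
def rol32(value, count=1):
    """Rotate a 32 bit value left by `count` bits."""
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF

def encode(install_id, edx, ebx):
    """Encode the install id bytes exactly as the crackme loop does."""
    out = []
    for byte in install_id:
        byte = (byte - (ebx & 0xFF)) & 0xFF   # sub al, bl
        byte ^= edx & 0xFF                    # xor al, dl
        edx = rol32(edx)                      # rol edx, 1
        ebx = rol32(ebx)                      # rol ebx, 1
        out.append(byte)
    return bytes(out)
```

A pair of values is a valid solution when encode(install_id, part1, part2) equals b'0how4zdy81jpe5xfu92kar6cgiq3lst7'.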

# Solve the system
if s.check() == sat:
    m = s.model()
    print 'Unlock Code: ',
    print '%08X-%08X' % (m[part1].as_long(), m[part1].as_long() ^ m[part2].as_long())

Finally, we ask z3 to solve the system and print the solution.

Solving with angr

Okay we have already solved the crackme, so why another method? This is because I wanted to see if we can use angr for the same purpose, besides it would be a good learning experience.

Let's look at the cfg once again

Fig 3: We want to execute the green basic block and avoid the red one

At 40122A function check is called. If our entered unlock code is correct check would return 1 and we would go to the green color basic block at 401234 which displays the good boy message.

Now to the cfg of the check function.

Fig 4: Inserting hooks inside the check function

We are going to execute the above function symbolically. The unlock code, which is comprised of two parts, is passed as arguments to the function. Since we are executing this function in isolation, we need to provide the inputs ourselves, and this can be done by setting a hook at 4010FF (set_ebx_edx). Within the hook, we would store symbolic values representing the two parts of the unlock code into the ebx and edx registers.

Lastly, at 40111D there is a call to lstrcmpA. This function is imported from kernel32.dll. Since this dll is not loaded within our execution environment, we must emulate the behaviour of lstrcmpA, and this can be done with SimProcedures.

Fig 5: lstrcmpA function

lstrcmpA is located at 40130E. We would set a hook at this location to call a SimProcedure which emulates the behaviour of lstrcmpA.

Now let's see the code which implements all of these.
Finally to wrap things up, here is an asciicast showing the solver in action.

Revisiting find the flag crackme (Part-2)

2016-03-09T18:35:00.000+00:00

This is the 2nd and final part of the series find the flag crackme.

Back to the challenge

Coming back to the challenge, we will see how we can use symbolic execution for solving the challenge. We can use Z3 for representing the system symbolically but that is too much work as we need to convert each instruction to its Z3 equivalent. Hence we need to look at alternatives which will do this work automatically.

Enter angr

The angr project is the next-generation binary analysis framework created by the computer security lab at UC Santa Barbara. Among its myriad capabilities, it can lift a raw binary to an intermediate language and perform symbolic execution on it, just the thing we are looking for. Additionally, it can be instructed to search for paths that lead to the execution of a particular instruction. It can also find out what initial values (in registers / memory etc) will lead to the execution of a particular path.

For this particular challenge, we need to find the path that will print the good boy message while avoiding the path printing the bad boy message. angr can then automatically find out the flag which will lead to the execution of this path. This is no magic but done through the power of symbolic execution and constraint solving.

Installing angr is pretty straightforward and well documented. Hence I will dive right into the actual problem.

Solving with angr

First we need to keep a note of some information about the binary.

The check function starts at VA 0x804846D.

Start of Check function

The basic block we want to execute is 0x8049A52 and the one we want to avoid is 0x8049A60.

The basic block on the left is the good boy

The check function is called from 0x8049A44. Hence after returning execution resumes at 0x8049A49.

Execution resumes at 0x8049A49 after return

The flag is stored in an array at VA 0x804B060.

VA of the flag

With the above information we can develop the solver in angr.

UPDATE 19-July-2017: The script below is out of date and will not work in current version of angr. Get the updated script here: https://gist.github.com/extremecoders-re/ede291f19bbc6a2508087f58ac75846e

Running on CPython, the script takes a couple of minutes to find the flag. This is quite an impressive feat considering that we did not even analyze the check function itself. Furthermore, we could have reduced the execution time by running on PyPy instead of CPython.

And finally, to end things with, here is an asciicast showing the solver in action. You can download the files that were used in this post from google drive.