This is the 2nd and final part of the series find the flag crackme.
Back to the challenge
Coming back to the challenge, we will see how we can use symbolic execution for solving the challenge. We can use Z3 for representing the system symbolically but that is too much of a work as we need to convert each instruction to its Z3 equivalent. Hence we need to look at alternatives which will do this work automatically.
The angr project is the next-generation binary analysis framework created by the computer security lab at UC Santa Barbara. Among its myriads of capabilities, it can lift a raw binary to an intermediate language and perform symbolic execution on it, just the thing we are looking for. Additionally, it can be instructed to search for paths that lead to the execution of a particular instruction. It can also find out what initial values (in registers / memory etc) will lead to the execution of a particular path.
For this particular challenge, we need to find the path that will print the good boy message while avoiding the path printing the bad boy message. angr can then automatically found out the flag which will lead to the execution of this path. This is no magic but done through the power of symbolic execution and constraint solving.
Installing angr is pretty straightforward and well documented. Hence I will dive right into the actual problem.
Solving with angr
First we need to keep a note of some information about the binary.
The check function starts at VA 0x804846D.
|Start of Check function|
The basic block we want to execute is 0x8049A52 and the one we want to avoid is 0x8049A60.
|The basic block on the left is the good boy|
The check function is called from 0x8049A44. Hence after returning execution resumes at 0x8049A49.
|Execution resumes at 0x8049A49 after return|
The flag is stored in an array at VA 0x804B060.
|VA of the flag|
With the above information we can develop the solver in angr.
Running on CPython the script takes about a couple of minutes to find the flag. This is quite an impressive feat considering that we did not even analyzed the check function itself. Furthermore, we could have reduced the execution time by running on PyPy instead of CPython.
And finally, to end things with, here is an asciicast showing the solver in action. You can download the files that were used in this post from google drive.