angr can now symbolically execute Java code and Android apps, including Android apps incorporating both compiled Java and native (C/C++) code!
This project is the result of a multi-year effort by (in no particular order): Thorsten Eisenhofer (thrsten), Sebastiano Mariani (phate), Ruoyu Wang (fish), Antonio Bianchi (anton00b), and Andrea Continella (conand).
Enabling Java support requires a few more steps than a typical angr installation. To get started, we recommend looking through the detailed installation instructions and a list of examples available in the official angr documentation.
WARNING: Java support is still experimental! You may encounter issues while analyzing Java or Android apps. Please report any bugs on the respective git repository in the angr project. Pull requests are also welcome!
Implementation
Java support relies primarily on our new Soot python wrapper, pysoot, to lift both Java and DEX bytecode.
pysoot
lifts both Java and DEX bytecode, as well as extracts a fully serializable interface from Android apps and Java code.
Unfortunately, as of now, it only works on Linux.
We then leverage the generated IR in a new angr engine able to interpret such code. This engine is also able to automatically switch to executing native code if a native method is called through the JNI interface.
Along with symbolic execution, we have also implemented some basic static analysis, specifically a basic CFG reconstruction analysis.
Moreover, we added support for string constraint solving by integrating CVC4’s string constraint abilities into claripy
.
Solving a CTF challenge
The challenge javaisnotfun
from iCTF 2017 is a game implemented in mixed Java/C code.
You can find its source code here and a writeup (in Chinese) here.
The challenge starts with a challenge-response game in which 5 random numbers are shown to the user, and the user has to reply with 3 numbers. Solving five rounds of the game allows the attacker to trigger the challenge vulnerability.
In this example, we will focus on how to solve one round of the game using angr. The complete angr code is available here.
A typical approach would require reversing the Java code and the native code used to implement the game. However, we can now use angr to automatically compute the 3 numbers of the solution!
This is the source code implementing one round of the game:
Random rnd = new Random();
int c1,c2,c3,c4,c5;
c1 = rnd.nextInt(100);
c2 = rnd.nextInt(100);
c3 = rnd.nextInt(256);
c4 = rnd.nextInt(10);
c5 = rnd.nextInt(10);
print("These are your unlucky numbers:");
print(c1);
print(c2);
print(c3);
c3 <<= 8;
switch(c4){
case 0:
c3 = c3 * 2 + 3;
break;
case 1:
c3 = c3 * 7 + 8;
break;
case 2:
c3 = c3 * 3 + 1;
break;
case 3:
c3 = c3 * 5 + 3;
break;
case 4:
c3 = c3 * 2 + 9;
break;
case 5:
c3 = c3 * 9 + 1;
break;
case 6:
c3 = c3 * 6 + 2;
break;
case 7:
c3 = c3 * 5 + 4;
break;
case 8:
c3 = c3 * 8 + 2;
break;
case 9:
c3 = c3 * 4 + 2;
break;
}
switch(c5){
case 0:
c3 = magic0(c3);
break;
case 1:
c3 = magic1(c3);
break;
case 2:
c3 = magic2(c3);
break;
case 3:
c3 = magic3(c3);
break;
case 4:
c3 = magic4(c3);
break;
case 5:
c3 = magic5(c3);
break;
case 6:
c3 = magic6(c3);
break;
case 7:
c3 = magic7(c3);
break;
case 8:
c3 = magic8(c3);
break;
case 9:
c3 = magic9(c3);
break;
}
print(c4);
print(c5);
//System.err.println("expected: " + String.valueOf(c1+2)+"|"+String.valueOf(c2*3+1)+"|"+String.valueOf(c3));
if(! (getInt() == c1 + 2)){
gameFail();
}
if(! (getInt() == magic000(c2))){
gameFail();
}
if(! (getInt() == c3)){
gameFail();
}
JNIEXPORT int JNICALL Java_NotFun_magic000(JNIEnv *env, jobject thisObj, jint n) {
return n*3 + 1;
}
JNIEXPORT int JNICALL Java_NotFun_magic0(JNIEnv *env, jobject thisObj, jint n) {
return (n<<1) + 4;
}
JNIEXPORT int JNICALL Java_NotFun_magic1(JNIEnv *env, jobject thisObj, jint n) {
return (n<<4) + 3;
}
JNIEXPORT int JNICALL Java_NotFun_magic2(JNIEnv *env, jobject thisObj, jint n) {
return (n<<3) + 2;
}
JNIEXPORT int JNICALL Java_NotFun_magic3(JNIEnv *env, jobject thisObj, jint n) {
return (n<<2) + 3;
}
JNIEXPORT int JNICALL Java_NotFun_magic4(JNIEnv *env, jobject thisObj, jint n) {
return (n<<2) + 1;
}
JNIEXPORT int JNICALL Java_NotFun_magic5(JNIEnv *env, jobject thisObj, jint n) {
return (n>>2) + 3;
}
JNIEXPORT int JNICALL Java_NotFun_magic6(JNIEnv *env, jobject thisObj, jint n) {
return (n>>3) + 3;
}
JNIEXPORT int JNICALL Java_NotFun_magic7(JNIEnv *env, jobject thisObj, jint n) {
return (n>>1) + 3;
}
JNIEXPORT int JNICALL Java_NotFun_magic8(JNIEnv *env, jobject thisObj, jint n) {
return (n>>4) + 7;
}
JNIEXPORT int JNICALL Java_NotFun_magic9(JNIEnv *env, jobject thisObj, jint n) {
return (n>>1) + 1;
}
Let’s create, using angr, a function able to compute the 3 response values, given the 5 random challenge values.
First of all we need to create an angr Project
.
binary_path = os.path.join(self_dir, "bin/service.jar")
jni_options = {'jni_libs': ['libnotfun.so']}
project = angr.Project(binary_path, main_opts=jni_options)
As you can see, we manually specify that the jar
file uses a native library, libnotfun.so
.
We then set up a few hooks.
Since these hooks are in Java, we specify their addresses using the class SootMethodDescriptor
.
project.hook(SootMethodDescriptor(class_name="java.util.Random", name="nextInt", params=('int',)).address(), Random_nextInt())
project.hook(SootMethodDescriptor(class_name="java.lang.Integer", name="valueOf", params=('int',)).address(), Dummy_valueOf())
project.hook(SootMethodDescriptor(class_name="NotFun", name="print", params=('java.lang.Object',)).address(), Custom_Print())
project.hook(SootMethodDescriptor(class_name="NotFun", name="getInt", params=()).address(), Custom_getInt())
Then, we set up the symbolic execution entry point.
Specifically, we want to start the symbolic execution from the Java method called game()
:
game_method = [m for m in project.loader.main_object.classes['NotFun'].methods if m.name == "game"][0]
game_entry = SootMethodDescriptor.from_soot_method(game_method).address()
entry = project.factory.blank_state(addr=game_entry)
simgr = project.factory.simgr(entry)
To handle the challenge-response, we create two fake files keeping track (symbolically) of what the program prints (the 5 challenge numbers) and what the user inserts (the 3 response values). See the full solution for details.
Finally, we start symbolically executing the program step-by-step.
We prune paths reaching the gameFail()
method, while we stash paths solving one round of the game (formally, reaching basic block 30 of the method game()
).
When run, numeric_solutions
will contain the 3 response values.
print("="*10 + " SYMBOLIC EXECUTION STARTED")
while(len(simgr.active)>0):
simgr.step()
print("===== " + str(simgr))
print("===== " + ",".join([str(a.addr) for a in simgr.active if type(a.addr)==SootAddressDescriptor]))
# If we reach block_idx 30, it means that we solved 1 round of the game --> we stash the state
# If we reach the gameFail() method, it means that we failed --> we prune the state
simgr.move('active', 'stashed', lambda a: type(a.addr) == SootAddressDescriptor
and a.addr.method == SootMethodDescriptor("NotFun", "game", ()) and a.addr.block_idx == 30)
simgr.move('active', 'pruned', lambda a: type(a.addr) == SootAddressDescriptor
and a.addr.method == SootMethodDescriptor("NotFun", "gameFail", ()))
print("="*10 + " SYMBOLIC EXECUTION ENDED")
assert len(simgr.stashed) == 1
win_state = simgr.stashed[0]
numeric_solutions = []
for s in solutions:
es = win_state.solver.eval_atmost(s, 2)
assert len(es) == 1
numeric_solutions.append(es[0])
Limitations and Future Work
As mentioned before, Java support is experimental!
There are many things that should be improved, for instance:
- Move away from
pysoot
or, at least:- Make it compatible with Windows, OSX, …
- Do not use Jython and Pickle (but something more efficient, such as
protobuf
) to obtain the Soot IR in Python
- Implement more static analyses, including, data-flow, slicing, … (this is currently work-in-progress)
- Many many more simprocedures to model the HUGE Java/Android framework
- …
Contribution from the community is highly encouraged! Pull requests are very welcome!