Angr Management: Another Few Useful Tricks for Using Angr
0x00 Intro
While working on an assignment with angr, I found some useful (if obvious) tricks that didn’t fit into the previous post on state merging with angr. This post covers the following topics: (1) using angr with Jupyter Notebook, (2) forcefully setting RIP to a specific address, and (3) speeding up symbolic execution (symbex) when there are too many constraints by splitting it into multiple symbex runs for different parts.
0x01 Using angr with Jupyter Notebook
Jupyter Notebook is useful for the same reason checkpoints are useful in hardcore games (learning angr for an assignment, for example). It usually takes minutes to run the well-tested part of the code before reaching the untested part, and if the untested part doesn’t work, another few minutes are needed to test the revised code. In Jupyter Notebook, the code can be run cell by cell (a cell is just a block of code), so after running the cells that are known to work, continuing from a fresh copy of the state made with state.copy() as a checkpoint makes life easier.
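The checkpoint idea can be sketched as a tiny helper. This is only an illustration: `try_from_checkpoint`, `experiment`, and `SLOW_PART_END` are hypothetical names, and the helper assumes nothing about the state beyond it having a `.copy()` method, which angr states do.

```python
def try_from_checkpoint(checkpoint, experiment):
    """Run `experiment` on a copy so that `checkpoint` is never modified.

    `checkpoint` is assumed to be an angr SimState (anything with .copy()
    works); `experiment` is whatever untested code is being iterated on.
    """
    trial = checkpoint.copy()
    return experiment(trial)


# Cell 1 (slow, run once), e.g. with a hypothetical address SLOW_PART_END:
#   simgr.explore(find=SLOW_PART_END)
#   checkpoint = simgr.found[0]
#
# Cell 2 (fast, re-run after every edit):
#   try_from_checkpoint(checkpoint, lambda s: s.solver.eval(s.regs.rax))
```

Because the experiment only ever touches the copy, the second cell can be re-run as often as needed without repeating the slow first cell.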
In case it is needed, setting up Jupyter Notebook is relatively straightforward. The easiest approach is to let VS Code set up everything needed by opening a .ipynb file and following its hints.
A quirky problem that may come up when putting the cells together into a standalone Python script is that claripy, which angr relies on, sometimes recurses very deeply while solving the constraints. This does not cause a problem inside Jupyter Notebook, but it does outside of it. The error message looks like:
Traceback (most recent call last):
  File "/home/ru1d4/.local/lib/python3.12/site-packages/cachetools/__init__.py", line 68, in __getitem__
    return self.__data[key]
           ~~~~~~~~~~~^^^^^
KeyError: 553510368

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ru1d4/example.py", line 42, in <module>
    state.solver.eval_upto(some_expr, 2)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ... many lines omitted, which are call stacks within claripy ...
  File "/home/ru1d4/.local/lib/python3.12/site-packages/cachetools/__init__.py", line 70, in __getitem__
    return self.__missing__(key)
           ^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded
Raising Python's recursion limit at the top of the script might help:
import sys
sys.setrecursionlimit(10000)
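A limit raised this way stays in effect for the rest of the process. If that is not wanted, one option is to scope it to the call that needs it. Below is a minimal sketch using only the standard library; `recursion_limit` and `depth` are hypothetical names, and `depth` merely stands in for a deeply recursive solver call.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit, restoring the old one on exit."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(max(old, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(old)


def depth(n):
    """Stand-in for a deeply recursive call such as a claripy solve."""
    return 0 if n == 0 else 1 + depth(n - 1)


# 3000 nested calls exceed the usual default limit of 1000,
# but fit comfortably under the temporarily raised one.
with recursion_limit(10000):
    result = depth(3000)
```

With angr, the body of the `with` block would be the `state.solver.eval_upto(...)` call that triggered the error.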
0x02 Forcefully set RIP
state.regs.rip = 0x400000
The idea is that one may want to skip some parts of a function, and the quickest way to do that is to set RIP directly. Similarly, combined with setting other aspects of the state, conditional jumps can be manipulated.
The use case for this may seem questionable at first glance. Bear with me for a while: it becomes useful when combined with the next trick.
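As a rough sketch of this trick, the helper below copies a state, optionally overrides some registers (for example, to steer a later conditional jump), and then sets RIP. `skip_to` is a hypothetical name, not an angr API, and the addresses and register values in the usage comment are made up.

```python
def skip_to(state, addr, **regs):
    """Return a copy of `state` that resumes at `addr`.

    Keyword arguments name registers to override first, e.g. rax=0.
    Assumes an amd64 target, hence `rip`.
    """
    new_state = state.copy()
    for name, value in regs.items():
        setattr(new_state.regs, name, value)
    new_state.regs.rip = addr
    return new_state


# Hypothetical usage: jump past a slow call and pretend it returned 0.
#   after_call = skip_to(state, 0x401234, rax=0)
#   simgr = proj.factory.simgr(after_call)
```

Setting RIP only moves the instruction pointer. Any side effects of the skipped code that matter later, such as stack adjustments or memory writes, have to be reproduced by hand.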
0x03 Speeding up symbolic execution
When there are too many constraints in the state, a significant amount of time is needed to execute even one simple instruction.
Consider the following pseudocode as a target for analysis:
user_input = input()
foo = func1(user_input)
bar = func2(foo)
baz = func3(bar)
if baz == 0xdeadbeef:
    print("Congrats!")
else:
    print("Try again!")
The easiest approach is to start symbex right before the call to func1, execute all the way until func3 returns, and then go with:
state.solver.eval_upto(symbol_user_input, 2, cast_to=bytes, extra_constraints=[baz == 0xdeadbeef])
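If it takes too long to even reach func3 because too many constraints have accumulated in func1 and func2, consider having three separate symbex runs, one per function, resulting in three states. Each run gets a fresh symbolic value for its input (sym_user_input, sym_foo, and sym_bar below), so it carries only the constraints of its own function. Then solve them one by one in reverse order.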
# baz = func3(bar)
expected_bar = state3.solver.eval_upto(sym_bar, 1, cast_to=bytes, extra_constraints=[baz == 0xdeadbeef])[0]
# bar = func2(foo)
expected_foo = state2.solver.eval_upto(sym_foo, 1, cast_to=bytes, extra_constraints=[bar == expected_bar])[0]
# foo = func1(user_input)
expected_user_input = state1.solver.eval_upto(sym_user_input, 1, cast_to=bytes, extra_constraints=[foo == expected_foo])[0]
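The same reverse-order chain can be written as a loop. This is a sketch under the assumption that each run yields a state plus the symbolic input and output of its function; `solve_backwards` and the `stages` layout are hypothetical, and it uses `state.solver.eval` without `cast_to` so that the intermediate values are plain integers.

```python
def solve_backwards(stages, target):
    """Chain per-function solves from the last stage back to the first.

    `stages` lists (state, sym_in, sym_out) tuples in execution order, one
    per separate symbex run. Returns a concrete value for the first input.
    """
    expected = target
    for state, sym_in, sym_out in reversed(stages):
        # Without cast_to, eval returns a Python int, which can be compared
        # directly against the previous stage's symbolic output.
        expected = state.solver.eval(sym_in, extra_constraints=[sym_out == expected])
    return expected


# Hypothetical usage with the names from the snippet above:
#   stages = [(state1, sym_user_input, foo),
#             (state2, sym_foo, bar),
#             (state3, sym_bar, baz)]
#   user_input = solve_backwards(stages, 0xdeadbeef)
```

One caveat: each stage picks a single solution. If a function has several inputs that produce the wanted output, the one picked may not be producible by the previous function, in which case that earlier solve is unsatisfiable and another candidate (for example from `eval_upto` with a larger count) would have to be tried.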
The total time used by three separate symbex runs will probably be less than that of an all-in-one run. Hopefully, this justifies section 0x02 above on setting RIP to skip some parts of the binary.