angr — Analysis and Coordination

Project

angr.project.register_default_engine(loader_backend, engine, arch='any')

Register the default execution engine to be used with a given CLE backend. Usually this is the SimEngineVEX, but if you’re operating on something that isn’t going to be lifted to VEX, you’ll need to make sure the desired engine is registered here.

Parameters:
  • loader_backend – The loader backend (a type)
  • engine (type) – The engine to use for the loader backend (a type)
  • arch – The architecture to associate with this engine. Optional.
Returns:

angr.project.get_default_engine(loader_backend, arch='any')

Get some sort of sane default for a given loader and/or arch. Can be set with register_default_engine().

angr.project.load_shellcode(shellcode, arch, start_offset=0, load_address=0)

Load a new project based on a string of raw bytecode.

Parameters:
  • shellcode – The data to load
  • arch – The name of the arch to use, or an archinfo class
  • start_offset – The offset into the data to start analysis (default 0)
  • load_address – The address to place the data in memory (default 0)
class angr.project.Project(thing, default_analysis_mode=None, ignore_functions=None, use_sim_procedures=True, exclude_sim_procedures_func=None, exclude_sim_procedures_list=(), arch=None, simos=None, load_options=None, translation_cache=True, support_selfmodifying_code=False, **kwargs)

This is the main class of the angr module. It is meant to contain a set of binaries and the relationships between them, and perform analyses on them.

Parameters:thing – The path to the main executable object to analyze, or a CLE Loader object.

The following parameters are optional.

Parameters:
  • default_analysis_mode – The mode of analysis to use by default. Defaults to ‘symbolic’.
  • ignore_functions – A list of function names that, when imported from shared libraries, should never be stepped into in analysis (calls will return an unconstrained value).
  • use_sim_procedures – Whether to replace resolved dependencies for which simprocedures are available with said simprocedures.
  • exclude_sim_procedures_func – A function that, when passed a function name, returns whether or not to wrap it with a simprocedure.
  • exclude_sim_procedures_list – A list of functions to not wrap with simprocedures.
  • arch – The target architecture (auto-detected otherwise).
  • simos – a SimOS class to use for this project.
  • translation_cache (bool) – If True, cache translated basic blocks rather than re-translating them.
  • support_selfmodifying_code (bool) – Whether we aggressively support self-modifying code. When enabled, emulation will try to read code from the current state instead of the original memory, regardless of the current memory protections.

Any additional keyword arguments passed will be passed onto cle.Loader.

Variables:
  • analyses – The available analyses.
  • entry – The program entrypoint.
  • factory – Provides access to important analysis elements such as path groups and symbolic execution results.
  • filename – The filename of the executable.
  • loader – The program loader.
  • surveyors – The available surveyors.
hook(addr, hook=None, length=0, kwargs=None, replace=False)

Hook a section of code with a custom function. This is used internally to provide symbolic summaries of library functions, and can be used to instrument execution or to modify control flow.

When hook is not specified, it returns a function decorator that allows easy hooking. Usage:

# Assuming proj is an instance of angr.Project, we will add a custom hook at the entry
# point of the project.
@proj.hook(proj.entry)
def my_hook(state):
    print("Welcome to execution!")
Parameters:
  • addr – The address to hook.
  • hook – An angr.project.Hook describing a procedure to run at the given address. You may also pass in a SimProcedure class or a function directly and it will be wrapped in a Hook object for you.
  • length – If you provide a function for the hook, this is the number of bytes that will be skipped by executing the hook by default.
  • kwargs – If you provide a SimProcedure for the hook, these are the keyword arguments that will be passed to the procedure’s run method eventually.
  • replace – Control the behavior on finding that the address is already hooked. If true, silently replace the hook. If false (default), warn and do not replace the hook. If none, warn and replace the hook.
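
The three values of replace, and the decorator form used when hook is omitted, can be modelled in a few lines of plain Python. This is only a toy sketch: ToyHookTable and everything inside it are hypothetical names, it stores bare callables rather than Hook objects, and it is not angr's implementation.

```python
import warnings


class ToyHookTable(object):
    """Toy stand-in for a project's hook registry (not angr's implementation)."""

    def __init__(self):
        self._hooks = {}  # address -> callable

    def hook(self, addr, hook=None, replace=False):
        # With no hook given, return a decorator that registers the function.
        if hook is None:
            def decorator(func):
                self.hook(addr, func, replace=replace)
                return func
            return decorator

        if addr in self._hooks:
            if replace is True:
                pass  # silently replace
            elif replace is False:
                warnings.warn("address %#x already hooked; keeping old hook" % addr)
                return None
            else:  # replace is None: warn, then replace
                warnings.warn("address %#x already hooked; replacing hook" % addr)
        self._hooks[addr] = hook
        return None

    def is_hooked(self, addr):
        return addr in self._hooks

    def hooked_by(self, addr):
        return self._hooks.get(addr)


table = ToyHookTable()


@table.hook(0x400000)
def first(state):
    return "first"


def second(state):
    return "second"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    table.hook(0x400000, second)                # replace=False: warn, keep `first`
    kept = table.hooked_by(0x400000)
    table.hook(0x400000, second, replace=True)  # replace=True: silent replace
    replaced = table.hooked_by(0x400000)
    table.hook(0x400000, first, replace=None)   # replace=None: warn and replace
    after_none = table.hooked_by(0x400000)
```

In this model only the replace=False and replace=None calls emit a warning, so caught holds two entries.
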
is_hooked(addr)

Returns True if addr is hooked.

Parameters:addr – An address.
Returns:True if addr is hooked, False otherwise.
hooked_by(addr)

Returns the current hook for addr.

Parameters:addr – An address.
Returns:None if the address is not hooked.
unhook(addr)

Remove a hook.

Parameters:addr – The address of the hook.
hook_symbol(symbol_name, obj, kwargs=None, replace=None)

Resolve a dependency in a binary. Looks up the address of the given symbol, and then hooks that address. If the symbol was not available in the loaded libraries, this address may be provided by the CLE externs object.

Additionally, if instead of a symbol name you provide an address, some secret functionality will kick in and you will probably just hook that address, UNLESS you’re on powerpc64 ABIv1 or some yet-unknown scary ABI that has its function pointers point to something other than the actual functions, in which case it’ll do the right thing.

Parameters:
  • symbol_name – The name of the dependency to resolve.
  • obj – The thing with which to satisfy the dependency.
  • kwargs – If you provide a SimProcedure for the hook, these are the keyword arguments that will be passed to the procedure’s run method eventually.
  • replace – Control the behavior on finding that the address is already hooked. If true, silently replace the hook. If false, warn and do not replace the hook. If none (default), warn and replace the hook.
Returns:

The address of the new symbol.

Return type:

int

is_symbol_hooked(symbol_name)

Check if a symbol is already hooked.

Parameters:symbol_name (str) – Name of the symbol.
Returns:True if the symbol can be resolved and is hooked, False otherwise.
Return type:bool
unhook_symbol(symbol_name)

Remove the hook on a symbol. This function will fail if the symbol is provided by the extern object, as that would result in a state where analysis would be unable to cope with a call to this symbol.

execute(*args, **kwargs)

This function is a symbolic execution helper in the simple style supported by triton and manticore. It is designed to be run after setting up hooks (see Project.hook), in which the symbolic state can be checked.

This function can be run in three different ways:

  • When run with no parameters, this function begins symbolic execution from the entrypoint.
  • It can also be run with a “state” parameter specifying a SimState to begin symbolic execution from.
  • Finally, it can accept any arbitrary keyword arguments, which are all passed to project.factory.full_init_state.

If symbolic execution finishes, this function returns the resulting simulation manager.

terminate_execution()

Terminates a symbolic execution that was started with Project.execute().

class angr.factory.AngrObjectFactory(project, default_engine, procedure_engine, engines)

This factory provides access to important analysis elements.

successors(state, addr=None, jumpkind=None, inline=False, default_engine=False, engines=None, **kwargs)

Perform execution using any applicable engine. Enumerate the current engines and use the first one that works. Return a SimSuccessors object classifying the results of the run.

Parameters:
  • state – The state to analyze
  • addr – optional, an address to execute at instead of the state’s ip
  • jumpkind – optional, the jumpkind of the previous exit
  • inline – This is an inline execution. Do not bother copying the state.
  • default_engine – Whether we should only attempt to use the default engine (usually VEX)
  • engines – A list of engines to try to use, instead of the default.

Additional keyword arguments will be passed directly into each engine’s process method.
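
The "use the first engine that works" rule can be illustrated with a small stand-alone model. ToyEngine, toy_successors, the check method and the only_default flag are hypothetical names chosen for this sketch; real engines return a SimSuccessors object rather than a string.

```python
class ToyEngine(object):
    """Hypothetical engine: claims a state if `accepts(state)` is true."""

    def __init__(self, name, accepts):
        self.name = name
        self.accepts = accepts

    def check(self, state):
        return self.accepts(state)

    def process(self, state, **kwargs):
        return "%s handled %r" % (self.name, state)


def toy_successors(state, engines, default_engine=None, only_default=False, **kwargs):
    # Optionally restrict the search to the default engine alone.
    candidates = [default_engine] if only_default else engines
    for engine in candidates:
        if engine.check(state):
            # Extra keyword arguments go straight to the engine's process method.
            return engine.process(state, **kwargs)
    raise RuntimeError("no engine could process state %r" % (state,))


procedure = ToyEngine("procedure", lambda s: s == "hooked")
vex = ToyEngine("vex", lambda s: True)
engines = [procedure, vex]

r1 = toy_successors("hooked", engines)  # the first applicable engine wins
r2 = toy_successors("plain", engines)   # falls through to the second engine
r3 = toy_successors("hooked", engines, default_engine=vex, only_default=True)
```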

blank_state(**kwargs)

Returns a mostly-uninitialized state object. All parameters are optional.

Parameters:
  • addr – The address the state should start at instead of the entry point.
  • initial_prefix – If this is provided, all symbolic registers will hold symbolic values with names prefixed by this string.
  • fs – A dictionary of file names with associated preset SimFile objects.
  • concrete_fs – bool describing whether the host filesystem should be consulted when opening files.
  • chroot – A path to use as a fake root directory. Behaves similarly to a real chroot. Used only when concrete_fs is set to True.
  • kwargs – Any additional keyword args will be passed to the SimState constructor.
Returns:

The blank state.

Return type:

SimState

entry_state(**kwargs)

Returns a state object representing the program at its entry point. All parameters are optional.

Parameters:
  • addr – The address the state should start at instead of the entry point.
  • initial_prefix – If this is provided, all symbolic registers will hold symbolic values with names prefixed by this string.
  • fs – a dictionary of file names with associated preset SimFile objects.
  • concrete_fs – boolean describing whether the host filesystem should be consulted when opening files.
  • chroot – a path to use as a fake root directory. Behaves similarly to a real chroot. Used only when concrete_fs is set to True.
  • argc – a custom value to use for the program’s argc. May be either an int or a bitvector. If not provided, defaults to the length of args.
  • args – a list of values to use as the program’s argv. May be mixed strings and bitvectors.
  • env – a dictionary to use as the environment for the program. Both keys and values may be mixed strings and bitvectors.
Returns:

The entry state.

Return type:

SimState

full_init_state(**kwargs)

Very much like entry_state(), except that instead of starting execution at the program entry point, execution begins at a special SimProcedure that plays the role of the dynamic loader, calling each of the initializer functions that should be called before execution reaches the entry point.

Parameters:
  • addr – The address the state should start at instead of the entry point.
  • initial_prefix – If this is provided, all symbolic registers will hold symbolic values with names prefixed by this string.
  • fs – a dictionary of file names with associated preset SimFile objects.
  • concrete_fs – boolean describing whether the host filesystem should be consulted when opening files.
  • chroot – a path to use as a fake root directory. Behaves similarly to a real chroot. Used only when concrete_fs is set to True.
  • argc – a custom value to use for the program’s argc. May be either an int or a bitvector. If not provided, defaults to the length of args.
  • args – a list of values to use as arguments to the program. May be mixed strings and bitvectors.
  • env – a dictionary to use as the environment for the program. Both keys and values may be mixed strings and bitvectors.
Returns:

The fully initialized state.

Return type:

SimState

call_state(addr, *args, **kwargs)

Returns a state object initialized to the start of a given function, as if it were called with given parameters.

Parameters:
  • addr – The address the state should start at instead of the entry point.
  • args – Any additional positional arguments will be used as arguments to the function call.

The following parameters are optional.

Parameters:
  • base_state – Use this SimState as the base for the new state instead of a blank state.
  • cc – Optionally provide a SimCC object to use a specific calling convention.
  • ret_addr – Use this address as the function’s return target.
  • stack_base – An optional pointer to use as the top of the stack at the function entry point
  • alloc_base – An optional pointer to use as the place to put excess argument data
  • grow_like_stack – When allocating data at alloc_base, whether to allocate at decreasing addresses
  • toc – The address of the table of contents for ppc64
  • initial_prefix – If this is provided, all symbolic registers will hold symbolic values with names prefixed by this string.
  • fs – A dictionary of file names with associated preset SimFile objects.
  • concrete_fs – bool describing whether the host filesystem should be consulted when opening files.
  • chroot – A path to use as a fake root directory. Behaves similarly to a real chroot. Used only when concrete_fs is set to True.
  • kwargs – Any additional keyword args will be passed to the SimState constructor.
Returns:

The state at the beginning of the function.

Return type:

SimState

The idea here is that you can provide almost any kind of python type in args and it’ll be translated to a binary format to be placed into simulated memory. Lists (representing arrays) must be entirely elements of the same type and size, while tuples (representing structs) can be elements of any type and size. If you’d like there to be a pointer to a given value, wrap the value in a SimCC.PointerWrapper. Any value that can’t fit in a register will be automatically put in a PointerWrapper.

If stack_base is not provided, the current stack pointer will be used, and it will be updated. If alloc_base is not provided, the current stack pointer will be used, and it will be updated. You might not like the results if you provide stack_base but not alloc_base.

grow_like_stack controls the behavior of allocating data at alloc_base. When data from args needs to be wrapped in a pointer, the pointer needs to point somewhere, so that data is dumped into memory at alloc_base. If you set alloc_base to point to somewhere other than the stack, set grow_like_stack to False so that sequential allocations happen at increasing addresses.
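
The difference between the two allocation directions can be seen in a tiny model. toy_allocate is a hypothetical helper written for this illustration; the exact placement rule (move the pointer down before writing, or write and then move up) is an assumption, not taken from angr's source.

```python
def toy_allocate(alloc_base, sizes, grow_like_stack=True):
    """Return the start address chosen for each allocation, in order."""
    addresses = []
    pointer = alloc_base
    for size in sizes:
        if grow_like_stack:
            pointer -= size            # move down first, then place the data
            addresses.append(pointer)
        else:
            addresses.append(pointer)  # place the data, then move up
            pointer += size
    return addresses


# Two allocations of 8 and 16 bytes, in both directions.
stack_like = toy_allocate(0x7FFF0000, [8, 16], grow_like_stack=True)
heap_like = toy_allocate(0x10000, [8, 16], grow_like_stack=False)
```

With grow_like_stack=True the addresses decrease (0x7FFEFFF8, then 0x7FFEFFE8); with False they increase (0x10000, then 0x10008).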

simulation_manager(thing=None, **kwargs)

Constructs a new simulation manager.

Parameters:
  • thing – Optional - What to put in the new SimulationManager’s active stash (either a SimState or a list of SimStates).
  • kwargs – Any additional keyword arguments will be passed to the SimulationManager constructor
Returns:

The new SimulationManager

Return type:

angr.manager.SimulationManager

Many different types can be passed to this method:

  • If nothing is passed in, the SimulationManager is seeded with a state initialized for the program entry point, i.e. entry_state().
  • If a SimState is passed in, the SimulationManager is seeded with that state.
  • If a list is passed in, the list must contain only SimStates and the whole list will be used to seed the SimulationManager.
callable(addr, concrete_only=False, perform_merge=True, base_state=None, toc=None, cc=None)

A Callable is a representation of a function in the binary that can be interacted with like a native python function.

Parameters:
  • addr – The address of the function to use
  • concrete_only – Throw an exception if the execution splits into multiple states
  • perform_merge – Merge all result states into one at the end (only relevant if concrete_only=False)
  • base_state – The state from which to do these runs
  • toc – The address of the table of contents for ppc64
  • cc – The SimCC to use for a calling convention
Returns:

A Callable object that can be used as an interface for executing guest code like a python function.

Return type:

angr.surveyors.caller.Callable

cc(args=None, ret_val=None, sp_delta=None, func_ty=None)

Return a SimCC (calling convention) parametrized for this project and, optionally, a given function.

Parameters:
  • args – A list of argument storage locations, as SimFunctionArguments.
  • ret_val – The return value storage location, as a SimFunctionArgument.
  • sp_delta – The amount the stack pointer changes over the course of the function (it is unclear whether this value is actually used).
  • func_ty – The prototype for the given function, as a SimType.

Relevant subclasses of SimFunctionArgument are SimRegArg and SimStackArg, and shortcuts to them can be found on this cc object.

For stack arguments, offsets are relative to the stack pointer on function entry.

cc_from_arg_kinds(fp_args, ret_fp=None, sizes=None, sp_delta=None, func_ty=None)

Get a SimCC (calling convention) that will extract floating-point/integral args correctly.

Parameters:
  • arch – The Archinfo arch for this CC
  • fp_args – A list, with one entry for each argument the function can take. True if the argument is fp, false if it is integral.
  • ret_fp – True if the return value for the function is fp.
  • sizes – Optional: A list, with one entry for each argument the function can take. Each entry is the size of the corresponding argument in bytes.
  • sp_delta – The amount the stack pointer changes over the course of this function - CURRENTLY UNUSED
  • func_ty – A SimType for the function itself

Program State

class angr.sim_state.SimState(project=None, arch=None, plugins=None, memory_backer=None, permissions_backer=None, mode=None, options=None, add_options=None, remove_options=None, special_memory_filler=None, os_name=None)

The SimState represents the state of a program, including its memory, registers, and so forth.

Variables:
  • regs – A convenient view of the state’s registers, where each register is a property
  • mem – A convenient view of the state’s memory, an angr.state_plugins.view.SimMemView
  • registers – The state’s register file as a flat memory region
  • memory – The state’s memory as a flat memory region
  • se – The solver engine for this state
  • inspect – The breakpoint manager, an angr.state_plugins.inspect.SimInspector
  • log – Information about the state’s history
  • scratch – Information about the current execution step
  • posix – MISNOMER: information about the operating system or environment model
  • libc – Information about the standard library we are emulating
  • cgc – Information about the cgc environment
  • uc_manager – Control of under-constrained symbolic execution
  • unicorn – Control of the Unicorn Engine
ip

Get the instruction pointer expression, trigger SimInspect breakpoints, and generate SimActions. Use _ip to not trigger breakpoints or generate actions.

Returns:an expression
addr

Get the concrete address of the instruction pointer, without triggering SimInspect breakpoints or generating SimActions. An integer is returned, or an exception is raised if the instruction pointer is symbolic.

Returns:an int
simplify(*args)

Simplify this state’s constraints.

add_constraints(*args, **kwargs)

Add some constraints to the state.

You may pass in any number of symbolic booleans as variadic positional arguments.

satisfiable(**kwargs)

Whether the state’s constraints are satisfiable

downsize()

Clean up after the solver engine. Calling this when a state no longer needs to be solved on will reduce memory usage.

step(**kwargs)

Perform a step of symbolic execution using this state. Any arguments to AngrObjectFactory.successors can be passed to this.

Returns:A SimSuccessors object categorizing the results of the step.
block(*args, **kwargs)

Represent the basic block at this state’s instruction pointer. Any arguments to AngrObjectFactory.block can be passed to this.

Returns:A Block object describing the basic block of code at this point.
copy()

Returns a copy of the state.

merge(*others, **kwargs)

Merges this state with the other states. Returns the merged state, the merge flag, and a bool indicating whether any merging occurred.

Parameters:
  • states – the states to merge
  • merge_conditions – a tuple of the conditions under which each state holds
  • common_ancestor – a state that represents the common history between the states being merged
  • plugin_whitelist – a list of plugin names that will be merged. If this option is given and is not None, any plugin that is not inside this list will not be merged, and will be created as a fresh instance in the new state.
Returns:

(merged state, merge flag, a bool indicating if any merging occurred)

widen(*others)

Perform a widening between self and other states.

reg_concrete(*args, **kwargs)

Returns the contents of a register but, if that register is symbolic, raises a SimValueError.

mem_concrete(*args, **kwargs)

Returns the contents of a memory location but, if the contents are symbolic, raises a SimValueError.

stack_push(*args, **kwargs)

Push ‘thing’ to the stack, writing the thing to memory and adjusting the stack pointer.

stack_pop(*args, **kwargs)

Pops from the stack and returns the popped thing. The length will be the architecture word size.

stack_read(*args, **kwargs)

Reads length bytes, at an offset into the stack.

Parameters:
  • offset – The offset from the stack pointer.
  • length – The number of bytes to read.
  • bp – If True, offset from the BP instead of the SP. Default: False.
dbg_print_stack(depth=None, sp=None)

Only used for debugging purposes. Return the current stack info as a formatted string. If depth is None, the current stack frame (from sp to bp) will be printed out.

class angr.state_plugins.inspect.BP(when='before', enabled=None, condition=None, action=None, **kwargs)

A breakpoint.

check(state, when)

Checks state to see if the breakpoint should fire.

Parameters:
  • state – The state.
  • when – Whether the check is happening before or after the event.
Returns:

A boolean representing whether the breakpoint should fire.

fire(state)

Trigger the breakpoint.

Parameters:state – The state.
class angr.state_plugins.inspect.SimInspector

The breakpoint interface, used to instrument execution. For usage information, look here: https://docs.angr.io/docs/simuvex.html#breakpoints

action(event_type, when, **kwargs)

Called from within SimuVEX when events happen. This function checks all breakpoints registered for that event and fires the ones whose conditions match.

make_breakpoint(event_type, *args, **kwargs)

Creates and adds a breakpoint which would trigger on event_type. Additional arguments are passed to the BP constructor.

Returns:The created breakpoint, so that it can be removed later.
b(event_type, *args, **kwargs)

Creates and adds a breakpoint which would trigger on event_type. Additional arguments are passed to the BP constructor.

Returns:The created breakpoint, so that it can be removed later.
add_breakpoint(event_type, bp)

Adds a breakpoint which would trigger on event_type.

Parameters:
  • event_type – The event type to trigger on
  • bp – The breakpoint
Returns:

The created breakpoint.

remove_breakpoint(event_type, bp=None, filter_func=None)

Removes a breakpoint.

Parameters:
  • bp – The breakpoint to remove.
  • filter_func – A filter function to specify whether each breakpoint should be removed or not.
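
How the breakpoint methods fit together (create, check, fire, remove) can be sketched with a stand-alone model. ToyBP and ToyInspector are hypothetical classes. Several details are assumptions made for the sketch: enabled defaults to True here, states are plain dictionaries, and the event name "mem_write" is used only as a string label.

```python
class ToyBP(object):
    """Toy breakpoint: fires when `when` matches and the condition holds."""

    def __init__(self, when="before", enabled=True, condition=None, action=None):
        self.when = when
        self.enabled = enabled
        self.condition = condition
        self.action = action

    def check(self, state, when):
        if not self.enabled or when != self.when:
            return False
        return self.condition is None or bool(self.condition(state))

    def fire(self, state):
        if self.action is not None:
            self.action(state)


class ToyInspector(object):
    def __init__(self):
        self._breakpoints = {}  # event type -> list of breakpoints

    def make_breakpoint(self, event_type, *args, **kwargs):
        bp = ToyBP(*args, **kwargs)  # extra arguments go to the BP constructor
        self.add_breakpoint(event_type, bp)
        return bp

    b = make_breakpoint  # short alias, mirroring b()

    def add_breakpoint(self, event_type, bp):
        self._breakpoints.setdefault(event_type, []).append(bp)

    def remove_breakpoint(self, event_type, bp=None, filter_func=None):
        kept = []
        for candidate in self._breakpoints.get(event_type, []):
            remove = candidate is bp or (
                filter_func is not None and filter_func(candidate)
            )
            if not remove:
                kept.append(candidate)
        self._breakpoints[event_type] = kept

    def action(self, event_type, when, state):
        # Fire every registered breakpoint whose check passes.
        for bp in list(self._breakpoints.get(event_type, [])):
            if bp.check(state, when):
                bp.fire(state)


hits = []
inspector = ToyInspector()
bp = inspector.b(
    "mem_write",
    when="after",
    condition=lambda s: s["addr"] == 0x1000,
    action=lambda s: hits.append(s["addr"]),
)
inspector.action("mem_write", "after", {"addr": 0x1000})   # fires
inspector.action("mem_write", "before", {"addr": 0x1000})  # wrong phase
inspector.action("mem_write", "after", {"addr": 0x2000})   # condition is false
inspector.remove_breakpoint("mem_write", bp)
inspector.action("mem_write", "after", {"addr": 0x1000})   # already removed
```

Keeping the object returned by b() is what makes the later remove_breakpoint call possible.
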
downsize()

Remove previously stored attributes from this plugin instance to save memory. This method is supposed to be called by breakpoint implementors. A typical workflow looks like the following:

>>> # Add `attr0` and `attr1` to `self.state.inspect`
>>> self.state.inspect(xxxxxx, attr0=yyyy, attr1=zzzz)
>>> # Get new attributes out of SimInspect in case they are modified by the user
>>> new_attr0 = self.state._inspect.attr0
>>> new_attr1 = self.state._inspect.attr1
>>> # Remove them from SimInspect
>>> self.state._inspect.downsize()
class angr.state_plugins.libc.SimStateLibc

This state plugin keeps track of various libc stuff:

class angr.state_plugins.posix.Stat(st_dev, st_ino, st_nlink, st_mode, st_uid, st_gid, st_rdev, st_size, st_blksize, st_blocks, st_atime, st_atimensec, st_mtime, st_mtimensec, st_ctime, st_ctimensec)

Create new instance of Stat(st_dev, st_ino, st_nlink, st_mode, st_uid, st_gid, st_rdev, st_size, st_blksize, st_blocks, st_atime, st_atimensec, st_mtime, st_mtimensec, st_ctime, st_ctimensec)

st_atime

Alias for field number 10

st_atimensec

Alias for field number 11

st_blksize

Alias for field number 8

st_blocks

Alias for field number 9

st_ctime

Alias for field number 14

st_ctimensec

Alias for field number 15

st_dev

Alias for field number 0

st_gid

Alias for field number 5

st_ino

Alias for field number 1

st_mode

Alias for field number 3

st_mtime

Alias for field number 12

st_mtimensec

Alias for field number 13

st_nlink

Alias for field number 2

st_rdev

Alias for field number 6

st_size

Alias for field number 7

st_uid

Alias for field number 4
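
The "Alias for field number N" entries are the docstrings that collections.namedtuple generates, so the field layout can be reproduced with the standard library alone. The sketch below rebuilds an equivalent tuple type from the constructor signature listed above. It is a local re-creation for illustration, not an import of the angr class, and the values passed in are placeholders equal to each field's index.

```python
from collections import namedtuple

# Field order copied from the Stat constructor signature.
Stat = namedtuple("Stat", (
    "st_dev", "st_ino", "st_nlink", "st_mode", "st_uid", "st_gid",
    "st_rdev", "st_size", "st_blksize", "st_blocks",
    "st_atime", "st_atimensec", "st_mtime", "st_mtimensec",
    "st_ctime", "st_ctimensec",
))

# Fill every field with its own index so that name and position line up.
st = Stat(*range(16))

nlink_index = Stat._fields.index("st_nlink")
size_by_name = st.st_size
size_by_position = st[7]
```

Each field can be read by name or by position, which is why st.st_size and st[7] return the same value.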

class angr.state_plugins.solver.SimSolver(solver=None, all_variables=None)

Symbolic solver.

reload_solver()

Reloads the solver. Useful when changing solver options.

BVS(name, size, min=None, max=None, stride=None, uninitialized=False, explicit_name=None, inspect=True, events=True, **kwargs)

Creates a bit-vector symbol (i.e., a variable). Other keyword parameters are passed directly on to the constructor of claripy.ast.BV.

Parameters:
  • name – The name of the symbol.
  • size – The size (in bits) of the bit-vector.
  • min – The minimum value of the symbol.
  • max – The maximum value of the symbol.
  • stride – The stride of the symbol.
  • uninitialized – Whether this value should be counted as an “uninitialized” value in the course of an analysis.
  • explicit_name – If False, an identifier is appended to the name to ensure uniqueness.
Returns:

A BV object representing this symbol.

eval_to_ast(*args, **kwargs)

Evaluate an expression, using the solver if necessary. Returns AST objects.

Parameters:
  • e – the expression
  • n – the number of desired solutions
  • extra_constraints – extra constraints to apply to the solver
  • exact – if False, returns approximate solutions
Returns:

a tuple of the solutions, in the form of claripy AST nodes

Return type:

tuple

eval_upto(e, n, cast_to=int, **kwargs)

Evaluate an expression, using the solver if necessary. Returns primitives as specified by the cast_to parameter. Only certain primitives are supported; check the implementation of _cast_to to see which ones.

Parameters:
  • e – the expression
  • n – the number of desired solutions
  • extra_constraints – extra constraints to apply to the solver
  • exact – if False, returns approximate solutions
  • cast_to – A type to cast the resulting values to
Returns:

a tuple of the solutions, in the form of Python primitives

Return type:

tuple

eval(e, **kwargs)

Evaluate an expression to get any possible solution. The desired output types can be specified using the cast_to parameter. extra_constraints can be used to specify additional constraints the returned values must satisfy.

Parameters:
  • e – the expression to get a solution for
  • kwargs – Any additional kwargs will be passed down to eval_upto
Raises:

SimUnsatError – if no solution could be found satisfying the given constraints

Returns:

eval_one(e, **kwargs)

Evaluate an expression to get the only possible solution. Errors if either no or more than one solution is returned. A kwarg parameter default can be specified to be returned instead of failure!

Parameters:
  • e – the expression to get a solution for
  • default – A value can be passed as a kwarg here. It will be returned in case of failure.
  • kwargs – Any additional kwargs will be passed down to eval_upto
Raises:
  • SimUnsatError – if no solution could be found satisfying the given constraints
  • SimValueError – if more than one solution was found to satisfy the given constraints
Returns:

The value for e

eval_atmost(e, n, **kwargs)

Evaluate an expression to get at most n possible solutions. Errors if either none or more than n solutions are returned.

Parameters:
  • e – the expression to get a solution for
  • n – the inclusive upper limit on the number of solutions
  • kwargs – Any additional kwargs will be passed down to eval_upto
Raises:
  • SimUnsatError – if no solution could be found satisfying the given constraints
  • SimValueError – if more than n solutions were found to satisfy the given constraints
Returns:

The solutions for e

eval_atleast(e, n, **kwargs)

Evaluate an expression to get at least n possible solutions. Errors if fewer than n solutions were found.

Parameters:
  • e – the expression to get a solution for
  • n – the inclusive lower limit on the number of solutions
  • kwargs – Any additional kwargs will be passed down to eval_upto
Raises:
  • SimUnsatError – if no solution could be found satisfying the given constraints
  • SimValueError – if fewer than n solutions were found to satisfy the given constraints
Returns:

The solutions for e

eval_exact(e, n, **kwargs)

Evaluate an expression to get exactly the n possible solutions. Errors if any number of solutions other than n was found to exist.

Parameters:
  • e – the expression to get a solution for
  • n – the exact number of solutions expected
  • kwargs – Any additional kwargs will be passed down to eval_upto
Raises:
  • SimUnsatError – if no solution could be found satisfying the given constraints
  • SimValueError – if any number of solutions other than n were found to satisfy the given constraints
Returns:

The solutions for e
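
The differences between the eval_* variants come down to how many solutions are requested and which counts are treated as errors. The model below replaces the solver with a finite Python set of candidate values. All toy_* names and the two exception classes are hypothetical stand-ins. Three things are assumptions of this sketch: the toy eval_upto raises when no solution exists, the variants ask for n + 1 solutions to detect "too many", and results are sorted only to keep the example deterministic.

```python
class ToyUnsatError(Exception):
    """Hypothetical stand-in for SimUnsatError."""


class ToyValueError(Exception):
    """Hypothetical stand-in for SimValueError."""


def toy_eval_upto(solutions, n):
    """Return up to n solutions as a tuple; raise if there are none."""
    found = tuple(sorted(solutions))[:n]
    if not found:
        raise ToyUnsatError("no solution")
    return found


def toy_eval(solutions):
    # Any single solution will do.
    return toy_eval_upto(solutions, 1)[0]


def toy_eval_one(solutions, **kwargs):
    try:
        found = toy_eval_upto(solutions, 2)  # ask for 2 to detect ambiguity
        if len(found) > 1:
            raise ToyValueError("more than one solution")
        return found[0]
    except (ToyUnsatError, ToyValueError):
        if "default" in kwargs:
            return kwargs["default"]
        raise


def toy_eval_atmost(solutions, n):
    found = toy_eval_upto(solutions, n + 1)  # n + 1 reveals "too many"
    if len(found) > n:
        raise ToyValueError("more than %d solutions" % n)
    return found


def toy_eval_atleast(solutions, n):
    found = toy_eval_upto(solutions, n)
    if len(found) < n:
        raise ToyValueError("fewer than %d solutions" % n)
    return found


def toy_eval_exact(solutions, n):
    found = toy_eval_upto(solutions, n + 1)
    if len(found) != n:
        raise ToyValueError("expected exactly %d solutions" % n)
    return found


only = toy_eval_one({5})
fallback = toy_eval_one({1, 2}, default=-1)  # ambiguous, so the default is used
at_most = toy_eval_atmost({1, 2}, 2)
at_least = toy_eval_atleast({1, 2, 3}, 2)
exact = toy_eval_exact({1, 2, 3}, 3)
```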

Storage

class angr.state_plugins.view.SimMemView(ty=None, addr=None, state=None)

This is a convenient interface with which you can access a program’s memory.

The interface works like this:

  • You first use [array index notation] to specify the address you’d like to load from
  • If there is a pointer at that address, you may access the deref property to return a SimMemView at the address present in memory.
  • You then specify a type for the data by simply accessing a property of that name. For a list of supported types, look at state.mem.types.
  • You can then refine the type. Any type may support any refinement it likes. Right now the only refinements supported are that you may access any member of a struct by its member name, and you may index into a string or array to access that element.
  • If the address you specified initially points to an array of that type, you can say .array(n) to view the data as an array of n elements.
  • Finally, extract the structured data with .resolved or .concrete. .resolved will return bitvector values, while .concrete will return integer, string, array, etc values, whatever best represents the data.
  • Alternately, you may store a value to memory, by assigning to the chain of properties that you’ve constructed. Note that because of the way python works, x = s.mem[...].prop; x = val will NOT work, you must say s.mem[...].prop = val.

For example:

>>> s.mem[0x601048].long
<long (64 bits) <BV64 0x4008d0> at 0x601048>
>>> s.mem[0x601048].long.resolved
<BV64 0x4008d0>
>>> s.mem[0x601048].deref
<<untyped> <unresolvable> at 0x4008d0>
>>> s.mem[0x601048].deref.string.concrete
'SOSNEAKY'
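
The note about storing values (x = s.mem[...].prop; x = val does not write to memory) follows from ordinary Python name binding, and can be reproduced without angr. ToyMem and ToyView below are hypothetical classes backed by a dictionary. Unlike SimMemView, the toy long property returns a plain integer rather than another view, which is a simplification.

```python
class ToyView(object):
    """Toy view of one address in a dict-backed memory (not SimMemView)."""

    def __init__(self, store, addr):
        self._store = store
        self._addr = addr

    @property
    def long(self):
        return self._store.get(self._addr, 0)

    @long.setter
    def long(self, value):
        self._store[self._addr] = value


class ToyMem(object):
    def __init__(self):
        self.store = {}

    def __getitem__(self, addr):
        return ToyView(self.store, addr)


mem = ToyMem()

x = mem[0x601048].long          # x holds the loaded value (0 here)
x = 0x1234                      # rebinds the local name; memory is untouched
untouched = mem[0x601048].long

mem[0x601048].long = 0x1234     # assigning through the property chain stores
stored = mem[0x601048].long
```

Only the second form reaches the setter, so only it changes the backing store.
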
class angr.storage.file.SimFile(name, mode, pos=0, content=None, size=None, closed=None)

Represents a file.

variables()
Returns:the symbolic variable names associated with the file.
read(dst_addr, length)

Reads some data from the current (or provided) position of the file.

Parameters:
  • dst_addr – If specified, the data is written to that address.
  • length – The length of the read.
Returns:

The length of the read.

concretize(**kwargs)

Returns a concrete value for this file satisfying the current state constraints.

Or: generate a testcase for this file.

merge(others, merge_conditions, common_ancestor=None)

Merges the SimFile object with others.

class angr.storage.file.SimDialogue(name, mode=None, pos=0, content=None, size=None, dialogue_entries=None)

Emulates a dialogue with a program. Enables us to perform concrete short reads.

add_dialogue_entry(dialogue_len)

Add a new dialogue piece to the end of the dialogue.

read(dst_addr, length)

Reads some data from current dialogue entry, emulates short reads.

class angr.storage.memory.AddressWrapper(region, region_base_addr, address, is_on_stack, function_address)

AddressWrapper is used in SimAbstractMemory. It provides extra meta information for an address (or a ValueSet object) that is normalized from an integer/BVV/StridedInterval.

Constructor for the class AddressWrapper.

Parameters:
  • region (str) – Name of the memory region it belongs to.
  • region_base_addr (int) – Base address of the memory region
  • address – An address (not a ValueSet object).
  • is_on_stack (bool) – Whether this address is on a stack region or not.
  • function_address (int) – Related function address (if any).
to_valueset(state)

Convert to a ValueSet instance

Parameters:state – A state
Returns:The converted ValueSet instance
class angr.storage.memory.RegionDescriptor(region_id, base_address, related_function_address=None)

Descriptor for a memory region ID.

class angr.storage.memory.RegionMap(is_stack)

Mostly used in SimAbstractMemory, RegionMap stores a series of mappings between concrete memory address ranges and memory regions, like stack frames and heap regions.

Constructor

Parameters:is_stack – Whether this is a region map for stack frames or not. Different strategies apply for stack regions.
map(absolute_address, region_id, related_function_address=None)

Add a mapping between an absolute address and a region ID. If this is a stack region map, all stack regions beyond (lower than) this newly added region will be discarded.

Parameters:
  • absolute_address – An absolute memory address.
  • region_id – ID of the memory region.
  • related_function_address – A related function address, mostly used for stack regions.
unmap_by_address(absolute_address)

Removes a mapping based on its absolute address.

Parameters:absolute_address – An absolute address
absolutize(region_id, relative_address)

Convert a relative address in some memory region to an absolute address.

Parameters:
  • region_id – The memory region ID
  • relative_address – The relative memory offset in that memory region
Returns:

An absolute address if converted, or an exception is raised when region id does not exist.

relativize(absolute_address, target_region_id=None)

Convert an absolute address to the memory offset in a memory region.

Note that if an address that belongs to a heap region is passed in to a stack region map, it will be converted to an offset in the closest stack frame, and vice versa when a stack address is passed to a heap region map. Therefore you should only pass in addresses that belong to the same category (stack or non-stack) as this region map.

Parameters:absolute_address – An absolute memory address
Returns:A tuple of the closest region ID, the relative offset, and the related function address.
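
The behaviour described for map(), absolutize(), and relativize() can be sketched in plain Python. This is an illustrative model only, not angr's RegionMap: ToyRegionMap is hypothetical, and the rule for choosing the "closest" region (nearest base at or above the address for stacks, at or below otherwise) is an assumption made for this example.

```python
import bisect


class ToyRegionMap:
    """Illustrative model only; not angr's RegionMap implementation."""

    def __init__(self, is_stack):
        self.is_stack = is_stack
        self._bases = []     # sorted base addresses
        self._by_base = {}   # base -> (region_id, related_function_address)
        self._by_id = {}     # region_id -> base

    def _remove(self, base):
        region_id, _ = self._by_base.pop(base)
        del self._by_id[region_id]
        self._bases.remove(base)

    def map(self, absolute_address, region_id, related_function_address=None):
        if self.is_stack:
            # Discard stack regions beyond (lower than) the new one.
            for base in [b for b in self._bases if b < absolute_address]:
                self._remove(base)
        if absolute_address in self._by_base:
            self._remove(absolute_address)
        bisect.insort(self._bases, absolute_address)
        self._by_base[absolute_address] = (region_id, related_function_address)
        self._by_id[region_id] = absolute_address

    def unmap_by_address(self, absolute_address):
        self._remove(absolute_address)

    def absolutize(self, region_id, relative_address):
        if region_id not in self._by_id:
            raise KeyError("unknown region id: %r" % (region_id,))
        return self._by_id[region_id] + relative_address

    def relativize(self, absolute_address):
        if not self._bases:
            raise LookupError("no regions mapped")
        if self.is_stack:
            # Assumed: closest frame base at or above the address.
            i = min(bisect.bisect_left(self._bases, absolute_address),
                    len(self._bases) - 1)
        else:
            # Assumed: closest region base at or below the address.
            i = max(bisect.bisect_right(self._bases, absolute_address) - 1, 0)
        base = self._bases[i]
        region_id, function_address = self._by_base[base]
        return region_id, absolute_address - base, function_address


stack = ToyRegionMap(is_stack=True)
stack.map(0x7fff0000, "stack_main", 0x400000)
stack.map(0x7ffe0000, "stack_foo", 0x400100)
foo_local = stack.relativize(0x7ffe0000 - 8)

# Mapping a frame above stack_foo discards the lower stack_foo region.
stack.map(0x7ffe8000, "stack_bar", 0x400200)
```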
class angr.storage.memory.MemoryStoreRequest(addr, data=None, size=None, condition=None, endness=None)

A MemoryStoreRequest is used internally by SimMemory to track memory request data.

class angr.storage.memory.SimMemory(endness=None, abstract_backer=None, stack_region_map=None, generic_region_map=None)

Represents the memory space of the process.

category

Return the category of this SimMemory instance. It can be one of the three following categories – reg, mem, or file.

set_state(state)

Call the set_state method in SimStatePlugin class, and then perform the delayed initialization.

Parameters:state – The SimState instance
set_stack_address_mapping(absolute_address, region_id, related_function_address=None)

Create a new mapping between an absolute address (which is the base address of a specific stack frame) and a region ID.

Parameters:
  • absolute_address – The absolute memory address.
  • region_id – The region ID.
  • related_function_address – Related function address.
unset_stack_address_mapping(absolute_address)

Remove a stack mapping.

Parameters:absolute_address – An absolute memory address, which is the base address of the stack frame to destroy.
stack_id(function_address)

Return a memory region ID for a function. If the default region ID already exists in the region mapping, an integer will be appended to the region name. In this way we can handle recursive function calls, or a function that appears more than once in the call frame.

This also means that stack_id() should only be called when creating a new stack frame for a function. You are not supposed to call this function every time you want to map a function address to a stack ID.

Parameters:function_address (int) – Address of the function.
Returns:ID of the new memory region.
Return type:str
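
To make the suffixing rule concrete, here is a small sketch of how such IDs could be generated. The function name new_stack_id and the exact ID format ("stack_0x..." plus "_N") are assumptions chosen for illustration; the text above only states that an integer is appended when the default ID is already taken.

```python
def new_stack_id(function_address, existing_ids):
    """Return a region ID that is not yet present in existing_ids."""
    default_id = "stack_0x%x" % function_address   # assumed naming scheme
    if default_id not in existing_ids:
        return default_id
    suffix = 0
    while "%s_%d" % (default_id, suffix) in existing_ids:
        suffix += 1
    return "%s_%d" % (default_id, suffix)


# Three nested (recursive) frames of the same function get distinct IDs.
regions = []
for _ in range(3):
    regions.append(new_stack_id(0x400100, set(regions)))
```

Because every call yields a fresh ID, calling such a helper merely to look up an existing frame would create a mismatch, which is why stack_id() should be called only when a new frame is created.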
store(addr, data, size=None, condition=None, add_constraints=None, endness=None, action=None, inspect=True, priv=None, disable_actions=False)

Stores content into memory.

Parameters:
  • addr – A claripy expression representing the address to store at.
  • data – The data to store (claripy expression or something convertible to a claripy expression).
  • size – A claripy expression representing the size of the data to store.

The following parameters are optional.

Parameters:
  • condition – A claripy expression representing a condition if the store is conditional.
  • add_constraints – Add constraints resulting from the merge (default: True).
  • endness – The endianness for the data.
  • action – A SimActionData to fill out with the final written value and constraints.
  • inspect (bool) – Whether this store should trigger SimInspect breakpoints or not.
  • disable_actions (bool) – Whether this store should avoid creating SimActions or not. When set to False, state options are respected.
store_cases(addr, contents, conditions, fallback=None, add_constraints=None, endness=None, action=None)

Stores content into memory, conditional by case.

Parameters:
  • addr – A claripy expression representing the address to store at.
  • contents – A list of bitvectors, not necessarily of the same size. Use None to denote an empty write.
  • conditions – A list of conditions. Must be equal in length to contents.

The following parameters are optional.

Parameters:
  • fallback – A claripy expression representing what the write should resolve to if all conditions evaluate to false (default: whatever was there before).
  • add_constraints – Add constraints resulting from the merge (default: True)
  • endness – The endianness for contents as well as fallback.
  • action (SimActionData) – A SimActionData to fill out with the final written value and constraints.
load(addr, size=None, condition=None, fallback=None, add_constraints=None, action=None, endness=None, inspect=True, disable_actions=False, ret_on_segv=False)

Loads size bytes from addr.

Parameters:
  • addr – The address to load from.
  • size – The size (in bytes) of the load.
  • condition – A claripy expression representing a condition for a conditional load.
  • fallback – A fallback value if the condition ends up being False.
  • add_constraints – Add constraints resulting from the merge (default: True).
  • action – A SimActionData to fill out with the constraints.
  • endness – The endness to load with.
  • inspect (bool) – Whether this load should trigger SimInspect breakpoints or not.
  • disable_actions (bool) – Whether this load should avoid creating SimActions or not. When set to False, state options are respected.
  • ret_on_segv (bool) – Whether to return the memory that has already been loaded before a segmentation fault is triggered. The default is False.

There are a few possible return values. If no condition or fallback are passed in, then the return is the bytes at the address, in the form of a claripy expression. For example:

<A BVV(0x41, 32)>

On the other hand, if a condition and fallback are provided, the value is conditional:

<A If(condition, BVV(0x41, 32), fallback)>
normalize_address(addr, is_write=False)

Normalize addr for use in static analysis (with the abstract memory model). In non-abstract mode, simply returns the address in a single-element list.

find(addr, what, max_search=None, max_symbolic_bytes=None, default=None, step=1)

Returns the address of bytes equal to ‘what’, starting from ‘addr’. Note that, if you don’t specify a default value, this search could cause the state to go unsat if no possible matching byte exists.

Parameters:
  • addr – The start address.
  • what – What to search for.
  • max_search – Search at most this many bytes.
  • max_symbolic_bytes – Search through at most this many symbolic bytes.
  • default – The default value, if what you’re looking for wasn’t found.
Returns:

An expression representing the address of the matching byte.
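
For intuition, the concrete analogue of this search over a plain bytes object might look like the sketch below. It is not SimMemory.find(): toy_find is a hypothetical helper, it works on concrete bytes only, and treating step as the stride between candidate addresses is an assumption, since step is not documented above.

```python
def toy_find(memory, addr, what, max_search=None, default=None, step=1):
    """Return the first offset >= addr where `what` occurs, else `default`."""
    limit = len(memory)
    if max_search is not None:
        limit = min(limit, addr + max_search)
    # Try each candidate start address, advancing by `step`.
    for candidate in range(addr, limit - len(what) + 1, step):
        if memory[candidate:candidate + len(what)] == what:
            return candidate
    return default


memory = b"hello\x00world\x00"
first_nul = toy_find(memory, 0, b"\x00")
second_nul = toy_find(memory, 6, b"\x00")
not_found = toy_find(memory, 0, b"\x00", max_search=3, default=-1)
```

In the symbolic setting the result is an expression rather than an integer, and the absence of a default is what can make the state unsatisfiable: the returned address is then constrained to be one of the matches even when none exists.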

copy_contents(dst, src, size, condition=None, src_memory=None, dst_memory=None, inspect=True, disable_actions=False)

Copies data within a memory.

Parameters:
  • dst – A claripy expression representing the address of the destination
  • src – A claripy expression representing the address of the source

The following parameters are optional.

Parameters:
  • src_memory – Copy data from this SimMemory instead of self
  • dst_memory – Copy data to this SimMemory instead of self
  • size – A claripy expression representing the size of the copy
  • condition – A claripy expression representing a condition, if the write should be conditional. If this is determined to be false, the size of the copy will be 0.
class angr.state_plugins.abstract_memory.SimAbstractMemory(memory_backer=None, memory_id='mem', endness=None, stack_region_map=None, generic_region_map=None)

This is an implementation of the abstract store in paper [TODO].

Some differences:

  • For stack variables, we map the absolute stack address to each region so that we can effectively trace stack accesses. When tracing into a new function, you should call set_stack_address_mapping() to create a new mapping. When exiting from a function, you should cancel the previous mapping by calling unset_stack_address_mapping(). Currently this is only used for stack!
create_region(key, state, is_stack, related_function_addr, endness, backer_dict=None)

Create a new MemoryRegion with the region key specified, and store it to self._regions.

Parameters:
  • key – a string which is the region key
  • state – the SimState instance
  • is_stack (bool) – Whether this memory region is on stack. True/False
  • related_function_addr – Which function first creates this memory region. Just for reference.
  • endness – The endianness.
  • backer_dict – The memory backer object.
Returns:

None

set_state(state)

Overriding the SimStatePlugin.set_state() method

Parameters:state – A SimState object
Returns:None
normalize_address(addr, is_write=False, convert_to_valueset=False, target_region=None)

Convert a ValueSet object into a list of addresses.

Parameters:
  • addr – A ValueSet object (which describes an address)
  • is_write – Is this address used in a write or not
  • convert_to_valueset – True if you want to have a list of ValueSet instances instead of AddressWrappers, False otherwise
  • target_region – Which region to normalize the address to. To leave the decision to SimuVEX, set it to None
Returns:

A list of AddressWrapper or ValueSet objects

get_segments(addr, size)

Get a segmented memory region based on AbstractLocation information available from VSA.

Here are some assumptions to make this method fast:
  • The entire memory region [addr, addr + size] is located within the same MemoryRegion
  • The address ‘addr’ has only one concrete value. It cannot be concretized to multiple values.
Parameters:
  • addr – An address
  • size – Size of the memory area in bytes
Returns:

An ordered list of the sizes of each segment in the requested memory region

copy()

Make a copy of this SimAbstractMemory object.

merge(others, merge_conditions, common_ancestor=None)

Merge this SimAbstractMemory instance with other SimAbstractMemory instances.

map_region(addr, length, permissions, init_zero=False)

Map a number of pages at address addr with permissions permissions.

Parameters:
  • addr – Address to map the pages at.
  • length – Length in bytes of the region to map; will be rounded upwards to the page size.
  • permissions – AST of permissions to map; will be a bitvalue representing flags.

unmap_region(addr, length)

Unmap a number of pages at address addr.

Parameters:
  • addr – Address to unmap the pages at.
  • length – Length in bytes of the region to unmap; will be rounded upwards to the page size.

dbg_print()

Print out debugging information

class angr.storage.memory_object.SimMemoryObject(object, base, length=None)

A SimMemoryObject instance is a reference to a byte or several bytes in a specific object in SimSymbolicMemory. It is only used inside the SimSymbolicMemory class.

class angr.storage.paged_memory.BasePage(page_addr, page_size, permissions=None, executable=False)

Page object, allowing for more flexibility than just a raw dict.

Create a new page object. Carries permissions information. Permissions default to RW unless executable is True, in which case permissions default to RWX.

Parameters:
  • page_addr (int) – The base address of the page.
  • page_size (int) – The size of the page.
  • executable (bool) – Whether the page is executable. Typically, this will depend on whether the binary has an executable stack.
  • permissions (claripy.AST) – A 3-bit bitvector setting specific permissions for EXEC, READ, and WRITE
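
The defaulting rule for page permissions can be written out as a short sketch. The bit assignment below (read = 1, write = 2, exec = 4, mirroring the POSIX PROT_* flags) is an assumption for illustration, as are the helper names; the text above only says that the value is a 3-bit bitvector and that the default is RW, or RWX when executable is True.

```python
# Assumed bit layout, mirroring POSIX PROT_READ / PROT_WRITE / PROT_EXEC.
PERM_READ, PERM_WRITE, PERM_EXEC = 1, 2, 4


def default_permissions(permissions=None, executable=False):
    """Explicit permissions win; otherwise RW, or RWX if executable."""
    if permissions is not None:
        return permissions & 0b111      # keep only the three flag bits
    perms = PERM_READ | PERM_WRITE
    if executable:
        perms |= PERM_EXEC
    return perms


def describe(perms):
    """Render a permission value in the familiar rwx notation."""
    return "".join(
        flag if perms & bit else "-"
        for flag, bit in (("r", PERM_READ), ("w", PERM_WRITE), ("x", PERM_EXEC))
    )


plain_page = describe(default_permissions())
exec_page = describe(default_permissions(executable=True))
readonly_page = describe(default_permissions(permissions=PERM_READ))
```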
store_mo(state, new_mo, overwrite=True)

Stores a memory object.

Parameters:
  • new_mo – the memory object
  • overwrite – whether to overwrite objects already in memory (if false, just fill in the holes)
load_mo(state, page_idx)

Loads a memory object from memory.

Parameters:page_idx – the index into the page
Returns:a tuple of the object
load_slice(state, start, end)

Return the memory objects overlapping with the provided slice.

Parameters:
  • start – the start address
  • end – the end address (non-inclusive)
Returns:

tuples of (starting_addr, memory_object)
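
The overwrite flag of store_mo() and the shape of load_slice()'s result can be illustrated with a list-backed toy page. This is a simplified sketch, not BasePage or ListPage: ToyListPage is hypothetical, it omits the state argument, it takes base and length directly instead of a memory object carrying them, and it assumes that starting_addr means the first address at which an object becomes visible within the slice.

```python
class ToyListPage:
    """One slot per byte; each slot holds the object covering that byte."""

    def __init__(self, page_addr, page_size):
        self.page_addr = page_addr
        self.slots = [None] * page_size

    def store_mo(self, base, length, obj, overwrite=True):
        for addr in range(base, base + length):
            idx = addr - self.page_addr
            if not 0 <= idx < len(self.slots):
                continue                      # outside this page
            if overwrite or self.slots[idx] is None:
                self.slots[idx] = obj         # else: only fill the holes

    def load_slice(self, start, end):
        lo = max(start, self.page_addr)
        hi = min(end, self.page_addr + len(self.slots))   # end is exclusive
        result, last = [], None
        for addr in range(lo, hi):
            obj = self.slots[addr - self.page_addr]
            if obj is not None and obj is not last:
                result.append((addr, obj))
            last = obj
        return result


page = ToyListPage(0x1000, 16)
page.store_mo(0x1000, 4, "A")
page.store_mo(0x1002, 4, "B", overwrite=False)   # fills 0x1004-0x1005 only
filled = page.load_slice(0x1000, 0x1010)

page.store_mo(0x1002, 2, "C")                    # overwrites part of "A"
overwritten = page.load_slice(0x1000, 0x1010)
```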

class angr.storage.paged_memory.TreePage(*args, **kwargs)

Page object, implemented with a bintree.

load_mo(state, page_idx)

Loads a memory object from memory.

Parameters:page_idx – the index into the page
Returns:a tuple of the object
load_slice(state, start, end)

Return the memory objects overlapping with the provided slice.

Parameters:
  • start – the start address
  • end – the end address (non-inclusive)
Returns:

tuples of (starting_addr, memory_object)

class angr.storage.paged_memory.ListPage(*args, **kwargs)

Page object, implemented with a list.

load_mo(state, page_idx)

Loads a memory object from memory.

Parameters:page_idx – the index into the page
Returns:a tuple of the object
load_slice(state, start, end)

Return the memory objects overlapping with the provided slice.

Parameters:
  • start – the start address
  • end – the end address (non-inclusive)
Returns:

tuples of (starting_addr, memory_object)

angr.storage.paged_memory.Page

alias of ListPage

class angr.storage.paged_memory.SimPagedMemory(memory_backer=None, permissions_backer=None, pages=None, initialized=None, name_mapping=None, hash_mapping=None, page_size=None, symbolic_addrs=None, check_permissions=False)

Represents paged memory.

load_objects(addr, num_bytes, ret_on_segv=False)

Load memory objects from paged memory.

Parameters:
  • addr – Address to start loading.
  • num_bytes – Number of bytes to load.
  • ret_on_segv (bool) – True if you want load_bytes to return directly when a SIGSEGV is triggered, otherwise a SimSegfaultError will be raised.
Returns:

list of tuples of (addr, memory_object)

Return type:

tuple

contains_no_backer(addr)

Tests if the address is contained in any page of paged memory, without considering memory backers.

Parameters:addr (int) – The address to test.
Returns:True if the address is included in one of the pages, False otherwise.
Return type:bool
store_memory_object(mo, overwrite=True)

This function optimizes a large store by storing a single reference to the SimMemoryObject instead of one for each byte.

Parameters:mo – the memory object to store
replace_memory_object(old, new_content)

Replaces the memory object old with a new memory object containing new_content.

Parameters:
  • old – A SimMemoryObject (i.e., one from memory_objects_for_hash() or memory_objects_for_name()).
  • new_content – The content (claripy expression) for the new memory object.
Returns:

the new memory object

replace_all(old, new)

Replaces all instances of expression old with expression new.

Parameters:
  • old – A claripy expression. Must contain at least one named variable (to make it possible to use the name index for speedup).
  • new – The new variable to replace it with.
addrs_for_name(n)

Returns addresses that contain expressions that contain a variable named n.

addrs_for_hash(h)

Returns addresses that contain expressions that contain a variable with the hash of h.

memory_objects_for_name(n)

Returns a set of SimMemoryObjects that contain expressions that contain a variable with the name of n.

This is useful for replacing those values in one fell swoop with replace_memory_object(), even if they have been partially overwritten.

memory_objects_for_hash(n)

Returns a set of SimMemoryObjects that contain expressions that contain a variable with the hash n.

permissions(addr, permissions=None)

Returns the permissions for a page at address addr.

If optional argument permissions is given, set page permissions to that prior to returning permissions.

class angr.concretization_strategies.SimConcretizationStrategy(filter=None, exact=True)

Concretization strategies control the resolution of symbolic memory indices in SimuVEX. By subclassing this class and setting it as a concretization strategy (on state.memory.read_strategies and state.memory.write_strategies), SimuVEX’s memory index concretization behavior can be modified.

Initializes the base SimConcretizationStrategy.

Parameters:
  • filter – A function, taking arguments of (SimMemory, claripy.AST), that determines if this strategy can handle resolving the provided AST.
  • exact – A flag (default: True) that determines if the convenience resolution functions provided by this class use exact or approximate resolution.
concretize(memory, addr)

Concretizes the address into a list of values. If this strategy cannot handle this address, returns None.

copy()

Returns a copy of the strategy, if there is data that should be kept separate between states. If not, returns self.

merge(others)

Merges this strategy with others, if there is data that should be kept separate between states. If not, this is a no-op.
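
The contract of concretize() (return a list of values, or None when the strategy cannot handle the address) suggests a chain of strategies. The sketch below models that idea in plain Python. Everything in it is a stand-in: the Toy* classes do not subclass SimConcretizationStrategy, a symbolic address is modeled as a set of candidate integers, and trying strategies in order until one succeeds is an assumption about how the result is consumed.

```python
class ToyStrategy:
    """Stand-in base class: a filter decides whether the strategy applies."""

    def __init__(self, filter=None):
        self._filter = filter

    def concretize(self, memory, addr):
        if self._filter is not None and not self._filter(memory, addr):
            return None                 # cannot handle this address
        return self._concretize(memory, addr)

    def _concretize(self, memory, addr):
        raise NotImplementedError


class ToySingle(ToyStrategy):
    """Handles only addresses with exactly one possible value."""

    def _concretize(self, memory, addr):
        return sorted(addr) if len(addr) == 1 else None


class ToyMax(ToyStrategy):
    """Always answers with the largest candidate."""

    def _concretize(self, memory, addr):
        return [max(addr)]


def resolve(strategies, memory, addr):
    # Assumed usage: the first strategy that returns non-None wins.
    for strategy in strategies:
        result = strategy.concretize(memory, addr)
        if result is not None:
            return result
    raise ValueError("no strategy could concretize the address")


chain = [ToySingle(), ToyMax()]
unique = resolve(chain, None, {0x1000})
ambiguous = resolve(chain, None, {0x1000, 0x2000})
```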

Simulation Manager

class angr.manager.SimulationManager(project, active_states=None, stashes=None, hierarchy=None, veritesting=None, veritesting_options=None, immutable=None, resilience=None, save_unconstrained=None, save_unsat=None, threads=None, errored=None)

The Simulation Manager is the future future.

Simulation managers allow you to wrangle multiple states in a slick way. States are organized into “stashes”, which you can step forward, filter, merge, and move around as you wish. This allows you to, for example, step two different stashes of states at different rates, then merge them together.

Stashes can be accessed as attributes (i.e. .active). A mulpyplexed stash can be retrieved by prepending the name with mp_, e.g. .mp_active. A single state from the stash can be retrieved by prepending the name with one_, e.g. .one_active.

Note that you shouldn’t usually be constructing SimulationManagers directly - there is a convenient shortcut for creating them in Project.factory: see angr.factory.AngrObjectFactory.

Parameters:project (angr.project.Project) – A Project instance.

The following parameters are optional.

Parameters:
  • active_states – Active states to seed the “active” stash with.
  • stashes – A dictionary to use as the stash store.
  • hierarchy – A StateHierarchy object to use to track the relationships between states.
  • immutable – If True, all operations will return a new SimulationManager. Otherwise (default), all operations will modify the SimulationManager (and return it, for consistency and chaining).
  • threads – the number of worker threads to concurrently analyze states (useful in z3-intensive situations).

Multithreading your search can be useful in constraint-solving-intensive situations. Although Python itself cannot run threads in parallel due to its GIL, z3, which is native code, can.

The most important methods you should look at are step, explore, and use_technique.

Variables:
  • errored – Not a stash, but a list of ErrorRecords. Whenever a step raises an exception that we catch, the state and some information about the error are placed in this list. You can adjust the list of caught exceptions with the resilience parameter.
  • stashes – All the stashes on this instance, as a dictionary.
mulpyplex(*stashes)

Mulpyplex across several stashes.

Parameters:stashes – the stashes to mulpyplex
Returns:a mulpyplexed list of states from the stashes in question, in the specified order
apply(state_func=None, stash_func=None, stash=None)

Applies a given function to a given stash.

Parameters:
  • state_func – A function to apply to every state. Should take a state and return a state. The returned state will take the place of the old state. If the function doesn’t return a state, the old state will be used. If the function returns a list of states, they will replace the original states.
  • stash_func – A function to apply to the whole stash. Should take a list of states and return a list of states. The resulting list will replace the stash. If both state_func and stash_func are provided, state_func is applied first, then stash_func is applied on the results.

Returns:

The resulting SimulationManager.

Return type:

SimulationManager
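
The replacement rules for state_func and stash_func can be sketched over a plain list. This is a toy model, not SimulationManager.apply(): toy_apply is hypothetical, states are modeled as integers, and "doesn't return a state" is interpreted here as returning None.

```python
def toy_apply(stash, state_func=None, stash_func=None):
    """Apply state_func per state, then stash_func to the whole list."""
    result = []
    for state in stash:
        out = state_func(state) if state_func is not None else None
        if out is None:
            result.append(state)        # keep the old state
        elif isinstance(out, (list, tuple)):
            result.extend(out)          # several states replace one
        else:
            result.append(out)          # a single replacement state
    if stash_func is not None:
        result = list(stash_func(result))
    return result


def fork_twos(state):
    # Returns a list for 2, and nothing (None) for everything else.
    if state == 2:
        return [state, state * 10]


forked = toy_apply([1, 2, 3], state_func=fork_twos)
reordered = toy_apply(
    [1, 2, 3],
    state_func=fork_twos,
    stash_func=lambda states: sorted(states, reverse=True),
)
```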

split(stash_splitter=None, stash_ranker=None, state_ranker=None, limit=None, from_stash=None, to_stash=None)

Split a stash of states. The stash from_stash will be split into two stashes depending on the other options passed in. If to_stash is provided, the second stash will be written there.

stash_splitter overrides stash_ranker, which in turn overrides state_ranker. If no functions are provided, the states are simply split according to the limit.

The sort done with state_ranker is ascending.

Parameters:
  • stash_splitter – A function that should take a list of states and return a tuple of two lists (the two resulting stashes).
  • stash_ranker – A function that should take a list of states and return a sorted list of states. This list will then be split according to “limit”.
  • state_ranker – An alternative to stash_splitter. States will be sorted with the outputs of this function used as a key. The first “limit” of them will be kept, the rest split off.
  • limit – For use with state_ranker. The number of states to keep. Default: 8
  • from_stash – The stash to split (default: ‘active’)
  • to_stash – The stash to write to (default: ‘stashed’)
Returns:

The resulting SimulationManager.

Return type:

SimulationManager
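
The precedence between the three callbacks can be sketched as a function over a plain list. This toy_split helper is hypothetical and returns the two resulting lists directly, instead of writing them into from_stash and to_stash as the real method does.

```python
def toy_split(states, stash_splitter=None, stash_ranker=None,
              state_ranker=None, limit=8):
    """Return (kept, split_off) following the documented precedence."""
    if stash_splitter is not None:
        kept, split_off = stash_splitter(states)
    else:
        if stash_ranker is not None:
            ranked = stash_ranker(states)
        elif state_ranker is not None:
            ranked = sorted(states, key=state_ranker)   # ascending
        else:
            ranked = list(states)
        kept, split_off = ranked[:limit], ranked[limit:]
    return list(kept), list(split_off)


states = [5, 3, 9, 1]
by_rank = toy_split(states, state_ranker=lambda s: s, limit=2)
by_limit = toy_split(states, limit=3)

# stash_splitter overrides state_ranker when both are given.
by_splitter = toy_split(
    [1, 2, 3, 4],
    stash_splitter=lambda ss: ([s for s in ss if s % 2],
                               [s for s in ss if not s % 2]),
    state_ranker=lambda s: -s,
)
```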

step(n=None, selector_func=None, step_func=None, stash=None, successor_func=None, until=None, **kwargs)

Step a stash of states forward and categorize the successors appropriately.

The parameters to this function allow you to control everything about the stepping and categorization process.

Parameters:
  • stash – The name of the stash to step (default: ‘active’)
  • n – The number of times to step (default: 1 if “until” is not provided)
  • selector_func – If provided, should be a function that takes a state and returns a boolean. If True, the state will be stepped. Otherwise, it will be kept as-is.
  • step_func – If provided, should be a function that takes a SimulationManager and returns a SimulationManager. Will be called with the SimulationManager at every step. Note that this function should not actually perform any stepping - it is meant to be a maintenance function called after each step.
  • successor_func – If provided, should be a function that takes a state and return its successors. Otherwise, project.factory.successors will be used.
  • until – If provided, should be a function that takes a SimulationManager and returns True or False. Stepping will terminate when it is True.

Additionally, you can pass in any of the following keyword args for project.factory.sim_run:

Parameters:
  • jumpkind – The jumpkind of the previous exit
  • addr – An address to execute at instead of the state’s ip.
  • stmt_whitelist – A list of stmt indexes to which to confine execution.
  • last_stmt – A statement index at which to stop execution.
  • thumb – Whether the block should be lifted in ARM’s THUMB mode.
  • backup_state – A state to read bytes from instead of using project memory.
  • opt_level – The VEX optimization level to use.
  • insn_bytes – A string of bytes to use for the block instead of the project.
  • size – The maximum size of the block, in bytes.
  • num_inst – The maximum number of instructions.
  • traceflags – traceflags to be passed to VEX. Default: 0

The following parameters are specific to the unicorn-engine.

Parameters:extra_stop_points – A collection of addresses where unicorn should stop, in addition to default program points at which unicorn stops (e.g., hook points).
Returns:The resulting SimulationManager.
Return type:SimulationManager
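
The interplay of n, selector_func, and until can be sketched with a dictionary of stashes. This is a toy loop, not SimulationManager.step(): toy_step is hypothetical, states are integers, and placing states without successors in a stash named "deadended" is an assumption made for the example.

```python
def toy_step(stashes, successor_func, n=None, selector_func=None,
             until=None, stash="active"):
    """Step every selected state in `stash`, up to n times or until()."""
    if n is None and until is None:
        n = 1                                   # documented default
    steps = 0
    while stashes[stash] and (n is None or steps < n):
        next_states = []
        for state in stashes[stash]:
            if selector_func is not None and not selector_func(state):
                next_states.append(state)       # not selected: kept as-is
                continue
            successors = successor_func(state)
            if successors:
                next_states.extend(successors)
            else:
                # Assumed stash name for states with no successors.
                stashes.setdefault("deadended", []).append(state)
        stashes[stash] = next_states
        steps += 1
        if until is not None and until(stashes):
            break
    return stashes


def count_to_three(state):
    return [state + 1] if state < 3 else []


two_steps = toy_step({"active": [0]}, count_to_three, n=2)
to_the_end = toy_step({"active": [0]}, count_to_three,
                      until=lambda st: not st["active"])
selective = toy_step({"active": [0, 10]}, count_to_three,
                     selector_func=lambda s: s < 5)
```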
prune(filter_func=None, from_stash=None, to_stash=None)

Prune unsatisfiable states from a stash. This function will move all unsatisfiable states in the given stash into a different stash.

Parameters:
  • filter_func – Only prune states that match this filter.
  • from_stash – Prune states from this stash. (default: ‘active’)
  • to_stash – Put pruned states in this stash. (default: ‘pruned’)
Returns:

The resulting SimulationManager.

Return type:

SimulationManager

move(from_stash, to_stash, filter_func=None)

Move states from one stash to another.

Parameters:
  • from_stash – Take matching states from this stash.
  • to_stash – Put matching states into this stash.
  • filter_func – Stash states that match this filter. Should be a function that takes a state and returns True or False. Default: stash all states
Returns:

The resulting SimulationManager.

Return type:

SimulationManager

stash(filter_func=None, from_stash=None, to_stash=None)

Stash some states. This is an alias for move(), with defaults for the stashes.

Parameters:
  • filter_func – Stash states that match this filter. Should be a function that takes a state and returns True or False. (default: stash all states)
  • from_stash – Take matching states from this stash. (default: ‘active’)
  • to_stash – Put matching states into this stash. (default: ‘stashed’)
Returns:

The resulting SimulationManager

Return type:

SimulationManager

drop(filter_func=None, stash=None)

Drops states from a stash. This is an alias for move(), with defaults for the stashes.

Parameters:
  • filter_func – Drop states that match this filter. Should be a function that takes a state and returns True or False. (default: drop all states)
  • stash – Drop matching states from this stash. (default: ‘active’)
Returns:

The resulting SimulationManager

Return type:

SimulationManager

unstash(filter_func=None, to_stash=None, from_stash=None)

Unstash some states. This is an alias for move(), with defaults for the stashes.

Parameters:
  • filter_func – Unstash states that match this filter. Should be a function that takes a state and returns True or False. (default: unstash all states)
  • from_stash – take matching states from this stash. (default: ‘stashed’)
  • to_stash – put matching states into this stash. (default: ‘active’)
Returns:

The resulting SimulationManager.

Return type:

SimulationManager
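
Since stash(), unstash(), and drop() are described as move() with default stash names, their relationship can be sketched in a few lines. The helpers below are hypothetical and operate on a plain dictionary of lists; in particular, the temporary "_dropped" stash used to discard states is an assumption, as the text above does not say where dropped states go.

```python
def move(stashes, from_stash, to_stash, filter_func=None):
    """Move states matching filter_func (default: all) between stashes."""
    kept, moved = [], []
    for state in stashes.get(from_stash, []):
        if filter_func is None or filter_func(state):
            moved.append(state)
        else:
            kept.append(state)
    stashes[from_stash] = kept
    stashes.setdefault(to_stash, []).extend(moved)
    return stashes


def stash(stashes, filter_func=None, from_stash="active", to_stash="stashed"):
    return move(stashes, from_stash, to_stash, filter_func)


def unstash(stashes, filter_func=None, to_stash="active", from_stash="stashed"):
    return move(stashes, from_stash, to_stash, filter_func)


def drop(stashes, filter_func=None, stash="active"):
    # Assumed: dropped states go to a throwaway stash that is discarded.
    move(stashes, stash, "_dropped", filter_func)
    del stashes["_dropped"]
    return stashes


stashes = {"active": [1, 2, 3, 4, 5, 6]}
stash(stashes, filter_func=lambda s: s % 2 == 0)    # evens -> 'stashed'
drop(stashes, filter_func=lambda s: s == 1)         # 1 disappears
unstash(stashes, filter_func=lambda s: s > 4)       # 6 returns to 'active'
```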

merge(merge_func=None, stash=None)

Merge the states in a given stash.

Parameters:
  • stash – The stash (default: ‘active’)
  • merge_func – If provided, instead of using state.merge, call this function with the states as the argument. Should return the merged state.
Returns:

The resulting SimulationManager.

Return type:

SimulationManager

use_technique(tech)

Use an exploration technique with this SimulationManager. Techniques can be found in angr.exploration_techniques.

Parameters:tech – An ExplorationTechnique object that contains code to modify this SimulationManager’s behavior
stash_not_addr(addr, from_stash=None, to_stash=None)

Stash all states not at address addr from stash from_stash to stash to_stash.

stash_addr(addr, from_stash=None, to_stash=None)

Stash all states at address addr from stash from_stash to stash to_stash.

stash_addr_past(addr, from_stash=None, to_stash=None)

Stash all states containing address addr in their backtrace from stash from_stash to stash to_stash.

stash_not_addr_past(addr, from_stash=None, to_stash=None)

Stash all states not containing address addr in their backtrace from stash from_stash to stash to_stash.

stash_all(from_stash=None, to_stash=None)

Stash all states from stash from_stash to stash to_stash.

unstash_addr(addr, from_stash=None, to_stash=None)

Unstash all states at address addr.

unstash_addr_past(addr, from_stash=None, to_stash=None)

Unstash all states containing address addr in their backtrace.

unstash_not_addr(addr, from_stash=None, to_stash=None)

Unstash all states not at address addr.

unstash_not_addr_past(addr, from_stash=None, to_stash=None)

Unstash all states not containing address addr in their backtrace.

unstash_all(from_stash=None, to_stash=None)

Unstash all states.

explore(stash=None, n=None, find=None, avoid=None, find_stash='found', avoid_stash='avoid', cfg=None, num_find=1, step_func=None)

Tick stash “stash” forward (up to “n” times or until “num_find” states are found), looking for condition “find”, avoiding condition “avoid”. Stores found states into “find_stash” and avoided states into “avoid_stash”.

The “find” and “avoid” parameters may be any of:

  • An address to find
  • A set or list of addresses to find
  • A function that takes a state and returns whether or not it matches.

If an angr CFG is passed in as the “cfg” parameter and “find” is either a number or a list or a set, then any states which cannot possibly reach a success state without going through a failure state will be preemptively avoided.
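
The three accepted forms of “find” and “avoid” can all be reduced to a predicate on states, after which exploration is a loop that sorts successors into stashes. The sketch below is a toy version of that idea, not the real explore(): to_predicate and toy_explore are hypothetical names, states are simple objects with an addr attribute, and the CFG-based pruning is not modeled.

```python
from types import SimpleNamespace


def to_predicate(condition):
    """Normalise an address, a collection of addresses, or a function."""
    if condition is None:
        return lambda state: False
    if callable(condition):
        return condition
    addrs = {condition} if isinstance(condition, int) else set(condition)
    return lambda state: state.addr in addrs


def toy_explore(stashes, successor_func, find=None, avoid=None,
                n=None, num_find=1):
    is_found, is_avoided = to_predicate(find), to_predicate(avoid)
    stashes.setdefault("found", [])
    stashes.setdefault("avoid", [])
    steps = 0
    while stashes["active"] and len(stashes["found"]) < num_find:
        if n is not None and steps >= n:
            break
        next_active = []
        for state in stashes["active"]:
            for succ in successor_func(state):
                if is_found(succ):
                    stashes["found"].append(succ)
                elif is_avoided(succ):
                    stashes["avoid"].append(succ)
                else:
                    next_active.append(succ)
        stashes["active"] = next_active
        steps += 1
    return stashes


# A tiny made-up control-flow graph: 0 -> {1, 2}, 1 -> 3, 2 -> 4.
graph = {0: [1, 2], 1: [3], 2: [4]}


def successors(state):
    return [SimpleNamespace(addr=a) for a in graph.get(state.addr, [])]


result = toy_explore({"active": [SimpleNamespace(addr=0)]},
                     successors, find=3, avoid=[2])
```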

run(stash=None, n=None, step_func=None)

Run until the SimulationManager has reached a completed state, according to the current exploration techniques.

TODO: step_func doesn’t work with veritesting, since veritesting replaces the default step logic.

Parameters:
  • stash – Operate on this stash
  • n – Step at most this many times
  • step_func – If provided, should be a function that takes a SimulationManager and returns a new SimulationManager. Will be called with the current SimulationManager at every step.
Returns:

The resulting SimulationManager.

Return type:

SimulationManager

class angr.manager.ErrorRecord(state, error, traceback)

A container class for a state and an error that was thrown during its execution. You can find these in SimulationManager.errored.

Variables:
  • state – The state that encountered an error, at the point in time just before the erroring step began
  • error – The error that was thrown
  • traceback – The traceback for the error that was thrown
debug()

Launch a postmortem debug shell at the site of the error

class angr.exploration_techniques.ExplorationTechnique

An exploration technique is a set of hooks for a simulation manager that assists in the implementation of new techniques in symbolic exploration.

TODO: choose actual name for the functionality (techniques? strategies?)

Any number of these methods may be overridden by a subclass. To use an exploration technique, call simgr.use_technique with an instance of the technique.

setup(simgr)

Perform any initialization on this manager you might need to do.

step_state(state, **kwargs)

Perform the process of stepping a state forward.

If the stepping fails, return None to fall back to a default stepping procedure. Otherwise, return a dict of stashes to merge into the simulation manager. All the states will be added to the SimulationManager’s stashes based on the mapping in the returned dict.

step(simgr, stash, **kwargs)

Step this stash of this manager forward. Should call simgr.step(stash, **kwargs) in order to do the actual processing.

Return the stepped manager.

filter(state)

Perform filtering on a state.

If the state should not be filtered, return None. If the state should be filtered, return the name of the stash to move the state to. If you want to modify the state before filtering it, return a tuple of the stash to move the state to and the modified state.
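The three possible return values of filter() can be modelled as follows. This is only a sketch of the contract stated above, not angr's own dispatch code; apply_filter_result is a hypothetical helper and the states are plain strings.

```python
def apply_filter_result(result, state):
    """Interpret a filter() return value.

    Returns a (stash_name, state) pair. stash_name is None when the
    state should stay where it is.
    """
    if result is None:
        # Not filtered: leave the state alone.
        return None, state
    if isinstance(result, tuple):
        # (stash, modified_state): move the modified state.
        stash, modified_state = result
        return stash, modified_state
    # A bare stash name: move the original state.
    return result, state
```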

complete(simgr)

Return whether or not this manager has reached a “completed” state, i.e. SimulationManager.run() should halt.

class angr.exploration_techniques.dfs.DFS

Depth-first search.

Keeps only one path active at a time; any others are stashed in the ‘deferred’ stash. When we run out of active paths to step, we take the longest one from deferred and continue.
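A minimal sketch of this stash discipline, independent of angr, is given here. Paths are modelled as lists of visited addresses and dfs_step is a hypothetical helper; the real technique operates on states inside a simulation manager.

```python
def dfs_step(active, deferred):
    """Keep at most one active path and defer the rest.

    When no active path is left, promote the longest deferred path.
    Both arguments are lists of paths; a path is a list of addresses.
    """
    if len(active) > 1:
        # Everything except the first active path is deferred.
        deferred.extend(active[1:])
        del active[1:]
    if not active and deferred:
        # Out of active paths: continue with the longest deferred one.
        longest = max(deferred, key=len)
        deferred.remove(longest)
        active.append(longest)
    return active, deferred
```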

class angr.exploration_techniques.explorer.Explorer(find=None, avoid=None, find_stash='found', avoid_stash='avoid', cfg=None, num_find=1, avoid_priority=False)

Search for up to “num_find” paths that satisfy condition “find”, avoiding condition “avoid”. Stashes found paths into “find_stash” and avoided paths into “avoid_stash”.

The “find” and “avoid” parameters may be any of:

  • An address to find
  • A set or list of addresses to find
  • A function that takes a path and returns whether or not it matches.

If an angr CFG is passed in as the “cfg” parameter and “find” is either a number or a list or a set, then any paths which cannot possibly reach a success state without going through a failure state will be preemptively avoided.

If either the “find” or “avoid” parameter is a function returning a boolean, and a path triggers both conditions, it will be added to the find stash, unless “avoid_priority” is set to True.
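The accepted forms of “find” and “avoid”, and the tie-break controlled by “avoid_priority”, can be sketched without angr. As a simplifying assumption, the predicates below receive a plain address instead of a path, and as_predicate and classify are hypothetical names.

```python
def as_predicate(condition):
    """Normalize an address, a collection of addresses, or a callable."""
    if condition is None:
        return lambda addr: False
    if callable(condition):
        return condition
    if isinstance(condition, int):
        return lambda addr: addr == condition
    addrs = set(condition)
    return lambda addr: addr in addrs


def classify(addr, find=None, avoid=None, avoid_priority=False):
    """Return the name of the stash a path at addr would land in."""
    found = as_predicate(find)(addr)
    avoided = as_predicate(avoid)(addr)
    if found and avoided:
        # Both conditions hold: find wins unless avoid_priority is set.
        return "avoid" if avoid_priority else "found"
    if found:
        return "found"
    if avoided:
        return "avoid"
    return "active"
```

With this model, an address that matches both conditions goes to “found” by default and to “avoid” only when avoid_priority is True.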

class angr.exploration_techniques.looplimiter.LoopLimiter(count=5, discard_stash='spinning')

Limit the number of loops a path may go through. Paths that exceed the loop limit are moved to a discard stash.

Note that this uses the default detect_loops method from Path, which approximates loop counts by counting the number of times each basic block is executed in a given stack frame.
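The approximation mentioned above, counting how often each basic block runs within a stack frame, can be sketched like this. The trace format of (frame_id, block_addr) pairs and the helper name are assumptions for illustration, and treating "exceeds" as strictly greater than count is also an assumption.

```python
from collections import Counter


def exceeds_loop_limit(block_trace, count=5):
    """Return True if any block runs more than count times in one frame.

    block_trace is an iterable of (frame_id, block_addr) pairs.
    """
    seen = Counter()
    for key in block_trace:
        seen[key] += 1
        if seen[key] > count:
            return True
    return False
```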

class angr.exploration_techniques.threading.Threading(threads=8)

Enable multithreading.

This is only useful in paths where a lot of time is taken inside z3, doing constraint solving. This is because of python’s GIL, which says that only one thread at a time may be executing python code.

class angr.exploration_techniques.veritesting.Veritesting(**options)

Enable veritesting. This technique, described in a paper[1] from CMU, attempts to address the problem of state explosions in loops by performing smart merging.

[1] https://users.ece.cmu.edu/~aavgerin/papers/veritesting-icse-2014.pdf

Simulation Engines

class angr.engines.engine.SimEngine(**kwargs)

A SimEngine is a class which understands how to perform execution on a state. This is a base class.

process(state, *args, **kwargs)

Perform execution with a state.

You should only override this method in a subclass in order to provide the correct method signature and docstring. You should override the _process method to do your actual execution.

Parameters:
  • state – The state with which to execute. This state will be copied before modification.
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
Returns:

A SimSuccessors object categorizing the execution’s successor states

check(state, *args, **kwargs)

Check if this engine can be used for execution on the current state. A callback check_failure is called upon failed checks. Note that the execution can still fail even if check() returns True.

You should only override this method in a subclass in order to provide the correct method signature and docstring. You should override the _check method to do your actual check.

Parameters:
  • state (SimState) – The state with which to execute.
  • args – Positional arguments that will be passed to process().
  • kwargs – Keyword arguments that will be passed to process().
Returns:

True if the state can be handled by the current engine, False otherwise.

class angr.engines.successors.SimSuccessors(addr, initial_state)

This class serves as a categorization of all the kinds of result states that can come from a SimEngine run.

Variables:
  • addr (int) – The address at which execution is taking place, as a python int
  • initial_state – The initial state for which execution produced these successors
  • engine – The engine that produced these successors
  • sort – A string identifying the type of engine that produced these successors
  • processed (bool) – Whether or not the processing succeeded
  • description (str) – A textual description of the execution step

The successor states produced by this run are categorized into several lists:

Variables:
  • artifacts (dict) – Any analysis byproducts (for example, an IRSB) that were produced during execution
  • successors – The “normal” successors. IP may be symbolic, but must have reasonable number of solutions
  • unsat_successors – Any successor which is unsatisfiable after its guard condition is added.
  • all_successors – successors + unsat_successors
  • flat_successors – The normal successors, but any symbolic IPs have been concretized. There is one state in this list for each possible value an IP may be concretized to for each successor state.
  • unconstrained_successors – Any state for which during the flattening process we find too many solutions.
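The relationship between flat_successors and unconstrained_successors can be pictured with a toy model. Each successor is a (name, possible_targets) pair, and the threshold max_targets=256 is an assumed value chosen for illustration, not a figure taken from this reference.

```python
def flatten(successors, max_targets=256):
    """Split successors into flat and unconstrained lists.

    successors is a list of (name, set_of_possible_ip_values) pairs.
    """
    flat, unconstrained = [], []
    for name, targets in successors:
        if len(targets) > max_targets:
            # Too many possible jump targets to enumerate.
            unconstrained.append(name)
        else:
            # One flat successor per concrete target value.
            flat.extend((name, t) for t in sorted(targets))
    return flat, unconstrained
```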

A more detailed description of the successor lists may be found here: https://docs.angr.io/docs/simuvex.html

add_successor(state, target, guard, jumpkind, add_guard=True, exit_stmt_idx=None, exit_ins_addr=None, source=None)

Add a successor state of the SimRun. This procedure stores method parameters into state.scratch, does some housekeeping, and calls out to helper functions to prepare the state and categorize it into the appropriate successor lists.

Parameters:
  • state (SimState) – The successor state.
  • target – The target (of the jump/call/ret).
  • guard – The guard expression.
  • jumpkind (str) – The jumpkind (call, ret, jump, or whatnot).
  • add_guard (bool) – Whether to add the guard constraint (default: True).
  • exit_stmt_idx (int) – The ID of the exit statement, an integer by default. ‘default’ stands for the default exit, and None means it’s not from a statement (for example, from a SimProcedure).
  • exit_ins_addr (int) – The instruction pointer of this exit, which is an integer by default.
  • source (int) – The source of the jump (i.e., the address of the basic block).
angr.engines.vex.size_bits(t)

Returns size, in BITS, of a type.

angr.engines.vex.size_bytes(t)

Returns size, in BYTES, of a type.

class angr.engines.vex.engine.SimEngineVEX(stop_points=None, use_cache=True, cache_size=10000, default_opt_level=1, support_selfmodifying_code=False, single_step=False)

Execution engine based on VEX, Valgrind’s IR.

process(state, irsb=None, skip_stmts=0, last_stmt=99999999, whitelist=None, inline=False, force_addr=None, insn_bytes=None, size=None, num_inst=None, traceflags=0, thumb=False, opt_level=None, **kwargs)
Parameters:
  • state – The state with which to execute
  • irsb – The PyVEX IRSB object to use for execution. If not provided one will be lifted.
  • skip_stmts – The number of statements to skip in processing
  • last_stmt – Do not execute any statements after this statement
  • whitelist – Only execute statements in this set
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
  • thumb – Whether the block should be lifted in ARM’s THUMB mode.
  • opt_level – The VEX optimization level to use.
  • insn_bytes – A string of bytes to use for the block instead of the project.
  • size – The maximum size of the block, in bytes.
  • num_inst – The maximum number of instructions.
  • traceflags – traceflags to be passed to VEX. (default: 0)
Returns:

A SimSuccessors object categorizing the block’s successors

lift(state=None, clemory=None, insn_bytes=None, arch=None, addr=None, size=None, num_inst=None, traceflags=0, thumb=False, opt_level=None)

Lift an IRSB.

There are many possible valid sets of parameters. You at the very least must pass some source of data, some source of an architecture, and some source of an address.

Sources of data in order of priority: insn_bytes, clemory, state

Sources of an address, in order of priority: addr, state

Sources of an architecture, in order of priority: arch, clemory, state
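The priority rule for each kind of source amounts to "first one provided wins". A hypothetical helper, pick_source, sketches that rule; it does not reproduce what lift() does internally.

```python
def pick_source(candidates):
    """Return the first (name, value) pair whose value is not None.

    candidates is a list of (name, value) pairs in priority order.
    """
    for name, value in candidates:
        if value is not None:
            return name, value
    raise ValueError("at least one source must be provided")


# Data sources in priority order: insn_bytes, clemory, state.
chosen = pick_source([("insn_bytes", None), ("clemory", "mem"), ("state", "st")])
```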

Parameters:
  • state – A state to use as a data source.
  • clemory – A cle.memory.Clemory object to use as a data source.
  • addr – The address at which to start the block.
  • thumb – Whether the block should be lifted in ARM’s THUMB mode.
  • opt_level – The VEX optimization level to use. The final IR optimization level is determined by (ordered by priority):
      - Argument opt_level
      - opt_level is set to 1 if OPTIMIZE_IR exists in state options
      - self._default_opt_level
  • insn_bytes – A string of bytes to use as a data source.
  • size – The maximum size of the block, in bytes.
  • num_inst – The maximum number of instructions.
  • traceflags – traceflags to be passed to VEX. (default: 0)
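The opt_level priority order given in the parameter list can be written out as a small function. This is a sketch under stated assumptions: state options are modelled as a set of strings and resolve_opt_level is a hypothetical name.

```python
def resolve_opt_level(opt_level=None, state_options=(), default_opt_level=1):
    """Pick the final IR optimization level, highest priority first."""
    if opt_level is not None:
        # 1. An explicit argument always wins.
        return opt_level
    if "OPTIMIZE_IR" in state_options:
        # 2. OPTIMIZE_IR in the state options forces level 1.
        return 1
    # 3. Otherwise fall back to the engine default.
    return default_opt_level
```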
class angr.engines.procedure.SimEngineProcedure

An engine for running SimProcedures

process(state, procedure, ret_to=None, inline=None, force_addr=None, **kwargs)

Perform execution with a state.

Parameters:
  • state – The state with which to execute
  • procedure – An instance of a SimProcedure to run
  • ret_to – The address to return to when this procedure is finished
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
Returns:

A SimSuccessors object categorizing the execution’s successor states

class angr.engines.unicorn.SimEngineUnicorn(base_stop_points=None)

Concrete execution in the Unicorn Engine, a fork of qemu.

process(state, step=None, extra_stop_points=None, inline=False, force_addr=None, **kwargs)
Parameters:
  • state – The state with which to execute
  • step – How many basic blocks we want to execute
  • extra_stop_points – A collection of addresses at which execution should halt
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
Returns:

A SimSuccessors object categorizing the results of the run and whether it succeeded.

Simulation Logging

class angr.state_plugins.sim_action.SimAction(state, region_type)

A SimAction represents a semantic action that an analyzed program performs.

Initializes the SimAction.

Parameters:state – the state that the SimAction is taking place in.
downsize()

Clears some low-level details (that take up memory) out of the SimAction.

class angr.state_plugins.sim_action.SimActionExit(state, target, condition=None, exit_type=None)

An Exit action represents a (possibly conditional) jump.

class angr.state_plugins.sim_action.SimActionConstraint(state, constraint, condition=None)

A constraint action represents an extra constraint added during execution of a path.

class angr.state_plugins.sim_action.SimActionOperation(state, op, exprs)

An action representing an operation between variables and/or constants.

class angr.state_plugins.sim_action.SimActionData(state, region_type, action, tmp=None, addr=None, size=None, data=None, condition=None, fallback=None, fd=None)

A Data action represents a read or a write from memory, registers or a file.

class angr.state_plugins.sim_action_object.SimActionObject(ast, reg_deps=None, tmp_deps=None)

A SimActionObject tracks an AST and its dependencies.

Procedures

class angr.sim_procedure.SimProcedure(project=None, cc=None, symbolic_return=None, returns=None, is_syscall=None, is_stub=False, num_args=None, display_name=None, library_name=None, is_function=None, **kwargs)

A SimProcedure is a wonderful object which describes a procedure to run on a state.

You may subclass SimProcedure and override run(), replacing it with mutating self.state however you like, and then either returning a value or jumping away somehow.

A detailed discussion of programming SimProcedures may be found at https://docs.angr.io/docs/simprocedures.md

Parameters:arch – The architecture to use for this procedure

The following parameters are optional:

Parameters:
  • symbolic_return – Whether the procedure’s return value should be stubbed into a single symbolic variable constrained to the real return value
  • returns – Whether the procedure should return to its caller afterwards
  • is_syscall – Whether this procedure is a syscall
  • num_args – The number of arguments this procedure should extract
  • display_name – The name to use when displaying this procedure
  • cc – The SimCC to use for this procedure
  • sim_kwargs – Additional keyword arguments to be passed to run()
  • is_function – Whether this procedure emulates a function
execute(state, successors=None, arguments=None, ret_to=None)

Call this method with a SimState and a SimSuccessors to execute the procedure.

Alternately, successors may be none if this is an inline call. In that case, you should provide arguments to the function.

run(*args, **kwargs)

Implement the actual procedure here!

static_exits(blocks)

Get new exits by performing static analysis and heuristics. This is a fast and best-effort approach to get new exits for scenarios where states are not available (e.g. when building a fast CFG).

Parameters:blocks (list) – Blocks that are executed before reaching this SimProcedure.
Returns:A list of tuples. Each tuple is (address, jumpkind).
Return type:list
arg(i)

Returns the ith argument. Raise a SimProcedureArgumentError if we don’t have such an argument available.

Parameters:i (int) – The index of the argument to get
Returns:The argument
Return type:object
inline_call(procedure, *arguments, **kwargs)

Call another SimProcedure in-line to retrieve its return value. Returns an instance of the procedure with the ret_expr property set.

Parameters:
  • procedure – The class of the procedure to execute
  • arguments – Any additional positional args will be used as arguments to the procedure call
  • sim_kwargs – Any additional keyword args will be passed as sim_kwargs to the procedure constructor
ret(expr=None)

Add an exit representing a return from this function. If this is not an inline call, grab a return address from the state and jump to it, and set a return expression with the calling convention.

call(addr, args, continue_at, cc=None)

Add an exit representing calling another function via pointer.

Parameters:
  • addr – The address of the function to call
  • args – The list of arguments to call the function with
  • continue_at – Later, when the called function returns, execution of the current procedure will continue in the named method.
  • cc – Optional: use this calling convention for calling the new function. Default is to use the current convention.
jump(addr)

Add an exit representing jumping to an address.

exit(exit_code)

Add an exit representing terminating the program.

class angr.procedures.stubs.format_parser.FormatString(parser, components)

Describes a format string.

Takes a list of components which are either just strings or a FormatSpecifier.

replace(startpos, args)

Produce and return a new string, possibly symbolic, based on this format string and the arguments in args.

interpret(addr, startpos, args, region=None)

Interpret a format string, reading the data at addr in region into args starting at startpos.

class angr.procedures.stubs.format_parser.FormatSpecifier(string, length_spec, size, signed)

Describes a format specifier within a format string.

class angr.procedures.stubs.format_parser.FormatParser(project=None, cc=None, symbolic_return=None, returns=None, is_syscall=None, is_stub=False, num_args=None, display_name=None, library_name=None, is_function=None, **kwargs)

For SimProcedures relying on format strings.

Calling Conventions and Types

class angr.calling_conventions.ArgSession(cc)

A class to keep track of the state accumulated in laying parameters out into memory

class angr.calling_conventions.SimCC(arch, args=None, ret_val=None, sp_delta=None, func_ty=None)

A calling convention allows you to extract from a state the data passed from function to function by calls and returns. Most of the methods provided by SimCC that operate on a state assume that the program is just after a call but just before stack frame allocation, though this may be overridden with the stack_base parameter to each individual method.

This is the base class for all calling conventions.

An instance of this class allows it to be tweaked to the way a specific function should be called.

Parameters:
  • arch – The Archinfo arch for this CC
  • args – A list of SimFunctionArguments describing where the arguments go
  • ret_val – A SimFunctionArgument describing where the return value goes
  • sp_delta – The amount the stack pointer changes over the course of this function - CURRENTLY UNUSED
  • func_ty – A SimType for the function itself

classmethod from_arg_kinds(arch, fp_args, ret_fp=False, sizes=None, sp_delta=None, func_ty=None)

Get an instance of the class that will extract floating-point/integral args correctly.

Parameters:
  • arch – The Archinfo arch for this CC
  • fp_args – A list, with one entry for each argument the function can take. True if the argument is fp, false if it is integral.
  • ret_fp – True if the return value for the function is fp.
  • sizes – Optional: A list, with one entry for each argument the function can take. Each entry is the size of the corresponding argument in bytes.
  • sp_delta – The amount the stack pointer changes over the course of this function - CURRENTLY UNUSED
  • func_ty – A SimType for the function itself

int_args

Iterate through all the possible arg positions that can only be used to store integer or pointer values. Does not take into account customizations.

Returns an iterator of SimFunctionArguments

both_args

Iterate through all the possible arg positions that can be used to store any kind of argument. Does not take into account customizations.

Returns an iterator of SimFunctionArguments

fp_args

Iterate through all the possible arg positions that can only be used to store floating point values. Does not take into account customizations.

Returns an iterator of SimFunctionArguments

is_fp_arg(arg)

This should take a SimFunctionArgument instance and return whether or not that argument is a floating-point argument.

Returns True if the argument MUST be a floating point arg, False if it MUST NOT be a floating point arg, and None if it can be either.
class ArgSession(cc)

A class to keep track of the state accumulated in laying parameters out into memory

SimCC.arg_session

Return an arg session.

A session provides the control interface necessary to describe how integral and floating-point arguments are laid out into memory. The default behavior is that there is a finite list of int-only and fp-only argument slots, and an infinite number of generic slots, and when an argument of a given type is requested, the most specific slot available is used. If you need different behavior, subclass ArgSession.
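One reading of the default behavior is that a type-specific slot is tried first and a generic slot is the fallback. The toy class below sketches that reading; ToyArgSession, the register names, and the ("stack", offset) tuples for generic slots are all illustrative assumptions, not the real ArgSession interface.

```python
class ToyArgSession:
    """Hand out argument slots: typed slots first, then generic ones."""

    def __init__(self, int_slots, fp_slots, word_size=8):
        self._int = iter(int_slots)    # finite int-only slots
        self._fp = iter(fp_slots)      # finite fp-only slots
        self._word_size = word_size
        self._stack_offset = 0         # generic slots never run out

    def _next_generic(self):
        slot = ("stack", self._stack_offset)
        self._stack_offset += self._word_size
        return slot

    def next_arg(self, is_fp):
        typed = self._fp if is_fp else self._int
        slot = next(typed, None)
        # Fall back to a generic slot once the typed ones are used up.
        return slot if slot is not None else self._next_generic()


session = ToyArgSession(["rdi", "rsi"], ["xmm0"])
layout = [session.next_arg(fp) for fp in (False, True, True, False, False)]
```

Here the second floating-point argument and the third integer argument spill to generic slots because their typed slots are exhausted.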

SimCC.stack_space(args)
Parameters:args – A list of SimFunctionArguments
Returns:The number of bytes that should be allocated on the stack to store all these args, NOT INCLUDING the return address.
SimCC.return_val

The location the return value is stored.

SimCC.return_addr

The location the return address is stored.

SimCC.arg_locs(is_fp, sizes=None)

Pass this a list of whether each parameter is floating-point or not, and get back a list of SimFunctionArguments. Optionally, pass a list of argument sizes (in bytes) as well.

If you’ve customized this CC, this will sanity-check the provided locations with the given list.

SimCC.arg(state, index, stack_base=None)

Returns a bitvector expression representing the nth argument of a function.

stack_base is an optional pointer to the top of the stack at the function start. If it is not specified, use the current stack pointer.

WARNING: this assumes that none of the arguments are floating-point and they’re all single-word-sized, unless you’ve customized this CC.

SimCC.get_args(state, is_fp=None, sizes=None, stack_base=None)

is_fp should be a list of booleans specifying whether each corresponding argument is floating-point - True for fp and False for int. For a shorthand to assume that all the parameters are int, pass the number of parameters as an int.

If you’ve customized this CC, you may omit this parameter entirely. If it is provided, it is used for sanity-checking.

sizes is an optional list of argument sizes, in bytes. Be careful about using this if you’ve made explicit the arg locations, since it might decide to combine two locations into one if an arg is too big.

stack_base is an optional pointer to the top of the stack at the function start. If it is not specified, use the current stack pointer.

Returns a list of bitvector expressions representing the arguments of a function.

SimCC.setup_callsite(state, ret_addr, args, stack_base=None, alloc_base=None, grow_like_stack=True)

This function performs the actions of the caller getting ready to jump into a function.

Parameters:
  • state – The SimState to operate on
  • ret_addr – The address to return to when the called function finishes
  • args – The list of arguments that the called function will see
  • stack_base – An optional pointer to use as the top of the stack, circa the function entry point
  • alloc_base – An optional pointer to use as the place to put excess argument data
  • grow_like_stack – When allocating data at alloc_base, whether to allocate at decreasing addresses

The idea here is that you can provide almost any kind of python type in args and it’ll be translated to a binary format to be placed into simulated memory. Lists (representing arrays) must be entirely elements of the same type and size, while tuples (representing structs) can be elements of any type and size. If you’d like there to be a pointer to a given value, wrap the value in a PointerWrapper. Any value that can’t fit in a register will be automatically put in a PointerWrapper.

If stack_base is not provided, the current stack pointer will be used, and it will be updated. If alloc_base is not provided, the current stack pointer will be used, and it will be updated. You might not like the results if you provide stack_base but not alloc_base.

grow_like_stack controls the behavior of allocating data at alloc_base. When data from args needs to be wrapped in a pointer, the pointer needs to point somewhere, so that data is dumped into memory at alloc_base. If you set alloc_base to point to somewhere other than the stack, set grow_like_stack to False so that sequential allocations happen at increasing addresses.
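The effect of grow_like_stack on where successive allocations land can be illustrated with a toy allocator. The function allocate and its address arithmetic are assumptions made for illustration and ignore alignment and any other details of the real implementation.

```python
def allocate(alloc_base, sizes, grow_like_stack=True):
    """Return the start address of each allocation, in request order."""
    addrs = []
    cursor = alloc_base
    for size in sizes:
        if grow_like_stack:
            # Stack-like: move down first, then place the data.
            cursor -= size
            addrs.append(cursor)
        else:
            # Heap-like: place the data, then move up.
            addrs.append(cursor)
            cursor += size
    return addrs
```

For example, two allocations of 16 and 8 bytes from base 0x1000 land at 0xff0 and 0xfe8 when growing like a stack, and at 0x1000 and 0x1010 otherwise.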

SimCC.teardown_callsite(state, return_val=None, arg_types=None, force_callee_cleanup=False)

This function performs the actions of the callee as it’s getting ready to return. It returns the address to return to.

Parameters:
  • state – The state to mutate
  • return_val – The value to return
  • arg_types – The fp-ness of each of the args. Used to calculate sizes to clean up
  • force_callee_cleanup – If we should clean up the stack allocation for the arguments even if it’s not the callee’s job to do so

TODO: support the stack_base parameter from setup_callsite...? Does that make sense in this context? Maybe it could make sense by saying that you pass it in as something like the “saved base pointer” value?

SimCC.get_return_val(state, is_fp=None, size=None, stack_base=None)

Get the return value out of the given state

SimCC.set_return_val(state, val, is_fp=None, size=None, stack_base=None)

Set the return value into the given state

class angr.calling_conventions.SimLyingRegArg(name)

A register that LIES about the types it holds

class angr.calling_conventions.SimCCUnknown(arch, args=None, ret_val=None, sp_delta=None, func_ty=None)

Represent an unknown calling convention.

Parameters:
  • arch – The Archinfo arch for this CC
  • args – A list of SimFunctionArguments describing where the arguments go
  • ret_val – A SimFunctionArgument describing where the return value goes
  • sp_delta – The amount the stack pointer changes over the course of this function - CURRENTLY UNUSED
  • func_ty – A SimType for the function itself

class angr.sim_variable.SimVariableSet

A collection of SimVariables.

complement(other)

Calculate the complement of self and other.

Parameters:other – Another SimVariableSet instance.
Returns:The complement result.
class angr.sim_type.SimType(label=None)

SimType exists to track type information for SimProcedures.

Parameters:label – the type label.
class angr.sim_type.SimTypeBottom(label=None)

SimTypeBottom basically represents a type error.

Parameters:label – the type label.
class angr.sim_type.SimTypeTop(size=None, label=None)

SimTypeTop represents any type (mostly used with a pointer for void*).

class angr.sim_type.SimTypeReg(size, label=None)

SimTypeReg is the base type for all types that are register-sized.

Parameters:
  • label – the type label.
  • size – the size of the type (e.g. 32bit, 8bit, etc.).
class angr.sim_type.SimTypeNum(size, signed=True, label=None)

SimTypeNum is a numeric type of arbitrary length

Parameters:
  • size – The size of the integer, in bytes
  • signed – Whether the integer is signed or not
  • label – A label for the type
class angr.sim_type.SimTypeInt(signed=True, label=None)

SimTypeInt is a type that specifies a signed or unsigned C integer.

Parameters:
  • signed – True if signed, False if unsigned
  • label – The type label
class angr.sim_type.SimTypeChar(label=None)

SimTypeChar is a type that specifies a character; this could be represented by an 8-bit int, but this is meant to be interpreted as a character.

Parameters:label – the type label.
class angr.sim_type.SimTypeFd(label=None)

SimTypeFd is a type that specifies a file descriptor.

Parameters:label – the type label
class angr.sim_type.SimTypePointer(pts_to, label=None, offset=0)

SimTypePointer is a type that specifies a pointer to some other type.

Parameters:
  • label – The type label.
  • pts_to – The type to which this pointer points.
class angr.sim_type.SimTypeFixedSizeArray(elem_type, length)

SimTypeFixedSizeArray is a literal (i.e. not a pointer) fixed-size array.

class angr.sim_type.SimTypeArray(elem_type, length=None, label=None)

SimTypeArray is a type that specifies a pointer to an array; while it is a pointer, it has a semantic difference.

Parameters:
  • label – The type label.
  • elem_type – The type of each element in the array.
  • length – An expression of the length of the array, if known.
class angr.sim_type.SimTypeString(length=None, label=None)

SimTypeString is a type that represents a C-style string, i.e. a NUL-terminated array of bytes.

Parameters:
  • label – The type label.
  • length – An expression of the length of the string, if known.
class angr.sim_type.SimTypeWString(length=None, label=None)

A wide-character null-terminated string, where each character is 2 bytes.

class angr.sim_type.SimTypeFunction(args, returnty, label=None)

SimTypeFunction is a type that specifies an actual function (i.e. not a pointer) with certain types of arguments and a certain return value.

Parameters:
  • label – The type label
  • args – A tuple of types representing the arguments to the function
  • returnty – The return type of the function, or none for void
class angr.sim_type.SimTypeLength(signed=False, addr=None, length=None, label=None)

SimTypeLength is a type that specifies the length of some buffer in memory.

...I’m not really sure what the original design of this class was going for

Parameters:
  • signed – Whether the value is signed or not
  • label – The type label.
  • addr – The memory address (expression).
  • length – The length (expression).
class angr.sim_type.SimTypeFloat(size=32)

An IEEE754 single-precision floating point number

class angr.sim_type.SimTypeDouble

An IEEE754 double-precision floating point number

class angr.sim_type.SimStructValue(struct, values=None)

A SimStruct type paired with some real values

Parameters:
  • struct – A SimStruct instance describing the type of this struct
  • values – A mapping from struct fields to values
class angr.sim_type.SimUnion(members, label=None)

why

Parameters:members – The members of the union, as a mapping name -> type
angr.sim_type.define_struct(defn)

Register a struct definition globally

>>> define_struct('struct abcd {int x; int y;}')
angr.sim_type.register_types(mapping)

Pass in a mapping from name to SimType and they will be registered to the global type store

>>> register_types(parse_types("typedef int x; typedef float y;"))
angr.sim_type.do_preprocess(defn)

Run a string through the C preprocessor that ships with pycparser but is weirdly inaccessible?

angr.sim_type.parse_defns(defn, preprocess=True)

Parse a series of C definitions, returns a mapping from variable name to variable type object

angr.sim_type.parse_types(defn, preprocess=True)

Parse a series of C definitions, returns a mapping from type name to type object

angr.sim_type.parse_file(defn, preprocess=True)

Parse a series of C definitions, returns a tuple of two type mappings, one for variable definitions and one for type definitions.

angr.sim_type.parse_type(defn, preprocess=True)

Parse a simple type expression into a SimType

>>> parse_type('int *')

Knowledge Base

Representing the artifacts of a project.

class angr.knowledge_base.KnowledgeBase(project, obj)

Represents a “model” of knowledge about an artifact.

Contains things like a CFG, data references, etc.

class angr.knowledge_plugins.functions.function_manager.FunctionDict(backref, *args, **kwargs)

FunctionDict is a dict where the keys are function starting addresses and map to the associated Function.

class angr.knowledge_plugins.functions.function_manager.FunctionManager(kb)

This is a function boundaries management tool. It takes in intermediate results during CFG generation, and manages a function map of the binary.

contains_addr(addr)

Decide if an address is handled by the function manager.

Note: this function is non-conformant with python programming idioms, but it’s needed for performance reasons.

Parameters:addr (int) – Address of the function.
ceiling_func(addr)

Return the function with the least address that is greater than or equal to addr.

Parameters:addr (int) – The address to query.
Returns:A Function instance, or None if there is no other function after addr.
Return type:Function or None
floor_func(addr)

Return the function with the greatest address that is less than or equal to addr.

Parameters:addr (int) – The address to query.
Returns:A Function instance, or None if there is no other function before addr.
Return type:Function or None
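The floor and ceiling queries can be sketched over a sorted list of function start addresses using the standard bisect module. This shows only the lookup semantics described above; floor_addr and ceiling_addr are hypothetical helpers and return addresses, not Function instances.

```python
import bisect


def floor_addr(sorted_addrs, addr):
    """Greatest address <= addr, or None if there is none."""
    i = bisect.bisect_right(sorted_addrs, addr)
    return sorted_addrs[i - 1] if i else None


def ceiling_addr(sorted_addrs, addr):
    """Least address >= addr, or None if there is none."""
    i = bisect.bisect_left(sorted_addrs, addr)
    return sorted_addrs[i] if i < len(sorted_addrs) else None


funcs = [0x400000, 0x400100, 0x400200]
```

An address that is itself a function start is returned by both queries, since both bounds are inclusive.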
function(addr=None, name=None, create=False, syscall=False, plt=None)

Get a function object from the function manager.

Pass either addr or name with the appropriate values.

Parameters:
  • addr (int) – Address of the function.
  • name (str) – Name of the function.
  • create (bool) – Whether to create the function or not if the function does not exist.
  • syscall (bool) – True to create the function as a syscall, False otherwise.
  • plt (bool or None) – True to find the PLT stub, False to find a non-PLT stub, None to disable this restriction.
Returns:

The Function instance, or None if the function is not found and create is False.

Return type:

Function or None

class angr.knowledge_plugins.functions.function.Function(function_manager, addr, name=None, syscall=False)

A representation of a function and various information about it.

Function constructor

Parameters:
  • addr – The address of the function.
  • name – (Optional) The name of the function.
  • syscall – (Optional) Whether this function is a syscall or not.
blocks

An iterator of all local blocks in the current function.

Returns:angr.lifter.Block instances.
block_addrs

An iterator of all local block addresses in the current function.

Returns:block addresses.
block_addrs_set

Return a set of block addresses for a better performance of inclusion tests.

Returns:A set of block addresses.
Return type:set
operations

All of the operations that are done by this function.

code_constants

All of the constants that are used by this function’s code.

string_references(minimum_length=2, vex_only=False)

All of the constant string references used by this function.

Parameters:
  • minimum_length – The minimum length of strings to find (default is 2).
  • vex_only – Only analyze VEX IR, don’t interpret the entry state to detect additional constants.
Returns:

A list of tuples of (address, string), where address is the location of the string in memory.

local_runtime_values

Tries to find all runtime values of this function which do not come from inputs. These values are generated by starting from a blank state and reanalyzing the basic blocks once each. Function calls are skipped and back edges are never taken, so these values are often unreliable. This function is good at finding simple constant addresses which the function will use or calculate.

Returns:a set of constants
runtime_values

All of the concrete values used by this function at runtime (i.e., including passed-in arguments and global values).

binary

Get the object this function belongs to.

Returns:The object this function belongs to.

add_jumpout_site(node)

Add a custom jumpout site.

Parameters:node – The address of the basic block that control flow leaves during this transition.
Returns:None
add_retout_site(node)

Add a custom retout site.

Retout (returning to outside of the function) sites are very rare. They mostly occur during CFG recovery when we incorrectly identify the beginning of a function in the first iteration, and then correctly identify that function later in the same iteration (function alignments can lead to this bizarre case). We will mark all edges going out of the header of that function as outside edges, because all successors now belong to the incorrectly-identified function. This identification error will be fixed in the second iteration of CFG recovery. However, we still want to keep track of jumpouts/retouts during the first iteration so that other logic in CFG recovery still works.

Parameters:node – The address of the basic block from which control flow leaves the current function after a call.
Returns:None
mark_nonreturning_calls_endpoints()

Iterate through all call edges in the transition graph. For each call to a non-returning function, mark the source basic block as an endpoint.

This method should only be executed once all functions are recovered and analyzed by CFG recovery, so we know whether each function returns or not.

Returns:None
get_call_sites()

Gets a list of all the basic blocks that end in calls.

Returns:A list of the addresses of the blocks that end in calls.
get_call_target(callsite_addr)

Get the target of a call.

Parameters:callsite_addr – The address of a basic block that ends in a call.
Returns:The target of said call, or None if callsite_addr is not a callsite.
get_call_return(callsite_addr)

Get the hypothetical return address of a call.

Parameters:callsite_addr – The address of the basic block that ends in a call.
Returns:The likely return target of said call, or None if callsite_addr is not a callsite.
graph

Return a local transition graph that only contains nodes in the current function.

subgraph(ins_addrs)

Generate a sub control flow graph of instruction addresses based on self.graph

Parameters:ins_addrs (iterable) – A collection of instruction addresses that should be included in the subgraph.
Returns:A subgraph.
Return type:networkx.DiGraph
instruction_size(insn_addr)

Get the size of the instruction specified by insn_addr.

Parameters:insn_addr (int) – Address of the instruction
Returns:Size of the instruction in bytes, or None if the instruction is not found.
Return type:int
dbg_print()

Returns a representation of the list of basic blocks in this function.

dbg_draw(filename)

Draw the graph and save it to a PNG file.

normalize()

Make sure all basic blocks in the transition graph of this function do not overlap. The result resembles the CFG that IDA Pro generates.

This method does not touch the CFG result. You may call CFG{Accurate, Fast}.normalize() for that purpose.

Returns:None
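
To make "do not overlap" concrete, here is a minimal sketch of the idea. It is a simplification and not angr's actual algorithm: it assumes blocks are given as a dict mapping start address to size, and it simply truncates any block that runs into the start of the next one. The name normalize_blocks is hypothetical.

```python
def normalize_blocks(blocks):
    """Truncate blocks so that no block extends past the start of the next.

    blocks: dict mapping block start address -> block size in bytes.
    Returns a new dict with the same starts and non-overlapping sizes.
    """
    starts = sorted(blocks)
    result = {}
    for i, start in enumerate(starts):
        end = start + blocks[start]
        # If the next block begins inside this one, cut this block short.
        if i + 1 < len(starts) and starts[i + 1] < end:
            end = starts[i + 1]
        result[start] = end - start
    return result

# 0x1000 originally spans 0x20 bytes and swallows the block at 0x1010.
overlapping = {0x1000: 0x20, 0x1010: 0x10, 0x1020: 0x8}
normalized = normalize_blocks(overlapping)
```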
class angr.knowledge_plugins.variables.variable_manager.LiveVariables(register_region, stack_region)

A collection of live variables at a program point.

class angr.knowledge_plugins.variables.variable_manager.VariableManagerInternal(manager, func_addr=None)

Manage variables for a function. It is meant to be used internally by VariableManager.

get_variables(sort=None, collapse_same_ident=False)

Get a list of variables.

Parameters:
  • sort (str or None) – Sort of the variable to get.
  • collapse_same_ident – Whether variables of the same identifier should be collapsed or not.
Returns:

A list of variables.

Return type:

list

input_variables()

Get all variables that have never been written to.

Returns:A list of variables that are never written to.
assign_variable_names()

Assign default names to all variables.

Returns:None
class angr.knowledge_plugins.variables.variable_manager.VariableManager(kb)

Manage variables.

get_variable_accesses(variable, same_name=False)

Get a list of all references to the given variable.

Parameters:
  • variable (SimVariable) – The variable.
  • same_name (bool) – Whether to include all variables with the same variable name, or just based on the variable identifier.
Returns:

All references to the variable.

Return type:

list

Analysis

class angr.analyses.analysis.Analyses(p)

This class contains functions for all the registered and runnable analyses.

Creates an Analyses object

Variables:p – A project
class angr.analyses.analysis.Analysis

This class represents an analysis on the program.

Variables:
  • project – The project for this analysis.
  • kb (KnowledgeBase) – The knowledgebase object.
  • _progress_callback (callable) – A callback function for receiving the progress of this analysis. It only takes one argument, which is a float number from 0.0 to 100.0 indicating the current progress.
  • _show_progressbar (bool) – If a progressbar should be shown during the analysis. It’s independent from _progress_callback.
  • _progressbar (progressbar.ProgressBar) – The progress bar object.
class angr.analyses.backward_slice.BackwardSlice(cfg, cdg, ddg, targets=None, cfg_node=None, stmt_id=None, control_flow_slice=False, same_function=False, no_construct=False)

Represents a backward slice of the program.

Create a backward slice from a specific statement based on provided control flow graph (CFG), control dependence graph (CDG), and data dependence graph (DDG).

The data dependence graph can be either CFG-based or value-set analysis (VSA) based. A CFG-based DDG is much faster to generate, but it only reflects the states seen while generating the CFG, and it is neither sound nor accurate. The VSA-based DDG (called VSA_DDG) is based on static analysis, which gives you a much better result.

Parameters:
  • cfg – The control flow graph.
  • cdg – The control dependence graph.
  • ddg – The data dependence graph.
  • targets – A list of targets for the backward slice. Each target can be a tuple of the form (cfg_node, stmt_idx), or a CodeLocation instance.
  • cfg_node – Deprecated. The target CFGNode to reach. It should exist in the CFG.
  • stmt_id – Deprecated. The target statement to reach.
  • control_flow_slice – True/False, indicates whether we should slice only based on CFG. Sometimes when acquiring DDG is difficult or impossible, you can just create a slice on your CFG. Well, if you don’t even have a CFG, then...
  • no_construct – Only used for testing and debugging to easily create a BackwardSlice object.
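
Conceptually, a backward slice collects everything a target depends on. The sketch below shows that idea as reverse reachability over dependence edges. It is an illustration under stated assumptions, not angr's implementation: locations are plain (block_addr, stmt_idx) tuples, edges from the CDG and DDG are merged into one list, and the function name backward_slice is hypothetical.

```python
from collections import deque

def backward_slice(dependence_edges, targets):
    """Return every location that some target transitively depends on.

    dependence_edges: iterable of (src, dst) pairs meaning "dst depends on src".
    targets: iterable of locations to slice from (they are part of the slice).
    """
    predecessors = {}
    for src, dst in dependence_edges:
        predecessors.setdefault(dst, set()).add(src)

    in_slice = set(targets)
    queue = deque(in_slice)
    while queue:
        location = queue.popleft()
        for pred in predecessors.get(location, ()):
            if pred not in in_slice:
                in_slice.add(pred)
                queue.append(pred)
    return in_slice

# Hypothetical dependence edges between (block_addr, stmt_idx) locations.
edges = [
    ((0x1000, 1), (0x1000, 5)),
    ((0x1000, 5), (0x1010, 2)),
    ((0x1000, 3), (0x1020, 0)),  # unrelated to the target below
]
slice_locations = backward_slice(edges, [(0x1010, 2)])
```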
dbg_repr(max_display=10)

Debugging output of this slice.

Parameters:max_display – The maximum number of SimRun slices to show.
Returns:A string representation.
dbg_repr_run(run_addr)

Debugging output of a single SimRun slice.

Parameters:run_addr – Address of the SimRun.
Returns:A string representation.
annotated_cfg(start_point=None)

Returns an AnnotatedCFG based on slicing result.

is_taint_related_to_ip(simrun_addr, stmt_idx, taint_type, simrun_whitelist=None)

Query in taint graph to check if a specific taint will taint the IP in the future or not. The taint is specified with the tuple (simrun_addr, stmt_idx, taint_type).

Parameters:
  • simrun_addr – Address of the SimRun.
  • stmt_idx – Statement ID.
  • taint_type – Type of the taint, might be one of the following: ‘reg’, ‘tmp’, ‘mem’.
  • simrun_whitelist – A list of SimRun addresses that are whitelisted, i.e. the tainted exit will be ignored if it is in those SimRuns.
Returns:

True/False

is_taint_impacting_stack_pointers(simrun_addr, stmt_idx, taint_type, simrun_whitelist=None)

Query in taint graph to check if a specific taint will taint the stack pointer in the future or not. The taint is specified with the tuple (simrun_addr, stmt_idx, taint_type).

Parameters:
  • simrun_addr – Address of the SimRun.
  • stmt_idx – Statement ID.
  • taint_type – Type of the taint, might be one of the following: ‘reg’, ‘tmp’, ‘mem’.
  • simrun_whitelist – A list of SimRun addresses that are whitelisted.
Returns:

True/False.

angr.analyses.bindiff.differing_constants(block_a, block_b)

Compares two basic blocks and finds all the constants that differ from the first block to the second.

Parameters:
  • block_a – The first block to compare.
  • block_b – The second block to compare.
Returns:

Returns a list of differing constants in the form of ConstantChange, which has the offset in the block and the respective constants.
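
As a rough illustration of the return value, the sketch below compares two blocks that have already been reduced to lists of (offset, constant) pairs. Everything here is an assumption for illustration: the field names of the ConstantChange record, the input representation, and the helper name diff_constants are hypothetical and may not match angr's own definitions.

```python
from collections import namedtuple

# Hypothetical stand-in for the ConstantChange record described above.
ConstantChange = namedtuple("ConstantChange", ["offset", "value_a", "value_b"])

def diff_constants(consts_a, consts_b):
    """Compare two equal-length lists of (offset, constant) pairs."""
    if len(consts_a) != len(consts_b):
        raise ValueError("blocks have a different number of constants")
    changes = []
    for (offset, value_a), (_, value_b) in zip(consts_a, consts_b):
        if value_a != value_b:
            changes.append(ConstantChange(offset, value_a, value_b))
    return changes

# Hypothetical constants extracted from two versions of the same block.
block_a = [(0, 0x10), (4, 0x601040), (9, 1)]
block_b = [(0, 0x10), (4, 0x601048), (9, 2)]
changes = diff_constants(block_a, block_b)
```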

class angr.analyses.bindiff.FunctionDiff(function_a, function_b, bindiff=None)

This class computes a diff between two functions.

Parameters:
  • function_a – The first angr Function object to diff.
  • function_b – The second angr Function object.
  • bindiff – An optional Bindiff object. Used for some extra normalization during basic block comparison.
probably_identical

Returns:Whether or not these two functions are identical.

identical_blocks

Returns:A list of block matches which appear to be identical.

differing_blocks

Returns:A list of block matches which appear to differ.

blocks_with_differing_constants

Returns:A list of block matches which appear to differ.

static get_normalized_block(addr, function)
Parameters:
  • addr – Where to start the normalized block.
  • function – A function containing the block address.
Returns:

A normalized basic block.

block_similarity(block_a, block_b)
Parameters:
  • block_a – The first block address.
  • block_b – The second block address.
Returns:

The similarity of the basic blocks, normalized for the base address of the block and function call addresses.

blocks_probably_identical(block_a, block_b, check_constants=False)
Parameters:
  • block_a – The first block address.
  • block_b – The second block address.
  • check_constants – Whether or not to require matching constants in blocks.
Returns:

Whether or not the blocks appear to be identical.

class angr.analyses.bindiff.BinDiff(other_project, enable_advanced_backward_slicing=False, cfg_a=None, cfg_b=None)

This class computes a diff between two binaries represented by angr Projects.

Parameters:other_project – The second project to diff
functions_probably_identical(func_a_addr, func_b_addr, check_consts=False)

Compare two functions and return True if they appear identical.

Parameters:
  • func_a_addr – The address of the first function (in the first binary).
  • func_b_addr – The address of the second function (in the second binary).
Returns:

Whether or not the functions appear to be identical.

identical_functions

Returns:A list of function matches that appear to be identical.

differing_functions

Returns:A list of function matches that appear to differ.

differing_functions_with_consts()
Returns:A list of function matches that appear to differ, including those that differ only by constants.
differing_blocks

Returns:A list of block matches that appear to differ.

identical_blocks

Returns:A list of all block matches that appear to be identical.

blocks_with_differing_constants

Returns:A dict of block matches with differing constants to the tuple of constants.

get_function_diff(function_addr_a, function_addr_b)
Parameters:
  • function_addr_a – The address of the first function (in the first binary)
  • function_addr_b – The address of the second function (in the second binary)
Returns:

the FunctionDiff of the two functions

class angr.analyses.boyscout.BoyScout(cookiesize=1)

Try to determine the architecture and endianness of a binary blob.

class angr.analyses.cdg.TemporaryNode(label)

A temporary node.

Used as the start node and end node in post-dominator tree generation. Also used in some test cases.

class angr.analyses.cdg.ContainerNode(obj)

A container node.

Only used in post-dominator tree generation. We did this so we can set the index property without modifying the original object.

class angr.analyses.cdg.CDG(cfg, start=None, no_construct=False)

Implements a control dependence graph.

Constructor.

Parameters:
  • cfg – The control flow graph upon which this control dependence graph will build
  • start – The starting point to begin constructing the control dependence graph
  • no_construct – Skip the construction step. Only used in unit-testing.
get_post_dominators()

Return the post-dom tree

get_dependants(run)

Return a list of nodes that are control dependent on the given node in the control dependence graph

get_guardians(run)

Return a list of nodes on which the given node is control dependent in the control dependence graph

class angr.analyses.cfg.cfg_accurate.CFGAccurate(context_sensitivity_level=1, start=None, avoid_runs=None, enable_function_hints=False, call_depth=None, call_tracing_filter=None, initial_state=None, starts=None, keep_state=False, enable_advanced_backward_slicing=False, enable_symbolic_back_traversal=False, additional_edges=None, no_construct=False, normalize=False, max_iterations=1, address_whitelist=None, base_graph=None, iropt_level=None, max_steps=None, state_add_options=None, state_remove_options=None)

This class represents a control-flow graph.

All parameters are optional.

Parameters:
  • context_sensitivity_level – The level of context-sensitivity of this CFG (see documentation for further details). It ranges from 0 to infinity. Default 1.
  • avoid_runs – A list of runs to avoid.
  • enable_function_hints – Whether to use function hints (constants that might be used as exit targets) or not.
  • call_depth – How deep in the call stack to trace.
  • call_tracing_filter – Filter to apply on a given path and jumpkind to determine if it should be skipped when call_depth is reached.
  • initial_state – An initial state to use to begin analysis.
  • starts (iterable) – A collection of starting points to begin analysis. It can contain the following three different types of entries: an address specified as an integer, a 2-tuple that includes an integer address and a jumpkind, or a SimState instance. Unsupported entries in starts will lead to an AngrCFGError being raised.
  • keep_state – Whether to keep the SimStates for each CFGNode.
  • enable_advanced_backward_slicing – Whether to enable an intensive technique for resolving direct jumps
  • enable_symbolic_back_traversal – Whether to enable an intensive technique for resolving indirect jumps
  • additional_edges – A dict mapping addresses of basic blocks to addresses of successors to manually include and analyze forward from.
  • no_construct (bool) – Skip the construction procedure. Only used in unit-testing.
  • normalize (bool) – If the CFG as well as all Function graphs should be normalized or not.
  • max_iterations (int) – The maximum number of iterations that each basic block should be “executed”. 1 by default. Larger numbers of iterations are usually required for complex analyses like loop analysis.
  • address_whitelist (iterable) – A list of allowed addresses. Any basic blocks outside of this collection of addresses will be ignored.
  • base_graph (networkx.DiGraph) – A basic control flow graph to follow. Each node inside this graph must have the following properties: addr and size. CFG recovery will strictly follow nodes and edges shown in the graph, and discard any control flow that does not follow an existing edge in the base graph. For example, you can pass in a Function local transition graph as the base graph, and CFGAccurate will traverse nodes and edges and extract useful information.
  • iropt_level (int) – The optimization level of VEX IR (0, 1, 2). The default level will be used if iropt_level is None.
  • max_steps (int) – The maximum number of basic blocks to recover for the longest path from each start before pausing the recovery procedure.
  • state_add_options – State options that will be added to the initial state.
  • state_remove_options – State options that will be removed from the initial state.
copy()

Make a copy of the CFG.

Returns:A copy of the CFG instance.
Return type:angr.analyses.CFG
resume(starts=None, max_steps=None)

Resume a paused or terminated control flow graph recovery.

Parameters:
  • starts (iterable) – A collection of new starts to resume from. If starts is None, we will resume CFG recovery from where it was paused before.
  • max_steps (int) – The maximum number of blocks on the longest path starting from each start before pausing the recovery.
Returns:

None

remove_cycles()

Forces graph to become acyclic, removes all loop back edges and edges between overlapped loop headers and their successors.
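
One common way to make a graph acyclic is to drop the back edges found by a depth-first search. The sketch below shows only that part (it does not handle the overlapped-loop-header case mentioned above) and is not angr's implementation. It assumes the graph is a dict mapping each node to a list of successors, with every successor also present as a key; remove_back_edges is a hypothetical name.

```python
def remove_back_edges(graph, start):
    """Return a copy of graph without DFS back edges, which makes it acyclic.

    graph: dict mapping node -> list of successor nodes.
    start: node to begin the depth-first search from.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}
    acyclic = {node: [] for node in graph}

    def visit(node):
        color[node] = GRAY
        for succ in graph[node]:
            if color[succ] == GRAY:
                continue  # edge to an ancestor on the stack: a back edge
            acyclic[node].append(succ)
            if color[succ] == WHITE:
                visit(succ)
        color[node] = BLACK

    # Search from start first, then cover anything unreachable from it.
    for node in [start] + [n for n in graph if n != start]:
        if color[node] == WHITE:
            visit(node)
    return acyclic

looping = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
without_cycles = remove_back_edges(looping, "A")
```

The recursion keeps the sketch short; a very deep graph would need an explicit stack instead.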

downsize()

Remove saved states from all CFGNodes to reduce memory usage.

Returns:None
unroll_loops(max_loop_unrolling_times)

Unroll loops for each function. The resulting CFG may still contain loops due to recursion, function calls, etc.

Parameters:max_loop_unrolling_times (int) – The maximum iterations of unrolling.
Returns:None
force_unroll_loops(max_loop_unrolling_times)

Unroll loops globally. The resulting CFG does not contain any loop, but this method is slow on large graphs.

Parameters:max_loop_unrolling_times (int) – The maximum iterations of unrolling.
Returns:None
immediate_dominators(start, target_graph=None)

Get all immediate dominators of sub graph from given node upwards.

Parameters:
  • start (str) – id of the node to navigate forwards from.
  • target_graph (networkx.classes.digraph.DiGraph) – graph to analyse, default is self.graph.
Returns:

A dict mapping each node of the graph to its immediate dominator.

Return type:

dict
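
For readers unfamiliar with the returned mapping, here is a small standalone sketch of an iterative immediate-dominator computation in the style of Cooper, Harvey and Kennedy. It is not angr's code: it assumes the graph is a dict of successor lists, and, as an assumption of this sketch, maps the start node to itself.

```python
def immediate_dominators(graph, start):
    """Map each node reachable from start to its immediate dominator.

    graph: dict mapping node -> list of successor nodes.
    """
    # Compute a reverse postorder of the nodes reachable from start.
    order, seen = [], set()

    def dfs(node):
        seen.add(node)
        for succ in graph.get(node, ()):
            if succ not in seen:
                dfs(succ)
        order.append(node)

    dfs(start)
    order.reverse()
    index = {node: i for i, node in enumerate(order)}

    preds = {node: [] for node in order}
    for node in order:
        for succ in graph.get(node, ()):
            preds[succ].append(node)

    idom = {start: start}

    def intersect(a, b):
        # Walk both nodes up the dominator tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for node in order[1:]:
            candidates = [p for p in preds[node] if p in idom]
            new_idom = candidates[0]
            for pred in candidates[1:]:
                new_idom = intersect(new_idom, pred)
            if idom.get(node) != new_idom:
                idom[node] = new_idom
                changed = True
    return idom

diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
idoms = immediate_dominators(diamond, "A")
```

In the diamond above, neither B nor C dominates D because D can be reached through either branch, so D's immediate dominator is A.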

immediate_postdominators(end, target_graph=None)

Get all immediate postdominators of sub graph from given node upwards.

Parameters:
  • end (str) – id of the node to navigate backwards from.
  • target_graph (networkx.classes.digraph.DiGraph) – graph to analyse, default is self.graph.
Returns:

A dict mapping each node of the graph to its immediate postdominator.

Return type:

dict

remove_fakerets()

Get rid of fake returns (i.e., Ijk_FakeRet edges) from this CFG

Returns:None
get_topological_order(cfg_node)

Get the topological order of a CFG Node.

Parameters:cfg_node – A CFGNode instance.
Returns:An integer representing its order, or None if the CFGNode does not exist in the graph.
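
The notion of an integer order per node can be sketched with Kahn's algorithm on an acyclic graph. This is only an illustration, not how angr computes the value: the graph is assumed to be a dict of successor lists without cycles, and topological_order is a hypothetical name.

```python
from collections import deque

def topological_order(graph):
    """Map each node of an acyclic graph to its position in a topological order.

    graph: dict mapping node -> list of successor nodes.
    """
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for succ in successors:
            indegree[succ] = indegree.get(succ, 0) + 1

    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order = {}
    while ready:
        node = ready.popleft()
        order[node] = len(order)
        for succ in graph.get(node, ()):
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)
    return order

flow = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": []}
positions = topological_order(flow)
```

Looking up a node that is not in the graph with positions.get(node) yields None, mirroring the documented behavior for a CFGNode that does not exist in the graph.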
get_subgraph(starting_node, block_addresses)

Get a sub-graph out of a bunch of basic block addresses.

Parameters:
  • starting_node (CFGNode) – The beginning of the subgraph
  • block_addresses (iterable) – A collection of block addresses that should be included in the subgraph if there is a path between starting_node and a CFGNode with the specified address, and all nodes on the path should also be included in the subgraph.
Returns:

A new CFG that only contains the specific subgraph.

Return type:

CFGAccurate

get_function_subgraph(start, max_call_depth=None)

Get a sub-graph of a certain function.

Parameters:
  • start – The function start. Currently it should be an integer.
  • max_call_depth – Call depth limit. None indicates no limit.
Returns:

A CFG instance which is a sub-graph of self.graph

unresolvables

Get those SimRuns that have non-resolvable exits.

Returns:A set of SimRuns
Return type:set
deadends

Get all CFGNodes that have an out-degree of 0

Returns:A list of CFGNode instances
Return type:list
class angr.analyses.cfg.cfg_base.CFGBase(sort, context_sensitivity_level, normalize=False, binary=None, force_segment=False, iropt_level=None, base_state=None)

The base class for control flow graphs.

functions

A reference to the FunctionManager in the current knowledge base.

Returns:FunctionManager with all functions
Return type:angr.knowledge_plugins.FunctionManager
make_copy(copy_to)

Copy self attributes to the new object.

Parameters:copy_to (CFGBase) – The target to copy to.
Returns:None
generate_index()

Generate an index of all nodes in the graph in order to speed up get_any_node() with anyaddr=True.

Returns:None
get_predecessors(cfgnode, excluding_fakeret=True, jumpkind=None)

Get predecessors of a node in the control flow graph.

Parameters:
  • cfgnode (CFGNode) – The node.
  • excluding_fakeret (bool) – True if you want to exclude all predecessors that are connected to the node with a fakeret edge.
  • jumpkind (str or None) – Only return predecessors with the specified jumpkind. This argument will be ignored if set to None.
Returns:

A list of predecessors

Return type:

list

get_successors(basic_block, excluding_fakeret=True, jumpkind=None)

Get successors of a node in the control flow graph.

Parameters:
  • basic_block (CFGNode) – The node.
  • excluding_fakeret (bool) – True if you want to exclude all successors that are connected to the node with a fakeret edge.
  • jumpkind (str or None) – Only return successors with the specified jumpkind. This argument will be ignored if set to None.
Returns:

A list of successors

Return type:

list

get_all_predecessors(cfgnode)

Get all predecessors of a specific node on the control flow graph.

Parameters:cfgnode (CFGNode) – The CFGNode object
Returns:A list of predecessors in the CFG
Return type:list
get_node(block_id)

Get a single node from node key.

Parameters:block_id (BlockID) – Block ID of the node.
Returns:The CFGNode
Return type:CFGNode
get_any_node(addr, is_syscall=None, anyaddr=False)

Get an arbitrary CFGNode (without considering their contexts) from our graph.

Parameters:
  • addr (int) – Address of the beginning of the basic block. Set anyaddr to True to support arbitrary address.
  • is_syscall (bool) – Whether you want to get the syscall node or any other node. This is due to the fact that syscall SimProcedures have the same address as the target it returns to. None means get either, True means get a syscall node, False means get something that isn’t a syscall node.
  • anyaddr (bool) – If anyaddr is True, then addr doesn’t have to be the beginning address of a basic block. By default the entire graph.nodes() will be iterated, and the first node containing the specific address is returned, which is slow. If you need to do many such queries, you may first call generate_index() to create some indices that may speed up the query.
Returns:

A CFGNode if there is any that satisfies given conditions, or None otherwise
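
The difference between a linear scan and an indexed lookup for anyaddr=True queries can be sketched as follows. This is not what generate_index() actually builds: it is a minimal illustration that assumes non-overlapping blocks given as (addr, size) pairs, and the class name BlockIndex is hypothetical.

```python
import bisect

class BlockIndex:
    """Find the block containing an arbitrary address in O(log n)."""

    def __init__(self, blocks):
        # blocks: iterable of (start_addr, size); assumed not to overlap.
        self._blocks = sorted(blocks)
        self._starts = [start for start, _ in self._blocks]

    def find_containing(self, addr):
        """Return the start of the block that contains addr, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        start, size = self._blocks[i]
        return start if addr < start + size else None

index = BlockIndex([(0x1000, 0x10), (0x1020, 0x8), (0x1010, 0x4)])
```

Building the index costs one sort, which pays off once many mid-block queries are issued.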

irsb_from_node(cfg_node)

Create an IRSB from a CFGNode object.

get_any_irsb(addr)

Returns an IRSB of a certain address. If there are many IRSBs with the same address in CFG, return an arbitrary one. You should never assume this method returns a specific IRSB.

Parameters:addr (int) – Address of the IRSB to get.
Returns:An arbitrary IRSB located at addr.
Return type:IRSB
get_all_nodes(addr, is_syscall=None, anyaddr=False)

Get all CFGNodes whose address is the specified one.

Parameters:
  • addr – Address of the node
  • is_syscall – True returns the syscall node, False returns the normal CFGNode, None returns both
Returns:

all CFGNodes

nodes_iter()

An iterator of all nodes in the graph.

Returns:The iterator.
Return type:iterator
get_all_irsbs(addr)

Returns all IRSBs of a certain address, without considering contexts.

get_branching_nodes()

Returns all nodes that have an out degree >= 2
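
As a minimal illustration of the out-degree criterion, the sketch below classifies nodes of a graph given as a dict of successor lists. The representation and both helper names are assumptions made for this example, not part of angr.

```python
def branching_nodes(graph):
    """Nodes with at least two distinct successors (out degree >= 2)."""
    return [node for node, succs in graph.items() if len(set(succs)) >= 2]

def deadend_nodes(graph):
    """Nodes with no successors at all (out degree 0)."""
    return [node for node, succs in graph.items() if not succs]

cfg_like = {
    0x1000: [0x1010, 0x1020],  # conditional branch
    0x1010: [0x1030],
    0x1020: [0x1030],
    0x1030: [],                # e.g. a return
}
```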

get_exit_stmt_idx(src_block, dst_block)

Get the corresponding exit statement ID for control flow to reach destination block from source block. The exit statement ID was put on the edge when creating the CFG. Note that there must be a direct edge between the two blocks, otherwise an exception will be raised.

Returns:The exit statement ID
normalize()

Normalize the CFG, making sure that there are no overlapping basic blocks.

Note that this method will not alter transition graphs of each function in self.kb.functions. You may call normalize() on each Function object to normalize their transition graphs.

Returns:None
remove_function_alignments()

Remove all function alignments.

Returns:None
make_functions()

Revisit the entire control flow graph, create Function instances accordingly, and correctly put blocks into each function.

Although Function objects are created during the CFG recovery, they are neither sound nor accurate. With a pre-constructed CFG, this method rebuilds all functions according to the following rules:

  • A block may only belong to one function.
  • Small functions lying inside the startpoint and the endpoint of another function will be merged with the other function
  • Tail call optimizations are detected.
  • PLT stubs are aligned by 16.
Returns:None
class angr.analyses.cfg.cfg_fast.Segment(start, end, sort)

Represents a memory block. This is not the “Segment” in the ELF memory model.

Parameters:
  • start (int) – Start address.
  • end (int) – End address.
  • sort (str) – Type of the segment, can be code, data, etc.
Returns:

None

size

Calculate the size of the Segment.

Returns:Size of the Segment.
Return type:int
copy()

Make a copy of the Segment.

Returns:A copy of the Segment instance.
Return type:angr.analyses.cfg_fast.Segment
class angr.analyses.cfg.cfg_fast.SegmentList

SegmentList describes a series of segmented memory blocks. You may query whether an address belongs to any of the blocks or not, and obtain the exact block(segment) that the address belongs to.

next_free_pos(address)

Returns the next free position with respect to an address, including that address itself

Parameters:address – The address to begin the search with (including itself)
Returns:The next free position
is_occupied(address)

Check if an address belongs to any segment

Parameters:address – The address to check
Returns:True if this address belongs to a segment, False otherwise
occupied_by_sort(address)

Check if an address belongs to any segment, and if yes, returns the sort of the segment

Parameters:address (int) – The address to check
Returns:Sort of the segment that occupies this address
Return type:str
occupy(address, size, sort)

Include a block, specified by (address, size), in this segment list.

Parameters:
  • address (int) – The starting address of the block.
  • size (int) – Size of the block.
  • sort (str) – Type of the block.
Returns:

None
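
The queries documented for SegmentList can be pictured with a much-simplified stand-in. The sketch below is not angr's data structure: it assumes callers never occupy overlapping ranges, does no merging of adjacent segments, and returns None from occupied_by_sort for a free address as a choice of this sketch. The class name SimpleSegmentList is hypothetical.

```python
import bisect

class SimpleSegmentList:
    """Sorted, non-overlapping (start, end, sort) segments; end is exclusive."""

    def __init__(self):
        self._starts = []    # sorted start addresses
        self._segments = []  # (start, end, sort), parallel to _starts

    def occupy(self, address, size, sort):
        """Record the block [address, address + size) with the given sort."""
        i = bisect.bisect_left(self._starts, address)
        self._starts.insert(i, address)
        self._segments.insert(i, (address, address + size, sort))

    def _segment_at(self, address):
        i = bisect.bisect_right(self._starts, address) - 1
        if i >= 0 and address < self._segments[i][1]:
            return self._segments[i]
        return None

    def is_occupied(self, address):
        return self._segment_at(address) is not None

    def occupied_by_sort(self, address):
        segment = self._segment_at(address)
        return segment[2] if segment is not None else None

    def next_free_pos(self, address):
        """First address >= address that is not inside any segment."""
        segment = self._segment_at(address)
        while segment is not None:
            address = segment[1]  # jump to the end of the occupying segment
            segment = self._segment_at(address)
        return address

segments = SimpleSegmentList()
segments.occupy(0x1000, 0x10, "code")
segments.occupy(0x1010, 0x8, "data")
```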

copy()

Make a copy of the SegmentList.

Returns:A copy of the SegmentList instance.
Return type:angr.analyses.cfg_fast.SegmentList
occupied_size

The sum of sizes of all blocks

Returns:An integer
has_blocks

Returns whether this segment list has any blocks, i.e. the negation of is_empty.

Returns:True if it’s not empty, False otherwise
class angr.analyses.cfg.cfg_fast.FunctionReturn(callee_func_addr, caller_func_addr, call_site_addr, return_to)

FunctionReturn describes a function call in a specific location and its return location. Hashable and equatable

class angr.analyses.cfg.cfg_fast.MemoryData(address, size, sort, irsb, irsb_addr, stmt, stmt_idx, pointer_addr=None, max_size=None, insn_addr=None)

MemoryData describes the syntactic contents of a single address of memory along with a set of references to this address (when not from previous instruction).

copy()

Make a copy of the MemoryData.

Returns:A copy of the MemoryData instance.
Return type:angr.analyses.cfg_fast.MemoryData
add_ref(irsb_addr, stmt_idx, insn_addr)

Add a reference from code to this memory data.

Parameters:
  • irsb_addr (int) – Address of the basic block.
  • stmt_idx (int) – ID of the statement referencing this data entry.
  • insn_addr (int) – Address of the instruction referencing this data entry.
Returns:

None

class angr.analyses.cfg.cfg_fast.CFGJob(addr, func_addr, jumpkind, ret_target=None, last_addr=None, src_node=None, src_ins_addr=None, src_stmt_idx=None, returning_source=None, syscall=False)

Defines a job to work on during the CFG recovery

class angr.analyses.cfg.cfg_fast.CFGFast(binary=None, regions=None, pickle_intermediate_results=False, symbols=True, function_prologues=True, resolve_indirect_jumps=True, force_segment=False, force_complete_scan=True, indirect_jump_target_limit=100000, collect_data_references=False, extra_cross_references=False, normalize=False, start_at_entry=True, function_starts=None, extra_memory_regions=None, data_type_guessing_handlers=None, arch_options=None, indirect_jump_resolvers=None, base_state=None, exclude_sparse_regions=True, skip_specific_regions=True, start=None, end=None, **extra_arch_options)

CFGFast finds functions inside the given binary and builds a control-flow graph very quickly: instead of simulating program execution, keeping track of states, and performing expensive data-flow analysis, it only performs lightweight analyses combined with some heuristics and some strong assumptions.

In order to identify as many functions as possible, as accurately as possible, the following sequence of operations is followed:

# Active scanning

  • If the binary has “function symbols” (TODO: this term is not accurate enough), they are starting points of the code scanning
  • If the binary does not have any “function symbol”, we will first perform a function prologue scanning on the entire binary, and start from those places that look like function beginnings
  • Otherwise, the binary’s entry point will be the starting point for scanning

# Passive scanning

  • After all active scans are done, we will go through the whole image and scan all code pieces
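
Read literally, the active-scanning rules above amount to a priority order for choosing starting points. The sketch below encodes that reading only; it is not CFGFast's code, and the function name and its three inputs (symbol addresses, prologue-scan hits, entry point) are hypothetical.

```python
def choose_scan_starts(function_symbols, prologue_hits, entry_point):
    """Pick starting addresses for active scanning, in priority order.

    function_symbols: addresses of function symbols found in the binary.
    prologue_hits: addresses that look like function prologues.
    entry_point: the binary's entry point address.
    """
    if function_symbols:
        return sorted(function_symbols)
    if prologue_hits:
        return sorted(prologue_hits)
    return [entry_point]

with_symbols = choose_scan_starts({0x401000, 0x400800}, [0x400900], 0x400500)
stripped = choose_scan_starts(set(), [0x400900, 0x400700], 0x400500)
blob = choose_scan_starts(set(), [], 0x400500)
```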

Due to the nature of those techniques that are used here, a base address is often not required to use this analysis routine. However, with a correct base address, CFG recovery will almost always yield a much better result. A custom analysis, called GirlScout, is specifically made to recover the base address of a binary blob. After the base address is determined, you may want to reload the binary with the new base address by creating a new Project object, and then re-recover the CFG.

Parameters:
  • binary – The binary to recover CFG on. By default the main binary is used.
  • regions (iterable) – A list of tuples in the form of (start address, end address) describing memory regions that the CFG should cover.
  • pickle_intermediate_results (bool) – If we want to store the intermediate results or not.
  • symbols (bool) – Get function beginnings from symbols in the binary.
  • function_prologues (bool) – Scan the binary for function prologues, and use those positions as function beginnings
  • resolve_indirect_jumps (bool) – Try to resolve indirect jumps. This is necessary to resolve jump targets from jump tables, etc.
  • force_segment (bool) – Force CFGFast to rely on binary segments instead of sections.
  • force_complete_scan (bool) – Perform a complete scan on the binary and maximize the number of identified code blocks.
  • collect_data_references (bool) – If CFGFast should collect data references from individual basic blocks or not.
  • extra_cross_references (bool) – True if we should collect data references for all places in the program that access each memory data entry, which requires more memory, and is noticeably slower. Setting it to False means each memory data entry has at most one reference (which is the initial one).
  • normalize (bool) – Normalize the CFG as well as all function graphs after CFG recovery.
  • start_at_entry (bool) – Begin CFG recovery at the entry point of this project. Setting it to False prevents CFGFast from viewing the entry point as one of the starting points of code scanning.
  • function_starts (list) – A list of extra function starting points. CFGFast will try to resume scanning from each address in the list.
  • extra_memory_regions (list) – A list of 2-tuple (start-address, end-address) that shows extra memory regions. Integers falling inside will be considered as pointers.
  • indirect_jump_resolvers (list) – A custom list of indirect jump resolvers. If this list is None or empty, default indirect jump resolvers specific to this architecture and binary types will be loaded.
  • base_state – A state to use as a backer for all memory loads
  • start (int) – (Deprecated) The beginning address of CFG recovery.
  • end (int) – (Deprecated) The end address of CFG recovery.
  • arch_options (CFGArchOptions) – Architecture-specific options.
  • extra_arch_options (dict) – Any key-value pair in kwargs will be seen as an arch-specific option and will be used to set the option value in self._arch_options.

Extra parameters that angr.Analysis takes:

Parameters:
  • progress_callback – Specify a callback function to get the progress during CFG recovery.
  • show_progressbar (bool) – Should CFGFast show a progressbar during CFG recovery or not.
Returns:

None

generate_code_cover()

Generate a list of all recovered basic blocks.

class angr.analyses.cfg.cfg_node.CFGNodeCreationFailure(exc_info=None, to_copy=None)

This class contains additional information for whenever creating a CFGNode failed. It includes a full traceback and the exception messages.

class angr.analyses.cfg.cfg_node.CFGNode(addr, size, cfg, callstack=None, input_state=None, simprocedure_name=None, syscall_name=None, looping_times=0, no_ret=False, is_syscall=False, syscall=None, function_address=None, final_states=None, block_id=None, irsb=None, instruction_addrs=None, depth=None, callstack_key=None, creation_failure_info=None, thumb=False, byte_string=None)

This class stands for each single node in CFG.

Note: simprocedure_name is not used to recreate the SimProcedure object. It’s only there for better __repr__.

downsize()

Drop saved states.

class angr.analyses.code_location.CodeLocation(block_addr, stmt_idx, sim_procedure=None, ins_addr=None, **kwargs)

Stands for a specific program point by specifying basic block address and statement ID (for IRSBs), or SimProcedure name (for SimProcedures).

Constructor.

Parameters:
  • block_addr (int) – Address of the block
  • stmt_idx (int) – Statement ID. None for SimProcedures
  • sim_procedure (class) – The corresponding SimProcedure class.
  • ins_addr (int) – The instruction address. Optional.
  • kwargs – Optional arguments, will be stored, but not used in __eq__ or __hash__.
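The note on kwargs can be illustrated with a toy class. This is a sketch and not angr's CodeLocation: ToyCodeLocation and its info attribute are hypothetical, and the choice of (block_addr, stmt_idx, sim_procedure) as the identity key is an assumption made for illustration.

```python
class ToyCodeLocation:
    """Toy stand-in for a program point; not angr's CodeLocation."""

    def __init__(self, block_addr, stmt_idx, sim_procedure=None, ins_addr=None, **kwargs):
        self.block_addr = block_addr
        self.stmt_idx = stmt_idx
        self.sim_procedure = sim_procedure
        self.ins_addr = ins_addr
        # Extra keyword arguments are kept for later inspection only.
        self.info = dict(kwargs)

    def _key(self):
        # Assumption: identity is the (block, statement, procedure) triple.
        return (self.block_addr, self.stmt_idx, self.sim_procedure)

    def __eq__(self, other):
        return isinstance(other, ToyCodeLocation) and self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


a = ToyCodeLocation(0x400100, 3, note="first visit")
b = ToyCodeLocation(0x400100, 3, note="second visit")
locations = {a, b}  # collapses to one entry because kwargs are ignored
```

Because the stored keyword arguments play no part in equality or hashing, two locations that differ only in their annotations behave as the same key in sets and dictionaries.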
class angr.analyses.ddg.AST(op, *operands)

A mini implementation for AST

class angr.analyses.ddg.ProgramVariable(variable, location, initial=False, arch=None)

Describes a variable in the program at a specific location.

Variables:
  • variable (SimVariable) – The variable.
  • location (CodeLocation) – Location of the variable.
class angr.analyses.ddg.LiveDefinitions

A collection of live definitions with some handy interfaces for definition killing and lookups.

Constructor.

branch()

Create a branch of the current live definition collection.

Returns:A new LiveDefinitions instance.
Return type:LiveDefinitions
copy()

Make a hard copy of self.

Returns:A new LiveDefinitions instance.
Return type:LiveDefinitions
add_def(variable, location, size_threshold=32)

Add a new definition of variable.

Parameters:
  • variable (SimVariable) – The variable being defined.
  • location (CodeLocation) – Location of the variable being defined.
  • size_threshold (int) – The maximum bytes to consider for the variable.
Returns:

True if the definition was new, False otherwise

Return type:

bool

add_defs(variable, locations, size_threshold=32)

Add a collection of new definitions of a variable.

Parameters:
  • variable (SimVariable) – The variable being defined.
  • locations (iterable) – A collection of locations where the variable was defined.
  • size_threshold (int) – The maximum bytes to consider for the variable.
Returns:

True if any of the definitions was new, False otherwise

Return type:

bool

kill_def(variable, location, size_threshold=32)

Add a new definition for variable and kill all previous definitions.

Parameters:
  • variable (SimVariable) – The variable to kill.
  • location (CodeLocation) – The location where this variable is defined.
  • size_threshold (int) – The maximum bytes to consider for the variable.
Returns:

None

lookup_defs(variable, size_threshold=32)

Find all definitions of the variable.

Parameters:
  • variable (SimVariable) – The variable to look up.
  • size_threshold (int) – The maximum bytes to consider for the variable. For example, if the variable is 100 byte long, only the first size_threshold bytes are considered.
Returns:

A set of code locations where the variable is defined.

Return type:

set
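The difference between add_def, kill_def, and lookup_defs can be sketched with a toy container. This is not angr's LiveDefinitions: ToyLiveDefinitions is hypothetical, plain strings and tuples stand in for SimVariable and CodeLocation objects, and size_threshold is ignored entirely.

```python
from collections import defaultdict


class ToyLiveDefinitions:
    """Toy model of add_def / kill_def / lookup_defs; ignores size_threshold."""

    def __init__(self):
        # variable name -> set of locations where it is currently defined
        self._defs = defaultdict(set)

    def add_def(self, variable, location):
        """Record one more live definition; return True if it was new."""
        is_new = location not in self._defs[variable]
        self._defs[variable].add(location)
        return is_new

    def kill_def(self, variable, location):
        """Replace every earlier definition of variable with this one."""
        self._defs[variable] = {location}

    def lookup_defs(self, variable):
        """Return a copy of the set of live definition sites."""
        return set(self._defs.get(variable, ()))


live = ToyLiveDefinitions()
live.add_def("rax", (0x400000, 2))
live.add_def("rax", (0x400010, 5))   # two definitions may reach this point
live.kill_def("rax", (0x400020, 1))  # an unconditional write kills both
```

After the kill, looking up "rax" yields only the last location, which is the behaviour the kill_def description implies.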

iteritems()

An iterator that returns all live definitions.

Returns:The iterator.
Return type:iter
itervariables()

An iterator that returns all live variables.

Returns:The iterator.
Return type:iter
class angr.analyses.ddg.DDGView(cfg, ddg, simplified=False)

A view of the data dependence graph.

class angr.analyses.ddg.DDG(cfg, start=None, call_depth=None, block_addrs=None)

This is a fast data dependence graph directly generated from our CFG analysis result. The only reason for its existence is the speed. There is zero guarantee for being sound or accurate. You are supposed to use it only when you want to track the simplest data dependence, and you do not care about soundness or accuracy.

For a better data dependence graph, please consider performing a better static analysis first (like Value-set Analysis), and then construct a dependence graph on top of the analysis result (for example, the VFG in angr).

Also note that since we are using states from CFG, any improvement in analysis performed on CFG (like a points-to analysis) will directly benefit the DDG.

Parameters:
  • cfg – Control flow graph. Please make sure each node has an associated state with it. You may want to generate your CFG with keep_state=True.
  • start – An address specifying where we start the generation of this data dependence graph.
  • call_depth – None or an integer. A non-negative integer specifies how deep we would like to track in the call tree. None disables the call_depth limit.
  • block_addrs (iterable or None) – A collection of block addresses that the DDG analysis should be performed on.
graph

Returns:A networkx DiGraph instance representing the dependence relations between statements.
Return type:networkx.DiGraph

data_graph

Get the data dependence graph.

Returns:A networkx DiGraph instance representing data dependence.
Return type:networkx.DiGraph
simplified_data_graph

return

pp()

Pretty printing.

dbg_repr()

Representation for debugging.

get_predecessors(code_location)

Returns all predecessors of the code location.

Parameters:code_location – A CodeLocation instance.
Returns:A list of all predecessors.
function_dependency_graph(func)

Get a dependency graph for the function func.

Parameters:func – The Function object in CFG.function_manager.
Returns:A networkx.DiGraph instance.
data_sub_graph(pv, simplified=True, killing_edges=False, excluding_types=None)

Get a subgraph from the data graph or the simplified data graph that starts from node pv.

Parameters:
  • pv (ProgramVariable) – The starting point of the subgraph.
  • simplified (bool) – When True, the simplified data graph is used, otherwise the data graph is used.
  • killing_edges (bool) – Are killing edges included or not.
  • excluding_types (iterable) – Excluding edges whose types are among those excluded types.
Returns:

A subgraph.

Return type:

networkx.MultiDiGraph

find_definitions(variable, location=None, simplified_graph=True)

Find all definitions of the given variable.

Parameters:
  • variable (SimVariable) –
  • simplified_graph (bool) – True if you just want to search in the simplified graph instead of the normal graph. Usually the simplified graph suffices for finding definitions of register or memory variables.
Returns:

A collection of all variable definitions to the specific variable.

Return type:

list

find_consumers(var_def, simplified_graph=True)

Find all consumers to the specified variable definition.

Parameters:
  • var_def (ProgramVariable) – The variable definition.
  • simplified_graph (bool) – True if we want to search in the simplified graph, False otherwise.
Returns:

A collection of all consumers to the specified variable definition.

Return type:

list

find_killers(var_def, simplified_graph=True)

Find all killers to the specified variable definition.

Parameters:
  • var_def (ProgramVariable) – The variable definition.
  • simplified_graph (bool) – True if we want to search in the simplified graph, False otherwise.
Returns:

A collection of all killers to the specified variable definition.

Return type:

list

find_sources(var_def, simplified_graph=True)

Find all sources to the specified variable definition.

Parameters:
  • var_def (ProgramVariable) – The variable definition.
  • simplified_graph (bool) – True if we want to search in the simplified graph, False otherwise.
Returns:

A collection of all sources to the specified variable definition.

Return type:

list

class angr.analyses.forward_analysis.GraphVisitor

A graph visitor takes a node in the graph and returns its successors. Typically it visits a control flow graph, and returns successors of a CFGNode each time. This is the base class of all graph visitors.

startpoints()

Get all start points to begin the traversal.

Returns:A list of startpoints that the traversal should begin with.
successors(node)

Get successors of a node. The node should be in the graph.

Parameters:node – The node to work with.
Returns:A list of successors.
Return type:list
predecessors(node)

Get predecessors of a node. The node should be in the graph.

Parameters:node – The node to work with.
Returns:A list of predecessors.
Return type:list
sort_nodes(nodes=None)

Get a list of all nodes sorted in an optimal traversal order.

Parameters:nodes (iterable) – A collection of nodes to sort. If none, all nodes in the graph will be used to sort.
Returns:A list of sorted nodes.
Return type:list
nodes_iter()

Return an iterator of nodes following an optimal traversal order.

Returns:
reset()

Reset the internal node traversal state. Must be called prior to visiting future nodes.

Returns:None
next_node()

Get the next node to visit.

Returns:A node in the graph.
all_successors(node, skip_reached_fixedpoint=False)

Returns all successors to the specific node.

Parameters:node – A node in the graph.
Returns:A set of nodes that are all successors to the given node.
Return type:set
revisit(node, include_self=True)

Revisit a node in the future. As a result, the successors to this node will be revisited as well.

Parameters:node – The node to revisit in the future.
Returns:None
reached_fixedpoint(node)

Mark a node as reached fixed-point. This node as well as all its successors will not be visited in the future.

Parameters:node – The node to mark as reached fixed-point.
Returns:None
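How these methods fit together can be sketched with a toy visitor over a plain successor map. It is not angr's GraphVisitor: ToyGraphVisitor is hypothetical, it only handles acyclic graphs, and using a topological order as the "optimal" traversal order is an assumption made for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+


class ToyGraphVisitor:
    """Toy visitor over an acyclic successor map; not angr's GraphVisitor."""

    def __init__(self, succ):
        self._succ = succ          # node -> list of successor nodes
        self._pending = []         # nodes still to visit, in traversal order
        self.reset()

    def successors(self, node):
        return list(self._succ.get(node, []))

    def sort_nodes(self, nodes=None):
        # TopologicalSorter expects node -> predecessors, so invert the map.
        preds = {n: set() for n in self._succ}
        for n, targets in self._succ.items():
            for t in targets:
                preds.setdefault(t, set()).add(n)
        order = list(TopologicalSorter(preds).static_order())
        if nodes is None:
            return order
        wanted = set(nodes)
        return [n for n in order if n in wanted]

    def reset(self):
        self._pending = self.sort_nodes()

    def next_node(self):
        return self._pending.pop(0) if self._pending else None

    def all_successors(self, node):
        seen, stack = set(), self.successors(node)
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(self.successors(n))
        return seen

    def revisit(self, node, include_self=True):
        # Re-queue the node (optionally) and everything reachable from it.
        again = self.all_successors(node)
        if include_self:
            again.add(node)
        self._pending = self.sort_nodes(set(self._pending) | again)


visitor = ToyGraphVisitor({"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []})
first = visitor.next_node()
```

Calling revisit("B") after the queue has drained would schedule B and then D again, mirroring the statement that successors of a revisited node are revisited as well.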
class angr.analyses.forward_analysis.JobInfo(key, job)

Stores information of each job.

job

Get the latest available job.

Returns:The latest available job.
add_job(job, merged=False, widened=False)

Append a new job to this JobInfo node.

Parameters:
  • job – The new job to append.
  • merged (bool) – Whether it is a merged job or not.
  • widened (bool) – Whether it is a widened job or not.

class angr.analyses.forward_analysis.ForwardAnalysis(order_jobs=False, allow_merging=False, allow_widening=False, status_callback=None, graph_visitor=None)

This is my very first attempt to build a static forward analysis framework that can serve as the base of multiple static analyses in angr, including CFG analysis, VFG analysis, DDG, etc.

In short, ForwardAnalysis performs a forward data-flow analysis by traversing a graph, computing on abstract values, and storing results in abstract states. The user can specify what graph to traverse, how a graph should be traversed, how abstract values and abstract states are defined, etc.

ForwardAnalysis has a few options to toggle, making it suitable to be the base class of several different styles of forward data-flow analysis implementations.

ForwardAnalysis supports a special mode when no graph is available for traversal (for example, when a CFG is being initialized and constructed, no other graph can be used). In that case, the graph traversal functionality is disabled, and the optimal graph traversal order is not guaranteed. The user can provide a job sorting method to sort the jobs in queue and optimize traversal order.

Feel free to discuss with me (Fish) if you have any suggestions or complaints.

Constructor

Parameters:
  • order_jobs (bool) – If all jobs should be ordered or not.
  • allow_merging (bool) – If job merging is allowed.
  • allow_widening (bool) – If job widening is allowed.
  • graph_visitor (GraphVisitor or None) – A graph visitor to provide successors.
Returns:

None
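The general shape of such a forward data-flow analysis can be sketched as a worklist loop. This is a minimal illustration under stated assumptions, not the ForwardAnalysis implementation: forward_analysis and its transfer and merge callbacks are hypothetical names, and merging here is a plain set union.

```python
def forward_analysis(succ, entry, transfer, merge, initial):
    """Toy worklist-based forward data-flow analysis.

    succ     -- node -> list of successors (the graph to traverse)
    transfer -- function (node, in_state) -> out_state
    merge    -- function (old_state, new_state) -> merged state
    initial  -- abstract state at the entry node
    """
    in_states = {entry: initial}
    worklist = [entry]
    while worklist:
        node = worklist.pop(0)
        out_state = transfer(node, in_states[node])
        for nxt in succ.get(node, []):
            old = in_states.get(nxt)
            new = out_state if old is None else merge(old, out_state)
            if new != old:
                # The abstract state changed, so the successor must be (re)visited.
                in_states[nxt] = new
                if nxt not in worklist:
                    worklist.append(nxt)
    return in_states


# Example: track which variables have been assigned on some path to each block.
graph = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
assigned = {"entry": {"a"}, "then": {"b"}, "else": {"c"}, "exit": set()}

result = forward_analysis(
    graph,
    "entry",
    transfer=lambda node, state: state | assigned[node],
    merge=lambda old, new: old | new,
    initial=frozenset(),
)
```

The state reaching "exit" is the union of what both branches assigned, which is the effect of merging jobs that arrive at the same node.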

should_abort

Whether the analysis should be terminated.

Returns:True/False

abort()

Abort the analysis.

Returns:None

class angr.analyses.girlscout.GirlScout(binary=None, start=None, end=None, pickle_intermediate_results=False, perform_full_code_scan=False)

We find functions inside the given binary, try to decide the base address if needed, and build a control-flow graph on top of that to see if there is an entry or not. Obviously if the binary is not loaded as a blob (not using Blob as its backend), GirlScout will not try to determine the base address.

It’s also optional to perform a full code scan of the binary to show where all codes are. By default we don’t scan the entire binary since it’s time consuming.

You probably need a BoyScout to determine the possible architecture and endianness of your binary blob.

genenare_callmap_sif(filepath)

Generate a sif file from the call map

generate_code_cover()

Generate a list of all recovered basic blocks.

class angr.analyses.loopfinder.LoopFinder(functions=None, normalize=True)

Extracts all the loops from all the functions in a binary.

class angr.analyses.veritesting.CallTracingFilter(project, depth, blacklist=None)

Filter to apply during CFG creation on a given state and jumpkind to determine if it should be skipped at a certain depth

filter(call_target_state, jumpkind)

The call will be skipped if it returns True.

Parameters:
  • call_target_state – The new state of the call target.
  • jumpkind – The Jumpkind of this call.
Returns:

True if we want to skip this call, False otherwise.

class angr.analyses.veritesting.Veritesting(input_state, boundaries=None, loop_unrolling_limit=10, enable_function_inlining=False, terminator=None, deviation_filter=None)

An exploration technique that condenses chunks of code into single (nested) if-then-else constraints, using CFGAccurate to conduct Static Symbolic Execution (SSE), i.e. conversion to a single constraint.

SSE stands for Static Symbolic Execution, and we also implemented an extended version of Veritesting (Avgerinos, Thanassis, et al, ICSE 2014).

Parameters:
  • input_state – The initial state to begin the execution with.
  • boundaries – Addresses where execution should stop.
  • loop_unrolling_limit – The maximum times that Veritesting should unroll a loop for.
  • enable_function_inlining – Whether we should enable function inlining and syscall inlining.
  • terminator – A callback function that takes a state as parameter. Veritesting will terminate if this function returns True.
  • deviation_filter – A callback function that takes a state as parameter. Veritesting will put the state into “deviated” stash if this function returns True.
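The idea of condensing branches into nested if-then-else expressions can be shown with a toy merge. This sketch is not the Veritesting implementation and does not use claripy: merge_branches, evaluate, and the tuple encoding ("ITE", cond, a, b) are hypothetical and exist only for illustration.

```python
def merge_branches(cond, then_vars, else_vars):
    """Merge two variable maps into one, guarding differing values with ITE.

    Expressions are plain tuples such as ("ITE", cond, a, b); this is a toy
    encoding, not claripy.
    """
    merged = {}
    for name in sorted(set(then_vars) | set(else_vars)):
        a = then_vars.get(name)
        b = else_vars.get(name)
        # Identical values need no guard; differing ones become if-then-else.
        merged[name] = a if a == b else ("ITE", cond, a, b)
    return merged


def evaluate(expr, env):
    """Evaluate a toy expression under a concrete assignment of conditions."""
    if isinstance(expr, tuple) and expr[0] == "ITE":
        _, cond, a, b = expr
        return evaluate(a if env[cond] else b, env)
    return expr


# if (c1) { x = 1; y = 7 } else { if (c2) x = 2 else x = 3; y = 7 }
inner = merge_branches("c2", {"x": 2, "y": 7}, {"x": 3, "y": 7})
state = merge_branches("c1", {"x": 1, "y": 7}, inner)
```

Three paths collapse into one state in which x is a nested if-then-else over c1 and c2, while y stays a plain constant because every path agrees on it.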
is_not_in_cfg(s)

Returns whether s.addr is not a proper node in our CFG.

Parameters:s (SimState) – The SimState instance to test.
Returns:False if our CFG contains s.addr, True otherwise.
Return type:bool
is_overbound(state)

Filter out all states that run out of boundaries or loop too many times.

Parameters:state (SimState) – The SimState instance to check.
Returns:True if outside of the mem/loop_ctr boundary.
Return type:bool

class angr.analyses.vfg.VFGJob(*args, **kwargs)

A job descriptor that contains local variables used during VFG analysis.

class angr.analyses.vfg.AnalysisTask

An analysis task describes a task that should be done before popping this task out of the task stack and discarding it.

class angr.analyses.vfg.FunctionAnalysis(function_address, return_address)

Analyze a function, generate fix-point states from all endpoints of that function, and then merge them to one state.

class angr.analyses.vfg.CallAnalysis(address, return_address, function_analysis_tasks=None, mergeable_plugins=None)

Analyze a call by analyzing all functions this call might be calling, collecting all final states generated by analyzing those functions, and merging them into one state.

class angr.analyses.vfg.VFGNode(addr, key, state=None)

A descriptor of nodes in a Value-Flow Graph

Constructor.

Parameters:
  • addr (int) –
  • key (BlockID) –
  • state (SimState) –
append_state(s, is_widened_state=False)

Append a new state to this VFGNode.

Parameters:
  • s – The new state to append.
  • is_widened_state – Whether it is a widened state or not.

class angr.analyses.vfg.VFG(cfg=None, context_sensitivity_level=2, start=None, function_start=None, interfunction_level=0, initial_state=None, avoid_runs=None, remove_options=None, timeout=None, max_iterations_before_widening=8, max_iterations=40, widening_interval=3, final_state_callback=None, status_callback=None, record_function_final_states=False)

This class represents a control-flow graph with static analysis result.

Perform abstract interpretation analysis starting from the given function address. The output is an invariant at the beginning (or the end) of each basic block.

Steps:

  • Generate a CFG first if CFG is not provided.
  • Identify all merge points (denote the set of merge points as Pw) in the CFG.
  • Cut those loop back edges (can be derived from Pw) so that we gain an acyclic CFG.
  • Identify all variables that are 1) from memory loading 2) from initial values, or 3) phi functions. Denote
    the set of those variables as S_{var}.
  • Start real AI analysis and try to compute a fix point of each merge point. Perform widening/narrowing only on
    variables in S_{var}.
Parameters:
  • cfg – The control-flow graph to base this analysis on. If none is provided, we will construct a CFGAccurate.
  • context_sensitivity_level – The level of context-sensitivity of this VFG. It ranges from 0 to infinity. Default 2.
  • function_start – The address of the function to analyze.
  • interfunction_level – The level of interfunction-ness to be
  • initial_state – A state to use as the initial one
  • avoid_runs – A list of runs to avoid
  • remove_options – State options to remove from the initial state. It only works when initial_state is None
  • timeout (int) –
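The fix-point step with widening can be illustrated on the simplest abstract domain, integer intervals. This is a sketch under assumptions and not VFG's code: join, widen, and loop_head_invariant are hypothetical, and the two default limits merely echo max_iterations_before_widening=8 and max_iterations=40 from the signature above.

```python
import math


def join(a, b):
    """Least upper bound of two intervals (lo, hi)."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def widen(old, new):
    """Standard interval widening: any bound that grew jumps to infinity."""
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)


def loop_head_invariant(init, step, iterations_before_widening=8, max_iterations=40):
    """Iterate x = x + step at a merge point until the interval is stable."""
    state = init
    for i in range(max_iterations):
        after_body = (state[0] + step, state[1] + step)
        candidate = join(state, after_body)
        if i >= iterations_before_widening:
            # Plain joins did not converge in time, so force convergence.
            candidate = widen(state, candidate)
        if candidate == state:
            return state, i
        state = candidate
    return state, max_iterations


invariant, rounds = loop_head_invariant((0, 0), step=1)
```

Without widening the upper bound would grow by one per iteration until max_iterations is hit; with widening the interval stabilises shortly after the threshold, at the cost of precision.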
get_any_node(addr)

Get any VFG node corresponding to the basic block at @addr. Note that depending on the context sensitivity level, there might be multiple nodes corresponding to different contexts. This function will return the first one it encounters, which might not be what you want.

get_paths(begin, end)

Get all the simple paths between @begin and @end.

Returns:A list of angr.Path instances.

class angr.analyses.vsa_ddg.DefUseChain(def_loc, use_loc, variable)

Stands for a def-use chain. It is generated by the DDG itself.

Constructor.

Parameters:
  • def_loc
  • use_loc
  • variable
Returns:

class angr.analyses.vsa_ddg.VSA_DDG(vfg=None, start_addr=None, interfunction_level=0, context_sensitivity_level=2, keep_data=False)

A data dependency graph based on VSA states. That means we don’t (and shouldn’t) expect any symbolic expressions.

Constructor.

Parameters:
  • vfg – An already constructed VFG. If not specified, a new VFG will be created with other specified parameters. vfg and start_addr cannot both be unspecified.
  • start_addr – The address where to start the analysis (typically, a function’s entry point).
  • interfunction_level – See VFG analysis.
  • context_sensitivity_level – See VFG analysis.
  • keep_data – Whether we keep set of addresses as edges in the graph, or just the cardinality of the sets, which can be used as a “weight”.
get_predecessors(code_location)

Returns all predecessors of code_location.

Parameters:code_location – A CodeLocation instance.
Returns:A list of all predecessors.
get_all_nodes(simrun_addr, stmt_idx)

Get all DDG nodes matching the given basic block address and statement index.

class angr.blade.Blade(graph, dst_run, dst_stmt_idx, direction='backward', project=None, cfg=None, ignore_sp=False, ignore_bp=False, ignored_regs=None, max_level=3, base_state=None)

Blade is a light-weight program slicer that works with networkx DiGraph containing CFGNodes. It is meant to be used in angr for small or on-the-fly analyses.

Parameters:
  • graph (networkx.DiGraph) – A graph representing the control flow graph. Note that it does not take angr.analyses.CFGAccurate or angr.analyses.CFGFast.
  • dst_run (int) – An address specifying the target SimRun.
  • dst_stmt_idx (int) – The target statement index. -1 means executing until the last statement.
  • direction (str) – ‘backward’ or ‘forward’ slicing. Forward slicing is not yet supported.
  • project (angr.Project) – The project instance.
  • cfg (angr.analyses.CFGBase) – the CFG instance. It will be made mandatory later.
  • ignore_sp (bool) – Whether the stack pointer should be ignored in dependency tracking. Any dependency from/to stack pointers will be ignored if this option is True.
  • ignore_bp (bool) – Whether the base pointer should be ignored or not.
  • max_level (int) – The maximum number of blocks that we trace back for.
Returns:

None
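Backward slicing with ignored registers can be sketched on a single straight-line block. This is a toy and not Blade: backward_slice and the (defined_name, used_names) statement encoding are hypothetical, and cross-block tracing (max_level) is left out.

```python
def backward_slice(statements, target_idx, ignored_regs=()):
    """Return indices of statements the target statement depends on.

    statements -- list of (defined_name, used_names) pairs in execution order
    target_idx -- index of the statement to slice from; -1 means the last one
    """
    if target_idx == -1:
        target_idx = len(statements) - 1
    ignored = set(ignored_regs)
    wanted = set(statements[target_idx][1]) - ignored
    in_slice = [target_idx]
    for idx in range(target_idx - 1, -1, -1):
        defined, used = statements[idx]
        if defined in wanted:
            # This statement produces a value the slice needs.
            in_slice.append(idx)
            wanted.discard(defined)
            wanted |= set(used) - ignored
    return sorted(in_slice)


block = [
    ("t0", ["rdi"]),        # 0: t0 = rdi
    ("rsp", ["rsp"]),       # 1: rsp = rsp - 8
    ("t1", ["t0", "rsp"]),  # 2: t1 = t0 + rsp
    ("rbx", ["rcx"]),       # 3: rbx = rcx   (unrelated)
    ("rax", ["t1"]),        # 4: rax = t1
]
with_sp = backward_slice(block, -1)
without_sp = backward_slice(block, -1, ignored_regs=("rsp",))
```

Ignoring the stack pointer drops statement 1 from the slice, which is the kind of pruning ignore_sp is described as providing.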

class angr.slicer.SimSlicer(arch, statements, target_tmps=None, target_regs=None, target_stack_offsets=None, inslice_callback=None, inslice_callback_infodict=None)

A super lightweight intra-IRSB slicing class.

class angr.annocfg.AnnotatedCFG(project, cfg=None, detect_loops=False)

AnnotatedCFG is a control flow graph with statement whitelists and exit whitelists to describe a slice of the program.

Constructor.

Parameters:
  • project – The angr Project instance
  • cfg – Control flow graph. Only used when path prioritizer is used.
  • detect_loops – Only used when path prioritizer is used.
from_digraph(digraph)

Initialize this AnnotatedCFG object with a networkx.DiGraph consisting of the following form of nodes:

Tuples like (block address, statement ID)

Those nodes are connected by edges indicating the execution flow.

Parameters:digraph – A networkx.DiGraph object
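One way to picture how such a graph turns into statement and exit whitelists is the sketch below. It is an illustration and not AnnotatedCFG's code: whitelists_from_edges is hypothetical, a plain list of edges stands in for the networkx.DiGraph, and deriving exits from edges that cross blocks is an assumption.

```python
from collections import defaultdict


def whitelists_from_edges(edges):
    """Build per-block statement whitelists and block-to-block exits.

    edges -- iterable of ((src_block, src_stmt), (dst_block, dst_stmt)) pairs,
             standing in for the edges of a networkx.DiGraph.
    """
    statements = defaultdict(set)   # block address -> whitelisted statement IDs
    exits = defaultdict(set)        # block address -> allowed successor blocks
    for (src_block, src_stmt), (dst_block, dst_stmt) in edges:
        statements[src_block].add(src_stmt)
        statements[dst_block].add(dst_stmt)
        if src_block != dst_block:
            # Control leaves src_block for dst_block along this edge.
            exits[src_block].add(dst_block)
    return ({b: sorted(s) for b, s in statements.items()},
            {b: sorted(t) for b, t in exits.items()})


edges = [
    ((0x400000, 2), (0x400000, 5)),
    ((0x400000, 5), (0x400020, 0)),
    ((0x400020, 0), (0x400020, 3)),
]
stmt_whitelist, exit_whitelist = whitelists_from_edges(edges)
```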
add_loop(loop_tuple)

A loop tuple contains a series of IRSB addresses that form a loop. Ideally it always starts with the first IRSB that we meet during the execution.

get_whitelisted_statements(addr)
Returns:True if all statements are whitelisted
dbg_print_irsb(irsb_addr, project=None)

Pretty-print an IRSB with whitelist information

keep_path(path)

Given a path, returns True if the path should be kept, False if it should be cut.

filter_path(path)

Used for debugging.

Parameters:path – A Path instance
Returns:True/False
path_priority(path)

Given a path, returns the path priority. A lower number means a higher priority.

successor_func(path)

Callback routine that takes in a path, and returns all feasible successors to path group. This callback routine should be passed to the keyword argument “successor_func” of PathGroup.step().

Parameters:path – A Path instance.
Returns:A list of all feasible Path successors.

SimOS

Manage OS-level configuration.

class angr.simos.IRange(start, end)

A simple range object for testing inclusion. Like xrange but works for huge numbers.
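A range object of this kind needs little more than a membership test. The class below is a toy stand-in and not angr's IRange: ToyIRange is hypothetical, and treating the range as half-open (start inclusive, end exclusive) is an assumption based on the comparison with xrange.

```python
class ToyIRange:
    """Half-open range [start, end) that only supports membership tests."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __contains__(self, value):
        # Assumption: start is inclusive and end is exclusive, as in xrange.
        return self.start <= value < self.end


kernel_space = ToyIRange(0xFFFF800000000000, 2 ** 64)
```

Since only two comparisons are involved, the test costs the same regardless of how wide the range is.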

class angr.simos.SimOS(project, name=None)

A class describing OS/arch-level configuration.

configure_project()

Configure the project to set up global settings (like SimProcedures).

state_blank(addr=None, initial_prefix=None, stack_size=8388608, **kwargs)

Initialize a blank state.

All parameters are optional.

Parameters:
  • addr – The execution start address.
  • initial_prefix
  • stack_size – The number of bytes to allocate for stack space
Returns:

The initialized SimState.

Any additional arguments will be passed to the SimState constructor

prepare_call_state(calling_state, initial_state=None, preserve_registers=(), preserve_memory=())

This function prepares a state that is executing a call instruction. If given an initial_state, it copies over all of the critical registers to it from the calling_state. Otherwise, it prepares the calling_state for action.

This is mostly used to create minimalistic states for CFG generation. Some ABIs, such as MIPS PIE and x86 PIE, require certain information to be maintained in certain registers. For example, for PIE MIPS, this function transfers t9, gp, and ra to the new state.

prepare_function_symbol(symbol_name, basic_addr=None)

Prepare the address space with the data necessary to perform relocations pointing to the given symbol.

Returns a 2-tuple. The first item is the address of the function code, the second is the address of the relocation target.

handle_exception(successors, engine, exc_type, exc_value, exc_traceback)

Perform exception handling. This method will be called when, during execution, a SimException is thrown. Currently, this can only indicate a segfault, but in the future it could indicate any unexpected exceptional behavior that can’t be handled by ordinary control flow.

The method may mutate the provided SimSuccessors object in any way it likes, or re-raise the exception.

Parameters:
  • successors – The SimSuccessors object currently being executed on
  • engine – The engine that was processing this step
  • exc_type – The value of sys.exc_info()[0] from the error, the type of the exception that was raised
  • exc_value – The value of sys.exc_info()[1] from the error, the actual exception object
  • exc_traceback – The value of sys.exc_info()[2] from the error, the traceback from the exception
class angr.simos.SimUserland(project, syscall_library=None, **kwargs)

This is a base class for any SimOS that wants to support syscalls.

It uses the CLE kernel object to provide addresses for syscalls. Syscalls will be emulated as a jump to one of these addresses, where a SimProcedure from the syscall library provided at construction time will be executed.

syscall(state, allow_unsupported=True)

Given a state, return the procedure corresponding to the current syscall. This procedure will have .syscall_number, .display_name, and .addr set.

Parameters:
  • state – The state to get the syscall number from
  • allow_unsupported – Whether to return a “dummy” syscall instead of raising an unsupported exception
is_syscall_addr(addr)

Return whether or not the given address corresponds to a syscall.

syscall_from_addr(addr, allow_unsupported=True)

Get a syscall SimProcedure from an address.

Parameters:
  • addr – The address to convert to a syscall SimProcedure
  • allow_unsupported – Whether to return a dummy procedure for an unsupported syscall instead of raising an exception.
Returns:

The SimProcedure for the syscall, or None if the address is not a syscall address.
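The address-based lookup described here can be pictured with a toy table. It is not SimUserland: ToySyscallTable, its base_addr and table_size fields, and the "stub_N" placeholder names are hypothetical, and mapping syscall number N to address base_addr + N is an assumption made for illustration.

```python
class ToySyscallTable:
    """Toy model of address-based syscall lookup; not angr's SimUserland."""

    def __init__(self, base_addr, procedures, table_size=512):
        self.base_addr = base_addr       # hypothetical start of the syscall region
        self.table_size = table_size     # hypothetical number of syscall slots
        self.procedures = procedures     # syscall number -> procedure name

    def is_syscall_addr(self, addr):
        return self.base_addr <= addr < self.base_addr + self.table_size

    def syscall_from_addr(self, addr, allow_unsupported=True):
        if not self.is_syscall_addr(addr):
            # Not inside the syscall region at all.
            return None
        number = addr - self.base_addr
        if number in self.procedures:
            return self.procedures[number]
        if allow_unsupported:
            # Hand back a placeholder instead of failing.
            return "stub_%d" % number
        raise KeyError("unsupported syscall %d" % number)


table = ToySyscallTable(0x3000000, {0: "read", 1: "write"})
```

The three outcomes match the documented contract: a known procedure, a dummy for an unsupported number when allow_unsupported is set, and None for an address outside the syscall region.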

class angr.simos.SimLinux(project, **kwargs)

OS-specific configuration for *nix-y OSes.

prepare_function_symbol(symbol_name, basic_addr=None)

Prepare the address space with the data necessary to perform relocations pointing to the given symbol.

Returns a 2-tuple. The first item is the address of the function code, the second is the address of the relocation target.

class angr.simos.SimCGC(project, **kwargs)

Environment configuration for the CGC DECREE platform

class angr.simos.SimWindows(project, **kwargs)

Environment for the Windows Win32 subsystem. Does not support syscalls currently.