simuvex — Program State and Emulation

This module handles constraint generation.

Program State

class simuvex.s_state.SimState(arch='AMD64', plugins=None, memory_backer=None, permissions_backer=None, mode=None, options=None, add_options=None, remove_options=None, special_memory_filler=None, os_name=None)

The SimState represents the state of a program, including its memory, registers, and so forth.

Variables:
  • regs – A convenient view of the state’s registers, where each register is a property
  • mem – A convenient view of the state’s memory, a simuvex.plugins.view.SimMemView
  • registers – The state’s register file as a flat memory region
  • memory – The state’s memory as a flat memory region
  • se – The solver engine for this state
  • inspect – The breakpoint manager, a simuvex.plugins.inspect.SimInspector
  • log – Information about the state’s history
  • scratch – Information about the current execution step
  • posix – MISNOMER: information about the operating system or environment model
  • libc – Information about the standard library we are emulating
  • cgc – Information about the cgc environment
  • uc_manager – Control of under-constrained symbolic execution
  • unicorn – Control of the Unicorn Engine
ip

Get the instruction pointer expression, trigger SimInspect breakpoints, and generate SimActions. Use _ip to not trigger breakpoints or generate actions.

Returns:an expression
simplify(*args)

Simplify this state’s constraints.

add_constraints(*args, **kwargs)

Add some constraints to the state.

You may pass in any number of symbolic booleans as variadic positional arguments.

satisfiable(**kwargs)

Whether the state’s constraints are satisfiable

downsize()

Clean up after the solver engine. Calling this when a state no longer needs to be solved will reduce memory usage.

copy()

Returns a copy of the state.

merge(*others, **kwargs)

Merges this state with the other states.

Parameters:
  • states – the states to merge
  • merge_conditions – a tuple of the conditions under which each state holds
  • common_ancestor – a state that represents the common history between the states being merged
Returns:

(merged state, merge flag, a bool indicating if any merging occurred)

widen(*others)

Perform a widening between self and the other states.

reg_concrete(*args, **kwargs)

Returns the contents of a register but, if that register is symbolic, raises a SimValueError.

mem_concrete(*args, **kwargs)

Returns the contents of a memory location but, if the contents are symbolic, raises a SimValueError.

stack_push(*args, **kwargs)

Push ‘thing’ to the stack, writing the thing to memory and adjusting the stack pointer.

stack_pop(*args, **kwargs)

Pops from the stack and returns the popped thing. The length will be the architecture word size.

stack_read(*args, **kwargs)

Reads length bytes, at an offset into the stack.

Parameters:
  • offset – The offset from the stack pointer.
  • length – The number of bytes to read.
  • bp – If True, offset from the BP instead of the SP. Default: False.
dbg_print_stack(depth=None, sp=None)

Only used for debugging purposes. Returns the current stack info as a formatted string. If depth is None, the current stack frame (from sp to bp) will be printed out.

Calling Conventions

class simuvex.s_cc.ArgSession(cc)

A class to keep track of the state accumulated in laying parameters out into memory

class simuvex.s_cc.SimCC(arch, args=None, ret_val=None, sp_delta=None, func_ty=None)

A calling convention allows you to extract from a state the data passed from function to function by calls and returns. Most of the methods provided by SimCC that operate on a state assume that the program is just after a call but just before stack frame allocation, though this may be overridden with the stack_base parameter to each individual method.

This is the base class for all calling conventions.

An instance of this class can be tweaked to match the way a specific function should be called.

Parameters:
  • arch – The Archinfo arch for this CC
  • args – A list of SimFunctionArguments describing where the arguments go
  • ret_val – A SimFunctionArgument describing where the return value goes
  • sp_delta – The amount the stack pointer changes over the course of this function - CURRENTLY UNUSED
  • func_ty – A SimType for the function itself

classmethod from_arg_kinds(arch, fp_args, ret_fp=False, sizes=None, sp_delta=None, func_ty=None)

Get an instance of the class that will extract floating-point/integral args correctly.

Parameters:
  • arch – The Archinfo arch for this CC
  • fp_args – A list, with one entry for each argument the function can take. True if the argument is fp, false if it is integral.
  • ret_fp – True if the return value for the function is fp.
  • sizes – Optional: A list, with one entry for each argument the function can take. Each entry is the size of the corresponding argument in bytes.
  • sp_delta – The amount the stack pointer changes over the course of this function - CURRENTLY UNUSED
  • func_ty – A SimType for the function itself

int_args

Iterate through all the possible arg positions that can only be used to store integer or pointer values. Does not take into account customizations.

Returns an iterator of SimFunctionArguments

both_args

Iterate through all the possible arg positions that can be used to store any kind of argument. Does not take into account customizations.

Returns an iterator of SimFunctionArguments

fp_args

Iterate through all the possible arg positions that can only be used to store floating-point values. Does not take into account customizations.

Returns an iterator of SimFunctionArguments

is_fp_arg(arg)

This should take a SimFunctionArgument instance and return whether or not that argument is a floating-point argument.

Returns True if the argument MUST be a floating-point arg, False if it MUST NOT be a floating-point arg, and None when it can be either.
class ArgSession(cc)

A class to keep track of the state accumulated in laying parameters out into memory

SimCC.arg_session

Return an arg session.

A session provides the control interface necessary to describe how integral and floating-point arguments are laid out into memory. The default behavior is that there is a finite list of int-only and fp-only argument slots and an infinite number of generic slots; when an argument of a given type is requested, the most specific slot available is used. If you need different behavior, subclass ArgSession.
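
The slot policy described above can be modelled in a few lines. This is a toy sketch and not simuvex's implementation: the class name ToyArgSession, the register names, and the 8-byte stack slot size are all assumptions chosen for illustration, and it assumes that type-specific slots are preferred over generic ones.

```python
from itertools import count


class ToyArgSession:
    """Toy model of the default slot policy (hypothetical, not simuvex code)."""

    def __init__(self, int_slots, fp_slots, word_size=8):
        # Finite, type-specific slots (for example registers), consumed in order.
        self._int_slots = iter(int_slots)
        self._fp_slots = iter(fp_slots)
        # Unbounded supply of generic slots, modelled here as stack offsets.
        self._generic = (("stack", off) for off in count(0, word_size))

    def next_arg(self, is_fp):
        """Return the next location for an argument of the given kind."""
        specific = self._fp_slots if is_fp else self._int_slots
        # Prefer a type-specific slot; fall back to a generic one when exhausted.
        slot = next(specific, None)
        return slot if slot is not None else next(self._generic)


session = ToyArgSession(int_slots=["rdi", "rsi"], fp_slots=["xmm0"])
locations = [session.next_arg(is_fp) for is_fp in (False, True, False, False, True)]
```

Once the two integer slots and the single floating-point slot are used up, every further argument lands in a generic stack slot regardless of its kind.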

SimCC.stack_space(args)
Parameters:args – A list of SimFunctionArguments
Returns:The number of bytes that should be allocated on the stack to store all these args, NOT INCLUDING the return address.
SimCC.return_val

The location the return value is stored.

SimCC.return_addr

The location the return address is stored.

SimCC.arg_locs(is_fp, sizes=None)

Pass this a list of whether each parameter is floating-point or not, and get back a list of SimFunctionArguments. Optionally, pass a list of argument sizes (in bytes) as well.

If you’ve customized this CC, this will sanity-check the provided locations with the given list.

SimCC.arg(state, index, stack_base=None)

Returns a bitvector expression representing the nth argument of a function.

stack_base is an optional pointer to the top of the stack at the function start. If it is not specified, use the current stack pointer.

WARNING: this assumes that none of the arguments are floating-point and they’re all single-word-sized, unless you’ve customized this CC.

SimCC.get_args(state, is_fp=None, sizes=None, stack_base=None)

is_fp should be a list of booleans specifying whether each corresponding argument is floating-point - True for fp and False for int. For a shorthand to assume that all the parameters are int, pass the number of parameters as an int.

If you’ve customized this CC, you may omit this parameter entirely. If it is provided, it is used for sanity-checking.

sizes is an optional list of argument sizes, in bytes. Be careful about using this if you’ve made explicit the arg locations, since it might decide to combine two locations into one if an arg is too big.

stack_base is an optional pointer to the top of the stack at the function start. If it is not specified, use the current stack pointer.

Returns a list of bitvector expressions representing the arguments of a function.
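
The two accepted forms of is_fp (a list of booleans, or an int as shorthand for "that many integer arguments") can be illustrated with a small normalizing helper. The function name normalize_is_fp is hypothetical and is not part of simuvex; it only sketches the convention described above.

```python
def normalize_is_fp(is_fp):
    """Expand the int shorthand into an explicit list of booleans.

    Hypothetical helper for illustration only, not a simuvex API.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(is_fp, int) and not isinstance(is_fp, bool):
        # Shorthand: N means "N arguments, none of them floating-point".
        return [False] * is_fp
    return [bool(flag) for flag in is_fp]


all_int = normalize_is_fp(3)
mixed = normalize_is_fp([True, False])
```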

SimCC.setup_callsite(state, ret_addr, args, stack_base=None, alloc_base=None, grow_like_stack=True)

Okay. this one is serious.

Parameters:
  • state – The SimState to operate on
  • ret_addr – The address to return to when the called function finishes
  • args – The list of arguments that the called function will see
  • stack_base – An optional pointer to use as the top of the stack, circa the function entry point
  • alloc_base – An optional pointer to use as the place to put excess argument data
  • grow_like_stack – When allocating data at alloc_base, whether to allocate at decreasing addresses

The idea here is that you can provide almost any kind of python type in args and it’ll be translated to a binary format to be placed into simulated memory. Lists (representing arrays) must be entirely elements of the same type and size, while tuples (representing structs) can be elements of any type and size. If you’d like there to be a pointer to a given value, wrap the value in a PointerWrapper. Any value that can’t fit in a register will be automatically put in a PointerWrapper.

If stack_base is not provided, the current stack pointer will be used, and it will be updated. If alloc_base is not provided, the current stack pointer will be used, and it will be updated. You might not like the results if you provide stack_base but not alloc_base.

grow_like_stack controls the behavior of allocating data at alloc_base. When data from args needs to be wrapped in a pointer, the pointer needs to point somewhere, so that data is dumped into memory at alloc_base. If you set alloc_base to point to somewhere other than the stack, set grow_like_stack to False so that sequential allocations happen at increasing addresses.
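
The difference between the two allocation directions can be sketched with plain integers. This is a simplified model under stated assumptions, not the code setup_callsite runs: the function name allocate and the example addresses are hypothetical.

```python
def allocate(alloc_base, sizes, grow_like_stack=True):
    """Return the start address of each allocation made at alloc_base.

    Hypothetical helper for illustration only, not a simuvex API.
    """
    addrs = []
    cursor = alloc_base
    for size in sizes:
        if grow_like_stack:
            # Stack-like: move the cursor down first, then place the data there.
            cursor -= size
            addrs.append(cursor)
        else:
            # Heap-like: place the data at the cursor, then move the cursor up.
            addrs.append(cursor)
            cursor += size
    return addrs


stack_addrs = allocate(0x7FFF0000, [16, 8])
heap_addrs = allocate(0x10000, [16, 8], grow_like_stack=False)
```

With grow_like_stack left at True, each allocation sits below the previous one; with False, each allocation follows the previous one at a higher address.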

SimCC.get_return_val(state, is_fp=None, size=None, stack_base=None)

Get the return value out of the given state

SimCC.set_return_val(state, val, is_fp=None, size=None, stack_base=None)

Set the return value into the given state

class simuvex.s_cc.SimLyingRegArg(name)

A register that LIES about the types it holds

class simuvex.s_cc.SimCCUnknown(arch, args=None, ret_val=None, sp_delta=None, func_ty=None)

Represent an unknown calling convention.

Parameters:
  • arch – The Archinfo arch for this CC
  • args – A list of SimFunctionArguments describing where the arguments go
  • ret_val – A SimFunctionArgument describing where the return value goes
  • sp_delta – The amount the stack pointer changes over the course of this function - CURRENTLY UNUSED
  • func_ty – A SimType for the function itself

Engines

class simuvex.engines.engine.SimEngine(**kwargs)

A SimEngine is a class which understands how to perform execution on a state. This is a base class.

process(state, *args, **kwargs)

Perform execution with a state.

You should only override this method in a subclass in order to provide the correct method signature and docstring. You should override the _process method to do your actual execution.

Parameters:
  • state – The state with which to execute. This state will be copied before modification.
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
Returns:

A SimSuccessors object categorizing the execution’s successor states

check(state, *args, **kwargs)

Check if this engine can be used for execution on the current state. A callback check_failure is called upon failed checks. Note that the execution can still fail even if check() returns True.

You should only override this method in a subclass in order to provide the correct method signature and docstring. You should override the _check method to do your actual check.

Parameters:
  • state (simuvex.SimState) – The state with which to execute.
  • args – Positional arguments that will be passed to process().
  • kwargs – Keyword arguments that will be passed to process().
Returns:

True if the state can be handled by the current engine, False otherwise.

class simuvex.engines.successors.SimSuccessors(addr, initial_state)

This class serves as a categorization of all the kinds of result states that can come from a SimEngine run.

Variables:
  • addr (int) – The address at which execution is taking place, as a python int
  • initial_state – The initial state for which execution produced these successors
  • engine – The engine that produced these successors
  • sort – A string identifying the type of engine that produced these successors
  • processed (bool) – Whether or not the processing succeeded
  • description (str) – A textual description of the execution step

The successor states produced by this run are categorized into several lists:

Variables:
  • artifacts (dict) – Any analysis byproducts (for example, an IRSB) that were produced during execution
  • successors – The “normal” successors. IP may be symbolic, but must have a reasonable number of solutions
  • unsat_successors – Any successor which is unsatisfiable after its guard condition is added.
  • all_successors – successors + unsat_successors
  • flat_successors – The normal successors, but any symbolic IPs have been concretized. There is one state in this list for each possible value an IP may be concretized to for each successor state.
  • unconstrained_successors – Any state for which during the flattening process we find too many solutions.

A more detailed description of the successor lists may be found here: https://docs.angr.io/docs/simuvex.html
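
The relationship between the successor lists can be sketched with a toy model in which each candidate state is reduced to a name, a satisfiability flag, and the list of concrete values its IP could take. Everything here is an assumption made for illustration: the function categorize, the max_targets threshold, and the choice to keep unconstrained states out of the other lists are not taken from simuvex.

```python
def categorize(states, max_targets=4):
    """Sort toy states into successor lists.

    states is a list of (name, satisfiable, possible_ips) tuples.
    Hypothetical helper for illustration only, not a simuvex API.
    """
    result = {
        "successors": [],
        "unsat_successors": [],
        "flat_successors": [],
        "unconstrained_successors": [],
    }
    for name, satisfiable, possible_ips in states:
        if not satisfiable:
            # The guard condition made the state unsatisfiable.
            result["unsat_successors"].append(name)
        elif len(possible_ips) > max_targets:
            # Too many solutions for the IP to flatten.
            result["unconstrained_successors"].append(name)
        else:
            result["successors"].append(name)
            # One flat successor per concrete IP value of each normal successor.
            result["flat_successors"].extend((name, ip) for ip in possible_ips)
    result["all_successors"] = result["successors"] + result["unsat_successors"]
    return result


lists = categorize([
    ("a", True, [0x1000]),
    ("b", False, [0x2000]),
    ("c", True, [0x3000, 0x3004]),
    ("d", True, list(range(10))),
])
```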

add_successor(state, target, guard, jumpkind, add_guard=True, exit_stmt_idx=None, exit_ins_addr=None, source=None)

Add a successor state of the SimRun. This procedure stores method parameters into state.scratch, does some housekeeping, and calls out to helper functions to prepare the state and categorize it into the appropriate successor lists.

Parameters:
  • state (SimState) – The successor state.
  • target – The target (of the jump/call/ret).
  • guard – The guard expression.
  • jumpkind (str) – The jumpkind (call, ret, jump, or whatnot).
  • add_guard (bool) – Whether to add the guard constraint (default: True).
  • exit_stmt_idx (int) – The ID of the exit statement, an integer by default. ‘default’ stands for the default exit, and None means it’s not from a statement (for example, from a SimProcedure).
  • exit_ins_addr (int) – The instruction pointer of this exit, which is an integer by default.
  • source (int) – The source of the jump (i.e., the address of the basic block).
simuvex.engines.vex.size_bits(t)

Returns size, in BITS, of a type.

simuvex.engines.vex.size_bytes(t)

Returns size, in BYTES, of a type.

class simuvex.engines.vex.engine.SimEngineVEX(stop_points=None, use_cache=True, cache_size=10000, default_opt_level=1, support_selfmodifying_code=False, single_step=False)

Execution engine based on VEX, Valgrind’s IR.

process(state, irsb=None, skip_stmts=0, last_stmt=99999999, whitelist=None, inline=False, force_addr=None, insn_bytes=None, size=None, num_inst=None, traceflags=0, thumb=False, opt_level=None, **kwargs)
Parameters:
  • state – The state with which to execute
  • irsb – The PyVEX IRSB object to use for execution. If not provided one will be lifted.
  • skip_stmts – The number of statements to skip in processing
  • last_stmt – Do not execute any statements after this statement
  • whitelist – Only execute statements in this set
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
  • thumb – Whether the block should be lifted in ARM’s THUMB mode.
  • opt_level – The VEX optimization level to use.
  • insn_bytes – A string of bytes to use for the block instead of the project.
  • size – The maximum size of the block, in bytes.
  • num_inst – The maximum number of instructions.
  • traceflags – traceflags to be passed to VEX. (default: 0)
Returns:

A SimSuccessors object categorizing the block’s successors

lift(state=None, clemory=None, insn_bytes=None, arch=None, addr=None, size=None, num_inst=None, traceflags=0, thumb=False, opt_level=None)

Lift an IRSB.

There are many possible valid sets of parameters. You at the very least must pass some source of data, some source of an architecture, and some source of an address.

Sources of data in order of priority: insn_bytes, clemory, state

Sources of an address, in order of priority: addr, state

Sources of an architecture, in order of priority: arch, clemory, state

Parameters:
  • state – A state to use as a data source.
  • clemory – A cle.memory.Clemory object to use as a data source.
  • addr – The address at which to start the block.
  • thumb – Whether the block should be lifted in ARM’s THUMB mode.
  • opt_level – The VEX optimization level to use. The final IR optimization level is determined by the following, in order of priority: (1) the opt_level argument; (2) level 1, if OPTIMIZE_IR exists in the state options; (3) self._default_opt_level.
  • insn_bytes – A string of bytes to use as a data source.
  • size – The maximum size of the block, in bytes.
  • num_inst – The maximum number of instructions.
  • traceflags – traceflags to be passed to VEX. (default: 0)
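
The priority rules for lift() can be read as "take the first source that was actually provided". The sketch below models only that selection logic; the helper names first_available and resolve_lift_sources are hypothetical, and the real method goes on to read bytes and lift a block, which is not shown.

```python
def first_available(**sources):
    """Return the name of the first keyword argument that is not None."""
    # Keyword arguments keep their call order, so order encodes priority.
    for name, value in sources.items():
        if value is not None:
            return name
    raise ValueError("no source provided")


def resolve_lift_sources(state=None, clemory=None, insn_bytes=None,
                         arch=None, addr=None):
    """Report which argument would supply the data, address and architecture.

    Hypothetical helper for illustration only, not a simuvex API.
    """
    return {
        "data": first_available(insn_bytes=insn_bytes, clemory=clemory, state=state),
        "addr": first_available(addr=addr, state=state),
        "arch": first_available(arch=arch, clemory=clemory, state=state),
    }


# Placeholder values: any non-None object counts as "provided" in this model.
chosen = resolve_lift_sources(state="some state", insn_bytes=b"\x90", addr=0x1000)
```

Here the bytes come from insn_bytes and the address from addr, while the architecture falls back to the state because neither arch nor clemory was given. Omitting every source of an address raises ValueError in this model.
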
class simuvex.engines.procedure.SimEngineProcedure

An engine for running SimProcedures

process(state, procedure, ret_to=None, inline=None, force_addr=None, **kwargs)

Perform execution with a state.

Parameters:
  • state – The state with which to execute
  • procedure – An instance of a SimProcedure to run
  • ret_to – The address to return to when this procedure is finished
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
Returns:

A SimSuccessors object categorizing the execution’s successor states

class simuvex.engines.unicorn_engine.SimEngineUnicorn(base_stop_points=None)

Concrete execution in the Unicorn Engine, a fork of qemu.

process(state, step=None, extra_stop_points=None, inline=False, force_addr=None, **kwargs)
Parameters:
  • state – The state with which to execute
  • step – How many basic blocks we want to execute
  • extra_stop_points – A collection of addresses at which execution should halt
  • inline – This is an inline execution. Do not bother copying the state.
  • force_addr – Force execution to pretend that we’re working at this concrete address
Returns:

A SimSuccessors object categorizing the results of the run and whether it succeeded.

Plugins

class simuvex.plugins.cgc.SimStateCGC

This state plugin keeps track of CGC state.

get_max_sinkhole(length)

Find a sinkhole which is large enough to support length bytes.

This uses first-fit. The first sinkhole (ordered in descending order by their address) which can hold length bytes is chosen. If there are more than length bytes in the sinkhole, a new sinkhole is created representing the remaining bytes while the old sinkhole is removed.

add_sinkhole(address, length)

Add a sinkhole.

Allows the program to reuse the memory represented by the (address, length) pair.
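
The first-fit policy can be sketched with a set of (address, length) pairs. This is a toy model, not the plugin's code: the class name ToySinkholes is hypothetical, and the decision to hand out the high end of a sinkhole and keep the low end as the remainder is an assumption of this sketch.

```python
class ToySinkholes:
    """Toy first-fit allocator over reusable memory ranges (hypothetical)."""

    def __init__(self):
        self.sinkholes = set()  # {(address, length), ...}

    def add_sinkhole(self, address, length):
        self.sinkholes.add((address, length))

    def get_max_sinkhole(self, length):
        # First fit, scanning from the highest address downwards.
        for addr, size in sorted(self.sinkholes, reverse=True):
            if size >= length:
                # The old sinkhole is removed in every case.
                self.sinkholes.remove((addr, size))
                remaining = size - length
                if remaining:
                    # Assumption: the leftover bytes stay at the low end.
                    self.sinkholes.add((addr, remaining))
                return addr + remaining
        return None  # no sinkhole is large enough


pool = ToySinkholes()
pool.add_sinkhole(0x1000, 0x20)
pool.add_sinkhole(0x2000, 0x10)
got = pool.get_max_sinkhole(0x18)
too_big = pool.get_max_sinkhole(0x100)
```

The sinkhole at 0x2000 is examined first but is too small, so the request is served from the one at 0x1000, which shrinks to 8 bytes.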

class simuvex.plugins.gdb.GDB(omit_fp=False, adjust_stack=False)

Initialize or update a state from gdb dumps of the stack, heap, registers and data (or arbitrary) segments.

Parameters:
  • omit_fp – The frame pointer register is used for something else. (i.e. --omit_frame_pointer)
  • adjust_stack – Use different stack addresses than the gdb session (not recommended).
set_stack(stack_dump, stack_top)

Stack dump is a dump of the stack from gdb, i.e. the result of the following gdb command :

dump binary memory [stack_dump] [begin_addr] [end_addr]

We set the stack to the same addresses as the gdb session to avoid pointer corruption.

Parameters:
  • stack_dump – The dump file.
  • stack_top – The address of the top of the stack in the gdb session.
set_heap(heap_dump, heap_base)

Heap dump is a dump of the heap from gdb, i.e. the result of the following gdb command:

dump binary memory [heap_dump] [begin] [end]

Parameters:
  • heap_dump – The dump file.
  • heap_base – The start address of the heap in the gdb session.
set_data(addr, data_dump)

Update any data range (most likely use is the data segments of loaded objects)

set_regs(regs_dump)

Initialize register values within the state

Parameters:regs_dump – The output of info registers in gdb.
class simuvex.plugins.inspect.BP(when='before', enabled=None, condition=None, action=None, **kwargs)

A breakpoint.

check(state, when)

Checks the given state to see if the breakpoint should fire.

Parameters:
  • state – The state.
  • when – Whether the check is happening before or after the event.
Returns:

A boolean representing whether the breakpoint should fire.

fire(state)

Trigger the breakpoint.

Parameters:state – The state.
class simuvex.plugins.inspect.SimInspector

The breakpoint interface, used to instrument execution. For usage information, look here: https://docs.angr.io/docs/simuvex.html#breakpoints

action(event_type, when, **kwargs)

Called from within SimuVEX when events happen. This function checks all breakpoints registered for that event and fires the ones whose conditions match.
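
The check-then-fire flow can be sketched with two small classes. This is a toy model and not simuvex's implementation: ToyBP and ToyInspector are hypothetical names, the state is a plain dict, and the state is passed to action() explicitly here to keep the example self-contained.

```python
class ToyBP:
    """Toy breakpoint with a timing, an optional condition and an action."""

    def __init__(self, when="before", condition=None, action=None):
        self.when = when
        self.condition = condition
        self.action = action

    def check(self, state, when):
        # Fire only at the requested point and only if the condition holds.
        if when != self.when:
            return False
        return self.condition is None or bool(self.condition(state))

    def fire(self, state):
        if self.action is not None:
            self.action(state)


class ToyInspector:
    """Toy breakpoint registry keyed by event type (hypothetical)."""

    def __init__(self):
        self._breakpoints = {}

    def b(self, event_type, **kwargs):
        # Extra keyword arguments go straight to the breakpoint constructor.
        bp = ToyBP(**kwargs)
        self._breakpoints.setdefault(event_type, []).append(bp)
        return bp

    def action(self, state, event_type, when):
        for bp in self._breakpoints.get(event_type, []):
            if bp.check(state, when):
                bp.fire(state)


hits = []
inspector = ToyInspector()
inspector.b("mem_write", when="after",
            condition=lambda s: s["addr"] == 0x1000,
            action=lambda s: hits.append(s["addr"]))
inspector.action({"addr": 0x1000}, "mem_write", "after")   # fires
inspector.action({"addr": 0x2000}, "mem_write", "after")   # condition fails
inspector.action({"addr": 0x1000}, "mem_write", "before")  # wrong timing
```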

make_breakpoint(event_type, *args, **kwargs)

Creates and adds a breakpoint which would trigger on event_type. Additional arguments are passed to the BP constructor.

Returns:The created breakpoint, so that it can be removed later.
b(event_type, *args, **kwargs)

Creates and adds a breakpoint which would trigger on event_type. Additional arguments are passed to the BP constructor.

Returns:The created breakpoint, so that it can be removed later.
add_breakpoint(event_type, bp)

Adds a breakpoint which would trigger on event_type.

Parameters:
  • event_type – The event type to trigger on
  • bp – The breakpoint
Returns:

The created breakpoint.

remove_breakpoint(event_type, bp)

Removes a breakpoint.

Parameters:bp – The breakpoint to remove.
downsize()

Remove previously stored attributes from this plugin instance to save memory. This method is supposed to be called by breakpoint implementors. A typical workflow looks like the following :

>>> # Add `attr0` and `attr1` to `self.state.inspect`
>>> self.state.inspect(xxxxxx, attr0=yyyy, attr1=zzzz)
>>> # Get new attributes out of SimInspect in case they are modified by the user
>>> new_attr0 = self.state._inspect.attr0
>>> new_attr1 = self.state._inspect.attr1
>>> # Remove them from SimInspect
>>> self.state._inspect.downsize()
class simuvex.plugins.libc.SimStateLibc

This state plugin keeps track of various libc state.

class simuvex.plugins.posix.Stat(st_dev, st_ino, st_nlink, st_mode, st_uid, st_gid, st_rdev, st_size, st_blksize, st_blocks, st_atime, st_atimensec, st_mtime, st_mtimensec, st_ctime, st_ctimensec)

Create new instance of Stat(st_dev, st_ino, st_nlink, st_mode, st_uid, st_gid, st_rdev, st_size, st_blksize, st_blocks, st_atime, st_atimensec, st_mtime, st_mtimensec, st_ctime, st_ctimensec)

st_atime

Alias for field number 10

st_atimensec

Alias for field number 11

st_blksize

Alias for field number 8

st_blocks

Alias for field number 9

st_ctime

Alias for field number 14

st_ctimensec

Alias for field number 15

st_dev

Alias for field number 0

st_gid

Alias for field number 5

st_ino

Alias for field number 1

st_mode

Alias for field number 3

st_mtime

Alias for field number 12

st_mtimensec

Alias for field number 13

st_nlink

Alias for field number 2

st_rdev

Alias for field number 6

st_size

Alias for field number 7

st_uid

Alias for field number 4
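
The "Alias for field number N" descriptions match the docstrings that collections.namedtuple generates for its fields. Assuming Stat is such a namedtuple (an inference from that wording, not something this page states), the field numbering follows directly from the order in the constructor signature:

```python
from collections import namedtuple

# Field order copied from the Stat signature above.
Stat = namedtuple("Stat", (
    "st_dev", "st_ino", "st_nlink", "st_mode", "st_uid", "st_gid",
    "st_rdev", "st_size", "st_blksize", "st_blocks",
    "st_atime", "st_atimensec", "st_mtime", "st_mtimensec",
    "st_ctime", "st_ctimensec",
))

# Fill every field with its own index so name and position can be compared.
info = Stat(*range(16))
```

Each field can then be read either by name (info.st_mode) or by position (info[3]).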

class simuvex.plugins.solver.SimSolver(solver=None)

Symbolic solver.

reload_solver()

Reloads the solver. Useful when changing solver options.

BVS(name, size, min=None, max=None, stride=None, uninitialized=False, explicit_name=None, **kwargs)

Creates a bit-vector symbol (i.e., a variable). Other keyword parameters are passed directly on to the constructor of claripy.ast.BV.

Parameters:
  • name – The name of the symbol.
  • size – The size (in bits) of the bit-vector.
  • min – The minimum value of the symbol.
  • max – The maximum value of the symbol.
  • stride – The stride of the symbol.
  • uninitialized – Whether this value should be counted as an “uninitialized” value in the course of an analysis.
  • explicit_name – If False, an identifier is appended to the name to ensure uniqueness.
Returns:

A BV object representing this symbol.

eval_to_ast(*args, **kwargs)

Evaluate an expression, using the solver if necessary. Returns AST objects.

Parameters:
  • e – the expression
  • n – the number of desired solutions
  • extra_constraints – extra constraints to apply to the solver
  • exact – if False, returns approximate solutions
Returns:

a tuple of the solutions, in the form of claripy AST nodes

Return type:

tuple

eval(*args, **kwargs)

Evaluate an expression, using the solver if necessary. Returns primitives.

Parameters:
  • e – the expression
  • n – the number of desired solutions
  • extra_constraints – extra constraints to apply to the solver
  • exact – if False, returns approximate solutions
Returns:

a tuple of the solutions, in the form of Python primitives

Return type:

tuple

any_int(*args, **kwargs)

Evaluate an expression, using the solver if necessary. Returns an integer.

Parameters:
  • e – the expression
  • extra_constraints – extra constraints to apply to the solver
  • exact – if False, returns approximate solutions
Returns:

a single integer solution, in the form of a Python primitive

Return type:

int

any_str(e, **kwargs)

Evaluate an expression, using the solver if necessary. Returns a string.

Parameters:
  • e – the expression
  • extra_constraints – extra constraints to apply to the solver
  • exact – if False, returns approximate solutions
Returns:

a single string solution, in the form of a Python primitive

Return type:

string

Procedures

class simuvex.s_procedure.SimProcedure(addr, arch, symbolic_return=None, returns=None, is_syscall=None, num_args=None, display_name=None, convention=None, sim_kwargs=None, is_function=None, is_continuation=False, continuation_addr=None)

A SimProcedure is a wonderful object which describes a procedure to run on a state.

You may subclass SimProcedure and override run(), mutating self.state however you like, and then either return a value or jump away somehow.

A detailed discussion of programming SimProcedures may be found at https://docs.angr.io/docs/simprocedures.md

Parameters:arch – The architecture to use for this procedure

The following parameters are optional:

Parameters:
  • symbolic_return – Whether the procedure’s return value should be stubbed into a single symbolic variable constrained to the real return value
  • returns – Whether the procedure should return to its caller afterwards
  • is_syscall – Whether this procedure is a syscall
  • num_args – The number of arguments this procedure should extract
  • display_name – The name to use when displaying this procedure
  • convention – The SimCC to use for this procedure
  • sim_kwargs – Additional keyword arguments to be passed to run()
  • is_function – Whether this procedure emulates a function
execute(state, successors=None, arguments=None, ret_to=None)

Call this method with a SimState and a SimSuccessors to execute the procedure.

Alternately, successors may be None if this is an inline call. In that case, you should provide arguments to the function.

run(*args, **kwargs)

Implement the actual procedure here!

static_exits(blocks)

Get new exits by performing static analysis and heuristics. This is a fast and best-effort approach to get new exits for scenarios where states are not available (e.g. when building a fast CFG).

Parameters:blocks (list) – Blocks that are executed before reaching this SimProcedure.
Returns:A list of tuples. Each tuple is (address, jumpkind).
Return type:list
arg(i)

Returns the ith argument. Raises a SimProcedureArgumentError if we don’t have such an argument available.

Parameters:i (int) – The index of the argument to get
Returns:The argument
Return type:object
set_return_expr(expr)

Set this expression as the return value for the function. If this is not an inline call, this will write the expression to the state via the calling convention.

inline_call(procedure, *arguments, **sim_kwargs)

Call another SimProcedure in-line to retrieve its return value. Returns an instance of the procedure with the ret_expr property set.

Parameters:
  • procedure – The class of the procedure to execute
  • arguments – Any additional positional args will be used as arguments to the procedure call
  • sim_kwargs – Any additional keyword args will be passed as sim_kwargs to the procedure constructor
ret(expr=None)

Add an exit representing a return from this function. If this is not an inline call, this grabs a return address from the state and jumps to it, and also sets the return expression using the calling convention.

call(addr, args, continue_at, cc=None)

Add an exit representing calling another function via pointer.

Parameters:
  • addr – The address of the function to call
  • args – The list of arguments to call the function with
  • continue_at – Later, when the called function returns, execution of the current procedure will continue in the named method.
  • cc – Optional: use this calling convention for calling the new function. Default is to use the current convention.
jump(addr)

Add an exit representing jumping to an address.

exit(exit_code)

Add an exit representing terminating the program.

class simuvex.s_format.FormatString(parser, components)

Describes a format string.

Takes a list of components which are either just strings or a FormatSpecifier.

replace(startpos, args)

Produce a new string based on the format string self with arguments args, and return it. The result is possibly symbolic.

interpret(addr, startpos, args, region=None)

Interpret a format string, reading the data at addr in region into args starting at startpos.

class simuvex.s_format.FormatSpecifier(string, length_spec, size, signed)

Describes a format specifier within a format string.

class simuvex.s_format.FormatParser(addr, arch, symbolic_return=None, returns=None, is_syscall=None, num_args=None, display_name=None, convention=None, sim_kwargs=None, is_function=None, is_continuation=False, continuation_addr=None)

For SimProcedures relying on format strings.

Parameters:arch – The architecture to use for this procedure

The following parameters are optional:

Parameters:
  • symbolic_return – Whether the procedure’s return value should be stubbed into a single symbolic variable constrained to the real return value
  • returns – Whether the procedure should return to its caller afterwards
  • is_syscall – Whether this procedure is a syscall
  • num_args – The number of arguments this procedure should extract
  • display_name – The name to use when displaying this procedure
  • convention – The SimCC to use for this procedure
  • sim_kwargs – Additional keyword arguments to be passed to run()
  • is_function – Whether this procedure emulates a function

Storage

class simuvex.storage.file.SimFile(name, mode, pos=0, content=None, size=None, closed=None)

Represents a file.

variables()
Returns:the symbolic variable names associated with the file.
read(dst_addr, length)

Reads some data from the current (or provided) position of the file.

Parameters:
  • dst_addr – If specified, the data is written to that address.
  • length – The length of the read.
Returns:

The length of the read.

merge(others, merge_conditions, common_ancestor=None)

Merges the SimFile object with others.

class simuvex.storage.file.SimDialogue(name, mode=None, pos=0, content=None, size=None, dialogue_entries=None)

Emulates a dialogue with a program. Enables us to perform concrete short reads.

add_dialogue_entry(dialogue_len)

Add a new dialogue piece to the end of the dialogue.

read(dst_addr, length)

Reads some data from current dialogue entry, emulates short reads.
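
The idea of a short read can be sketched with a small concrete model. This is a toy under stated assumptions, not the SimDialogue implementation: ToyDialogue is a hypothetical name, and the behavior of truncating a read at the end of the current dialogue entry is assumed from the description above.

```python
class ToyDialogue:
    """Toy model: a byte stream split into dialogue entries of known length."""

    def __init__(self, content: bytes):
        self.content = content
        self.pos = 0
        self.entries = []  # remaining length of each pending dialogue entry

    def add_dialogue_entry(self, dialogue_len: int) -> None:
        # Append a new piece to the end of the dialogue.
        self.entries.append(dialogue_len)

    def read(self, length: int) -> bytes:
        # Assumption: a read never crosses the boundary of the current entry,
        # so the caller may receive fewer bytes than requested (a short read).
        if not self.entries:
            return b""
        n = min(length, self.entries[0])
        data = self.content[self.pos:self.pos + n]
        self.pos += len(data)
        self.entries[0] -= n
        if self.entries[0] == 0:
            self.entries.pop(0)
        return data


d = ToyDialogue(b"HELLOWORLD!")
d.add_dialogue_entry(5)
d.add_dialogue_entry(6)
first = d.read(8)    # truncated at the end of the first entry
second = d.read(8)   # truncated at the end of the second entry
third = d.read(8)    # nothing left
```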

class simuvex.storage.memory.AddressWrapper(region, region_base_addr, address, is_on_stack, function_address)

AddressWrapper is used in SimAbstractMemory. It provides extra meta information for an address (or a ValueSet object) that is normalized from an integer/BVV/StridedInterval.

Constructor for the class AddressWrapper.

Parameters:
  • region (str) – Name of the memory region it belongs to.
  • region_base_addr (int) – Base address of the memory region
  • address – An address (not a ValueSet object).
  • is_on_stack (bool) – Whether this address is on a stack region or not.
  • function_address (int) – Related function address (if any).
to_valueset(state)

Convert to a ValueSet instance

Parameters:state – A state
Returns:The converted ValueSet instance
class simuvex.storage.memory.RegionDescriptor(region_id, base_address, related_function_address=None)

Descriptor for a memory region ID.

class simuvex.storage.memory.RegionMap(is_stack)

Mostly used in SimAbstractMemory, RegionMap stores a series of mappings between concrete memory address ranges and memory regions, like stack frames and heap regions.

Constructor

Parameters:is_stack – Whether this is a region map for stack frames or not. Different strategies apply for stack regions.
map(absolute_address, region_id, related_function_address=None)

Add a mapping between an absolute address and a region ID. If this is a stack region map, all stack regions beyond (lower than) this newly added region will be discarded.

Parameters:
  • absolute_address – An absolute memory address.
  • region_id – ID of the memory region.
  • related_function_address – A related function address, mostly used for stack regions.
unmap_by_address(absolute_address)

Removes a mapping based on its absolute address.

Parameters:absolute_address – An absolute address
absolutize(region_id, relative_address)

Convert a relative address in some memory region to an absolute address.

Parameters:
  • region_id – The memory region ID
  • relative_address – The relative memory offset in that memory region
Returns:

The absolute address. An exception is raised if the region ID does not exist.

relativize(absolute_address, target_region_id=None)

Convert an absolute address to the memory offset in a memory region.

Note that if an address that belongs to a heap region is passed to a stack region map, it will be converted to an offset in the closest stack frame, and vice versa for passing a stack address to a heap region map. Therefore you should only pass in addresses that belong to the same category (stack or non-stack) as this region map.

Parameters:absolute_address – An absolute memory address
Returns:A tuple of the closest region ID, the relative offset, and the related function address.
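
The mapping between absolute addresses and (region, offset) pairs can be sketched with a sorted list of base addresses. This is a toy model, not the RegionMap implementation. ToyRegionMap is a hypothetical name, and the lookup direction is an assumption: the closest base at or below the address for non-stack maps, and the closest base at or above it for stack maps, since stacks grow downwards.

```python
import bisect


class ToyRegionMap:
    def __init__(self, is_stack):
        self.is_stack = is_stack
        self._bases = []     # sorted base addresses
        self._by_base = {}   # base -> (region_id, related_function_address)
        self._by_id = {}     # region_id -> base

    def _remove(self, base):
        region_id, _ = self._by_base.pop(base)
        self._by_id.pop(region_id, None)
        self._bases.remove(base)

    def map(self, absolute_address, region_id, related_function_address=None):
        if self.is_stack:
            # Stack frames below the new frame are considered dead.
            for base in [b for b in self._bases if b < absolute_address]:
                self._remove(base)
        if absolute_address in self._by_base:
            self._remove(absolute_address)
        bisect.insort(self._bases, absolute_address)
        self._by_base[absolute_address] = (region_id, related_function_address)
        self._by_id[region_id] = absolute_address

    def unmap_by_address(self, absolute_address):
        self._remove(absolute_address)

    def absolutize(self, region_id, relative_address):
        if region_id not in self._by_id:
            raise KeyError("unknown region: %r" % (region_id,))
        return self._by_id[region_id] + relative_address

    def relativize(self, absolute_address):
        if self.is_stack:
            # Assumed: first base at or above the address (negative offsets).
            i = bisect.bisect_left(self._bases, absolute_address)
            if i == len(self._bases):
                raise LookupError("no enclosing stack region")
        else:
            # Assumed: last base at or below the address.
            i = bisect.bisect_right(self._bases, absolute_address) - 1
            if i < 0:
                raise LookupError("no enclosing region")
        base = self._bases[i]
        region_id, func = self._by_base[base]
        return region_id, absolute_address - base, func


heap = ToyRegionMap(is_stack=False)
heap.map(0x1000, "heap_a")
heap.map(0x2000, "heap_b")
heap_hit = heap.relativize(0x2010)

stack = ToyRegionMap(is_stack=True)
stack.map(0x7FFF0000, "stack_main", 0x400000)
stack.map(0x7FFE0000, "stack_f", 0x400100)
stack_hit = stack.relativize(0x7FFDFFF8)
```

Mixing categories gives misleading answers in this toy as well: a heap address passed to the stack map is simply attributed to whichever frame base is nearest above it.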
class simuvex.storage.memory.MemoryStoreRequest(addr, data=None, size=None, condition=None, endness=None)

A MemoryStoreRequest is used internally by SimMemory to track memory request data.

class simuvex.storage.memory.SimMemory(endness=None, abstract_backer=None, stack_region_map=None, generic_region_map=None)

Represents the memory space of the process.

category

Return the category of this SimMemory instance. It can be one of the three following categories – reg, mem, or file.

set_state(state)

Call the set_state method in SimStatePlugin class, and then perform the delayed initialization.

Parameters:state – The SimState instance
set_stack_address_mapping(absolute_address, region_id, related_function_address=None)

Create a new mapping between an absolute address (which is the base address of a specific stack frame) and a region ID.

Parameters:
  • absolute_address – The absolute memory address.
  • region_id – The region ID.
  • related_function_address – Related function address.
unset_stack_address_mapping(absolute_address)

Remove a stack mapping.

Parameters:absolute_address – An absolute memory address, which is the base address of the stack frame to destroy.
stack_id(function_address)

Return a memory region ID for a function. If the default region ID already exists in the region mapping, an integer will be appended to the region name. In this way we can handle recursive function calls, or a function that appears more than once in the call frame.

This also means that stack_id() should only be called when creating a new stack frame for a function. You are not supposed to call this function every time you want to map a function address to a stack ID.

Parameters:function_address (int) – Address of the function.
Returns:ID of the new memory region.
Return type:str
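
The suffixing behavior can be sketched as follows. This is a toy, not the simuvex code: toy_stack_id is a hypothetical helper, and both the "stack_0x..." naming scheme and the suffix starting at 1 are assumptions made for illustration.

```python
def toy_stack_id(existing_ids, function_address):
    """Return a region ID for a new stack frame of the given function.

    existing_ids is the set of region IDs that are already mapped.
    """
    base = "stack_0x%x" % function_address  # assumed default naming scheme
    if base not in existing_ids:
        return base
    # The default ID is taken (e.g. recursion), so append the first free integer.
    n = 1
    while "%s_%d" % (base, n) in existing_ids:
        n += 1
    return "%s_%d" % (base, n)


ids = set()
outer = toy_stack_id(ids, 0x400100)
ids.add(outer)
inner = toy_stack_id(ids, 0x400100)  # recursive call to the same function
ids.add(inner)
```

Because every call can mint a fresh ID, looking up the ID of an existing frame this way would return the wrong region, which is why the function is reserved for frame creation.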
store(addr, data, size=None, condition=None, add_constraints=None, endness=None, action=None, inspect=True, priv=None, disable_actions=False)

Stores content into memory.

Parameters:
  • addr – A claripy expression representing the address to store at.
  • data – The data to store (claripy expression or something convertible to a claripy expression).
  • size – A claripy expression representing the size of the data to store.

The following parameters are optional.

Parameters:
  • condition – A claripy expression representing a condition if the store is conditional.
  • add_constraints – Add constraints resulting from the merge (default: True).
  • endness – The endianness for the data.
  • action – A SimActionData to fill out with the final written value and constraints.
  • inspect (bool) – Whether this store should trigger SimInspect breakpoints or not.
  • disable_actions (bool) – Whether this store should avoid creating SimActions or not. When set to False, state options are respected.
store_cases(addr, contents, conditions, fallback=None, add_constraints=None, endness=None, action=None)

Stores content into memory, conditional by case.

Parameters:
  • addr – A claripy expression representing the address to store at.
  • contents – A list of bitvectors, not necessarily of the same size. Use None to denote an empty write.
  • conditions – A list of conditions. Must be equal in length to contents.

The following parameters are optional.

Parameters:
  • fallback – A claripy expression representing what the write should resolve to if all conditions evaluate to false (default: whatever was there before).
  • add_constraints – Add constraints resulting from the merge (default: True)
  • endness – The endianness for contents as well as fallback.
  • action (simuvex.s_action.SimActionData) – A SimActionData to fill out with the final written value and constraints.
load(addr, size=None, condition=None, fallback=None, add_constraints=None, action=None, endness=None, inspect=True, disable_actions=False)

Loads size bytes from addr.

Parameters:
  • addr – The address to load from.
  • size – The size (in bytes) of the load.
  • condition – A claripy expression representing a condition for a conditional load.
  • fallback – A fallback value if the condition ends up being False.
  • add_constraints – Add constraints resulting from the merge (default: True).
  • action – A SimActionData to fill out with the constraints.
  • endness – The endness to load with.
  • inspect (bool) – Whether this load should trigger SimInspect breakpoints or not.
  • disable_actions (bool) – Whether this load should avoid creating SimActions or not. When set to False, state options are respected.

There are a few possible return values. If no condition or fallback are passed in, then the return is the bytes at the address, in the form of a claripy expression. For example:

<A BVV(0x41, 32)>

On the other hand, if a condition and fallback are provided, the value is conditional:

<A If(condition, BVV(0x41, 32), fallback)>
normalize_address(addr, is_write=False)

Normalize addr for use in static analysis (with the abstract memory model). In non-abstract mode, simply returns the address in a single-element list.

find(addr, what, max_search=None, max_symbolic_bytes=None, default=None, step=1)

Returns the address of bytes equal to ‘what’, starting from ‘addr’. Note that, if you don’t specify a default value, this search could cause the state to go unsat if no possible matching byte exists.

Parameters:
  • addr – The start address.
  • what – What to search for.
  • max_search – Search at most this many bytes.
  • max_symbolic_bytes – Search through at most this many symbolic bytes.
  • default – The default value, if what you’re looking for wasn’t found.
Returns:

An expression representing the address of the matching byte.
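
A concrete analogue of this search is sketched below. It is a toy over a Python bytes object, not the symbolic implementation, and toy_find is a hypothetical name. In the symbolic setting the result is an expression constrained to the candidate addresses, which is why having no candidate and no default can leave the state unsatisfiable; the toy simply returns the default.

```python
def toy_find(memory: bytes, addr: int, what: bytes, max_search=None, default=None, step=1):
    """Return the first position at or after addr where `what` occurs.

    At most max_search start positions are examined. If nothing matches,
    the default is returned.
    """
    limit = len(memory)
    if max_search is not None:
        limit = min(limit, addr + max_search)
    for pos in range(addr, limit, step):
        if memory[pos:pos + len(what)] == what:
            return pos
    return default


mem = b"abc\x00def\x00"
first_nul = toy_find(mem, 0, b"\x00")
second_nul = toy_find(mem, 4, b"\x00")
# Only offsets 0, 1 and 2 are examined, so the terminator is not reached.
bounded = toy_find(mem, 0, b"\x00", max_search=3, default=-1)
```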

copy_contents(dst, src, size, condition=None, src_memory=None, dst_memory=None)

Copies data within a memory.

Parameters:
  • dst – A claripy expression representing the address of the destination
  • src – A claripy expression representing the address of the source

The following parameters are optional.

Parameters:
  • src_memory – Copy data from this SimMemory instead of self
  • dst_memory – Copy data to this SimMemory instead of self
  • size – A claripy expression representing the size of the copy
  • condition – A claripy expression representing a condition, if the write should be conditional. If this is determined to be false, the size of the copy will be 0.
class simuvex.plugins.abstract_memory.SimAbstractMemory(memory_backer=None, memory_id='mem', endness=None, stack_region_map=None, generic_region_map=None)

This is an implementation of the abstract store in paper [TODO].

Some differences:

  • For stack variables, we map the absolute stack address to each region so that we can effectively trace stack accesses. When tracing into a new function, you should call set_stack_address_mapping() to create a new mapping. When exiting from a function, you should cancel the previous mapping by calling unset_stack_address_mapping(). Currently this is only used for stack!
set_state(state)

Overriding the SimStatePlugin.set_state() method

Parameters:state – A SimState object
Returns:None
normalize_address(addr, is_write=False, convert_to_valueset=False, target_region=None)

Convert a ValueSet object into a list of addresses.

Parameters:
  • addr – A ValueSet object (which describes an address)
  • is_write – Is this address used in a write or not
  • convert_to_valueset – True if you want to have a list of ValueSet instances instead of AddressWrappers, False otherwise
  • target_region – Which region to normalize the address to. To leave the decision to SimuVEX, set it to None
Returns:

A list of AddressWrapper or ValueSet objects

get_segments(addr, size)

Get a segmented memory region based on AbstractLocation information available from VSA.

Here are some assumptions to make this method fast:
  • The entire memory region [addr, addr + size] is located within the same MemoryRegion
  • The address ‘addr’ has only one concrete value. It cannot be concretized to multiple values.
Parameters:
  • addr – An address
  • size – Size of the memory area in bytes
Returns:

An ordered list of the sizes of each segment in the requested memory region

copy()

Make a copy of this SimAbstractMemory object.

merge(others, merge_conditions, common_ancestor=None)

Merge this SimAbstractMemory with other SimAbstractMemory instances

dbg_print()

Print out debugging information

class simuvex.storage.memory_object.SimMemoryObject(object, base, length=None)

A SimMemoryObject instance is a reference to a byte or several bytes in a specific object in SimSymbolicMemory. It is only used inside the SimSymbolicMemory class.

class simuvex.storage.paged_memory.Page(page_size, permissions=None, executable=False, storage=None, sinkhole=None)

Page object, allowing for more flexibility than just a raw dict.

Create a new page object. Carries permissions information. Permissions default to RW unless executable is True in which case permissions default to RWX.

Parameters:executable – Whether the page is executable, typically this will depend on whether the binary has an executable stack.
class simuvex.storage.paged_memory.SimPagedMemory(memory_backer=None, permissions_backer=None, pages=None, initialized=None, name_mapping=None, hash_mapping=None, page_size=None, symbolic_addrs=None, check_permissions=False)

Represents paged memory.

load_bytes(addr, num_bytes, ret_on_segv=False)

Load bytes from paged memory.

Parameters:
  • addr – Address to start loading.
  • num_bytes – Number of bytes to load.
  • ret_on_segv (bool) – True if you want load_bytes to return directly when a SIGSEGV is triggered, otherwise a SimSegfaultError will be raised.
Returns:

A 3-tuple of (a dict of pages loaded, a list of indices of missing pages, number of bytes scanned in all).

Return type:

tuple

contains_no_backer(addr)

Tests if the address is contained in any page of paged memory, without considering memory backers.

Parameters:addr (int) – The address to test.
Returns:True if the address is included in one of the pages, False otherwise.
Return type:bool
store_memory_object(mo, overwrite=True)

This function optimizes a large store by storing a single reference to the SimMemoryObject instead of one for each byte.

Parameters:mo – The memory object to store
replace_memory_object(old, new_content)

Replaces the memory object old with a new memory object containing new_content.

Parameters:
  • old – A SimMemoryObject (i.e., one from memory_objects_for_hash() or memory_objects_for_name()).
  • new_content – The content (claripy expression) for the new memory object.
Returns:

the new memory object

replace_all(old, new)

Replaces all instances of expression old with expression new.

Parameters:
  • old – A claripy expression. Must contain at least one named variable (to make it possible to use the name index for speedup).
  • new – The new variable to replace it with.
addrs_for_name(n)

Returns addresses that contain expressions that contain a variable named n.
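
The lookup by variable name relies on an index that is maintained as memory is written. A minimal sketch of such an index follows. It is a toy, not the SimPagedMemory data structure: ToyNameIndex is a hypothetical name, and each stored expression is modeled only by the set of variable names it contains.

```python
from collections import defaultdict


class ToyNameIndex:
    def __init__(self):
        self._contents = {}                    # addr -> frozenset of variable names
        self._name_mapping = defaultdict(set)  # variable name -> set of addrs

    def store(self, addr, variable_names):
        # Remove index entries for whatever was stored here before.
        for name in self._contents.get(addr, ()):
            self._name_mapping[name].discard(addr)
        self._contents[addr] = frozenset(variable_names)
        for name in variable_names:
            self._name_mapping[name].add(addr)

    def addrs_for_name(self, n):
        # Return a copy so callers cannot corrupt the index.
        return set(self._name_mapping.get(n, ()))


idx = ToyNameIndex()
idx.store(0x10, {"x"})
idx.store(0x11, {"x", "y"})
idx.store(0x10, {"z"})  # overwrites the expression that contained x
```

Keeping the index up to date on every store is what lets a replacement operation visit only the addresses that mention a variable instead of scanning all of memory.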

addrs_for_hash(h)

Returns addresses that contain expressions that contain a variable with the hash of h.

memory_objects_for_name(n)

Returns a set of SimMemoryObjects that contain expressions that contain a variable with the name of n.

This is useful for replacing those values in one fell swoop with replace_memory_object(), even if they have been partially overwritten.

memory_objects_for_hash(n)

Returns a set of SimMemoryObjects that contain expressions that contain a variable with the hash n.

permissions(addr, permissions=None)

Returns the permissions for a page at address addr.

If the optional argument permissions is given, set the page permissions to that value before returning them.
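
Per-page permissions with a combined get/set accessor can be sketched as below. This is a toy, not the SimPagedMemory implementation: ToyPagedMemory and map_page are hypothetical names, and the bit encoding (1 = read, 2 = write, 4 = execute) is an assumption borrowed from the usual POSIX protection flags.

```python
class ToyPagedMemory:
    def __init__(self, page_size=0x1000):
        self.page_size = page_size
        self._pages = {}  # page number -> permission bits

    def map_page(self, addr, executable=False):
        # Mirror the documented defaults: RW, or RWX for executable pages.
        self._pages[addr // self.page_size] = 7 if executable else 3

    def contains_no_backer(self, addr):
        return addr // self.page_size in self._pages

    def permissions(self, addr, permissions=None):
        page_num = addr // self.page_size
        if page_num not in self._pages:
            raise KeyError("page at 0x%x is not mapped" % addr)
        if permissions is not None:
            # Setting happens first, so the new value is what gets returned.
            self._pages[page_num] = permissions
        return self._pages[page_num]


mem = ToyPagedMemory()
mem.map_page(0x400000, executable=True)
mem.map_page(0x601000)
code_perms = mem.permissions(0x400123)
data_perms = mem.permissions(0x601048)
readonly = mem.permissions(0x601048, 1)  # make the page read-only
```

Permissions apply to the whole page, so any address inside the page observes the change.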

class simuvex.concretization_strategies.SimConcretizationStrategy(filter=None, exact=True)

Concretization strategies control the resolution of symbolic memory indices in SimuVEX. By subclassing this class and setting it as a concretization strategy (on state.memory.read_strategies and state.memory.write_strategies), SimuVEX’s memory index concretization behavior can be modified.

Initializes the base SimConcretizationStrategy.

Parameters:
  • filter – A function, taking arguments of (SimMemory, claripy.AST), that determines if this strategy can handle resolving the provided AST.
  • exact – A flag (default: True) that determines if the convenience resolution functions provided by this class use exact or approximate resolution.
concretize(memory, addr)

Concretizes the address into a list of values. If this strategy cannot handle this address, returns None.

copy()

Returns a copy of the strategy, if there is data that should be kept separate between states. If not, returns self.

merge(others)

Merges this strategy with others, if there is data that should be kept separate between states. If not, this is a no-op.
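
The contract that concretize() returns None when a strategy cannot handle an address suggests a fallback chain. The sketch below is a toy, not the simuvex classes. All names are hypothetical, a symbolic address is modeled as a plain set of candidate integers, and trying strategies in list order until one answers is an assumption about how the strategy lists are consumed.

```python
class ToyStrategy:
    def __init__(self, filter=None):
        self._filter = filter

    def _concretize(self, memory, addr):
        raise NotImplementedError

    def concretize(self, memory, addr):
        # The filter lets a strategy decline addresses it should not handle.
        if self._filter is not None and not self._filter(memory, addr):
            return None
        return self._concretize(memory, addr)


class ToySingle(ToyStrategy):
    """Handles only addresses with exactly one possible value."""

    def _concretize(self, memory, addr):
        return list(addr) if len(addr) == 1 else None


class ToyMax(ToyStrategy):
    """Resolves any address to its largest possible value."""

    def _concretize(self, memory, addr):
        return [max(addr)]


def toy_resolve(strategies, memory, addr):
    for strategy in strategies:
        result = strategy.concretize(memory, addr)
        if result is not None:
            return result
    raise ValueError("no strategy could concretize the address")


chain = [ToySingle(), ToyMax()]
unique = toy_resolve(chain, None, {0x10})
ambiguous = toy_resolve(chain, None, {1, 5, 3})
declined = ToyMax(filter=lambda m, a: max(a) < 0x100).concretize(None, {0x200})
```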

class simuvex.plugins.view.SimMemView(ty=None, addr=None, state=None)

This is a convenient interface with which you can access a program’s memory.

The interface works like this:

  • You first use [array index notation] to specify the address you’d like to load from
  • If there is a pointer at that address, you may access the deref property to return a SimMemView at the address present in memory.
  • You then specify a type for the data by simply accessing a property of that name. For a list of supported types, look at state.mem.types.
  • You can then refine the type. Any type may support any refinement it likes. Right now the only refinements supported are that you may access any member of a struct by its member name, and you may index into a string or array to access that element.
  • If the address you specified initially points to an array of that type, you can say .array(n) to view the data as an array of n elements.
  • Finally, extract the structured data with .resolved or .concrete. .resolved will return bitvector values, while .concrete will return integer, string, array, etc values, whatever best represents the data.
  • Alternately, you may store a value to memory, by assigning to the chain of properties that you’ve constructed. Note that because of the way python works, x = s.mem[...].prop; x = val will NOT work, you must say s.mem[...].prop = val.

For example:

>>> s.mem[0x601048].long
<long (64 bits) <BV64 0x4008d0> at 0x601048>
>>> s.mem[0x601048].long.resolved
<BV64 0x4008d0>
>>> s.mem[0x601048].deref
<<untyped> <unresolvable> at 0x4008d0>
>>> s.mem[0x601048].deref.string.concrete
'SOSNEAKY'
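
The note about assignment follows from how Python handles attribute assignment versus name rebinding. The sketch below reproduces the effect with a toy view object. ToyMem and ToyView are hypothetical stand-ins, not SimMemView, and the backing store is just a dictionary.

```python
class ToyView:
    def __init__(self, store, addr, ty=None):
        # Bypass our own __setattr__ while setting up internal fields.
        object.__setattr__(self, "_store", store)
        object.__setattr__(self, "_addr", addr)
        object.__setattr__(self, "_ty", ty)

    def __getattr__(self, name):
        # Only called when normal lookup fails: treat the name as a type.
        if name.startswith("_"):
            raise AttributeError(name)
        return ToyView(self._store, self._addr, name)

    def __setattr__(self, name, value):
        # Assigning to a type property writes through to the backing store.
        self._store[(self._addr, name)] = value


class ToyMem:
    def __init__(self):
        self.store = {}

    def __getitem__(self, addr):
        return ToyView(self.store, addr)


mem = ToyMem()
mem[0x601048].long = 0x4008D0  # attribute assignment: reaches __setattr__

x = mem[0x601048].int          # x is bound to a view object
x = 5                          # rebinds the local name only; nothing is stored
```

After these statements the store holds a single entry for the long at 0x601048, because the second assignment never reached the view.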

Useful for Analysis, I Guess?

class simuvex.s_slicer.SimSlicer(arch, statements, target_tmps=None, target_regs=None, target_stack_offsets=None, inslice_callback=None, inslice_callback_infodict=None)

A super lightweight intra-IRSB slicing class.

class simuvex.s_type.SimType(label=None)

SimType exists to track type information for SimProcedures.

Parameters:label – the type label.
class simuvex.s_type.SimTypeBottom(label=None)

SimTypeBottom basically represents a type error.

Parameters:label – the type label.
class simuvex.s_type.SimTypeTop(size=None, label=None)

SimTypeTop represents any type (mostly used with a pointer for void*).

class simuvex.s_type.SimTypeReg(size, label=None)

SimTypeReg is the base type for all types that are register-sized.

Parameters:
  • label – the type label.
  • size – the size of the type (e.g. 32bit, 8bit, etc.).
class simuvex.s_type.SimTypeNum(size, signed=True, label=None)

SimTypeNum is a numeric type of arbitrary length

Parameters:
  • size – The size of the integer, in bytes
  • signed – Whether the integer is signed or not
  • label – A label for the type
class simuvex.s_type.SimTypeInt(signed=True, label=None)

SimTypeInt is a type that specifies a signed or unsigned C integer.

Parameters:
  • signed – True if signed, False if unsigned
  • label – The type label
class simuvex.s_type.SimTypeChar(label=None)

SimTypeChar is a type that specifies a character; this could be represented by an 8-bit int, but this is meant to be interpreted as a character.

Parameters:label – the type label.
class simuvex.s_type.SimTypeFd(label=None)

SimTypeFd is a type that specifies a file descriptor.

Parameters:label – the type label
class simuvex.s_type.SimTypePointer(pts_to, label=None, offset=0)

SimTypePointer is a type that specifies a pointer to some other type.

Parameters:
  • label – The type label.
  • pts_to – The type that this pointer points to.
class simuvex.s_type.SimTypeFixedSizeArray(elem_type, length)

SimTypeFixedSizeArray is a literal (i.e. not a pointer) fixed-size array.

class simuvex.s_type.SimTypeArray(elem_type, length=None, label=None)

SimTypeArray is a type that specifies a pointer to an array; while it is a pointer, it has a semantic difference.

Parameters:
  • label – The type label.
  • elem_type – The type of each element in the array.
  • length – An expression of the length of the array, if known.
class simuvex.s_type.SimTypeString(length=None, label=None)

SimTypeString is a type that represents a C-style string, i.e. a NUL-terminated array of bytes.

Parameters:
  • label – The type label.
  • length – An expression of the length of the string, if known.
class simuvex.s_type.SimTypeFunction(args, returnty, label=None)

SimTypeFunction is a type that specifies an actual function (i.e. not a pointer) with certain types of arguments and a certain return value.

Parameters:
  • label – The type label
  • args – A tuple of types representing the arguments to the function
  • returnty – The return type of the function, or None for void
class simuvex.s_type.SimTypeLength(signed=False, addr=None, length=None, label=None)

SimTypeLength is a type that specifies the length of some buffer in memory.

...I’m not really sure what the original design of this class was going for

Parameters:
  • signed – Whether the value is signed or not
  • label – The type label.
  • addr – The memory address (expression).
  • length – The length (expression).
class simuvex.s_type.SimTypeFloat(size=32)

An IEEE754 single-precision floating point number

class simuvex.s_type.SimTypeDouble

An IEEE754 double-precision floating point number

class simuvex.s_type.SimStructValue(struct, values=None)

A SimStruct type paired with some real values

Parameters:
  • struct – A SimStruct instance describing the type of this struct
  • values – A mapping from struct fields to values
class simuvex.s_type.SimUnion(members, label=None)

why

Parameters:members – The members of the union, as a mapping name -> type
simuvex.s_type.define_struct(defn)

Register a struct definition globally

>>> define_struct('struct abcd {int x; int y;}')
simuvex.s_type.register_types(mapping)

Pass in a mapping from name to SimType and they will be registered to the global type store

>>> register_types(parse_types("typedef int x; typedef float y;"))
simuvex.s_type.do_preprocess(defn)

Run a string through the C preprocessor installed on your system

simuvex.s_type.parse_defns(defn, preprocess=True)

Parse a series of C definitions, returns a mapping from variable name to variable type object

simuvex.s_type.parse_types(defn, preprocess=True)

Parse a series of C definitions, returns a mapping from type name to type object

simuvex.s_type.parse_file(defn, preprocess=True)

Parse a series of C definitions, returns a tuple of two type mappings, one for variable definitions and one for type definitions.

simuvex.s_type.parse_type(defn, preprocess=True)

Parse a simple type expression into a SimType

>>> parse_type('int *')
class simuvex.s_variable.SimVariableSet

A collection of SimVariables.

complement(other)

Calculate the complement of self and other.

Parameters:other – Another SimVariableSet instance.
Returns:The complement result.

Logging Data

class simuvex.s_action.SimAction(state, region_type)

A SimAction represents a semantic action that an analyzed program performs.

Initializes the SimAction.

Parameters:state – the state that the SimAction is taking place in.
downsize()

Clears some low-level details (that take up memory) out of the SimAction.

class simuvex.s_action.SimActionExit(state, target, condition=None, exit_type=None)

An Exit action represents a (possibly conditional) jump.

class simuvex.s_action.SimActionConstraint(state, constraint, condition=None)

A constraint action represents an extra constraint added during execution of a path.

class simuvex.s_action.SimActionOperation(state, op, exprs)

An action representing an operation between variables and/or constants.

class simuvex.s_action.SimActionData(state, region_type, action, tmp=None, addr=None, size=None, data=None, condition=None, fallback=None, fd=None)

A Data action represents a read or a write from memory, registers or a file.

class simuvex.s_action_object.SimActionObject(ast, reg_deps=None, tmp_deps=None)

A SimActionObject tracks an AST and its dependencies.