pyvex — Binary Translator

PyVEX provides an interface that translates binary code into the VEX intermediate representation (IR). For an introduction to VEX, see https://docs.angr.io/docs/ir.html

Translation Interface

class pyvex.block.IRSB(data, mem_addr, arch, max_inst=None, max_bytes=None, bytes_offset=0, traceflags=0, opt_level=1, num_inst=None, num_bytes=None)

The IRSB is the primary interface to pyvex. Constructing one of these will make a call into LibVEX to perform a translation.

IRSB stands for Intermediate Representation Super-Block. An IRSB in VEX is a single-entry, multiple-exit code block.

Variables:
  • arch (archinfo.Arch) – The architecture this block is lifted under
  • statements (list of IRStmt) – The statements in this block
  • next (IRExpr) – The expression for the default exit target of this block
  • offsIP (int) – The offset of the instruction pointer in the VEX guest state
  • stmts_used (int) – The number of statements in this IRSB
  • jumpkind (str) – The type of this block’s default jump (call, boring, syscall, etc) as a VEX enum string
  • direct_next (bool) – Whether this block ends with a direct (not indirect) jump or branch
  • size (int) – The size of this block in bytes
  • addr (int) – The address of this basic block, i.e. the address in the first IMark
Parameters:
  • data (str or bytes or cffi.FFI.CData or None) – The bytes to lift. Can be either a string of bytes or a cffi buffer object. You may also pass None to initialize an empty IRSB.
  • mem_addr (int) – The address to lift the data at.
  • arch (archinfo.Arch) – The architecture to lift the data as.
  • max_inst – The maximum number of instructions to lift. Max 99. (See note below)
  • max_bytes – The maximum number of bytes to use. Max 5000.
  • bytes_offset – The offset into data to start lifting at.
  • traceflags – The libVEX traceflags, controlling VEX debug prints.
  • opt_level – The level of optimization to apply to the IR, 0-2.

Note

Explicitly specifying the number of instructions to lift (max_inst) may not always work exactly as expected. For example, on MIPS, it is meaningless to lift a branch or jump instruction without its delay slot. VEX attempts to Do The Right Thing by possibly decoding fewer instructions than requested. Specifically, this means that lifting a branch or jump on MIPS as a single instruction (max_inst=1) will result in an empty IRSB, and subsequent attempts to run this block will raise SimIRSBError('Empty IRSB passed to SimIRSB.').
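As a rough illustration of the constructor and attributes documented above, the sketch below lifts three bytes of AMD64 code (two nop instructions followed by ret) and reads back a few attributes. The helper name summarize_block, the byte string, and the address are illustrative choices, not part of pyvex. The function returns None when pyvex or archinfo is not installed.

```python
def summarize_block(code=b"\x90\x90\xc3", addr=0x400000):
    """Lift `code` as AMD64 and return a few IRSB attributes.

    Returns None when pyvex or archinfo cannot be imported.
    """
    try:
        import archinfo
        import pyvex
    except ImportError:
        return None  # nothing to demonstrate without the libraries

    # Constructing the IRSB performs the translation.
    irsb = pyvex.IRSB(code, addr, archinfo.ArchAMD64())
    return {
        "instructions": irsb.instructions,  # number of lifted instructions
        "size": irsb.size,                  # block size in bytes
        "jumpkind": irsb.jumpkind,          # VEX enum string of the default exit
    }


summary = summarize_block()
print(summary)
```

Calling pp() on the IRSB inside the helper would print the lifted statements, which is usually the quickest way to inspect a block.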

pp()

Pretty-print the IRSB to stdout.

expressions

A list of all expressions contained in the IRSB.

instructions

The number of instructions in this block.

size

The size of this block, in bytes.

operations

A list of all operations done by the IRSB, as libVEX enum names.

all_constants

Returns all constants in the block (including incrementing of the program counter) as pyvex.const.IRConst.

constants

The constants (excluding updates of the program counter) in the IRSB as pyvex.const.IRConst.

constant_jump_targets

A set of the static jump targets of the basic block.

constant_jump_targets_and_jumpkinds

A dict of the static jump targets of the basic block to their jumpkind.
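One way to picture this mapping is sketched below in plain Python. It assumes the dict is assembled from the constant destinations of the block's conditional exits plus the default exit when that target is a constant. The function name and the tuple representation of exits are hypothetical and do not reflect pyvex internals.

```python
def jump_targets_and_jumpkinds(exits, default_target, default_jumpkind):
    """Map each static jump target to its jumpkind.

    exits: iterable of (target_address, jumpkind) pairs for conditional exits.
    default_target: address of the default exit, or None if it is indirect.
    """
    targets = {dst: jumpkind for dst, jumpkind in exits}
    if default_target is not None:
        targets[default_target] = default_jumpkind
    return targets


# A block with one conditional branch and a constant fall-through target.
direct = jump_targets_and_jumpkinds(
    [(0x401010, "Ijk_Boring")], 0x401005, "Ijk_Boring"
)
# A block that ends in an indirect jump has no static default target.
indirect = jump_targets_and_jumpkinds([(0x401010, "Ijk_Boring")], None, "Ijk_Boring")
```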

class pyvex.block.IRTypeEnv(arch, types=None)

An IR type environment.

Variables:
  • types (list of str) – A list of the types of all the temporaries in this block as VEX enum strings. types[3] is the type of t3.

lookup(tmp)

Return the type of temporary variable tmp as an enum string.

add(ty)

Add a new tmp of type ty to the environment. Returns the number of the new tmp.
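The behaviour described for lookup and add can be modelled with a small stand-in class. ToyTypeEnv below is a hypothetical plain-Python model of the documented semantics, not the pyvex implementation.

```python
class ToyTypeEnv:
    """Stand-in for a type environment: index i holds the type of tmp t<i>."""

    def __init__(self, types=None):
        self.types = list(types) if types is not None else []

    def lookup(self, tmp):
        # The type of t<tmp> is simply the entry at that index.
        return self.types[tmp]

    def add(self, ty):
        # A new tmp takes the next free number.
        self.types.append(ty)
        return len(self.types) - 1


env = ToyTypeEnv(["Ity_I64", "Ity_I32"])
new_tmp = env.add("Ity_I1")
```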

IR Components

class pyvex.stmt.IRStmt

IR statements in VEX represent operations with side effects.

class pyvex.stmt.NoOp

A no-operation statement. It is usually the result of an IR optimization.

class pyvex.stmt.IMark(addr, length, delta)

An instruction mark. It marks the start of the statements that represent a single machine instruction (the end of those statements is marked by the next IMark or the end of the IRSB). Contains the address and length of the instruction.
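Because each IMark opens the statements of one machine instruction, a flat statement list can be split into per-instruction groups. The sketch below uses a stand-in IMark record and plain strings for the other statements. These are illustrative placeholders rather than pyvex objects.

```python
from collections import namedtuple

# Stand-in record: only the address and length fields are modelled here.
IMark = namedtuple("IMark", ["addr", "length"])


def group_by_instruction(statements):
    """Split a flat statement list into (IMark, [statements]) groups."""
    groups = []
    for stmt in statements:
        if isinstance(stmt, IMark):
            groups.append((stmt, []))      # a new instruction starts here
        elif groups:
            groups[-1][1].append(stmt)     # belongs to the current instruction
        # statements before the first IMark belong to no instruction
    return groups


stmts = [
    IMark(0x1000, 1), "t0 = GET(rsp)", "PUT(rsp) = t1",
    IMark(0x1001, 3), "t2 = Add64(t0, 8)",
]
groups = group_by_instruction(stmts)
instruction_count = len(groups)                      # one group per IMark
block_size = sum(mark.length for mark, _ in groups)  # total bytes covered
```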

class pyvex.stmt.AbiHint(base, length, nia)

An ABI hint, which provides specific information about this platform’s ABI.

class pyvex.stmt.Put(data, offset)

Write to a guest register, at a fixed offset in the guest state.

class pyvex.stmt.PutI(descr, ix, data, bias)

Write to a guest register, at a non-fixed offset in the guest state.

class pyvex.stmt.WrTmp(tmp, data)

Assign a value to a temporary. Note that SSA rules require that each tmp be assigned exactly once. IR sanity checking will reject any block containing a temporary that is not assigned exactly once.
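The single-assignment rule can be illustrated with a small check over the destination tmp numbers of a block. The function below is a hypothetical sketch and is not the sanity checker used by VEX. It takes the assigned tmp numbers as plain integers.

```python
from collections import Counter


def find_ssa_violations(assigned_tmps, n_tmps):
    """Return the sorted tmp numbers that are not assigned exactly once.

    assigned_tmps: destination tmp number of every assigning statement.
    n_tmps: total number of temporaries declared for the block.
    """
    counts = Counter(assigned_tmps)
    return sorted(t for t in range(n_tmps) if counts[t] != 1)


ok = find_ssa_violations([0, 1, 2], 3)       # every tmp written once
bad = find_ssa_violations([0, 1, 1, 3], 4)   # t1 written twice, t2 never
```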

class pyvex.stmt.Store(addr, data, end)

Write a value to memory.

class pyvex.stmt.CAS(addr, dataLo, dataHi, expdLo, expdHi, oldLo, oldHi, end)

An atomic compare-and-swap operation.

class pyvex.stmt.LLSC(addr, storedata, result, end)

Either Load-Linked or Store-Conditional, depending on storedata. If storedata is None (NULL in VEX), this is a Load-Linked; otherwise it is a Store-Conditional.

class pyvex.stmt.Exit(guard, dst, jk, offsIP)

A conditional exit from the middle of an IRSB.

class pyvex.stmt.LoadG(end, cvt, dst, addr, alt, guard)

A guarded load.

class pyvex.stmt.StoreG(end, addr, data, guard)

A guarded store.
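The guarded forms perform their memory access only when the guard holds. For a guarded load, the destination receives the converted loaded value if the guard is true and the alt value otherwise. The sketch below models this with a dict standing in for memory. The function names and the dict-based memory are illustrative assumptions, not pyvex API.

```python
def guarded_load(memory, addr, alt, guard, cvt=lambda value: value):
    """Return cvt(memory[addr]) when guard holds, otherwise alt."""
    # The conditional expression never touches memory when guard is false.
    return cvt(memory[addr]) if guard else alt


def guarded_store(memory, addr, data, guard):
    """Write data to memory[addr] only when guard holds."""
    if guard:
        memory[addr] = data


memory = {0x2000: 0xFF}
loaded = guarded_load(memory, 0x2000, alt=0, guard=True)
skipped = guarded_load(memory, 0x3000, alt=0, guard=False)  # address never read
guarded_store(memory, 0x2000, 0x11, guard=False)            # no effect
guarded_store(memory, 0x2004, 0x22, guard=True)
```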

class pyvex.expr.IRExpr

IR expressions in VEX represent operations without side effects.

child_expressions

A list of all of the expressions that this expression ends up evaluating.

constants

A list of all of the constants that this expression ends up using.

class pyvex.expr.Binder(binder)

Used only in pattern matching within Vex. Should not be seen outside of Vex.

class pyvex.expr.GetI(descr, ix, bias)

Read a guest register at a non-fixed offset in the guest state.

class pyvex.expr.RdTmp(tmp)

Read the value held by a temporary.

class pyvex.expr.Get(offset, ty)

Read a guest register, at a fixed offset in the guest state.

class pyvex.expr.Qop(op, args)

A quaternary operation (4 arguments).

class pyvex.expr.Triop(op, args)

A ternary operation (3 arguments).

class pyvex.expr.Binop(op, args)

A binary operation (2 arguments).

class pyvex.expr.Unop(op, args)

A unary operation (1 argument).

class pyvex.expr.Load(end, ty, addr)

A load from memory.

class pyvex.expr.Const(con)

A constant expression.

class pyvex.expr.ITE(cond, iffalse, iftrue)

An if-then-else expression.
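Note the argument order in the signature: the false branch comes before the true branch. A minimal plain-Python sketch of the selection, with a hypothetical helper name, could look like this:

```python
def eval_ite(cond, iffalse, iftrue):
    """Select iftrue when cond holds, otherwise iffalse.

    The parameter order mirrors the ITE signature (cond, iffalse, iftrue).
    """
    return iftrue if cond else iffalse


taken = eval_ite(1, "false-branch", "true-branch")
not_taken = eval_ite(0, "false-branch", "true-branch")
```

Passing the branches as keyword arguments avoids swapping them by accident.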

class pyvex.expr.CCall(retty, cee, args)

A call to a pure (no side-effects) helper C function.

Misc. Things

class pyvex.enums.VEXObject

The base class for Vex types.

class pyvex.enums.IRCallee(regparms, name, addr, mcx_mask)

Describes a helper function to call.

class pyvex.enums.IRRegArray(base, elemTy, nElems)

A section of the guest state that we want to be able to index at run time, so as to be able to describe indexed or rotating register files on the guest.

Variables:
  • base (int) – The offset into the state that this array starts
  • elemTy (str) – The types of the elements in this array, as VEX enum strings
  • nElems (int) – The number of elements in this array
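To show how such an array is indexed by GetI and PutI, the sketch below computes the guest-state offset of the selected element. In VEX the element index is (ix + bias) modulo the number of elements. The function name, the size table, and the example base offset of 100 are illustrative assumptions and do not describe any particular architecture.

```python
# Byte sizes for a few VEX type names; only a handful are listed here.
TYPE_SIZES = {"Ity_I8": 1, "Ity_I16": 2, "Ity_I32": 4, "Ity_I64": 8, "Ity_F64": 8}


def indexed_offset(base, elem_ty, n_elems, ix, bias):
    """Guest-state offset of the element selected by ix and bias."""
    slot = (ix + bias) % n_elems          # wrap around the register file
    return base + slot * TYPE_SIZES[elem_ty]


# Hypothetical rotating file of eight 64-bit floats starting at offset 100.
first = indexed_offset(100, "Ity_F64", 8, ix=0, bias=0)
wrapped = indexed_offset(100, "Ity_F64", 8, ix=7, bias=2)     # slot 1
negative = indexed_offset(100, "Ity_F64", 8, ix=0, bias=-1)   # slot 7
```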