cle — Binary Loader

CLE is an extensible binary loader. Its main goal is to take an executable program and any libraries it depends on and produce an address space where that program is loaded and ready to run.

The primary interface to CLE is the Loader class.

Loading Interface

class cle.loader.Loader(main_binary, auto_load_libs=True, force_load_libs=(), skip_libs=(), main_opts=None, lib_opts=None, custom_ld_path=(), use_system_libs=True, ignore_import_version_numbers=True, case_insensitive=False, rebase_granularity=16777216, except_missing_libs=False, aslr=False, page_size=1, extern_size=32768)

The loader loads all the objects and exports an abstraction of the memory of the process. What you see here is an address space with loaded and rebased binaries.

Parameters:main_binary – The path to the main binary you’re loading, or a file-like object with the binary in it.

The following parameters are optional.

Parameters:
  • auto_load_libs – Whether to automatically load shared libraries that loaded objects depend on.
  • force_load_libs – A list of libraries to load regardless of if they’re required by a loaded object.
  • skip_libs – A list of libraries to never load, even if they’re required by a loaded object.
  • main_opts – A dictionary of options to be used loading the main binary.
  • lib_opts – A dictionary mapping library names to the dictionaries of options to be used when loading them.
  • custom_ld_path – A list of paths in which we can search for shared libraries.
  • use_system_libs – Whether or not to search the system load path for requested libraries. Default True.
  • ignore_import_version_numbers – Whether libraries with different version numbers in the filename will be considered equivalent, for example libc.so.6 and libc.so.0.
  • case_insensitive – If this is set to True, filesystem loads will be done case-insensitively regardless of the case-sensitivity of the underlying filesystem.
  • rebase_granularity – The alignment to use for rebasing shared objects
  • except_missing_libs – Throw an exception when a shared library can’t be found.
  • aslr – Load libraries in symbolic address space. Do not use this option.
  • page_size – The granularity with which data is mapped into memory. Set to 1 if you are working in a non-paged environment.
Variables:
  • memory (cle.memory.Clemory) – The loaded, rebased, and relocated memory of the program.
  • main_object – The object representing the main binary (i.e., the executable).
  • shared_objects – A dictionary mapping loaded library names to the objects representing them.
  • all_objects – A list containing representations of all the different objects loaded.
  • requested_names – A set containing the names of all the different shared libraries that were marked as a dependency by somebody.
  • initial_load_objects – A list of all the objects that were loaded as a result of the initial load request.
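
To make the rebase_granularity parameter more concrete, the sketch below rounds a candidate base address up to the next multiple of the granularity. The helper name align_up and the sample address are assumptions made for illustration; CLE's own placement logic may take more factors into account.

```python
def align_up(addr: int, granularity: int = 16777216) -> int:
    """Round addr up to the next multiple of granularity (hypothetical helper)."""
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    return ((addr + granularity - 1) // granularity) * granularity


# With the default granularity of 0x1000000 (16 MiB), a candidate address of
# 0x1234567 is moved up to the next 16 MiB boundary.
candidate = align_up(0x1234567)
print(hex(candidate))
```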

When reference is made to a dictionary of options, it requires a dictionary with zero or more of the following keys:

  • backend : “elf”, “pe”, “mach-o”, “ida”, “blob” : which loader backend to use
  • custom_arch : The archinfo.Arch object to use for the binary
  • custom_base_addr : The address to rebase the object at
  • custom_entry_point : The entry point to use for the object

More keys are defined on a per-backend basis.
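
As a minimal sketch of how these option dictionaries fit together, the helper below builds keyword arguments that could be passed to the Loader constructor. The helper name make_loader_kwargs, the addresses, and the library name are hypothetical; only the dictionary keys come from the list above.

```python
def make_loader_kwargs(base_addr: int, entry_point: int, arch=None) -> dict:
    """Build keyword arguments for cle.Loader (hypothetical helper)."""
    main_opts = {
        "backend": "elf",
        "custom_base_addr": base_addr,
        "custom_entry_point": entry_point,
    }
    if arch is not None:
        # Expected to be an archinfo.Arch instance when provided.
        main_opts["custom_arch"] = arch
    return {
        "main_opts": main_opts,
        # lib_opts maps a library name to its own options dictionary.
        "lib_opts": {"libc.so.6": {"custom_base_addr": 0x7000000}},
    }


kwargs = make_loader_kwargs(0x400000, 0x400100)

# Usage, assuming cle is installed and "./a.out" is a real ELF file:
#   import cle
#   ld = cle.Loader("./a.out", **kwargs)
```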

close()

Release any resources held by this loader.

max_addr

The maximum address loaded as part of any loaded object (i.e., the whole address space).

min_addr

The minimum address loaded as part of any loaded object (i.e., the whole address space).

initializers

Return a list of all the initializers that should be run before execution reaches the entry point, in the order they should be run.

finalizers

Return a list of all the finalizers that should be run before the program exits. The order in which they should be run is not specified.

linux_loader_object

If the Linux dynamic loader is present in memory, return it.

extern_object

Return the extern object used to provide addresses to unresolved symbols and angr internals.

Accessing this property will load this object into memory if it was not previously present.

kernel_object

Return the object used to provide addresses to syscalls.

Accessing this property will load this object into memory if it was not previously present.

tls_object

Return the object used to provide addresses for thread-local storage.

Accessing this property will load this object into memory if it was not previously present.

all_elf_objects

Return a list of every object that was loaded from an ELF file.

all_pe_objects

Return a list of every object that was loaded from a PE file.

missing_dependencies

Return a set of every name that was requested as a shared object dependency but could not be loaded.

describe_addr(addr)

Returns a textual description of what’s in memory at the provided address

find_object(spec, extra_objects=())

If the given library specification has been loaded, return its object, otherwise return None.

find_object_containing(addr)

Return the object that contains the given address, or None if the address is unmapped.

find_symbol(name)

Search for the symbol with the given name or address.

Parameters:name – Either the name or address of a symbol to look up
Returns:A cle.backends.Symbol object if found, None otherwise.
find_all_symbols(name, exclude_imports=True, exclude_externs=False, exclude_forwards=True)

Iterate over all symbols present in the set of loaded binaries that have the given name

Parameters:
  • name – The name to search for
  • exclude_imports – Whether to exclude import symbols. Default True.
  • exclude_externs – Whether to exclude symbols in the extern object. Default False.
  • exclude_forwards – Whether to exclude forward symbols. Default True.
find_plt_stub_name(addr)

Return the name of the PLT stub starting at addr.

find_relevant_relocations(name)

Iterate through all the relocations referring to the symbol with the given name

perform_irelative_relocs(resolver_func)

Use this method to satisfy IRelative relocations in the binary that require execution of loaded code.

Note that this does NOT handle IFunc symbols, which must be handled separately. (this could be changed, but at the moment it’s desirable to support lazy IFunc resolution, since emulation is usually slow)

Parameters:resolver_func – A callback function that takes an address, runs the code at that address, and returns the return value from the emulated function.
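
The contract of resolver_func can be illustrated with a toy model. Nothing below is CLE code: apply_irelatives, the dictionary standing in for memory, and the fake emulator results are all assumptions, chosen only to show that the callback receives a resolver address and returns the value to be written at the destination.

```python
def apply_irelatives(irelatives, resolver_func, memory):
    """For each (resolver_addr, dest_addr) pair, run the resolver and store its result."""
    for resolver_addr, dest_addr in irelatives:
        memory[dest_addr] = resolver_func(resolver_addr)


# Stand-in for an emulator: maps a resolver's address to the value it returns.
fake_results = {0x401000: 0x401800, 0x401010: 0x401900}


def resolver_func(addr):
    # A real callback would execute the code at addr and return its result.
    return fake_results[addr]


memory = {}
apply_irelatives([(0x401000, 0x601018), (0x401010, 0x601020)], resolver_func, memory)
```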
dynamic_load(spec)

Load a file into the address space. Note that the semantics of auto_load_libs and except_missing_libs apply at all times.

Parameters:spec – The path to the file to load. May be an absolute path, a relative path, or a name to search in the load path.
Returns:A list of all the objects successfully loaded, which may be empty if this object was previously loaded. If the object specified in spec failed to load for any reason, including the file not being found, return None.
get_loader_symbolic_constraints()

Do not use this method.

Backends

class cle.backends.Backend(binary, loader=None, is_main_bin=False, filename=None, custom_entry_point=None, custom_arch=None, custom_base_addr=None, **kwargs)

Main base class for CLE binary objects.

An alternate interface to this constructor exists as the static method cle.loader.Loader.load_object()

Variables:
  • binary – The path to the file this object is loaded from
  • is_main_bin – Whether this binary is loaded as the main executable
  • segments – A listing of all the loaded segments in this file
  • sections – A listing of all the demarked sections in the file
  • sections_map – A dict mapping from section name to section
  • imports – A mapping from symbol name to import symbol
  • resolved_imports – A list of all the import symbols that are successfully resolved
  • relocs – A list of all the relocations in this binary
  • irelatives – A list of tuples representing all the irelative relocations that need to be performed. The first item in the tuple is the address of the resolver function, and the second item is the address of where to write the result. The destination address is an RVA.
  • jmprel – A mapping from symbol name to the address of its jump slot relocation, i.e. its GOT entry.
  • arch (archinfo.arch.Arch) – The architecture of this binary
  • os (str) – The operating system this binary is meant to run under
  • mapped_base (int) – The base address of this object in virtual memory
  • deps – A list of names of shared libraries this binary depends on
  • linking – ‘dynamic’ or ‘static’
  • linked_base – The base address this object requests to be loaded at
  • pic (bool) – Whether this object is position-independent
  • execstack (bool) – Whether this executable has an executable stack
  • provides (str) – The name of the shared library dependency that this object resolves
Parameters:
  • binary – The path to the binary to load
  • is_main_bin – Whether this binary should be loaded as the main executable
rebase()

Rebase backend’s regions to the new base where they were mapped by the loader

contains_addr(addr)

Whether addr is in one of the binary’s loaded segments or sections (i.e., whether it is mapped into memory).

find_segment_containing(addr)

Returns the segment that contains addr, or None.

find_section_containing(addr)

Returns the section that contains addr or None.

min_addr

This returns the lowest virtual address contained in any loaded segment of the binary.

max_addr

This returns the highest virtual address contained in any loaded segment of the binary.

initializers

Stub function. Should be overridden by backends that can provide initializer functions that ought to be run before execution reaches the entry point. Addresses should be rebased.

finalizers

Stub function. Like initializers, but with finalizers.

get_symbol(name)

Stub function. Implement to find the symbol with name name.

static extract_soname(path)

Extracts the shared object identifier from the path, or returns None if it cannot.

classmethod check_compatibility(spec, obj)

Performs a minimal static load of spec and returns whether it’s compatible with obj.

class cle.backends.symbol.Symbol(owner, name, relative_addr, size, sym_type)

Representation of a symbol from a binary file. Smart enough to rebase itself.

There should never be more than one Symbol instance representing a single symbol. To ensure this, create new symbols only through cle.backends.Backend.get_symbol().

Variables:
  • owner_obj (cle.backends.Backend) – The object that contains this symbol
  • name (str) – The name of this symbol
  • addr (int) – The un-based address of this symbol, an RVA
  • type (int) – The type of this symbol as one of SYMBOL.TYPE_*
  • resolved (bool) – Whether this import symbol has been resolved to a real symbol
  • resolvedby (None or cle.backends.Symbol) – The real symbol this import symbol has been resolve to
  • resolvewith (str) – The name of the library we must use to resolve this symbol, or None if none is required.
  • size (int) – The size of this symbol

Not documenting this since if you try calling it, you’re wrong.

rebased_addr

The address of this symbol in the global memory space

is_function

Whether this symbol is a function

demangled_name

The name of this symbol, run through a C++ demangler

Warning: this calls out to the external program c++filt and will fail loudly if it’s not installed

resolve_forwarder()

If this symbol is a forwarding export, return the symbol the forwarding refers to, or None if it cannot be found.

class cle.backends.regions.Regions(lst=None)

A container class acting as a list of regions (sections or segments). Additionally, it keeps a sorted list of all regions that are mapped into memory to allow fast lookups.

We assume none of the regions overlap with others.

raw_list

Get the internal list. Changes to it are not tracked, so _sorted_list will not be updated. You probably do not want to modify the list.

Returns:The internal list container.
Return type:list
max_addr

Get the highest address of all regions.

Returns:The highest address of all regions, or None if there is no region available.

Return type:int or None

append(region)

Append a new Region instance into the list.

Parameters:region (Region) – The region to append.
find_region_containing(addr)

Find the region that contains a specific address. Returns None if none of the regions covers the address.

Parameters:addr (int) – The address.
Returns:The region that covers the specific address, or None if no such region is found.
Return type:Region or None
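
The idea of keeping a sorted list for fast lookups can be sketched with the standard bisect module. This is not the Regions implementation: the SortedRegions class and the Region namedtuple (with an inclusive max_addr) are assumptions for illustration, and the lookup relies on the stated assumption that regions do not overlap.

```python
import bisect
from collections import namedtuple

# Hypothetical region record; max_addr is treated as inclusive here.
Region = namedtuple("Region", ["name", "min_addr", "max_addr"])


class SortedRegions:
    def __init__(self, regions=()):
        self._sorted = sorted(regions, key=lambda r: r.min_addr)
        self._starts = [r.min_addr for r in self._sorted]

    def append(self, region):
        # Insert while keeping both lists ordered by start address.
        idx = bisect.bisect_right(self._starts, region.min_addr)
        self._starts.insert(idx, region.min_addr)
        self._sorted.insert(idx, region)

    def find_region_containing(self, addr):
        # The only candidate is the last region starting at or before addr.
        idx = bisect.bisect_right(self._starts, addr) - 1
        if idx < 0:
            return None
        region = self._sorted[idx]
        return region if addr <= region.max_addr else None

    @property
    def max_addr(self):
        if not self._sorted:
            return None
        return max(r.max_addr for r in self._sorted)


regions = SortedRegions([Region(".data", 0x2000, 0x2FFF)])
regions.append(Region(".text", 0x1000, 0x1FFF))
```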
class cle.backends.elf.elf.ELF(binary, addend=None, **kwargs)

The main loader class for statically loading ELF executables. Uses the pyelftools library where useful.

get_symbol(symid, symbol_table=None)

Gets a Symbol object for the specified symbol.

Parameters:symid – Either an index into .dynsym or the name of a symbol.
class cle.backends.elf.elf.ELFSymbol(owner, symb)

Represents a symbol for the ELF format.

Variables:
  • elftype (str) – The type of this symbol as an ELF enum string
  • binding (str) – The binding of this symbol as an ELF enum string
  • section – The section associated with this symbol, or None
class cle.backends.elf.elfcore.CoreNote(n_type, name, desc)

This class is used when parsing the NOTES section of a core file.

class cle.backends.elf.elfcore.ELFCore(binary, **kwargs)

Loader class for ELF core files.

class cle.backends.elf.metaelf.MetaELF(*args, **kwargs)

A base class that implements functions used by all backends that can load an ELF.

plt

Maps names to addresses.

reverse_plt

Maps addresses to names.

is_ppc64_abiv1

Returns whether the arch is powerpc64 ABIv1.

Returns:True if powerpc64 ABIv1, False otherwise.
static get_text_offset(path)

Offset of .text in the binary.

class cle.backends.elf.symbol.ELFSymbol(owner, symb)

Represents a symbol for the ELF format.

Variables:
  • elftype (str) – The type of this symbol as an ELF enum string
  • binding (str) – The binding of this symbol as an ELF enum string
  • section – The section associated with this symbol, or None
class cle.backends.elf.regions.ELFSegment(readelf_seg)

Represents a segment for the ELF format.

class cle.backends.elf.hashtable.ELFHashTable(symtab, stream, offset, arch)

Functions to do lookup from a HASH section of an ELF file.

Information: http://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-48031.html

Parameters:
  • symtab – The symbol table to perform lookups from (as a pyelftools SymbolTableSection).
  • stream – A file-like object to read from the ELF’s memory.
  • offset – The offset in the object where the table starts.
  • arch – The ArchInfo object for the ELF file.
get(k)

Perform a lookup. Returns a pyelftools Symbol object, or None if there is no match.

Parameters:k – The string to look up.
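
For context, the bucket for a name in a HASH section is chosen with the classic System V ABI hash function. The sketch below is a standalone rendition of that function, not CLE's code; the function name elf_hash is an assumption.

```python
def elf_hash(name: bytes) -> int:
    """System V ABI ELF hash of a symbol name, as a 32-bit value."""
    h = 0
    for c in name:
        h = ((h << 4) + c) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
        # Clear the top nibble so the value stays within 28 bits.
        h &= ~g
    return h & 0xFFFFFFFF


print(hex(elf_hash(b"printf")))
```

A lookup would then take this value modulo the number of buckets and walk the corresponding chain until a symbol with a matching name is found.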
class cle.backends.elf.hashtable.GNUHashTable(symtab, stream, offset, arch)

Functions to do lookup from a GNU_HASH section of an ELF file.

Information: https://blogs.oracle.com/ali/entry/gnu_hash_elf_sections

Parameters:
  • symtab – The symbol table to perform lookups from (as a pyelftools SymbolTableSection).
  • stream – A file-like object to read from the ELF’s memory.
  • offset – The offset in the object where the table starts.
  • arch – The ArchInfo object for the ELF file.
get(k)

Perform a lookup. Returns a pyelftools Symbol object, or None if there is no match.

Parameters:k – The string to look up
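
GNU_HASH sections use a different name hash, commonly described as h = h * 33 + c starting from 5381 and truncated to 32 bits. The sketch below shows only that hash step under the assumed name gnu_hash; the Bloom filter and bucket walk that a full lookup needs are omitted.

```python
def gnu_hash(name: bytes) -> int:
    """GNU-style symbol name hash, truncated to 32 bits."""
    h = 5381
    for c in name:
        h = (h * 33 + c) & 0xFFFFFFFF
    return h


print(hex(gnu_hash(b"printf")))
```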
class cle.backends.pe.pe.PE(*args, **kwargs)

Representation of a PE (i.e. Windows) binary.

get_symbol(name)

Look up the symbol with the given name. Symbols can be looked up by ordinal with the name "ordinal.%d" % num

class cle.backends.pe.symbol.WinSymbol(owner, name, addr, is_import, is_export, ordinal_number, forwarder)

Represents a symbol for the PE format.

class cle.backends.pe.regions.PESection(pe_section, remap_offset=0)

Represents a section for the PE format.

class cle.backends.macho.macho.MachO(binary, **kwargs)

Mach-O binaries for CLE

The Mach-O format is notably different from other formats, as such:

  • Sections are always part of a segment, self.sections will thus be empty
  • Symbols cannot be categorized like in ELF
  • Symbol resolution must be handled by the binary
  • Rebasing cannot be done statically (i.e. self.mapped_base is ignored for now)
  • ...

is_thumb_interworking(address)

Returns true if the given address is a THUMB interworking address

decode_thumb_interworking(address)

Decodes a thumb interworking address
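
A minimal sketch of these two helpers, assuming the usual ARM convention that bit 0 of a branch target marks Thumb code. The standalone functions below are illustrative and are not the MachO methods themselves, which may apply further checks (for example on the architecture).

```python
def is_thumb_interworking(address: int) -> bool:
    """True if the low bit is set, which by ARM convention marks a Thumb target."""
    return bool(address & 1)


def decode_thumb_interworking(address: int) -> int:
    """Clear the low bit to recover the actual instruction address."""
    return address & ~1


print(hex(decode_thumb_interworking(0x8001)))
```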

get_string(start)

Loads a string from the string table

parse_lc_str(f, start, limit=None)

Parses a lc_str data structure

get_symbol_by_address_fuzzy(address)

Locates a symbol by checking the given address against sym.addr, sym.bind_xrefs and sym.symbol_stubs

get_symbol(name, include_stab=False, fuzzy=False)

Returns all symbols matching name.

Note that especially when include_stab=True there may be multiple symbols with the same name, therefore this method always returns an array.

Parameters:
  • include_stab – Include debugging symbols (not recommended)
  • fuzzy – Replace exact match with “contains”-style match
get_segment_by_name(name)

Searches for a MachOSegment with the given name and returns it.

Parameters:name – Name of the sought segment
Returns:MachOSegment or None

class cle.backends.macho.symbol.MachOSymbol(owner, name, addr, symtab_offset, macho_type, section_number, description, value, library_name=None, segment_name=None, section_name=None, is_export=None)

Base class for Mach-O symbols. Made to be (somewhat) compatible with backends.Symbol. Note that ELF-specific fields from backends.Symbol are not used and the semantics of the remaining fields differ in many cases. As a result, most stock functionality from angr and related libraries WILL NOT WORK PROPERLY on MachOSymbol.

Much of the code below is based on heuristics as official documentation is sparse, consider yourself warned!

class cle.backends.macho.section.MachOSection(offset, vaddr, size, vsize, segname, sectname, align, reloff, nreloc, flags, r1, r2)

Mach-O Section, only defined within the context of a Mach-O Segment.

  • offset is the offset into the file the region starts
  • vaddr (or just addr) is the virtual address
  • filesize (or just size) is the size of the region in the file
  • memsize (or vsize) is the size of the region when loaded into memory
  • segname is the corresponding segment’s name without padding
  • sectname is the section’s name without padding
  • align is the section’s alignment as a power of 2
  • reloff is the file offset to the section’s relocation entries
  • nreloc is the number of relocation entries for this section
  • flags is a bit vector containing per-section flags
  • r1 and r2 are values for the reserved1 and reserved2 fields respectively
class cle.backends.macho.segment.MachOSegment(offset, vaddr, size, vsize, segname, nsect, sections, flags, initprot, maxprot)

Mach-O Segment

  • offset is the offset into the file the region starts
  • vaddr (or just addr) is the virtual address
  • filesize (or just size) is the size of the region in the file
  • memsize (or vsize) is the size of the region when loaded into memory
  • segname is the segment’s name without padding
  • nsect is the number of sections contained in this segment
  • sections is an array of MachOSections
  • flags is a bit vector containing per-segment flags
  • initprot and maxprot are initial and maximum permissions respectively
get_section_by_name(name)

Searches for a section by name within this segment.

Parameters:name – Name of the section
Returns:MachOSection or None

cle.backends.macho.binding.read_uleb(blob, offset)

Reads a number encoded as uleb128

cle.backends.macho.binding.read_sleb(blob, offset)

Reads a number encoded as sleb128
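
The two encodings can be sketched as follows. These are standalone decoders written for illustration; the return convention used here, a (value, bytes_consumed) tuple, is an assumption and should be checked against the actual functions before relying on it.

```python
def read_uleb(blob: bytes, offset: int):
    """Decode an unsigned LEB128 value; returns (value, bytes_consumed)."""
    result = 0
    shift = 0
    index = 0
    while True:
        byte = blob[offset + index]
        index += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):  # high bit clear marks the last byte
            return result, index


def read_sleb(blob: bytes, offset: int):
    """Decode a signed LEB128 value; returns (value, bytes_consumed)."""
    result = 0
    shift = 0
    index = 0
    while True:
        byte = blob[offset + index]
        index += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            if byte & 0x40:  # sign bit of the final group
                result -= 1 << shift
            return result, index
```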

class cle.backends.macho.binding.BindingState(is_64)

State object

add_address_ov(address, addend)

This is a very ugly kludge. It is needed because dyld relies on overflow semantics and represents several negative offsets through BIG ulebs.
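
The overflow semantics mentioned here can be illustrated with fixed-width wraparound arithmetic. The standalone function below is a sketch under that reading, not the BindingState method; the is_64 parameter mirrors the constructor argument but its use here is an assumption.

```python
def add_address_ov(address: int, addend: int, is_64: bool = True) -> int:
    """Add with wraparound at the pointer width, so a huge uleb acts as a negative offset."""
    mask = (1 << (64 if is_64 else 32)) - 1
    return (address + addend) & mask


# 0xFFFFFFFFFFFFFFF8 is -8 when interpreted as a 64-bit two's complement value.
print(hex(add_address_ov(0x1000, 0xFFFFFFFFFFFFFFF8)))
```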

class cle.backends.macho.binding.BindingHelper(binary)

Factors out binding logic from MachO. Intended to work in close conjunction with MachO, not for standalone use.

do_normal_bind(blob)

Performs non-lazy, non-weak bindings.

Parameters:blob – Blob containing binding opcodes

do_lazy_bind(blob)

Performs lazy binding

cle.backends.macho.binding.default_binding_handler(state, binary)

Binds location to the symbol with the given name and library ordinal

class cle.backends.cgc.cgc.CGC(binary, *args, **kwargs)

Backend to support the CGC elf format used by the Cyber Grand Challenge competition.

See : https://github.com/CyberGrandChallenge/libcgcef/blob/master/cgc_executable_format.md

class cle.backends.cgc.backedcgc.BackedCGC(path, memory_backer=None, register_backer=None, writes_backer=None, permissions_map=None, current_allocation_base=None, *args, **kwargs)

This is a backend for CGC executables that allows the user to provide a memory backer and a register backer as the initial state of the running binary.

Parameters:
  • path – File path to CGC executable.
  • memory_backer – A dict of memory content, with beginning address of each segment as key and actual memory content as data.
  • register_backer – A dict of all register contents. EIP will be used as the entry point of this executable.
  • permissions_map – A dict of memory region to permission flags
  • current_allocation_base – An integer representing the current address of the top of the CGC heap.
class cle.backends.ihex.Hex(path, custom_arch=None, custom_entry_point=0, **kwargs)

A loader for Intel Hex objects. See https://en.wikipedia.org/wiki/Intel_HEX

class cle.backends.blob.Blob(path, custom_offset=None, segments=None, **kwargs)

Representation of a binary blob, i.e. an executable in an unknown file format.

Parameters:
  • custom_arch – (required) an archinfo.Arch for the binary blob.
  • custom_offset – Skip this many bytes from the beginning of the file.
  • segments – List of tuples describing how to map data into memory. Tuples are of (file_offset, mem_addr, size).

You can’t specify both custom_offset and segments.
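
To show what the (file_offset, mem_addr, size) tuples describe, the sketch below copies slices of a byte string to their target addresses. The map_segments helper and the dictionary used as memory are assumptions for illustration and do not reflect how the Blob backend stores its data.

```python
def map_segments(data: bytes, segments):
    """Return {mem_addr: bytes} for each (file_offset, mem_addr, size) tuple."""
    mapped = {}
    for file_offset, mem_addr, size in segments:
        chunk = data[file_offset:file_offset + size]
        if len(chunk) != size:
            raise ValueError("segment extends past end of file")
        mapped[mem_addr] = chunk
    return mapped


# A 16-byte "file": the first 4 bytes go to 0x8000, the last 8 to 0x20000000.
image = bytes(range(16))
layout = map_segments(image, [(0, 0x8000, 4), (8, 0x20000000, 8)])
```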

function_name(addr)

Blobs don’t support function names.

in_which_segment(addr)

Blobs don’t support segments.

class cle.backends.idabin.IDABin(binary, *args, **kwargs)

Get information from binaries using IDA.

in_which_segment(addr)

Return the segment name at address addr (IDA).

function_name(addr)

Return the function name at address addr (IDA).

get_symbol_addr(sym)

Get the address of the symbol sym from IDA.

Returns:An address.
min_addr

Get the min address of the binary (IDA).

max_addr

Get the max address of the binary (IDA).

resolve_import_dirty(sym, new_val)

Resolve import for symbol sym the dirty way, i.e. find all references to it in the code and replace it with the address new_val inline (instead of updating GOT slots). Don’t use this unless you really have to, use resolve_import_with() instead.

set_got_entry(name, newaddr)

Resolve import name with address newaddr. That is, update the GOT entry for name with newaddr.

is_thumb(addr)

Whether the address addr is in Thumb mode (ARM).

get_strings()

Extract strings from binary (IDA).

Returns:An array of strings.

Relocations

CLE’s loader implements program relocation data on a plugin basis. If you would like to add more relocation implementations, do so by subclassing the Relocation class and overriding any relevant methods or properties. Put your subclasses in a module in the relocations subpackage of the appropriate backend package. The name of the subclass will be used to determine when to use it! Look at the existing versions for details.

class cle.backends.relocation.Relocation(owner, symbol, relative_addr)

A representation of a relocation in a binary file. Smart enough to relocate itself.

Variables:
  • owner_obj – The binary this relocation was originally found in, as a cle object
  • symbol – The Symbol object this relocation refers to
  • relative_addr – The address in owner_obj this relocation would like to write to
  • rebased_addr – The address in the global memory space this relocation would like to write to
  • resolvedby – If the symbol this relocation refers to is an import symbol and that import has been resolved, this attribute holds the symbol from a different binary that was used to resolve the import.
  • resolved – Whether the application of this relocation was successful
relocate(solist, bypass_compatibility=False)

Applies this relocation. Will make changes to the memory object of the object it came from.

This implementation is a generic version that can be overridden in subclasses.

Parameters:solist – A list of objects from which to resolve symbols.

Thread-local storage

class cle.backends.tls.TLSObject(loader)

CLE implements thread-local storage by treating the TLS region as another object to be loaded. Because of the complex interactions between TLS and all the other objects that can be loaded into memory, each TLS object will perform some basic initialization when instantiated, and then once all other objects have been loaded, finalize() is called.

register_object(obj)

Lay out the TLS initialization images into memory. Do the actual work in a subclass.

class cle.backends.tls.elf_tls.ELFTLSObject(loader, max_data=32768, max_modules=256)

This class is used when parsing the Thread Local Storage of an ELF binary. It heavily uses the TLSArchInfo namedtuple from archinfo.

ELF TLS is implemented based on the following documents:

thread_pointer

The thread pointer. This is a technical term that refers to a specific location in the TLS segment.

user_thread_pointer

The thread pointer that is exported to the user

get_addr(module_id, offset)

Basically equivalent to __tls_get_addr.

class cle.backends.tls.pe_tls.PETLSObject(loader, max_modules=256, max_data=32768)

This class is used when parsing the Thread Local Storage of a PE binary. It represents both the TLS array and the TLS data area for a specific thread.

In memory the PETLSObj is laid out as follows:

+----------------------+---------------------------------------+
| TLS array            | TLS data area                         |
+----------------------+---------------------------------------+

A more detailed description of the TLS array and TLS data areas is given below.

TLS array

The TLS array is an array of addresses that points into the TLS data area. In memory it is laid out as follows:

+-----------+-----------+-----+-----------+
|  address  |  address  | ... |  address  |
+-----------+-----------+-----+-----------+
| index = 0 | index = 1 |     | index = n |
+-----------+-----------+-----+-----------+

The size of each address is architecture dependent (e.g. on X86 it is 4 bytes). The number of addresses in the TLS array is equal to the number of modules that contain TLS data. At load time (i.e. in the finalize method), each module is assigned an index into the TLS array. The address of this module’s TLS data area is then stored at this location in the array.

TLS data area

The TLS data area directly follows the TLS array and contains the actual TLS data for each module. In memory it is laid out as follows:

+----------+-----------+----------+-----------+-----+
| TLS data | zero fill | TLS data | zero fill | ... |
+----------+-----------+----------+-----------+-----+
|       module a       |       module b       | ... |
+---------------------------------------------------+

The size of each module’s TLS data area is variable and can be found in the module’s tls_data_size property. The same applies to the zero fill. At load time (i.e. in the finalize method), the initial TLS data values are copied into the TLS data area. Because a TLS index is also assigned to each module, we can access a module’s TLS data area using this index into the TLS array to get the start address of the TLS data.

get_tls_data_addr(tls_idx)

Get the start address of a module’s TLS data area via the module’s TLS index.

From the PE/COFF spec:

The code uses the TLS index and the TLS array location (multiplying the index by the word size and using it as an offset into the array) to get the address of the TLS data area for the given program and module.
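
The index arithmetic quoted above can be sketched in a few lines. The flat bytes object standing in for memory, the little-endian layout, and the standalone function signature are assumptions for illustration; the real method works against the loader's memory and the architecture's word size.

```python
import struct


def get_tls_data_addr(memory: bytes, tls_array_addr: int, tls_idx: int, word_size: int = 4) -> int:
    """Read the address stored in slot tls_idx of the TLS array (little-endian assumed)."""
    fmt = {4: "<I", 8: "<Q"}[word_size]
    # Multiply the index by the word size and use it as an offset into the array.
    offset = tls_array_addr + tls_idx * word_size
    (addr,) = struct.unpack_from(fmt, memory, offset)
    return addr


# Toy memory: a TLS array of two 4-byte addresses starting at offset 0.
memory = struct.pack("<II", 0x1000, 0x1040)
```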

Misc. Utilities

cle.gdb.convert_info_sharedlibrary(fname)

Convert a dump from gdb’s info sharedlibrary command to a set of options that can be passed to CLE to replicate the address space from the gdb session

Parameters:fname – The name of a file containing the dump
Returns:A dict appropriate to be passed as **kwargs for angr.Project or cle.Loader
cle.gdb.convert_info_proc_maps(fname)

Convert a dump from gdb’s info proc maps command to a set of options that can be passed to CLE to replicate the address space from the gdb session

Parameters:fname – The name of a file containing the dump
Returns:A dict appropriate to be passed as **kwargs for angr.Project or cle.Loader
class cle.memory.Clemory(arch, root=False)

An object representing a memory space. Uses “backers” and “updates” to separate the concepts of loaded and written memory and make lookups more efficient.

Accesses can be made with [index] notation.

add_backer(start, data)

Adds a backer to the memory.

Parameters:
  • start – The address where the backer should be loaded.
  • data – The backer itself. Can be either a string or another Clemory.
read_bytes(addr, n, orig=False)

Read up to n bytes at address addr in memory and return an array of bytes.

Reading will stop at the beginning of the first unallocated region found, or when n bytes have been read.
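
The stopping rule can be illustrated with a toy model in which backers are a dictionary from start address to bytes. This is not Clemory's implementation, and the function below returns a list of integer byte values, which is an assumption about the result type.

```python
def read_bytes(backers, addr, n):
    """Read up to n bytes starting at addr, stopping at the first unmapped address."""
    out = []
    while len(out) < n:
        for start, data in backers.items():
            if start <= addr < start + len(data):
                out.append(data[addr - start])
                addr += 1
                break
        else:
            # No backer covers addr: this is the first unallocated region.
            break
    return out


backers = {0x1000: b"ABCD", 0x1004: b"EF", 0x2000: b"Z"}
result = read_bytes(backers, 0x1002, 10)
```

In this toy layout the read crosses from the first backer into the adjacent one and then stops at 0x1006, which no backer covers.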

write_bytes(addr, data)

Write bytes from data at address addr.

write_bytes_to_backer(addr, data)

Write bytes from data at address addr to backer instead of self._updates. This is only needed when writing a huge amount of data.

read_addr_at(where, orig=False)

Read addr stored in memory as a series of bytes starting at where.

write_addr_at(where, addr)

Writes addr into a series of bytes in memory at where.

stride_repr

Returns a representation of memory in a list of (start, end, data) where data is a string.

seek(value)

The stream-like function that sets the “file’s” current position. Use with read().

Parameters:value – The position to seek to.
read(nbytes)

The stream-like function that reads up to a number of bytes starting from the current position and updates the current position. Use with seek().

Up to nbytes bytes will be read, halting at the beginning of the first unmapped region encountered.

cbackers

This property directly returns a list of already-flattened cbackers. It’s designed for performance purposes. GirlScout uses it. Use this property at your own risk!

read_bytes_c(addr)

Read n bytes at address addr in cbacked memory, and returns a cffi buffer pointer.

Note: We don’t support reading across segments for performance concerns.

class cle.patched_stream.PatchedStream(stream, patches)

An object that wraps a readable stream, performing passthroughs on seek and read operations, except to make it seem like the data has actually been patched by the given patches.

Parameters:
  • stream – The stream to patch
  • patches – A list of tuples of (addr, patch data)
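
The passthrough-with-patches behaviour can be sketched as a thin wrapper around any seekable stream. The class name PatchedStreamSketch is hypothetical and the code is an illustration of the idea rather than the actual PatchedStream implementation.

```python
import io


class PatchedStreamSketch:
    def __init__(self, stream, patches):
        self.stream = stream
        self.patches = patches  # list of (addr, patch_bytes)

    def seek(self, *args, **kwargs):
        return self.stream.seek(*args, **kwargs)

    def tell(self):
        return self.stream.tell()

    def read(self, *args, **kwargs):
        pos = self.stream.tell()
        data = bytearray(self.stream.read(*args, **kwargs))
        end = pos + len(data)
        for addr, patch in self.patches:
            # Overlay only the part of the patch that overlaps this read.
            lo = max(addr, pos)
            hi = min(addr + len(patch), end)
            if lo < hi:
                data[lo - pos:hi - pos] = patch[lo - addr:hi - addr]
        return bytes(data)


stream = PatchedStreamSketch(io.BytesIO(b"0123456789"), [(2, b"AB"), (8, b"XYZ")])
whole = stream.read()
stream.seek(3)
partial = stream.read(3)
```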
class cle.address_translator.AddressTranslator(rva, owner)

Mediates address translations between typed addresses such as RAW, RVA, LVA, MVA and VA including address owner and its state (linked or mapped)

Semantics:

owner - object associated with the address
    (any object class based on `cle.Backend`)
owner mapping state - sparse object can be either mapped or not
    (actual object's image base VA to be considered valid)
RAW - offset (index) inside a file stream
VA  - address inside process flat virtual memory space
RVA - address relative to the object's segment base
    (segment base normalized virtual address)
LVA - linked VA (linker)
MVA - mapped VA (loader)
Parameters:
  • rva (int) – virtual address relative to owner’s object image base
  • owner (cle.Backend) – The object owner address relates to
classmethod from_lva(lva, owner)

Loads address translator with LVA

classmethod from_mva(mva, owner)

Loads address translator with MVA

classmethod from_rva(rva, owner)

Loads address translator with RVA

classmethod from_raw(raw, owner)

Loads address translator with RAW address

classmethod from_linked_va(lva, owner)

Loads address translator with LVA

classmethod from_va(mva, owner)

Loads address translator with MVA

classmethod from_mapped_va(mva, owner)

Loads address translator with MVA

classmethod from_relative_va(rva, owner)

Loads address translator with RVA

to_lva()

RVA -> LVA

Return type:int

to_mva()

RVA -> MVA

Return type:int

to_rva()

RVA -> RVA

Return type:int

to_raw()

RVA -> RAW

Return type:int

to_linked_va()

RVA -> LVA

Return type:int

to_va()

RVA -> MVA

Return type:int

to_mapped_va()

RVA -> MVA

Return type:int

to_relative_va()

RVA -> RVA

Return type:int

cle.address_translator.AT

alias of AddressTranslator
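
The relationship between RVA, LVA and MVA can be sketched with plain arithmetic on an owner's linked_base and mapped_base. The class name AddressTranslatorSketch and the Owner namedtuple are stand-ins for illustration; RAW translation is omitted because it depends on the file's segment layout.

```python
from collections import namedtuple

# Hypothetical stand-in for a cle.Backend with only the two base addresses.
Owner = namedtuple("Owner", ["linked_base", "mapped_base"])


class AddressTranslatorSketch:
    def __init__(self, rva, owner):
        self._rva = rva
        self._owner = owner

    @classmethod
    def from_lva(cls, lva, owner):
        return cls(lva - owner.linked_base, owner)

    @classmethod
    def from_mva(cls, mva, owner):
        return cls(mva - owner.mapped_base, owner)

    def to_rva(self):
        return self._rva

    def to_lva(self):
        return self._rva + self._owner.linked_base

    def to_mva(self):
        return self._rva + self._owner.mapped_base


owner = Owner(linked_base=0x400000, mapped_base=0x7F0000000000)
translator = AddressTranslatorSketch.from_lva(0x401234, owner)
```

Here an address the linker placed at 0x401234 corresponds to RVA 0x1234, and to_mva() gives its location once the object has been mapped at its new base.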

Errors

exception cle.errors.CLEError

Base class for errors raised by CLE.

exception cle.errors.CLEUnknownFormatError

Error raised when CLE encounters an unknown executable file format.

exception cle.errors.CLEFileNotFoundError

Error raised when a file does not exist.

exception cle.errors.CLEInvalidBinaryError

Error raised when an executable file is invalid or corrupted.

exception cle.errors.CLEOperationError

Error raised when a problem is encountered in the process of loading an executable.

exception cle.errors.CLECompatibilityError

Error raised when loading an executable that is not currently supported by CLE.