The ERESI Reverse Engineering Software Interface is a multi-architecture binary analysis framework with a domain-specific language tailored to reverse engineering and program manipulation. Some of its goals are:
- Support both user-mode and kernel-mode instrumentation, debugging and program analysis
- Handle Intel and SPARC machine programs (with partial support for ARM, MIPS and Alpha processors)
- Analyze operating systems based on the Executable and Linking Format (ELF), in particular Linux
- Support many features on *BSD, Solaris, HP-UX, IRIX and BeOS
- Trace into any OS running in a virtual machine or emulator using the GDB serial protocol
- Construct and display program graphs in native code as well as in Intermediate Representation (IR) code
- Operate most features without symbols or debug info (but use them if available in ELF/DWARF/STABS)
- Inject or debug code that runs without an executable data segment (PaX, Openwall, etc.)
- Promote modularity and reuse of code
I joined the ERESI team in late 2006 and started by developing the SPARC support, later moving on to the control-flow graphing features and part of the initial program analysis efforts. The project was led by Julien Vanegue and, over the years, was featured or referenced in various specialized material: articles, books, presentations. One such paper, “Next-Generation Debuggers for Reverse Engineering”, which I co-authored, was first published in 2007 to accompany our presentation at Black Hat Europe. I later gave the same presentation at H2HC 4th edition and Ekoparty 3rd edition. For the record, this is the paper abstract:
Classical debuggers make use of an interface provided by the operating system in order to access the memory of programs while they execute. As this model is dominating in the industry and the community, we show that our novel embedded architecture is more adapted when debuggee systems are hostile and protected at the operating system level. This alternative modelization is also more performant as the debugger executes from inside the debuggee program and can read the memory of the host process directly. We give detailed information about how to keep memory unintrusiveness using a new technique called allocation proxying. We reveal how we developed the organization of our multi-architecture framework and its multiple modules so that they allow for graph-based binary code analysis, ad-hoc typing, compositional fingerprinting, program instrumentation, real-time tracing, multithread debugging and general hooking of systems. We reveal the reflective essence of our framework by embedding its internal structures in our own reverse engineering language, thus recalling concepts of aspect oriented programming.
While working on ERESI I became heavily interested in binary program analysis, and I made it the main theme of my BSc thesis, which I worked on during the first half of 2007. The final dissertation, “Developing an Intermediate Representation for the Analysis of Binary Code”, had the following abstract:
The field of Program Analysis is vast and complex. Even though it has many decades of study and advances now, some of the biggest and most pursued problems remain open for resolution. In particular, a quick search through the literature on the intersection between the disciplines of static analysis of binary programs and automated bug-finding reveals that there is a big window of opportunity open to scientists willing to engage in this exciting research field.
This work attempts to perform a survey on the state-of-the-art of the subjects touching the questions on static program analysis, binary analysis and automated bug-finding. Once properly contextualized, this document will introduce the ERESI framework, an open-source project on top of which all of this work’s implementation is based. Finally, the reader will find a detailed report of the work done to transform Intel IA-32 machine code into the ERESI LIR (Low-level Intermediate Representation), an important step to extend the analysis features of the framework in question.