Frida Engage Part One | Building an ELF Parser with Frida
How to Use Frida’s Interceptor API
In this blog series we will cover the endless possibilities and power of Frida. For those of you who have never heard of Frida, it is a dynamic instrumentation toolkit that lets you inject JavaScript or your own libraries into native apps across multiple platforms. Frida is commonly used for hooking and manipulating functions. If you search the internet for tutorials on Frida, you will find many resources on how to use Frida’s Interceptor API, which gives you the ability to ‘intercept’ target function calls. In this series I would like to explore beyond just hooking functions, and into all of the crazy and cool things you can do when control of a process is at your fingertips.
Building an ELF Parser with Frida
For the first part of this blog series, I thought I would attempt to write a simple and generic ELF parser in JavaScript using Frida. There could be many practical scenarios where you would want to do something like this when injecting into another process, but for this blog post it will just be for fun.
The environment I have set up is basic:
- Nexus 5 running Lollipop
- Latest Frida
- PoC Application
The application I built for this project includes a shared-library with a single JNI function. This shared-library is what I will be parsing with Frida. The importance of this shared-library is only for the nifty things we will do in the second part of this blog series. If you are following along in your own testing environment, any shared-library or executable will do.
Opening and Reading the ELF
In order to parse our target ELF, we first need to open it and read in its contents. I have implemented this using the open and read syscalls.
int open(const char *path, int oflag, ...);
ssize_t read(int fd, void *buf, size_t count);
When implementing this in Frida, we will need to use the following Frida APIs:
- Memory
- NativeFunction
- Module
When writing this code, the question we should really be asking ourselves is: “How would I write this in C?” Let’s break down the steps needed to answer that question.
- The first argument to the open syscall is a const char * pointing to the path of the file. We can create this using Memory.allocUtf8String(path)
- In order to call open, we need to obtain its address, so that we can use the NativeFunction constructor, which gives us the power to invoke open
- Obtaining open’s address can be accomplished through Frida’s Module.findExportByName()
- After we call open, it will return the file descriptor for the opened shared-library
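The steps above might be sketched as follows. The listings later in this post call a getSymbolAddress helper that is never shown; the version here is an assumption that simply wraps Module.findExportByName(). The Frida globals are passed in as an api parameter (also an assumption, made so the functions can be exercised outside a Frida session). Newer Frida releases have reorganized some of these calls, so check the API documentation for the version you run.

```javascript
// Hypothetical helper: resolve an exported symbol, failing loudly if it is missing.
function getSymbolAddress(api, library, symbol) {
    var address = api.Module.findExportByName(library, symbol);
    if (address === null) {
        throw new Error('Unable to resolve ' + library + '!' + symbol);
    }
    return address;
}

// Open a file read-only inside the target process and return its descriptor.
function openReadOnly(api, path) {
    var O_RDONLY = 0;
    // const char * pointing at a NUL-terminated copy of the path.
    var pathPtr = api.Memory.allocUtf8String(path);
    var openAddr = getSymbolAddress(api, 'libc.so', 'open');
    // open() is variadic; only its two fixed arguments are needed for O_RDONLY.
    var open = new api.NativeFunction(openAddr, 'int', ['pointer', 'int']);
    var fd = open(pathPtr, O_RDONLY);
    if (fd < 0) {
        throw new Error('Failed to open ' + path);
    }
    return fd;
}

// Inside a Frida script (not executed here) the call would look like:
// var fd = openReadOnly(
//     { Memory: Memory, Module: Module, NativeFunction: NativeFunction },
//     '/data/data/com.versprite.poc/lib/libnative-lib.so');
```

Throwing on failure instead of logging and carrying on keeps a bad descriptor from reaching fstat and read later.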
Now that we’ve successfully opened the shared-library, we need to determine the size of the file and read in its contents based on that size. I chose to call fstat in order to obtain the size of the shared-library.
int fstat(int fd, struct stat *buf);
fstat takes two arguments that we need to satisfy:
- The first argument is a file descriptor, which we should already have from our previous call to open
- The second argument is a pointer to an allocated stat struct
- We can allocate memory for the stat struct through Frida’s Memory.alloc()
- Calling the fstat function requires the exact steps we used for calling open
Now we need to read the st_size member of the stat struct in order to obtain the size of the shared-library.
off_t st_size For regular files, the file size in bytes. For symbolic links, the length in bytes of the pathname contained in the symbolic link.
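When searching around for the size of off_t, all I could find was the following.
“blkcnt_t and off_t shall be signed integer types.”
With that in mind, and knowing that the st_size member is at offset 0x30 into the structure on this device, we can use Frida’s Memory.readS32() to obtain the size. The dump later in this post suggests the field is 8 bytes wide here, so this reads only its low 32 bits, which is enough for a file as small as our shared-library.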
    function getFileSize(fd) {
        // TODO Get the actual size of this structure
        var statBuff = Memory.alloc(500);
        console.log('[+] struct stat --> ' + statBuff.toString());

        var fstatSymbol = getSymbolAddress('libc.so', 'fstat');
        console.log('[+] fstat --> ' + fstatSymbol);

        var fstat = new NativeFunction(fstatSymbol, 'int', ['int', 'pointer']);
        console.log('[+] Calling fstat() [!]');
        // 0 is a valid descriptor, so only negative values are rejected
        if (fd < 0 || fstat(fd, statBuff) < 0) {
            console.log('[+] fstat --> failed [!]');
            return 0;
        }

        console.log(hexdump(statBuff, { offset: 0, length: 20, header: true, ansi: true }));

        // st_size sits at offset 0x30 of struct stat on this platform
        var size = Memory.readS32(statBuff.add(0x30));
        if (size > 0) {
            console.log('[+] size of fd --> ' + size.toString());
            return size;
        }
        return 0;
    }
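To make the offset concrete, the sketch below replays the 64 bytes of struct stat from the script output shown later in this post through a plain DataView, outside Frida. It assumes the little-endian 32-bit ARM layout of that device, where the 8 bytes at 0x30 hold st_size; other platforms lay out struct stat differently, so the offset is not portable.

```javascript
// The struct stat bytes printed by hexdump() in the sample output of this post.
var statHex =
    '1c b3 00 00 00 00 00 00 00 00 00 00 57 7c 02 00 ' +
    'ed 81 00 00 01 00 00 00 e8 03 00 00 e8 03 00 00 ' +
    '00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ' +
    'a0 25 00 00 00 00 00 00 00 10 00 00 00 00 00 00';

var statBytes = new Uint8Array(statHex.split(' ').map(function (b) {
    return parseInt(b, 16);
}));
var statView = new DataView(statBytes.buffer);

// Assumption: st_size starts at 0x30 (true for the dump above, not universal).
var ST_SIZE_OFFSET = 0x30;

// Low 32 bits only, which is what a signed 32-bit read at this offset returns.
var sizeLow = statView.getInt32(ST_SIZE_OFFSET, true);

// Full 64-bit value assembled from two 32-bit halves (exact below 2^53).
var sizeHigh = statView.getInt32(ST_SIZE_OFFSET + 4, true);
var size = sizeHigh * 0x100000000 + statView.getUint32(ST_SIZE_OFFSET, true);

console.log('st_size (low 32 bits) --> ' + sizeLow); // 9632
console.log('st_size (64-bit)      --> ' + size);    // 9632
```

Both reads agree because the upper half is zero; for a file of 2 GiB or more the 32-bit read would go negative and the 64-bit form would be required.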
Finally we can implement the read syscall, using memory we’ve allocated for the shared-library based on the size we got back from fstat, and the file descriptor we got back from open.
    function openAndReadLibrary(library_path) {
        var library_path_ptr = Memory.allocUtf8String(library_path);
        console.log('[+] path --> ' + library_path_ptr.toString());

        var open = getSymbolAddress('libc.so', 'open');
        console.log('[+] open --> ' + open.toString());

        var mOpen = new NativeFunction(open, 'int', ['pointer', 'int']);
        console.log('[+] Opening --> ' + library_path);
        var fd = mOpen(library_path_ptr, 0); // 0 == O_RDONLY
        if (fd < 0) {
            console.log('[+] Failed to open --> ' + library_path);
            return -1;
        }
        console.log('[+] fd --> ' + fd.toString());

        var size = getFileSize(fd);
        var read_sym = getSymbolAddress('libc.so', 'read');
        var read = new NativeFunction(read_sym, 'int', ['int', 'pointer', 'long']);
        var rawElf = Memory.alloc(size);
        if (read(fd, rawElf, size) < 0) {
            console.log('[+] Unable to read ELF [!]');
            return -1;
        }
        console.log('[+] read --> ' + size + ' bytes [!]');
        console.log(hexdump(rawElf, { offset: 0, length: 20, header: true, ansi: true }));
        return rawElf;
    }
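One caveat with the listing above: read is allowed to return fewer bytes than requested, so a single call is not guaranteed to fill the buffer. A minimal sketch of a retry loop follows, assuming read is a NativeFunction like the one created above and buf is the pointer returned by Memory.alloc(). The name readFully is hypothetical; it is not part of Frida or of the original script.

```javascript
// Keep calling read() until `size` bytes have arrived, EOF is hit, or an error occurs.
// Returns the number of bytes actually read, or -1 on error.
function readFully(read, fd, buf, size) {
    var total = 0;
    while (total < size) {
        // buf.add(total) advances the destination past what is already filled.
        var n = read(fd, buf.add(total), size - total);
        if (n < 0) {
            return -1; // read() reported an error
        }
        if (n === 0) {
            break; // end of file before `size` bytes were available
        }
        total += n;
    }
    return total;
}

// Inside openAndReadLibrary (not executed here) this could replace the single call:
// if (readFully(read, fd, rawElf, size) !== size) { /* handle the short read */ }
```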
Below is the output we get when running the ELF parsing Frida script.
    [+] Running elf parser [!]
    [+] path --> 0xa05af7c8
    [+] open --> 0xb6e491dd
    [+] Opening --> /data/data/com.versprite.poc/lib/libnative-lib.so
    [+] fd --> 55
    [+] struct stat --> 0xa05b0520
    [+] fstat --> 0xb6e6d154
    [+] Calling fstat() [!]
               0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
    00000000  1c b3 00 00 00 00 00 00 00 00 00 00 57 7c 02 00  ............W|..
    00000010  ed 81 00 00 01 00 00 00 e8 03 00 00 e8 03 00 00  ................
    00000020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000030  a0 25 00 00 00 00 00 00 00 10 00 00 00 00 00 00  .%..............
    [+] size of fd --> 9632
    [+] read --> 9632 bytes [!]
               0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
    00000000  7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00  .ELF............
    00000010  03 00 28 00
Parsing the ELF
For the sake of this blog series I have only implemented a very generic ELF parser, which includes the following capabilities:
- Parse the ELF header
- Parse program header table
- Parse the section header table
Now being a noob to JavaScript, I was unsure how to access and visualize data once I had read it into a buffer from the shared-library. That is when I discovered DataView.
“The DataView view provides a low-level interface for reading and writing multiple number types in an ArrayBuffer irrespective of the platform’s endianness.”
First I want to process the ELF header. At the time of writing this blog post, I am only supporting 32-bit. For those of you who are unfamiliar with the ELF header structure, here is the 32-bit definition.
    #define EI_NIDENT 16

    typedef struct elf32_hdr {
        unsigned char e_ident[EI_NIDENT];
        Elf32_Half    e_type;
        Elf32_Half    e_machine;
        Elf32_Word    e_version;
        Elf32_Addr    e_entry;  /* Entry point */
        Elf32_Off     e_phoff;
        Elf32_Off     e_shoff;
        Elf32_Word    e_flags;
        Elf32_Half    e_ehsize;
        Elf32_Half    e_phentsize;
        Elf32_Half    e_phnum;
        Elf32_Half    e_shentsize;
        Elf32_Half    e_shnum;
        Elf32_Half    e_shstrndx;
    } Elf32_Ehdr;
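Because only 32-bit ELF files are supported here, it is worth checking e_ident before reading anything else. The sketch below assumes the header bytes are already in an ArrayBuffer (for example from Memory.readByteArray()). The index and constant names follow elf.h, while checkIdent is a hypothetical helper.

```javascript
// Indexes into e_ident and the values of interest, as named in elf.h.
var EI_CLASS = 4;
var EI_DATA = 5;
var ELFCLASS32 = 1;
var ELFDATA2LSB = 1;

// Validate the magic bytes and report the class and byte order of the file.
function checkIdent(buffer) {
    var ident = new Uint8Array(buffer, 0, 16);
    var hasMagic = ident[0] === 0x7f && ident[1] === 0x45 &&   // 0x7f 'E'
                   ident[2] === 0x4c && ident[3] === 0x46;     // 'L' 'F'
    if (!hasMagic) {
        throw new Error('Not an ELF file');
    }
    return {
        is32Bit: ident[EI_CLASS] === ELFCLASS32,
        littleEndian: ident[EI_DATA] === ELFDATA2LSB
    };
}

// First 16 bytes of libnative-lib.so as printed in the sample output of this post.
var sampleIdent = new Uint8Array([0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00,
                                  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]);
var identInfo = checkIdent(sampleIdent.buffer);
console.log('32-bit: ' + identInfo.is32Bit + ', little-endian: ' + identInfo.littleEndian);
```

The littleEndian flag can be passed straight through as the second argument of the DataView getters used for the remaining fields.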
I also love to use 010 Editor when parsing binary formats. It allows me to visualize everything I am attempting to do code-wise. Parsing the ELF header is actually pretty simple, and can be accomplished in the following steps.
- From the pointer we have to our ELF file in memory, use Frida’s Memory.readByteArray() with the size of Elf32_Ehdr to read the ELF header into a buffer
- We can then construct a new DataView instance from this buffer
- The DataView class contains methods that allow us to read a data type from a specified index
- For example, if we wanted to read the e_type member of the Elf32_Ehdr structure, which is a 16-bit Elf32_Half at offset 0x10, we can do something like this
var e_type = elfHeaderDataView.getUint16(0x10, true);
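Applying the same idea to every field gives a complete header parser. The sketch below is one way to do it, not the code from the final script: parseElf32Header is a hypothetical name, the offsets are derived from the Elf32_Ehdr definition above (Elf32_Half is 16 bits, the other types are 32 bits), and little-endian data is assumed. It works on any ArrayBuffer, so inside Frida it could be fed the result of Memory.readByteArray() over the first 52 bytes.

```javascript
// Parse a little-endian Elf32_Ehdr from the start of an ArrayBuffer.
function parseElf32Header(buffer) {
    var view = new DataView(buffer);
    var le = true; // assumption: ELFDATA2LSB
    return {
        e_type:      view.getUint16(0x10, le), // Elf32_Half fields are 16 bits wide
        e_machine:   view.getUint16(0x12, le),
        e_version:   view.getUint32(0x14, le),
        e_entry:     view.getUint32(0x18, le),
        e_phoff:     view.getUint32(0x1c, le),
        e_shoff:     view.getUint32(0x20, le),
        e_flags:     view.getUint32(0x24, le),
        e_ehsize:    view.getUint16(0x28, le),
        e_phentsize: view.getUint16(0x2a, le),
        e_phnum:     view.getUint16(0x2c, le),
        e_shentsize: view.getUint16(0x2e, le),
        e_shnum:     view.getUint16(0x30, le),
        e_shstrndx:  view.getUint16(0x32, le)
    };
}

// Demo: a 52-byte header whose first 20 bytes match the hexdump in this post.
var demoHeader = new ArrayBuffer(52);
new Uint8Array(demoHeader).set([
    0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x28, 0x00
]);
var parsedHeader = parseElf32Header(demoHeader);
console.log('[+] e_type --> ' + parsedHeader.e_type);       // 3
console.log('[+] e_machine --> ' + parsedHeader.e_machine); // 40
```

Reading e_type with a 32-bit getter at 0x10 would also pull in e_machine, since the two 16-bit fields are adjacent.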
Processing the ELF program header table and section header table is another combination of Memory.readByteArray() and DataViews. You can find those structures here.
https://raw.githubusercontent.com/torvalds/linux/master/include/uapi/linux/elf.h
I’ll leave this exercise up to the reader, but you can also check the final ELF parsing script here for more insight.
https://github.com/VerSprite/engage/blob/master/js/elf_parser.js
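As a starting point for that exercise, here is a hedged sketch of the program header half. It assumes a little-endian 32-bit file and the Elf32_Phdr layout from elf.h (eight 32-bit fields, 32 bytes per entry). parseProgramHeaders and its parameters are hypothetical names, only the common PT_* values are named, and the demo buffer is synthetic rather than taken from the real library.

```javascript
// Common segment types from elf.h; anything else is printed as hex.
var PT_NAMES = {
    0: 'PT_NULL', 1: 'PT_LOAD', 2: 'PT_DYNAMIC', 3: 'PT_INTERP',
    4: 'PT_NOTE', 5: 'PT_SHLIB', 6: 'PT_PHDR', 7: 'PT_TLS'
};

// Walk e_phnum entries of e_phentsize bytes each, starting at e_phoff.
function parseProgramHeaders(buffer, e_phoff, e_phentsize, e_phnum) {
    var view = new DataView(buffer);
    var segments = [];
    for (var i = 0; i < e_phnum; i++) {
        var base = e_phoff + i * e_phentsize;
        var p_type = view.getUint32(base, true);
        segments.push({
            headerOffset: base, // file offset of this Elf32_Phdr entry
            p_type: p_type,
            name: PT_NAMES[p_type] || '0x' + p_type.toString(16),
            p_offset: view.getUint32(base + 4, true),
            p_vaddr:  view.getUint32(base + 8, true),
            p_paddr:  view.getUint32(base + 12, true),
            p_filesz: view.getUint32(base + 16, true),
            p_memsz:  view.getUint32(base + 20, true),
            p_flags:  view.getUint32(base + 24, true),
            p_align:  view.getUint32(base + 28, true)
        });
    }
    return segments;
}

// Synthetic demo: two program headers placed right after a 52-byte ELF header.
var phImage = new ArrayBuffer(0x34 + 2 * 32);
var phWriter = new DataView(phImage);
phWriter.setUint32(0x34, 6, true); // PT_PHDR
phWriter.setUint32(0x54, 1, true); // PT_LOAD
var demoSegments = parseProgramHeaders(phImage, 0x34, 32, 2);
demoSegments.forEach(function (s) {
    console.log('[+] segment --> 0x' + s.headerOffset.toString(16) + ' : ' + s.name);
});
```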
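Here is some truncated output from the script. Note that the e_type value shown, 2621443 (0x00280003), comes from a 32-bit read at offset 0x10: the low 16 bits are the actual e_type (3, ET_DYN) and the high 16 bits are e_machine (40).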
    [+] HEADERS -----------------------------
    [+] e_type --> 2621443
    [+] e_machine --> 40
    [+] e_version --> 1
    [+] e_entry --> 0
    [+] e_phoff --> 52
    [+] e_shoff --> 0x21b8
    [+] e_flags --> 0x5000200
    [+] e_ehsize --> 52
    [+] e_phentsize --> 32
    [+] e_phnum --> 8
    [+] e_shentsize --> 40
    [+] e_shnum --> 25
    [+] e_shtrndx --> 24
    [+] SEGMENTS -----------------------------
    [+] segment --> 0x34 : PT_PHDR
    [+] segment --> 0x54 : PT_LOAD
    [+] segment --> 0x74 : PT_LOAD
    [+] segment --> 0x94 : PT_DYNAMIC
    [+] segment --> 0xb4 : PT_NOTE
    [+] SECTIONS -----------------------------
    [+] .note.gnu.build-id : 0x21e0
    [+] s_addr --> 0x134
    [+] s_offset --> 0x134
    [+] s_size --> 0x24
    [+] .dynsym : 0x2208
    [+] s_addr --> 0x158
    [+] s_offset --> 0x158
    [+] s_size --> 0xf0
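The section names in this output are not stored in the section headers themselves. Each Elf32_Shdr carries an sh_name offset into the section header string table, which is the section at index e_shstrndx. Below is a sketch of that lookup, assuming a little-endian 32-bit file and the 40-byte Elf32_Shdr layout from elf.h. The function names are hypothetical and the demo buffer is synthetic rather than taken from the real library.

```javascript
// Read a NUL-terminated ASCII string starting at `offset`.
function readCString(bytes, offset) {
    var s = '';
    for (var i = offset; i < bytes.length && bytes[i] !== 0; i++) {
        s += String.fromCharCode(bytes[i]);
    }
    return s;
}

// Walk the section header table and resolve each name through the string table.
function parseSectionHeaders(buffer, e_shoff, e_shentsize, e_shnum, e_shstrndx) {
    var view = new DataView(buffer);
    var bytes = new Uint8Array(buffer);
    function field(index, fieldOffset) {
        return view.getUint32(e_shoff + index * e_shentsize + fieldOffset, true);
    }
    var strtabOffset = field(e_shstrndx, 16); // sh_offset of the string table section
    var sections = [];
    for (var i = 0; i < e_shnum; i++) {
        sections.push({
            headerOffset: e_shoff + i * e_shentsize,
            name: readCString(bytes, strtabOffset + field(i, 0)), // sh_name
            s_addr: field(i, 12),   // sh_addr
            s_offset: field(i, 16), // sh_offset
            s_size: field(i, 20)    // sh_size
        });
    }
    return sections;
}

// Synthetic demo: string table at offset 0, two section headers at offset 32.
var strtab = '\0.dynsym\0.shstrtab\0';
var shImage = new ArrayBuffer(32 + 2 * 40);
var shBytes = new Uint8Array(shImage);
for (var c = 0; c < strtab.length; c++) {
    shBytes[c] = strtab.charCodeAt(c);
}
var shWriter = new DataView(shImage);
shWriter.setUint32(32 + 0, 1, true);      // section 0: sh_name -> ".dynsym"
shWriter.setUint32(32 + 12, 0x158, true); // sh_addr
shWriter.setUint32(32 + 16, 0x158, true); // sh_offset
shWriter.setUint32(32 + 20, 0xf0, true);  // sh_size
shWriter.setUint32(72 + 0, 9, true);      // section 1: sh_name -> ".shstrtab"
shWriter.setUint32(72 + 16, 0, true);     // string table lives at offset 0
shWriter.setUint32(72 + 20, strtab.length, true);

var demoSections = parseSectionHeaders(shImage, 32, 40, 2, 1);
demoSections.forEach(function (s) {
    console.log('[+] ' + s.name + ' : 0x' + s.headerOffset.toString(16));
    console.log('[+] s_size --> 0x' + s.s_size.toString(16));
});
```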
Frida and Parsing Binary Formats
I hope part one of this blog series demonstrated some pretty rad things you can do with Frida when it comes to parsing binary formats. Check out part two, Shellcoding an Arm64 In-Memory Reverse TCP Shell with Frida, which covers more bad ass things you can do with Frida.
References
https://www.frida.re/docs/javascript-api/#interceptor
https://www.frida.re/docs/javascript-api/#NativeFunction
https://www.frida.re/docs/javascript-api/#module
https://linux.die.net/man/3/open
https://linux.die.net/man/2/read
https://linux.die.net/man/2/fstat
https://raw.githubusercontent.com/torvalds/linux/master/include/uapi/linux/elf.h
https://www.sweetscape.com/010editor/