Frida Engage Part One: Building an ELF Parser with Frida | VerSprite

Frida Engage Part One | Building an ELF Parser with Frida

Written By: Versprite

How to Use Frida’s Interceptor API

In this blog series we will be covering the endless possibilities and power of Frida. For those of you who have never heard of Frida, it is a dynamic instrumentation toolkit that allows you to inject JavaScript or your own libraries into native apps across multiple platforms. Frida is commonly used for hooking and manipulating functions. If you search the internet for tutorials on Frida, you will find many resources on how to use Frida’s Interceptor API, which gives you the ability to ‘intercept’ target function calls. In this series I would like to go beyond hooking functions and explore all of the crazy and cool things you can do when control of a process is at your fingertips.

Building an ELF Parser with Frida

For the first part of this blog series, I thought I would attempt to write a simple and generic ELF parser in JavaScript using Frida. There are many practical scenarios where you might want to do something like this when injecting into another process, but for this blog post it will just be for fun.

The environment I have set up is basic:

  • Nexus 5 running Lollipop
  • Latest Frida
  • PoC Application

The application I built for this project includes a shared-library with a single JNI function. This shared-library is what I will be parsing with Frida. The importance of this shared-library is only for the nifty things we will do in the second part of this blog series. If you are following along in your own testing environment, any shared-library or executable will do.

Opening and Reading the ELF

In order to parse our target ELF, we first need to open it and read in its contents. I have implemented this using the open and read syscalls, called through their libc wrappers.

int open(const char *path, int oflag, ...);

ssize_t read(int fd, void *buf, size_t count);

When implementing this in Frida, we will need to use the following Frida API(s):

  • Memory
  • NativeFunction
  • Module

When writing this code, what we should really be asking ourselves is: “How would I write this in C?” Let’s break down the steps we need in order to answer that question.

  • The first argument to the open syscall is a const char * to the path of the file. We can create this using Memory.allocUtf8String(path)
  • In order to call open, we need to obtain its address, so that we can use the NativeFunction constructor, which gives us the power to invoke open
  • Obtaining open’s address can be accomplished through Frida’s Module.findExportByName()
  • Calling open returns the file descriptor of the opened shared-library
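
The steps above can be sketched roughly as follows. This is a minimal sketch, assuming the Frida JavaScript API as it is used in this post (Module.findExportByName, Memory.allocUtf8String and NativeFunction); newer Frida releases may expose these calls differently. The getSymbolAddress name matches the helper used by the snippets later in this post, but its body here is an assumption, and openReadOnly is a made-up name for illustration.

```javascript
// Intended to run inside a Frida script, where Module, Memory and
// NativeFunction are provided as globals by the Frida runtime.

// Assumed body for the getSymbolAddress helper used later in this post.
function getSymbolAddress(moduleName, symbolName) {
    var address = Module.findExportByName(moduleName, symbolName);
    if (address === null) {
        throw new Error('Unable to resolve ' + symbolName + ' in ' + moduleName);
    }
    return address;
}

// Hypothetical wrapper: open a file read-only and return its file descriptor.
function openReadOnly(path) {
    var O_RDONLY = 0;
    // Keep the allocation in a variable so it stays alive during the call.
    var pathPtr = Memory.allocUtf8String(path);
    var open = new NativeFunction(
        getSymbolAddress('libc.so', 'open'), 'int', ['pointer', 'int']);
    var fd = open(pathPtr, O_RDONLY);
    if (fd < 0) {
        throw new Error('open() failed for ' + path);
    }
    return fd;
}
```

Throwing on failure, rather than logging and carrying on, keeps a bad descriptor from reaching later calls such as fstat and read.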

Now that we’ve successfully opened the shared-library, we need to determine the size of the file and read in its contents based on that size. I chose to call fstat in order to obtain the size of the shared-library.

int fstat(int fd, struct stat *buf);

  • fstat takes two arguments that we need to satisfy
  • The first argument is a file descriptor, which we should already have from our previous call to open
  • The second argument is a pointer to an allocated stat struct
  • We can allocate memory for the stat struct through Frida’s Memory.alloc()
  • Calling the fstat function requires the exact steps we used for calling open

Now we need to read the st_size member of the stat struct in order to obtain the size of the shared-library.

 off_t     st_size    For regular files, the file size in bytes. 
                     For symbolic links, the length in bytes of the 
                     pathname contained in the symbolic link.

When searching around for the size of off_t, all I could find was the following.

“blkcnt_t and off_t shall be signed integer types.”

With that in mind, and knowing that the st_size member is at 0x30 into the structure, we can use Frida’s Memory.readS32() to obtain the size.
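
To make the offset concrete, the same read can be reproduced outside of Frida with a plain DataView. This sketch assumes the little-endian layout of the 32-bit ARM test device and copies the bytes a0 25 00 00 00 00 00 00 that appear at offset 0x30 in the struct stat dump later in this post. The readStSize name is made up for illustration.

```javascript
// The 0x30 offset is the one used in this post; it is specific to that
// device and ABI and is not a portable constant.
var ST_SIZE_OFFSET = 0x30;

// Hypothetical helper: read st_size from a raw struct stat buffer.
function readStSize(statBytes) {
    var view = new DataView(
        statBytes.buffer, statBytes.byteOffset, statBytes.byteLength);
    // Signed 32-bit little-endian read, mirroring Memory.readS32().
    return view.getInt32(ST_SIZE_OFFSET, true);
}

// Rebuild the relevant part of the dump: zeros except for the size field.
var statBytes = new Uint8Array(0x40);
statBytes.set([0xa0, 0x25, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00], ST_SIZE_OFFSET);

console.log(readStSize(statBytes)); // 9632 (0x25a0)
```

In the dump, the four bytes following the size are zero, which suggests the field may be 64 bits wide on this platform. If so, a 32-bit read only sees the low half, which is still enough for files smaller than 2 GiB.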

function getFileSize(fd) {
    // A negative descriptor means the earlier open() call failed
    if (fd < 0) {
        return 0;
    }
    // TODO Get the actual size of this structure
    var statBuff = Memory.alloc(500);
    console.log('[+] struct stat --> ' + statBuff.toString());
    var fstatSymbol = getSymbolAddress('libc.so', 'fstat');
    console.log('[+] fstat --> ' + fstatSymbol);
    var fstat = new NativeFunction(fstatSymbol, 'int', ['int', 'pointer']);
    console.log('[+] Calling fstat() [!]');
    var ret = fstat(fd, statBuff);
    if (ret < 0) {
        console.log('[+] fstat --> failed [!]');
        return 0;
    }
    console.log(hexdump(statBuff, {
        offset: 0,
        length: 20,
        header: true,
        ansi: true
    }));
    // st_size sits at offset 0x30 of struct stat on this target
    var size = Memory.readS32(statBuff.add(0x30));
    if (size > 0) {
        console.log('[+] size of fd --> ' + size.toString());
        return size;
    } else {
        return 0;
    }
}

Finally we can implement the read syscall using memory we’ve allocated for the shared-library using the size we got back from fstat, and the file descriptor we got back from open.

function openAndReadLibrary(library_path) {
    var library_path_ptr = Memory.allocUtf8String(library_path);
    console.log('[+] path --> ' + library_path_ptr.toString());
    var open = getSymbolAddress('libc.so', 'open');
    console.log('[+] open --> ' + open.toString());
    var mOpen = new NativeFunction(open, 'int', ['pointer', 'int']);
    console.log('[+] Opening --> ' + library_path);
    // The second argument, 0, is O_RDONLY
    var fd = mOpen(library_path_ptr, 0);
    if (fd < 0) {
        console.log('[+] Failed to open --> ' + library_path);
        return -1;
    }
    console.log('[+] fd --> ' + fd.toString());
    var size = getFileSize(fd);
    if (size === 0) {
        console.log('[+] Unable to determine file size [!]');
        return -1;
    }
    var read_sym = getSymbolAddress('libc.so', 'read');
    var read = new NativeFunction(read_sym, 'int', ['int', 'pointer', 'long']);
    var rawElf = Memory.alloc(size);
    if (read(fd, rawElf, size) < 0) {
        console.log('[+] Unable to read ELF [!]');
        return -1;
    }
    console.log('[+] read --> ' + size + ' bytes [!]');
    console.log(hexdump(rawElf, {
        offset: 0,
        length: 20,
        header: true,
        ansi: true
    }));
    return rawElf;
}

Below is the output we get when running the ELF parsing Frida script.

[+] Running elf parser [!]
[+] path --> 0xa05af7c8
[+] open --> 0xb6e491dd
[+] Opening --> /data/data/com.versprite.poc/lib/libnative-lib.so
[+] fd --> 55
[+] struct stat --> 0xa05b0520
[+] fstat --> 0xb6e6d154
[+] Calling fstat() [!]
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  1c b3 00 00 00 00 00 00 00 00 00 00 57 7c 02 00  ............W|..
00000010  ed 81 00 00 01 00 00 00 e8 03 00 00 e8 03 00 00  ................
00000020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000030  a0 25 00 00 00 00 00 00 00 10 00 00 00 00 00 00  .%..............
[+] size of fd --> 9632
[+] read --> 9632 bytes [!]
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00  .ELF............
00000010  03 00 28 00

Parsing the ELF

For the sake of this blog series I have only implemented a very generic ELF parser, which includes the following capabilities:

  • Parse the ELF header
  • Parse the program header table
  • Parse the section header table

Being a noob to JavaScript, I was unsure how to access and visualize data once I had read it into a buffer from the shared-library. That is when I discovered DataView.

“The DataView view provides a low-level interface for reading and writing multiple number types in an ArrayBuffer irrespective of the platform’s endianness.”

First I want to process the ELF header. At the time of writing this blog post, I am only supporting 32-bit. For those of you who are unfamiliar with the ELF header structure, here is the 32-bit definition.

#define EI_NIDENT	16

typedef struct elf32_hdr{
  unsigned char	e_ident[EI_NIDENT];
  Elf32_Half	e_type;
  Elf32_Half	e_machine;
  Elf32_Word	e_version;
  Elf32_Addr	e_entry;  /* Entry point */
  Elf32_Off	e_phoff;
  Elf32_Off	e_shoff;
  Elf32_Word	e_flags;
  Elf32_Half	e_ehsize;
  Elf32_Half	e_phentsize;
  Elf32_Half	e_phnum;
  Elf32_Half	e_shentsize;
  Elf32_Half	e_shnum;
  Elf32_Half	e_shstrndx;
} Elf32_Ehdr;

I also love to use 010 Editor when parsing binary formats.  It allows me to visualize everything I am attempting to do code-wise. Parsing the ELF header is actually pretty simple, and can be accomplished in the following steps.

  • From the pointer we have to our ELF file in memory, use Frida’s Memory.readByteArray() with the size of Elf32_Ehdr to read the ELF header into a buffer
  • We can then construct a new DataView instance from this buffer
  • The DataView class contains methods that will allow us to read a data type from a specified byte offset
  • For example, if we wanted to read the e_type member of the Elf32_Ehdr structure, which is a 16-bit Elf32_Half at offset 0x10, we can do something like this
var e_type = elfHeaderDataView.getUint16(0x10, true);
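
Extending that idea to the whole header might look like the sketch below. It runs on a plain ArrayBuffer, so it can be tried outside of Frida; the parseElf32Header name is hypothetical and the field offsets are derived from the Elf32_Ehdr definition above, assuming a little-endian 32-bit file. Since Elf32_Half is a 16-bit type, e_type and e_machine are read as 16-bit values here. A 32-bit read at 0x10 captures both fields at once, which is where the 2621443 (0x00280003) e_type value in the sample output further down comes from.

```javascript
// Field offsets follow the Elf32_Ehdr definition (little-endian, 32-bit).
function parseElf32Header(buffer) {
    var view = new DataView(buffer);
    // The magic bytes are "\x7fELF"; compare them as one big-endian word.
    if (view.getUint32(0, false) !== 0x7f454c46) {
        throw new Error('Not an ELF file');
    }
    return {
        e_type:      view.getUint16(0x10, true),
        e_machine:   view.getUint16(0x12, true),
        e_version:   view.getUint32(0x14, true),
        e_entry:     view.getUint32(0x18, true),
        e_phoff:     view.getUint32(0x1c, true),
        e_shoff:     view.getUint32(0x20, true),
        e_flags:     view.getUint32(0x24, true),
        e_ehsize:    view.getUint16(0x28, true),
        e_phentsize: view.getUint16(0x2a, true),
        e_phnum:     view.getUint16(0x2c, true),
        e_shentsize: view.getUint16(0x2e, true),
        e_shnum:     view.getUint16(0x30, true),
        e_shstrndx:  view.getUint16(0x32, true)
    };
}

// The first 20 bytes of the test library, as dumped earlier in this post;
// the remaining header bytes are left as zero in this sample.
var sample = new Uint8Array(52);
sample.set([0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x03, 0x00, 0x28, 0x00]);

var header = parseElf32Header(sample.buffer);
console.log(header.e_type, header.e_machine); // 3 40

// For comparison, a 32-bit read at 0x10 packs both fields together.
console.log(new DataView(sample.buffer).getInt32(0x10, true)); // 2621443
```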

Processing the ELF program header table and section header table is another combination of using Memory.readByteArray() and DataViews. You can find those structures here.

https://raw.githubusercontent.com/torvalds/linux/master/include/uapi/linux/elf.h
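
As a starting point, a program header walk might look like the sketch below. It is not the code from the final script. It assumes the 32-byte little-endian Elf32_Phdr layout from elf.h (eight 32-bit fields starting with p_type and p_offset), and parseProgramHeaders and PT_NAMES are hypothetical names. The sample data is synthetic and only mimics the first two segments of the output shown further down.

```javascript
// Subset of p_type values from elf.h.
var PT_NAMES = {
    0: 'PT_NULL', 1: 'PT_LOAD', 2: 'PT_DYNAMIC',
    3: 'PT_INTERP', 4: 'PT_NOTE', 6: 'PT_PHDR'
};

// Walk e_phnum entries of e_phentsize bytes each, starting at e_phoff.
function parseProgramHeaders(buffer, e_phoff, e_phentsize, e_phnum) {
    var view = new DataView(buffer);
    var segments = [];
    for (var i = 0; i < e_phnum; i++) {
        var base = e_phoff + i * e_phentsize;
        var type = view.getUint32(base, true);
        segments.push({
            fileOffset: base,
            p_type:   PT_NAMES[type] || '0x' + type.toString(16),
            p_offset: view.getUint32(base + 0x04, true),
            p_vaddr:  view.getUint32(base + 0x08, true),
            p_filesz: view.getUint32(base + 0x10, true),
            p_memsz:  view.getUint32(base + 0x14, true),
            p_flags:  view.getUint32(base + 0x18, true)
        });
    }
    return segments;
}

// Synthetic image: two 32-byte entries placed at offset 0x34.
var phdrImage = new ArrayBuffer(0x34 + 2 * 32);
var phdrWriter = new DataView(phdrImage);
phdrWriter.setUint32(0x34, 6, true); // PT_PHDR
phdrWriter.setUint32(0x54, 1, true); // PT_LOAD

var segments = parseProgramHeaders(phdrImage, 0x34, 32, 2);
segments.forEach(function (seg) {
    console.log('[+] segment --> 0x' + seg.fileOffset.toString(16) + ' : ' + seg.p_type);
});
// [+] segment --> 0x34 : PT_PHDR
// [+] segment --> 0x54 : PT_LOAD
```

Inside Frida, the buffer would come from Memory.readByteArray() on the pointer to the ELF image, with e_phoff, e_phentsize and e_phnum taken from the parsed ELF header.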

I’ll leave this exercise up to the reader, but you can also check the final ELF parsing script here for more insight.

https://github.com/VerSprite/engage/blob/master/js/elf_parser.js

Here is some truncated output from the script.

[+] HEADERS -----------------------------
[+] e_type      --> 2621443
[+] e_machine   --> 40
[+] e_version   --> 1
[+] e_entry     --> 0
[+] e_phoff     --> 52
[+] e_shoff     --> 0x21b8
[+] e_flags     --> 0x5000200
[+] e_ehsize    --> 52
[+] e_phentsize --> 32
[+] e_phnum     --> 8
[+] e_shentsize --> 40
[+] e_shnum     --> 25
[+] e_shtrndx   --> 24

[+] SEGMENTS -----------------------------
[+] segment --> 0x34 : PT_PHDR
[+] segment --> 0x54 : PT_LOAD
[+] segment --> 0x74 : PT_LOAD
[+] segment --> 0x94 : PT_DYNAMIC
[+] segment --> 0xb4 : PT_NOTE

[+] SECTIONS -----------------------------
[+] .note.gnu.build-id : 0x21e0
[+]  s_addr   --> 0x134
[+]  s_offset --> 0x134
[+]  s_size   --> 0x24
[+] .dynsym : 0x2208
[+]  s_addr   --> 0x158
[+]  s_offset --> 0x158
[+]  s_size   --> 0xf0
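
Section names such as .dynsym are not stored in the section headers themselves. Each header holds an index (sh_name) into the section header string table, which is the section selected by e_shstrndx. The sketch below shows one way that lookup could work, assuming the 40-byte little-endian Elf32_Shdr layout from elf.h. The function names and the tiny synthetic image are made up for illustration and are not taken from the final script.

```javascript
// Read a NUL-terminated ASCII string starting at `offset`.
function readCString(bytes, offset) {
    var end = offset;
    while (end < bytes.length && bytes[end] !== 0) { end++; }
    return String.fromCharCode.apply(null, bytes.subarray(offset, end));
}

function parseSectionHeaders(buffer, e_shoff, e_shentsize, e_shnum, e_shstrndx) {
    var view = new DataView(buffer);
    var bytes = new Uint8Array(buffer);
    // sh_offset (0x10) of the string table section says where the names live.
    var strtabOffset = view.getUint32(e_shoff + e_shstrndx * e_shentsize + 0x10, true);
    var sections = [];
    for (var i = 0; i < e_shnum; i++) {
        var base = e_shoff + i * e_shentsize;
        sections.push({
            name:     readCString(bytes, strtabOffset + view.getUint32(base, true)),
            s_addr:   view.getUint32(base + 0x0c, true),
            s_offset: view.getUint32(base + 0x10, true),
            s_size:   view.getUint32(base + 0x14, true)
        });
    }
    return sections;
}

// Synthetic image: string table at offset 0, three section headers at 0x20.
var names = '\0.dynsym\0.shstrtab\0';
var shdrImage = new ArrayBuffer(0x20 + 3 * 40);
var shdrBytes = new Uint8Array(shdrImage);
for (var n = 0; n < names.length; n++) { shdrBytes[n] = names.charCodeAt(n); }

var shdrWriter = new DataView(shdrImage);
var dynsymHdr = 0x20 + 40;
var shstrtabHdr = 0x20 + 80;
shdrWriter.setUint32(dynsymHdr, 1, true);            // sh_name -> ".dynsym"
shdrWriter.setUint32(dynsymHdr + 0x0c, 0x158, true); // sh_addr
shdrWriter.setUint32(dynsymHdr + 0x10, 0x158, true); // sh_offset
shdrWriter.setUint32(dynsymHdr + 0x14, 0xf0, true);  // sh_size
shdrWriter.setUint32(shstrtabHdr, 9, true);          // sh_name -> ".shstrtab"
shdrWriter.setUint32(shstrtabHdr + 0x10, 0, true);   // table is at file offset 0
shdrWriter.setUint32(shstrtabHdr + 0x14, names.length, true);

var sections = parseSectionHeaders(shdrImage, 0x20, 40, 3, 2);
console.log(sections[1].name, '0x' + sections[1].s_size.toString(16)); // .dynsym 0xf0
```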

Frida and Parsing Binary Formats

I hope part one of this blog series demonstrated some pretty rad things you can do with Frida when it comes to parsing binary formats.  Check out part two, Shellcoding an Arm64 In-Memory Reverse TCP Shell with Frida, which covers more bad ass things you can do with Frida.

References

https://www.frida.re/docs/javascript-api/#interceptor

https://www.frida.re/docs/javascript-api/#NativeFunction

https://www.frida.re/docs/javascript-api/#module

https://linux.die.net/man/3/open

https://linux.die.net/man/2/read

https://linux.die.net/man/2/fstat

https://raw.githubusercontent.com/torvalds/linux/master/include/uapi/linux/elf.h

https://www.sweetscape.com/010editor/
