Some Analysis of the Backdoored Backdoor

Update: Shortly after reading my post, Willem Pinckaers pointed out that the reseed_system_prng function sets the global variable system_prng_bufpos to 32. This means that after the first invocation of this function, the for loop right after the reseed call in system_prng_gen_block never executes. Hence, the ANSI X9.31 PRNG code is completely non-functional.

Recently, the internet circus called Twitter has been abuzz with news of a plurality of backdoors found in some versions of Juniper’s ScreenOS operating system. While Fox-IT and then HD Moore quickly found the backdoor password allowing SSH and Telnet access, the exact mechanisms underlying the VPN decryption backdoor are unclear at this point. This blog post tries to summarize my findings of the last couple of days regarding the apparently backdoored Dual_EC PRNG found in ScreenOS 6.3.0r12 and the other affected firmware revisions listed in Juniper’s 2015-12 Out of Cycle Security Bulletin: ScreenOS.

NIST publication SP 800-90A describes a family of pseudo-random number generators called Dual_EC DRBG (deterministic random bit generator is the official designation used in the standard) for different elliptic curves (NIST curves P-256, P-384 and P-521). For each of these PRNGs, two parameters are needed: two points on the elliptic curve that are called P and Q. These points are also specified in Appendix A.1.1 of the same standard.

During the CRYPTO 2007 rump session, Niels Ferguson and Dan Shumow demonstrated that if the points are not randomly generated, but carefully chosen in advance, the security of Dual_EC DRBG can be subverted by the party doing the choosing; effectively backdooring the PRNG. Namely if one chooses P, Q such that Q=P*e holds for a value e that is kept secret, it will allow the party that generated said P, Q to recover the internal state of the PRNG from observed output in a computationally “cheap fashion” – hence instances of Dual_EC PRNG for which the provenance of the points P and Q is unknown are susceptible to having been backdoored. Parties that are not in possession of the value e can obtain it by solving the discrete log problem for e on the elliptic curve; but for the discrete logarithm problem on prime curves such as P-256, no sub-exponential algorithms are currently known. In fact, unless quantum computers capable of running Shor’s algorithm on more than a handful of bits become reality, conventional cryptographic wisdom places the strength of P-256 at a 128-bit symmetric key security level.
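The state-recovery argument can be sketched on a toy curve. The curve (y^2 = x^3 + 2x + 2 over GF(17), group order 19), the point P = (5, 1), the secret e = 3 and all function names below are assumptions chosen for illustration; they are not taken from SP 800-90A or from ScreenOS. The sketch also emits the full x-coordinate, whereas the real generator for P-256 truncates it, so a real attacker would additionally have to try the candidates for the missing bits. With Q = P*e, the attacker uses d = e^-1 modulo the group order, so that P = Q*d:

```c
#include <assert.h>

/* Toy curve y^2 = x^3 + 2x + 2 over GF(17); its point group has prime order 19.
 * Illustrative parameters only, unrelated to P-256. */
enum { FIELD_P = 17, CURVE_A = 2, CURVE_B = 2, GROUP_ORDER = 19 };

typedef struct { int x, y, inf; } point;

static int mod(int v, int m) { v %= m; return v < 0 ? v + m : v; }

/* Brute-force modular inverse; acceptable only at toy sizes. */
static int inv_mod(int v, int m)
{
    for (int i = 1; i < m; i++)
        if (mod(v * i, m) == 1)
            return i;
    return 0;
}

static point ec_add(point a, point b)
{
    point r = { 0, 0, 1 };                      /* point at infinity */
    int lam;
    if (a.inf) return b;
    if (b.inf) return a;
    if (a.x == b.x && mod(a.y + b.y, FIELD_P) == 0)
        return r;                               /* a + (-a) */
    if (a.x == b.x)                             /* doubling */
        lam = mod((3 * a.x * a.x + CURVE_A) * inv_mod(2 * a.y, FIELD_P), FIELD_P);
    else
        lam = mod((b.y - a.y) * inv_mod(b.x - a.x, FIELD_P), FIELD_P);
    r.x = mod(lam * lam - a.x - b.x, FIELD_P);
    r.y = mod(lam * (a.x - r.x) - a.y, FIELD_P);
    r.inf = 0;
    return r;
}

/* Double-and-add scalar multiplication. */
static point ec_mul(int k, point p)
{
    point r = { 0, 0, 1 };
    for (k = mod(k, GROUP_ORDER); k > 0; k >>= 1) {
        if (k & 1) r = ec_add(r, p);
        p = ec_add(p, p);
    }
    return r;
}

/* One toy Dual_EC step: advance the state via P, emit an output via Q. */
static int dual_ec_next(int *state, point P, point Q)
{
    *state = ec_mul(*state, P).x;
    return ec_mul(*state, Q).x;
}

/* Given one untruncated output r and d with P = Q*d, predict the next output. */
static int predict_next(int r, int d, point Q)
{
    assert(r >= 0 && r < FIELD_P);
    for (int y = 0; y < FIELD_P; y++) {
        if (mod(y * y, FIELD_P) != mod(r * r * r + CURVE_A * r + CURVE_B, FIELD_P))
            continue;
        point R = { r, y, 0 };                  /* R = +/- state*Q */
        int next_state = ec_mul(d, R).x;        /* x(R*d) = x(state*P) */
        return ec_mul(next_state, Q).x;
    }
    return -1;                                  /* r is not a valid x-coordinate */
}
```

Calling dual_ec_next twice from a seed and feeding only the first output to predict_next should reproduce the second output. The sign ambiguity of the lifted point does not matter, because a point and its negative share the same x-coordinate.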

After the Snowden revelations uncovered Project BULLRUN and gave stronger indications of the compromise of the proposed Dual_EC parameters in SP 800-90A, Checkoway et al. presented a paper On the Practical Exploitability of Dual EC in TLS Implementations at Usenix Security 2014. In July 2015, Bernstein, Lange and Niederhagen wrote an excellent article on the history of Dual EC and how to exploit Dual EC-based backdoors.

Alas, while Juniper used Dual_EC_DRBG with the P-256 NIST curve and the point P specified in SP 800-90A in ScreenOS — the operating system running on NetScreen VPN gateways — they chose to use a different point Q and not the one supplied in the standard for P-256. When the Snowden revelations in 2013 shone light on Project BULLRUN and the compromise of Dual_EC_DRBG by the NSA, Juniper responded as follows in knowledge base article KB28205:

ScreenOS does make use of the Dual_EC_DRBG standard, but is designed to not use Dual_EC_DRBG as its primary random number generator. ScreenOS uses it in a way that should not be vulnerable to the possible issue that has been brought to light. Instead of using the NIST recommended curve points it uses self-generated basis points and then takes the output as an input to FIPS/ANSI X.9.31 PRNG, which is the random number generator used in ScreenOS cryptographic operations.

However, apparently starting in August 2012 (release date according to release notes for 6.3.0r12), Juniper started shipping ScreenOS firmware images with a different point Q. Adam Caudill first noted this difference after HD Moore posted a diff of strings found in the SSG 500 6.2.0r14 and the 6.2.0r15 firmware. As we can deduce from their recent security advisory and the fact that they reverted to the old value of Q in the patched images, this was a change not authored by them. Apparently Juniper only realised this recently and not when they were issuing KB28205. This led us to investigate the change more thoroughly, which led to the discovery of its use in a Dual_EC PRNG, as documented by Adam Langley. This discovery came quickly once I realized that ScreenOS uses OpenSSL as its underlying crypto library; a well-kept secret among people who have reversed products containing OpenSSL is that all those EC_PUT_error macros sprinkled over the OpenSSL codebase are mighty useful for identifying functions and hence getting a hook into the codebase.

It stands to reason that whoever managed to slip in their own Q will also know the corresponding e such that P*e=Q (the value P was unchanged from the standard) and hence is able to recover the internal state of the backdoored Dual_EC generator from the generator's output. What is unknown, however, is what an attack would look like for the PRNG cascade employed by Juniper’s ScreenOS.

Since there is no public description of this PRNG cascade, I analysed firmware version 6.3.0r12 of a Netscreen SSG 20 to investigate this issue further. This was the first version of 6.3.0 that Juniper indicates as having been backdoored. Thankfully, HD Moore already wrote up the details of how to unpack and load the firmware images in IDA in his blog post, so I did not have to do that again here.

Static analysis indicates that the output of the Dual_EC generator indeed is not used directly, but rather only to reseed an ANSI X9.31 PRNG. Besides the unused EC PRNG known-answer test function, a function we call reseed_system_prng is the only one that references the ec_prng_generate_output function. Caveat: we may be overlooking a dynamically generated indirect call to the Dual_EC generator that leaks its state at some point; superficial BinDiffing of 6.3.0r11 and 6.3.0r12 however did not show any leads into that direction. Further analysis using a JTAG debugger on a live device hopefully will show us more.

The “system” PRNG is then used throughout ScreenOS to generate random values, for instance to construct IKE nonces, random OpenSSL BNs etc. This system PRNG generates output in 32 byte blocks using a function we chose to call system_prng_gen_block:

// This generates a block of 32 random bytes
void system_prng_gen_block(int a1)
{
  int v3;
  int v4;
  unsigned int i;
  unsigned int timeval[2];

  timeval[0] = 0;
  timeval[1] = ixp425_read_timestamp_timer();
  system_prng_bufpos = 0;
  ++sysprng_num_gen_blocks;
  if ( !prng_does_not_require_reseeding() )
    reseed_system_prng();
  for ( ; system_prng_bufpos <= 31; system_prng_bufpos += 8 )
  {
    memcpy(&prev_prng_seed_part1, &ansi_x9_31_seed, 8);
    memcpy(&prev_generator_out, generator_outbuf, 8);
    ansi_x9_31_update(timeval, &ansi_x9_31_seed, &ansi_x9_31_3des_key, generator_outbuf);
    if ( is_fips_enabled(0) )
    {
      if ( !memcmp(&ansi_x9_31_seed, &prev_prng_seed_part1, 8) || !memcmp(generator_outbuf, &prev_generator_out, 8) )
      {
        log_dbgmsg3(0, 6, get_current_vsys(0), "FIPS ERROR: PRNG failure, duplicate random output\n", timeval[0], timeval[1]);
        /* 0x404100E -> "Failed to generate random." */
        log_dbgmsg(0x404100E);
        log_dbgmsg4("FIPS ERROR: PRNG failure, duplicate random output\n", 11);
      }
    }
    for ( i = 0; i < system_prng_bufpos; i += 8 )
    {
      if ( !memcmp(&system_prng_output_buffer[i], generator_outbuf, 8) )
      {
        log_dbgmsg3(0, 6, get_current_vsys(0), "FIPS ERROR: PRNG failure, duplicate random output\n", timeval[0], timeval[1]);
        log_dbgmsg4("FIPS ERROR: PRNG failure, duplicate random output\n", 11);
      }
    }
    memcpy(&system_prng_output_buffer[system_prng_bufpos], generator_outbuf, 8);
  }
}

As we see, before any output is generated, this function calls another function which I named prng_does_not_require_reseeding. This function reads out a flag that is set to zero by default. What this means is that by default the system PRNG is reseeded from the Dual_EC PRNG for each output block generated! This periodic reseeding can apparently be turned off using the command set key one-stage-rng. However, I was not able to find this command in the ScreenOS documentation, only in the firmware binary. The reseeding of the system PRNG is done by getting 32 bytes of output from the Dual_EC generator and splitting this into 8 bytes of seed and 24 bytes of key material for an X9.31 PRNG:

void reseed_system_prng()
{
  system_prng_state[0] = 0;
  /* fetch 32 bytes of Dual_EC output into the system PRNG output buffer */
  if ( ec_prng_gen_keystream_with_checks(system_prng_output_buffer, 32) != 32 )
    log_dbgmsg4("FIPS ERROR: PRNG failure, unable to reseed\n", 11);
  /* bytes 0..7 become the X9.31 seed, bytes 8..31 the 3DES key */
  memcpy(&ansi_x9_31_seed, system_prng_output_buffer, 8u);
  memcpy(&ansi_x9_31_3des_key, &system_prng_output_buffer[8], 24u);
  system_prng_bufpos = 32;
}
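The effect described in the update at the top of this post can be reproduced with a small model of the control flow. This is only a sketch: the stub functions and the model_ names are hypothetical stand-ins, and only the handling of the buffer position is taken from the decompiled listings above.

```c
#include <string.h>

static unsigned char out_buf[32];   /* models system_prng_output_buffer */
static unsigned int bufpos;         /* models system_prng_bufpos */
static int x931_calls;              /* how often the X9.31 stage ran */
static int one_stage_rng;           /* 0 = default: reseed for every block */

/* Stand-in for the Dual_EC generator: fills the buffer with a marker byte. */
static void stub_dual_ec(unsigned char *dst, size_t n) { memset(dst, 0xEC, n); }

/* Stand-in for ansi_x9_31_update: writes a different marker and counts calls. */
static void stub_x931(unsigned char *dst) { memset(dst, 0x31, 8); x931_calls++; }

static void model_reseed(void)
{
    stub_dual_ec(out_buf, 32);
    bufpos = 32;                    /* the assignment Willem Pinckaers pointed out */
}

static void model_gen_block(void)
{
    bufpos = 0;
    if (!one_stage_rng)
        model_reseed();
    for (; bufpos <= 31; bufpos += 8)   /* never entered after a reseed */
        stub_x931(&out_buf[bufpos]);
}
```

With one_stage_rng left at 0, model_gen_block never reaches the X9.31 stub and the buffer keeps the bytes written by the Dual_EC stub. If the decompilation is accurate, the same would hold for the real output buffer.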

The function ansi_x9_31_update works as one expects it to. Each call to this function calculates 8 bytes of keystream. This is done by encrypting a timer value that is obtained by directly reading a hardware register in our case (IXP425 platform) with the 192-bit 3DES key ansi_x9_31_3des_key. Furthermore, the function updates the seed value ansi_x9_31_seed with each call. In the below diagram, T is represented by the variable timeval, K by ansi_x9_31_3des_key, Vi by the value of ansi_x9_31_seed before the call and Vi+1 by the value of ansi_x9_31_seed after; Ri denotes the output:

[Figure: ANSI X9.31 PRNG]
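In code, the textbook X9.31 step reads as follows. This is a minimal sketch of the standard construction, not the decompiled firmware routine: the block cipher is passed in as a callback, the 8-byte blocks are modelled as uint64_t, and toy_enc is a hypothetical stand-in that is neither 3DES nor secure.

```c
#include <stdint.h>

typedef uint64_t (*block_enc_fn)(uint64_t key, uint64_t block);

/* One X9.31 step: returns R_i and updates *v from V_i to V_{i+1}. */
static uint64_t x931_step(block_enc_fn enc, uint64_t key, uint64_t t, uint64_t *v)
{
    uint64_t i = enc(key, t);        /* I       = E_K(T)         */
    uint64_t r = enc(key, i ^ *v);   /* R_i     = E_K(I xor V_i) */
    *v = enc(key, r ^ i);            /* V_{i+1} = E_K(R_i xor I) */
    return r;
}

/* Hypothetical stand-in cipher for experiments only: XOR with the key, then rotate. */
static uint64_t toy_enc(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    return (x << 13) | (x >> 51);
}
```

Note that for a fixed key K and seed V_i, the timer value T is the only fresh input to each step.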

While X9.31-style PRNGs are known to be fragile, I currently do not see an easy way to do passive decryption of VPN traffic using the above ScreenOS Dual_EC backdoor, even if the value e were known to me. Although recovering the internal state of the Dual_EC generator would reduce the entropy of the PRNG output to at most 32 bits (the timer value), there just is not any direct output of Dual_EC visible that would allow recovering its internal state.
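To make the 32-bit bound concrete: if an attacker did know the key and seed derived from Dual_EC, each output block would be determined by the timer alone, so a search over timer values would suffice. The sketch below assumes the textbook X9.31 output formula and a hypothetical stand-in cipher (toy_enc, not 3DES); how the timer is laid out within the 8-byte block T is also an assumption.

```c
#include <stdint.h>

typedef uint64_t (*block_enc_fn)(uint64_t key, uint64_t block);

/* Textbook X9.31 output for timer block t and seed v: R = E_K(E_K(T) xor V). */
static uint64_t x931_output(block_enc_fn enc, uint64_t key, uint64_t t, uint64_t v)
{
    return enc(key, enc(key, t) ^ v);
}

/* Try every timer value in [lo, hi]; returns 1 and stores the match, else 0. */
static int recover_timer(block_enc_fn enc, uint64_t key, uint64_t v,
                         uint64_t observed, uint32_t lo, uint32_t hi,
                         uint32_t *found)
{
    for (uint64_t t = lo; t <= hi; t++) {
        if (x931_output(enc, key, t, v) == observed) {
            *found = (uint32_t)t;
            return 1;
        }
    }
    return 0;
}

/* Hypothetical stand-in cipher for experiments only: XOR with the key, then rotate. */
static uint64_t toy_enc(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    return (x << 13) | (x >> 51);
}
```

Passing 0 and UINT32_MAX as the bounds covers the full timer range; the loop counter is 64 bits wide so that the upper bound does not wrap.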

Maybe I am missing a direct leak of Dual_EC output somewhere, maybe I am overlooking an attack on the above X9.31 construction or the cascaded PRNG that does not involve breaking 3DES; maybe there’s another subtle change in the code that I am missing which breaks the whole thing. Juniper clearly stated that a change starting in 6.3.0r12 enables passive decryption of VPN traffic; given the fact that in the patched 6.3.0r12b version they reverted the point Q to the one contained in ScreenOS 6.3.0r11, it seems very likely that a changed and reverted point for the Dual_EC generator gives rise to this vulnerability. Last but not least (and I rate this as extremely unlikely given the nature of the backdoor), maybe 6.3.0r12 does not contain a fully enabled backdoor yet?

Even though I do not have answers now, I am confident that getting more eyes on this problem can shed light on this. My next step is to try to attach a JTAG debugger to the SSG 20 to see whether I am missing any Dual_EC leaks.

Kudos to my tweeps Matthew Green, Adam Langley and HD Moore for the lively discussion on this matter.

I am back to other projects now, but I do not think I can stop my subconscious from thinking about this.

Postpwnium Writeup

The following post gives an overview of the bugs that I was cobbling together for the Pwnium 3 competition to achieve remote code execution on the device.

While inspecting the browser plugins installed by default on my Chromebook, I noticed that the “Google Talk” plugin is running in an ‘unsandboxed’ mode. Inspecting the entry more closely, one sees that not one MIME type, but two are registered, application/googletalk and application/vnd.o3d.auto. Since the Google Talk extension is closed source and O3D is an open source project, I decided to set my sights on O3D. Moreover, since O3D seems to be a dead-end technology, that looked like a safe bet to find bugs in.

By inspecting the Chromium source code and using gdb I reassured myself that the O3D plugin really was running in unsandboxed mode:

[from chrome/common/chrome_content_client.cc in Chromium r185835,
 function ComputeBuiltInPlugins(std::vector<content::PepperPluginInfo>* plugins)]:
  static bool skip_o3d_file_check = false;
  if (PathService::Get(chrome::FILE_O3D_PLUGIN, &path)) {
    if (skip_o3d_file_check || file_util::PathExists(path)) {
      content::PepperPluginInfo o3d;
      o3d.path = path;
      o3d.name = kO3DPluginName;
      o3d.is_out_of_process = true;
      o3d.is_sandboxed = false;
      o3d.permissions = kO3DPluginPermissions;
      webkit::WebPluginMimeType o3d_mime_type(kO3DPluginMimeType,
                                              kO3DPluginExtension,
                                              kO3DPluginDescription);
      o3d.mime_types.push_back(o3d_mime_type);
      plugins->push_back(o3d);

      skip_o3d_file_check = true;
    }
  }

A plan was formed: Pop the Chromebook through the O3D plugin. Let’s see what steps are necessary to get from here to there*.

* where here is a URL I give you to visit on your Chromebook and there is a connect-back shell being spawned on your shiny toy.

Getting an O3D plugin object instantiated

This was initially perceived as “an easy feat”, but just like most easy tasks, it turned out to be much harder than expected. “Why?”, I hear you ask. The plugin is configured with a list of whitelisted domains:

[from build/branding.gypi]:
'plugin_domain_whitelist': ('".corp.google.com", '
                            '".prod.google.com", '
                            '".googleplex.com", '
                            '"hostedtalkgadget.google.com", '
                            '"mail.google.com", '
                            '"plus.google.com", '
                            '"plus.sandbox.google.com", '
                            '"talk.google.com", '
                            '"talkgadget.google.com"')

The factory install of my Chromebook was ChromeOS 23.x, which didn’t actually implement this whitelisting feature for reasons unknown.

However, on ChromeOS 25.x, unless the HTML document embedding the o3d plugin is served over an HTTPS connection from one of these domains, the plugin will be blocked. This restriction can be overcome with an XSS on one of the above Google properties, by being able to spoof window.location.href or by making use of Chromium Issue #64229 (which has been marked as WontFix since January 2011). Have a look at the function IsDomainAuthorized():

[from plugin/cross/whitelist.cc]
bool IsDomainAuthorized(NPP instance) {
#ifdef O3D_PLUGIN_DOMAIN_WHITELIST
  std::string url(GetURL(instance));
  if (url.empty()) {
    // This can happen in Chrome due to a bug with cross-origin security checks,
    // including on legitimate pages. Until it's fixed we'll just allow any
    // domain when this happens.
    // http://code.google.com/p/chromium/issues/detail?id=64229
    LOG(WARNING) <<
        "Allowing use despite inability to determine the hosting page";
    return true;
  }
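The consequence of that fallback can be modelled in a few lines. This is a sketch under assumptions: the matching rules (suffix match for entries starting with a dot, exact match otherwise) are my guess at the intent of the list, the function names are hypothetical, and only three of the whitelist entries are reproduced.

```c
#include <string.h>

/* Subset of the entries from build/branding.gypi, for illustration. */
static const char *const kWhitelist[] = {
    ".corp.google.com", "mail.google.com", "talk.google.com"
};

static int host_matches(const char *host, const char *entry)
{
    size_t hl = strlen(host), el = strlen(entry);
    if (entry[0] == '.')                 /* assumed: leading dot means suffix match */
        return hl >= el && strcmp(host + hl - el, entry) == 0;
    return strcmp(host, entry) == 0;     /* assumed: otherwise exact match */
}

static int is_host_authorized(const char *host)
{
    if (host[0] == '\0')
        return 1;   /* fallback from the excerpt: unknown hosting page is allowed */
    for (size_t i = 0; i < sizeof kWhitelist / sizeof kWhitelist[0]; i++)
        if (host_matches(host, kWhitelist[i]))
            return 1;
    return 0;
}
```

In this model, is_host_authorized("evil.example") is rejected while is_host_authorized("") is accepted, which is why making the URL lookup fail is as good as being whitelisted.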

Since that bug had been rotting in the bugtracker for over two years and was also marked as WontFix, I assumed it had gained the status of a “feature”. However, after failing to reproduce the situation, I went trawling through the WebKit repository and after a while convinced myself that the problem had been fixed by this or a related commit:

[from http://trac.webkit.org/changeset/124693]
2012-08-04  Adam Barth  <abarth@webkit.org>

    [V8] Re-wire "target" half of the same-origin security check through Document rather than DOMWindow
    https://bugs.webkit.org/show_bug.cgi?id=93079

    Reviewed by Eric Seidel.

    Before this patch, we were traversing from Nodes to Frames to
    DOMWindows to SecurityOrigins when determing the "target" of an
    operation for the same-origin policy security check. Rather than
    detouring through DOMWindow, these security checks should operate in
    terms of ScriptExecutionContexts (aka Documents) because that's the
    canonical place we store SecurityOrigin objects.

    A future patch will re-wire the "active" part of the security check to
    use ScriptExecutionContexts as well and we'll be able to remove the
    extra copy of SecurityOrigin that we keep in DOMWindow.

More specifically, this change caught my eye:

static v8::Handle<v8::Context> activeContext()
{
    v8::Handle<v8::Context> context = v8::Context::GetCalling();
    if (!context.IsEmpty())
        return context;
    // Unfortunately, when processing script from a plug-in, we might not
    // have a calling context. In those cases, we fall back to the
    // entered context.
    return v8::Context::GetEntered();
}

[activeContext() now uses GetEntered() instead of GetCurrent() which I thought might mitigate the problem described in the above discussion. BUT I WAS WRONG! IT DOES NOT MITIGATE THE PROBLEM. SEE BELOW.]

I then wasted endless hours trying to find an XSS in one of the above Google sites. Apparently I suck at these things; I heard that this is supposed to be rather easy…

Revisiting the situation later – using much hair-pulling and gdb – I was able to discern a scenario that triggered the href property becoming 0 due to a failed cross-origin check. A Google Groups discussion among Google developers proved to be very helpful for that. The trigger is racy, but with a forced reload on the outer frame it works reliably. The trigger can be reduced to something as simple as embedding an iframe containing the following:

<html>
<body onload="document.defaultView.getComputedStyle(e).getPropertyValue('width');">
<div id="e"></div>
</body>
</html>

Quick explanation: we can force a CSS style computation to happen in an inner iframe with the o3d plugin embedded on the outer frame. This in turn will trigger a layout of the page, which happens in the V8 context of the inner iframe. During this layout (actually, in the post layout stage) a plugin instantiation will be attempted for the o3d object on the outer frame. This in turn will lead to NPN_GetProperty calling V8 bindings (V8Location) for the window.location.href property. Since this is in the V8 context of the inner iframe, the access will fail due to a cross-origin violation.

An exploitable memory corruption in the O3D plugin

To any arrogant memory corruption practitioner, the O3D code base looks sufficiently large to find at least a couple of decent use-after-free or type confusion vulnerabilities in. And I only need one. The reference-counting patterns used throughout the code base together with the weak_ptrs were extremely annoying (due to me hitting a number of false positives), but finally I found a simple way to UAF: Setting the owner property on a DrawElement object and subsequently destroying that owner object will cause a dangling pointer. To wit, the DrawElement class contains a bare Element pointer:

[from /core/cross/draw_element.h]:
Element* owner_;  // our current owner.

Let’s have a look at the available Javascript bindings of DrawElement:

[from plugin/idl/draw_element.idl]:
[getter, setter, userglue_setter] Element? owner_;

[verbatim=cpp_glue] %{
  void userglue_setter_owner_(
      o3d::DrawElement* _this,
      o3d::Element* owner) {
    _this->SetOwner(owner);
  }
-- snap --

and more specifically at the setter function:

[from /core/cross/draw_element.cc]:
void DrawElement::SetOwner(Element* new_owner) {
  // Hold a ref to ourselves so we make sure we don't get deleted while
  // as we remove ourself from our current owner.
  DrawElement::Ref temp(this);

  if (owner_ != NULL) {
    bool removed = owner_->RemoveDrawElement(this);
    DCHECK(removed);
  }

  owner_ = new_owner;

  if (new_owner) {
    new_owner->AddDrawElement(this);
  }
}

The o3d::Element class has the following inheritance chain: Element -> ParamObject -> NamedObject -> ObjectBase -> RefCounted

More importantly, there is a virtual class inheriting from Element that we can instantiate through Javascript bindings, namely the Primitive class. This means the standard dangling pointer vtable overwrite, using the JS binding to pull the UAF trigger, is possible. pop goes the glock
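The lifetime problem can be modelled without actually touching freed memory. In this sketch all names are hypothetical, destruction is represented by clearing an alive flag rather than releasing the object, and the counter stands in for the owner's list of draw elements.

```c
#include <stddef.h>

typedef struct element { int alive; int draw_count; } element;
typedef struct draw_element { element *owner; } draw_element;

/* Same shape as DrawElement::SetOwner: stores a raw, non-owning pointer. */
static void set_owner(draw_element *de, element *new_owner)
{
    if (de->owner != NULL)
        de->owner->draw_count--;
    de->owner = new_owner;
    if (new_owner != NULL)
        new_owner->draw_count++;
}

/* Stand-in for destruction; a real destructor would release the memory. */
static void destroy_element(element *e) { e->alive = 0; }

/* True if the draw element still points at an owner that no longer exists. */
static int owner_is_dangling(const draw_element *de)
{
    return de->owner != NULL && !de->owner->alive;
}
```

Nothing in the destruction path resets the pointer held by the draw element, so after destroy_element the check above reports a dangling owner. In the real plugin that stale pointer refers to freed heap memory.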

Please note that this find may look somewhat easier than it was: there are tons of other cases (see above) in which various design patterns, for instance the Object Manager pattern, are very successful at preventing UAF by making the objects unavailable through Javascript bindings.

A memory leak to defeat ASLR

While many a memory leak can usually be procured out of a UAF, I was not so lucky with the one that I had. But that just meant I needed to find a dedicated one. Contrary to more security-conscious operating systems such as OpenBSD [sorry, can’t help the trolling here :)], free()d memory is not zero-filled on Linux. This means that being able to allocate memory that can be read through Javascript bindings and which is left uninitialized after the allocation can give us the juicy bits we want, especially if we can choose the allocation size as well! I found this behaviour in o3d::Buffer, in the processing of RawData objects into fields with an input that intentionally is too short. In this case, the fields are not overwritten and can be used to peek into previously freed memory - just be careful to not get any floating point conversions in your way - this is what happened to me initially and made pointers only approximately correct ;) Look at the function Buffer::Set(o3d::RawData *, size_t, size_t) in file core/cross/buffer.c as well as Buffer::AllocateElements(unsigned) and VertexBufferGLES2::ConcreteAllocate(size_t) to see what I’m talking about [GLES2 and not GL is used on the Chromebook, also I’ve arbitrarily chosen VertexBuffer over IndexBuffer here].
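The leak pattern can be sketched with a deliberately tiny allocator model. Everything here is hypothetical: toy_alloc and toy_free stand in for an allocator that neither clears memory on allocation nor on release, and buffer_set only mimics the described behaviour of rejecting too-short input without touching the fields.

```c
#include <stddef.h>
#include <string.h>

enum { CHUNK = 16 };
static unsigned char arena[CHUNK];   /* a single reusable heap chunk */
static int chunk_free = 1;

/* Hands out the chunk without clearing it. */
static unsigned char *toy_alloc(void)
{
    if (!chunk_free)
        return NULL;
    chunk_free = 0;
    return arena;
}

/* Releases the chunk without zero-filling it. */
static void toy_free(unsigned char *p) { (void)p; chunk_free = 1; }

/* Rejects input that is too short and leaves the buffer contents untouched. */
static int buffer_set(unsigned char *buf, const unsigned char *src, size_t len)
{
    if (len < CHUNK)
        return 0;
    memcpy(buf, src, CHUNK);
    return 1;
}
```

If a chunk that previously held a pointer is released, reallocated as a buffer and then "set" with two bytes of input, reading the buffer back returns the old contents unchanged.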

This now allowed me to leak useful objects like the base::PendingTask object of Chrome, which gives us the offset of the chrome binary to reliably predict addresses in memory. This also gives a great way to allocate memory and set its content for a use-after-free, as long as the chunk is a multiple of 4 bytes in size.