Some Analysis of the Backdoored Backdoor

Update: Shortly after reading my post, Willem Pinckaers pointed out that the reseed_system_prng function sets the global variable system_prng_bufpos to 32. This means that after the first invocation of this function, the for loop right after the reseed call in system_prng_gen_block never executes. Hence, the ANSI X9.31 PRNG code is completely non-functional.

Recently, the internet circus called Twitter has been abuzz with news of a plurality of backdoors found in some versions of Juniper’s ScreenOS operating system. While Fox-IT and then HD Moore quickly found the backdoor password allowing SSH and Telnet access, the exact mechanisms underlying the VPN decryption backdoor are unclear at this point. This blog post tries to summarize my findings of the last couple of days regarding the apparently backdoored Dual_EC PRNG found in ScreenOS 6.3.0r12 and the other affected firmware revisions listed in Juniper’s 2015-12 Out of Cycle Security Bulletin: ScreenOS.

NIST publication SP 800-90A describes a family of pseudo-random number generators called Dual_EC DRBG (deterministic random bit generator is the official designation used in the standard) for different elliptic curves (NIST curves P-256, P-384 and P-521). For each of these PRNGs, two parameters are needed: two points on the elliptic curve that are called P and Q. These points are also specified in Appendix A.1.1 of the same standard.

During the CRYPTO 2007 rump session, Niels Ferguson and Dan Shumow demonstrated that if the points are not randomly generated, but carefully chosen in advance, the security of Dual_EC DRBG can be subverted by the party doing the choosing, effectively backdooring the PRNG. Namely, if one chooses P, Q such that Q=P*e holds for a value e that is kept secret, the party that generated said P, Q can recover the internal state of the PRNG from observed output in a computationally “cheap” fashion. Hence instances of Dual_EC PRNG for which the provenance of the points P and Q is unknown are susceptible to having been backdoored. Parties that are not in possession of the value e can only obtain it by solving the discrete logarithm problem for e on the elliptic curve, and for prime curves such as P-256 no sub-exponential algorithms for this problem are currently known. In fact, unless quantum computers capable of running Shor’s algorithm on more than a handful of bits become reality, conventional cryptographic wisdom places the strength of P-256 at a 128-bit symmetric key security level.
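To make the trapdoor concrete, here is a sketch of the state-recovery argument for a simplified Dual_EC generator. The notation is introduced here and is not taken from the text above: $s_i$ is the internal state, $x(\cdot)$ takes the x-coordinate of a point, $\mathrm{trunc}$ drops the top bits of the output, and $n$ is the group order. Additional input and other details of the standard are ignored.

```latex
\begin{align*}
s_{i+1} &= x(s_i P), \qquad r_i = x(s_i Q), \qquad \mathrm{out}_i = \mathrm{trunc}(r_i) \\
Q = eP \;&\Longrightarrow\; P = dQ \quad \text{with } d = e^{-1} \bmod n \\
R_i = s_i Q \;&\Longrightarrow\; x(d R_i) = x(s_i \, dQ) = x(s_i P) = s_{i+1}
\end{align*}
```

An observer who knows e therefore only has to guess the truncated bits of $r_i$ (16 bits for P-256 as specified in the standard), lift each candidate x-coordinate to a curve point $R_i$, and multiply by $d$ to obtain a candidate for the next state, which can be checked against later output.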

After the Snowden revelations uncovered Project BULLRUN and gave stronger indications of the compromise of the proposed Dual_EC parameters in SP 800-90A, Checkoway et al. presented a paper On the Practical Exploitability of Dual EC in TLS Implementations at Usenix Security 2014. In July 2015, Bernstein, Lange and Niederhagen wrote an excellent article on the history of Dual EC and how to exploit Dual EC-based backdoors.

Alas, while Juniper used Dual_EC_DRBG with the P-256 NIST curve and the point P specified in SP 800-90A in ScreenOS — the operating system running on NetScreen VPN gateways — they chose to use a different point Q and not the one supplied in the standard for P-256. When the Snowden revelations in 2013 shone light on Project BULLRUN and the compromise of Dual_EC_DRBG by the NSA, Juniper responded as follows in knowledge base article KB28205:

ScreenOS does make use of the Dual_EC_DRBG standard, but is designed to not use Dual_EC_DRBG as its primary random number generator. ScreenOS uses it in a way that should not be vulnerable to the possible issue that has been brought to light. Instead of using the NIST recommended curve points it uses self-generated basis points and then takes the output as an input to FIPS/ANSI X.9.31 PRNG, which is the random number generator used in ScreenOS cryptographic operations.

However, apparently starting in August 2012 (release date according to release notes for 6.3.0r12), Juniper started shipping ScreenOS firmware images with a different point Q. Adam Caudill first noted this difference after HD Moore posted a diff of strings found in the SSG 500 6.2.0r14 and 6.2.0r15 firmware. As we can deduce from Juniper’s recent security advisory and the fact that they reverted to the old value of Q in the patched images, this was a change not authored by them. Apparently Juniper only realised this recently and not when they were issuing KB28205. This led us to investigate the change more thoroughly, which led to the discovery of its use in a Dual_EC PRNG, as documented by Adam Langley. This discovery came fairly quickly once I realized that ScreenOS uses OpenSSL as its underlying crypto library; a well-kept secret among people who have reversed products containing OpenSSL is that all those EC_PUT_error macros sprinkled over the OpenSSL codebase are mighty useful for identifying functions and hence getting a hook into the codebase.

It stands to reason that whoever managed to slip in their own Q will also know the corresponding e such that P*e=Q (the value P was unchanged from the standard) and hence is able to recover the internal state of the backdoored Dual_EC generator from the generator's output. What is unknown, however, is what an attack would look like for the PRNG cascade employed by Juniper’s ScreenOS.

Since there is no public description of this PRNG cascade, I analysed firmware version 6.3.0r12 of a Netscreen SSG 20 to investigate this issue further. This was the first version of 6.3.0 that Juniper indicates as having been backdoored. Thankfully, HD Moore already wrote up the details of how to unpack and load the firmware images in IDA in his blog post, so I did not have to do that again here.

Static analysis indicates that the output of the Dual_EC generator indeed is not used directly, but rather only to reseed an ANSI X9.31 PRNG. Besides the unused EC PRNG known-answer test function, a function we call reseed_system_prng is the only one that references the ec_prng_generate_output function. Caveat: we may be overlooking a dynamically generated indirect call to the Dual_EC generator that leaks its state at some point; superficial BinDiffing of 6.3.0r11 and 6.3.0r12, however, did not show any leads in that direction. Further analysis using a JTAG debugger on a live device will hopefully show us more.

The “system” PRNG is then used throughout ScreenOS to generate random values, for instance to construct IKE nonces, random OpenSSL BNs etc. This system PRNG generates output in 32 byte blocks using a function we chose to call system_prng_gen_block:

// This generates a block of 32 random bytes
void system_prng_gen_block(int a1)
{
  int v3;
  int v4;
  unsigned int i;
  unsigned int timeval[2];

  timeval[0] = 0;
  timeval[1] = ixp425_read_timestamp_timer();
  system_prng_bufpos = 0;
  ++sysprng_num_gen_blocks;
  if ( !prng_does_not_require_reseeding() )
    reseed_system_prng();
  for ( ; system_prng_bufpos <= 31; system_prng_bufpos += 8 )
  {
    memcpy(&prev_prng_seed_part1, &ansi_x9_31_seed, 8);
    memcpy(&prev_generator_out, generator_outbuf, 8);
    ansi_x9_31_update(timeval, &ansi_x9_31_seed, &ansi_x9_31_3des_key, generator_outbuf);
    if ( is_fips_enabled(0) )
    {
      if ( !memcmp(&ansi_x9_31_seed, &prev_prng_seed_part1, 8) || !memcmp(generator_outbuf, &prev_generator_out, 8) )
      {
        log_dbgmsg3(0, 6, get_current_vsys(0), "FIPS ERROR: PRNG failure, duplicate random output\n", timeval[0], timeval[1]);
        /* 0x404100E -> "Failed to generate random." */
        log_dbgmsg(0x404100E);
        log_dbgmsg4("FIPS ERROR: PRNG failure, duplicate random output\n", 11);
      }
    }
    for ( i = 0; i < system_prng_bufpos; i += 8 )
    {
      if ( !memcmp(&system_prng_output_buffer[i], generator_outbuf, 8) )
      {
        log_dbgmsg3(0, 6, get_current_vsys(0), "FIPS ERROR: PRNG failure, duplicate random output\n", timeval[0], timeval[1]);
        log_dbgmsg4("FIPS ERROR: PRNG failure, duplicate random output\n", 11);
      }
    }
    memcpy(&system_prng_output_buffer[system_prng_bufpos], generator_outbuf, 8);
  }
}

As we see, before any output is generated, this function calls another function which I named prng_does_not_require_reseeding. This function reads out a flag that is set to zero by default. What this means is that by default the system PRNG is reseeded from the Dual_EC PRNG for each output block generated! This per-block reseeding can apparently be turned off using the command set key one-stage-rng. I was not able to find this command in the ScreenOS documentation, however, only in the firmware binary. The reseeding of the system PRNG is done by getting 32 bytes of output from the Dual EC generator and splitting this into 8 bytes of seed and 24 bytes of key material for an X9.31 PRNG:

void reseed_system_prng()
{
  system_prng_state[0] = 0;
  /* Pull 32 bytes from the Dual_EC generator into the shared output buffer */
  if ( ec_prng_gen_keystream_with_checks(system_prng_output_buffer, 32) != 32 )
    log_dbgmsg4("FIPS ERROR: PRNG failure, unable to reseed\n", 11);
  /* The first 8 bytes become the X9.31 seed, the remaining 24 the 3DES key */
  memcpy(&ansi_x9_31_seed, system_prng_output_buffer, 8u);
  memcpy(&ansi_x9_31_3des_key, &system_prng_output_buffer[8], 24u);
  /* Marks the output buffer as already full */
  system_prng_bufpos = 32;
}
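The observation in the update at the top of this post can be reproduced with a small toy model of the two functions above. This is only a sketch of the control flow, not ScreenOS code: every name prefixed with model_ is hypothetical, the Dual_EC generator is replaced by a fill with the byte 0xEC, and the X9.31 stage by a fill with the byte 0x31.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for the firmware globals (hypothetical names). */
uint8_t model_output_buffer[32];
unsigned model_bufpos;
unsigned model_x931_calls;          /* counts how often the X9.31 stage ran */

/* Stand-in for the Dual_EC generator: a recognisable 0xEC pattern. */
static void model_dual_ec_output(uint8_t *out, size_t len)
{
    memset(out, 0xEC, len);
}

/* Stand-in for ansi_x9_31_update: 8 bytes that are clearly not 0xEC. */
static void model_x931_update(uint8_t out[8])
{
    memset(out, 0x31, 8);
    ++model_x931_calls;
}

/* Mirrors reseed_system_prng: fills the shared buffer, then sets bufpos. */
static void model_reseed(void)
{
    model_dual_ec_output(model_output_buffer, sizeof model_output_buffer);
    model_bufpos = 32;              /* the assignment that disables the loop */
}

/* Mirrors system_prng_gen_block; reseed != 0 models the default setting. */
void model_gen_block(int reseed)
{
    uint8_t block[8];

    model_bufpos = 0;
    if (reseed)
        model_reseed();
    for (; model_bufpos <= 31; model_bufpos += 8) {
        model_x931_update(block);
        memcpy(&model_output_buffer[model_bufpos], block, 8);
    }
    assert(model_bufpos >= 32);     /* the buffer is always reported as full */
}

/* Returns 1 if the block handed to callers is untouched Dual_EC output. */
int model_block_is_raw_dual_ec(void)
{
    for (size_t i = 0; i < sizeof model_output_buffer; ++i)
        if (model_output_buffer[i] != 0xEC)
            return 0;
    return 1;
}
```

Calling model_gen_block(1) leaves model_x931_calls at zero and the buffer full of the Dual_EC pattern, while model_gen_block(0) runs the X9.31 stand-in four times. In this model, at least, reseeding on every block means the "system" PRNG hands out Dual_EC output unmodified.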

The function ansi_x9_31_update works as one expects it to. Each call to this function calculates 8 bytes of keystream. This is done by encrypting a timer value that is obtained by directly reading a hardware register in our case (IXP425 platform) with the 192-bit 3DES key ansi_x9_31_3des_key. Furthermore, the function updates the seed value ansi_x9_31_seed with each call. In the below diagram, T is represented by the variable timeval, K by ansi_x9_31_3des_key, Vi by the value of ansi_x9_31_seed before the call and Vi+1 by the value of ansi_x9_31_seed after; Ri denotes the output:

[Figure: ANSI X9.31 PRNG]
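For readers without the diagram at hand, one step of the generic ANSI X9.31 construction can be sketched as follows. This is the textbook structure, not the decompiled ansi_x9_31_update. The 64-bit block cipher is passed in as a function pointer, and toy_encrypt is a deliberately insecure stand-in for 3DES that exists only to make the sketch compile and run. All names here are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface for a 64-bit block cipher; ScreenOS uses 3DES. */
typedef uint64_t (*block_encrypt_fn)(uint64_t key, uint64_t block);

/* Toy stand-in cipher for illustration only: NOT secure and NOT 3DES. */
uint64_t toy_encrypt(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    x = (x << 13) | (x >> 51);            /* rotate left by 13 bits */
    return x * 0x9E3779B97F4A7C15ULL;     /* multiply by an odd constant */
}

/*
 * One X9.31 step: returns the output block R_i and updates *v
 * from V_i to V_{i+1}. t is the timestamp T, key is K.
 */
uint64_t x931_step(block_encrypt_fn enc, uint64_t key, uint64_t t, uint64_t *v)
{
    assert(enc != NULL && v != NULL);

    uint64_t i = enc(key, t);             /* I       = E_K(T)       */
    uint64_t r = enc(key, i ^ *v);        /* R_i     = E_K(I ^ V_i) */
    *v = enc(key, r ^ i);                 /* V_{i+1} = E_K(R_i ^ I) */
    return r;
}
```

The step is fully deterministic in K, V and T, so anyone who learns K and V only has the timestamp left to guess.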

While X9.31-style PRNGs are known to be fragile, I currently do not see an easy way to do passive decryption of VPN traffic using the above ScreenOS Dual_EC backdoor, even if the value e were known to me. Although recovering the internal state of the Dual_EC generator would reduce the entropy of the PRNG output to at most 32 bits (the timer value), there just is not any direct output of Dual_EC visible that would allow recovering its internal state.
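The "at most 32 bits" estimate can be illustrated with a hedged sketch: if the key K and seed V were known (because they come from recovered Dual_EC output), an observer could simply try timer values until one reproduces an observed output block. As before, toy_encrypt is an insecure stand-in for 3DES, recover_timer is a hypothetical helper, and the layout of T as a 64-bit value with a zero upper half is an assumption based on timeval[0] being set to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in cipher for illustration only: NOT secure and NOT 3DES. */
uint64_t toy_encrypt(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    x = (x << 13) | (x >> 51);            /* rotate left by 13 bits */
    return x * 0x9E3779B97F4A7C15ULL;     /* multiply by an odd constant */
}

/*
 * Searches timer values in [t_lo, t_hi] for one whose X9.31 output
 * R = E_K(E_K(T) ^ V) equals observed_r. Returns 1 and stores the
 * timer value in *t_found on success, 0 if no candidate matches.
 */
int recover_timer(uint64_t key, uint64_t v, uint64_t observed_r,
                  uint32_t t_lo, uint32_t t_hi, uint32_t *t_found)
{
    assert(t_found != NULL && t_lo <= t_hi);

    /* 64-bit loop counter so that t_hi == UINT32_MAX cannot overflow. */
    for (uint64_t t = t_lo; t <= t_hi; ++t) {
        uint64_t i = toy_encrypt(key, t);
        if (toy_encrypt(key, i ^ v) == observed_r) {
            *t_found = (uint32_t)t;
            return 1;
        }
    }
    return 0;
}
```

A full search costs at most 2^32 pairs of block-cipher calls per output block, which is well within reach of commodity hardware. The hard part remains obtaining K and V in the first place.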

Maybe I am missing a direct leak of Dual_EC output somewhere, maybe I am overlooking an attack on the above X9.31 construction or the cascaded PRNG that does not involve breaking 3DES; maybe there’s another subtle change in the code that I am missing which breaks the whole thing. Juniper clearly stated that a change starting in 6.3.0r12 enables passive decryption of VPN traffic; given the fact that in the patched 6.3.0r12b version they reverted the point Q to the one contained in ScreenOS 6.3.0r11, it seems very likely that a changed and reverted point for the Dual_EC generator gives rise to this vulnerability. Last but not least (and I rate this as extremely unlikely given the nature of the backdoor), maybe 6.3.0r12 does not contain a fully enabled backdoor yet?

Even though I do not have answers now, I am confident that getting more eyes on this problem can shed light on this. My next step is to try to attach a JTAG debugger to the SSG 20 to see whether I am missing any Dual_EC leaks.

Kudos to my tweeps Matthew Green, Adam Langley and HD Moore for the lively discussion on this matter.

I am back to other projects now, but I do not think I can stop my subconscious from thinking about this.