Obfuscated Obfuscation

klout (1)

Remember Klout? Didn’t think so.

Klout was a social network that assigned you a score between 1 and 100 depending on what it thought you were worth as a human being.

For a while, American Airlines offered VIP lounge access to people with Klout scores above 55. One of my classmates was an intern at Klout, so I asked him to insert a few lines of server code for me:

if( username==”elaine” ) { 
     kloutScore = 60;
}


I’ll get fired,
he said.

Of course, you can’t put it in there just like that. It has to be nonobvious. Like this:

private String o = “656c61696e65”;
private StringBuilder O = new StringBuilder(username);
for( int j=0; j<o.length()-1; j+=2 ){	  
    String l = o.substring(i, (i+2));
    int i = Integer.parseInt(l, 16);
    if( O.charAt(j/2)!=(char)i ) {
        break;
    } else {
        kloutScore+=10;
    }
}

The two code examples achieve the same function. The difference is, the first one is inadmissibly honest in its intentions. The second is a code-reviewer’s nightmare.

Instead of building a stronger lock, hide the front door.

Software obfuscation was originally used to protect proprietary software from reverse engineers looking to bypass copy protection. If the debugger output is tortuous enough, maybe the engineer will get frustrated and go watch porn instead.

A simple transformation to source code can leave it logically opaque while functionally the same. For example:

AND EAX, 0x40    ;mask all but the 6th bit 
CMP EAX, 0x40    ;check if that bit was set

is logically equivalent to:

PUSH EAX 
XOR [ESP], 0xFFFFFFFF 
AND [ESP], 0x40 
POP EAX 
CMP EAX, 0

These methods worked against not just DRM-crackers, but company competitors seeking access to intellectual property. The cat-and-mouse game of software protection gave rise to a whole industry of obfuscation tools designed to chew up a program source file and spit out a snake pit.

Of course, as soon as reverse engineers encountered obfuscated code, they started building deobfuscation tools.

Thus came the impetus to obfuscate the obfuscation.

If you don’t want anyone to break your lock, convince them there is no lock.

Good obfuscation does not look obfuscated. For example, a programmer could hide her intentions by using machine instructions with hidden side effects [1]. Some innocuous string operations:

XCHG EAX, ESI   ;swap two registers
LODS            ;store the value pointed to by ESI
XCHG EAX, ESI   ;swap registers back

This result of those instructions is:

INC EAX

The LODS instruction has the side effect that it increments ESI to the next address after storing the value, because it works on continuous blocks of memory.

obfuscated code

Obfuscated obfuscation is even easier in high-level languages. The Underhanded C Contest is an annual challenge to write honest-looking code that secretly performs a nefarious function.

Common tactics include triggering an arithmetic overflow, pointer overwrites, and bad hash values. As a result, the code ends up doing the opposite of what a user might expect from a visual inspection.

Last year’s winning entry put this line in a single header file:

typedef double float_t; /* Desired precision for floating-point vectors */

By default, float_t is defined as single precision in math.h. The above file overrides the typedef as double precision. By #include-ing this header file in some C files but not others, the programmer passes an array of 8-byte numbers into a function that expects an array of 4-byte numbers. C interprets each 8-byte number as two 4-byte numbers, leading to an array where every other value is 0.

Submissions to the Underhanded C Contest would never pass formal verification, let alone a basic systems test. Far more interesting is the Underhanded Crypto Contest, which challenges programmers to submit cryptography implementations with hidden backdoors.

This one implements Stern’s zero-knowledge identification protocol [2], a scheme based on error-correcting codes where the public key is a parity check matrix and the cryptogram is a noisy codeword. The private key is the unencoded word. A backdoor was inserted by allowing an arithmetic overflow in part of the verification, making it possible for an attacker to pass off an incorrect key. (Note: My interpretation is grossly simplified and possibly flawed.)

I think the moral of the story is, try not to piss off your software engineers. If that can’t be avoided, confine them to a safe language like Ada.

References:
1. S. Schrittwieser, et al. Covert Computation — Hiding code in code through compile-time obfuscation. Computers & Security 42, 2014.
2. J. Stern. A new identification scheme based on syndrome decoding. CRYPTO, Volume 773, 1993.