Codehead's Corner
Random ramblings on hacking, coding, fighting with infrastructure and general tech
LabyREnth 2016 – Windows 1 – AntiD
Posted: 21 Aug 2016 at 12:08 by Codehead

The LabyREnth competition ran from 15th July to 14th Aug. I only managed to find time to do the first Windows challenge. It was a tricky one and I was only able to work at it in fits and starts, often with days or weeks between sessions. However, I learnt quite a few handy new things while working on it, so I’m writing this as a reference for myself and as an example of the dead-ends and rabbit holes of the analysis process for those who are interested.

The only clues on the challenge page for Windows 1 were ‘Deez bugs are bad bugs’ and that the zipfile password was ‘infected’.

Extracting the zipfile provided a single executable named AntiD.exe. Poking with file and binwalk told me that the file was a Windows PE32 which was to be expected given that this was a Windows challenge. Using strings revealed some mangled text and notably a ‘UPX!’ string which indicates a compressed binary. Checking the file with UPX confirmed that we have compressed executable, but also warned that there might be some shenanigans going on:

> upx -t AntiD.exe
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2013
UPX 3.91        Markus Oberhumer, Laszlo Molnar & John Reiser   Sep 30th 2013

upx: AntiD.exe: CantUnpackException: file is modified/hacked/protected; take care!!!

Attempting to decompress the file with the -d flag produces a similar message.

Running the .exe under Windows produces a prompt asking us to figure the key out:

C:\temp>AntiD.exe
Figure the key out:

We know the key format for the competition is PAN{…}, but trying a few simple things like ‘PAN{AntiD}’ and ‘PAN{Windows1}’ produces the following message:

C:\temp>AntiD.exe
Figure the key out: PAN{Windows1}
You wrong, why you so wrong at this?

We need to take a look at the code, but the tricky UPX tweaks mean we can’t just load it from the binary. Viewing the file with static disassembly tools like IDA and Binary Ninja didn’t achieve much, all you can see is the UPX decompression header/loader. The guys at Vector35 have opened bug #338 to try to address this, but for the moment we’re on our own.

We know the code works when running the binary, so we can open or attach to the AntiD process using WinDbg and view the code in memory once it has started up and decompressed. After a few passes stepping through the code and a whole bunch of fail, I used LordPE to grab a copy of the decompressed code from memory (right click the process, dump full) which at least gave me something to load into IDA.

The IDA analysis was sketchy with a few decompilation fails, but I was able to patch it up and identify an interesting looking function located at AntiD+0x11b0 which I named ‘Test’.

The function starts with a long sequence of hex value definitions:

push    ebp
mov     ebp, esp
sub     esp, 38h
mov     eax, ds:233004h
xor     eax, ebp
mov     [ebp+var_4], eax
mov     [ebp+var_2C], 8Ch
mov     [ebp+var_2B], 0F1h
mov     [ebp+var_2A], 53h
mov     [ebp+var_29], 0A3h
mov     [ebp+var_28], 8
mov     [ebp+var_27], 0D7h
mov     [ebp+var_26], 0DCh
mov     [ebp+var_25], 48h
mov     [ebp+var_24], 0DBh
mov     [ebp+var_23], 0Ch
mov     [ebp+var_22], 3Ah
mov     [ebp+var_21], 0EEh
mov     [ebp+var_20], 15h
mov     [ebp+var_1F], 22h
mov     [ebp+var_1E], 0C4h
mov     [ebp+var_1D], 0E5h
mov     [ebp+var_1C], 0C9h
mov     [ebp+var_1B], 0A0h
mov     [ebp+var_1A], 0A5h
mov     [ebp+var_19], 0Ch
mov     [ebp+var_18], 0D3h
mov     [ebp+var_17], 0DCh
mov     [ebp+var_16], 51h
mov     [ebp+var_15], 0C7h
mov     [ebp+var_14], 39h
mov     [ebp+var_13], 0FDh
mov     [ebp+var_12], 0D0h
mov     [ebp+var_11], 0F8h
mov     [ebp+var_10], 3Bh
mov     [ebp+var_F], 0E8h
mov     [ebp+var_E], 0CCh
mov     [ebp+var_D], 3
mov     [ebp+var_C], 6
mov     [ebp+var_B], 43h
mov     [ebp+var_A], 0F7h
mov     [ebp+var_9], 0DAh
mov     [ebp+var_8], 7Eh
mov     [ebp+var_7], 65h
mov     [ebp+var_6], 0AEh
mov     [ebp+var_5], 80h

After the definitions (AntiD+0x126D) there is a check that the length of the input text is 16 chars.

mov     eax, [ebp+arg_0]
push    eax
call    dword ptr ds:2320D4h
add     esp, 4
cmp     eax, 10h
jnz     {FailOut}

The code then loops through the array of hex values performing numerous tests. This looks like the key check routine with the loop restarting at AntiD+0x128f (label: MainLoop) after each pass.

There are 5 main blocks in the test which jump to AntiD+0x1363 (label: ‘FailOut’) and exit if the test in each case fails.

For my first real shot at tracing through the execution I used WinDbg, it’s easy to patch out the length check by breaking at AntiD+0x126D and setting EAX to 0x10. Another breakpoint set atMainLoop will fire when a character has been successfully validated and the loop restarts. My plan was to feel my way through the test blocks and try to work out what was going on.

The first test block at AntiD+0x1299 does some juggling of values, calls a function and fails if the return value is non-zero.

Block1:
mov     [ebp+var_30], 0
mov     edx, [ebp+arg_0]
add     edx, [ebp+IndexPtr?]
movsx   eax, byte ptr [edx]
xor     eax, 33h
and     eax, 0FFh
mov     [ebp+var_30], eax
call    sub_271130
movzx   ecx, al
test    ecx, ecx
jz      short Block2

Static analysis of sub_271130 is no help as the imports are broken and external calls are just addresses.

push    ebp
mov     ebp, esp
push    ecx
mov     [ebp+var_4], 0
lea     eax, [ebp+var_4]
push    eax
call    dword ptr ds:232000h
push    eax
call    dword ptr ds:232008h
cmp     [ebp+var_4], 0
jz      short loc_271156
mov     al, 1
jmp     short loc_271158
loc_271156:
xor     al, al
loc_271158:
mov     esp, ebp
pop     ebp
retn

However, tracing the function in WinDbg shows that the first external call is to IsDebuggerPresent(). We can be pretty sure this is an anti-debug measure, easily bypassed by setting EAX to zero after the function returns, but before the test ecx, ecx call.

Skipping to the next block at ntiD+0x12C7, we see a similar layout. This time the function is calling FindWindow() and seems to be looking for a Window with the title ‘OLLYDBG’. Another anti-debug measure. Again, easily defeated by setting EAX to zero after the function call.

Block2:
mov     edx, [ebp+var_30]
add     edx, 44h
and     edx, 0FFh
mov     [ebp+var_30], edx
call    FindWindow?
movzx   eax, al
test    eax, eax
jz      short Block3

At this point it is apparent that the code in the blocks is probably all anti-debugging measures. We can see some values being manipulated and passed through these blocks, but tracing is laborious because of the need to break and manually tweak registers to defeat the anti-analysis measures.

Time for a different approach. The WinAppDbg module for Python allows for scripting of debugging operations. I didn’t figure out how to pass input to the target, but it is easy enough start the application in one window and then attach to it and debug from a script running in another.

Breakpoints in WinAppDbg are set with the function break_at(PID, Address, Callback_function). This allows us to write code to be executed when the breakpoint is hit. Here’s a short example to bypass the length check in AntiD:

def lenChk(event):
    print "Bypassing Length Check"
    
    # Get the register values
    process.suspend()
    registers = event.get_thread().get_context()

    # Modify EAX and write back
    registers['Eax'] = 16
    event.get_thread().set_context(registers)
    process.resume()


debug = Debug()
try:

    # Lookup the currently running processes.
    debug.system.scan_processes()

    # For all processes that match the requested filename...
    for ( process, name ) in debug.system.find_processes_by_filename( "AntiD.exe" ):
        # Get PID
        pid = process.get_pid()
        print "Found " + name + " - PID:" + str(pid)

        # Attach to the process.
        debug.attach( pid )

        # Get ImgBase
        img_base = process.get_image_base()

        # Insert breakpoint at length check
        debug.break_at(pid, img_base + 0x126d, lenChk)
        
    # Wait for all the debugees to finish.
    debug.loop()

# Stop the debugger.
finally:
    debug.stop()

The script attaches to a running instance of AntiD.exe, sets a breakpoint at AntiD+0x126d (cmp eax, 10h) at sets EAX to 0x10 when the breakpoint is hit. Now we can supply any key we want and the length check will always pass. This removes the repetitive work of setting the breakpoint and tweaking the registers for each run, it is all handled by the script, we can just concentrate on working out what how the key is checked.

Using the above approach, it is simple to write handlers to pass all the test blocks. I didn’t even bother working out what the remaining blocks did, I just scripted workarounds for them. We can dump the registers at the end of the checks to see how the ‘Test’ function is actually working:

AntiD Window:

C:\Users\Karl\Desktop\files>AntiD.exe
Figure the key out: PAN{TEST}
You wrong, why you so wrong at this?

WinAppDbg Script Output:

C:\temp>python solve.py
Found AntiD.exe - PID:1704
Bypassing Length Check
Main Loop Breakpoint Hit!
Bypassing Debug Check
Bypassing Window Check
Bypassing Check 3
Bypassing Check 4
Key Check
Ecx - 0x8cL
Edx - 0x8cL
Main Loop Breakpoint Hit!
Bypassing Debug Check
Bypassing Window Check
Bypassing Check 3
Bypassing Check 4
Key Check
Ecx - 0xf1L
Edx - 0xf1L
Main Loop Breakpoint Hit!
Bypassing Debug Check
Bypassing Window Check
Bypassing Check 3
Bypassing Check 4
Key Check
Ecx - 0x53L
Edx - 0x53L
Main Loop Breakpoint Hit!
Bypassing Debug Check
Bypassing Window Check
Bypassing Check 3
Bypassing Check 4
Key Check
Ecx - 0xa3L
Edx - 0xa3L
Main Loop Breakpoint Hit!
Bypassing Debug Check
Bypassing Window Check
Bypassing Check 3
Bypassing Check 4
Key Check
Ecx - 0x8L
Edx - 0xebL

We can see that at the end of the loop ECX contains a value taken from the array defined at the start of the ‘Test’ function, EDX contains a value taken from our input, this value has been transformed by a sequence of XOR and arithmetic operations during the anti-debug test blocks. These values stop matching after ‘PAN{’ so we need to work out the rest of the key.

Now we know the function calls aren’t relevant and we don’t have to worry about the anti-debug tests, we can work out the transforms that are applied to each character of the input text:

The input value is frequently moved between variables and registers, but the general gist of the transform is as follows:

In the DebuggerCheck block, XOR 0x33 is applied to the input value. This is followed by AND 0xff. This is probably to keep the value in the range 0-255.

mov     [ebp+var_30], 0
mov     edx, [ebp+arg_0]
add     edx, [ebp+var34]
movsx   eax, byte ptr [edx]
xor     eax, 33h
and     eax, 0FFh
mov     [ebp+var_30], eax
call    DebuggerCheck
movzx   ecx, al
test    ecx, ecx
jz      short Block2

In the FindWindow block, 0x44 is added to the test value, followed by another AND 0xff.

Block2:
mov     edx, [ebp+var_30]
add     edx, 44h
and     edx, 0FFh
mov     [ebp+var_30], edx
call    FindWindow
movzx   eax, al
test    eax, eax
jz      short Block3

Block3 performs XOR 0x55 then AND 0xff.

Block3:
mov     ecx, [ebp+var_30]
xor     ecx, 55h
and     ecx, 0FFh
mov     [ebp+var_30], ecx
call    Check3
movzx   edx, al
test    edx, edx
jz      short Block4

Finally, Block4 subtracts 0x66 and applies the AND 0xff constraint.

Block4:
mov     eax, [ebp+var_30]
sub     eax, 66h
and     eax, 0FFh
mov     [ebp+var_30], eax
call    Check4
movzx   ecx, al
test    ecx, ecx
jz      short Block5

A quick bit of Python confirms that pushing ‘P’ through the transform detailed above produces a final value of 0x8C which matches the first value in the test array.

>>> v = ord('P')
>>> v = v ^ 0x33
>>> v = v & 0xff
>>> v = v + 0x44
>>> v = v & 0xff
>>> v = v ^ 0x55
>>> v = v & 0xff
>>> v = v - 0x66
>>> v = v & 0xff
>>> hex(v)
'0x8c' 

Unfortunately, when the next letter (’A’), has the same transform applied, we don’t get 0xf1. We get 0x7d.

>>> v = ord('A')
>>> v = v ^ 0x33
>>> v = v & 0xff
>>> v = v + 0x44
>>> v = v & 0xff
>>> v = v ^ 0x55
>>> v = v & 0xff
>>> v = v - 0x66
>>> v = v & 0xff
>>> hex(v)
'0x7d'

Closer examination of the final check block shows that there is one last XOR with the value in var_38. This value is zero on the first pass through the loop, but the final transformed value is added to var_38 at the end of each subsequent iteration for use in the next pass.

Block5:
mov     edx, [ebp+var_38]
and     edx, 0FFh
xor     edx, [ebp+var_30]
and     edx, 0FFh
mov     [ebp+var_30], edx
mov     eax, [ebp+IndexPtr?]
movsx   ecx, [ebp+eax+var_2C]
and     ecx, 0FFh
cmp     [ebp+var_30], ecx
jz      short Block6

Block6:
mov     edx, [ebp+var_38]
add     edx, [ebp+var_30]
mov     [ebp+var_38], edx
jmp     {UpdateIndex ... MainLoop}

Using the 0x8C value from the previous pass gives us the 0xf1 value we’re looking for

>>> hex(v)
'0x7d'
>>> v = v ^ 0x8c
>>> v = v & 0xff
>>> hex(v)
'0xf1'

Now we can take the test value array and build some Python to solve the key:

import sys

lastVal = 0
workVal = 0
index = 0

testArr = [ 0x8C, 0xF1, 0x53, 0xA3, 0x08, 0xD7, 0xDC, 0x48, 0xDB, 0x0C, 0x3A, 0xEE, 0x15, 0x22, 0xC4, 0xE5 ]
keyArr  = "PAN{"

def transform(v, old):
    v ^= 0x33
    v &= 0xff
    v += 0x44
    v &= 0xff
    v ^= 0x55
    v &= 0xff
    v -= 0x66
    v &= 0xff
    v ^= old
    v &= 0xff
    return v

def brute(tgt, old):

    for g in range(0,255):
        if transform(g, old) == tgt:
            return chr(g)

# Main Loop
# Check the known chars...
for c in keyArr:
    workVal = transform(ord(c), lastVal)
    if workVal == testArr[index]:
        sys.stdout.write(c)
    else:
        print keyArr[index] + " FAIL! - " + str(hex(testArr[index])) + " " + hex(workVal) + "(lastVal - " + str(hex(lastVal)) + ")"
        brute(lastVal, testArr[index])
        break

    lastVal += workVal
    lastVal &= 0xff
    index+=1

# Brute force the remaining characters
try:
    while True:
        sys.stdout.write(brute(testArr[index], lastVal))
        lastVal += testArr[index]
        lastVal &= 0xff
        index+=1
except:
    print ""

This code works, but using 16 chars from the test array produces a key which looks too short:

PAN{C0nf1agul4ti

The key is not accepted by the AntiD.exe or the website scoreboard. However, there is plenty more data in the test array, so lets feed that into our solver:

testArr = [ 0x8C, 0xF1, 0x53, 0xA3, 0x08, 0xD7, 0xDC, 0x48, 0xDB, 0x0C, 0x3A, 0xEE, 0x15, 0x22, 0xC4, 0xE5, 
            0xC9, 0xA0, 0xA5, 0xc,  0xD3, 0xDC, 0x51, 0xC7, 0x39, 0xFD, 0xD0, 0xF8, 0x3B, 0xE8, 0xCC, 0x03, 
            0x06, 0x43, 0xF7, 0xDA, 0x7E, 0x65, 0xAE, 0x80 ]

This extra data gives us a better looking flag:

PAN{C0nf1agul4ti0ns_0n_4_J08_W3LL_D0N3!}

However, it’s still rejected by AntiD.exe, as it is longer than 16 chars.

Using the WinAppDbg script we can bypass the length check and get our key accepted by the program:

C:\temp>AntiD.exe
Figure the key out: PAN{C0nf1agul4ti0ns_0n_4_J08_W3LL_D0N3!}
Well done! A+! You get a gold star!

This longer flag is accepted by the website too, so we know we have solved the challenge correctly. I’m not sure why the length check in the program is incorrect, but this was a fun exercise that required me to come up with some new solutions to solve problems I hadn’t encountered before.

Well done to the Paloalto Networks team, I wish I’d had more time to look at some of the other problems.

Categories: Hacking CTF



Site powered by Hugo.
Polymer theme by pdevty, tweaked by Codehead