is an extensible framework for easily writing debuggable, compiler optimized, position-independent, x86 and x64 shellcode for windows platforms.


I will be demonstrating how to write optimized, position-independent x86 and x64 shellcode using our ShellcodeStdio framework. Our approach is invaluable in the rapid development of shellcode as ShellcodeStdio maintains distinct advantages over coding in pure assembly. The framework allows for better debugging by utilizing the Visual Studio environment, as opposed to a assembler-level debugger such as OllyDbg. In addition, the shellcode produced by the Visual Studio compiler is optimized at a sufficient level to be used in production environments. Most importantly, it allows one to development exploit-ready shellcode using pure C/C++ code, with the convenience of pre-processed Win32 API macros and hard-coded string literals.

Shellcode 101

Shellcode, in an offensive security context, is a sequence of machine code that can execute at any memory location in a target process assuming it has the correct memory protections. To better understand what this means, let’s examine what happens when we view the disassembly of a simple program displaying a messagebox.

MessageBoxA(NULL, "Message", "Caption", MB_OK);
----------------------------------------------------------------------------
00A91013 6A 00                push        0  
00A91015 68 10 20 A9 00       push        offset string "Caption" (0A92010h)  
00A9101A 68 18 20 A9 00       push        offset string "Message" (0A92018h)  
00A9101F 6A 00                push        0  
00A91021 FF 15 00 20 A9 00    call        dword ptr [__imp__MessageBoxA@16 (0A92000h)]

From left to right, we can see the memory address, the machine code bytes, and the instructions. Since these addresses are relative to our already compiled executable file, this code will fail to execute in an arbitrary memory context.

When string literals are compiled into a portable executable file, they are most often stored in a section with constant, read-only, initialized data attributes (e.g. the .rdata section). We can see the memory locations of our two strings referenced in the code excerpt above. In addition, we must consider the memory location of the imported MessageBoxA function. ShellcodeStdio addresses these problems in order to allow our C/C++ compiled code to be executed at any memory location.

Making Data Position-Independent

In order to make our strings position-independent, ShellcodeStdio dynamically creates them on the stack. Any string (ANSI or Unicode), can be represented by pushing the individual bytes to the stack. A complication to be aware of when using this framework is that when one allocates a very large objects on the stack, the Visual Studio compiler will replace the position-independent stack method with a call to HeapAlloc, making the resultant code position-dependent. A possible solution is increasing the SizeOfStackCommit field in the IMAGE_OPTIONAL_HEADER. However, since this does not solve the importing of an extraneous Windows API function done by the compiler, it’s an interesting, but somewhat moot, approach. As of now, this issue limits the size of objects which one can allocate on the stack, but in practice, should not be a problem as there is usually sufficient space for most operations.

Making Windows API Calls Position-Independent

The second issue is resolving all calls to external libraries (i.e. doing anything useful). In the code above, we would need to resolve the call to the MessageBoxA function, which resides in user32.dll. In order to perform a runtime function call we need to know the following:

  • The name of the function to call (MessageBoxA).
  • The name of the library the function resides in (User32.dll).
  • Loading the library into memory, if not already loaded.
  • The address of the library in memory.
  • The address of the function relative to the library.
  • The parameters of the function.

Normally, one can simply call the LoadLibrary function to load external dlls at runtime. But, since our code is executing at an arbitrary memory location, there is additional work involved to first locate the base address of Kernel32 and then walk the export directory to find the LoadLibrary function. In every portable executable with the window or console subsystem, that is to say, non-system, it is safe to assume that both ntdll.dll and kernel32.dll are loaded into memory by the Windows loader. By navigating the Process Environment Block (PEB), one can reliably the base address of every module loaded in the current process.

Navigating the PEB

A pointer to the PEB can be obtained as follows:

#ifndef _WIN64
	p = (PPEB)__readfsdword(0x30);
#else
	p = (PPEB)__readgsqword(0x60);
#endif

Once a pointer to the PEB is obtained, navigate to the PEB_LDR_DATA field which contains a linked-list data structure of each loaded module. These point to the LDR_MODULE structure which contains information about said modules — including the base address of where it was loaded into memory.

It is worthwhile to mention that ShellcodeStdio relies on the InMemoryOrder module list and not the other list entries because kernel32 will always be the 3rd module loaded in all executing environments (current image -> ntdll -> kernel32).

Walking the Export Table

To find the relative virtual address (RVA) of LoadLibrary, and consequentially any other function, we must walk kernel32’s export table. We can obtain a pointer to the export directory from the IMAGE_DATA_DIRECTORY structure.

IMAGE_EXPORT_DIRECTORY structure

The RVA of a desired function can be found by accessing a specific index of the AddressOfFunctions array. A procedural approach to this involves iterating through each name exported function, comparing the name to our function, then following a match, obtain the RVA by providing the corresponding name ordinal as the index in the AddressOfFunctions array. This is more concisely expressed through this code snippet which uses hashed values of the strings to to compare:

IMAGE_EXPORT_DIRECTORY* ied = (IMAGE_EXPORT_DIRECTORY*)(baseAddress + iedRVA);
char* moduleName = (char*)(baseAddress + ied->Name);
DWORD moduleHash = rt_hash(moduleName);
DWORD* nameRVAs = (DWORD*)(baseAddress + ied->AddressOfNames);

for (DWORD i = 0; i < ied->NumberOfNames; ++i) {
	char* functionName = (char*)(baseAddress + nameRVAs[i]);
	if (hash == moduleHash + rt_hash(functionName)) {
		WORD ordinal = ((WORD*)(baseAddress + ied->AddressOfNameOrdinals))[i];
		DWORD functionRVA = ((DWORD*)(baseAddress + ied->AddressOfFunctions))[ordinal];
		return baseAddress + functionRVA;
	}
}

This will yield the address of any name exported function, non-forwarded function that we need (ShellcodeStdio does support forwarded functions). Combine this with some neat C++ constant expression evaluation magic, and we can use this to call Windows API functions pretty much as we normally do — without the need to define every function as an individually typed pointer. To see this part of the source, visit my repository on GitHub.

Putting it All Together

The final step is to adjust our compiler settings. There are some that must be changed in order for ShellcodeStdio to emit proper shellcode.

C/C++ -> Optimization -> /O1, /Ob2, /Oi, /Os, /Oy-, /GL
C/C++ -> Code Generation -> /MT, /GS-, /Gy
Linker -> General -> /INCREMENTAL:NO