Fixing a bug in donut

Oct 9, 2024

—

A client used donut to generate a reflective loader for a 64-bit PE file. The shellcode was then embedded and run from a .NET assembly. Strangely, it was not working, but only with some assemblies.

After some mucking around with a debug version of donut, we found that the loader code was exiting prematurely, leaving us with the following message.

Host process [abc] and file [xyz] are not compatible… cannot load.

We can see the following check in the source code where this message originates.

// where:
// nt -> IMAGE_NT_HEADERS is the 64-bit PE file
// nthost -> IMAGE_NT_HEADERS is the .NET assembly
if(nt->FileHeader.Machine != nthost->FileHeader.Machine) {
    DPRINT("Host process %08lx and file %08lx are not compatible...cannot load.", 
      nthost->FileHeader.Machine, nt->FileHeader.Machine);
    return;
}

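This check is reasonable, but the assumption behind it (that a host and a file with different Machine values can never be compatible) does not always hold.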

Platform-Neutral Assemblies

In the client’s case, the assembly hosting the shellcode was a 32-bit .NET assembly. But, like many .NET assemblies, it was built as an ILONLY or platform-neutral assembly. If you’ve ever used Visual Studio to set the platform target for a .NET project, you may have encountered the ANYCPU or ANYCPU32BITPREFERRED options.

ANYCPU: compiles your assembly to run on any platform. Your application runs as a 64-bit process whenever possible and falls back to 32-bit when only that mode is available.
ANYCPU32BITPREFERRED: compiles your assembly to run on any platform. Your application runs in 32-bit mode on systems that support both 64-bit and 32-bit applications. You can specify this option only for projects that target .NET Framework 4.5 or later.

Selecting one of these options means that your assembly will be declared platform-neutral. This information is embedded in the Flags field of the IMAGE_COR20_HEADER by setting the COMIMAGE_FLAGS_ILONLY flag.

// CLR Header entry point flags.
COMIMAGE_FLAGS_ILONLY               =0x00000001,
COMIMAGE_FLAGS_32BITREQUIRED        =0x00000002,
COMIMAGE_FLAGS_IL_LIBRARY           =0x00000004,
COMIMAGE_FLAGS_STRONGNAMESIGNED     =0x00000008,
COMIMAGE_FLAGS_NATIVE_ENTRYPOINT    =0x00000010,
COMIMAGE_FLAGS_TRACKDEBUGDATA       =0x00010000,
COMIMAGE_FLAGS_32BITPREFERRED       =0x00020000

typedef struct IMAGE_COR20_HEADER
{
    // Header versioning
    ULONG                   cb;              
    USHORT                  MajorRuntimeVersion;
    USHORT                  MinorRuntimeVersion;
    
    // Symbol table and startup information
    IMAGE_DATA_DIRECTORY    MetaData;        
    ULONG                   Flags;           
    ULONG                   EntryPointToken;
    
    // Binding information
    IMAGE_DATA_DIRECTORY    Resources;
    IMAGE_DATA_DIRECTORY    StrongNameSignature;

    // Regular fixup and binding information
    IMAGE_DATA_DIRECTORY    CodeManagerTable;
    IMAGE_DATA_DIRECTORY    VTableFixups;
    IMAGE_DATA_DIRECTORY    ExportAddressTableJumps;

    // Precompiled image info (internal use only - set to zero)
    IMAGE_DATA_DIRECTORY    ManagedNativeHeader;
    
} IMAGE_COR20_HEADER;

The related COMIMAGE_FLAGS_32BITREQUIRED and COMIMAGE_FLAGS_32BITPREFERRED flags interact as a pair to get the performance profile desired for platform-neutral assemblies while retaining backward compatibility with pre-4.5 runtimes/OSs, which don’t know about COMIMAGE_FLAGS_32BITPREFERRED.

COMIMAGE_FLAGS_32BITREQUIRED originally meant “this assembly is x86-only” (required to distinguish platform-neutral assemblies which also mark their PE MachineType as IMAGE_FILE_MACHINE_I386)
COMIMAGE_FLAGS_32BITPREFERRED has been added so we can create a sub-class of platform-neutral assembly that prefers to be loaded into the 32-bit environment for perf reasons but is still compatible with 64-bit environments

To retain maximum backward compatibility you cannot simply read or write one of these flags. You must treat them as a pair, a two-bit field with the following meanings:

32BITREQUIRED  32BITPREFERRED  Meaning
0              0               No special meaning, MachineType and ILONLY flag determine image requirements
0              1               Illegal, reserved for future use
1              0               Image is x86-specific
1              1               Image is platform neutral and prefers to be loaded 32-bit when possible

Source: corhdr.h from https://github.com/dotnet/runtime
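
Read as a two-bit field, the table above maps onto a small decoder. The following is a minimal sketch, not code from the runtime: the two flag values are the ones listed in corhdr.h, while the enum bitness_pref and the function classify_32bit_flags are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Flag values as listed in corhdr.h (quoted above). */
#define COMIMAGE_FLAGS_32BITREQUIRED  0x00000002u
#define COMIMAGE_FLAGS_32BITPREFERRED 0x00020000u

/* Hypothetical names for the four rows of the table. */
typedef enum {
    PREF_NONE,          /* 0 0: MachineType and ILONLY decide */
    PREF_ILLEGAL,       /* 0 1: reserved for future use */
    PREF_X86_ONLY,      /* 1 0: image is x86-specific */
    PREF_NEUTRAL_32BIT  /* 1 1: platform neutral, prefers 32-bit */
} bitness_pref;

/* Treat the two flags as a pair; never test one of them alone. */
static bitness_pref classify_32bit_flags(uint32_t cor20_flags)
{
    int required  = (cor20_flags & COMIMAGE_FLAGS_32BITREQUIRED)  != 0;
    int preferred = (cor20_flags & COMIMAGE_FLAGS_32BITPREFERRED) != 0;

    if (required && preferred) return PREF_NEUTRAL_32BIT;
    if (required)              return PREF_X86_ONLY;
    if (preferred)             return PREF_ILLEGAL;
    return PREF_NONE;
}
```

The case that matters for the rest of this post is PREF_NONE combined with the ILONLY flag: nothing in the pair pins the image to 32-bit, so it can end up in a 64-bit process.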

Platform-neutral ILONLY assemblies are always PE32 image files on disk, but depending on the execution context they behave like either a PE32 or a PE32+ image file in memory. When run on a 64-bit system, these assemblies are transformed to the PE32+ image format in memory. This transformation happens during process initialization in the mscoree.dll!_CorValidateImage function, with the heavy lifting done in mscoree.dll!PE32ToPE32Plus. The call stack at that point looks like this:

mscoree.dll!PE32ToPE32Plus
mscoree.dll!_CorValidateImage
ntdll!LdrpCorValidateImage
ntdll!LdrpCheckCorImage
ntdll!LdrpMapDll
ntdll!LdrpLoadDll

The PE32ToPE32Plus function looks like this:

static
HRESULT PE32ToPE32Plus(PBYTE pImage) {
    IMAGE_DOS_HEADER *pDosHeader = (IMAGE_DOS_HEADER*)pImage;
    IMAGE_NT_HEADERS32 *pHeader32 = (IMAGE_NT_HEADERS32*) (pImage + pDosHeader->e_lfanew);
    IMAGE_NT_HEADERS64 *pHeader64 = (IMAGE_NT_HEADERS64*) pHeader32;

    _ASSERTE(&pHeader32->OptionalHeader.Magic == &pHeader32->OptionalHeader.Magic);
    _ASSERTE(pHeader32->OptionalHeader.Magic == IMAGE_NT_OPTIONAL_HDR32_MAGIC);

    // Move the data directory and section headers down 16 bytes.
    PBYTE pEnd32 = (PBYTE) (IMAGE_FIRST_SECTION(pHeader32)
                            + pHeader32->FileHeader.NumberOfSections);
    PBYTE pStart32 = (PBYTE) &pHeader32->OptionalHeader.DataDirectory[0];
    PBYTE pStart64 = (PBYTE) &pHeader64->OptionalHeader.DataDirectory[0];
    _ASSERTE(pStart64 - pStart32 == 16);

    if ( (pEnd32 - pImage) + 16 /* delta in headers */ + 16 /* label descriptor */ > 4096 ) {
        // This should never happen.  An IL_ONLY image should at most 3 sections.  
        _ASSERTE(!"_CORValidateImage(): Insufficent room to rewrite headers as PE32+");
        return STATUS_INVALID_IMAGE_FORMAT;
    }

    memmove(pStart64, pStart32, pEnd32 - pStart32);

    // Move the tail fields in reverse order.
    pHeader64->OptionalHeader.NumberOfRvaAndSizes = pHeader32->OptionalHeader.NumberOfRvaAndSizes;
    pHeader64->OptionalHeader.LoaderFlags = pHeader32->OptionalHeader.LoaderFlags;
    pHeader64->OptionalHeader.SizeOfHeapCommit = pHeader32->OptionalHeader.SizeOfHeapCommit;
    pHeader64->OptionalHeader.SizeOfHeapReserve = pHeader32->OptionalHeader.SizeOfHeapReserve;
    pHeader64->OptionalHeader.SizeOfStackCommit = pHeader32->OptionalHeader.SizeOfStackCommit;
    pHeader64->OptionalHeader.SizeOfStackReserve = pHeader32->OptionalHeader.SizeOfStackReserve;

    // One more field that's not the same
    pHeader64->OptionalHeader.ImageBase = pHeader32->OptionalHeader.ImageBase;

    // The optional header changed size.
    pHeader64->FileHeader.SizeOfOptionalHeader += 16;
    pHeader64->OptionalHeader.Magic = IMAGE_NT_OPTIONAL_HDR64_MAGIC;

    // Several directorys can now be nuked.
    pHeader64->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IAT].VirtualAddress = 0;
    pHeader64->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IAT].Size = 0;
    pHeader64->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_TLS].VirtualAddress = 0;
    pHeader64->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_TLS].Size = 0;
    pHeader64->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress = 0;
    pHeader64->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].Size = 0;

    // Great.  Now just have to make a new slot for the entry point.
    PBYTE pEnd64 = (PBYTE) (IMAGE_FIRST_SECTION(pHeader64) + pHeader64->FileHeader.NumberOfSections);
    pHeader64->OptionalHeader.AddressOfEntryPoint = (ULONG) (pEnd64 - pImage);
    // This will get filled in shortly ...

    return STATUS_SUCCESS;
}
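
The 16-byte delta asserted in the function above falls out of the two optional-header layouts. Here is a small sketch of the arithmetic, assuming the field widths from the published PE/COFF layout (the name optional_header_growth is made up for this example): five fields widen from 4 to 8 bytes, and the PE32-only BaseOfData field disappears.

```c
/* Field widths in bytes, per the PE/COFF optional header layout. */
enum {
    WIDENED_FIELDS  = 5,    /* ImageBase, SizeOfStackReserve, SizeOfStackCommit,
                               SizeOfHeapReserve, SizeOfHeapCommit */
    WIDTH_PE32      = 4,
    WIDTH_PE32_PLUS = 8,
    BASE_OF_DATA    = 4,    /* present in PE32 only */
    OPT_HEADER_PE32 = 0xE0  /* size of IMAGE_OPTIONAL_HEADER32 */
};

/* Net number of bytes the optional header grows by when rewritten as PE32+. */
static int optional_header_growth(void)
{
    return WIDENED_FIELDS * (WIDTH_PE32_PLUS - WIDTH_PE32) - BASE_OF_DATA;
}
```

Adding that growth to 0xE0 gives 0xF0, the size of a PE32+ optional header, which is why SizeOfOptionalHeader is bumped by 16.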

You might notice that the code does not touch the Machine field of the IMAGE_FILE_HEADER, which is the field donut’s original check compares. The only file-header change is SizeOfOptionalHeader, which grows by 16. What PE32ToPE32Plus does change is the Magic field of the IMAGE_OPTIONAL_HEADER, from IMAGE_NT_OPTIONAL_HDR32_MAGIC to IMAGE_NT_OPTIONAL_HDR64_MAGIC.

We can confirm this by dumping the headers after the transformation.

PS> dumpbin /headers .\project_0000000000070000.bin

Dump of file .\project_0000000000070000.bin

PE signature found

File Type: EXECUTABLE IMAGE

FILE HEADER VALUES
             14C machine (x86)
               3 number of sections
        C2D3D193 time date stamp Sun Jul 30 13:34:11 2073
               0 file pointer to symbol table
               0 number of symbols
              F0 size of optional header
              22 characteristics
                   Executable
                   Application can handle large (>2GB) addresses

OPTIONAL HEADER VALUES
             20B magic # (PE32+)
             ...
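
The same two fields can also be read programmatically. The sketch below is a stand-in for dumpbin and is not part of donut: pe_machine_and_magic, rd16 and rd32 are hypothetical helpers, and the offsets are the standard PE ones (e_lfanew at 0x3C, Machine 4 bytes into the NT headers, Magic 24 bytes in).

```c
#include <stddef.h>
#include <stdint.h>

/* Read little-endian values without relying on alignment or host endianness. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Extract FileHeader.Machine and OptionalHeader.Magic from a PE image buffer.
 * Returns 0 on success, -1 if the buffer does not look like a PE image.
 */
static int pe_machine_and_magic(const uint8_t *img, size_t len,
                                uint16_t *machine, uint16_t *magic)
{
    uint32_t nt;

    if (len < 0x40 || rd16(img) != 0x5A4D)      /* "MZ" */
        return -1;
    nt = rd32(img + 0x3C);                      /* e_lfanew */
    if (nt > len || len - nt < 4 + 20 + 2)      /* signature + file header + Magic */
        return -1;
    if (rd32(img + nt) != 0x00004550)           /* "PE\0\0" */
        return -1;

    *machine = rd16(img + nt + 4);              /* FileHeader.Machine */
    *magic   = rd16(img + nt + 4 + 20);         /* OptionalHeader.Magic */
    return 0;
}
```

Run against the dumped image, this should report the same pair dumpbin shows: Machine 0x14C alongside Magic 0x20B.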

Fixing the Check

Instead of evaluating:

nt->FileHeader.Machine != nthost->FileHeader.Machine

Donut’s loader could instead evaluate:

nt->OptionalHeader.Magic != nthost->OptionalHeader.Magic

There is room to be a bit more granular with the check so here’s how I ended up implementing it. I don’t consider the optional header in the fixed check. Instead, I check if the host process is a platform-neutral assembly without the COMIMAGE_FLAGS_32BITREQUIRED flag.

if (nt->FileHeader.Machine != nthost->FileHeader.Machine) {
    // A mismatch is not always fatal: an IL_ONLY PE32 host loaded on 64-bit
    // Windows keeps Machine == I386 after conversion to PE32+, so a PE32+
    // (AMD64) exe/dll can still be loaded into it.
    BOOL neutral_host =
        nthost->FileHeader.Machine == IMAGE_FILE_MACHINE_I386 &&
        nt->FileHeader.Machine == IMAGE_FILE_MACHINE_AMD64 &&
        CheckForILOnly(nthost, (ULONG_PTR)host);

    if (!neutral_host) {
        DPRINT("Host process %08lx and file %08lx are not compatible...cannot load.",
            nthost->FileHeader.Machine, nt->FileHeader.Machine);
        return;
    }
}

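// Returns TRUE if the host image is a platform-neutral .NET assembly:
// it has a CLR header, IL_ONLY is set and 32_BIT_REQUIRED is clear.
BOOL CheckForILOnly(PIMAGE_NT_HEADERS nthost, ULONG_PTR host)
{
    PIMAGE_DATA_DIRECTORY   net_data_dir;
    PBYTE                   cor20_hdr;
    DWORD                   cor20_flags;

    net_data_dir = &nthost->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR];

    // No CLR header means this is not a .NET assembly at all.
    if (net_data_dir->Size && net_data_dir->VirtualAddress) {

        cor20_hdr = (PBYTE)(host + net_data_dir->VirtualAddress);
        // Flags sits at offset 0x10 in IMAGE_COR20_HEADER:
        // cb (4) + MajorRuntimeVersion (2) + MinorRuntimeVersion (2) + MetaData (8)
        cor20_flags = *(PDWORD)((ULONG_PTR)cor20_hdr + 0x10);

        if (cor20_flags & 0x1 /* IL_ONLY */) {
            if ((cor20_flags & 0x2 /* 32_BIT_REQUIRED */) == 0) {
                return TRUE;
            }
        }
    }

    return FALSE;
}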

If you maintain or use a reflective loader, you should ensure this edge case is handled.
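
One way to pin the edge case down is to express the decision as a pure function and assert on the interesting combinations. This is a sketch under stated assumptions rather than donut's API: can_load and the shortened constant names are hypothetical, and it assumes the only legitimate Machine mismatch is an AMD64 file going into an I386 IL-only host that does not require 32-bit.

```c
#include <stdint.h>

#define MACHINE_I386       0x014C  /* IMAGE_FILE_MACHINE_I386 */
#define MACHINE_AMD64      0x8664  /* IMAGE_FILE_MACHINE_AMD64 */
#define COR_ILONLY         0x1u    /* COMIMAGE_FLAGS_ILONLY */
#define COR_32BITREQUIRED  0x2u    /* COMIMAGE_FLAGS_32BITREQUIRED */

/*
 * Returns 1 if a file with file_machine may be loaded into the host.
 * host_has_cor20 is nonzero when the host image has a CLR header,
 * in which case host_cor20_flags holds its Flags field.
 */
static int can_load(uint16_t file_machine, uint16_t host_machine,
                    int host_has_cor20, uint32_t host_cor20_flags)
{
    if (file_machine == host_machine)
        return 1;

    /* Edge case: an IL-only host that is not pinned to x86 keeps
       Machine == I386 even when it runs as a 64-bit process. */
    return host_machine == MACHINE_I386 &&
           file_machine == MACHINE_AMD64 &&
           host_has_cor20 &&
           (host_cor20_flags & COR_ILONLY) != 0 &&
           (host_cor20_flags & COR_32BITREQUIRED) == 0;
}
```

Taking plain values instead of header pointers means the combinations can be checked without a live process; a real loader would feed it the fields read from the mapped images.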

Further Questions

Why does the loader ignore the Machine field for platform-neutral assemblies?

This appears to contradict the documentation when both the COMIMAGE_FLAGS_32BITREQUIRED and COMIMAGE_FLAGS_32BITPREFERRED flags are unset.

I think the most likely answer is the loader just ignores it — the information is not needed.

Another explanation may lie in semantics, though this is speculation on my part.

MSDN describes the Machine field as, “The architecture type of the computer”, adding, “An image file can only be run on the specified computer or a system that emulates the specified computer”.

It would seem, then, that the Machine field doesn’t indicate whether the file is in PE32+ format. Rather, it describes the architecture the computer must support in order to run the image file.

Platform-neutral assemblies contain no real machine code (other than a placeholder entrypoint)
Platform-neutral assemblies will always have a Machine value of IMAGE_FILE_MACHINE_I386
IMAGE_FILE_MACHINE_I386 can always be emulated on 64-bit Windows

So… maybe there’s no need to transform it?