Game Hacking #3: Hooking Direct3D EndScene()

reverse-engineering c++ binary gamehacking hooking

I’ve experimented with even moar game hacking and hooking techniques and you didn’t, so here comes another blog post.

Today’s topic is hooking a specific function of the Direct3D library with the goal of making Counter-Strike: Global Offensive draw additional things on the screen. There can be various reasons to do this:

Overview

The function we want to hook is called EndScene(). It’s being called to queue an already existing scene for output. In the context of this blog post a scene is equivalent to a frame and you can therefore say that EndScene() is called once for each frame. Since this function is being executed after a specific scene has been put together, it’s an ideal function to hook when adding additional content to the screen.

The plan is to inject a DLL into the game process that hooks the target function. To add custom content to each scene, an endSceneHook() function that accepts the same list of parameters as the original function has to be implemented. But here’s the thing: According to the documentation EndScene() is parameterless. However, there’s an implicit parameter of type LPDIRECT3DDEVICE9, which is the this pointer. Hence, endSceneHook() is required to have the following function prototype:

HRESULT APIENTRY endSceneHook(LPDIRECT3DDEVICE9 p_pDevice);
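To see why the hook needs an explicit device parameter, it helps to look at what a "parameterless" method looks like once the C++ sugar is gone. The following is only a sketch with made-up names (MockDevice, MockDeviceVtbl, mockEndScene and callEndScene are not part of Direct3D). It mimics the C-style view of a COM interface, where every method receives the object pointer as its first argument:

```cpp
#include <cassert>

// Hypothetical stand-ins for IDirect3DDevice9 and its vtable; not real Direct3D types.
struct MockDevice;

struct MockDeviceVtbl {
    // A "parameterless" method still receives the object it operates on.
    long (*EndScene)(MockDevice* self);
};

struct MockDevice {
    const MockDeviceVtbl* vtbl;  // first member: pointer to the function table
    int framesEnded;
};

// The "real" method: counts finished frames and reports success (0, like S_OK).
long mockEndScene(MockDevice* self) {
    self->framesEnded++;
    return 0;
}

const MockDeviceVtbl kMockVtbl = { mockEndScene };

// What a call like device->EndScene() boils down to.
long callEndScene(MockDevice* device) {
    assert(device != nullptr && device->vtbl != nullptr);
    return device->vtbl->EndScene(device);
}
```

A function that takes the place of mockEndScene() has to accept that same pointer, which is the role of the LPDIRECT3DDEVICE9 parameter in the prototype above. On 32-bit Windows the real method additionally uses the stdcall calling convention, which is what APIENTRY expands to.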

Here’s a list of all requirements:

Determining the EndScene() Function Address

Once the DLL is up and running in the game process, the first thing to do is to determine the address of the function that’s going to be hooked. Some clever people have found a convenient way to do this. The general approach is to create a dummy Direct3D device object and copy its virtual function table (VTable) into a separate buffer. An object of type IDirect3DDevice9 starts with a pointer to this table, and the table holds the addresses of the device’s methods. At index 42, there’s a pointer to the EndScene() function that will be hooked. The index value is fixed for the Direct3D 9 library, which CS:GO uses. You can confirm the index yourself by checking the IDirect3DDevice9ExVtbl structure on this page.

For debugging purposes it’s a good idea to place a breakpoint at the EndScene() function. An easy way to do this is to start x64dbg, download the symbols for d3d9.dll and search for the function name:

The cool thing about this dummy device technique is that it’s not necessary to hardcode function addresses or to scan the memory for the first bytes of the desired function.

Here’s my slightly modified version of the code:

// see https://guidedhacking.com/threads/get-direct3d9-and-direct3d11-devices-dummy-device-method.11867/
// Create the dummy d3d device and copy the object contents in order to obtain the
// addresses of functions that are about to be hooked
bool d3dHelper::getD3D9Device() {
    IDirect3D9* d3dSys = Direct3DCreate9(D3D_SDK_VERSION);
    IDirect3DDevice9* dummyDev = NULL;

    // Options to create dummy device
    D3DPRESENT_PARAMETERS d3dpp = {};
    d3dpp.Windowed = false;
    d3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD;
    d3dpp.hDeviceWindow = hwnd;

    HRESULT dummyDeviceCreated = d3dSys->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, d3dpp.hDeviceWindow, D3DCREATE_SOFTWARE_VERTEXPROCESSING, &d3dpp, &dummyDev);

    // Copy memory to our own data structure
    memcpy(this->d3d9DeviceTable, *reinterpret_cast<void***>(dummyDev), sizeof(this->d3d9DeviceTable));

    // Destroy the device afterwards, we don't need it anymore
    dummyDev->Release();
    d3dSys->Release();
    return true;
}

Error checking and checks for special cases have been removed; check the full source code referenced at the end of this post for a copy-pasta-compatible version.

Getting the function address for EndScene() is now as simple as this:

char* ogEndSceneAddress = reinterpret_cast<char*>(d3dHelper.d3d9DeviceTable[42]);
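The reason a throwaway device is good enough is that objects of the same class point at the same vtable, so any instance leads to the same function addresses. Here’s a small portable sketch of that idea. FakeDevice and copyVTable are made-up names, and reading the vtable pointer from the start of the object is an assumption that holds for the common ABIs (MSVC, Itanium) but isn’t guaranteed by the C++ standard:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class standing in for IDirect3DDevice9; it has three virtual methods.
struct FakeDevice {
    virtual int beginScene() { return 1; }
    virtual int present()    { return 2; }
    virtual int endScene()   { return 3; }
    int someState = 0;
};

constexpr std::size_t kFakeMethodCount = 3;
constexpr std::size_t kFakeEndSceneIndex = 2;  // plays the role of index 42

// Copy the vtable entries of `obj` into `out`.
// Assumption: the object starts with a pointer to an array of function pointers.
void copyVTable(const FakeDevice& obj, void* (&out)[kFakeMethodCount]) {
    void** vtable = nullptr;
    std::memcpy(&vtable, &obj, sizeof(vtable));  // read the hidden vtable pointer
    assert(vtable != nullptr);
    std::memcpy(out, vtable, sizeof(out));       // copy the function addresses
}
```

Copying the table of two different FakeDevice objects yields identical entries, no matter what state the objects carry. That’s the property the dummy device relies on.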

How to Hook

Now it’s time to place a hook at the function address that was just determined using the technique described above. I’m using a trampoline hook here. Here’s how it works:

  1. Allocate some executable and writable memory. I call it trampoline.
  2. Copy the first N bytes of the original function to the start of the trampoline. N is determined by checking the disassembly of the hooked function: it has to cover at least the 5 bytes that the detour jump will overwrite, and it must end on an instruction boundary so that only complete instructions of the function prologue are copied.
  3. The trampoline needs to be able to jump back to the first instruction of the original function that wasn’t copied. The easiest way to accomplish this is to add a relative jump instruction to the trampoline. The jump opcode and its relative offset are appended to the copied function prologue.
  4. The beginning of the original function gets modified to cause it to take a detour to the hook function. For this, the memory protection is modified temporarily since otherwise write operations to the code section may fail. At the beginning of the hooked function a relative jump to the hook function is inserted by replacing the original instructions.
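The byte shuffling in these steps can be tried out on plain buffers without touching a real process. The following is a rough sketch, not the hooking code itself: writeRelJmp, buildTrampoline and writeDetour are made-up helper names, and the addresses are pretend 32-bit values that only serve to compute jump offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint8_t kRelJmpOpcode = 0xE9;
constexpr std::size_t kRelJmpSize = 5;  // 1 opcode byte + 4 offset bytes

// Write `jmp rel32` into `out`. `from` is the address of the jump instruction itself;
// the CPU adds the offset to the address of the *next* instruction (from + 5).
void writeRelJmp(std::uint8_t* out, std::uint32_t from, std::uint32_t to) {
    const std::uint32_t rel = to - (from + static_cast<std::uint32_t>(kRelJmpSize));
    out[0] = kRelJmpOpcode;
    for (int i = 0; i < 4; ++i) {
        out[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));  // little-endian
    }
}

// Steps 2 and 3: stolen bytes followed by a jump to the first untouched instruction.
// `fnAddr` and `trampAddr` are the (pretend) addresses the buffers would live at.
std::vector<std::uint8_t> buildTrampoline(const std::uint8_t* fn, std::uint32_t fnAddr,
                                          std::uint32_t trampAddr, std::size_t stolen) {
    assert(stolen >= kRelJmpSize);  // the detour needs at least 5 bytes of room
    const std::uint32_t n = static_cast<std::uint32_t>(stolen);
    std::vector<std::uint8_t> tramp(stolen + kRelJmpSize);
    std::memcpy(tramp.data(), fn, stolen);
    writeRelJmp(tramp.data() + stolen, trampAddr + n, fnAddr + n);
    return tramp;
}

// Step 4: overwrite the start of the function with a jump to the hook.
void writeDetour(std::uint8_t* fn, std::uint32_t fnAddr, std::uint32_t hookAddr) {
    writeRelJmp(fn, fnAddr, hookAddr);
}
```

The order matters: the trampoline has to be built before the detour is written, otherwise the stolen bytes would already contain the jump. Bytes 6 and 7 of a 7-byte prologue are left behind as garbage after the 5-byte detour, which is harmless because execution never reaches them.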

The trampoline may seem unnecessary at first, but without it the hook would end up in an endless loop: EndScene() would jump to endSceneHook(), which calls the original function at the end, which jumps right back to endSceneHook(), which calls the original function again, and so on and on.

Check this out for a better overview:

Here’s some code for all this:

// opcode of a relative JMP with a 32 bit offset
const unsigned char REL_JMP = 0xE9;
// 1 byte instruction + 4 bytes address
const unsigned int SIZE_OF_REL_JMP = 5;

// adapted from https://guidedhacking.com/threads/simple-x86-c-trampoline-hook.14188/
// hookedFn: The function that's about to be hooked
// hookFn: The function that will be executed before `hookedFn` by causing `hookedFn` to take a detour
// copyBytesSize: Size of the complete prologue instructions to steal, at least SIZE_OF_REL_JMP
void* WINAPI hookFn(char* hookedFn, char* hookFn, int copyBytesSize, unsigned char* backupBytes, std::string descr) {

    //
    // 0. Backup the original function prologue
    //
    ReadProcessMemory(GetCurrentProcess(), hookedFn, backupBytes, copyBytesSize, 0);

    //
    // 1. Setup the trampoline
    //
    // --> Lets `hookFn` run the original `hookedFn` code without causing an infinite loop
    // Calling `hookedFn` directly would hit the detour and execute `hookFn` again, and so on :)
    //
    // allocate executable memory for the trampoline
    // the size is (amount of bytes copied from the original function) + (size of a relative jump + address)
    //
    char* trampoline = (char*)VirtualAlloc(0, copyBytesSize + SIZE_OF_REL_JMP, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    // steal the first `copyBytesSize` bytes from the original function
    // these will be used to make the trampoline work
    // --> jump back to `hookedFn` without executing `hookFn` again
    memcpy(trampoline, hookedFn, copyBytesSize);

    //
    // 2. append the relative JMP instruction after the stolen instructions, calculate and write the offset between the hooked function and the trampoline
    //
    memcpy(trampoline + copyBytesSize, &REL_JMP, sizeof(REL_JMP));

    // Distance between the trampoline and the original function `hookedFn`
    // jump source: the end of the JMP instruction, i.e. trampoline + copyBytesSize + 5
    // jump target: the first instruction that wasn't stolen, i.e. hookedFn + copyBytesSize
    // `copyBytesSize` cancels out, which leaves the subtraction of 5
    int hookedFnTrampolineOffset = hookedFn - trampoline - SIZE_OF_REL_JMP;
    memcpy(trampoline + copyBytesSize + 1, &hookedFnTrampolineOffset, sizeof(hookedFnTrampolineOffset));

    // 3. Detour the original function `hookedFn`
    // --> cause `hookedFn` to execute `hookFn` first
    // remap the first few bytes of the original function as RWX
    DWORD oldProtect;
    VirtualProtect(hookedFn, copyBytesSize, PAGE_EXECUTE_READWRITE, &oldProtect);

    // best variable name ever
    // calculate the size of the relative jump between the start of `hookedFn` and the start of `hookFn`.
    int hookedFnHookFnOffset = hookFn - hookedFn - SIZE_OF_REL_JMP;

    //
    // Take a relative jump to `hookFn` at the beginning
    //
    // of course, `hookFn` has to expect the same parameter types and values
    memcpy(hookedFn, &REL_JMP, sizeof(REL_JMP));
    memcpy(hookedFn + 1, &hookedFnHookFnOffset, sizeof(hookedFnHookFnOffset));

    // restore the previous protection values
    VirtualProtect(hookedFn, copyBytesSize, oldProtect, &oldProtect);

    return trampoline;
}

Once again: Error checks removed.

As can be seen, the hookFn() function returns the address of the trampoline. endSceneHook() (the hook function) needs this address, since it calls the trampoline to run the original EndScene() code, as can be seen in the image above.

For all you low-level fans, here’s what happens to the assembly during the hooking process:

Original EndScene():

0x5F8F46A0  6A 14           push 14                         ; Prologue
0x5F8F46A2  B8 2E01915F     mov eax,d3d9.5F91012E           ; Prologue
0x5F8F46A7  E8 3E8B0100     call <d3d9.__EH_prolog3_catch>  ; Actual code
;<more code>

Modified EndScene() after hooking:

0x5F8F46A0  E9 5ECB0004     jmp dll.66F21203                ; Jump to trampoline
0x5F8F46A5  91              ???                             ; Trash, never executed
0x5F8F46A6  5F              ???                             ; Trash, never executed
0x5F8F46A7  E8 3E8B0100     call <d3d9.__EH_prolog3_catch>  ; Actual code
;<more code>

Trampoline:

0x66F21203  6A 14           push 14                 ; Copied prologue
0x66F21205  B8 2E01915F     mov eax,d3d9.5F91012E   ; Copied prologue
0x66F2120A  E9 9B464221     jmp d3d9.5F8F46A7       ; Jump back to `EndScene()`
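The listings also show where the number of stolen bytes comes from: push 14 takes 2 bytes and mov eax takes 5, so 7 is the smallest count that is at least 5 and doesn’t cut an instruction in half. A toy helper can derive that number automatically. This is only a sketch: toyInstructionLength and stolenByteCount are made-up names, and the decoder knows nothing beyond three simple 32-bit opcodes, so a real project would use a proper length disassembler instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kDetourSize = 5;  // size of the `jmp rel32` that gets written

// Length of one instruction, for a handful of simple 32-bit opcodes only.
// Returns 0 for everything else, including call/jmp rel32, which can't be
// copied verbatim because their offsets depend on where the instruction lives.
std::size_t toyInstructionLength(const std::uint8_t* code) {
    const std::uint8_t op = code[0];
    if (op == 0x6A) return 2;                // push imm8
    if (op >= 0xB8 && op <= 0xBF) return 5;  // mov r32, imm32
    if (op >= 0x50 && op <= 0x57) return 1;  // push r32
    return 0;                                // unknown or not safely copyable
}

// Smallest byte count >= 5 that ends on an instruction boundary, or 0 on failure.
std::size_t stolenByteCount(const std::uint8_t* code, std::size_t available) {
    std::size_t total = 0;
    while (total < kDetourSize) {
        if (total >= available) return 0;
        const std::size_t len = toyInstructionLength(code + total);
        if (len == 0 || total + len > available) return 0;
        total += len;
    }
    return total;
}
```

Fed with the first bytes of the original EndScene() shown above, stolenByteCount() yields 7. It gives up as soon as it meets an opcode it doesn’t know, which is the safe reaction for a hook that would otherwise corrupt the function.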

Adding Content to a Scene

Now only one more thing is missing: The actual endSceneHook() function that adds our custom content to each scene. Here’s a little proof-of-concept that adds a small rectangle to the screen:

[...]
// pointer to a function that's like the original EndScene() function
typedef HRESULT(APIENTRY* endSceneFunc)(LPDIRECT3DDEVICE9 pDevice);
// the returned trampoline
extern endSceneFunc trampEndScene;
[...]

HRESULT APIENTRY d3dHelper::endSceneHook(LPDIRECT3DDEVICE9 p_pDevice) {
    // Save the parameter, since it's used by us in order to draw stuff
    if (!d3dDevice) {
        d3dDevice = p_pDevice;
    }

    // Do own stuff
    drawRectangle(25, 25, 100, 100, D3DCOLOR_ARGB(255, 255, 255, 255));

    // Call original function using the trampoline and pass its result on to the caller
    return trampEndScene(d3dDevice);
}
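The drawRectangle() helper isn’t shown in this post, so here’s a hedged sketch of the arithmetic such a helper could build on. packArgb reproduces the 0xAARRGGBB layout of the D3DCOLOR_ARGB macro, and makeCrosshair computes two thin bars around the screen center. Rect, Crosshair, packArgb and makeCrosshair are made-up names; Rect merely mirrors the x1/y1/x2/y2 fields of D3DRECT.

```cpp
#include <cassert>
#include <cstdint>

// Same bit layout as Direct3D's D3DCOLOR_ARGB macro: 0xAARRGGBB.
constexpr std::uint32_t packArgb(unsigned a, unsigned r, unsigned g, unsigned b) {
    return ((a & 0xFFu) << 24) | ((r & 0xFFu) << 16) | ((g & 0xFFu) << 8) | (b & 0xFFu);
}

// Hypothetical stand-in that mirrors the fields of D3DRECT (left, top, right, bottom).
struct Rect { long x1, y1, x2, y2; };

struct Crosshair { Rect horizontal; Rect vertical; };

// Two thin bars centered on the screen. `length` and `thickness` are in pixels.
Crosshair makeCrosshair(long screenW, long screenH, long length, long thickness) {
    assert(screenW > 0 && screenH > 0 && length > 0 && thickness > 0);
    const long cx = screenW / 2;
    const long cy = screenH / 2;
    const long halfLen = length / 2;
    const long halfThick = thickness / 2;
    Crosshair c;
    c.horizontal = { cx - halfLen, cy - halfThick,
                     cx - halfLen + length, cy - halfThick + thickness };
    c.vertical   = { cx - halfThick, cy - halfLen,
                     cx - halfThick + thickness, cy - halfLen + length };
    return c;
}
```

Inside the hook, each rectangle could then be filled with a solid color, for example through IDirect3DDevice9::Clear(), which accepts an array of D3DRECT structures and a D3DCOLOR value. Whether the original helper does it that way is an assumption.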

Compiling

I’ve used these Visual Studio settings:

$ g++ -m32 -shared -o dll.dll .\dll.cpp

Demo

Normally, there’s no crosshair for sniper rifles in CS:GO:

My proof-of-concept DLL adds a beautiful crosshair that allows aiming without using the scope:


References
