This is a 2 part tutorial, the second part follows this post.
"Peace among worlds"
Introduction
There are generally three agreed-upon "standard" methods for code injection:
- Static modification of a PE file, such as editing its Import Address Table (IAT) or adding some shellcode to its startup routine.
- Injecting code into a live process through the use of WriteProcessMemory/ReadProcessMemory (WPM/RPM) and CreateRemoteThread.
- Exploiting Windows' DLL search order to load a module (also known as ProxyDLLs).
So what IS a Proxy DLL?
A Proxy DLL is named as such because it acts as a "proxy" between a process and the actual library it wants to access. It's much like browsing the internet at school or work, where a proxy server fetches web pages for you. This allows the system administrators to filter the internet and improve response times by caching frequently requested pages (www.google.com for example).
A Proxy DLL functions in much the same way. It works as an interface between a process and the real DLL, allowing us to apply hooks, modify return data and even be automagically injected into the target process simply by placing our DLL in the game's directory (amazing, right?).
To understand exactly how this works you need to know a little bit about how PE files are structured and how Windows behaves under certain conditions. Previously I mentioned the Import Address Table (IAT), which is a data structure common to PE files on Windows. Together with the rest of the import data stored next to it, it records the names of all the external libraries and functions that an executable (or DLL) needs to run successfully.
Let's examine the function D3D11CreateDevice, which is the startup function for accessing the DirectX 11 library. MSDN tells us that in order to use this function we need to link our executable against D3D11.lib. When we link against an import library like this (load-time dynamic linking; static linking is different) we're telling the linker that we want to use a function that is located outside of our executable. Failing to do this results in the standard "unresolved external" error you may have seen, because the linker doesn't know where the function resides and so has nothing to point the code at.
When the linker processes D3D11.lib it adds an entry to our executable's import data with the name of the library (d3d11.dll) and the name of each function we use from it. This way, when we launch our executable, the Windows loader sees this information and goes "This process requires DX11: better make sure that is mapped into memory before calling main". The loader then writes the address of each imported function into the IAT.
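To make the loader's job a bit more concrete, here is a toy model of it. This is only a sketch: real import data is a set of binary PE structures, and the names ImportDescriptor, ExportTable and bind_imports are made up for illustration. The point it tries to show is that the loader binds by name, so it will happily bind to whichever module answers to "d3d11.dll", as long as that module exports what was asked for.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model only: real PE import tables are binary structures, not std::map.
using FuncPtr = int (*)();

// What a loaded module offers: export name -> address.
using ExportTable = std::map<std::string, FuncPtr>;

// What an executable asks for: one DLL name plus the functions it imports.
struct ImportDescriptor {
    std::string dll_name;
    std::vector<std::string> function_names;
    std::vector<FuncPtr> iat;  // filled in by the "loader", one slot per name
};

// Mimics the loader: fill every IAT slot from the module's exports.
// Returns false as soon as one import cannot be found.
bool bind_imports(ImportDescriptor& imports, const ExportTable& module_exports) {
    imports.iat.clear();
    for (const std::string& name : imports.function_names) {
        auto it = module_exports.find(name);
        if (it == module_exports.end()) return false;
        imports.iat.push_back(it->second);
    }
    return true;
}

// Hypothetical stand-ins for the genuine function and our proxy version.
int real_create_device() { return 11; }
int proxy_create_device() { return 42; }
```

Binding the same ImportDescriptor first against a table containing real_create_device and then against one containing proxy_create_device changes what the IAT slot calls, without the executable itself changing at all.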
However this is where things get interesting. Remember I said you'd need to know about Windows behavior in certain situations? This is one of them. There are two types of paths in Windows:
- Relative (Ex: "kernel32.dll")
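- Absolute (Ex: "C:\Windows\System32\kernel32.dll")
An import entry normally names a DLL by file name only, which is a relative path, so Windows has to go looking for the file. Microsoft's documentation describes the search as follows (this is the order with safe DLL search mode enabled):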
- If the DLL is on the list of known DLLs for the version of Windows on which the application is running, the system uses its copy of the known DLL (and the known DLL's dependent DLLs, if any). The system does not search for the DLL. For a list of known DLLs on the current system, see the following registry key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs.
- The directory from which the application loaded.
- The system directory. Use the GetSystemDirectory function to get the path of this directory.
- The 16-bit system directory. There is no function that obtains the path of this directory, but it is searched.
- The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
- The current directory.
- The directories that are listed in the PATH environment variable. Note that this does not include the per-application path specified by the App Paths registry key. The App Paths key is not used when computing the DLL search path.
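The directory part of that search can be sketched in a few lines. This is a simplified model under stated assumptions: it ignores the known DLLs check, the function name resolve_dll and the example paths are hypothetical, and the file system is replaced by a callback so the sketch runs anywhere.

```cpp
#include <functional>
#include <string>
#include <vector>

// Walks the directories in order and returns the full path of the first
// candidate that exists, or an empty string if none does. `exists` is
// injected so the sketch does not touch the real file system.
std::string resolve_dll(const std::string& dll_name,
                        const std::vector<std::string>& search_dirs,
                        const std::function<bool(const std::string&)>& exists) {
    for (const std::string& dir : search_dirs) {
        const std::string candidate = dir + "\\" + dll_name;
        if (exists(candidate)) return candidate;
    }
    return std::string();
}
```

With the game's folder listed ahead of the system directory, a d3d9.dll dropped next to the executable is found first and the copy in System32 is never reached. That ordering is the whole trick behind a Proxy DLL.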
If you want to test this with a game, create a DLL, put a MessageBox in DLL_PROCESS_ATTACH, rename it to a DLL that the game uses (like d3d9.dll) and place it in the game's directory. Running the game will result in your message showing, and then the game will crash.

Avoiding the crash
The reason the game immediately crashes is that your DLL does not export the symbols that the game is expecting. In order to successfully proxy the library we need to mimic what the game is expecting. But how can we find out exactly what the game needs to work?
On Windows we can use DUMPBIN to pull information from PE files. Open the VS Tools command prompt and navigate to your game's folder containing the executable.
Then run DUMPBIN on ref_vk.dll, which is the DLL containing all the Vulkan logic for vkQuake2:
Code:
DUMPBIN /IMPORTS ref_vk.dll > dump.txt
This will create a new text document called dump.txt containing the output of the command we just ran.
Code:
Microsoft (R) COFF/PE Dumper Version 14.22.27905.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file ref_vk.dll
File Type: DLL
Section contains the following imports:
vulkan-1.dll
18002B310 Import Address Table
18003A1D8 Import Name Table
0 time date stamp
0 Index of first forwarder reference
10 vkCmdBlitImage
47 vkCreateImageView
15 vkCmdCopyBufferToImage
17 vkCmdCopyImageToBuffer
6E vkEnumerateDeviceExtensionProperties
74 vkEnumeratePhysicalDevices
A8 vkGetPhysicalDeviceSurfacePresentModesKHR
A0 vkGetPhysicalDeviceProperties
81 vkGetDeviceQueue
A2 vkGetPhysicalDeviceQueueFamilyProperties
97 vkGetPhysicalDeviceFeatures
A9 vkGetPhysicalDeviceSurfaceSupportKHR
3F vkCreateDevice
A7 vkGetPhysicalDeviceSurfaceFormatsKHR
BA vkResetFences
4F vkCreateSemaphore
4D vkCreateSampler
3 vkAllocateDescriptorSets
44 vkCreateFramebuffer
6A vkDestroySurfaceKHR
B vkCmdBeginRenderPass
66 vkDestroySampler
3C vkCreateDescriptorPool
60 vkDestroyInstance
38 vkCreateBuffer
75 vkFlushMappedMemoryRanges
8A vkGetImageMemoryRequirements
25 vkCmdPipelineBarrier
1 vkAcquireNextImageKHR
21 vkCmdEndRenderPass
6C vkDeviceWaitIdle
54 vkDestroyBuffer
1D vkCmdDrawIndexed
3D vkCreateDescriptorSetLayout
8 vkBindImageMemory
4 vkAllocateMemory
E vkCmdBindPipeline
6 vkBindBufferMemory
78 vkFreeMemory
9D vkGetPhysicalDeviceMemoryProperties
30 vkCmdSetScissor
5F vkDestroyImageView
65 vkDestroyRenderPass
AF vkInvalidateMappedMemoryRanges
5E vkDestroyImage
46 vkCreateImage
B0 vkMapMemory
79 vkGetBufferMemoryRequirements
26 vkCmdPushConstants
1C vkCmdDraw
76 vkFreeCommandBuffers
52 vkCreateSwapchainKHR
AE vkGetSwapchainImagesKHR
A6 vkGetPhysicalDeviceSurfaceCapabilitiesKHR
61 vkDestroyPipeline
45 vkCreateGraphicsPipelines
80 vkGetDeviceProcAddr
4A vkCreatePipelineLayout
8F vkGetInstanceProcAddr
B3 vkQueuePresentKHR
BD vkUnmapMemory
F vkCmdBindVertexBuffers
63 vkDestroyPipelineLayout
58 vkDestroyDescriptorSetLayout
4C vkCreateRenderPass
BF vkUpdateDescriptorSets
6B vkDestroySwapchainKHR
57 vkDestroyDescriptorPool
99 vkGetPhysicalDeviceFormatProperties
5A vkDestroyDevice
77 vkFreeDescriptorSets
69 vkDestroyShaderModule
C vkCmdBindDescriptorSets
34 vkCmdSetViewport
68 vkDestroySemaphore
D vkCmdBindIndexBuffer
56 vkDestroyCommandPool
48 vkCreateInstance
5D vkDestroyFramebuffer
43 vkCreateFence
C0 vkWaitForFences
B4 vkQueueSubmit
5 vkBeginCommandBuffer
2 vkAllocateCommandBuffers
5C vkDestroyFence
3A vkCreateCommandPool
6D vkEndCommandBuffer
14 vkCmdCopyBuffer
50 vkCreateShaderModule
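53 vkCreateWin32SurfaceKHR
As you can see, vkQuake2 requires ref_vk.dll, which in turn requires vulkan-1.dll and 90 imported functions to launch correctly. It's also important to note that some executables and DLLs may have a dependency chain that needs to be resolved prior to launch.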
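Nobody wants to copy 90 names out of a text file by hand, so here is a rough sketch of pulling them out of the dump. It assumes import lines look like the ones above (a hexadecimal hint followed by a single identifier) and simply skips everything else. The function name parse_import_names is made up, and other DUMPBIN layouts may need different rules.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Extracts function names from lines such as "10 vkCmdBlitImage":
// exactly two fields, a hexadecimal hint and then an identifier.
std::vector<std::string> parse_import_names(const std::string& dump_text) {
    std::vector<std::string> names;
    std::istringstream lines(dump_text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string hint, name, extra;
        // Skip lines with fewer or more than two fields (headers, banners).
        if (!(fields >> hint >> name) || (fields >> extra)) continue;
        const bool hint_is_hex = std::all_of(
            hint.begin(), hint.end(),
            [](unsigned char c) { return std::isxdigit(c) != 0; });
        // Skip things like "1000 .data" from a section summary.
        const unsigned char first = static_cast<unsigned char>(name[0]);
        const bool looks_like_identifier = std::isalpha(first) != 0 || first == '_';
        if (hint_is_hex && looks_like_identifier) names.push_back(name);
    }
    return names;
}
```

Feeding it the contents of dump.txt should give back the list of functions your DLL has to export for this particular game.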
Here's a dump from Doom 2016 which is what we're going to focus on:
Code:
DLL Name: vulkan-1.dll
vma: Hint/Ord Member-Name Bound-To
2c37fb4 57 vkCreateDevice
2c37fc6 79 vkDestroyDevice
2c37fd8 98 vkEnumerateDeviceExtensionProperties
2c38000 110 vkGetDeviceQueue
2c38014 139 vkQueueWaitIdle
2c38026 81 vkDestroyFence
2c38038 92 vkDestroySemaphore
2c3804e 89 vkDestroyQueryPool
2c38064 55 vkCreateDescriptorPool
2c3807e 77 vkDestroyDescriptorPool
2c379ce 148 vkWaitForFences
2c380b0 104 vkFreeCommandBuffers
2c380c8 127 vkGetPhysicalDeviceSurfaceSupportKHR
2c37968 138 vkQueueSubmit
2c37978 5 vkBindBufferMemory
2c3798e 107 vkGetBufferMemoryRequirements
2c379ae 59 vkCreateFence
2c379be 144 vkResetFences
2c37f9e 109 vkGetDeviceProcAddr
2c37f78 120 vkGetPhysicalDeviceMemoryProperties
2c37f4c 122 vkGetPhysicalDeviceQueueFamilyProperties
2c37f2c 121 vkGetPhysicalDeviceProperties
2c37f06 118 vkGetPhysicalDeviceFormatProperties
2c37ee8 117 vkGetPhysicalDeviceFeatures
2c37eca 102 vkEnumeratePhysicalDevices
2c37eb6 85 vkDestroyInstance
2c37ea2 64 vkCreateInstance
2c37e8e 96 vkDeviceWaitIdle
2c37e74 18 vkCmdCopyBufferToImage
2c37e5e 84 vkDestroyImageView
2c37e4a 63 vkCreateImageView
2c37e38 83 vkDestroyImage
2c37e28 62 vkCreateImage
2c37e08 113 vkGetImageMemoryRequirements
2c37df4 6 vkBindImageMemory
2c37dde 29 vkCmdEndRenderPass
2c37dc6 8 vkCmdBeginRenderPass
2c37db0 50 vkCmdWriteTimestamp
2c37d9a 36 vkCmdResetQueryPool
2c37d8a 28 vkCmdEndQuery
2c37d78 7 vkCmdBeginQuery
2c37d60 33 vkCmdPipelineBarrier
2c37d4e 17 vkCmdCopyBuffer
2c37d3e 22 vkCmdDispatch
2c37d22 26 vkCmdDrawIndexedIndirect
2c37d0e 25 vkCmdDrawIndexed
2c37cf4 12 vkCmdBindVertexBuffers
2c37cdc 10 vkCmdBindIndexBuffer
2c37cc2 9 vkCmdBindDescriptorSets
2c37ca6 45 vkCmdSetStencilReference
2c37c88 44 vkCmdSetStencilCompareMask
2c37c74 39 vkCmdSetDepthBias
2c37c62 43 vkCmdSetScissor
2c37c4e 47 vkCmdSetViewport
2c37c3a 11 vkCmdBindPipeline
2c37c24 97 vkEndCommandBuffer
2c37c0c 4 vkBeginCommandBuffer
2c37bf0 1 vkAllocateCommandBuffers
2c37bda 53 vkCreateCommandPool
2c37bc4 90 vkDestroyRenderPass
2c37bae 68 vkCreateRenderPass
2c37b96 82 vkDestroyFramebuffer
2c37b80 60 vkCreateFramebuffer
2c37b66 147 vkUpdateDescriptorSets
2c37b4a 2 vkAllocateDescriptorSets
2c37b32 142 vkResetDescriptorPool
2c37b12 78 vkDestroyDescriptorSetLayout
2c37af4 56 vkCreateDescriptorSetLayout
2c37ada 88 vkDestroyPipelineLayout
2c37ac0 66 vkCreatePipelineLayout
2c37aac 86 vkDestroyPipeline
2c37a90 54 vkCreateComputePipelines
2c37a74 61 vkCreateGraphicsPipelines
2c37a5c 93 vkDestroyShaderModule
2c37a44 71 vkCreateShaderModule
2c37a32 74 vkDestroyBuffer
2c37a20 51 vkCreateBuffer
2c37a08 130 vkGetQueryPoolResults
2c379f4 67 vkCreateQueryPool
2c379e0 70 vkCreateSemaphore
2c38098 76 vkDestroyCommandPool
2c381fa 20 vkCmdCopyImageToBuffer
2c381e8 69 vkCreateSampler
2c381cc 103 vkFlushMappedMemoryRanges
2c381bc 146 vkUnmapMemory
2c381ae 134 vkMapMemory
2c3819e 106 vkFreeMemory
2c3818a 3 vkAllocateMemory
2c38170 73 vkCreateWin32SurfaceKHR
2c38144 126 vkGetPhysicalDeviceSurfacePresentModesKHR
2c3811c 125 vkGetPhysicalDeviceSurfaceFormatsKHR
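2c380f0 124 vkGetPhysicalDeviceSurfaceCapabilitiesKHR
As you can see, there are similarities between the two games. Now you have a choice: either support just the functions a single game needs, or support the entirety of the Vulkan API, meaning the ProxyDLL will work with ANY Vulkan-powered game.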
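If you'd rather not eyeball the two lists, comparing them is a couple of set operations. A minimal sketch, with helper names (shared_imports, required_exports) that are mine rather than anything from the project:

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

using NameSet = std::set<std::string>;

// Functions that both games import.
NameSet shared_imports(const NameSet& a, const NameSet& b) {
    NameSet out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.begin()));
    return out;
}

// Everything a proxy must export to satisfy both games.
NameSet required_exports(const NameSet& a, const NameSet& b) {
    NameSet out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.begin()));
    return out;
}
```

For example, vkCmdBlitImage only shows up in the vkQuake2 dump and vkQueueWaitIdle only in the Doom dump, while vkCreateInstance is in both. A proxy meant for both games needs the union, not just the overlap.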
Getting to Work:
I think everyone reading this has compiled a DLL before, so I won't waste words going over how to do that. However, we do need to go over the design of the ProxyDLL because that is quite important. You can't just export a bunch of symbols and hope for the best; you need to understand how the library that you're hooking works. For example, with DirectX 9 we know that applications must call Direct3DCreate9 first. However, in DirectX 11 applications can call either D3D11CreateDevice or D3D11CreateDeviceAndSwapChain first. This is important because in certain situations your DllMain may be called after one of the APIs is called. If your trampolines aren't initialized when that happens you'll crash the game.
So how you design your ProxyDLL will matter. I've written a number of ProxyDLLs over the years and each one required its own unique approach. You'll have to debug, tweak and adjust until you get everything perfect, and yes, this will take a fair amount of time.
You'll also need to consider how much of the target API you want to support. You don't need to support 100% of the API, just the functions required by the game. If you want universal support you'll need to implement everything in its entirety. Or perhaps you could do it the Timb3r way: support enough of the API to make it work in one game, try loading it in another game, make it crash, then update the supported API and finally get completely sidetracked writing green font to the game's console.
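One way to stop caring which API gets called first is to initialize the trampolines lazily, on first use, instead of relying on DllMain. The sketch below is not how my project does it, just an illustration of the idea. The real_ functions are hypothetical stand-ins so it compiles anywhere; in an actual proxy the pointers would come from the genuine DLL.

```cpp
// Stand-ins for the real library's entry points. They are hypothetical: a
// real proxy would look these up in the genuine DLL instead.
int real_create_device() { return 1; }
int real_create_device_and_swap_chain() { return 2; }

using CreateFn = int (*)();

struct Trampolines {
    CreateFn create_device;
    CreateFn create_device_and_swap_chain;
};

int g_load_count = 0;  // only here so the example can show the load runs once

Trampolines load_trampolines() {
    ++g_load_count;
    return Trampolines{&real_create_device, &real_create_device_and_swap_chain};
}

// The function-local static is initialized on first use, exactly once, and
// that initialization is thread-safe since C++11.
const Trampolines& trampolines() {
    static const Trampolines loaded = load_trampolines();
    return loaded;
}

// Exported entry points: whichever one the game calls first triggers the load.
int proxy_create_device() { return trampolines().create_device(); }
int proxy_create_device_and_swap_chain() {
    return trampolines().create_device_and_swap_chain();
}
```

Whether the game starts with one entry point or the other, the pointers are valid by the time they are used. A side benefit is that the loading work happens outside DllMain, where Microsoft's guidance says to avoid calls such as LoadLibrary.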
"It was totally worth all the time I wasted in x64dbg"
Coding up a storm:
I'm not going to put a massive amount of code in this article because I'll be uploading the source code. However I might go over why I implemented things the way I did.
To start with, my project does not require any external dependencies like the Vulkan SDK. This was the first choice I made regarding the project: I wanted it to be really easy to compile, really simply laid out and easy to mess with. The downside, of course, was that I had to manually define all of Vulkan's various types and structs so I could easily interact with them inside the hooks:
C++:
#include <cstdint> // uint32_t and uint64_t used by the typedefs below
#define VKAPI_CALL __stdcall
#define VKAPI_PTR VKAPI_CALL
#define VK_NULL_HANDLE 0
typedef enum VkResult {
VK_SUCCESS = 0,
VK_RESULT_MAX_ENUM = 0x7FFFFFFF
} VkResult;
#define VK_DEFINE_HANDLE(object) typedef struct object##_T* object;
#define VK_DEFINE_NON_DISPATCHABLE_HANDLE(object) typedef struct object##_T *object;
typedef uint32_t VkFlags;
typedef uint32_t VkBool32;
typedef uint64_t VkDeviceSize;
typedef uint32_t VkSampleMask;
VK_DEFINE_HANDLE(VkInstance)
VK_DEFINE_HANDLE(VkPhysicalDevice)
VK_DEFINE_HANDLE(VkDevice)
VK_DEFINE_HANDLE(VkQueue)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSemaphore)
VK_DEFINE_HANDLE(VkCommandBuffer)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFence)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDeviceMemory)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkBuffer)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkImage)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkEvent)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkQueryPool)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkBufferView)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkImageView)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkShaderModule)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipelineCache)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipelineLayout)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkRenderPass)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipeline)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorSetLayout)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSampler)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorPool)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorSet)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFramebuffer)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkCommandPool)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSwapchainKHR)
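Before you ask: I copied all this information from the Vulkan SDK. The header inside the project is much, much larger. But like I said earlier, that was a design choice: I didn't want any external dependencies. I just wanted a small subset of the Vulkan definitions so I could examine structures in memory easily. If you don't actually care about modifying or changing parameters of calls you could easily just make everything generic pointers or a base type instead.
The OTHER reason I did it this way was that I was able to name my functions exactly the same as those in the real vulkan-1.dll. If I'd linked to the SDK I wouldn't be able to call my function vkCreateInstance as it's already defined; I'd have to call it something like vkCreateInstance_Hook. But wait a second: if you're calling it vkCreateInstance_Hook and exporting it, won't the exported function also be called that? How will the proxying work?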
The magic of DEF files:
A DEF file, also known as a Module Definition File, is a special file that tells the build chain how you want to export symbols. Its basic structure looks like this:
Exports.def:
EXPORTS
vkAcquireNextImageKHR
vkAllocateCommandBuffers
vkAllocateDescriptorSets
vkAllocateMemory
vkBeginCommandBuffer
vkBindBufferMemory
vkBindImageMemory
These match up with the definitions in the code:
VulkanHooks.h:
#define DLLEXPORT __declspec(dllexport)
extern "C"
{
// Function prototypes
VkResult DLLEXPORT vkAcquireNextImageKHR(VkDevice device, VkSwapchainKHR swapchain, uint64_t timeout, VkSemaphore semaphore, VkFence fence, uint32_t* pImageIndex);
VkResult DLLEXPORT vkAllocateCommandBuffers(VkDevice device, const VkCommandBufferAllocateInfo* pAllocateInfo, VkCommandBuffer* pCommandBuffers);
VkResult DLLEXPORT vkAllocateDescriptorSets(VkDevice device, const VkDescriptorSetAllocateInfo* pAllocateInfo, VkDescriptorSet* pDescriptorSets);
VkResult DLLEXPORT vkAllocateMemory(VkDevice device, const VkMemoryAllocateInfo* pAllocateInfo, const VkAllocationCallbacks* pAllocator, VkDeviceMemory* pMemory);
VkResult DLLEXPORT vkBeginCommandBuffer(VkCommandBuffer commandBuffer, const VkCommandBufferBeginInfo* pBeginInfo);
VkResult DLLEXPORT vkBindBufferMemory(VkDevice device, VkBuffer buffer, VkDeviceMemory memory, VkDeviceSize memoryOffset);
VkResult DLLEXPORT vkBindImageMemory(VkDevice device, VkImage image, VkDeviceMemory memory, VkDeviceSize memoryOffset);
void DLLEXPORT vkCmdBeginQuery(VkCommandBuffer commandBuffer, VkQueryPool queryPool, uint32_t query, VkQueryControlFlags flags);
// ...
}
If you want to use different names you can specify aliases in the DEF file:
Code:
EXPORTS
vkAcquireNextImageKHR = vkAcquireNextImageKHR_Hook
Then define your hook in code like this:
C++:
extern "C"
{
// Function prototypes
VkResult DLLEXPORT vkAcquireNextImageKHR_Hook(VkDevice device, VkSwapchainKHR swapchain, uint64_t timeout, VkSemaphore semaphore, VkFence fence, uint32_t* pImageIndex);
}
This way, if you want to link against the SDK, you can do so without getting any conflicts.
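Typing out an alias line for every function gets old fast, so the DEF text can be generated. A small sketch, assuming the "name = name_Hook" layout used above; the helper make_def_file is hypothetical and not part of the uploaded source:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Builds the text of a module definition file that exports each API name
// but points it at an internal function called name + suffix.
std::string make_def_file(const std::vector<std::string>& api_names,
                          const std::string& suffix) {
    std::ostringstream out;
    out << "EXPORTS\n";
    for (const std::string& name : api_names) {
        out << "    " << name << " = " << name << suffix << "\n";
    }
    return out.str();
}
```

Passing in the names from an import dump together with "_Hook" as the suffix produces one alias line per function, which you can then save as Exports.def.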
Continued in Part 2.