
This article contains functions and features that are not documented by the original manufacturer. If you follow the advice in this article, you do so at your own risk. The methods presented in this article may rely on internal implementation details and may not work in the future.
Intro
This post will be more of a vlog post than anything else. I've spent over a week screen-recording myself while coding a Windows driver in Visual Studio. The videos show how one can inject a test DLL into all running processes on Windows 10. I recorded myself coding it from start to finish, so it should be a somewhat comprehensive demo ... or, a snoozefest. 😁
In the process I also learned that coding complex algorithms and talking at the same time isn't easy, and what comes out isn't always what I would have said had I had some quiet time to think about it 😂 So I misspoke in a few places. Please be lenient with me if you watch it all.
And, if you are not into reading blog posts and want to start watching the video tutorial itself, check the playlist below. Alternatively, you can just download the source code alone.
Finally, let me say that if you mess up your production OS by misapplying what I showed here, it will be entirely on you. Don't blame me later!
Credit
First and foremost, I want to show my appreciation to Rbmm for sharing his original code that my solution is based on. Please give him props at his GitHub repo. He is the original author of most of the concepts that I will outline in my long video presentation here.
Quick Overview
I am not going to delve into all the nitty-gritty details in this blog post that I covered when recording my tutorial. But just to recap, here is how the process of injection into all running processes in Windows works:
- We'll write a kernel driver to install our callback that will be invoked when a module (or DLL) is mapped into a process. We can do it using the PsSetLoadImageNotifyRoutine function.
- We know the order in which DLLs are loaded in Windows: ntdll.dll loads first into every user-mode process, followed by kernel32.dll, which loads into all non-native processes. Thus, if our callback intercepts the moment when kernel32.dll is being loaded, we can inject our own DLL before it.
	Just for fun, we will call the DLL that we will be injecting into all processes FAKE.DLL. To signify its bitness, the actual file will be named FAKE64.DLL or FAKE32.DLL. It won't do much, except write the date & time and the process that it was injected into to a log file.
	The way we will be injecting it puts a constraint on our FAKE.DLL: it cannot rely on imports from any DLLs except for ntdll.dll. This rules out the C-Runtime (or CRT) and most of the C++ standard libraries.
- To get past the security mitigations in Windows, and to streamline the loading of our injected DLL, we will first create a KnownDll section out of our FAKE.DLL. This way we will be able to load our FAKE.DLL from user mode without raising alarms from "Code Integrity Guard" (CIG) or from "Arbitrary Code Guard" (ACG).
	Note that this does not count as a bypass of the security mitigations in Windows, since we're employing a kernel driver for our solution.
- The injection itself will be done through a series of Asynchronous Procedure Calls (APC) that will be initiated from kernel mode. The sequence will go as follows:
	- We will open our FAKE.DLL and create a KnownDll section out of it in the callback to the PsSetLoadImageNotifyRoutine function. We need to keep in mind that the callback will be executing from within a critical section, so we can't do much from it. Thus we will only quickly queue a kernel APC, using the KeInitializeApc/KeInsertQueueApc functions.
	- From our APC callbacks, we will skip the KernelRoutine routine because it will be executing at the APC_LEVEL IRQL.
	- But from within the NormalRoutine routine (that will be running at the PASSIVE_LEVEL IRQL) we will map our special base-independent shell-code into the target process, and queue a user-mode APC that will invoke it.
	  We will write our shell-code in Assembly language, which lets it be base-independent, meaning that it will not require relocations and can run from any address in memory.
	- The shell-code will execute two simple function calls from the address space of the target process:
	  C++ pseudo-code:
	  ```cpp
	  // Counted string: Length and MaximumLength are in bytes
	  UNICODE_STRING uS = {
	      sizeof(L"FAKE.DLL") - sizeof(WCHAR),
	      sizeof(L"FAKE.DLL"),
	      L"FAKE.DLL"
	  };
	  HANDLE h;
	  LdrLoadDll(NULL, 0, &uS, &h);

	  // BaseAddress = base address of this module
	  NtUnmapViewOfSection(NtCurrentProcess(), BaseAddress);
	  ```
	- After that our FAKE.DLL will be injected into the target process, which we can verify by running its DllMain function that will do some basic logging into a file for us.
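The initializer in the pseudo-code above is easy to misread, because both length fields of a UNICODE_STRING are counted in bytes rather than in characters, and Length leaves out the terminating null. Here is a minimal sketch that checks that arithmetic in plain user-mode C++. It is not code from the driver: `CountedString`, `MakeCountedString` and `CheckFakeDllLengths` are hypothetical names that only mirror the layout of UNICODE_STRING, and `char16_t` stands in for the 2-byte Windows WCHAR so that the numbers come out the same on any compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in that mirrors the layout of UNICODE_STRING.
struct CountedString {
    std::uint16_t Length;         // bytes in use, excluding the terminating null
    std::uint16_t MaximumLength;  // bytes available in Buffer, including the null
    const char16_t* Buffer;
};

// Builds a CountedString from a character array.
// N counts characters including the terminating null.
template <std::size_t N>
constexpr CountedString MakeCountedString(const char16_t (&text)[N]) {
    static_assert(N * sizeof(char16_t) <= 0xFFFF, "string too long for 16-bit byte counts");
    return CountedString{
        static_cast<std::uint16_t>((N - 1) * sizeof(char16_t)),
        static_cast<std::uint16_t>(N * sizeof(char16_t)),
        text
    };
}

// Namespace-scope array, so the pointer stored in Buffer stays valid.
constexpr char16_t kFakeDllName[] = u"FAKE.DLL";
constexpr CountedString kFakeDll = MakeCountedString(kFakeDllName);

// Self-check of the arithmetic for "FAKE.DLL" (8 characters).
inline void CheckFakeDllLengths() {
    assert(kFakeDll.Length == 16);         // 8 characters * 2 bytes
    assert(kFakeDll.MaximumLength == 18);  // plus the 2-byte terminating null
}
```

These are the same two values that `sizeof(L"FAKE.DLL") - sizeof(WCHAR)` and `sizeof(L"FAKE.DLL")` produce on Windows, where WCHAR is 2 bytes wide.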
 
This is a quick overview of the injection technique, where I omitted the peculiarities of dealing with the WOW64 processes (or 32-bit processes running on the 64-bit operating system) and other important details that I covered in detail in my video overview.
The video tutorial also covers the aspects of testing the driver in a VM, and creating a separate test C++ project to debug the injected FAKE.DLL.
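Going back to the second step of the overview: spotting kernel32.dll in the load-image callback comes down to a case-insensitive suffix comparison on the full image path that the callback receives. The following is a minimal user-mode sketch of that comparison, not the driver code from the videos. `EndsWithNoCase`, `kKernel32Suffix`, `CheckSuffixMatch` and the sample paths are hypothetical, and a driver would work on a UNICODE_STRING rather than on std::wstring.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASCII-only lower-casing; enough for comparing against a fixed ASCII file name.
inline wchar_t ToLowerAscii(wchar_t ch) {
    return (ch >= L'A' && ch <= L'Z') ? static_cast<wchar_t>(ch - L'A' + L'a') : ch;
}

// Returns true if `path` ends with `suffix`, ignoring ASCII case.
// Hypothetical helper, for illustration only.
inline bool EndsWithNoCase(const std::wstring& path, const std::wstring& suffix) {
    if (suffix.size() > path.size()) {
        return false;
    }
    const std::size_t offset = path.size() - suffix.size();
    for (std::size_t i = 0; i < suffix.size(); ++i) {
        if (ToLowerAscii(path[offset + i]) != ToLowerAscii(suffix[i])) {
            return false;
        }
    }
    return true;
}

// The leading backslash keeps names such as "mykernel32.dll" from matching.
const std::wstring kKernel32Suffix = L"\\kernel32.dll";

// Self-check with made-up sample paths.
inline void CheckSuffixMatch() {
    assert(EndsWithNoCase(L"\\Windows\\System32\\KERNEL32.DLL", kKernel32Suffix));
    assert(!EndsWithNoCase(L"\\Windows\\System32\\ntdll.dll", kKernel32Suffix));
    assert(!EndsWithNoCase(L"\\Temp\\mykernel32.dll", kKernel32Suffix));
}
```

A name match alone is not the whole test in the tutorial: the timecodes below list further checks, such as CFunc::IsMappedByLdrLoadDll(), that this sketch does not attempt to reproduce.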
Video Playlist
Note that the following is a playlist of multiple consecutive videos where I will show you the coding process from start to finish. I would recommend watching them in sequence and playing them full-screen to make sure that you can see the code:
Video Timecodes
Or, the following are time-coded segments of the tutorial that will open in a YouTube player:
- Installing & Setting Up Tools, Basic Concepts:
	- 1:31 - Setting up virtual machines to run driver tests in.
	- 4:22 - Setting up Visual Studio components needed to code our project.
	- 7:00 - Setting up tools in a VM:
		- 7:44 - Process Hacker - to view running processes & modules.
		- 9:36 - DebugView - to view debugging output from our driver.
		- 11:16 - WinObj - to view kernel space objects.
		- 11:55 - PEInternals - to statically view PE files.
		- 13:11 - WERSetup - to set up Windows Error Reporting to catch user-mode process crashes.
		- 15:19 - WinAPI Search - to check Imports/Exports from PE files and to search for error codes.
		- 16:53 - Driver Loader/Unloader - to register, start, stop and unregister our driver.
	- 17:37 - Putting the Operating System in a VM into a test signing mode to be able to run our driver.
	- 19:52 - Creating a snapshot in the VM in case we mess up the operating system during our driver testing.
	- 21:20 - Quick overview of: physical/virtual memory, and of DLLs/modules/"sections" in the kernel space.
	- 30:34 - Overview of DLL injection with the PsSetLoadImageNotifyRoutine function.
	- 31:13 - Basic overview of how we can inject our DLL into every process.
 
- Starting Windows Driver C++ Project:
	- 0:29 - Credit to Rbmm.
	- 1:01 - Recap of how we'll be injecting our FAKE.DLL into all processes: ntdll.dll, kernel32.dll, no CRT, use CFG, kernel APC.
	- 9:38 - Starting to code: Creating solution, named "InjectAll".
	- 11:03 - Starting WDM Windows driver project, named "Drv".
	- 12:26 - Adding DrvMain.cpp.
	- 13:41 - Adding DrvTypes.h.
	- 15:55 - Adding SharedDefs.h.
	- 17:14 - Adding CFunc class.
	- 19:38 - Adding DriverEntry function.
	- 21:12 - Installing the correct Windows SDK & WDK.
	- 24:04 - Installing (fighting with) Spectre-mitigated libraries for Visual Studio.
	- 26:25 - Solution to missing Spectre-mitigated libraries.
	- 28:49 - Fixing initial issues with building a driver solution.
	- 31:25 - (Erroneously) Removing test signing from building a driver.
	- 34:01 - Coding DbgPrintLine macro.
	- 38:11 - Coding DriverUnload routine.
	- 39:59 - Testing our first build of the driver.
	- 43:15 - Adding test signing back for building a driver in Visual Studio.
	- 45:02 - Was able to start and stop our first build of the driver!
 
- Beginning to Code Windows Driver:
	- 0:55 - Coding basic driver entry objects.
	- 2:43 - Setting up PsSetLoadImageNotifyRoutine callback.
	- 8:10 - Setting up OnLoadImage callback.
	- 11:15 - Coding FreeResources() function.
	- 15:30 - Coding the statement to catch kernel32.dll being loaded.
	- 19:50 - Coding CFunc::IsSuffixedUnicodeString() function.
	- 25:41 - Defining STATIC_UNICODE_STRING macro.
	- 30:01 - Coding CFunc::IsMappedByLdrLoadDll() function.
	- 40:03 - Coding CFunc::IsSpecificProcessW() function.
	- 1:10:45 - Determining if we got a WOW64 process, IoIs32bitProcess.
	- 1:12:57 - Running another driver test of what we built so far.
 
- Coding Windows Driver: Creating Section:
	- 0:39 - Quick review of what we've done so far.
	- 3:09 - Setting up CSection class.
	- 4:37 - Setting up DLL_STATS struct.
	- 6:07 - Declaring SECTION_TYPE enum.
	- 10:25 - Coding CSection::Initialize() function.
	- 12:04 - Coding CSection::GetSection() singleton function using RtlRunOnceBeginInitialize/RtlRunOnceComplete functions.
	- 32:03 - Explanation of Code Integrity Guard (CIG) and how it may affect our DLL injection.
	- 35:26 - Lowdown on KnownDlls.
	- 37:48 - Using PsInitialSystemProcess to attach to system process.
	- 45:15 - Defining the debugging TAG macro for kernel functions.
	- 47:39 - Continuing to code CSection::GetSection() function.
 
- Coding Windows Driver: Creating Section - KnownDlls:
	- 1:24 - Fixing previous bug in the CSection::GetSection() function.
	- 3:44 - Coding CSection::FreeSection() function.
	- 9:49 - Adding DBG_VERBOSE_DRV preprocessor directive for verbose debugging output.
	- 13:51 - Adding code to call CSection::FreeSection() function.
	- 17:10 - Starting to code CSection::CreateKnownDllSection() function.
	- 20:27 - Setting up to "steal" security descriptor from the existing KnownDll - kernel32.dll.
	- 21:22 - Opening existing kernel32.dll section.
	- 30:58 - Testing current build of the driver.
	- 34:14 - Adding code to call CSection::GetSection() function.
	- 39:17 - Testing again the current build of the driver.
	- 41:21 - Going back to coding CSection::CreateKnownDllSection() function.
	- 42:20 - Retrieving security descriptor from kernel32.dll section with ZwQuerySecurityObject.
	- 47:22 - Description of the OBJ_PERMANENT section object.
	- 49:48 - Differentiation of our Fake.dll section names for KnownDlls.
	- 57:22 - Allocating memory for the security descriptor from the kernel32.dll section.
 
- Coding Injected FAKE.DLL:
	- 1:18 - Adding new C++ project - FAKE.dll.
	- 3:03 - Review of restrictions of injection of our DLL into a process: ntdll.dll, kernel32.dll.
	- 9:11 - Adding new DllTypes.h file.
	- 12:15 - Removing C-Run-Time (CRT) from our FAKE.dll for the 64-bit build.
	- 15:54 - Adding Exports.def file.
	- 16:41 - Adding loadcfg.c file to enable Control Flow Guard (CFG) for our FAKE.dll.
	- 19:54 - Adding loadcfg64.asm file and x64 Assembly into it for CFG.
	- 25:29 - Removing C-Run-Time (CRT) from our FAKE.dll for the 32-bit build.
	- 28:48 - Coding loadcfg32.asm file with x86 Assembly into it for CFG.
	- 36:13 - Adding LogToFile() function using native functions from ntdll.dll.
	- 51:46 - Adding LogToFileFmt() function.
	- 59:39 - Adding code in DllMain() to run when our DLL is injected into a process.
 
- Coding Injected FAKE.DLL - TestConsole Project:
	- 1:02 - Creating TestConsole project.
	- 1:45 - Writing test code to call DllMain in our FAKE.DLL.
	- 4:36 - Ways to debug a DLL using TestConsole project.
	- 11:52 - Adding code to get pointer to TEB in DllMain.
	- 13:33 - Coding Get_TEB() function.
	- 17:30 - Coding Get_PEB() function.
	- 18:36 - Adding code to our DllMain for debugging output: process ID, process image path, current time with ntdll.dll only.
	- 28:33 - Testing our FAKE.DLL in a TestConsole with debugging output.
	- 30:57 - Explanation why we need to adjust security descriptor for the InjectAll folder for access from any process.
	- 32:37 - Adding SetDS_InjectAllFolder() debugging function.
	- 43:28 - Running our TestConsole with the SetDS_InjectAllFolder() function to adjust security descriptor on the InjectAll folder.
 
- Coding Windows Driver: Creating Section - KnownDlls (continued):
	- 0:36 - Continuing to code CSection::CreateKnownDllSection() function.
	- 3:16 - Opening our FAKE.DLL file using ZwOpenFile.
	- 13:09 - Creating a section from our FAKE.DLL using ZwCreateSection.
	- 17:57 - Filling in our DLL_STATS with created section info.
	- 18:22 - Getting our section object pointer with ObReferenceObjectByHandleWithTag.
	- 24:49 - Adjusting CSection::FreeSection() function to remove our section.
	- 27:28 - Adjusting CSection::CreateKnownDllSection() function to close permanent section correctly in case of an error.
	- 30:46 - Testing current build of the driver and two bitnesses of FAKE.DLL in a test VM.
	- 34:36 - Dealing with the error 0xC0000035 during testing.
	- 37:09 - Fixing a bug with missing CSection::Initialize() function call.
	- 48:01 - Adjusting sectionType debugging output to be more readable after a change by doing some refactoring.
	- 51:06 - Checking that security descriptor is set up correctly on the InjectAll folder.
 
- Coding Windows Driver: DLL Injection via Kernel APC:
	- 0:52 - Adding version resource to our FAKE.DLL.
	- 2:41 - Explanation why we need to use Asynchronous Procedure Calls (APC) from our driver callback.
	- 7:00 - Quick lowdown on kernel APC KernelRoutine, NormalRoutine, RundownRoutine.
	- 10:44 - Adding CSection::InjectDLL() function.
	- 14:55 - Quick lowdown on why we need to allocate from NonPagedPool when queuing KAPC.
	- 18:00 - Coding of queuing of the kernel APC with KeInitializeApc.
	- 23:38 - Using reference count on our driver object and the section object to prevent problems when queuing APC.
	- 27:42 - Inserting kernel APC with KeInsertQueueApc.
	- 33:29 - Explanation of how to dereference driver object from APC routines correctly. Why I'm coding it using JMP instruction from Assembly language.
	- 41:21 - Adding asm64.asm and asm32.asm files for APC callback stubs.
	- 43:21 - Coding RundownRoutine APC callback stub in x64 Assembly.
	- 44:44 - Coding RundownRoutine_Proc() callback procedure in C++.
	- 51:58 - Lowdown on the use of the __imp_ prefix on imported function calls from the Assembly code.
	- 58:00 - Coding KernelRoutine APC callback stub in x64 Assembly.
	- 1:01:11 - Coding KernelRoutine_Proc() callback procedure in C++.
	- 1:13:06 - Explanation of forwarding function call parameters on the stack inside KernelRoutine function written in x64 Assembly.
	- 1:18:04 - Coding NormalRoutine APC callback stub in x64 Assembly.
	- 1:19:17 - Coding NormalRoutine_Proc() callback procedure in C++.
 
- Coding Windows Driver: DLL Injection via Kernel APC (continued):
	- 0:28 - Recap of what we've coded in x64 Assembly so far.
	- 3:16 - Starting to code asm32.asm x86 Assembly file.
	- 4:00 - Coding RundownRoutine APC callback stub in x86 Assembly.
	- 7:24 - Explanation of forwarding function call parameters on the stack inside RundownRoutine function written in x86 Assembly.
	- 16:05 - Coding KernelRoutine APC callback stub in x86 Assembly.
	- 18:31 - Explanation of forwarding function call parameters on the stack inside KernelRoutine function written in x86 Assembly.
	- 22:52 - Coding NormalRoutine APC callback stub in x86 Assembly.
 
- Coding Windows Driver: DLL Injection - ShellCode x64:
	- 1:22 - Reasons for using APC to code DLL injection from our OnLoadImage kernel callback.
	- 8:05 - Coding RundownRoutine_Proc() callback.
	- 11:59 - Coding KernelRoutine_Proc() callback.
	- 14:50 - Coding NormalRoutine_Proc() callback.
	- 19:21 - Explanation of two types of code that we will put into our FAKE.DLL: Shell-code and DllMain.
	- 22:50 - Adding dll_asm64.asm file with the base-independent x64 Assembly shell-code to the FAKE.DLL project.
	- 24:33 - Coding UserModeNormalRoutine function shell-code in base-independent x64 Assembly.
	- 29:57 - Explanation why we can't use imports from external DLLs to call system functions in our base-independent shell-code.
	- 31:45 - Coding getProcAddrForMod function to resolve exported function address from a module in base-independent x64 Assembly.
	- 1:01:49 - Finishing to code UserModeNormalRoutine function in base-independent x64 Assembly.
 
- Coding Windows Driver: DLL Injection - ShellCode x86:
	- 1:07 - Adding dll_asm32.asm file with the base-independent x86 Assembly shell-code to the FAKE.DLL project.
	- 2:04 - Recap of UserModeNormalRoutine function from x64 Assembly code.
	- 4:31 - Coding getProcAddrForMod function to resolve exported function address from a module in base-independent x86 Assembly.
	- 25:55 - Coding UserModeNormalRoutine function in base-independent x86 Assembly.
	- 30:58 - Coding getStr_LdrLoadDll() function to obtain pointer to a base-independent static string.
	- 47:59 - Coding getStr_NtUnmapViewOfSection() function to obtain pointer to a base-independent static string.
	- 59:54 - Setting up UserModeNormalRoutine function to be exported as the ordinal 1 in Exports.def.
	- 1:02:33 - Explanation how to mark UserModeNormalRoutine function to bypass Export Suppression from CFG.
	- 1:05:00 - Coding exported stub function f1() to include CFG conformance for the UserModeNormalRoutine function.
 
- Coding Windows Driver: DLL Injection - Finishing up:
	- 1:13 - Adding SEARCH_TAG_W struct to keep static signature in our fake.dll.
	- 7:00 - Modifying our dummy exported function f1() to include static signature in SEARCH_TAG_W struct.
	- 13:36 - Coding CFunc::FindStringByTag() function.
	- 20:29 - Adjusting CSection::CreateKnownDllSection() function to retrieve info from our FAKE.DLL section: ZwMapViewOfSection, resolving ordinal 1 for UserModeNormalRoutine, calling CFunc::FindStringByTag and ZwQuerySection.
	- 43:06 - Adding new members into DLL_STATS with additional info about our section.
 
- Coding Windows Driver: Mapping Shell-Code & FAKE.DLL:
	- 1:21 - Review of DLL_STATS struct members.
	- 2:22 - Diagram of mapping FAKE.DLL into a process: shell-code and DllMain functions, PreferredAddress when mapping.
	- 16:07 - Creating CSection::MapSectionForShellCode() function that maps our shell-code.
	- 37:05 - Writing code to map section for shell-code in NormalRoutine_Proc() callback.
	- 42:52 - Coding CFunc::debugGetCurrentProcName() to get current process image name.
 
- Coding Windows Driver: Invoking Shell-Code & Loading FAKE.DLL:
	- 0:40 - Recap of how our Shell-code will run from the UserModeNormalRoutine() function.
	- 5:24 - Diagram with explanation of invoking kernel APCs to run our Shell-code in user-mode.
	- 14:15 - Finishing up writing kernel APC callbacks: KernelRoutine_Proc(), NormalRoutine_Proc().
	- 37:19 - Adding code to inject DLL into OnLoadImage() callback via our CSection::InjectDLL() function.
	- 40:32 - Building and testing our injection project with the notepad.exe process only.
	- 50:17 - Example of dealing with a crash in a user-mode process (notepad.exe), collecting crash dumps with WERSetup.
	- 52:40 - Adjusting NormalRoutine_Proc() to handle injection into WOW64 processes with PsWrapApcWow64Thread.
	- 56:23 - Testing injection into WOW64 notepad.exe process.
 
- Final Testing:
- Testing Driver On Windows 7, Crash Dump Analysis, Bug Fixes:
	- 1:25 - Fixing a small bug.
	- 3:24 - Overview of how I used PE Internals tool.
	- 5:55 - Testing our driver on Windows 7 Pro, 64-bit OS.
	- 10:28 - Dealing with the Blue Screen Of Death (BSOD), or BugCheck on Windows 7.
	- 14:26 - Opening a crash dump file memory.dmp in WinDbg to analyze OS crash: run !analyze -v.
	- 20:17 - Fixing the issue with the crash to make our driver backward compatible with Windows 7.
	- 21:32 - Testing updated driver on Windows 7 to inject our FAKE.DLL into all running processes.
	- 28:15 - Conclusion.
 
Downloads
If you are interested in the source code for what I've been coding in the tutorial above:
- You can download the source code here as the Visual Studio 2019 solution.
 
		
