AFD.sys – Primitives in The Pocket | Integer Shenanigans
Intro
The VerSprite VS-Labs Research team discovered an interesting integer arithmetic bug within the Windows Kernel Ancillary Function Driver (AFD.sys) while performing N-day analysis after Microsoft Patch Tuesday. Within this blog post, the following information is outlined:
- Intro
- Unpacking the Mischievous Function: A Closer Look
- Understanding Call Flow & Further Constraints: AfdGetBuffer()
- Understanding Call Flow & Further Constraints: AfdSuperConnect()
- Pulling It All Together Now: How Do We Trigger Though?
- Pulling It All Together Now: Forcing Integer Wrap
- The Importance of Verifying Return Values
This post specifically looks at the AfdCalculateBufferSize() function: understanding a specific bug, its constraints, and why it is critical that caller functions verify return values, a step that can’t be forgotten. Thus, the naming of this post, Primitives in The Pocket | Integer Shenanigans, was selected. Sadly, while bugs may occasionally appear to be non-abusable at first, in time all past conclusions are challenged.
File Information
- Windows Version: Windows 11 22H2
- Reversing Version: 10.0.22621.1555
Unpacking the Mischievous Function: A Closer Look at Primitives in The Pocket
In the realm of integers, the mathematicians reign supreme; however, in the realm of integer wrapping bugs, all bets are off, and this is exactly what happens within the AfdCalculateBufferSize() function, which can be seen in the code snippet below.
// Function: afd!AfdCalculateBufferSize()
uint64_t AfdCalculateBufferSize(int32_t arg1, int32_t arg2, char arg3)
{
    int16_t rsi = ((int16_t)arg3);
    int32_t rdx_2 = ((((MmSizeOfMdl(0, ((uint64_t)arg1)) + 0xf) & 0xfffffff0)
                     + (((((uint32_t)((rsi * 0x48) + 0xd0)) + 0xf) & 0xfffffff0) + arg2))
                    + (((arg1 + 0xf) & 0xfffffff0) + 0x60));
    if (rdx_2 < 0x1000) {
        if (rsi != data_1c0059292) {
            rdx_2 = (rdx_2 + (data_1c0059638 - 0x10));
        } else {
            rdx_2 = (rdx_2 + data_1c0059640);
        }
        if (rdx_2 >= 0x1000) {
            rdx_2 = 0x1000;
        }
    }
    return ((uint64_t)rdx_2);
}
Reviewing the code snippet above, several mask, addition, and multiplication operations are applied to the arguments and to the value returned by the MmSizeOfMdl() function. No size, boundary, or sanity checks are performed on the arguments within the AfdCalculateBufferSize() function. The result, RDX_2, is then compared against 0x1000: if it is 0x1000 or larger, it is returned unchanged; otherwise, it is adjusted using data_1c0059638 or data_1c0059640 and capped at 0x1000. Given the name of this function, AfdCalculateBufferSize(), it is clear that the caller is trying to obtain a size value. The real question is what this returned value is used for. So, before diving further into this bug, let’s take a look at the caller function to gather additional context for potential abuse.
// Function: afd!AfdGetBufferSlow()
int128_t* AfdGetBufferSlow(int32_t arg1, int32_t arg2, int64_t arg3, char arg4)
{
    int64_t r14;
    r14 = data_1c0059292;
    int64_t rdi = arg3;
    int128_t* rax_4;
    int32_t rdi_1;
    if (arg2 > 0xffff) {
label_1c0021155:
        rdi_1 = -0x3fffff66;
    } else {
        int32_t rbp_1 = 4;
        arg3 = r14;
        if (arg1 != 0) {
            rbp_1 = arg1;
        }
        int32_t rax_1 = AfdCalculateBufferSize(rbp_1, arg2, arg3);
        if (rax_1 < rbp_1) {
            goto label_1c0021155;
        }
        if (rax_1 < arg2) {
            goto label_1c0021155;
        }
        int32_t var_1c_1 = 0;
        int32_t var_20_1 = 0;
        int64_t var_28 = 1;
        int64_t rax_2 = ExAllocatePool3(0x42, ((uint64_t)rax_1), 0x42646641, &var_28, 1);
        if (rax_2 == 0) {
            goto label_1c0021155;
        }
        int32_t rax_3;
        int64_t rcx_2;
        if (rdi == 0) {
            rcx_2 = rax_2;
        } else {
            rax_3 = PsChargeProcessPoolQuota(rdi, 0x200, ((uint64_t)rax_1));
            rdi_1 = rax_3;
            rcx_2 = rax_2;
        }
        if ((rdi == 0 || (rdi != 0 && rax_3 >= 0))) {
            rax_4 = AfdInitializeBuffer(rcx_2, rbp_1, arg2, r14);
            goto label_1c000128a;
        }
        if ((rdi != 0 && rax_3 < 0)) {
            ExFreePoolWithTag(rcx_2, 0x42646641);
        }
    }
    if ((arg4 & 1) != 0) {
        ExRaiseStatus(((uint64_t)rdi_1));
        breakpoint();
    }
    rax_4 = nullptr;
label_1c000128a:
    return rax_4;
}
Reviewing the function AfdGetBufferSlow(), it appears that the return value from AfdCalculateBufferSize() is passed as the NumberOfBytes argument to the ExAllocatePool3() function. The newly returned pool allocation in RAX_2 is then copied to RCX_2 and passed as the first argument to the AfdInitializeBuffer() function. Given the name of AfdInitializeBuffer(), we can infer that a buffer (pool allocation) is initialized using two size arguments (RBP_1 & ARG2, which are fully attacker controlled).
However, at the start of this function, a hard-coded check is present against ARG2: if it is larger than 0xFFFF, then RDI_1 is assigned the value -0x3FFFFF66 (0xC000009A as an unsigned 32-bit value, STATUS_INSUFFICIENT_RESOURCES) and, when the low bit of ARG4 is set, passed as an argument to the ExRaiseStatus() function. So, we know that one constraint of the integer bug within AfdCalculateBufferSize() is that the second argument cannot be larger than 0xFFFF.
It is also apparent that upon returning from the AfdCalculateBufferSize() function, two checks are made to ensure that the return value is not smaller than either of the two size arguments (RBP_1 or ARG2). If it is, execution jumps to label_1c0021155, and the same behavior as when ARG2 is larger than 0xFFFF is observed.
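The three size checks described above can be summarized in a small user-mode sketch. The helper name and the status macros are hypothetical, and the comparisons are assumed to be unsigned, which the `jb` instructions in the disassembly later in this post suggest; `calc_size` simply stands in for whatever AfdCalculateBufferSize() returned.

```c
#include <stdint.h>

/* The value the decompiler prints as -0x3fffff66, viewed as unsigned 32-bit. */
#define STATUS_INSUFFICIENT_RESOURCES_U32 0xC000009Au
#define STATUS_SUCCESS_U32                0x00000000u

/* Hypothetical helper mirroring the size checks in AfdGetBufferSlow().
 * calc_size stands in for the return value of AfdCalculateBufferSize(). */
static uint32_t afd_slow_size_checks(uint32_t arg1, uint32_t arg2,
                                     uint32_t calc_size)
{
    /* rbp_1 defaults to 4 when arg1 is zero. */
    uint32_t data_len = (arg1 != 0) ? arg1 : 4u;

    if (arg2 > 0xFFFFu) {
        return STATUS_INSUFFICIENT_RESOURCES_U32; /* hard-coded upper bound */
    }
    if (calc_size < data_len) {
        return STATUS_INSUFFICIENT_RESOURCES_U32; /* size smaller than RBP_1 */
    }
    if (calc_size < arg2) {
        return STATUS_INSUFFICIENT_RESOURCES_U32; /* size smaller than ARG2 */
    }
    return STATUS_SUCCESS_U32; /* the pool allocation would proceed */
}
```

Any non-zero return here corresponds to the path through label_1c0021155 in the pseudo code.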
Now, for the sake of trying to understand further constraints, the entire call flow is outlined below.
- AfdSuperConnect()
  - AfdGetBuffer()
    - AfdGetBufferSlow()
      - AfdCalculateBufferSize()
    - AfdGetBufferSlow()
  - AfdGetBuffer()
Given the call flow outlined above, let’s take a closer look into the AfdGetBuffer() & AfdSuperConnect() functions. AfdSuperConnect() is the function exposed through the IOCTL Dispatch Table of the AFD.sys kernel driver, so it is our entry point for reaching the bug within AfdCalculateBufferSize().
Understanding Call Flow & Further Constraints: AfdGetBuffer()
First, let’s start with the AfdGetBuffer() function. Given its name, we can infer that it returns a buffer to its caller, which is AfdSuperConnect(). The pseudo code for AfdGetBuffer() is in the snippet below.
// Function: afd!AfdGetBuffer() int64_t AfdGetBuffer(int32_t arg1, int32_t arg2, int64_t arg3, int32_t arg4) { int64_t rax_2; if ((arg2 > data_1c00592bc || (arg2 <= data_1c00592bc && arg1 > data_1c00592c4))) { int32_t var_18_1 = arg4; rax_2 = AfdGetBufferSlow(arg1, arg2, arg3, arg4); } if ((arg2 <= data_1c00592bc && arg1 <= data_1c00592c4)) { int64_t rcx; int64_t rbx_1; if (arg1 <= data_1c0059294) { rcx = data_1c0059918; rbx_1 = data_1c00599a8; } else if (arg1 <= data_1c0059298) { rcx = data_1c0059920; rbx_1 = data_1c00599b0; } else if (arg1 > data_1c005929c) { rcx = data_1c0059930; rbx_1 = data_1c00599c0; } else { rcx = data_1c0059928; rbx_1 = data_1c00599b8; } void* gsbase; int64_t rax_1 = ExAllocateFromLookasideListEx(PplpRetrieveListIndex(rcx, *(int32_t*)((char*)gsbase + 0x1a4))); int32_t rbx_2; if (rax_1 == 0) { rbx_2 = -0x3fffff66; } int32_t rax_3; if ((((!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8))) && rax_1 != 0) && arg3 != 0)) { rax_3 = PsChargeProcessPoolQuota(arg3, 0x200, rbx_1); rbx_2 = rax_3; if (rax_3 < 0) { AfdFreeBuffer(rax_1); } } if ((rax_1 == 0 || (((rax_1 != 0 && (!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8)))) && arg3 != 0) && rax_3 < 0))) { if ((arg4 & 1) != 0) { ExRaiseStatus(((uint64_t)rbx_2)); breakpoint(); } rax_2 = 0; } if ((rax_1 != 0 && (((TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8)) || ((!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8))) && arg3 == 0)) || (((!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8))) && arg3 != 0) && rax_3 >= 0)))) { rax_2 = rax_1; } } return rax_2; }
Now, early in execution within the AfdGetBuffer() function, we identify the call to the AfdGetBufferSlow() function, where all four arguments are passed through; however, checks before and after the function call are present. The checks before the call are outlined in the code snippet below.
// Function: afd!AfdGetBuffer() - Snippet
//[SNIP]
int64_t rax_2;
if ((arg2 > data_1c00592bc || (arg2 <= data_1c00592bc && arg1 > data_1c00592c4))) {
    int32_t var_18_1 = arg4;
    rax_2 = AfdGetBufferSlow(arg1, arg2, arg3, arg4);
}
if ((arg2 <= data_1c00592bc && arg1 <= data_1c00592c4))
//[SNIP]
It appears that both ARG2 & ARG1 are being checked against data_1c00592bc & data_1c00592c4. The first check passes if ARG2 is greater than data_1c00592bc; if that is not true, it passes if ARG2 is less than or equal to data_1c00592bc & ARG1 is greater than data_1c00592c4.
Now, a good question is, what exactly are the values of data_1c00592bc & data_1c00592c4. It appears that data_1c00592bc is a hard-coded value of 0x1C as seen in the snippet below.
// data_1c00592bc Value int32_t data_1c00592bc = 0x1c
Regarding data_1c00592c4, it is a bit trickier; it appears to be a hard-coded value of 0x10000 as seen in the snippet below. However, some usage (both read/write) is recorded within the AfdReadRegistry() function; further analysis would need to be performed on this function to determine the purpose of data_1c00592c4.
// data_1c00592c4 Value int32_t data_1c00592c4 = 0x10000
Through cross-referencing data_1c00592c4 it appears that a single write occurs within the function AfdReadRegistry() seen in the snippet below.
// Function: afd!AfdReadRegistry() int64_t AfdReadRegistry() { //[SNIP] int32_t r8_4 = data_1c0059294; rbx_1 = 8; if (r8_4 < 0x58) { if ((data_1c0059416 & 8) != 0) { WPP_SF_SlP(); } r8_4 = 0x58; data_1c0059294 = 0x58; } uint64_t rdx_2 = ((uint64_t)data_1c0059298); int64_t* var_168; if (rdx_2 < r8_4) { if ((data_1c0059416 & 8) != 0) { var_168 = r8_4; WPP_SF_Sll(0xf, rdx_2, "MediumBufferSize", rdx_2); } rdx_2 = ((uint64_t)data_1c0059294); data_1c0059298 = rdx_2; } int32_t r8_5 = data_1c005929c; if (r8_5 < rdx_2) { if ((data_1c0059416 & 8) != 0) { var_168 = rdx_2; rdx_2 = WPP_SF_Sll(0x10, rdx_2, "LargeBufferSize", r8_5); } r8_5 = data_1c0059298; data_1c005929c = r8_5; } int32_t r9_1 = data_1c00592c4; if (r9_1 < r8_5) { if ((data_1c0059416 & 8) != 0) { var_168 = r8_5; WPP_SF_Sll(0x11, rdx_2, "HugeBufferSize", r9_1); } data_1c00592c4 = data_1c005929c; } data_1c00599a3 = AfdReadSingleParameter(var_158, "IgnorePushBitOnReceives", ((uint32_t)data_1c00599a3)) != 0; data_1c00599a2 = AfdReadSingleParameter(var_158, "DisableRawSecurity", ((uint32_t)data_1c00599a2)) != 0; data_1c0059990 = AfdReadSingleParameter(var_158, "DisableDirectAcceptEx", ((uint32_t)data_1c0059990)) != 0; data_1c00599a1 = AfdReadSingleParameter(var_158, "DisableChainedReceive", ((uint32_t)data_1c00599a1)) != 0; data_1c0059291 = AfdReadSingleParameter(var_158, "UseTdiSendAndDisconnect", ((uint32_t)data_1c0059291)) != 0; if (AfdReadSingleParameter(var_158, "IgnoreOrderlyRelease", 0) != 0) //[SNIP] rax_19 = AfdReadSingleParameter(var_158, "MaxActiveTransmitFileCount", data_1c00596bc); //[SNIP] int32_t rax_16 = AfdReadSingleParameter(var_158, "BufferAlignment", data_1c0059638); //[SNIP] int32_t rax_17 = AfdReadSingleParameter(var_158, "VolatileParameters", ((uint32_t)data_1c00599a0)); //[SNIP] }
Quite a bit of this function has been removed; however, some registry parameter keys have been left in, as interested readers may want to look into how they are utilized. Now, back to the real reason we are looking at this function: trying to understand where data_1c00592c4 could hold a different value than 0x10000. Right after the WPP_SF_Sll() function call where HugeBufferSize is passed as an argument, we have the location where data_1c00592c4 is written with the value from data_1c005929c. It’s important to note that the AfdReadRegistry() function is only called during the DriverEntry() routine. Further analysis of this function is out of the scope of this post.
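Read together, the comparisons in the snippet above form a simple chain that keeps each buffer size at least as large as the one before it. The sketch below is one reading of that chain; the struct and function names are hypothetical, and labelling data_1c0059294 as the "small" size is an assumption at this point in the analysis.

```c
#include <stdint.h>

/* Hypothetical container for the four globals touched by AfdReadRegistry():
 * data_1c0059294, data_1c0059298, data_1c005929c and data_1c00592c4. */
struct afd_buffer_sizes {
    uint32_t small_size;
    uint32_t medium_size;
    uint32_t large_size;
    uint32_t huge_size;
};

/* Mirrors the clamping order seen in the pseudo code. */
static void afd_normalize_buffer_sizes(struct afd_buffer_sizes *s)
{
    if (s->small_size < 0x58u) {
        s->small_size = 0x58u;            /* hard-coded floor */
    }
    if (s->medium_size < s->small_size) {
        s->medium_size = s->small_size;   /* "MediumBufferSize" */
    }
    if (s->large_size < s->medium_size) {
        s->large_size = s->medium_size;   /* "LargeBufferSize" */
    }
    if (s->huge_size < s->large_size) {
        s->huge_size = s->large_size;     /* "HugeBufferSize" */
    }
}
```

Under this reading, data_1c00592c4 only changes from its default when it is smaller than the large buffer size.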
Now, going back to the AfdGetBuffer() function and trying to understand the constraints, we know we will enter the function AfdGetBufferSlow() if we provide an ARG2 value greater than 0x1C, or if ARG2 is less than or equal to 0x1C & ARG1 is greater than 0x10000.
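The condition simplifies to a single OR, since the `arg2 <= limit` half of the second clause is implied whenever the first clause is false. A minimal sketch, with a hypothetical function name and the two globals passed in as parameters:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: true when AfdGetBuffer() would call
 * AfdGetBufferSlow(). addr_limit models data_1c00592bc (0x1C here) and
 * size_limit models data_1c00592c4 (0x10000 here). */
static bool afd_takes_slow_path(uint32_t arg1, uint32_t arg2,
                                uint32_t addr_limit, uint32_t size_limit)
{
    return arg2 > addr_limit || arg1 > size_limit;
}
```

For example, `afd_takes_slow_path(0x40, 0x3C, 0x1C, 0x10000)` is true because ARG2 alone exceeds 0x1C, regardless of ARG1.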
Understanding Call Flow & Further Constraints: AfdSuperConnect()
After reviewing all functions up to this point and understanding the initial constraints around size arguments (that are attacker controlled) that are passed to AfdCalculateBufferSize() function, the last function we have to review is AfdSuperConnect().
The code from AfdSuperConnect() is available in the code snippet below.
// Function: afd!AfdSuperConnect() uint64_t AfdSuperConnect(void* arg1, void* arg2) { arg_8 = arg1; arg_18 = nullptr; int128_t var_158 = 0; int64_t var_148 = 0; int64_t var_188 = 0; void* rax = *(int64_t*)((char*)arg2 + 0x30); int16_t* rbx = *(int64_t*)((char*)rax + 0x18); arg_10 = rbx; uint64_t rax_1 = ((uint64_t)*(int32_t*)((char*)arg2 + 0x10)); int64_t rax_4; int64_t* rcx_5; int32_t rdx_18; int64_t* r12_1; char* r14_1; bool z_1; if (rax_1 < 0xc) { rdx_18 = 0x13a1; } else { int32_t var_198_1 = 0; if (*(int8_t*)((char*)arg1 + 0x40) != 0) { int64_t rcx = *(int64_t*)((char*)arg2 + 0x20); int64_t rax_2; int64_t rdx; int64_t (* const r8_1)(); if ((rcx & 3) != 0) { rax_2 = ExRaiseDatatypeMisalignment(rcx); } else { rdx = (rcx + rax_1); r8_1 = MmUserProbeAddress; rax_2 = *(int64_t*)MmUserProbeAddress; } if ((((rcx & 3) != 0 || ((rcx & 3) == 0 && rdx > rax_2)) || (((rcx & 3) == 0 && rdx <= rax_2) && rdx < rcx))) { *(int8_t*)rax_2 = 0; } uint64_t rax_3 = ((uint64_t)*(int32_t*)((char*)arg2 + 8)); if (rax_3 != 0) { int64_t rcx_6 = *(int64_t*)((char*)arg1 + 0x70); int64_t rdx_5 = (rcx_6 + rax_3); char* rax_7 = *(int64_t*)r8_1; if ((rdx_5 > rax_7 || (rdx_5 <= rax_7 && rdx_5 < rcx_6))) { *(int8_t*)rax_7 = 0; } } } r14_1 = *(int64_t*)((char*)arg2 + 0x20); char* var_128_1 = r14_1; if ((data_1c0059c60 == 0 || (data_1c0059c60 != 0 && *(int8_t*)r14_1 != 0))) { rax_4 = AfdGetBuffer(*(int32_t*)((char*)arg2 + 8), (*(int32_t*)((char*)arg2 + 0x10) - 4), *(int64_t*)((char*)rbx + 0x28), 1); //[SNIP]
The AfdSuperConnect() function is quite large, so most of it was removed and only the important sections were left. It’s important to note that the AfdSuperConnect() function is reachable from the IOCTL Dispatch Table for the AFD kernel driver. Now, focusing back on our target function AfdGetBuffer(), the two 32-bit values at ARG2+0x8 & ARG2+0x10 are passed as its first two arguments; however, 4 is subtracted from the value at ARG2+0x10 before it is passed. Before reaching this point, there is one more check on the size arguments: at the start, RAX_1 (the value at ARG2+0x10) is compared against 0xC. If it is less than 0xC, the ELSE block is skipped and AfdGetBuffer() is never called, so this value must be at least 0xC for us to reach our call to AfdGetBuffer().
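As a rough sketch of how the two size fields turn into AfdGetBuffer() arguments, consider the following. The struct, function, and parameter names are hypothetical, and the additional condition on data_1c0059c60 and the first byte of the usermode buffer is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair holding the first two AfdGetBuffer() arguments. */
struct afd_getbuffer_args {
    uint32_t arg1;
    uint32_t arg2;
};

/* field_08 and field_10 model the 32-bit values at ARG2+0x8 and ARG2+0x10.
 * Returns false when AfdSuperConnect() would bail out before the call. */
static bool afd_superconnect_sizes(uint32_t field_08, uint32_t field_10,
                                   struct afd_getbuffer_args *out)
{
    if (field_10 < 0xCu) {
        return false;              /* rdx_18 = 0x13a1 path, no AfdGetBuffer() */
    }
    out->arg1 = field_08;          /* passed through unchanged */
    out->arg2 = field_10 - 4u;     /* 4 is subtracted before the call */
    return true;
}
```

Because of the 0xC floor, the subtraction itself cannot wrap below zero.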
Pulling It All Together Now: How Do We Trigger Though?
So now that we understand all the constraints around the size arguments passed to the AfdCalculateBufferSize() function, let’s cover a quick recap. We know the value at ARG2+0x10 cannot be less than 0xC, and the second size argument (the value at ARG2+0x10 minus 4) cannot be greater than 0xFFFF. We also know we have no constraints around ARG1 if we ensure that the value at ARG2+0x10 minus 4 is greater than 0x1C. How do we trigger the integer wrapping bug, though? Well, let’s dive into it.
At the start of this post, in the section Unpacking the Mischievous Function: A Closer Look, we reviewed the pseudo source for AfdCalculateBufferSize(). Now let’s do some live debugging, starting from the entry point AfdSuperConnect(), to work through the math and spot where the integer wrap appears. Along the way, we will verify our assumptions about the constraints by first passing valid sizes and then some not-so-valid ones, observing what happens in each case.
// WinDbg Kernel Mode: Entry into AfdSuperConnect & ARG2
Breakpoint 0 hit
afd!AfdSuperConnect:
fffff806`e9b201d0 4c8bdc mov r11,rsp
2: kd> dps rdx
ffff858b`1465cd48 00000000`0005310e
ffff858b`1465cd50 00000000`00000040
ffff858b`1465cd58 00000000`00000040
ffff858b`1465cd60 00000000`000120c7
ffff858b`1465cd68 00000000`00b652f0
ffff858b`1465cd70 ffff858b`138bbe00
ffff858b`1465cd78 ffff858b`14520e10
Reviewing the output from above, let’s break down what we are seeing:
- ARG2+0x8 = 0x40 | Size (Fully Attacker Controlled)
- ARG2+0x10 = 0x40 | Size (Fully Attacker Controlled)
- ARG2+0x20 = 00000000`00b652f0 | Usermode Buffer (Fully Attacker Controlled)
- ARG2+0x28 = ffff858b`138bbe00 | Afd Device Object
- ARG2+0x30 = ffff858b`14520e10 | Afd Endpoint File Object
It is important to note that the usermode buffer is entirely attacker controlled. Depending on which IOCTL is triggered within the Dispatch Table, certain checks will need to be bypassed, as values are often parsed from this buffer; this parsing can be tracked by following the usage of ARG2+0x20 upon entry (just a friendly tip for others who are also auditing AFD to make static analysis easier).
Now, let’s go to our first check of ARG2+0x10.
// WinDbg Kernel Mode: AfdSuperConnect() | First Check ARG2+0x10
2: kd> t
afd!AfdSuperConnect+0x51:
fffff806`e9b20221 83f80c cmp eax,0Ch
2: kd> t
afd!AfdSuperConnect+0x54:
fffff806`e9b20224 0f8216880100 jb afd!AfdSuperConnect+0x18870 (fffff806`e9b38a40)
2: kd> r rax
rax=0000000000000040
We have an unsigned comparison here, and we will not execute this jump since 0x40 is NOT below 0xC. Next, we dive straight to AfdGetBuffer().
// WinDbg Kernel Mode: AfdSuperConnect() | Entry to AfdGetBuffer()
2: kd> tc
afd!AfdSuperConnect+0xcc:
fffff806`e9b2029c e81fc2ffff call afd!AfdGetBuffer (fffff806`e9b1c4c0)
2: kd> r rcx, rdx, r8, r9
rcx=0000000000000040 rdx=000000000000003c r8=ffff858b188a90c0 r9=0000000000000001
2: kd> dps ffff858b188a90c0
ffff858b`188a90c0 00000000`00000003
Dumping the arguments to AfdGetBuffer(), we see:
- ARG1 = 0x40
- ARG2 = 0x3C (0x40 – 4)
- Arg3 = ffff858b188a90c0 (Which points to 0x3)
- Arg4 = 0x1
Up next is the first check to see if ARG2 is greater than 0x1C.
// WinDbg Kernel Mode: AfdGetBuffer() | First Check ARG2 2: kd> t afd!AfdGetBuffer+0x14: fffff806`e9b1c4d4 3b15e2cd0400 cmp edx,dword ptr [afd!AfdStandardAddressLength (fffff806`e9b692bc)]
Now, this is interesting: the symbol has a name, AfdStandardAddressLength. Let’s look at some of the surrounding assembly and see what other information is presented to help aid reversing.
// WinDbg Kernel Mode: AfdGetBuffer() | Reviewing Additional Information
fffff806`e9b1c4d4 3b15e2cd0400 cmp edx, dword ptr [afd!AfdStandardAddressLength (fffff806e9b692bc)]
fffff806`e9b1c4da 418be9 mov ebp, r9d
fffff806`e9b1c4dd 498bf0 mov rsi, r8
fffff806`e9b1c4e0 0f87d1000000 ja afd!AfdGetBuffer+0xf7 (fffff806e9b1c5b7)
fffff806`e9b1c4e6 3b0dd8cd0400 cmp ecx, dword ptr [afd!AfdHugeBufferSize (fffff806e9b692c4)]
fffff806`e9b1c4ec 0f87c5000000 ja afd!AfdGetBuffer+0xf7 (fffff806e9b1c5b7)
fffff806`e9b1c4f2 3b0d9ccd0400 cmp ecx, dword ptr [afd!AfdSmallBufferSize (fffff806e9b69294)]
fffff806`e9b1c4f8 0f8696000000 jbe afd!AfdGetBuffer+0xd4 (fffff806e9b1c594)
fffff806`e9b1c4fe 3b0d94cd0400 cmp ecx, dword ptr [afd!AfdMediumBufferSize (fffff806e9b69298)]
fffff806`e9b1c504 0f86b8000000 jbe afd!AfdGetBuffer+0x102 (fffff806e9b1c5c2)
fffff806`e9b1c50a 3b0d8ccd0400 cmp ecx, dword ptr [afd!AfdLargeBufferSize (fffff806e9b6929c)]
fffff806`e9b1c510 0f878e000000 ja afd!AfdGetBuffer+0xe4 (fffff806e9b1c5a4)
Interestingly, we now see further checks against the following:
- AfdHugeBufferSize – fffff806e9b692c4 00010000
- AfdSmallBufferSize – fffff806e9b69294 00000080
- AfdMediumBufferSize – fffff806e9b69298 00000640
- AfdLargeBufferSize – fffff806e9b6929c 00002000
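Combining these values with the AfdGetBuffer() pseudo code from earlier, the choice of lookaside list on the fast path (ARG2 no greater than 0x1C and ARG1 no greater than 0x10000) can be sketched as below. The enum and function names are hypothetical, the bucket labels are an interpretation of the four list globals, and the thresholds are the values observed on this particular system.

```c
#include <stdint.h>

/* Hypothetical labels for the four lookaside lists chosen in AfdGetBuffer(). */
enum afd_bucket {
    AFD_BUCKET_SMALL,
    AFD_BUCKET_MEDIUM,
    AFD_BUCKET_LARGE,
    AFD_BUCKET_HUGE
};

/* Mirrors the comparison order in the pseudo code, using the observed values
 * of AfdSmallBufferSize, AfdMediumBufferSize and AfdLargeBufferSize. */
static enum afd_bucket afd_pick_bucket(uint32_t arg1)
{
    const uint32_t small_size  = 0x80u;
    const uint32_t medium_size = 0x640u;
    const uint32_t large_size  = 0x2000u;

    if (arg1 <= small_size) {
        return AFD_BUCKET_SMALL;
    }
    if (arg1 <= medium_size) {
        return AFD_BUCKET_MEDIUM;
    }
    if (arg1 > large_size) {
        return AFD_BUCKET_HUGE;
    }
    return AFD_BUCKET_LARGE;
}
```

None of these buckets are reached in our case, since an ARG2 of 0x3C already forces the slow path.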
If we review the afd!AfdReadRegistry() code snippet from earlier, we will see some similar names passed to the WPP_SF_Sll(), which could help in the analysis for those who are also reversing the AFD driver. Now, back to the checks being performed.
// WinDbg Kernel Mode: AfdGetBuffer() | Entry Into AfdGetBufferSlow() 2: kd> t afd!AfdGetBuffer+0x1a: fffff806`e9b1c4da 418be9 mov ebp,r9d 2: kd> t afd!AfdGetBuffer+0x1d: fffff806`e9b1c4dd 498bf0 mov rsi,r8 2: kd> t afd!AfdGetBuffer+0x20: fffff806`e9b1c4e0 0f87d1000000 ja afd!AfdGetBuffer+0xf7 (fffff806`e9b1c5b7) 2: kd> t afd!AfdGetBuffer+0xf7: fffff806`e9b1c5b7 896c2420 mov dword ptr [rsp+20h],ebp 2: kd> t afd!AfdGetBuffer+0xfb: fffff806`e9b1c5bb e8e04bffff call afd!AfdGetBufferSlow (fffff806`e9b111a0) 2: kd> r rcx, rdx, r8, r9 rcx=0000000000000040 rdx=000000000000003c r8=ffff858b188a90c0 r9=0000000000000001
All arguments appear to be exactly the same, as we reviewed earlier, before entering into the AfdGetBufferSlow() function.
// WinDbg Kernel Mode: AfdGetBufferSlow() | Past First Check & Entering Into AfdCalculateBufferSize() 2: kd> t afd!AfdGetBufferSlow+0x24: fffff806`e9b111c4 81faffff0000 cmp edx,0FFFFh 2: kd> t afd!AfdGetBufferSlow+0x2a: fffff806`e9b111ca 0f87c5fd0100 ja afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95) 2: kd> t r edx edx=3c 2: kd> t afd!AfdGetBufferSlow+0x30: fffff806`e9b111d0 85c9 test ecx,ecx 2: kd> t afd!AfdGetBufferSlow+0x32: fffff806`e9b111d2 bd04000000 mov ebp,4 2: kd> t afd!AfdGetBufferSlow+0x37: fffff806`e9b111d7 458ac6 mov r8b,r14b 2: kd> t afd!AfdGetBufferSlow+0x3a: fffff806`e9b111da 0f45e9 cmovne ebp,ecx 2: kd> t afd!AfdGetBufferSlow+0x3d: fffff806`e9b111dd 8bcd mov ecx,ebp 2: kd> t afd!AfdGetBufferSlow+0x3f: fffff806`e9b111df e8bc040000 call afd!AfdCalculateBufferSize (fffff806`e9b116a0) 2: kd> r rcx, rdx, r8 rcx=0000000000000040 rdx=000000000000003c r8=ffff858b188a9003 2: kd> dps r8 ffff858b`188a9003 00000000`00000000
The first check of ARG2 against 0xFFFF passes since ARG2 is 0x3C, and we immediately enter into the function call to AfdCalculateBufferSize(). This is our destination, so let’s track what occurs thoroughly.
// WinDbg Kernel Mode: AfdCalculateBufferSize() | Overview afd!AfdCalculateBufferSize: fffff806`e9b116a0 48895c2408 mov qword ptr [rsp+8], rbx fffff806`e9b116a5 4889742410 mov qword ptr [rsp+10h], rsi fffff806`e9b116aa 57 push rdi fffff806`e9b116ab 4883ec20 sub rsp, 20h fffff806`e9b116af 8bda mov ebx, edx fffff806`e9b116b1 8bf9 mov edi, ecx fffff806`e9b116b3 8bd1 mov edx, ecx fffff806`e9b116b5 33c9 xor ecx, ecx fffff806`e9b116b7 410fbef0 movsx esi, r8b fffff806`e9b116bb 48ff151edc0500 call qword ptr [afd!__imp_MmSizeOfMdl (fffff806e9b6f2e0)] fffff806`e9b116c2 0f1f440000 nop dword ptr [rax+rax] fffff806`e9b116c7 baf0ffffff mov edx, 0FFFFFFF0h fffff806`e9b116cc 440fb7ce movzx r9d, si fffff806`e9b116d0 6641c1e103 shl r9w, 3 fffff806`e9b116d5 448d580f lea r11d, [rax+0Fh] fffff806`e9b116d9 b8d0000000 mov eax, 0D0h fffff806`e9b116de 4423da and r11d, edx fffff806`e9b116e1 458d1431 lea r10d, [r9+rsi] fffff806`e9b116e5 6641c1e203 shl r10w, 3 fffff806`e9b116ea 664403d0 add r10w, ax fffff806`e9b116ee 8d470f lea eax, [rdi+0Fh] fffff806`e9b116f1 23c2 and eax, edx fffff806`e9b116f3 410fb7ca movzx ecx, r10w fffff806`e9b116f7 83c10f add ecx, 0Fh fffff806`e9b116fa 83c060 add eax, 60h fffff806`e9b116fd 23ca and ecx, edx fffff806`e9b116ff 03cb add ecx, ebx fffff806`e9b11701 418d140b lea edx, [r11+rcx] fffff806`e9b11705 b900100000 mov ecx, 1000h fffff806`e9b1170a 03d0 add edx, eax fffff806`e9b1170c 3bd1 cmp edx, ecx fffff806`e9b1170e 7318 jae afd!AfdCalculateBufferSize+0x88 (fffff806e9b11728)
Now, let’s step through these instructions after the call to MmSizeOfMdl(), record each argument, and see where the fault resides within the AfdCalculateBufferSize() function.
// WinDbg Kernel Mode: AfdCalculateBufferSize() | Single Step Analysis
fffff806`e9b116c2 0f1f440000 nop dword ptr [rax+rax]
fffff806`e9b116c7 baf0ffffff mov edx, 0FFFFFFF0h // edx = 0xFFFFFFF0
fffff806`e9b116cc 440fb7ce movzx r9d, si // si = 0x3, r9d = 0x3
fffff806`e9b116d0 6641c1e103 shl r9w, 3 // r9w = 0x18
fffff806`e9b116d5 448d580f lea r11d, [rax+0Fh] // rax = 0x38, rax+0xF = 0x47, r11d = 0x47
fffff806`e9b116d9 b8d0000000 mov eax, 0D0h // eax = 0xD0
fffff806`e9b116de 4423da and r11d, edx // edx = 0xFFFFFFF0, r11d = 0x47, r11d = 0x40
fffff806`e9b116e1 458d1431 lea r10d, [r9+rsi] // r9 = 0x18, rsi = 0x3, r10d = 0x1B
fffff806`e9b116e5 6641c1e203 shl r10w, 3 // r10w = 0xD8
fffff806`e9b116ea 664403d0 add r10w, ax // ax = 0xD0, r10w = 0xD8, r10w = 0x1A8
fffff806`e9b116ee 8d470f lea eax, [rdi+0Fh] // rdi = 0x40, rdi+0xF = 0x4F, eax = 0x4F (Attacker Fully Controlled RDI)
fffff806`e9b116f1 23c2 and eax, edx // edx = 0xFFFFFFF0, eax = 0x4F, eax = 0x40
fffff806`e9b116f3 410fb7ca movzx ecx, r10w // r10w = 0x1A8, ecx = 0x0, ecx = 0x1A8
fffff806`e9b116f7 83c10f add ecx, 0Fh // ecx = 0x1A8, ecx + 0xF = 0x1B7
fffff806`e9b116fa 83c060 add eax, 60h // eax = 0x40, eax + 0x60 = 0xA0
fffff806`e9b116fd 23ca and ecx, edx // edx = 0xFFFFFFF0, ecx = 0x1B7, ecx = 0x1B0
fffff806`e9b116ff 03cb add ecx, ebx // ebx = 0x3C, ecx = 0x1B0, ecx = 0x1EC (Attacker Fully Controlled EBX)
fffff806`e9b11701 418d140b lea edx, [r11+rcx] // r11 = 0x40, rcx = 0x1EC, edx = 0x22C
fffff806`e9b11705 b900100000 mov ecx, 1000h // ecx = 0x1000, the threshold used by the comparison below
fffff806`e9b1170a 03d0 add edx, eax // eax = 0xA0, edx = 0x22C, edx = 0x2CC
fffff806`e9b1170c 3bd1 cmp edx, ecx // Check if the resulting math of 0x2CC is greater than or equal to 0x1000
fffff806`e9b1170e 7318 jae afd!AfdCalculateBufferSize+0x88 (fffff806e9b11728)
Reviewing the above output, the instruction at address 0xfffff806e9b116ee is where the attacker-controlled value in RDI is first used directly in the arithmetic, with the result stored in EAX. Immediately after, a mask is applied to that value, and shortly after, 0x60 is added to the masked value.
This is where our wrap can occur if we max out the value of RDI, since no checks verify this value before reaching this code. Now, let’s continue, return from this function, and observe what occurs.
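To make the wrap concrete, the three instructions in question (`lea eax,[rdi+0Fh]`, `and eax,edx`, `add eax,60h`) can be modeled with 32-bit unsigned arithmetic. This is only a user-mode sketch with hypothetical function names; it covers this one term of the calculation, not the whole function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors ((arg1 + 0xf) & 0xfffffff0) + 0x60 with 32-bit wraparound,
 * as computed in EAX by AfdCalculateBufferSize(). */
static uint32_t afd_data_term(uint32_t arg1)
{
    return ((arg1 + 0xFu) & 0xFFFFFFF0u) + 0x60u;
}

/* True when the term wrapped: without a wrap the result is always
 * larger than arg1, so a smaller result can only come from a wrap. */
static bool afd_data_term_wraps(uint32_t arg1)
{
    return afd_data_term(arg1) < arg1;
}
```

With the benign value 0x40 the term is 0xA0, matching the trace above, while 0xFFFFFFFF collapses to just 0x60 because the addition of 0xF overflows 32 bits before the mask is applied.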
// WinDbg Kernel Mode: AfdGetBufferSlow() | AfdCalculateBufferSize() Return Checks 2: kd> t afd!AfdGetBufferSlow+0x44: fffff806`e9b111e4 3bc5 cmp eax,ebp 2: kd> dc rax 00000000`000002cc 2: kd> dc ebp 00000000`00000040 2: kd> t afd!AfdGetBufferSlow+0x46: fffff806`e9b111e6 0f82a9fd0100 jb afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95) 2: kd> t afd!AfdGetBufferSlow+0x4c: fffff806`e9b111ec 3bc3 cmp eax,ebx 2: kd> r eax, rbx eax=2cc rbx=000000000000003c 2: kd> t afd!AfdGetBufferSlow+0x4e: fffff806`e9b111ee 0f82a1fd0100 jb afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95) 2: kd> t afd!AfdGetBufferSlow+0x54: fffff806`e9b111f4 8364243c00 and dword ptr [rsp+3Ch],0 2: kd> t afd!AfdGetBufferSlow+0x59: fffff806`e9b111f9 4c8d4c2430 lea r9,[rsp+30h] 2: kd> t afd!AfdGetBufferSlow+0x5e: fffff806`e9b111fe 8364243800 and dword ptr [rsp+38h],0 2: kd> t afd!AfdGetBufferSlow+0x63: fffff806`e9b11203 41b841666442 mov r8d,42646641h 2: kd> t afd!AfdGetBufferSlow+0x69: fffff806`e9b11209 8bd0 mov edx,eax 2: kd> t afd!AfdGetBufferSlow+0x6b: fffff806`e9b1120b b942000000 mov ecx,42h 2: kd> t afd!AfdGetBufferSlow+0x70: fffff806`e9b11210 48c744243001000000 mov qword ptr [rsp+30h],1 2: kd> t afd!AfdGetBufferSlow+0x79: fffff806`e9b11219 448bf8 mov r15d,eax 2: kd> t afd!AfdGetBufferSlow+0x7c: fffff806`e9b1121c c744242001000000 mov dword ptr [rsp+20h],1 2: kd> t afd!AfdGetBufferSlow+0x84: fffff806`e9b11224 48ff159de00500 call qword ptr [afd!_imp_ExAllocatePool3 (fffff806`e9b6f2c8)] 2: kd> r rcx, rdx, r8 rcx=0000000000000042 rdx=00000000000002cc r8=0000000042646641
Upon returning from AfdCalculateBufferSize(), the two checks are passed since the return value is greater than both provided arguments to the AfdCalculateBufferSize() function, with the return value being used as a size argument to ExAllocatePool3().
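The whole single-step trace can also be replayed in user mode. The sketch below is a model, not the driver's code: the function names are hypothetical, MmSizeOfMdl() is approximated under the assumption of a 0x30-byte MDL header plus one 8-byte entry per 4 KiB page spanned (which matches the 0x38 seen in RAX above), and the adjustment branch for results below 0x1000 is omitted because it depends on globals not examined here.

```c
#include <stdint.h>

/* Round up to a multiple of 16 with 32-bit wraparound, like the
 * "+ 0xf, & 0xfffffff0" pairs in the disassembly. */
#define ALIGN16(x) ((((uint32_t)(x)) + 0xFu) & 0xFFFFFFF0u)

/* ASSUMPTION: approximates MmSizeOfMdl(0, length) on x64 as a 0x30-byte
 * header plus 8 bytes per 4 KiB page spanned by the buffer. */
static uint64_t model_mm_size_of_mdl(uint32_t length)
{
    uint64_t pages = ((uint64_t)length + 0xFFFu) >> 12;
    return 0x30u + 8u * pages;
}

/* Hypothetical model of the sum computed by AfdCalculateBufferSize().
 * The "< 0x1000" adjustment branch is intentionally left out. */
static uint32_t model_afd_calculate_buffer_size(uint32_t arg1, uint32_t arg2,
                                                int8_t arg3)
{
    /* The per-address term is computed in 16-bit registers (r9w/r10w). */
    uint16_t addr16 = (uint16_t)((int16_t)arg3 * 0x48 + 0xD0);

    uint32_t mdl_term  = ALIGN16((uint32_t)model_mm_size_of_mdl(arg1));
    uint32_t addr_term = ALIGN16(addr16) + arg2;
    uint32_t data_term = ALIGN16(arg1) + 0x60u;

    return mdl_term + addr_term + data_term;
}
```

Calling `model_afd_calculate_buffer_size(0x40, 0x3C, 3)` reproduces the 0x2CC observed in RDX before the call to ExAllocatePool3().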
Pulling It All Together Now: Forcing Integer Wrap
Now that we understand the bug and the constraints around it, we will use dynamic analysis to verify our static analysis by supplying input that triggers the wrap. This time, we will skip all earlier checks and focus on the final function, AfdCalculateBufferSize(), the return value, and the error function call to ExRaiseStatus().
// WinDbg Kernel Mode: AfdCalculateBufferSize() | Trigger Wrap & ExRaiseStatus() 3: kd> t afd!AfdCalculateBufferSize+0x4e: fffff806`e9b116ee 8d470f lea eax,[rdi+0Fh] 3: kd> r rdi, rax rdi=00000000ffffffff rax=00000000000000d0 3: kd> t afd!AfdCalculateBufferSize+0x51: fffff806`e9b116f1 23c2 and eax,edx 3: kd> r rax rax=000000000000000e 3: kd> dc rdi+0xf 00000001`0000000e 3: kd> dc rdi 00000000`ffffffff 3: kd> ?rdi+0xf Evaluate expression: 4294967310 = 00000001`0000000e 3: kd> t afd!AfdCalculateBufferSize+0x53: fffff806`e9b116f3 410fb7ca movzx ecx,r10w 3: kd> t afd!AfdCalculateBufferSize+0x57: fffff806`e9b116f7 83c10f add ecx,0Fh 3: kd> t afd!AfdCalculateBufferSize+0x5a: fffff806`e9b116fa 83c060 add eax,60h 3: kd> r rax rax=0000000000000000 3: kd> ?rax+0x60 Evaluate expression: 96 = 00000000`00000060 3: kd> t afd!AfdCalculateBufferSize+0x5d: fffff806`e9b116fd 23ca and ecx,edx 3: kd> t afd!AfdCalculateBufferSize+0x5f: fffff806`e9b116ff 03cb add ecx,ebx 3: kd> t afd!AfdCalculateBufferSize+0x61: fffff806`e9b11701 418d140b lea edx,[r11+rcx] 3: kd> t afd!AfdCalculateBufferSize+0x65: fffff806`e9b11705 b900100000 mov ecx,1000h 3: kd> t afd!AfdCalculateBufferSize+0x6a: fffff806`e9b1170a 03d0 add edx,eax 3: kd> t afd!AfdCalculateBufferSize+0x6c: fffff806`e9b1170c 3bd1 cmp edx,ecx 3: kd> dc edx 00000000`0080027c 3: kd> dc ecx 00000000`00001000 3: kd> t afd!AfdCalculateBufferSize+0x6e: fffff806`e9b1170e 7318 jae afd!AfdCalculateBufferSize+0x88 (fffff806`e9b11728) 3: kd> t afd!AfdCalculateBufferSize+0x88: fffff806`e9b11728 488b5c2430 mov rbx,qword ptr [rsp+30h] 3: kd> t afd!AfdCalculateBufferSize+0x8d: fffff806`e9b1172d 8bc2 mov eax,edx 3: kd> t afd!AfdCalculateBufferSize+0x8f: fffff806`e9b1172f 488b742438 mov rsi,qword ptr [rsp+38h] 3: kd> t afd!AfdCalculateBufferSize+0x94: fffff806`e9b11734 4883c420 add rsp,20h 3: kd> t afd!AfdCalculateBufferSize+0x98: fffff806`e9b11738 5f pop rdi 3: kd> t afd!AfdCalculateBufferSize+0x99: fffff806`e9b11739 c3 ret 3: kd> t 
afd!AfdGetBufferSlow+0x44: fffff806`e9b111e4 3bc5 cmp eax,ebp 3: kd> r eax, ebp eax=80027c ebp=ffffffff 3: kd> t afd!AfdGetBufferSlow+0x46: fffff806`e9b111e6 0f82a9fd0100 jb afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95) 3: kd> t afd!AfdGetBufferSlow+0x1fdf5: fffff806`e9b30f95 bf9a0000c0 mov edi,0C000009Ah 3: kd> t afd!AfdGetBufferSlow+0x1fdfa: fffff806`e9b30f9a f684248000000001 test byte ptr [rsp+80h],1 4: kd> t afd!AfdGetBufferSlow+0x1fe02: fffff806`e9b30fa2 740f je afd!AfdGetBufferSlow+0x1fe13 (fffff806`e9b30fb3) 4: kd> t afd!AfdGetBufferSlow+0x1fe04: fffff806`e9b30fa4 8bcf mov ecx,edi 4: kd> t afd!AfdGetBufferSlow+0x1fe06: fffff806`e9b30fa6 48ff153be30300 call qword ptr [afd!_imp_ExRaiseStatus (fffff806`e9b6f2e8)]
Sadly, the checks made upon returning from the AfdCalculateBufferSize() function are effective, and this bug is short-lived as an exception is raised. It is also important to note that the resulting value from AfdCalculateBufferSize() is not 100% attacker-controlled. This is another constraint we must work with.
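One detail worth highlighting is that the check only fires because the comparison is unsigned: the disassembly uses `jb`, even though the decompiled pseudo code types both operands as int32_t. The sketch below contrasts the two interpretations using the values from the trace; the function names are hypothetical, and viewing 0xFFFFFFFF as -1 assumes the usual two's complement representation.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the driver does: "cmp eax,ebp" followed by "jb" (unsigned below). */
static bool size_below_len_unsigned(uint32_t size, uint32_t len)
{
    return size < len;
}

/* What a literal reading of the int32_t pseudo code would do instead. */
static bool size_below_len_signed(int32_t size, int32_t len)
{
    return size < len;
}
```

With size 0x80027C and length 0xFFFFFFFF, the unsigned comparison is true and the error path is taken. Treated as signed values, the length becomes -1, so `size_below_len_signed(0x80027C, -1)` is false and the check would not have caught the wrap.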
The Importance of Verifying Return Values
Now, yes, we went through all of this for a bug that is non-abusable in this specific scenario; however, the AfdCalculateBufferSize() function is currently utilized in a few locations, and others might be more promising. In this case, the return value is properly validated and an exception is raised, which is standard. Still, we must remember that developers are humans as well, and in the future the AfdCalculateBufferSize() function may be improperly utilized, with the return value never validated before being passed as an argument to sensitive routines. Remember, while we didn’t find a useful purpose for this bug, we learned a decent amount about a common theme among IOCTL usermode arguments and their associated functions that are reachable through the AFD kernel driver from usermode.
Thus, we have a potential primitive in the pocket that will wait and lurk for a day when a developer forgets to check the return value properly!
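For contrast, here is a minimal sketch of what validating inside the calculation, rather than in every caller, could look like. This is not how AFD.sys is implemented or patched; the function names are hypothetical and the checks are written in portable C, whereas Windows kernel code would more likely rely on helpers such as RtlULongAdd() from ntintsafe.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds two 32-bit values, reporting failure instead of wrapping. */
static bool checked_add_u32(uint32_t a, uint32_t b, uint32_t *out)
{
    if (a > UINT32_MAX - b) {
        return false;              /* the sum would not fit in 32 bits */
    }
    *out = a + b;
    return true;
}

/* Rounds up to a multiple of 16, failing if the round-up would wrap. */
static bool checked_align16_u32(uint32_t x, uint32_t *out)
{
    uint32_t tmp;
    if (!checked_add_u32(x, 0xFu, &tmp)) {
        return false;
    }
    *out = tmp & 0xFFFFFFF0u;
    return true;
}

/* Overflow-aware version of the ((arg1 + 0xf) & 0xfffffff0) + 0x60 term. */
static bool checked_data_term(uint32_t arg1, uint32_t *out)
{
    uint32_t aligned;
    return checked_align16_u32(arg1, &aligned) &&
           checked_add_u32(aligned, 0x60u, out);
}
```

With this shape, a length of 0xFFFFFFFF is rejected at the first addition, so a caller that forgets to compare the result against its inputs never sees a wrapped size.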
Conclusion
This blog post covered quite a bit of material to provide readers with a step-by-step approach to performing static analysis & dynamic analysis while reversing Windows Kernel Drivers. Along with general kernel driver analysis, some other topics of vulnerability research and exploit development were covered, such as understanding constraints, identifying call flow to build a Proof-of-Concept, and mindset around trying to understand how a bug could be abused in the future. Some areas within the post also leave room for others to explore further if they wish, such as with the AFD Registry function and other places where AfdCalculateBufferSize() function is utilized.
Thank you for reading this post; hopefully, it helps others learn to perform security research. Remember, just because this bug appears to cause no issues today, this opinion can be challenged in the future.
Special shoutout to Jon Reyes (@notCh3rn0byl) for the assistance during the N-day analysis and research!