AFD.sys – Primitives in The Pocket | Integer Shenanigans

Intro


The VerSprite VS-Labs Research team discovered an interesting integer arithmetic bug within the Windows Kernel Ancillary Function Driver (AFD.sys) while performing N-day analysis after Microsoft Patch Tuesday. This blog post covers the following:

  • Intro
  • Unpacking the Mischievous Function: A Closer Look
  • Understanding Call Flow & Further Constraints: AfdGetBuffer()
  • Understanding Call Flow & Further Constraints: AfdSuperConnect()
  • Pulling It All Together Now: How Do We Trigger Though?
  • Pulling It All Together Now: Forcing Integer Wrap
  • The Importance of Verifying Return Values

This post looks specifically at the AfdCalculateBufferSize() function: the bug itself, the constraints around it, and why having caller functions verify return values is a critical step that can’t be forgotten. Hence the name of this post, Primitives in The Pocket | Integer Shenanigans. While a bug may occasionally appear to be non-abusable at first, in time all past conclusions are challenged.

File Information 

  • Windows Version: 11 22H2
  • Reversing Version: 10.0.22621.1555

Unpacking the Mischievous Function: A Closer Look at Primitives in The Pocket

In the realm of integers, the mathematicians reign supreme; however, in the realm of integer wrapping bugs, all bets are off, and this is exactly what happens within the AfdCalculateBufferSize() function, which can be seen in the code snippet below.

// Function: afd!AfdCalculateBufferSize()

uint64_t AfdCalculateBufferSize(int32_t arg1, int32_t arg2, char arg3)
{
    int16_t rsi = ((int16_t)arg3);
    int32_t rdx_2 = ((((MmSizeOfMdl(0, ((uint64_t)arg1)) + 0xf) & 0xfffffff0) + (((((uint32_t)((rsi * 0x48) + 0xd0)) + 0xf) & 0xfffffff0) + arg2)) + (((arg1 + 0xf) & 0xfffffff0) + 0x60));

    if (rdx_2 < 0x1000)
    {
        if (rsi != data_1c0059292)
        {
            rdx_2 = (rdx_2 + (data_1c0059638 - 0x10));
        }
        else
        {
            rdx_2 = (rdx_2 + data_1c0059640);
        }
        if (rdx_2 >= 0x1000)
        {
            rdx_2 = 0x1000;
        }
    }
    return ((uint64_t)rdx_2);
}

Reviewing the code snippet above, several mask, addition, and multiplication operations are applied to the arguments and to the return value of the MmSizeOfMdl() function. No size, boundary, or sanity checks are performed on the arguments within the AfdCalculateBufferSize() function. The return value RDX_2 is then handled according to its value: if it is 0x1000 or greater it is left unchanged; otherwise it is adjusted with data_1c0059638 or data_1c0059640 and capped at 0x1000. Given the name of this function, AfdCalculateBufferSize(), it is clear that the caller is trying to obtain a size value. The real question is what this returned value is used for. So, before diving further into this bug, let’s take a look at the caller function to gather additional context on potential abuse.

// Function: afd!AfdGetBufferSlow()

int128_t* AfdGetBufferSlow(int32_t arg1, int32_t arg2, int64_t arg3, char arg4)
{
    int64_t r14;
    r14 = data_1c0059292;
    int64_t rdi = arg3;
    int128_t* rax_4;
    int32_t rdi_1;
    if (arg2 > 0xffff)
    {
    label_1c0021155:
        rdi_1 = -0x3fffff66;
    }
    else
    {
        int32_t rbp_1 = 4;
        arg3 = r14;
        if (arg1 != 0)
        {
            rbp_1 = arg1;
        }
        int32_t rax_1 = AfdCalculateBufferSize(rbp_1, arg2, arg3);
        if (rax_1 < rbp_1)
        {
            goto label_1c0021155;
        }
        if (rax_1 < arg2)
        {
            goto label_1c0021155;
        }
        int32_t var_1c_1 = 0;
        int32_t var_20_1 = 0;
        int64_t var_28 = 1;
        int64_t rax_2 = ExAllocatePool3(0x42, ((uint64_t)rax_1), 0x42646641, &var_28, 1);
        if (rax_2 == 0)
        {
            goto label_1c0021155;
        }
        int32_t rax_3;
        int64_t rcx_2;
        if (rdi == 0)
        {
            rcx_2 = rax_2;
        }
        else
        {
            rax_3 = PsChargeProcessPoolQuota(rdi, 0x200, ((uint64_t)rax_1));
            rdi_1 = rax_3;
            rcx_2 = rax_2;
        }
        if ((rdi == 0 || (rdi != 0 && rax_3 >= 0)))
        {
            rax_4 = AfdInitializeBuffer(rcx_2, rbp_1, arg2, r14);
            goto label_1c000128a;
        }
        if ((rdi != 0 && rax_3 < 0))
        {
            ExFreePoolWithTag(rcx_2, 0x42646641);
        }
    }
    if ((arg4 & 1) != 0)
    {
        ExRaiseStatus(((uint64_t)rdi_1));
        breakpoint();
    }
    rax_4 = nullptr;
label_1c000128a:
    return rax_4;
}

Reviewing the function AfdGetBufferSlow(), it appears that the return value from AfdCalculateBufferSize() is passed as the NumberOfBytes argument to the ExAllocatePool3() function. The newly returned pool allocation in RAX_2 is then copied to RCX_2 and passed as the first argument to the AfdInitializeBuffer() function. Given the name of AfdInitializeBuffer(), we can infer that a buffer (pool allocation) is initialized along with two size arguments (RBP_1 & ARG2, which are fully attacker controlled).

However, at the start of this function, a hard-coded check is present against ARG2: if it is larger than 0xFFFF, RDI_1 is assigned the value -0x3FFFFF66 (0xC000009A as an unsigned 32-bit value, STATUS_INSUFFICIENT_RESOURCES) and, when the low bit of ARG4 is set, passed as an argument to the ExRaiseStatus() function. So, we know that one constraint of the integer bug within AfdCalculateBufferSize() is that the second argument cannot be larger than 0xFFFF.

It is also apparent that upon returning from the AfdCalculateBufferSize() function, two checks are made to make sure that the return value is not smaller than either of the two size arguments (RBP_1 or ARG2). If it is, execution jumps to label_1c0021155, and the same behavior as when ARG2 is larger than 0xFFFF is observed.
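
The checks described above can be condensed into a small model. This is a sketch under stated assumptions rather than the driver's code: model_slow_path_checks and the two MODEL_ constants are illustrative names, the comparisons are assumed to be unsigned, and calculated is a parameter standing in for the value returned by AfdCalculateBufferSize().

```c
#include <stdint.h>

#define MODEL_STATUS_SUCCESS                0x00000000u
/* Same bit pattern as the -0x3FFFFF66 seen in the pseudo code. */
#define MODEL_STATUS_INSUFFICIENT_RESOURCES 0xC000009Au

/*
 * Sketch of the validation AfdGetBufferSlow() performs around the
 * size calculation. Returns the status the function would fail with,
 * or success if the allocation would go ahead.
 */
uint32_t model_slow_path_checks(uint32_t arg1, uint32_t arg2,
                                uint32_t calculated)
{
    uint32_t len = (arg1 != 0) ? arg1 : 4;  /* a zero length becomes 4 */

    if (arg2 > 0xFFFFu)
        return MODEL_STATUS_INSUFFICIENT_RESOURCES;

    /* The returned size must not be smaller than either input. */
    if (calculated < len || calculated < arg2)
        return MODEL_STATUS_INSUFFICIENT_RESOURCES;

    return MODEL_STATUS_SUCCESS;
}
```

The second pair of comparisons is what stands between a wrapped size and the pool allocation, which is why the caller's handling of the return value matters as much as the calculation itself.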

Now, for the sake of trying to understand further constraints, the entire call flow is outlined below.

  • AfdSuperConnect()
    • AfdGetBuffer()
      • AfdGetBufferSlow()
        • AfdCalculateBufferSize()

Given the call flow outlined above, let’s take a closer look at the AfdGetBuffer() & AfdSuperConnect() functions. AfdSuperConnect() is the function reached directly through the exposed IOCTL Dispatch Table of the AFD.sys kernel driver, which makes it the entry point for triggering the bug within AfdCalculateBufferSize().

Understanding Call Flow & Further Constraints: AfdGetBuffer()

First, let’s start with the AfdGetBuffer() function. Given its name, we can infer that it returns a buffer to its caller, which is AfdSuperConnect(). The pseudo code for AfdGetBuffer() is in the snippet below.

// Function: afd!AfdGetBuffer()

int64_t AfdGetBuffer(int32_t arg1, int32_t arg2, int64_t arg3, int32_t arg4)
{
    int64_t rax_2;
    if ((arg2 > data_1c00592bc || (arg2 <= data_1c00592bc && arg1 > data_1c00592c4)))
    {

        int32_t var_18_1 = arg4;
        rax_2 = AfdGetBufferSlow(arg1, arg2, arg3, arg4);
    }

    if ((arg2 <= data_1c00592bc && arg1 <= data_1c00592c4))
    {

        int64_t rcx;
        int64_t rbx_1;
        if (arg1 <= data_1c0059294)

        {
            rcx = data_1c0059918;
            rbx_1 = data_1c00599a8;
        }

        else if (arg1 <= data_1c0059298)
        {
            rcx = data_1c0059920;
            rbx_1 = data_1c00599b0;
        }

        else if (arg1 > data_1c005929c)
        {
            rcx = data_1c0059930;
            rbx_1 = data_1c00599c0;
        }

        else

        {
            rcx = data_1c0059928;
            rbx_1 = data_1c00599b8;
        }
        void* gsbase;
        int64_t rax_1 = ExAllocateFromLookasideListEx(PplpRetrieveListIndex(rcx, *(int32_t*)((char*)gsbase + 0x1a4)));
        int32_t rbx_2;
        if (rax_1 == 0)
        {
            rbx_2 = -0x3fffff66;
        }
        int32_t rax_3;
        if ((((!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8))) && rax_1 != 0) && arg3 != 0))
        {
            rax_3 = PsChargeProcessPoolQuota(arg3, 0x200, rbx_1);
            rbx_2 = rax_3;
            if (rax_3 < 0)
            {
                AfdFreeBuffer(rax_1);
            }
        }
        if ((rax_1 == 0 || (((rax_1 != 0 && (!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8)))) && arg3 != 0) && rax_3 < 0)))
        {
            if ((arg4 & 1) != 0)
            {
                ExRaiseStatus(((uint64_t)rbx_2));
                breakpoint();
            }
            rax_2 = 0;
        }
        if ((rax_1 != 0 && (((TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8)) || ((!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8))) && arg3 == 0)) || (((!(TEST_BITW(*(int16_t*)(rax_1 + 0x4c), 8))) && arg3 != 0) && rax_3 >= 0))))
        {
            rax_2 = rax_1;
        }
    }
    return rax_2;
}

Now, early in execution within the AfdGetBuffer() function, we identify the call to the AfdGetBufferSlow() function, where all four arguments are passed through; however, checks are present before and after the function call. The checks before the call are outlined in the code snippet below.

// Function: afd!AfdGetBuffer() - Snippet

//[SNIP]
    int64_t rax_2;
    if ((arg2 > data_1c00592bc || (arg2 <= data_1c00592bc && arg1 > data_1c00592c4)))
    {
        int32_t var_18_1 = arg4;
        rax_2 = AfdGetBufferSlow(arg1, arg2, arg3, arg4);
    }
    if ((arg2 <= data_1c00592bc && arg1 <= data_1c00592c4))
//[SNIP]

It appears that both ARG2 & ARG1 are being checked against data_1c00592bc & data_1c00592c4. The first check passes if ARG2 is greater than data_1c00592bc or, failing that, if ARG2 is less than or equal to data_1c00592bc & ARG1 is greater than data_1c00592c4. In either case, AfdGetBufferSlow() is called.

Now, a good question is: what exactly are the values of data_1c00592bc & data_1c00592c4? It appears that data_1c00592bc is a hard-coded value of 0x1C, as seen in the snippet below.

// data_1c00592bc Value
int32_t data_1c00592bc = 0x1c

Regarding data_1c00592c4, it is a bit trickier; it appears to be initialized with a hard-coded value of 0x10000, as seen in the snippet below. However, some usage (both read/write) is recorded within the AfdReadRegistry() function; further analysis would need to be performed on this function to determine the purpose of data_1c00592c4.

// data_1c00592c4 Value
int32_t data_1c00592c4 = 0x10000

Through cross-referencing data_1c00592c4 it appears that a single write occurs within the function AfdReadRegistry() seen in the snippet below.

// Function: afd!AfdReadRegistry()

int64_t AfdReadRegistry()

{
 //[SNIP]
        int32_t r8_4 = data_1c0059294;
        rbx_1 = 8;
        if (r8_4 < 0x58)
        {
            if ((data_1c0059416 & 8) != 0)
            {
                WPP_SF_SlP();
            }
            r8_4 = 0x58;
            data_1c0059294 = 0x58;
        }
        uint64_t rdx_2 = ((uint64_t)data_1c0059298);
        int64_t* var_168;
        if (rdx_2 < r8_4)
        {
            if ((data_1c0059416 & 8) != 0)
            {
                var_168 = r8_4;
                WPP_SF_Sll(0xf, rdx_2, "MediumBufferSize", rdx_2);
            }
            rdx_2 = ((uint64_t)data_1c0059294);
            data_1c0059298 = rdx_2;
        }
        int32_t r8_5 = data_1c005929c;
        if (r8_5 < rdx_2)
        {
            if ((data_1c0059416 & 8) != 0)
            {
                var_168 = rdx_2;
                rdx_2 = WPP_SF_Sll(0x10, rdx_2, "LargeBufferSize", r8_5);
            }
            r8_5 = data_1c0059298;
            data_1c005929c = r8_5;
        }
        int32_t r9_1 = data_1c00592c4;
        if (r9_1 < r8_5)
        {
            if ((data_1c0059416 & 8) != 0)
            {
                var_168 = r8_5;
                WPP_SF_Sll(0x11, rdx_2, "HugeBufferSize", r9_1);
            }
            data_1c00592c4 = data_1c005929c;
        }
        data_1c00599a3 = AfdReadSingleParameter(var_158, "IgnorePushBitOnReceives", ((uint32_t)data_1c00599a3)) != 0;
        data_1c00599a2 = AfdReadSingleParameter(var_158, "DisableRawSecurity", ((uint32_t)data_1c00599a2)) != 0;
        data_1c0059990 = AfdReadSingleParameter(var_158, "DisableDirectAcceptEx", ((uint32_t)data_1c0059990)) != 0;
        data_1c00599a1 = AfdReadSingleParameter(var_158, "DisableChainedReceive", ((uint32_t)data_1c00599a1)) != 0;
        data_1c0059291 = AfdReadSingleParameter(var_158, "UseTdiSendAndDisconnect", ((uint32_t)data_1c0059291)) != 0;
        if (AfdReadSingleParameter(var_158, "IgnoreOrderlyRelease", 0) != 0)
//[SNIP]
            rax_19 = AfdReadSingleParameter(var_158, "MaxActiveTransmitFileCount", data_1c00596bc);
//[SNIP]
        int32_t rax_16 = AfdReadSingleParameter(var_158, "BufferAlignment", data_1c0059638);
//[SNIP]
        int32_t rax_17 = AfdReadSingleParameter(var_158, "VolatileParameters", ((uint32_t)data_1c00599a0));
//[SNIP]
}

Quite a bit of this function has been removed; however, some interesting registry parameter keys have been left in for readers who want to look into how they are utilized. Now, back to the real reason we are looking at this function: trying to understand where data_1c00592c4 could hold a value other than 0x10000. Right after the WPP_SF_Sll() function call where HugeBufferSize is passed as an argument, data_1c00592c4 is written with the value from data_1c005929c, and this branch is only taken when data_1c00592c4 is smaller than that value. It’s important to note that the AfdReadRegistry() function is only called during the DriverEntry() routine. Further analysis of this function is out of the scope of this post.
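
Read as a whole, the visible part of AfdReadRegistry() looks like a chain of clamps that keeps four size thresholds in ascending order. The sketch below restates that reading; it is an interpretation of the pseudo code, not a verified reconstruction. The struct and function names are illustrative, the small/medium/large/huge labels are inferred from the strings passed to WPP_SF_Sll(), and unsigned comparisons are assumed.

```c
#include <stdint.h>

/* Illustrative grouping of the four globals seen in the pseudo code. */
struct buffer_sizes {
    uint32_t small_size;   /* data_1c0059294 */
    uint32_t medium_size;  /* data_1c0059298 */
    uint32_t large_size;   /* data_1c005929c */
    uint32_t huge_size;    /* data_1c00592c4 */
};

/* Each threshold is raised to at least the one before it. */
void normalize_buffer_sizes(struct buffer_sizes *s)
{
    if (s->small_size < 0x58u)
        s->small_size = 0x58u;
    if (s->medium_size < s->small_size)
        s->medium_size = s->small_size;
    if (s->large_size < s->medium_size)
        s->large_size = s->medium_size;
    if (s->huge_size < s->large_size)
        s->huge_size = s->large_size;
}
```

Under this reading, data_1c00592c4 only moves away from 0x10000 when the large threshold has been configured above it.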

Now, going back to the AfdGetBuffer() function and trying to understand the constraints, we know the check can be passed if we provide an ARG2 greater than 0x1C, or if ARG2 is less than or equal to 0x1C & ARG1 is greater than 0x10000. Either way, we will enter the function AfdGetBufferSlow().
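
The routing decision in AfdGetBuffer() reduces to a two-term predicate. A minimal sketch, assuming unsigned comparisons and passing the two globals in as parameters (takes_slow_path, addr_len_limit, and huge_size are illustrative names):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when AfdGetBuffer() would hand the request to
 * AfdGetBufferSlow() instead of a lookaside list.
 * addr_len_limit models data_1c00592bc, huge_size models data_1c00592c4.
 */
bool takes_slow_path(uint32_t arg1, uint32_t arg2,
                     uint32_t addr_len_limit, uint32_t huge_size)
{
    return arg2 > addr_len_limit || arg1 > huge_size;
}
```

With the values discussed above, takes_slow_path(0x40, 0x3C, 0x1C, 0x10000) is true because the second argument alone exceeds 0x1C, whatever the first argument is.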

Understanding Call Flow & Further Constraints: AfdSuperConnect()

After reviewing all functions up to this point and understanding the initial constraints around size arguments (that are attacker controlled) that are passed to AfdCalculateBufferSize() function, the last function we have to review is AfdSuperConnect().

The code from AfdSuperConnect() is available in the code snippet below.

// Function: afd!AfdSuperConnect()

uint64_t AfdSuperConnect(void* arg1, void* arg2)
{
    arg_8 = arg1;
    arg_18 = nullptr;
    int128_t var_158 = 0;
    int64_t var_148 = 0;
    int64_t var_188 = 0;
    void* rax = *(int64_t*)((char*)arg2 + 0x30);
    int16_t* rbx = *(int64_t*)((char*)rax + 0x18);
    arg_10 = rbx;
    uint64_t rax_1 = ((uint64_t)*(int32_t*)((char*)arg2 + 0x10));
    int64_t rax_4;
    int64_t* rcx_5;
    int32_t rdx_18;
    int64_t* r12_1;
    char* r14_1;
    bool z_1;
    if (rax_1 < 0xc)
    {
        rdx_18 = 0x13a1;
    }
    else
    {
        int32_t var_198_1 = 0;
        if (*(int8_t*)((char*)arg1 + 0x40) != 0)
        {
            int64_t rcx = *(int64_t*)((char*)arg2 + 0x20);
            int64_t rax_2;
            int64_t rdx;
            int64_t (* const r8_1)();
            if ((rcx & 3) != 0)
            {
                rax_2 = ExRaiseDatatypeMisalignment(rcx);
            }
            else
            {
                rdx = (rcx + rax_1);
                r8_1 = MmUserProbeAddress;
                rax_2 = *(int64_t*)MmUserProbeAddress;
            }
            if ((((rcx & 3) != 0 || ((rcx & 3) == 0 && rdx > rax_2)) || (((rcx & 3) == 0 && rdx <= rax_2) && rdx < rcx)))
            {
                *(int8_t*)rax_2 = 0;
            }
            uint64_t rax_3 = ((uint64_t)*(int32_t*)((char*)arg2 + 8));
            if (rax_3 != 0)
            {
                int64_t rcx_6 = *(int64_t*)((char*)arg1 + 0x70);
                int64_t rdx_5 = (rcx_6 + rax_3);
                char* rax_7 = *(int64_t*)r8_1;
                if ((rdx_5 > rax_7 || (rdx_5 <= rax_7 && rdx_5 < rcx_6)))
                {
                    *(int8_t*)rax_7 = 0;
                }
            }
        }
        r14_1 = *(int64_t*)((char*)arg2 + 0x20);
        char* var_128_1 = r14_1;
        if ((data_1c0059c60 == 0 || (data_1c0059c60 != 0 && *(int8_t*)r14_1 != 0)))
        {
            rax_4 = AfdGetBuffer(*(int32_t*)((char*)arg2 + 8), (*(int32_t*)((char*)arg2 + 0x10) - 4), *(int64_t*)((char*)rbx + 0x28), 1);
//[SNIP]

The AfdSuperConnect() function is quite large, so most of it was removed, and the important sections were left. It’s important to note that the AfdSuperConnect() function is reachable from the IOCTL Dispatch Table for the AFD kernel driver. Now, focusing back on our target function AfdGetBuffer(), it appears that the two 32-bit values at offsets 0x8 & 0x10 of ARG2 are passed as arguments; however, 4 is subtracted from the value at ARG2+0x10 before it is passed. Before reaching this point, we have a few more checks on the size arguments. Specifically, at the start, RAX_1 (the value at ARG2+0x10) is checked to see if it is less than 0xC; if it is, the ELSE block is skipped and the call to AfdGetBuffer() is never executed.
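
The relationship between the two fields and the arguments that reach AfdGetBuffer() can be sketched as follows. This is a simplified model: derive_get_buffer_args, get_buffer_args, field_08, and field_10 are illustrative names for the values at ARG2+0x8 and ARG2+0x10, and the user-buffer probing and the data_1c0059c60 condition from the snippet are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two size arguments handed to AfdGetBuffer(). */
struct get_buffer_args {
    uint32_t arg1;
    uint32_t arg2;
};

/*
 * Returns false when the length check fails and AfdGetBuffer()
 * would not be reached; otherwise fills in the derived arguments.
 */
bool derive_get_buffer_args(uint32_t field_08, uint32_t field_10,
                            struct get_buffer_args *out)
{
    if (field_10 < 0xCu)
        return false;           /* unsigned "below 0xC" takes the error path */

    out->arg1 = field_08;       /* passed through unchanged */
    out->arg2 = field_10 - 4u;  /* cannot underflow: field_10 >= 0xC */
    return true;
}
```

Because the subtraction only happens after the 0xC check, the second argument is always at least 8 on this path.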

Pulling It All Together Now: How Do We Trigger Though?

So now that we understand all the constraints around the size arguments passed to the AfdCalculateBufferSize() function, let’s cover a quick recap. We know the value at ARG2+0x10 cannot be less than 0xC and, after 4 is subtracted from it, cannot be greater than 0xFFFF. We also know we have no constraints around ARG1 if we ensure that the value at ARG2+0x10, minus 4, is greater than 0x1C. How do we trigger the integer wrapping bug though? Well, let’s dive into it.

At the start of this post, in the section Unpacking the Mischievous Function: A Closer Look, we reviewed pseudo source for AfdCalculateBufferSize(). Now let’s do some live debugging, starting from the entry point AfdSuperConnect(), to work out the math behind the integer wrapping bug and verify our assumptions about the constraints. We will first pass valid sizes, then some not so valid size values, and observe what happens.

// WinDbg Kernel Mode: Entry into AfdSuperConnect & ARG2

Breakpoint 0 hit
afd!AfdSuperConnect:
fffff806`e9b201d0 4c8bdc          mov     r11,rsp
2: kd> dps rdx
ffff858b`1465cd48  00000000`0005310e
ffff858b`1465cd50  00000000`00000040
ffff858b`1465cd58  00000000`00000040
ffff858b`1465cd60  00000000`000120c7
ffff858b`1465cd68  00000000`00b652f0
ffff858b`1465cd70  ffff858b`138bbe00
ffff858b`1465cd78  ffff858b`14520e10

Reviewing the output from above, let’s break down what we are seeing:

  • ARG2+0x8  = 0x40 | Size (Fully Attacker Controlled)
  • ARG2+0x20 = 00000000`00b652f0 | Usermode Buffer (Fully Attacker Controlled)
  • ARG2+0x28 = ffff858b`138bbe00 | Afd Device Object
  • ARG2+0x30 = ffff858b`14520e10 | Afd Endpoint File Object

It is important to note that the usermode buffer is entirely attacker-controlled. Depending on which IOCTL is triggered within the Dispatch Table, certain checks will need to be bypassed, as values are frequently parsed from this buffer; this parsing can be tracked via usage of ARG2+0x20 upon entry (just a friendly tip for others who are also auditing AFD, to make static analysis easier).

Now, let’s go to our first check of ARG2+0x10.

// WinDbg Kernel Mode: AfdSuperConnect() | First Check ARG2+0x10

2: kd> t
afd!AfdSuperConnect+0x51:
fffff806`e9b20221 83f80c          cmp     eax,0Ch
2: kd> t
afd!AfdSuperConnect+0x54:
fffff806`e9b20224 0f8216880100    jb      afd!AfdSuperConnect+0x18870 (fffff806`e9b38a40)
2: kd> r rax
rax=0000000000000040

We have an unsigned comparison here, and we will not execute this jump since 0x40 is NOT below 0xC. Next, we dive straight to AfdGetBuffer().
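
The signedness of the comparison matters because the decompiled pseudo code types these values as int32_t, while the jb instruction compares them as unsigned. A minimal sketch of the difference (below_unsigned and less_signed are illustrative helper names):

```c
#include <stdbool.h>
#include <stdint.h>

/* What "cmp eax, 0Ch / jb" tests: unsigned below. */
bool below_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}

/* What a signed "jl" would test on the same operands. */
bool less_signed(int32_t a, int32_t b)
{
    return a < b;
}
```

A length of 0xFFFFFFFF is not below 0xC when compared unsigned, so it would pass this check, whereas the same bit pattern read as a signed -1 would be less than 0xC.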

// WinDbg Kernel Mode: AfdSuperConnect() | Entry to AfdGetBuffer()

2: kd> tc
afd!AfdSuperConnect+0xcc:
fffff806`e9b2029c e81fc2ffff      call    afd!AfdGetBuffer (fffff806`e9b1c4c0)
2: kd> r rcx, rdx, r8, r9
rcx=0000000000000040 rdx=000000000000003c r8=ffff858b188a90c0 r9=0000000000000001
2: kd> dps ffff858b188a90c0
ffff858b`188a90c0  00000000`00000003

Dumping the arguments to AfdGetBuffer(), we see:

  • ARG1 = 0x40
  • ARG2 = 0x3C (0x40 – 4)
  • ARG3 = ffff858b188a90c0 (Which points to 0x3)
  • ARG4 = 0x1

Up next is the first check to see if ARG2 is greater than 0x1C.

// WinDbg Kernel Mode: AfdGetBuffer() | First Check ARG2

2: kd> t
afd!AfdGetBuffer+0x14:
fffff806`e9b1c4d4 3b15e2cd0400    cmp     edx,dword ptr [afd!AfdStandardAddressLength (fffff806`e9b692bc)]

Now, this is interesting: we have a symbol name, AfdStandardAddressLength. Let’s look at some of the surrounding assembly and see what other information is available to aid reversing.

// WinDbg Kernel Mode: AfdGetBuffer() | Reviewing Additional Information

fffff806`e9b1c4d4 3b15e2cd0400     cmp     edx, dword ptr [afd!AfdStandardAddressLength (fffff806e9b692bc)]
fffff806`e9b1c4da 418be9           mov     ebp, r9d
fffff806`e9b1c4dd 498bf0           mov     rsi, r8
fffff806`e9b1c4e0 0f87d1000000     ja      afd!AfdGetBuffer+0xf7 (fffff806e9b1c5b7)
fffff806`e9b1c4e6 3b0dd8cd0400     cmp     ecx, dword ptr [afd!AfdHugeBufferSize (fffff806e9b692c4)]
fffff806`e9b1c4ec 0f87c5000000     ja      afd!AfdGetBuffer+0xf7 (fffff806e9b1c5b7)
fffff806`e9b1c4f2 3b0d9ccd0400     cmp     ecx, dword ptr [afd!AfdSmallBufferSize (fffff806e9b69294)]
fffff806`e9b1c4f8 0f8696000000     jbe     afd!AfdGetBuffer+0xd4 (fffff806e9b1c594)
fffff806`e9b1c4fe 3b0d94cd0400     cmp     ecx, dword ptr [afd!AfdMediumBufferSize (fffff806e9b69298)]
fffff806`e9b1c504 0f86b8000000     jbe     afd!AfdGetBuffer+0x102 (fffff806e9b1c5c2)
fffff806`e9b1c50a 3b0d8ccd0400     cmp     ecx, dword ptr [afd!AfdLargeBufferSize (fffff806e9b6929c)]
fffff806`e9b1c510 0f878e000000     ja      afd!AfdGetBuffer+0xe4 (fffff806e9b1c5a4)

Interestingly, we now see further checks against the following:

  • AfdHugeBufferSize   – fffff806e9b692c4  00010000
  • AfdSmallBufferSize  – fffff806e9b69294  00000080
  • AfdMediumBufferSize – fffff806e9b69298  00000640
  • AfdLargeBufferSize  – fffff806e9b6929c  00002000
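
Taken together with the AfdGetBuffer() pseudo code, these comparisons amount to a size-class selection on the fast path. The sketch below uses the values dumped above; the enum, the bucket names, and pick_bucket are illustrative, and the mapping of each branch to a named bucket is an inference from the order of the comparisons rather than something confirmed by symbols.

```c
#include <stdint.h>

enum buffer_bucket {
    BUCKET_SMALL,
    BUCKET_MEDIUM,
    BUCKET_LARGE,
    BUCKET_HUGE
};

/*
 * Fast-path selection for a request that did not go to
 * AfdGetBufferSlow() (so arg1 is at most the huge size here).
 */
enum buffer_bucket pick_bucket(uint32_t arg1)
{
    const uint32_t small_size  = 0x80u;    /* AfdSmallBufferSize  */
    const uint32_t medium_size = 0x640u;   /* AfdMediumBufferSize */
    const uint32_t large_size  = 0x2000u;  /* AfdLargeBufferSize  */

    if (arg1 <= small_size)
        return BUCKET_SMALL;
    if (arg1 <= medium_size)
        return BUCKET_MEDIUM;
    if (arg1 > large_size)
        return BUCKET_HUGE;
    return BUCKET_LARGE;
}
```

For the request traced here the question never arises, since the second argument already forces the slow path.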

If we review the afd!AfdReadRegistry() code snippet from earlier, we will see some similar names passed to the WPP_SF_Sll(), which could help in the analysis for those who are also reversing the AFD driver. Now, back to the checks being performed.

// WinDbg Kernel Mode: AfdGetBuffer() | Entry Into AfdGetBufferSlow()

2: kd> t
afd!AfdGetBuffer+0x1a:
fffff806`e9b1c4da 418be9          mov     ebp,r9d
2: kd> t
afd!AfdGetBuffer+0x1d:
fffff806`e9b1c4dd 498bf0          mov     rsi,r8
2: kd> t
afd!AfdGetBuffer+0x20:
fffff806`e9b1c4e0 0f87d1000000    ja      afd!AfdGetBuffer+0xf7 (fffff806`e9b1c5b7)
2: kd> t
afd!AfdGetBuffer+0xf7:
fffff806`e9b1c5b7 896c2420        mov     dword ptr [rsp+20h],ebp
2: kd> t
afd!AfdGetBuffer+0xfb:
fffff806`e9b1c5bb e8e04bffff      call    afd!AfdGetBufferSlow (fffff806`e9b111a0)
2: kd> r rcx, rdx, r8, r9
rcx=0000000000000040 rdx=000000000000003c r8=ffff858b188a90c0 r9=0000000000000001

All arguments appear to be exactly the same, as we reviewed earlier, before entering into the AfdGetBufferSlow() function.

// WinDbg Kernel Mode: AfdGetBufferSlow() | Past First Check & Entering Into AfdCalculateBufferSize()

2: kd> t
afd!AfdGetBufferSlow+0x24:
fffff806`e9b111c4 81faffff0000    cmp     edx,0FFFFh
2: kd> t
afd!AfdGetBufferSlow+0x2a:
fffff806`e9b111ca 0f87c5fd0100    ja      afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95)
2: kd> t r edx
edx=3c
2: kd> t
afd!AfdGetBufferSlow+0x30:
fffff806`e9b111d0 85c9            test    ecx,ecx
2: kd> t
afd!AfdGetBufferSlow+0x32:
fffff806`e9b111d2 bd04000000      mov     ebp,4
2: kd> t
afd!AfdGetBufferSlow+0x37:
fffff806`e9b111d7 458ac6          mov     r8b,r14b
2: kd> t
afd!AfdGetBufferSlow+0x3a:
fffff806`e9b111da 0f45e9          cmovne  ebp,ecx
2: kd> t
afd!AfdGetBufferSlow+0x3d:
fffff806`e9b111dd 8bcd            mov     ecx,ebp
2: kd> t
afd!AfdGetBufferSlow+0x3f:
fffff806`e9b111df e8bc040000      call    afd!AfdCalculateBufferSize (fffff806`e9b116a0)
2: kd> r rcx, rdx, r8
rcx=0000000000000040 rdx=000000000000003c r8=ffff858b188a9003
2: kd> dps r8
ffff858b`188a9003  00000000`00000000

The first check of ARG2 against 0xFFFF passes since ARG2 is 0x3C, and we immediately enter into the function call to AfdCalculateBufferSize(). This is our destination, so let’s track what occurs thoroughly.

// WinDbg Kernel Mode: AfdCalculateBufferSize() | Overview

afd!AfdCalculateBufferSize:
fffff806`e9b116a0 48895c2408     mov     qword ptr [rsp+8], rbx
fffff806`e9b116a5 4889742410     mov     qword ptr [rsp+10h], rsi
fffff806`e9b116aa 57             push    rdi
fffff806`e9b116ab 4883ec20       sub     rsp, 20h
fffff806`e9b116af 8bda           mov     ebx, edx
fffff806`e9b116b1 8bf9           mov     edi, ecx
fffff806`e9b116b3 8bd1           mov     edx, ecx
fffff806`e9b116b5 33c9           xor     ecx, ecx
fffff806`e9b116b7 410fbef0       movsx   esi, r8b
fffff806`e9b116bb 48ff151edc0500 call    qword ptr [afd!__imp_MmSizeOfMdl (fffff806e9b6f2e0)]
fffff806`e9b116c2 0f1f440000     nop     dword ptr [rax+rax]
fffff806`e9b116c7 baf0ffffff     mov     edx, 0FFFFFFF0h
fffff806`e9b116cc 440fb7ce       movzx   r9d, si
fffff806`e9b116d0 6641c1e103     shl     r9w, 3
fffff806`e9b116d5 448d580f       lea     r11d, [rax+0Fh]
fffff806`e9b116d9 b8d0000000     mov     eax, 0D0h
fffff806`e9b116de 4423da         and     r11d, edx
fffff806`e9b116e1 458d1431       lea     r10d, [r9+rsi]
fffff806`e9b116e5 6641c1e203     shl     r10w, 3
fffff806`e9b116ea 664403d0       add     r10w, ax
fffff806`e9b116ee 8d470f         lea     eax, [rdi+0Fh]
fffff806`e9b116f1 23c2           and     eax, edx
fffff806`e9b116f3 410fb7ca       movzx   ecx, r10w
fffff806`e9b116f7 83c10f         add     ecx, 0Fh
fffff806`e9b116fa 83c060         add     eax, 60h
fffff806`e9b116fd 23ca           and     ecx, edx
fffff806`e9b116ff 03cb           add     ecx, ebx
fffff806`e9b11701 418d140b       lea     edx, [r11+rcx]
fffff806`e9b11705 b900100000     mov     ecx, 1000h
fffff806`e9b1170a 03d0           add     edx, eax
fffff806`e9b1170c 3bd1           cmp     edx, ecx
fffff806`e9b1170e 7318           jae     afd!AfdCalculateBufferSize+0x88 (fffff806e9b11728)

Now, let’s step through these instructions after the call to MmSizeOfMdl(), record each argument, and see where the fault resides within the AfdCalculateBufferSize() function.

// WinDbg Kernel Mode: AfdCalculateBufferSize() | Single Step Analysis

fffff806`e9b116c2 0f1f440000     nop     dword ptr [rax+rax]
fffff806`e9b116c7 baf0ffffff     mov     edx, 0FFFFFFF0h   // edx = 0xFFFFFFF0
fffff806`e9b116cc 440fb7ce       movzx   r9d, si           // si = 0x3 , r9d = 0x3
fffff806`e9b116d0 6641c1e103     shl     r9w, 3            // r9w = 0x18
fffff806`e9b116d5 448d580f       lea     r11d, [rax+0Fh]   // rax = 0x38, rax+0xF = 0x47, r11d = 0x47
fffff806`e9b116d9 b8d0000000     mov     eax, 0D0h         // eax = 0xD0
fffff806`e9b116de 4423da         and     r11d, edx         // edx = 0xfffffff0, r11d = 0x47, r11d = 0x40
fffff806`e9b116e1 458d1431       lea     r10d, [r9+rsi]    // r9 = 0x18, rsi = 0x3, r10d = 0x1B
fffff806`e9b116e5 6641c1e203     shl     r10w, 3           // r10w = 0xD8
fffff806`e9b116ea 664403d0       add     r10w, ax          // ax = D0, r10w = 0xD8, r10w = 0x1A8
fffff806`e9b116ee 8d470f         lea     eax, [rdi+0Fh]    // rdi = (0x40), rdi+0xf = 0x4F, eax = 0x4F (Attacker Fully Controlled RDI)
fffff806`e9b116f1 23c2           and     eax, edx          // edx = 0xFFFFFFF0, eax = 0x4F, eax = 0x40
fffff806`e9b116f3 410fb7ca       movzx   ecx, r10w         // r10w = 0x1A8, ecx = 0x0, ecx = 0x1A8
fffff806`e9b116f7 83c10f         add     ecx, 0Fh          // ecx = 0x1A8, ecx + 0xF = 0x1B7
fffff806`e9b116fa 83c060         add     eax, 60h          // eax = 0x40, eax + 0x60 = 0xA0
fffff806`e9b116fd 23ca           and     ecx, edx          // edx = 0xFFFFFFF0, ecx = 0x1B7, ecx = 0x1B0
fffff806`e9b116ff 03cb           add     ecx, ebx          // ebx = 0x3C, ecx = 0x1B0, ecx = 0x1EC (Attacker Fully Controlled EBX)
fffff806`e9b11701 418d140b       lea     edx, [r11+rcx]    // r11 = 0x40, rcx = 0x1EC, edx = 0x22C
fffff806`e9b11705 b900100000     mov     ecx, 1000h        // ecx = 0x1000, the threshold for the size check
fffff806`e9b1170a 03d0           add     edx, eax          // eax = 0xA0, edx = 0x22C, edx = 0x2CC
fffff806`e9b1170c 3bd1           cmp     edx, ecx          // Check if the resulting sum of 0x2CC is greater than or equal to 0x1000
fffff806`e9b1170e 7318           jae     afd!AfdCalculateBufferSize+0x88 (fffff806e9b11728)

Reviewing the above output, it appears that the instruction at address 0xfffff806e9b116ee is the first to use an attacker-controlled value (RDI), storing the result in EAX. Immediately after, a mask is applied to that value, and shortly after, 0x60 is added to the masked value.
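
Those three instructions can be mirrored one-to-one in C with a 32-bit unsigned type, which wraps in the same way the 32-bit lea does. This is a sketch for illustration; arg1_term is a made-up name for the portion of the sum that depends on the first size argument.

```c
#include <stdint.h>

/* Mirrors lea eax,[rdi+0Fh] / and eax,edx / add eax,60h. */
uint32_t arg1_term(uint32_t arg1)
{
    uint32_t eax = arg1 + 0xFu;  /* result is truncated to 32 bits */
    eax &= 0xFFFFFFF0u;          /* round down to a multiple of 16 */
    eax += 0x60u;                /* fixed overhead */
    return eax;
}
```

For the traced value, arg1_term(0x40) is 0xA0. For 0xFFFFFFFF the addition wraps to 0xE, the mask clears it to 0, and the term collapses to 0x60, so the largest possible request contributes no more than the fixed overhead.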

This is where our wrap can occur if we max out the size of RDI, since no checks verify this value before this code is reached. Now, let’s continue, return from this function, and observe what occurs.

// WinDbg Kernel Mode: AfdGetBufferSlow() | AfdCalculateBufferSize() Return Checks

2: kd> t
afd!AfdGetBufferSlow+0x44:
fffff806`e9b111e4 3bc5            cmp     eax,ebp
2: kd> dc rax
00000000`000002cc
2: kd> dc ebp
00000000`00000040
2: kd> t
afd!AfdGetBufferSlow+0x46:
fffff806`e9b111e6 0f82a9fd0100    jb      afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95)
2: kd> t
afd!AfdGetBufferSlow+0x4c:
fffff806`e9b111ec 3bc3            cmp     eax,ebx
2: kd> r eax, rbx
eax=2cc rbx=000000000000003c
2: kd> t
afd!AfdGetBufferSlow+0x4e:
fffff806`e9b111ee 0f82a1fd0100    jb      afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95)
2: kd> t
afd!AfdGetBufferSlow+0x54:
fffff806`e9b111f4 8364243c00      and     dword ptr [rsp+3Ch],0
2: kd> t
afd!AfdGetBufferSlow+0x59:
fffff806`e9b111f9 4c8d4c2430      lea     r9,[rsp+30h]
2: kd> t
afd!AfdGetBufferSlow+0x5e:
fffff806`e9b111fe 8364243800      and     dword ptr [rsp+38h],0
2: kd> t
afd!AfdGetBufferSlow+0x63:
fffff806`e9b11203 41b841666442    mov     r8d,42646641h
2: kd> t
afd!AfdGetBufferSlow+0x69:
fffff806`e9b11209 8bd0            mov     edx,eax
2: kd> t
afd!AfdGetBufferSlow+0x6b:
fffff806`e9b1120b b942000000      mov     ecx,42h
2: kd> t
afd!AfdGetBufferSlow+0x70:
fffff806`e9b11210 48c744243001000000 mov   qword ptr [rsp+30h],1
2: kd> t
afd!AfdGetBufferSlow+0x79:
fffff806`e9b11219 448bf8          mov     r15d,eax
2: kd> t
afd!AfdGetBufferSlow+0x7c:
fffff806`e9b1121c c744242001000000 mov     dword ptr [rsp+20h],1
2: kd> t
afd!AfdGetBufferSlow+0x84:
fffff806`e9b11224 48ff159de00500  call    qword ptr [afd!_imp_ExAllocatePool3 (fffff806`e9b6f2c8)]
2: kd> r rcx, rdx, r8
rcx=0000000000000042 rdx=00000000000002cc r8=0000000042646641

Upon returning from AfdCalculateBufferSize(), the two checks are passed since the return value is greater than both provided arguments to the AfdCalculateBufferSize() function, with the return value being used as a size argument to ExAllocatePool3().

Pulling It All Together Now: Forcing Integer Wrap

Now that we understand the bug and the constraints around it, we will supply input that triggers the wrap and use dynamic analysis to verify the assumptions from our static analysis. This time, we will skip all earlier checks and focus on the final function, AfdCalculateBufferSize(), the return value, and the error function call to ExRaiseStatus().

// WinDbg Kernel Mode: AfdCalculateBufferSize() | Trigger Wrap & ExRaiseStatus()

3: kd> t
afd!AfdCalculateBufferSize+0x4e:
fffff806`e9b116ee 8d470f          lea     eax,[rdi+0Fh]
3: kd> r rdi, rax
rdi=00000000ffffffff rax=00000000000000d0
3: kd> t
afd!AfdCalculateBufferSize+0x51:
fffff806`e9b116f1 23c2            and     eax,edx
3: kd> r rax
rax=000000000000000e
3: kd> dc rdi+0xf
00000001`0000000e
3: kd> dc rdi
00000000`ffffffff
3: kd> ?rdi+0xf
Evaluate expression: 4294967310 = 00000001`0000000e
3: kd> t
afd!AfdCalculateBufferSize+0x53:
fffff806`e9b116f3 410fb7ca        movzx   ecx,r10w
3: kd> t
afd!AfdCalculateBufferSize+0x57:
fffff806`e9b116f7 83c10f          add     ecx,0Fh
3: kd> t
afd!AfdCalculateBufferSize+0x5a:
fffff806`e9b116fa 83c060          add     eax,60h
3: kd> r rax
rax=0000000000000000
3: kd> ?rax+0x60
Evaluate expression: 96 = 00000000`00000060
3: kd> t
afd!AfdCalculateBufferSize+0x5d:
fffff806`e9b116fd 23ca            and     ecx,edx
3: kd> t
afd!AfdCalculateBufferSize+0x5f:
fffff806`e9b116ff 03cb            add     ecx,ebx
3: kd> t
afd!AfdCalculateBufferSize+0x61:
fffff806`e9b11701 418d140b        lea     edx,[r11+rcx]
3: kd> t
afd!AfdCalculateBufferSize+0x65:
fffff806`e9b11705 b900100000      mov     ecx,1000h
3: kd> t
afd!AfdCalculateBufferSize+0x6a:
fffff806`e9b1170a 03d0            add     edx,eax
3: kd> t
afd!AfdCalculateBufferSize+0x6c:
fffff806`e9b1170c 3bd1            cmp     edx,ecx
3: kd> dc edx
00000000`0080027c
3: kd> dc ecx
00000000`00001000
3: kd> t
afd!AfdCalculateBufferSize+0x6e:
fffff806`e9b1170e 7318            jae     afd!AfdCalculateBufferSize+0x88 (fffff806`e9b11728)
3: kd> t
afd!AfdCalculateBufferSize+0x88:
fffff806`e9b11728 488b5c2430      mov     rbx,qword ptr [rsp+30h]
3: kd> t
afd!AfdCalculateBufferSize+0x8d:
fffff806`e9b1172d 8bc2            mov     eax,edx
3: kd> t
afd!AfdCalculateBufferSize+0x8f:
fffff806`e9b1172f 488b742438      mov     rsi,qword ptr [rsp+38h]
3: kd> t
afd!AfdCalculateBufferSize+0x94:
fffff806`e9b11734 4883c420        add     rsp,20h
3: kd> t
afd!AfdCalculateBufferSize+0x98:
fffff806`e9b11738 5f              pop     rdi
3: kd> t
afd!AfdCalculateBufferSize+0x99:
fffff806`e9b11739 c3              ret
3: kd> t
afd!AfdGetBufferSlow+0x44:
fffff806`e9b111e4 3bc5            cmp     eax,ebp
3: kd> r eax, ebp
eax=80027c ebp=ffffffff
3: kd> t
afd!AfdGetBufferSlow+0x46:
fffff806`e9b111e6 0f82a9fd0100    jb      afd!AfdGetBufferSlow+0x1fdf5 (fffff806`e9b30f95)
3: kd> t
afd!AfdGetBufferSlow+0x1fdf5:
fffff806`e9b30f95 bf9a0000c0      mov     edi,0C000009Ah
3: kd> t
afd!AfdGetBufferSlow+0x1fdfa:
fffff806`e9b30f9a f684248000000001 test    byte ptr [rsp+80h],1
4: kd> t
afd!AfdGetBufferSlow+0x1fe02:
fffff806`e9b30fa2 740f            je      afd!AfdGetBufferSlow+0x1fe13 (fffff806`e9b30fb3)
4: kd> t
afd!AfdGetBufferSlow+0x1fe04:
fffff806`e9b30fa4 8bcf            mov     ecx,edi
4: kd> t
afd!AfdGetBufferSlow+0x1fe06:
fffff806`e9b30fa6 48ff153be30300  call    qword ptr [afd!_imp_ExRaiseStatus (fffff806`e9b6f2e8)]

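Sadly, the checks made upon returning from the AfdCalculateBufferSize() function do their job, and this bug is short-lived: the wrapped return value (0x80027c) is below the supplied argument held in ebp (0xffffffff), so the jb branch is taken and ExRaiseStatus() is called with 0xC000009A (STATUS_INSUFFICIENT_RESOURCES). It is also important to note that the resulting value from AfdCalculateBufferSize() is not 100% attacker-controlled. This is another constraint we must work with.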
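The wrap seen in the trace can be reproduced in user mode with plain 32-bit arithmetic. The sketch below models only the ((arg1 + 0xf) & 0xfffffff0) + 0x60 term from the decompilation; the function names add_round_bias and aligned_len_plus_header are hypothetical, and the MmSizeOfMdl() and remaining terms are deliberately left out.

```c
// Sketch: 32-bit wrap in the aligned-length term of AfdCalculateBufferSize()

#include <stdint.h>

/* Mirrors `lea eax,[rdi+0Fh]`: the sum is truncated to 32 bits. */
uint32_t add_round_bias(uint32_t len)
{
    return len + 0xFu;
}

/*
 * Mirrors `and eax,edx` followed by `add eax,60h`, using the 0xfffffff0
 * mask from the decompilation. Only this single term is modeled.
 */
uint32_t aligned_len_plus_header(uint32_t len)
{
    return (add_round_bias(len) & 0xFFFFFFF0u) + 0x60u;
}
```

For a length of 0xffffffff, add_round_bias() yields 0xe, which matches rax after the lea instruction, while the 64-bit expression ?rdi+0xf evaluates to 0x10000000e. Masking 0xe gives 0, and adding 0x60 leaves 0x60, so a maximal length contributes almost nothing to the final size.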

The Importance of Verifying Return Values

Now, yes, we went through all of this for a bug that is not abusable in this specific scenario. However, at this time, the AfdCalculateBufferSize() function is utilized in a few locations, and others might be more promising. This analysis covered the normal case, where the return value is properly validated and an exception is raised. Still, developers are humans as well, and in the future the AfdCalculateBufferSize() function may be improperly utilized, with its return value never validated before being passed as an argument to sensitive routines. Remember, while we didn’t find a useful purpose for this bug, we learned a decent amount about a common theme: how IOCTL arguments supplied from usermode flow into the functions that are reachable through the AFD kernel driver.

Thus, we have a potential primitive in the pocket that will wait and lurk for a day when a developer forgets to check the return value properly!
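One way to avoid depending on every caller is to make the calculation itself report overflow. The sketch below is a defensive variant of the aligned-length term written in portable C; it is not how AFD.sys is implemented, and the names checked_add_u32 and aligned_len_plus_header_checked are hypothetical. Kernel code would more likely use the ntintsafe.h helpers such as RtlULongAdd() for the same purpose.

```c
// Sketch: overflow-aware version of the aligned-length term

#include <stdbool.h>
#include <stdint.h>

/* Adds two 32-bit values; returns false instead of wrapping. */
bool checked_add_u32(uint32_t a, uint32_t b, uint32_t *out)
{
    if (a > UINT32_MAX - b) {
        return false;
    }
    *out = a + b;
    return true;
}

/*
 * Hypothetical overflow-aware variant of
 * ((len + 0xf) & 0xfffffff0) + 0x60.
 * Returns false when any intermediate addition would wrap.
 */
bool aligned_len_plus_header_checked(uint32_t len, uint32_t *out)
{
    uint32_t biased;

    if (!checked_add_u32(len, 0xFu, &biased)) {
        return false;
    }
    return checked_add_u32(biased & 0xFFFFFFF0u, 0x60u, out);
}
```

With this shape, a length of 0xffffffff is rejected inside the calculation, so a caller that forgets to compare the result against its inputs still never receives a wrapped size.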

Conclusion

This blog post covered quite a bit of material to provide readers with a step-by-step approach to performing static and dynamic analysis while reversing Windows Kernel Drivers. Along with general kernel driver analysis, some other topics of vulnerability research and exploit development were covered, such as understanding constraints, identifying call flow to build a Proof-of-Concept, and the mindset needed to understand how a bug could be abused in the future. Some areas within the post also leave room for others to explore further if they wish, such as the AFD Registry function and other places where the AfdCalculateBufferSize() function is utilized.

Thank you for reading this post; hopefully, it helps others learn to perform security research. Remember, just because this bug appears to cause no issues today does not mean that conclusion cannot be challenged in the future.

Special shoutout to Jon Reyes (@notCh3rn0byl) for the assistance during the N-day analysis and research!