UNEXPECTED_KERNEL_MODE_TRAP (7f)
This means a trap occurred in kernel mode, and it's a trap of a kind
that the kernel isn't allowed to have/catch (bound trap) or that
is always instant death (double fault).
Arguments:
Arg1: 0000000000000008, EXCEPTION_DOUBLE_FAULT
Arg2: 0000000080050033
Arg3: 00000000000406f8
Arg4: fffff800032aa875
In our case, the first argument is 8, which indicates that a double fault occurred. So what is a double fault, and when and why does one occur?
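As a quick aside before answering that: the first argument of a 0x7F bugcheck is the processor's exception vector number, so it can be decoded with a simple lookup. The sketch below is only an illustration; the table covers the common x86/x64 vectors, and the `decode_trap` helper name is made up for this post.

```python
# Common x86/x64 exception vectors. Illustrative subset, not a complete list.
TRAP_NAMES = {
    0x0: "Divide Error (#DE)",
    0x1: "Debug (#DB)",
    0x2: "Non-Maskable Interrupt",
    0x3: "Breakpoint (#BP)",
    0x4: "Overflow (#OF)",
    0x5: "BOUND Range Exceeded (#BR)",
    0x6: "Invalid Opcode (#UD)",
    0x7: "Device Not Available (#NM)",
    0x8: "Double Fault (#DF)",
    0xA: "Invalid TSS (#TS)",
    0xB: "Segment Not Present (#NP)",
    0xC: "Stack-Segment Fault (#SS)",
    0xD: "General Protection (#GP)",
    0xE: "Page Fault (#PF)",
}


def decode_trap(arg1_hex: str) -> str:
    """Map Arg1 of bugcheck 0x7F (a hex string) to a trap name."""
    vector = int(arg1_hex, 16)
    return TRAP_NAMES.get(vector, f"unknown or reserved vector {vector:#x}")


if __name__ == "__main__":
    # Arg1 exactly as printed in the bugcheck output above.
    print(decode_trap("0000000000000008"))
```

Vector 5 is the "bound trap" and vector 8 the "double fault" that the bugcheck description mentions.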
Double faults occur when an exception cannot be handled by its handler, or when a second exception occurs while the CPU is still trying to call the handler for an earlier one. In most cases, two such exceptions are simply handled one after the other. In some cases they cannot be: if a page fault occurs and the exception handler itself is located in a not-present page, a second page fault is raised while the first is being delivered, and neither of them can be handled. This is known as a double fault! Double faults can also occur (as in this scenario) when the processor cannot properly service a pending interrupt.
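The rule the processor applies can be sketched in a few lines. This is a simplified model based on the classic exception classes from the Intel manuals (benign, contributory, page fault); it ignores newer vectors, and the function names are my own.

```python
# Exception classes used by the CPU when a second exception arrives while
# the first is still being delivered. Simplified: newer vectors are omitted.
CONTRIBUTORY = {0, 10, 11, 12, 13}  # #DE, #TS, #NP, #SS, #GP
PAGE_FAULT = {14}                   # #PF


def exception_class(vector: int) -> str:
    if vector in CONTRIBUTORY:
        return "contributory"
    if vector in PAGE_FAULT:
        return "page_fault"
    return "benign"


def is_double_fault(first: int, second: int) -> bool:
    """True if `second`, raised while delivering `first`, escalates to #DF."""
    a, b = exception_class(first), exception_class(second)
    if a == "contributory":
        return b == "contributory"
    if a == "page_fault":
        return b in ("contributory", "page_fault")
    return False  # a benign first exception never escalates


if __name__ == "__main__":
    # Page fault while delivering a page fault: the case described above.
    print(is_double_fault(14, 14))
    # Breakpoint followed by a page fault: handled one after the other.
    print(is_double_fault(3, 14))
```

Any other combination is handled serially, which is why most back-to-back exceptions never turn into a bugcheck.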
4: kd> k
Child-SP          RetAddr           Call Site
fffff880`009b9de8 fffff800`0328b169 nt!KeBugCheckEx
fffff880`009b9df0 fffff800`03289632 nt!KiBugCheckDispatch+0x69
fffff880`009b9f30 fffff800`032aa875 nt!KiDoubleFaultAbort+0xb2 <- Uh oh, double fault!
fffff880`03dccfd0 fffff800`032909ba nt!KiIpiSendRequest+0x305 <- As it is a multiprocessor job, processor #4 sent an inter-processor interrupt to interrupt another processor saying "Hey, we need to flush the TLB."
fffff880`03dcd090 fffff800`032ec198 nt!KeFlushMultipleRangeTb+0x22a <- Flushing the translation lookaside buffer, this is a multiprocessor job.
fffff880`03dcd160 fffff800`033935ea nt! ?? ::FNODOBFM::`string'+0x204ce
fffff880`03dcd350 fffff800`03394be7 nt!MiEmptyWorkingSet+0x24a <- Removing as many pages as possible from the working set.
fffff880`03dcd400 fffff800`0372f371 nt!MiTrimAllSystemPagableMemory+0x218 <- Unmapping all pageable system memory.
fffff880`03dcd460 fffff800`0372f4cf nt!MmVerifierTrimMemory+0xf1
fffff880`03dcd490 fffff800`0372fc24 nt!ViKeRaiseIrqlSanityChecks+0xcf <- As verifier is enabled, it's doing a sanity check. A sanity check is essentially verifier saying "Okay, what IRQL are we on and are we supposed to be here?"
fffff880`03dcd4d0 fffff880`018443f5 nt!VerifierKeAcquireSpinLockRaiseToDpc+0x54 <- IRST raising the IRQL to DISPATCH_LEVEL (2) and then acquiring a spin lock.
fffff880`03dcd530 fffff880`018222a2 iaStor+0x253f5 <- Intel Rapid Storage Technology
fffff880`03dcd560 fffff880`01871489 iaStor+0x32a2 <- Intel Rapid Storage Technology
4: kd> ub nt!KiIpiSendRequest+0x305
nt!KiIpiSendRequest+0x2eb:
fffff800`032aa85b 5e              pop     rsi
fffff800`032aa85c 5d              pop     rbp
fffff800`032aa85d c3              ret
fffff800`032aa85e 8bc6            mov     eax,esi
fffff800`032aa860 e9e2feffff      jmp     nt!KiIpiSendRequest+0x1d7 (fffff800`032aa747)
fffff800`032aa865 0fb70db4892100  movzx   ecx,word ptr [nt!KeActiveProcessors (fffff800`034c3220)]
fffff800`032aa86c 0fb705af892100  movzx   eax,word ptr [nt!KeActiveProcessors+0x2 (fffff800`034c3222)]
fffff800`032aa873 8bfa            mov     edi,edx
By unassembling backwards from nt!KiIpiSendRequest+0x305, we can see a check of the active processors just before the attempt to send the IPI.
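If you want to convince yourself that the debugger's symbol+offset labels and the raw bytes agree, the arithmetic is easy to reproduce. The sketch below only uses addresses and bytes from the dump; the instruction lengths (5 and 7 bytes) are read off the byte columns, and the variable names are mine.

```python
# All constants come from the `k` and `ub` output in this post.
RET_ADDR = 0xFFFFF800032AA875       # nt!KiIpiSendRequest+0x305

# Symbol + offset: subtract the offset to recover the function start.
func_base = RET_ADDR - 0x305
ub_start = func_base + 0x2EB        # where `ub` began disassembling

# jmp rel32 (e9 e2 fe ff ff): target = next instruction + signed displacement.
jmp_addr, jmp_len = 0xFFFFF800032AA860, 5
rel32 = int.from_bytes(bytes.fromhex("e2feffff"), "little", signed=True)
jmp_target = jmp_addr + jmp_len + rel32

# movzx ecx, word ptr [rip+disp32] (0f b7 0d b4 89 21 00): RIP-relative load.
movzx_addr, movzx_len = 0xFFFFF800032AA865, 7
disp32 = int.from_bytes(bytes.fromhex("b4892100"), "little", signed=True)
movzx_target = movzx_addr + movzx_len + disp32

if __name__ == "__main__":
    print(f"function base : {func_base:#018x}")
    print(f"ub start      : {ub_start:#018x}")
    print(f"jmp target    : {jmp_target:#018x}  (+{jmp_target - func_base:#x})")
    print(f"movzx operand : {movzx_target:#018x}")
```

The jump target lands at offset 0x1d7 inside the function and the load resolves to the address the debugger labels nt!KeActiveProcessors, matching the disassembly.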
4: kd> !ipi
IPI State for Processor 0
    TargetCount 0  PacketBarrier 0  IpiFrozen 2 [Frozen]
IPI State for Processor 1
    TargetCount 0  PacketBarrier 0  IpiFrozen 2 [Frozen]
IPI State for Processor 2
    TargetCount 0  PacketBarrier 0  IpiFrozen 2 [Frozen]
IPI State for Processor 3
    TargetCount 0  PacketBarrier 0  IpiFrozen 2 [Frozen]
IPI State for Processor 4
    TargetCount 0  PacketBarrier 0  IpiFrozen 0 [Running]
IPI State for Processor 5
    TargetCount 0  PacketBarrier 0  IpiFrozen 2 [Frozen]
IPI State for Processor 6
    TargetCount 0  PacketBarrier 0  IpiFrozen 2 [Frozen]
IPI State for Processor 7
    TargetCount 0  PacketBarrier 0  IpiFrozen 2 [Frozen]
By running !ipi, we can check the inter-processor interrupt state of every processor on the box. Every processor except #4 is in a frozen state (idle), so our IPI is never going to be serviced: it will remain pending, and we're going to double fault.
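On a box with many cores it can be handy to summarize the !ipi output rather than eyeball it. Here is a rough sketch that parses text shaped like the output above; the regular expression is my own guess at the layout based on this one dump, not a documented format.

```python
import re

# Text shaped like the !ipi output in this post (whitespace is flexible).
IPI_OUTPUT = """
IPI State for Processor 0 TargetCount 0 PacketBarrier 0 IpiFrozen 2 [Frozen]
IPI State for Processor 1 TargetCount 0 PacketBarrier 0 IpiFrozen 2 [Frozen]
IPI State for Processor 2 TargetCount 0 PacketBarrier 0 IpiFrozen 2 [Frozen]
IPI State for Processor 3 TargetCount 0 PacketBarrier 0 IpiFrozen 2 [Frozen]
IPI State for Processor 4 TargetCount 0 PacketBarrier 0 IpiFrozen 0 [Running]
IPI State for Processor 5 TargetCount 0 PacketBarrier 0 IpiFrozen 2 [Frozen]
IPI State for Processor 6 TargetCount 0 PacketBarrier 0 IpiFrozen 2 [Frozen]
IPI State for Processor 7 TargetCount 0 PacketBarrier 0 IpiFrozen 2 [Frozen]
"""

# Assumed layout: one record per processor, fields separated by whitespace.
RECORD = re.compile(
    r"IPI State for Processor (\d+)\s+TargetCount\s+(\d+)\s+"
    r"PacketBarrier\s+(\d+)\s+IpiFrozen\s+(\d+)\s+\[(\w+)\]"
)


def parse_ipi(text: str) -> list[dict]:
    """Return one dict per processor found in the text."""
    return [
        {"cpu": int(cpu), "target_count": int(tc), "state": state}
        for cpu, tc, _barrier, _frozen, state in RECORD.findall(text)
    ]


if __name__ == "__main__":
    states = parse_ipi(IPI_OUTPUT)
    running = [s["cpu"] for s in states if s["state"] == "Running"]
    frozen = [s["cpu"] for s in states if s["state"] == "Frozen"]
    print("running:", running)
    print("frozen: ", frozen)
```

Because the pattern matches on any whitespace, it works whether each record is on one line or split across two.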
4: kd> lmvm iaStor
start             end                 module name
fffff880`0181f000 fffff880`01bc3000   iaStor     (no symbols)
    Loaded symbol image file: iaStor.sys
    Image path: \SystemRoot\system32\DRIVERS\iaStor.sys
    Image name: iaStor.sys
    Timestamp:        Wed Feb 01 19:15:24 2012
The IRST driver dates from early 2012, which is likely the problem: it is a notoriously problematic driver, and it gets worse as it gets older. A newer version would likely solve it, but honestly, I usually recommend that a user safely remove this driver and replace it with the standard MSFT driver if they aren't running a RAID setup. Kaspersky was also present on this system, and antivirus suites don't tend to play nicely with this software either.
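Since there are no symbols for iaStor, the debugger can only label its frames as module+offset. A small sanity check, sketched below with the numbers from the lmvm and k output, confirms that the return addresses in the stack really do fall inside the driver's image; the `module_offset` helper is just an illustration.

```python
from typing import Optional

# Module range from `lmvm iaStor`.
IASTOR_START = 0xFFFFF8800181F000
IASTOR_END = 0xFFFFF88001BC3000


def module_offset(addr: int, start: int = IASTOR_START,
                  end: int = IASTOR_END) -> Optional[int]:
    """Return addr's offset into the module, or None if it lies outside."""
    if start <= addr < end:
        return addr - start
    return None


if __name__ == "__main__":
    # Return addresses taken from the RetAddr column of the stack above.
    for addr in (0xFFFFF880018443F5, 0xFFFFF880018222A2, 0xFFFFF800032AA875):
        off = module_offset(addr)
        label = f"iaStor+{off:#x}" if off is not None else "not in iaStor"
        print(f"{addr:#018x} -> {label}")
```

The first two addresses resolve to the iaStor+0x253f5 and iaStor+0x32a2 frames, while the third belongs to nt and falls outside the range.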
This post also shows how helpful Driver Verifier is: without it in this specific scenario, we likely would have had no idea what was causing this, and might have interpreted it as a hardware problem.
Thanks for reading!