Sunday, October 1, 2006 10:09 PM by Ken
Causes, and tips for debugging, a STOP 0x0000000A (IRQL_NOT_LESS_OR_EQUAL) bugcheck/blue screen - Part 1
A STOP 0x0000000A (IRQL_NOT_LESS_OR_EQUAL) bugcheck is one of the most common STOP codes encountered by users today. After discussing this with my former colleague Chewy on my way to Tech.Ed SEA, I've decided to write up a post briefly explaining what an IRQL is, why it being "not less or equal" will cause a bugcheck, and what you can do to diagnose what's causing the issue.
Firstly - what's an IRQL? IRQL stands for Interrupt Request Level. An IRQL "defines the hardware priority at which a processor operates at any given time". The set of IRQLs maintained depends on the CPU architecture. For x86 there are 32 levels, and for IA64/x64 there are 16 levels. The table below (extracted from a Microsoft whitepaper) shows the various IRQLs for each platform. (Note: I'll put all references at the end of Part 2 so there's a consolidated list of references you can go to.)
IRQL name | x86 value | IA64 value | AMD64 value | Description |
PASSIVE_LEVEL | 0 | 0 | 0 | User threads and most kernel-mode operations |
APC_LEVEL | 1 | 1 | 1 | Asynchronous procedure calls and page faults |
DISPATCH_LEVEL | 2 | 2 | 2 | Thread scheduler and deferred procedure calls (DPCs) |
CMC_LEVEL | N/A | 3 | N/A | Correctable machine-check level (IA64 platforms only) |
Device interrupt levels (DIRQL) | 3-26 | 4-11 | 3-11 | Device interrupts |
PC_LEVEL | N/A | 12 | N/A | Performance counter (IA64 platforms only) |
PROFILE_LEVEL | 27 | 15 | 15 | Profiling timer for releases earlier than Windows 2000 |
SYNCH_LEVEL | 27 | 13 | 13 | Synchronization of code and instruction streams across processors |
CLOCK_LEVEL | N/A | 13 | 13 | Clock timer |
CLOCK2_LEVEL | 28 | N/A | N/A | Clock timer for x86 hardware |
IPI_LEVEL | 29 | 14 | 14 | Interprocessor interrupt for enforcing cache consistency |
POWER_LEVEL | 30 | 15 | 14 | Power failure |
HIGH_LEVEL | 31 | 15 | 15 | Machine checks and catastrophic errors; profiling timer for Windows XP and later releases |
The lowest level (0) is not really an interrupt level at all - it is at this level that most processing occurs. Levels 1 and 2 are "software" IRQLs - Windows uses these IRQLs to perform various tasks. Levels 3-26 (for x86) and 3-11 (for x64) are device (hardware) IRQLs, or DIRQLs. When a hardware device raises an interrupt, the processor's IRQL is raised to a value within this band. The specific IRQL assigned to a particular interrupt is determined by the Windows Plug and Play (PnP) subsystem. Above the DIRQLs are a few privileged IRQLs, such as the one used by the system clock.
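To make the bands concrete, here is a small Python sketch that classifies an IRQL value using the ranges from the table. It is only a toy model, not Windows code: the function name `irql_band`, the band labels, and the `"x86"`/`"x64"` keys are names I've made up for illustration.

```python
# Toy model of the IRQL bands; values come from the table, names are hypothetical.
DEVICE_BANDS = {"x86": (3, 26), "x64": (3, 11)}   # DIRQL ranges per architecture
MAX_IRQL = {"x86": 31, "x64": 15}                 # HIGH_LEVEL per architecture


def irql_band(irql, arch="x86"):
    """Return the band an IRQL value falls into on the given architecture."""
    if not 0 <= irql <= MAX_IRQL[arch]:
        raise ValueError(f"IRQL {irql} is out of range for {arch}")
    if irql == 0:
        return "passive"       # PASSIVE_LEVEL: normal thread execution
    if irql <= 2:
        return "software"      # APC_LEVEL and DISPATCH_LEVEL
    low, high = DEVICE_BANDS[arch]
    if low <= irql <= high:
        return "device"        # a DIRQL
    return "privileged"        # clock, IPI, power, high level, and so on


print(irql_band(2))            # software
print(irql_band(13, "x86"))    # device
print(irql_band(13, "x64"))    # privileged
```

Note how the same number (13) is an ordinary device level on x86 but the clock level on x64, which is why IRQL values should never be compared across architectures.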
In Windows, code running at a higher IRQL preempts code running at a lower IRQL. This is why the system clock has such a high IRQL - Windows relies on the clock to know when threads have exhausted their quantum on the CPU and should be preempted by other threads. The actual system thread dispatcher runs at IRQL 2 (Dispatch Level). This allows the thread dispatcher to preempt normal running code (at IRQL 0) and schedule a different ready thread to run on the CPU.
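The preemption rule can be sketched as a toy simulation. This is a deliberately simplified model and not how the kernel is implemented: `ToyCpu`, `interrupt`, and `lower` are hypothetical names, and the model assumes an interrupt is delivered only when its level is strictly higher than the current IRQL.

```python
class ToyCpu:
    """Toy single-CPU model: an interrupt runs only if it outranks the current IRQL."""

    def __init__(self):
        self.irql = 0        # start at PASSIVE_LEVEL
        self.pending = []    # (level, name) interrupts that are waiting
        self.log = []

    def interrupt(self, level, name):
        """Deliver the interrupt if it outranks the current IRQL, else hold it."""
        if level > self.irql:
            self.log.append(f"{name} preempts IRQL {self.irql}, now at IRQL {level}")
            self.irql = level
        else:
            self.pending.append((level, name))
            self.log.append(f"{name} (IRQL {level}) held back while at IRQL {self.irql}")

    def lower(self, new_irql):
        """Lower the IRQL, then deliver the highest pending interrupt that now outranks it."""
        if new_irql > self.irql:
            raise ValueError("lower() cannot raise the IRQL")
        self.irql = new_irql
        self.pending.sort(reverse=True)
        if self.pending and self.pending[0][0] > self.irql:
            level, name = self.pending.pop(0)
            self.interrupt(level, name)


cpu = ToyCpu()
cpu.interrupt(28, "clock")        # clock tick preempts normal code at IRQL 0
cpu.interrupt(2, "dispatcher")    # dispatcher is requested but has to wait
cpu.lower(0)                      # clock handler finishes; dispatcher now runs
print("\n".join(cpu.log))
```

In this run the clock (IRQL 28 on x86) interrupts ordinary code immediately, while the dispatcher request at IRQL 2 only runs once the clock handler has finished and the IRQL drops back down.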
When the IRQL has been raised, all interrupts at that level or lower are temporarily "masked" (ignored) until the IRQL is lowered again. This is what can lead to the IRQL_NOT_LESS_OR_EQUAL bugcheck.
The most common cause of this bugcheck is as follows. A hardware device raises an interrupt, and the IRQL is raised. Windows looks in the Interrupt Dispatch Table (IDT) to see what code should be run in response to the interrupt, and typically some device driver code now runs. Suppose the device driver attempts to access some memory, but that memory has been paged out to disk. Normally, when memory has been paged out we dispatch another thread to perform a file I/O operation that brings the necessary data back into physical memory. However, because the system thread dispatcher runs at IRQL 2, and everything at or below the current IRQL is being masked, we end up in a deadlock. The device driver won't signal that it's ready to lower the IRQL until it gets the data it needs, but Windows can't get that data from disk until the IRQL is lowered and the system thread dispatcher can dispatch a file I/O thread.
Windows detects this situation as an unresolvable deadlock and bugchecks the system with a "STOP 0x0000000A IRQL_NOT_LESS_OR_EQUAL" error. The "not less or equal" refers to the fact that the current IRQL is not less than or equal to the highest IRQL at which the memory access could have been satisfied.
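The decision Windows has to make can be modelled in a few lines. Again, this is a hedged sketch and not the real memory manager: `touch_memory` and `BugCheck` are hypothetical names, and the model assumes a page fault can only be serviced while the current IRQL is below DISPATCH_LEVEL (2).

```python
DISPATCH_LEVEL = 2   # IRQL at which the thread dispatcher runs (from the table)


class BugCheck(Exception):
    """Stands in for the blue screen in this toy model."""


def touch_memory(current_irql, page_is_resident):
    """Model a memory access made while running at current_irql."""
    if page_is_resident:
        return "access ok"    # no page fault, so the IRQL does not matter
    if current_irql >= DISPATCH_LEVEL:
        # The pager would need the dispatcher, which is masked at this IRQL.
        raise BugCheck("STOP 0x0000000A IRQL_NOT_LESS_OR_EQUAL")
    return "page fault serviced, page read back from disk"


print(touch_memory(0, page_is_resident=False))   # normal code: fault is serviced
print(touch_memory(5, page_is_resident=True))    # driver at a DIRQL: page is in memory
try:
    touch_memory(5, page_is_resident=False)      # driver at a DIRQL: paged-out memory
except BugCheck as stop:
    print(stop)
```

The same access is harmless at IRQL 0, harmless at a DIRQL if the page happens to be resident, and fatal at a DIRQL if it is not. That is why this bugcheck can appear intermittent: it depends on what happens to be paged out at the time.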
In Part 2 we'll look at how to work out what has caused the problem.