天天看點

調試遺漏IoStartNextPacket引起的阻塞

    前面 ​​ driver verifier檢測驅動死鎖 ​​ 一文中本想檢測一下驅動中潛在的死鎖來解決驅動無響應的bug,然而并沒有實質性的進展。後來通過一系列的調試終于找到了根源所在,本文用于記錄查找問題的過程。

    下面是測試程式,主線程循環讀驅動,子線程循環寫驅動,由于是異步操作,每次讀寫結束後,線程都要調用WaitForSingleObject等待驅動完成IRP。可是由于驅動處理有誤,ReadFile調用2次後,主線程就徹底卡死在Wait函數上了:

hDev = CreateFileA(interfaceBuff, 
        GENERIC_ALL, 
        FILE_SHARE_READ | FILE_SHARE_WRITE,
        NULL, OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED, 
        NULL);

  InitializeCriticalSection(&cs);

  CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)ThreadProc, NULL, 0, &threadId);

  while (1)
  {
    memset(readBuff, 0, 4096);
    BOOL bRead = ReadFile(hDev, readBuff, 4096, &readLen, &overlapRd); 1)
    WaitForSingleObject(overlapRd.hEvent, INFINITE);                   2)

    if (readLen)
    {
      EnterCriticalSection(&cs);
      printf("_tmain read:%s\n",readBuff);
      LeaveCriticalSection(&cs);
    }
      
  }

  CloseHandle(hDev);
  return 0;
}

DWORD ThreadProc(void* param)
{
  OVERLAPPED overlapRd = { 0 },overlapWr = {0};
  overlapRd.hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
  overlapWr.hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

  char writeBuff[4096] = { "_ThreadProc" }, readBuff[4096] = { 0 };
  DWORD len = 0, writeLen, readLen;

  while (1)
  {
    WriteFile(hDev, writeBuff, strlen(writeBuff), &writeLen, &overlapWr);
    WaitForSingleObject(overlapWr.hEvent, INFINITE);

  }
}      

    對于異步IRP,調用ReadFile會傳入OVERLAPPED結構,之後WaitForSingleObject會在OVERLAPPED!HEVENT上等待。當驅動中完成讀請求(調用IoCompleteRequest)時,會調用KeSetEvnet(irp->UserEvent);使得使用者層的等待函數傳回。基于上述分析,可以推知,ReadFile的IRP請求被挂起後一直沒有被完成(懸而未決的IRP請求)。為了驗證我的猜測,可以檢視一下線程的IRP請求:

1.首先檢視R3下線程ID,目前程序有3個線程,0/1分别對應讀寫線程,其ID号分别為:75c.3a8/75c.828;2号線程可能是windbg調試線程

0:002> ~*kb ;注意這裡的0:002> 暗示目前是在R3下調試

   0  Id: 75c.3a8 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr  Args to Child              
0042d53c 77165e6c 754f179c 0000007c 00000000 ntdll!KiFastSystemCallRet
0042d540 754f179c 0000007c 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0042f7d0 001b400a 00000001 005f96d0 005f9738 onlyForWrite!main+0x22b [c:\studio\samplechar\onlyforwrite\onlyforwrite\testsamplechar.cpp @ 173]

   1  Id: 75c.828 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr  Args to Child              
01c7db38 77165e6c 754f179c 00000090 00000000 ntdll!KiFastSystemCallRet
01c7db3c 754f179c 00000090 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
01c7fd20 76ef1174 00000000 01c7fd6c 7717b3f5 onlyForWrite!ThreadProc+0x139 [c:\studio\samplechar\onlyforwrite\onlyforwrite\testsamplechar.cpp @ 200]

#  2  Id: 75c.8dc Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr  Args to Child              
007efa94 771bd279 7764a5c0 00000000 00000000 ntdll!DbgBreakPoint
007efac4 76ef1174 00000000 007efb10 7717b3f5 ntdll!DbgUiRemoteBreakin+0x3c      

2.利用上面的線程号,在R0 windbg上檢視線程挂起的IRP:

kd> !process 0 0 ;查找所有程序  注意這裡調試開頭是kd>
**** NT ACTIVE PROCESS DUMP ****
PROCESS 89a4bd40  SessionId: 1  Cid: 075c    Peb: 7ffd4000  ParentCid: 0f8c
    DirBase: 7ff7e480  ObjectTable: 8a2a16d0  HandleCount:  57.
    Image: onlyForWrite.exe ;找到onlyForWrite.exe程序的資訊


kd> !process 89a4bd40 6 ;查找程序onlyForWrite.exe下所有線程
PROCESS 89a4bd40  SessionId: 1  Cid: 075c    Peb: 7ffd4000  ParentCid: 0f8c
    DirBase: 7ff7e480  ObjectTable: 8a2a16d0  HandleCount:  57.
    Image: onlyForWrite.exe

        THREAD 8a927d48  Cid 075c.03a8  Teb: 7ffdf000 Win32Thread: fd3c7dd8 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
            8a927f10  Semaphore Limit 0x2
        IRP List: ;這個線程有IRP隊列
            8773eef8: (0006,0100) Flags: 00060970  Mdl: 00000000
        

        THREAD 8b3a37d8  Cid 075c.0828  Teb: 7ffde000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
            8b3a39a0  Semaphore Limit 0x2
        Not impersonating
        DeviceMap                 946136a0
        Owning Process            89a4bd40       Image:         onlyForWrite.exe
        Attached Process          N/A            Image:         N/A
        

        THREAD 8b328cd8  Cid 075c.08dc  Teb: 7ffdd000 Win32Thread: 00000000 WAIT: (Executive) KernelMode Non-Alertable
SuspendCount 1
            92ddd6e0  SynchronizationEvent
        Not impersonating
        DeviceMap                 946136a0
        Owning Process            89a4bd40       Image:         onlyForWrite.exe
        Attached Process          N/A            Image:         N/A      

對比windbg輸出的3個線程資訊,發現唯獨Cid==075c.03a8線程有IRP隊列。結合R3 windbg中查到的讀線程ID:0x3a8,可以确定讀線程的确出了什麼問題。再進一步确認IRP是不是應用層發出的:

kd> !irp 8773eef8 ;8773ee8從前面THREAD 8a927d48的IRP List中擷取
Irp is active with 2 stacks 2 is current (= 0x8773ef8c)
 No Mdl: System buffer=8776e000: Thread 8a927d48:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    

      Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_READ(3), N/A(0)]
            0  1 8a7762b8 876c7988 00000000-00000000    pending -->irp狀态是挂起的 
         \Driver\SampleChar
      Args: 00001000 00000000 00000000 00000000
kd> da 8776e000
8776e000  "................................"
8776e020  "................................"      

果不其然,這是程式發出的IRP請求未被處理。至少可以肯定應該是驅動中StartIo函數有錯誤了。

#pragma code_seg()
void SampleStartIo(PDEVICE_OBJECT devObj, PIRP irp)
{
  KEVENT workEvt, completeEvt;
  KIRQL origIrql;
  NTSTATUS status = STATUS_SUCCESS;
  unsigned long readLen;
  SampleCharDevContext* devCtx = (SampleCharDevContext*)devObj->DeviceExtension;
  IO_STACK_LOCATION* curStack = IoGetCurrentIrpStackLocation(irp);

  KeInitializeEvent(&workEvt,SynchronizationEvent,FALSE);
  KeInitializeEvent(&completeEvt, NotificationEvent, FALSE);

  if (curStack->Parameters.Read.Length > 4096)
  {
    status = irp->IoStatus.Status = STATUS_BUFFER_OVERFLOW;
    irp->IoStatus.Information = 0;

    IoCompleteRequest(irp, IO_NO_INCREMENT);
    IoStartNextPacket(devObj, FALSE);
    return;
  }

  KeAcquireSpinLock(&devCtx->devSpinLock, &origIrql);
  
  if(devCtx->buffPos != 0x00UL)
  {
    readLen = devCtx->buffPos >= curStack->Parameters.Read.Length ? curStack->Parameters.Read.Length : devCtx->buffPos;
    RtlCopyMemory(irp->AssociatedIrp.SystemBuffer,
      devCtx->SampleBuff,
      readLen);
    devCtx->buffRemained += readLen;
    devCtx->buffPos -= readLen;
  }
  else
  {
    KeReleaseSpinLock(&devCtx->devSpinLock, origIrql);
    
    irp->IoStatus.Status = STATUS_SUCCESS;
    irp->IoStatus.Information = 0x00UL;
    IoCompleteRequest(irp, IO_NO_INCREMENT); ->1)
    //maybe blocked at here,if without this
    //IoStartNextPacket(devObj, FALSE);  ->4)
    return; ->2)
  }

  KeReleaseSpinLock(&devCtx->devSpinLock, origIrql);

  IoCopyCurrentIrpStackLocationToNext(irp);
  IoSetCompletionRoutine(irp, IrpAsyncReadCompleteRoutine, &completeEvt, TRUE, TRUE, TRUE);
  
  status = IoCallDriver(devCtx->lowerDev,irp);
  if (status == STATUS_PENDING)
  {
    KeWaitForSingleObject(&completeEvt, Executive, KernelMode, FALSE, NULL);
  }
  
  irp->IoStatus.Status = STATUS_SUCCESS;
  irp->IoStatus.Information = readLen;
  IoCompleteRequest(irp, IO_NO_INCREMENT);

  IoStartNextPacket(devObj,FALSE);
}

NTSTATUS SampleCharReadAsync(PDEVICE_OBJECT devObj, PIRP irp)
{
  IoMarkIrpPending(irp);
  IoStartPacket(devObj,irp,NULL,NULL); ->3)
  return STATUS_PENDING;
}      
kd> !devobj 8a7762b8 -------------->8a7762b8是前面!irp 指令輸出的irp相關的devobj
Device object (8a7762b8) is for:
 SampleChar \Driver\SampleChar DriverObject 8774cb18
Current Irp 8773eef8 RefCount 0 Type 00000022 Flags 0000204c
Dacl 8fb049c0 DevExt 8a776370 DevObjExt 8a7763f0 
ExtensionFlags (0x30000000)  DOE_RAW_FDO
                             Unknown flags 0x10000000
Characteristics (0x00000100)  FILE_DEVICE_SECURE_OPEN
AttachedTo (Lower) 8761a400 \Driver\PnpManager
Device queue is busy - Queue flink = blink  ------->windbg顯示裝置隊列忙