Differences
This page shows the differences between the two revisions.
cpp:wdm:generic:trace_bsod_with_windbg [2017/08/01 13:21] tony [Reload symbol] |
cpp:wdm:generic:trace_bsod_with_windbg [2023/06/25 09:48] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | {{tag>WDM}} | ||
- | ====== Trace BSOD with WinDbg ====== | ||
- | ===== Introduction ===== | ||
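- | This article records how I used WinDbg to trace the root cause of a BSOD. The goal is to locate the faulting code from a minidump. A minidump is used because a complete memory dump is too large to obtain easily from a customer. The article covers: | ||
- | - Launch WinDbg: load and analyze the minidump with WinDbg. | ||
- | - Setup Symbol Path: load symbols to get a detailed call stack. | ||
- | - Analyze With Source Code: locate the faulting line in the source code. | ||
- | For this article I wrote a test driver named testdriver to serve as the example. Problems I ran into, and questions that occurred to me along the way, are covered in a later section. | ||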
- | ===== Launch WinDbg ===== | ||
- | === Install WinDbg === | ||
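- | I first installed the 7600 version of the SDK, but with it I could not find the correct root cause on Win2012, so I switched to the Win10 1703 version of the SDK. It can be downloaded and installed directly from [[https://developer.microsoft.com/zh-tw/windows/hardware/windows-driver-kit|Microsoft's website]]; only WinDbg needs to be installed. | ||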
- | === Fetch MiniDump File === | ||
- | The minidump files are usually located in C:\Windows\Minidump. Once the system boots normally, or from Safe Mode, copy them out:\\ | ||
- | {{:cpp:wdm:generic:minidump_file.png|}} | ||
- | === The first analysis === | ||
- | First, open the minidump collected from the machine that hit the BSOD:\\ | ||
- | {{:cpp:wdm:generic:windbg_open_minidump.png|}}\\ | ||
- | \\ | ||
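- | After it opens, a screen similar to the following appears. Click the !analyze -v link, or type !analyze -v at the kd> prompt at the bottom, to start the analysis:\\ | ||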
- | {{:cpp:wdm:generic:windbg_after_open_minidump.png?800|}}\\ | ||
- | \\ | ||
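- | The call stack is one of the most important pieces of evidence when debugging, but because no symbols are loaded yet, the addresses cannot be resolved to function names: | ||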
- | <code> | ||
- | STACK_TEXT: | ||
- | ffffd000`eb981498 fffff801`3b5c389a : 00000000`00000038 00000000`00000000 00000000`00000030 00000000`00000000 : testdriver+0x19fa | ||
- | ffffd000`eb9814a0 00000000`00000038 : 00000000`00000000 00000000`00000030 00000000`00000000 ffffd000`eb981538 : testdriver+0x189a | ||
- | ffffd000`eb9814a8 00000000`00000000 : 00000000`00000030 00000000`00000000 ffffd000`eb981538 ffffe000`00000250 : 0x38 | ||
- | </code> | ||
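When a dump contains many frames, it can help to list which ones are still unresolved, that is, shown only as module+offset with no function name. The following is a minimal Python sketch, not part of WinDbg; it assumes each STACK_TEXT line has the layout seen above (addresses, then arguments, then the symbol column, separated by " : "), and the helper name unresolved_frames is hypothetical.

```python
import re
from typing import List, Tuple

# Matches a symbol column that is only "module+0xOFFSET" (no function name).
_MODULE_OFFSET = re.compile(r"^(\w+)\+0x([0-9a-fA-F]+)$")


def unresolved_frames(stack_text: str) -> List[Tuple[str, int]]:
    """Return (module, offset) for every frame that has no function name."""
    frames = []
    for line in stack_text.splitlines():
        # Expected layout: "<child-SP> <ret-addr> : <4 args> : <symbol>"
        parts = line.strip().split(" : ", 2)
        if len(parts) != 3:
            continue  # e.g. the "STACK_TEXT:" header or a blank line
        match = _MODULE_OFFSET.match(parts[2].strip())
        if match:
            frames.append((match.group(1), int(match.group(2), 16)))
    return frames


# The three frames from the first analysis above.
SAMPLE = """STACK_TEXT:
ffffd000`eb981498 fffff801`3b5c389a : 00000000`00000038 00000000`00000000 00000000`00000030 00000000`00000000 : testdriver+0x19fa
ffffd000`eb9814a0 00000000`00000038 : 00000000`00000000 00000000`00000030 00000000`00000000 ffffd000`eb981538 : testdriver+0x189a
ffffd000`eb9814a8 00000000`00000000 : 00000000`00000030 00000000`00000000 ffffd000`eb981538 ffffe000`00000250 : 0x38
"""

if __name__ == "__main__":
    for module, offset in unresolved_frames(SAMPLE):
        print(f"{module} needs symbols (offset {offset:#x})")
```

Frames that already show a function name (such as nt!KeBugCheckEx) and bare addresses (such as 0x38) are skipped, so the result names only the modules whose symbols are still missing.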
- | ===== Setup Symbol Path ===== | ||
- | To make the debug information more readable, we need to load the symbol data. | ||
- | ==== Setup search path ==== | ||
- | Open File > Symbol File Path (the Symbol Search Path dialog) and enter the path as shown below. I keep the symbols under C:\symbols: | ||
- | <code> | ||
- | SRV*c:\symbols*http://msdl.microsoft.com/download/symbols | ||
- | </code> | ||
- | {{:cpp:wdm:generic:windbg_setup_reload_symbol_path.png|}}\\ | ||
- | \\ | ||
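The symbol path entered above follows the pattern SRV*local-cache*server, and several entries can be joined with semicolons. As a rough illustration of how the string is composed, here is a small Python sketch; the helper names srv_entry and symbol_path and the directory C:\workspace\testdriver\build are assumptions made up for this example, not anything WinDbg provides.

```python
def srv_entry(cache_dir: str, server_url: str) -> str:
    """Compose one symbol-server entry: SRV*<local cache>*<server URL>."""
    return f"SRV*{cache_dir}*{server_url}"


def symbol_path(*entries: str) -> str:
    """Join several symbol path entries with semicolons."""
    return ";".join(entries)


# The entry used in this article: cache in c:\symbols, Microsoft's public server.
MS_ENTRY = srv_entry(r"c:\symbols", "http://msdl.microsoft.com/download/symbols")

# Hypothetical: also search a local build directory that holds the driver's pdb.
COMBINED = symbol_path(r"C:\workspace\testdriver\build", MS_ENTRY)

if __name__ == "__main__":
    print(MS_ENTRY)
    print(COMBINED)
```

Adding a plain directory in front of the SRV entry is one way to let the debugger find a locally built pdb without copying it into the cache.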
- | After this, an error about loading testdriver.sys appears, because the symbols for testdriver are missing:\\ | ||
- | {{:cpp:wdm:generic:windbg_after_reload_symbol_path.png|}} | ||
- | ==== Enable symbol loading diagnostics mode ==== | ||
- | Typing !sym noisy at the kd> prompt enables the diagnostic mode, in which the debugger prints where it searches for symbols. | ||
- | ==== Reload symbol ==== | ||
- | With the diagnostic mode enabled, type .reload /i testdriver.sys to reload the symbols:\\ | ||
- | {{:cpp:wdm:generic:windbg_reload_testdriver_sys.png|}}\\ | ||
- | \\ | ||
- | The output above shows which locations WinDbg searches, so placing testdriver.sys in one of those directories and reloading clears this hurdle. The next hurdle is the pdb:\\ | ||
- | {{:cpp:wdm:generic:windbg_reload_testdriver_pdb.png|}}\\ | ||
- | \\ | ||
- | Building the sys normally also produces a pdb alongside it. Likewise, place it in the corresponding directory and reload the symbols. | ||
- | ==== The second analysis ==== | ||
- | Once all the symbols are set up, run !analyze -v again:\\ | ||
- | {{:cpp:wdm:generic:windbg_after_load_pdb.png|}}\\ | ||
- | As shown above, we now get a clear call stack for testdriver and can see which function call the problem occurred in. | ||
- | ===== Analyze With Source Code ===== | ||
- | ==== Setup Source Search Path ==== | ||
- | Open File > Source File Path (the Source Search Path dialog) and enter the source code directory as shown below:\\ | ||
- | {{:cpp:wdm:generic:windbg_source_search_path.png|}} | ||
- | ==== The third analysis ==== | ||
- | After setting the Source Search Path, run the analysis command again to find the code that corresponds to the problem:\\ | ||
- | {{:cpp:wdm:generic:windbg_analyze_with_source_code.png?800|}}\\ | ||
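- | ===== Others ===== | ||
- | ==== How do I determine the driver version? ==== | ||
- | If it is unknown which driver version shipped with a release, I currently use the lmt command to look up the module timestamp and then find the driver build that matches it:\\ | ||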
- | {{:cpp:wdm:generic:windbg_list_driver_timestamp.png|}} | ||
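To match that timestamp against candidate builds, one option is to read the TimeDateStamp field from each .sys file's PE header and compare it with the value the debugger reports. The sketch below is a minimal Python example under that assumption; the function names are hypothetical, and the time printed by the debugger may be shown in a different time zone than the UTC value returned here.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(data: bytes) -> datetime:
    """Return the TimeDateStamp of a PE image (e.g. a .sys file) as UTC."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header follows the signature: Machine (2 bytes),
    # NumberOfSections (2 bytes), then TimeDateStamp (4 bytes,
    # seconds since 1970-01-01 UTC).
    (stamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


def read_pe_timestamp(path: str) -> datetime:
    """Read a driver binary from disk and return its header timestamp."""
    with open(path, "rb") as f:
        return pe_timestamp(f.read())
```

Calling read_pe_timestamp on each archived build of the driver and comparing the results with the lmt output narrows down which build the customer was running.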
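- | ==== Is it always necessary to load the sys and pdb of the faulting driver? ==== | ||
- | It depends on the situation. For a BSOD in another driver of mine, providing only the system's own symbols was already enough to identify where the problem occurred: | ||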
- | <code> | ||
- | STACK_TEXT: | ||
- | ffffd001`79154408 fffff800`4d5980e7 : 00000000`00000050 ffffffff`fffffff8 00000000`00000001 ffffd001`791545f0 : nt!KeBugCheckEx | ||
- | ffffd001`79154410 fffff800`4d47a9c9 : 00000000`00000001 ffffe001`dd3c3880 ffffd001`791545f0 fffff800`4d6a7d33 : nt! ?? ::FNODOBFM::`string'+0x20c37 | ||
- | ffffd001`791544b0 fffff800`4d57122f : 00000000`00000001 ffffffff`ffffffd0 00000000`00001000 ffffd001`791545f0 : nt!MmAccessFault+0x7a9 | ||
- | ffffd001`791545f0 fffff800`4d51eefd : 00000000`00000200 00000000`20206f49 e001e7f2`00000000 ffffe001`e521d210 : nt!KiPageFault+0x12f | ||
- | ffffd001`79154780 fffff801`72e0136c : 00000000`00000000 00000000`00000801 00000000`00000000 fffff800`4d6f07c8 : nt!IoWriteErrorLogEntry+0x25 | ||
- | ffffd001`791547b0 00000000`00000000 : 00000000`00000801 00000000`00000000 fffff800`4d6f07c8 ffffffff`80000b00 : testdriver2+0x136c | ||
- | </code> | ||
- | Of course, this also depends on how the code is written and where the fault occurs. With the .sys loaded but not the .pdb: | ||
- | <code> | ||
- | STACK_TEXT: | ||
- | ffffd001`79154408 fffff800`4d5980e7 : 00000000`00000050 ffffffff`fffffff8 00000000`00000001 ffffd001`791545f0 : nt!KeBugCheckEx | ||
- | ffffd001`79154410 fffff800`4d47a9c9 : 00000000`00000001 ffffe001`dd3c3880 ffffd001`791545f0 fffff800`4d6a7d33 : nt! ?? ::FNODOBFM::`string'+0x20c37 | ||
- | ffffd001`791544b0 fffff800`4d57122f : 00000000`00000001 ffffffff`ffffffd0 00000000`00001000 ffffd001`791545f0 : nt!MmAccessFault+0x7a9 | ||
- | ffffd001`791545f0 fffff800`4d51eefd : 00000000`00000200 00000000`20206f49 e001e7f2`00000000 ffffe001`e521d210 : nt!KiPageFault+0x12f | ||
- | ffffd001`79154780 fffff801`72e0136c : 00000000`00000000 00000000`00000801 00000000`00000000 fffff800`4d6f07c8 : nt!IoWriteErrorLogEntry+0x25 | ||
- | ffffd001`791547b0 fffff800`4d8d07ea : ffffe001`e7f2de60 00000000`00000000 ffffd001`79154950 ffffe001`e7f2c000 : testdriver2+0x136c | ||
- | ffffd001`79154850 fffff800`4d8f77ee : ffffe001`dd3c39c0 00000000`00000000 fffff800`4d6b6300 ffffe001`dd3c3880 : nt!IopLoadDriver+0x5e2 | ||
- | ffffd001`79154b10 fffff800`4d466adb : fffff801`00000000 ffffffff`80000b5c fffff800`4d8f77a0 ffffe001`dd5f4ad0 : nt!IopLoadUnloadDriver+0x4e | ||
- | ffffd001`79154b50 fffff800`4d4e2794 : 4819b200`009206b8 ffffe001`dd3c3880 ffffe001`dd3c3880 ffffe001`dc837040 : nt!ExpWorkerThread+0x293 | ||
- | ffffd001`79154c00 fffff800`4d56d5c6 : fffff800`4d6f9180 ffffe001`dd3c3880 fffff800`4d760a00 41fffec9`1de8cf8b : nt!PspSystemThreadStartup+0x58 | ||
- | ffffd001`79154c60 00000000`00000000 : ffffd001`79155000 ffffd001`7914f000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16 | ||
- | </code> | ||
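- | ==== Is it always necessary to load the pdb that matches the sys? ==== | ||
- | I rebuilt a pdb from the same version of the source code. Although reloading the symbols reports a mismatch, the corresponding locations are still resolved: | ||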
- | <code> | ||
- | STACK_TEXT: | ||
- | ffffd001`79154408 fffff800`4d5980e7 : 00000000`00000050 ffffffff`fffffff8 00000000`00000001 ffffd001`791545f0 : nt!KeBugCheckEx | ||
- | ffffd001`79154410 fffff800`4d47a9c9 : 00000000`00000001 ffffe001`dd3c3880 ffffd001`791545f0 fffff800`4d6a7d33 : nt! ?? ::FNODOBFM::`string'+0x20c37 | ||
- | ffffd001`791544b0 fffff800`4d57122f : 00000000`00000001 ffffffff`ffffffd0 00000000`00001000 ffffd001`791545f0 : nt!MmAccessFault+0x7a9 | ||
- | ffffd001`791545f0 fffff800`4d51eefd : 00000000`00000200 00000000`20206f49 e001e7f2`00000000 ffffe001`e521d210 : nt!KiPageFault+0x12f | ||
- | ffffd001`79154780 fffff801`72e0136c : 00000000`00000000 00000000`00000801 00000000`00000000 fffff800`4d6f07c8 : nt!IoWriteErrorLogEntry+0x25 | ||
- | ffffd001`791547b0 fffff800`4d8d07ea : ffffe001`e7f2de60 00000000`00000000 ffffd001`79154950 ffffe001`e7f2c000 : testdriver2!gtInitializeDriver+0x364 [c:\workspace\testdriver2\testdriver2.c @ 684] | ||
- | ffffd001`79154850 fffff800`4d8f77ee : ffffe001`dd3c39c0 00000000`00000000 fffff800`4d6b6300 ffffe001`dd3c3880 : nt!IopLoadDriver+0x5e2 | ||
- | ffffd001`79154b10 fffff800`4d466adb : fffff801`00000000 ffffffff`80000b5c fffff800`4d8f77a0 ffffe001`dd5f4ad0 : nt!IopLoadUnloadDriver+0x4e | ||
- | ffffd001`79154b50 fffff800`4d4e2794 : 4819b200`009206b8 ffffe001`dd3c3880 ffffe001`dd3c3880 ffffe001`dc837040 : nt!ExpWorkerThread+0x293 | ||
- | ffffd001`79154c00 fffff800`4d56d5c6 : fffff800`4d6f9180 ffffe001`dd3c3880 fffff800`4d760a00 41fffec9`1de8cf8b : nt!PspSystemThreadStartup+0x58 | ||
- | ffffd001`79154c60 00000000`00000000 : ffffd001`79155000 ffffd001`7914f000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16 | ||
- | </code> | ||
- | Whether this works every time remains to be seen; I will revisit it when I run into problems. | ||
- | |||
- | ===== Reference ===== | ||
- | * [[http://geekswithblogs.net/.netonmymind/archive/2006/03/14/72262.aspx|WinDbg / SOS Cheat Sheet]] | ||
- | * [[https://blogs.msdn.microsoft.com/cclayton/2010/02/24/how-to-setup-windbg/|how-to-setup-windbg?]] | ||
- | * [[http://www.cnblogs.com/georgepei/archive/2012/02/15/2353072.html|Case analysis of symbol loading failures]] | ||
- | * [[https://blogs.technet.microsoft.com/askcore/2008/10/31/how-to-debug-kernel-mode-blue-screen-crashes-for-beginners/|How to Debug Kernel Mode Blue Screen Crashes (for beginners)?]] | ||
- | * [[https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/verifying-symbols|Verifying symbols]] | ||
- | * [[http://windbg.info/doc/1-common-cmds.html|WinDBG commands]] | ||