Debugging The BOINC Client Software Crashes On Microsoft Windows
From Unofficial BOINC Wiki
General
This page describes how to debug BOINC's science applications when weird or unexplained crashes happen. (Work in progress.)
Liability Disclaimer
Neither the writer of this manual nor the project you are running the debug for assumes liability for damage to your computer, loss of data, breaking of the BOINC Client Software, or any other event or condition that may occur as a result of participating in this debug.
Pre-Installation
The pieces you need are windbg and livekd.
Windbg is from Microsoft; it's part of the Debugging Tools for Windows: http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx Grab the version for your platform and install it.
Livekd is from Sysinternals; it's handy for setting up and starting windbg. It's here: http://www.sysinternals.com/Utilities/LiveKd.html If you installed the debugging tools in the default location, put livekd where it'll be easy for you to find. If you installed the debugging tools in a different location, put livekd in the same directory.
Get the debug symbols for the version of BOINC you're running. They're here: http://boinc.berkeley.edu/dl/ (in the form of *.pdb files. The right pdb file will have the same time & date as your BOINC installation file.) Unzip the files into the BOINC folder - same place as boinc.exe.
Setup
Open Boinc Manager, navigate to the “Projects” tab, select the project you see there and press the Suspend button on the left. Do this for all projects you are attached to.
I'm going to assume you put BOINC on the E: drive; substitute the drive where it really is. Everything that follows is on the BOINC drive unless stated otherwise.
Create a folder named "symbols" in the root directory. You should have e:\symbols. It's where windbg will look for symbols and where it'll store the symbol files it gets from Microsoft's Symbol Server.
Copy all the .pdb files from e:\boinc and e:\boinc\projects\* to e:\symbols. Don't create any subdirectories; they all go directly into e:\symbols. Make sure you copy them rather than move them, since they should be in both places. You do need to copy the .pdb files for the other projects as well, otherwise windbg will try to get them from Microsoft's Symbol Server if those applications start under the debugger.
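If you'd rather not copy the files by hand, the step above can be sketched in a few lines of Python. This is only an illustration, not part of the original procedure: Python itself, the function name collect_pdbs and the example paths are my assumptions, and it assumes the project .pdb files sit directly inside each folder under projects.

```python
import shutil
from pathlib import Path


def collect_pdbs(boinc_dir, symbols_dir):
    """Copy (never move) the .pdb files into one flat symbols folder.

    Hypothetical helper: looks in boinc_dir itself and in each folder
    directly under boinc_dir/projects, as described in the text.
    """
    boinc = Path(boinc_dir)
    symbols = Path(symbols_dir)
    symbols.mkdir(parents=True, exist_ok=True)

    # .pdb files next to boinc.exe, plus those in every project folder.
    sources = sorted(boinc.glob("*.pdb")) + sorted(boinc.glob("projects/*/*.pdb"))

    copied = []
    for src in sources:
        # Flat layout: everything lands directly in symbols_dir.
        shutil.copy2(src, symbols / src.name)
        copied.append(src.name)
    return copied
```

With the E: drive layout used here you would call collect_pdbs(r"e:\boinc", r"e:\symbols"). Note that two .pdb files with the same name would overwrite each other in the flat folder; the sketch does not guard against that.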
Open a command prompt (Start->Run, type cmd and hit Enter) and start livekd:
e:
cd \msdebug     (cd to the debugging tools directory)
livekd -w
Reply "y" to the question about symbols.
Reply "e:\symbols" when it asks where the symbol files are stored.
Windbg will now automatically start up. Ignore any of the stuff that shows up in it.
In Windbg click "Debug", "Event Filters". You need to make sure several of the events are set properly. To set an event, select it and check the appropriate options in the lower right. You want these settings:
Create thread - output - not handled
Exit thread - output - not handled
Create process - enabled - not handled
Exit process - output - not handled
Load module - output - not handled
Unload module - ignore - not handled
Initial breakpoint - ignore - not handled
Initial module load - ignore - not handled
Unknown exception - enabled - not handled
Access violation - enabled - not handled
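The break-status part of the settings above can also be typed into the windbg command line with the sxe (enabled), sxn (output) and sxi (ignore) commands. The Python sketch below just builds that list of commands so you can paste them in; the function name and the table layout are mine, and it leaves the "not handled" part at whatever the dialog currently shows. Treat it as an alternative I'm assuming is equivalent, not as the procedure above.

```python
# Break status from the settings list mapped to windbg's sx* commands.
SX_COMMAND = {"enabled": "sxe", "output": "sxn", "ignore": "sxi"}

# (windbg event code, break status) for each event in the settings list.
EVENT_FILTERS = [
    ("ct", "output"),    # Create thread
    ("et", "output"),    # Exit thread
    ("cpr", "enabled"),  # Create process
    ("epr", "output"),   # Exit process
    ("ld", "output"),    # Load module
    ("ud", "ignore"),    # Unload module
    ("ibp", "ignore"),   # Initial breakpoint
    ("iml", "ignore"),   # Initial module load
    ("*", "enabled"),    # Unknown exception
    ("av", "enabled"),   # Access violation
]


def event_filter_commands(filters=EVENT_FILTERS):
    """Return one 'sx? code' command string per event filter."""
    return [f"{SX_COMMAND[status]} {code}" for code, status in filters]
```

Print the result with print("\n".join(event_filter_commands())) and enter the lines one at a time. The dialog remains the safer route if you are unsure, since it shows both columns at once.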
Click "File", "Save workspace as" and enter something like "boinc debug". That's it for setup. Now for a first run, before we start to automate the breakpoints.
The First Debug Session
(It's essential that you stay near your computer and keep an eye on the debugger during this session, so you can enter the commands described at the bottom. That is, until I've written the automation manual.)
Exit BOINC completely. If you are running BOINC as a service, use "services.msc" to stop it and set it to manual. We haven't yet found a way to debug BOINC when it is running as a service, so everything happens manually. Check in Task Manager that Boinc.exe and Boincmgr.exe are stopped; if they aren't, force them to stop.
Bring Windbg back to the top, then click "Debug", "Stop Debugging". It's OK to save the workspace if prompted. (That gets rid of the livekd session.)
Click "Edit", "Open/close log file", enter a descriptive name like "Debug BOINC.txt". The "txt" part is done so you can easily open the log afterwards in Notepad.
Click "File", "open executable", enter "-redirectio" in the "Arguments:" box, check the "debug child processes also" box, browse to where BOINC is located and select boinc.exe. (not Boinc Manager). Click the "open" button.
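The same launch can probably be done from a command line instead of the menus, since windbg accepts -o (debug child processes too), -y (symbol path) and -logo (open a new log file) as startup options. Here is a small Python sketch that assembles such a command line. The function name and every path in the usage note are assumptions for illustration; only boinc.exe and -redirectio come from the step above.

```python
def windbg_command(windbg_exe, boinc_exe, symbols_dir, log_file):
    """Build an argument list that starts boinc.exe under windbg.

    Hypothetical helper; pass the result to subprocess.run() on Windows.
    """
    return [
        windbg_exe,
        "-o",               # also debug child processes (the science apps)
        "-y", symbols_dir,  # where windbg looks for symbol files
        "-logo", log_file,  # start a new log file (replaces an existing one)
        boinc_exe,          # the program to debug, not Boinc Manager
        "-redirectio",      # argument handed to boinc.exe itself
    ]
```

For example, windbg_command(r"e:\msdebug\windbg.exe", r"e:\boinc\boinc.exe", r"e:\symbols", "Debug BOINC.txt") gives a list you can hand to subprocess.run(). The event filters saved in your workspace are not covered by this sketch.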
A command window should pop up with a command line at the bottom. Switch to the command line (the single line at the bottom) and enter "g" (without the quotes; make sure you press the Enter key to submit the command). "Debuggee is running" should appear in the command entry line, and a separate command box should open running BOINC.exe. Do NOT close any of the command boxes! If "Debuggee is running" doesn't appear in the command entry line, go back to it, type "g" (without quotes) and hit Enter.
Back in Windows, start Boinc Manager (NOT BOINC.exe!) and let it connect to BOINC. Start the applications by switching the run mode to "always" or "run based on preferences". Then start a workunit by resuming that project. (After it downloads a workunit, it's probably a good idea to set "no new work" so you only get one or two at a time.)
Now wait for the application to error out.
It's possible that other events will trigger the debugger; you'll see that in the command window. Events such as threads starting should be logged without causing a breakpoint. If one does stop the debugger, just continue execution by typing "g" and pressing Enter.
What you're looking for is the first-time exception for access violations.
When it happens, it should break into the debugger. It should look something like "access violation - code C0000005". It won't be the bottom line, but it will be near the bottom. Here's what a breakpoint looks like:
(980.110): Break instruction exception - code 80000003 (first chance)
eax=77fc6427 ebx=7ffdf000 ecx=00000003 edx=77f77008 esi=00241eb4 edi=00241eb4
eip=77f7f570 esp=0012fb38 ebp=0012fc2c iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!DbgBreakPoint:
77f7f570 cc int 3
When you get that, enter ".dump /mhtipc /u project.dmp".
(The /u means it'll append the date and time to the filename.)
(I usually type that command in another window, select it and copy it to the clipboard. Then I just select the command entry area of the windbg command window and press ctrl-v to paste it.)
After that, enter these commands (one at a time):
k
kb
kp
kv
!analyze -v
!process -1 7
.dump /mf /u project-complete.dmp
!process 0 7
(The last command lists all the processes and threads in the system, so it'll take a while to complete. The second-to-last command creates a complete dump of the program in case it's needed.)
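Instead of pasting the commands one at a time, you could keep them in a text file and have windbg read it with its $$< command. The Python sketch below writes such a file. The function name and the file location are made up for illustration, and I'm assuming the commands behave the same from a script file as when typed by hand.

```python
from pathlib import Path

# The commands from the text, in the order they are meant to be entered.
CRASH_COMMANDS = [
    ".dump /mhtipc /u project.dmp",
    "k",
    "kb",
    "kp",
    "kv",
    "!analyze -v",
    "!process -1 7",
    ".dump /mf /u project-complete.dmp",
    "!process 0 7",
]


def write_crash_script(path, commands=CRASH_COMMANDS):
    """Write one windbg command per line to a plain text script file."""
    script = Path(path)
    script.write_text("\n".join(commands) + "\n", encoding="ascii")
    return script
```

After write_crash_script(r"e:\msdebug\crash.txt"), typing $$<e:\msdebug\crash.txt in the windbg command line when the access violation hits should run the whole list in order. You still have to be at the computer to start it.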
That's it for now. Click "Debug", "Stop Debugging" and all the programs will exit. Close the log ("Edit", "Open/Close Log File") and exit the debugger. Reply "n" to the livekd prompt.
Stopping the debugger should also stop BOINC, but you'll have to stop Boinc Manager by yourself. Probably a good idea to start BOINC once under Boinc Manager to "suspend" the project(s) you are debugging.
The dump and log files will be in the debug directory. It's probably a good idea to zip the minidump (the one written by the first .dump command); my test dump was 5MB.
Additional note: you can play with the commands when you're running livekd, before debugging BOINC. It might be worth trying some of them before the real debug session. If things get messed up, you can always exit the debugger, then reply "y" so livekd starts it back up again. When you restart, you can reload the workspace with "File", "Open Workspace".
I'm including a program that generates an access violation, you can try running it or attaching to it.
Do remember to reopen the log file if you restart the debugger.
What To Do With the Log and Dump Files?
Zip them up so they don't take up too much space on your hard drive. Email the logs to the project developer who asked you to run this debug. Or to Paul Buck. He likes a bit of code in his email box. ;)
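Any zip tool will do for this. If you happen to have Python around, a minimal sketch of the zipping step could look like the following. The function name, the archive name and the assumption that the dumps end in .dmp and the log in .txt are mine; adjust the patterns to whatever names you actually used.

```python
import zipfile
from pathlib import Path


def zip_debug_output(debug_dir, archive_path, patterns=("*.dmp", "*.txt")):
    """Pack the dump and log files from debug_dir into one zip archive.

    Hypothetical helper: only files directly in debug_dir that match
    one of the patterns are added, stored under their bare file names.
    """
    debug = Path(debug_dir)
    files = sorted({p for pattern in patterns for p in debug.glob(pattern)})
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=f.name)
    return [f.name for f in files]
```

For example, zip_debug_output(r"e:\msdebug", r"e:\msdebug\boinc-debug.zip") returns the list of file names it packed, so you can check nothing is missing before you send the archive off.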
(More to come: Automating the output. I'll be writing the second part another day. This will automate the output so you don't have to be around the computer all the time while you debug BOINC.)

