Using WinDbg
Compiled from msdn site.
Installation
So, first things first, get hold of a copy of WinDbg from here and install it. If you want to follow along with this post then you can go and download it right now - the install takes about 2 minutes once you've got the 10MB download down from the site.
The first thing you'll realize is how lightweight an installation WinDbg is and that (sometimes) means that you can install it into places where you'd never install a copy of Visual Studio or the Visual Studio remote debugging setup. Indeed, if you raise a support call with Microsoft you'll often get asked to install WinDbg and its automated helper (the "autodumper") in order to get a crash dump of your application to send back to Microsoft.
Before we get into debugging something I need to have a quick word or two about Symbols and Commands.
Symbols
The first issue that we need to address in order to progress is that of symbols. If you want to get decent stack traces and dumps of variable values and so on out of the debugger then you need symbols for the modules that you’re debugging whether those modules are yours or someone else’s and whether those modules are managed or unmanaged (although there’s quite a lot more that you can do with managed code without symbols).
Symbols are typically either private symbols (include variable information), retail symbols (function information but not variable information) or export symbols (generally not so useful for debugging purposes). If you’re debugging with “export” symbols it’s not usually enough to actually work out what’s going on.
For your own code you’re responsible for building the symbol files (usually .PDB program databases) by setting the right flags on the compiler or choosing the right configuration in a project (this is true for VC++, VB6 and all the .NET languages). This will give you full private symbols.
When the debugger (and lots of other tools) want to find symbols they typically check the contents of an environment variable named _NT_SYMBOL_PATH which is used just like PATH (i.e. it is a list of folders separated by semi-colons) to find symbol files. You can set _NT_SYMBOL_PATH prior to running WinDbg or you can use the debugger’s own symbol path settings to find symbol files. These settings are controlled via the File->Symbol File Path menu option or using the .sympath command.
For other people’s code you need to get symbols from them. For Microsoft code you’re in luck because the symbols are available from a public symbol server on the internet. In order to make use of the symbol server you should set your symbol path to be something like;
What this says is that the debugger should first check a local cache named C:\MyLocalSymbols and if symbols are not found there then go out to the Microsoft symbol server for symbols and, if found, download them and cache them in that folder. Note that symbols are downloaded based upon name, version and checksum information from the modules being debugged so the debugger usually knows if the symbols match the code properly.
If you wanted to combine this with a local location for your own code then you’d use something like;
so that the debugger would check your folder first and would then proceed to look in the cache and finally to the symbol server for symbols.
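As a rough illustration of the path format described above, here is a minimal sketch in Python that assembles both kinds of symbol path. It assumes the debugger's srv*<cache>*<server> element syntax and the public Microsoft symbol server address https://msdl.microsoft.com/download/symbols. The folder C:\MySymbols is a hypothetical location for your own PDB files, and the helper names are made up for this example.

```python
# Sketch: build symbol path strings of the kind described in the text.
# C:\MySymbols is a hypothetical folder; the helper names are illustrative only.

MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def symbol_server_entry(cache_dir, server_url):
    """Return one 'srv*<cache>*<server>' element of a symbol path."""
    return "srv*{}*{}".format(cache_dir, server_url)


def build_symbol_path(*entries):
    """Join entries with semicolons, in the order they should be searched."""
    return ";".join(entries)


# Local cache backed by the Microsoft symbol server.
cached_only = build_symbol_path(
    symbol_server_entry(r"C:\MyLocalSymbols", MS_SYMBOL_SERVER))

# Your own symbols first, then the cache, then the symbol server.
with_own_code = build_symbol_path(
    r"C:\MySymbols",  # hypothetical folder holding your own .PDB files
    symbol_server_entry(r"C:\MyLocalSymbols", MS_SYMBOL_SERVER))

print(cached_only)
print(with_own_code)
```

Either string could then be passed to the .sympath command or stored in _NT_SYMBOL_PATH before starting the debugger.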
Commands
A core set of commands would be as below. I rarely venture beyond these as I’m not an advanced user of WinDbg by any means and this gets me by.
.hh - bring up the help file :-) The help file is fantastic so don't skip it - everything's documented in there.
g - "GO!". That is, continue running.
bp (address) - set a breakpoint. Note that (address) can take many formats here, including just a memory location, but the most common format is to use syntax such as;
bp KERNEL32!CreateFileA
(remember that most Windows functions have an ASCII and a Unicode variant so we have KERNEL32!CreateFileA and KERNEL32!CreateFileW).
bl - list all breakpoints. Each breakpoint listed has a number in the list which you need for...
bc (number), be (number), bd (number) - these respectively clear, enable and disable breakpoints from the list.
kb, kp, kd etc. The commands beginning with "k" show the stack for the thread that the debugger is looking at. There are a few variants so check out the help for those. The simplest variant that I use is;
kb 200
(200 is the maximum depth of stack trace that I'm looking for)
dw, db, ds, etc. The commands beginning with "d" show the contents of memory. One of the most common is "dt" which shows you the contents of memory laid out according to the specification of a type as long as the debugger can see the definition of a type.
sxe, sxd, sxi, sxn. The commands beginning with "sx" set the behaviour for what the debugger should do when an exception occurs. A good example here would be;
sxe 0xc0000005
which is saying "I want to break into the debugger if there's an access violation".
lm - lists the modules loaded by the program and what kind of symbols are loaded for the modules. Note that export symbols are not really symbols at all but are really the debugger guessing which function you're in based upon the export table of the DLL. For COM servers in particular (which have only 4 exported functions or so) you'll often find yourself appearing in DllRegisterServer if you only have export symbols. Good examples here would be;
lm v (verbose mode)
lm v m USER* (verbose mode, match any modules whose names begin with USER)
x - Examine symbols. This is very, very useful as it shows you the set of symbols loaded for particular modules. A good example here would be;
x USER32!* (show me all the symbols loaded from the USER32 module)
x USER32!Create* (show me all the symbols loaded from the USER32 module that begin with "Create")
.cls – clear the screen. You’ll be needing this one!
.reload – this causes the symbol information to be reloaded for a particular module. The most common form of this would be to do something like;
.reload /f user32.dll
Where the /f overrides the debugger’s naturally lazy mode of working whereby it wouldn’t actually do the reload right there and then. Normally, this is a useful parameter to use.
~ - this is the tilde character and it lists all the threads in the process and shows their status. This can be combined with wildcards in order to execute particular commands on all threads in the process. The single most common one is probably to combine it with a “stack” command to do something like;
~* kb 200
which will show you the stack frames for all the threads in the process.
~N s – this changes the debugger’s focus to another thread. For example;
~7 s
Will switch the debugger so that it’s focused on thread 7. Note that 7 here is the debugger’s thread number rather than the real thread ID and it comes from the list given by the ~ command.
Along with the core set of commands, WinDbg is capable of loading up debugger extensions to extend the core functionality of the debugger. These are “bang commands” in that they begin with an exclamation mark. You can manipulate the extensions that the debugger has loaded using the following commands;
.chain – this shows the extensions that the debugger has loaded
.load – this loads an extension DLL (e.g. .load SOS.DLL)
.unload – unloads an extension DLL
.unloadall – unloads all of the loaded extension DLLs
.setdll – this sets the “default” extension DLL.
A quick word about .setdll and how command processing works here. If you have an extension named UEXT.DLL which contains a command named help (and most extensions will have a help command) then you can run that command using;
!UEXT.help
And that works fine. If you find yourself using UEXT more than any other extension then you can use;
.setdll UEXT.DLL
To make UEXT.DLL the default extension and then you can just use;
!help
And that will now default to mean !UEXT.help because of the default extension setting.
A First Debugging Session
So, let’s finish up this post with a quick debugging session and next time around I’ll talk about debugging managed code rather than just any old process.
1. Run up WinDbg and also run up a copy of NotePad.
2. Within WinDbg attach to the NotePad process that you just ran by either using the File->Attach to Process menu or the F6 shortcut. This will give you a dialog where you can select a process. Note that the “Non Invasive” option would allow you to detach from the debugged process without killing it (if you’re on Windows XP or 2003) but it limits the commands that you can use so leave it unchecked.
3. Don’t worry about the workspace information, let it go.
4. The debugger should present some diagnostics as it loads information and will then stop with the command prompt. The debuggee (NotePad) is now halted.
5. Let’s take a look around. Firstly, let’s look at the modules that are loaded by notepad. Execute a “lm v” to list all the modules and follow it up with a simple “lm”. On my system I get this;
6. Let’s get some symbols because we can’t really work with export symbols. Set up your symbol path (either through the File->Symbol File Path menu or through the .sympath command) so that it looks something like this;
7. Reload symbols for the loaded modules by using the “.reload /f” command – this may well take a little time as the symbols for all those modules that NotePad loaded trickle down from the internet to your local cache. This gets better with time as you pull down common symbols from the net.
8. After the reload completes do another lm and make sure you got some symbols. My session now gives me;
9. So, that’s modules and symbols. Let’s see if we can set some breakpoints. Suppose that I want to hit a breakpoint anytime that NotePad opens up a file. Let’s do that. Firstly, take a look at symbols in KERNEL32 to see if we can find the CreateFile function that NotePad uses.
10. Issue a “x KERNEL32!CreateF*” and see what results we get back. You should find the ASCII CreateFileA and the Unicode CreateFileW functions in there.
11. Set a breakpoint on a function. Issue a “bp kernel32!CreateFileW” to set a breakpoint on that function.
12. Issue a “GO” command “g” to continue the debuggee running.
13. Note that when the debuggee is running you can always get back to the debugger by issuing a CTRL+BREAK to the debugger window, at which point it will try to halt the debuggee for you and drop you back to the command prompt.
14. Open a file with notepad. You’ll spot NotePad loading some more modules in order to do this operation and get that dialog onto the screen. Your breakpoint will more than likely hit before the file dialog comes up.
15. Check out where you are with your breakpoint – i.e. what the stack frame looks like. Issue a “kb 200” command. You should get results as below;
16. So, we can see the direct call-stack that led to our call to CreateFileW. If we had full symbols for these modules we could use a kp rather than a kb command and we’d get a list of parameters to the functions. As it is, with what we have here we’d need to resort to disassembly to get the parameter information and that’s beyond this posting.
17. Continue the debuggee by issuing a “g” command and continue issuing “g” commands until NotePad finally puts its dialog on the screen – note how many times we hit this function here :-) If you’re interested, issue a few kb’s on some of these call chains and see what exactly it is that Notepad is doing.
18. Break the debuggee by hitting CTRL+BREAK on the WinDbg window.
19. Issue a “~” command to get a picture of the threads in the process. In my session I get 3 as below;
0:002> ~
0 Id: 9bc.b58 Suspend: 1 Teb: 7ffdf000 Unfrozen
1 Id: 9bc.8a4 Suspend: 1 Teb: 7ffde000 Unfrozen
. 2 Id: 9bc.d4 Suspend: 1 Teb: 7ffdd000 Unfrozen
20. Note that this display indicates that the debugger is focused on thread “2” (the line marked with a dot) which is really OS thread D4 in process 9BC (all hex I’m afraid).
21. Check out where all the threads are by issuing a “~* kb 200”. You might not get anything too exciting here but it’s illustrative of how to use the technique.
22. Finally, close off this session by having a look at some extension commands.
a. List the set of extensions that you’ve got by issuing the “.chain” command.
b. Have a look at the commands available in these extensions by issuing: !exts.help, !uext.help, !ntdsexts.help, !ext.help and see what commands are available to you.
c. Try an extension command. For example, “!exts.cs” will display all the Critical Sections that the process has and their status as to whether they are locked or not. This can be a useful command in many server-side scenarios.
23. End the debugging session with a “q” command.
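To make the thread listing from the “~” command a little more concrete, here is a small Python sketch that reads the three lines shown in the session above and reports which OS thread the debugger is focused on. It assumes the layout seen in that output (an optional “.” marker for the current thread, the debugger's thread number, then Id: <process>.<thread> in hex); the function name is made up for this example.

```python
import re

# Optional marker ('.' = current thread, '#' = thread that raised the event),
# then the debugger's thread number, then "Id: <pid>.<tid>" in hexadecimal.
THREAD_LINE = re.compile(
    r"^\s*(?P<marker>[.#])?\s*(?P<num>\d+)\s+Id:\s*"
    r"(?P<pid>[0-9a-fA-F]+)\.(?P<tid>[0-9a-fA-F]+)")


def parse_threads(listing):
    """Turn the text printed by '~' into a list of dictionaries."""
    threads = []
    for line in listing.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            threads.append({
                "number": int(match.group("num")),    # debugger thread number
                "pid": int(match.group("pid"), 16),   # OS process id
                "tid": int(match.group("tid"), 16),   # OS thread id
                "current": match.group("marker") == ".",
            })
    return threads


listing = """\
   0  Id: 9bc.b58 Suspend: 1 Teb: 7ffdf000 Unfrozen
   1  Id: 9bc.8a4 Suspend: 1 Teb: 7ffde000 Unfrozen
.  2  Id: 9bc.d4 Suspend: 1 Teb: 7ffdd000 Unfrozen
"""

threads = parse_threads(listing)
current = next(t for t in threads if t["current"])
print("debugger thread {} is OS thread 0x{:x} in process 0x{:x}".format(
    current["number"], current["tid"], current["pid"]))
```

The number on the left is what you pass to “~N s”; the hex value after the dot is the thread ID that other tools would show.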
That’s it. I’ll post again in the near future around how you can build on what I’ve just walked through here in order to debug managed code with WinDbg and take advantage of features within the debugger that don’t (yet) appear elsewhere.
A word for WinDbg (2)
Continuing on from the previous post on using WinDbg, let’s take what we learnt in that previous post and apply it to managed code.
WinDbg supports the debugging of managed code through an extension named SOS.DLL. This is named for esoteric reasons that I’ll not get into here but you can find out where that name came from by Googling the web for it.
This DLL is shipped with the latest debugger so you’ll find a version of it which matches V1.1 of the .NET framework in the debugger’s installation folder and you’ll also find a sub-folder labelled CLR10 which contains the previous version of the DLL.
You interact with the SOS.DLL extension by loading it up into WinDbg as you load any other extension. So, if I want to make use of SOS.DLL I issue;
.load SOS.DLL
And the debugger loads the extension for me and I can see it in the results of a;
.chain
command and I can set it as my default extension by using;
.setdll SOS.DLL
Before we go into SOS.DLL in more depth I’ll add a quick word about the symbols for the .NET framework. If you have the .NET framework SDK installed on your machine then you should have symbols for DLLs such as mscorwks.dll and mscoree.dll which make up the CLR installed onto your machine. On my machine (with VS.NET 2003 installed) these symbol files exist in a folder named;
So, this means that you can add this folder to your symbol path (see the previous post) and have WinDbg pick up symbols for the CLR bits for you locally. Alternatively, these symbols will come down from the symbol server anyway but it might save you some download time.
So, let’s consider straight away what that adds. If I have a piece of managed code such as the following;
using System;
using System.Threading;

namespace ConsoleApplication12
{
    class BadClass
    {
        public BadClass()
        {
        }

        // Finalizer that never returns, so it blocks the finalizer thread.
        ~BadClass()
        {
            Thread.Sleep(Timeout.Infinite);
        }
    }

    class EntryPoint
    {
        static void Main(string[] args)
        {
            BadClass c1 = new BadClass();
            BadClass c2 = new BadClass();
            System.GC.Collect();
            System.GC.Collect();
            // Waits for all pending finalizers, including ~BadClass(), to finish.
            System.GC.WaitForPendingFinalizers();
            Console.WriteLine("Done");
            Console.ReadLine();
        }
    }
}
Then I’ve created a bit of a monster in that what you should find is that this piece of code probably runs to completion if you run it in debug mode but it’ll never complete if you run it in release mode. This is because in release mode we made c1 and c2 eligible for garbage collection at an earlier point and that allows this code to run the finalizer for BadClass and that finalizer blocks the finalizer thread.
How to debug this with WinDbg? Run the process in release mode, attach WinDbg and have a look at the stack traces with a;
~* kb 200
command. I can see straight away from this that one of my threads looks like this;
So, I can see that this is my finalizer thread and somewhere in the call chain I can see that this thread is calling Sleep which sounds like a bad idea to me for a finalizer method.
Instantly, you get a good picture of what might be going on here. As an aside, it’s worth taking a look around the mscorwks, mscoree and friends DLLs to have a look at places where you can set breakpoints. You can do this with;
x mscoree!*
x mscorwks!*
So, our example so far has been nice but it doesn’t directly interact with managed code. If we want to do that then we need to look to the SOS.DLL and the functionality that it offers. We can see this with !SOS.help which gives;
0:003> !sos.help
SOS : Help
COMState | List COM state for each thread
ClrStack | Provides true managed stack trace, source and line numbers.
One of the commands listed here, COMState, is useful for seeing how your .NET threads have initialised themselves to COM (i.e. are they STA threads or MTA threads, plus more detailed info on where they are with respect to COM apartments, contexts, etc).
!FinalizeQueue
0:003> !FinalizeQueue
SyncBlock to be cleaned up: 0
----------------------------------
generation 0 has 0 finalizable objects (0014d330->0014d330)
generation 1 has 0 finalizable objects (0014d330->0014d330)
generation 2 has 0 finalizable objects (0014d330->0014d330)
Ready for finalization 1 objects (0014d330->0014d334)
Statistics:
MT Count TotalSize Class Name
935108 2 24 ConsoleApplication12.BadClass
Total 2 objects
In our scenario you can get a clear picture of what’s present on the queue for finalization in that we have 2 instances of BadClass resident in there.
!EEVersion
0:003> !EEVersion
1.1.4322.573 retail
Workstation build
Looking at Managed Memory
We can get a picture of the “global” state of memory within the application through the !dumpheap command.
The simplest way to use !dumpheap is to provide the -stat option which will give a complete list of what’s allocated on the managed heap. Note that for a big process you might need to go and get a cup of tea whilst this list is being created. In our small example !dumpheap -stat gives us;
So we can see that we have 57 objects here. If you examine the line for our managed type BadClass you’ll see that it has an MT column for its method table. This effectively describes the type and we can use it to restrict the output of !dumpheap by doing;
!dumpheap -mt 935108
0:003> !dumpheap -mt 935108
Address MT Size
04a61998 00935108 12
04a619a4 00935108 12
total 2 objects
Statistics:
MT Count TotalSize Class Name
935108 2 24 ConsoleApplication12.BadClass
Total 2 objects
Note that this actually gives us the addresses of the instances of our type (2 in our case) which we’ll come back to in a second but, first, how would we get the MethodTable for a particular type without sitting through !dumpheap -stat first? We can use the !Name2EE command as below;
And now we know the MethodTable address for our type which is nice :-)
Now, coming back to those addresses. These allow us to dump out the objects themselves using the !dumpobj command;
79c0d0cc 400039a 2c System.Int32 instance 0 version
79c0d0cc 400039b 8 CLASS instance 00000000 keys
79c0d0cc 400039c c CLASS instance 00000000 values
79c0d0cc 400039d 10 CLASS instance 00000000 _hcp
79c0d0cc 400039e 14 CLASS instance 00000000 _comparer
79c0d0cc 400039f 18 CLASS instance 00000000 m_siInfo
79c0d0cc 4000394 0 CLASS shared static primes
>> Domain:Value 00147880:04a61ac4 <<
So, in essence, we can traverse the entire managed heap from within WinDbg here and we can see the details of every single instance that is stored on that managed heap and we can track down all the members of those instances ad infinitum. This is really, really powerful stuff.
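If you save debugger output to a file, the walk from !dumpheap -mt to !dumpobj is easy to script. The Python sketch below is only an illustration: it takes the !dumpheap -mt listing shown earlier in this section as plain text, picks out the instance addresses for the BadClass method table, and prints the !dumpobj commands you would then type. The function name and the assumption that each instance row has exactly three columns (address, MT, size) come from that listing, not from any documented output format.

```python
def instance_addresses(dumpheap_output, method_table):
    """Return the addresses of rows whose MT column matches method_table."""
    wanted = int(method_table, 16)
    addresses = []
    for line in dumpheap_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # not an "Address MT Size" row
        try:
            int(fields[0], 16)          # address must be hex
            mt = int(fields[1], 16)     # method table must be hex
            int(fields[2])              # size must be a number
        except ValueError:
            continue  # header or summary line such as "total 2 objects"
        if mt == wanted:
            addresses.append(fields[0])
    return addresses


# Output copied from the "!dumpheap -mt 935108" session in the text.
output = """\
Address MT Size
04a61998 00935108 12
04a619a4 00935108 12
total 2 objects
Statistics:
MT Count TotalSize Class Name
935108 2 24 ConsoleApplication12.BadClass
Total 2 objects
"""

addresses = instance_addresses(output, "935108")
commands = ["!dumpobj " + address for address in addresses]
for command in commands:
    print(command)
```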
Looking at Stack Frames
What about stack frames? Well, if we look at the original stack trace that I had when I was trying to diagnose my “hung application” problem we can see the frames look like this;
So, what’s the bit that’s highlighted there? How come WinDbg can’t fathom this stack frame? Essentially what this points to is that in between the mscorwks!MethodTable::CallFinalizer function and mscorwks!ThreadNative::Sleep function we have managed code. If we want to see the managed stack frames then we can issue;
!ClrStack
which works on the current thread (i.e. the one that the debugger is “focused” on) and dumps out the managed stack – you can see that we now could guess that we have a problem in BadClass.Finalize which we kind of knew all along ;-)
We can also do !ClrStack -params -locals -regs -all to include parameter, local and register information which is going to be really helpful for real-world debugging. For managed objects that we find we can then go and do !dumpobj to take a look at those objects.
Setting BreakPoints
Setting breakpoints isn’t quite as easy for managed code in WinDbg as it might be but it’s far from impossible. What we have to work out to set a breakpoint on a managed function is the address of the code for that function.
For our type, BadClass, let’s say that we want to set a breakpoint in its constructor. What we do is;
1) Get the MethodTable address for the type by issuing our !Name2EE command.
a. !Name2EE ConsoleApplication12.exe ConsoleApplication12.BadClass
3) We can see that the address for our constructor here is 009350eb. We can provide this as a breakpoint to WinDbg
a. bp 009350eb
b. and when that breakpoint hits we can issue a step command (“t”) and we’ll find that we’re inside the function that we wanted to break in.
Determining Roots
Last topic for this post – how to determine what is holding on to your CLR memory. I talked about how we can use !dumpheap to take a look at what’s on the managed heap but we can also use the fantastic !gcroot command in order to determine for any managed object what “set of roots” are actually causing that object to remain “alive” rather than be garbage collected by the GC.
This is a particularly useful command if you suspect you’ve got a memory leak somewhere in the sense that something is holding on to a managed object longer than it should be – you can break into the process with WinDbg and take a look at what “roots” are present for your particular object and, hopefully, that’d move you forward in determining what’s going on.
To show an example of the output from !gcroot, let’s use the following piece of code;
using System;
using System.Collections;

class EntryPoint
{
    // Static field: everything added to this list stays reachable.
    private static ArrayList al = new ArrayList();

    static void Main(string[] args)
    {
        for (int i = 0; i < 10; ++i)
        {
            switch (i % 3)
            {
                case 0:
                    al.Add("Hello");
                    break;
                case 1:
                    al.Add(new Exception(""));
                    break;
                case 2:
                    al.Add(new object());
                    break;
            }
        }
        Console.ReadLine();
    }
}
So, if I want to know what objects are around at the point of the Console.ReadLine() call then I can use !dumpheap -stat to show me;
I can see that my System.Exception instance at 04a619fc is being referenced (rooted) by an array of objects at 04a61990 and that is rooted by an ArrayList at 04a619c.
So, we can chase back object references as far as we need to work out what’s going on with our managed memory and why particular instances are not yet eligible for GC.
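Conceptually, chasing roots is a graph search: start at the things the GC treats as roots and follow references until you reach the object you care about. The Python sketch below is a toy model of that idea only; it is not how SOS implements !gcroot, and the object labels stand in for the real addresses in the session above.

```python
from collections import deque


def find_root_path(roots, references, target):
    """Breadth-first search from the roots; return the first chain reaching target."""
    queue = deque([root] for root in roots)
    seen = set(roots)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for child in references.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(path + [child])
    return None  # nothing reachable from a root refers to the target


# Toy object graph modelled on the example: a static field refers to an
# ArrayList, which refers to its backing Object[] array, which refers to
# the items that were added to the list.
references = {
    "static EntryPoint.al": ["ArrayList"],
    "ArrayList": ["Object[]"],
    "Object[]": ["System.String", "System.Exception", "System.Object"],
}
roots = ["static EntryPoint.al"]

path = find_root_path(roots, references, "System.Exception")
print(" -> ".join(path))
```

An object for which no such chain exists is exactly one that the GC is free to collect.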
Working with Exceptions
The WinDbg debugger does not know anything about managed exceptions. Whenever managed code throws an exception it raises a native exception with code 0xe0434f4d.
If you want to stop on managed exceptions then you can issue to the debugger;
sxe 0xe0434f4d
or
sxe clr (much easier!)
and it’ll break at the point where the exception is thrown. You can grab the exception record from an unmanaged point of view by issuing a display exception record command with a minus 1 as;
.exr -1
I think there’s a nice way of turning this into a managed exception object but, at the time of writing, I can’t work it out. What you can do is execute a !dumpstackobjects and you should usually find your exception living within the set of objects that you get back there.
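As a small aside on the exception code itself, the value 0xe0434f4d is easier to remember once you notice that its low three bytes are ASCII characters. The Python sketch below simply decodes them; the function name is made up for this example.

```python
CLR_EXCEPTION_CODE = 0xE0434F4D


def decode_exception_tag(code):
    """Split a 32-bit exception code into its top byte and a 3-character tag."""
    raw = code.to_bytes(4, "big")
    return raw[0], raw[1:].decode("ascii")


top_byte, tag = decode_exception_tag(CLR_EXCEPTION_CODE)
print("top byte: 0x{:02X}, tag: {}".format(top_byte, tag))
```

Spotting that tag in a raw exception record is a quick hint that you are looking at a managed exception rather than, say, an access violation (0xc0000005).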
Wrap-up
That’s it for this posting – if you want more info on debugging with SOS.DLL check out this big guide that’s over on MSDN, and also have a look at resources such as;