My interest in a light-weight command-line debugger developed a while back after I watched one used in a Ruby on Rails video. I suspect there are many .Net developers with more experience than me who have used command line debuggers and never want to go back. The more I code, the more I long to leave my mouse behind and communicate with my development tools using only my keyboard, Excalibur.
I have been following a series of blogs by the awesome Harry Pierson (aka DevHawk) on writing an IronPython debugger. When I decided I actually wanted to play with the debugger, I found myself going back over these blogs taking notes as I went. This post is an attempt to summarize all this information and provide the links for going deeper, for myself if no one else.
Explains why he built a debugger when there are other alternatives. The Visual Studio debugger required too much mouse clicking, although he has posted about doing that too in Debugging IronPython Code in Visual Studio. MDbg, which is a .Net command-line debugger, doesn't support Just My Code debugging. He also notes that the reason for not using or porting pdb (The Python Debugger) is that IronPython does not yet implement settrace.
Describes the basics of MDbg. I found it interesting that there is also an IronPython extension for MDbg which isn't for debugging IronPython code, but for interacting with MDbg using IronPython. Sounds cool, I'll have to check it out some time.
Goes over the world's simplest debugger, and it really is pretty straightforward. It just passes the path of ipy.exe, and the Python code to debug as a parameter, into CorDebug (provided by MDbg). CorDebug starts the process (it can be used to debug any .Net process) and provides some events to hook into: OnCreateAppDomain and OnProcessExit. The world's simplest debugger basically just prints some text when the AppDomain is created and when the process exits.
Provides some background information:
ipy.exe produces debug information when the -D parameter is used.
IL generation is dynamic and in memory so the debugger API provides the Symbols (equivalent to .pdb files) as a Stream and has a callback when the Symbols change.
The .Net framework has an API to read and write these symbols as files, and MDbg provides a wrapper to read them from a Stream.
Then covers the Python code to load the symbols, translate a document/line into a function/offset, and set a breakpoint. Does not implement user-defined breakpoints in the post.
Covers the basics of how the interactivity will happen, which I've just realized isn't that complicated. The process being debugged just runs until a breakpoint event occurs (or the symbols change or the process completes). Once the process has stopped at the breakpoint, the interactivity can occur. It would be nice to be able to set breakpoints while the process is running, but this keeps it simpler for now.
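To make that flow concrete for myself, here is a minimal sketch of the loop in plain Python. It is not the ipydbg code: the event names, the `run_debug_loop` function and the "c" (continue) command are all made up for illustration, and a list of strings stands in for the callbacks a real debugger would receive.

```python
# Hypothetical event names; a real debugger receives callbacks instead.
BREAKPOINT = "breakpoint"
SYMBOLS_CHANGED = "symbols_changed"
PROCESS_EXIT = "process_exit"


def run_debug_loop(events, read_command):
    """Consume debugger events, pausing for user input only at breakpoints.

    events: iterable of event names standing in for debugger callbacks.
    read_command: callable returning the next user command as a string.
    Returns a log of what happened so the flow is easy to inspect.
    """
    log = []
    for event in events:
        if event == SYMBOLS_CHANGED:
            # Nothing to ask the user; just refresh the symbols and carry on.
            log.append("reload symbols")
        elif event == BREAKPOINT:
            log.append("stopped")
            # The debuggee is paused, so this is where interactivity happens.
            while True:
                command = read_command().strip().lower()
                if command == "c":  # hypothetical "continue" command
                    log.append("continue")
                    break
                log.append("command: " + command)
        elif event == PROCESS_EXIT:
            log.append("exit")
            break
    return log
```

Feeding it `[SYMBOLS_CHANGED, BREAKPOINT, PROCESS_EXIT]` with a `read_command` that returns "t" and then "c" shows the user only gets a prompt while the process is stopped.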
Background info on MDbg:
There are two stack chains in a typical managed app: unmanaged and managed. The debugger is for managed code, so only the managed chain is used.
Each chain contains a collection of stack frames, which is the familiar call stack. There are three types of stack frame: IL, native and internal.
Explains that dynamic methods are usually used by IronPython but can't be debugged, so they are implemented as non-dynamic methods when the -D option is set on ipy.exe.
MDbg has a wrapper around the unmanaged metadata API to get method information for displaying the call stack.
The command used in ipydbg to view the stack trace from the interactive console is "T".
Mostly discussion about code changes and refactoring. Adds code to automatically set up the MTA apartment state if the -X:MTA argument is not used. Explains the effect of this on the debugger design.
Introduces console commands for step (S), step in (I) and step out (O).
Skipped over this one, and I feel I may start skipping over a few where I'm not especially interested in the implementation. I can always refer back to them later if I really need to understand something.
Describes an issue where, when stepping into a Python function, the CLR breaks at some infrastructure code, presumably there to manage Python's decorators and such. Hence there is no line of user code to display. To resolve this, an additional automatic step is added when stepping into a function so it can be mapped to a line of user code.
To avoid multiple hits to the file system, the source files used to retrieve user code are cached.
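A cache like this only needs a dictionary. The sketch below is my own guess at the idea rather than the code from the post, and `get_source_line` is a name I made up. (The standard library's `linecache` module does much the same job.)

```python
_source_cache = {}  # maps a file path to the list of lines in that file


def get_source_line(path, line_number):
    """Return the 1-based line from path, reading the file at most once."""
    if path not in _source_cache:
        with open(path) as source_file:
            _source_cache[path] = source_file.read().splitlines()
    lines = _source_cache[path]
    if 1 <= line_number <= len(lines):
        return lines[line_number - 1]
    return None  # the line number is outside the file
```

After the first call for a given path, every later lookup is served from memory, which matters when the same file is consulted on every step.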
Describes how he chose to implement colours in the console, while bemoaning the stateful nature of the console foreground colour (i.e. when you change the foreground colour it will stay that way until you change it back). I'm sure I'll come back to this post when I want to add colours to the console, but for now I skipped through it, as it doesn't have much to do with writing or using an IronPython debugger. I guess that's why he discusses moving it out of ipydbg and into its own module in his next post.
Discusses some issues regarding mapping to the debugger COM object instance. This didn't sound like a whole lot of fun and I hope I don't have to come back and fully understand this one.
Uses the GetLocalVariable(int index) and GetLocalVariablesCount() methods from the MDbg CorFrame class. This post finally made me look up what lexical scoping means; it's a term I've heard heaps, usually in discussions about compilers, that I never really understood. It's fair to say that I'm still not confident in my understanding of it.
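One way to tame a stateful setting like this in Python is a context manager that puts the old value back when the block ends. This is only a sketch of that general technique, not necessarily what the post does. `FakeConsole` is a stand-in I invented so the example runs anywhere; under IronPython the real property would be `System.Console.ForegroundColor`.

```python
from contextlib import contextmanager


class FakeConsole(object):
    """Stand-in for a console whose foreground colour is stateful."""
    foreground = "gray"


@contextmanager
def foreground_colour(console, colour):
    """Temporarily switch the foreground colour, then restore the old one."""
    previous = console.foreground
    console.foreground = colour
    try:
        yield console
    finally:
        # Runs even if the body raises, so the colour never "sticks".
        console.foreground = previous
```

Inside `with foreground_colour(console, "red"):` the colour is red, and it reverts as soon as the block exits.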
Discusses matching up debug symbols with variable names from the user code. Doesn't actually evaluate the variables in this post. Notes that get_locals from the IronPython process emits some locals used internally; these are prefixed with a dollar sign.
This post covers some pretty tricky interfaces for dealing with all the different types, which requires a pretty good understanding of how the CLR handles variables under the covers. I didn't read it in too much detail but will consider revisiting it later, as the content is quite interesting and it's good to understand core CLR stuff.
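Hiding those internal locals from the user is a one-line filter on the dollar-sign prefix. The function name and the sample names below are invented for illustration.

```python
def user_locals(names):
    """Keep only the locals a user wrote, dropping '$'-prefixed internals."""
    return [name for name in names if not name.startswith("$")]
```

For example, `user_locals(["x", "$internal", "total"])` keeps just `x` and `total`.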
Discusses how console commands are routed to functions. This really has nothing to do with debugging IronPython, but the way it's implemented is interesting. He starts by implementing a switch using a dictionary of input commands and functions (Python has no C# switch keyword equivalent). He then takes it further by making use of Python decorators to bind commands to functions.
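Both ideas fit in a few lines. This is a rough sketch of the pattern as I understand it, not the ipydbg source: the `command` decorator, the `dispatch` function and the handler bodies are my own placeholders.

```python
commands = {}  # maps a key such as "s" to the function that handles it


def command(key):
    """Decorator that registers the decorated function under the given key."""
    def register(function):
        commands[key] = function
        return function
    return register


@command("s")
def step():
    return "step"


@command("t")
def stack_trace():
    return "stack trace"


def dispatch(key):
    """The dictionary lookup plays the role of a switch statement."""
    handler = commands.get(key.lower())
    if handler is None:
        return "unknown command: " + key
    return handler()
```

The nice part of the decorator version is that adding a command is just writing a function; nothing else needs to be edited to wire it up.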
Discusses getting function arguments as locals.
Discusses implementing a REPL console in the debugger with the IPyDebugProcess object available in the console scope. This is awesome for exploring the API using Python. Just by reflecting over the methods of the process object, I realised it would be trivial to add a command to list the source files currently being debugged, as it's just a property on the object. Exploratory coding using reflection is a really powerful concept in Python.
In the current implementation, a new local scope is created and used by the REPL console; it does not yet support executing code in the process being debugged.
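The standard library's `code` module shows the same idea in miniature: a console that runs in its own scope with one object handed to it. `FakeDebugProcess` and its `source_files` attribute are hypothetical stand-ins for the real process object, and this is a sketch of the concept rather than how ipydbg builds its console.

```python
import code


class FakeDebugProcess(object):
    """Hypothetical stand-in for the debugger's process object."""
    source_files = ["hello.py"]


def make_repl(process):
    """Build a console whose scope contains only the process object."""
    # A fresh dictionary is the console's scope, so anything typed runs
    # inside the debugger itself, not inside the process being debugged.
    scope = {"process": process}
    return code.InteractiveConsole(locals=scope), scope


console, scope = make_repl(FakeDebugProcess())
# push() feeds the console one line at a time, as if it had been typed.
console.push("files = process.source_files")
console.push("names = [n for n in dir(process) if not n.startswith('_')]")
```

Calling `console.interact()` instead of `push` would give a live prompt, where `dir(process)` is exactly the kind of reflection that makes exploring the API so easy.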
Implements console commands for setting breakpoints, as the original post was about how they worked and only implemented breaking on the first line of the user code. Introduces multi-key commands for breakpoints. Adds a breakpoint command (B) with sub-commands add (A), list (L), enable (E) and disable (D).
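Multi-key commands can be modelled as nested dictionaries, where the first key selects a group and the second selects the action. The sketch below keeps breakpoints in a plain list and never touches a real debugger; all of the function names and the breakpoint record layout are my own assumptions.

```python
breakpoints = []  # each entry is a dict with "file", "line" and "enabled"


def add_breakpoint(path, line):
    breakpoints.append({"file": path, "line": int(line), "enabled": True})
    return len(breakpoints) - 1  # index used to refer to it later


def list_breakpoints():
    return ["%d: %s:%d %s" % (index, bp["file"], bp["line"],
                              "enabled" if bp["enabled"] else "disabled")
            for index, bp in enumerate(breakpoints)]


def set_enabled(index, enabled):
    breakpoints[int(index)]["enabled"] = enabled


# B is the first key; A, L, E and D are the second.
top_level = {
    "b": {
        "a": add_breakpoint,
        "l": list_breakpoints,
        "e": lambda index: set_enabled(index, True),
        "d": lambda index: set_enabled(index, False),
    },
}


def run_command(keys, *args):
    """Walk the nested dictionaries one key press at a time."""
    target = top_level
    for key in keys.lower():
        target = target[key]
    if not callable(target):
        raise ValueError("incomplete command: " + keys)
    return target(*args)
```

So `run_command("ba", "hello.py", 3)` adds a breakpoint and `run_command("bl")` lists them.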
Here's a screenshot of me debugging a really simple app which might help explain what I've been talking about for this whole post.
I'm pretty amazed by the productivity and brilliance of Harry Pierson. I've learned heaps reading his blogs and I'm really impressed by the progress he's making with ipydbg. The debugger still has a long way to go before it is a really useful tool, but it's coming along very quickly. I think I will have to try using MDbg with the Python extensions, but I really hope to make some use of ipydbg and maybe even find something I can contribute to it.
If you're interested in trying out the debugger, I recommend checking his blog and the latest version of the project on GitHub.