Writing an IronPython Debugger: A Little Hack…err…Cleanup

Yesterday, I pushed out two commits to ipydbg. The first was simple, I removed all of the embedded ConsoleColorMgr code in favor of the separate consolecolor.py module I blogged about Thursday. The second commit…well, let’s just say it’s not quite so simple.

Last weekend, I was experimenting with breakpoints when I discovered that the MoveNext method of BreakpointEnumerator was throwing a NotImplementedException. Up to that point, I hadn’t modified any of the MDbg C# source code except to merge the corapi and raw assemblies into a single assembly. But since I had to fix BreakpointEnumerator, I figured I should make some improvements to the C# code as well. For example, I added helper functions to easily retrieve the metadata for a class or function.

In my latest commit, I’ve added a SymbolReader property to CorModule. Previously, I managed the mapping from CorModules to SymbolReaders in my IPyDebugProcess class via the symbol_readers field. However, since mapping CorModules to SymbolReaders is something pretty much any debugger app would have to do, it made more sense to have that be a part of CorModule directly. So now, you can set and retrieve the SymbolReader directly on the module. Furthermore, I moved the logic to retrieve a SymbolReader from the IStream provided in the OnUpdateModuleSymbols event into the CorModule class as well.

I wouldn’t have bothered to blog this change at all, except that if you look at how the SymbolReader property is implemented under the hood, it’s not what you would expect. Instead of having SymbolReader as an instance variable on CorModule – as you might expect -CorModule has a static dictionary mapping CorModules to SymbolReaders. The instance SymbolReader property simply then access to the underlying static dictionary.

//code taken from CorModule class in CorModule.cs
private static Dictionary<CorModule, ISymbolReader> _symbolsMap = 
    new Dictionary<CorModule, ISymbolReader>();

public ISymbolReader SymbolReader
{
    get
    {
        if (_symbolsMap.ContainsKey(this))
            return _symbolsMap[this];
        else
            return null;
    }
    set
    {
        _symbolsMap[this] = value;
    }
}

Now obviously, this the way you typically implement properties. However, the problem is that there isn’t a 1-to-1 mapping between the underlying debugger COM object instances and the managed objects instances that wrap them. For example, if you look at the CorClass:Module property, it constructs a new managed wrapper for the COM interface it gets back from ICorDebugClass.GetModule. That means that I can’t store the symbol reader as an instance field in the managed wrapper since I probably will never see a given managed wrapper module instance ever again.

All of the debugger API wrapper classes including CorModule inherit from a class named WrapperBase which overrides Equals and GetHashCode. The overridden implementations defer to the wrapped COM interface, which means that two separate managed wrapper instances of the same COM interface will have the same hash code and will evaluate as equal. The upshot is that object uniqueness is determined by the wrapped COM object rather that the managed object instance itself.

Using a static dictionary to store a module instance property provides the necessary “it doesn’t matter what managed object instance you use as long as they all wrap the same COM object underneath” semantics. If I create multiple instances CorModule that all wrap the same underlying COM interface pointer, they’ll all share the same SymbolReader instance from the dictionary.

Yeah, it’s feels kinda hacky, but it works.