Hybrid App Debugging – Threading

I added traceback to my GetThings app in just two lines of code, but so far it doesn’t actually do anything that you would expect a debugger to do. But before we get to that, we need to understand a little about how threading works for traceback debugging.

As I mentioned last time, the traceback debugger works by calling into the registered traceback handler at various times (entering/exiting a function, before executing a line of code and on exceptions). Execution of the Python code continues when the traceback function exits. That means that you have to block the execution thread while you let the user poke around with the debugger UI. For a console-based app, that’s easy. For a GUI app, not so much.

At a minimum, you need to run your debugger UI on a separate thread from your main app window. If you want your main app window to stay responsive while you debug, you’ll need to at least pump messages (DoEvents in Windows Forms; similar approaches are available for WPF) or, preferably, run your Python scripts on a background thread separate from both the main window UI thread and the debugger UI thread. To keep things simple, I’m going to simply block the main window thread while the debugger is active.

Since I’m going to have to set up a new thread for the debugger window, I decided to use a static initialization method to centralize creating the thread, creating the window and registering the traceback handler all in one place.

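static Thread _debugThread;
static DebugWindow _debugWindow;
// Signaled by DebugWindow's Loaded event handler once the window is up.
static AutoResetEvent _debugWindowReady = new AutoResetEvent(false);

public static void InitDebugWindow(ScriptEngine engine)
{
    _debugThread = new Thread(() =>
    {
        _debugWindow = new DebugWindow(engine);
        _debugWindow.Show();
        // Not the main WPF thread, so start the message pump explicitly.
        Dispatcher.Run();
    });
    // Threads that create WPF objects must be STA.
    _debugThread.SetApartmentState(ApartmentState.STA);
    _debugThread.Start();

    // Block until the window is ready, then register the handler on the
    // calling thread (the one that will execute the Python code).
    _debugWindowReady.WaitOne();
    engine.SetTrace(_debugWindow.OnTracebackReceived);
}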

As you can see, InitDebugWindow spins up a new thread and creates the debug window on that thread. Since it’s not the main WPF application thread, you have to explicitly call Dispatcher.Run to get the event queue pumping. You also have to explicitly set the apartment state to single-threaded (STA) for any thread that creates WPF objects. Finally, I wait for the window to signal that it’s ready (it sets the _debugWindowReady event in the Window Loaded event handler) and then call SetTrace, passing in the debug window’s OnTracebackReceived method, on the thread that called InitDebugWindow.

It’s critical that you call SetTrace – and thus InitDebugWindow – on the thread that’s going to execute the Python code. Debugging in Python is per thread. Even if you execute the same code in the same ScriptScope with the same ScriptEngine but on a different thread, the traceback handler calls won’t fire. The way DebugWindow is written, it will only support debugging a single thread, but it would be pretty straightforward to support multiple threads by changing the way OnTracebackReceived gets signaled to continue.
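
The per-thread behavior is easy to see in standard Python, whose sys.settrace is the mechanism the IronPython feature mirrors. The sketch below is only an illustration under that assumption: it runs on CPython rather than through the hosting API, and the names tracer, work and run_demo are made up for the example. It registers a handler on the calling thread, then calls the same function once on that thread and once on a new thread.

```python
import sys
import threading

seen = []  # names of the threads on which work() was observed being called


def tracer(frame, event, arg):
    # Only record calls to work(); ignore everything else that runs.
    if event == "call" and frame.f_code is work.__code__:
        seen.append(threading.current_thread().name)
    return None  # no line-level tracing needed for this demo


def work():
    return 42


def run_demo():
    seen.clear()
    previous = sys.gettrace()
    sys.settrace(tracer)  # registers the handler for the calling thread only
    try:
        work()  # traced: same thread that called settrace
        worker = threading.Thread(target=work, name="worker")
        worker.start()  # not traced: the worker thread has no handler
        worker.join()
    finally:
        sys.settrace(previous)
    return list(seen)
```

Calling run_demo() should record exactly one call, from the thread that registered the handler. The call made on the "worker" thread goes unnoticed, which is the same trap as calling SetTrace on the wrong thread.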

Speaking of OnTracebackReceived, this was my initial basic implementation of it:

private TracebackDelegate OnTracebackReceived
    (TraceBackFrame frame, string result, object payload)
{
    // Hand the event to the debug window's thread without waiting for it.
    Action<TraceBackFrame, string, object> tbAction = this.OnTraceback;
    this.Dispatcher.BeginInvoke(tbAction, frame, result, payload);
    // Block the Python thread until the user chooses to resume.
    _dbgContinue.WaitOne();
    return this.OnTracebackReceived;
}

As we saw, the DebugWindow is running on a different thread than the one the traceback handler call comes in on. So OnTracebackReceived needs to invoke a new call on the correct thread by using Dispatcher.BeginInvoke. Even though OnTracebackReceived is always called on the main window thread, it still has access to the properties of the debug window, such as its Dispatcher. I used BeginInvoke to invoke OnTraceback asynchronously – OnTraceback isn’t going to return anything interesting and we’re going to wait on an AutoResetEvent before continuing anyway, so I didn’t see any reason to use a synchronous call.

We’ll discuss OnTraceback more in the next post, but basically it will configure the UI for the traceback event that happened. Then DebugWindow will wait for user input. When the user indicates they want to resume execution, the command handler in question will set _dbgContinue and the original traceback will return so execution can continue.
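
The post-then-block handshake can be sketched in plain Python. This is a rough analogue, not the DebugWindow code: MiniDebugger, run_script and add are hypothetical names, a queue.Queue stands in for the WPF Dispatcher, and a threading.Event that is cleared after each wait stands in for the auto-reset _dbgContinue. The "UI" here resumes immediately instead of waiting for a user.

```python
import queue
import sys
import threading


class MiniDebugger:
    """Hypothetical stand-in for DebugWindow; a queue plays the Dispatcher."""

    def __init__(self):
        self.inbox = queue.Queue()       # work posted to the debugger thread
        self.resume = threading.Event()  # plays the role of _dbgContinue
        self.log = []                    # events the "UI" was shown
        self.thread = threading.Thread(target=self._pump, daemon=True)
        self.thread.start()

    def _pump(self):
        # Debugger thread: handle each posted event, then release the script.
        while True:
            item = self.inbox.get()
            if item is None:
                break
            self.log.append(item)  # a real UI would wait for the user here
            self.resume.set()

    def on_traceback_received(self, frame, event, arg):
        # Script thread: post the event asynchronously, then block.
        self.inbox.put((event, frame.f_code.co_name, frame.f_lineno))
        self.resume.wait()
        self.resume.clear()  # emulate auto-reset behaviour
        return self.on_traceback_received

    def close(self):
        self.inbox.put(None)
        self.thread.join()


def run_script(dbg):
    def add(a, b):
        total = a + b
        return total

    previous = sys.gettrace()
    sys.settrace(dbg.on_traceback_received)  # register on the script thread
    try:
        add(1, 2)
    finally:
        sys.settrace(previous)
```

Because the script thread blocks after every post, the debugger thread sees each event before the traced code moves on, which is what lets a UI inspect the paused state safely.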

Hybrid App Debugging Aside – The DLR Hosting API

In my series on Hybrid App Debugging, I showed the following code for executing a Python file in a hybrid C#/IronPython app.

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    ScriptEngine engine = Python.CreateEngine();
    ScriptScope  scope = engine.CreateScope();
    scope.SetVariable("items", lbThings.Items);
    engine.ExecuteFile("getthings.py", scope);
}

The DLR Hosting API has three distinct levels of functionality. As simple as this is, technically it’s level 2 since it’s using a ScriptEngine directly. If you wanted to use the simplest level 1 hosting API, you could use runtimes instead of engines and save a line of code.

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    ScriptRuntime runtime = Python.CreateRuntime();
    runtime.Globals.SetVariable("items", lbThings.Items);
    runtime.ExecuteFile("getthings.py");
}

The ScriptRuntime version of ExecuteFile doesn’t include an overload that takes a ScriptScope like ScriptEngine does, so instead you add the items variable to the globals scope. However, this doesn’t automatically add the items object to every child scope – you have to explicitly import items into the local scope if you want to use it. So for Python, that means you need to add “import items” to the top of the GetThings.py script. Nothing else changes.

Personally, I find DLR Hosting API Level 2 to be straightforward and easy enough to understand, so I tend to code to that level by default. I actually had to go read the doc to discover the ScriptRuntime.Globals property and talk to Dino about importing those variables into a local scope. However, I wanted to point out that nothing in my Hybrid App Debugging sample so far is really dependent on the level 2 API. If you just want to execute some Python files in the context of your C# application, you can stick with the simpler level 1 API if you want. You can even use lightweight debugging with the level 1 API – there’s an overload of the SetTrace extension method for ScriptRuntimes just as there is for ScriptEngines. Just something to keep in mind.

Hybrid App Debugging – TracebackDelegate and SetTrace

Now that I’ve introduced my simple hybrid GetThings app, we need to set about adding support for debugging just the IronPython part of the app via the new lightweight debugging functionality we’re introducing in 2.6. Note, the code is up on GitHub, but isn’t going to exactly match what I show on the blog. Also, I have a post-RC1 daily build of IronPython in the Externals folder since I discovered a few issues while building this sample that Dino had to fix after RC1. Those assemblies will be updated as needed as the sample progresses.

We saw last time how easy it is to execute a Python script to configure a C# app – only four lines of code. If we want to support debugging, we need to add a fifth:

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    ScriptEngine engine = Python.CreateEngine();
    engine.SetTrace(this.OnTraceback);

    ScriptScope s = engine.CreateScope();
    s.SetVariable("items", lbThings.Items);
    engine.ExecuteFile("getthings.py", s);
}

You’ll notice the one new line – the call to engine.SetTrace. This is actually an extension method – ScriptEngine is a DLR hosting API class, but SetTrace is IronPython-specific functionality [1]. If you look at the source of Python.SetTrace, you’ll see that it’s just a wrapper around SysModule.settrace, but it saves you from having to get the engine’s shared PythonContext yourself.

SetTrace takes a TracebackDelegate as a parameter. That delegate gets registered as the global traceback handler for the Python engine (on that thread, but we’ll ignore threading for now). Whenever that engine enters a new scope (i.e. a new function), the IronPython runtime calls into the global traceback handler. While the traceback handler runs, execution of the Python code in that engine is paused. When the traceback handler returns, the engine resumes executing Python code.

In addition to the global traceback handler, each scope has a local traceback handler as well. The TracebackDelegate type returns a TracebackDelegate which is used as the local traceback handler for the next traceback event within that scope. Traceback handlers can return themselves, some other TracebackDelegate, or null if they don’t want any more traceback events for the current scope. It’s kinda confusing, so here’s a picture:

You’ll notice three different traceback event types in the picture above: call, line and return. Call indicates the start of a scope, and is always invoked on the global traceback handler (i.e. the traceback passed to SetTrace). Line indicates the Python engine is about to execute a line of code and return indicates the end of a scope’s execution. As you can see, the runtime uses the return value of the traceback for the next tracing call until the end of the scope. The return value from the “return” event handler is ignored.
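
The same global/local protocol can be watched in standard Python with sys.settrace. The following is a small sketch under that assumption (CPython, not the C# delegate), with made-up names global_handler, local_handler, inner, outer and trace_outer. The global handler returns a different function as the local handler so that you can tell which one received each event, and it returns None for inner() to opt out of that scope.

```python
import sys

events = []  # (which handler, event type, scope name)


def local_handler(frame, event, arg):
    events.append(("local", event, frame.f_code.co_name))
    return local_handler  # keep receiving events for this scope


def global_handler(frame, event, arg):
    events.append(("global", event, frame.f_code.co_name))
    if frame.f_code.co_name == "inner":
        return None  # opt out: no line or return events for inner()
    return local_handler


def inner():
    return 1


def outer():
    x = inner()
    return x + 1


def trace_outer():
    events.clear()
    previous = sys.gettrace()
    sys.settrace(global_handler)
    try:
        outer()
    finally:
        sys.settrace(previous)
    return list(events)
```

In the list that trace_outer() returns, every "call" event is tagged "global", the line and return events for outer() are tagged "local", and inner() contributes only its call event because the handler returned None for it.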

So now that we know the basics of traceback handlers, here’s a simple TracebackDelegate that simply returns itself. The “Hello, world!” of traceback debugging if you will.

private TracebackDelegate OnTraceback
    (TraceBackFrame frame, string result, object payload)
{
    return this.OnTraceback;
}

If you run this code, there will be no functional difference from the code before you added the SetTrace call. That’s because we’re not doing anything in the traceback handler. But if you run this in the debugger with a breakpoint on this function, you’ll see that it gets called a bunch of times. In the Python code from the last post, there are three scopes – module scope, download_stuff function scope and the get_nodes function scope. Each of those scopes will have a call and return event, plus a bunch of line events in between.
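
If you don’t have a debugger handy, you can count the scopes instead. This sketch assumes standard Python’s sys.settrace and uses a cut-down stand-in for the script (SCRIPT, count_calls and the file name getthings_demo.py are invented for the example; the real script loads XML). The handler just returns itself and tallies “call” events by scope name.

```python
import sys
from collections import Counter

# Simplified stand-in for getthings.py: one module scope, two function scopes.
SCRIPT = '''
def get_nodes(data):
    return data.split(",")

def download_stuff():
    for n in get_nodes("a,b"):
        items.append(n)

download_stuff()
'''


def count_calls():
    calls = Counter()

    def on_traceback(frame, event, arg):
        if event == "call":
            calls[frame.f_code.co_name] += 1
        return on_traceback  # the "Hello, world!" handler: return yourself

    items = []  # injected into the script, like scope.SetVariable("items", ...)
    code = compile(SCRIPT, "getthings_demo.py", "exec")
    previous = sys.gettrace()
    sys.settrace(on_traceback)
    try:
        exec(code, {"items": items})
    finally:
        sys.settrace(previous)
    return dict(calls), items
```

The tally should contain one entry each for the module scope (reported as "<module>"), download_stuff and get_nodes, and the script still fills items exactly as it would without tracing.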

The parameters for TracebackDelegate are described in the Python docs. The frame parameter is the current stack frame – it has information about the local and global variables, the code object currently executing, the line number being executed and a pointer to the previous stack frame if there is one. More information on code and frame objects is available in the Python data model (look for “internal types”). Result is the reason why the traceback function is being called (in the Python docs, it’s called “event” but that’s a keyword in C#). IronPython supports four traceback results: “call”, “line” and “return” as described above, plus “exception” when an exception is thrown. Finally, the payload value’s meaning depends on the traceback result. For call and line, payload is null. For return, payload is the value being returned from the function. For exception, the payload is information about the exception and where it was thrown.
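
To make the payload rules concrete, here is a minimal sketch against standard Python’s sys.settrace, where the payload is called arg. It assumes CPython semantics, in which the exception payload is a (type, value, traceback) tuple; IronPython’s payload object may be shaped differently. The names collect_payloads, handler and divide are illustrative only.

```python
import sys


def collect_payloads():
    seen = []  # (event, scope name, payload summary)

    def handler(frame, event, arg):
        if event == "return":
            # arg is the value being returned (None when unwinding an exception)
            seen.append((event, frame.f_code.co_name, arg))
        elif event == "exception":
            exc_type, exc_value, exc_tb = arg
            seen.append((event, frame.f_code.co_name, exc_type.__name__))
        return handler

    def divide(a, b):
        return a / b

    previous = sys.gettrace()
    sys.settrace(handler)
    try:
        divide(6, 3)
        try:
            divide(1, 0)
        except ZeroDivisionError:
            pass
    finally:
        sys.settrace(previous)
    return seen
```

The first recorded entry is the return event for divide(6, 3) with its result as the payload. The failing call produces an exception event naming ZeroDivisionError, followed by a return event whose payload is None because the function exited by raising.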

As I mentioned above, Python code execution is paused while the traceback handler executes and then continues when the traceback handler returns. That means you need to block in that function if you want to let the user interact with the debugger. For a console app like PDB, you can do that with a single thread of execution easily enough. For a GUI app like GetThings, that means running the debugger and debuggee windows on separate threads. And as I alluded to, tracing for Python script engines is per thread. So next time, we’ll look deeper into how to use multiple threads for lightweight debugging of a hybrid app.


  1. Eventually, I’d like to see IronRuby support lightweight debugging as well. However, there’s no built-in mechanism for Ruby debugging the way there is for Python, so it’s less clear how we should expose debugging to the Ruby developer. We’d also want to build a language-neutral DLR Hosting API mechanism for lightweight debugging at that point. But honestly, we have higher priorities right now.

Lightweight Debugging for Hybrid C#/IronPython Apps

One of the IronPython scenarios that I’m hearing more and more about recently is for polyglot programs. In these scenarios, part of the application is built in IronPython while other parts are built in compiled, statically typed languages like C# or Visual Basic. Sometimes, programs are written this way to allow the C# app to access a Python library, like my Pygments for WL Writer plugin. Other programs want to be customizable by the end user, like Intellipad. Whatever the reason, I think that the number of these hybrid polyglot programs is going up, which partially explains why the C# team added the new dynamic type to C# 4.0.

The thing is that if you’re going to build polyglot apps, you’re probably going to want the ability to debug polyglot apps as well. I’ve written extensively about building a debugger for IronPython. However, ipydbg uses the CLR debugger under the hood, which means you have to have the debugger and the code it’s debugging in separate processes. That’s a huge design burden for building a debuggable polyglot application. Luckily, as of IronPython 2.6, we support Python’s built-in trace debugging capability (aka sys.settrace). While you can use this in pure Python apps (like PDB), you can also use it in polyglot C# (or VB)/IronPython apps as well. If only someone were to take the time to build a sample and document what he did along the way…

Hey, that sounds like PM work!

Seriously, let me introduce you to the world’s simplest Twitter application: GetThings. The app downloads a list of my tweets via the Twitter API and displays them in a list box. The UI is written in C# while the tweet download code is written in Python. Clearly, this is a pretty brain-dead app – but the point isn’t to build a great Twitter app but rather to show how to use the settrace API from C#.

I’ve stuck the code up on GitHub. If you want to see the basic app in action sans debugging, start with the initial checkin. As you can see here, basic C# / IronPython integration is pretty trivial. I’m simply creating an engine and a scope, adding the list box’s Items property to the scope, and executing the getthings.py file from the disk.

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    ScriptEngine engine = Python.CreateEngine();
    ScriptScope  scope = engine.CreateScope();
    scope.SetVariable("items", lbThings.Items);
    engine.ExecuteFile("getthings.py", scope);
}

Since GetThings.py is just a text file, the user can modify it to get a list of anything they want – some other user’s timeline, the public timeline, or even – gasp! – something not from Twitter! In fact, as you see below, I’ve actually modified it to pull the tweets from a file on disk so I can avoid hitting the network on every run.

import clr
clr.AddReference("System.Xml")
from System.Xml import XmlDocument

def get_nodes(xml):
    return xml.SelectNodes("statuses/status/text")

def download_stuff():
    x = XmlDocument()

    # Load from disk to save time in development.
    #x.Load("http://twitter.com/statuses/user_timeline/devhawk.xml")
    x.Load("devhawk.xml")

    for n in get_nodes(x):
        txt = n.InnerText
        items.Add(txt)

download_stuff()

OK, so that’s the basics of the world’s simplest hybrid C#/IronPython Twitter application. Next up, I’ll add the settrace basics.