IronPython And * Series

IronPython and [Insert MSFT Technology Here]

Now that PDC08 is in my rear view mirror, I’m back to doing IronPython stuff. One of the things I’m looking at is making IronPython work with a variety of Microsoft technologies. Given the usage of dynamic languages in web scenarios, most of our focus to date has been on using Iron languages in Silverlight. Being able to program the browser with the same language you program the server is a fairly compelling scenario. We’re also starting to see new progress on ASP.NET support for Iron languages.

But those are only two out of a veritable universe of cool technologies. Now that I’m done with PDC, I can start to explore some of the others. Some ideas include:

  • IPy and WPF
  • IPy and Surface
  • IPy and XNA (desktop only – Xbox and Zune use the Compact Framework, which doesn’t support the DLR)
  • IPy and WCF
  • IPy and WF

Any other suggestions? Please leave them in the comments.

IronPython and WPF Part 1: Introduction

I decided to start my IronPython and “veritable universe of cool technologies” examples with WPF. I figured that since we already have Silverlight support, there might be some overlap (there was). Furthermore, after seeing BabySmash on Surface I’m jonesing to build a Surface app of my own. Getting vanilla WPF working with IPy seems like a smart step before trying to build a Surface WPF app with IPy.

WPF is all about cool graphics, so I decided to build a photo viewing app. Kinda boring, I know. But it turns out my wife has posted hundreds of photos to her WL Space, and WL Spaces provides convenient RSS feeds of both photo albums as well as photos in specific albums. So I built out a simple WPF based photo viewer for my wife’s WL Space photos in IronPython.

TechieWife Photo Viewer screenshot

As you can see, I’m not quitting my job to go pursue a career in design anytime soon. But hey, the point is to demonstrate building a WPF app in IPy, not to be a great designer. Plus, don’t those cute kids make up for the ugliness of the app?

Turns out building this app in IPy was fairly straightforward, with a few pitfalls. I wasted half a day digging thru data binding before I realized that data binding against IPy objects works out of the box – but only if you type the case of the property correctly (Title != title). Also, I couldn’t make TypeConverters work the way I wanted, but Python list comprehensions made it easy enough to transform the feed data before binding it to the UI. That approach worked great for this scenario but maybe not so much for others. (I’ve got feelers out to the WPF data binding wonks, so maybe there’s still hope for type converters)

Over the next several posts, I’m going to show you all the code for this app. It’s pretty small, only about 50 lines of app-specific Python code + 50 lines of XAML to describe the window. There’s also some reusable code – 50 lines of WPF module code (mostly stolen from the IPy tutorial), 200 lines of xml2py code which I’ve discussed before and a very small C# based assembly to make accessing WPF elements by name very pythonic.

IronPython and WPF Part 2: Loading XAML

If we’re going to build a WPF app, we’re going to want to be able to load some XAML. Sure, you can programmatically build up your UI, but WPF and more importantly WPF tools like Expression Blend are designed to work with XAML. Luckily, loading XAML is fairly easy:

def LoadXaml(filename):
    from System.IO import File
    from System.Windows.Markup import XamlReader
    with File.OpenRead(filename) as f:
        return XamlReader.Load(f)

We simply open the filename provided and use XamlReader to build out the corresponding WPF object graph. Note, this is very different from the XAML approach used by C#/VB or even by IronPythonStudio. In those scenarios, the XAML is compiled into a binary format (BAML) and embedded in the compiled assembly. For my TechieWife Photo viewer, it’s all script so there’s neither a XAML to BAML compile step nor a compiled assembly to embed the BAML into, so we’re just loading raw XAML.

Since we’re using raw XAML, there are additional rules we need to follow. First, when using compiled XAML, we can specify the name of the event handler in the XAML directly. For XamlReader, that’s not allowed since there’s no C#/VB class associated with the XAML. Speaking of class, you can’t specify x:Class either. Finally, as far as I can tell, anything you want to use as a static resource needs to be compiled in a static language. I think you could build one in C#, add a reference to that assembly via clr.AddReference, then use it from XAML and it should just work. However, since I’m trying to stick to IronPython exclusively, I didn’t try that scenario out.

Since you can’t specify the event handlers in XAML loaded by XamlReader, you have to bind the event handlers in code. There are two listboxes in my photo viewing app, and I want to capture the SelectionChanged event of both of them. Binding event handlers in IronPython code uses the same += syntax as C# uses.

win1 = wpf.LoadXaml('win1.xaml')

win1.listbox1.SelectionChanged += listbox1_OnSelectionChanged
win1.listbox2.SelectionChanged += listbox2_OnSelectionChanged

My win1.xaml file has a Window type instance as the root. You don’t need to be a deep WPF expert to realize that the WPF Window doesn’t have listbox1 or listbox2 properties. Yet, in the code snippet above, I was able to say win1.listbox1 and get back the WPF ListBox element with that name. Cool trick, eh? Well, I can’t take credit for it – I copied the code from our Silverlight integration for dynamic languages. Unfortunately, this code has to be written in C# code, but it is the only C# code in my whole solution (and it’s reusable!)

[assembly: ExtensionType(
    typeof(FrameworkElement),
    typeof(DevHawk.Scripting.Wpf.FrameworkElementExtension))]

namespace DevHawk.Scripting.Wpf
{
    public static class FrameworkElementExtension
    {
        [SpecialName]
        public static object GetBoundMember(FrameworkElement e, string n)
        {
            object result = e.FindName(n);
            if (result == null)
                return OperationFailed.Value;
            return result;
        }
    }
}

GetBoundMember is kinda like Python’s __getattr__ or Ruby’s method_missing. Of course, it doesn’t work with C#, but it does let us trap dynamic member resolution when calling a C# object from a DLR language. Srivatsn has a great write up on using GetBoundMember and the four other special methods you can use to make your CLR objects act more dynamic.

In this case, if the standard reflection-based member name resolution fails, we try calling FrameworkElement’s FindName method to see if there’s a corresponding control with the provided name. So win1.listbox1 is equivalent to win1.FindName('listbox1'), but with less code and a much more pythonic feel.
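
If you don’t have IPy and WPF handy, the same fallback idea can be sketched in plain Python with __getattr__. This is only an illustration, not the DLR mechanism: the Element class and its find_name method are hypothetical stand-ins for FrameworkElement and FindName.

```python
class Element(object):
    """Hypothetical stand-in for a WPF FrameworkElement with named children."""

    def __init__(self, **named_children):
        self._children = dict(named_children)

    def find_name(self, name):
        # Mirrors the role of FindName: returns None when nothing matches.
        return self._children.get(name)

    def __getattr__(self, name):
        # Python only calls __getattr__ after normal attribute lookup fails,
        # which is the same "try the regular way first" order described above.
        if name.startswith('_'):
            raise AttributeError(name)
        result = self.find_name(name)
        if result is None:
            raise AttributeError(name)
        return result


win = Element(listbox1='the first list box')
```

With that in place, win.listbox1 and win.find_name('listbox1') return the same object, and asking for a name that doesn’t exist raises AttributeError just like any other missing attribute.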

You’ll notice that we’re attaching this GetBoundMember method to FrameworkElement as an extension method. It’s kinda cool that we can inject a new method into an existing class to provide dynamic behavior and it all works seamlessly from Python. However, DLR uses a different mechanism to locate and bind extension methods than C# or VB. Those languages use ExtensionAttribute to mark extension methods and the assemblies and classes that contain them. That approach forces you to examine every single class in marked assemblies and every single method in marked classes. Examining every class and method is no big deal to do at compile time, but it would be a significant perf issue at runtime. By using the ExtensionType attribute, the DLR only has to look at assembly attributes in order to bind extension methods.

Once you’ve got the compiled FrameworkElementExtension assembly, you just need to load it via clr.AddReference. I called the assembly DevHawk.Scripting.Wpf and I load it automatically in my wpf module. So if you’re building a WPF app in IronPython, you can simply “import wpf” and you get the GetBoundMember extension method, the LoadXaml function, and a bunch of WPF related namespaces imported into the wpf scope. That way, you can write wpf.Button() instead of System.Windows.Controls.Button() to programmatically create a new button.

IronPython and WPF Part 3: Data Binding

Here’s the short version of this post: data binding in WPF to IPy objects just works…mostly. However, I’m guessing you are much more interested in the long version.

Typically, data binding depends on reflection. For example, the following snippet of XAML defines a data bound list box where the title property of each object in the bound collection gets bound to the text property of a text block control. WPF would typically find the title property of the bound objects via reflection.

<ListBox Grid.Column="0" x:Name="listbox1" >
  <ListBox.ItemTemplate>
    <DataTemplate>
      <TextBlock Text="{Binding Path=title}" />
    </DataTemplate>
  </ListBox.ItemTemplate>
</ListBox>

The problem is that IronPython objects don’t support reflection – or more accurately, reflection won’t give you the answer you’re expecting. Every IPy object does have a static type, but it implements Python’s dynamic type model [1]. Thus, if you reflect on the IPy object looking for the title property or field, you won’t find it. It might seem we’re in a bit of a bind (pun intended). However, WPF does provide an out:

“You can bind to public properties, sub-properties, as well as indexers of any common language runtime (CLR) object. The binding engine uses CLR reflection to get the values of the properties. Alternatively, objects that implement ICustomTypeDescriptor or have a registered TypeDescriptionProvider also work with the binding engine.”
WPF Binding Sources Overview, MSDN Library

Luckily for us, IronPython objects implement ICustomTypeDescriptor [2]. That snippet of XAML above? It’s straight from my photo viewing app. All I had to do was define the data template in the list box XAML then set the ItemsSource property of the list box instance.

w.listbox1.ItemsSource =

As I said, it just works. However, I did hit one small snag – hence the “mostly” caveat above.

If you look at the top level WL Spaces photos feed, you’ll see that each item’s title starts with “Photo Album:”. Yet in the screenshot of my app, you’ll notice that I’ve stripped that redundant text out of the title. Typically, if you want to change the bound value during the binding process, you build an IValueConverter class. I needed two value conversions in my app, stripping “Photo Album:” for the album list box and converting a string URL into a BitmapImage for the image list box.

IronPython objects can inherit from a .NET interface, so there’s no problem building an IValueConverter. However, in order to use a custom IValueConverter from XAML, you need to declare it in XAML as a static resource and, as you might imagine, dynamic IPy objects don’t work as static resources. So while I can define an IValueConverter in Python, I can’t create one from XAML.

There are a few possible solutions to this. The first is to build up the data template in code. If you do that, then you can programmatically add the converter to the binding. I was hopeful that I could define the data template in XAML then manipulate the binding, but there doesn’t appear to be any way to do that. Another option would be to build some type of generic IValueConverter class in C# that loads either an IPy based IValueConverter or embedded Python conversion code. That’s problematic because those IPy objects would need to be created in the right ScriptRuntime, and there’s no built-in way to access that. There is also a small set of XamlReader extensions such as XamlTypeMapper that might be able to provide the right hook into the XAML parsing to allow IronPython based conversion.

In the end, I took the easiest way out – I transformed the data to be bound before binding it. It’s cheating of sorts, but given the read-only nature of this app, it was the easiest thing to do. So the actual line of code to set listbox1’s ItemsSource looks like this:

class Album(object):
  def __init__(self, item):
    self.title = item.title.Substring(13)
    self.itemRSS = item.itemRSS

w.listbox1.ItemsSource = [Album(item) for item in]

I create a Python class for each RSS item in the feed, saving the stripped title and the album RSS URL as fields. It’s kinda annoying to basically be parsing the feed twice, but at least it’s not much code. Python’s list comprehension syntax makes creating a list of Albums from a list of RSS items a single line of code. I do something very similar for data binding the second list box:

class Picture(object):
  def __init__(self, item):
    self.title = item.title  
    self.picture = BitmapImage(Uri(item.enclosure.url + ":thumbnail"))

w.listbox2.ItemsSource = [Picture(item) for item in]

Here I’m not only converting the raw data (adding “:thumbnail” at the end of the URL) but also changing the data type from string to BitmapImage. I’m binding to an image object in the second list box, but to do that I need a BitmapImage instead of a string.
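
To make the “convert the data first” idea concrete without a live feed, here’s a minimal sketch in plain Python. The feed_items list and its example.com URLs are made up for illustration, and I’m assuming a single space follows “Photo Album:” (which is what Substring(13) implies).

```python
PREFIX = 'Photo Album: '  # 13 characters, matching Substring(13) above


def strip_prefix(title, prefix=PREFIX):
    # Leave titles alone if they don't carry the redundant prefix.
    return title[len(prefix):] if title.startswith(prefix) else title


class Album(object):
    def __init__(self, item):
        self.title = strip_prefix(item['title'])
        self.itemRSS = item['itemRSS']


# Hypothetical stand-in for the parsed RSS items.
feed_items = [
    {'title': 'Photo Album: Camp Days', 'itemRSS': 'http://example.com/camp-days.rss'},
    {'title': 'Photo Album: July 14', 'itemRSS': 'http://example.com/july-14.rss'},
]

# One list comprehension does the whole transformation before binding.
albums = [Album(item) for item in feed_items]
titles = [album.title for album in albums]
```

The list of Album objects is what would get handed to ItemsSource; the UI never sees the raw titles.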

This “convert the data first” approach feels like a hack to me. After I get this series of posts done, I am planning on going back and improving this sample. Hopefully, I can find a better approach to value conversions. Any gurus out there on XAML parsing, please feel free to drop me a line or leave me a comment.

  1. You can access the underlying CLR type for any Python type via the clr.GetClrType method. You can also check out the CreateNewType method from NewTypeMaker.cs

  2. I spent the better part of an afternoon trying to make TypeDescriptionProviders work before Dino pointed out that we already support ICustomTypeDescriptor in Python objects. I didn’t realize at first because I had a case sensitivity bug in my original prototype code – it turns out that “Title” != “title”.

IronPython and WPF Part 4: Background Processing

Like many apps today, my WL Spaces photo viewer is a connected app. The various WL Spaces RSS feeds that drive the app can take several seconds to download. Unless you like annoying your users, it’s a bad idea to lock up your user interface while you make synchronous network calls on your UI thread. Typically, this long running processing gets farmed out to a background thread which keeps the UI thread free to service the user events.

.NET provides a variety of mechanisms for doing long running processing on a background thread. For example, you can create a new thread, queue a work item to the ThreadPool or use the BackgroundWorker component. However, none of these are particularly pythonic, so I set out to see if I could leverage any of Python’s unique capabilities to make background processing as easy as possible. This is what I ended up with:

def OnClick(self, sender, args):
    self.DLButton.IsEnabled = False

@BGThread
def BackgroundTask(self, url):
    wc = WebClient()
    data = wc.DownloadString(Uri(url))
    self.Completed(data)

@UIThread
def Completed(self, data):
    self.DLButton.IsEnabled = True
    self._text.Text = data

By using the cool decorators feature of Python, I’m able to declaratively indicate whether I want a given method to be executed on the UI thread or on a background thread. Doesn’t get much easier than that. Even better, the implementations of BGThread and UIThread are only about twenty lines of Python code combined!

Decorators kinda look like custom .NET attributes. However, where .NET attributes are passive (you have to ask for them explicitly), decorators act as an active modifier to the functions they are attached to. In that respect, they’re kind of like aspects. Certainly, I would consider which thread a given method executes on to be a cross-cutting concern.

The Completed function above is exactly the same as if I had written the following:

def Completed(self, data):  
    self.DLButton.IsEnabled = True  
    self._text.Text = data  
Completed = UIThread(Completed)

In C#, you can’t pass a function as a parameter to another function – you have to first wrap that function in a delegate. Python, like F#, directly supports higher-order functions. This lets you easily factor common aspectual code out into reusable functions then compose them with your business logic. The decorators have no knowledge of the functions they are attached to and the code that calls those functions is written in complete ignorance of the decorators. Python goes the extra mile beyond even F# by providing the ‘@’ syntax.
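
Here’s a tiny, self-contained sketch of that equivalence in plain Python. The logged decorator is a hypothetical example (it has nothing to do with threading); it just records each call so you can see that the ‘@’ form and the manual reassignment behave the same way.

```python
calls = []


def logged(fun):
    """Hypothetical decorator: record the call, then run the function."""
    def wrapper(*args):
        calls.append(fun.__name__)
        return fun(*args)
    return wrapper


@logged
def add(a, b):
    return a + b


def multiply(a, b):
    return a * b
multiply = logged(multiply)  # same effect as writing @logged above the def
```

Callers of add and multiply can’t tell which style was used, and neither function knows it has been wrapped.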

Here are the implementations of my UIThread and BGThread decorators:

def BGThread(fun):
  def argUnpacker(args):
    # untuple the arguments and call the decorated function
    fun(*args)

  def wrapper(*args):
    ThreadPool.QueueUserWorkItem(WaitCallback(argUnpacker), args)

  return wrapper

def UIThread(fun):
  def wrapper(self, *args):
    if len(args) == 0:
      actiontype = Action1[object]
    else:
      actiontype = Action[tuple(object for x in range(len(args)+1))]

    action = actiontype(fun)
    self.dispatcher.Invoke(action, self, *args)

  return wrapper

BGThread defines a wrapper function that queues a call to the decorated function to the .NET thread pool.  UIThread defines a wrapper that marshals the call to the UI thread by using a WPF Dispatcher. I’m thinking there might be a way to use SynchronizationContext to marshal it automatically, but I haven’t tried to figure that out yet. The above approach does require a dispatcher property hanging off the class, but that’s fairly trivial to implement and seems like a small price to pay to get declarative background thread processing.

A couple of quick implementation notes:

  • The ‘*args’ syntax used in those methods above means “give me the rest of the positional arguments in a tuple”. Kinda like the C# params keyword. But that syntax also lets you pass a tuple of parameters to a function, and have them broken out into individual parameters. QueueUserWorkItem only supports passing a single object into the queued function, so I pass the tupled arguments to the argUnpacker method, which in turn untuples the arguments and calls the decorated function.
  • The System assembly includes the single parameter Action<T> delegate. The current DLR provides Action delegates with zero, two and up to sixteen parameters. However, those are in a separate namespace (remember?) and IPy seems to have an issue with importing overloaded type names into the current scope. I could have used their namespace scoped name, but instead I redefined the version from System to be called Action1.
  • To interop with .NET generic types, IPy uses the legal but rarely used Python syntax type[typeparam]. For example, to create a List of strings, you would say “List[str]()”. The type parameter is a tuple, so in UIThread I build a tuple of objects based on the number of arguments passed into wrapper (with the special case of a single type parameter using Action1 instead of Action).
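
For anyone who wants to play with the pattern outside of .NET, here’s a rough analog in plain Python. It is a sketch under loud assumptions: a threading.Thread stands in for the .NET thread pool, a queue stands in for the WPF dispatcher, and the names bg_thread, ui_thread, ui_queue and run_one_ui_callback are all made up for this example.

```python
import threading
import queue  # this module is named Queue on Python 2

ui_queue = queue.Queue()  # stand-in for the UI thread's dispatcher
results = []


def bg_thread(fun):
    def wrapper(*args):
        # *args collects the positional arguments into a tuple...
        worker = threading.Thread(target=fun, args=args)
        worker.daemon = True
        worker.start()
        return worker
    return wrapper


def ui_thread(fun):
    def wrapper(*args):
        # ...so a single object (function + tuple) can cross the thread boundary.
        ui_queue.put((fun, args))
    return wrapper


@ui_thread
def completed(data):
    results.append((threading.current_thread().name, data))


@bg_thread
def background_task(text):
    completed(text.upper())  # pretend this was a slow download


def run_one_ui_callback(timeout=5):
    # Called on the "UI" thread: take one queued call and untuple its arguments.
    fun, args = ui_queue.get(timeout=timeout)
    fun(*args)
```

Calling background_task('hello') followed by run_one_ui_callback() on the main thread leaves a single entry in results, recorded under the main thread’s name even though the work started on the worker.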

I haven’t uploaded my WL Spaces Photo Viewer app because I keep making changes to it as I write this blog post series. However, for this post I built a simple demo app so I could focus on just the threading scenario. I’ve stuck the code for that demo up on my SkyDrive, so feel free to leverage it as you need.

IronPython and WPF Background Processing Revisited

Yesterday, I blogged about using decorators to indicate if a given function should execute on the UI or background thread. While the solution works, I wrote “I’m thinking there might be a way to use SynchronizationContext to marshal it automatically, but I haven’t tried to figure that out yet.” I had some time this morning so I figured out how to use SynchronizationContext instead of the WPF dispatcher.

Leslie Sanford wrote a pretty good overview, but the short version is that SyncContext is an abstraction for concurrency management. It lets you write code that is ignorant of specific synchronization mechanisms in concurrency-aware managed frameworks like WinForms and WPF. For example, while my previous version worked fine, it was specific to WPF. If I wanted to provide similar functionality that worked with WinForms, I’d have to rewrite my decorators to use Control.Invoke. But if I port them over to use SyncContext, they would work with WinForms, WPF and any other library that plugs into SyncContext.

SyncContext abstracts away both initially obtaining the sync context as well as marshaling calls back to the UI thread. SyncContext provides a static property to access the current context, instead of a framework specific mechanism like accessing the Dispatcher property of the WPF Window class. Once you have a context, you can call Send or Post to marshal the call back to the UI thread (Send blocks the calling thread, Post doesn’t).
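
The Send versus Post distinction is easy to see in a toy model. The sketch below is plain Python and is not how .NET implements SynchronizationContext; MiniContext and its post and send methods are hypothetical names, and a single worker thread plays the role of the UI thread.

```python
import threading
import queue  # this module is named Queue on Python 2


class MiniContext(object):
    """Hypothetical, much simplified stand-in for a synchronization context."""

    def __init__(self):
        self._work = queue.Queue()
        pump = threading.Thread(target=self._pump)
        pump.daemon = True
        pump.start()

    def _pump(self):
        # The "UI thread": run queued callbacks one at a time, in order.
        while True:
            callback, state, done = self._work.get()
            try:
                callback(state)
            finally:
                if done is not None:
                    done.set()

    def post(self, callback, state):
        # Fire and forget: returns immediately.
        self._work.put((callback, state, None))

    def send(self, callback, state):
        # Blocks the caller until the callback has run on the pump thread.
        done = threading.Event()
        self._work.put((callback, state, done))
        done.wait()
```

Both methods take a callback plus a single state object, and neither hands back a return value. Note that calling send from inside a callback would deadlock this toy version, since the pump thread would be waiting on itself.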

With that in mind, here’s the new version of BGThread and UIThread. Slightly more complex, but still pretty simple clocking in at just under 30 lines.

def BGThread(fun):
  def argUnpacker(args):
    oldSyncContext = SynchronizationContext.Current
    try:
      # the last element of the tuple is the caller's SyncContext
      SynchronizationContext.SetSynchronizationContext(args[-1])
      fun(*args[:-1])
    finally:
      SynchronizationContext.SetSynchronizationContext(oldSyncContext)

  def wrapper(*args):
    args2 = args + (SynchronizationContext.Current,)
    ThreadPool.QueueUserWorkItem(WaitCallback(argUnpacker), args2)

  return wrapper

def UIThread(fun):
  def unpack(args):
    ret = fun(*args)
    if ret != None:
      import warnings
      warnings.warn(fun.__name__ + " function returned " + str(ret) + " but that return value isn't propagated to the calling thread")

  def wrapper(*args):
    if SynchronizationContext.Current == None:
      # no context to marshal to, so just call the function directly
      fun(*args)
    else:
      SynchronizationContext.Current.Send(SendOrPostCallback(unpack), args)

  return wrapper

In the BGThread wrapper, I add the current SyncContext to the parameter tuple that I pass to the background thread. Once on the background thread, I set the current SyncContext to the last element of the parameter tuple then call the decorated function with the remaining parameters. (for the non pythonic: args[:-1] is Python slicing syntax that means “all but the last element of args”). Using a try/finally block is probably overkill – I expect the current SyncContext to be either None or leftover garbage – but the urge to clean up after myself is apparently much stronger on the background thread than it is in say my office. 😄

In the UIThread wrapper, I grab the current context and invoke the decorated method via the Send method. Like QueueUserWorkItem, SyncContext Send and Post only support a single parameter, so I use the same *args trick I described in my last post. (I changed the name to unpack in the code above for blog formatting purposes)

One major caveat about this approach is that there’s no way to return a value from a function decorated as UIThread. I understand why SyncContext.Post doesn’t return a value (it’s async) but SyncContext.Send is a synchronous call, so why doesn’t it marshal the return value back to the calling thread? WPF’s Dispatcher.Invoke and WinForms’ Control.Invoke both return a value. I didn’t handle the return value in my original version of UIThread, but now that I’ve moved over to using SyncContext, I can’t. Not sure why the SyncContext is designed that way – seems like a design flaw to me. Since the return value won’t propagate, I sniff the decorated function’s return value and raise a warning if it’s not None.

I’ve uploaded the SyncContext version to my SkyDrive in case you want the code for yourself. Note, I’m thinking I’ll revise this code one more time – I want to rebuild the WPF version so that it propagates return values and picks up a dispatcher via Application.Current.MainWindow rather than having to have a dispatcher property on my class.

IronPython and WPF Part 5: Interactive Console

One of the hallmarks of dynamic language programming is the use of the interactive prompt, otherwise known as the Read-Eval-Print-Loop or REPL. Even though I’m building a WPF client application, I’d still like to have the ability to poke around and even modify the app as it’s running from the command prompt, REPL style.

If you work thru the IronPython Tutorial, there are exercises for interactively building both a WinForms and a WPF application. In both scenarios, you create a dedicated thread to service the UI so it can run while the interactive prompt thread is blocked waiting for user input. However, as we saw in the last part of this series, UI elements in both WinForms and WPF can only be accessed from the thread they are created on. We already know how to marshal calls to the correct UI thread – Dispatcher.Invoke. However, what we need is a way to intercept commands entered on the interactive prompt so we can marshal them to the correct thread before they execute.

Luckily, IronPython provides just such a mechanism: clr module’s SetCommandDispatcher. A command dispatcher is a function hook that gets called for every command the user enters. It receives a single parameter, a delegate representing the command the user entered. In the WPF and WinForms tutorials, you use this function hook to marshal the commands to the right thread to be executed. Here’s the command dispatcher from the WPF tutorial:

def DispatchConsoleCommand(consoleCommand):
    if consoleCommand:
        dispatcher.Invoke(DispatcherPriority.Normal, consoleCommand)

The dispatcher.Invoke call looks kinda like the UIThread decorator from the Background Processing part of this series, doesn’t it?

Quick aside: I looked at using SyncContext here instead of Dispatcher, since I don’t care about propagating a return value back to the interactive console thread. However, SyncContext expects a SendOrPostCallback, which takes a single object parameter. The delegate passed to the console hook function is an Action with no parameters. I could have built a wrapper function that took a single parameter which it would ignore, but I decided it wasn’t worth it. The more I look at it, the more I believe SyncContext is a good idea with a bad design.

I wrapped all the thread creation and command dispatching into a reusable helper class called InteractiveApp.

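class InteractiveApp(object):
  def __init__(self):
    self.evt = AutoResetEvent(False)

    thrd = Thread(ThreadStart(self.thread_start))
    thrd.ApartmentState = ApartmentState.STA
    thrd.IsBackground = True
    thrd.Start()

    # wait for the UI thread to signal that it's ready, then hook the console
    self.evt.WaitOne()
    clr.SetCommandDispatcher(self.DispatchConsoleCommand)

  def thread_start(self):
    try:
      self.app = Application()
      self.app.Startup += self.on_startup
      self.app.Run()
    finally:
      # the app has shut down, so stop intercepting console commands
      clr.SetCommandDispatcher(None)

  def on_startup(self, *args):
    self.dispatcher = Threading.Dispatcher.FromThread(Thread.CurrentThread)
    self.evt.Set()

  def DispatchConsoleCommand(self, consoleCommand):
    if consoleCommand:
      self.dispatcher.Invoke(DispatcherPriority.Normal, consoleCommand)

  def __getattr__(self, name):
    return getattr(self.app, name)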

The code is pretty self explanatory. The constructor (__init__) creates the UI thread, starts it, waits for it to signal that it’s ready via an AutoResetEvent and then finally sets the command dispatcher. The UI thread creates and runs the WPF application, saves the dispatcher object as a field on the object, then signals that it’s ready. DispatchConsoleCommand is nearly identical to the earlier version, I’ve just made it an instance method instead of a stand-alone function. Finally, I define __getattr__ so that any operations invoked on InteractiveApp are passed thru to the contained WPF Application instance.
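
The “start a thread, wait for it to say it’s ready, then forward everything else to the wrapped object” shape isn’t specific to WPF. Here’s a minimal sketch of it in plain Python, assuming a threading.Event in place of AutoResetEvent; FakeApplication and InteractiveHost are hypothetical names used only for this illustration.

```python
import threading


class FakeApplication(object):
    """Hypothetical stand-in for the WPF Application object."""
    title = 'photo viewer'


class InteractiveHost(object):
    def __init__(self):
        self._ready = threading.Event()
        worker = threading.Thread(target=self._thread_start)
        worker.daemon = True  # dies with the console, like IsBackground = True
        worker.start()
        self._ready.wait()    # block until the worker has created the app

    def _thread_start(self):
        self._app = FakeApplication()
        self._ready.set()     # signal the constructor that we're ready

    def __getattr__(self, name):
        # Anything the host doesn't define itself is passed thru to the app.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self._app, name)


host = InteractiveHost()
```

Because the constructor doesn’t return until the event is set, host.title is safe to read right away even though the FakeApplication was created on another thread.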

In my app.py file, I look to see if the module has been started directly or if it’s been imported into another module. If the module is run directly (aka ‘ipy app.py’) then the global __name__ variable will be ‘__main__’. In that case, we start the application up normally (i.e. without the interactive prompt) by just creating an Application then running it with a Window instance. Otherwise, we are importing this app into another module (typically, the interactive console), so we create an InteractiveApp instance and we create an easy to use run method that can create the instance of the main window.

if __name__ == '__main__':
  app = wpf.Application()
  window1 = MainWin.MainWindow()
  app.Run(window1)
else:
  app = wpf.InteractiveApp()

  def run():
    global mainwin
    mainwin = MainWin.MainWindow()

If you want to run the app interactively, you simply import the app module and call run. Here’s a sample session where I iterate thru the items bound to the first list box. Of course, I can do a variety of other operations, such as manipulating the data or creating new UI elements.

IronPython 2.0 ( on .NET 2.0.50727.3053
>>> import app
#at this point the app window launches
>>> for i in app.mainwin.allAlbumsListBox.Items:
...     print i.title
Harvest Festivals
Mrs. Gardner's Art
Riley's Playdate
August 13
Camp Days
July 14
May Photo Shoot
Summer Play 2006
Lake Washington With The Gellers
Camp Pierson '06
January 28

One small thing to keep in mind: if you exit the command prompt, the UI thread will also exit since it’s marked as a background thread. Also, it looks like you could shut the client down then call run again to restart it, but you can’t. If you shut the client down, the Run method in InteractiveApp.thread_start exits, resets the Command Dispatcher to nothing and the thread terminates. I could fix it so that you could run the app multiple times, but I find I typically only run the app once for a given session anyway.

IronPython and Linq to XML Part 1: Introduction

Shortly after I joined the VS Languages team, we had a morale event that included a Rock Band tournament. I didn’t play that day in the tournament since I had never played before, but I was hooked just the same. I got Rock Band for my birthday, Rock Band 2 shortly after it came out in September and I’m hoping to get the AC/DC Track Pack for Christmas.

There are lots of songs available for Rock Band – 461 currently available between on-disc and downloadable tracks – with more added every week. Frankly, there’s lots of music on that list that I don’t recognize. Luckily, I’m also a Zune Pass subscriber, so I can go out and download all the Rock Band tracks and listen to them on my Zune. But who has time to manually search for 461 songs? Not me. So I wrote a little Python app to download the list of Rock Band songs and save it as a Zune playlist.

I ended up using Linq to XML very heavily in this project. Zune playlists use the same XML format as Windows playlists, Zune exposes the backend music catalog via Atom feeds and I used Chris Lovett’s SgmlReader to expose the HTML list of Rock Band songs as XML. I realize Linq to XML wasn’t on “the list”, but I had a specific need so it got bumped to the head of the line.

BTW, for those who just want the playlist, I stuck it on my SkyDrive. Unfortunately, there’s no SkyDrive API right now, so I can’t automate uploading the new playlist every week. If anyone has alternative suggestions or a way to programmatically upload files to SkyDrive, let me know.

IronPython and Linq to XML Part 2: Screen Scraping

First, I need to convert the HTML list of Rock Band songs into a machine readable format. That means doing a little screen scraping. Originally, I used Beautiful Soup but I found that UnicodeDammit got confused on names like Blue Öyster Cult and Mötley Crüe. I’m guessing it’s broken because IronPython doesn’t have non-unicode strings.

Instead, I used SgmlReader to provide an XmlReader interface over the HTML, then queried that data via Linq to XML. I used the version of SgmlReader from MindTouch since they include a compiled binary and it seems to be the only actively maintained version. I wrapped it all up in a function called load that loads HTML from either disk or the network (based on the URI scheme) into an XDocument.

def loadStream(streamreader):
  from System.Xml.Linq import XDocument
  from Sgml import SgmlReader

  reader = SgmlReader()
  reader.DocType = "HTML"
  reader.InputStream = streamreader
  return XDocument.Load(reader)

def load(url):
  from System import Uri
  from System.IO import StreamReader

  if isinstance(url, str):
    url = Uri(url)

  if url.Scheme == "file":
    from System.IO import File
    with File.OpenRead(url.LocalPath) as fs:
      with StreamReader(fs) as sr:
        return loadStream(sr)
  else:
    from System.Net import WebClient
    wc = WebClient()
    with wc.OpenRead(url) as ns:
      with StreamReader(ns) as sr:
        return loadStream(sr)

def parse(text):
  from System.IO import StringReader
  return loadStream(StringReader(text))

I call load, passing in the URL to the list of songs. The “official” Rock Band song page loads the actual content from a different page via AJAX, so I just load the actual list directly via my load function.

Once the HTML is loaded as an XDocument, I need a way to find the specific HTML nodes I’m looking for. As I said earlier, XDocument uses Linq to XML – there is no other API for querying the XML tree. In the HTML, there’s a div tag with the id “content” that contains all the song rows as table row elements. I built a simple function that uses the LINQ Single method to find the tag by its id attribute value.

def FindById(node, id):
  def CheckId(n):
    a = n.Attribute('id')
    return a != None and a.Value == id

  return linq.Single(node.Descendants(), CheckId)

(Side note – I didn’t like the verbosity of the a != None and a.Value == id line of code, but XAttributes are not comparable by value. That is, I can’t write node.Attribute('id') == XAttribute('id', id). And writing node.Attribute('id').Value == id only works if every node has an id attribute. Not making XAttribute comparable by value seems like a strange design choice to me.)
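As a point of comparison, here is roughly how the same lookup might look with xml.etree.ElementTree from the standard CPython library, where a missing attribute simply comes back as None. This is a sketch, not the code I actually ran: find_by_id and the tiny sample document are made up for illustration, and unlike Descendants(), iter() also visits the root element itself.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, wanted):
    """Return the single element whose id attribute equals wanted."""
    # Element.get returns None when the attribute is missing, so there is
    # no separate "does it have an id?" check.
    matches = [n for n in root.iter() if n.get("id") == wanted]
    if len(matches) != 1:
        raise ValueError("expected exactly one element with id %r" % wanted)
    return matches[0]


# Made-up sample document, just to exercise the function.
doc = ET.fromstring(
    '<html><body><div id="header"/>'
    '<div id="content"><table/></div></body></html>')
content = find_by_id(doc, "content")
```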

LINQ to objects works just fine from IronPython, with a few caveats. First, IronPython doesn’t have extension methods, so you can’t chain calls together sequentially like you can in C#. So instead of collection.Where(…).Select(…), you have to write Select(Where(collection, …), …). Second, all the LINQ methods are generic, so you have to use the verbose list syntax (for example: Single[object] or Select[object,object]). Since Python doesn’t care about the generic types, I wrote a bunch of simple helper functions around the common LINQ methods that just use object as the generic type. Here are a few examples:

def Single(col, fun):
  return Enumerable.Single[object](col, Func[object, bool](fun))

def Where(col, fun):
  return Enumerable.Where[object](col, Func[object, bool](fun))

def Select(col, fun):
  return Enumerable.Select[object, object](col, Func[object, object](fun))
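To make the nested-call shape concrete without needing .NET, here is a pure Python sketch of the same three helpers. These are stand-ins I wrote for illustration (lowercase names, plain generators), not the wrappers above, but the calling pattern is the same: the innermost call runs first.

```python
def where(col, fun):
    """Lazily keep the items for which fun returns True."""
    return (x for x in col if fun(x))


def select(col, fun):
    """Lazily transform every item with fun."""
    return (fun(x) for x in col)


def single(col, fun):
    """Return the one matching item; fail if there are zero or several."""
    matches = [x for x in col if fun(x)]
    if len(matches) != 1:
        raise ValueError("expected exactly one match, found %d" % len(matches))
    return matches[0]


numbers = [1, 2, 3, 4, 5, 6]

# C# would chain: numbers.Where(...).Select(...)
# Without extension methods the calls nest instead.
squares_of_evens = list(
    select(where(numbers, lambda n: n % 2 == 0), lambda n: n * n))
only_five = single(numbers, lambda n: n == 5)
```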

Once I have the content node, all the songs are in tr nodes beneath it. I wrote a function called ScrapeSong that transforms a song tr node into a Song object (which I’ll talk about in the next installment of this series). I use LINQ methods Select, OrderBy and ThenBy to provide me an enumeration of Song objects, ordered by date added (descending) then artist name.

def ScrapeSong(node):
  tds = list(node.Elements(xhtml.ns+'td'))
  anchor = list(tds[0].Elements(xhtml.ns+'a'))[0]

  title = anchor.Value
  url = anchor.Attribute('href').Value
  artist = tds[1].Value
  year = tds[2].Value
  genre = tds[3].Value
  difficulty = tds[4].Value
  _type = tds[5].Value
  added = DateTime.Parse(tds[6].Value)

  return Song(title, artist, added, url, year, genre, difficulty, _type)

songs = ThenBy(OrderByDesc(
          Select(content.Elements(xhtml.ns +'tr'), ScrapeSong),
          lambda s: s.added), lambda s: s.artist)
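If you'd rather stay in plain Python for the ordering step, the same "newest first, then by artist" result can be had from two stable sorts. A minimal sketch, assuming a simplified Song with just three fields; the titles and dates below are invented sample values, not real Rock Band data.

```python
from collections import namedtuple
from datetime import date

# Simplified stand-in for the Song class (only the fields the sort needs).
Song = namedtuple("Song", "title artist added")

# Invented sample data for illustration only.
sample = [
    Song("Song A", "The Who", date(2008, 7, 15)),
    Song("Song B", "Mötley Crüe", date(2008, 7, 15)),
    Song("Song C", "Blue Öyster Cult", date(2008, 6, 1)),
]

# Python's sort is stable, so sort by the secondary key first...
by_artist = sorted(sample, key=lambda s: s.artist)
# ...then by the primary key; ties keep their artist order.
ordered = sorted(by_artist, key=lambda s: s.added, reverse=True)
```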

And that’s pretty much it. Next, I’ll iterate thru the list of songs and get the details I need from Zune’s catalog web services in order to write out a playlist that the Zune software will understand.

IronPython and Linq to XML Part 3: Consuming Atom Feeds

Now that I have my list of Rock Band songs, I need to generate a Zune playlist. I wrote that Zune just uses the WMP playlist format, but that’s not completely true. Media elements in a Zune playlist have several attributes that appear unique to Zune.

Because of Zune Pass, Zune supports the idea of streaming playlists where the songs are downloaded on demand instead of played from the local hard drive. In order to enable this, media elements in Zune playlists can have a serviceID attribute, a GUID that uniquely identifies the song on the Zune service. We also need the song’s album and duration – the Zune software summarily removes songs that don’t include the duration.

Of course, the Rock Band song list doesn’t include the Zune song service ID. It also doesn’t include the song’s album or duration. So we need a way, given the song’s title and artist (which we do have), to get its album, duration and service ID. Luckily, the Zune service provides a way to do exactly this, albeit an undocumented way. Via Fiddler2, I learned that Zune exposes a set of Atom feed web services that the UI uses when you search the marketplace from the Zune software. There are feeds to search by artist and by album, but the one we care about is the search by track. For example, here’s the track query for Pinball Wizard by The Who.

Since these feeds are real XML, I can simply use XDocument.Load to suck down the XML. Then I look for the first Atom entry element using LINQ to XML techniques similar to the ones I wrote about last time. If there are no Atom entry elements, that means that the search failed – either Zune doesn’t know about the song or it can’t find it via the Rock Band provided title and artist. Of the 461 songs on Rock Band right now, my script can find 417 of them on Zune automatically.

Of course, since the Zune data is in XML instead of HTML, finding the data I’m looking for is much easier than it was to find the Rock Band song data. Here’s the code to pull the relevant information we need out of the Zune catalog feed.

def ScrapeEntry(entry):
  id = entry.Element(atomns+'id').Value  
  length = entry.Element(zunens+'length').Value  

  d = {}  
  d['trackTitle'] = entry.Element(atomns+'title').Value  
  d['albumArtist'] = entry.Element(zunens+'primaryArtist').Element(zunens+'name').Value  
  d['trackArtist'] = d['albumArtist']  
  d['albumTitle'] = entry.Element(zunens+'album').Element(zunens+'title').Value  

  if id.StartsWith('urn:uuid:'):
    d['serviceId'] = "{" + id.Substring(9) + "}"
  else:
    d['serviceId'] = id

  m = length_re.Match(length)
  if m.Success:
    min = int(m.Groups[1].Value)
    sec = int(m.Groups[2].Value)
    d['duration'] = str((min * 60 + sec) * 1000)
  else:
    # fall back to one minute when the length can't be parsed
    d['duration'] = '60000'

  return d  

trackurl = catalogurl + song.search_string
trackfeed = XDocument.Load(trackurl)  
trackentry = First(trackfeed.Descendants(atomns+'entry'))  
track = ScrapeEntry(trackentry)

A few quick notes:

  • song.search_string returns the song title and artist as a plus delimited string, e.g. pinball+wizard+the+who. However, many Rock Band songs end in a parenthetical like (Cover Version), so I automatically strip that off for the search string.
  • duration in the Atom feed is stored like PT3M23S, which means the song is 3:23 long. The playlist file expects the song length in milliseconds, so I use a .NET regular expression to pull out the minutes and seconds and do the conversion. It’s not exact – song lengths usually aren’t a whole number of seconds – but as far as I can understand, Zune just uses that to display in the UI – it doesn’t affect playback at all.
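Both notes boil down to a little string handling, sketched here with Python's own re module rather than the .NET Regex class. The function names and the two patterns are my guesses at the general shape, not the actual search_string property or length_re, and the search string is not URL-encoded here.

```python
import re

# Assumed patterns: a trailing "(...)" suffix and the PT<min>M<sec>S format.
_PAREN_SUFFIX = re.compile(r"\s*\([^)]*\)\s*$")
_DURATION = re.compile(r"^PT(\d+)M(\d+)S$")


def search_string(title, artist):
    """Build a plus delimited query, dropping a trailing parenthetical."""
    title = _PAREN_SUFFIX.sub("", title)
    return "+".join((title + " " + artist).lower().split())


def duration_ms(length, default=60000):
    """Convert PT3M23S style lengths to milliseconds."""
    m = _DURATION.match(length)
    if m is None:
        # Same fallback as the script: pretend the song is one minute long.
        return default
    minutes, seconds = int(m.group(1)), int(m.group(2))
    return (minutes * 60 + seconds) * 1000


query = search_string("Pinball Wizard (Cover Version)", "The Who")
millis = duration_ms("PT3M23S")
```

With those inputs, query comes out as pinball+wizard+the+who and millis as 203000, i.e. 3 minutes 23 seconds.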

Now that I have a list of songs with all the relevant metadata, next time I’ll write it out into a Zune playlist file.

IronPython and Linq to XML Part 4: Generating XML

Now that I have my list of Rock Band songs and I can get the right Zune metadata for most of them, I just need to write out the playlist XML. This is very straightforward to do with the classes in System.Xml.Linq.

def GenMediaElement(song):
    try:
        trackurl = zune_catalog_url + song.search_string
        trackfeed = XDocument.Load(trackurl)
        trackentry = First(trackfeed.Descendants(atomns+'entry'))
        trk = ScrapeEntry(trackentry)
        return XElement('media', (XAttribute(key, trk[key]) for key in trk))
    except:
        # no usable match in the Zune catalog; the song is left out
        print "FAILED", song

zpl = XElement("smil",
    XElement("head",
        XElement("title", "Rock Band Generated Playlist")),
    XElement("body",
        XElement("seq", (GenMediaElement(song) for song in songs))))

settings = XmlWriterSettings()
settings.Indent = True
settings.Encoding = Encoding.UTF8
with XmlWriter.Create("rockband.zpl", settings) as xtw:
    zpl.WriteTo(xtw)

XElement’s constructor takes a name (XName to be precise) and any number of child objects. These child objects can be XML nodes (aka XObjects) or simple content objects like strings or numbers. If you pass an IEnumerable, the XElement constructor will iterate the collection and add all the items as children of the element. If you’ve had the displeasure of building an XML tree using the DOM, you’ll really appreciate XElement’s fluent interface. I was worried that Python’s significant whitespace would force me to put all the nested XElements on a single line, but luckily Python doesn’t treat whitespace inside parentheses as significant.

Creating collections in Python is even easier than it is in C#. Python supports a yield keyword, which is basically the equivalent of C#’s yield return. However, Python also supports generator expressions (the lazy cousins of list comprehensions), which are similar to F#’s sequence expressions. These are nice because you can specify a collection in a single line, rather than having to create a separate function, which is what you have to do to use yield. I have two generator expressions: (XAttribute(key, trk[key]) for key in trk) creates a collection of XAttributes, one for every item in the trk dictionary, and (GenMediaElement(song) for song in songs) generates a collection of XElements, one for every song in the song collection.
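Here is the yield function versus generator expression trade-off side by side, using ElementTree from the standard library as a stand-in for XElement. The track dictionary is a made-up two-field sample and attribute_pairs is a hypothetical helper, so treat this as a sketch of the idea rather than the playlist code itself.

```python
import xml.etree.ElementTree as ET

# Made-up sample metadata; the real dictionary comes from ScrapeEntry.
track = {"trackTitle": "Pinball Wizard", "trackArtist": "The Who"}


def attribute_pairs(d):
    """The yield version: needs its own named function."""
    for key in d:
        yield key, d[key]


pairs_from_function = list(attribute_pairs(track))

# The generator expression version: same sequence, one line, no function.
pairs_from_expression = list((key, track[key]) for key in track)

# Either sequence can feed an element's attributes.
media = ET.Element("media", dict(pairs_from_expression))
```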

Once I’ve finished building the playlist XML, I need to write it out to a file. Originally, I used Python’s built-in open function, but the playlist file had to be UTF-8 because of band names like Mötley Crüe. Zune’s software appears to always use UTF-8. In addition to setting the encoding, I also turn on indentation, so the resulting file is somewhat readable by humans.
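For what it's worth, the equivalent in standard CPython 3 is to hand the encoding to the XML writer instead of relying on open's default. A small sketch with ElementTree, assuming a throwaway temp file and a one-element document made up for the occasion:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Tiny made-up playlist fragment with a non-ASCII artist name.
seq = ET.Element("seq")
ET.SubElement(seq, "media", {"trackArtist": "Mötley Crüe"})

# Let the XML writer do the encoding so the declaration and bytes agree.
path = os.path.join(tempfile.mkdtemp(), "sample.zpl")
ET.ElementTree(seq).write(path, encoding="utf-8", xml_declaration=True)

# Read the raw bytes back to confirm what actually landed on disk.
with open(path, "rb") as f:
    raw = f.read()
```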

The playlist works great in the Zune software, but since it’s a streaming playlist there’s no easy way to automatically download all the songs and sync them to your Zune device. I expected to be able to right click on the playlist and select “download all”, but there’s no such option. Zune does have a concept called Channels where the songs from a regularly updated feed are downloaded locally and synced to the device. However, the Zune software appears to be hardcoded to only download channels from the catalog service so I couldn’t tap into that. If anyone knows how to sign up to become a Zune partner channel, please drop me a line.

So there you have it. As usual, I’ve stuck the code up on my SkyDrive. If I can remember, I’ll try and run the script once a week and upload the new playlist to my SkyDrive as well.

IronPython and LiveFX: Accessing Profiles

I recently got access to both the Windows Azure and Live Framework CTP programs. Frankly, I’m very interested in Live Mesh, so I decided to start with a simple LiveFX program. Scott (aka ScottIsAFool) at LiveSide posted a “quick and dirty” console app that pulls info from a user’s profile via LiveFx. It’s not Mesh per se, but it does use the same framework and resource model so I decided to port it to IronPython. FYI, this app won’t run unless you’ve received a LiveFx CTP token and provisioned yourself.

#Add LiveFX References
import sys
sys.path.append(r'C:\Program Files\Microsoft SDKs\Live Framework SDK\v0.9\Libraries\.Net Library')

import clr
clr.AddReference('Microsoft.LiveFX.Client')
clr.AddReference('Microsoft.LiveFX.ResourceModel')

from Microsoft.LiveFX.Client import LiveOperatingEnvironment
from Microsoft.LiveFX.ResourceModel.ProfileResource import ProfileType
from System.Net import NetworkCredential

from devhawk import linq

#get username and password from the user
uid = raw_input("Enter Windows Live ID: ")
pwd = raw_input("Enter Password: ")
creds = NetworkCredential(uid, pwd, "")

#connect and print out user's info
loe = LiveOperatingEnvironment()
loe.Connect(creds)

general = linq.Single(loe.Profiles.Entries,  
  lambda e: e.Resource.Type == ProfileType.General)

print loe.Mesh.ProvisionedUser.Name     
print loe.Mesh.ProvisionedUser.Email
print general.Resource.ProfileInfo.PersonalStatusMessage
print linq.Count(loe.Contacts.Entries)

I did modify the app slightly, reading the WLID and password off the console – I was sure I would accidentally post my personal credentials if I left them embedded in the app. Otherwise, it’s a straight port. First, I add references to the LiveFX DLLs. Since they’re not local to my script, I add the directory where they’re installed to sys.path, which lets me call clr.AddReference directly. Then I retrieve the user’s ID and password using raw_input (Python’s equivalent to Console.ReadLine). Finally, I connect to the user’s LiveOperatingEnvironment and pull their name, email address, personal status message and the number of contacts they have.
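One thing to watch when appending a Windows directory to sys.path: in an ordinary Python string literal, sequences like \t and \v are escape codes, so a folder such as v0.9 can silently turn into a vertical tab followed by 0.9. A raw string avoids that. The path below is a shortened, made-up example, not the SDK location.

```python
# Ordinary literal: "\t" becomes a tab and "\v" a vertical tab.
plain = 'C:\temp\v0.9'

# Raw literal: backslashes are kept exactly as typed.
raw = r'C:\temp\v0.9'

# The raw version is what you would actually hand to sys.path.append.
backslashes_in_plain = plain.count('\\')
backslashes_in_raw = raw.count('\\')
```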

As per the original app, I use LINQ to find the right profile as well as count the number of contacts. I was able to reuse the file I wrote for my Rock Band song list screen scraper (though I did have to add the Count function since I hadn’t needed it previously). I’ve posted this script on my SkyDrive, and it includes my most recent file.

BTW, it doesn’t appear that you can set the PersonalStatusMessage programmatically, at least not currently. I was thinking it would be cool to build an app that sets your PSM via Twitter, but the set method of PersonalStatusMessage is marked internal. In fact, all the set methods of all the profile properties I looked at are marked internal. If someone knows how to update LiveFX resource objects in the current CTP, I’d appreciate it if you dropped me a line or left me a comment.

IronPython and LiveFX: Ori’s

Ori Amiga is a Group Program Manager over in the Live Framework team whom you might have seen at PDC08 delivering the Lap Around LiveFX & Mesh Services and LiveFX Programming Model Architecture and Insights talks. And apparently, he’s an IronPython fan, as he posted a small LiveFX Python module to his blog. It’s pretty simple – it only wraps Connect and ConnectLocal – but it does collapse about ten lines of path appending, reference adding and module importing code into a single import statement. Here’s the profile access script from my last post rewritten to use Ori’s LiveOE module.

import LiveOE     
from devhawk import linq

uid = raw_input("Enter Windows Live ID: ")
pwd = raw_input("Enter Password: ")

loe = LiveOE.Connect(uid, pwd)

general = linq.Single(loe.Profiles.Entries,  
  lambda e: e.Resource.Type == LiveOE.ProfileResource.ProfileType.General)

print loe.Mesh.ProvisionedUser.Name
print loe.Mesh.ProvisionedUser.Email
print general.Resource.ProfileInfo.PersonalStatusMessage
print linq.Count(loe.Contacts.Entries)

FYI, make sure you update the sdkLibsPath in – I’m not sure where Ori has installed the LiveFX SDK, but it’s *not* in the location suggested by the read me file.

BTW, it turns out the WL Profile information is read only, which answers a question I had. However, reading the thread, it sounds like they will get around to making it read-write at some point.

IronPython and LiveFX: Raw HTTP Access

One of the cool things about the Live Framework is that while there’s a convenient .NET library available, you can use the raw HTTP interface from any platform. LiveFX data is served up over HTTP and is available in ATOM, RSS, JSON or POX formats. As I’ve already shown, you can easily use the .NET library from IronPython, but I wanted to try working with the raw HTTP interface to get a feel for that as well.

Unfortunately, it was harder than I expected it to be. The big issue is that the documentation on how to get LiveFX authorization tokens via raw HTTP is fairly sparse and occasionally contradictory. For example, there’s a whole section on Authentication and Live Framework, but it doesn’t cover this scenario. Luckily, I was able to figure it out with the help of AtomPub Project Manager LiveFX Sample, a post on Alex Feinman’s blog, a post on Emmanuel Mesas’ blog and a little groveling around with Reflector. It does appear that the auth docs are in flux – Emmanuel refers to this MSDN article as being about RPS Soap requests, but it’s actually about delegated authority. (Is MSDN reusing URLs? Bad idea.) Also, the sample code has a comment that reads “to be replaced by delegated authorization” so it looks like changes are coming. In other words, no promises on how long this code will work!

If you look at the AtomPub Project Manager sample, there’s a WindowsLiveIdentity.cs file that implements a static GetTicket method that looks similar to both the code on Alex’s blog and the implementation of GetWindowsLiveAuthenticationToken. The upshot is that there’s a WS-Trust endpoint for Windows Live. You send it a RequestSecurityToken (aka RST) message (with a couple of extra WL specific extensions) and it responds with the security token you’ll need for accessing the LiveFx HTTP endpoints.

I ported the GetTicket function over to IronPython. I’m using .NET classes like WebRequest and XmlReader, but there’s nothing fancy here so I would expect it to be easy enough to port over to the standard Python library.

def get_WL_ticket(username, password, compactTicket):
    req = WebRequest.Create(_LoginEndPoint)
    req.Method = "POST"
    req.ContentType = "application/soap+xml; charset=UTF-8"
    req.Timeout = 30 * 10000

    rst = get_RST_message(username, password, compactTicket)
    rstbytes = Encoding.UTF8.GetBytes(rst)
    with req.GetRequestStream() as reqstm:
      reqstm.Write(rstbytes, 0, rstbytes.Length)

    with req.GetResponse() as resp:
      with resp.GetResponseStream() as respstm:
        with XmlReader.Create(respstm) as reader:
          if compactTicket:
            name = "BinarySecurityToken"
            namespace = ""
          else:
            name = "RequestedSecurityToken"
            namespace = ""

          if not reader.ReadToDescendant(name, namespace):
            raise Exception("couldn't find security token element")

          reader.ReadStartElement(name, namespace)
          token = reader.ReadContentAsString()

          return Convert.ToBase64String(Encoding.UTF8.GetBytes(token))

This code simply uses a WebRequest object to post the RST message to the WS-Trust endpoint, then parses the result to find the token. get_RST_message uses standard Python string formatting to generate the RST message that gets posted to the WS-Trust endpoint. I’m not exactly sure why you need to convert the token value to a byte array and then Base64 encode it, but that’s what the sample code does so I did it too.
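Since I mentioned this should port easily to the standard Python library, here is what just the last step (UTF-8 bytes, then Base64) might look like in CPython 3. The token value is a placeholder I made up, and encode_token is a hypothetical name; the rest of the WS-Trust exchange is not shown.

```python
import base64


def encode_token(token):
    """UTF-8 encode the token text, then Base64 encode the bytes."""
    return base64.b64encode(token.encode("utf-8")).decode("ascii")


# Placeholder value standing in for a real security token.
encoded = encode_token("t=abc123")
```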

Once you have the authentication ticket, you need to download the root service endpoint document in order to get the base URL and the profiles link. Then you can download all the profiles, or you can download a specific one if you know its leet-speak identifier. LiveFX data can be downloaded in a variety of formats: ATOM, JSON, RSS or POX. You choose your format by setting the Accept and Content-Type headers.

I wrote the following functions: a generic boilerplate download function as well as specific versions for downloading JSON and POX:

def download(url, contentType, authToken):
  req = WebRequest.Create(url)
  req.Accept = contentType
  req.ContentType = contentType
  req.Headers.Add(HttpRequestHeader.Authorization, authToken)

  return req.GetResponse()  

def download_json(url, authToken):
  resp = download(url, 'application/json', authToken)
  with StreamReader(resp.GetResponseStream()) as reader:  
      data = reader.ReadToEnd()
      return eval(data)

def download_pox(url, authToken):
  resp = download(url, 'text/xml', authToken)
  return XmlReader.Create(resp.GetResponseStream())
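The header-based format selection is not .NET specific. As a rough sketch in standard CPython 3, the same request can be prepared with urllib.request; nothing is sent here, and the URL and token are placeholders I invented, not real LiveFX values.

```python
from urllib.request import Request


def build_request(url, content_type, auth_token):
    """Prepare (but do not send) a request for the given format."""
    req = Request(url)
    # The same value goes in both headers, as in the download function.
    req.add_header("Accept", content_type)
    req.add_header("Content-Type", content_type)
    req.add_header("Authorization", auth_token)
    return req


# Placeholder URL and token, for illustration only.
req = build_request("https://example.invalid/profiles",
                    "application/json", "placeholder-token")
```

Passing the result to urllib.request.urlopen would perform the actual download.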

Using JSON in Python is really easy, since I can simply eval the returned string and get back Python dictionary objects, similar to what you can do in Javascript.
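A caveat on the eval trick: it only works while the payload happens to be valid Python, and it will run whatever the server sends. JSON literals like true and null are not Python names. On a Python that ships the standard json module (2.6 or later, as far as I know), json.loads is the safer route. A small sketch with a made-up payload; only BaseUri is a field name from the code above, the other two are invented.

```python
import json

# Made-up payload; "IsReadOnly" and "Status" are invented field names.
payload = ('{"BaseUri": "https://example.invalid/", '
           '"IsReadOnly": true, "Status": null}')

# json.loads maps true/false/null to True/False/None.
data = json.loads(payload)

# eval chokes because "true" is not a Python name.
try:
    eval(payload)
    eval_worked = True
except NameError:
    eval_worked = False
```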

Here’s some code that uses the get_WL_ticket and download_json functions above to retrieve the user’s Personal Status Message:

#Get user's WL ticket
uid = raw_input("enter WL ID: ")
pwd = raw_input("enter password: ")

authToken = livefx_http.get_WL_ticket(uid, pwd, True)

#download root service document
service = livefx_http.download_json(_LiveFxUri, authToken)

#download general profile document
url = service['BaseUri'] + service['ProfilesLink'] + "/G3N3RaL"

genprofile = livefx_http.download_json(url, authToken)
print genprofile['ProfileBase']['PersonalStatusMessage']

POX is also fairly easy, though a bit more verbose than JSON. The sample code, which I have stuck on my SkyDrive, includes both POX and JSON code, so you can compare and contrast the differences.

IronPython and CodeDOM: Dynamically Compiling C# Files

As part of my series on using IronPython with WPF 1, I built an extension method in C# that does dynamic member resolution on WPF FrameworkElements. The upshot of this code is that I can write win1.listbox1 instead of win1.FindName('listbox1') when using WPF objects from Python or any DLR language. Convenient, right?

The problem with this approach is that the C# extension method gets compiled into an assembly that’s bound to a specific version of the DLR. I recently started experimenting with a more recent build of IronPython and I couldn’t load the extension method assembly due to a conflict between the different versions of Microsoft.Scripting.dll. Of course, I could have simply re-compiled the assembly against the new bits, but that would mean every time I moved to a new version of IronPython, I’d have to recompile. Worse, it would limit my ability to run multiple versions of IronPython on my machine at once. I currently have three – count ‘em, three – copies of IronPython installed: 2.0 RTM, nightly build version 46242, and an internal version without the mangled namespaces of our public CodePlex releases. Having to manage multiple copies of my extension assembly would get annoying very quickly.

Instead of adding a reference to the compiled assembly, what if I could add a reference to a C# file directly? Kinda like how adding references to Python files works, but for statically compiled C#. That would let me write code like the following, which falls back to adding a reference to the C# file directly if adding a reference to the compiled assembly fails.

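  try:
    clr.AddReference(extension_dll)  # extension_dll: the pre-compiled assembly
  except:
    import codedom
    codedom.add_reference_cs_file(extension_cs, ['System', 'WindowsBase',  # extension_cs: its C# source
      'PresentationFramework', 'PresentationCore', 'Microsoft.Scripting'])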

Since this technique uses CodeDOM, I decided to encapsulate the code in a Python module named codedom, which is frankly pretty simple. As a shout-out to my pals on the VB team, I broke compiling out into its own separate function so I could easily support adding VB as well as C# files.

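import clr
from System.Reflection import Assembly
from System.CodeDom.Compiler import CompilerParameters
from Microsoft.CSharp import CSharpCodeProvider
from Microsoft.VisualBasic import VBCodeProvider

def compile(prov, file, references):
  cp = CompilerParameters()
  cp.GenerateInMemory = True
  for ref in references:
    # load by partial name just to discover the assembly's full path
    a = Assembly.LoadWithPartialName(ref)
    cp.ReferencedAssemblies.Add(a.Location)
  cr = prov.CompileAssemblyFromFile(cp, file)
  if cr.Errors.Count > 0:
    raise Exception(cr.Errors)
  return cr.CompiledAssembly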

def add_reference_cs_file(file, references):
  clr.AddReference(compile(CSharpCodeProvider(), file, references))

def add_reference_vb_file(file, references):
  clr.AddReference(compile(VBCodeProvider(), file, references))

The compile function uses a CodeDOM provider, which provides a convenient function to compile an assembly from a single file. The only tricky part was adding the references correctly. Of the five references in this example, the only one CodeDOM can locate automatically is System.dll. For the others, it appears that CodeDOM needs the full path to the assembly in question.

Of course, hard-coding the assembly paths in my script would be too fragile, so instead I use partial names. I load each referenced assembly via Assembly.LoadWithPartialName then pass its Location to the CodeDOM provider via the CompilerParameters object. I realize that loading an assembly just to find its location is kind of overkill, but a) I couldn’t find another mechanism to locate an assembly’s location given only a partial name and b) I’m going to be loading the referenced assemblies when I load the generated assembly anyway, so I figured loading them to find their location wasn’t a big deal. Note that while you typically pass a string to clr.AddReference, it can also accept an assembly object directly.

Of course, this approach isn’t what you would call “fast”. Loading the pre-compiled assembly is much, much faster than compiling the C# file on the fly. But I figure slow code is better than code that doesn’t work at all. Besides, the way the code is written, I only take the extra compile hit if the pre-compiled assembly won’t load.

I stuck my file up on my SkyDrive. Feel free to leverage as you need.

  1. I had to put that series on the back burner in part because the December update to Windows Live totally broke my WPF photo viewing app. I’ve got a new WPF app I’m working on, but I’m not quite ready to blog about it yet.

IronPython and IronRuby CTPs for .NET 4.0 Beta 2

In case you’ve been hiding under a rock (or maybe just aren’t tracking developments in the .NET community outside of IronPython), Microsoft released Visual Studio 2010 beta 2 this week. Of course for me personally, the most important feature in Visual Studio 2010 is C# 4.0’s new dynamic type (also available in Visual Basic, but since VB already supported some level of late binding it’s not exactly “new” to VB).

For those of you who want to experiment with this cool new feature, may I present IronPython 2.6 CTP for .NET 4.0 Beta 2. If you can’t think of any cool things to try with this new feature, the VB team blog has some scenarios to get you started.

Also available: IronRuby CTP for .NET 4.0 Beta 2 if you’re more into gemstones than snakes.

These are preview releases, which means they’ve gone thru basic testing. If you find any bugs, PLEASE report them via the usual channel. As I wrote in my Post 2.6 Roadmap post, “we are committed to shipping the RTM of our .NET 4.0 version the day that Visual Studio 2010 is publicly available” but that means shaking out the bugs between now and then. We need your help so we’re ready to go by Visual Studio 2010 launch – March 22, 2010 as per Soma’s blog.

BTW, Alcides Fonseca suggested we call this release “IronPython 2.6 N4” since it’s designed to run on .NET Framework 4.0. I like that. What do you think?