Syntax Highlighting TextBoxes in WPF – A Sad Story

One of the big new features in VS 2010 is the WPF based editor. With it, you can build all sorts of cool stuff like controlling the visualization of XML doc comments, changing how IntelliSense looks, even scaling the size of text based on the location of the caret. Huzzah for the WPF Visual Studio editor!

However, as wicked awesome as the new editor is, AFAIK it’s not going to be released as a separate component. So while the PowerShell, Intellipad and other teams inside Microsoft can reuse the VS editor bits, nobody else can. So if you want to do something like embed a colorizing REPL in your WPF app, you’ll have to use something else.

I’ve thought about putting a WPF based UI on top of ipydbg (though now I’d probably use the new lightweight debugger instead). So I downloaded John’s repl-lib code to see how he was doing it. Turns out his REPL control is essentially a wrapper around WPF’s RichTextBox control. It works, but it seems kinda kludgy. For example, the RichTextBox supports bold, italics and underline hotkeys, so John’s REPL does too. Though it is possible to turn off these formatting commands, I decided to take a look at modifying how the plain-old TextBox renders. After all, WPF controls are supposed to be lookless, right?

Well, apparently not all the WPF controls are lookless. In particular, for the purposes of this post, the TextBox is definitely NOT lookless. It looks like the text editing capabilities of TextBox are provided by the System.Windows.Documents.TextEditor class while the text rendering is provided by the System.Windows.Controls.TextBoxView class. Both of those classes are internal, so don’t even think about trying to customize or reuse them.

The best (and I use that term loosely) way I found for customizing the TextBox rendering was a couple of articles on CodeProject by Ken Johnson. Ken’s CodeBox control inherits from TextBox and sets the Foreground and Background to transparent (to hide the result of TextBoxView) and then overrides OnRender to render the text with colorization. Rendering the text twice (once transparently and once correctly) seems like a better solution than using the RichTextBox, but it’s still pretty kludgy. (Note, I’m calling the TextBox design kludgy. Ken’s code is a pretty good workaround.)

So if you want a colorized text box in WPF, your choices are:

  • Build your own class that inherits from RichTextBox, disabling all the formatting commands and handling the TextChanged event to do colorization
  • Build your own class that inherits from TextBox, but set Foreground and Background colors to transparent and override OnRender to do the visible text rendering.
  • Use a 3rd party control. The only one I found was the AqiStar TextBox. No idea how good it is, but they claim to be a true lookless control. Any other syntax highlighting WPF controls around that I don’t know about?

Microsoft.Scripting.Debugging

If you’ve compiled IronPython from source recently, you may have noticed a new DLL: Microsoft.Scripting.Debugging. This DLL contains a lightweight, non-blocking debugger for DLR based languages that is going to enable both new scenarios as well as better compatibility with CPython. Needless to say, we’re very excited about it.

When I was actively working on my ipydbg series, I got several emails asking about using it in an embedded scripting scenario. Unfortunately, the ipydbg approach doesn’t work very well in the embedded scripting scenario. ipydbg uses ICorDebug and friends, which completely blocks the application being debugged. This means your debugger has to run in a separate process. So either you run your debugger in your host app process and your scripts in a separate process, or you run your debugger in a separate process debugging both the scripts and the host app. Neither option is very appealing.

Now with the DLR Debugger, you can run all three components in the same process. I think of the DLR debugger as a “cooperative” debugger in much the same way that Windows 3.x supported cooperative multitasking. It’s also known as trace or traceback debugging. Code being debugged yields to the debugger at set points during its execution. The debugger then does whatever it wants, including showing UI and/or letting the developer inspect or modify program state. When the debugger returns, execution of the original code continues until the next set point wherein the process repeats itself.

The primary point of entry for the DLR Debugger is the DebugContext class. Of note there is the TransformLambda method, which takes a normal DLR LambdaExpression and transforms it into a cooperatively debugged LambdaExpression. LambdaExpressions can contain DebugInfoExpressions; typically we insert one at the start of every Python code line as well as one at the end of the function. When we run IronPython in debug mode (i.e. -D), those get turned into sequence points as we saw back when I was working on ipydbg. When using the DLR Debugger, those DebugInfoExpressions are instead transformed into calls out to IDebugCallback.OnDebugEvent. The DLR Debugger implements the IDebugCallback interface on the TracePipeline class, which also implements ITracePipeline. In OnDebugEvent, TracePipeline calls out to an ITraceCallback instance you provide. The extra layer of indirection means you can change your traceback handler without having to regenerate the debuggable version of your functions.
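To make that layer of indirection concrete, here is a minimal plain-Python sketch. Every name in it (TracePipelineSketch, make_debuggable, on_debug_event) is a hypothetical stand-in made up for illustration, not the DLR Debugger's actual API. It only shows the shape of the idea: the "transformed" code is bound to the pipeline once, and the pipeline forwards each event to whatever callback is currently installed.

```python
# Hypothetical stand-ins for illustration; NOT the real DLR Debugger classes.
class TracePipelineSketch(object):
    """Forwards debug events to whichever callback is currently installed."""

    def __init__(self):
        self.callback = None  # can be swapped at any time

    def on_debug_event(self, line_number):
        if self.callback is not None:
            self.callback(line_number)


def make_debuggable(steps, pipeline):
    """'Transform' a list of (line_number, action) pairs exactly once.

    The returned function calls out to the pipeline before each step,
    much like a debug marker placed at the start of every code line.
    """
    def run():
        for line_number, action in steps:
            pipeline.on_debug_event(line_number)  # yield to the debugger
            action()
    return run


pipeline = TracePipelineSketch()
output = []
program = make_debuggable(
    [(1, lambda: output.append("a")), (2, lambda: output.append("b"))],
    pipeline)

seen_by_first, seen_by_second = [], []

pipeline.callback = seen_by_first.append
program()

# Swap the handler; the debuggable function itself is not regenerated.
pipeline.callback = seen_by_second.append
program()
```

Both handlers see the same sequence of line numbers even though program was only built once, which is the point of routing events through the pipeline instead of binding the handler directly into the generated code.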

Of course, we hide all this DLR Debugger goo from you in IronPython. Python already has a mechanism for doing traceback debugging: sys.settrace. Our ITraceCallback, PythonTracebackListener, wraps the DLR Debugger API to expose the sys.settrace API. That makes this feature a twofer: new capability for IronPython plus better compatibility with CPython. Instead of needing a custom tool (i.e. ipydbg) you can now use PDB from the standard Python library (modulo bugs in our implementation). I haven’t been working on ipydbg recently since you’ll be able to use PDB soon enough.
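For anyone who hasn't used sys.settrace, here is a small sketch of what a trace function looks like. It uses only the standard sys.settrace contract from CPython; the add function and the events list are made up for the example, and how closely a given IronPython build matches this behavior is something to verify rather than assume.

```python
import sys


def add(a, b):
    total = a + b
    return total


events = []


def tracer(frame, event, arg):
    # The global trace function is called once per new frame ("call").
    # Returning a function makes it the local tracer for that frame,
    # which then receives "line" and "return" events.
    if frame.f_code.co_name != "add":
        return None  # not interested in other frames
    events.append(event)
    return tracer


sys.settrace(tracer)
try:
    result = add(2, 3)
finally:
    sys.settrace(None)  # always uninstall the trace function
```

The traced code pauses inside tracer at each event, which is the cooperative hand-off described above; a real debugger such as PDB does its breakpoint checks and UI work in that callback before returning.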

For those hosting IronPython, we also have a couple of static extension methods in our hosting API (look for the SetTrace functions in IronPython\Hosting\Python.cs). These are simply wrappers around sys.settrace, so you get the same API regardless of whether you access it from inside IronPython or from the hosting API. But if you’re hosting IronPython in a C# application, those extension methods are very convenient to use.

This debugger will be in our regular releases of IronPython as of 2.6 beta 2 which is scheduled to drop at the end of this month. For those who just can’t wait, it’s available as source code starting with yesterday’s changeset. Please let us know what you think!

Add-Bcd-Vhd.ps1

I LOVE the new boot from VHD feature in Win7. I am primarily using it for doing some VS 2010 dogfooding without messing up my primary drive partition. But man, the process for setting up a VHD for booting is brutal. Scott Hanselman did a great job laying out the steps, but I wanted something a bit more productive.

First, I created a clean Win7 RC VHD and zipped it up for easy storage. The basic Win7 RC VHD is just under 5GB, but compresses down to about 1.5GB with 7-zip. I used the ImageX process Aviraj described, though in the future I’ll use the Install-WindowsImage script. Install-WindowsImage is more convenient to use because it will list the indexes within a given .wim file instead of making you grovel thru an XML file like ImageX does. Also, Install-WindowsImage is a 27 KB download while ImageX is part of the 1.4 gigabyte Windows Automated Installation Kit. Look, I’m not hurting for bandwidth, but I don’t see the point of downloading 54442 times more data for a utility that isn’t as useful.

Once you’ve created the VHD, you need to update your Boot Configuration Data, or BCD for short, using the appropriately named BCDEdit utility. The process is fairly straightforward, if tedious. You have to run BCDEdit four times, copy the configuration GUID to the clipboard and type out the path to the VHD in a slightly funky syntax. Blech. So I built a PowerShell script to automate updating the BCD, called add-bcd-vhd. You can get it from my SkyDrive. Pass in the name of the BCD entry and the path to the VHD and add-bcd-vhd will do the rest.
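The manual steps can be sketched roughly as follows. This is an illustration in Python rather than the PowerShell script itself, and it only builds the command lines instead of running them. It assumes the commonly documented sequence (bcdedit /copy to clone the current entry, then /set for device, osdevice and detecthal) and assumes /copy prints the new entry's GUID in braces; the function names and the sample GUID are made up.

```python
import re

# A GUID in braces, e.g. {01234567-89ab-cdef-0123-456789abcdef}
GUID_PATTERN = re.compile(
    r"\{[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}")


def copy_command(entry_name):
    """First bcdedit call: clone the current boot entry under a new name."""
    return ["bcdedit", "/copy", "{current}", "/d", entry_name]


def parse_new_entry_guid(copy_output):
    """Pull the new entry's GUID out of the text printed by the copy step."""
    match = GUID_PATTERN.search(copy_output)
    if match is None:
        raise ValueError("no GUID found in bcdedit output")
    return match.group(0)


def vhd_boot_commands(guid, vhd_path):
    """Remaining three bcdedit calls that point the new entry at the VHD."""
    if len(vhd_path) < 3 or vhd_path[1] != ":":
        raise ValueError("expected a path like C:\\vhds\\win7.vhd")
    # The 'funky syntax': the drive goes in brackets, then the rest of the path.
    vhd_spec = "vhd=[%s]%s" % (vhd_path[:2], vhd_path[2:])
    return [
        ["bcdedit", "/set", guid, "device", vhd_spec],
        ["bcdedit", "/set", guid, "osdevice", vhd_spec],
        ["bcdedit", "/set", guid, "detecthal", "on"],
    ]


# Hypothetical output text and GUID, for illustration only.
sample_output = ("The entry was successfully copied to "
                 "{01234567-89ab-cdef-0123-456789abcdef}.")
guid = parse_new_entry_guid(sample_output)
commands = [copy_command("Win7 VHD")] + vhd_boot_commands(
    guid, r"C:\vhds\win7.vhd")
```

Each inner list could be handed to subprocess.check_output from an elevated prompt, but changing boot configuration is risky enough that printing the commands and checking them first seems like the saner default.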

I was whining on Twitter yesterday that there are no PowerShell-specific tools for managing the BCD data. Add-bcd-vhd just runs bcdedit behind the scenes and processes the text output with regular expressions. Ugly, but effective. I decided to spend some time trying to access the BCD data from its WMI provider, but that turned out to be way too much of a hassle to be effective. If someone else out there knows how to use the BCD WMI provider from PowerShell, I’d appreciate some sample code.

__clrtype__ Metaclasses: Named Attribute Parameters

In my last post, I added support for custom attribute positional parameters. To finish things off, I need to add support for named parameters as well. Custom attributes support named parameters for public fields and settable properties. It works kind of like C# 3.0’s object initializers. However, unlike object initializers, the specific fields and properties to be set on a custom attribute as well as their values are passed to the CustomAttributeBuilder constructor. With six arguments, five of which are arrays, it’s kind of an ugly constructor. But luckily, we can hide it away in the make_cab function by using Python’s keyword arguments feature.

import clr
from System.Reflection.Emit import CustomAttributeBuilder

def make_cab(attrib_type, *args, **kwds):
  clrtype = clr.GetClrType(attrib_type)
  # find the constructor matching the CLR types of the positional arguments
  argtypes = tuple(map(lambda x:clr.GetClrType(type(x)), args))
  ci = clrtype.GetConstructor(argtypes)

  # each pair is ([member infos], [values]), kept in matching order
  props = ([],[])
  fields = ([],[])

  for kwd in kwds:
    # a named parameter can be a property or a field; check property first
    pi = clrtype.GetProperty(kwd)
    if pi is not None:
      props[0].append(pi)
      props[1].append(kwds[kwd])
    else:
      fi = clrtype.GetField(kwd)
      if fi is not None:
        fields[0].append(fi)
        fields[1].append(kwds[kwd])
      else:
        raise Exception("No %s Member found on %s" % (kwd, clrtype.Name))

  return CustomAttributeBuilder(ci, args,
    tuple(props[0]), tuple(props[1]),
    tuple(fields[0]), tuple(fields[1]))

def cab_builder(attrib_type):
  # pass both positional and named arguments thru to make_cab
  return lambda *args, **kwds:make_cab(attrib_type, *args, **kwds)

You’ll notice that make_cab now takes a third parameter, in addition to the attribute type and the tuple of positional arguments we saw last post. This third parameter “**kwds” is a dictionary of named parameters. Python supports both positional and named parameter passing, as VB has for a while and C# will in 4.0. However, this **kwds parameter contains all the extra or leftover named arguments that were passed in but didn’t match any of the function’s declared parameters. Think of it like the params of named parameters.
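If **kwds is new to you, this tiny plain-Python sketch shows where the leftover arguments end up. The describe function is made up for the example and has nothing to do with the CLR.

```python
def describe(attrib_name, *args, **kwds):
    # attrib_name: the one declared parameter
    # args: tuple of any extra positional arguments
    # kwds: dict of named arguments that matched no declared parameter
    return attrib_name, args, kwds


name, positional, named = describe(
    "XmlRoot", "product", Namespace="http://samples.devhawk.net")
```

Here "product" lands in the positional tuple and Namespace lands in the dictionary, which is what lets make_cab treat the two kinds of attribute parameters separately.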

As I wrote earlier, custom attributes support setting named values of both fields and properties. We don’t want the developer to have to know if a given named parameter is a field or property, so make_cab iterates over all the named parameters, checking first to see if each one is a property and then if it’s a field. It keeps a list of all the field / property infos as well as their associated values. Assuming all the named parameters are found, those lists are converted to tuples and passed into the CustomAttributeBuilder constructor.

In addition to the change to make_cab, I also updated cab_builder slightly in order to pass the **kwds parameter on thru to the make_cab function. No big deal. So now, I can add an attribute with named parameters to my IronPython class and it still looks a lot like a C# attribute specification.

clr.AddReference("System.Xml")
from System.Xml.Serialization import XmlRootAttribute
from System import ObsoleteAttribute, CLSCompliantAttribute
Obsolete = cab_builder(ObsoleteAttribute)
CLSCompliant = cab_builder(CLSCompliantAttribute)
XmlRoot = cab_builder(XmlRootAttribute)

class Product(object):
  __metaclass__ = ClrTypeMetaclass
  _clrnamespace = "DevHawk.IronPython.ClrTypeSeries"
  _clrclassattribs = [
    Obsolete("Warning Lark's Vomit"),
    CLSCompliant(False),
    XmlRoot("product", Namespace="http://samples.devhawk.net")]

  # remainder of Product class omitted for clarity

As usual, sample code is up on my SkyDrive.

Now that I can support custom attributes on classes, it would be fairly straightforward to add them to methods, properties, etc as well. The hardest part at this point is coming up with a well designed API that works within the Python syntax. If you’ve got any opinions on that, feel free to share them in the comments, via email, or on the IronPython mailing list.

__clrtype__ Metaclasses: Positional Attribute Parameters

The basic infrastructure for custom attributes in IronPython is in place, but it’s woefully limited. Specifically, it only works for custom attributes that don’t have parameters. Of course, most of the custom attributes that you’d really want to use require additional parameters, of either the positional or named variety. Since positional parameters are easier, let’s start with them.

Positional parameters get passed to the custom attribute’s constructor. As we saw in the previous post, you need a CustomAttributeBuilder to attach a custom attribute to an attribute target (like a class). Previously, I just needed to know the attribute type since I was hard coding the positional parameters. But now, I need to know both the attribute type as well as the desired positional parameters. I could have built a custom Python class to track this information, but it made much more sense just to use CustomAttributeBuilder instances. I built a utility function make_cab to construct the CustomAttributeBuilder instances.

def make_cab(attrib_type, *args):
  argtypes = tuple(map(lambda x:clr.GetClrType(type(x)), args))
  ci = clr.GetClrType(attrib_type).GetConstructor(argtypes)
  return CustomAttributeBuilder(ci, args)

from System import ObsoleteAttribute

class Product(object):
  __metaclass__ = ClrTypeMetaclass
  _clrnamespace = "DevHawk.IronPython.ClrTypeSeries"
  _clrclassattribs = [make_cab(ObsoleteAttribute, "Warning Lark's Vomit")]

  # remaining Product class definition omitted for clarity

In make_cab, I build a tuple of CLR types from the list of positional arguments that was passed in. If you haven’t seen the *args syntax before, it works like C#’s params keyword: any extra arguments are passed into the function as a tuple named args. I use Python’s built-in map function (FP FTW!) to build a tuple of CLR types of the provided arguments, which I then pass to GetConstructor. Previously, I passed an empty tuple to GetConstructor because I wanted the default constructor. If you don’t pass any positional arguments, you still get the default constructor. Once I’ve found the right constructor, I pass it and the original tuple of arguments to the CustomAttributeBuilder constructor.
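Here is the same map-over-*args trick in plain Python, with the built-in type function standing in for clr.GetClrType so it runs outside IronPython. The arg_types helper is a made-up name for the sketch.

```python
def arg_types(*args):
    # Map each positional argument to its type; in make_cab the mapped
    # function is clr.GetClrType(type(x)) instead of plain type.
    return tuple(map(type, args))


one_string = arg_types("Warning Lark's Vomit")
string_and_bool = arg_types("message", True)
no_arguments = arg_types()
```

With no positional arguments the result is an empty tuple, which is why calling make_cab with just the attribute type still finds the default constructor.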

One major benefit of this approach is that it simplifies the metaclass code. Since _clrclassattribs is now a list of CustomAttributeBuilders, now I just need to iterate over that list and call SetCustomAttribute for each.

if hasattr(cls, '_clrclassattribs'):
  for cab in cls._clrclassattribs:
    typebld.SetCustomAttribute(cab)

The only problem with this approach is that specifying the list of custom attributes is now extremely verbose. Not only am I specifying the full attribute class name as well as the positional arguments, I’m also having to insert a call to make_cab. Previously, it kinda looked like a C# custom attribute, albeit in the wrong place. Not anymore. So I decided to write a function called cab_builder to generate less verbose calls to make_cab:

def cab_builder(attrib_type):
  return lambda *args:make_cab(attrib_type, *args)

from System import ObsoleteAttribute
Obsolete = cab_builder(ObsoleteAttribute)

class Product(object):
  __metaclass__ = ClrTypeMetaclass
  _clrnamespace = "DevHawk.IronPython.ClrTypeSeries"
  _clrclassattribs = [Obsolete("Warning Lark's Vomit")]

  # remaining Product class definition omitted for clarity

The cab_builder function returns an anonymous lambda function that closes over the attrib_type variable. Python lambdas are just like C# lambdas, except that they only support expressions [1]. The result of calling the lambda returned from cab_builder is exactly the same as calling make_cab directly, but less verbose. And since I named the function returned from cab_builder Obsolete, now my list of class custom attributes looks exactly like it does in C# (though still in a different place). As usual, the code is up on my SkyDrive.
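To see the closure at work without any CLR types involved, here is a plain-Python sketch. The make_cab below is a stand-in that just records its arguments, and the ObsoleteAttribute class is an empty placeholder for System.ObsoleteAttribute; only cab_builder matches the code above.

```python
def make_cab(attrib_type, *args):
    # Stand-in for the real make_cab: record what was passed in
    # instead of building a CustomAttributeBuilder.
    return (attrib_type, args)


def cab_builder(attrib_type):
    # The lambda remembers attrib_type from this call (a closure).
    return lambda *args: make_cab(attrib_type, *args)


class ObsoleteAttribute(object):
    """Empty placeholder for System.ObsoleteAttribute."""


Obsolete = cab_builder(ObsoleteAttribute)

via_builder = Obsolete("Warning Lark's Vomit")
direct = make_cab(ObsoleteAttribute, "Warning Lark's Vomit")
```

The two calls produce the same value, so Obsolete(...) is purely a nicer spelling of make_cab(ObsoleteAttribute, ...).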

If you’re only using the attribute once like this, it is kind of annoying to first declare the cab_builder function. If you wanted to you could iterate over the types in a given assembly, looking for ones that inherit from Attribute and generate the cab_builder call dynamically. However, I’m not sure how performant that would be. Another possibility would be to iterate over the types in a given assembly and generate a Python module on disk with the calls to cab_builder. Then, you’d just have to import this module of common attributes but still be able to include additional calls to cab_builder as needed.
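The generate-a-module idea could look something like the sketch below. It only covers the text generation half: it assumes you already have a list of (namespace, type name) pairs from somewhere (for instance from reflecting over an assembly, which isn't shown), and the cab_helpers module name is a hypothetical home for cab_builder.

```python
SUFFIX = "Attribute"


def short_name(type_name):
    # ObsoleteAttribute -> Obsolete, mirroring the C# naming shortcut
    if type_name.endswith(SUFFIX) and len(type_name) > len(SUFFIX):
        return type_name[:-len(SUFFIX)]
    return type_name


def generate_attribute_module(attribute_types):
    """Return Python source defining one cab_builder wrapper per attribute.

    attribute_types is an iterable of (namespace, type_name) pairs.
    """
    # 'cab_helpers' is a hypothetical module assumed to define cab_builder.
    lines = ["from cab_helpers import cab_builder", ""]
    for namespace, type_name in sorted(attribute_types):
        lines.append("from %s import %s" % (namespace, type_name))
        lines.append("%s = cab_builder(%s)"
                     % (short_name(type_name), type_name))
    return "\n".join(lines) + "\n"


source = generate_attribute_module([
    ("System", "ObsoleteAttribute"),
    ("System.Xml.Serialization", "XmlRootAttribute"),
])
```

Writing source out to a .py file gives you a module of ready-made wrappers to import, and the generation cost is paid once up front instead of every time a script starts.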


  1. The lack of statement lambdas in Python is one of my few issues with the language.