Pygments for WL Writer v1.0.1

I just replaced the original v1.0.0 Pygments for WL Writer installer with a new and improved v1.0.1. The original URL still works – I archived the old version off with a new name. Updated source is available on on GitHub.

The only change is that I now override OnSelectedContentChanged in the sidebar control. That way, if I have multiple blocks of pygmented code in a given post, the sidebar UI updates with the correct language and color scheme of the currently selected code block.

Writing an IronPython Debugger: REPL Console

While I was ~~banging my head against a wall~~ experimenting with understanding how CorValue extraction worked, I found myself wanting to dink around with the debugger objects in a REPL console. One of IronPython’s core strengths is support for “exploratory programming” via the REPL. It turned out bringing a REPL to ipydbg was quite simple.

Python includes two built-in features that making DIY REPL quite easy: compile and exec (though technically, exec is a statement, not a function). As you might assume from their names, compile converts a string into what Python calls a code object while exec executes a code object in a given scope. Technically, exec can accept a string so I could get by without using compile. However, if you’re compiling a single interactive statement compile can automatically insert a print statement if you’ve passed in a an expression. In other words, if you type in “2+2” on the console it will print “4”, which is the behavior I wanted.

Here’s what my REPL console code look like. I love that it’s only 20 lines of code.

@inputcmd(_inputcmds, ConsoleKey.R)
def _input_repl_cmd(self, keyinfo):
  with CC.Gray:
    print "nREPL ConsolenPress Ctl-Z to Exit"
    cmd = ""
    _locals = {'self': self}

    while True:
      Console.Write(">>>" if not cmd else "...")

      line = Console.ReadLine()
      if line == None:
        break

      if line:
        cmd = cmd + line + "n"
      else:
        try:
          if len(cmd) > 0:
            exec compile(cmd, "<input>", "single") in globals(),_locals
        except Exception, ex:
          with CC.Red: print type(ex), ex
        cmd = ""

It’s pretty straightforward. I set up a dictionary to act as the local variable scope for the code that gets executed. I’m just reusing the current global scope, but I want the local scope to start with only the reference to the current IPyDebugProcess instance which is passed into _input_repl_cmd as “self”. All the other local variables like cmd and line won’t be available to the REPL code. Then I drop into a loop where I read lines from the console and execute them.

In order to support multi-line statements, I build up the cmd variable over multiple line inputs and I don’t execute it until the user inputs an empty line. In the standard Python console, it can recognize single line statements and execute them immediately. Dino showed me how to use the IronPython parser to do the same thing, but I haven’t implemented that in ipydbg yet. To exit the REPL loop, you type Ctl-Z, which returns None (aka null) from ReadLine instead of the empty string.

Since I never execute the code more than once, I have my exec and compile statements together on a single line. Compile takes the string to be compiled, the name of the file it came from (I’m using <input> for this) and the kind of code. Passing in “single” for the kind of code adds the auto-expression-print functionality I mentioned above. Then I exec the code object that’s returned in specified scope I’m managing for this instance of the REPL loop. If you exit out of the REPL and re-enter it, you get a fresh new copy of the local scope so any functions or variables you define in the last REPL are gone.

Runtime execution of code into a given scope is a hallmark of dynamic languages, but I’m still fairly green when it comes to Python so it took me a while to figure this out. Python code executes in a given scope, a combination of global and local variables. When you’re in the ipy.exe REPL, you’re at top level scope anyway, so global and local scope are the same – if you add something to global scope, it shows up in local scope and vis-versa. Inside a function, you’ll have the same global scope, but the local scope will be different and changes to one won’t be reflected in the other. The ipydbg REPL isn’t a function per-se, but it does provide an explicit local scope that gets disposed when you exit the REPL.

While having a debugger REPL is really convenient for prototyping new ipydbg commands, it’ll really shine once I get function evaluation working. Then I’ll be able to open a REPL console where the commands are executed in the target process instead of the debugger process as they are now. That will be very cool. Until then, the latest code is – as always – up on GitHub.

Writing an IronPython Debugger: Getting Arguments

It’s a small update, but I added support for displaying method arguments along side the local variables. As I mentioned in that post, breaking out the CorValue extraction and display code into a shared function was a good idea – adding support for getting arguments was trivial since I could reuse that code.

Because there’s no hierarchy of scopes to deal with and the names are in the metadata instead of debug symbols, getting arguments is much easier than getting local variables.

def get_arguments(frame):
    mi = frame.GetMethodInfo()
    for pi in mi.GetParameters():
      if pi.Position == 0: continue
      arg = frame.GetArgument(pi.Position - 1)
      yield pi.Name, arg

You’ll notice that I’m yielding the arguments as a tuple of the name and value, the same as get_locals yields. I did refactor get_locals a bit – there’s no longer an argument to skip hidden variables anymore (though get_locals still skips dynamic call sites caches as it did before). Now, it’s up to the the caller of get_arguments and get_locals to filter hidden variables as they see fit.

Because get_locals and get_arguments yield the same types, I was able to factor the code to print a value and loop through the collection of values into separate local functions.

@inputcmd(_inputcmds, ConsoleKey.L)  
def _input_locals_cmd(self, keyinfo):  
  def print_value(name, value):  
    display, type_name = display_value(extract_value(value))  
    with CC.Magenta: print "  ", name,
    print display,  
    with CC.Green: print type_name  

  def print_all_values(f, show_hidden):  
      count = 0  
      for name,value in f(self.active_thread.ActiveFrame):  
        if name.startswith("$") and not show_hidden:  
          continue  
        print_value(name, value)  
        count+=1
      return count  

  print "nLocals"  
  show_hidden =  
    (keyinfo.Modifiers & ConsoleModifiers.Alt) == ConsoleModifiers.Alt  
  count = print_all_values(get_locals, show_hidden)  
  count += print_all_values(get_arguments, show_hidden)  

  if count == 0:  
      with CC.Magenta: print "  No Locals Found"

I really like the local functions feature of Python. In C#, you can define an anonymous delegate using the lambda syntax. But for a scenario like this, I like local functions better. However, I do like C#’s support for statement lambdas – Python only supports expression lambdas. So while I like local functions better in this scenario (because I’m using the method more than once) in something like an event handler, I like the statement lambda syntax better.

As usual, the latest version of ipydbg is up on GitHub.

Pygments for Windows Live Writer

For the past few years, I’ve used the CodeHTMLer plugin for Windows Live Writer for the code snippets in my blog. However, recently I discovered the Pygments Python syntax highlighter package which supports scores more languages than CodeHTMLer does. It also support multiple color schemes and was easily extensible so I could build an HTML formatter that didn’t use <pre> tags (which I’ve found DasBlog has issues with in the RSS feed, though honestly I’m running three minor releases behind the latest DasBlog release). IronPython supports Pygments just fine – at least, the one IPy bug that Pygments exposes has a simple workaround – so I set about building a Windows Live Writer plugin that uses it.

If you’re simply interested in the plugin itself, you can get it from my SkyDrive. The source is up on GitHub. For now, if you find any bugs, please leave a comment on this post. If there’s enough interest I’ll setup a site somewhere (CodePlex perhaps) where I can track bugs and feature requests.

Pygments for WL Writer is a smart content source. In WL Writer’s terminology, that means when you click inserted text in the editor window, it is treated as an atomic entity which you can then edit by using the Edit Code button in the Pygments for WL Writer sidebar editor. I I often found that I would edit my code multiple times – usually to shorten lines so they’d fit on my blog without wrapping. CodeHTMLer for WL Writer is a standard content source, so it just spews the formatted code as HTML onto the page.

From an IronPython perspective, there’s some interesting stuff there. I decided to compile the pygments library into a DLL for easier distribution. If you look in the source, there’s a folder for the Pygments source as well as the parts of the standard Python library that Pygments depends on and my custom HTML formatter. Those all get compiled via a custom script which can be called by the build.bat file in the project root.

Some features I’m thinking about adding:

An extensibility model so that you can add new languages by dropping new Pygments lexers into the same folder the plugin is installed to. Pygments supports lots of languages, but not all of them – notably it’s missing Powershell and F#.
Support for new HTML formatters and color schemes using the same extensibility mechanism described above.
Support for selecting an HTML formatter.
Improving the code editor window. Currently, I’m using a standard WinForms multi-line TextBox, but that leaves a lot to be desired. With the Python work I do, I often need to be able to select a bunch of text and change it’s indenting via tab and shift-tab. If anyone has a suggestion for a good WinForms text editing control, let me know.
Being able to specify the font and size of the Pygmented code.
Storing user preferences – remembering the most recent syntax and color scheme the user used.

Feedback, as always is appreciated. I’ll probably write a few posts about the project when I get a chance, so let me know if there’s anything you’re dying to hear about.

Writing an IronPython Debugger: Command Routing

At this point, ipydbg support seven commands: Continue, Quit, Show Stack Trace, Show Locals, Step Over, Step In, and Step Out. All these commands are invoked by a single keystroke. I’m using Console.ReadKey in an attempt to cut down on the number of keystrokes needed for interacting with the debugger. If I only type ‘s’ instead of ‘s <enter>’ to step, I figure I’ll be twice as productive! 😄

If I was writing ipydbg in C#, I could use switch statement to dispatch commands in the _input method based on user keystrokes. However, Python doesn’t have a switch statement so I’ve been using a cascading set of if/elif/else statements instead. When you get up to seven if/elif clauses plus an else clause, the code smell is pretty overwhelming.

# Only has three if/elif clauses,but it's already a little smelly

val = Console.ReadKey()
if val.Key == 'a':  
  result = 'a'  
elif val.Key == 'b'  
  result = 'b'  
elif val.Key == 'c'  
  result = 'c'  
else:  
  print "unknown key"

Python might not have a switch statement, but it does have first-order functions so you can get the effects of a switch by using a dictionary.

def do_a():
  return 'a'
def do_b():
  return 'b'
def do_c():
  return 'c'
_switch = {'a':do_a, 'b':do_b, 'c':do_c}

val = Console.ReadKey()
if val in _switch:
  result = _switch[val.Key]()
else:
  print "unknown key"

I like this approach much better. Individual if/elif blocks are now broken out into separate functions, which smells better than embedding them in one big function. Also, I like that my pseduo-switch statement is completely separate from the how the _switch dictionary is initialized. However, this approach also separates the pseudo-case statement functions from the _switch dictionary as well. That’s not a good thing. You can easily imagine screwing up by adding a new function but forgetting to manually update the _switch dictionary.

What I need is a way to declaratively associate the switch function with the dictionary lookup key that’s associated with it. Luckily, Python Decorators provides a very clean way to do this.

_switch = {}

@inputcmd(_switch, 'a')
def do_a():
  return 'a'
@inputcmd(_switch, 'b')
def do_b():
  return 'b'
@inputcmd(_switch, 'c')
def do_c():
  return 'c'

val = Console.ReadKey()
if val in _switch:  
  result = _switch[val.Key]()  
else:  
  print "unknown key"

I’ve blogged about decorators before when I wanted to automatically invoke operations on the right thread in my WPF photo viewing app. The @inputcmd decorator is a bit more complicated than the @BGThread and @UIThread decorators since @inputcmd decorator accepts arguments. Each of the @input command decorators in the code above is the equivalent to this code:

def do_a():
  return 'a'

_tmp = inputcmd(_switch, 'a')
do_a = _tmp(do_a)

As you can see, the inputcmd function returns the decorator that wraps do_a, rather than being the decorator itself. This function that returns a function that returns a function is kinda confusing at first. But this approach allows you to configure the decorator for a specific purpose via the arguments – in this case, specifying which dictionary and which console key this function is associated with.

Also unlike @BGThread and @UIThread, I don’t actually want to modify the behavior of the methods decorated with @inputcmd. I only want to store a reference to them in the passed in dictionary. So implementing this decorator is very easy:

def inputcmd(cmddict, key):
    def deco(f):
        cmddict[key] = f
        return f  
    return deco

The decorator simply inserts the function into the passed-in dictionary using the passed in key. It then returns the function as is, so it’s not really rebinding the symbol to a new method (technically, it’s rebinding the symbol to the same function it’s currently bound to). If I wanted also wrap the passed in function to provide additional functionality, I could do that with a second locally defined function inside the deco function.

The latest version of ipydbg as been refactored to use @inputcmd instead of set of a cascading if/elif statement blocks. Now that that’s done, I can start working on multi-key commands.

Series

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my opinion. Inappropriate comments will be deleted at the authors discretion.