DevHawk World Tour FY2010

As I’ve done the past two years, here’s a list of all the places I’m going in the next fiscal year. Traditionally, I’ve done this post by calendar year, but all MSFT planning is done by FY and so invariably I miss events early in the calendar year but late in the fiscal (like PyCon last year). I’ll be updating this post periodically as I get tapped for more presentations. There are several other conferences I’m considering, submitting sessions for, or in discussions with, but these are the ones that are confirmed.

Danish University Tour, Sept 7-11
My FY10 travels first take me to Copenhagen, where I was invited by the local subsidiary to present at four different universities in a single week. Don’t know how much sightseeing I’ll get done, but I’ll sure be talking a lot. My host Martin Esmann writes Stud.blog for Danish ComputerWorld and has a post (in Danish) about my visit. Personally, I am just excited about being featured in something called “Stud.blog”! 😄 Actually, Stud here means “Student” not “slender, upright members of wood” or any other definition of the term “stud”.

I’ll be visiting Aalborg University, Aarhus University, University of Southern Denmark and University of Copenhagen as well as delivering a TechTalk at the Microsoft Development Center Copenhagen, which is Microsoft’s biggest development center in Europe. I’ll primarily be delivering my Iron Languages introductory talk “Pumping Iron”, but there’s also some interest in language development on the DLR so I’ll be talking on that topic as well.

patterns & practices Summit Redmond 2009, Oct 12-16
This will be my third p&p Summit in a row and fourth in five years. This year, I’m doing a talk called “Not Everything is a new Nail() : How Languages Influence Design”. I was supposed to deliver this talk last year, but got sidetracked by my day job and ended up talking about IronPython instead. Keith has made it VERY clear he doesn’t want another last minute substitution this year.

Turing award winner Alan Perlis is credited with saying ‘A language that doesn’t affect the way you think about programming is not worth knowing.’ Yet, most programmers rarely venture outside of the comfort zone of statically-typed object-oriented languages. Our heavy use of object-oriented languages influences our thinking to the point that we can’t see alternative approaches at all. This isn’t to say that object-oriented languages are bad, but as is typical in most things, there is no one ‘best’ way for all situations. In this talk, VS Languages PM Harry Pierson will look at a given software development scenario from both the object-oriented and functional perspectives, in order to see how much of an influence language really has on our engineering efforts.

Tech·Ed Europe 2009, Nov 9-13
I knew I was going to be updating this post over time, but I didn’t expect to have to update it so soon! Literally the day after I posted this, I got the speaker invite for Tech·Ed Europe 2009. My session hasn’t been posted yet, but this is the abstract we submitted:

Dynamic Languages on the Microsoft .NET Framework
The Dynamic Language Runtime (DLR) adds a shared dynamic type system, a standard hosting model, and support for generating fast dynamic code to the CLR. IronPython and IronRuby are Microsoft’s dynamic language implementations on .NET. In this talk, we’ll show you how to interactively create great .NET applications using dynamic languages. You’ll walk away knowing why dynamic languages deserve a spot in your toolbox!

It’s kind of generic, but given that most of the audience probably hasn’t seen IronPython or IronRuby, having broad latitude in my presentation topic is a good thing. I’ll probably deliver a variant of my standard “Pumping Iron” talk like I’m doing in Denmark. I delivered it recently at an internal event with Jimmy, so there’s lots more IronRuby content than there used to be.

The only bummer about doing Tech·Ed Europe is that I’m only doing one measly talk. I’m asking around – I’d love to do a .NET user group or university talk while I’m in town. Any takers?

Microsoft Professional Developers Conference 2009, Nov 17-19
Update: Tech·Ed Europe and PDC are on back-to-back weeks this year so we’ll be sending a teammate-to-be-determined to PDC in my stead. My family is very pleased I won’t be gone for two weeks straight.

Last year, I was on the content team for PDC. This year, that PITA responsibility belongs to someone else so I might actually get real work done in the four weeks leading up to PDC. My team will tell you, last year PDC sucked up 100% of my time for a month as we were driving towards our 2.0 release.

Technically, I haven’t had a talk for PDC accepted yet. But I submitted three and two are looking good (though I assume only one will make it to the actual show) so I thought I’d just go ahead and include it on this post. If/when my talks get accepted, I’ll post links and abstracts. Also, if one of my PDC talks is accepted, I’ll probably submit a talk for SoCal Code Camp as well.

PyCon 2010, Feb 19-21
This will also be my third PyCon in a row, though PyCon last year was a bit of a whirlwind since I had literally just joined the IronPython team. I finally feel like I might have something interesting to present at PyCon this year. Last year Dino and Jim handled the presentation duties from our team (with Michael Foord and Jonathan Hartley delivering a tutorial and Sarah Sutkiewicz speaking on FePy). We already have one announcement that I think is pretty significant lined up and might have a second depending on how hard I can push LCA and management between now and then. Talk proposals are due October 1st, so any suggestions would be appreciated!

HawkCodeBox

Last month, I lamented the lack of extensibility of the WPF text box. While there are several vendors and at least one open source custom syntax highlighting text box, it still really bothers me how inextensible the basic WPF text box is. I just want to do a simple colorizing REPL – why is that so hard?

So instead of using any of those syntax highlighting text boxes, I decided to build my own using the approach Ken Johnson wrote about on Code Project. As I wrote before, it’s a hack – you set the text box’s foreground and background brushes to transparent so that you can override OnRender – but it works.

The big change I made from Ken’s code was to use DLR TokenCategorizer instead of regular expressions to tokenize the code. TokenCategorizer is a service provided by the DLR hosting API, which will tokenize a given script source for you. Here’s the code that colorizes the text in the text box.

// Ask the DLR engine to tokenize the text box's current contents.
var source = Engine.CreateScriptSourceFromString(this.Text);
var tokenizer = Engine.GetService<TokenCategorizer>();
tokenizer.Initialize(null, source, SourceLocation.MinValue);

var t = tokenizer.ReadToken();
while (t.Category != TokenCategory.EndOfStream)
{
    // Only recolor token categories that have a brush in the syntax map.
    if (SyntaxMap.ContainsKey(t.Category))
    {
        ft.SetForegroundBrush(SyntaxMap[t.Category],
             t.SourceSpan.Start.Index, t.SourceSpan.Length);
    }

    t = tokenizer.ReadToken();
}

As you can see, I ask the engine for a TokenCategorizer, initialize it with the text box’s current contents, then iterate through the tokens, looking for ones in my SyntaxMap. If the token category is in the syntax map, we change the foreground brush for that span of formatted text (ft is a WPF FormattedText instance I created earlier in the method).

Of course, this approach isn’t very efficient – it re-colorizes the entire file on every change. It turns out that some DLR TokenCategorizers are restartable, so you can cache the tokenizer state at any point and then return later with a new TokenCategorizer instance and pick up tokenizing where you left off. With this approach, you could, say, tokenize a line at a time, so that you only need to retokenize the line where the change occurred rather than the entire file. But only IronPython supports tokenizer restarting today, so I decided to take the easy way and simply re-colorize on every change.
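To make the line-at-a-time idea a bit more concrete, here’s a minimal Python sketch. It doesn’t touch the DLR TokenCategorizer at all: tokenize_line is a toy tokenizer invented for this illustration whose only state is whether a line ends inside a triple-quoted string, and IncrementalColorizer, retokenize_from and replace_line are hypothetical names. The point is just the caching: remember the state at the end of every line, restart from the line before the edit, and stop as soon as a line ends in the same state it ended in last time.

```python
TRIPLE = '"""'


def tokenize_line(line, in_string):
    """Toy tokenizer: split one line into 'code' and 'string' spans.

    Returns (spans, end_state), where spans is a list of
    (start, length, category) and end_state says whether the line
    ends inside a triple-quoted string.
    """
    spans = []
    pos = 0
    for i, part in enumerate(line.split(TRIPLE)):
        if i > 0:
            pos += len(TRIPLE)         # skip the delimiter itself
            in_string = not in_string  # every delimiter flips the state
        if part:
            spans.append((pos, len(part), "string" if in_string else "code"))
        pos += len(part)
    return spans, in_string


class IncrementalColorizer:
    """Caches per-line spans plus the tokenizer state at the end of each line."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.spans = [None] * len(self.lines)
        self.end_states = [None] * len(self.lines)
        self.retokenize_from(0)

    def retokenize_from(self, index):
        """Retokenize from `index`; return how many lines had to be redone."""
        state = self.end_states[index - 1] if index > 0 else False
        redone = 0
        for i in range(index, len(self.lines)):
            spans, state = tokenize_line(self.lines[i], state)
            unchanged = state == self.end_states[i]
            self.spans[i], self.end_states[i] = spans, state
            redone += 1
            if unchanged:
                break  # later lines start from the same state as before
        return redone

    def replace_line(self, index, text):
        self.lines[index] = text
        return self.retokenize_from(index)


box = IncrementalColorizer(['x = 1', 's = """start', 'middle', 'end"""', 'y = 2'])
print(box.replace_line(0, 'x = 42'))  # 1: line 0 still ends outside a string
print(box.replace_line(1, 's = 1'))   # 4: dropping the opening quotes affects every later line
```

An ordinary edit costs one line of tokenizing. Only an edit that opens or closes a multi-line construct ripples further down the file.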

I named the project HawkCodeBox and I’ve published the source up on GitHub. It’s fairly simple, but of course the goal wasn’t to build the be-all-end-all text editor – other people in the VS team already have that job.

CodePlex Editor Role

Ask Sara, I have been bugging her for a LONG time for this CodePlex feature. Actually, my team has been bugging her team for longer than either of us have been in these jobs.

Last week’s CodePlex release includes a feature known as “Editor Role”. If you look at the Project Role Matrix, you’ll notice two primary differences from what the standard logged-in user can do: they can create/edit wiki pages and they can’t rate releases. Developers and Coordinators can’t rate releases either – I guess the idea is that they don’t want members of the team rating their own releases (5 Stars! Again! Wow, we’re awesome!).

Until now, the only way to give members of the community the ability to edit the wiki also gave permission to edit work items, check in source code and make releases. We’re still working on getting Microsoft at large to understand the benefits of the community collaboration aspect of open source, but in the meantime we just can’t give those permissions to people off the team. However, we would love to have contributions to our documentation wiki.[1] With the new Editor Role, we’ll be able to grant wiki editor access without any of the other permissions.

Of course, the whole idea of “wiki permissions” kinda flies in the face of the basic wiki design principles. So we’re going to be pretty liberal about handing out editor permissions. If you’re interested in editing the wiki, drop me a line and I’ll get you hooked up.

Big mega-thanks to the CodePlex team for making this feature happen. I guess I’ll have to find something new to bug Sara about!


  1. You can tell we’re a real open source project because we’re begging for documentation help!

2009 Space Elevator Conference

Today marks the start of the 2009 Space Elevator Conference on the Microsoft campus. Last night, my father and I attended a free overview presentation on space elevators. My father is a huge sci-fi fan and has read many of Arthur C. Clarke’s books, including The Fountains of Paradise, so he was very excited for this opportunity. Unfortunately, while the idea of a space elevator is pretty exciting, the presentation itself left quite a bit to be desired.

For the uninitiated, a space elevator is just what it sounds like – an elevator into space. Chemical rockets are horribly inefficient, so instead the idea is to run a cable way out into space. According to Wikipedia, a space elevator would be a couple of orders of magnitude cheaper for getting things into space than chemical rocketry.

Of course, actually building a space elevator would have a massive up front cost and an engineering effort that would dwarf even the effort that landed mankind on the moon. One of the biggest problems is the substance the cable itself is built out of. This cable would be thousands of kilometers long, and would have to be extremely strong. Frankly, there’s no feasible material to make the cable from available to us today. Apparently, a cable strong enough, made out of the strongest high tensile steel available today, would weigh more than the entire universe! Not exactly feasible. But advancements in carbon nanotubes have scientists believing they might be able to make materials 100x stronger than high tensile steel. If that pans out, it would be feasible to build the space elevator cable from carbon nanotubes.

Another big issue is power for the climbers. Current thinking apparently is to beam power to the climbers via megawatt lasers – an idea that, like carbon nanotubes, would have far reaching impact on our society over and above space elevators. The idea of “beaming power” sounds nearly as fantastic as the space elevator itself, but apparently there’s an X-Prize style competition underway with a cool $2 million in prize money if you can build a beam powered climber that travels 5 meters/second.

While the idea of a space elevator is very fascinating and I was excited to spend an evening with my dad geeking out in a non-software related field, the presentation itself was kinda crappy. I have no doubt that Dr. Bryan Laubscher, who delivered the presentation, is one of the top minds in space elevator theory and technology in the world today. However, his presentation was bullet-point laden, rambling, incoherent at times and frankly boring.

For example, I get the feeling that Dr. Laubscher spends a lot of time defending the idea of a space elevator to skeptical NASA scientists. He spent WAY too much time talking about how inefficient chemical rockets are – I mean, mention it once but don’t keep coming back to that point over and over. He also went off on a strange tangent about the potential for societal decline when we turn our back on exploration. But he wasn’t presenting to skeptical NASA scientists last night – he was presenting to a group of enthusiastic amateurs. If you can’t tailor your presentation to your audience, there’s no way you’re going to be effective.

While the presentation could have been better, it still had some fascinating information. For example, there would probably have to be multiple space elevators – Dr. Laubscher estimated there would be five. It’s much more efficient to have each space elevator be one way, so you need at least two – one to go up and one to go down. I never considered the idea of multiple space elevators before.

Apparently, last year’s Space Elevator Conference was on the Microsoft Campus and I wouldn’t be surprised if next year’s was as well. I hope it will be. I’d like to attend more of the conference. Saturday is Space Elevator 101 day at the conference but I’m driving my parents to the airport. In the meantime, there are some space elevator blogs to follow. Also, I met the president of the LiftPort Group which is headquartered in Seattle, so maybe I’ll get a chance to talk to him one-on-one sometime after the conference is over.

And I should probably read The Fountains of Paradise while I’m at it.

Invoking Python Functions from C# (Without Dynamic)

So I’ve compiled the Pygments package into a CLR assembly and loaded an embedded Python script, so now all that remains is calling into the functions in that embedded Python script. Turns out, this is the easiest step so far.

We’ll start with get_all_lexers and get_all_styles, since they’re nearly identical. Both functions are called once on initialization, take zero arguments and return a PythonGenerator (for you C# devs, a PythonGenerator is kind of like the IEnumerable that gets created when you yield return from a function). In fact, the only difference between them is that get_all_styles returns a generator of simple strings, while get_all_lexers returns a generator of PythonTuples, each holding the long name, a tuple of aliases, a tuple of filename patterns and a tuple of mime types. Here’s the implementation of the Languages property:

PygmentLanguage[] _languages;

public PygmentLanguage[] Languages
{
    get
    {
        if (_languages == null)
        {
            // The scope is initialized on a background thread; wait for it.
            _init_thread.Join();

            var f = _scope.GetVariable<PythonFunction>("get_all_lexers");
            var r = (PythonGenerator)_engine.Operations.Invoke(f);
            var languages_list = new List<PygmentLanguage>();
            foreach (PythonTuple o in r)
            {
                // o[0] is the long name; o[1] is the tuple of aliases.
                languages_list.Add(new PygmentLanguage()
                    {
                        LongName = (string)o[0],
                        LookupName = (string)((PythonTuple)o[1])[0]
                    });
            }

            _languages = languages_list.ToArray();
        }

        return _languages;
    }
}

If you recall from my last post, I initialized the _scope on a background thread, so I first have to wait for the thread to complete. If I was using C# 4.0, I’d simply be able to run _scope.get_all_lexers, but since I’m not I have to manually reach into the _scope and retrieve the get_all_lexers function via the GetVariable method. I can’t invoke the PythonFunction directly from C#; instead I have to use the Invoke method that hangs off _engine.Operations. I cast the return value from Invoke to a PythonGenerator and iterate over it to populate the array of languages.
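If it helps to see the tuple shape from the Python side, here’s a small sketch. The get_all_lexers below is a hand-written stand-in with two made-up entries, not the real Pygments function, and to_languages is a hypothetical helper that does the same picking as the C# loop: element 0 for LongName and the first alias for LookupName. Skipping entries with no aliases is a defensive assumption on my part; I haven’t checked whether any real lexer has an empty alias tuple.

```python
def get_all_lexers():
    """Stand-in generator yielding (long name, aliases, filename patterns, mime types)."""
    yield ("Python", ("python", "py"), ("*.py",), ("text/x-python",))
    yield ("C#", ("csharp", "c#"), ("*.cs",), ("text/x-csharp",))


def to_languages(lexers):
    """Mirror of the C# loop: keep the long name and the first alias."""
    languages = []
    for long_name, aliases, _patterns, _mime_types in lexers:
        if not aliases:
            continue  # assumption: nothing to look the lexer up by, so skip it
        languages.append({"LongName": long_name, "LookupName": aliases[0]})
    return languages


for language in to_languages(get_all_lexers()):
    print(language["LongName"], "->", language["LookupName"])
```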

If you’re working with dynamic languages from C#, the ObjectOperations instance that hangs off the ScriptEngine instance is amazingly useful. Dynamic objects can participate in a powerful but somewhat complex protocol for binding a wide variety of dynamic operation types. The DynamicMetaObject class supports twelve different Bind operations. But the DynamicMetaObject binder methods are designed to be used by language implementors. The ObjectOperations class lets you invoke them fairly easily from a higher level of abstraction.

The last Python function I call from C# is generate_html. Unlike get_all_lexers, generate_html takes three parameters and can be called multiple times. The Invoke method has a params argument so it can accept any number of additional parameters, but when I tried to call it I got a NotImplemented exception. It turns out that Invoke currently throws NotImplemented if it receives more than 2 parameters. Yes, we realize that’s kinda broken and we are looking to fix it. However, it turns out there’s another way that’s also more efficient for a function like generate_html that we are likely to call more than once. Here’s my implementation of GenerateHtml in C#.

Func<object, object, object, string> _generatehtml_function;

public string GenerateHtml(string code, string lexer, string style)
{
    if (_generatehtml_function == null)
    {
        _init_thread.Join();

        var f = _scope.GetVariable<PythonFunction>("generate_html");
        _generatehtml_function = _engine.Operations.ConvertTo
                           <Func<object, object, object, string>>(f);
    }

    return _generatehtml_function(code, lexer, style);
}

Instead of calling Invoke, I convert the PythonFunction instance into a delegate using Operations.ConvertTo which I then cache and call like any other delegate from C#. Not only does Invoke fail for more than two parameters, it creates a new dynamic call site every time it’s called. Since get_all_lexers and get_all_styles are each only called once, it’s no big deal. But you typically call generate_html multiple times for a block of source code. Using ConvertTo generates a dynamic call site as part of the delegate, so that’s more efficient than creating one on every call.
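The look-it-up-once shape of GenerateHtml can be sketched in plain Python too. Python has no DLR call sites, so this only mirrors the structure: wait for the background initialization, fetch the function a single time, cache it, and reuse the cached callable on every later call. Highlighter, _load_script and the lookups counter are hypothetical names for this illustration, and the fake generate_html just wraps the code in a pre tag rather than calling Pygments.

```python
import threading


class Highlighter:
    def __init__(self):
        self._scope = {}
        self._generate_html = None  # cached callable, filled in on first use
        self.lookups = 0            # counts how often we reach into the scope
        self._init_thread = threading.Thread(target=self._load_script)
        self._init_thread.start()

    def _load_script(self):
        # Stand-in for loading the embedded script on a background thread.
        def generate_html(code, lexer, style):
            return "<pre class='%s %s'>%s</pre>" % (lexer, style, code)

        self._scope["generate_html"] = generate_html

    def generate_html(self, code, lexer, style):
        if self._generate_html is None:
            self._init_thread.join()  # make sure the scope is populated
            self.lookups += 1
            self._generate_html = self._scope["generate_html"]
        return self._generate_html(code, lexer, style)


h = Highlighter()
print(h.generate_html("x = 1", "python", "default"))
print(h.generate_html("y = 2", "python", "default"))
print(h.lookups)  # the scope lookup happened only once
```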

The rest of the C# code is fairly pedestrian and has nothing to do with IronPython, as all access to Python code is hidden behind GenerateHtml as well as the Languages and Styles properties.

So as I’ve shown in the last few posts, embedding IronPython inside a C# application – even before we get the new dynamic functionality of C# 4.0 – isn’t really all that hard. Of course, we’re always interested in ways to make it easier. If you’ve got any questions or suggestions, please feel free to leave a comment or drop me a line.