Hawk Notes, Volume 4

I rolled out a small update to my blog: It's now reading the blog posts from a GitHub repo instead of an Azure DevOps repo.

The only reason for the swap is to get programmatic public access to the repo. Azure DevOps supports public projects, but REST access still appears to require authentication. Personal Access Tokens work just fine, but they expire, which is a pain in the backside.

Maybe anonymous access to Azure Repos REST services gets enabled in the future, but I didn't want to wait. It was pretty easy to port the Azure DevOps REST Wrapper to Octokit. My code was already using TPL Dataflow to load all the post metadata from the Azure Git repo. All I really needed to do was change the client initialization code and the URL construction scheme and I was good to go.
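
For reference, the GitHub side of the swap boils down to something like this - a minimal sketch with Octokit, where the owner, repo, and path values are hypothetical placeholders rather than Hawk's actual configuration:

using Octokit;

// No credentials needed - public repos allow anonymous REST access
var client = new GitHubClient(new ProductHeaderValue("Hawk"));

// List everything under a folder in the repo (names are placeholders)
var contents = await client.Repository.Content.GetAllContents(
    "devhawk", "blog-posts", "posts");

foreach (var item in contents)
{
    // Each item exposes its repo path and a direct URL to the raw content
    Console.WriteLine($"{item.Path}: {item.DownloadUrl}");
}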

Hawk Notes, Volume 3

I just rolled out a new feature for my blog: Series.

Back when I was working for the IronPython team, I would write series of posts on a single topic, such as writing an IronPython debugger or using IronPython's __clrtype__ metaclass feature. I've also written series on building parsers in F#, a step-by-step guide to brokered WinRT components, and the home-grown engine I use for this blog. And with the new job, I suspect I'll be writing even more.

So I decided to make series a first-class construct in my blog. You can go to a top-level page to see all my series, or you can go to a specific series page, such as the one for Hawk Notes. One thing I really like about series is that they display in chronological order. That makes a series like the IronPython Debugger one much easier to follow.

Implementing Series was fairly straightforward...or at least it would have been if I hadn't decided to significantly refactor the service code. I didn't like how I was handling configuration - too much direct config reading that was better handled by the options pattern. I had made some poor service design decisions that limited the use of dependency injection. Most of all, I wanted to change the way memory caching worked so that more data is loaded on demand instead of ahead of time. I also took the opportunity to use newer language constructs like value tuples and local functions.
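
As a rough illustration of the options change - a sketch, assuming a "Blog" configuration section name, which is my invention rather than Hawk's actual config layout:

using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public IConfiguration Configuration { get; set; }

    public void ConfigureServices(IServiceCollection services)
    {
        // Bind a config section into a typed options class once; services
        // then inject IOptions<BlogOptions> instead of reading config directly
        services.Configure<BlogOptions>(Configuration.GetSection("Blog"));
    }
}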

I still have two different sources for posts - the local file system and an Azure Repo. I used to use Azure Storage but I got rid of that source as part of this refactoring. I have a simple interface PostLoader1 which the AzureGitLoader and FileSystemLoader classes implement. In order to select between them at run time, I have a third PostLoader implementation named PostLoaderSelector. PostLoaderSelector chooses between the sources based on configuration and uses IServiceProvider to activate the specified type from the DI container. PostLoaderSelector gets the IServiceProvider instance via constructor injection. I couldn't find a good example of how to manually activate types from ASP.NET's DI container, so for future reference it looks like this:

public Task<IEnumerable<Post>> LoadPostsAsync()
{
    // Look Ma, a Local Function!
    PostLoader GetLoader()
    {
        switch (options.PostStorage)
        {
            case BlogOptions.PostStorageType.FileSystem:
                return serviceProvider.GetService<FileSystemLoader>();
            case BlogOptions.PostStorageType.AzureGit:
                return serviceProvider.GetService<AzureGitLoader>();
            default:
                throw new NotImplementedException();
        }
    }

    var loader = GetLoader();
    return loader.LoadPostsAsync();
}
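
For completeness, here's roughly how the loaders would need to be registered for GetService to find them (a sketch - the lifetimes are my guess, and GetService<T> itself is the extension method from Microsoft.Extensions.DependencyInjection):

// In Startup.ConfigureServices: register the concrete loaders as themselves
// so PostLoaderSelector can activate them by type, and map the PostLoader
// service to the selector
services.AddSingleton<FileSystemLoader>();
services.AddSingleton<AzureGitLoader>();
services.AddSingleton<PostLoader, PostLoaderSelector>();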

  1. Note the lack of the "I" prefix for my interface type names. Death to this final vestigial Hungarian notation!

Hawk Notes, Volume 2

Obviously, I've been pretty bad about keeping this blog up to date. That applies both to the content of the blog and to the blog engine. Frankly, the last time I updated my blog code before last weekend was four years ago. At the time, ASP.NET Core was still called ASP.NET 5 and the latest version was Preview 7. Before posting my recent announcement, I thought it might be a good idea to get this blog running on a released and supported version of ASP.NET Core.

Once I got my code updated to the latest version of ASP.NET Core, I decided to actually add some functionality. When I built Hawk, I focused primarily on how the site was going to render my existing content. I didn't give much thought to how I would write new blog posts. That turned out to be a bad decision. Getting new posts published turned out to be such a huge pain in the butt that I rarely bothered to write new posts. With the new job, I want to go back to blogging much more often. So I figured I should update the engine to make publishing new posts easier.

As I wrote in my previous Hawk Note, Hawk loads all the post metadata into memory on startup. It supports loading posts from either the file system or Azure Storage. The master copy of my posts is stored in its own Azure DevOps repo (formerly Visual Studio Online, then Visual Studio Team Services). Moving posts from the git repo into Azure Storage turned out to be the biggest publishing obstacle, so I decided to eliminate that step. If you're reading this, I guess it worked!

Since Hawk already supported loading posts from two different repositories, it was pretty straightforward to add a third loader that reads directly from the Azure Repo. I also added code to render the Markdown to HTML using Markdig, eliminating the need to use Edge.js1.
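
The Markdig side is pleasantly small. A minimal sketch - whether UseAdvancedExtensions matches the exact extension set Hawk needs is an assumption on my part:

using Markdig;

// Build the pipeline once and reuse it; UseAdvancedExtensions enables
// footnotes, custom containers, and other beyond-CommonMark features
var pipeline = new MarkdownPipelineBuilder()
    .UseAdvancedExtensions()
    .Build();

string RenderPost(string markdown) => Markdown.ToHtml(markdown, pipeline);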

Originally, I decided to store my blog posts in what is now called Azure Repos because it offered free private repos for personal use. While GitHub now offers free private repos too, I've decided to keep my posts in Azure Repos because I find the Azure DevOps REST interface much easier to wrap my head around than GitHub's GraphQL-based approach. Yeah, I'm sure GraphQL is the better approach. But I was able to add the ability to load posts from Azure Repos in roughly 150 lines of code. For now anyway, easy and proven beats newer and more powerful.
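
For a sense of what that wrapper code looks like, here's a hedged sketch using the official .NET client libraries for Azure DevOps; the org URL, project, and repo names are placeholders, and PAT-based auth is an assumption, not verbatim Hawk code:

using Microsoft.TeamFoundation.SourceControl.WebApi;
using Microsoft.VisualStudio.Services.Common;
using Microsoft.VisualStudio.Services.WebApi;

// Authenticate with a Personal Access Token (all names are placeholders)
var connection = new VssConnection(
    new Uri("https://dev.azure.com/devhawk"),
    new VssBasicCredential(string.Empty, personalAccessToken));

var gitClient = await connection.GetClientAsync<GitHttpClient>();

// Recursively list every item in the posts repo
var items = await gitClient.GetItemsAsync(
    project: "blog", repositoryId: "posts",
    recursionLevel: VersionControlRecursionType.Full);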

Hawk pulling from the Azure Repo is the first step toward a simple publishing process. Next, I will set up a service hook to automatically trigger a Hawk refresh when I push to the post repo's master branch. Eventually, I also want to add image publishing and maybe some type of tool to help author the post metadata (auto-set the date, auto-slugify the title, validate categories and tags, etc.). I also want to re-enable comments, though social media makes that somewhat less important than it was back in the day. Regardless, I figure it's best to tackle improvements incrementally. The last thing I want is another long silence on this blog.
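
The refresh endpoint on the receiving end of that service hook can stay dead simple. A hypothetical sketch - the route, the key check, and the InvalidateCache call are all my inventions, not Hawk's actual code:

// Hypothetical controller action: the service hook POSTs here on push,
// and the secret key keeps anonymous visitors from triggering refreshes
[HttpPost("api/refresh")]
public IActionResult Refresh(string key)
{
    if (key != options.RefreshKey)      // assumed configuration setting
        return Unauthorized();

    postRepository.InvalidateCache();   // assumed cache-invalidation hook
    return NoContent();
}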


  1. Note, I might not use Edge.js in Hawk anymore, but I still think Tomasz Janczuk must be some kind of a wizard.

Hawk Notes, Volume 1

This is the first in a series of blog posts about Hawk, the engine that powers this site. My plan is to make a post like this for every significant update to the site. We'll see how well that plan works.

  • I just pushed out a new version of Hawk on my website. The primary feature of this release is support for ASP.NET 5 Beta 7. I also published the source code up on GitHub. Feedback welcome!
  • As I mentioned in my post on Edge.js, the publishing tooling for Hawk is little more than duct tape and baling wire at this point. Eventually, I'd like to have a dedicated tool, but for now it's a manual three-step process:
    1. Run the PublishDraft custom command to publish a post from my draft directory to a local git repo of all my content. As part of this, I update some of the metadata and render the markdown to HTML.
    2. Run my WritePostsToAzure Custom Command to publish posts from my local git repo to Azure. I have a blog post on my custom command infrastructure in the works.
    3. Trigger a content refresh via an unpublished URL.
  • I need to trigger a content refresh because Hawk loads all of the post metadata from Azure on startup. The combined metadata for all my posts is pretty small - about 2/3 of a megabyte stored on disk as JSON. Having the data in memory makes it easy to query as well as to support multiple post repositories (Azure Storage and the file system).
  • I felt comfortable publishing the Hawk source code now because I added a secret key to the data refresh URL. Previously, the refresh URL was unsecured. I didn't think giving random, anonymous people on the Internet the ability to kick off a data refresh was a good idea, so I held off publishing the source until I secured that endpoint.
  • Hawk caches blog post content and legacy comments in memory. This release also adds cache invalidation logic so that everything gets reloaded from storage on data refresh, not just the blog post metadata.
  • I don't understand what the ASP.NET team is doing with the BufferedHtmlContent class. In Beta 7, it's been moved to the Common repo and published as source. However, I couldn't get it to compile because it depends on an internal [NotNull] attribute. I decided to scrap my use of BufferedHtmlContent and build out several classes that implement IHtmlContent directly instead. For example, the links at the bottom of my master layout are rendered by the SocialLink class (a rough sketch follows this list). Frankly, I'm not sure if rolling your own IHtmlContent class for every snippet of HTML you want to automate is a best practice. It seems like it's harder than it should be. It feels like ASP.NET needs a built-in class like BufferedHtmlContent, so I'm not sure why it's been removed.
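
Here's that rough sketch of a hand-rolled IHtmlContent class along the lines of SocialLink, written against the current shape of the interface; the markup and constructor parameters are guesses at what Hawk actually renders:

using System.IO;
using System.Text.Encodings.Web;
using Microsoft.AspNetCore.Html;

public class SocialLink : IHtmlContent
{
    readonly string url;
    readonly string iconClass;

    public SocialLink(string url, string iconClass)
    {
        this.url = url;
        this.iconClass = iconClass;
    }

    public void WriteTo(TextWriter writer, HtmlEncoder encoder)
    {
        // Encode the dynamic values; write the static markup directly
        writer.Write("<a href=\"");
        writer.Write(encoder.Encode(url));
        writer.Write("\"><span class=\"");
        writer.Write(encoder.Encode(iconClass));
        writer.Write("\"></span></a>");
    }
}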

The Brilliant Magic of Edge.js

In my post relaunching DevHawk, I mentioned that the site is written entirely in C# except for about 30 lines of JavaScript. Like many modern web content systems, Hawk uses Markdown. I write blog posts in Markdown, and then my publishing "tool" (frankly, little more than duct tape and baling wire at this point) converts the Markdown to HTML and uploads it to Azure.

However, as I went through and converted all my old content to Markdown, I discovered that I needed some features that aren't supported by either the original implementation or the new CommonMark project. Luckily, I discovered the markdown-it project, which implements the CommonMark spec but also supports syntax extensions. Markdown-it already had extensions for all of the extra features I needed - things like syntax highlighting, footnotes, and custom containers.

The only problem with using markdown-it in Hawk is that it's written in JavaScript. JavaScript is a fine language with lots of great libraries, but I find it a chore to write significant amounts of code in JavaScript - especially async code. I did try to rewrite my blog post upload tool in JavaScript. It was much more difficult than the equivalent C# code. Maybe once promises become more widely used and async/await is available, JavaScript will feel like it has a reasonable developer experience to me. Until then, C# remains my weapon of choice.

I wasn't willing to use JavaScript for the entire publishing tool, but I still needed to use markdown-it1. So I started looking for a way to integrate the small amount of JavaScript code that renders Markdown into HTML with the rest of my C# code base. I was expecting to have to set up some kind of local web service with Node.js to host the markdown-it code and call out to it from C# with HttpClient.

But then I discovered Edge.js. Holy frak, Edge.js blew my mind.

Edge.js provides nearly seamless interop between .NET and Node.js. I was able to drop the 30 lines of JavaScript code into my C# app and call it directly. It took all of about 15 minutes to prototype and it's less than 5 lines of C# code.

Seriously, I think Tomasz Janczuk must be some kind of a wizard.

To demonstrate how simple Edge.js is to use, let me show you how I integrated markdown-it into my publishing tool. Here is a somewhat simplified version of the JavaScript code my tool uses to render Markdown, including syntax highlighting and some other extensions.

// highlight.js integration lifted unchanged from 
// https://github.com/markdown-it/markdown-it#syntax-highlighting
var hljs  = require('highlight.js');
var md = require('markdown-it')({
  highlight: function (str, lang) {
    if (lang && hljs.getLanguage(lang)) {
      try { 
        return hljs.highlight(lang, str).value;
      } catch (__) {}
    }

    try {
      return hljs.highlightAuto(str).value;
    } catch (__) {}

    return ''; 
  }
});

// I use a few more extensions in my publishing tool, but you get the idea
md.use(require('markdown-it-footnote'));
md.use(require('markdown-it-sup'));

var html = md.render(markdown);

As you can see, most of the code is just setting up markdown-it and its extensions. Actually rendering the markdown is just a single line of code.

In order to call this code from C#, we need to wrap the call to md.render with a JavaScript function that follows the Node.js callback style. We pass this wrapper function back to Edge.js by returning it from the JavaScript code.

// Ain't first order functions grand? 
return function (markdown, callback) {
    var html = md.render(markdown);
    callback(null, html);
}

Note, I have to use the callback style in this case even though my code is synchronous. I suspect I'm the outlier here; there's a lot more async Node.js code out in the wild than synchronous code.

To make this code available to C#, all you have to do is pass the JavaScript code into the Edge.js Func function. Edge.js includes an embedded copy of Node.js as a DLL. The Func function executes the JavaScript and wraps the returned Node.js callback function in a .NET async delegate. The .NET delegate takes an object input parameter and returns a Task<object>. The delegate's input parameter is passed as the first parameter to the JavaScript function. The second parameter passed to the callback function becomes the return value from the delegate (wrapped in a Task, of course). I haven't tested it, but I assume Edge.js converts the callback function's first parameter to a C# exception if you pass a value other than null.

It sounds complex, but it's a trivial amount of code:

// markdown-it setup code omitted for brevity
Func<object, Task<object>> _markdownItFunc = EdgeJs.Edge.Func(@"
var md = require('markdown-it')() 

return function (markdown, callback) {
    var html = md.render(markdown);
    callback(null, html);
}");
  
async Task<string> MarkdownItAsync(string markdown)
{
    return (string)await _markdownItFunc(markdown);
}

To make it easier to use from the rest of my C# code, I wrapped the Edge.js delegate in a statically typed C# function. This handles type checking and casting, and provides IntelliSense for the rest of my app.
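
Calling it then looks like any other async C# method:

// Hypothetical call site - callers never see the object-typed delegate
var html = await MarkdownItAsync("# Hello from *Edge.js*!");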

The only remotely negative thing I can say about Edge.js is that it doesn't support .NET Core yet. I had to build my markdown rendering tool as a "traditional" C# console app instead of a DNX Custom Command like the rest of Hawk's command line utilities. However, Luke Stratman is working on .NET Core support for Edge.js. So maybe I'll be able to migrate my markdown rendering tool to DNX sooner rather than later.

Rarely have I ever discovered such an elegant solution to a problem I was having. Edge.js simply rocks. As I said on Twitter, I owe Tomasz a beer or five. Drop me a line Tomasz and let me know when you want to collect.


  1. I also investigated what it would take to update an existing .NET Markdown implementation like CommonMark.NET or F# Formatting to support custom syntax extensions. That would have been dramatically more code than simply biting the bullet and rewriting the post upload tool in JavaScript.