Writing custom Markdig extensions
Markdig, according to its description, "is a fast, powerful, CommonMark compliant, extensible Markdown processor for .NET". While most of our older projects use MarkdownDeep (including an increasingly creaky cyotek.com), current projects use Markdig and thus far it has proven to be an excellent library.
One of the many overly complicated aspects of cyotek.com is that in addition to the markdown processing, every single block of content is also run through a byzantine number of regular expressions for custom transforms. When cyotek.com is updated to use Markdig, I definitely don't want these expressions to hang around. Enter Markdig extensions.
Markdig extensions allow you to extend Markdig with additional transforms, things that might not conform to the CommonMark specification such as YAML blocks or pipe tables.
MarkdownPipeline pipeline;
string html;
string markdown;

markdown = "# Header 1";

pipeline = new MarkdownPipelineBuilder()
    .Build();
html = Markdown.ToHtml(markdown, pipeline); // <h1>Header 1</h1>

pipeline = new MarkdownPipelineBuilder()
    .UseAutoIdentifiers() // enable the Auto Identifiers extension
    .Build();
html = Markdown.ToHtml(markdown, pipeline); // <h1 id="header-1">Header 1</h1>
Example of using an extension to automatically generate id
attributes for heading elements.
I recently updated our internal crash aggregation system to be
able to create MantisBT issues via our MantisSharp library.
In these issues, stack traces include the line number or IL
offset in the format #<number>. To my vague annoyance, Mantis
Bug Tracker treats these as hyperlinks to other issues in the
system in a similar fashion to how GitHub automatically links to
issues or pull requests. It did however give me an idea to
create a Markdig extension that performs the same functionality.
Deciding on the pattern
The first thing you need to do is decide the markdown pattern to
trigger the extension. Our example is perhaps a bit too basic
as it is a simple #<number>, whereas if you think of other
issue systems such as JIRA, it would be <string>-<number>. As
well as the "body" of the pattern you also need to consider the
characters which surround it. For example, you might only allow
white space, or perhaps round or square brackets - whenever I
reference a JIRA issue I tend to surround it in square brackets,
e.g. [PRJ-1234].
The other thing to consider is the criteria of the core pattern.
Using our example above, should we have a minimum number of
digits before triggering, or a maximum? #999999999 is probably
not a valid issue number!
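To make those criteria concrete, here's a minimal sketch of the same rules written as a regular expression rather than a Markdig parser, which is handy for trying out a pattern before committing to it. The IssuePattern class, the FindIssueNumbers method and the cap of seven digits are all assumptions made up for illustration; none of them are part of Markdig or the sample project.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for experimenting with the pattern rules.
public static class IssuePattern
{
    // Opener: start of text, white space, ( or [
    // Body:   # followed by 1 to 7 digits (the cap of 7 is an arbitrary assumption)
    // Closer: end of text, white space, ) or ]
    private static readonly Regex _pattern = new Regex(
        @"(?<=^|[\s(\[])#(?<id>[0-9]{1,7})(?=$|[\s)\]])",
        RegexOptions.CultureInvariant);

    public static List<string> FindIssueNumbers(string text)
    {
        List<string> results = new List<string>();

        foreach (Match match in _pattern.Matches(text))
        {
            // only the digits are captured, not the # prefix
            results.Add(match.Groups["id"].Value);
        }

        return results;
    }
}
```

Given the text "See #12 and [#345] but not abc#6 or #999999999", this should pick out 12 and 345 only: abc#6 fails the opener rule and #999999999 exceeds the digit cap.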
Extension components
A Markdig extension is comprised of a few moving parts. Depending on how complicated your extension is, you may not need all parts, or could perhaps reuse existing parts.
- The extension itself (always required)
- A parser
- A renderer
- An object used to represent data in the abstract syntax tree (AST)
- An object used to configure the extension functionality
In this article, I'll be demonstrating all of these parts.
Happily enough, there's already an extension built into Markdig for rendering JIRA links. Together with the original MarkdigJiraLinker extension by Dave Clarke, it made a great starting point. As I mentioned at the start, Markdig has a lot of extensions, some simple, some complex - there's going to be a fair chunk of useful code in there to help you with your own.
Supporting classes
I'm actually going to create the components in a backwards order from the list above, as each step depends on the one before it, so it would make for awkward reading if I was referencing things that don't yet exist.
To get started with some actual code, I'm going to need a couple of supporting classes - an options object for configuring the extension (at the bare minimum we need to supply the base URI of a MantisBT installation), and also a class to represent a link in the AST.
First the options class. As well as that base URI, I'll also add
an option to determine if the links generated by the application
should open in a new window or not via the target attribute.
public class MantisLinkOptions
{
    public MantisLinkOptions()
    {
        this.OpenInNewWindow = true;
    }

    public MantisLinkOptions(string url)
        : this()
    {
        this.Url = url;
    }

    public MantisLinkOptions(Uri uri)
        : this()
    {
        this.Url = uri.OriginalString;
    }

    public bool OpenInNewWindow { get; set; }

    public string Url { get; set; }
}
Next up is the object which will represent our link in the syntax tree. Markdig nodes are very similar to HTML, coming in two flavours - block and inline. In this article I'm only covering simple inline nodes.
I'm going to inherit from LeafInline and add a single property
to hold the Mantis issue number.
There is actually a more specific LinkInline element which is probably a much better choice to use (as it also means you shouldn't need a custom renderer). However, I'm doing this example the "long way" so that when I move onto the more complex use cases I have for Markdig, I have a better understanding of the API.
[DebuggerDisplay("#{" + nameof(IssueNumber) + "}")]
public class MantisLink : LeafInline
{
    public StringSlice IssueNumber { get; set; }
}
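For comparison, the LinkInline route mentioned above might look roughly like the sketch below. It is loosely modelled on how the built-in JIRA extension builds its links, but I haven't used it in the sample project, so treat it as an untested outline. The MantisLinkInlineFactory class and its Create method are hypothetical names of my own.

```csharp
using Markdig.Renderers.Html;
using Markdig.Syntax.Inlines;

// Hypothetical helper, not part of Markdig or the sample project.
public static class MantisLinkInlineFactory
{
    public static LinkInline Create(string baseUrl, string issueNumber, bool openInNewWindow)
    {
        LinkInline link;

        link = new LinkInline
        {
            // assumes baseUrl ends with a trailing slash
            Url = baseUrl + "view.php?id=" + issueNumber,
            // mark the container as closed so nothing else is nested inside it
            IsClosed = true
        };

        // the visible text of the link, e.g. #1234
        link.AppendChild(new LiteralInline("#" + issueNumber));

        if (openInNewWindow)
        {
            // attributes attached to the node should be written out by
            // Markdig's own link renderer, so no custom renderer is needed
            link.GetAttributes().AddProperty("target", "_blank");
            link.GetAttributes().AddProperty("rel", "noopener noreferrer");
        }

        return link;
    }
}
```

A parser would assign the returned object to processor.Inline in place of a custom node. The trade-off is that you lose a dedicated AST type to search for later.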
String vs StringSlice
In the above class, I'm using the StringSlice struct offered
by Markdig. You can use a normal string if you wish (or any
other type for that matter), but StringSlice was specifically
designed for Markdig to improve performance and reduce
allocations. In fact, that's how I heard of Markdig to start
with, when I read Alexandre's comprehensive blog post on
the subject last year.
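As a quick illustration of why the struct avoids allocations, the sketch below creates a slice over an existing string. The slice only records a start and an inclusive end index into the original text, and a new string is only allocated if you call ToString. The StringSliceDemo class is a hypothetical name used purely for this example.

```csharp
using Markdig.Helpers;

// Hypothetical demo class, not part of Markdig or the sample project.
public static class StringSliceDemo
{
    public static StringSlice SliceIssueNumber(string text)
    {
        int start;
        int end;

        // assumes the text contains a # followed by exactly four digits
        start = text.IndexOf('#') + 1; // index of the first digit
        end = start + 3;               // index of the last digit (inclusive)

        // no characters are copied; the slice just points into the original string
        return new StringSlice(text, start, end);
    }
}
```

For the text "See issue #1234 for details", the returned slice has a Length of 4, its Text property is still the original string, and ToString() produces "1234".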
Creating the renderer
With the two supporting classes out the way, I can now create
the rendering component. Markdig renderers take an element from
the AST and spit out some content. Easy enough - we create a
class, inherit HtmlObjectRenderer<T> (where T is the name of
your AST class, e.g. MantisLink) and override the Write
method. If you are using a configuration class, then creating a
constructor to assign that is also a good idea.
public class MantisLinkRenderer : HtmlObjectRenderer<MantisLink>
{
    private readonly MantisLinkOptions _options;

    public MantisLinkRenderer(MantisLinkOptions options)
    {
        _options = options;
    }

    protected override void Write(HtmlRenderer renderer, MantisLink obj)
    {
        StringSlice issueNumber;

        issueNumber = obj.IssueNumber;

        if (renderer.EnableHtmlForInline)
        {
            // the base URL is expected to end with a trailing slash
            renderer.Write("<a href=\"").Write(_options.Url).Write("view.php?id=").Write(issueNumber).Write('"');

            if (_options.OpenInNewWindow)
            {
                // _blank is the keyword for a new browsing context
                renderer.Write(" target=\"_blank\" rel=\"noopener noreferrer\"");
            }

            renderer.Write('>').Write('#').Write(issueNumber).Write("</a>");
        }
        else
        {
            // plain text output, just the prefix and the number
            renderer.Write('#').Write(issueNumber);
        }
    }
}
So how does this work? The Write method we're overriding
supplies the HtmlRenderer to write to, and the MantisLink
object to render.
First we need to check if we should be rendering HTML by
checking the EnableHtmlForInline property. If this is false,
then we output the plain text, e.g. just the issue number and
the # prefix.
If we are writing full HTML, then it's a matter of building an
HTML a tag with the fully qualified URI generated from the
base URI in the options object, and the AST node's issue number.
We also add a target attribute if the options state that links
should be in a new window. If we do add a target attribute
I'm also adding a rel attribute as per MDN guidelines.
Notice how the HtmlRenderer object's Write method happily
accepts string, char or StringSlice arguments, meaning we
can mix and match to suit our purposes.
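One thing the renderer quietly relies on is the base URI ending in a trailing slash, as the path is simply appended to it. A small helper along the lines of the sketch below could make that more forgiving. MantisUrl and ForIssue are hypothetical names and aren't part of the sample project.

```csharp
using System;

// Hypothetical helper for building the issue URL outside of the renderer.
public static class MantisUrl
{
    public static string ForIssue(string baseUrl, string issueNumber)
    {
        string root;

        if (string.IsNullOrWhiteSpace(baseUrl))
        {
            throw new ArgumentException("A base URL is required.", nameof(baseUrl));
        }

        // add the trailing slash if the caller left it off
        root = baseUrl.EndsWith("/", StringComparison.Ordinal) ? baseUrl : baseUrl + "/";

        // the issue number should only ever be digits, but escape it to be safe
        return root + "view.php?id=" + Uri.EscapeDataString(issueNumber);
    }
}
```

With this in place, both "https://issues.cyotek.com" and "https://issues.cyotek.com/" would produce the same link for a given issue number.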
Creating the parser
With rendering out of the way, it's time for the most complex
part of creating an extension - parsing it from a source
document. For that, we need to inherit from InlineParser and
override the Match method, as well as setting up the
characters that would trigger the parse routine - that single
# character in our example.
public class MantisLinkInlineParser : InlineParser
{
    private static readonly char[] _openingCharacters =
    {
        '#'
    };

    public MantisLinkInlineParser()
    {
        this.OpeningCharacters = _openingCharacters;
    }

    public override bool Match(InlineProcessor processor, ref StringSlice slice)
    {
        bool matchFound;
        char previous;

        matchFound = false;
        previous = slice.PeekCharExtra(-1);

        if (previous.IsWhiteSpaceOrZero() || previous == '(' || previous == '[')
        {
            char current;
            int start;
            int end;

            slice.NextChar();
            current = slice.CurrentChar;
            start = slice.Start;
            end = start;

            while (current.IsDigit())
            {
                end = slice.Start;
                current = slice.NextChar();
            }

            // slice.Start only moves beyond start if at least one digit was read,
            // which stops a lone # from being treated as an issue link
            if (slice.Start > start && (current.IsWhiteSpaceOrZero() || current == ')' || current == ']'))
            {
                int inlineStart;

                inlineStart = processor.GetSourcePosition(slice.Start, out int line, out int column);

                processor.Inline = new MantisLink
                {
                    Span =
                    {
                        Start = inlineStart,
                        End = inlineStart + (end - start) + 1
                    },
                    Line = line,
                    Column = column,
                    IssueNumber = new StringSlice(slice.Text, start, end)
                };

                matchFound = true;
            }
        }

        return matchFound;
    }
}
In the constructor, we set the OpeningCharacters property to a
character array. When Markdig is parsing content, if it comes
across any of the characters in this array it will automatically
call your extension.
This neatly leads us onto the meat of this class - overriding
the Match method. Here, we scan the source document and try to
build up our node. If we're successful, we update the processor
and let Markdig handle the rest.
We know the current character is going to be # as this is our
only supported opener. However, we need to check the previous
character to make sure that we try and process a distinct
entity, and not a # character that happens to be in the middle
of another string.
previous = slice.PeekCharExtra(-1);
if (previous.IsWhiteSpaceOrZero() || previous == '(' || previous == '[')
Here I use an extension method exposed by Markdig to check if
the previous character was either whitespace, or nothing at all,
i.e. the start of the document. I'm also checking for ( or [
characters in case the issue number has been wrapped in round
or square brackets.
If we pass this check, then it's time to parse the issue number.
First we advance the character stream (to discard the #
opener) and also initialise the values for creating a final
StringSlice if we're successful.
slice.NextChar();
current = slice.CurrentChar;
start = slice.Start;
end = start;
As our GitHub/MantisBT issue numbers are just that, plain numbers, we simply keep advancing the stream until we run out of digits.
while (current.IsDigit())
{
    end = slice.Start;
    current = slice.NextChar();
}
As I'm going to work exclusively with the StringSlice struct, I'm only recording where the new slice will end. Even if you wanted to use a more traditional string, it probably makes sense to keep the above construct and then build your string at the end.
Once we've run out of digits, we now essentially do a reverse of the check we made at the start - now we want to see if the next character is white space, the end of the stream, or a closing bracket. We also make sure at least one digit was actually read, otherwise a lone # followed by a space would be treated as an issue link.
if (slice.Start > start && (current.IsWhiteSpaceOrZero() || current == ')' || current == ']'))
I didn't add a check for this, but potentially you should also look for a matching pair - so if a bracket was used at the start, the corresponding closing bracket should therefore be present at the end.
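A matching pair check could be as simple as the sketch below, which takes the character before the # and the character after the last digit. The Delimiters class and IsValidPair method are hypothetical names and aren't part of the sample project.

```csharp
// Hypothetical helper for validating the characters around an issue number.
public static class Delimiters
{
    // previous: the character before the #, or '\0' at the start of the text
    // next:     the character after the last digit, or '\0' at the end of the text
    public static bool IsValidPair(char previous, char next)
    {
        switch (previous)
        {
            case '(':
                return next == ')';
            case '[':
                return next == ']';
            default:
                // opened by white space or the start of the text,
                // so expect white space or the end of the text
                return next == '\0' || char.IsWhiteSpace(next);
        }
    }
}
```

Be aware that this is stricter than the check used in the parser. For example, "(see #12)" would no longer match, because the # is preceded by a space but followed by a closing bracket. Whether that is desirable depends on how you tend to write your references.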
Assuming this final check passes, that means we have a valid
#<number> sequence, and so we create a new MantisLink
object with the IssueNumber property populated with a brand
new string slice. We then assign this new object to the Inline
property of the processor.
inlineStart = processor.GetSourcePosition(slice.Start, out int line, out int column);
processor.Inline = new MantisLink
{
    Span =
    {
        Start = inlineStart,
        End = inlineStart + (end - start) + 1
    },
    Line = line,
    Column = column,
    IssueNumber = new StringSlice(slice.Text, start, end)
};
I'm not sure if the Line and Column properties are used directly by Markdig, or if they are only for debugging or advanced AST scenarios. I'm also uncertain what the purpose of setting the Span property is - even though I based this code on the code from the Markdig repository, it doesn't seem to quite match up should I print out its contents. This leaves me wondering if I'm setting the wrong values. So far I haven't noticed any adverse effects though.
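To make the index arithmetic easier to reason about, here is a sketch of the same scan performed on a plain string, with no Markdig types involved. IssueScanner and TryMatch are hypothetical names, and this sketch insists on at least one digit after the #. One thing worth checking in the real parser is that slice.Start has already advanced past the digits by the time GetSourcePosition is called, so the reported start may not be the position of the # character. That is only a possible explanation for the mismatch; I haven't confirmed it.

```csharp
// Hypothetical stand-alone version of the scan, using a plain string.
public static class IssueScanner
{
    private static bool IsOpener(char c)
    {
        return c == '\0' || char.IsWhiteSpace(c) || c == '(' || c == '[';
    }

    private static bool IsCloser(char c)
    {
        return c == '\0' || char.IsWhiteSpace(c) || c == ')' || c == ']';
    }

    // hashIndex is the index of the # character. On success, start and end
    // hold the inclusive indexes of the first and last digit.
    public static bool TryMatch(string text, int hashIndex, out int start, out int end)
    {
        start = -1;
        end = -1;

        if (text == null || hashIndex < 0 || hashIndex >= text.Length || text[hashIndex] != '#')
        {
            return false;
        }

        char previous = hashIndex > 0 ? text[hashIndex - 1] : '\0';
        if (!IsOpener(previous))
        {
            return false;
        }

        int index = hashIndex + 1;
        while (index < text.Length && text[index] >= '0' && text[index] <= '9')
        {
            index++;
        }

        if (index == hashIndex + 1)
        {
            return false; // no digits followed the #
        }

        char next = index < text.Length ? text[index] : '\0';
        if (!IsCloser(next))
        {
            return false;
        }

        start = hashIndex + 1;
        end = index - 1;
        return true;
    }
}
```

For the text "See #123 now" the # is at index 4, so start is 5 and end is 7. An inclusive span covering the whole of #123 therefore runs from 4 to 4 + (7 - 5) + 1 = 7, which is the same arithmetic used for Span.End in the parser, provided the span start really is the position of the #.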
Creating the extension
The last thing to set up is the core extension itself. Markdig
extensions implement the IMarkdownExtension interface. This
simple interface exposes two overloads of a Setup method for
configuring the parsing and rendering aspects of the extension.
One of these overloads is for customising the pipeline - we'll add our parser here. The second overload is for setting up the renderer. Depending on the nature of your extension you may only need one or the other.
As this class is responsible for creating any renderers or parsers your extension needs, that also means it needs to have access to any required configuration classes to pass down.
public class MantisLinkerExtension : IMarkdownExtension
{
    private readonly MantisLinkOptions _options;

    public MantisLinkerExtension(MantisLinkOptions options)
    {
        _options = options;
    }

    public void Setup(MarkdownPipelineBuilder pipeline)
    {
        OrderedList<InlineParser> parsers;

        parsers = pipeline.InlineParsers;

        if (!parsers.Contains<MantisLinkInlineParser>())
        {
            parsers.Add(new MantisLinkInlineParser());
        }
    }

    public void Setup(MarkdownPipeline pipeline, IMarkdownRenderer renderer)
    {
        HtmlRenderer htmlRenderer;
        ObjectRendererCollection renderers;

        htmlRenderer = renderer as HtmlRenderer;
        renderers = htmlRenderer?.ObjectRenderers;

        if (renderers != null && !renderers.Contains<MantisLinkRenderer>())
        {
            renderers.Add(new MantisLinkRenderer(_options));
        }
    }
}
Firstly, I make sure the constructor accepts an argument of the
MantisLinkOptions class to pass to the renderer.
In the Setup overload that configures the pipeline, I first
check to make sure the MantisLinkInlineParser parser isn't
already present; if not I add it.
In a very similar fashion, in the Setup overload that
configures the renderer, I first check to see if a
HtmlRenderer renderer was provided - after all, you could be
using a custom renderer which wasn't HTML based. If I have got a
HtmlRenderer renderer then I do a similar check to make sure a
MantisLinkRenderer instance isn't present, and if not I create
one using the provided options class and add it.
Adding an initialisation extension method
Although you could register extensions by directly manipulating
the Extensions property of a MarkdownPipelineBuilder,
generally Markdig extensions include an extension method which
performs the boilerplate code of checking and adding the
extension. The extension below checks to see if the
MantisLinkerExtension has been registered with a given
pipeline, and if not adds it with the specified options.
public static MarkdownPipelineBuilder UseMantisLinks(this MarkdownPipelineBuilder pipeline, MantisLinkOptions options)
{
    OrderedList<IMarkdownExtension> extensions;

    extensions = pipeline.Extensions;

    if (!extensions.Contains<MantisLinkerExtension>())
    {
        extensions.Add(new MantisLinkerExtension(options));
    }

    return pipeline;
}
Using the extension
MarkdownPipeline pipeline;
string html;
string markdown;

markdown = "See issue #1";

pipeline = new MarkdownPipelineBuilder()
    .Build();
html = Markdown.ToHtml(markdown, pipeline); // <p>See issue #1</p>

pipeline = new MarkdownPipelineBuilder()
    .UseMantisLinks(new MantisLinkOptions("https://issues.cyotek.com/"))
    .Build();
html = Markdown.ToHtml(markdown, pipeline); // <p>See issue <a href="https://issues.cyotek.com/view.php?id=1" target="_blank" rel="noopener noreferrer">#1</a></p>
Example of using an extension to automatically generate links for MantisBT issue numbers.
Wrapping up
In this article I showed how to introduce new inline elements parsed from markdown. This example at least was straightforward; however, there is more that can be done. More advanced extensions such as pipe tables have much more complex parsers that generate a complete AST of their own.
Markdig supports other ways to extend itself too. For example, the Auto Identifiers extension shown at the start of the article doesn't parse markdown but instead manipulates the AST even as it is being generated. The Emphasis Extra extension injects itself into another extension to add more functionality to that. There appear to be quite a few ways you can hook into the library in order to add your own custom functionality!
A complete sample project can be downloaded from the URL below or from the GitHub page for the project.
Although I wrote this example with Mantis Bug Tracker in mind, it wouldn't take very much effort at all to make it cover innumerable other websites.
Update History
- 2017-08-05 - First published
- 2020-11-22 - Updated formatting
Downloads
Filename | Description | Version | Release Date | |
---|---|---|---|---|
MarkdigMantisLink.zip | Basic Markdig extension for the writing custom Markdig extensions blog post. | | 05/08/2017 | Download |