
Adding New Syntaxes to Hakyll
Configuring Hakyll's Pandoc Compiler to Support New Language Syntaxes
August 16, 2021
If you’re here, you’re probably trying to configure Hakyll to support additional languages for syntax highlighting. Or you’re “not satisfied with the built-in highlighting”.
If your journey here is similar to mine, I’ll assume that you’ve
stumbled on a post somewhere on the internet telling you that all you need to
do is take KDE’s syntax file for your language and pass it to Pandoc on the
command line with the --syntax-definition flag. And voilà!
The new language is ready!
The reality is not as easy as that. We are using Hakyll. Hakyll uses Pandoc under the hood to convert from Markdown to HTML, so we do not have command line access to Pandoc. However, Hakyll does expose the Pandoc settings to us. We can ask Hakyll for the Pandoc settings and update them with our own preferences. For the purpose of this post we’ll walk through how to set up a new language for syntax highlighting, but the process is similar for configuring other Pandoc options for Hakyll.
Before we get started, I want to note that the code examples below use packages from Stack’s LTS 16.31 release, meaning that we’ll be using GHC 8.8.4, Hakyll 4.13.4.0, Pandoc 2.9.2.1, and Skylighting 0.8.5. If you’re using older or newer versions, the actual code required may be different but the idea should be similar. “Follow the types” and all will fall into place.
Identify the Pandoc Compiler and Pandoc Options
The first step is to identify where the Pandoc compiler is being applied.
The default project makes use of pandocCompiler. Hakyll, however, provides
additional functions for setting Pandoc options for the compiler.
Those functions are pandocCompilerWith, pandocCompilerWithTransform, and
pandocCompilerWithTransformM. For details, check out the Hakyll.Web.Pandoc
module.
The function we’re interested in is pandocCompilerWith. This function
takes two parameters. The first is ReaderOptions from Text.Pandoc.Options.
ReaderOptions contains options that act on the raw input format as it
is converted to the Pandoc AST. The second parameter is WriterOptions.
WriterOptions holds settings that act on the Pandoc AST and affect the
final output.
Both ReaderOptions and WriterOptions are record types that can be
queried and updated. The setting we want is writerSyntaxMap, which is
part of WriterOptions. The next question is, where can we find the
default WriterOptions used by Hakyll?
Hakyll exposes defaultHakyllReaderOptions and defaultHakyllWriterOptions
with Hakyll’s defaults. We can reference these directly and pass them
to pandocCompilerWith.
An equivalent of pandocCompiler would be:
myPandocCompiler = pandocCompilerWith defaultHakyllReaderOptions defaultHakyllWriterOptions
With this knowledge we can now query for Hakyll’s default syntax map:
defaultHakyllSyntaxMap = writerSyntaxMap defaultHakyllWriterOptions
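Before adding anything, it can help to see what the default map already contains. The following is a small sketch, assuming the hakyll, pandoc, and containers packages are available; knownSyntaxNames is a name I made up for illustration. It relies on SyntaxMap being an ordinary Data.Map, so its keys can be listed directly.

```haskell
import qualified Data.Map as Map
import Hakyll (defaultHakyllWriterOptions)
import Text.Pandoc.Options (writerSyntaxMap)

-- | Print the keys of Hakyll's default syntax map, one per line.
-- 'knownSyntaxNames' is a hypothetical helper, not part of Hakyll.
knownSyntaxNames :: IO ()
knownSyntaxNames =
  mapM_ print (Map.keys (writerSyntaxMap defaultHakyllWriterOptions))

main :: IO ()
main = knownSyntaxNames
```

If your language already shows up in that list, you may not need a custom syntax file at all.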
At this point we can build a custom Pandoc compiler and query
Hakyll’s default syntax map so we can augment it. Next we need to find a way to
load additional syntax definition files. Pandoc uses the Skylighting library
to do its syntax highlighting. Skylighting takes
care of loading, parsing, and outputting styled syntax.
Loading KDE XML Syntax
In the above snippet, defaultHakyllSyntaxMap is of type SyntaxMap,
which comes from the Skylighting.Types module. Skylighting contains
functions to load XML syntax files. Skylighting.Loader contains loadSyntaxFromFile
and loadSyntaxesFromDir to load a file and a directory respectively.
When we make use of these functions we need to do a bit of unwrapping from
types like IO (Either String Syntax).
import Data.Either (fromRight)
import System.IO.Unsafe (unsafePerformIO)
import Skylighting.Loader (loadSyntaxesFromDir)
import Skylighting.Syntax (defaultSyntaxMap)
import Skylighting.Types (SyntaxMap)

ioResult :: IO (Either String SyntaxMap)
ioResult = loadSyntaxesFromDir "syntax"

eitherResult :: Either String SyntaxMap
eitherResult = unsafePerformIO ioResult

loadedSyntaxMap :: SyntaxMap
loadedSyntaxMap = fromRight defaultSyntaxMap eitherResult
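The snippet above loads a whole directory. For a single file, loadSyntaxFromFile returns an IO (Either String Syntax) instead, so the unwrapping looks slightly different. Here is a rough sketch of how one file could be added to an existing map; addSyntaxFromFile is a hypothetical helper of my own, and it assumes SyntaxMap is a Data.Map keyed by each syntax's sName, which is how Skylighting.Types defines it as far as I can tell.

```haskell
import qualified Data.Map as Map
import Skylighting.Loader (loadSyntaxFromFile)
import Skylighting.Types (Syntax (sName), SyntaxMap)

-- | Try to add the syntax defined in one XML file to an existing map.
-- A 'Left' carries the loader's error message unchanged.
-- 'addSyntaxFromFile' is a hypothetical helper, not part of Skylighting.
addSyntaxFromFile :: FilePath -> SyntaxMap -> IO (Either String SyntaxMap)
addSyntaxFromFile path syntaxes = do
  result <- loadSyntaxFromFile path
  -- Map.insert replaces any existing entry stored under the same name.
  pure (fmap (\syntax -> Map.insert (sName syntax) syntax syntaxes) result)
```

Keeping the Either in the result leaves it up to the caller to decide whether a broken syntax file should stop the build or just be skipped.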
At this point we’ve successfully loaded a new SyntaxMap using
Skylighting’s functions and are ready to make use of it.
We can join this new SyntaxMap with the default SyntaxMap Hakyll knows about:
updatedSyntaxMap :: SyntaxMap
updatedSyntaxMap =
  let defaultSyntaxMap = writerSyntaxMap defaultHakyllWriterOptions
  in defaultSyntaxMap `mappend` loadedSyntaxMap
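One detail worth knowing about mappend here: for Data.Map it is a left-biased union, so when both maps contain the same key the entry from the left-hand map is kept. The toy example below uses plain strings in place of real syntax definitions (the keys and values are made up) to show the effect of the argument order.

```haskell
import qualified Data.Map as Map

-- Stand-ins for the real maps; values are labels, not Syntax records.
defaults :: Map.Map String String
defaults = Map.fromList [("haskell", "built-in")]

loaded :: Map.Map String String
loaded = Map.fromList [("haskell", "custom"), ("nix", "custom")]

main :: IO ()
main = do
  -- Defaults on the left: the built-in entry wins on a clash.
  print (Map.toList (defaults `mappend` loaded))
  -- prints [("haskell","built-in"),("nix","custom")]

  -- Loaded on the left: the custom entry wins on a clash.
  print (Map.toList (loaded `mappend` defaults))
  -- prints [("haskell","custom"),("nix","custom")]
```

So with the order used in updatedSyntaxMap, new languages are added but a built-in definition is not overridden. If the goal is to replace a built-in definition, swapping the two arguments should do it.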
We can then create an updated WriterOptions based on the Hakyll
defaults.
pandocWriterOptions :: WriterOptions
pandocWriterOptions = defaultHakyllWriterOptions {
  writerSyntaxMap = updatedSyntaxMap }
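The same record-update pattern applies to other WriterOptions fields. As an illustration only, here is a sketch that switches the HTML math rendering method to MathJax; mathWriterOptions is a name I picked for this example, and I am assuming the writerHTMLMathMethod field and defaultMathJaxURL are exported by Text.Pandoc.Options in your Pandoc version.

```haskell
import Hakyll (defaultHakyllWriterOptions)
import Text.Pandoc.Options
  ( HTMLMathMethod (MathJax)
  , WriterOptions (writerHTMLMathMethod)
  , defaultMathJaxURL
  )

-- | Hakyll's default writer options with one field changed.
-- 'mathWriterOptions' is a hypothetical name used for illustration.
mathWriterOptions :: WriterOptions
mathWriterOptions = defaultHakyllWriterOptions
  { writerHTMLMathMethod = MathJax defaultMathJaxURL
  }
```

Every field that is not mentioned in the update keeps Hakyll's default value, which is why starting from defaultHakyllWriterOptions is convenient. The page template would most likely still need to load the MathJax script itself.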
All Together
Now that we know more about the constituent parts of configuring a Pandoc compiler with our custom settings, we can bring it all together.
import Data.Monoid (mappend)
import Data.Either (fromRight)
import System.IO.Unsafe (unsafePerformIO)
import Hakyll
import Text.Pandoc.Options
import Skylighting.Loader (loadSyntaxesFromDir)
import Skylighting.Syntax (defaultSyntaxMap)
import Skylighting.Types (SyntaxMap)

-- | 'loadedSyntaxMap' loads XML syntax definitions, or falls back to
-- the default syntax map if the load operation fails.
loadedSyntaxMap :: SyntaxMap
loadedSyntaxMap =
  let ioResult = loadSyntaxesFromDir "syntax"
      eitherResult = unsafePerformIO ioResult
  in fromRight defaultSyntaxMap eitherResult

-- | 'updatedSyntaxMap' extracts the default Hakyll syntax map
-- and appends 'loadedSyntaxMap' to include new syntaxes.
updatedSyntaxMap :: SyntaxMap
updatedSyntaxMap =
  let defaultSyntaxMap = writerSyntaxMap defaultHakyllWriterOptions
  in defaultSyntaxMap `mappend` loadedSyntaxMap

-- | 'pandocWriterOptions' updates Hakyll's default WriterOptions
-- with the new writerSyntaxMap.
pandocWriterOptions :: WriterOptions
pandocWriterOptions = defaultHakyllWriterOptions {
  writerSyntaxMap = updatedSyntaxMap
  }

-- | 'customPandocCompiler' defines the Pandoc compiler with
-- custom loaded syntaxes and settings.
customPandocCompiler :: Compiler (Item String)
customPandocCompiler = pandocCompilerWith defaultHakyllReaderOptions pandocWriterOptions
And finally, we can use the new customPandocCompiler in the compile step for
posts. This will replace what would have been the default pandocCompiler.
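-- Needs the OverloadedStrings extension, as in the default Hakyll site.
-- 'postCtx' is the post context defined in the default site.hs.
main :: IO ()
main = hakyll $ do
  match "posts/*" $ do
    route $ setExtension "html"
    compile $ customPandocCompiler
      >>= loadAndApplyTemplate "templates/post.html" postCtx
      >>= loadAndApplyTemplate "templates/default.html" postCtx
      >>= relativizeUrls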
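The listing above leans on unsafePerformIO to get the loaded syntaxes into a pure value. Since main already runs in IO, another option is to load the syntax files there and pass the result to the compiler as an argument. This is a sketch of that idea, not a drop-in replacement: compilerWithSyntaxes is a hypothetical name, the templates are left out to keep it short, and unlike the original it stops the build with the loader's error message instead of silently falling back to the defaults.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
import Skylighting.Loader (loadSyntaxesFromDir)
import Skylighting.Types (SyntaxMap)
import Text.Pandoc.Options (WriterOptions (writerSyntaxMap))

-- | Build a Pandoc compiler from an already-loaded map of extra syntaxes.
-- 'compilerWithSyntaxes' is a hypothetical helper, not part of Hakyll.
compilerWithSyntaxes :: SyntaxMap -> Compiler (Item String)
compilerWithSyntaxes extraSyntaxes =
  pandocCompilerWith defaultHakyllReaderOptions writerOptions
  where
    writerOptions = defaultHakyllWriterOptions
      { writerSyntaxMap =
          writerSyntaxMap defaultHakyllWriterOptions `mappend` extraSyntaxes
      }

main :: IO ()
main = do
  -- Load the XML definitions once, before Hakyll starts.
  result <- loadSyntaxesFromDir "syntax"
  -- Abort with the loader's message if anything failed to parse.
  extraSyntaxes <- either fail pure result
  hakyll $
    match "posts/*" $ do
      route $ setExtension "html"
      compile $ compilerWithSyntaxes extraSyntaxes >>= relativizeUrls
```

Whether failing loudly is preferable to the silent fallback depends on the site; for a blog where a missing highlighter is easy to overlook, I would lean towards the loud version.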