I Generated This Post With C Preprocessor
By Artyom Bologov
It's a new phase, so I'm re-evaluating my life and tech choices yet again. This time, I identify as a C programmer. I'm moving to C-based software and trying to script everything with C. So why not move my website from Lisp to C too? To C preprocessor, actually.
There are several inspirations for this idea:
- Quake hack for raw file injection.
- An anecdote about preprocessor-based website.
- One attempt to generate HTML with C,
- And another HTML generation library.
So, technically, I'm not the first one to generate a website with C preprocessor (let's call it CPP). But I'm up to the challenge of making it actually usable and pretty! The page you're looking at is generated with C preprocessor, so consider that a success.
C Preprocessor Is a Templating Engine, Actually
CPP is quite dumb: it operates on code that's not even parsed yet. Which is bad if you want to make Lisp-like macros. And good if you want to embed some text or files into arbitrary text. Even if this text is HTML. So preprocessor is a templating engine of sorts.
But CPP possesses several advantages over tools like Mustache:
- Portability
- C compiler with a preprocessor is available for every OS and every toolset.
- Familiarity
- Every C programmer knows how CPP works. Almost every programmer knows HTML. If they don't, they can easily learn both preprocessor and HTML.
- Zero dependencies, no building
- Again, C compilers are everywhere. And you don't need any third-party libraries to build a website with preprocessor.
Preprocessor also has built-in recursive file inclusion.
One can write HTML files with preprocessor directives in them.
Here's how a template file (say, template/head) might look:
You can then #include "template/head" from another file.
Problematic Chars
Preprocessor, as any templating engine, has some special chars that you have to handle. And preprocessor doesn't make it easy to work with these.
- Hash Sign
- The most obvious preprocessor offender is the hash sign. Preprocessor interprets a hash sign at the start of a line as a directive. And fails silently if it cannot interpret the directive properly.
  - If you use hash in plaintext content, just replace it with &num; and enjoy: safe hash sign: #
  - If you need hash in element IDs/fragments, quote it as per HTML attribute syntax and it will be recognized as C quoted string (nice feature of the preprocessor!): <a href="#link-fragment">...</a>
  - Hash in HTML entities: you don't need it, because you have Unicode.
- Unicode Chars
- GCC in particular is bad at it. It expands 😃 to U0001f603, for example. That's why I use Clang.
- Comments
- Compilation stages before preprocessor remove comments, unless you instruct preprocessor not to. You simply have to provide a -C flag. GCC is making it hard again: it's adding its own comments to the output. That's why I use Clang.
Is That Worth It?
Given all these problems with chars and the fact that preprocessor is scary, is it worth it? Actually, yes. The preprocessor-based setup abstracts away the repetitive parts, while keeping things simple and portable enough. And, however painful it is to acknowledge, it's simpler than my previous Lispy setup.
You can review all of my build code here:
- Makefile that builds blog posts.
- Templates for page head, header,
- and a copyright footer.
- And the exact source file for this page: this-post-is-cpp.h