Write Hypertext, not Plaintext
¶ I visit Derek Sivers' website from time to time. Every time I do, I discover he has
- written some half a hundred new blog posts,
- moved to another place,
- and started a new diet/philosophy/routine/book.
¶ He's too productive for a human, but that's a topic for a different time. The topic for today is his choice of data formats.
§ All You Need is Plaintext
¶ One of the posts that startled me the most was Derek's Write Plain Text Files. Storing all of one's life, writing, and data in plaintext files is too appealing to ignore:
¶ The problem is: Derek's are not really plaintext files. He invents his own metadata headers. Date, tags, and title—pieces of information that don't belong to text-only files. Ad-hoc extension to the headers-and-paragraphs structure everyone does in plaintext. Waaaaaait, the headers-and-paragraphs structure is ad-hoc too!
¶ Plaintext is not enough. The moment you start adding metadata, links, lists, and quotations—you've transgressed. It's richtext now. The worst kind of richtext:
- and likely inconsistent across time.
The perfect plaintext file
¶ The most readable plaintext file that I've ever seen is this define-syntax tutorial. But even there, the syntax, though perfectly readable, is author-specific and non-portable. It'd benefit from being HTML instead, but oh well.
§ The Plaintext is a Lie
I have to clarify: there are actually two types of "plaintext" we're talking about here.
Both Derek Sivers and Scott Nesbit make the mistake of conflating these types.
What plaintext devotees usually mean is a storage format, as in binary-versus-plaintext.
That's the first part of the MIME type:
¶ Then, there's a second part of MIME: subtype. That's where it gets ugly (citing Scott Nesbit):
¶ And then:
¶ Plaintext is code, web page sources, configuration files, etc... And you can also open and preview it all in your editor? This blog post you're reading is a pure horror when compiled from Lisp (C preprocessor, actually) to HTML. The code that supports it is quite unreadable because I didn't really care about the style of the code hosting my blog. And then, this supporting code is Lisp, which causes nausea for the programmers used to their slightly-incompatible C-like syntax language.
¶ There are really two kinds of plaintext. Plaintext-the-storage-type and plaintext-the-display/markup/software-subtype. The former is eternal, and the latter is as ephemeral as the software/human processing it.
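The type/subtype split is easy to see in code. Below is a minimal sketch, not a full MIME parser: `split_mime` and `is_text_storage` are hypothetical helper names, and treating only the `text` top-level type as "plaintext storage" is a simplifying assumption made for this illustration.

```python
def split_mime(mime_type: str) -> tuple[str, str]:
    """Split 'type/subtype; params' into a lowercased (type, subtype) pair."""
    # Drop optional parameters such as '; charset=UTF-8'.
    essence = mime_type.split(";", 1)[0].strip().lower()
    top, _, sub = essence.partition("/")
    if not top or not sub:
        raise ValueError(f"not a type/subtype pair: {mime_type!r}")
    return top, sub


def is_text_storage(mime_type: str) -> bool:
    """Assumption for this sketch: only the 'text' type counts as plaintext storage."""
    return split_mime(mime_type)[0] == "text"


samples = ["text/plain", "text/markdown; charset=UTF-8", "text/html", "application/pdf"]
for sample in samples:
    top, sub = split_mime(sample)
    print(f"{sample}: storage type={top}, subtype={sub}, text storage={is_text_storage(sample)}")
```

The first three samples share the same storage type and differ only in the subtype, which is the part that decides how the file is supposed to be read.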
¶ That's why I'm taking an "anti-plaintext" stance here. You must use plaintext as a storage type, but, more important than that, you have to pick the most reliable subtype. Otherwise, all your "plaintext" files from 1991 get too ugly to understand 😛
§ Screw .TXT, All My Homies Use Markdown
¶ So plaintext is not enough, and one needs a structured and metadata-aware text format. Most of my fellow programmers realize that at some point. They usually convert to Markdown soon afterward.
¶ Markdown is the de facto standard for Zettelkasten notes, blog posts, and project docs. Obsidian supports markdown, GitHub supports it, VSCode supports it—everything supports it. It has headings, lists, links, metadata headers, HTML injection, and other platform-specific incompatible goodies. And it looks nice when rendered!
¶ One can argue for a particular richtext format (Org Mode (possibly with Orgdown), YAML, Wiki), each with its benefits. But, essentially, everyone is okay with richtext in whatever format they have. Obsidian, Roam, Brain, and other knowledge systems make these atomic richtext files interconnected. They bring structure and hierarchy to them. One's Markdown files are a self-sufficient knowledge web now. But, for a paranoid like me, these systems are no consolation.
§ Knowledge Webs. Better ones.
¶ Obsidian will end. Myspace, Google Reader, and a dozen other vital tech products did. Once it's gone, your knowledge web is at best scattered, at worst destroyed. The Markdown files you have are subtly incompatible with the new knowledge system. Your reference system is lost.
¶ One of the social tech-agnostic solutions (because social solutions are superior to technical ones) would be to pick another, more reliable, referencing system. Like the academic one, with a dozen metadata fields unique to the given paper/post/media. It's been there for an eternity, and will likely last: the academy is not letting reliable systems go. And there are technical solutions to keep your references clean and consistent. Like Zotero and Citation Machine.
¶ But, even with these citation and referencing tools, keeping your knowledge base up to date is hard. You have to update references, load files, generate links, and interface with the document/richtext editor of choice. Would be nice to have something with (back-)linking, formatting, and plaintext-like data persistence. Something like...
¶ The academy made a life-changing gift to modern civilization in the nineties. HTTP, URI, and HTML are simple yet reliable foundations for the modern Internet. For a reason:
- HTML has its roots in the academy with its semantic markup and referencing fetishes.
- HTTP is plaintext in the most reliable meaning of plaintext
- and URI is a system for unique, linkable, and readable data references.
HTML was (and still is) intended to replace printouts, libraries, and opaque PDFs. CSS follows suit.
<a> tag and a href=URI give one the full power of referencing unique data, in the minimum amount of characters possible.
Tables, lists, paragraphs, and other structural tags (especially in HTML5) cover all the possible needs of a writer, followed by
and block/multiline quotations, addresses, and whatnot.
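To make the referencing point concrete, here is a small sketch of what a URI carries and how a short `href` resolves against the page it sits on. The `example.org` addresses and the `p3` fragment are made up for illustration; only the standard-library `urllib.parse` functions are real.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical address of one paragraph inside a hypothetical post.
uri = "https://example.org/notes/plaintext.html#p3"

parts = urlsplit(uri)
print(parts.scheme)    # how to fetch the document
print(parts.netloc)    # where it lives
print(parts.path)      # which document it is
print(parts.fragment)  # which piece of that document

# A short href is resolved against the page that contains it.
base = "https://example.org/notes/plaintext.html"
same_page = urljoin(base, "#p3")
sibling_page = urljoin(base, "hypertext.html#p1")
print(same_page)
print(sibling_page)
```

A four-character `href="#p3"` is enough to point at one paragraph of the current page, because the rest of the address comes from the document itself.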
¶ I'm getting stoned every time I open HTML and CSS references on MDN or WHATWG. Seems like they anticipated every single use-case for semantic pages on the Web. HTML is Markdown but with all the possible metadata one needs.
¶ If you've read this far, you've probably seen the silcrow signs near headings and pilcrow signs near paragraphs. These link to the sections and paragraphs they precede. I've reinvented Bible-like linking, and I've done it in the only system that was flexible enough to do that: HTML.
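One way such self-linking paragraphs and headings could be generated is sketched below. This is not the markup or the code actually used for this blog: `linked_paragraph`, `linked_heading`, and the id values are hypothetical, and the ids are assumed to contain no whitespace.

```python
from html import escape


def linked_paragraph(fragment_id: str, text: str) -> str:
    """Return a paragraph that carries a pilcrow link pointing at itself."""
    safe_id = escape(fragment_id, quote=True)
    return f'<p id="{safe_id}"><a href="#{safe_id}">¶</a> {escape(text)}</p>'


def linked_heading(fragment_id: str, text: str, level: int = 2) -> str:
    """Return a heading that carries a section-sign link pointing at itself."""
    safe_id = escape(fragment_id, quote=True)
    return f'<h{level} id="{safe_id}"><a href="#{safe_id}">§</a> {escape(text)}</h{level}>'


print(linked_heading("lie", "The Plaintext is a Lie"))
print(linked_paragraph("p1", "Plaintext & <markup> are not the same thing."))
```

Every paragraph gets its own address, so a reference can point at a single claim instead of a whole page.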
Your data is safe in HTML because it's still plaintext.
Your data is portable in HTML (even if it's my ugly sort of HTML) because it's not an ad-hoc plaintext extension.
Your data is pretty in HTML because HTML+CSS is a plaintext format intended for display (even if it's an audio "display", ahem.)
Your data is meta-enriched in HTML because there's a tag, or attribute for any type of metadata you can imagine.
Your data is a knowledge web in HTML because it's all interlinked and machine-parseable.
By default. Forever.
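The "machine-parseable" part can be checked with nothing but a standard library. The sketch below uses Python's `html.parser` to pull metadata and links out of a page; the page content, the author name, and the `MetaAndLinks` class are invented for the example.

```python
from html.parser import HTMLParser


class MetaAndLinks(HTMLParser):
    """Collect <meta name=... content=...> pairs and <a href=...> targets."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: dict[str, str] = {}
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and "name" in attributes:
            self.meta[attributes["name"]] = attributes.get("content") or ""
        elif tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])


# Hypothetical note: metadata lives in <meta>, references live in <a href>.
PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example note</title>
  <meta name="author" content="Example Author">
  <meta name="keywords" content="plaintext, hypertext">
</head>
<body>
  <p id="p1"><a href="#p1">¶</a> See <a href="other-note.html#p2">another note</a>.</p>
</body>
</html>"""

parser = MetaAndLinks()
parser.feed(PAGE)
print(parser.meta)
print(parser.links)
```

No knowledge system is involved here: the metadata and the link graph are recoverable from the file alone.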
¶ Write Hypertext Knowledge Webs, Not Plain Text Files.