lark: Designing a markup language

hannah

This document represents my attempt to design a markup language for HD-DN. It's heavily inspired by

The language is in the public domain, and I exert no rights over it.

Motivations

Design details

lark is a line-based markup language, where context is defined by the first few characters on a line. There is no in-line markup (e.g. bold, italics).

Structure

A lark is structured as a series of articles. Each article is composed of sections, and each section is made up of blocks. Blocks are made up of content, all of which is of the same type (paragraph, link, image). An example lark might look like this:

+--------------------------------------+
|  lark                                |
|  +--------------------------------+  |
|  |  article                       |  |
|  |  +--------------------------+  |  |
|  |  | section                  |  |  |
|  |  | +---------------------+  |  |  |
|  |  | | block               |  |  |  |
|  |  | +---------------------+  |  |  |
|  |  +--------------------------+  |  |
|  |                                |  |
|  |  +--------------------------+  |  |
|  |  | section                  |  |  |
|  |  | +---------------------+  |  |  |
|  |  | | block               |  |  |  |
|  |  | +---------------------+  |  |  |
|  |  | +---------------------+  |  |  |
|  |  | | block               |  |  |  |
|  |  | +---------------------+  |  |  |
|  |  | +---------------------+  |  |  |
|  |  | | block               |  |  |  |
|  |  | +---------------------+  |  |  |
|  |  +--------------------------+  |  |
|  +--------------------------------+  |
+--------------------------------------+

This is a lark made up of a single article. The article is made up of two sections: the first contains a single block, while the second contains three blocks.

Glyphs & blocks

Glyphs are used to indicate the type of block being written. They are always the first character of a line. If the first character of a line is not a glyph, then the assumption is that the block being written is a paragraph.

=    Header
-    Subheader
_    Subsubheader
+    Date
~    Author
@    Link
!    Image
>    Blockquote
*    Unordered list
:    Ordered list
'    Preformatted text
`    Code block
---  Section divider
***  Article divider

lark attempts to be a simple but strict superset of gemtext. You should be able to convert valid gemtext into valid lark using less than 10 lines of simple shell script. If this isn't possible, then lark has failed in its goals.

Block specifications

Paragraphs

Paragraphs are

Paragraph specifications are the same as for gemtext: a completely blank line will be read as an empty paragraph.

Headings

Headings are

(lark)
= This is a main header  
- And a subheader
_ A subsubheader now

All whitespace between the glyph and the first non-whitespace character will be ignored. Everything else, including whitespace, will be read verbatim.

Links and images

Links and images are

The human-readable description is optional for links, but mandatory for images since it will be used as alt-text.

(lark)
@   https://hd-dn.com   HD-DN
!   /path/to/bird.png   A picture of a bird

Whether images are displayed inline or rendered as links is left up to the client.

Blockquotes

Blockquotes are

(lark)
>This is a valid blockquote
> This is also a valid blockquote

Lists

Lists are

List items are

(lark)
*this is a valid unordered list item
* and so is this

:this is a valid ordered list item
: and so is this

Sequential list items of the same type are put into an unordered or ordered list block. For example,

(lark)
: Here's an item
: And another
: A third

might get processed to something like

Block(
  type: OrderedList,
  content: [
    Block(type: ListItem, content: "Here's an item"),
    Block(type: ListItem, content: "And another"),
    Block(type: ListItem, content: "A third")
  ]
)

Pre-formatted and code blocks

Pre-formatted and code blocks are

Using ''' or ``` will toggle the client into a preformatted mode. While in this mode, any text will be processed verbatim into the block. For example

(lark)
'
'    This is a pre-formatted block
'      So all
'    these spaces should appear in content.

might be processed into something like


Block(
  type: Preformatted,
  content: [
    "    This is a pre-formatted block",
    "      So all",
    "    these spaces should appear in content."
  ]
)

Code blocks and pre-formatted blocks can have attributes, indicated by the structure

(lark)
'(attr)
' ...

or

(lark)
`(attr)
` ...

For example, a poem might be written as

(lark)
'(poem)
'a
'  poem is a type
'of written
'  WORD!

and this could give the block the "poem" attribute, which might be useful to the client in some way. Attributes must be a single word containing no whitespace.

Sections and articles

Sections and articles are

Section and article dividers are

(lark)
***

or

(lark)
---

Everything in a lark implicitly lives within an article and a section. When a divider is used, the current context ends and a new one begins. When the article divider is used it also ends the current section and starts a new one within the new article: you will never had sections that span multiple articles.

Reference

What follows is a lark reference document that can be used to test client rendering.

lark Reference Document

hannah

Purpose

This document contains every lark block and can be used to test client rendering.

Items

Paragraphs

This is a paragraph of text.

This is another, and so is the line above, which is blank.

Lists

  1. This is the first element of an ordered list
  2. This is the second

Links and images

So here's a link

and here's an image

Three linocut prints  by Ritual Dust displayed next to a mirror. The top one is a mushroom growing out of a hand. The middle is a snake and a key, representing the goddess Hekate. The bottom is of the plant henbane.

Blockquotes

Here is an example blockquote

Preformatted block

This text should
   all
   be
   pre-formatted! ! !

Code block

(go)
package main

import (
  "fmt"
)

func main() {
  fmt.Println("hi!")
}