Semantic Markdown is a plain-text format
for writing documents that embed machine-readable data.
The documents are easy to author and both human- and machine-readable,
so that the structured data contained within these documents
is available to tools and applications.
Technically speaking,
Semantic Markdown is "RDFa Lite for Markdown"
and aims at enhancing the HTML generated from Markdown
with RDFa Lite attributes.
Design Rationale:
This document is in an early draft stage!
Interested in joining the idea or providing feedback?
Semantic annotations are declared within curly braces `{...}`.
Semantic Markdown provides 3 types of annotations:
- `.` indicates a type/class, `typeof` attribute: `{.foaf:Person}`
- no leading character indicates a property, `property` attribute: `{foaf:name}`
- `=` indicates an IRI of a known entity, `resource` attribute: `{=wdt:Q42}`
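As a rough illustration of the three annotation types, the sketch below maps each hint to the RDFa Lite attribute named in the list above. The function names are hypothetical and not part of any Semantic Markdown parser; the grouping of several hints of the same type into one space-separated attribute value is an assumption based on how RDFa treats `property` and `typeof`.

```python
def hint_to_attribute(hint):
    """Map one hint (without braces) to an (attribute, value) pair."""
    if hint.startswith("."):
        return ("typeof", hint[1:])      # type/class
    if hint.startswith("="):
        return ("resource", hint[1:])    # IRI of a known entity
    return ("property", hint)            # no leading character: property


def annotation_to_attributes(annotation):
    """Turn an annotation such as '{=wdt:Q42 .foaf:Person}' into HTML attributes."""
    grouped = {}
    for hint in annotation.strip()[1:-1].split():
        name, value = hint_to_attribute(hint)
        grouped.setdefault(name, []).append(value)
    return " ".join(
        '{}="{}"'.format(name, " ".join(values)) for name, values in grouped.items()
    )


print(annotation_to_attributes("{.foaf:Person}"))  # typeof="foaf:Person"
print(annotation_to_attributes("{foaf:name}"))     # property="foaf:name"
print(annotation_to_attributes("{=wdt:Q42}"))      # resource="wdt:Q42"
```

The sketch does no validation of prefixes or IRIs; it only shows the correspondence between leading character and attribute.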
Would produce the following HTML+RDFa:
Notice how the IRI namespace "schema" is implicitly resolved
from its listing in the RDFa Core Initial Context.
Would produce the following HTML+RDFa:
Semantic Markdown is declared as sets of hints.
Each set of hints is declared either directly where applied
or indirectly tied to links.
Hints may use shortened CURIE notation,
where uncommon vocabularies need to be defined.
Semantic Markdown is written
as a set of zero or more whitespace-delimited hints,
wrapped in curly braces `{...}`.
Each hint consists of a type identifier and an address.
The type identifier is either `.`, `=`, or none.
The address is either an IRI wrapped in angle brackets `<...>`
or an RDFa CURIE.
All CURIEs must use
either an explicitly defined prefix
or a prefix listed in RDFa Core Initial Context.
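The grammar above can be sketched as a small parser. This is a minimal sketch under stated assumptions, not a reference implementation: `parse_hints`, `INITIAL_CONTEXT` and the result labels are hypothetical names, only a handful of the prefixes from the RDFa Core Initial Context are listed, and CURIE prefixes are matched more loosely than the RDFa grammar requires.

```python
import re

# A few prefixes from the RDFa Core Initial Context; the real list is longer.
INITIAL_CONTEXT = {"foaf", "schema", "dc", "rdfs", "xsd"}

# Optional type identifier, then either <IRI> or prefix:reference.
HINT_RE = re.compile(
    r"^(?P<kind>[.=]?)(?:<(?P<iri>[^<>\s]+)>|(?P<prefix>[^:\s<>]*):(?P<ref>\S*))$"
)
KINDS = {".": "type", "=": "resource", "": "property"}


def parse_hints(annotation, defined_prefixes=frozenset()):
    """Parse '{...}' into a list of (kind, address) pairs.

    Raises ValueError when a hint is malformed or uses an unknown prefix.
    """
    text = annotation.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("annotation must be wrapped in curly braces")
    known = INITIAL_CONTEXT | set(defined_prefixes)
    hints = []
    for token in text[1:-1].split():
        match = HINT_RE.match(token)
        if match is None:
            raise ValueError(f"malformed hint: {token!r}")
        if match["iri"] is not None:
            address = match["iri"]          # full IRI, brackets removed
        else:
            if match["prefix"] not in known:
                raise ValueError(f"undefined prefix: {match['prefix']!r}")
            address = f"{match['prefix']}:{match['ref']}"
        hints.append((KINDS[match["kind"]], address))
    return hints


print(parse_hints("{.foaf:Person foaf:name}"))
print(parse_hints("{=wdt:Q42}", defined_prefixes={"wdt"}))
```

An empty annotation `{}` yields an empty list, matching "zero or more" hints.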
:heavy_exclamation_mark: FIXME: write this…
:heavy_exclamation_mark: FIXME: write this…
Semantic Markdown is applied to content in different ways:
Hints immediately following an explicitly confined span of text
apply to the span;
i.e. bare spans (square brackets: `[...]`),
underline (underscore: `_..._`),
emphasis (asterisk: `*...*`),
strong emphasis (double asterisk: `**...**`),
inline code (backticks: `…`),
or link (square brackets + parentheses: `[...](...)`).
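To make the span rule concrete, the following sketch finds explicit spans that are immediately followed by hints. It is a simplification under assumptions: the regular expression ignores nesting and escaping, which a real Markdown parser would handle, and `find_annotated_spans` is a hypothetical helper name.

```python
import re

SPAN_WITH_HINTS = re.compile(
    r"""
    (?P<span>
        \[[^\]]*\]\([^)]*\)   # link
      | \[[^\]]*\]            # bare span
      | \*\*[^*]+\*\*         # strong emphasis
      | \*[^*]+\*             # emphasis
      | _[^_]+_               # underline
      | `[^`]+`               # inline code
    )
    \{(?P<hints>[^{}]*)\}     # hints with no whitespace before the brace
    """,
    re.VERBOSE,
)


def find_annotated_spans(line):
    """Return (span source, list of hints) for every span directly followed by hints."""
    return [(m["span"], m["hints"].split()) for m in SPAN_WITH_HINTS.finditer(line)]


line = "Meeting with [Bob]{.foaf:Person} and *Alice*{foaf:name}; [Carol] has no hints."
print(find_annotated_spans(line))
# [('[Bob]', ['.foaf:Person']), ('*Alice*', ['foaf:name'])]
```

The span `[Carol]` is not reported because no hints immediately follow it.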
Would produce the following HTML+RDFa:
Notice how the third sentence above has no hints
immediately following the span,
and the fourth sentence has no explicit span.
Hints not immediately following an explicit span,
in a block with non-whitespace characters
before the hints and none after,
apply to the block.
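A minimal sketch of this block rule, assuming a block is available as a plain string: hints count as block-scoped only when text precedes them, nothing but whitespace follows them, and they do not directly follow a span delimiter. The helper name `block_hints` is hypothetical, and the check for a preceding span is a heuristic based on the last character before the brace.

```python
import re

# Non-whitespace text, optional whitespace, then hints at the very end of the block.
TRAILING_HINTS = re.compile(
    r"^(?P<text>.*\S)\s*\{(?P<hints>[^{}]*)\}\s*$", re.DOTALL
)
SPAN_CLOSERS = ("]", ")", "_", "*", "`")


def block_hints(block):
    """Return (text, hints) if trailing hints apply to the whole block, else None."""
    match = TRAILING_HINTS.match(block)
    if match is None:
        return None  # no hints, no text before them, or something follows them
    brace_index = block.rindex("{")
    if block[brace_index - 1] in SPAN_CLOSERS:
        return None  # hints directly follow an explicit span: span scope instead
    return match["text"], match["hints"].split()


print(block_hints("Bob is here. {foaf:name}"))       # ('Bob is here.', ['foaf:name'])
print(block_hints("Bob is here {foaf:name}."))       # None
print(block_hints("Meeting with [Bob]{foaf:name}"))  # None
```

The second call returns `None` because punctuation follows the hints, so the block rule does not apply.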
Would produce the following HTML+RDFa:
Notice how the second paragraph has punctuation after the hints.
Similarly for a list:
Would produce the following HTML+RDFa:
Hints in a block with non-whitespace characters
after the hints and none before,
apply to the block and any descendant blocks.
If the resulting scope does not correspond
to an already generated HTML scope,
then a `div` is added.
In particular, when the resulting scope is the whole Markdown context,
a Markdown parser targeting a full HTML document
(not only a subset of the body, as Markdown generally does)
may apply the hints to the `<html>` tag.
Would produce the following HTML+RDFa:
Notice how the second header and the succeeding paragraph are wrapped
with a div tag,
whereas the third header is omitted
because it is not a descendant but a sibling.
Hints in a header or list block
with no non-whitespace characters before or after the hints,
followed by a block of same type and level,
apply individually to each following block of the same type and level,
until any block of a lower level.
Would produce the following HTML+RDFa:
Similarly for a list:
Is equivalent to
And would produce the following HTML+RDFa:
Hints in a block
with no non-whitespace characters before or after the hints,
followed by a different type or level of block,
apply to the following block
and any contained or descendant blocks;
or, when followed by a non-header non-list block,
apply to the following block and any following siblings
and any contained or descendant blocks of any of them.
For the context of this definition,
a paragraph or any container block
(a block which can contain other blocks)
is considered to be a descendant of a leaf block
(a block which cannot contain other blocks,
e.g. a header or horizontal rule).
If the resulting scope does not correspond
to an already generated HTML scope,
then a `div` is added.
In particular, when the resulting scope is the whole Markdown context,
a Markdown parser targeting a full HTML document
(not only a subset of the body, as Markdown generally does)
may apply the hints to the `<html>` tag.
Would produce the following HTML+RDFa:
Hints not immediately following an explicit span,
in a link definition block
with no non-whitespace characters after the hints,
apply to all references to that definition,
even if no link is defined.
Similar to Markdown link definitions,
source markup of this kind does not in itself result
in any output HTML markup:
it only affects other markup, and if unused it is simply ignored.
Would produce the following HTML+RDFa:
See PHP Markdown extra special attributes
and Pandoc's header attributes:
Semantic Markdown uses similar syntax,
but either with different leading character "="
or "keywords" containing a colon.
Extract from PHP Markdown extra documentation:
With Markdown Extra,
you can set the id and class attribute on certain elements
using an attribute block.
For instance, put the desired id prefixed by a hash
inside curly brackets after the header at the end of the line,
like this:
Then you can create links to different parts of the same document
like this:
To add a class name, which can be used as a hook for a style sheet,
use a dot like this:
You can also add custom attributes having simple values
by specifying the attribute name,
followed by an equal sign, followed by the value
(which cannot contain spaces at this time):
The id, multiple class names, and other custom attributes
can be combined
by putting them all into the same special attribute block:
At this time, special attribute blocks can be used with
- headers,
- fenced code blocks
- links, and
- images.
See Pandoc bracketed spans
Would produce the following HTML+RDFa:
Annotations declared as an initial separate block
apply to all siblings by introducing a surrounding `<div>` tag.
Would produce the following HTML+RDFa:
As per Block scope,
annotations declared at the end of a block (modulo whitespace)
apply to that one block.
Would produce the following HTML+RDFa:
An attribute without `.`, without `#`,
and that is not a key-value pair
should be recognized as a property name, e.g. `{foaf:name}`.
An attribute beginning with the `=` sign indicates a subject IRI,
equivalent to a `resource=xxx` attribute,
e.g. `{=wdt:Q42}` is equivalent to `<sometag resource="wdt:Q42">`.
Should yield
Same with `_`, `*` or `**`.
It should be possible to annotate with 2 properties
Should yield
Would produce the following HTML+RDFa:
As per Block scope,
the hints above apply to an existing block
which serves as a placeholder for the semantic hints,
and there is therefore no need to add a wrapper `<div>` tag.
Would produce the following HTML+RDFa:
(Note that the `typeof` RDFa attribute used alone
generates an anonymous node
as the current subject of inner `property` attributes.
In other words, further property annotations
will refer to an entity of the provided type.)
Use an annotation starting with "=".
:heavy_exclamation_mark:
FIXME: Undecided if the leading character `=`
should be replaced
with e.g. `@` or `#`.
It should be possible to combine an ID and a type attribute
Should produce the following HTML+RDFa
But beware that if one hint is broken
then the whole annotation is passed through as-is,
e.g. if using an undefined prefix:
Should produce the following HTML+RDFa
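The all-or-nothing behaviour described above (one broken hint leaves the whole annotation untouched) could be approximated as in the sketch below. It is an illustration only: `expand_or_pass_through` is a hypothetical name, only CURIE hints are handled (no angle-bracket IRIs), and the prefix set is a small excerpt of the RDFa Core Initial Context.

```python
import re

# A few prefixes from the RDFa Core Initial Context; the real list is longer.
INITIAL_CONTEXT = {"foaf", "schema", "dc", "rdfs", "xsd"}
ATTRIBUTE_FOR = {".": "typeof", "=": "resource", "": "property"}
CURIE_HINT = re.compile(r"^([.=]?)([A-Za-z_][\w.-]*):(\S+)$")


def expand_or_pass_through(annotation, defined_prefixes=()):
    """Return RDFa attributes for '{...}', or the annotation unchanged if any hint is broken."""
    known = INITIAL_CONTEXT | set(defined_prefixes)
    attributes = []
    for token in annotation.strip()[1:-1].split():
        match = CURIE_HINT.match(token)
        if match is None or match[2] not in known:
            return annotation  # one broken hint: keep the whole annotation as-is
        attributes.append(f'{ATTRIBUTE_FOR[match[1]]}="{match[2]}:{match[3]}"')
    return " ".join(attributes)


# 'wdt' is defined here, so the annotation is expanded.
print(expand_or_pass_through("{=wdt:Q42 .foaf:Person}", defined_prefixes={"wdt"}))
# resource="wdt:Q42" typeof="foaf:Person"

# 'wdt' is undefined here, so nothing is expanded, not even the valid foaf hint.
print(expand_or_pass_through("{=wdt:Q42 .foaf:Person}"))
# {=wdt:Q42 .foaf:Person}
```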
RDFa relies on a mechanism
to indicate the current subject of the annotation.
Semantic Markdown aims at having an equivalent mechanism.
Intuitively, the current subject is the resource
annotated in the "closest ancestor" of a property annotation.
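The "closest ancestor" intuition can be sketched as a walk up a document tree. The `Node` class below is a hypothetical structure invented for this illustration, not something the specification defines; returning `None` when no ancestor carries a resource stands in for RDFa's fallback to the document itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A block or span in the document tree (hypothetical, for illustration)."""
    name: str
    resource: Optional[str] = None  # set by an '=' hint, e.g. 'wdt:Q42'
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def current_subject(node: Node) -> Optional[str]:
    """Walk up from the annotated node to the closest ancestor carrying a resource."""
    ancestor = node.parent
    while ancestor is not None:
        if ancestor.resource is not None:
            return ancestor.resource
        ancestor = ancestor.parent
    return None  # no annotated ancestor found


section = Node("section", resource="wdt:Q42")
paragraph = section.add(Node("paragraph"))
span = paragraph.add(Node("span"))
print(current_subject(span))  # wdt:Q42
```

A nearer annotated ancestor wins over a more distant one, which is what makes nested annotations behave like nested RDFa elements.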
Used to indicate
that a certain inline portion of a sentence is about an entity.
Should yield
Used to indicate
that a whole paragraph is about an entity.
The annotation is at the end of the paragraph for readability.
Should yield
Used to indicate that a whole list describes an entity.
The annotation should be sought
at the end of the line preceding the list.
Should yield
If an annotation is between a paragraph and a list,
then it applies to the list
when standalone with double newlines
same as writing a separate paragraph:
Should yield
Indented lists are key
because they could make plain Markdown lists look like JSON-LD trees;
Plain Markdown list:
Annotated version:
:heavy_exclamation_mark:
FIXME: Either replace this section with JSON-LD like style
or drop this section
Used to indicate that a blockquote describes an entity
:heavy_exclamation_mark:
FIXME: Either document how or drop this section
Used to indicate
that a certain section of a document describes an entity.
The following annotated MD:
Should produce the following HTML+RDFa:
Similarly
Should produce the following HTML+RDFa:
Should yield
As per Block-cluster scope,
hints apply until the next descendant block or sibling paragraph.
To limit without introducing new content,
use an empty hint:
Should yield
Declare prefix definitions,
anywhere in the document,
preferably at the end to ease readability.
Should yield
Prefixes mimic the syntax for links,
but using curly brackets instead of angle brackets:
Declaring a prefix as the default generates a `vocab` attribute
instead of a prefix on the outermost block of the text,
adding a div if no such block exists already.
All uses of that default prefix are then generated without a prefix.
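The effect of a default prefix can be illustrated with a short sketch: the default prefix becomes a `vocab` attribute and is stripped from every CURIE that uses it, while other prefixes are left alone. The function name and its arguments are assumptions made for this example; how the default is actually declared in the source is not modelled here.

```python
def apply_default_prefix(curies, default_prefix, default_iri):
    """Return the vocab attribute and the CURIEs with the default prefix removed."""
    marker = default_prefix + ":"
    terms = [c[len(marker):] if c.startswith(marker) else c for c in curies]
    return 'vocab="{}"'.format(default_iri), terms


vocab, terms = apply_default_prefix(
    ["foaf:name", "schema:url", "foaf:Person"],
    default_prefix="foaf",
    default_iri="http://xmlns.com/foaf/0.1/",
)
print(vocab)  # vocab="http://xmlns.com/foaf/0.1/"
print(terms)  # ['name', 'schema:url', 'Person']
```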
Would produce the following HTML+RDFa:
FIXME: The default namespace should make it possible
to annotate the document without using a prefix at all.
Instead of giving the default
both a prefix name and a special annotation,
I suggest using `@default` as the prefix itself.
Would produce the following HTML+RDFa:
All prefixes predefined in RDFa Core Initial Context can be used
without explicitly defining them.
Would produce the following HTML+RDFa:
Meeting with [Bob]{.<http://xmlns.com/foaf/0.1/Person>}
Link can optionally be made clickable
by adding a link to its definition at the bottom.
Whereas the scope of this project is limited
to authoring a specification
and maybe developing proof-of-concept parsers for it,
some projects doing similar things or more
can serve as inspiration.
Roam-research
Org-roam
TiddlyRoam
There are also some experiments
on how to use those specifications:
SemanticMarkdown use cases studies
Other references:
RDFa Lite
RDFa CURIE
RDFa Core
RDFa Core Initial Context
If a property annotation immediately follows a word
with no explicit inline delimiters,
it should be applied to this word only.
(Is this really possible in terms of parsing? Unknown.)
Should yield
Should yield
If the list item contains `:` or `=`,
the annotation is applied to the string after this character.
If the final non-space non-annotation character of the list item
is `,` or `;`,
the annotation is applied to the string before this character.
Should yield (note how semi-colons are excluded from last annotations):
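The separator rules for list items stated above could be sketched as follows. This is a tentative reading of the rules, not a settled algorithm: `annotated_value` is a hypothetical helper, the annotation is assumed to be the last `{...}` group in the item, and when several `:` or `=` characters occur the first one is assumed to be the separator.

```python
import re

ANNOTATION = re.compile(r"\{([^{}]*)\}")


def annotated_value(item):
    """Return (value, hints) for a list item carrying an annotation, or None."""
    matches = list(ANNOTATION.finditer(item))
    if not matches:
        return None
    last = matches[-1]
    # Text of the item with the annotation itself taken out.
    value = (item[:last.start()] + item[last.end():]).strip()
    # A trailing ',' or ';' is excluded from the annotated string.
    if value.endswith((",", ";")):
        value = value[:-1].rstrip()
    # Only the part after the first ':' or '=' is annotated.
    separators = [i for i in (value.find(":"), value.find("=")) if i != -1]
    if separators:
        value = value[min(separators) + 1:].strip()
    return value, last.group(1).split()


print(annotated_value("name: Douglas Adams {foaf:name};"))
# ('Douglas Adams', ['foaf:name'])
print(annotated_value("born = 1952, {schema:birthDate}"))
# ('1952', ['schema:birthDate'])
```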