# Semantic Markdown Spec (V0)

[TOC]

## Introduction

### What is Semantic MarkDown ?

Design Rationale:

- Embed RDFa-like semantic annotations within MarkDown
- Ability to mix unstructured human-readable text with machine-readable data in JSON-LD-like lists
- Ability to semantically annotate an existing plain MarkDown document
- Try to keep human-readability to a maximum

We need 3 kinds of annotations:

- annotations with a property
- annotations with a subject identifier
- annotations with a type/class

### About this document

## In brief

- Annotations starting with a `.` indicate a type/class, and generate an RDFa `typeof` attribute : `{.foaf:Person}`
- Annotations starting with a `=` indicate the URI of a known entity, and generate an RDFa `resource` attribute : `{=wdt:Q42}`
- Annotations without any marker indicate a property, and generate an RDFa `property` attribute : `{foaf:name}`

## Paragraph example

```
The Hitchhiker's Guide to the Galaxy was written by [Douglas Adams]{dct:creator} in [1979]{dct:created}. {=wdt:Q25169}
```

## A short example

```
:::{.schema:Event}

## Specification meeting {schema:name}

* Date : 11/10 {schema:startDate}
* Place : Our office, Street name, 75014 Paris {schema:location}
* Meeting participants : {schema:attendee}
    * Alice;
    * Bob;
    * [Tim](https://www.wikidata.org/wiki/Q80);
* Description : Some information not annotated

:::
```

## MarkDown extensions needed

### Rely on attributes extension

See [PHP Markdown extra special attributes](https://michelf.ca/projects/php-markdown/extra/#spe-attr) and [Pandoc's header attributes](https://pandoc.org/MANUAL.html#heading-identifiers).

Extract from PHP Markdown extra documentation :

> With Markdown Extra, you can set the id and class attribute on certain elements using an attribute block.
> For instance, put the desired id prefixed by a hash inside curly brackets after the header at the end of the line, like this:
>
> ```
> Header 1 {#header1}
> ========
>
> ## Header 2 ## {#header2}
> ```
>
> Then you can create links to different parts of the same document like this:
>
> ```
> [Link back to header 1](#header1)
> ```
>
> To add a class name, which can be used as a hook for a style sheet, use a dot like this:
>
> ```
> ## The Site ## {.main}
> ```
>
> You can also add custom attributes having simple values by specifying the attribute name, followed by an equal sign, followed by the value (which cannot contain spaces at this time):
>
> ```
> ## Le Site ## {lang=fr}
> ```
>
> The id, multiple class names, and other custom attributes can be combined by putting them all into the same special attribute block:
>
> ```
> ## Le Site ## {.main .shine #the-site lang=fr}
> ```
>
> At this time, special attribute blocks can be used with
>
> - headers,
> - fenced code blocks
> - links, and
> - images.

#### Extend where attributes can be placed

##### Extend attributes to lists

The attribute mechanism needs to be extended to annotate lists. In this case the curly brackets should be put right before the list:

```
{foaf:member}
- item 1
- item 2
- item 3
```

##### Extend attributes to list items

```
- item 1 {foaf:member}
- item 2 {foaf:member}
- item 3 {foaf:member}
```

##### Extend attributes to inlines

```
Thomas is _39_{foaf:age}.
```

##### Attributes on a word without inline delimiters ?

```
Thomas is 39{foaf:age}.
```

#### Allow "property attribute"

An attribute without `.`, without `#` and that is not a key-value pair should be recognized as a property name, e.g. `{foaf:name}`.

#### Allow "subject attribute"

An attribute beginning with the `=` sign indicates a subject URI, equivalent to an `about=xxx` property, e.g.
`{=wdt:Q42}` is equivalent to `{about=wdt:Q42}`

### Rely on divs and bracketed spans extension

See Pandoc [bracketed spans](https://pandoc.org/MANUAL.html#divs-and-spans) :

```
Meeting with [Bob]{.foaf:Person}
```

Should produce

```
<p>Meeting with <span typeof="foaf:Person">Bob</span></p>
```

## Mechanism to indicate property values (RDFa "property" attribute)

### Properties in lists

#### Key/Value pairs

If the list item contains `:` or `=`, the annotation is applied to the string after this character. Should a final dot or semicolon be excluded from the annotated value here ?

```
- Nom : Thomas Francart {foaf:name}
- Age = 39 {foaf:age}
- Profession : Semantic Web Consultant; {rdfs:comment}
```

Should yield (note how the semicolon is excluded from the last annotation) :

```
<ul>
  <li>Nom : <span property="foaf:name">Thomas Francart</span></li>
  <li>Age = <span property="foaf:age">39</span></li>
  <li>Profession : <span property="rdfs:comment">Semantic Web Consultant</span>;</li>
</ul>
```

#### URI written directly as key

```
- foaf:name : Thomas Francart
- foaf:age = 39
- rdfs:comment : Semantic Web Consultant
```

Should yield

```
<ul>
  <li>foaf:name : <span property="foaf:name">Thomas Francart</span></li>
  <li>foaf:age = <span property="foaf:age">39</span></li>
  <li>rdfs:comment : <span property="rdfs:comment">Semantic Web Consultant</span></li>
</ul>
```

#### Value-only list items

```
- Thomas Francart {foaf:name}
- 39 {foaf:age}
- Semantic Web Consultant {rdfs:comment}
```

Should yield

```
<ul>
  <li><span property="foaf:name">Thomas Francart</span></li>
  <li><span property="foaf:age">39</span></li>
  <li><span property="rdfs:comment">Semantic Web Consultant</span></li>
</ul>
```

#### Annotate a list with a property

Annotating a list with a property annotation should be treated as if all list items were annotated with the same property :

```
{foaf:member}
- Thomas;
- Vincent;
- Nicolas;
```

Is equivalent to

```
- Thomas; {foaf:member}
- Vincent; {foaf:member}
- Nicolas; {foaf:member}
```

And should yield

```
<ul>
  <li><span property="foaf:member">Thomas</span>;</li>
  <li><span property="foaf:member">Vincent</span>;</li>
  <li><span property="foaf:member">Nicolas</span>;</li>
</ul>
```

### Inline properties

#### Properties on inline delimiters

```
Thomas is [39]{foaf:age}.
```

Should yield

```
<p>Thomas is <span property="foaf:age">39</span>.</p>
```

Same with `_`, `*` or `**`.

#### Properties on word without delimiters

If a property annotation immediately follows a word with no explicit inline delimiters, it should be applied to this word only. (Is it really possible in terms of parsing ? To be checked.)

```
Thomas is 39{foaf:age}.
```

Should yield

```
<p>Thomas is <span property="foaf:age">39</span>.</p>
```

### Annotate with 2 properties

It should be possible to annotate with 2 properties :

```
- Name : Alice {foaf:name rdfs:label}
- Age : 23 {foaf:age}
```

## Mechanism to indicate current subject

RDFa relies on a mechanism to indicate the _current subject_ of the annotation (precise reference needed). We should aim at having something equivalent in SemanticMarkDown. Intuitively, the current subject is the resource annotated in the "closest ancestor" of a property annotation.

### Use a class attribute (RDFa "typeof" attribute)

```
# Le site {.foaf:Document}
```

```
{.foaf:Document}
- item 1
- item 2
- item 3
```

### Use an ID attribute (RDFa "about" or "resource" attribute)

Use a key-value attribute, with the key `about` or `resource` :

```
# Douglas Adams {resource=wdt:Q42}
```

Can we find some kind of shortcut ? Maybe use the equal sign :

```
# Douglas Adams {=wdt:Q42}
```

### Combine ID + class

It should be possible to combine an ID and a type attribute :

```
# Douglas Adams {.foaf:Person =wdt:Q42}
```

### Where to find the current subject ?

#### Current span subject (?) (requires div-span extension)

Used to indicate that a certain inline portion of a sentence is about an entity.

```
[Tim Berners Lee]{=wdt:Q80} invented the web.
```

Should yield

```
<p><span resource="wdt:Q80">Tim Berners Lee</span> invented the web.</p>
```

#### Current paragraph subject

Used to indicate that a whole paragraph is about an entity.

```
Tim Berners Lee invented the web. {=wdt:Q80}
```

Should yield

```
<p resource="wdt:Q80">Tim Berners Lee invented the web.</p>
```

#### Current list subject

Used to indicate that a whole list describes an entity.

```
{=wdt:Q80}
- Name : Tim Berners Lee {foaf:name}
- ISNI : 0000 0000 7866 6209 {wd:P213}
```

Should yield

```
<ul resource="wdt:Q80">
  <li>Name : <span property="foaf:name">Tim Berners Lee</span></li>
  <li>ISNI : <span property="wd:P213">0000 0000 7866 6209</span></li>
</ul>
```

**For readability, the list annotation should be looked for at the end of the line preceding the list**:

```
:::{.schema:Event}
* Date : 11/10 {schema:startDate}
* Meeting participants : {schema:attendee}
    * Alice;
    * Bob;
:::
```

#### Indented lists

Indented lists are key because they could make plain MarkDown lists look like JSON-LD-like structures :

```
Here is our meeting description :

- Date : 10/11/2019
- Location : somewhere
- Attendees :
    - Alice
        - Engineer
        - Works for : Foo
        - Hobbies :
            - Football
            - Video games
    - Bob
        - Sales Manager
        - Works for : Bar
        - Hobbies :
            - Cooking
            - Cycling
```

Annotated version:

```
Here is our meeting description : {.schema:Event}

- Date : 10/11/2019 {schema:startDate}
- Location : somewhere {schema:location}
- Attendees : {schema:attendee}
    - Alice {schema:name}
        - Engineer {schema:jobTitle}
        - Works for : Foo {schema:affiliation}
        - Hobbies : {schema:knowsAbout}
            - Football
            - Video games
    - Bob {schema:name}
        - Sales Manager {schema:jobTitle}
        - Works for : Bar {schema:affiliation}
        - Hobbies : {schema:knowsAbout}
            - Cooking
            - Cycling
```

Arguably, this is not human-readable anymore.

#### Current blockquote subject (is it useful ?)
Used to indicate that a blockquote describes an entity.

TODO

#### Current header subject

Used to indicate that a certain section of a document describes an entity. While this is certainly useful and intuitive to do (and compatible with the attributes MarkDown extension), this is probably the trickiest case to implement because a header in MarkDown does not generate a common HTML ancestor for the whole content of its section. Let's assume for now that it is possible to generate a `<div>` that contains the header and the entire content of its section, but the feasibility of this should be checked.

TODO

```
## Specification meeting {.schema:Event}

- Date : 10/11/2019 {schema:startDate}
- Location : somewhere {schema:location}
```

Should yield

```
<div typeof="schema:Event">
  <h2>Specification meeting</h2>
  <ul>
    <li>Date : <span property="schema:startDate">10/11/2019</span></li>
    <li>Location : <span property="schema:location">somewhere</span></li>
  </ul>
</div>
```

#### Current div subject (requires div-span extension)

```
:::{=wdt:Q80}
Tim Berners Lee invented the web.

He now works on Solid.
:::
```

Should yield

```
<div about="wdt:Q80">
  <p>Tim Berners Lee invented the web.</p>
  <p>He now works on Solid.</p>
</div>
```

## Mechanism to declare namespaces

Use link references, anywhere in the document, preferably at the end to ease readability.

```
{.schema:Event}
* Date : 10/11/2019 {schema:startDate}
* Location : somewhere {schema:location}

... the rest of the document ...

---

[schema]: http://schema.org/
[rdfs]: http://www.w3.org/2000/01/rdf-schema#
```

Should yield

```
<html prefix="schema: http://schema.org/ rdfs: http://www.w3.org/2000/01/rdf-schema#">
  <body>
    <ul typeof="schema:Event">
      <li>Date : <span property="schema:startDate">10/11/2019</span></li>
      <li>Location : <span property="schema:location">somewhere</span></li>
    </ul>
  </body>
</html>
```

Question : how to distinguish link references that are prefixes from link references that are just link references ? Do we need a special annotation for that ? e.g.
`{@prefix}` :

```
### Specifications Meeting {.schema:Event}

* Date : 10/11/2019 {schema:startDate}

... the rest of the document ...

---

[schema]: http://schema.org/ {@prefix}
[rdfs]: http://www.w3.org/2000/01/rdf-schema# {@prefix}
```

### Mechanism to declare default namespace

## Referring to a URI

### Absolute URI reference

`Meeting with _Bob_{.http://xmlns.com/foaf/0.1/Person}`

### Absolute URI reference with <>

`Meeting with _Bob_{.<http://xmlns.com/foaf/0.1/Person>}`

### Prefixed URI reference (known prefix)

`Meeting with _Bob_{.foaf:Person}`

Known prefixes are those of the [RDFa Core Initial Context](https://www.w3.org/2011/rdfa-context/rdfa-1.1).

### Prefixed URI reference (with a link reference)

```
Meeting with _Bob_{.f:Person}

[f]: http://xmlns.com/foaf/0.1/
```

----

## Parallel Idea : Indented Lists using Link References (JSON-LD-like lists).

```
Here is our meeting description : {.schema:Event}

- [Date] : 10/11/2019
- [Location] : somewhere
- [Attendees] :
    - [Name] : Alice
        - [jobTitle] : Engineer
        - [Works for] : Foo
        - [Hobbies] :
            - Football
            - Video games
    - [Name] : Bob
        - [jobTitle] : Sales Manager
        - [Works for] : Bar
        - [Hobbies] :
            - Cooking
            - Cycling

--

[Date]: http://schema.org/startDate
[Location]: http://schema.org/location
[Attendees]: http://schema.org/attendee
[Name]: http://schema.org/name
[jobTitle]: http://schema.org/jobTitle
[Works for]: http://schema.org/affiliation
[Hobbies]: http://schema.org/knowsAbout
```

> In this example there is a "key : value" syntax we could use to correctly populate an event (instead of [*]). The first "key : value" is Date : 10/11/2019, meaning the script has to put 10/11/2019 in the date property of the event.
> If we encounter "Attendees :", we retrieve the schema of the event's attendee => foaf:Person and we use it to analyse the content below.
> -- <cite>Swann</cite>

Very readable ! Looks very much like JSON-LD, with a context. Besides, it makes the links clickable, and it does not rely on an extension to capture the property. Closer to the original MarkDown philosophy. Lots of advantages !
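
The key resolution described above could be sketched roughly as follows. This is only an illustration under assumptions, not part of the spec: the names `LINK_DEF`, `KEY_VALUE_ITEM`, `collect_link_references` and `resolve_items` are hypothetical, the regular expressions only cover the simple `- [Key] : value` shape used in this section, and nothing here handles typing or nesting beyond reporting the indentation.

```python
import re

# Hypothetical helpers for illustration only; not defined by this spec.
# A link reference definition such as "[Date]: http://schema.org/startDate".
LINK_DEF = re.compile(r"^\[([^\]]+)\]\s*:\s*(\S+)\s*$")
# A list item such as "  - [Name] : Alice" (the value may be empty).
KEY_VALUE_ITEM = re.compile(r"^(\s*)[-*]\s+\[([^\]]+)\]\s*:\s*(.*)$")


def collect_link_references(lines):
    """Map each link reference label to the URI it points to."""
    refs = {}
    for line in lines:
        match = LINK_DEF.match(line)
        if match:
            refs[match.group(1)] = match.group(2)
    return refs


def resolve_items(lines):
    """Return (indent, property URI, value) for each '- [Key] : value' item
    whose key is declared as a link reference somewhere in the document."""
    refs = collect_link_references(lines)
    resolved = []
    for line in lines:
        match = KEY_VALUE_ITEM.match(line)
        if match and match.group(2) in refs:
            indent, key, value = match.groups()
            resolved.append((len(indent), refs[key], value.strip()))
    return resolved


if __name__ == "__main__":
    document = [
        "- [Date] : 10/11/2019",
        "- [Attendees] :",
        "    - [Name] : Alice",
        "",
        "[Date]: http://schema.org/startDate",
        "[Attendees]: http://schema.org/attendee",
        "[Name]: http://schema.org/name",
    ]
    for item in resolve_items(document):
        print(item)
```

The indentation is returned as-is so that a later step could rebuild the nesting (for instance, attaching `Name` to an attendee); how that nesting maps to subjects is left open, as in the rest of this section.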
- [Date] : 10/11/2019
- [Location] : somewhere
- [Attendees] :
    - [Name] : Alice
        - [jobTitle] : Engineer
        - [Works for] : Foo
        - [Hobbies] :
            - Football
            - Video games
    - [Name] : Bob
        - [jobTitle] : Sales Manager
        - [Works for] : Bar
        - [Hobbies] :
            - Cooking
            - Cycling

---

[Date]: http://schema.org/startDate
[Location]: http://schema.org/location
[Attendees]: http://schema.org/attendee
[Name]: http://schema.org/name
[jobTitle]: http://schema.org/jobTitle
[Works for]: http://schema.org/affiliation
[Hobbies]: http://schema.org/knowsAbout

# Benchmark

[Roam-research](https://roamresearch.com/)

- Similarities :
- Differences : Open-source

[Org-roam](https://org-roam.readthedocs.io/en/master/)

- Similarities :
- Differences :

[TiddlyRoam](https://joekroese.github.io/tiddlyroam/)

- Similarities :
- Differences :