During a fun and productive visit by Nicolas Thiéry to Michael Kohlhase's Kwarc group at FAU Erlangen in May 2023, and a joint stay at HIM in 2024, we continued our exploration of the following question: could lightweight semantic annotations on Jupyter/Markdown-based teaching material enrich the experience for authors and learners?
Let's make this concrete with a simple example: if the definitions of concepts in the course material are explicitly annotated with adequate markup:
:::{definition}
The {definiendum}`cat` is a {symref}`domestic species` of small
{symref}`carnivorous mammal`.
:::
then flashcards can be automatically generated to help students memorize the definitions of the course using spaced repetition; or the definitions of some subset, like all the concepts that are required to define what a cat is.
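To give a flavor of how little machinery this takes, here is a toy flashcard extractor (a hypothetical helper, not the actual ALeA pipeline) that scans the Markdown source for `{definiendum}` roles:

```python
import re

# MyST role syntax: {rolename}`body` (a simplified regex, not a full parser)
ROLE = re.compile(r"\{(definiendum|symref)\}`([^`]+)`")

def extract_flashcard(definition_text):
    """Turn an annotated definition into (term, definition) flashcard pairs."""
    terms = [body for role, body in ROLE.findall(definition_text)
             if role == "definiendum"]
    # Strip the role markup to get the plain-text back of the card
    back = ROLE.sub(lambda m: m.group(2), definition_text)
    return [(term, back) for term in terms]

cards = extract_flashcard(
    "The {definiendum}`cat` is a {symref}`domestic species` of small "
    "{symref}`carnivorous mammal`."
)
# → [('cat', 'The cat is a domestic species of small carnivorous mammal.')]
```

A real implementation would of course work on the parsed document tree rather than on raw text, but the annotation is all the information needed.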
This blog post reports on ideas, progress and first prototypes.
[Vision paper](TODO: add link)
TODO: Nicolas: describe the direction in Orsay
One long-term line of work of the Kwarc group is to explore how knowledge can enrich computer systems, touching on domains such as semantic AI and the semantic web, and using tools such as ontologies and knowledge graphs. Our previous collaboration revolved around computation: can we improve the interoperability between computational software by exploiting or adding semantic information in the systems?
Nowadays the Kwarc group is conducting that exploration in education (work funded by the German Ministry of Research and Education (BMBF): VoLL-KI project, 2021-2025), as part of the global movement for assisted learning based on learning analytics.
They are building a system, ALeA (Adaptive Learning Assistant), that takes as input course material split into small annotated units (currently authored in LaTeX), and produces an interactive web site for learners to explore that material. The system is continuously experimented with on a corpus of large ongoing courses, including a 1000-student AI course. You can browse the courses – or even create an account to actually play with them – here:
https://courses.voll-ki.fau.de/ (alias: https://alea.education )
TODO Dennis / Jonas
The course material is written as a collection of (small) annotated LaTeX files; each of these files typically contains a few (beamer) slides. We will from now on refer to them as the sTeX files (s for semantically annotated).
A special implementation of TeX, called RusTeX, takes the sTeX files as input and generates corresponding HTML files in which the annotations have been encoded (we will refer to these as the sHTML files). RusTeX also has the property of producing sHTML that is visually identical to the PDF that would be produced by the usual toolchain (pdflatex).
sHTML plays the role of an AST (abstract syntax tree) from which to pivot. There are also sHTML importers from Word, PowerPoint, and Markdown.
The sHTML files are imported into the ALeA system. They are processed by MMT to extract the semantics and reason about them: resolve cross-references and references to the domain model. The outcome (processed HTML files + ???) is then uploaded to the course web server.
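To illustrate the resolution step in isolation (a toy sketch with made-up URIs; MMT's actual domain model and reasoning are far richer):

```python
# Hypothetical domain model mapping verbalizations to concept URIs
domain_model = {
    "cat": "https://example.org/smglom/bio?cat",
    "carnivorous mammal": "https://example.org/smglom/bio?carnivore",
}

def resolve_symrefs(symrefs, domain_model):
    """Split references into resolved (name -> URI) and unresolved names."""
    resolved, unresolved = {}, []
    for name in symrefs:
        if name in domain_model:
            resolved[name] = domain_model[name]
        else:
            unresolved.append(name)
    return resolved, unresolved

resolved, unresolved = resolve_symrefs(
    ["cat", "domestic species", "carnivorous mammal"], domain_model)
# 'domestic species' has no entry, so it ends up in `unresolved`
```

Unresolved references are exactly the places where the author needs to add an alignment to the domain model (see the symbol alignments discussed below).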
When the learner logs into the service, a JavaScript library running in the learner's browser combines all the bits to pilot the user's navigation: displaying slides, navigating among them, querying and updating the learner-model service, displaying flashcards, resolving guided tours, etc.
Currently, the values in the learner models are essentially estimated from:
Caveat: these self-assessments tend to be heavy-handed: they use up energy and meta-cognition, and they require a good understanding of the learning process. In some courses, the students are first introduced to this chunk of the Bloom terminology.
VoLL-KI aims to disseminate its outcomes widely, fostering best practices (semantic annotation of course material; pedagogical and ethical exploitation thereof) and reuse of the tools they are developing.
To this end, the following design principles aim at reducing the entry barriers.
Annotating one's course material should be:
MyST (Markedly Structured Text) is both an extension of Markdown and a (mostly JavaScript-based) ecosystem of tools to support scientific authoring. Of particular interest for this project are:
Due to these qualities, MyST is gaining traction, e.g. in AGU's project Notebook Now for a Jupyter+MyST scientific publishing platform, or in Curvenote, an Overleaf analogue for Markdown. And for teaching!
Our main target for the brainstorms and sprints was to explore collaboration opportunities between ALeA and the Jupyter/Markdown ecosystem:
One-argument macro: {foo}`bla` ↔ \foo{bla}
Two-argument macro: {foo}`bla <truc>` ↔ \foo[truc]{bla}
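This correspondence is mechanical enough to be sketched as a small rewriter (a simplification; a real converter must handle escaping, nesting, and the full role grammar):

```python
import re

# {foo}`bla <truc>`  ->  \foo[truc]{bla}   (two-argument form)
# {foo}`bla`         ->  \foo{bla}         (one-argument form)
ROLE = re.compile(r"\{(\w+)\}`([^`<]+?)(?:\s*<([^>]+)>)?`")

def role_to_macro(text):
    """Rewrite MyST roles into the corresponding sTeX-style macros."""
    def repl(m):
        name, body, arg = m.group(1), m.group(2).strip(), m.group(3)
        return rf"\{name}[{arg}]{{{body}}}" if arg else rf"\{name}{{{body}}}"
    return ROLE.sub(repl, text)

role_to_macro("{foo}`bla`")         # → '\foo{bla}'
role_to_macro("{foo}`bla <truc>`")  # → '\foo[truc]{bla}'
```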
Simple directives:
:::{foo} xxx sdfa asdf
Lorem Ipsum ...
:::
Directives with simple metadata:
:::{foo} xxx sdfa asdf
:key1: value1
:key2: value2
Lorem Ipsum ...
:::
Directives with YAML metadata header:
:::{foo} xxx sdfa asdf
---
key1: value1
key2: value2
---
Lorem Ipsum ...
:::
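For reference, such blocks are easy to parse. Here is a minimal sketch handling the `:key: value` option style (the real MyST parser also supports the YAML `---` header shown above, plus nested directives):

```python
def parse_directive(block):
    """Parse a MyST-style directive block into (name, args, options, body).

    Minimal sketch: handles only the `:key: value` option style.
    """
    lines = block.strip().splitlines()
    header = lines[0]                      # e.g. ':::{foo} xxx sdfa asdf'
    name, _, args = header.removeprefix(":::{").partition("}")
    options, body_start = {}, 1
    for i, line in enumerate(lines[1:-1], start=1):
        if line.startswith(":") and line.count(":") >= 2:
            key, _, value = line[1:].partition(":")
            options[key] = value.strip()
            body_start = i + 1
        else:
            break
    body = "\n".join(lines[body_start:-1])
    return name, args.strip(), options, body

result = parse_directive(""":::{foo} xxx
:key1: value1
:key2: value2
Lorem Ipsum ...
:::""")
# → ('foo', 'xxx', {'key1': 'value1', 'key2': 'value2'}, 'Lorem Ipsum ...')
```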
Discussion about Markdown syntax for sTeX: https://github.com/slatex/sTeX-React/issues/281
some sTeX documentation for learning objects
Annotations in the sTeX syntax:
- \symname{A}: shorthand for \symref{A}{A}, plus shorthands for plural, ...
- \definiendum{symname}{blah blah blah}
- \symref[pre=prefix,post=postfix]{symname}{bla bla bla}
- \sn{bla}: shorthand for \symref{bla}{bla}
- \sn{pre=un,post=ed}{color} -> uncolored
- \sr = \symref + syntactic sugar for case/plural
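The \sn sugar can be described operationally as a rewrite into \symref. A sketch, assuming the affixes are passed in the bracketed option syntax (the note above writes them in braces; either way the expansion is the same):

```python
import re

# Match \sn with an optional [pre=...,post=...] option list and one argument
SN = re.compile(r"\\sn(?:\[([^\]]*)\])?\{([^}]*)\}")

def expand_sn(text):
    """Expand the \\sn shorthand into the full \\symref form."""
    def repl(m):
        opts = (dict(kv.split("=") for kv in m.group(1).split(","))
                if m.group(1) else {})
        name = m.group(2)
        # The verbalization is the symbol name with the affixes applied
        verbalization = opts.get("pre", "") + name + opts.get("post", "")
        return rf"\symref{{{name}}}{{{verbalization}}}"
    return SN.sub(repl, text)

expand_sn(r"\sn{bla}")                    # → '\symref{bla}{bla}'
expand_sn(r"\sn[pre=un,post=ed]{color}")  # → '\symref{color}{uncolored}'
```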
Environments:
- sdefinition
- sassertion (with styles theorem, lemma, remark, ...)
- sproblem (with styles exercise, ...) (with subenvironments subproblem, solution, ...)
- sproof (with subenvironments sproofitem, ...)
- sparagraph (with title and styles)
- sfragment (a sectioning)
- smodule (grouping of symbol / namespace)
Example (the cat definition from the introduction, in sTeX syntax):
\begin{sdefinition}
  The \definiendum{cat}{cat} is a \sn{domestic species} of small
  \sn{carnivorous mammal}.
\end{sdefinition}
imports module / use module
- inputref: presumably the analogue of %embed in MyST
Inline versions of these environments: \inlinedef{...} (which generates spans instead of divs in the sHTML).
Didactic annotations for tasks are inserted in the task block:
\objective{remember}{concept}
\objective{understand}{concept}
\objective{apply}{concept}
Being able to use a single \objective{understand,apply} could be nicer. Prerequisites are deduced implicitly from the symrefs.
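The combined form could be supported by a small preprocessing step (hypothetical; not currently part of sTeX) that expands it into the single-competency form downstream tools already understand:

```python
import re

OBJECTIVE = re.compile(r"\\objective\{([^}]*)\}\{([^}]*)\}")

def expand_objectives(text):
    r"""Rewrite \objective{understand,apply}{concept} into one
    \objective per competency."""
    def repl(m):
        competencies = [c.strip() for c in m.group(1).split(",")]
        return "\n".join(rf"\objective{{{c}}}{{{m.group(2)}}}"
                         for c in competencies)
    return OBJECTIVE.sub(repl, text)

expand_objectives(r"\objective{understand,apply}{concept}")
# → '\objective{understand}{concept}' and '\objective{apply}{concept}'
#   on two lines
```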
Open questions:
{definiendum}`foo`
{definiendum}`blah blah blah <label>`
{symref}`foo`, {symref}`foo <label>`
:::{prf:definition} Lorem
:label: lorem

A {definiendum}`Lorem <lorem>` ipsum is ...
:::

Alternative: (lorem)= on the line before the definition.
:::{admonition} Exercise (while waiting for :::{exercise})
:::
The label could presumably be made implicit from the definiendums; this would be particularly handy when there are several definiendums.
Didactic annotations for activities:
:::{exercise}
---
prerequisites:
- loop: apply
objectives:
- accumulator: remember
---
Lorem ipsum
:::
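Once parsed, this metadata is what would let the system match exercises against the learner model. A toy sketch (hypothetical data model and mastery threshold):

```python
# Given the parsed metadata of an exercise (as a plain dict; in a real
# implementation this would come from the directive's YAML header),
# decide whether a learner is ready for it.
def missing_prerequisites(metadata, learner_model, threshold=0.5):
    """Return the (concept, competency) pairs the learner has not mastered."""
    missing = []
    for entry in metadata.get("prerequisites", []):
        for concept, competency in entry.items():
            if learner_model.get((concept, competency), 0.0) < threshold:
                missing.append((concept, competency))
    return missing

metadata = {"prerequisites": [{"loop": "apply"}],
            "objectives": [{"accumulator": "remember"}]}
learner_model = {("loop", "apply"): 0.8}
missing_prerequisites(metadata, learner_model)  # → [] : the learner is ready
```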
Comments:
- a-zA-Z -_
- strings

Dennis and Nicolas had a sprint in 2023, with sHTML export with Jupyter Book and roles/directives implemented in Python. This "worked".
The web page still has the sHTML annotations. See for example this page (login: Enseignant, password: Enseignant).
In the meantime Nicolas has annotated all the definitions in the course (with definition/definiendum, and sometimes symrefs), so flashcards should work, with many symbols partially aligned in a central file (on the occasion of a sprint with Michael in July 2024).
Sources for the course; Annotation example
Instructions for Jupyter-Book:

Copy the extension into the extensions subdirectory of your Jupyter Book directory, and configure it in _config.yml:

```yaml
sphinx:
  extra_extensions:
    - sphinx_proof
  local_extensions:
    semantic: 'extensions/'

semantic:
  # This can be any URI identifying your course; need not be an actual URL
  namespace: https://Nicolas.Thiery.name/Enseignement/Info111
  # This is where you can insert alignments for the terminology of your course
  symdecls:
    - verbalization: informatique
      en: computer science
      wikipedia: ...
      mathhub: '[smglom/cs]mod?computer-science?CS'
```

Install the sphinx-proof dependency:

```shell
pip install sphinx-proof
```
These three courses share the exact same technological stack as Info111.
Build with make web; the output will be in _build/html. Alternatively, your continuous integration on GitLab may just work.

arguments are stripped.

Deliverable: ALeA integration in Jupyter
Note: some miscellaneous data currently stored in the ALeA application does not appear in the current diagram:
Tasks and dependencies:
Potential partners: