<mrow> canon
In the new W3C Math group, as well as the community group that preceded it, we have been brainstorming a dialect of annotation markup for specifying "author intent" over presentation MathML trees.
In the simplest of examples, you could imagine an atomic attribute that grounds a single node to its mathematical meaning, as in Euler's number and Euler's constant:
<mi intent="euler-number">e</mi>
<mi intent="euler-constant">γ</mi>
which would allow assistive technology (AT) software to choose between speaking the mathematically informative concept name behind the rendered symbol and directly speaking the raw presentation readout. These values also double as anchors for:
In this note, I will do some example-driven troubleshooting of the "free lunch" or "intent defaults" idea that has been discussed in the community group.
The idea of "intent defaults" is to standardize a shared, public-domain list of the most ubiquitous math notations used in K-14 education, as initially informed by Western curricula. Each list item will contain:
euler-constant
Then we plan to offer this precompiled list to accessibility tools as a "default" interpretation mode: notations recorded in it are assumed to imply the recorded intent whenever they can be automatically spotted in classic presentation MathML trees. This idea follows the spirit of the 80/20 rule, where we hope to remediate 80% of expressions in mass use with a small fraction of the standardization effort. Of course, we also mean to provide an escape hatch for remediating the "long tail" of formulas that won't fit neatly in this notational mini-universe, and an outright "off switch" for documents that deviate completely from K-14 material.
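To make the three tiers concrete (explicit annotation, default list, off switch), here is a minimal sketch in Python. Everything named in it is an assumption for illustration: the DEFAULT_INTENTS table, its two entries, the effective_intent helper, the defaults_enabled flag and the "elementary-charge" value are all made up, and none of them come from an agreed list or specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults table: (element name, token text) -> intent.
# The entries are illustrative only, not taken from any agreed list.
DEFAULT_INTENTS = {
    ("mo", "∈"): "element-of",
    ("mi", "e"): "euler-number",
}

def effective_intent(node, defaults_enabled=True):
    """Intent for one token: explicit attribute, else default, else None."""
    explicit = node.get("intent")
    if explicit is not None:
        return explicit  # escape hatch: the author's annotation always wins
    if defaults_enabled:
        key = (node.tag, (node.text or "").strip())
        return DEFAULT_INTENTS.get(key)  # None when the notation is not listed
    return None  # off switch: no default interpretation at all

# Three tokens: a bare "e", a bare "∈", and an "e" with an explicit
# (hypothetical) intent that overrides the default.
row = ET.fromstring(
    '<mrow><mi>e</mi><mo>∈</mo><mi intent="elementary-charge">e</mi></mrow>'
)
with_defaults = [effective_intent(n) for n in row]
without_defaults = [effective_intent(n, defaults_enabled=False) for n in row]
print(with_defaults)     # ['euler-number', 'element-of', 'elementary-charge']
print(without_defaults)  # [None, None, 'elementary-charge']
```

The point of the ordering is that a default can only ever fill a gap: it never overrides what an author wrote, and it disappears entirely when the document opts out.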
Consider the following single line TeX formula (displayed properly further down, if you're using Firefox today):
x \in (0,1) , y \in (-1, 0)
Here are some possible - and valid - presentation MathML trees for this expression, with their browser rendering.
<mrow> nodes are added, where possible, following the implied operator tree of the expression. In other words, pieces of the formula are grouped together based on the order and scope in which a human mathematician would evaluate them. As generated by latexml v0.8.5:
<math>
<mrow>
<mi>x</mi>
<mo>∈</mo>
<mrow>
<mo stretchy="false">(</mo>
<mn>0</mn>
<mo>,</mo>
<mn>1</mn>
<mo stretchy="false">)</mo>
</mrow>
</mrow>
<mo>,</mo>
<mrow>
<mi>y</mi>
<mo>∈</mo>
<mrow>
<mo stretchy="false">(</mo>
<mrow>
<mo>-</mo>
<mn>1</mn>
</mrow>
<mo>,</mo>
<mn>0</mn>
<mo stretchy="false">)</mo>
</mrow>
</mrow>
</math>
<math>
<mi>x</mi>
<mo>∈</mo>
<mo stretchy="false">(</mo>
<mn>0</mn>
<mo>,</mo>
<mn>1</mn>
<mo stretchy="false">)</mo>
<mo>,</mo>
<mi>y</mi>
<mo>∈</mo>
<mo stretchy="false">(</mo>
<mo>−</mo>
<mn>1</mn>
<mo>,</mo>
<mn>0</mn>
<mo stretchy="false">)</mo>
</math>
There is a large variety of possible <mrow> wrapper nodes, at various levels of nesting. Whichever variant is chosen, it will be displayed as visually identical to the others, as seen above. The distinction remains invisible to the ultimate reader of the document, much like the <span> tag in HTML.
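One way to check that two such variants differ only in grouping is to compare their leaf tokens in document order. The sketch below does that for the first half of the running example. The tokens helper is my own naming, and it assumes trees that contain only token elements and <mrow> wrappers, since it treats every container as transparent.

```python
import xml.etree.ElementTree as ET

def tokens(node):
    """Leaf tokens in document order, ignoring how they are grouped."""
    children = list(node)
    if not children:
        return [(node.tag, (node.text or "").strip())]
    out = []
    for child in children:
        out.extend(tokens(child))
    return out

# Nested variant: mrow wrappers follow the operator tree.
nested = ET.fromstring(
    "<math><mrow><mi>x</mi><mo>∈</mo>"
    "<mrow><mo>(</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>)</mo></mrow>"
    "</mrow></math>"
)
# Flat variant: the same tokens as direct children of math.
flat = ET.fromstring(
    "<math><mi>x</mi><mo>∈</mo>"
    "<mo>(</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>)</mo></math>"
)
same_tokens = tokens(nested) == tokens(flat)
print(same_tokens)  # True
```

Note that the comparison is on exact characters, so trees that use a hyphen in one place and a true minus sign in another would not compare equal even though they look alike on screen.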
For this example, Chapter 3 of the official MathML 3 spec suggests yet another tree. It advocates an extra <mrow> that holds together the arguments of a range, even if that has failed to persuade implementers. Explicitly:
<mrow>
<mo stretchy="false">(</mo>
<mrow>
<mn>0</mn>
<mo>,</mo>
<mn>1</mn>
</mrow>
<mo stretchy="false">)</mo>
</mrow>
An in-between example was seen when combining AsciiMath with MathJax, which grouped together the intervals, but not the two top-level list items:
<math>
<mstyle displaystyle="true">
<mi>x</mi>
<mo>∈</mo>
<mrow>
<mo>(</mo>
<mn>0</mn>
<mo>,</mo>
<mn>1</mn>
<mo>)</mo>
</mrow>
<mo>,</mo>
<mi>y</mi>
<mo>∈</mo>
<mrow>
<mo>(</mo>
<mo>-</mo>
<mn>1</mn>
<mo>,</mo>
<mn>0</mn>
<mo>)</mo>
</mrow>
</mstyle>
</math>
We could imagine our running example read out as:
x is in the open interval from zero to one and y is in the open interval from negative one to zero
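As a toy illustration, this reading can be produced mechanically once the operator tree and the intent of every node are already known, which is exactly the hard part. The sketch assumes a nested-tuple encoding of the tree, a speak helper, a tiny NUMBER_WORDS table and a "unary-minus" head, all of which are my own inventions. The names element-of, open-interval and formulae mirror intent values used in this note.

```python
# Spoken forms for the only numerals in the running example.
NUMBER_WORDS = {"0": "zero", "1": "one"}

def speak(term):
    """Narrate a nested (head, *args) tuple; plain strings are leaves."""
    if isinstance(term, str):
        return NUMBER_WORDS.get(term, term)
    head, *args = term
    spoken = [speak(a) for a in args]
    if head == "element-of":
        return f"{spoken[0]} is in {spoken[1]}"
    if head == "open-interval":
        return f"the open interval from {spoken[0]} to {spoken[1]}"
    if head == "unary-minus":
        return f"negative {spoken[0]}"
    if head == "formulae":
        return " and ".join(spoken)
    raise ValueError(f"no narration template for {head!r}")

tree = ("formulae",
        ("element-of", "x", ("open-interval", "0", "1")),
        ("element-of", "y", ("open-interval", ("unary-minus", "1"), "0")))
reading = speak(tree)
print(reading)
```

Each template is tied to a concept name rather than to a symbol, which is why the parentheses and the comma never need to be spoken.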
In order to automate such a reading, we would need to provide the enriched annotations for:
∈ can have the short narration "in" when indexing into a numeric range, versus the more verbose set-theoretic reading of "x is an element of …".
<mo> and an <mn>. Only then can we avoid speaking the raw Unicode name for the operator (the trees above used "hyphen" or "minus") and instead use the mathematically accurate narration "negative".
Math notations are generally highly ambiguous. The construct
Question: which notations to include, under what conditions?
Expressions have varying lengths and are fully remixable
<mrow> and 2D wrappers
One avenue for a "free lunch" for accessibility tools is to ask authoring tools to mark up, as best as they can, a parallel operator tree over the presentation layout, via strategic <mrow> wrappers.
For our running example that would be identical to the latexml variant at the top:
<mrow>
<mo stretchy="false">(</mo>
<mn>0</mn>
<mo>,</mo>
<mn>1</mn>
<mo stretchy="false">)</mo>
</mrow>
This applies to every notation introduced by our "Intent Core" (also referred to as "Intent Level 1") list, currently kept as a working draft spreadsheet. Capturing each horizontal notation in a dedicated <mrow> will allow us to standardize a list of unambiguous selectors, which would enrich matching subtrees with a given intent value, such as "open-interval".
This approach does not on its own answer how to resolve multiple meanings for the exact identical notation (e.g. tuple vs interval), but allows us to avoid some of the contextual ambiguity (e.g. the <mrow>, instead of the open parenthesis).
As this will put a burden on authoring tools, it is more of a "delegated lunch" than a free lunch.
While mathematicians will have an easy time writing down
An XPath selector for open-interval, in a tree with canonical mrow structure:
//mrow[
count(./*)=5 and
./*[1][name()="mo" and text()="("] and
./*[3][name()="mo" and text()=","] and
./*[5][name()="mo" and text()=")"] ]
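For readers who prefer code to XPath, here is a rough Python equivalent of this selector, using only the standard library. The helper names is_open_interval and annotate are assumptions of mine, and writing a bare intent="open-interval" onto the matched <mrow> is just one possible way of recording the result.

```python
import xml.etree.ElementTree as ET

def is_open_interval(node):
    """An mrow with 5 children and "(", ",", ")" at positions 1, 3 and 5."""
    kids = list(node)
    if node.tag != "mrow" or len(kids) != 5:
        return False
    # Whitespace around a token is ignored here, unlike the strict
    # text()="(" comparison in the XPath version.
    fences = [(kids[i].tag, (kids[i].text or "").strip()) for i in (0, 2, 4)]
    return fences == [("mo", "("), ("mo", ","), ("mo", ")")]

def annotate(root):
    """Add intent="open-interval" to matching mrows that carry no intent yet."""
    count = 0
    for node in root.iter("mrow"):
        if is_open_interval(node) and node.get("intent") is None:
            node.set("intent", "open-interval")
            count += 1
    return count

math = ET.fromstring(
    "<math><mrow><mi>x</mi><mo>∈</mo>"
    '<mrow><mo stretchy="false">(</mo><mn>0</mn><mo>,</mo><mn>1</mn>'
    '<mo stretchy="false">)</mo></mrow></mrow></math>'
)
matched = annotate(math)
print(matched)  # 1: only the inner mrow has the five-child interval shape
```

Skipping nodes that already carry an intent keeps the author's explicit annotation in charge, in line with the escape hatch described earlier.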
An XPath selector for open-interval, adding inferred mrows:
//*[(self::mrow or self::mtd or self::mpadded or self::mstyle or self::msqrt or
self::math or self::merror or self::menclose or self::mphantom) and
count(./*)=5 and
./*[1][name()="mo" and text()="("] and
./*[3][name()="mo" and text()=","] and
./*[5][name()="mo" and text()=")"] ]
Alternative approach: An XPath selector for open-interval using only the sibling axis.
//mo[text()="(" and
following-sibling::*[2][
self::mo and text()=","] and
following-sibling::*[4][
self::mo and text()=")"]]
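The sibling-axis idea can be tried out on the flat tree with a few lines of Python. This is a sketch under my own naming (find_interval_starts is hypothetical). Since ElementTree has no sibling axis, it slides a window over each parent's child list instead.

```python
import xml.etree.ElementTree as ET

def find_interval_starts(root):
    """Yield (parent, index) for each "(" whose 2nd and 4th following
    siblings are the operators "," and ")"."""
    def is_mo(node, text):
        return node.tag == "mo" and (node.text or "").strip() == text

    for parent in root.iter():
        kids = list(parent)
        for i in range(len(kids) - 4):
            if (is_mo(kids[i], "(") and is_mo(kids[i + 2], ",")
                    and is_mo(kids[i + 4], ")")):
                yield parent, i

# The flat variant of the running example, without stretchy attributes.
flat = ET.fromstring(
    "<math><mi>x</mi><mo>∈</mo><mo>(</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>)</mo>"
    "<mo>,</mo><mi>y</mi><mo>∈</mo><mo>(</mo><mo>−</mo><mn>1</mn><mo>,</mo>"
    "<mn>0</mn><mo>)</mo></math>"
)
starts = [i for _, i in find_interval_starts(flat)]
print(starts)  # [2]
```

Only the first interval is found. In the flat tree the endpoint −1 occupies two siblings, so the comma of the second interval sits three positions after its opening parenthesis and the fixed offsets no longer line up.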
Are there easier ways to serialize and maintain these selectors?
msqrt) and sometimes near-impossible (msup, msub, mover, munder).
Q: Are "operator trees" another name for Content MathML?
A: No. They are a partial prerequisite. An operator tree is roughly equivalent to determining the correct skeleton of <apply>-based subexpressions, and only that.
For example, knowing that "<mrow>
in the presentation tree.
We still need to fully determine the exact content symbols and variable bindings before we can build a Content MathML tree. And we also have to resolve any leftover linguistic phenomena that are only easy to mark up in presentation MathML (such as ellipsis <mi>⋯</mi>), but are difficult to fully formalize.
It is worth noting that the explicit syntax for intent annotations can remediate any of the enumerated presentation trees, at the cost of making the markup "coarse grained". To be precise, the best possible annotation would be deposited on the lowest common ancestor (LCA) of the participating presentation elements.
As an example, here is how the flat presentation tree may be remediated:
<math intent="formulae(element-of($1, open-interval($2, $3)),
element-of($4, open-interval($5($6), $7)))">
<mi arg="1">x</mi>
<mo>∈</mo>
<mo stretchy="false">(</mo>
<mn arg="2">0</mn>
<mo>,</mo>
<mn arg="3">1</mn>
<mo stretchy="false">)</mo>
<mo>,</mo>
<mi arg="4">y</mi>
<mo>∈</mo>
<mo stretchy="false">(</mo>
<mo arg="5">−</mo>
<mn arg="6">1</mn>
<mo>,</mo>
<mn arg="7">0</mn>
<mo stretchy="false">)</mo>
</math>
Inferring that automatically, however, appears to be beyond the "defaults" of a specification, and closer to embarking on an ambitious parsing project for arbitrary math expressions.
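Reading such an explicit annotation back, by contrast, is mechanical. The sketch below resolves the $n references of an intent attribute against the arg attributes of its descendants, following the draft syntax used in this note. The resolve_intent helper is hypothetical, and it simply substitutes token text rather than building a real expression tree.

```python
import re
import xml.etree.ElementTree as ET

def resolve_intent(root):
    """Replace each $n in root's intent with the text of the
    descendant that carries arg="n"."""
    by_arg = {n.get("arg"): (n.text or "").strip()
              for n in root.iter() if n.get("arg") is not None}
    # A reference without a matching arg raises KeyError, on purpose.
    return re.sub(r"\$(\d+)", lambda m: by_arg[m.group(1)], root.get("intent"))

# Second half of the running example, as a flat tree with one annotation.
math = ET.fromstring(
    '<math intent="element-of($1, open-interval($2($3), $4))">'
    '<mi arg="1">y</mi><mo>∈</mo><mo>(</mo><mo arg="2">−</mo>'
    '<mn arg="3">1</mn><mo>,</mo><mn arg="4">0</mn><mo>)</mo></math>'
)
resolved = resolve_intent(math)
print(resolved)  # element-of(y, open-interval(−(1), 0))
```

All the structure lives in the single attribute on the common ancestor, which is what makes this form of remediation "coarse grained".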
And here is the same flat tree, remediated with what is already possible in MathML 3: the alttext attribute, or alternatively the aria-label attribute.
<math alttext="x in the open interval from zero to one
and y in the open interval from negative one to zero">
<mi>x</mi>
<mo>∈</mo>
<mo stretchy="false">(</mo>
<mn>0</mn>
<mo>,</mo>
<mn>1</mn>
<mo stretchy="false">)</mo>
<mo>,</mo>
<mi>y</mi>
<mo>∈</mo>
<mo stretchy="false">(</mo>
<mo>−</mo>
<mn>1</mn>
<mo>,</mo>
<mn>0</mn>
<mo stretchy="false">)</mo>
</math>
The "top-level textual description" approach has some obvious limitations.
Take <math>...</math>.
This is the case when <math>...</math>.