# On Advanced Reification Using OWL

## Introduction

The [RDF reification vocabulary](https://www.w3.org/TR/rdf11-mt/#reification) is defined to:

> allow an RDF graph to act as metadata describing other RDF triples.

This informally supports various forms of statement qualification and provenance annotations.

The following concepts are defined here to distinguish between notions in different applications:

* Statement Types
* Statement Instances
* Claims
* Tokens
* Facts

From just a statement description, even an "overclaim", it is possible to infer distinct statement types. A fact type can be defined by any ontology needing to talk about such a precise interpretation of a statement.

OWL is used to define relevant semantics.

### Conventions

Examples below assume these declarations:

```turtle
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX : <https://schema.org/>
PREFIX ex: <http://example.org/ns#>
BASE <http://example.org/>
```

(Entailments below have been verified with the [OWL-RL Reasoner web service](https://www.ldf.fi/service/owl-rl-reasoner) and with [Reasonable](https://github.com/gtfierro/reasonable).)

## Triples and Statements

A statement is expressed in RDF by an asserted triple in a graph:

```turtle
<elizabeth> :spouse <richard> .
```

Such a statement can also be *described*, using reification:

```turtle
[] a rdf:Statement ;
    rdf:subject <elizabeth> ;
    rdf:predicate :spouse ;
    rdf:object <richard> .
```

Multiple distinct statements for the same triple can be defined, each with its distinct identity.

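
To make the distinction between asserting and describing a triple concrete, here is a minimal Python sketch that models a graph as a plain set of tuples. The `reify` helper and the `_:stmtN` blank node labels are hypothetical names introduced only for this illustration; no RDF library is involved.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
_ids = count(1)  # hypothetical source of fresh blank node labels


def reify(triple):
    """Return (node, triples) describing `triple` with a fresh blank node."""
    s, p, o = triple
    node = f"_:stmt{next(_ids)}"
    return node, {
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    }


triple = (
    "http://example.org/elizabeth",
    "https://schema.org/spouse",
    "http://example.org/richard",
)

# Two descriptions of the same triple, each with its own identity.
first, first_description = reify(triple)
second, second_description = reify(triple)
graph = first_description | second_description
```

The resulting `graph` holds two separate statement descriptions, and the triple itself is not a member of it: describing a statement does not assert it.
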
## Statement Types

The class of the meaning of each such statement can also be defined as a type:

```turtle
_:ElizabethSpouseRichard rdfs:subClassOf rdf:Statement ;
    owl:intersectionOf (
        [ owl:onProperty rdf:subject ; owl:hasValue <elizabeth> ]
        [ owl:onProperty rdf:predicate ; owl:hasValue :spouse ]
        [ owl:onProperty rdf:object ; owl:hasValue <richard> ]
    ) .
```

That is exactly the class of statements with these constituents. The OWL entailment rules for `owl:intersectionOf` and `owl:Restriction` will infer that the above statement is an instance of this class.

The purpose of statement types is to preserve the integrity of the statement as represented by a triple. By using these, it is possible to find the distinct statements, via the type, even for instances of multiple statement types.

For convenience, these locally defined statement types will be used below to avoid the repetition of the triple constituents.

## Statement Instances

A description of a statement, further defining its meaning, is distinct from the assertion of its triple (as the fact of it being true in the interpretation of a graph). In fact, there can be many distinct such descriptions, each differing in various ways. Statement types make such descriptions easier to express.

Here, two different instances of the same statement type are described:

```turtle
_:a a _:ElizabethSpouseRichard ;
    :startDate "1964" ;
    :endDate "1974" .

_:b a _:ElizabethSpouseRichard ;
    :startDate "1975" ;
    :endDate "1976" .
```

As instances of the defined statement type, the above entails (among other things):

```turtle
_:a a rdf:Statement ;
    rdf:subject <elizabeth> ;
    rdf:predicate :spouse ;
    rdf:object <richard> .

_:b a rdf:Statement ;
    rdf:subject <elizabeth> ;
    rdf:predicate :spouse ;
    rdf:object <richard> .
```

Notably, the latter triples alone also entail that `_:a` and `_:b` have the `rdf:type _:ElizabethSpouseRichard`.

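
The two directions of entailment described above (from type to constituents, and from constituents back to type) can be sketched in plain Python. This is a simplified illustration, not an OWL reasoner: the `STATEMENT_TYPES` table and the `close` function are hypothetical stand-ins for the `owl:intersectionOf` and `owl:hasValue` definitions, and graphs are modelled as sets of tuples.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE = RDF + "type"
RDF_STATEMENT = RDF + "Statement"

ELIZABETH = "http://example.org/elizabeth"
SPOUSE = "https://schema.org/spouse"
RICHARD = "http://example.org/richard"

# Hypothetical stand-in for the OWL definition of _:ElizabethSpouseRichard:
# each statement type maps its restricted properties to the required value.
STATEMENT_TYPES = {
    "_:ElizabethSpouseRichard": {
        RDF + "subject": ELIZABETH,
        RDF + "predicate": SPOUSE,
        RDF + "object": RICHARD,
    },
}


def close(graph):
    """Apply both inference directions until nothing new is derived."""
    graph = set(graph)
    while True:
        new = set()
        for cls, required in STATEMENT_TYPES.items():
            # Type to constituents: members get every required value
            # and are also instances of the rdf:Statement superclass.
            for s, p, o in graph:
                if p == RDF_TYPE and o == cls:
                    new.add((s, RDF_TYPE, RDF_STATEMENT))
                    new.update((s, prop, val) for prop, val in required.items())
            # Constituents to type: nodes with every required value are members.
            for s in {s for s, _, _ in graph}:
                if all((s, prop, val) in graph for prop, val in required.items()):
                    new.add((s, RDF_TYPE, cls))
        if new <= graph:
            return graph
        graph |= new


# Starting from the type only.
typed = close({("_:a", RDF_TYPE, "_:ElizabethSpouseRichard")})

# Starting from the constituents only.
described = close({
    ("_:b", RDF + "subject", ELIZABETH),
    ("_:b", RDF + "predicate", SPOUSE),
    ("_:b", RDF + "object", RICHARD),
})
```

Under these assumptions, `typed` ends up containing the three constituent triples for `_:a`, and `described` ends up containing the type triple for `_:b`, mirroring the entailments listed above.
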
### Claims

[RDF concepts](https://www.w3.org/TR/rdf11-concepts/#entailment) state:

> An RDF triple encodes a statement—a simple logical expression, or claim about the world.

And for [reification](https://www.w3.org/TR/rdf11-mt/#reification) it is also stated that:

> A reification of a triple does not entail the triple, and is not entailed by it. The reification only says that the triple token exists and what it is about, not that it is true, so it does not entail the triple.

and:

> Since the relation between triples and reifications of triples in any RDF graph or graphs need not be one-to-one, asserting a property about some entity described by a reification need not entail that the same property holds of another such entity, even if it has the same components.

Based on that, statements that are not simply identified with their subject, predicate and object will be called *claims* here. The statement instances described above are examples of claims. Claims usually add qualifying details, so their meaning is more specific than the meaning of the asserted triple itself.

### Tokens

A simple claim can furthermore be described as an exact token from an observation, a document or other data source. It can provide details such as the exact term tokens used as subject, predicate and object, and more, depending on how much precision a specific application and processing context needs. Examples range from linters and editing completion to fact integrity checking and claims monitoring and archiving.

This is an example of detailed data provenance, maintaining referential opacity and lexicality:

```turtle
<urn:uuid:27584316-5745-482a-beed-ee400cf12693> a ex:DocumentRepresentation ;
    ex:parsedAt "2024-01-13T15:24:02+01:00"^^xsd:dateTime ;
    ex:locationLexical "http://example.org/elizabeth/data.ttl"^^xsd:anyURI ;
    ex:checksum "c6b6278d8baa95be955a3d004d424ce4"^^ex:md5sum .

<urn:uuid:a048ec21-ad77-477e-a612-2f9308d42785> a ex:TripleToken, _:ElizabethSpouseRichard ;
    ex:dataSource <urn:uuid:27584316-5745-482a-beed-ee400cf12693> ;
    ex:subjectLexical [
        ex:sourceToken "<elizabeth>" ;
        ex:iriToken "http://example.org/elizabeth"^^xsd:anyURI ;
        ex:startLine 9 ; ex:endLine 9 ;
        ex:startColumn 1 ; ex:endColumn 11
    ] ;
    # ...
    .
```

(Many details omitted here, such as source retrieval facts (using the [HTTP vocabulary](https://www.w3.org/TR/HTTP-in-RDF10/)), defined base IRI and prefix declarations.)

### Facts

The class of facts is defined here as:

```turtle
ex:Fact a owl:Class ;
    rdfs:subClassOf rdf:Statement ;
    owl:hasKey (rdf:subject rdf:predicate rdf:object) .
```

An instance of this class is the exact meaning of its asserted triple:

```turtle
_:c a _:ElizabethSpouseRichard, ex:Fact ;
    :startDate "1964" ;
    :endDate "1974" .

_:d a _:ElizabethSpouseRichard, ex:Fact ;
    :startDate "1975" ;
    :endDate "1976" .
```

The two bnodes `_:c` and `_:d` denote this *same* fact. (This has important implications, further detailed in the following section.)

The above is thus the same as:

```turtle
[] a _:ElizabethSpouseRichard, ex:Fact ;
    owl:sameAs _:c , _:d ;
    rdf:subject <elizabeth> ;
    rdf:predicate :spouse ;
    rdf:object <richard> ;
    :startDate "1964", "1975" ;
    :endDate "1974", "1976" .
```

This fact is the expressed meaning (the relationship in the universe of discourse) represented by the corresponding asserted triple.

Note that describing this fact does not entail its assertion. But it expresses the *assumption* that it is true.

## Many Claims and One Fact

These basic definitions carry important consequences.

### Overclaims

Given another statement type:

```turtle
_:ElizabethAPerson rdfs:subClassOf rdf:Statement ;
    owl:intersectionOf (
        [ owl:onProperty rdf:subject ; owl:hasValue <elizabeth> ]
        [ owl:onProperty rdf:predicate ; owl:hasValue rdf:type ]
        [ owl:onProperty rdf:object ; owl:hasValue :Person ]
    ) .
```

This is an *overclaim*:

```turtle
[] a _:ElizabethSpouseRichard, _:ElizabethAPerson .
```

This is *ambiguous* since it claims two different statements at once. It may still make *sense*, if these statements make sense together, with at least one statement entailing all of the other statements of the overclaim. It is also possible to pick out exactly which statements are claimed, since they are integral in the definitions of the statement types.

Crucially though, if seen as a *fact*, it becomes a *conflation*, which is quite possibly nonsense.

### Facts Are Reasonably Simple Statements

It is important to recognize that `ex:Fact` is like `owl:sameAs` for all instances of a specific statement type. That includes all subsets of that class. So unless you have really good reasons for it, and understand the consequences, *don't* do this:

```turtle
_:e a _:ElizabethSpouseRichard, _:ElizabethAPerson, ex:Fact .
```

Doing so would make *every* fact belonging to *either* of these two distinct statement types the *same* fact. Thus, the above and these two would all be the *same fact*:

```turtle
_:f a _:ElizabethSpouseRichard, ex:Fact .
_:g a _:ElizabethAPerson, ex:Fact .
```

However, these claims are still distinct:

```turtle
_:h a _:ElizabethSpouseRichard .
_:i a _:ElizabethAPerson .
```

It is not a *total* conflation. (That is because [`owl:hasKey`](https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#Keys) defines a key for a specific class. So other statements "matching" this key *are not* inferred to be facts.)

### Facts Are Not Tokens

Note that the *fact* itself is not necessarily a token. The fact is the same across claims, and it is the meaning that a token captures, but without the further distinctions of the token itself (data source, lexical details, etc.). The notions mostly serve complementary purposes.
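
The key-based merging discussed under "Facts Are Reasonably Simple Statements" can be illustrated with a small Python sketch. It is a simplified model of the `owl:hasKey` behaviour, not a reasoner: two nodes typed as facts are treated as the same when they share at least one value for each key property. The `same_fact_groups` function, the dictionary layout and the shortened term names are all hypothetical and exist only for this example.

```python
FACT = "ex:Fact"
KEY = ("subject", "predicate", "object")


def node(types, subjects, predicates, objects):
    """Build a hypothetical node record with its (already inferred) constituents."""
    return {
        "types": set(types),
        "subject": set(subjects),
        "predicate": set(predicates),
        "object": set(objects),
    }


def same_fact_groups(nodes):
    """Group nodes that the key on ex:Fact would make the same individual."""
    # Only nodes typed as ex:Fact take part in key-based merging.
    groups = [
        ({name}, {k: set(data[k]) for k in KEY})
        for name, data in nodes.items()
        if FACT in data["types"]
    ]
    merged = True
    while merged:
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                (a, av), (b, bv) = groups[i], groups[j]
                # Same fact if they share a value for every key property.
                if all(av[k] & bv[k] for k in KEY):
                    groups[i] = (a | b, {k: av[k] | bv[k] for k in KEY})
                    del groups[j]
                    merged = True
                    break
            if merged:
                break
    result = {frozenset(names) for names, _ in groups}
    # Plain claims are never merged, even if their constituents match a key.
    result |= {
        frozenset({name})
        for name, data in nodes.items()
        if FACT not in data["types"]
    }
    return result


nodes = {
    "_:e": node([FACT], ["elizabeth"], ["spouse", "rdf:type"], ["richard", "Person"]),
    "_:f": node([FACT], ["elizabeth"], ["spouse"], ["richard"]),
    "_:g": node([FACT], ["elizabeth"], ["rdf:type"], ["Person"]),
    "_:h": node([], ["elizabeth"], ["spouse"], ["richard"]),
    "_:i": node([], ["elizabeth"], ["rdf:type"], ["Person"]),
}

with_overclaim = same_fact_groups(nodes)
without_overclaim = same_fact_groups({k: v for k, v in nodes.items() if k != "_:e"})
```

In this model the overclaimed fact `_:e` pulls `_:f` and `_:g` into one group, while `_:h` and `_:i` stay separate because they are not typed as facts. Without `_:e`, the two facts remain distinct.
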