# Gaia-X Self Descriptions Data Assets
(this heading is just a comment and will not enter the ADR process)
## Results of the OWP Self-Descriptions for the ADR Process
(this heading is just a comment and will not enter the ADR process)
### Start of the Documentation that will enter the ADR Process
Please enter your results below to prepare for the ADR process.
### Provider
Done; see https://gitlab.com/gaia-x/gaia-x-technical-committee/gaia-x-architecture-document/-/merge_requests/308
### Service Offering
Done; see https://gitlab.com/gaia-x/gaia-x-technical-committee/gaia-x-architecture-document/-/merge_requests/309
### Data Service Offering
The mandatory attributes are:
| Attribute | Description | Possible Datatype(s) | Cardinality | Example Value |
| ---------------------- |:------------------------- |:-------------------------:| -------------:| -------------:|
| DID | Unique identifier of the data asset, to be resolved to a DID Document (DDO) / Data Asset Self Description (DASD) | xsd:string | 1..1 | did:op:0ebed8226ada17fde24b6bf2b95d27f8f05fcce09139ff5cec31f6d81a7cd2ea |
| providerDID | Unique identifier of the provider, to be resolved to a provider Self Description | gax-participant:Participant | 1..1 | did:3:bafyreifcinixxemb7mu5zowrhqheh5e3byduqywtsr7lq2x4lk7rurffty |
| assetTitle | Title of the data asset, giving a high-level description for quick reference. | xsd:string | 1..1 | Example Data |
| assetDescription | A more detailed description of the data asset (Markdown allowed) that contains all information not covered by standardized Self Descriptions. | xsd:string | 1..1 | This data asset contains the data formerly kept in my data silo. |
| licenseType | Reference to the license model of the data asset. | xsd:string | 1..1 | Public Domain, CC-0, CC-BY, No License Specified |
| copyrightHolder | Reference to the author or copyright holder. | xsd:string | 1..1 | Satoshi Nakamoto, did:3:bafyreigh5aiij5xltuqcf5n4dcssgzcvg775cef3o2gcd7z7ssbn5w3sae |
| assetType | Type of the data asset, which helps the discovery process. | xsd:string | 1..1 | dataset, algorithm, container, videostream, audiostream, ... |
| dateCreated | The timestamp at which the data asset was created. ISO 8601 format, Coordinated Universal Time. | xsd:dateTime | 1..1 | 2021-07-17T00:31:30Z |
| datePublished | The timestamp at which the data asset was listed in the federated catalogue. | xsd:dateTime | 1..1 | 2021-07-17T00:31:30Z |
| assetStatus | Current status of the data asset, reflecting its state in the lifecycle. | xsd:string | 1..1 | active, deprecated, revoked, outdated |
| lastModified | The timestamp of the last modification of the self description. | xsd:dateTime | 1..1 | 2021-07-17T00:31:30Z |
| assetStandard | Provides information about the standards applied, e.g. ISO 10303. | xsd:string | 1..n | ISO10303-242:2014, ISO 13567-1:2017 |
| assetStandardReference | Provides a link to the schema or additional details about the underlying standards applied. | xsd:anyURI | 1..n | https://www.iso.org/standard/57620.html |
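
The mandatory attributes and cardinalities in the table above can be checked mechanically. The following Python snippet is a minimal sketch, not part of the specification: the function name `validate_offering`, the representation of a self description as a plain dictionary, and the convention that multi-valued attributes are given as lists are all assumptions made for illustration. Attribute names and example values are taken from the table.

```python
from datetime import datetime

# Attribute names and cardinalities copied from the table above.
# Each entry is (min, max); max=None stands for the unbounded "n" in 1..n.
MANDATORY = {
    "DID": (1, 1),
    "providerDID": (1, 1),
    "assetTitle": (1, 1),
    "assetDescription": (1, 1),
    "licenseType": (1, 1),
    "copyrightHolder": (1, 1),
    "assetType": (1, 1),
    "dateCreated": (1, 1),
    "datePublished": (1, 1),
    "assetStatus": (1, 1),
    "lastModified": (1, 1),
    "assetStandard": (1, None),
    "assetStandardReference": (1, None),
}
TIMESTAMPS = ("dateCreated", "datePublished", "lastModified")


def validate_offering(sd):
    """Return a list of problems found; an empty list means the check passed."""
    problems = []
    for name, (lo, hi) in MANDATORY.items():
        value = sd.get(name)
        # Assumption: a list holds several values, anything else is one value.
        if isinstance(value, list):
            values = value
        elif value is None:
            values = []
        else:
            values = [value]
        if len(values) < lo:
            problems.append(f"{name}: missing")
        elif hi is not None and len(values) > hi:
            problems.append(f"{name}: expected at most {hi} value(s), got {len(values)}")
    for name in TIMESTAMPS:
        value = sd.get(name)
        if isinstance(value, str):
            try:
                # Only the UTC form used in the table's examples is accepted here.
                datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
            except ValueError:
                problems.append(f"{name}: not a UTC timestamp like 2021-07-17T00:31:30Z")
    return problems


# Example values taken from the table above.
example = {
    "DID": "did:op:0ebed8226ada17fde24b6bf2b95d27f8f05fcce09139ff5cec31f6d81a7cd2ea",
    "providerDID": "did:3:bafyreifcinixxemb7mu5zowrhqheh5e3byduqywtsr7lq2x4lk7rurffty",
    "assetTitle": "Example Data",
    "assetDescription": "This data asset contains the data formerly kept in my data silo.",
    "licenseType": "CC-BY",
    "copyrightHolder": "Satoshi Nakamoto",
    "assetType": "dataset",
    "dateCreated": "2021-07-17T00:31:30Z",
    "datePublished": "2021-07-17T00:31:30Z",
    "assetStatus": "active",
    "lastModified": "2021-07-17T00:31:30Z",
    "assetStandard": ["ISO10303-242:2014", "ISO 13567-1:2017"],
    "assetStandardReference": ["https://www.iso.org/standard/57620.html"],
}

print(validate_offering(example))
```

A real implementation would more likely validate against a published schema (for instance SHACL shapes) than against a hand-written dictionary; the sketch only makes the cardinality column concrete.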
Topics for further discussion:
* A hierarchical model is needed to describe the types of data assets and their mandatory attributes.
* Dependencies in the service composition for data service offerings in Gaia-X need to be taken into account at a later stage, e.g. storage, computation or interconnection components. This will help transparency and compliance, but is not needed to the full extent for simple data services where participants do not care much about the underlying resources.
* Self descriptions of data assets need to be easily extendable and backwards compatible. Many different standards / schemata are expected to emerge from various data spaces.
* Access control could be implemented as a whitelist or blacklist / required trust level indicator.
* Should deployment templates for data assets be included in self descriptions and offered for download in the federated catalogue? Would this conflict with data sovereignty? We need to define mandatory attributes for different types of data resources, e.g. raw data vs. software containers. Templates could be part of specific data service offerings that are meant to be replicated.
* Existing standards should be taken into consideration, e.g. W3C Data Catalog Vocabulary (DCAT), International Data Spaces Association (IDSA) Information Model, W3C Web of Things (WoT) Thing Description, and the Schema.org Dataset definition. The European Open Science Cloud (EOSC) is currently consolidating standards for the science community.
* Inclusion of DOIs for immutable assets and version control are important for self descriptions and will likely be implemented in following iterations.
* We need to discuss how to deal with composite assets that have many data owners or contributors. There might not be a single copyright holder.
* How can traceability / provenance / data audit trails be included in self descriptions? Provenance is important, especially in the context of GDPR, AI/ML, the current state of AI/ML regulation, and the European Data Act.
New input from 2021-09-09 MVG and SD calls:
* Further attributes: see https://gitlab.com/gaia-x/gaia-x-technical-committee/gaia-x-architecture-document/-/merge_requests/308#note_673211395
* Alignment with DCAT-AP would make sense
* Do these attributes apply to the Offering of a Service that primarily serves Data, or to the Data Asset itself?
### Example for an Extension of a Data Service Offer (Files)
Datasets might require different attributes than streams or software products. The following table illustrates an example for a simple data asset that is available for download or computation.
The mandatory attributes are:
| Attribute | Description | Possible Datatype(s) | Cardinality | Example Value |
| ------------------ |:------------------------- |:-------------------------:| -------------:| -------------:|
| contentType | File format; could be detected during the listing process and availability check. | xsd:string | 1..1 | text/plain, text/csv |
| localURL | Endpoint that is used during the publishing process. | xsd:anyURI | 1..1 | https://raw.githubusercontent.com/examplefile.zip |
| encryptedURL | Contains the encrypted URL to enable access control. | xsd:string | 1..1 | 7f9s79fu90s7f0s8fsfjsdpfjß8a002 |
| encryptionEndpoint | Contains the endpoint responsible for URL decryption and access control. | xsd:anyURI | 1..1 | https://accesscontroller.yourname.com |
| fileIndex | Index number, starting from 0 | xsd:nonNegativeInteger | 1..1 | 4 |
| fileEncoding | File encoding (e.g. UTF-8). | xsd:string | 1..n | UTF-8, ANSI |
| contentLength | Size of the file in bytes. | xsd:string | 1..n | 378928719 |
| checksum | Checksum of the file using your preferred format (e.g. MD5). The format is specified in checksumType. | xsd:string | 1..n | 25d422cc23b44c3bbd7a66c76d52af46 |
| checksumType | Format of the provided checksum. Can vary according to the server (e.g. Amazon vs. Azure). | xsd:string | 1..n | md5 |
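
Several of the file attributes above (contentLength, checksum, checksumType) can be derived from the file itself during the listing process. The following Python snippet is a minimal sketch of that step under stated assumptions: the function name `describe_file` and its parameters are hypothetical, and MD5 is used only because the table's example value is an MD5 digest.

```python
import hashlib
import os


def describe_file(path, file_index=0, chunk_size=65536):
    """Derive the file-level attributes that can be computed locally."""
    digest = hashlib.md5()
    # Read in chunks so that large files do not have to fit into memory.
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "fileIndex": file_index,
        # The table types contentLength as xsd:string, hence the str().
        "contentLength": str(os.path.getsize(path)),
        "checksum": digest.hexdigest(),
        "checksumType": "md5",
    }
```

Calling `describe_file("examplefile.zip", file_index=4)` on a local copy of the file would return a dictionary that can be merged into the self description. Attributes such as encryptedURL and encryptionEndpoint depend on the access controller and are not covered by this sketch.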
### Interconnection Asset
An Interconnection resource is composed of a physical medium asset, a and a …. The mandatory attributes of these are listed in separate tables below.
| Attribute | Description | Possible Type(s) | Cardinality | Example Value |
| ---------------------- |:------------------------- |:--------------------:| -----------:| --------------:|
| hasPhysicalMediumAsset | | PhysicalMediumAsset | 1..1 | |
| hasConnectionResource | | ConnectionResource | 1..1 | |
| hasRouteAsset | | RouteAsset | 1..1 | |
### Physical Medium
A Physical Medium Asset belongs to an Interconnection resource. It has the following mandatory attributes:
| Attribute | Description | Possible Type(s) | Cardinality | Example Value |
| ------------------------------ |:------------------------- |:-------------------------:| -------------:| --------------:|
| hasPhysicalMediumAssetLocation | the two locations that are connected by the ... | xsd:string or dct:Location | 2..2 | |
| hasPhysicalMediumAssetType | | xsd:string | 1..1 | |
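
The 2..2 cardinality on hasPhysicalMediumAssetLocation means that exactly two locations are required. The following Python snippet is a minimal sketch of how a data model could enforce this; the class layout is an assumption for illustration, plain strings stand in for dct:Location, and the values in the usage line are hypothetical placeholders because the table does not yet define example values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PhysicalMediumAsset:
    # 2..2: the two endpoints connected by the medium.
    hasPhysicalMediumAssetLocation: Tuple[str, str]
    # 1..1: a single, non-empty type string.
    hasPhysicalMediumAssetType: str

    def __post_init__(self):
        if len(self.hasPhysicalMediumAssetLocation) != 2:
            raise ValueError("exactly two locations are required (cardinality 2..2)")
        if not self.hasPhysicalMediumAssetType:
            raise ValueError("hasPhysicalMediumAssetType is mandatory (cardinality 1..1)")


# Hypothetical placeholder values, not taken from the specification.
medium = PhysicalMediumAsset(("location-a", "location-b"), "example-medium-type")
```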
### Connection Resource
A Connection Resource belongs to an Interconnection resource. It has the following mandatory attributes:
| Attribute | Description | Possible Type(s) | Cardinality | Example Value |
| ---------------------- |:------------------------- |:--------------------:| -----------:| --------------:|
| | | | 1..1 | |
| | | | 1..1 | |
| | | | 1..1 | |
### Route Asset
A Route Asset belongs to an Interconnection resource. It has the following mandatory attributes:
| Attribute | Description | Possible Type(s) | Cardinality | Example Value |
| ---------------------- |:------------------------- |:--------------------:| -----------:| --------------:|
| | | | 1..1 | |
| | | | 1..1 | |
| | | | 1..1 | |