PLEASE MAKE ANY CHANGES TO THE ISSUE HERE https://github.com/informatics-isi-edu/chaise/issues/2108
# GUDMAP google dataset annotations
## Is HTML allowed in JSON-LD values?
Stack Overflow answers suggest that HTML is allowed in some attributes, such as `description`, but they do not say how the remaining attributes are treated. [This](https://webmasters.stackexchange.com/questions/124993/are-html-tags-allowed-in-structured-data-product-description) and [this](https://stackoverflow.com/questions/35115292/does-description-in-schema-org-markup-have-to-be-free-of-html-tags) answer state that only a small subset of HTML, such as `<p>` and `<a>`, is supported.
For another JSON-LD type, `JobPosting`, Google explicitly [asks](https://developers.google.com/search/docs/data-types/job-posting#job-posting-definition) for the full description of the job in HTML format.
Google gives no official guidance on HTML markup in general.
## Is Markdown allowed in JSON-LD values?
For `Dataset`, Google [states](https://developers.google.com/search/docs/data-types/dataset#dataset) that Markdown is allowed in the `description` attribute, but it gives no official guidance on Markdown in the remaining attributes.
## Note:
1. Attributes are sorted alphabetically, apart from `@context` and `@type`.
2. Attributes present in all tables:
   * `creator`
   * `dateModified`
   * `datePublished`
   * `description`
   * `funder`
   * `identifier`
   * `includedInDataCatalog`
   * `inLanguage`
   * `isAccessibleForFree`
   * `keywords`
   * `license`
   * `name`
   * `publisher`
   * `url`
   * `usageInfo`
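
The two notes above can be checked mechanically. The following is a minimal sketch, assuming the `dataset` object of an annotation is available as a plain JavaScript object; `missingCommonAttributes` and `isSortedAlphabetically` are hypothetical helper names and not part of Chaise or ERMrestJS. The ordering check is case-insensitive, which is what the order used on this page implies (`includedInDataCatalog` before `inLanguage`).

```js
// Attribute list copied from the note above.
const COMMON_ATTRIBUTES = [
  "creator", "dateModified", "datePublished", "description", "funder",
  "identifier", "includedInDataCatalog", "inLanguage", "isAccessibleForFree",
  "keywords", "license", "name", "publisher", "url", "usageInfo"
];

// Returns the common attributes that are absent from a dataset template.
function missingCommonAttributes(dataset) {
  return COMMON_ATTRIBUTES.filter((key) => !(key in dataset));
}

// Returns true when every top-level key other than @context and @type
// is in case-insensitive alphabetical order.
function isSortedAlphabetically(dataset) {
  const keys = Object.keys(dataset).filter((key) => !key.startsWith("@"));
  return keys.every(
    (key, i) => i === 0 || keys[i - 1].toLowerCase() <= key.toLowerCase()
  );
}
```

Running both helpers against each `dataset` block below would flag a missing attribute or an out-of-order key before the annotation is deployed.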
## 1. Protocol:Protocol
https://www.gudmap.org/chaise/record/#2/Protocol:Protocol/RID=N-H9E2
**(Add it even though it's more procedural, just as an experiment)**
// description is already of min length 50 chars
// details about [`dateModified`](https://github.com/informatics-isi-edu/chaise/wiki/SEO-Experiment#what-is-the-significance-of-datemodified-attribute-in-json-ld)
// details about [`isAccessibleForFree`](https://github.com/informatics-isi-edu/chaise/wiki/SEO-Experiment#what-is-the-significance-of-isaccessibleforfree-)
```js=
{
"tag:isrd.isi.edu,2021:google-dataset": {
"detailed": {
"template_engine": "handlebars",
"dataset": {
"@context": "https://schema.org",
"@type": "Dataset",
"creator": {
"@type": "Person",
"name": "{{{$fkey_Protocol_Protocol_Principal_Investigator_fkey.rowName}}}"
},
"dateModified": "{{{_RMT}}}",
"datePublished": "{{{Release_Date}}}",
"description": "{{{Abstract}}} \n {{#if Authors}}**Authors:** {{{Authors}}}{{/if}}",
"funder": {
"@type": "Organization",
"name": "National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH)",
"url": "https://www.niddk.nih.gov/"
},
"identifier": [
"{{#if (eq $fkey_Protocol_Protocol_Consortium_fkey.values.Name \"RBK\")}}https://www.rebuildingakidney.org/id/{{{RID}}}{{else}}https://www.gudmap.org/id/{{{RID}}}{{/if}}"
],
"includedInDataCatalog": {
"@type": "DataCatalog",
"name": "{{#if (eq $fkey_Protocol_Protocol_Consortium_fkey.values.Name \"RBK\")}}RBK{{else}}GUDMAP{{/if}}",
"url": "{{#if (eq $fkey_Protocol_Protocol_Consortium_fkey.values.Name \"RBK\")}}https://www.rebuildingakidney.org{{else}}https://www.gudmap.org{{/if}}"
},
"inLanguage": {
"@type": "Language",
"name": "English"
},
"isAccessibleForFree": "true",
"keywords": [
"Protocol",
"kidney",
"genitourinary",
"{{{Title}}}",
"{{{Keywords}}}", // eventually replace with pseudo-column from Protocol.Keyword
"{{{Subjects}}}" // eventually replace with Protocol.Subject
],
"license": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"name": "Protocol: {{{$self.rowName.value}}}",
"publisher": {
"@type": "Organization",
"name": "{{{$fkey_Protocol_Protocol_Consortium_fkey.rowName}}}" // use GUDMAP as a consortium. Check what we put under landing page?
},
"url": "{{{$self.uri.detailed}}}",
"usageInfo": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"version": "{{{Version}}}"
}
}
}
}
```
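
The `identifier` and `includedInDataCatalog` values above all branch on the consortium name. The following plain-JavaScript sketch mirrors that branching so the expected output is easier to reason about. It assumes the `eq` helper behaves like strict equality, and `resolveCatalog` is a hypothetical name used only for illustration, not a Chaise or ERMrestJS function.

```js
// Mirrors the {{#if (eq ...Consortium_fkey.values.Name "RBK")}} conditionals:
// RBK rows point at rebuildingakidney.org, everything else at gudmap.org.
function resolveCatalog(consortiumName, rid) {
  const isRbk = consortiumName === "RBK";
  const base = isRbk
    ? "https://www.rebuildingakidney.org"
    : "https://www.gudmap.org";
  return {
    identifier: [`${base}/id/${rid}`],
    includedInDataCatalog: {
      "@type": "DataCatalog",
      name: isRbk ? "RBK" : "GUDMAP",
      url: base
    }
  };
}

console.log(resolveCatalog("GUDMAP", "N-H9E2").identifier[0]);
// https://www.gudmap.org/id/N-H9E2
```

Any consortium name other than `RBK`, including a missing one, falls through to the GUDMAP branch, which matches the `{{else}}` in the templates.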
## 2. Common:Collection - Completed
// description is already of min length 50 chars
// can we put Transcriptomics in variableMeasured? https://www.omicsdi.org/dataset/geo/GSE6309
// Do we want to include `"citation": "{{{Persistent_ID}}}"`? Many similar pages include it when a DOI exists, e.g. https://figshare.com/articles/dataset/_Characterization_of_Fibrillar_Collagens_and_Extracellular_Matrix_of_Glandular_Benign_Prostatic_Hyperplasia_Nodules_/1191644
https://www.gudmap.org/chaise/record/#2/Common:Collection/RID=16-QKNG
```js=
{
"tag:isrd.isi.edu,2021:google-dataset": {
"detailed": {
"template_engine": "handlebars",
"dataset": {
"@context": "https://schema.org",
"@type": "Dataset",
"citation": "{{{Persistent_ID}}}",
"creator": {
"@type": "Person",
"name": "{{{$fkey_Common_Collection_Principal_Investigator_fkey.rowName}}}"
},
"dateModified": "{{{_RMT}}}",
"datePublished": "{{{Release_Date}}}",
"description": "Collection: {{{Description}}}",
"funder": {
"@type": "Organization",
"name": "National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH)",
"url": "https://www.niddk.nih.gov/"
},
"identifier": [
"{{{Persistent_ID}}}",
"{{#if (eq $fkey_Common_Collection_Consortium_fkey.values.URL \"rebuildingakidney.org\")}}https://www.rebuildingakidney.org/id/{{{RID}}}{{else}}https://www.gudmap.org/id/{{{RID}}}{{/if}}"
],
"includedInDataCatalog": {
"@type": "DataCatalog",
"name": "{{#if (eq $fkey_Common_Collection_Consortium_fkey.values.Name \"RBK\")}}RBK{{else}}GUDMAP{{/if}}",
"url": "{{#if (eq $fkey_Common_Collection_Consortium_fkey.values.Name \"RBK\")}}https://www.rebuildingakidney.org{{else}}https://www.gudmap.org{{/if}}"
},
"inLanguage": {
"@type": "Language",
"name": "English"
},
"isAccessibleForFree": "true",
"keywords": [
"Collection",
"data management",
"data repository",
"database",
"kidney",
"genitourinary",
"transcriptomics",
"{{{Title}}}"
],
"license": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"name": "Collection: {{{$self.rowName.value}}}",
"publisher": {
"@type": "Organization",
"name": "{{{$fkey_Common_Collection_Data_Provider_fkey.rowName}}}"
},
"url": "{{{$self.uri.detailed}}}",
"usageInfo": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"variableMeasured": "Transcriptomics"
}
}
}
}
```
## 3. RNASeq:Study - Completed
https://www.gudmap.org/chaise/record/#2/RNASeq:Study/RID=16-WPBT
// can we put Genomics in variableMeasured? https://www.omicsdi.org/dataset/dbgap/phs001698
// description is already of min length 50 chars
```js
{
"tag:isrd.isi.edu,2021:google-dataset": {
"detailed": {
"template_engine": "handlebars",
"dataset": {
"@context": "https://schema.org",
"@type": "Dataset",
"creator": {
"@type": "Person",
"name": "{{{$fkey_RNASeq_Study_Principal_Investigator_fkey.rowName}}}"
},
"dateModified": "{{{_RMT}}}",
"datePublished": "{{{Release_Date}}}",
"description": "{{{Summary}}} \n **Overall Design**: {{{Overall_Design}}}",
"funder": {
"@type": "Organization",
"name": "National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH)",
"url": "https://www.niddk.nih.gov/"
},
"identifier": ["{{#if (eq $fkey_RNASeq_Study_Consortium_fkey.values.URL \"rebuildingakidney.org\")}}https://www.rebuildingakidney.org/id/{{{RID}}}{{else}}https://www.gudmap.org/id/{{{RID}}}{{/if}}"],
"includedInDataCatalog": {
"@type": "DataCatalog",
"name": "{{#if (eq $fkey_RNASeq_Study_Consortium_fkey.values.Name \"RBK\")}}RBK{{else}}GUDMAP{{/if}}",
"url": "{{#if (eq $fkey_RNASeq_Study_Consortium_fkey.values.Name \"RBK\")}}https://www.rebuildingakidney.org{{else}}https://www.gudmap.org{{/if}}"
},
"inLanguage": {
"@type": "Language",
"name": "English"
},
"isAccessibleForFree": "true",
"keywords": [
"RNASeq",
"Study",
"data management",
"data repository",
"database",
"kidney",
"genitourinary",
"transcriptomics",
"{{{Title}}}"
],
"license": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"name": "RNASeq: {{{$self.rowName.value}}}",
"publisher": {
"@type": "Organization",
"name": "{{{$fkey_RNASeq_Study_Data_Provider_fkey.rowName}}}"
},
"url": "{{{$self.uri.detailed}}}",
"usageInfo": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"variableMeasured": "Genomics"
}
}
}
}
```
## 4. Common:Gene - Completed
// description attribute of table is short of 50 chars in many cases so we add extra text
// no foreign key available for `publisher`
// identifier hard-coded as no foreign key
// no Release_Date
https://www.gudmap.org/chaise/record/#2/Common:Gene/RID=Q-4AQY
```js=
{
"tag:isrd.isi.edu,2021:google-dataset": {
"detailed": {
"template_engine": "handlebars",
"dataset": {
"@context": "https://schema.org",
"@type": "Dataset",
"creator": {
"@type": "Organization",
"description": "The GenitoUrinary Development Molecular Anatomy Project (GUDMAP) is a consortium of laboratories working to provide the scientific and medical community with tools to facilitate research on the GenitoUrinary (GU) tract. The key components are: 1) a molecular atlas of gene expression for the developing organs of the GU tract; 2) a high resolution molecular anatomy that highlights development of the GU system; 3) mouse strains to facilitate developmental and functional studies within the GU system; 4) tutorials describing GU organogenesis; and 5) rapid access to primary data via the GUDMAP database",
"logo": "https://www.gudmap.org/assets/img/logo-white.svg",
"name": "GUDMAP Consortium",
"url": "https://www.gudmap.org/about/"
},
"dateModified": "{{{_RMT}}}",
"datePublished": "{{{_RCT}}}",
"description": "PLEASE FIND IT AT THE BOTTOM (not finalized)",
"funder": {
"@type": "Organization",
"name": "National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH)",
"url": "https://www.niddk.nih.gov/"
},
"identifier": ["https://www.gudmap.org/id/{{{RID}}}"],
"includedInDataCatalog": {
"@type": "DataCatalog",
"name": "GUDMAP",
"url": "https://www.gudmap.org/"
},
"inLanguage": {
"@type": "Language",
"name": "English"
},
"isAccessibleForFree": "true",
"keywords": [
"Gene",
"data management",
"data repository",
"database",
"kidney",
"genitourinary",
"transcriptomics",
"{{{Title}}}",
"{{{Synonyms}}}",
"{{{Description}}}",
"{{#if (eq _scRNASeq_Data true)}}scRNASeq Data{{/if}}",
"{{#if (eq _Specimen_Expression_Data true)}}Specimen Expression Data{{/if}}",
"{{#if (eq _Imaging_Data true)}}Imaging Data{{/if}}",
"{{#if (eq _Array_Data true)}}Array Data{{/if}}"
],
"license": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"name": "Gene: {{{$self.rowName.value}}}",
"publisher": {
"@type": "Organization",
"name": "GUDMAP Consortium",
"logo": "https://www.gudmap.org/assets/img/logo-white.svg",
"url": "https://www.gudmap.org/about/",
"description": "The GenitoUrinary Development Molecular Anatomy Project (GUDMAP) is a consortium of laboratories working to provide the scientific and medical community with tools to facilitate research on the GenitoUrinary (GU) tract. The key components are: 1) a molecular atlas of gene expression for the developing organs of the GU tract; 2) a high resolution molecular anatomy that highlights development of the GU system; 3) mouse strains to facilitate developmental and functional studies within the GU system; 4) tutorials describing GU organogenesis; and 5) rapid access to primary data via the GUDMAP database"
},
"url": "{{{$self.uri.detailed}}}",
"usageInfo": "https://www.gudmap.org/using-gudmap/terms-of-use/"
}
}
}
}
```
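
The conditional keyword entries in the block above, such as `{{#if (eq _Array_Data true)}}Array Data{{/if}}`, render to an empty string when the flag is not set, so the rendered `keywords` array can contain blank entries. The following is a minimal sketch of a post-render cleanup, assuming the rendered annotation is available as a JavaScript array. `cleanKeywords` is a hypothetical helper, and whether Chaise already performs an equivalent step is not confirmed here.

```js
// Drops blank entries and case-insensitive duplicates from a rendered
// keywords array, keeping the first spelling of each keyword.
function cleanKeywords(rendered) {
  const seen = new Set();
  const result = [];
  for (const raw of rendered) {
    const keyword = String(raw).trim();
    const normalized = keyword.toLowerCase();
    if (keyword === "" || seen.has(normalized)) continue;
    seen.add(normalized);
    result.push(keyword);
  }
  return result;
}

console.log(cleanKeywords(["Gene", "kidney", "", "Imaging Data", " ", "gene"]));
// [ 'Gene', 'kidney', 'Imaging Data' ]
```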
### Gene Description
````
Gene description -
In the species {{{Species}}}, {{{Description}}} is encoded by {{#if NCBI_Symbol}}the gene {{{NCBI_Symbol}}}{{else}}this gene{{/if}}. It is a {{#if Gene_Type}}{{{Gene_Type}}} {{/if}}gene with NCBI Gene ID {{{NCBI_GeneID}}}. \n \n
{{#if External_Links}}Further details can be found on the external links - {{{External_Links}}} \n\n{{/if}}
The following data is available for this gene in this page: \n
{{#if (eq _scRNASeq_Data true)}}* scRNASeq Data \n {{/if}}
{{#if (eq _Specimen_Expression_Data true)}}* Specimen Expression Data \n {{/if}}
{{#if (eq _Imaging_Data true)}}* Imaging Data \n {{/if}}
{{#if (eq _Array_Data true)}}* Array Data \n {{/if}}
````
**Q-4MF4**
In the species Mus musculus, zinc finger E-box binding homeobox 2 is encoded by the gene Zeb2. It is a protein-coding gene with NCBI Gene ID 24136.
Further details can be found on the external links - NCBI: [24136](https://www.ncbi.nlm.nih.gov/gene/24136), Ensembl: [ENSMUSG00000026872](http://uswest.ensembl.org/Mus_musculus/Gene/Summary?g=ENSMUSG00000026872), MGI: [MGI:1344407](http://www.informatics.jax.org/marker/MGI:1344407), Vega: [OTTMUSG00000012355](http://vega.sanger.ac.uk/Mus_musculus/Gene/Summary?g=OTTMUSG00000012355)
The following data is available for this gene in this page:
* Specimen Expression Data
* Imaging Data
* Array Data
**Q-4ASR**
In the species Mus musculus, inhibitor of DNA binding 1 is encoded by the gene Id1. It is a protein-coding gene with NCBI Gene ID 15901.
Further details can be found on the external links - NCBI: [15901](https://www.ncbi.nlm.nih.gov/gene/15901), Ensembl: [ENSMUSG00000042745](http://uswest.ensembl.org/Mus_musculus/Gene/Summary?g=ENSMUSG00000042745), MGI: [MGI:96396](http://www.informatics.jax.org/marker/MGI:96396), Vega: [OTTMUSG00000015813](http://vega.sanger.ac.uk/Mus_musculus/Gene/Summary?g=OTTMUSG00000015813)
The following data is available for this gene in this page:
* Specimen Expression Data
* Imaging Data
* Array Data
**Q-605T**
In the species Mus musculus, CLP1, cleavage and polyadenylation factor I subunit is encoded by the gene Clp1. It is a protein-coding gene with NCBI Gene ID 98985.
Further details can be found on the external links - NCBI: [98985](https://www.ncbi.nlm.nih.gov/gene/98985), Ensembl: [ENSMUSG00000027079](http://uswest.ensembl.org/Mus_musculus/Gene/Summary?g=ENSMUSG00000027079), MGI: [MGI:2138968](http://www.informatics.jax.org/marker/MGI:2138968), Vega: [OTTMUSG00000013665](http://vega.sanger.ac.uk/Mus_musculus/Gene/Summary?g=OTTMUSG00000013665)
The following data is available for this gene in this page:
* Specimen Expression Data
* Imaging Data
* Array Data
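
The reason for padding the Gene description is the 50-character minimum noted for the `description` attribute. The following sketch checks a rendered description against that minimum. The 5000-character upper bound is an assumption based on Google's Dataset guidelines and should be verified there, and `checkDescriptionLength` is a hypothetical helper name.

```js
const MIN_DESCRIPTION_LENGTH = 50;   // minimum referred to in the notes on this page
const MAX_DESCRIPTION_LENGTH = 5000; // assumption: upper bound from Google's Dataset guidelines

// Classifies a rendered description as "too short", "too long", or "ok".
function checkDescriptionLength(description) {
  const length = description.trim().length;
  if (length < MIN_DESCRIPTION_LENGTH) return "too short";
  if (length > MAX_DESCRIPTION_LENGTH) return "too long";
  return "ok";
}

// The raw column value alone is too short; the padded sentence is not.
console.log(checkDescriptionLength("zinc finger E-box binding homeobox 2"));
// too short
console.log(checkDescriptionLength(
  "In the species Mus musculus, zinc finger E-box binding homeobox 2 is encoded by the gene Zeb2."
));
// ok
```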
## 5. Gene_Expression:Specimen
keyValues for 16-6F6Y -
https://www.gudmap.org/chaise/record/#2/Gene_Expression:Specimen/RID=16-6F6Y
```js=
{
"tag:isrd.isi.edu,2021:google-dataset": {
"detailed": {
"template_engine": "handlebars",
"dataset": {
"@context": "https://schema.org",
"@type": "Dataset",
"creator": {
"@type": "Person",
"name": "{{{$fkey_Gene_Expression_Specimen_Principal_Investigator_fkey.rowName}}}"
},
"dateModified": "{{{_RMT}}}",
"datePublished": "{{{Release_Date}}}",
"description": "{{#if Probe_Usage_Notes}}{{{Probe_Usage_Notes}}}. {{/if}}For the {{{Sex}}} of the species {{{Species}}}, this specimen has been created.",
"funder": {
"@type": "Organization",
"name": "National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH)",
"url": "https://www.niddk.nih.gov/"
},
"identifier": "{{#if (eq $fkey_Gene_Expression_Specimen_Consortium_fkey.values.URL \"rebuildingakidney.org\")}}https://www.rebuildingakidney.org/id/{{{RID}}}{{else}}https://www.gudmap.org/id/{{{RID}}}{{/if}}",
"includedInDataCatalog": {
"@type": "DataCatalog",
"name": "{{#if (eq $fkey_Gene_Expression_Specimen_Consortium_fkey.values.Name \"RBK\")}}RBK{{else}}GUDMAP{{/if}}",
"url": "{{#if (eq $fkey_Gene_Expression_Specimen_Consortium_fkey.values.Name \"RBK\")}}https://www.rebuildingakidney.org{{else}}https://www.gudmap.org{{/if}}"
},
"inLanguage": {
"@type": "Language",
"name": "English"
},
"isAccessibleForFree": "true",
"keywords": [
"Specimen",
"data management",
"data repository",
"database",
"kidney",
"genitourinary",
"transcriptomics",
"{{{Title}}}",
"{{{Assay_Type}}}",
"{{{Wild_Type}}}"
],
"license": "https://www.gudmap.org/using-gudmap/terms-of-use/",
"name": "Specimen: {{{$self.rowName.value}}}",
"publisher": {
"@type": "Organization",
"name": "{{{$fkey_Gene_Expression_Specimen_Principal_Investigator_fkey.values.Affiliation_Abbreviation}}}"
},
"url": "{{{$self.uri.detailed}}}",
"usageInfo": "https://www.gudmap.org/using-gudmap/terms-of-use/"
}
}
}
}
```