# 2023-05-18 PFHub Data & Schema Musings
Attendees:
- Daniel Wheeler
- Yannick Congo
- Trevor Keller
## Agenda
Discuss [PFHub], [MaRDA], [OPTIMADE], [RO-Crate], and [PFHub Schema].
* Specific LinkML schema implementation:
<https://github.com/usnistgov/pfhub-schema/blob/main/src/pfhub_schema/schema/pfhub_schema.yaml>
<!-- links -->
[PFHub]: https://pages.nist.gov/pfhub
[PFHub Schema]: https://github.com/usnistgov/pfhub-schema
[MaRDA]: https://www.marda-alliance.org
[OPTIMADE]: https://www.optimade.org
[RO-Crate]: https://www.researchobject.org/ro-crate/
## Resources
Example working group: [MaRDA Extractors](https://github.com/marda-alliance/metadata_extractors)
## Questions
- What's unique about phase field data?
- Why is this necessary?
- Can Yannick provide a list of tools that capture simulations, so that we have a reference set of similar schemas?
  - Sumatra
  - CoRR
  - OPTIMADE
- Is there something obvious that we're missing?
- Will this serve a purpose and will people adopt it?
- Can we imagine what the data might be used for (Jon's question)?
- Is a "Phase Field Schema group" the right level?
## The Schema
We need a modest goal.
1. The description of the mathematical problem being solved
   - This could just be a link to a description such as a notebook
2. The most boring metadata: who, why, when
3. Metadata about the computation
   - link to an environment
   - link to input files
   - how long, how much memory
   - description of numerical technique (to make PFHub more searchable,
     for example)
4. The data
   - a way to describe the data and its location
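
To make the four parts above concrete, here is a rough sketch of what a single record might look like, written as plain Python dataclasses. This is not the PFHub Schema itself: every class name, field name, and URL below is a hypothetical placeholder chosen for illustration, and the real field set would come out of the working group.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class Computation:
    """Part 3: metadata about the computation (hypothetical field names)."""
    environment_url: str          # link to an environment description
    input_file_urls: list[str]    # links to input files
    wall_time_s: float            # how long
    memory_mb: float              # how much memory
    numerical_method: str         # free text, useful for search


@dataclass
class DataItem:
    """Part 4: a description of the data and its location."""
    description: str
    url: str


@dataclass
class SimulationRecord:
    problem_url: str              # part 1: link to the problem description
    who: str                      # part 2: the boring metadata
    why: str
    when: str                     # ISO 8601 date string
    computation: Computation
    data: list[DataItem] = field(default_factory=list)


# All values are made-up placeholders.
record = SimulationRecord(
    problem_url="https://example.org/problem-notebook",
    who="A. Researcher",
    why="benchmark comparison",
    when="2023-05-18",
    computation=Computation(
        environment_url="https://example.org/environment.yml",
        input_file_urls=["https://example.org/input.py"],
        wall_time_s=3600.0,
        memory_mb=512.0,
        numerical_method="finite volume",
    ),
    data=[DataItem("free energy vs. time", "https://example.org/free_energy.csv")],
)

# Serialize to JSON so the record could be stored next to the data.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

Keeping most fields as links (to a notebook, an environment, input files) keeps the record small, which fits the "modest goal" above.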
## What do we need for the proposal?
### Working Group Members
### Work plan
### Goals and Expected Impact
### Deliverables
- a suggested schema
- a tool to lint schema and data
- a simple tool to display some examples of the schema and data
- implementation as part of PFHub
- an easy way to register schema uploads
- examples of using the schema on Zenodo and other databases
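
As a very rough idea of what the lint deliverable could do, the sketch below checks a record (a plain `dict`) for required fields and basic types. The field names in `REQUIRED_FIELDS` and the function `lint_record` are hypothetical; since the schema is written in LinkML, a real tool would presumably validate against the schema file rather than a hand-written table like this.

```python
# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {
    "problem_url": str,
    "who": str,
    "when": str,
    "data_url": str,
}


def lint_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )
    return problems


# Placeholder records for illustration only.
good = {
    "problem_url": "https://example.org/problem-notebook",
    "who": "A. Researcher",
    "when": "2023-05-18",
    "data_url": "https://example.org/free_energy.csv",
}
bad = {"who": "A. Researcher", "when": 20230518}

for label, rec in [("good", good), ("bad", bad)]:
    issues = lint_record(rec)
    print(label, "OK" if not issues else issues)
```

Returning a list of messages rather than raising on the first error lets one pass report everything wrong with an upload.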