---
title: CSW Introduction
slideOptions:
  backgroundTransition: 'fade'
  spotlight:
    enabled: false
---

## :spiral_note_pad: CSW Introduction

:::success
:dart: **Goals**
- What is `CSW`
- What does the retrieved metadata look like
- Problems of retrieving metadata from CSW
- Alternative: GeoNetwork database access
:::

---

## :building_construction: Recap

```mermaid
graph LR
    N --Api--> Harvester((Harvester))
    E --Api--> Harvester((Harvester))
    R --Api--> Harvester((Harvester))
    C --Api--> Harvester((Harvester))
    Harvester --metadata*.xml--> Parser{XmlParser/XmlSerializer}--ETL-->Database[(Database)]
    Database -.-> EF{EFCore} -.-> SearchBar
    EF -.-> Database
    SearchBar -.-> EF
```

---

## :mag: What is CSW

1. Every data centre has its own way of providing metadata.
2. ...**But** nearly all of them expose their metadata through an API: **Catalogue Service for the Web** (`CSW`)

> CSW is a standard for exposing a catalogue of geospatial records in XML on the Internet (over HTTP) -- [csw@wiki]

[csw@wiki]: https://en.wikipedia.org/wiki/Catalogue_Service_for_the_Web

---

## Example: `requests.get(CSW_API_URL_WITH_PARAMS)`

```
https://metadata.bgs.ac.uk/geonetwork/srv/eng/csw?
    service=CSW&
    version=2.0.2&
    REQUEST=GetRecords&
    resultType=results&
    outputFormat=application/xml&
    outputSchema=http://www.isotc211.org/2005/gmd&
    maxRecords=100&
    typeNames=csw:Record&
    ElementSetName=full
```

---

In short, `CSW` is just a regular API with **a lot of** parameters; see [csw@ogc] for details.

[csw@ogc]: https://docs.ogc.org/is/12-176r7/12-176r7.html

---

## Demo: Retrieving NERC Metadata

- [bas] - own document store, providing `CSW`
- [bgs] - GeoNetwork: providing `CSW`
- [ceda] - GeoNetwork: providing `CSW`
- [ceh] - own document store, **not** providing `CSW`, `WAF` instead.
- [nerc data catalogue] - GeoNetwork: providing `CSW`
    - it is currently our source of metadata.
    - it is the overall (aggregated) metadata catalogue.
    - it harvests/syncs NERC metadata through CSW regularly.
        - not sure how it does this with ceh
        - there is some inconsistency, actually.

[bas]: https://api.bas.ac.uk/data/metadata/csw/v2/published?service=CSW&version=2.0.2&REQUEST=GetRecords&resultType=results&outputFormat=application/xml&outputSchema=http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/&maxRecords=100&typeNames=gmd:MD_Metadata&ElementSetName=full
[bgs]: https://metadata.bgs.ac.uk/geonetwork/srv/eng/csw?service=CSW&version=2.0.2&REQUEST=GetRecords&resultType=results&outputFormat=application/xml&outputSchema=http://www.isotc211.org/2005/gmd&maxRecords=100&typeNames=csw:Record&ElementSetName=full
[ceda]: https://csw.ceda.ac.uk/geonetwork/srv/eng/csw?service=CSW&version=2.0.2&REQUEST=GetRecords&resultType=results&outputFormat=application/xml&outputSchema=http://www.isotc211.org/2005/gmd&maxRecords=100&typeNames=gmd:MD_Metadata&ElementSetName=full
[ceh]: https://catalogue.ceh.ac.uk/eidc/documents
[nerc data catalogue]: https://data-search.nerc.ac.uk/geonetwork/srv/eng/csw?service=CSW&version=2.0.2&REQUEST=GetRecords&resultType=results&outputFormat=application/xml&http://www.isotc211.org/2005/gmd

---

You can find all of these in the [github-repo].
- they are PowerShell scripts (`.ps1`)
- ... *i.e.* they run in Windows PowerShell only.

[github-repo]: https://github.com/UoMResearchIT/nerc-digital-solutions-hub/blob/develop/nerc-data-centres/Metadata/download-nerc.ps1

---

## Problems of retrieving metadata from CSW

- not all metadata records have the same structure.
    - see the [bgs] example
    - e.g. `identificationInfo` content could be `MD_DataIdentification` or `SV_ServiceIdentification`
- to extract the info from the metadata, you either
    1. find the XPath of each piece of info you need, or
    2. build a corresponding class/data binder
        - use an XML deserialisation program to load the info into the class.

---

## GeoNetwork Database Access?
[GeoNetwork official docker image]

[GeoNetwork official docker image]: https://hub.docker.com/_/geonetwork

https://github.com/geonetwork/docker-geonetwork/blob/main/4.2.3/docker-compose.yml

---

GeoNetwork PostgreSQL DB :arrow_right: structured metadata :arrow_right: Reverse engineering the classes, see [efcore-scaffolding](https://learn.microsoft.com/en-us/ef/core/managing-schemas/scaffolding/?tabs=dotnet-core-cli)

---
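
Going back to the "Problems of retrieving metadata from CSW" slide: the XPath option could look roughly like the sketch below. It is only an illustration, not the project's parser. The function name `extract_identification` and the trimmed `SAMPLE_RESPONSE` are made up for this slide; a real `GetRecords` response carries far more elements. It assumes ISO 19139 records (the `gmd` output schema) and uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

# Namespaces of ISO 19139 records wrapped in a CSW 2.0.2 response.
NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "srv": "http://www.isotc211.org/2005/srv",
}

# Hypothetical, heavily trimmed response: one dataset record, one service record.
SAMPLE_RESPONSE = """
<csw:GetRecordsResponse
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:srv="http://www.isotc211.org/2005/srv">
  <csw:SearchResults>
    <gmd:MD_Metadata>
      <gmd:identificationInfo>
        <gmd:MD_DataIdentification>
          <gmd:citation><gmd:CI_Citation><gmd:title>
            <gco:CharacterString>Dataset A</gco:CharacterString>
          </gmd:title></gmd:CI_Citation></gmd:citation>
        </gmd:MD_DataIdentification>
      </gmd:identificationInfo>
    </gmd:MD_Metadata>
    <gmd:MD_Metadata>
      <gmd:identificationInfo>
        <srv:SV_ServiceIdentification>
          <gmd:citation><gmd:CI_Citation><gmd:title>
            <gco:CharacterString>Service B</gco:CharacterString>
          </gmd:title></gmd:CI_Citation></gmd:citation>
        </srv:SV_ServiceIdentification>
      </gmd:identificationInfo>
    </gmd:MD_Metadata>
  </csw:SearchResults>
</csw:GetRecordsResponse>
"""

TITLE_PATH = "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString"


def extract_identification(xml_text):
    """Return (identification kind, title) for every MD_Metadata record."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter("{%s}MD_Metadata" % NS["gmd"]):
        # "*" matches whichever identification element the record uses.
        ident = record.find("gmd:identificationInfo/*", NS)
        if ident is None:
            continue
        kind = ident.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
        title = ident.findtext(TITLE_PATH, default="", namespaces=NS)
        results.append((kind, title.strip()))
    return results


for kind, title in extract_identification(SAMPLE_RESPONSE):
    print(kind, "->", title)
```

The wildcard step is the point here: the same path works whether `identificationInfo` holds `MD_DataIdentification` or `SV_ServiceIdentification`. You would still need one such path per field, which is the cost of option 1.

---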