---
tags: spatial, frictionlessdata
---

# Spatial Extent for Lat-Lon Locations

## Point dataset

The [Spatial Data Package Investigation](https://research.okfn.org/spatial-data-package-investigation/#point-datasets) explains how to describe columns in a CSV file containing point data:

```javascript=
"locations": [
  {
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "geojson-path": "data.geojson",
    "role": "start"
  }
]
```

### Required properties

#### `type`

One of:

- `lat-lon` point defined by two columns
- `boundary-id` linked boundary defined by a column with IDs
- `geojson` geometry provided as GeoJSON

*Given that GeoJSON uses lon, lat for its coordinates, perhaps `lon-lat`?*

#### `fields`

All coordinates `MUST` be recorded using [EPSG:4326](https://epsg.io/4326) representing the World Geodetic System 1984 (WGS84) as used in GPS. Projected coordinates are intentionally not supported.

##### `latitude`

The `name` of the field containing latitude in decimal degrees ("-37.8"). This field is of `type`:

- `number` with constraints no broader than:
  - `"minimum": "-90"`
  - `"maximum": "90"`
- `number` with `"format": "latitude"`
- `latitude` *[See discussion](https://discuss.okfn.org/t/geo-data-package/6143/26?u=stephen)*

##### `longitude`

The `name` of the field containing longitude in decimal degrees ("-137.8"). This field is of `type`:

- `number` with constraints no broader than:
  - `"minimum": "-180"`
  - `"maximum": "180"`
- `number` with `"format": "longitude"`
- `longitude` *[See discussion](https://discuss.okfn.org/t/geo-data-package/6143/26?u=stephen)*

### Optional properties

#### `geojson-path`

A local path to an equivalent representation in GeoJSON format, for convenience.

*If a GeoJSON file is included in the data package, then you can't apply `"profile": "tabular-data-package"` to the data package.*

#### `role`

A controlled vocabulary (TBD) that describes the role of the point location. Used when there is more than one point location in the row of data.
Candidate roles:

- start
- end

## A resource with two lat-lon locations

A resource can have more than one column of location data:

> `Locations` Array of one or more sources of location information, within the resource. For instance, a resource may contain both point and boundary information, or two kinds of point information

This shows two sets of `lat-lon` point data. The `name` property is new and needed to support validating the spatial extent. E.g. an error message could be `row 3 - start-location is outside the spatial extent`.

```javascript=
"locations": [
  {
    "name": "start-location",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat-start",
      "longitude": "lon-start"
    },
    "role": "start"
  },
  {
    "name": "end-location",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat-end",
      "longitude": "lon-end"
    },
    "role": "end"
  }
]
```

## Describe and validate the spatial extent of point data

Point data in a tabular data package could reference spatial data to:

- describe the spatial extent of the `lat-lon` data
- validate that a `lat-lon` point is inside the referenced (multi-)polygon

This requires:

- each `lat-lon` pair to be given a `name` so it can be referenced in error messages
- a reference to a named polygon inside a GeoJSON file

The GeoJSON file could be stored in a:

1. file at a URL
2. `resource` in the same data package
3. `resource` in a data package at a URL
4. `resource` in a data package identified via [data package identifier](https://frictionlessdata.io/specs/data-package-identifier/) using datapackage.json
5. `resource` in a data package identified via [data package identifier](https://frictionlessdata.io/specs/data-package-identifier/) using data package directory
6. `resource` in a data package identified via [data package identifier](https://frictionlessdata.io/specs/data-package-identifier/) using GitHub
7. `resource` in a data package identified via [data package identifier](https://frictionlessdata.io/specs/data-package-identifier/) using a Core Repository dataset name
8. standard boundary service

*I may have misunderstood data package identifier and examples 4-7 may be wrong.*

In the examples below, I:

- propose new properties `name` and `spatialExtent`
- draw on [data package identifiers](https://frictionlessdata.io/specs/data-package-identifier/) to identify a data package
- draw on the syntax used for `foreignKeys` in Table Schema and the [Foreign Keys to Data Packages](https://frictionlessdata.io/specs/patterns/#table-schema:-foreign-keys-to-data-packages) pattern to describe a resource and field in a data package
- aspire to a `datapackage` property that correctly resolves the value provided regardless of it being a:
  - url to a datapackage.json
  - url to a data package directory
  - url to a data package stored in GitHub
  - name of core data in a registry
- ignore `version`, but this can be added as an extra property to clarify which data to use

This may not be technically possible, but I thought it worth exploring.

The examples are for a CSV file containing `lat-lon` point locations for Koala sightings. They reference a polygon representing the Australian State of "Victoria" inside a GeoJSON file. The `lat-lon` points should all be inside the polygon.

### `spatialExtent`

#### `reference`

Contains one of the `path`, `datapackage` or `codelist` properties to help resolve the location of the GeoJSON file. Each option must be accompanied by `field` and `value`. `resource` is only required to clarify the `datapackage`.

##### `path`

A URL to a GeoJSON file. `path`:

- fully resolves the location of the GeoJSON file

##### `datapackage`

An [identifier string](https://frictionlessdata.io/specs/data-package-identifier/#identifier-string) that resolves to a data package.
Must be accompanied by `resource` to fully resolve the location of the GeoJSON file.

##### `codelist`

- resolves the location of the GeoJSON file through the help of a resolver service (TBD)

##### `resource`

The `name` of a Data Resource in a `datapackage` with a `locations` `type` of `geojson`.

##### `field`

The name of the property in the GeoJSON that contains the `value` of the polygon.

##### `value`

The value of the property in the GeoJSON that identifies the polygon that describes the spatial extent.

### 1. Reference to GeoJSON file at a URL

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "path": "https://example.com/states/state-boundaries.geojson",
        "field": "state",
        "value": "Victoria"
      }
    }
  }
]
```

### 2. Reference to a resource in the same data package

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "resource": "state-boundaries",
        "field": "state",
        "value": "Victoria"
      }
    }
  }
]
```

### 3. Reference to a resource in a data package at a URL

Uses the `datapackage` concept from

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "datapackage": "https://example.com/states/datapackage.json",
        "resource": "state-boundaries",
        "field": "state",
        "value": "Victoria"
      }
    }
  }
]
```

### 4. Reference to a resource via a data package identifier using datapackage.json

The `datapackage` property should be [`dataPackageJsonUrl`](https://frictionlessdata.io/specs/data-package-identifier/#identifier-object-structure) but wouldn't it be nice if you could just specify `datapackage` and the [identifier string](https://frictionlessdata.io/specs/data-package-identifier/#identifier-string) resolves correctly?
Note this example is the same as [Example 3](#3.-reference-to-a-resource-in-a-data-package-at-a-URL) above.

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "datapackage": "https://example.com/states/datapackage.json",
        "resource": "state-boundaries",
        "field": "state",
        "value": "Victoria"
      }
    }
  }
]
```

### 5. Reference to a resource via a data package identifier using data package directory

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "datapackage": "https://example.com/states/",
        "resource": "state-boundaries",
        "field": "state",
        "value": "Victoria"
      }
    }
  }
]
```

### 6. Reference to a resource via a data package identifier using GitHub

The `datapackage` property should be [`url`](https://frictionlessdata.io/specs/data-package-identifier/#identifier-object-structure) but wouldn't it be nice if you could just specify `datapackage` and the [identifier string](https://frictionlessdata.io/specs/data-package-identifier/#identifier-string) resolves correctly?

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "datapackage": "http://github.com/datasets/states/",
        "resource": "state-boundaries",
        "field": "state",
        "value": "Victoria"
      }
    }
  }
]
```

### 7. Reference to a resource via a data package identifier using Core Dataset Registry

The `datapackage` property should be [`name`](https://frictionlessdata.io/specs/data-package-identifier/#identifier-object-structure) but wouldn't it be nice if you could just specify `datapackage` and the [identifier string](https://frictionlessdata.io/specs/data-package-identifier/#identifier-string) resolves correctly?

The `name` is a dataset in the [Core Datasets registry](http://data.okfn.org/data).
Note: [Specs Issue #567](https://github.com/frictionlessdata/specs/issues/567)

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "datapackage": "australian-states",
        "resource": "state-boundaries",
        "field": "state",
        "value": "Victoria"
      }
    }
  }
]
```

### 8. Reference to a standard boundary service

Using [CSV-Geo-AU State/Territory](https://github.com/TerriaJS/nationalmap/wiki/csv-geo-au#stateterritory-ste) and noting that the data [doesn't exist as GeoJSON](http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/1270.0.55.001July%202011?OpenDocument) but does [elsewhere](https://data.gov.au/dataset?_organization_limit=0&sort=extras_harvest_portal+asc%2C+score+desc&q=states&organization=primeministerandcabinet).

Note that I split out the `field` from the `codelist`, which differs from the format proposed in the [Spatial Data Package Investigation](https://research.okfn.org/spatial-data-package-investigation/#preparing-boundary-linked-tabular-data).

It would be interesting to explore whether `resource` could be used instead of `codelist` to further "harmonise" the language used across the Frictionless Data specification.

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "codelist": "csv-geo-au",
        "field": "ste_name",
        "value": "Victoria"
      }
    }
  }
]
```

## Metadata Property Harmonisation

Exploring "harmonisation" further, if for a location type of `boundary-id` you:

- replaced `codelist` with `resource`. If the `resource` wasn't found locally, go to the boundary resolver service.
*(perhaps this is just confusing?)*
- split `field` from `codelist` *(counter to the proposal in the [Spatial Data Package Investigation](https://research.okfn.org/spatial-data-package-investigation/#point-datasets))*

Then in a CSV with both `lat-lon` point data and `boundary-id` data you'd have:

```javascript=
"locations": [
  {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {
      "latitude": "lat",
      "longitude": "lon"
    },
    "spatialExtent": {
      "reference": {
        "resource": "csv-geo-au",
        "field": "ste_name",
        "value": "Victoria"
      }
    }
  },
  {
    "type": "boundary-id",
    "field": "lga",
    "reference": {
      "resource": "csv-geo-au",
      "field": "ste_name"
    }
  }
]
```

This may be worth exploring further to make it easier to create data packages and remove friction in authoring them.

# Spatial Extent for Boundary-id Locations

Spatial extent can also be applied to `boundary-id` locations. In this case a CSV file has a State/Territory column and a Population column:

state       | population (million)
------------|---------------------
Queensland  | 4.69
Victoria    | 5.79

State/Territory is described by CSV-Geo-AU using `field` [ste-name](https://github.com/TerriaJS/nationalmap/wiki/csv-geo-au#stateterritory-ste).

The spatial extent is the whole of Australia as described in CSV-Geo-AU using the `field` [aus-code](https://github.com/TerriaJS/nationalmap/wiki/csv-geo-au#australia-as-a-whole-aus) and the `value`: `0`.

Using the language from the [Spatial Data Package Investigation](https://research.okfn.org/spatial-data-package-investigation/#point-datasets):

```javascript=
"locations": [
  {
    "type": "boundary-id",
    "field": "state",
    "codelist": "csv-geo-au:ste_name",
    "spatialExtent": {
      "reference": {
        "codelist": "csv-geo-au",
        "field": "aus-code",
        "value": "0"
      }
    }
  }
]
```

Or using the language proposed in [section 8](#8.-reference-to-a-standard-boundary-service):
```javascript=
"locations": [
  {
    "type": "boundary-id",
    "field": "state",
    "reference": {
      "codelist": "csv-geo-au",
      "field": "ste_name"
    },
    "spatialExtent": {
      "reference": {
        "codelist": "csv-geo-au",
        "field": "aus-code",
        "value": "0"
      }
    }
  }
]
```
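
## Sketch: validating `lat-lon` points against a `spatialExtent`

To make the validation idea from "Describe and validate the spatial extent of point data" more concrete, here is a minimal Python sketch of how a validator might check `lat-lon` rows against a polygon selected by `field` and `value`. Everything in it is an assumption made for illustration: the function names (`select_polygons`, `point_in_polygons`, `validate_spatial_extent`) are hypothetical and not part of any Frictionless Data library, the GeoJSON is assumed to be an already loaded `FeatureCollection`, rows are numbered from 1 at the first data row, and the "Victoria" rectangle is a made-up stand-in, not the real state boundary. It uses a plain ray-casting test and ignores edge cases such as points lying exactly on a boundary.

```python
def select_polygons(geojson, field, value):
    """Return the polygons of the first feature whose property `field` equals `value`.

    Each polygon is a list of rings: the exterior ring first, then any holes.
    """
    for feature in geojson["features"]:
        if feature["properties"].get(field) == value:
            geometry = feature["geometry"]
            if geometry["type"] == "Polygon":
                return [geometry["coordinates"]]
            if geometry["type"] == "MultiPolygon":
                return geometry["coordinates"]
            raise ValueError("unsupported geometry type: " + geometry["type"])
    raise LookupError(f"no feature with {field}={value!r}")


def point_in_ring(lon, lat, ring):
    """Ray casting: toggle `inside` for each edge crossed by a ray going east."""
    inside = False
    for start, end in zip(ring, ring[1:]):  # GeoJSON rings repeat the first position last
        x1, y1 = start[0], start[1]
        x2, y2 = end[0], end[1]
        if (y1 > lat) != (y2 > lat):  # the edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def point_in_polygons(lon, lat, polygons):
    """True if the point is inside any polygon's exterior ring and outside its holes."""
    for rings in polygons:
        exterior, holes = rings[0], rings[1:]
        if point_in_ring(lon, lat, exterior) and not any(
            point_in_ring(lon, lat, hole) for hole in holes
        ):
            return True
    return False


def validate_spatial_extent(rows, location, polygons):
    """Return one error message per row whose point falls outside the polygons."""
    lat_field = location["fields"]["latitude"]
    lon_field = location["fields"]["longitude"]
    errors = []
    for number, row in enumerate(rows, start=1):
        lat, lon = float(row[lat_field]), float(row[lon_field])
        if not point_in_polygons(lon, lat, polygons):
            errors.append(f"row {number} - {location['name']} is outside the spatial extent")
    return errors


# Made-up rectangle standing in for the real "Victoria" boundary (lon, lat order).
boundaries = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"state": "Victoria"},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[141, -39], [150, -39], [150, -34], [141, -34], [141, -39]]],
        },
    }],
}
location = {
    "name": "koala-sighting",
    "type": "lat-lon",
    "fields": {"latitude": "lat", "longitude": "lon"},
}
rows = [
    {"lat": "-37.8", "lon": "144.96"},   # inside the rectangle
    {"lat": "-33.87", "lon": "151.21"},  # outside the rectangle
]

polygons = select_polygons(boundaries, "state", "Victoria")
errors = validate_spatial_extent(rows, location, polygons)
print(errors)  # → ['row 2 - koala-sighting is outside the spatial extent']
```

The `name` of the location is what makes the error message readable, which is the reason for proposing it as a property. Resolving `path`, `datapackage` or `codelist` to the actual GeoJSON is deliberately left out of the sketch, since that is the open question the numbered examples explore.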