>A simple way to run a data cube
>[https://github.com/opendatacube/cube-in-a-box](https://github.com/opendatacube/cube-in-a-box)
# How to use PostgreSQL
1. Open psql with this command: `psql -U postgres`
2. Use pgAdmin to create a user ([follow this page](https://www.guru99.com/postgresql-create-alter-add-user.html))
# Indexing the data
> Documentation
> https://datacube-core.readthedocs.io/en/stable/ops/product.html
> Scripts for indexing data into ODC instances
> https://github.com/opendatacube/datacube-dataset-config
## Create a Product Definition
ODC needs some information up front to know what to do with the datasets you index. That information lives in a product definition.
1. Create the product definition file:
```yml
---
name: Nimrod
description: Nimrod c-band Rain Radar 1km
metadata_type: eo3
license: CC-BY-SA-4.0
metadata:
  product:
    name: UKMO
storage:
  crs: EPSG:27700
  resolution:
    x: 1000
    y: 1000
measurements:
  - name: "rainrate"
    dtype: "int8"
    units: "mm/h"
    nodata: -1
    aliases: []
```
2. Use the `datacube` command-line tool to load the product definition (`datacube product add`).
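
As a minimal sketch of step 2: the ODC command-line tool registers a product definition with its `datacube product add` subcommand. The Python wrapper below only builds that command and runs it when the `datacube` executable is on the path. The file name `nimrod_product.yaml` is a hypothetical name for the definition file above, and a configured ODC database connection is assumed.

```python
import shutil
import subprocess


def product_add_command(definition_path):
    """Build the CLI call that registers a product definition with ODC."""
    return ["datacube", "product", "add", definition_path]


def add_product(definition_path):
    """Run the command if the datacube CLI is installed; return True on success."""
    cmd = product_add_command(definition_path)
    if shutil.which("datacube") is None:
        # Nothing to run against, so just show what would have been executed.
        print("datacube CLI not found; would run:", " ".join(cmd))
        return False
    return subprocess.run(cmd, check=False).returncode == 0


if __name__ == "__main__":
    # Hypothetical file name for the product definition shown above.
    add_product("nimrod_product.yaml")
```

Keeping the command construction separate from its execution makes it easy to print the command first and check it before touching the index.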

## Ensure Dataset Documents are complete
Every dataset that you intend to index requires a metadata document describing what the data represents, where it has come from, and what format it is stored in. At a minimum, it needs the dimensions or fields you want to search by, such as lat, lon and time, but you can include any other information you find useful.
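
To make that minimum concrete, here is a hedged sketch that assembles a bare-bones dataset document as a Python dictionary and prints it as JSON for inspection. The field names follow the EO3 example in the next section. The helper name `minimal_dataset_doc`, the file `rainrate.tif`, and the grid values are illustrative assumptions, not real Nimrod metadata.

```python
import json
import uuid
from datetime import datetime, timezone


def minimal_dataset_doc(product_name, crs, shape, transform, band_paths, when):
    """Assemble the fields needed to search a dataset by space and time."""
    return {
        "id": str(uuid.uuid4()),  # every dataset needs its own UUID
        "$schema": "https://schemas.opendatacube.org/dataset",
        "product": {"name": product_name},
        "crs": crs,
        "grids": {"default": {"shape": list(shape), "transform": list(transform)}},
        "measurements": {name: {"path": path} for name, path in band_paths.items()},
        "properties": {"datetime": when.astimezone(timezone.utc).isoformat()},
        "lineage": {},
    }


# Illustrative values only: a hypothetical 1 km grid and file name.
doc = minimal_dataset_doc(
    product_name="Nimrod",
    crs="EPSG:27700",
    shape=(100, 100),
    transform=(1000, 0, 0, 0, -1000, 100000, 0, 0, 1),
    band_paths={"rainrate": "rainrate.tif"},
    when=datetime(2020, 1, 1, 7, 0, tzinfo=timezone.utc),
)
print(json.dumps(doc, indent=2))
```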
### EO3 Format
EO3 is an intermediate format before we move to something more standard like STAC.
```yml
# UUID of the dataset
id: f884df9b-4458-47fd-a9d2-1a52a2db8a1a
$schema: 'https://schemas.opendatacube.org/dataset'

# Product name
product:
  name: Nimrod_c-band_rain_radar_1km

# Native CRS, assumed to be the same across all bands
crs: "epsg:27700"

# Optional GeoJSON object in the units of native CRS.
# Defines a polygon such that all valid pixels across all bands
# are inside this polygon.
geometry:
  type: Polygon
  coordinates: [[..]]

# Mapping name:str -> { shape:     Tuple[ny: int, nx: int]
#                       transform: Tuple[float x 9]}
# Captures image size and geo-registration
grids:
  default:  # "default" grid must be present
    shape: [7811, 7691]
    transform: [30, 0, 618285, 0, -30, -1642485, 0, 0, 1]
  pan:  # Landsat Panchromatic band is higher res image than other bands
    shape: [15621, 15381]
    transform: [15, 0, 618292.5, 0, -15, -1642492.5, 0, 0, 1]

# Per band storage information and references into `grids`
# Bands using "default" grid should not need to reference it
measurements:
  pan:               # Band using non-default "pan" grid
    grid: "pan"      # should match the name used in `grids` mapping above
    path: "pan.tif"
  red:               # Band using "default" grid should omit `grid` key
    path: red.tif    # Path relative to the dataset location
  blue:
    path: blue.tif
  multiband_example:
    path: multi_band.tif
    band: 2          # int: 1-based index into multi-band file
  netcdf_example:    # just example, mixing TIFF and netcdf in one product is not recommended
    path: some.nc
    layer: some_var  # str: netcdf variable to read

# Dataset properties, prefer STAC standard names here
# Timestamp is the only compulsory field here
properties:
  eo:platform: landsat-8
  eo:instrument: OLI_TIRS

  # If it's a single time instance use datetime
  datetime: 2020-01-01T07:02:54.188Z  # Use UTC

  # When recording time range use dtr:{start,end}_datetime
  dtr:start_datetime: 2020-01-01T07:02:02.233Z
  dtr:end_datetime: 2020-01-01T07:03:04.397Z

  # ODC specific "extensions"
  odc:processing_datetime: 2020-02-02T08:10:00.000Z
  odc:file_format: GeoTIFF
  odc:region_code: "074071"   # provider specific unique identifier for the same location
                              # for Landsat '{:03d}{:03d}'.format(path, row)
  dea:dataset_maturity: final # one of: final| interim| nrt (near real time)
  odc:product_family: ard     # can be useful for larger installations

# Lineage only references UUIDs of direct source datasets
# Mapping name:str -> [UUID]
lineage: {}  # set to empty object if no lineage is defined
```
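
The nine `transform` values in `grids` are the rows of a 3x3 affine matrix that maps a pixel position (column, row) to coordinates in the native CRS. As a small illustration in plain Python with no ODC dependency (the helper names are hypothetical), the sketch below applies the `default` grid from the example to work out the bounding box of the image:

```python
def pixel_to_world(transform, col, row):
    """Apply a row-major 3x3 affine transform to a pixel (col, row) position."""
    a, b, c, d, e, f = transform[:6]
    return (a * col + b * row + c, d * col + e * row + f)


def grid_bounds(shape, transform):
    """Return (left, bottom, right, top) for a grid with shape (ny, nx)."""
    ny, nx = shape
    corners = [pixel_to_world(transform, col, row)
               for col in (0, nx) for row in (0, ny)]
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Values copied from the "default" grid in the example document.
shape = [7811, 7691]
transform = [30, 0, 618285, 0, -30, -1642485, 0, 0, 1]
print(grid_bounds(shape, transform))
# → (618285, -1876815, 849015, -1642485)
```

The negative y scale (`-30`) means row 0 is the top edge of the image, which is why the `transform` offset gives the top-left corner.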
## Run the indexing process
Use the `datacube` command-line tool to add the dataset documents to the index (`datacube dataset add`).
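
A minimal sketch of this step, assuming each dataset has its own metadata document on disk: the ODC command-line tool indexes documents with `datacube dataset add`, which accepts one or more document paths. The directory layout and the `*.yaml` pattern below are assumptions for illustration. The wrapper only builds the command and runs it when the `datacube` executable is available.

```python
import shutil
import subprocess
from pathlib import Path


def dataset_add_command(doc_paths):
    """Build one CLI call that indexes the given dataset documents."""
    return ["datacube", "dataset", "add", *[str(p) for p in doc_paths]]


def index_directory(directory, pattern="*.yaml"):
    """Index every dataset document matching `pattern` under `directory`."""
    docs = sorted(Path(directory).rglob(pattern))
    if not docs:
        print("no dataset documents found in", directory)
        return False
    cmd = dataset_add_command(docs)
    if shutil.which("datacube") is None:
        # Nothing to run against, so just show what would have been executed.
        print("datacube CLI not found; would run:", " ".join(cmd))
        return False
    return subprocess.run(cmd, check=False).returncode == 0
```

The matching product has to be loaded before its datasets can be indexed, so this step comes after the product definition has been added.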