TODO:

  • author names (?)
  • STAC paragraph edit
  • data storage paragraph edit
  • Allas usage, more info
  • Learning material links
    * Geoportti mention

Formatting

  • Table of contents missing (in the right panel)
  • Capital first letter in bullet points
  • Some links are not links, for example under GEE, where can I store the data, OTB, Resources
  • Headings style in Software section
  • QGIS bullet list not bullet list
  • Julia section, 2 links in one row
  • In Puhti / On Puhti ?

General
* Guide rather than tutorial

  • Target group?
    • "beginners" with background from other fields
    • give a starting point
      ~~* TODO: provide links to remote sensing basics; ~~
      • echoes from space for SAR
      • FMI/SYKE pdf?
      • Canadian center for remote sensing
        * Prerequisites? -> not needed for Guide, there might be something for everyone

Data sources:

  • A table with main products + resolution, revisit time, years, bands?
    • -> find something, or create if it does not exist
      • did not find good one; and it is quite a lot of work to create one, so leaving this open for now.
  • https://research.csc.fi/open-gis-data#intdata3 mentions a few more satellites; not sure, but maybe some of them could also be listed here
    ~~* STAC ~~
    • -> add info box about it with link to graphical
    • FMI catalogue
    • Element84 (example one in github)

Storage solutions
* I would have Puhti and Allas storage under same heading.
* merge during process/short term storage

* keep only Allas/Puhti part, remove others, only have links to docs about other services
* Add Allas webinar link to Allas links. Some more comments about usage
* Would leave out Long term and PAS all together.

Processing

  • Hmm, the link to Tykky for installations is a little bit strange, but it seems that the docs do not have any good page to link to for own installations.
    * Application Programming Interface (API). For me acronym API means totally different thing, would not use it here. WMS and WFS are APIs. -> Scripting languages?
    • ok
  • Python/R/Julia list not needed in the first list?
    ~~* Not sure about the GUI/CLI/API division, most of GUI tools have also CLI and Python API
    • toolnames with interfaces as tool list~~

Added later: link to geocomputing page for Puhti basics?

GUI
* NoMachine away
* deprecated

  • "the actual efficient processing should not be done within interactive jobs." -> within batch jobs

Python
* Add Python GIS course link ?
* CSC course link

* GEE, Py6S added in 2022 to geoconda, any comment on that
* check geoconda content and add here

* GEE left out to avoid confusion
* STAC libraries
* (R libraries a lot more limited; no example, but is installed)

~~
Added later: add link to ML stuff too?~~

~~Matlab

  • Add Matlab ?
    • ok, with extensions for EO (?)~~

Earth Observation guide

This guide aims to help users who wish to work with Earth Observation (EO) data using CSC's computing resources.

The purpose of this guide is to help you find the right data and tools for EO tasks. It is based on seminars on the topic held at CSC in 2018 and 2022. Part of the material is also taught in geospatial training at CSC; check the training calendar for dates and topics of upcoming courses. If you encounter problems or have questions, CSC's specialists are happy to help with all aspects of your data-driven research and can be contacted via the CSC Service Desk: servicedesk@csc.fi.

If you are interested in the fundamentals of remote sensing, take a look at these excellent resources:

  • Fundamentals of remote sensing tutorial by the Canada Centre for Mapping and Earth Observation, Natural Resources Canada; an "interactive module is intended as an overview at a senior high school or early university level and touches on physics, environmental sciences, mathematics, computer sciences and geography."
  • Echoes in space - Introduction to RADAR remote sensing by the European Space Agency; "a detailed insight into the history of Radar technology, including all the basics that are needed to understand how electromagnetic waves work and a unique hands-on experience to work with Radar data in diverse application scenarios."
  • Newcomers guide to Earth Observation by the European Space Agency, "a guide to help non-experts in providing a starting point in the decision process for selecting an appropriate Earth Observation (EO) solution."

EO potential

  • Possibility to observe a wide area at the same time
  • Non-intrusive
  • Same sensor for different parts of the world
  • Time series

-> Raster data

  • One file per band or multiband files
  • Grid of pixel values
  • Example of continuous data
  • Georeference: the coordinates of the top-left pixel, the size of each pixel in the X direction, the size of each pixel in the Y direction, and the amount (if any) by which the product is rotated.
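
As a minimal sketch of how the georeferencing values listed above fit together, the Python snippet below converts a pixel position into map coordinates. It assumes the six-value ordering used by GDAL geotransforms; the function name `pixel_to_map` and the example origin and pixel size are made up for illustration.

```python
def pixel_to_map(geotransform, col, row):
    """Return the map coordinates (x, y) of the top-left corner of a pixel.

    The geotransform is assumed to follow the GDAL ordering:
    (x of top-left corner, pixel width, row rotation,
     y of top-left corner, column rotation, pixel height).
    The pixel height is negative for north-up images.
    """
    x0, pixel_width, row_rotation, y0, col_rotation, pixel_height = geotransform
    x = x0 + col * pixel_width + row * row_rotation
    y = y0 + col * col_rotation + row * pixel_height
    return x, y


# Hypothetical north-up raster with 10 m pixels and no rotation
example_geotransform = (500000.0, 10.0, 0.0, 7000000.0, 0.0, -10.0)

# Top-left corner of the pixel in column 3, row 2
corner = pixel_to_map(example_geotransform, 3, 2)

# Adding 0.5 to the column and row gives the centre of the same pixel
center = pixel_to_map(example_geotransform, 3.5, 2.5)

print(corner, center)
```

Libraries such as GDAL and rasterio read these values from the file, so in practice you rarely need to write this conversion yourself; the sketch only shows what the stored numbers mean.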

Using EO data in your research

What data do I need?

Consider:

  • Sensor
  • Resolution
    • Temporal: when and how often a certain area is visited
    • Spatial: the area on the ground that each pixel covers
    • Spectral: the part of the electromagnetic spectrum that is observed and the spectral width of each band provided
    • Radiometric: how many values are possible for each pixel (bit-depth)
  • Costs
    • Free: e.g. Landsat, MODIS, Sentinel
    • Non-free (but might be available for free or at a reduced price for research): e.g. WorldView, SPOT, Planet
  • Preprocessing needs
    • Raw or pre-processed
  • User experience and knowledge
    • RADAR/LiDAR require solid background knowledge for processing and interpretation
    • Optical data is more easily interpreted and processed (and more pre-processed data is available)
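
To make the radiometric resolution item above more concrete, here is a small Python sketch that relates bit depth to the number of possible pixel values and to the uncompressed size of one band. The function names and the band dimensions are illustrative assumptions, and the sketch assumes that pixel values are stored in whole bytes.

```python
def values_per_pixel(bit_depth):
    """Number of distinct values a pixel can take at a given bit depth."""
    return 2 ** bit_depth


def uncompressed_band_size_bytes(rows, cols, bit_depth):
    """Rough uncompressed size of one band, assuming whole-byte storage."""
    bytes_per_pixel = (bit_depth + 7) // 8  # e.g. 12-bit data needs 2 bytes
    return rows * cols * bytes_per_pixel


for bits in (8, 12, 16):
    print(f"{bits}-bit data: {values_per_pixel(bits)} possible values per pixel")

# Illustrative band of 10980 x 10980 pixels with 12-bit data
size = uncompressed_band_size_bytes(10980, 10980, 12)
print(f"Uncompressed band size: {size / 1024 ** 2:.0f} MiB")
```

Multiplying the result by the number of bands and scenes gives a first estimate of the storage space a study needs, before compression.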

Where do I find the data?

The best place to get the data from depends on your needs: Do you want to download the data into your own processing environment or do you need a processing environment close to the data? The answer depends on what you want to do with the data and where it is located.

Below is an (incomplete) list of services that provide download capabilities, or both download and processing capabilities (marked with *):

See also the list of other data sources on CSC's research pages.

=== "CSC *"

    * Puhti
        * [List of all available datasets in Puhti](https://docs.csc.fi/data/datasets/spatial-data-in-csc-computing-env/#spatial-data-in-puhti)
        * Sentinel and Landsat mosaics of Finland provided by FMI and SYKE: `/appl/data/geo/sentinel/s2`
        * Every CSC user has **read** access to data stored on Puhti; there is no need to copy it unless you need to modify it
    * Allas
        * [List of all available geospatial datasets in Allas](https://docs.csc.fi/data/datasets/spatial-data-in-csc-computing-env/#spatial-data-in-allas)
        * Sentinel-2 L2A data of crop-growing areas in Finland, growing seasons 2016-present, [usage instructions](https://a3s.fi/sentinel-readme/README.txt)
        * In some cases, data can be read directly from Allas without downloading, see e.g. [GDAL docs](https://docs.csc.fi/apps/gdal/#using-files-directly-from-allas) and [Allas Python examples](https://github.com/csc-training/geocomputing/blob/master/python/allas/working_with_allas_from_Python_S3.py)

=== "Open Access Hubs"

    [SciHub](https://scihub.copernicus.eu/dhus/#/home)

    * Needs [registration](https://scihub.copernicus.eu/dhus/#/self-registration)
    * Sentinel-2 L1C and L2A products
    * Sentinel-1 SLC, GRD, RAW and OCN products
    * Worldwide
    * Note: most of the data is in the "Long term archive" and cannot be downloaded directly, but needs to be requested

    [FinHub](https://finhub.nsdc.fmi.fi/#/home)

    * Needs [registration](https://nsdc.fmi.fi/services/service_finhub_registration)
    * Sentinel-2 L1C products
    * Sentinel-1 SLC, GRD and OCN products
    * Only Finland (and the Baltics)

    [ASF](https://search.asf.alaska.edu/#/)

    * Needs [registration](https://urs.earthdata.nasa.gov/users/new?)
    * Sentinel-1 SLC, GRD, RAW and OCN products
    * Many SAR and SAR-derived datasets from other sensors
    * Worldwide
    * Sentinel-1 data available for immediate download

    **All of the above** provide a similar Graphical User Interface (GUI) and Application Programming Interface (API) to access the data.
    Other tools for downloading data from the open access hubs include [sentinelsat](https://sentinelsat.readthedocs.io/en/stable/), with [examples for SciHub and FinHub](https://github.com/csc-training/geocomputing/blob/master/python/sentinel/sentinelsat_download_from_finhub_and_scihub.py).

=== "USGS EarthExplorer"

    [EarthExplorer](https://earthexplorer.usgs.gov/)

    * Needs [registration](https://ers.cr.usgs.gov/register)
    * Lots of different US-related datasets
    * Main dataset: Landsat, worldwide
    * GUI in web interface and bulk download
    * [Landsat download instructions](https://lta.cr.usgs.gov/sites/default/files/LS_C2_Help_122020.pdf)

=== "NASA Earthdata"

    [Earthdata](https://search.earthdata.nasa.gov)

    * Needs [registration](https://urs.earthdata.nasa.gov/users/new)
    * Harmonized Landsat 8 and Sentinel-2 dataset and many more
    * Graphical web interface and bulk download

=== "Sentinel image mosaics"

    * Available in Puhti: `/appl/data/geo/sentinel`
    * Only Finland
    * [Sentinel-2 image index mosaics](https://ckan.ymparisto.fi/dataset/sentinel-2-image-index-mosaics-s2ind-sentinel-2-kuvamosaiikit-s2ind)
    * [Sentinel-1 SAR-image mosaics](https://ckan.ymparisto.fi/dataset/sentinel-1-sar-image-mosaic-s1sar-sentinel-1-sar-kuvamosaiikki-s1sar)
    * [WMS (Geoserver)](https://data.nsdc.fmi.fi/geoserver/wms)
    * [WCS (Geoserver)](https://data.nsdc.fmi.fi/geoserver/wcs)
    * Provided by [SYKE](https://www.syke.fi/en-US) and [FMI](https://en.ilmatieteenlaitos.fi/)
    * Instructions on how to use - link to example script?

=== "Google Cloud Storage"

    [Google Cloud Storage](https://cloud.google.com)

    * [Sentinel-2: L1C](https://cloud.google.com/storage/docs/public-datasets/sentinel-2)
    * [Landsat: Collection 1](https://cloud.google.com/storage/docs/public-datasets/landsat)
    * [FORCE](https://docs.csc.fi/apps/force/) can download directly from here

=== "Amazon Web Services (AWS) *"

    * Worldwide
    * [Sentinel-2 bucket](https://registry.opendata.aws/sentinel-2/)
    * [Sentinel-1 bucket](https://registry.opendata.aws/sentinel-1/)
    * Requester pays the download costs
    * Managed by [Sinergise](http://www.sinergise.com/)

=== "DIAS *"

    * Data and Information Access Services
    * Multiple sites exist:
        * [ONDA](https://www.onda-dias.eu/cms/)
        * [sobloo](https://sobloo.eu/)
        * [CREODIAS](https://creodias.eu/)
        * [MUNDI](https://mundiwebservices.com/)
    * Costs apply
    * Processing platform with the data, no download needed
    * Data from DIAS object storage can easily be transferred to Allas (link to instructions here)

=== "Microsoft Planetary Computer *"

    * [Data](https://planetarycomputer.microsoft.com/catalog) and processing platform ([Hub](https://planetarycomputer.microsoft.com/compute))
    * Currently available in preview, [request access](https://planetarycomputer.microsoft.com/account/request)

=== "Terramonitor"

    [Terramonitor](https://www.terramonitor.com/services/analysis-ready)

    * Pre-processed, analysis-ready Sentinel-2 data
    * Data from Finland available for 2018-2020
    * [Pricing](https://store.terramonitor.com/category/analysis-ready?6f8e8f38_page=1)

=== "Sentinelhub *"

    [Sentinelhub](https://www.sentinel-hub.com/explore/)

    * Worldwide
    * Lots of different EO datasets:
        * [Sentinel-2](https://collections.sentinel-hub.com/sentinel-2-l2a/)
        * [Sentinel-1](https://collections.sentinel-hub.com/sentinel-1-grd/)
    * Requires [subscription](https://www.sentinel-hub.com/pricing/)

=== "Google Earth Engine *"

    [Google Earth Engine](https://earthengine.google.com/) is a platform for planetary-scale Earth observation data and analysis

    * Usage:
        * [Registration](https://signup.earthengine.google.com/)
        * [From a browser](https://code.earthengine.google.com/)
        * Python:
            * [API](https://developers.google.com/earth-engine/guides/python_install)
            * [geemap-library](https://geemap.org/)
        * [R-package](https://github.com/r-spatial/rgee)
    * Pros:
        * Good coverage of analysis-ready data worldwide
            * [Sentinel-2](https://developers.google.com/earth-engine/datasets/catalog/sentinel-2/)
            * [Sentinel-1](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD)
        * Rather easy to use, nice tool for testing new ideas
        * Lots of case studies and tutorials:
            * [Earth Engine tutorials](https://developers.google.com/earth-engine/tutorials)
            * [CSC course: Introduction to using Google Earth Engine](https://www.csc.fi/fi/web/training/-/introduction-to-using-google-earth-engine)

    * Cons:
        * Uncertain long-term availability
        * Google Cloud Storage might be needed to export large datasets
        * Not always suitable for small-scale analysis

If you plan to work with both Sentinel-2 and Landsat 8, also check the 30 m harmonized Landsat 8 and Sentinel-2 product: https://hls.gsfc.nasa.gov/

Many data providers and companies also provide a SpatioTemporal Asset Catalog (STAC) of their own and other datasets. These catalogues help in finding available data based on time and location, with the possibility of multiple additional filters, such as cloud cover and resolution. The STAC Index provides a good overview of available catalogues from all over the world. The STAC Index page also includes many resources for learning and utilizing STAC.
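
As a rough sketch of what such a search looks like, the Python snippet below builds the JSON body of a STAC API item search, filtering by location, time and cloud cover. The function name, the bounding box and the collection id `sentinel-2-l2a` are assumptions for illustration; the cloud cover filter relies on the optional "query" extension, which not every catalogue supports. The snippet only builds and prints the request body; it does not contact any catalogue.

```python
import json


def build_stac_search(bbox, start, end, collections, max_cloud_cover=None, limit=10):
    """Build the JSON body for a STAC API item search (POST <catalogue>/search)."""
    body = {
        "bbox": list(bbox),              # west, south, east, north (WGS84 degrees)
        "datetime": f"{start}/{end}",    # closed time interval, RFC 3339 timestamps
        "collections": list(collections),
        "limit": limit,                  # maximum number of items per page
    }
    if max_cloud_cover is not None:
        # Only works if the catalogue implements the "query" extension
        body["query"] = {"eo:cloud_cover": {"lt": max_cloud_cover}}
    return body


search_body = build_stac_search(
    bbox=(24.5, 60.1, 25.3, 60.4),       # hypothetical area of interest
    start="2022-06-01T00:00:00Z",
    end="2022-08-31T23:59:59Z",
    collections=["sentinel-2-l2a"],      # assumed collection id, varies by catalogue
    max_cloud_cover=20,
)
print(json.dumps(search_body, indent=2))
```

In practice, client libraries for STAC wrap this request and the paging of results, so the body above is mainly useful for understanding which filters a catalogue search is made of.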

Where can I store the data?

What to consider:

  • Raw vs intermediate vs final result data
    • What needs to be stored?
    • Storage space?
  • Accessibility
    • Sensitive data?
    • Who needs to have access?
    • How does the data need to be accessed?
    • Intended usage?
  • Maintenance needs?
  • Metadata needs?

See also: https://docs.csc.fi/data/datasets/hosting-datasets-at-CSC/#what-to-consider-when-choosing-a-suitable-storage-solution

  • For direct access, data can be stored on the supercomputer (check out the different available disk areas) or in the object storage Allas (Allas overview, Allas guide)
  • On the supercomputer, data can be stored in /scratch/project_xxx, with xxx being your project number
  • Smaller amounts of data can also be stored short term on the compute node's local disk ($LOCAL_SCRATCH) during processing
  • In Allas, data is stored in so-called buckets and can be accessed or transferred as part of the computing job; see also CSC's webinar on Allas for spatial data.
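
The choice between the project scratch area and the node-local disk can be made inside a processing script. The following Python sketch prefers `$LOCAL_SCRATCH` when the job has it and falls back to the project's scratch directory otherwise. The function name and the project number `2001234` are hypothetical.

```python
import os
from pathlib import Path


def choose_work_dir(project_number, env=None):
    """Pick a working directory for intermediate files.

    Uses the node-local disk if LOCAL_SCRATCH is set for the job,
    otherwise the project's directory under /scratch.
    """
    env = os.environ if env is None else env
    local_scratch = env.get("LOCAL_SCRATCH")
    if local_scratch:
        return Path(local_scratch)
    return Path("/scratch") / f"project_{project_number}"


# Hypothetical project number, replace with your own
work_dir = choose_work_dir("2001234")
print(f"Writing intermediate files to {work_dir}")
```

Files on the node-local disk are only available during the job, so results that should be kept need to be copied to scratch or Allas before the job ends.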

For longer-term storage and publication, CSC offers a range of other services. See also CSC's general guide on storing data.

How can I process the data?

At CSC, EO data can be processed and analyzed using, for example, the supercomputer Puhti or a virtual machine in CSC's cloud service cPouta. You can find more information about geocomputing using CSC resources on our Geocomputing pages.

Puhti has many applications pre-installed (see below), so you do not need to install them yourself. You can also add your own installations using, for example, the Tykky tool. In cPouta, you need to set up your own virtual machine, including all security and software setup; see instructions.

Software

What to consider:

  • User skills and preferences
    • Graphical User Interface (GUI)
    • Command Line Interface (CLI)
    • Scripting
  • User needs
    • Batch processing
    • Automation
    • Reproducibility
  • Open source vs commercial

What applications are available on Puhti?

  • Only Linux software
  • Mostly open source

GUIs of software available on Puhti can be accessed as an interactive job via the Puhti web interface or an X11 connection. These graphical interfaces are mainly meant for visualization and testing; the actual processing is more efficient when run as batch jobs rather than interactive jobs.
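
A script that is meant to run in a batch job typically works through a list of input files without user interaction. As a minimal sketch, the Python snippet below reads the number of CPU cores from the `SLURM_CPUS_PER_TASK` environment variable, which Slurm sets when cores per task are requested, and splits a list of files between that many workers. The function names and file names are hypothetical, and the actual processing step is left out.

```python
import os


def worker_count(env=None):
    """Number of workers to use: the cores reserved for the task, or 1 outside Slurm."""
    env = os.environ if env is None else env
    return max(1, int(env.get("SLURM_CPUS_PER_TASK") or 1))


def split_round_robin(items, n_parts):
    """Distribute items over n_parts lists of nearly equal length."""
    return [items[i::n_parts] for i in range(n_parts)]


# Hypothetical input files
scenes = [f"scene_{i:02d}.tif" for i in range(10)]

for worker_id, chunk in enumerate(split_round_robin(scenes, worker_count())):
    print(f"worker {worker_id}: {chunk}")
```

Each chunk could then be handed to a separate process, for example with Python's `multiprocessing` module, so that all reserved cores are used.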

SNAP

"All-in-one" graphical user interface for processing Sentinel data (with support for other data sources). SNAP also provides the Python interfaces snappy and snapista, and the Graph Processing Tool as its command line interface.

QGIS

GIS software with limited multispectral image processing capabilities

Orfeo Toolbox

Offers a wide variety of applications, from ortho-rectification and pansharpening all the way to classification, SAR processing, and much more.

Orfeo Toolbox is available with a command line interface, a graphical user interface and a Python API, and as a plugin to other applications.

Sen2Cor

Sen2Cor is a stand-alone processor for Sentinel-2 Level 2A product generation and formatting with CLI.

FORCE

FORCE (Framework for Operational Radiometric Correction for Environmental monitoring) is an all-in-one solution for mass-processing medium-resolution satellite images with CLI and GUI.

See examples of using FORCE on Puhti on GitHub.

GDAL (OGR)

GDAL (Geospatial Data Abstraction Library) is a library for accessing and transforming geospatial data, available as CLI tools and as a Python package.

See examples of using GDAL on Puhti on GitHub.

Python

Geospatial Python module on Puhti

The geoconda module provides - among others - many useful Python packages for raster data processing and analysis:

  • py6s: Python interface to the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) atmospheric Radiative Transfer Model.

  • rasterio: access to geospatial raster data.

  • rasterstats: summarizing geospatial raster datasets based on vector geometries.

  • sentinelsat: downloading Sentinel images.

  • scikit-image: algorithms for image processing.

  • stackstac: loading STAC data into xarray.

  • xarray: working with multidimensional raster data.

  • See examples of using geospatial Python on Puhti on GitHub

  • See also the raster lesson of the CSC version of the GeoPython course material

R

All available R packages on Puhti are included in the r-env module.

Julia

Julia on Puhti
JuliaGeo

Matlab

Matlab on Puhti

Machine Learning

Machine learning is one example of advanced use of EO data. If you are interested in the topic, you can find many examples of using EO and other geospatial data for machine learning in our Machine learning with spatial data course exercises on GitHub.

Help

Help from CSC specialists is available via servicedesk@csc.fi.
We are happy to help with technical problems around our services, and we are open to suggestions on which software should be installed on Puhti, what kind of courses should be offered, and which materials or examples should be prepared.

Acknowledgement

This guide was developed in cooperation with the Finnish Environment Institute SYKE, as part of the Geoportti project.

Resources and further reading

  • https://step.esa.int/main/doc/tutorials/
  • https://www.earthdatascience.org/courses/use-data-open-source-python/multispectral-remote-sensing/intro-multispectral-data/
  • https://github.com/sacridini/Awesome-Geospatial
  • https://github.com/acgeospatial/awesome-earthobservation-code
  • http://database.eohandbook.com/