PaSTa is a Nextflow-based end-to-end image analysis pipeline for decoding image-based spatial transcriptomics (ST) data. It performs imaging cycle registration, cell segmentation and transcript peak decoding. It currently supports analysis of three types of ST technology:
We're working on a 7-cycle, 5-channel dataset. Image data from each cycle is a z-projected 5-channel hyperstack ome.tif.
Because of limited running time and resources, we will work on a small crop (yellow, ~2500 × 800 pixels) from this whole mouse brain section.
Create an empty folder, e.g. i2k_demo:
mkdir i2k_demo
cd i2k_demo
Download two configuration files for the pipeline:
nextflow run bioinfotongli/Image-ST -r v0.1.1 \
-profile local,docker -c run.config \
-params-file params_tiny_only.yaml \
-with-report report_tiny_only.html
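Before launching, it can save time to confirm that the two configuration files are in the working folder and that the tools named in the command are on the PATH. The following is a minimal sketch in Python and is not part of the pipeline. The function name missing_prerequisites is hypothetical; the file names come from the command above, and docker is assumed from the -profile local,docker setting.

```python
import shutil
from pathlib import Path


def missing_prerequisites(workdir,
                          files=("run.config", "params_tiny_only.yaml"),
                          tools=("nextflow", "docker")):
    """Return the names of expected files and executables that cannot be found."""
    # Configuration files are looked up relative to the working folder.
    missing = [name for name in files if not (Path(workdir) / name).is_file()]
    # Executables are looked up on the PATH, as the shell would do.
    missing += [tool for tool in tools if shutil.which(tool) is None]
    return missing


problems = missing_prerequisites(".")
if problems:
    print("Missing before running the pipeline:", ", ".join(problems))
else:
    print("All prerequisites found.")
```

An empty list means the command above has everything it refers to; anything listed should be downloaded or installed first.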
The Nextflow pipeline is here:
It is composed of the following modules.
The corresponding containers used in each of the modules are in:
Credits to Konrad Rokicki:
The minimum parameters required to run the pipeline are specified in the parameter file.
The run.config file takes extra settings for specific runs.
A pip-installable image alignment tool that works on OME-TIFFs.
A tiled version of Cellpose segmentation to bypass RAM limitations. It saves outputs as polygons (WKT/GeoJSON).
A deep-learning-based RNA spot peak caller. Similarly written in a tiled version to avoid RAM limitations.
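The tiling idea used by the segmentation and peak-calling steps can be sketched as follows. This is an illustration only, not the pipeline's code: the function names, tile size and overlap are hypothetical, and the image size assumes the demo crop is about 800 pixels high and 2500 pixels wide.

```python
def tile_bounds(height, width, tile_size, overlap):
    """Yield (y0, y1, x0, x1) windows that together cover the whole image.

    Neighbouring windows share `overlap` pixels so that objects cut by one
    tile border are fully visible in the adjacent tile.
    """
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must be non-negative and smaller than tile_size")
    step = tile_size - overlap
    for y0 in range(0, max(height - overlap, 1), step):
        for x0 in range(0, max(width - overlap, 1), step):
            # Windows at the right and bottom edges are clipped to the image.
            yield y0, min(y0 + tile_size, height), x0, min(x0 + tile_size, width)


def to_global(points, y0, x0):
    """Shift tile-local (y, x) coordinates back into whole-image coordinates."""
    return [(y + y0, x + x0) for y, x in points]


tiles = list(tile_bounds(800, 2500, tile_size=1000, overlap=100))
spots_in_second_tile = to_global([(5, 10)], y0=tiles[1][0], x0=tiles[1][2])
```

Only one tile has to be held in memory at a time, which is what keeps RAM usage bounded. Results from the overlap regions still need to be merged or de-duplicated, which this sketch does not attempt.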
A probabilistic RNA spot barcode decoding algorithm.
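To give a feel for what a decoding step returns (a gene call plus a probability per spot), here is a deliberately naive toy decoder. It is not the algorithm used by the pipeline; it simply scores each barcode by the product of per-cycle normalised intensities. The codebook, gene names and intensity values are invented for the example.

```python
from math import prod


def decode_spot(intensities, codebook):
    """Return (best_gene, probability) for a single spot.

    intensities[cycle][channel] holds non-negative spot intensities.
    codebook maps a gene name to the expected channel index in each cycle.
    """
    # Normalise each cycle so that its channel intensities sum to 1.
    normalised = []
    for cycle in intensities:
        total = sum(cycle)
        normalised.append([v / total if total > 0 else 1.0 / len(cycle) for v in cycle])

    # Score a barcode as the product of the normalised intensities it selects.
    scores = {
        gene: prod(normalised[c][channel] for c, channel in enumerate(barcode))
        for gene, barcode in codebook.items()
    }
    norm = sum(scores.values())
    if norm == 0:
        return None, 0.0  # no barcode is compatible with this spot
    best = max(scores, key=scores.get)
    return best, scores[best] / norm


# Hypothetical 3-cycle, 3-channel codebook and one spot.
codebook = {"GeneA": (0, 1, 2), "GeneB": (2, 1, 0)}
spot = [[8, 1, 1], [1, 8, 1], [1, 1, 8]]
gene, probability = decode_spot(spot, codebook)
```

A downstream filter would typically keep only spots whose probability exceeds a chosen threshold; the threshold itself is a user decision and is not specified here.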
Constructs a spatialdata object from the components generated above.
Go to output folder:
ls output
Short explanation of output folders:
- An object (folder) that contains all main outputs of the pipeline in spatialdata format: DAPI images, segmentation masks and decoded spots
- cellpose_segmentation_merged_wkt: contains polygons of segmented cells for the whole image
- naive_cellpose_segmentation: contains polygons and downscaled mask images of segmented cells for image slices
- peak_profiles: contains files with peak positions and peak profiles used for the decoding step (this information is not stored in spatialdata)
- PoSTcode_decoding_output: contains a table with all decoded peaks, their positions and the probability of the decoding results
- registered_stacks: outputs of the registration process, contains registered stacks of images
- registration_configs: configuration files that were used for the registration process
- slice_jsons: csv with boundaries of image slices
- spotiflow_peaks: spot peaks called with spotiflow
output
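The segmentation folders listed above store cell outlines as polygons, and WKT is a plain-text format that is easy to inspect. The sketch below parses a single-ring POLYGON string and computes its area using only the standard library. The example string is invented, the exact layout of the pipeline's files is not assumed here, and for real work a geometry library such as shapely would be the more robust choice.

```python
import re


def parse_simple_polygon(wkt):
    """Parse 'POLYGON ((x y, x y, ...))' with one exterior ring into (x, y) tuples."""
    match = re.fullmatch(r"\s*POLYGON\s*\(\(([^()]*)\)\)\s*", wkt)
    if match is None:
        raise ValueError("only single-ring POLYGON WKT is handled in this sketch")
    return [tuple(float(v) for v in pair.split()) for pair in match.group(1).split(",")]


def polygon_area(ring):
    """Shoelace formula; works whether or not the ring repeats its first vertex."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Invented example: a 4 x 3 pixel rectangle standing in for a cell outline.
cell = parse_simple_polygon("POLYGON ((0 0, 4 0, 4 3, 0 3, 0 0))")
area = polygon_area(cell)
```

Areas computed this way are in pixel units of whichever image the polygons were drawn on, so polygons from downscaled masks and from the full-resolution image are not directly comparable.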
zip -r spatialdata/ISS_demo_tiny.sdata
conda env create --file=napari_spatialdata_environment.yml
conda activate napari_spatialdata
ipython kernel install --user --name=napari_spatialdata
A quick and dirty solution is to manually specify the Singularity cache directory by setting:
singularity cache clean
export SINGULARITY_CACHEDIR=./singularity_image_dir
export NXF_SINGULARITY_CACHEDIR=./singularity_image_dir
ext.args = "--[key] [value]"
in the run.config file. An example is:
withName: POSTCODE {
    ext.args = "--channel_names 'DAPI,Cy5,AF488,Cy3,AF750'"
}
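The nested quoting in this example matters: the outer double quotes delimit the config string, while the inner single quotes keep the comma-separated list together as one argument. Assuming the module places ext.args on a shell command line (an assumption about how the module consumes it), Python's shlex module, which follows POSIX shell quoting rules, shows how the string would be split.

```python
import shlex

# The value of ext.args exactly as written inside the double quotes.
ext_args = "--channel_names 'DAPI,Cy5,AF488,Cy3,AF750'"

# Split the string the way a POSIX shell would split a command line.
tokens = shlex.split(ext_args)

# The single-quoted value arrives as one token and can then be split on commas.
channels = tokens[1].split(",")

print(tokens)
print(len(channels), "channels:", channels)
```

The five names line up with the 5-channel images described earlier. Dropping the single quotes would still work for this particular value because it contains no spaces, but a channel name with a space in it would then be broken into separate arguments.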
Exception: URL fetch failure on None -- [Errno -3] Temporary failure in name resolution
Or CellPose
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
Most likely you've reached the maximum number of downloads (?). Wait a bit and try again later, or manually download those models and update the configuration file.
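If you download the models manually with a script, the "wait a bit and try again" advice can be automated with a retry loop whose waiting time grows after each failure. The sketch below is generic and not part of the pipeline: fetch_with_retry, its parameters and the URL are hypothetical, and the demonstration uses a stub instead of a real network call.

```python
import io
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=4, base_delay=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Download url, retrying on URLError with exponentially growing waits."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.URLError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)  # waits 2 s, 4 s, 8 s, ...


# Offline demonstration: a stub opener that fails twice, then succeeds.
calls = []
waits = []


def flaky_opener(url):
    calls.append(url)
    if len(calls) < 3:
        raise urllib.error.URLError("Temporary failure in name resolution")
    return io.BytesIO(b"model weights")


data = fetch_with_retry("https://example.invalid/model",
                        opener=flaky_opener, sleep=waits.append)
```

With the default arguments a real call would give up after four failed attempts and re-raise the last error, at which point placing the model files by hand and updating the configuration file remains the fallback.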
They are pre-uploaded to the Wellcome Sanger Institute’s S3 buckets, specifically the
Nextflow is able to download the files as long as these configurations are included in the run.config file.
aws {
client {
signerOverride = "S3SignerType"