# SynchWeb / RELION launching
This is a "live" document to keep track of current status and plans for launching the RELION processing pipeline from SynchWeb.
## 2021 Software commissioning visit
Proposal `cm28212` | Visit start | Visit end
--- | --- | ---
CM28212-1 | 09:00 Fri 1 Jan 2021 | 09:00 Fri 12 Mar 2021
CM28212-2 | 09:00 Fri 12 Mar 2021 | 09:00 Fri 21 May 2021
CM28212-3 | 09:00 Fri 21 May 2021 | 09:00 Fri 13 Aug 2021
CM28212-4 | 09:00 Fri 13 Aug 2021 | 09:00 Fri 22 Oct 2021
CM28212-5 | 09:00 Fri 22 Oct 2021 | 09:00 Fri 31 Dec 2021

Tutorial data set: `/dls/m12/data/2021/cm28212-1/raw/Frames`
## Data collections
February (cm28212-1) | ID
--- | ---
DCID | 6018191
processingJobId | 6611642
SynchWeb | https://ispyb.diamond.ac.uk/dc/visit/cm28212-1

December (cm28024-2) | ID
--- | ---
dataCollectionGroupId | 5152322
dataCollectionId | 5684468
processingJobId (empty?) | 5978855
processingJobId (tutorial data) | 6052874
processingJobId (Yuriy's data) | 6053783

November (nt21004-402) | ID
--- | ---
DCID | 5627399
ProcessingJob | 5863295
## Buttons in SynchWeb
On pressing 'Start', check whether a DCID already exists for these files.
If there is no DCID, create one.
Then create and trigger the Processing ID (ReprocessingID, ProcessingJobID and ReprocessingJobID are all names for the same thing).
When Zocalo kicks into action it creates an AutoProcProgramID (which identifies e.g. "RELION 3.1" and can have a status).
Before we offer the user the option to start something we should check:
- look up the associated (primary) Processing IDs
- we are interested in those that don't have an AutoProcProgramID attached
- or that have one whose status is neither success nor failure (because then it's either running or will be running soon)
- If there are one or more of these then maybe show that fact on the website, but ultimately in this case the *Start* button becomes a *Stop and (Re-)Start* button.
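
The check above can be sketched as follows. This is a minimal Python illustration, not SynchWeb code: the `ProcessingJob` class, the status strings and `button_label()` are hypothetical names, and the ISPyB lookup of the Processing IDs is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption: a finished AutoProcProgram reports one of these two statuses.
FINISHED_STATUSES = {"success", "failure"}


@dataclass
class ProcessingJob:
    """Hypothetical view of one (primary) Processing ID for a data collection."""

    processing_job_id: int
    # None means no AutoProcProgramID has been attached yet.
    auto_proc_program_status: Optional[str] = None


def is_active(job: ProcessingJob) -> bool:
    """A job is active if it is running or is expected to start soon."""
    return job.auto_proc_program_status not in FINISHED_STATUSES


def button_label(jobs: Sequence[ProcessingJob]) -> str:
    """Choose the button text from the jobs found for this data collection."""
    if any(is_active(job) for job in jobs):
        return "Stop and (Re-)Start"
    return "Start"
```

A job with no AutoProcProgramID counts as active here because Zocalo has been asked to start it but has not yet reported back.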
## RELION recipe
https://gitlab.diamond.ac.uk/scisoft/zocalo/-/blob/master/recipes/ispyb-relion.json
## Path and message handling
### ISPyB Fields
We need to set `imagePrefix="GridSquare"` in ISPyB.
Data directory:
```
/dls/m12/data/2020/nt21004-402/raw
/Supervisor_????/**
/GridSquare_*/*.mrc/xml/tif
```
Working directory:
```
/dls/m12/data/2020/nt21004-402/tmp/zocalo/raw/GridSquare_1
/relion -> <uuid>
/<uuid>
```
Results directory:
```
/dls/m12/data/2020/nt21004-402/processed/raw/GridSquare_1
/relion -> <uuid>
/<uuid>
```
For comparison, `dlstbx.find_in_ispyb` output for an MX data collection (5646074) and for the November EM data collection (5627399), where `imagePrefix` is not set:
```bash
cs03r-sc-serv-36 ~ :) $ dlstbx.find_in_ispyb 5646074 | grep "working_directory\|results_directory\|imageDirectory\|fileTemplate\|imagePrefix\|dataCollectionNumber"
'dataCollectionNumber': 1,
'fileTemplate': '3-3_1_master.h5',
'imageDirectory': '/dls/i04/data/2020/mx21426-77/guido/screening/TRIM28/',
'imagePrefix': '3-3',
'ispyb_results_directory': '/dls/i04/data/2020/mx21426-77/processed/guido/screening/TRIM28/3-3_1_/2d286aa7-01d4-4999-9e44-4986de3848ad',
'ispyb_working_directory': '/dls/i04/data/2020/mx21426-77/tmp/zocalo/guido/screening/TRIM28/3-3_1_/2d286aa7-01d4-4999-9e44-4986de3848ad'}
cs03r-sc-serv-36 ~ :) $ dlstbx.find_in_ispyb 5627399 | grep "working_directory\|results_directory\|imageDirectory\|fileTemplate\|imagePrefix\|dataCollectionNumber"
'dataCollectionNumber': 1,
'fileTemplate': 'GridSquare_*/Data/*.mrc',
'imageDirectory': '/dls/m12/data/2020/nt21004-402/raw/',
'imagePrefix': None,
'ispyb_results_directory': '/dls/m12/data/2020/nt21004-402/processed/raw/GridSquare_/efea357e-6e45-4083-a199-7a5557aef404',
'ispyb_working_directory': '/dls/m12/data/2020/nt21004-402/tmp/zocalo/raw/GridSquare_/efea357e-6e45-4083-a199-7a5557aef404'}
```
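The layout of the working and results directories in the output above can be reproduced with a short Python sketch. It assumes that the visit directory is always the first five path components (`/dls/<beamline>/data/<year>/<visit>`) and takes the sub-directory name (`3-3_1_`, `GridSquare_`) as an input, because how that name is derived from the ISPyB fields is not established here. `derive_dirs()` is a hypothetical helper, not part of `dlstbx`.

```python
from pathlib import PurePosixPath
from typing import Tuple


def derive_dirs(image_directory: str, subdir: str, uuid: str) -> Tuple[str, str]:
    """Return (working_directory, results_directory) for a data collection."""
    parts = PurePosixPath(image_directory).parts
    # parts[0] is "/", so the visit directory spans the first six entries.
    visit_dir = PurePosixPath(*parts[:6])
    # Everything below the visit directory, e.g. "raw".
    relative = PurePosixPath(*parts[6:])
    working = visit_dir / "tmp" / "zocalo" / relative / subdir / uuid
    results = visit_dir / "processed" / relative / subdir / uuid
    return str(working), str(results)
```

Calling it with the `imageDirectory` and UUID of data collection 5627399 and `subdir="GridSquare_"` gives the two paths listed for that data collection.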
### Current situation
#### 1. Message creation
SynchWeb's [EM page](https://github.com/DiamondLightSource/SynchWeb/blob/master/api/src/Page/EM.php#L85-L268) creates a JSON object with parameters and writes it in the `.ispyb/processed/` directory of the visit.
An example is `/dls/m07/data/2019/em19865-17/.ispyb/processed/relion_msg_200416.130626.json` which contains:
```
{
"acquisition_software": "SerialEM",
"import_images": "/dls/m07/data/2019/em19865-17/raw/Frames/*.tif",
"motioncor_gainreference": "/dls/m07/data/2019/em19865-17/processing/gain.mrc",
"voltage": 300,
"Cs": 2.7,
"ctffind_do_phaseshift": false,
"angpix": 1.23,
"motioncor_binning": 1,
"motioncor_doseperframe": 0.5,
"stop_after_ctf_estimation": true
}
```
SynchWeb then puts a message on the `relion.start` Zocalo queue with contents:
```
{
"relion_workflow": "/dls/m07/data/2019/em19865-17/.ispyb/processed/relion_msg_200416.130626.json"
}
```
(We think the JSON is written to disk because of worries about the size of messages in the Zocalo queue, but this is not actually a realistic limitation here.)
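As an illustration of the two-step message creation, here is a Python sketch of the same behaviour. SynchWeb itself is PHP, so this is not its code. `write_relion_message()` is a hypothetical name, the timestamp format is inferred from the example file name, and putting the message on the `relion.start` queue is left out.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional


def write_relion_message(visit_dir: str, params: dict,
                         now: Optional[datetime] = None) -> dict:
    """Write the parameter file and return the body of the queue message."""
    now = now or datetime.now()
    msg_dir = Path(visit_dir) / ".ispyb" / "processed"
    msg_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: file names follow relion_msg_<yymmdd>.<HHMMSS>.json
    msg_file = msg_dir / f"relion_msg_{now:%y%m%d.%H%M%S}.json"
    msg_file.write_text(json.dumps(params, indent=2))
    # The queue message only carries the path to the parameter file.
    return {"relion_workflow": str(msg_file)}
```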
#### 2. Message consumption
After some Zocalo magic, [`RelionRunner.run_relion()`](https://gitlab.diamond.ac.uk/scisoft/imaging/relion-zocalo-runners/-/blob/master/relion_zo/consumers/zoc_relion_main_consumer.py#L64-178) is called with the message as a parameter and does the following:
1. Calls [`setup_folder_str()`](https://gitlab.diamond.ac.uk/scisoft/imaging/relion-zocalo-runners/-/blob/master/relion_zo/consumers/zoc_relion_main_consumer.py#L224-240), which:
   * Uses the path from `relion_workflow` to find the visit directory
   * Makes `<visit_dir>/processed/relion_<visit_id>` and `<visit_dir>/raw` directories
2. Calls `link_movies()`, which creates a symlink `<visit_dir>/processed/relion_<visit_id>/Movies` that points to `../../raw`
3. Does some `sys.path` hacking to import `cryolo_relion_it`
4. Creates a new `RelionItOptions` object
5. Loads default DLS and cluster options
6. Loads the JSON file from SynchWeb as a dictionary and uses it to update the options object
7. Saves a copy of the options in `relion_it_options.py`
8. Creates a command string to load modules and run `python cryolo_relion_it.py relion_it_options.py`
9. Runs the command string in the background using `subprocess.Popen()`
10. Waits up to 15 seconds for `RUNNING_RELION_IT` and `RELION_IT_SUBMITTED_JOBS` files to appear and then copies them to `<visit_dir>/.ispyb/processed/`
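
Steps 1, 2 and 10 can be sketched in Python as below. This is a simplified reading of the behaviour described above, not the code from `zoc_relion_main_consumer.py`: `setup_project()` and `copy_marker_files()` are hypothetical names, and it is assumed that the marker files appear in the RELION project directory.

```python
import shutil
import time
from pathlib import Path
from typing import List, Sequence

MARKER_FILES = ("RUNNING_RELION_IT", "RELION_IT_SUBMITTED_JOBS")


def setup_project(relion_workflow: str) -> Path:
    """Create the project and raw directories and link Movies -> ../../raw."""
    # <visit_dir>/.ispyb/processed/<file>.json -> <visit_dir>
    visit_dir = Path(relion_workflow).parents[2]
    project_dir = visit_dir / "processed" / f"relion_{visit_dir.name}"
    project_dir.mkdir(parents=True, exist_ok=True)
    (visit_dir / "raw").mkdir(exist_ok=True)
    movies = project_dir / "Movies"
    if not movies.is_symlink():
        # A relative target keeps the link valid if the visit is moved.
        movies.symlink_to(Path("..") / ".." / "raw")
    return project_dir


def copy_marker_files(project_dir: Path, dest_dir: Path,
                      names: Sequence[str] = MARKER_FILES,
                      timeout: float = 15.0, poll: float = 0.5) -> List[str]:
    """Copy each marker file once it appears; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    pending = set(names)
    copied = []
    while pending:
        for name in sorted(pending):
            src = Path(project_dir) / name
            if src.exists():
                shutil.copy(src, Path(dest_dir) / name)
                copied.append(name)
                pending.discard(name)
        if not pending or time.monotonic() >= deadline:
            break
        time.sleep(poll)
    return copied
```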
#### Notes
* The gain reference file ends up at `<visit_dir>/processing/gain.mrc`, but we are not sure exactly how this happens. It might be copied manually by the eBIC local contact, or possibly by the data transfer rsync script.
  * It sounds like the file was copied by the local contact for each session.
  * Some manual transformation is applied, which could be scripted.
  * It probably makes sense to do the gain reference conversion and copying as part of the rsync transfer script.
  * How the gain file bookkeeping works is still undetermined (but there should be some, as pointed out by ESRF).
* Most of the entries in the JSON file are (correctly) options for `relion_it.py`, but `acquisition_software` is ignored.
* Path handling is overly complicated!
* Log messages are not always consistent with code behaviour
### Proposed behaviour
#### 1. Message creation
SynchWeb should:
1. Create a new Data Collection ID (DCID) in ISPyB, with at least `imageDirectory` populated correctly
2. Create a ProcessingJobID
3. Put processing parameters as key / value pairs into the ISPyB ProcessingJobParameters table
4. Send the processing job ID to Zocalo to start the job
5. Watch for the `RUNNING_RELION_IT` file in the data processing directory (preferably not in the `.ispyb` directory as now)
   * MG: if we're in proper ISPyB-land then SynchWeb can also look for an AutoProcessingJob-thingy which Zocalo could set up, which would give you a started/stopped/failed status and potentially a status message.
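
Step 3 could look roughly like the Python sketch below, which flattens the current JSON parameters into key / value rows and back. The column names `parameterKey` and `parameterValue` are assumed to match the ISPyB ProcessingJobParameters table, and JSON-encoding each value to preserve its type is a design choice made for this illustration, not an agreed convention. Writing the rows to the database is left out.

```python
import json
from typing import Any, Dict, List


def to_parameter_rows(processing_job_id: int,
                      params: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a parameter dictionary into one row per key."""
    return [
        {
            "processingJobId": processing_job_id,
            "parameterKey": key,
            # JSON-encode so that 300, 2.7, false and strings survive a round trip.
            "parameterValue": json.dumps(value),
        }
        for key, value in params.items()
    ]


def from_parameter_rows(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Rebuild the parameter dictionary on the consumer side."""
    return {row["parameterKey"]: json.loads(row["parameterValue"]) for row in rows}
```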
Information we actually need to run the processing pipeline is:
* Visit directory (e.g. `"/dls/m07/data/2019/em19865-17"`)
* Data collection directory (probably `"raw"` for now)
* Data collection ID (maybe? Not actually sure how we would use this if it's not used as part of the raw data directory name)
  * The DCID would be as good as the visit directory and data collection directory, as the data collection record allows looking these things up. Specifically, the `imageDirectory` is the visit directory plus the data collection directory.
* Processing ID or RELION project directory name (e.g. `"processing/relion_<proc_id>"`)
* Parameters for `relion_it.py`:
  * `import_images` as a wildcard path relative to the data collection directory (e.g. `"GridSquare_*/Data/*.tif"`)
  * `motioncor_gainreference`, `voltage`, `Cs` and others as in the example above
#### 2. Message consumption
We would like to change the Zocalo consumer from a service to a wrapper.
Zocalo should take care of getting the processing parameters from ISPyB and putting them into a Python dictionary. When `run_relion()` is called, the parameters will then be available as an attribute of the `Wrapper` parent class.
`RelionRunner` will be simplified. It will be set up so Zocalo runs it in a Python environment where `cryolo_relion_it` is available. It will import `cryolo_relion_it`, set up and save the options, then call the main pipeline function directly (without using `subprocess`).
This will launch one or more `relion_pipeliner` processes. If the Zocalo wrapper is running on the cluster and returns, the job will end and Linux cgroups will ensure all of the connected processes are killed. To avoid this, we should update `relion_it.py` so it always keeps running until the `RUNNING_RELION_IT` file is removed.
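The keep-alive behaviour could be as simple as the following Python sketch, which blocks until the marker file disappears. `block_until_removed()` is a hypothetical helper, not an existing function in `relion_it.py`, and the optional timeout is included only so the loop can be bounded in tests.

```python
import time
from pathlib import Path
from typing import Optional


def block_until_removed(marker: str, poll: float = 1.0,
                        timeout: Optional[float] = None) -> bool:
    """Return True once `marker` no longer exists, False if `timeout` expires."""
    marker_path = Path(marker)
    start = time.monotonic()
    while marker_path.exists():
        if timeout is not None and time.monotonic() - start >= timeout:
            return False
        time.sleep(poll)
    return True
```

Calling this at the end of the wrapper, after the pipeline has been launched, would keep the cluster job alive until someone (or SynchWeb) removes `RUNNING_RELION_IT`.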
#### Open questions
* What exactly should the directory structures look like for raw data and processing runs?
* Where in the system should the processing directory be created? The Zocalo consumer needs it to know where to actually run processing, but SynchWeb also needs it in order to look for the files that indicate processing is happening.