# Scivision coworking notes (2022-03-09)
- how to structure things?
  - example data
  - models
- Follow the documentation PR: https://github.com/alan-turing-institute/scivision/pull/167 (or change things if needed)
- model repo
  - can include some sample data in e.g. `/example_data`, along with (optionally) a `.scivision/data.yml` whose `url` points to `../example_data/mydata.tif`
- data repo
  - some of our examples might need a separate data repo, but the data could instead live in the model repo (as above)
## predict interface
- one idea: refer to the model repository - include a map of the labels (standard format)
- can scivision return the class labels too? E.g. include a `.labels()` method in the model class
- What is the 'task' for the phenotyping example?
  - multiple (predefined) patches, classification on each
- What is the 'task' for the odin/cryoEM example?
  - only one class: detects the positions of everything in the image, so all labels are 1
- 'single patch' classification
  - `predict(image : ndarray, other_args ...) -> class probabilities : ndarray`, where:
    - 'image' has shape (height, width, channels)
    - 'class probabilities' is a 1D array with one probability for each class in the problem (N classes)
- 'multi patch' classification
  - `predict(image : ndarray, other_args ...) -> class probabilities, patches : (ndarray, ndarray)`, where:
    - 'image' has shape (height, width, channels)
    - 'class probabilities' is a 1D array with one probability for each class in the problem (N classes)
    - 'patches' is a numpy array
- batched prediction similar to 'multi patch'
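
The two interfaces above, plus the `.labels()` idea, could look roughly like the sketch below. Everything in it is an assumption made for illustration: `ToyClassifier`, `predict_patches`, the placeholder scoring, and the label names are hypothetical and not part of scivision. The notes leave the layout of 'patches' open, so the sketch assumes one row of probabilities per patch and patches stored as `(row, col, height, width)`.

```python
import numpy as np


class ToyClassifier:
    """Hypothetical stand-in for a model exposing the interfaces in the notes."""

    def __init__(self, labels):
        self._labels = list(labels)

    def labels(self):
        """Class names, ordered like the probabilities returned by predict()."""
        return list(self._labels)

    def predict(self, image: np.ndarray) -> np.ndarray:
        """'Single patch': (height, width, channels) -> 1D array of N probabilities."""
        if image.ndim != 3:
            raise ValueError("expected shape (height, width, channels)")
        # Placeholder scoring, not a real model: softmax over arbitrary scores.
        scores = float(image.mean()) * np.arange(len(self._labels))
        exp = np.exp(scores - scores.max())
        return exp / exp.sum()

    def predict_patches(self, image: np.ndarray, patch_size: int):
        """'Multi patch': classify each non-overlapping square patch.

        Assumed layout: probabilities has shape (n_patches, N) and patches
        has shape (n_patches, 4) holding (row, col, height, width).
        """
        height, width, _ = image.shape
        probabilities, patches = [], []
        for row in range(0, height - patch_size + 1, patch_size):
            for col in range(0, width - patch_size + 1, patch_size):
                patch = image[row:row + patch_size, col:col + patch_size]
                probabilities.append(self.predict(patch))
                patches.append((row, col, patch_size, patch_size))
        return np.array(probabilities), np.array(patches)


model = ToyClassifier(["healthy", "diseased"])  # hypothetical label names
image = np.random.default_rng(0).random((64, 96, 3))
single = model.predict(image)
multi, boxes = model.predict_patches(image, patch_size=32)
best = model.labels()[int(single.argmax())]  # map the top probability to its label
```

Keeping `labels()` next to `predict()` means a caller can turn an index into a class name without consulting the model repository.
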
## Idea: run your model on another dataset from the catalog
The example notebooks could, after running the model on the project dataset, query the catalog, find a different dataset from one of the other projects, `load_dataset` it, and run the model on it.

Observation: the interface is much more stable for arguments than for return values. Pretty much all model predict functions will take an ndarray of shape (height, width, channels), whereas models tend to return different things, with many small variations.

It is therefore much easier to rerun the analysis of each example with different data than to swap models.
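
A minimal sketch of this idea follows. It is self-contained and does not call scivision: `CATALOG`, `_make_loader`, `predict`, and the project names are hypothetical stand-ins for the real catalog query, `load_dataset`, and a model's predict function. It only illustrates that a model taking an array of shape (height, width, channels) can be pointed at any dataset that yields such an array.

```python
import numpy as np


def _make_loader(shape, seed):
    """Hypothetical loader: returns a function producing a random image."""
    def load():
        return np.random.default_rng(seed).random(shape)
    return load


# Hypothetical in-memory catalog: dataset name -> loader function.
CATALOG = {
    "project_a": _make_loader((64, 64, 3), seed=0),
    "project_b": _make_loader((128, 96, 1), seed=1),
}


def predict(image: np.ndarray) -> np.ndarray:
    """Placeholder model: accepts any (height, width, channels) array."""
    if image.ndim != 3:
        raise ValueError("expected shape (height, width, channels)")
    brightness = float(image.mean())
    # Two made-up classes ("dark", "bright") whose probabilities sum to 1.
    return np.array([1.0 - brightness, brightness])


def run_on_other_datasets(catalog, own_dataset):
    """Run the model on every catalog dataset except the project's own."""
    results = {}
    for name, load in catalog.items():
        if name == own_dataset:
            continue
        results[name] = predict(load())
    return results


results = run_on_other_datasets(CATALOG, own_dataset="project_a")
```

Only the loader changes between datasets. The code consuming `results` still depends on what this particular model returns, which is where the variation noted above shows up.
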
## idea for describing predict interfaces of each model
- Level 0 : documentation somewhere, not linked
- Level 1 : documentation in standard place or linked from catalog
- Level 2 : way to denote input/return types (see Intake?), and include in the catalog, or returned from model.describe().
- Level 3 : settle on a standard interface for particular tasks (see above). Hard to cover every conceivable model
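
As a rough illustration of Level 2, a model could return a machine-readable description of its predict interface from `describe()`. The sketch below is an assumption about what that might look like: `ArraySpec`, `PredictSpec`, `ToyModel`, and the field names are hypothetical, and it is not modelled on Intake's actual metadata format.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ArraySpec:
    """Hypothetical description of one array argument or return value."""
    dtype: str
    shape: tuple  # symbolic axis names, e.g. ("height", "width", "channels")


@dataclass(frozen=True)
class PredictSpec:
    """Hypothetical description of a model's predict interface."""
    task: str
    inputs: dict
    returns: dict


class ToyModel:
    def describe(self) -> dict:
        """Return the predict interface as plain dicts, ready for a catalog entry."""
        spec = PredictSpec(
            task="single-patch-classification",
            inputs={
                "image": ArraySpec("float32", ("height", "width", "channels")),
            },
            returns={
                "class_probabilities": ArraySpec("float32", ("n_classes",)),
            },
        )
        return asdict(spec)


description = ToyModel().describe()
```

Plain dictionaries like `description` could be serialised into the catalog, which would let a notebook check whether a model's return type matches what its analysis expects before running it.
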
# TODO:
## Create example gallery
Action [name=Oliver] : Create `scivision_gallery` org, add everyone as admin
Create one repo for each notebook example
Each repo has a:
1. Notebook
2. Environment file (environment.yml or requirements.txt)
3. README with instructions saying:
   - conda activate...
   - jupyter notebook
Action [name=Ed] : checklist/template gallery repo
Other options considered:
- All within scivision (single subdir) - ruled out since each example might place different requirements on an environment
- One directory per example (in scivision or separate repo) - ruled out, since Binder does not support multiple environments within a single repo
- Branches - ruled out: somewhat confusing, difficult to browse