# Scivision coworking notes (2022-03-09)

- how to structure things?
    - example data
    - models
- Follow the documentation PR: https://github.com/alan-turing-institute/scivision/pull/167 (or change things if needed)
    - model repo
        - can include some sample data in e.g. `/example_data`, along with (optionally) a `.scivision/data.yml` whose `url` points to `../example_data/mydata.tif`
    - data repo
        - some of our examples might need a separate repo, but the data could be in the model repo (as above)

## predict interface

- one idea: refer to the model repository
    - include a map of the labels (standard format)
- can scivision return the class labels too? E.g. include a `.labels()` in the model class
- What is the 'task' for the phenotyping example:
    - multiple (predefined) patches, classification on each
- What is the 'task' for the odin/cryoEM example:
    - only one class, detects the positions of everything on the image, so all labels are 1
- 'single patch' classification
    - `predict(image : ndarray, other_args ...) -> class probabilities : ndarray`, where:
        - 'image' is shape (height, width, channels)
        - 'class probabilities' is a 1d array of probability for each class in the problem (N classes)
- 'multi patch' classification
    - `predict(image : ndarray, other_args ...) -> class probabilities, patches : (np.array, np.array)`, where:
        - 'image' is shape (height, width, channels)
        - 'class probabilities' is a 1d array of probability for each class in the problem (N classes)
        - 'patches' is a numpy array
- batched prediction
    - similar to 'multi patch'

## Idea: run your model on another dataset from the catalog

The example notebooks could, after running the model on the project dataset, query the catalog and find a different dataset from one of the other projects, then `load_dataset` and run the model on it.

Observation: the interface is much more stable for arguments than for return values. Pretty much all model predict functions will take an ndarray of shape (height, width, channels), whereas models tend to return different things, with many small variations. It is much easier to rerun the analysis of each example with different data than to swap models.

## idea for describing predict interfaces of each model

- Level 0 : documentation somewhere, not linked
- Level 1 : documentation in a standard place or linked from the catalog
- Level 2 : a way to denote input/return types (see Intake?), included in the catalog or returned from `model.describe()`
- Level 3 : settle on a standard interface for particular tasks (see above). Hard to cover every conceivable model

# TODO:

## Create example gallery

Action [name=Oliver] : Create `scivision_gallery` org, add everyone as admin

Create one repo for each notebook example

Each repo has a:
1. Notebook
2. Environment file (environment.yml or requirements.txt)
3. README with instructions saying
    - conda activate...
    - jupyter notebook

Action [name=Ed] : checklist/template gallery repo

Other options considered:
- All within scivision (single subdir)
    - ruled out since each example might place different requirements on an environment
- One directory per example (in scivision or separate repo)
    - ruled out, since Binder does not support multiple environments within a single repo
- Branches
    - ruled out: somewhat confusing, difficult to browse
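
To make the 'single patch' and 'multi patch' predict interfaces discussed above more concrete, here is a minimal sketch using toy models. Everything in it is an assumption for illustration: the class names `DummySinglePatchModel` and `DummyMultiPatchModel`, the `patch_size` argument, the choice of one row of probabilities per patch, and the `(row, col, height, width)` patch format are not settled in these notes and are not part of scivision.

```python
from typing import Protocol, Sequence

import numpy as np


class SinglePatchClassifier(Protocol):
    """Hypothetical interface: one image in, one probability per class out."""

    def labels(self) -> Sequence[str]: ...

    def predict(self, image: np.ndarray) -> np.ndarray: ...


def _check_image(image: np.ndarray) -> None:
    # Both interfaces take an array of shape (height, width, channels).
    if image.ndim != 3:
        raise ValueError("expected image of shape (height, width, channels)")


class DummySinglePatchModel:
    """Toy model: returns uniform class probabilities for the whole image."""

    def __init__(self, class_names):
        self._class_names = list(class_names)

    def labels(self):
        # Lets a caller map positions in the probability array to class names.
        return list(self._class_names)

    def predict(self, image):
        _check_image(image)
        n_classes = len(self._class_names)
        return np.full(n_classes, 1.0 / n_classes)


class DummyMultiPatchModel:
    """Toy model: tiles the image and returns (class probabilities, patches)."""

    def __init__(self, class_names, patch_size):
        self._class_names = list(class_names)
        self.patch_size = patch_size

    def labels(self):
        return list(self._class_names)

    def predict(self, image):
        _check_image(image)
        height, width, _ = image.shape
        p = self.patch_size
        # Assumed patch format: one (row, col, height, width) box per patch.
        patches = np.array(
            [
                (row, col, p, p)
                for row in range(0, height - p + 1, p)
                for col in range(0, width - p + 1, p)
            ]
        )
        n_classes = len(self._class_names)
        # Assumed layout: one row of class probabilities per patch.
        probabilities = np.full((len(patches), n_classes), 1.0 / n_classes)
        return probabilities, patches
```

A caller would pair `model.labels()` with the returned probabilities, for example `dict(zip(model.labels(), model.predict(image)))` in the single patch case.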

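'Level 2' in the list of levels above (denoting input and return types, possibly returned from `model.describe()`) might look roughly like the following. This is only a sketch under stated assumptions: `ArraySpec`, `PredictSpec`, `ExampleModel` and the dictionary layout are hypothetical names invented here, and the sketch does not use Intake or any existing scivision API.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ArraySpec:
    """Hypothetical description of one array argument or return value."""

    kind: str
    dims: Tuple[str, ...]


@dataclass(frozen=True)
class PredictSpec:
    """Hypothetical description of a model's predict interface."""

    task: str
    inputs: Dict[str, ArraySpec]
    returns: Dict[str, ArraySpec]


# The 'single patch' classification interface from the notes, as metadata.
SINGLE_PATCH = PredictSpec(
    task="single-patch-classification",
    inputs={"image": ArraySpec("ndarray", ("height", "width", "channels"))},
    returns={"class_probabilities": ArraySpec("ndarray", ("classes",))},
)


class ExampleModel:
    """Toy model exposing its interface as plain data (assumed method name)."""

    def describe(self) -> dict:
        # Plain dicts are easy to store in a catalog entry or print in a notebook.
        return asdict(SINGLE_PATCH)


def same_inputs(spec_a: dict, spec_b: dict) -> bool:
    """True if two described models accept the same inputs."""
    return spec_a["inputs"] == spec_b["inputs"]
```

Since the notes observe that inputs vary far less than return values, a check like `same_inputs` would mostly succeed across models, while the `returns` entries are where the descriptions would differ.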