# How we work
## Kurt
Code for separate analysis steps is stored in separate GitHub repos (preprocessing, analyses, paper, presentations). All of these repo folders are stored in a "bin" folder. Data is divided into "raw" (should never be modified) and "derivatives" (e.g., preprocessed data, analysis output) folders.
During data collection, we keep a human-readable document for notes about data quality (also on GitHub). During preprocessing, I transfer the key notes to a "notes" column in a .tsv file. The .tsv has other columns pointing to the various raw and reprocessed data files (tracking the various transformations applied to the data). I find it a nice blend of reproducibility, flexibility and automaticity: I can easily find the files used in a particular analysis and link them to human-readable notes, and the .tsv is also machine readable.
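
As a rough illustration of why a machine-readable .tsv is convenient, here is a minimal Python sketch that loads such a file and lists the rows that carry a note. The column names (`session`, `raw_file`, `derivative_file`, `notes`) and the file paths are made up for the example; the real file may be organized differently.

```python
import csv
import io

# Hypothetical manifest contents; column names and paths are illustrative only.
MANIFEST_TSV = (
    "session\traw_file\tderivative_file\tnotes\n"
    "ses-01\traw/ses-01/run-1.dat\tderivatives/preproc/ses-01_run-1.tsv\t\n"
    "ses-02\traw/ses-02/run-1.dat\tderivatives/preproc/ses-02_run-1.tsv\tsubject moved during run\n"
)


def load_manifest(handle):
    """Return the manifest rows as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def rows_with_notes(rows):
    """Keep only the rows whose 'notes' column is non-empty."""
    return [row for row in rows if row["notes"].strip()]


# In practice the handle would come from open("some_manifest.tsv", newline="").
rows = load_manifest(io.StringIO(MANIFEST_TSV))
flagged = rows_with_notes(rows)
for row in flagged:
    print(row["raw_file"], "->", row["derivative_file"], "|", row["notes"])
```

Because each row names both the raw file and the derivative, the same table can be used to trace an analysis output back to its source and to the note written at collection time.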
I haven't (yet) found the paper I mentioned in lab meeting. A similar organizational scheme is described here: https://data.library.arizona.edu/data-management/best-practices/data-project-organization.
I avoid file formats that might be difficult to read in the future (.docx, .xlsx, .mat, .pkl), and favor simple text files (.txt, .md, .csv, .tsv).
---
## Danny
Code is stored in project-specific repos (e.g. https://github.com/conwaycolorlab/nafc / https://github.com/conwaycolorlab/MT1/) and then cloned on the server (`bc6:\PROJECTS\Causal Globs` / `bc6:\PROJECTS\Color_Shape_Contingency\MonkeyTurk_1`). There is a tension over which copy is the "official" source, i.e. which one is most up to date. My preference would be the GitHub version, since GitHub can then manage conflicts, but I've been trying to keep the server up to date based on a desire to:
- keep an official "on site" version (which I think actually is a changing target now that github is officially endorsed)
- allow access for those who are not as comfortable with git as I am
The server version also has the data, which is not synced to GitHub: `Y:\PROJECTS\Causal Globs\.gitignore` includes the line `/data/`, which means that the data folder is ignored. If you want to run analyses, you therefore need to either copy the data you want into your local repo or run them on the server.
This repo also has the logbooks for monkey training (a daily markdown file with a template, organised into folders by month and year, and with the filenames including the date and monkey name) - (https://github.com/conwaycolorlab/nAFC/tree/master/log / `bc6:\PROJECTS\Causal Globs\log`).
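
A daily log file like this could be created from a template with a few lines of Python. This is only a sketch: the `<year>/<month>/<date>_<monkey>.md` layout, the template text and the monkey name are assumptions for illustration, not the actual convention used in the repo.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical template; the real logbook template is not reproduced here.
TEMPLATE = "# {date} {monkey}\n\n## Training notes\n\n## Issues\n"


def log_path(log_root, day, monkey):
    """Build <root>/<year>/<month>/<YYYY-MM-DD>_<monkey>.md (assumed layout)."""
    filename = f"{day.isoformat()}_{monkey}.md"
    return PurePosixPath(log_root) / f"{day:%Y}" / f"{day:%m}" / filename


def render_entry(template, day, monkey):
    """Fill the date and monkey name into the template text."""
    return template.format(date=day.isoformat(), monkey=monkey)


example_day = date(2020, 3, 5)
example_path = log_path("log", example_day, "monkeyA")
example_entry = render_entry(TEMPLATE, example_day, "monkeyA")
print(example_path)
print(example_entry)
```

Putting the ISO date first in the filename keeps the files in chronological order when a folder is sorted by name.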
Notebook - I keep a notebook on my personal machine (which is slightly naughty) and the plan is to create a sanitised copy to turn over to the NIH when I leave. I'm excited about having an "official" way to keep a digital notebook as part of the lab.
---
## Stuart
For coding and displaying stimuli and color calibration scripts, everything is coded in Matlab and pushed to a repository to keep track of all scripts ever presented and also so that behavioral data/new scripts can be ported between the stimulus computer and my working computer.
For Color Contrast Spatial Frequency:
All code is just currently on the server and the computer, written in Python and Matlab. Files are linked by JSON through the Spencer-Stuart fMRI pipeline.
fMRI pipeline is on a git repository.
Generally I would prefer a framework where any code and notes go into a repository for that project or, alternatively, the code is maintained in a repository for that tool, although we would then need to keep careful track of how that tool/pipeline was used.
---
## James
Code (all languages) stored in personal GitHub repository during development. "Finished" projects are moved to the LSR repository. Supporting files (images, parameter files, etc) are handled in the same way.
---
## Felix
Stimulus presentation code and calibration information is stored on the Kofiko drive, as well as on the data server in corresponding data folders.
Code for on-line analysis is stored on the git-tracked Kofiko drive. My off-line analysis code is stored in my personal Dropbox during development, as well as on UMD's Box service. However, it would be easy to transfer more of the data alignment and analysis code to the relevant data folders on the lsr drives alongside the processed data.
Bevil has in the past suggested that I keep a version of the code that was used to process and analyze a dataset in the same folder as that dataset. I think this is potentially useful but might result in a lot of redundancy. It does make a lot of sense, though, for datasets with specific preprocessing needs (such as realizing that there was a lot of drift in an electrode, or that some aspect of the data was not saved correctly).
---
## Marianne
### Code
- Experimental code is grouped in repositories depending on where it's used and/or the project: nif and rig each have their own repo (not optimal to have two because the code is very similar, but that's how it is for now), and macman human has one as well. All of it is in Python. The repos are not all equally well documented and I'm not planning to change that, except for the human code, which will be made public upon publication.
- Analysis code is a mess and only local (not on GitHub), but Kurt has done most of the code for analyses. Maybe later I will make an analysis repo or add to the one Kurt has for the common project. I would, however, make sure that the code is available online before a project is published. I have also developed utilities for analyses but, again, they have their own messy structure for now and are not on GitHub.
### Notes
[talking here about notes regarding data collection]
- For fMRI experiments:
    - on an md doc (GitHub, project notes), diary style.
    - on a tsv file per session (should be stored on the server) that records the IMA number, sequence, whether the run was ok, the name of the corresponding event file and potentially other useful stuff (percent fixation etc.)
    - additionally for monkey:
        - small notes / diary in the monkey's physical notebook for rig (only behavioral training) and whether there was a MION injection
        - separate Excel/tsv file to document monkey MION injection parameters (weight, qty, signal drop), locally on laptop for now.
- For rig (only behavior), notes are in the monkey's physical notebook.
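
The per-session tsv described above could be written with Python's standard `csv` module. The sketch below is an assumption about what such a file might look like: the column names and the example values are invented for illustration and do not come from real sessions.

```python
import csv
import io

# Hypothetical column names based on the fields listed above.
FIELDS = ["ima_number", "sequence", "run_ok", "event_file", "percent_fixation"]


def write_session_log(handle, runs):
    """Write one tab-separated row per run, with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(runs)


# Made-up example rows, not real data.
runs = [
    {"ima_number": 5, "sequence": "epi", "run_ok": "yes",
     "event_file": "run-01_events.tsv", "percent_fixation": 92},
    {"ima_number": 6, "sequence": "epi", "run_ok": "no",
     "event_file": "run-02_events.tsv", "percent_fixation": 61},
]

# In practice the handle would be open("<session>.tsv", "w", newline="").
buffer = io.StringIO()
write_session_log(buffer, runs)
session_tsv = buffer.getvalue()
print(session_tsv)
```

Fixing the column list in one place (`FIELDS`) keeps every session file consistent, which makes it easier to concatenate sessions later.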
### Data
For human and monkey fMRI, there are usually 3 files for each run: a tsv "event" file, an "eye-movement" file (edf for humans, binary from analog for monkey) and the dicoms. Additionally, there is a txt file of physiological measures for humans. All data is transferred to the server after each fMRI session, into the project folder (bc7/projects/macman_align), following a specific structure.
Rig Data (only behavioral training for now anyways) is just stored on the rig linux computer (only used by me and Stuart) and not copied onto the server.
### Others
I don't keep other notes or anything else that I consider of interest to share, but let me know if there are things I haven't thought about.
---