Initial metadata
----------------

Add a `gather_metadata()` function for each job. It should return a `dict`, which can be nested as necessary:

```
{parameter: value}
```

or

```
{parameter: {subvalue1: sv1, subvalue2: sv2}}
```

Then write a results schema for it in `pipeliner/metadata_schema/results/`.

The results schema needs to describe each key-value pair in the `dict`. The code below creates the schema and adds two pairs that the pipeliner adds to the metadata `dict` automatically: `OutputFiles` and `InputFiles`. So just add an entry for each key-value pair in the `dict` that `gather_metadata()` returns.

[http://json-schema.org/draft-04/schema#](http://json-schema.org/draft-04/schema#)

```
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "RESULTS: relion.autopick",
  "description": "CTF determination in Relion using Ctffind or GCTF",
  "type": "object",
  "properties": {
    "OutputFile(s)": {
      "description": "Output nodes added to the pipeline",
      "type": "array",
      "items": { "type": "string" }
    },
    "LogFiles": {
      "description": "Logfiles generated by the job",
      "type": "array",
      "items": { "type": "string" }
    },
    "YourFirstDictItem": {description, type}
  }
}
```

Here are two jobs to get started on:

**cryoEF job type**

`pipeliner/jobs/other/model_validation_job.py`

Metadata from `cryoef_angles.log`:

- {Efficiency}
- {Mean PSF resolution}
- {Standard deviation}
- {Worst PSF resolution: {resolution, phi, Theta}}
- {Best PSF resolution: {resolution, phi, Theta}}
- {Distribution of PSF resolution: [ [res min, res max, %] ]}

**Model_validation.evaluate job type**

`pipeliner/jobs/other/model_validation_job.py`

Metadata just needs the summary stats from the end of `run.out`:

- Ramachandran outliers = 0.10 %
- favored = 97.92 %
- Rotamer outliers = 0.36 %
- Peptide Plane:
  - Cis-proline : 8.20 %
  - Cis-general : 0.32 %
  - Twisted Proline : 0.00 %
  - Twisted General : 0.00 %
- C-beta deviations = 0
- Clashscore = 2.79 (percentile: 85.7 N=33165, 2.40A+/-0.25A)
- RMS(bonds) = 0.0129
- RMS(angles) = 1.46
- MolProbity score = 1.09 (percentile: 95.2 N=32691, 2.40A+/-0.25A)
- Resolution = 2.40
- R-work = 0.3316 (percentile: 0.0 N=33089, 2.40A+/-0.25A)
- Refinement program = REFMAC

Unit tests
----------

... and we'll need to write unit tests.
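
As a possible starting point for the Model_validation.evaluate job and its test, here is a minimal sketch of parsing the `name = value` summary lines from `run.out` into a nested `dict`. Everything named here is an assumption for illustration, not part of the pipeliner: the helper `parse_summary_stats()`, the choice of dictionary keys, the `value`/`percentile` nesting, and the idea that each statistic sits on its own line. The `Peptide Plane` lines use `:` rather than `=` and are deliberately skipped in this sketch.

```python
import re

# Assumed layout: one "name = value" statistic per line (values copied from the notes).
SAMPLE_RUN_OUT = """\
Ramachandran outliers = 0.10 %
              favored = 97.92 %
Rotamer outliers = 0.36 %
C-beta deviations = 0
Clashscore = 2.79 (percentile: 85.7 N=33165, 2.40A+/-0.25A)
RMS(bonds) = 0.0129
RMS(angles) = 1.46
MolProbity score = 1.09 (percentile: 95.2 N=32691, 2.40A+/-0.25A)
Resolution = 2.40
Refinement program = REFMAC
"""

# key = value, optional trailing "%", optional "(percentile: X ...)" suffix.
LINE_RE = re.compile(
    r"^\s*(?P<key>[^=]+?)\s*=\s*(?P<value>[^()%]+?)\s*%?\s*"
    r"(?:\(percentile:\s*(?P<pct>[\d.]+)[^)]*\))?\s*$"
)


def _to_number(text):
    """Return an int or float when possible, otherwise the stripped string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text.strip()


def parse_summary_stats(text):
    """Hypothetical helper: build a metadata dict from run.out summary text."""
    metadata = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a "name = value" line (e.g. the Peptide Plane block)
        value = _to_number(match["value"])
        if match["pct"] is not None:
            # Nest the percentile alongside the value.
            value = {"value": value, "percentile": float(match["pct"])}
        metadata[match["key"]] = value
    return metadata


def test_parse_summary_stats():
    """Plain-assert test; a test runner such as pytest would collect it by name."""
    stats = parse_summary_stats(SAMPLE_RUN_OUT)
    assert stats["Rotamer outliers"] == 0.36
    assert stats["C-beta deviations"] == 0
    assert stats["Clashscore"] == {"value": 2.79, "percentile": 85.7}
    assert stats["Refinement program"] == "REFMAC"


test_parse_summary_stats()
```

A real `gather_metadata()` would read the text from the job's `run.out` and return the resulting `dict`. Each key it produces would then need a matching entry in the results schema.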