---
title: Board Meetings
tags: board, statistical-software
robots: noindex, nofollow
---
# Agendas and Notes for Statistical Software Board Meetings
These notes are included in a [`hackmd.io`
organisation](https://hackmd.io/@stat-software) which contains draft standards
for the categories selected for prototype development.
## Overall Project Timetable:
- End August 2020: Draft standards for all categories
- End September 2020: Revised standards, prepare for public release and publication
- End October 2020: First running demonstration of testing and reporting tools
- End November 2020: Public Announcement of impending opening of system
- End 2020: Demonstrated system with both publicly accessible API and local
tools (as R packages) for assessment and reporting on category-specific
software.
- Jan 2021: Begin accepting packages for review
---
## Agenda 4th May 2021
Meeting will step through [this demonstration of editorial process](https://github.com/ropenscilabs/statistical-software-review/issues/7). Note that your roles will be as handling editors. We anticipate most submissions obtaining no red crosses in the initial section, and so being passed from the Editor-in-Chief straight to you, where you may have to consider some of the details of the package report.
In the context of this role, we will ask each of you for feedback on the following questions:
1. Are there any aspects of the general review process we might have missed?
2. Is the information contained within the initial automated report sufficient? Is it clear? Could anything be added or removed?
3. What do you anticipate being the most likely difficulties we may face when following this process?
4. And finally ... Do you think that process is sufficient for us to announce a public launch? If not, what else do we need to develop in order to do so?
RK:
- Likely need to go through review process to really know
- 2 reviewers is fine; editors may act as reviewers as last resort
BB:
- Agree with RK
- Suggestion that we all do an internal submission/review of our own packages?
LCT:
- Having the check package available for general use will be useful. Example: [`biocthis`](https://bioconductor.org/packages/biocthis) enables creating a GitHub Actions workflow with `biocthis::use_bioc_github_action()` that runs `BiocCheck::BiocCheck()`, which helps you prepare your package before submitting to Bioconductor (i.e. make sure you are passing the checks before you submit). A short sketch of this workflow follows these notes.
- Having versioned releases of the check system will be useful (seems doable if it's a package). That is, as someone submitting a package, I'd like to be able to reproduce the reported errors/warnings on my own computer, verify that my changes address them, and then re-submit (though you could also just trigger a new build).
TH:
- fix bugs in the check system and we're good to go!
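For reference, a minimal sketch of the Bioconductor workflow LCT describes above, assuming the `biocthis` and `BiocCheck` packages are installed (the workflow file and options are whatever `biocthis` currently defaults to):

```r
# Minimal sketch of the Bioconductor analogue described above: biocthis
# generates a GitHub Actions workflow which (among other steps) runs
# BiocCheck, so authors can confirm checks pass before submission.
# install.packages("BiocManager"); BiocManager::install(c("biocthis", "BiocCheck"))

# From within the package directory: write a GHA workflow under .github/workflows/
biocthis::use_bioc_github_action()

# The same check can also be run locally prior to submission:
BiocCheck::BiocCheck(".")
```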
---
## Agenda 29th March 2021
The three tasks to be discussed and decided upon are:
- Discuss final steps prior to official launch of system
- Agreement on proposed badging system
- Transition of board members from current roles to roles as handling editors
following system launch
### 1. Official Launch
We believe we are ready to launch the system publicly very soon. The general
procedures expected to be followed by submitting authors, by
the editorial team, and by reviewers, are described in Chapters 3--5 of the
[*Statistical Software Review
Book*](https://ropenscilabs.github.io/statistical-software-review-book/index.html). Briefly:
- **Authors** must ensure robustness of their packages via the [`autotest`
tool](https://github.com/ropenscilabs/autotest), and document compliance with
standards via the [`srr`
package](https://github.com/ropenscilabs/srr) (a sketch of this pre-submission
workflow follows this list)
- The automated checking system provides online endpoints, including one which
runs numerous checks to confirm that software may be submitted and/or
considered for review.
- **Editor-in-Chief** generally only needs to confirm that a package passes
that single primary check, then delegates to a handling editor.
- **Handling Editors** clarify the grade (bronze, silver, gold) sought at the
end of review; find and assign reviewers; and, prior to review, address any
aspects of the automated checks which could not be fulfilled.
- **Reviewers** use the [`srr` system](https://github.com/ropenscilabs/srr) to
assess compliance with standards, then proceed to a general review through
addressing specific questions identified in the [*Guide for
Reviewers*](https://ropenscilabs.github.io/statistical-software-review-book/pkgreview.html#general-package-review).
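A minimal sketch of the author-side pre-submission workflow, assuming the development versions of [`autotest`](https://github.com/ropenscilabs/autotest) and [`srr`](https://github.com/ropenscilabs/srr) are installed (function names are those currently exported by the two packages, and may change):

```r
# Sketch of the pre-submission checks expected of authors; assumes the
# development versions of autotest and srr from the repositories linked above.
# remotes::install_github(c("ropenscilabs/autotest", "ropenscilabs/srr"))
library(autotest)
library(srr)

# Run autotest over the local package source to assess robustness:
res <- autotest_package(".", test = TRUE)
print(res)

# Generate a report of standards compliance from the @srrstats tags
# embedded in the package's documentation:
srr_report(".")

# Confirm that no standards remain in a 'TODO' state prior to submission:
srr_stats_pre_submit(".")
```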
**Questions to be Discussed**
- What else do we need to have in place prior to official launch?
- Is there anything missing from the general procedure? Anything we might have
failed to consider?
- Are the [general questions for
reviewers](https://ropenscilabs.github.io/statistical-software-review-book/pkgreview.html#general-package-review)
sufficient? Could anything be improved?
- Is the proposed system for stating and documenting a [*Software Life
Cycle*](https://ropenscilabs.github.io/statistical-software-review-book/pkgdev.html#pkgdev-lifecycle)
sufficient?
### 2. Badging System
We propose a system of **bronze**, **silver**, and **gold** badges, as
described in the [*Guide for
Authors*](https://ropenscilabs.github.io/statistical-software-review-book/pkgdev.html#pkgdev-badges).
- Might there be any preferable or alternative systems?
- Are there any potential issues with using the terminology of bronze, silver,
gold?
- Are the requirements for the [silver
grade](https://ropenscilabs.github.io/statistical-software-review-book/pkgdev.html#pkgdev-silver)
both appropriate and sufficiently clear?
- Could the distinction between these three grades be formulated differently? If so, how?
### 3. We want you as Editors!
Which of you are willing to act as handling editors following launch? The envisioned
roles of handling editors are described in the [*Guide for
Editors*](https://ropenscilabs.github.io/statistical-software-review-book/pkgsubmission.html#handling-editor). While we are ultimately looking for editors to serve two-year terms, we are currently asking members of this board to act as interim handling editors over the next six months or so as we test the system.
Beyond that, note that there are currently five of you, while standards have
been developed for seven categories. Please add your name to any category for
which you agree to act as a handling editor, and please reach out to anybody
else who you know who might be able to help with any of the missing categories.
1. Bayesian and Monte Carlo Software - Ben (one of), Paula
2. Exploratory Data Analysis - Paula, Leo
3. Machine Learning Software
4. Regression and Supervised Learning Software - Ben (one of), Rebecca
5. Spatial Software - Mark, Paula
6. Time Series Software - Rebecca
7. Dimensionality Reduction, Clustering, and Unsupervised Learning Software - Stephanie (only starting in Aug 2021 though)
_Leo_: anything that involves some genomics/bioinformatics. (if that's allowed)
_Rebecca_: also happy to handle more environmental and health related submissions.
The following standards will be developed following launch:
1. Wrapper Packages
2. Network Analysis Software
3. Probability Distributions
4. Workflow Support Software
---
## Agenda 16th Feb 2021
**NOTE**: Please attend our upcoming [Community
Call](https://ropensci.org/commcalls/) on March 2nd 2021.
Meeting will be divided into two parts:
**First 20-30 minutes**: Demonstration and discussion on tools for developers
to assess their software prior to submission.
- Input, feedback, opinions on system for "injecting" standards into code,
system for developers to address those standards, and system for
automatically collating reports on standards adherence.
- Meeting will start with a short (5-10 min) demonstration of the system.
- Concrete questions will include:
- The current system uses three forms of tags (default, meaning the standard is
complied with; NA; and TODO); a sketch follows this list. Is this sufficient?
Any other suggestions?
- Current system relies on `roxygen2` "roclets" to record and process
standards; are there better approaches?
- [`autotest`](https://github.com/ropenscilabs/autotest) and
[this system](https://github.com/ropenscilabs/srr) together present quite
a high hurdle to gain initial entry to *start* a review process. Is this
likely to act as a deterrent? If so, what can or should we do to negate
any deterring effects?
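To make the tag discussion concrete, a minimal sketch of how the three forms currently appear as `roxygen2` tags under the [`srr` system](https://github.com/ropenscilabs/srr); the standard numbers and block layout here are illustrative only:

```r
# Illustrative sketch of the three srr tag forms in roxygen2 blocks.
# Standard numbers are placeholders; real blocks reference actual standards.

#' my_function
#'
#' @srrstats {G1.0} This standard is complied with; the tag documents where and how.
#' @export
my_function <- function(x) {
  x + 1
}

#' NA_standards
#'
#' @srrstatsNA {G2.3} This standard is judged not applicable, with justification here.
#' @noRd
NULL

#' srr_stats_TODO
#'
#' @srrstatsTODO {G3.0} This standard is still to be addressed before submission.
#' @noRd
NULL
```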
**Next 20-30 minutes**: Discussion of remaining tasks prior to launch.
Priority tasks in no particular order include:
- [ ] Resolve levels for green/silver/gold badging, which means only resolving
what "silver" means. *Proposal*:
- **Green** = decision to accept
- **Silver** is a statement of *intent* rather than status, that developers
have made some progress towards gold standard, with more to come.
- **Gold** = All standards deemed by reviewers to be *potentially*
applicable have been adhered to, and `autotest` passes cleanly.
- [ ] See which board members want to serve as associate editors for first few months
(together with Mark and Noam).
- [ ] Finish remaining standards (*Probability Distributions*, *Wrapper
Packages*, *Networks*, and *Workflow Support*).
- [ ] Decide whether packages will generally be integrated into
[`github.com/ropensci`](https://github.com/ropensci), or whether that will be
optional only, with authors able to retain packages in their own
organizations.
Non-verbal update - upcoming automation work:
- [ ] Integrate [`autotest`](https://github.com/ropenscilabs/autotest) output
with [`srr`](https://github.com/ropenscilabs/srr) to automatically
pre-populate standards checklist items.
- [ ] Complete extraction of [`srr`](https://github.com/ropenscilabs/srr)
documentation from code to populate report on standards compliance.
- [ ] Ensure stable prototype server to deliver combined
[`autotest`](https://github.com/ropenscilabs/autotest) and
[`srr`](https://github.com/ropenscilabs/srr) reports via bot
([`buffy`](https://github.com/openjournals/buffy)) command.
## Agenda Dec 1st 2020
- Introductions of new members (Paula, Leo)
- Two (plus one) main discussion points:
1. How will the process be organised?
2. What does "acceptance" look like?
3. (If time permits) What happens after acceptance?
Explicit questions to be addressed are *in italics*.
**1. How will the process be organised?**
Note #1: The current rOpenSci system is primarily organised by an occasionally
revolving team of around six editors (with no "specialist" or subject
editors). Their primary tasks are (i) to determine whether software is in
scope, (ii) to find and assign reviewers, and (iii) to manage the review
process. A number of aspects might be adopted and/or modified for our
statistical software systems, for which important questions include:
- *Do we integrate within current system, with perhaps a few more editors to
handle increased workload?* or,
- *Do we integrate within current system, but have statistics-specific editors
for any statistical packages?*, or
- *Do we keep submission and review system for statistical software separate
from previous system?*
Responses:
- SH: Leverage the system/people that you have, bring in additional editors as needed
- BB: Existing editors should know to recognize/assign, opportunity to expand the brand
- RK: Keep infrastructure, bring in editor to spearhead/be figurehead
- SH: What would be the role of statistical editors?
- NR: probably a little more outreach and recruitment than current editors, especially for a statistical "editor-in-chief"
If the system is to be kept separate:
- *How do we "brand" accepted packages?*
- *With current rOpenSci badge plus additional one for standards?* or
- *Modified version of current branding and badging that includes reference to statistical software project/programme?* or,
- *Branding and badging distinct and separate from current rOpenSci system?*
Responses:
- RK: What about multiple badges for different standard categories?
- NR: We should record which categories are reviewed, and badge could link to checklist/cover
- MK: What about post-review development
- NR/MP: We'll return to this
- BB: Passing at different levels is important
- RK: We definitely need badging as an incentive
- NR: How should we do graded badging?
- BB: Need to have the subjective approach included, maybe not go to the edge of quantifying everything
- RK: Guidance for reviewers - make checklist enable _eligibility_ for certain level, then reviewers can approve that level or lower
- SH: How do we ensure uniformity? Automation is hard, but we need explicit and specific badges to ensure it. Need very clear guidance, at least for the base level, to make sure people know how to reach them.
- PM: Need to be clear that these are only for software quality, not methodological correctness/usefulness
- BB: a couple of thoughts: (1) a **verbal** rubric for bronze/silver/gold, indicating in some detail (but *qualitatively*) what's expected at each level; (2) bronze/silver/gold is culturally very clear, but it would be nice to have slightly more granularity (a 5-point Likert scale with appropriate labels?)
- SH: Very quantitative for minimum bar, more subjectivity higher up.
- NR: Some of the gold/novel type practices are not very subjective, but may be onerous or are currently rare.
- RK:
Current system includes a couple of options for submissions to be considered
part of a review phase for subsequent submission to journals (notably Journal
of Open Source Software and Methods in Ecology and Evolution).
- *Should we endeavour to have our review process recognised as an initial part
of review for subsequent journal submissions?*
- *If so, which journals should we consider contacting?*
- *The Journal of Statistical Software*? (via Rebecca?) *R Journal*?
**2. What does "acceptance" look like?**
Note #1: The primary judgement in rOpenSci's current system is whether a
submission is in scope. If so, submission is invited, following which reviewers
and editors generally work with developers to get a package accepted. Actual
"rejection" is rare. The model for the statistical software system will be
similar in that rejection should be rare.
Note #2: Relative to the current system, the statistical standards are much more "leading edge,"
in that most current packages are unlikely to meet them. Current RO standards are more "lagging"
in that they are more adoptions of already-evolved best practices.
Should acceptance be indicated by:
- *A simple badge as with current system?*
- *Step-wise badging (bronze/passing, silver, gold, as in
[coreinfrastructure](https://bestpractices.coreinfrastructure.org/en/criteria))?*
- *Graduated badging similar to code coverage?*
Possible templates are the [Criteria
Statistics](https://bestpractices.coreinfrastructure.org/en/criteria_stats) for
Core Infrastructure badges.
Further important question:
- *How should "not applicable" standards be considered?*
Current standards suggest the following:
1. Checking boxes of all standards which are applicable and which are met
2. Checking boxes of all standards which it is okay to consider not applicable
(and appending those with **N/A** to aid machine parsing of reviews).
3. Leaving unchecked boxes of all standards not met, as well as of all
standards currently unable to be applied, yet which should ideally be
applied.
**3. What happens after acceptance?**
Note #1: rOpenSci currently offers/requires that developers transfer accepted
repositories to the `ropensci` organization, which also turns on some automated
and non-automated maintenance, checking, and documentation processes. The model
for the statistical software system need not follow that pattern.
Note #2: We will likely require developers to submit an expression of envisioned
lifecycle or future development plans.
- *How might a lifecycle plan best be incorporated within both the review
process, and an R package structure?*
- *How might we ask reviewers to consider these statements? How, for example,
should a reviewer judge a statement that a package represents the completion
of a unit of work, and so no further development is anticipated?*
## Agenda: 2020 Sept 01
- Walk through applications of standards to
[`lme4`](https://hackmd.io/VZ-wgQtZRV2pb-wFZNDM5g), with reference both to
[General Standards](https://hackmd.io/gVjTHFupS-GCy4qdjfn8gg) and
[Regression-specific Standards](https://hackmd.io/ipvRsLU-ShSNi7n2skS-6A)
- Discuss thoughts and opinions on general approach to standards thus far
- In particular, on the relative paucity of algorithm-specific standards
within each category (for example, compare
[Regression Standards](https://hackmd.io/ipvRsLU-ShSNi7n2skS-6A) with
[Unsupervised Learning Standards](https://hackmd.io/KHzx4Sq-SnOaEQ8N9-7qvA)).
- Brief concluding discussion about soft "launch" in order to invite initial
submissions/enquiries from developers of software in the first 5 categories.
- FYI: Invitations sent to potential new board members:
- [Sarah Romanes](https://sarahromanes.github.io/) - U Sydney; statistical machine learning
- [Paula Moraga](http://www.paulamoraga.com/) - King Abdullah Uni Saudi Arabia; geostatistics, epidemiology
- [Leonardo Collado-Torres](http://lcolladotor.github.io/) - Lieber Inst for Brain Development; genetics, bioconductor stuff
### Notes from meeting 2020 Sept 01
ben: need to consider implications of backward compatibility for standards
ben: leave distinction between warning and error more flexible
G2.13 stephanie: "appropriately handle" seems strange - "adequately handle"
G4.4b Parameter recovery tests should use multiple seeds (see the sketch below)
- stephanie: important to acknowledge time trade-off, and possibility of relegating to extended rather than regular tests
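As a concrete illustration of the G4.4b suggestion, a minimal sketch of a parameter recovery test run over several seeds; `simulate_data()` and `fit_model()` are hypothetical stand-ins for a package's own simulation and estimation functions:

```r
# Sketch of a parameter-recovery test over multiple seeds (G4.4b).
# simulate_data() and fit_model() are hypothetical stand-ins for a package's
# own functions; such loops may belong in extended rather than regular tests.
library(testthat)

test_that("parameters are recovered across multiple seeds", {
  for (seed in c(1L, 42L, 2020L)) {
    set.seed(seed)
    dat <- simulate_data(n = 1000, beta = 2)
    fit <- fit_model(dat)
    expect_equal(coef(fit)[["beta"]], 2, tolerance = 0.1)
  }
})
```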
RE1.2 ben a bit concerned about that requirement, but maybe because it's not
currently sufficiently clear that it pertains to documentation only. TODO:
Clarify
RE2.4 co-linearity: Max
- very hard to do, maybe not always possible
- e.g. if the rank returned by the `qr()` function (from the `Matrix` package) is less than the number of columns? (see the sketch below)
- provide some boilerplate to get going
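A minimal boilerplate sketch of the `qr()`-based check suggested above (this uses base R's `qr()` for simplicity; thresholds, messaging, and any `Matrix`-specific handling would be package-specific):

```r
# Boilerplate sketch for detecting rank deficiency (perfect collinearity) in a
# model matrix via its QR decomposition, as suggested above. Illustrative only.
check_collinearity <- function(X) {
  rank_X <- qr(X)$rank
  if (rank_X < ncol(X)) {
    warning("Model matrix is rank-deficient; some predictors may be collinear.")
  }
  invisible(rank_X)
}

# Example: the third column is an exact linear combination of the first two.
X <- cbind(1, rnorm(10))
X <- cbind(X, X[, 1] + 2 * X[, 2])
check_collinearity(X)  # triggers the warning
```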
4.1 - standards for algorithmic control? Would be a good thing to have in general
RE4.14-15 Forecasting (see the sketch after these comments). Comments by Ben:
- Very hard except by parametric bootstrapping
- Such things are not part of `lme4` because we've never been able to do it
- General hat-matrix method applicable to most linear models
- prediction for new subjects ought also be part of it
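For reference, a minimal sketch of the parametric-bootstrap approach mentioned above, using `lme4::bootMer()` with the built-in `sleepstudy` data (illustrative only; not a proposed standard):

```r
# Sketch of parametric-bootstrap prediction intervals for an lmer model,
# the approach referred to in the comments above. Illustrative only.
library(lme4)

fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Parametric bootstrap of population-level predictions (re.form = NA):
bb <- bootMer(fit, FUN = function(m) predict(m, re.form = NA), nsim = 200)

# 95% interval for each observation's predicted value:
ci <- apply(bb$t, 2, quantile, probs = c(0.025, 0.975))
head(t(ci))
```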
RE6.3 Visualisation of forecast values. Comments by Ben:
- `predict` merely takes new data, and not necessarily extrapolation
- sounds very much just TS specific
- Max: How about just "feasible", or "when possible"
### TODO List:
- [x] Clarify that RE1.2 pertains only to documentation
- [x] Change or clarify "appropriately handle" in G2.13
- [x] Indicate that G4.4b may require relegation to extended tests
- [ ] Some boiler-plate examples for RE2.4
- [ ] Sections 4.1: Standards for algorithmic control?
- [x] RE4.14-15 Clarify that this may not be possible
- [x] RE4.14-15 Add new standard for prediction using new subjects/groups (where applicable)
- [x] RE6.3 Clarify that this only applies "where feasible" or "where possible"
## Agenda: 2020 July 14
1. Discuss high-level conceptual approach to Standards thus far, particularly
the comparably well developed initial versions for [`Time
Series`](https://hackmd.io/uu8AJDGnStmaNTfFd0SZ-g) and
[`Bayesian`](https://hackmd.io/38W9pcE3TWGawcAcBbFlNg) software.
2. Discuss the relative paucity of detail in the core *algorithmic* sections of
these categories, noting the following:
- Many standards beyond these core algorithmic sections might be ultimately
merged into more general, higher-level standards, and so not end up being
category specific at all.
- We have aimed with these first cuts to be as general as possible, and to
avoid conditional clauses as far as possible ("*If your software within
this category is of this sub-type, then ...*"). Such conditional clauses
will ultimately be necessary, but how much might be too much? We'd like
to briefly discuss approaches to the development of category-specific
standards as an exercise in identifying and specifying sub-categories.
3. Current initial standards for the EDA category are notably different from
those for other categories, and are likely to remain so. These standards are
more *qualitative*, and suggest that developers should identify things like
target audiences and key questions. Many items in the standards for other
categories are intentionally more *quantitative*, partly reflecting our
attempts to develop standards able to be assessed in a (semi-)automatic way.
We'd like to discuss the issue of the potentially greater burden placed on
both developers and reviewers by these kinds of qualitative standards,
including such questions as:
- How much is too much?
- What are the relative advantages and disadvantages of qualitative versus
quantitative standards?
4. Remaining Categories are:
- Dimensionality Reduction, Clustering, Unsupervised Learning
- Machine Learning
- Probability Distributions
- Wrapper Packages
- Networks
- Workflow Support
- Spatial Analyses
The standards thus far likely provide good templates for most of those
remaining categories, although perhaps less so for the *Machine Learning*
category. We'd like to briefly discuss ideas for how we might address core
*algorithmic* standards in that category.
5. General logistical issues for brief discussion:
- Workflow from here: more regular Board meetings
- General timeline - what stage should standards be at in order to start submissions?
- Martin has decided he won't be able to participate in the future. Who should we invite in his place?
## Notes: 2020 July 14
- Noam mentioned Alex Hayes's four "Types of Tests" (correctness, parameter recovery, convergence, identification).
- Max: Worth putting in somewhere regardless, to at least get people thinking about them
- Rebecca: They are quite generic, so maybe not directly useful?
- Agreement that board members will nominate a package to help step through standards
- Next Categories?
- Clustering
- Probability Distributions
- Tasks:
1. Nominate package to be assessed
2. Nominate potential new board members
3. Actively engage with standards in current form
- Next meeting: Single task of walking through assessment of one package