Checklist to evaluate machine learning systems

This checklist is based on the paper "The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency and Usability in AI" (DOI published Mar 21, 2024). The Model Openness Framework (MOF) is implemented on the Model Openness Tool (MOT) website.

Scope of this document

This Checklist was developed by volunteers during the co-design process to help reviewers of AI systems identify and rank the components required to exercise the basic freedoms of Open Source AI. It has been further refined through public comments on the forum and on the public draft on HackMD.

This document should be seen as part of the definitional process and as a learning tool. The Checklist is not an operating manual for evaluating Open Source AI.

Relationship to the Model Openness Framework

The MOF classifies systems into three degrees of availability of components, from some (Class III, Open Model) to all (Class I, Open Science). When using the MOF, one can think of the requirements of the "preferred form to make modifications to a ML system" as a bar overlaid on the MOF range of classes.

Known issues and limitations

Tied to generative AI: Being based on the MOF, this Checklist appears to be tightly coupled to generative AI. The list of components is not generalized enough to be applied to all machine learning. More research is necessary to apply the principles of the Open Source AI Definition to other kinds of AI and different machine learning systems.
Subject to interpretation: When the Datasets component is made available, the Data requirements should be satisfied. When an AI system doesn't make the Datasets component available, one needs to judge from the alternative Data components whether they satisfy the requirements listed in the Open Source AI Definition. This is another area that requires further research as the practice of Open Source AI develops.

For more details, see also the Open Source AI FAQ.

Table of default required components

Required components	Legal frameworks^[1]
Data
See Known Issues. The requirements in the Open Source AI Definition must be satisfied.
- Datasets	Available under OSI-approved terms
- Research paper	Available under OSI-approved terms
- Technical report	Available under OSI-approved terms
- Data card	Available under OSI-approved terms
Code
All of these components are required
- Data pre-processing	Available under OSI-approved license
- Training, validation and testing	Available under OSI-approved license
- Inference	Available under OSI-approved license
- Supporting libraries and tools	Available under OSI-approved license
Model
All of these components are required
- Model architecture	Available under OSI-approved license
- Model parameters	Available under OSI-approved terms
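
As an illustration only, the table above could be encoded as a small data structure that a reviewer uses to take notes. This is a minimal sketch, not an official OSI tool: the names `REQUIRED`, `DATA_ALTERNATIVES` and `review`, the status strings, and the idea of passing in a set of component names are all assumptions made for this example. In line with the Known Issues, it does not decide whether the Data requirements are met when Datasets are absent; it only flags that case for manual review.

```python
# Hypothetical sketch: all names and status strings are illustrative assumptions.

# Code and Model components, all of which the table marks as required.
REQUIRED = {
    "Code": [
        "Data pre-processing",
        "Training, validation and testing",
        "Inference",
        "Supporting libraries and tools",
    ],
    "Model": ["Model architecture", "Model parameters"],
}

# Data components other than Datasets, which call for human interpretation.
DATA_ALTERNATIVES = ["Research paper", "Technical report", "Data card"]


def review(available: set[str]) -> dict:
    """Summarize which required components are missing.

    `available` holds the names of components the reviewer has found
    released under OSI-approved terms or an OSI-approved license.
    """
    missing = [
        f"{category}: {name}"
        for category, names in REQUIRED.items()
        for name in names
        if name not in available
    ]
    if "Datasets" in available:
        data_status = "datasets available"
    elif any(name in available for name in DATA_ALTERNATIVES):
        # The checklist leaves this case to the reviewer's judgment.
        data_status = "needs manual review"
    else:
        data_status = "no data components"
    return {"missing": missing, "data_status": data_status}
```

For example, calling `review({"Inference", "Model parameters", "Data card"})` would list the four other Code and Model components as missing and report the Data status as needing manual review.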

Table of optional components

The other components listed in the Model Openness Framework are optional.

Optional components
Data
- Evaluation data
- Evaluation results
Code
- Code used to perform inference for benchmark tests
- Evaluation code
Model
- Model card
- Sample model outputs
- Model metadata

[1] "Available under OSI-approved terms" means that the OSI will review licenses and agreements to ensure that all materials are available under terms that conform with the Open Source Definition.
