# The Open Source AI Definition

### version 0.0.5

:::info
:information_source: Note: This document is made of three parts: a preamble, stating the intentions of this document; the Definition of Open Source AI itself; and a checklist to evaluate licenses.
:::

:::info
:information_source: This document follows the definition of AI system adopted by the [Organisation for Economic Co-operation and Development (OECD)](https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449):

> An AI system is a machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments. Different AI systems vary in their levels of autonomy and adaptiveness after deployment.

More information about definitions of AI systems is available on [OSI's blog](https://blog.opensource.org/open-source-ai-establishing-a-common-ground/).
:::

# Preamble

## Why we need Open Source Artificial Intelligence (AI)

Open Source has demonstrated that massive benefits accrue to everyone when you remove the barriers to learning, using, sharing and improving software systems. These benefits are the result of using licenses that adhere to the Open Source Definition. The benefits can be summarized as autonomy, transparency, and collaborative improvement.

Everyone needs these benefits in AI. We need essential freedoms to enable users to build and deploy AI systems that are reliable and transparent.

## Out of scope issues

The Open Source AI Definition doesn’t say how to develop and deploy an AI system that is ethical, trustworthy or responsible, although it doesn’t prevent doing so. We support the efforts to discuss the responsible development, deployment and use of AI systems, including through appropriate government regulation, as a separate conversation.

# What is Open Source AI

To be Open Source, an AI system needs to be available under legal terms that grant the freedoms to:

* **Use** the system for any purpose and without having to ask for permission.
* **Study** how the system works and inspect its components.
* **Modify** the system for any purpose, including to change its output.
* **Share** the system for others to use with or without modifications, for any purpose.

# Checklist to evaluate legal documents

:::info
This table is a work in progress. See [slide 7](https://opensource.org/wp-content/uploads/2024/01/osi_townhall_2.pdf) for more details.
:::

| Component | Necessary to Use | Necessary to Study | Necessary to Modify | Necessary to Share |
| --------- | ---------------- | ------------------ | ------------------- | ------------------ |
| **Code** All code used to parse and process data, including: | | | | |
| - Data preprocessing code | | | | |
| - Training code | | | | |
| - Code used to perform inference for benchmark tests | | | | |
| - Validation code | | | | |
| - Inference code | | | | |
| - Evaluation code | | | | |
| - Other libraries or code artifacts that are part of the system, such as tokenizers and hyperparameter search code, if used. | | | | |
| **Data** All data sets, including: | | | | |
| - Training data sets | | | | |
| - Testing data sets | | | | |
| - Validation data sets | | | | |
| - Benchmarking data sets | | | | |
| - Data cards | | | | |
| - Evaluation metrics and results | | | | |
| - All other data documentation | | | | |
| **Model** All model elements, including: | | | | |
| - Model architecture | | | | |
| - Model parameters (including weights) | | | | |
| - Model card | | | | |
| - Sample model outputs | | | | |
| **Other** Any other documentation or tools produced or used, including: | | | | |
| - Thorough research papers | | | | |
| - Usage documentation | | | | |
| - Technical report | | | | |
| - Supporting tools | | | | |