TTW: Open AI

Location: Guide for Reproducible Research/Open Research/Open AI

What is open AI?

Open source has long been an important attribute of AI communities in academia and industry. In the late 2000s, open-source machine learning libraries like scikit-learn and open datasets like ImageNet helped generate interest and set standards for the community. While these early projects were opened up to build a shared, optimised resource for a community of collaborators, new models of and motivations for open AI have since developed.

Benefits of open AI

  • Collaboration through shared datasets and evaluation benchmarks
  • Encouraging competition for innovation
  • Growing user base and developer team
  • Platform leadership

[TODO: Fill out section]

Drawbacks to open AI

[TODO: Fill out section]

Contributing to open AI

[TODO: Fill out section]

Different approaches to open AI

While "open AI" has traditionally referred to open access to models and, potentially, their source code, what constitutes "openness" in AI is multidimensional. There are opportunities for openness along the entire AI pipeline: sourcing data, training a model, creating code and tools to support the model, evaluating and applying the model, governing and maintaining the model, and licensing and distributing the model.
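One way to make this multidimensional view concrete is to record openness per pipeline stage rather than as a single yes/no label. The snippet below is a minimal sketch, not an established schema: the `OpennessProfile` class, its field names, and the example values are all hypothetical and chosen only to mirror the stages listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OpennessProfile:
    """Hypothetical record of which pipeline stages of a model are open."""

    data: bool        # training data is openly available
    training: bool    # training procedure is documented and reproducible
    code: bool        # source code and supporting tools are released
    evaluation: bool  # evaluation results and benchmarks are public
    governance: bool  # governance and maintenance processes are open
    licensing: bool   # licence permits reuse and redistribution

    def open_stages(self) -> list[str]:
        """Return the names of the stages marked as open."""
        return [f.name for f in fields(self) if getattr(self, f.name)]


# Placeholder values for an imaginary model, not a claim about any real system.
example = OpennessProfile(
    data=False,
    training=True,
    code=True,
    evaluation=True,
    governance=False,
    licensing=True,
)
print(example.open_stages())
```

A per-stage record like this makes it easier to see that two models can both be described as "open" while differing in which parts of the pipeline are actually shared.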

The table below compares different flavours of openness in a few well-known AI models.

[TODO: Create Table]

Model Name          Organization    Description
BLOOM               Text            Text
Stable Diffusion    Text            Text
OPT                 Text            Text
GPT                 Text            Text

Case Study: BLOOM LLM

BLOOM is an open-access, multilingual large language model (LLM) co-created by 1000+ researchers through the BigScience Workshop, which was inspired by open science initiatives such as CERN.

[TODO: Fill out section on BLOOM approach to open AI]