---
tags: ocaml.org
---
# Building package docs
Contacts: @julow @jonludlam @lortex
Overview
---
This document describes how we build and update the package docs on `www.ocaml.org`.
The documentation is produced from the built libraries using `odoc`, a tool that is designed and built from the ground up to understand how OCaml libraries are assembled and used. It allows linking between different libraries in different packages, ensuring that links end up at exactly the correct place, a task complicated by the fact that this depends upon the precise versions of all dependent packages required to build the package being documented.
These documents will be placed on the ocaml.org website, combined with the information currently shown on https://opam.ocaml.org/packages/. This will be the canonical source of information about packages published in opam.
Voodoo
---
## How it works
Building package docs with ocaml-docs-ci and Voodoo is an incremental process: as new packages are added to `opam-repository`, the work required is restricted to the new packages only. Occasionally we may rebuild large chunks of the website, for example when a new OCaml release is made, but updating the website for new packages should be a quick process.
Stages of the process:
0. Wait until triggered by a change being pushed to `opam-repository`
1. Decide what to build, recording decisions made. Concretely, we record the fact that we intend to build specific sets of opam packages at specific versions with a specific version of OCaml.
2. Run the build with ocluster and extract all the information we need from each build, without making decisions about how it's presented. (voodoo-prep)
3. Compile the obtained artifacts using odoc, using information about the final package hierarchy, and generate html files for each package target. (voodoo-do)
4. Generate index pages for all packages, and the global index page, linking all html files together. (voodoo-indexes)
5. Goto 1
This is an ocurrent pipeline, using an ssh server to contain the generated artifacts.
- `voodoo-prep`:
    + Input: a list of packages to prep, and the associated _universe_ id
    + **Output**: `prep/universes/<universe_id>/<package name>/<package version>` directory
- `voodoo-do`:
    + Input: the compilation result of a package's dependencies
    + Input: the prep folder of the package to do
    + Compile the package artifacts into odoc, odocl and html files.
    + **Output**: `compile/universes/<universe_id>/<package name>/<package version>/`: odoc and odocl files
    + **Output**: `html/universes/<universe_id>/<package name>/<package version>/`: html files
    + *if blessed*:
        * **Output**: `compile/packages/<package name>/<package version>/`
        * **Output**: `html/packages/<package name>/<package version>/`
- `voodoo-indexes`:
    + Generate the index pages as .mld files
    + Compile them using odoc
### Dependency universes
A package is not just defined by the tuple of package name and package version. Its documentation may also depend on the exact versions of the packages it depends upon. For example, consider a package containing an mli file such as:
```ocaml
module M : Set.S with type elt = int
```
The expansion of this will depend on which version of the standard library it was compiled against.
A particular package is therefore specified by the triple of the package name, the package version, and the 'dependency universe hash'. This hash is computed in the following way:
1. Find all dependencies (including transitive dependencies, though not going 'through' the ocaml package) using opam.
2. Sort and write them to a string, one package per line, in the format `<package name>.<version>`
3. Compute the MD5 hash of the string.
For example:
```
conf-m4.1
ocaml.4.11.1
ocamlbuild.0.14.0
ocamlfind.1.8.1
topkg.1.0.3
```
which are the dependencies on this particular system for the package `astring.0.8.5`. The hash of this should be `92edc0c1c4ec93b2f61fdd7fc9491460`
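The three steps above can be sketched with the standard library's `Digest` module, which implements MD5. The function name `universe_hash` is hypothetical, and the exact string layout is an assumption: the sketch joins the lines with `\n` and adds no trailing newline, which the text does not specify, so its output will not necessarily match the hash quoted above.

```ocaml
(* Hypothetical helper, not taken from the voodoo or ocaml-docs-ci sources.
   [deps] is the list of (name, version) pairs found in step 1. *)
let universe_hash (deps : (string * string) list) : string =
  deps
  (* Step 2: one "<package name>.<version>" entry per package, sorted. *)
  |> List.map (fun (name, version) -> name ^ "." ^ version)
  |> List.sort String.compare
  (* Assumption: newline-separated, without a trailing newline. *)
  |> String.concat "\n"
  (* Step 3: MD5 of the resulting string, rendered in hexadecimal. *)
  |> Digest.string
  |> Digest.to_hex

(* The dependencies listed above for astring.0.8.5, in arbitrary order. *)
let astring_deps =
  [ ("topkg", "1.0.3"); ("ocaml", "4.11.1"); ("conf-m4", "1");
    ("ocamlbuild", "0.14.0"); ("ocamlfind", "1.8.1") ]

let () = print_endline (universe_hash astring_deps)
```

Because the list is sorted before hashing, the result does not depend on the order in which opam reports the dependencies.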
The type to uniquely identify a package is therefore given by:
```ocaml=
type universe_id = Digest.t
type package_name = string
type package_version = string
type package = universe_id * package_name * package_version
```
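As an illustration of how this triple maps onto the directory layout described earlier, here is a minimal sketch. The function names `universe_dir` and `blessed_dir` are hypothetical, and rendering the universe id in hexadecimal inside paths is an assumption.

```ocaml
type universe_id = Digest.t
type package_name = string
type package_version = string
type package = universe_id * package_name * package_version

(* <stage>/universes/<universe_id>/<package name>/<package version>
   where [stage] is "prep", "compile" or "html".
   Assumption: the universe id is written in hexadecimal. *)
let universe_dir ~stage ((universe, name, version) : package) =
  String.concat "/"
    [ stage; "universes"; Digest.to_hex universe; name; version ]

(* <stage>/packages/<package name>/<package version>, the location used
   for blessed packages; the universe component is dropped. *)
let blessed_dir ~stage ((_, name, version) : package) =
  String.concat "/" [ stage; "packages"; name; version ]

let () =
  let p : package = (Digest.string "example universe", "astring", "0.8.5") in
  List.iter
    (fun stage -> print_endline (universe_dir ~stage p))
    [ "prep"; "compile"; "html" ];
  print_endline (blessed_dir ~stage:"html" p)
```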
### Handling packages, sub-packages and libraries
Because odoc handles include paths in the same way that OCaml does, and because we would like references to behave in the same familiar way that normal OCaml paths do, it makes sense to keep the `odoc` files in the identical directory structure to that of the associated `cmt`, `cmti` and `cmi` files. This does _not_ imply that the directory structure of the output `html` files (or man/latex files) must mirror this. The implication of this is that we _cannot_ determine sub-packages.
As an example of the various ways complex packages are laid out, we have the following case studies:
#### Case study: yaml
- Compiled with dune.
- Contains multiple packages, including a sub-sub-package:
```
yaml
yaml.bindings
yaml.bindings.types
yaml.c
yaml.ffi
yaml.types
yaml.unix
```
- Sub-packages do _not_ each have their own META file: there is a single META for yaml, describing every sub-package
- Each sub-package corresponds with precisely one archive
- Each package has an isolated include directory
- All subdirs are underneath `~/.opam/$switch/lib/yaml`
#### Case study: oasis
- Not compiled with dune
- Contains multiple packages:
```
oasis
oasis.base
oasis.builtin-plugins
oasis.cli
oasis.dynrun
```
- Two META files - one in `~/.opam/$switch/lib/plugin-loader` and the other in `~/.opam/$switch/lib/oasis`
- Each sub-package corresponds with precisely one archive
- Multiple packages share the same directory
#### Case study: dose3
- Not compiled with dune
- Contains multiple packages:
```
dose3
dose3.algo
dose3.common
dose3.csw
dose3.debian
dose3.doseparse
dose3.doseparseNoRpm
dose3.npm
dose3.opam
dose3.pef
dose3.rpm
dose3.versioning
```
- One META file, in `~/.opam/$switch/lib/dose3`
- The dose3 package contains multiple archives - `"common.cma algo.cma versioning.cma pef.cma debian.cma csw.cma opam.cma npm.cma"`
- Sub-packages also contain the same archives - e.g. dose3.algo specifies `algo.cma`
#### Case study: stdlib and associated libraries
- Not compiled with dune
- Contains multiple libraries, the exact list depends on the OCaml version:
```
bigarray
bytes
compiler-libs
dynlink
ocamldoc
raw_spacetime
stdlib
str
threads
unix
```
- META files _not_ distributed with the package, they come with `ocamlfind`
- META files in isolated directories, but many of the packages' include dirs overlap
#### Sub-packages observations
- Different sub-packages containing the same libraries is unusual.
Questions:
- What do we do for something like dose3?
- Can we just do nice docs for dune-based projects? probably not, not least due to Daniel's packages
- How do we figure out which packages can be documented nicely? (e.g. no overlapping archives)
- What do we do for the OCaml libraries (stdlib, seq, raw_spacetime, str etc -- these don't have opam packages -- mostly the META files come from the `ocamlfind` package)
- What other packages will be painful? We have the 'corpus' compiled already, but missing files like META, dune-packages and so on.
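One way to approach the "no overlapping archives" question is to check, for each archive, how many sub-packages list it. The following is a minimal sketch under the assumption that the sub-package to archive mapping has already been extracted from the META file; the names `owners` and `overlapping` are hypothetical, and the sample data is the dose3 case described above.

```ocaml
module SMap = Map.Make (String)

(* For every archive, collect the sub-packages that list it. *)
let owners (libs : (string * string list) list) : string list SMap.t =
  List.fold_left
    (fun acc (sub_package, archives) ->
      List.fold_left
        (fun acc archive ->
          SMap.update archive
            (function
              | None -> Some [ sub_package ]
              | Some subs -> Some (sub_package :: subs))
            acc)
        acc archives)
    SMap.empty libs

(* Archives claimed by more than one sub-package, with their claimants. *)
let overlapping libs =
  owners libs
  |> SMap.filter (fun _archive subs -> List.length subs > 1)
  |> SMap.bindings

(* Archive lists quoted in the dose3 case study. *)
let dose3 =
  [ ("dose3",
     [ "common.cma"; "algo.cma"; "versioning.cma"; "pef.cma";
       "debian.cma"; "csw.cma"; "opam.cma"; "npm.cma" ]);
    ("dose3.algo", [ "algo.cma" ]) ]

let () =
  List.iter
    (fun (archive, subs) ->
      Printf.printf "%s is listed by: %s\n" archive (String.concat ", " subs))
    (overlapping dose3)
```

A package for which `overlapping` returns an empty list would be a candidate for the "nice" per-sub-package layout.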
#### Detecting sub-packages
We should detect sub-packages and group modules under them in package pages.
This information is important, for example to be able to use them in Dune.
+ Some packages use subdirectories (eg. yaml)
+ Some packages have one archive per sub-package (eg. logs)
+ Some packages have intersecting archives (eg. dose3)
+ Some packages are "unwrapped" (eg. base)
+ One package has two `lib/*` directories and two `META` files (oasis).
Maybe:
* treat that as two different packages, one of which Opam doesn't know?
* find them using Opam's `.changes` files and treat the second package as a sub-package?
Reliable ways to find them:
* Querying ocamlfind is the only way to pair sub-package names with archives.
The library exposes a parser for META files.
Currently, the CLI fails to print archives: for example, `%A` does not work in `ocamlfind query -format "%p %d %A"`.
Later, we'll also need `assemble` to create a nice looking hierarchy for sub-packages,
for example: `.../<package>/<version>/<sub.package>/<modules>`
### Package content
- [ ] Most packages don't have documentation pages but have:
- `doc/$/README.*` (.org or .md)
- `doc/$/LICENSE.*`
- `doc/$/CHANGES.*`
- `lib/$/META`
Every package except the stdlib has it.
It's the only way to know about sub-packages.
- `lib/$/opam`
Added by opam.
- `lib/$/dune-package`
Added by dune, only in projects using dune.
Contains the same information as `META`.
- `lib/$/**.ml?`
Source files, intended to be read by merlin or, why not, used for "see code" links from the documentation. (Odoc should do that someday!)
A few packages have documentation intended to be read by Odoc:
- `doc/$/odoc-pages/index.mld`
This is intended to be the entry point of the package's doc.
`assemble` should use it as the package page, possibly modifying it to add a common header.
- `doc/$/odoc-pages/*.mld`
Various other files we can find sometimes:
- `doc/$/*.ml`
In dbuenzli's libraries, it is meant to be appended at the end of `index.mld` automatically.
- Some packages have things in `share/$`
But these are not intended to be read by users (eg. emacs/vim plugins)
`voodoo-prep` adds some other information that may be useful:
- List of dependencies on other packages
## Ocaml-docs-ci
This is the incremental pipeline to build documentation.
Repo: https://github.com/ocurrent/ocaml-docs-ci
### 1. Track the opam-repository
```ocaml
val v : Git.Commit.t Current.t -> t list Current.t
val pkg : t -> OpamPackage.t
```
Given an opam repository commit, list all its packages.
### 2. Solver
```ocaml
type t
type key
val keys : t -> key list
val get : key -> Package.t
val incremental :
  opam:Git.Commit.t Current.t ->
  Track.t list Current.t ->
  t Current.t
```
An incremental solver, performing opam-0install solves when new packages are added. After the solving step, we obtain a list of `Package.t` which corresponds to packages and their associated universes.
### 3. Jobs
```ocaml
type t = { install : Package.t; prep : Package.t list }
val schedule : targets:Package.t list -> t list
```
From the list of packages to obtain, generate a list of prep jobs to perform. Each job consists of a single package to install and multiple packages to prep.
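The text does not say how targets are grouped into jobs. As a rough illustration only, here is one possible greedy strategy: visit the targets with the largest universe first and let each install prep everything it brings in that has not been prepped yet. The types `pkg` and `job` are hypothetical stand-ins for `Package.t` and `Jobs.t`, packages are identified by plain strings, and the strategy itself is an assumption, not necessarily what ocaml-docs-ci does.

```ocaml
module SSet = Set.Make (String)

(* Hypothetical stand-in for [Package.t]: an identifier, plus the identifiers
   of every package installed alongside it (its universe). *)
type pkg = { id : string; universe : string list }

(* Hypothetical stand-in for [Jobs.t]; [prep] holds identifiers here. *)
type job = { install : pkg; prep : string list }

let schedule ~(targets : pkg list) : job list =
  (* Largest universes first, so that one install covers many preps. *)
  let by_size =
    List.sort
      (fun a b -> compare (List.length b.universe) (List.length a.universe))
      targets
  in
  let _, jobs =
    List.fold_left
      (fun (seen, jobs) p ->
        (* Packages this install provides that no earlier job has prepped. *)
        let fresh =
          List.filter (fun id -> not (SSet.mem id seen)) (p.id :: p.universe)
        in
        if fresh = [] then (seen, jobs)
        else
          ( SSet.union seen (SSet.of_list fresh),
            { install = p; prep = fresh } :: jobs ))
      (SSet.empty, []) by_size
  in
  List.rev jobs

let () =
  let targets =
    [ { id = "c"; universe = [] };
      { id = "b"; universe = [ "c" ] };
      { id = "a"; universe = [ "b"; "c" ] } ]
  in
  List.iter
    (fun j ->
      Printf.printf "install %s, prep %s\n" j.install.id
        (String.concat " " j.prep))
    (schedule ~targets)
```

With these sample targets, installing `a` already provides `b` and `c`, so a single job is enough. In the real pipeline the identifiers would also have to carry the universe hash, since the same package version in two universes needs two prep folders.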
### 4. Build and prepare artifacts
```ocaml
type t
val package : t -> Package.t
val folder : t -> Fpath.t
val artifacts_digest : t -> string
val v :
  voodoo:Voodoo.t Current.t ->
  digests:Folder_digest.t Current.t ->
  Jobs.t Current.t ->
  t list Current.t
```
**Done via _ocluster_.**
Perform the _prep_ step for one job. It will generate the prep folders of multiple packages. The `Folder_digest.t` value makes it possible to track existing prep folders.
Prep data is stored in `/prep/universes/<universe>/<name>/<version>/`.
Should be updated when:
- voodoo-prep changes (should not happen a lot)
- the upstream `/prep` folder digest is invalidated
### 5. Compile
```ocaml
type t
val digest : t -> string
val artifacts_digest : t -> string
val is_blessed : t -> bool
val package : t -> Package.t
val folder : t -> Fpath.t
val odoc : t -> Mld.Gen.odoc_dyn
val v :
  voodoo:Voodoo.t Current.t ->
  digests:Folder_digest.t Current.t ->
  blessed:Package.Blessed.t Current.t ->
  deps:t list Current.t ->
  Prep.t Current.t ->
  t Current.t
```
**Done via _ocluster_.**
Compile .odoc, .odocl and .html files for one package, given its prep result and the compile result of its dependencies.
Output (when _blessed_, otherwise replace `packages` with `universes/<universe_id>`):
- generated index: `/compile/packages/<name>/page-<version>.odoc(l)`
- odoc, odocl: `/compile/packages/<name>/<version>/`
- html: `/html/packages/<name>/<version>/`
### 6. Indexes
```ocaml
val v : Compile.t list Current.t -> unit Current.t
```
Given the list of the successfully compiled packages, generate the index pages and compile them to HTML. Done on the _host machine_.
Output:
- `/html/packages/index.html`
- `/html/packages/<name>/index.html`
- `/html/universes/index.html`
- `/html/universes/<universe_id>/index.html`
## Voodoo-prep
Current repo: https://github.com/ocaml-doc/voodoo
The job as submitted by the pipeline will install a specific set of packages.
Once the install has completed, voodoo-prep, the binary, will be executed.
### Voodoo-prep (the tool)
Voodoo-prep is run after the build of a particular set of packages has been completed. It is run in the environment in which the build succeeded.
We iterate through all of the packages installed in the opam environment, and go through the files installed as part of each package, as recorded by `~/.opam/<switch>/.opam-switch/install/<package>.changes`. The tool collects the following types of files:
|File type|Reason|
|---------|------|
|.cmti |This is what odoc would prefer to operate on.|
|.cmt | Odoc will use cmt files for analysing usage of identifiers.|
|.cmi | Only if the above two files don't exist will odoc resort to using cmi files.|
|.mld, .md, examples, contents of `doc` opam dir| These are documentation files.|
|.cm(x)a info, dune-package, META| voodoo-do may use the info from these files to organise the documentation into libraries/subpackages.|
All of the above files are copied into the following path:
```
prep/universes/<hash>/<package>/<version>/...
```
where the `...` represents the same path under which the files appear in `~/.opam/<switch>/`. The hash, package and version are the triple that uniquely identifies a package, as described above.
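The selection and copy rule can be sketched as a pure function from a path relative to `~/.opam/<switch>/` to its destination under `prep/`. The names `is_collected` and `prep_path` are hypothetical, and the list of suffixes and file names is an assumption read off the table above; the real tool may collect more (for example the examples the table mentions). The `.cma`/`.cmxa` information is not covered here, since it is gathered separately with `ocamlobjinfo`.

```ocaml
(* Assumption: suffixes and file names taken from the table above. *)
let collected_suffixes = [ ".cmti"; ".cmt"; ".cmi"; ".mld"; ".md" ]
let collected_names = [ "META"; "dune-package" ]

let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* [rel] is a path relative to ~/.opam/<switch>/, using '/' separators. *)
let is_collected rel =
  let base = Filename.basename rel in
  List.exists (fun suffix -> Filename.check_suffix base suffix) collected_suffixes
  || List.mem base collected_names
  (* Everything in the opam `doc` directory is kept. *)
  || starts_with ~prefix:"doc/" rel

(* Destination of a collected file, or [None] if the file is ignored. *)
let prep_path ~hash ~package ~version rel =
  if is_collected rel then
    Some (String.concat "/" [ "prep"; "universes"; hash; package; version; rel ])
  else None

let () =
  match
    prep_path ~hash:"<hash>" ~package:"yaml" ~version:"2.5.0"
      "lib/yaml/yaml.cmti"
  with
  | Some dst -> print_endline dst
  | None -> print_endline "ignored"
```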
The info contained in `cmxa/cma` libraries installed as part of the package is collected as follows:
```
ocamlobjinfo <lib>.cm{a,xa}
```
The opam file is collected from `~/.opam/<switch>/.opam-switch/packages/<package_name>.<package_version>/opam`.
### Version.mld
The contents of this file will be rendered when someone visits the URL `http://docs.ocaml.org/packages/$package/$version/` and is therefore the landing page for the package as a whole. As such it needs to contain all the important info needed. It should contain:
- Name
- List of modules
:::danger
:pencil: Missing information:
- Findlib package names for every sub-package. These may not correspond exactly to the Opam name.
This is useful for copy/pasting into Dune files.
- Link to rendered README, CHANGELOG, LICENSE
- Package Dependencies (references to other packages)
- List of toplevel modules sorted by sub-packages
Which can be improved by showing modules' doc, see https://github.com/ocaml/odoc/issues/297 and https://github.com/ocaml/odoc/issues/478
Currently, sub-packages are not recognized. There is a bit of code that looks at the directory tree, but that's wrong; the goal was just to demonstrate how the output should look.
:::
The package may contain an `index.mld` file. It must be concatenated at the end of `version.mld` with its level-0 headings removed.
This way, every package page shares a common header.
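A minimal sketch of this concatenation follows. It assumes each level-0 heading sits on a single line of the form `{0 Title}` or `{0:label Title}`; headings that span several lines would need a real parser. The function names are hypothetical.

```ocaml
let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Assumption: a level-0 heading fits on one line. *)
let is_level0_heading line =
  let l = String.trim line in
  starts_with ~prefix:"{0 " l || starts_with ~prefix:"{0:" l

(* Append the package's index.mld to the generated version.mld, dropping
   the level-0 headings of the former. *)
let append_index ~version_mld ~index_mld =
  let kept =
    String.split_on_char '\n' index_mld
    |> List.filter (fun line -> not (is_level0_heading line))
  in
  String.concat "\n" (version_mld :: kept)

let () =
  print_endline
    (append_index
       ~version_mld:"{0 Package 'yaml' version 2.5.0}"
       ~index_mld:"{0 Yaml}\nParse and generate YAML files.")
```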
:::info
:bulb: The following section is experimental and is still at the prototype stage
:::
Examples follow:
```
{0 Package 'yaml' version 2.5.0}
{1 [yaml]}
{!modules: Yaml}
{1 [yaml.bindings]}
{!modules: Yaml_bindings}
...
```
:::danger
:pencil: This is the current output, it might need some improvements:
- The quotes in the title are not nice
- A lot of information is missing, see the TODO above
:::
There is one section for every sub-package, using findlib's name, containing the list of its modules. See [Detecting sub-packages](#Detecting-sub-packages) above.
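Generating a page of this shape can be sketched as follows, assuming the list of sub-packages and their toplevel modules is already known. The function name `version_mld` and its arguments are hypothetical; the markup it emits simply mirrors the example shown earlier in this section.

```ocaml
(* [libs] pairs each findlib (sub-)package name with its toplevel modules. *)
let version_mld ~name ~version (libs : (string * string list) list) : string =
  let section (lib, modules) =
    Printf.sprintf "{1 [%s]}\n{!modules: %s}" lib (String.concat " " modules)
  in
  String.concat "\n"
    (Printf.sprintf "{0 Package '%s' version %s}" name version
     :: List.map section libs)

let () =
  print_endline
    (version_mld ~name:"yaml" ~version:"2.5.0"
       [ ("yaml", [ "Yaml" ]); ("yaml.bindings", [ "Yaml_bindings" ]) ])
```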
:::danger
:pencil: It would be useful to be able to tell which subpackage contains which module, by URL/breadcrumbs
Suggested layout:
```
/packages/$package/$version/TopLevelModules/index.html
/packages/$package/$version/$subpackage/SubPackageModule/index.html
```
For example for `yaml`:
```
/packages/yaml/2.1.0/Yaml/index.html
/packages/yaml/2.1.0/Yaml/Stream/index.html
/packages/yaml/2.1.0/yaml.bindings/Yaml_bindings/index.html
/packages/yaml/2.1.0/yaml.bindings.types/Yaml_bindings_types/index.html
```
The example above makes an exception for the "main" library. This is done only if that library has the same name as the Opam package (common under Dune).
:::
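The suggested URL layout, including the exception for the "main" library, can be sketched as a small function. The name `module_url` and its labelled arguments are hypothetical; the rule it implements is the one stated in the note above: the sub-package segment is omitted when the library has the same name as the Opam package.

```ocaml
(* [module_path] is the path of the module inside the library,
   for example ["Yaml"; "Stream"] for Yaml.Stream. *)
let module_url ~package ~version ~library (module_path : string list) : string =
  (* The "main" library (same name as the Opam package) gets no segment. *)
  let library_segment =
    if String.equal library package then [] else [ library ]
  in
  "/"
  ^ String.concat "/"
      ([ "packages"; package; version ]
       @ library_segment @ module_path @ [ "index.html" ])

let () =
  List.iter
    (fun (library, path) ->
      print_endline (module_url ~package:"yaml" ~version:"2.1.0" ~library path))
    [ ("yaml", [ "Yaml" ]);
      ("yaml", [ "Yaml"; "Stream" ]);
      ("yaml.bindings", [ "Yaml_bindings" ]);
      ("yaml.bindings.types", [ "Yaml_bindings_types" ]) ]
```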
### Dependencies
:::danger
:pencil: Currently, prep doesn't collect dependencies and `assemble` computes them
:::
## Open questions
### Handling of 'special' packages:
- ocaml-secondary-compiler
- ocamlfind-secondary
These two are currently simply blacklisted and removed from universe calculations
- conf-*
These currently don't produce anything but add many extra steps. Should they be blacklisted?
- ocaml-base-compiler / ocaml
Currently Stdlib docs are under ocaml-base-compiler - we probably want them under 'ocaml' instead? or maybe elsewhere?
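Removing the two blacklisted packages from universe calculations can be sketched as a filter applied to the dependency list before it is hashed. The names `blacklist` and `filter_universe` are hypothetical; `conf-*` packages are deliberately left untouched, since whether to blacklist them is still an open question.

```ocaml
(* The two packages currently blacklisted, as listed above. *)
let blacklist = [ "ocaml-secondary-compiler"; "ocamlfind-secondary" ]

(* [deps] is a list of (name, version) pairs; blacklisted names are dropped
   so that they do not take part in the universe calculation. *)
let filter_universe (deps : (string * string) list) : (string * string) list =
  List.filter (fun (name, _version) -> not (List.mem name blacklist)) deps

let () =
  (* The version given to ocamlfind-secondary is made up for this example. *)
  filter_universe
    [ ("ocaml", "4.11.1"); ("ocamlfind-secondary", "1.8.1"); ("conf-m4", "1") ]
  |> List.iter (fun (name, version) -> Printf.printf "%s.%s\n" name version)
```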
### 'Blessed' status
## Further thoughts
:::info
:bulb: This section is experimental and is still at the prototype stage
:::
- Extra click in wrapped libraries
Some libraries have one top-level module, which has the whole library as submodules.
This module is generated by Dune and is often not very useful in the doc, it's an unordered list of modules.
We could inline it into the package page and avoid an unnecessary click.
Some packages document this module carefully and it sometimes contains types and values (for example base).
- weird packages:
- ocaml has `topdirs.cmti` in 2 places -- lib/ocaml and lib/ocaml/compiler-libs/
- ocaml-compiler-libs has many unresolved module aliases (expected because it's a namespacing package) - thus odoc compile-deps will miss many of its dependencies, which will need to come from the opam metadata (likely why odig fails to link them)