Pulp "Unified Docs" Project - Part 1


The goal of this project is to create unified, well-structured documentation for the Pulp Project.

This document is organized as follows:

  • Content Design: structuring content systematically
  • Presentation Layer: presenting aggregated content in a website hierarchy
  • Implementation: choosing and implementing the tech
  • Publish Process: releasing/publishing the build
  • Content Migration: migration priorities and deliverables (ongoing)
  • Future Work: ideas for future work (ongoing)

Content Design


The goal of this project is to have unified docs. To aggregate content from various sources and build something reasonable, we need some basic standardization.

Because of that, we've chosen to adopt Diataxis as the basis for such a standard. It defines a user-driven approach that outlines 4 content types, which we'll use as our basic documentation content types.

Furthermore, we'll define personas, repository-types and a base folder structure, and how that structure translates into the presentation layer (the final website hierarchy, which doesn't have to match the base folder structure).

Here is a schematic of how these should work together:

    A: **persona** X **content-type**
    B: **folder structure**
    C: **presentation layer**

    B -- strictly based on --> A
    C -- flexibly based on --> A
    C -- predictably fetches from --> B


We've identified three different user profiles for our product. These profiles will guide how we'll create and organize content.

  • User: "I just want to create, sync and publish repositories"
  • Admin: "I need to get this instance configured and keep it running"
  • Dev: "I need to add features, troubleshoot and fix bugs"

Content Types

For each persona, we'll use the 4 Diataxis categories:

  • Tutorial: learning-oriented phase
    • We begin by learning, and learning a skill means diving straight in to do it - under the guidance of a teacher, if we’re lucky. Example: DRF Tutorial.
  • Guide: goal-oriented phase
  • Reference: information-oriented phase
    • As soon as our work calls upon knowledge that we don’t already have in our head, it requires us to consult technical reference. Example: Pytest Reference.
  • Learn: explanation-oriented phase
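
As a rough illustration, the persona x content-type matrix can be enumerated with a small loop. This is only a sketch: the function name is hypothetical, and the folder names are assumed to match the staging_docs layout used in the setup commands of this document.

```shell
# Print every persona/content-type combination (3 personas x 4 types = 12 slots).
# list_doc_slots is a hypothetical helper, not part of any Pulp tooling.
list_doc_slots() {
  local persona content_type
  for persona in user admin dev; do
    for content_type in tutorials guides reference learn; do
      printf '%s/%s\n' "$persona" "$content_type"
    done
  done
}

list_doc_slots
```

Each printed line corresponds to one place where a piece of documentation can live.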

Plugin Ecosystem

Pulp Project has a complex set of plugins. We've categorized them to help organize content presentation.

  1. content-repos: Content repositories that are shipped within OCI-images.
    • Contains: ansible, container, deb, file, maven, python, rpm, ostree, certguard
    • Note: Give special care to regular users (the User persona).
  2. general-repos: Repositories that are neither content plugins nor pulpcore.
    • Contains: oci-env, oci-images, operator, cli, glue, k8s-resources, squeezer, selinux, openapi-generator
    • Note: More advanced or general-purpose tooling (e.g. oci-env, cli).
  3. pulpcore-repo: The pulpcore repo only.
    • Contains: pulpcore
    • Note: Common and overview content.
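
The categorization above could be expressed as a simple lookup, for example when a build script needs to decide how to treat a repository. This is a minimal sketch: the function name and the "unknown" fallback are assumptions, and the names are used exactly as listed above.

```shell
# Map a repository name (as listed above) to its repository-type.
# repo_type and the "unknown" fallback are illustrative, not existing tooling.
repo_type() {
  case "$1" in
    ansible|container|deb|file|maven|python|rpm|ostree|certguard)
      echo content-repos ;;
    oci-env|oci-images|operator|cli|glue|k8s-resources|squeezer|selinux|openapi-generator)
      echo general-repos ;;
    pulpcore)
      echo pulpcore-repo ;;
    *)
      echo unknown ;;
  esac
}

repo_type rpm        # prints "content-repos"
repo_type oci-env    # prints "general-repos"
```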

Folder Structure

In this project, we need to aggregate content from different repositories into a single build.

To make this work, each repository should know what content to create or update, so we've defined a folder structure for each repository-type.

Note that this does not mean every one of these folders must be populated, only that they can be. Later on we'll lay out a priority plan of deliverables to guide this.

1. All repositories

CHANGES.md # expected to be here

[Changed in Feb/24] Removed the global "reference" folder; reference content is now organized like the other content types.

2. Pulp-Docs

    (...) # same as other +
    index.md # landing page
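
Since not every folder has to be populated, it can help to see which ones currently hold content. The sketch below counts markdown files per persona/content-type folder. It assumes the staging_docs/{persona}/{content-type} layout created by the setup commands in this document, and the function name is hypothetical.

```shell
# For each persona/content-type folder, print how many markdown files it holds.
# Missing folders are reported with a count of 0.
report_populated() {
  local root="${1:-staging_docs}" persona content_type dir count
  for persona in user admin dev; do
    for content_type in tutorials guides reference learn; do
      dir="$root/$persona/$content_type"
      count=0
      if [ -d "$dir" ]; then
        # Only *.md files are counted, so .gitkeep placeholders are ignored.
        count=$(find "$dir" -type f -name '*.md' | wc -l | tr -d ' ')
      fi
      printf '%s\t%s/%s\n' "$count" "$persona" "$content_type"
    done
  done
}

report_populated staging_docs
```

Folders reported with 0 are the "voids" that a priority plan would need to address.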

Presentation Layer

There are many ways in which this "Content-types x Personas" matrix can be arranged into a concrete documentation website (see complex hierarchies).

The first live demo was this. The one currently in use is this.

The current demo structure can be summarized in plain text as below. Other ideas on organizing the content in a hierarchy are very welcome.

User Manual
        Overview # optional
            Overview # optional
            Overview # optional
Admin Manual
    (...) # same as previous
Developer Manual           
    (...) # same as previous
    Get Involved
    Documentation Usage/


Implementation

The following choices were based primarily on popularity/maintainability:

For the aggregation, we started with mkdocs-multirepo-plugin, but after some experimentation it proved not flexible enough to map content from the proposed content-type structure to the presentation layer structure.

Because of that, we've developed pulp-docs, a Python package that helps build and serve the docs. It uses a mkdocs-macros-plugin hook to inject a preparation script into the mkdocs processing workflow. Additionally, it should help develop and distribute future doc tooling and automation across all repos, such as preconfigured linting, while also making it easier to run the docs locally without a full oci-env setup.

Publish/Release Process

The publishing workflow is defined in pulpcore and should be configured to run nightly and publish to https://staging-docs.pulpproject.org/.

Content Migration


Here's a recommended workflow:

  1. Copy the existing docs to a folder named tmp_docs
  2. Convert all of tmp_docs to markdown all at once using rst2myst
  3. Check in tmp_docs to version control so you can work on it over time
  4. Move files and/or sections from tmp_docs to staging_docs
  5. When tmp_docs is empty, the work is done and it can be removed
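
Steps 4 and 5 can be tracked with a small helper that lists what is still waiting in tmp_docs. This is only a sketch under the assumption that "empty" means "no files left" (leftover empty directories are ignored); both function names are hypothetical.

```shell
# List the files still waiting to be migrated (prints nothing if the
# folder is missing or holds no files).
remaining_docs() {
  local dir="${1:-tmp_docs}"
  if [ -d "$dir" ]; then
    find "$dir" -type f | sort
  fi
}

# Succeeds when no files are left in the folder.
migration_done() {
  [ -z "$(remaining_docs "${1:-tmp_docs}")" ]
}

if migration_done tmp_docs; then
  echo "tmp_docs is empty: migration finished"
else
  echo "still to migrate: $(remaining_docs tmp_docs | wc -l | tr -d ' ') file(s)"
fi
```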


Here are some helpful commands to set up the repository:

# 1. create structure
mkdir -p staging_docs/{admin,user,dev}/{guides,tutorials,learn,reference}
touch staging_docs/{admin,user,dev}/{guides,tutorials,learn,reference}/.gitkeep
cp -r docs tmp_docs

# 2. convert rst -> markdown (optional)
# quoting keeps shells such as zsh from expanding the brackets
pip install "rst-to-myst[sphinx]"
find tmp_docs -name '*.rst' -exec rst2myst convert --replace-files {} ';'

# 3. extra automated cleaning (uses GNU sed's in-place -i flag)
# DOCDIR was not defined in the original commands; tmp_docs is assumed here.
DOCDIR=tmp_docs

# convert MyST admonition openers to mkdocs admonitions
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/:::{note}/!!! note/' {} ';'
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/:::{tip}/!!! tip/' {} ';'
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/:::{warning}/!!! warning/' {} ';'
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/:::{hint}/!!! tip/' {} ';'
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/:::{glossary}//' {} ';'
# strip inline roles; the g flag handles several occurrences on one line
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/{term}//g' {} ';'
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/{github}//g' {} ';'
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/{ref}//g' {} ';'
find "$DOCDIR" -type f -name "*.md" -exec sed -i -re 's/^\([a-zA-Z-]+\)=$//' {} ';' # remove (something)=
find "$DOCDIR" -type f -name "*.md" -exec sed -i -re 's/`([a-zA-Z ]*)<\w*>`/`\1`/g' {} ';' # remove `some thing <RemoveThis>`
find "$DOCDIR" -type f -name "*.md" -exec sed -i -e 's/:::$//' {} ';' # remove closing :::

(Taken from this gist)
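
To see what the admonition substitutions do, here is a small sketch that applies a subset of them to sample text on stdin instead of editing files in place. The function name and the sample text are made up for illustration.

```shell
# Apply a subset of the cleanup substitutions to stdin (no in-place editing).
# convert_admonitions is a hypothetical helper for experimenting safely.
convert_admonitions() {
  sed -e 's/:::{note}/!!! note/' \
      -e 's/:::{tip}/!!! tip/' \
      -e 's/:::{warning}/!!! warning/' \
      -e 's/:::{hint}/!!! tip/' \
      -e 's/:::$//'
}

# The opener becomes "!!! note" and the closing ":::" becomes an empty line.
printf '%s\n' ':::{note}' 'Remember to sync first.' ':::' | convert_admonitions
```

Note that these substitutions only rewrite the marker lines. The admonition body is left unindented, so it still needs to be indented by four spaces by hand for mkdocs to render it inside the admonition.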

The motivation for having tmp_docs is to make it easier to determine which files were already migrated and what remains. If there is not much content in the repo, it may not be necessary.


Some guidance on choosing the right place for a piece of content:

  • Tutorial or How-to Guide?
    • A tutorial’s purpose is to help the pupil acquire basic competence.
    • A how-to guide’s purpose is to help the already-competent user perform a particular task correctly.
    • Read more here
  • Admin or User?
    • Generally, if it is related to an API it's for a user. "Related to" includes the API itself or accessing it via the CLI, bindings, etc.
      • There are some exceptions, specifically, the following are probably APIs that are for admins:
        • Access Policies, Groups, Users, Signing Services, Importers, Exporters, Repair.
    • All other info is for admins, which likely includes topics like:
      • Installation, Upgrading, Configuration (anything related to settings), Logging, Custom Authentication e.g. LDAP or django-social, architecture, performance, tuning, monitoring, OTEL, etc.

Markdown style guide

MkDocs provides a superset of features over plain markdown.

Here we'll mention only fundamental decisions that should be followed. For more extensive reference on other markdown "components", see the live cheatsheet.

  • use custom absolute links (with mkdocs-site-urls plugin):
    this is a [link](site:{repo}/docs/{persona}/{content-type}/page.md).
  • see tradeoffs of absolute vs relative here
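
When migrating, it may be useful to find links that do not yet follow the site: convention. The sketch below is a rough, line-based heuristic: the function name is hypothetical, and the default staging_docs folder is an assumption taken from the setup commands in this document.

```shell
# List markdown links to .md pages that do not use the "site:" scheme.
# Line-based heuristic: any line containing a site: or http(s) link is
# skipped entirely, so treat the output as a starting point, not an audit.
find_non_site_links() {
  # /dev/null makes grep print file names even when only one file is found.
  find "${1:-staging_docs}" -type f -name '*.md' \
    -exec grep -nE '\]\([^)]*\.md' /dev/null {} + |
    grep -vE '\]\((site:|https?://)'
}

# Usage: find_non_site_links staging_docs
```

Each output line has the form file:line:content, which makes it easy to jump to the link and rewrite it.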

File includes

We won't support file includes for now. Move their content to markdown files.

ggainey's notes

Future work

  • every plugin should include an "upload content to a repo" example
  • every plugin should have "upload naked artifacts and turn into content" example, in the admin/guide (admin/reference?) section

Q1: Increase the adoption of Pulp upstream by improving the documentation for new and existing users.

  • Create a unified documentation site
  • Make contributing to the documentation easier
  • Systematically structure the documentation content

Q2 - Proposal: Increase the adoption of Pulp upstream by improving the documentation for new and existing users.

  • Make the staging-docs stable and production ready
  • Content improvement and de-duplication
    • fill in the "voids" in our pulpcore staging-docs (eg, "Getting Started")
    • start migrating content from pulpproject.org into the staging-docs hierarchy
      • possibly use this content to fill the voids above
    • Ultimately - replace current pulpproject.org with the output of staging-docs efforts
  • Add base CI and automation infrastructure for auto-build/test/publish of new-docs-location
    • doc-only changes should run only docs-CI (i.e., not full code test suites)
    • set up the docs-CI (e.g., test builds)
    • pipeline needs to account for "plugin Foo releases, docs-site rebuilds"