# Collaborative workflows for content development

###### tags: `Documentation`

Upstream and downstream paths for Ansible technical content are disconnected. This document proposes a tooling solution that brings the content sets together and positions teams for more efficient collaboration.

This proposal aims to increase the velocity of writing teams by:

- Opening up the content sets
- Abstracting complexity from workflows through automation
- Reducing duplicate efforts
- Increasing content visibility so that issue triage and ownership identification are easier

## Downstream is part of the community too

Collaboration is a circular flow, not a one-way stream. Folks in the community share ideas, contribute code, fix bugs, and drive innovation. Red Hat cross-functional teams (including support, QE, technical marketing, field enablement, solution architects, and so on) bring a tremendous amount of knowledge and expertise with production systems that they can contribute back to the community.

**How do we open up the content and bring community and downstream together?**

![](https://i.imgur.com/M61iDXZ.png)

### Asciidoc and downstream publishing workflows

Let's take a look at how technical content gets published to `access.redhat`.

![](https://i.imgur.com/myJyKZo.png)

Red Hat technical documentation is sourced in asciidoc, which is a markup language similar to RST or markdown. Asciidoc is a sensible choice for a large enterprise content set. It's structured and easily transforms to DocBook XML. You can think of asciidoc as a sort of flat-text "front end" to DocBook. In fact, the CCS writing teams use modular docs based on the [DocBook specification for assemblies](https://tdg.docbook.org/tdg/5.2/ch06.html).

All asciidoc content for Red Hat technical documentation is sourced in GitLab. A Jenkins job polls GitLab and generates XHTML that goes out to `access.redhat`.
## The AAP docs repo

The [ansible/aap-docs](https://github.com/ansible/aap-docs) repo should be the primary source for Ansible Automation Platform documentation.

To facilitate direct access between cross-functional teams and open up the content sets:

- CCS writing teams should migrate from the [insights](https://github.com/RedHatInsights/red-hat-ansible-automation-platform-documentation) repo
- Automation Controller technical content should be migrated from the [product-docs](https://github.com/ansible/product-docs) repo

To connect the entire Ansible content set with the downstream publishing workflows:

- Ansible playbooks convert RST to ADOC with Pandoc
- Ansible playbooks convert MD to ADOC with Kramdoc

![](https://i.imgur.com/a6a9Csn.png)

Guiding principles:

- All conversion playbooks are available to the community. Take an upstream-first approach, extend trust, and bring the power of the community with us.
- RST to ADOC should be a **temporary solution** to a bottleneck that blocks collaboration. Ansible engineering should work towards achieving portable content while the downstream publishing toolchain folks prevent bottlenecks from things that are too implementation specific.
- It should be possible for anyone to run a conversion job through a GitHub action or by executing automation jobs locally.
- Conversion playbooks should be reusable and interchangeable where possible.
- Conversions should use Pandoc for RST to ADOC. Pandoc is widely used, well maintained, and the de facto standard for workflows that use different input and output formats for content. Pandoc is an open-source utility available under the GNU GPLv2 license.
- Conversions should use Kramdoc for MD to ADOC. Kramdoc is part of the Asciidoctor toolchain and offers the best possible conversion of MD to ADOC. Kramdoc is an open-source utility available under the MIT License.
- There should be as little post-conversion scripting as possible.
  We should not use post-conversion tasks as a way to bypass improvements or changes that could be integrated into the source content. When the team observes an issue with the converted content, we should always evaluate how to fix it in the source and get a cleaner conversion.

### Synchronizing content downstream

To get content into GitLab, we synchronize it with Jenkins. Jobs use the `docs-bot` service account to copy asciidoc files, images, and other content from the ansible/aap-docs source repository to GitLab.

![](https://i.imgur.com/YZtcSj0.png)

Using Jenkins centralizes the management of GitHub to GitLab details. Each job specifies the target branch in a declarative repository, which provides a central place for maintenance.

Synchronization jobs run a simple bash script on a cron schedule. Scripts set up folders and copy content from the source to the target using common commands such as `mkdir`, `cp`, and `rm`. Scripts should not modify content as part of a synchronization job. Likewise, scripts should not be embedded in the Jenkins configuration. All scripts should reside in the ansible/aap-docs repository and be invoked during the job.

## Future endeavors

Here are some considerations for carrying this effort into the future.

### Content portability

Sphinx is a pretty awesome set of tools for documentation. Project tooling is an engineering decision, and engineering teams are the ultimate owners of their documentation. At the same time, docs should be portable across projects. Docs must flow.

* Keep content simple. Avoid brittleness. Eliminate complexity. (More on this below.)
* Gain a larger, more cohesive doc set that is easier to maintain.
* Lower the barrier to entry for collaboration. Docs should provide a comfortable and easy starting point for project contributions.

**Code should read like documentation. Documentation shouldn't read like code.**

BTW no one is picking on Sphinx.
What CCS refers to as "modular docs" is a bottleneck because it is too implementation specific. Content portability is a major issue at Red Hat. Docs are not currently flowing.

### Reducing content complexity

Simplicity is the best way to convert content between formats reliably and consistently. We should establish a set of standards and create lint rules based on them to reduce the complexity of our content sets. Aiming for a simplified content set also lowers the barrier for community contributions.

#### Internal cross-references

There are two types of internal cross-reference: intradocument (local xrefs) and interdocument (across the content set). Sphinx does a great job of handling both types of cross-references, but its cross-reference markup isn't easily portable.

The pipeline includes a post-conversion step to build Asciidoc `xref` from Sphinx `ref`. There is then an `xref` > `link` conversion on top to compensate for the fact that Asciidoc doesn't handle interdocument cross-references well.

We need to establish guidelines for consistent cross-reference markup in RST to reduce maintenance overhead.

#### Ventilated prose

One sentence per line in RST, MD, or ADOC source.

- Prevents changes at the start of a paragraph from repositioning the remaining lines.
- Makes it easy to swap sentences.
- Makes it easy to separate or join paragraphs.
- Lets you comment out sentences or add commentary.
- Makes it easier to spot sentences that are too long.
- Makes it easier to spot redundant (mundane) writing patterns.

### Accessibility

We should make every effort to conform with accessibility guidelines so that our content set is open and inclusive to all. Content should be audited to ensure it is accessible to those who rely on assistive technology or are neurodivergent.

### Tooling

We need a tooling commitment for things like bccutil. The Controller docs team needs training on how to do the ADOC to DocBook transformation. We should collaborate more with the DXP engineering team on the tooling side.
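As one illustration of the kind of lint rule proposed under "Reducing content complexity", the ventilated prose convention can be checked mechanically. The following is a minimal Python sketch, not an existing project tool. The function name `find_multi_sentence_lines` and the sentence-break heuristic are assumptions made for this example, and the heuristic reports false positives on abbreviations such as "Dr. Smith".

```python
import re

# Heuristic (an assumption for this sketch): sentence-ending punctuation,
# an optional closing quote or bracket, whitespace, then an uppercase letter
# or opening quote/bracket suggests a second sentence on the same line.
_SENTENCE_BREAK = re.compile(r'[.!?]["\')\]]?\s+(?=[A-Z"\'(\[])')


def find_multi_sentence_lines(text):
    """Return (line_number, line) pairs for lines that appear to hold more than one sentence."""
    offenders = []
    in_fence = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Toggle on Markdown code fences and skip everything inside them.
        if stripped.startswith("```"):
            in_fence = not in_fence
            continue
        # Skip blank lines and Markdown headings.
        if in_fence or not stripped or stripped.startswith("#"):
            continue
        if _SENTENCE_BREAK.search(stripped):
            offenders.append((number, line))
    return offenders
```

A check like this could run in CI against changed files and report the line numbers it returns, leaving the decision to reflow the text with the author.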
#### Direct paths for RST and MD content

RST to ADOC to DocBook XML is not the most efficient path. Why do another intermediary transform? There needs to be further engagement with the DXP engineering team on a direct path to `access.redhat` for formats other than asciidoc. Rather than being specific to any implementation (this includes Sphinx), docs should be portable.

There is also a need for tooling to handle API reference content generated from source, such as Swagger artifacts and Python docstrings. More info here: https://hackmd.io/gJc2aIGbTZSgSai-xPjM4g

Note that it is possible to convert RST directly to DocBook XML. See https://github.com/Abstrys/rst2db. The one advantage of going from RST to ADOC is that downstream writing teams are familiar and comfortable with asciidoc. Going directly to DocBook XML bypasses collaboration. Plus, the conversion pipelines should be a **temporary** solution.
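The conversion playbooks themselves are not shown in this document. As a rough illustration of the two conversions they wrap, the Python sketch below builds the command line for one source file: Pandoc for RST and Kramdoc for MD. The function name `build_conversion_command` and the choice to write the output next to the source are assumptions made for this example, and the real playbooks may pass different options.

```python
from pathlib import Path


def build_conversion_command(source):
    """Return the argv list that converts one RST or MD file to ADOC.

    Hypothetical helper for illustration; it only builds the command
    and does not run it.
    """
    src = Path(source)
    # Assumption: the converted file sits beside the source with an .adoc suffix.
    target = src.with_suffix(".adoc")
    suffix = src.suffix.lower()
    if suffix == ".rst":
        # pandoc: -f input format, -t output format, -o output file
        return ["pandoc", "-f", "rst", "-t", "asciidoc", "-o", str(target), str(src)]
    if suffix == ".md":
        # kramdoc: -o output file
        return ["kramdoc", "-o", str(target), str(src)]
    raise ValueError(f"unsupported source format: {src.suffix!r}")
```

With Pandoc and Kramdoc installed, the returned list could be passed to `subprocess.run(command, check=True)`. Keeping command construction separate from execution makes the mapping from source format to converter easy to test without either tool present.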