# Migrating the docs subdomain

This proposal outlines the plan to migrate the `docs.ansible.com` subdomain to Read The Docs. The plan entails creating a new subdomain to replace `docs.ansible.com` on the current web server and then migrating the CNAME record to Read The Docs. Additionally, this plan outlines specific actions to preserve SEO authority and avoid 404 errors and broken redirects.

## Motivation

There are two main driving factors behind the proposed migration of the `docs.ansible.com` subdomain:

- Winding down Red Hat managed infrastructure, including the web server and Jenkins jobs, to reduce maintenance overhead, AWS spend, and security risk.
- Creating a unified space for community documentation in the Ansible ecosystem.
  The `docs.ansible.com` subdomain signals the "official" location of trusted content for Ansible users, developers, maintainers, and contributors.
  At present this applies only to the package docs, core docs, and Ansible automation controller docs, which results in a fragmented experience for community documentation.

### Reducing managed infrastructure

At present the content available from `docs.ansible.com` is served by Apache httpd running on RHEL hosts in an AWS EC2 instance behind Cloudflare. The Ansible engineering organization at Red Hat are responsible for the associated maintenance and infrastructure cost.
In addition to the web server, there are several Jenkins jobs responsible for building and publishing documentation to the web server:

- `Docs/job/Build_Ansible_Core_Docs_Automated/`
- `Docs/job/Build_Ansible_Docs/`
- `Docs/job/Build_Ansible_Package_Docs/`
- `Docs/job/Build_Ansible_Package_Docs_Automated/`
- `Docs/job/Build_AWX_Cli_Docs/`
- `Docs/job/Build_Docsite/`
- `Docs/job/Build_Tower_Docs/`
- `Docs/job/Build_Tower_Swagger_Docs/`

Currently we have three main sets of content available from `docs.ansible.com`:

- Ansible core documentation
- Ansible package documentation
- Ansible automation controller documentation

In addition, there are several hundred HTTP redirects that prevent 404 errors for moved or renamed pages.

### Improving the docs experience

Read The Docs is a hosted documentation building and publishing platform that offers features such as preview builds for pull requests, integration with CI/CD workflows, automatic builds, [cross-project search](https://docs.readthedocs.io/en/stable/server-side-search/index.html), analytics, and more. Read The Docs offers a free solution that is available to open-source projects.

As a gesture of goodwill, the Ansible community team at Red Hat sponsor Read The Docs via a [gold membership](https://about.readthedocs.com/pricing/#/community). This also entitles us to remove ethical ads from several projects under the `ansible` namespace.
Currently all projects in the Ansible GitHub organization build and publish documentation in the `ansible` namespace on Read The Docs at: https://ansible.readthedocs.io/

- Ansible package: https://ansible.readthedocs.io/projects/ansible/latest/
- Ansible core: https://ansible.readthedocs.io/projects/ansible-core/devel/
- Ansible lint: https://ansible.readthedocs.io/projects/lint/
- AWX: https://ansible.readthedocs.io/projects/awx/en/latest/
- Galaxy: https://ansible.readthedocs.io/projects/galaxy-ng/en/latest/
- Rulebook: https://ansible.readthedocs.io/projects/rulebook/en/stable/
- And so on...

Note the `/projects/` context in the URLs. Read The Docs allows multiple documentation projects to be nested under a single namespace. These are referred to as subprojects. In the current layout of the `ansible` namespace on Read The Docs, the [landing pages](https://github.com/ansible/docsite) are the top-level project and each project, including core and the package, is added as a subproject.

Read The Docs allows projects to [share a custom domain](https://docs.readthedocs.io/en/stable/subprojects.html#sharing-a-custom-domain) so that, after the subdomain migration is complete, `ansible.readthedocs.io` will be replaced in each of the preceding URLs with `docs.ansible.com`, for example:

- Ansible package: https://docs.ansible.com/projects/ansible/latest/
- Ansible core: https://docs.ansible.com/projects/ansible-core/devel/
- Ansible lint: https://docs.ansible.com/projects/lint/

## Pre-migration tasks

This section outlines work items that must be complete before we can start the migration process.

### Agreeing on new subdomain

We need to agree on a new subdomain to replace `docs.ansible.com` on the web server. This requires gathering input and suggestions from various teams.

After the migration is complete, the web server will still provide content for Ansible automation controller versions that are supported by Red Hat.
There should be no new content published to the web server after the migration. Content that resides on the web server should be updated only in exceptional cases, such as in response to a customer support case or where content might have some negative customer impact.

For this reason the new subdomain should not imply that the content is actively maintained. At the same time the new subdomain should not imply that the content is out of date or no longer supported.

Eventually it will be appropriate to wind down the web server entirely. There is some controller version 4.5 content hosted on the web server so it will need to follow that support lifecycle at a minimum.

For the purposes of this document, we will refer to the new subdomain as `docs-archive.ansible.com`. Here are some suggestions for subdomains that we should consider:

- `docs-archive.ansible.com`
- `legacy-docs.ansible.com`
- `old-docs.ansible.com`
- `docsv1.ansible.com`
- `documentation.ansible.com`
- `guides.ansible.com`
- `help.ansible.com`

> Note that an alternative possibility is leaving the `docs.ansible.com` subdomain in place and creating a new subdomain for the Read The Docs site, such as `community-docs.ansible.com`. However this would not be the optimal result for users because the content on `docs.ansible.com` would be stale yet the SEO rankings would still be high. This would mean that users would search for Ansible content and most likely go to `docs.ansible.com` from the results but would then need to figure out they actually need to go to the new subdomain to find the latest content.

### Redirects

To facilitate the migration to Read The Docs, we need to drastically reduce the number of server-side redirects. Read The Docs imposes a limit of 100 redirects per project. At the moment, there are thousands of server-side redirects in place.
However, we can slim down the number of server-side redirects as follows:

- Consolidating pre-collections redirects
- Converting redirects to stub pages

Regardless of the migration effort, reducing the number of server-side redirects is essential maintenance. The high number of redirects we have in place now is difficult to keep up to date as modules and plugins are moved between collections and collections get deprecated from the Ansible package.

You can find more information about plans to handle redirects in the following places:

- [Consolidating server-side redirects](https://hackmd.io/NnAtqLljTJSIpVxiPkBskQ)
- [Creating redirects on Read The Docs](https://hackmd.io/3K5Es_UZQvyUTLWPmdtFxQ)
- [Using stub pages to replace redirects](https://github.com/ansible/ansible-documentation/issues/2147)

## Migration tasks

This section outlines the steps to move the `docs.ansible.com` subdomain to Read The Docs hosting.

### Creating an archive landing page

As part of moving the `docs.ansible.com` subdomain to Read The Docs, we will move the `index.html` landing page to the top-level project. It is already available at [https://ansible.readthedocs.io/](https://ansible.readthedocs.io/).

The archive landing page should:

- Briefly explain the migration to Read The Docs.
- Explain that Ansible community and Ansible core documentation will be available from `docs.ansible.com` but URLs will change due to the structure of Read The Docs subprojects.
- State that Ansible core documentation for version 2.15 and later will be available from `docs.ansible.com` after the migration. Note that we can enable builds for earlier versions in Read The Docs if necessary. It would be good to get feedback on this point.
- Explain that we will not remove any pages from the current server but will no longer actively maintain or update them. For example, https://docs.ansible.com/ansible/10/getting_started/index.html will still be available from the `docs-archive.ansible.com` subdomain.
- Provide entry points to the main pages that will be available after the migration, as follows:

  ```
  ansible-tower.html
  automation-tower-chinese-translations.html
  automation-tower-japanese-translations.html
  automation-tower-korean-translations.html
  automation-tower-prior-versions.html
  core.html
  platform.html
  ```

To create the archive landing page, we should complete the following steps:

1. Create an `archive-index.html` in the `ansible/docsite` repository. This index page should be as simple as possible and not generated from any template. It should use straightforward inline styling.
2. Update the `.htaccess` configuration to allow access to `archive-index.html`.
3. Update the docsite build job in Jenkins to rsync `archive-index.html`.

After migration, we will complete some steps to clean up the two landing page repositories.

### Configuring DNS

After we agree on a new subdomain for content on the web server, we should set it up alongside `docs.ansible.com`.

1. Create a new `A` record for `docs-archive.ansible.com` that points to the IP address of the EC2 instance.
2. Convert the `A` record for `docs.ansible.com` into a CNAME that points to `docs-archive.ansible.com`.

   This step will result in a smoother transition when we move the CNAME to Read The Docs. The move becomes a simple replacement of the CNAME value rather than a change from an `A` record to a CNAME, which means faster propagation. Likewise, if an issue arises with the migration and we need to roll back, we can quickly revert the CNAME to `docs-archive.ansible.com`.

3. Lower the Time To Live (TTL) setting for the `docs.ansible.com` record.

   This step will help the CNAME change to propagate quickly. The TTL setting tells DNS resolvers how long to cache the record before updating. We can lower the TTL to something like 60 seconds. After doing this, we wait for the duration of the original TTL setting, because DNS resolvers retain the cached record for that long. Once this waiting period is over, any new DNS queries will get the response with the shorter TTL value.
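
The waiting period described in the TTL step can be worked out with simple date arithmetic. The following is a minimal sketch, assuming a hypothetical original TTL of one hour and a hypothetical time at which the TTL was lowered; the function names are illustrative and not part of any existing tooling. Real resolvers do not always honor TTLs strictly, so treat the results as a lower bound.

```python
from datetime import datetime, timedelta, timezone


def earliest_safe_cutover(ttl_lowered_at: datetime, original_ttl_seconds: int) -> datetime:
    """Return the earliest time at which caches holding the old, longer TTL should have expired."""
    if original_ttl_seconds < 0:
        raise ValueError("TTL must not be negative")
    return ttl_lowered_at + timedelta(seconds=original_ttl_seconds)


def latest_stale_answer(cutover_at: datetime, lowered_ttl_seconds: int) -> datetime:
    """Return the time by which answers cached just before the CNAME change should expire."""
    if lowered_ttl_seconds < 0:
        raise ValueError("TTL must not be negative")
    return cutover_at + timedelta(seconds=lowered_ttl_seconds)


if __name__ == "__main__":
    # Hypothetical values: TTL lowered at noon UTC, original TTL of 3600 seconds.
    lowered_at = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    cutover = earliest_safe_cutover(lowered_at, original_ttl_seconds=3600)
    print("Earliest CNAME change:", cutover.isoformat())
    # With the TTL lowered to 60 seconds, old answers should expire a minute after the change.
    print("Old answers should expire by:", latest_stale_answer(cutover, 60).isoformat())
```

The same arithmetic applies to a rollback: with a 60 second TTL in place, reverting the CNAME should take effect for most resolvers within about a minute.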
### Configuring the web server

We need to make some changes on the web server so that the docs are available from both `docs.ansible.com` and `docs-archive.ansible.com`.

1. Configure the web server to handle requests for `docs-archive.ansible.com`. For example, create `/etc/httpd/conf.d/docs-archive.ansible.com.conf`.
2. Configure the web server to serve content for both `docs.ansible.com` and `docs-archive.ansible.com`.
3. Configure the web server to use different index files for each subdomain.

The resulting configuration should be something like the following:

```apache
<VirtualHost *:80>
    ServerName docs.ansible.com
    ServerAlias docs-archive.ansible.com
    DocumentRoot /var/www/html/docs/

    <Directory /var/www/html/docs/>
        Options Indexes FollowSymLinks
        AllowOverride All
        Require all granted
    </Directory>

    # Serve a different landing page depending on the requested host name.
    <If "%{HTTP_HOST} == 'docs.ansible.com'">
        DirectoryIndex index.html
    </If>
    <If "%{HTTP_HOST} == 'docs-archive.ansible.com'">
        DirectoryIndex archive-index.html
    </If>

    ErrorLog /var/log/httpd/docs.ansible.com-error.log
    CustomLog /var/log/httpd/docs.ansible.com-access.log combined
</VirtualHost>
```

### Moving the CNAME record

Update the CNAME record for `docs.ansible.com` to point to Read The Docs.

1. Follow the steps to [add a custom domain on Read The Docs](https://docs.readthedocs.io/en/stable/guides/custom-domains.html#adding-a-custom-domain).
   1. Enter `docs.ansible.com` as the custom domain.
   2. Select the `Canonical` option.
2. Update the DNS record for `docs.ansible.com` so that it points to `readthedocs.io`.
3. Wait for the changes to propagate and then verify the CNAME record with a tool such as `nslookup`.

## Post-migration tasks

This section outlines work items that we should complete after the `docs.ansible.com` subdomain is migrated to Read The Docs hosting.
### Tasks in Google Search Console

To help preserve the SEO authority of `docs.ansible.com`, we should perform the following steps in Search Console:

- Confirm that both `docs.ansible.com` and the new subdomain are verified.
- Use the [Change of Address Tool](https://support.google.com/webmasters/answer/9370220?hl=en) to inform Google about the change of subdomain.
- Follow steps in [Move a site with URL changes](https://developers.google.com/search/docs/crawling-indexing/site-move-with-url-changes) to monitor traffic and check for crawl errors.

### Custom sitemaps

XML sitemaps help search engines discover and index the site structure faster. Read The Docs automatically creates sitemaps; however, we should consider generating a [custom sitemap](https://docs.readthedocs.io/en/stable/reference/sitemaps.html) for the top-level project.

Additionally, we should create a new XML sitemap for the new subdomain to replace the existing ones:

- https://raw.githubusercontent.com/ansible/docsite/refs/heads/main/ansible-sitemap.xml
- https://github.com/ansible/docsite/blob/main/automation-controller-sitemap.xml

Here are some commands used to create sitemaps:

```bash
# Install Node.js and the sitemap generator CLI.
sudo dnf install nodejs
sudo npm install -g sitemap-generator-cli

# Crawl each docs tree and write the sitemap to the given file.
sitemap-generator -f ansible-sitemap.xml https://docs.ansible.com/ansible/latest/
sitemap-generator -f automation-controller-sitemap.xml https://docs.ansible.com/automation-controller/latest/
```

### Updating internal links

Even though we will have redirects in place that should automatically point to the updated `docs.ansible.com` URLs, we should ensure that as many links are updated as possible. Updating internal links should help both users and search engines navigate the new structure without relying on redirects.

> Links that point to automation-controller content on `docs.ansible.com` are likely to be broken.
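
Batch updates of internal links could be scripted. The following is a minimal sketch, assuming that old `/ansible/` and `/ansible-core/` paths map directly onto the `/projects/ansible/` and `/projects/ansible-core/` subproject paths shown earlier in this proposal. That mapping, along with the `PREFIX_MAP` and `rewrite_links` names, is an assumption for illustration and should be checked against the final redirect plan.

```python
import re

# Assumed mapping from old top-level paths to Read The Docs subproject paths.
PREFIX_MAP = {
    "ansible": "projects/ansible",
    "ansible-core": "projects/ansible-core",
}

# Match only URLs whose path starts with one of the old prefixes.
# URLs that already use /projects/ or point to other content do not match.
OLD_LINK = re.compile(r"https?://docs\.ansible\.com/(ansible-core|ansible)/")


def rewrite_links(text: str) -> str:
    """Rewrite old docs.ansible.com links in text to the assumed subproject layout."""

    def _replace(match: re.Match) -> str:
        return f"https://docs.ansible.com/{PREFIX_MAP[match.group(1)]}/"

    return OLD_LINK.sub(_replace, text)


if __name__ == "__main__":
    sample = "See https://docs.ansible.com/ansible/latest/index.html for details."
    print(rewrite_links(sample))
```

Links to automation-controller content are left untouched by this sketch because their destination depends on the subdomain that is eventually chosen for the web server.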

Here are some places where we should scan for `docs.ansible.com` URLs and make batch updates where necessary:

- `ansible/ansible-documentation`
- `ansible/ansible`
- `ansible/aap-docs`

### Separating the landing pages

Landing pages refer to the top-level pages that guide users to relevant parts of the documentation. After the migration there will be two landing pages:

- `docs.ansible.com` on Read The Docs and sourced from the `ansible/docsite` repository
- `docs-archive.ansible.com` on the web server and sourced from the `ansible/archive-docsite` repository

#### Updating the archive landing page

We should complete the following steps to modify the landing page for `docs-archive.ansible.com`:

1. Temporarily disable the Jenkins job to build the docsite.
2. Fork the `ansible/docsite` repository to `ansible/archive-docsite`.
3. Rename `archive-index.html` to `index.html`.
4. Create a new standalone 404 page with the cowsay image.
5. Remove the following files and folders:

   ```
   ├── ansible/
   ├── data/
   ├── requirements/
   ├── sass/
   ├── static/css/
   ├── static/images/community_logo.svg
   ├── static/js/
   ├── templates/
   ├── .pip-tools.toml
   ├── .readthedocs.yaml
   ├── ansible-sitemap.xml
   ├── build.py
   └── noxfile.py
   ```

6. Update `.htaccess` to use the standalone 404 page.
7. Update `.htaccess` to remove redirects that applied to the `docs.ansible.com` subdomain.
8. Update the catch-all redirect in `.htaccess` at https://github.com/ansible/docsite/blob/c02fae53bbfae3b296f38b1b04b7639d3431b98a/.htaccess#L11
9. Update `robots.txt` to disallow the `/ansible` and `/ansible-core` directories.
10. Update `robots.txt` to modify the sitemaps. Remove `ansible-sitemap` and update the subdomain for `automation-controller-sitemap`.
11. Update `automation-controller-sitemap.xml` to reflect the change of the subdomain.
12. Update the Jenkins job to prune deleted files and folders from the rsync step.
13. Update the Jenkins job to clone the `ansible/archive-docsite` repository instead of `ansible/docsite`.
14. Enable the Jenkins job to build the docsite and run it.
15. Modify the web server configuration to serve content for `docs-archive.ansible.com` only and to use the `index.html` file.

#### Updating the landing page on Read The Docs

We should update the `docs.ansible.com` landing pages to put the focus on the content journeys.

1. Remove the following files and folders:

   ```
   ├── ansible/
   ├── templates/ansible-prior-versions.html
   ├── templates/ansible_community.html
   ├── templates/automation-tower-*.html
   ├── templates/core-translated-ja.html
   ├── templates/core.html
   ├── templates/platform.html
   ├── .htaccess
   ├── ansible-sitemap.xml
   ├── automation-controller-sitemap.xml
   └── robots.txt
   ```

2. Update templates as appropriate to reflect the changes.
3. Update the `ecosystem.html` page to ensure links to Read The Docs projects are correct.
4. Ensure there is a link to the ecosystem page on the index.
5. Add redirects to the top-level Read The Docs project for the deleted `templates/*` pages.

> As a future enhancement, we should consider building the `docs.ansible.com` landing pages with Sphinx. This would allow us to make better use of the Read The Docs widget that provides cross-project search.

#### Cleaning up the web server

After we are certain that the migration is a success and we won't roll back any changes, we can do some cleanup on the web server.

- Add `.htaccess` rules so that all `ansible/*` pages redirect to Read The Docs.
- Remove the `/var/www/html/docs/ansible` directory.
- Remove all landing pages except for `index.html`.
- Remove all filesystem content for `docs.testing.ansible.com` including associated Jenkins jobs and configuration.
- Decommission all Jenkins jobs to build core and community docs.

After the migration, we can also decide whether it's worth moving any remaining content for automation controller to an S3 bucket or other location.