# Zero-Downtime Policy for Migrations
###### tags: `PulpCon 2020` `Migrations`
## Motivation
* Downtime is not desirable in a lot of environments.
* Clustered installations require that all parts of the cluster are stopped before any server is upgraded. This extends the downtime of an upgrade.
* Deployments on Kubernetes that are managed by the Pulp operator need to support a mixture of containers running pulpcore 3.y and 3.y+1.
## What does the upgrade process look like now?
1) Stop all services
1) Upgrade code
1) Run migrations
1) Start all services
## What does the zero-downtime upgrade process look like?
When upgrading from 3.y.z to 3.y+1, the following is possible:
1) Upgrade code
1) Run migrations
1) Restart all services (in a cluster this would be a "rolling restart")
## How can we get there?
* Plugin writers need documentation on how to introduce gradual changes that allow users to upgrade their Pulp infrastructure one part at a time.
* Changes to the database schema need to be split into two migrations:
    * The first migration, delivered with 3.y+1, adds the new column that the 3.y+1 code will use.
    * The second migration, delivered with 3.y+2, removes the old column, which neither 3.y+1 nor 3.y+2 needs.
* CI needs to check that migrations are compatible with the latest 3.y.z release.
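
The two-migration split above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module rather than Pulp's actual database or Django migrations, and the table name `repo`, the columns `description` and `summary`, and the helper functions are all hypothetical. It also assumes that 3.y+1 code writes both columns during the transition, which is one possible strategy and not something these notes prescribe.

```python
import sqlite3

# Hypothetical schema as it exists in 3.y: only the old column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repo (id INTEGER PRIMARY KEY, description TEXT)")


def old_code_insert(conn, text):
    """3.y code: only knows about the old column."""
    conn.execute("INSERT INTO repo (description) VALUES (?)", (text,))


def old_code_read(conn):
    """3.y code: reads the old column only."""
    return [r[0] for r in conn.execute("SELECT description FROM repo ORDER BY id")]


def new_code_insert(conn, text):
    """3.y+1 code (assumed strategy): write both columns so 3.y readers still work."""
    conn.execute(
        "INSERT INTO repo (description, summary) VALUES (?, ?)", (text, text)
    )


def new_code_read(conn):
    """3.y+1 code: prefer the new column, fall back to rows written by 3.y."""
    rows = conn.execute(
        "SELECT COALESCE(summary, description) FROM repo ORDER BY id"
    )
    return [r[0] for r in rows]


# First migration (ships with 3.y+1): add the new column as nullable,
# so inserts from 3.y code that omit it keep succeeding.
conn.execute("ALTER TABLE repo ADD COLUMN summary TEXT")

# During a rolling restart, both versions run against the same schema.
old_code_insert(conn, "written by 3.y")
new_code_insert(conn, "written by 3.y+1")

old_view = old_code_read(conn)
new_view = new_code_read(conn)

# The second migration (ships with 3.y+2) would drop `description`
# once no running code reads or writes it.
```

Because the new column is nullable and the old column stays in place, neither version fails while the cluster runs mixed code.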
## Resources
* https://pypi.org/project/django-pg-zero-downtime-migrations/
* CI checks to look for risky migrations https://wxweekly.com/zero-downtime-deploys-a-tale-of-django-migrations-7a040f425e4a
* Pulp ticket https://pulp.plan.io/issues/7120
## Next steps
* Add CI that looks at our migrations and provides analysis similar to the article on wxweekly.com [dkliban]
* Send out an email to pulp-dev list announcing the plan to start working toward this goal [dkliban]
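
The CI analysis in the first step could start as a simple static scan of migration files. The sketch below is an assumption about what such a check might look like, not the approach from the linked article or an existing Pulp tool. It parses a migration module with Python's `ast` module (so Django does not need to be installed) and flags Django operations that would break code from the previous release. The `RISKY_OPERATIONS` set, the function name, and the example migration are hypothetical.

```python
import ast

# Assumption: these Django migration operations remove or rename something
# that code from the previous release may still reference.
RISKY_OPERATIONS = {"RemoveField", "DeleteModel", "RenameField", "RenameModel"}


def find_risky_operations(migration_source):
    """Return a sorted list of (line number, operation name) for risky calls."""
    findings = []
    for node in ast.walk(ast.parse(migration_source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both `migrations.RemoveField(...)` and a bare `RemoveField(...)`.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name in RISKY_OPERATIONS:
            findings.append((node.lineno, name))
    return sorted(findings)


# Hypothetical migration that adds a column and drops another in one release.
EXAMPLE = '''
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [("core", "0001_initial")]
    operations = [
        migrations.AddField("repository", "summary", models.TextField(null=True)),
        migrations.RemoveField("repository", "description"),
    ]
'''

findings = find_risky_operations(EXAMPLE)
for lineno, name in findings:
    print(f"line {lineno}: {name} may break code from the previous release")
```

A CI job could run such a scan over migrations added since the latest 3.y.z release and fail, or require an explicit override, when anything is flagged.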