# Azure to OVH migration

To migrate an app from Azure to OVH:

1. update [infra-startup](https://gitlab.fabrique.social.gouv.fr/infra/infra-startup) to set the ovh-storage secrets on the OVH cluster, [see the CNPG backups PR](https://gitlab.fabrique.social.gouv.fr/infra/infra-startup/-/merge_requests/353)
2. update Kontinuous:
   - enable ovh in prod (kontinuous `config.yaml`)
   - configure WAL recovery in kontinuous and rename the backups, [see PR](https://github.com/SocialGouv/ozensemble/pull/509/files)
   - same for the other CNPG clusters (metabase, etc.)
3. shut down the Azure prod (scale down deployments + suspend cronjobs, see the sketch under Scripts below)
4. deploy the prod on OVH (also scaled down + suspended)
5. validate the CNPG content:
   - the cluster performs its full recovery (7 min on oz)
   - list of tables
   - quick check of table sizes (see the table-check sketch under Scripts below)
   - check that the CNPG backups land in the new subfolder of the prod backups bucket
6. create a PVC in manifests that ends up in the prod namespace
7. deploy a debug pod that mounts the PVC with the rclone docker image
   - azure env variables
   - rclone sync azure -> local PVC
8. turn everything back on (deployments and cronjobs) and wait for the app to be up
9. azure: delete the ingress
10. azure: suspend cronjobs and scale down
11. azure: hibernate CNPG

## Dev and preprod

1. enable the OVH extends in the kontinuous config
2. agree with the devs on the preprod interruption
3. run the whole prod migration
4. tell the devs that the URLs have changed (.ovh.)

## Scripts

### new dump/restore SQL

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: psql
  name: psql
spec:
  containers:
    - command:
        - tail
        - -f
      image: ghcr.io/socialgouv/docker/psql
      imagePullPolicy: Always
      name: psql
      env:
        - name: PGHOST
          valueFrom:
            secretKeyRef:
              name: pg-superuser
              key: host
        - name: PGPORT
          valueFrom:
            secretKeyRef:
              name: pg-superuser
              key: port
        - name: PGUSER
          valueFrom:
            secretKeyRef:
              name: pg-superuser
              key: user
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: pg-superuser
              key: password
        - name: PGDATABASE
          # value: preprod
          value: app
```

- from (dev)

```bash
kubectl -n domifa-preprod apply -f psql.yaml
kubectl -n domifa-preprod exec -it psql -- pg_dump --clean --if-exists --no-owner --no-privileges --quote-all-identifiers --format=custom -f /tmp/backup.dump
kubectl cp domifa-preprod/psql:/tmp/backup.dump backup.dump
```

- to (ovh-dev)

```bash
kubectl -n domifa-preprod apply -f psql.yaml
kubectl cp backup.dump domifa-preprod/psql:/tmp/backup.dump
kubectl -n domifa-preprod exec -it psql -- pg_restore --clean --if-exists --no-owner --role app --no-acl --verbose -d app /tmp/backup.dump
```

### OLD dump/restore SQL

```shell
# azure
# kubectl --context prod -n xxx run -it psql --image=ghcr.io/socialgouv/docker/psql -- bash
export SOURCE='postgresql://STARTUP:PASSWD@pg-metabase-rw:5432/STARTUP?sslmode=prefer'
pg_dump --dbname "$SOURCE" \
  --clean --if-exists \
  --no-privileges \
  --quote-all-identifiers \
  --format=custom \
  -f /tmp/backup.dump
kubectl --context prod -n matomo-metabase-STARTUP cp psql:/tmp/backup.dump metabase-STARTUP-dump.dump

# ovh
kubectl --context ovh-prod -n startup-STARTUP--metabase-prod run -it psql --image=ghcr.io/socialgouv/docker/psql -- bash
kubectl --context ovh-prod -n startup-STARTUP--metabase-prod cp metabase-tumeplay-dump.dump psql:/tmp/backup.dump
export TARGET='postgresql://app:PASSWORD@metabase-rw:5432/app?sslmode=disable'
pg_restore --dbname "$TARGET" \
  --clean --if-exists --no-owner --role app --no-acl --verbose \
  /tmp/backup.dump
```
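### scale down / suspend cronjobs (sketch)

Possible commands for steps 3, 9 and 10 above, assuming `prod` is the Azure kube context and `NAMESPACE` the app namespace (both placeholders to adapt):

```sh
# scale every deployment in the namespace down to zero
kubectl --context prod -n NAMESPACE scale deployment --all --replicas=0

# suspend every cronjob in the namespace
for cj in $(kubectl --context prod -n NAMESPACE get cronjobs -o name); do
  kubectl --context prod -n NAMESPACE patch "$cj" -p '{"spec":{"suspend":true}}'
done
```

Turning things back on (step 8) is the same idea with `--replicas` set back to the expected count and `"suspend":false`.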
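### post-restore table checks (sketch)

A quick way to cover the table list / table sizes checks of step 5, reusing the psql pod from the "new dump/restore SQL" section (the connection settings come from the `PG*` env vars of `psql.yaml`):

```sh
# list the tables (with sizes)
kubectl -n domifa-preprod exec -it psql -- psql -c '\dt+'

# total size per table, biggest first
kubectl -n domifa-preprod exec -it psql -- psql -c "SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) AS total_size FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;"
```

Running the same two commands against the Azure source and the OVH target makes the comparison straightforward.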
### Rclone azure files -> ovh openebs

1. Start a pod in the destination OVH namespace with the rclone image and the openebs volume mounted (created via a PVC in manifests):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: storage-migration-rclone
  namespace: XXX
spec:
  containers:
    - name: debug
      image: rclone/rclone
      command: ["tail", "-f", "/dev/null"]
      stdin: true
      tty: true
      volumeMounts:
        - mountPath: /mnt/ovh-storage
          name: files
  restartPolicy: Never
  securityContext:
    runAsNonRoot: false
  volumes:
    - name: files
      persistentVolumeClaim:
        claimName: XXX
```

2. Run rclone in the pod:

```bash
export RCLONE_CONFIG_AZUREFILES_TYPE=azurefiles
export RCLONE_CONFIG_AZUREFILES_ACCOUNT=XXX
export RCLONE_CONFIG_AZUREFILES_KEY=XXX
export RCLONE_CONFIG_AZUREFILES_SHARE_NAME=XXX

rclone sync azurefiles: /mnt/ovh-storage
```

### PGBench

#### Generic cluster

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg
spec:
  logLevel: info
  instances: 1
  imageName: ghcr.io/cloudnative-pg/postgis:14
  imagePullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 4
      memory: 3Gi
    requests:
      cpu: 100m
      memory: 3Gi
  storage:
    size: 30Gi
    storageClass: csi-cinder-high-speed
  minSyncReplicas: 0
  maxSyncReplicas: 0
  postgresql:
    parameters:
      TimeZone: Europe/Paris
      max_standby_archive_delay: 1d
      max_standby_streaming_delay: 1d
      pg_stat_statements.max: "10000"
      pg_stat_statements.track: all
  monitoring:
    enablePodMonitor: false
  priorityClassName: cnpg-high-priority-3
  bootstrap:
    initdb:
      database: test
      owner: test
      postInitTemplateSQL:
        - CREATE EXTENSION IF NOT EXISTS "postgis";
        - CREATE EXTENSION IF NOT EXISTS "postgis_topology";
        - CREATE EXTENSION IF NOT EXISTS "fuzzystrmatch";
        - CREATE EXTENSION IF NOT EXISTS "postgis_tiger_geocoder";
        - CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
        - CREATE EXTENSION IF NOT EXISTS "citext";
        - CREATE EXTENSION IF NOT EXISTS "pgcrypto";
        - CREATE EXTENSION IF NOT EXISTS "hstore";
  nodeMaintenanceWindow:
    inProgress: true
```

Then run the commands:

```shell
kubectl cnpg pgbench pg -- --initialize --scale 100
kubectl cnpg pgbench pg -- -t 10000
```

The results are available in the pod created by the last command.
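For example, to read the report (the namespace and pod name below come from the sample runs further down; look up the actual pgbench pod of your own run first):

```sh
# the benchmark runs in a dedicated pod whose name changes on every run
kubectl -n cnpg-test-bench-adrien get pods | grep pgbench
kubectl -n cnpg-test-bench-adrien logs -f pg-pgbench-815924-qkrxt
```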
#### OVH-DEV 1 client 1 thread 10000 transactions, scale 100

```
pgbench (14.11 (Debian 14.11-1.pgdg110+2))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 10000/10000
latency average = 15.991 ms
initial connection time = 32.196 ms
tps = 62.535399 (without initial connection time)
```

#### AKS-DEV 1 client 1 thread 10000 transactions, scale 100

```
pgbench (14.11 (Debian 14.11-1.pgdg110+2), server 14.10 (Debian 14.10-1.pgdg110+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 10000/10000
latency average = 10.963 ms
initial connection time = 13.801 ms
tps = 91.215455 (without initial connection time)
```

#### OVH-DEV 10 clients 10 threads 10000 transactions, scale 100

```
pgbench (14.11 (Debian 14.11-1.pgdg110+2))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 10
number of threads: 10
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
latency average = 32.375 ms
initial connection time = 41.136 ms
tps = 308.884240 (without initial connection time)
```

#### AKS-DEV 10 clients 10 threads 10000 transactions, scale 100

```
pgbench (14.11 (Debian 14.11-1.pgdg110+2), server 14.10 (Debian 14.10-1.pgdg110+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 10
number of threads: 10
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
latency average = 31.705 ms
initial connection time = 25.440 ms
tps = 315.403144 (without initial connection time)
```

#### OVH-PROD 1 client 1 thread 10000 transactions, scale 100

```
pgbench (14.11 (Debian 14.11-1.pgdg110+2))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 10000/10000
latency average = 13.229 ms
initial connection time = 24.488 ms
tps = 75.588732 (without initial connection time)
Stream closed EOF for cnpg-test-bench-adrien/pg-pgbench-815924-qkrxt (pgbench)
```

#### OVH-PROD 1 client 1 thread 10000 transactions, scale 100 -> cnpg maintenance disabled

```
pgbench (14.11 (Debian 14.11-1.pgdg110+2))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 10000/10000
latency average = 16.044 ms
initial connection time = 25.169 ms
tps = 62.329173 (without initial connection time)
Stream closed EOF for cnpg-test-bench-adrien/pg-pgbench-646182-7hczv (pgbench)
```

#### OVH-DEV 1 client - on the compute-optimized cnpg nodepool cnpg-nodepool-node-f86a29

```
pgbench (14.11 (Debian 14.11-1.pgdg110+2))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 10000/10000
latency average = 16.461 ms
initial connection time = 26.896 ms
tps = 60.748573 (without initial connection time)
Stream closed EOF for cnpg-test-bench-adrien/pg-pgbench-315170-m2ttd (pgbench)
```

## cnpg + pv-mutator = 💜

### setup pv-mutator

#### prerequisites to install

- https://github.com/utkuozdemir/pv-migrate
- yq

```sh
git clone https://github.com/xavierfnk/pv-mutator
```

#### working around kyverno

Edit pv-mutator/helm/pv-migrate-values.yaml and add, under the `rsync` key:

```
securityContext:
  runAsNonRoot: false
```

### migrate the volume

```sh
kubectl cnpg -n domifa-preprod hibernate on pg
```

Save the PVC's labels and annotations!!! (a sketch follows this section)

```sh
cd pv-mutator
NAMESPACE="domifa-preprod"
PVC_NAME="pg-1"
CLASS="sc-hspeed-gen2-delete"
SIZE="150Gi"
./pv-mutator.sh
```

Put the PVC's labels and annotations back, then:

```sh
kubectl cnpg -n domifa-preprod hibernate off pg
```

Enjoy!
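A possible way to handle the "save / restore the PVC labels and annotations" steps above (a sketch: the file names are arbitrary, and the label/annotation keys shown in the re-apply commands are only examples — the real ones are whatever the dump contains):

```sh
# before pv-mutator: dump the PVC labels and annotations
kubectl -n domifa-preprod get pvc pg-1 -o jsonpath='{.metadata.labels}'      > pg-1-labels.json
kubectl -n domifa-preprod get pvc pg-1 -o jsonpath='{.metadata.annotations}' > pg-1-annotations.json

# after pv-mutator: re-apply every key=value found in the dumps on the new PVC, e.g.
kubectl -n domifa-preprod label pvc pg-1 cnpg.io/cluster=pg         # example, use the values from pg-1-labels.json
kubectl -n domifa-preprod annotate pvc pg-1 cnpg.io/nodeSerial=1    # example, use the values from pg-1-annotations.json
```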
### cleanup

Manually delete the old PVC (left behind by pv-mutator as an extra safety net) and the old PV (if its reclaim policy was Retain).

## Backup ovh openebs to jiva

```
export RCLONE_CONFIG_BACKUP_TYPE=s3
export RCLONE_CONFIG_BACKUP_PROVIDER=Other
export RCLONE_CONFIG_BACKUP_REGION=gra
export RCLONE_CONFIG_BACKUP_ENDPOINT=https://s3.gra.io.cloud.ovh.net
export RCLONE_CONFIG_BACKUP_ACCESS_KEY_ID=SECRET
export RCLONE_CONFIG_BACKUP_SECRET_ACCESS_KEY=SECRET

rclone sync /mnt/ovh-storage backup:
```
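To double-check the copy afterwards, a one-way comparison can be run from the same environment (`backup:` being the remote configured just above):

```sh
# verify that everything under /mnt/ovh-storage is present on the backup remote
rclone check /mnt/ovh-storage backup: --one-way
```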