# Atmos account bootstrap diary

Author: Alex Box

### Top level questions

  * Why are base certs not created as part of Atmos? I see a standard pattern.

## pull request list

https://github.com/nexmoinc/tf-data-team/pull/130/files

Before working out timestamps:

  * Figure out how to connect to VPN
  * Clone repos referenced in [[https://confluence.vonage.com/display/APISRE/How-to:+Atmos+-+Configure+IaC+in+Atmos#How-to:Atmos-ConfigureIaCinAtmos-Step8.DeployArgoCD][here]]
  * Huddle with Shankar about how to proceed - why isn't this automated?

** <2021-11-30 Tue 17:12>

Try to run terragrunt, it fails with an error about not finding =terragrunt.hcl= in any parent folders - why?

** <2021-11-30 Tue 17:21>

Realise it's because =terragrunt.hcl= in repo root is a symlink into =tf-common/terragrunt.hcl=, which is a submodule that hasn't been pulled. Run this to check it out:

#+begin_src
git submodule update --recursive --remote
# UPDATE: WRONG!!!! This pulled latest version of tf-common, DON'T WANT THAT!!! Updated wiki page with correct command
# (--remote checks out the tip of each submodule's tracked branch instead of the commit pinned by the parent repo.)
#+end_src

** <2021-11-30 Tue 17:38>

Spent time working out what to do with the value returned from the CloudOps =/createPrivatelinks= API endpoint. It's obviously something to do with establishing private connectivity (not routed through the public internet) between the Atmos account and other (which?) AWS accounts, primarily /something/ owned by the CloudOps team. Relevant docs:

https://docs.aws.amazon.com/vpc/latest/privatelink/endpoint-services-overview.html

Adding note to wiki explaining what to do with the VPC endpoint DNS address returned by the CloudOps API.

** <2021-11-30 Tue 18:02>

Worked out what to do with the response mentioned above. It's used as the target of a CNAME record for ArgoCD. I assume this must be because ArgoCD needs to be accessible only by engineers on the VPN, and the VPN endpoints reside in the CloudOps AWS account.
I found this out by looking for references to the input parameter that we [[https://confluence.vonage.com/display/APISRE/How-To:+Atmos+Privatelink+and+IAC_Bootstrap+Update#How-To:AtmosPrivatelinkandIAC_BootstrapUpdate-6-UpdateArgoCDModule][need to set]] with the response returned by the CloudOps API:

https://github.com/nexmoinc/terraform-aws-nexmo-eks/blob/ba2dba6b7ddfe3739d0471e120720872a924960a/modules/argocd/main.tf#L40

Note: I'm not sure why this separate wiki page exists; it seems to contain information needed to progress through the main Atmos IaC wiki page:

https://confluence.vonage.com/display/APISRE/How-To:+Atmos+Privatelink+and+IAC_Bootstrap+Update#How-To:AtmosPrivatelinkandIAC_BootstrapUpdate-6-UpdateArgoCDModule

** <2021-11-30 Tue 22:16>

Terragrunt plan and apply logs:

#+begin_src
git submodule update --recursive --remote
abox@VONC02CG7P3MD6P:~/repositories/tf-data-team/vonage-api-data-main-prd/iac_argocd/ap-southeast-1$ terragrunt plan

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create
 <= read (data resources)

Terraform will perform the following actions:

  # data.aws_secretsmanager_secret_version.argo_creds will be read during apply
  # (config refers to values not yet known)
 <= data "aws_secretsmanager_secret_version" "argo_creds" {
      + arn = (known after apply)
      + id = (known after apply)
      + secret_binary = (sensitive value)
      + secret_id = (known after apply)
      + secret_string = (sensitive value)
      + version_id = (known after apply)
      + version_stages = (known after apply)
    }

  # aws_route53_record.argocd will be created
  + resource "aws_route53_record" "argocd" {
      + allow_overwrite = (known after apply)
      + fqdn = (known after apply)
      + id = (known after apply)
      + name = "argocd"
      + records = [
          + "vpce-0860e947a5446e148-wcpb4o1p.vpce-svc-0fc74d3cc5379c2f8.ap-southeast-1.vpce.amazonaws.com",
        ]
      + ttl = 300
      + type = "CNAME"
      + zone_id = "Z0302226P948K4236YUR"
    }

  # helm_release.argo will be created
  + resource "helm_release" "argo" {
      + atomic = false
      + chart = "argo-cd"
      + cleanup_on_fail = false
      + create_namespace = true
      + dependency_update = false
      + disable_crd_hooks = false
      + disable_openapi_validation = false
      + disable_webhooks = false
      + force_update = false
      + id = (known after apply)
      + lint = false
      + manifest = (known after apply)
      + max_history = 0
      + metadata = (known after apply)
      + name = "argo-cd"
      + namespace = "argocd"
      + recreate_pods = false
      + render_subchart_notes = true
      + replace = false
      + repository = "https://argoproj.github.io/argo-helm"
      + reset_values = false
      + reuse_values = false
      + skip_crds = false
      + status = "deployed"
      + timeout = 300
      + values = (known after apply)
      + verify = false
      + version = "3.17.6"
      + wait = true
      + wait_for_jobs = false
    }

  # helm_release.external-secrets will be created
  + resource "helm_release" "external-secrets" {
      + atomic = false
      + chart = "kubernetes-external-secrets"
      + cleanup_on_fail = false
      + create_namespace = true
      + dependency_update = false
      + disable_crd_hooks = false
      + disable_openapi_validation = false
      + disable_webhooks = false
      + force_update = false
      + id = (known after apply)
      + lint = false
      + manifest = (known after apply)
      + max_history = 0
      + metadata = (known after apply)
      + name = "external-secrets"
      + namespace = "cluster"
      + recreate_pods = false
      + render_subchart_notes = true
      + replace = false
      + repository = "https://external-secrets.github.io/kubernetes-external-secrets/"
      + reset_values = false
      + reuse_values = false
      + skip_crds = false
      + status = "deployed"
      + timeout = 300
      + values = [
          + <<-EOT
                env:
                  AWS_REGION: ap-southeast-1
                  AWS_DEFAULT_REGION: ap-southeast-1
                  LOG_LEVEL: debug
                  USE_HUMAN_READABLE_LOG_LEVELS: true
                serviceAccount:
                  name: external-secrets
                  annotations:
                    eks.amazonaws.com/role-arn: arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-external-secrets
                securityContext:
                  fsGroup: 65534
            EOT,
        ]
      + verify = false
      + version = "7.0.0"
      + wait = true
      + wait_for_jobs = false
    }

  # null_resource.encrypted_admin_password will be created
  + resource "null_resource" "encrypted_admin_password" {
      + id = (known after apply)
      + triggers = (known after apply)
    }

  # module.argo-admin-password.aws_secretsmanager_secret.generated-secret will be created
  + resource "aws_secretsmanager_secret" "generated-secret" {
      + arn = (known after apply)
      + description = "Admin password for the argocd"
      + force_overwrite_replica_secret = false
      + id = (known after apply)
      + kms_key_id = "5c4b18b7-99df-4aee-9aff-884790eb960d"
      + name = "global/sre/other/cluster1-main-apse1-0-prd/argocd/admin-credentials"
      + name_prefix = (known after apply)
      + policy = (known after apply)
      + recovery_window_in_days = 7
      + rotation_enabled = (known after apply)
      + rotation_lambda_arn = (known after apply)
      + tags = {
          + "app-env" = "prod"
          + "app-lob" = "api"
          + "app-module" = "nexmo_aws_secret"
          + "app-name" = "global/sre/other/cluster1-main-apse1-0-prd/argocd/admin-credentials"
          + "app-region" = "ap-southeast-1"
          + "app-team" = "sre"
          + "aws-account" = "vonage-api-data-main-prd"
        }
      + tags_all = {
          + "app-env" = "prod"
          + "app-lob" = "api"
          + "app-module" = "nexmo_aws_secret"
          + "app-name" = "global/sre/other/cluster1-main-apse1-0-prd/argocd/admin-credentials"
          + "app-region" = "ap-southeast-1"
          + "app-team" = "sre"
          + "aws-account" = "vonage-api-data-main-prd"
        }

      + replica {
          + kms_key_id = (known after apply)
          + last_accessed_date = (known after apply)
          + region = (known after apply)
          + status = (known after apply)
          + status_message = (known after apply)
        }

      + rotation_rules {
          + automatically_after_days = (known after apply)
        }
    }

  # module.argo-admin-password.aws_secretsmanager_secret_version.generated-secret-version[0] will be created
  + resource "aws_secretsmanager_secret_version" "generated-secret-version" {
      + arn = (known after apply)
      + id = (known after apply)
      + secret_id = (known after apply)
      + secret_string = (sensitive value)
      + version_id = (known after apply)
      + version_stages = (known after apply)
    }

  # module.argo-admin-password.random_password.generated-secret-password[0] will be created
  + resource "random_password" "generated-secret-password" {
      + id = (known after apply)
      + length = 20
      + lower = true
      + min_lower = 0
      + min_numeric = 0
      + min_special = 5
      + min_upper = 0
      + number = true
      + override_special = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
      + result = (sensitive value)
      + special = true
      + upper = true
    }

Plan: 7 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + endpoint = "argocd.main0.api.data.prd.apse1.vonagenetworks.net"

Warnings:

- Version constraints inside provider configuration blocks are deprecated
  on provider.tf line 8

To see the full warning notes, run Terraform without -compact-warnings.

------------------------------------------------------------------------

This plan was saved to: ./plan.out

To perform exactly these actions, run the following command to apply:
    terraform apply "./plan.out"

abox@VONC02CG7P3MD6P:~/repositories/tf-data-team/vonage-api-data-main-prd/iac_argocd/ap-southeast-1$
abox@VONC02CG7P3MD6P:~/repositories/tf-data-team/vonage-api-data-main-prd/iac_argocd/ap-southeast-1$ terragrunt apply "./plan.out"
module.argo-admin-password.random_password.generated-secret-password[0]: Creating...
module.argo-admin-password.random_password.generated-secret-password[0]: Creation complete after 0s [id=none]
aws_route53_record.argocd: Creating...
module.argo-admin-password.aws_secretsmanager_secret.generated-secret: Creating...
helm_release.external-secrets: Creating...
module.argo-admin-password.aws_secretsmanager_secret.generated-secret: Creation complete after 2s [id=arn:aws:secretsmanager:ap-southeast-1:249606884981:secret:global/sre/other/cluster1-main-apse1-0-prd/argocd/admin-credentials-OtT02f]
module.argo-admin-password.aws_secretsmanager_secret_version.generated-secret-version[0]: Creating...
module.argo-admin-password.aws_secretsmanager_secret_version.generated-secret-version[0]: Creation complete after 2s [id=arn:aws:secretsmanager:ap-southeast-1:249606884981:secret:global/sre/other/cluster1-main-apse1-0-prd/argocd/admin-credentials-OtT02f|4E1D5022-2165-44D1-86BA-D0F067B07BA7]
data.aws_secretsmanager_secret_version.argo_creds: Reading...
data.aws_secretsmanager_secret_version.argo_creds: Read complete after 1s [id=arn:aws:secretsmanager:ap-southeast-1:249606884981:secret:global/sre/other/cluster1-main-apse1-0-prd/argocd/admin-credentials-OtT02f|AWSCURRENT]
null_resource.encrypted_admin_password: Creating...
null_resource.encrypted_admin_password: Creation complete after 0s [id=5977981283397304961]
helm_release.argo: Creating...
aws_route53_record.argocd: Still creating... [10s elapsed]
helm_release.external-secrets: Still creating... [10s elapsed]
helm_release.argo: Still creating... [10s elapsed]
aws_route53_record.argocd: Still creating... [20s elapsed]
helm_release.external-secrets: Still creating... [20s elapsed]
helm_release.argo: Still creating... [20s elapsed]
helm_release.external-secrets: Creation complete after 28s [id=external-secrets]
aws_route53_record.argocd: Still creating... [30s elapsed]
aws_route53_record.argocd: Creation complete after 33s [id=Z0302226P948K4236YUR_argocd_CNAME]
helm_release.argo: Still creating... [30s elapsed]
helm_release.argo: Still creating... [40s elapsed]
helm_release.argo: Still creating... [50s elapsed]
helm_release.argo: Still creating... [1m0s elapsed]
helm_release.argo: Still creating... [1m10s elapsed]
helm_release.argo: Still creating... [1m20s elapsed]
helm_release.argo: Creation complete after 1m24s [id=argo-cd]

Warnings:

- Version constraints inside provider configuration blocks are deprecated
  on provider.tf line 8

To see the full warning notes, run Terraform without -compact-warnings.

Apply complete! Resources: 7 added, 0 changed, 0 destroyed.

Outputs:

endpoint = "argocd.main0.api.data.prd.apse1.vonagenetworks.net"
abox@VONC02CG7P3MD6P:~/repositories/tf-data-team/vonage-api-data-main-prd/iac_argocd/ap-southeast-1$ git diff
diff --git a/tf-common b/tf-common
index 8dda087..f53a887 160000
--- a/tf-common
+++ b/tf-common
@@ -1 +1 @@
-Subproject commit 8dda087eb6a122704345262ed934e09f85e2eed1
+Subproject commit f53a887cf5107694c83dbb0f82e5dee004812adb
diff --git a/vonage-api-data-main-prd/iac_argocd/ap-southeast-1/terragrunt.hcl b/vonage-api-data-main-prd/iac_argocd/ap-southeast-1/terragrunt.hcl
index 6513382..d27a8db 100644
--- a/vonage-api-data-main-prd/iac_argocd/ap-southeast-1/terragrunt.hcl
+++ b/vonage-api-data-main-prd/iac_argocd/ap-southeast-1/terragrunt.hcl
@@ -21,6 +21,6 @@ inputs = {
   cluster_name = "cluster1",
   cluster_id = "cluster1-main-apse1-0-prd",
   team = "sre"
-  privatelink_vpce_endpoint = ""
+  privatelink_vpce_endpoint = "vpce-0860e947a5446e148-wcpb4o1p.vpce-svc-0fc74d3cc5379c2f8.ap-southeast-1.vpce.amazonaws.com"
   irsa_roles = dependency.irsa.outputs.irsa_roles
 }
abox@VONC02CG7P3MD6P:~/repositories/tf-data-team/vonage-api-data-main-prd/iac_argocd/ap-southeast-1$
#+end_src

** <2021-11-30 Tue 22:33>

At step 9 it looks like this has already been done - the step appears to be once per /account/, not once per /region/:

https://github.com/nexmoinc/ops-terraform/blob/b7ec126d4afd9b0902961a848f48937cb45d615d/nexmo-prod/atmos_share_secrets/configmap.yaml#L24

The linked example repositories are referenced by commit hash in the URL, so I looked at the latest version on =master= instead, and it all looks account-specific, not region-specific:

[[https://github.com/vonage-atmos/atmos-aws-api-data-main-stacks/blob/master/iam/grouping/eks/managed_policies.yaml]]

https://github.com/vonage-atmos/atmos-aws-api-data-main-stacks/blob/master/iam/grouping/applications/application-k8s-external-secrets.yaml

** <2021-11-30 Tue 23:23>

Step 10: the example placement seems to be old.
Also the placements (=terragrunt.hcl= files) for the different regions are not in sync - =us-west-2= is different to all the rest, but it was created most recently (23rd November) so I =mkdir= and copy pasted it for =ap-southeast-1=.

Here is a log of the apply for step 10:

#+begin_src
abox@VONC02CG7P3MD6P:~/repositories/tf-data-team/vonage-api-data-main-prd/iac_eks_bootstrap/ap-southeast-1$ time terragrunt apply

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # argocd_application.cluster-bootstrap will be created
  + resource "argocd_application" "cluster-bootstrap" {
      + id = (known after apply)
      + wait = true

      + metadata {
          + generation = (known after apply)
          + name = "cluster-bootstrap"
          + namespace = "argocd"
          + resource_version = (known after apply)
          + uid = (known after apply)
        }

      + spec {
          + project = "k8s-infrastructure"

          + destination {
              + namespace = "argocd"
              + server = "https://kubernetes.default.svc"
            }

          + source {
              + path = "."
              + repo_url = "https://github.com/nexmoinc/eks-iac-bootstrap.git"
              + target_revision = "feature/codebase_merge"

              + helm {
                  + values = <<-EOT
                        "acm_arn": "arn:aws:acm:ap-southeast-1:249606884981:certificate/770f463b-a173-4547-8bf0-66d53ce6f485"
                        "argo_project": "k8s-infrastructure"
                        "argocd":
                          "host": "argocd.main0.api.data.prd.apse1.vonagenetworks.net"
                          "ldap_groups":
                          - "team_ops"
                          "ldap_host": "ldap01.ap-southeast-1.nexmo.xxx"
                        "aws_account_name": "vonage-api-data-main-prd"
                        "aws_region": "ap-southeast-1"
                        "cluster_autoscaler":
                          "enable_scale_down": true
                          "enabled": true
                        "cluster_id": "cluster1-main-apse1-0-prd"
                        "cluster_name": "cluster1"
                        "dns_domain": "main0.api.data.prd.apse1.vonagenetworks.net"
                        "irsa_roles":
                          "aws_ebs_csi_driver":
                            "arn": "arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-ebs-csi-driver"
                          "aws_load_balancer_controller":
                            "arn": "arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-aws-lb-controller"
                          "cloudwatch_exporter":
                            "arn": "arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-cloudwatch-exporter"
                          "cluster_autoscaler":
                            "arn": "arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-cluster-autoscaler"
                          "external_dns":
                            "arn": "arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-external-dns"
                          "external_secrets":
                            "arn": "arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-external-secrets"
                          "vmagent":
                            "arn": "arn:aws:iam::249606884981:role/sre/application-cluster1-main-apse1-0-prd-k8s-vmagent"
                        "nginx_privatelink":
                          "load_balancer_name": "privatelink-igvftpug"
                        "vmagent":
                          "enable_vmagent_for_vpc": true
                          "vminsert_endpoints":
                          - "http://insert.victoriametrics.nexmo-eks-sre-1.eu-west-2.nexmo.xxx/insert/0/prometheus/api/v1/write"
                          - "http://insert.victoriametrics.nexmo-eks-1.eu-central-1.nexmo.xxx/insert/0/prometheus/api/v1/write"
                        "vpc_id": "vpc-0af04a931675f7329"
                    EOT
                }
            }

          + sync_policy {
              + automated = {
                  + "allow_empty" = false
                  + "prune" = true
                  + "self_heal" = true
                }

              + retry {
                  + backoff = {
                      + "duration" = ""
                      + "max_duration" = ""
                    }
                  + limit = "0"
                }
            }
        }
    }

  # argocd_project.infrastructure will be created
  + resource "argocd_project" "infrastructure" {
      + id = (known after apply)

      + metadata {
          + generation = (known after apply)
          + name = "k8s-infrastructure"
          + namespace = "argocd"
          + resource_version = (known after apply)
          + uid = (known after apply)
        }

      + spec {
          + description = "Holds all baseline infrastructure applications"
          + source_repos = [
              + "*",
            ]

          + cluster_resource_whitelist {
              + group = "*"
              + kind = "*"
            }

          + destination {
              + namespace = "*"
              + server = "https://kubernetes.default.svc"
            }

          + orphaned_resources {
              + warn = true
            }
        }
    }

  # argocd_repository.iac_repo will be created
  + resource "argocd_repository" "iac_repo" {
      + connection_state_status = (known after apply)
      + id = (known after apply)
      + inherited_creds = (known after apply)
      + repo = "https://github.com/nexmoinc/eks-iac-bootstrap.git"
      + type = "git"
    }

  # argocd_repository_credentials.org will be created
  + resource "argocd_repository_credentials" "org" {
      + id = (known after apply)
      + password = (sensitive value)
      + url = "https://github.com/nexmoinc"
      + username = "argocd"
    }

Plan: 4 to add, 0 to change, 0 to destroy.

Warnings:

- Version constraints inside provider configuration blocks are deprecated
  on provider.tf line 5
- Interpolation-only expressions are deprecated
  on .terraform/modules/vpc-data/all_cidrs/main.tf line 45 (and 1 more)

To see the full warning notes, run Terraform without -compact-warnings.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

argocd_repository_credentials.org: Creating...
argocd_project.infrastructure: Creating...
argocd_repository_credentials.org: Creation complete after 4s [id=https://github.com/nexmoinc]
argocd_repository.iac_repo: Creating...
argocd_project.infrastructure: Creation complete after 6s [id=k8s-infrastructure]
argocd_repository.iac_repo: Creation complete after 5s [id=https://github.com/nexmoinc/eks-iac-bootstrap.git]
argocd_application.cluster-bootstrap: Creating...
argocd_application.cluster-bootstrap: Still creating... [10s elapsed]
argocd_application.cluster-bootstrap: Creation complete after 15s [id=cluster-bootstrap]

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

real 6m11.464s
user 0m18.733s
sys 0m9.796s
abox@VONC02CG7P3MD6P:~/repositories/tf-data-team/vonage-api-data-main-prd/iac_eks_bootstrap/ap-southeast-1$
#+end_src

** <2021-12-01 Wed 00:08>

After step 10 (ArgoCD apps bootstrapping) most of the apps in ArgoCD were showing as requiring sync. The wiki didn't make it clear that this seems to be expected - auto-sync is off for most (all?) apps.

I decided to take a risk and manually sync them all one by one, starting with ArgoCD itself. The UI returned errors for a few minutes, so I closed the browser tab and waited, then opened it again in a fresh tab. Now ArgoCD showed no errors, so I started syncing each app one by one, using the default sync options in the UI. Everything synced with no errors.

I logged out as the Admin user and logged back in using my Nexmo LDAP credentials; it worked fine.

** <2021-12-01 Wed 16:50>

Call with Shankar. Some things we discussed:

  * Went over notes discussing things I found out/learnt/surprises etc
  * Copy paste of =terragrunt.hcl= highlighted as a problem (environments are already inconsistent, especially in the eks-iac-bootstrap repo)
  * Wiki pages not having full details, e.g. whether to expect ArgoCD apps to be in sync or not when first provisioning the bootstrap app of apps (we updated the wiki)

** <2021-12-01 Wed 17:37>

Make a start on next steps for bootstrapping as instructed by Vijay in DM with Shankar:

https://vonage.slack.com/archives/C02PBKRGNAD/p1638279152010600

First step is bootstrapping nginx.
Read about architecture [[https://confluence.vonage.com/display/APISRE/Service+Runbook:+Nginx+Private+Loadbalancer+in+Atmos][here]]. Basically the flow is user -> NLB -> nginx -> Nomad client host/port.

Worked out how Puppet code is applied to Nginx EC2 instances and added wiki update:

#+begin_src html
<p>In AWS the entrypoint for Puppet config for any given node is
<a href="https://github.com/nexmoinc/puppet-master/blob/8d8cf40/code/environments/aws/manifests/site.pp#L5">based</a>
on a tag called <code>PuppetRole</code>. The value of this tag should be a valid Puppet class name to apply to the node.
The Terraform that provisions Nginx EC2 instances
<a href="https://github.com/nexmoinc/terraform-aws-nexmo-nginx-lb/blob/fdc2898/variables.tf#L95">specifies</a>
Puppet class
<a href="https://github.com/nexmoinc/puppet-master/blob/master/code/environments/aws/modules/role/manifests/loadbalancer/private_atmos.pp">role::loadbalancer::private_atmos</a>
to be applied
<a href="https://github.com/nexmoinc/terraform-aws-nexmo-asg/blob/1273a9d/main.tf#L168">from the ASG</a>.</p>
#+end_src

** <2021-12-01 Wed 20:12>

Call with Shankar and Vijay. Questions:

  * Is there documentation for what is behind the VPC endpoint which we reference via CNAME for ArgoCD?
  * I've noticed some steps are per account and some steps are per region. Would it make sense to indicate the scope in every step of the bootstrap How-To wiki pages? Or is it obvious and I'm stupid?
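
To make the =PuppetRole= note from the wiki update concrete, here is a minimal Python sketch of the lookup it describes: take an instance's tags, read =PuppetRole=, and check that the value looks like a Puppet class name. This is an illustration only, not the code in =site.pp=; the function name, the example =Name= tag and the simplified class-name pattern are all my assumptions.

```python
import re
from typing import Mapping

# Assumption: a simplified pattern for a namespaced Puppet class name, i.e.
# lowercase segments of letters, digits and underscores joined by "::".
PUPPET_CLASS_RE = re.compile(r"[a-z][a-z0-9_]*(::[a-z][a-z0-9_]*)*")


def puppet_class_from_tags(tags: Mapping[str, str]) -> str:
    """Return the class named by the PuppetRole tag, or raise ValueError.

    Hypothetical helper: it mirrors the idea that the tag value selects the
    Puppet class for the node, nothing more.
    """
    role = tags.get("PuppetRole")
    if role is None:
        raise ValueError("instance has no PuppetRole tag")
    if not PUPPET_CLASS_RE.fullmatch(role):
        raise ValueError(f"PuppetRole {role!r} does not look like a Puppet class name")
    return role


if __name__ == "__main__":
    # Example tags; the Name value is made up for illustration.
    example_tags = {
        "Name": "nginx-example",
        "PuppetRole": "role::loadbalancer::private_atmos",
    }
    print(puppet_class_from_tags(example_tags))
```

A check like this could run against the tags Terraform is about to set, so a typo in the role name fails before an instance boots with no Puppet class applied.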

** <2021-12-01 Wed 20:24>

Wanted to see how the ASG module specifies the =PuppetRole= tag, and was surprised to see that there are two ASG modules, and the one that we are /not/ using looks like the one we /should/ be using, based on its name:

[[./img/asg_repos.png]]

** <2021-12-01 Wed 20:44>

Nginx steps regarding IAM and SG setup - looks like this has already been done for prod data account apse1:

[[https://github.com/vonage-atmos/atmos-aws-api-data-main-stacks/tree/master/networking/prd/main0.api.data.prd.apse1.vonagenetworks.net]]

** <2021-12-01 Wed 20:57>

Huddle with Shankar and Vijay. Shankar has readied a PR for approval for apse2 - this will need a manual apply. For the Nginx step and onwards, Atlantis should apply it - a good test to see if Atlantis is working.

** <2021-12-01 Wed 21:27>

PR to provision Nginx failed to plan:

https://github.com/nexmoinc/tf-data-team/pull/135#issuecomment-983490323

** <2021-12-01 Wed 21:38>

Realised that my confusion over the nginx LB/ASG repos was because the former calls the latter:

https://github.com/nexmoinc/terraform-modules/blob/6893ad92553a2b7690dd71a8368333b990da279f/spaces/nginx_privlb/atmos_nginx/main.tf#L9

** <2021-12-01 Wed 22:21>

Identified a possible bug in the nginx module:

https://vonage.slack.com/archives/C01SLHZML07/p1638356837169100

And realised that ap-southeast-[12] VPC data needs to be added here. Again this was evident in the TF plan output from Atlantis, and I did a search in the =nexmoinc= GitHub org to find the problematic code shown in the plan (basically searched for =all_atmos_cidrs= and there were only two hits, easy):

https://github.com/nexmoinc/terraform-nexmo-vpc-data/blob/9e1c21d/all_cidrs/vpc_data.yml#L53

Asked for help in #nx-infra-team:

https://vonage.slack.com/archives/G0108MLH10B/p1638357932434300

Got an answer from Daniel Miranda: a spreadsheet with all Atmos CIDRs, Accounts tab, the CIDRs are in the right hand split (scroll right to get apse1/2):

https://docs.google.com/spreadsheets/d/1QAgf0JuvsfrUiz3rcyhuaFR6DuO6PrOGwT1oWXwlSrw/edit#gid=364526952

Also realised that we need the CIDR ending =/21= as opposed to =/23= - the former is for application traffic, the latter is for database traffic. Not sure why this delineation is made.

** <2021-12-01 Wed 22:35>

Tim pushed a new release that fixes the bug encountered when provisioning Nginx:

https://github.com/nexmoinc/terraform-aws-nexmo-nginx-lb/releases/tag/v1.3.7

** <2021-12-01 Wed 23:08>

Pushed PR to add apse1 CIDR to the appropriate TF repo:

https://github.com/nexmoinc/terraform-nexmo-vpc-data/pull/70

Realised that patch releases get created automatically by a GitHub action without requiring any key words (i.e. =minor-release= or =major-release=) in the comment for one of the commits.

** <2021-12-01 Wed 23:37>

My problem now is that I need to update the space for nginx private LBs, which is presumably shared all over the place. It is pinned to 1.3.0 of the nexmo-nginx-lb module (hash is current master):

https://github.com/nexmoinc/terraform-modules/blob/97b84e4/spaces/nginx_privlb/atmos_nginx/main.tf#L10

I asked about it here:

https://vonage.slack.com/archives/C01SLHZML07/p1638362467180200

There's no extra work required for the vpc-data PR to be pulled through because the version is not fully pinned; it's set to any minor/patch release of major v1:

https://github.com/nexmoinc/terraform-aws-nexmo-asg/blob/1273a9d/main.tf#L13

** <2021-12-01 Wed 23:56>

Consensus about the nginx private LB space PR is to just update it; patch release changes should be fine. It's the next unfortunate person's problem if anything goes wrong. Pushed a PR but nobody reviewed, will nag in the morning:

https://vonage.slack.com/archives/C01SLHZML07/p1638363990185300?thread_ts=1638362467.180200&cid=C01SLHZML07

** <2021-12-02 Thu 15:37>

Realised my PR will update the major version, so pin it to v1.3.x instead.

** <2021-12-02 Thu 15:59>

Vijay approved nginx placement PR.
I thought I would have to update the reference to nginx-lb inside the space wrapper, but actually the version constraint for the reference to nginx-lb in the currently pinned space/wrapper is =~> 0= which just means /always fetch the latest/. So no need to update the space, just re-run the plan now that the Glue/SG bug is fixed and the apse1 CIDR has been added to the VPC module.

** <2021-12-02 Thu 16:16>

I ran =atlantis apply= on the nginx provisioning PR and nothing seemed to happen - forgot that Atlantis only adds a comment when the apply is finished (or maybe it's configured differently in the data team's account, i.e. it doesn't post a comment saying "Running apply..."?).

I SSH'ed to one of the EC2 instances that got built using =vonage-aws --ec2= and verified that nginx was running:

** <2021-12-02 Thu 17:39>

Helped Shankar deploy initial/seed ArgoCD (without any apps):

  * Configure Terraform Registry credentials (=~/.terraformrc=)
  * Login to data team AWS account via =vonage-aws --configure/--login=
  * Set correct version of =terraform= using =tfenv=
  * Run =terragrunt init/plan/apply "./plan.out"=

There was a brief storm in Melbourne; power cut and temporarily lost internet access. Fun.

** <2021-12-02 Thu 17:45>

Provided update at request of Vijay:

https://vonage.slack.com/archives/C01TK9SLZ4L/p1638427507284400?thread_ts=1638278392.273400&cid=C01TK9SLZ4L

** <2021-12-02 Thu 18:06>

Copy/paste for shared Nomad clients from us-west-2 (most recently pushed) to ap-southeast-1.

Realised that commenting =atlantis apply= doesn't show anything happening in GitHub until Atlantis has finished the automated =terraform apply=; only then does it comment the results. The only way to know for a fact that it is running is to check the Atlantis pod logs (in the centralised logging cluster preferably), or watch activity in the AWS console.

Realised that we have several "categories" of Nomad client - some clients are shared by multiple services, other clients are dedicated to particular services.
Discussed this in DM with Vijay; we agreed to talk in the shared channel next time so Shankar can see the discussion.

** <2021-12-02 Thu 18:27>

Copy pasted more =terragrunt.hcl= files for non-shared clients:

https://github.com/nexmoinc/tf-data-team/pull/140

** <2021-12-02 Thu 22:44>

Confirmed apse1 and apse2 are now fully bootstrapped, nothing more to do here so going back to regular work.

** <2021-12-03 Fri 10:30 IST>

Namespace validation step missing

When Nomad is deployed it is accessed from a central console, ideally a URL like http://nomad.us-east-1.nexmo.vip/ui . The namespace drop-down should reflect the new cluster provisioned in the region so that developers can access it. This was identified as missing.

Addressed by pull request: https://github.com/nexmoinc/nomad-tf/pull/203

More on the thread: https://vonage.slack.com/archives/C01TK9SLZ4L/p1638458967303300
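
Follow-up idea on the copy-pasted =terragrunt.hcl= placements that came up in the 2021-11-30 23:23 and 2021-12-01 16:50 entries: a rough Python sketch that diffs each region's placement against a chosen reference region, so drift shows up before the next copy paste. The directory layout (=<stack>/<region>/terragrunt.hcl=) matches the paths in the logs above, but the function name and the idea of using one region as the reference are my assumptions. Region-specific inputs (e.g. =privatelink_vpce_endpoint=) will legitimately differ, so the output needs a human to read it.

```python
import difflib
from pathlib import Path
from typing import Dict, List, Union


def placement_diffs(
    stack_dir: Union[str, Path],
    reference_region: str,
    filename: str = "terragrunt.hcl",
) -> Dict[str, List[str]]:
    """Return {region: unified diff lines} for placements that differ from the reference.

    Hypothetical helper: expects stack_dir/<region>/<filename> for each region.
    Regions whose file is identical to the reference are left out of the result.
    """
    stack_dir = Path(stack_dir)
    reference_lines = (stack_dir / reference_region / filename).read_text().splitlines()

    diffs: Dict[str, List[str]] = {}
    for placement in sorted(stack_dir.glob(f"*/{filename}")):
        region = placement.parent.name
        if region == reference_region:
            continue
        region_lines = placement.read_text().splitlines()
        # unified_diff yields nothing when the two inputs are identical.
        diff = list(
            difflib.unified_diff(
                reference_lines,
                region_lines,
                fromfile=f"{reference_region}/{filename}",
                tofile=f"{region}/{filename}",
                lineterm="",
            )
        )
        if diff:
            diffs[region] = diff
    return diffs
```

For example, =placement_diffs("vonage-api-data-main-prd/iac_eks_bootstrap", "us-west-2")= would return the regions under that stack whose placement differs from =us-west-2=, with the diff lines for each one.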