
Balancing Speed and Security

The time it takes to spin up a workspace should be closer to one minute than ten.

Let's look at centralizing at least DNS so we can speed up letsencrypt wildcard TLS, and send the resulting TLS secret to the cluster if we want to handle ingress locally.

If we really wanted to, we could replicate the zone configs between the deployed cluster and the higher-level cluster running PowerDNS.

Either via cluster-api or terraform, let's explore how fast we could bring up clusters.

  • Machine VMs or clusters might work well if we are able to prepopulate images to reduce spin-up time.

cluster-api-provider-kubevirt

clusterctl init --infrastructure kubevirt
clusterctl generate cluster --infrastructure kubevirt kubevirttest > ~/kubevirt-test.yaml

Fails to provision

Trying Vcluster

This seems to be the fastest at the moment. Let's figure out what the limits are, maybe give a tour to Mauilion to see if it could be secured.
Ingress is likely the most difficult part of this approach.

Trying DO

This sounds like it may be fast, let's see how fast. What about public IPs and ingress?

@BobyMCbobs: launching a droplet takes 40s
@BobyMCbobs: launching a DOKS cluster (no-capi) is 5mins

@hh Is the droplet a working k8s cluster or just the underlying "vm"?

@BobyMCbobs: a droplet is just a VM
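
To make launch times like the ones above comparable across providers, a small wall-clock wrapper helps. This is a minimal sketch, not how the numbers in this note were taken: `time_cmd` is a hypothetical helper, and the `doctl` line in the comment uses placeholder name, region, size, and image values.

```shell
#!/bin/sh
# time_cmd LABEL CMD [ARGS...]
# Runs CMD, then prints "LABEL: <seconds>s (exit <status>)".
# Uses `date +%s`, so the resolution is whole seconds.
time_cmd() {
  label=$1
  shift
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "$label: $(( end - start ))s (exit $status)"
  return "$status"
}

# Example with placeholder values (assumes doctl is installed and authenticated):
#   time_cmd droplet doctl compute droplet create timing-test \
#     --region sfo3 --size s-1vcpu-1gb --image ubuntu-22-04-x64 --wait
time_cmd demo sleep 1
```

Note that a finished create call only means the VM exists; time until SSH or the Kubernetes API answers would need its own measurement.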

@BobyMCbobs: CAPI on DO requires building of custom base images
https://github.com/kubernetes-sigs/cluster-api-provider-digitalocean/blob/main/docs/getting-started.md#building-images

@BobyMCbobs: having a hard time building it

    ubuntu-2004: [WARNING]: ansible.utils.display.initialize_locale has not been called, this
    ubuntu-2004:
    ubuntu-2004: PLAY [all] *********************************************************************
    ubuntu-2004: may result in incorrectly calculated text widths that can cause Display to
    ubuntu-2004: print incorrect line lengths
==> ubuntu-2004: failed to handshake
    ubuntu-2004:
    ubuntu-2004: TASK [Gathering Facts] *********************************************************
    ubuntu-2004: fatal: [default]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Unable to negotiate with 127.0.0.1 port 34351: no matching host key type found. Their offer: ssh-rsa", "unreachable": true}
    ubuntu-2004:
    ubuntu-2004: PLAY RECAP *********************************************************************
    ubuntu-2004: default                    : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
    ubuntu-2004:
==> ubuntu-2004: Provisioning step had errors: Running the cleanup provisioner, if present...
==> ubuntu-2004: Destroying droplet...
==> ubuntu-2004: Deleting temporary ssh key...
Build 'ubuntu-2004' errored after 57 seconds 526 milliseconds: Error executing Ansible: Non-zero exit status: exit status 4

==> Wait completed after 57 seconds 526 milliseconds

==> Some builds didn't complete successfully and had errors:
--> ubuntu-2004: Error executing Ansible: Non-zero exit status: exit status 4

==> Builds finished but no artifacts were created.
make: *** [Makefile:423: build-do-ubuntu-2004] Error 1
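
The `no matching host key type found. Their offer: ssh-rsa` line in the log above usually means the local OpenSSH client no longer accepts `ssh-rsa` host keys by default (OpenSSH 8.8 and later disable them). One possible workaround, untested against this build, is to re-enable the algorithm for the local endpoint that Ansible connects to. The sketch below only prints an `ssh_config` stanza; `ssh_rsa_stanza` is a hypothetical helper, and whether the image-builder Ansible run picks up the user's SSH config is an assumption.

```shell
#!/bin/sh
# ssh_rsa_stanza [HOST]
# Prints an ssh_config stanza that re-enables ssh-rsa for HOST
# (default 127.0.0.1, the address shown in the error message).
# PubkeyAcceptedAlgorithms needs OpenSSH 8.5 or newer.
ssh_rsa_stanza() {
  host=${1:-127.0.0.1}
  printf 'Host %s\n' "$host"
  printf '    HostKeyAlgorithms +ssh-rsa\n'
  printf '    PubkeyAcceptedAlgorithms +ssh-rsa\n'
}

# Review the output before appending it to ~/.ssh/config:
ssh_rsa_stanza 127.0.0.1
```

Scoping the stanza to `127.0.0.1` keeps the weaker algorithm disabled for every other host.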

@BobyMCbobs: I found that DOKS clusters create with images like do-kube-1.23.10-do.0, which is unavailable in the doctl compute droplet create command

Trying AWS

Same as DO

@BobyMCbobs: EC2 launch time is about 1m30s
@BobyMCbobs: using EKS is out of the question due to the complexity of setting it up. So much Terraform (or equivalent)

Trying GCP

Same as DO

@BobyMCbobs: launch time is about 25s for Compute on an e2-micro instance with an SSD
@BobyMCbobs: building images appears to also be required for CAPI

🐚 make build-gce-ubuntu-2004
hack/ensure-ansible.sh
Starting galaxy collection install process
Nothing to do. All requested collections are already installed. If you want to reinstall them, consider using `--force`.
hack/ensure-packer.sh
hack/ensure-goss.sh
Right version of binary present
packer build -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/config/kubernetes.json"  -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/config/cni.json"  -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/config/containerd.json"  -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/config/ansible-args.json"  -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/config/goss-args.json"  -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/config/common.json"  -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/config/additional_components.json"  -color=true -var-file="/home/ii/kubernetes-sigs/image-builder/images/capi/packer/gce/ubuntu-2004.json"  packer/gce/packer.json
Error: Failed to prepare build: "ubuntu-2004"

unexpected EOF



==> Wait completed after 7 microseconds

==> Builds finished but no artifacts were created.
make: *** [Makefile:391: build-gce-ubuntu-2004] Error 1

Current Pair instance evaluations

Time is spent in several areas that could be improved:

  • Equinix Metal instance creation time (2mins)
    • unavoidable
    • varies between cloud providers
  • dependency installation (2mins)
    • prebuild images?
    • use Talos?
  • Kubernetes init (0m30s)
    • unavoidable
    • not the biggest task
  • Environment container image pull (2m33s)
    • set up another registry pull proxy
    • time until Environment is live after kubectl apply
  • powerdns launch (3m0s)
    • replace with coredns for speed and ecosystem
    • almost parallel with container image pull
    • mystery behaviour with not showing as running until late
      • readinessProbe?
      • livenessProbe?
      • postStart exec command?
  • tls cert (2m40s)
    • only if the TLS cert hasn't been cached

Overall, around 10m0s, or 12m43s if the step times above are simply added up back to back.
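
The totals can be checked from the step times listed above. A minimal sketch with hypothetical helper names; the second figure assumes the Environment image pull overlaps completely with the PowerDNS launch, which the list only describes as "almost parallel".

```shell
#!/bin/sh
# to_seconds DURATION: convert "2m33s", "2m", or "30s" to seconds.
to_seconds() {
  d=$1
  case $d in
    *m*) m=${d%%m*}; rest=${d#*m} ;;
    *)   m=0; rest=$d ;;
  esac
  s=${rest%s}
  echo $(( ${m:-0} * 60 + ${s:-0} ))
}

# sum_durations DURATION...: total of all arguments, in seconds.
sum_durations() {
  total=0
  for d in "$@"; do
    total=$(( total + $(to_seconds "$d") ))
  done
  echo "$total"
}

# format_duration SECONDS: print as <minutes>m<seconds>s.
format_duration() {
  echo "$(( $1 / 60 ))m$(( $1 % 60 ))s"
}

# Instance, dependencies, k8s init, image pull, powerdns, tls cert
sequential=$(sum_durations 2m 2m 0m30s 2m33s 3m 2m40s)
format_duration "$sequential"                    # prints 12m43s

# Assumption: the image pull is fully hidden behind the powerdns launch
overlapped=$(( sequential - $(to_seconds 2m33s) ))
format_duration "$overlapped"                    # prints 10m10s
```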

Improvements to Pair's interface

  • add a loading screen and something to fill in the time
    • currently unclear why there's delay
    • good things take time