# Flake Finder Fridays #001
March 5th 2021
## Introduction
Dan Mangum and Rob Kielty are back for the second episode of Flake Finder Fridays.
In this month's episode they will walk through how to run Kubernetes e2e tests locally, as well as how they are packaged and run in CI environments.
## Issue
Candidates:
- Volume cleanup
- https://github.com/kubernetes/kubernetes/issues/96565
- https://github.com/kubernetes/kubernetes/issues/96759
- https://github.com/kubernetes/kubernetes/issues/99599
- https://github.com/kubernetes/kubernetes/issues/88766
## Running tests locally
### Assumptions
- You have a checkout of the `kubernetes/kubernetes` repository.
- You are interested in this test:
https://testgrid.k8s.io/sig-release-master-informing#gce-ubuntu-master-default&include-filter-by-regex=Pod&include-filter-by-regex=Pod%20Container%20Status%20should%20never%20report%20success%20for%20a%20pending%20container&width=5
Documentation Links:
- [Running Node E2E Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-node/e2e-node-tests.md)
- [Running Kubernetes E2E Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/e2e-tests.md)
- [Running Kubernetes E2E Tests with kubetest2](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/e2e-tests-kubetest2.md)
### Running Locally
For an example of using kubetest to run a test on a local cluster, see:
[Volume teardown and container start can race while a pod is being deleted and report an error #88766](https://hackmd.io/IjPwR-waSKyL4S4tv2YI8w)
### Running in CI
- Let's take a look at the Prow Job with the test we are interested in (`ci-kubernetes-e2e-ubuntu-gce`)
- Spyglass: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-ubuntu-gce/1369989496409427968
- k/test-infra: https://github.com/kubernetes/test-infra/blob/490054af4c868eea45c26d05051b0cd3eb33bfc1/config/jobs/kubernetes/sig-cloud-provider/gcp/gcp-gce.yaml#L433
> **A brief aside on naming conventions:** You'll notice the job we are interested in is prefixed with `ci-kubernetes-e2e`. It is customary to prefix periodic jobs that run in CI with `ci` and k8s e2e tests with `kubernetes-e2e`. You can see that a [similar job](https://github.com/kubernetes/test-infra/blob/490054af4c868eea45c26d05051b0cd3eb33bfc1/config/jobs/kubernetes/sig-cloud-provider/gcp/gcp-gce.yaml#L215) (`pull-kubernetes-e2e-gce-ubuntu`) that runs on *presubmit* (i.e. on pull requests before they are merged) is prefixed with `pull`, but still contains `kubernetes-e2e`.
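
As a rough illustration of the naming convention described in the aside above, the sketch below classifies a job name by its prefix. The function name `classify_job` and the returned keys are made up for this example, and the convention is only customary: the authoritative job type is whatever the job's config in `k/test-infra` declares.

```python
def classify_job(name: str) -> dict:
    """Guess how a Prow job is triggered from its (customary) name prefix."""
    if name.startswith("ci-"):
        trigger = "periodic"      # runs in CI on a schedule
    elif name.startswith("pull-"):
        trigger = "presubmit"     # runs on pull requests before merge
    else:
        trigger = "unknown"       # the convention does not tell us anything
    # Kubernetes e2e jobs customarily contain "kubernetes-e2e" in their name.
    return {"trigger": trigger, "is_k8s_e2e": "kubernetes-e2e" in name}


for job in ("ci-kubernetes-e2e-ubuntu-gce", "pull-kubernetes-e2e-gce-ubuntu"):
    print(job, classify_job(job))
```

Both job names from the aside are reported as Kubernetes e2e jobs; only the trigger differs.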
- Shout out to the folks who added this job because it has a nice description: `Uses kubetest to run e2e tests (-Slow|Serial|Disruptive|Flaky|Feature) against a cluster (ubuntu based) created with cluster/kube-up.sh`
- Hmm `kubetest`, we just used that to run our local tests!
- Prow Jobs run container images, what image are we using [here](https://github.com/kubernetes/test-infra/blob/490054af4c868eea45c26d05051b0cd3eb33bfc1/config/jobs/kubernetes/sig-cloud-provider/gcp/gcp-gce.yaml#L258)?
- Where is the source for the [kubekins-e2e image](https://github.com/kubernetes/test-infra/tree/master/images/kubekins-e2e)?
> **A brief aside on autobumper PRs:** Any time the test images change, a new version is built and pushed, then the [autobumper job](https://testgrid.k8s.io/sig-testing-prow#autobump-prow) opens a [PR](https://github.com/kubernetes/test-infra/pull/21295) against `k/test-infra`. This means that the jobs that depend on this functionality do not have to be constantly updated by their maintainers. Autobumper PRs are merged once daily by `test-infra oncall`, per the test images [documentation](https://github.com/kubernetes/test-infra/tree/master/images#testing-images).
- `kubekins-e2e` (and multiple other images) are [built on](https://github.com/kubernetes/test-infra/blob/e3f09b2907750207dd2a78f0552ab8a3d62ce7e6/images/kubekins-e2e/Dockerfile#L20) the [bootstrap image](https://github.com/kubernetes/test-infra/tree/master/images/bootstrap).
- An important implication of this is that if you need functionality in a certain job to change, and the functionality lives in the `bootstrap` image, you will actually need two autobump merges for it to take effect: one to update the `bootstrap` base image in `kubekins-e2e`, and one to update the subsequently built `kubekins-e2e` image in the relevant job.
- The Long Winding Path™
- [bootstrap.py](https://github.com/kubernetes/test-infra/blob/master/jenkins/bootstrap.py) - `bootstrap.py is deprecated, long live bootstrap.py!`
- Chooses from a set of [scenarios](https://github.com/kubernetes/test-infra/blob/e3f09b2907750207dd2a78f0552ab8a3d62ce7e6/jenkins/bootstrap.py#L908)
- [scenarios](https://github.com/kubernetes/test-infra/tree/master/scenarios) -- in this case we are interested in [kubernetes_e2e.py](https://github.com/kubernetes/test-infra/blob/master/scenarios/kubernetes_e2e.py)
- This sets up an environment to [run kubetest](https://github.com/kubernetes/test-infra/blob/67f589ad7c2fa2dce765a52591040cdcc0bfb69c/scenarios/kubernetes_e2e.py#L408)
- `kubetest` does some setup, then eventually we get to running [e2e.test](https://github.com/kubernetes/kubernetes/blob/fcee7a01050652e54d2819b1942533d96e40f455/hack/ginkgo-e2e.sh#L30) using [ginkgo](https://onsi.github.io/ginkgo/)
- Switching to `kubetest2`
- kubetest2 repo: https://github.com/kubernetes-sigs/kubetest2
- CI Migration KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-testing/2464-kubetest2-ci-migration
- Example PR: https://github.com/kubernetes/test-infra/pull/21180
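
The job description quoted earlier, `(-Slow|Serial|Disruptive|Flaky|Feature)`, reads as a list of test categories that this job leaves out. The sketch below shows one way such a filter could behave, assuming the categories appear as bracketed tags in test names and that the skip expression is applied as a regular expression against the full name. The pattern is modeled on the description rather than copied from the job config, and every test name except the first (which comes from the TestGrid link above) is hypothetical.

```python
import re

# Assumed skip pattern, modeled on the job description; not taken from the job config.
SKIP = re.compile(r"\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]")


def selected(test_names, skip=SKIP):
    """Return the test names that do NOT match the skip pattern."""
    return [name for name in test_names if not skip.search(name)]


tests = [
    "Pod Container Status should never report success for a pending container",
    "[sig-example] hypothetical test that restarts nodes [Disruptive]",
    "[sig-example] hypothetical alpha behavior [Feature:Example]",
    "[sig-example] hypothetical long-running test [Slow]",
]

for name in selected(tests):
    print(name)
```

Under these assumptions only the untagged test survives the filter, which is why the test of interest runs in this job.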
## Investigation
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-ubuntu-gce/1367736234834661376
```
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/node/pods.go:206
Mar 5 07:44:26.433: 1 errors:
pod pod-submit-status-2-9 on node bootstrap-e2e-minion-group-1rjx container unexpected exit code 128: start=2021-03-05 07:43:43 +0000 UTC end=2021-03-05 07:43:43 +0000 UTC reason=ContainerCannotRun message=OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: rootfs_linux.go:59: mounting "/var/lib/kubelet/pods/5a7d0450-9861-425e-98b5-af36c19b9525/volumes/kubernetes.io~projected/kube-api-access-x7lz8" to rootfs at "/var/run/secrets/kubernetes.io/serviceaccount" caused: stat /var/lib/kubelet/pods/5a7d0450-9861-425e-98b5-af36c19b9525/volumes/kubernetes.io~projected/kube-api-access-x7lz8: no such file or directory: unknown
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/internal/leafnodes/runner.go:113
```
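
When comparing several failing runs it can help to pull the structured fields out of a failure line like the one above. The following is a minimal sketch: the regular expression is inferred from this single line only, so other failures may need a different pattern, and `parse_failure` is a name chosen for this example.

```python
import re

# Failure line copied from the job output above.
LINE = (
    "pod pod-submit-status-2-9 on node bootstrap-e2e-minion-group-1rjx "
    "container unexpected exit code 128: "
    "start=2021-03-05 07:43:43 +0000 UTC end=2021-03-05 07:43:43 +0000 UTC "
    "reason=ContainerCannotRun message=OCI runtime create failed: "
    "container_linux.go:370: starting container process caused: "
    "process_linux.go:459: container init caused: rootfs_linux.go:59: "
    'mounting "/var/lib/kubelet/pods/5a7d0450-9861-425e-98b5-af36c19b9525'
    '/volumes/kubernetes.io~projected/kube-api-access-x7lz8" to rootfs at '
    '"/var/run/secrets/kubernetes.io/serviceaccount" caused: stat '
    "/var/lib/kubelet/pods/5a7d0450-9861-425e-98b5-af36c19b9525"
    "/volumes/kubernetes.io~projected/kube-api-access-x7lz8: "
    "no such file or directory: unknown"
)

# Pattern inferred from this one line; treat it as an assumption, not a spec.
PATTERN = re.compile(
    r"pod (?P<pod>\S+) on node (?P<node>\S+) "
    r"container unexpected exit code (?P<exit_code>\d+): "
    r"start=(?P<start>.+?) end=(?P<end>.+?) "
    r"reason=(?P<reason>\S+) message=(?P<message>.*)"
)


def parse_failure(line):
    """Return the named fields of a failure line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["exit_code"] = int(fields["exit_code"])
    return fields


fields = parse_failure(LINE)
for key in ("pod", "node", "exit_code", "reason", "start", "end"):
    print(f"{key}: {fields[key]}")
```

Note that `start` and `end` are identical here, which is consistent with the `ContainerCannotRun` reason: the container appears never to have started.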
In k/k:
```
$ find . -print | grep rootfs_linux.go
./vendor/github.com/opencontainers/runc/libcontainer/rootfs_linux.go
$ find . -print | grep container_linux.go
./vendor/github.com/opencontainers/runc/libcontainer/container_linux.go
./pkg/kubelet/kuberuntime/kuberuntime_container_linux.go
```
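
The `grep` above matches substrings, which is why `kuberuntime_container_linux.go` shows up alongside the runc file. A small sketch of the same lookup with exact file-name matching follows; it first extracts the `file.go:line` references from an excerpt of the error message. The helper names `frames` and `locate` are made up for this example.

```python
import os
import re

# Excerpt of the error message from the failing run above.
MESSAGE = (
    "OCI runtime create failed: container_linux.go:370: "
    "starting container process caused: process_linux.go:459: "
    "container init caused: rootfs_linux.go:59: mounting"
)


def frames(message):
    """Extract (file, line) pairs such as ('rootfs_linux.go', 59)."""
    return [(name, int(line)) for name, line in re.findall(r"(\w+\.go):(\d+)", message)]


def locate(root, filename):
    """Walk root and return paths whose base name equals filename exactly."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            hits.append(os.path.join(dirpath, filename))
    return sorted(hits)


for name, line in frames(MESSAGE):
    print(f"{name}:{line}")
```

From the root of a `k/k` checkout, `locate(".", "container_linux.go")` would list only files with exactly that name. The line numbers come from the runc build that produced the error, so they may not line up with the copy vendored in the checkout.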
https://github.com/opencontainers/runc/blob/497cd0c96ec49c7ad6959c4f1732c65a1f8070c1/libcontainer/container_linux.go#L370
https://github.com/kubernetes/kubernetes/blob/6b9379eae4bbfc15b03110a56727c5a8012fa7b5/pkg/kubelet/cm/container_manager_linux.go
https://github.com/opencontainers/runc/blob/497cd0c96ec49c7ad6959c4f1732c65a1f8070c1/libcontainer/rootfs_linux.go#L45
https://github.com/opencontainers/runc/blob/4d4d19ce528ac40cc357ef92cd3a6931dba19316/libcontainer/standard_init_linux.go#L46
https://github.com/opencontainers/runc/blob/091dd32dd1fadb09141449295e4c012474fc19dc/libcontainer/init_linux.go#L79
https://github.com/opencontainers/runc/blob/091dd32dd1fadb09141449295e4c012474fc19dc/libcontainer/factory_linux.go#L337
## Kubernetes Project Resources
Brand new to the project?
- Start here: https://www.kubernetes.dev/
Already set up and interested in maintaining tests?
- Check out [this video](https://www.youtube.com/watch?v=Ewp8LNY_qTg) from Jordan Liggitt, who describes strategies and tactics for deflaking flaky tests ([Jordan's show notes for that talk](https://gist.github.com/liggitt/6a3a2217fa5f846b52519acfc0ffece0))
Here's how the CI Signal Team actively monitors CI during a release cycle:
- [A Tour of CI on the Kubernetes Project](https://www.youtube.com/watch?v=bttEcArAjUw)
- [Show notes](https://bit.ly/k8s-ci)