## Introduce new chaosexperiments and workflowtemplates to nc-helm charts
### Helm Chart
A Helm chart is a collection of Kubernetes YAML manifests grouped into a single package to simplify the deployment of a containerized application.
Use the `tree` command inside the chart folder to see the chart structure:
```
├── Chart.yaml
├── charts
├── templates
│   ├── NOTES.txt
│   ├── _helpers.tpl
│   ├── deployment.yaml
│   ├── ingress.yaml
│   ├── service.yaml
│   └── tests
│       └── test-connection.yaml
└── values.yaml
```
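This is the default layout that `helm create` scaffolds; for example (the chart name `mychart` is arbitrary):

```sh
# Scaffold a new chart with the standard layout shown above
$ helm create mychart
```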
A Helm chart contains:
- **charts**: used for adding dependent charts.
- **templates**: contains the templated Kubernetes manifests (configuration files).
- **templates/tests**: used for adding test manifests that validate the chart.
- **values.yaml**: defines default values consumed by the templates (see the sketch below).
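As a minimal sketch of how values flow into templates (the `replicas` key and the deployment fragment are illustrative, not taken from the nc-helm charts):

```yaml
# values.yaml (hypothetical key)
replicas: 2
```

```yaml
# templates/deployment.yaml (fragment referencing the value above)
spec:
  replicas: {{ .Values.replicas }}
```

Running `helm template .` renders the manifests locally with these values substituted, without installing anything.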
### chaosexperiment helm chart
A chaos experiment contains the actual chaos details. Experiments are installed on the target Kubernetes cluster as Kubernetes custom resources (CRs).
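Once the chart is installed, the CRs can be listed with kubectl; for example, in the `aqua` namespace used throughout this page:

```sh
# List the ChaosExperiment custom resources created by the chart
$ kubectl get chaosexperiments -n aqua
```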
```
$ nc-helm/chaosexperiments# tree
.
├── Chart.yaml
├── templates
│   ├── argo-workflow-rbac.yaml
│   ├── experiment-pod-cpu-hog.yaml
│   ├── experiment-pod-delete.yaml
│   ├── experiment-pod-memory-hog.yaml
│   ├── experiment-pod-network-latency.yaml
│   ├── experiment-pod-network-loss.yaml
│   ├── litmus-chaos-rbac.yaml
│   ├── test-chaosexperiments-serviceaccount.yaml
│   └── tests
│       ├── test-chaosexperiments-pod-cpu-hog.yaml
│       ├── test-chaosexperiments-pod-delete.yaml
│       ├── test-chaosexperiments-pod-memory-hog.yaml
│       ├── test-chaosexperiments-pod-network-latency.yaml
│       └── test-chaosexperiments-pod-network-loss.yaml
└── values.yaml
```
The templates folder contains the experiments available in the chart, and the tests folder contains test experiments used to validate the chart.
To install an experiment, either edit values.yaml and set the desired experiments to true before running helm install, or override values.yaml with the --set flag on the helm command line.
Using values.yaml:

```
$ cat values.yaml
experiments:
  pod_delete:
    enabled: false   # ---> set to true
  network_latency:
    enabled: false
  network_loss:
    enabled: false
  pod_memory_hog:
    enabled: false
  pod_cpu_hog:
    enabled: false
```
Using --set with the helm install command:

```sh
$ cd chaosexperiments
$ helm install . --name=chaosexperiments --namespace=aqua \
    --set experiments.pod_delete.enabled=true,experiments.network_latency.enabled=true,experiments.network_loss.enabled=true,experiments.pod_cpu_hog.enabled=true,experiments.pod_memory_hog.enabled=true
```
By default, all chaosexperiments are set to `false` in values.yaml under the `experiments` section.
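To verify which experiments would be rendered before actually installing, a dry run can be used; a minimal sketch following the same Helm 2 syntax as above:

```sh
# Renders and prints the manifests without installing anything
$ helm install . --name=chaosexperiments --namespace=aqua --dry-run --debug \
    --set experiments.pod_delete.enabled=true
```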
Example chaosexperiment manifest template:

```yaml
{{ if .Values.experiments.<experiment-name>.enabled }}   # Enable the experiment in values.yaml
apiVersion: litmuschaos.io/v1alpha1
description:
  message: |
    <Description of experiment>
kind: ChaosExperiment
metadata:
  name: <experiment-name>
  namespace: {{ .Values.namespace }}   # Namespace of the experiment
  labels:
    name: <label>
    app.kubernetes.io/part-of: litmus
    app.kubernetes.io/component: chaosexperiment
    app.kubernetes.io/version: 1.10.0
  version: 0.1.19
spec:
  definition:
    scope: Namespaced
    image: {{ .Values.images.tags.go_runner }}   # Docker image used to perform chaos
    imagePullPolicy: {{ .Values.images.pull_policy }}   # Image download policy
    args:
      - -c
      - ./experiments -name <experiment-name>
    command:
      - /bin/bash
    env:
      - name: <key1>
        value: <value1>
      - name: <key2>
        value: <value2>
    labels:
      name: <label>
      app.kubernetes.io/part-of: litmus
      app.kubernetes.io/component: experiment-job
      app.kubernetes.io/version: 1.10.0
{{ end }}
```
To add a new experiment to the chaosexperiment helm chart:
1. Create a new experiment file in the templates folder named experiment-<experiment-name>.yaml. Chaosexperiment manifests can be downloaded from https://litmuschaos.io/.
2. Create a test experiment named test-chaosexperiments-<experiment-name>.yaml in templates/tests.
3. Add the new experiment to values.yaml under the experiments section, as shown below.
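For example, enabling a hypothetical new `disk_fill` experiment (the name is illustrative) would extend values.yaml like this:

```yaml
experiments:
  pod_delete:
    enabled: false
  # ... existing experiments ...
  disk_fill:          # hypothetical new experiment entry
    enabled: false
```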
### litmus-workflowtemplates helm chart
Litmus uses Argo WorkflowTemplates, which allow developers to reuse workflow definitions by referencing them from their Workflows.
litmus-argo-workflow-templates chart directory structure:
```
├── Chart.yaml
├── templates
│   ├── chaos_manager_service_account.yaml
│   ├── pod-cpu-hog-template.yaml
│   ├── pod-delete-template.yaml
│   ├── pod-memory-hog-template.yaml
│   ├── pod-network-latency-template.yaml
│   ├── pod-network-loss-template.yaml
│   ├── test-workflowtemplates-serviceaccount.yaml
│   └── tests
│       ├── test-workflowtemplate-pod-cpu-hog.yaml
│       ├── test-workflowtemplate-pod-delete.yaml
│       ├── test-workflowtemplate-pod-memory-hog.yaml
│       ├── test-workflowtemplate-pod-network-latency.yaml
│       └── test-workflowtemplate-pod-network-loss.yaml
└── values.yaml
```
To install the argo-workflowtemplates, either edit values.yaml and set the desired templates to true before running helm install, or override values.yaml with the --set flag on the helm command line.
Using values.yaml:

```
$ cat values.yaml
workflowTemplates:
  pod_delete:
    enabled: false   # ---> set to true
  network_latency:
    enabled: false
  network_loss:
    enabled: false
  pod_memory_hog:
    enabled: false
  pod_cpu_hog:
    enabled: false
```
Using --set with the helm install command:

```sh
$ cd litmus-argo-workflow-templates
$ helm install . --name=workflowtemplates --namespace=aqua \
    --set workflowTemplates.pod_delete.enabled=true,workflowTemplates.network_latency.enabled=true,workflowTemplates.network_loss.enabled=true,workflowTemplates.pod_cpu_hog.enabled=true,workflowTemplates.pod_memory_hog.enabled=true
```
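After installation, the templates exist as Argo CRs on the cluster and can be verified with kubectl:

```sh
# List the WorkflowTemplate custom resources created by the chart
$ kubectl get workflowtemplates -n aqua
```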
To add a new litmus-argo-workflowtemplate, create a manifest named <experiment-name>-template.yaml in the templates folder.
Example workflowtemplate:
```yaml
{{ if .Values.workflowTemplates.<experiment-name>.enabled }}   # Enable the template in values.yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: <experiment-name>
spec:
  arguments:
    parameters:
      - name: target_namespace   # Define input parameters here
      - name: pod_label
  templates:
    - name: <experiment-name>
      inputs:
        parameters:
          - name: target_namespace   # Define input parameters here
          - name: pod_label
      resource:            # indicates that this is a resource template
        action: create     # can be any kubectl action (e.g. create, delete, apply, patch)
        successCondition: status.engineStatus == completed
        failureCondition: status.engineStatus == failed
        manifest: |        # put your kubernetes spec here
          apiVersion: litmuschaos.io/v1alpha1
          kind: ChaosEngine
          metadata:
            generateName: <experiment-name>-   # generates a random name for the chaosengine resource
            namespace: {{ .Values.chaos_namespace }}   # namespace for the resource
            ownerReferences:
              - apiVersion: argoproj.io/v1alpha1
                blockOwnerDeletion: true
                controller: true
                kind: Workflow
                name: "{{`{{workflow.name}}`}}"
                uid: "{{`{{workflow.uid}}`}}"
          spec:
            appinfo:
              appns: '{{`{{inputs.parameters.target_namespace}}`}}'
              applabel: '{{`{{inputs.parameters.pod_label}}`}}'
            # It can be true/false
            annotationCheck: 'false'
            # It can be active/stop
            engineState: 'active'
            # ex. values: ns1:name=percona,ns2:run=nginx
            auxiliaryAppInfo: ''
            chaosServiceAccount: {{ .Values.chaos_service_account }}
            monitoring: false
            # It can be delete/retain
            jobCleanUpPolicy: '{{`{{inputs.parameters.cleanup_policy}}`}}'
            components:
              runner:
                {{- with .Values.nodeSelector }}
                nodeSelector:
                  {{- toYaml . | nindent 18 }}
                {{- end }}
            experiments:
              - name: pod-delete
                spec:
                  components:
                    {{- with .Values.nodeSelector }}
                    nodeSelector:
                      {{- toYaml . | nindent 22 }}
                    {{- end }}
                    env:
                      # set chaos duration (in sec) as desired
                      - name: TOTAL_CHAOS_DURATION
                        value: '{{`{{inputs.parameters.duration}}`}}'
                      # set chaos interval (in sec) as desired
                      - name: CHAOS_INTERVAL
                        value: '{{`{{inputs.parameters.chaos_interval}}`}}'
                      # pod failures without '--force' & default terminationGracePeriodSeconds
                      - name: FORCE
                        value: '{{`{{inputs.parameters.force}}`}}'
{{ end }}
```
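Note the backtick escaping around Argo's placeholders such as {{workflow.name}} above: Helm would otherwise try to evaluate Argo's {{...}} expressions itself, so wrapping them in backticks makes Helm emit them literally. A quick way to confirm the escaping survives rendering, assuming the same chart layout as above:

```sh
# The rendered output should contain the literal Argo placeholder {{workflow.name}}
$ helm template . --set workflowTemplates.pod_delete.enabled=true | grep 'workflow.name'
```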
The workflowtemplate can be tested by creating a simple workflow and adding a templateRef to it:
```
$ cat workflow.yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pod-network-latency-
spec:
  serviceAccountName: litmus-workflow
  entrypoint: pod-network-latency
  templates:
    - name: pod-network-latency
      steps:
        - - name: pod-network-latency
            templateRef:                      # template reference
              name: pod-network-latency       # WorkflowTemplate name
              template: pod-network-latency   # template name within it
            arguments:
              parameters:
                - name: target_namespace
                  value: "default"
                - name: pod_label
                  value: "name=networkchaos"
```
Make sure the Argo and Litmus pods are running.
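A quick check, assuming both are deployed in the `aqua` namespace:

```sh
# Both the Argo workflow controller and the Litmus operator pods should be Running
$ kubectl get pods -n aqua | grep -E 'argo|litmus'
```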
Install the chaosexperiment and run:

```sh
$ kubectl apply -f workflow.yaml -n aqua
```
Verify the workflow:

```
$ kubectl get wf -n aqua
NAME                        STATUS      AGE
pod-network-latency-br26d   Succeeded   4m50s
```
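If the workflow does not reach Succeeded, describing it shows the per-step status and events; a sketch using the workflow name from the output above:

```sh
$ kubectl describe wf pod-network-latency-br26d -n aqua
```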
Running helm test for the chaosexperiments chart:

```
$ helm ls
NAME                REVISION   UPDATED                    STATUS     CHART                                  APP VERSION   NAMESPACE
chaosexperiments    1          Fri May 28 04:36:52 2021   DEPLOYED   chaosexperiments-0.1.0                 1.0           aqua
litmus              1          Fri May 28 04:16:20 2021   DEPLOYED   litmus-1.11.0                          1.11.0        aqua
workflowtemplates   1          Fri May 28 04:37:11 2021   DEPLOYED   litmus-argo-workflow-templates-0.1.0   1.0           aqua

$ helm test chaosexperiments
RUNNING: test-pod-network-loss-chaosexperiment
PASSED: test-pod-network-loss-chaosexperiment
RUNNING: test-pod-network-latency-chaosexperiment
PASSED: test-pod-network-latency-chaosexperiment
RUNNING: test-pod-delete-chaosexperiment
PASSED: test-pod-delete-chaosexperiment
```
Running helm test for the litmus-argo-workflow-templates chart:

```
$ helm test workflowtemplates
RUNNING: test-pod-network-loss-workflowtemplate
PASSED: test-pod-network-loss-workflowtemplate
RUNNING: test-pod-network-latency-workflowtemplate
PASSED: test-pod-network-latency-workflowtemplate
RUNNING: test-pod-delete-workflowtemplate
PASSED: test-pod-delete-workflowtemplate
```
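Helm 2 leaves the test pods behind after a run; they can be removed automatically by passing --cleanup:

```sh
# Delete the test pods once the tests have finished
$ helm test workflowtemplates --cleanup
```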