# OCP EDA for auto oc adm inspect

## Add bastion info in AAP

1. Create Inventory - Bastion
2. Add host in the inventory
3. Create credential for bastion

## Create new playbook for oc adm inspect

1. On the bastion, clone the repository (if it is already cloned, run `git pull` inside it instead):

```
git clone https://gitea.apps.cluster-5q2s5.5q2s5.sandbox1411.opentlc.com/lab-user/event-driven-ansible
cd event-driven-ansible/automation_controller
```

2. Create the playbook, either by hand with `vi oc-inspect.yml` or with the heredoc below:

```bash
cat <<'EOF' > oc-inspect.yml
- name: oc adm inspect
  hosts: bastion.9gfrm.sandbox2486.opentlc.com
  gather_facts: no
  vars:
    ns: "{{ ansible_eda.event.resource.metadata.namespace }}"
  tasks:
    - name: Create inspect file
      shell: "rm -rf ./inspect.local.{{ ns }} && oc adm inspect ns/{{ ns }} --dest-dir=./inspect.local.{{ ns }} --kubeconfig /home/lab-user/.kube/config"
      register: lsout
    - name: Compress with tar
      command: "tar -czf inspect.local.{{ ns }}.tar.gz inspect.local.{{ ns }}"
      when: lsout.rc == 0
    - name: Create support case
      shell: |
        RH_PORTAL_TOKEN=$(<rh-customer-portal-token)
        TOKEN=$(curl https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
          -d grant_type=refresh_token -d client_id=rhsm-api \
          -d refresh_token=$RH_PORTAL_TOKEN | jq --raw-output .access_token)
        response=$(curl -sS -X POST -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" --data '{
          "product": "OpenShift Container Platform",
          "version": "4.12",
          "caseType": "RCA Only",
          "description": "My pod crashed last night, I was wondering about RCA",
          "environment": "staging",
          "caseLanguage": "zh_TW",
          "severity": 3,
          "summary": "Summary message here."
        }' "https://api.access.redhat.com/support/v1/cases")
        echo "$response" | jq -r '.location[0] | capture("/cases/(?<case_no>[0-9]+)") | .case_no'
      register: case_number
      when: lsout.rc == 0
    - name: Upload logs to support case
      shell: |
        RH_PORTAL_TOKEN=$(<rh-customer-portal-token)
        TOKEN=$(curl https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
          -d grant_type=refresh_token -d client_id=rhsm-api \
          -d refresh_token=$RH_PORTAL_TOKEN | jq --raw-output .access_token)
        CASE_NO={{ case_number.stdout }}
        curl -X POST -F "file=@inspect.local.{{ ns }}.tar.gz" -H "Authorization: Bearer $TOKEN" \
          https://api.access.redhat.com/support/v1/cases/${CASE_NO}/attachments
      when: lsout.rc == 0 and case_number.stdout is defined
EOF
```

```bash
# Replace the value of your_bastion_host with your own bastion host
your_bastion_host="bastion.4lf5g.sandbox786.opentlc.com"
sed -i "s/bastion\.9gfrm\.sandbox2486\.opentlc\.com/$your_bastion_host/g" oc-inspect.yml
```

## For test

This variant uses a hard-coded `response` instead of calling the case-creation API.

```yaml
- name: oc adm inspect
  hosts: bastion.9gfrm.sandbox2486.opentlc.com
  gather_facts: no
  vars:
    ns: "{{ ansible_eda.event.resource.metadata.namespace }}"
  tasks:
    - name: Create inspect file
      shell: "oc adm inspect ns/{{ ns }} --dest-dir=./inspect.local.{{ ns }} --kubeconfig /home/lab-user/.kube/config"
      register: lsout
    - name: Compress with tar
      command: "tar -czf inspect.local.{{ ns }}.tar.gz inspect.local.{{ ns }}"
      when: lsout.rc == 0
    - name: Create support case
      shell: |
        RH_PORTAL_TOKEN=$(<rh-customer-portal-token)
        TOKEN=$(curl https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
          -d grant_type=refresh_token -d client_id=rhsm-api \
          -d refresh_token=$RH_PORTAL_TOKEN | jq --raw-output .access_token)
        response='{"location":["https://access.redhat.com/hydra/rest/v1/cases/03643603"]}'
        echo "$response" | jq -r '.location[0] | capture("/cases/(?<case_no>[0-9]+)") | .case_no'
      register: case_number
      when: lsout.rc == 0
    - name: Upload logs to support case
      shell: |
        RH_PORTAL_TOKEN=$(<rh-customer-portal-token)
        TOKEN=$(curl https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token \
          -d grant_type=refresh_token -d client_id=rhsm-api \
          -d refresh_token=$RH_PORTAL_TOKEN | jq --raw-output .access_token)
        CASE_NO={{ case_number.stdout }}
        curl -X POST -F "file=@inspect.local.{{ ns }}.tar.gz" -H "Authorization: Bearer $TOKEN" \
          https://api.access.redhat.com/support/v1/cases/${CASE_NO}/attachments
      when: lsout.rc == 0 and case_number.stdout is defined
```

3. Push

```
git add *
git commit -am "playbook for oc adm inspect"
git push
```

## Create job template in AAP

1. Sync the AAP Project so that it picks up the update from the Git server
2. Add a new template named `oc-inspect`

## Create new rulebook runner

```
cd /opt/podman/eda/
cp -r resource_quota oc_inspect
```

- Replace `resource_quota` with `oc_inspect`

Or do the replacement from the command line:

```bash
find ./oc_inspect/ -type f -exec grep -Iq . {} \; -print0 | xargs -0 sed -i "s/resource_quota/oc_inspect/g"
```

- Replace rulebook.yml with the following

```bash
cd oc_inspect
```

```bash
cat <<'EOF' > rulebook.yml
---
- name: Listen for unhealthy+warning event
  hosts: all
  sources:
    - sabre1041.eda.k8s:
        api_version: v1
        kind: Event
        namespace: jace # replace with the namespace you plan to watch
  rules:
    - name: Debug
      condition: event.resource.reason == "Unhealthy" and event.resource.type == "Warning"
      throttle:
        once_within: 5 minutes
        group_by_attributes:
          - event.resource.reason
          - event.resource.metadata.name
      action:
        run_job_template:
          name: oc-inspect # must match the template name in AAP
          organization: Default
EOF
```

To change the monitored namespace, run:

```bash
my_namespace=my-new-ns
sed -i "s/jace/$my_namespace/g" rulebook.yml
```

```bash
cd /opt/podman/eda/oc_inspect
sudo podman-compose up
```

## Test (create a pod whose liveness probe fails automatically)

```bash
oc new-project jace
cat << EOF | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    securityContext:
      allowPrivilegeEscalation: false
      seccompProfile:
        type: RuntimeDefault
      capabilities:
        drop:
        - ALL
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
EOF
```

## Alternative - using deployment & configmap

```bash
# html-configmap.yaml
cat << EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: html-content
data:
  index.html: |
    <h1>Hello world! Welcome to K8s Summit 2023</h1>
EOF
```

```bash
# nginx-hello-world.yaml
cat << EOF | oc apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hello-world
  labels:
    app: nginx-hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-hello-world
  template:
    metadata:
      labels:
        app: nginx-hello-world
    spec:
      volumes:
      - name: html-volume
        configMap:
          name: html-content
      containers:
      - name: nginx
        image: "quay.io/redhattraining/hello-nginx:v1.0"
        securityContext:
          allowPrivilegeEscalation: false
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: html-volume
          mountPath: /usr/share/nginx/html
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - curl -s http://localhost:8080 | grep -q "world"
          initialDelaySeconds: 5
          periodSeconds: 5
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        tty: true
        stdin: true
      serviceAccount: default
      terminationGracePeriodSeconds: 5
EOF
```

```bash
oc expose deploy/nginx-hello-world --port 8080
oc expose svc nginx-hello-world
#oc create route edge nginx-route --service=nginx-service
```

## Verify that the inspect succeeded

# References

Rulebook syntax:

- https://www.redhat.com/en/topics/automation/what-is-an-ansible-rulebook
- https://ansible.readthedocs.io/projects/rulebook/en/stable/rules.html
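
For the verification step, one possible check on the bastion is to confirm that the archive written by the "Compress with tar" task exists and can be listed. The sketch below is not part of the original playbook: the function name `check_inspect_archive` is hypothetical, and it assumes the playbook ran in the remote user's home directory, since the tasks use relative paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not from the original notes):
# check that inspect.local.<namespace>.tar.gz exists and is readable.
# $1 = namespace, $2 = directory to look in (assumed default: $HOME).
check_inspect_archive() {
  local ns="$1"
  local dir="${2:-$HOME}"
  local archive="${dir}/inspect.local.${ns}.tar.gz"

  # -s is true only if the file exists and is not empty.
  if [ ! -s "$archive" ]; then
    echo "missing or empty: $archive" >&2
    return 1
  fi

  # tar -tzf lists the contents without extracting; a non-zero exit
  # status means the archive is corrupt or truncated.
  if ! tar -tzf "$archive" > /dev/null; then
    echo "unreadable archive: $archive" >&2
    return 1
  fi

  echo "ok: $archive"
}
```

With the namespace used in these notes, the call would be `check_inspect_archive jace`. This only confirms the local archive; whether the attachment reached the support case still has to be checked in the customer portal.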