# ANT CHAOS Whiteboard Session (2.2.21)

## Mobility Questions

1. Repo for resilient TCs
2. Deployment/installation process
3. Quality gates and process
   1. Unit testing
   1. Linting
4. Things we need for test case templates
   1. Support start and stop with max timer
   1. Base template
5. ~~ANT instance (ANT23? other VM?)~~ Confirmed: ANT23

## The code is grouped into 3 sections of responsibility

### ANT code side (2/8/21)

1. Infra
   1. Repo direction: maybe nc-ant-test-cases
   1. Abstract starting direction for the support data structure and chaos action template
1. High-level AC
   1. README.md in the ANT test case directory that points to the ANT wiki and the information we need to understand this project (the ANT repo sci has an example that can be used: https://gerrit.mtn5.cci.att.com/gitweb?p=aic-aqa-ant.git;a=tree;h=refs/heads/main;hb=refs/heads/main ; the idea is to have all info out there to reduce questions)
   1. Linting and gating
      1. Standard Makefile (Michelle to add wiki pages for reference) https://wiki.web.att.com/display/CCPdev/AQuA+Team+make+standards
         1. Call code-review, which will have (ANT sci has an example):
            1. tox -p https://wiki.web.att.com/display/CCPdev/AQuA+Team+tox+standards
            1. shellcheck
            1. Lint the JSON schema of the project
   1. Build process for the ANT package (get this running locally first)
      1. Bring together the schema and zip it up
      1. Standard tox
         1. yapf
         1. Other lint
1. Tech debt
   1. The return code of the chaos test case is hardcoded
   1. Change the workflow.yaml into one YAML per chaos event
   1. Create a factory method for the chaos class
   1. I would like to see a sequence diagram of the client-side ANT code (e.g. from a factory abstract class)
   1. Abstract the workflow.yaml data so we are not using ANT data structures, so others can use it outside of ANT, and so names are not limited. Create a workflow template.
1. What about versioning? How will we know what version of the test is loaded into the ANT instance?
   Version is a field; think about adding a date and timestamp for tracking, to eventually have build numbers. A discussion with ANT is needed to determine what is supported.
1. Mobility wants
   1. Chaos for the AMF control plane in the sub-cluster
1. Questions
   1. How will the ANT side of the code get deployed in sites like
      1. ant225
   1. Can we get the schema for the ANT JSON so we can test before we upload?

### Mobility Sub-Cluster

1. Install the necessary software to do chaos
2. Software via helm
3. argo
4. Litmus
5. Questions
6. What is the deployment process for the sub-cluster in question?

### Network Cloud Cluster

1. Get base code working in a Network Cloud deployment
   1. Install Litmus by hand on mtn65b in the aqua namespace
   2. Run a small demo of running a test using the k8s interface
   3. Get Litmus installed as part of NC in the aqua namespace
2. Create EPICs for the set of chaos testing events for Mobility
3. Chaos event for SRE
   1. Darren and team: let's talk about this to get a win-win for all

- `Node Restart (can we target compute nodes with specific mobility workloads?)`
- `Pod Network Latency (coredns -- mobility uses their own dns)`
- `Pod Network Loss`

### Mobility sub cluster

1. What to install and how to install the needed software

### Validation or monitor

## EPIC we want to write

## Action

1. Michelle to create EPIC and US from item 1 in Network Cloud Cluster

## Notes

1. On stop, stop the workflow item and allow the client to pull logs.
2. ANT Note
3. One zip file per test case so
4. One JSON per case

## Questions For Mobility

1. Minimum read access to, and contact info for, the schemas being used.
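
Two items from the ANT code side notes above can be sketched together: the base test case template that supports start and stop with a max timer, and the factory method for the chaos class. The following is only a minimal illustration under stated assumptions, not the ANT client code. Every name in it (`ChaosTestCase`, `register`, `create_chaos`, `PodNetworkLoss`, `work_seconds`) is hypothetical, and `inject` is a placeholder that would be replaced by whatever actually drives Litmus or argo. The return code comes from the run itself rather than being hardcoded.

```python
import threading
from abc import ABC, abstractmethod


class ChaosTestCase(ABC):
    """Hypothetical base template: start/stop with a max timer."""

    def __init__(self, max_seconds: float):
        self.max_seconds = max_seconds
        self.return_code = None  # filled in by the run, not hardcoded
        self._stop_event = threading.Event()

    def start(self) -> int:
        """Run the chaos action; the timer calls stop() after max_seconds."""
        self._stop_event.clear()
        timer = threading.Timer(self.max_seconds, self.stop)
        timer.daemon = True
        timer.start()
        try:
            self.return_code = self.inject(self._stop_event)
        finally:
            timer.cancel()  # no-op if the timer already fired
        return self.return_code

    def stop(self) -> None:
        """Ask the running action to stop (called by the timer or a client)."""
        self._stop_event.set()

    @abstractmethod
    def inject(self, stop_event: threading.Event) -> int:
        """Perform the chaos action and return its return code."""


# Factory: chaos classes register under a name and are created by name.
_REGISTRY = {}


def register(name: str):
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap


def create_chaos(name: str, **kwargs) -> ChaosTestCase:
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown chaos event: {name!r}") from None
    return cls(**kwargs)


@register("pod-network-loss")
class PodNetworkLoss(ChaosTestCase):
    """Placeholder action: simulated work stands in for the real injection."""

    def __init__(self, max_seconds: float, work_seconds: float = 0.0):
        super().__init__(max_seconds)
        self.work_seconds = work_seconds

    def inject(self, stop_event: threading.Event) -> int:
        # wait() returns True if stop was requested before the work finished.
        interrupted = stop_event.wait(timeout=self.work_seconds)
        return 1 if interrupted else 0


if __name__ == "__main__":
    quick = create_chaos("pod-network-loss", max_seconds=1.0, work_seconds=0.0)
    print("quick run return code:", quick.start())
    slow = create_chaos("pod-network-loss", max_seconds=0.05, work_seconds=5.0)
    print("timed-out run return code:", slow.start())
```

Registering classes by name keeps one class per chaos event, which lines up with the tech debt item about one YAML per chaos event; whether ANT would look events up this way is an open question for the discussion with ANT.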