Best Practices for Implementing Automation Projects

### Best Practices for Automation Projects

slide: https://hackmd.io/@kvegh/automation_best_practices/

---

### TL;DR:

The intention here is to collect key strategies to focus on during automation projects.

---

#### #1 Minimize the number of automation frameworks
***
If you can, reduce to one single tool. The more teams speak the same markup language, the better the collaboration. **Maximize collaboration**.

/* It's easy to get distracted by highly focused tools for specific use cases. However, the added complexity doesn't pay off, and each additional tool grows the maintenance effort of the automation framework exponentially. */

In other words: aim for joining automation efforts **across teams**, in all IT stack layers. **Minimize the number of tools, maximize their scope**.

---

#### #2 Aim for [Infrastructure as Code](https://www.redhat.com/en/topics/automation/what-is-infrastructure-as-code-iac)
***
Describing your infrastructure with versionable code allows you to:
* Document and review changes in the deployment code (also retrospectively)
* Version infrastructure definition releases
* Rebuild infrastructure at any time (allowing just-in-time DR and multicloud migrations)
* Regularly validate the running configuration

---

#### #3 Aim for lowest requirements on the managed node side
***
Deploying agents can be undesirable (updates, security) or not possible at all (hypervisors, cloud instances, network elements, application instances...). Use standard connections and APIs instead.

---

#### #4 Research the Vendor support for technologies to be automated
***
There is no need to build everything from scratch. To automate different use cases efficiently, ensure that the technology vendor itself is involved in creating the automation content or best practices for its own solutions.

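
As a rough illustration of the "regularly validate the running configuration" point from #2, the sketch below compares a desired state (as it would be stored in the versioned IaC repository) with the state reported by a managed node. Everything here is an assumption made for illustration: the function name `find_drift`, the `ABSENT` marker and the setting names are hypothetical, and a real setup would fetch the running state over a standard connection or API as suggested in #3.

```python
from typing import Any

# Hypothetical marker for "this setting does not exist on that side".
ABSENT = "<absent>"


def find_drift(desired: dict[str, Any], running: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (desired_value, running_value)} for every setting that differs."""
    drift: dict[str, tuple[Any, Any]] = {}
    # Look at the union of keys so that extra and missing settings are both reported.
    for key in sorted(desired.keys() | running.keys()):
        want = desired.get(key, ABSENT)
        have = running.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


# Illustrative data only: in practice "desired" comes from the IaC repository
# and "running" is collected from the managed node.
desired = {"ntp_server": "ntp.example.com", "ssh_root_login": "no", "max_sessions": 10}
running = {"ntp_server": "ntp.example.com", "ssh_root_login": "yes", "debug": True}

for setting, (want, have) in find_drift(desired, running).items():
    print(f"{setting}: desired={want!r} running={have!r}")
```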

---

#### #5 Define and DOCUMENT [the use cases to be automated](http://people.redhat.com/kvegh/slidedecks/Finding%20the%20right%20automation%20usecases%20to%20start%20with%20-%20German+English.pdf) VERY CLEARLY
***
Automation projects stand and fall with the use case definitions - and with keeping the focus on them. Prioritize them, and plan the timeline for rolling them out.

---

#### #6 Automated use cases shall be consumable in a self-service manner
***
Define interfaces where users/customers with the right roles can trigger automated jobs, preferably without human intervention. Provide an automation API for integration with existing tools.

---

#### #7 Auditability and logging
***
The outcome of every triggered automation job shall be collected centrally for review and/or auditing purposes. Consider implementing an auditor/reviewer role.

---

#### #8 Enable modular workflows
***
Building automation jobs as modules allows chaining different jobs together into workflows (e.g. VM creation -> OS setup -> OS update -> OS config -> storage config -> hardening -> App deployment -> CMDB entries -> notifications).

---

#### #9 Provide a way to schedule jobs
***
Some jobs need to run regularly (e.g. validating the currently running configuration).

---

#### #10 Plan for regular validation checks of currently running configurations (config management)
***
Title says it all

---

#### #11 Consider security aspects
***
* **IF agentless; THEN** credential management for automation login **; FI**
* Credential management for additional authentications at runtime
* Consider external vaults
* Role Based Access Control for Automation Engineers **AND TEAMS**
* Central user authentication from external sources

---

#### #12 Architectural and Load Scalability
***
* Are distributed automation job execution instances necessary?
* Does the automation workload justify a distributed architecture?
* How to guarantee a consistent REMOTE runtime environment for the automation job to be executed?

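
Looking back at the modular workflows of #8 and the central logging of #7, a minimal sketch of the idea could look like the following. It assumes each job is a small function that receives a context dictionary and returns an updated one. The job names (`create_vm`, `setup_os`, `harden`) and the `run_workflow` helper are hypothetical and only stand in for real automation jobs.

```python
from typing import Callable

# A job takes the workflow context and returns an updated copy of it.
Job = Callable[[dict], dict]


def create_vm(ctx: dict) -> dict:
    return {**ctx, "vm": f"vm-{ctx['name']}"}


def setup_os(ctx: dict) -> dict:
    return {**ctx, "os": "installed"}


def harden(ctx: dict) -> dict:
    # A job may depend on the result of an earlier job in the chain.
    if ctx.get("os") != "installed":
        raise RuntimeError("OS must be installed before hardening")
    return {**ctx, "hardened": True}


def run_workflow(jobs: list[Job], ctx: dict) -> tuple[dict, list[str]]:
    """Run jobs in order, stop at the first failure, and record every outcome."""
    audit_log: list[str] = []
    for job in jobs:
        try:
            ctx = job(ctx)
        except Exception as exc:
            audit_log.append(f"{job.__name__}: FAILED ({exc})")
            break
        audit_log.append(f"{job.__name__}: ok")
    return ctx, audit_log


# The same modules can be chained into different workflows.
result, log = run_workflow([create_vm, setup_os, harden], {"name": "web01"})
print(result)
print(log)
```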

---

#### #13 Runtime Parameters
***
If automation jobs require custom runtime parameters, what is the source of those? (CMDB? Git? User?)

---

#### #14 Single sources of truth
***
* Where does the approved/tested/certified automation content come from?
* Where do I get the up-to-date number and data of the managed nodes (inventory)?
* Where do I authenticate my users from?
* Where do the automation credentials come from?

---

#### #15 Testing?
***
* Do I test my automation when new automation content is checked into the automation repository?

---

#### #16 Automating automation
***
Manage your Automation Platform like any other managed node - have its configuration checked into the IaC repository.

---

#### #17 Markup Language Corp. standards/coding style guide
***
Define your local automation standards for the IaC code:
* Define variable naming and coding style
* Enforce reusability and modularity
* Strive for code readability
* See: https://docplayer.net/53699914-Ansible-best-practices-for-startups-to-enterprises.html
* And: http://people.redhat.com/kvegh/slidedecks/Ansible_Best_practices_2019_kvegh_v2.pdf

---

#### #X
***