Hello everyone. My name is Kevin Minehart. I'm not an extremely accredited developer or author, just an average software engineer who's at his wits' end with YAML. In my day-to-day I use Go, and that's what I prefer.
I have barely published any blog posts or articles. This, however, is a topic that I am very passionate about and I'm excited to share.
### Introduction
I was on a small team of three working for my county government when I was introduced to continuous integration. I started with JetBrains TeamCity and introduced git to our squad; my main interest was taking the code that we committed and getting it compiled and copied to the correct network drive, never mind all of that automated testing business. That was way too advanced for us. Once it started working, I was sold. I knew this was the niche I wanted to carve out. For me, there was a lot more fulfillment in making these difficult processes easier and faster for my peers than there was in adding new features to our dispatch log program.
To set the stage about terms I'll be using in this post, what I set up was a deployment **pipeline**. This pipeline consisted of three basic **steps**:
1. Clone the project
2. Compile the project
3. Move the compiled artifact(s)
### Configuration-Based Pipelines
Over the next several years and jobs, I was introduced to tools like Travis CI, CircleCI, GitLab CI, and Drone, which all fall into a category I call **configuration-based** pipelines. These services and tools follow the same general principles, with different features or some unique twists.
1. All pipelines are configured using a simple markup language, like YAML.
2. For the most part, they are all driven by Docker.
* Each step or pipeline runs in a Docker container. The step typically defines the image as well as the command.
3. Each step allows the user to define a shell script to run. If that script returns 0, then the step passes.
4. Steps can happen one after another, or in parallel.
5. Pipelines can also be run in parallel or in sequence.
* For example, you may have a "deploy" or "upload" pipeline that runs only after the "test-frontend" and "test-backend" pipelines have passed.
6. They offer some mechanism of sharing data between steps.
* For example, you may have a "install dependencies" step that installs the necessary dependencies before the next steps are ran.
* The data that is shared is not always explicitly exported or imported.
All of these processes are defined in a configuration language. From here on out I'll refer exclusively to YAML; however, these issues are prevalent in any configuration language. Here's a pseudo-code example:
%[https://gist.github.com/kminehart/660ce485c850ff0a3daaf9747075ef26]
This is a summary of what this might represent:
* When a PR is submitted:
* The backend and frontend are tested
* The backend code is compiled
* The frontend code is compiled
* When a PR is merged to main / when a commit lands on main:
* The backend and frontend are tested
* The server & client code is compiled
* The frontend code is compiled
* Upon approval:
* The server, client, and frontend are uploaded (released)
Those who are familiar with this type of CI pipeline will immediately groan, reminded of all the issues they've experienced with it.
#### Reproducibility & Development
If an issue occurs, or a modification needs to be made, there's no straightforward way to run the whole pipeline locally. While many services will claim it's an item on their roadmap or provide a half-baked solution, they often fall short.
Thus begins the arduous development cycle; in order to test structural changes, they need to be committed to the git repository. `git add ci.yaml && git commit -m "maybe this time" && git push`. Wait 10 minutes for the CI service to arrive at the relevant part, and then... failure. Make a small change, and try again.
We've all had hour-long blocks in our commit histories that look like this:

#### Maintainability
Tooling is lacking. Some questions are difficult to answer but should be easy to solve with proper tooling. For example:
* What if you want to bring your own Docker image?
* How do I reuse content in a single pipeline?
* How do I reuse content across multiple pipelines?
* How do I cache data between steps?
* How do I write tests for my pipeline that cover less frequently used scenarios?
As pipeline developers and their pipelines continue to grow and these questions surface, they find answers and predictably land on some common tooling.
Reusability is achieved by using an external tool like `make` or `mage`, or even the `scripts` object in `package.json`. This doesn't completely solve the problem:
* Your CI pipeline still needs to define a Docker image with `make` available.
* Every step has to have that image defined as a string literal.
* Every step and pipeline is a DAG, and must define the names of steps or pipelines that they depend on. If these names change, then those changes must be propagated.
To take it a step further, one might introduce a templating layer, either something really simple like `envsubst`, or something really complex like `starlark` or `jsonnet`.
The stack continues to grow, and suddenly maintaining the pipeline is less about pipeline maintenance and more about maintaining the tooling that generates the pipeline.
Developers may find they need a more robust programming language than bash for developing their pipelines. The benefit of importing shared libraries or packages outweighs the additional complexity. After all, these problems are almost ubiquitously solved by programming languages already.
So instead of `make`, they might use `mage`, or `nodejs` and scripts in `package.json`, or they might make their own program. After this transition, however, they're left with the same issues: the thing that defines the order of operations is still a flat, complex, messy configuration file.
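To make that concrete, here's a minimal Magefile sketch of that intermediate layer (the target names and paths are illustrative assumptions, not taken from the example project above):

```go
//go:build mage

// A minimal, hypothetical Magefile. The "real" logic moves into Go, but the
// CI config still has to reference these targets by name in each step.
package main

import (
	"github.com/magefile/mage/mg"
	"github.com/magefile/mage/sh"
)

// Install downloads the Go module dependencies.
func Install() error {
	return sh.Run("go", "mod", "download")
}

// Test runs the backend tests after Install.
func Test() error {
	mg.Deps(Install)
	return sh.Run("go", "test", "./...")
}

// BuildServer compiles the server binary after Test.
func BuildServer() error {
	mg.Deps(Test)
	return sh.Run("go", "build", "-o", "./bin/server", "./cmd/server")
}
```

Each CI step still has to run `mage test` or `mage buildserver` inside an image that has `mage` installed, which is exactly where the string-literal images and duplicated step names creep back in.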
**Summary**
* CI pipeline configs offer little in terms of reusability, and even less in reproducibility.
* Developing a CI pipeline is a messy chore of making hundreds of small commits until something eventually works.
* Configuration-based pipelines are difficult to maintain, and as they grow, they become more and more cumbersome to meaningfully modify.
* Tooling is introduced to make maintaining pipelines easier, but the tooling is also difficult to maintain.
#### Observability
Observability varies wildly depending on the services used. Application observability stacks are great: metrics, logs, and traces, all with a unified view in Grafana. Some CI services offer little in the way of what's now expected, and at the later stages of CI pipeline development, developers want to start treating their pipelines like an integral part of the overall system, just like any other service.
Developers want metrics, but it's not clear how to get them from the service they're using. Some may offer a Prometheus-compatible metrics endpoint; others may not.
What about logs? Without having direct access to the log file or the container's `stdout` stream, you have to push logs directly from the pipeline.
And traces? Good luck. What you see in the views on their website is what you get. At least developers can set up Slack notifications whenever a pipeline fails.
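For what it's worth, "pushing logs directly from the pipeline" usually means a small, hand-rolled helper. Here's a sketch (purely illustrative, not an API offered by any CI service) that emits logfmt-style lines so whatever collects the job's output can ship them to Loki:

```go
package main

import (
	"fmt"
	"time"
)

// logStep prints one logfmt-style line per step so a log collector reading
// this job's output can ship it to Loki and make it queryable by step and status.
func logStep(step string, start time.Time, err error) {
	status := "ok"
	if err != nil {
		status = "error"
	}
	fmt.Printf("ts=%s step=%q status=%s duration=%s\n",
		time.Now().UTC().Format(time.RFC3339), step, status, time.Since(start))
}

func main() {
	start := time.Now()
	// ... run the "go-test" step here ...
	logStep("go-test", start, nil)
}
```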
### The Concept: Code-Driven Pipelines
We can categorize pipeline failures in 4 ways:
* **Intermittent failures**, like flaky tests.
* **Legitimate failures**. Tests that should fail when something is actually wrong.
* **Logic errors**. Things in pipeline scripts that don't match our intentions. Think of anything that could be prevented with a unit test (a sketch of one follows this list).
* **Configuration errors**. These happen whenever we've misconfigured the CI pipeline in a way that causes a failure.
* For pipeline developers, this will be the most common source of frustration. These failures can include:
* Using the wrong Docker image.
* Misconfiguring the DAG. A step depends on the wrong step or a step that doesn't exist.
* Syntax errors.
* Providing the wrong arguments to a command.
* Misconfiguring environment variables or using the wrong environment variable name.
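To illustrate the unit-test point above: if the pipeline were defined in code, a misconfigured DAG could be caught before anything runs. This is a hypothetical sketch; the step map stands in for whatever the real pipeline definition exposes:

```go
package pipeline

import "testing"

// steps maps a hypothetical step name to the names of the steps it depends on.
// In a pipeline-as-code setup, this data would come from the pipeline definition itself.
var steps = map[string][]string{
	"go-mod-install":  {},
	"go-test":         {"go-mod-install"},
	"go-build-server": {"go-test"},
	"go-build-client": {"go-test"},
}

// TestStepDependenciesExist fails if any step depends on a step that doesn't exist,
// catching the misconfiguration long before a CI run does.
func TestStepDependenciesExist(t *testing.T) {
	for name, deps := range steps {
		for _, dep := range deps {
			if _, ok := steps[dep]; !ok {
				t.Errorf("step %q depends on %q, which is not defined", name, dep)
			}
		}
	}
}
```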
**We want to do everything we can to reduce the amount of failures we encounter that aren't legitimate test failures.**
We can accomplish that by leveraging existing tooling to create a CI pipeline. Rather than write scripts at a layer below the CI configuration that still forces us to use its configuration language, we can use the CI configuration as an artifact or output of our "pipeline as code" process.
Without referencing any actual libraries or packages, here's the example from the beginning of the article, written as a Go program.
The syntax and formatting might not be perfect, and this isn't a real-life example, just a concept.
%[https://gist.github.com/kminehart/332442c764d6343bfafba1fda16db6be]
Since this is a Go program (it is in package `main` and has a `func main()`), it is compilable and runnable. To run our pipeline, we can just run the program:
```
$ go run ./pipeline
Running pipeline...
Please select an event or provide the '-event' argument:
[ ] Pull Request
[x] Commit
In the future you can provide the event with the '-event=commit' argument.
Using current repository and ignoring filters...
[\] go mod install..
[/] yarn install...
```
After about 10 minutes...
```
[✓] go mod install
[✓] go test backend
[✓] go build server
[✓] go build client
[✓] yarn install
[✓] yarn test
[✓] yarn build
[y/N] Continue with release?
N
```
And so on. You can use your imagination to see how this might look as it runs the entire pipeline, or in the event that the pipeline requires some secrets in order to run.
We can use this same program to generate a config file for our CI service. While it will run on our provider's remote servers, it will behave the same way it does locally, with some assumptions and restrictions.
```
$ go run ./pipeline -client=my-ci-provider
```
```yaml
volumes:
  - name: bin
    kind: tmp
  - name: state
    kind: tmp
pipelines:
  - name: build-go
    when:
      - event: commit
        branch: main
    steps:
      - name: compile-pipeline
        run: go build -o ./bin/pipeline ./pipeline
        env:
          - GOOS=linux
          - GOARCH=amd64
          - CGO_ENABLED=0
        image: golang:1.18
      - name: go-mod-install
        run: ./bin/pipeline -p "build-go" -s "go-mod-install"
        image: golang:1.18
      - name: go-test
        run: ./bin/pipeline -p "build-go" -s "go-test"
        image: golang:1.18
        after: go-mod-install
      - name: go-build-server
        run: ./bin/pipeline -p "build-go" -s "go-build-server"
        image: golang:1.18
        after: go-test
      - name: go-build-client
        run: ./bin/pipeline -p "build-go" -s "go-build-client"
        image: golang:1.18
        after: go-test

# For brevity I have excluded the other pipelines.
# Imagine that they all look very similar to this:
#   * They start by compiling the pipeline.
#   * Every step references the compiled pipeline, rather than the command or script used.
```
Simply by using a programming language, we are given many benefits here:
* **Reusability**. Steps, actions, names, and images are all reusable.
* These things are also shareable externally. Packages can contain steps that can be used in many different projects.
* **Reproducibility**. The pipelines are runnable locally exactly as they are run remotely.
* **Validation and safety**. Because we define what each step produces and what it requires, we can know whether all of a step's requirements are provided before running it at all (a sketch follows this list).
* We can also see if a step attempts to use an argument it hasn't requested.
* **Additional features**. If a CI service doesn't implement a feature, like caching, then our pipeline program could create or use its own version of that feature.
* **Observability**. We can instrument metrics, logs, and traces from our steps and actions. Structured logging will make our pipelines easy to query with [Loki](https://github.com/grafana/loki).
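As a sketch of the validation mentioned above (the `Step` type and its fields here are hypothetical, not Scribe's actual API):

```go
package main

import (
	"fmt"
	"os"
)

// Step is a hypothetical representation of a pipeline step that declares
// what it requires and what it produces.
type Step struct {
	Name     string
	Requires []string // arguments this step needs (e.g. "version", "gcs-bucket")
	Provides []string // arguments this step makes available to later steps
}

// validate walks the steps in order and confirms that every requirement is
// satisfied, either up front or by an earlier step, before anything runs.
func validate(initial []string, steps []Step) error {
	available := map[string]bool{}
	for _, a := range initial {
		available[a] = true
	}
	for _, s := range steps {
		for _, req := range s.Requires {
			if !available[req] {
				return fmt.Errorf("step %q requires %q, which nothing provides", s.Name, req)
			}
		}
		for _, p := range s.Provides {
			available[p] = true
		}
	}
	return nil
}

func main() {
	steps := []Step{
		{Name: "build", Requires: []string{"version"}, Provides: []string{"binary"}},
		{Name: "upload", Requires: []string{"binary", "gcs-bucket"}},
	}
	if err := validate([]string{"version"}, steps); err != nil {
		fmt.Fprintln(os.Stderr, err) // step "upload" requires "gcs-bucket", which nothing provides
		os.Exit(1)
	}
}
```

Because this check runs before any container is pulled or command executed, an entire class of configuration errors surfaces in seconds instead of after a ten-minute CI run.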
In this scenario, as far as we're concerned, the pipeline YAML is an output of our program. It's as useful (and readable) to us as obfuscated JavaScript is in our frontend.
With all of this data, we could even generate views in Grafana that are completely independent of the CI service we use.

And maybe even some traces!

### Introducing Scribe
I've been a big fan of this concept for a while but could never quite find the syntax or concept that worked for me. After many jobs of suffering through YAML and wishing there was something better, I've finally started a proper implementation of this concept, called "Scribe".
https://github.com/grafana/scribe
Functionally, it's the same as what's described in the Code-Driven Pipelines section above.
Currently, Scribe is in its early stages of development. It's rough around the edges and could use feedback from early adopters. If you're interested in the idea, there are some great examples in the [demo](https://github.com/grafana/scribe/tree/main/demo) folder that will hopefully match scenarios you've encountered.
As someone who has worked way too much with YAML-based CI pipelines, I'm hoping that this starts more conversations about making more maintainable and flexible CI pipelines by leveraging the existing tooling we all love.