# Is Arazzo Turing Complete?
An investigation into the possible Turing Completeness of the Arazzo Specification.
Contact, and More details at https://captnemo.in/
References:
- [Arazzo Spec](https://spec.openapis.org/arazzo/latest.html)
- [OpenAPI 3.1.0 Spec](https://swagger.io/specification/)
- [JSON Pointer Spec](https://datatracker.ietf.org/doc/html/rfc6901)
- [XPath 3.1](https://www.w3.org/TR/xpath-31)
## Basics
An Arazzo Specification builds a workflow(s) on top of operations that are defined in an OpenAPI Specification. It has some control flow, and variable storage, but is just a GitHub Actions Workflow Syntax like DSL that is written primarily in YAML.
Since Arazzo is not a real programming language, this is a fun investigation to see how much computation can you in such an environment? Relying on any specific HTTP API Behaviour (say using the repl.it API) would count as cheating, but relying on standard HTTP behaviour seems like a fair game.
## Strategies
Interesting starting points:
### Criterion Objects
Criterion are expressions used to decide whether a given Step (which converts to an API call via an OpenAPI Operation) was successful or not. You can write these conditions in many ways, all of which are great for control flow:
- JSONPath
- Regex
- Simple expressions, which support some basic operations:
`<,<=,>,>,==,!=,!,&&,||`
`()` Logical Grouping
`[]` (Index (0-based)
`a.b` Property de-reference) on top of basic literals (`number`, `null`, `string`, `true`, `false`)
- XPath
JSONPath in particular comes with a small list of functions which can be helpful (length, count, match, search, value)
### Runtime Expressions
> A runtime expression allows values to be defined based on information that will be available within an HTTP message, an event message, and within objects serialized from the Arazzo document such as workflows or steps.
These are simple strings like `$request.header.accept`, `$workflows.foo.inputs.username`, or `$response.body#/status` (`#/status` is a JSON Pointer) that can be used to "pluck" a specific value.
Runtime Expressions are used in multiple places:
1. `workflow.dependsOn` could point to other dynamically generated workflow reference. It is unclear at what point this is evaluated in the spec, since a workflow could be called multiple times, and the context could change over time.
2. `step.operation[Id|Path]`: A step can refer to an OpenAPI Operation dynamically.
3. `step.workflowId`: A step doesn't have to invoke an operation, it can just point to a workflow.
4. `step.outputs`: Step outputs are generated using runtime expressions. More specifically, an output is a map of string:runtime-expression. Key names can't be dynamic, and there's no way to generate arrays inside outputs, without such array being plucked directly from a "source" (request/response body etc)
There's also a few other places, but I don't think these are essential components, since the inputs/outputs system can do whatever the other parts can:
1. Parameter values can be generated using runtime-expressions
2. Reuseable Objects, that can be referenced using runtime expressions
### goto
We can write loops easily using the `goto` construct in Arazzo:
```yml
name: JoinWaitingList
type: goto
stepId: joinWaitingListStep
criteria:
# assertions to determine if this success action should be executed
- context: $response.body
condition: $[?count(@.pets) > 0]
type: jsonpath
```
The above is an example of a `[Success|Failure]Action` object, which can either mark an `end` or goto `Step|Workflow`.
There is also "Retry", which you can use to build loops as well, along with `retryLimit`. You can force a failing API Operation by marking the successCriteria as `false`, and then the step will automatically be retried to the `retryLimit`.
### HTTP
There's also a bunch of HTTP+OpenAPI behaviour that we can rely upon. Examples:
1. HTTP Trace, which returns the request headers in the response body.
2. Common replies for Invalid HTTP Semantics, such as expecting a 405 error code by sending invalid HTTP methods.
## Loops
### Attempt 1
```yaml
steps:
- stepId: repeatStep
operationId: updateCounterOperation
description: Repeat this step until the counter reaches N
outputs:
counter: $steps.repeatStep.outputs.counter + 1
successActions:
- name: continueLoop
type: goto
stepId: repeatStep
criteria:
condition: $steps.repeatStep.outputs.counter < $inputs.N
```
`counter: $steps.repeatStep.outputs.counter + 1` is unfortunately invalid, since the output can only be a `runtime-expression`, and while `$steps.repeatStep.outputs.counter` is a valid RE, `$steps.repeatStep.outputs.counter+1` is not.
### Attempt 2
```yaml
steps:
- stepId: repeatStep
operationId: updateCounterOperation
description: Repeat this step until the counter reaches N
outputs:
counter: $steps.repeatStep.outputs.counter
criteria:
condition: false # This step always fails
onFailure:
type: retry
stepId: repeatStep
retryLimit: 5
```
This repeats repeatStep 5 times. `retryLimit` is an integer, so it must be provided in the doc.
## Storage
Loops seem doable, but its storage that is much tricker.
We have assignments, but they are not arbitary - We can use step outputs to assign a dictionary mapping of strings to runtime expressions. These expressions can not involve arbitary expressions. They do support JSON-Pointer, but not JSONPath (so we don't get the JSONPath methods). The list of sources is quite decent:
```abnf
expression = ( "$url" / "$method" / "$statusCode" /
"$request." source / "$response." source /
"$inputs." name / "$outputs." name / "$steps." name /
"$workflows." name / "$sourceDescriptions." name)
parameter-name = name ; Reuses 'name' rule for parameter names
source = ( header-reference / query-reference / path-reference / body-reference )
header-reference = "header." token
query-reference = "query." name
path-reference = "path." name
body-reference = "body" ["#" json-pointer ]
json-pointer = *( "/" reference-token )
```
So we can pluck out specific elements via JSON Pointers from the request or response body. We can also get a specific output or input value. An output value can refer to itself as well.
But there is no direct way for an output to be set to an operation result.
## Final Thoughts
Still WIP, but the easy way out is to use criterion to set (and as a corollary) not set specific outputs.
The perfect solution would be an arbitarily large implementation of Rule 110 that is only bound by memory. Some imperfect solutions seem easily within reach:
1. A small 5 block Rule 110 implementation seems doable by just hardcoding each pair-wise operation.
2. There's some possibilities of chaining up workflows as well, by defining a separate workflow for each "computation". So you get N workflows to build a N-block implementation.
3. If none of these work out, generating a really large workflow is still polynomial time, and considered okay. So given N, we can statically generate a really large Arazzo Workflow that unrolls everything on compile time. This is technically not cheating, but not in the spirit of what we want to achieve.
Given enough time, I would also like to try similar investigations in the CI/CD Space against:
1. GitHub Actions/Tekton/Jenkins Workflow Syntax
2. Ansible YAML Templating
3. Helm Templating
## Aside
I mentioned in my talk that "trying to use something for unintended purposes" is a great way of learning it. That's very much the case here - I've found half a dozen corrections or clarifications in the spec, as a lot of what I'm trying to do searches for unintended behaviour.