An investigation into the possible Turing Completeness of the Arazzo Specification.
Contact and more details at https://captnemo.in/
References:
An Arazzo Specification builds workflows on top of operations that are defined in an OpenAPI Specification. It has some control flow and variable storage, but it is essentially a DSL in the style of the GitHub Actions workflow syntax, written primarily in YAML.
Since Arazzo is not a real programming language, this is a fun investigation to see how much computation you can do in such an environment. Relying on any specific HTTP API behaviour (say, using the repl.it API) would count as cheating, but relying on standard HTTP behaviour seems like fair game.
Interesting starting points:
Criteria are expressions used to decide whether a given Step (which converts to an API call via an OpenAPI Operation) was successful or not. You can write these conditions in many ways, all of which are great for control flow:
Basic operators (<, <=, >, >=, ==, !=, !, &&, ||, () for logical grouping, [] for 0-based indexing, and a.b for property de-reference) on top of basic literals (number, null, string, true, false).
JSONPath in particular comes with a small list of functions which can be helpful (length, count, match, search, value).
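To make the condition styles concrete, here is a small sketch of a successCriteria list. The stepId and operationId (listPets) and the pets field are hypothetical. The three condition shapes follow the simple, regex, and jsonpath criterion types from the Arazzo specification.

```yaml
# Hypothetical step: stepId, operationId, and the "pets" field are made up.
- stepId: listPets
  operationId: listPets
  successCriteria:
    # Simple condition: operators applied to a runtime expression and a literal.
    - condition: $statusCode == 200
    # Regex condition: the pattern is matched against the context value.
    - context: $statusCode
      condition: '^2[0-9]{2}$'
      type: regex
    # JSONPath condition: count() is one of the JSONPath functions.
    - context: $response.body
      condition: $[?count(@.pets) > 0]
      type: jsonpath
```

All entries in successCriteria have to hold for the step to count as successful, so the list behaves like a logical AND.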
A runtime expression allows values to be defined based on information that will be available within an HTTP message, an event message, and within objects serialized from the Arazzo document such as workflows or steps.
These are simple strings like $request.header.accept, $workflows.foo.inputs.username, or $response.body#/status (where #/status is a JSON Pointer) that can be used to "pluck" a specific value.
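As a rough illustration of where such strings appear, the fragment below passes a workflow input into a request and plucks two values from the response. The stepId, operationId, and field names are assumptions chosen for the example.

```yaml
# Hypothetical login step: all names are illustrative only.
- stepId: loginStep
  operationId: loginUser
  parameters:
    - name: username
      in: query
      value: $inputs.username                     # a workflow input
  outputs:
    contentType: $response.header.Content-Type    # a response header
    status: $response.body#/status                # JSON Pointer into the body
```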
Runtime Expressions are used in multiple places:
- workflow.dependsOn could point to another dynamically generated workflow reference. It is unclear from the spec at what point this is evaluated, since a workflow could be called multiple times, and the context could change over time.
- step.operation[Id|Path]: A step can refer to an OpenAPI Operation dynamically.
- step.workflowId: A step doesn't have to invoke an operation; it can just point to a workflow.
- step.outputs: Step outputs are generated using runtime expressions. More specifically, an output is a map of string:runtime-expression. Key names can't be dynamic, and there's no way to generate arrays inside outputs unless the array is plucked directly from a "source" (request/response body etc).
There are also a few other places, but I don't think these are essential components, since the inputs/outputs system can do whatever the other parts can:
We can write loops easily using the goto construct in Arazzo:
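A minimal sketch of such a loop follows. The operationId (getJobStatus) and the state field in the response body are hypothetical, and the sketch assumes that a JSON Pointer expression can be compared against a string literal in a simple condition.

```yaml
# Hypothetical polling loop: operationId and the "state" field are assumptions.
- stepId: checkStatus
  operationId: getJobStatus
  successCriteria:
    - condition: $statusCode == 200
  onSuccess:
    # Jump back to this same step while the job is still pending.
    - name: keepPolling
      type: goto
      stepId: checkStatus
      criteria:
        - condition: $response.body#/state == 'pending'
    # Otherwise stop the workflow.
    - name: finished
      type: end
```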
The above is an example of a [Success|Failure]Action object, which can either mark an end or goto a Step|Workflow.
There is also "Retry", which you can use to build loops as well, along with retryLimit. You can force a failing API Operation by marking the successCriteria as false, and then the step will automatically be retried up to the retryLimit.
There's also a bunch of HTTP+OpenAPI behaviour that we can rely upon. Examples:
counter: $steps.repeatStep.outputs.counter + 1 is unfortunately invalid, since the output can only be a runtime-expression, and while $steps.repeatStep.outputs.counter is a valid RE, $steps.repeatStep.outputs.counter+1 is not.
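The retry-based loop described earlier avoids the counter entirely. A sketch, assuming a hypothetical operationId (ping) and assuming that a literal false is accepted as a simple condition:

```yaml
# Hypothetical step: the operationId is made up for illustration.
- stepId: repeatStep
  operationId: ping
  successCriteria:
    # Assumption: a literal false is a valid simple condition,
    # so the step always counts as failed.
    - condition: 'false'
  onFailure:
    - name: retryAlways
      type: retry
      retryAfter: 0      # seconds to wait before the next attempt
      retryLimit: 5      # fixed in the document, not computed at runtime
```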
This repeats repeatStep 5 times. retryLimit is an integer, so it must be provided as a literal in the document.
Loops seem doable, but it's storage that is much trickier.
We have assignments, but they are not arbitrary: we can use step outputs to assign a dictionary mapping strings to runtime expressions. These cannot contain arbitrary expressions. They do support JSON Pointer, but not JSONPath (so we don't get the JSONPath functions). The list of sources is quite decent:
So we can pluck out specific elements via JSON Pointers from the request or response body. We can also get a specific output or input value. An output value can refer to itself as well.
But there is no direct way for an output to be set to an operation result.
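The storage options described above can be sketched as an outputs map. The stepId, operationId, input name, and body layout are hypothetical.

```yaml
# Hypothetical step: operationId, input name, and body fields are illustrative.
- stepId: readCell
  operationId: getCell
  outputs:
    # Pluck one element out of the response body with a JSON Pointer.
    value: $response.body#/cells/0
    # Copy a workflow input unchanged.
    width: $inputs.width
    # Refer to this step's own output.
    previous: $steps.readCell.outputs.value
    # There is no expression form for computed values such as "value + 1".
```

Every value is copied from some source as-is, which is why storage is the hard part: nothing here computes a new value.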
Still WIP, but the easy way out is to use criteria to set (and, as a corollary, not set) specific outputs.
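One possible reading of this idea, offered only as a sketch: branch on a criterion and let each branch populate a different output, so that which output exists encodes the result. All stepIds, the operationId (getBit), and the bit field are assumptions.

```yaml
# Hypothetical branching: operationId, stepIds, and the "bit" field are assumptions.
- stepId: readBit
  operationId: getBit
  onSuccess:
    # The first action whose criteria match is taken.
    - name: bitIsOne
      type: goto
      stepId: setOne
      criteria:
        - condition: $response.body#/bit == 1
    - name: bitIsZero
      type: goto
      stepId: setZero
# Only one of the two steps below runs, so only one output gets set.
- stepId: setOne
  operationId: getBit
  outputs:
    one: $response.body#/bit
  onSuccess:
    - name: done
      type: end
- stepId: setZero
  operationId: getBit
  outputs:
    zero: $response.body#/bit
```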
The perfect solution would be an arbitrarily large implementation of Rule 110 that is bounded only by memory. Some imperfect solutions seem easily within reach:
Given enough time, I would also like to try similar investigations in the CI/CD Space against:
I mentioned in my talk that "trying to use something for unintended purposes" is a great way of learning it. That's very much the case here - I've found half a dozen corrections or clarifications in the spec, as a lot of what I'm trying to do searches for unintended behaviour.