# Design: Schema Changesets
**Author**: @mtoohey
<!-- Status of the document - draft, in-review, etc. - is conveyed via HackMD labels -->
## Description (what)
Deploying modules to a cluster will involve a new process in which a changeset is created to track the progress of the change to the cluster.
## Motivation (why, optional)
The schema should model the truth of the cluster. The reality is that the cluster is in flux while deployments are being created.
We want a clear view of what is changing in a cluster and of its state.
This will clear the way for features like:
- canary deployments
- rollbacks of active changesets
## Goals
- Cleanly model the process of upgrading deployments in a cluster.
### Non-Goals (optional)
- Canaries (out of scope)
- Rollbacks (deploy an upgrade with old code instead)
- Handling multiple active changesets (out of scope)
## Design (how)
The schema service holds the canonical schema and a list of changesets.
Each changeset has a list of module schemas. For now we will only allow one changeset to be active at a time. If a changeset is active when a request to create a new one arrives, the new request will be rejected.
### Deployment Process
- `ftl deploy` publishes artifacts to OCI
- `ftl deploy` calls schema service to create a changeset
- Provisioner watches schema changes, notices that a changeset needs provisioning, and starts provisioning resources
- As each resource is provisioned (or fails to provision), the provisioner updates the schema runtime on the schema service
- Provisioner finishes provisioning resources and then provisions runners for the new deployment
- Provisioner calls schema service to commit the changeset. Schema service updates the canonical schema.
- Other services watch these schema changes and begin treating the new deployments as canonical.
If the provisioner is unable to fully provision all deployments in a changeset:
- Provisioner will call schema service to fail the changeset with an error message
Changesets that have failed or have been committed will remain in the list of changesets but will not be returned by `GetSchema`.
TODO: how to get the status of a specific changeset that may be committed or failed (use case: CI).
For now, services other than the provisioner will ignore deployments that are part of a changeset.
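As a rough illustration of that rule, a consumer could filter deployment notifications on whether a changeset key is present. `DeploymentEvent` and `ShouldHandle` are hypothetical names for this sketch; the only detail taken from the design is that a deployment carrying a changeset key is not yet canonical.

```go
package main

// DeploymentEvent is a simplified stand-in for a deployment notification.
// Changeset is nil for canonical deployments.
type DeploymentEvent struct {
	Key       string
	Module    string
	Changeset *string
}

// ShouldHandle reports whether a service should act on the event.
// The provisioner handles everything; other services only handle canonical deployments.
func ShouldHandle(ev DeploymentEvent, isProvisioner bool) bool {
	return isProvisioner || ev.Changeset == nil
}
```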
### Required changes
TODO: add a way to remove modules by creating a changeset. This allows provisioners to delete resources.
TODO: how to handle this case:
- canonical has no modules
- create changeset, with module A, B
- Provisioners provision module A successfully, runner activates
- Provisioners fail to provision module B
- changeset fails
- ...Just fail? Manual fixes only for now...
- ... Could be bad for pubsub consumers?
Update `schemaservice.proto` so that:
- `GetSchemaResponse` returns schema + changesets
- `PullSchemaResponse` contains either a changeset or a deployment. Deployments can have a changeset key
- `UpdateDeploymentRuntimeRequest` includes an optional changeset key (depending on whether the update applies to a canonical deployment or one within a changeset)
- New `CreateChangeset`, `CommitChangeset` and `FailChangeset` calls are added
```proto
message GetSchemaRequest {}

message GetSchemaResponse {
  message Changeset {
    string key = 1;
    google.protobuf.Timestamp created_time = 2;
    ChangesetState state = 3;
    repeated ftl.schema.v1.Module modules = 4;
    optional string error = 5;
  }

  ftl.schema.v1.Schema schema = 1;
  repeated Changeset changesets = 2;
}

enum ChangesetState {
  CHANGESET_STATE_PROVISIONING = 0;
  CHANGESET_STATE_COMMITED = 1;
  CHANGESET_STATE_FAILED = 2;
}

message PullSchemaRequest {}

message PullSchemaResponse {
  // ChangesetCreated is sent when a new changeset is created.
  // TODO: consider whether to:
  // - include all modules in the message? and don't publish individual DeploymentCreated messages for them
  // - Let deployments be created into a changeset after the changeset is created
  // - Add a way to indicate how many deployments are expected in the changeset? So listeners can determine if they have the full changeset?
  message ChangesetCreated {
    string key = 1;
    google.protobuf.Timestamp created_time = 2;
    ChangesetState state = 3;
    // repeated ftl.schema.v1.Module modules = 4;
    optional string error = 5;
  }

  // ChangesetFailed is sent when a changeset fails.
  message ChangesetFailed {
    string key = 1;
    string error = 2;
  }

  // ChangesetCommited is sent when a changeset becomes canonical.
  message ChangesetCommited {
    string key = 1;
  }

  message DeploymentCreated {
    // Will not be set for builtin modules.
    optional string key = 1;
    string module_name = 2;
    // If present, the deployment is not yet canonical as it is currently part of a changeset.
    optional string changeset = 3;
    optional ftl.schema.v1.Module schema = 4;
  }

  message DeploymentUpdated {
    // Will not be set for builtin modules.
    optional string key = 1;
    string module_name = 2;
    // If present, the deployment is not yet canonical as it is currently part of a changeset.
    optional string changeset = 3;
    optional ftl.schema.v1.Module schema = 4;
  }

  message DeploymentRemoved {
    // Will not be set for builtin modules.
    optional string key = 1;
    string module_name = 2;
    // If this is true then the module was removed as well as the deployment.
    bool module_removed = 3;
  }

  oneof event {
    ChangesetCreated changesetCreated = 1;
    ChangesetFailed changesetFailed = 2;
    ChangesetCommited changesetCommited = 3;
    DeploymentCreated deploymentCreated = 4;
    DeploymentUpdated deploymentUpdated = 5;
    DeploymentRemoved deploymentRemoved = 6;
  }

  // If true there are more schema changes immediately following this one as part of the initial batch.
  // If false this is the last schema change in the initial batch, but others may follow later.
  // Field numbers 1-6 are taken by the oneof above, so this field must use a different number.
  bool more = 7;
}

message UpdateDeploymentRuntimeRequest {
  string deployment = 1;
  optional string changeset = 2;
  ftl.schema.v1.ModuleRuntimeEvent event = 3;
}

message UpdateDeploymentRuntimeResponse {}

message CreateChangesetRequest {
  repeated ftl.schema.v1.Module modules = 1;
}

message CreateChangesetResponse {
  // The changeset key of the newly created changeset.
  string changeset = 1;
}

message CommitChangesetRequest {
  // The changeset key to commit.
  string changeset = 1;
}

message CommitChangesetResponse {}

message FailChangesetRequest {
  // The changeset key to fail.
  string changeset = 1;
  string error = 2;
}

message FailChangesetResponse {}

service SchemaService {
  // Ping service for readiness.
  rpc Ping(PingRequest) returns (PingResponse) {
    option idempotency_level = NO_SIDE_EFFECTS;
  }

  // Get the full schema.
  rpc GetSchema(GetSchemaRequest) returns (GetSchemaResponse) {
    option idempotency_level = NO_SIDE_EFFECTS;
  }

  // Pull schema changes from the Controller.
  //
  // Note that if there are no deployments this will block indefinitely, making it unsuitable for
  // just retrieving the schema. Use GetSchema for that.
  rpc PullSchema(PullSchemaRequest) returns (stream PullSchemaResponse) {
    option idempotency_level = NO_SIDE_EFFECTS;
  }

  // UpdateDeploymentRuntime is used to update the runtime configuration of a deployment.
  rpc UpdateDeploymentRuntime(UpdateDeploymentRuntimeRequest) returns (UpdateDeploymentRuntimeResponse);

  // CreateChangeset creates a new changeset.
  rpc CreateChangeset(CreateChangesetRequest) returns (CreateChangesetResponse);

  // CommitChangeset makes all deployments for the changeset part of the canonical schema.
  rpc CommitChangeset(CommitChangesetRequest) returns (CommitChangesetResponse);

  // FailChangeset fails an active changeset.
  rpc FailChangeset(FailChangesetRequest) returns (FailChangesetResponse);
}
```
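To show how a `PullSchema` listener might consume these events, here is a minimal sketch of a local view that tracks pending and canonical deployments. `Event`, `EventKind` and `View` are hypothetical, flattened stand-ins for the generated protobuf types, not the real API. The sketch assumes that committing a changeset promotes its deployments without further per-deployment events, and it ignores the replacement of older deployments of the same module; the design does not settle either point.

```go
package main

// EventKind is a flattened, hypothetical stand-in for the PullSchemaResponse oneof.
type EventKind int

const (
	ChangesetCreated EventKind = iota
	ChangesetFailed
	ChangesetCommitted
	DeploymentCreated
	DeploymentRemoved
)

// Event carries only the fields this sketch needs.
type Event struct {
	Kind      EventKind
	Key       string // changeset key or deployment key, depending on Kind
	Changeset string // for DeploymentCreated: empty when the deployment is canonical
	More      bool   // mirrors the `more` flag on PullSchemaResponse
}

// View is a listener's local picture of the cluster.
type View struct {
	Canonical map[string]bool     // canonical deployment keys
	Pending   map[string][]string // changeset key -> deployment keys awaiting commit
	Synced    bool                // true once the initial batch has been received
}

func NewView() *View {
	return &View{Canonical: map[string]bool{}, Pending: map[string][]string{}}
}

// Apply folds one event into the view.
func (v *View) Apply(ev Event) {
	switch ev.Kind {
	case ChangesetCreated:
		v.Pending[ev.Key] = nil
	case DeploymentCreated:
		if ev.Changeset == "" {
			v.Canonical[ev.Key] = true
		} else {
			v.Pending[ev.Changeset] = append(v.Pending[ev.Changeset], ev.Key)
		}
	case ChangesetCommitted:
		// Assumption: commit promotes every deployment in the changeset.
		for _, d := range v.Pending[ev.Key] {
			v.Canonical[d] = true
		}
		delete(v.Pending, ev.Key)
	case ChangesetFailed:
		delete(v.Pending, ev.Key)
	case DeploymentRemoved:
		delete(v.Canonical, ev.Key)
	}
	if !ev.More {
		v.Synced = true
	}
}
```

Deployments attached to a changeset stay out of `Canonical` until the matching commit event arrives, and are dropped if the changeset fails.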
## Rejected Alternatives (optional)
<!-- Other ideas that were considered but rejected, including reasoning. -->