# Design: Schema Changesets

**Author**: @mtoohey

<!-- Status of the document - draft, in-review, etc. - is conveyed via HackMD labels -->

## Description (what)

Deploying modules to a cluster will involve a new process in which a changeset is created that tracks the progress of the change to the cluster.

## Motivation (why, optional)

The schema should model the truth of the cluster. The reality is that the cluster is in flux as deployments are created. We want a clear view of what is changing in a cluster and its state.

This will clear the way for features like:
- canary deployments
- rollbacks of active changesets

## Goals

- Cleanly model the process of upgrading deployments in a cluster.

### Non-Goals (optional)

- Canaries (out of scope)
- Rollbacks (deploy an upgrade with old code instead)
- Handling multiple active changesets (out of scope)

## Design (how)

The schema service holds the canonical schema and a list of changesets. Each changeset has a list of module schemas.

For now we will only allow one changeset to be active at a time. If a changeset is active when a request comes in to create a new one, the new request will be rejected.

### Deployment Process

- `ftl deploy` publishes artifacts to OCI
- `ftl deploy` calls schema service to create a changeset
- Provisioner watches schema changes, notices a changeset needs provisioning and starts provisioning resources
- As each resource is provisioned (or fails to provision), provisioning service updates schema runtime on the schema service
- Provisioner finishes provisioning resources and then provisions runners for the new deployment
- Provisioner calls schema service to commit the changeset. Schema service updates the canonical schema.
- Other services watch these schema changes and begin treating the new deployments as canonical.
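The changeset lifecycle and the single-active-changeset rule can be sketched in Go as a small in-memory model. This is only an illustration under assumptions, not the proposed implementation: the `SchemaService` struct, the `chg-N` key format, and the use of plain module names instead of `ftl.schema.v1.Module` values are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ChangesetState mirrors the ChangesetState enum in the proposed proto.
type ChangesetState int

const (
	Provisioning ChangesetState = iota
	Committed
	Failed
)

// Changeset is a simplified stand-in: modules are plain names here,
// whereas the proto carries full ftl.schema.v1.Module values.
type Changeset struct {
	Key     string
	State   ChangesetState
	Modules []string
	Error   string
}

// SchemaService is a hypothetical in-memory model, not the real service.
type SchemaService struct {
	canonical  map[string]bool       // modules in the canonical schema
	changesets map[string]*Changeset // every changeset ever created
	active     string                // key of the active changeset, "" if none
	nextID     int
}

var ErrActiveChangeset = errors.New("a changeset is already active")

func NewSchemaService() *SchemaService {
	return &SchemaService{canonical: map[string]bool{}, changesets: map[string]*Changeset{}}
}

// CreateChangeset rejects the request while another changeset is active.
func (s *SchemaService) CreateChangeset(modules []string) (string, error) {
	if s.active != "" {
		return "", ErrActiveChangeset
	}
	s.nextID++
	key := fmt.Sprintf("chg-%d", s.nextID) // key format is made up for this sketch
	s.changesets[key] = &Changeset{Key: key, State: Provisioning, Modules: modules}
	s.active = key
	return key, nil
}

// CommitChangeset promotes the changeset's modules into the canonical schema.
func (s *SchemaService) CommitChangeset(key string) error {
	cs, err := s.activeChangeset(key)
	if err != nil {
		return err
	}
	for _, m := range cs.Modules {
		s.canonical[m] = true
	}
	cs.State = Committed
	s.active = ""
	return nil
}

// FailChangeset records the error and leaves the canonical schema untouched.
func (s *SchemaService) FailChangeset(key, msg string) error {
	cs, err := s.activeChangeset(key)
	if err != nil {
		return err
	}
	cs.State, cs.Error = Failed, msg
	s.active = ""
	return nil
}

// activeChangeset returns the changeset only if key names the active one.
func (s *SchemaService) activeChangeset(key string) (*Changeset, error) {
	if key == "" || key != s.active {
		return nil, fmt.Errorf("changeset %q is not active", key)
	}
	return s.changesets[key], nil
}

func main() {
	s := NewSchemaService()
	key, _ := s.CreateChangeset([]string{"a", "b"})
	_, err := s.CreateChangeset([]string{"c"}) // rejected: the first is still active
	fmt.Println(key, err)
	fmt.Println(s.CommitChangeset(key), len(s.canonical))
}
```

Committed and failed changesets stay in the `changesets` map, matching the intent that finished changesets are retained; only the `active` key is cleared so that a new changeset can be created.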
If the provisioner is unable to fully provision all deployments in a changeset:
- Provisioner will call schema service to fail the changeset with an error message

Changesets that have failed or have been committed will remain in changesets but not be returned in `GetSchema`.
// TODO: how to get the status of a specific changeset that may be committed or failed. (use case: CI)

Other than the provisioner, other services will ignore deployments that are part of a changeset for now.

### Required changes

TODO: add a way to remove modules by creating a changeset. This allows provisioners to delete resources.

TODO: how to handle this case:
- canonical has no modules
- create changeset, with module A, B
- Provisioners provision module A successfully, runner activates
- Provisioners fail to provision module B
- changeset fails
    - ...Just fail? Manual fixes only for now...
    - ... Could be bad for pubsub consumers?

Update `schemaservice.proto` so that:
- `GetSchemaResponse` returns schema + changesets
- `PullSchemaResponse` contains either a changeset or a deployment. Deployments can have a changeset key
- `UpdateDeploymentRuntimeRequest` includes an optional changeset key (depending on whether the update applies to a canonical deployment or one within a changeset)
- `CreateChangeset`, `CommitChangeset` and `FailChangeset` calls are added

```proto
message GetSchemaRequest {}
message GetSchemaResponse {
  message Changeset {
    string key = 1;
    google.protobuf.Timestamp created_time = 2;
    ChangesetState state = 3;
    repeated ftl.schema.v1.Module modules = 4;
    optional string error = 5;
  }
  ftl.schema.v1.Schema schema = 1;
  repeated Changeset changesets = 2;
}

enum ChangesetState {
  CHANGESET_STATE_PROVISIONING = 0;
  CHANGESET_STATE_COMMITED = 1;
  CHANGESET_STATE_FAILED = 2;
}

message PullSchemaRequest {}
message PullSchemaResponse {
  // ChangesetCreated is sent when a new changeset is created.
  // TODO: consider whether to:
  // - include all modules in the message?
  //   and don't publish individual DeploymentCreated messages for them
  // - Let deployments be created into a changeset after the changeset is created
  // - Add a way to indicate how many deployments are expected in the changeset? So listeners can determine if they have the full changeset?
  message ChangesetCreated {
    string key = 1;
    google.protobuf.Timestamp created_time = 2;
    ChangesetState state = 3;
    // repeated ftl.schema.v1.Module modules = 4;
    optional string error = 5;
  }

  // ChangesetFailed is sent when a changeset fails.
  message ChangesetFailed {
    string key = 1;
    string error = 2;
  }

  // ChangesetCommited is sent when a changeset becomes canonical.
  message ChangesetCommited {
    string key = 1;
  }

  message DeploymentCreated {
    // Will not be set for builtin modules.
    optional string key = 1;
    string module_name = 2;
    // If present, the deployment is not yet canonical as it is currently part of a changeset.
    optional string changeset = 3;
    optional ftl.schema.v1.Module schema = 4;
  }

  message DeploymentUpdated {
    // Will not be set for builtin modules.
    optional string key = 1;
    string module_name = 2;
    // If present, the deployment is not yet canonical as it is currently part of a changeset.
    optional string changeset = 3;
    optional ftl.schema.v1.Module schema = 4;
  }

  message DeploymentRemoved {
    // Will not be set for builtin modules.
    optional string key = 1;
    string module_name = 2;
    // If this is true then the module was removed as well as the deployment.
    bool module_removed = 3;
  }

  oneof event {
    ChangesetCreated changesetCreated = 1;
    ChangesetFailed changesetFailed = 2;
    ChangesetCommited changesetCommited = 3;
    DeploymentCreated deploymentCreated = 4;
    DeploymentUpdated deploymentUpdated = 5;
    DeploymentRemoved deploymentRemoved = 6;
  }

  // If true there are more schema changes immediately following this one as part of the initial batch.
  // If false this is the last schema change in the initial batch, but others may follow later.
  // Field number 7 avoids colliding with the oneof members above, which use 1-6.
  bool more = 7;
}

message UpdateDeploymentRuntimeRequest {
  string deployment = 1;
  optional string changeset = 2;
  ftl.schema.v1.ModuleRuntimeEvent event = 3;
}

message UpdateDeploymentRuntimeResponse {}

message CreateChangesetRequest {
  repeated ftl.schema.v1.Module modules = 1;
}

message CreateChangesetResponse {
  // The changeset key of the newly created changeset.
  string changeset = 1;
}

message CommitChangesetRequest {
  // The changeset key to commit.
  string changeset = 1;
}

message CommitChangesetResponse {}

message FailChangesetRequest {
  // The changeset key to fail.
  string changeset = 1;
  string error = 2;
}

message FailChangesetResponse {}

service SchemaService {
  // Ping service for readiness.
  rpc Ping(PingRequest) returns (PingResponse) {
    option idempotency_level = NO_SIDE_EFFECTS;
  }

  // Get the full schema.
  rpc GetSchema(GetSchemaRequest) returns (GetSchemaResponse) {
    option idempotency_level = NO_SIDE_EFFECTS;
  }

  // Pull schema changes from the Controller.
  //
  // Note that if there are no deployments this will block indefinitely, making it unsuitable for
  // just retrieving the schema. Use GetSchema for that.
  rpc PullSchema(PullSchemaRequest) returns (stream PullSchemaResponse) {
    option idempotency_level = NO_SIDE_EFFECTS;
  }

  // UpdateDeploymentRuntime is used to update the runtime configuration of a deployment.
  rpc UpdateDeploymentRuntime(UpdateDeploymentRuntimeRequest) returns (UpdateDeploymentRuntimeResponse);

  // CreateChangeset creates a new changeset.
  rpc CreateChangeset(CreateChangesetRequest) returns (CreateChangesetResponse);

  // CommitChangeset makes all deployments for the changeset part of the canonical schema.
  rpc CommitChangeset(CommitChangesetRequest) returns (CommitChangesetResponse);

  // FailChangeset fails an active changeset.
  rpc FailChangeset(FailChangesetRequest) returns (FailChangesetResponse);
}
```

## Rejected Alternatives (optional)

<!-- Other ideas that were considered but rejected, including reasoning. -->