## FHIR Bulk Submit Operation - Draft

### Changelog

**2025-10-22**

- Remove `manifestId` operation parameter, using the existing `manifestUrl` parameter as a unique identifier for the manifest instead. This also impacts the `replacedManifestId` operation parameter which is renamed to `replacesManifestUrl` and the `manifestId` extension element in the status manifest which is renamed to `manifestUrl`.
- Document use of an optional `countSeverity` extension in the status manifest.

### Audience and Scope

This implementation guide is intended to be used by developers at organizations that aim to interoperate by sharing large FHIR datasets. The guide defines the application programming interfaces (APIs) through which an authenticated and authorized system (Data Provider) may submit Bulk FHIR Data to a server (Data Recipient) and receive status information regarding the server's receipt and/or processing of the data and, where applicable, processed data. For example, a health system may use this approach to submit FHIR data to a regulatory agency which will calculate quality measures, load FHIR data from an Electronic Health Record system into an analytics server, or send FHIR data to an application to be de-identified.

The scope of this document does NOT include:

* The activities a Data Recipient needs to do to validate and process the submitted data
* A legal framework for sharing data between partners, such as Business Associate Agreements, Service Level Agreements, and Data Use Agreements, though these may be required for some use cases
* Real-time data exchange
* Data transformations that may be required by the Data Recipient
* Patient matching (although identifiers may be included in the exported FHIR resources)

#### Relationship to Bulk Export

The [FHIR Bulk Export API](export.html) represents a pull based approach.
The client (Data Recipient) makes a kick-off request to a server (Data Provider), polls for the export status and, when files are ready, retrieves them from the Data Provider. This model is well suited for ad-hoc data requests, where a client is trusted by the server and is able to provide filters describing the data required for a particular use.

In contrast, the Bulk Submit operation is a push based approach where the Data Provider sends one or more lists of files (manifests) to the Data Recipient that contain a pre-coordinated data set. This model is well suited for cases where the required data is not ad-hoc, and the Data Provider can determine the correct timing for sending data to the Data Recipient.

The Bulk Submit operation described in this document may be used in three ways:

1. In a standalone context with a pre-generated FHIR data set. For example, a set of static ndjson FHIR files generated from a legacy database using a data transformation tool.
2. Initiated by a client application that first performs a bulk export to obtain the data set from a server, optionally transforms the data, and then submits the resulting data to the Data Recipient.
3. Triggered directly by a Bulk Export server when an export is complete. Bulk Export servers that support triggering a Bulk Submit operation SHALL support the inclusion of a `bulkSubmitEndpoint` parameter in the Bulk Export [kickoff request](https://build.fhir.org/ig/HL7/bulk-data/export.html#bulk-data-kick-off-request) with the full URL of the Bulk Submit Operation and SHOULD support the inclusion of a `bulkSubmitStatusEndpoint` parameter with the full URL of the Bulk Submit Status Operation. Before initiating the export, the Bulk Export Server should be registered with the Data Recipient as a client with access to a scope of `system/bulk-submit` as described below.

### Roles

There are two primary roles involved in a Bulk Submit transaction:

1. **Data Provider** - consists of the following components:

    a. **Submission Client** - provides details on one or more Bulk Submission manifests to the Data Recipient and optionally tracks job status.

    b. **Output File Server** - returns files and attachments in response to urls in the submission manifests. This may be integrated with the FHIR server, or the files may be independently hosted.

    c. **Authorization Server** - issues access tokens and authenticates file requests to the Output File Server.

2. **Data Recipient** - consists of the following components:

    a. **Submission Server** - accepts manifest details and provides job status.

    b. **Authorization Server** - issues access tokens and authenticates manifest submission and job status requests.

    c. **File Retrieval Client** - retrieves files listed in manifests from the Data Provider.

    d. **File Processor** - processes submitted files with operations such as validation, quality metric calculation, and/or merging into an existing data set.

### Bulk Submit Operation

#### Request (Data Recipient Endpoint)

```
POST [fhir base]/$bulk-submit
```

##### Parameters

The request body SHALL be a FHIR [Parameters resource](https://hl7.org/fhir/parameters.html) with the following parameters:

| name | cardinality | type | description |
| --- | --- | --- | --- |
| submitter | 1..1 | Identifier | The submitter must match a system and code specified by the Data Recipient (coordinated out-of-band or in an implementation guide specific to a use case). |
| submissionId | 1..1 | string | The value must be unique for the `submitter`. |
| submissionStatus | 0..1 | Coding | System of `http://hl7.org/fhir/uv/bulkdata/ValueSet/submission-status`, code of `in-progress` (default if parameter is omitted), `complete` or `aborted`. Once a request has been submitted with a `submissionStatus` of `aborted` or `complete`, no additional requests may be submitted for that `submitter` and `submissionId` combination. |
| manifestUrl | 0..1 | string (url) | Url pointing to a [Bulk Export Manifest](https://build.fhir.org/ig/HL7/bulk-data/export.html#response---output-manifest) with a pre-coordinated FHIR data set. Files in multiple submitted manifests with the same `submitter` and `submissionId` SHALL be treated by the Data Recipient as if they were submitted in a single manifest. This parameter MAY be omitted when the operation is being called to set the `submissionStatus` to `complete` or `aborted`. The value must be unique for all manifests that share a `submitter` and `submissionId` combination. |
| replacesManifestUrl | 0..1 | string | The url of a previously submitted manifest that has the same `submissionId` and `submitter` as this request. When provided, the Data Recipient SHALL replace the data in the referenced manifest with the data in the current request. If the url is invalid or the Data Recipient is unable to replace the data, it should respond to the request with an OperationOutcome describing the error. |
| outputFormat | 0..1 | string (MIME-type) | The format for the Bulk Data files in the manifest. The MIME-type MAY include a MIME-type parameter of `fhirVersion` as described in the [FHIR specification](https://hl7.org/fhir/http.html#version-parameter) to indicate which version of FHIR the resources in the Bulk Data files are based on. When omitted, defaults to `application/fhir+ndjson` (Newline Delimited JSON) with a version of FHIR determined by the Data Recipient. All of the resources in a submission SHALL use the same version of FHIR. |
| FHIRBaseUrl | 1..1 | string (url) | Base url to be used by the Data Recipient when resolving relative references in the submitted resources. |
| fileRequestHeaders | 0..* | part | HTTP headers that the Data Recipient should use when requesting a data file from the Data Provider |
| → headerName | 1..1 | string | |
| → headerValue | 1..1 | string | |
| oauthMetadataUrl | 0..* | string (url) | Location that a Data Recipient can use to obtain the information needed to retrieve files protected using OAuth 2.0. The url SHALL be the path to a [FHIR Authorization Endpoint and Capabilities Discovery file](https://hl7.org/fhir/smart-app-launch/conformance.html#using-well-known) or another [OAuth 2.0 Protected Resource Metadata file](https://datatracker.ietf.org/doc/rfc9728/) that is registered in the [IANA Well-Known URIs Registry](https://www.iana.org/assignments/well-known-uris/well-known-uris.xhtml). |
| fileEncryptionKey | 0..1 | part | |
| → coding | 0..1 | Coding | If omitted, defaults to a system of `http://hl7.org/fhir/uv/bulkdata/ValueSet/file-encryption-type` and code of `jwe` |
| → value | 1..1 | string | For the system of `file-encryption-type` and code of `jwe` populate with the JSON Web Encryption structure to deliver a Content Encryption Key for the Data Recipient to decrypt retrieved data files from the Data Provider. Experimental, looking for feedback on the [draft specification](https://github.com/jmandel/fhir-bulk-encryption-example/blob/main/README.md) |
| metadata | 0..1 | part | Child parameters can be added under this parameter to pass pre-coordinated data relevant to the submission from the Data Provider to the Data Recipient. Each child parameter name SHALL be an absolute URL. |
| import | 0..1 | part | Child parameters can be added under this parameter to pass pre-coordinated options relevant to how the data will be processed from the Data Provider to the Data Recipient. For example, a Data Recipient may allow the Data Provider to specify whether or not existing data should be replaced with the data in the submission. Each child parameter name SHALL be an absolute URL. |

Constraint: At least one of the `submissionStatus` and `manifestUrl` parameters SHALL be populated.

##### Security

The Data Recipient SHOULD implement OAuth 2.0 access management in accordance with the [SMART Backend Services Authorization Profile](https://www.hl7.org/fhir/smart-app-launch/backend-services.html). When SMART Backend Services Authorization is used, the Data Provider SHALL use a token with a scope of `system/bulk-submit` when kicking off the bulk-submit operation, kicking off the bulk-submit-status operation, making a polling request to the endpoint provided from the kickoff, or navigating to a bulk manifest returned by the operation.

If the `oauthMetadataUrl` parameter in the request is populated with the path to an [OAuth 2.0 Protected Resource Metadata file](https://datatracker.ietf.org/doc/rfc9728/) such as a [FHIR Authorization Endpoint and Capabilities Discovery file](https://hl7.org/fhir/smart-app-launch/conformance.html#using-well-known) for [SMART Backend Services](https://www.hl7.org/fhir/smart-app-launch/backend-services.html), the Data Recipient SHALL obtain and use a valid token when retrieving the manifest at the `manifestUrl` in the request. If the `requiresAccessToken` parameter in the retrieved manifest is also set to `true`, the Data Recipient SHALL obtain and use a token scoped to read the resource types included in the manifest when retrieving the referenced files.

If the `fileEncryptionKey` parameter in the request is set to `jwe`, the Data Provider SHALL use the key in `fileEncryptionKey.value` to encrypt the manifest and each file listed in the `output` section of the manifest and the Data Recipient SHALL use this key to decrypt these files.

If the `fileRequestHeaders` parameter is included in the request, the Data Recipient SHALL provide the listed header name and value pairs when requesting a manifest or data file.
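To make the parameter table and its constraint concrete, the following is a minimal sketch of how a Submission Client might assemble the request body. The function name, the example urls, and the choice to send url-typed parameters as `valueString` are assumptions made for illustration and are not defined by this guide.

```python
SUBMISSION_STATUS_SYSTEM = "http://hl7.org/fhir/uv/bulkdata/ValueSet/submission-status"
ALLOWED_STATUSES = {"in-progress", "complete", "aborted"}


def build_bulk_submit_parameters(
    submitter_system,
    submitter_value,
    submission_id,
    fhir_base_url,
    manifest_url=None,
    submission_status=None,
    replaces_manifest_url=None,
):
    """Return a FHIR Parameters resource (as a dict) for a $bulk-submit request."""
    # Constraint from the parameter table: at least one of submissionStatus
    # and manifestUrl has to be populated.
    if manifest_url is None and submission_status is None:
        raise ValueError("populate manifestUrl, submissionStatus, or both")
    if submission_status is not None and submission_status not in ALLOWED_STATUSES:
        raise ValueError("unknown submissionStatus: " + submission_status)

    parameters = [
        {
            "name": "submitter",
            "valueIdentifier": {"system": submitter_system, "value": submitter_value},
        },
        {"name": "submissionId", "valueString": submission_id},
        # Assumption: parameters typed "string (url)" are sent as valueString.
        {"name": "FHIRBaseUrl", "valueString": fhir_base_url},
    ]
    if manifest_url is not None:
        parameters.append({"name": "manifestUrl", "valueString": manifest_url})
    if replaces_manifest_url is not None:
        parameters.append(
            {"name": "replacesManifestUrl", "valueString": replaces_manifest_url}
        )
    if submission_status is not None:
        parameters.append(
            {
                "name": "submissionStatus",
                "valueCoding": {
                    "system": SUBMISSION_STATUS_SYSTEM,
                    "code": submission_status,
                },
            }
        )
    return {"resourceType": "Parameters", "parameter": parameters}


# Hypothetical values, for illustration only.
example_body = build_bulk_submit_parameters(
    "https://recipient.example.org/submitters",
    "clinic-1",
    "submission-2025-10",
    "https://provider.example.org/fhir",
    manifest_url="https://provider.example.org/manifests/1.json",
)
```

A final call that only closes the submission would pass `submission_status="complete"` and leave `manifest_url` unset, which the constraint permits.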

##### Manifest

When populated, the `manifestUrl` parameter of the request SHALL contain a url pointing to a valid [Bulk Data Manifest](https://build.fhir.org/ig/HL7/bulk-data/export.html#response---output-manifest) though when used in a submission the `request` field in the manifest MAY be omitted. This manifest MAY contain a `link` field, and when present, the Data Recipient SHALL follow this link to retrieve additional manifests. Alternatively, the Data Provider MAY call the Bulk Submit operation multiple times, each with a different `manifestUrl`, using the same `submitter` and `submissionId` parameters to indicate that the contents of these manifests are part of a single submission.

#### Response - Success

- HTTP Status Code of `200 OK`
- Optionally, a FHIR `OperationOutcome` resource in the body

#### Response - Error (e.g., unsupported input parameter)

- HTTP Status Code of `4XX` or `5XX`
- The body SHALL be a FHIR `OperationOutcome` resource

If a server wants to prevent a client from beginning a new submission before an in-progress submission is completed, it SHOULD respond with a `429 Too Many Requests` status and a [`Retry-After`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After) header, following the rate-limiting advice described in [Bulk Data Status Request](https://build.fhir.org/ig/HL7/bulk-data/export.html#bulk-data-status-request).

### Bulk Submit Status Operation

After a Data Provider has kicked off a Bulk Submit operation, they may wish to receive updates on the status of the submission. For example, a Data Recipient may indicate files it was unable to retrieve, resources that failed validation or resources that weren't able to be merged into an existing data set. Additionally, the Data Recipient may need to return processed data back to the Data Provider. For example, a set of computed quality measures or a de-identified version of the submitted data.
The Bulk Submit Status operation provides a way for a Data Provider to request FHIR resources related to a submission from the Data Recipient.

#### Kick-off Request

```
POST [fhir base]/$bulk-submit-status
```

##### Parameters

The request body SHALL be a FHIR [Parameters resource](https://hl7.org/fhir/parameters.html) with the following parameters:

| name | cardinality | type | description |
| --- | --- | --- | --- |
| submitter | 1..1 | Identifier | The submitter must match a system and code specified by the Data Recipient (coordinated out-of-band or in an implementation guide specific to a use case). |
| submissionId | 1..1 | string | The value must be unique for the `submitter`. |
| _outputFormat | 0..1 | string | The format for the generated bulk data files used to return OperationOutcome resources related to the submission status and, when applicable, other resources. Currently, ndjson must be supported, though servers may choose to also support other output formats. Servers SHALL support the full content type of `application/fhir+ndjson` as well as abbreviated representations including `application/ndjson` and `ndjson`. Defaults to `application/fhir+ndjson`. |

##### Headers

- `Accept` (string)

  Specifies the format of the optional FHIR `OperationOutcome` resource response to the kick-off request. Currently, only `application/fhir+json` is supported. A client SHOULD provide this header. If omitted, the server MAY return an error or MAY process the request as if `application/fhir+json` was supplied.

- `Prefer` (string)

  Specifies whether the response is immediate or asynchronous. Currently, only a value of <a href="https://datatracker.ietf.org/doc/html/rfc7240#section-4.1"><code>respond-async</code></a> is supported. A client SHOULD provide this header. If omitted, the server MAY return an error or MAY process the request as if `respond-async` was supplied.
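
The kick-off request described above can be sketched with the Python standard library as follows. This is an illustrative sketch rather than a reference client: the function name, the example urls, the `Content-Type` header value, and the bearer token handling are assumptions, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_status_kickoff_request(fhir_base, submitter, submission_id, access_token):
    """Build (but do not send) a $bulk-submit-status kick-off request."""
    body = {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "submitter", "valueIdentifier": submitter},
            {"name": "submissionId", "valueString": submission_id},
            # Optional; shown with the documented default value.
            {"name": "_outputFormat", "valueString": "application/fhir+ndjson"},
        ],
    }
    return urllib.request.Request(
        url=fhir_base.rstrip("/") + "/$bulk-submit-status",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: the Parameters body is sent as FHIR JSON.
            "Content-Type": "application/fhir+json",
            # The two headers a client SHOULD provide.
            "Accept": "application/fhir+json",
            "Prefer": "respond-async",
            # Assumption: a token obtained out of band with the
            # system/bulk-submit scope.
            "Authorization": "Bearer " + access_token,
        },
    )


# Hypothetical values, for illustration only.
status_request = build_status_kickoff_request(
    "https://recipient.example.org/fhir/",
    {"system": "https://recipient.example.org/submitters", "value": "clinic-1"},
    "submission-2025-10",
    "example-token",
)
```

Passing the resulting object to `urllib.request.urlopen` would send it; on a `202 Accepted` response the polling location is read from the `Content-Location` response header.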

#### Response - Success

- HTTP Status Code of `202 Accepted`
- `Content-Location` header with the absolute URL of an endpoint for subsequent status requests (polling location)
- Optionally, a FHIR `OperationOutcome` resource in the body in JSON format

#### Response - Error (e.g., unsupported input parameter)

- HTTP Status Code of `4XX` or `5XX`
- The body SHALL be a FHIR `OperationOutcome` resource in JSON format

If a server wants to prevent a client from beginning a new submission before an in-progress submission is completed, it SHOULD respond with a `429 Too Many Requests` status and a [`Retry-After`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After) header, following the rate-limiting advice for a [Bulk Data Status Request](https://build.fhir.org/ig/HL7/bulk-data/export.html#bulk-data-status-request).

### Bulk Data Status Polling Request

After a Bulk Status request has been started, the request and response flow will follow the [FHIR Asynchronous Request Pattern](https://www.hl7.org/fhir/R4/async.html). The Data Recipient MAY return a [partial export manifest](https://build.fhir.org/ig/HL7/bulk-data/export.html#response---in-progress-status) and an HTTP status of 202 while the submission is incomplete or is being processed. Once the submission is complete (a request was sent by the Data Provider with a submission status of `complete` and the Data Recipient has retrieved and processed the files from the Data Provider) the Data Recipient SHALL return an [export manifest](https://build.fhir.org/ig/HL7/bulk-data/export.html#response---output-manifest) and an HTTP status of 200.

These manifests provide a mechanism for the Data Recipient to return resources related to the data submission. If there isn't relevant information to communicate and the submission is complete, the Data Recipient MAY return a manifest with empty output and error sections.
Each manifest SHALL include an extension at the root level with a `submissionId` element listing the relevant submission.

If there is status information to communicate, the Data Recipient SHALL populate the `error` section of the manifest with one or more files that contain OperationOutcome resources. For example, the Data Recipient may indicate files from the Data Provider it was unable to retrieve, resources that failed validation or resources that weren't able to be merged into an existing data set. Depending on the use case, a Data Recipient may wish to return an OperationOutcome resource for every resource that was included in the submission.

Each item in the `error` section of the manifest SHALL include an extension with an element named `manifestUrl` that links the OperationOutcome file to the `manifestUrl` submitted by the Data Provider where the issue occurred. A single `manifestUrl` may be referenced from multiple items in the `error` section. Each item in the manifest SHALL also include a url pointing to a bulk file of OperationOutcome resources.

If an issue is related to individual resources submitted by the Data Provider, the OperationOutcome resource for the issue SHOULD use the [artifact-relatedArtifact](https://build.fhir.org/ig/HL7/fhir-extensions/StructureDefinition-artifact-relatedArtifact.html) extension at its root to reference those resources. If an issue is related to a large number of resources, the Data Recipient SHOULD provide multiple OperationOutcome resources, each of which references a few of the resources submitted by the Data Provider, to avoid making the individual OperationOutcome resources extremely large.

The Data Recipient MAY include an extension for each item in the `error` section named `countSeverity` containing an object with keys of the `OperationOutcome.severity` codes present in that file and values of the number of instances of each code.
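
As an illustration of the optional `countSeverity` extension, the sketch below tallies severity codes across one NDJSON file of OperationOutcome resources. It assumes that "instances" means individual `issue` entries rather than whole resources, which this guide does not state explicitly; the function name and sample data are hypothetical.

```python
import json
from collections import Counter


def count_severity(ndjson_text):
    """Tally issue severity codes across an NDJSON file of OperationOutcomes."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between resources
        resource = json.loads(line)
        if resource.get("resourceType") != "OperationOutcome":
            continue  # ignore anything that is not an OperationOutcome
        # Assumption: every issue entry counts as one instance of its severity.
        for issue in resource.get("issue", []):
            counts[issue["severity"]] += 1
    return dict(counts)


# Hypothetical sample file with two OperationOutcome resources.
sample_ndjson = "\n".join(
    json.dumps(resource)
    for resource in [
        {
            "resourceType": "OperationOutcome",
            "issue": [
                {"severity": "error", "code": "structure"},
                {"severity": "warning", "code": "informational"},
            ],
        },
        {
            "resourceType": "OperationOutcome",
            "issue": [{"severity": "error", "code": "structure"}],
        },
    ]
)
severity_counts = count_severity(sample_ndjson)
```

The resulting dictionary has the same shape as the `countSeverity` object in a status manifest, with one key per severity code found in the file.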

If there are resources to return, the Data Recipient SHALL populate the `output` section of the manifest with one or more files that contain FHIR resources. Each item in the `output` section of the manifest SHOULD include an extension with an element named `manifestUrl` that links the output file to the `manifestUrl` submitted by the Data Provider. A single `manifestUrl` may be referenced from multiple items in the `output` section.

Example status manifest:

```json
{
  "extension": {
    "submissionId": "a15eea1f-1605-4303-989f-542d3a7962d8"
  },
  "transactionTime": "2025-01-01T00:00:00Z",
  "error": [{
    "extension": {
      "manifestUrl": "https://example.com/manifests/3556d214-c6e2-42e6-a7f7-89690f7a40bb_2",
      "countSeverity": {"success": 98, "error": 2}
    },
    "url": "https://example.com/output/validation_errors_2.ndjson"
  },{
    "extension": {
      "manifestUrl": "https://example.com/manifests/3556d214-c6e2-42e6-a7f7-89690f7a40bb_1",
      "countSeverity": {"success": 0, "error": 100}
    },
    "url": "https://example.com/output/import_errors_1.ndjson"
  }]
}
```

Example OperationOutcome:

```json
{
  "resourceType": "OperationOutcome",
  "id": "validationfailure-1",
  "extension": [{
    "url": "http://hl7.org/fhir/StructureDefinition/artifact-relatedArtifact",
    "valueRelatedArtifact": {
      "type": "comments-on",
      "resourceReference": "Patient/pt-1"
    }
  },{
    "url": "http://hl7.org/fhir/StructureDefinition/artifact-relatedArtifact",
    "valueRelatedArtifact": {
      "type": "comments-on",
      "resourceReference": "Patient/pt-2"
    }
  }],
  "issue": [{
    "severity": "error",
    "code": "structure",
    "details": {
      "text": "Error parsing resource json (Unknown Content 'label')"
    },
    "location": ["/f:Patient/f:identifier"],
    "expression": ["Patient.identifier"]
  }]
}
```

### Bulk Submit Workflow

```mermaid
---
config:
  theme: 'base'
  themeVariables:
    background: '#ffffff'
---
sequenceDiagram
    box Data Provider
    participant FileServer as Submission File Server
    participant DataProvider as Submission Client
    end
    box Data Recipient
    participant DataEndpoint as Submission Endpoint
    participant StatusEndpoint as Status Endpoint
    participant StatusFileServer as Status File Server
    end

    DataProvider->>DataEndpoint: Submit manifest url (submissionStatus: in-progress)
    DataEndpoint->>FileServer: Retrieve manifest file
    FileServer-->>DataEndpoint: 
    DataEndpoint-->>DataProvider: Success OperationOutcome
    DataEndpoint->>FileServer: Retrieve submission files
    FileServer-->>DataEndpoint: 

    DataProvider->>StatusEndpoint: Status export kick-off
    StatusEndpoint-->>DataProvider: Polling url
    DataProvider->>StatusEndpoint: Poll request
    StatusEndpoint-->>DataProvider: Partial status export manifest
    DataProvider->>StatusFileServer: Retrieve status files
    StatusFileServer-->>DataProvider: 

    DataProvider->>DataEndpoint: Submit manifest (submissionStatus: complete)
    DataEndpoint-->>DataProvider: Success OperationOutcome
    DataEndpoint->>FileServer: Retrieve submission files
    FileServer-->>DataEndpoint: 

    DataProvider->>StatusEndpoint: Poll request
    StatusEndpoint-->>DataProvider: Complete export status manifest
    DataProvider->>StatusFileServer: Retrieve final status files
    StatusFileServer-->>DataProvider: 
```

### Underlying Standards

* [HL7 FHIR](https://www.hl7.org/fhir/)
* [Newline-delimited JSON](https://github.com/ndjson/ndjson-spec)
* [RFC5246, Transport Layer Security (TLS) Protocol Version 1.2](https://tools.ietf.org/html/rfc5246)
* [RFC6749, The OAuth 2.0 Authorization Framework](https://tools.ietf.org/html/rfc6749)
* [RFC6750, The OAuth 2.0 Authorization Framework: Bearer Token Usage](https://tools.ietf.org/html/rfc6750)
* [RFC7159, The JavaScript Object Notation (JSON) Data Interchange Format](https://tools.ietf.org/html/rfc7159)
* [RFC7240, Prefer Header for HTTP](https://tools.ietf.org/html/rfc7240)

### Terminology

This profile inherits terminology from the standards referenced above.
The key words "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this specification are to be interpreted as described in [RFC2119](https://tools.ietf.org/html/rfc2119).

### Privacy and Security Considerations

All exchanges described herein between a client and a server SHALL be secured using [Transport Layer Security (TLS) Protocol Version 1.2 (RFC5246)](https://tools.ietf.org/html/rfc5246) or a more recent version of TLS. Use of mutual TLS is OPTIONAL.

The Bulk Submit Operation and Bulk Submit Status Operation SHOULD support OAuth 2.0 access management in accordance with the [SMART Backend Services Authorization Profile](authorization.html). When SMART Backend Services Authorization is used, Bulk Data Status Requests SHALL be protected the same way as the Bulk Data Kick-off Request, including an access token with scopes that cover all resources being exported. A server MAY additionally restrict Bulk Data Status Requests by limiting them to the client that originated the export. Implementations MAY include endpoints that use authorization schemes other than OAuth 2.0, such as mutual-TLS or signed URLs.

This implementation guide does not address protection of a server from potential compromise. An adversary who successfully captures administrative rights to the server will have full control over that server and can use those rights to undermine the server's security protections. In the Bulk Submit workflow, the file server will be a particularly attractive target, as it holds highly sensitive and valued PHI. An adversary who successfully takes control of a file server may choose to continue to deliver files in response to client requests, so that neither the client nor the FHIR server is aware of the take-over. Meanwhile, the adversary is able to put the PHI to use for its own malicious purposes.
Healthcare organizations have an imperative to protect PHI persisted in file servers in both cloud and data-center environments. A range of existing and emerging approaches can be used to accomplish this, not all of which would be visible at the API level. This specification does not dictate a particular approach at this time, though it does support the use of an `Expires` header to limit the time period a file will be available for client download (removal of the file from the server is left up to the server implementer). A server SHOULD NOT delete files from a Bulk Data response that a client is actively in the process of downloading regardless of the pre-specified expiration time.

Data access control obligations can be met with a combination of in-band restrictions (e.g., OAuth scopes), and out-of-band restrictions, where the server limits the data returned to a specific client in accordance with local considerations (e.g., policies or regulations). The FHIR server SHALL limit the data returned to only those FHIR resources for which the client is authorized. Implementers SHOULD incorporate technology that preserves and respects an individual's wishes to share their data with desired privacy protections. For example, some clients are authorized to access sensitive mental health information and some aren't; this authorization is defined out-of-band, but when a client requests a full data set, filtering is automatically applied by the server, restricting the data that the client receives.

Bulk Submit can be a resource-intensive operation. Developers SHOULD consider and mitigate the risk of intentional or inadvertent denial-of-service attacks though the details are beyond the scope of this specification. For example, transactional systems may wish to provide Bulk Data access to a read-only mirror of the database or may distribute processing over time to avoid loads that could impact clinical operations.