# Multi-tenancy
Multi-tenancy support in AFJ will allow multiple tenants to use the same agent, sharing resources. The feature will be developed as an optional module for AFJ, so the core package does not grow any further. It will live in the main repo under the package name `@aries-framework/module-tenants`. Until the new module and plugin API is ready, the module will be developed as an injected module, as described here: <https://github.com/hyperledger/aries-framework-javascript/tree/main/samples/extension-module>
## Tasks
* [x] Update AFJ core to use an [Agent Context](#h-agent-context)
  * Add an `AgentContext` interface, create a single context in the agent constructor, and bind it to the injection container.
  * Do not bind the `AgentConfig` and `Wallet` instances in the agent constructor anymore
  * Use the `AgentContext` in all services and repositories
  * Inject the `AgentContext` in all modules so they can pass it to the services
  * Pass the `AgentContext` from the `MessageReceiver` to the `Dispatcher`. When the dispatcher creates the inbound message context, bind the `AgentContext` to the `InboundMessageContext` (add an `agentContext` property to the interface)
  * Do not depend on the `ConnectionsModule` from within the `MessageReceiver` anymore, as the `MessageReceiver` should be stateless (which the `ConnectionsModule` is not)
* [x] Add an [Agent Context Provider](#h-agent-context-provider)
  * Create `AgentContextProvider` interface
  * Add default `DefaultAgentContextProvider` implementation
  * Add `InjectionSymbols.AgentContextProvider`
  * If nothing has been registered for `InjectionSymbols.AgentContextProvider`, register the `DefaultAgentContextProvider` in the agent constructor
  * Integrate the `MessageReceiver` class with the new `AgentContextProvider` interface
* [x] Support for the [Tenant Agent](#h-tenant-agent)
  * Update almost all usages of `@scoped(Lifecycle.ContainerScoped)` to `@singleton()`
    * With the exception of:
      * `Wallet`
      * `AgentConfig`
      * All `Module` classes
  * Create a `BaseAgent` class that doesn’t do all the registration of classes. We can extend the base agent class in the `TenantAgent` class. The `Agent` class will also extend the `BaseAgent` class.
* [x] Add the [Tenant Module](#h-tenant-module)
  * Implement the `TenantModule` interface as described below
  * Add a `TenantRecord`
  * Make it possible for the tenant module to access the root container to create new agent instances using the root container
* [x] [Tenant Lifecycle](#h-tenant-lifecycle)
  * Remove all internal event listeners, or make them tenant aware. The agent context should not be used after a method has resolved. Rather, a new tenant agent (or agent context) should be retrieved and the method should be dispatched using that tenant context.
  * Find a solution for emitting message received events when receiving messages from the mediator / inbound transports. This can conflict with the lifecycle of an agent context.
  * Tests:
    * Close wallets immediately after they should be done processing. This will help us find places where processing happens after the promise has resolved.
  * Add a way to keep track of open sessions (see possible approaches described below).
    * Wallets should be closed if not used for *x* amount of time
    * When a limit of *n* open wallets is reached, the wallet that has gone unused the longest should be closed
      * However, this should only happen if the wallet isn’t still processing. Otherwise we should wait for a wallet to finish processing before closing it and opening a new one.
    * [idea] Add lifecycle methods to the agent context so we can easily track which wallets should be kept open and when they are done processing.
      * This way the core only needs to know the lifecycle methods of an agent context, while the tenant module can implement the complex logic of handling this for a large number of wallets
  * Dispose of the injection container if a tenant agent was used for the session
  * Add the `TenantSessionCoordinator` and integrate it with the `TenantAgentContextProvider`
  * Add lifecycle methods to the `AgentContext` and implement these in the `TenantAgentContext` to integrate with the `TenantAgentContextProvider` and `TenantSessionCoordinator`
* [x] [Wallet key registration for tenant wallets](#h-wallet-key-registration-for-tenant-wallets)
  * Add a new `RoutingProvider` class with a method `getRouting` that replaces the current `MediationRecipientService.getRouting`
  * Add a way to dynamically add handlers that can participate in creating the routing object (and perform side effects) (see two possible patterns described below)
  * Add a new mediation recipient routing handler to the mediation recipient service and register the handler in the `RoutingProvider`
  * Add a new tenant routing handler to the tenant module/service that stores a mapping of the new recipient keys to the tenant id, and register the handler in the `RoutingProvider`
    * Note: As multiple agents could be writing to these objects for different tenants or the same tenant, we should not create a single record for all tenants, or a record per tenant, as that would increase the chance of write conflicts. Askar can help with locks, but as the mapping is write-once, read-a-lot (maybe delete once), keeping each mapping in a separate record will avoid any conflicts.
### Agent Context
The agent context allows us to make most classes in AFJ stateless, so those classes can be reused across all tenants.
```typescript
interface AgentContext {
  wallet: Wallet
  agentConfig: AgentConfig
  // could add metadata to store arbitrary data (see question 1.)
  // metadata: {}
}
```
The root container will not have the `Wallet` or `AgentConfig` registered anymore. Instead we will pass the `AgentContext` to methods, which can then get the wallet and config from it. See the code block below for an example.
```typescript
class ConnectionService {
  public async getConnectionById(agentContext: AgentContext, connectionId: string) {
    return this.connectionRepository.getById(agentContext, connectionId)
  }
}

class ConnectionRepository {
  public async getById(agentContext: AgentContext, connectionId: string) {
    // pass the whole agent context on, the storage service takes the wallet from it
    return this.storageService.getById(agentContext, ConnectionRecord, connectionId)
  }
}

class IndyStorageService {
  // no wallet is injected here anymore, it always comes from the agent context

  private assertWalletIsIndyWallet(wallet: Wallet): asserts wallet is IndyWallet {
    if (!(wallet instanceof IndyWallet)) {
      throw new AriesFrameworkError("Indy Wallet must be used to use the indy storage service")
    }
  }

  public async getById(agentContext: AgentContext, recordClass: Record, id: string) {
    // this makes sure the generic 'Wallet' in agentContext is an indy wallet
    this.assertWalletIsIndyWallet(agentContext.wallet)

    const record = await this.indy.getWalletRecord(
      agentContext.wallet.handle,
      recordClass.type,
      id,
      IndyStorageService.DEFAULT_QUERY_OPTIONS
    )
    return record
  }
}
```
Questions:
* Should the agent context just be an object, or a class instance? If we make it a class instance we can provide a custom `TenantAgentContext` that holds extra metadata about the tenant in the context. We could of course also do this with an interface (a `TenantAgentContext` interface), or by adding a `state` or `metadata` property on the agent context that can store arbitrary context data.
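
To make the idea concrete, here is a minimal, self-contained sketch of one shared service instance serving two contexts. Everything in it is an assumption for illustration: the simplified `Wallet`, `AgentConfig` and `ConnectionRecord` shapes and the `InMemoryStorageService` class are hypothetical stand-ins, not AFJ types.

```typescript
// Hypothetical minimal types; the real Wallet and AgentConfig are much richer.
interface Wallet { id: string }
interface AgentConfig { label: string }
interface AgentContext { wallet: Wallet; agentConfig: AgentConfig }
interface ConnectionRecord { id: string; theirLabel: string }

// Stand-in for a storage service. The map plays the role of the wallets'
// storage: the service itself never remembers which tenant it is serving.
class InMemoryStorageService {
  private recordsByWallet = new Map<string, Map<string, ConnectionRecord>>()

  public save(agentContext: AgentContext, record: ConnectionRecord): void {
    const walletId = agentContext.wallet.id
    const records = this.recordsByWallet.get(walletId) ?? new Map<string, ConnectionRecord>()
    records.set(record.id, record)
    this.recordsByWallet.set(walletId, records)
  }

  public getById(agentContext: AgentContext, id: string): ConnectionRecord {
    const record = this.recordsByWallet.get(agentContext.wallet.id)?.get(id)
    if (!record) throw new Error(`Record ${id} not found in wallet ${agentContext.wallet.id}`)
    return record
  }
}

// One shared instance is used for every tenant; only the context differs.
const storageService = new InMemoryStorageService()
const alice: AgentContext = { wallet: { id: 'alice' }, agentConfig: { label: 'Alice' } }
const bob: AgentContext = { wallet: { id: 'bob' }, agentConfig: { label: 'Bob' } }

storageService.save(alice, { id: 'conn-1', theirLabel: 'Faber' })
storageService.save(bob, { id: 'conn-1', theirLabel: 'Acme' })
```

Because the wallet is selected per call, the same record id can exist for both tenants without the two interfering with each other.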
### Inbound and Outbound Transports
Inbound and outbound transports will be shared across all tenants. However, the transports don’t need any custom processing logic themselves. The base agent will act as a relay for the other agents, so we’ll still call `agent.receiveMessage` on the base agent. The `MessageReceiver` will then find the associated tenant. This means transports don’t need to handle the complexity of working with tenants, and we don’t have to repeat this logic for each new transport that is added.
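
The relay flow can be sketched as follows. This is only an illustration under stated assumptions: `EncryptedMessage`, `MapContextProvider` and the simplified `MessageReceiver` are hypothetical, and the provider interface mirrors the one described in the next section.

```typescript
// All names below are hypothetical stand-ins used only to illustrate the flow.
interface AgentContext { contextId: string }
interface EncryptedMessage { recipientKey: string; ciphertext: string }

interface AgentContextProvider {
  getContextForInboundMessage(message: EncryptedMessage): Promise<AgentContext>
}

// Looks up the context by recipient key. A real provider would query storage.
class MapContextProvider implements AgentContextProvider {
  public constructor(private contextsByKey: Map<string, AgentContext>) {}

  public async getContextForInboundMessage(message: EncryptedMessage): Promise<AgentContext> {
    const context = this.contextsByKey.get(message.recipientKey)
    if (!context) throw new Error('No tenant found for inbound message')
    return context
  }
}

class MessageReceiver {
  public readonly handled: string[] = []
  public constructor(private provider: AgentContextProvider) {}

  // The only entry point a transport needs. It requires no tenant knowledge.
  public async receiveMessage(message: EncryptedMessage): Promise<void> {
    const agentContext = await this.provider.getContextForInboundMessage(message)
    this.handled.push(`${agentContext.contextId}:${message.ciphertext}`)
  }
}

const receiver = new MessageReceiver(
  new MapContextProvider(
    new Map<string, AgentContext>([
      ['key-a', { contextId: 'tenant-a' }],
      ['key-b', { contextId: 'tenant-b' }],
    ])
  )
)

// Two different transports share the same receiver on the base agent.
const httpTransport = (message: EncryptedMessage) => receiver.receiveMessage(message)
const wsTransport = (message: EncryptedMessage) => receiver.receiveMessage(message)
```

Adding a third transport would only require another function that forwards to `receiveMessage`; the tenant lookup stays in one place.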
### Agent Context Provider
The agent context provider allows us to provide the agent context for an incoming message. This interface will be added to the core AFJ package and called by the message receiver.
```typescript
interface AgentContextProvider {
  getContextForInboundMessage(encryptedMessage: JWE): Promise<AgentContext>
}
```
With the interface comes a minimal default implementation for usage in AFJ without the tenant module installed. The implementation will look something like below. Without multi tenancy the agent will use a single `AgentContext` object that will be used throughout the agent. By adding this interface we can keep the core functionality really simple (always returning the same agent context), while adding an extensible API that allows us to build multi tenant features.
```typescript
// default agent context provider in AFJ. Just returns the singleton agent context for each
// message. (i.e. single tenant)
class DefaultAgentContextProvider implements AgentContextProvider {
  private agentContext: AgentContext

  public constructor(agentContext: AgentContext) {
    this.agentContext = agentContext
  }

  public async getContextForInboundMessage(encryptedMessage: JWE) {
    // just return the agent context
    return this.agentContext
  }
}
```
In the multitenant module we could add a more complex implementation that looks up in a mapping which key belongs to which tenant and returns the agent context based on that.
```typescript
class TenantAgentContextProvider implements AgentContextProvider {
  // this doesn't handle multiple recipients within the same agent yet
  // this is very uncommon, also not supported in ACA-Py
  public async getContextForInboundMessage(encryptedMessage: JWE) {
    const recipientKeys = this.getRecipientKeysForMessage(encryptedMessage)

    // `for...of` iterates over the keys themselves (`for...in` would yield array indices)
    for (const recipientKey of recipientKeys) {
      const tenant = await this.findTenantByRecipientKey(recipientKey)
      if (tenant) return this.agentContextForTenant(tenant)
    }

    throw new AriesFrameworkError("No tenant found for inbound message")
  }
}
```
This approach is inspired by the [multi-tenant implementation from ACA-Py](https://github.com/hyperledger/aries-cloudagent-python/blob/00d97b3e0e6f713dfab383eb2e5e14e58472a47d/aries_cloudagent/transport/inbound/session.py#L160), but is different in a few ways:
* Using an interface with a simple default implementation means we don’t have to integrate it into core
* Instead of calling it a relay we call it a generic agent context provider. This allows for other types of agent context providers in the future. It would also allow an implementation that is not based on `TenantRecord`, but integrates with a remote KMS, for example, to retrieve the wallet key.
### Tenant Module
The tenant module provides the public API for working with the multi-tenant AFJ agent. It allows you to manage tenants and to get an agent instance unique to a tenant.
```typescript
// minimal version of the init config. See question 1.
interface TenantConfig {
  label: string
  connectionImageUrl: string
}

interface TenantModule {
  getTenantAgent(options: { tenantId: string, config: TenantConfig }): Promise<Agent>
  createTenant(options: { walletConfig: WalletConfig }): Promise<TenantRecord>
  getById(tenantId: string): Promise<TenantRecord>
  deleteById(tenantId: string): Promise<TenantRecord>
}
```
Questions:
* Should we make a distinction between the tenant config and the agent config? The agent config would only be configured once and is not unique per agentContext, while the tenant config (different name tbd) can be different for each tenant / agent context object.
* AF.NET has wallet specific config and agent specific config
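
As a rough illustration of the `TenantModule` interface above, the sketch below keeps tenants in memory. It is not the planned implementation: the `WalletConfig`, `TenantRecord` and `TenantAgent` shapes are simplified assumptions, and `InMemoryTenantModule` is a hypothetical name.

```typescript
// Hypothetical shapes; the real WalletConfig and TenantRecord carry more fields.
interface WalletConfig { id: string; key: string }
interface TenantConfig { label: string; connectionImageUrl: string }
interface TenantRecord { id: string; walletConfig: WalletConfig }
interface TenantAgent { tenantId: string; config: TenantConfig }

class InMemoryTenantModule {
  private tenants = new Map<string, TenantRecord>()
  private nextId = 1

  public async createTenant(options: { walletConfig: WalletConfig }): Promise<TenantRecord> {
    const record: TenantRecord = { id: `tenant-${this.nextId++}`, walletConfig: options.walletConfig }
    this.tenants.set(record.id, record)
    return record
  }

  public async getById(tenantId: string): Promise<TenantRecord> {
    const record = this.tenants.get(tenantId)
    if (!record) throw new Error(`Tenant ${tenantId} not found`)
    return record
  }

  public async deleteById(tenantId: string): Promise<TenantRecord> {
    const record = await this.getById(tenantId)
    this.tenants.delete(tenantId)
    return record
  }

  // The tenant config is supplied per call, separate from the stored record.
  public async getTenantAgent(options: { tenantId: string; config: TenantConfig }): Promise<TenantAgent> {
    const record = await this.getById(options.tenantId)
    return { tenantId: record.id, config: options.config }
  }
}
```

Keeping `TenantConfig` out of the stored record is one possible answer to the first question: the record only holds what is needed to open the wallet, while the per-tenant agent config is passed in when the agent is requested.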
### Tenant Agent
When calling `agent.tenants.getTenantAgent` a new agent will be created that has the same API as the base agent. This makes it easy to work with multi tenant enabled agents, as there’s minimal difference.
To achieve this, all modules will be stateful objects. This means the `AgentContext` can be injected into the module, and then passed down to the stateless services the module interacts with. It also means that if you have 1000 tenant agents in use at one point, you’ll have 1000 agent instances (1001 if we count the base agent) and 1000 times the number of modules in module instances. We think this is a fair tradeoff between
* **Root Container:** The root container will be used across all agent instances (base and tenants) and will provide all the stateless classes.
* **Agent Container:** The agent container will be used for the base agent instance. This will have the `AgentContext` registered for injection in other modules.
* **Tenant Container:** The tenant container will be used for tenants. Each tenant will get its own child container that is created from the root container (**note:** not the agent container). This is basically the same as the agent container.
Example of how it will look in the agent constructor. We should probably extract this out of the constructor into a container factory.
```typescript
import { DependencyContainer, container } from 'tsyringe'

class Agent {
  public constructor(
    config: InitConfig,
    agentDependencies: AgentDependencies,
    injectionContainer?: DependencyContainer
  ) {
    // root container doesn't have any stateful classes registered (no modules)
    const rootContainer = injectionContainer ?? container.createChildContainer()

    // container is the base wallet container. This will have modules registered
    // and is accessible using agent.injectionContainer
    this.container = rootContainer.createChildContainer()

    // agent context is only available on the agent container
    const agentContext = /* get the agent context */
    this.container.registerInstance(AgentContext, agentContext)

    // register modules on the base container. Modules can inject the `AgentContext`
    // which they can pass down to the services.
    this.connections = this.container.resolve(ConnectionsModule)
  }
}
```
The tenant agent will only be created for usage with the public API (i.e. accessing `agent.xxx`). When processing an inbound message no module will be created as only the handlers, services, repositories etc will be used.
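
The container hierarchy described above can be illustrated without any dependencies. The `MiniContainer` below is a hypothetical stand-in and is not tsyringe; it only mimics the behaviour the design relies on, where a child container falls back to its parent when a token is not registered locally.

```typescript
// Hypothetical stand-in for a DI container with parent lookup (NOT tsyringe).
class MiniContainer {
  private instances = new Map<string, unknown>()
  public constructor(private parent?: MiniContainer) {}

  public register(token: string, instance: unknown): void {
    this.instances.set(token, instance)
  }

  public resolve<T>(token: string): T {
    if (this.instances.has(token)) return this.instances.get(token) as T
    if (this.parent) return this.parent.resolve<T>(token)
    throw new Error(`Nothing registered for token ${token}`)
  }

  public createChildContainer(): MiniContainer {
    return new MiniContainer(this)
  }
}

interface AgentContext { contextId: string }
class ConnectionService {} // stateless, so a single instance can be shared

// Root container: only stateless classes.
const rootContainer = new MiniContainer()
rootContainer.register('ConnectionService', new ConnectionService())

// The agent container and every tenant container are siblings under the root.
const agentContainer = rootContainer.createChildContainer()
agentContainer.register('AgentContext', { contextId: 'base' })

const tenantContainer = rootContainer.createChildContainer()
tenantContainer.register('AgentContext', { contextId: 'tenant-1' })
```

Because the tenant container is created from the root rather than from the agent container, a tenant can never accidentally resolve the base agent's `AgentContext`, while both still share the same stateless service instances.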
### Tenant Lifecycle
Because multi-tenancy allows for an unlimited number of tenants, we should make sure we build a robust lifecycle mechanism that closes unused wallets and cleans up resources when needed to prevent memory leaks. The handling of the tenant lifecycle has been divided into two sections, described below.
#### Tenant Agent Lifecycle
When a tenant agent instance is created, an agent is created with custom modules specifically for that tenant. All modules hold a reference to the `AgentContext` for that specific tenant. The tenant should be ‘freed’ after you’re done processing with it. For both methods described below, disposal doesn’t mean the wallet will be closed, but it means you’re not dependent on the wallet being open anymore. It is possible for the agent to keep the wallet open for more efficient processing on subsequent requests.
**Asynchronous callback method**
The asynchronous callback method will provide you with an agent in the callback. Once the async method has resolved, we can assume you’re done with the agent and close the wallet and dispose of the injection container with modules. This is comparable to how database sessions/transactions are often performed in JS libraries.
```typescript
await agent.tenants.withTenantAgent({ /* config */ }, async tenantAgent => {
  // tenantAgent can be used until promise resolves.
  const connections = await tenantAgent.connections.getAll()
  // tenantAgent will now be disposed
})
```
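
One way such a callback method could guarantee disposal is a `try`/`finally` wrapper. The sketch below is an assumption about the shape, not the final API: the `TenantAgent` class here is a hypothetical stub that only tracks whether it has been disposed.

```typescript
// Hypothetical tenant agent stub: only tracks whether it has been disposed.
class TenantAgent {
  public disposed = false
  public constructor(public readonly tenantId: string) {}

  public async dispose(): Promise<void> {
    this.disposed = true
  }
}

// Runs the callback with a tenant agent and always disposes the agent
// afterwards, also when the callback throws.
async function withTenantAgent<T>(
  options: { tenantId: string },
  callback: (tenantAgent: TenantAgent) => Promise<T>
): Promise<T> {
  const tenantAgent = new TenantAgent(options.tenantId)
  try {
    return await callback(tenantAgent)
  } finally {
    await tenantAgent.dispose()
  }
}
```

The `await` inside `try` matters: returning the promise without awaiting it would run the `finally` block before the callback has finished.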
**Manual free method**
```javascript
const tenantAgent = await agent.tenants.getTenantAgent({ /* config */ })
// tenantAgent can be used until freed
const connections = await tenantAgent.connections.getAll()
// tenantAgent will now be disposed.
await tenantAgent.dispose()
```
#### Inbound Message Lifecycle
When an inbound message is processed we open a wallet (or get it from a pool of open wallets). After the dispatcher (and thus the handler) is done processing the message, the context can be disposed (i.e. the wallet *may* be closed). As with the tenant agent lifecycle, it is important not to do any processing after the main method has returned (so don’t schedule tasks, keep unresolved promises that use the agent context, etc.). Although this is a strict restriction, it makes sure the agent context is always initialized when needed and won’t cause weird issues throughout the framework.
#### Tenant Session Coordinator
```typescript
// will keep track of all sessions for a specific tenant. A tenant could have
// multiple sessions at the same time (receiving multiple inbound messages)
// but share the wallet between those sessions. When an `AgentContext` is finished
// we can mark the session as completed here and make it available for cleanup.
// If before cleanup an action is performed that opens the wallet again we can just reuse
// the open wallet and add it back to the current session mapping again.
class TenantSessionCoordinator {
  // example. Should look at max open sessions per tenant and also total max open sessions
  // e.g. total max is 100 (random, can probably be higher) while the max per tenant is 3 (also random)
  public readonly maxOpenSessions = 100
}
```
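
A minimal sketch of the bookkeeping such a coordinator could do is shown below. All names, fields and limits are assumptions for illustration. To stay short it throws when every open wallet is busy, whereas the design above calls for waiting until a wallet has finished processing, and it uses a logical clock instead of real timestamps.

```typescript
// Hypothetical sketch: every name and limit here is an assumption.
interface OpenWallet {
  tenantId: string
  activeSessions: number
  lastUsed: number // logical clock value, higher means more recently used
}

class TenantSessionCoordinator {
  private openWallets = new Map<string, OpenWallet>()
  private clock = 0

  public constructor(public readonly maxOpenWallets: number) {}

  // Start a session; reuses the tenant's wallet when it is already open.
  public acquire(tenantId: string): void {
    let wallet = this.openWallets.get(tenantId)
    if (!wallet) {
      if (this.openWallets.size >= this.maxOpenWallets) this.closeLeastRecentlyUsedIdleWallet()
      wallet = { tenantId, activeSessions: 0, lastUsed: 0 }
      this.openWallets.set(tenantId, wallet)
    }
    wallet.activeSessions += 1
    wallet.lastUsed = ++this.clock
  }

  // End a session. The wallet stays open so later sessions can reuse it.
  public release(tenantId: string): void {
    const wallet = this.openWallets.get(tenantId)
    if (!wallet || wallet.activeSessions === 0) throw new Error(`No active session for ${tenantId}`)
    wallet.activeSessions -= 1
  }

  public isOpen(tenantId: string): boolean {
    return this.openWallets.has(tenantId)
  }

  private closeLeastRecentlyUsedIdleWallet(): void {
    const idle: OpenWallet[] = []
    this.openWallets.forEach((wallet) => {
      // never close a wallet that is still processing
      if (wallet.activeSessions === 0) idle.push(wallet)
    })
    if (idle.length === 0) throw new Error('All open wallets are busy')
    idle.sort((a, b) => a.lastUsed - b.lastUsed)
    this.openWallets.delete(idle[0].tenantId)
  }
}

const coordinator = new TenantSessionCoordinator(2)
coordinator.acquire('tenant-a')
coordinator.acquire('tenant-b')
coordinator.release('tenant-a') // tenant-a is now idle, but its wallet stays open
coordinator.acquire('tenant-c') // limit reached: the idle tenant-a wallet is closed
```

A time-based cleanup (closing wallets that have been idle for some duration) could reuse the same `lastUsed` bookkeeping with real timestamps.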
### Tenant Events
Events should include some form of metadata that makes it possible to uniquely identify an event from a specific wallet. The event bus will be shared across all tenants. This is partly because events can occur even when no agent for the tenant exists, meaning the event can’t be listened to (and most probably there won’t be
**Single event bus for base and tenant agents**
We need a way to add metadata to all events (probably take it from the agentContext.metadata?)
```javascript
tenantAgent.events.on(ConnectionStateChanged, (event) => {
  // metadata (key tbd) contains metadata for this event. In case of multi tenancy it will
  // contain the tenantId.
  console.log(event.metadata.tenantId)
})
```
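
A self-contained sketch of the single shared bus is given below, assuming the emitter copies the metadata from the agent context onto each event. `SharedEventBus`, `AgentEvent` and the `forTenant` helper are hypothetical names, and the metadata key is still to be decided.

```typescript
// Hypothetical event shapes; the metadata key is still to be decided.
interface AgentContext { metadata: { tenantId?: string } }
interface AgentEvent {
  type: string
  payload: Record<string, unknown>
  metadata: { tenantId?: string }
}
type Listener = (event: AgentEvent) => void

class SharedEventBus {
  private listeners = new Map<string, Listener[]>()

  public on(type: string, listener: Listener): void {
    const existing = this.listeners.get(type) ?? []
    this.listeners.set(type, [...existing, listener])
  }

  // The emitter stamps the metadata from the agent context onto every event.
  public emit(agentContext: AgentContext, type: string, payload: Record<string, unknown>): void {
    const event: AgentEvent = { type, payload, metadata: { ...agentContext.metadata } }
    for (const listener of this.listeners.get(type) ?? []) listener(event)
  }
}

// Wraps a listener so it only sees events for one tenant.
function forTenant(tenantId: string, listener: Listener): Listener {
  return (event) => {
    if (event.metadata.tenantId === tenantId) listener(event)
  }
}

const eventBus = new SharedEventBus()
const receivedByTenantOne: string[] = []
eventBus.on(
  'ConnectionStateChanged',
  forTenant('tenant-1', (event) => {
    receivedByTenantOne.push(String(event.payload.connectionId))
  })
)

eventBus.emit({ metadata: { tenantId: 'tenant-1' } }, 'ConnectionStateChanged', { connectionId: 'conn-1' })
eventBus.emit({ metadata: { tenantId: 'tenant-2' } }, 'ConnectionStateChanged', { connectionId: 'conn-2' })
```

With this shape, listeners that do not filter would see events from every tenant, which is why some tenant identifier in the metadata is needed.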
**Separate event bus for base agent and shared event bus for tenant agents**
This would mean we need different event emitter handling for tenants
```javascript
agent.events.on(ConnectionStateChanged, (event) => {
  // always in context of base agent here
})

agent.tenants.events.on(ConnectionStateChanged, (event) => {
  if (event.metadata.tenantId !== tenantId) return
  // always in context of tenant agent here
  console.log(event.metadata.tenantId)
})

agent.events.observable(ConnectionStateChanged).pipe(
  filterTenant(tenantId)
).subscribe((event) => {
  // always in context of tenant agent here
  console.log(event.metadata.tenantId)
})
```
Questions:
* What is the best approach here?
### Wallet key registration for tenant wallets
When one of the tenants creates a key that will be used for DIDComm messaging (keys only used for signing are out of scope here), the key needs to be registered somewhere so the `TenantAgentContextProvider` can find the right tenant for an incoming message.
In ACA-Py we inject the multi tenant manager every time we create a new key. This has a few downsides, mainly:
1. There’s a lot of repeated code handling for the multi tenant integration across the codebase
2. The codebase needs to be aware of a base → tenant hierarchy, making the code more complex and less flexible.
In AFJ we already have the `MediationRecipientService.getRouting` abstraction, which we always use when creating keys for DIDComm routing. This nicely integrates with the mediation module and can provide the needed `routingKeys`. As there’s now another module that needs to take part (in this case just by listening), we could look at an API that allows modules to hook into the process of providing routing keys.
**Middleware pattern**
Multiple middlewares can modify the same routing object, adding mediator routingKeys and registering them at the mediator, or storing the new recipient key in the tenant record.
[Example Code](https://www.typescriptlang.org/play?#code/PQKhCgAIUhyA7ApgDwC60gMwK7wMaoCWA9vADSQAOAhgM62IAmkqxk1kAtoY4wDaIA7tQBOiKCGDhUAT0qJIAORSpIAXkgAKAJTqAfJABuxHpAA+kAAoji3BgB5jPPQG5w4UBGiQAglx78QqLi0FKy8pAAsgECwmL2ACoGalBaeKSoKgBckAkUSGg5ymi6agbWtoQOTowGFjVuHmASkAASiHzyIli4BCTwWMTdhPDGANYjAObskHgAFtQjkMSY-ryxwbTLAxzp8JloAHQSUnQy+D34RKSQI+OI0etBYrSJepp7B6g5eWuBcYhaDlHv9gm8ANoAXW0OQqdkQjhMtUgAG93JBbqtNABCbhPAG0Q4CeCTVBzXRiVDYETwRoYva0VScQTqP4bF7ggAMkLpkEp1IGzI+GRUFDOFx0+lRqQx1GEhFUd2IYweMWegOF+1FbPVhNofEIeEQmgAjNptG4MQBfbTuK3uTwtPx40FiWYZRZIbrUeDMJUqkQnSDgPB8OhbSKCAAiVRoqHmiBEb2lqRd7MBwLVAIhPPR7vgjJE2AIQ0laIxGLJVUOad1rKhlsg9tSjorMB8vBmtYBlz6pGObakGOwDE0h3HzMz+LBSShMKMSJTFZYc2r3c2h0oI7mY4nggteebGNbGJgAFFkIg8NhMiuFPNPctVuuXhQlmSFENGIm7zJIIJEwUaheCYbZ2BlGBJkIQxEAGABhEUjmDQdUkYWNqHjHdPmyXJ5zhKoERqAxy2XfkaVuUZlVVacXk1L4KCrQkX0BW1rTtdxwBGTIREwagjUgBCtTQVE+WIG8pgAaUQGQgUgQspihCgxDwQhKEIWDUCkmScnkklFMgWDGEoEx9lk3TJihJt3AZVRqEoShWSQFlIxjWg4wTJNBK+d5bRDUhGTvRQhk4ag+B8PsBg0OicK87UCm+JQVFKYiW2AWYxAwhQP0gb9eOwPhVBVGRUgZYgBCJYhJk0AAiOCMqIElICK6rWPzL5DmU1T1P2LTCS3Wgd1gbL4GC0Kmuk2BfOPNKGFUbLDOMrjaBK-yysQCqquqgBlRBUAa6YFpM1AWpWoTUEOQ6lvrar5t9RbTOqyE83inQ81K8q+EqmqfEwbilFGvgTubGzIEmXbIiYQgMKGAAlMT9tZaLCgExDUHyHDilQZLpWmlhiEYYgcjmYgWVYUHdrvLhIehkQAH4tCQUDaFsBROF24CMOoVqQbZtCadZEiMUu-YciGuZEAAWl5qHWBECXhfQVJ7QxU7mY+r7avqqZ2F9PlECgxlE21or-wVOYqb52WTtxsQDb+k2MItmWhlSp2aY6-Wqm43qkfOzq1I03rWtVtaNu+3htel-mbHEkleuttqVA6+HJOkvrtxq7Ko9l8aZBOkP1c2na9sj6mc4VhPsKOBWtg0cFs6GC67qOp7Uheqb8zV9bPs2n7uIhy3YZTkkges-zVEZIYpgSWCfU0tPEarhLYrQdHkcx7GSPe7uNa22XtcyeA59z2h84xdu3tWwvvt+n9J7EUf3DsyhDhHY0PyCkQQrCiKKDB1AB7OxEHDWOkwKD32nrPHqadfIgyXgLUSoDeo5H0v7bq89tKQH0jXFBkI2LP0OGhNyGF5i+20IcD88BNCSjKJ3UOPcyG+SAA)
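
For readers who don’t want to open the playground, the general shape of a middleware chain is sketched below. This is an independent illustration and not the code behind the link: the simplified `Routing` type, the `use` method and the example middlewares are all assumptions.

```typescript
// Hypothetical shapes, not the code behind the playground link.
interface Routing { recipientKey: string; routingKeys: string[]; endpoints: string[] }
type Next = (routing: Routing) => Promise<Routing>
type RoutingMiddleware = (routing: Routing, next: Next) => Promise<Routing>

class RoutingProvider {
  private middlewares: RoutingMiddleware[] = []

  public use(middleware: RoutingMiddleware): void {
    this.middlewares.push(middleware)
  }

  public async getRouting(recipientKey: string): Promise<Routing> {
    const initial: Routing = { recipientKey, routingKeys: [], endpoints: [] }
    // Build the chain back to front so the first registered middleware runs first.
    const chain = this.middlewares.reduceRight<Next>(
      (next, middleware) => (routing) => middleware(routing, next),
      async (routing) => routing
    )
    return chain(initial)
  }
}

const routingProvider = new RoutingProvider()
const tenantIdByRecipientKey = new Map<string, string>()

// Mediation middleware: extends the routing object with mediator details.
routingProvider.use(async (routing, next) =>
  next({
    ...routing,
    routingKeys: [...routing.routingKeys, 'mediator-key'],
    endpoints: ['https://mediator.example'],
  })
)

// Tenant middleware: only a side effect, storing one mapping per recipient key.
routingProvider.use(async (routing, next) => {
  tenantIdByRecipientKey.set(routing.recipientKey, 'tenant-1')
  return next(routing)
})
```

Each middleware decides when to call `next`, so it can act both before and after the rest of the chain has run.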
**Visitor-esque Pattern**
Add visitor functions that can be registered in the routing provider. When `getRouting` is called, the provider will loop over all visitors and call each one with the current routing object, allowing it to return an extended routing object and run side effects.
[Example Code](https://www.typescriptlang.org/play?#code/FASwdgLgpgTgZgQwMZQAQEEDmVIGED2kUAHhKgN4C+wSANggM4OoAqOCkWOEBRpqIALYAHWlEHdmXPIWj9ywVKmhgOEAJIATAFyoGEGOEzBFqYQFcARrRBJUSQvpjmkEfDAAUKtVt1OjAJQUpkoQABYgDAB03pBaqAC8yuxxmqbU1KBE8MhovHJkCkoSmiAIbjBaAPx+BkamCNgyfBC60jyyJBDAmeDQOSioAEr45hBGwUo4msL4fQy1hmCYANoAuqYwo+PLANJQAJ4LenXL65tQSCDCINz7R4tG55kQB8JoI2NGAGqRIBWJVAeLZfZa6T47TAAGnsnVIunyXSCCQAfKgAApbQSRKAAHghRhRJjojGYBOWmPwADcQJpYJMzIYqeU0CDIb8GP93MdyZgOVyYOtAecGppNPyKsDtj8-hVwdLlhL3EEikplBFomyZZyKtELAwwlLQXzZcr0iYlIwDmA7NgILyPA4Wgi4RAVSFUAB6T32GBQFmoMBQADuqC1y1QAGtDh6xGRw5h5cbAeQwwrMPdjusYdNZvNdNmw5drrdIJmCwBycJQAC0QeDNb9Vxu3Br0YOFbWmTVqDg7iBTv0aeNSpgqHwcHVkSiCdHDHdPYTgIQwYQ-2H7NNngTMKdBQCHu7ar9EHMMDAG-qSgyxMcZBKZXGhF5Sc3Ov7SStNqBO9hLWRaKqt6qCCPgSCRqYg5kHaAAiUCIOYtAQAAslApTlO48Sfgw1p2B4AGoAARHSCBwIhEA1g+GEwDWtKERaf5DlRFRYX+BRRMxmGaKgVRVKgK5rtBUAQHBCFIah6EsZo+EMSAk4eAAhJxlSaEEJ5nheCYMep54MkoUQGTuHq5nMkBZhWzEgIQNYmX0nZQh6CblqgKwGTO6bucamYwhZaGPlZYA2WAMymRA9nmpkUHJKokAvsM6ajsuOHfkakK7q6BFAT6DCRtcAiTmA+BkOA0VqGxXSmHJQIKY6rpRI03CIvw4D6BwKATqwKQQO0TVumpwkaZeywMVFsQaNxSR7l09VNB0LQxF1WiQY4+BiFEtD4JgHiEQAyhUaBjUWzalmQ7YMIRMJjVoB4XKeulaZkwDAQwwmoOYwhDZgjLUrSsA0Hen2UjSdJjkk9bxcaQO-Z4B4JlDIP1WKo4eJZz7prD6bw7AiPiluXhdbyN3Pa973qgdXXlaQ-1gEOU3wqgvUpg0s29bo4NsDF3Us6623VjWY20ZohEHpkcNbMD2N2g6dNujEYQ4LVNOrVA62bQEQA)
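
The visitor variant can be sketched in the same spirit. Again this is an independent illustration rather than the linked code, with a simplified `Routing` type and a hypothetical `registerVisitor` method.

```typescript
// Hypothetical shapes, not the code behind the playground link.
interface Routing { recipientKey: string; routingKeys: string[] }
type RoutingVisitor = (routing: Routing) => Promise<Routing>

class RoutingProvider {
  private visitors: RoutingVisitor[] = []

  public registerVisitor(visitor: RoutingVisitor): void {
    this.visitors.push(visitor)
  }

  public async getRouting(recipientKey: string): Promise<Routing> {
    let routing: Routing = { recipientKey, routingKeys: [] }
    // Each visitor receives the current routing object and returns the next one.
    for (const visitor of this.visitors) {
      routing = await visitor(routing)
    }
    return routing
  }
}

const routingProvider = new RoutingProvider()
const tenantIdByRecipientKey = new Map<string, string>()

// Mediation visitor: returns an extended routing object.
routingProvider.registerVisitor(async (routing) => ({
  ...routing,
  routingKeys: [...routing.routingKeys, 'mediator-key'],
}))

// Tenant visitor: side effect only, returns the routing object unchanged.
routingProvider.registerVisitor(async (routing) => {
  tenantIdByRecipientKey.set(routing.recipientKey, 'tenant-1')
  return routing
})
```

Compared to the middleware sketch, the provider owns the iteration here, so a visitor cannot skip later visitors or run code after them.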
Questions:
* What is the best approach here?