# Core SDK Update
## Overview
Previously, the App & Extensions operated on an Event Driven model, where components within the platform could act as both subscribers and publishers. This approach aimed to reduce dependencies between components and minimize logic at the application layer, but it led to increased complexity at the extension layer.
While it scaled easily across frameworks and languages, it made tight coupling with platform components impossible. For example, directly utilizing engines and tools within the app and extensions was not feasible.
```mermaid
graph TD;
A[Chat Shell] -->|broadcast events| B[Extensions];
B -->|broadcast events| A;
```
As the chart above shows, components do not distinguish between a designated sender and receiver; logic is driven by the event data. This isn't really a bad architecture; in fact, it's easily scalable. However, it is troublesome for extension development, requiring extension developers to subscribe to events and proxy requests in order to build subsequent actions.
This requires developers to have a comprehensive understanding of the entire system structure and all of its extensions. (Wew...)
- How can a sequence of actions be executed across multiple extensions?
- How can an inference action be performed without impacting ChatShell? E.g. thread summarization.
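To make the pain concrete, here is a minimal, self-contained sketch of the subscribe-and-rebroadcast pattern; the bus and event names are hypothetical, not the actual SDK API:

```ts
// Hypothetical event bus and event names - NOT the real SDK API,
// only to illustrate the subscribe-and-rebroadcast pattern
type Handler = (data: any) => void

class EventBus {
  private handlers: Record<string, Handler[]> = {}
  on(event: string, handler: Handler) {
    (this.handlers[event] ??= []).push(handler)
  }
  emit(event: string, data: any) {
    for (const h of this.handlers[event] ?? []) h(data)
  }
}

const events = new EventBus()

// Extension A must know the shell's event names to chain an action:
events.on('message:created', (msg) => {
  // augment the request, then re-broadcast for the next extension
  events.emit('inference:requested', { ...msg, augmented: true })
})

// Extension B picks the event up and broadcasts the result back
events.on('inference:requested', (req) => {
  events.emit('message:updated', { ...req, completed: true })
})
```

Every extension in the chain has to agree on event names and payload shapes with every other extension, which is exactly the system-wide knowledge burden described above.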
## Engines
We introduce Engines (Inference Engines), which are objects registered by extensions upon installation into the application. These Engines are managed by the Core SDK instance and can be utilized throughout the platform, from the application to extensions, to meet various usage needs. For example, an Assistant Extension can use the OpenAI engine for embedding and then perform inference using the Nitro engine, or vice versa.
```mermaid
graph TD;
A[Inference Extensions] --> |register engines| CoreSDK
C[Chat Shell] -->|imports|CoreSDK
Extensions -->|imports|CoreSDK
CoreSDK --> |uses| Engines;
Engines -->|inference| Outputs;
```
In the diagram above, the Core SDK can be imported at both the application layer and the extensions layer, so Engines are used the same way in each.
Example code snippet demonstrating the use of an engine:
```ts
import { EngineManager } from "@janhq/core"
const engine = EngineManager.instance().get(model.engine)
// Load a model, then run inference
engine.loadModel(model).then(() => engine.inference(request))
// Unload the model
engine.unloadModel(model)
```
**How can an extension register an engine?**
We provide an AIEngine class that inherits from BaseExtension, so extension developers can extend it and implement the pre-designed methods:
```ts
abstract class AIEngine extends BaseExtension {
  // Provider name
  abstract provider: string;
  // Defines pre-populated models
  abstract models(): Model[];
  // Loads the model
  abstract loadModel(model: Model): Promise<void>;
  // Stops the model
  abstract unloadModel(model: Model): Promise<void>;
  // Runs inference on a message request
  abstract inference(data: MessageRequest): void;
  // Stops inference
  abstract stopInference(): void;
}
```
By providing this AIEngine class, we offer a structured framework for extension developers to build custom engines. They can extend the AIEngine class and implement the required methods according to the specific requirements of their engine. This approach ensures consistency, reusability, and ease of development for building new engines within the platform.
Example:
```ts
// OpenAI inference engine extension
class OpenAIEngineExtension extends AIEngine {
  // Loads the model
  override async loadModel(model: Model) { /** Custom logic goes here **/ }
  // Stops the model
  override async unloadModel(model: Model) { /** Custom logic goes here **/ }
  // Runs inference on a message request
  override inference(data: MessageRequest) { /** Custom logic goes here **/ }
}
```
Furthermore, we have built popular engine base classes, such as RemoteOAIEngine and LocalOAIEngine, for extension developers to extend and configure instead of building custom logic from scratch. More will follow.
```mermaid
classDiagram
AIEngine <|-- OAIEngine
OAIEngine <|-- LocalOAIEngine
LocalOAIEngine <|-- NitroExtension
LocalOAIEngine <|-- TensorRTLLMExtension
OAIEngine <|-- RemoteOAIEngine
RemoteOAIEngine <|-- OpenAIExtension
RemoteOAIEngine <|-- GroqAIExtension
RemoteOAIEngine <|-- TritonTensorRTExtension
class AIEngine {
<<Abstract>>
+string provider
+models()
+prePopulateModels()
}
class OAIEngine {
<<Abstract>>
+string inferenceUrl
+Model loadedModel
+inference()
+stopInference()
+headers()
}
class RemoteOAIEngine {
<<Abstract>>
+string apiKey
+headers()
}
class LocalOAIEngine {
<<abstract>>
+loadModel()
+unloadModel()
}
class OpenAIExtension {
+string inferenceUrl
+string apiKey
}
class GroqAIExtension{
+string inferenceUrl
+string apiKey
}
class TritonTensorRTExtension {
+string inferenceUrl
+string apiKey
}
class NitroExtension {
+spawnNitroProcess()
+loadModel()
+unloadModel()
+killSubprocess()
}
class TensorRTLLMExtension {
+spawnNitroProcess()
+loadModel()
+unloadModel()
+killSubprocess()
}
```
### OAIEngine
As previously mentioned, there are pre-implemented functions from the base class, such as handling SSE requests, cancellation, error handling, constructing request headers, and text decoding, to facilitate the ease of building a new engine for extension developers.
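As an illustration of one of those responsibilities, here is what decoding an OpenAI-style SSE stream can look like. This is a hedged sketch, not the actual OAIEngine internals:

```ts
// Illustrative sketch of decoding an OpenAI-style SSE stream; the actual
// OAIEngine implementation may differ
function* decodeSSE(chunk: string): Generator<string> {
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim()
    if (!trimmed.startsWith('data:')) continue
    const payload = trimmed.slice('data:'.length).trim()
    // OpenAI-style streams end with a [DONE] sentinel
    if (payload === '[DONE]') return
    // Each payload is a JSON chunk carrying a content delta
    const delta = JSON.parse(payload)?.choices?.[0]?.delta?.content
    if (delta) yield delta
  }
}

// Usage: concatenate deltas into the streamed message text
const text = [...decodeSSE(
  'data: {"choices":[{"delta":{"content":"Hello"}}]}\n\ndata: [DONE]\n'
)].join('')
```

Because this plumbing lives in the base class, an extension only supplies the provider-specific bits (endpoint, headers, model lifecycle).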
**RemoteOAIEngine**
The RemoteOAIEngine is compatible with remote OAI service providers, such as OpenAI, Groq, Deepinfra, and others.
For example, by extending RemoteOAIEngine, developers only need to define the provider name, endpoint, and apiKey:
```ts
// OpenAI inference engine extension
class OpenAIEngineExtension extends RemoteOAIEngine {
  // Provider name
  provider: string = 'openai'
  // Inference URL
  inferenceUrl: string = 'https://api.openai.com/v1/chat/completions'
  // API Key - if required
  apiKey: string = 'sk-<your key here>' // Retrieve from settings
}
```
> What about other endpoints, such as /embedding? (TBD)
**LocalOAIEngine**
LocalOAIEngine is designed to assist developers in building and communicating with local inference engines, such as Nitro, TensorRT-LLM, and Ollama...
```ts
// TensorRT-LLM inference engine extension
class TensorRTLLMEngineExtension extends LocalOAIEngine {
  // Provider name
  provider: string = "tensorrt-llm"
  inferenceUrl: string = "http://localhost:3929/v1/chat/completions"
}
// Node module under node/index.ts
export default {
  loadModel: (params: any, systemInfo?: SystemInformation) => { /** Custom logic **/ },
  unloadModel: () => { /** Custom logic **/ },
}
```
Since LocalOAIEngine primarily interacts with processes running directly on the operating system, it cannot be tightly coupled with the extension class (browser module). Instead, through inheritance from the base class, extension developers only need to provide custom logic according to the methods, as demonstrated in the example above.
### Engine Lifecycle
```mermaid
sequenceDiagram
participant App
participant Extension
participant CoreSDK
App->>Extension: Loads Extension on load
Extension->>Extension: onLoad is invoked
Extension->>CoreSDK: Registers Engine during onLoad
CoreSDK->>CoreSDK: Stores Engine instance
App->>CoreSDK: Retrieves Engine
Extension->>CoreSDK: Retrieves Engine
```
- **App loads Extension on load**: The application loads the extension during its initialization phase.
- **onLoad is invoked**: Once the extension is loaded, its onLoad function is invoked, indicating that the extension is being initialized.
- **Extension Registers Engine during onLoad**: During initialization, the extension registers its engine with the CoreSDK, indicating that it is ready to provide functionality and services.
- **CoreSDK Stores Engine instance**: The CoreSDK stores the registered engine instance, making it accessible globally within the application and other extensions.
- **App Retrieves Engine**: The application retrieves the registered engine from the CoreSDK, enabling it to utilize the engine's capabilities as needed.
- **Extension Retrieves Engine**: Additionally, the extension itself can retrieve the registered engine from the CoreSDK for its own use, ensuring consistent access to engine functionality across the application.
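The lifecycle above can be sketched with a minimal registry. This is a simplified stand-in for the real EngineManager, so the exact names and signatures are assumptions:

```ts
// Simplified stand-in for the Core SDK's engine registry - illustrative only,
// real names and signatures may differ
interface Engine {
  provider: string
}

class EngineManager {
  private static _instance?: EngineManager
  private engines = new Map<string, Engine>()

  // Single shared instance, accessible to the app and all extensions
  static instance(): EngineManager {
    return (this._instance ??= new EngineManager())
  }

  // Called by the extension while its onLoad runs
  register(engine: Engine) {
    this.engines.set(engine.provider, engine)
  }

  // Called later by the app or by other extensions
  get(provider: string): Engine | undefined {
    return this.engines.get(provider)
  }
}

// Extension's onLoad registers its engine...
EngineManager.instance().register({ provider: 'openai' })
// ...and the app retrieves it by provider name
const openai = EngineManager.instance().get('openai')
```

Because the registry is a singleton held by the Core SDK, registration in one extension's `onLoad` makes the engine visible everywhere that imports the SDK.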
## Tools
**Problem**
Previously, the definition of a tool was merely a name, and it was hard-coded as follows: `if(assistant.tool_enabled) ingestDoc()`. This means that scaling to support multiple tools was not possible, and developers couldn't support additional tools from their own extensions. It also meant that chaining actions through tools was not feasible.
Furthermore, listening to events from extensions and processing them into tool actions is not efficient. It requires adding cumbersome logic to modify and validate requests at each node, which is quite complex and demands an understanding of the entire system and all installed extensions (at runtime). However, developers only have visibility into this process at build time, making it quite challenging.
```mermaid
graph TD;
A[Chat Shell] -->|broadcast events| B[Assistant Extension];
B -->|broadcast events| C[Inference Extension] -->|broadcast events| A;
```
- How can developers add new tools via their extensions?
- How are requests processed by multiple tools?
**Tools Provider**
Extensions/assistants can now register Inference Tools to be used across extensions/apps, which was not possible previously. This eliminates the need for a hacky retrieval tool implementation (modifying & proxying requests), which added a lot of messy logic and code.
```mermaid
graph TD;
AssistantExtension --> |registers Tools| CoreSDK;
App --> |imports| CoreSDK;
Extension --> |imports| CoreSDK;
CoreSDK --> |uses| Tools;
Tools --> |inference|Engines
```
Here is an example of how a tool (for retrieval) can be built:
```ts
class RetrievalTool extends InferenceTool {
  // Tool's identifier
  name: string = 'retrieval'
  // Process the request, modify it, and return the output
  async process(
    data: MessageRequest,
    tool?: AssistantTool
  ): Promise<MessageRequest> {
    // Custom logic - e.g. utilize langchain
    // Example pseudocode
    await ingestDoc(data, vectorDb)
    const output = await query(vectorDb, data.messages)
    // Return the modified request (example pseudocode)
    return { ...data, messages: output }
  }
}
```
From here, we can register tools from extensions:
```ts
class JanAssistantExtension extends AssistantExtension {
  onLoad() {
    // Register new Tool
    ToolManager.instance().register(new RetrievalTool())
  }
}
```
Any request from ChatShell goes through the ToolManager, which chains together available tools for processing and gives the final output to the engine for inference.
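A minimal, self-contained sketch of that chaining follows; it is a simplified stand-in for the real ToolManager, so names and signatures are assumptions:

```ts
// Simplified stand-in for the SDK's tool chaining - illustrative only,
// real names and signatures may differ
interface MessageRequest {
  messages: string[]
}

abstract class InferenceTool {
  abstract name: string
  abstract process(data: MessageRequest): Promise<MessageRequest>
}

class ToolManager {
  private tools: InferenceTool[] = []

  register(tool: InferenceTool) {
    this.tools.push(tool)
  }

  // Each tool receives the previous tool's output as its input;
  // the final request is what gets handed to the engine for inference
  async process(data: MessageRequest): Promise<MessageRequest> {
    let request = data
    for (const tool of this.tools) {
      request = await tool.process(request)
    }
    return request
  }
}

// Hypothetical tool that annotates a request, e.g. with retrieval context
class EchoTool extends InferenceTool {
  name = 'echo'
  async process(data: MessageRequest): Promise<MessageRequest> {
    return { messages: [...data.messages, '[context added by echo]'] }
  }
}
```

A tool may itself call an engine (e.g. for embedding) inside `process`, which is how retrieval can embed and query documents before the final completion request is made.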
```mermaid
sequenceDiagram
participant ChatShell
participant ToolManager
participant Tool
participant Engine
ChatShell->>ToolManager: Sends completion request
ToolManager->>Tool: Chain processes (Tool X)
Tool->>Tool: Process request
Tool-->>Engine: Process request
Engine-->>Tool: Embedding / completion response
Tool->>ToolManager: Processed request
ToolManager->>Tool: Chain processes (Tool Y)
Tool->>Tool: Process request (langchain)
Tool->>ToolManager: Processed request
ToolManager->>ChatShell: Returns result
ChatShell->>Engine: Sends processed request
Engine->>ChatShell: Sends completion message
ChatShell->>ChatShell: Update UI
```
Tools can also use engines to process a request, such as embedding or chat.
## Structure
CoreSDK exports two different modules: `browser` and `node`.
```
/core
│ index.ts
│ rollup.config.ts
│ package.json
│
├── browser
│ │ core.ts
│ │ fs.ts
│ │ events.ts
│ │
│ ├── extensions
│ │ ├── engines
│ │ │ (engine files...)
│ │ │
│ │ (extension files...)
│ │
│ └── tools
│ (tool files...)
│
└── node
│
├── api
│ (api files...)
│
├── extension
│ store.ts
| manager.ts
│
└── helper
(helper files...)
```
**Browser Module (@janhq/core)**
The Browser module is designed for use in the browser process.
- **Extensions**: Base extension class such as assistant, conversational, model, or monitoring.
- **Tools**: Tool's base class and manager.
- **Engines**: Inference engine's base class and its extended classes.
- **Events**: The event class allows apps and extensions to subscribe to or publish global events.
- **FS**: The File System submodule offers functionalities for managing files and directories from browser process, executed by NodeJS process.
- **Core APIs**: Core APIs provide essential functionalities and utilities for web application development.
- **Types**: Types to be exported for use across components/modules.
**Node Module (@janhq/core/node)**
The Node module is designed for use in the Node.js process.
- **APIs**: Logic of the Jan API Server.
- **Extension (Store)**: Allow installing, uninstalling, and managing available extensions from the Node.js host.
- **Helper**: Helpers for use in the Node.js process.
- **Types**: Types to be exported for use across components/modules and shared with the browser process.
## Extension Settings
With the introduction of the settings API from the SDK, extension developers can now register their own settings more seamlessly. Here's an example of how an extension, such as `OpenAIExtension`, can utilize this API:
```ts
class OpenAIExtension extends RemoteOAIEngine {
  apiKey: string = ''

  constructor() {
    super()
    // Register settings - can be edited from the Settings page
    this.registerSetting<string>('api-key', 'sk-<your key here>')
  }

  onLoad() {
    // Read the 'api-key' setting
    this.apiKey = this.getSetting<string>('api-key', 'default_value')
  }

  // Update extension logic once a setting is updated
  onSettingUpdate<T>(key: string, value: T): void {}
}
```
In this example, `OpenAIExtension` extends `RemoteOAIEngine` and uses the SDK's `getSetting()` method in its `onLoad()` function to access the 'api-key' setting. This simplifies file operations and reduces code duplication among extensions.
## What's next?
- Extension's UI Framework
- Python CoreSDK: import from Python scripts to communicate with the app/extensions?