# Zettel: a framework for creating apps from functional modules, aided by LLMs

v0.0 - HM

## 1. Introduction

Zettel is a text-to-app platform: a framework for constructing applications in flexible ways from functional modules, aided by LLMs. Zettel makes it simple for end users to get customized applications using natural language. The robustness of well-tested modules makes the application more reliable than having GPT write the code from scratch.

Traditionally, software tools are application-centric. To use a feature offered by one application, users have to bring their data into that application, and then export it to another application for operations the first does not offer. Zettel takes a data-centric approach. It allows applications to be built in flexible ways on top of data. Developers can focus on building modules, letting Zettel deal with authentication, natural language UI, storage, etc.

Below is an example of a Zettel application.

![](https://hackmd.io/_uploads/SJIJtFBwh.gif)

## 2. System Design

To create an app, end users first describe their needs, imperatively or declaratively, in natural language. Zettel's AI agent clarifies the users' intents with them and chooses relevant modules from the module library to form the application. Users can modify the app and add features at any time.

### 2.1. Overview

The figure below outlines the main system design. It provides a general idea of how a user prompt is translated into an app.

![System diagram](https://hackmd.io/_uploads/HyRDTARL2.png)

The implementation of Zettel contains the following concepts:

- **End-user interface**: a frontend where the user interacts with Zettel, entering prompts and accessing the application created.
- **AI agent**: translates the user prompt into a set of modules to invoke and a description of how they should coordinate.
- **Application**: a workflow formed by one or more modules. Each application is a page of cards.
- **One generic data schema**: Zettel uses one generic data schema for all applications.
- **Functional modules and module library**: small functional components that bring "functionality" to an app. The module library is a collection of modules built by the community.

### 2.2. AI Agent

Unlike common no/low-code applications, end users do not need to hand-pick modules to form an app. Instead, the AI agent is trained to orchestrate the modules and further configure them. Every module only needs to describe its own capabilities in natural language. For example, the **upvote module** is described as:

`"aiDescription": "User can upvote on cards; Ranking cards based on users' feedback; Counting upvotes per card"`

> Future work: deconstruct tasks; rank modules to better pick among similar modules.

> Note: currently using few-shot learning with GPT-3.5; fine-tuning is planned next.

### 2.3. Application architecture

A Zettel application's state is an array of strings, which we call a **page** of **cards**. The cards are the only place where data is stored. The functionalities of an app are provided by the modules enabled on that app. Modules do not keep a data store of their own; anything they store goes on the page's cards, where it is accessible to other modules.

![](https://hackmd.io/_uploads/HkHPXuBPn.png)

### 2.4. One generic data schema

In order to coordinate different modules, Zettel uses one central database to share data and state. A Zettel application is a synchronized store of data that is simultaneously accessible to all users and modules belonging to that app. All users and modules therefore work with the same data in parallel, and users can easily replace a module with an alternative.

This unified data approach serves as the main channel of communication between all modules and users in a Zettel app.

The design of the data schema for the central database takes two criteria into consideration:

- Every module should be able to read any data, regardless of which user or module wrote it.
- The data should be human-readable and human-editable. Even if the relevant module is no longer present, the data remains readable and usable.

We invented a **human language described** data format to store any data that a module writes. It can be read easily not only by human users but also by modules, aided by AI. A pseudo-example of a Zettel application (page) may look like:

```javascript
// My App:
const myApp = {
  name: 'My App',
  members: [
    { name: 'Maria Garegin' },
    { name: 'Anoush Narek' },
    { name: 'Sahak Taline' },
    // ...
  ],
  modules: [
    { name: 'Scheduler' },
    { name: 'Sync to Google Calendar' },
    // ...
  ],
  // Each card contains human language described data.
  // A Zettel app is an array of strings, a.k.a. a page of cards.
  cards: [
    { text: 'Weekly meeting is from 10:00 AM to 11:00 AM. Link to meeting room is https://...' },
    { text: 'The retro meeting is from 2:00 PM to 2:30 PM.' },
    // ...
  ],
};
```

Modules can write data to a card in its `text` field, using human language to describe the data and its format. Section **3.1. Upvote Module** gives a real example of how the Upvote module uses this approach to store upvote data.

One downside of storing all data in a human language described format is that it requires modules to make requests to LLM services every time they read or write data. To reduce such requests, every module has a _private storage space_ on each _page_ and _card_ where it can keep its internal operational data, such as a cache of the information extracted from processing the card content.

### 2.5. Module

Modules are small functional components that bring "functionality" to an app. Every module consists of a client-side and sometimes a server-side implementation, just like a web application. Modules can also extend the UI to provide users with customized interaction. The figure below describes the module structure.
![Client- vs Server-side](https://hackmd.io/_uploads/SyiurSmP2.png)

The client-side implementation includes a _manifest_ file containing the header data about the module, the JavaScript implementation, and other assets the module needs at runtime. When a module is picked by the AI agent, this client-side implementation is downloaded from the module library to the user's device and executed together with other modules to form the app. The Zettel web front-end client, in a sense, serves as an OS for the modules through the client-side Module API.

> To do: add explanation of Module API.

On the other hand, any third-party integrations and data manipulations happen on a module's server side. Zettel provides the required access to those services via the server-side Module API.

## 3. Case studies

### 3.1. Upvote Module

The Upvote module allows users to upvote any card in a page. It adds an "upvote button" to each card and displays the number of upvotes, if there are any. This example shows how upvote data is stored in human language.

![](https://hackmd.io/_uploads/HkVDvXSDh.png)

The Upvote app is a page of cards, like every Zettel app. Each card is stored as a JSON object with different fields. Text is one of the public fields. (Note that the card content the user sees in the interface is a subset of this text field.) Before the module is enabled, the text field on each card looks like:

```jsonld
// Before enabling upvote module. Pseudo code.
// Card 1
Text: "1"
// Card 2
Text: "2"
// Card 3
Text: "3"
```

After the AI agent decides to enable the Upvote module given the prompt, an empty metadata field is appended to the private field in each card. The purpose of the metadata field is to make data faster to read at rendering time. The fields in metadata are pre-defined by module developers.

When a card gets an **upvote**, the button click event triggers two updates:

1. A natural language description of the upvote event is added to the card.
   The string `"Upvoted by user_name"` is appended to the text field, aided by AI (not the same AI agent that orchestrates modules). More details follow below.
2. The metadata is updated. The user who upvoted is appended to the `upvote_users_list` field.

```jsonld
// After enabling upvote module and user "hm" upvoting cards 1 and 3. Pseudo code.
// Card 1
Text: "1 Upvoted by hm"
Metadata: { upvote_users_list: ["hm"] }
// Card 2
Text: "2"
Metadata: {}
// Card 3
Text: "3 Upvoted by hm"
Metadata: { upvote_users_list: ["hm"] }
```

In the first update, AI is used to identify the upvote event and add a natural language description. Because natural language is the source of truth, the data can be read easily by any other module without worrying about data fields and formats. Below is an example of the training data that lets the AI add `Upvoted by user_name` for every upvote.

```jsx!
// Use few-shot learning to apply the new list of upvoting usernames to the card text. Pseudo code.
[Text]: "Our call is on Sun May 28 2023 from 10:30 to 11am, timezone is GMT+2."
[NewUpvoters]: Sally_J
[UpdatedText]: "Our call is on Sun May 28 2023 from 10:30 to 11am, timezone is GMT+2. Upvoted by Sally_J."

[Text]: "Here is the link Sally has uploaded:\nzettel.ooo/developer."
[NewUpvoters]: Olivia Ken
[UpdatedText]: "Here is the link Sally has uploaded:\nzettel.ooo/developer. Upvoted by Olivia and Ken."

[Text]: "The Catcher in the Rye. ahs502 upvoted this. 289340 upvoted this. Norman downvoted this. The book is on sale."
[NewUpvoters]:
[UpdatedText]: "The Catcher in the Rye. Norman downvoted this. The book is on sale."
```

Text is a public field in the card. It is used as the only source of truth. The metadata is private to each module in a card. In the following section, we show how another module can read information from the Upvote module.

### 3.2. Highlight Most Voted Module

Imagine we want to create a module that finds the most upvoted card and highlights it.
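As a rough sketch of the module's final step, the snippet below picks the cards to highlight once an upvote count is known for each card. It assumes the counts have already been extracted from the card text (in Zettel that extraction relies on an LLM and is not shown here). The function name `selectMostUpvoted` and the `{ text, upvotes }` shape are hypothetical and not part of the Module API.

```javascript
// Hypothetical helper: not part of the Zettel Module API.
// Input: cards whose upvote counts were already extracted from their text.
function selectMostUpvoted(cards) {
  // Highest upvote count on the page (0 when the page is empty).
  const max = cards.reduce((m, c) => Math.max(m, c.upvotes), 0);
  // Assumption: with no upvotes anywhere, nothing is highlighted.
  if (max === 0) return [];
  // Assumption: ties are all highlighted, so return every index at the maximum.
  return cards
    .map((c, i) => (c.upvotes === max ? i : -1))
    .filter((i) => i !== -1);
}

const page = [
  { text: '1 Upvoted by hm', upvotes: 1 },
  { text: '2', upvotes: 0 },
  { text: '3 Upvoted by hm', upvotes: 1 },
];
console.log(selectMostUpvoted(page)); // indices 0 and 2, i.e. cards 1 and 3
```

Highlighting every tied card and highlighting nothing when there are no upvotes are choices made for this sketch; a real module could just as well pick a single card.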
The "Highlight module" first accesses the text field of all cards on this page.

```jsonld!
// Card 1
Text: "1 Upvoted by hm"
// Card 2
Text: "2"
// Card 3
Text: "3 Upvoted by hm"
```

In order to read the upvote information from the text, this module trains another AI to extract the number of upvotes from each card. Below are a few examples of the training data.

```jsx!
// Use few-shot learning to extract the number of upvotes from the card text. Pseudo code.
[Text]: "Buy milk. The task is completed by Ken. Upvoted by Sally_J and Ken."
[Upvotes]: 2

[Text]: "Hello world."
[Upvotes]: 0

[Text]: "Shake Shack. 8520 Santa Monica Blvd, Santa Monica, CA. ahs502 upvoted this. 289340 upvoted this. Norman downvoted this."
[Upvotes]: 2
```

With the trained AI, the module is able to calculate which cards to highlight.

```jsonld!
// Card 1
[Text]: "1 Upvoted by hm"
[Upvotes]: 1
// Card 2
[Text]: "2"
[Upvotes]: 0
// Card 3
[Text]: "3 Upvoted by hm"
[Upvotes]: 1
```

In this example of the Upvote module communicating with the Highlight module, storing the upvotes in specific JSON fields seems more straightforward than using human language. However, we argue that using human language scales much better to accommodate more modules that need the upvote information in one way or another.

> Question here: what information is recorded in the text depends on how module developers train the AI. In the Upvote module, if the developer didn't train the AI to capture the upvote timestamp, that information is not stored. How would other developers building another module based on upvotes be able to guess what is captured?

### 3.3. File or database as input example

tba

## 4. Evaluation (wip)

The current approach resembles *free composition* in Jackson's book *The Essence of Software* (Ch. 6). Every module is a standalone mini-application. All modules work on the same level and completely independently.
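To make this independence concrete, here is a minimal sketch of two modules that never call each other and share only the public `text` of the cards. The function names, the `private` field layout, and the regular expression standing in for the LLM-based reading of card text are all assumptions for illustration; Zettel itself uses an LLM for both the writing and the reading step.

```javascript
// Illustrative only: names and the `private` layout are hypothetical, and a
// regular expression stands in for the LLM that Zettel uses on card text.

// Upvote module: writes a human-readable note into the public text and
// caches the upvoter in its own private metadata.
function upvote(card, userName) {
  if (!card.private.upvote) {
    card.private.upvote = { upvote_users_list: [] };
  }
  card.private.upvote.upvote_users_list.push(userName);
  // Simplification: one sentence is appended per upvote.
  card.text += ` Upvoted by ${userName}.`;
}

// Highlight module: knows nothing about the upvote module's metadata.
// It reads only the public text and counts the upvote notes it finds.
function countUpvotes(card) {
  const matches = card.text.match(/Upvoted by \w+/g);
  return matches ? matches.length : 0;
}

const page = [
  { text: '1', private: {} },
  { text: '2', private: {} },
];
upvote(page[0], 'hm');
console.log(page[0].text);           // "1 Upvoted by hm."
console.log(page.map(countUpvotes)); // counts per card: 1 and 0
```

Removing either function leaves the other working: the text stays readable without the upvote module, and the count is derived without any access to its metadata.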
**Pro:**

- Modules can easily communicate with each other without worrying about matching fields and formats.

**Con:**

- Since human language is the only medium for inter-module communication, the "data" and its structure must be explainable efficiently in human language. For example, graph databases are not easy to describe in human language. Sub-tables are also hard to explain.
- All modules work independently. They cannot depend on each other.
- AI is used to write and read the card in the human language described data format. The accuracy depends entirely on how well the module developer trains the AI model to parse the data.

## 5. Security (wip)

## 6. Discussion

- alternative data structure for inter-module communication

## 7. Conclusion

Acknowledgement
