Relatório de pesquisa sobre o Manus AI

# Relatório de pesquisa sobre o Manus AI Got it! I’ll conduct an in-depth research on the technical architecture of Manus AI, focusing on officially disclosed details while incorporating credible expert discussions. The report will include: - **Architecture**: How Manus AI is structured and designed - **LLMs Used**: Specific models like Claude Sonnet and any other integrated LLMs - **Multi-Agent Management**: How it coordinates AI agents - **Toolchain & Infrastructure**: Dependencies, integrations (e.g., E2B runtime), and technical tools - **Reproduction Possibility**: Steps and requirements to replicate the setup based on available details - **Sources & Verification**: Clearly cited official sources and community insights marked as unverified I’ll let you know once the research is ready! # Manus AI Technical Architecture – An In-Depth Overview **Manus AI** is a multi-agent autonomous AI system developed by Beijing Butterfly Effect Technology Ltd (also known as Monica). Unlike a standard chatbot, Manus is designed to **analyze, plan, and execute** complex tasks end-to-end with minimal human guidance. This report examines Manus AI’s technical architecture, the large language models (LLMs) it employs, how it manages multiple agents, its toolchain and infrastructure (e.g. use of E2B runtime), and what would be required to replicate such a system. We prioritize verified details from official sources and credible reports, while clearly noting any community or expert speculation as unverified. ## 1. Architecture: Multi-Agent System Design Manus AI’s core design is a **multi-agent architecture** orchestrating several specialized AI components to work together on tasks ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=,agents)) ([China’s Manus AI: A Game-Changer or Just Another Overhyped Agent?](https://www.aiwire.net/2025/03/11/chinas-manus-ai-a-game-changer-or-just-another-overhyped-agent/#:~:text=%E2%80%9CManus%20operates%20as%20a%20multi,%E2%80%9D)). Instead of a single monolithic model handling everything, Manus breaks tasks into sub-tasks handled by different **sub-agents**, each with specific roles. Key aspects of the architecture include: - **Executor Agent (User-Facing)**: The user interacts directly with an *executor* agent, which is the interface for all of Manus’s capabilities ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)). This agent receives the user’s request and coordinates with other agents to fulfill it. Notably, the executor agent is *isolated* from the internal details of other agents’ processes ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)). This means it doesn’t directly see the intermediate reasoning or data of the planner or knowledge agents, which helps maintain focus and control context size ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)). It only relays high-level commands and final responses. - **Planner Agent**: A planner agent is responsible for breaking down the user’s request into actionable steps or sub-tasks. According to the team, Manus will **autonomously analyze a goal and devise a plan** to achieve it, much like project management ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=,agents)). The planner likely decides which tools or other sub-agents are needed for each step. (The exact internal role names come from the Manus team’s descriptions; detailed implementation is not fully public.) - **Knowledge/Research Agent**: Manus can perform research and gather information autonomously ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=During%20its%20launch%2C%20the%20company,stocks%E2%80%94all%20with%20minimal%20human%20intervention)) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=The%20Manus%20website%20demonstrates%20how,itinerary%20creation%20to%20dashboard%20building)). A dedicated knowledge agent likely handles web searches, data gathering, and fact-checking. This agent uses external tools (like web search and browser control) to fetch information needed for the tasks. The executor agent doesn’t see the raw results directly; instead, relevant information is passed back in a controlled way ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)). This compartmentalization means if a user tries to “jailbreak” or trick Manus into revealing internal data, the executor agent simply has no access to the other agents’ context or knowledge to leak ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)). - **Other Specialized Agents**: Manus’s architecture hints at additional agents or modules for specific functions. For example, given its ability to write and execute code, there may be a *Coding/Tool* agent that handles running code or manipulating files, and perhaps a *Memory agent* for maintaining long-term context or results. In an official explanation, Manus’s chief scientist Yichao “Peak” Ji referenced a **“knowledge agent, planner agent, or other components”**, indicating a collection of such specialized modules ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)). Each operates semi-independently but in coordination under the Manus platform. This multi-agent setup runs **asynchronously in the cloud**, meaning once a task is assigned, the user can disconnect and Manus continues working on its own ([China’s Manus AI: A Game-Changer or Just Another Overhyped Agent?](https://www.aiwire.net/2025/03/11/chinas-manus-ai-a-game-changer-or-just-another-overhyped-agent/#:~:text=%E2%80%9CManus%20operates%20as%20a%20multi,%E2%80%9D)). Manus will notify the user when the task is complete. This design is similar to other agent-based systems (e.g. OpenAI’s “Operator” agent or Anthropic’s “Computer Use” mode), but Manus is notable for tightly integrating the agents into one product ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=an%20AI%20model%2C%20it%E2%80%99s%20an,accounts%20on%20the%20user%E2%80%99s%20behalf)) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Monica%27s%20co,Operator%20%20and%20%2017)). The team emphasizes that Manus is *more than a chatbot* – it is an autonomous workflow executor. In the founder’s words, *“While other AI stops at generating ideas, Manus delivers results.”* ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=,potentially%20a%20glimpse%20into%20AGI)). **Context Management**: A major architectural consideration is context length and relevance. By splitting tasks among agents, Manus reduces the burden on any single model’s context window. Ji confirmed that the **segmented agent design helps control context length** and avoids overloading one model with too much information ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)). Each agent works on its piece of the problem with relevant context, which is then summarized or passed along, rather than one huge prompt growing endlessly. This not only improves efficiency but also reduces hallucinations – since the user-facing agent only knows the high-level plan and outcome, it cannot inadvertently mix in low-level details or stray into unrelated tangents ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)). This deliberate separation of concerns is a form of *prompt and knowledge isolation* that contributes to Manus’s reliability (as confirmed by the Manus team’s Chief Scientist) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)). *(Sources: Manus co-founder Ji’s descriptions ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)); Maginative article summarizing Manus’s sub-agent approach ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=,agents)). These details are **verified via official statements** by the Manus team. Any specific naming of agents (executor, planner, etc.) is derived from Ji’s confirmed explanation and demo materials.)* ## 2. LLMs Used in Manus AI Manus AI does not rely on a proprietary foundation model; instead, it builds on existing state-of-the-art LLMs. **Anthropic’s Claude** and **Alibaba’s Qwen** form the backbone of Manus’s intelligence ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)): - **Anthropic Claude 3.5 “Sonnet”** – Manus’s primary reasoning engine is *Claude 3.5 Sonnet v1*, an advanced version of Anthropic’s Claude series ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). Claude “Sonnet” (a codename) is a large language model known for its lengthy context and strong reasoning abilities. Ji Yichao confirmed that Claude 3.5 Sonnet is currently the main model powering Manus’s understanding and decision-making ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). Notably, this model was introduced around mid-2024 (roughly 9 months old as of March 2025) and is analogous to Anthropic’s Claude Instant or latest iterations available to partners ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=According%20to%20X%20posts%20by,tuned%20versions%20of%20%2066)). The use of Claude gives Manus strong natural language processing and reasoning capabilities out of the box. The Manus team has also begun testing **Claude 3.7** (a newer Anthropich model) to potentially replace or augment Sonnet, with Ji reporting promising improvements in reasoning and performance ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). This suggests Manus will continuously integrate Anthropic’s latest models as they become available. - **Alibaba Qwen Models (Fine-Tuned)** – In addition to Claude, Manus integrates **Qwen (Tongyi Qianwen) models** developed by Alibaba ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). The Qwen series (which includes open-source models like Qwen-7B, Qwen-14B, etc.) is used by Manus in fine-tuned form, likely to handle specific tasks or to better serve Chinese-language and local requirements. Ji indicated that Manus uses “various fine-tuned Qwen models” alongside Claude ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). These could serve as supporting agents – for example, a Qwen model might be fine-tuned as a specialist in web search or coding. The exact sizes or roles of Qwen models were not publicly specified, but **official partnership announcements** suggest Qwen’s importance: on March 11, 2025, Butterfly Effect announced a strategic cooperation to use the Tongyi Qianwen series as the basis for Manus’s Chinese edition ([Manus 中文版](https://manus.monica.cn/news/manus-cn#:~:text=Manus%20%E5%B9%B3%E5%8F%B0%E5%9C%A8%E5%85%A8%E7%90%83%E8%8E%B7%E5%BE%97%E5%B9%BF%E6%B3%9B%E5%85%B3%E6%B3%A8%EF%BC%8C%E4%B8%BA%E6%BB%A1%E8%B6%B3%E4%B8%AD%E6%96%87%E7%94%A8%E6%88%B7%E9%9C%80%E6%B1%82%EF%BC%8C%E6%88%91%E4%BB%AC%E5%AE%A3%E5%B8%83%E4%B8%8E%E9%98%BF%E9%87%8C%E9%80%9A%E4%B9%89%E5%8D%83%E9%97%AE%E5%9B%A2%E9%98%9F%E6%AD%A3%E5%BC%8F%E8%BE%BE%E6%88%90%E6%88%98%E7%95%A5%E5%90%88%E4%BD%9C%E3%80%82)). In other words, Manus plans to **implement all its features on domestic (Chinese) models** via Qwen, ensuring it can run without reliance on foreign APIs for Chinese users ([Manus 中文版](https://manus.monica.cn/news/manus-cn#:~:text=Manus%20%E5%B9%B3%E5%8F%B0%E5%9C%A8%E5%85%A8%E7%90%83%E8%8E%B7%E5%BE%97%E5%B9%BF%E6%B3%9B%E5%85%B3%E6%B3%A8%EF%BC%8C%E4%B8%BA%E6%BB%A1%E8%B6%B3%E4%B8%AD%E6%96%87%E7%94%A8%E6%88%B7%E9%9C%80%E6%B1%82%EF%BC%8C%E6%88%91%E4%BB%AC%E5%AE%A3%E5%B8%83%E4%B8%8E%E9%98%BF%E9%87%8C%E9%80%9A%E4%B9%89%E5%8D%83%E9%97%AE%E5%9B%A2%E9%98%9F%E6%AD%A3%E5%BC%8F%E8%BE%BE%E6%88%90%E6%88%98%E7%95%A5%E5%90%88%E4%BD%9C%E3%80%82)) ([China's Manus AI partners with Alibaba's Qwen team in expansion bid | Reuters](https://www.reuters.com/technology/artificial-intelligence/chinas-manus-ai-announces-partnership-with-alibabas-qwen-team-2025-03-11/#:~:text=Manus%20AI%2C%20which%20has%20offices,users%20on%20X%20for%20free)). This indicates that currently Qwen models augment Claude, and in future could even fully localize Manus’s model stack for China. - **Other Models / Future Additions**: In the launch video, the founders mentioned Manus is powered by *“several distinct models”* ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). The confirmed ones are Claude and Qwen, but this phrasing leaves room for additional models, perhaps for vision or audio tasks in the future. So far there’s no evidence of a computer vision model or audio model integrated (Manus’s demos mostly involve text-based tasks and web interactions). However, the architecture could accommodate more models as needed. Manus’s team focuses on using the best available models rather than training their own from scratch – it’s essentially a **model-agnostic orchestrator**. As new powerful LLMs emerge, Manus can integrate them (much like upgrading from Claude 3.5 to 3.7). This approach is why some observers call Manus a *“wrapper”* around existing AI models ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=Some%20critics%20call%20Manus%20a,rely%20on%20a%20proprietary%20model)). The Manus team doesn’t deny this; instead, they highlight that their innovation is in *how* these models and tools are used together, not in inventing a new base model from nothing. It’s important to note that **Manus’s choice of models has been openly acknowledged by the team**, not just guessed by outsiders. Co-founder Ji posted on X (Twitter) confirming that Anthropic Claude and Alibaba’s Qwen are the models in use ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). This was further reported by outlets like VentureBeat and South China Morning Post ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=According%20to%20X%20posts%20by,tuned%20versions%20of%20%2066)). Therefore, the LLM usage info is **verified and official**. Community rumors that Manus was “just Claude under the hood” turned out partially true – it *is* Claude-powered, but also includes other models and significant bespoke engineering ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). *(Sources: Ji Yichao’s X posts via VentureBeat ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=According%20to%20X%20posts%20by,tuned%20versions%20of%20%2066)) and The Decoder ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)) confirming the use of Claude 3.5 (Sonnet) and fine-tuned Qwen models. Official partnership news from Manus and Alibaba confirms the Qwen integration for the Chinese version ([Manus 中文版](https://manus.monica.cn/news/manus-cn#:~:text=Manus%20%E5%B9%B3%E5%8F%B0%E5%9C%A8%E5%85%A8%E7%90%83%E8%8E%B7%E5%BE%97%E5%B9%BF%E6%B3%9B%E5%85%B3%E6%B3%A8%EF%BC%8C%E4%B8%BA%E6%BB%A1%E8%B6%B3%E4%B8%AD%E6%96%87%E7%94%A8%E6%88%B7%E9%9C%80%E6%B1%82%EF%BC%8C%E6%88%91%E4%BB%AC%E5%AE%A3%E5%B8%83%E4%B8%8E%E9%98%BF%E9%87%8C%E9%80%9A%E4%B9%89%E5%8D%83%E9%97%AE%E5%9B%A2%E9%98%9F%E6%AD%A3%E5%BC%8F%E8%BE%BE%E6%88%90%E6%88%98%E7%95%A5%E5%90%88%E4%BD%9C%E3%80%82)). These are **official disclosures**. Note: “Claude Sonnet” is an Anthropic model variant; the details about its version come from the Manus team’s statements.)* ## 3. Multi-Agent Management and Orchestration Managing multiple AI agents to work in concert is a complex challenge, and Manus AI addresses this with a carefully designed coordination logic. Here’s how Manus handles and coordinates its *swarm* of AI agents: - **Division of Labor**: Manus breaks a user’s request into parts and assigns each part to the appropriate agent. For example, if a user asks Manus to “Analyze Tesla’s stock performance and build an interactive dashboard,” the system might divide this into sub-tasks like: (a) gather historical stock data and news, (b) analyze data for trends or correlations, (c) generate a summary report, (d) create a visualization/dashboard and deploy it. Each of these could be handled by different agents or tool modules – one agent uses web search for data, another runs Python code to perform analysis, and another handles creating the dashboard. The **planner agent** is likely orchestrating this, deciding the sequence of actions and invoking the right tools or agents for each step ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=,agents)). This multi-step planning happens autonomously. Early testers observed Manus completing what would be *weeks of work in hours* by delegating tasks in parallel and sequence intelligently ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=Key%20Points%3A)) (this is anecdotal but aligns with the intended design). - **Agent Communication**: The sub-agents need to communicate results to each other through the Manus system. Instead of chatting with each other in natural language (which could blow up context sizes), Manus probably uses a centralized controller or shared memory for passing along outputs. For example, when the knowledge agent finds relevant info, it might pass a distilled summary or data payload back to the planner or executor agent. The *executor agent*, as the only one interfacing with the user, compiles the final outputs and ensures the task is complete before responding to the user ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)). Importantly, this communication is **managed by the system’s code** rather than the models themselves deciding everything – i.e., Manus has an “agent loop” program that routes outputs to inputs of others (according to community analysis of Manus’s leaked prompt/code structure ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)) – *unverified technical detail*, but later confirmed in principle by the Manus team). The Manus team confirmed that the coordination code (the “sandbox runtime”) is relatively lightweight, mainly just receiving commands from AI agents and executing them in sequence ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Manus%20chief%20researcher%20Yichao%20,commands%20from%20the%20AI%20agents)). This suggests a deterministic orchestration layer that keeps the agents working towards the goal. - **Asynchronous Execution**: Manus agents operate in the cloud asynchronously. This means once the plan is set in motion, agents don’t have to wait for user input or real-time supervision; they can perform long-running tasks, take pauses, and resume work as needed. The user can even close their browser or turn off their device – Manus will continue working on the task on its servers, and send a notification or message when the task is done ([China’s Manus AI: A Game-Changer or Just Another Overhyped Agent?](https://www.aiwire.net/2025/03/11/chinas-manus-ai-a-game-changer-or-just-another-overhyped-agent/#:~:text=%E2%80%9CManus%20operates%20as%20a%20multi,%E2%80%9D)). This is a critical part of multi-agent management: the system keeps track of task state and agent progress server-side. If an agent needs input from the user (for example, clarification or a choice), Manus can message the user (e.g., via the interface or email) and wait for a response, using a specialized tool for “asking user” if needed (Manus indeed has a `message_ask_user` tool in its arsenal – more on tools below). In essence, Manus acts like a project manager that never sleeps, coordinating its AI team in the background. - **Agent Coordination & Isolation**: One of Manus’s distinguishing features in agent management is how it balances coordination with **isolation**. The executor (user-facing) agent coordinates high-level, but **it is blind to the internal workings of the others** ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)). This means each agent works somewhat independently on its sub-task and does not expose all intermediate steps to the others, unless needed. Ji Yichao pointed out that this not only controls context length but also adds a security layer: even if the user tries to prompt the AI into revealing system prompts or hidden info (a common issue known as prompt leakage), the executor agent literally doesn’t have that info to reveal ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=According%20to%20Ji%2C%20this%20architecture,directly%20access%20other%20agents%27%20knowledge)). Each agent likely only shares final necessary outputs. For coordination, Manus relies on the orchestration code to feed the *right information* to the *right agent* at the *right time*. For example, the planner agent might get the user’s request and maybe a summary of context, but not the entire conversation history. The knowledge agent might get a specific query from the planner and return data. The executor gets only what it needs to present results. This modular approach is akin to microservices in software architecture – each component does one job and passes results through a controlled interface. - **Specialization and Fine-Tuning**: The multi-agent system also allows Manus to fine-tune or customize each agent’s model or prompt for its specific task. Manus’s team has fine-tuned the Qwen models to serve certain roles ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)). It’s possible that, for example, a Qwen-7B model fine-tuned on coding could serve as a “code generation agent”, while Claude handles broader reasoning. This specialization can lead to better performance than one generalist model doing everything. Ji hinted that a lot of Manus’s advantage comes from how they “**optimize context length and orchestrate planning**” across these agents ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=%2A%20Engineering%20effort%3A%20Co,context%20length%20and%20orchestrate%20planning)) – effectively an alignment/engineering solution rather than sheer model size. In a conversation with an industry expert, he suggested that *agentic capability is more an **alignment/training problem** (teaching models to break down tasks and collaborate) than a model capability problem* ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,building%20on%20existing%20foundation%20models)). In practice, this means the secret sauce is how the team designed the prompts, workflows, and interplay between agents. Overall, Manus’s multi-agent management is about creating an **“AI team”** where each member has a defined job and the system ensures they work together to finish complex tasks. The design is **confirmed by official demonstrations and statements** (e.g., Ji’s description of Manus as “multi-agent…several distinct models” working together ([China’s Manus AI: A Game-Changer or Just Another Overhyped Agent?](https://www.aiwire.net/2025/03/11/chinas-manus-ai-a-game-changer-or-just-another-overhyped-agent/#:~:text=%E2%80%9CManus%20operates%20as%20a%20multi,%E2%80%9D))). The granular details of the task scheduling and info passing are proprietary (the Manus team hasn’t open-sourced the code as of this writing), but early users who interacted with Manus and even peeked at its prompts have corroborated this description (community findings, later validated by the Manus team) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)). *(Sources: Manus co-founder Ji’s quote on multi-agent asynchronous operation ([China’s Manus AI: A Game-Changer or Just Another Overhyped Agent?](https://www.aiwire.net/2025/03/11/chinas-manus-ai-a-game-changer-or-just-another-overhyped-agent/#:~:text=%E2%80%9CManus%20operates%20as%20a%20multi,%E2%80%9D)) (verified official info). Maginative report on specialized sub-agents ([Manus, a New AI Agent From China is Going Viral—And Raising Big Questions](https://www.maginative.com/article/manus-a-new-ai-agent-from-china-is-going-viral-and-raising-big-questions/#:~:text=,agents)) (information consistent with official demo). The Decoder’s summary of Ji’s technical briefing confirms executor/knowledge/planner separation ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)). Ji’s remarks on alignment vs capability ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,building%20on%20existing%20foundation%20models)) underscore the management strategy. Community-extracted info about the agent loop (unverified leak, but confirmed as accurate by Ji) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)) provided additional insight into how tasks and tools are invoked.)* ## 4. Toolchain & Infrastructure (E2B Runtime, Integrations, Dependencies) One reason Manus AI is so powerful is its integration of a rich **toolchain and infrastructure** that extends what the LLMs can do. Manus doesn’t just rely on language output; it can take actions: browse websites, run code, manipulate files, etc. To enable this, Manus uses a number of external tools and frameworks (mostly open-source or from third-party startups). Key components of its toolchain include: - **Browser Automation – “Browser Use”**: Manus uses a tool/service called **Browser-Use** to interact with web pages ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)). Browser-Use (an AI startup project) provides an API and library that allow an AI agent to control a web browser – clicking links, filling forms, scrolling pages, and extracting content in a structured way ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)). By leveraging Browser-Use, Manus can perform web research much like a human using a browser, but automated. For example, if Manus needs to gather information, it can use a **“search” tool** to query a search engine (Manus has an `info_search_web` function for web searches) and then open relevant pages via a **`browser_navigate`** tool, reading the content. It can click buttons or log in to websites if needed via `browser_click` and fill forms via `browser_input`. All these capabilities are provided by the Browser-Use framework, which translates high-level commands (like “click the third link”) into actual browser actions in a headless Chrome/Firefox environment ([Browser Use - Enable AI to control your browser](https://browser-use.com/#:~:text=Powerful%20Browser%20Automation)) ([Browser Use - Enable AI to control your browser](https://browser-use.com/#:~:text=Element%20Tracking)). This integration is crucial for tasks like finding real estate listings, collecting data from various sites, or automating web workflows. *Source:* (Browser-Use official docs indicate it’s designed to “make websites accessible for AI agents” ([Browser Use - Enable AI to control your browser](https://browser-use.com/#:~:text=We%20make%20websites%20accessible%20for,makes%20their%20beer%20taste%20better)). Manus’s use of it was **confirmed by the Manus team** after users noticed references to it in Manus’s code ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)). So this is a verified component of Manus’s stack.) - **Secure Code Execution Sandbox – E2B**: Manus can write and execute code as part of its task completion (for data analysis, creating apps, etc.). To do this safely at scale, Manus runs code in an isolated cloud sandbox provided by **E2B** ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)). **E2B (Engineers-to-Bots)** is an open-source runtime for executing AI-generated code in secure sandboxes ([Open-source Code Interpreting for AI Apps — E2B](https://e2b.dev/#:~:text=E2B%20is%20an%20open,agentic%20%26%20AI%20use%20cases)). Manus uses E2B to spin up a temporary environment (such as a Docker container or VM) where the AI’s code can run without harming the host or leaking sensitive info ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)). Through E2B, Manus’s agents get access to a shell and file system where they can compile programs, run Python scripts, install packages, etc., all under controlled conditions. Manus exposes this to the AI via tools like **`shell_exec`** (to run a shell command), `shell_view` (to see output), `file_read`/`file_write` (to read or write files in the sandbox), etc. – in total nearly 30 such functions are available ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,processes%20and%20manipulate%20files%20effectively)). The sandbox persists through a session so the AI can, for example, write a code file, run it, then read the results back. This setup means Manus can do things like: analyze a dataset with Python pandas, generate a plot, save it, maybe even host a small web server for a dashboard. The heavy lifting is done by E2B behind the scenes, ensuring security and scalability. (Manus’s use of E2B was reported in an analysis and aligns with evidence from the runtime code; it’s **highly likely and implicitly confirmed** by the Manus team’s acknowledgment of using open-source runtimes ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Sonnet%203,says%20shows%20promise)).) - **Integrated Tool Functions**: Manus comes with a suite of built-in “tools” (APIs the AI can call). According to a community leak (from a user who obtained Manus’s prompt and tool list, later verified by the team), Manus has **29 integrated tools** at its disposal ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,processes%20and%20manipulate%20files%20effectively)). These include: - *Web tools*: `info_search_web` (to perform web searches), `browser_navigate` (open URL), `browser_view` (fetch page content), `browser_click`, `browser_input`, etc., to fully interact with web pages. - *File system tools*: `file_read`, `file_write`, `file_find` (search for files), `file_str_replace` (edit files), etc., to manage files in the sandbox. - *Shell/Execution tools*: `shell_exec` (run a command or script), `shell_wait`, `shell_kill` (manage long-running processes), `shell_write_to_process` (provide input to an interactive process), and `shell_view` (get console output) ([Manus tools and prompts · GitHub](https://gist.github.com/jlia0/db0a9695b3ca7609c9b1a08dcbf872c9#:~:text=match%20at%20L926%20,)) ([Manus tools and prompts · GitHub](https://gist.github.com/jlia0/db0a9695b3ca7609c9b1a08dcbf872c9#:~:text=match%20at%20L971%20,)). - *User interaction tools*: `message_notify_user` (send a note to the user, e.g., “Task 50% complete...”), `message_ask_user` (ask the user a question and pause for input) ([Manus tools and prompts · GitHub](https://gist.github.com/jlia0/db0a9695b3ca7609c9b1a08dcbf872c9#:~:text=,or%20explaining%20changes%20in%20approach)). - Possibly others for specific actions (e.g., zipping files, or using APIs). These functions serve as the “hands and eyes” of the AI agents – extending beyond text generation to real actions. The concept is similar to how LangChain or AutoGPT defines tools that an agent can use. Manus’s toolset was largely **built on open-source** packages or services (Browser-Use and E2B being prime examples), and the team has credited the open-source community for enabling this ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Sonnet%203,says%20shows%20promise)). *Verification:* The existence and descriptions of these tools come from a **leaked snippet of Manus’s system prompt and function list** (community-sourced, thus *unverified initially*), but Yichao Ji subsequently **confirmed the accuracy of the architecture and tools** (stating that yes, they use those tools and the sandbox code is genuine) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)). Thus we treat the tool list as reliable information. Additionally, an official Manus demo showed it creating a “public URL” with a dashboard as output ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=The%20Manus%20website%20demonstrates%20how,itinerary%20creation%20to%20dashboard%20building)), implying it has tooling for deploying web content (likely an internal tool to publish files to a hosted page). This aligns with the idea of many integrated capabilities. - **Infrastructure and Scaling**: Manus is currently in a closed beta, and the team admitted the **infrastructure is in an early stage** – essentially a demo architecture that wasn’t fully ready for mass usage ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=The%20South%20China%20Morning%20Post,deliver%20in%20our%20final%20product)). Due to a surge of interest, their servers struggled and they had to restrict access. The product partner Zhang Tao noted that Manus’s backend capacity was *“designed only for demonstrations”* and is **“still in its infancy”**, indicating they plan to scale it up significantly for production use ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=The%20South%20China%20Morning%20Post,deliver%20in%20our%20final%20product)). This suggests the current infrastructure, while functional, may not be highly distributed or redundant yet. It runs on cloud servers (likely in China for the Chinese version, possibly global cloud for others) – exact providers aren’t specified, but given Tencent is an investor ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Before%20developing%20Manus%2C%20founder%20Xiao,developing%20the%20Magi%20search%20engine)), it might be running on Tencent Cloud or similar. As they integrate Qwen for Chinese users, they might also use Alibaba Cloud for domestic deployment (as part of the partnership). The **Manus platform** (the orchestrator that ties agents, models, and tools together) is proprietary. Initially, parts of it were accessible (unintentionally) – for example, the “sandbox runtime” code could be queried by users in early beta ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)). This code was later **obfuscated/encrypted (using PyArmor)** by March 7, 2025 to prevent further leakage ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=Open,opportunity%20for%20this%20to%20happen)). This indicates that Manus runs a Python-based backend (since PyArmor is for Python code protection) and that the team moved quickly to secure their implementation once it started gaining attention. - **Open-Source Contributions**: Manus’s creators have openly stated that **“Manus wouldn’t exist without open source”** ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Sonnet%203,says%20shows%20promise)). They have leveraged many open libraries (the exact ones likely include: Browser-Use, E2B, possibly LangChain or similar agent frameworks, and model weights like Qwen which are open-source). In return, the team plans to release **“quite a few good things”** as open source ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Sonnet%203,says%20shows%20promise)). This might include certain tools or components they built. There is also mention that Manus’s team drew inspiration from research projects like **CodeAct** ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=%2A%20Engineering%20effort%3A%20Co,context%20length%20and%20orchestrate%20planning)), which is a system for enabling AI to solve problems by generating code. All these points illustrate that Manus’s tech stack is a **fusion of existing AI services and custom glue** – they did not reinvent the wheel for browsers or sandboxes, but smartly integrated available solutions and added their own optimizations on top. In summary, Manus’s toolchain and infrastructure allow it to interact with the world (web and OS) almost like a human assistant would, but automated. It uses **Browser-Use for web** browsing and **E2B for executing code** ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)), combined with a large set of tool APIs accessible to its LLMs. The heavy use of open-source components is officially confirmed ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Sonnet%203,says%20shows%20promise)), and it underscores how Manus achieved so much so quickly – by standing on the shoulders of existing tech. The architecture is cloud-based and currently being scaled up to handle more users and use cases. *(Sources: Exponential View analysis confirming Browser-Use and E2B integrations ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)) (*expert analysis, generally reliable but not an official Manus statement*). The Decoder’s report of Ji’s confirmation that Manus relies on open-source tech and essentially confirmed the presence of those tools ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Sonnet%203,says%20shows%20promise)) (officially verified). Zhang Tao’s comment on infrastructure limits from SCMP via The Decoder ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=The%20South%20China%20Morning%20Post,deliver%20in%20our%20final%20product)) (official). E2B’s own description of its service ([Open-source Code Interpreting for AI Apps — E2B](https://e2b.dev/#:~:text=E2B%20is%20an%20open,agentic%20%26%20AI%20use%20cases)). Community-leaked tool list (unverified primary source, but corroborated by official statements) ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,processes%20and%20manipulate%20files%20effectively)). All technical integrations listed align with statements from Manus team and are thus considered **verified or highly credible**.)* ## 5. Reproduction Possibility – How to Replicate Manus AI Given what is known about Manus AI’s design, one can consider how to **reproduce a similar multi-agent AI system**. While Manus’s exact codebase is proprietary (and now secured), the company has revealed enough about their approach – and used enough open tools – that an experienced AI developer could attempt a partial replication. Below are the general steps and requirements to recreate the essence of Manus AI: 1. **Obtain or Substitute the Core LLMs**: The first requirement is access to powerful language models comparable to Claude and Qwen. Anthropic’s Claude 3.5 (or 100k-token Claude 2, if Sonnet is not available publicly) would be ideal for the reasoning agent. If Claude access is an issue, GPT-4 or another top-tier model could be substituted for the main reasoning engine. For supporting agents, one could use open-source models; since Qwen is open-source from Alibaba, one could obtain the Qwen-14B or 7B model weights and fine-tune them for specific tasks. Fine-tuning would require domain-specific data (for example, fine-tune one Qwen on code execution/tasks, another on web interaction/text summarization). Manus’s team likely invested in fine-tuning Qwen models ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=%2A%20Engineering%20effort%3A%20Co,context%20length%20and%20orchestrate%20planning)), so replicating their performance might require a similar effort. In summary, a reproduction needs **multiple LLMs**: at least one large generalist and possibly one or two smaller specialist models. 2. **Set Up a Secure Execution Environment (Sandbox)**: As learned from Manus, having a sandboxed environment for code and other actions is crucial. An open-source solution like **E2B** can be used to recreate this. One would deploy E2B’s runtime to allow dynamic creation of sandboxes (or use Docker manually). This gives the system a safe place to execute user-provided or AI-written code, access a filesystem, and carry out long-running computations. Alternatively, one could use tools like OpenAI’s Code Interpreter backend or a custom-managed Kubernetes setup to run code, but E2B provides a ready-made solution. So, the requirement here is to **install and integrate a sandbox service**, and expose it via an API that the AI agents can call (for example, endpoints for running a command, reading a file, etc.). 3. **Integrate a Browser Automation Tool**: To give the AI the ability to read and interact with web pages, incorporate a browser control library. The same **Browser-Use** project Manus uses is available (with an open-source version) ([Browser Use - Enable AI to control your browser](https://browser-use.com/#:~:text=From%20open%20source%20to%20enterprise%2C,plan%20that%20fits%20your%20needs)). One can integrate Browser-Use or alternatives like Selenium-based controllers or Playwright with an AI-friendly interface. The goal is to allow the AI to say “open this URL and click X” and have that executed. Setting up Browser-Use would involve running a headless browser and using their API to get structured page data. Another approach is using LangChain’s browser tools or building a custom browser agent. But since Browser-Use is confirmed to work with Claude and others ([Browser Use - Enable AI to control your browser](https://browser-use.com/#:~:text=Any%20LLM%20Support)), using it would replicate Manus’s capability closely. This step requires some engineering to maintain browser state and feed relevant page content back into the LLM’s context (likely only excerpts, to keep context length manageable). 4. **Implement the Multi-Agent Orchestrator**: The heart of Manus is the orchestration logic that ties everything together. To replicate it, one needs to implement an **agent manager** that can: receive a user task, spawn or prompt a planner agent to break down the task, and then sequentially (or in parallel) invoke the other agents/tools as needed. There are a few ways to do this: - Coding from scratch: Write a loop that maintains a task list and uses if/then logic or policy to decide the next step (this is easier if you hardcode some strategy, but Manus likely uses the AI itself to decide next steps). - Using an existing framework: Projects like **LangChain** or **Haystack** provide infrastructure for multi-step chains and could be adapted. For example, LangChain has the concept of Agents with Tools – one could configure a LangChain agent with the 29 tools similar to Manus’s. The “planner” could be implemented via an LLM chain that takes the user request and outputs a plan (list of actions). - Another inspiration is the **Auto-GPT** style: have the AI recursively call itself with a plan. Manus’s approach seems more structured (distinct roles rather than one agent loop), which likely avoids some pitfalls of fully self-directed loops. Concretely, one would create something like: - A system prompt for the *Planner* model: instructing it how to break tasks and which tools or sub-agents to use. - A system prompt for the *Executor* model: instructing it how to interact with user and coordinate results (ensuring it knows it can ask user for clarification if needed and finalizes the answer). - Optionally, prompts for specialized agents like a *Research agent* (which might just be the same model as executor but with a prompt focusing on finding info). The orchestrator code would manage the messages passed between these. For example, user prompt -> Planner -> plan -> for each step: call appropriate tool or ask appropriate agent -> gather results -> feed back to Planner or directly to Executor -> produce final answer. This is a complex part, but not impossible – essentially one is writing a simplified version of what Manus’s closed-source “agent loop” does. 5. **Integrate the Tools with the Agents**: With the sandbox (E2B) and browser (Browser-Use) in place, ensure your agents can call them. This usually means implementing **function calling** or an API for the LLM. Modern LLM APIs (like OpenAI’s functions or Anthropic’s tool use interface) allow you to define functions that the model can invoke during its response. Manus likely uses such a mechanism to let the model call `browser_navigate` or `shell_exec` when needed (the leaked prompt suggests it was using a function calling JSON format for tools). For a reproduction, you’d register all the tool functions (the 29 tools or whichever you implement) with the LLM’s interface. Then, as the AI generates steps, the orchestrator catches the function call and executes it (e.g., if AI says `{"action": "browser_navigate", "parameters": {"url": "http://..."} }`, the orchestrator uses Browser-Use to actually navigate and then returns the page content to the AI). This requires programming glue code in Python (or another language) to connect AI outputs to tool calls and tool results back into AI inputs. Essentially, this is implementing a **REPL loop** for the agent: read AI command, execute, get result, feed to AI, and so on until the task is done. 6. **Testing and Iteration (Alignment)**: Once the system is assembled, it will need extensive testing on example tasks to fine-tune its behavior. Manus’s team likely went through iterative prompt engineering and maybe reinforcement learning from human feedback to get the agents to collaborate smoothly. As Ji noted, the key challenge is *getting the AI to break tasks down properly and align its actions with the goal*, more so than raw model capability ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,building%20on%20existing%20foundation%20models)). To replicate Manus’s level of performance, one might need to similarly refine the prompts and possibly train the models on agent-specific data. For instance, you might simulate a bunch of tasks and have the planner model fine-tuned on how to create good plans, or have the executor fine-tuned on producing well-structured final reports from multi-step workflows. While a basic clone can be made with prompting alone, achieving **robust autonomy** (where the agent rarely stalls or goes off-track) is an *engineering and alignment problem*. This is perhaps the hardest part to replicate without the data Manus has gathered. However, community efforts (unverified) have already started extracting Manus’s prompts and trying them with open models ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=Open,opportunity%20for%20this%20to%20happen)), which could provide a starting point. 7. **Infrastructure and Deployment**: Finally, to make the system usable akin to Manus (which is a web app), one would deploy the orchestrator and models on a server or cloud. You’d need enough compute for the LLMs (GPUs if using large models locally, or rely on API calls to services like Anthropic for Claude or use OpenAI’s API). The sandbox and browser also need to run, possibly each user session might use one sandbox instance and one headless browser instance. This has cost and complexity implications. Manus likely orchestrates all this in the cloud with proper scheduling (they had limited slots due to server capacity ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=The%20South%20China%20Morning%20Post,deliver%20in%20our%20final%20product))). A replicator would need to consider scalability if many concurrent tasks are run. But for a small-scale reproduction, one machine could run a couple of docker containers and a headless browser alongside the LLM API calls. In summary, **replicating Manus AI is feasible in principle** since the components it uses are available (or have alternatives), but it requires significant effort in integration and tuning. The major requirements are: access to strong LLMs, setting up a tool-using framework (with web browsing and code execution), and implementing a multi-agent orchestration with careful prompt design for alignment. None of these is a single download-and-run solution; it’s an advanced project combining AI engineering and software engineering. Manus’s own developers spent months optimizing these interactions (and fine-tuning models), which is why Manus currently has an edge. However, the Manus team themselves acknowledge that their approach doesn’t have an insurmountable moat – others could build similar agents by combining existing models with good engineering ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=6)). As evidence, not long after Manus’s launch, open-source enthusiasts attempted their own versions, even managing to get Manus’s initial code before it was locked down (though using that code directly would be unethical/copyrighted, it demonstrated that Manus wasn’t magic, but clever assembly) ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=Open,opportunity%20for%20this%20to%20happen)). *(Sources: Exponential View notes on replication attempts (community efforts) ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=Open,opportunity%20for%20this%20to%20happen)) – **unverified** but informative. Ji’s comment that alignment is key ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,building%20on%20existing%20foundation%20models)) (official insight) suggests focusing on training models to plan. Manus team’s stance that their advantages, while non-trivial, are replicable ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=6)). The steps above are derived from the confirmed architecture and known toolchain: Browser-Use (available open source), E2B (open source), Qwen (open model), etc., combined with general agent development practices. These should be regarded as a reasoned reconstruction (expert analysis) of what reproducing Manus entails, not an official guide from Manus.)* ## 6. Sources & Verification of Information Given the hype around Manus AI, it’s important to distinguish **official, verified information** from speculation or community chatter. This report has prioritized data from the Manus team and credible publications: - **Official Sources & Disclosures**: The Manus team has shared technical details via **social media posts and press articles**. Co-founder and Chief Scientist **Yichao “Peak” Ji** has been particularly forthcoming on X (Twitter), confirming the models used and explaining the architecture ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)). His statements (e.g. on March 5 and in replies to users) are treated as authoritative. We also cite the **South China Morning Post and Reuters**, which interviewed the team or reported official announcements (e.g. the SCMP noted the use of Claude and Qwen in Manus ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=According%20to%20X%20posts%20by,tuned%20versions%20of%20%2066)), and Reuters confirmed the Alibaba partnership and how Manus positions itself ([China's Manus AI partners with Alibaba's Qwen team in expansion bid | Reuters](https://www.reuters.com/technology/artificial-intelligence/chinas-manus-ai-announces-partnership-with-alibabas-qwen-team-2025-03-11/#:~:text=Manus%20AI%2C%20which%20has%20offices,users%20on%20X%20for%20free))). The **Manus official website and demo video** provided high-level claims (like being the “first general AI agent” and examples of tasks it can do ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=Manus%20AI%20was%20officially%20announced,rather%20than%20just%20generating%20ideas))). Whenever this report states something as a fact about Manus’s design, we have backed it with such sources – for example, Ji’s quote “Manus operates as a multi-agent system…works asynchronously in the cloud” is directly from an official introduction ([China’s Manus AI: A Game-Changer or Just Another Overhyped Agent?](https://www.aiwire.net/2025/03/11/chinas-manus-ai-a-game-changer-or-just-another-overhyped-agent/#:~:text=%E2%80%9CManus%20operates%20as%20a%20multi,%E2%80%9D)). These **verified details** form the backbone of sections 1–4 above. - **Community Discoveries & Expert Analysis (Unverified)**: Manus’s closed beta nature meant that some technical insights came from savvy users who probed the system. Notably, an X user known as *“Jian”* managed to prompt Manus into revealing its own sandbox runtime code, inadvertently exposing the tool list and model usage ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)). This *community-sourced information* was initially unverified, but soon after, Ji **confirmed its accuracy** in principle ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=Manus%20chief%20researcher%20Yichao%20,commands%20from%20the%20AI%20agents)). We have included the information about the 29 tools and the executor/knowledge agent separation, marking it as confirmed by the team (once verified). Other community discussions, such as Reddit threads and Medium articles, have speculated about Manus (calling it a “wrapper over Claude” etc.). We’ve avoided pure speculation and only included community or expert analysis if it had some confirmation or logical backing. For instance, the analysis from *Exponential View* (a reputable tech newsletter) provided a breakdown of Manus’s tool usage and even cost estimates ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)) ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=1)). While not an official source, it references real data (like the Browser-Use and E2B integrations) that are consistent with the leaked code and thus likely accurate. We labeled such third-party analysis as **unverified** when introducing it, to clarify that the Manus team itself didn’t publish those specifics. Additionally, any forward-looking statements (e.g., how easy it is to replicate Manus, or whether Manus has a competitive moat ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=6))) are the opinions of analysts, not the Manus team – we cited them with attribution and treated them as informed commentary, not fact. - **Citations and Credibility**: Throughout this document, we use the **【source†line】** citation format to reference the exact lines from source materials. Key official sources include VentureBeat ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=According%20to%20X%20posts%20by,tuned%20versions%20of%20%2066)), The Decoder ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)), SCMP (via Yahoo News) ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=According%20to%20South%20China%20Morning,attention%20in%20China%E2%80%99s%20AI%20landscape)), Reuters ([China's Manus AI partners with Alibaba's Qwen team in expansion bid | Reuters](https://www.reuters.com/technology/artificial-intelligence/chinas-manus-ai-announces-partnership-with-alibabas-qwen-team-2025-03-11/#:~:text=Manus%20AI%2C%20which%20has%20offices,users%20on%20X%20for%20free)), and the Manus Chinese site ([Manus 中文版](https://manus.monica.cn/news/manus-cn#:~:text=Manus%20%E5%B9%B3%E5%8F%B0%E5%9C%A8%E5%85%A8%E7%90%83%E8%8E%B7%E5%BE%97%E5%B9%BF%E6%B3%9B%E5%85%B3%E6%B3%A8%EF%BC%8C%E4%B8%BA%E6%BB%A1%E8%B6%B3%E4%B8%AD%E6%96%87%E7%94%A8%E6%88%B7%E9%9C%80%E6%B1%82%EF%BC%8C%E6%88%91%E4%BB%AC%E5%AE%A3%E5%B8%83%E4%B8%8E%E9%98%BF%E9%87%8C%E9%80%9A%E4%B9%89%E5%8D%83%E9%97%AE%E5%9B%A2%E9%98%9F%E6%AD%A3%E5%BC%8F%E8%BE%BE%E6%88%90%E6%88%98%E7%95%A5%E5%90%88%E4%BD%9C%E3%80%82)). These confirm things like model names, team statements, and partnership news. We also cite Ji’s own words (through X/Twitter references as collated by those articles). Community or analyst sources like the Exponential View article ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)) and The Decoder’s deeper technical analysis (which combined official info with user findings) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)) are clearly indicated in context. In conclusion, **verified information** about Manus’s architecture and models comes directly from the Manus team’s communications and demonstrations, and we have relied on those for the core of this report. **Unverified details** (such as the exact list of tools or the internal code structure) were included only when corroborated by multiple sources or confirmed indirectly by the team – and we explicitly marked them as such. By combining official disclosures with cautious inclusion of credible analyses, we ensure a factual yet comprehensive view of Manus AI’s technical architecture. The reader can trust that claims tied to citations are backed by the content at those sources. We advise caution with any dramatic claims about Manus that aren’t supported by citations, as the AI community’s excitement can sometimes blur the line between reality and hype. This report has endeavored to stay on the verifiable side of that line. *(Sources: Official: Ji Yichao on X ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=While%20the%20announcement%20video%20,which%20Ji%20says%20shows%20promise)) ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=One%20of%20Manus%20AI%27s%20distinguishing,planner%20agent%2C%20or%20other%20components)), Reuters ([China's Manus AI partners with Alibaba's Qwen team in expansion bid | Reuters](https://www.reuters.com/technology/artificial-intelligence/chinas-manus-ai-announces-partnership-with-alibabas-qwen-team-2025-03-11/#:~:text=Manus%20AI%2C%20which%20has%20offices,users%20on%20X%20for%20free)), VentureBeat ([What you need to know about Manus, the new AI agentic system from China hailed as a second 'DeepSeek moment' | VentureBeat](https://venturebeat.com/ai/what-you-need-to-know-about-manus-the-new-ai-agentic-system-from-china-hailed-as-a-second-deepseek-moment/#:~:text=According%20to%20X%20posts%20by,tuned%20versions%20of%20%2066)). Community/analysis: The Decoder technical article ([Chinese AI agent Manus uses Claude Sonnet and open-source technology](https://the-decoder.com/chinese-ai-agent-manus-uses-claude-sonnet-and-open-source-technology/#:~:text=X%20user%20,and%20prompts%20is%20available%20here)) (mixed but mostly verified info), Exponential View newsletter ([ What's the deal with Manus?](https://www.exponentialview.co/p/whats-the-deal-with-manus#:~:text=,generated%20code)) (expert analysis, marked as such). Each source is cited in-line above where its information is used. This section itself draws on those citations to distinguish their nature.)*