# Services Part 2: AWS Generative AI Services and Architecture
[Services Part 1: AWS Generative AI Services and Architecture](https://hackmd.io/@yuhsintsao/S17EEnetC)
# Quarterly Generative AI Service Updates
:::info
[Announcing the AWS Billing and Cost Management MCP server](https://aws.amazon.com/about-aws/whats-new/2025/08/aws-billing-cost-management-mcp-server/) | [GitHub](https://github.com/awslabs/mcp/tree/main/src/billing-cost-management-mcp-server)
Updated: 2025-08-22
[Amazon Neptune now integrates with Cognee for graph-native memory in GenAI Applications](https://github.com/topoteretes/cognee/blob/main/notebooks/neptune-analytics-example.ipynb)
Updated: 2025-08-15
[Amazon Lex improves conversational accuracy with LLM-Assisted NLU](https://aws.amazon.com/about-aws/whats-new/2025/06/amazon-lex-conversational-accuracy-llm-assisted-nlu/)
Updated: 2025-06-12
[AI search flow builder is now available on the Amazon OpenSearch Service](https://aws.amazon.com/about-aws/whats-new/2025/05/ai-search-flow-builder-amazon-opensearch-service/) | [Tutorial](https://github.com/opensearch-project/dashboards-flow-framework/blob/main/documentation/tutorial.md)
Updated: 2025-05-01
[Anonymous user access for Q Business](https://aws.amazon.com/about-aws/whats-new/2025/04/user-access-q-business/)
Updated: 2025-04-30
[AWS announces upgrades to Amazon Q Business integrations for M365 Word and Outlook](https://aws.amazon.com/about-aws/whats-new/2025/04/upgrades-amazon-q-business-m365-word-outlook/) | [Blog](https://aws.amazon.com/blogs/aws/writer-palmyra-x5-and-x4-foundation-models-are-now-available-in-amazon-bedrock/)
Updated: 2025-04-23
[Amazon Q Business launches support for hallucination mitigation in chat responses](https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-q-business-hallucination-mitigation-chat-responses/)
Updated: 2025-04-14
[Amazon Lex adds ability to control intent switching during conversations](https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-lex-control-intent-switching-during-conversations/)
Updated: 2025-04-10
[Amazon Q Business Browser Extension now available to all subscribers](https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-q-business-browser-extension-subscribers/)
Updated: 2025-04-08
[AWS announces new upgrades to the Amazon Q Business browser extension](https://aws.amazon.com/about-aws/whats-new/2025/03/upgrades-amazon-q-business-browser-extension/)
Updated: 2025-03-17
[Amazon Q Business now supports insights from audio and video data](https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-q-business-insights-audio-video-data/)
Updated: 2025-03-04
[AWS announces Microsoft 365 for Word and Outlook integrations for Amazon Q Business](https://aws.amazon.com/about-aws/whats-new/2025/02/microsoft-365-word-outlook-amazon-q-business/)
Updated: 2025-02-28
[Amazon Q Developer now troubleshoots AWS Console errors in all AWS Commercial regions](https://aws.amazon.com/about-aws/whats-new/2025/02/amazon-q-developer-console-errors-aws-commercial-regions/)
Supported services and console areas:
- Amazon Elastic Compute Cloud (Amazon EC2)
- Amazon Elastic Container Service (Amazon ECS)
- Amazon Simple Storage Service (Amazon S3)
- AWS Lambda
- AWS CloudFormation
- IAM permissions (across all AWS Console pages)
- Athena console errors (across all AWS Console pages)

Updated: 2025-02-04
[Amazon Lex expands Assisted Slot Resolution regions and model access](https://aws.amazon.com/new/?whats-new-content-all.sort-by=item.additionalFields.postDateTime&whats-new-content-all.sort-order=desc&awsf.whats-new-categories=marketing-marchitecture%23artificial-intelligence&awsm.page-whats-new-content-all=4)
Updated: 2025-01-30
[Amazon Q Business now supports insights from images uploaded in chat](https://aws.amazon.com/about-aws/whats-new/2025/01/amazon-q-business-insights-images-uploaded-chat/)
Updated: 2025-01-24
[Amazon Q Developer is now available in Amazon SageMaker Code Editor IDE](https://aws.amazon.com/about-aws/whats-new/2025/01/amazon-q-developer-sagemaker-code-editor-ide/)
Updated: 2025-01-08
[Amazon Q Business analytics dashboard enhanced with conversation insights](https://aws.amazon.com/about-aws/whats-new/2024/12/amazon-q-business-analytics-dashboard-conversation-insights/)
Updated: 2024-12-18
[Amazon Lex launches new multilingual speech recognition models](https://aws.amazon.com/about-aws/whats-new/2024/12/amazon-bedrock-guardrails-reduces-pricing-85-percent/)
Updated: 2024-12-12
[re:Invent 2024 Announcements](https://hackmd.io/@yuhsintsao/HJ1RRzpQkg)
Last updated: 2024-12-04
[AWS App Studio is now generally available](https://aws.amazon.com/about-aws/whats-new/2024/11/aws-app-studio-generally-available/)
Updated: 2024-11-18
:::
# Amazon Kendra GenAI Index Deep Dive
:::info
[GenAI Index in Amazon Kendra](https://aws.amazon.com/about-aws/whats-new/2024/12/genai-index-amazon-kendra/) | [Blog](https://aws.amazon.com/blogs/machine-learning/introducing-amazon-kendra-genai-index-enhanced-semantic-search-and-retrieval-capabilities/)
Updated: 2024-12-06
:::
:::spoiler About the GenAI Index
By OpenAI Deep Research
**Significance of the Amazon Kendra GenAI Index in the GenAI Ecosystem**

Amazon Kendra GenAI Index is a specialized index edition of Amazon Kendra designed to support generative AI applications. It serves as a high-accuracy retrieval backbone for AI systems, making it easier to build Retrieval-Augmented Generation (RAG) pipelines, enhance semantic search capabilities, and power intelligent digital assistants. By addressing the challenges of data ingestion, search relevance, and integration with large language models (LLMs), the Kendra GenAI Index plays a pivotal role in improving the quality and context relevance of generative AI responses.
**Enhancing Generative AI with Retrieval-Augmented Generation (RAG)**

RAG is an approach that combines an LLM with a retriever to ground the model’s output in relevant data. The Amazon Kendra GenAI Index is purpose-built as the retriever component in RAG architectures:

- High-Accuracy Context Retrieval: Kendra GenAI Index uses advanced semantic models to retrieve highly relevant passages in response to a user query. This ensures the LLM has the right context to generate accurate answers, making the system especially valuable for RAG applications that demand factual correctness. By supplying authoritative information from enterprise data, it helps reduce AI hallucinations and increase the reliability of responses.
- Addressing RAG Pipeline Challenges: Building a RAG pipeline from scratch can be complex – from handling diverse data sources and formats to tuning embedding models and search algorithms. Kendra GenAI Index streamlines this by providing a managed retriever with pre-optimized settings, so developers don’t need to worry about low-level details like vector dimensionality or nearest-neighbor tuning. This significantly lowers the barrier to implementing RAG solutions in the enterprise.
- Seamless Integration with LLM Workflows: The GenAI Index integrates directly with AWS’s generative AI ecosystem (Amazon Bedrock). It can feed retrieved results into an LLM through Amazon Bedrock Knowledge Bases with just a few clicks. This tight integration means an organization can reuse the same index across multiple generative applications without rebuilding or re-indexing data each time. In practice, you connect a Bedrock-powered LLM to Kendra, and the LLM will automatically query Kendra for supporting information, yielding more context-rich and up-to-date answers to user prompts.
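A minimal sketch of that retrieve-then-generate pattern, assuming boto3 credentials with Kendra and Bedrock Runtime permissions; the index ID and model ID below are placeholders rather than values from the announcement:

```python
import boto3

KENDRA_INDEX_ID = "00000000-0000-0000-0000-000000000000"   # placeholder GenAI Index ID
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"         # any Bedrock model you have access to

kendra = boto3.client("kendra")
bedrock = boto3.client("bedrock-runtime")

question = "How much annual leave do new employees get?"

# 1) Retrieve semantically relevant passages from the Kendra GenAI Index.
retrieval = kendra.retrieve(IndexId=KENDRA_INDEX_ID, QueryText=question, PageSize=5)
passages = [item["Content"] for item in retrieval["ResultItems"]]

# 2) Ground the LLM's answer in the retrieved passages (simple RAG prompt).
prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n---\n".join(passages) + f"\n\nQuestion: {question}"
)
response = bedrock.converse(
    modelId=MODEL_ID,
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```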
**Advanced Semantic Search Capabilities**

One of the core strengths of Amazon Kendra GenAI Index is its enhanced semantic search functionality, which is crucial for understanding user queries and documents beyond simple keyword matching:

- Hybrid Vector-Keyword Search: The GenAI Index employs a hybrid indexing system that combines dense vector embeddings with traditional keyword search. This means it captures the contextual meaning of queries (via embeddings) while still recognizing exact keyword matches. The result is higher recall and precision – it finds relevant information even if the query uses different wording than the document, without losing the ability to handle precise terms.
- Improved Semantic Models: Amazon has incorporated state-of-the-art semantic ranking models and re-rankers into this index. These models were tested across diverse datasets and deliver high retrieval accuracy out-of-the-box. In practice, when a user asks a question, the GenAI Index understands the intent and context, retrieving the most pertinent passages (not just surface-level keyword matches). This enriched semantic understanding directly translates to more relevant context for generative AI, thereby improving answer relevance.
- Semantic Search vs. Traditional Search: In generative AI contexts, traditional keyword search often falls short for complex questions. The Kendra GenAI Index’s semantic search can map natural language questions to the right information, returning direct answers or specific excerpts rather than a list of links. By automating tasks like document chunking, embedding creation, and relevance ranking, it relieves developers from manual data preparation steps. This makes it much easier to add vast external knowledge sources to LLM applications and maximizes the quality of the retrieved context for generation.
**Empowering Intelligent Digital Assistants**

Amazon Kendra GenAI Index is a catalyst for building intelligent digital assistants (enterprise chatbots and Q&A systems) that are both knowledgeable and secure:

- Foundation for Enterprise Chatbots: When creating a conversational assistant (for customer support, IT helpdesk, HR self-service, etc.), Kendra GenAI Index serves as the knowledge base that the assistant draws from. For example, Amazon Q Business – AWS’s fully managed generative AI assistant – uses Kendra GenAI Index under the hood to answer questions with real enterprise data. This allows end-users to get immediate answers that are grounded in their company’s content, complete with citations to source documents.
- Permissions-Aware Responses: An intelligent assistant built with Kendra GenAI Index can respect user access controls. The index supports metadata-based user permissions filtering, meaning it will only retrieve documents the requesting user is allowed to see. This is crucial in enterprise settings; it ensures that a digital assistant serving an employee will not expose confidential data they lack rights to. The result is a secure conversational experience, where each user’s interactions are limited to appropriate content.
- Integration with AWS AI Services: Kendra GenAI Index works seamlessly with services like Amazon Bedrock (for custom assistants) and Amazon Q Business (for a fully-managed solution). Organizations have flexibility in how they build their assistant:
  - Fully Managed: With Amazon Q Business, the GenAI Index plugs in to handle retrieval, while AWS manages the LLM and prompt engineering. This removes the complexity of choosing models or crafting prompts – you get a ready-to-use assistant powered by your indexed data.
  - Customizable: With Amazon Bedrock, developers can use the GenAI Index as a retriever in a Knowledge Base and pair it with their preferred LLM, custom prompts, and agent logic. This allows tailoring the assistant’s behavior while still leveraging the strong retrieval capabilities of Kendra. In both cases, a single Kendra index can serve multiple assistant applications, so companies can build various bots (IT support, legal Q&A, etc.) all drawing from the same unified knowledge index.

By providing the “brain” for these assistants (in the form of relevant knowledge), Amazon Kendra GenAI Index significantly enhances the assistants’ ability to understand context and deliver helpful, fact-based responses.
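As a rough illustration of the two paths above (not code from AWS documentation), the sketch below assumes an existing Amazon Q Business application and a Bedrock Knowledge Base that already uses a Kendra GenAI Index as its retriever; the application ID, knowledge base ID, and model ARN are placeholders:

```python
import boto3

question = "What is our remote-work policy?"

# Path 1 – fully managed: ask Amazon Q Business (retrieval handled by its Kendra GenAI Index).
qbusiness = boto3.client("qbusiness")
qb_reply = qbusiness.chat_sync(
    applicationId="a1b2c3d4-0000-0000-0000-000000000000",  # placeholder Q Business application ID
    userMessage=question,
)
print(qb_reply["systemMessage"])

# Path 2 – customizable: Bedrock Knowledge Base that uses the same index as its retriever.
agent_runtime = boto3.client("bedrock-agent-runtime")
kb_reply = agent_runtime.retrieve_and_generate(
    input={"text": question},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB0000000000",  # placeholder Knowledge Base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(kb_reply["output"]["text"])
print(f'{len(kb_reply.get("citations", []))} citation group(s) returned')
```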
**Technical Advantages of Amazon Kendra GenAI Index**

From an engineering perspective, the Kendra GenAI Index brings several key advantages that bolster generative AI systems:

- Managed Retriever with Pre-Optimized Settings: It offers a fully managed retrieval service, so teams don’t need to set up or tune their own search infrastructure. The index comes with pre-optimized parameters (embedding models, relevance thresholds, etc.), delivering strong performance without extensive ML expertise or manual tweaking.
- Hybrid Search for Higher Accuracy: The index’s hybrid search (vector + keyword) approach yields higher accuracy and robustness in finding relevant info. Advanced semantic relevance models are built-in, and they’re combined with traditional search to ensure no relevant document is missed due to vocabulary mismatch. This high semantic accuracy is critical for providing LLMs the right context.
- Connectors to Enterprise Data Sources: Amazon Kendra GenAI Index can connect to 43+ enterprise content repositories out-of-the-box (SharePoint, OneDrive, Google Drive, Salesforce, and many others). This vastly simplifies data ingestion – organizations can seamlessly index documents, wiki pages, tickets, and more from existing systems without custom ETL work. It also supports a wide range of file formats (PDF, Word, HTML, etc.), ensuring all relevant knowledge can be included.
- User Context and Metadata Filtering: Built-in support for metadata and access controls means results can be filtered by attributes like document type, date, or user permissions. This fine-grained filtering improves result relevance and ensures compliance/security requirements are met (e.g. only surface documents tagged for public view in a customer-facing assistant); see the sketch after this list.
- Optimized Resource Utilization: The GenAI Index introduces smaller index units and a streamlined architecture for better resource utilization. This makes it more cost-effective to scale – companies can efficiently manage large volumes of data and queries while maintaining low latency. The managed service abstracts away the heavy lifting of infrastructure, automatically handling scaling and index distribution as usage grows.
- One Index, Multiple Uses: A single Kendra GenAI Index can be reused across different applications and services. This “index once, use anywhere” capability means an organization can maintain one centralized knowledge index and leverage it for various generative AI use cases (chatbots, search portals, analytics) simultaneously. It protects the investment in indexing and avoids duplicate work when expanding to new GenAI projects.
- Deep AWS Ecosystem Integration: The GenAI Index is natively integrated into AWS’s GenAI tools. It plugs into Amazon Bedrock Knowledge Bases (for building RAG workflows with chosen foundation models and prompt orchestration) and Amazon Q Business (for out-of-the-box AI assistants). This tight integration enables features like pay-per-token cost control for LLM calls, easy prompt customization, and agent-based orchestration while relying on Kendra for retrieval. Essentially, it serves as a bridge between enterprise data and LLMs, maximally leveraging AWS’s GenAI stack.
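The metadata and user-context filtering above corresponds to the `AttributeFilter` and `UserContext` parameters of the Kendra Retrieve/Query APIs. A minimal, hedged sketch, assuming documents were indexed with the built-in `_category` attribute and access-control metadata (the index ID, user, and group values are placeholders):

```python
import boto3

kendra = boto3.client("kendra")

response = kendra.retrieve(
    IndexId="00000000-0000-0000-0000-000000000000",   # placeholder GenAI Index ID
    QueryText="What is the parental-leave policy?",
    PageSize=5,
    # Only return passages from documents tagged with the built-in _category attribute.
    AttributeFilter={
        "EqualsTo": {
            "Key": "_category",
            "Value": {"StringValue": "HR-Policies"},
        }
    },
    # Permissions-aware retrieval: results are limited to what this user/group may see.
    UserContext={
        "UserId": "jane.doe@example.com",
        "Groups": ["hr-employees"],
    },
)

for item in response["ResultItems"]:
    print(item["DocumentTitle"], "->", item["DocumentURI"])
```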
**Use Cases and Integration with LLMs**

The introduction of Amazon Kendra GenAI Index opens up a variety of use cases by enhancing how generative AI systems access and use knowledge. It also demonstrates powerful integration patterns with LLMs to improve answer quality and relevance:

- Enterprise Q&A and Helpdesk Assistants: Organizations can build internal chatbots (IT helpdesk, HR assistant, knowledge base Q&A) that understand employees’ questions and provide precise answers from company documentation. For example, an HR assistant can retrieve the exact policy document excerpt about annual leave when asked, “How much annual leave do I have?” – then the LLM uses that excerpt to give a clear answer. Because Kendra enforces permission rules, if a confidential policy is restricted, the assistant won’t reveal it. This leads to faster issue resolution and employee self-service with trustworthy answers.
- Customer Support and Self-Service: Companies can power customer-facing virtual agents with a Kendra GenAI Index loaded with product manuals, FAQs, and troubleshooting guides. When a customer asks a complex question, the system retrieves the relevant sections of the product documentation and the LLM uses that to generate a helpful response (with citations if needed). This provides accurate, up-to-date support information and can deflect load from human support agents.
- Research and Knowledge Discovery: In scenarios like legal research, scientific literature review, or business intelligence, the GenAI Index can semantically search vast document collections. A generative model can then summarize or explain the findings. The combination allows users to ask natural language questions and get synthesized answers grounded in the retrieved source material – greatly speeding up research tasks while ensuring the answers are backed by actual documents.
- Intelligent Search Portals: Beyond chatbots, the GenAI Index can back an intelligent search experience on corporate portals or applications. Instead of traditional keyword search that returns a list of files, a portal can use Kendra GenAI to return direct answers or a short generated summary, along with the source content. This gives users a more conversational search experience. Since Kendra’s retrieval is highly relevant, the LLM’s generated summary or answer will be context-rich and on-point.
- Improved LLM Response Quality: In all these cases, integrating Kendra GenAI Index with LLMs leads to better response quality. The LLM is no longer answering questions in a vacuum or solely from training data; it’s augmented with fresh, domain-specific knowledge. According to AWS, leveraging Kendra’s high-accuracy retrieval with Bedrock’s generative capabilities results in more sophisticated and accurate AI assistants. The answers are more likely to be correct and contextually appropriate because the model can reference the retrieved facts. Additionally, the ability to include citations to source content (a practice supported by RAG pipelines and services like Amazon Q) boosts transparency and user trust in the AI’s answers.

In summary, Amazon Kendra GenAI Index significantly enhances the generative AI ecosystem by providing a robust, ready-made solution for the retrieval component of AI systems. Its advanced semantic search capabilities and seamless integration with LLM workflows (RAG, Bedrock, Q Business) enable developers to build generative AI applications that are far more knowledgeable, relevant, and trustworthy than using LLMs alone. By uniting enterprise-grade search with generative AI, Kendra GenAI Index helps organizations deliver rich conversational experiences and accurate insights drawn from their vast troves of data – all with lower complexity and faster time-to-value.

Sources: The information and benefits discussed above are drawn from Amazon’s official announcements and documentation on Kendra GenAI Index, which highlight its role as a high-accuracy RAG retriever, its technical features (hybrid search, connectors, permissions filtering), and its integration into AWS generative AI services for building intelligent assistants. These sources underscore how Kendra GenAI Index is architected to improve generative AI response relevance and quality by tightly coupling retrieval with generation in real-world applications.
:::
# Amazon Q Features Deep Dive
## Amazon Q Business
### Integration with AWS IAM Identity Center
Amazon Q Business provides native data source connectors that seamlessly integrate and index content from multiple repositories into a unified index.
:::info
[AWS Blog: Configure Amazon Q Business with AWS IAM Identity Center trusted identity propagation](https://aws.amazon.com/blogs/machine-learning/configuring-amazon-q-business-with-aws-iam-identity-center-trusted-identity-propagation/)
Updated: 2024-07-30
:::
Amazon Q Business uses AWS IAM Identity Center to record the workforce users to whom you assign access, along with their attributes, such as group membership.
Consider a client-server application that uses an external identity provider (IdP) to authenticate users in order to give them access to AWS resources that are private to them. For example, your web application might use Okta as an external IdP to authenticate users so they can view their private conversations from Q Business. In this case, Q Business cannot use the identity token generated by the third-party provider to access the user's private data directly, because there is no mechanism for it to trust an identity token issued by a third party.
To solve this, you can use IAM Identity Center to bring the external IdP's user identity into an AWS Identity and Access Management (IAM) role session. This lets you authorize requests based on each individual user, the attributes they carry, and the groups they belong to, rather than managing permissions only through IAM policies. You exchange the token issued by the external IdP for a token generated by Identity Center, which maps to a user in Identity Center. The web application can then use this new token to send Q Business a request to start a private chat conversation. Because the token maps to the corresponding Identity Center user, Q Business can authorize the request based on that user or the groups they belong to in Identity Center.
Trusted identity propagation provides a mechanism for applications that authenticate users outside of AWS to make requests on behalf of their users through a trusted token issuer.

In this architecture, the web browser represents your application's user interface. It could be a web page rendered in a browser, Slack, Microsoft Teams, or another application.
The application server might be a web server running on Amazon Elastic Container Service (Amazon ECS), or a Slack or Microsoft Teams gateway implemented with AWS Lambda.
Identity Center itself may be deployed in a delegated administrator account (the identity account in the architecture diagram), or in the same AWS account as Amazon Q Business (the application account in the architecture diagram).
You have an OAuth 2.0 OpenID Connect (OIDC) external IdP, such as Okta, Ping One, Microsoft Entra ID, or Amazon Cognito, for authentication and authorization.
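A hedged boto3 sketch of the token-exchange flow described above (not the blog's exact code). It assumes the IAM Identity Center customer managed application, trusted token issuer, and an IAM role configured for identity-enhanced sessions already exist; all ARNs, IDs, and tokens below are placeholders:

```python
import base64
import json

import boto3

# Placeholders (hypothetical values for illustration only):
IDC_APP_CLIENT_ID = "arn:aws:sso::111122223333:application/ssoins-example/apl-example"  # Identity Center customer managed app
QBUSINESS_APP_ID = "a1b2c3d4-0000-0000-0000-000000000000"                               # Amazon Q Business application ID
ASSUME_ROLE_ARN = "arn:aws:iam::111122223333:role/qbusiness-chat-role"                  # role trusted for identity propagation
idp_id_token = "<ID token issued by the external IdP, e.g. Okta>"

# 1) Exchange the external IdP token for an IAM Identity Center token.
sso_oidc = boto3.client("sso-oidc")
idc_tokens = sso_oidc.create_token_with_iam(
    clientId=IDC_APP_CLIENT_ID,
    grantType="urn:ietf:params:oauth:grant-type:jwt-bearer",
    assertion=idp_id_token,
)

# 2) Read the sts:identity_context claim from the Identity Center ID token (JWT payload).
payload_b64 = idc_tokens["idToken"].split(".")[1]
payload = json.loads(base64.urlsafe_b64decode(payload_b64 + "=" * (-len(payload_b64) % 4)))
identity_context = payload["sts:identity_context"]

# 3) Assume a role with the identity context attached (trusted identity propagation).
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn=ASSUME_ROLE_ARN,
    RoleSessionName="qbusiness-user-session",
    ProvidedContexts=[{
        "ProviderArn": "arn:aws:iam::aws:contextProvider/IdentityCenter",
        "ContextAssertion": identity_context,
    }],
)["Credentials"]

# 4) Call Amazon Q Business as the end user; responses are scoped to that user's permissions.
qbusiness = boto3.client(
    "qbusiness",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
answer = qbusiness.chat_sync(
    applicationId=QBUSINESS_APP_ID,
    userMessage="What did we discuss in my last conversation?",
)
print(answer["systemMessage"])
```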
### Data Connector: Microsoft SharePoint Online
:::info
[AWS Blog: Connect Amazon Q Business to Microsoft SharePoint Online using least privilege access controls](https://aws.amazon.com/blogs/machine-learning/connect-amazon-q-business-to-microsoft-sharepoint-online-using-least-privilege-access-controls/)
:::
### Data Connector: Web Crawler
:::info
[AWS Blog: Index website contents using the Amazon Q Web Crawler connector for Amazon Q Business](https://aws.amazon.com/blogs/machine-learning/index-website-contents-using-the-amazon-q-web-crawler-connector-for-amazon-q-business/)
:::
## Amazon Q Developer
:::info
AWS blog: [Introducing the new Amazon Q Developer experience in AWS Lambda](https://aws.amazon.com/blogs/devops/introducing-the-new-amazon-q-developer-experience-in-aws-lambda/)
Updated: 2024-10-22
:::
# App Studio Features Deep Dive
:::danger
[Creating and setting up an App Studio instance for the first time](https://docs.aws.amazon.com/appstudio/latest/userguide/setting-up-first-time-admin.html)
Initial setup takes roughly 30 minutes.
:::
:::spoiler Key setup acknowledgments for App Studio
App Studio will create IAM policies to access:
- AWS IAM Identity Center
- Amazon CodeCatalyst
- AWS Secrets Manager
- Amazon CloudWatch
App Studio will use an existing CodeCatalyst space to create projects. If no space is found, App Studio creates a new space and adds the project to it. CodeCatalyst projects created by App Studio are not charged.
App Studio will deploy CloudFormation stacks and create the IAM roles required for builders to create DynamoDB resources in the AWS account. Avoid editing or modifying these roles, as doing so can affect the functionality of the storage solutions that App Studio manages.
App Studio transfers data across Regions to enable generative AI features. By confirming, you agree to this cross-Region data movement. You can opt out of cross-Region data movement after completing setup.
:::