"
There are still on ambiguity on the our requirements for logging that I would to address. . I would like to clarify something on the requirement for logging. Are we interested in monitoring who logged in and their activities or we are looking into for a better way to trouble to applicati
Hi Nigel,
There’s still some ambiguity around our logging requirements that I’d like to clarify. Are we primarily aiming to:
1. Monitor user activity — such as who logged in and what actions they performed,
or
2. Enhance our ability to troubleshoot the application in case of failures — for example, by implementing use cases like correlation IDs?
Quick update on the function deployment:
• Basic Azure Functions are deployed, but a production implementation requires further improvements.
• API usability and security need refinement, especially around authorization.
• Currently, authorization uses a device flow instead of an interactive login process, allowing the function to act on behalf of the user.
• The approach enables access to transcripts without requiring user login, but we need to assess security risks.
Next Steps:
• Improve API user experience.
• Strengthen security mechanisms beyond just the secret key.
• Define a clearer flow for handling transcript access securely.
What you have:
Tested API (channel meeting, non-channel meeting).
Watched the video (nothing strictly new there, one interesting thing about auth flow)
What you plan:
Improve / rewrite the application in anticipation of future needs
Questions:
Should we think about how this is going to be incorporated into the overall process? Where are we going to deploy this thing?
Specifically, how is this going to be invoked and presented? Are we going to use Todd's Teams bot? Something else?
# For team creation
Team.Create
Team.ReadBasic.All
# For channel creation
Channel.Create
Channel.ReadBasic.All
# To create meeting
OnlineMeetings.ReadWrite
# Maybe?
Calendars.ReadWrite
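A minimal sketch of how these delegated scopes might be exercised end to end, using the device-code flow mentioned above (client/tenant IDs and the meeting subject are placeholders; the exact scope list still needs to be confirmed):
```python
# Sketch only: device-code login with MSAL, then creating an online meeting via Graph.
# CLIENT_ID / TENANT_ID are placeholders for our app registration.
import msal
import requests

CLIENT_ID = "<app-registration-client-id>"
TENANT_ID = "<tenant-id>"
SCOPES = ["OnlineMeetings.ReadWrite", "Team.Create", "Channel.Create"]

app = msal.PublicClientApplication(
    CLIENT_ID, authority=f"https://login.microsoftonline.com/{TENANT_ID}"
)
flow = app.initiate_device_flow(scopes=SCOPES)
print(flow["message"])                      # user completes sign-in in a browser
token = app.acquire_token_by_device_flow(flow)

resp = requests.post(
    "https://graph.microsoft.com/v1.0/me/onlineMeetings",
    headers={"Authorization": f"Bearer {token['access_token']}"},
    json={"subject": "Incident bridge"},    # placeholder subject
)
resp.raise_for_status()
print(resp.json().get("joinWebUrl"))
```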
So just to reiterate:
The essential difference with the Incident app developed on top of Slack is that their app has the permissions required to operate in an event-driven fashion, without any user intervention (specifically, they can receive events through webhooks and react to them without the user granting permission). My current understanding is that a similar path is not available to us in Teams / Microsoft, because of a combination of the lack of compartmentalization and the inability to use application permissions.
We can try different workarounds (polling, service accounts), but it might create serious technological debt, with no guarantee of success in case of another "deploy into production now" event.
Lack of compartmentalization is also a serious risk in terms of data governance. Even if the application can be granted "application permissions" (in contrast to delegated permissions), organization-wide access is something we should avoid. We don't want to be the source of the next Slack-like leak.
Can we get an MS / Azure solution architect to chip in here? Can, for example, Incident Response be placed in a separate OU, and the application be granted access only to communication within that unit, using application permissions?
Notes:
Current State:
• A Microsoft Graph application is registered with delegated permissions, allowing it to:
• Map meeting join IDs to internal meeting IDs.
• Retrieve recordings and transcriptions for a given meeting ID.
• A Python-based interactive application is in place, which:
• Requires user interaction for login.
• Fetches and processes currently available (not necessarily completed) transcripts.
• Has additional functionality for fetching and transcribing recordings locally.
• Capabilities:
• Retrieve available transcripts in real-time.
• Store transcripts in MongoDB or another storage solution if access is granted.
Comparison with the Incident Bot:
• The incident bot operates differently:
• It runs on MongoDB Compass and would require porting the application and rewriting it in TypeScript.
• It uses application-level permissions and does not require user interaction.
• It reacts to webhook events, but Teams webhooks also require application permissions.
• Despite requests, it remains unclear whether application-level permissions are entirely restricted.
• Alternatives, such as meeting bots with live audio access, also require application permissions.
Where we are right now:
- We have an MS Graph application registered with **delegated** permissions sufficient to:
- map between a meeting join id and the internal meeting id
- retrieve the list of recordings and transcriptions for the meeting identified by that meeting id
- We have an interactive Python application that uses the above permissions to:
- Log in on behalf of a user (user interaction required)
- Retrieve and process currently available (not necessarily completed) transcripts for a user-provided meeting id.
- Additional capability to fetch and transcribe recordings locally.
- What we can do from here:
- Get any transcripts available at the moment
- Assuming access, write them to MongoDB or another storage of choice (see the sketch below).
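A minimal sketch of that flow (endpoints as per the transcript API docs; `MONGO_URI`, database and collection names are placeholders, and the delegated token is assumed to have been acquired already):
```python
# Sketch only: list available transcripts for a meeting and persist the raw VTT to MongoDB.
import requests
from pymongo import MongoClient

GRAPH = "https://graph.microsoft.com/v1.0"

def store_transcripts(access_token: str, meeting_id: str, mongo_uri: str) -> int:
    headers = {"Authorization": f"Bearer {access_token}"}
    coll = MongoClient(mongo_uri)["incidents"]["transcripts"]   # placeholder db/collection
    listing = requests.get(f"{GRAPH}/me/onlineMeetings/{meeting_id}/transcripts", headers=headers)
    listing.raise_for_status()
    stored = 0
    for t in listing.json().get("value", []):
        content = requests.get(
            f"{GRAPH}/me/onlineMeetings/{meeting_id}/transcripts/{t['id']}/content",
            headers=headers,
            params={"$format": "text/vtt"},
        )
        content.raise_for_status()
        coll.insert_one({"meetingId": meeting_id, "transcriptId": t["id"], "vtt": content.text})
        stored += 1
    return stored
```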
This is a fundamentally different flow from what the incident bot application is doing. In particular, we need explicit user interaction to be able to fetch data.
Incident bot:
- Is deployed on MongoDB Compass in TS. This would require porting the application to that environment and rewriting it from Python to TypeScript.
- Works with application permissions and doesn't depend on user interaction. Despite repeated requests, we didn't receive clarification on whether comparable application-level permissions are strictly off limits in Teams.
- Reacts to events through webhooks. Webhooks for Teams are available but, same as the non-interactive flow, require application permissions.
Alternatives to the Graph API (i.e. meeting bots with live audio access) also require application permissions.
Given that, we have to clearly define our goals, confirm that the identified limitations exist, and communicate possible blockers to the stakeholders.
What can be achieved right now:
- Adding data persistence to the interactive application (easy, but with no guarantee that the current state is final)
- Porting the application to Compass (likely hard, as I don't have a setup at the moment). It also doesn't resolve any of the problems listed above.
- Summarization of the content. This is most likely redundant, as it can be handled by reusing incident bot parts once data is saved to Mongo.
Any other thoughts?
-------------------------------------------------------------
Do we know how the meetings are going to be set up? Based on the research so far, capturing data in real time (either audio or transcript) is not exactly straightforward, so if we're going to deal with 24/7 meetings, we have to communicate that setting up the whole thing might take a while.
- Graph API (https://learn.microsoft.com/en-us/graph/teams-changenotifications-callrecording-and-calltranscript#subscribe-to-transcripts-available-for-a-particular-online-meeting) referenced by me and Todd before, allows accessing transcripts after the meeting ended:
> Change notifications enable you to subscribe to changes to transcripts and recordings. You can get notified whenever a transcript or a recording is available after an online meeting.
- According to MS support, live call transcripts might be available (https://techcommunity.microsoft.com/discussions/teamsdeveloper/any-available-api-for-live-transcript-of-a-meeting/3924884, https://learn.microsoft.com/en-us/graph/api/resources/calltranscript?view=graph-rest-beta), but it is not clear if it is indeed the case, and there seem to be different limitations. Also, it is a beta API. To be checked if this works.
- Teams recording bots can be created with .NET https://github.com/vasalis/TeamsRecordingBotAndAzureCongitiveServicesAtWork. They seem to require full Windows VM for deployment. More: https://learn.microsoft.com/en-us/microsoftteams/platform/bots/calls-and-meetings/calls-meetings-bots-overview
- Azure Communication Services can be used to create an app that captures live captions https://learn.microsoft.com/en-us/azure/communication-services/quickstarts/voice-video-calling/get-started-with-closed-captions?pivots=platform-web. It is a TypeScript API and might require a more complex flow?
In contrast, fetching a transcript after the meeting is straightforward.
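For reference, a rough sketch of what the change-notification subscription from the first link above might look like (the resource path, expiry limits, and required permissions should be verified against the linked docs; the notification URL is a placeholder endpoint we would have to host):
```python
# Sketch only: subscribe to transcript-availability notifications for a specific online meeting.
import datetime
import requests

def subscribe_to_transcripts(access_token: str, meeting_id: str, notification_url: str) -> dict:
    expiry = (datetime.datetime.utcnow() + datetime.timedelta(hours=1)).isoformat() + "Z"
    body = {
        "changeType": "created",
        "notificationUrl": notification_url,          # public HTTPS endpoint we control
        "resource": f"communications/onlineMeetings/{meeting_id}/transcripts",
        "expirationDateTime": expiry,
        "clientState": "opaque-shared-secret",        # placeholder shared secret
    }
    resp = requests.post(
        "https://graph.microsoft.com/v1.0/subscriptions",
        headers={"Authorization": f"Bearer {access_token}"},
        json=body,
    )
    resp.raise_for_status()
    return resp.json()
```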
=========
https://github.com/vasalis/TeamsRecordingBotAndAzureCongitiveServicesAtWork
I checked the paper and I'm still not sure to what extent this idea is even useful. While it might, in theory, improve the quality of your returned values, without specific metrics it is not clear whether you have exhausted the possibilities of the current approach.
It might, given a lack of financial restrictions, be a useful idea to explore, but otherwise it is premature, as noted for example by Thoughtworks (https://www.thoughtworks.com/radar):
> "As organizations are looking for ways to make large language models (LLMs) work in the context of their product, domain or organizational knowledge, we're seeing a rush to fine-tune LLMs. While fine-tuning an LLM can be a powerful tool to gain more task-specificity for a use case, in many cases it’s not needed. One of the most common cases of a misguided rush to fine-tuning is about making an LLM-backed application aware of specific knowledge and facts or an organization's codebases. In the vast majority of these cases, using a form of retrieval-augmented generation (RAG) offers a better solution and a better cost-benefit ratio. Fine-tuning requires considerable computational resources and expertise and introduces even more challenges around sensitive and proprietary data than RAG. There is also a risk of underfitting, when you don't have enough data available for fine-tuning, or, less frequently, overfitting, when you have too much data and are therefore not hitting the right balance of task specificity that you need. Look closely at these trade-offs and consider the alternatives before you rush to fine-tune an LLM for your use case."
It is also not clear, without specific measurements, whether latency is really an issue and whether it is really caused by the vector store or other retrieval performance limits (as opposed to bad data management practices or queries).
An alternative approach, which could be researched and implemented using a specific data representation, is fine-tuning embeddings (ibid.); a sketch follows the quote below:
> "When building LLM applications based on retrieval-augmented generation (RAG), the quality of
embeddings directly impacts both retrieval of the relevant documents and response quality. Fine-
tuning embedding models can enhance the accuracy and relevance of embeddings for specific tasks
or domains. Our teams fine-tuned embeddings when developing domain-specific LLM applications for
which precise information extraction was crucial. However, consider the trade-offs of this approach
before you rush to fine-tune your embedding model"
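A minimal sketch of what embedding fine-tuning could look like with sentence-transformers, assuming we can assemble (question, relevant passage) pairs from our own domain data (base model, example pair, and hyperparameters are placeholders):
```python
# Sketch only: contrastive fine-tuning of an embedding model on domain (query, passage) pairs.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-MiniLM-L6-v2")  # base model, placeholder choice
train_examples = [
    InputExample(texts=["What caused incident 1234?", "Incident 1234 root cause analysis ..."]),
    # ... more domain-specific (query, relevant passage) pairs
]
loader = DataLoader(train_examples, shuffle=True, batch_size=16)
loss = losses.MultipleNegativesRankingLoss(model)  # contrastive loss suited to retrieval tasks
model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
model.save("domain-tuned-embeddings")
```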
No matter which path is selected, we need clear business goals and evaluation criteria, that can be used to justify the approach taken to the upper management.
Also, we have to be mindful of possible limitations:
- The amount and type of data we have. Currently the KG contains mostly BO data, which might not be a perfect choice for tuning either type of model. Additional tools or additional data might be necessary.
- The amount of data might be critical, especially for LLM tuning. This might also increase costs.
- We have to be aware that tuning won't remove the need for RAG in the general case.
- The data representation has to be robust enough to be reusable in CosmosDB or other platforms we choose later on.
To ensure long-term success we should consider how much sovereignty over the data should be preserved within the company. For example, it might be preferable to use techniques such as prompt tokenization to reduce exposure of confidential data to 3rd-party tools. With such an approach, the team might focus on retrieval, data governance, integration, etc.
==============================
Checked the paper, but it’s unclear how useful this idea actually is. While it might improve result quality, without concrete metrics, it’s hard to say if we’ve exhausted the current approach.
Fine-tuning could be an option if there are no financial constraints, but as ThoughtWorks points out, RAG often offers a better cost-benefit trade-off. Fine-tuning requires significant resources, introduces data challenges, and risks underfitting/overfitting. We need to assess whether latency issues are actually from the vector store or other factors like query performance or data management.
One alternative worth exploring is fine-tuning embeddings, which could improve retrieval quality without the full complexity of LLM fine-tuning. However, we need clear business goals and evaluation criteria before committing to any approach.
Key considerations:
• Data limitations: Our KG mainly contains BO data, which may not be ideal for tuning. Additional tools or data might be needed.
• Cost & feasibility: Tuning requires significant data and can be expensive.
• RAG remains necessary: Fine-tuning won’t eliminate the need for RAG.
• Data representation: Should be robust enough to be reusable across platforms like CosmosDB.
For long-term success, we should also consider data sovereignty—minimizing exposure to third-party tools via prompt tokenization, focusing on retrieval, governance, and integration instead.
## Response:
> So how to add video transcription data to this which has been identified as of major importance to the MIM team. My thoughts:
> How long does it take Teams to transcribe a call after it has concluded, i.e., when is the Recap tab populated with the latest transcription? Does it vary based on the length of the call?
This should be a low-latency process. The in-call chatbot must already use transcription, so it has to be done in more or less real time. Still, they don't provide any details, so it has to be measured.
> I'm having Maria find out if they plan to use the same Teams channel or room or whatever they call it on that platform as we use the same Zoom room for standup. If so, can we 'listen' to the available API and just grab the Recap and push it into Mongo as a pipeline?
Probably not. Recaps or any AI components seem to be available only from the UI, not through the API. We could probably work around it somehow, but it might be cumbersome.
> So your point of evaluation is on point...specifically point 1. If the Recap takes too long or we later find it is not great, we need to inform the MIM team to manage their expectations.
We can test it manually, but that's probably similar in response time to the standard chatbot.
##
Great info and very on point. Regarding transcription capture, funny you ask, because I asked the same thing of the stakeholders just yesterday. Some background:
In emergency triage situations, whether they be incidents, medical situations, or military scenarios, there is a critical period of response time. For example, with stroke victims, it is called the Golden Hour, i.e., if you don't assess their condition and have a reasonable plan within that time, the victim is much more likely to have permanent brain damage or worse. For major incident management, it is 15 minutes. In this scenario, they need to determine, at the very least, if this is a P1 or P2 event. Sometimes a team will call them freaking out about an outage and make it seem like a P1 or P2 when it is really much less serious, or they find out later that a P2 is really a P1. So in that 15 minutes, we can see the benefits of GenAI summarizing and analyzing multiple streams of data.
So how to add video transcription data to this which has been identified as of major importance to the MIM team. My thoughts:
How long does it take Teams to transcribe a call after it has concluded, i.e., when is the Recap tab populated with the latest transcription? Does it vary based on the length of the call?
This should be a low-latency process. The in-call chatbot must already use transcription, so it has to be done in more or less real time. Still, they don't provide any details, so it has to be measured.
> I'm having Maria find out if they plan to use the same Teams channel or room or whatever they call it on that platform as we use the same Zoom room for standup. If so, can we 'listen' to the available API and just grab the Recap and push it into Mongo as a pipeline?
Probably not. Recaps or any AI components seem to be available only from the UI, not through the API. We could probably work around it somehow, but it might be cumbersome.
So your point of evaluation is on point...specifically point 1. If the Recap takes too long or we later find it is not great, we need to inform the MIM team to manage their expectations.
We can test it manually, but that's probably similar in response time to the standard chatbot.
For now let's work with the assumption that Premium is fine. As I mentioned to Todd a few days ago, if we shorten a major incident, that saves the company way more in one event than a Premium service is likely to cost over the next 3 years :wink:
I'm writing thoughts down fast, as I get information, so if anything is confusing, ask as many questions as you need.
====================
So you mentioned it is possible for you to attach persistent storage to a lambda. Is that much work on your side? Can such storage be configured to be accessible to an outside process (i.e. Droid ETL)?
Also, just theoretically, can we deploy multiple lambdas, one for each branch?
https://hackmd.io/@36cnYzZSSLONAOIMO8YMOQ/BkX3n9QVC/edit
AARP content management system
Option 1:
User finds a new relevant recall notice
User logs into the existing CMS
User goes to the standard "create article" page
There is a new UI component that takes the recall URL, with a "generate draft" button
This redirects the call to your "backend", which generates whatever you need
and injects it into the UI
Option 2:
Your system tracks the relevant recall feeds
If a new recall is detected, it creates a draft and puts it into the CMS using the tools they have available
User gets an email or another form of notification that a new draft is available
User logs into the CMS, finds the draft article and works with it as usual
## Agenda for call.
=================================
CDC
Where do I set the model type? The Streamlit app had a drop-down for one, but I do not see it used in the APIs.
- Currently not implemented.
What is the retrieve-context API used for?
- It is redundant. It gives the same (or equivalent) output as the ask-question API.
Can you send me an example response for the ask question API? I want to know how to parse it for references.
Thanks
============================
Let’s use this time to discuss what is needed for governance for the Verified Data Library data sets and access. A few items to start, but feel free to come with your own agenda items in addition:
- Approval workflow
- Rubric for data verification, a standardized checklist for approval
- Audit procedures for data and access
DNY Workbench
====
- When you create an issue with Clone
- Tested by
- Relation between different Jira issues
- Referencing another Jira issue
- Connection between 2 separate Jira issues
=
Flowchart:
service -> mongo -> s3 -> neo4j
##
## confluence page
Concerns Regarding the Cyberbot System and Team Knowledge Graph
When designing a team knowledge graph integrated with a cyberbot system, it is crucial to address various risks and uncertainties that may arise at different stages of the system. Below are the key areas of concern:
1. Sources of Uncertainty in the System
The primary sources of uncertainty include:
• Errors within upstream systems: Issues that originate from external systems feeding into the knowledge graph.
• ETL process errors (target → MongoDB → Neo4j): Failures in extracting, transforming, or loading data can compromise data integrity across the pipeline.
• Errors in data representation: Misalignment between how data is stored and how it’s represented in the knowledge graph.
Mitigation: These concerns should be handled through robust data engineering and governance practices to ensure data accuracy and consistency.
2. Model Quality and Query Reliability
Additional risks are associated with the quality of the model and the accuracy of Cypher queries:
• Incorrect Cypher queries: Faulty queries can return data that seems correct in format but is incorrect in meaning or relevance.
• Inaccurate summarization: Errors in summarizing data may lead to misinformed decisions or insights.
In many scenarios, it’s assumed that users possess enough domain expertise to evaluate whether a given output is accurate, similar to using tools like Copilot, where the user assesses code quality. However, in the context of a cyberbot system, such assumptions may not hold. Users may lack the necessary knowledge to recognize issues, especially if the data “looks” correct but fails to meet specific requirements (e.g., an incorrectly applied data filter).
3. Managing the Risks
Key questions arise regarding how to manage these risks effectively:
• Risk Acceptance: What level of risk can the system tolerate, given the likelihood of undetected errors?
• Risk Minimization: How can we reduce these risks? Potential approaches include:
• Visual feedback: Providing users with visual cues that help them understand the generated Cypher queries, even without deep technical knowledge of Cypher.
• Sanity checks: Implementing checks such as Cypher-to-natural-language translations and similarity checks to ensure the output aligns with expectations.
4. Sanity Checking with Domain Knowledge
One critical safeguard could be the implementation of procedures that allow users to validate the system’s output using their domain knowledge. For instance, if the knowledge graph (KG) serves as the sole source of unified data, users should be able to perform a sanity check based on their familiarity with the data and its context, helping to identify any discrepancies.
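As an illustration of the sanity checks mentioned in section 3, a rough sketch of a Cypher-to-natural-language similarity check (`llm` and `embed` stand in for whatever model clients are already in the stack; the threshold is arbitrary and would need tuning):
```python
# Illustrative sketch: back-translate the generated Cypher and compare it with the question.
def cypher_sanity_check(user_question: str, cypher_query: str, llm, embed, threshold: float = 0.75) -> bool:
    # 1. Ask the LLM to describe, in plain language, what the generated query does.
    back_translation = llm(f"Explain in one sentence what this Cypher query returns:\n{cypher_query}")
    # 2. Compare that description with the original question in embedding space (cosine similarity).
    q_vec, bt_vec = embed(user_question), embed(back_translation)
    similarity = sum(a * b for a, b in zip(q_vec, bt_vec)) / (
        (sum(a * a for a in q_vec) ** 0.5) * (sum(b * b for b in bt_vec) ** 0.5)
    )
    # 3. Flag the result for user review when the two diverge too much.
    return similarity >= threshold
```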
Concerns:
Evaluation Criteria:
1. Clarity in Ambiguity Handling: How well does the model manage unclear natural language queries, especially when dealing with relationships?
2. Accuracy in Complex Graph Queries: Does the model generate accurate, efficient Cypher queries, particularly for multi-level graph traversals and pattern matching?
3. Domain Adaptability: Can the model handle domain-specific graphs, understanding the context and relationships unique to the graph’s structure?
4. Error Handling: Does the system detect and explain errors or potential issues in the Cypher query, offering understandable feedback to the user?
In graph databases, relationships and nodes are critical components, and generating correct Cypher queries from natural language can be difficult due to:
1. Ambiguity in Natural Language: Users often input vague or ambiguous queries, and the model needs to resolve these ambiguities to generate accurate Cypher queries.
2. Complex Graph Traversals: Some queries require multi-level traversals of relationships or paths, making it difficult for models to generate the correct Cypher structure.
3. Domain-Specific Graph Structures: Adapting text-to-Cypher models to work with domain-specific graphs, such as knowledge graphs for cybersecurity or healthcare, requires deep understanding of both the domain and graph structure.
Task 2: Complex Query Generation
Design a set of complex queries requiring the model to:
• Join multiple tables.
• Use nested subqueries.
• Implement aggregate functions (e.g., SUM(), COUNT(), AVG()).
Example Query:
• “Find the total sales for each product category for the last quarter, and display only categories where sales exceeded $100,000.”
Your goal: Test how well the text-to-SQL model handles these types of queries (a reference sketch follows this list). Identify common failure modes, like:
• Incorrect table joins.
• Inappropriate use of aggregation functions.
• Errors in filtering or condition application.
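One way to make these failure modes measurable is execution-based comparison against a hand-written reference query on a fixed test database. A rough sketch (the schema and reference SQL are hypothetical, matching the example query above):
```python
# Illustrative harness: run generated and reference SQL on the same test DB and compare results.
import sqlite3

def queries_equivalent(db_path: str, generated_sql: str, reference_sql: str) -> bool:
    conn = sqlite3.connect(db_path)
    try:
        generated = conn.execute(generated_sql).fetchall()
        reference = conn.execute(reference_sql).fetchall()
    finally:
        conn.close()
    # Order-insensitive comparison of result rows.
    return sorted(map(tuple, generated)) == sorted(map(tuple, reference))

# Hypothetical reference query for the example above (schema names are placeholders).
reference_sql = """
SELECT c.name, SUM(s.amount) AS total_sales
FROM sales s JOIN categories c ON s.category_id = c.id
WHERE s.sale_date >= date('now', '-3 months')
GROUP BY c.name
HAVING SUM(s.amount) > 100000;
"""
# Usage: queries_equivalent("test.db", model_generated_sql, reference_sql)
```
Execution-based comparison catches join, aggregation, and filtering errors that plain string comparison of the SQL text would miss.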
## Choosing MongoDB replacement
These concerns are predicated on the assumption that there is a single MongoDB instance that is intended for:
- the GraphQL backend
- the KG (Neo4j ingestion)
If that's the case, should the decision be delayed until there is clarity about:
- Neo4j deployment (Aura vs. on prem) and where such deployment will be located
- Choice of the cloud provider for the AI component
If the current preference for the latter is Azure, and it is possible that the same will be true for the former, wouldn't it make more sense (for security, governance and inbound / outbound traffic costs) to keep things within a single cloud?
For example, Azure Cosmos DB provides both MongoDB API, as well as graph API (though with Gremlin not Cypher, although there is https://github.com/opencypher/cypher-for-gremlin that can do the translation).
## Also, intersection with AI initiatives
- Permission tracking and synchronization across systems and transformations
- A central, machine-readable and enforceable permission model
- Challenges with enforcing:
- Record-level permissions
- Column-level permissions
- Unstructured data handling
especially if data is transformed and/or generated.
- Are there any solutions implemented or planned in the near future?
- If so, what platforms are used or considered?
- If not, why, and should a discussion be started?
Rationale:
- There are different legacy (i.e. Droid) and existing initiatives to integrate data sources and make them actionable
- All require governance and could be supported by a single platform to reduce the costs and effort required to develop new solutions
## Data governance for Neo4J
How do we track permissions and access controls across the whole system and at each stage?
There can be restrictions that apply to
Rows / documents / nodes / edges
Fields / columns
Collections / tables / types
in either raw form (without significant transformations) or in processed form (i.e. aggregated).
This can be further complicated by the inclusion of textual data in the form of RAG, with an index built on top of Confluence pages or JIRA tickets.
In more integrated environments tracking such information would be handled by a dedicated system, but we're not there yet.
Furthermore, we don't have the tools required to enforce permissions at the POC level.
In such a situation we can start tracking sources, transformations, and associated risks using Confluence pages. This information can allow us
to narrow down the target audience for the POC and later enable development of a role-based access model for the KG and guardrails for any AI solution.
================= Please see the following prompt ===========
prompt = f"""
You are an expert in generating concise and relevant questions for users. Your task is to create three follow-up questions based on the information provided below. The information includes the previous question that the user asked, the answer that the LLM generated based on the context it was given by a RAG module, and the context given to the LLM by the RAG module for the previous question.
Your questions should be designed to:
1. Broaden the understanding of the topic covered by the previous question
2. Be concise and easy to understand for the users
3. Be answerable from the context provided
Previous Question: {last_question}
Answer: {last_answer}
Context: {context}. """
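A rough sketch of how this prompt could be wired up, assuming an OpenAI-compatible client (the model name is a placeholder; swap in whichever provider the chatbot already uses, and adjust the parsing to the actual output format):
```python
# Sketch only: generate follow-up questions using the prompt defined above.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def suggest_follow_ups(prompt: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",                          # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.choices[0].message.content
    # Assumes one suggested question per line; adjust parsing to the real output format.
    return [line.strip("-• 123.").strip() for line in text.splitlines() if line.strip()]

# Usage: questions = suggest_follow_ups(prompt)  # `prompt` built from the template above
```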
## Notes for cdc demo:
Why do we need such a tool?
- Ability to query the CDC's own documents and compliance material; ability to query across multiple documents, policies, rules, etc.
- Allow the user to ask a meaningful question and get a semantic answer.
Summary of document
1. Problem Solved: Streamlines access to critical information in dense healthcare documents (e.g., policies, clinical guidelines, compliance manuals).
2. How It Works:
• Upload PDFs.
• Ask a question in natural language.
• AI retrieves and answers with relevant document sections and citations.
3. Real-Time Information: Delivers precise answers instantly with references to the exact document and page.
4. Intelligent Follow-Up: Suggests clickable follow-up questions to deepen understanding and explore related topics.
5. Time-Saving: Reduces manual effort spent searching through documents, enabling healthcare staff to focus on patient care.
6. Improved Compliance: Helps agencies quickly retrieve regulatory requirements (e.g., HIPAA, GDPR) to minimize compliance risks.
7. Versatility: Handles various document types, from research studies to operational manuals, for diverse healthcare needs.
8. Transparency: Cites all document sources used to generate responses, ensuring confidence in the answers provided.
9. Cost-Efficiency: Reduces administrative overhead and operational delays, saving both time and money.
10. Value for Healthcare: Empowers staff with quick, reliable information access, improving decision-making and patient outcomes.
This summary focuses on how the system works, its benefits, and its value for healthcare agencies.
Future work:
> As a possible future development, we can implement a backing knowledge graph to provide a cross-document view into the domain by linking documents and concepts.
> This could allow users to get insights into the authority of individual answers, their logical and temporal relationships, etc.
1. Manual Searches: Staff manually sift through PDFs or use CTRL+F, which is time-consuming and ineffective for complex queries.
2. Fragmented Systems: Information is scattered across shared drives, emails, or physical files, making retrieval inefficient.
3. Reliance on Experts: Staff often depend on senior personnel for guidance, causing delays and bottlenecks.
4. Lack of Traceability: Answers from documents are often unverifiable, leading to potential errors.
5. Delayed Decisions: Finding critical information takes time, slowing down workflows and urgent responses.
Healthcare agencies handle vast amounts of critical information—policies, compliance manuals, and clinical guidelines—often buried in dense PDFs or scattered across systems. Finding the right information manually is time-consuming, error-prone, and inefficient, delaying decisions and increasing risks.
Imagine quickly verifying a compliance guideline or retrieving details for patient care—all by asking a question. Our AI-powered PDF Question-Answering Tool makes this possible, transforming document search into instant, accurate, and reliable answers. It’s a smarter way to streamline operations, ensure compliance, and focus on quality care.
| Aspect | Manual Process | AI-Powered Tool |
|---|---|---|
| Time to Find Info | Hours spent searching through PDFs. | Instant retrieval of answers. |
| Accuracy | Prone to human error and misinterpretation. | Provides accurate answers with citations. |
| Efficiency | Labor-intensive and repetitive. | Automated and seamless. |
| Transparency | Limited; difficult to trace sources. | Answers include source references. |
| Scalability | Becomes unmanageable with more documents. | Easily handles large repositories. |
| Decision-Making | Delayed by slow information access. | Accelerated with real-time answers. |
Call to Action
Let us help you bring AI-driven innovation to your healthcare agency. With this tool, you can transform how you manage knowledge, solve challenges, and create value for your patients and team. Would you like a demo?
============================== **Call with Data governance**======
We are dealing with an enterprise system with a variety of data sources, complex (and growing) data lineage, a variety of applications with new ones on the way (the AI initiative, duh), and a diverse and often poorly understood ("we know that someone uses Droid, but not who") group of users.
There seems to be no single view of the system, including ways to track permissions, roles, granted access, data lineage and lifecycle, and so on. Additionally, we deal with a mixed permission model and access granted in different places.
We're dealing with sensitive data (including, but not limited to, PII and financial sources) and plan to add more (including chat conversations and user prompts), including data types that have already caused significant fallout and reputation damage elsewhere (yeah Slack, talking about you: data leaks are free distributed data backups, but it is not clear if they're worth it LOL).
Data is getting duplicated over multiple systems, increasing the risk of stale or incorrect data and of unauthorized access due to incompatible permission models. Furthermore, some parts of the data ingestion process are tightly coupled to the stack (i.e. aggregation pipelines limiting our ability to move off MongoDB, or dependence on the Neo4j platform rather than just Cypher).
We are getting mixed signals regarding data governance, ranging from "not now" to "don't touch that", and possibly a lack of common understanding of the scope (in particular governance vs. management, team vs. enterprise).
Opinions
Given the broad scope and access, as well as the high risks, data governance cannot be an afterthought. We are dealing with an asset which, when properly managed and utilized, can support the enterprise mission and create significant value. However, if mismanaged, it can cause significant risk and impose unjustified costs.
While we might choose rather loose data management for data and AI products in pre-POC, POC and MVP stages, we should still understand the enterprise governance strategy and plan any developed products in a way that enables stricter and compliant management when a product is accepted for production.
This applies not only to the products themselves, but also to the technology stack we choose. This is particularly important given that we're in the middle of negotiations for procuring different systems (and are considering further migrations) and, while being "startupish", we don't have much flexibility due to the need to acquire legal, financial and opsec approvals.
It needs to be stated that governance doesn't imply any particular technological solution; rather, it defines overall goals, requirements and practices.
It would be optimal if an enterprise-level strategy was already in place, but given the scope of the projects we work on, we might want to communicate the need for such a strategy to be created.
## Evaluation and drift
Short term goals
To enable drift and other metrics monitoring we need to be able to capture different aspects of the system:
Code version
Model(s) version and parameters
Data version
Inputs and outputs
To ensure comprehensive monitoring of all components, it would be preferable to separate and be able to test individual components:
(system-prompt, user-prompt) -> cypher-query
(system-prompt, user-prompt, database-version) -> neo4j-execution-plan
(user-prompt, data) -> summary
These can be captured in the CI/CD pipeline through a programmatic API, or on a QA deployment with a REST API and a client triggered using webhooks or a similar approach.
All the information should be persisted for further inspection and analysis. We can use whatever tech is in place (MongoDB, S3-compatible storage, etc.).
Note: Ideally, we should use synthetic and / or anonymized data to reduce data-related risks for logged data.
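A minimal sketch of the per-run record this implies (field names are illustrative; the append-only file stands in for MongoDB / S3 or whatever storage ends up being used):
```python
# Sketch only: one evaluation/monitoring record per component invocation.
import datetime
import json
from dataclasses import dataclass, field, asdict

@dataclass
class EvalRecord:
    code_version: str        # e.g. git commit SHA
    model_version: str       # model name + parameter hash
    data_version: str        # versioned test database identifier
    component: str           # "cypher-query" | "execution-plan" | "summary"
    inputs: dict
    outputs: dict
    timestamp: str = field(default_factory=lambda: datetime.datetime.utcnow().isoformat())

def persist(record: EvalRecord, path: str = "eval_log.jsonl") -> None:
    # Append-only JSONL log; replace with a MongoDB insert or S3 put in the real pipeline.
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```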
Mid term goals
Extend question 'coverage' to get a better understanding of how the system changes over time. That includes:
Questions meaningful for stakeholders
Cypher queries written by an expert.
In the long term:
Expert descriptions of the data for a given question, if possible
Continuously, but more in the mid to long term:
Selecting and monitoring tools and metrics, with the ability (through a persistent "log") to retroactively evaluate the quality of the product.
This is likely to be required, given the changing landscape of tools and good practices (frameworks and solutions that were good yesterday, i.e. LangChain, are frowned upon today as no longer keeping pace).
Email
Need an API.
Question ==> Cypher query.
A database that can be controlled; fixed and populated for tests.
3 APIs: 1. Take a question from the user and return a Cypher query.
2. Take the query and return the execution plan from the database.
3. Take a user question plus example data and return the natural-language answer the system generates.
Separate the part that generates the query from the part that generates natural-language answers. Put them behind endpoints that depend only on their inputs.
========== email Draft
As we work on building our knowledge graph and Cypherbot, it’s essential to establish a solution for capturing and monitoring model drift and other critical system metrics.
The main idea here is to separate data collection and evaluation as well as introduce higher granularity data collection for evaluation and monitoring. Below is a concise outline of our goals to ensure we achieve this effectively:
Short-Term Goals
To enable robust drift and metric monitoring, we need to capture various aspects of the system:
• Code version
• Model(s) version and parameters
• Data version (i.e. versioned test database)
• Inputs and outputs
To ensure comprehensive monitoring, we should separate and test individual components, such as:
1. (system-prompt, user-prompt) -> cypher-query
2. (system-prompt, user-prompt, database-version) -> neo4j-execution-plan
3. (user-prompt, data) -> summary
These metrics can be captured via a CI/CD pipeline using programmatic APIs or through QA deployments with REST APIs triggered by webhooks.
• All data should be stored persistently (e.g., MongoDB, S3-compatible storage) for further inspection and analysis.
• Data-related risks can be mitigated by using synthetic or anonymized data for logged information.
The choice of a particular approach is mostly irrelevant as long as the other goals are satisfied, so let's choose the approach that is the least intrusive.
Mid-Term Goals
Extend question “coverage” to better understand system changes over time, including:
• Questions meaningful to stakeholders.
• Expert-written Cypher queries.
Long-Term Goals, starting ASAP
• Include expert descriptions of data for given questions, where possible.
• Continuously evaluate and improve tools and metrics with persistent logs, enabling retrospective quality analysis.
• Adapt to the evolving landscape of tools and frameworks, ensuring our solution remains future-proof.
I need a set of APIs to facilitate the following functionalities (a rough sketch follows the list):
1)User Question to Cypher Query
•An API endpoint that takes a user’s question as input and returns the corresponding Cypher query.
2)Cypher Query to Execution Plan
•An API endpoint that receives a Cypher query as input and retrieves the execution plan from a fixed, controlled database designed specifically for testing purposes.
3)User Question to Natural Language Answer Response
•An API endpoint that takes a user’s question, combines it with example data, and returns a natural language answer generated by the system.
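A rough sketch of that surface (FastAPI is used only for illustration; endpoint names, payload shapes, and the three stubbed components are placeholders to be replaced by the existing chatbot code):
```python
# Sketch only: three endpoints mirroring the functionalities listed above.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QuestionRequest(BaseModel):
    question: str

class CypherRequest(BaseModel):
    cypher: str

# Stand-ins for the existing chatbot components / test-database access.
def generate_cypher(question: str) -> str: ...
def explain_on_test_db(cypher: str) -> dict: ...
def generate_answer(question: str) -> str: ...

@app.post("/cypher")
def question_to_cypher(req: QuestionRequest) -> dict:
    # 1) user question -> generated Cypher query
    return {"cypher": generate_cypher(req.question)}

@app.post("/execution-plan")
def cypher_to_plan(req: CypherRequest) -> dict:
    # 2) Cypher query -> execution plan from the fixed, controlled test database
    return {"plan": explain_on_test_db(req.cypher)}

@app.post("/answer")
def question_to_answer(req: QuestionRequest) -> dict:
    # 3) user question (+ example data) -> natural-language answer
    return {"answer": generate_answer(req.question)}
```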
==================
```
Problem as discussed
Right now the chatbot is getting into a shape where individual components can be tested with limited isolation
From here we can easily write scripts that take predefined inputs and return outputs in some format
However
This needs example data. Currently dev tests seem to run on real data. If results are persisted that might be problematic, so we have to:
Get approval for storing the data, if it is considered confidential, sensitive, etc.
Get comparable synthetic data
We have to decide how, where and when data collection code can be executed:
Manually running the code from a dev machine might be OK right now, because code change frequency is low, but it won't scale. Also, the data still has to be stored somewhere (where? Mongo? S3? Droid / Snowflake?)
We can run GitHub Actions, but the storage problem above remains, and it may not be acceptable within the company.
Deploy separate lambdas with persistent storage for testing? Valentina mentioned it might be possible, but I'm not sure if data can be taken out of such an S3 bucket. If it can, we can move it to Droid for later analysis.
Other things:
Further refactoring of the code (extracting prompts, etc., as hinted in the MR comments)
Taking a look at the other repo, in case there was miscommunication with Valentina and it is still needed.
Working on common standards (as per Valentina's message)
```
=====================
Hey Mohamed! I think we should all start looking into customizing these Teams / Copilot chat bots:
https://learn.microsoft.com/en-us/copilot/microsoft-365/microsoft-365-copilot-overview
https://learn.microsoft.com/en-us/microsoftteams/platform/bots/how-to/teams-conversational-ai/teams-conversation-ai-overview?tabs=javascript%2Cjavascript1
https://nanddeepnachanblogs.com/posts/2024-08-20-ai-models-ms-teams-aoai/#:~:text=In%20the%20Azure%20OpenAI%20Studio,from%20the%20Settings%20%3E%20Keys%20menu
I have been looking into fine-tuning a model using our graph data, but you can take a look at that as well:
https://arxiv.org/html/2402.06764v1
https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=azu[…]nai%2Cturbo%2Cpython-new&pivots=programming-language-studio (edited)
=====
> Gentlemen I have two high priority tasks related to Teams that require your analysis. They are VERY topical to BizOps and Support Center. I have no idea if this will take 10 minutes or a couple of days so let me know:
You might have gotten the email about Disney migrating off Zoom to Teams. Teams CoPilot has voice transcription capabilities. Support Center, when triaging P1 and P2 incidents, often rely on Zoom calls to discuss the issue and get to next steps fast. They will have to do that in Teams and they want us to capture those conversations, if possible, and add it to the Slack (soon to be Teams) Incident Management thread and ServiceNow incident management data streams we now have. Please look into it.
Todd you shared the Teams AI Bot link...the AI part we can figure and I'm finishing our Azure GenAI access today (should get in a couple of days and then we can set up a service account as we have with Vertex AI). I think the much harder part is creating a Teams bot. Initially I don't care about the actual bot content as much as the process and pain points of creating one. Jason might be looking at this too but he's busy with many other things so let's take it on ourselves to investigate and keep him looped in. We need to move off our custom UI for all chatbots to a more consolidated interface ASAP. Those little things like security and cost need to be understood as well.
Both of these are extremely high-value asks. Please feel free to write our own tickets with the appropriate work estimate once you have an idea. Also note you might run into blockers as we always do :face_with_rolling_eyes:. Let me know if you have any questions. For now, after all this work for the demo, I'm going to lie down..
This seems, at least in basic form, to be something that is available for Teams meetings out of the box. In particular, [this document](https://support.microsoft.com/en-us/office/use-copilot-in-microsoft-teams-meetings-0bf9dd3c-96f7-44e2-8bb8-790bedf066b1) states:
> Copilot can summarize key discussion points—including who said what and where people are aligned or disagree—suggest action items, and answer any questions you have, all in real time during or after a meeting.
The following:
> The meeting must be transcribed to enable Copilot, unless the meeting organizer sets up Copilot only during the meeting.
implies that we need all relevant meetings to be transcribed in order to use this feature after a meeting.
After a transcribed meeting, the Recap tab should be available:
> Copilot can also be accessed from the Recap tab in the meeting chat. Open Copilot Icon Copilot to see the same conversation history with Copilot as the one accessed from the meeting chat. However, any previous conversation history with Copilot in a recurring meeting will no longer be available if a later meeting in the series is transcribed. If the transcript from a transcribed meeting is no longer available, all previous conversation history with Copilot will be removed.
So the starting point should be to evaluate whether such built-in capabilities are sufficient or not.
If they are not, and additional features are required, then we can investigate further.
It is worth noting that these seem to be premium features (?)
> Intelligent recap is available as part of Teams Premium, an add-on license that provides additional features to make Teams meetings more personalized, intelligent, and secure. To get access to Teams Premium, contact your IT admin.
>
>Intelligent recap is also available as part of the Microsoft 365 Copilot license.
So they might not be available for us?
If these are not available, cost-effective, or sufficient, then enforcing transcripts, using the Graph API to get notified about transcript availability
https://learn.microsoft.com/en-us/graph/teams-changenotifications-callrecording-and-calltranscript
and summarizing them with an external tool might be another option?
It seems like Todd already explored some of these options.
The goal is to capture Teams CoPilot voice transcriptions from P1/P2 incident calls and integrate them into Slack Incident Management threads and ServiceNow.
============
Overview
The goal of this initiative is to capture Microsoft Teams CoPilot voice transcriptions from P1/P2 incident calls and integrate them into Slack Incident Management threads and ServiceNow. This will enable Support Center teams to seamlessly access incident discussions, ensuring smooth incident resolution workflows.
Currently, Support Center teams use Zoom for P1/P2 incident triaging but will transition to Microsoft Teams. We aim to automate transcription capture from Teams meetings and feed the data into our incident management systems.
Understanding Microsoft Teams CoPilot Transcription Capabilities
Microsoft Teams CoPilot provides advanced meeting intelligence, including:
• Summarization of key discussion points, including participant contributions.
• Identification of action items and next steps.
• Real-time and post-meeting transcription-based insights.
However, to leverage these capabilities, transcription must be enabled for the meeting.
Key Findings from Research
1. CoPilot Transcription Prerequisites:
• Meetings must be transcribed to allow CoPilot to function after the meeting.
• CoPilot features, including Recap Tab, are only available if a transcript exists.
• Recap history for recurring meetings may not persist if later meetings in the series are transcribed.
2. Access to CoPilot Features:
• Intelligent Recap is a premium feature, requiring either:
• Teams Premium license (add-on).
• Microsoft 365 CoPilot license.
• If unavailable or costly, alternative approaches should be explored.
3. Alternative Approach via Graph API:
• If premium features are unavailable, we can enforce transcription and use Microsoft Graph API to:
• Get notifications on transcript availability.
• Extract transcribed text from Teams meetings.
• Integrate the extracted transcripts into Slack and ServiceNow.
• Microsoft Graph API documentation:
Teams Call Transcript Notifications.
https://learn.microsoft.com/en-us/graph/teams-changenotifications-callrecording-and-calltranscript
Proposed Implementation Plan
Step 1: Evaluate Built-in CoPilot Capabilities
• Determine if the organization has Teams Premium or Microsoft 365 CoPilot licenses.
• If available, assess whether Intelligent Recap and CoPilot summaries meet our needs.
Step 2: Automate Transcript Extraction via Graph API
If CoPilot transcription is not available or cost-prohibitive:
• Enforce transcription for all P1/P2 incident calls in Teams.
• Use Microsoft Graph API to capture transcriptions.
• Store and process transcripts using an external summarization tool.
Step 3: Integrate Transcripts into Slack & ServiceNow
• Format extracted transcripts into a structured summary (a rough sketch of the posting step follows this list).
• Automatically post summaries into:
• Slack Incident Management threads (or future Teams Incident Management).
• ServiceNow Incident records as part of incident documentation.
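A rough sketch of that posting step (the Slack incoming-webhook URL, ServiceNow instance name, credentials, and the `work_notes` field are placeholders to confirm with the owning teams):
```python
# Sketch only: push a formatted transcript summary to Slack and to a ServiceNow incident.
import requests

def post_summary(summary: str, slack_webhook_url: str,
                 snow_instance: str, incident_sys_id: str, snow_auth: tuple) -> None:
    # Slack incident-management thread (or the future Teams equivalent)
    slack = requests.post(slack_webhook_url, json={"text": f"*Incident call summary*\n{summary}"})
    slack.raise_for_status()
    # ServiceNow incident work notes via the Table API
    snow = requests.patch(
        f"https://{snow_instance}.service-now.com/api/now/table/incident/{incident_sys_id}",
        auth=snow_auth,                      # e.g. (user, password); to be replaced by the agreed auth method
        json={"work_notes": summary},
    )
    snow.raise_for_status()
```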
| Task | Owner | Status |
|---|---|---|
| Check Teams Premium & Copilot licensing availability | IT Admin | 🔄 Pending |
| Verify if built-in CoPilot capabilities meet our requirements | Engineering | 🔄 Pending |
| Investigate Graph API transcript retrieval | Dev Team | 🔄 In Progress |
| Develop integration pipeline for Slack & ServiceNow | DevOps | 🔄 Not Started |
| Define testing & validation process | QA Team | 🔄 Not Started |
Conclusion
• If Teams CoPilot transcription is available, we can leverage Intelligent Recap.
• If not available, we can use Graph API to extract transcriptions and process them externally.
• Integration with Slack and ServiceNow ensures incident discussions are logged and actionable.
This implementation will streamline P1/P2 incident resolution by preserving critical discussions and action points, reducing the need for redundant follow-ups.