# SIG Docs and AI
The following discusses the potential integration of AI tools within SIG Docs.
As part of the CNCF, SIG Docs will adhere to the guidelines that the CNCF and the Linux Foundation have stated for AI use within open source projects.
* See the Linux Foundation's [Guidance Regarding Use of Generative AI Tools for Open Source Software Development](https://www.linuxfoundation.org/legal/generative-ai)
* CNCF's guidelines are not currently available
### We'd like AI to help
- automate trivial tasks that can block content from being reviewed, edited, and published
- reduce the administrative workload on SIG Docs leaders
- make it easier to contribute to Kubernetes docs
### We don't want AI tools to
- create more work for reviewers and approvers to review/approve issues and PRs
- block community members' ability to contribute or get credit for their contributions
- introduce overly complex workflows for contribution
- violate copyright laws or the Kubernetes Code of Conduct
### Approach to AI
SIG Docs is open to using AI tools if:
1. the tools solve a problem for contributors, SIG leaders, and/or docs readers.
2. the tools have sufficient contributor support for implementation and ongoing maintenance.
At this time, there isn't enough of a push within the community to adopt AI across the SIG and the kubernetes/website repo. SIG Docs doesn't currently have the resources to research, implement, and maintain a large-scale integration of an AI tool into our workflows. If someone from the community comes forward with an idea for using AI, SIG Docs is happy to provide feedback and guidance on the project.
## Notes from Docs sprint
**Participants:**
- Jonas Rosland
- Nate Waddington
- Divya Mohan
- Rey Lejano
- Robert Reeves
**Discussion notes:**
* idea: Using AI to help review PRs against the style guide
* [Divya] How do we deal with AI contributions? If AI is making suggestions or writing content, how do we make sure that the input is from a contributor? How do we handle giving contributors credit if they didn't write the content? We shouldn't allow for prompt engineering.
* [Nate] It would be great if we could have an option for an AI to do a first pass through a review, and allow for the suggestions to be accepted right inside the PR. "Would you like an AI review pass on this?"
* Vale might be a good option to integrate with AI to do the style checking
* [Divya] This is going to take a lot of people, and continued administration of the tool. This will likely need a new subproject to create and run the tools after it's set up. AI integration at the level of getting an AI review in PRs will take a lot of time, money, and people. We should make sure that we have the staffing in place before we take on those scenarios.
* idea: Using AI to take meeting notes and posting transcripts.
* Jonas showed some useful workflows with faster-whisper, drawn from other open source projects' experiences
* The difference between this and the transcripts service
* We already get transcripts from tools we have in place (Zoom transcripts).
* We should check whether the Zoom transcripts are good enough, or whether we could get better ones
* [Robert] The CNCF can help us get funding and access to AI tools, and get support from vendors to set up the tools with us.
* [Divya] It makes sense to start off with smaller use cases to solve problems. Use successes there to get folks in the community excited to help tackle the bigger use cases.
* [Robert] Think about how we will know if we fail. +1 to starting small, and gut checking whether we are having success with it and whether it is worthwhile.
* SIG Docs is the first group in the Kubernetes community who is really looking into this.
* [Nate] We should try something; something small is great. That way we can see if it works, and if it fails, then it's a small failure. We should take a "research" frame of mind and approach to this. Any tool or workflow proposed and implemented is for testing purposes, and to discover more about whether it's working, whether it's useful, and whether it's sustainable.
## AI tools
This is a list of AI tools that have been recommended to SIG Docs for further research:
- claude.ai
- docsbot.ai
- writer.com
- jasper.ai
- fireflies.ai
- www.deepl.com
- ChatGPT
- transcription
  - otter.ai
  - faster-whisper - https://github.com/guillaumekln/faster-whisper
## Appendix
The following sections were created using Claude.ai and then edited by humans.
### Potential benefits of using AI
Here are some potential benefits and advantages of AI:
- Automation - AI can automate tedious, repetitive tasks allowing humans to focus on more meaningful work. This can increase efficiency and productivity.
- Insights - AI can uncover patterns and insights in data that humans may miss. These insights can inform better decision making.
- Prediction - AI models can analyze historical data to make predictions about the future for things like forecasting demand, detecting fraud, etc.
- Speed - AI systems can process and analyze data significantly faster than humans. This enables real-time responsiveness.
- Scalability - AI models can be replicated quickly and scaled cost-effectively, allowing expansion to large datasets and problems.
- Reliability - AI performs consistently without human limitations like fatigue or inattention. This makes certain tasks more reliable.
- Capabilities - AI can take in and process multiple modes of data like image, text, speech etc. expanding what is possible to analyze.
- Accessibility - AI assistants and chatbots make services more accessible 24/7 for those who need help or information.
- Creativity - AI can generate novel content, suggestions, and solutions that humans may not conceive of.
The key is ultimately applying AI thoughtfully and ethically where it can complement and enhance human abilities for the benefit of society.
### Potential problems with using AI
Here are some common problems with using AI today:
- Bias - AI systems can inherit and amplify biases if the data they are trained on contains biases. This can lead to issues with fairness, accuracy, and representation.
- Transparency - The inner workings of AI systems are often opaque and can be difficult to interpret. This "black box" nature makes it hard to understand why an AI arrived at a particular decision.
- Security - AI models can be vulnerable to various attacks like data poisoning, model stealing, and adversarial examples designed to fool the system.
- Data dependence - AI relies heavily on data. Insufficient data, low quality data, or unrepresentative data can result in poor system performance.
- Explainability - Being able to explain how an AI system arrived at a decision in an understandable way for humans is still difficult for complex models.
- Ethical concerns - Issues around inherent biases, potential job losses, privacy violations, and harmful applications require careful ethical considerations for AI.
- Environmental concerns - Training large language models (LLMs) uses a lot of resources, raising questions about how sustainable creating new models will be.
- Hype vs reality - There is often exaggerated marketing hype around AI that does not match its actual capabilities and limitations. Setting appropriate expectations is important.
- Cost - Developing and maintaining AI systems requires significant computational resources, engineering talent, and data infrastructure. The costs may be prohibitive for many organizations.
### Example use cases
### Using AI to assist with documentation formatting, linking, styling
As a Kubernetes documentation contributor, I want an AI assistant to scan my pull requests and provide friendly suggestions to improve adherence with style guidelines and written English standards so that I can improve my technical writing skills and contribute higher quality documentation.
As a Kubernetes documentation reviewer/approver, I want an AI assistant to scan pull requests and provide suggestions to improve adherence with style guidelines and written English standards to help speed up the time it takes to review PRs.
### Building chatbots to answer user questions
As a Kubernetes user, I want to be able to ask questions to a documentation chatbot in plain language so I can get instant answers to basic questions and resolve issues faster without needing to search through lengthy docs.
### Translating Kubernetes documentation across multiple languages
As a user who is not fluent in English, I want key Kubernetes documentation to be available in my native language so I can fully understand the concepts and instructions without facing a language barrier.
### Summarizing documentation into overviews
As a user new to Kubernetes, I want clear, concise overviews of complex topics that summarize key takeaways so I can get the big picture before diving into lengthy documentation details.
### Identifying documentation gaps and suggesting new content
As a Kubernetes documentation contributor, I want an AI tool to analyze documentation content and user questions to identify potential gaps where additional documentation is needed, and suggest topics that could fill those gaps with new content, so that we can proactively improve documentation coverage and reduce user confusion.
### Using AI to generate meeting summaries and notes
As a SIG Docs chair, I'd like to be able to generate meeting notes and summaries from community meeting recordings that can be posted to Slack to help keep the community updated on what happened in the meeting.
### Docs sprint topic: Exploring AI Solutions for Kubernetes Documentation: A SIG Docs Sprint
Join the Kubernetes SIG Docs community at [Kubernetes Contributor Summit](https://www.kubernetes.dev/events/2023/kcsna/) Chicago 2023 for our docs sprint focused on exploring AI solutions for open source documentation!
**When:** Monday, November 6th from 11am-4pm
**Where:** Michigan conference room (take the stairs up one floor from where the contributor kickoff room), [Hyatt Regency McCormick Place](https://www.hyatt.com/en-US/hotel/illinois/hyatt-regency-mccormick-place/chimc), Chicago.
Building on several lively discussions in community meetings and in [GitHub discussions](https://github.com/kubernetes/website/discussions/41986), we are focusing this year's docs sprint on the potential use cases of AI within SIG Docs. As AI capabilities advance, we want to discuss how these technologies could help improve and streamline Kubernetes documentation. Potential topics include:
* Using AI to assist with documentation formatting, linking, styling
* Building chatbots to answer user questions
* Translating content across multiple languages
* Summarizing documentation into overviews
* Identifying documentation gaps and suggesting new content
* Using AI to generate meeting summaries and notes
* Using AI to generate diagrams
The event will kick off with a short intro to AI, followed by open discussion, breakout groups, and hands-on exploration of AI documentation tools. We welcome contributors of all experience levels to share insights, brainstorm ideas, and help guide adoption of AI capabilities in our docs.
Come collaborate with the SIG Docs community to improve Kubernetes documentation for users worldwide!