<!-- markdownlint-disable-file MD024 -->
<!-- markdownlint-disable-file MD025 -->

# Accessibility Projects

> _THIS IS THE INITIAL LIST. FURTHER DOCUMENTATION CAN BE FOUND IN GITLAB._

A list of potential projects based on the initial accessibility discussion.

# Improving Vosk Speech-to-Text in Opencast

## Summary

Vosk is a free speech-to-text engine now included in Opencast. It's an easy way to get automated subtitles on a large scale. Right now it is included as-is, with no major modifications or improvements in post-processing. We hope that with very little effort, we can improve the experience quite a bit.

## Goals

Extend [`vosk-cli`](https://github.com/elan-ev/vosk-cli):

- Make line length and the number of lines configurable via a YAML configuration file
- Add an option to automatically detect the language using [LanguageTool](https://languagetool.org/)
- Add an option to automatically apply [LanguageTool](https://languagetool.org/) suggestions
- Add an option to recognize sentences
    - Maybe using [NLTK](https://nltk.org)

## Non Goals

- Do not modify Vosk directly
- Do not modify the Opencast integration

## Potential Risks

Recognizing sentence structures may be hard and may require additional work to get into the topic. This sub-task may fail if we don't find a good tutorial on something similar.

## Success Criteria

- A configuration file was added
- LanguageTool has been integrated

## Benefits/Rewards

- We improve the quality of subtitles
- We can re-use the work for BigBlueButton and other projects
- Less manual work correcting subtitles

## Resources/Budget

- 2–3 weeks of Nadine's time

## Initial Funding

- Part of AVVP accessibility

## Proposed By

- Clemens
- Lars
- Rüdiger

## Additional Notes

LanguageTool's language recognition:

```json
❯ curl -s 'http://lt.home.lkiesow.io:8080/v2/check?language=auto&text=Dies+ist+ein+deutscher+Testtext.' | jq .language
{
  "name": "German (Germany)",
  "code": "de-DE",
  "detectedLanguage": {
    "name": "German (Germany)",
    "code": "de-DE",
    "confidence": 0.9999928
  }
}
```

LanguageTool responds with rules and suggestions for recognized errors:

```json
❯ curl -s 'http://lt.home.lkiesow.io:8080/v2/check?language=de-DE&text=dies+ist+ein+deutscher+Testtext..' | jq '.matches[0].message'
"Dieser Satz fängt nicht mit einem großgeschriebenen Wort an."
```

# Database for Sharing Corrected Audio and Subtitle Material

## Summary

For a long-term improvement of speech recognition in an academic context, we need better source material to build training data from. We should look into whether we can contribute material to [Mozilla Common Voice](https://commonvoice.mozilla.org/en) or similar projects to improve recognition long-term.

## Goals

Research projects, options, and ways for Opencast community members to contribute material to a central database. Does such a database exist? How can we contribute? Will that have any effect on the models we can use?

## Non Goals

- We are not building a database
- We are not building new Vosk models
- Do not contribute anything (except for testing purposes)

## Potential Risks

Worst case, Common Voice does not want our contribution, and we find no other projects. But at least we then know that there are none.

## Success Criteria

- Find a good place to contribute to
- Provide a clear way for community members to do so

## Benefits/Rewards

While we will not see any effect short-term, we can hope that this will have a long-term effect on free language models, improving speech recognition.

## Resources/Budget

1 week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Clemens
- Lars
- Rüdiger

# Research How to Improve Vosk Language Models

## Summary

Vosk is a free speech-to-text engine now included in Opencast. It's an easy way to get automated subtitles on a large scale.
Vosk comes with support for many languages, but it's not tuned to an academic context. It does [allow for tuning the language models](https://alphacephei.com/vosk/lm), and we should look into how hard this is and whether we can easily do it.

## Goals

Research what we need to do to improve the language models for our context. How hard is it? What do we need for that?

## Non Goals

Building a new model is out of scope for this project, but may be addressed in a follow-up project after the evaluation.

## Potential Risks

No risks.

## Success Criteria

- We have found an easy way to tune language models with less than two weeks of work.

## Benefits/Rewards

Much of our content is highly domain-specific, while the default models target general language. If we have more precise language models, we can potentially increase the accuracy of the recognition.

## Resources/Budget

One week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Clemens
- Lars
- Rüdiger

# Packaging vosk-cli

## Summary

To make Vosk easy to use for the community, we should make sure to add `vosk-cli` and different language packs to Opencast's package repository.

## Goals

Add `vosk-cli` and language packs to the package repository.

## Non Goals

No modifications or improvements of the tools unless necessary for packaging.

## Potential Risks

No risks.

## Success Criteria

- In the RPM repository:
    - `vosk-cli`
    - English
    - German
    - Spanish
    - French

## Benefits/Rewards

This makes the speech recognition actually usable for the masses.

## Resources/Budget

- 20h of time
- Martin or Lars

## Initial Funding

- AVVP
- Basis Souver@n
- Community hours

## Proposed By

- Lars

# Research Good Tool-Independent Keyboard Shortcuts

## Summary

We have several tools with similar functionality which already support keyboard shortcuts or should support them in the future. We should make sure not to use different shortcuts for the same function.
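For orientation, a shared definition could look like the following sketch. The bindings follow widespread web-player conventions (Space, arrow keys, `m`, `f`), but they are illustrative placeholders, since the actual set is exactly what this project is meant to research:

```typescript
// Hypothetical shape for a tool-independent shortcut set. The keys
// follow widespread web-player conventions; the final set is the
// outcome of this research project, not this sketch.
type Action =
  | "play-pause" | "seek-back" | "seek-forward"
  | "volume-up" | "volume-down" | "mute" | "fullscreen" | "search";

const defaultShortcuts: Record<Action, string> = {
  "play-pause": "Space",      // many players additionally accept "k"
  "seek-back": "ArrowLeft",
  "seek-forward": "ArrowRight",
  "volume-up": "ArrowUp",
  "volume-down": "ArrowDown",
  "mute": "m",
  "fullscreen": "f",
  "search": "/",
};

// Every tool resolves the same key for the same action instead of
// hard-coding its own bindings.
function keyFor(action: Action): string {
  return defaultShortcuts[action];
}
```

Each tool would then resolve its key bindings from the shared definition rather than defining its own.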
## Goals

Research and define a set of ideally well-established keyboard shortcuts for common functionality (play/pause, seek, search, …) which we can then implement across different tools.

## Non Goals

Not actually implementing any shortcuts anywhere.

## Potential Risks

None.

## Success Criteria

A set of at least 10 commonly used shortcuts for web-based media tools.

## Benefits/Rewards

It's less confusing for users if they do not have to re-learn shortcuts with every tool they use.

## Resources/Budget

One week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Clemens
- Lars
- Rüdiger

# Adding Keyboard Shortcuts to Opencast Studio

## Summary

Opencast Studio, especially the built-in editor, does not support any keyboard shortcuts at the moment.

## Goals

Add keyboard shortcuts to Opencast Studio where it makes sense. This means first and foremost the player in the editor view.

## Non Goals

Not all functions need keyboard shortcuts.

## Potential Risks

Relatively risk-free.

## Success Criteria

- Play/pause via keyboard shortcut possible
- Setting trim marks via keyboard possible

## Benefits/Rewards

- Improved accessibility in Opencast Studio
- Developers learn how to include keyboard shortcuts

## Resources/Budget

1 week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Clemens
- Lars
- Rüdiger

# Switch from Travis CI to GitHub Actions in Opencast Studio

## Summary

Opencast Studio still uses Travis CI for automated tests and deployments. The Travis checks seem to be broken, which could cause quality issues if we do more development on Studio again. We switched to GitHub Actions everywhere else. We should do that here as well.
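As a sketch, a minimal workflow of the kind the port could start from. The file location is standard for GitHub Actions, but the job layout and npm scripts are assumptions about the Studio repository, not its actual setup:

```yaml
# .github/workflows/test.yml — minimal sketch; job layout and npm
# scripts are assumptions about the Studio repository.
name: Test
on:
  pull_request:
  push:
    branches: [master]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm test
```

The deployment for trusted pull requests would be a second job gated on the event source.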
## Goals

- Port tests and deployment from Travis CI to GitHub Actions

## Non Goals

- Do not add new tests

## Potential Risks

- None

## Success Criteria

- New pull requests are checked automatically
- Pull requests from trusted sources are deployed automatically

## Benefits/Rewards

- Will help us with all additional Studio developments

## Resources/Budget

1 week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Lars

# Implement Two-Pass loudnorm Audio Normalization

## Summary

Audio normalization can drastically improve the clarity of audio, especially if a video features multiple speakers. We can easily make use of FFmpeg's excellent built-in loudnorm implementation to add good audio normalization to Opencast.

## Goals

- Add a loudnorm workflow operation
- Implement a first-pass analysis step in the composer which can then be used in a regular encode operation

## Non Goals

- Not dealing with live content

## Potential Risks

Relatively risk-free.

## Success Criteria

- Working two-pass loudnorm operation in Opencast

## Benefits/Rewards

- Improved accessibility for visually impaired users
- Overall improvement for users

## Resources/Budget

2 weeks of Nadine's or Alex's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Lars

# Dark Mode in Studio

## Summary

Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause problems for users. This can be particularly bad if users have to switch between dark and bright interfaces. To prevent this, it would be great for Opencast Studio to have a dark mode. Even better would be if the mode were automatically triggered by the user's system being in dark mode.
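A minimal sketch of the resolution logic such a mode needs. The `prefers-color-scheme` media query is standard browser API; everything else (function and setting names) is hypothetical:

```typescript
// Sketch of "auto" theme resolution: an explicit user choice wins,
// otherwise follow the operating system. Names are hypothetical;
// only the prefers-color-scheme media query is standard browser API.
type ThemeSetting = "auto" | "light" | "dark";

function resolveTheme(
  setting: ThemeSetting,
  systemPrefersDark: boolean,
): "light" | "dark" {
  if (setting === "auto") {
    return systemPrefersDark ? "dark" : "light";
  }
  return setting;
}

// In the browser, the system preference and changes to it come from:
//
//   const mq = window.matchMedia("(prefers-color-scheme: dark)");
//   applyTheme(resolveTheme(savedSetting, mq.matches));
//   mq.addEventListener("change", e =>
//     applyTheme(resolveTheme(savedSetting, e.matches)));
```

Listening for the `change` event keeps Studio in sync when the system switches modes, e.g. on a day/night schedule.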
## Goals

- Implement a dark mode in Studio
- Users should be able to enable the mode in the settings
- By default, it should be set to `auto` to respect the user's system settings if possible
- Studio should remember the user's settings

## Non Goals

- No custom designs

## Potential Risks

- Low risk

## Success Criteria

- A dark mode should be selectable on all systems

## Benefits/Rewards

- Learn about detecting system dark mode
- Better support for visually impaired users

## Resources/Budget

1 week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Notes

- We can likely extend these pull requests:
    - https://github.com/elan-ev/opencast-studio/pull/902
    - https://github.com/elan-ev/opencast-studio/pull/909

## Proposed By

- Lars
- Clemens

# Dark Mode in the Opencast Editor

## Summary

Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause difficulties for users. This can be particularly bad if users have to switch between dark and bright interfaces. To prevent this, it would be great for the Opencast Editor to have a dark mode. Even better would be if the mode were automatically triggered by the user's system being in dark mode.

## Goals

- Implement a dark mode in the Opencast Editor
- Users should be able to enable the mode in the settings
- Create a settings view
- By default, it should be set to `auto` to respect the user's system settings if possible
- The editor should remember the user's settings in the browser's storage

## Non Goals

- No custom designs

## Potential Risks

- Low risk

## Success Criteria

- A dark mode should be selectable on all systems

## Benefits/Rewards

- Better support for visually impaired users

## Resources/Budget

2 weeks of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Lars
- Clemens

# Dark Mode in Tobira

## Summary

Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause difficulties for users.
This can be particularly bad if users have to switch between dark and bright interfaces. To prevent this, it would be great for Tobira to have a dark mode. Even better would be if the mode were automatically triggered by the user's system being in dark mode.

## Goals

- Implement a dark mode in Tobira
- Users should be able to enable the mode in the settings
- By default, it should be set to `auto` to respect the user's system settings if possible

## Non Goals

- No custom designs

## Potential Risks

- Low risk
- This may conflict with corporate identity settings

## Success Criteria

- A dark mode should be selectable on all systems

## Benefits/Rewards

- Better support for visually impaired users

## Resources/Budget

1 week of Julian's or Lukas' time.

## Initial Funding

- Part of AVVP Accessibility
- or: ETH

## Proposed By

- Lars
- Clemens

# Tab Navigation in Opencast Studio

## Summary

Right now, a lot of elements in Opencast Studio can only be reached via the mouse. Tabbing through active elements and activating them via keyboard should be possible. Additionally, sensible `title` and `aria-label` attributes should be used in the user interface.

## Goals

- Make sure main components can be reached via the tab key
- Make sure active elements are visually distinguishable
- Make sure elements can be activated via keyboard
- Make sure we have sensible attributes

## Non Goals

- No full accessibility analysis
- No keyboard shortcuts

## Potential Risks

Low risk.

## Success Criteria

A user can create a full recording without a mouse.

## Benefits/Rewards

Accessibility improvements for motor-impaired people and for visually impaired people using screen readers.

## Resources/Budget

1 week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Rüdiger
- Lukas

# Switch to WebVTT for Segments

## Summary

Opencast automatically recognizes video scenes and, with this, for example, slide changes in the presentation video.
The format Opencast stores these segments in is XML-based and maybe somewhat outdated. We could investigate using WebVTT instead. This would potentially allow us to use the editing tools we have for subtitles.

## Goals

- Investigate whether using WebVTT for segments is common and makes sense
- Make a decision about switching

## Non Goals

Updating Opencast to fully work with WebVTT segments (this could be a resulting follow-up project)

## Potential Risks

We decide that there is no benefit in switching.

## Success Criteria

We made a decision.

## Benefits/Rewards

- Potentially better support from modern players
- Overlapping tooling with the subtitle editor

## Resources/Budget

1 week of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Lars

# Implement Segment Editor

## Summary

Opencast automatically recognizes video scenes and, with this, for example, slide changes in the presentation video. While this works well, sometimes it would be nice if users could fix these, add new segments, change segments, and update segment descriptions.

This project is blocked by `Switch to WebVTT for Segments`, which may make it obsolete.

## Goals

- Implement a segment editor
- Allow for setting text information
- Make this part of the Opencast Editor

## Non Goals

- Not touching the admin interface editor
- No automatic re-publication

## Potential Risks

- Hard to predict the time budget. This may take longer.
- May require workflow changes

## Success Criteria

Users are able to change segments.

## Benefits/Rewards

- Segments can be more accurate
- Setting texts can help with in-video navigation

## Resources/Budget

7 weeks of Nadine's or Alex's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Rüdiger
- Clemens

# Detect Title in Slide Text

## Summary

Opencast automatically recognizes video scenes and, with this, for example, slide changes in the presentation video. It also extracts text from these slides.
It would be really helpful if we could identify the most significant parts of these slide texts, for example the slide title. This can help users navigate through the video.

## Goals

- Identify and extract slide titles

## Non Goals

- Only based on slide text. No additional text sources.
- Extraction only. This is not about presenting the results to users.

## Potential Risks

If we cannot make this accurate enough, the results may be useless.

## Success Criteria

We extracted at least 4 titles from the [Dual-Stream Demo](https://develop.opencast.org/play/ID-dual-stream-demo) video.

## Benefits/Rewards

- Segments can be more accurate
- Setting texts can help with in-video navigation

## Notes

- Take a look at the slide texts extracted via OCR
- Tesseract gives us size and accuracy information which we currently discard

## Resources/Budget

4 weeks of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Rüdiger

# Identify Key Subjects for Event

## Summary

With a lot of extracted text for each event from the OCR on slides and the speech recognition, we could use something like [tf-idf](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) to easily extract subjects or other keywords from these texts to enrich event metadata. This might work especially well if we compare the term frequency in the documents of one event to the frequency in all events of a series.

## Goals

Implement a tf-idf extraction workflow operation to extract keywords from metadata, slide texts and subtitles. We can add the subjects as Dublin Core subject metadata and maybe add the results with additional information as JSON attachments.

## Non Goals

- Not automatically updating old events
- Not showing this in any particular user interface

## Potential Risks

None.

## Success Criteria

Running this on a series of UOS or ETH material should result in recognizable subjects.
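The event-versus-series comparison described above can be sketched in a few lines. This is a deliberately naive version (simple tokenization, smoothed idf, no stop-word or language handling), not the workflow operation itself:

```typescript
// Naive tf-idf sketch: score the terms of one event's text against
// the other events of the same series. A real workflow operation
// would add stop-word filtering and language handling.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^\p{L}\p{N}]+/u)
    .filter((w) => w.length > 2);
}

function tfidf(eventText: string, seriesTexts: string[]): Map<string, number> {
  const terms = tokenize(eventText);
  const tf = new Map<string, number>();
  for (const t of terms) tf.set(t, (tf.get(t) ?? 0) + 1);

  const docs = seriesTexts.map((d) => new Set(tokenize(d)));
  const scores = new Map<string, number>();
  for (const [t, f] of tf) {
    const df = docs.filter((d) => d.has(t)).length;
    const idf = Math.log((1 + docs.length) / (1 + df)) + 1; // smoothed idf
    scores.set(t, (f / terms.length) * idf);
  }
  return scores;
}

// Top-k candidate subjects for an event:
function keywords(eventText: string, seriesTexts: string[], k = 5): string[] {
  return [...tfidf(eventText, seriesTexts).entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, k)
    .map(([t]) => t);
}
```

Terms that are frequent in one event but rare across the rest of the series rank highest, which is the "recognizable subjects" effect the success criteria ask for.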
## Benefits/Rewards

- Improves navigation to/in videos
- May help users understand content since they know what it is about

## Notes

- Talk to Rüdiger or Lars for information about tf-idf.

## Resources/Budget

2 weeks of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Lars

# Show Segments, Texts and Titles in Paella Player

## Summary

By analyzing the videos, we can extract a lot of useful information which we should properly present to users in the player to help them navigate through a video.

## Goals

Evaluate, add and/or improve in Paella Player:

- Showing segments
- Showing slide texts
- Showing segment titles

## Non Goals

- No improved extraction
- No other players
- Not in the admin interface

## Potential Risks

Potential conflict with what UPV and ETH do. Make sure to coordinate with them.

## Success Criteria

A user can use the slide texts of the Dual-Stream Demo for navigational help in Paella Player.

## Benefits/Rewards

- Improves navigation in videos
- May additionally help when using screen readers

## Resources/Budget

3 weeks of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- RnD

# Let Users Search Through Event Texts in Tobira

## Summary

By analyzing the videos, we can extract a lot of useful information about events in text form. It would be really helpful if we could use this for searching in Tobira.

## Goals

Make Tobira search through:

- Slide texts
- Segment titles
- Subtitles

## Non Goals

- Don't deal with temporal results within an event.

## Potential Risks

- Potential conflict with what UPV and ETH do. Make sure to coordinate with them.
- May slow down the search.

## Success Criteria

A user can find events based on speech recognition results.

## Benefits/Rewards

- Improves navigation to videos
- Potential linking of in-video results (exact positions) in future projects

## Resources/Budget

3 weeks of Lukas' or Julian's time.
## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- RnD

# Add Subtitle2Go Based Transcription Service

## Summary

[Subtitle2go](https://github.com/uhh-lt/subtitle2go) is an alternative to Vosk for free, automatic subtitling. It has good support for the German language. We could add it as a transcription service to Opencast.

## Goals

- Add a Subtitle2go based transcription service similar to the Vosk based service

## Non Goals

- Do not include Subtitle2go in Opencast
- Do not deal with the installation of Subtitle2go

## Potential Risks

- Unsure how stable the project is
- May be a German-only solution

## Success Criteria

- The Dual-Stream Demo successfully gets subtitles through Subtitle2go
- Subtitles resemble the actual speech

## Benefits/Rewards

- Potentially better results than Vosk for German

## Resources/Budget

4 weeks of Nadine's time.

## Initial Funding

- Part of AVVP Accessibility

## Proposed By

- Clemens

# Accessible Tooltips in Opencast Studio

## Summary

Tooltips (e.g. HTML `title` attributes) are great for adding additional explanations to control elements. They are – in combination with `aria-label` – also picked up by some accessibility tools to help users further. But non-persistent tooltips can unfortunately hinder accessibility. Users relying on a screen magnification tool may be unable to read the whole tooltip since only part of the actual screen is visible.
For example, here is screen magnification being used with default `title` attributes vs. using [tippy.js](https://atomiks.github.io/tippyjs/) for rendering [in the Opencast documentation](https://github.com/opencast/opencast/pull/3512):

- [accessible-tooltips.mp4](https://data.lkiesow.io/opencast/accessible-tooltips.mp4)

## Goals

- Use tippy.js to render persistent tooltips
- Make them adhere to the default theme

## Non Goals

- Not changing any tooltip texts and/or evaluating their usefulness
- No design evaluation of how exactly they should look

## Potential Risks

- Low risk

## Success Criteria

- Studio renders tooltips similar to Opencast's documentation

## Benefits/Rewards

- Improved accessibility
- Improved look and feel

## Resources/Budget

- 1 week of Nadine's time

## Initial Funding

- Part of AVVP accessibility

## Proposed By

- Lars

# Evaluate Vosk Punctuation Models

## Summary

In addition to the regular models, Vosk provides [punctuation models](https://alphacephei.com/vosk/models#punctuation-models) for both punctuation and case restoration. We should try them and compare the results to the default models for German and English.

## Goals

- Set up Opencast with Vosk
- Set up Vosk with regular and punctuation models
- Process a few recordings
- Compare the results

## Non Goals

- No formal evaluation
- No manual tweaking of models
- English and German only

## Potential Risks

- None

## Success Criteria

- Finding a model with similarly good results but with punctuation and case support

## Benefits/Rewards

- Improved readability
- Better subtitles

## Resources/Budget

- 1 week of Nadine's time

## Initial Funding

- Part of AVVP accessibility

## Proposed By

- Rüdiger
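## Notes

While the project explicitly rules out a formal evaluation, a quick word error rate (WER) check can make "similarly good results" concrete when comparing the two models' transcripts. A sketch that strips punctuation and case first, so only the recognized words are compared:

```typescript
// Word error rate via Levenshtein distance on words: a quick,
// informal way to compare a punctuation model's transcript against
// the default model's. Punctuation and case are stripped first so
// only the recognized words count.
function words(transcript: string): string[] {
  return transcript
    .toLowerCase()
    .replace(/[.,!?;:]/g, "")
    .split(/\s+/)
    .filter(Boolean);
}

function wer(reference: string, hypothesis: string): number {
  const r = words(reference);
  const h = words(hypothesis);
  // d[i][j] = edit distance between r[0..i) and h[0..j)
  const d = Array.from({ length: r.length + 1 }, (_, i) =>
    Array.from({ length: h.length + 1 }, (_, j) =>
      i === 0 ? j : j === 0 ? i : 0,
    ),
  );
  for (let i = 1; i <= r.length; i++) {
    for (let j = 1; j <= h.length; j++) {
      const sub = r[i - 1] === h[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + sub);
    }
  }
  return r.length ? d[r.length][h.length] / r.length : 0;
}
```

Similar WER with punctuation and case restored on top would be the "similarly good results" the success criteria describe.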