Accessibility Projects

THIS IS THE INITIAL LIST. FURTHER DOCUMENTATION CAN BE FOUND IN GITLAB.

A list of potential projects based on the initial accessibility discussion.

Improving Vosk Speech-to-Text in Opencast

Summary

Vosk is a free speech-to-text engine now included in Opencast. It's an easy way to get automated subtitles on a large scale.

Right now it is included as-is, with no major modifications or improvements in post-processing. We hope that, with very little effort, we can improve the experience quite a bit.

Goals

Extend vosk-cli

  • Make line length and number of lines configurable via YAML configuration file
  • Option to automatically detect the language using LanguageTool
  • Option to automatically apply LanguageTool suggestions
  • Option to recognize sentences
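
The configurable line length and number of lines could work roughly like this. This is a minimal sketch, not vosk-cli's actual code: the option names `line_length` and `max_lines` are hypothetical, and a plain dict stands in for the YAML file so the example needs no YAML library.

```python
import textwrap

# Hypothetical settings; in vosk-cli these would be read from the YAML file.
config = {"line_length": 20, "max_lines": 2}


def split_into_cues(text, line_length, max_lines):
    """Wrap text to line_length and group the lines into cues of at most max_lines."""
    lines = textwrap.wrap(text, width=line_length)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


text = "this is a short example of recognized speech without punctuation"
cues = split_into_cues(text, config["line_length"], config["max_lines"])

# Print each cue as a block of lines, separated by an empty line
for cue in cues:
    print("\n".join(cue), end="\n\n")
```

Timing is left out on purpose. A real implementation would also have to assign start and end times to each cue based on the word timestamps Vosk reports.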

Non Goals

  • Do not modify Vosk directly
  • Do not modify the Opencast integration

Potential Risks

Recognizing sentence structures may be hard and may require additional work to get into that topic. This sub-task may fail if we don't find a good tutorial on something similar.

Success Criteria

  • A configuration was added
  • LanguageTool has been integrated

Benefits/Rewards

  • We improve the quality of subtitles
  • We can re-use the work for BigBlueButton and other projects
  • Less manual work correcting subtitles

Resources/Budget

  • 2-3 weeks of Nadine's time

Initial Funding

  • Part of AVVP accessibility

Proposed By

  • Clemens
  • Lars
  • Rüdiger

Additional Notes

LanguageTool's language recognition:

❯ curl -s 'http://lt.home.lkiesow.io:8080/v2/check?language=auto&text=Dies+ist+ein+deutscher+Testtext.' | jq .language
{
  "name": "German (Germany)",
  "code": "de-DE",
  "detectedLanguage": {
    "name": "German (Germany)",
    "code": "de-DE",
    "confidence": 0.9999928
  }
}
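
Automatic language detection could be built on this response. The following is a minimal sketch under assumptions: the function name, the confidence threshold of 0.9 and the fallback behavior are illustrative choices, not part of vosk-cli or LanguageTool.

```python
def detected_language(response, min_confidence=0.9, fallback=None):
    """Return the language code LanguageTool detected, or fallback if it is unsure."""
    detected = response["language"]["detectedLanguage"]
    if detected["confidence"] >= min_confidence:
        return detected["code"]
    return fallback


# The `language` object from the example response above
response = {
    "language": {
        "name": "German (Germany)",
        "code": "de-DE",
        "detectedLanguage": {
            "name": "German (Germany)",
            "code": "de-DE",
            "confidence": 0.9999928,
        },
    }
}
print(detected_language(response))
```

Falling back to a configured default language when the confidence is low avoids applying rules for the wrong language to short or noisy transcripts.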

LanguageTool responds with rules and suggestions to recognized errors:

❯ curl -s 'http://lt.home.lkiesow.io:8080/v2/check?language=de-DE&text=dies+ist+ein+deutscher+Testtext..' | jq '.matches[0].message'
"Dieser Satz fängt nicht mit einem großgeschriebenen Wort an."

Database for Sharing Corrected Audio and Subtitle Material

Summary

For long-term improvement of speech recognition in an academic context, we need better source material to build data from. We should look into whether we can contribute material to Mozilla Common Voice or similar projects to improve recognition in the long run.

Goals

Research projects, options and ways for Opencast community members to contribute material to a central database. Does such a database exist? How can we contribute? Will that have any effect on the data models we can use?

Non Goals

  • We are not building a database
  • We are not building new Vosk models
  • Do not contribute anything (except for testing purposes)

Potential Risks

Worst case, Common Voice does not want our contribution, and we find no projects. But at least, we then know that there are none.

Success Criteria

  • Find a good place to contribute to
  • Provide a clear way for community members to do so

Benefits/Rewards

While we will not see any effect short-term, we can hope that this will have a long-term effect on free language models, improving speech recognition.

Resources/Budget

1 week of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Clemens
  • Lars
  • Rüdiger

Research how to improve Vosk language models

Summary

Vosk is a free speech-to-text engine now included in Opencast. It's an easy way to get automated subtitles on a large scale.

Vosk comes with support for many languages, but it's not tuned to an academic context. It does allow for tuning the language models, and we should look into how hard this is and whether we can easily do it.

Goals

Research what we need to do to improve the language models for our context. How hard is it? What do we need to do that?

Non Goals

Building a new model is out of scope for this project, but may be addressed after evaluation in a follow-up project.

Potential Risks

No risks.

Success Criteria

  • We have found an easy way to tune language models with less than two weeks of work.

Benefits/Rewards

We have lots of specific content where we actually know the general type of content before we analyze it. If we have more precise language models, we can potentially increase the accuracy of the recognition.

Resources/Budget

One week of Nadine's time

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Clemens
  • Lars
  • Rüdiger

Packaging vosk-cli

Summary

To make Vosk easy to use for the community, we should make sure to add vosk-cli and different languages to Opencast's package repository.

Goals

Add vosk-cli and language packs to the package repository.

Non Goals

No modifications or improvements of the tools unless necessary for packaging.

Potential Risks

No risks.

Success Criteria

  • In the RPM repository:
    • vosk-cli
    • English
    • German
    • Spanish
    • French

Benefits/Rewards

This makes the speech recognition actually usable for the masses.

Resources/Budget

  • 20h of time
  • Martin or Lars

Initial Funding

  • AVVP
  • Basis Souver@n
  • Community hours

Proposed By

  • Lars

Research good tool-independent keyboard shortcuts

Summary

We have several tools with similar functionality which already support keyboard shortcuts or should support them in the future. We should make sure to not use different shortcuts for the same function.

Goals

Research and define a set of, ideally well-established, keyboard shortcuts for common functionality (play/pause, seek, search, …) that we can then implement across different tools.

Non Goals

Not actually implementing any shortcuts anywhere.

Potential Risks

None.

Success Criteria

A set of at least 10 commonly used shortcuts for web-based media tools.

Benefits/Rewards

It's less confusing for users if they do not have to re-learn shortcuts with every tool they use.

Resources/Budget

One week of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Clemens
  • Lars
  • Rüdiger

Adding Keyboard Shortcuts to Opencast Studio

Summary

Opencast Studio, especially the built-in editor, does not support any keyboard shortcuts at the moment.

Goals

Add keyboard shortcuts to Opencast Studio where it makes sense. This means, first and foremost, the player in the editor view.

Non Goals

Not all functions need keyboard shortcuts.

Potential Risks

Relatively risk-free.

Success Criteria

  • Play/pause with keyboard shortcut possible
  • Setting trim marks via keyboard possible

Benefits/Rewards

  • Improved accessibility in Opencast Studio
  • Developers learn how to include keyboard shortcuts

Resources/Budget

1 week of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Clemens
  • Lars
  • Rüdiger

Switch from Travis CI to GitHub Actions in Opencast Studio

Summary

Opencast Studio still uses Travis CI for automated tests and deployments. The Travis checks seem to be broken, which could cause quality issues if we do more development on Studio again. We switched to GitHub Actions everywhere else and should do that here as well.

Goals

  • Port tests and deployment from Travis CI to GitHub Actions

Non Goals

  • Do not add new tests

Potential Risks

  • None

Success Criteria

  • New pull requests are checked automatically
  • Pull requests from trusted sources are deployed automatically

Benefits/Rewards

  • Will help us for all additional Studio developments

Resources/Budget

1 week of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Lars

Implement Two-Pass loudnorm Audio Normalization

Summary

Audio normalization can drastically improve the clarity of audio, especially if a video is created by multiple speakers. We can easily make use of FFmpeg's excellent internal loudnorm implementation to add good audio normalization to Opencast.

Goals

  • Add a loudnorm workflow operation
  • Implement a first-pass analysis step in the composer which can then be used in a regular encode operation
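
The two-pass approach can be sketched as follows. With `print_format=json`, FFmpeg's loudnorm filter prints its measurements as JSON at the end of the analysis pass, and those values are then passed back to the filter for the actual encode. This is a sketch under assumptions and not Opencast's implementation: the function names are hypothetical, the targets (I=-16, TP=-1.5, LRA=11) are illustrative, and the sample log is hand-written in the shape of loudnorm's output.

```python
import json

# First pass (analysis only, no output file), for example:
#   ffmpeg -i in.mp4 -af loudnorm=I=-16:TP=-1.5:LRA=11:print_format=json -f null -


def parse_loudnorm_stats(ffmpeg_stderr):
    """Extract the JSON object that loudnorm prints at the end of the first pass."""
    start = ffmpeg_stderr.rindex("{")
    end = ffmpeg_stderr.rindex("}") + 1
    return json.loads(ffmpeg_stderr[start:end])


def second_pass_filter(stats, i=-16.0, tp=-1.5, lra=11.0):
    """Build the loudnorm filter string for the second pass from the measured values."""
    return (
        f"loudnorm=I={i}:TP={tp}:LRA={lra}"
        f":measured_I={stats['input_i']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
        ":linear=true"
    )


# Hand-written sample in the shape of the first-pass log (values are made up)
sample_log = """[Parsed_loudnorm_0 @ 0x0]
{
    "input_i" : "-27.61",
    "input_tp" : "-4.47",
    "input_lra" : "18.06",
    "input_thresh" : "-39.20",
    "target_offset" : "0.58"
}
"""
audio_filter = second_pass_filter(parse_loudnorm_stats(sample_log))
print(audio_filter)
```

The resulting string would be passed to the regular encode via `-af`, which is why a separate analysis step in the composer fits well with existing encoding operations.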

Non Goals

  • Not dealing with live content

Potential Risks

Relatively risk-free.

Success Criteria

  • Working two-pass loudnorm operation in Opencast

Benefits/Rewards

  • Improved accessibility for visually impaired users
  • Overall improvement for users

Resources/Budget

2 weeks of Nadine's or Alex's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Lars

Dark Mode in Studio

Summary

Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause problems for users. This can be particularly bad if users have to switch between dark and bright interfaces.

To prevent this, it would be great for Opencast Studio to have a dark mode. It would be even better if the mode were triggered automatically when the user's system is in dark mode.

Goals

  • Implement dark mode in Studio
  • Users should be able to enable the mode in the settings
  • By default, it should be set to auto to respect the user's system settings if possible
  • Studio should remember the user's settings

Non Goals

  • No custom designs

Potential Risks

  • Low risk

Success Criteria

  • A dark mode should be selectable on all systems

Benefits/Rewards

  • Learn about detecting system dark mode
  • Better supporting visually impaired users

Resources/Budget

1 week of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Lars
  • Clemens

Dark Mode in the Opencast Editor

Summary

Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause difficulties for users. This can be particularly bad if users have to switch between dark and bright interfaces.

To prevent this, it would be great for the Opencast Editor to have a dark mode. It would be even better if the mode were triggered automatically when the user's system is in dark mode.

Goals

  • Implement dark mode in the Opencast Editor
  • Users should be able to enable the mode in the settings
    • Create a settings view
  • By default, it should be set to auto to respect the user's system settings if possible
  • The editor should remember the user's settings in the browser's storage

Non Goals

  • No custom designs

Potential Risks

  • Low risk

Success Criteria

  • A dark mode should be selectable on all systems

Benefits/Rewards

  • Better supporting visually impaired users

Resources/Budget

2 weeks of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Lars
  • Clemens

Dark Mode in Tobira

Summary

Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause difficulties for users. This can be particularly bad if users have to switch between dark and bright interfaces.

To prevent this, it would be great for Tobira to have a dark mode. It would be even better if the mode were triggered automatically when the user's system is in dark mode.

Goals

  • Implement a dark mode in Tobira
  • Users should be able to enable the mode in the settings
  • By default, it should be set to auto to respect the user's system settings if possible

Non Goals

  • No custom designs

Potential Risks

  • Low risk
  • This may conflict with corporate identity settings

Success Criteria

  • A dark mode should be selectable on all systems

Benefits/Rewards

  • Better supporting visually impaired users

Resources/Budget

1 week of Julian's or Lukas' time.

Initial Funding

  • Part of AVVP Accessibility
  • or: ETH

Proposed By

  • Lars
  • Clemens

Tab Navigation in Opencast Studio

Summary

Right now, a lot of elements can only be reached via the mouse in Opencast Studio. Tabbing through active elements and activating them via keyboard should be possible.

Additionally, sensible title and aria-label attributes should be used in the user interface.

Goals

  • Make sure main components can be reached via tab key
  • Make sure active elements are visually distinguishable
  • Make sure elements can be activated via keyboard
  • Make sure we have sensible attributes

Non Goals

  • No full accessibility analysis
  • No keyboard shortcuts

Potential Risks

Low risk

Success Criteria

A user can create a full recording with no mouse.

Benefits/Rewards

Accessibility improvement for motor-impaired people and for visually impaired people using screen readers.

Resources/Budget

1 week of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Rüdiger
  • Lukas

Switch to WebVTT for Segments

Summary

Opencast automatically recognizes video scenes and with this, for example, slide changes in the presentation video.

The format Opencast stores these segments in is XML-based and maybe somewhat outdated. We could investigate using WebVTT instead. This would potentially allow us to use the editing tools we have for subtitles.
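
To illustrate what segments could look like in WebVTT, here is a minimal sketch that renders a list of segments as cues. The `(start, end, text)` tuple layout and the function names are assumptions made for this example. It does not reflect Opencast's current segment format.

```python
def format_timestamp(seconds):
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segments_to_webvtt(segments):
    """Render (start, end, text) tuples as a WebVTT document, one cue per segment."""
    parts = ["WEBVTT"]
    for start, end, text in segments:
        parts.append(f"{format_timestamp(start)} --> {format_timestamp(end)}\n{text}")
    # Blocks in a WebVTT file are separated by blank lines
    return "\n\n".join(parts) + "\n"


segments = [(0.0, 12.5, "Introduction"), (12.5, 95.0, "Agenda")]
print(segments_to_webvtt(segments))
```

Since the result is an ordinary WebVTT file, it could in principle be opened in the same tools we use for subtitles, which is the main motivation for this investigation.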

Goals

  • Investigate whether using WebVTT for segments is common and makes sense
  • Make a decision about switching

Non Goals

Updating Opencast to fully work with WebVTT segments (this could be a follow-up project).

Potential Risks

We decide that there is no benefit in switching.

Success Criteria

We made a decision.

Benefits/Rewards

  • Potentially better support from modern players.
  • Overlapping tooling from the subtitle editor.

Resources/Budget

1 week of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Lars

Implement Segment Editor

Summary

Opencast automatically recognizes video scenes and with this, for example, slide changes in the presentation video.

While this works well, it would sometimes be nice if users could fix these segments, add new ones, change existing ones and update segment descriptions.

Blocked by Switch to WebVTT for Segments which may make this obsolete.

Goals

  • Implement a segment editor
  • Allow for setting text information
  • Make this part of the Opencast Editor

Non Goals

  • Not touching the admin interface editor
  • No automatic re-publication

Potential Risks

  • Hard to predict time budget. May take longer.
  • May require workflow changes

Success Criteria

Users are able to change segments

Benefits/Rewards

  • Segments can be more accurate
  • Setting texts can help with in-video navigation

Resources/Budget

7 weeks of Nadine's or Alex's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Rüdiger
  • Clemens

Detect Title in Slide Text

Summary

Opencast automatically recognizes video scenes and with this, for example, slide changes in the presentation video. It also extracts text from these slides.

It would be really helpful if we could identify the most significant parts of these slide texts like, for example, the slide title. This can help users navigate through the video.

Goals

  • Identify and extract slide title

Non Goals

  • Only based on slide text
  • No additional text sources
  • Extraction only. This is not about presenting the results to users.

Potential Risks

If we cannot make this accurate enough, results may be useless.

Success Criteria

We extracted at least 4 titles from the Dual-Stream Demo video.

Benefits/Rewards

  • Segments can be more accurate
  • Setting texts can help with in-video navigation

Notes

  • Take a look at slide texts extracted via OCR
  • Tesseract will give us size and accuracy information we currently discard
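
One way the size and accuracy information could be used is a simple heuristic: pick the tallest, confidently recognized line in the upper part of the slide. The sketch below only illustrates that idea. The `OcrLine` fields, the thresholds and the sample values are hypothetical, and whether the heuristic is accurate enough is exactly what this project would need to find out.

```python
from dataclasses import dataclass


@dataclass
class OcrLine:
    """One recognized text line; the field names are hypothetical."""
    text: str
    top: int           # distance from the top of the slide in pixels
    height: int        # height of the line's bounding box in pixels
    confidence: float  # recognition confidence, 0 to 100


def guess_title(lines, slide_height, min_confidence=60.0, top_fraction=0.4):
    """Return the tallest confident line in the upper part of the slide, or None."""
    candidates = [
        line for line in lines
        if line.confidence >= min_confidence
        and line.top < slide_height * top_fraction
        and line.text.strip()
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda line: line.height).text.strip()


# Made-up OCR result for a 720 pixel high slide
lines = [
    OcrLine("|||", top=10, height=60, confidence=12.0),  # noise, low confidence
    OcrLine("Opencast Architecture", top=40, height=48, confidence=93.0),
    OcrLine("Services communicate via REST", top=200, height=22, confidence=90.0),
    OcrLine("Page 3", top=700, height=50, confidence=95.0),  # footer
]
print(guess_title(lines, slide_height=720))
```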

Resources/Budget

4 weeks of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Rüdiger

Identify Key Subjects for Event

Summary

Having a lot of extracted text for each event with the OCR on slides and the speech recognition, we could use something like tf-idf to easily extract subjects or other keywords from these texts to enrich event metadata.

This might work especially well if we compare the term frequency in the documents of one event to the frequency across all events of a series.
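
The idea can be sketched in a few lines of standard-library Python, treating the combined text of each event as one document and the series as the corpus. This is a minimal sketch, assuming one common tf-idf variant (relative term frequency times the logarithm of the inverse document frequency). The function names and the toy data are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute tf-idf scores; documents maps an event id to its list of tokens."""
    doc_count = len(documents)
    # Number of documents each term appears in
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))

    scores = {}
    for event_id, tokens in documents.items():
        counts = Counter(tokens)
        scores[event_id] = {
            term: (count / len(tokens)) * math.log(doc_count / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def top_terms(term_scores, n=3):
    """Return the n highest scoring terms of one event."""
    ranked = sorted(term_scores.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _ in ranked[:n]]


# Toy series with three events
series = {
    "lecture-1": "the derivative of a function the derivative rule".split(),
    "lecture-2": "the integral of a function the integral rule".split(),
    "lecture-3": "the matrix of a linear function the matrix".split(),
}
scores = tf_idf(series)
for event_id, term_scores in scores.items():
    print(event_id, top_terms(term_scores, n=1))
```

Words that occur in every event of the series, such as "the" or "function" here, get a score of zero, so terms specific to a single event rise to the top without a hand-maintained stop word list.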

Goals

Implement a tf-idf extraction workflow operation to extract keywords from metadata, slide texts and subtitles.

We can add the subjects as Dublin Core subject metadata and maybe add the results with additional information as JSON attachments.

Non Goals

  • Not automatically updating old events
  • Not showing this in any particular user interface

Potential Risks

None

Success Criteria

Running this on a series of UOS or ETH material should result in recognizable subjects.

Benefits/Rewards

  • Improves navigation to/in videos
  • May help users to understand content since they know what it is about

Notes

  • Talk to Rüdiger or Lars for information about tf-idf.

Resources/Budget

2 weeks of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Lars

Show Segments, Texts and Titles in Paella Player

Summary

Analyzing the videos, we can extract a lot of useful information we should properly present to users in the player to help them navigate through a video.

Goals

Evaluate, add and/or improve in Paella Player:

  • Showing segments
  • Showing slide texts
  • Showing segment titles

Non Goals

  • Improved extraction
  • No other player
  • Not in the admin interface

Potential Risks

Potential conflict with what UPV and ETH do. Make sure to coordinate with them.

Success Criteria

A user can use the slide texts of the Dual-Stream Demo for navigational help in Paella Player.

Benefits/Rewards

  • Improves navigation in videos
  • May additionally help when using screen readers

Resources/Budget

3 weeks of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • RnD

Let Users Search Through Event Texts in Tobira

Summary

Analyzing the videos, we can extract a lot of useful information in text form about events. It would be really helpful if we could use this information for searching in Tobira.

Goals

Make Tobira search through:

  • Slide texts
  • Segment titles
  • Subtitles

Non Goals

  • Don't deal with temporal results within an event.

Potential Risks

  • Potential conflict with what UPV and ETH do. Make sure to coordinate with them.
  • May cause the search to slow down.

Success Criteria

A user can find events based on speech recognition.

Benefits/Rewards

  • Improves navigation to videos
  • Potential linking of in-video results (exact position) in future projects

Resources/Budget

3 weeks of Lukas' or Julian's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • RnD

Add Subtitle2go Based Transcription Service

Summary

Subtitle2go is an alternative to Vosk for free, automatic subtitling. It has good support for the German language. We could add it as a transcription service to Opencast.

Goals

  • Add Subtitle2go based transcription service similar to the Vosk based service

Non Goals

  • Do not include Subtitle2go in Opencast
  • Do not deal with the installation of Subtitle2go

Potential Risks

  • Unsure how stable that project is
  • May be a German-only solution

Success Criteria

  • The Dual-Stream demo successfully gets subtitles through Subtitle2go
  • Subtitles resemble the actual speech

Benefits/Rewards

  • Potentially better results than Vosk for German

Resources/Budget

4 weeks of Nadine's time.

Initial Funding

  • Part of AVVP Accessibility

Proposed By

  • Clemens

Accessible Tooltips in Opencast Studio

Summary

Tooltips (e.g. HTML title attributes) are great for adding additional explanations to control elements. In combination with aria-label, they are also picked up by some accessibility tools to help users further.

But non-persistent tooltips can unfortunately hinder accessibility. Users relying on screen magnification tools may be unable to read the whole tooltip since only part of the actual screen is visible.

For example, here is screen magnification being used with default title attributes vs using tippy.js for rendering in the Opencast documentation:

Goals

  • Use tippy.js to render persistent tooltips
  • Make them adhere to the default theme

Non Goals

  • Not changing any tooltip texts and/or evaluating their usefulness
  • No design evaluation of how exactly they should look

Potential Risks

  • Low risk

Success Criteria

  • Studio renders tooltips similar to Opencast's documentation

Benefits/Rewards

  • Improved accessibility
  • Improved look and feel

Resources/Budget

  • 1 week of Nadine's time

Initial Funding

  • Part of AVVP high-availability

Proposed By

  • Lars

Evaluate Vosk Punctuation Models

Summary

In addition to the regular models, Vosk provides punctuation models for both punctuation and case restoration. We should try them and compare the results to the default models for German and English.

Goals

  • Set up Opencast with Vosk
  • Set up Vosk with regular and punctuation models
  • Process a few recordings
  • Compare the results

Non Goals

  • No formal evaluation
  • No manual tweaking of models
  • English and German only

Potential Risks

  • None

Success Criteria

  • Finding models with similarly good results but with punctuation and case support

Benefits/Rewards

  • Improved readability
  • Better subtitles

Resources/Budget

  • 1 week of Nadine's time

Initial Funding

  • Part of AVVP accessibility

Proposed By

  • Rüdiger