THIS IS THE INITIAL LIST. FURTHER DOCUMENTATION CAN BE FOUND IN GITLAB.
A list of potential projects based on the initial accessibility discussion.
Vosk is a free speech-to-text engine now included in Opencast. It's an easy way to get automated subtitles on a large scale.
Right now it is included as-is, with no major modifications or improvements in post-processing. We hope that with very little effort, we can improve the experience quite a bit.
Extend vosk-cli
Recognizing sentence structures may be hard, and getting into that topic may require additional work. This sub-task may fail if we don't find a good tutorial on something similar.
LanguageTool can automatically recognize the language of a text, and it responds with rules and suggestions for recognized errors.
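As a rough sketch of what such an integration could look like, the following queries LanguageTool's public v2/check HTTP endpoint and prints the detected language together with rule matches and suggestions (error handling is omitted):

```typescript
// Minimal sketch: send text to LanguageTool's public v2/check endpoint.
// "language=auto" asks LanguageTool to detect the language itself.
async function checkText(text: string): Promise<void> {
  const response = await fetch("https://api.languagetool.org/v2/check", {
    method: "POST",
    body: new URLSearchParams({ text, language: "auto" }),
  });
  const result = await response.json();
  console.log("Detected language:", result.language.detectedLanguage.name);
  for (const match of result.matches) {
    const suggestions = match.replacements.map((r: { value: string }) => r.value);
    console.log(`${match.rule.id}: ${match.message}`, suggestions.join(", "));
  }
}

checkText("this is an test sentence without punctuation");
```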
For a long-term improvement of speech recognition in an academic context, we need better source material to build data from. We should look into whether we can contribute material to Mozilla Common Voice or similar projects to improve recognition long-term.
Research projects, options, and ways for Opencast community members to contribute material to a central database. Does such a database exist? How can we contribute? Will that have any effect on the language models we can use?
Worst case, Common Voice does not want our contributions and we find no other projects. But at least we then know that there are none.
While we will not see any effect short-term, we can hope that this will have a long-term effect on free language models, improving speech recognition.
1 week of Nadine's time.
Vosk is a free speech-to-text engine now included in Opencast. It's an easy way to get automated subtitles on a large scale.
Vosk comes with support for many languages, but it's not tuned to an academic context. It does allow for tuning the language models, though, and we should look into how hard this is and whether we can easily do it.
Research what we need to do to improve the language models for our context. How hard is it? What do we need for that?
Building a new model is out of scope for this project, but may be addressed after evaluation in a follow-up project.
No risks.
We have lots of domain-specific content, while the available models are built for general content. If we have more precise language models, we can potentially increase the accuracy of the recognition.
One week of Nadine's time
To make Vosk easy to use for the community, we should make sure to add vosk-cli and different languages to Opencast's package repository.
Add vosk-cli and language packs to the package repository.
No modifications or improvements of the tools unless necessary for packaging.
No risks.
Having vosk-cli packaged makes the speech recognition actually usable for the masses.
We have several tools with similar functionality which already support keyboard shortcuts or should support them in the future. We should make sure not to use different shortcuts for the same function.
Research and define a set of ideally well-established keyboard shortcuts for common functionality (play/pause, seek, search, …) which we can then implement across different tools.
Not actually implementing any shortcuts anywhere.
None.
A set of at least 10 commonly used shortcuts for web-based media tools.
It's less confusing for users if they do not have to re-learn shortcuts with every tool they use.
One week of Nadine's time.
Opencast Studio, especially the built-in editor, does not support any keyboard shortcuts at the moment.
Add keyboard shortcuts to Opencast Studio where it makes sense. This means first and foremost adding them to the player on the editor view.
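A minimal sketch of what such shortcuts could look like, using well-established media bindings; the #preview-player id is hypothetical:

```typescript
// Sketch: common media shortcuts for the editor's preview player.
const video = document.querySelector<HTMLVideoElement>("#preview-player");

document.addEventListener("keydown", (event) => {
  // Don't steal keystrokes from text inputs.
  if (!video || event.target instanceof HTMLInputElement) {
    return;
  }
  switch (event.key) {
    case " ": // Space: toggle play/pause
      event.preventDefault();
      if (video.paused) {
        video.play();
      } else {
        video.pause();
      }
      break;
    case "ArrowLeft": // seek back 5 seconds
      video.currentTime = Math.max(0, video.currentTime - 5);
      break;
    case "ArrowRight": // seek forward 5 seconds
      video.currentTime = Math.min(video.duration, video.currentTime + 5);
      break;
  }
});
```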
Not all functions need keyboard shortcuts.
Relatively risk-free.
1 week of Nadine's time.
Opencast Studio still uses Travis CI for automated tests and deployments. The Travis checks seem to be broken, which could cause quality issues if we do more development on Studio again. We have switched to GitHub Actions everywhere else and should do that here as well.
1 week of Nadine's time.
Audio normalization can drastically improve the clarity of audio, especially if a video is created by multiple speakers. We can easily make use of FFmpeg's excellent built-in loudnorm filter to add good audio normalization to Opencast.
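A minimal sketch of what the normalization step would eventually run; the loudnorm targets shown are commonly recommended streaming values rather than settled Opencast defaults, and the file names are placeholders:

```typescript
// Sketch: run FFmpeg's loudnorm filter from Node to normalize audio.
import { execFile } from "node:child_process";

execFile(
  "ffmpeg",
  [
    "-i", "input.mp4",
    "-af", "loudnorm=I=-16:TP=-1.5:LRA=11", // integrated loudness, true peak, loudness range
    "-c:v", "copy", // leave the video stream untouched
    "output.mp4",
  ],
  (error) => {
    if (error) throw error;
    console.log("Audio normalization finished.");
  },
);
```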
Relatively risk-free.
2 weeks of Nadine's or Alex's time.
Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause problems for users. This can be particularly bad if users have to switch between dark and bright interfaces.
To prevent this, it would be great for Opencast Studio to have a dark mode. Even better, the mode could be triggered automatically by the user's system being in dark mode.
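Detecting that preference is straightforward with the standard prefers-color-scheme media query. A minimal sketch, where applyTheme and the data-theme attribute are placeholders for however Studio actually switches its styles:

```typescript
// Sketch: follow the system color scheme via prefers-color-scheme.
function applyTheme(theme: "light" | "dark"): void {
  document.documentElement.dataset.theme = theme; // hypothetical styling hook
}

const darkQuery = window.matchMedia("(prefers-color-scheme: dark)");
applyTheme(darkQuery.matches ? "dark" : "light");

// React when the user switches their system theme while the app is open.
darkQuery.addEventListener("change", (event) => {
  applyTheme(event.matches ? "dark" : "light");
});
```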
Support an auto mode to respect the user's system settings if possible.
1 week of Nadine's time.
Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause difficulties for users. This can be particularly bad if users have to switch between dark and bright interfaces.
To prevent this, it would be great for the Opencast Editor to have a dark mode. Even better, the mode could be triggered automatically by the user's system being in dark mode.
Support an auto mode to respect the user's system settings if possible.
2 weeks of Nadine's time.
Glare sensitivity can be one type of visual impairment. This means that bright white surfaces can cause difficulties for users. This can be particularly bad if users have to switch between dark and bright interfaces.
To prevent this, it would be great for Tobira to have a dark mode. Even better, the mode could be triggered automatically by the user's system being in dark mode.
Support an auto mode to respect the user's system settings if possible.
1 week of Julian's or Lukas' time.
Right now, a lot of elements can only be reached via the mouse in Opencast Studio. Tabbing through active elements and activating them via keyboard should be possible.
Additionally, sensible title and aria-label attributes should be used in the user interface.
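A minimal sketch of the idea; the .toolbar button selector is hypothetical, and the point is that native button elements are Tab-focusable by default and only need a proper label:

```typescript
// Sketch: ensure icon-only controls are announced properly.
for (const button of document.querySelectorAll<HTMLButtonElement>(".toolbar button")) {
  if (!button.hasAttribute("aria-label")) {
    // Fall back to the tooltip text so screen readers announce something useful.
    button.setAttribute("aria-label", button.title || "Unlabeled control");
  }
}
```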
Low risk
A user can create a full recording with no mouse.
Accessibility improvement for motor-impaired people and for visually impaired people using a screen reader.
1 week of Nadine's time.
Opencast automatically recognizes video scenes and, with this, for example, slide changes in the presentation video.
The format Opencast stores these segments in is XML-based and arguably somewhat outdated. We could investigate using WebVTT instead. This would potentially allow us to use the editing tools we have for subtitles.
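A rough sketch of what serializing segments as WebVTT could look like; the Segment shape is made up for illustration:

```typescript
// Sketch: segments as WebVTT cues instead of the current XML format.
interface Segment {
  start: number; // seconds
  end: number; // seconds
  description: string;
}

function toTimestamp(seconds: number): string {
  // "1970-01-01T00:00:42.500Z" -> "00:00:42.500"
  return new Date(seconds * 1000).toISOString().substring(11, 23);
}

function segmentsToWebVtt(segments: Segment[]): string {
  const cues = segments.map(
    (s, i) => `${i + 1}\n${toTimestamp(s.start)} --> ${toTimestamp(s.end)}\n${s.description}`,
  );
  return "WEBVTT\n\n" + cues.join("\n\n") + "\n";
}

console.log(segmentsToWebVtt([
  { start: 0, end: 42.5, description: "Title slide" },
  { start: 42.5, end: 120, description: "Introduction" },
]));
```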
Update Opencast to fully work with WebVTT segments (this could become a resulting follow-up project).
We decide that there is no benefit in switching.
We made a decision.
1 week of Nadine's time.
Opencast automatically recognizes video scenes and, with this, for example, slide changes in the presentation video.
While this works well, sometimes it would be nice if users could fix these: add new segments, change segments, and update segment descriptions.
Blocked by "Switch to WebVTT for Segments", which may make this obsolete.
Users are able to change segments.
7 weeks of Nadine's or Alex's time.
Opencast automatically recognizes video scenes and, with this, for example, slide changes in the presentation video. It also extracts text from these slides.
It would be really helpful if we could identify the most significant parts of these slide texts, such as the slide title. This can help users navigate through the video.
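A naive heuristic as a starting point, assuming (hypothetically) that the OCR step reports a bounding box per text line, so that the tallest line is likely the title:

```typescript
// Sketch: pick a slide title from OCR output. The OcrLine shape is
// hypothetical; height stands in for font size.
interface OcrLine {
  text: string;
  top: number; // y position on the slide
  height: number; // text height, a proxy for font size
}

function guessTitle(lines: OcrLine[]): string | undefined {
  return lines
    .filter((line) => line.text.trim().length > 3) // drop OCR noise
    .sort((a, b) => b.height - a.height || a.top - b.top)[0]?.text;
}
```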
If we cannot make this accurate enough, results may be useless.
We extracted at least 4 titles from the Dual-Stream Demo video.
4 weeks of Nadine's time.
Having a lot of extracted text for each event with the OCR on slides and the speech recognition, we could use something like tf-idf to easily extract subjects or other keywords from these texts to enrich event metadata.
This might work especially well if we look at the term frequency in the documents for one event compared to the frequency across all events of a series.
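A minimal sketch of the idea, with stop-word handling and stemming left out:

```typescript
// Sketch: naive tf-idf keyword extraction. A document would be the
// combined slide text and subtitles of one event; the corpus would be
// all events of the series.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
}

function tfidfKeywords(doc: string, corpus: string[], topN = 5): string[] {
  const terms = tokenize(doc);
  const tf = new Map<string, number>();
  for (const term of terms) {
    tf.set(term, (tf.get(term) ?? 0) + 1);
  }

  // Document frequency: in how many corpus documents does a term occur?
  const corpusSets = corpus.map((d) => new Set(tokenize(d)));
  const score = (term: string): number => {
    const df = corpusSets.filter((set) => set.has(term)).length;
    const idf = Math.log(corpus.length / (1 + df));
    return (tf.get(term)! / terms.length) * idf;
  };

  return [...tf.keys()].sort((a, b) => score(b) - score(a)).slice(0, topN);
}
```

Comparing against the series corpus gives terms that appear in every lecture a low idf, so event-specific subjects float to the top.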
Implement a tf-idf extraction workflow operation to extract keywords from metadata, slide texts and subtitles.
We can add the subjects as Dublin Core subject metadata and maybe add the results with additional information as JSON attachments.
None
Running this on a series of UOS or ETH material should result in recognizable subjects.
2 weeks of Nadine's time.
Analyzing the videos, we can extract a lot of useful information we should properly present to users in the player to help them navigate through a video.
Evaluate, add, and/or improve the presentation of this extracted information (segments, slide texts, subtitles) in Paella Player.
Potential conflict with what UPV and ETH do. Make sure to coordinate with them.
A user can use the slide texts of the Dual-Stream Demo for navigational help in Paella Player.
3 weeks of Nadine's time.
Analyzing the videos, we can extract a lot of useful information in text form about events. It would be really helpful if we could use those for searching in Tobira.
Make Tobira search through event metadata, slide texts, and subtitles.
A user can find events based on speech recognition.
3 weeks of Lukas' or Julian's time.
Subtitle2go is an alternative to Vosk for free, automatic subtitling. It has good support for the German language. We could add it as a transcription service to Opencast.
4 weeks of Nadine's time.
Tooltips (e.g. HTML title attributes) are great to add additional explanations to control elements. They are – in combination with aria-label – also picked up by some accessibility tools to help users further.
But non-persistent tooltips can unfortunately hinder accessibility. Users relying on a screen magnification tool may be unable to read the whole tooltip since only part of the actual screen is visible.
For example, compare screen magnification being used with default title attributes versus using tippy.js for rendering in the Opencast documentation.
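As a rough sketch of the fix, the native tooltips could be replaced with persistent tippy.js tooltips (this assumes the tippy.js package; the blanket [title] selector is just for illustration):

```typescript
// Sketch: replace native title tooltips with persistent tippy.js
// tooltips that stay readable under screen magnification.
import tippy from "tippy.js";
import "tippy.js/dist/tippy.css";

for (const element of document.querySelectorAll<HTMLElement>("[title]")) {
  const text = element.title;
  element.removeAttribute("title"); // suppress the native tooltip
  element.setAttribute("aria-label", text); // keep the text for screen readers
  tippy(element, { content: text });
}
```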
In addition to the regular models, Vosk provides punctuation models for both punctuation and case restoration. We should try them and compare the results to those of the default models for German and English.