Nathan Graule
# Introduction

Worlds are explored as much with our ears as with our eyes. In games, audio is just as important to immersion as graphics, if not more so; Bevy's audio engine therefore deserves as much care as its graphics engine. For the reasons listed below, Bevy's current audio engine is lacking in both features and performance, and an overhaul is needed to bring it up to speed with the rest of the engine. This document details why it needs to change, how we will change it, and the scope of the initial "Better Audio" working group, which will work toward its inclusion as a first-party audio engine in Bevy.

## Why the current audio engine is insufficient

There are several reasons why we think the current audio engine cannot easily be extended to meet the needs of Bevy users:

1. **Insufficient performance**. `rodio`, the audio library Bevy currently uses for its audio engine, has notable technical limitations which prevent more advanced features from being implemented:
   - `rodio` does not follow the rules of audio programming, which restricts the performance ceiling of the audio engine and can cause glitches when the audio engine and/or the whole host system is under load. Audio programming places strict restrictions on the kinds of algorithms and data-synchronization methods allowed, because any performance drop is audible to the end user as a glitch.
2. **Lack of extensibility**. While effects can be applied to audio sources, they can only be applied to a single source at a time. Because no audio bussing features are available (and such a feature would be hard to implement given `rodio`'s architecture), each effect necessarily has to be duplicated per source, which is a deal-breaker for heavier effects (e.g. reverbs are notoriously heavy on processing and memory, as each internal part of the effect is constantly evolving and reading data off of delay buffers).
3. **Incompatible type-safety**. Because `rodio` declares all of its components as separate Rust types, it is hard to write code that handles audio generically (i.e. for an editor). Since each effect applied to a source produces a new, distinct type, a project can potentially have as many concrete audio types in use as it has audio sources. Bevy's reflection features would help here, but reflection on foreign types is not available, which means we would need to duplicate all those types to make them reflectable by Bevy.
4. **Incompatible with animation features**. Unlike the graphics engine, where any value can be animated relatively easily by updating it once before rendering, the audio engine needs tight integration with the animation features: it must be able to evaluate animation curves at multiple points within a single "frame". Curves should therefore be accessible to the audio engine itself so that animators can set up audio animations.
5. **Not ECS-friendly**. From a purist's point of view, entities and components should be the ground truth for all systems in the application, including graphics and audio. Currently, nearly all audio-related component data is hidden from the ECS: instead of plain old data, the audio components are handles that systems manipulate through setters. Of all the reasons above, this is the least compelling, since we could easily compromise on purity here; still, it limits data sharing between systems and should be kept in mind.

## Initial plan of action: integrating `kira`

This working group was originally created to replace `rodio` with `kira` as the first-party audio backend for Bevy.
However, while `kira` is a good library for game audio in individual projects, it shares some of `rodio`'s issues discussed above, namely the type-safety, lack-of-reflection, and ECS-friendliness problems. Furthermore, `kira` complicates integration by having two "kinds" of parameters: one only usable during initialization, and the other only available at runtime. This means control is spread across two places, and the dichotomy cannot easily be hidden or resolved on the integration side; it has to be exposed to the user in order to preserve flexibility.

## New audio engine

This new audio engine needs to be designed with advanced features in mind, even though this document, and the working group for this project, is only concerned with an initial implementation. We need to focus on extensibility so as not to impede future developments of the engine's audio features: audio busses, mixing and effect processing, and deeper sound spatialization all need to remain implementable in the future. This is why, in this document, the "features" are distinct from the "scope" of the project: the former describes the long-term goals of the audio engine, while the latter covers the work to be done by this working group.
### Features

- **Audio sources**
  - Static and streaming playback of audio files
  - Gapless playback and looping
  - Automatic prioritization of sound playback based on user-defined priorities, distance (when spatial), and maximum polyphony
  - Third-party custom sources
- **Spatial audio**
  - v1 with basic panning and attenuation; v2 with HRTF and an atmospheric absorption model
  - Multiple listeners
- **Audio mixer**
  - Customizable track inputs (either from spatial audio listeners or directly from sounds configured to output to the track)
  - Effect rack for serial audio processing per-track and globally (through the Master track)
  - Third-party custom effects

### Scope

The goal for the initial version of the new audio engine is feature-parity with the existing one, that is:

- Custom audio sources
- Basic spatial audio
- Per-source volume control
- Global volume control

It should also be extensible, allowing both first-party and third-party implementations of audio effects.

# Implementation

:::info
In the following sections, "component" refers to both effects and sources, as they are similar in implementation.
:::

To be user-friendly and intuitive, the audio engine will have to integrate naturally into the ECS framework of the rest of Bevy. On the other hand, the restrictions of real-time programming make providing a reliable, high-quality audio engine hard. Bridging the gap between those two ends will require specific solutions, which this document goes over.

Controlling an audio component may require access to any number of sources of data from the game; that is, an audio component cannot be treated as a standalone, isolated component. Instead, each audio effect should be considered its own ECS system, running every frame to push state updates from the game engine to the audio engine. This lets the ECS retain the "ground truth" while leaving the implementation of this state-syncing mechanism unconstrained.
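As a simplified, std-only sketch of this state-syncing pattern: the `VolumeSource` component, `AudioEvent` enum, and `sync_volumes` function below are all hypothetical placeholders, and an `mpsc` channel stands in for the real-time-safe ring buffer discussed later (a real Bevy system would take a `Query` and a resource wrapping the sender instead).

```rust
use std::sync::mpsc::{channel, Sender};

// Hypothetical event type sent to the audio thread. In the real engine
// this would travel through a lock-free ring buffer, not an mpsc
// channel (whose receiving side is not real-time safe).
#[derive(Debug, PartialEq)]
enum AudioEvent {
    SetVolume { source_id: u32, volume: f32 },
}

// Hypothetical ECS component: plain old data, the "ground truth".
struct VolumeSource {
    id: u32,
    volume: f32,
    // Last value pushed to the audio engine, used for change detection.
    synced_volume: f32,
}

// The per-frame "system": diff the ECS data against what the audio
// engine last saw, and emit events only for actual changes.
fn sync_volumes(sources: &mut [VolumeSource], tx: &Sender<AudioEvent>) {
    for s in sources.iter_mut() {
        if s.volume != s.synced_volume {
            tx.send(AudioEvent::SetVolume { source_id: s.id, volume: s.volume }).unwrap();
            s.synced_volume = s.volume;
        }
    }
}

fn main() {
    let (tx, rx) = channel();
    let mut sources = vec![VolumeSource { id: 0, volume: 1.0, synced_volume: 1.0 }];
    sources[0].volume = 0.5; // gameplay code mutates the component...
    sync_volumes(&mut sources, &tx); // ...and the system syncs it.
    assert_eq!(rx.try_recv().unwrap(), AudioEvent::SetVolume { source_id: 0, volume: 0.5 });
    // No change since the last sync: no event is emitted.
    sync_volumes(&mut sources, &tx);
    assert!(rx.try_recv().is_err());
}
```

The change-detection step is what keeps the ECS authoritative: the audio engine only ever receives deltas, never owns the canonical value.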
Additionally, audio components should be stored in separate Bevy entities in order to simplify the lifecycle of the Bevy components associated with them (so that despawning an entity intuitively triggers resource cleanup on the audio engine side).

Strictly speaking, there isn't much of an official API that Bevy is required to provide in order to implement audio effect systems, since the communication layer effectively operates completely outside the ECS itself. However, because the goal of Bevy Audio is to make implementing audio effects as ergonomic as possible, API design and ergonomics should still be part of this design document.

## Interlude: audio programming rules

The rest of the technical discussion in this document makes several references to the "rules of audio programming", "real-time programming", or "real-time safety". While there are subtleties in the definitions of each, they will be used interchangeably here.

Real-time audio processing requires processing a stream of audio fast enough that audio devices continually have enough data to drive their outputs. When the audio process feeding that data does not compute fast enough, the user hears audio dropouts, or microloops, which are symptomatic of the audio process taking too long to produce the requested audio stream.

Real-time programming is a set of rules which, when followed, guarantees that code **will** terminate in a finite amount of time. The rules boil down to "no unbounded loops and no system calls", which means: no locking of mutexes (and no spinlocks), no allocations[^1], and always having bounds on loops (such as iterating through a structure or over an explicit range).

[^1]: This refers specifically to allocating through the system allocator, as that is a system call. You can set up a custom allocator, and as long as *it* does not make system calls, you can allocate within it.
A good article on the subject is [this one](http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing) by Ross Bencina, which details the why and the how in more depth.

## The communication layer

Communicating with an audio engine from the outside is tricky: not only does it involve cross-thread communication, it also needs to obey the rules of audio programming explained above. This means that most methods of cross-thread communication (i.e. mutexes, channels) are out.

The vast majority of solutions use a circular (or ring) buffer. This simple data structure allows channel-like communication by having one producer write to shared memory while keeping track of its writing position, and one consumer read from that shared memory while keeping track of its reading position. The "circular" part comes from the fact that the positions of the reader and writer (or "heads", by analogy with a tape machine) wrap around the shared memory. By implementing the read and write heads as atomic indices (e.g. `AtomicUsize` in Rust), both ends of the circular buffer can work in complete autonomy by following a few simple rules; most importantly, there is no mechanism by which one end has to wait for the other. Reading (resp. writing) a value either succeeds or fails. There are a number of high-quality circular buffer crates available, and we shouldn't reinvent the wheel here.

We can use the circular buffer as the basis for a channel-like data structure by having the ECS side hold a "Sender" or "Producer" struct which writes to shared memory and manages its write head, and the audio side hold a "Receiver" struct which reads from shared memory and manages its read head. These work very similarly to Event Writers and Event Readers in Bevy, and should share as many similarities with them as the difference in storage allows.
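To make the head/tail mechanism concrete, here is a minimal, single-threaded sketch. The capacity and names are arbitrary; production crates such as `rtrb` or `ringbuf` split the buffer into separately owned producer/consumer halves and handle the unsafe shared-memory details that this sketch sidesteps by taking `&mut self`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// One slot is always left empty so a full buffer can be told apart
// from an empty one.
const CAPACITY: usize = 8;

struct RingBuffer<T: Copy + Default> {
    slots: [T; CAPACITY],
    write: AtomicUsize, // next slot the producer will fill
    read: AtomicUsize,  // next slot the consumer will drain
}

impl<T: Copy + Default> RingBuffer<T> {
    fn new() -> Self {
        RingBuffer {
            slots: [T::default(); CAPACITY],
            write: AtomicUsize::new(0),
            read: AtomicUsize::new(0),
        }
    }

    /// Producer side: succeeds or fails immediately; never blocks.
    fn push(&mut self, value: T) -> Result<(), T> {
        let w = self.write.load(Ordering::Relaxed);
        let r = self.read.load(Ordering::Acquire);
        if (w + 1) % CAPACITY == r {
            return Err(value); // full: the producer must retry later
        }
        self.slots[w] = value;
        self.write.store((w + 1) % CAPACITY, Ordering::Release);
        Ok(())
    }

    /// Consumer side (the audio thread): wait-free, real-time safe.
    fn pop(&mut self) -> Option<T> {
        let r = self.read.load(Ordering::Relaxed);
        let w = self.write.load(Ordering::Acquire);
        if r == w {
            return None; // empty
        }
        let value = self.slots[r];
        self.read.store((r + 1) % CAPACITY, Ordering::Release);
        Some(value)
    }
}

fn main() {
    let mut rb: RingBuffer<u32> = RingBuffer::new();
    assert!(rb.push(1).is_ok());
    assert!(rb.push(2).is_ok());
    assert_eq!(rb.pop(), Some(1));
    assert_eq!(rb.pop(), Some(2));
    assert_eq!(rb.pop(), None);
}
```

Note that both operations are bounded and lock-free: each is a fixed number of atomic loads, one copy, and one atomic store, which is exactly what the real-time rules above demand.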
To change a parameter in the audio engine, the ECS system sends an "event" (represented by an enum) through the circular buffer; the audio component implementation periodically checks its end of the buffer and updates itself accordingly whenever a new event arrives. The meaning of each event is left to the enum itself and to the audio component implementation.

One problem with sending events is that arbitrary data cannot be sent. For example, if a `Vec<_>` is sent through a circular buffer and the data is not moved out when the event is read, the `Vec` will be dropped, causing a deallocation, which, with the default allocator, triggers a system call to reclaim memory. **This breaks the audio programming rule of no system calls.**

### An API for communicating between audio components and Bevy

We therefore need a way to "collect" data after its use. For this use-case, the [`basedrop`](https://crates.io/crates/basedrop) crate provides smart pointers which delay the `Drop` implementation of their inner data until it has been sent back from the audio thread. This allows audio components to move data from events into their internal state. However, this is difficult to enforce through enums, meaning users may accidentally provoke (de-)allocations when communicating through events; this introduces hard-to-debug footguns and is thus not ergonomic.

Another solution is to use serialization to pass data into the audio engine. This does enforce the allocation rules in the communication layer (by controlling the format of the serialized data, we can ensure no deallocating types are transferred). However, it moves the problem to the audio component itself, which may unknowingly allocate while deserializing the data back into a proper type.
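One cheap, partial mitigation worth noting: requiring the event type to be `Copy` statically rules out payloads like `Vec<_>` whose drop would deallocate on the audio thread. The sketch below is hypothetical (the `AudioEvent` variants and `EventSender` wrapper are illustrative, with a `VecDeque` standing in for the ring buffer); `basedrop`'s smart pointers would remain the escape hatch for payloads that genuinely need heap data.

```rust
use std::collections::VecDeque;

// A `Copy` event type guarantees that reading or discarding an event
// never runs a destructor, so no deallocation can reach the audio thread.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AudioEvent {
    SetVolume(f32),
    SetPan(f32),
    Stop,
}

// Hypothetical sender wrapper enforcing the bound at compile time.
// `EventSender<Vec<f32>>` would simply fail to compile: Vec is not Copy.
struct EventSender<T: Copy> {
    queue: VecDeque<T>, // stand-in for the shared ring buffer
}

impl<T: Copy> EventSender<T> {
    fn new() -> Self {
        EventSender { queue: VecDeque::new() }
    }
    fn send(&mut self, event: T) {
        self.queue.push_back(event);
    }
    fn drain_one(&mut self) -> Option<T> {
        self.queue.pop_front()
    }
}

fn main() {
    let mut tx = EventSender::new();
    tx.send(AudioEvent::SetVolume(0.8));
    tx.send(AudioEvent::Stop);
    assert_eq!(tx.drain_one(), Some(AudioEvent::SetVolume(0.8)));
    assert_eq!(tx.drain_one(), Some(AudioEvent::Stop));
}
```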
A different solution could involve "parameter stores": a shared space pre-allocated with all possible parameters (as enumerated by an enum), with their values serialized into a specific format, which both sides read and write through triple-buffered storage (allowing independent reads and writes, at the cost of making parameters only *eventually* consistent). This relaxes the rules on what can be stored, as well as on the serialization format (any self-describing format can be used here, e.g. `serde_json::Value`). Any type implementing `serde::Serialize + serde::Deserialize<'a>` can then effectively be stored as a parameter, from simple float values to entire structures. The downsides are that serializing/deserializing can be expensive, and that deserializing on the audio thread could still incur system calls (e.g. allocations from a `Vec<_>`). Additionally, features like change detection need extra data to be stored per parameter and checked explicitly, rather than falling out naturally from the event-based nature of the previous solutions. Dropping deserialized structures can also still be a source of deallocations.

**TODO**: choose which method to implement as the communication layer.

## Bevy systems

**TODO**: internal systems resulting from the choice of communication layer above.

## Audio engine

The audio engine is the part that resides entirely within the audio thread, driven by the OS audio APIs.

**TODO**
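Returning to the parameter-store idea: as a much-simplified sketch, the version below dodges serialization entirely by restricting parameters to `f32` values stored as atomic bit patterns. A real parameter store would triple-buffer serialized values to support arbitrary types; the `Param` variants and `ParamStore` name here are hypothetical.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Every parameter the component exposes, used as an index into the store.
#[derive(Clone, Copy)]
enum Param {
    Volume = 0,
    Pan = 1,
}
const PARAM_COUNT: usize = 2;

// Lock-free store: each f32 is kept as its raw bits in an AtomicU32,
// so both threads can read and write without locks or allocation.
struct ParamStore {
    values: [AtomicU32; PARAM_COUNT],
}

impl ParamStore {
    fn new() -> Self {
        ParamStore {
            values: [
                AtomicU32::new(0.0f32.to_bits()),
                AtomicU32::new(0.0f32.to_bits()),
            ],
        }
    }

    // Game-thread side: publish a new value.
    fn set(&self, p: Param, v: f32) {
        self.values[p as usize].store(v.to_bits(), Ordering::Release);
    }

    // Audio-thread side: wait-free, real-time safe read.
    fn get(&self, p: Param) -> f32 {
        f32::from_bits(self.values[p as usize].load(Ordering::Acquire))
    }
}

fn main() {
    let store = ParamStore::new();
    store.set(Param::Volume, 0.75);
    assert_eq!(store.get(Param::Volume), 0.75);
    assert_eq!(store.get(Param::Pan), 0.0);
}
```

This trades expressiveness for simplicity: parameters are eventually consistent and limited to 32-bit values, but both sides are wait-free and no (de)serialization or drop can occur on the audio thread.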
