In this tutorial, we will introduce you to Hyperfy's Audio-Reactive App Suite: a powerful toolkit of apps that allows you to analyze audio data from various sources and use it to adjust properties of apps within your world. We will cover the basics of audio spectrum analysis, provide an overview of the Web Audio API, and explain how the Audio Bridge app interacts with the other audio-related apps in the suite.
Sound waves are longitudinal waves that propagate through a medium (such as air or water) as a result of vibrations. They consist of compressions (areas of high pressure) and rarefactions (areas of low pressure) travelling through the medium. The human ear detects these pressure changes and converts them into electrical signals, which the brain then processes as sound. The human ear can typically hear sound waves within the frequency range of 20 Hz to 20 kHz.
Sound wave amplitude refers to the maximum displacement or change in pressure of a sound wave from its equilibrium position. In the context of compressions and rarefactions, the amplitude of a sound wave is directly related to the intensity of the compressions and rarefactions.
In simpler terms, a higher amplitude sound wave results in more intense compressions and rarefactions, leading to a louder perceived sound. Conversely, a lower amplitude sound wave results in less intense compressions and rarefactions, producing a quieter sound.
A sound wave swings both above and below its equilibrium pressure, so it has both positive and negative amplitude, and we must capture both to reconstruct the wave digitally.
In digital audio, sound waves are represented as a series of discrete values called samples. These samples are usually taken at regular intervals, known as the sampling rate (e.g., 44.1 kHz or 48 kHz). Each sample represents the amplitude of the sound wave at a specific point in time.
The more samples we take per second, the more faithfully the digital signal reconstructs the original analogue sound.
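To make this concrete, below is a minimal JavaScript sketch of the idea (the numbers and variable names are purely illustrative): at a 48 kHz sampling rate, one second of audio becomes 48,000 samples, each recording the wave's amplitude at one instant in time.

```js
// Minimal sketch: sampling a 440 Hz sine wave at 48 kHz for one second.
// Each array entry is one sample - the wave's amplitude at that instant.
const sampleRate = 48000;      // samples per second
const frequency = 440;         // Hz
const durationSeconds = 1;

const samples = new Float32Array(sampleRate * durationSeconds);
for (let i = 0; i < samples.length; i++) {
  const time = i / sampleRate;                            // seconds elapsed at this sample
  samples[i] = Math.sin(2 * Math.PI * frequency * time);  // amplitude in the range [-1, 1]
}
```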
Hyperfy uses the Web Audio API for its audio-reactive toolkit. The Web Audio API provides a set of tools for reading, processing, and analyzing digital audio data. To work with audio data, you first need to create an AudioContext, which is the main object that manages all aspects of the API. Once you have an AudioContext, you can create various types of AudioNodes to manipulate the audio data.
For example, you can use an AnalyserNode to extract frequency and time-domain data from the input audio. The AnalyserNode provides methods like getByteFrequencyData() and getFloatFrequencyData(), which fill an array you supply with the magnitudes of the frequency components in the input signal.
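As a rough sketch of how these pieces fit together (this is generic Web Audio API usage, not Hyperfy's internal code, and it assumes an audio element already exists on the page), you might wire up an AnalyserNode like this:

```js
// Connect an <audio> element to an AnalyserNode and read its frequency data every frame.
const audioContext = new AudioContext();
const audioElement = document.querySelector('audio');   // assumes an <audio> tag is on the page
const source = audioContext.createMediaElementSource(audioElement);

const analyser = audioContext.createAnalyser();
analyser.fftSize = 2048;                                 // 2048-sample FFT -> 1024 frequency bins

source.connect(analyser);
analyser.connect(audioContext.destination);              // keep the audio audible

const frequencyData = new Uint8Array(analyser.frequencyBinCount);

function update() {
  analyser.getByteFrequencyData(frequencyData);          // fills the array with magnitudes (0-255) per bin
  // frequencyData[0] is the lowest frequency bin; higher indices are higher frequencies.
  requestAnimationFrame(update);
}
update();
```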
Frequency bands are specific ranges of frequencies within the audio spectrum. In audio processing, you can divide the frequency spectrum into different bands to analyze and manipulate each band separately. This can be useful for applications like equalization, noise reduction, or creating audio-driven visual effects.
In the Audio Bridge App, you can define up to 4 frequency bands, each with its own threshold ratio. This allows you to focus on specific parts of the audio spectrum and create different effects for each band.
Low frequencies generally cover bass sounds, such as the deep tones produced by a kick drum or bass guitar. Low mid frequencies include some of the lower harmonics and the fundamental frequencies of many instruments and vocals. High mid frequencies contain the upper harmonics and the "presence" of instruments and vocals, contributing to their clarity and intelligibility. High frequencies are responsible for the "brightness" or "airiness" of the sound, including the highest harmonics and transient details.
Below is a table illustrating typical frequency band ranges for low, low mid, high mid, and high bands. Note that these ranges can vary depending on the specific application or context.
Below is an example of how these bands might be distributed across a logarithmic frequency spectrum.
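To illustrate the idea in code (the band edges below are placeholder values chosen for illustration, not Hyperfy's defaults), here is a sketch that converts each band's frequency range into AnalyserNode bin indices and averages the magnitudes within that range:

```js
// Sketch: averaging the AnalyserNode bins that fall inside each frequency band.
// The band edges are illustrative; adjust them to suit your application.
const bands = [
  { name: 'low',     from: 20,   to: 250 },
  { name: 'lowMid',  from: 250,  to: 2000 },
  { name: 'highMid', from: 2000, to: 6000 },
  { name: 'high',    from: 6000, to: 20000 },
];

function getBandLevels(analyser, frequencyData, sampleRate) {
  analyser.getByteFrequencyData(frequencyData);
  const binWidth = sampleRate / analyser.fftSize;        // Hz covered by each FFT bin
  return bands.map(({ name, from, to }) => {
    const firstBin = Math.floor(from / binWidth);
    const lastBin = Math.min(Math.ceil(to / binWidth), frequencyData.length - 1);
    let sum = 0;
    for (let i = firstBin; i <= lastBin; i++) sum += frequencyData[i];
    const level = sum / ((lastBin - firstBin + 1) * 255); // normalise to the 0-1 range
    return { name, level };
  });
}

// e.g. const levels = getBandLevels(analyser, frequencyData, audioContext.sampleRate);
```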
A threshold ratio is a value that determines how sensitive the app is to changes in the amplitude of the frequency components: the lower the threshold, the more sensitive the app is to small changes in amplitude, and the stronger the data signal it generates.
To calculate the data signal, the app takes the delta (difference) between the amplitude of a frequency component and its threshold. This delta is then used as a scaling factor to adjust variables such as the emission value of a material or the intensity of a light.
For example, if the amplitude of a frequency component is 0.8 and its threshold is 0.5, the delta would be 0.3 (0.8 - 0.5). The app could then use this 0.3 delta to scale how much it boosts the emission value of a material or the intensity of a light, creating a dynamic effect that responds to the input audio.
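A minimal sketch of that calculation (the function and variable names here are my own, not the app's):

```js
// Subtract the threshold from the band's amplitude; anything above zero drives the effect.
function thresholdDelta(bandLevel, thresholdRatio) {
  // Both values are assumed to be normalised to the 0-1 range.
  return Math.max(0, bandLevel - thresholdRatio);
}

const delta = thresholdDelta(0.8, 0.5);  // 0.3, as in the example above
// e.g. material.emissiveIntensity = baseEmission + delta * responseScale;
```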
By adjusting the threshold ratios for different frequency bands, you can control how the app responds to various frequency components and create a wide range of audio-driven visual effects.
The Audio Bridge App is the central component of the toolkit, responsible for processing the audio input from an audio, video, or stream app. You can set up to 4 frequency bands and a 'threshold ratio' for each. The lower the threshold, the higher the value sent to the Audio Bloom or Audio Light app. Each Audio Bridge instance can be associated with one 'Audio ID' from the source.
The audio, video, and stream apps all allow you to assign an Audio ID that will be used by the Audio Bridge.
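Purely as a conceptual sketch (this is not Hyperfy's SDK or the Audio Bridge's actual code), the bridge's role can be pictured like this: it watches one Audio ID, applies a threshold per band, and forwards the resulting deltas to whatever effects are listening.

```js
// Conceptual sketch only - not Hyperfy's Audio Bridge implementation.
const bridge = {
  audioId: 'main-stage',   // hypothetical Audio ID matching the one set on the source app
  thresholds: { low: 0.5, lowMid: 0.4, highMid: 0.4, high: 0.3 },
  listeners: [],           // e.g. a bloom or light effect registers a callback here
};

// Called once per frame with the band levels computed earlier.
function tick(bandLevels) {
  for (const { name, level } of bandLevels) {
    const delta = Math.max(0, level - bridge.thresholds[name]);
    for (const listener of bridge.listeners) listener(name, delta);
  }
}
```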
The Audio Bloom App allows you to upload a GLB file and select the material you want to control using the Audio Bridge app. It receives the analyzed audio data and adjusts the emission value of the selected material accordingly. This creates a dynamic and interactive 'bloom' effect based on the input audio.
In this case, I have 5 meshes in the GLB I've assigned to Audio Bloom, but I only want to adjust the emission for the material applied to the 'top' mesh, which is named "emission1". I would therefore enter "emission1" into the Material text box in the Audio Bloom App.
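As a rough, three.js-style sketch of the effect (not Audio Bloom's internal code; it assumes the loaded GLB lives in an existing THREE.Scene and that the 'top' mesh uses a standard material named 'emission1'):

```js
// Sketch: boost the emissive intensity of every material matching the given name.
function applyBloom(scene, materialName, delta) {
  scene.traverse((object) => {
    if (object.isMesh && object.material && object.material.name === materialName) {
      object.material.emissiveIntensity = 1 + delta;  // brighter when the band exceeds its threshold
    }
  });
}

// e.g. applyBloom(scene, 'emission1', 0.3);
```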
The Audio Light App enables you to create an area or point light and use the Audio Bridge app to increase the intensity value of the light. This app adds another layer of interactivity to your 3D scene, allowing the light intensity to react to the audio input in real-time.
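And a similarly hedged sketch for the light case (again three.js-style, not Audio Light's internal code):

```js
import * as THREE from 'three';

// Sketch: scale a point light's intensity by the band delta received from the bridge.
const light = new THREE.PointLight(0xffffff, 1);   // colour, base intensity
const baseIntensity = 1;

function onBandDelta(delta) {
  light.intensity = baseIntensity + delta * 5;     // '5' is an arbitrary response scale
}
```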
Hyperfy's Audio-Reactive App Suite offers a powerful and accessible way to create dynamic visual experiences driven by audio. By understanding the fundamentals of sound waves, amplitude, frequency bands, and the Web Audio API, you can effectively use the Audio Bridge, Audio Bloom, and Audio Light apps to bring your projects to life. This tutorial is a starting point for exploring what the toolkit can do; with a deeper understanding of the underlying concepts, you can unlock new creative potential in your work as an artist, designer, or developer using Hyperfy's SDK.