GMT 2A Endofunctional harmony

# GMT 2A Endofunctional harmony

> Part A: On the building block of intervals

{%hackmd theme-dark %}

###### tags: `GMT`

<style>
h4 { font-size: 130% !important; font-weight: 100 !important; }
.alert h5 { margin-top: 0.7em; color: rgba(0, 0, 0, 0.8); }
.alert a { color: #3355bb; }
.alert { color: rgba(0, 0, 0, 0.9); }
hr.in-view { height: 1px; }
</style>

GMT series:

- [GMT1: Overanalyzing the perfect cadence](https://hackmd.io/@euwbah/gen-music-theory-1)
    - introduces the conceptual framework with which to attempt a generalization of tonal music, as well as the futility of such an attempt.
- **GMT 2A Endofunctional harmony**
    - introduces constructs involving octaves, fifths and fourths, an introspective inquiry on one's personal conception of sound, and various perspectives and hypotheses on culture.
- [WIP]
    - focuses on tonal harmony, introducing constructs involving 5-limit maj/min thirds, temperaments, and how symmetries of tempered intervals in 12edo give birth to modern functional harmony
- [WIP]
    - focuses on extrapolating these symmetries to the realm of [xenharmonics & microtonality](https://en.xen.wiki/w/Main_Page), generalizing when and where such constructs can be applied depending on the tuning system.
---- Loosely inspired by constructive spirit and intentions of: - Fred Lerdahl's [Tonal Pitch Space](https://www.researchgate.net/publication/257925384_Tonal_Pitch_Space) - Heinrich Schenker's [_Tonraum_](https://trace.tennessee.edu/cgi/viewcontent.cgi?article=1110&context=gamut) - Arnold Schoenberg's [Chart of regions](https://symposium.music.org/index.php/33-34/item/2103-schoenberg-on-the-modes-characteristics-substitutes-and-tonal-orientation) - David Temperley's [_Melisma_ computational models](http://davidtemperley.com/melisma-v2/) of harmonic analysis as implemented by Daniel Sleator - Hugo Riemann's [_Vereinfachte Harmonielehre_](https://archive.org/details/cu31924022305357/mode/2up?ref=ol&view=theater) - James Tenney's [History of Consonance and Dissonance](https://www.plainsound.org/pdfs/HCD.pdf) - The [Xenharmonic Wiki](https://en.xen.wiki/) This article mainly serves to provide applied examples of a more generalized harmonic function analysis, which builds upon the alternate construct of tonality as described in [GMT1: Overanalyzing the perfect cadence](https://hackmd.io/@euwbah/gen-music-theory-1). This will be done in two parts: a brief introspective inquiry on your individual conception of pre-existing constructs, followed by a reified re-exposition of theoretical and cultural constructs (octaves, fifths, fourths) that eventually birthed 12edo music from the Xenharmonic perspective. This re-exposition is extremely dense and requires the introduction of many new concepts that may not be new for xenharmonists, but probably new to the general musician, so it will be done in several parts. Along the way, we will learn new ways to interpret these same intervals at depth, and learn metrics and techniques for quantifying them sonically and mathematically in a way that can give deeper intuition for what these intervals are. 
#### Suggested prerequisites to get the most out of this: - Fundamentals of just intonation - **[hearing & audiating overtones/harmonic series](https://www.youtube.com/watch?v=hDLhe-NkH2A&ab_channel=mannfishh)**\ Ear training is **imperative** for applying any of these techniques/metrics musically. - The understanding that [musical interval name](https://en.xen.wiki/w/Gallery_of_just_intervals) corresponds to a certain ratio between two frequencies. \begin{align} \text{perfect fifth} &= \frac{3}{2} \\ &= \text{the interval between 300hz and 200hz} \\ &= \text{the interval between 150hz and 100hz} \end{align} - The understanding that adding or subtracting a note, $f_0\text{ hz}$, by an interval, $I$, is performed by multiplication ($I\times f_0$) or division ($f_0 \div I$) respectively. - Harmonics, overtones & partials: a single perceived 'note' can be comprised of many constituent frequencies. - That tonality is completely subjective and impossible to predict ([GMT1](https://hackmd.io/@euwbah/gen-music-theory-1)). - Sensitivity to musical culture and emotion - Strong familiarity with interval sounds and names :::success There are [interactive audio examples](https://xenpaper.com/#1%2F1_9%2F8_6%2F5_4%2F3_9%2F8-_16%2F18_1%2F1--) + visuals (edit examples in xenpaper, then click 'Ruler' tab on the left). Much gratitude to [Damien Clarke](https://github.com/dxinteractive) for creating [xenpaper](https://xenpaper.com/). Ratio-interval frequency examples are based on 1/1 = A3 = 220Hz, unless otherwise stated. Also, if the math equations are too small, you can right-click/tap & hold them to open up a menu to adjust how they are displayed. ::: ---- ## 1. Introspection This brief section focuses on raising awareness of your subjective preferences within eurocentric music. Having a rigorous awareness of your internal muse allows for selectively ignoring concepts that do not agree with your subjective preferences. 
You may also choose to skip this section and proceed to applications, though the reified concepts may then be less effective.

:::warning
Disclaimer: this section is written for, and biased towards, formally trained/experienced musicians trained in audiation of 12ED2 and just intonation. Ideally, a less biased (though extremely tedious) general psychological test would be used in its place, allowing anyone to participate and realize their entrained biases.

Ensure you are familiar with just intonation concepts listed in the [GMT1 JI primer](https://hackmd.io/@euwbah/gen-music-theory-1#Just-intonation-analysis) before continuing.
:::

### Resolution

#### of chords

_approx. 30 mins_

In 12ED2 A440, construct chord progressions using any extensions or alterations of choice (account for voicings; octave equivalence does not apply here) that can evoke the following resolutions. Do so by pure audiation with the prompt of the C pitch, with the help of an instrument only if necessary:

> Note: the 'tonality of X' nomenclature refers to being able to identify X as the 'root' of the tonality. It is not limited to any particular mode or quality: the progression could resolve to any mode/scale which has X as the root. The root must be audiated as the lowest note. Try to exhaust resolving to as many qualities/modes as you can.

- a final resolution from the chord G major to the tonality of C (using the notes from the C major scale)
- a final resolution from C major to the tonality of G (using the G major scale)
- a final resolution from G major to the tonality of C (using the G major scale)
- a final resolution from C major to the tonality of G (using the C major scale)
- a final resolution from the chord G minor to the tonality of C (using the F major scale, or otherwise)
- a final resolution from the chord C minor to the tonality of G (using the C melodic/harmonic minor scale, or otherwise)

Now analyse:

- Which resolutions were possible for you?
(including resolving to alternate qualities/modes) - Which ones were easier/harder to actualize? - Were there additional/removed notes inside/outside the suggested scales that you had to use to aid in your audiation? - Were there specific choices in voicing (octave choice) to aid in your audiation? - Is the resolution unidirectional or bidirectional? Was there a preference to have them work one way over the other? - How would you represent these resolutions on the 5-limit [Tonnetz](https://en.wikipedia.org/wiki/Tonnetz)? What patterns do you notice in your resolution preferences in terms of just intonation/products of primes? - How did culture influence each choice? Was there any specific culture/genre/artiste/concept? #### of melodies _approx. 60 mins_ Repeat the above inquiries, but by means of audiating a monophonic melody. Construct both the shortest and longest possible resolutions (in terms of duration spent on the 'dominant'-function chord) you can perceive. Analyze the same questions as above. ### Voicing _approx. 90 mins_ In 12ED2 A440, construct 12 sets of [pitch classes](https://en.wikipedia.org/wiki/Pitch_class#:~:text=In%20music%2C%20a%20pitch%20class,%2C%20in%20whatever%20octave%20position.%22) as follows: - For each of the 12 notes, construct a major triad of that note, i.e. $\{\{C, E, G\},\{D^b, F, A^b\},\ ...\ , \{B, D^\text{#}, F^\text{#}\}\}$ - For each triad, adjoin the C major triad, yielding the sets: - C, E, G - C, Db, E, F, G, Ab - C, D, E, F#, G, A - C, Eb, E, G, Bb - C, E, G, G#, B - C, E, F, G, A - C, C#, E, F#, G, A# - C, D, E, G, B - C, Eb, E, G, Ab - C, C#, E, G, A - C, D, E, F, G, Bb - C, D#, E, F#, G, B Then, for each adjoined set, choose an octave for each note such that each note in the set appears only once. Audiate these choices as colors, and select voicings that appeal to you. Now analyse: - Were there any patterns in voicings that were easier/harder to construct? 
- Did concordance/discordance contribute to your preferential choices? - Have you encountered/utilized such voicings? From which musical paradigm/culture? - Were there certain notes that could have been omitted without losing much of the voicing's character? How about imperative notes? - Repeat the analysis from the perspective of the Tonnetz, and find your ideal composite ratios (by audiation) for subsets of the voicings with 2 to 5 notes. Which ideal ratios between the subsets conflict? ### Finally As we continue with the exposition & extrapolation, keep in mind the unique individual realizations and preferences you have, and find ways to link the reified theories back to your own subjective interpretation, or ignore them entirely if they do not agree. ---- ## 2. Exposition Now we begin the reification of tonality, starting with the discovery of intervals. To reiterate the premise, it has already been established that it is impossible to derive an 'objective tonality' algorithm that can work out exactly how listeners could perceive stimuli. Yet, the impetus to try comes from the fact that a finite number of notes in stimuli can only give rise to a finite number of possible permutations in which tonality can emerge: Analogous to the essence of group theory, we cannot make blanket statements for all of tonality, but given a finite starting point, we can: 1. analyze permutations on a case-by-case basis 2. find situations where all permutations hold true for some statement when certain notes in stimuli are used (by brute force, but music-cultural norms/descriptive music theory can hint at where to look to find such patterns) 3. find the reason for why such tautologies occur from an analytical perspective, so that we can generalize them 4. construct upwards using these analytic methods as our new basis 5. 
rinse & repeat recursively until a desired level of abstraction is reached

Note that we are trying to pick apart harmonic structure, so anything that is constructed without any intention of tonality at all (i.e. serialism/free atonality/free-pitch expressionism/non-functional spectralism) is completely outside the scope of the GMT series.

We are assuming the use of instruments with harmonic timbres, which encourage the perception of consonance when dealing with intervals with simple ratios, as demonstrated in [this video by Objective Harmony](https://youtu.be/wUwC4syOX1s?t=315). We can safely assume harmonic timbres because the human voice, the primordial instrument, is harmonic ([spectra example](https://www.youtube.com/watch?v=VnC8I3d2MXQ&ab_channel=WhatMusicReally%C4%B0s) and [musical example](https://www.youtube.com/watch?v=i9-pwR6qdhE&ab_channel=Anna-MariaHefele)). It seems that music has evolved in a way that agrees with natural phenomena and the human experience.

overtone
: Any frequency content present in the [discrete Fourier transform](https://www.youtube.com/watch?v=nl9TZanwbBk&ab_channel=SteveBrunton) that is above the [fundamental frequency](https://en.wikipedia.org/wiki/Fundamental_frequency) (or perceived [virtual fundamental](https://www.youtube.com/watch?v=t-iWKvh6Fbw&ab_channel=MusicalAcoustics)) of the sound source. Recall that a 'single' perceived pitch is a composite of numerous pitches (decomposed into sine/cosine waves), unless the sound source is a single sine wave itself.

timbre
: The unique makeup of overtones (their rational/irrational frequency multiples, amplitudes, and phase offsets) that allows us to classify sounds/instruments/words.

harmonic timbre
: A timbre in which the frequency multiples of the overtones are integer multiples, or close to integer multiples, of the fundamental pitch.
On the contrary, we have inharmonic timbres (instruments with spherical/rectangular resonance patterns, as in gamelan/pitched percussion), or timbres that have missing partials (the [clarinet](https://pages.mtu.edu/~suits/clarinet.html#:~:text=A%20clarinet%20is%20an%20example,odd%20harmonics%20in%20the%20sound)).

Before we start, here's a good refresher to set the tone of this section: [this clip of Bernstein's lecture](https://www.youtube.com/watch?v=Gt2zubHcER4&ab_channel=paxwallacejazz).

This section intends to build up from the fundamental harmonic series to a convincing interpretation of modern functional harmony. Let us begin!

----

### Nomenclature:

For the purposes of this series, let us standardize the notation of musical intervals:

$n/d$
: This represents an interval distance between two notes. It can also be used to refer to a single note, such that $n\div d$ gives the value $A\text{ hz} \div B\text{ hz}$, where $A$ is the frequency of the note referred to by $n/d$, and $B$ is the frequency of the note that $n/d$ is heard/conceptualised with respect to. For example, if the arbitrary 'root note' is 220hz (A3), and the note to be referred to is 330hz (E4), which is thought of with respect to the 'root note', then the interval $330/220 = 3/2$ is used to refer to the note E4. Of course $3/2$ can also represent the abstract interval of the perfect fifth, regardless of arbitrary choice of root note. This can be read as _'3 against 2'_ or _'three-two interval'_.

$r_1:r_2:\ldots:r_n$
: This represents a chord of $n$ notes. The chord is formed with the notes given by their intervals from the root, $r_1$, such that the set of all notes in the chord is given by: $\{r_1/r_1, r_2/r_1, \ldots, r_n/r_1\}$ (the root itself being $r_1/r_1 = 1/1$). For example, $4:5:6$ is a 5-limit classic major triad. This can be read as _'4 against 5 against 6'_.

$x\backslash y\text{ ED }(n/d)$
: This represents $x$ steps of the interval $n/d$ after it has been divided equally (logarithmically, so the steps are generally irrational) into $y$ parts.
For example, $n/d = 2/1$ represents the interval of an octave. $2/1 \times 2/1$ is two octaves, and $\underbrace{2/1 \times \ldots\times 2/1}_\text{k times}$ is $k$ octaves. Thus, $(2/1)^k$ represents $k$ octaves.

Setting $k = \cfrac{x}{y}$, we can divide an octave into $y$ equal parts, then form an interval as large as $x$ parts.

Since a semitone divides the octave into 12 equal parts and takes 1 of those parts, we can notate it as: $1\backslash 12 \text{ ED }2/1$.

A 12ed2 tritone divides the octave in half, so it can be notated as $1\backslash 2 \text{ ED }2/1$. It is also 6 semitones, so $6\backslash 12 \text{ ED }2/1$ is an equivalent notation: $6\backslash 12$ simplifies to $1\backslash 2$.

Note that $n$ defaults to 2 and $d$ defaults to 1, so if we don't specify any $\text{ED }n/d$, it defaults to $\text{ED }2/1$. This is the definition of **'EDO'** as in 'equal divisions of the octave', which we can also use as shorthand. (There's also **'EDT'**, which is shorthand for $\text{ED }3/1$.)

Thus, if we are only equally dividing the octave, we can simply specify $x\backslash y$, e.g. $1\backslash 12$ for 1 semitone and $1\backslash 24$ for 1 quartertone, etc...

---

### On the $\text{octave}=2:1$

Overview:

:::success
- Why octave equivalence is a thing for harmonic timbres
    - If you play a note an octave above another prior note concurrently, the note in the higher octave does not actually contribute any 'new' frequencies to the set of all frequencies present. It only increases the volume of some of the already existing frequencies.
- Simplicity of octaves
    - Interval height metric to measure mathematical complexity of ratio
    - p-limit metric to measure 'dimension' of interval
- Harmonicity, Harmonics, Partials and Overtones
- Otonality vs Utonality
    - perceiving an octave up from below is fundamentally different from perceiving an octave down from above even though the end result yields the exact same two notes
:::

#### Octave-equivalence & pitch class

As previously discussed in [GMT1](https://hackmd.io/@euwbah/gen-music-theory-1#Assumption-of-octave-equivalence), octave-equivalence is a phenomenon that cannot simply be assumed. It is the idea that all frequencies related by integer powers of two are the 'same note', and belong in the same [pitch class](https://en.wikipedia.org/wiki/Pitch_class). E.g. 110hz, 220hz, 440hz, 880hz, and 1760hz are all of the pitch class 'A', since they are the $2^{-2}, 2^{-1}, 2^{0}, 2^{1}, 2^{2}$ multiples of $\text{A4}=440\text{hz}$.

<style>
iframe { overflow:hidden; width: 100%; margin: 1.2rem auto; }
.annot { width: 100%; text-align: center; margin-top: 0.5rem; margin-bottom: 3rem; }
</style>

<iframe height="170" src="https://xenpaper.com/#embed:110hz_220hz_440hz_880hz_1760hz%0A%7Br440hz%7D%0A1%2F4_1%2F2_1%2F1_2%2F1_4%2F1" title="Xenpaper" frameborder="0"></iframe>

We can also stack these notes without significantly changing the effect of the sound (besides timbral features like 'brightness', 'shrillness' or 'power'):

<iframe width="560" height="230" src="https://xenpaper.com/#embed:(1%2F2)%0A1%2F1%0A%5B1%2F1_1%2F2%5D%0A%5B1%2F1_1%2F2_2%2F1%5D%0A%5B1%2F1_1%2F2_2%2F1_4%2F1_1%2F4%5D" title="Xenpaper" frameborder="0"></iframe>

#### The octave is the 'simplest' just interval.

Recall that we can use [height functions](https://en.xen.wiki/w/Height) to evaluate mathematical simplicity.
For example, the Tenney Height/log product complexity of the octave: $$ \begin{align*} \text{let octave} = \frac{n}{d} = \frac{2}{1},\\\\ H_{tenney}(\text{octave}) &= \log_2(n \cdot d)\\ &= \log_2(2 \cdot 1)\\ &= 1 \end{align*} $$ Analogously, for a single interval between two notes, we can also look to the nth iterations of the [Farey sequence](https://en.wikipedia.org/wiki/Farey_sequence), or the [fundamental region](https://en.wikipedia.org/wiki/J-invariant#The_fundamental_region) of the j-invariant functions in the study of [modular forms](https://en.wikipedia.org/wiki/Modular_form). Modular forms have yet to be thoroughly studied musically, but I hypothesize that they will eventually yield a deeper understanding of intervals. ![Farey diagram](https://i.imgur.com/K48hUCV.png =500x) Fig 1: Farey sequence \ \ \ ![Fundamental domain j-invariant function](https://i.imgur.com/DAVbBTP.png =500x) Fig 2: fundamental domain of [$\text{PSL}(2, \mathbb{Z})$](https://en.wikipedia.org/wiki/Modular_group) acting on [$\mathcal{H}$](https://en.wikipedia.org/wiki/Upper_half-plane) \ \ Also recall that the measures of complexity with height functions do not correlate with perceived consonance of the interval. For reference, page 30 of [Simultaneous Consonance in Music Perception and Composition (Peter M.C. Harrison & Marcus T. Pearce, 2019)](https://files.de-1.osf.io/v1/resources/6jsug/providers/osfstorage/5c467104154ce50016e12802?format=pdf&action=download&direct&version=3) lists a whole host of consonance modeling techniques. #### Why octaves could be perceived to be the 'same note' :::warning Disclaimer: this section only holds true assuming we are using a harmonic timbre. A straightforward example would be the [saw wave](https://www.youtube.com/watch?v=A6NFknpJalA&ab_channel=TruncatedTriangle). 
:::

If you haven't already, have a look at the coinciding of partials in [_"Benedetti's Puzzle SOLVED"_ by Objective Harmony](https://youtube.com/clip/Ugkx0KTCax_yrk9zUqoWva0juRUE1TZwXByU). The videos on this channel supplement the general thoughts of this entire GMT paradigm.

Let's try to rationalise this: in harmonic timbres (like most instruments, basic waveforms and the human voice), we can assume there is frequency content at integer multiples of the fundamental frequency $f_0\text{ hz}$. The partials of a harmonic timbre are equal, or approximately equal, to the [harmonic series](https://www.youtube.com/watch?v=Wf0mZ42Kf7w&ab_channel=AlexanderChen), whereas the partials of an inharmonic timbre will be significantly out of tune with/different from the harmonic series.

If this is all new to you, [watch this video](https://www.youtube.com/watch?v=Wx_kugSemfY&ab_channel=ANDREWHUANG) by Andrew Huang, which gives a basic introduction to the concept of harmonics: that a single note is not made out of just one frequency.

:::warning
Note that there are slight differences between the terms _overtone_, _partial_, and _harmonic_:

overtones
: are specifically any tonal frequency content that is present above the fundamental frequency (hence, _over_).

partials
: are the set of overtones together with the fundamental frequency itself (the second partial is the first overtone).

harmonics
: are strictly the positive integer multiples (including 1) of the fundamental frequency. If a timbre is _inharmonic_, its overtones will not coincide with the expected _harmonics_.

The third harmonic of a note is equal to its third partial (and second overtone) if and only if the note is sounded with a harmonic timbre. The set of harmonics may not necessarily reflect the set of frequencies that are present in the overtones of a note.
Because we are assuming harmonic timbres in the eurocentric context, these terms have some overlap (we are concerned with the _overtones of harmonic timbres_).

Hopefully this scary Venn diagram helps to make this trichotomy clearer:

![](https://i.imgur.com/sgVLla6.png)
:::

OK, back to the topic of why octaves sound equivalent.

Let us assume that all overtones are present. Then, given a base frequency $f_0$, the set of all frequencies present, $F_\text{partials}$, is given by:

$$
F_\text{partials} = \{n \times f_0\ |\ n \in \mathbb{N}\}
$$

(i.e. all the frequencies $n$ times $f_0$ where $n$ is any positive integer in [$\mathbb{N}$](https://en.wikipedia.org/wiki/Natural_number)).

E.g., if $f_0 = 100\text{hz}$, then:

<iframe width="560" height="315" src="https://xenpaper.com/#embed:100hz_200hz_300hz_400hz_500hz%0A600hz_700hz_800hz_900hz_1000hz%0A1100hz_1200hz_1300hz_1400hz_1500hz%0A1600hz_1700hz_1800hz_1900hz_2000hz%0A%0A%23_alternative_notation%3A_(see_info_on_xenpaper)%0A%7Br100hz%7D%0A%7B1%3A%3A20%7D%0A1_2_3_4_5_6_7_8_9_10%0A11_12_13_14_15_16_17_18_19_20" title="Xenpaper" frameborder="0"></iframe>

:::warning
Note that the example above is non-exhaustive, and should theoretically continue until infinity (or the limit of your hearing range).
:::

Recall that the above sequence is also known as the musical [harmonic series](https://en.wikipedia.org/wiki/Harmonic_series_(music)).

These frequencies in the set $F_\text{partials}$ are all present in the sound of any harmonic timbre sounded at $f_0 = 100\text{hz}$.

Now let us look at the frequencies we get when we sound a harmonic timbre at $200\text{hz}$:

<iframe width="560" height="230" src="https://xenpaper.com/#embed:(env%3A4069)200hz_400hz_600hz_800hz_1000hz%0A1200hz_1400hz_1600hz_1800hz_2000hz_2200hz%0A2400hz_2600hz_2800hz_3000hz_3200hz" title="Xenpaper" frameborder="0"></iframe>

In fact, all these frequencies here are already contained within the harmonics of $100\text{hz}$!
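
As a quick sanity check of this claim, here is a minimal Python sketch that builds both sets of partials up to a cutoff and confirms that the 200hz note adds nothing new. It assumes an idealised harmonic timbre with every partial present. The helper name `partials` and the 3200hz cutoff (the highest frequency listed in the 200hz example) are illustrative choices, not anything prescribed by this article.

```python
def partials(f0_hz, cutoff_hz):
    """Return the set of partials n * f0_hz (n = 1, 2, 3, ...) up to cutoff_hz.

    Assumes an idealised harmonic timbre in which every partial is present.
    Integer frequencies are used so that set membership is exact.
    """
    return {n * f0_hz for n in range(1, cutoff_hz // f0_hz + 1)}


CUTOFF_HZ = 3200  # assumed cutoff, for illustration only

low = partials(100, CUTOFF_HZ)   # the note at 100hz
high = partials(200, CUTOFF_HZ)  # the note one octave above

# Every partial of the 200hz note is already a partial of the 100hz note.
print(high <= low)             # True

# Partials of the 100hz note that the 200hz note does not reinforce.
print(sorted(low - high)[:5])  # [100, 300, 500, 700, 900]
```

Swapping the two notes around (`low <= high`) gives `False`, which is the asymmetry explored later when comparing $2:1$ with $1:2$.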
We can easily see that every second frequency in $\{n \times f_0\ |\ n \in \mathbb{N}\}$ coincides with every frequency in $\{n \times 2f_0\ |\ n \in \mathbb{N}\}$. On average, half the overtones coincide, and all overtones of the $2f_0$ set are contained within the $f_0$ set. Of course, this is bounded by human hearing — if the bounds are unlimited then I would get flamed by the [$\aleph_0$](https://en.wikipedia.org/wiki/Aleph_number) pedants because both sets have the same size blah blah blah. The concept of coinciding partials is not new at all, and in academic literature, goes by the name of [Tonal fusion](https://www.jstor.org/stable/40285526), 'melding', or [Tonverschmelzung](https://www.amazon.de/-/en/Wilhelm-Kemp/dp/B000QY59GK), amongst other names. For the purposes of this article, we shall refer to this phenomenon as coinciding partials since it is a self-explanatory definition. #### The octave is 2-limit \begin{align} \text{p-limit}\left(\frac{2}{1}\right) &= \text{p-limit}\left(\frac{2^1}{2^0}\right)\\ &= 2 \end{align} The [p-limit](https://en.xen.wiki/w/Harmonic_limit) measures the largest prime number that an interval or piece of music requires in its prime factorization. The octave interval only uses primes up to 2 in its [prime factor decomposition](https://www.mathsisfun.com/prime-factorization.html), which is the first prime. ##### p-limit intuition: Knowing what limit an interval lives in helps when trying to conceptualize the interval spatially: - There is no way raising/lowering a pitch by any number of octaves can allow one to arrive at a different note that is not simply n octaves away. - Mathematically, there is no way any power of repeated multiplication or division by 2 will allow one to arrive at some multiple of 3, 5 or any other prime number. 
- Thus, we can draw a straight 1-dimensional line in space along the up/down axis, and say that along this infinitely extended straight line, we have all the powers of 2, which musically speaking, contains all octaves of the same note. - Let's say we add the next interval, the perfect fifth which would be $3:2$. Now we cannot represent this note along that 1D line, as it contains a power of 3, making it a 3-limit interval. - But, we can expand our 1D line into a 2D plane using the left/right axis. Along this new axis, we have all the powers of 3. - Now we can represent the $3:2$ as $3^1 \cdot 2^{-1}$: which we can think as 1 unit to the right (+1 power of 3) and 1 unit downwards (-1 power of 2) - Now on this infinitely extending 2D plane, we have all the 3-limit intervals that can be formed by combining octaves and fifths. ##### Monzo intuition: Thus we can use the p-limit of an interval or piece of music to define how many 'dimensions' it lives in. Note that we can represent this spatial position as a 'coordinate' using a [Monzo](https://en.xen.wiki/w/Monzo). For example, $3:2$ could be represented as the monzo $[\ -1\ \ 1>$ to mean 1 step backwards in the 2-prime octave dimension, and 1 step forward in the 3-prime 'tritave' (3rd harmonic, octave + fifth) dimension. If some music is 11-limit, we could say it could be spatially represented in up to 5 dimensions, if all the primes 2, 3, 5, 7, and 11 are utilized in the intervals. If certain primes are not utilized, or if octaves (prime-2) are assumed equivalent, then you can omit those primes from the spatial representation like this 11-limit 2-dimensional example by mannfish — [Enchantment Under the Sea](https://www.youtube.com/watch?v=FXRh-Tr62Aw&ab_channel=mannfishh). To give an extreme example, here's one of my favorite xenharmonic works by [Zhea Erose - WXTCHCRXFT](https://www.youtube.com/watch?v=a63V_fAPNaA&ab_channel=ZheannaErose). 
Although it wasn't conceived with p-limit in mind, the piece technically has a p-limit of 389 (77 dimensions!). Instead, [near-equal just-intonation](https://en.xen.wiki/w/Neji) and [primodality](https://en.xen.wiki/w/Primodality) are used to construct the tuning system. #### Is $2:1$ functionally equivalent to $1:2$? > _Is the [dyadic](https://en.wikipedia.org/wiki/Dyad_(music)) interval formed one octave up from 100hz equivalent to the dyad formed one octave down from 200hz?_ This seems like a redundant question either way you look at it, since: 1. $2:1 \neq 1:2$ 2. both dyads in question are the same set of frequencies However, there's a nuanced asymmetry in the duality when applied to conscious perception with the element of time. When we start with a [saw wave](https://www.youtube.com/watch?v=A6NFknpJalA&ab_channel=TruncatedTriangle) sounding at 100hz, then sound a pitch an octave above at 200hz, there are fundamentally no new frequencies to be heard. Sure, the overtones of 200hz will resound louder than before, but they were already there in the first place. :::spoiler open pedantic disclaimer Of course, if very specific timbres are used, like the square or triangle waveforms which only has odd harmonics present in its partials, then this argument does not hold true. Yet again, one could counter-argue that the non-linearity of the ear makes it such that we can perceive ghost/psychoacoustical even harmonics even if none are present, or you can even blame the non-linearity/subtle distortion of digital-analog-converters or speaker amplifiers. Or combination tones. (There are links to all these topics in a later section, I'm lazy to re-link them, but you can google these terms) Whatever the argument is, all these alternative modeling of perception taking into account psychoacoustical and other more tangible factors all agree that even with only odd harmonics present, you can arrive at even harmonics. 
::: \ Now, from the perspective of starting with the 200hz pitch, then sounding the 100hz pitch: the latter action in fact adds (infinitely) many new pitches to the overall set of pitches, since 100hz, 300hz, 500hz, 700hz... are not contained within the initial set of pitches. The final collection of pitches and amplitudes in both cases are the same, but the relatively perceived action is different. Consider the action of doubling octaves to raise intensity and fullness of a melody or bassline: Would you double it an octave lower or octave above? Would you factor in tessitura and hearing range in making that decision? Which interval is simpler, 1/2 or 2/1? Were your thoughts reflected in the mathematical models of height functions, [critical bandwidth dissmeasure](https://gist.github.com/endolith/3066664), or [harmonic entropy](https://en.xen.wiki/w/Harmonic_entropy)? This [otonal-utonal](https://en.wikipedia.org/wiki/Otonality_and_Utonality) dichotomy comes up even as we continue through the other intervals. _All this time, we have only been talking about the same notes,_ yet not enough has been said about the octaves. Later concepts discussed here can be retroactively applied to earlier constructs, so feel free to come back to this section, but it is time to move on: ### On the $\text{fifth} = 3:2$ There's a lot going for this very special interval in the music of many traditions. Here's an overview in case you get lost: :::success - The fifth is the second 'simplest' interval - Unlike the octave, if a fifth of a note is sounded, the fifth will contribute some new frequencies to the set of all frequencies present in the sound. 
- The more consecutive fifths are stacked, the less the relative contribution to the amount of frequency information - Stacked fifths are nice consonant structures for harmonic timbres and are used as a topological (spatial) framework for understanding/mapping most of western harmony - 12 fifths almost equal to 7 octaves - 53 fifths almost equal to 31 octaves with lesser error - 359 fifths almost equal to 210 octaves with lesser error still - 665 fifths almost equal to 389 octaves... ::: #### The second simplest interval <iframe height="170" src="https://xenpaper.com/#embed:%7Br440hz%7D%0A3%3A2---" title="Xenpaper" frameborder="0"></iframe> > more accurately: the second [superparticular](https://en.wikipedia.org/wiki/Superparticular_ratio) interval. It was written of [more than 2300 years ago](https://www.jstor.org/stable/920859) that 12 of them could _almost_ form a perfect circle, with a small error known as the [Pythagorean comma](https://en.wikipedia.org/wiki/Pythagorean_comma) > > aka [perfect fifth](https://en.wikipedia.org/wiki/Perfect_fifth) If you've committed to reading this article till here, I can assume there isn't much to introduce about the role of fifths in music of the european tradition. So I will list some well-known conceptions/uses of fifths in point form: - Venetian Tonic-Dominant relationship - [Circle of fifths](https://ledgernote.com/columns/music-theory/circle-of-fifths-explained/), key centers - Chord extensions - Modal/harmonic distance Let's go straight to making some observations, keeping in mind the previously discussed analytical methods and introspective inquiries. #### Musical example Mannfish has really good videos demonstrating properties of certain just intervals. 
Here's one on the pure fifth: <iframe width="560" height="315" src="https://www.youtube.com/embed/-6VEu64x4zU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> #### The fifth is $3:2$ $$ \begin{align} H_{tenney}\left(\frac{3}{2}\right) &= \log_2(3\cdot 2)\\ &\approx 2.58496...\\ \\ \text{p-limit}\left(\frac{3}{2}\right) &= \text{p-limit}\left(\frac{3^1}{2^1}\right)\\ &= 3 \end{align} $$ It is the interval formed between the 3rd and 2nd partials of the harmonic series. That also means that any note sounded with a harmonic timbre will also contain its own fifth within its spectra. We can use the same partial coinciding analysis to see that given a root-fifth power chord, $$ F_\text{partials}(\text{root}) = \{n \times f_0\ |\ n \in \mathbb{N}\} \\ F_\text{partials}(\text{fifth}) = \{n \times \frac{3}{2}f_0\ |\ n \in \mathbb{N}\} $$ Thus, we can see that every third harmonic of the $\text{root}$ coincides with every second harmonic of the $\text{fifth}$. Setting $f_0 = 100$ we get: <iframe width="560" height="170" src="https://xenpaper.com/#embed:100hz_200hz_300hz_400hz_500hz_600hz%0A___150hz____300hz____450hz____600hz" title="Xenpaper" frameborder="0"></iframe> [Just like we did with the octaves](#Is-21-functionally-equivalent-to-12), we can analyse this both ways. From the perspective of starting with the $\text{root}$ and adding the $\text{fifth}$: - every 3rd harmonic of the $\text{root}$ is reinforced/coincided with - for every 3 harmonics of the $\text{root}$, we get one new frequency not previously present within that interval. 
- 150hz between $(0,300)$ - 450hz between $(300,600)$ From the perspective of starting with the $\text{fifth}$ and adding the $\text{root}$: - every 2nd harmonic of the $\text{fifth}$ is reinforced/coincided with - for every 2 harmonics of the $\text{fifth}$, we get 2 new frequencies - 100hz & 200hz between $(0,300)$ - 400hz & 500hz between $(300,600)$ By now, a pattern should be evident, and we can reify our intuition by creating two metrics: Given two notes $A$ and $B$, such that $A$ is already present in the [Deutsch](https://deutsch.ucsd.edu/psychology/pages.php?i=209)-[Schenkerian](https://en.wikipedia.org/wiki/Schenkerian_analysis)-[Lerdahlian](https://www.researchgate.net/publication/257925384_Tonal_Pitch_Space) _heuristic tonal pitch space_, and $B$ is sounded after, and perceived relatively to $A$: We can define the _coincidence_ metric for measuring the fraction of partials coincided among the partials of $A$: $$ P_\text{coincidence}(A|B) = \frac{|F_\text{partials}(A) \cap F_\text{partials}(B)|}{|F_\text{partials}(A)|} $$ And the _newinfo_ metric for the average number of new pitches present in $B$ for each partial of $A$: $$ P_\text{newinfo}(A|B) = \frac{|F_\text{partials}(B) - F_\text{partials}(A)|}{|F_\text{partials}(A)|} $$ > Test your understanding of the above: can you calculate these metrics for upwards octave interval $2/1$? > :::spoiler Answer: > \ > $\large P_\text{coincidence} = \frac{1}{2}$ > > $\large P_\text{newinfo} = 0$ > ::: And before set theorists scream at me, recall we redefined the partials to only include those within human hearing range. Heuristically, this range could be reduced to 30hz-5000hz as [pitch perception degrades](https://pressbooks.umn.edu/sensationandperception/chapter/pitch-perception/) due to limitations of the [auditory nerve](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6573645/#:~:text=These%20values%20indicate%20that%2C%20as,those%20at%20the%20highest%20frequencies.), among other things. 
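
The two metrics above can be sanity-checked by brute force. What follows is a minimal Python sketch, not part of the original method: the names `partials`, `coincidence` and `newinfo`, and the cutoff `f_max` (standing in for the upper limit of the hearing range), are assumptions made for illustration. The ratios come out exact only when `f_max` is a common multiple of both fundamentals; for other cutoffs they are approximations.

```python
from fractions import Fraction


def partials(f0, f_max):
    """Set of harmonic partials of f0 (in hz) up to and including f_max.

    f_max is an assumed stand-in for the upper limit of the hearing range.
    """
    f0 = Fraction(f0)
    return {f0 * n for n in range(1, int(f_max / f0) + 1)}


def coincidence(a, b, f_max):
    """Fraction of the partials of A that coincide with a partial of B."""
    pa, pb = partials(a, f_max), partials(b, f_max)
    return Fraction(len(pa & pb), len(pa))


def newinfo(a, b, f_max):
    """Number of partials of B not already in A, per partial of A."""
    pa, pb = partials(a, f_max), partials(b, f_max)
    return Fraction(len(pb - pa), len(pa))


# Root = 100hz, fifth = 150hz, cutoff chosen as a multiple of 300hz.
print(coincidence(100, 150, 6000), newinfo(100, 150, 6000))  # 1/3 1/3
# Same dyad, perceived from the fifth downwards to the root.
print(coincidence(150, 100, 6000), newinfo(150, 100, 6000))  # 1/2 1
# Upwards octave.
print(coincidence(100, 200, 6000), newinfo(100, 200, 6000))  # 1/2 0
```

Exact `Fraction` arithmetic is used so that coinciding partials compare equal without floating point error.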
This approach uncovers the dualistic, yet asymmetric properties of the otonal-utonal dichotomy. It demonstrates the relationship between perceiving intervals upwards or downwards relative to the 'tonal centre', and takes into account how the subjective tonal pitch space can affect that perception. Revisit the [dichotomy of the Dominant and Subdominant](https://hackmd.io/@euwbah/gen-music-theory-1#Tonality), and observe the correlations.

The _coincidence_ and _newinfo_ metrics prove to be useful in analysing cultural patterns in 12ed2 music of the european tradition, and continue to be helpful for concepts that involve lower-limit primes ($\lessapprox$ 11-limit).

Note that these formulas do not answer the question of how 'tonic' is perceived or conceptualised. For that, we have to look to cultural entrainment and other places where asymmetry arises.

:::success
##### For advanced readers

The above metrics can be made smooth and useful for all intervals by adapting [Sethares' Dissmeasure](https://gist.github.com/endolith/3066664) function. The original Plomp & Levelt [critical band](https://www.sfu.ca/sonic-studio-webdav/handbook/Critical_Band.html) roughness curve can be split into two curves at its maximum, such that the initial upwards curve segment (increasing in roughness) becomes the inverse of the 'note similarity' curve, and the curve segment decreasing in roughness becomes the inverse of the 'newness' curve. Both curves are then convolved with segments of the Gaussian distribution (coefficients to be experimentally optimised) to 'crossfade' the curves, and each curve can be plugged into Sethares' process in place of the roughness curve to arrive at the _coincidence_ and _newinfo_ scores independently. As always, implementation is left as an exercise to the reader.
:::

#### The fifth is simple enough to be stacked.
<iframe width="560" height="315" src="https://xenpaper.com/#embed:(1%2F2)%7Br200hz%7D%0A%5B1%2F1_3%2F2%5D_%0A%5B1%2F1_3%2F2_9%2F4%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256_2187%2F512%5D" title="Xenpaper" frameborder="0"></iframe> Just as how octaves can be stacked upwards without creating new frequency content, we have seen that an added fifth above creates 1 new frequency for every 3 harmonics of the original root that wasn't present, $P_\text{newinfo} = \frac{1}{3}$. Since a fifth was introduced, let's now consider our 'tonal space' to be the set of frequencies in the partials of $A = 200\text{hz}$ and $B = 300\text{hz}$, for this we define $F_\text{partials}(A)\ \cup\ F_\text{partials}(B) = \{200, 300, 400, 600, 800, 900, 1000,...\}$ For the sake of brevity, let us define $F_\text{partials}()$ to be a polyadic function so we can represent the above as $F_\text{partials}(A, B, ...)$. Now let us stack yet another fifth: $C = \frac{3}{2}\times 300 = 450\text{hz}$ (this is represented by the `9/4` interval in the example above) This yields the partials $F_\text{partials}(C) = \{450, 900, 1350, ...\}$. Now, we can see that for every 6 partials in $F_\text{partials}(A, B)$, we get 1 new frequency from $F_\text{partials}(C)$. E.g., between the first 6 partials $\{200, 300, 400, 600, 800, 900\}$, we get 450hz as new information, and between the next 6 partials $\{1000, 1200, 1400, 1500, 1600, 1800\}$, we get 1350hz as new information. This happens periodically for every 6 partials of the set $F_\text{partials}(A, B)$. 
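
The counting in the paragraph above can be reproduced by brute force. This is a rough Python sketch under stated assumptions: `partials` and `newinfo_chord` are hypothetical helper names, and `f_max` is an assumed cutoff chosen as a common multiple of the periods involved so that the ratios come out exact.

```python
from fractions import Fraction


def partials(f0, f_max):
    """Set of harmonic partials of f0 (in hz) up to and including f_max."""
    f0 = Fraction(f0)
    return {f0 * n for n in range(1, int(f_max / f0) + 1)}


def newinfo_chord(existing, new, f_max):
    """New partials contributed by `new`, per partial already present.

    `existing` is the list of fundamentals already sounding; their merged
    partials play the role of F_partials(A, B, ...).
    """
    tonal_space = set().union(*(partials(f, f_max) for f in existing))
    added = partials(new, f_max) - tonal_space
    return Fraction(len(added), len(tonal_space))


# A = 200hz alone, then adding the fifth B = 300hz.
print(newinfo_chord([200], 300, 9000))        # 1/3
# A and B sounding, then adding the next fifth C = 450hz.
print(newinfo_chord([200, 300], 450, 9000))   # 1/6
```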
Thus, adding the ninth ($C = 9/4$) with respect to an existing root-fifth chord ($\text{cons}(A, B) = 3:2$), we have the _newinfo_ metric:

$$
P_\text{newinfo}(A, B\ |\ C)=\frac{1}{6}
$$

Not surprisingly, adding the second stacked fifth adds relatively fewer new frequencies compared to the size of the set of available frequencies.

The _newinfo_ metric can be seen as an 'impact' or 'surprise' metric (if there are many new frequencies, our ears would be 'startled'). Thus, we can understand $P_\text{newinfo}(A, B\ |\ C) < P_\text{newinfo}(A\ |\ B)$ to mean that the impact of adding a second fifth above a dyad is less than the impact of adding a single fifth above a single fundamental.

:::warning
Note: the concepts presented in the last two sections are absolutely crucial to understanding the rest of this article. In summary, you should be able to understand that:

- A single resounding 'note' comprises many partials: these are other frequencies present that constitute the note's timbre, according to the harmonic series
- You can construct a set (a list of unique items) of all these partials' frequencies
- Given an existing note, $X$, that is sounding (or virtually sounding in the pitch memory), you can measure the effect of adding a new note, $Y$, on top of the existing note by merging the set of partials of $Y$ into the set of partials of $X$. (This is the [union $\cup$](https://en.wikipedia.org/wiki/Union_(set_theory)) operation)
- By seeing how many new frequencies are added to the set after the union, you heuristically calculate how much new information the ear has to take in when processing that stimulus.
- In reality, there are infinitely many frequencies, but we are concerned with measuring the relative increase of density of harmonic information. This would be given by the $P_\text{newinfo}$ metric.
- Now that two notes are already present, you can keep going, and evaluate the effect of adding yet another note $Z$ by noting the change in density of harmonic information caused by merging the partials of $Z$ into the already merged set of partials from $X \cup Y$.
- Rinse and repeat

If you can intuitively understand this, you should be able to evaluate these metrics for any interval yourself by working out the sets of frequencies manually, though it quickly becomes tedious. Hence, in the following section, we try to abstract out the repetitive steps into a simple one-line formula.
:::

#### Fifth fractal math

:::warning
Warning: math ahead, skip to next section if [proof by intimidation](https://www.explainxkcd.com/wiki/index.php/982:_Set_Theory) suffices for you.
:::

We can fully work out the above pattern. Honestly, it is hard to write it out in a concise way, so it would be better if you could work it out on your own, but here goes:

First, it is trivial that for any $3:2$ fifth $[A, B]$, we have:

$P_\text{coincidence}(A|B) = \frac{1}{3}$

$P_\text{newinfo}(A|B) = \frac{1}{3}$

##### Def. coincidence & newinfo metrics

:::info
For any just interval formed between $A$ and $B$:

Let $n$ and $d$ be positive integers such that $\large \frac{n}{d} = \frac{B\text{ hz}}{A\text{ hz}}$. Then,

$$
\begin{align}
P_\text{coincidence}(A|B)\ \ =&\ \ \frac{d}{\text{lcm}(n, d)} \\
\\
P_\text{newinfo}(A|B)\ \ =&\ \ \frac{d}{\text{lcm}(n,d)}\cdot\left(\frac{\text{lcm}(n,d)}{n} - 1\right)\\
=&\ \ \frac{d}{n} - \frac{d}{\text{lcm}(n, d)}
\end{align}
$$

If you have constructed equations/algorithms for the [generalized coincidence and newinfo](#For-advanced-readers) metrics that do not require just intervals and account for the psychoacoustic 'auto-correcting' of [slight tuning errors](https://www.researchgate.net/publication/49721885_Enhanced_brainstem_encoding_predicts_musicians'_perceptual_advantages_with_pitch), you can substitute them in place of the above.
::: \ The partials of $B$ coinciding with that of $A$ happens periodically every $\frac{1}{P_\text{coincidence}}$th partial of $A$. Thus, we can say that the overall density of unique partials in $F_\text{partials}(A, B)$ is increased by a factor of $P_\text{newinfo}(A|B)$. Now it gets confusing: as shown previously, adding the note $3/2$ adds 1 new partial in the interval-span of every 3 partials of the existing fundamental $1/1$. Which means, within that same periodic interval-span of 3 partials, we have 4 unique partials coming from both notes $1/1$ and $3/2$ (incr. by a factor of $\frac{1}{3}$). This is what it means to have the density of partials increase by a factor of $\frac{1}{3}$, which is the exact metric of $P_\text{newinfo}$. Adding the note $9/4$ adds 1 new partial in the period of every 3 existing partials of $3/2$. Within the interval-span of every 2 partials of $3/2$ (which is analogous to the interval of 3 partials of $1/1$) there are 4 partials (since adding the $3/2$ itself increased it from 3 to 4). However, we are concerned with the interval-span of every ==3 partials of $3/2$==, not 2. So, we interpolate the number of partials to the larger interval, keeping the density consistent. The $9/4$ interval is $3/2$ as large as $3/2$, so we scale the number of partials by a factor of $3/2$ to arrive at a total of 6 unique partials (from both $1/1$ and $3/2$) that appear within the interval-span of "3 partials of $3/2$". With that information, recall at the start of this paragraph that adding the note $9/4$ adds 1 new partial in this interval-span, so instead of 6 partials every period, we now have 7. Here we say $P_\text{newinfo} = \frac{1}{6}$, since every 6 existing partials, 1 new one is added. Now we continue this pattern with a fully worked out table, hopefully making the process clearer. To make the math easier, let us assume the root note is 8hz, yielding the harmonics 8hz, 16hz, 24hz, 32hz, 40hz, etc. 
The fifth of the root note will be $3/2 \times 8 = 12$ hz, and so on. ##### Table 1: _newinfo_ impact of stacking consecutive fifths ```csvpreview {header="true"} num of stacked 5ths, existing partials in span, new partials per span, newinfo, list of freqs present within 'span' 0 (root), infinity, infinity, infinity, 8 16 24 32... 1, 3, 1, 1/3, 8 [12] 16 24 2, 3/2 * (3+1) = 6, 1, 1/6, 8 12 16 [18] 24 32 36 3, 3/2 * (6+1) = 10.5, 1, 2/21, 8 12 16 18 24 [27] 32 36 40 48 54 etc... 4, 3/2 * (10.5 + 1) = 17.25, 1, 4/69, im lazy 5, 3/2 * (17.25 + 1) = 27.375, 1, 8/219, no ``` Existing partials in span : the number of unique partials (within each 'span') that have been cumulatively added as the fifths are stacked. : this is referred to below as $p_k$ where $k$ is the number of times the interval is stacked new partials per span : the number of new frequencies within each 'span' caused by the addition of this row's new fifth. : this is given by the constant $\text{newpart}$ span : a span is an arbitrary interval such that each span contains only one coinciding partial between the newly added fifth and previous fifth/fundamental. The specifically chosen periodic interval of the span allows us to calculate the 'density' of harmonic content within the span, and confidently extrapolate that density throughout upper harmonics. 
: specifically, given root note $f_0 \text{ hz}$, the initial span of the $k$th stacked $n/d$ interval is: $\text{span}(0) = \mathopen{\biggl(}0\text{ hz},\quad p_1 \cdot (n/d)^{k-1}\cdot f_0\text{ hz}\mathclose{\biggr]}$ and subsequent spans are given by: $\text{span}(m) = \text{span}(0) + m\cdot |\text{span}(0)|$ $\quad (\forall m \in \mathbb{N}_0)$. For example, with $f_0 = 8$ and the $3/2$ fifth, the initial spans for $k = 1, 2, 3$ end at 24hz, 36hz and 54hz respectively, matching the lists of frequencies in the table.

With these variables, we can represent this iterative tabulation succinctly by solving a recurrence relation, which will allow us to apply this analysis to other just intervals later on:

the initial value is the reciprocal of the [coincidence metric](#Def-coincidence-amp-newinfo-metrics):

$$
p_1 = \frac{1}{P_\text{coincidence}(A|B)}
$$

and subsequent values are given by:

$$
\begin{align}
p_k &= \frac{n}{d}\times (p_{k-1} + \text{newpart}) \\
&= p_1\cdot {\left(\frac{n}{d}\right)}^{k-1} + \frac{\text{newpart}\cdot\left(n-\cfrac{n^k}{d^{k-1}}\right)}{d - n} & \text{(solving recurrence relation)}
\end{align}
$$

where $\frac{n}{d}$ is a constant representing the interval that is being repeatedly stacked (in this case, $3/2$), and $\text{newpart}$ is a constant representing the number of new partials per span, given by: [$\cfrac{P_\text{newinfo}(A|B)}{P_\text{coincidence}(A|B)}$](#Def-coincidence-amp-newinfo-metrics)

Finally, to evaluate the relative increase in frequency density by the action of adding the $k$th stacked interval on top of the chord of $k - 1$ prior notes, we have the $\text{newinfo}$ metric:

$$
\text{newinfo}(k) = \frac{\text{newpart}}{p_k}
$$

To give an example of applying the formula to measure the $\text{newinfo}$ metric of stacking the $3:2$ interval $k$ times:

\begin{align}
n &= 3\\
d &= 2\\
P_\text{newinfo}(3|2) &= 1/3\\
P_\text{coincidence}(3|2) &= 1/3\\
\text{newpart} &= \frac{P_\text{newinfo}(3|2)}{P_\text{coincidence}(3|2)}\\
&= 1
\end{align}

\begin{cases}
p_k &= \frac{n}{d} \cdot (p_{k-1} + \text{newpart})\\
&= \cfrac{3}{2} \cdot (p_{k-1} + 1) \\
\\
p_1 &= \cfrac{1}{P_\text{coincidence}}\\
&= 3
\end{cases}
Let's calculate $\text{newinfo}$ for the 3rd stacked fifth:

\begin{align}
p_k &= p_1\cdot {\left(\frac{n}{d}\right)}^{k-1} + \frac{\text{newpart}\cdot\left(n-\cfrac{n^k}{d^{k-1}}\right)}{d - n}\\
p_3 &= 3\times {\left(\frac{3}{2}\right)}^2 + \frac{3-3^3/2^2}{2 - 3}\\
&= \frac{27}{4} + \frac{15}{4}\\
&= 10.5\\
\\
\text{newinfo}(3) &= \frac{\text{newpart}}{p_3}\\
&=\frac{1}{10.5}\\
&= \frac{2}{21} \hspace{3cm}\blacksquare
\end{align}

The above result agrees with the [tabulation](#Table-1-newinfo-impact-of-stacking-consecutive-fifths). It works!

The math nerd reading this will probably be puking blood because we are dealing with sets of undefined length and trying to define some sort of pseudo measure of size on a discontinuous non-differentiable set. There is a more rigorous, but analogous way to justify this analysis using [Lebesgue measures and measure theory](https://www.youtube.com/watch?v=xZ69KEg7ccU&ab_channel=TheBrightSideofMathematics), but that is left as an exercise to the reader. This is intended to be a music article after all.

:::success
Relax. Take a break. Go touch some grass.
:::

#### The point is: fifths, like octaves, are good for stacking.

All the math above is proof that: fifths like to be stacked!

As we stack more fifths, each consecutively added fifth has an _exponentially_ diminishing relative contribution to the amount of unique frequency information ($\text{newinfo}$); refer to [Table 1](#Table-1-newinfo-impact-of-stacking-consecutive-fifths).

Graph of decreasing impact of subsequent stacked fifths:

![](https://i.imgur.com/EScxjyQ.png)

Of course, we didn't take into account the amplitudes of the partials present, and we assumed all partials extend to the limits of the human hearing range with equal amplitude. In reality, all timbres have partials that taper off after a certain number of harmonics, and some timbres have missing/less harmonic content than others.
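
Before moving on to timbre, the recurrence and closed form from the fifth fractal math section can be checked numerically. This is a minimal Python sketch, assuming the definitions given in that section; the function names `stack_newinfo` and `closed_form_p` are made up for illustration, and the closed form assumes $n \neq d$.

```python
from fractions import Fraction
from math import gcd


def stack_newinfo(n, d, k_max):
    """newinfo(k) for k = 1..k_max when repeatedly stacking the interval n/d."""
    ratio = Fraction(n, d)
    lcm = n * d // gcd(n, d)
    p_coincidence = Fraction(d, lcm)
    p_newinfo = Fraction(d, n) - p_coincidence
    newpart = p_newinfo / p_coincidence   # new partials per span
    p = 1 / p_coincidence                 # p_1
    result = [newpart / p]
    for _ in range(2, k_max + 1):
        p = ratio * (p + newpart)         # p_k = (n/d) * (p_{k-1} + newpart)
        result.append(newpart / p)
    return result


def closed_form_p(n, d, k):
    """p_k from the solved recurrence relation (requires n != d)."""
    lcm = n * d // gcd(n, d)
    p_coincidence = Fraction(d, lcm)
    newpart = (Fraction(d, n) - p_coincidence) / p_coincidence
    p1 = 1 / p_coincidence
    tail = newpart * (n - Fraction(n) ** k / Fraction(d) ** (k - 1)) / (d - n)
    return p1 * Fraction(n, d) ** (k - 1) + tail


print([str(x) for x in stack_newinfo(3, 2, 5)])
# ['1/3', '1/6', '2/21', '4/69', '8/219']
print(closed_form_p(3, 2, 3))  # 21/2
```

The printed sequence matches the _newinfo_ column of Table 1.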
Take for example these pure [sine waves](https://en.wikipedia.org/wiki/Sine_wave) playing stacked fifths, which theoretically have no overtones:

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(1%2F2)(osc%3Asine)%7Br200hz%7D%0A%5B1%2F1_3%2F2%5D_%0A%5B1%2F1_3%2F2_9%2F4%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256_2187%2F512%5D" title="Xenpaper" frameborder="0"></iframe>

Compare that with a [sawtooth wave](https://en.wikipedia.org/wiki/Sawtooth_wave), which has overtones.

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(1%2F2)(osc%3Asawtooth)%7Br200hz%7D%0A%5B1%2F1_3%2F2%5D_%0A%5B1%2F1_3%2F2_9%2F4%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256_2187%2F512%5D" title="Xenpaper" frameborder="0"></iframe>

Note how every consecutively added stacked fifth on the sine timbre has about the same impact on the overall density of harmonics present. You could say there are no 'diminishing returns', since the lack of overtones in a sine wave means that there are no additional partials to coincide with. Adding the fifth of a root will add only one more unique frequency (itself) to the set of sounds. Adding one more fifth will still add only one more unique frequency. The impact of adding each fifth is pretty much unchanging.

Performing the above recurrence $\text{newinfo}(k)$ analysis on the sine timbre in fact yields the sequence $\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \frac{1}{4},\frac{1}{5},...$. It is surely decreasing, but nowhere near as fast as if a saw or other harmonic timbre were to be used.
:::spoiler _Disclaimer: sine waves aren't truly 'pure'._
The [non-linearity of your ears](https://agilescientific.com/blog/2014/6/9/the-nonlinear-ear.html), [speaker](https://www.klippel.de/fileadmin/_migrated/content_uploads/Loudspeaker_Nonlinearities%E2%80%93Causes_Parameters_Symptoms_01.pdf), [digital-to-analog converter](https://www.allaboutcircuits.com/technical-articles/understanding-dnl-and-inl-specifications-of-a-digital-to-analog-converter/), etc... all cause subtle additional harmonics/distortion that can still be perceived.
:::
\
Contrast that with the saw wave version. The above table/recurrence relation calculations apply directly to saw waves, since saw waves contain all the harmonics: all integer multiples of the fundamental pitches are present.

This may be subjective, but generally you should be able to feel that in the saw wave example, the textural complexity/frequency information density increases by diminishing amounts as subsequent fifths get stacked, whereas the information density of the sine wave example increases by constant amounts as the fifths stack.

Now to really nail in the idea of how timbre affects the applicability of using coincidence of partials as a metric of interval stackability, here is a final example with [square waves](https://en.wikipedia.org/wiki/Square_wave):

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(1%2F2)(osc%3Asquare)%7Br200hz%7D%0A%5B1%2F1_3%2F2%5D_%0A%5B1%2F1_3%2F2_9%2F4%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256%5D%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F32_243%2F64_729%2F256_2187%2F512%5D" title="Xenpaper" frameborder="0"></iframe>

How much does each added fifth impact the density of information as compared to the sine wave or saw wave example?

Technically speaking, square waves only contain odd harmonics, i.e.
the frequencies present in the partials of a square wave sounded at 100hz will only contain the overtones 300hz, 500hz, 700hz, 900hz, etc... When you try to analyse which partials of a fifth would coincide with those of the root, you will realise that none of them actually do: the 2nd partial of the fifth would coincide with the 3rd partial of the root, but there is no 2nd partial in a square wave.

Try to work out the $\text{newinfo}$ metric or $F_\text{partials}$ sets for the square timbre and see if you can find an explanation for how you would rank the $\text{newinfo}$ 'impact' of each subsequent fifth when using the square wave timbre.

#### Stacked fifths in 'music theory'

We can draw a line of reasoning between:

- the pervasiveness of the fifth in music of the western tradition
- the harmonic timbres of the instruments of the western tradition
- the stackability of fifths in harmonic timbres

So it shouldn't come as a surprise that in contemporary chord-scale theory we have:

- C major-13 $\{C, E, G, B, D, F^\#, A\}$ is in fact just C and E with stacked fifths built: C-G-D-A, and E-B-F#.
- C min-13 $\{C, Eb, G, Bb, D, F, A\}$ is just C and Eb with the chains C-G-D-A and Eb-Bb-F.
- C13sus4 $\{C, G, Bb, D, F, A\}$ is Cmin13 without the third.
- The C lydian scale is 7 consecutive fifths starting from C.
- The C major scale is 7 consecutive fifths starting from F: F-C-G-D-A-E-B.
- C mixolydian is 7 consecutive fifths starting from Bb.
- C dorian is 7 fifths from Eb.
- etc...

The [Lydian Chromatic Concept of Tonal Organization](https://www.thejazzpianosite.com/jazz-piano-lessons/modern-jazz-theory/lydian-chromatic-concept/) is a theoretical framework for contemporary harmony by George Russell that finds its basis on the pervasiveness of fifths. To some, this represents completion of contemporary music theory and encompasses all applicable functional harmony that can be extrapolated from the science.

However, we're only just getting started.
#### Stacked fifths have to be stacked

The diminishing of the $\text{newinfo}(k)$ sequence, which measures the increase in harmonic information per $k$-th stacked interval, only happens because of the increasing density of harmonic content already present in the base 'tonality'.

This argument cannot be used to justify freely using dyads which are arbitrarily many fifths apart without sounding the intermediate fifths that were stacked on the way there.

In the following example, note how the first dyad sounds relatively out of place, but the second chord, which uses the same dyad as its lowest and highest notes, sounds 'correct' now that it is given its appropriate context (in the form of harmonic content of the stacked fifths):

<iframe width="560" height="200" src="https://xenpaper.com/#embed:%5B1%2F1_243%2F64%5D------..%0A%0A%5B1%2F1_3%2F2_9%2F4_27%2F16_81%2F64_243%2F64%5D------" title="Xenpaper" frameborder="0"></iframe>

In general, for a set of notes to have intended tonal structure, there has to be an approximate interpretation of the notes in just intonation that is within humanly perceptible complexity.

Here are 4 intervals that could represent a major third:

- $81:64$ going up 4 fifths and down 2 octaves (3-limit Pythagorean major third)
- $2^{4/12}:1$ exactly one third of an octave (4 semitones of 12 edo)
- $2^{13}:3^8$ going down 8 fifths and up 5 octaves (3-limit diminished 4th, flat-4)
- $5:4$ the classic 5-limit major third.
<iframe width="560" height="250" src="https://xenpaper.com/#embed:%5B0_81%2F64%5D--------%0A%5B0_4%5C12%5D--------%0A%5B0_8192%2F6561%5D--------%0A%5B0_5%2F4%5D--------" title="Xenpaper" frameborder="0"></iframe>

Hearing them side by side, you could probably tell subtle differences in pitch and color between these major thirds, but the amount by which they differ in perception is nowhere near as large as how much they differ in how they were constructed, or in any of the number-theoretical metrics that measure the numerator/denominator.

For example, the conceptual construction of $8192:6561$ would imply a diminished-4th interval (i.e. between A and Db). It could conceptually be used within the context of a Db augmented triad in 1st inversion (F in the bass) tuned using 3-limit pythagorean major thirds ($81:64$):

<iframe width="560" height="180" src="https://xenpaper.com/#embed:%5B64%2F81_0_8192%2F6561%5D--------" title="Xenpaper" frameborder="0"></iframe>

However, without any other notes given for context or anything else to compare it to, it is very hard to perceive this interval functioning as a diminished fourth. In fact, nothing is stopping most people from hearing the above example as two stacked major thirds with F as the 'root' instead of the theoretical Db it was constructed from.

What does it truly mean for a note to 'function' as a particular scale degree? We don't have the tools to answer this yet, but we'll get to this in a later section.

----

### On the fifth = $2^{7/12}$

Although not verifiable, one could easily assume the fifth to be the second interval ever 'discovered'.
Its stackability has led some ancient musical constructs to arrive at some ridiculous tuning systems, like the 50+ untempered unique stacked fifths of the [sanfen sunyi](https://youtu.be/51WG9XTkUfg?t=1098) cycle in the [Book of the Later Han (445AD)](https://en.wikipedia.org/wiki/Book_of_the_Later_Han), resulting in genuinely bizarre looking intervals like $1162261467:1073741824$ (though it was not written in that manner, following the steps as instructed would construct these intervals and beyond).

In reality, the fifth most people have been hearing on 12ed2 instruments is given by the interval $2^{7/12} \approx 1.498307...:1$, which is close to, but not exactly the $1.5:1$ of the $3/2$ perfect pure fifth.

We can arrive at the interval $2^{7/12}$ by realising that $(3/2)^{12} \approx 129.746...:1$ (rising by 12 pure fifths) yields an interval that is very close to $(2/1)^7 = 128:1$ (rising by 7 pure octaves):

<iframe width="560" height="315" src="https://xenpaper.com/#embed:%7Br220hz%7D(osc%3Afmtriangle1)%0A64%2F729_32%2F243_16%2F81_8%2F27_4%2F9_2%2F3_1%2F1_%0A3%2F2_9%2F4_27%2F8_81%2F16_243%2F32_729%2F64%0A%0A%7Br77.2565hz%7D%0A%60%600_%600_0_'0_''0_%0A'''0_''''0_'''''0" title="Xenpaper" frameborder="0"></iframe>

:::info
Note: you can click on the `>` to the left of any line to start playing from that line.
:::

And here is a side-by-side comparison between the unison root note and the new 'root note' reached after traversing up 12 fifths and down 7 octaves. The interval between the two is known as the [pythagorean comma](https://en.wikipedia.org/wiki/Pythagorean_comma):

<iframe width="560" height="265" src="https://xenpaper.com/#embed:%23_unison%0A1%2F1-----%0A%0A%23_'unison'_after_12_fifths%0A531441%2F524288-----" title="Xenpaper" frameborder="0"></iframe>

#### Equal-division interval math intuition

It may not be immediately obvious why 7 steps of 12ed2 has the interval value $2^{7/12}$.
Hopefully this construction can give some intuition:

\begin{align*}
1\text{ octave} &= 2/1\\
\\
2\text{ octaves} &= 2/1 \times 2/1 \hspace{1.5cm}\small\text{(intervals ascend by multiplication)}\\
&= 2^2\\
\\
n\text{ octaves} &= \underbrace{2/1 \times\ ...\ \times 2/1}_{\large n \text{ times}}\\
&= 2^n\\
\\
\frac{1}{2}\text{ octave} &= 2^{1/2}\\
\\
\text{12ed2 semitone} = \frac{1}{12}\text{ octave} &= 2^{1/12}\\
\\
\text{12ed2 fifth} = 7\text{ semitones} &= \text{semitone}^{7}\\
&= \left(2^{1/12}\right)^7\\
&= 2^{7/12}\qquad \blacksquare
\end{align*}

#### Historical excuse for not having equal temperament

Not surprisingly, it took a long time from the 'discovery' of the pure fifth to arrive at the modern $2^{\ n/12}$ definition of 12ED2 tempered intervals. The earliest written records are [Fusion of music and calendar - Zhu Zaiyu (1580)](https://en.wikipedia.org/wiki/Zhu_Zaiyu) and [Van De Spiegheling der singconst - Simon Stevin (ca. 1605)](https://en.wikipedia.org/wiki/Simon_Stevin).

Long ago, when the inversely proportional relationship between string length and frequency was discovered in ancient Greece (also independently discovered around the world), the ancient Greek [music theory](https://en.wikipedia.org/wiki/Musical_system_of_ancient_Greece) revolved itself around intervals that were represented as ratios (i.e. just intonation).
Circa 400 BC, [Archytas proved](https://books.google.com.sg/books?id=h0jTzgEACAAJ&dq=isbn:052130220X&hl=en&sa=X&redir_esc=y) that one cannot simply construct the [arithmetic mean](https://en.wikipedia.org/wiki/Arithmetic_mean) interval between two [superparticular intervals](https://en.wikipedia.org/wiki/Superparticular_ratio) $(a+1):a\quad \forall a \in \mathbb{N}$ (without having to resort to non-just non-constructible intervals), implying that the set of superparticular ratios and their reciprocals under multiplication can construct all of [$\mathbb{Q}$](https://en.wikipedia.org/wiki/Rational_number) (all the just intervals).

The _systema teleion meizon_ thus revolved itself around pure ratios and constructible means, and the [genus](https://en.wikipedia.org/wiki/Genus_(music)) system later influenced [Byzantine music](https://www.jstor.org/stable/738676), partially [Arabic/Turkish maqam](https://www.jstor.org/stable/765597), eventually the [liturgical modes](https://en.wikipedia.org/wiki/Gregorian_mode) based on [Oktōēchos](https://en.wikipedia.org/wiki/Hagiopolitan_Octoechos), and thus the Western classical tradition.

The abstract concept of raising fractions to exponents, let alone arbitrary [rational](https://en.wikipedia.org/wiki/Rational_number)/[real](https://en.wikipedia.org/wiki/Real_number) exponents (see [_What is the graph of x^a when a is not an integer?_](https://youtu.be/_lb1AxwXLaM)), to [extract](https://en.wikipedia.org/wiki/Nth_root) roots like $\sqrt[12]{2^7} \equiv 2^{7/12}$ took humanity a while to develop. Before then, mathematics only involved [constructible numbers](https://en.wikipedia.org/wiki/Constructible_number) that could be tangibly/physically conceived.
#### Comparison between tunings

Here's a stack of fifths in a few different tunings:

- [Just intonation](https://en.xen.wiki/w/Just_intonation)
- [12ed2](https://en.xen.wiki/w/12edo)
- [31ed2](https://en.xen.wiki/w/31edo)
- [22ed2](https://en.xen.wiki/w/22edo)
- [7ed2](https://en.xen.wiki/w/7edo)

<iframe width="560" height="440" src="https://xenpaper.com/#embed:(1)(osc%3Asquare%3B_env%3A4159)%0A%7Br140hz%7D%0A%23_Just%0A1%2F1_3%2F2_9%2F4_27%2F8_81%2F16_243%2F32-..%0A%5B1%2F1_3%2F2_9%2F4_27%2F8_81%2F16_243%2F32%5D----...%0A%0A%7B12ed2%7D%0A0_7_14_21_28_35-..%0A%5B0_7_14_21_28_35%5D----...%0A%0A%7B31ed2%7D%0A0_18_36_54_72_90-.%0A%5B0_18_36_54_72_90%5D----...%0A%0A%7B22ed2%7D%0A0_13_26_39_52_65-..%0A%5B0_13_26_39_52_65%5D----...%0A%0A%7B7ed2%7D%0A0_4_8_12_16_20-..%0A%5B0_4_8_12_16_20%5D----..." title="Xenpaper" frameborder="0"></iframe>

Which ones do you have an affinity for, or feel familiar with? Which ones feel the most concordant to you?

#### 'Coinciding partials' of 12ed2 fifths

The [coinciding partials](#Why-octaves-could-be-perceived-to-be-the-‘same-note’) concept seems to only apply if the frequencies in question are justly tuned. After all, the third harmonic of the root note ($3\times 1/1$) does not coincide with the second harmonic of the 12ed2 tempered fifth ($2\times 2^{7/12}$).

\begin{align}
\text{3rd harm. of root} &= 3\times 1/1\\
&= 3\\
\\
\text{2nd harm. of 7\12ed2} &= 2\times 2^{7/12}\\
&\approx 2.99661415...
\end{align}

We can measure the discrepancy by simply taking the interval difference between these two intervals. Recall that subtracting an interval uses division of interval ratios, and that the second harmonic of the tempered fifth is $2 \times 2^{7/12} = 2^{19/12}$:

\begin{align}
\text{3rd harm. $-$ 19\12ed2} &= \frac{3}{2^{19/12}}\\
&\approx 1.00112989063...
\end{align}

And in cents:

\begin{align}
\text{cents}\left(\frac{3}{2^{19/12}}\right) &= 1200 \cdot \log_2\left(\frac{3}{2^{19/12}}\right)\\
&\approx 1.955¢
\end{align}

<iframe width="560" height="230" src="https://xenpaper.com/#embed:(osc%3Asquare)(1)3%2F2---%0A7%5C12o2---%0A%5B3%2F2_7%5C12o2%5D-------" title="Xenpaper" frameborder="0"></iframe>

This is a very minute, mostly non-detectable interval (which explains the similarity of the just and 12ed2 stacked fifths in the [previous audio example](#Comparison-between-tunings)). When a 12ed2 [power-chord](https://en.wikipedia.org/wiki/Power_chord) is sounded, or when these two different fifths are played simultaneously, you may notice a very slow beating at roughly 3.4/1000ths of the root's frequency (A4-E5 should yield about 1.49 beats per second, since $440 \times (3 - 2\cdot 2^{7/12}) \approx 1.49$). Watch [Beats and Just Noticeable Difference](https://www.youtube.com/watch?v=TpBihrFVUG0&ab_channel=WalkThatBass) for a brief explanation of this phenomenon.

If you implemented the [advanced $P_\text{coincidence}$ and $P_\text{newinfo}$ metrics](#For-advanced-readers), the [$\text{newinfo}(k)$](#Fifth-fractal-math) metric of stackability will take into account the beating of tempered intervals. Though if you wish for a concordance/heuristic consonance metric for both just and non-just intervals in the reals, you are better off using [harmonic entropy](https://en.xen.wiki/w/Harmonic_entropy) or dissmeasure. Perhaps a future article will cover these mathematical metrics, but they are not of importance to understanding a musical construction.

Nonetheless, these partials between tempered fifths can still be thought of as coinciding, but they will do so with increasing turbulence/beating as more tempered intervals get stacked.
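The discrepancy of the tempered fifth, the resulting beat rate, and the way the error accumulates over a stack can be checked numerically. This is only a minimal sketch: the helper name `cents`, the constant names, and the A4 = 440 Hz reference are assumptions made for illustration.

```python
import math

def cents(ratio: float) -> float:
    """Size of an interval ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

JUST_FIFTH = 3 / 2
TEMPERED_FIFTH = 2 ** (7 / 12)  # 7 steps of 12ed2

# Error of one 12ed2 fifth relative to the just fifth, in cents.
fifth_error = cents(JUST_FIFTH / TEMPERED_FIFTH)

# Beating between the root's 3rd harmonic and the tempered fifth's 2nd harmonic.
root_hz = 440.0  # assumed A4 reference pitch
beat_hz = root_hz * (3 - 2 * TEMPERED_FIFTH)

# Accumulated error after stacking 1, 2 and 3 tempered fifths.
stack_errors = [k * fifth_error for k in range(1, 4)]

print(f"error per fifth: {fifth_error:.3f} cents")
print(f"A4-E5 beat rate: {beat_hz:.2f} Hz")
print("stacked errors (cents):", [round(e, 3) for e in stack_errors])
```

Changing `root_hz` shows that the beat rate scales with the register: the same tempered fifth beats twice as fast an octave higher.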
Stacking up to 3 fifths (C-G-D-A) begets an error of $1.955¢ \times 3 = 5.865¢$, which is about the commonly accepted [Just Noticeable Difference](http://hyperphysics.phy-astr.gsu.edu/hbase/Music/cents.html#:~:text=You%20can%20hear%20about%0Aa%20nickel%27s%20worth%20of%20difference) of pitch (though it really depends on the timbre and context).

#### Why temper the fifth?

Pragmatically, it is so you don't have to deal with an infinitely long circle of fifths (which means having an infinite number of notes). The example below is merely the 'circle' of 53 fifths, which is intimidating enough, but the true untempered spiral of fifths [spirals to infinity](https://youtu.be/51WG9XTkUfg?t=1311).

![](https://i.imgur.com/1PKWbA9.png =400x)

There are numerous videos online that provide succinct starting points for exploring the need for temperaments, like [Why There are Twelve Notes in Music](https://www.youtube.com/watch?v=IT9CPoe5LnM&ab_channel=StevenJacks).

#### Visual intuition for 12 tempered fifths: The circle of fifths...

-----

### On the $\text{fourth} = 4:3$

<iframe width="560" height="210" src="https://xenpaper.com/#embed:(env%3A0099)%7Br160hz%7D_3%2F1_4%2F1....%0A%7Br4%2F1%7D%0A4%3A3--...." title="Xenpaper" frameborder="0"></iframe>

The fourth is commonly thought of as the 'inverted fifth', because it is the same interval as going down a fifth and up an octave:

\begin{align}
\text{octave $-$ fifth }&= \frac{2}{1} \div \frac{3}{2}\\
&= \frac{4}{3}
\end{align}

There's not much new to say about this interval, since it still stays within the 3-limit. There are [no new dimensions](#Monzo-intuition) to be had here.
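As a quick check of the arithmetic above, interval 'subtraction' can be carried out with exact fractions. This is a minimal sketch; the helper name `prime_limit` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def prime_limit(ratio: Fraction) -> int:
    """Largest prime factor found in the numerator or denominator."""
    largest = 1
    for n in (ratio.numerator, ratio.denominator):
        p = 2
        while p * p <= n:
            while n % p == 0:
                largest = max(largest, p)
                n //= p
            p += 1
        if n > 1:  # whatever remains is itself prime
            largest = max(largest, n)
    return largest

octave = Fraction(2, 1)
fifth = Fraction(3, 2)

# Subtracting intervals means dividing their ratios.
fourth = octave / fifth

print(fourth)               # 4/3
print(prime_limit(fourth))  # 3
```

Both the fifth and the fourth report a prime limit of 3, which is the sense in which the fourth adds no new dimension.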
#### Musical examples

Sadly, at the time of writing, mannfish hasn't done an intervallic study on $4:3$, but there is one on $8:3$, which is the same interval but an octave wider:

<iframe width="560" height="315" src="https://www.youtube.com/embed/cARNFGF4CAA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Examples of extensive melodic/harmonic use:

- [Wayne Shorter - Witch Hunt](https://www.youtube.com/watch?v=hi6nOr9MIqI&ab_channel=JazzTuna)
- [Eddie Harris - Freedom Jazz Dance](https://www.youtube.com/watch?v=iDrH5urtCbQ&ab_channel=EddieHarris-Topic)
- [Charles Ives - The Cage (1906)](https://www.youtube.com/watch?v=iiL7XoADFEU)
- [Arnold Schoenberg - Das Buch der hängenden Gärten](https://www.youtube.com/watch?v=e-pt51n6eOk)
- [Le Mystere des Voix Bulgares - Full Performance](https://www.youtube.com/watch?v=AFgzzWT3zX4)

#### The third superparticular ratio

\begin{align}
\text{p-limit}\left(\frac{4}{3}\right) &= \text{p-limit}\left(\frac{2^2}{3^1}\right)\\
&= 3\\
\\
H_\text{tenney}(4/3) &= \log_2(4\cdot 3)\\
&\approx 3.5849625...
\end{align} Overall, not a complex interval by any means, but is known to be a subjective 'dissonance' when applied in the following western cultural contexts: - When the intention is to resolve to the tonic $1/1$ but there is a $4/3$ present which has to resolve to the major third $5/4$ <iframe width="560" height="210" src="https://xenpaper.com/#embed:(env%3A2876)%7Br4%2F3%7D%0A%5B3%2F4_1%2F1_4%2F3%5D----%0A%5B3%2F4_1%2F1_5%2F4%5D----" title="Xenpaper" frameborder="0"></iframe> - When the intention is to have a final resolution on the tonic but the tonic chord is the infamous $\text{V}^6_4$ (Tonic 1 chord in 2nd inversion) instead, so you have to do the mumbo jumbo that the [common practice period](https://en.wikipedia.org/wiki/Common_practice_period) said you have to do: <iframe width="560" height="270" src="https://xenpaper.com/#embed:(env%3A2876)%7Br4%2F3%7D%0A%5B3%2F4_1%2F1_4%2F3%5D---%0A%5B3%2F4_1%2F1_5%2F4%5D---%0A%5B3%2F4_15%2F16_9%2F8%5D---%0A%5B1%2F2_3%2F4_1%2F1_5%2F4%5D---" title="Xenpaper" frameborder="0"></iframe> #### The fourth dissonance hypothesis 1: combination tones Apart from merely ascribing the rationale behind 'dissonant fourths' to [cultural entrainment](https://www.open.ac.uk/Arts/experience/InTimeWithTheMusic.pdf) of the common practice period, here's a hypothesis: First, listen to this excerpt that gets increasingly shrill. The listening experience should be uncomfortably loud, so don't overdo this if you don't hear it. At a sufficiently uncomfortably loud volume level, and at varying pitches depending on the listener's variable physiology, a very audible lower pitch that is two octaves below the higher note emerges. Ensure your volume is adequate then try playing this sample with caution: :::warning Do not overdo the volume or keep replaying this if you don't hear it. The example increases in pitch, so the very moment you can hear the effect stop the playback as it is sufficient demonstration for your ears. 
The effect is more pronounced on low-quality speakers/headphones. If you wish to save your ears, you can emulate the intended effect by screen-recording this sound and passing it through a soft saturation/overdrive effect. Hint: this phenomenon has something to do with _distortion_...
:::

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(osc%3Asine)(env%3A9099)(1%2F2)%0A30%2F7_40%2F7%0A%5B30%2F7_40%2F7%5D---..%0A30%2F6_40%2F6%0A%5B30%2F6_40%2F6%5D---..%0A30%2F5_40%2F5%0A%5B30%2F5_40%2F5%5D---..%0A30%2F4_40%2F4%0A%5B30%2F4_40%2F4%5D---..%0A30%2F3_40%2F3%0A%5B30%2F3_40%2F3%5D---.." title="Xenpaper" frameborder="0"></iframe>

The ghost tones that appear but are not in the original source material are [combination tones](https://en.wikipedia.org/wiki/Combination_tone). In particular, the 'lower' note that you hear when tones are played sufficiently loudly is the absolute frequency difference between the higher and lower frequencies. E.g. if 4000hz and 3000hz are simultaneously played, then 1000hz will be the resultant difference tone, which we know is $1/2^2$ the frequency of the higher note, hence it is 2 octaves below the higher 4000hz note.

The science/reason comes later, but let us focus on the musical implication/question at hand:

We can postulate that the reason why fourths eventually came to be heard as subjectively dissonant (although initially considered perfect consonances in writings dated from the high middle ages) could be a subtle psychoacoustic/cognitive dissonance due to the presence of this 'ghost bottom note' 2 octaves lower. For other similar reasons (non-linearity), and also the reason in fourth dissonance hypothesis 2, the bottom note of the fourth has a characteristic physiological and tonal precedence (as much as how a bassist, by altering only the lowest note of the overall tonality, can temporally _exert_ control over musical function).
The unique 'rootness' attributed to the actually tangible and intended 'bottom' note of the fourth could hypothetically work against the even more 'rooted' difference tone, since the difference tone is lower and emphasises the higher note of the fourth rather than the lower.

Of course, there is no definitive answer for the subjective dissonance of the fourth, and I personally still feel that culture had a larger part to play than mere physical phenomena. After all, the consonance of pure thirds (that fourths resolved to) is not something that can be assumed of all musical cultures. Some systems did not use the _nicer_ thirds, so why would you resolve a perfectly perfect fourth to a not nice third? (more about this later).

:::info
For those who do not need an explanation of why this phenomenon occurs, skip to the next section. Otherwise, science warning!
:::

There are several natural, physiological, and psychoacoustical reasons as to why these ghost/resultant/Tartini/Helmholtz/combination tones (they have many different names) appear, but the root cause is _distortion_.
An audio source has to pass through several mediums before being perceived: - bytes to volts (via [DAC](https://en.wikipedia.org/wiki/Integral_nonlinearity)) - volts to [standing waves](http://hyperphysics.phy-astr.gsu.edu/hbase/Waves/standw.html) of air pressure ([resonant membrane](https://barefacedaudio.com/pages/loudspeaker-non-linearity)) - air pressure to and from [resonant bodies](https://en.wikipedia.org/wiki/Nonlinear_resonance) - equipment producing the sound (cabinet/membrane) - instrument body - table, room, (human) body/head - air itself - air pressure into ear canal ([affects resonance, natural EQ curves causing phase offsets](https://journals.lww.com/otology-neurotology/Abstract/2004/07000/Human_Middle_Ear_Transfer_Function_Measured_by.5.aspx)) - air pressure into the [dynamic system](https://www.sciencedirect.com/science/article/pii/S0378595516302787) of [eardrum](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2852437/), malleus, incus, stapes - dynamic membrane vibrations into [cochlea fluids](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4491943/) (non-linear transfer function) - fluids into stochastic low-resolution resonance of [auditory nerves](https://www.sciencedirect.com/topics/psychology/auditory-nerve) + basilar membrane wavelength sensory input ([nature's fourier transform](https://www.sciencedirect.com/science/article/abs/pii/0025556470901331)) - [brain](https://neuroscience.stanford.edu/news/reality-constructed-your-brain-here-s-what-means-and-why-it-matters) (the most nonlinear thing ever, [only if you think about it](https://plato.stanford.edu/entries/self-reference/)) At every step, the signal gets affected in a way that is [non-linear](https://en.wikipedia.org/wiki/Nonlinear_system): One sine wave in, many sine waves out. The shape of a sine wave will be altered in a non-linear way (i.e. not a trivial change in amplitude/phase). _Distortion_ is non-linearity. 
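The 'one sine wave in, many sine waves out' behaviour can be sketched numerically. The example below is only an illustration under stated assumptions: the quadratic transfer function `x + c*x^2`, the 48 kHz sample rate, and all helper names are hypothetical choices, and real speakers and ears are far more complicated. It feeds the 3000hz + 4000hz dyad mentioned earlier through the toy non-linearity and measures which frequencies come out.

```python
import cmath
import math

SAMPLE_RATE = 48_000  # samples per second (assumed)
NUM_SAMPLES = 4_800   # 0.1 s: a whole number of periods for every tone below

def tone_mix(freqs_hz):
    """Sum of unit-amplitude sine waves at the given frequencies."""
    return [
        sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in freqs_hz)
        for n in range(NUM_SAMPLES)
    ]

def distort(signal, c=0.2):
    """Toy non-linearity P(x) = x + c*x^2 (an assumption, not a model of the ear)."""
    return [x + c * x * x for x in signal]

def amplitude_at(signal, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
        for n, x in enumerate(signal)
    )
    return 2 * abs(acc) / len(signal)

clean = tone_mix([3000, 4000])  # a 4:3 dyad
dirty = distort(clean)

for f in (1000, 3000, 4000, 6000, 7000, 8000):
    print(
        f"{f:>5} Hz  clean={amplitude_at(clean, f):.3f}"
        f"  distorted={amplitude_at(dirty, f):.3f}"
    )
```

With this particular function, new energy appears at 1000hz (the difference tone), 7000hz (the sum tone) and 6000hz/8000hz (second harmonics of each input), none of which are present in the clean signal. An odd-symmetric function such as `tanh` would skip these even-order products and produce a different set of tones.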
An input sine wave will result in an output tone with the same fundamental frequency (except in extreme distortion cases), but with a different timbre. We know from before that a change in timbre from sine to non-sine would mean an addition of frequency content/partials/harmonics (because the sine is fundamental). At increased amplitudes/volumes, the effect is more pronounced: the distortion is greater.

Now, when the input sound source consists of more than one fundamental tone (like the shrill $4:3$ sounds in the example), a non-linear distortion on the overall input sound does not "distort" each fundamental tone independently. Rather, nature/physics will act its non-linearities on the entire sound source as a whole (gestalt idealists rejoice!). This entails that the mathematical transformation of the multiple tones results in additional frequency content, with the new frequencies being integer linear combinations of all the input frequencies.

That is, given two tones at $A$ hz, $B$ hz, the combination tones can manifest in the form $(\pm mA \pm nB)\text{ hz}\ \forall m,n \in \mathbb{Z},$ where the exact possible values of $m,n$ depend on the exact non-linear function used (e.g. even/odd functions may skip some permutations, and distortion functions expressible as polynomials of finite degree will only have a finite number of combination tones).

The mathematical proof for this is out of scope, but here are the steps, as always, left as an exercise to the reader:

- Express the non-linear function as a [Taylor expansion](https://en.wikipedia.org/wiki/Taylor_series) that yields the [power series](https://en.wikipedia.org/wiki/Power_series) $P(x) = a + bx + cx^2 + dx^3 + \ldots$
- Evaluate $P(x)$ where $x$ is the summation of waveforms present
  - e.g.
a 150hz tone at half the amplitude of a 200hz tone would look like $P(\frac{1}{2}\sin(150\text{hz}\cdot2\pi t) + \sin(200\text{hz}\cdot 2\pi t))$
- Bi/tri/[multinomially](https://en.wikipedia.org/wiki/Multinomial_theorem) expand each $x^n$ term of $P(x)$
- Evaluate the Fourier transform of each term
- The Dirac delta arguments of the resultant Fourier expressions are the resultant combination tones. These will always be linear combinations of the initial frequencies (trivially by nature of how multinomial expansion works and how terms of the same degree collect and simplify)

#### Fourth dissonance hypothesis 2: partial coincidence emphasis and tonal space

First, let us recap what it means to 'add' a fifth:

\begin{align}
F_\text{partials}(3)&=\{3, \color{limegreen}6, 9, \color{limegreen}{12}, \ldots\}\\
F_\text{partials}(2)&=\{2, 4, \color{limegreen}6, 8, 10, \color{limegreen}{12}, \ldots\}\\
\end{align}

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(osc%3Asine)%7Br%600%7D%0A2%2F2_4%2F2_6%2F2_8%2F2_10%2F2_12%2F2%0A3%2F2_6%2F2_9%2F2_12%2F2_15%2F2_18%2F2%0A%0A(1)(env%3A2099)%0A%5B3%2F2_2%2F2%5D%5B6%2F2_4%2F2%5D%5B9%2F2_6%2F2%5D%5B12%2F2_8%2F2%5D%5B15%2F2_10%2F2%5D" title="Xenpaper" frameborder="0"></iframe>

Recall the partial coincidences of the fifth $3:2$: when the tonal root is assumed to be $2$, we have coinciding partials at every 3rd harmonic of $2$, thus emphasizing the amplitudes of $6/1$, $12/1$, $18/1$, $24/1$, $30/1$, ...

Relative to the root note $2/1$, this corresponds to the fifth, fifth, ninth, fifth, major7th, etc.

Relative to the fifth $3/1$, this corresponds to the root, root, fifth, root, maj3rd, etc.

When sounded with a bright harmonic timbre, the coinciding partials form a strong upper structure emphasizing the fifth. Relative to the subjective monophonic tonality of $2/1$ (i.e.
assuming the 'key' was $2/1$ and we are now adding $3/1$), the new upper structure gives strength to this fifth, and subsequent stacked fifths (hence the stackability of fifths). Relative to the monophonic tonality of $3/1$, the new upper structure gives strength to the tonality itself, emphasising the otonal expansion of the harmonics of $3/1$. Of course, in the minor context, the emphasis of the 5th harmonic/maj3rd in the otonal series of $3/1$ adds to the tension, hence why IV-im or ivm-im yields a larger change in color than IV-I.

Either way, no matter which of the two notes of the fifth is perceived to be the 'tonic', adding a fifth above/below the 'tonic' strengthens harmonic intention by reinforcing the otonal structure of the newly added fifth (thus strengthening intention), or by reinforcing the otonal structure of the 'tonic' itself (thus strengthening tonicity), both part of the same otonal series. Recall that [tonal fusion/concordance](https://link.springer.com/chapter/10.1007/978-1-4612-1260-7_206) is an asymmetric phenomenon that is perceived with respect to the overtones and not any arbitrary pitches/undertones.

Now applying this concept to the fourth in comparison:

\begin{align}
F_\text{partials}(4)&=\{4, 8, \color{limegreen}{12}, 16, 20, \color{limegreen}{24},\ldots\}\\
F_\text{partials}(3)&=\{3, 6, 9, \color{limegreen}{12}, 15, 18, 21, \color{limegreen}{24},\ldots\}\\
\end{align}

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(osc%3Asine)%7Br%600%7D%0A3%2F2_6%2F2_9%2F2_12%2F2_15%2F2_18%2F2_21%2F2_24%2F2%0A4%2F2_8%2F2_12%2F2_16%2F2_20%2F2_24%2F2_28%2F2_32%2F2...%0A%0A(1)(env%3A2099)%0A%5B3%2F2_4%2F2%5D%5B6%2F2_8%2F2%5D%5B9%2F2_12%2F2%5D%5B12%2F2_16%2F2%5D%5B15%2F2_20%2F2%5D" title="Xenpaper" frameborder="0"></iframe>

In $4:3$, every fourth harmonic of the lower note coincides with every third harmonic of the upper. From the perspective of $3/1$ (the lower note) as our base tonality, every 4th harmonic coincides.
The action of adding the $4/3$ above the root reinforces the otonal upper structure of the original tonality (lower note) itself. From the perspective of $4/1$ as our base tonality, every 3rd harmonic coincides. The action of adding $3/4$ underneath the root reinforces the added new note.

Initially, assuming octave equivalence, fifths and fourths yield the same emergent tonal structures (from either otonal/utonal POV) due to having identical pitch class sets (invariant to transposition). Just like how fifths can form a base tonality structure, the $4:3$ dyad can form a base tonality itself, as evidenced in [_'pa-less' raags_](https://forum.chandrakantha.com/post/ragas-that-do-not-use-pancham-8671658), where the basis tonality provided by the tampura drone is root-fourth.

However, there is a subtle difference when octaves are considered, which is best explained aurally:

Harmonic vertical structure example:

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(1%2F3)%0A%23_3%3A2_harmonics_1-5_(otonal-centric)%0A%5B2%2F2_4%2F2_6%2F2_8%2F2_10%2F2%2C%0A3%2F2_6%2F2_9%2F2_12%2F2_15%2F2%5D--.%0A%0A%23_3%3A4_harmonics_1-5_(otonal-centric)%0A%5B4%2F2_8%2F2_12%2F2_16%2F2_20%2F2%2C%0A3%2F2_6%2F2_9%2F2_12%2F2_15%2F2%5D--.%0A%0A%23_3%3A4_harmonics_1-5_('subdominant')%0A%5B4%2F3_8%2F3_12%2F3_16%2F3_20%2F3%2C%0A3%2F3_6%2F3_9%2F3_12%2F3_15%2F3%5D--.%0A" title="Xenpaper" frameborder="0"></iframe>

Lateral order-preserved melodic example:

<iframe width="560" height="315" src="https://xenpaper.com/#embed:(1)%0A(env%3A_5099)%0A%23_4%3A6%3A9%0A8%2F6_12%2F6_18%2F6.%0A8%2F6_12%2F6_18%2F6.%0A8%2F6_12%2F6_18%2F6.%0A....%0A%23_9%3A12%3A16%0A9%2F6_12%2F6_16%2F6.%0A9%2F6_12%2F6_16%2F6.%0A9%2F6_12%2F6_16%2F6.%0A....%0A%23_9%3A6%3A4%0A18%2F6_12%2F6_8%2F6.%0A18%2F6_12%2F6_8%2F6.%0A18%2F6_12%2F6_8%2F6.%0A....%0A%23_16%3A12%3A9%0A16%2F6_12%2F6_9%2F6.%0A16%2F6_12%2F6_9%2F6.%0A16%2F6_12%2F6_9%2F6.%0A...."
title="Xenpaper" frameborder="0"></iframe> The difference in perception ultimately boils down to how timbral fusion is biased towards otonality, and how musicians of the western tradition are culturally entrained to understand intervals upwards. (Conversely, the [_systema teleion meizon_](https://dash.harvard.edu/bitstream/handle/1/12712855/29552907.pdf) of the ancient Greek musical system emphasizes focusing on the top note as the resolution and constructing downwards.) #### Partial coincidence & stackability \begin{align} P_\text{coincidence}(4:3) &= 1/4\\ P_\text{newinfo}(4:3) &= 1/2\\ \text{newpart} &= 2\\ p_1 &= 4\\ p_k &= 4\cdot (4/3)^{k-1} - 2\cdot\left(4-\cfrac{4^k}{3^{k-1}}\right)\\ \text{newinfo}(k) &= \frac{2}{p_k} \end{align} Graph of $\text{newinfo}(k)$ of fourth (solid green) vs. $\text{newinfo}(k)$ of fifth (dotted red) ![](https://i.imgur.com/bbJSY12.png =560x) $4:3$ is less stackable than the $3:2$ (more harmonic info per stack). #### _The minor fourth, the major fifth_ _Fifths are bright/major, fourths are dark/minor_. That's something that's heard a lot thanks to [Jacob Collier videos](https://twitter.com/jacobcollier/status/1352707893779435521). 
Harmonic brightness/darkness based on fifths is a concept that has permeated modern popular music theory, but has its roots in: - The [_ekyptics_ (Hans Kayser)](https://www.sacredscienceinstitute.com/EZ/ssi/ssi/hans-kayser-book-textbook-harmonics-tone-spirals-curves.php?PHPSESSID=djkbco08eogd0uvt09bu8b84k6) - The [_absolute_ vs _telluric_ (Ernst Levy)](https://www.docdroid.net/DZadkSr/theory-of-harmony-pdf#page=9) - The [_systema teleion meizon_](https://dash.harvard.edu/bitstream/handle/1/12712855/29552907.pdf) - [network theory data analysis (McLaughlin's)](https://pythondig.com/r/network-theory-of-jazz-scales-version---modularized-and--python) The majorness/minorness of the fifths/fourths could be attributed to the fact that the circle of fifths can generate lydian/dorian modes when traversing towards the 'fifths'/'fourths' side, but this phenomenon only applies when the temperament we are using allows us to arrive at usable thirds (more on this in the next part).