# EMR Response
## Letter to the reviewer
Dear Claire and Niels,
Thank you for your positive feedback and constructive comments, most of which we have addressed in this revised manuscript.
* Thanks for the correction regarding Néstor's quartets. We have now categorized them with the human-annotated datasets and removed references to an "automatically-annotated" category.
* We have also made the following changes for greater clarity in a couple of places that you pointed out:
    * "We provide the labels in the `.mscx` files and, *in addition*, tabular files with notes, labels, and measures."
    * "The workflow automates label validation, notification dispatch, and creation of analysis and other auxiliary files."
* We followed your suggestions regarding terminology and wording, including the x-axis labels in Figs. 5 and 6.
We chose to leave unaddressed a few comments which, while truly insightful and substantial, we consider beyond the scope of a "data report" and better treated as promising ideas for separate papers. These include the study of contentious annotation cases (which would involve the examination of GitHub comments in conjunction with the git history), as well as an analysis of how successive key areas relate to each other. Furthermore, we opted to keep the stacked-bar chart because, compared to other plots we tried (e.g., nine individual pie charts), it seemed most ergonomic for inter-corpus comparisons.
Please find the revised manuscript attached. Thank you and kind regards,
Johannes
## Review April 2023
Johannes, I broke down the reviewer's comments into unit tasks.
Did those labeled @done. Some of these labels have things for us to discuss. Nothing major.
I labeled the rest with @todo.
~~An Annotated Corpus of Tonal Piano Music from the Long 19th Century~~
~~This corpus will be a welcome addition to the growing body of corpora with harmonic annotations. The corpus presented here covers a time period that has seen little representation in the symbolic domain and represents an impressive degree of diligence, planning, and scrutiny with regard to both the assembly of the corpus and the annotation process. My comments here merely make several suggestions for clarifying the prose at times, as well as a few suggestions to reduce ambiguity both in the prose and in the figures and figure captions.~~
@done ~~(Johannes, are you aware of an automatically annotated harmony corpus? If not, we should probably omit that category.)~~
@done ~~A small correction: Nestor Lopez’s corpus (the Haydn Sun quartets) is not ‘automated’; “It is a manually-annotated corpus of harmonic analysis in `**harm` syntax” (direct quote from the link in the reference) and therefore likely falls under the varied branch of similar work (TAVERN, Bach chorales, etc. that have been manually annotated, just in a different format).~~
> [name=Johannes] removed the category "automatically analyzed"
@done ~~I rather dislike “hand-annotated” as it implies that it was literally done in pen and paper. I suggest “manual annotations by experts” or something along those lines.~~
@done ~~They were not literally scores that were annotated but digital encodings of scores (in which case certain with some form of annotations (many different corpora in different formats are listed.)~~
@done ~~Entire first sentence of the second paragraph is awkward and unclear; what is the main point of the sentence? That the galant marks a shift towards piano over vocals? The subsequent sentence begins with “The period..” which is ambiguous.~~
@done ~~What is meant by “the dataset period”? (If referring to one’s own dataset, may I suggest to make an acronym so that it is clear when you are reffering to your own work? E.g., TPM (Tonal Piano Music)? If another dataset, or group of datasets, perhaps find a way to concisely refer to that? In any case it’s grammatically weird so perhaps something instead such as ‘of the so-called long 19th century’ or else ‘during the span of time covered by our corpus.’~~
> [name=Yannis] text amended
@done ~~Please excuse my ignorance here, but how useful are the annotations being IN the musescore files in terms of analysis purposes? Neither of the three existing toolkits currently are able to parse .mscx files and therefore would require conversion to musicxml first, right? In addition, data lost could require custom code and parsers?~~ (Already addressed in *Formats & Features*: "These files are automatically extracted from the annotated MuseScore files by the ms3 parsing library." To be sure, we are now mentioning the mscx file extension in *Formats and features*.)
@done ~~Do the authors have any plans to also release standalone text files? I appreciate the ODD approach, honestly, it dramatically simplifies things and makes it clean and prevents errors, but I’m just concerned about useability? Perhaps this is a non-issue, I’m not sure.~~ (Already addressed in Formats & Features: "The same information is additionally provided in the form of plaintext TSV-formatted feature tables...")
@done ~~Forgive me, but I don’t see what is ‘semi-automated’ about this workflow? A quick skim of the 2021 papers gives me no clues either… It seems entirely manual other than that github versioning will point to the specific differences, is that the semi-automated part?~~ (Already addressed in "Annotations" and "Formats and features." I prefer not to touch our well-modulated phrase in the 2nd paragraph of *Annotations*.)
@done ~~If so, I’m not sure that it’s the ‘semi automated’ part that you should be touting or drawing attention to per se, but rather the extreme attention to systematic process, error checking, and transparency that you all maintained. I do believe that it’s becoming standard practice to use github to store corpora (in which case any changes or updates to files would be trackable) but that doesn’t necessarily mean that the process was transparent.~~ (Added a concluding sentence in "Annotations.")
@done (But is this style-compliant? I also right-aligned all numbers.) ~~Fig 1. Suggest shrinking the text so that the values can all fit in a single line (at least the numbers in each cell?)~~
> [name=Johannes] if not, they will let us know. Looks good
@done ~~Also, I’m not sure what “length” is referring to (if it’s not notes or measures?). I would suggest mentioning it in the table caption.~~
@done ~~Also, I presume ‘pieces’ is synonymous with movements in the case of Beethoven sonatas? Again, perhaps a note or include in table caption?~~
@done ~~Last paragraph before the “Descriptive statistics” section: states “…each corpus of pieces..” this is a bit confusing since you are not referring to your corpus/dataset as a meta-corpus. I would suggest simply saying ‘each file in the corpus’ or ‘for each collection of pieces within the corpus’ or something like that.~~
@done (I am omitting the detailed calculation method for phrase lengths in measures, which seems to me a bit pedantic now. What do you think?) ~~Description of Fig. 5 is awkward. Here’s my suggestion: “Histogram showing the length of each of the 3544 phrases (as determined and peer-reviewed by our annotators) with counts on the y-axis measured on a logarithmic scale. Phrase lengths are counted in terms of measures (here I suggest another footenote to explain exactly or else put it in the prose of the paper).~~
> [name=Johannes] makes sense
@done (Johannes, the reviewer is referring to the x-axis label of Fig 5 here.)
~~With regard to the x-axis title, it’s quite confusing. Since you have whole-numbered bins I’d suggest simply calling this ‘Phrase length duration in measures’.~~
> [name=Johannes] went with it
@done (But Johannes please take a second look. Also, I am deleting a word in the title, "PHRASE ~~SEGMENTS~~," because I think it's redundant. I am also keeping the quotation marks to alert the reader to the provisional use of the word.) ~~Also is phrase, groupings, and segments supposed to all be synonymous? If so I would suggest being consistent in terms of vocabulary to avoid confusion.~~
> [name=Johannes] alright
@done (But I don't use the reviewer's suggested phrase, which seems vague to me.) ~~First sentence of “Key Segments” is a bit awkward. I suggest: “…shows the distribution of the key areas of modulation sections (relative to each piece’s global tonic) over the full dataset.”~~
> [name=Johannes] agreed
@NOTtodo (Johannes, I think this is a valid point with regard to Fig. 6, if a difficult one to address. Let's discuss it. We could, perhaps, do what the reviewer suggests, but only for key areas that are contained *within* a single "phrase" (hence highly local modulations). Alternatively, we could plot modulation distances in relation to the immediately *preceding* key, whether local or global.)
~~With regard to this, I’m just curious about the choice of expressing the areas as a distance from tonic given that many of these modulations will be short and follow some other modulation (e.g., 3rds cycles) in which case the distance of the local modulation may be more interesting than the global one – especially for this time period. Just food for thought.~~
> [name=Johannes] I think we're close to the fine line between data report and analysis paper. Bigrams between key sections come with a whole array of ways to do it and interesting questions one might ask, maybe better to simply leave that box closed.
> [name=Yannis] Agreed.
@done (?) (Johannes, I think my rewording addresses this point, too, but please double-check. Anyhow, I don't see why the reviewer was confused, since we have a P1 pair of bars in Fig. 6.) ~~Actually, upon reaching the last sentence of this section (key segments) I’m no longer confident that you are only graphing modulations but in fact all key areas. If so, please revise to clarify (i.e., it is not ‘key areas of modulation sections’ but ‘key areas over all pieces including modulation sections’).~~
@NOTtodo
~~For fig.7 I’m a real hater of stacked bar plots (sorry)! While I can easily compare the bottom row (PACs) it’s a bit hard to compare anything above that. It’s fine, but I would consider if unstacking them and having skinny side-by-side plots or something else entirely might be better.~~
> [name=Johannes] they are easier to compare than individual pie charts
> [name=Yannis] Agreed. Also they save space (not of first importance in online journals, but still...).
@todo
3rd sentence of Discussion: missing a reference?
@done (Should be a separate publication, in my opinion, also including a study on inter-annotator agreement.)
~~I’m not sure if there is room for a discussion of the challenges in annotating such a dataset? E.g., I’m sure RN for this type of music was not always easy. I think readers would find this discussion useful and further contribute to the project’s transparency. However, I understand as a corpus report there’s a pretty tight word limit, so it would be understandable if there isn’t room for that.~~
Amazing work!
Best,
Claire Arthur
## Resubmission Jan 14th, 2023
Dear Mr. Hansen,
Apologies for not sending over a more detailed report on the changes we applied to the paper during the second half of December, which would have facilitated your review.
> p. 3, Fig. 1: Please double check that the composition dates are correct. I haven’t checked all entries systematically, but I believe Grieg’s Lyric Pieces, for example, were composed between 1864 and 1901 whereas the figure seems to suggest that they were all composed between 1900 and 1910.
Sorry, this was a glitch. Our collaborator had updated the composition dates already but the figure was still the old one. For this resubmission we have released version 1.1 of the dataset which, in addition to the dates taken from Oxford Music Online, comes with an exhaustive set of metadata.
> p. 6, first paragraph: Re Figs 3 and 4, is it worth commenting on the apparent bias towards notes with sharps rather than flats? Do we know (from other corpora or systematic analyses) if this is typical of this repertoire?
We do not know of any systematic analysis of pitch content in that regard.
> p. 7, “The bar plot in Fig. 6”: Don’t you mean “Fig. 7” here?
Yes indeed, we stand corrected, thank you.
> p. 7: The cadential differences between the composers (or, fairly, rather ”between the repertoires” since the observations could be different with larger, more representative samples) are really interesting. Do you know if any of these observations have been made before—e.g., in the musicological literature? This might not be straightforward to figure out, but if you do know of this already, it could be really useful to reference any relevant sources here.
We know of two other datasets with comparable statistics, https://transactions.ismir.net/articles/10.5334/tismir.63/ and http://doi.org/10.5334/tismir.63. The latter contains a comparison of both in Table 2. A meaningful comparison with these statistics, however, would require a more elaborate numerical investigation (e.g. based on randomly sampled subsets) which, we believe, is beyond the scope of a data report.
> In addition, I am now happy with your consistent use of the term "long 19th century" in the revised manuscript, but I would still like you to define exactly what age range this spans (conventionally this would be 1789-1914, but I guess you extend it somewhat) and maybe reference its originator(s). You could do so the first time it is mentioned in the main text. No references should be needed when it is used in the abstract (although you could perhaps state the age range in brackets if that makes sense to you). This is key--especially as the term appears in the title, and as not all readers of an interdisciplinary journal like EMR will be familiar with the term.
In the revised version we added two references and an explanation in end note 2 (apologies for not pointing this out to you). Do you think this resolves the issue for readers?
We hope the revised version resolves the issues raised and look forward to hearing from you.
Kind regards,
Johannes Hentschel