This dataset was created for the paper:
Insights into end-to-end audio-to-score transcription with real recordings: A case study with saxophone works
Juan C. Martínez-Sevilla, María Alfaro-Contreras, Jose J. Valero-Mas, Jorge Calvo-Zaragoza
INTERSPEECH, 2023
The code for the paper can be found here.
About the dataset
Despite a large number of existing works in the Automatic Music Transcription (AMT) field, there is a shortage of end-to-end Audio-to-Score (A2S) transcription efforts, leading to a lack of benchmark corpora, particularly when dealing with real data.
We present a compilation of recorded saxophone performances together with their digital music scores.
The collection includes a total of 1,026 recordings of real performances played on two types of saxophone: tenor and alto. Each recording is paired with its corresponding score in the Humdrum **kern format. The compositions in the dataset, which amount to approximately 3 hours in total duration, encompass a variety of melodies, exercises, and scales, as well as a small number of music incipits extracted from the PrIMuS dataset.
Saxophone | Average duration (s) | Mean symbols per score | Pitch range (transposed to C) |
---|---|---|---|
Tenor | 10.8 ± 2.8 | 27.0 ± 8.6 | A♭2 - G5 |
Alto | | | D♭3 - C6 |
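As a rough consistency check, the reported average duration can be compared with the stated total of approximately 3 hours. The sketch below assumes that the 10.8 s average is taken over all 1,026 recordings; the table does not state this explicitly.

```python
# Rough consistency check of the dataset figures.
# ASSUMPTION: the 10.8 s mean duration covers all 1,026 recordings.
n_recordings = 1026
mean_duration_s = 10.8

total_seconds = n_recordings * mean_duration_s
total_hours = total_seconds / 3600

print(f"Estimated total duration: {total_hours:.2f} h")
```

Under that assumption the estimate comes out at roughly 3.08 hours, in line with the stated total.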
All pieces were recorded in a home studio by musicians proficient on the instrument. Different tempi, styles, and meters were used to increase the variability of the data, and a metronome was used to avoid significant tempo deviations. Note that the saxophone is a transposing instrument (i.e., its music is not notated at concert pitch), which prevents the direct use of written scores in A2S, since a given note token does not represent the same pitch for all members of this instrument family.
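The transposition issue can be illustrated with a minimal sketch that maps a written pitch to its sounding (concert) pitch. It relies on the standard transpositions (alto saxophone in E♭ sounds a major sixth below the written note, tenor saxophone in B♭ a major ninth below). The function and variable names are hypothetical and are not taken from the paper's code.

```python
# Semitone offsets from written pitch to sounding (concert) pitch.
# Standard transpositions: alto (E-flat) sounds a major sixth lower,
# tenor (B-flat) sounds a major ninth lower.
WRITTEN_TO_CONCERT = {"alto": -9, "tenor": -14}

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(midi: int) -> str:
    """Return a sharp-spelled pitch name with octave (MIDI 60 -> 'C4')."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def written_to_concert(written_midi: int, instrument: str) -> int:
    """Shift a written pitch (as a MIDI number) to its sounding pitch."""
    return written_midi + WRITTEN_TO_CONCERT[instrument]


written_c5 = 72  # the same written note, C5, for both instruments
for sax in ("alto", "tenor"):
    sounding = written_to_concert(written_c5, sax)
    print(f"{sax}: written C5 sounds as {midi_to_name(sounding)}")
```

The same written token yields two different sounding pitches (D#4, i.e. E♭4, on alto and A#3, i.e. B♭3, on tenor). Applying the same offsets to a written B♭3 (MIDI 58) gives D♭3 for alto and A♭2 for tenor, which agrees with the lower bounds of the concert-pitch ranges in the table.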
The following equipment was used:
- Microphone: Shure SM57
- Audio interface: Behringer U-Phoria UMC22
- Recording software: Audacity
Click here to download the dataset (TGZ file). It has the following structure:
real_a2s_sax_dataset.tgz/
|- fluidsyth/
|----| alto/
|--------| .wav files
|----| tenor/
|--------| .wav files
|- mididdsp/
|----| alto/
|--------| .wav files
|----| tenor/
|--------| .wav files
|- real/
|----| alto/
|--------| .wav files
|----| tenor/
|--------| .wav files
|- krn/
|----| alto/
|--------| .krn and .skm files
|----| tenor/
|--------| .krn and .skm files
|- Alto_Sax_Index.csv
|- Dataset_Index.xlsx
|- Tenor_Sax_Index.csv
Three index files (Alto_Sax_Index.csv, Tenor_Sax_Index.csv, and Dataset_Index.xlsx) are included alongside the audio files and their corresponding scores.
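Once the archive has been extracted, audio files can be paired with their scores by walking the folder layout. The sketch below is a minimal example and rests on an assumption that is not confirmed here: that each .wav file and its .krn score share the same filename stem. If they do not, the index files are the place to look for the actual mapping. The function name and the extraction path are hypothetical.

```python
from pathlib import Path


def pair_recordings(root: Path, source: str = "real", sax: str = "alto"):
    """Pair each .wav in <root>/<source>/<sax>/ with a score in <root>/krn/<sax>/.

    ASSUMPTION: an audio file and its score share the same filename stem
    (e.g. 'x.wav' and 'x.krn'). Audio files without a match are returned
    separately so the assumption can be checked against the real data.
    """
    audio_dir = root / source / sax
    score_dir = root / "krn" / sax
    pairs, missing = [], []
    for wav in sorted(audio_dir.glob("*.wav")):
        krn = score_dir / f"{wav.stem}.krn"
        if krn.exists():
            pairs.append((wav, krn))
        else:
            missing.append(wav)
    return pairs, missing


if __name__ == "__main__":
    # Hypothetical location of the extracted archive.
    dataset_root = Path("real_a2s_sax_dataset")
    for sax in ("alto", "tenor"):
        pairs, missing = pair_recordings(dataset_root, source="real", sax=sax)
        print(f"{sax}: {len(pairs)} paired, {len(missing)} without a score")
```

Passing the other audio folder names from the listing as `source` pairs those files against the same scores, under the same naming assumption.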
The dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.