9GDB (9 genres database)
- Description: Corpus of chord progressions from nine genres taken from three "domains": popular, jazz, and academic music. Popular music has been separated into three sub-genres: pop, blues, and Celtic (mainly Irish jigs and reels). For jazz, three styles have been established: a pre-bop class grouping swing, early, and Broadway tunes; bop standards; and bossa nova as a representative of Latin jazz. Finally, academic music has been categorized according to historical periods: Baroque, Classicism, and Romanticism.
- Size: The total number of pieces is 856, providing around 60 hours of music data.

Academic (235) | Jazz (338) | Popular (283)
---|---|---
Baroque (56) | Pre-bop (178) | Blues (84)
Classical (50) | Bop (94) | Pop (100)
Romanticism (129) | Bossa nova (66) | Celtic (99)

- Format: The corpus is available in four different encodings, combining absolute and relative encodings for the chord roots with two chord-extension sets. More details can be found in the referenced publication. (A toy illustration of the absolute/relative distinction is given at the end of this section.)
- Metadata: Piece name and genre.
- Limitation: Available for research purposes.
- Ground truth: Categories have been defined with the help and advice of music experts, who have also collaborated in assigning metadata tags to the files and in rejecting outliers, in order to obtain a reliable ground truth.
- References: Pérez-Sancho, C.; Rizo, D.; Iñesta, J. M. "Genre classification using chords and stochastic language models". Connection Science, 21(2-3):145-159, 2009.
@article{
  author = "Pérez-Sancho, C. and Rizo, D. and Iñesta, J. M.",
  title = "Genre classification using chords and stochastic language models",
  issn = "0954-0091",
  journal = "Connection Science",
  month = "May",
  number = "2 & 3",
  pages = "145-159",
  volume = "21",
  year = "2009"
}
- Provided by: Pattern Recognition and Artificial Intelligence Group - University of Alicante (PRAIg-UA)
- Download:
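The exact encodings used in the corpus are defined in the referenced paper. Purely as a toy illustration of the absolute/relative distinction for chord roots mentioned in the Format item, the sketch below maps roots to pitch classes and, for the relative variant, to semitone intervals between consecutive roots; the dictionary and the example progression are assumptions for illustration, not the corpus's actual encoding.

```python
# Toy illustration only: the corpus's real encodings are defined in the reference.
PITCH_CLASS = {"C": 0, "C#": 1, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
               "F#": 6, "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}

def absolute_roots(progression):
    """Absolute encoding: each chord root as a pitch class (0-11)."""
    return [PITCH_CLASS[root] for root, _extension in progression]

def relative_roots(progression):
    """Relative encoding: semitone interval from the previous chord root."""
    pcs = absolute_roots(progression)
    return [(b - a) % 12 for a, b in zip(pcs, pcs[1:])]

# A ii-V-I progression, with a minimal extension vocabulary (chord quality only).
progression = [("D", "m7"), ("G", "7"), ("C", "maj7")]
print(absolute_roots(progression))   # [2, 7, 0]
print(relative_roots(progression))   # [5, 5]
```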
ODB (onset detection database)
- Description: ODB is an onset detection test database built from a set of real recordings. It contains some sounds selected from the RWC database together with other real recordings. The songs were selected to cover a relatively wide range of instruments and musical genres.
- Size: 19 real recordings in wav format and their onset positions in text format.
- Format:
  - wav format: 22,050 Hz, mono, 16 bit.
  - text format: each row contains an onset time in seconds.
- Metadata: Recording name.
- Limitation: Available for research purposes. Not for commercial use.
- Ground truth: The ground-truth onset positions were marked and reviewed using the Speech Filling System (SFS) software: http://www.phon.ucl.ac.uk/resource/sfs/
- Evaluation: An onset detection evaluation software, also available for research purposes, has been developed to compare the detected onsets with the ground truth. The system computes the number of correct detections, false positives, and false negatives, considering a 50 ms error margin (see the sketch at the end of this section).
- References: None.
- Provided by: Pattern Recognition and Artificial Intelligence Group - University of Alicante (PRAIg-UA)
- Download:
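The evaluation software itself is distributed separately; the following is a minimal Python sketch of the matching rule described above: a detection counts as correct if it falls within the 50 ms margin of a still-unmatched ground-truth onset. Function and variable names, and the example data, are illustrative only.

```python
def evaluate_onsets(detected, reference, tolerance=0.050):
    """Compare detected onset times with ground-truth times (both in seconds).

    A detection is correct if it lies within `tolerance` of a ground-truth onset
    that has not been matched yet; each ground-truth onset is matched at most once.
    Returns (correct, false_positives, false_negatives).
    """
    matched = [False] * len(reference)
    correct = 0
    for d in sorted(detected):
        best, best_dist = None, tolerance
        for i, r in enumerate(reference):
            if not matched[i] and abs(d - r) <= best_dist:
                best, best_dist = i, abs(d - r)
        if best is not None:
            matched[best] = True
            correct += 1
    return correct, len(detected) - correct, len(reference) - correct

# The text-format annotations contain one onset time per row, e.g.:
reference = [0.000, 0.512, 1.047, 1.530]     # ground truth (seconds)
detected = [0.012, 0.498, 1.040, 2.000]      # output of some onset detector
print(evaluate_onsets(detected, reference))  # -> (3, 1, 1)
```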
JvC (music genre recognition)
- Description: Automatic music genre recognition, either from audio or from symbolic sources, is a much-researched topic in the music information retrieval community. When dealing with symbolic music sources such as digital scores or MIDI files, the main melody of a music piece is often clearly identified. This corpus contains melodies from jazz and classical music pieces encoded as MIDI file tracks.
- Size: 150 MIDI files distributed in two subdirectories, one per genre: 65 classical music melodies and 85 jazz melodies.
- Format: Each file is a Format 1 MIDI file that contains two tracks:
- The so-called master track, or track 'zero', where the following MIDI metaevents are optionally encoded:
  - Track Name (often contains the song name)
  - Time Signature
  - Tempo
  - Key Signature
- The melody track (track 'one'), containing the melody line. This melody line is not guaranteed to be fully monophonic, although it is in most files.
Most files have the melody concatenated three times, and one or two silence bars at the beginning. Some melody tracks are polyphonic (double octaves, violin parts in a single track, etc.). If you need the tracks to be monophonic, you can reduce them using the smf2txt/txt2smf toolchain available at the following address: http://grfia.dlsi.ua.es/gen.php?id=resources
A command line example that reduces all tracks of a MIDI file to monophonic tracks:
$ smf2txt -p 1 polyphonic.mid | txt2smf -f monophonic.mid
The '-p' option argument selects whether to preserve the top (1) or bottom (2) line. Be aware, however, that the result may not be exactly what you expect: a slight, perceptually insignificant overlap between two otherwise consecutive notes can result in the second note being discarded. A sketch for inspecting the track layout is given at the end of this section.
- Metadata: MIDI metaevents are stored in the master track of each MIDI file. See the 'Format' item above for more information.
- Limitation: For research purposes only. Not for commercial use.
- Definition of task: This is a two-class classification task. Each subdirectory --'jazz' and 'clas'-- contains MIDI files from jazz and classical music, respectively.
- Evaluation measures:
- As explained in Ponce de León and Iñesta (2007), a ten-fold cross-validation scheme is proposed for evaluation, with average accuracy and standard deviation as the proposed measures. For comparing multiple results, an ANOVA test with Bonferroni correction is suggested.
- In that paper, the authors provide extensive results for classification based on fixed-length segments extracted from the melody tracks by means of a sliding-window procedure. An exploration of both the window length and the overlap between consecutive segments suggests window lengths above 30 bars for good classification results.
- Other publications where this corpus has been used are Pérez-Sancho et al. (2006), Ponce de León et al. (2006), and Moreno-Seco et al. (2006).
- Ground truth: Manually annotated. In Iñesta et al. (2008) this ground truth was used to assess human genre recognition capabilities in the absence of timbre-related information.
- References: Ponce de León, P. J. and Iñesta, J. M. "A Pattern Recognition Approach for Music Style Identification Using Shallow Statistical Descriptors". IEEE Transactions on Systems, Man and Cybernetics, Part C, 37(2):248-257, 2007.
@article{
  author = "Ponce de León, P. J. and Iñesta, J. M.",
  title = "A Pattern Recognition Approach for Music Style Identification Using Shallow Statistical Descriptors",
  journal = "IEEE Transactions on Systems, Man and Cybernetics, Part C",
  number = "2",
  pages = "248-257",
  volume = "37",
  year = "2007"
}
- Provided by: Pattern Recognition and Artificial Intelligence Group - University of Alicante (PRAIg-UA)
- Download: jvc1.2.tgz (jazz and classical music MIDI files)
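As a quick way to inspect the track layout described in the Format item, the sketch below reads one of the corpus files with the third-party mido library (an assumption; it is not part of the corpus), prints the optional metaevents of the master track, and counts note onsets in the melody track. The file name is hypothetical.

```python
import mido  # third-party MIDI library, assumed installed (pip install mido)

mid = mido.MidiFile("jazz/example.mid")  # hypothetical file from the 'jazz' subdirectory

# Track 0 is the master track; these metaevents are optional
# (mido names the Tempo metaevent 'set_tempo').
for msg in mid.tracks[0]:
    if msg.is_meta and msg.type in ("track_name", "time_signature", "set_tempo", "key_signature"):
        print(msg)

# Track 1 is the melody track (mostly, but not necessarily, monophonic).
onsets = [msg for msg in mid.tracks[1] if msg.type == "note_on" and msg.velocity > 0]
print(len(onsets), "note onsets in the melody track")
```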
Melody track recognition
- Description: A melody can be defined as a 'cantabile' sequence of notes, usually what a listener can remember of a song after hearing it, but this definition is not computable. MIDI files are organized in tracks, each containing a different voice. A reliable system to identify the track containing the melody of a MIDI file is very relevant in music information retrieval, for indexing and comparing music pieces.
- Size: 4 files are provided, three of them for particular music genres and a fourth one merging the other three, in order to study the possible specificities of melody in different music genres:
- CL200: classical music. 703 instances (16 unlabeled)
- JZ200: jazz. 769 instances (11 unlabeled)
- KR200: karaoke (mainly pop music). 1668 instances (338 unlabeled)
- MEL200.arff: the merge of the 3 files.
In Rizo et al. (2006) the names above were used for the training sets, but in Ponce de León et al. (2007) the merged file was named "SMALL".
3140 samples are provided, 365 of them unlabeled, leaving 2775 labeled samples according to the description in Ponce de León et al. (2007), although this figure is slightly different from the one reported in Rizo et al. (2006). The unlabeled samples belong to files where no melody track was easy to find.
- Format: Each track is described as a vector x ∈ ℝ^34, stored together with its class label and four other tags (the first four attributes are metadata, see below, and the last one is the boolean label 'IsMelody'). Features are statistical descriptors both of the track content (e.g. average note pitch) and of how the track relates to the others in the file (e.g. relative track duration). The ARFF Weka format is used (a loading sketch is given at the end of this section).
- Metadata:
- Name of the MIDI file the track was extracted from.
- Path name, corresponding to the genre of the particular MIDI file.
- Song number (starting from zero for each genre) the track belongs to.
- Instance number, starting from 1 for each MIDI file.
- Limitation: Public use.
- Ground truth: Manually annotated. A multitrack MIDI file may present different situations: one melody track, more than one melody track, or no melody track.
- Definition of task: The problem is stated at two different levels:
- Given a particular MIDI track (extracted from a multitrack MIDI file), does it contain a melody? (A two-class, fully labelled problem; only the partitions for cross-validation need to be defined.)
- Given a multitrack MIDI file, which track contains the melody? ('No melody' is a possible answer; the song number metadata must be taken into account in order to pick the melody track among those belonging to the same song.)
- Evaluation measures:
- As explained in Rizo et al. (2006), for task a) a 10-fold cross-validation is proposed for evaluation. Success rate, precision, recall, and F-measure are suggested. The ten folds were generated automatically by Weka (seed = 1).
- For task b), leave-one-song-out was used due to the limited number of samples. The same performance estimators are proposed, following Rizo et al. (2006). A cross-validation sketch is given at the end of this section.
- References: Rizo, D.; Ponce de León, P. J.; Pérez-Sancho, C.; Pertusa, A.; Iñesta, J. M. "A Pattern Recognition Approach for Melody Track Selection in MIDI Files". In: Proc. of the 7th Int. Symp. on Music Information Retrieval (ISMIR 2006), pp. 61-66, Victoria, Canada, 2006.
@inproceedings{
  author = "Rizo, D. and Ponce de León, P. J. and Pérez-Sancho, C. and Pertusa, A. and Iñesta, J. M.",
  title = "A Pattern Recognition Approach for Melody Track Selection in MIDI Files",
  address = "Victoria, Canada",
  booktitle = "Proc. of the 7th Int. Symp. on Music Information Retrieval ISMIR 2006",
  editor = "Dannenberg R., Lemström K., Tindale A.",
  isbn = "1-55058-349-2",
  pages = "61-66",
  year = "2006"
}
@inproceedings{
  author = "Pedro J. Ponce de León and José M. Iñesta and David Rizo",
  title = "Towards a human-friendly melody characterization by automatically induced rules",
  address = "Vienna",
  booktitle = "Proceedings of the 8th Int. Conf. on Music Information Retrieval, ISMIR 2007",
  editor = "Simon Dixon, David Bainbridge, Rainer Typke",
  month = "September",
  organization = "Austrian Computer Society",
  pages = "437--440",
  publisher = "Austrian Computer Society",
  year = "2007"
}
- Provided by: Pattern Recognition and Artificial Intelligence Group - University of Alicante (PRAIg-UA)
- Download:
- CL200.arff (Classical files)
- JZ200.arff (Jazz files)
- KR200.arff (Karaoke-pop files)
- ALL200.arff (whole set)
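A minimal sketch of loading the ARFF files and running the two evaluation setups described above, assuming the column layout given in the Format and Metadata items (four metadata attributes, 34 numeric descriptors, then the boolean 'IsMelody' label). The encoding of the unlabeled instances, the song-identifying columns, and the random-forest classifier are assumptions for illustration, not the setup used in the referenced papers.

```python
from scipy.io import arff
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, StratifiedKFold, cross_val_score

# Load the merged set (ALL200.arff in the Download list).
data, _ = arff.loadarff("ALL200.arff")
df = pd.DataFrame(data)

X = df.iloc[:, 4:-1].to_numpy(dtype=float)    # the 34 track descriptors
y = df.iloc[:, -1].str.decode("utf-8")        # 'IsMelody'; nominal ARFF values load as bytes
# Genre (path name) + song number identify a song; column positions are assumed.
songs = df.iloc[:, 1].astype(str) + "/" + df.iloc[:, 2].astype(str)

keep = y.notna() & (y != "?")                 # drop unlabeled tracks (missing-label encoding assumed)
X, y, songs = X[keep.to_numpy()], y[keep].to_numpy(), songs[keep].to_numpy()

clf = RandomForestClassifier(random_state=1)  # illustrative classifier only

# Task a): does a given track contain a melody? 10-fold cross-validation over tracks.
acc = cross_val_score(clf, X, y, cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=1))
print(f"10-fold CV accuracy: {acc.mean():.3f} +/- {acc.std():.3f}")

# Task b): leave-one-song-out, so all tracks of a song are held out together.
acc = cross_val_score(clf, X, y, groups=songs, cv=LeaveOneGroupOut())
print(f"Leave-one-song-out accuracy: {acc.mean():.3f} +/- {acc.std():.3f}")
```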