- Calvo-Zaragoza, J.; Valero-Mas, J.J.; Pertusa, A.
"End-To-End Optical Music Recognition using Neural Networks"
Proc. of International Society for Music Information Retrieval Conference (ISMIR), Suzhou, China
This work addresses the Optical Music Recognition (OMR) task in an end-to-end fashion using neural networks. The proposed architecture is based on a Recurrent Convolutional Neural Network topology that takes as input an image of a monophonic score and retrieves a sequence of music symbols as output. In the first stage, a series of convolutional filters are trained to extract meaningful features of the input image, and then a recurrent block models the sequential nature of music. The system is trained using a Connectionist Temporal Classification loss function, which avoids the need for a frame-by-frame alignment between the image and the ground-truth music symbols. Experimentation has been carried on a set of 90,000 synthetic monophonic music scores with more than 50 different possible labels. Results obtained depict classification error rates around 2 % at symbol level, thus proving the potential of the proposed end-to-end architecture for OMR. The source code, dataset, and trained models are publicly released for reproducible research and future comparison purposes.
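To make the alignment-free output concrete, the sketch below shows how per-frame scores from a CTC-trained network might be turned into a symbol sequence, and how a symbol-level error rate could then be computed. It is a minimal illustration, not the authors' released code: the function names, the blank index `0`, greedy (best-path) decoding, and the use of normalized edit distance as the error measure are all assumptions that the abstract does not spell out.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path decoding: pick the top label per frame,
    merge consecutive repeats, then drop blanks."""
    decoded: List[int] = []
    previous = blank
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != blank and best != previous:
            decoded.append(best)
        previous = best
    return decoded


def symbol_error_rate(reference: Sequence[int],
                      hypothesis: Sequence[int]) -> float:
    """Levenshtein distance between two symbol sequences,
    divided by the reference length (assumed metric)."""
    prev_row = list(range(len(hypothesis) + 1))
    for i, ref_sym in enumerate(reference, start=1):
        row = [i]
        for j, hyp_sym in enumerate(hypothesis, start=1):
            cost = 0 if ref_sym == hyp_sym else 1
            row.append(min(prev_row[j] + 1,          # deletion
                           row[j - 1] + 1,           # insertion
                           prev_row[j - 1] + cost))  # substitution or match
        prev_row = row
    return prev_row[-1] / max(len(reference), 1)


# Toy scores for 7 image frames over 3 labels (0 = blank).
frames = [
    [0.10, 0.80, 0.10],
    [0.20, 0.70, 0.10],
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.10, 0.20, 0.70],
    [0.10, 0.10, 0.80],
    [0.90, 0.05, 0.05],
]
predicted = ctc_greedy_decode(frames)
```

The blank frame between the two runs of label `1` is what lets the decoder emit the same symbol twice in a row, which is why no frame-by-frame ground-truth alignment is needed during training.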
author = "Calvo-Zaragoza, J.; Valero-Mas, J.J.; Pertusa, A.",
title = "End-To-End Optical Music Recognition using Neural Networks",
address = "Suzhou, China",
booktitle = "Proc. of International Society for Music Information Retrieval Conference (ISMIR)",
month = "October",
year = "2017"
|Resources associated with this publication|