grfia

+ General Information

+ Members

+ Research

+ Intranet

Access

2021

15th workshop gRFIA

9th Music Encoding Conference (MEI 2021)

Alicante, July 19-23

13th international workshop on Machine Learning and Music

(online) September 18, 2020

Machine Learning

Machine Learning is an essential task for any recognition problem, that permits to train adaptable systems from examples. The structuration of this knowledge as symbolic rules is the domain of grammatical inference, where the group has been specially active: Algorithms for building finite state automata, stochastic automata, tree grammars, etc. from strings and trees and methods that learn to translate between formal languages.

String edit distance

The edit distance is used to compute the similarity of a pair of strings, as the number of operations for transforming one string to the other. If this transformation is based on a random phenomenon and then on an underlying probability distribution, edit operations become random variables. Therefore, will be called the stochastic edit distance: a stochastic transduction between the input and output alphabets.

This approach is useful to handle noise in sequences such as in grammar inference, where is required either to remove or correct noisy data to avoid overfitting phenomena. About above, are required to estimate the parameters of the stochastic transducer with some approach.

Tree edit distance

The stochastic edit distance can be extended to tree comparison. It is possible to model an edit distance as a stochastic process and to use probabilistic methods to learn these costs. To learn a generative model in the form of a joint distribution has an advantage, because it provides an estimate of the unknown joint density with a small variance. However, it also have a drawback due to the estimate is biased because it depends on the distribution of the node labels in the input trees. The second approach is to learn a discriminative model in the form of a conditional distribution, since it provides an unbiased estimate. Nowadays, we are working into development a new forest-based framework in order to allow that the above approaches can be extended.