2014
@article{k304,
title = {The most probable string: an algorithmic study},
author = {C. De La Higuera and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/304/J+Logic+Computation-2013.pdf},
issn = {0955-792X},
year = {2014},
date = {2014-01-01},
urldate = {2014-01-01},
journal = {Journal of Logic and Computation},
volume = {24},
number = {2},
pages = {311-330},
abstract = {The problem of finding the consensus (most probable string) for a distribution generated by a weighted finite automaton or a probabilistic grammar is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is NP-hard if the automaton is probabilistic. We give a pseudo-polynomial algorithm that solves a decision problem directly associated with the consensus string and answers whether there is a (reasonably short) string whose probability is larger than a given bound, in time polynomial in the size of this bound, both for probabilistic finite automata and probabilistic context-free grammars. We also study a randomized algorithm solving the same problem. Finally, we report links between the length of the consensus string and the probability of this string.},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {article}
}
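For intuition, the decision problem described in this entry's abstract (is there a string whose probability exceeds a bound p?) can be read as a pruned breadth-first search over prefixes of a probabilistic finite automaton: a string's probability never exceeds the forward mass of any of its prefixes, and distinct same-length prefixes carry disjoint mass, so at most 1/p prefixes survive per length. The sketch below is only one illustrative reading, not the paper's algorithm; the PFA encoding, the function name `exists_string_above`, and the length cap are assumptions.

```python
def exists_string_above(init, trans, final, p, max_len=100):
    """Decide whether some string of length <= max_len has probability > p.
    init:  {state: initial probability}
    trans: {state: {symbol: [(next_state, probability), ...]}}
    final: {state: stopping probability}
    All names and the encoding are illustrative assumptions."""
    # Each frontier entry maps a prefix to its forward vector (mass per state).
    frontier = {(): dict(init)}
    for _ in range(max_len + 1):
        nxt = {}
        for prefix, fwd in frontier.items():
            # Probability that the automaton emits exactly this prefix.
            if sum(fwd[q] * final.get(q, 0.0) for q in fwd) > p:
                return True
            # Extend by one symbol, accumulating forward mass per next state.
            by_symbol = {}
            for q, mass in fwd.items():
                for sym, arcs in trans.get(q, {}).items():
                    for q2, pr in arcs:
                        vec = by_symbol.setdefault(sym, {})
                        vec[q2] = vec.get(q2, 0.0) + mass * pr
            # Prune: no completion of a prefix can beat its total mass.
            for sym, vec in by_symbol.items():
                if sum(vec.values()) > p:
                    nxt[prefix + (sym,)] = vec
        if not nxt:
            return False
        frontier = nxt
    return False
```

As a usage sketch: a one-state automaton that stops with probability 0.5 or loops on `a` with probability 0.5 assigns probability 0.5 to the empty string, so the answer is positive for p = 0.4 and negative for p = 0.6.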
@article{k308,
title = {A New Iterative Algorithm for Computing a Quality Approximated Median of Strings based on Edit Operations},
author = {J. Abreu and J. R. Rico-Juan},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/308/PRL+-+A+New+Iterative+Algorithm+for+Computing+a+Quality+Median.pdf},
year = {2014},
date = {2014-01-01},
journal = {Pattern Recognition Letters},
volume = {36},
pages = {74--80},
abstract = {This paper presents a new algorithm that can be used to compute an approximation to the median of a set of strings. The approximate median is obtained through the successive improvements of a partial solution. The edit distance from the partial solution to all the strings in the set is computed in each iteration, thus accounting for the frequency of each of the edit operations in all the positions of the approximate median. A goodness index for edit operations is later computed by multiplying their frequency by the cost. Each operation is tested, starting from that with the highest index, in order to verify whether applying it to the partial solution leads to an improvement. If successful, a new iteration begins from the new approximate median. The algorithm finishes when all the operations have been examined without a better solution being found. Comparative experiments involving Freeman chain codes encoding 2D shapes and the Copenhagen chromosome database show that the quality of the approximate median string is similar to benchmark approaches but achieves a much faster convergence.},
keywords = {TIASA},
pubstate = {published},
tppubtype = {article}
}
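The iterative scheme described in this entry's abstract can be sketched in a few lines, assuming unit edit costs (so the goodness index reduces to the operation's frequency). Function names (`edit_ops`, `approximate_median`) and the seeding choice are illustrative, not the authors' implementation.

```python
def edit_ops(a, b):
    """One minimal-cost edit script turning a into b, as
    (position_in_a, op, symbol) triples; unit costs, matches omitted."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # delete a[i-1]
                          d[i][j - 1] + 1,          # insert b[j-1]
                          d[i - 1][j - 1] + cost)   # substitute / match
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            if a[i - 1] != b[j - 1]:
                ops.append((i - 1, 'sub', b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append((i - 1, 'del', None))
            i -= 1
        else:
            ops.append((i, 'ins', b[j - 1]))
            j -= 1
    return ops

def total_distance(cand, strings):
    return sum(len(edit_ops(cand, s)) for s in strings)

def approximate_median(strings):
    median = strings[0]                    # any member can seed the search
    best = total_distance(median, strings)
    improved = True
    while improved:
        improved = False
        # Tally how often each edit operation occurs across the whole set.
        freq = {}
        for s in strings:
            for op in edit_ops(median, s):
                freq[op] = freq.get(op, 0) + 1
        # Test operations by descending goodness (frequency x unit cost).
        for (pos, kind, sym), _ in sorted(freq.items(), key=lambda kv: -kv[1]):
            if kind == 'sub':
                cand = median[:pos] + sym + median[pos + 1:]
            elif kind == 'del':
                cand = median[:pos] + median[pos + 1:]
            else:
                cand = median[:pos] + sym + median[pos:]
            score = total_distance(cand, strings)
            if score < best:                # improvement: restart iteration
                median, best, improved = cand, score, True
                break
    return median
```

For example, on {"kitten", "sitten", "sittin"} the search moves from the seed "kitten" to a string whose summed distance to the set is 2, at which point no single edit improves it and the algorithm stops.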
2013
@inproceedings{k305,
title = {A Multimodal Genre Recognition Prototype},
author = {C. Pérez-Sancho and J. F. Bernabeu},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/305/wsrfai2013_submission_4.pdf},
isbn = {978-84-695-8332-6},
year = {2013},
date = {2013-09-01},
urldate = {2013-09-01},
booktitle = {Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes},
pages = {13-16},
address = {Madrid, Spain},
abstract = {In this paper, a multimodal and interactive prototype to perform music genre classification is presented. The system is oriented to multi-part files in symbolic format but it can be adapted using a transcription system to transform audio content in music scores. This prototype uses different sources of information to give a possible answer to the user. It has been developed to allow a human expert to interact with the system to improve its results. In its current implementation, it offers a limited range of interaction and multimodality. Further development aimed at full interactivity and multimodal interactions is discussed.},
keywords = {DRIMS, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{k306,
title = {Human-Computer Interaction for Optical Music Recognition tasks},
author = {J. Calvo-Zaragoza and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/306/wsrfai2013_submission_3.pdf},
isbn = {978-84-695-8332-6},
year = {2013},
date = {2013-09-01},
urldate = {2013-09-01},
booktitle = {Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes},
pages = {9-12},
address = {Madrid, Spain},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
@article{k301,
title = {Which fast nearest neighbour search algorithm to use?},
author = {A. Serrano and L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/301/IbPria13.pdf},
year = {2013},
date = {2013-06-01},
journal = {Lecture Notes in Computer Science},
volume = {7887},
pages = {567-574},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {article}
}
@book{k302,
title = {Pattern Recognition and Image Analysis: 6th Iberian Conference, IbPRIA 2013},
author = {J. M. Sanches and L. Micó and J. S. Cardoso},
editor = {J. M. Sanches and L. Micó and J. S. Cardoso},
year = {2013},
date = {2013-01-01},
publisher = {Springer},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {book}
}
@inproceedings{k307,
title = {Recognition of Online Handwritten Music Symbols},
author = {J. Calvo-Zaragoza and J. Oncina and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/307/calvozaragoza-mml13.pdf},
year = {2013},
date = {2013-01-01},
urldate = {2013-01-01},
booktitle = {Proceedings of the 6th International Workshop on Machine Learning and Music},
address = {Prague, Czech Republic},
abstract = {An effective way of digitizing a new musical composition is to use an e-pen and tablet application in which the user's pen strokes are recognized online and the digital score is created with the sole effort of the composition itself. This work aims to be a starting point for research on the recognition of online handwritten music notation. To this end, different alternatives within the two modalities of recognition resulting from this data are presented: online recognition, which uses the strokes marked by a pen, and offline recognition, which uses the image generated after drawing the symbol. A comparative experiment with common machine learning algorithms over a dataset of 3800 samples and 32 different music symbols is presented. Results show that samples of the actual user are needed if good classification rates are pursued. Moreover, algorithms using the online data, on average, achieve better classification results than the others.},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
2012
@article{k283,
title = {New rank methods for reducing the size of the training set using the nearest neighbor rule},
author = {J. R. Rico-Juan and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/283/rankTrainingSet.pdf},
issn = {0167-8655},
year = {2012},
date = {2012-04-01},
journal = {Pattern Recognition Letters},
volume = {33},
number = {5},
pages = {654--660},
doi = {10.1016/j.patrec.2011.07.019},
abstract = {Some new rank methods to select the best prototypes from a training set are proposed in this paper in order to establish its size according to an external parameter, while maintaining the classification accuracy. The traditional methods that filter the training set in a classification task like editing or condensing have some rules that apply to the set in order to remove outliers or keep some prototypes that help in the classification. In our approach, new voting methods are proposed to compute the prototype probability and help to classify correctly a new sample. This probability is the key to sorting the training set out, so a relevance factor from 0 to 1 is used to select the best candidates for each class whose accumulated probabilities are less than that parameter. This approach makes it possible to select the number of prototypes necessary to maintain or even increase the classification accuracy. The results obtained in different high dimensional databases show that these methods maintain the final error rate while reducing the size of the training set.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
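One hypothetical reading of the rank-and-select scheme sketched in this entry's abstract: each training sample votes for a prototype that helps classify it (here, its nearest same-class neighbour), votes are normalised into per-class probabilities, and the best-ranked prototypes of each class are kept until their accumulated probability reaches the relevance factor. The voting rule and all names are illustrative assumptions, not the paper's exact method.

```python
def reduce_training_set(X, y, dist, relevance):
    """Return indices of the prototypes kept for a given relevance in [0, 1].
    X: samples; y: class labels; dist: distance function (assumptions)."""
    votes = [0.0] * len(X)
    for i in range(len(X)):
        # Each sample votes for its nearest same-class neighbour.
        best, best_d = None, float('inf')
        for j in range(len(X)):
            if j != i and y[j] == y[i] and dist(X[i], X[j]) < best_d:
                best, best_d = j, dist(X[i], X[j])
        if best is not None:
            votes[best] += 1.0
    kept = []
    for label in set(y):
        idx = [i for i in range(len(X)) if y[i] == label]
        total = sum(votes[i] for i in idx) or 1.0
        acc = 0.0
        # Keep best-ranked prototypes while accumulated probability < relevance.
        for i in sorted(idx, key=lambda i: -votes[i]):
            if acc >= relevance:
                break
            kept.append(i)
            acc += votes[i] / total
    return kept
```

With a small one-dimensional toy set of two well-separated classes and relevance 0.5, only the most-voted prototype of each class survives, illustrating how the external parameter trades set size against coverage.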
@article{k290,
title = {Confidence voting method ensemble applied to off-line signature verification},
author = {J. R. Rico-Juan and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/290/rico_juanConfidenceVotingMethodEmsembleOffLineSignatureVerification.pdf},
issn = {1433-7541},
year = {2012},
date = {2012-04-01},
journal = {Pattern Analysis and Applications},
volume = {15},
number = {2},
pages = {113--120},
abstract = {In this paper, a new approach to off-line signature verification is proposed based on two-class classifiers using an ensemble of expert decisions. Different methods to extract sets of local and global features from the target sample are detailed. A normalisation by confidence voting method is also used in order to decrease the final equal error rate (EER). In one approach, each set of features is processed by a single expert; in the other approach proposed, the decisions of the individual classifiers are combined using weighted votes. Experimental results are given using a subcorpus of the large MCYT signature database for random and skilled forgeries. The results show that the weighted combination outperforms the individual classifiers significantly. The best EERs obtained were 6.3% in the case of skilled forgeries and 2.3% in the case of random forgeries.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
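The weighted-vote combination mentioned in this entry's abstract can be sketched minimally as below; the normalisation, the weights, and the function name are illustrative assumptions, not the paper's exact confidence-voting scheme.

```python
def weighted_vote(expert_scores, weights, threshold=0.5):
    """Combine per-expert confidences in (0, 1) that a signature is genuine,
    using reliability weights (both illustrative). Returns decision and the
    normalised ensemble confidence."""
    total = sum(weights)
    confidence = sum(w * s for w, s in zip(weights, expert_scores)) / total
    return ('genuine' if confidence >= threshold else 'forgery'), confidence
```

For instance, three experts scoring 0.9, 0.8 and 0.3 with weights 2, 1 and 1 yield an ensemble confidence of 0.725, so the weighted decision is "genuine" even though one expert disagrees.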
@inproceedings{k282,
title = {Restructuring Versus non Restructuring Insertions in MDF Indexes},
author = {A. Serrano and L. Micó and J. Oncina},
editor = {Pedro Latorre Carmona and J. Salvador Sánchez and Ana Fred},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/282/ICPRAM12.pdf},
isbn = {978-989-8425-98-0},
year = {2012},
date = {2012-02-01},
booktitle = {ICPRAM 2012: 1st International Conference on Pattern Recognition Applications and Methods},
pages = {474--480},
publisher = {SciTePress},
address = {Vilamoura, Portugal},
organization = {INSTICC},
abstract = {An MDF tree is a data structure (index) used to speed up similarity searches in huge databases. To achieve this goal, indexes should exploit some property of the dissimilarity measure. MDF indexes assume that the dissimilarity measure can be viewed as a distance in a metric space. Moreover, in this framework it is assumed that the distance is computationally very expensive, so counting distance computations is a good measure of time complexity.
To cope with a changing world, a problem arises when new points must be inserted into the index. Efficient algorithms must choose between being efficient in search, maintaining the “ideal” structure of the index, or being efficient when inserting but worsening the search time.
In this work we propose an insertion algorithm for MDF trees that focuses on optimizing insertion times. The worst-case time complexity of the algorithm depends only on the depth of the MDF tree. We compare this algorithm with a similar one that focuses on search time performance. We also study the range of applicability of each one.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
@article{k287,
title = {A log square average case algorithm to make insertions in fast similarity search},
author = {L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/287/prl.pdf},
year = {2012},
date = {2012-01-01},
journal = {Pattern Recognition Letters},
volume = {33},
number = {9},
pages = {1060--1065},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
@inproceedings{k289,
title = {Structural Graph Extraction from Images},
author = {A. J. Gallego-Sánchez and J. Calera-Rubio and D. López},
editor = {S. Omatu and Juan F. De Paz Santana and S. Rodríguez González and J. M. Molina and A. M. Bernardos},
isbn = {978-3-642-28764-0},
year = {2012},
date = {2012-01-01},
urldate = {2012-01-01},
booktitle = {Distributed Computing and Artificial Intelligence},
pages = {717-724},
publisher = {Springer Berlin / Heidelberg},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
@phdthesis{k293,
title = {Detección de regularidades en contornos 2D, cálculo aproximado de medianas y su aplicación en tareas de clasificación},
author = {J. Abreu},
editor = {J. R. Rico},
year = {2012},
date = {2012-01-01},
urldate = {2012-01-01},
organization = {Universidad de Alicante},
abstract = {In this work, we address two main problems: identifying regularities in contours encoded by Freeman chain codes, and constructing average contours from such contours. The proposed solutions rely on information gathered from the Levenshtein edit distance computation. We describe a new method for quantifying the regularity of contours and comparing them, when encoded by Freeman chain codes, in terms of a similarity criterion. The criterion used allows subsequences to be found from the minimal-cost edit sequence that specifies an alignment of contour segments which are similar. Two external parameters adjust the similarity criterion. The information about each similar part is encoded by strings that represent an average contour region. An explanation of how to construct a prototype based on the identified regularities is also reviewed. The reliability of the prototypes is evaluated by replacing groups of contours (samples) with new prototypes used as the training set in a classification task. This way, the size of the data set can be reduced without sensibly affecting its representational power for classification purposes. Experimental results show that this scheme achieves a reduction in the size of the training data set of about 80% while the classification error only increases by 0.45% in one of the three data sets studied. This thesis also presents a new fast algorithm for computing an approximation to the mean between two strings of characters representing a 2D shape, and its application to a new Wilson-based editing procedure. The approximate mean is built by including some symbols from the two original strings. In addition, a greedy variant of this algorithm is studied which reduces the time required to compute an approximate mean. The new dataset editing scheme relaxes the criterion for deleting instances proposed by the Wilson editing procedure.
In practice, not all instances misclassified by their near neighbors are pruned. Instead, an artificial instance is added to the dataset in the hope of successfully classifying the instance in the future. The new artificial instance is the approximate mean of the misclassified sample and its same-class nearest neighbor. Experiments carried out over three widely known databases of contours show that the proposed algorithms perform very well in computing the mean of two strings, outperforming methods proposed by other authors. In particular, the low computational time required by the heuristic approach makes it very suitable when dealing with long strings. Results also show that the proposed preprocessing scheme can reduce the classification error in about 83% of trials. There is empirical evidence that using the greedy approximation to compute the approximate mean does not affect the editing procedure's performance. Finally, a new algorithm with which to compute an approximation to the mean of a set of strings is presented. The approximate mean is computed through successive improvements of a partial solution. In each iteration, the edit distances from the partial solution to all the strings in the set are computed, thus accounting for the frequency of each edit operation in every position of the approximate mean. A goodness index for edit operations is then computed by multiplying their frequency by the cost. Each operation is tested, starting from that with the highest index, in order to verify whether applying it to the partial solution leads to an improvement. If successful, a new iteration begins from the new approximate mean. The algorithm finishes after all the operations have been examined without a better solution being found.
Comparative experiments involving Freeman chain codes encoding 2D shapes show that the quality of the approximate median string is similar to that of other approaches but achieves a much faster convergence.},
keywords = {ISIC 2010, TIASA},
pubstate = {published},
tppubtype = {phdthesis}
}
@inproceedings{k296,
title = {Inference of k-Testable Directed Acyclic Graph Languages},
author = {D. López and J. Calera-Rubio and A. J. Gallego-Sánchez},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/296/lopez12a.pdf},
year = {2012},
date = {2012-01-01},
booktitle = {Journal of Machine Learning Research: Workshop and Conference Proceedings, Vol. 21: ICGI 2012},
pages = {149-163},
abstract = {In this paper, we tackle the task of graph language learning. We first extend the well-known classes of k-testability and k-testability in the strict sense languages to directed graph languages. Second, we propose a graph automata model for directed acyclic graph languages. This graph automata model is used to propose a grammatical inference algorithm to learn the class of directed acyclic k-testable in the strict sense graph languages. The algorithm runs in polynomial time and identifies this class of languages from positive data.},
keywords = {PASCAL2, Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
In this paper, we tackle the task of graph language learning. We first extend the well-known classes of k-testability and k-testability in the strict sense languages to directed graph languages. Second, we propose a graph automata model for directed acyclic graph languages. This graph automata model is used to propose a grammatical inference algorithm to learn the class of directed acyclic k-testable in the strict sense graph languages. The algorithm runs in polynomial time and identifies this class of languages from positive data.
2011
Higuera, C. De La; Oncina, J.
Finding the most probable string and the consensus string: an algorithmic study Proceedings Article
In: 12th International Conference on Parsing Technologies (IWPT 2011), pp. 26-36, Dublin, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k288,
title = {Finding the most probable string and the consensus string: an algorithmic study},
author = {C. De La Higuera and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/288/iwpt2011.pdf},
year = {2011},
date = {2011-10-01},
urldate = {2011-10-01},
booktitle = {12th International Conference on Parsing Technologies (IWPT 2011)},
pages = {26-36},
address = {Dublin},
abstract = {The problem of finding the most probable string for a distribution generated by a weighted finite automaton is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is $NP$-hard if the automaton is probabilistic. In this paper we give a pseudo-polynomial algorithm which computes the most probable string in time polynomial in the inverse of the probability of this string itself. We also give a randomised algorithm solving the same problem and discuss the case where the distribution is generated by other types of machines.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
The problem of finding the most probable string for a distribution generated by a weighted finite automaton is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is $NP$-hard if the automaton is probabilistic. In this paper we give a pseudo-polynomial algorithm which computes the most probable string in time polynomial in the inverse of the probability of this string itself. We also give a randomised algorithm solving the same problem and discuss the case where the distribution is generated by other types of machines. Socorro, R.; Micó, L.; Oncina, J.
A fast pivot-based indexing algorithm for metric spaces Journal Article
In: Pattern Recognition Letters, vol. 32, no. 11, pp. 1511-1516, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@article{k266,
title = {A fast pivot-based indexing algorithm for metric spaces},
author = {R. Socorro and L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/266/piaesa-prl.pdf},
year = {2011},
date = {2011-08-01},
urldate = {2011-08-01},
journal = {Pattern Recognition Letters},
volume = {32},
number = {11},
pages = {1511-1516},
abstract = {This work focuses on fast nearest neighbor (NN) search algorithms that can work in any metric space (not just with the Euclidean distance) and where the distance computation is very time consuming. One of the best known methods in this field is the AESA algorithm, used as a baseline for performance measurement for over twenty years. AESA repeats two steps: first it searches for a promising NN candidate and computes its distance (approximation step); next it eliminates all the unsuitable NN candidates in view of the new information acquired in the previous calculation (elimination step).
This work introduces the PiAESA algorithm. This algorithm improves the performance of AESA by splitting the approximation criterion: in the first iterations, when there is not enough information to find good NN candidates, it uses a list of pivots (objects in the database) to obtain a cheap approximation of the distance function. Once a good approximation is obtained, it switches to the usual AESA behavior. As the pivot list is built at preprocessing time, the run time of PiAESA is almost the same as that of AESA.
In this work, we report experiments comparing this approach with some competing methods. Our empirical results show that this new approach obtains a significant reduction of distance computations with no execution time penalty.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
This work focuses on fast nearest neighbor (NN) search algorithms that can work in any metric space (not just with the Euclidean distance) and where the distance computation is very time consuming. One of the best known methods in this field is the AESA algorithm, used as a baseline for performance measurement for over twenty years. AESA repeats two steps: first it searches for a promising NN candidate and computes its distance (approximation step); next it eliminates all the unsuitable NN candidates in view of the new information acquired in the previous calculation (elimination step).
This work introduces the PiAESA algorithm. This algorithm improves the performance of AESA by splitting the approximation criterion: in the first iterations, when there is not enough information to find good NN candidates, it uses a list of pivots (objects in the database) to obtain a cheap approximation of the distance function. Once a good approximation is obtained, it switches to the usual AESA behavior. As the pivot list is built at preprocessing time, the run time of PiAESA is almost the same as that of AESA.
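The approximation/elimination loop at the heart of AESA can be sketched as follows. This is a schematic illustration under the usual assumptions (all pairwise distances precomputed, triangle-inequality lower bounds), not the PiAESA code itself; all names are hypothetical:

```python
import math
from itertools import combinations

def euclid(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])

def build_index(db, dist):
    """AESA preprocessing: store every pairwise distance (quadratic memory)."""
    table = {}
    for a, b in combinations(db, 2):
        table[(a, b)] = table[(b, a)] = dist(a, b)
    return table

def nn_search(db, table, q, dist):
    """Alternate approximation (pick the candidate with the smallest lower bound)
    and elimination (prune points whose triangle-inequality lower bound
    |d(q,s) - d(s,x)| already exceeds the best distance found so far)."""
    alive = {x: 0.0 for x in db}   # x -> current lower bound on dist(q, x)
    best, best_d, n_dist = None, float('inf'), 0
    while alive:
        s = min(alive, key=alive.get)      # approximation step
        del alive[s]
        d_qs = dist(q, s)
        n_dist += 1
        if d_qs < best_d:
            best, best_d = s, d_qs
        for x in list(alive):              # elimination step, s acts as a pivot
            lb = max(alive[x], abs(d_qs - table[(s, x)]))
            if lb >= best_d:
                del alive[x]               # cannot beat the current best
            else:
                alive[x] = lb              # keep the tightened bound
    return best, best_d, n_dist
```

The count `n_dist` is the figure of merit in this literature: the search typically computes far fewer distances to the query than the brute-force scan of the whole database.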
In this work, we report experiments comparing this approach with some competing methods. Our empirical results show that this new approach obtains a significant reduction of distance computations with no execution time penalty. Oncina, J.; Vidal, E.
Interactive Structured Output Prediction: Application to Chromosome Classification Journal Article
In: Pattern Recognition and Image Analysis (LNCS), vol. 6669, pp. 256-264, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@article{k267,
title = {Interactive Structured Output Prediction: Application to Chromosome Classification},
author = {J. Oncina and E. Vidal},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/267/karyo.pdf},
year = {2011},
date = {2011-06-01},
urldate = {2011-06-01},
journal = {Pattern Recognition and Image Analysis (LNCS)},
volume = {6669},
pages = {256-264},
abstract = {Interactive Pattern Recognition concepts and techniques are applied to problems with structured output, i.e., problems in which the result is not just a simple class label, but a suitable structure of labels. For illustration purposes, (a simplification of) the problem of Human Karyotyping is considered. Results show that a) taking into account label dependencies in a karyogram significantly reduces the classical (non-interactive) chromosome label prediction error rate and b) results are further improved when interactive processing is adopted.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Interactive Pattern Recognition concepts and techniques are applied to problems with structured output, i.e., problems in which the result is not just a simple class label, but a suitable structure of labels. For illustration purposes, (a simplification of) the problem of Human Karyotyping is considered. Results show that a) taking into account label dependencies in a karyogram significantly reduces the classical (non-interactive) chromosome label prediction error rate and b) results are further improved when interactive processing is adopted. Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.; Rizo, D.
Melodic Identification Using Probabilistic Tree Automata Journal Article
In: Journal of New Music Research, vol. 40, no. 2, pp. 93-103, 2011, ISSN: 0929-8215.
Abstract | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@article{k270,
title = {Melodic Identification Using Probabilistic Tree Automata},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta and D. Rizo},
issn = {0929-8215},
year = {2011},
date = {2011-06-01},
urldate = {2011-06-01},
journal = {Journal of New Music Research},
volume = {40},
number = {2},
pages = {93-103},
abstract = {Similarity computation is a difficult issue in music information retrieval tasks, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree representation, using rhythm for defining the tree structure and pitch information for leaf and node labelling has proven to be effective in melodic similarity computation. One of the main drawbacks of this approach is that the tree comparison algorithms are of a high time complexity. In this paper, stochastic k-testable tree-models are applied for computing the similarity between two melodies as a probability. The results are compared to those achieved by tree edit distances, showing that k-testable tree-models outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For it, the utilized method must be able to deal with inexact queries and with efficiency for scalability issues.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Similarity computation is a difficult issue in music information retrieval tasks, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree representation, using rhythm for defining the tree structure and pitch information for leaf and node labelling has proven to be effective in melodic similarity computation. One of the main drawbacks of this approach is that the tree comparison algorithms are of a high time complexity. In this paper, stochastic k-testable tree-models are applied for computing the similarity between two melodies as a probability. The results are compared to those achieved by tree edit distances, showing that k-testable tree-models outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For it, the utilized method must be able to deal with inexact queries and with efficiency for scalability issues. Socorro, R.; Micó, L.; Oncina, J.
Efficient search supporting several similarity queries by reordering pivots Proceedings Article
In: Signal Processing, Pattern Recognition, and Applications (SPPRA 2011), pp. 114-120, ACTA Press, Innsbruck, Austria, 2011, ISBN: 978-0-88986-865-6.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k260,
title = {Efficient search supporting several similarity queries by reordering pivots},
author = {R. Socorro and L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/260/sppra.pdf},
isbn = {978-0-88986-865-6},
year = {2011},
date = {2011-02-01},
booktitle = {Signal Processing, Pattern Recognition, and Applications (SPPRA 2011)},
pages = {114-120},
publisher = {ACTA Press},
address = {Innsbruck, Austria},
abstract = {Effective similarity search indexing in general metric spaces has traditionally received special attention in several areas of interest like pattern recognition, computer vision or information retrieval. A typical method is based on the use of a distance as a dissimilarity function (not restricting to Euclidean distance) where the main objective is to speed up the search of the most similar object in a database by
minimising the number of distance computations. Several types of search can be defined, the most common being the k-nearest neighbour and the range search. AESA is one of the best known of such algorithms due to its performance (measured in distance computations). PiAESA is an AESA variant where the main objective has changed: instead of trying to find the best nearest neighbour candidate at each step, it tries to find the object that contributes most to tightening the lower bound function, that is, to a better estimation of the distance. In this paper we extend and test PiAESA to support several similarity queries. Our empirical results show that this approach obtains a significant improvement in performance when compared with competing algorithms.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Effective similarity search indexing in general metric spaces has traditionally received special attention in several areas of interest like pattern recognition, computer vision or information retrieval. A typical method is based on the use of a distance as a dissimilarity function (not restricting to Euclidean distance) where the main objective is to speed up the search of the most similar object in a database by
minimising the number of distance computations. Several types of search can be defined, the most common being the k-nearest neighbour and the range search. AESA is one of the best known of such algorithms due to its performance (measured in distance computations). PiAESA is an AESA variant where the main objective has changed: instead of trying to find the best nearest neighbour candidate at each step, it tries to find the object that contributes most to tightening the lower bound function, that is, to a better estimation of the distance. In this paper we extend and test PiAESA to support several similarity queries. Our empirical results show that this approach obtains a significant improvement in performance when compared with competing algorithms. Abreu, J.; Rico-Juan, J. R.
Characterization of contour regularities based on the Levenshtein edit distance Journal Article
In: Pattern Recognition Letters, vol. 32, pp. 1421-1427, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@article{k263,
title = {Characterization of contour regularities based on the Levenshtein edit distance},
author = {J. Abreu and J. R. Rico-Juan},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/263/2009_J_IbPRIA.pdf},
year = {2011},
date = {2011-01-01},
urldate = {2011-01-01},
journal = {Pattern Recognition Letters},
volume = {32},
pages = {1421-1427},
abstract = {This paper describes a new method for quantifying the regularity of contours and comparing them (when encoded by Freeman chain codes) in terms of a similarity criterion which relies on information gathered from Levenshtein edit distance computation. The criterion used allows subsequences to be found from the minimal cost edit sequence that specifies an alignment of contour segments which are similar. Two external parameters adjust the similarity criterion. The information about each similar part is encoded by strings that represent an average contour region. An explanation of how to construct a prototype based on the identified regularities is also reviewed. The reliability of the prototypes is evaluated by replacing contour groups (samples) by new prototypes used as the training set in a classification task. This way, the size of the data set can be reduced without sensibly affecting its representational power for classification purposes. Experimental results show that this scheme achieves a reduction in the size of the training data set of about 80% while the classification error only increases by 0.45% in one of the three data sets studied.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
This paper describes a new method for quantifying the regularity of contours and comparing them (when encoded by Freeman chain codes) in terms of a similarity criterion which relies on information gathered from Levenshtein edit distance computation. The criterion used allows subsequences to be found from the minimal cost edit sequence that specifies an alignment of contour segments which are similar. Two external parameters adjust the similarity criterion. The information about each similar part is encoded by strings that represent an average contour region. An explanation of how to construct a prototype based on the identified regularities is also reviewed. The reliability of the prototypes is evaluated by replacing contour groups (samples) by new prototypes used as the training set in a classification task. This way, the size of the data set can be reduced without sensibly affecting its representational power for classification purposes. Experimental results show that this scheme achieves a reduction in the size of the training data set of about 80% while the classification error only increases by 0.45% in one of the three data sets studied. Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.
Classifying melodies using tree grammars Journal Article
In: Lecture Notes in Computer Science, vol. 6669, pp. 572–579, 2011, ISSN: 0302-9743.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@article{k264,
title = {Classifying melodies using tree grammars},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/264/ibpria2011-bernabeu.pdf},
issn = {0302-9743},
year = {2011},
date = {2011-01-01},
journal = {Lecture Notes in Computer Science},
volume = {6669},
pages = {572--579},
abstract = {Similarity computation is a difficult issue in music information retrieval, because it tries to emulate the special ability that humans show
for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what
is the best representation for symbolic music in this context. The tree representation, using rhythm for defining the tree structure and pitch information for leaf and node labeling, has proven to be effective in melodic similarity computation. In this paper we propose a solution for the case in which training melodies are represented by trees but duration information is not available for the input data. For that, we infer a probabilistic context-free grammar using the information in the trees (duration and pitch) and classify new melodies represented by strings using only the pitch. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For it, the utilized method must be able to deal with inexact queries and be efficient for scalability issues.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Similarity computation is a difficult issue in music information retrieval, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree representation, using rhythm for defining the tree structure and pitch information for leaf and node labeling, has proven to be effective in melodic similarity computation. In this paper we propose a solution for the case in which training melodies are represented by trees but duration information is not available for the input data. For that, we infer a probabilistic context-free grammar using the information in the trees (duration and pitch) and classify new melodies represented by strings using only the pitch. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For it, the utilized method must be able to deal with inexact queries and be efficient for scalability issues. Serrano, A.; Micó, L.; Oncina, J.
Impact of the Initialization in Tree-Based Fast Similarity Search Techniques Proceedings Article
In: Pelillo, M.; Hancock, E. R. (Ed.): SIMBAD'11 Proceedings of the First international conference on Similarity-based pattern recognition, pp. 163-176, Springer, Venecia, Italia, 2011, ISBN: 978-3-642-24470-4.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k272,
title = {Impact of the Initialization in Tree-Based Fast Similarity Search Techniques},
author = {A. Serrano and L. Micó and J. Oncina},
editor = {M. Pelillo and E. R. Hancock},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/272/simbad11.pdf},
isbn = {978-3-642-24470-4},
year = {2011},
date = {2011-01-01},
booktitle = {SIMBAD'11 Proceedings of the First international conference on Similarity-based pattern recognition},
pages = {163-176},
publisher = {Springer},
address = {Venecia, Italia},
abstract = {Many fast similarity search techniques rely on the use of pivots (specially selected points in the data set). Using these points, specific structures (indexes) are built, speeding up the search when querying. Usually, pivot selection techniques are incremental, with the first pivot chosen at random.
This article explores several techniques to choose the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity.
Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Many fast similarity search techniques rely on the use of pivots (specially selected points in the data set). Using these points, specific structures (indexes) are built, speeding up the search when querying. Usually, pivot selection techniques are incremental, with the first pivot chosen at random.
This article explores several techniques to choose the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity.
Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones.
2010
Calera-Rubio, J.; Bernabeu, J. F.
Tree language automata for melody recognition Proceedings Article
In: Pérez, Juan Carlos (Ed.): Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI), pp. 17-22, AERFAI IBERGARCETA PUBLICACIONES, S.L., Valencia, Spain, 2010, ISBN: 978-84-92812-66-0.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@inproceedings{k251,
title = {Tree language automata for melody recognition},
author = {J. Calera-Rubio and J. F. Bernabeu},
editor = {Juan Carlos Pérez},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/251/bernabeuCEDI2010Final.pdf},
isbn = {978-84-92812-66-0},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
booktitle = {Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI)},
pages = {17-22},
publisher = {IBERGARCETA PUBLICACIONES, S.L.},
address = {Valencia, Spain},
organization = {AERFAI},
abstract = {The representation of symbolic music by
means of trees has shown to be suitable in
melodic similarity computation. In order to
compare trees, different tree edit distances
have been previously used, their complexity being a main drawback. In this paper, the application of stochastic k-testable tree-models for computing the similarity between two melodies as a probability, compared with the classical edit distance, has been addressed. The results show that k-testable tree-models seem to be adequate for the task, since they outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs. For it, the utilized method must be able to deal with inexact queries and be efficient for scalability issues.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
The representation of symbolic music by means of trees has shown to be suitable in melodic similarity computation. In order to compare trees, different tree edit distances have been previously used, their complexity being a main drawback. In this paper, the application of stochastic k-testable tree-models for computing the similarity between two melodies as a probability, compared with the classical edit distance, has been addressed. The results show that k-testable tree-models seem to be adequate for the task, since they outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs. For it, the utilized method must be able to deal with inexact queries and be efficient for scalability issues. Rico-Juan, J. R.; Abreu, J.
A new editing scheme based on a fast two-string median computation applied to OCR Proceedings Article
In: Hancok, E. R.; Wilson, R. C.; Ilkay, T. W.; Escolano, F. (Ed.): Structural, Syntactic, and Statistical Pattern Recognition, pp. 748–756, Springer, Cesme, Izmir, Turkey, 2010, ISBN: 978-3-642-14979-5.
Abstract | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k247,
title = {A new editing scheme based on a fast two-string median computation applied to OCR},
author = {J. R. Rico-Juan and J. Abreu},
editor = {E. R. Hancok and R. C. Wilson and T. W. Ilkay and F. Escolano},
isbn = {978-3-642-14979-5},
year = {2010},
date = {2010-08-01},
urldate = {2010-08-01},
booktitle = {Structural, Syntactic, and Statistical Pattern Recognition},
pages = {748--756},
publisher = {Springer},
address = {Cesme, Izmir, Turkey},
abstract = {This paper presents a new fast algorithm to compute an approximation to the median between two strings of characters representing a 2D shape, and its application to a new classification scheme to decrease its error rate. The median string results from the application of certain edit operations from the minimum cost edit sequence to one of the original strings. The new dataset editing scheme relaxes the criterion to delete instances proposed by the Wilson Editing Procedure. In practice, not all instances misclassified by their near neighbors are pruned. Instead, an artificial instance is added to the dataset in the hope of successfully classifying the instance in the future. The new artificial instance is the median of the misclassified sample and its same-class nearest neighbor. The experiments over two widely used datasets of handwritten characters show that this preprocessing scheme can reduce the classification error in about 78% of trials.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
This paper presents a new fast algorithm to compute an approximation to the median between two strings of characters representing a 2D shape, and its application to a new classification scheme to decrease its error rate. The median string results from the application of certain edit operations from the minimum cost edit sequence to one of the original strings. The new dataset editing scheme relaxes the criterion to delete instances proposed by the Wilson Editing Procedure. In practice, not all instances misclassified by their near neighbors are pruned. Instead, an artificial instance is added to the dataset in the hope of successfully classifying the instance in the future. The new artificial instance is the median of the misclassified sample and its same-class nearest neighbor. The experiments over two widely used datasets of handwritten characters show that this preprocessing scheme can reduce the classification error in about 78% of trials. Gómez-Ballester, E.; Micó, L.; Thollard, F.; Oncina, J.; Moreno-Seco, F.
Combining Elimination Rules in Tree-Based Nearest Neighbor Search Algorithms Proceedings Article
In: Hancok, E. R.; Wilson, R. C.; Ilkay, T. W.; Escolano, F. (Ed.): Structural, Syntactic, and Statistical Pattern Recognition, pp. 80–89, Springer, Cesme, Turkey, 2010, ISBN: 978-3-642-14979-5.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k249,
title = {Combining Elimination Rules in Tree-Based Nearest Neighbor Search Algorithms},
author = {E. Gómez-Ballester and L. Micó and F. Thollard and J. Oncina and F. Moreno-Seco},
editor = {E. R. Hancok and R. C. Wilson and T. W. Ilkay and F. Escolano},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/249/tr-ssspr2010.pdf},
isbn = {978-3-642-14979-5},
year = {2010},
date = {2010-08-01},
booktitle = {Structural, Syntactic, and Statistical Pattern Recognition},
pages = {80--89},
publisher = {Springer},
address = {Cesme, Turkey},
abstract = {A common activity in many pattern recognition tasks, image processing or clustering techniques involves searching a labeled data set looking for the nearest point to a given unlabelled sample. To reduce the computational overhead when the naive exhaustive search is applied, some fast nearest neighbor search (NNS) algorithms have appeared in the last years. Depending on the structure used to store the training set (usually a tree), different strategies to speed up the search have been defined. In this paper, a new algorithm based on the combination of different pruning rules is proposed. An experimental evaluation and comparison of its behavior with respect to other techniques has been performed, using both real and artificial data.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
A common activity in many pattern recognition tasks, image processing or clustering techniques involves searching a labeled data set looking for the nearest point to a given unlabelled sample. To reduce the computational overhead when the naive exhaustive search is applied, some fast nearest neighbor search (NNS) algorithms have appeared in the last years. Depending on the structure used to store the training set (usually a tree), different strategies to speed up the search have been defined. In this paper, a new algorithm based on the combination of different pruning rules is proposed. An experimental evaluation and comparison of its behavior with respect to other techniques has been performed, using both real and artificial data. Micó, L.; Oncina, J.
A Constant Average Time Algorithm to Allow Insertions in the LAESA Fast Nearest Neighbour Search Index Proceedings Article
In: Proc. of the 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, pp. 23–26, 2010.
Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k257,
title = {A Constant Average Time Algorithm to Allow Insertions in the LAESA Fast Nearest Neighbour Search Index},
author = {L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/257/icpr-2010.pdf},
year = {2010},
date = {2010-08-01},
booktitle = {Proc. of the 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey},
pages = {23--26},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Verdú-Mas, J. L.
Gramáticas probabilisticas para la desambiguación sintáctica PhD Thesis
2010.
@phdthesis{k262,
title = {Gramáticas probabilisticas para la desambiguación sintáctica},
author = {J. L. Verdú-Mas},
editor = {Jorge Calera and Rafael Carrasco},
year = {2010},
date = {2010-01-01},
organization = {Univ. Alicante},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {phdthesis}
}
2009
Calera-Rubio, J.; Bernabeu, J. F.
A probabilistic approach to melodic similarity Proceedings Article
In: Proceedings of MML 2009, pp. 48-53, 2009.
Abstract | Links | BibTeX | Tags: ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA
@inproceedings{k231,
title = {A probabilistic approach to melodic similarity},
author = {J. Calera-Rubio and J. F. Bernabeu},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/231/mml2009Bernabeu.pdf},
year = {2009},
date = {2009-01-01},
urldate = {2009-01-01},
booktitle = {Proceedings of MML 2009},
pages = {48-53},
abstract = {Melodic similarity is an important research topic in music information retrieval.
The representation of symbolic music by means of trees has proven suitable
for melodic similarity computation, because trees are able to encode rhythm in their
structure, leaving only pitch representations as a degree of freedom for coding.
In order to compare trees, different edit distances have been previously used.
In this paper, stochastic k-testable tree-models, formerly used in other domains
like structured document compression or natural language processing, have been
used for computing a similarity measure between melody trees as a probability
and their performance has been compared to a classical tree edit distance.},
keywords = {ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
2014
Higuera, C. De La; Oncina, J.
The most probable string: an algorithmic study Journal Article
In: Journal of Logic and Computation, vol. 24, no. 2, pp. 311-330, 2014, ISSN: 0955-792X.
Abstract | Links | BibTeX | Tags: Prometeo 2012, TIASA
@article{k304,
title = {The most probable string: an algorithmic study},
author = {C. De La Higuera and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/304/J+Logic+Computation-2013.pdf},
issn = {0955-792X},
year = {2014},
date = {2014-01-01},
urldate = {2014-01-01},
journal = {Journal of Logic and Computation},
volume = {24},
number = {2},
pages = {311-330},
abstract = {The problem of finding the consensus (most probable string) for a distribution generated by a weighted finite automaton or a probabilistic grammar is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is NP-hard if the automaton is probabilistic. We give a pseudo-polynomial algorithm that solves a decision problem directly associated with the consensus string and answers if there is a (reasonably short) string whose probability is larger than a given bound, in time polynomial in the size of this bound, both for probabilistic finite automata and probabilistic context-free grammars. We also study a randomized algorithm solving the same problem. Finally, we report links between the length of the consensus string and the probability of this string.},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {article}
}
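The search strategy sketched in the abstract above can be illustrated in miniature. The following is a hypothetical best-first search over a toy probabilistic finite automaton (not the authors' published algorithm): since the probability of a prefix never increases when the prefix is extended, prefix probability is both a valid search priority and a pruning bound.

```python
import heapq

def most_probable_string(trans, final, start=0, bound=0.0):
    """Best-first search for the consensus (most probable) string of a PFA.

    trans: dict state -> {symbol: (next_state, prob)}
    final: dict state -> stopping probability
    Returns (string, probability) for the most probable string whose
    probability exceeds `bound`, or (None, bound) if there is none.
    """
    best_string, best_prob = None, bound
    heap = [(-1.0, "", start)]          # (-prefix_prob, prefix, state)
    while heap:
        neg, prefix, q = heapq.heappop(heap)
        prefix_prob = -neg
        if prefix_prob <= best_prob:    # no remaining prefix can do better
            break
        string_prob = prefix_prob * final.get(q, 0.0)
        if string_prob > best_prob:
            best_string, best_prob = prefix, string_prob
        for sym, (nxt, p) in trans.get(q, {}).items():
            if prefix_prob * p > best_prob:
                heapq.heappush(heap, (-(prefix_prob * p), prefix + sym, nxt))
    return best_string, best_prob
```

Termination mirrors the pseudo-polynomial bound discussed in the paper: the number of prefixes whose probability exceeds the bound is finite, so the heap is exhausted in time related to the inverse of that bound.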
Abreu, J.; Rico-Juan, J. R.
A New Iterative Algorithm for Computing a Quality Approximated Median of Strings based on Edit Operations Journal Article
In: Pattern Recognition Letters, vol. 36, pp. 74–80, 2014.
Abstract | Links | BibTeX | Tags: TIASA
@article{k308,
title = {A New Iterative Algorithm for Computing a Quality Approximated Median of Strings based on Edit Operations},
author = {J. Abreu and J. R. Rico-Juan},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/308/PRL+-+A+New+Iterative+Algorithm+for+Computing+a+Quality+Median.pdf},
year = {2014},
date = {2014-01-01},
journal = {Pattern Recognition Letters},
volume = {36},
pages = {74--80},
abstract = {This paper presents a new algorithm that can be used to compute an approximation to the median of a set of strings. The approximate median is obtained through the successive improvements of a partial solution. The edit distance from the partial solution to all the strings in the set is computed in each iteration, thus accounting for the frequency of each of the edit operations in all the positions of the approximate median. A goodness index for edit operations is later computed by multiplying their frequency by the cost. Each operation is tested, starting from that with the highest index, in order to verify whether applying it to the partial solution leads to an improvement. If successful, a new iteration begins from the new approximate median. The algorithm finishes when all the operations have been examined without a better solution being found. Comparative experiments involving Freeman chain codes encoding 2D shapes and the Copenhagen chromosome database show that the quality of the approximate median string is similar to benchmark approaches but achieves a much faster convergence.},
keywords = {TIASA},
pubstate = {published},
tppubtype = {article}
}
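The iterative-improvement idea in the abstract above can be sketched as follows. This is a simplified hill-climbing variant (it tests every single edit operation rather than ordering them by the paper's frequency-times-cost goodness index), shown only to illustrate the scheme:

```python
def levenshtein(a, b):
    # Standard dynamic-programming edit distance with unit costs.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def total_distance(candidate, strings):
    return sum(levenshtein(candidate, s) for s in strings)

def approximate_median(strings):
    """Iteratively improve a partial solution by single edit operations."""
    alphabet = sorted({c for s in strings for c in s})
    median = strings[0]
    best = total_distance(median, strings)
    improved = True
    while improved:
        improved = False
        candidates = []
        for i in range(len(median) + 1):
            if i < len(median):                                 # deletion
                candidates.append(median[:i] + median[i + 1:])
            for c in alphabet:
                candidates.append(median[:i] + c + median[i:])  # insertion
                if i < len(median):                             # substitution
                    candidates.append(median[:i] + c + median[i + 1:])
        for cand in candidates:
            d = total_distance(cand, strings)
            if d < best:
                median, best, improved = cand, d, True
                break       # restart from the improved partial solution
    return median, best
```

The algorithm stops, as in the paper, when no tested edit operation yields a better solution; the goodness-index ordering only changes which improving operation is tried first.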
2013
Pérez-Sancho, C.; Bernabeu, J. F.
A Multimodal Genre Recognition Prototype Proceedings Article
In: Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes, pp. 13-16, Madrid, Spain, 2013, ISBN: 978-84-695-8332-6.
Abstract | Links | BibTeX | Tags: DRIMS, TIASA
@inproceedings{k305,
title = {A Multimodal Genre Recognition Prototype},
author = {C. Pérez-Sancho and J. F. Bernabeu},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/305/wsrfai2013_submission_4.pdf},
isbn = {978-84-695-8332-6},
year = {2013},
date = {2013-09-01},
urldate = {2013-09-01},
booktitle = {Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes},
pages = {13-16},
address = {Madrid, Spain},
abstract = {In this paper, a multimodal and interactive prototype to perform music genre classification is presented. The system is oriented to multi-part files in symbolic format, but it can be adapted using a transcription system to transform audio content into music scores. This prototype uses different sources of information to give a possible answer to the user. It has been developed to allow a human expert to interact with the system to improve its results. In its current implementation, it offers a limited range of interaction and multimodality. Further development aimed at full interactivity and multimodal interactions is discussed.},
keywords = {DRIMS, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Calvo-Zaragoza, J.; Oncina, J.
Human-Computer Interaction for Optical Music Recognition tasks Proceedings Article
In: Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes, pp. 9-12, Madrid, Spain, 2013, ISBN: 978-84-695-8332-6.
Links | BibTeX | Tags: Prometeo 2012, TIASA
@inproceedings{k306,
title = {Human-Computer Interaction for Optical Music Recognition tasks},
author = {J. Calvo-Zaragoza and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/306/wsrfai2013_submission_3.pdf},
isbn = {978-84-695-8332-6},
year = {2013},
date = {2013-09-01},
urldate = {2013-09-01},
booktitle = {Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes},
pages = {9-12},
address = {Madrid, Spain},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Serrano, A.; Micó, L.; Oncina, J.
Which fast nearest neighbour search algorithm to use? Journal Article
In: Lecture Notes in Computer Science, vol. 7887, pp. 567-574, 2013.
Links | BibTeX | Tags: Prometeo 2012, TIASA
@article{k301,
title = {Which fast nearest neighbour search algorithm to use?},
author = {A. Serrano and L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/301/IbPria13.pdf},
year = {2013},
date = {2013-06-01},
journal = {Lecture Notes in Computer Science},
volume = {7887},
pages = {567-574},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {article}
}
Sanches, J. M.; Micó, L.; Cardoso, J. S.
Pattern Recognition and Image Analysis 6th Iberian Conference, IbPRIA 2013 Book
Springer, 2013.
BibTeX | Tags: Prometeo 2012, TIASA
@book{k302,
title = {Pattern Recognition and Image Analysis 6th Iberian Conference, IbPRIA 2013},
author = {J. M. Sanches and L. Micó and J. S. Cardoso},
editor = {J. M. Sanches and L. Micó and J. S. Cardoso},
year = {2013},
date = {2013-01-01},
publisher = {Springer},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {book}
}
Calvo-Zaragoza, J.; Oncina, J.; Iñesta, J. M.
Recognition of Online Handwritten Music Symbols Proceedings Article
In: Proceedings of the 6th International Workshop on Machine Learning and Music, Prague, Czech Republic, 2013.
Abstract | Links | BibTeX | Tags: Prometeo 2012, TIASA
@inproceedings{k307,
title = {Recognition of Online Handwritten Music Symbols},
author = {J. Calvo-Zaragoza and J. Oncina and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/307/calvozaragoza-mml13.pdf},
year = {2013},
date = {2013-01-01},
urldate = {2013-01-01},
booktitle = {Proceedings of the 6th International Workshop on Machine Learning and Music},
address = {Prague, Czech Republic},
abstract = {An effective way of digitizing a new musical composition is to use an e-pen and tablet application in which the user's pen strokes are recognized online and the digital score is created with the sole effort of the composition itself. This work aims to be a starting point for research on the recognition of online handwritten music notation. To this end, different alternatives within the two modalities of recognition resulting from this data are presented: online recognition, which uses the strokes marked by a pen, and offline recognition, which uses the image generated after drawing the symbol. A comparative experiment with common machine learning algorithms over a dataset of 3800 samples and 32 different music symbols is presented. Results show that samples of the actual user are needed if good classification rates are pursued. Moreover, algorithms using the online data, on average, achieve better classification results than the others.},
keywords = {Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
2012
Rico-Juan, J. R.; Iñesta, J. M.
New rank methods for reducing the size of the training set using the nearest neighbor rule Journal Article
In: Pattern Recognition Letters, vol. 33, no. 5, pp. 654–660, 2012, ISSN: 0167-8655.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@article{k283,
title = {New rank methods for reducing the size of the training set using the nearest neighbor rule},
author = {J. R. Rico-Juan and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/283/rankTrainingSet.pdf},
issn = {0167-8655},
year = {2012},
date = {2012-04-01},
journal = {Pattern Recognition Letters},
volume = {33},
number = {5},
pages = {654--660},
doi = {10.1016/j.patrec.2011.07.019},
abstract = {Some new rank methods to select the best prototypes from a training set are proposed in this paper in order to establish its size according to an external parameter, while maintaining the classification accuracy. Traditional methods that filter the training set in a classification task, like editing or condensing, apply rules to the set in order to remove outliers or keep prototypes that help in the classification. In our approach, new voting methods are proposed to compute the prototype probability and help to correctly classify a new sample. This probability is the key to sorting the training set, so a relevance factor from 0 to 1 is used to select the best candidates for each class whose accumulated probabilities are less than that parameter. This approach makes it possible to select the number of prototypes necessary to maintain or even increase the classification accuracy. The results obtained in different high-dimensional databases show that these methods maintain the final error rate while reducing the size of the training set.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
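One concrete voting scheme in the spirit of the abstract above can be sketched as follows. This is a hypothetical illustration, not the paper's exact method: each prototype earns a vote whenever it is the nearest same-class neighbour of another sample, votes are normalised into per-class probabilities, and prototypes are kept per class until the accumulated probability reaches the relevance factor r.

```python
import math

def select_prototypes(points, labels, r=0.9):
    """Rank prototypes by nearest-same-class-neighbour votes, then keep,
    per class, the top-ranked ones while accumulated probability < r."""
    n = len(points)
    votes = [0] * n
    for i in range(n):
        best, best_d = None, math.inf
        for j in range(n):
            if j != i and labels[j] == labels[i]:
                d = math.dist(points[i], points[j])
                if d < best_d:
                    best, best_d = j, d
        if best is not None:
            votes[best] += 1          # i's nearest same-class neighbour
    selected = []
    for cls in set(labels):
        idx = [i for i in range(n) if labels[i] == cls]
        total = sum(votes[i] for i in idx) or 1
        idx.sort(key=lambda i: votes[i], reverse=True)
        acc = 0.0
        for i in idx:
            if acc >= r:              # relevance factor reached for this class
                break
            selected.append(i)
            acc += votes[i] / total
    return sorted(selected)
```

Lowering r shrinks the retained set further, trading training-set size against classification accuracy, which is the external parameter the abstract refers to.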
Rico-Juan, J. R.; Iñesta, J. M.
Confidence voting method ensemble applied to off-line signature verification Journal Article
In: Pattern Analysis and Applications, vol. 15, no. 2, pp. 113–120, 2012, ISSN: 1433-7541.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@article{k290,
title = {Confidence voting method ensemble applied to off-line signature verification},
author = {J. R. Rico-Juan and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/290/rico_juanConfidenceVotingMethodEmsembleOffLineSignatureVerification.pdf},
issn = {1433-7541},
year = {2012},
date = {2012-04-01},
journal = {Pattern Analysis and Applications},
volume = {15},
number = {2},
pages = {113--120},
abstract = {In this paper, a new approach to off-line signature verification is proposed, based on two-class classifiers using an ensemble of expert decisions. Different methods to extract sets of local and global features from the target sample are detailed. A normalisation by confidence voting is also used in order to decrease the final equal error rate (EER). In one approach each set of features is processed by a single expert; in the other, the decisions of the individual classifiers are combined using weighted votes. Experimental results are given using a subcorpus of the large MCYT signature database for random and skilled forgeries. The results show that the weighted combination outperforms the individual classifiers significantly. The best EERs obtained were 6.3% in the case of skilled forgeries and 2.3% in the case of random forgeries.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Serrano, A.; Micó, L.; Oncina, J.
Restructuring Versus non Restructuring Insertions in MDF Indexes Proceedings Article
In: Carmona, Pedro Latorre; Sánchez, J. Salvador; Fred, Ana (Ed.): ICPRAM 2012: 1st International Conference on Pattern Recognition Applications and Methods, pp. 474–480, INSTICC SciTePress, Vilamoura, Portugal, 2012, ISBN: 978-989-8425-98-0.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k282,
title = {Restructuring Versus non Restructuring Insertions in MDF Indexes},
author = {A. Serrano and L. Micó and J. Oncina},
editor = {Pedro Latorre Carmona and J. Salvador Sánchez and Ana Fred},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/282/ICPRAM12.pdf},
isbn = {978-989-8425-98-0},
year = {2012},
date = {2012-02-01},
booktitle = {ICPRAM 2012: 1st International Conference on Pattern Recognition Applications and Methods},
pages = {474--480},
publisher = {SciTePress},
address = {Vilamoura, Portugal},
organization = {INSTICC},
abstract = {The MDF tree is a data structure (index) used to speed up similarity searches in huge databases. To achieve this goal, indexes should exploit some property of the dissimilarity measure. MDF indexes assume that the dissimilarity measure can be viewed as a distance in a metric space. Moreover, in this framework it is assumed that the distance is computationally very expensive, so counting distance computations is a good measure of the time complexity.
To cope with a changing world, a problem arises when new points must be inserted in the index. Efficient algorithms must choose between being efficient in search, maintaining the “ideal” structure of the index, or being efficient when inserting but worsening the search time.
In this work we propose an insertion algorithm for MDF trees that focuses on optimizing insertion time. The worst-case time complexity of the algorithm depends only on the depth of the MDF tree. We compare this algorithm with a similar one that focuses on search-time performance. We also study the range of applicability of each one.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Micó, L.; Oncina, J.
A log square average case algorithm to make insertions in fast similarity search Journal Article
In: Pattern Recognition Letters, vol. 33, no. 9, pp. 1060–1065, 2012.
Links | BibTeX | Tags: MIPRCV, TIASA
@article{k287,
title = {A log square average case algorithm to make insertions in fast similarity search},
author = {L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/287/prl.pdf},
year = {2012},
date = {2012-01-01},
journal = {Pattern Recognition Letters},
volume = {33},
number = {9},
pages = {1060–1065},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Gallego-Sánchez, A. J.; Calera-Rubio, J.; López, D.
Structural Graph Extraction from Images Proceedings Article
In: Omatu, S.; Santana, Juan F. De Paz; González, S. Rodríguez; Molina, J. M.; Bernardos, A. M. (Ed.): Distributed Computing and Artificial Intelligence, pp. 717-724, Springer Berlin / Heidelberg, 2012, ISBN: 978-3-642-28764-0.
@inproceedings{k289,
title = {Structural Graph Extraction from Images},
author = {A. J. Gallego-Sánchez and J. Calera-Rubio and D. López},
editor = {S. Omatu and Juan F. De Paz Santana and S. Rodríguez González and J. M. Molina and A. M. Bernardos},
isbn = {978-3-642-28764-0},
year = {2012},
date = {2012-01-01},
urldate = {2012-01-01},
booktitle = {Distributed Computing and Artificial Intelligence},
pages = {717-724},
publisher = {Springer Berlin / Heidelberg},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Abreu, J.
Detección de regularidades en contornos 2D, cálculo aproximado de medianas y su aplicación en tareas de clasificación PhD Thesis
2012.
Abstract | BibTeX | Tags: ISIC 2010, TIASA
@phdthesis{k293,
title = {Detección de regularidades en contornos 2D, cálculo aproximado de medianas y su aplicación en tareas de clasificación},
author = {J. Abreu},
editor = {J. R. Rico},
year = {2012},
date = {2012-01-01},
urldate = {2012-01-01},
organization = {Universidad de Alicante},
abstract = {In this work, we address two main problems: identifying regularities in, and constructing average contours from, contours encoded by Freeman chain codes. Solutions for these problems are proposed which rely on the information gathered from the Levenshtein edit distance computation. We describe a new method for quantifying the regularity of contours and comparing them, when encoded by Freeman chain codes, in terms of a similarity criterion. The criterion used allows subsequences to be found from the minimal-cost edit sequence that specifies an alignment of contour segments which are similar. Two external parameters adjust the similarity criterion. The information about each similar part is encoded by strings that represent an average contour region. An explanation of how to construct a prototype based on the identified regularities is also reviewed. The reliability of the prototypes is evaluated by replacing groups of contour samples with new prototypes used as the training set in a classification task. This way, the size of the data set can be reduced without sensibly affecting its representational power for classification purposes. Experimental results show that this scheme achieves a reduction in the size of the training data set of about 80% while the classification error only increases by 0.45% in one of the three data sets studied. This thesis also presents a new fast algorithm for computing an approximation to the mean between two strings of characters representing a 2D shape, and its application to a new Wilson-based editing procedure. The approximate mean is built by including some symbols from the two original strings. Besides, a greedy variant of this algorithm is studied which reduces the time required to compute an approximate mean. The new dataset editing scheme relaxes the criterion for deleting instances proposed by the Wilson editing procedure.
In practice, not all instances misclassified by their nearest neighbors are pruned. Instead, an artificial instance is added to the dataset in the hope of successfully classifying the instance in the future. The new artificial instance is the approximate mean of the misclassified sample and its same-class nearest neighbor. Experiments carried out over three widely known databases of contours show that the proposed algorithm performs very well in computing the mean of two strings, outperforming methods proposed by other authors. In particular, the low computational time required by the heuristic approach makes it very suitable when dealing with long strings. Results also show the proposed preprocessing scheme can reduce the classification error in about 83% of trials. There is empirical evidence that using the greedy approximation to compute the approximate mean does not affect the editing procedure's performance. Finally, a new algorithm with which to compute an approximation to the mean of a set of strings is presented. The approximate mean is computed through successive improvements of a partial solution. In each iteration, the edit distances from the partial solution to all the strings in the set are computed, thus accounting for the frequency of each of the edit operations in every position of the approximate mean. A goodness index for edit operations is later computed by multiplying their frequency by the cost. Each operation is tested, starting from that with the highest index, in order to verify whether applying it to the partial solution leads to an improvement. If successful, a new iteration begins from the new approximate mean. The algorithm finishes after all the operations have been examined without a better solution being found.
Comparative experiments involving Freeman chain codes encoding 2D shapes show that the quality of the approximate median string is similar to other approaches but achieves a much faster convergence.},
keywords = {ISIC 2010, TIASA},
pubstate = {published},
tppubtype = {phdthesis}
}
López, D.; Calera-Rubio, J.; Gallego-Sánchez, A. J.
Inference of k-Testable Directed Acyclic Graph Languages Proceedings Article
In: Journal of Machine Learning Research: Workshop and Conference Proceedings, Vol. 21: ICGI 2012, pp. 149-163, 2012.
Abstract | Links | BibTeX | Tags: PASCAL2, Prometeo 2012, TIASA
@inproceedings{k296,
title = {Inference of k-Testable Directed Acyclic Graph Languages},
author = {D. López and J. Calera-Rubio and A. J. Gallego-Sánchez},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/296/lopez12a.pdf},
year = {2012},
date = {2012-01-01},
booktitle = {Journal of Machine Learning Research: Workshop and Conference Proceedings, Vol. 21: ICGI 2012},
pages = {149-163},
abstract = {In this paper, we tackle the task of graph language learning. We first extend the well-known classes of k-testability and k-testability in the strict sense languages to directed graph languages. Second, we propose a graph automata model for directed acyclic graph languages. This graph automata model is used to propose a grammatical inference algorithm to learn the class of directed acyclic k-testable in the strict sense graph languages. The algorithm runs in polynomial time and identifies this class of languages from positive data.},
keywords = {PASCAL2, Prometeo 2012, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
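For intuition about the class of languages involved, the string analogue of k-testable-in-the-strict-sense inference (not the paper's construction, which generalizes this to directed acyclic graphs) simply collects the allowed prefixes, suffixes and length-k segments from positive data:

```python
def learn_k_tss(sample, k=2):
    """Infer a k-testable-in-the-strict-sense string language from
    positive data: allowed short strings, prefixes/suffixes of length
    k-1, and segments (infixes) of length k."""
    short = {w for w in sample if len(w) < k}
    prefixes = {w[:k - 1] for w in sample if len(w) >= k}
    suffixes = {w[-(k - 1):] for w in sample if len(w) >= k}
    infixes = {w[i:i + k] for w in sample for i in range(len(w) - k + 1)}
    return short, prefixes, suffixes, infixes

def accepts(model, w, k=2):
    short, prefixes, suffixes, infixes = model
    if len(w) < k:
        return w in short
    return (w[:k - 1] in prefixes and w[-(k - 1):] in suffixes
            and all(w[i:i + k] in infixes for i in range(len(w) - k + 1)))
```

As with the graph case in the paper, inference runs in polynomial time and identifies the class in the limit from positive data: once every allowed prefix, suffix and segment has appeared in the sample, the hypothesis is exact.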
2011
Higuera, C. De La; Oncina, J.
Finding the most probable string and the consensus string: an algorithmic study Proceedings Article
In: 12th International Conference on Parsing Technologies (IWPT 2011), pp. 26-36, Dublin, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k288,
title = {Finding the most probable string and the consensus string: an algorithmic study},
author = {C. De La Higuera and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/288/iwpt2011.pdf},
year = {2011},
date = {2011-10-01},
urldate = {2011-10-01},
booktitle = {12th International Conference on Parsing Technologies (IWPT 2011)},
pages = {26-36},
address = {Dublin},
abstract = {The problem of finding the most probable string for a distribution generated by a weighted finite automaton is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is $NP$-hard if the automaton is probabilistic. In this paper we give a pseudo-polynomial algorithm which computes the most probable string in time polynomial in the inverse of the probability of this string itself. We also give a randomised algorithm solving the same problem and discuss the case where the distribution is generated by other types of machines.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Socorro, R.; Micó, L.; Oncina, J.
A fast pivot-based indexing algorithm for metric spaces Journal Article
In: Pattern Recognition Letters, vol. 32, no. 11, pp. 1511-1516, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@article{k266,
title = {A fast pivot-based indexing algorithm for metric spaces},
author = {R. Socorro and L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/266/piaesa-prl.pdf},
year = {2011},
date = {2011-08-01},
urldate = {2011-08-01},
journal = {Pattern Recognition Letters},
volume = {32},
number = {11},
pages = {1511-1516},
abstract = {This work focuses on fast nearest neighbor (NN) search algorithms that can work in any metric space (not just the Euclidean distance) and where the distance computation is very time consuming. One of the best known methods in this field is the AESA algorithm, used as a baseline for performance measurement for over twenty years. The AESA repeats two steps: first it searches for a promising NN candidate and computes its distance (approximation step), then it eliminates all the unsuitable NN candidates in view of the new information acquired in the previous calculation (elimination step).
This work introduces the PiAESA algorithm. This algorithm improves the performance of the AESA algorithm by splitting the approximation criterion: in the first iterations, when there is not enough information to find good NN candidates, it uses a list of pivots (objects in the database) to obtain a cheap approximation of the distance function. Once a good approximation is obtained, it switches to the usual AESA behavior. As the pivot list is built at preprocessing time, the running time of PiAESA is almost the same as that of AESA.
In this work, we report experiments comparing with some competing methods. Our empirical results show that this new approach obtains a significant reduction in distance computations with no execution-time penalty.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
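The pivot idea underlying this family of algorithms can be sketched in a generic, simplified form (a hypothetical LAESA-style search, not the published PiAESA): distances from a few pivots to every database object are precomputed, and the triangle inequality yields a cheap lower bound |d(q,p) − d(p,x)| that eliminates candidates without computing d(q,x).

```python
def nn_search(query, data, pivots, dist):
    """Pivot-based NN search: prune x when a triangle-inequality lower
    bound on dist(query, x) already exceeds the best distance found."""
    # Preprocessing (normally done once, offline): pivot-to-object distances.
    table = [[dist(p, x) for x in data] for p in pivots]
    qp = [dist(query, p) for p in pivots]   # query-to-pivot distances
    best_i, best_d = None, float("inf")
    for i, x in enumerate(data):
        # Tightest lower bound over all pivots.
        lb = max((abs(q - row[i]) for q, row in zip(qp, table)), default=0.0)
        if lb >= best_d:
            continue    # eliminated without computing dist(query, x)
        d = dist(query, x)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d
```

The payoff matches the setting the abstract describes: when `dist` is far more expensive than the bookkeeping, every pruned candidate is a saved distance computation.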
Oncina, J.; Vidal, E.
Interactive Structured Output Prediction: Application to Chromosome Classification Journal Article
In: Pattern Recognition and Image Analysis (LNCS), vol. 6669, pp. 256-264, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@article{k267,
title = {Interactive Structured Output Prediction: Application to Chromosome Classification},
author = {J. Oncina and E. Vidal},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/267/karyo.pdf},
year = {2011},
date = {2011-06-01},
urldate = {2011-06-01},
journal = {Pattern Recognition and Image Analysis (LNCS)},
volume = {6669},
pages = {256-264},
abstract = {Interactive Pattern Recognition concepts and techniques are applied to problems with structured output, i.e. problems in which the result is not just a simple class label, but a suitable structure of labels. For illustration purposes, (a simplification of) the problem of Human Karyotyping is considered. Results show that a) taking into account label dependencies in a karyogram significantly reduces the classical (non-interactive) chromosome label prediction error rate and b) results are further improved when interactive processing is adopted.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.; Rizo, D.
Melodic Identification Using Probabilistic Tree Automata Journal Article
In: Journal of New Music Research, vol. 40, no. 2, pp. 93-103, 2011, ISSN: 0929-8215.
Abstract | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@article{k270,
title = {Melodic Identification Using Probabilistic Tree Automata},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta and D. Rizo},
issn = {0929-8215},
year = {2011},
date = {2011-06-01},
urldate = {2011-06-01},
journal = {Journal of New Music Research},
volume = {40},
number = {2},
pages = {93-103},
abstract = {Similarity computation is a difficult issue in music information retrieval tasks, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of finding the best representation for symbolic music in this context. The tree representation, using rhythm to define the tree structure and pitch information for leaf and node labelling, has proven to be effective in melodic similarity computation. One of the main drawbacks of this approach is the high time complexity of tree comparison algorithms. In this paper, stochastic k-testable tree-models are applied to compute the similarity between two melodies as a probability. The results are compared to those achieved by tree edit distances, showing that k-testable tree-models outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For this, the method must be able to deal with inexact queries and be efficient for scalability.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Socorro, R.; Micó, L.; Oncina, J.
Efficient search supporting several similarity queries by reordering pivots Proceedings Article
In: Signal Processing, Pattern Recognition, and Applications (SPPRA 2011), pp. 114-120, ACTA Press, Innsbruck, Austria, 2011, ISBN: 978-0-88986-865-6.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k260,
title = {Efficient search supporting several similarity queries by reordering pivots},
author = {R. Socorro and L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/260/sppra.pdf},
isbn = {978-0-88986-865-6},
year = {2011},
date = {2011-02-01},
booktitle = {Signal Processing, Pattern Recognition, and Applications (SPPRA 2011)},
pages = {114-120},
publisher = {ACTA Press},
address = {Innsbruck, Austria},
abstract = {Effective similarity search indexing in general metric spaces has traditionally received special attention in several areas of interest such as pattern recognition, computer vision or information retrieval. A typical method is based on the use of a distance as a dissimilarity function (not restricted to the Euclidean distance), where the main objective is to speed up the search for the most similar object in a database by minimising the number of distance computations. Several types of search can be defined, the k-nearest neighbour and range searches being the most common. AESA is one of the best-known such algorithms due to its performance (measured in distance computations). PiAESA is an AESA variant with a different main objective: instead of trying to find the best nearest neighbour candidate at each step, it tries to find the object that contributes the most to tightening the lower bound function, that is, to a better estimation of the distance. In this paper we extend and test PiAESA to support several similarity queries. Our empirical results show that this approach obtains a significant improvement in performance when compared with competing algorithms.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Abreu, J.; Rico-Juan, J. R.
Characterization of contour regularities based on the Levenshtein edit distance Journal Article
In: Pattern Recognition Letters, vol. 32, pp. 1421-1427, 2011.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@article{k263,
title = {Characterization of contour regularities based on the Levenshtein edit distance},
author = {J. Abreu and J. R. Rico-Juan},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/263/2009_J_IbPRIA.pdf},
year = {2011},
date = {2011-01-01},
urldate = {2011-01-01},
journal = {Pattern Recognition Letters},
volume = {32},
pages = {1421-1427},
abstract = {This paper describes a new method for quantifying the regularity of contours and comparing them (when encoded by Freeman chain codes) in terms of a similarity criterion which relies on information gathered from the Levenshtein edit distance computation. The criterion used allows subsequences to be found from the minimal-cost edit sequence that specify an alignment of contour segments which are similar. Two external parameters adjust the similarity criterion. The information about each similar part is encoded by strings that represent an average contour region. An explanation of how to construct a prototype based on the identified regularities is also reviewed. The reliability of the prototypes is evaluated by replacing contour groups (samples) with new prototypes used as the training set in a classification task. This way, the size of the data set can be reduced without significantly affecting its representational power for classification purposes. Experimental results show that this scheme achieves a reduction in the size of the training data set of about 80% while the classification error only increases by 0.45% in one of the three data sets studied.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.
Classifying melodies using tree grammars Journal Article
In: Lecture Notes in Computer Science, vol. 6669, pp. 572–579, 2011, ISSN: 0302-9743.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@article{k264,
title = {Classifying melodies using tree grammars},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/264/ibpria2011-bernabeu.pdf},
issn = {0302-9743},
year = {2011},
date = {2011-01-01},
journal = {Lecture Notes in Computer Science},
volume = {6669},
pages = {572--579},
abstract = {Similarity computation is a difficult issue in music information retrieval, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of finding the best representation for symbolic music in this context. The tree representation, using rhythm to define the tree structure and pitch information for leaf and node labeling, has proven to be effective in melodic similarity computation. In this paper we propose a solution for the case in which melodies are represented by trees for training but duration information is not available for the input data. For that, we infer a probabilistic context-free grammar using the information in the trees (duration and pitch) and classify new melodies represented by strings using only the pitch. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For this, the method must be able to deal with inexact queries and be efficient for scalability.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Serrano, A.; Micó, L.; Oncina, J.
Impact of the Initialization in Tree-Based Fast Similarity Search Techniques Proceedings Article
In: Pelillo, M.; Hancock, E. R. (Ed.): SIMBAD'11 Proceedings of the First international conference on Similarity-based pattern recognition, pp. 163-176, Springer, Venecia, Italia, 2011, ISBN: 978-3-642-24470-4.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k272,
title = {Impact of the Initialization in Tree-Based Fast Similarity Search Techniques},
author = {A. Serrano and L. Micó and J. Oncina},
editor = {M. Pelillo and E. R. Hancock},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/272/simbad11.pdf},
isbn = {978-3-642-24470-4},
year = {2011},
date = {2011-01-01},
booktitle = {SIMBAD'11 Proceedings of the First international conference on Similarity-based pattern recognition},
pages = {163-176},
publisher = {Springer},
address = {Venecia, Italia},
abstract = {Many fast similarity search techniques rely on the use of pivots (specially selected points in the data set). Using these points, specific structures (indexes) are built that speed up the search when querying. Usually, pivot selection techniques are incremental, the first pivot being randomly chosen. This article explores several techniques to choose the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity. Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
2010
Calera-Rubio, J.; Bernabeu, J. F.
Tree language automata for melody recognition Proceedings Article
In: Pérez, Juan Carlos (Ed.): Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI), pp. 17-22, AERFAI IBERGARCETA PUBLICACIONES, S.L., Valencia, Spain, 2010, ISBN: 978-84-92812-66-0.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@inproceedings{k251,
title = {Tree language automata for melody recognition},
author = {J. Calera-Rubio and J. F. Bernabeu},
editor = {Juan Carlos Pérez},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/251/bernabeuCEDI2010Final.pdf},
isbn = {978-84-92812-66-0},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
booktitle = {Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI)},
pages = {17-22},
publisher = {IBERGARCETA PUBLICACIONES, S.L.},
address = {Valencia, Spain},
organization = {AERFAI},
abstract = {The representation of symbolic music by means of trees has been shown to be suitable for melodic similarity computation. In order to compare trees, different tree edit distances have been used previously, their complexity being a main drawback. In this paper, the application of stochastic k-testable tree-models for computing the similarity between two melodies as a probability, compared to the classical edit distance, has been addressed. The results show that k-testable tree-models seem to be adequate for the task, since they outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs. For this, the method must be able to deal with inexact queries and be efficient for scalability.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Rico-Juan, J. R.; Abreu, J.
A new editing scheme based on a fast two-string median computation applied to OCR Proceedings Article
In: Hancok, E. R.; Wilson, R. C.; Ilkay, T. W.; Escolano, F. (Ed.): Structural, Syntactic, and Statistical Pattern Recognition, pp. 748–756, Springer, Cesme, Izmir, Turkey, 2010, ISBN: 978-3-642-14979-5.
Abstract | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k247,
title = {A new editing scheme based on a fast two-string median computation applied to OCR},
author = {J. R. Rico-Juan and J. Abreu},
editor = {E. R. Hancok and R. C. Wilson and T. W. Ilkay and F. Escolano},
isbn = {978-3-642-14979-5},
year = {2010},
date = {2010-08-01},
urldate = {2010-08-01},
booktitle = {Structural, Syntactic, and Statistical Pattern Recognition},
pages = {748--756},
publisher = {Springer},
address = {Cesme, Izmir, Turkey},
abstract = {This paper presents a new fast algorithm to compute an approximation to the median between two strings of characters representing a 2D shape, and its application to a new classification scheme to decrease its error rate. The median string results from applying certain edit operations from the minimum-cost edit sequence to one of the original strings. The new dataset editing scheme relaxes the criterion to delete instances proposed by the Wilson Editing Procedure. In practice, not all instances misclassified by their near neighbours are pruned. Instead, an artificial instance is added to the dataset, expecting to successfully classify the instance in the future. The new artificial instance is the median of the misclassified sample and its same-class nearest neighbour. The experiments over two widely used datasets of handwritten characters show this preprocessing scheme can reduce the classification error in about 78% of trials.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Gómez-Ballester, E.; Micó, L.; Thollard, F.; Oncina, J.; Moreno-Seco, F.
Combining Elimination Rules in Tree-Based Nearest Neighbor Search Algorithms Proceedings Article
In: Hancok, E. R.; Wilson, R. C.; Ilkay, T. W.; Escolano, F. (Ed.): Structural, Syntactic, and Statistical Pattern Recognition, pp. 80–89, Springer, Cesme, Turkey, 2010, ISBN: 978-3-642-14979-5.
Abstract | Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k249,
title = {Combining Elimination Rules in Tree-Based Nearest Neighbor Search Algorithms},
author = {E. Gómez-Ballester and L. Micó and F. Thollard and J. Oncina and F. Moreno-Seco},
editor = {E. R. Hancok and R. C. Wilson and T. W. Ilkay and F. Escolano},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/249/tr-ssspr2010.pdf},
isbn = {978-3-642-14979-5},
year = {2010},
date = {2010-08-01},
booktitle = {Structural, Syntactic, and Statistical Pattern Recognition},
pages = {80--89},
publisher = {Springer},
address = {Cesme, Turkey},
abstract = {A common activity in many pattern recognition tasks, image processing or clustering techniques involves searching a labeled data set for the point nearest to a given unlabelled sample. To reduce the computational overhead of the naive exhaustive search, several fast nearest neighbour search (NNS) algorithms have appeared in recent years. Depending on the structure used to store the training set (usually a tree), different strategies to speed up the search have been defined. In this paper, a new algorithm based on the combination of different pruning rules is proposed. An experimental evaluation and comparison of its behaviour with respect to other techniques has been performed, using both real and artificial data.},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Micó, L.; Oncina, J.
A Constant Average Time Algorithm to Allow Insertions in the LAESA Fast Nearest Neighbour Search Index Proceedings Article
In: Proc. of the 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, pp. 23–26, 2010.
Links | BibTeX | Tags: MIPRCV, TIASA
@inproceedings{k257,
title = {A Constant Average Time Algorithm to Allow Insertions in the LAESA Fast Nearest Neighbour Search Index},
author = {L. Micó and J. Oncina},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/257/icpr-2010.pdf},
year = {2010},
date = {2010-08-01},
booktitle = {Proc. of the 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey},
pages = {23--26},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Verdú-Mas, J. L.
Gramáticas probabilísticas para la desambiguación sintáctica PhD Thesis
2010.
@phdthesis{k262,
title = {Gramáticas probabilísticas para la desambiguación sintáctica},
author = {J. L. Verdú-Mas},
editor = {Jorge Calera and Rafael Carrasco},
year = {2010},
date = {2010-01-01},
organization = {Univ. Alicante},
keywords = {MIPRCV, TIASA},
pubstate = {published},
tppubtype = {phdthesis}
}
2009
Calera-Rubio, J.; Bernabeu, J. F.
A probabilistic approach to melodic similarity Proceedings Article
In: Proceedings of MML 2009, pp. 48-53, 2009.
Abstract | Links | BibTeX | Tags: ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA
@inproceedings{k231,
title = {A probabilistic approach to melodic similarity},
author = {J. Calera-Rubio and J. F. Bernabeu},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/231/mml2009Bernabeu.pdf},
year = {2009},
date = {2009-01-01},
urldate = {2009-01-01},
booktitle = {Proceedings of MML 2009},
pages = {48-53},
abstract = {Melodic similarity is an important research topic in music information retrieval. The representation of symbolic music by means of trees has proven to be suitable in melodic similarity computation, because they are able to code rhythm in their structure, leaving only pitch representations as a degree of freedom for coding. In order to compare trees, different edit distances have been previously used. In this paper, stochastic k-testable tree-models, formerly used in other domains like structured document compression or natural language processing, have been used for computing a similarity measure between melody trees as a probability, and their performance has been compared to a classical tree edit distance.},
keywords = {ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}