2013
Pérez-Sancho, C.; Bernabeu, J. F.
A Multimodal Genre Recognition Prototype Proceedings Article
In: Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes, pp. 13-16, Madrid, Spain, 2013, ISBN: 978-84-695-8332-6.
@inproceedings{k305,
title = {A Multimodal Genre Recognition Prototype},
author = {C. Pérez-Sancho and J. F. Bernabeu},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/305/wsrfai2013_submission_4.pdf},
isbn = {978-84-695-8332-6},
year = {2013},
date = {2013-09-01},
urldate = {2013-09-01},
booktitle = {Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes},
pages = {13-16},
address = {Madrid, Spain},
abstract = {In this paper, a multimodal and interactive prototype to perform music genre classification is presented. The system is oriented to multi-part files in symbolic format, but it can be adapted using a transcription system to transform audio content into music scores. This prototype uses different sources of information to give a possible answer to the user. It has been developed to allow a human expert to interact with the system to improve its results. In its current implementation, it offers a limited range of interaction and multimodality. Further development aimed at full interactivity and multimodal interactions is discussed.},
keywords = {DRIMS, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Hontanilla, M.; Pérez-Sancho, C.; Iñesta, J. M.
Modeling Musical Style with Language Models for Composer Recognition Journal Article
In: Lecture Notes in Computer Science, vol. 7887, pp. 740-748, 2013, ISSN: 0302-9743.
@article{k300,
title = {Modeling Musical Style with Language Models for Composer Recognition},
author = {M. Hontanilla and C. Pérez-Sancho and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/300/10.1007_978-3-642-38628-2_88.pdf},
issn = {0302-9743},
year = {2013},
date = {2013-06-01},
journal = {Lecture Notes in Computer Science},
volume = {7887},
pages = {740-748},
abstract = {In this paper we present an application of language modeling using n-grams to model the style of different composers. For this, we repeated the experiments performed in previous works by other authors using a corpus of 5 composers from the Baroque and Classical periods. In these experiments we found some signs that the results could be influenced by external factors other than the composers’ styles, such as the heterogeneity in the musical forms selected for the corpus. In order to assess the validity of the modeling techniques in capturing the composers’ own personal style, a new experiment was performed with a corpus of fugues from Bach and Shostakovich. All these experiments show that language modeling is a suitable tool for modeling musical style, even when the styles of the different datasets are affected by several factors.},
keywords = {DRIMS, Prometeo 2012},
pubstate = {published},
tppubtype = {article}
}
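As a rough illustration of the n-gram approach this entry describes, the sketch below attributes a symbolic sequence (e.g. encoded pitch intervals) to the composer whose model gives it the highest smoothed likelihood. It is a minimal, assumed reconstruction using additive-smoothing bigrams, not the paper's actual system or encoding.

```python
import math
from collections import defaultdict

def train_bigram_model(sequences):
    """Count bigram and unigram frequencies over symbolic sequences
    (e.g. pitch intervals extracted from one composer's scores)."""
    bigrams, unigrams = defaultdict(int), defaultdict(int)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1
    return bigrams, unigrams

def log_likelihood(model, seq, vocab_size, alpha=1.0):
    """Additively smoothed bigram log-likelihood of a sequence under a model."""
    bigrams, unigrams = model
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        p = (bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab_size)
        total += math.log(p)
    return total

def attribute(models, seq, vocab_size):
    """Attribute the sequence to the composer whose model scores it highest."""
    return max(models, key=lambda c: log_likelihood(models[c], seq, vocab_size))
```

With one model per composer, classification reduces to a likelihood comparison; the paper's corpora of Baroque/Classical works and Bach/Shostakovich fugues would take the place of the toy sequences here.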
Iñesta, J. M.; Pérez-Sancho, C.
Interactive multimodal music transcription Proceedings Article
In: Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2013), pp. 211-215, IEEE, Vancouver, Canada, 2013, ISBN: 978-1-4799-0356-6.
@inproceedings{k299,
title = {Interactive multimodal music transcription},
author = {J. M. Iñesta and C. Pérez-Sancho},
isbn = {978-1-4799-0356-6},
year = {2013},
date = {2013-05-01},
booktitle = {Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2013)},
pages = {211-215},
publisher = {IEEE},
address = {Vancouver, Canada},
abstract = {Automatic music transcription has usually been performed as an autonomous task and its evaluation has been made in terms of precision, recall, accuracy, etc. Nevertheless, in this work, assuming that the state of the art is far from being perfect, it is treated as an interactive task in which an expert user is assisted by a transcription tool. In this context, the performance evaluation of the system turns into an assessment of how many user interactions are needed to complete the work. The strategy is that the user interactions can be used by the system to improve its performance in an adaptive way, thus minimizing the workload. Also, a multimodal approach has been implemented, in such a way that different sources of information, like onsets, beats, and meter, are used to detect notes in a musical audio excerpt. The system is focused on monotimbral polyphonic transcription.},
keywords = {DRIMS, Prometeo 2012},
pubstate = {published},
tppubtype = {inproceedings}
}
2012
Bresson, J.; Pérez-Sancho, C.
New Framework for Score Segmentation and Analysis in OpenMusic Proceedings Article
In: Serafin, S. (Ed.): Proceedings of the 9th Sound and Music Computing Conference, pp. 506-513, Sound & Music Computing Logos Verlag, Copenhagen, Denmark, 2012, ISBN: 978-3-8325-3180-5.
@inproceedings{k295,
title = {New Framework for Score Segmentation and Analysis in OpenMusic},
author = {J. Bresson and C. Pérez-Sancho},
editor = {S. Serafin},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/295/smc2012-149.pdf},
isbn = {978-3-8325-3180-5},
year = {2012},
date = {2012-07-01},
booktitle = {Proceedings of the 9th Sound and Music Computing Conference},
pages = {506-513},
publisher = {Logos Verlag},
address = {Copenhagen, Denmark},
organization = {Sound & Music Computing},
abstract = {We present new tools for the segmentation and analysis of musical scores in the OpenMusic computer-aided composition environment. A modular object-oriented framework enables the creation of segmentations on score objects and the implementation of automatic or semi-automatic analysis processes. The analyses can be performed and displayed thanks to customizable classes and callbacks. Concrete examples are given, in particular with the implementation of a semi-automatic harmonic analysis system and a framework for rhythmic transcription.},
keywords = {DRIMS, PASCAL2},
pubstate = {published},
tppubtype = {inproceedings}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.; Rizo, D.
Query Parsing Using Probabilistic Tree Grammars Technical Report
Edinburgh, 2012.
@techreport{k292,
title = {Query Parsing Using Probabilistic Tree Grammars},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta and D. Rizo},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/292/mml2012.pdf},
year = {2012},
date = {2012-06-01},
booktitle = {5th workshop on Music and Machine Learning, MML 2012},
address = {Edinburgh},
organization = {5th workshop on Music and Machine Learning, MML 2012},
abstract = {The tree representation, using rhythm for defining the tree structure and pitch information for node labeling, has proven to be effective in melodic similarity computation. In this paper we propose a solution representing melodies by tree grammars. For that, we infer a probabilistic context-free grammar for the melodies in a database, using their tree coding (with duration and pitch), and classify queries represented as strings of pitches. We aim to assess the grammars' ability to identify a noisy snippet query among a set of songs stored in symbolic format.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {techreport}
}
Rico-Juan, J. R.; Iñesta, J. M.
New rank methods for reducing the size of the training set using the nearest neighbor rule Journal Article
In: Pattern Recognition Letters, vol. 33, no. 5, pp. 654–660, 2012, ISSN: 0167-8655.
@article{k283,
title = {New rank methods for reducing the size of the training set using the nearest neighbor rule},
author = {J. R. Rico-Juan and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/283/rankTrainingSet.pdf},
issn = {0167-8655},
year = {2012},
date = {2012-04-01},
journal = {Pattern Recognition Letters},
volume = {33},
number = {5},
pages = {654--660},
doi = {10.1016/j.patrec.2011.07.019},
abstract = {Some new rank methods to select the best prototypes from a training set are proposed in this paper in order to establish its size according to an external parameter, while maintaining the classification accuracy. The traditional methods that filter the training set in a classification task like editing or condensing have some rules that apply to the set in order to remove outliers or keep some prototypes that help in the classification. In our approach, new voting methods are proposed to compute the prototype probability and help to classify correctly a new sample. This probability is the key to sorting the training set out, so a relevance factor from 0 to 1 is used to select the best candidates for each class whose accumulated probabilities are less than that parameter. This approach makes it possible to select the number of prototypes necessary to maintain or even increase the classification accuracy. The results obtained in different high dimensional databases show that these methods maintain the final error rate while reducing the size of the training set.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
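The selection scheme this abstract describes (rank prototypes per class by a vote-derived probability, then keep candidates until the accumulated probability reaches a relevance factor) can be sketched as follows. The voting rule shown is one assumed possibility; the paper proposes several, and this is an illustration rather than the authors' method.

```python
import math
from collections import defaultdict

def nearest_same_class_votes(X, y):
    """One possible voting rule (assumed here; the paper proposes several):
    each training sample votes for its nearest same-class neighbour."""
    votes = [0] * len(X)
    for i, (xi, yi) in enumerate(zip(X, y)):
        best, best_d = None, math.inf
        for j, (xj, yj) in enumerate(zip(X, y)):
            if i == j or yj != yi:
                continue
            d = math.dist(xi, xj)
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            votes[best] += 1
    return votes

def select_prototypes(y, votes, r):
    """Per class, sort prototypes by vote probability and keep candidates
    while the accumulated probability stays below the relevance factor r."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    kept = []
    for label, idxs in by_class.items():
        total = sum(votes[i] for i in idxs) or 1.0
        acc = 0.0
        for i in sorted(idxs, key=lambda k: votes[k], reverse=True):
            if acc >= r:
                break
            kept.append(i)
            acc += votes[i] / total
    return kept
```

Lowering r shrinks the retained training set; the reduced set would then be used with the nearest neighbor rule as usual.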
Vicente, O.; Iñesta, J. M.
Bass track selection in MIDI files and multimodal implications to melody Proceedings Article
In: Latorre Carmona, Pedro; Sánchez, J. Salvador; Fred, Ana (Eds.): Proceedings of the Int. Conf. on Pattern Recognition Applications and Methods (ICPRAM 2012), pp. 449–458, INSTICC SciTePress, Vilamoura, Portugal, 2012, ISBN: 978-989-8425-98-0.
@inproceedings{k285,
title = {Bass track selection in MIDI files and multimodal implications to melody},
author = {O. Vicente and J. M. Iñesta},
editor = {Pedro Latorre Carmona and J. Salvador Sánchez and Ana Fred},
isbn = {978-989-8425-98-0},
year = {2012},
date = {2012-02-01},
urldate = {2012-02-01},
booktitle = {Proceedings of the Int. Conf. on Pattern Recognition Applications and Methods (ICPRAM 2012)},
pages = {449--458},
publisher = {SciTePress},
address = {Vilamoura, Portugal},
organization = {INSTICC},
abstract = {Standard MIDI files consist of a number of tracks containing information that can be considered as a symbolic representation of music. Usually each track represents an instrument or voice in a music piece. The goal of this work is to identify the track that contains the bass line. This information is very relevant for a number of tasks like rhythm analysis or harmonic segmentation, among others. The task is not easy, since a bass line can be performed by very different kinds of instruments. We have approached this problem by using statistical features from the symbolic representation of music and a random forest classifier. The first experiment was to classify a track as bass or non-bass. Then we tried to select the correct bass track in a multi-track MIDI file. Finally, we studied how different sources of information can help in this latter task. In particular, we analyzed the interactions between bass and melody information. The results were very accurate, and melody track identification was significantly improved when using this kind of multimodal help.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
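As a hedged sketch of the feature-extraction step this abstract relies on: per-track statistical descriptors are computed from the symbolic note content and then fed to a classifier (a random forest, in the paper). The note-tuple layout and feature names below are illustrative assumptions, not the paper's exact feature set.

```python
def track_features(notes):
    """Statistical descriptors for one MIDI track; `notes` is assumed to be
    a list of (onset_tick, duration_tick, pitch) tuples."""
    pitches = [p for _, _, p in notes]
    return {
        "mean_pitch": sum(pitches) / len(pitches),   # bass tracks sit low
        "lowest_pitch": min(pitches),
        "pitch_range": max(pitches) - min(pitches),  # bass lines span less
        "note_count": len(notes),
    }
```

Vectors like these, one per track, would be labeled bass/non-bass for training; a low mean pitch and narrow range are the kind of regularities a random forest can exploit regardless of the performing instrument.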
Pertusa, A.; Iñesta, J. M.
Efficient methods for joint estimation of multiple fundamental frequencies in music signals Journal Article
In: EURASIP Journal on Advances in Signal Processing, vol. 2012, no. 1, pp. 27, 2012, ISSN: 1687-6180.
@article{k286,
title = {Efficient methods for joint estimation of multiple fundamental frequencies in music signals},
author = {A. Pertusa and J. M. Iñesta},
issn = {1687-6180},
year = {2012},
date = {2012-01-01},
journal = {EURASIP Journal on Advances in Signal Processing},
volume = {2012},
number = {1},
pages = {27},
abstract = {This study presents efficient techniques for multiple fundamental frequency estimation in music signals. The proposed methodology can infer harmonic patterns from a mixture considering interactions with other sources and evaluate them in a joint estimation scheme. For this purpose, a set of fundamental frequency candidates are first selected at each frame, and several hypothetical combinations of them are generated. Combinations are independently evaluated, and the most likely is selected taking into account the intensity and spectral smoothness of its inferred patterns. The method is extended considering adjacent frames in order to smooth the detection in time, and a pitch tracking stage is finally performed to increase the temporal coherence. The proposed algorithms were evaluated in MIREX contests yielding state of the art results with a very low computational burden.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
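The joint estimation scheme described here (enumerate hypothetical combinations of F0 candidates, score each by the intensity and spectral smoothness of its inferred harmonic patterns, keep the best) can be caricatured in a few lines. This naive version ignores interactions between sources sharing partials, which the paper does handle, so it is only a structural sketch under assumed parameters.

```python
from itertools import combinations

def harmonic_pattern(f0, spectrum, n_harmonics=5, tol=3.0):
    """Amplitude found near each harmonic of f0 in a (freq, amp) peak list."""
    pattern = []
    for h in range(1, n_harmonics + 1):
        target = h * f0
        amp = max((a for f, a in spectrum if abs(f - target) <= tol), default=0.0)
        pattern.append(amp)
    return pattern

def smoothness(pattern):
    """Negative total variation of the envelope: smoother patterns score higher."""
    return -sum(abs(a - b) for a, b in zip(pattern, pattern[1:]))

def best_combination(candidates, spectrum, max_polyphony=3, w=0.1):
    """Evaluate candidate combinations jointly and return the one that
    maximizes summed intensity plus weighted spectral smoothness."""
    best, best_score = (), float('-inf')
    for k in range(1, max_polyphony + 1):
        for combo in combinations(candidates, k):
            score = 0.0
            for f0 in combo:
                p = harmonic_pattern(f0, spectrum)
                score += sum(p) + w * smoothness(p)
            if score > best_score:
                best, best_score = combo, score
    return best
```

Per-frame selections like this would then be smoothed across adjacent frames and tracked over time, as the abstract outlines.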
Orio, N.; Rauber, A.; Rizo, D.
Introduction to the focused issue on music digital libraries Journal Article
In: International Journal on Digital Libraries, vol. 12, no. 2-3, pp. 51-52, 2012, ISSN: 1432-5012.
@article{k294,
title = {Introduction to the focused issue on music digital libraries},
author = {N. Orio and A. Rauber and D. Rizo},
issn = {1432-5012},
year = {2012},
date = {2012-01-01},
urldate = {2012-01-01},
journal = {International Journal on Digital Libraries},
volume = {12},
number = {2-3},
pages = {51-52},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
2011
Rizo, D.; Iñesta, J. M.; Lemström, K.
Polyphonic Music Retrieval with Classifier Ensembles Journal Article
In: Journal of New Music Research, vol. 40, no. 4, pp. 313-324, 2011, ISSN: 0929-8215.
@article{k284,
title = {Polyphonic Music Retrieval with Classifier Ensembles},
author = {D. Rizo and J. M. Iñesta and K. Lemström},
issn = {0929-8215},
year = {2011},
date = {2011-12-01},
urldate = {2011-12-01},
journal = {Journal of New Music Research},
volume = {40},
number = {4},
pages = {313-324},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
Iñesta, J. M.; Pérez-García, T.
A Multimodal Music Transcription Prototype Proceedings Article
In: Proc. of International Conference on Multimodal Interaction, ICMI 2011, pp. 315–318, ACM, Alicante, Spain, 2011, ISBN: 978-1-4503-0641-6.
@inproceedings{k274,
title = {A Multimodal Music Transcription Prototype},
author = {J. M. Iñesta and T. Pérez-García},
isbn = {978-1-4503-0641-6},
year = {2011},
date = {2011-11-01},
urldate = {2011-11-01},
booktitle = {Proc. of International Conference on Multimodal Interaction, ICMI 2011},
pages = {315--318},
publisher = {ACM},
address = {Alicante, Spain},
abstract = {Music transcription consists of transforming an audio signal encoding a music performance into a symbolic representation such as a music score. In this paper, a multimodal and interactive prototype to perform music transcription is presented. The system is oriented to monotimbral transcription; its working domain is music played by a single instrument. This prototype uses three different sources of information to detect notes in a musical audio excerpt. It has been developed to allow a human expert to interact with the system to improve its results. In its current implementation, it offers a limited range of interaction and multimodality. Further development aimed at full interactivity and multimodal interactions is discussed.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Miotto, R.; Rizo, D.; Orio, N.; Lartillot, O.
MusiCLEF: a Benchmark Activity in Multimodal Music Information Retrieval Proceedings Article
In: Proc. of the 12th International Society for Music Information Retrieval Conference (ISMIR), Miami 2011, pp. 603-608, University of Miami, 2011, ISBN: 978-0-615-54865-4.
@inproceedings{k275,
title = {MusiCLEF: a Benchmark Activity in Multimodal Music Information Retrieval},
author = {R. Miotto and D. Rizo and N. Orio and O. Lartillot},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/275/Orio_etal_Ismir_2011.pdf},
isbn = {978-0-615-54865-4},
year = {2011},
date = {2011-10-01},
urldate = {2011-10-01},
booktitle = {Proc. of the 12th International Society for Music Information Retrieval Conference (ISMIR), Miami 2011},
pages = {603-608},
publisher = {University of Miami},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
Ponce de León, Pedro J.
A statistical pattern recognition approach to symbolic music classification PhD Thesis
2011.
@phdthesis{k271,
title = {A statistical pattern recognition approach to symbolic music classification},
author = {Pedro J. Ponce de León},
editor = {José M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/271/PhD_Pedro_J_Ponce_de_Leon_2011.pdf},
year = {2011},
date = {2011-09-01},
address = {Alicante, Spain},
organization = {University of Alicante},
abstract = {This is a work in the field of Music Information Retrieval from symbolic sources (digital music scores or similar). It applies statistical pattern recognition techniques to approach two different, but related, problems: melody part selection in polyphonic works and automatic music genre classification. Possible applications of these techniques include the content-based cataloguing, indexing, and retrieval of musical works in large databases that store works in symbolic format (digital scores, MIDI files, etc.). Other applications, in the field of computational musicology, include the characterization of musical genres and melodies through the automatic analysis of the content of large volumes of musical works.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {phdthesis}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.; Rizo, D.
Melodic Identification Using Probabilistic Tree Automata Journal Article
In: Journal of New Music Research, vol. 40, no. 2, pp. 93-103, 2011, ISSN: 0929-8215.
@article{k270,
title = {Melodic Identification Using Probabilistic Tree Automata},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta and D. Rizo},
issn = {0929-8215},
year = {2011},
date = {2011-06-01},
urldate = {2011-06-01},
journal = {Journal of New Music Research},
volume = {40},
number = {2},
pages = {93-103},
abstract = {Similarity computation is a difficult issue in music information retrieval tasks, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree representation, using rhythm for defining the tree structure and pitch information for leaf and node labelling, has proven to be effective in melodic similarity computation. One of the main drawbacks of this approach is that the tree comparison algorithms have a high time complexity. In this paper, stochastic k-testable tree-models are applied for computing the similarity between two melodies as a probability. The results are compared to those achieved by tree edit distances, showing that k-testable tree-models outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For this, the method used must be able to deal with inexact queries and be efficient for scalability reasons.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Iñesta, J. M.; Pérez-Sancho, C.; Hontanilla, M.
Composer Recognition using Language Models Proceedings Article
In: Proc. of Signal Processing, Pattern Recognition, and Applications (SPPRA 2011), pp. 76-83, ACTA Press, Innsbruck, Austria, 2011, ISBN: 978-0-88986-865-6.
@inproceedings{k261,
title = {Composer Recognition using Language Models},
author = {J. M. Iñesta and C. Pérez-Sancho and M. Hontanilla},
isbn = {978-0-88986-865-6},
year = {2011},
date = {2011-02-01},
urldate = {2011-02-01},
booktitle = {Proc. of Signal Processing, Pattern Recognition, and Applications (SPPRA 2011)},
pages = {76-83},
publisher = {ACTA Press},
address = {Innsbruck, Austria},
abstract = {In this paper we present an application of language modeling techniques using n-grams to an authorship attribution task. A stylometric study has been conducted on a pair of datasets of baroque and classical composers, with which other authors previously performed a similar study using a set of musicological features and pattern recognition techniques. In this paper, a simple general-purpose encoding method has been used in conjunction with language modeling to explore the same problem. The results show that this simpler method can lead to the same conclusions as other more sophisticated methods, and even traditional musicological studies, without the need for advanced musicological knowledge for processing the scores.},
keywords = {DRIMS, UA-CPS},
pubstate = {published},
tppubtype = {inproceedings}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.
Classifying melodies using tree grammars Journal Article
In: Lecture Notes in Computer Science, vol. 6669, pp. 572–579, 2011, ISSN: 0302-9743.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@article{k264,
title = {Classifying melodies using tree grammars},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/264/ibpria2011-bernabeu.pdf},
issn = {0302-9743},
year = {2011},
date = {2011-01-01},
journal = {Lecture Notes in Computer Science},
volume = {6669},
pages = {572--579},
abstract = {Similarity computation is a difficult issue in music information retrieval, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree representation, using rhythm to define the tree structure and pitch information for leaf and node labeling, has proven to be effective in melodic similarity computation. In this paper we propose a solution for the case where melodies are represented by trees for training but duration information is not available for the input data. For that, we infer a probabilistic context-free grammar using the information in the trees (duration and pitch) and classify new melodies represented by strings using only the pitch. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For this, the method must be able to deal with inexact queries and be efficient enough to scale.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Calvo-Zaragoza, J.; Rizo, D.; Iñesta, J. M.
A distance for partially labeled trees Journal Article
In: Lecture Notes in Computer Science, vol. 6669, pp. 492–499, 2011, ISSN: 0302-9743.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV
@article{k265,
title = {A distance for partially labeled trees},
author = {J. Calvo-Zaragoza and D. Rizo and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/265/ibpria11-calvo.pdf},
issn = {0302-9743},
year = {2011},
date = {2011-01-01},
journal = {Lecture Notes in Computer Science},
volume = {6669},
pages = {492--499},
abstract = {Trees are a powerful data structure for representing data for which hierarchical relations can be defined. They have been applied in a number of fields such as image analysis, natural language processing, protein structure, and music retrieval, to name a few. Procedures for comparing trees are very relevant in many tasks where tree representations are involved. The computation of these measures is usually time consuming, and different authors have proposed algorithms that are able to compute them in a reasonable time by means of approximated versions of the similarity measure. Other methods require the trees to be fully labeled for the distance to be computed. The measure utilized in this paper is able to deal with trees labeled only at the leaves and runs in $O(|T_1| \times |T_2|)$ time. Experiments and comparative results are provided.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {article}
}
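The entry above concerns a distance for trees labeled only at the leaves, computable in $O(|T_1| \times |T_2|)$ time. The toy sketch below is not the paper's measure, but it illustrates the general recipe such distances follow: align the children of two nodes with an edit-distance DP, recursing for substitutions and charging a subtree's leaf count for insertions and deletions. The nested-list tree encoding is an assumption made purely for illustration:

```python
def size(t):
    """Number of leaves in a tree (a leaf is a string; an inner node is a list)."""
    return 1 if isinstance(t, str) else sum(size(c) for c in t)

def dist(t1, t2):
    """Toy edit-style distance for ordered trees labeled only at the leaves."""
    if isinstance(t1, str) and isinstance(t2, str):
        return 0 if t1 == t2 else 1  # leaf vs leaf: label mismatch costs 1
    # Compare a leaf against an inner node by treating it as a one-child sequence.
    c1 = [t1] if isinstance(t1, str) else t1
    c2 = [t2] if isinstance(t2, str) else t2
    n, m = len(c1), len(c2)
    # Edit-distance DP over the two children sequences: deleting or inserting
    # a subtree costs its number of leaves; substitution recurses into both.
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + size(c1[i - 1])
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + size(c2[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i - 1][j] + size(c1[i - 1]),
                          D[i][j - 1] + size(c2[j - 1]),
                          D[i - 1][j - 1] + dist(c1[i - 1], c2[j - 1]))
    return D[n][m]
```

In the melodic setting of these papers, leaves would carry pitch labels and the tree shape would encode rhythm, so two melodies differing in one note get a small distance.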
Iñesta, J. M.; Rizo, D.; Illescas, P. R.
Learning melodic analysis rules Technical Report
2011.
Abstract | Links | BibTeX | Tags: DRIMS
@techreport{k276,
title = {Learning melodic analysis rules},
author = {J. M. Iñesta and D. Rizo and P. R. Illescas},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/276/mml2011-melan-final.pdf},
year = {2011},
date = {2011-01-01},
urldate = {2011-01-01},
booktitle = {4th Int.Workshop on Music and Machine Learning},
organization = {NIPS},
abstract = {Automatic musical analysis has been approached from different perspectives: grammars, expert systems, probabilistic models, and model matching have been proposed for implementing tonal analysis. In this work we focus on automatic melodic analysis. One question that arises when building a melodic analysis system using a priori music theory is whether it is possible to automatically extract analysis rules from examples, and how similar those learned rules are to music theory rules. This work investigates this question: given a dataset of analyzed melodies, our objective is to automatically learn analysis rules and to compare them with music theory rules.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {techreport}
}
2010
Rizo, D.
Symbolic music comparison with tree data structures PhD Thesis
2010.
@phdthesis{k258,
title = {Symbolic music comparison with tree data structures},
author = {D. Rizo},
editor = {J. M. Iñesta (supervisor)},
year = {2010},
date = {2010-11-01},
organization = {Universidad de Alicante},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {phdthesis}
}
Rauber, A.; Mayer, R.
Feature Selection in a Cartesian Ensemble of Feature Subspace Classifiers for Music Categorisation Proceedings Article
In: Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010), pp. 53–56, ACM, Florence (Italy), 2010, ISBN: 978-1-60558-933-6.
Abstract | BibTeX | Tags: DRIMS, MIPRCV
@inproceedings{k255,
title = {Feature Selection in a Cartesian Ensemble of Feature Subspace Classifiers for Music Categorisation},
author = {A. Rauber and R. Mayer},
isbn = {978-1-60558-933-6},
year = {2010},
date = {2010-10-01},
urldate = {2010-10-01},
booktitle = {Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010)},
pages = {53--56},
publisher = {ACM},
address = {Florence (Italy)},
abstract = {We evaluate the impact of feature selection on classification accuracy and on the dimensionality reduction achieved, which reduces the time needed to train classification models. Our classification scheme is a Cartesian ensemble classification system, based on the principle of late fusion and feature subspaces. These feature subspaces describe different aspects of the same data set. We use it for the ensemble classification of multiple feature sets from the audio and symbolic domains. We present an extensive set of experiments in the context of music genre classification, based on Music IR benchmark datasets. We show that while feature selection does not benefit classification accuracy, it greatly reduces the dimensionality of each feature subspace, and thus yields great gains in the time needed to train the individual classification models that form the ensemble.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Pérez-García, T.; Pérez-Sancho, C.
Harmonic and Instrumental Information Fusion for Musical Genre Classification Proceedings Article
In: Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010), pp. 49–52, ACM, Florence (Italy), 2010, ISBN: 978-1-60558-933-6.
Abstract | BibTeX | Tags: DRIMS, MIPRCV
@inproceedings{k256,
title = {Harmonic and Instrumental Information Fusion for Musical Genre Classification},
author = {T. Pérez-García and C. Pérez-Sancho},
isbn = {978-1-60558-933-6},
year = {2010},
date = {2010-10-01},
booktitle = {Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010)},
pages = {49--52},
publisher = {ACM},
address = {Florence (Italy)},
abstract = {This paper presents a musical genre classification system based on the combination of two kinds of information of very different nature: the instrumentation information contained in a MIDI file (metadata) and the chords that provide the harmonic structure of the musical score stored in that file (content). The fusion of these two information sources yields a single feature vector representing the file, to which classification techniques commonly used for text categorization tasks are applied. The classification task is performed under a probabilistic approach that improves the results previously obtained for the same data using the instrumental or the chord information independently.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
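The fusion described above (instrument metadata plus chord content merged into one feature vector, then classified with text-categorization techniques) can be sketched as a bag-of-features merge feeding a multinomial naive Bayes classifier. This is an illustrative assumption rather than the paper's exact model, and every feature name below is made up:

```python
import math
from collections import Counter

def fuse(instruments, chords):
    """Concatenate two bags of features into one sparse vector,
    namespacing the keys so instrumentation and harmony stay distinguishable."""
    v = Counter({f"inst:{k}": c for k, c in instruments.items()})
    v.update({f"chord:{k}": c for k, c in chords.items()})
    return v

class MultinomialNB:
    """Add-one smoothed multinomial naive Bayes over sparse count vectors."""
    def fit(self, vectors, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = {c: Counter() for c in self.class_counts}
        self.vocab = set()
        for v, y in zip(vectors, labels):
            self.feat_counts[y].update(v)
            self.vocab.update(v)
        return self

    def predict(self, v):
        total_docs = sum(self.class_counts.values())
        def score(c):
            total = sum(self.feat_counts[c].values())
            V = len(self.vocab)
            s = math.log(self.class_counts[c] / total_docs)
            for feat, n in v.items():
                s += n * math.log((self.feat_counts[c][feat] + 1) / (total + V))
            return s
        return max(self.class_counts, key=score)
```

Namespacing the keys is the whole trick: once both sources live in one vector, any text-categorization classifier applies unchanged.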
Pérez-Sancho, C.; Rizo, D.; Iñesta, J. M.; Ponce de León, P. J.; Kersten, S.; Ramírez, R.
Genre classification of music by tonal harmony Journal Article
In: Intelligent Data Analysis, vol. 14, no. 5, pp. 533-545, 2010, ISSN: 1088-467X.
Abstract | BibTeX | Tags: Acc. Int. E-A, DRIMS, PROSEMUS
@article{k232,
title = {Genre classification of music by tonal harmony},
author = {C. Pérez-Sancho and D. Rizo and J. M. Iñesta and P. J. Ponce de León and S. Kersten and R. Ramírez},
issn = {1088-467X},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
journal = {Intelligent Data Analysis},
volume = {14},
number = {5},
pages = {533-545},
abstract = {In this paper we present a genre classification framework for audio music based on a symbolic classification system. Audio signals are transformed into a symbolic representation of harmony using a chord transcription algorithm, based on the computation of harmonic pitch class profiles. Then, language models built from a ground truth of chord progressions for each genre are used to perform classification. We show that chord progressions are a suitable feature to represent musical genre, as they capture the harmonic rules relevant in each musical period or style. Finally, results using both pure symbolic information and chords transcribed from audio-from-MIDI are compared, in order to evaluate the effects of the transcription process in this task.},
keywords = {Acc. Int. E-A, DRIMS, PROSEMUS},
pubstate = {published},
tppubtype = {article}
}
Calera-Rubio, J.; Bernabeu, J. F.
Tree language automata for melody recognition Proceedings Article
In: Pérez, Juan Carlos (Ed.): Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI), pp. 17-22, AERFAI IBERGARCETA PUBLICACIONES, S.L., Valencia, Spain, 2010, ISBN: 978-84-92812-66-0.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@inproceedings{k251,
title = {Tree language automata for melody recognition},
author = {J. Calera-Rubio and J. F. Bernabeu},
editor = {Juan Carlos Pérez},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/251/bernabeuCEDI2010Final.pdf},
isbn = {978-84-92812-66-0},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
booktitle = {Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI)},
pages = {17-22},
publisher = {IBERGARCETA PUBLICACIONES, S.L.},
address = {Valencia, Spain},
organization = {AERFAI},
abstract = {The representation of symbolic music by means of trees has been shown to be suitable for melodic similarity computation. In order to compare trees, different tree edit distances have been used previously, their complexity being a main drawback. In this paper, the application of stochastic k-testable tree-models for computing the similarity between two melodies as a probability, compared to the classical edit distance, is addressed. The results show that k-testable tree-models seem to be adequate for the task, since they outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs. For this, the method must be able to deal with inexact queries and be efficient enough to scale.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Iñesta, J. M.; Pérez-Sancho, C.; Pérez-García, T.
Fusión de información armónica e instrumental para la clasificación de géneros musicales Proceedings Article
In: Pérez, Juan Carlos (Ed.): Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI), pp. 147-153, AERFAI Ibergarceta Publicaciones S.L., Valencia, Spain, 2010, ISBN: 978-84-92812-66-0.
Abstract | BibTeX | Tags: DRIMS, MIPRCV
@inproceedings{k252,
title = {Fusión de información armónica e instrumental para la clasificación de géneros musicales},
author = {J. M. Iñesta and C. Pérez-Sancho and T. Pérez-García},
editor = {Juan Carlos Pérez},
isbn = {978-84-92812-66-0},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
booktitle = {Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI)},
pages = {147-153},
publisher = {Ibergarceta Publicaciones S.L.},
address = {Valencia, Spain},
organization = {AERFAI},
abstract = {In this article we present a musical genre classification system based on the combination of two different types of information: the instrumentation information contained in a MIDI file and the chords that provide the harmonic structure of the musical score stored in that file. The fusion of these two sources yields a single feature vector to which techniques commonly used in text classification are applied. This results in a probabilistic classifier that improves the results obtained in previous works, where the instrumental and the harmonic information of a MIDI file were used independently.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Pérez, A.; Ramírez, R.; Iñesta, J. M.
Modeling violin performances using inductive logic programming Journal Article
In: Intelligent Data Analysis, vol. 14, no. 5, pp. 573–585, 2010, ISSN: 1088-467X.
@article{k253,
title = {Modeling violin performances using inductive logic programming},
author = {A. Pérez and R. Ramírez and J. M. Iñesta},
issn = {1088-467X},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
journal = {Intelligent Data Analysis},
volume = {14},
number = {5},
pages = {573--585},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
Lidy, T.; Mayer, R.; Rauber, A.; de León, P. J. Ponce; Pertusa, A.; Iñesta, J. M.
A Cartesian Ensemble of Feature Subspace Classifiers for Music Categorization Proceedings Article
In: Downie, J. Stephen; Veltkamp, Remco C. (Ed.): Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pp. 279-284, International Society for Music Information Retrieval, Utrecht, Netherlands, 2010, ISBN: 978-90-393-53813.
Abstract | Links | BibTeX | Tags: DRIMS
@inproceedings{k246,
title = {A Cartesian Ensemble of Feature Subspace Classifiers for Music Categorization},
author = {T. Lidy and R. Mayer and A. Rauber and P. J. Ponce de León and A. Pertusa and J. M. Iñesta},
editor = {J. Stephen Downie and Remco C. Veltkamp},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/246/ismir2010.pdf},
isbn = {978-90-393-53813},
year = {2010},
date = {2010-08-01},
urldate = {2010-08-01},
booktitle = {Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010)},
pages = {279-284},
publisher = {International Society for Music Information Retrieval},
address = {Utrecht, Netherlands},
organization = {International Society for Music Information Retrieval},
abstract = {We present a Cartesian ensemble classification system that is based on the principle of late fusion and feature subspaces. These feature subspaces describe different aspects of the same data set. The framework is built on the Weka machine learning toolkit and is able to combine arbitrary feature sets and learning schemes. In our scenario, we use it for the ensemble classification of multiple feature sets from the audio and symbolic domains. We present an extensive set of experiments in the context of music genre classification, based on numerous Music IR benchmark datasets, and evaluate a set of combination/voting rules. The results show that the approach is superior to the best choice of a single algorithm on a single feature set. Moreover, it also releases the user from making this choice explicitly.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
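The ensemble idea above — one classifier per feature subspace, combined by late fusion — can be illustrated with a deliberately simple sketch: a nearest-centroid classifier per subspace and a plain majority vote as the combination rule. The paper evaluates several voting rules and builds on Weka learners; everything below is a stand-in:

```python
from collections import Counter, defaultdict

class NearestCentroid:
    """Toy per-subspace classifier: predicts the class with the nearest mean."""
    def fit(self, X, y):
        sums = defaultdict(lambda: [0.0] * len(X[0]))
        counts = Counter(y)
        for xi, yi in zip(X, y):
            for j, v in enumerate(xi):
                sums[yi][j] += v
        self.centroids = {c: [s / counts[c] for s in vec]
                          for c, vec in sums.items()}
        return self

    def predict(self, x):
        def sqdist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        return min(self.centroids, key=lambda c: sqdist(self.centroids[c], x))

def ensemble_predict(models, features_per_subspace):
    """Late fusion: each subspace classifier votes on its own feature view
    of the same item; the majority label wins."""
    votes = Counter(m.predict(x) for m, x in zip(models, features_per_subspace))
    return votes.most_common(1)[0][0]
```

Each subspace (say, audio timbre features versus symbolic features) gets its own trained model; only the votes are fused, never the raw feature vectors.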
Rizo, D.; Iñesta, J. M.
New partially labelled tree similarity measure: a case study Proceedings Article
In: Hancock, E. R.; Wilson, R. C.; Ilkay, T. W.; Escolano, F. (Ed.): Structural, Syntactic, and Statistical Pattern Recognition, pp. 296–305, Springer, Cesme, Turkey, 2010, ISBN: 978-3-642-14979-5.
@inproceedings{k248,
title = {New partially labelled tree similarity measure: a case study},
author = {D. Rizo and J. M. Iñesta},
editor = {E. R. Hancock and R. C. Wilson and T. W. Ilkay and F. Escolano},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/248/ssspr10-cr.pdf},
isbn = {978-3-642-14979-5},
year = {2010},
date = {2010-08-01},
booktitle = {Structural, Syntactic, and Statistical Pattern Recognition},
pages = {296--305},
publisher = {Springer},
address = {Cesme, Turkey},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
Pertusa, A.
Computationally efficient methods for polyphonic music transcription PhD Thesis
2010.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV
@phdthesis{k244,
title = {Computationally efficient methods for polyphonic music transcription},
author = {A. Pertusa},
editor = {José M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/244/pertusaphd.pdf},
year = {2010},
date = {2010-01-01},
organization = {Universidad de Alicante},
abstract = {Automatic music transcription is a music information retrieval (MIR) task which involves many different disciplines, such as audio signal processing, machine learning, computer science, psychoacoustics and music perception, music theory, and music cognition. The goal of automatic music transcription is to extract a human readable and interpretable representation, like a musical score, from an audio signal. To achieve this goal, it is necessary to estimate the pitches, onset times and durations of the notes, the tempo, the meter and the tonality of a musical piece.
The most obvious application of automatic music transcription is to help a musician to write down the music notation of a performance from an audio recording, which is a time consuming task when it is done by hand. Besides this application, automatic music transcription can also be useful for other MIR tasks, like plagiarism detection, artist identification, genre classification, and composition assistance by changing the instrumentation, the arrangement or the loudness before resynthesizing new pieces. In general, music transcription methods can also provide information about the notes to symbolic music algorithms.
This work addresses the automatic music transcription problem using different strategies. Novel efficient methods are proposed for onset detection (detection of the beginnings of musical events) and multiple fundamental frequency estimation (estimation of the pitches in a polyphonic mixture), using supervised learning and signal processing techniques.
The main contributions of this work can be summarized in the following points:
- An analytical and extensive review of the state of the art methods for onset detection and multiple fundamental frequency estimation.
- The development of an efficient approach for onset detection and the construction of a public ground-truth data set for this task.
- Two novel approaches for multiple pitch estimation of a priori known sounds using supervised learning methods. These algorithms were one of the first machine learning methods proposed for this task.
- A simple iterative cancellation approach, mainly intended to transcribe piano music at a low computational cost.
- Heuristic multiple fundamental frequency algorithms based on signal processing to analyze real music without any a priori knowledge. These methods, which are probably the main contribution of this work, experimentally reached the state of the art for this task with a very low
computational burden.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {phdthesis}
}
Iñesta, J. M.; Rizo, D.
Trees and combined methods for monophonic music similarity evaluation Proceedings Article
In: MIREX 2010 - Music Information Retrieval Evaluation eXchange, MIREX Symbolic Melodic Similarity contest, Utrecht, The Netherlands, 2010.
@inproceedings{k254,
title = {Trees and combined methods for monophonic music similarity evaluation},
author = {J. M. Iñesta and D. Rizo},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/254/trees.pdf},
year = {2010},
date = {2010-01-01},
urldate = {2010-01-01},
booktitle = {MIREX 2010 - Music Information Retrieval Evaluation eXchange, MIREX Symbolic Melodic Similarity contest},
address = {Utrecht, The Netherlands},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
2009
Calera-Rubio, J.; Bernabeu, J. F.
A probabilistic approach to melodic similarity Proceedings Article
In: Proceedings of MML 2009, pp. 48-53, 2009.
Abstract | Links | BibTeX | Tags: ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA
@inproceedings{k231,
title = {A probabilistic approach to melodic similarity},
author = {J. Calera-Rubio and J. F. Bernabeu},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/231/mml2009Bernabeu.pdf},
year = {2009},
date = {2009-01-01},
urldate = {2009-01-01},
booktitle = {Proceedings of MML 2009},
pages = {48-53},
abstract = {Melodic similarity is an important research topic in music information retrieval. The representation of symbolic music by means of trees has proven to be suitable for melodic similarity computation, because trees are able to code rhythm in their structure, leaving only pitch representation as a degree of freedom for coding. In order to compare trees, different edit distances have been used previously. In this paper, stochastic k-testable tree-models, formerly used in other domains such as structured document compression and natural language processing, have been used to compute a similarity measure between melody trees as a probability, and their performance has been compared to a classical tree edit distance.},
keywords = {ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Hontanilla, M.; Pérez-Sancho, C.; Iñesta, J. M.
Modeling Musical Style with Language Models for Composer Recognition Journal Article
In: Lecture Notes in Computer Science, vol. 7887, pp. 740-748, 2013, ISSN: 0302-9743.
@article{k300,
title = {Modeling Musical Style with Language Models for Composer Recognition},
author = {M. Hontanilla and C. Pérez-Sancho and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/300/10.1007_978-3-642-38628-2_88.pdf},
issn = {0302-9743},
year = {2013},
date = {2013-06-01},
journal = {Lecture Notes in Computer Science},
volume = {7887},
pages = {740-748},
abstract = {In this paper we present an application of language modeling using n-grams to model the style of different composers. For this, we repeated the experiments performed in previous works by other authors using a corpus of 5 composers from the Baroque and Classical periods. In these experiments we found some signs that the results could be influenced by external factors other than the composers’ styles, such as the heterogeneity in the musical forms selected for the corpus. In order to as- sess the validity of the modeling techniques to capture the own personal style of the composers, a new experiment was performed with a corpus of fugues from Bach and Shostakovich. All these experiments show that language modeling is a suitable tool for modeling musical style, even when the styles of the different datasets are affected by several factors.},
keywords = {DRIMS, Prometeo 2012},
pubstate = {published},
tppubtype = {article}
}
Iñesta, J. M.; Pérez-Sancho, C.
Interactive multimodal music transcription Proceedings Article
In: Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2013), pp. 211-215, IEEE, Vancouver, Canada, 2013, ISBN: 978-1-4799-0356-6.
@inproceedings{k299,
title = {Interactive multimodal music transcription},
author = {J. M. Iñesta and C. Pérez-Sancho},
isbn = {978-1-4799-0356-6},
year = {2013},
date = {2013-05-01},
booktitle = {Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2013)},
pages = {211-215},
publisher = {IEEE},
address = {Vancouver, Canada},
abstract = {Automatic music transcription has usually been performed as an autonomous task and its evaluation has been made in terms of precision, recall, accuracy, etc. Nevertheless, in this work, assuming that the state of the art is far from being perfect, it is considered as an interactive one, where an expert user is assisted in its work by a transcription tool. In this context, the performance evaluation of the system turns into an assessment of how many user interactions are needed to complete the work. The strategy is that the user interactions can be used by the system to improve its performance in an adaptive way, thus minimizing the workload. Also, a multimodal approach has been implemented, in such a way that different sources of information, like onsets, beats, and meter, are used to detect notes in a musical audio excerpt. The system is focused on monotimbral polyphonic transcription.},
keywords = {DRIMS, Prometeo 2012},
pubstate = {published},
tppubtype = {inproceedings}
}
2012
Bresson, J.; Pérez-Sancho, C.
New Framework for Score Segmentation and Analysis in OpenMusic Proceedings Article
In: Serafin, S. (Ed.): Proceedings of the 9th Sound and Music Computing Conference, pp. 506-513, Sound & Music Computing Logos Verlag, Copenhagen, Denmark, 2012, ISBN: 978-3-8325-3180-5.
@inproceedings{k295,
title = {New Framework for Score Segmentation and Analysis in OpenMusic},
author = {J. Bresson and C. Pérez-Sancho},
editor = {S. Serafin},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/295/smc2012-149.pdf},
isbn = {978-3-8325-3180-5},
year = {2012},
date = {2012-07-01},
booktitle = {Proceedings of the 9th Sound and Music Computing Conference},
pages = {506-513},
publisher = {Logos Verlag},
address = {Copenhagen, Denmark},
organization = {Sound & Music Computing},
abstract = {We present new tools for the segmentation and analysis of musical scores in the OpenMusic computer-aided composition environment. A modular object-oriented framework enables the creation of segmentations on score objects and the implementation of automatic or semi-automatic analysis processes. The analyses can be performed and displayed thanks to customizable classes and callbacks. Concrete examples are given, in particular with the implementation of a semi-automatic harmonic analysis system and a framework for rhythmic transcription.},
keywords = {DRIMS, PASCAL2},
pubstate = {published},
tppubtype = {inproceedings}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.; Rizo, D.
Query Parsing Using Probabilistic Tree Grammars Technical Report
Edinburgh, 2012.
@techreport{k292,
title = {Query Parsing Using Probabilistic Tree Grammars},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta and D. Rizo},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/292/mml2012.pdf},
year = {2012},
date = {2012-06-01},
booktitle = {5th workshop on Music and Machine Learning, MML 2012},
address = {Edinburgh},
organization = {5th workshop on Music and Machine Learning, MML 2012},
abstract = {The tree representation, using rhythm for defining the tree structure and pitch information for node labeling, has proven to be effective in melodic similarity computation. In this paper we propose a solution representing melodies by tree grammars. For that, we infer probabilistic context-free grammars for the melodies in a database, using their tree coding (with duration and pitch), and classify queries represented as a string of pitches. We aim to assess their ability to identify a noisy snippet query among a set of songs stored in symbolic format.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {techreport}
}
Rico-Juan, J. R.; Iñesta, J. M.
New rank methods for reducing the size of the training set using the nearest neighbor rule Journal Article
In: Pattern Recognition Letters, vol. 33, no. 5, pp. 654–660, 2012, ISSN: 0167-8655.
@article{k283,
title = {New rank methods for reducing the size of the training set using the nearest neighbor rule},
author = {J. R. Rico-Juan and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/283/rankTrainingSet.pdf},
issn = {0167-8655},
year = {2012},
date = {2012-04-01},
journal = {Pattern Recognition Letters},
volume = {33},
number = {5},
pages = {654--660},
abstract = {(http://dx.doi.org/10.1016/j.patrec.2011.07.019)
Some new rank methods to select the best prototypes from a training set are proposed in this paper in order to establish its size according to an external parameter, while maintaining the classification accuracy. The traditional methods that filter the training set in a classification task like editing or condensing have some rules that apply to the set in order to remove outliers or keep some prototypes that help in the classification. In our approach, new voting methods are proposed to compute the prototype probability and help to classify correctly a new sample. This probability is the key to sorting the training set out, so a relevance factor from 0 to 1 is used to select the best candidates for each class whose accumulated probabilities are less than that parameter. This approach makes it possible to select the number of prototypes necessary to maintain or even increase the classification accuracy. The results obtained in different high dimensional databases show that these methods maintain the final error rate while reducing the size of the training set.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Vicente, O.; Iñesta, J. M.
Bass track selection in MIDI files and multimodal implications to melody Proceedings Article
In: Latorre Carmona, Pedro; Sánchez, J. Salvador; Fred, Ana (Ed.): Proceedings of the Int. Conf. on Pattern Recognition Applications and Methods (ICPRAM 2012), pp. 449–458, INSTICC SciTePress, Vilamoura, Portugal, 2012, ISBN: 978-989-8425-98-0.
@inproceedings{k285,
title = {Bass track selection in MIDI files and multimodal implications to melody},
author = {O. Vicente and J. M. Iñesta},
editor = {Pedro Latorre Carmona and J. Salvador Sánchez and Ana Fred},
isbn = {978-989-8425-98-0},
year = {2012},
date = {2012-02-01},
urldate = {2012-02-01},
booktitle = {Proceedings of the Int. Conf. on Pattern Recognition Applications and Methods (ICPRAM 2012)},
pages = {449--458},
publisher = {SciTePress},
address = {Vilamoura, Portugal},
organization = {INSTICC},
abstract = {Standard MIDI files consist of a number of tracks containing information that can be considered as a symbolic representation of music. Usually each track represents an instrument or voice in a music piece. The goal for this work is to identify the track that contains the bass line. This information is very relevant for a number of tasks like rhythm analysis or harmonic segmentation, among others. It is not easy since a bass line can be performed by very different kinds of instruments. We have approached this problem by using statistical features from the symbolic representation of music and a random forest classifier. The first experiment was to classify a track as bass or non-bass. Then we have tried to select the correct bass track in a multi-track MIDI file. Eventually, we have studied the issue of how different sources of information can help in this latter task. In particular, we have analyzed the interactions between bass and melody information. Yielded results were very accurate and melody track identification was significantly improved when using this kind of multimodal help.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Pertusa, A.; Iñesta, J. M.
Efficient methods for joint estimation of multiple fundamental frequencies in music signals Journal Article
In: EURASIP Journal on Advances in Signal Processing, vol. 2012, no. 1, pp. 27, 2012, ISSN: 1687-6180.
@article{k286,
title = {Efficient methods for joint estimation of multiple fundamental frequencies in music signals},
author = {A. Pertusa and J. M. Iñesta},
issn = {1687-6180},
year = {2012},
date = {2012-01-01},
journal = {EURASIP Journal on Advances in Signal Processing},
volume = {2012},
number = {1},
pages = {27},
abstract = {This study presents efficient techniques for multiple fundamental frequency estimation in music signals. The proposed methodology can infer harmonic patterns from a mixture considering interactions with other sources and evaluate them in a joint estimation scheme. For this purpose, a set of fundamental frequency candidates are first selected at each frame, and several hypothetical combinations of them are generated. Combinations are independently evaluated, and the most likely is selected taking into account the intensity and spectral smoothness of its inferred patterns. The method is extended considering adjacent frames in order to smooth the detection in time, and a pitch tracking stage is finally performed to increase the temporal coherence. The proposed algorithms were evaluated in MIREX contests yielding state of the art results with a very low computational burden.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
Orio, N.; Rauber, A.; Rizo, D.
Introduction to the focused issue on music digital libraries Journal Article
In: International Journal on Digital Libraries, vol. 12, no. 2-3, pp. 51-52, 2012, ISSN: 1432-5012.
@article{k294,
title = {Introduction to the focused issue on music digital libraries},
author = {N. Orio and A. Rauber and D. Rizo},
issn = {1432-5012},
year = {2012},
date = {2012-01-01},
urldate = {2012-01-01},
journal = {International Journal on Digital Libraries},
volume = {12},
number = {2-3},
pages = {51-52},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
2011
Rizo, D.; Iñesta, J. M.; Lemström, K.
Polyphonic Music Retrieval with Classifier Ensembles Journal Article
In: Journal of New Music Research, vol. 40, no. 4, pp. 313-324, 2011, ISSN: 0929-8215.
@article{k284,
title = {Polyphonic Music Retrieval with Classifier Ensembles},
author = {D. Rizo and J. M. Iñesta and K. Lemström},
issn = {0929-8215},
year = {2011},
date = {2011-12-01},
urldate = {2011-12-01},
journal = {Journal of New Music Research},
volume = {40},
number = {4},
pages = {313-324},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
Iñesta, J. M.; Pérez-García, T.
A Multimodal Music Transcription Prototype Proceedings Article
In: Proc. of International Conference on Multimodal Interaction, ICMI 2011, pp. 315–318, ACM, Alicante, Spain, 2011, ISBN: 978-1-4503-0641-6.
@inproceedings{k274,
title = {A Multimodal Music Transcription Prototype},
author = {J. M. Iñesta and T. Pérez-García},
isbn = {978-1-4503-0641-6},
year = {2011},
date = {2011-11-01},
urldate = {2011-11-01},
booktitle = {Proc. of International Conference on Multimodal Interaction, ICMI 2011},
pages = {315--318},
publisher = {ACM},
address = {Alicante, Spain},
abstract = {Music transcription consists of transforming an audio signal encoding a music performance into a symbolic representation such as a music score. In this paper, a multimodal and interactive prototype to perform music transcription is presented. The system is oriented to monotimbral transcription: its working domain is music played by a single instrument. This prototype uses three different sources of information to detect notes in a musical audio excerpt. It has been developed to allow a human expert to interact with the system to improve its results. In its current implementation, it offers a limited range of interaction and multimodality. Further development aimed at full interactivity and multimodal interactions is discussed.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Miotto, R.; Rizo, D.; Orio, N.; Lartillot, O.
MusiCLEF: a Benchmark Activity in Multimodal Music Information Retrieval Proceedings Article
In: Proc. of the 12th International Society for Music Information Retrieval Conference (ISMIR), Miami 2011, pp. 603-608, University of Miami, 2011, ISBN: 978-0-615-54865-4.
@inproceedings{k275,
title = {MusiCLEF: a Benchmark Activity in Multimodal Music Information Retrieval},
author = {R. Miotto and D. Rizo and N. Orio and O. Lartillot},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/275/Orio_etal_Ismir_2011.pdf},
isbn = {978-0-615-54865-4},
year = {2011},
date = {2011-10-01},
urldate = {2011-10-01},
booktitle = {Proc. of the 12th International Society for Music Information Retrieval Conference (ISMIR), Miami 2011},
pages = {603-608},
publisher = {University of Miami},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
Ponce de León, Pedro J.
A statistical pattern recognition approach to symbolic music classification PhD Thesis
2011.
@phdthesis{k271,
title = {A statistical pattern recognition approach to symbolic music classification},
author = {Pedro J. Ponce de León},
editor = {José M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/271/PhD_Pedro_J_Ponce_de_Leon_2011.pdf},
year = {2011},
date = {2011-09-01},
address = {Alicante, Spain},
organization = {University of Alicante},
abstract = {[ENGLISH] This is a work in the field of Music Information Retrieval, from symbolic sources (digital music scores or similar). It applies statistical pattern recognition techniques to approach two different, but related, problems: melody part selection in polyphonic works, and automatic music genre classification.
[ESPAÑOL] This work falls within the domain of Music Information Retrieval from symbolic sources (digital scores or similar). Specifically, computational solutions are proposed by applying statistical pattern recognition techniques to two problems: the automatic selection of melodic parts in polyphonic works and the automatic classification of musical genres. Possible applications of these techniques include the content-based cataloguing, indexing, and automatic retrieval of musical works in large databases containing works in symbolic format (digital scores, MIDI files, etc.). Other applications, within computational musicology, include the characterization of musical genres and melodies through the automatic content analysis of large volumes of musical works.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {phdthesis}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.; Rizo, D.
Melodic Identification Using Probabilistic Tree Automata Journal Article
In: Journal of New Music Research, vol. 40, no. 2, pp. 93-103, 2011, ISSN: 0929-8215.
@article{k270,
title = {Melodic Identification Using Probabilistic Tree Automata},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta and D. Rizo},
issn = {0929-8215},
year = {2011},
date = {2011-06-01},
urldate = {2011-06-01},
journal = {Journal of New Music Research},
volume = {40},
number = {2},
pages = {93-103},
abstract = {Similarity computation is a difficult issue in music information retrieval tasks, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree representation, using rhythm for defining the tree structure and pitch information for leaf and node labelling has proven to be effective in melodic similarity computation. One of the main drawbacks of this approach is that the tree comparison algorithms are of a high time complexity. In this paper, stochastic k-testable tree-models are applied for computing the similarity between two melodies as a probability. The results are compared to those achieved by tree edit distances, showing that k-testable tree-models outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For it, the utilized method must be able to deal with inexact queries and with efficiency for scalability issues.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Iñesta, J. M.; Pérez-Sancho, C.; Hontanilla, M.
Composer Recognition using Language Models Proceedings Article
In: Proc. of Signal Processing, Pattern Recognition, and Applications (SPPRA 2011), pp. 76-83, ACTA Press, Innsbruck, Austria, 2011, ISBN: 978-0-88986-865-6.
@inproceedings{k261,
title = {Composer Recognition using Language Models},
author = {J. M. Iñesta and C. Pérez-Sancho and M. Hontanilla},
isbn = {978-0-88986-865-6},
year = {2011},
date = {2011-02-01},
urldate = {2011-02-01},
booktitle = {Proc. of Signal Processing, Pattern Recognition, and Applications (SPPRA 2011)},
pages = {76-83},
publisher = {ACTA Press},
address = {Innsbruck, Austria},
abstract = {In this paper we present an application of language modeling techniques using n-grams to an authorship attribution task. A stylometric study has been conducted on a pair of datasets of baroque and classical composers, with which other authors previously performed a similar study using a set of musicological features and pattern recognition techniques. In this paper, a simple general-purpose encoding method has been used, in conjunction with language modeling, to explore the same problem. The results show that this simpler method can lead to the same conclusions as other more sophisticated methods, and even traditional musicological studies, without the need for advanced musicological knowledge for processing the scores.},
keywords = {DRIMS, UA-CPS},
pubstate = {published},
tppubtype = {inproceedings}
}
Bernabeu, J. F.; Calera-Rubio, J.; Iñesta, J. M.
Classifying melodies using tree grammars Journal Article
In: Lecture Notes in Computer Science, vol. 6669, pp. 572–579, 2011, ISSN: 0302-9743.
@article{k264,
title = {Classifying melodies using tree grammars},
author = {J. F. Bernabeu and J. Calera-Rubio and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/264/ibpria2011-bernabeu.pdf},
issn = {0302-9743},
year = {2011},
date = {2011-01-01},
journal = {Lecture Notes in Computer Science},
volume = {6669},
pages = {572--579},
abstract = {Similarity computation is a difficult issue in music information retrieval, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree representation, using rhythm for defining the tree structure and pitch information for leaf and node labeling, has proven to be effective in melodic similarity computation. In this paper we propose a solution for when melodies are represented by trees for training but the duration information is not available for the input data. For that, we infer a probabilistic context-free grammar using the information in the trees (duration and pitch) and classify new melodies represented by strings using only the pitch. The case study in this paper is to identify a snippet query among a set of songs stored in symbolic format. For it, the utilized method must be able to deal with inexact queries and be efficient for scalability issues.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {article}
}
Calvo-Zaragoza, J.; Rizo, D.; Iñesta, J. M.
A distance for partially labeled trees Journal Article
In: Lecture Notes in Computer Science, vol. 6669, pp. 492–499, 2011, ISSN: 0302-9743.
@article{k265,
title = {A distance for partially labeled trees},
author = {J. Calvo-Zaragoza and D. Rizo and J. M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/265/ibpria11-calvo.pdf},
issn = {0302-9743},
year = {2011},
date = {2011-01-01},
journal = {Lecture Notes in Computer Science},
volume = {6669},
pages = {492--499},
abstract = {Trees are a powerful data structure for representing data for which hierarchical relations can be defined. They have been applied in a number of fields like image analysis, natural language processing, protein structure, or music retrieval, to name a few. Procedures for comparing trees are very relevant in many tasks where tree representations are involved. The computation of these measures is usually time consuming, and different authors have proposed algorithms that are able to compute them in a reasonable time, by means of approximated versions of the similarity measure. Other methods require the trees to be fully labeled for the distance to be computed. The measure utilized in this paper is able to deal with trees labeled only at the leaves and runs in $O(|T_1| \times |T_2|)$ time. Experiments and comparative results are provided.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {article}
}
Iñesta, J. M.; Rizo, D.; Illescas, P. R.
Learning melodic analysis rules Technical Report
2011.
@techreport{k276,
title = {Learning melodic analysis rules},
author = {J. M. Iñesta and D. Rizo and P. R. Illescas},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/276/mml2011-melan-final.pdf},
year = {2011},
date = {2011-01-01},
urldate = {2011-01-01},
booktitle = {4th Int. Workshop on Music and Machine Learning},
organization = {NIPS},
abstract = {Automatic musical analysis has been approached from different perspectives: grammars, expert systems, probabilistic models, and model matching have been proposed for implementing tonal analysis. In this work we focus on automatic melodic analysis. One question that arises when building a melodic analysis system using a-priori music theory is whether it is possible to automatically extract analysis rules from examples, and how similar are those learnt rules compared to music theory rules. This work investigates this question, i.e. given a dataset of analyzed melodies our objective is to automatically learn analysis rules and to compare them with music theory rules.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {techreport}
}
2010
Rizo, D.
Symbolic music comparison with tree data structures PhD Thesis
2010.
@phdthesis{k258,
title = {Symbolic music comparison with tree data structures},
author = {D. Rizo},
editor = {J. M. Iñesta},
year = {2010},
date = {2010-11-01},
organization = {Universidad de Alicante},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {phdthesis}
}
Rauber, A.; Mayer, R.
Feature Selection in a Cartesian Ensemble of Feature Subspace Classifiers for Music Categorisation Proceedings Article
In: Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010), pp. 53–56, ACM, Florence (Italy), 2010, ISBN: 978-1-60558-933-6.
Abstract | BibTeX | Tags: DRIMS, MIPRCV
@inproceedings{k255,
title = {Feature Selection in a Cartesian Ensemble of Feature Subspace Classifiers for Music Categorisation},
author = {A. Rauber and R. Mayer},
isbn = {978-1-60558-933-6},
year = {2010},
date = {2010-10-01},
urldate = {2010-10-01},
booktitle = {Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010)},
pages = {53--56},
publisher = {ACM},
address = {Florence (Italy)},
abstract = {We evaluate the impact of feature selection on the classification accuracy and the achieved dimensionality reduction, which benefits the time needed on training classification models. Our classification scheme therein is a Cartesian ensemble classification system, based on the principle of late fusion and feature subspaces. These feature subspaces describe different aspects of the same data set. We use it for the ensemble classification of multiple feature sets from the audio and symbolic domains. We present an extensive set of experiments in the context of music genre classification, based on Music IR benchmark datasets. We show that while feature selection does not benefit classification accuracy, it greatly reduces the dimensionality of each feature subspace, and thus adds to great gains in the time needed to train the individual classification models that form the ensemble.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Pérez-García, T.; Pérez-Sancho, C.
Harmonic and Instrumental Information Fusion for Musical Genre Classification Proceedings Article
In: Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010), pp. 49–52, ACM, Florence (Italy), 2010, ISBN: 978-1-60558-933-6.
Abstract | BibTeX | Tags: DRIMS, MIPRCV
@inproceedings{k256,
title = {Harmonic and Instrumental Information Fusion for Musical Genre Classification},
author = {T. Pérez-García and C. Pérez-Sancho},
isbn = {978-1-60558-933-6},
year = {2010},
date = {2010-10-01},
booktitle = {Proc. of ACM Multimedia Workshop on Music and Machine Learning (MML 2010)},
pages = {49--52},
publisher = {ACM},
address = {Florence (Italy)},
abstract = {This paper presents a musical genre classification system based on the combination of two kinds of information of very different nature: the instrumentation information contained in a MIDI file (metadata) and the chords that provide the harmonic structure of the musical score stored in that file (content). The fusion of these two information sources gives a single feature vector that represents the file and to which classification techniques usually utilized for text categorization tasks are applied. The classification task is performed under a probabilistic approach that has improved the results previously obtained for the same data using the instrumental or the chord information independently.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Pérez-Sancho, C.; Rizo, D.; Iñesta, J. M.; León, P. J. Ponce; Kersten, S.; Ramírez, R.
Genre classification of music by tonal harmony Journal Article
In: Intelligent Data Analysis, vol. 14, no. 5, pp. 533-545, 2010, ISSN: 1088-467X.
Abstract | BibTeX | Tags: Acc. Int. E-A, DRIMS, PROSEMUS
@article{k232,
title = {Genre classification of music by tonal harmony},
author = {C. Pérez-Sancho and D. Rizo and J. M. Iñesta and P. J. Ponce León and S. Kersten and R. Ramírez},
issn = {1088-467X},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
journal = {Intelligent Data Analysis},
volume = {14},
number = {5},
pages = {533-545},
abstract = {In this paper we present a genre classification framework for audio music based on a symbolic classification system. Audio signals are transformed into a symbolic representation of harmony using a chord transcription algorithm, based on the computation of harmonic pitch class profiles. Then, language models built from a ground truth of chord progressions for each genre are used to perform classification. We show that chord progressions are a suitable feature to represent musical genre, as they capture the harmonic rules relevant in each musical period or style. Finally, results using both pure symbolic information and chords transcribed from audio-from-MIDI are compared, in order to evaluate the effects of the transcription process in this task.},
keywords = {Acc. Int. E-A, DRIMS, PROSEMUS},
pubstate = {published},
tppubtype = {article}
}
Calera-Rubio, J.; Bernabeu, J. F.
Tree language automata for melody recognition Proceedings Article
In: Pérez, Juan Carlos (Ed.): Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI), pp. 17-22, AERFAI IBERGARCETA PUBLICACIONES, S.L., Valencia, Spain, 2010, ISBN: 978-84-92812-66-0.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV, TIASA
@inproceedings{k251,
title = {Tree language automata for melody recognition},
author = {J. Calera-Rubio and J. F. Bernabeu},
editor = {Juan Carlos Pérez},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/251/bernabeuCEDI2010Final.pdf},
isbn = {978-84-92812-66-0},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
booktitle = {Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI)},
pages = {17-22},
publisher = {IBERGARCETA PUBLICACIONES, S.L.},
address = {Valencia, Spain},
organization = {AERFAI},
abstract = {The representation of symbolic music by means of trees has shown to be suitable in melodic similarity computation. In order to compare trees, different tree edit distances have been previously used, their complexity being a main drawback. In this paper, the application of stochastic k-testable tree models for computing the similarity between two melodies as a probability, compared to the classical edit distance, has been addressed. The results show that k-testable tree models seem to be adequate for the task, since they outperform other reference methods in both recognition rate and efficiency. The case study in this paper is to identify a snippet query among a set of songs. For this, the utilized method must be able to deal with inexact queries and be efficient for scalability.},
keywords = {DRIMS, MIPRCV, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}
Iñesta, J. M.; Pérez-Sancho, C.; Pérez-García, T.
Fusión de información armónica e instrumental para la clasificación de géneros musicales Proceedings Article
In: Pérez, Juan Carlos (Ed.): Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI), pp. 147-153, AERFAI Ibergarceta Publicaciones S.L., Valencia, Spain, 2010, ISBN: 978-84-92812-66-0.
Abstract | BibTeX | Tags: DRIMS, MIPRCV
@inproceedings{k252,
title = {Fusión de información armónica e instrumental para la clasificación de géneros musicales},
author = {J. M. Iñesta and C. Pérez-Sancho and T. Pérez-García},
editor = {Juan Carlos Pérez},
isbn = {978-84-92812-66-0},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
booktitle = {Actas del II Workshop de Reconocimiento de Formas y Análisis de Imágenes (AERFAI)},
pages = {147-153},
publisher = {Ibergarceta Publicaciones S.L.},
address = {Valencia, Spain},
organization = {AERFAI},
abstract = {En este artículo presentamos un sistema de clasificación de género musical basado en la combinación de dos tipos diferentes de información: la información instrumental contenida en un fichero MIDI y los acordes que proporcionan la estructura armónica de la partitura musical almacenada en dicho fichero. La unión de estas informaciones nos proporciona un único vector de características sobre el que se aplican técnicas usadas habitualmente en la clasificación de textos. Finalmente esto nos proporciona un clasificador probabilístico que mejora los resultados obtenidos en trabajos previos en los que se usaba de forma independiente la información instrumental y la información armónica de un fichero MIDI.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {inproceedings}
}
Pérez, A.; Ramírez, R.; Iñesta, J. M.
Modeling violin performances using inductive logic programming Journal Article
In: Intelligent Data Analysis, vol. 14, no. 5, pp. 573–585, 2010, ISSN: 1088-467X.
@article{k253,
title = {Modeling violin performances using inductive logic programming},
author = {A. Pérez and R. Ramírez and J. M. Iñesta},
issn = {1088-467X},
year = {2010},
date = {2010-09-01},
urldate = {2010-09-01},
journal = {Intelligent Data Analysis},
volume = {14},
number = {5},
pages = {573--585},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {article}
}
Lidy, T.; Mayer, R.; Rauber, A.; de León, P. J. Ponce; Pertusa, A.; Iñesta, J. M.
A Cartesian Ensemble of Feature Subspace Classifiers for Music Categorization Proceedings Article
In: Downie, J. Stephen; Veltkamp, Remco C. (Ed.): Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pp. 279-284, International Society for Music Information Retrieval, Utrecht, Netherlands, 2010, ISBN: 978-90-393-53813.
Abstract | Links | BibTeX | Tags: DRIMS
@inproceedings{k246,
title = {A Cartesian Ensemble of Feature Subspace Classifiers for Music Categorization},
author = {T. Lidy and R. Mayer and A. Rauber and P. J. Ponce de León and A. Pertusa and J. M. Iñesta},
editor = {J. Stephen Downie and Remco C. Veltkamp},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/246/ismir2010.pdf},
isbn = {978-90-393-53813},
year = {2010},
date = {2010-08-01},
urldate = {2010-08-01},
booktitle = {Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010)},
pages = {279-284},
publisher = {International Society for Music Information Retrieval},
address = {Utrecht, Netherlands},
organization = {International Society for Music Information Retrieval},
abstract = {We present a cartesian ensemble classification system that is based on the principle of late fusion and feature subspaces. These feature subspaces describe different aspects of the same data set. The framework is built on the Weka machine learning toolkit and able to combine arbitrary feature sets and learning schemes. In our scenario, we use it for the ensemble classification of multiple feature sets from the audio and symbolic domains. We present an extensive set of experiments in the context of music genre classification, based on numerous Music IR benchmark datasets, and evaluate a set of combination/voting rules. The results show that the approach is superior to the best choice of a single algorithm on a single feature set. Moreover, it also releases the user from making this choice explicitly.},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
Rizo, D.; Iñesta, J. M.
New partially labelled tree similarity measure: a case study Proceedings Article
In: Hancock, E. R.; Wilson, R. C.; Ilkay, T. W.; Escolano, F. (Ed.): Structural, Syntactic, and Statistical Pattern Recognition, pp. 296–305, Springer, Cesme, Turkey, 2010, ISBN: 978-3-642-14979-5.
@inproceedings{k248,
title = {New partially labelled tree similarity measure: a case study},
author = {D. Rizo and J. M. Iñesta},
editor = {E. R. Hancock and R. C. Wilson and T. W. Ilkay and F. Escolano},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/248/ssspr10-cr.pdf},
isbn = {978-3-642-14979-5},
year = {2010},
date = {2010-08-01},
booktitle = {Structural, Syntactic, and Statistical Pattern Recognition},
pages = {296--305},
publisher = {Springer},
address = {Cesme, Turkey},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
Pertusa, A.
Computationally efficient methods for polyphonic music transcription PhD Thesis
2010.
Abstract | Links | BibTeX | Tags: DRIMS, MIPRCV
@phdthesis{k244,
title = {Computationally efficient methods for polyphonic music transcription},
author = {A. Pertusa},
editor = {José M. Iñesta},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/244/pertusaphd.pdf},
year = {2010},
date = {2010-01-01},
organization = {Universidad de Alicante},
abstract = {Automatic music transcription is a music information retrieval (MIR) task which involves many different disciplines, such as audio signal processing, machine learning, computer science, psychoacoustics and music perception, music theory, and music cognition. The goal of automatic music transcription is to extract a human readable and interpretable representation, like a musical score, from an audio signal. To achieve this goal, it is necessary to estimate the pitches, onset times and durations of the notes, the tempo, the meter and the tonality of a musical piece.
The most obvious application of automatic music transcription is to help a musician to write down the music notation of a performance from an audio recording, which is a time consuming task when it is done by hand. Besides this application, automatic music transcription can also be useful for other MIR tasks, like plagiarism detection, artist identification, genre classification, and composition assistance by changing the instrumentation, the arrangement or the loudness before resynthesizing new pieces. In general, music transcription methods can also provide information about the notes to symbolic music algorithms.
This work addresses the automatic music transcription problem using different strategies. Novel efficient methods are proposed for onset detection (detection of the beginnings of musical events) and multiple fundamental frequency estimation (estimation of the pitches in a polyphonic mixture), using supervised learning and signal processing techniques.
The main contributions of this work can be summarized in the following points:
- An analytical and extensive review of the state of the art methods for onset detection and multiple fundamental frequency estimation.
- The development of an efficient approach for onset detection and the construction of a public ground-truth data set for this task.
- Two novel approaches for multiple pitch estimation of a priori known sounds using supervised learning methods. These algorithms were one of the first machine learning methods proposed for this task.
- A simple iterative cancellation approach, mainly intended to transcribe piano music at a low computational cost.
- Heuristic multiple fundamental frequency algorithms based on signal processing to analyze real music without any a priori knowledge. These methods, which are probably the main contribution of this work, experimentally reached the state of the art for this task with a very low computational burden.},
keywords = {DRIMS, MIPRCV},
pubstate = {published},
tppubtype = {phdthesis}
}
Iñesta, J. M.; Rizo, D.
Trees and combined methods for monophonic music similarity evaluation Proceedings Article
In: MIREX 2010 - Music Information Retrieval Evaluation eXchange, MIREX Symbolic Melodic Similarity contest, Utrecht, The Netherlands, 2010.
@inproceedings{k254,
title = {Trees and combined methods for monophonic music similarity evaluation},
author = {J. M. Iñesta and D. Rizo},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/254/trees.pdf},
year = {2010},
date = {2010-01-01},
urldate = {2010-01-01},
booktitle = {MIREX 2010 - Music Information Retrieval Evaluation eXchange, MIREX Symbolic Melodic Similarity contest},
address = {Utrecht, The Netherlands},
keywords = {DRIMS},
pubstate = {published},
tppubtype = {inproceedings}
}
2009
Calera-Rubio, J.; Bernabeu, J. F.
A probabilistic approach to melodic similarity Proceedings Article
In: Proceedings of MML 2009, pp. 48-53, 2009.
Abstract | Links | BibTeX | Tags: ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA
@inproceedings{k231,
title = {A probabilistic approach to melodic similarity},
author = {J. Calera-Rubio and J. F. Bernabeu},
url = {https://grfia.dlsi.ua.es/repositori/grfia/pubs/231/mml2009Bernabeu.pdf},
year = {2009},
date = {2009-01-01},
urldate = {2009-01-01},
booktitle = {Proceedings of MML 2009},
pages = {48-53},
abstract = {Melodic similarity is an important research topic in music information retrieval. The representation of symbolic music by means of trees has proven to be suitable in melodic similarity computation, because they are able to code rhythm in their structure, leaving only pitch representations as a degree of freedom for coding. In order to compare trees, different edit distances have been previously used. In this paper, stochastic k-testable tree-models, formerly used in other domains like structured document compression or natural language processing, have been used for computing a similarity measure between melody trees as a probability and their performance has been compared to a classical tree edit distance.},
keywords = {ARFAI, DRIMS, MIPRCV, PROSEMUS, TIASA},
pubstate = {published},
tppubtype = {inproceedings}
}