A Discriminative Model for Polyphonic Piano Transcription
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 048317 (2006)
Abstract
We present a discriminative model for polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances. The classifier outputs are temporally constrained via hidden Markov models, and the proposed system is used to transcribe both synthesized and real piano recordings. A frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.
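The temporal-constraint step can be illustrated with a minimal sketch. The abstract does not give the HMM topology or parameters, so everything below is an assumption made for illustration: one independent two-state (off/on) HMM per note, a hand-set self-transition probability `p_stay`, and classifier outputs that have already been mapped to per-frame probabilities of the note being active. The function name `viterbi_smooth` and its parameters are hypothetical.

```python
import numpy as np

def viterbi_smooth(posteriors, p_stay=0.9, prior_on=0.5):
    """Decode the most likely off/on sequence for ONE note.

    posteriors: per-frame P(note active) from a frame-level classifier
                (assumed already calibrated to [0, 1]).
    p_stay:     assumed probability of staying in the same state.
    prior_on:   assumed probability that the note is active in frame 0.
    """
    post = np.clip(np.asarray(posteriors, dtype=float), 1e-6, 1 - 1e-6)
    # Column 0 scores the "off" state, column 1 the "on" state.
    log_emit = np.log(np.stack([1.0 - post, post], axis=1))
    # log_trans[i, j] is the log-probability of moving from state i to j.
    log_trans = np.log(np.array([[p_stay, 1.0 - p_stay],
                                 [1.0 - p_stay, p_stay]]))
    log_prior = np.log(np.array([1.0 - prior_on, prior_on]))

    n_frames = len(post)
    score = np.empty((n_frames, 2))
    back = np.zeros((n_frames, 2), dtype=int)
    score[0] = log_prior + log_emit[0]
    for t in range(1, n_frames):
        cand = score[t - 1][:, None] + log_trans   # cand[i, j]: i -> j
        back[t] = np.argmax(cand, axis=0)          # best predecessor of j
        score[t] = cand.max(axis=0) + log_emit[t]

    # Trace the best path backwards from the final frame.
    path = np.empty(n_frames, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(n_frames - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# A brief dip inside a sustained note is bridged by the decoded path.
noisy = [0.05, 0.1, 0.95, 0.4, 0.95, 0.9, 0.05, 0.05]
smoothed = viterbi_smooth(noisy)
```

With a high `p_stay`, each state change is costly, so a single low-confidence frame inside a sustained note, or a single spurious spike in silence, tends to be overridden by its neighbours. In a full system this decoding would be run once per note, with transition probabilities estimated from training data rather than set by hand.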
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Poliner, G.E., Ellis, D.P.W. A Discriminative Model for Polyphonic Piano Transcription. EURASIP J. Adv. Signal Process. 2007, 048317 (2006). https://doi.org/10.1155/2007/48317
DOI: https://doi.org/10.1155/2007/48317