  • Research Article
  • Open access

Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations

Abstract

One major goal of structural analysis of an audio recording is to automatically extract the repetitive structure or, more generally, the musical form of the underlying piece of music. Recent approaches to this problem work well for music in which the repetitions largely agree with respect to instrumentation and tempo, as is typically the case for popular music. For other classes of music such as Western classical music, however, musically similar audio segments may exhibit significant variations in parameters such as dynamics, timbre, execution of note groups, modulation, articulation, and tempo progression. In this paper, we propose a robust and efficient algorithm for audio structure analysis that identifies musically similar segments even in the presence of large variations in these parameters. To account for such variations, our main idea is to incorporate invariance at various levels simultaneously: we design a new type of statistical feature to absorb microvariations, introduce an enhanced local distance measure to account for local variations, and describe a new strategy for structure extraction that can cope with global variations. Our experimental results with classical and popular music show that our algorithm performs successfully even in the presence of significant musical variations.


Author information


Corresponding author

Correspondence to Meinard Müller.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Müller, M., Kurth, F. Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations. EURASIP J. Adv. Signal Process. 2007, 089686 (2006). https://doi.org/10.1155/2007/89686


  • DOI: https://doi.org/10.1155/2007/89686
