  • Research Article
  • Open access

Permutation Correction in the Frequency Domain in Blind Separation of Speech Mixtures

Abstract

This paper presents a method for blind separation of convolutive mixtures of speech signals, based on the joint diagonalization of the time-varying spectral matrices of the observation records. The main and still largely open problem in a frequency-domain approach is the permutation ambiguity: separation is carried out independently in each frequency bin, so the order of the recovered sources may differ from one bin to the next. In an earlier paper, the authors exploited the continuity of the frequency response of the unmixing filters, but that approach leaves some permutation jumps along the frequency axis. This paper therefore proposes a new method based on two assumptions. First, the frequency continuity of the unmixing filters is still used, in the initialization of the diagonalization algorithm. Second, the paper introduces a criterion based on the time-frequency representations of the sources, which are assumed to vary smoothly with frequency. This hypothesis of continuity of the time variation of the source energy is exploited over a sliding frequency band, which allows the remaining permutation jumps to be detected. The method is compared with other approaches, and results on real-world recordings demonstrate the superior performance of the proposed algorithm.
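As a rough illustration of the second assumption, the sketch below aligns the source order across frequency bins by comparing the time profile of the energy of each separated output with the average profile of the already-aligned bins in a sliding band just below it. This is not the authors' algorithm, only a minimal example of the general idea; the function names, the `band` parameter, the use of centered log-energy profiles, and the brute-force search over permutations (practical only for a few sources) are all assumptions made for the example.

```python
import itertools
import math


def centered_log_profile(energy_row):
    """Log-energy over time frames, with its temporal mean removed."""
    logs = [math.log(e + 1e-12) for e in energy_row]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


def align_permutations(energy, band=5):
    """Pick, bin by bin, the source order that best matches the bins below.

    energy[f][k][t] is the energy of separated output k at frequency bin f
    and time frame t (hypothetical layout chosen for this sketch).
    Returns perms, where perms[f][k] is the index of the output at bin f
    that is assigned to source slot k. Bin 0 defines the reference order.
    """
    n_src = len(energy[0])
    candidates = list(itertools.permutations(range(n_src)))
    profiles = [[centered_log_profile(row) for row in rows] for rows in energy]

    aligned = [profiles[0]]
    perms = [tuple(range(n_src))]
    for f in range(1, len(profiles)):
        # Sliding band: the (at most) `band` previously aligned bins.
        window = aligned[max(0, f - band):f]
        n_frames = len(window[0][0])
        # Reference time profile of each source: average over the band.
        ref = [
            [sum(b[k][t] for b in window) / len(window) for t in range(n_frames)]
            for k in range(n_src)
        ]

        def score(p):
            # Sum of inner products between reference and permuted profiles.
            return sum(
                sum(r * x for r, x in zip(ref[k], profiles[f][p[k]]))
                for k in range(n_src)
            )

        best = max(candidates, key=score)
        perms.append(best)
        aligned.append([profiles[f][i] for i in best])
    return perms
```

Averaging over a band rather than comparing only adjacent bins is meant to make a single noisy bin less likely to propagate a wrong order to all higher frequencies; the band width trades that robustness against the assumption that the energy profiles stay similar across the band.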


Author information

Corresponding author

Correspondence to Ch Servière.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Servière, C., Pham, D. Permutation Correction in the Frequency Domain in Blind Separation of Speech Mixtures. EURASIP J. Adv. Signal Process. 2006, 075206 (2006). https://doi.org/10.1155/ASP/2006/75206

  • DOI: https://doi.org/10.1155/ASP/2006/75206
