Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

Feldbauer, Christian; Kubin, Gernot; Kleijn, W. Bastiaan

doi:10.1155/ASP.2005.1334

Research Article
Open access
Published: 21 June 2005

Christian Feldbauer¹,
Gernot Kubin¹ &
W. Bastiaan Kleijn²

EURASIP Journal on Advances in Signal Processing volume 2005, Article number: 571618 (2005)

Abstract

Auditory modeling is a well-established methodology that provides insight into human perception and facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is transformed back into the acoustic domain. The transformation into the auditory domain converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.
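
The coding chain described in the abstract (analysis, quantization, decoding, inversion) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' model: an orthonormal DCT stands in for the auditory filterbank, a power law with the hypothetical exponent ALPHA stands in for the auditory compression, and a uniform scalar quantizer with a hypothetical step size stands in for the coder. The function names are likewise invented here. The point of the sketch is only that an invertible mapping lets a simple criterion (bounded absolute error) be applied in the transformed domain.

```python
import math

ALPHA = 0.3  # hypothetical compression exponent, chosen only for illustration


def dct_matrix(n):
    """Orthonormal DCT-II matrix, a stand-in for an auditory filterbank."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (m + 0.5) * k / n)
                     for m in range(n)])
    return rows


def analyze(x):
    """Toy 'auditory model': linear transform, then invertible compression."""
    c = dct_matrix(len(x))
    y = [sum(ckm * xm for ckm, xm in zip(row, x)) for row in c]
    return [math.copysign(abs(v) ** ALPHA, v) for v in y]


def synthesize(z):
    """Inverse of analyze: expand, then apply the transposed transform."""
    n = len(z)
    c = dct_matrix(n)
    y = [math.copysign(abs(v) ** (1.0 / ALPHA), v) for v in z]
    return [sum(c[k][m] * y[k] for k in range(n)) for m in range(n)]


def quantize(z, step):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in z]


def dequantize(indices, step):
    """Map integer indices back to reconstruction values."""
    return [i * step for i in indices]


# Encode and decode one short frame.
x = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
step = 0.05
z = analyze(x)                      # acoustic -> toy auditory domain
indices = quantize(z, step)         # what a coder would transmit
z_hat = dequantize(indices, step)   # decoder side
x_hat = synthesize(z_hat)           # toy auditory -> acoustic domain

# The error in the transformed domain is bounded by half a step.
max_domain_error = max(abs(a - b) for a, b in zip(z, z_hat))
```

In this sketch the quantizer never looks at the acoustic signal; the distortion is controlled entirely in the transformed domain, and the inverse mapping carries the result back to a waveform.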

Author information

Authors and Affiliations

Signal Processing and Speech Communication Laboratory, Graz University of Technology, Graz, 8010, Austria
Christian Feldbauer & Gernot Kubin
Department for Signals, Sensors and Systems, KTH (Royal Institute of Technology), Stockholm, 10044, Sweden
W. Bastiaan Kleijn

Corresponding author

Correspondence to Christian Feldbauer.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Feldbauer, C., Kubin, G. & Kleijn, W.B. Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach. EURASIP J. Adv. Signal Process. 2005, 571618 (2005). https://doi.org/10.1155/ASP.2005.1334

Received: 14 November 2003
Revised: 25 August 2004
Published: 21 June 2005
DOI: https://doi.org/10.1155/ASP.2005.1334
