Open Access

Probabilistic Aspects in Spoken Document Retrieval

  • Wolfgang Macherey1,
  • Hans Jörg Viechtbauer1 and
  • Hermann Ney1
EURASIP Journal on Advances in Signal Processing 2003, 2003:863836

DOI: 10.1155/S1110865703210088

Received: 8 April 2002

Published: 25 February 2003

Abstract

Accessing information in multimedia databases encompasses a wide range of applications in which spoken document retrieval (SDR) plays an important role. In SDR, a set of automatically transcribed speech documents constitutes the files for retrieval, to which a user may address a request in natural language. This paper deals with two probabilistic aspects of SDR. The first part investigates the effect of recognition errors on retrieval performance and examines why recognition errors have only a small effect on it. In the second part, we present a new probabilistic approach to SDR that is based on interpolations between document representations. Experiments performed on the TREC-7 and TREC-8 SDR tasks show that the newly proposed method achieves results comparable to, or even better than, those of other advanced heuristic and probabilistic retrieval metrics.

Keywords

spoken document retrieval; error analysis; probabilistic retrieval metrics

Authors’ Affiliations

(1) Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen, University of Technology

Copyright

© 2003 Hindawi Publishing Corporation