Table 5 WER (%) obtained with the reverberant part of the evaluation data set

From: Reverberant speech recognition exploiting clarity index estimation

 

| Method | Sim. R1 Near | Sim. R1 Far | Sim. R2 Near | Sim. R2 Far | Sim. R3 Near | Sim. R3 Far | Real R1 Near | Real R1 Far |
|---|---|---|---|---|---|---|---|---|
| Clean-cond. | 17.91 | 25.67 | 42.85 | 83.70 | 54.22 | 89.08 | 90.19 | 88.15 |
| Multi-cond. | 20.60 | 21.09 | 23.70 | 38.72 | 28.08 | 44.86 | 58.45 | 55.44 |
| NIRA-CART | | | | | | | | |
| Clean&Multi cond. | 18.67 | 21.59 | 23.83 | 38.72 | 28.15 | 44.86 | 58.45 | 55.44 |
| C50FV | 20.62 | 20.74 | 23.12 | 39.14 | 28.19 | 46.61 | 58.19 | 55.50 |
| C50HLDA | 18.38 | 19.99 | 21.34 | 37.03 | 27.44 | 42.55 | 55.92 | 54.09 |
| MS3 | 18.08 | 19.82 | 21.92 | 35.94 | 27.35 | 40.25 | 55.64 | 53.51 |
| MS3+C50HLDA | 17.16 | 19.40 | 20.60 | 32.67 | 25.37 | 36.32 | 53.53 | 51.49 |
| MS5 | 16.32 | 18.52 | 20.49 | 36.34 | 25.85 | 40.62 | 55.35 | 53.41 |
| MS5+C50HLDA | 16.44 | 17.93 | 19.91 | 32.51 | 24.45 | 37.62 | 53.66 | 51.65 |
| MS8 | 16.72 | 19.32 | 20.79 | 34.02 | 26.50 | 39.31 | 53.24 | 53.11 |
| MS8+C50HLDA | 15.72 | 18.26 | 19.79 | 30.76 | 24.16 | 35.85 | 52.06 | 50.03 |
| MS11 | 16.50 | 18.99 | 21.14 | 34.75 | 25.85 | 39.09 | 57.87 | 55.37 |
| MS11+C50HLDA | 16.10 | 17.79 | 19.95 | 31.58 | 23.90 | 36.21 | 54.77 | 51.49 |
| MS14 | 16.50 | 19.06 | 21.37 | 34.64 | 24.83 | 39.50 | 55.61 | 54.66 |
| MS14+C50HLDA | 15.88 | 17.93 | 19.73 | 30.78 | 22.39 | 35.86 | 52.67 | 51.96 |
| MS18 | 16.25 | 19.13 | 21.19 | 34.96 | 24.94 | 39.40 | 56.50 | 55.20 |
| MS18+C50HLDA | 15.64 | 18.23 | 19.79 | 31.15 | 22.83 | 36.15 | 53.78 | 52.46 |
| NIRA-BLSTM | | | | | | | | |
| Clean&Multi cond. | 18.01 | 21.08 | 23.70 | 38.72 | 28.08 | 44.86 | 58.45 | 55.44 |
| C50FV | 20.52 | 20.50 | 23.07 | 39.25 | 28.07 | 46.58 | 58.70 | 55.13 |
| C50HLDA | 18.40 | 19.69 | 21.31 | 37.16 | 27.18 | 42.83 | 55.35 | 53.78 |
| MS3 | 16.93 | 18.87 | 21.79 | 35.99 | 27.25 | 39.98 | 54.52 | 52.36 |
| MS3+C50HLDA | 15.96 | 18.18 | 20.05 | 32.38 | 25.00 | 35.93 | 53.37 | 51.76 |
| MS5 | 16.01 | 18.25 | 20.50 | 36.32 | 26.02 | 40.42 | 54.90 | 52.84 |
| MS5+C50HLDA | 16.16 | 17.15 | 19.50 | 32.67 | 24.07 | 37.25 | 53.40 | 50.74 |
| MS8 | 16.01 | 18.50 | 20.13 | 34.12 | 25.39 | 39.00 | 53.66 | 52.63 |
| MS8+C50HLDA | 15.66 | 17.01 | 19.50 | 30.65 | 23.22 | 35.68 | 52.22 | 49.70 |
| MS11 | 15.88 | 17.94 | 20.07 | 34.75 | 24.96 | 38.87 | 55.76 | 53.71 |
| MS11+C50HLDA | 14.79 | 16.79 | 18.85 | 31.50 | 22.85 | 36.09 | 53.27 | 50.57 |
| MS14 | 15.66 | 17.42 | 20.08 | 34.23 | 24.25 | 38.95 | 55.41 | 53.21 |
| MS14+C50HLDA | 14.35 | 17.40 | 18.48 | 30.95 | 22.61 | 35.32 | 52.48 | 51.82 |
| MS18 | 15.06 | 17.52 | 19.86 | 34.14 | 24.83 | 39.14 | 57.11 | 54.90 |
| MS18+C50HLDA | 14.81 | 16.66 | 18.93 | 31.02 | 22.49 | 35.93 | 54.07 | 51.22 |

  1. The first two rows correspond to the baseline methods; the remaining rows are the methods proposed in this work. R1, R2 and R3 denote rooms 1, 2 and 3, respectively. The best result in each column is shown in italics.
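
Because a lower WER is better, the best result in a column is simply the column-wise minimum. The sketch below illustrates this on three rows copied from the table (one baseline and the MS8+C50HLDA row for each estimator), so its output is the best among those three rows only, not the best of the full table. The names `COLUMNS`, `wer`, `best_per_column` and `relative_reduction` are hypothetical helpers introduced for illustration, and the relative reduction formula is a common way to compare WERs rather than a measure taken from the source.

```python
# Column labels, in the order used by the table (simulated rooms, then real room).
COLUMNS = [
    "Sim. R1 Near", "Sim. R1 Far",
    "Sim. R2 Near", "Sim. R2 Far",
    "Sim. R3 Near", "Sim. R3 Far",
    "Real R1 Near", "Real R1 Far",
]

# Illustrative subset of the table: WER (%) per column for three rows.
wer = {
    "Multi-cond.": [20.60, 21.09, 23.70, 38.72, 28.08, 44.86, 58.45, 55.44],
    "NIRA-CART / MS8+C50HLDA": [15.72, 18.26, 19.79, 30.76, 24.16, 35.85, 52.06, 50.03],
    "NIRA-BLSTM / MS8+C50HLDA": [15.66, 17.01, 19.50, 30.65, 23.22, 35.68, 52.22, 49.70],
}


def best_per_column(rows):
    """Return {column: (row_label, wer)} holding the lowest WER in each column."""
    best = {}
    for i, col in enumerate(COLUMNS):
        # Pick the row whose i-th value is smallest (lower WER is better).
        label = min(rows, key=lambda name: rows[name][i])
        best[col] = (label, rows[label][i])
    return best


def relative_reduction(baseline, value):
    """Relative WER reduction in percent, with respect to a baseline WER."""
    return 100.0 * (baseline - value) / baseline


for col, (label, value) in best_per_column(wer).items():
    gain = relative_reduction(wer["Multi-cond."][COLUMNS.index(col)], value)
    print(f"{col}: {label} ({value:.2f}%, {gain:.1f}% relative to Multi-cond.)")
```

Loading all rows of the table into `wer` in the same way would recover which entries the italics in the original table marked as best.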