Table 4 WER (%) comparisons on RealData for distant-talking conditions with single-channel speech input

From: Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition

System           Room1    Room2    Room3    Avg
Single-channel systems
Multi-1          18.30    27.07    36.46    27.28
Multi-2          16.56    25.59    33.71    25.29
DNN-JT1          15.43    24.30    31.10    23.61
DNN-JT2          14.74    23.77    30.20    22.90
DNN-JT3          16.50    25.69    33.30    25.16
DNN-JT4          15.04    23.64    29.84    22.84
Multi-2(10HL)    16.90    27.04    34.89    26.28

  1. Room1 is a living room, Room2 is a conference room, and Room3 is a classroom
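
The Avg column of Table 4 is consistent with an unweighted arithmetic mean of the three room WERs, rounded to two decimals. This is an inference from the numbers, not something the table states. A minimal Python sketch that checks it follows; the names `wer`, `reported_avg`, and `mean_wer` are illustrative and do not come from the source.

```python
# WER (%) per system, copied from Table 4; tuple order is Room1, Room2, Room3.
wer = {
    "Multi-1": (18.30, 27.07, 36.46),
    "Multi-2": (16.56, 25.59, 33.71),
    "DNN-JT1": (15.43, 24.30, 31.10),
    "DNN-JT2": (14.74, 23.77, 30.20),
    "DNN-JT3": (16.50, 25.69, 33.30),
    "DNN-JT4": (15.04, 23.64, 29.84),
    "Multi-2(10HL)": (16.90, 27.04, 34.89),
}

# Avg column as reported in Table 4.
reported_avg = {
    "Multi-1": 27.28,
    "Multi-2": 25.29,
    "DNN-JT1": 23.61,
    "DNN-JT2": 22.90,
    "DNN-JT3": 25.16,
    "DNN-JT4": 22.84,
    "Multi-2(10HL)": 26.28,
}


def mean_wer(rooms):
    """Unweighted mean of the per-room WERs, rounded to two decimals.

    Assumption: every room counts equally; the source does not say
    whether the average is weighted by the number of test words.
    """
    return round(sum(rooms) / len(rooms), 2)


if __name__ == "__main__":
    for system, rooms in wer.items():
        computed = mean_wer(rooms)
        print(f"{system:14s} computed={computed:.2f} reported={reported_avg[system]:.2f}")
```

If the published averages were instead weighted by the number of words per room, the computed and reported values would generally differ; here they agree for every row.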