Residential College | false |
Status | Published |
Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array | |
Li Weifeng1; Wang L.3; Zhou Yicong4; Dines J.2; Magimai-Doss M.2; Bourlard H.2; Liao Q.1 | |
2014-12-01 | |
Source Publication | IEEE/ACM Transactions on Audio Speech and Language Processing |
ISSN | 2329-9290 |
Volume | 22, Issue: 12, Pages: 2244-2255 |
Abstract | This paper introduces a nonlinear vector-based feature mapping approach to extract robust features for automatic speech recognition (ASR) of overlapping speech using a microphone array. We explore different configurations and additional sources of information to improve the effectiveness of the feature mapping. First, we investigate the full-vector-based mapping of different sources in the log mel-filterbank energy (log MFBE) domain, and demonstrate that retraining the acoustic model using the generated training data can help improve recognition performance. Then, we investigate the feature mapping between different domains. Finally, in order to improve the quality of the mapping inputs, we propose a nonlinear mapping of the features from multiple beamformed sources, which are directed at the target and interfering speakers, respectively. We demonstrate the effectiveness of the proposed approach through extensive evaluations on the MONC corpus, which includes non-overlapping single-speaker and overlapping multi-speaker conditions. |
Keyword | Beamforming; Microphone Array; Neural Network; Speech Recognition; Speech Separation |
DOI | 10.1109/TASLP.2014.2364130 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Acoustics ; Engineering |
WOS Subject | Acoustics ; Engineering, Electrical & Electronic |
WOS ID | WOS:000362410800001 |
Scopus ID | 2-s2.0-84921727137 |
Document Type | Journal article |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE; Faculty of Science and Technology |
Affiliation | 1. Shenzhen Engineering Lab. of IS and DRM; 2. Institut Dalle Molle D'intelligence Artificielle Perceptive; 3. Nagaoka University of Technology; 4. Universidade de Macau |
Recommended Citation GB/T 7714 | Li Weifeng, Wang L., Zhou Yicong, et al. Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array[J]. IEEE/ACM Transactions on Audio Speech and Language Processing, 2014, 22(12), 2244-2255. |
APA | Li Weifeng, Wang L., Zhou Yicong, Dines J., Magimai-Doss M., Bourlard H., & Liao Q. (2014). Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array. IEEE/ACM Transactions on Audio Speech and Language Processing, 22(12), 2244-2255. |
MLA | Li Weifeng, et al. "Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array." IEEE/ACM Transactions on Audio Speech and Language Processing 22.12 (2014): 2244-2255. |