Residential College | false |
Status | Published |
Robust log-energy estimation and its dynamic change enhancement for in-car speech recognition | |
Li Weifeng (3); Wang Longbiao (1); Zhou Yicong (2); Bourlard Hervé (4); Liao Qingmin (3) | |
2013-05-22 | |
Source Publication | IEEE Transactions on Audio, Speech and Language Processing |
ISSN | 1558-7916 |
Volume | 21, Issue: 8, Pages: 1689-1698 |
Abstract | The log-energy parameter, typically derived from a full-band spectrum, is a critical feature commonly used in automatic speech recognition (ASR) systems. However, log-energy is difficult to estimate reliably in the presence of background noise. In this paper, we theoretically show that background noise affects the trajectories of not only the 'conventional' log-energy, but also its delta parameters. This results in a poor estimation of the actual log-energy and its delta parameters, which no longer describe the speech signal. We thus propose a new method to estimate log-energy from a sub-band spectrum, followed by dynamic change enhancement and mean smoothing. We demonstrate the effectiveness of the proposed log-energy estimation and its post-processing steps through speech recognition experiments conducted on the in-car CENSREC-2 database. The proposed log-energy (together with its corresponding delta parameters) yields an average improvement of 32.8% compared with the baseline front-ends. Moreover, it is also shown that further improvement can be achieved by incorporating the new Mel-Frequency Cepstral Coefficients (MFCCs) obtained by non-linear spectral contrast stretching. |
Keyword | Dynamic Change Enhancement ; In-car Speech Recognition ; Log-energy ; Mel-filterbank (MFB) ; Mel-frequency Cepstral Coefficients (MFCCs) |
DOI | 10.1109/TASL.2013.2260151 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Acoustics ; Engineering |
WOS Subject | Acoustics ; Engineering, Electrical & Electronic |
WOS ID | WOS:000319020800004 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 |
Scopus ID | 2-s2.0-84877861629 |
Document Type | Journal article |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE ; Faculty of Science and Technology |
Corresponding Author | Zhou Yicong |
Affiliation | 1. Nagaoka University of Technology, Nagaoka, Japan; 2. Department of Computer and Information Science, University of Macau, Macau, China; 3. Shenzhen Key Laboratory of Information Science and Technology, Department of Electronic Engineering/Graduate School at Shenzhen, Tsinghua University, Shenzhen, China; 4. Idiap Research Institute, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland |
Corresponding Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Li Weifeng, Wang Longbiao, Zhou Yicong, et al. Robust log-energy estimation and its dynamic change enhancement for in-car speech recognition[J]. IEEE Transactions on Audio, Speech and Language Processing, 2013, 21(8): 1689-1698. |
APA | Li Weifeng, Wang Longbiao, Zhou Yicong, Bourlard Hervé, & Liao Qingmin (2013). Robust log-energy estimation and its dynamic change enhancement for in-car speech recognition. IEEE Transactions on Audio, Speech and Language Processing, 21(8), 1689-1698. |
MLA | Li Weifeng, et al. "Robust log-energy estimation and its dynamic change enhancement for in-car speech recognition." IEEE Transactions on Audio, Speech and Language Processing 21.8 (2013): 1689-1698. |
Files in This Item: | There are no files associated with this item. |
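The abstract contrasts the conventional full-band log-energy and its delta parameters with an estimate taken from a sub-band spectrum. As a rough illustration only, the sketch below computes per-frame log-energy from the FFT power spectrum, optionally restricted to a frequency band, and the standard regression-based delta. It is not the authors' method: the function names, frame sizes, sampling rate, and band limits are assumptions chosen for the example, and the paper's dynamic change enhancement and mean smoothing steps are not reproduced here.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (no padding).
    frame_len and hop are illustrative (25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_energy(frames, band=None, sample_rate=16000, floor=1e-10):
    """Per-frame log-energy from the power spectrum.

    band=None sums all FFT bins (full-band energy).
    band=(lo_hz, hi_hz) sums only the bins in that range; the band
    limits are a hypothetical choice, not taken from the paper.
    """
    window = np.hamming(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    if band is not None:
        freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
        mask = (freqs >= band[0]) & (freqs <= band[1])
        power = power[:, mask]
    # Floor the energy so silent frames do not produce log(0).
    return np.log(np.maximum(power.sum(axis=1), floor))

def delta(feat, width=2):
    """Regression-based delta over +/- width frames, edge-padded."""
    n = len(feat)
    padded = np.pad(feat, (width, width), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, width + 1))
    out = np.zeros(n, dtype=float)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n]
        behind = padded[width - k : width - k + n]
        out += k * (ahead - behind)
    return out / denom

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.standard_normal(16000)  # one second of synthetic noise
    frames = frame_signal(noisy)
    full = log_energy(frames)
    sub = log_energy(frames, band=(300.0, 3400.0))
    print(full.shape, sub.shape, delta(sub).shape)
```

Restricting the sum to a band can only lower the energy relative to the full-band value, which is why a band that excludes noise-dominated frequencies may give a cleaner trajectory; which band suits in-car noise is a question the paper addresses and this sketch does not.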