Audio Replay Spoof Attack Detection by Joint Segment-Based Linear Filter Bank Feature Extraction and Attention-Enhanced DenseNet-BiLSTM Network

doi:10.1109/TASLP.2020.2998870


Residential College	false
Status	Published
Title	Audio Replay Spoof Attack Detection by Joint Segment-Based Linear Filter Bank Feature Extraction and Attention-Enhanced DenseNet-BiLSTM Network
Author	Huang, Lian; Pun, Chi Man
Publication Date	2020-06
Source Publication	IEEE/ACM Transactions on Audio Speech and Language Processing
ISSN	2329-9290
Volume	28
Pages	1813-1825
Abstract	Most automatic speaker verification (ASV) systems are vulnerable to various spoofing attacks. In recent years, many methods have been proposed for detecting spoofing attacks in ASV, and significant progress has been made. However, current methods have shown little improvement in replay spoof attack detection because they lack a model well suited to replay detection. To address this issue, we propose a novel model based on an attention-enhanced DenseNet-BiLSTM network and segment-based linear filter bank features. First, silent segments are selected from each speech signal by using the short-term zero-crossing rate and energy. If the silent segments together contain only a very limited amount of data, the decaying tails are selected instead. Second, linear filter bank features are extracted from the selected segments in the relatively high-frequency domain. Finally, an attention-enhanced DenseNet-BiLSTM architecture that can avoid overfitting is built. To validate this model, we used two datasets, BTAS2016 and ASVspoof2017. Experiments show that the attention-enhanced DenseNet-BiLSTM model with the segment-based linear filter bank feature achieves the best performance. Compared with the baseline system based on constant Q cepstral coefficients and a Gaussian mixture model (GMM), the proposed model yields relative improvements of 91.68% and 74.04% on the two datasets, respectively.
Keyword	Attack Detection ; Attention-enhanced DenseNet-BiLSTM Network ; Linear Filter Bank Feature ; Replay Spoof
DOI	10.1109/TASLP.2020.2998870
Indexed By	SCIE
Language	English
WOS Research Area	Acoustics ; Engineering
WOS Subject	Acoustics ; Engineering, Electrical & Electronic
WOS ID	WOS:000543714200005
Scopus ID	2-s2.0-85087498997
Document Type	Journal article
Collection	DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author	Pun,Chi Man
Affiliation	Department of Computer and Information Science, University of Macau, 999078, Macao
First Author Affiliation	University of Macau
Corresponding Author Affiliation	University of Macau
Recommended Citation GB/T 7714	Huang, Lian, Pun, Chi Man. Audio Replay Spoof Attack Detection by Joint Segment-Based Linear Filter Bank Feature Extraction and Attention-Enhanced DenseNet-BiLSTM Network[J]. IEEE/ACM Transactions on Audio Speech and Language Processing, 2020, 28, 1813-1825.
APA	Huang, Lian, & Pun, Chi Man (2020). Audio Replay Spoof Attack Detection by Joint Segment-Based Linear Filter Bank Feature Extraction and Attention-Enhanced DenseNet-BiLSTM Network. IEEE/ACM Transactions on Audio Speech and Language Processing, 28, 1813-1825.
MLA	Huang, Lian, et al. "Audio Replay Spoof Attack Detection by Joint Segment-Based Linear Filter Bank Feature Extraction and Attention-Enhanced DenseNet-BiLSTM Network." IEEE/ACM Transactions on Audio Speech and Language Processing 28 (2020): 1813-1825.
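
Illustrative sketch (not from the paper): the abstract states that silent segments are selected using the short-term zero-crossing rate and energy. The pure-Python sketch below shows one plausible way such a frame-level selection could look. The frame length, hop size, both thresholds, the "low energy and low zero-crossing rate" decision rule, and all function names are assumptions made for illustration; the record does not give the authors' actual parameters or rule, and the fallback to decaying tails is not covered here.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def select_silent_frames(samples, frame_len=400, hop=160,
                         energy_thr=1e-4, zcr_thr=0.3):
    """Return indices of frames judged silent.

    Assumed rule (hypothetical): a frame is silent when both its energy and
    its zero-crossing rate fall below fixed thresholds.
    """
    silent = []
    for idx, frame in enumerate(frame_signal(samples, frame_len, hop)):
        if (short_term_energy(frame) < energy_thr
                and zero_crossing_rate(frame) < zcr_thr):
            silent.append(idx)
    return silent


def demo_signal(rate=16000):
    """Synthetic test signal: 0.1 s of silence, 0.1 s of a 440 Hz tone, 0.1 s of silence."""
    n = rate // 10
    tone = [0.5 * math.sin(2 * math.pi * 440 * k / rate) for k in range(n)]
    return [0.0] * n + tone + [0.0] * n
```

Calling select_silent_frames(demo_signal()) returns the indices of the frames that lie entirely inside the two silent stretches. On real recordings the thresholds would need tuning, for example relative to the noise floor of each utterance.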