Residential College: false
Status: Published
Graph-based lexicon regularization for PCFG with latent annotations
Zeng X. (2); Wong D.F. (2); Chao L.S. (2); Trancoso I. (1)
2015-03-01
Source Publication: IEEE Transactions on Audio, Speech and Language Processing
ISSN: 1558-7916
Volume: 23, Issue: 3, Pages: 441-450
Abstract

This paper aims to learn a better probabilistic context-free grammar with latent annotations (PCFG-LA) by using a graph propagation (GP) technique. We propose leveraging GP to regularize the lexical model of the grammar. The proposed approach constructs $k$-nearest neighbor ($k$-NN) similarity graphs over words with identical pre-terminal (part-of-speech) tags, in order to propagate the probabilities of latent annotations given the words. The graphs capture the relationships between words at the syntactic and semantic levels, estimated using a neural word representation method based on a recursive autoencoder (RAE). We modify the conventional PCFG-LA parameter estimation algorithm, expectation maximization (EM), by incorporating a GP step after the M-step. GP encourages smoothness among the graph vertices, so that different words occurring in similar syntactic and semantic environments have similar posterior distributions over nonterminal subcategories. The proposed PCFG-LA learning approach was evaluated together with a hierarchical split-and-merge training strategy on parsing tasks for English, Chinese and Portuguese. The empirical results reveal two crucial findings: 1) regularizing the lexicons with GP has a positive effect on parsing accuracy; and 2) learning with unlabeled data can also expand the PCFG-LA lexicons.
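The two ingredients named in the abstract, a $k$-NN similarity graph over words sharing a tag and a propagation step that smooths per-word annotation posteriors, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`knn_graph`, `propagate`), the cosine similarity, the toy two-dimensional word vectors (the paper uses RAE-based representations), the weight `mu`, and the update rule itself, which is a generic label-propagation style average and not the objective used in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_graph(vectors, k):
    """Map each word to its k most similar words (same tag assumed) with edge weights."""
    graph = {}
    for word, vec in vectors.items():
        sims = [(other, cosine(vec, ovec))
                for other, ovec in vectors.items() if other != word]
        sims.sort(key=lambda pair: pair[1], reverse=True)
        # Clip negative similarities so every edge weight is non-negative.
        graph[word] = [(other, max(sim, 0.0)) for other, sim in sims[:k]]
    return graph

def propagate(posteriors, graph, mu=0.5, iters=10):
    """Smooth p(annotation | word) toward the graph neighbors (hypothetical update rule)."""
    q = {word: list(p) for word, p in posteriors.items()}
    for _ in range(iters):
        new_q = {}
        for word, p in posteriors.items():
            nbrs = graph.get(word, [])
            total = sum(weight for _, weight in nbrs)
            # Weighted average of the word's own estimate and its neighbors' current estimates.
            new_q[word] = [
                (p[z] + mu * sum(weight * q[other][z] for other, weight in nbrs))
                / (1.0 + mu * total)
                for z in range(len(p))
            ]
        q = new_q
    return q

# Toy data: three words with the same tag, two latent annotations each.
vectors = {"cat": [1.0, 0.1], "dog": [0.9, 0.2], "idea": [0.1, 1.0]}
posteriors = {"cat": [0.9, 0.1], "dog": [0.5, 0.5], "idea": [0.1, 0.9]}

graph = knn_graph(vectors, k=1)
smoothed = propagate(posteriors, graph)
```

In an EM loop of the kind the abstract describes, a step like `propagate` would run after each M-step on the lexical posteriors. Because each update is a convex combination of normalized distributions, the smoothed values remain valid probability distributions.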

Keyword: Graph Propagation; Natural Language Processing; Neural Word Representation; Syntax Parsing
DOI: 10.1109/TASLP.2015.2389034
Indexed By: SCIE
Language: English
WOS Research Area: Acoustics; Engineering
WOS Subject: Acoustics; Engineering, Electrical & Electronic
WOS ID: WOS:000350876100003
Scopus ID: 2-s2.0-84923935819
Document Type: Journal article
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation: 1. Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
2. Universidade de Macau
First Author Affiliation: University of Macau
Recommended Citation
GB/T 7714: Zeng X., Wong D.F., Chao L.S., et al. Graph-based lexicon regularization for PCFG with latent annotations[J]. IEEE Transactions on Audio, Speech and Language Processing, 2015, 23(3): 441-450.
APA: Zeng, X., Wong, D. F., Chao, L. S., & Trancoso, I. (2015). Graph-based lexicon regularization for PCFG with latent annotations. IEEE Transactions on Audio, Speech and Language Processing, 23(3), 441-450.
MLA: Zeng X., et al. "Graph-based lexicon regularization for PCFG with latent annotations." IEEE Transactions on Audio, Speech and Language Processing 23.3 (2015): 441-450.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.