Residential College | false |
Status | Published |
Graph-based lexicon regularization for PCFG with latent annotations | |
Zeng X.² | |
2015-03-01 | |
Source Publication | IEEE Transactions on Audio, Speech and Language Processing |
ISSN | 1558-7916 |
Volume | 23, Issue: 3, Pages: 441-450 |
Abstract | This paper aims to learn a better probabilistic context-free grammar with latent annotations (PCFG-LA) by using a graph propagation (GP) technique. We propose using GP to regularize the lexical model of the grammar. The proposed approach constructs $k$-nearest neighbor ($k$-NN) similarity graphs over words with identical pre-terminal (part-of-speech) tags and uses them to propagate the probabilities of latent annotations given the words. The graphs capture relationships between words at the syntactic and semantic levels, estimated with a neural word representation method based on the recursive autoencoder (RAE). We modify the conventional PCFG-LA parameter estimation algorithm, expectation maximization (EM), by adding a GP step after the M-step. GP encourages smoothness across the graph vertices, so that different words in similar syntactic and semantic environments have similar posterior distributions over nonterminal subcategories. The proposed PCFG-LA learning approach was evaluated together with a hierarchical split-and-merge training strategy on parsing tasks for English, Chinese and Portuguese. The empirical results reveal two main findings: 1) regularizing the lexicons with GP improves parsing accuracy; and 2) learning with unlabeled data can also expand the PCFG-LA lexicons. |
Keyword | Graph Propagation ; Natural Language Processing ; Neural Word Representation ; Syntax Parsing |
DOI | 10.1109/TASLP.2015.2389034 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Acoustics ; Engineering |
WOS Subject | Acoustics ; Engineering, Electrical & Electronic |
WOS ID | WOS:000350876100003 |
Scopus ID | 2-s2.0-84923935819 |
Document Type | Journal article |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Affiliation | 1. Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa ; 2. Universidade de Macau |
First Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Zeng X., Wong D.F., Chao L.S., et al. Graph-based lexicon regularization for PCFG with latent annotations[J]. IEEE Transactions on Audio, Speech and Language Processing, 2015, 23(3), 441-450. |
APA | Zeng X., Wong D.F., Chao L.S., & Trancoso I. (2015). Graph-based lexicon regularization for PCFG with latent annotations. IEEE Transactions on Audio, Speech and Language Processing, 23(3), 441-450. |
MLA | Zeng X., et al. "Graph-based lexicon regularization for PCFG with latent annotations." IEEE Transactions on Audio, Speech and Language Processing 23.3 (2015): 441-450. |
Files in This Item: | There are no files associated with this item. |
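
The abstract describes building $k$-NN similarity graphs over words that share a part-of-speech tag and then smoothing each word's distribution over latent subcategories across the graph after the M-step. The record does not give the paper's exact objective or update rule, so the following is only a minimal sketch under stated assumptions: cosine similarity over toy word vectors stands in for the RAE-based representation, and a generic label-propagation-style update (a convex mix of a word's own posterior and the weighted average of its neighbors) stands in for the GP step. The names `knn_graph`, `propagate`, `alpha`, and all data values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_graph(vectors, k):
    """Map each word to its k most similar words, with non-negative weights."""
    graph = {}
    for word, vec in vectors.items():
        sims = [(cosine(vec, other_vec), other)
                for other, other_vec in vectors.items() if other != word]
        sims.sort(key=lambda pair: (-pair[0], pair[1]))
        # Negative similarities are clipped to zero (an assumption of this sketch).
        graph[word] = [(other, max(sim, 0.0)) for sim, other in sims[:k]]
    return graph

def propagate(posteriors, graph, alpha=0.5, iterations=10):
    """Smooth per-word subcategory distributions over the graph.

    Each iteration mixes a word's original posterior with the weighted
    average of its neighbors' current distributions. Every update is a
    convex combination of distributions, so each result still sums to 1.
    """
    current = {word: list(p) for word, p in posteriors.items()}
    for _ in range(iterations):
        updated = {}
        for word, prior in posteriors.items():
            neighbors = graph.get(word, [])
            total = sum(weight for _, weight in neighbors)
            if total == 0.0:
                # No usable neighbors: keep the unsmoothed estimate.
                updated[word] = list(prior)
                continue
            average = [
                sum(weight * current[other][z] for other, weight in neighbors) / total
                for z in range(len(prior))
            ]
            updated[word] = [(1 - alpha) * p + alpha * a
                             for p, a in zip(prior, average)]
        current = updated
    return current

# Toy data: three words with the same tag, two latent subcategories.
vectors = {"cat": [1.0, 0.1], "dog": [0.9, 0.2], "stock": [0.1, 1.0]}
posteriors = {"cat": [0.9, 0.1], "dog": [0.5, 0.5], "stock": [0.1, 0.9]}

graph = knn_graph(vectors, k=1)
smoothed = propagate(posteriors, graph, alpha=0.5, iterations=10)
for word in sorted(smoothed):
    print(word, [round(p, 3) for p in smoothed[word]])
```

In this sketch `alpha` controls how strongly the graph pulls a word toward its neighbors: `alpha=0` returns the unsmoothed estimates, while larger values trade fidelity to the EM estimate for smoothness. In a full training loop the smoothed distributions would feed back into the lexical model before the next E-step, which is not shown here.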