Status: Published
Integration of named entity information for Chinese word segmentation based on maximum entropy
Leong K.S.; Wong F.; Li Y.; Dong M.C.
2008-11-27
Conference Name: 4th International Conference on Intelligent Computing
Source Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5226 LNCS
Pages: 962-969
Conference Date: September 15-18, 2008
Conference Place: Shanghai, People's Republic of China
Abstract

Word segmentation is an essential step in Chinese information processing. Although related research has been reported and has made progress, the Unknown Named Entity (UNE) problem in segmentation has not been fully solved, and this usually degrades overall segmentation accuracy. This paper presents a model that identifies UNEs in order to improve the overall performance of segmentation. To capture NE information, the functions of characters or words are defined with tags. In addition, useful surrounding contexts are collected from a corpus and used as features. The model is built on Maximum Entropy and treats UNE identification as a tagging problem. Experiments show that the overall accuracy of segmentation improves after the UNE identification module is integrated into the word segmenter. © 2008 Springer-Verlag Berlin Heidelberg.

DOI: 10.1007/978-3-540-87442-3_118
Language: English
WOS ID: WOS:000259555200118
Scopus ID: 2-s2.0-56549095784
Document Type: Conference paper
Collection: University of Macau
Affiliation: Universidade de Macau
First Author Affiliation: University of Macau
Recommended Citation
GB/T 7714
Leong K.S., Wong F., Li Y., et al. Integration of named entity information for Chinese word segmentation based on maximum entropy[C], 2008, 962-969.
APA
Leong K.S., Wong F., Li Y., & Dong M.C. (2008). Integration of named entity information for Chinese word segmentation based on maximum entropy. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5226 LNCS, 962-969.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.