|
Accession number: |
|
075110984085 |
|
|
Title: |
|
Applying machine learning to Chinese entity detection and tracking |
|
|
Authors: |
|
Qian, Donglei; Li, Wenjie; Yuan, Chunfa; Lu, Qin; Wu, Mingli |
|
|
Author affiliation: |
|
Department of Computing, Hong Kong Polytechnic University, Hong Kong, Hong Kong |
|
|
Serial title: |
|
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|
|
Abbreviated serial title: |
|
Lect. Notes Comput. Sci. |
|
|
Volume: |
|
v 4394 LNCS |
|
|
Monograph title: |
|
Computational Linguistics and Intelligent Text Processing - 8th International Conference, CICLing 2007, Proceedings |
|
|
Publication year: |
|
2007 |
|
|
Pages: |
|
p 154-165 |
|
|
Language: |
|
English |
|
|
ISSN: |
|
0302-9743 |
|
|
Document type: |
|
Conference article (CA) |
|
|
Conference name: |
|
8th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2007 |
|
|
Conference date: |
|
Feb 18-24 2007 |
|
|
Conference location: |
|
Mexico City, Mexico |
|
|
Conference code: |
|
70754 |
|
|
Publisher: |
|
Springer Verlag, Heidelberg, D-69121, Germany |
|
|
Abstract: |
|
This paper presents a Chinese entity detection and tracking system that takes advantage of character-based models and machine learning approaches. An entity is defined here as a link of all its mentions in the text, together with the associated attributes. Entity mentions of different types normally exhibit quite different linguistic patterns. Six separate Conditional Random Fields (CRF) models that incorporate character N-gram and word knowledge features are built to detect the extent and the head of three types of mentions, namely named, nominal and pronominal mentions. For each type of mention, attributes are identified by Support Vector Machine (SVM) classifiers, which take mention heads and their context as classification features. Mentions can then be merged into a unified entity representation by examining their attributes and connections in a rule-based coreference resolution process. The system is evaluated on the ACE 2005 corpus and achieves competitive results. © Springer-Verlag Berlin Heidelberg 2007. |
|
|
Number of references: |
|
18 |
|
|
Ei main heading: |
|
Support vector machines |
|
|
Ei controlled terms: |
|
Classification (of information) - Feature extraction - Knowledge acquisition - Target tracking |
|
|
Uncontrolled terms: |
|
Conditional Random Fields (CRF) - Knowledge features - Chinese entity detection |
|
|
Ei classification codes: |
|
716.1 Information Theory and Signal Processing - 716.2 Radar Systems and Equipment - 723 Computer Software, Data Handling and Applications - 723.4 Artificial Intelligence - 723.5 Computer Applications - 903.1 Information Sources and Analysis |
|
|
Treatment: |
|
Theoretical (THR) |
|
|
Database: |
|
Compendex |
|
|
|
|
Compilation and indexing terms, © 2008 Elsevier Inc. |
|