Information Extraction
High Performance Named Entity Recognizer in Newswire Domain
It outperforms all reported systems on the MUC-6 and MUC-7 data, with F-scores of 96.9% and 94.3% respectively, while using less than one fourth and one ninth of the training data required by previous learning methods. For more details, see "Named Entity Recognition Using an HMM-based Chunk Tagger", pp. 473-480, ACL'02.
Named Entity Recognizer in Biomedical Domain
It is trained on the GENIA corpus, version 3.0. It achieves an overall F-measure of 71.2 (P=72.7, R=69.8) across 22 entity classes, and performs considerably better on some large classes, e.g. Protein (F=77.8), MultiCell (F=78.1), CellType (F=81.8) and BodyPart (F=80).
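The P, R and F figures quoted on this page can be related with a short calculation. The sketch below assumes that the reported F-measure is the balanced (beta = 1) harmonic mean of precision and recall, which the page does not state explicitly; the function name `f_measure` is purely illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    Assumption: beta = 1 (balanced F-measure); the page does not say
    which weighting was used for the reported scores.
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Overall biomedical NER figures quoted above: P=72.7, R=69.8
overall = f_measure(72.7, 69.8)
print(round(overall, 1))  # prints 71.2
```

Under that assumption, the computed value matches the overall F-measure of 71.2 reported for the GENIA-trained recognizer.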
Gene/Protein Name Recognizer as used in BioCreative 2003
Coming soon! The I2R named entity recognition system (usr11) achieves the best performance on Task 1A (closed division), recognizing protein/gene names in biology technical papers, with an F-score of 82.6 at BioCreative (Critical Assessment of Information Extraction systems in Biology): http://www.pdg.cnb.uam.es/BioLINK/workshop_BioCreative_04/
High Precision Coreference Resolution
More than 10% higher precision than the previous best reported performance, with the same or even better recall: R=65.8, P=84.7, F=73.9 on MUC-6; R=55.7, P=82.8, F=66.5 on MUC-7.
Details in: Zhou GuoDong and Su Jian. A high-performance coreference resolution system using a multi-agent strategy. To appear in the Proceedings of the 20th International Conference on Computational Linguistics (COLING'2004). Aug 23-27, 2004, Geneva, Switzerland.
Coreference Resolution in Biomedical Domain (NEW)
It is a supervised learning-based approach that explores the relationships between NPs and coreferential clusters. R=84.9, P=78.9, F=81.8.
Details in: Xiaofeng Yang, Jian Su, Guodong Zhou and Chew Lim Tan. An NP-Cluster Based Approach to Coreference Resolution. To appear in the Proceedings of the 20th International Conference on Computational Linguistics (COLING'2004). Aug 23-27, 2004, Geneva, Switzerland.
Relation Extraction in Newswire Domain (Illustration) (NEW)