NERF
Nerf is a statistical tool for Named Entity Recognition (NER) based on the Conditional Random Fields (CRF) modelling method. The tool was built as part of the National Corpus of Polish project. It has been adapted to recognize tree-like structures of NEs (i.e., with recursively embedded NEs) using the Joined Label Tagging (JLT) method, a simple way of encoding NE structures as a sequence of labels. With this method, additional categorical information about NEs (type, subtype, type of derivation) can be encoded at the level of labels and subsequently recognized by the resulting CRF model. The tool can be configured to use various types of observations during training and recognition, for example lexical information from the textual level or grammatical information from the morphosyntactic level.
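The idea of encoding nested NEs as one label per token can be illustrated with a minimal Python sketch. It is not Nerf's actual implementation or label format: the function name encode_joined_labels, the B-/I- prefixes, the "#" separator and the example entity types are all assumptions made for illustration. The sketch only shows how a tree of embedded NEs might be flattened into a label sequence that a sequence model such as a CRF could then predict.

```python
def encode_joined_labels(n_tokens, entities, sep="#"):
    """Return one joined label per token (hypothetical encoding, not Nerf's).

    entities: list of (start, end, type) spans with end exclusive.
    Spans may be nested inside one another.
    """
    # Put outer entities first: earlier start, then longer span.
    ordered = sorted(entities, key=lambda e: (e[0], -(e[1] - e[0])))
    labels = []
    for i in range(n_tokens):
        parts = []
        for start, end, etype in ordered:
            if start <= i < end:
                # "B" marks the first token of an entity, "I" the rest.
                prefix = "B" if i == start else "I"
                parts.append(f"{prefix}-{etype}")
        # Tokens outside every entity get the "O" label.
        labels.append(sep.join(parts) if parts else "O")
    return labels


# Illustrative input: an organisation name that contains a place name.
tokens = ["University", "of", "Warsaw", "opened"]
entities = [(0, 3, "orgName"), (2, 3, "placeName")]
labels = encode_joined_labels(len(tokens), entities)
# labels == ["B-orgName", "I-orgName", "I-orgName#B-placeName", "O"]
```

In a scheme like this, further categorical information such as a subtype could be carried in the type string itself, so that it becomes part of the label the model learns to predict.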
Licence
The Nerf tool is released under the GNU General Public License v3 and by downloading the Nerf package you accept the conditions of that licence.
Authors: Jakub Waszczuk <jakub DOT waszczuk AT SPAMFREE ipipan DOT waw DOT pl>, Michał Lenart <michal DOT lenart AT SPAMFREE ipipan DOT waw DOT pl>
License: GPL v.3
Documentation
The readme file for the current version, in English: NERF.pdf
Downloads
You can download the current distribution package from here. The package consists of three components:
- The Python pycrf library, which has to be installed before the Nerf tool can be used,
- The Nerf tool itself,
- Supplementary data: trained models and examples of configuration files.
You can also download the newest versions of both the pycrf library and the Nerf tool directly from their repositories:
- svn co svn://chopin.ipipan.waw.pl/nkjp/pycrf/trunk pycrf
- svn co svn://chopin.ipipan.waw.pl/nkjp/ner/trunk ner
If you have any problems installing or using the tool, please send a report to waszczuk.kuba@gmail.com.