
Diff for "Nerf"

Differences between revisions 7 and 11 (spanning 4 versions)
Revision 7 as of 2013-01-23 12:25:44
Size: 2105
Comment:
Revision 11 as of 2013-02-01 10:36:17
Size: 2015
Comment:
Line 2: Line 2:
= NERF =
= Nerf =
Line 4: Line 4:
Nerf is a statistical tool for Named Entity Recognition (NER) based on the Conditional Random Fields (CRF) modelling method. The tool was built as part of the National Corpus of Polish project. It has been adapted to recognize tree-like structures of NEs (i.e., with recursively embedded NEs) using the Joined Label Tagging (JLT) method, which encodes NE structures as a sequence of labels. With this method, additional categorical information about NEs (type, subtype, type of derivation) can be encoded at the level of labels and subsequently recognized using the resulting CRF model. The tool can be configured to use various types of observations during training and recognition, for example lexical information from the textual level or grammatical information from the morphosyntactic level.
Nerf is a statistical named entity recognition tool based on linear-chain conditional random fields.
Line 6: Line 6:
== Licence ==

The Nerf tool is released under the [[http://www.gnu.org/licenses/gpl.html|GNU General Public License v3]] and by downloading the Nerf package you accept the conditions of that licence.

'''Authors:''' Jakub Waszczuk <<MailTo(jakub DOT waszczuk AT SPAMFREE ipipan DOT waw DOT pl)>>,
Michał Lenart <<MailTo(michal DOT lenart AT SPAMFREE ipipan DOT waw DOT pl)>> <<BR>>
'''License:''' GPL v.3
'''Principal developer:'''
[[http://zil.ipipan.waw.pl/JakubWaszczuk|Jakub Waszczuk]] <<BR>>
'''License:''' 2-clause BSD
Line 15: Line 11:
Readme file of the current version (in English): [[attachment:NERF.pdf]]
See the [[https://github.com/kawu/nerf/blob/master/README.md#nerf|README]] file from the development repository.
Line 19: Line 16:
You can download the current distribution package from [[attachment:nerf.dist.0.2.tgz|here]]. The package consists of three components:
Nerf is available as a software distribution which can be downloaded from [[http://hackage.haskell.org/package/nerf|Hackage]] using the [[http://www.haskell.org/cabal/|Cabal]] tool. To compile Nerf you will also need the [[http://www.haskell.org/ghc/|Glasgow Haskell Compiler]] (GHC). The simplest way to get both Cabal and GHC is to install the [[http://www.haskell.org/platform/|Haskell Platform]]. Please see the documentation for more information about the installation process.

=== Pre-trained model ===

A [[attachment:model.bin|model]] for the Polish language has been trained on the [[http://clip.ipipan.waw.pl/LRT?action=AttachFile&do=view&target=NKJP-PodkorpusMilionowy-1.1.tgz|manually annotated subcorpus]] of the [[http://nkjp.pl/index.php?page=0&lang=1|National Corpus of Polish (NCP)]]. It can be used to recognize embedded structures of named entities consistent with the type hierarchy used in NCP.

== Python version ==

'''The Python version of Nerf is no longer supported.'''

'''Authors:'''
[[http://zil.ipipan.waw.pl/JakubWaszczuk|Jakub Waszczuk]],
[[http://zil.ipipan.waw.pl/MichalLenart|Michał Lenart]] <<BR>>
'''License:''' GPL v.3

Readme file of the Python version (in English): [[attachment:NERF.pdf]]

You can download the obsolete distribution package from [[attachment:nerf.dist.0.2.tgz|here]]. The package consists of three components:
Line 23: Line 37:

You can also download the newest versions of both the pycrf library and the Nerf tool directly from the repositories:
 * svn co svn://chopin.ipipan.waw.pl/nkjp/pycrf/trunk pycrf
 * svn co svn://chopin.ipipan.waw.pl/nkjp/ner/trunk ner

If you have any problems with installing or using the tool, please send a report to waszczuk.kuba@gmail.com.

Nerf

Nerf is a statistical named entity recognition tool based on linear-chain conditional random fields.

Principal developer: Jakub Waszczuk
License: 2-clause BSD

Documentation

See the README file from the development repository.

Downloads

Nerf is available as a software distribution which can be downloaded from Hackage using the Cabal tool. To compile Nerf you will also need the Glasgow Haskell Compiler (GHC). The simplest way to get both Cabal and GHC is to install the Haskell Platform. Please see the documentation for more information about the installation process.
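
The Hackage route described above might look like the following minimal sketch. It assumes that cabal and ghc are already on the PATH (for example via the Haskell Platform) and that the package is published on Hackage under the name nerf. The guard and its message are illustrative only, and whether the build succeeds depends on the installed GHC version, so the README remains the authority on installation.

```shell
#!/bin/sh
# Package name as published on Hackage (assumed to be "nerf").
pkg=nerf

# Both tools are needed: Cabal fetches and builds, GHC compiles.
if command -v cabal >/dev/null 2>&1 && command -v ghc >/dev/null 2>&1; then
    cabal update          # refresh the local Hackage package index
    cabal install "$pkg"  # download, compile and install the package
else
    # Illustrative message; not part of any official installer.
    echo "cabal and/or ghc not found; install the Haskell Platform first" >&2
fi
```

If either tool is missing, the script only prints a hint and installs nothing.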

Pre-trained model

A model for the Polish language has been trained on the manually annotated subcorpus of the National Corpus of Polish (NCP). It can be used to recognize embedded structures of named entities consistent with the type hierarchy used in NCP.

Python version

The Python version of Nerf is no longer supported.

Authors: Jakub Waszczuk, Michał Lenart
License: GPL v.3

Readme file of the Python version (in English): NERF.pdf

You can download the obsolete distribution package from here. The package consists of three components:

  • The Python pycrf library, which has to be installed before the Nerf tool can be used,
  • The Nerf tool itself,
  • Supplementary data: trained models and examples of configuration files.