Revision 15 as of 2017-03-17 14:01:14
Nerf
Nerf is a statistical named entity recognition tool based on linear-chain conditional random fields.
Principal developer: Jakub Waszczuk
License: GPL v.3
Documentation
See the README file from the development repository.
Downloads
Nerf is distributed as a software package that can be downloaded from Hackage (http://hackage.haskell.org/package/nerf) using the Cabal tool (http://www.haskell.org/cabal/). To compile Nerf you will also need the Glasgow Haskell Compiler (GHC, http://www.haskell.org/ghc/). The simplest way to get both Cabal and GHC is to install the Haskell Platform (http://www.haskell.org/platform/). Please see the documentation for more information about the installation process.
Pre-trained model
A model for the Polish language has been trained on the manually annotated subcorpus of the National Corpus of Polish (NCP). An archive file with the model is attached to this page as model-0.3-4.0.zip. The model can be used to recognize embedded structures of named entities consistent with the type hierarchy used in NCP.
Python version
The Python version of Nerf is no longer supported.
Authors: Jakub Waszczuk, Michał Lenart
License: GPL v.3
Readme file of the Python version, in English: NERF.pdf
The obsolete distribution package remains available for download. It consists of three components:
- Python pycrf library, which has to be installed before the Nerf tool can be used,
- The Nerf tool itself,
- Supplementary data: trained models and examples of configuration files.