Revision 4 as of 2015-09-12 08:46:36
Krzaki (bushes)
A corpus of Polish manually annotated for dependency structures. It consists of ~20000 sentences, the same set as used in Składnica (http://zil.ipipan.waw.pl/Sk%C5%82adnica), but annotated independently of Składnica by a team of about 10 linguists, who produced unlabelled dependency structures.
Each sentence was first annotated by two or three annotators. If there was at least one discrepancy between their annotations, a superannotator decided on the final tree. The superannotator also maintained the shared annotation manual and responded to inquiries from all the linguists.
10617 sentences did not require superannotation, whereas 9395 did.
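The agreement check described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `needs_superannotation` and the representation of a tree as a list of head indices (1-based, with 0 marking the root) are hypothetical and do not describe the tooling actually used by the annotation team.

```python
from typing import Sequence


def needs_superannotation(annotations: Sequence[Sequence[int]]) -> bool:
    """Return True if the annotators' head lists differ in at least one position.

    Each annotation is assumed (hypothetically) to be a list of head indices,
    one per segment, 1-based, with 0 marking the root.
    """
    first = list(annotations[0])
    return any(list(other) != first for other in annotations[1:])


# Hypothetical three-segment sentence annotated by two and by three annotators.
agree = [[2, 0, 2], [2, 0, 2]]
disagree = [[2, 0, 2], [2, 0, 2], [3, 0, 2]]

print(needs_superannotation(agree))     # identical trees
print(needs_superannotation(disagree))  # one annotator chose a different head
```

Under this reading, a single differing head is enough to send the sentence to the superannotator.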
This treebank specifies only segment-head links, without labelling their functions. Unlike Składnica (which contains only sentences that could be parsed by Świgra), this treebank was created manually, from a representative set of sentences drawn from the subcorpus of NKJP that was manually disambiguated for morphosyntax.
The corpus is distributed in CoNLL format.
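A minimal sketch of reading segment-head links from such a file is given below. It assumes the common tab-separated CoNLL-X column layout (ID in the first column, FORM in the second, HEAD in the seventh) and blank lines between sentences; the page does not state which CoNLL variant is used, so the column indices are parameters. The function name `read_conll_heads` and the sample sentence are hypothetical.

```python
from typing import List, Tuple

Token = Tuple[int, str, int]  # (segment id, form, head id; 0 = root)


def read_conll_heads(text: str, id_col: int = 0, form_col: int = 1,
                     head_col: int = 6) -> List[List[Token]]:
    """Parse tab-separated CoNLL-style text into per-sentence token lists.

    The default column indices follow the CoNLL-X layout, which is an
    assumption about the distributed files, not a documented fact.
    """
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        cols = line.split("\t")
        current.append((int(cols[id_col]), cols[form_col], int(cols[head_col])))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample, not taken from the corpus.
sample = (
    "1\tAla\t_\t_\t_\t_\t2\t_\t_\t_\n"
    "2\tma\t_\t_\t_\t_\t0\t_\t_\t_\n"
    "3\tkota\t_\t_\t_\t_\t2\t_\t_\t_\n"
    "\n"
)
sentences = read_conll_heads(sample)
for seg_id, form, head in sentences[0]:
    print(seg_id, form, head)
```

Because the treebank is unlabelled, only the head column is read; the dependency-relation column is ignored.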
