<?xml version="1.0" encoding="utf-8"?><!DOCTYPE article  PUBLIC '-//OASIS//DTD DocBook XML V4.4//EN'  'http://www.docbook.org/xml/4.4/docbookx.dtd'><article><articleinfo><title>NKJP</title><revhistory><revision><revnumber>6</revnumber><date>2012-01-03 13:37:12</date><authorinitials>MichalLenart</authorinitials></revision><revision><revnumber>5</revnumber><date>2011-03-18 11:52:06</date><authorinitials>AdamPrzepiorkowski</authorinitials></revision><revision><revnumber>4</revnumber><date>2011-03-18 11:48:40</date><authorinitials>AdamPrzepiorkowski</authorinitials></revision><revision><revnumber>3</revnumber><date>2011-03-07 12:57:00</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision><revision><revnumber>2</revnumber><date>2011-03-07 12:55:13</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision><revision><revnumber>1</revnumber><date>2011-03-07 12:52:38</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision></revhistory></articleinfo><section><title>NKJP project</title><section><title>Project factsheet</title><informaltable><tgroup cols="2"><colspec colname="col_0"/><colspec colname="col_1"/><tbody><row rowsep="1"><entry colsep="1" rowsep="1"><para> English name:         </para></entry><entry colsep="1" rowsep="1"><para> National Corpus of Polish </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Polish name:          </para></entry><entry colsep="1" rowsep="1"><para> Narodowy Korpus Języka Polskiego </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Project type:         </para></entry><entry colsep="1" rowsep="1"><para> A national <ulink url="http://www.eng.nauka.gov.pl/meinen/">Ministry of Science and Higher Education</ulink> research/development grant (number R17 003 03) </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Duration:             </para></entry><entry colsep="1" rowsep="1"><para> 13 December 2007 ‒ 12 December 2010 (plus a 6-month extension) 
</para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Project Web page:     </para></entry><entry colsep="1" rowsep="1"><para> <ulink url="http://nkjp.pl/"/> </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Principal investigator: </para></entry><entry colsep="1" rowsep="1"><para> Adam Przepiórkowski </para></entry></row></tbody></tgroup></informaltable></section><section><title>Project description</title><para>The National Corpus of Polish is a joint initiative of four institutions: the Institute of Computer Science at the Polish Academy of Sciences (coordinator), the Institute of Polish Language at the Polish Academy of Sciences, Polish Scientific Publishers PWN, and the Department of Computational and Corpus Linguistics at the University of Łódź. It is registered as a research and development project of the Ministry of Science and Higher Education. </para><para>These four institutions have begun cooperating to build a reference corpus of the Polish language containing hundreds of millions of words. The corpus, which will appear on this site soon, will be searchable with advanced tools that analyse Polish inflection and sentence structure. </para><para>The sources for the corpus include classic literature, daily newspapers, specialist periodicals and journals, transcripts of conversations, and a variety of short-lived and internet texts. For a corpus to be reliable, it must not only contain a large number of words but also cover a diversity of subjects and genres. The conversations should represent both male and female speakers of various age groups from various regions of Poland. </para></section></section></article>