Commit 9020263248075415d44207bc8d1c51b3ee96a6a2

Authored by Bartłomiej Nitoń
1 parent 50f3afb6

Update README.

Showing 1 changed file with 10 additions and 6 deletions
README.md
1 1 # The Polish Parliamentary Corpus / Korpus Dyskursu Parlamentarnego #
2 2  
3   -The *[Polish Parliamentary Corpus (PPC)](http://clip.ipipan.waw.pl/PPC)* is a large collection of linguistically analysed documents from the proceedings of *Polish Parliament*, [Sejm](http://opis.sejm.gov.pl/en/) and [Senate](http://www.senat.gov.pl/en/). It is based on the [Polish Sejm Corpus](http://clip.ipipan.waw.pl/PSC) co-funded by project [CESAR](http://clip.ipipan.waw.pl/CESAR) and is currently being updated by [CLARIN-PL](http://clip.ipipan.waw.pl/CLARIN-PL-3) infrastructure.
  3 +The *[Polish Parliamentary Corpus (PPC)](http://clip.ipipan.waw.pl/PPC)* is a large collection of linguistically analysed documents from the proceedings of the *Polish Parliament*, [Sejm](http://opis.sejm.gov.pl/en/) and [Senate](http://www.senat.gov.pl/en/). It is based on the [Polish Sejm Corpus](http://clip.ipipan.waw.pl/PSC) co-funded by project [CESAR](http://clip.ipipan.waw.pl/CESAR) and is currently being updated by [CLARIN-PL](http://clip.ipipan.waw.pl/CLARIN-PL-3) infrastructure.
4 4  
5 5 ## Corpus data ##
6 6  
7   -The current size of the corpus amounts over 700M segments. Apart from the stenographic records of plenary sittings and committee sittings, the corpus contains also interpellations and questions.
  7 +The current size of the corpus amounts to over 700M segments. Apart from the stenographic records of plenary sittings and committee sittings, the corpus also contains interpellations and questions.
8 8  
9 9 Corpus files are made available in *XML TEI P5* format compatible with the annotation used by the [National Corpus of Polish](http://nkjp.pl/index.php?page=0&lang=1). This repository contains *Unannotated TEI version* of the corpora. For annotated version please go to the [PPC homepage](http://clip.ipipan.waw.pl/PPC).
10 10  
@@ -19,15 +19,19 @@ The parliamentary data is public domain. The corpus annotations are available on
19 19  
20 20 ## Publications ##
21 21  
22   -[Maciej Ogrodniczuk and Bartłomiej Nitoń. *New developments in the Polish Parliamentary Corpus*. In Darja Fišer, Maria Eskevich, and Franciska de Jong, editors, Proceedings of the Second ParlaCLARIN Workshop, pages 1–4, Marseille, France, 2020. European Language Resources Association (ELRA).](https://www.aclweb.org/anthology/2020.parlaclarin-1.1.pdf)
  22 + * [Maciej Ogrodniczuk and Bartłomiej Nitoń. *New developments in the Polish Parliamentary Corpus*. In Darja Fišer, Maria Eskevich, and Franciska de Jong, editors, Proceedings of the Second ParlaCLARIN Workshop, pages 1–4, Marseille, France, 2020. European Language Resources Association (ELRA).](https://www.aclweb.org/anthology/2020.parlaclarin-1.1.pdf)
23 23  
24 24  
25   -[Maciej Ogrodniczuk. *Polish Parliamentary Corpus*. In Darja Fišer, Maria Eskevich, and Franciska de Jong, editors, Proceedings of the LREC 2018 Workshop ParlaCLARIN: Creating and Using Parliamentary Corpora, pages 15–19, Paris, France, 2018. European Language Resources Association (ELRA).](http://lrec-conf.org/workshops/lrec2018/W2/pdf/11_W2.pdf)
  25 + * [Maciej Ogrodniczuk. *Polish Parliamentary Corpus*. In Darja Fišer, Maria Eskevich, and Franciska de Jong, editors, Proceedings of the LREC 2018 Workshop ParlaCLARIN: Creating and Using Parliamentary Corpora, pages 15–19, Paris, France, 2018. European Language Resources Association (ELRA).](http://lrec-conf.org/workshops/lrec2018/W2/pdf/11_W2.pdf)
26 26  
27 27  
28   -[Maciej Ogrodniczuk. *The Polish Sejm Corpus*. In Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, pages 2219–2223, Istanbul, Turkey, 2012. European Language Resources Association (ELRA).](http://www.lrec-conf.org/proceedings/lrec2012/pdf/653_Paper.pdf)
  28 + * [Maciej Ogrodniczuk. *The Polish Sejm Corpus*. In Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, pages 2219–2223, Istanbul, Turkey, 2012. European Language Resources Association (ELRA).](http://www.lrec-conf.org/proceedings/lrec2012/pdf/653_Paper.pdf)
  29 +
  30 +## See also ##
  31 +
  32 +* [The slides](https://www.clarin.eu/sites/default/files/2-ogrodniczuk.pdf) from [CLARIN-PLUS Workshop "Working with Parliamentary Records"](https://www.clarin.eu/event/2017/clarin-plus-workshop-working-parliamentary-records). Sofia, 27–29 March 2017.
  33 +* [ParlaMint](https://www.clarin.eu/content/parlamint-towards-comparable-parliamentary-corpora) project reusing data from the Polish Parliamentary Corpus in a multilingual setting.
29 34  
30   -Please see also [the slides](https://www.clarin.eu/sites/default/files/2-ogrodniczuk.pdf) from [CLARIN-PLUS Workshop "Working with Parliamentary Records"](https://www.clarin.eu/event/2017/clarin-plus-workshop-working-parliamentary-records). Sofia, 27–29 March 2017.
31 35  
32 36 ## Contact ##
33 37  