ENIAMsubsyntax Version 1.1:
-----------------------

ENIAMsubsyntax is a library that
- performs tokenization, lemmatization, and part-of-speech tagging;
- detects multiword expressions (MWEs) and abbreviations;
- recognizes named entities;
- splits text into sentences.

Install
-------

ENIAMsubsyntax requires the OCaml compiler, version 4.02.3,
together with the Xlib library, version 3.2 or later,
the ENIAMtokenizer library, version 1.1, and the ENIAMmorphology library, version 1.1.

To install, type:

make install

By default, ENIAMsubsyntax is installed in the 'ocamlc -where'/eniam directory.
You can change this by editing the Makefile.
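
As a rough check, the default location can be printed from the shell before
installing. The sketch below assumes that ocamlc is on the PATH;
ENIAM_INSTALL_DIR is only an illustrative variable name.

```shell
# Compute the default install directory, if the OCaml compiler is available.
if command -v ocamlc >/dev/null 2>&1; then
  # 'ocamlc -where' prints the OCaml standard library directory.
  ENIAM_INSTALL_DIR="$(ocamlc -where)/eniam"
else
  ENIAM_INSTALL_DIR=""
  echo "ocamlc not found on PATH" >&2
fi
echo "$ENIAM_INSTALL_DIR"
```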

To test the library, type:
make test
./test

To compile a command-line interface to the library, type:
make interface

Running ./interface --help prints information on the command-line options.

Both test and interface require Graphviz to be installed.

By default, ENIAMsubsyntax looks for resources in the /usr/share/eniam directory.
However, this behaviour may be changed by setting and exporting the ENIAM_RESOURCE_PATH
environment variable.
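
For example, in a POSIX shell the variable could be set as follows. The path
/opt/eniam/resources is a hypothetical location used only for illustration,
and the final echo merely mirrors the fallback described above; it is not the
library's own lookup code.

```shell
# Hypothetical resource location; replace it with your own directory.
ENIAM_RESOURCE_PATH=/opt/eniam/resources
export ENIAM_RESOURCE_PATH

# Print the directory in effect, falling back to the documented default
# when the variable is unset or empty.
echo "${ENIAM_RESOURCE_PATH:-/usr/share/eniam}"
```

Programs started from the same shell session afterwards inherit the exported
value.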

Credits
-------
Copyright © 2016 Wojciech Jaworski <wjaworski atSPAMfree mimuw dot edu dot pl>
Copyright © 2016 Institute of Computer Science Polish Academy of Sciences

The library uses the following licensed resources:

NKJP1M: the manually annotated 1-million word subcorpus sampled
from texts of a subset of the National Corpus of Polish.
version 1.2

SGJP: Grammatical Dictionary of Polish, version 20151020
Copyright © 2007–2015 Zygmunt Saloni, Włodzimierz Gruszczyński, Marcin
Woliński, Robert Wołosz, Danuta Skowrońska

Licence
-------

This library is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.