ENIAMsubsyntax Version 1.1
--------------------------

ENIAMsubsyntax is a library that
- performs tokenization, lemmatization, part of speech tagging;
- detects multiword expressions (MWE) and abbreviations;
- splits text into sentences.

Install
-------

ENIAMsubsyntax requires the OCaml compiler version 4.02.3
together with the Xlib library version 3.2 or later,
the ENIAMtokenizer library version 1.1 and the ENIAMmorphology library version 1.1.

To install, type:

make install

By default, ENIAMsubsyntax is installed in the 'ocamlc -where'/eniam directory.
You can change this by editing the Makefile.
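
The default location can be checked from the shell before installing.
A minimal sketch, assuming ocamlc is on the PATH; the variable name
install_dir is only illustrative and is not used by the Makefile:

```shell
# 'ocamlc -where' prints the OCaml standard library directory;
# the default install target is its eniam subdirectory.
if command -v ocamlc >/dev/null 2>&1; then
  install_dir="$(ocamlc -where)/eniam"
  echo "Default install directory: $install_dir"
else
  install_dir=""
  echo "ocamlc not found on PATH; install the OCaml compiler first"
fi
```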

To test the library, type:

make test
./test

To compile a command-line interface to the library, type:

make interface

Running ./interface --help lists the command-line options.

Both test and interface require Graphviz to be installed.

By default, ENIAMsubsyntax looks for resources in the /usr/share/eniam directory.
This behaviour can be changed by setting and exporting the ENIAM_RESOURCE_PATH
environment variable.
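
For example, in a POSIX shell the variable can be set and exported as
sketched below. The path /opt/eniam/resources is a hypothetical location
chosen for illustration; substitute the directory that actually holds
your copy of the resources.

```shell
# Hypothetical resource directory; replace with your own location.
ENIAM_RESOURCE_PATH=/opt/eniam/resources
export ENIAM_RESOURCE_PATH

# Programs started from this shell (for example ./test or ./interface)
# now inherit the variable.
echo "ENIAM_RESOURCE_PATH is set to: $ENIAM_RESOURCE_PATH"
```

The export only affects the current shell session and the processes
started from it.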

Credits
-------
Copyright © 2016 Wojciech Jaworski <wjaworski atSPAMfree mimuw dot edu dot pl>
Copyright © 2016 Institute of Computer Science, Polish Academy of Sciences

The library uses the following licensed resources:

NKJP1M, version 1.2: the manually annotated 1-million word subcorpus sampled
from texts of a subset of the National Corpus of Polish.

Licence
-------

This library is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.