Maca can be installed using the instructions at nlp.pwr.wroc.pl/redmine/projects/libpltagger/wiki/InstallOnUbuntu11 with the following exceptions:
- The package libboost-all-dev can be installed in place of libboost1.42-all-dev, as the latter might be unavailable.
- It might also be necessary to install ncurses using
  sudo apt-get install libncurses5-dev
- Instead of adding Bartosz Zaborowski's repository, one can download Morfeusz directly from
  ppa.launchpad.net/bartosz-zaborowski/nlp/ubuntu/pool/main/m/morfeusz-sgjp/morfeusz-sgjp_0.81-1~precise_amd64.deb
- Before installing SFST, in lines 445 and 449 of maca/third_party/SFST-1.2/SFST/src/fst.C the identifier
  it
  must be replaced with
  iter
  (or any other non-conflicting identifier).
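
The SFST identifier rename can also be scripted rather than done by hand. The following is a minimal sketch, assuming GNU sed; rename_identifier is a hypothetical helper, not part of Maca or SFST, and the line numbers only hold for the SFST 1.2 sources shipped in third_party.

```shell
# Sketch only: rename a whole-word identifier on selected lines of a file.
# Assumes GNU sed (for \b and -i); rename_identifier is a hypothetical helper.
rename_identifier() {
    local file=$1 old=$2 new=$3
    shift 3
    local line
    for line in "$@"; do
        # Restrict the substitution to one line number; \b keeps longer
        # words such as "iterator" or "item" untouched.
        sed -i "${line}s/\\b${old}\\b/${new}/g" "$file"
    done
}

# Intended use, from the directory that contains maca/.
# Inspect the two lines first to confirm they are the ones meant:
#   sed -n '445p;449p' maca/third_party/SFST-1.2/SFST/src/fst.C
#   rename_identifier maca/third_party/SFST-1.2/SFST/src/fst.C it iter 445 449
```

Running the helper a second time changes nothing, because iter no longer matches the whole word it.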
Additionally, with newer versions of Boost, the following changes become necessary:
- before installing Corpus 2
  - in line 25 of corpus2/learn_to_guess/main.cpp the type
    ifstream
    must be replaced with
    std::ifstream
- before installing Toki
  - in lines 223-224 of toki/tests/compare.cpp the call
    boost::unit_test::make_test_case(boost::bind(test_one_item, ci), name)
    must be replaced with
    boost::unit_test::make_test_case(boost::bind(test_one_item, ci), name, __FILE__, __LINE__)
  - in line 105 of toki/tests/srx.cpp
    BOOST_MESSAGE
    must be replaced with
    BOOST_TEST_MESSAGE
- before installing Maca
  - in line 100 of maca/tests/compareconv.cpp
    BOOST_MESSAGE
    must be replaced with
    BOOST_TEST_MESSAGE
  - in lines 207-208 of maca/tests/compareconv.cpp the call
    boost::unit_test::make_test_case(boost::bind(test_one_item, ci), name)
    must be replaced with
    boost::unit_test::make_test_case(boost::bind(test_one_item, ci), name, __FILE__, __LINE__)
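
The Boost-related edits listed above could be applied in one pass with a script along these lines. This is a sketch under several assumptions: GNU sed is available, the corpus2, toki and maca checkouts sit side by side in the current directory, the line numbers match the checked-out revisions, and the make_test_case call ends in name) on one of the two listed lines. patch_range and apply_boost_fixes are hypothetical helper names.

```shell
# Sketch only: apply the Boost-related source edits in place with GNU sed.
# patch_range FILE RANGE EXPR runs EXPR on the given line or line range.
patch_range() {
    local file=$1 range=$2 expr=$3
    sed -i "${range}${expr}" "$file"
}

# Qualify ifstream, unless the line already says std::ifstream.
IFSTREAM_FIX='{/std::ifstream/!s/\bifstream\b/std::ifstream/g}'
# BOOST_MESSAGE was renamed to BOOST_TEST_MESSAGE in newer Boost.Test.
MESSAGE_FIX='s/\bBOOST_MESSAGE\b/BOOST_TEST_MESSAGE/g'
# Append the file and line arguments to the make_test_case call.
TESTCASE_FIX='s/\bname)/name, __FILE__, __LINE__)/'

apply_boost_fixes() {
    patch_range corpus2/learn_to_guess/main.cpp 25 "$IFSTREAM_FIX"
    patch_range toki/tests/compare.cpp 223,224 "$TESTCASE_FIX"
    patch_range toki/tests/srx.cpp 105 "$MESSAGE_FIX"
    patch_range maca/tests/compareconv.cpp 100 "$MESSAGE_FIX"
    patch_range maca/tests/compareconv.cpp 207,208 "$TESTCASE_FIX"
}
```

Each expression is written so that a second run leaves the file unchanged. Checking the affected lines afterwards (for example with sed -n '223,224p') is still advisable, since a different revision may have shifted them.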
For an example usage of Maca's interactive mode (in OCaml) see maca_test.ml
Below are the contents of the original README file.
DEPENDENCIES
Maca requires several development packages in order to be built, primarily:
- Corpus2 and libpwrutils (both from the corpus2 repository)
- Toki

Other dependencies (some of them indirect via the two libraries above) are:
- ICU (libicu-dev)
- Boost libraries, 1.41 or 1.42 (libboost1.42-all-dev); not tested on newer versions
- Loki (libloki-dev)
- LibXML++ (libxml++2.6-dev)
- bison and flex
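
On Debian or Ubuntu the packages named in this list can be collected into a single apt-get call. The sketch below only prints the command so it can be reviewed first; maca_build_deps is a hypothetical helper, and the package names bison and flex are assumed to match the tool names.

```shell
# Sketch only: print (not run) an apt-get command for the packages listed above.
maca_build_deps() {
    # $1 optionally overrides the Boost package, e.g. libboost-all-dev on
    # systems where libboost1.42-all-dev is not available.
    local boost_pkg=${1:-libboost1.42-all-dev}
    echo "sudo apt-get install libicu-dev $boost_pkg libloki-dev libxml++2.6-dev bison flex"
}

# Review the command, then paste it into a terminal:
#   maca_build_deps
#   maca_build_deps libboost-all-dev
```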
Plugins require other libraries that are optional:
- The SFST plugin requires the SFST library 1.2 (see below)
- The Morfeusz plugin requires the Morfeusz library and header
- The Guesser plugin requires the Guesser package from TaKIPI / Corpus
SFST note: The SFST package included in Ubuntu and Debian repositories (libsfst-dev) is apparently missing required header files. The sources downloadable from ims.uni-stuttgart.de in turn need some Makefile modifications to build and install the library and headers. There are two possibilities to overcome this:
- Install the version with the modified Makefile contained in the third_party directory of this package.
- Install the Debian package libsfst1-1.2-0-dev, then download the 1.2 sources and copy the missing header files into the proper system header dir (presumably /usr/include/sfst-1.0/sfst/). Make sure those files are given read permissions.
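
For the second option, the header copy might look like the sketch below. copy_missing_headers is a hypothetical helper; the location of the headers inside the downloaded 1.2 sources is left as a parameter because it is not stated here, and the destination shown in the usage comment is only the presumed directory mentioned above.

```shell
# Sketch only: copy headers that the destination lacks, readable by everyone.
# Run it with enough privileges to write to the destination directory.
copy_missing_headers() {
    local src=$1 dst=$2 header name
    for header in "$src"/*.h; do
        [ -e "$header" ] || continue    # the glob matched no header files
        name=$(basename "$header")
        if [ ! -e "$dst/$name" ]; then
            # install -m 644 copies the file and sets rw-r--r-- in one step
            install -m 644 "$header" "$dst/$name"
        fi
    done
}

# Intended use from a root shell (the source path is a placeholder):
#   copy_missing_headers /path/to/sfst-sources /usr/include/sfst-1.0/sfst
```

Headers already installed by the Debian package are left untouched, so only the missing ones are added.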
USAGE
See the maca-analyse and maca-convert binaries.
NOTES
Spurious "array subscript is above array bounds" warnings originating in bits/stl_algo seem to be an artifact of gcc4.4's overly-eager correctness checking in conjunction with boost::algorithm::split and boost::is_any_of. Apparently, they are harmless.