Commit 9ec8175f8912235f32b58074f30c0d0682b52a59
1 parent: a0752add
Update the README for the tagged frequency list
Showing 1 changed file with 4 additions and 1 deletion
resources/NKJP1M/NKJP-tagged-frequency-tagset.txt
... | ... | @@ -39,9 +39,10 @@ NON-SGJP - all the others. |
39 | 39 | Possible values are: |
40 | 40 | NCH - not checked (automatic label for all entries which are NOT NON-SGJP), may |
41 | 41 | be actually any of the others. |
42 | +ACRO - acronyms. | |
42 | 43 | SYMB - a number, abbreviation, other symbol (eg. signs of articles in legal |
43 | 44 | acts). |
44 | -COMP - a compound form, with the second part after a hyphen. | |
45 | +COMPD - a compound form, with the second part after a hyphen, or apostrophe. | |
45 | 46 | PN - proper name. This includes the following cases, spelled in lowercase acco- |
46 | 47 | rding to the standard ortography: 1) names of inhabitants and adjectives deri- |
47 | 48 | ved from the names of places, nations etc. 2) words that are part of encyclo- |
... | ... | @@ -61,6 +62,8 @@ CORR - correctly spelled word. |
61 | 62 | ERR - incorrectly spelled word (including forms considered non-standard Polish |
62 | 63 | *but not* dialectal). |
63 | 64 | CERR - common error, often not perceived as such in less official contexts. |
65 | +PHON - word spelled incorrectly, but in a way clearly intended to reflect its | |
66 | +pronunciation. | |
64 | 67 | TAGD - word spelled correctly, but there's a disagreement between NKJP1M and |
65 | 68 | SGJP in its part of speech classification (of type "chory" - adj or subst?, |
66 | 69 | "cierpiący" - adj or pact?). |
... | ... |
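Because this commit renames COMP to COMPD and adds ACRO and PHON, code that reads older copies of the list may need to map the old tag onto the new one. The following is a minimal sketch of such a mapping, not part of the NKJP1M resources. The helper names `normalize_tag` and `describe_tag` are hypothetical, the descriptions are abridged from the README lines visible in this diff only, and the sketch assumes each tag has already been extracted from an entry as a plain string (the file layout itself is not shown here).

```python
# Hypothetical helper for consumers of the tagged frequency list.
# Assumption: tags arrive as plain strings; how they are stored in the
# list file is not shown in this diff.

# Tag renamed by this commit (old name -> new name).
LEGACY_TAG_RENAMES = {"COMP": "COMPD"}

# Descriptions abridged from the README lines visible in this diff.
# The README defines further tags in lines the diff elides; they are
# deliberately not guessed at here.
TAG_DESCRIPTIONS = {
    "NCH": "not checked",
    "ACRO": "acronym",
    "SYMB": "number, abbreviation, or other symbol",
    "COMPD": "compound form, second part after a hyphen or apostrophe",
    "PN": "proper name",
    "CORR": "correctly spelled word",
    "ERR": "incorrectly spelled word",
    "CERR": "common error",
    "PHON": "misspelling intended to reflect pronunciation",
    "TAGD": "correct spelling, NKJP1M/SGJP part-of-speech disagreement",
}


def normalize_tag(tag: str) -> str:
    """Strip whitespace and map a pre-commit tag name to its current name."""
    tag = tag.strip()
    return LEGACY_TAG_RENAMES.get(tag, tag)


def describe_tag(tag: str) -> str:
    """Return the abridged description, or a marker for tags not shown here."""
    return TAG_DESCRIPTIONS.get(normalize_tag(tag), "not listed in this diff")


if __name__ == "__main__":
    for raw in ("COMP", "ACRO", "PHON"):
        print(raw, "->", normalize_tag(raw), ":", describe_tag(raw))
```

Keeping the rename in its own table means a later tag change would only need a new entry there, while the description table tracks the current README.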