src/site/markdown/indexing_formats.md 1.87 KB
Matthijs Brouwer authored
# Formats

To configure the mapping from resources to the index structure, several parsers are available for different formats:

* [MtasFoliaParser](indexing_formats_folia.html): mapping [FoLiA](https://proycon.github.io/folia/) resources
* [MtasTEIParser](indexing_formats_tei.html): mapping [ISO-TEI](http://www.tei-c.org/) resources
* [MtasChatParser](indexing_formats_chat.html): mapping [CHAT transcription format](http://talkbank.org/manuals/CHAT.pdf) resources converted to [XML](http://talkbank.org/software/xsddoc/)
* [MtasSketchParser](indexing_formats_sketch.html): mapping [Sketch Engine](https://www.sketchengine.co.uk/word-sketch-index-format/) resources
* [MtasCRMParser](indexing_formats_crm.html): mapping resources in the Corpus Van Reenen-Mulder/Adelheid format

For XML-based formats, these parsers often only slightly extend the abstract `MtasXMLParser`, defining the correct namespaces and root tags.

The [configuration file](indexing_configuration.html#configuration) that defines the [mapping](indexing_mapping.html) contains two parts: general index settings, and more specific settings that define and configure the parser.

The `index` part may contain general default settings to be applied in the mapping; the content of the `parser` part is specific to the chosen Mtas parser.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<mtas>

  <!-- START MTAS INDEX CONFIGURATION -->
  <index>
    <!-- START GENERAL SETTINGS MTAS INDEX PROCESS -->
    <payload index="false" />
    <offset index="false" />
    <realoffset index="false" />
    <parent index="true" />
    <!-- END GENERAL SETTINGS MTAS INDEX PROCESS -->
  </index>
  <!-- END MTAS INDEX CONFIGURATION -->

  <!-- START CONFIGURATION MTAS PARSER -->
  <parser name="...">
  ...
    <!-- START MAPPINGS -->
    <mappings>
    ...
    </mappings>
    <!-- END MAPPINGS -->
    ...
  </parser>
  <!-- END CONFIGURATION MTAS PARSER -->

</mtas>  
```
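
Because a single mismatched tag makes the whole configuration unreadable, it can help to check a file for well-formedness before handing it to the indexer. The following is a minimal sketch using only Python's standard library, independent of Mtas itself. The helper names `summarize_config` and `is_well_formed` and the parser name `hypothetical.ParserName` are illustrative assumptions, not part of Mtas; the element names are taken from the skeleton above.

```python
import xml.etree.ElementTree as ET

# Illustrative configuration following the skeleton above.
# The parser name is a hypothetical placeholder, not a real Mtas class name.
CONFIG = """<?xml version="1.0" encoding="UTF-8" ?>
<mtas>
  <index>
    <payload index="false" />
    <offset index="false" />
    <realoffset index="false" />
    <parent index="true" />
  </index>
  <parser name="hypothetical.ParserName">
    <mappings>
    </mappings>
  </parser>
</mtas>
"""


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def summarize_config(xml_text):
    """Summarize the index part and the parser part of a configuration."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag != "mtas":
        raise ValueError(f"unexpected root element: {root.tag}")

    index = root.find("index")
    parser = root.find("parser")

    # Collect the general settings, e.g. {"payload": "false", ...}.
    settings = {}
    if index is not None:
        for child in index:
            settings[child.tag] = child.get("index")

    return {
        "index_settings": settings,
        "parser_name": parser.get("name") if parser is not None else None,
        "has_mappings": parser is not None
        and parser.find("mappings") is not None,
    }


if __name__ == "__main__":
    print(is_well_formed(CONFIG))
    print(summarize_config(CONFIG))
```

This only checks XML well-formedness and the presence of the two top-level parts; whether the settings inside `parser` are valid still depends on the specific Mtas parser being configured.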