Knowledge Yielding Ontologies
for Transition-based Organization

KAF Sax parser


KAF parser is a Java module that reads KAF files and provides an API for efficient access to the KAF data layers. All data are kept in memory. The module can also write the (modified) KAF back to a file in KAF format. Once the KAF file has been read, you can access the different KAF layers and use various indexes to connect them:

    public KafMetaData kafMetaData;  // metadata on the document in KAF
    public ArrayList kafCountryArrayList;  // list of locations of the type country
    public ArrayList kafPlaceArrayList;  // list of locations of the type place
    public ArrayList kafDateArrayList;  // list of dates
    public ArrayList kafWordFormList;  // word forms in KAF
    public ArrayList<KafTerm> kafTermList;  // terms in KAF
    public ArrayList kafChunkList;  // chunks in KAF
    public ArrayList kafDepList;  // dependencies in KAF
    public HashMap WordFormToTerm;  // word form ids, each pointing to a term id
    public HashMap<String, ArrayList<String>> TermToChunk;  // term ids with pointers to the chunk ids
    public HashMap<String, ArrayList<String>> TermToDeps;  // term ids with pointers to the dependency ids
    public HashMap<String, ArrayList<String>> SentenceToWord;  // sentence ids with pointers to all their word token ids
    public HashMap<String, ArrayList<String>> TermToWord;  // term ids with pointers to all their word token ids
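
The indexes can be chained to move from one layer to another. The following is only a minimal sketch of such a lookup, going from a word form id through its term id to the chunk ids. It does not use the parser itself: the class `KafIndexLookupSketch` and the method `chunksForWordForm` are hypothetical names, the two maps are hand-filled stand-ins for `WordFormToTerm` and `TermToChunk`, and the `String` key and value types are an assumption based on the descriptions above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Illustrative stand-in: in real use the two maps would be the parser's
// WordFormToTerm and TermToChunk fields, filled while reading a KAF file.
// The String/ArrayList<String> types are assumed, not taken from the module.
class KafIndexLookupSketch {

    // Follow word form id -> term id -> chunk ids.
    // Returns an empty list if either link is missing.
    static List<String> chunksForWordForm(String wordFormId,
                                          HashMap<String, String> wordFormToTerm,
                                          HashMap<String, ArrayList<String>> termToChunk) {
        String termId = wordFormToTerm.get(wordFormId);
        if (termId == null) {
            return new ArrayList<String>();
        }
        ArrayList<String> chunkIds = termToChunk.get(termId);
        if (chunkIds == null) {
            return new ArrayList<String>();
        }
        return chunkIds;
    }

    public static void main(String[] args) {
        // Hand-made ids for the demo; they do not come from a real KAF document.
        HashMap<String, String> wordFormToTerm = new HashMap<>();
        wordFormToTerm.put("w1", "t1");
        wordFormToTerm.put("w2", "t2");

        HashMap<String, ArrayList<String>> termToChunk = new HashMap<>();
        termToChunk.put("t1", new ArrayList<>(Arrays.asList("c1")));
        termToChunk.put("t2", new ArrayList<>(Arrays.asList("c1", "c2")));

        System.out.println(chunksForWordForm("w2", wordFormToTerm, termToChunk));
        System.out.println(chunksForWordForm("w9", wordFormToTerm, termToChunk));
    }
}
```

Returning an empty list for an unknown id keeps calling code free of null checks; whether the real indexes contain an entry for every id is not stated here.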

Besides the API to the KAF structure, the module provides functions to write the KAF data to an output file. This makes it easy to write a program that reads KAF, modifies it and writes it back out.

Here is a simple program that reads and writes KAF:

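import java.io.FileOutputStream;
import java.io.IOException;

public class DummyKafModifier {
    public static void main(String[] args) {
        String kafFile = args[0];
        KafSaxParser parser = new KafSaxParser();
        // Assumed method name: read the KAF file into memory.
        parser.parseFile(kafFile);
        for (int i = 0; i < parser.kafTermList.size(); i++) {
            KafTerm kafTerm = parser.kafTermList.get(i);
            // DO SOMETHING TO MODIFY A TERM
        }
        // try-with-resources closes the stream even if writing fails.
        try (FileOutputStream fos = new FileOutputStream(kafFile + ".my.kaf")) {
            // Assumed method name: write the in-memory KAF to the stream.
            parser.writeKafToStream(fos);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}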

Such a program can also be turned into a PipeT module by defining a function that takes an input stream and an output stream as parameters.
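
A function of that shape might look like the sketch below. It is only an illustration under stated assumptions: the interface PipeT actually expects is not described here, the class `StreamKafModifierSketch` and the method `process` are hypothetical names, and the JDK's DOM and transformer classes stand in for the KAF parser so that the example is self-contained. The "modification" (lower-casing the `lemma` attribute of every `term` element) is arbitrary, and the KAF fragment in `main` is heavily simplified.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical stream-to-stream module; the JDK DOM API stands in for the KAF parser.
class StreamKafModifierSketch {

    // Read KAF from `in`, lower-case every term lemma, write the result to `out`.
    static void process(InputStream in, OutputStream out) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            NodeList terms = doc.getElementsByTagName("term");
            for (int i = 0; i < terms.getLength(); i++) {
                Element term = (Element) terms.item(i);
                if (term.hasAttribute("lemma")) {
                    term.setAttribute("lemma", term.getAttribute("lemma").toLowerCase(Locale.ROOT));
                }
            }
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            // Surface any parse or write failure as an unchecked exception.
            throw new IllegalStateException("Could not process KAF stream", e);
        }
    }

    public static void main(String[] args) {
        // Simplified, hand-written KAF fragment used only for the demo.
        String kaf = "<KAF><terms><term tid=\"t1\" lemma=\"River\"/></terms></KAF>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        process(new ByteArrayInputStream(kaf.getBytes(StandardCharsets.UTF_8)), out);
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because the function only depends on the two streams, the same code can be driven by files, by `System.in` and `System.out`, or by in-memory buffers as in `main`.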

Owner: VU University Amsterdam

License: GPLv3


Source code:





ICT-211423 - 2008 © Kyoto Consortium