Knowledge Yielding Ontologies
for Transition-based Organization

KYOTO profiles

We created generic profiles that combine simple structural patterns with ontological constraints. The structural patterns are sequences of terms in KAF, restricted by part-of-speech and sometimes by lemma. We only used high-level ontology classes to further constrain the output. Semantic classes, word order and prepositions are used together to predict the semantic role of words with respect to other words that express the events. The output in the profiles is defined in the KYOTO style, using an event element and a role element that is connected to the event. Below is an example of such a pattern for a noun-verb sequence. The noun is restricted to persons and the verb belongs to a semantic class that is dominated by dynamic action verbs. The output is an event represented by the verb and a done-by role represented by the noun:

<?xml version="1.0" encoding="utf-8"?>

<Kybot id="generic_kybot_Norganism-OR-matter_Vaccomplishment_main_clause_done_by">
<variables>
<var name="v1" type="term" pos="N" reftype="SubClassOf" reference="Kyoto#person__individual__someone__somebody__mortal__soul-eng-3.0-00007846-n"/>
<var name="v2" type="term" pos="V" reftype="SubClassOf" reference="DOLCE-Lite.owl#accomplishment | Kyoto#verb_change | Kyoto#verb_consumption | Kyoto#verb_motion | Kyoto#verb_competition | Kyoto#verb_weather | Kyoto#verb_possession | Kyoto#verb_creation | Kyoto#verb_contact"/>
<var name="v3" type="term" lemma="be"/>
<var name="v4" type="term" pos="P"/>
</variables>
<relations>
<root span="v2"/>
<rel span="v1" pivot="v2" direction="preceding" notInBetween="v3"/>
<rel span="v1" pivot="v2" direction="preceding" notInBetween="v4"/>
</relations>
<events>
<event eid="" target="$v2/@tid" lemma="$v2/@lemma" pos="$v2/@pos"/>
<role rid="" event="" target="$v1/@tid" lemma="$v1/@lemma" pos="$v1/@pos" rtype="done-by"/>
</events>
</Kybot>

In this profile, we define two variables with the parts of speech N (v1) and V (v2). The noun must precede the verb, and neither the lemma "be" (v3) nor a preposition (v4) may occur between them. The output of the profile is defined in the events element. Here it states that v2 represents the event and v1 represents the role. This profile generates output such as the following:

<event eid="e9" target="t351" lemma="move" pos="V" synset="eng-30-01649999-v" rank="0.0769607"/>
<role rid="r9" event="e9" target="t350" lemma="people" pos="N" rtype="done-by" synset="eng-30-07942152-n" rank="0.294911"/>
<event eid="e10" target="t358" lemma="predict" pos="V" synset="eng-30-00917772-v" rank="0.502139"/>
<role rid="r10" event="e10" target="t357" lemma="expert" pos="N" rtype="done-by" synset="eng-30-09617867-n" rank="1.0"/>
<event eid="e45" target="t1183" lemma="foster" pos="V" synset="eng-30-00908351-v" rank="0.34663"/>
<role rid="r45" event="e45" target="t1179" lemma="watershed" pos="N" rtype="done-by" synset="eng-30-08518940-n" rank="0.377196"/>
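
The matching behaviour described for this profile can be approximated with a small Python sketch. This is not the Kybot mining module itself: the `Term` class, the `match_done_by` function and the class labels `PERSON` and `ACTION` are hypothetical names introduced here, the ontological implications of each term are assumed to be precomputed, and the sketch assumes that only the nearest qualifying noun before each verb is reported.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the ontology constraints on v1 and v2.
PERSON = "person"
ACTION = "accomplishment"


@dataclass
class Term:
    tid: str
    lemma: str
    pos: str                          # "N", "V", "P", ...
    classes: frozenset = frozenset()  # ontology classes implied by the term (assumed precomputed)


def match_done_by(terms):
    """Pair each action verb (v2) with the nearest preceding person noun (v1),
    unless the lemma "be" (v3) or a preposition (v4) occurs in between."""
    results = []
    for j, verb in enumerate(terms):
        if verb.pos != "V" or ACTION not in verb.classes:
            continue
        # Walk leftwards from the verb; a blocker rules out every noun further left.
        for i in range(j - 1, -1, -1):
            term = terms[i]
            if term.lemma == "be" or term.pos == "P":
                break
            if term.pos == "N" and PERSON in term.classes:
                n = len(results) + 1
                event = {"eid": f"e{n}", "target": verb.tid,
                         "lemma": verb.lemma, "pos": verb.pos}
                role = {"rid": f"r{n}", "event": f"e{n}", "target": term.tid,
                        "lemma": term.lemma, "pos": term.pos, "rtype": "done-by"}
                results.append((event, role))
                break
    return results


sentence = [
    Term("t357", "expert", "N", frozenset({PERSON})),
    Term("t358", "predict", "V", frozenset({ACTION})),
]
for event, role in match_done_by(sentence):
    print(event)
    print(role)
```

In this sketch a sequence such as "expert", "be", "predict" yields no match, because the term with lemma "be" stands between the noun and the verb.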

In total 261 profiles have been created for English. They can be downloaded from:

https://kyoto.let.vu.nl/~kyoto/files/data/kybotprofiles/generic_profiles_v9.zip

When unpacked, the profiles are divided over a series of subdirectories that are firstly based on the POS sequence of terms and secondly on semantic constraints. The details and the evaluation of the mining using these profiles are described in deliverable D5.4.

The profiles were created in about one month of work, which included writing the profiles and testing them on the benchmark in various iterations. During the creation of the profiles, the Kybot mining module was adapted several times to support the required functionality.

An important aspect of KYOTO is the sharing of the central ontology and the possibility to extract semantic relations in a uniform way. To test the feasibility of sharing the same semantic backbone and transferring Kybot profiles, we carried out a transfer experiment from English to Dutch. We collected 93 Dutch documents on a Dutch estuary (the Westerschelde) and related topics. We created KAF files and applied word-sense-disambiguation to these KAF files using the Dutch wordnet data and UKB. We applied named-entity-recognition through the same NER module as for English, which uses the GeoNames database.

To apply the profiles to the Dutch KAF documents, we need to apply the ontotagger program to the Dutch KAF. However, the ontotag tables map the English WordNet to the ontology and not the Dutch wordnet. We therefore generated Dutch variants of the tables on the basis of the equivalence relations between the Dutch wordnet and the English wordnet. For each Dutch synset, we looked up all the equivalent synsets in English and then looked up each of these English synsets in the ontotag tables. If there was a match, we created an entry for the Dutch synset in the new table with the same mapping. Likewise, we created tables that match every Dutch synset to the English Base Concepts and to the ontology. Some Dutch synsets have no equivalent and some have multiple equivalents. We generated 145,189 Dutch synset to English Base Concept mappings (for comparison, English has 114,477 mappings) and 326,667 Dutch synset to ontology mappings (186,383 for English). These ontotag tables were used to insert the ontological implications into the Dutch KAF files.
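
The table transfer described above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the code used in the project: the function name `transfer_ontotags`, the dictionary-based table layout and all synset identifiers and class labels in the example are placeholders invented for this sketch.

```python
from collections import defaultdict


def transfer_ontotags(equivalences, english_table):
    """Derive a Dutch ontotag table from an English one.

    equivalences:  Dutch synset id -> list of equivalent English synset ids
    english_table: English synset id -> list of mappings (Base Concepts or ontology classes)
    """
    dutch_table = defaultdict(set)
    for nl_synset, en_synsets in equivalences.items():
        for en_synset in en_synsets:
            # Copy every mapping of a matching English synset to the Dutch synset.
            for mapping in english_table.get(en_synset, ()):
                dutch_table[nl_synset].add(mapping)
    # Dutch synsets without any matching English entry never get a row.
    return {synset: sorted(mappings) for synset, mappings in dutch_table.items()}


# Placeholder data: identifiers and class labels are invented for illustration.
equivalences = {
    "nl-1": ["en-1", "en-2"],  # multiple English equivalents
    "nl-2": [],                # no English equivalent
    "nl-3": ["en-9"],          # equivalent that is absent from the English table
}
english_table = {
    "en-1": ["Class#A"],
    "en-2": ["Class#A", "Class#B"],
}
print(transfer_ontotags(equivalences, english_table))
```

A Dutch synset with several English equivalents receives the union of their mappings, while a synset with no equivalent, or with equivalents that are missing from the English table, is left out of the Dutch table.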

Next, we adapted the 261 English Kybot profiles, replacing all English-specific elements with Dutch equivalents. This mainly involved:

  1.  replacing English prepositions and relative clause complementizers with their Dutch counterparts
  2.  adapting the word order sequences for relative clauses in Dutch
  3.  adapting profiles including adverbials
  4.  eliminating profiles for multiword compounds, which hardly occur in Dutch
  5.  eliminating the explicit role profiles

We kept all the ontological constraints exactly as they were for English. Only superficial syntactic properties were thus changed. It took us half a day to adapt the profiles for Dutch. From the original 261 English profiles, we obtained 134 Dutch profiles. We ran the profiles on the 93 Dutch KAF files and 65 profiles generated output.

The 65 profiles adapted for Dutch KAF can be downloaded from:

 https://kyoto.let.vu.nl/~kyoto/files/data/kybotprofiles/generic_profiles_v9_dutch.zip

 

 

ICT-211423 - 2008 © Kyoto Consortium