Knowledge Yielding Ontologies
for Transition-based Organization
KYOTO Project Forum
TOPIC: Rudify: evaluation and extension
#49
Roxane Segers (User)
Rudify: evaluation and extension 1 Year, 6 Months ago
Hi Axel,

I understand that you excluded the synset [verbinding::d_n-31543: # sense 4 (scheikunde): ch. compound] from the Dutch Gold Standard because of its obvious polysemy. Please note that more BCs in the Dutch GS are polysemous; these have been disambiguated by providing the human taggers with a synset ID.

We’ll start tagging the full list of Dutch BCs for rigidity soon, but I will include only those BCs that have an equal synonym or equal near synonym relation to an English BC. A subset of the Dutch BCs have only a ‘has_hyperonym’ or ‘has_hyponym’ relation to an English BC. Including the hyponyms would not cause major problems for rigidity labelling as long as the English BC is rigid, and the same goes for ‘has_hyperonym’ as long as the English BC is non-rigid, but I think the safest way is to use only close translations. So, if you want to create a multilingual standard from all the Kyoto BCs that are linked to the English BCs by equal (near) synonyms, the intersection might end up being quite small.
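
The inclusion rule above could be written down roughly as follows. This is only a sketch under assumptions: the relation labels "eq_synonym" and "eq_near_synonym" and both function names are hypothetical, and only ‘has_hyperonym’ and ‘has_hyponym’ are names taken from the post itself.

```python
from typing import Dict, List, Optional

# Hypothetical labels for the two "close translation" relations.
CLOSE_RELATIONS = {"eq_synonym", "eq_near_synonym"}


def inherited_rigidity(relation: str, english_is_rigid: bool) -> Optional[bool]:
    """Return the rigidity a Dutch BC could take over from its English BC.

    True means rigid, False means non-rigid, None means the English label
    says nothing safe about the Dutch BC under the rule in the post.
    """
    if relation in CLOSE_RELATIONS:
        return english_is_rigid      # close translation: copy the label
    if relation == "has_hyponym" and english_is_rigid:
        return True                  # unproblematic while the English BC is rigid
    if relation == "has_hyperonym" and not english_is_rigid:
        return False                 # unproblematic while the English BC is non-rigid
    return None                      # everything else: tag by hand


def safest_subset(bcs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only BCs linked by a close translation (the 'safest way')."""
    return [bc for bc in bcs if bc["relation"] in CLOSE_RELATIONS]
```

Filtering with `safest_subset` corresponds to the conservative choice of using only close translations; `inherited_rigidity` shows which additional cases could be included if a larger set were wanted.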

Furthermore, we’ve been thinking about the low performance of Rudify on the Dutch terms from the Term Database that we used for the Gold Standard. Rudify will be of great importance for the TMEKO protocol we’re developing, and it’s likely that users will want to ontologize domain-specific terms for which recall is low. The Rudify output for those won’t be reliable or decisive. (‘Alluviaal bos’ (alluvial forest), for instance, has only 320 hits on Google.)
If I understand correctly, you suggested in the KEOD article to use hypernyms to boost retrieval, but for some (multiword) domain terms this might not be effective enough. For example, I would consider an alien species non-rigid, but species would be rigid. In all cases where retrieval is low, we were thinking that instead of looking for alien species or species, we could search for alien X, thereby expanding the set with alien birds, alien genes, alien substances, et cetera. From this collection we can derive a more reliable Rudify tag. If necessary, we can also see whether the set can be semantically clustered in some way. I think Rudify would need to be modified for this, but it could be a promising method.
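
To make the "alien X" idea a bit more concrete, here is a minimal sketch. It assumes nothing about how Rudify actually works internally: the tagger is passed in as a plain function (`tag_fn`, hypothetical), the list of heads is assumed to be harvested elsewhere, and the simple majority vote with a `min_votes` threshold is an illustrative choice only.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional


def expand(modifier: str, heads: Iterable[str]) -> List[str]:
    """Build the 'modifier X' set, e.g. alien birds, alien genes, ..."""
    return [f"{modifier} {head}" for head in heads]


def aggregate_tag(
    terms: Iterable[str],
    tag_fn: Callable[[str], Optional[str]],
    min_votes: int = 3,
) -> Optional[str]:
    """Majority vote over the tags of the expanded terms.

    tag_fn stands in for a per-term rigidity tagger and returns
    "rigid", "non-rigid", or None when it has too little evidence.
    Returns None if there are too few votes or the vote is tied.
    """
    votes = Counter(tag for tag in map(tag_fn, terms) if tag is not None)
    if not votes or sum(votes.values()) < min_votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None                  # tie: no decisive tag
    return ranked[0][0]
```

A call would look like `aggregate_tag(expand("alien", ["birds", "genes", "substances"]), tag_fn)`. Semantic clustering of the expanded set, if needed, would be a step between `expand` and `aggregate_tag`.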

I’d be happy to know what you think of these ideas!

All the best from Amsterdam,

Roxane
 

ICT-211423 - 2008 © Kyoto Consortium