Our promised goals are:

  • Integrate the lexicon that we have with TRIPS, and evaluate the results with the Beetle corpora

  • Extend the lexicon with noun and adjective entries

  • Integrate with CMU tools

  • Develop tools for extracting domain-specific terminology from domain resources

To justify the work to Ray, we need to make significant progress on the first and third goals (or at least major progress on the first, and some small progress on the third).

To achieve the first goal, we need to:

1. Convert the entries into TRIPS format

  • Detect and exclude passives and dative shifts
  • Detect meaningful prepositions, since they should always be listed as syntactic modifiers
  • Figure out the proper features for clausal complements, to represent control and syntactic properties
  • Figure out a mapping for adjp complements with appropriate features
  • Figure out how to detect and represent multi-word expressions and particles, which have a special status in TRIPS

2. Align semantic types -- semi-automatically

  • Do this step and steps 3-5 for new words only. Once we have debugged these, move on to adding new senses of known words
  • Before doing this step, compute statistics on how many new words are available from FrameNet compared to the latest TRIPS lexicon (trunk), to estimate usefulness.
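
The new-word statistics could be computed with a simple set difference. The sketch below is only an illustration: it assumes both lexicons have already been reduced to plain lists of lemmas, and the function name `new_word_stats` and the case-insensitive comparison are our assumptions, not part of FrameNet or TRIPS.

```python
def new_word_stats(framenet_lemmas, trips_lemmas):
    """Count FrameNet lemmas that are missing from the TRIPS lexicon.

    Both arguments are iterables of lemma strings (an assumed input format).
    Returns a dict of counts and the sorted list of new lemmas.
    """
    # Compare case-insensitively; this normalisation is an assumption.
    fn = {w.lower() for w in framenet_lemmas}
    trips = {w.lower() for w in trips_lemmas}
    new = fn - trips
    stats = {
        "framenet_total": len(fn),
        "already_in_trips": len(fn & trips),
        "new_words": len(new),
        "new_fraction": len(new) / len(fn) if fn else 0.0,
    }
    return stats, sorted(new)


if __name__ == "__main__":
    # Toy lemma lists, for illustration only.
    stats, new = new_word_stats(["abandon", "Bake", "cook"], ["bake", "run"])
    print(stats)
    print(new)
```

If the fraction of new words turns out to be small, that would argue for moving on to new senses of known words sooner.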

3. Align semantic roles -- automatically or semi-automatically

4. For each (word, semantic type) combination, detect and filter out FrameNet entries already in TRIPS (semi-automatically)
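
A minimal sketch of the filtering in step 4, assuming the type alignment from step 2 is available as a dictionary from FrameNet frame names to TRIPS semantic types, and that the existing TRIPS lexicon can be read as a set of (word, semantic type) pairs. The names `Entry` and `split_known_entries`, and all frame and type names in the example, are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    word: str
    frame: str  # FrameNet frame name


def split_known_entries(entries, frame_to_type, trips_pairs):
    """Split FrameNet entries into known, new, and unaligned lists.

    frame_to_type: dict from frame name to TRIPS type (output of step 2).
    trips_pairs:   set of (word, TRIPS type) pairs already in the lexicon.
    """
    known, new, unaligned = [], [], []
    for entry in entries:
        sem_type = frame_to_type.get(entry.frame)
        if sem_type is None:
            # No aligned type yet: leave for manual review.
            unaligned.append(entry)
        elif (entry.word, sem_type) in trips_pairs:
            known.append(entry)
        else:
            new.append(entry)
    return known, new, unaligned


if __name__ == "__main__":
    # Hypothetical frame and type names, for illustration only.
    entries = [Entry("w1", "Frame_A"), Entry("w2", "Frame_A"), Entry("w3", "Frame_B")]
    known, new, unaligned = split_known_entries(
        entries, {"Frame_A": "type-a"}, {("w1", "type-a")}
    )
    print(known, new, unaligned)
```

Keeping the unaligned entries in a separate list is what makes the step semi-automatic: only those need to be checked by hand.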

5. Ideally, detect syntax-semantic mapping templates that already exist in TRIPS and use them rather than inserting new ones into the lexicon (could be skipped)

6. Run the new lexicon and compare its performance to an older lexicon (perhaps one without any of the entries that I added; they are clearly marked). This step could be put off until next year.

-- MarkMcConville - 16 Jun 2008

Topic revision: r1 - 16 Jun 2008 - 10:48:19 - MarkMcConville