TFlex technical report: June 2007 to August 2007

Scientific and technical objectives

The overall goal of the TFlex project was to make language technology easier to use in tutorial dialogue and learning research. Specifically, we aimed to provide:

  • tools for fast and robust deep parsing of natural language, supporting tutorial dialogue and computer-assisted learning
  • lexical resources to be used in parsing, and tools to extend their coverage to new domains
  • tools that make language technology accessible to domain and educational experts


One of the major bottlenecks in designing tutorial dialogue systems is building semantic representations of utterance content. Our approach to solving this problem is to develop tools and resources for deep parsing combined with semantic interpretation and role labelling. These include a fast parser with a grammar for syntactic and semantic interpretation (Objective 1), algorithms for extracting lexical entries from wide-coverage lexicons, and tools to support linguists in resolving inconsistencies and improving lexicon precision (Objective 2).

We are working on extending the coverage of a parsing lexicon by combining information from lexical semantic resources developed in the computational linguistics community, in particular FrameNet. Because the existing lexicons were not designed with parsing as their primary application, and because they are built on different linguistic theories, the information in them is sometimes incomplete or inconsistent.

Combining information from such different sources results in a loss of precision. To improve precision, the automatically generated entries can be checked manually by experienced linguists. Our approach is to develop tools that simplify this checking process by drawing on information from different lexicons and corpora.
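
As an illustration of how such a checking tool might narrow down the manual work, the following minimal sketch compares candidate entries from several sources and separates the entries on which all sources agree from those a linguist should review. The function name, the source names and the string encoding of subcategorization frames are assumptions made for this example only; they do not describe the project's actual tools or data formats.

```python
from collections import defaultdict


def merge_candidates(sources):
    """Split candidate verb entries into agreed and to-review sets.

    sources: dict mapping a source name to a dict of verb -> set of
    subcategorization frames (hypothetical string encoding).
    Returns (agreed, to_review).
    """
    seen = defaultdict(dict)
    for name, lexicon in sources.items():
        for verb, frames in lexicon.items():
            seen[verb][name] = frozenset(frames)

    agreed, to_review = {}, {}
    for verb, by_source in seen.items():
        distinct = set(by_source.values())
        # Accept automatically only when at least two sources list the
        # verb and they give identical frame sets; everything else
        # (conflicts, single-source entries) goes to a linguist.
        if len(by_source) > 1 and len(distinct) == 1:
            agreed[verb] = next(iter(distinct))
        else:
            to_review[verb] = by_source
    return agreed, to_review


# Toy data, invented for the example.
sources = {
    "lexicon_a": {"give": {"NP NP", "NP PP"}, "run": {"NP"}},
    "lexicon_b": {"give": {"NP PP", "NP NP"}, "run": {"NP", "PP"},
                  "sleep": {"intrans"}},
}
agreed, to_review = merge_candidates(sources)
```

In this toy run, "give" is accepted automatically because both sources list the same frames, while "run" (conflicting frames) and "sleep" (found in only one source) are queued for manual review.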

Concise accomplishments

  • developing lexical resources for deep parsing

Expanded accomplishments

Developing Lexical Resources for Deep Parsing

Currently, tutorial dialogue systems allow only very limited forms of student input, and all data annotation is done by hand. Deep parsing can facilitate the detailed analysis necessary for assessing student input in tutorial dialogue systems and collaborative problem-solving environments. In addition, linguistic features can improve classification accuracy. However, deep parsers are often difficult to extend to new domains because of their limited lexical coverage. Existing wide-coverage lexical resources (used in information extraction and question answering) are not in a format suitable for deep parsers.

The Edinburgh team developed tools to improve coverage of deep parsers:

  1. Developed a set of methods to extract verb lexical entries from a widely used and publicly available corpus: FrameNet (McConville and Dzikovska, 2007):
    • Extracted a lexicon with 2,600 verb senses from the FrameNet corpus, with the goal of further expanding the TRIPS lexicon (pending funding)
  2. Developed a set of tools to check extracted entries and merge them efficiently with the existing lexicon
  3. Developed a framework to connect lexical resources to different parsers
    • All lexical entries are extracted into a framework-independent representation
    • The representation can then be mapped to lexical entries for different parsers. Currently, we use only a mapping to the TRIPS parser. However, the approach should also allow mapping to a different parser with a different grammar, for example if text rather than dialogue needs to be parsed
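
The framework described in item 3 can be pictured with a small sketch: entries are stored once in a neutral form, and a separate mapping function per target parser produces the parser-specific entry. Everything below is hypothetical and chosen for illustration, including the `LexicalEntry` fields, the two output formats and the role and frame labels; in particular, `to_sexpr` does not reproduce the actual TRIPS lexicon format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class LexicalEntry:
    """Framework-independent verb entry (hypothetical schema)."""
    lemma: str
    sense: str                           # e.g. a frame name
    roles: Tuple[Tuple[str, str], ...]   # (semantic role, syntactic slot)


# A mapper turns a neutral entry into a parser-specific representation.
Mapper = Callable[[LexicalEntry], str]


def to_sexpr(entry: LexicalEntry) -> str:
    """Invented Lisp-style output; NOT the real TRIPS entry format."""
    args = " ".join(f"({role} {slot})" for role, slot in entry.roles)
    return f"({entry.lemma} :sense {entry.sense} :args ({args}))"


def to_flat(entry: LexicalEntry) -> str:
    """Invented tab-separated output for some other, imaginary parser."""
    frame = ",".join(f"{slot}:{role}" for role, slot in entry.roles)
    return f"{entry.lemma}\t{entry.sense}\t{frame}"


# Supporting a new parser means registering one more mapper here;
# the stored entries themselves do not change.
MAPPERS: Dict[str, Mapper] = {"sexpr": to_sexpr, "flat": to_flat}


def export(entry: LexicalEntry, target: str) -> str:
    """Render a neutral entry for the named target parser."""
    return MAPPERS[target](entry)


entry = LexicalEntry(
    lemma="give",
    sense="Giving",
    roles=(("Donor", "NP-subj"), ("Theme", "NP-obj")),
)
print(export(entry, "sexpr"))
print(export(entry, "flat"))
```

The point of the design is that extraction and manual checking are done once on the neutral entries, while each additional parser costs only one new mapping function.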


This effort has now been superseded by ONR award XXX, for which the appropriate project report has been written.

  • Mark McConville and Myrosia O. Dzikovska (2007). Extracting a verb lexicon for deep parsing from FrameNet. Proceedings of the ACL Workshop on Deep Linguistic Processing, Prague.

-- MarkMcConville - 14 Aug 2008

