TWiki > TFlex Web > Reports > July2008 (revision 15)

TFlex technical report: June 2007 to May 2008

Scientific and technical objectives

[Summarize the current project objectives noting if they deviate from those listed in the original proposal (200 words).]

The overarching objective of the TFlex project is to make language technology easier to use across a range of educational applications, including tutorial dialogue and learning research.

The core objective of the Edinburgh team is to develop lexical tools for fast, robust, deep parsing of natural language, in order to support educational applications. Specifically, we address the following three issues:

  • Depth: accessible language processing tools that produce "deep", detailed analyses of student input. If such tools are easily available, they can benefit different educational and training applications:
    • tutorial dialogue systems require detailed analysis of student language in order to give feedback adapted to the learning situation
    • the accuracy of both computer-supported collaborative learning systems and annotation tools for educational researchers can be improved by incorporating additional linguistic features
  • Coverage: tools and resources to ensure that a large variety of common words used in educational settings can be understood by a system, and that new words can be added quickly and easily
  • Robustness: tools to understand incomplete, ungrammatical or fragmentary utterances common in tutorial dialogue

[200 words]


[Summarize the current project approach noting if/how the current approach deviates from the original proposal (200 words).]

The approach pursued by the Edinburgh team involves developing core language technology based on our depth, coverage and robustness goals, which can then be integrated with the user interfaces for educational researchers developed by the CMU team. We are building on the results of the previous joint effort between Edinburgh and CMU, as part of which we developed tools to harvest verb lexical entries from the VerbNet and FrameNet resources. We are approaching our current goals as follows:

  • Depth and coverage. We are combining two existing types of resources: deep parsers, which provide depth of analysis but lack coverage, and semantic lexicons, which have wider coverage but are not integrated with deep parsing systems. We are developing new methods for utilizing information available from FrameNet, to improve the quality of extracted entries and to extend coverage to nouns and adjectives in addition to verbs. We are integrating the newly extracted entries with a deep parser used in tutorial dialogue systems (the TRIPS parser).
  • Robustness. We are investigating the information available from dependency parsers, which are more robust than deep linguistic parsers, and ways to utilize it if a deep parser does not produce a sentence analysis.

[194 words]

Concise accomplishments

[Briefly summarize accomplishments from the current reporting period and briefly note the significance of data/results.]

  • We harvested 2,700 verb entries from the FrameNet corpus for potential inclusion in the lexicon of the parser used in the Beetle2 tutorial dialogue system. This includes 1,200 words that have not been defined in the lexicon at all, and would thus expand system coverage by 44%. Additionally, many entries for verbs already defined in the system lexicon describe word usages not previously handled by the system. The investigation of how to detect and merge in such entries is in progress.
  • We investigated the usability of five parsing schemes with a view to inclusion in a tutorial dialogue system. The investigation showed that the features needed for interpretation in tutorial dialogue are available from different parsers; however, no single system provides all the necessary features. We plan to investigate the best options during the next funding year.
  • To prepare for integration with CMU tools, we parsed 100 doctor/patient dialogues supplied by CMU, and sent the results to the CMU team for further analysis.

[163 words]

Expanded accomplishments

[Describe in greater detail the progress achieved during the current reporting period and include the significance of data/results. You are encouraged to include graphs, charts, and photos. No word limit.]


One of the key components of the kind of language-enhanced educational technology that forms the motivation for the TFlex project is a language interpretation module. In an automated tutorial dialogue system, this module is responsible for analysing what the student says so as to allow the domain reasoner to plan an appropriate response. In computer-supported collaborative learning, the language interpreter is responsible for scanning both the documents created by students and the record of their interactions to diagnose the appropriate time for tutor intervention.

A language interpretation module needs to be:

  • wide-coverage -- able to handle all the words and grammatical structures that occur in educational domains

  • robust -- able to deal with fragmentary, misspelled or ungrammatical input, or with words and idioms the system has not seen before

  • deep -- able to output semantically-transparent representations which can interface straightforwardly with a symbolic reasoning system, or serve as informative features for a machine learning classifier.

The two key components of the interpretation system that determine its coverage, depth and robustness are the lexicon and the grammar. The lexicon constitutes a list of all the words (and idioms) known to the system. The grammar determines which sequences of words are grammatical English sentences, and how they can be interpreted (e.g. "a bulb is in a path" is a grammatical English sentence, whereas "a bulb path is in" isn't).
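The grammaticality contrast above can be illustrated with a toy sketch. The five-word lexicon and single grammar rule below are invented for illustration only (they are in no way the TRIPS lexicon or grammar); they merely show how a lexicon plus a grammar jointly decide which word sequences are accepted:

```python
# Toy illustration: a five-word lexicon and one grammar rule
# (S -> NP V P NP, NP -> Det N) that accept "a bulb is in a path"
# but reject the scrambled "a bulb path is in".
LEXICON = {"a": "Det", "bulb": "N", "path": "N", "is": "V", "in": "P"}

def parse_np(tags, i):
    # NP -> Det N; return position after the NP, or None on failure
    if i + 1 < len(tags) and tags[i] == "Det" and tags[i + 1] == "N":
        return i + 2
    return None

def is_grammatical(sentence):
    words = sentence.lower().split()
    if any(w not in LEXICON for w in words):
        return False          # lack of coverage: unknown word
    tags = [LEXICON[w] for w in words]
    i = parse_np(tags, 0)     # subject NP
    if i is None or i + 1 >= len(tags) or tags[i] != "V" or tags[i + 1] != "P":
        return False
    i = parse_np(tags, i + 2) # NP inside the prepositional phrase
    return i == len(tags)     # whole sentence must be consumed

print(is_grammatical("a bulb is in a path"))   # True
print(is_grammatical("a bulb path is in"))     # False
```

A failure here corresponds exactly to the two failure modes discussed below: an unknown word (coverage) or a word sequence the grammar rejects (robustness).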

Language interpretation in a system may fail for several reasons:

  • lack of coverage -- the appropriate word is not defined in the lexicon

  • lack of robustness -- the grammar does not recognize the sentence as grammatical English.

Our focus has been to address those issues in two ways:

  • to compensate for lack of coverage in deep lexicons by extracting additional entries from semantic lexicons not previously used with deep parsers

  • to investigate the extent to which dependency parsers, which provide greater robustness, but less depth, can be used in the context of language interpretation for educational applications.

Extending coverage of deep lexicons

During the previous phase of the TFlex project, we harvested a lexicon of 2,700 verbs from the FrameNet semantically annotated corpus. The main issue that needed to be addressed in this process is that FrameNet annotation was not designed with deep parsing in mind. As a result, the lexical entries extracted from the corpus tend to contain a lot of redundant information. These redundancies cause two main problems: the efficiency of the parser decreases, since it has to process a lot of additional unnecessary information in each entry; and since a single word usually has multiple usages specified in the lexicon (corresponding to different usage alternations, such as "the terminals are connected" vs. "the bulb is connected to the battery"), multiple redundant usages may lead the parser to choose an incorrect interpretation among the many available options.

Having identified redundancy as the main issue, our goal is to discard the redundant usages automatically, while not discarding valid usages necessary for interpretation.

As a first step in this process, we conducted a more detailed evaluation of the lexicon extraction method developed in the previous phase of the project. It focused on the "complement/modifier" distinction, a well known requirement of deep parsing. This dichotomy is not explicitly marked in the FrameNet annotations, and we developed an approach to approximate it from the "core argument" feature in the FrameNet ontology.

We discovered that by discarding the arguments marked as "non-core", we were able to compress the lexicon by 45%, resulting in a significant reduction of spurious usages. The compression accuracy was 85%; specifically, 13% of discarded usages were correct (false positives), while 9% of the retained usages should have been discarded (false negatives). While the accuracy is not perfect, we concluded that it is sufficient for our needs at present, because many of the discarded usages can be obtained from other sources (e.g. the VerbNet lexicon), and at present the low false negative rate is more important. The results of this evaluation were published as McConville and Dzikovska (2008b).
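The core/non-core filter can be sketched as follows. The entry format and the `core` flag below are simplified stand-ins for the real FrameNet-derived lexicon, not its actual data structures:

```python
# Hedged sketch of the compression step: drop arguments not marked as
# core in the FrameNet annotation, then deduplicate the usages that
# become identical once non-core modifiers are removed.
def compress_entry(usages):
    seen, kept = set(), []
    for usage in usages:
        core_args = tuple(sorted(a["role"] for a in usage if a["core"]))
        if core_args not in seen:
            seen.add(core_args)
            kept.append(core_args)
    return kept

# Two hypothetical usages of "connect" that differ only in a non-core
# "Time" modifier collapse into a single entry after filtering.
usages = [
    [{"role": "Item1", "core": True}, {"role": "Item2", "core": True}],
    [{"role": "Item1", "core": True}, {"role": "Item2", "core": True},
     {"role": "Time", "core": False}],
]
print(compress_entry(usages))  # [('Item1', 'Item2')]
```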

We then used the ontological information distributed with the FrameNet corpus to further compress our harvested lexicon. FrameNet uses a knowledge representation geared towards question answering and information retrieval tasks, but not ideally suited for deep parsing. Each concept in the lexicon is associated with a large number of semantic roles describing participants in a given action - for example, the concept underlying the verb "speak" will specify roles for "Speaker" and "Addressee". Across the verb senses annotated in FrameNet as a whole, there is a total of 440 distinct role labels. In contrast, lexicons for deep parsing generally utilize simpler representations, with a vocabulary of 10-50 role labels. The large number of distinct role labels in FrameNet results in many redundant entries in the harvested lexicon. The FrameNet ontology contains features, role inheritance and coreness sets, which are intended to simplify the representation for use in other applications. We investigated the use of these features to reduce the vocabulary of semantic role labels. We discovered that these features had not been annotated consistently across the corpus as a whole, but in cases where they had been annotated, we were able to reduce the lexicon size by 39%, with an estimated 90% accuracy. The results of this part of the project were published in McConville and Dzikovska (2008a).
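The role-vocabulary reduction via inheritance can be sketched as below. The small hierarchy is invented for illustration and is not FrameNet's actual role inheritance; the point is only that replacing each label by its most general ancestor shrinks the label vocabulary, and with it the number of distinct usages:

```python
# Illustrative sketch of role-label generalization: each role is
# replaced by its most general ancestor in an inheritance hierarchy.
# The PARENT table below is a made-up example, not FrameNet data.
PARENT = {"Speaker": "Agent", "Interlocutor_1": "Agent",
          "Addressee": "Recipient", "Agent": None, "Recipient": None}

def generalize(role):
    while PARENT.get(role) is not None:
        role = PARENT[role]
    return role

roles = ["Speaker", "Interlocutor_1", "Addressee"]
print(sorted({generalize(r) for r in roles}))  # ['Agent', 'Recipient']
```

Three distinct labels reduce to two here; applied across 440 labels, the same idea yields the 39% lexicon-size reduction reported above.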

After these operations were completed, we had a lexicon of 2,700 verb entries, each with an average of 3.4 usages, and we are using it to extend the TRIPS parsing lexicon. The TRIPS grammar and lexicon constitute a deep, robust parsing system originally developed at the University of Rochester (partly funded by ONR). It has been used in a variety of applications, including intelligent assistants, the Command Post of the Future system (DARPA-funded), and two tutorial dialogue systems.

The lexicon/grammar underlying the TRIPS system has been developed painstakingly by hand, over a number of years. One advantage of this mode of grammar development is that the resultant system has very high accuracy - if the parser returns a particular analysis for an input sentence, you can be reasonably sure that this analysis is not a spurious one. On the other hand, the major disadvantage of constructing lexicons and grammars by hand involves the problem of coverage - at the start of the TFlex project, the TRIPS lexicon contained just 400 verbs, less than one tenth of the total number in common use in contemporary English.

The main thrust of the TFlex project (at least from the Edinburgh perspective) has been to attempt to semi-automatically harvest verbs from existing lexical resources (first VerbNet and then FrameNet), in order to deliver an order-of-magnitude improvement in the coverage of the TRIPS lexicon, whilst maintaining both the high levels of accuracy and the "deepness" of the semantic analyses. In this we have had considerable success, adding around 2,000 brand new verbs from VerbNet. The extraction method we developed for FrameNet produced an additional 1,500 new verb entries, plus a number of entries for existing verbs that describe usages not previously defined in the lexicon. We are currently in the process of merging these entries.

We have completed the first step of merging, by constructing a mapping between the syntactic information available from FrameNet and that used by TRIPS (e.g. whether a given verb can take a clause as a complement, as "means" does in "This means that the path is closed"). The second part will involve aligning the TRIPS and FrameNet semantic ontologies. We developed a method to do that during the previous TFlex effort, which resulted in a tool for mapping between two semantic lexicons based on class representations (see Figure XXX). This stage requires manual intervention from trained linguists, since existing mapping methods result in XX accuracy (Crabbe et al., 2006). We have extended the tool to display additional information that can facilitate mappings, such as example sentences (which were not available from the lexicon we used previously). We are currently starting the process of manual alignment, after which the TRIPS lexicon extension will be complete.
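One kind of heuristic such a mapping tool can use to propose candidate class alignments for linguists to verify is lexical overlap. The class names and member sets below are hypothetical examples, not the actual TRIPS ontology or FrameNet frames:

```python
# Hypothetical sketch: rank candidate TRIPS/FrameNet class pairs by
# Jaccard overlap of their member verbs. A human annotator would then
# confirm or reject the top-ranked proposals.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

trips_classes = {"ONT::connect": {"connect", "attach", "join"}}
framenet_frames = {"Attaching": {"connect", "attach", "fasten"},
                   "Statement": {"say", "speak", "state"}}

for t_class, t_verbs in trips_classes.items():
    best = max(framenet_frames,
               key=lambda f: jaccard(t_verbs, framenet_frames[f]))
    print(t_class, "->", best)  # ONT::connect -> Attaching
```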

Our previous research concentrated on adding new entries for verbs. We have now extended our methods to cover other entries, specifically adjectives and nouns. We have extracted an initial lexicon, and are planning to evaluate whether methods for reducing redundancy developed for verb entries will perform sufficiently well for noun and adjective entries, after which the mapping process can be straightforwardly applied to further extend the lexicon.

Deep grammatical relations

The second thrust of our research involves investigating the extent to which one recent development in language technology, *dependency parsers*, can be used to improve the robustness of a language interpretation module. Dependency parsers aim to output a syntactic representation that is better suited to reasoning tasks, and they are now extensively used in information retrieval and biomedical text mining. They provide a higher level of robustness than deep parsers, at the expense of some depth of processing. We are investigating the potential of these "off-the-shelf", wide-coverage dependency parsers for educational applications, for example for interpreting student input to tutorial dialogue systems, or in essay grading (which involves longer texts for which deep parsers developed for dialogue are not well suited).
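For illustration, a dependency analysis of a student answer can be represented as (head, relation, dependent) triples of the kind an off-the-shelf parser outputs. The relation names below are generic examples, not any specific parser's inventory:

```python
# Illustration only: dependency triples for "the bulb is connected to
# the battery" (simplified), and how they yield features for a
# classifier even without a full logical form.
parse = [("connected", "nsubjpass", "bulb"),
         ("connected", "prep_to", "battery"),
         ("bulb", "det", "the")]

# Lexicalized head:relation:dependent strings are one simple way to
# turn such triples into features for a machine learning classifier.
features = [f"{h}:{rel}:{d}" for h, rel, d in parse]
print(features)
```

This is the sense in which dependency output trades depth for robustness: the triples are shallower than a deep semantic representation, but a parser can usually produce them even for fragmentary input.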

Our work to date has involved analysing the output of various dependency parsers and comparing it with the output of the TRIPS parser, assessing how successful each is at interpreting student input and connecting to a symbolic reasoning system. We started by drafting a list of desiderata for a system of 'deep' grammatical relations, which are as close to the level of semantic structure as it is possible to go while still remaining straightforwardly computable by a parser. We then used this proposal as the basis of our participation in the shared task at the Cross-Domain and Cross-Framework Parser Evaluation workshop at COLING. The aim of this task was to compare and contrast different systems of grammatical dependency annotation based on various practical tasks, in our case tutorial dialogue. We concluded that no one system provided for all of our desiderata, but every feature we desired was covered by at least one of the competing systems. The results of this investigation were published as McConville and Dzikovska (2008c).

This conclusion opens new avenues of future research, in particular combining the output of different systems to collect the range of features required for interpretation. To prepare for this, we created a gold-standard evaluation corpus for dependency parsers based on our deep grammatical dependency system and the dialogues between an automated tutorial system and human learners collected as part of the BEETLE project. We then conducted a preliminary evaluation of one of the available off-the-shelf dependency parsers based on how well it parsed this evaluation corpus, with initial results showing around 65% recall.
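The recall figure mentioned above is simply the fraction of gold-standard dependency relations that the parser also produced. A minimal sketch, with invented example triples:

```python
# Minimal sketch of dependency recall against a gold standard:
# |gold relations also predicted| / |gold relations|.
def recall(gold, predicted):
    gold, predicted = set(gold), set(predicted)
    return len(gold & predicted) / len(gold)

gold = {("is", "nsubj", "bulb"), ("is", "prep_in", "path"),
        ("bulb", "det", "a")}
pred = {("is", "nsubj", "bulb"), ("bulb", "det", "a"),
        ("path", "det", "a")}
print(round(recall(gold, pred), 2))  # 0.67
```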

Integration with CMU team

One aim of the joint work we are undertaking with the CMU team is to investigate to what extent the 'deep' output representation of the TRIPS parser can improve the kinds of classification technologies they are working on. As a preliminary step in this direction, we parsed one of the dialogues collected by the CMU team using TRIPS, and sent them a brief report listing the features inherent in the TRIPS output which we believe will be of use to them.


Plans for next reporting period

[Describe the objectives you intend to achieve and the approaches that will be taken during the next reporting period. Detail any changes from the original proposed work plan (500 words).]

The next steps are: (a) to improve the quality of the entries in our harvested lexicon, by identifying and eliminating spurious and/or redundant information; and (b) to align the syntactic and semantic category information in FrameNet with that in the TRIPS parser.

Major problems/issues

[Explain any problems that significantly affected the research plan or impacted expenditure rate during the current reporting period. Examples: late receipt of funds, loss of personnel, technique didn't work, etc. (250 words).]

The funds were received on January 15th, 2008.

Technology transfer

[Technology Transfer is an important measure of the impact of scientific and technical endeavors. ONR Program Officers use this information to highlight the technological payoffs that can emerge from investments in research. Describe any recent (last two years) direct or indirect interactions you have had with the Navy, other DoD services, Congress, the media or industrial scientists and engineers related to Technology Transfer. For example, describe interactions that resulted in transitioning knowledge, methodology, data, software, or any other developments produced or directly derived from your ONR support. Stress development paths of actual products including commercialization. Describe any R&D intellectual property transactions such as the licensing of patented technology or the establishment of Cooperative R&D Agreements (CDRAs) resulting from the ONR-funded project. If ONR-funded R&D has been successfully transitioned or leveraged to obtain funds from another source (e.g., DARPA, industry, NSF), please provide a brief description of the accomplishment. If Technology Transfer occurred without such interactions, please describe that as well. Describe any future plans you have for Technology Transfer of ONR-funded R&D. (500 words).]

Foreign collaborations and supported foreign nationals


Publications
  • Mark McConville and Myrosia O. Dzikovska (2008a). Using Inheritance and Coreness Sets to Improve a Wide-Coverage Verb Lexicon Harvested from FrameNet. Proceedings of the Second Linguistic Annotation Workshop (LAW'08), Marrakech, Morocco.
  • Mark McConville and Myrosia O. Dzikovska (2008b). Evaluating Complement-Modifier Distinctions in a Semantically Annotated Corpus. Proceedings of the Sixth Conference on Language Resources and Evaluation (LREC'08).
  • Mark McConville and Myrosia O. Dzikovska (2008c). 'Deep' Grammatical Relations for Semantic Interpretation. Proceedings of the Cross-Domain and Cross-Framework Parser Evaluation Workshop at the 22nd International Conference on Computational Linguistics (COLING'08).

-- MarkMcConville - 15 Jul 2008
