A potentially major issue arose in Senseval-2 concerning the quality of the WordNet sense inventory. WordNet was not designed to serve as a lexical resource, but its public availability and reasonable comprehensiveness were dominant factors in its selection as the lexical resource of choice. These same factors have now led to further funding by U.S. government agencies, and many improvements are currently underway. Among these improvements is a planned hand-tagging of the WordNet glosses with their WordNet senses. At the same time, sense-tagging of the glosses is being performed in the Extended WordNet project under development at the University of Texas at Dallas. The Extended WordNet project also transforms the WordNet glosses into a logical predicate form.
More generally, sense disambiguation of definitions in any lexical resource is an important objective in the language engineering community. The first significant disambiguation of dictionary definitions and creation of a hierarchy took place 25 years ago in the groundbreaking work of Robert Amsler. However, while substantial research has been performed on machine-readable dictionaries since that time, technology has not yet been developed to make systematic use of these resources. It seems appropriate for the lexical research community to take up a challenge of introspection, disambiguating dictionary definitions.
The Extended WordNet is used as a Core Knowledge Base for applications such as Question Answering, Information Retrieval, Information Extraction, Summarization, Natural Language Generation, Inferences, and other knowledge intensive applications. The glosses contain a part of the world knowledge since they define the most common concepts of the English language. In this project, many open-class words in WordNet glosses have been hand-tagged and provide an excellent source of data. The Senseval-3 task will be to replicate the hand-tagged results.
The Extended WordNet (XWN) project has disambiguated all glosses, combining human annotation and automated methods (see http://xwn.hlt.utdallas.edu/wsd.html for details). Word senses have been assigned to 630,599 open-class words. However, only 15,717 of these words (less than 2.5 percent) have been assigned senses manually. Because "gold" assignments have been given to more than one word in many of these glosses, the test set will consist of 9,257 glosses, distributed as follows:
Part of Speech | Gold Assignments | Synsets |
Adverb | 1833 | 1684 |
Adjective | 263 | 94 |
Noun | 11391 | 6706 |
Verb | 2230 | 773 |
Total | 15717 | 9257 |
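The totals in the table and the "less than 2.5 percent" figure can be checked with a few lines of arithmetic. The following is only a sketch; the variable names are illustrative, and the counts are copied from the table and the paragraph preceding it.

```python
# (gold assignments, synsets) per part of speech, copied from the table.
counts = {
    "Adverb": (1833, 1684),
    "Adjective": (263, 94),
    "Noun": (11391, 6706),
    "Verb": (2230, 773),
}
TOTAL_TAGGED = 630_599  # open-class words with any sense assignment

gold_total = sum(gold for gold, _ in counts.values())
synset_total = sum(synsets for _, synsets in counts.values())
gold_share = 100 * gold_total / TOTAL_TAGGED  # percentage tagged by hand

print(gold_total, synset_total, f"{gold_share:.2f}%")
```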
The disambiguations are available, and participants are welcome to investigate them, as well as to use the methods followed by the Extended WordNet team. However, participants should develop their own systems, for comparison with the XWN manual annotations.
Participants will be provided with all glosses from WordNet in which at least one open-class word has been given a "gold" quality assignment. Each gloss will identify a synset number, its part of speech, and the gloss itself. Glosses frequently include sample uses. These have not been parsed in the XWN project and will be absent from the trial and test data. (If you choose to work with the WordNet glosses directly, you need to remove these sample uses, or your test scores will be adversely affected.) Apart from the sample uses, the glosses will be provided exactly as they appear in the Extended WordNet files. In the following example, the sample use is the quoted phrase "in AD 200", which will not be present in either the trial or test data:
<gloss pos="ADV" synsetID="00001740"> <synonymSet>AD, A.D., anno_Domini</synonymSet> <text> in the Christian era; used before dates after the supposed year Christ was born; "in AD 200" </text> </gloss>
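For participants who work from the original WordNet glosses, reading a gloss element and removing its sample uses might be sketched as follows. This assumes that a sample use is always a double-quoted phrase, optionally preceded by a semicolon; that pattern fits the example above but has not been verified against the full XWN files. The function name `parse_gloss` and the returned field names are illustrative only.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = (
    '<gloss pos="ADV" synsetID="00001740"> '
    '<synonymSet>AD, A.D., anno_Domini</synonymSet> '
    '<text> in the Christian era; used before dates after the supposed '
    'year Christ was born; "in AD 200" </text> </gloss>'
)

# Assumption: a sample use is a double-quoted phrase, optionally led by ";".
SAMPLE_USE = re.compile(r'\s*;?\s*"[^"]*"')


def parse_gloss(xml_string):
    """Return the gloss attributes, synonyms, and definition without sample uses."""
    elem = ET.fromstring(xml_string)
    text = elem.findtext("text", default="")
    synonyms = elem.findtext("synonymSet", default="")
    return {
        "pos": elem.get("pos"),
        "synset_id": elem.get("synsetID"),
        "synonyms": [s.strip() for s in synonyms.split(",") if s.strip()],
        "definition": SAMPLE_USE.sub("", text).strip(),
    }


gloss = parse_gloss(SAMPLE)
print(gloss["definition"])
```

A regular expression is a deliberately simple choice here; glosses that use quotation marks for other purposes would need more careful handling.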
Trial data, consisting of 100 glosses from each part of speech (none of which contain "gold" quality assignments), are available to assist in working with the XWN files. Results will be evaluated using precision and recall as in the "all-words" task, with the XWN "gold" taggings as the "gold standard". Participants will return the gloss with each content word or phrase in the gloss marked with a WordNet sense number (following the format used in Senseval-2), for example:
ADV.00001740 christian_era%1:28:00::
ADV.00001740 used%3:00:00::
ADV.00001740 dates%1:28:03::
ADV.00001740 supposed%3:00:00::
ADV.00001740 year%1:28:01::
ADV.00001740 christ%1:18:00::
ADV.00001740 was%2:42:03::
ADV.00001740 born%3:00:00::
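As a rough illustration of the scoring, the sketch below reads answers in the format shown above and computes precision (correct answers over attempted answers) and recall (correct answers over all gold items). It assumes that only words with a "gold" assignment are scored and that each word occurs at most once per gloss; neither assumption is stated in these guidelines. The function names and the sample gold and system answers are invented for illustration, and this is not the official scorer.

```python
def parse_answers(text):
    """Map (gloss id, word) to a sense key from whitespace-separated answer pairs."""
    tokens = text.split()
    answers = {}
    for gloss_id, sense_key in zip(tokens[0::2], tokens[1::2]):
        word = sense_key.partition("%")[0]  # text before the first "%"
        answers[(gloss_id, word)] = sense_key
    return answers


def score(system, gold):
    """Return (precision, recall) over the gold-tagged items only."""
    attempted = [key for key in system if key in gold]
    correct = sum(1 for key in attempted if system[key] == gold[key])
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold items and system output, for illustration only.
gold = parse_answers(
    "ADV.00001740 year%1:28:01:: "
    "ADV.00001740 dates%1:28:03:: "
    "ADV.00001740 christ%1:18:00::"
)
system = parse_answers(
    "ADV.00001740 year%1:28:01:: "   # matches gold
    "ADV.00001740 dates%1:28:00:: "  # made-up wrong answer
    "ADV.00001740 used%3:00:00::"    # not a gold item, so not scored
)
precision, recall = score(system, gold)
print(precision, recall)
```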
This task is essentially identical to the Senseval-2 and Senseval-3 "all-words" tasks, except that there will be very little context and the gloss will not constitute a complete sentence. However, participants may consider using a synset's placement within WordNet (and all its relations) to assist in disambiguation. The XWN data also contains part of speech tags for each word in the glosses, as well as parses and logical forms, which participants may wish to use. Many glosses have hand-tagged words as well as words tagged by XWN systems. The senses assigned to other open-class words have a tag of "silver" or "normal". These other sense assignments will be used in constructing a voting system. Systems will then be evaluated against the sense selected by the voting system in order to provide additional insights into the performance of different systems.
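These guidelines do not specify how the voting system will combine sense assignments. One plausible reading, offered purely as an assumption, is a plain majority vote over the candidate senses proposed for a word (for example, an XWN "silver" or "normal" assignment together with the participating systems' answers), with no sense selected when the vote is tied. The function name `vote` is hypothetical.

```python
from collections import Counter


def vote(candidates):
    """Return the most frequent sense key, or None if there is no unique winner."""
    if not candidates:
        return None
    ranked = Counter(candidates).most_common(2)
    # A tie between the top two candidates yields no selected sense.
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Two sources agree on one sense key from the example above; the third
# key is a made-up dissenting answer.
winner = vote(["used%3:00:00::", "used%3:00:00::", "used%2:34:01::"])
print(winner)
```

Weighting the XWN assignments differently from system answers would be an equally reasonable design; the unweighted vote is used here only because it is the simplest to state.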
Trial data for this task is either WordNet 2.0 or XWN 2.01, available from the websites identified above. There will be no training data. The test data (i.e., each gloss along with its synset members, synset number, and part of speech, without sample uses) will be available on March 1, 2004. Since there is no training data for this task, participants have 14 days after download of the test data to make their submissions. This additional time is intended to ensure that there are no problems with the test data.
Please address any questions to Ken Litkowski. Since this task is new, these guidelines will be revised based on any comments or suggestions that are received.
Last revised 2/27/04.