Fact Extraction Track

Fact extraction from news reports.

This track addresses problems of fact extraction from text. In 2006 the concrete tasks were:

Extract all the named entities:
For each given text, a participating system must build a list of named entities.
For each entity the following information must be provided:
- a list of references to the mentions of the entity in the text (offsets and lengths in bytes)
- (optional) specify the type of the entity: person/organization/place-name/other
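
As a rough illustration of the entity output described above, the following sketch finds every mention of an entity and reports it as a byte offset and byte length. It assumes UTF-8 text, and the names `Mention`, `Entity`, and `find_mentions` are hypothetical; the actual encoding and result layout are defined by the track's format descriptions, not by this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

# Entity types named in the task description; the type field is optional.
ENTITY_TYPES = {"person", "organization", "place-name", "other"}


@dataclass
class Mention:
    offset: int  # byte offset from the beginning of the text
    length: int  # length of the mention in bytes


@dataclass
class Entity:
    mentions: List[Mention]
    type: Optional[str] = None  # one of ENTITY_TYPES, or None if not given


def find_mentions(text: str, surface: str, encoding: str = "utf-8") -> List[Mention]:
    """Return byte-level references to every occurrence of `surface`.

    Assumption: the text is encoded as `encoding` (UTF-8 here).
    """
    data = text.encode(encoding)
    needle = surface.encode(encoding)
    if not needle:
        return []
    mentions = []
    start = data.find(needle)
    while start != -1:
        mentions.append(Mention(offset=start, length=len(needle)))
        start = data.find(needle, start + len(needle))
    return mentions


# Usage: build one entity record from an exact-match search.
text = "Газпром купил актив. Газпром"
entity = Entity(mentions=find_mentions(text, "Газпром"), type="organization")
```

Byte offsets and character offsets differ whenever the text contains multi-byte characters, so the search is done on the encoded bytes rather than on the string.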
Extract facts of the following types:
- Who worked/works in the given organization?
- Where did/does the given person work?
- Who is the owner of the given organization?
- What companies did/does the given person/organization own?
Note: Company buyers, sellers, and shareholders are also accepted as owners.
Participants must process the whole collection without using the results of the named entity extraction.
A fact description must include the following information:
- fact type
- reference to the text fragment containing the fact description (offset and length; the fragment must be no longer than 500 bytes)
- two standardized names of the objects referenced in the fact
- reference to the entity in the text (offset from the beginning of the text fragment)
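
The fields listed above can be sketched as a small record with a sanity check for the 500-byte fragment limit. The names `Fact` and `validate_fact` are hypothetical, and the sketch assumes one entity offset per object; the authoritative layout is whatever the track's result format specifies.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAX_FRAGMENT_BYTES = 500  # limit stated in the task description


@dataclass(frozen=True)
class Fact:
    fact_type: str
    fragment_offset: int             # byte offset of the fragment in the text
    fragment_length: int             # byte length of the fragment
    names: Tuple[str, str]           # two standardized object names
    entity_offsets: Tuple[int, int]  # assumed: one offset per object,
                                     # counted from the fragment start


def validate_fact(fact: Fact, text_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the record looks sane."""
    errors = []
    if not fact.fact_type:
        errors.append("missing fact type")
    if fact.fragment_offset < 0 or fact.fragment_length <= 0:
        errors.append("fragment offset/length out of range")
    elif fact.fragment_offset + fact.fragment_length > text_bytes:
        errors.append("fragment runs past the end of the text")
    if fact.fragment_length > MAX_FRAGMENT_BYTES:
        errors.append("fragment longer than 500 bytes")
    if len(fact.names) != 2 or not all(fact.names):
        errors.append("two standardized names are required")
    for off in fact.entity_offsets:
        if not 0 <= off < fact.fragment_length:
            errors.append("entity offset outside the fragment")
    return errors
```

A system could run such a check over its output before submission and drop or repair any record for which the returned list is not empty.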

Participants may perform only the first task; the second one is optional.

Evaluation is carried out in two stages:

Proper nouns check
A random subset of the news reports in the collection is selected. Then we evaluate how well the participating systems extract the proper nouns found in this subset of news reports.
Instructions for assessors: Is the given string a proper noun in the context of the given text fragment? If yes, is it an organization, a person, or a place?
Possible answers: not a proper noun, organization, person, place, other proper noun
Facts check
A certain number of proper nouns are selected (the selection procedure is not yet defined, but will be discussed with the participants) and fact extraction for these objects is evaluated.
Instructions for assessors: Does the given text fragment describe a fact connecting the following objects: (A, B)? If yes, which fact type is it?
Possible answers: not a fact, purchase, selling, ownership, belonging, other.

Document collection: news collection
Standard metrics:
- precision
- recall
The metrics are calculated for generalized proper nouns, for each of the proper noun classes, and for the extracted facts.
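
For reference, a minimal sketch of how precision and recall could be computed overall and per class. It assumes that system output and assessor judgements are both reduced to dictionaries mapping an item key (here a hypothetical `(document id, offset, length)` tuple) to a class label; how judgements are actually matched to system output is not specified here.

```python
from typing import Dict, Hashable, Set, Tuple


def precision_recall(returned: Set[Hashable], relevant: Set[Hashable]) -> Tuple[float, float]:
    """Precision = correct returned / returned; recall = correct returned / relevant."""
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def per_class(returned: Dict[Hashable, str],
              gold: Dict[Hashable, str]) -> Dict[str, Tuple[float, float]]:
    """Compute (precision, recall) separately for every class label."""
    scores = {}
    for label in sorted(set(returned.values()) | set(gold.values())):
        r = {k for k, v in returned.items() if v == label}
        g = {k for k, v in gold.items() if v == label}
        scores[label] = precision_recall(r, g)
    return scores


# Hypothetical example data: keys are (document id, byte offset, byte length).
system = {("d1", 0, 14): "organization", ("d1", 20, 5): "person", ("d1", 40, 6): "person"}
judged = {("d1", 0, 14): "organization", ("d1", 20, 5): "person", ("d1", 60, 4): "person"}

overall = precision_recall(set(system), set(judged))  # class labels ignored
by_class = per_class(system, judged)
```

The same two functions apply to extracted facts if each fact is reduced to a hashable key, for example its fragment reference together with the two object names.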
Formats:
- collection
- results of
- expert judgements