Example

The example has four parts: the plain-text paragraph (with keyphrases highlighted for readability), the paragraph annotations visualised with brat, the stand-off keyphrase annotations based on character offsets, and the relation annotations.


Input: excerpt from a scientific publication


Information extraction is the process of extracting structured data from unstructured text, which is relevant for several end-to-end tasks, including question answering. This paper addresses the tasks of named entity recognition (NER), a subtask of information extraction, using conditional random fields (CRF). Our method is evaluated on the ConLL-2003 NER corpus.

Annotated paragraph visualised with brat




Subtask (A): Identification of keyphrases


ID Start End
0 0 22
1 150 168
2 204 228
3 230 233
4 249 271
5 279 304
6 306 309
7 343 364
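The offsets in the table can be checked against the input paragraph. The following minimal sketch assumes that offsets are zero-based character positions and that End is exclusive, which is consistent with the first row (0 to 22 covers the 22 characters of "Information extraction"). The names TEXT, KEYPHRASES and surface_forms are illustrative only and are not part of the task definition.

```python
# Excerpt copied verbatim from the input paragraph above.
# The offsets only line up if the string is reproduced character for character.
TEXT = (
    "Information extraction is the process of extracting structured data "
    "from unstructured text, which is relevant for several end-to-end tasks, "
    "including question answering. This paper addresses the tasks of named "
    "entity recognition (NER), a subtask of information extraction, using "
    "conditional random fields (CRF). Our method is evaluated on the "
    "ConLL-2003 NER corpus."
)

# (ID, Start, End) rows from the Subtask (A) table.
KEYPHRASES = [
    (0, 0, 22),
    (1, 150, 168),
    (2, 204, 228),
    (3, 230, 233),
    (4, 249, 271),
    (5, 279, 304),
    (6, 306, 309),
    (7, 343, 364),
]


def surface_forms(text, spans):
    """Map each keyphrase ID to the substring text[start:end] (End exclusive)."""
    return {kid: text[start:end] for kid, start, end in spans}


forms = surface_forms(TEXT, KEYPHRASES)
for kid, phrase in forms.items():
    print(kid, phrase)
```

Under these assumptions, ID 3 resolves to "NER" and ID 7 to "ConLL-2003 NER corpus".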


Subtask (B): Classification of identified keyphrases


ID Type
0 TASK
1 TASK
2 TASK
3 TASK
4 TASK
5 PROCESS
6 PROCESS
7 MATERIAL
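As an illustration of how the offsets and types might be combined into stand-off annotations, the sketch below writes one line per keyphrase in the general shape of brat text-bound annotations (identifier, tab, type and offsets, tab, covered text). The identifier scheme (T followed by ID + 1), the upper-case labels taken from the table, and the function name to_brat_lines are assumptions made for this example, not the official file format of the task.

```python
# Excerpt copied verbatim from the input paragraph above.
TEXT = (
    "Information extraction is the process of extracting structured data "
    "from unstructured text, which is relevant for several end-to-end tasks, "
    "including question answering. This paper addresses the tasks of named "
    "entity recognition (NER), a subtask of information extraction, using "
    "conditional random fields (CRF). Our method is evaluated on the "
    "ConLL-2003 NER corpus."
)

# (ID, Start, End) rows from the Subtask (A) table.
SPANS = [
    (0, 0, 22), (1, 150, 168), (2, 204, 228), (3, 230, 233),
    (4, 249, 271), (5, 279, 304), (6, 306, 309), (7, 343, 364),
]

# ID -> Type rows from the Subtask (B) table.
TYPES = {
    0: "TASK", 1: "TASK", 2: "TASK", 3: "TASK", 4: "TASK",
    5: "PROCESS", 6: "PROCESS", 7: "MATERIAL",
}


def to_brat_lines(text, spans, types):
    """Build brat-style text-bound annotation lines.

    The T-identifier is ID + 1, which is an assumption of this sketch.
    """
    lines = []
    for kid, start, end in spans:
        covered = text[start:end]  # End offset is treated as exclusive
        lines.append(f"T{kid + 1}\t{types[kid]} {start} {end}\t{covered}")
    return lines


lines = to_brat_lines(TEXT, SPANS, TYPES)
print("\n".join(lines))
```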


Subtask (C): Extraction of relationships between two identified keyphrases


ID1 ID2 Type
2 3 SYNONYM-OF
3 4 HYPONYM-OF
5 6 SYNONYM-OF
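To make the relation table easier to read, the sketch below replaces each ID with the keyphrase it refers to. It assumes that a row is read as "ID1 Type ID2", so that the second row says that NER is a hyponym of information extraction, in line with the excerpt calling NER "a subtask of information extraction". The names PHRASES, RELATIONS and describe are illustrative.

```python
# Surface forms of the keyphrases that take part in a relation,
# obtained by slicing the input paragraph at the Subtask (A) offsets.
PHRASES = {
    2: "named entity recognition",
    3: "NER",
    4: "information extraction",
    5: "conditional random fields",
    6: "CRF",
}

# (ID1, ID2, Type) rows from the Subtask (C) table.
RELATIONS = [
    (2, 3, "SYNONYM-OF"),
    (3, 4, "HYPONYM-OF"),
    (5, 6, "SYNONYM-OF"),
]


def describe(relations, phrases):
    """Render each relation as 'phrase1 TYPE phrase2' (assumed reading order)."""
    return [f"{phrases[a]} {label} {phrases[b]}" for a, b, label in relations]


readable = describe(RELATIONS, PHRASES)
for line in readable:
    print(line)
```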