Example
This example has four parts: the plain-text paragraph (with keyphrases set in a distinct typeface for readability), the paragraph annotations visualised with brat, the stand-off keyphrase annotations based on character offsets, and the relation annotations. Offsets are zero-based character positions in the paragraph, and each end offset is exclusive.
Input: excerpt from a scientific publication
Information extraction is the process of extracting structured data from unstructured text, which is relevant for several end-to-end tasks, including question answering. This paper addresses the tasks of named entity recognition (NER), a subtask of information extraction, using conditional random fields (CRF). Our method is evaluated on the ConLL-2003 NER corpus.
Annotated paragraph visualised with brat
Subtask (A): Identification of keyphrases
| ID | Start | End |
|----|-------|-----|
| 0 | 0 | 22 |
| 1 | 150 | 168 |
| 2 | 204 | 228 |
| 3 | 230 | 233 |
| 4 | 249 | 271 |
| 5 | 279 | 304 |
| 6 | 306 | 309 |
| 7 | 343 | 364 |
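To make the offset convention concrete, here is a minimal sketch that recovers each keyphrase from the paragraph by slicing. It assumes the paragraph is stored exactly as printed above, with single spaces and no trailing newline; the names `TEXT`, `SPANS` and `surface` are illustrative and not part of any task tooling.

```python
# Paragraph copied verbatim from the input excerpt above.
# The offsets only line up if the string is reproduced character for character.
TEXT = (
    "Information extraction is the process of extracting structured data "
    "from unstructured text, which is relevant for several end-to-end tasks, "
    "including question answering. This paper addresses the tasks of named "
    "entity recognition (NER), a subtask of information extraction, using "
    "conditional random fields (CRF). Our method is evaluated on the "
    "ConLL-2003 NER corpus."
)

# Keyphrase ID -> (start, end), taken from the Subtask (A) table.
SPANS = {
    0: (0, 22),
    1: (150, 168),
    2: (204, 228),
    3: (230, 233),
    4: (249, 271),
    5: (279, 304),
    6: (306, 309),
    7: (343, 364),
}

# Start is inclusive and end is exclusive, which is what Python slicing expects.
surface = {kid: TEXT[start:end] for kid, (start, end) in SPANS.items()}

for kid, phrase in surface.items():
    print(kid, phrase)
# 0 Information extraction
# 1 question answering
# 2 named entity recognition
# 3 NER
# 4 information extraction
# 5 conditional random fields
# 6 CRF
# 7 ConLL-2003 NER corpus
```

Because the annotations point into the raw string, any change to the paragraph (an extra space, a corrected spelling of different length) shifts every later offset.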
Subtask (B): Classification of identified keyphrases
| ID | Type |
|----|------|
| 0 | TASK |
| 1 | TASK |
| 2 | TASK |
| 3 | TASK |
| 4 | TASK |
| 5 | PROCESS |
| 6 | PROCESS |
| 7 | MATERIAL |
Subtask (C): Extraction of relationship between two identified keyphrases
| ID1 | ID2 | Type |
|-----|-----|------|
| 2 | 3 | SYNONYM-OF |
| 3 | 4 | HYPONYM-OF |
| 5 | 6 | SYNONYM-OF |
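The three tables are linked only through the keyphrase IDs. As a rough sketch of how they fit together, the snippet below joins the Subtask (B) types onto the Subtask (C) relations and rejects a relation that refers to an unknown ID. The function name `describe_relations` and the output layout are assumptions made for illustration, not a format defined by the task.

```python
# Keyphrase ID -> type, taken from the Subtask (B) table.
KEYPHRASE_TYPES = {
    0: "TASK",
    1: "TASK",
    2: "TASK",
    3: "TASK",
    4: "TASK",
    5: "PROCESS",
    6: "PROCESS",
    7: "MATERIAL",
}

# (ID1, ID2, relation type), taken from the Subtask (C) table.
RELATIONS = [
    (2, 3, "SYNONYM-OF"),
    (3, 4, "HYPONYM-OF"),
    (5, 6, "SYNONYM-OF"),
]


def describe_relations(relations, types):
    """Return one readable line per relation, with each ID's type attached."""
    lines = []
    for id1, id2, rel_type in relations:
        # A relation is only meaningful if both arguments are known keyphrases.
        for kid in (id1, id2):
            if kid not in types:
                raise KeyError(f"relation refers to unknown keyphrase ID {kid}")
        lines.append(f"{id1} [{types[id1]}] {rel_type} {id2} [{types[id2]}]")
    return lines


for line in describe_relations(RELATIONS, KEYPHRASE_TYPES):
    print(line)
# 2 [TASK] SYNONYM-OF 3 [TASK]
# 3 [TASK] HYPONYM-OF 4 [TASK]
# 5 [PROCESS] SYNONYM-OF 6 [PROCESS]
```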