Task

Given a novel, the task is to segment it into coherent sections, specifically scenes, as defined in our EACL paper.
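To make the shape of the task concrete, here is a minimal sketch of one way a segmentation could be represented: a list of character offsets at which a new scene begins, converted into (start, end) spans. This representation, the function name `boundaries_to_scenes`, and the sample text are assumptions for illustration only; the actual annotation format is the one shown in the trial data.

```python
from typing import List, Tuple


def boundaries_to_scenes(text_length: int, boundaries: List[int]) -> List[Tuple[int, int]]:
    """Turn interior boundary offsets into end-exclusive (start, end) scene spans.

    Hypothetical helper: the offset-based representation is an assumption,
    not the format used by the shared task data.
    """
    # Keep only offsets strictly inside the text, deduplicated and sorted.
    cuts = sorted({b for b in boundaries if 0 < b < text_length})
    edges = [0] + cuts + [text_length]
    # Each consecutive pair of edges delimits one scene.
    return list(zip(edges[:-1], edges[1:]))


# Invented sample text with one scene change before "Next morning".
novel = "He rode into town. The saloon was empty. Next morning, the sheriff arrived."
scenes = boundaries_to_scenes(len(novel), [41])
for start, end in scenes:
    print(repr(novel[start:end]))
```

A text with no predicted boundaries yields a single span covering the whole novel, so every character always belongs to exactly one scene.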

Data

The data is split into three sets:

Trial Data: The initial trial data, which is now available, consists of one annotated text and is intended to show the format and properties of the data.
Training Data: The training data, which is now available, consists of 22 annotated dime novels, including the 15 that were part of the EACL publication.
Test Data:
- The test data for task 1 (in-domain scene segmentation) consists of 5 annotated dime novels.
- The test data for task 2 (out-of-domain scene segmentation) consists of 2 annotated high-literature texts.

To obtain the training data, register as a participant and we will send it to you.

The test data will not be publicly released during the shared task. Instead, you submit your code to us in a Docker container, and we evaluate it on the test data.