
Event coreference resolution with GPT
Possible course project for incoming students to explore
Introduction
Event coreference resolution (ECR) is the task of finding mentions of the same event within a single document (within-document coreference resolution, or WDCR) or across documents (cross-document coreference resolution, or CDCR). It is used for knowledge graph construction, event salience detection, and question answering.
Consider the following examples with marked event triggers (the word or phrase referring to the event):
The task of ECR is to predict that E1 and E3 are coreferring events, and that E2 is a related but different event.
ECR is typically tackled by making pairwise predictions on mention pairs and then clustering the mentions based on those predictions. The pairwise scorer is trained using a joint feature space of the two mentions. The goal of this project is to probe GPT-4 by using it as the pairwise scorer, which involves engineering various strategies for in-context learning.
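The pairwise-then-cluster pipeline can be sketched roughly as follows. This is only an illustration under several assumptions, not the method used in the repository: the few-shot demonstrations, the `<m>...</m>` trigger markup, the prompt wording, the mention IDs and sentences, and the names `build_prompt`, `cluster_mentions` and `toy_ask` are all hypothetical. The scorer is passed in as a plain function so that a real GPT-4 call can be substituted later; here a toy stand-in is used so the sketch runs offline. Clustering is done by merging every pair the scorer answers "yes" to (connected components), which is one simple choice among several.

```python
import re
from itertools import combinations
from typing import Callable, Dict, List

# Hypothetical in-context demonstrations; real ones would come from a training split.
FEW_SHOT = [
    ("The quake <m>struck</m> the coast on Monday.",
     "Monday's <m>earthquake</m> damaged the port.", "yes"),
    ("The quake <m>struck</m> the coast on Monday.",
     "Rescuers <m>arrived</m> a day later.", "no"),
]


def build_prompt(sent_a: str, sent_b: str) -> str:
    """Assemble a few-shot prompt asking whether two marked triggers corefer."""
    lines = ["Do the events marked with <m>...</m> refer to the same event? "
             "Answer yes or no.", ""]
    for a, b, label in FEW_SHOT:
        lines += [f"Sentence 1: {a}", f"Sentence 2: {b}", f"Answer: {label}", ""]
    lines += [f"Sentence 1: {sent_a}", f"Sentence 2: {sent_b}", "Answer:"]
    return "\n".join(lines)


def cluster_mentions(mentions: Dict[str, str],
                     ask: Callable[[str], str]) -> List[List[str]]:
    """Score every mention pair with `ask`, then merge the positive pairs."""
    parent = {m: m for m in mentions}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(mentions, 2):
        answer = ask(build_prompt(mentions[a], mentions[b]))
        if answer.strip().lower().startswith("yes"):
            parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return sorted(clusters.values())


def toy_ask(prompt: str) -> str:
    """Stand-in for a GPT-4 call: says 'yes' if the two query triggers match."""
    query = prompt.split("\n\n")[-1]  # the final block holds the pair being asked about
    first, second = re.findall(r"<m>(.*?)</m>", query)
    return "yes" if first.lower() == second.lower() else "no"


# Hypothetical mentions, invented for this sketch.
mentions = {
    "m1": "The company <m>acquired</m> the startup in May.",
    "m2": "Shareholders <m>approved</m> the deal.",
    "m3": "The startup was <m>acquired</m> for an undisclosed sum.",
}
print(cluster_mentions(mentions, toy_ask))  # [['m1', 'm3'], ['m2']]
```

To probe GPT-4, `toy_ask` would be replaced by a function that sends the prompt to the model and returns its text reply. Note that scoring all pairs needs a number of calls that grows quadratically with the number of mentions, so some form of candidate pruning is likely needed on a full dataset.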
Motivation
GPT seems to be really good at this task! Check out the examples below.


While it seems to do well on examples where fine-tuned BERT-based models fail, it will be interesting to see how well it does on a complete dataset containing all kinds of examples.
References
GitHub Repo
https://github.com/ahmeshaf/lemma_ce_coref
You can fork this repository for this project.
Required Skills
- Python: experience with PyTorch, Hugging Face, the OpenAI GPT API, and spaCy
- writing: LaTeX
- git: Ability to collaborate and maintain a code repository
Deadlines
A. November 15, 2023
B. December 15, 2023