Abstract
This article presents our approach to the Style Change Detection task at PAN 2022 using discourse markers. Discourse markers (such as ‘what’, ‘I have’, etc.) are words or expressions used to connect, organise and manage conversations. We present two different approaches. For Task 1 (Style Change Basic), our approach identifies conversational patterns within each document between a user and a possible respondent; then, using classification algorithms, we predict the point at which the style changes within each document. For Task 2 (Style Change Advanced) and Task 3 (Style Change Real World), we use an extensive list of frequently occurring discourse markers to estimate the number of speakers, which we take as the number of authors of the document. This prediction serves as the number of clusters for the text segments within the document. Subsequently, using unsupervised clustering, we group similar text segments so that each cluster comprises the text segments attributed to one author. The resulting F1 scores for our approaches on the test set are: 0.70518 for Task 1, 0.32128 for Task 2 and 0.56360 for Task 3.
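The clustering step described for Tasks 2 and 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the marker list `MARKERS`, the cosine similarity measure, and the average-linkage agglomerative procedure are all assumptions chosen for illustration, and the number of clusters `k` is passed in directly because the abstract does not say how marker counts are mapped to a speaker count.

```python
import math
import re

# Hypothetical marker list; the paper's actual list is not given in the abstract.
MARKERS = ["i have", "what", "well", "so", "but", "you know"]


def marker_vector(segment):
    """Count whole-word occurrences of each marker in a text segment."""
    text = segment.lower()
    return [len(re.findall(r"\b" + re.escape(m) + r"\b", text)) for m in MARKERS]


def cosine(u, v):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_segments(segments, k):
    """Group segments into k clusters by average-linkage agglomeration.

    k stands in for the predicted number of authors (assumed given here).
    Returns one cluster label per segment.
    """
    vecs = [marker_vector(s) for s in segments]
    clusters = [[i] for i in range(len(segments))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                sim = sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge the most similar pair
    labels = [0] * len(segments)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


segments = [
    "Well, I have a question. What do you think?",
    "So it works, but you know it is slow.",
    "What I have found is, well, surprising.",
    "But so far, you know, nothing changed.",
]
labels = cluster_segments(segments, k=2)
```

In this toy input, segments that share the same markers end up with the same label. A real system would need richer features than raw marker counts; the sketch only shows how a predicted author count can drive the number of clusters.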
| Original language | English |
|---|---|
| Pages (from-to) | 2375-2380 |
| Number of pages | 6 |
| Journal | CEUR Workshop Proceedings |
| Volume | 3180 |
| State | Published - 2022 |
Bibliographical note
Publisher Copyright: © 2022 Copyright for this paper by its authors.
Keywords
- Classification
- Clustering
- Conversational Patterns
- Discourse Markers
- Style Change Detection
ASJC Scopus subject areas
- General Computer Science