SATLab at SemEval-2022 Task 4: Trying to detect patronizing and condescending language with only character and word n-grams

Bestgen, Yves

(2022) 16th International Workshop on Semantic Evaluation (SemEval-2022)

Files

acl_latex.pdf (Open Access, Adobe PDF, 426.95 KB)

Details

Authors

Bestgen, Yves (UCLouvain), Author

Abstract

A logistic regression model fed only with character and word n-grams is proposed for the SemEval-2022 Task 4 on Patronizing and Condescending Language Detection (PCL). It obtained an average level of performance, well above that of a system that tries to guess without using any knowledge about the task, but much lower than that of the best teams. To facilitate the interpretation of the performance scores, which are expressed as F1, the best level of performance achievable by a system that tries to guess without using any knowledge about the task is calculated and used to correct the F1 scores in the manner of a Kappa. As the proposed model is very similar to one that performed well on a task requiring the automatic identification of hate speech and offensive content, this paper confirms the difficulty of PCL detection.
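
The Kappa-style correction mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes that the best uninformed strategy is to label every instance as positive (so precision equals the positive-class prevalence and recall equals 1) and that the correction has the usual Kappa form, (observed - chance) / (1 - chance). The function names and the example values are hypothetical.

```python
def f1_always_positive(prevalence: float) -> float:
    """F1 of a system that labels every instance positive.

    Assumption (not from the paper): this is taken as the best
    uninformed guess. Precision = prevalence, recall = 1.
    """
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    return 2.0 * prevalence / (1.0 + prevalence)


def kappa_style_f1(f1: float, f1_guess: float) -> float:
    """Rescale F1 so that the guessing baseline maps to 0 and a perfect score to 1."""
    if f1_guess >= 1.0:
        raise ValueError("guessing baseline must be below 1")
    return (f1 - f1_guess) / (1.0 - f1_guess)


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    baseline = f1_always_positive(0.10)
    corrected = kappa_style_f1(0.50, baseline)
    print(f"baseline F1 = {baseline:.4f}, corrected F1 = {corrected:.4f}")
```

Under these assumptions, a system that scores exactly at the guessing baseline receives a corrected score of 0, which makes scores easier to compare across datasets with different class balances.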

Affiliations

UCLouvain, SSH/IPSY - Psychological Sciences Research Institute

Citations

Bestgen, Y. (2022). SATLab at SemEval-2022 Task 4: Trying to detect patronizing and condescending language with only character and word n-grams. Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). https://hdl.handle.net/2078.5/222251