SATLab at SemEval-2022 Task 4: Trying to detect patronizing and condescending language with only character and word n-grams

(2022) 16th International Workshop on Semantic Evaluation (SemEval-2022)

Abstract
A logistic regression model fed only with character and word n-grams is proposed for the SemEval-2022 Task 4 on Patronizing and Condescending Language Detection (PCL). It obtained an average level of performance, well above that of a system that tries to guess without using any knowledge about the task, but much lower than that of the best teams. To facilitate the interpretation of the performance scores, which are expressed with the F1 measure, the best level of performance of a system that tries to guess without using any knowledge is calculated and used to correct the F1 scores in the manner of a Kappa. As the proposed model is very similar to the one that performed well on a task requiring the automatic identification of hate speech and offensive content, this paper confirms the difficulty of PCL detection.
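The Kappa-style correction mentioned in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: that the best uninformed strategy is to label every instance as positive (precision then equals the positive-class prevalence and recall equals 1), and that the correction takes the usual Kappa form, (observed - chance) / (1 - chance). The function names and the numbers in the example are hypothetical and are not results from the paper.

```python
def best_uninformed_f1(prevalence: float) -> float:
    """F1 obtained by labelling every instance as positive.

    Assumption: this is the best score reachable without any knowledge
    of the task. Precision equals the prevalence and recall equals 1.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return 2.0 * prevalence / (1.0 + prevalence)


def chance_corrected_f1(f1: float, prevalence: float) -> float:
    """Rescale F1 in the manner of a Kappa (assumed form of the correction).

    0 means no better than the uninformed baseline, 1 means a perfect score.
    """
    baseline = best_uninformed_f1(prevalence)
    return (f1 - baseline) / (1.0 - baseline)


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the arithmetic.
    prevalence = 0.10
    observed_f1 = 0.50
    print(round(best_uninformed_f1(prevalence), 4))
    print(round(chance_corrected_f1(observed_f1, prevalence), 4))
```

Under these assumptions, a system scoring exactly at the uninformed baseline gets a corrected score of 0, and the rarer the positive class, the lower that baseline is.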
Citations

Bestgen, Y. (2022). SATLab at SemEval-2022 Task 4: Trying to detect patronizing and condescending language with only character and word n-grams. Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). https://hdl.handle.net/2078.5/222251