Detecting opinion in news. An automated analysis of linguistic subjectivity in French-language press articles

(2022) 4th Biennial Conference of the Brussels Institute for Journalism Studies (BIJU) - A true and fair view. Location: Brussels, Belgium (8 December 2022)

Details

Abstract
To better understand the link between the increasing polarization of opinions on social media and the subjectivity of the press, we aim to develop an algorithm that automatically evaluates how subjective French-language press articles are. Our research is set at the crossroads of journalism studies, linguistics, and artificial intelligence. We investigate the differences and overlaps between the notions of journalistic and linguistic subjectivity and review the state of the art in opinion classification of press articles. We define the subjectivity of a press article as the extent to which the textual content of the article is influenced by the personal opinions of its author. In practice, we present the results of three experiments on linguistic subjectivity. These experiments were conducted using the recently released open-source RTBF Corpus, which contains over 750,000 press articles published by the Belgian French-language public service media. First, using statistical models and a sample of 10,000 news articles and opinion pieces, we identify the 18 most significant linguistic features for distinguishing opinionated from non-opinionated press articles, and obtain a model with 89% accuracy. Second, we fine-tune a transformer-based CamemBERT model on the same opinion vs. information task. It reaches a classification accuracy of 97%, at the cost of much lower explainability and higher computational requirements. Using different model explanation methods, we explore the possibility of extracting linguistic patterns from the transformer-based model and inserting them into our rule-based linguistic model. Finally, we conducted an annotation experiment in which 30 journalism students were asked to highlight “subjective elements” in 150 different press articles from the RTBF Corpus. The results of this annotation are analyzed, and the most frequently highlighted tokens are compared with those on which the rule-based linguistic model and the transformer-based model rely for opinion classification. Our findings contribute to a better understanding of what influences the subjectivity of press articles, drawing on journalistic and linguistic theory, deep learning, and human judgment.
Citations

Escouflaire, L., Descampe, A., & Fairon, C. (2022). Detecting opinion in news. An automated analysis of linguistic subjectivity in French-language press articles. 4th Biennial Conference of the Brussels Institute for Journalism Studies (BIJU) - A true and fair view, Brussels, Belgium. https://hdl.handle.net/2078.5/269117