The many faces of subjectivity in journalism: Multidisciplinary discourse analysis using linguistics and machine learning

(2024) ECREA 2024 - European Communication Research and Education Association Conference. Location: Ljubljana, Slovenia (24 September 2024)

Details

Abstract
To mitigate the inherent subjectivity of the news-making process, journalists use several writing techniques, in accordance with what Tuchman (1972) refers to as the “strategic ritual of objectivity”. This ritual is realized through a range of neutralizing mechanisms designed to mask the journalist’s personal opinions in the text (Koren, 2004). In the digital era, measuring how much a press article is influenced by its author’s personal opinions is an important matter (Levy, 2021): the dynamics of subjectivity in press discourse not only affect the credibility and trustworthiness of news sources but also have far-reaching implications for media literacy, shaping how readers interpret and engage with the information they encounter (Ku et al., 2019). Distinguishing facts from opinions online is becoming more complex amid the information disorder induced by the growing presence of AI-generated articles, fake news, and polarized content on social media. We combine several methods to increase knowledge of the mechanisms of subjectivity in press discourse and to improve automated tools for news vs. opinion text classification. This research sits at the crossroads of journalism studies and natural language processing, and focuses on the French language. Our corpus consists of 80,000 articles identified by their authors as news or opinion pieces and published by four Belgian and four Canadian media outlets. The news and opinion tags are used as ground-truth categories for objective vs. subjective text classification. First, we review the state of the art in opinion classification with linguistic methods. Then, using statistical models for text classification, we measure the predictive power of 30 state-of-the-art linguistic features of subjectivity for identifying news and opinion articles.
We find that some features, such as the overall concreteness of the text or the ratio of negations, have more weight than others in predicting the class of an article. In parallel, we fine-tune the transformer model CamemBERT (Martin et al., 2019), pre-trained on French data, to classify news vs. opinion articles. This model is more accurate than the statistical feature-based model, but its overall computational cost is higher. Using attention-based explainability methods (Chefer et al., 2021), we explore which textual elements have the most influence on the transformer model’s decisions. Among other features, discourse markers and deictic (context-dependent) words are elements to which this large language model pays considerable attention in our classification task. The observations from these experiments are then compared with the results of a qualitative experiment involving readers tasked with highlighting markers of subjectivity in press articles (Escouflaire et al., 2024), and with the views of Belgian and Canadian journalists on objective and subjective writing, gathered through sixteen semi-structured interviews. Our findings contribute to a better understanding of the many ways in which subjectivity may be constructed and perceived at the textual level in French-language journalistic discourse.
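As an illustration of what a surface feature such as the ratio of negations might look like, the following is a minimal sketch only. The abstract does not specify how the feature was computed, so the marker list `NEGATION_MARKERS`, the tokenization, and the function name `negation_ratio` are all assumptions made for this example, not the authors' implementation.

```python
import re

# Hypothetical, deliberately small set of French negation markers.
# The study's actual feature definition is not given in the abstract.
NEGATION_MARKERS = {"ne", "n", "pas", "jamais", "rien", "aucun", "aucune", "ni"}

# Runs of Unicode letters; an elided form such as "n'a" splits into "n" and "a".
TOKEN_RE = re.compile(r"[^\W\d_]+")


def negation_ratio(text: str) -> float:
    """Return the share of tokens that are negation markers (0.0 for empty text)."""
    tokens = TOKEN_RE.findall(text.lower())
    if not tokens:
        return 0.0
    negations = sum(1 for token in tokens if token in NEGATION_MARKERS)
    return negations / len(tokens)


if __name__ == "__main__":
    print(negation_ratio("Le ministre n'a pas répondu."))
```

A lexicon lookup of this kind is crude (for instance, "pas" can also be a noun), so a feature used in practice would likely rely on part-of-speech tagging or parsing. A value like this would be one column among the feature set passed to a statistical classifier.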
Citations

Escouflaire, L., Descampe, A., & Fairon, C. (2024). The many faces of subjectivity in journalism: Multidisciplinary discourse analysis using linguistics and machine learning. ECREA 2024 - European Communication Research and Education Association Conference, Ljubljana, Slovenia. https://hdl.handle.net/2078.5/269181