Enhancing Propaganda Detection with Open Source Language Models: A Comparative Study

Authors

  • Oleksandr Lytvyn, Comenius University Bratislava

Abstract

Research Objective

This study leverages the open-source Mistral 7B model, run locally via the Ollama framework, to enhance the detection of propaganda techniques in text. Mistral, a general-purpose large language model developed by the French company Mistral AI, is compared against high-performing proprietary models such as GPT-4 to evaluate its effectiveness.

Methodology

The research uses the SemEval-2020 Task 11 dataset, which consists of news articles labeled for propaganda techniques at the fragment level, supporting the training and evaluation of models that identify propaganda in text. Ollama, an open-source platform, runs Large Language Models (LLMs) within a local computing environment. Three experimental setups of Mistral were tested: (1) the base Mistral model (out of the box), (2) Mistral modified with a Modelfile, and (3) Mistral integrated with LangChain and the all-MiniLM-L6-v2 embedding model.
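To illustrate setup (1), the base model can be queried through Ollama's Python client roughly as follows; the prompt wording and example fragment are assumptions, not the study's exact prompt.

    # Sketch of setup (1): querying the base Mistral model via Ollama's
    # Python client. Assumes `ollama pull mistral` has been run locally.
    import ollama

    fragment = "Our glorious leader will crush the treacherous enemies of the people."
    response = ollama.chat(
        model="mistral",
        messages=[{
            "role": "user",
            "content": "Identify any propaganda techniques in the following "
                       "text and explain briefly:\n\n" + fragment,
        }],
    )
    print(response["message"]["content"])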

A Modelfile stores the data and settings the Large Language Model (LLM) needs to interpret new information and make predictions on it; it also defines the model's behavior (e.g., temperature) and a system prompt. LangChain, in turn, enhances LLMs such as ChatGPT or Mistral without altering their weights, eliminating the need for fine-tuning and retraining: it lets the model draw on external documents and local files for contextual tasks, offering a cost-effective way to improve performance through additional contextual information.
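For illustration, a minimal Modelfile for setup (2) might look like the following; the parameter value and system prompt are assumptions, not the study's exact configuration.

    # Hypothetical Modelfile for setup (2)
    FROM mistral
    # Lower temperature for more deterministic classification output
    PARAMETER temperature 0.2
    SYSTEM """You are an expert in propaganda analysis. Given a text fragment,
    name any propaganda techniques it uses and justify each briefly."""

Setup (3) can be sketched with LangChain's community integrations; the reference texts and the retrieval-chain wiring below are illustrative assumptions.

    # Sketch of setup (3): Mistral via LangChain with all-MiniLM-L6-v2
    # embeddings; retrieved snippets supply extra context at query time.
    from langchain_community.llms import Ollama
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import FAISS
    from langchain.chains import RetrievalQA

    # Embed reference material (e.g., technique definitions) locally.
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    docs = [
        "Name calling: labeling the target as something the audience fears or hates.",
        "Loaded language: emotionally charged words used to influence the audience.",
    ]
    store = FAISS.from_texts(docs, embeddings)

    qa = RetrievalQA.from_chain_type(
        llm=Ollama(model="mistral"),
        retriever=store.as_retriever(),
    )
    query = "Which technique does this fragment use: 'Only a traitor would oppose this bill.'"
    print(qa.invoke({"query": query})["result"])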

Findings

Preliminary results indicate that the Modelfile configuration improves performance, with better recall and a more balanced F1 score than both the base model and the LangChain-integrated model. The LangChain integration, in turn, approaches the precision of GPT-4 and exceeds that of fine-tuned GPT-3 models. Each model analyzes the labeled articles and returns predictions with explanations, while an evaluator captures the replies to compute the metrics.
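A minimal sketch of such an evaluator follows, under the assumption of binary per-fragment labels and a naive reply-parsing rule; the study's actual evaluator may differ.

    # Hypothetical evaluator: compares model replies with gold labels and
    # computes precision, recall, and F1. The parsing rule is an assumption.
    from sklearn.metrics import precision_recall_fscore_support

    gold = [1, 0, 1, 1, 0]  # 1 = fragment contains propaganda
    replies = ["Yes, loaded language.", "No.", "Yes, name calling.", "No.", "No."]
    pred = [1 if r.strip().lower().startswith("yes") else 0 for r in replies]

    precision, recall, f1, _ = precision_recall_fscore_support(
        gold, pred, average="binary"
    )
    print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")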

Significance

This investigation demonstrates the potential of using large language models and open-source software to detect complex propaganda techniques, emphasizing the feasibility of advanced AI research with minimal computational resources.

Implications for Practice

The approach offers a transparent and economical way to run large language models privately, potentially democratizing access to state-of-the-art AI tools and encouraging broader adoption of and innovation in AI technology.

Interdisciplinary Contribution

This work merges computational linguistics, computer science, and media studies to tackle social science challenges using advanced NLP technologies. It provides valuable insights into the cognitive processes involved in media consumption and the reception of propaganda, illustrating a comprehensive approach to studying the societal impact of language models.

References

[1] Giovanni Da San Martino et al., "SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles," in Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval-2020), 2020.

[2] Kilian Sprenkamp et al., "Large Language Models for Propaganda Detection," in Proceedings of the 2023 5th International Conference on Computational Intelligence and Networks, 2023.

Published

2024-06-10