• 1Tsinghua University
  • 2Shanghai Jiao Tong University
  • 3Stanford University
  • 4Nanyang Technological University

  • Equal Second Author Contribution
  • Equal Advising
  • *Corresponding Authors

A Quick Glance

overview

 LLMs can be easily misinformed in an interactive persuasive conversation! 

Paper Overview

Large Language Models (LLMs) encapsulate vast amounts of knowledge but still remain vulnerable to external misinformation. Existing research mainly studied this susceptibility behavior in a single-turn setting. However, belief can change during a multi-turn conversation, especially a persuasive one. Therefore, in this study, we delve into LLMs' susceptibility to persuasive conversations, particularly on factual questions that they can answer correctly. We first curate the Farm (i.e., Fact to Misinform) dataset, which contains factual questions paired with systematically generated persuasive misinformation. Then, we develop a testing framework to track LLMs' belief changes in a persuasive dialogue. Through extensive experiments, we find that LLMs' correct beliefs on factual knowledge can be easily manipulated by various persuasive strategies. As showns in the figures below, the drop in accuracy suggests that that LLMs are persuaded to believe in misinformation as the conversation continues.
overview

(a) ChatGPT

overview

(b) GPT-4

overview

(c) Llama-2-7B-chat

overview

(d) Vicuna-v1.5-7B

overview

(e) Vicuna-v1.5-13B

Figure 1. Main results on the tested closed-source and open-source LLMs. We depict both the MR (solid) and ACC (dashed) metrics. MR is the misinformed rate, and ACC is the accuracy. The drop in ACC and increase in MR shows LLMs are persuaded to believe in misinformation.


Findings

LLMs reveal a surprising susceptibility to change their beliefs when combating misinformation. In the first turn, where only the simplest CTRL is used, target LLMs exhibit a proportion of belief alteration ranging from 4.1% to 63.4%. Moreover, as we progress to the fourth turn, the cumulative proportion of belief alteration spans from 20.7% to 78.2%. This vulnerability is especially noteworthy, highlighting that even the most advanced model, GPT-4, bears a 20.7% susceptibility to misinformation.
GPT-4 stands out as the most resistant model against misinformation, consistently demonstrating exceptional resilience across all persuasive strategies on all datasets. On the other hand, Llama-2-7B-chat emerges as the most susceptible model. When considering Vicuna-v1.5-7B as a more advanced model obtained by finetuning Llama-2-7B, we observe that it does indeed demonstrate significantly higher robustness. Similarly, when comparing 7B and 13B Vicuna-v1.5 LLMs, we consistently observe that the 13B variant exhibits greater resistance to misinformation.
Our observations reveal a noteworthy increase in the misinformed rate after the repetition of misinformation. Notably, MR of GPT-4 doubled after 3 additional turns of repeating on questions from NQ2.
overview

The MR@4/MR@1 values to demonstrate the effects of repetition of misinformation

We observe that the three persuasive appeals have better misinformation effects in general. When we compare the effect of repetition with that of the three appeals, a distinct increase in MR@4 is apparent in most instances, which clearly demonstrates the efficacy of appealing strategies.
When assessing the significance of different appeal types, it is clear that non-factual but logical appeals consistently result in the highest misinformed rates, except in a few cases where credibility appeals outperform.

Farm Dataset

The dataset is curated by selecting questions that are easy to answer in a closed-book QA setting. The dataset consists of 1,952 samples chosen from BoolQ, Natural Questions (NQ), and TruthfulQA. The questions are reformatted into multiple-choice questions, and the orders of the choices are shuffled. The misinformation corresponding to each question in the dataset includes: (1) Control: A simple and concise statement that conveys incorrect information with respect to the original QA pair. (2) Logical appeal: A message that uses logic, facts, and evidence to convince an audience. (3) Credibility appeal: A message that employs the credential of the speaker or source to establish credibility and trustworthiness. (4) Emotional appeal: A message that aims to evoke the audience's feelings such as sympathy, empathy, anger, fear, or happiness to persuade them.
overview

Sample from the dataset

Examples

We present some examples of the misinformation conversations from each type of persuasive strategy. The behaviors and beliefs of the LLMs towards misinformaion can be seen through the demonstrations below. The parts of the conversation in gray are the implicit belief checks.

Persuasive Conversation

Ethics and Disclosure

  • In this study, we have developed a dataset, referred to as Farm, containing factual misinformation. While Farm has proven effective for our research objectives, focusing on investigating Large Language Model (LLM) behavior and beliefs, it also carries the potential for misuse, including its utilization in model training or fine-tuning.

    Inappropriately applying our dataset could result in the dissemination of false and potentially toxic information when integrated into other models. It is crucial to emphasize that the misinformation we have generated primarily involves trivial questions that are easily identifiable by humans, thus limiting their potential impact.

    Additionally, our proposed prompting method for systematically generating human-like persuasive appeals containing misinformation carries an inherent risk of being misused for harmful purposes. Therefore, it should be approached with extra caution and ethical consideration.

    We remain dedicated to upholding ethical research practices and the responsible use of the data and methodologies presented in this study. Our intention is to contribute to knowledge while ensuring the ethical use of our research findings.



Citation

If you find our project useful, please consider citing:
AخA
 
@misc{xu2023earth,
    title={The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation},
    author={Rongwu Xu and Brian S. Lin and Shujian Yang and Tianqi Zhang and Weiyan Shi and Tianwei Zhang and Zhixuan Fang and Wei Xu and Han Qiu},
    year={2023},
    eprint={2312.09085},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}