PopYard:Today's Tech.-Algorithms can write fake reviews that humans rate as "helpful"

Sun Nov 24 01:03:53 2024

Source: Yuanshun Yao

One of the reasons that online review sites still have some utility is that "crowdturfing" attacks (in which reviewers are paid to write convincing fake reviews to artificially raise or lower the ranking of a business or product) are expensive to do well, and cheap attacks are pretty easy to spot and nuke.

But in a new paper, a group of University of Chicago computer scientists shows that it was able to train a Recurrent Neural Network (RNN) to write fake reviews that test subjects could not distinguish from real reviews -- and moreover, subjects were likely to rate these as "helpful" reviews.

This is an ominous sign, since fully automated attacks on review sites could spell the end of reviews as an even moderately useful way to sort out otherwise impossibly long lists of potential candidates for your money and/or attention.

The good news is that the researchers were able to develop a countermeasure in the form of another neural network that could reliably identify fake RNN-authored reviews -- and even better, it's cheaper to detect fakes than it is to improve them.

For now.

Future Work. In terms of potential future work, one direction is to consider the role that user and content metadata can play in both the attack and defense perspectives. Metadata can be crucial in terms of deceiving users (e.g., by increasing the number of friends/contacts on the site) and in assisting defenses [10, 19, 20, 31, 47, 71, 75] (e.g., by analyzing the patterns in timestamps of user activities). Orchestrating the general behavior of user accounts using deep learning to bypass metadata-based defenses could be an interesting research challenge. Second, while we limit ourselves to the domain of online review systems and fake review attacks, deep learning-based generative text models can be applied to launch attacks in other scenarios as well. We highlight two of these possible application scenarios.

Strengthening Sybil Attacks. Attackers can use our techniques to generate realistic looking text-based user behavior patterns [4], e.g., posting, commenting and messaging. This can help attackers make Sybil (fake) accounts indistinguishable from legitimate accounts based on textual content. A special case of this involves launching an impersonation attack in online social networks [11].

Fake News Generation. Identifying fake news, i.e. “a made-up story with an intention to deceive” [61], currently remains an open challenge [9]. The research community has started to explore the possibility of automating the detection process by building an AI-assisted fact-checking pipeline [41, 72, 76]. We believe that AI can not only assist fake news detection but also generate fake news. Given the availability of large-scale news datasets [68], an attacker can potentially generate realistic looking news articles using a deep-learning approach (RNN). And due to its low economic cost, the attacker can pollute social media newsfeeds with a large number of fake articles.

We hope our results will bring more attention to the problem of malicious attacks based on deep learning language models, particularly in the context of fake content on online services, and encourage the exploration and development of new defenses.

Automated Crowdturfing Attacks and Defenses in Online Review Systems [Yuanshun Yao, Bimal Viswanath, Jenna Cryan, Haitao Zheng and Ben Y. Zhao/arXiv Cryptography and Security]
