Meta's decision to change its content moderation policy, replacing its third-party fact-checking program with user-generated community notes, caused an uproar. But the change also raises questions about the effectiveness of both Meta's old approach, fact-checking, and its new one, community notes.
With billions of people worldwide using their services, platforms such as Meta's Facebook and Instagram have a responsibility to ensure that users are not harmed by consumer fraud, hate speech, misinformation or other online ills. Given that scale, combating online harm is a serious societal challenge, and content moderation plays an important role in addressing it.
Content moderation involves three steps. The first is scanning online content, typically social media posts, to detect potentially harmful words or images. The second is assessing whether the flagged content violates the law or the platform's terms of service. The third is intervening in some way. Interventions include removing posts, adding warning labels to posts, and limiting how many people can see or share posts.
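In code, that three-step pipeline might look something like the minimal Python sketch below. The keyword rules, policy categories and interventions are hypothetical placeholders for illustration, not any platform's actual system.

```python
# A minimal sketch of the scan -> evaluate -> intervene pipeline described above.
# The flagged terms, policy labels and interventions are invented examples.

from dataclasses import dataclass
from typing import Optional

FLAGGED_TERMS = {"miracle cure", "guaranteed returns"}  # hypothetical scan rules

@dataclass
class Post:
    post_id: str
    text: str

def scan(post: Post) -> bool:
    """Step 1: detect potentially harmful content (here, a naive keyword match)."""
    return any(term in post.text.lower() for term in FLAGGED_TERMS)

def evaluate(post: Post) -> Optional[str]:
    """Step 2: decide whether flagged content violates a (hypothetical) policy."""
    text = post.text.lower()
    if "guaranteed returns" in text:
        return "financial-fraud"
    if "miracle cure" in text:
        return "health-misinformation"
    return None

def intervene(post: Post, violation: str) -> str:
    """Step 3: apply an intervention such as removal, labeling, or downranking."""
    if violation == "financial-fraud":
        return f"remove post {post.post_id}"
    return f"add warning label to post {post.post_id} and reduce its reach"

def moderate(post: Post) -> str:
    if not scan(post):
        return "no action"
    violation = evaluate(post)
    return intervene(post, violation) if violation else "no action"

print(moderate(Post("42", "This miracle cure works overnight!")))
# -> add warning label to post 42 and reduce its reach
```

Real systems replace the keyword match with machine-learned classifiers and human review, but the three stages remain the same.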
Content moderation can range from user-driven moderation models on community-based platforms, such as Wikipedia, to centralized content moderation models, such as the one Instagram uses. Research shows that both approaches produce mixed results.
Do fact checks work?
Meta's previous content moderation policy relied on third-party fact-checking organizations, which brought questionable content to the attention of Meta staff. Meta's U.S. fact-checking organizations include AFP USA, Check Your Fact, Factcheck.org, Lead Stories, PolitiFact, Science Feedback, Reuters Fact Check, TelevisaUnivision, The Dispatch and USA TODAY.
Fact-checking relies on unbiased expert review. Research shows it can reduce the effects of misinformation, but it is not a panacea. Its effectiveness also depends on whether users perceive the fact-checkers' role, and the fact-checking organizations themselves, as trustworthy.
Crowdsourced content review
Meta CEO Mark Zuckerberg said in a statement that the company's content moderation will shift to a community notes model similar to that of X (formerly Twitter). X's Community Notes is a crowdsourced fact-checking approach that allows users to write notes alerting others to potentially misleading posts.
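For a rough sense of the mechanics, here is a deliberately simplified Python sketch of crowdsourced note rating. It is not X's actual algorithm, which the company describes as a more elaborate "bridging" approach; it only illustrates the core idea that a note is surfaced when raters with differing perspectives agree it is helpful. The note IDs, viewpoint groups and threshold are invented for the example.

```python
# Simplified, hypothetical illustration of crowdsourced note rating:
# surface a note only when raters from more than one viewpoint group
# mark it helpful. This is a stand-in for cross-perspective agreement,
# not X's production algorithm.

from collections import defaultdict

# (note_id, rater_viewpoint_group, rated_helpful) -- toy data
ratings = [
    ("note-1", "group_a", True),
    ("note-1", "group_b", True),
    ("note-2", "group_a", True),
    ("note-2", "group_a", True),
]

def surfaced_notes(ratings, min_groups=2):
    helpful_groups = defaultdict(set)
    for note_id, group, helpful in ratings:
        if helpful:
            helpful_groups[note_id].add(group)
    return [n for n, groups in helpful_groups.items() if len(groups) >= min_groups]

print(surfaced_notes(ratings))  # ['note-1']; note-2 lacks cross-group support
```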
Research on the effectiveness of X-style community moderation is mixed. A large-scale study found little evidence that the introduction of Community Notes significantly reduced engagement with misleading tweets on X. Such crowd-based efforts may simply be too slow to intervene during the early, most viral stage of a misleading post's spread.
Quality certifications and badges applied by platforms have had some success. But community-provided labels may not reduce engagement with misinformation, especially when platforms give users no training in how to apply them. Research also suggests that X's Community Notes show signs of partisan bias.
Crowdsourcing initiatives such as Wikipedia, the community-edited online reference, rely on peer feedback and on a robust system of contributors. As I've written before, a Wikipedia-style model requires strong community governance mechanisms to ensure that individual volunteers follow consistent guidelines when verifying and fact-checking posts. Without such governance, people can game the system in coordinated ways and vote up content that is interesting and compelling but unverified.
[embed]https://www.youtube.com/watch?v=HsV6DHmC8UA[/embed]
Content moderation and consumer harm
Safe and trustworthy online spaces are akin to a public good: without enough motivated people willing to invest effort for the common benefit, the overall user experience suffers.
Social media algorithms are designed to maximize engagement. But because policies that encourage engagement can also cause harm, content moderation is also a matter of consumer safety and product liability.
This aspect of content moderation matters for businesses that use Meta's platforms to advertise or connect with consumers. It is also a brand safety issue, because platforms must balance their desire to keep social media environments safe against their desire to increase engagement.
AI content is everywhere
Content moderation is likely to come under further pressure as the amount of content generated with AI tools grows. AI detection tools are flawed, and the rise of generative AI is undermining the ability to distinguish human-generated content from AI-generated content.
For example, in January 2023 OpenAI launched a classifier intended to distinguish human-written text from AI-generated text. The company discontinued the tool in July 2023 because of its low accuracy.
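For illustration, here is a toy Python sketch, assuming scikit-learn is available, of the kind of supervised text classifier such detectors are built on. The hand-made examples and labels are invented for demonstration; real detectors are trained on far larger corpora and, as the OpenAI case shows, still struggle with accuracy.

```python
# Toy sketch of an AI-text detector: fit a classifier on labeled human vs. AI
# examples, then predict on new text. The tiny dataset is purely illustrative.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "honestly no idea what happened at the game last night lol",
    "ugh my train was late again, third time this week",
    "In conclusion, it is important to consider multiple perspectives.",
    "Certainly! Here is a detailed overview of the topic you requested.",
]
labels = [0, 0, 1, 1]  # 0 = human-written, 1 = AI-generated (toy labels)

detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
detector.fit(texts, labels)

held_out = ["Certainly! Below is a comprehensive summary of the main points."]
print(detector.predict(held_out))  # likely [1], but accuracy degrades sharply on real data
```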
There is also the potential for large numbers of fake accounts, run by AI bots, that exploit algorithmic and human vulnerabilities to monetize false and harmful content, for example by committing fraud or manipulating public opinion for financial or political gain.
Generative AI tools such as ChatGPT make it easy to mass-produce realistic-looking social media profiles and content. AI-generated content designed to drive engagement can also carry overt biases around race and gender. Indeed, Meta has already faced a backlash over its AI-generated profiles, which commentators derided as AI-generated "slop."
More than just moderation
Whatever form content moderation takes, the practice alone is not effective at reducing belief in misinformation or at limiting its spread.
Ultimately, research suggests that combining fact-checking with platform audits and with partnerships among researchers and citizen activists is important for ensuring safe and trustworthy community spaces on social media.