Full Fact Head of Automation Andy Dudfield on the challenge of correcting misinformation at scale on the Internet
A lie flies halfway around the world before the truth has its boots on. This famous aphorism has never seemed more apt, but who coined it? That is less clear. Adages decrying the human tendency to believe sensational lies over the more mundane truth go back a long way, but if boots are the deciding detail, then we’re probably looking at an American newspaper, a Portland paper from around the 1820s, where it appeared as follows: “A lie will fly from Maine to Georgia, and truth pulls on its boots.”
In today’s age, attributing sources to statements and helping people navigate between fact and fiction – or, as Andy Dudfield, head of automated fact-checking at the London-based charity Full Fact, puts it, good information and bad information – is the main task of fact-checking organisations.
“Bad information” can be the result of a simple misquote, a mistake, a joke taken out of context, or satire that goes over the heads of the audience. Or it can be deliberately disseminated misinformation: selected or invented statistics, half-truths or outright lies. Whatever the root cause, once bad information is out in the wild it takes on a life of its own, so attempts to classify it by intent are largely irrelevant (though intent may play a role in deciding how to address and fix it). The most important consideration, Dudfield said, is its impact. As we’ve seen, bad information can have serious consequences online and in the real world.
“What harm does bad information do? Where does it spread? Who said it? What standards can we hold it to?”
Once information is determined to be factually inaccurate, the race is on to reduce its impact. Fact checks can be posted as corrections wherever the inaccurate claims appear online, and authors and distributors can be called out and asked to correct misinformation. Like the errata section of a newspaper, fact-checking provides an opportunity to correct misunderstandings, but as a product of today’s information landscape, its focus is much broader than a single publication.
The Web Scale Challenge
The nature of the lies may not have changed much since the 1820s, but both their volume and the speed with which they are spread have greatly increased.
According to Dudfield, who leads the team developing technology to help Full Fact’s 35 staff deal with the misinformation tsunami, combating bad information in a timely manner is a “web-scale problem” and automation inevitably plays a role.
Not all facts are created equal, and Full Fact must first analyze the 80,000 to 90,000 pieces of information that pass through its systems each day to identify those worth checking. Having identified the likely candidates, the next step in the pipeline is to filter out opinions and predictions, which lie outside the fact-checker’s purview, and focus on claims about quantities, numerical values and voting results that can be verified against a trusted source.
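The filtering step described above can be illustrated with a toy rule-based triage. This is only a sketch under stated assumptions: the cue lists, the `triage` function and its labels are hypothetical and chosen for illustration, and a production system such as Full Fact’s relies on a trained language model rather than hand-written patterns.

```python
import re

# Hypothetical cue lists, for illustration only; a real system would learn these.
OPINION_CUES = re.compile(r"\b(i think|i believe|in my view|should|ought to)\b", re.I)
PREDICTION_CUES = re.compile(r"\b(will|going to|is expected to|forecast)\b", re.I)
NUMERIC = re.compile(r"\d")  # any digit: counts, percentages, amounts, years
VOTING = re.compile(r"\bvoted (for|against)\b", re.I)


def triage(sentence: str) -> str:
    """Return 'opinion', 'prediction', 'checkable' or 'other' for one sentence."""
    if OPINION_CUES.search(sentence):
        return "opinion"
    if PREDICTION_CUES.search(sentence):
        return "prediction"
    if NUMERIC.search(sentence) or VOTING.search(sentence):
        return "checkable"
    return "other"


def checkable_claims(sentences):
    """Keep only the sentences that look verifiable against a trusted source."""
    return [s for s in sentences if triage(s) == "checkable"]


if __name__ == "__main__":
    sample = [
        "Unemployment fell by 3% last year.",
        "I think the policy is unfair.",
        "Prices will rise next year.",
        "The minister voted against the bill.",
    ]
    for sentence in checkable_claims(sample):
        print(sentence)
```

Opinions and predictions are tested first so that a sentence such as “I think crime rose by 10%” is set aside rather than passed on as a bare statistic; whether that ordering is right is itself a judgment call.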
The system then seeks to enrich the data by extracting entities, searching for names, places and topics and supplementing them with third-party information. It is also at this stage that any quotes are attributed, following the trail back to the actual source of the quote, as Dudfield put it: “Did someone say something? Did someone say that someone else said something, or did the newspaper report that someone said someone else said something?”
The information hose is thus reduced to a manageable trickle. At this point the AI’s task is done, and the work of analyzing chains of events, verifying sources, and identifying where bad information crept in is left to human fact-checkers at the end of the pipeline.
“I always want to make sure checkers and other staff have the best tools, but you can never fully automate processes. It’s technically very difficult,” Dudfield explained.
While AI is “great at matching patterns and spotting new trends,” such as a sudden increase in the frequency of a certain word or phrase, humans are simply much better at “understanding context, caveats, and nuances; it’s that people are brilliant at it, and fact-checking is so much faster and easier there.”
For example, much of the misinformation spread around Covid could have been, and was, foreseen by looking at past disease outbreaks.
“Vaccine hesitancy has always been something we could see. It’s not an AI model that tells us that, it’s just something that’s a predictable part of what the information landscape will look like.”
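The kind of trend spotting that the article credits to AI, a sudden increase in the frequency of a word or phrase, can be sketched with a simple z-score over daily counts. This is a minimal illustration and not a description of Full Fact’s system: the function name `spiking_terms`, the `min_count` and `threshold` parameters and the sample counts are all assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def spiking_terms(history, today, min_count=5, threshold=3.0):
    """Flag terms whose count today is far above their recent daily average.

    history: list of Counter objects, one per past day (term -> count).
    today:   Counter for the current day.
    Returns (term, z_score) pairs, highest score first.
    """
    if not history:
        return []
    spikes = []
    for term, count in today.items():
        if count < min_count:
            continue  # too rare today to be worth flagging
        past = [day[term] for day in history]  # a Counter returns 0 for unseen terms
        mu = mean(past)
        # Floor the spread at 1.0 so a flat history cannot cause division by zero.
        sigma = max(pstdev(past), 1.0)
        z = (count - mu) / sigma
        if z >= threshold:
            spikes.append((term, round(z, 2)))
    return sorted(spikes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    history = [
        Counter({"vaccine": 2, "budget": 10}),
        Counter({"vaccine": 3, "budget": 11}),
        Counter({"vaccine": 2, "budget": 9}),
    ]
    today = Counter({"vaccine": 40, "budget": 10, "rare": 1})
    print(spiking_terms(history, today))
```

A flag like this only says that something is being talked about more than usual; deciding whether it matters remains the human judgment the article describes.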
Tools of the fact-checking trade
To automatically sort, filter and enrich data, Full Fact developed an artificial intelligence system based on BERT, a natural language processing (NLP) model developed by Google and later open sourced.
“It’s a large-scale language model that has been specifically trained to identify claims using annotations provided by fact-checkers,” Dudfield said.
“This means that when we track hundreds of thousands of sentences across different web pages, we can identify things that seem similar to claims, and then we can categorize those that use that pattern into different types of claims.”
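To make the idea of learning claim types from fact-checkers’ annotations concrete, here is a deliberately tiny stand-in. It swaps the BERT-based model for a bag-of-words naive Bayes classifier so that the example stays self-contained; the class name, the label set (`quantity`, `voting`, `not_a_claim`) and the training sentences are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase word tokens; every number is mapped to a shared '<num>' token."""
    raw = re.findall(r"[a-z]+|\d+", text.lower())
    return ["<num>" if tok.isdigit() else tok for tok in raw]


class NaiveBayesClaimClassifier:
    """Multinomial naive Bayes with add-one smoothing (not BERT)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token -> count
        self.label_counts = Counter()            # label -> number of examples
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for tok in tokens(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens(text):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical annotations standing in for fact-checkers' labels.
ANNOTATED = [
    ("Crime rose by 10 percent last year", "quantity"),
    ("There are 5000 more nurses than in 2019", "quantity"),
    ("Spending increased by 2 billion pounds", "quantity"),
    ("The MP voted against the housing bill", "voting"),
    ("She voted for the amendment twice", "voting"),
    ("The party voted against the budget", "voting"),
    ("I think the policy is unfair", "not_a_claim"),
    ("We should do better for families", "not_a_claim"),
    ("This is a terrible idea", "not_a_claim"),
]

if __name__ == "__main__":
    model = NaiveBayesClaimClassifier().fit(ANNOTATED)
    print(model.predict("He voted against the tax bill"))
```

A model this small only memorizes surface vocabulary; the point of a pretrained language model is that it can recognize the same claim pattern when the wording changes.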
Full Fact’s goal is to ensure that fact checks achieve the greatest possible impact. Timing is critical, which at web scale means providing the means for fact checks to spread quickly. Like other fact-checkers, it has built relationships with major tech companies, and a fact-checking ecosystem has grown up to ensure that the truth can be surfaced as quickly as possible.
Schema.org, a shared vocabulary for structured data on the web, defines a standard markup for fact checks called ClaimReview, which lets search engines and social media sites process and display them in a consistent way. Fact-checkers annotate their articles with ClaimReview so that they can be curated and displayed alongside relevant search results or real-time feeds.
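As an illustration of what such markup looks like, the sketch below builds a minimal ClaimReview record as JSON-LD. The property names (`claimReviewed`, `reviewRating`, `alternateName` and so on) come from the Schema.org vocabulary; the helper function, the rating scale and all sample values, including the example.org URL, are assumptions made for the example, and individual platforms may require additional fields.

```python
import json


def claim_review_jsonld(claim, verdict, review_url, reviewer, date_published,
                        rating_value, best_rating=5, worst_rating=1):
    """Serialize a minimal Schema.org ClaimReview as a JSON-LD string."""
    doc = {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": review_url,                 # where the fact check is published
        "datePublished": date_published,   # ISO 8601 date
        "claimReviewed": claim,            # the claim being checked
        "author": {"@type": "Organization", "name": reviewer},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": rating_value,
            "bestRating": best_rating,
            "worstRating": worst_rating,
            "alternateName": verdict,      # human-readable verdict
        },
    }
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    # All values below are placeholders, not a real fact check.
    print(claim_review_jsonld(
        claim="Example claim text",
        verdict="False",
        review_url="https://example.org/fact-checks/example-claim",
        reviewer="Example Fact Checkers",
        date_published="2022-01-01",
        rating_value=1,
    ))
```

The resulting JSON is typically embedded in the fact-check page inside a `<script type="application/ld+json">` element, where crawlers can read it.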
There are also strategic alliances: fact-checkers around the world share information and annotations through bodies such as the International Fact-Checking Network, and collaborate with other organizations, including technology platforms.
Full Fact and its AI team work with Africa Check in Kenya, Nigeria and South Africa, including checking election information.
“25% of the fact checks published by Africa Check were identified by some form of artificial intelligence model created by Full Fact. So it’s very interesting to change that information significantly,” said Dudfield.
The future of fact checking
The work with Africa Check shows that facts can also spread quickly around the world – but so far this has mostly been limited to English and other widely spoken languages. Misinformation on social media has been used to incite hatred, leading to massacres in Myanmar and Ethiopia, and automated content filters on platforms like Facebook are far less effective in languages other than English. That’s something Dudfield and his team are looking to tackle, including working with Meta to improve its systems.
“Can we take what we’ve done in English and make it work in other languages using the same basic model? That’s what we’re going to focus on for the next couple of years.”
Given the pace of NLP and predictive analytics, will a lie one day travel halfway around the world to find a fact-check already waiting for it? No. Disinformation will always have the upper hand, but these tools certainly give truth a chance to stay in the race to stabilize the information landscape.
“In times of crisis and anxiety, people can be susceptible to conspiracy theories, especially when they feel powerless,” Dudfield said. “And so we really want to make sure that people are responsive and have the best information available when they’re consuming that information.”