Automated fact-checking using evidence from tables and text

The Amazon-sponsored FEVEROUS dataset and shared task challenge researchers to create more advanced fact-checking systems.

On November 10, at the fourth Fact Extraction and Verification workshop (FEVER), we will be announcing the winners of the third fact verification challenge in the FEVER series. The challenge follows our 2018 FEVER shared task and our 2019 FEVER 2.0 build-it, break-it, fix-it contest.

The announcement is the culmination of a year’s worth of work, starting with the design of our latest dataset, FEVEROUS, or Fact Extraction and Verification Over Unstructured and Structured Information. 

FEVER dataset release

The FEVEROUS dataset was released, and the shared task launched, in May 2021. You can find more information at the FEVER site.

As misleading and false claims proliferate, especially in online settings, there has been growing interest in fully automated or assistive fact verification systems. Beyond checking potentially unreliable claims, automated fact verification is an invaluable tool for knowledge extraction and question answering, which is the work my team does at Alexa Knowledge. The ability to find evidence that supports or refutes a potential answer will give us more confidence in the answers we provide, and it will also allow us to offer that evidence as part of a follow-up conversation.

Since 2018, to provide the research community with the means to develop large-scale fact-checking systems, I have been working with colleagues from the University of Cambridge, King’s College London, and Facebook on the FEVER series of datasets, shared tasks, and academic workshops.

The FEVEROUS dataset comprises 87,026 manually constructed factual claims, each annotated with evidence in the form of sentences and/or table cells from Wikipedia pages. Based on that evidence, each claim is labeled Supported, Refuted, or NotEnoughInfo. The dataset annotation project was funded by Amazon and designed by the FEVER team. 
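To make the shape of an annotated claim concrete, here is a minimal sketch of how one record could be represented. The class and field names (`AnnotatedClaim`, `EvidenceItem`, `locator`) and the locator strings are hypothetical illustrations, not the schema of the released dataset files; only the three labels and the sentence/cell distinction come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Label(Enum):
    SUPPORTED = "Supported"
    REFUTED = "Refuted"
    NOT_ENOUGH_INFO = "NotEnoughInfo"


@dataclass(frozen=True)
class EvidenceItem:
    page: str     # title of the Wikipedia page the evidence comes from
    kind: str     # "sentence" or "cell"
    locator: str  # hypothetical position string, e.g. "table_0/row_2/col_3"


@dataclass
class AnnotatedClaim:
    claim: str
    label: Label
    evidence: List[EvidenceItem] = field(default_factory=list)

    def uses_tables(self) -> bool:
        """True if at least one piece of evidence is a table cell."""
        return any(item.kind == "cell" for item in self.evidence)

    def pages(self) -> Set[str]:
        """The set of pages the evidence is drawn from."""
        return {item.page for item in self.evidence}


# An invented record, loosely modeled on the vote-count example discussed below.
example = AnnotatedClaim(
    claim="A hypothetical claim about a candidate's vote count.",
    label=Label.REFUTED,
    evidence=[
        EvidenceItem("Example election", "cell", "table_0/row_2/col_0"),
        EvidenceItem("Example election", "cell", "table_0/row_2/col_3"),
    ],
)
```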

To get an idea of the dataset and the difficulty of the task, consider the two examples below. In the left-hand example, in order to refute the claim, we need to identify the two cells that contain the name of the candidate and the number of votes he received (along with the context — page, section title(s), and the closest row/column headers, highlighted in dark grey). As this first set of evidence refutes at least one part of the claim, we don’t need to continue.

For the example on the right, the evidence consists of two table cells and one sentence from two different pages, supporting the claim.

Two FEVEROUS examples: a refuted claim (left) and a supported claim (right).

FEVEROUS contains more-complex claims than the original FEVER dataset (on average, 25.3 words per claim, compared to 9.4 for FEVER) but also a more complete pool of evidence (entire pages, including the tables, compared to just the introduction sections). That brings us closer to real-world scenarios, while maintaining the experimental control of an artificially designed dataset.

While the biggest change from the previous FEVER dataset is the use of structured information as evidence, we also worked to improve the quality of the annotation and remove known biases. For example, in the original dataset, a claim-only baseline (a system that classifies a claim without considering the evidence) was able to get an accuracy score of about 62%, compared to a majority-class baseline (choosing the most frequent label) of 33%. This means that the claim “gives away” its label based on the words it contains. In contrast, the claim-only baseline on FEVEROUS is 58%, against a majority-class baseline of 56% (the three labels don’t appear with equal frequency).
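The majority-class figures follow directly from the label distribution, as the short sketch below illustrates. The function name and both toy label lists are assumptions made for illustration: the skewed list is invented so that its most frequent label covers 56% of the claims, and it does not reproduce the dataset's real per-label counts.

```python
from collections import Counter
from typing import Sequence


def majority_class_accuracy(labels: Sequence[str]) -> float:
    """Accuracy of a system that always predicts the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, top_count = Counter(labels).most_common(1)[0]
    return top_count / len(labels)


# Three equally frequent labels: always guessing one of them is right a third of the time.
balanced = ["Supported", "Refuted", "NotEnoughInfo"] * 100

# Invented skewed distribution (placeholder label names X, Y, Z).
skewed = ["X"] * 56 + ["Y"] * 30 + ["Z"] * 14

balanced_accuracy = majority_class_accuracy(balanced)
skewed_accuracy = majority_class_accuracy(skewed)
```

A claim-only baseline is informative only relative to this number: 62% against 33% is a large gap, whereas 58% against 56% leaves the claim text with very little to give away.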

As we did with the first two shared tasks, with FEVEROUS, we released a baseline approach to support researchers in the design of fact-checking systems and to assess the feasibility of the task.

The baseline uses a combination of entity matching and TF-IDF to retrieve the most relevant sentences and tables as evidence, followed by a cell extraction model that linearizes the tables and treats the selection of relevant cells as a sequence-labeling task. Finally, a RoBERTa classifier pretrained on natural-language-inference (NLI) data and fine-tuned on the FEVEROUS training data predicts the final label for each claim.

The design of the baseline FEVEROUS approach.
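The two non-neural ingredients of the pipeline, TF-IDF ranking and table linearization, can be sketched in a few lines. This is an illustration under stated assumptions, not the released baseline code: the function names, the smoothed IDF variant, the cosine scoring, and the "header: value" linearization format are all choices made here for clarity, and the example sentences and table are invented.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_rank(query: str, documents: List[str], top_k: int = 2) -> List[Tuple[int, float]]:
    """Return (document index, cosine similarity) pairs, best match first."""
    doc_tokens = [tokenize(d) for d in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in doc_tokens for term in set(tokens))

    def vectorize(tokens: List[str]) -> Dict[str, float]:
        counts = Counter(tokens)
        # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
        return {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }

    def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
            sum(w * w for w in b.values())
        )
        return dot / norm if norm else 0.0

    query_vec = vectorize(tokenize(query))
    scored = [(i, cosine(query_vec, vectorize(t))) for i, t in enumerate(doc_tokens)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def linearize_table(header: List[str], rows: List[List[str]]) -> List[str]:
    """Flatten a table row by row into one 'column header: cell value' string per cell."""
    return [f"{col}: {cell}" for row in rows for col, cell in zip(header, row)]


# Invented example data.
claim = "The candidate received 1,200 votes in the election."
sentences = [
    "The election was held in 1950.",
    "The stadium opened in 1995 and seats 40,000.",
    "She studied chemistry at university.",
]
ranked = tfidf_rank(claim, sentences, top_k=1)

cells = linearize_table(
    ["Candidate", "Votes"],
    [["A. Smith", "1,200"], ["B. Jones", "950"]],
)
```

In a setup like this, each linearized cell keeps its column header as context, so a sequence-labeling model can mark cells as relevant or not without losing track of what each value means.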

We released the dataset and launched the shared task in May of this year. In late July, we opened the final test phase of the shared task, in which participants submitted predictions over a blind test set. During the final test phase, we received 13 entries, six of which beat the baseline system. The winning team achieved a FEVEROUS score of 27%, nine percentage points above the baseline.
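The scores are low in absolute terms because the metric is strict. The sketch below assumes, as a simplification, that a prediction earns credit only when its label is correct and its retrieved evidence covers at least one complete gold evidence set; the function names and evidence identifiers are hypothetical, and details of the official scorer (such as limits on how much evidence a system may return) are left out.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

Prediction = Tuple[str, Set[str]]        # (predicted label, retrieved evidence ids)
Gold = Tuple[str, List[FrozenSet[str]]]  # (gold label, alternative gold evidence sets)


def instance_is_correct(prediction: Prediction, gold: Gold) -> bool:
    """Assumed rule: label must match and one full gold evidence set must be retrieved."""
    predicted_label, predicted_evidence = prediction
    gold_label, gold_evidence_sets = gold
    if predicted_label != gold_label:
        return False
    return any(gold_set <= predicted_evidence for gold_set in gold_evidence_sets)


def strict_score(predictions: Sequence[Prediction], gold: Sequence[Gold]) -> float:
    """Fraction of claims that satisfy the strict label-plus-evidence rule."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    correct = sum(instance_is_correct(p, g) for p, g in zip(predictions, gold))
    return correct / len(gold)


# Invented toy data with made-up evidence identifiers.
toy_predictions = [
    ("Refuted", {"p/cell_0_2_0", "p/cell_0_2_3", "p/sentence_1"}),  # label and evidence correct
    ("Supported", {"q/sentence_0"}),                                # evidence incomplete
    ("Supported", {"r/sentence_4"}),                                # wrong label
]
toy_gold = [
    ("Refuted", [frozenset({"p/cell_0_2_0", "p/cell_0_2_3"})]),
    ("Supported", [frozenset({"q/sentence_0", "q/cell_1_0_0"})]),
    ("Refuted", [frozenset({"r/sentence_4"})]),
]
toy_score = strict_score(toy_predictions, toy_gold)
```

Under a rule like this, a system that predicts the right label from incomplete evidence scores nothing for that claim, which is why evidence retrieval matters as much as the final classifier.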

The main emerging trends in the submissions were the use of table-based pretraining with systems like TaPas and an emphasis on multi-hop evidence retrieval.

For further insights about the participating systems and to learn more about the challenge in general, I invite you to join the shared-task session at the fourth FEVER workshop. Besides the discussion of the FEVEROUS challenge, our workshop will feature research papers on all topics related to fact verification and invited talks from leading researchers in the field: Mohit Bansal (UNC Chapel Hill), Mirella Lapata (University of Edinburgh), Maria Liakata (Queen Mary University of London), Pasquale Minervini (University College London), Preslav Nakov (Qatar Computing Research Institute), Steven Novella (Yale University School of Medicine), and Brendan Nyhan (Dartmouth College). I look forward to seeing you all there!

About the Author
Christos Christodoulopoulos is an applied scientist in the Alexa AI organization.
