Teaching Computers to Answer Complex Questions

Computerized question-answering systems usually take one of two approaches. Either they do a text search and try to infer the semantic relationships between entities named in the text, or they explore a hand-curated knowledge graph, a data structure that directly encodes relationships among entities.

With complex questions such as “Which Nolan films won an Oscar but missed a Golden Globe?”, however, both of these approaches run into difficulties. Text search would require a single document to contain all of the information needed to answer the question, which is highly unlikely. And a knowledge graph, even if it were up to date, would have to explicitly represent all the connections the question establishes, which is also unlikely.

In a paper we presented last week at the ACM’s SIGIR Conference on Research and Development in Information Retrieval, my colleagues and I describe a new approach to answering complex questions that, in tests, demonstrated clear improvements over several competing approaches.

In a way, our technique combines the two standard approaches. On the basis of the input question, we first do a text search, retrieving the 10 or so documents that the search algorithm ranks highest. Then, on the fly, we construct a knowledge graph that integrates data distributed across the documents.

Because that knowledge graph is produced algorithmically — not carefully curated, the way most knowledge graphs are — it includes a lot of noise, or spurious inferred relationships. We choose to err on the side of completeness, ensuring that our graph represents most of the relationships described in a text, even at the cost of a lot of noise. Then we rely on clever algorithms to filter out the noise when constructing a response to a question.

In evaluating our approach, we used two different types of baselines: an alternative system and alternative algorithms. The alternative system was a state-of-the-art neural network that learns to answer questions from a large body of training data. The alternative algorithms were state-of-the-art graph search algorithms, which we applied to our ad hoc knowledge graph.

In 36 tests using two different data sets and three different performance metrics, our system outperformed all three baselines on 34, finishing a close second on the other two. The average improvement over the best-performing baseline was 25%, with a high of 80%.

Our system begins with an ordinary web search, using the full text of the question as a search string. In our experiments, we used several different search engines, to ensure that search engine quality doesn’t bias the results. We retrieve the ten top-ranked documents and use standard algorithms to identify named entities and parts of speech within each.

Then we use an information extraction algorithm of our own devising to extract subject-predicate-object triples from the text. Predicates are established either by verbs — as in the triple <Nolan, directed, Inception> — or prepositions — as in <The Social Network, winner of, Best Screenplay>. We also assign each triple a confidence score, based on how close to each other the words are in the text.
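
To make the triple representation concrete, here is a minimal Python sketch. The article says only that confidence depends on how close the words are in the text; the `Triple` class, the `proximity_confidence` function, and the 1 / (1 + gap) formula are illustrative assumptions, not the scoring used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object triple with a confidence score."""
    subject: str
    predicate: str
    obj: str
    confidence: float


def proximity_confidence(subj_idx: int, pred_idx: int, obj_idx: int) -> float:
    """Hypothetical score: 1.0 when the three tokens are adjacent,
    decaying as more tokens separate them (assumed formula)."""
    gap = (abs(pred_idx - subj_idx) - 1) + (abs(obj_idx - pred_idx) - 1)
    return 1.0 / (1.0 + max(gap, 0))


# Token positions in the toy sentence "Nolan directed Inception".
triple = Triple("Nolan", "directed", "Inception",
                proximity_confidence(0, 1, 2))
```

With this formula, adjacent words yield a confidence of 1.0, and every intervening token lowers the score.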

Then, from all the triples extracted from all the documents, we assemble a graph.
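
One simple way to picture this step is as an adjacency map built from the triples. The sketch below is an assumption about the data structure, not the paper's implementation: `build_graph` is a hypothetical helper, edges are treated as undirected, and when the same pair of entities appears in several triples it keeps the highest-confidence one.

```python
from collections import defaultdict


def build_graph(triples):
    """Merge (subject, predicate, object, confidence) tuples from all
    documents into one undirected graph: node -> {neighbor: (predicate, confidence)}."""
    graph = defaultdict(dict)
    for subj, pred, obj, conf in triples:
        for a, b in ((subj, obj), (obj, subj)):
            previous = graph[a].get(b)
            # Assumed merge policy: keep the most confident edge per pair.
            if previous is None or conf > previous[1]:
                graph[a][b] = (pred, conf)
    return dict(graph)


# Toy triples, as if extracted from different documents.
triples = [
    ("Nolan", "directed", "Inception", 1.0),
    ("The Social Network", "winner of", "Best Screenplay", 0.5),
    ("Nolan", "directed", "Inception", 0.4),  # duplicate, lower confidence
]
graph = build_graph(triples)
```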

The baseline graph

Using syntactic clues — such as “A and other X’s” or “X’s such as A” — and data from existing knowledge graphs, we then add nodes to our graph that indicate the types of the named entities. We also use existing lexicons and embeddings, which capture information about words’ meanings, to decide which names in the graph refer to the same entities. Like the relationships encoded in the data triples, the name alignments are assigned confidence scores.
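
The two syntactic patterns quoted above can be approximated with regular expressions. The following is a rough sketch under simplifying assumptions: entities are runs of capitalized words, plural type nouns are singularized by dropping a final "s", and `extract_types` is a hypothetical helper. A production system would work on parsed text rather than raw strings.

```python
import re

# "X's such as A": for example, "films such as Inception"
SUCH_AS = re.compile(r"\b(\w+?)s such as ([A-Z]\w*(?: [A-Z]\w*)*)")
# "A and other X's": for example, "Inception and other films"
AND_OTHER = re.compile(r"([A-Z]\w*(?: [A-Z]\w*)*) and other (\w+?)s\b")


def extract_types(sentence):
    """Return (entity, type) pairs suggested by the two patterns."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        pairs.append((match.group(2), match.group(1).lower()))
    for match in AND_OTHER.finditer(sentence):
        pairs.append((match.group(1), match.group(2).lower()))
    return pairs


such_as_pairs = extract_types("He made films such as Inception.")
and_other_pairs = extract_types("The Social Network and other films won awards.")
```

Each pair can then become a type node attached to the entity in the graph.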

Graph with types added (left) and entity names aligned (right).

The graph itself is now complete. Our search algorithm’s first step is to identify cornerstones in the graph. These are words that very closely match individual words in the search string.
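
As an illustration of cornerstone detection, the sketch below compares each word of a node label with each word of the question using the standard library's `difflib.SequenceMatcher`. The similarity measure, the 0.85 threshold, and the `find_cornerstones` helper are assumptions; the article does not say how "very closely match" is computed.

```python
from difflib import SequenceMatcher


def find_cornerstones(nodes, question, threshold=0.85):
    """Return the nodes with at least one word that closely matches
    a word of the question (threshold is an assumed value)."""
    question_words = [w.strip("?.,!\"'").lower() for w in question.split()]
    question_words = [w for w in question_words if w]
    cornerstones = set()
    for node in nodes:
        for node_word in node.lower().split():
            if any(SequenceMatcher(None, node_word, q).ratio() >= threshold
                   for q in question_words):
                cornerstones.add(node)
                break
    return cornerstones


nodes = ["Nolan", "Inception", "Oscar", "Golden Globe", "directed"]
question = "Which Nolan films won an Oscar but missed a Golden Globe?"
cornerstones = find_cornerstones(nodes, question)
```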

Graph with cornerstones in yellow

Our assumption is that the answers to questions lie on paths connecting cornerstones. Each path through the graph is evaluated according to two criteria: its length (shorter paths are better) and its weights (the confidence scores from the data triples and the name alignments). We then eliminate all but the shortest, highest-confidence paths.
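
The trade-off between path length and confidence can be sketched as a shortest-path search. The version below uses Dijkstra's algorithm with an assumed edge cost of 1 + (1 - confidence), so that each extra hop and each low-confidence edge both make a path more expensive. It is a stand-in for illustration, not the search algorithm from the paper, and the graph values are toy numbers.

```python
import heapq


def best_path(graph, source, target):
    """Dijkstra over graph: node -> {neighbor: confidence}.
    Returns (cost, path) for the cheapest path, or None if unreachable."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, conf in graph.get(node, {}).items():
            if neighbor not in settled:
                # Assumed cost: one unit per hop plus a penalty for low confidence.
                step = 1.0 + (1.0 - conf)
                heapq.heappush(queue, (cost + step, neighbor, path + [neighbor]))
    return None


# Toy graph; "Other Film" stands for a noisy, low-confidence node.
toy_graph = {
    "Nolan": {"Inception": 0.9, "Other Film": 0.4},
    "Inception": {"Nolan": 0.9, "Oscar": 0.8},
    "Other Film": {"Nolan": 0.4, "Oscar": 0.5},
    "Oscar": {"Inception": 0.8, "Other Film": 0.5},
}
cost, path = best_path(toy_graph, "Nolan", "Oscar")
```

Both candidate paths here have two hops, so the higher-confidence route through Inception wins.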

Highest-scoring paths between cornerstones

Next, we remove all the cornerstones from the graph, on the assumption that they can’t be answers to the question, along with all the nodes that are not named entities.

High-scoring paths with cornerstones and non-entities removed.

From the initial query, an algorithm that we reported previously predicts the lexical type of the answer. If the question begins “Which films won … ”, for instance, the algorithm will predict that the answer to the question should be of the type “film”. We then excise all entities that do not match the predicted type. In this case, that leaves us with two entities: Inception and The Social Network.
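
The type filter can be sketched as follows. The type predictor here is a deliberately naive stand-in (it takes the word after "Which" or "What" and drops a trailing "s") and is not the previously reported algorithm; `predict_answer_type`, `filter_by_type`, and the type labels are hypothetical.

```python
def predict_answer_type(question):
    """Naive stand-in: use the noun that follows "Which"/"What"."""
    words = question.strip().split()
    if len(words) >= 2 and words[0].lower() in {"which", "what"}:
        noun = words[1].lower().strip("?.,")
        return noun[:-1] if noun.endswith("s") else noun
    return None


def filter_by_type(candidates, entity_types, answer_type):
    """Keep only candidates whose type set contains the predicted type."""
    if answer_type is None:
        return list(candidates)
    return [c for c in candidates if answer_type in entity_types.get(c, set())]


# Toy type assignments, as if read from the type nodes in the graph.
entity_types = {
    "Inception": {"film"},
    "The Social Network": {"film"},
    "Best Screenplay": {"award"},
}
answer_type = predict_answer_type("Which films won an Oscar?")
remaining = filter_by_type(
    ["Inception", "The Social Network", "Best Screenplay"],
    entity_types,
    answer_type,
)
```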

Finally, our algorithm ranks the remaining entities according to several criteria, such as the weights of the paths that connect them to cornerstones, their distance from cornerstones, the number of paths through the network that lead through them, and so on. In this case, Inception ranks highest, and the algorithm returns it as the answer to the question.
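
One plausible way to combine such criteria is a weighted sum, sketched below. The feature names, the weights, and the toy feature values are all assumptions made for illustration; the article does not specify how the criteria are combined.

```python
def rank_candidates(features, weights=(1.0, 1.0, 0.5)):
    """Rank candidates by a weighted sum (assumed scoring):
    reward path weight and path count, penalize distance."""
    w_weight, w_distance, w_paths = weights

    def score(name):
        f = features[name]
        return (w_weight * f["path_weight"]
                - w_distance * f["distance"]
                + w_paths * f["num_paths"])

    return sorted(features, key=score, reverse=True)


# Toy feature values, not taken from the paper.
features = {
    "Inception": {"path_weight": 0.85, "distance": 1, "num_paths": 3},
    "The Social Network": {"path_weight": 0.6, "distance": 2, "num_paths": 1},
}
ranking = rank_candidates(features)
```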

Although our system significantly outperforms state-of-the-art baselines, there is still room for improvement. One avenue of future research that we consider promising is the integration of the ad hoc knowledge graphs with existing, curated knowledge graphs and the adaptation of the search algorithm accordingly.

Acknowledgments: Xiaolu Lu, Soumajit Pramanik, Rishiraj Saha Roy, Yafang Wang, Gerhard Weikum

About the Author
Abdalghani Abujabal is a scientist in Alexa AI’s Natural Understanding group at Amazon.
