Bringing the Power of Neural Networks to the Problem of Search

Using machine learning to train information retrieval models, such as Internet search engines, is difficult because it requires so much manually annotated data. Of course, training most machine learning systems requires manually annotated data, but because information retrieval models must handle such a wide variety of queries, they require far more of it than most. Consequently, most information retrieval systems rely primarily on mechanisms other than machine learning.

This week at the ACM’s SIGIR Conference on Research and Development in Information Retrieval, my colleagues and I will describe a new way to train deep neural information retrieval models with less manual supervision. Where standard training with annotated data is referred to as supervised, our approach is weakly supervised. Weak supervision allows us to create data sets with millions of entries, instead of the tens of thousands typical with strong supervision.

In tests, our weak-supervision technique not only yielded more-accurate retrieval models than the supervised baseline (with limited training data) but offered improvements over previous weak-supervision techniques. It also offered dramatic improvements over the type of algorithm commonly used to assess the “relevance” of search results.

Typically, neural-network-based information retrieval systems are trained on data triples: each data item in the training set consists of a query and two documents, one that satisfies the user's information need (the relevant document) and one that doesn't, although it is still related to the query (the non-relevant document). During training, the neural network learns to maximize the difference between the scores it assigns to the relevant and non-relevant documents. Here, manual annotation means tagging documents as relevant or non-relevant to particular queries.
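
To make the training objective concrete, here is a minimal sketch of a pairwise loss over one triple. The article does not say which loss function was used; the hinge form and the `margin` parameter below are assumptions, chosen because they are a common way to push two scores apart.

```python
def pairwise_hinge_loss(score_relevant, score_nonrelevant, margin=1.0):
    """Loss for one (query, relevant, non-relevant) triple.

    Assumed hinge form: the loss is zero once the relevant document
    outscores the non-relevant one by at least `margin`.
    """
    return max(0.0, margin - (score_relevant - score_nonrelevant))


# The scores would come from the neural ranker; these are made-up numbers.
print(pairwise_hinge_loss(2.0, 0.5))  # relevant document already ahead by the margin
print(pairwise_hinge_loss(0.0, 1.0))  # non-relevant document ranked higher
```

Minimizing this quantity over many triples is one way to widen the gap between the two scores.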

In our approach, we leverage the fact that news article headlines and Wikipedia entry section titles are already associated with relevant texts: the articles and sections they introduce. Headlines and titles may not look exactly like search strings, but our hypothesis was that they’re similar enough for training purposes. Training a machine learning system to find correlations between headlines and articles, we reasoned, should help it find correlations between search strings and texts.

Our first step: collect millions of document-title pairs from the New York Times’ online repository and from Wikipedia. Each document-title pair constituted two-thirds of a data triple we would use in training a machine learning system: the query and the relevant text. To round out the triples, we used an industry-standard algorithm to identify texts that are related to the query (but less relevant than the associated text). The algorithm assigns relevance scores based on the number of words in the document that also appear in the query.
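
The triple construction described above can be sketched roughly as follows. The scoring function is a deliberately simplified reading of "the number of words in the document that also appear in the query"; the function names, the whitespace tokenization, and the choice of the single highest-scoring other document as the non-relevant text are all assumptions for illustration, not the algorithm actually used.

```python
def overlap_score(query, document):
    """Count the document's words that also appear in the query."""
    query_terms = set(query.lower().split())
    return sum(1 for word in document.lower().split() if word in query_terms)


def build_triple(title, article, candidates):
    """Return (query, relevant, non_relevant) for one title-article pair.

    The title plays the role of the query and its own article is the
    relevant text. The non-relevant text is the other candidate that
    overlaps most with the title (assumes at least one other candidate).
    """
    others = [doc for doc in candidates if doc != article]
    non_relevant = max(others, key=lambda doc: overlap_score(title, doc))
    return title, article, non_relevant


corpus = [
    "neural search models rank documents",
    "search engines index the web",
    "a recipe for bread",
]
print(build_triple("neural search models", corpus[0], corpus))
```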

As a strong baseline, we used a data set from AOL consisting of actual customer queries and search results. Here, we used the standard algorithm to identify the most relevant and non-relevant texts for each query. We also used two other baselines: a set of about 25,000 hand-annotated data triples and the application of the standard relevance algorithm to the test data.

With each of the four training sets (NYT, Wikipedia, AOL, and the hand-annotated set) we trained three different neural networks to do information retrieval and scored them using a metric called normalized discounted cumulative gain (nDCG). We used this metric to measure the cumulative relevance of the top 20 results returned by each network. Of the baselines, the combination of the AOL data set and a neural architecture called a position-aware convolutional recurrent relevance network, or PACRR, yielded the best results. But on the same architecture, our NYT data set offered a 12% increase in nDCG. (The Wikipedia data set also conferred gains, but they were less dramatic.)
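
For readers unfamiliar with nDCG, the sketch below shows one common formulation, cut off at the top 20 results. It assumes a linear gain and a log2 rank discount, and it normalizes against the ideal ordering of the returned list only; other variants exist, and the article does not say which one the experiments used.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain of the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances, k=20):
    """DCG divided by the DCG of the same results in ideal order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Graded relevance labels of a ranked result list, best-first would be [3, 2, 1, 0].
print(ndcg([3, 2, 1, 0]))
print(ndcg([0, 1, 2, 3]))
```

A perfectly ordered list scores 1.0, and the score falls as relevant results slip down the ranking.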

Once we established the utility of our approach, we tried to improve it still further by tuning our information retrieval system to the domain of the data on which it was going to be tested. To do this, we used two different filtration techniques to limit the training data to samples similar to those in the test set.

The first technique was to take some canonical examples of data from the target domain and use a representation function to map them to a multidimensional space. Then we selected the training examples that the same function mapped to nearby points in the space and used them to re-train the information retrieval system.
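
A rough sketch of this kind of representation-based filtering follows. The article does not name the representation function or the distance measure, so the bag-of-words vectors, the cosine similarity, and the `threshold` value here are stand-in assumptions.

```python
import math
from collections import Counter


def represent(text):
    """Hypothetical representation function: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def filter_by_similarity(training_texts, target_examples, threshold=0.3):
    """Keep training texts that land near at least one target-domain example."""
    targets = [represent(example) for example in target_examples]
    return [
        text
        for text in training_texts
        if max((cosine(represent(text), t) for t in targets), default=0.0) >= threshold
    ]


kept = filter_by_similarity(
    ["stock markets fall sharply", "how to bake bread"],
    ["markets fall on rate fears"],
)
print(kept)
```

In practice a learned embedding would likely replace the word counts, but the selection step has the same shape.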

The second technique was somewhat akin to adversarial training: we trained a neural network to distinguish data from the new target domain from the data originally used to train the information retrieval system. Then we kept only the training examples that received low confidence scores from the discriminator — the ones that were hard to distinguish from data in the new domain.
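
The selection step of this second technique might look like the sketch below. Training the discriminator itself is out of scope here, so `toy_discriminator` is a hand-written stand-in for a trained network, and the `max_confidence` cutoff is an assumed parameter; neither comes from the article.

```python
def filter_by_discriminator(training_examples, discriminator, max_confidence=0.6):
    """Keep examples the discriminator cannot confidently tell from target data.

    `discriminator(example)` is assumed to return the probability that the
    example comes from the original training data rather than the target domain.
    """
    return [ex for ex in training_examples if discriminator(ex) <= max_confidence]


def toy_discriminator(text):
    """Stand-in for a trained network; the marker words are invented."""
    source_markers = {"headline", "editorial"}
    hits = sum(1 for word in text.lower().split() if word in source_markers)
    return min(1.0, 0.5 + 0.25 * hits)


kept = filter_by_discriminator(
    ["editorial headline on taxes", "best laptop for students"],
    toy_discriminator,
)
print(kept)
```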

The second technique, filtering with the neural discriminator, worked best. Again, the combination of the PACRR network and the NYT data set yielded the best results. But re-training the retrieval model on data filtered using the neural discriminator boosted the nDCG score by 35%.

Acknowledgments: Sean MacAvaney, Andrew Yates, Ophir Frieder
