Amazon scientists published more research papers in 2021 than in any previous year in the company's history. Below is the list of the most downloaded papers from our site in 2021.

The 10 most read research papers of 2021

Amazon researchers authored hundreds of papers in 2021; the 10 below were downloaded the most.

  1. "This paper reports our experience applying lightweight formal methods to validate the correctness of ShardStore, a new key-value storage node implementation for the Amazon S3 cloud object storage service. By “lightweight formal methods” we mean a pragmatic approach to verifying the correctness of a production storage node that is under ongoing feature development by a full-time engineering team. We do not aim to achieve full formal verification, but instead emphasize automation, usability, and the ability to continually ensure correctness as both software and its specification evolve over time."

    Read our blog post about this paper

    The authors won a best-paper award at the ACM Symposium on Operating Systems Principles. James Bornholt writes about how the paper describes lightweight formal methods for validating a new S3 data storage service.

  2. "The rich body of Bandit literature not only offers a diverse toolbox of algorithms, but also makes it hard for a practitioner to find the right solution to solve the problem at hand. Typical textbooks on Bandits focus on designing and analyzing algorithms, and surveys on applications often present a list of individual applications. While these are valuable resources, there exists a gap in mapping applications to appropriate Bandit algorithms. In this paper, we aim to reduce this gap with a structured map of Bandits to help practitioners navigate to find relevant and practical Bandit algorithms."

  3. "We study the identification of direct and indirect causes on time series with latent variables, and provide a constrained-based causal feature selection method, which we prove that is both sound and complete under some graph constraints. Our theory and estimation algorithm require only two conditional independence tests for each observed candidate time series to determine whether or not it is a cause of an observed target time series. Furthermore, our selection of the conditioning set is such that it improves signal to noise ratio. We apply our method on real data, and on a wide range of simulated experiments, which yield very low false positive and relatively low false negative rates."

    Read our blog post about this paper

    Authors Atalanti Mastakouri and Dominik Janzing wrote about the paper they co-authored with Bernhard Schölkopf. Read their post about "a new technique for detecting all the direct causal features of a target time series."

  4. "In this paper, we focus on improving online multi-object tracking (MOT). In particular, we introduce a region-based Siamese Multi-Object Tracking network, which we name SiamMOT. SiamMOT includes a motion model that estimates the instance’s movement between two frames such that detected instances are associated. To explore how the motion modelling affects its tracking capability, we present two variants of Siamese tracker, one that implicitly models motion and one that models it explicitly. We carry out extensive quantitative experiments on three different MOT datasets: MOT17, TAO-person and Caltech Roadside Pedestrians, showing the importance of motion modelling for MOT and the ability of SiamMOT to substantially outperform the state-of-the-art."

  5. "Amazon Last Mile strives to learn an accurate delivery point for each address by using the noisy GPS locations reported from past deliveries. Centroids and other center-finding methods do not serve well, because the noise is consistently biased. The problem calls for supervised machine learning, but how? We addressed it with a novel adaptation of learning to rank from the information retrieval domain. This also enabled information fusion from map layers. Offline experiments show outstanding reduction in error distance, and online experiments estimated millions in annualized savings."

    Read our blog post about this paper

    George Forman wrote about the paper he presented at the European Conference on Machine Learning. Learn more about how he adapted "an idea from information retrieval — learning-to-rank — to the problem of predicting the coordinates of a delivery location from past GPS data."

  6. "Seasonality is an important dimension for relevance in e-commerce search. For example, a query jacket has a different set of relevant documents in winter than summer. For an optimal user experience, the e-commerce search engines should incorporate seasonality in product search. In this paper, we formally introduce the concept of seasonal relevance, define it and quantify using data from a major e-commerce store. In our analyses, we find 39% queries are highly seasonally relevant to the time of search and would benefit from handling seasonality in ranking. We propose LogSR and VelSR features to capture product seasonality using state-of-the-art neural models based on self-attention. Comprehensive offline and online experiments over large datasets show the efficacy of our methods to model seasonal relevance. The online A/B test on 784 MM queries shows the treatment with seasonal relevance features results in 2.20% higher purchases and better customer experience overall."

  7. "Since 2015, Amazon has reduced the weight of its outbound packaging by 36%, eliminating over 1,000,000 tons of packaging material worldwide, or the equivalent of over 2 billion shipping boxes, thereby reducing carbon footprint throughout its fulfillment supply chain. In this position paper, we share insights on using deep learning to identify the optimal packaging type best suited to ship each item in a diverse product catalog at scale so that it arrives undamaged, delights customers, and reduces packaging waste. Incorporating multimodal data on products including product images and class imbalance handling technique are important to improving model performance."

  8. "While pre-trained large language models (LLM) like BERT have achieved state-of-the-art in several NLP tasks, their performance on tasks with additional grounding e.g. with numeric and categorical features is less studied. In this paper, we study the application of pre-trained LLM for click-through-rate (CTR) prediction for product advertisement in e-commerce. This is challenging because the model needs to a) learn from language as well as tabular data features, b) maintain low-latency (<5 ms) at inference time, and c) adapt to constantly changing advertisement distribution. We first show that scaling the pre-trained language model to 1.5 billion parameters significantly improves performance over conventional CTR baselines. We then present CTR-BERT, a novel lightweight cache-friendly factorized model for CTR prediction that consists of twin-structured BERT-like encoders for text with a mechanism for late fusion for text and tabular features."

  9. "Large-scale time series panels have become ubiquitous over the last years in areas such as retail, operational metrics, IoT, and medical domain (to name only a few). This has resulted in a need for forecasting techniques that effectively leverage all available data by learning across all time series in each panel. Among the desirable properties of forecasting techniques, being able to generate probabilistic predictions ranks among the top. In this paper, we therefore present Level Set Forecaster (LSF), a simple yet effective general approach to transform a point estimator into a probabilistic one. By recognizing the connection of our algorithm to random forests (RFs) and quantile regression forests (QRFs), we are able to prove consistency guarantees of our approach under mild assumptions on the underlying point estimator. As a byproduct, we prove the first consistency results for QRFs under the CART-splitting criterion. Empirical experiments show that our approach, equipped with tree-based models as the point estimator, rivals state-of-the-art deep learning models in terms of forecasting accuracy."

  10. "For voice assistants like Alexa, Google Assistant and Siri, correctly interpreting users’ intentions is of utmost importance. However, users sometimes experience friction with these assistants, caused by errors from different system components or user errors such as slips of the tongue. Users tend to rephrase their query until they get a satisfactory response. Rephrase detection is used to identify the rephrases and has long been treated as a task with pairwise input, which does not fully utilize the contextual information (e.g. users’ implicit feedback). To this end, we propose a contextual rephrase detection model ContReph to automatically identify rephrases from multi-turn dialogues. We showcase how to leverage the dialogue context and user-agent interaction signals, including user’s implicit feedback and the time gap between different turns, which can help significantly outperform the pairwise rephrase detection models."
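
Item 9's abstract describes turning a point estimator into a probabilistic one. As a rough illustration of that general idea only, the Python sketch below groups training targets by the value of their point prediction and reads quantiles off each group. The binning scheme, the function names (`fit_level_sets`, `predict_quantiles`), and the synthetic data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def fit_level_sets(point_preds, targets, n_bins=10):
    """Group training targets by the bin their point prediction falls in.

    Hypothetical helper: equal-frequency binning is an assumption of this sketch.
    """
    point_preds = np.asarray(point_preds, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Interior bin edges at evenly spaced quantiles of the training predictions.
    edges = np.quantile(point_preds, np.linspace(0.0, 1.0, n_bins + 1))[1:-1]
    # Each prediction maps to a bin index in 0..n_bins-1.
    bin_ids = np.searchsorted(edges, point_preds, side="right")
    groups = [targets[bin_ids == b] for b in range(n_bins)]
    return edges, groups, targets


def predict_quantiles(model, new_pred, quantiles=(0.1, 0.5, 0.9)):
    """Return empirical target quantiles for the bin that new_pred falls in."""
    edges, groups, all_targets = model
    b = int(np.searchsorted(edges, new_pred, side="right"))
    # Fall back to all training targets if the bin happens to be empty.
    group = groups[b] if groups[b].size > 0 else all_targets
    return np.quantile(group, quantiles)


# Synthetic data whose noise grows with x; 2*x stands in for a point estimator.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=2000)
y = 2.0 * x + rng.normal(0.0, 1.0 + 0.5 * x)

model = fit_level_sets(2.0 * x, y)
low = predict_quantiles(model, 1.0)    # small point forecast
high = predict_quantiles(model, 19.0)  # large point forecast
print("quantiles near 1.0: ", low)
print("quantiles near 19.0:", high)
```

Because the synthetic noise grows with `x`, the interval around the larger point forecast should come out wider than the one around the smaller forecast, which a bare point estimate cannot express.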
