Amazon at ICLR: Graphs, time series, and more

Other paper topics include natural-language processing, dataset optimization, and the limits of existing machine learning techniques.

Time series forecasting and graph representations of data are both major topics of research at Amazon: time series forecasting is crucial to both supply chain optimization and product recommendation, and graph representations help make sense of the large datasets that are common at Amazon’s scale, such as the Amazon product catalogue.

So it’s no surprise that both topics are well represented among the Amazon papers at the 2022 International Conference on Learning Representations (ICLR), which takes place this week. Another paper also touches on one of Amazon’s core scientific interests, natural-language processing, or computation involving free-form text inputs.

The remaining Amazon papers discuss more general machine learning techniques, such as data augmentation, or automatically selecting or generating training examples that can improve the performance of machine learning models. Another paper looks at dataset optimization more generally, proposing a technique that could be used to evaluate individual examples for inclusion in a dataset or exclusion from it. And two papers from Amazon Web Services’ Causal-Representation Learning team, which includes Amazon vice president and distinguished scientist Bernhard Schölkopf, examine the limitations of existing approaches to machine learning.

Graphs

Graphs represent data as nodes, usually depicted as circles, and edges, usually depicted as line segments connecting nodes. Graph-structured data can make machine learning more efficient, because the graph explicitly encodes relationships that a machine learning model would otherwise have to infer from data correlations.

Graph neural networks (GNNs) are a powerful tool for working with graph-structured data. Like most neural networks, GNNs produce embeddings, or fixed-length vector representations of input data, that are useful for particular computational tasks. In the case of GNNs, the embeddings capture information about both the object associated with a given node and the structure of the graph.
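To make the idea of an embedding that mixes node data with graph structure concrete, here is a minimal sketch of one round of neighborhood aggregation on a toy graph. The graph, the feature values, and the `mean_aggregate` function are all hypothetical illustrations; real GNNs add learned weights and nonlinearities, which are omitted here.

```python
# Toy undirected graph stored as an adjacency list (node -> neighbors).
# All names and numbers here are made up for illustration.
graph = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a"],
    "d": [],  # isolated node: nothing to aggregate from
}
features = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.0, 3.0],
    "d": [2.0, 2.0],
}

def mean_aggregate(graph, features):
    """One aggregation round: average a node's own features with
    the mean of its neighbors' features (no learned weights)."""
    out = {}
    for node, neighbors in graph.items():
        own = features[node]
        if not neighbors:
            # No structure to use, so the embedding is just the node data.
            out[node] = list(own)
            continue
        dim = len(own)
        neighbor_mean = [
            sum(features[n][k] for n in neighbors) / len(neighbors)
            for k in range(dim)
        ]
        out[node] = [(own[k] + neighbor_mean[k]) / 2 for k in range(dim)]
    return out

embeddings = mean_aggregate(graph, features)
```

After one round, the embedding for node "a" reflects both its own features and those of "b" and "c", while the isolated node "d" keeps only its own data.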

In real-world applications — say, a graph indicating which products tend to be purchased together — some nodes may not be connected to any others, and some connections may be spurious inferences from sparse data. In “Cold Brew: Distilling graph node representations with incomplete or missing neighborhoods”, Amazon scientists present a method for handling nodes whose edge data is absent or erroneous.

Cold Brew addresses the real-world problem in which graph representations of data feature potentially spurious connections (tail nodes) or absent connections (cold start). Figure from "Cold Brew: Distilling graph node representations with incomplete or missing neighborhoods".

In a variation on knowledge distillation, they use a conventional GNN, which requires that each input node be connected to the rest of the graph, to train a teacher network that can produce embeddings for connected nodes. Then they train a standard multilayer perceptron — a student network — to mimic the teacher’s outputs. Unlike a conventional GNN, the student network doesn’t explicitly use structural data to produce embeddings, so it can also handle unconnected nodes. The method demonstrates significant improvements over existing methods of inferring graph structure on several benchmark datasets.
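The teacher-student setup can be sketched in a few lines. This is only a toy under strong assumptions, not the Cold Brew implementation: the `teacher` function is a hand-written stand-in for a trained GNN, the student is a one-parameter linear model rather than a multilayer perceptron, and all data is hypothetical.

```python
# Hypothetical toy data: node -> (own scalar feature, neighbor list).
nodes = {
    "a": (1.0, ["b", "c"]),
    "b": (2.0, ["a"]),
    "c": (3.0, ["a", "b"]),
}

def teacher(node):
    """Stand-in for a GNN: it needs neighbors, and mixes the node's
    own feature with the mean of its neighbors' features."""
    x, neighbors = nodes[node]
    neighbor_mean = sum(nodes[n][0] for n in neighbors) / len(neighbors)
    return 0.5 * x + 0.5 * neighbor_mean

def train_student(steps=2000, lr=0.05):
    """Fit student(x) = w * x + b to the teacher's outputs by
    gradient descent on mean squared error. The student sees only
    node features, never the graph structure."""
    w, b = 0.0, 0.0
    data = [(nodes[n][0], teacher(n)) for n in nodes]
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - t) * x for x, t in data) / len(data)
        grad_b = sum(2 * (w * x + b - t) for x, t in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_student()

def student(x):
    return w * x + b

# A node with no edges: the teacher cannot process it, the student can.
cold_start_embedding = student(2.5)
```

The point of the sketch is the last line: because the student depends only on a node's own features, it can produce an output for a node that has no neighbors at all.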

Across disciplines, AI research has recently seen a surge in the popularity of self-supervised learning, in which a machine learning model is first trained on a “proxy task”, which is related to but not identical to the target task, using unlabeled or automatically labeled data. Then the model is fine-tuned on labeled data for the target task.

With GNNs, the proxy tasks generally teach the network only how to represent node data. But in “Node feature extraction by self-supervised multi-scale neighborhood prediction”, Amazon researchers and their colleagues at the University of Illinois and UCLA present a proxy task that teaches the GNN how to represent information about graph structure as well. Their approach is highly scalable, working with graphs with hundreds of millions of nodes, and in experiments, they show that it improves GNN performance on three benchmark datasets, by almost 30% on one of them.

XR-Transformer creates a hierarchical tree that sorts data into finer- and finer-grained clusters. In the context of graph neural networks, the clusters represent graph neighborhoods. Figure from "Node feature extraction by self-supervised multi-scale neighborhood prediction".

The approach, which builds on Amazon’s XR-Transformer model and is known as GIANT-XRT, has already been widely adopted and is used by the leading teams in several of the public Open Graph Benchmark competitions hosted by Stanford University.

Where traditional domain adaptation (left) treats all target domains the same, a new method (right) uses graphs to represent relationships between source and target domains. For instance, weather patterns in adjacent U.S. states tend to be more similar than the weather patterns in states distant from each other. Figure from “Graph-relational domain adaptation”.

A third paper, “Graph-relational domain adaptation”, applies graphs to the problem of domain adaptation, or optimizing a machine learning model to work on data with a different distribution than the data it was trained on. Conventional domain adaptation techniques treat all target domains the same, but the Amazon researchers and their colleagues at Rutgers and MIT instead use graphs to represent relationships among all source and target domains. For instance, weather patterns in adjacent U.S. states tend to be more similar than the weather patterns in states distant from each other. In experiments, the researchers show that their method improves on existing domain adaptation methods on both synthetic and real-world datasets.

Time series

Time series forecasting is essential to demand prediction, which Amazon uses to manage inventory, and it’s also useful for recommendation, which can be interpreted as continuing a sequence of product (say, music or movie) selections.

In “Bridging recommendation and marketing via recurrent intensity modeling”, Amazon scientists adapt existing mechanisms for making personal recommendations on the basis of time series data (purchase histories) to the problem of identifying the target audience for a new product.

Product recommendation can be interpreted as a time-series-forecasting problem, in which a product is recommended according to its likelihood of continuing a sequence of purchases. Figure from "Bridging recommendation and marketing via recurrent intensity modeling".

Where methods for identifying a product’s potential customers tend to treat customers as atemporal collections of purchase decisions, the Amazon researchers instead frame the problem as optimizing both the product’s relevance to the customer and the customer’s activity level, or likelihood of buying any product in a given time span. In experiments, this improved the accuracy of a prediction model on several datasets.

One obstacle to the development of machine learning models that base predictions on time series data is the availability of training examples. In “PSA-GAN: Progressive self attention GANs for synthetic time series”, Amazon researchers propose a method for using generative adversarial networks (GANs) to artificially produce time series training data.

GANs pit generators, which produce synthetic data, against discriminators, which try to distinguish synthetic data from real. The two are trained together, each improving the performance of the other.

The Amazon researchers show how to synthesize plausible time series data by progressively growing — or adding network layers to — both the generator and the discriminator. This enables the generator to first learn general characteristics that the time series as a whole should have, then learn how to produce series that exhibit those characteristics.

Data augmentation

In addition to the paper on synthetic time series, one of Amazon’s other papers at ICLR, “Deep AutoAugment”, also focuses on data augmentation.

It’s become standard practice to augment the datasets used to train machine learning models by subjecting real data to sequences of transformations. For instance, a training image for a computer vision task might be flipped, stretched, rotated or cropped, or its color or contrast might be modified. Typically, the first few transformations are selected automatically, based on experiments in which a model is trained and retrained, and then domain experts add a few additional transformations to try to make the modified data look like real data.
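As a rough illustration of what "a sequence of transformations" means in practice, the sketch below applies a flip, a crop, and a contrast change to a tiny grayscale image stored as nested lists. The function names, the pipeline order, and the pixel values are assumptions chosen for the example; production pipelines typically rely on an image library instead.

```python
def hflip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def crop(image, top, left, height, width):
    """Keep a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def scale_contrast(image, factor):
    """Scale each pixel's distance from the image mean by `factor`,
    clamping the result to the 0..255 range."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(255, max(0, round(mean + factor * (p - mean)))) for p in row]
        for row in image
    ]

def apply_pipeline(image, pipeline):
    """Apply each transformation in order."""
    for transform in pipeline:
        image = transform(image)
    return image

# Hypothetical 3x3 grayscale image.
image = [
    [0, 50, 100],
    [50, 100, 150],
    [100, 150, 200],
]
pipeline = [
    hflip,
    lambda img: crop(img, 0, 0, 2, 2),
    lambda img: scale_contrast(img, 2.0),
]
augmented = apply_pipeline(image, pipeline)
```

Choosing which transformations go into `pipeline`, and in what order, is the part that is usually searched over automatically or tuned by hand.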

In “Deep AutoAugment”, former Amazon senior applied scientist Zhi Zhang and colleagues at Michigan State University propose a method for fully automating the construction of a data augmentation pipeline. The goal is to continuously add transformations that steer the feature distribution of the synthetic data toward that of the real data. To do that, the researchers use gradient matching, or identifying training data whose sequential updates to the model parameters look like those of the real data. In tests, this approach improved on 10 other data augmentation techniques across four sets of real data.
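One simple way to picture gradient matching is to compare the direction of the model update produced by an augmented batch with the direction produced by real data. The sketch below does that for a two-parameter linear model using cosine similarity. It is an illustration under stated assumptions, not the paper's procedure: the model, the candidate augmentations, and the scoring rule are all hypothetical.

```python
import math

def batch_gradient(w, batch):
    """Gradient of mean squared error for a linear model y_hat = w . x."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for k, xk in enumerate(x):
            grad[k] += 2 * err * xk / len(batch)
    return grad

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

w = [0.0, 0.0]  # current model parameters
real = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0)]

# Hypothetical augmented versions of the real batch.
candidates = {
    "small_shift": [([1.1, 0.0], 1.0), ([0.0, 0.9], 2.0)],
    "axis_swap": [([0.0, 1.0], 1.0), ([1.0, 0.0], 2.0)],
}

real_grad = batch_gradient(w, real)
scores = {
    name: cosine(batch_gradient(w, batch), real_grad)
    for name, batch in candidates.items()
}
best = max(scores, key=scores.get)
```

Here the mild perturbation yields an update that points almost the same way as the real data's update, while swapping the feature axes does not, so the first candidate scores higher.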

Natural-language processing

Many natural-language-processing tasks involve pairwise comparison of sentences. Cross-encoders, which map pairs of sentences against each other, yield the most accurate comparison, but they’re computationally intensive, as they need to compute new mappings for every sentence pair. Moreover, converting a pretrained language model into a cross-encoder requires fine-tuning it on labeled data, which is resource intensive to acquire.

Bi-encoders, on the other hand, embed sentences in a common representational space and measure the distances between them. This is efficient but less accurate.
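The efficiency argument is easier to see in code. In the sketch below, `embed` is a hypothetical stand-in for a sentence encoder (it just counts words from a toy vocabulary); the point is that each sentence is encoded once, and any pair can then be compared with a cheap vector operation rather than a fresh model call per pair.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary; a real bi-encoder uses a neural network.
VOCAB = ["cat", "dog", "sat", "ran", "mat", "park"]

def embed(sentence):
    """Stand-in encoder: word counts over VOCAB as a fixed-length vector."""
    counts = Counter(sentence.lower().split())
    return [float(counts[word]) for word in VOCAB]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

sentences = [
    "the cat sat on the mat",
    "a cat sat on a mat",
    "the dog ran in the park",
]

# One encoder call per sentence; embeddings can be cached and reused.
embeddings = [embed(s) for s in sentences]

similar = cosine(embeddings[0], embeddings[1])
different = cosine(embeddings[0], embeddings[2])
```

With n sentences, this needs n encoder calls, whereas a cross-encoder would need one model call for each of the roughly n²/2 pairs.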

In “Trans-encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations”, Amazon researchers, together with a former intern, propose a model that is trained in an entirely unsupervised way (that is, without labeled examples) and that captures advantages of both approaches.

The trans-encoder training process, in which a bi-encoder trained in an unsupervised fashion creates training targets for a cross-encoder, which in turn outputs training targets for the bi-encoder.

The researchers begin with a pretrained language model, fine-tune it in an unsupervised manner using bi-encoding, then use the fine-tuned model to generate training targets for cross-encoding. They then use the outputs of the cross-encoding model to fine-tune the bi-encoder, iterating back and forth between the two approaches until training converges. In experiments, their model outperformed multiple state-of-the-art unsupervised sentence encoders on several benchmark tasks, with improvements of up to 5% over the best-performing prior models.

Dataset optimization

Weeding errors out of a dataset, selecting new training examples to augment a dataset, and determining how to weight the data in a dataset to better match a target distribution are all examples of dataset optimization. Assessing individual training examples’ contribution to the accuracy of a model, however, is difficult: retraining the model on a dataset with and without every single example is hardly practical.

In “DIVA: Dataset derivative of a learning task”, Amazon researchers show how to compute the dataset derivative: a function that can be used to assess a given training example’s utility relative to a particular neural-network model. During training, the model learns not only the weights of network parameters but also weights for individual training examples. The researchers show that, using a linearization technique, they can derive a closed-form equation for the dataset derivative, allowing them to assess the utility of a given training example without retraining the network.
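The idea of scoring a training example without retraining has a classical analogue in linear regression, where the leave-one-out residual has a closed form. The sketch below shows that analogue for a one-parameter least-squares fit and checks it against brute-force retraining. It is not the DIVA derivation, which applies to linearized neural networks; the data and function names are hypothetical.

```python
def fit_slope(points):
    """Least-squares slope for the model y = w * x (no intercept)."""
    return sum(x * y for x, y in points) / sum(x * x for x, _ in points)

def loo_residuals_closed_form(points):
    """Leave-one-out residuals from a single fit, using the identity
    r_loo = r / (1 - h), where h = x_i^2 / sum(x^2) is the leverage."""
    w = fit_slope(points)
    total = sum(x * x for x, _ in points)
    return [(y - w * x) / (1 - x * x / total) for x, y in points]

def loo_residuals_by_retraining(points):
    """Same quantity, computed the slow way: refit without each point."""
    residuals = []
    for i, (x, y) in enumerate(points):
        rest = points[:i] + points[i + 1:]
        residuals.append(y - fit_slope(rest) * x)
    return residuals

# Hypothetical data; the last point is a deliberate outlier.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 9.0)]

fast = loo_residuals_closed_form(points)
slow = loo_residuals_by_retraining(points)
most_suspect = max(range(len(points)), key=lambda i: abs(fast[i]))
```

The closed-form and brute-force results agree, and the example with the largest leave-one-out residual is the outlier, which is the kind of per-example signal a dataset-optimization method can act on.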

Training examples to which DIVA assigns high weights (left) and low weights (right) for the task of classifying aircraft. Figure from "DIVA: Dataset derivative of a learning task".

Limitations

“Machine learning ultimately is based on statistical dependencies,” Bernhard Schölkopf recently told Amazon Science. “Oftentimes, it's enough if we work at the surface and just learn from these dependencies. But it turns out that it's only enough as long as we're in this setting where nothing changes.”

The two ICLR papers from the Causal Representation Learning team explore contexts in which learning statistical dependencies is not enough. “Visual representation learning does not generalize strongly within the same domain” describes experiments with image datasets in which each image is defined by specific values of a set of variables — say, different shapes of different sizes and colors, or faces that are either smiling or not and differ in hair color or age.

The researchers test 17 machine learning models and show that, if certain combinations of variables or specific variable values are held out of the training data, all 17 have trouble recognizing them in the test data. For instance, a model trained to recognize small hearts and large squares has trouble recognizing large hearts and small squares. This suggests that we need revised training techniques or model designs to ensure that machine learning systems are really learning what they’re supposed to.
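The hearts-and-squares example corresponds to a split in which every individual variable value appears in training but some combinations do not. A minimal sketch of such a split, using hypothetical variable names and only two variables, looks like this:

```python
from itertools import product

# Hypothetical generative variables for a toy image dataset.
shapes = ["heart", "square"]
sizes = ["small", "large"]
all_combinations = set(product(shapes, sizes))

# Train on two combinations, hold out the other two.
train = {("heart", "small"), ("square", "large")}
test = all_combinations - train

def values_seen(examples, position):
    """Set of values taken by the variable at `position`."""
    return {example[position] for example in examples}

# Each shape and each size in the test set was seen during training...
every_value_seen = all(
    values_seen(test, k) <= values_seen(train, k) for k in range(2)
)
# ...but none of the test combinations was.
no_combination_seen = train.isdisjoint(test)
```

A model that had truly learned "shape" and "size" as separate concepts should handle the held-out pairs; the experiments described above suggest that current models often do not.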

An illustration of the four methods of separating training data (black dots) and test data (red dots) in "Visual representation learning does not generalize strongly within the same domain".

Similarly, in “You mostly walk alone: Analyzing feature attribution in trajectory prediction”, members of the team consider the problem of predicting the trajectories of moving objects as they interact with other objects, an essential capacity for self-driving cars and other AI systems. For instance, if a person is walking down the street, and a ball bounces into her path, it could be useful to know that the person might deviate from her trajectory to retrieve the ball.

Adapting the game-theoretical concept of Shapley values, which enable the isolation of different variables’ contributions to an outcome, the researchers examine the best-performing recent models for predicting trajectories in interactive contexts and show that, for the most part, their predictions are based on past trajectories; they pay little attention to the influence of interactions.
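For readers unfamiliar with Shapley values, the sketch below computes them exactly for a three-player toy game by averaging each player's marginal contribution over all orderings. The players and the `value` table are hypothetical stand-ins for "how good is the prediction when the model sees only these inputs"; the paper's actual adaptation to trajectory models is not reproduced here.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for ordering in orderings:
        coalition = frozenset()
        for player in ordering:
            with_player = coalition | {player}
            totals[player] += value[with_player] - value[coalition]
            coalition = with_player
    return {p: total / len(orderings) for p, total in totals.items()}

players = ["past", "neighbor_1", "neighbor_2"]

# Hypothetical scores for a predictor given each subset of inputs.
value = {
    frozenset(): 0.0,
    frozenset({"past"}): 8.0,
    frozenset({"neighbor_1"}): 0.0,
    frozenset({"neighbor_2"}): 0.0,
    frozenset({"past", "neighbor_1"}): 9.0,
    frozenset({"past", "neighbor_2"}): 9.0,
    frozenset({"neighbor_1", "neighbor_2"}): 0.0,
    frozenset({"past", "neighbor_1", "neighbor_2"}): 10.0,
}

attributions = shapley_values(players, value)
```

In this made-up table the target's own past trajectory receives an attribution of 9.0 and each neighbor 0.5, and the attributions sum to the full-coalition score of 10.0. Exact enumeration grows factorially with the number of players, so larger settings generally need sampling or other approximations.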

A new method enables the comparison of different trajectory prediction models according to the extent to which they use social interactions for making predictions (left: none; middle: weak; right: strong). The target agent, whose future trajectory is to be predicted, is shown in red, and modeled interactions are represented by arrows whose width indicates interaction strength. From "You mostly walk alone: Analyzing feature attribution in trajectory prediction".

The one exception is models trained on a dataset of basketball video, where all the players’ movements are constantly coordinated. There, existing models do indeed learn to recognize the influence of interaction. This suggests that careful curation of training data could enable existing models to account for interactions when predicting trajectories.

Research areas

Related content

US, WA, Seattle
Join the Worldwide Sustainability (WWS) organization where we capitalize on our size, scale, and inventive culture to build a more resilient and sustainable company. WWS manages our social and environmental impacts globally, driving solutions that enable our customers, businesses, and the world around us to become more sustainable. Sustainability Science and Innovation is a multi-disciplinary team within the WW Sustainability organization that combines science, analytics, economics, statistics, machine learning, product development, and engineering expertise to identify, evaluate and/or develop new science, technologies, and innovations that aim to address long-term sustainability challenges. We are looking for a Sr. Research Scientist to help us develop and drive innovative scientific solutions that will improve the sustainability of materials in our products, packaging, operations, and infrastructure. You will be at the forefront of exploring and resolving complex sustainability issues, bringing innovative ideas to the table, and making meaningful contributions to projects across SSI’s portfolio. This role not only demands technical expertise but also a strategic mindset and the agility to adapt to evolving sustainability challenges through self-driven learning and exploration. In this role, you will leverage your breadth of expertise in AI models and methodologies and industrial research experience to build scientific tools that inform sustainability strategies related to materials and energy. The successful applicant will lead by example, pioneering science-vetted data-driven approaches, and working collaboratively to implement strategies that align with Amazon’s long-term sustainability vision. Key job responsibilities - Develop scientific models that help solve complex and ambiguous sustainability problems, and extract strategic learnings from large datasets. - Work closely with applied scientists and software engineers to implement your scientific models. 
- Support early-stage strategic sustainability initiatives and effectively learn from, collaborate with, and influence stakeholders to scale-up high-value initiatives. - Support research and development of cross-cutting technologies for industrial decarbonization, including building the data foundation and analytics for new AI models. - Drive innovation in key focus areas including packaging materials, building materials, and alternative fuels. About the team Diverse Experiences: World Wide Sustainability (WWS) values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Inclusive Team Culture: It’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth: We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance: We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve.
GB, MLN, Edinburgh
Do you want a role with deep meaning and the ability to make a major impact? As part of Intelligent Talent Acquisition (ITA), you'll have the opportunity to reinvent the hiring process and deliver unprecedented scale, sophistication, and accuracy for Amazon Talent Acquisition operations. ITA is an industry-leading people science and technology organization made up of scientists, engineers, analysts, product professionals and more, all with the shared goal of connecting the right people to the right jobs in a way that is fair and precise. Last year we delivered over 6 million online candidate assessments, and helped Amazon deliver billions of packages around the world by making it possible to hire hundreds of thousands of workers in the right quantity, at the right location and at exactly the right time. You’ll work on state-of-the-art research, advanced software tools, new AI systems, and machine learning algorithms, leveraging Amazon's in-house tech stack to bring innovative solutions to life. Join ITA in using technologies to transform the hiring landscape and make a meaningful difference in people's lives. Together, we can solve the world's toughest hiring problems. A day in the life As a Research Scientist, you will partner on design and development of AI-powered systems to scale job analyses enterprise-wide, match potential candidates to the jobs they’ll be most successful in, and conduct validation research for top-of-funnel AI-based evaluation tools. You’ll have the opportunity to develop and implement novel research strategies using the latest technology and to build solutions while experiencing Amazon’s customer-focused culture. The ideal scientist must have the ability to work with diverse groups of people and inter-disciplinary cross-functional teams to solve complex business problems. 
About the team The Lead Generation & Detection Services (LEGENDS) organization is a specialized organization focused on developing AI-driven solutions to enable fair and efficient talent acquisition processes across Amazon. Our work encompasses capabilities across the entire talent acquisition lifecycle, including role creation, recruitment strategy, sourcing, candidate evaluation, and talent deployment. The focus is on utilizing state-of-the-art solutions using Deep Learning, Generative AI, and Large Language Models (LLMs) for recruitment at scale that can support immediate hiring needs as well as longer-term workforce planning for corporate roles. We maintain a portfolio of capabilities such as job-person matching, person screening, duplicate profile detection, and automated applicant evaluation, as well as a foundational competency capability used throughout Amazon to help standardize the assessment of talent interested in Amazon.
US, NY, New York
About Sponsored Products and Brands The Sponsored Products and Brands team at Amazon Ads is re-imagining the advertising landscape through industry leading generative AI technologies, revolutionizing how millions of customers discover products and engage with brands across Amazon.com and beyond. We are at the forefront of re-inventing advertising experiences, bridging human creativity with artificial intelligence to transform every aspect of the advertising lifecycle from ad creation and optimization to performance analysis and customer insights. We are a passionate group of innovators dedicated to developing responsible and intelligent AI technologies that balance the needs of advertisers, enhance the shopping experience, and strengthen the marketplace. If you're energized by solving complex challenges and pushing the boundaries of what's possible with AI, join us in shaping the future of advertising. About our team The Search Ranking and Interleaving (R&I) team within Sponsored Products and Brands is responsible for determining which ads to show and the quality of ads shown on the search page (e.g., relevance, personalized and contextualized ranking to improve shopper experience, where to place them, and how many ads to show on the search page. This helps shoppers discover new products while helping advertisers put their products in front of the right customers, aligning shoppers’, advertisers’, and Amazon’s interests. To do this, we apply a broad range of GenAI and ML techniques to continuously explore, learn, and optimize the ranking and allocation of ads on the search page. We are an interdisciplinary team with a focus on improving the SP experience in search by gaining a deep understanding of shopper pain points and developing new innovative solutions to address them. A day in the life As an Applied Scientist on this team, you will identify big opportunities for the team to make a direct impact on customers and the search experience. 
You will work closely with with search and retail partner teams, software engineers and product managers to build scalable real-time GenAI and ML solutions. You will have the opportunity to design, run, and analyze A/B experiments that improve the experience of millions of Amazon shoppers while driving quantifiable revenue impact while broadening your technical skillset. Key job responsibilities - Solve challenging science and business problems that balance the interests of advertisers, shoppers, and Amazon. - Drive end-to-end GenAI & Machine Learning projects that have a high degree of ambiguity, scale, complexity. - Develop real-time machine learning algorithms to allocate billions of ads per day in advertising auctions. - Develop efficient algorithms for multi-objective optimization using deep learning methods to find operating points for the ad marketplace then evolve them - Research new and innovative machine learning approaches.
US, CA, San Francisco
Are you interested in a unique opportunity to advance the accuracy and efficiency of Artificial General Intelligence (AGI) systems? If so, you're at the right place! We are the AGI Autonomy organization, and we are looking for a driven and talented Member of Technical Staff to join us to build state-of-the art agents. AGI Autonomy is focused on developing new foundational capabilities for useful AI agents that can take actions in the digital and physical worlds. In other words, we’re enabling practical AI that can actually do things for us and make our customers more productive, empowered, and fulfilled. In this role, you will work closely with research teams to design, build, and maintain systems for training and evaluating state-of-the-art agent models. Our team works inside the Amazon AGI SF Lab, an environment designed to empower AI researchers and engineers to work with speed and focus. Our philosophy combines the agility of a startup with the resources of Amazon. Key job responsibilities * Evaluate performance of the training infrastructure, diagnose problems and address any gaps that exist. * Develop reliable infrastructure to schedule training and model evaluation jobs across clusters. * Work closely with researchers to create new techniques, infrastructure, and tooling around emerging research capabilities and evaluating models to meet customer needs. * Manage project prioritization, deliverables, timelines, and stakeholder communication. * Illuminate trade-offs, educate the team on best practices, and influence technical strategy. * Operate in a dynamic environment to deliver high quality software. About the team The Amazon AGI SF Lab is focused on developing new foundational capabilities for enabling useful AI agents that can take actions in the digital and physical worlds. In other words, we’re enabling practical AI that can actually do things for us and make our customers more productive, empowered, and fulfilled. 
The lab is designed to empower AI researchers and engineers to make major breakthroughs with speed and focus toward this goal. Our philosophy combines the agility of a startup with the resources of Amazon. By keeping the team lean, we’re able to maximize the amount of compute per person. Each team in the lab has the autonomy to move fast and the long-term commitment to pursue high-risk, high-payoff research.
US, MD, Jessup