Learning to learn learning-rate schedules

In a series of papers, Amazon researchers performed a theoretical analysis of a simplified problem that led to a learnable learning-rate scheduler, applied that scheduler to a more complex neural model, and distilled the results into a practical algorithm.

Training a machine learning model can be thought of as exploring a landscape that maps settings of the model parameters against average error rate. The goal of training is to find the bottom of the lowest basin in the landscape, or the parameter settings that yield the lowest error rate or “loss” value.

A critical hyperparameter during training is the learning rate, which determines how big an effect the learning from a given batch of training data can have on a model’s parameter settings. It’s common to vary the learning rate throughout training: for instance, we might use a high learning rate at the outset to rapidly explore the whole landscape but lower the learning rate over time to ensure that we don’t leap over a global minimum.
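To make the overshooting risk concrete, here is a minimal sketch that runs plain gradient descent on the toy loss f(w) = w². The function name, the toy loss, and the two learning-rate values are illustrative assumptions, not taken from the papers.

```python
def gradient_descent(lr, steps=20, w0=5.0):
    """Minimize the toy loss f(w) = w**2 with a fixed learning rate."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w       # derivative of w**2
        w = w - lr * grad    # the learning rate scales the size of each update
    return w


# A small rate shrinks w toward the minimum at 0 on every step.
small = gradient_descent(lr=0.1)
# A rate that is too large jumps past the minimum and lands farther away each time.
large = gradient_descent(lr=1.1)
print(abs(small), abs(large))
```

In this toy setting each update multiplies w by (1 - 2·lr), so any rate above 1.0 makes the iterates grow instead of shrink.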

Varying the learning rate is known as learning-rate scheduling, and it’s instrumental in achieving stable convergence and maximum accuracy. Yet crafting optimal schedules often relies on painstaking trial-and-error experimentation. As models grow more complex, manual tuning becomes increasingly unscalable, and human-designed schedules fail to respond to intricate details of the loss landscape, model parameters, and dataset.

At Amazon, we are developing algorithms that can learn to schedule by harnessing data from past experiments. In a sequence of recent papers, we describe three phases of our research:

  1. Deriving stability guarantees for a simplified problem (non-negative-matrix factorization) and using them to develop a learnable scheduler;
  2. Extending that approach to deep neural networks; and
  3. Distilling the results into an efficient heuristic scheduler.

Analyzing stochastic non-negative-matrix factorization

In the first paper, “Efficient learning rate schedules for stochastic non-negative matrix factorization via reinforcement learning”, which we presented at ICLR 2023, we analyze stochastic non-negative-matrix factorization (NMF), a well-studied unsupervised-learning technique. NMF involves decomposing a non-negative matrix into two low-rank non-negative factor matrices.

Due to its popularity and mathematical simplicity, NMF served as an appealing testbed before we tackled more-complex models. Interestingly, our way of posing this well-studied matrix decomposition problem as a learning problem is related to the popular parameter-efficient fine-tuning (PEFT) methods that are used today for more-efficient compression and training of large language models.

In our first paper, we considered an optimization scheme for NMF that uses stochastic gradient descent — the standard machine learning algorithm — to minimize the difference between the original matrix and the matrix reconstituted from the factor matrices. To measure distance, we used the Frobenius norm, which is the square root of the sum of the squares of the individual differences for all matrix entries.
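As a reference, the objective described above can be written out as follows. The symbols are notation assumed here for illustration: X is the original m-by-n non-negative matrix, W (m-by-r) and H (r-by-n) are the factor matrices, and r is a rank much smaller than m and n.

```latex
\min_{W \ge 0,\; H \ge 0} \; \lVert X - WH \rVert_F ,
\qquad
\lVert A \rVert_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^{2}}
```

Here a_{ij} denotes the entry in row i and column j of the difference matrix A = X - WH.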

Assuming noisy gradients — that is, noisy estimations of slopes in the loss landscape — we established an upper bound for learning rates that guarantee stability, or convergence to a local minimum under repeated training epochs.

This yielded valuable insights. First, it quantified precisely how the learning rate controls trade-offs between convergence speed and potential divergence. Second, it showed that stability can be assured through proper learning rate initialization and clipping, or capping the extent to which any one model parameter can be modified during model updates.
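The two safeguards named above can be sketched as a single update step. This is an illustrative sketch only: `lr_max` stands in for a stability bound such as the one derived in the paper (whose actual form is not reproduced here), and `max_update` is a hypothetical per-parameter cap.

```python
def clipped_sgd_step(params, grads, lr, lr_max, max_update):
    """One gradient step with a capped learning rate and capped per-parameter updates.

    lr_max and max_update are hypothetical placeholders, not values from the paper.
    """
    lr = min(lr, lr_max)  # keep the learning rate under the assumed bound
    new_params = []
    for p, g in zip(params, grads):
        delta = -lr * g
        # Clip the change applied to any single parameter.
        delta = max(-max_update, min(max_update, delta))
        new_params.append(p + delta)
    return new_params


# The first gradient is very large, so its update is clipped to -0.5.
updated = clipped_sgd_step([1.0, 1.0], [100.0, 0.5], lr=1.0, lr_max=0.1, max_update=0.5)
print(updated)
```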

With convergence guarantees in hand, we shifted our focus to learning what schedules may work well for specific problems. Reinforcement-learning (RL) agents search for and generate sequences of decisions that should lead to a better end state. This framework applies directly to finding learning-rate schedules that maximize convergence speed while respecting stability bounds.

Empirically, the automated schedules our RL agent discovered consistently outperformed popular heuristics — such as step decay, which systematically lowers the learning rate after successive epochs — on NMF tasks. This provided a promising proof-of-concept for meta-learned scheduling in simplified domains where stability can be analytically assured.
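For comparison, the step-decay heuristic mentioned above is typically only a few lines of code. The sketch below shows one common form; the function name and the default values of `drop` and `epochs_per_drop` are assumptions chosen for illustration.

```python
def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` once every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


# The rate stays flat within each block of 10 epochs, then halves.
schedule = [step_decay(0.1, epoch) for epoch in range(30)]
print(schedule[0], schedule[10], schedule[20])
```

The schedule depends only on the epoch counter, so it cannot react to how training is actually going, which is the limitation a learned scheduler aims to address.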

Tackling deep-neural-network optimization

Given what we had learned about using RL for generating NMF schedules, we next sought to extend the adaptive-scheduling paradigm to deep neural networks. Unfortunately, deriving theoretical guarantees is vastly more difficult for complex nonconvex neural training objectives. Without assurances of stability, the optimization landscape becomes even more treacherous.

Nevertheless, in another 2023 ICLR paper, “Learned learning rate schedules for deep neural network training using reinforcement learning”, we hypothesized that data-driven scheduling could still improve on hand-tuned learning rates and schedules. We used the reinforcement-learning framework we’d developed for NMF to generate schedules for computer vision and natural-language-processing tasks.

The automated schedules successfully reduced training time and improved generalization compared to standard heuristics such as cosine annealing. This demonstrated the empirical viability of our approach even in the absence of stability guarantees. By learning online from data, the scheduler adapted to nuances of the loss landscape and gradient trajectories.
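For readers unfamiliar with the cosine-annealing baseline, a minimal sketch of its usual closed form follows. The function and parameter names are illustrative, and restarts and warmup are omitted.

```python
import math


def cosine_annealing(step, total_steps, lr_max, lr_min=0.0):
    """Decay the learning rate from lr_max to lr_min along half a cosine wave."""
    progress = min(step, total_steps) / total_steps  # fraction of training completed
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Starts at lr_max, passes through the midpoint value, and ends at lr_min.
start = cosine_annealing(0, 100, lr_max=0.1)
middle = cosine_annealing(50, 100, lr_max=0.1)
end = cosine_annealing(100, 100, lr_max=0.1)
print(start, middle, end)
```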

But using RL to find optimal schedules for this problem is still expensive — and it becomes more expensive as model and data sizes increase. So our next step was to distill our approach into a simple and usable algorithm.

The GreedyLR scheduler

At this year’s Conference on Pattern Recognition and Machine Learning (PRML), we won the best-presentation award for a lightweight learned scheduler called GreedyLR that sets the learning rate based on recent improvements in the training loss. In comparisons with popular scheduler and optimizer combinations, GreedyLR performed equivalently or better more than 90% of the time. It also enabled faster convergence than techniques like stochastic line search that adjust the learning rate by solving optimization problems during training.

In each training epoch, GreedyLR adapts the learning rate based on changes in the validation loss. Its core logic is simple: increase the learning rate if the loss improves and decrease it if the loss worsens. But GreedyLR employs additional techniques to make this greedy heuristic work well in practice:

  • Its patience parameter prevents overreaction to noisy loss fluctuations.
  • A smoothing window calculates the rolling-average validation loss for more-robust comparisons.
  • Thresholds prevent needless updates when the loss change is insignificant.
  • Cooldown and warmup stages continue increasing or decreasing the learning rate even if the loss trend reverses.
  • Configurable upper and lower bounds on the learning-rate range enable it to benefit from human intuition without sacrificing the ability to explore counterintuitive methods.

Overall, these enhancements make GreedyLR respond intelligently to trends in the loss rather than reacting impulsively. The algorithm tunes the learning rate adaptively during training to accelerate convergence without compromising stability.
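The core logic can be sketched in a few lines. The class below is a simplified illustration written for this article, not the published GreedyLR implementation: the class name, parameter names, and default values are assumptions, and the cooldown and warmup stages are left out for brevity.

```python
from collections import deque


class GreedyLRSketch:
    """Illustrative greedy scheduler; all names and defaults are hypothetical."""

    def __init__(self, lr, factor=0.5, patience=1, window=1,
                 threshold=1e-4, min_lr=1e-5, max_lr=1.0):
        self.lr = lr
        self.factor = factor        # multiplicative step size, 0 < factor < 1
        self.patience = patience    # consecutive signals required before acting
        self.threshold = threshold  # loss changes smaller than this are ignored
        self.min_lr, self.max_lr = min_lr, max_lr
        self.recent = deque(maxlen=window)  # smoothing window over recent losses
        self.previous = None        # smoothed loss from the previous call
        self.good = 0               # consecutive significant improvements
        self.bad = 0                # consecutive significant deteriorations

    def step(self, loss):
        """Record one loss value and return the (possibly updated) learning rate."""
        self.recent.append(loss)
        smoothed = sum(self.recent) / len(self.recent)
        if self.previous is not None:
            change = smoothed - self.previous
            if change < -self.threshold:    # loss improved
                self.good, self.bad = self.good + 1, 0
            elif change > self.threshold:   # loss worsened
                self.good, self.bad = 0, self.bad + 1
            # Otherwise the change is insignificant and the counters are untouched.
        self.previous = smoothed
        if self.good >= self.patience:
            self.lr = min(self.lr / self.factor, self.max_lr)  # greedy increase
            self.good = 0
        elif self.bad >= self.patience:
            self.lr = max(self.lr * self.factor, self.min_lr)  # back off
            self.bad = 0
        return self.lr


scheduler = GreedyLRSketch(lr=0.1, patience=2)
history = [scheduler.step(loss) for loss in [1.0, 0.9, 0.8, 0.9, 1.0]]
print(history)
```

In this run, two consecutive improvements double the rate and two consecutive deteriorations halve it again; a single noisy reading in either direction would change nothing because of the patience setting.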

Figure: A patience parameter, a smoothing window, thresholding, cooldown and warmup stages, and configurable upper and lower learning-rate bounds make GreedyLR respond intelligently to trends in the loss rather than reacting impulsively.

In our experiments, we found that GreedyLR is able to produce diverse, dynamic schedules, as shown in the figures below. Also shown below are standard schedules such as linear, constant, and cosine decay that are popular today:

Figure: Learning-rate schedules produced by GreedyLR (red), compared to those produced by several popular scheduling approaches.

GreedyLR achieved faster convergence, especially for large models, making it a promising general-purpose scheduler. It also performed better than more-advanced methods such as hypergradient descent, which can be considered a first-order version of GreedyLR. While hypergradient descent tries to achieve faster convergence by using gradient descent to learn one learning rate per parameter or parameter group, GreedyLR just uses one global, reactive learning rate. This is particularly interesting since you need a billion learning rates for a billion-parameter model in hypergradient descent, versus a single learning rate for GreedyLR.
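To illustrate the contrast, here is a rough sketch of per-parameter hypergradient descent in its commonly described form, where each learning rate is nudged by the product of the current and previous gradients for that parameter. It is not the exact baseline used in the experiments; the function name, `beta`, and the toy quadratic loss are assumptions.

```python
def hypergradient_sgd(grad_fn, theta, lr=0.01, beta=1e-4, steps=50):
    """SGD that keeps one learning rate per parameter and adapts each by gradient descent."""
    alphas = [lr] * len(theta)   # one learning rate per parameter
    prev = [0.0] * len(theta)    # previous gradient, zero before the first step
    for _ in range(steps):
        grad = grad_fn(theta)
        # Raise a rate when successive gradients agree in sign, lower it when they disagree.
        alphas = [a + beta * g * p for a, g, p in zip(alphas, grad, prev)]
        theta = [t - a * g for t, a, g in zip(theta, alphas, grad)]
        prev = grad
    return theta, alphas


# Toy loss: the sum of squared parameters, whose gradient is 2 * theta.
final_theta, final_alphas = hypergradient_sgd(lambda th: [2.0 * t for t in th], [1.0, -2.0])
print(final_theta, final_alphas)
```

The `alphas` list has as many entries as the model has parameters, which is the bookkeeping cost that a single global learning rate avoids.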

Figure: Loss histories comparing GreedyLR (black) with a stochastic-gradient-descent baseline (red) and per-parameter (green) and per-group (blue) hypergradient descent.

Conclusion and future outlook

Together, these contributions demonstrate the potential for learned optimizers to accelerate deep learning. By automatically adapting to training dynamics, they can find more-optimal solutions than human-designed algorithms reliant on rules of thumb. The ease of use and consistent gains from GreedyLR make it a compelling, general-purpose scheduler ready for wide adoption. We plan to continue improving the efficiency of our learning-based methods to further enhance productivity for deep-learning practitioners.

Research areas

Related content

GB, London
Amazon Advertising is looking for a Data Scientist to join its brand new initiative that powers Amazon’s contextual advertising products. Advertising at Amazon is a fast-growing multi-billion dollar business that spans across desktop, mobile and connected devices; encompasses ads on Amazon and a vast network of hundreds of thousands of third party publishers; and extends across US, EU and an increasing number of international geographies. The Supply Quality organization has the charter to solve optimization problems for ad-programs in Amazon and ensure high-quality ad-impressions. We develop advanced algorithms and infrastructure systems to optimize performance for our advertisers and publishers. We are focused on solving a wide variety of problems in computational advertising like traffic quality prediction (robot and fraud detection), Security forensics and research, Viewability prediction, Brand Safety, Contextual data processing and classification. Our team includes experts in the areas of distributed computing, machine learning, statistics, optimization, text mining, information theory and big data systems. We are looking for a dynamic, innovative and accomplished Data Scientist to work on data science initiatives for contextual data processing and classification that power our contextual advertising solutions. Are you an experienced user of sophisticated analytical techniques that can be applied to answer business questions and chart a sustainable vision? Are you exited by the prospect of communicating insights and recommendations to audiences of varying levels of technical sophistication? Above all, are you an innovator at heart and have a track record of resolving ambiguity to deliver result? As a data scientist, you help our data science team build cutting edge models and measurement solutions to power our contextual classification technology. 
As this is a new initiative, you will get an opportunity to act as a thought leader, work backwards from the customer needs, dive deep into data to understand the issues, define metrics, conceptualize and build algorithms and collaborate with multiple cross-functional teams. Key job responsibilities * Define a long-term science vision for contextual-classification tech, driven fundamentally from the needs of our advertisers and publishers, translating that direction into specific plans for the science team. Interpret complex and interrelated data points and anecdotes to build and communicate this vision. * Collaborate with software engineering teams to Identify and implement elegant statistical and machine learning solutions * Oversee the design, development, and implementation of production level code that handles billions of ad requests. Own the full development cycle: idea, design, prototype, impact assessment, A/B testing (including interpretation of results) and production deployment. * Promote the culture of experimentation and applied science at Amazon. * Demonstrated ability to meet deadlines while managing multiple projects. * Excellent communication and presentation skills working with multiple peer groups and different levels of management * Influence and continuously improve a sustainable team culture that exemplifies Amazon’s leadership principles. We are open to hiring candidates to work out of one of the following locations: London, GBR
JP, 13, Tokyo
We are seeking a Principal Economist to be the science leader in Amazon's customer growth and engagement. The wide remit covers Prime, delivery experiences, loyalty program (Amazon Points), and marketing. We look forward to partnering with you to advance our innovation on customers’ behalf. Amazon has a trailblazing track record of working with Ph.D. economists in the tech industry and offers a unique environment for economists to thrive. As an economist at Amazon, you will apply the frontier of econometric and economic methods to Amazon’s terabytes of data and intriguing customer problems. Your expertise in building reduced-form or structural causal inference models is exemplary in Amazon. Your strategic thinking in designing mechanisms and products influences how Amazon evolves. In this role, you will build ground-breaking, state-of-the-art econometric models to guide multi-billion-dollar investment decisions around the global Amazon marketplaces. You will own, execute, and expand a research roadmap that connects science, business, and engineering and contributes to Amazon's long term success. As one of the first economists outside North America/EU, you will make an outsized impact to our international marketplaces and pioneer in expanding Amazon’s economist community in Asia. The ideal candidate will be an experienced economist in empirical industrial organization, labour economics, or related structural/reduced-form causal inference fields. You are a self-starter who enjoys ambiguity in a fast-paced and ever-changing environment. You think big on the next game-changing opportunity but also dive deep into every detail that matters. You insist on the highest standards and are consistent in delivering results. Key job responsibilities - Work with Product, Finance, Data Science, and Data Engineering teams across the globe to deliver data-driven insights and products for regional and world-wide launches. 
- Innovate on how Amazon can leverage data analytics to better serve our customers through selection and pricing. - Contribute to building a strong data science community in Amazon Asia. We are open to hiring candidates to work out of one of the following locations: Tokyo, 13, JPN
DE, BE, Berlin
Ops Integration: Concessions team is looking for a motivated, creative and customer obsessed Snr. Applied Scientist with a strong machine learning background, to develop advanced analytics models (Computer Vision, LLMs, etc.) that improve customer experiences We are the voice of the customer in Amazon’s operations, and we take that role very seriously. If you join this team, you will be a key contributor to delivering the Factory of the Future: leveraging Internet of Things (IoT) and advanced analytics to drive tangible, operational change on the ground. You will collaborate with a wide range of stakeholders (You will partner with Research and Applied Scientists, SDEs, Technical Program Managers, Product Managers and Business Leaders) across the business to develop and refine new ways of assessing challenges within Amazon operations. This role will combine Amazon’s oldest Leadership Principle, with the latest analytical innovations, to deliver business change at scale and efficiently The ideal candidate will have deep and broad experience with theoretical approaches and practical implementations of vision techniques for task automation. They will be a motivated self-starter who can thrive in a fast-paced environment. They will be passionate about staying current with sensing technologies and algorithms in the broader machine vision industry. They will enjoy working in a multi-disciplinary team of engineers, scientists and business leaders. They will seek to understand processes behind data so their recommendations are grounded. Key job responsibilities Your solutions will drive new system capabilities with global impact. You will design highly scalable, large enterprise software solutions involving computer vision. You will develop complex perception algorithms integrating across multiple sensing devices. You will develop metrics to quantify the benefits of a solution and influence project resources. 
You will validate system performance and use insights from your live models to drive the next generation of model development. Common tasks include: • Research, design, implement and evaluate complex perception and decision making algorithms integrating across multiple disciplines • Work closely with software engineering teams to drive scalable, real-time implementations • Collaborate closely with team members on developing systems from prototyping to production level • Collaborate with teams spread all over the world • Track general business activity and provide clear, compelling management reports on a regular basis We are open to hiring candidates to work out of one of the following locations: Berlin, BE, DEU | Berlin, DEU
DE, BY, Munich
Ops Integration: Concessions team is looking for a motivated, creative and customer obsessed Snr. Applied Scientist with a strong machine learning background, to develop advanced analytics models (Computer Vision, LLMs, etc.) that improve customer experiences We are the voice of the customer in Amazon’s operations, and we take that role very seriously. If you join this team, you will be a key contributor to delivering the Factory of the Future: leveraging Internet of Things (IoT) and advanced analytics to drive tangible, operational change on the ground. You will collaborate with a wide range of stakeholders (You will partner with Research and Applied Scientists, SDEs, Technical Program Managers, Product Managers and Business Leaders) across the business to develop and refine new ways of assessing challenges within Amazon operations. This role will combine Amazon’s oldest Leadership Principle, with the latest analytical innovations, to deliver business change at scale and efficiently The ideal candidate will have deep and broad experience with theoretical approaches and practical implementations of vision techniques for task automation. They will be a motivated self-starter who can thrive in a fast-paced environment. They will be passionate about staying current with sensing technologies and algorithms in the broader machine vision industry. They will enjoy working in a multi-disciplinary team of engineers, scientists and business leaders. They will seek to understand processes behind data so their recommendations are grounded. Key job responsibilities Your solutions will drive new system capabilities with global impact. You will design highly scalable, large enterprise software solutions involving computer vision. You will develop complex perception algorithms integrating across multiple sensing devices. You will develop metrics to quantify the benefits of a solution and influence project resources. 
You will validate system performance and use insights from your live models to drive the next generation of model development. Common tasks include: • Research, design, implement and evaluate complex perception and decision making algorithms integrating across multiple disciplines • Work closely with software engineering teams to drive scalable, real-time implementations • Collaborate closely with team members on developing systems from prototyping to production level • Collaborate with teams spread all over the world • Track general business activity and provide clear, compelling management reports on a regular basis We are open to hiring candidates to work out of one of the following locations: Munich, BE, DEU | Munich, BY, DEU | Munich, DEU
IT, MI, Milan
Ops Integration: Concessions team is looking for a motivated, creative and customer obsessed Snr. Applied Scientist with a strong machine learning background, to develop advanced analytics models (Computer Vision, LLMs, etc.) that improve customer experiences We are the voice of the customer in Amazon’s operations, and we take that role very seriously. If you join this team, you will be a key contributor to delivering the Factory of the Future: leveraging Internet of Things (IoT) and advanced analytics to drive tangible, operational change on the ground. You will collaborate with a wide range of stakeholders (You will partner with Research and Applied Scientists, SDEs, Technical Program Managers, Product Managers and Business Leaders) across the business to develop and refine new ways of assessing challenges within Amazon operations. This role will combine Amazon’s oldest Leadership Principle, with the latest analytical innovations, to deliver business change at scale and efficiently The ideal candidate will have deep and broad experience with theoretical approaches and practical implementations of vision techniques for task automation. They will be a motivated self-starter who can thrive in a fast-paced environment. They will be passionate about staying current with sensing technologies and algorithms in the broader machine vision industry. They will enjoy working in a multi-disciplinary team of engineers, scientists and business leaders. They will seek to understand processes behind data so their recommendations are grounded. Key job responsibilities Your solutions will drive new system capabilities with global impact. You will design highly scalable, large enterprise software solutions involving computer vision. You will develop complex perception algorithms integrating across multiple sensing devices. You will develop metrics to quantify the benefits of a solution and influence project resources. 
You will validate system performance and use insights from your live models to drive the next generation of model development. Common tasks include: • Research, design, implement and evaluate complex perception and decision making algorithms integrating across multiple disciplines • Work closely with software engineering teams to drive scalable, real-time implementations • Collaborate closely with team members on developing systems from prototyping to production level • Collaborate with teams spread all over the world • Track general business activity and provide clear, compelling management reports on a regular basis We are open to hiring candidates to work out of one of the following locations: Milan, MI, ITA
ES, M, Madrid
Ops Integration: Concessions team is looking for a motivated, creative and customer obsessed Snr. Applied Scientist with a strong machine learning background, to develop advanced analytics models (Computer Vision, LLMs, etc.) that improve customer experiences We are the voice of the customer in Amazon’s operations, and we take that role very seriously. If you join this team, you will be a key contributor to delivering the Factory of the Future: leveraging Internet of Things (IoT) and advanced analytics to drive tangible, operational change on the ground. You will collaborate with a wide range of stakeholders (You will partner with Research and Applied Scientists, SDEs, Technical Program Managers, Product Managers and Business Leaders) across the business to develop and refine new ways of assessing challenges within Amazon operations. This role will combine Amazon’s oldest Leadership Principle, with the latest analytical innovations, to deliver business change at scale and efficiently The ideal candidate will have deep and broad experience with theoretical approaches and practical implementations of vision techniques for task automation. They will be a motivated self-starter who can thrive in a fast-paced environment. They will be passionate about staying current with sensing technologies and algorithms in the broader machine vision industry. They will enjoy working in a multi-disciplinary team of engineers, scientists and business leaders. They will seek to understand processes behind data so their recommendations are grounded. Key job responsibilities Your solutions will drive new system capabilities with global impact. You will design highly scalable, large enterprise software solutions involving computer vision. You will develop complex perception algorithms integrating across multiple sensing devices. You will develop metrics to quantify the benefits of a solution and influence project resources. 
You will validate system performance and use insights from your live models to drive the next generation of model development. Common tasks include: • Research, design, implement and evaluate complex perception and decision making algorithms integrating across multiple disciplines • Work closely with software engineering teams to drive scalable, real-time implementations • Collaborate closely with team members on developing systems from prototyping to production level • Collaborate with teams spread all over the world • Track general business activity and provide clear, compelling management reports on a regular basis We are open to hiring candidates to work out of one of the following locations: Madrid, ESP | Madrid, M, ESP
US, TX, Austin
The role is available Arlington, Virginia (may consider New York, NY, Los Angeles, CA, or Toronto, Canada). Calling all inventors to work on exciting new opportunities in Sponsored Products. Amazon is building a world class advertising business and defining and delivering a collection of self-service performance advertising products that drive discovery and sales of merchandise. Our products are strategically important to our Retail and Marketplace businesses, driving long-term growth. Sponsored Products (SP) helps merchants, retail vendors, and brand owners grows incremental sales of their products sold on Amazon through native advertising. SP achieves this by using a combination of machine learning, big data analytics, ultra-low latency high-volume engineering systems, and quantitative product focus. We are a highly motivated, collaborative and fun-loving group with an entrepreneurial spirit and bias for action. You will join a newly-founded team with a broad mandate to experiment and innovate, which gives us the flexibility to explore and apply scientific techniques to novel product problems. You will have the satisfaction of seeing your work improve the experience of millions of Amazon shoppers while driving quantifiable revenue impact. More importantly, you will have the opportunity to broaden your technical skills, work with Generative AI, and be a science leader in an environment that thrives on creativity, experimentation, and product innovation. We are open to hiring candidates to work out of one of the following locations: Austin, TX, USA
GB, London
Ops Integration: Concessions team is looking for a motivated, creative and customer obsessed Snr. Applied Scientist with a strong machine learning background, to develop advanced analytics models (Computer Vision, LLMs, etc.) that improve customer experiences We are the voice of the customer in Amazon’s operations, and we take that role very seriously. If you join this team, you will be a key contributor to delivering the Factory of the Future: leveraging Internet of Things (IoT) and advanced analytics to drive tangible, operational change on the ground. You will collaborate with a wide range of stakeholders (You will partner with Research and Applied Scientists, SDEs, Technical Program Managers, Product Managers and Business Leaders) across the business to develop and refine new ways of assessing challenges within Amazon operations. This role will combine Amazon’s oldest Leadership Principle, with the latest analytical innovations, to deliver business change at scale and efficiently The ideal candidate will have deep and broad experience with theoretical approaches and practical implementations of vision techniques for task automation. They will be a motivated self-starter who can thrive in a fast-paced environment. They will be passionate about staying current with sensing technologies and algorithms in the broader machine vision industry. They will enjoy working in a multi-disciplinary team of engineers, scientists and business leaders. They will seek to understand processes behind data so their recommendations are grounded. Key job responsibilities Your solutions will drive new system capabilities with global impact. You will design highly scalable, large enterprise software solutions involving computer vision. You will develop complex perception algorithms integrating across multiple sensing devices. You will develop metrics to quantify the benefits of a solution and influence project resources. 
You will validate system performance and use insights from your live models to drive the next generation of model development. Common tasks include: • Research, design, implement and evaluate complex perception and decision making algorithms integrating across multiple disciplines • Work closely with software engineering teams to drive scalable, real-time implementations • Collaborate closely with team members on developing systems from prototyping to production level • Collaborate with teams spread all over the world • Track general business activity and provide clear, compelling management reports on a regular basis Basic Qualifications -Masters in Computer Science, Machine Learning, Robotics or equivalent with a focus on Computer Vision. -2+ years of experience of building machine learning models for business application -Broad knowledge of fundamentals and state of the art in computer vision and machine learning -Strong coding skills in two or more programming languages such as Python or C/C++ -Knowledge of fundamentals in optimization, supervised and reinforcement learning -Excellent problem-solving ability Preferred Qualifications -PhD and 4+ years of industry or academic applied research experience applying Computer Vision techniques and developing Computer vision algorithms -Depth and breadth in state-of-the-art computer vision and machine learning technologies and experience designing and building computer vision solutions -Industry experience in sensor systems and the development of production computer vision and machine learning applications built to use them -Experience developing software interfacing to AWS services -Excellent written and verbal communication skills with the ability to present complex technical information in a clear and concise manner to a variety of audiences -Ability to work on a diverse team or with a diverse range of coworkers -Experience in publishing at major Computer Vision, ML or Robotics conferences or Journals (CVPR, ICCV, 
ECCV, NeurIPS, ICML, IJCV, ICRA, IROS, RSS,...) We are open to hiring candidates to work out of one of the following locations: London, GBR