Optimizing neural networks for special-purpose hardware

Curating the neural-architecture search space and taking advantage of human intuition reduces latency on real-world applications by up to 55%.

As neural networks grow in size, deploying them on-device increasingly requires special-purpose hardware that parallelizes common operations. But for maximum efficiency, it’s not enough to optimize the hardware for the networks; the networks should be optimized for the hardware, too.

Related content
The first step in training a neural network to solve a problem is usually the selection of an architecture: a specification of the number of computational nodes in the network and the connections between them. Architectural decisions are generally based on historical precedent, intuition, and plenty of trial and error.

The standard way to optimize a neural network is through neural-architecture search (NAS), where the goal is to minimize both the size of the network and the number of floating-point operations (FLOPS) it performs. But this approach doesn’t work with neural chips, which can often execute easily parallelized but higher-FLOPS tasks more rapidly than they can harder-to-parallelize but lower-FLOPS tasks.

Minimizing latency is a more complicated optimization objective than minimizing FLOPS, so in the Amazon Devices Hardware group, we’ve developed a number of strategies for adapting NAS to the problem of optimizing network architectures for Amazon’s new Neural Engine family of accelerators. Those strategies involve curating the architecture search space to, for instance, reduce the chances of getting stuck in local minima. We’ve also found that combining a little human intuition with the results of NAS for particular tasks can help us generalize to new tasks more reliably and efficiently.

In experiments involving several different machine learning tasks, we’ve found that our NAS strategies can reduce latencies by as much as 55%.

Varieties of neural-architecture search

NAS needs three things: a definition of the search space, which specifies the building blocks available to construct a network; a cost model, which is a function of the network's accuracy, latency, and memory; and an optimization algorithm. We use a performance estimator to measure latency and memory footprint, but to measure accuracy, we must train the network. This is a major bottleneck, as training a single network can take days. Sampling thousands of architectures would take thousands of GPU days, which is clearly neither practical nor environmentally sustainable.

There are three categories of NAS algorithm, which require networks to be trained different numbers of times: multishot, single-shot, and zero-shot.

Related content
A new approach that grows networks dynamically promises improvements over GANs with fixed architectures or predetermined growing strategies.

Multishot methods sample a cohort of architectures in each iteration. Each network is trained and evaluated for accuracy and performance, and the next set of architectures is sampled based on their cost. Evolutionary or reinforcement-learning-based algorithms are generally used for multishot methods.

Single-shot methods start with a large network called the supernet, which has multiple possible subgraphs. During training, the subgraphs start converging to a single, small network. Single-shot methods are designed to be trained only once, but their training takes much longer than that of a single network in multishot methods.

Zero-shot methods works like multishot methods, with the key difference that the network is never trained. As a proxy for accuracy, we use the network’s trainability score, which is computed using the network's topology, nonlinearity, and operations. Zero-shot methods are the fastest to converge, because calculating the score is computationally very cheap. The downside is that the trainability may not correlate well with model accuracy.

Search space curation

The NAS cost function can be visualized as a landscape, with each point representing a potential architecture. A cost function based on FLOPS changes monotonically with factors such as sizes or channels: that is, if you find a direction across the terrain in which the cost is going down, you can be sure that continuing in that direction will not cause the cost to go up.

However, the inclusion of accelerator-aware constraints disrupts the function by introducing more asymptotes, or points at which the cost switches from going down to going up. This results in a more complex and rocky landscape.

Related content
How to make trained systems evolve gracefully.

To address this issue, we reduced the number of options in the search space. We were exploring convolutional architectures, meaning that the inputs are decomposed into several different components, each of which has its own channel through the network. The data in each channel, in turn, is filtered in several different ways; each filter involves a different data convolution.

Previously, we would have explored the number of channels — known as the channel size — at increments of one; instead, we considered only a handful of channel sizes. We limited the options for channel sizes to certain values that were favorable for the parallelism factor of the Neural Engine. The parallelism factor is a count of operations, such as dot product, that can be performed in parallel. In some cases, we even added "depth multiplier" ratio that could be used to scale the number of channels across the entire model to the search space.

These improvements can be visualized as taking fewer, larger steps across a smoother terrain, rather than trying to navigate the rocky landscape that resulted from the inclusion of accelerator-aware performance in the cost function. During the optimization process, they resulted in a faster convergence rate because of the reduced number of options and in improved stability and reliability thanks to the monotonic nature of the curated search space.

NAS - 3x1.png
Illustration of how the cost landscape (green) changes from smooth (left) to rocky (center and right) when a cost function based on Neural Engine performance replaces one based on FLOPS. Curation (right) reduces the discrete search space (black dots) and ensures that points are far apart. The trajectory of a search algorithm (blue arrows) shows how curation (right) ensures that with each step in a search, the cost is monotonically decreasing.

One key detail in our implementation is the performance estimator. Instead of deploying an architecture on real hardware or an emulator to obtain performance metrics, we estimated them using a machine learning regression model trained on measurements of different operators or subgraphs.

At inference time, the estimator would decompose the queried architecture into subgraphs and use the regression model to estimate the performance of each. Then it would accumulate these estimates to give the model-level performance. This regressor-based design simplified our NAS framework, as it no longer required compilation, inference, or hardware. This technique enables us to test accelerators in the design phase, before we’ve developed custom compilers and hardware emulators for them.

Productizing NAS with expert-in-the-loop

Curating the search space improves convergence rate, stability, and reliability, but transferability to new use cases is not straightforward. NAS results for a detector model, for instance, may not be easy to transfer to a classification model. On the other hand, running NAS from scratch for each new dataset may not be feasible, due to time constraints. In these situations, we found that combining NAS results and human expertise was the fastest approach.

Channel reduction step.png
The initial channel reduction step (1x1 conv.) in the inverted-bottleneck (IBN) block at left is fused with the channel expansion step (KxK depth. conv.) in the fused IBN at right. This proved to be a common subgraph modification across datasets.

When we performed NAS on different datasets, we saw common patterns, such as the fusion of convolution layers with previous convolution layers, reducing the number of channels and, aligning them with the hardware parallelism factor.

In particular, fusing convolution layers in inverted bottleneck (IBN) blocks contributed most to boosting efficiency. With just these modifications, we observed latency reductions of up to 50%, whereas a fully converged NAS model would yield a slightly better 53% reduction.

In situations where running NAS from scratch is not feasible, a human expert can rely on mathematical intuition and observations of the results of NAS on similar datasets to build the required model architecture.

Results and product impact

We applied this technique to multiple products in the Amazon Devices portfolio, ranging from Echo Show and Blink home security products to the latest Astro, the in-home consumer robot.

1. Reduced detection latency by half on Echo Show

Echo Show runs a model to detect human presence and locate the detected person in a room. The original model used IBN blocks. We used accelerator-aware NAS to reduce the latency of this model by 53%.

Human-presence detection.png
Schematic representation of human-presence detection.

We performed a search for depth multipliers — that is, layers that multiply the number of channels — and for opportunities to replace IBN blocks with fused-IBN blocks. The requirement was to maintain the same mean average precision (mAP) of the original model while improving the latency. Our V3 model improved the latency by more than 53% (i.e. 2.2x faster) while keeping the mAP scores same as baseline.

Latency results for the original model and three models found through NAS.

Fused-IBN search

Depth multiplier search

Latency reduction (%)

Baseline

No

No

Baseline

V1

No

Yes

14%

V2

Yes

No

35%

V3

Yes

Yes

53%

After performing NAS, we found that not every IBN fusion improves latency and accuracy. The later layers are larger, and replacing them with fused layers hurt performance. For the layers where fusion was selected, the FLOPs, as expected, increased, but the latency did not.

2. Model fitting within the tight memory budget of the Blink Floodlight Camera

Blink cameras use a classification model for security assistance. Our goal was to fit the model parameters and peak activation memory within a tight memory budget. In this case, we combined NAS techniques with an expert-in-the-loop to provide fine-tuning. The NAS result on the classification dataset provided intuition on what operator/subgraph changes could extract benefits from the accelerator design.

Classification.png
Schematic representation of the classification model output.

The expert recommendations were to replace the depth-wise convolutions with standard convolutions and reduce the channels by making them even across the model, preferably by a multiple of the parallelism factor. With these changes, model developers were able to reduce both the model size and the intermediate memory usage by 47% and fit the model within the required budget.

3. Fast semantic segmentation for robotics

In the context of robotics, semantic segmentation is used to understand the objects and scenes the robot is interacting with. For example, it can enable the robot to identify chairs, tables, or other objects in the environment, allowing it to navigate and interact with its surroundings more effectively. Our goal for this model was to reduce latency by half. Our starting point was a semantic-segmentation model that was optimized to run on a CPU.

Semantic segmentation.png
Left: original image of a room at night; center: semantic-segmentation image; right: semantic segmentation overlaid on original image.

For this model, we searched for different channel sizes, fusion, and also output and input dimensions. We used the multishot method with the evolutionary search algorithm. NAS gave us multiple candidates with different performances. The best candidate was able to reduce the latency by half.

Latency improvement for different architectures found through NAS.

Latency reduction (%)

Original

Baseline

Model A

27%

Model B

37%

Model C

38%

Model D

41%

Model E

51%

4. User privacy with on-device inference

Amazon's Neural Engine supports large-model inference on-device, so we can process microphone and video feeds without sending data to the cloud. For example, the Amazon Neural Engine has enabled Alexa to perform automatic speech recognition on-device. On-device processing also provides a better user experience because the inference pipeline is not affected by intermittent connection issues. In our NAS work, we discovered that even larger, more accurate models can now fit on-device with no hit on latency.

Making edge AI sustainable

We mentioned earlier that multishot NAS with full training can take up to 2,000 GPU-days. However, with some of the techniques described in this blog, we were able to create efficient architectures in a substantially shorter amount of time, making NAS much more scalable and sustainable. But our sustainability efforts don't end there.

Related content
Innovative training methods and model compression techniques combine with clever engineering to keep speech processing local.

Because of its parallelism and mixed-precision features, the Neural Engine is more power efficient than a generic CPU. For a million average users, the difference is on order of millions of kilowatt-hours per year, equivalent to 200 gasoline-powered passenger vehicles per year or the energy consumption of a hundred average US households.

When we optimize models through NAS, we increase the device's capability to run more neural-network models simultaneously. This allows us to use smaller application processors and, in some cases, fewer of them. By reducing the hardware footprint in this way, we are further reducing the carbon footprint of our devices.

Future work

We have identified that curation requires an expert who understands the hardware design well. This may not scale to future generations of more complex hardware. We have also identified that in situations where time is tight, having an expert in the loop is still faster than running NAS from scratch. Because of this, we are continuing to investigate how NAS algorithms with accelerator awareness can handle large search spaces. We are also working on improving the search algorithm’s efficiency and effectiveness by exploring how the three categories of algorithms can be combined. We also plan to explore model optimization by introducing sparsity through pruning and clustering. Stay tuned!

Acknowledgements: Manasa Manohara, Lingchuan Meng, Rahul Bakshi, Varada Gopalakrishnan, Lindo St. Angel

Research areas

Related content

US, WA, Seattle
Innovators wanted! Are you an entrepreneur? A builder? A dreamer? This role is part of an Amazon Special Projects team that takes the company’s Think Big leadership principle to the next level. We focus on creating entirely new products and services with a goal of positively impacting the lives of our customers. No industries or subject areas are out of bounds. If you’re interested in innovating at scale to address big challenges in the world, this is the team for you. As a Senior Research Scientist, you will work with a unique and gifted team developing exciting products for consumers and collaborate with cross-functional teams. Our team rewards intellectual curiosity while maintaining a laser-focus in bringing products to market. Competitive candidates are responsive, flexible, and able to succeed within an open, collaborative, entrepreneurial, startup-like environment. At the intersection of both academic and applied research in this product area, you have the opportunity to work together with some of the most talented scientists, engineers, and product managers. Here at Amazon, we embrace our differences. We are committed to furthering our culture of inclusion. We have thirteen employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We are constantly learning through programs that are local, regional, and global. Amazon’s culture of inclusion is reinforced within our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. Our team highly values work-life balance, mentorship and career growth. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We care about your career growth and strive to assign projects and offer training that will challenge you to become your best.
US, VA, Arlington
Are you excited to help the US Intelligence Community design, build, and implement AI algorithms, including advanced Generative AI solutions, to augment decision making while meeting the highest standards for reliability, transparency, and scalability? The Amazon Web Services (AWS) US Federal Professional Services team works directly with US Intelligence Community agencies and other public sector entities to achieve their mission goals through the adoption of Machine Learning (ML) and Generative AI methods. We build models for text, image, video, audio, and multi-modal use cases, leveraging both traditional ML approaches and state-of-the-art generative models including Large Language Models (LLMs), text-to-image generation, and other advanced AI capabilities to fit the mission. Our team collaborates across the entire AWS organization to bring access to product and service teams, to get the right solution delivered and drive feature innovation based on customer needs. At AWS, we're hiring experienced data scientists with a background in both traditional and generative AI who can help our customers understand the opportunities their data presents, and build solutions that earn the customer trust needed for deployment to production systems. In this role, you will work closely with customers to deeply understand their data challenges and requirements, and design tailored solutions that best fit their use cases. You should have broad experience building models using all kinds of data sources, and building data-intensive applications at scale. You should possess excellent business acumen and communication skills to collaborate effectively with stakeholders, develop key business questions, and translate requirements into actionable solutions. You will provide guidance and support to other engineers, sharing industry best practices and driving innovation in the field of data science and AI. This position requires that the candidate selected be a US Citizen and currently possess and maintain an active Top Secret security clearance. Key job responsibilities As a Data Scientist, you will: - Collaborate with AI/ML scientists and architects to research, design, develop, and evaluate AI algorithms to address real-world challenges - Interact with customers directly to understand the business problem, help and aid them in implementation of AI solutions, deliver briefing and deep dive sessions to customers and guide customer on adoption patterns and paths to production. - Create and deliver best practice recommendations, tutorials, blog posts, sample code, and presentations adapted to technical, business, and executive stakeholder - Provide customer and market feedback to Product and Engineering teams to help define product direction - This position may require up to 25% local travel. About the team Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (diversity) conferences, inspire us to never stop embracing our uniqueness. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
US, WA, Seattle
Amazon Advertising is one of Amazon's fastest growing businesses. Amazon's advertising portfolio helps merchants, retail vendors, and brand owners succeed via native advertising, which grows incremental sales of their products sold through Amazon. The primary goals are to help shoppers discover new products they love, be the most efficient way for advertisers to meet their business objectives, and build a sustainable business that continuously innovates on behalf of customers. Our products and solutions are strategically important to enable our Retail and Marketplace businesses to drive long-term growth. We deliver billions of ad impressions and millions of clicks and break fresh ground in product and technical innovations every day! The Creative X team within Amazon Advertising time aims to democratize access to high-quality creatives (audio, images, videos, text) by building AI-driven solutions for advertisers. To accomplish this, we are investing in understanding how best users can leverage Generative AI methods such as latent-diffusion models, large language models (LLM), generative audio (music and speech synthesis), computer vision (CV), reinforced learning (RL) and related. As an Applied Scientist you will be part of a close-knit team of other applied scientists and product managers, UX and engineers who are highly collaborative and at the top of their respective fields. We are looking for talented Applied Scientists who are adept at a variety of skills, especially at the development and use of multi-modal Generative AI and can use state-of-the-art generative music and audio, computer vision, latent diffusion or related foundational models that will accelerate our plans to generate high-quality creatives on behalf of advertisers. Every member of the team is expected to build customer (advertiser) facing features, contribute to the collaborative spirit within the team, publish, patent, and bring SOTA research to raise the bar within the team. As an Applied Scientist on this team, you will: - Drive the invention and development of novel multi-modal agentic architectures and models for the use of Generative AI methods in advertising. - Work closely and integrate end-to-end proof-of-concept Machine Learning projects that have a high degree of ambiguity, scale and complexity. - Build interface-oriented systems that use Machine Learning models, perform proof-of-concept, experiment, optimize, and deploy your models into production; work closely with software engineers to assist in productionizing your ML models. - Curate relevant multi-modal datasets. - Perform hands-on analysis and modeling of experiments with human-in-the-loop that eg increase traffic monetization and merchandise sales, without compromising the shopper experience. - Run A/B experiments, gather data, and perform statistical analysis. - Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation and serving. - Mentor and help recruit Applied Scientists to the team. - Present results and explain methods to senior leadership. - Willingness to publish research at internal and external top scientific venues. - Write and pursue IP submissions. Key job responsibilities This role is focused on developing new multi-modal Generative AI methods to augment generative imagery and videos. You will develop new multi-modal paradigms, models, datasets and agentic architectures that will be at the core of advertising-facing tools that we are launching. You may also work on development of ML and GenAI models suitable for advertising. You will conduct literature reviews to stay on the SOTA of the field. You will regularly engage with product managers, UX designers and engineers who will partner with you to productize your work. For reference see our products: Enhanced Video Generator, Creative Agent and Creative Studio. A day in the life On a day-to-day basis, you will be doing your independent research and work to develop models, you will participate in sprint planning, collaborative sessions with your peers, and demo new models and share results with peers, other partner teams and leadership. About the team The team is a dynamic team of applied scientists, UX researchers, engineers and product leaders. We reside in the Creative X organization, which focuses on creating products for advertisers that will improve the quality of the creatives within Amazon Ads. We are open to hiring candidates to work out of one of the following locations: UK (London), USA (Seattle).
US, WA, Seattle
Do you want to join an innovative team of scientists who use machine learning and statistical techniques to help Amazon provide the best customer experience by preventing eCommerce fraud? Are you excited by the prospect of analyzing and modeling terabytes of data and creating state-of-the-art algorithms to solve real world problems? Do you like to own end-to-end business problems/metrics and directly impact the profitability of the company? Do you enjoy collaborating in a diverse team environment? If yes, then you may be a great fit to join the Amazon Selling Partner Trust & Store Integrity Science Team. We are looking for a talented scientist who is passionate to build advanced machine learning systems that help manage the safety of millions of transactions every day and scale up our operation with automation. Key job responsibilities Innovate with the latest GenAI/LLM/VLM technology to build highly automated solutions for efficient risk evaluation and automated operations Design, develop and deploy end-to-end machine learning solutions in the Amazon production environment to create impactful business value Learn, explore and experiment with the latest machine learning advancements to create the best customer experience A day in the life You will be working within a dynamic, diverse, and supportive group of scientists who share your passion for innovation and excellence. You'll be working closely with business partners and engineering teams to create end-to-end scalable machine learning solutions that address real-world problems. You will build scalable, efficient, and automated processes for large-scale data analyses, model development, model validation, and model implementation. You will also be providing clear and compelling reports for your solutions and contributing to the ongoing innovation and knowledge-sharing that are central to the team's success.
US, WA, Seattle
Are you passionate about applying machine learning and advanced statistical techniques to protect one of the world's largest online marketplaces? Do you want to be at the forefront of developing innovative solutions that safeguard Amazon's customers and legitimate sellers while ensuring a fair and trusted shopping experience? Do you thrive in a collaborative environment where diverse perspectives drive breakthrough solutions? If yes, we invite you to join the Amazon Risk Intelligence Science Team. We're seeking an exceptional scientist who can revolutionize how we protect our marketplace through intelligent automation. As a key member of our team, you'll develop and deploy state-of-the-art machine learning systems that analyze millions of seller interactions daily, ensuring the integrity and trustworthiness of Amazon's marketplace while scaling our operations to new heights. Your work will directly impact the safety and security of the shopping experience for hundreds of millions of customers worldwide, while supporting the growth of honest entrepreneurs and businesses. Key job responsibilities • Use machine learning and statistical techniques to create scalable abuse detection solutions that identify fraudulent seller behavior, account takeovers, and marketplace manipulation schemes • Innovate with the latest GenAI technology to build highly automated solutions for efficient seller verification, transaction monitoring, and risk assessment • Design, develop and deploy end-to-end machine learning solutions in the Amazon production environment to prevent and detect sophisticated abuse patterns across the marketplace • Learn, explore and experiment with the latest machine learning advancements to protect customer trust and maintain marketplace integrity while supporting legitimate selling partners • Collaborate with cross-functional teams to develop comprehensive risk models that can adapt to evolving abuse patterns and emerging threats About the team You'll be working closely with business partners and engineering teams to create end-to-end scalable machine learning solutions that address real-world problems. You will build scalable, efficient, and automated processes for large-scale data analyses, model development, model validation, and model implementation. You will also be providing clear and compelling reports for your solutions and contributing to the ongoing innovation and knowledge-sharing that are central to the team's success.
US, WA, Seattle
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and exclusive access to coverage of live sports. All customers regardless of whether they have a Prime membership or not, can access programming from subscriptions such as Apple TV, Peacock Premium Plus, HBO Max, FOX One, Crunchyroll and MGM+, as well as more than 900 free ad-support (FAST) Channels, rent or buy titles, and enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! Interested in influencing what customers around the world see when they turn on Prime Video? The Prime Video Personalization and Discovery team matches customers with the right content at the right time, at all touch points throughout the content discovery journey. We are looking for a customer-focused, solutions-oriented Senior Data Scientist to build and guide new data-driven frameworks to understand what makes new personalization and content discovery innovations successful for users and the business. You'll be part of an embedded science team on projects that are fast-paced, challenging, and ultimately influence what millions of customers around the world see when the log into Prime Video. The ideal candidate brings strong problem-solving skills, stakeholder communication skills, and the ability to balance technical rigor with delivery speed and customer impact. You will build cross-functional support within Prime Video, assess business problems, define metrics, and support iterative scientific solutions that balance short-term delivery with long-term science roadmaps. Key job responsibilities - Use advanced statistical and machine learning techniques to extract insights from complex, large-scale data sets - Design and implement end-to-end data science workflows, from data acquisition and cleaning to model development, testing, and deployment - Support scalable, self-service data analyses by building datasets for analytics, reporting and ML use cases - Partner with product stakeholders and science peers to identify strategic data-driven opportunities to improve the customer experience - Communicate findings, conclusions, and recommendations to technical and non-technical stakeholders - Stay up-to-date on the latest data science tools, techniques, and best practices and help evangelize them across the organization
US, CA, Sunnyvale
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video team member, you’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. We are looking for an Applied Scientist to push the envelope of AI content generation. As a scientist at Prime Video, you will contribute directly to productions using innovative tools in computer vision, deep learning, and generative AI to transform entertainment experiences. The ideal candidate has deep knowledge in one of: graphics, deep learning, generative AI and/or reinforcement learning and experience applying them real-world problems. You understand tradeoffs between business needs and model complexity, and you take calculated risks in developing rapid prototypes and iterative model improvements. You are excited to learn from and alongside seasoned scientists, engineers, and business leaders. You are an excellent communicator and effectively translate technical findings into production systems and business action (and customer delight). Key job responsibilities • Build generative AI models that create production-ready content, including movie content, localized assets, and visual marketing materials used across Prime Video's global platform. • Drive end-to-end machine learning projects that have a high degree of ambiguity, scale, complexity. • Build machine learning models, perform proof-of-concept, experiment, optimize, and deploy your models. • Run experiments, gather data, and perform statistical analysis. • Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation and serving. • Research new and innovative machine learning approaches. • Share knowledge and research outcomes via internal and external conferences and journal publications A day in the life In this role, you will invent science and systems for content localization, generation, including graphics and machine learning-based modeling systems. You will work with a team of scientists and product managers to design customer-facing products, and you will work with technology teams to productize and maintain the associated solutions.
US, WA, Bellevue
The Amazon Fulfillment Technologies (AFT) Science team is seeking an exceptional Applied Scientist with strong operations research and optimization expertise to develop production solutions for one of the most complex systems in the world: Amazon's Fulfillment Network. At AFT Science, we design, build, and deploy optimization, statistics, machine learning, and GenAI/LLM solutions that power production systems running across Amazon Fulfillment Centers worldwide. We tackle a wide range of challenges throughout the network, including labor planning and staffing, pick scheduling, stow guidance, and capacity risk management. Our mission is to develop innovative, scalable, and reliable science-driven production solutions that exceed the published state of the art, enabling systems to run optimally and continuously (from every few minutes to every few hours) across our large-scale network. Key job responsibilities As an Applied Scientist, you will collaborate with scientists, software engineers, product managers, and operations leaders to develop optimization-driven solutions that directly impact process efficiency and associate experience in the fulfillment network. Your key responsibilities include: - Develop deep understanding and domain knowledge of operational processes, system architecture, and business requirements - Dive deep into data and code to identify opportunities for continuous improvement and disruptive new approaches - Design and develop scalable mathematical models for production systems to derive optimal or near-optimal solutions for existing and emerging challenges - Create prototypes and simulations for agile experimentation of proposed solutions - Advocate for technical solutions with business stakeholders, engineering teams, and senior leadership - Partner with software engineers to integrate prototypes into production systems - Design and execute experiments to test new or incremental solutions launched in production - Build and monitor metrics to track solution performance and business impact About the team Amazon Fulfillment Technology (AFT) designs, develops, and operates end-to-end fulfillment technology solutions for all Amazon Fulfillment Centers (FCs). We harmonize the physical and virtual worlds so Amazon customers can get what they want, when they want it. The AFT Science team brings expertise in operations research, optimization, statistics, machine learning, and GenAI/LLM, combined with deep domain knowledge of operational processes within FCs and their unique challenges. We prioritize advancements that support AFT tech teams and focus areas rather than specific fields of research or individual business partners. We influence each stage of innovation from inception to deployment, which includes both developing novel solutions and improving existing approaches. Our production systems rely on a diverse set of technologies, and our teams invest in multiple specialties as the needs of each focus area evolve.
CA, ON, Toronto
Are you interested in shaping the future of Advertising and B2B Sales? We are a growing science and engineering team with an exciting charter and need your passion, innovative thinking, and creativity to help take our products to new heights. Amazon Advertising is one of Amazon's fastest growing and most profitable businesses, responsible for defining and delivering a collection of advertising products that drive discovery and sales. Our products are strategically important to our businesses driving long term growth. We break fresh ground in product and technical innovations every day! Within the Advertising Sales organization, we are building a central AI/ML team and are seeking top science talent to build new, science-backed services to drive success for our customers. Our goal is to transform the way account teams operate by creating actionable insights and recommendations they can share with their advertising accounts, and ingesting Generative AI throughout their end-to-end workflows to improve their work efficiency. As a part of our team, you will bring deep expertise in Generative AI and quantitative modeling (forecasting, recommender systems, reinforcement learning, causal inferencing or generative artificial intelligence) to build and refine models that can be implemented in production. You will contribute to chart new courses with our ad sales support technologies, and you have the communication skills necessary to explain complex technical approaches to a variety of stakeholders and customers. You will be part of a team of fellow scientists and engineers taking on iterative approaches to tackle big, long-term problems. Why you will love this opportunity: Amazon has invested heavily in building a world-class advertising business. This team defines and delivers a collection of advertising products that drive discovery and sales. Our solutions generate billions in revenue and drive long-term growth for Amazon's Retail and Marketplace businesses. We deliver billions of ads impressions, millions of clicks daily, and break fresh ground to create world-class products. We are a highly motivated, collaborative, and fun-loving team with an entrepreneurial spirit with a broad mandate to experiment and innovate. Impact and Career Growth: You will invent new experiences and influence customer-facing shopping experiences; this is your opportunity to work within the fastest growing businesses across all of Amazon! Define a long-term scientific vision for our advertising sales business, driven from our customers' needs, translating that direction into specific plans for scientists, engineers and product teams. This role combines scientific leadership, organizational ability, technical strength, product focus, and business understanding. Key job responsibilities - Conceptualize and lead state-of-the-art research on new Machine Learning and Generative Artificial Intelligence solutions to optimize all aspects of the Ad Sales business - Guide the technical approach for the design and implementation of successful models and algorithms in support of expert cross-functional teams delivering on demanding projects - Conduct deep data analysis to derive insights to the business, and identify gaps and new opportunities - Run regular A/B experiments, gather data, and perform statistical analysis - Work closely with software engineers to deliver end-to-end solutions into production - Improve the scalability, efficiency and automation of large-scale data analytics, model training, deployment and serving About the team Sales AI is a central science and engineering organization within Amazon Advertising Sales that powers selling motions and account team workflows via state-of-the-art of AI/ML services. Sales AI is investing in a range of sales intelligence models, including the development of advertiser insights, recommendations and Generative AI-powered applications throughout account team workflows.
US, NY, New York
In this role, you will build scalable solutions and sophisticated models that identify and drive growth opportunities for Amazon Ads teams, specifically within Amazon's Demand Side Platform (ADSP). You will leverage machine learning, simulation, and advanced statistical techniques to explain complex patterns, quantify business impact, predict future trends, and prescribe actionable strategies that inform critical business decisions at the highest levels of the organization. You will work with various stakeholders to align on priorities, with the understanding that scope and direction may evolve based on organizational needs. You will translate business goals into agile, insightful analytics that create tangible value for both stakeholders and customers, and communicate your findings clearly and actionably to managers and senior leaders so they can quickly understand insights and take decisive action. You will set the strategy for ads delivery and quality and establish the measurement and decision frameworks. A core mandate for this role is to identify, instrument, and operationalize the input metrics that most directly drive ads delivery, quality, and performance, ensuring we optimize the levers that move outcomes rather than simply reporting on lagging KPIs. Key job responsibilities * You will define and execute in-depth data analysis that drives data-informed decision making for product, sales, and finance teams who speak on behalf of advertisers. * You will establish and drive data hygiene best practices to ensure coherence and integrity of data feeding into production ML/AI solutions. * You will identify, instrument, and operationalize the input metrics that most directly drive ads delivery, quality, and performance, creating robust measurement frameworks. * You will collaborate with colleagues across science and engineering disciplines for fast turnaround proof-of-concept prototyping at scale. * You will partner with product managers and stakeholders to define forward-looking product visions and prospective business use cases. * You will set the strategy for ads delivery and quality, establishing decision frameworks that enable teams to move from reactive reporting to proactive optimization. * You will drive and lead a culture of data-driven innovations within the Amazon AdTech org.