Long-form-video understanding and synthesis

Four CVPR papers from Prime Video examine a broad set of topics related to efficient model training for understanding and synthesizing long-form cinematic content.

At this year’s Conference on Computer Vision and Pattern Recognition (CVPR), Prime Video presented four papers that indicate the broad range of cutting-edge problems we work on.

In one paper, “Movies2Scenes: Using movie metadata to learn scene representation”, we present a novel contrastive-learning approach that uses only commonly available movie metadata to learn a general-purpose scene representation. On a diverse set of tasks evaluated using multiple benchmark datasets, models that use our representations consistently outperform models using existing state-of-the-art representations.

Notably, our learned representation offers an average improvement of 7.9% on the seven classification tasks and 9.7% on the two regression tasks in the Long-Form Video Understanding (LVU) dataset. This effort is an important step toward the first foundation model for general-purpose movie understanding.

In another paper, “Selective structured state-spaces for long-form video understanding”, we extend the recently proposed S4 model with a lightweight mask generator that adaptively selects informative image tokens, resulting in more efficient and accurate modeling of long-term spatiotemporal dependencies in videos. Our approach is consistently more accurate than the previous state-of-the-art model, by as much as 9.6%, while reducing the memory footprint by 23%.


Similarly, our paper "Dynamic inference with grounding based vision and language models" explores the problem of computational redundancy in large vision-and-language models, addressing this challenge by dynamically skipping network layers, dropping input tokens, and fusing multimodal tokens, conditioned on the input image-text pair. Our results show that we can improve the run-time efficiency of the state-of-the-art models by up to 50% on multiple downstream tasks with an accuracy drop of only 0.3%.

Lastly, our paper “LEMaRT: Label-efficient masked region transform for image harmonization” addresses the large amounts of labeled data required to train image harmonization models, which modify content extracted from one image so that it blends better into a composite with another. Our method automatically generates training data by simulating the appearance defects that image harmonization models are expected to remove. It outperforms previous state-of-the-art approaches by a margin of 0.4 dB (a mean-squared-error improvement of ~9%) when fine-tuned on only 50% of the training data from one of the standard benchmarks (iHarmony4) and by 1.0 dB (an MSE improvement of ~21%) when trained on the full training dataset.

Toward a foundation model for movie understanding

The term “foundation model” generally refers to (i) a single large model that is (ii) trained on large amounts of mostly unlabeled data and can (iii) drive a number of downstream tasks. While several general-purpose visual and textual foundation models exist (e.g., BERT, GPT-4, CLIP, DALL-E 2), no foundation model specifically geared toward movie understanding had been proposed before our work.

This is partly because directly applying existing visual or textual foundation models to movie understanding has limited effectiveness, given the large domain gap between cinematic content and the web-crawled images and text used to train those models. Factors such as the limited accessibility of large-scale cinematic content, the computational resources required to process it, and the lack of benchmark datasets for evaluating downstream applications add to the challenge of building a foundation model for movie understanding.


To address these challenges, we proposed a novel model trained on over five million scenes automatically identified from thousands of movies and comprising more than 45 million frames. Our model does not require any manual annotations and relies only on commonly available movie-level information (genre, synopsis, etc.). The scene representations from our model can be applied to improve the performance of a diverse set of downstream tasks, which is a key step toward building a foundation model for movie understanding.

We use movie metadata to define a measure of movie similarity and use that similarity measure to identify data pairs for contrastive learning. In contrastive learning, a model is trained on both positive pairs — examples that are similar in the relevant way — and negative pairs. During training, the model learns to produce data representations that pull positive pairs together and push negative pairs apart.

Often, the positive pairs are created by augmenting existing examples — say, re-cropping them, reversing them, or re-coloring them. By instead using scenes from movies that are considered similar to each other (see below), we ensure that our positive scene pairs are not only visually similar but also semantically coherent, yielding a much richer set of geometric and thematic augmentations than traditional augmentation approaches provide.
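To make the training objective concrete, here is a minimal, pure-Python sketch of an InfoNCE-style contrastive loss over scene embeddings. The function names and the choice of cosine similarity are illustrative assumptions, not the paper's implementation; in practice, the positive would be a scene from a movie deemed similar by the metadata-based measure, and the negatives would be scenes from dissimilar movies.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors (illustrative choice)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: minimizing it pulls the positive toward the
    anchor and pushes the negatives away."""
    logits = [cosine(anchor, positive) / temperature] + [
        cosine(anchor, n) / temperature for n in negatives
    ]
    # Softmax cross-entropy with the positive at index 0
    # (max-subtraction for numerical stability).
    m = max(logits)
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))
```

A positive that is close to the anchor yields a small loss; swapping the positive and negative roles makes the loss large, which is exactly the gradient signal that shapes the representation space.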

Overview of our approach.

As can be seen in the video below, our learned scene representation effectively places thematically similar scenes close to one another.

Qualitative examples of similar-scene pairs found using our approach.

In the examples below, we compare our representation with the commonly used CLIP visual representation for scene retrieval using place-labeled scenes in the Long-Form Video Understanding (LVU) dataset. Given a query scene, our representation can capture appearance as well as semantic concepts to retrieve similar scenes more effectively, while CLIP can capture only local appearance-based patterns. For overall retrieval precision on six categories of places, our representation offers a 22.7% improvement over CLIP.

A comparison of our video representation method and one of its predecessors, CLIP, on the task of place retrieval using the Long-Form Video Understanding (LVU) dataset.

Quantitatively, our learned representation exhibits an average improvement of 7.9% and 9.7% on the seven classification tasks and two regression tasks of the LVU dataset, respectively. Furthermore, using our newly collected MCD dataset in Prime Video, we compare our learned scene representation with state-of-the-art models pretrained on action recognition and image classification datasets. Our scene representation outperforms the alternatives by margins ranging from 3.8% to 50.9% across different models and tasks.

Reducing model complexity for long-form-video understanding

At Prime Video, we’re developing state-of-the-art AI models for cinematic-content understanding to facilitate a variety of downstream use cases. A key technical challenge is the effective modeling of complex spatiotemporal dependencies, particularly in long-form videos such as movies and TV episodes.

Various shots from the movie Stuart Little, showing the complex spatiotemporal dependencies of cinematic content.

Previously proposed convolutional and recurrent neural networks struggle to learn long-term dependencies. In part this is because of exploding or vanishing gradients — where cascading adjustments to model weights grow too large or too small — as information is incorporated over long durations. Vision transformers can use self-attention to address this challenge, attending to particular prior frames of video when interpreting the current frame. But this is computationally expensive, as it requires pairwise computations between the current frame and all of its predecessors.


The recently proposed structured-state-space-sequence (S4) model, with its linear complexity, offers a promising direction in this space; however, we empirically demonstrate that treating all image tokens equally, as the S4 model does, can adversely affect a model’s efficiency and accuracy.

To address this challenge, we present a novel selective S4 (i.e., S5) model that employs a lightweight mask generator to adaptively select informative image tokens, resulting in more efficient and accurate modeling of long-term spatiotemporal dependencies in videos. Unlike previous methods, which used mask-based token reduction in transformers, our S5 model avoids the dense self-attention calculation by following the guidance of the momentum-updated S4 model. This enables our model to efficiently discard less informative tokens and adapt to various long-form-video-understanding tasks more effectively.
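As a rough sketch of the two mechanisms described above, the following illustrates adaptive token selection and the momentum (moving-average) update with hypothetical helper functions; the actual S5 model operates on learned feature maps and parameter tensors, not the toy lists used here.

```python
def ema_update(momentum_params, online_params, m=0.99):
    """Moving-average update: the momentum S4 model slowly tracks
    the online S4 model's parameters."""
    return [m * pm + (1 - m) * po
            for pm, po in zip(momentum_params, online_params)]

def select_tokens(tokens, scores, keep_ratio=0.5):
    """Keep only the highest-scoring image tokens (scores would come
    from the mask generator, guided by momentum-S4 features), so the
    linear-time S4 layers process a shorter sequence."""
    k = max(1, int(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])  # preserve the temporal order of survivors
    return [tokens[i] for i in keep]
```

During training, `ema_update` would be applied to every parameter tensor after each optimizer step, so the mask generator's guidance changes smoothly rather than chasing a rapidly moving target.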

At left is an illustration of our S5 model (a). We introduce a “mask generator” that enacts a selective token-picking strategy, leveraging the feature representations from the momentum S4 model. The momentum S4 model is updated from the S4 model in a moving-average manner. At right is an illustration of the proposed pretraining framework using long-short masked contrastive learning (b), which initializes our S5 model to enhance robustness.

However, as with most token reduction methods, informative image tokens may occasionally be dropped incorrectly. To improve the robustness and the temporal horizon of our model, we propose a novel long-short masked contrastive-learning (LSMCL) approach that enables our model to predict longer temporal contexts from shorter input videos.

We present extensive comparative results using three challenging long-form video-understanding datasets (LVU, COIN, and Breakfast), demonstrating that our approach is consistently more accurate than the previous state-of-the-art S4 model, by as much as 9.6% on one dataset, with a memory footprint that’s 23% smaller.

Dynamic inference of multimodal models using reinforcement learning

The availability of transformer models that operate over multiple data modalities, together with large-scale pretraining approaches, has led to significant progress on joint image-and-language models. However, these models impose high computational costs and therefore offer low run-time efficiency, making them difficult to apply to Prime Video’s large catalogue.

Although approaches such as pruning, knowledge distillation, and quantization can help address this challenge, they can incur significant drops in accuracy (e.g., ≥ 1% at ≥ 50% model compression rates), as they are primarily designed to reduce model parameters, not to improve run-time efficiency.


To address this challenge, we propose a model that saves computation by dynamically skipping layers of a multimodal network; pruning input tokens from either the language backbone, the image backbone, or both; and fusing tokens from the separate backbones, conditioned on the input image-text pair.

Most multimodal transformer models include multihead self-attention and feed-forward network layers, which can be skipped for some inputs. Additionally, we remove redundant tokens at different levels of the backbones and fuse the image tokens with the language tokens in an adaptive manner. To learn policies for dynamic inference, we train agents using reinforcement learning.
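The resulting control flow can be sketched as follows; in our work the skip and prune decisions are made by RL-trained agents conditioned on the input image-text pair, whereas here they are stand-in callables supplied by the caller.

```python
def dynamic_forward(tokens, layers, skip_layer, prune_token):
    """Run a backbone while (i) skipping whole layers and
    (ii) dropping redundant tokens, per input.

    skip_layer(i, tokens) -> bool: policy decision to skip layer i.
    prune_token(i, t) -> bool: policy decision to drop token t after layer i.
    """
    for i, layer in enumerate(layers):
        if skip_layer(i, tokens):  # skip this layer for this input?
            continue
        tokens = layer(tokens)
        # Drop tokens the policy marks as redundant at this depth.
        tokens = [t for t in tokens if not prune_token(i, t)]
    return tokens
```

For example, with three toy layers that each increment integer “tokens”, a policy that skips the middle layer and prunes any token greater than 10 reduces both the depth and the width of the computation for that particular input.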

Our results demonstrate that we can improve the run-time efficiency of the state-of-the-art models MDETR and GLIP by up to 50% on the tasks of referring-expression comprehension, segmentation, and visual question-answering, with a maximum accuracy drop of only 0.3%.

Accuracy-vs.-frames-per-second (a and b) and accuracy-vs.-GFLOPS (c and d) comparisons of the evaluated models. As shown, our proposed method comfortably outperforms multiple alternative approaches on both metrics while maintaining high accuracy.

Improving label efficiency of image harmonization models

Image harmonization is an important component of the broader problem of image composition, where new images are created by extracting foreground regions from one image and transferring them to another image in a photorealistic manner.


The main technical challenge for image harmonization is the appearance mismatch between the foreground extracted from the source image and the background of the destination image. Image harmonization aims to adjust the appearance of the foreground to make it compatible with the background. However, training traditional models for image harmonization requires a large amount of labeled data, which is costly and time-consuming to obtain.

To address this challenge, we introduce a novel approach to pretraining image harmonization models, LEMaRT, which automatically generates training data by simulating the types of defects that image harmonization models are expected to remove. LEMaRT takes an image as input, selects a region in that image, and applies a set of appearance transformations to it. We use these modified images, along with the original images, to pretrain our image harmonization model. Furthermore, we introduce an image harmonization model, SwinIH, by retrofitting the previously proposed Swin Transformer with a combination of local and global self-attention mechanisms.
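Here is a minimal sketch of the data-generation idea, assuming a grayscale image represented as a nested list of floats in [0, 1] and a single brightness-style transform as the simulated defect; the actual LEMaRT pipeline applies a richer set of appearance transformations to real images.

```python
import random

def make_pretraining_pair(image, rng=None):
    """Simulate a harmonization defect: perturb the appearance of a
    random rectangular region, leaving the rest of the image untouched.
    The model is then pretrained to map the composite back to the original."""
    rng = rng or random.Random(0)
    h, w = len(image), len(image[0])
    # Pick a rectangular "foreground" region.
    y0, x0 = rng.randrange(h), rng.randrange(w)
    y1, x1 = rng.randrange(y0, h) + 1, rng.randrange(x0, w) + 1
    gain = rng.uniform(0.5, 1.5)  # brightness-style appearance defect
    composite = [row[:] for row in image]
    for y in range(y0, y1):
        for x in range(x0, x1):
            composite[y][x] = min(1.0, image[y][x] * gain)
    return composite, image  # (model input, reconstruction target)
```

Because the target is simply the unmodified original, this procedure yields supervised training pairs from unlabeled images alone, which is the source of the method's label efficiency.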

Given an image, our approach applies a set of transformations (e.g., brightness, hue adjustment) to obtain a transformed image that is combined with the original image to form a composite. These composite images are used to pretrain our image harmonization transformer model. As shown in the figure, our model is capable of reconstructing photorealistic outputs.

Pretraining our SwinIH model with our LEMaRT approach results in a new state of the art for image harmonization, while being label-efficient, that is, requiring less annotated data for fine-tuning than existing methods. Notably, on the iHarmony4 dataset, SwinIH outperforms the previous state of the art, SCS-Co, by a margin of 0.4 dB when fine-tuned on only 50% of the training data and by 1.0 dB when trained on the full training dataset.

Using our LEMaRT pretraining scheme, our image harmonization model (SwinIH) surpasses state-of-the-art (SOTA) counterparts when fine-tuned on less than 40% of the training data from iHarmony4.

Qualitative comparisons suggest that LEMaRT is better at color correction than prior methods, thanks to the pretraining process, during which LEMaRT learns the distribution of photorealistic images.

Qualitative comparison between our method, LEMaRT (SwinIH), and three state-of-the-art methods (RainNet, iS2AM, DHT+) on the iHarmony4 dataset.

Research areas

Related content

BR, SP, Sao Paulo
A Amazon lançou o Centro de Inovação de IA Generativa em junho de 2023 para ajudar os clientes da AWS a acelerar a inovação e o sucesso empresarial com IA Generativa (https://press.aboutamazon.com/2023/6/aws-announces-generative-ai -centro de inovação). Este Centro de Inovação oferece oportunidades para inovar em uma organização de ritmo acelerado que contribui para projetos e tecnologias revolucionárias que são implantadas em dispositivos e na nuvem. Como cientista de dados, você é proficiente em projetar e desenvolver soluções avançadas baseadas em IA generativa para resolver diversos problemas dos clientes. Você trabalhará com terabytes de texto, imagens e outros tipos de dados para resolver problemas do mundo real por meio da Gen AI. Você trabalhará em estreita colaboração com equipes de contas e estrategistas de ML para definir o caso de uso, e com outros cientistas e engenheiros de ML da equipe para projetar experimentos e encontrar novas maneiras de agregar valor ao cliente. A pessoa selecionado possuirá habilidades técnicas e de contato com o cliente que permitirão que você faça parte da equipe técnica da AWS no ecossistema/ambiente de nossos provedores de soluções, bem como diretamente para os clientes finais. Você será capaz de conduzir discussões com pessoal técnico e de gerenciamento sênior de clientes e parceiros. A day in the life Aqui na AWS, abraçamos nossas diferenças. Estamos empenhados em promover a nossa cultura de inclusão. Temos dez grupos de afinidade liderados por funcionários, alcançando 40.000 funcionários em mais de 190 filiais em todo o mundo. Temos ofertas de benefícios inovadoras e organizamos experiências de aprendizagem anuais e contínuas, incluindo nossas conferências Conversations on Race and Ethnicity (CORE) e AmazeCon (diversidade de gênero). 
A cultura de inclusão da Amazon é reforçada pelos nossos 16 Princípios de Liderança, que lembram os membros da equipe de buscar perspectivas diversas, aprender e ser curiosos e ganhar confiança. About the team Equilíbrio trabalho/vida pessoal Nossa equipe valoriza muito o equilíbrio entre vida pessoal e profissional. Não se trata de quantas horas você passa em casa ou no trabalho; trata-se do fluxo que você estabelece que traz energia para ambas as partes da sua vida. Acreditamos que encontrar o equilíbrio certo entre sua vida pessoal e profissional é fundamental para a felicidade e a realização ao longo da vida. Oferecemos flexibilidade no horário de trabalho e incentivamos você a encontrar seu próprio equilíbrio entre trabalho e vida pessoal. Mentoria e crescimento de carreira Nossa equipe se dedica a apoiar novos membros. Temos uma ampla combinação de níveis de experiência e mandatos e estamos construindo um ambiente que celebra o compartilhamento de conhecimento e a orientação. Nossos membros seniores desfrutam de orientação individual e revisões de código completas, mas gentis. Nós nos preocupamos com o crescimento de sua carreira e nos esforçamos para atribuir projetos com base no que ajudará cada membro da equipe a se tornar um engenheiro mais completo e capacitá-los a assumir tarefas mais complexas no futuro. We are open to hiring candidates to work out of one of the following locations: Sao Paulo, SP, BRA
US, WA, Seattle
Outbound Communications own the worldwide charter for delighting our customers with timely, relevant notifications (email, mobile, SMS and other channels) to drive awareness and discovery of Amazon’s products and services. We meet customers at their channel of preference with the most relevant content at the right time and frequency. We directly create and operate marketing campaigns, and we have also enabled select partner teams to build programs by reusing and extending our infrastructure. We optimize for customers to receive the most relevant and engaging content across all of Amazon worldwide, and apply the appropriate guardrails to ensure a consistent and high-quality CX. Outbound Communications seek a talented Applied Scientist to join our team to develop the next generation of automated and personalized marketing programs to help Amazon customers in their shopping journeys worldwide. Come join us in our mission today! Key job responsibilities As an Applied Scientist on the team, you will lead the roadmap and strategy for applying science to solve customer problems in the automated marketing domain. This is an opportunity to come in on Day 0 and lead the science strategy of one of the most interesting problem spaces at Amazon - understanding the Amazon customer to build deeply personalized and adaptive messaging experiences. You will be part of a multidisciplinary team and play an active role in translating business and functional requirements into concrete deliverables. You will work closely with product management and the software development team to put solutions into production. You will apply your skills in areas such as deep learning and reinforcement learning while building scalable industrial systems. You will have a unique opportunity to produce and deliver models that help build best-in-class customer experiences and build systems that allow us to deploy these models to production with low latency and high throughput. 
We are open to hiring candidates to work out of one of the following locations: Seattle, WA, USA
US, WA, Seattle
The Artificial General Intelligence (AGI) team is looking for a passionate, talented, and inventive Applied Scientist with a strong deep learning background, to help build industry-leading technology with multimodal systems. Key job responsibilities As an Applied Scientist with the AGI team, you will work with talented peers to develop novel algorithms and modeling techniques to advance the state of the art with multimodal systems. Your work will directly impact our customers in the form of products and services that make use of vision and language technology. You will leverage Amazon’s heterogeneous data sources and large-scale computing resources to accelerate development with multimodal Large Language Models (LLMs) and Generative Artificial Intelligence (Gen AI) in Computer Vision. About the team The AGI team has a mission to push the envelope with multimodal LLMs and Gen AI in Computer Vision, in order to provide the best-possible experience for our customers. We are open to hiring candidates to work out of one of the following locations: Seattle, WA, USA
US, WA, Bellevue
The Fulfillment by Amazon (FBA) team is looking for a passionate, curious, and creative Senior Applied Scientist, with expertise in machine learning and a proven record of solving business problems through scalable ML solutions, to join our top-notch cross-domain FBA science team. We want to learn seller behaviors, understand seller experience, build automated LLM-based solutions to sellers, design seller policies and incentives, and develop science products and services that empower third-party sellers to grow their businesses. We also predict potentially costly defects that may occur during packing, shipping, receiving and storing the inventory. We aim to prevent such defects before occurring while we are also fulfilling customer demand as quickly and efficiently as possible, in addition to managing returns and reimbursements. To do so, we build and innovate science solutions at the intersection of machine learning, statistics, economics, operations research, and data analytics. As a senior applied scientist, you will propose and deploy solutions that will likely draw from a range of scientific areas such as supervised and unsupervised learning, recommendation systems, statistical learning, LLMs, and reinforcement learning. This role has high visibility to senior Amazon business leaders and involves working with other scientists, and partnering with engineering and product teams to integrate scientific work into production systems. Key job responsibilities - As a senior member of the science team, you will play an integral part in building Amazon's FBA management system. - Research and develop machine learning models to solve diverse business problems faced in Seller inventory management systems. - Define a long-term science vision and roadmap for the team, driven fundamentally from our customers' needs, translating those directions into specific plans for research and applied scientists, as well as engineering and product teams. 
- Drive and execute machine learning projects/products end-to-end: from ideation, analysis, prototyping, development, metrics, and monitoring. - Review and audit modeling processes and results for other scientists, both junior and senior. - Advocate the right ML solutions to business stakeholders, engineering teams, as well as executive level decision makers A day in the life In this role, you will be a technical leader in machine learning with significant scope, impact, and high visibility. Your solutions may lead to billions of dollars impact on either the topline or the bottom line of Amazon third-party seller business. As a senior scientist on the team, you will be involved in every aspect of the process - from idea generation, business analysis and scientific research, through to development and deployment of advanced models - giving you a real sense of ownership. From day one, you will be working with experienced scientists, engineers, and designers who love what they do. You are expected to make decisions about technology, models and methodology choices. You will strive for simplicity, and demonstrate judgment backed by mathematical proof. You will also collaborate with the broader decision and research science community in Amazon to broaden the horizon of your work and mentor engineers and scientists. The successful candidate will have the strong expertise in applying machine learning models in an applied environment and is looking for her/his next opportunity to innovate, build, deliver, and impress. We are seeking someone who wants to lead projects that require innovative thinking and deep technical problem-solving skills to create production-ready machine learning solutions. The candidate will need to be entrepreneurial, wear many hats, and work in a fast-paced, high-energy, highly collaborative environment. We value highly technical people who know their subject matter deeply and are willing to learn new areas. 
We look for individuals who know how to deliver results and show a desire to develop themselves, their colleagues, and their career. About the team Fulfillment by Amazon (FBA) is a service that allows sellers to outsource order fulfillment to Amazon, allowing sellers to leverage Amazon’s world-class facilities to provide customers Prime delivery promise. Sellers gain access to Prime members worldwide, see their sales lift, and are free to focus their time and resources on what they do best while Amazon manages fulfillment. Over the last several years, sellers have enjoyed strong business growth with FBA shipping more than half of all products offered by Amazon. FBA focuses on helping sellers with automating and optimizing the third-party supply chain. FBA sellers leverage Amazon’s expertise in machine learning, optimization, data analytics, econometrics, and market design to deliver the best inventory management experience to sellers. We work full-stack, from foundational backend systems to future-forward user interfaces. Our culture is centered on rapid prototyping, rigorous experimentation, and data-driven decision-making. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA
US, WA, Bellevue
The Fulfillment by Amazon (FBA) team is looking for a passionate, curious, and creative Applied Scientist, with expertise and experience in machine learning, to join our top-notch cross-domain FBA science team. We want to learn seller behaviors, understand seller experience, build automated LLM-based solutions to sellers, design seller policies and incentives, and develop science products and services that empower third-party sellers to grow their businesses. We also predict potentially costly defects that may occur during packing, shipping, receiving and storing the inventory. We aim to prevent such defects before occurring while we are also fulfilling customer demand as quickly and efficiently as possible, in addition to managing returns and reimbursements. To do so, we build and innovate science solutions at the intersection of machine learning, statistics, economics, operations research, and data analytics. As an applied scientist, you will design and implement ML solutions that will likely draw from a range of scientific areas such as supervised and unsupervised learning, recommendation systems, statistical learning, LLMs, and reinforcement learning. This role has high visibility to senior Amazon business leaders and involves working with other senior and principal scientists, and partnering with engineering and product teams to integrate scientific work into production systems. Key job responsibilities - Research and develop machine learning models to solve diverse FBA business problems. - Translate business requirements/problems into specific plans for research and applied scientists, as well as engineering and product teams. - Drive and execute machine learning projects/products end-to-end: from ideation, analysis, prototyping, development, metrics, and monitoring. - Work closely with teams of scientists, product managers, program managers, software engineers to drive production model implementations. 
- Build scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation. - Advocate technical solutions to business stakeholders, engineering teams, as well as executive level decision makers A day in the life In this role, you will work in machine learning with significant scope, impact, and high visibility. Your solutions may lead to billions of dollars impact on either the topline or the bottom line of Amazon third-party seller business. As an applied scientist, you will be involved in every aspect of the scientific development process - from idea generation, business analysis and scientific research, through to development and deployment of advanced models - giving you a real sense of ownership. From day one, you will be working with experienced scientists, engineers, and designers who love what they do. You are expected to make decisions about technology, models and methodology choices. You will strive for simplicity, and demonstrate judgment backed by mathematical proof. You will also collaborate with the broader decision and research science community in Amazon to broaden the horizon of your work and mentor engineers and scientists. The successful candidate will have the strong expertise in applying machine learning models in an applied environment and is looking for her/his next opportunity to innovate, build, deliver, and impress. We are seeking someone who wants to lead projects that require innovative thinking and deep technical problem-solving skills to create production-ready machine learning solutions. We value highly technical people who know their subject matter deeply and are willing to learn new areas. We look for individuals who know how to deliver results and show a desire to develop themselves, their colleagues, and their career. 
About the team Fulfillment by Amazon (FBA) is a service that allows sellers to outsource order fulfillment to Amazon, allowing sellers to leverage Amazon’s world-class facilities to provide customers Prime delivery promise. Sellers gain access to Prime members worldwide, see their sales lift, and are free to focus their time and resources on what they do best while Amazon manages fulfillment. Over the last several years, sellers have enjoyed strong business growth with FBA shipping more than half of all products offered by Amazon. FBA focuses on helping sellers with automating and optimizing the third-party supply chain. FBA sellers leverage Amazon’s expertise in machine learning, optimization, data analytics, econometrics, and market design to deliver the best inventory management experience to sellers. We work full-stack, from foundational backend systems to future-forward user interfaces. Our culture is centered on rapid prototyping, rigorous experimentation, and data-driven decision-making. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA
GB, London
Economic Decision Science is a central science team working across a variety of topics in the EU Stores business and beyond. We work closely with EU business leaders to drive change at Amazon. We focus on solving long-term, ambiguous, and challenging problems while providing advisory support to help solve short-term business pain points. Key topics include pricing, product selection, delivery speed, profitability, and customer experience. We tackle these issues by building novel econometric models, machine learning systems, and high-impact experiments, which we integrate into business, financial, and system-level decision making. Our work is highly collaborative, and we regularly partner with EU- and US-based interdisciplinary teams. We are looking for a Senior Economist who can provide structure around complex business problems, hone those problems into specific, scientific questions, and test those questions to generate insights. The ideal candidate will work with various science, engineering, operations, and analytics teams to estimate models and algorithms on large-scale data, design pilots and measure their impact, and transform successful prototypes into improved policies and programs at scale. If you have an entrepreneurial spirit, know how to deliver results fast, have a deeply quantitative and highly innovative approach to solving problems, and long for the opportunity to build pioneering solutions to challenging problems, we want to talk to you.
Key job responsibilities - Provide data-driven guidance and recommendations on strategic questions facing the EU Retail leadership - Scope, design, and implement version-zero (V0) models and experiments to kickstart new initiatives and thinking and to drive system-level changes across Amazon - Build a long-term research agenda to understand, break down, and tackle the most stubborn and ambiguous business challenges - Influence business leaders and work closely with other scientists at Amazon to deliver measurable progress and change We are open to hiring candidates to work out of one of the following locations: London, GBR
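As a minimal illustration of the kind of V0 impact measurement a pilot might start from, the sketch below computes a difference in means with a normal-approximation confidence interval. The data, effect size, and function name are all hypothetical, and real pilot analyses would layer much more rigor (randomization checks, covariate adjustment, multiple-testing control) on top of this.

```python
import math
import random
import statistics

def diff_in_means_ci(treatment, control, z=1.96):
    """Difference in means with a normal-approximation 95% CI:
    the simplest read-out of a pilot's estimated impact."""
    diff = statistics.mean(treatment) - statistics.mean(control)
    se = math.sqrt(statistics.variance(treatment) / len(treatment)
                   + statistics.variance(control) / len(control))
    return diff, (diff - z * se, diff + z * se)

random.seed(42)
# Simulated pilot: treatment lifts the outcome by 0.5 on average.
control = [random.gauss(10.0, 2.0) for _ in range(500)]
treatment = [random.gauss(10.5, 2.0) for _ in range(500)]
diff, (lo, hi) = diff_in_means_ci(treatment, control)
print(f"estimated lift: {diff:.2f}, 95% CI: ({lo:.2f}, {hi:.2f})")
```

With 500 observations per arm, the standard error is small enough that a true lift of 0.5 is comfortably detectable; the CI conveys the uncertainty a point estimate alone would hide.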
US, WA, Seattle
Amazon is looking for a passionate, talented, and inventive Applied Scientist with a background in Natural Language Processing (NLP), Deep Learning, and Generative AI (GenAI) to help build industry-leading technology for contact centers. The ideal candidate should have a robust foundation in NLP and machine learning and a keen interest in advancing the field. The ideal candidate will also enjoy operating in dynamic environments, have the self-motivation to take on challenging problems to deliver big customer impact, and move fast to ship solutions and innovate along the development process. As part of our Transcribe science team in Amazon AWS AI, you will have the opportunity to build the next generation of call center analytics solutions. You will work alongside a supportive and collaborative team with a healthy mix of scientists, software engineers, and language engineers to research and develop state-of-the-art technology for natural language processing. A day in the life AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let that stop you from applying. Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform.
We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never require sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Hybrid Work We value innovation and recognize that it sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our U.S. Amazon offices. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA | Seattle, WA, USA
US, WA, Seattle
We are looking for an Applied Scientist to join our Seattle team. As an Applied Scientist, you will use a range of science methodologies to solve challenging business problems when the solution is unclear. Our team solves a broad range of problems: natural-language understanding of third-party shoppable content, product and content recommendation for social media influencers and their audiences, determining optimal compensation for creators, and mitigating fraud. We generate deep semantic understanding of the photos and videos in shoppable content created by our creators for efficient processing and appropriate placement for the best customer experience. For example, you may lead the development of reinforcement learning models, such as multi-armed bandits (MABs), to rank the content and products shown to influencers. To achieve this, a deep understanding of the quality and relevance of content must be established through ML models that provide that context for ranking. To be successful on our team, you need a combination of business acumen, broad knowledge of statistics, deep understanding of ML algorithms, and an analytical mindset. You thrive in a collaborative environment and are passionate about learning. Our team uses a variety of AWS tools, such as SageMaker, S3, and EC2, and a variety of skill sets in shallow and deep ML models, particularly in NLP and CV. You will bring knowledge in many of these domains along with your own specialties. Key job responsibilities • Use statistical and machine learning techniques to create scalable and lasting systems. • Analyze and understand large amounts of Amazon’s historical business data for recommender/matching algorithms. • Design, develop, and evaluate highly innovative models for NLP.
• Establish scalable, efficient, automated processes for large-scale data analyses, model development, model validation, and implementation. • Research and implement novel machine learning and statistical approaches, including NLP and computer vision. A day in the life In this role, you’ll use your NLP or CV skills and your creative and critical problem-solving skills to drive new projects from ideation to implementation. Your science expertise will be leveraged to research and deliver often-novel solutions to existing problems, explore emerging problem spaces, and create or organize knowledge around them. About the team Our team puts a high value on your work and personal-life happiness. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to lifelong happiness and fulfillment. We offer flexibility in working hours and encourage you to establish your own harmony between your work and personal life. We are open to hiring candidates to work out of one of the following locations: New York, NY, USA | Seattle, WA, USA
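The multi-armed-bandit approach mentioned above can be sketched with a minimal Bernoulli Thompson-sampling bandit. The arm names, click rates, and class are purely illustrative, not the team's actual system:

```python
import random

random.seed(0)  # for reproducibility of this sketch

class ThompsonSamplingBandit:
    """Minimal Bernoulli Thompson-sampling bandit for choosing which
    content item to show next."""

    def __init__(self, arms):
        # Beta(1, 1) prior per arm: alpha counts successes, beta failures.
        self.alpha = {arm: 1 for arm in arms}
        self.beta = {arm: 1 for arm in arms}

    def select(self):
        # Sample a plausible click rate per arm and show the best one.
        samples = {arm: random.betavariate(self.alpha[arm], self.beta[arm])
                   for arm in self.alpha}
        return max(samples, key=samples.get)

    def update(self, arm, clicked):
        # Treat a click as reward 1, no click as reward 0.
        if clicked:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1

true_rates = {"video_a": 0.1, "video_b": 0.3, "video_c": 0.2}
bandit = ThompsonSamplingBandit(list(true_rates))
for _ in range(1000):
    arm = bandit.select()
    clicked = random.random() < true_rates[arm]  # simulated feedback
    bandit.update(arm, clicked)
```

Over repeated rounds the posterior concentrates on the arm with the highest click rate, so exploration is automatically traded off against exploitation.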
US, WA, Seattle
The Automated Reasoning Group in AWS Platform is looking for an Applied Scientist with experience in building scalable solver solutions that delight customers. You will be part of a world-class team building the next generation of automated reasoning tools and services. AWS has more services, and more features within those services, than any other cloud provider, from infrastructure technologies like compute, storage, and databases to emerging technologies such as machine learning and artificial intelligence, data lakes and analytics, and the Internet of Things. You will apply your knowledge to propose solutions, create software prototypes, and move prototypes into production systems using modern software development tools and methodologies. In addition, you will support and scale your solutions to meet the ever-growing demand of customer use. You will use your strong verbal and written communication skills, be self-driven, and own the delivery of high-quality results in a fast-paced environment. Each day, hundreds of thousands of developers make billions of transactions worldwide on AWS. They harness the power of the cloud to enable innovative applications, websites, and businesses. Using automated reasoning technology and mathematical proofs, AWS allows customers to answer questions about security, availability, durability, and functional correctness. We call this provable security: absolute assurance in security of the cloud and in the cloud. See https://aws.amazon.com/security/provable-security/ As an Applied Scientist in AWS Platform, you will play a pivotal role in shaping the definition, vision, design, roadmap, and development of product features from beginning to end.
You will: - Define and implement new solver applications that provide scalable and efficient approaches to difficult problems - Apply software engineering best practices to ensure a high standard of quality for all team deliverables - Work in an agile, startup-like development environment, where you are always working on the most important things - Deliver high-quality scientific artifacts - Work with the team to define new interfaces that lower the barrier to adoption for automated reasoning solvers - Work with the team to help drive business decisions AWS Platform is the glue that holds the AWS ecosystem together. From identity features such as access management and sign-on, cryptography, the console, and builder and developer tools to projects like automating all of our contractual billing systems, AWS Platform is always innovating with the customer in mind. The AWS Platform team sustains over 750 million transactions per second. Learn and Be Curious. We have a formal mentor search application that lets you find a mentor who works best for you based on location, job family, job level, etc. Your manager can also help you find a mentor or two, because two are better than one. In addition to formal mentors, we work and train together so that we are always learning from one another, and we celebrate and support the career progression of our team members. Inclusion and Diversity. Our team is diverse! We drive toward an inclusive culture and work environment. We are intentional about attracting, developing, and retaining amazing talent from diverse backgrounds. Team members are active in Amazon’s 10+ affinity groups, sometimes known as employee resource groups, which bring employees together across businesses and locations around the world. These include groups such as the Black Employee Network, Latinos at Amazon, Indigenous at Amazon, Families at Amazon, Amazon Women in Engineering, LGBTQ+, Warriors at Amazon (Military), Amazon People With Disabilities, and more.
Key job responsibilities Work closely with internal and external users on defining and extending application domains. Tune solver performance for application-specific demands. Identify new opportunities for solver deployment. About the team Solver science is a talented team of scientists from around the world. Expertise areas include solver theory, performance, implementation, and applications. We are open to hiring candidates to work out of one of the following locations: Portland, OR, USA | Seattle, WA, USA
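As a toy example of the satisfiability questions that automated reasoning tools decide, the sketch below brute-forces a small formula in conjunctive normal form. Production solvers rely on far more sophisticated algorithms (e.g. CDCL); the encoding and function name here are purely illustrative.

```python
from itertools import product

def is_satisfiable(clauses, variables):
    """Brute-force SAT check over a CNF formula.

    Each clause is a list of (variable, polarity) pairs:
    ("x", True) means x, ("x", False) means NOT x.
    """
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # The formula holds if every clause has at least one true literal.
        if all(any(assignment[v] == polarity for v, polarity in clause)
               for clause in clauses):
            return True
    return False

# (x OR y) AND (NOT x OR y) AND (NOT y OR z): satisfiable, e.g. y=z=True.
clauses = [[("x", True), ("y", True)],
           [("x", False), ("y", True)],
           [("y", False), ("z", True)]]
print(is_satisfiable(clauses, ["x", "y", "z"]))  # → True
```

The exponential enumeration here is exactly what real solvers avoid; even so, the example shows the shape of the question ("does any assignment satisfy every constraint?") that provable-security tooling answers at scale.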
CN, 11, Beijing
Amazon Search JP builds features powering product search on the Amazon JP shopping site and expands those innovations worldwide. As an Applied Scientist on this growing team, you will take on a key role in improving the NLP and ranking capabilities of the Amazon product search service. Our ultimate goal is to help customers quickly find the products they are searching for and discover new products they would be interested in. We do so by developing NLP components that cover a wide range of languages and systems. As an Applied Scientist for Search JP, you will design, implement, and deliver search features on the Amazon site, helping millions of customers every day to quickly find what they are looking for. You will propose innovations in NLP and IR to build ML models trained on terabytes of product and traffic data, which are evaluated using both offline metrics and online metrics from A/B testing. You will then integrate these models into the production search engine that serves customers, closing the loop through data, modeling, application, and customer feedback. The chosen approaches to model architecture will balance business-defined performance metrics with the need for millisecond response times. Key job responsibilities - Designing and implementing new features and machine-learned models, including the application of state-of-the-art deep learning to solve search matching, ranking, and search suggestion problems. - Analyzing data and metrics relevant to the search experiences. - Working with teams worldwide on global projects.
Your benefits include: - Working on a high-impact, high-visibility product, with your work improving the experience of millions of customers - The opportunity to use (and innovate) state-of-the-art ML methods to solve real-world problems with tangible customer impact - Being part of a growing team where you can influence the team's mission, direction, and how we achieve our goals We are open to hiring candidates to work out of one of the following locations: Beijing, 11, CHN | Shanghai, 31, CHN
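Offline evaluation of ranking models of the kind described above commonly uses metrics such as normalized discounted cumulative gain (NDCG); a minimal sketch, with illustrative relevance grades:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades:
    higher grades placed earlier contribute more."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    """NDCG: DCG of the model's ranking divided by the ideal DCG."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# Relevance grades (3 = exact match, 0 = irrelevant) in the order the model ranked them.
print(round(ndcg([3, 2, 0, 1]), 3))  # → 0.985
```

A perfect ordering scores exactly 1.0; swapping a relevant result below an irrelevant one lowers the score, which is what makes NDCG a useful offline proxy before an online A/B test.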