New contrastive-learning methods for better data representation

New loss functions enable better approximation of the optimal loss and more-useful representations of multimodal data.

Many recent advances in artificial intelligence are the result of representation learning: a machine learning model learns to represent data items as vectors in a multidimensional space, where geometric relationships between vectors correspond to semantic relationships between items.

The M5 team at Amazon strives to construct general-purpose semantic representations of data related to the Amazon Store — product descriptions, queries, reviews, and more — that can be employed by machine learning (ML) systems throughout Amazon. Our approach involves leveraging all accessible data for each entity, often spanning multiple modalities.

One of the most successful ways to produce general-purpose representations is through contrastive learning, in which a model is trained on pairs of inputs, which are either positive (similar inputs/products) or negative (dissimilar inputs/products). The model learns to pull positive examples together and push negative examples apart.

Related content
Four CVPR papers from Prime Video examine a broad set of topics related to efficient model training for understanding and synthesizing long-form cinematic content.

In a pair of recent papers, M5 researchers have made substantial contributions to the theory and practice of contrastive learning. In “Why do we need large batch sizes in contrastive learning? A gradient-bias perspective”, presented at the 2022 Neural Information Processing Systems (NeurIPS) conference, we propose a new contrastive-learning loss function that enables models to converge on useful representations with lower memory cost and less training data.

And in “Understanding and constructing latent modality structures in multi-modal representation learning”, presented at this year’s Computer Vision and Pattern Recognition conference (CVPR), we propose geometric constraints on the representations of different modes of the same data item — say, image and text — that are more useful for downstream tasks than simply trying to resolve both representations to the same point in the representational space.

Do we need large batch sizes in contrastive learning?

In contrast with standard ML methods, contrastive learning typically requires very large batch sizes to achieve good performance: several popular models, for instance, require tens of thousands of training examples, significantly increasing the memory overhead; reducing the batch size can impair performance. In our NeurIPS paper, we attempt to understand this phenomenon and to propose techniques for mitigating it.

Related content
Two methods presented at CVPR achieve state-of-the-art results by imposing additional structure on the representational space.

Part of the appeal of contrastive learning is that it’s unsupervised, meaning it doesn’t require data annotation. Positive pairs can be generated by mathematically transforming an “anchor sample” and pairing the transformed version with the original; negative pairs can be generated by pairing an anchor sample with transformed versions of other anchor samples. With image data, a transformation might involve re-cropping, reversing, or distorting the colors of the anchor sample; with textual data, a transformation might involve substituting synonyms for the words in a sentence.

Given a measure of similarity between vectors in the representational space, the standard loss function for contrastive learning involves a ratio whose numerator includes the similarity between an anchor sample and one of its transformations; the denominator includes the sum of the similarities of the anchor sample and all possible negative samples. The goal of training is to maximize that ratio.

In principle, given the possibility of applying transformations to negative samples, “all possible negative samples” could describe an infinite set. In practice, contrastive learning typically just relies on the negative examples available in the training batch. Hence the need for large batch sizes — to approximate an infinite sum.

contrastive_learning [Read-Only].png
The contrastive-learning framework. Approximating an infinite sum with the samples in a finite minibatch of training data can introduce gradient bias.

If the distribution of minibatch samples differs from the distribution of possible negatives, however, this approximation can bias the model. One difficulty in correcting the bias is that, because the loss function contrasts each positive pair with all possible negatives at once, in a ratio, it cannot be decomposed into a sum of sub-losses.

We address the decomposability problem using Bayesian augmentation. The general approach is that, for each anchor sample, we create a random auxiliary variable, which can be thought of as a weight applied to the anchor sample’s similarity scores. Using identity under the gamma function, we can show that the auxiliary variable follows a gamma distribution, which is easy to sample. As a consequence, we can rewrite the loss in an exponential rather than a fractional form, making it decomposable.

During training, we begin by sampling the auxiliary variables for the current batch of data from a gamma distribution, giving us the weight of the similarity scores for all the anchor samples. Conditioned on the sampled values, we then apply maximum likelihood estimation to optimize the parameters of the model, which will consider the sampled weights on the similarity scores from the first step. We then repeat this process for the entire dataset, summing a sequence of (weighted) sub-losses to produce a cumulative loss. In our paper, we show that this procedure will converge toward the expected loss for the original contrastive-loss function, with its infinite sum in the denominator.

Contrastive-learning losses.png
Results of 10 training runs on synthetic data with added noise, comparing a model trained with our decomposable loss function (red) to one trained with the conventional loss function (blue). With our loss, the model consistently converged to the optimum (1.0), while with the conventional loss, it never did.

We evaluate our approach through a number of experiments. In one, we used simulated data, into which we injected noise to simulate bias. Then we used both our loss and the conventional loss function to train a model 10 times, with different initialization values. At heavy noise levels, the model trained with the conventional loss failed to converge, while ours consistently converged to the optimum.

We also evaluated the models on a variety of downstream tasks, including zero-/few-shot image classification and image/text retrieval. Our approach showed significant performance improvement over state-of-the-art baseline methods.

What geometries work best for multimodal representation matching?

At M5, we are building scalable models that can handle multimodal data — for instance, multilingual models that translate between product descriptions in different languages or multi-entity models that jointly model different images of the same product. Contrastive learning is a promising method for building such models: data in different modalities that are associated with the same products can be treated as positive pairs, and contrastive learning pulls them together in the representational space.

Related content
A new metric-learning loss function groups together superclasses and learns commonalities within them.

We theoretically investigated whether the standard contrastive-learning framework is optimal in terms of the prediction error rate on downstream tasks, and the surprising answer is no. In our CVPR paper, we prove that if the information gap between two modalities is large — that is, if you can’t infer much about one modality from the other — then the best prediction error we can hope to achieve using standard contrastive-learning representations is larger than that we can achieve if we simply train a machine learning model directly on data in a single modality.

This makes some intuitive sense. Ideally, contrastive learning would pull the different modalities so tightly together that they would essentially resolve to a single point in the representational space. But of course, the reason to use multimodal representations for downstream tasks is that each modality may capture useful information that the other does not. Collapsing the different modalities’ representations together neutralizes this advantage.

Consequently, in our CVPR paper, we explore different geometrical relationships in the representational space that can establish correlations between multimodal data without sacrificing information specific to each mode. We propose three general approaches to constructing modality structures in the representational space, suited to intramodal representation, intermodal representation, and a combination of the two:

  1. a deep feature separation loss for intramodality regularization, which uses two types of neural network components to separate different modality information: one component captures information that’s shared between modalities (tuned according to the standard contrastive-learning loss), and the other, which is orthogonal to the first, captures information unique to the modality;
  2. a “Brownian-bridge” loss for intermodality regularization, which uses Brownian motion to plot several trajectories/transitions between the representation of one modality (say, text) and the other (say, an image) and constrains representations of augmented data to lie along one of those paths; and
  3. a geometric-consistency loss for both intra- and intermodality regularization, which enforces symmetry in the geometric relationships between representations in one modality and the corresponding representations in the other modality, while simultaneously enforcing symmetries in cross-modal geometric relationships.
Contrastive learning.png
Three types of modality structures that can improve modality representation learning for downstream tasks. (1) With deep feature separation, a model produces two orthogonal vectors for each modality, one that encodes information shared across modalities and one that encodes mode-specific information. (2) Brownian bridges use Brownian motion to generate trajectories/transitions between representations of data in different modes, defining a subspace in which the representations of augmented data are induced to lie. (3) Geometric consistency enforces symmetries in the relationships between data representations, both within modes (orange-orange and blue-blue) and across modes (blue-orange).

We have conducted extensive experiments on two popular multimodal representation-learning frameworks, the CLIP-based two-tower model and the ALBEF-based fusion model. We tested our model on a variety of tasks, including zero-/few-shot image classification, image-text retrieval, visual question answering, visual reasoning, and visual entailment. Our method achieves consistent improvements over existing methods, demonstrating the effectiveness and generalizability of our proposed approach on multimodal representation learning.

Going forward

Our NeurIPS and CVPR papers represent only two interesting projects from our M5 team. There is a lot more research on multimodal learning going on in M5. This includes generative models for images, videos, and text (e.g. Stable Diffusion, DreamBooth) to enable data synthesis and representation learning and training and applying large language models to enhance customer shopping experiences. We expect to report on more research highlights in the near future.

Research areas

Related content

US, WA, Redmond
Project Kuiper is an initiative to launch a constellation of Low Earth Orbit satellites that will provide low-latency, high-speed broadband connectivity to unserved and under-served communities around the world. We are looking for an accomplished Applied Scientist who will deliver science applications such as anomaly detection, advanced calibration methods, space engineering simulations, and performance analytics -- to name a few. Key job responsibilities • Translate ambiguous problems into well defined mathematical problems • Prototype, test, and implement state-of-the-art algorithms for antenna pointing calibration, anomaly detection, predictive failure models, and ground terminal performance evaluation • Provide actionable recommendations for system design/definition by defining, running, and summarizing physically-accurate simulations of ground terminal functionality • Collaborate closely with engineers to deploy performant, scalable, and maintainable applications in the cloud Export Control Requirement: Due to applicable export control laws and regulations, candidates must be a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum. A day in the life In this role as an Applied Scientist, you will design, implement, optimize, and operate systems critical to the uptime and performance of Kuiper ground terminals. Your contributions will have a direct impact on customers around the world. About the team This role will be part of the Ground Software & Analytics team, part of Ground Systems Engineering. Our team is responsible for: • Design, development, deployment, and support of a Tier-1 Monitoring and Remediation System (MARS) needed to maintain high availability of hundreds of ground terminals deployed around the world • Ground systems integration/test (I&T) automation • Ground terminal configuration, provisioning, and acceptance automation • Systems analysis • Algorithm development (pointing/tracking/calibration/monitoring) • Software interface definition for supplier-provided hardware and development of software test automation
US, VA, Arlington
Are you looking to work at the forefront of Machine Learning and AI? Would you be excited to apply Generative AI algorithms to solve real world problems with significant impact? The Generative AI Innovation Center helps AWS customers implement Generative AI solutions and realize transformational business opportunities. This is a team of strategists, scientists, engineers, and architects working step-by-step with customers to build bespoke solutions that harness the power of generative AI. Starting in 2024, the Innovation Center launched a new Custom Model and Optimization program to help customers develop and scale highly customized generative AI solutions. The team helps customers imagine and scope bespoke use cases that will create the greatest value for their businesses, define paths to navigate technical or business challenges, develop and optimize models to power their solutions, and make plans for launching solutions at scale. The GenAI Innovation Center team provides guidance on best practices for applying generative AI responsibly and cost efficiently. You will work directly with customers and innovate in a fast-paced organization that contributes to game-changing projects and technologies. You will design and run experiments, research new algorithms, and find new ways of optimizing risk, profitability, and customer experience. We’re looking for Applied Scientists capable of using GenAI and other techniques to design, evangelize, and implement state-of-the-art solutions for never-before-solved problems. Key job responsibilities • Collaborate with AI/ML scientists and architects to research, design, develop, and evaluate generative AI solutions to address real-world challenges • Interact with customers directly to understand their business problems, aid them in implementation of generative AI solutions, brief customers and guide them on adoption patterns and paths to production • Help customers optimize their solutions through approaches such as model selection, training or tuning, right-sizing, distillation, and hardware optimization • Provide customer and market feedback to product and engineering teams to help define product direction
US, CA, Culver City
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! We are forming a new organization within Prime Video to redefine our operational landscape through the power of artificial intelligence. As a Applied Scientist within this initiative, you will be a technical leader helping to design and build the intelligent systems that power our vision. You will tackle complex and ambiguous problems, designing and delivering scalable and resilient agentic AI and ML solutions from the ground up. You will not only write high-quality, maintainable software and models, but also mentor other scientists, influence our technical strategy, and drive engineering best practices across the team. Your work will directly contribute to making Prime Video's operations more efficient and will set the technical foundation for years to come. Key job responsibilities • Lead the design and architecture of highly scalable, available, and resilient services for our AI automation platform. • Write high-quality, maintainable, and robust code to solve complex business problems, building flexible systems without over-engineering. • Act as a technical leader and mentor for other engineers on the team, assisting with career growth and encouraging excellence. • Work through ambiguous requirements, cut through complexity, and translate business needs into scalable technical solutions. • Take ownership of the full software development lifecycle, including design, testing, deployment, and operations. • Work closely with product managers, scientists, and other engineers to build and launch new features and systems. About the team This role offers a unique opportunity to shape the future of one of Amazon's most exciting businesses through the application of AI technologies. If you're passionate about leveraging AI to drive real-world impact at massive scale, we want to hear from you.
US, CA, Sunnyvale
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! We are looking for a self-motivated, passionate and resourceful Applied Scientist to bring diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. You will spend your time as a hands-on machine learning practitioner and a research leader. You will play a key role on the team, building and guiding machine learning models from the ground up. At the end of the day, you will have the reward of seeing your contributions benefit millions of Amazon.com customers worldwide. Key job responsibilities - Develop AI solutions for various Prime Video Search systems using Deep learning, GenAI, Reinforcement Learning, and optimization methods; - Work closely with engineers and product managers to design, implement and launch AI solutions end-to-end; - Design and conduct offline and online (A/B) experiments to evaluate proposed solutions based on in-depth data analyses; - Effectively communicate technical and non-technical ideas with teammates and stakeholders; - Stay up-to-date with advancements and the latest modeling techniques in the field; - Publish your research findings in top conferences and journals. About the team Prime Video Search Science team owns science solution to power search experience on various devices, from sourcing, relevance, ranking, to name a few. We work closely with the engineering teams to launch our solutions in production.
US, WA, Seattle
The People eXperience and Technology Central Science (PXTCS) team uses economics, behavioral science, statistics, and machine learning to proactively identify mechanisms and process improvements which simultaneously improve Amazon and the lives, wellbeing, and the value of work to Amazonians. PXTCS is an interdisciplinary team that combines the talents of science and engineering to develop and deliver solutions that measurably achieve this goal. PXTCS is looking for an economist who can apply economic methods to address business problems. The ideal candidate will work with engineers and computer scientists to estimate models and algorithms on large scale data, design pilots and measure impact, and transform successful prototypes into improved policies and programs at scale. PXTCS is looking for creative thinkers who can combine a strong technical economic toolbox with a desire to learn from other disciplines, and who know how to execute and deliver on big ideas as part of an interdisciplinary technical team. Ideal candidates will work in a team setting with individuals from diverse disciplines and backgrounds. They will work with teammates to develop scientific models and conduct the data analysis, modeling, and experimentation that is necessary for estimating and validating models. They will work closely with engineering teams to develop scalable data resources to support rapid insights, and take successful models and findings into production as new products and services. They will be customer-centric and will communicate scientific approaches and findings to business leaders, listening to and incorporate their feedback, and delivering successful scientific solutions. A day in the life The Economist will work with teammates to apply economic methods to business problems. This might include identifying the appropriate research questions, writing code to implement a DID analysis or estimate a structural model, or writing and presenting a document with findings to business leaders. Our economists also collaborate with partner teams throughout the process, from understanding their challenges, to developing a research agenda that will address those challenges, to help them implement solutions. About the team PXTCS is a multidisciplinary science team that develops innovative solutions to make Amazon Earth's Best Employer
US, NY, New York
The Ads Measurement Science team in the Measurement, Ad Tech, and Data Science (MADS) team of Amazon Ads serves a centralized role developing solutions for a multitude of performance measurement products. We create solutions which measure the comprehensive impact of advertiser's ad spend, including sales impacts both online and offline and across timescales, and provide actionable insights that enable our advertisers to optimize their media portfolios. We also own the science solutions for AI tools that unlock new insights and automate high-effort customer workflows, such as custom query and report generation based on natural language user requests. We leverage a host of scientific technologies to accomplish this mission, including Generative AI, classical ML, Causal Inference, Natural Language Processing, and Computer Vision. As an Applied Scientist on the team, you will lead measurement solutions end-to-end from inception to production. You will propose, design, analyze, and productionize models to provide novel measurement insights to our customers. Key job responsibilities - Leverage deep expertise in one or more scientific disciplines to invent solutions to ambiguous ads measurement problems - Disambiguate problems to propose clear evaluation frameworks and success criteria - Work autonomously and write high quality technical documents - Implement a significant portion of critical-path code, and partner with engineers to directly carry solutions into production - Partner closely with other scientists to deliver large, multi-faceted technical projects - Share and publish works with the broader scientific community through meetings and conferences - Communicate clearly to both technical and non-technical audiences - Contribute new ideas that shape the direction of the team's work - Mentor more junior scientists and participate in the hiring process About the team We are a team of scientists across Applied, Research, Data Science and Economist disciplines. You will work with colleagues with deep expertise in ML, NLP, CV, Gen AI, and Causal Inference with a diverse range of backgrounds. We partner closely with top-notch engineers, product managers, sales leaders, and other scientists with expertise in the ads industry and on building scalable modeling and software solutions.
US, WA, Bellevue
Are you inspired by invention? Do you like the idea of seeing how your work impacts the bigger picture? Answer yes to any of these and you’ll fit right in here at Amazon Last Mile Simulations and Analytics Engineering team. WW AMZL Simulations and Analytics Engineering team is looking to build out our Simulation team to drive innovation across our Last Mile network. We start with the customer and work backwards in everything we do. If you’re interested in joining a rapidly growing team working to build a unique, solutions advisory group with a relentless focus on the customer, you’ve come to the right place. This is a blue-sky role that gives you a chance to roll up your sleeves and dive into big data sets in order to build discrete event 3D simulations using tools like Flexsim, Anylogic, Emulate 3D etc and experimentation systems at scale, build optimization algorithms and leverage cutting-edge technologies across Amazon. This is an opportunity to think big about how to solve a challenging problem for the customers. As a Simulation Scientist, you are expected to deep dive into complex problems and drive relentlessly towards innovative solutions working with cross functional teams. Be comfortable interfacing and influencing various functional teams and individuals at all levels of the organization in order to be successful. Lead strategic modelling and simulation projects related to drive process design decisions. Your expertise in synthesizing and communicating insights and recommendations to audiences of varying levels of technical sophistication will enable you to answer specific business questions and innovate for the future. You will apply cutting edge designs and methodologies for complex use cases across Last Mile network to drive innovation. In addition, you will contribute to the end state vision for simulation and experimentation of future delivery stations at Amazon. Key job responsibilities Key job responsibilities • Lead the design, implementation, and delivery of the simulation data science solutions to perform system of systems discrete event simulations for significantly complex operational processes that have a long-term impact on a product, business, or function using FlexSim, Demo 3D, AnyLogic or any other Discrete Event Simulation (DES) software packages • Lead strategic modeling and simulation research projects to drive process design decisions • Be an exemplary practitioner in simulation science discipline to establish best practices and simplify problems to develop discrete event simulations faster with higher standards • Identify and tackle intrinsically hard process flow simulation problems (e.g., highly complex, ambiguous, undefined, with less existing structure, or having significant business risk or potential for significant impact • Deliver artifacts that set the standard in the organization for excellence, from process flow control algorithm design to validation to implementations to technical documents using simulations • Be a pragmatic problem solver by applying judgment and simulation experience to balance cross-organization trade-offs between competing interests and effectively influence, negotiate, and communicate with internal and external business partners, contractors and vendors for multiple simulation projects • Provide simulation data and measurements that influence the business strategy of an organization. Write effective white papers and artifacts while documenting your approach, simulation outcomes, recommendations, and arguments • Lead and actively participate in reviews of simulation research science solutions. You bring clarity to complexity, probe assumptions, illuminate pitfalls, and foster shared understanding within simulation data science discipline • Pay a significant role in the career development of others, actively mentoring and educating the larger simulation data science community on trends, technologies, and best practices • Use advanced statistical /simulation tools and develop codes (python or another object oriented language) for data analysis , simulation, and developing modeling algorithms • Lead and coordinate simulation efforts between internal teams and outside vendors to develop optimal solutions for the network, including equipment specification, material flow control logic, process design, and site layout • Deliver results according to project schedules and quality A day in the life If you are not sure that every qualification on the list above describes you exactly, we'd still love to hear from you! At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you’re passionate about this role and want to make an impact on a global scale, please apply!
IN, KA, Bengaluru
Do you want to join an innovative team of scientists who use machine learning and statistical techniques to create state-of-the-art solutions for providing better value to Amazon’s customers? Do you want to build and deploy advanced ML systems that help optimize millions of transactions every day? Are you excited by the prospect of analyzing and modeling terabytes of data to solve real-world problems? Do you like to own end-to-end business problems/metrics and directly impact the profitability of the company? Do you like to innovate and simplify? If yes, then you may be a great fit to join the Machine Learning team for India Consumer Businesses. Machine Learning, Big Data and related quantitative sciences have been strategic to Amazon from the early years. Amazon has been a pioneer in areas such as recommendation engines, ecommerce fraud detection and large-scale optimization of fulfillment center operations. As Amazon has rapidly grown and diversified, the opportunity for applying machine learning has exploded. We have a very broad collection of practical problems where machine learning systems can dramatically improve the customer experience, reduce cost, and drive speed and automation. These include product bundle recommendations for millions of products, safeguarding financial transactions across by building the risk models, improving catalog quality via extracting product attribute values from structured/unstructured data for millions of products, enhancing address quality by powering customer suggestions We are developing state-of-the-art machine learning solutions to accelerate the Amazon India growth story. Amazon India is an exciting place to be at for a machine learning practitioner. We have the eagerness of a fresh startup to absorb machine learning solutions, and the scale of a mature firm to help support their development at the same time. As part of the India Machine Learning team, you will get to work alongside brilliant minds motivated to solve real-world machine learning problems that make a difference to millions of our customers. We encourage thought leadership and blue ocean thinking in ML. Key job responsibilities Use machine learning and analytical techniques to create scalable solutions for business problems Analyze and extract relevant information from large amounts of Amazon’s historical business data to help automate and optimize key processes Design, develop, evaluate and deploy, innovative and highly scalable ML models Work closely with software engineering teams to drive real-time model implementations Work closely with business partners to identify problems and propose machine learning solutions Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model maintenance Work proactively with engineering teams and product managers to evangelize new algorithms and drive the implementation of large-scale complex ML models in production Leading projects and mentoring other scientists, engineers in the use of ML techniques About the team International Machine Learning Team is responsible for building novel ML solutions that attack India first (and other Emerging Markets across MENA and LatAm) problems and impact the bottom-line and top-line of India business. Learn more about our team from https://www.amazon.science/working-at-amazon/how-rajeev-rastogis-machine-learning-team-in-india-develops-innovations-for-customers-worldwide
US, WA, Bellevue
We are seeking a passionate, talented, and inventive individual to join the Applied AI team and help build industry-leading technologies that customers will love. This team offers a unique opportunity to make a significant impact on the customer experience and contribute to the design, architecture, and implementation of a highly innovative product. The mission of the Applied AI team is to enable organizations within Worldwide Amazon.com Stores to accelerate the adoption of AI technologies across various parts of our business. We are looking for a Senior Applied Science manager to join our Applied AI team and lead a cross-functional team of scientists and engineers who work on LLM-based solutions. On our team you will push the boundaries of ML and Generative AI techniques to scale the inputs for hundreds of billions of dollars of annual revenue for our eCommerce business. If you have a passion for AI technologies, a drive to innovate and a desire to make a meaningful impact, we invite you to become a valued member of our team. You will be responsible for leading a cross functional team of scientists and engineer and developing and maintaining the systems and tools that enable us to accelerate knowledge operations and work in the intersection of Science and Engineering. You will push the boundaries of ML and Generative AI techniques to scale the inputs for hundreds of billions of dollars of annual revenue for our eCommerce business. If you have a passion for AI technologies, a drive to innovate and a desire to make a meaningful impact, we invite you to become a valued member of our team. We are seeking an experienced Senior Applied Science Manager who combines superb technical, research, analytical and leadership capabilities with a demonstrated ability to get the right things done quickly and effectively. This person must be comfortable working with a team of top-notch developers and collaborating with our research teams. We’re looking for someone who innovates, and loves solving hard problems. You will be expected to have an established background in leading teams that build highly scalable systems and system design, have excellent project management skills, great communication skills, and a motivation to achieve results in a fast-paced environment. You should be somebody who enjoys working on complex problems, is customer-centric, and feels strongly about building good software as well as making that software achieve its operational goals. You will leverage Amazon’s heterogeneous data sources and large-scale computing resources to accelerate advances in artificial intelligence. Your work will directly impact our customers in the form of novel products and services.
US, WA, Seattle
The Sponsored Products and Brands (SPB) team at Amazon Ads is re-imagining the advertising landscape through generative AI technologies, revolutionizing how millions of customers discover products and engage with brands across Amazon.com and beyond. We are at the forefront of re-inventing advertising experiences, bridging human creativity with artificial intelligence to transform every aspect of the advertising lifecycle from ad creation and optimization to performance analysis and customer insights. We are a passionate group of innovators dedicated to developing responsible and intelligent AI technologies that balance the needs of advertisers, enhance the shopping experience, and strengthen the marketplace. If you're energized by solving complex challenges and pushing the boundaries of what's possible with AI, join us in shaping the future of advertising. This position will be part of the Conversational Ad Experiences team within the Amazon Advertising organization. Our cross-functional team focuses on designing, developing and launching innovative ad experiences delivered to shoppers in conversational contexts. We utilize leading-edge engineering and science technologies in generative AI to help shoppers discover new products and brands through intuitive, conversational, multi-turn interfaces. We also empower advertisers to reach shoppers, using their own voice to explain and demonstrate how their products meet shoppers' needs. We collaborate with various teams across multiple Amazon organizations to push the boundary of what's possible in these fields. We are seeking a science leader for our team within the Sponsored Products & Brands organization. You'll be working with talented scientists, engineers, and product managers to innovate on behalf of our customers. An ideal candidate is able to navigate through ambiguous requirements, working with various partner teams, and has experience in generative AI, large language models (LLMs), information retrieval, and ads recommendation systems. Using a combination of generative AI and online experimentation, our scientists develop insights and optimizations that enable the monetization of Amazon properties while enhancing the experience of hundreds of millions of Amazon shoppers worldwide. If you're fired up about being part of a dynamic, driven team, then this is your moment to join us on this exciting journey! Key job responsibilities - Serve as a tech lead for defining the science roadmap for multiple projects in the conversational ad experiences space powered by LLMs. - Build POCs, optimize and deploy models into production, run experiments, perform deep dives on experiment data to gather actionable learnings and communicate them to senior leadership - Work closely with software engineers on detailed requirements, technical designs and implementation of end-to-end solutions in production. - Work closely with product managers to contribute to our mission, and proactively identify opportunities where science can help improve customer experience - Research new machine learning approaches to drive continued scientific innovation - Be a member of the Amazon-wide machine learning community, participating in internal and external meetups, hackathons and conferences - Help attract and recruit technical talent, mentor scientists and engineers in the team