Responsible AI in the generative era

Generative AI raises new challenges in defining, measuring, and mitigating concerns about fairness, toxicity, and intellectual property, among other things. But work has started on the solutions.

In recent years, and even recent months, there have been rapid and dramatic advances in the technology known as generative AI. Generative AI models are trained on inconceivably massive collections of text, code, images, and other rich data. They are now able to produce, on demand, coherent and compelling stories, news summaries, poems, lyrics, paintings, and programs. The potential practical uses of generative AI are only just beginning to be understood but are likely to be manifold and revolutionary and to include writing aids, creative content production and refinement, personal assistants, copywriting, code generation, and much more.

Kearns with caption
Michael Kearns, a professor of computer and information science at the University of Pennsylvania and an Amazon Scholar.

There is thus considerable excitement about the transformations and new opportunities that generative AI may bring. There are also understandable concerns — some of them new twists on those of traditional responsible AI (such as fairness and privacy) and some of them genuinely new (such as the mimicry of artistic or literary styles). In this essay, I survey these concerns and how they might be addressed over time.

I will focus primarily on technical approaches to the risks, while acknowledging that social, legal, regulatory, and policy mechanisms will also have important roles to play. At Amazon, our hope is that such a balanced approach can significantly reduce the risks, while still preserving much of the excitement and usefulness of generative AI.

What is generative AI?

To understand what generative AI is and how it works, it is helpful to begin with the example of large language models (LLMs). Imagine the thought experiment in which we start with some sentence fragment like Once upon a time, there was a great ..., and we poll people on what word they would add next. Some might say wizard, others might say queen, monster, and so on. We would also expect that given the fairy tale nature of the fragment, words such as apricot or fork would be rather unlikely suggestions.

Related content
Model using ASR hypotheses as extra inputs reduces word error rate of human transcriptions by almost 11%.

If we poll a large enough population, a probability distribution over next words would begin to emerge. We could then randomly pick a word from that distribution (say wizard), and now our sequence would be one word longer — Once upon a time, there was a great wizard ... — and we could again poll for the next word. In this manner we could theoretically generate entire stories, and if we restarted the whole process, the crowd would produce an entirely different narrative due to the inherent randomness.

Dramatic advances in machine learning have effectively made this thought experiment a reality. But instead of polling crowds of people, we use a model to predict likely next words, one trained on a massive collection of documents — public collections of fiction and nonfiction, Wikipedia entries and news articles, transcripts of human dialogue, open-source code, and much more.

LLM objective.gif
An example of how a language model uses context to predict the next word in a sentence.

If the training data contains enough sentences beginning Once upon a time, there was a great …, it will be easy to sample plausible next words for our initial fragment. But LLMs can generalize and create as well, and not always in ways that humans might expect. The model might generate Once upon a time, there was a great storm based on occurrences of tremendous storm in the training data, combined with the learned synonymy of great and tremendous. This completion can happen despite great storm never appearing verbatim in the training data and despite the completions more expected by humans (like wizard and queen).

The resulting models are just as complex as their training data, often described by hundreds of billions of numbers (or parameters, in machine learning parlance), hence the “large” in LLM. LLMs have become so good that not only do they consistently generate grammatically correct text, but they create content that is coherent and often compelling, matching the tone and style of the fragments they were given (known as prompts). Start them with a fairy tale beginning, and they generate fairy tales; give them what seems to be the start of a news article, and they write a news-like article. The latest LLMs can even follow instructions rather than simply extend a prompt, as in Write lyrics about the Philadelphia Eagles to the tune of the Beatles song “Get Back”.

Related content
Models that map spoken language to objects in an image would make it easier for customers to communicate with multimodal devices.

Generative AI isn’t limited to text, and many models combine language and images, as in Create a painting of a skateboarding cat in the style of Andy Warhol. The techniques for building such systems are a bit more complex than for LLMs and involve learning a model of proximity between text and images, which can be done using data sources like captioned photos. If there are enough images containing cats that have the word cat in the caption, the model will capture the proximity between the word and pictures of cats.

The examples above suggest that generative AI is a form of entertainment, but many potential practical uses are also beginning to emerge, including generative AI as a writing tool (Shorten the following paragraphs and improve their grammar), for productivity (Extract the action items from this meeting transcript), for creative content (Propose logo designs for a startup building a dog-walking app), for simulating focus groups (Which of the following two product descriptions would Florida retirees find more appealing?), for programming (Give me a code snippet to sort a list of numbers), and many others.

So the excitement over the current and potential applications of generative AI is palpable and growing. But generative AI also gives rise to some new risks and challenges in the responsible use of AI and machine learning. And the likely eventual ubiquity of generative models in everyday life and work amplifies the stakes in addressing these concerns thoughtfully and effectively.

So what’s the problem?

The “generative” in generative AI refers to the fact that the technology can produce open-ended content that varies with repeated tries. This is in contrast to more traditional uses of machine learning, which typically solve very focused and narrow prediction problems.

For example, consider training a model for consumer lending that predicts whether an applicant would successfully repay a loan. Such a model might be trained using the lender’s data on past loans, each record containing applicant information (work history, financial information such as income, savings, and credit score, and educational background) along with whether the loan was repaid or defaulted.

Related content
NSF deputy assistant director Erwin Gianchandani on the challenges addressed by funded projects.

The typical goal would be to train a model that was as accurate as possible in predicting payment/default and then apply it to future applications to guide or make lending decisions. Such a model makes only lending outcome predictions and cannot generate fairy tales, improve grammar, produce whimsical images, write code, and so on. Compared to generative AI, it is indeed a very narrow and limited model.

But the very limitations also make the application of certain dimensions of responsible AI much more manageable. Consider the goal of making our lending model fair, which would typically be taken to mean the absence of demographic bias. For example, we might want to make sure that the error rate of the predictions of our model (and it generally will make errors, since even human loan officers are imperfect in predicting who will repay) is approximately equal on men and women. Or we might more specifically ask that the false-rejection rate — the frequency with which the model predicts default by an applicant who is in fact creditworthy — be the same across gender groups.

Once armed with this definition of fairness, we can seek to enforce it in the training process. In other words, instead of finding a model that minimizes the overall error rate, we find one that does so under the additional condition that the false-rejection rates on men and women are approximately equal (say, within 1% of each other). We might also want to apply the same notion of fairness to other demographic properties (such as young, middle aged, and elderly). But the point is that we can actually give reasonable and targeted definitions of fairness and develop training algorithms that enforce them.

It is also easy to audit a given model for its adherence to such notions of fairness (for instance, by estimating the error rates on both male and female applicants). Finally, when the predictive task is so targeted, we have much more control over the training data: we train on historical lending decisions only, and not on arbitrarily rich troves of general language, image, and code data.

Now consider the problem of making sure an LLM is fair. What might we even mean by this? Well, taking a cue from our lending model, we might ask that the LLM treat men and women equally. For instance, consider a prompt like Dr. Hanson studied the patient’s chart carefully, and then … . In service of fairness, we might ask that in the completions generated by an LLM, Dr. Hanson be assigned male and female pronouns with roughly equal frequency. We might argue that to do otherwise perpetuates the stereotype that doctors are typically male.

Related content
Method significantly reduces bias while maintaining comparable performance on machine learning tasks.

But then should we not also do this for mentions of nurses, firefighters, accountants, pilots, carpenters, attorneys, and professors? It’s clear that measuring just this one narrow notion of fairness will quickly become unwieldy. And it isn’t even obvious in what contexts it should be enforced. What if the prompt described Dr. Hanson as having a beard? What about the Women’s National Basketball Association (WNBA)? Should mention of a WNBA player in a prompt elicit male pronouns half the time?

Defining fairness for LLMs is even murkier than we suggest above, again because of the open-ended content they generate. Let’s turn from pronoun choices to tone. What if an LLM, when generating content about a woman, uses an ever-so-slightly more negative tone (in choice of words and level of enthusiasm) than when generating content about a man? Again, even detecting and quantifying such differences would be a very challenging technical problem. The field of sentiment analysis in natural-language processing might suggest some possibilities, but currently, it focuses on much coarser distinctions in narrower settings, such as distinguishing positive from negative sentiment in business news articles about particular corporations.

So one of the prices we pay for the rich, creative, open-ended content that generative AI can produce is that it becomes commensurately harder (compared to traditional predictive ML) to define, measure, and enforce fairness.

From fairness to privacy

In a similar vein, let’s consider privacy concerns. It is of course important that a consumer lending model not leak information about the financial or other data of the individual applicants in the training data. (One way this can happen is if model predictions are accompanied by confidence scores; if the model expresses 100% confidence that a loan application will default, it’s likely because that application, with a default outcome, was in the training data.) For this kind of traditional, more narrow ML, there are now techniques for mitigating such leaks by making sure model outputs are not overly dependent on any particular piece of training data.

Related content
Calibrating noise addition to word density in the embedding space improves utility of privacy-protected text.

But the open-ended nature of generative AI broadens the set of concerns from verbatim leaks of training data to more subtle copying phenomena. For example, if a programmer has written some code using certain variable names and then asks an LLM for help writing a subroutine, the LLM may generate code from its training data, but with the original variable names replaced with those chosen by the programmer. So the generated code is not literally in the training data but is different only in a cosmetic way.

There are defenses against these challenges, including curation of training data to exclude private information, and techniques to detect similarity of code passages. But more subtle forms of replication are also possible, and as I discuss below, this eventually bleeds into settings where generative AI reproduces the “style” of content in its training data.

And while traditional ML has begun developing techniques for explaining the decisions or predictions of trained models, they don’t always transfer to generative AI, in part because current generative models sometimes produce content that simply cannot be explained (such as scientific citations that don’t exist, something I’ll discuss shortly).

The special challenges of responsible generative AI

So the usual concerns of responsible AI become more difficult for generative AI. But generative AI also gives rise to challenges that simply don’t exist for predictive models that are more narrow. Let’s consider some of these.

Toxicity. A primary concern with generative AI is the possibility of generating content (whether it be text, images, or other modalities) that is offensive, disturbing, or otherwise inappropriate. Once again, it is hard to even define and scope the problem. The subjectivity involved in determining what constitutes toxic content is an additional challenge, and the boundary between restricting toxic content and censorship may be murky and context- and culture-dependent. Should quotations that would be considered offensive out of context be suppressed if they are clearly labeled as quotations? What about opinions that may be offensive to some users but are clearly labeled as opinions? Technical challenges include offensive content that may be worded in a very subtle or indirect fashion, without the use of obviously inflammatory language.

Related content
Prompt engineering enables researchers to generate customized training examples for lightweight “student” models.

Hallucinations. Considering the next-word distribution sampling employed by LLMs, it is perhaps not surprising that in more objective or factual use cases, LLMs are susceptible to what are sometimes called hallucinations — assertions or claims that sound plausible but are verifiably incorrect. For example, a common phenomenon with current LLMs is creating nonexistent scientific citations. If one of these LLMs is prompted with the request Tell me about some papers by Michael Kearns, it is not actually searching for legitimate citations but generating ones from the distribution of words associated with that author. The result will be realistic titles and topics in the area of machine learning, but not real articles, and they may include plausible coauthors but not actual ones.

In a similar vein, prompts for financial news stories result not in a search of (say) Wall Street Journal articles but news articles fabricated by the LLM using the lexicon of finance. Note that in our fairy tale generation scenario, this kind of creativity was harmless and even desirable. But current LLMs have no levers that let users differentiate between “creativity on” and “creativity off” use cases.

Related content
Combining contrastive training and selection of hard negative examples establishes new benchmarks.

Intellectual property. A problem with early LLMs was their tendency to occasionally produce text or code passages that were verbatim regurgitations of parts of their training data, resulting in privacy and other concerns. But even improvements in this regard have not prevented reproductions of training content that are more ambiguous and nuanced. Consider the aforementioned prompt for a multimodal generative model Create a painting of a skateboarding cat in the style of Andy Warhol. If the model is able to do so in a convincing yet still original manner because it was trained on actual Warhol images, objections to such mimicry may arise.

Plagiarism and cheating. The creative capabilities of generative AI give rise to worries that it will be used to write college essays, writing samples for job applications, and other forms of cheating or illicit copying. Debates on this topic are happening at universities and many other institutions, and attitudes vary widely. Some are in favor of explicitly forbidding any use of generative AI in settings where content is being graded or evaluated, while others argue that educational practices must adapt to, and even embrace, the new technology. But the underlying challenge of verifying that a given piece of content was authored by a person is likely to present concerns in many contexts.

Disruption of the nature of work. The proficiency with which generative AI is able to create compelling text and images, perform well on standardized tests, write entire articles on given topics, and successfully summarize or improve the grammar of provided articles has created some anxiety that some professions may be replaced or seriously disrupted by the technology. While this may be premature, it does seem that generative AI will have a transformative effect on many aspects of work, allowing many tasks previously beyond automation to be delegated to machines.

What can we do?

The challenges listed above may seem daunting, in part because of how unfamiliar they are compared to those of previous generations of AI. But as technologists and society learn more about generative AI and its uses and limitations, new science and new policies are already being created to address those challenges.

For toxicity and fairness, careful curation of training data can provide some improvements. After all, if the data doesn’t contain any offensive or biased words or phrases, an LLM simply won’t be able to generate them. But this approach requires that we identify those offensive phrases in advance and are certain that there are absolutely no contexts in which we would want them in the output. Use-case-specific testing can also help address fairness concerns — for instance, before generative AI is used in high-risk domains such as consumer lending, the model could be tested for fairness for that particular application, much as we might do for more narrow predictive models.

Related content
Amazon Visiting Academic Barbara Poblete helps to build safer, more-diverse online communities — and to aid disaster response.

For less targeted notions of toxicity, a natural approach is to train what we might call guardrail models that detect and filter out unwanted content in the training data, in input prompts, and in generated outputs. Such models require human-annotated training data in which varying types and degrees of toxicity or bias are identified, which the model can generalize from. In general, it is easier to control the output of a generative model than it is to curate the training data and prompts, given the extreme generality of the tasks we intend to address.

For the challenge of producing high-fidelity content free of hallucinations, an important first step is to educate users about how generative AI actually works, so there is no expectation that the citations or news-like stories produced are always genuine or factually correct. Indeed, some current LLMs, when pressed on their inability to quote actual citations, will tell the user that they are just language models that don’t verify their content with external sources. Such disclaimers should be more frequent and clear. And the specific case of hallucinated citations could be mitigated by augmenting LLMs with independent, verified citation databases and similar sources, using approaches such as retrieval-augmented generation. Another nascent but intriguing approach is to develop methods for attributing generated outputs to particular pieces of training data, allowing users to assess the validity of those sources. This could help with explainability as well.

Concerns around intellectual property are likely to be addressed over time by a mixture of technology, policy, and legal mechanisms. In the near term, science is beginning to emerge around various notions of model disgorgement, in which protected content or its effects on generative outputs are reduced or removed. One technology that might eventually prove relevant is differential privacy, in which a model is trained in a way that ensures that any particular piece of training data has negligible effects on the outputs the model subsequently produces.

Related content
By exploiting consistencies across components of ensemble classifiers, a new approach reduces data requirements by up to 89%.

Another approach is so-called sharding approaches, which divide the training data into smaller portions on which separate submodels are trained; the submodels are then combined to form the overall model. In order to undo the effects of any particular item of data on the overall model, we need only remove it from its shard and retrain that submodel, rather than retraining the entire model (which for generative AI would be sufficiently expensive as to be prohibitive).

Finally, we can consider filtering or blocking approaches, where before presentation to the user, generated content is explicitly compared to protected content in the training data or elsewhere and suppressed (or replaced) if it is too similar. Limiting the number of times any specific piece of content appears in the training data also proves helpful in reducing verbatim outputs.

Some interesting approaches to discouraging cheating using generative AI are already under development. One is to simply train a model to detect whether a given (say) text was produced by a human or by a generative model. A potential drawback is that this creates an arms race between detection models and generative AI, and since the purpose of generative AI is to produce high-quality content plausibly generated by a human, it’s not clear that detection methods will succeed in the long run.

An intriguing alternative is watermarking or fingerprinting approaches that would be implemented by the developers of generative models themselves. For example, since at each step LLMs are drawing from the distribution over the next word given the text so far, we can divide the candidate words into “red” and “green” lists that are roughly 50% of the probability each; then we can have the LLM draw only from the green list. Since the words on the green list are not known to users, the likelihood that a human would produce a 10-word sentence that also drew only from the green lists is ½ raised to the 10th power, which is only about 0.0009. In this way we can view all-green content as providing a virtual proof of LLM generation. Note that the LLM developers would need to provide such proofs or certificates as part of their service offering.

LLM watermarking.AI.gif
At each step, the model secretly divides the possible next words into green and red lists. The next word is then sampled only from the green list.
LLM watermarking.human.gif
A human generating a sentence is unaware of the division into green and red lists and is thus very likely to choose a sequence that mixes green and red words. Since, on long sentences, the likelihood of a human choosing an all-green sequence is vanishingly small, we can view all-green sentences as containing a proof they were generated by AI.

Disruption to work as we know it does not have any obvious technical defenses, and opinions vary widely on where things will settle. Clearly, generative AI could be an effective productivity tool in many professional settings, and this will at a minimum alter the current division of labor between humans and machines. It’s also possible that the technology will open up existing occupations to a wider community (a recent and culturally specific but not entirely ludicrous quip on social media was “English is the new programming language”, a nod to LLM code generation abilities) or even create new forms of employment, such as prompt engineer (a topic with its own Wikipedia entry, created in just February of this year).

But perhaps the greatest defense against concerns over generative AI may come from the eventual specialization of use cases. Right now, generative AI is being treated as a fascinating, open-ended playground in which our expectations and goals are unclear. As we have discussed, this open-endedness and the plethora of possible uses are major sources of the challenges to responsible AI I have outlined.

Related content
Technique that mixes public and private training data can meet differential-privacy criteria while cutting error increase by 60%-70%.

But soon more applied and focused uses will emerge, like some of those I suggested earlier. For instance, consider using an LLM as a virtual focus group — creating prompts that describe hypothetical individuals and their demographic properties (age, gender, occupation, location, etc.) and then asking the LLM which of two described products they might prefer.

In this application, we might worry much less about censoring content and much more about removing any even remotely toxic output. And we might choose not to eradicate the correlations between gender and the affinity for certain products in service of fairness, since such correlations are valuable to the marketer. The point is that the more specific our goals for generative AI are, the easier it is to make sensible context-dependent choices; our choices become more fraught and difficult when our expectations are vague.

Finally, we note that end user education and training will play a crucial role in the productive and safe use of generative AI. As the potential uses and harms of generative AI become better and more widely understood, users will augment some of the defenses I have outlined above with their own common sense.

Conclusion

Generative AI has stoked both legitimate enthusiasm and legitimate fears. I have attempted to partially survey the landscape of concerns and to propose forward-looking approaches for addressing them. It should be emphasized that addressing responsible-AI risks in the generative age will be an iterative process: there will be no “getting it right” once and for all. This landscape is sure to shift, with changes to both the technology and our attitudes toward it; the only constant will be the necessity of balancing the enthusiasm with practical and effective checks on the concerns.

Related content

US, WA, Seattle
The Sponsored Products and Brands team at Amazon Ads is re-imagining the advertising landscape through novel generative AI technologies, revolutionizing how millions of customers discover products and engage with brands across Amazon.com and beyond. We are at the forefront of re-inventing advertising experiences, bridging human creativity with artificial intelligence to transform every aspect of the advertising lifecycle from ad creation and optimization to performance analysis and customer insights. We are a passionate group of innovators dedicated to developing responsible and intelligent AI technologies that balance the needs of advertisers, enhance the shopping experience, and strengthen the marketplace ecosystem. If you're energized by solving complex challenges and pushing the boundaries of what's possible with AI, join us in shaping the future of advertising. Key job responsibilities As an applied scientist on our team, you will * Develop AI solutions for Sponsored Brands advertiser and shopper experiences. Build recommendation systems that leverage generative models to develop and improve campaigns. * You invent and design new solutions for scientifically-complex problem areas and/or opportunities in new business initiatives. * You drive or heavily influence the design of scientifically-complex software solutions or systems, for which you personally write significant parts of the critical scientific novelty. You take ownership of these components, providing a system-wide view and design guidance. These systems or solutions can be brand new or evolve from existing ones. * Define a long-term science vision and roadmap for our Sponsored Brands advertising business, driven from our customers' needs, translating that direction into specific plans for applied scientists and engineering teams. This role combines science leadership, organizational ability, technical strength, product focus, and business understanding. * Work closely with engineers and product managers to design, implement and launch AI solutions end-to-end; * Design and conduct A/B experiments to evaluate proposed solutions based on in-depth data analyses; * Think big about the arc of development of Gen AI over a multi-year horizon, and identify new opportunities to apply these technologies to solve real-world problems * Effectively communicate technical and non-technical ideas with teammates and stakeholders; * Translate complex scientific challenges into clear and impactful solutions for business stakeholders. * Mentor and guide junior scientists, fostering a collaborative and high-performing team culture. * Stay up-to-date with advancements and the latest modeling techniques in the field About the team The Sponsored Brands Impressions-based Offerings team is responsible for evolving the value proposition of Sponsored Brands to drive brand advertising in retail media at scale, helping brands get discovered, acquire new customers and sustainably grow customer lifetime value. We build end-to-end solutions that enable brands to drive discovery, visibility and share of voice. This includes building advertiser controls, shopper experiences, monetization strategies and optimization features. We succeed when (1) shoppers discover, engage and build affinity with brands and (2) brands can grow their business at scale with our advertising products. #GenAI
US, CA, San Diego
The Private Brands team is looking for a Sr. Research Scientist to join the team in building science solutions at scale. Our team applies Optimization, Machine Learning, Statistics, Causal Inference, and Econometrics/Economics to derive actionable insights about the complex economy of Amazon’s retail business and develop Statistical Models and Algorithms to drive strategic business decisions and improve operations. We are an interdisciplinary team of Scientists, Engineers, PMTs and Economists. Key job responsibilities You will work with business leaders, scientists, and economists to translate business and functional requirements into concrete deliverables, including the design, development, testing, and deployment of highly scalable optimization solutions and ML models. This is a unique, high visibility opportunity for someone who wants to have business impact, dive deep into large-scale problems, enable measurable actions on the consumer economy, and work closely with scientists and economists. As a Sr Scientist, you bring business and industry context to science and technology decisions. You set the standard for scientific excellence and make decisions that affect the way we build and integrate algorithms. Your solutions are exemplary in terms of algorithm design, clarity, model structure, efficiency, and extensibility. You tackle intrinsically hard problems, acquiring expertise as needed. You decompose complex problems into straightforward solutions. We are particularly interested in candidates with experience in Operations Research, ML and predictive models and working with distributed systems. Academic and/or practical background in Operations Research and Machine Learning specifically Reinforcement Learning are particularly relevant for this position. To know more about Amazon science, Please visit https://www.amazon.science About the team We are a one pizza, agile team of scientists focused on solving supply chain challenges for Amazon Private Brands products. We collaborate with Amazon central teams like SCOT and develop both central as well as APB-specific solutions to address various challenges, including sourcing, demand forecasting, ordering optimization, inventory distribution, and inventory health management. Working closely with business stakeholders, Product Management Teams (PMTs), and engineering partners, we drive projects from initial concept through production deployment and ongoing monitoring.
US, CA, Sunnyvale
The Artificial General Intelligence (AGI) team is looking for a passionate, talented, and inventive Principal Applied Scientist with a strong deep learning background, to lead the development of industry-leading technology with multimodal systems. As a Principal Scientist within the Artificial General Intelligence (AGI) organization, you are a trusted part of the technical leadership. You bring business and industry context to science and technology decisions, set the standard for scientific excellence, and make decisions that affect the way we build and integrate algorithms. A Principal Applied Scientist will solicit differing views across the organization and are willing to change your mind as you learn more. Your artifacts are exemplary and often used as reference across organization. You are a hands-on scientific leader; develop solutions that are exemplary in terms of algorithm design, clarity, model structure, efficiency, and extensibility; and tackle intrinsically hard problems, acquiring expertise as needed. Principal Applied Scientists are expected to decompose complex problems into straightforward solutions. You amplify your impact by leading scientific reviews within your organization or at your location; and scrutinize and review experimental design, modeling, verification and other research procedures. You also probe assumptions, illuminate pitfalls, and foster shared understanding; align teams toward coherent strategies; and educate keeping the scientific community up to date on advanced techniques, state of the art approaches, the latest technologies, and trends. AGI Principal Applied Scientists help managers guide the career growth of other scientists by mentoring and play a significant role in hiring and developing scientists and leads. You will play a critical role in driving the development of Generative AI (GenAI) technologies that can handle Amazon-scale use cases and have a significant impact on our customers' experiences. Key job responsibilities You will be responsible for defining key research directions, inventing new machine learning techniques, conducting rigorous experiments, and ensuring that research is translated into practice. You will develop long-term strategies, persuade teams to adopt those strategies, propose goals and deliver on them. A Principal Applied Scientist will participate in organizational planning, hiring, mentorship and leadership development. You will also be build scalable science and engineering solutions, and serve as a key scientific resource in full-cycle development (conception, design, implementation, testing to documentation, delivery, and maintenance).
US, MA, Boston
The Artificial General Intelligence (AGI) team is looking for a highly skilled and experienced Sr. Applied Scientist, to support the development and implementation of state-of-the-art algorithms and models for supervised fine-tuning and reinforcement learning through human feedback and complex reasoning; with a focus across text, image, and video modalities. As an Sr. Applied Scientist, you will play a critical role in supporting the development of Generative AI (Gen AI) technologies that can handle Amazon-scale use cases and have a significant impact on our customers' experiences. Key job responsibilities Collaborate with cross-functional teams of engineers, product managers, and scientists to identify and solve complex problems in Gen AI Design and execute experiments to evaluate the performance of different algorithms (PT, SFT, RL) and models, and iterate quickly to improve results Think big about the arc of development of Gen AI over a multi-year horizon, and identify new opportunities to apply these technologies to solve real-world problems Communicate results and insights to both technical and non-technical audiences, including through presentations and written reports About the team We are passionate scientists dedicated to pushing the boundaries of innovation in Gen AI with focus on Software Development use cases.
IN, HR, Gurugram
Do you want to join an innovative team of scientists who use machine learning and statistical techniques to create state-of-the-art solutions for providing better value to Amazon’s customers? Do you want to build and deploy advanced ML systems that help optimize millions of transactions every day? Are you excited by the prospect of analyzing and modeling terabytes of data to solve real-world problems? Do you like to own end-to-end business problems/metrics and directly impact the profitability of the company? Do you like to innovate and simplify? If yes, then you may be a great fit to join the Machine Learning team for India Consumer Businesses. Machine Learning, Big Data and related quantitative sciences have been strategic to Amazon from the early years. Amazon has been a pioneer in areas such as recommendation engines, ecommerce fraud detection and large-scale optimization of fulfillment center operations. As Amazon has rapidly grown and diversified, the opportunity for applying machine learning has exploded. We have a very broad collection of practical problems where machine learning systems can dramatically improve the customer experience, reduce cost, and drive speed and automation. These include product bundle recommendations for millions of products, safeguarding financial transactions across by building the risk models, improving catalog quality via extracting product attribute values from structured/unstructured data for millions of products, enhancing address quality by powering customer suggestions We are developing state-of-the-art machine learning solutions to accelerate the Amazon India growth story. Amazon India is an exciting place to be at for a machine learning practitioner. We have the eagerness of a fresh startup to absorb machine learning solutions, and the scale of a mature firm to help support their development at the same time. As part of the India Machine Learning team, you will get to work alongside brilliant minds motivated to solve real-world machine learning problems that make a difference to millions of our customers. We encourage thought leadership and blue ocean thinking in ML. Key job responsibilities Use machine learning and analytical techniques to create scalable solutions for business problems Analyze and extract relevant information from large amounts of Amazon’s historical business data to help automate and optimize key processes Design, develop, evaluate and deploy, innovative and highly scalable ML models Work closely with software engineering teams to drive real-time model implementations Work closely with business partners to identify problems and propose machine learning solutions Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model maintenance Work proactively with engineering teams and product managers to evangelize new algorithms and drive the implementation of large-scale complex ML models in production Leading projects and mentoring other scientists, engineers in the use of ML techniques About the team International Machine Learning Team is responsible for building novel ML solutions that attack India first (and other Emerging Markets across MENA and LatAm) problems and impact the bottom-line and top-line of India business. Learn more about our team from https://www.amazon.science/working-at-amazon/how-rajeev-rastogis-machine-learning-team-in-india-develops-innovations-for-customers-worldwide
US, WA, Seattle
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! Key job responsibilities - Develop ML models for various recommendation & search systems using deep learning, online learning, and optimization methods - Work closely with other scientists, engineers and product managers to expand the depth of our product insights with data, create a variety of experiments to determine the high impact projects to include in planning roadmaps - Stay up-to-date with advancements and the latest modeling techniques in the field - Publish your research findings in top conferences and journals A day in the life We're using advanced approaches such as foundation models to connect information about our videos and customers from a variety of information sources, acquiring and processing data sets on a scale that only a few companies in the world can match. This will enable us to recommend titles effectively, even when we don't have a large behavioral signal (to tackle the cold-start title problem). It will also allow us to find our customer's niche interests, helping them discover groups of titles that they didn't even know existed. We are looking for creative & customer obsessed machine learning scientists who can apply the latest research, state of the art algorithms and ML to build highly scalable page personalization solutions. You'll be a research leader in the space and a hands-on ML practitioner, guiding and collaborating with talented teams of engineers and scientists and senior leaders in the Prime Video organization. You will also have the opportunity to publish your research at internal and external conferences. About the team Prime Video Recommendation Science team owns science solution to power recommendation and personalization experience on various Prime Video surfaces and devices. We work closely with the engineering teams to launch our solutions in production.
US, WA, Seattle
Revolutionize the Future of AI at the Frontier of Applied Science Are you a brilliant mind seeking to push the boundaries of what's possible with artificial intelligence? Join our elite team of researchers and engineers at the forefront of applied science, where we're harnessing the latest advancements in natural language processing, deep learning, and generative AI to reshape industries and unlock new realms of innovation. As an Applied Science Intern, you'll have the unique opportunity to work alongside world-renowned experts, gaining invaluable hands-on experience with cutting-edge technologies such as large language models, transformers, and neural networks. You'll dive deep into complex challenges, fine-tuning state-of-the-art models, developing novel algorithms for named entity recognition, and exploring the vast potential of generative AI. This internship is not just about executing tasks – it's about being a driving force behind groundbreaking discoveries. You'll collaborate with cross-functional teams, leveraging your expertise in statistics, recommender systems, and question answering to tackle real-world problems and deliver impactful solutions. Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated. Join us at the forefront of applied science, where your contributions will shape the future of AI and propel humanity forward. Seize this extraordinary opportunity to learn, grow, and leave an indelible mark on the world of technology. Amazon has positions available for LLM & GenAI Applied Science Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA; Pittsburgh, PA. Key job responsibilities We are particularly interested in candidates with expertise in: LLMs, NLP/NLU, Gen AI, Transformers, Fine-Tuning, Recommendation Systems, Deep Learning, NER, Statistics, Neural Networks, Question Answering. In this role, you will work alongside global experts to develop and implement novel, scalable algorithms and modeling techniques that advance the state-of-the-art in areas at the intersection of LLMs and GenAI. You will tackle challenging, groundbreaking research problems on production-scale data, with a focus on recommendation systems, question answering, deep learning and generative AI. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment. A day in the life - Collaborate with cross-functional teams to tackle complex challenges in natural language processing, computer vision, and generative AI. - Fine-tune state-of-the-art models and develop novel algorithms to push the boundaries of what's possible. - Explore the vast potential of generative AI and its applications across industries. - Attend cutting-edge research seminars and engage in thought-provoking discussions with industry luminaries. - Leverage state-of-the-art computing infrastructure and access to the latest research papers to fuel your innovation. - Present your groundbreaking work and insights to the team, fostering a culture of knowledge-sharing and continuous learning.
US, WA, Seattle
Unlock the Future with Amazon Science! Calling all visionary minds passionate about the transformative power of machine learning! Amazon is seeking boundary-pushing graduate student scientists who can turn revolutionary theory into awe-inspiring reality. Join our team of visionary scientists and embark on a journey to revolutionize the field by harnessing the power of cutting-edge techniques in bayesian optimization, time series, multi-armed bandits and more. At Amazon, we don't just talk about innovation – we live and breathe it. You'll conducting research into the theory and application of deep reinforcement learning. You will work on some of the most difficult problems in the industry with some of the best product managers, scientists, and software engineers in the industry. You will propose and deploy solutions that will likely draw from a range of scientific areas such as supervised, semi-supervised and unsupervised learning, reinforcement learning, advanced statistical modeling, and graph models. Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated. Join us at the forefront of applied science, where your contributions will shape the future of AI and propel humanity forward. Seize this extraordinary opportunity to learn, grow, and leave an indelible mark on the world of technology. Amazon has positions available for Machine Learning Applied Science Internships in, but not limited to Arlington, VA; Bellevue, WA; Boston, MA; New York, NY; Palo Alto, CA; San Diego, CA; Santa Clara, CA; Seattle, WA. Key job responsibilities We are particularly interested in candidates with expertise in: Optimization, Programming/Scripting Languages, Statistics, Reinforcement Learning, Causal Inference, Large Language Models, Time Series, Graph Modeling, Supervised/Unsupervised Learning, Deep Learning, Predictive Modeling In this role, you will work alongside global experts to develop and implement novel, scalable algorithms and modeling techniques that advance the state-of-the-art in areas at the intersection of Reinforcement Learning and Optimization within Machine Learning. You will tackle challenging, groundbreaking research problems on production-scale data, with a focus on developing novel RL algorithms and applying them to complex, real-world challenges. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment. A day in the life - Develop scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation. - Design, development and evaluation of highly innovative ML models for solving complex business problems. - Research and apply the latest ML techniques and best practices from both academia and industry. - Think about customers and how to improve the customer delivery experience. - Use and analytical techniques to create scalable solutions for business problems.
US, WA, Seattle
Shape the Future of Human-Machine Interaction Are you a master of natural language processing, eager to push the boundaries of conversational AI? Amazon is seeking exceptional graduate students to join our cutting-edge research team, where they will have the opportunity to explore and push the boundaries of natural language processing (NLP), natural language understanding (NLU), and speech recognition technologies. Imagine waking up each morning, fueled by the excitement of tackling complex research problems that have the potential to reshape the world. You'll dive into production-scale data, exploring innovative approaches to natural language understanding, large language models, reinforcement learning with human feedback, conversational AI, and multimodal learning. Your days will be filled with brainstorming sessions, coding sprints, and lively discussions with brilliant minds from diverse backgrounds. Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated.. Join us at the forefront of applied science, where your contributions will shape the future of AI and propel humanity forward. Seize this extraordinary opportunity to learn, grow, and leave an indelible mark on the world of technology. Amazon has positions available for Natural Language Processing & Speech Applied Science Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA. Key job responsibilities We are particularly interested in candidates with expertise in: NLP/NLU, LLMs, Reinforcement Learning, Human Feedback/HITL, Deep Learning, Speech Recognition, Conversational AI, Natural Language Modeling, Multimodal Learning. In this role, you will work alongside global experts to develop and implement novel, scalable algorithms and modeling techniques that advance the state-of-the-art in areas at the intersection of Natural Language Processing and Speech Technologies. You will tackle challenging, groundbreaking research problems on production-scale data, with a focus on natural language processing, speech recognition, text-to-speech (TTS), text recognition, question answering, NLP models (e.g., LSTM, transformer-based models), signal processing, information extraction, conversational modeling, audio processing, speaker detection, large language models, multilingual modeling, and more. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment. A day in the life - Develop novel, scalable algorithms and modeling techniques that advance the state-of-the-art in natural language processing, speech recognition, text-to-speech, question answering, and conversational modeling. - Tackle groundbreaking research problems on production-scale data, leveraging techniques such as LSTM, transformer-based models, signal processing, information extraction, audio processing, speaker detection, large language models, and multilingual modeling. - Collaborate with cross-functional teams to solve complex business problems, leveraging your expertise in NLP/NLU, LLMs, reinforcement learning, human feedback/HITL, deep learning, speech recognition, conversational AI, natural language modeling, and multimodal learning. - Thrive in a fast-paced, ever-changing environment, embracing ambiguity and demonstrating strong attention to detail.
US, WA, Seattle
Do you enjoy solving challenging problems and driving innovations in research? Do you want to create scalable optimization models and apply machine learning techniques to guide real-world decisions? We are looking for builders, innovators, and entrepreneurs who want to bring their ideas to reality and improve the lives of millions of customers. As a Research Science intern focused on Operations Research and Optimization intern, you will be challenged to apply theory into practice through experimentation and invention, develop new algorithms using modeling software and programming techniques for complex problems, implement prototypes and work with massive datasets. As you navigate through complex algorithms and data structures, you'll find yourself at the forefront of innovation, shaping the future of Amazon's fulfillment, logistics, and supply chain operations. Imagine waking up each morning, fueled by the excitement of solving intricate puzzles that have a direct impact on Amazon's operational excellence. Your day might begin by collaborating with cross-functional teams, exchanging ideas and insights to develop innovative solutions. You'll then immerse yourself in a world of data, leveraging your expertise in optimization, causal inference, time series analysis, and machine learning to uncover hidden patterns and drive operational efficiencies. Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated. Amazon has positions available for Operations Research Science Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA. Key job responsibilities We are particularly interested in candidates with expertise in: Optimization, Causal Inference, Time Series, Algorithms and Data Structures, Statistics, Operations Research, Machine Learning, Programming/Scripting Languages, LLMs In this role, you will gain hands-on experience in applying cutting-edge analytical techniques to tackle complex business challenges at scale. If you are passionate about using data-driven insights to drive operational excellence, we encourage you to apply. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment. A day in the life Develop and apply optimization, causal inference, and time series modeling techniques to drive operational efficiencies and improve decision-making across Amazon's fulfillment, logistics, and supply chain operations Design and implement scalable algorithms and data structures to support complex optimization systems Leverage statistical methods and machine learning to uncover insights and patterns in large-scale operations data Prototype and validate new approaches through rigorous experimentation and analysis Collaborate closely with cross-functional teams of researchers, engineers, and business stakeholders to translate research outputs into tangible business impact