Responsible AI in the generative era

Generative AI raises new challenges in defining, measuring, and mitigating concerns about fairness, toxicity, and intellectual property, among other things. But work has started on the solutions.

In recent years, and even recent months, there have been rapid and dramatic advances in the technology known as generative AI. Generative AI models are trained on inconceivably massive collections of text, code, images, and other rich data. They are now able to produce, on demand, coherent and compelling stories, news summaries, poems, lyrics, paintings, and programs. The potential practical uses of generative AI are only just beginning to be understood but are likely to be manifold and revolutionary and to include writing aids, creative content production and refinement, personal assistants, copywriting, code generation, and much more.

Kearns with caption
Michael Kearns, a professor of computer and information science at the University of Pennsylvania and an Amazon Scholar.

There is thus considerable excitement about the transformations and new opportunities that generative AI may bring. There are also understandable concerns — some of them new twists on those of traditional responsible AI (such as fairness and privacy) and some of them genuinely new (such as the mimicry of artistic or literary styles). In this essay, I survey these concerns and how they might be addressed over time.

I will focus primarily on technical approaches to the risks, while acknowledging that social, legal, regulatory, and policy mechanisms will also have important roles to play. At Amazon, our hope is that such a balanced approach can significantly reduce the risks, while still preserving much of the excitement and usefulness of generative AI.

What is generative AI?

To understand what generative AI is and how it works, it is helpful to begin with the example of large language models (LLMs). Imagine the thought experiment in which we start with some sentence fragment like Once upon a time, there was a great ..., and we poll people on what word they would add next. Some might say wizard, others might say queen, monster, and so on. We would also expect that given the fairy tale nature of the fragment, words such as apricot or fork would be rather unlikely suggestions.

Related content
Model using ASR hypotheses as extra inputs reduces word error rate of human transcriptions by almost 11%.

If we poll a large enough population, a probability distribution over next words would begin to emerge. We could then randomly pick a word from that distribution (say wizard), and now our sequence would be one word longer — Once upon a time, there was a great wizard ... — and we could again poll for the next word. In this manner we could theoretically generate entire stories, and if we restarted the whole process, the crowd would produce an entirely different narrative due to the inherent randomness.

Dramatic advances in machine learning have effectively made this thought experiment a reality. But instead of polling crowds of people, we use a model to predict likely next words, one trained on a massive collection of documents — public collections of fiction and nonfiction, Wikipedia entries and news articles, transcripts of human dialogue, open-source code, and much more.

LLM objective.gif
An example of how a language model uses context to predict the next word in a sentence.

If the training data contains enough sentences beginning Once upon a time, there was a great …, it will be easy to sample plausible next words for our initial fragment. But LLMs can generalize and create as well, and not always in ways that humans might expect. The model might generate Once upon a time, there was a great storm based on occurrences of tremendous storm in the training data, combined with the learned synonymy of great and tremendous. This completion can happen despite great storm never appearing verbatim in the training data and despite the completions more expected by humans (like wizard and queen).

The resulting models are just as complex as their training data, often described by hundreds of billions of numbers (or parameters, in machine learning parlance), hence the “large” in LLM. LLMs have become so good that not only do they consistently generate grammatically correct text, but they create content that is coherent and often compelling, matching the tone and style of the fragments they were given (known as prompts). Start them with a fairy tale beginning, and they generate fairy tales; give them what seems to be the start of a news article, and they write a news-like article. The latest LLMs can even follow instructions rather than simply extend a prompt, as in Write lyrics about the Philadelphia Eagles to the tune of the Beatles song “Get Back”.

Related content
Models that map spoken language to objects in an image would make it easier for customers to communicate with multimodal devices.

Generative AI isn’t limited to text, and many models combine language and images, as in Create a painting of a skateboarding cat in the style of Andy Warhol. The techniques for building such systems are a bit more complex than for LLMs and involve learning a model of proximity between text and images, which can be done using data sources like captioned photos. If there are enough images containing cats that have the word cat in the caption, the model will capture the proximity between the word and pictures of cats.

The examples above suggest that generative AI is a form of entertainment, but many potential practical uses are also beginning to emerge, including generative AI as a writing tool (Shorten the following paragraphs and improve their grammar), for productivity (Extract the action items from this meeting transcript), for creative content (Propose logo designs for a startup building a dog-walking app), for simulating focus groups (Which of the following two product descriptions would Florida retirees find more appealing?), for programming (Give me a code snippet to sort a list of numbers), and many others.

So the excitement over the current and potential applications of generative AI is palpable and growing. But generative AI also gives rise to some new risks and challenges in the responsible use of AI and machine learning. And the likely eventual ubiquity of generative models in everyday life and work amplifies the stakes in addressing these concerns thoughtfully and effectively.

So what’s the problem?

The “generative” in generative AI refers to the fact that the technology can produce open-ended content that varies with repeated tries. This is in contrast to more traditional uses of machine learning, which typically solve very focused and narrow prediction problems.

For example, consider training a model for consumer lending that predicts whether an applicant would successfully repay a loan. Such a model might be trained using the lender’s data on past loans, each record containing applicant information (work history, financial information such as income, savings, and credit score, and educational background) along with whether the loan was repaid or defaulted.

Related content
NSF deputy assistant director Erwin Gianchandani on the challenges addressed by funded projects.

The typical goal would be to train a model that was as accurate as possible in predicting payment/default and then apply it to future applications to guide or make lending decisions. Such a model makes only lending outcome predictions and cannot generate fairy tales, improve grammar, produce whimsical images, write code, and so on. Compared to generative AI, it is indeed a very narrow and limited model.

But the very limitations also make the application of certain dimensions of responsible AI much more manageable. Consider the goal of making our lending model fair, which would typically be taken to mean the absence of demographic bias. For example, we might want to make sure that the error rate of the predictions of our model (and it generally will make errors, since even human loan officers are imperfect in predicting who will repay) is approximately equal on men and women. Or we might more specifically ask that the false-rejection rate — the frequency with which the model predicts default by an applicant who is in fact creditworthy — be the same across gender groups.

Once armed with this definition of fairness, we can seek to enforce it in the training process. In other words, instead of finding a model that minimizes the overall error rate, we find one that does so under the additional condition that the false-rejection rates on men and women are approximately equal (say, within 1% of each other). We might also want to apply the same notion of fairness to other demographic properties (such as young, middle aged, and elderly). But the point is that we can actually give reasonable and targeted definitions of fairness and develop training algorithms that enforce them.

It is also easy to audit a given model for its adherence to such notions of fairness (for instance, by estimating the error rates on both male and female applicants). Finally, when the predictive task is so targeted, we have much more control over the training data: we train on historical lending decisions only, and not on arbitrarily rich troves of general language, image, and code data.

Now consider the problem of making sure an LLM is fair. What might we even mean by this? Well, taking a cue from our lending model, we might ask that the LLM treat men and women equally. For instance, consider a prompt like Dr. Hanson studied the patient’s chart carefully, and then … . In service of fairness, we might ask that in the completions generated by an LLM, Dr. Hanson be assigned male and female pronouns with roughly equal frequency. We might argue that to do otherwise perpetuates the stereotype that doctors are typically male.

Related content
Method significantly reduces bias while maintaining comparable performance on machine learning tasks.

But then should we not also do this for mentions of nurses, firefighters, accountants, pilots, carpenters, attorneys, and professors? It’s clear that measuring just this one narrow notion of fairness will quickly become unwieldy. And it isn’t even obvious in what contexts it should be enforced. What if the prompt described Dr. Hanson as having a beard? What about the Women’s National Basketball Association (WNBA)? Should mention of a WNBA player in a prompt elicit male pronouns half the time?

Defining fairness for LLMs is even murkier than we suggest above, again because of the open-ended content they generate. Let’s turn from pronoun choices to tone. What if an LLM, when generating content about a woman, uses an ever-so-slightly more negative tone (in choice of words and level of enthusiasm) than when generating content about a man? Again, even detecting and quantifying such differences would be a very challenging technical problem. The field of sentiment analysis in natural-language processing might suggest some possibilities, but currently, it focuses on much coarser distinctions in narrower settings, such as distinguishing positive from negative sentiment in business news articles about particular corporations.

So one of the prices we pay for the rich, creative, open-ended content that generative AI can produce is that it becomes commensurately harder (compared to traditional predictive ML) to define, measure, and enforce fairness.

From fairness to privacy

In a similar vein, let’s consider privacy concerns. It is of course important that a consumer lending model not leak information about the financial or other data of the individual applicants in the training data. (One way this can happen is if model predictions are accompanied by confidence scores; if the model expresses 100% confidence that a loan application will default, it’s likely because that application, with a default outcome, was in the training data.) For this kind of traditional, more narrow ML, there are now techniques for mitigating such leaks by making sure model outputs are not overly dependent on any particular piece of training data.

Related content
Calibrating noise addition to word density in the embedding space improves utility of privacy-protected text.

But the open-ended nature of generative AI broadens the set of concerns from verbatim leaks of training data to more subtle copying phenomena. For example, if a programmer has written some code using certain variable names and then asks an LLM for help writing a subroutine, the LLM may generate code from its training data, but with the original variable names replaced with those chosen by the programmer. So the generated code is not literally in the training data but is different only in a cosmetic way.

There are defenses against these challenges, including curation of training data to exclude private information, and techniques to detect similarity of code passages. But more subtle forms of replication are also possible, and as I discuss below, this eventually bleeds into settings where generative AI reproduces the “style” of content in its training data.

And while traditional ML has begun developing techniques for explaining the decisions or predictions of trained models, they don’t always transfer to generative AI, in part because current generative models sometimes produce content that simply cannot be explained (such as scientific citations that don’t exist, something I’ll discuss shortly).

The special challenges of responsible generative AI

So the usual concerns of responsible AI become more difficult for generative AI. But generative AI also gives rise to challenges that simply don’t exist for predictive models that are more narrow. Let’s consider some of these.

Toxicity. A primary concern with generative AI is the possibility of generating content (whether it be text, images, or other modalities) that is offensive, disturbing, or otherwise inappropriate. Once again, it is hard to even define and scope the problem. The subjectivity involved in determining what constitutes toxic content is an additional challenge, and the boundary between restricting toxic content and censorship may be murky and context- and culture-dependent. Should quotations that would be considered offensive out of context be suppressed if they are clearly labeled as quotations? What about opinions that may be offensive to some users but are clearly labeled as opinions? Technical challenges include offensive content that may be worded in a very subtle or indirect fashion, without the use of obviously inflammatory language.

Related content
Prompt engineering enables researchers to generate customized training examples for lightweight “student” models.

Hallucinations. Considering the next-word distribution sampling employed by LLMs, it is perhaps not surprising that in more objective or factual use cases, LLMs are susceptible to what are sometimes called hallucinations — assertions or claims that sound plausible but are verifiably incorrect. For example, a common phenomenon with current LLMs is creating nonexistent scientific citations. If one of these LLMs is prompted with the request Tell me about some papers by Michael Kearns, it is not actually searching for legitimate citations but generating ones from the distribution of words associated with that author. The result will be realistic titles and topics in the area of machine learning, but not real articles, and they may include plausible coauthors but not actual ones.

In a similar vein, prompts for financial news stories result not in a search of (say) Wall Street Journal articles but news articles fabricated by the LLM using the lexicon of finance. Note that in our fairy tale generation scenario, this kind of creativity was harmless and even desirable. But current LLMs have no levers that let users differentiate between “creativity on” and “creativity off” use cases.

Related content
Combining contrastive training and selection of hard negative examples establishes new benchmarks.

Intellectual property. A problem with early LLMs was their tendency to occasionally produce text or code passages that were verbatim regurgitations of parts of their training data, resulting in privacy and other concerns. But even improvements in this regard have not prevented reproductions of training content that are more ambiguous and nuanced. Consider the aforementioned prompt for a multimodal generative model Create a painting of a skateboarding cat in the style of Andy Warhol. If the model is able to do so in a convincing yet still original manner because it was trained on actual Warhol images, objections to such mimicry may arise.

Plagiarism and cheating. The creative capabilities of generative AI give rise to worries that it will be used to write college essays, writing samples for job applications, and other forms of cheating or illicit copying. Debates on this topic are happening at universities and many other institutions, and attitudes vary widely. Some are in favor of explicitly forbidding any use of generative AI in settings where content is being graded or evaluated, while others argue that educational practices must adapt to, and even embrace, the new technology. But the underlying challenge of verifying that a given piece of content was authored by a person is likely to present concerns in many contexts.

Disruption of the nature of work. The proficiency with which generative AI is able to create compelling text and images, perform well on standardized tests, write entire articles on given topics, and successfully summarize or improve the grammar of provided articles has created some anxiety that some professions may be replaced or seriously disrupted by the technology. While this may be premature, it does seem that generative AI will have a transformative effect on many aspects of work, allowing many tasks previously beyond automation to be delegated to machines.

What can we do?

The challenges listed above may seem daunting, in part because of how unfamiliar they are compared to those of previous generations of AI. But as technologists and society learn more about generative AI and its uses and limitations, new science and new policies are already being created to address those challenges.

For toxicity and fairness, careful curation of training data can provide some improvements. After all, if the data doesn’t contain any offensive or biased words or phrases, an LLM simply won’t be able to generate them. But this approach requires that we identify those offensive phrases in advance and are certain that there are absolutely no contexts in which we would want them in the output. Use-case-specific testing can also help address fairness concerns — for instance, before generative AI is used in high-risk domains such as consumer lending, the model could be tested for fairness for that particular application, much as we might do for more narrow predictive models.

Related content
Amazon Visiting Academic Barbara Poblete helps to build safer, more-diverse online communities — and to aid disaster response.

For less targeted notions of toxicity, a natural approach is to train what we might call guardrail models that detect and filter out unwanted content in the training data, in input prompts, and in generated outputs. Such models require human-annotated training data in which varying types and degrees of toxicity or bias are identified, which the model can generalize from. In general, it is easier to control the output of a generative model than it is to curate the training data and prompts, given the extreme generality of the tasks we intend to address.

For the challenge of producing high-fidelity content free of hallucinations, an important first step is to educate users about how generative AI actually works, so there is no expectation that the citations or news-like stories produced are always genuine or factually correct. Indeed, some current LLMs, when pressed on their inability to quote actual citations, will tell the user that they are just language models that don’t verify their content with external sources. Such disclaimers should be more frequent and clear. And the specific case of hallucinated citations could be mitigated by augmenting LLMs with independent, verified citation databases and similar sources, using approaches such as retrieval-augmented generation. Another nascent but intriguing approach is to develop methods for attributing generated outputs to particular pieces of training data, allowing users to assess the validity of those sources. This could help with explainability as well.

Concerns around intellectual property are likely to be addressed over time by a mixture of technology, policy, and legal mechanisms. In the near term, science is beginning to emerge around various notions of model disgorgement, in which protected content or its effects on generative outputs are reduced or removed. One technology that might eventually prove relevant is differential privacy, in which a model is trained in a way that ensures that any particular piece of training data has negligible effects on the outputs the model subsequently produces.

Related content
By exploiting consistencies across components of ensemble classifiers, a new approach reduces data requirements by up to 89%.

Another approach is so-called sharding approaches, which divide the training data into smaller portions on which separate submodels are trained; the submodels are then combined to form the overall model. In order to undo the effects of any particular item of data on the overall model, we need only remove it from its shard and retrain that submodel, rather than retraining the entire model (which for generative AI would be sufficiently expensive as to be prohibitive).

Finally, we can consider filtering or blocking approaches, where before presentation to the user, generated content is explicitly compared to protected content in the training data or elsewhere and suppressed (or replaced) if it is too similar. Limiting the number of times any specific piece of content appears in the training data also proves helpful in reducing verbatim outputs.

Some interesting approaches to discouraging cheating using generative AI are already under development. One is to simply train a model to detect whether a given (say) text was produced by a human or by a generative model. A potential drawback is that this creates an arms race between detection models and generative AI, and since the purpose of generative AI is to produce high-quality content plausibly generated by a human, it’s not clear that detection methods will succeed in the long run.

An intriguing alternative is watermarking or fingerprinting approaches that would be implemented by the developers of generative models themselves. For example, since at each step LLMs are drawing from the distribution over the next word given the text so far, we can divide the candidate words into “red” and “green” lists that are roughly 50% of the probability each; then we can have the LLM draw only from the green list. Since the words on the green list are not known to users, the likelihood that a human would produce a 10-word sentence that also drew only from the green lists is ½ raised to the 10th power, which is only about 0.0009. In this way we can view all-green content as providing a virtual proof of LLM generation. Note that the LLM developers would need to provide such proofs or certificates as part of their service offering.

LLM watermarking.AI.gif
At each step, the model secretly divides the possible next words into green and red lists. The next word is then sampled only from the green list.
LLM watermarking.human.gif
A human generating a sentence is unaware of the division into green and red lists and is thus very likely to choose a sequence that mixes green and red words. Since, on long sentences, the likelihood of a human choosing an all-green sequence is vanishingly small, we can view all-green sentences as containing a proof they were generated by AI.

Disruption to work as we know it does not have any obvious technical defenses, and opinions vary widely on where things will settle. Clearly, generative AI could be an effective productivity tool in many professional settings, and this will at a minimum alter the current division of labor between humans and machines. It’s also possible that the technology will open up existing occupations to a wider community (a recent and culturally specific but not entirely ludicrous quip on social media was “English is the new programming language”, a nod to LLM code generation abilities) or even create new forms of employment, such as prompt engineer (a topic with its own Wikipedia entry, created in just February of this year).

But perhaps the greatest defense against concerns over generative AI may come from the eventual specialization of use cases. Right now, generative AI is being treated as a fascinating, open-ended playground in which our expectations and goals are unclear. As we have discussed, this open-endedness and the plethora of possible uses are major sources of the challenges to responsible AI I have outlined.

Related content
Technique that mixes public and private training data can meet differential-privacy criteria while cutting error increase by 60%-70%.

But soon more applied and focused uses will emerge, like some of those I suggested earlier. For instance, consider using an LLM as a virtual focus group — creating prompts that describe hypothetical individuals and their demographic properties (age, gender, occupation, location, etc.) and then asking the LLM which of two described products they might prefer.

In this application, we might worry much less about censoring content and much more about removing any even remotely toxic output. And we might choose not to eradicate the correlations between gender and the affinity for certain products in service of fairness, since such correlations are valuable to the marketer. The point is that the more specific our goals for generative AI are, the easier it is to make sensible context-dependent choices; our choices become more fraught and difficult when our expectations are vague.

Finally, we note that end user education and training will play a crucial role in the productive and safe use of generative AI. As the potential uses and harms of generative AI become better and more widely understood, users will augment some of the defenses I have outlined above with their own common sense.

Conclusion

Generative AI has stoked both legitimate enthusiasm and legitimate fears. I have attempted to partially survey the landscape of concerns and to propose forward-looking approaches for addressing them. It should be emphasized that addressing responsible-AI risks in the generative age will be an iterative process: there will be no “getting it right” once and for all. This landscape is sure to shift, with changes to both the technology and our attitudes toward it; the only constant will be the necessity of balancing the enthusiasm with practical and effective checks on the concerns.

Related content

LU, Luxembourg
Are you a MS or PhD student interested in a 2025 Internship in the field of machine learning, deep learning, speech, robotics, computer vision, optimization, quantum computing, automated reasoning, or formal methods? If so, we want to hear from you! We are looking for students interested in using a variety of domain expertise to invent, design and implement state-of-the-art solutions for never-before-solved problems. You can find more information about the Amazon Science community as well as our interview process via the links below; https://www.amazon.science/ https://amazon.jobs/content/en/career-programs/university/science https://amazon.jobs/content/en/how-we-hire/university-roles/applied-science Key job responsibilities As an Applied Science Intern, you will own the design and development of end-to-end systems. You’ll have the opportunity to write technical white papers, create roadmaps and drive production level projects that will support Amazon Science. You will work closely with Amazon scientists, and other science interns to develop solutions and deploy them into production. You will have the opportunity to design new algorithms, models, or other technical solutions whilst experiencing Amazon’s customer focused culture. The ideal intern must have the ability to work with diverse groups of people and cross-functional teams to solve complex business problems. A day in the life At Amazon, you will grow into the high impact, visionary person you know you’re ready to be. Every day will be filled with developing new skills and achieving personal growth. How often can you say that your work changes the world? At Amazon, you’ll say it often. Join us and define tomorrow. Some more benefits of an Amazon Science internship include; • All of our internships offer a competitive stipend/salary • Interns are paired with an experienced manager and mentor(s) • Interns receive invitations to different events such as intern program initiatives or site events • Interns can build their professional and personal network with other Amazon Scientists • Interns can potentially publish work at top tier conferences each year About the team Applicants will be reviewed on a rolling basis and are assigned to teams aligned with their research interests and experience prior to interviews. Start dates are available throughout the year and durations can vary in length from 3-6 months for full time internships. This role may available across multiple locations in the EMEA region (Austria, Estonia, France, Germany, Ireland, Israel, Italy, Luxembourg, Netherlands, Poland, Romania, Spain, UAE, and UK). Please note these are not remote internships.
LU, Luxembourg
At Global Mile Expansion team, our vision is to become the carrier of choice for all of our Selling Partners cross-border shipping needs, offering complete set of end to end cross border solutions from key manufacturing hubs to footprint countries supporting business who use Amazon to grow their business globally. As we expand, the need for comprehensive business insight and robust demand forecasting to aid decision making on asset utilization especially where we know demand will be variable becomes vital, as well as operational excellence. We are building business models involving large amounts of data and Macro economic inputs to produce the robust forecast to help the operational excellence and continue improving the customer experience. We are looking for an experienced economist who can apply innovative modelling techniques to real-world problems, and convert it to highly business-impacting solutions. Key job responsibilities - Experienced in using mathematical and statistical approach to create new, scalable solutions for business problems - Analyze and extract relevant information from business data to help automate and optimize key processes - Design, develop and evaluate highly innovative models for predictive learning - Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation - Research and implement statistical approaches to understand the business long-term and short-term trend and support the strategies
ES, Madrid
Are you a MS or PhD student interested in a 2025 Internship in the field of machine learning, deep learning, speech, robotics, computer vision, optimization, quantum computing, automated reasoning, or formal methods? If so, we want to hear from you! We are looking for students interested in using a variety of domain expertise to invent, design and implement state-of-the-art solutions for never-before-solved problems. You can find more information about the Amazon Science community as well as our interview process via the links below; https://www.amazon.science/ https://amazon.jobs/content/en/career-programs/university/science https://amazon.jobs/content/en/how-we-hire/university-roles/applied-science Key job responsibilities As an Applied Science Intern, you will own the design and development of end-to-end systems. You’ll have the opportunity to write technical white papers, create roadmaps and drive production level projects that will support Amazon Science. You will work closely with Amazon scientists, and other science interns to develop solutions and deploy them into production. You will have the opportunity to design new algorithms, models, or other technical solutions whilst experiencing Amazon’s customer focused culture. The ideal intern must have the ability to work with diverse groups of people and cross-functional teams to solve complex business problems. A day in the life At Amazon, you will grow into the high impact, visionary person you know you’re ready to be. Every day will be filled with developing new skills and achieving personal growth. How often can you say that your work changes the world? At Amazon, you’ll say it often. Join us and define tomorrow. Some more benefits of an Amazon Science internship include; • All of our internships offer a competitive stipend/salary • Interns are paired with an experienced manager and mentor(s) • Interns receive invitations to different events such as intern program initiatives or site events • Interns can build their professional and personal network with other Amazon Scientists • Interns can potentially publish work at top tier conferences each year About the team Applicants will be reviewed on a rolling basis and are assigned to teams aligned with their research interests and experience prior to interviews. Start dates are available throughout the year and durations can vary in length from 3-6 months for full time internships. This role may available across multiple locations in the EMEA region (Austria, Estonia, France, Germany, Ireland, Israel, Italy, Luxembourg, Netherlands, Poland, Romania, Spain, UAE, and UK). Please note these are not remote internships.
US, MA, Boston
Are you looking to work at the forefront of Machine Learning and AI? Would you be excited to apply cutting edge Generative AI algorithms to solve real world problems with significant impact? Machine learning (ML) has been strategic to Amazon from the early years. We are pioneers in areas such as recommendation engines, product search, eCommerce fraud detection, and large-scale optimization of fulfillment center operations. The AWS Industries Team at AWS helps AWS customers implement Generative AI solutions and realize transformational business opportunities for AWS customers in the most strategic industry verticals. This is a team of data scientists, engineers, and architects working step-by-step with customers to build bespoke solutions that harness the power of generative AI. The team helps customers imagine and scope the use cases that will create the greatest value for their businesses, select and train and fine tune the right models, define paths to navigate technical or business challenges, develop proof-of-concepts, and build applications to launch these solutions at scale. The AWS Industries team provides guidance and implements best practices for applying generative AI responsibly and cost efficiently. You will work directly with customers and innovate in a fast-paced organization that contributes to game-changing projects and technologies. You will design and run experiments, research new algorithms, and find new ways of optimizing risk, profitability, and customer experience. We’re looking for Applied Scientists capable of using GenAI and other techniques to design, evangelize, and implement state-of-the-art solutions for never-before-solved problems. Key job responsibilities As an Applied Scientist, you will- - Collaborate with AI/ML scientists, engineers, and architects to research, design, develop, and evaluate cutting-edge generative AI algorithms and build ML systems to address real-world challenges - Interact with customers directly to understand the business problem, help and aid them in implementation of generative AI solutions, deliver briefing and deep dive sessions to customers and guide customer on adoption patterns and paths to production - Create and deliver best practice recommendations, tutorials, blog posts, publications, sample code, and presentations adapted to technical, business, and executive stakeholder. Publish novel developments in internal and external papers, forums, and conferences - Provide customer and market feedback to Product and Engineering teams to help define product direction About the team ABOUT AWS: Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship and Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
US, VA, Arlington
Are you looking to work at the forefront of Machine Learning and AI? Would you be excited to apply cutting edge Generative AI algorithms to solve real world problems with significant impact? The AWS Industries Team at AWS helps AWS customers implement Generative AI solutions and realize transformational business opportunities for AWS customers in the most strategic industry verticals. This is a team of data scientists, engineers, and architects working step-by-step with customers to build bespoke solutions that harness the power of generative AI. The team helps customers imagine and scope the use cases that will create the greatest value for their businesses, select and train and fine tune the right models, define paths to navigate technical or business challenges, develop proof-of-concepts, and build applications to launch these solutions at scale. The AWS Industries team provides guidance and implements best practices for applying generative AI responsibly and cost efficiently. You will work directly with customers and innovate in a fast-paced organization that contributes to game-changing projects and technologies. You will design and run experiments, research new algorithms, and find new ways of optimizing risk, profitability, and customer experience. In this Data Scientist role you will be capable of using GenAI and other techniques to design, evangelize, and implement and scale cutting-edge solutions for never-before-solved problems. Key job responsibilities As a Data Scientist, you will- - Collaborate with AI/ML scientists, engineers, and architects to research, design, develop, and evaluate cutting-edge generative AI algorithms and build ML systems to address real-world challenges - Interact with customers directly to understand the business problem, help and aid them in implementation of generative AI solutions, deliver briefing and deep dive sessions to customers and guide customer on adoption patterns and paths to production - Create and deliver best practice recommendations, tutorials, blog posts, publications, sample code, and presentations adapted to technical, business, and executive stakeholder - Provide customer and market feedback to Product and Engineering teams to help define product direction About the team ABOUT AWS: Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship and Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
US, WA, Seattle
When customers search for products on the Amazon website, they often see brand advertisements displayed right below the search bar. These ads are part of the Sponsored Brands (SB) program. Our team, the SB Search and Relevance team, works on solving challenges to retrieve the most relevant ads for a customer's search query. A customer's search query is typically a short, free-form text consisting of just a few words. Our algorithm needs to understand the customer's underlying intention from this limited information. At the same time, each advertisement consists of various elements like text descriptions, images, videos, and more. Our algorithm also needs to comprehend the content of these ads and identify the most relevant one from the large pool of ad candidates. As Amazon's advertising business is growing rapidly, we are looking for experienced applied scientists. As an Applied Scientist on this team, you will: - Drive end-to-end Machine Learning projects that have a high degree of ambiguity, scale, complexity. - Apply deep learning and natural language processing to improve information retrieval and relevance. - Design and run A/B experiments. Evaluate the impact of your optimizations and communicate your results to various business stakeholders. - Optimize deep learning inference latency by utilizing methods like knowledge distillation. - Work with software development engineers and write code to bring models into production. - Recruit Applied Scientists to the team and provide mentorship. Impact and Career Growth - You will invent new experiences and influence customer-facing shopping experiences to help suppliers grow their retail business and the auction dynamics that leverage native advertising; this is your opportunity to work within the fastest-growing businesses across all of Amazon! - Define a long-term science vision for our advertising business, driven fundamentally from our customers' needs, translating that direction into specific plans for research and applied scientists, as well as engineering and product teams. This role combines science leadership, organizational ability, technical strength, product focus, and business understanding.
US, PA, Pittsburgh
Our mission is to create best-in-class AI agents that seamlessly integrate multimodal inputs like speech, images, and video, enabling natural, empathetic, and adaptive interactions. We develop cutting-edge Large Language Models (LLMs) that leverage advanced architectures, cross-modal learning, interpretability, and responsible AI techniques to provide coherent, context-aware responses augmented by real-time knowledge retrieval. We seek a talented Applied Scientist with expertise in LLMs, speech, audio, NLP, or multimodal learning to pioneer innovations in data simulation, representation, model pre-training/fine-tuning, generation, reasoning, retrieval, and evaluation. The ideal candidate will build scalable solutions for a variety of applications, such as streaming real-time conversational experiences, talking avatar interactions, customizable personalities, and conversational turn-taking. With a passion for pushing boundaries and rapid experimentation, you'll deliver high-impact solutions from research to customer-facing products and services. Key job responsibilities As an Applied Scientist, you'll leverage your expertise to research novel algorithms and modeling techniques to develop systems for real-world interactions with a focus on the speech modality. You'll develop neural efficiency algorithms, acquire and curate large, diverse datasets while ensuring privacy, creating robust evaluation metrics and test sets to comprehensively assess LLM performance. Integrating human-in-the-loop feedback, you'll iterate on data selection, sampling, and enhancement techniques to improve the core model performance. Your innovations will directly impact customers through new AI products and services.
US, CA, San Diego
Are you looking to work at the forefront of Machine Learning and AI? Would you be excited to apply cutting edge Generative AI algorithms to solve real world problems with significant impact? The AWS Industries Team at AWS helps AWS customers implement Generative AI solutions and realize transformational business opportunities for AWS customers in the most strategic industry verticals. This is a team of data scientists, engineers, and architects working step-by-step with customers to build bespoke solutions that harness the power of generative AI. The team helps customers imagine and scope the use cases that will create the greatest value for their businesses, select and train and fine tune the right models, define paths to navigate technical or business challenges, develop proof-of-concepts, and build applications to launch these solutions at scale. The AWS Industries team provides guidance and implements best practices for applying generative AI responsibly and cost efficiently. You will work directly with customers and innovate in a fast-paced organization that contributes to game-changing projects and technologies. You will design and run experiments, research new algorithms, and find new ways of optimizing risk, profitability, and customer experience. In this Data Scientist role you will be capable of using GenAI and other techniques to design, evangelize, and implement and scale cutting-edge solutions for never-before-solved problems. Key job responsibilities As a Senior Data Scientist, you will- - Collaborate with AI/ML scientists, engineers, and architects to research, design, develop, and evaluate cutting-edge generative AI algorithms and build ML systems to address real-world challenges - Interact with customers directly to understand the business problem, help and aid them in implementation of generative AI solutions, deliver briefing and deep dive sessions to customers and guide customer on adoption patterns and paths to production - Create and deliver best practice recommendations, tutorials, blog posts, publications, sample code, and presentations adapted to technical, business, and executive stakeholder - Provide customer and market feedback to Product and Engineering teams to help define product direction About the team ABOUT AWS: Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship and Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
US, WA, Bellevue
Amazon Last Mile builds global solutions that enable Amazon to attract an elastic supply of drivers, companies, and assets needed to deliver Amazon's and other shippers' volumes at the lowest cost and with the best customer delivery experience. Last Mile Science team owns the core decision models in the space of jurisdiction planning, delivery channel and modes network design, capacity planning for on the road and at delivery stations, routing inputs estimation and optimization. We also own scalable solutions to reduce risks, improve safety, enhance personalized experiences of our delivery associates and partners. Our research has direct impact on customer experience, driver and station associate experience, Delivery Service Partner (DSP)’s success and the sustainable growth of Amazon. We are looking for a passionate individual with strong machine learning and analytical skills to join its Last Mile Science team in the endeavor of designing and improving the most complex planning of delivery network in the world. As a Senior Data Scientist, you will work with software engineers, product managers, and business teams to understand the business problems and requirements, distill that understanding to crisply define the problem, and design and develop innovative solutions to address them. Our team is highly cross-functional and employs a wide array of scientific tools and techniques to solve key challenges, including supervised and unsupervised machine learning, non-convex optimization, causal inference, natural language processing, linear programming, reinforcement learning, and other forecast algorithms. Key job responsibilities Key job responsibilities * Drive end-to-end Machine Learning projects that have a high degree of ambiguity, scale and complexity. * Build Machine Learning models, perform proof-of-concept, experiment, optimize, and deploy your models into production; work closely with software engineers to assist in productionizing your ML models. * Run A/B experiments, gather data, and perform statistical analysis. * Measure and estimate risks, constructively critique peer research, and align research focuses with the Amazon's strategic needs. * Research new and innovative machine learning approaches. Help coach/mentor junior scientists in the team. * Willingness to publish research at internal and external top scientific venues. Write and pursue IP submissions.
US, CA, Pasadena
The AWS Center for Quantum Computing in Pasadena, CA, is looking to hire a Research Scientist specializing in Mixed-Signal Design. Working alongside other scientists and engineers, you will design and validate hardware performing the control and readout functions for AWS quantum processors. Candidates must have a strong background in mixed-signal design at the printed circuit board (PCB) level. Working effectively within a cross-functional team environment is critical. The ideal candidate will have a proven track record of hardware development through multiple product life-cycles, from requirements generation to design validation. AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Key job responsibilities Our scientists and engineers collaborate across diverse teams and projects to offer state of the art, cost effective solutions for the control of AWS quantum processor systems. You’ll bring a passion for innovation, collaboration, and mentoring to: Solve layered technical problems, often ones not encountered before, across our hardware and software stacks. Develop requirements with key system stakeholders, including quantum device, test and measurement, cryogenic hardware, and theory teams. Design, implement, test, deploy, and maintain innovative solutions that meet both strict performance and cost metrics. Provide mentorship to junior team members. Research enabling technologies necessary for AWS to produce commercially viable quantum computers.