20B-parameter Alexa model sets new marks in few-shot learning

With an encoder-decoder architecture — rather than decoder only — the Alexa Teacher Model excels other large language models on few-shot tasks such as summarization and machine translation.

Most major advances in AI have come from supervised learning, in which machine learning models are trained on annotated data. But as commercial AI models continue to increase in scale, relying on data annotation is becoming unsustainable.

At Alexa AI, we are moving to the new paradigm of generalizable intelligence, in which models can learn new concepts and transfer knowledge from one language or task to another with minimal human input. Such models allow us to efficiently develop new features and improve Alexa on multiple languages at the same time.

As part of this move, we have introduced Transformer-based large-scale multilingual language models we call Alexa Teacher Models (AlexaTM). Given only a few examples of a task in a new language, AlexaTM can transfer what it knows to the new language with no extra human supervision.

Related content
New method would enable BERT-based natural-language-processing models to handle longer text strings, run in resource-constrained settings — or sometimes both.

In a paper we’re presenting at this year’s Knowledge Discovery and Data Mining Conference (KDD), we showed that 10-billion- and two-billion-parameter AlexaTM models can improve on state-of-art cross-lingual transfer learning and increase Alexa’s accuracy in different locales.

In a follow-up paper, which we've published on arXiv, we have taken this line of research a step further, with a 20-billion-parameter generative model called AlexaTM 20B. The experiments reported in the paper — which use only publicly available data — show that AlexaTM 20B can not only transfer what it learns across languages but also learn new tasks from just a handful of examples (few-shot learning).

In the example below, the model is provided with three examples of different intents, or tasks that the customer wants executed: book-restaurant, play-music, and get-weather. The model can generalize from these to the unfamiliar intent get-news-update and generate utterances corresponding to that intent in different languages. This allows us to develop new features more rapidly, and in multiple languages, simultaneously.

Multilingual annotation.png
Using AlexaTM 20B to generate annotated data for a new intent in different languages.

Our work is inspired by recent work by OpenAI and the development of the GPT-3 model. However, where other large language models use decoder-only architectures, the AlexaTM 20B model is a sequence-to-sequence (seq2seq) encoder-decoder.

In an encoder-decoder architecture, the encoder produces a representation of an input text using a bidirectional encoding, and the decoder uses that representation to perform some task — historically, generating a translation of the input.

20B-encoder-decoder.gif
In a language model with an encoder-decoder architecture, the encoder produces a representation of an input text using a bidirectional encoding, and the decoder uses that representation to predict the next tokens (such as words and punctuation) in the sequence.

By contrast, the decoder-only model uses left-to-right (unidirectional) encoding of the input text. This works well for language modeling, in which the task is to predict the next token in a sequence based on those that precede it, but it’s less effective for machine translation and text summarization, the tasks on which AlexaTM 20B outperforms GPT-3.

Decoder-only.final.jpeg
A decoder-only language model uses left-to-right (unidirectional) encoding of the input text.

AlexaTM 20B also tops GPT-3 by being multilingual, supporting Arabic, English, French, German, Hindi, Italian, Japanese, Marathi, Portuguese, Spanish, Tamil, and Telugu. And its carbon footprint during training is only one-fifth of GPT-3’s, thanks to its lower parameter count and internal improvements to our training engine.

Related content
Determining the optimal architectural parameters reduces network size by 84% while improving performance on natural-language-understanding tasks.

To train AlexaTM 20B, we break with convention, training on a mix of denoising and causal-language-modeling (CLM) tasks. On the denoising task, the model is required to find dropped spans and generate the complete version of the input. This is similar to how other seq2seq models like T5 and BART are trained. On the CLM task, the model is required to meaningfully continue the input text. This is similar to how decoder-only models like GPT-3 and PaLM are trained.

Training on a mix of these two pretraining tasks enables AlexaTM 20B to generalize based on the given input and generate new text (the CLM task), while also performing well on tasks that seq2seq models are particularly good at, such as summarization and machine translation (the denoising task).

Pre-training objectives.png
AlexaTM 20B pre-training objectives. During pretraining, the model is trained on the denoising task 80% of the time and on causal language modeling (CLM) 20% of the time.

For example, we demonstrated that, given a single article-summarization pair, AlexaTM 20B can generate higher-quality summaries in English, German, and Spanish than the much larger PaLM 540B can (see example, below).

Related content
Human-evaluation studies validate metrics, and experiments show evidence of bias in popular language models.

Moreover, AlexaTM 20B achieves state-of-the-art performance in few-shot machine translation (MT) across almost all language pairs supported by the model on the Flores-101 dataset. The gains in translating to and from low-resource languages like Marathi, Tamil, and Telugu are particularly significant (e.g., 21.8 Arabic-to-Tamil sentence-piece BLEU score compared to 0.9 for the supervised M2M-124 615M model).

These results suggest that large-scale seq2seq-style pretraining, as formulated in our work, improves MT for languages with few training pairs, especially when a large amount of monolingual data is available for the target language. AlexaTM 20B has no difficulty translating directly from different languages, in contrast to many-to-many MT systems that require parallel translation data for training.

News summarization.png
News summarization by AlexaTM 20B when given only a single example. The input to the encoder is in the yellow box, the decoder’s output in the pink box.

AlexaTM 20B is the largest multilingual seq2seq model to date that is also capable of few-shot learning. We will be releasing the model publicly for non-commercial use to aid the development and evaluation of multilingual large language models (LLMs). We have also implemented a function to enable loading the model on up to eight GPUs with limited GPU memory for running inference on instances of Amazon Web Services’ EC2 computation service. This provides a more flexible way for researchers to use AlexaTM 20B in their own work.

In an analysis reported in our paper, we found that AlexaTM 20B, like other LLMs, has some likelihood of reproducing toxic language, social biases, and harmful stereotypes found in its training data. Therefore, we recommend that users conduct a full task-specific fairness-and-bias analysis before using the model to fully understand and address any potential harm that might arise from its use. Depending on the downstream application that AlexaTM 20B is being applied to, one or several of the prior techniques from the literature might be used to detoxify and debias the model. We reiterate the importance of task-specific fairness auditing and emphasize the need for more research on bias measurement and mitigation from the community.

All in all, we demonstrated in our work that the proposed style of pretraining enables seq2seq models that outperform much larger decoder-only LLMs across different tasks, both in a few-shot setting and with fine-tuning. We hope our work presents a compelling case for seq2seq models as a powerful alternative to decoder-only models for LLM training.

Research areas

Related content

BR, SP, Sao Paulo
Amazon launched the Generative AI Innovation Center in June 2023 to help AWS customers accelerate innovation and business success with Generative AI (https://press.aboutamazon.com/2023/6/aws-announces- generative -ai -innovation center). This Innovation Center provides opportunities to innovate in a fast-paced organization that contributes to breakthrough projects and technologies that are deployed across devices and the cloud. As a data scientist, you are proficient in designing and developing advanced generative AI solutions to solve diverse customer problems. You'll work with terabytes of text, images, and other types of data to solve real-world problems through Gen AI. You will work closely with account teams and ML strategists to define the use case, and with other ML scientists and engineers on the team to design experiments and find new ways to deliver customer value. The selected person will possess technical and customer-facing skills that will enable you to be part of the AWS technical team within our solution providers ecosystem/environment as well as directly to end customers. You will be able to lead discussion with customer and partner staff and senior management. A day in the life Here at AWS, we embrace our differences. We are committed to promoting our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in more than 190 branches around the world. We have innovative benefit offerings and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon's culture of inclusion is reinforced by our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and build trust. About the team Work/life balance Our team highly values work-life balance. It's not about how many hours you spend at home or at work; it's about the flow you establish that brings energy to both parts of your life. We believe that finding the right balance between your personal and professional life is fundamental to lifelong happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own work-life balance. Mentoring and career growth Our team is dedicated to supporting new members. We have a broad mix of experience levels and mandates and are building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one guidance and thorough but gentle code reviews. We care about your career growth and strive to assign projects based on what will help each team member become a more well-rounded engineer and enable them to take on more complex tasks in the future. We are open to hiring candidates to work out of one of the following locations: Sao Paulo, SP, BRA
MX, DIF, Mexico City
Amazon launched the Generative AI Innovation Center (GAIIC) in Jun 2023 to help AWS customers accelerate the use of Generative AI to solve business and operational problems and promote innovation in their organization (https://press.aboutamazon.com/2023/6/aws-announces-generative-ai-innovation-center). GAIIC provides opportunities to innovate in a fast-paced organization that contributes to game-changing projects and technologies that get deployed on devices and in the cloud. As a Data Science Manager in GAIIC, you'll partner with technology and business teams to build new GenAI solutions that delight our customers. You will be responsible for directing a team of data scientists, deep learning architects, and ML engineers to build generative AI models and pipelines, and deliver state-of-the-art solutions to customer’s business and mission problems. Your team will be working with terabytes of text, images, and other types of data to address real-world problems. The successful candidate will possess both technical and customer-facing skills that will allow them to be the technical “face” of AWS within our solution providers’ ecosystem/environment as well as directly to end customers. You will be able to drive discussions with senior technical and management personnel within customers and partners, as well as the technical background that enables them to interact with and give guidance to data/research/applied scientists and software developers. The ideal candidate will also have a demonstrated ability to think strategically about business, product, and technical issues. Finally, and of critical importance, the candidate will be an excellent technical team manager, someone who knows how to hire, develop, and retain high quality technical talent. AWS Sales, Marketing, and Global Services (SMGS) is responsible for driving revenue, adoption, and growth from the largest and fastest growing small- and mid-market accounts to enterprise-level customers including public sector. The AWS Global Support team interacts with leading companies and believes that world-class support is critical to customer success. AWS Support also partners with a global list of customers that are building mission-critical applications on top of AWS services. A day in the life A day in the life Here at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. About the team Work/Life Balance Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives. Mentorship & Career Growth Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future. We are open to hiring candidates to work out of one of the following locations: Mexico City, DIF, MEX
US, CA, Palo Alto
The Amazon Search Mission Understanding (SMU) team is at the forefront of revolutionizing the online shopping experience through the Amazon search page. Our ambition extends beyond facilitating a seamless shopping journey; we are committed to creating the next generation of intelligent shopping assistants. Leveraging cutting-edge Large Language Models (LLMs), we aim to redefine navigation and decision-making in e-commerce by deeply understanding our users' shopping missions, preferences, and goals. By developing responsive and scalable solutions, we not only accomplish the shopping mission but also foster unparalleled trust among our customers. Through our advanced technology, we generate valuable insights, providing a guided navigation system into various search missions, ensuring a comprehensive and holistic shopping experience. Our dedication to continuous improvement through constant measurement and enhancement of the shopper experience is crucial, as we strategically navigate the balance between immediate results and long-term business growth. We are seeking an Applied Scientist who is not just adept in the theoretical aspects of Machine Learning (ML), Artificial Intelligence (AI), and Large Language Models (LLMs) but also possesses a pragmatic, hands-on approach to navigating the complexities of innovation. The ideal candidate will have a profound expertise in developing, deploying, and contributing to the next-generation shopping search engine, including but not limited to Retrieval-Augmented Generation (RAG) models, specifically tailored towards enhancing the Rufus application—an integral part of our mission to revolutionize shopping assistance. You will take the lead in conceptualizing, building, and launching groundbreaking models that significantly improve our understanding of and capabilities in enhancing the search experience. A successful applicant will display a comprehensive skill set across machine learning model development, implementation, and optimization. This includes a strong foundation in data management, software engineering best practices, and a keen awareness of the latest developments in distributed systems technology. We are looking for individuals who are determined, analytically rigorous, passionate about applied sciences, creative, and possess strong logical reasoning abilities. Join the Search Mission Understanding team, a group of pioneering ML scientists and engineers dedicated to building core ML models and developing the infrastructure for model innovation. As part of Amazon Search, you will experience the dynamic, innovative culture of a startup, backed by the extensive resources of Amazon.com (AMZN), a global leader in internet services. Our collaborative, customer-centric work environment spans across our offices in Palo Alto, CA, and Seattle, WA, offering a unique blend of opportunities for professional growth and innovation. Key job responsibilities Collaborate with cross-functional teams to identify requirements for ML model development, focusing on enhancing mission understanding through innovative AI techniques, including retrieval-Augmented Generation or LLM in general. Design and implement scalable ML models capable of processing and analyzing large datasets to improve search and shopping experiences. Must have a strong background in machine learning, AI, or computational sciences. Lead the management and experiments of ML models at scale, applying advanced ML techniques to optimize science solution. Serve as a technical lead and liaison for ML projects, facilitating collaboration across teams and addressing technical challenges. Requires strong leadership and communication skills, with a PhD in Computer Science, Machine Learning, or a related field. We are open to hiring candidates to work out of one of the following locations: Palo Alto, CA, USA | Seattle, WA, USA
US, WA, Seattle
Amazon is looking for a passionate, talented, and inventive Senior Applied Scientist with a strong machine learning background to help build industry-leading language technology. Our mission is to provide a delightful experience to Amazon’s customers by pushing the envelope in Natural Language Processing (NLP), Generative AI, Large Language Model (LLM), Natural Language Understanding (NLU), Machine Learning (ML), Retrieval-Augmented Generation, Responsible AI, Agent, Evaluation, and Model Adaptation. As part of our AI team in Amazon AWS, you will work alongside internationally recognized experts to develop novel algorithms and modeling techniques to advance the state-of-the-art in human language technology. Your work will directly impact millions of our customers in the form of products and services, as well as contributing to the wider research community. You will gain hands on experience with Amazon’s heterogeneous text and structured data sources, and large-scale computing resources to accelerate advances in language understanding. The Science team at AWS Bedrock builds science foundations of Bedrock, which is a fully managed service that makes high-performing foundation models available for use through a unified API. We are adamant about continuously learning state-of-the-art NLP/ML/LLM technology and exploring creative ways to delight our customers. In our daily job we are exposed to large scale NLP needs and we apply rigorous research methods to respond to them with efficient and scalable innovative solutions. At AWS Bedrock, you’ll experience the benefits of working in a dynamic, entrepreneurial environment, while leveraging AWS resources, one of the world’s leading cloud companies and you’ll be able to publish your work in top tier conferences and journals. We are building a brand new team to help develop a new NLP service for AWS. You will have the opportunity to conduct novel research and influence the science roadmap and direction of the team. Come join this greenfield opportunity! Amazon Bedrock team is part of Utility Computing (UC) About the team AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Hybrid Work We value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our U.S. Amazon offices. We are open to hiring candidates to work out of one of the following locations: Seattle, WA, USA
US, WA, Seattle
Alexa Personality Fundamentals is chartered with infusing Alexa's trustworthy, reliable, considerate, smart, and playful personality. Come join us in creating the future of personality forward AI here at Alexa. Key job responsibilities As a Data Scientist with Alexa Personality, your work will involve machine learning, Large Language Model (LLM) and other generative technologies. You will partner with engineers, applied scientists, voice designers, and quality assurance to ensure that Alexa can sing, joke, and delight our customers in every interaction. You will take a central role in defining our experimental roadmap, sourcing training data, authoring annotation criteria and building automated benchmarks to track the improvement of our Alexa's personality. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA | Seattle, WA, USA
US, CA, Palo Alto
The Amazon Search Mission Understanding (SMU) team is at the forefront of revolutionizing the online shopping experience through the Amazon search page. Our ambition extends beyond facilitating a seamless shopping journey; we are committed to creating the next generation of intelligent shopping assistants. Leveraging cutting-edge Large Language Models (LLMs), we aim to redefine navigation and decision-making in e-commerce by deeply understanding our users' shopping missions, preferences, and goals. By developing responsive and scalable solutions, we not only accomplish the shopping mission but also foster unparalleled trust among our customers. Through our advanced technology, we generate valuable insights, providing a guided navigation system into various search missions, ensuring a comprehensive and holistic shopping experience. Our dedication to continuous improvement through constant measurement and enhancement of the shopper experience is crucial, as we strategically navigate the balance between immediate results and long-term business growth. We are seeking an Applied Scientist who is not just adept in the theoretical aspects of Machine Learning (ML), Artificial Intelligence (AI), and Large Language Models (LLMs) but also possesses a pragmatic, hands-on approach to navigating the complexities of innovation. The ideal candidate will have a profound expertise in developing, deploying, and contributing to the next-generation shopping search engine, including but not limited to Retrieval-Augmented Generation (RAG) models, specifically tailored towards enhancing the Rufus application—an integral part of our mission to revolutionize shopping assistance. You will take the lead in conceptualizing, building, and launching groundbreaking models that significantly improve our understanding of and capabilities in enhancing the search experience. A successful applicant will display a comprehensive skill set across machine learning model development, implementation, and optimization. This includes a strong foundation in data management, software engineering best practices, and a keen awareness of the latest developments in distributed systems technology. We are looking for individuals who are determined, analytically rigorous, passionate about applied sciences, creative, and possess strong logical reasoning abilities. Join the Search Mission Understanding team, a group of pioneering ML scientists and engineers dedicated to building core ML models and developing the infrastructure for model innovation. As part of Amazon Search, you will experience the dynamic, innovative culture of a startup, backed by the extensive resources of Amazon.com (AMZN), a global leader in internet services. Our collaborative, customer-centric work environment spans across our offices in Palo Alto, CA, and Seattle, WA, offering a unique blend of opportunities for professional growth and innovation. Key job responsibilities Collaborate with cross-functional teams to identify requirements for ML model development, focusing on enhancing mission understanding through innovative AI techniques, including retrieval-Augmented Generation or LLM in general. Design and implement scalable ML models capable of processing and analyzing large datasets to improve search and shopping experiences. Must have a strong background in machine learning, AI, or computational sciences. Lead the management and experiments of ML models at scale, applying advanced ML techniques to optimize science solution. Serve as a technical lead and liaison for ML projects, facilitating collaboration across teams and addressing technical challenges. Requires strong leadership and communication skills, with a PhD in Computer Science, Machine Learning, or a related field. We are open to hiring candidates to work out of one of the following locations: Palo Alto, CA, USA | Seattle, WA, USA
US, WA, Bellevue
The Artificial General Intelligence (AGI) team is looking for a passionate, talented, and inventive Applied Science Manager with a strong deep learning background, to lead the development of industry-leading technology with multimodal systems. Key job responsibilities As an Applied Science Manager with the AGI team, you will lead the development of novel algorithms and modeling techniques to advance the state of the art with multimodal systems. Your work will directly impact our customers in the form of products and services that make use of vision and language technology. You will leverage Amazon’s heterogeneous data sources and large-scale computing resources to accelerate development with multimodal Large Language Models (LLMs) and Generative Artificial Intelligence (GenAI) in Computer Vision. About the team The AGI team has a mission to push the envelope with multimodal LLMs and GenAI in Computer Vision, in order to provide the best-possible experience for our customers. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA | Seattle, WA, USA | Sunnyvale, CA, USA
US, WA, Bellevue
Do you enjoy solving complex problems, driving research innovation, and creating insightful models that tackle real-world challenges? Join Amazon's Modeling and Optimization team. Our science models and data-driven solutions continuously reshape Amazon global supply chain - one of the most sophisticated networks in the world. Key job responsibilities In this role, you will use science to drive measurable improvements across customer experience, network speed, cost efficiency, safety, sustainability, and capital investment returns. You will collaborate with scientists to solve complex problems and with cross-functional teams to analyze systems and drive business value. You will develop optimization, simulation, and predictive models to identify improvement opportunities. You will develop innovative, scalable solutions. You will quantify expected improvements and evaluate trade-offs between competing objectives. You will communicate model insights to stakeholders and influence positive changes in Amazon's systems and operations. A day in the life Collaboration will be key - you will collaborate with scientists to design end-to-end solutions, work with business stakeholders to simplify and streamline processes, and partner with engineers to simplify systems and enhance their performances. The focus is on driving value through scientific thinking, technical knowledge, simplification, and cross-functional teamwork. About the team Our team of scientists specializes in network modeling, optimization, algorithms, control theory, machine learning and related disciplines. Our focus is driving supply chain improvements through applied science. By analyzing data and building insightful models, we identify opportunities and influence positive change across Amazon's end-to-end systems and operations - from vendors to customers. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA
US, MA, Boston
The Artificial General Intelligence (AGI) - Automations team is developing AI technologies to automate workflows, processes for browser automation, developers and ops teams. As part of this, we are developing services and inference engine for these automation agents, and techniques for reasoning, planning, and modeling workflows. If you are interested in a startup mode team in Amazon to build the next level of agents then come join us. Scientists in AGI - Automations will develop cutting edge multimodal LLMs to observe, model and derive insights from manual workflows to automate them. You will get to work in a joint scrum with engineers for rapid invention, develop cutting edge automation agent systems, and take them to launch for millions of customers. Key job responsibilities - Build automation agents by developing novel multimodal LLMs. A day in the life An Applied Scientist with the AGI team will support the science solution design, run experiments, research new algorithms, and find new ways of optimizing the customer experience.; while setting examples for the team on good science practice and standards. Besides theoretical analysis and innovation, an Applied Scientist will also work closely with talented engineers and scientists to put algorithms and models into practice. We are open to hiring candidates to work out of one of the following locations: Boston, MA, USA
US, MA, Boston
The Artificial General Intelligence (AGI) - Automations team is developing AI technologies to automate workflows, processes for browser automation, developers and ops teams. As part of this, we are developing services and inference engine for these automation agents, and techniques for reasoning, planning, and modeling workflows. If you are interested in a startup mode team in Amazon to build the next level of agents then come join us. Scientists in AGI - Automations will develop cutting edge multimodal LLMs to observe, model and derive insights from manual workflows to automate them. You will get to work in a joint scrum with engineers for rapid invention, develop cutting edge automation agent systems, and take them to launch for millions of customers. Key job responsibilities - Build automation agents by developing novel multimodal LLMs. A day in the life An Applied Scientist with the AGI team will support the science solution design, run experiments, research new algorithms, and find new ways of optimizing the customer experience.; while setting examples for the team on good science practice and standards. Besides theoretical analysis and innovation, an Applied Scientist will also work closely with talented engineers and scientists to put algorithms and models into practice. We are open to hiring candidates to work out of one of the following locations: Boston, MA, USA