This autonomous surface vehicle used for bathymetric mapping and water quality monitoring is part of a project being pursued by researchers at the Vehicle Autonomy and Intelligence Lab (VAIL) at Indiana University Bloomington.
Courtesy of Lantao Liu

How Lantao Liu and his team are helping robots adapt to challenges

The AWS Machine Learning Research Award winner is working to develop methods and open-source libraries that can potentially benefit the artificial intelligence and robotics communities.

Lantao Liu and his team at the Vehicle Autonomy and Intelligence Lab (VAIL) at Indiana University Bloomington want to help robots get better at navigating through complex and sometimes changing environments, while also boosting their ability to assess and process data. This challenge has significant applications, particularly in the realm of environmental modeling. Liu and his team are working to develop autonomous and machine learning methods and open-source libraries that can potentially benefit both the artificial intelligence and robotics communities.

“Machine learning algorithms are increasingly being developed for robotics missions. Many critical autonomy components are data-driven, where the data comes from onboard sensors such as LiDAR, sonar, and cameras,” says Liu, who is also an assistant professor in the university’s Department of Intelligent Systems Engineering in the Luddy School of Informatics, Computing, and Engineering.

Lantao Liu leads the Vehicle Autonomy and Intelligence Lab at Indiana University Bloomington.
Courtesy of Lantao Liu

“The robots typically have weak computational capacity due to their limited dimensions and payloads, yet they require online learning with data processed on the fly,” he adds. “Unfortunately, many methods for solving these tasks entail large computational costs that can be very challenging for the robots. The key challenges have been computational-theoretical due to the increased complexity of stochastic modeling, but also practical due to the synergy of integrating hardware and software systems as well as customizing algorithms on the robots.”

Liu’s 2019 Amazon Machine Learning Research Award allows VAIL to access and leverage Amazon’s cloud computing tools and services for thousands of hours, boosting their work on both machine learning and autonomous systems.

“My lab works on various decision-making problems for different types of robots including aerial, ground, and aquatic vehicles. Our objective is to develop methodologies for autonomous robots to enhance their autonomy and intelligence in environmental sensing and modeling, search and rescue, among other applications of societal importance,” explains Liu.

Environmental sensing, modeling, and monitoring

One project being pursued by VAIL researchers involves a process that maps environmental attributes of interest, such as pollution in the water or air, by collecting corresponding measurement samples from different locations so that a “distribution map” (environment model) can be reconstructed.

“This mapping mechanism is also called environmental state estimation, a learning process where the parameters of an underlying environment model must be learned using streams of incoming sampling data collected by robots,” Liu explains.

“However, the environments can be dynamic, as can the associated environmental attributes to be mapped. A drawback to using robots is that the collection of samples requires a series of sequential, ordered sampling operations (so the data may not represent the ground-truth map well), and the entire sampling process is time-consuming because the samples are typically spread over different spatial locations.

Environmental sensing, modeling, and monitoring using autonomous surface vehicles

“To provide a good estimate of the state of the environment at any time, the robot information-gathering sensing must be persistent to keep up with evolving environmental dynamics,” Liu explains. “One focus of our research has been developing principles that use data-driven methods to guide robots to learn the spatio-temporal and stochastic environment model, and utilize the learned model for path planning and decision-making solutions. This, in turn, benefits future environmental exploration and exploitation for subsequent modeling and monitoring.”

The VAIL team has been developing methods and software that can accurately characterize the spatiotemporal environment by designing a non-stationary modeling framework based on a variant of Gaussian processes (GPs).

“The map will not be the same everywhere,” says Liu. “There are locations on the map that vary more rapidly than others, and we need to accurately model both rapidly and slowly changing parts. It is even more challenging when the underlying map is dynamic, such as when we’re mapping pollution dispersion.

“In addition,” he explains, “the model computation must be fast for in-the-moment decisions. However, sensing data is continuously received, and the accumulated data quickly overwhelms the robots’ computing resources. To boost the learning performance, our researchers recently developed an adaptive learning approach where the key idea is a sparse approximation mechanism that incrementally incorporates the new incoming data with a learned model supported by ‘summarized old data.’”
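As a rough illustration of the Gaussian-process idea behind this kind of mapping, the sketch below fits a plain GP with a stationary squared-exponential kernel to a few one-dimensional samples and returns an estimate with its uncertainty at an unvisited location. It is not VAIL's non-stationary model or its sparse approximation; the kernel choice, the `noise` and `length_scale` values, and the function names are all assumptions made for illustration.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D sample locations."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(xs, ys, x_star, noise=1e-2, length_scale=1.0):
    """Posterior mean and variance of a zero-mean GP at location x_star."""
    n = len(xs)
    # Kernel matrix of the samples, with measurement noise on the diagonal.
    K = [[rbf(xs[i], xs[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_star, length_scale) for x in xs]
    alpha = solve(K, ys)       # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_star, x_star, length_scale) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

# Hypothetical measurements taken at four locations along a transect.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.sin(x) for x in xs]
mean, var = gp_predict(xs, ys, 1.5)
print(f"estimate at 1.5: {mean:.3f} (variance {var:.3f})")
```

The cost of the two linear solves grows quickly with the number of samples, which is one way to see why accumulated data can overwhelm a small onboard computer and why summarizing old data is attractive.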

Robotic anomaly detection

In a related project, the lab has been developing a generic robotic anomaly detection framework, motivated by field experiments.

“Commonly, robots in the field encounter sensing and behavioral anomalies,” Liu explains. “For example, one of the thrusters of the autonomous surface vehicle (ASV) might malfunction in operation, resulting in a forward motion becoming a turning motion. Or the ASV might get stuck in aquatic plants or other underwater obstacles, which are difficult to perceive using cameras or LiDARs. The inertial measurement unit (IMU) can be sensitive to external disturbances such as magnetic fields and provide drifting readings. Surrounding objects, such as a tall tree near the shore, might block the GPS signals, which leads to inaccurate localization. Sonar data can also be affected by dynamic underwater objects or environmental disturbances.

“Resilient and adaptive robotic systems require cognitive capabilities to avoid anomalies and recover and learn from failures with minimal human intervention,” Liu adds. “Equipping robots with the self-examination ability to detect sensing and behavioral faults is an essential step. The intuitive idea of anomaly detection is to develop some concept of normality and treat the observations that deviate considerably from that as anomalies.

“It is difficult, if not impossible, to handcraft a model representing the expected behaviors of different kinds of robots in various applications,” Liu explains. “The framework learns the concept of normality via deep representation learning and graph neural networks. We train the framework using contrastive learning in a semi-supervised manner that utilizes the information in a large amount of unlabeled data and, optionally, a small amount of labeled data. During the development of this framework, the AWS EC2 instances have drastically accelerated the prototyping, training, and testing processes. We are currently finalizing this framework and will open-source the software.

“Hopefully,” he adds, “it will also benefit the robotics and machine learning communities at large.”
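The intuitive idea Liu describes, building a notion of normality and flagging observations that deviate considerably from it, can be sketched with a far simpler stand-in than the lab's learned framework. The example below uses per-feature z-scores instead of deep representation learning; the sensor features (forward speed and yaw rate), the sample values, and the threshold are hypothetical.

```python
import statistics

def fit_normal_model(samples):
    """Per-feature mean and standard deviation from fault-free readings."""
    columns = list(zip(*samples))
    means = [statistics.fmean(c) for c in columns]
    # Guard against zero spread so the score below never divides by zero.
    stds = [statistics.pstdev(c) or 1e-9 for c in columns]
    return means, stds

def anomaly_score(model, reading):
    """Largest absolute z-score across the features of one reading."""
    means, stds = model
    return max(abs(x - m) / s for x, m, s in zip(reading, means, stds))

def is_anomaly(model, reading, threshold=4.0):
    """Flag a reading whose score exceeds the (assumed) threshold."""
    return anomaly_score(model, reading) > threshold

# Hypothetical (forward speed in m/s, yaw rate in rad/s) readings logged
# while the vehicle was commanded to drive straight with healthy thrusters.
normal_log = [(1.0, 0.01), (1.1, -0.02), (0.9, 0.00), (1.05, 0.02), (0.95, -0.01)]
model = fit_normal_model(normal_log)

print(is_anomaly(model, (1.0, 0.01)))   # typical straight-line motion
print(is_anomaly(model, (0.6, 0.5)))    # slow and turning, as with a failed thruster
```

A hand-built score like this only works when someone already knows which features matter, which is the limitation that motivates learning the concept of normality from data.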

Off-road autonomy

The AWS Machine Learning Research Award also helps VAIL research off-road autonomy.

“An important challenge is the stochastic modeling of unexpected robot behaviors,” he explains. “Basically, the robots operating in real-world complex environments need to reason about the long-term results of their physical interactions with the environment, but due to the high complexity of the real world, it is generally impossible to predict future events in an accurate manner.

“For example,” says Liu, “the effect of uneven road conditions or various disturbances on the robot’s motion is hard to model (or learn from data) precisely. It is even more challenging to model the interaction between the robot and the environment, especially when the environment is dynamic. Other representative scenarios include drones flying with strong winds or submarines moving under ocean currents, where air and water flows vary significantly in both space and time.

“Thus, it is necessary for the robots to consider these epistemic uncertainties caused by a lack of precise modeling of the environment while making decisions,” he explains. “We use Markov decision processes as a basis for modeling autonomous decision-making under uncertainty. The solution to these problems is a closed-loop policy that maximizes a long-term goal and satisfies the safety constraints under a probabilistic interaction model between the robot and the environment. In principle, the resulting policy can generate a sequence of motor commands that complete the task assigned by a human, given that the probabilistic model can well describe the uncertainty of the world, and the computational method can allow the robot to calculate the policy within a reasonable amount of time.

“However,” Liu continues, “many real-world problems are non-trivial, and obtaining the required probabilistic model of the world is generally impossible. Our research focuses on solving these two challenges by developing novel methods and leveraging the strong computational power of GPUs. Our current focus is on addressing the computational part of the challenge by developing two planning algorithms that allow the robot to reason about its continuous motion on complicated terrain surfaces based on the kernel method (mesh-free) and finite-element method (mesh-based). Both methods leverage a set of discrete elements to represent the value function over the continuous space. The computation over the discrete parts can be parallelized, which allows our robot to reason and compute optimal policies in real-time to navigate through complicated terrains safely and efficiently.”
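To make the Markov decision process framing concrete, here is a minimal sketch of value iteration on a small grid world in which each motion command can slip sideways. It uses a plain table of cells, not the kernel or finite-element representations Liu describes, and the grid size, rewards, slip probability, and discount factor are assumptions chosen for illustration.

```python
ACTIONS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}
# Directions the robot may slip into instead of the commanded one.
SLIPS = {"N": ("E", "W"), "S": ("E", "W"), "E": ("N", "S"), "W": ("N", "S")}

def move(state, action, rows, cols):
    """Next cell for an action; bumping into the boundary leaves the robot in place."""
    r, c = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    return (r, c) if 0 <= r < rows and 0 <= c < cols else state

def value_iteration(rows, cols, goal, hazards, p_ok=0.8, gamma=0.95, tol=1e-6):
    """Return the value function and a greedy closed-loop policy."""
    states = [(r, c) for r in range(rows) for c in range(cols)]
    terminal = {goal} | set(hazards)
    V = {s: 0.0 for s in states}

    def reward(s2):
        if s2 == goal:
            return 1.0
        return -1.0 if s2 in hazards else -0.01   # small cost per step

    def q(s, a):
        # Expected return of commanding `a` in `s` under the slip model.
        outcomes = [(p_ok, a)] + [((1 - p_ok) / 2, b) for b in SLIPS[a]]
        return sum(p * (reward(s2) + gamma * V[s2])
                   for p, b in outcomes
                   for s2 in [move(s, b, rows, cols)])

    while True:
        delta = 0.0
        for s in states:                 # each cell's backup is independent work
            if s in terminal:
                continue
            best = max(q(s, a) for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(ACTIONS, key=lambda a: q(s, a))
              for s in states if s not in terminal}
    return V, policy

V, policy = value_iteration(rows=3, cols=4, goal=(0, 3), hazards=[(1, 3)])
for r in range(3):
    print(" ".join(policy.get((r, c), "*") for c in range(4)))
```

The policy maps every cell to a command, so the robot can recover after a slip instead of following a fixed sequence. The per-cell backups inside the sweep are the kind of work over discrete elements that can be spread across parallel hardware.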

VAIL researchers have been working on using sampling methods to optimize over a class of parameterized policies.

Lantao Liu and his team used AWS cloud computing services to speed up computation and analyses of robot decision-making policies in a simulated scenario.

“To do so, we first need to sample a large number of robot trajectories under the current policy, which can be computed quickly by the parallel architecture of Nvidia GPU CUDA cores,” Liu explains. “We use a gradient-based method to optimize the policy parameters: the policy is updated by computing the policy parameter gradients based on the sampled trajectories. The gradient computation and policy update involve large matrix operations, which can also be parallelized by GPUs for real-time solutions. We leverage AWS computation for this task.”
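The sample-then-update loop described here can be sketched with a basic score-function (REINFORCE-style) policy gradient. The toy task below, a robot that must reach the end of a short corridor within a fixed number of steps, the one-parameter policy, and every numeric setting are hypothetical, and the rollouts run sequentially on a CPU where the lab's work samples them in parallel on GPUs.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rollout(theta, rng, goal=3, horizon=6):
    """Sample one trajectory; return (reward, d/dtheta of its log-probability)."""
    p_right = sigmoid(theta)             # policy: probability of stepping right
    pos, grad_log_prob = 0, 0.0
    for _ in range(horizon):
        if rng.random() < p_right:
            pos += 1
            grad_log_prob += 1.0 - p_right   # d/dtheta log(p_right)
        else:
            pos = max(0, pos - 1)
            grad_log_prob += -p_right        # d/dtheta log(1 - p_right)
        if pos == goal:
            return 1.0, grad_log_prob
    return 0.0, grad_log_prob

def train(theta=0.0, iterations=100, batch=200, lr=0.5, seed=0):
    """Gradient ascent on expected reward using sampled trajectories."""
    rng = random.Random(seed)
    for _ in range(iterations):
        samples = [rollout(theta, rng) for _ in range(batch)]
        # Subtracting the batch-average reward reduces the estimate's variance.
        baseline = sum(r for r, _ in samples) / batch
        grad = sum((r - baseline) * g for r, g in samples) / batch
        theta += lr * grad
    return theta

theta = train()
print(f"learned probability of stepping right: {sigmoid(theta):.2f}")
```

Each rollout is independent of the others, which is why a large batch maps naturally onto many parallel cores.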

Navigable space segmentation for navigation

Liu notes that the AWS resources have also been very useful for the team’s visual autonomy research. Visual information has become increasingly important for robotic autonomy because it provides rich information about the surrounding environment, and VAIL’s visual data processing capability has improved significantly thanks to breakthroughs in deep neural networks (DNNs). To develop deep learning approaches to visual perception, the team needs models with complicated learning architectures, huge volumes of data, and a variety of training strategies.

“A crucial capability for mobile robots to navigate in unknown environments is to construct obstacle-free space where the robot could move without collision,” Liu explains. “Roboticists have been developing methods for detecting such free space with the ray tracing of LiDAR beams to build occupancy maps in 2D or 3D space. Mapping methods with LiDAR require processing of large point cloud data, especially when a high-resolution LiDAR is used. As a much less expensive alternative, cameras have also been widely used for free space detection by leveraging DNNs to perform multi-class or binary-class segmentation of images.

Navigable space construction for robot visual navigation
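For readers unfamiliar with the LiDAR-based approach mentioned above, the following sketch shows one common way to build a 2D occupancy map: trace each beam across the grid with Bresenham's line algorithm, mark the cells it passes through as more likely free, and mark the cell where it ends as more likely occupied. The log-odds increments are illustrative values, and the code assumes the sensor and hit positions have already been converted to grid indices.

```python
def bresenham(x0, y0, x1, y1):
    """Grid cells on the line from (x0, y0) to (x1, y1), endpoints included."""
    cells = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        cells.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return cells

L_FREE, L_OCC = -0.4, 0.85   # assumed log-odds updates for "free" and "occupied"

def update_grid(log_odds, sensor, hit):
    """Apply one beam: cells before the hit become freer, the hit cell more occupied."""
    ray = bresenham(sensor[0], sensor[1], hit[0], hit[1])
    for x, y in ray[:-1]:
        log_odds[y][x] += L_FREE
    hx, hy = ray[-1]
    log_odds[hy][hx] += L_OCC

# A 5x5 map, one beam fired from the left edge that hits an obstacle at the right edge.
grid = [[0.0] * 5 for _ in range(5)]
update_grid(grid, sensor=(0, 2), hit=(4, 2))
for row in grid:
    print(" ".join(f"{v:+.2f}" for v in row))
```

A high-resolution scanner produces a very large number of beams per sweep, and each one requires a trace like this, which gives a sense of the processing load that makes camera-based segmentation an appealing alternative.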

“However,” he adds, “most existing DNN-based methods are built on a supervised-learning paradigm and rely on annotated datasets. The datasets usually contain a large amount of pixel-level annotated segmented images, which are prohibitively expensive and time-consuming to obtain for robotic applications in outdoor environments. To overcome limitations of fully supervised learning, we have been developing a new deep model based on variational auto-encoders. We target a representation learning-based framework to enable robots to learn navigable space segmentation in an unsupervised manner, with the aim of learning a polyline representation that compactly outlines the desired navigable space boundary. This is different from prevalent segmentation techniques which heavily rely on supervised learning strategies and typically demand immense pixel-level annotated images.

“We trained our model with the data from public datasets using GPUs,” Liu explains. “The large number of computing cores and memory space on AWS have enabled us to train our model fast and with high efficacy. This is crucial as it allows us to test and redesign models rapidly and provides great convenience to deploy the trained model to the robot systems.

“We then train our model with a small set of collected unlabeled images in real mission environments,” Liu adds. “Early testing shows that our model is able to detect navigable space in real time with high accuracy. The computational resources provided by Amazon have greatly accelerated our design process.”

Research areas

Related content

  • Staff writer
    December 29, 2025
    From foundation model safety frameworks and formal verification at cloud scale to advanced robotics and multimodal AI reasoning, these are the most viewed publications from Amazon scientists and collaborators in 2025.
  • Staff writer
    December 29, 2025
    From quantum computing breakthroughs and foundation models for robotics to the evolution of Amazon Aurora and advances in agentic AI, these are the posts that captured readers' attention in 2025.
  • Amazon Research Awards team
    November 25, 2025
    Awardees, who represent 41 universities in 8 countries, have access to Amazon public datasets, along with AWS AI/ML services and tools.
US, MA, N.reading
Amazon Industrial Robotics Group is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. At Amazon Industrial Robotics Group, we leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at an unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. We are pioneering the development of dexterous manipulation system that: - Enables unprecedented generalization across diverse tasks - Enables contact-rich manipulation in different environments - Seamlessly integrates low-level skills and high-level behaviors - Leverage mechanical intelligence, multi-modal sensor feedback and advanced control techniques. The ideal candidate will contribute to research that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. 
A day in the life - Work on design and implementation of methods for Visual SLAM, navigation and spatial reasoning - Leverage simulation and real-world data collection to create large datasets for model development - Develop a hierarchical system that combines low-level control with high-level planning - Collaborate effectively with multi-disciplinary teams to co-design hardware and algorithms for dexterous manipulation
US, NY, New York
We are seeking an Applied Scientist to lead the development of evaluation frameworks and data collection protocols for robotic capabilities. In this role, you will focus on designing how we measure, stress-test, and improve robot behavior across a wide range of real-world tasks. Your work will play a critical role in shaping how policies are validated and how high-quality datasets are generated to accelerate system performance. You will operate at the intersection of robotics, machine learning, and human-in-the-loop systems, building the infrastructure and methodologies that connect teleoperation, evaluation, and learning. This includes developing evaluation policies, defining task structures, and contributing to operator-facing interfaces that enable scalable and reliable data collection. The ideal candidate is highly experimental, systems-oriented, and comfortable working across software, robotics, and data pipelines, with a strong focus on turning ambiguous capability goals into measurable and actionable evaluation systems. 
Key job responsibilities - Design and implement evaluation frameworks to measure robot capabilities across structured tasks, edge cases, and real-world scenarios - Develop task definitions, success criteria, and benchmarking methodologies that enable consistent and reproducible evaluation of policies - Create and refine data collection protocols that generate high-quality, task-relevant datasets aligned with model development needs - Build and iterate on teleoperation workflows and operator interfaces to support efficient, reliable, and scalable data collection - Analyze evaluation results and collected data to identify performance gaps, failure modes, and opportunities for targeted data collection - Collaborate with engineering teams to integrate evaluation tooling, logging systems, and data pipelines into the broader robotics stack - Stay current with advances in robotics, evaluation methodologies, and human-in-the-loop learning to continuously improve internal approaches - Lead technical projects from conception through production deployment - Mentor junior scientists and engineers
US, WA, Seattle
Come be a part of a rapidly expanding $35 billion-dollar global business. At Amazon Business, a fast-growing startup passionate about building solutions, we set out every day to innovate and disrupt the status quo. We stand at the intersection of tech & retail in the B2B space developing innovative purchasing and procurement solutions to help businesses and organizations thrive. At Amazon Business, we strive to be the most recognized and preferred strategic partner for smart business buying. Bring your insight, imagination and a healthy disregard for the impossible. Join us in building and celebrating the value of Amazon Business to buyers and sellers of all sizes and industries. Unlock your career potential. Amazon Business Data Insights and Analytics team is looking for a Data Scientist to lead the research and thought leadership to drive our data and insights strategy for Amazon Business. This role is central in shaping the definition and execution of the long-term strategy for Amazon Business. You will be responsible for researching, experimenting and analyzing predictive and optimization models, designing and implementing advanced detection systems that analyze customer behavior at registration and throughout their journey. You will work on ambiguous and complex business and research science problems with large opportunities. You'll leverage diverse data signals including customer profiles, purchase patterns, and network associations to identify potential abuse and fraudulent activities. You are an analytical individual who is comfortable working with cross-functional teams and systems, working with state-of-the-art machine learning techniques and AWS services to build robust models that can effectively distinguish between legitimate business activities and suspicious behavior patterns You must be a self-starter and be able to learn on the go. Excellent written and verbal communication skills are required as you will work very closely with diverse teams. 
Key job responsibilities - Interact with business and software teams to understand their business requirements and operational processes - Frame business problems into scalable solutions - Adapt existing and invent new techniques for solutions - Gather data required for analysis and model building - Create and track accuracy and performance metrics - Prototype models by using high-level modeling languages such as R or in software languages such as Python. - Familiarity with transforming prototypes to production is preferred. - Create, enhance, and maintain technical documentation
US, TX, Austin
Amazon Leo is an initiative to launch a constellation of Low Earth Orbit satellites that will provide low-latency, high-speed broadband connectivity to unserved and underserved communities around the world. As a Systems Engineer, this role is primarily responsible for the design, development and integration of communication payload and customer terminal systems. The Role: Be part of the team defining the overall communication system and architecture of Amazon Leo’s broadband wireless network. This is a unique opportunity to innovate and define groundbreaking wireless technology at global scale. The team develops and designs the communication system for Leo and analyzes its overall system level performance such as for overall throughput, latency, system availability, packet loss etc. This role in particular will be responsible for leading the effort in designing and developing advanced technology and solutions for communication system. This role will also be responsible developing advanced physical layer + protocol stacks systems as proof of concept and reference implementation to improve the performance and reliability of the LEO network. In particular this role will be responsible for using concepts from digital signal processing, information theory, wireless communications to develop novel solutions for achieving ultra-high performance LEO network. This role will also be part of a team and develop simulation tools with particular emphasis on modeling the physical layer aspects such as advanced receiver modeling and abstraction, interference cancellation techniques, FEC abstraction models etc. This role will also play a critical role in the integration and verification of various HW and SW sub-systems as a part of system integration and link bring-up and verification. Export Control Requirement: Due to applicable export control laws and regulations, candidates must be a U.S. citizen or national, U.S. 
permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum.
US, MA, Boston
The Artificial General Intelligence (AGI) team is seeking a dedicated, skilled, and innovative Applied Scientist with a robust background in machine learning, statistics, quality assurance, auditing methodologies, and automated evaluation systems to ensure the highest standards of data quality, to build industry-leading technology with Large Language Models (LLMs) and multimodal systems. Key job responsibilities As part of the AGI team, an Applied Scientist will collaborate closely with core scientist team developing Amazon Nova models. They will lead the development of comprehensive quality strategies and auditing frameworks that safeguard the integrity of data collection workflows. This includes designing auditing strategies with detailed SOPs, quality metrics, and sampling methodologies that help Nova improve performances on benchmarks. The Applied Scientist will perform expert-level manual audits, conduct meta-audits to evaluate auditor performance, and provide targeted coaching to uplift overall quality capabilities. A critical aspect of this role involves developing and maintaining LLM-as-a-Judge systems, including designing judge architectures, creating evaluation rubrics, and building machine learning models for automated quality assessment. The Applied Scientist will also set up the configuration of data collection workflows and communicate quality feedback to stakeholders. An Applied Scientist will also have a direct impact on enhancing customer experiences through high-quality training and evaluation data that powers state-of-the-art LLM products and services. A day in the life An Applied Scientist with the AGI team will support quality solution design, conduct root cause analysis on data quality issues, research new auditing methodologies, and find innovative ways of optimizing data quality while setting examples for the team on quality assurance best practices and standards. 
Besides theoretical analysis and quality framework development, an Applied Scientist will also work closely with talented engineers, domain experts, and vendor teams to put quality strategies and automated judging systems into practice.
US, MA, Boston
The Artificial General Intelligence (AGI) team is seeking a dedicated, skilled, and innovative Applied Scientist with a robust background in machine learning, statistics, quality assurance, auditing methodologies, and automated evaluation systems to ensure the highest standards of data quality, to build industry-leading technology with Large Language Models (LLMs) and multimodal systems. Key job responsibilities As part of the AGI team, an Applied Scientist will collaborate closely with core scientist team developing Amazon Nova models. They will lead the development of comprehensive quality strategies and auditing frameworks that safeguard the integrity of data collection workflows. This includes designing auditing strategies with detailed SOPs, quality metrics, and sampling methodologies that help Nova improve performances on benchmarks. The Applied Scientist will perform expert-level manual audits, conduct meta-audits to evaluate auditor performance, and provide targeted coaching to uplift overall quality capabilities. A critical aspect of this role involves developing and maintaining LLM-as-a-Judge systems, including designing judge architectures, creating evaluation rubrics, and building machine learning models for automated quality assessment. The Applied Scientist will also set up the configuration of data collection workflows and communicate quality feedback to stakeholders. An Applied Scientist will also have a direct impact on enhancing customer experiences through high-quality training and evaluation data that powers state-of-the-art LLM products and services. A day in the life An Applied Scientist with the AGI team will support quality solution design, conduct root cause analysis on data quality issues, research new auditing methodologies, and find innovative ways of optimizing data quality while setting examples for the team on quality assurance best practices and standards. 
Besides theoretical analysis and quality framework development, an Applied Scientist will also work closely with talented engineers, domain experts, and vendor teams to put quality strategies and automated judging systems into practice.
US, MA, Boston
The Artificial General Intelligence (AGI) team is seeking a dedicated, skilled, and innovative Applied Scientist with a robust background in machine learning, statistics, quality assurance, auditing methodologies, and automated evaluation systems to ensure the highest standards of data quality, to build industry-leading technology with Large Language Models (LLMs) and multimodal systems. Key job responsibilities As part of the AGI team, an Applied Scientist will collaborate closely with core scientist team developing Amazon Nova models. They will lead the development of comprehensive quality strategies and auditing frameworks that safeguard the integrity of data collection workflows. This includes designing auditing strategies with detailed SOPs, quality metrics, and sampling methodologies that help Nova improve performances on benchmarks. The Applied Scientist will perform expert-level manual audits, conduct meta-audits to evaluate auditor performance, and provide targeted coaching to uplift overall quality capabilities. A critical aspect of this role involves developing and maintaining LLM-as-a-Judge systems, including designing judge architectures, creating evaluation rubrics, and building machine learning models for automated quality assessment. The Applied Scientist will also set up the configuration of data collection workflows and communicate quality feedback to stakeholders. An Applied Scientist will also have a direct impact on enhancing customer experiences through high-quality training and evaluation data that powers state-of-the-art LLM products and services. A day in the life An Applied Scientist with the AGI team will support quality solution design, conduct root cause analysis on data quality issues, research new auditing methodologies, and find innovative ways of optimizing data quality while setting examples for the team on quality assurance best practices and standards. 
Besides theoretical analysis and quality framework development, an Applied Scientist will also work closely with talented engineers, domain experts, and vendor teams to put quality strategies and automated judging systems into practice.
US, WA, Bellevue
We are seeking a passionate, talented, and inventive individual to join the Applied AI team and help build industry-leading technologies that customers will love. This team offers a unique opportunity to make a significant impact on the customer experience and contribute to the design, architecture, and implementation of a cutting-edge product. The mission of the Applied AI team is to enable organizations within Worldwide Amazon.com Stores to accelerate the adoption of AI technologies across various parts of our business. We are looking for a Senior Applied Scientist to join our Applied AI team to work on LLM-based solutions. On our team you will push the boundaries of ML and Generative AI techniques to scale the inputs for hundreds of billions of dollars of annual revenue for our eCommerce business. If you have a passion for AI technologies, a drive to innovate and a desire to make a meaningful impact, we invite you to become a valued member of our team. You will be responsible for developing and maintaining the systems and tools that enable us to accelerate knowledge operations and work in the intersection of Science and Engineering. You will push the boundaries of ML and Generative AI techniques to scale the inputs for hundreds of billions of dollars of annual revenue for our eCommerce business. If you have a passion for AI technologies, a drive to innovate and a desire to make a meaningful impact, we invite you to become a valued member of our team. We are seeking an experienced Scientist who combines superb technical, research, analytical and leadership capabilities with a demonstrated ability to get the right things done quickly and effectively. This person must be comfortable working with a team of top-notch developers and collaborating with our research teams. We’re looking for someone who innovates, and loves solving hard problems. 
You will be expected to have an established background in building highly scalable systems and system design, excellent project management skills, great communication skills, and a motivation to achieve results in a fast-paced environment. You should be somebody who enjoys working on complex problems, is customer-centric, and feels strongly about building good software as well as making that software achieve its operational goals.
IN, KA, Bengaluru
Do you want to lead the development of advanced machine learning systems that protect millions of customers and power a trusted global eCommerce experience? Are you passionate about modeling terabytes of data, solving highly ambiguous fraud and risk challenges, and driving step-change improvements through scientific innovation? If so, the Amazon Buyer Risk Prevention (BRP) Machine Learning team may be the right place for you. We are seeking a Senior Applied Scientist to define and drive the scientific direction of large-scale risk management systems that safeguard millions of transactions every day. In this role, you will lead the design and deployment of advanced machine learning solutions, influence cross-team technical strategy, and leverage emerging technologies, including Generative AI and LLMs, to build next-generation risk prevention platforms.
Key job responsibilities
Lead the end-to-end scientific strategy for large-scale fraud and risk modeling initiatives.
Define problem statements, success metrics, and long-term modeling roadmaps in partnership with business and engineering leaders.
Design, develop, and deploy highly scalable machine learning systems in real-time production environments.
Drive innovation using advanced ML, deep learning, and GenAI/LLM technologies to automate and transform risk evaluation.
Influence system architecture and partner with engineering teams to ensure robust, scalable implementations.
Establish best practices for experimentation, model validation, monitoring, and lifecycle management.
Mentor and raise the technical bar for junior scientists through reviews, technical guidance, and thought leadership.
Communicate complex scientific insights clearly to senior leadership and cross-functional stakeholders.
Identify emerging scientific trends and translate them into impactful production solutions.
US, MA, Boston
The Artificial General Intelligence (AGI) team is seeking a dedicated, skilled, and innovative Applied Scientist with a robust background in machine learning, statistics, quality assurance, auditing methodologies, and automated evaluation systems to ensure the highest standards of data quality and to build industry-leading technology with Large Language Models (LLMs) and multimodal systems.
Key job responsibilities
As part of the AGI team, an Applied Scientist will collaborate closely with the core science team developing Amazon Nova models. They will lead the development of comprehensive quality strategies and auditing frameworks that safeguard the integrity of data collection workflows. This includes designing auditing strategies with detailed SOPs, quality metrics, and sampling methodologies that help Nova improve performance on benchmarks. The Applied Scientist will perform expert-level manual audits, conduct meta-audits to evaluate auditor performance, and provide targeted coaching to uplift overall quality capabilities. A critical aspect of this role involves developing and maintaining LLM-as-a-Judge systems, including designing judge architectures, creating evaluation rubrics, and building machine learning models for automated quality assessment. The Applied Scientist will also set up the configuration of data collection workflows and communicate quality feedback to stakeholders. An Applied Scientist will also have a direct impact on enhancing customer experiences through high-quality training and evaluation data that powers state-of-the-art LLM products and services.
A day in the life
An Applied Scientist with the AGI team will support quality solution design, conduct root cause analysis on data quality issues, research new auditing methodologies, and find innovative ways of optimizing data quality while setting examples for the team on quality assurance best practices and standards.
Besides theoretical analysis and quality framework development, an Applied Scientist will also work closely with talented engineers, domain experts, and vendor teams to put quality strategies and automated judging systems into practice.