Updating neural networks to recognize new categories, with minimal retraining

Many of today’s most popular AI systems are, at their core, classifiers. They classify inputs into different categories: this image is a picture of a dog, not a cat; this audio signal is an instance of the word “Boston”, not the word “Seattle”; this sentence is a request to play a video, not a song.

But what happens if you need to add a new class to your classifier — if, say, someone releases a new type of automated household appliance that your smart-home system needs to be able to control?

The traditional approach to updating a classifier is to acquire a lot of training data for the new class, add it to all the data used to train the classifier initially, and train a new classifier on the combined data set. With today’s commercial AI systems, many of which were trained on millions of examples, this is a laborious process.

This week, at the 33rd conference of the Association for the Advancement of Artificial Intelligence (AAAI), my colleague Lingzhen Chen from the University of Trento and I are presenting a paper on techniques for updating a classifier using only training data for the new class.

As an example application, we consider a neural network that has been trained to identify people and organizations in online news articles. We show that it is possible to transfer that network and its learned parameters into a new network trained to identify an additional type of named entity — locations.

In particular, we found that the most effective technique is to keep the original classifier; pass its output through a separate network, which we call a neural adapter; and use the output of the neural adapter as an additional input to a second, parallel classifier, which is trained on just the data for the new class. The adapter and the new classifier are trained together, and at run time, the same input passes to both classifiers.

The output of our neural adapter joins the data flow of the new classifier (target network) just below the final layer (a conditional-random-field layer in one experiment, the final classification layer in the other).
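
The data flow described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: a single dense layer stands in for each network (the real adapter and classifiers may be different kinds of network), the weights are random and untrained, and every name and size here (`adapter_w`, `N_ADAPTER`, the five-element feature vector) is hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def random_matrix(rows, cols):
    """Untrained placeholder weights; a real system would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def dense(inputs, weights):
    """Each output node is a weighted sum of the inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

SOURCE_LABELS = ["PER", "ORG", "O"]         # classes the original network knows
TARGET_LABELS = ["PER", "ORG", "LOC", "O"]  # same classes plus the new one
N_FEATURES, N_HIDDEN, N_ADAPTER = 5, 4, 3   # hypothetical sizes

source_hidden_w = random_matrix(N_HIDDEN, N_FEATURES)
source_out_w = random_matrix(len(SOURCE_LABELS), N_HIDDEN)
adapter_w = random_matrix(N_ADAPTER, len(SOURCE_LABELS))
target_hidden_w = random_matrix(N_HIDDEN, N_FEATURES)
# The final layer sees the target network's hidden state plus the adapter output.
target_out_w = random_matrix(len(TARGET_LABELS), N_HIDDEN + N_ADAPTER)

def source_classifier(features):
    """The original classifier, kept as is."""
    hidden = [math.tanh(v) for v in dense(features, source_hidden_w)]
    return softmax(dense(hidden, source_out_w))

def adapted_classifier(features):
    """The same input passes to both classifiers."""
    source_probs = source_classifier(features)
    adapter_out = [math.tanh(v) for v in dense(source_probs, adapter_w)]
    target_hidden = [math.tanh(v) for v in dense(features, target_hidden_w)]
    joined = target_hidden + adapter_out  # join just below the final layer
    return softmax(dense(joined, target_out_w))

token_features = [0.2, -0.1, 0.7, 0.0, 0.4]
target_probs = adapted_classifier(token_features)
for label, prob in zip(TARGET_LABELS, target_probs):
    print(f"{label}: {prob:.3f}")
```

In training, only `adapter_w`, `target_hidden_w`, and `target_out_w` would be learned from the new-class data in this simplified setup; the sketch shows only the forward pass.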

The problem of adapting an existing network to new classes of data is an interesting one in general, but it’s particularly important to Alexa. Alexa scientists and engineers have poured a great deal of effort into Alexa’s core functionality, but through the Alexa Skills Kit, we’ve also enabled third-party developers to build their own Alexa skills — 70,000 and counting. The type of adaptation — or “transfer learning” — that we study in the new paper would make it possible for third-party developers to make direct use of our in-house systems without requiring access to in-house training data.

Modeled loosely on the human neural system, neural nets are networks of simple but densely interconnected processing nodes. Typically, those nodes are arranged into layers, and the output of each layer passes to the layer above it. The connections between layers have associated “weights” that determine how much the output of one node contributes to the computation performed by the next, and training is a matter of adjusting those weights. Input data is fed into the bottom layer, and the output of the top layer indicates the likelihood that the input fits into any of the available classes.
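
As a rough illustration of the layered computation described above, the following pure-Python sketch feeds an input into a bottom layer, passes the result up to an output layer, and converts the output scores into class likelihoods. The weights, layer sizes, activation function, and function names are invented for the example; they do not come from the paper.

```python
import math

def dense(inputs, weights, biases):
    """One layer: each node computes a weighted sum of its inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """A common nonlinearity: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]

def softmax(scores):
    """Turn raw scores into likelihoods that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical weights: 3 input features -> 2 hidden nodes -> 2 classes.
hidden_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0], [-0.5, 0.7]]
output_b = [0.0, 0.0]

def classify(features):
    hidden = relu(dense(features, hidden_w, hidden_b))  # bottom layer
    return softmax(dense(hidden, output_w, output_b))   # top layer

probs = classify([1.0, 0.5, -1.0])
print(probs)
```

Training would adjust the numbers in `hidden_w` and `output_w` until the likelihoods match the labeled examples.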

For our initial (pre-adaptation) classifiers, we evaluated two different network architectures. One includes a layer known as a conditional random field just under the output layer, and the other does not. Before adaptation, the network with the conditional-random-field (CRF) layer slightly outperforms the one without (91.35% accuracy versus 91.06%).

The first transfer-learning method we examine is to simply expand the size of the trained network’s output layer and the layers immediately beneath it, to accommodate the addition of the new class. Then we retrain the network on just the new data. We then compare this approach to the one that uses the neural adapter. The output of the neural adapter joins the data flow of the new network just below the final layer (the CRF layer in one case, the final classification layer in the other).
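
The first method, expanding the output layer, can be sketched as follows. This is a simplified illustration that assumes a plain dense output layer stored as a list of weight rows, one per class, and it widens only that final layer rather than the layers beneath it. The function name `expand_output_layer` and the initialization range are hypothetical.

```python
import random

def expand_output_layer(weights, biases, n_new_classes, rng):
    """Keep the trained rows and append freshly initialized rows for new classes."""
    n_inputs = len(weights[0])
    new_rows = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                for _ in range(n_new_classes)]
    expanded_weights = [row[:] for row in weights] + new_rows
    expanded_biases = list(biases) + [0.0] * n_new_classes
    return expanded_weights, expanded_biases

# Hypothetical trained output layer for three classes (PER, ORG, O),
# each row holding the weights from two hidden nodes.
old_w = [[0.4, -0.3], [0.1, 0.9], [-0.2, 0.2]]
old_b = [0.0, 0.1, -0.1]

# Add one row for the new class (LOC); the trained rows are carried over.
new_w, new_b = expand_output_layer(old_w, old_b, 1, random.Random(0))
print(len(old_w), "classes ->", len(new_w), "classes")
```

The expanded layer would then be retrained on the new data, starting from the carried-over weights rather than from scratch.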

For both initial architectures and both transfer-learning methods, we considered two cases: one in which only the weights of the top few layers were allowed to vary during retraining, and one in which the weights of the entire network were allowed to vary. Across the board, allowing all the weights to vary offered the best performance.
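
The difference between the two retraining regimes comes down to which weights an update step is allowed to touch. The sketch below assumes a plain gradient-descent update and uses made-up layer names, weights, and gradients; it is meant only to show the mechanism, not the training procedure used in the paper.

```python
def sgd_step(params, grads, trainable, lr=0.1):
    """Update only the layers flagged as trainable; copy the rest unchanged."""
    updated = {}
    for name, values in params.items():
        if trainable[name]:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)
    return updated

# Hypothetical three-layer network, flattened to short lists for brevity.
params = {"embedding": [0.5, -0.5], "hidden": [0.2, 0.1], "output": [1.0, -1.0]}
grads = {"embedding": [0.1, 0.1], "hidden": [0.2, -0.2], "output": [0.5, 0.5]}

top_only = {"embedding": False, "hidden": False, "output": True}  # lower layers frozen
all_layers = {name: True for name in params}                      # everything varies

frozen_result = sgd_step(params, grads, top_only)
full_result = sgd_step(params, grads, all_layers)

print("top only:  ", frozen_result)
print("all layers:", full_result)
```

With `top_only`, the lower layers keep exactly the values learned on the original data; with `all_layers`, every weight can move toward the new task.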

The best-performing post-adaptation network was the one that used both the CRF layer and the neural adapter. With that architecture, the performance of the adapted network on only the original data fell off slightly, from 91.35% to 91.08%, but that was still the best figure across all architectures and adaptation methods. And the performance on the new data was almost as good, at 90.73% accuracy.

Acknowledgments: Lingzhen Chen
