Using unlabeled data to improve sequence labeling

New method extends virtual adversarial training to sequence-labeling tasks, which assign different labels to different words of an input sentence.

Virtual adversarial training (VAT) is a way to improve machine learning systems by creating difficult-to-classify training examples through the addition of noise to unlabeled data. It has had great success on both image classification tasks and text classification tasks, such as determining the sentiment of a review or the topic of an article.

It’s not as well adapted, however, to sequence-labeling tasks, in which each word of an input phrase gets its own label. Largely, that’s because VAT is difficult to integrate with conditional random fields, a statistical-modeling method that has proven vital to state-of-the-art performance in sequence labeling.

Alexa researchers' new seqVAT procedure enables the use of virtual adversarial training (VAT) for networks with integrated conditional random fields (CRFs).

In a paper we presented this week at the annual meeting of the Association for Computational Linguistics, my colleagues and I describe a new way to integrate VAT with conditional random fields.

In experiments, we compared our system to its four best-performing predecessors on three different sequence-labeling tasks using semi-supervised learning, in which a small amount of labeled training data is supplemented with a large body of unlabeled data. On eight different data sets, our method outperformed all four baselines across the board.

Conventional adversarial training is a supervised learning technique: noise is added to labeled training examples to make them harder to classify, and the machine learning system is evaluated according to how well it predicts the labels.

VAT extends this approach to semi-supervised learning, which seeks to take advantage of unlabeled data. First, a model is trained on labeled data. Then, noise is added to a large body of unlabeled data, and the model is further trained on how well its classifications of the noisy versions of the unlabeled data match its classifications of the clean versions.
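
As a rough illustration of the consistency objective described above, the sketch below compares a toy classifier's predictions on a clean input with its predictions on a noisy copy. It is a simplification: real VAT finds the perturbation with a gradient-based search, while this toy version just tries a few random directions and keeps the most damaging one. The linear classifier and the names `predict`, `vat_loss`, `epsilon`, and `n_candidates` are hypothetical, chosen only for this example.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Toy linear classifier: one weight row per class (an assumption for this sketch)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), with a small eps to avoid taking the log of zero."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def vat_loss(weights, x, epsilon=0.5, n_candidates=16, seed=0):
    """Largest clean-vs-noisy divergence over a few random perturbations of norm epsilon.

    No label is needed: the clean prediction serves as the target.
    """
    rng = random.Random(seed)
    clean = predict(weights, x)
    worst = 0.0
    for _ in range(n_candidates):
        direction = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        noisy_x = [xi + epsilon * d / norm for xi, d in zip(x, direction)]
        worst = max(worst, kl_divergence(clean, predict(weights, noisy_x)))
    return worst
```

In training, a loss of this kind would be added to the supervised loss, so the model is penalized whenever small input noise changes its prediction.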

This approach depends on a comparison of aggregate statistics — the classifications of the clean and noisy data. But conditional random fields (CRFs) make that comparison more complicated.

Sequential dependencies

A CRF models the statistical relationships between successive items in a sequence, which is what makes it so useful for sequence-labeling tasks, such as determining parts of speech or identifying the entity types (song, singer, album, and so on) associated with each name in a sequence of words.

For instance, on a named-entity recognition task, a CRF could predict that a word that follows the name of a song is much more likely to be the name of a singer than that of a travel company. In many neural-network-based natural-language-understanding models, the last layer of the network is a CRF, which narrows the range of possible outputs that the model needs to evaluate.
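
To make the song-then-singer example concrete, here is a minimal sketch of how a linear-chain CRF might score a label sequence: per-word (emission) scores plus a score for each pair of adjacent labels. The label names and all of the numbers are invented for illustration and do not come from our model.

```python
# Hypothetical transition scores: higher means the CRF favors the second
# label immediately after the first. Pairs not listed get a default score.
TRANSITION = {
    ("SONG", "ARTIST"): 2.0,
    ("SONG", "TRAVEL_COMPANY"): -1.5,
}


def sequence_score(emissions, labels, transition, default=0.0):
    """Unnormalized score of one labeling: emission scores plus transition scores."""
    score = sum(emissions[t][label] for t, label in enumerate(labels))
    score += sum(transition.get(pair, default) for pair in zip(labels, labels[1:]))
    return score


# Emission scores for two consecutive words. The second word is ambiguous
# on its own: the network likes ARTIST and TRAVEL_COMPANY equally.
EMISSIONS = [
    {"SONG": 1.0, "ARTIST": 0.0, "TRAVEL_COMPANY": 0.0},
    {"SONG": 0.0, "ARTIST": 0.5, "TRAVEL_COMPANY": 0.5},
]

as_artist = sequence_score(EMISSIONS, ["SONG", "ARTIST"], TRANSITION)
as_travel = sequence_score(EMISSIONS, ["SONG", "TRAVEL_COMPANY"], TRANSITION)
```

With these made-up numbers, the transition term alone breaks the tie in favor of the artist reading, which is the kind of dependency a CRF layer contributes.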

VAT, however, isn’t designed to handle the sequential dependencies captured by CRFs. Consider, for instance, a named-entity recognizer that receives the input sequence “Play ‘Burn’ by Usher”. It should classify “Burn” as a song name and “Usher” as an artist name.

Conventional VAT could attempt to match the classifications of the noisy and clean versions of the word “Burn” and the classifications of the noisy and clean versions of the word “Usher”. But it wouldn’t try to match the statistical dependency learned by the CRF: that if “Burn” is a song name, “Usher” is much more likely to be an artist name than otherwise.

That’s the dependency that we set out to capture with our model, which we call seqVAT, for sequential VAT.

Combinatorial explosion

One way to model that dependency is to calculate the probabilities of complete sequences of labels. That is, there’s some probability that “Burn” is a song name and “Usher” is an artist name, that “Burn” is a song name and “Usher” is an album name, that “Burn” is the name of a restaurant and “Usher” the name of a nearby geographical landmark, and so on.
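
The brute-force version of this idea can be sketched in a few lines: score every possible labeling, then normalize the scores into probabilities. The function name and the score layout (`emissions[t][label]`, `transition[prev][next]`, with labels as integer indices) are assumptions made for this example.

```python
import itertools
import math


def enumerate_sequences(emissions, transition):
    """Return {label_sequence: probability}, computed by scoring every labeling."""
    n_labels = len(transition)
    scores = {}
    for seq in itertools.product(range(n_labels), repeat=len(emissions)):
        s = sum(emissions[t][y] for t, y in enumerate(seq))
        s += sum(transition[a][b] for a, b in zip(seq, seq[1:]))
        scores[seq] = s
    # Subtract the maximum before exponentiating, for numerical stability.
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    return {seq: math.exp(s - top) / z for seq, s in scores.items()}


# Toy scores for two words and two labels (illustrative values only).
EMISSIONS = [[1.0, 0.0], [0.0, 2.0]]
TRANSITION = [[0.5, 0.0], [0.0, 0.25]]
probabilities = enumerate_sequences(EMISSIONS, TRANSITION)
```

The loop visits `n_labels ** n_words` sequences. That is only four here, but with, say, 20 labels and a 10-word input it would be 20 to the 10th power, roughly 10 trillion.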

As the number of entity classes grows, however, enumerating the probability of every possible sequence of classifications rapidly becomes computationally intractable. So instead, we use an algorithm called the k-best Viterbi algorithm to efficiently find a short list (with k items) of the most likely label sequences.
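
For readers unfamiliar with it, the following is a minimal sketch of a k-best Viterbi search, not our production implementation. It assumes the same hypothetical score layout as a linear-chain CRF (`emissions[t][label]` and `transitions[prev][next]`, with integer label indices) and keeps, for every label at every position, only the k best partial paths that end in that label.

```python
import heapq


def k_best_viterbi(emissions, transitions, k):
    """Return the k highest-scoring label sequences as (score, path) pairs, best first."""
    n_labels = len(emissions[0])
    # beams[j] holds up to k (score, path) pairs for partial paths ending in label j.
    beams = [[(emissions[0][j], (j,))] for j in range(n_labels)]
    for t in range(1, len(emissions)):
        new_beams = []
        for j in range(n_labels):
            # Extend every surviving partial path with label j at position t.
            candidates = [
                (score + transitions[path[-1]][j] + emissions[t][j], path + (j,))
                for beam in beams
                for score, path in beam
            ]
            new_beams.append(heapq.nlargest(k, candidates))
        beams = new_beams
    return heapq.nlargest(k, [pair for beam in beams for pair in beam])


# Toy scores for two words and two labels (illustrative values only).
EMISSIONS = [[1.0, 0.0], [0.0, 2.0]]
TRANSITIONS = [[0.5, 0.0], [0.0, 0.25]]
top_two = k_best_viterbi(EMISSIONS, TRANSITIONS, k=2)
```

Each position considers at most `n_labels * n_labels * k` extensions, so the work grows linearly with sentence length rather than exponentially.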

From the probabilities of those sequences, we can estimate a probability distribution over the labels of the entire output sequence. We then train the network to minimize the difference between that probability distribution in the case of noisy, unlabeled examples and in the case of clean, unlabeled examples.
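
One plausible way to turn a k-item short list into a full probability distribution is sketched below. It is an assumption made for illustration, not a detail confirmed in this post: the probabilities of the k best sequences are kept as they are, and all remaining probability mass is lumped into one extra "everything else" bucket. The clean and noisy distributions can then be compared with a KL divergence. The function names and the example probabilities are hypothetical.

```python
import math


def with_remainder(path_probs):
    """Append one bucket holding the probability mass outside the k best sequences."""
    remainder = max(0.0, 1.0 - sum(path_probs))
    return list(path_probs) + [remainder]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), with a small eps to avoid taking the log of zero."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Probabilities the model assigns to the same three label sequences,
# first for a clean input and then for its noisy copy (made-up values).
clean = with_remainder([0.6, 0.2, 0.1])
noisy = with_remainder([0.4, 0.3, 0.1])

# The sequence-level consistency loss: zero only when noise leaves the
# distribution over whole label sequences unchanged.
sequence_loss = kl_divergence(clean, noisy)
```

Because the comparison is made over whole label sequences, a change in how labels depend on each other shows up in the loss even if each word's individual label distribution barely moves.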

In our experiments, in something of a departure from prior practice, we used one data set for the supervised portion of the training and a different but related data set for the semi-supervised portion. This more accurately simulates conditions in which the need for semi-supervised training tends to arise. Often, semi-supervised training is necessary precisely because labeled data is scarce or absent for the target application, although it’s available for related applications.

We compared seqVAT’s performance to that of three popular semi-supervised training approaches — self-training, entropy minimization, and cross-view training — and to that of conventional VAT, which seeks to minimize the distance between probability distributions over individual words in a sequence, rather than distributions over the sequence as a whole.

In the semi-supervised setting, seqVAT was consistently the best performer, while the second-best performer varied between cross-view training and conventional VAT.
