Amazon pushes the boundaries of extreme multilabel classification

Two NeurIPS papers examine the assignment of the same label to multiple clusters and the fast training of Transformer-based models.

In the past few years, we’ve published a number of papers about extreme multilabel classification (XMC), or classifying input data when the number of candidate labels is huge.

Example of a label-partitioning model.

Earlier this year, we publicly released the code for our own XMC framework, PECOS, which makes XMC more efficient through label partitioning. With label partitioning, labels are first grouped into clusters, and a matcher model is trained to assign inputs to clusters. Then, a ranker is trained to select a single label from the designated group for a given input.
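To make the two-stage idea concrete, here is a minimal sketch of a matcher followed by a ranker. It does not use the PECOS API; the clusters, the keyword sets, and the function names (`match`, `rank`, `classify`) are all hypothetical, and simple keyword overlap stands in for the trained models.

```python
from typing import Dict, List, Set

# Hypothetical toy clusters; in a real system these are built automatically.
CLUSTERS: Dict[str, List[str]] = {
    "fruit": ["apple", "banana", "cherry"],
    "devices": ["laptop", "phone", "tablet"],
}
# Keyword sets stand in for the trained matcher and ranker (assumption).
CLUSTER_KEYWORDS: Dict[str, Set[str]] = {
    "fruit": {"ripe", "sweet", "juicy"},
    "devices": {"screen", "battery", "charger"},
}
LABEL_KEYWORDS: Dict[str, Set[str]] = {
    "apple": {"crisp", "red"},
    "banana": {"yellow", "peel"},
    "cherry": {"pit", "stem"},
    "laptop": {"keyboard", "hinge"},
    "phone": {"call", "sim"},
    "tablet": {"stylus", "touch"},
}


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokenization, enough for this toy example."""
    return set(text.lower().split())


def match(text: str, top_clusters: int = 1) -> List[str]:
    """Matcher step: keep the clusters whose keywords best overlap the input."""
    words = tokens(text)
    ranked = sorted(
        CLUSTERS, key=lambda c: len(words & CLUSTER_KEYWORDS[c]), reverse=True
    )
    return ranked[:top_clusters]


def rank(text: str, clusters: List[str], top_labels: int = 1) -> List[str]:
    """Ranker step: score only the labels inside the matched clusters."""
    words = tokens(text)
    candidates = [label for c in clusters for label in CLUSTERS[c]]
    ranked = sorted(
        candidates, key=lambda l: len(words & LABEL_KEYWORDS[l]), reverse=True
    )
    return ranked[:top_labels]


def classify(text: str) -> List[str]:
    """Full pipeline: match to clusters first, then rank labels within them."""
    return rank(text, match(text))


print(classify("ripe sweet yellow fruit"))
```

The point of the structure is that the ranker only ever scores the labels in the matched clusters, not the whole label space, which is where the efficiency gain comes from.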

At this year’s Conference on Neural Information Processing Systems (NeurIPS), we’re presenting two papers that extend the range of label-partitioning frameworks — including but not limited to PECOS — and improve their classification accuracy.

In “Label disentanglement in partition-based extreme multilabel classification”, we consider the case in which the same label belongs to multiple clusters: for instance, the label “apple” might properly belong to one cluster designating computing devices and another designating fruits. We demonstrate a method for assigning labels to multiple clusters that improves classification accuracy with a negligible effect on efficiency.

In “Fast multi-resolution transformer fine-tuning for extreme multi-label text classification”, we propose a new method for training Transformer-based matching models that reduces training time by 95% while actually increasing accuracy.

Training (left) of the Transformer-based matching model (XR-Transformer) begins with a preliminary hierarchical label tree (HLT). For each layer of the tree, we jointly train a Transformer-based encoder and a linear ranker (Ŵ). Once the Transformer-based encoder has been trained, we train new linear rankers (W), which are used at inference time (right).

Label disentanglement

PECOS is a flexible framework that allows variation in the implementation of XMC models, and one common approach to label clustering is to use a hierarchical tree, where the labels are first divided into a few, coarse-grained groups, then successively subdivided into finer- and finer-grained groups. The matcher is then trained to traverse the tree to find the lowest-level label cluster.

Typically, the tree is constructed using some fixed measure of label similarity, such as term frequency–inverse document frequency (TF-IDF), which identifies terms that predominate in a given text relative to a larger corpus of texts. In most existing label-partitioning approaches, each label ends up in exactly one low-level cluster.
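As a reminder of how TF-IDF weights terms, here is a minimal sketch of one common variant: term frequency normalized by document length, multiplied by an unsmoothed logarithmic inverse document frequency. The exact weighting used in our experiments is not specified here, so the formula and the function name `tf_idf` are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append(
            {
                term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


vectors = tf_idf(["apple pie recipe", "apple laptop review", "laptop battery review"])
# "pie" occurs in one document and "apple" in two, so "pie" gets the larger weight.
print(vectors[0]["pie"] > vectors[0]["apple"])
```

A term that appears in every document gets a weight of zero under this variant, which is what makes the measure favor terms that distinguish one text from the rest of the corpus.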

Our work has two novelties: individual labels can be assigned to multiple clusters, and the hierarchical tree itself is learned from data in a supervised manner.

To ensure that every label is assigned to all the clusters to which it properly belongs, we could simply assign every label to every cluster. But of course, that would waste the advantage of doing label partitioning in the first place.

Instead, we limit the number of clusters that a given label can be assigned to — in our experiments, we varied the limit from 1 to 6 — and then treat the cluster assignment as an optimization problem. That is, we learn which assignment of labels to which clusters maximizes the performance of the XMC model.

At the beginning of the training procedure, we create a provisional hierarchical tree using TF-IDF. Then we train a matcher on that tree. On the basis of that matcher, we then reassign labels to multiple clusters in a way that maximizes classification accuracy.
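The reassignment step can be illustrated with a deliberately simplified heuristic. The sketch below is not the optimization described in the paper: it assumes we have recorded, for each training instance, its true label and the cluster the provisional matcher routed it to, and it assigns each label to the clusters that receive its instances most often, up to a fixed limit. The function name `reassign_labels` and the input format are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def reassign_labels(
    routed: Iterable[Tuple[str, str]], limit: int
) -> Dict[str, List[str]]:
    """Assign each label to at most `limit` clusters.

    `routed` holds (true_label, matched_cluster) pairs, one per training
    instance, as produced by a matcher trained on the provisional tree.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for label, cluster in routed:
        counts[label][cluster] += 1
    # Keep the clusters that most often receive this label's instances.
    return {
        label: [cluster for cluster, _ in ctr.most_common(limit)]
        for label, ctr in counts.items()
    }


routed = [
    ("apple", "fruit"), ("apple", "fruit"), ("apple", "fruit"),
    ("apple", "devices"), ("apple", "devices"),
    ("apple", "cars"),
    ("banana", "fruit"),
]
print(reassign_labels(routed, limit=2))
```

With a limit of 2, "apple" ends up in both the fruit and devices clusters, while the rarely matched "cars" cluster is dropped, so the ranker's candidate sets stay small.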

This process could be repeated as many times as necessary, but in our experiments, we found that one repetition was enough to secure most of the approach’s performance gains.

In our experiments, we compared our approach to nine earlier approaches, using six metrics across four datasets. Of the resulting 24 measurements, our approach achieved the highest score on 21 and the second-highest score on two.

Multiresolution Transformer fine-tuning

PECOS comes with a number of built-in tools for performing all three steps in our XMC pipeline: label partitioning, matching, and ranking. At KDD 2020, we described our Transformer-based approach to matching, X-Transformer, which can be used with PECOS or with other XMC approaches.

The initial PECOS release also came with a recursive linear matching model, XR-linear, which learns to match inputs to clusters using the same iterative strategy that we use to build hierarchical trees: first XR-linear performs a coarse-grained matching, then a finer-grained matching, and so on. The “R” in XR-linear stands for “recursive”.
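Coarse-to-fine matching over a tree can be sketched as a beam search: at each level, only the best-scoring nodes are expanded into their children. This is an illustration, not the XR-linear implementation; the use of beam search, the tree encoding, and the `score` callback (a stand-in for a per-node linear model) are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Tree = Dict[str, List[str]]  # node -> children; leaf clusters have no entry


def beam_match(
    tree: Tree,
    root: str,
    score: Callable[[str], float],
    beam_width: int,
) -> List[Tuple[str, float]]:
    """Descend the tree level by level, keeping the `beam_width` best nodes.

    A node's score is the product of the scores along its path from the root.
    """
    beam = [(root, 1.0)]
    while any(node in tree for node, _ in beam):
        candidates = []
        for node, path_score in beam:
            if node in tree:
                # Expand an internal node into its children.
                candidates.extend(
                    (child, path_score * score(child)) for child in tree[node]
                )
            else:
                # Leaves stay in the running unchanged.
                candidates.append((node, path_score))
        beam = sorted(candidates, key=lambda pair: pair[1], reverse=True)[:beam_width]
    return beam


tree = {
    "root": ["food", "tech"],
    "food": ["fruit", "dairy"],
    "tech": ["phones", "laptops"],
}
# Hypothetical per-node scores for one input.
node_scores = {
    "food": 0.9, "tech": 0.1,
    "fruit": 0.8, "dairy": 0.2,
    "phones": 0.5, "laptops": 0.5,
}
result = beam_match(tree, "root", node_scores.__getitem__, beam_width=2)
print([node for node, _ in result])
```

A wider beam explores more of the tree and is less likely to discard the right cluster early, at the cost of scoring more nodes per level.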

In our second NeurIPS paper, we combine these two approaches to create XR-Transformer, a recursive, Transformer-based matcher. On the Amazon-3M dataset, a standard benchmark in the XMC field that consists of products sorted into three million product categories (the label space), training an X-Transformer matcher takes 23 days on eight GPUs; training an XR-Transformer matcher takes only 29 hours — with a significant improvement in accuracy.
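The training-time figures above can be checked with a few lines of arithmetic. The sketch assumes both runs use the same eight-GPU setup, so wall-clock hours are directly comparable.

```python
# Training times reported above for the Amazon-3M dataset.
x_transformer_hours = 23 * 24   # 23 days expressed in hours
xr_transformer_hours = 29

reduction = 1 - xr_transformer_hours / x_transformer_hours
speedup = x_transformer_hours / xr_transformer_hours

print(f"reduction in training time: {reduction:.1%}")
print(f"speedup factor: {speedup:.1f}x")
```

The result is a reduction of roughly 95%, or about a 19-fold speedup.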

To train an XR-Transformer matcher, we begin as we did in the label disentanglement work, with a hierarchical label tree based on TF-IDF features. For each layer of the tree, we jointly train a Transformer-based encoder and a linear ranker, which uses both the Transformer model embedding and the TF-IDF features as the basis for assigning an input to a particular cluster in the next layer down the tree.

Once the Transformer-based encoder has been trained at every level of the tree, we concatenate its final label embeddings with the TF-IDF features and, on that basis, produce a new label tree. Then, for each level of the tree, we train new linear rankers with the concatenated features as inputs.
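The feature concatenation can be pictured as follows. This is a minimal sketch under stated assumptions: the dense embedding and the TF-IDF vector are each L2-normalized before being joined (the normalization is an assumption, not something stated above), and the linear ranker is a plain weight matrix with one row per cluster. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concat_features(embedding: Sequence[float], tfidf: Sequence[float]) -> List[float]:
    """Join the dense Transformer embedding with the TF-IDF features."""
    return l2_normalize(embedding) + l2_normalize(tfidf)


def linear_scores(weights: Sequence[Sequence[float]], features: Sequence[float]) -> List[float]:
    """One dot product per cluster: the linear ranker's scores."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


features = concat_features([3.0, 4.0], [0.0, 2.0])
# Hypothetical ranker weights for two clusters over the four joined features.
weights = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
scores = linear_scores(weights, features)
print(max(range(len(scores)), key=scores.__getitem__))  # index of the best cluster
```

Because the ranker sees both parts of the feature vector, it can rely on the lexical TF-IDF signal where the dense embedding is uninformative, and vice versa.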

We tested our approach on six public datasets, whose output spaces ranged from about 4,000 labels to the three million of Amazon-3M. We compared our approach to 11 predecessors on three metrics (precision at 1, 3, and 5, or the fraction of relevant results among the top one, three, and five labels returned).
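Precision at k is simple to compute for a single input: count how many of the top k returned labels are relevant and divide by k. The sketch below uses made-up labels; in an evaluation, the value is averaged over all test inputs.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked labels that are in the relevant set."""
    hits = sum(1 for label in ranked[:k] if label in relevant)
    return hits / k


ranked = ["a", "b", "c", "d", "e"]   # labels returned by the model, best first
relevant = {"a", "c", "f"}           # ground-truth labels for this input

for k in (1, 3, 5):
    print(k, precision_at_k(ranked, relevant, k))
```

Here the top-ranked label is relevant, so precision at 1 is 1.0, while only two of the top five are relevant, giving a precision at 5 of 0.4.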

On the three datasets with 4,000 to 31,000 labels, XR-Transformer achieved the highest score on five of nine measures. But on the three datasets with 500,000 or more labels, it achieved the highest scores across the board, by a significant margin.
