Automatically optimizing execution of dynamic tensor operations

New auto-scheduler speeds optimization process sixfold while improving performance of resulting code up to 70%.

Deep-learning models rely centrally on algebraic operations involving tensors — higher-dimensional analogues of matrices — that might be repeated tens of thousands of times. Efficient learning requires optimizing frequently repeated tensor operations.

But operations involving tensors of different shapes — 32x32, 64x64, 128x128, etc. — have to be optimized individually. Auto-schedulers are programs that automatically learn optimized implementations for shapes that current tensor operation libraries may handle suboptimally.
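To see why each shape needs its own tuning, consider a toy illustration (not DietCode itself, and not how a production auto-scheduler enumerates candidates): even the set of tile sizes that evenly divide one dimension of a tensor — and therefore the search space an auto-scheduler must explore — changes with the tensor's shape.

```python
# Toy illustration only: the candidate tile sizes for one dimension of a
# tensor operation depend on that dimension's extent, so each shape
# ordinarily gets its own tuning run.

def candidate_tile_sizes(extent, max_tile=128):
    """Tile sizes that evenly divide a dimension of the given extent."""
    return [t for t in range(1, max_tile + 1) if extent % t == 0]

for extent in (32, 64, 128):
    print(extent, candidate_tile_sizes(extent))
# 32  -> [1, 2, 4, 8, 16, 32]
# 64  -> [1, 2, 4, 8, 16, 32, 64]
# 128 -> [1, 2, 4, 8, 16, 32, 64, 128]
```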

Existing auto-schedulers struggle, however, with workloads whose shapes vary. Many natural-language-processing applications, for instance, take inputs of arbitrary length, which means tensors of arbitrary shape.


At this year’s Conference on Machine Learning and Systems (MLSys), we and our colleagues presented a new auto-scheduler called DietCode, which handles dynamic-shape workloads much more efficiently than its predecessors. Where existing auto-schedulers have to optimize each possible shape individually, DietCode constructs a shape-generic search space that enables it to optimize all possible shapes simultaneously.

We tested our approach on a natural-language-processing (NLP) task that could take inputs ranging in size from 1 to 128 tokens. When we use a random sampling of input sizes that reflects a plausible real-world distribution, we speed up the optimization process almost sixfold relative to the best prior auto-scheduler. That speedup increases to more than 94-fold when we consider all possible shapes.

Despite being much faster, DietCode also improves the performance of the resulting code, by up to 70% relative to prior auto-schedulers and up to 19% relative to hand-optimized code in existing tensor operation libraries. It thus promises to speed up our customers’ dynamic-shaped machine learning workloads.

Dynamic workloads

NLP models that handle text strings of arbitrary length are examples of dynamic-by-design models, which allow variably sized inputs. But other applications also call for dynamic workloads.


Neural-architecture search, for instance, tries out different deep-learning architectures by building them up from different-shaped components, which requires operations on different-shaped tensors. And some models — the BERT language model, for instance — apply the same operation at different layers of a network, which have different numbers of nodes.

Microkernels

Auto-schedulers typically rely on computational kernels — program templates that greatly accelerate the evaluation of candidate optimizations. Odd-shaped workloads, however, may not fit the kernels precisely. For instance, if a tensor has 513 elements along one of its dimensions, but the kernel capacity is only 512, then two kernels must be tiled together to accommodate the tensor.

The tiled kernels, however, have a combined capacity of 1,024 along the relevant dimension, compared to only 513 for the input tensor. The input tensor thus has to be padded out in order to fill the kernel. This padding can slow down the optimization process dramatically, as it leads to unnecessary calculations that then have to be pruned out of the result.
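The arithmetic of that overhead is easy to sketch. The helper below is purely illustrative — it is not part of DietCode — but it shows how rounding a dimension up to a whole number of kernel tiles translates into wasted work:

```python
import math

def tiling_overhead(extent, kernel_capacity):
    """How many kernel tiles a dimension needs, and how much padding that implies."""
    num_tiles = math.ceil(extent / kernel_capacity)
    padded = num_tiles * kernel_capacity
    return num_tiles, padded - extent

tiles, padding = tiling_overhead(513, 512)
print(tiles, padding)  # 2 tiles, 511 padded elements: nearly half the computation is wasted
```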


DietCode uses microkernels that are sized according to the available hardware, not the input shape, which aids in optimization for that hardware. For a given hardware configuration, DietCode can also generate a range of different microkernel shapes and sizes, which can be used in combination.

The microkernels are small enough that they can usually be tiled across an input, to fit its shape more precisely. This may still require some padding at the edges, but much less than larger kernels require.
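Continuing the illustrative example above — again a sketch, not DietCode's actual microkernel selection — the fraction of computation wasted on padding shrinks sharply as the kernel gets smaller:

```python
import math

def padding_waste(extent, tile):
    """Fraction of the tiled computation that is spent on padding."""
    padded = math.ceil(extent / tile) * tile
    return (padded - extent) / padded

# A 513-element dimension covered by one 512-wide kernel vs. 64-wide microkernels.
print(f"512-wide kernel:     {padding_waste(513, 512):.1%} wasted")  # ~49.9%
print(f"64-wide microkernel: {padding_waste(513, 64):.1%} wasted")   # ~10.9%
```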

The real advantage of microkernels, however, is that they enable DietCode to optimize operators for multiple shapes at once. A standard auto-scheduler will take a workload shape, pad it as necessary to fit its tiled kernels, and then estimate the efficiency of different implementations using a cost model that extracts program features such as loop structures and memory access patterns. Then it will repeat that process for the next shape.

DietCode, by contrast, breaks operators up across microkernels. The cost model has two components: one that evaluates features of the partial operation assigned to each microkernel and one that evaluates the cost of stitching those partial operations together to form a complete operator.
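As a rough sketch of that decomposition — the function, its arguments, and the simple linear form below are assumptions made for illustration; DietCode's actual cost model is learned from extracted program features — the predicted cost of covering a workload shape with a given microkernel can be split into a per-microkernel term and a stitching term:

```python
import math

def predicted_cost(shape, microkernel, microkernel_cost, stitch_cost_per_tile):
    """Illustrative two-part cost: per-microkernel cost plus a stitching penalty."""
    tiles_per_dim = [math.ceil(s / m) for s, m in zip(shape, microkernel)]
    num_tiles = math.prod(tiles_per_dim)
    return num_tiles * (microkernel_cost + stitch_cost_per_tile)

# The expensive microkernel cost is evaluated once and reused across shapes;
# only the cheap stitching term varies from shape to shape.
for seq_len in (17, 64, 100, 128):
    print(seq_len, predicted_cost((seq_len, 768), (16, 64), 1.0, 0.05))
```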

A traditional auto-scheduler (left) separately optimizes implementations of tensor operations for different-shaped workloads. DietCode (right) instead optimizes implementations for multiple workloads at once, saving time and improving performance.

Here is where we realize our greatest gains in efficiency, because each partial operation is a component of the operators for multiple workload shapes. Compared to the cost of evaluating the partial operations — a machine learning process that involves real hardware measurements — the cost of stitching partial operations together is low.

With our optimized microkernels in hand, we train an efficient decision tree model to map workload shapes to microkernels. That decision tree is incorporated into the binary that executes the tensor operations, routing inputs of arbitrary shape to the proper microkernels for processing.
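As a minimal sketch of the dispatching idea — using scikit-learn here as a stand-in, whereas DietCode compiles its learned dispatcher into the generated binary — a decision tree maps each input shape to the index of the microkernel that tuned best for it:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical tuning results: for each measured sequence length, the index
# of the microkernel that performed best on that shape.
measured_shapes = [[8], [16], [32], [48], [64], [96], [128]]
best_microkernel = [0, 0, 1, 1, 2, 2, 2]

dispatcher = DecisionTreeClassifier(max_depth=3).fit(measured_shapes, best_microkernel)

# At run time, an input of arbitrary length is routed to a microkernel.
print(dispatcher.predict([[50]]))  # -> the microkernel index chosen for length-50 inputs
```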

For experimental results and more details, please refer to our paper.

Acknowledgements: Cody Yu, Yizhi Liu, Gennady Pekhimenko
