New method for compressing neural networks better preserves accuracy

Neural networks have been responsible for most of the top-performing AI systems of the past decade, but they tend to be big, which means they tend to be slow. That’s a problem for systems like Alexa, which depend on neural networks to process spoken requests in real time.

In natural-language-understanding (NLU) applications, most of a neural network’s size comes from a huge lookup table that correlates input words with “embeddings.” An embedding is a large vector (usually a sequence of 300 numbers) that captures information about a word’s meaning.

In a paper that we and our colleagues are presenting at the 33rd conference of the Association for the Advancement of Artificial Intelligence (AAAI), we describe a new method for compressing embedding tables that compromises the NLU network’s performance less than competing methods do.

In one set of experiments, for instance, we showed that our system could shrink a neural network by 90% while reducing its accuracy by less than 1%. At the same compression rate, the best prior method reduced the accuracy by about 3.5%.

The ability to compress NLU models means that, as Alexa learns to perform more and more complex tasks, she can continue to deliver responses in milliseconds. It also means that Alexa’s skill base can continue to expand unfettered. Alexa currently supports more than 70,000 third-party skills, with thousands more being added every month. Compression means that those skills’ NLU models can be stored efficiently.

In our experiments, we used a set of pretrained word embeddings called GloVe. Like other popular embeddings, GloVe assesses words’ meanings on the basis of their co-occurrence with other words in huge bodies of training data. It then represents each word as a single point in a 300-dimensional space, such that words with similar meanings (similar co-occurrence profiles) are grouped together.

NLU systems often benefit from using such pretrained embeddings, because doing so lets them generalize across conceptually related terms. (It could, for instance, help a music service learn that the comparatively rare instruction “Play the track ‘Roadrunner’” should be handled the same way as the more common instruction “Play the song ‘Roadrunner’”.) But it’s usually possible to improve performance still further by fine-tuning the embeddings on training data specific to the task the system is learning to perform.

In previous work, NLU researchers had taken a huge lookup table, which listed embeddings for about 100,000 words, reduced the dimension of the embeddings from 300 to about 30, and used the smaller embeddings as NLU system inputs.

We improve on this approach by integrating the embedding table into the neural network in such a way that it can use task-specific training data not only to fine-tune the embeddings but to customize the compression scheme as well.

To reduce the embeddings’ dimensionality, we use a technique called singular-value decomposition. Singular-value decomposition (SVD) produces a lower-dimensional projection of points in a higher-dimensional space, kind of the way a line drawing is a two-dimensional projection of objects in three-dimensional space.

Singular-value decomposition projects high-dimensional data into a lower-dimensional space, much the way a three-dimensional object can be projected onto a two-dimensional plane.
Projection image adapted from Michael Horvath under the CC BY-SA 4.0 license

The key is to orient the lower-dimensional space so as to minimize the distance between the points and their projections. Imagine, for instance, trying to fit a two-dimensional plane to a banana so as to minimize the distance between the points on the banana’s surface and the plane. A plane oriented along the banana’s long axis would obviously work better than one that cut the banana in half at the middle. Of course, when you’re projecting 300-dimensional points onto a 30-dimensional surface the range of possible orientations is much greater.
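The banana intuition can be checked in two dimensions, where the best-fitting line has a closed form. The following is a minimal sketch, not the procedure from the paper: the function names and the five sample points are illustrative assumptions, and the points are centered on their mean before fitting.

```python
import math

def centered(points):
    """Shift the points so that their centroid sits at the origin."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return [(x - mx, y - my) for x, y in points]

def best_line_angle(points):
    """Angle of the line through the centroid that minimizes the
    summed squared distance between the points and their projections."""
    pts = centered(points)
    sxx = sum(x * x for x, _ in pts)
    syy = sum(y * y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

def projection_error(points, angle):
    """Summed squared distance from the points to the line at `angle`."""
    # Unit vector perpendicular to the line.
    nx, ny = -math.sin(angle), math.cos(angle)
    return sum((x * nx + y * ny) ** 2 for x, y in centered(points))

# An elongated, roughly banana-like cloud of 2-D points (made-up data).
points = [(-4.0, -1.9), (-2.0, -1.1), (0.0, 0.2), (2.0, 0.9), (4.0, 2.1)]

long_axis = best_line_angle(points)
err_long = projection_error(points, long_axis)
err_across = projection_error(points, long_axis + math.pi / 2)
print(err_long < err_across)  # True
```

Projecting onto the long axis loses far less information than projecting onto the line that cuts across it, which is the property the lower-dimensional space is oriented to achieve.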

We use SVD to break our initial embedding matrix into two smaller embedding matrices. Suppose you have a matrix that is 10,000 rows long (representing a lexicon of 10,000 words) and 300 columns wide (representing a 300-dimensional vector for each word). You can break it into two matrices, one of which is 10,000 rows long and 30 columns wide, and the other of which is 30 rows long and 300 columns wide. This reduces the number of parameters from 10,000 x 300 = 3,000,000 to (10,000 x 30) + (30 x 300) = 309,000, a reduction of almost 90%.
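The parameter arithmetic for this example can be verified with a few lines of code. This is only a sketch; the function name `factorized_params` is a hypothetical helper, not something from the paper.

```python
def factorized_params(vocab_size, embed_dim, rank):
    """Parameters in a (vocab_size x rank) matrix plus a (rank x embed_dim) matrix."""
    return vocab_size * rank + rank * embed_dim

full = 10_000 * 300                             # original embedding table
small = factorized_params(10_000, 300, 30)      # the two factor matrices
reduction = 1 - small / full

print(full)                 # 3000000
print(small)                # 309000
print(f"{reduction:.1%}")   # 89.7%
```

Almost all of the remaining parameters sit in the first factor, so the savings grow with the ratio between the original dimension and the reduced one.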

We represent one of these matrices as one layer of a neural network and the second matrix as the layer above it. Between the layers are connections that have associated “weights,” which determine how much influence the outputs of the lower layer have on the computations performed by the higher one. The training process keeps readjusting those weights, trying to find settings that reduce the projection distance still further.
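One way to picture the two stacked layers is as a row lookup followed by a matrix multiplication. The sketch below uses tiny made-up matrices (3 words, rank 2, 4 output dimensions); the names `A`, `B`, and `embed` are assumptions for illustration, and a real system would use a tensor library and train both matrices.

```python
def embed(word_id, A, B):
    """Look up the word's low-dimensional row in A, then expand it through B."""
    row = A[word_id]            # lower layer: one row per word
    out_dim = len(B[0])
    # Upper layer: weighted sum of B's rows, with weights taken from the row of A.
    return [sum(row[k] * B[k][j] for k in range(len(row))) for j in range(out_dim)]

A = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]                # 3 words x rank 2
B = [[1.0, 2.0, 3.0, 4.0],
     [0.0, -1.0, 0.0, 1.0]]     # rank 2 x 4 dimensions

print(embed(2, A, B))  # [0.5, 0.5, 1.5, 2.5]
```

Because both matrices are ordinary weights, gradient-based training can adjust them together with the rest of the network.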

In our paper, we also describe a new procedure for selecting the network’s “learning rate”. The relationship between the weight settings of the entire network and the network’s error rate can be imagined as a landscape with peaks and valleys. Each point in the landscape represents a group of weight settings, and its altitude represents the corresponding error rate.

The goal is to find a group of weights that correspond to the bottom of one of the deepest valleys, but we can’t view the landscape as a whole; all we can do is examine individual points. At each point, however, we can calculate the slope of the landscape, and the standard procedure for training a neural network is to continually examine points that lie in the downhill direction from the last point examined.
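The downhill search can be illustrated with a single weight and a toy error function. This is a minimal sketch under stated assumptions: the error landscape `(w - 3) ** 2`, the function name, and the fixed `step_size` are all chosen for illustration, whereas a real network has many weights and a far more rugged landscape.

```python
def descend(slope, start, step_size, steps):
    """Repeatedly move the weight a fixed fraction of the slope in the downhill direction."""
    w = start
    for _ in range(steps):
        w -= step_size * slope(w)
    return w

# Toy landscape: error(w) = (w - 3) ** 2, whose slope at w is 2 * (w - 3).
# The bottom of its only valley is at w = 3.
w_final = descend(lambda w: 2 * (w - 3), start=0.0, step_size=0.1, steps=100)
print(abs(w_final - 3) < 1e-6)  # True
```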

Every time you select a new point, the question is how far in the downhill direction to leap, a quantity called the learning rate. A recent approach to choosing the learning rate is the cyclical learning rate, which steadily increases the leap length until it hits a maximum, then steadily steps back down to a minimum, then back up to the maximum, and so on, until further exploration no longer yields performance improvements.
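A cyclical schedule of this kind can be sketched as a triangular wave between a minimum and a maximum learning rate. The linear ramp, the function name, and the parameter values below are assumptions for illustration; the text only specifies that the rate rises steadily to a maximum and falls steadily back to a minimum.

```python
def cyclical_lr(step, min_lr, max_lr, half_cycle):
    """Learning rate at a given training step for a triangular cyclical schedule."""
    pos = step % (2 * half_cycle)                  # position inside the current cycle
    frac = 1 - abs(pos - half_cycle) / half_cycle  # 0 at the ends, 1 at the midpoint
    return min_lr + (max_lr - min_lr) * frac

schedule = [cyclical_lr(s, min_lr=0.001, max_lr=0.01, half_cycle=4) for s in range(17)]
# The rate starts at the minimum, peaks at step 4, returns to the minimum at
# step 8, and then repeats the same cycle.
```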

We vary this procedure by decreasing the maximum leap distance at each cycle, then pumping it back up and decreasing it again. The idea is that the large leaps help you escape local minima — basins at the tops of mountains rather than true valleys. But tapering the maximum leap distance reduces the chance that when you’ve found a true valley and have started down its slope, you’ll inadvertently leap out of it.
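One possible reading of this variation is sketched below: the peak of each cycle shrinks geometrically, and after a fixed number of cycles it is reset to the original maximum. The decay rule, the reset period, and every name and value here are assumptions made for illustration; the exact schedule is given in the paper, not in this text.

```python
def annealed_cyclical_lr(step, min_lr, max_lr, half_cycle, decay, restart_every):
    """Triangular cyclical rate whose peak decays each cycle and is periodically reset."""
    cycle_len = 2 * half_cycle
    cycle = step // cycle_len
    # Shrink the peak within a round of cycles, then pump it back up.
    peak = min_lr + (max_lr - min_lr) * decay ** (cycle % restart_every)
    pos = step % cycle_len
    frac = 1 - abs(pos - half_cycle) / half_cycle  # 0 at the ends, 1 at the midpoint
    return min_lr + (peak - min_lr) * frac

# Learning rate at the midpoint (peak) of each of the first four cycles.
peaks = [
    annealed_cyclical_lr(c * 8 + 4, min_lr=0.001, max_lr=0.01,
                         half_cycle=4, decay=0.5, restart_every=3)
    for c in range(4)
]
# peaks[0] > peaks[1] > peaks[2], and peaks[3] is back at the original maximum.
```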

A comparison of the learning-rate-selection strategies adopted in the cyclical learning rate (left) and the cyclically annealed learning rate (right).

We call this technique the cyclically annealed learning rate, and in our experiments, we found that it led to better performance than either the cyclical learning rate or a fixed learning rate.

To evaluate our compression scheme, we compared it to two alternatives. One is the scheme we described before, in which the embedding table is compressed before network training begins. The other is simple quantization, in which all of the values in the embedding vector — in this case, 300 — are rounded to a limited number of reference values. So, for instance, the numbers 75, 83, and 87 might all become 80. This can reduce, say, 32-bit vector values to 16 or 8 bits each.
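The rounding step in simple quantization can be sketched as a nearest-reference-value lookup. The reference values 60, 80, and 100 are an assumption chosen so that the example numbers above behave as described, and both function names are hypothetical.

```python
def quantize(values, reference_values):
    """Replace each value with the closest reference value."""
    return [min(reference_values, key=lambda r: abs(v - r)) for v in values]

def bits_per_value(num_reference_values):
    """Bits needed to store an index into the table of reference values."""
    return max(1, (num_reference_values - 1).bit_length())

print(quantize([75, 83, 87], [60, 80, 100]))  # [80, 80, 80]
print(bits_per_value(256))                    # 8
```

Storing an index into a small table of reference values, rather than the value itself, is what lets each vector entry shrink from 32 bits to 16 or 8.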

We tested all three approaches across a range of compression rates, on different types of neural networks, using different data sets, and we found that in all instances, our approach outperformed the others.

Acknowledgments: Angeliki Metallinou, Inderjit Dhillon
