AWS scientist wins ICLR outstanding paper award

Ability to balance parameter size and effectiveness could be “extremely useful” in reducing the parameter size of deep-learning models.

An Amazon Web Services (AWS) senior applied scientist and collaborators learned last week that their research paper is one of eight to earn an Outstanding Paper Award for the forthcoming International Conference on Learning Representations (ICLR 2021), which is dedicated to the advancement of deep learning.

The award-winning paper, “Beyond Fully-Connected Layers With Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters”, is authored by Aston Zhang and six other researchers from Nanyang Technological University, ETH Zurich, and the University of Montreal.

Neural networks frequently include so-called fully connected layers, in which each node in one layer connects to all of the nodes in the next layer. The operations performed by fully connected layers are typically modeled as matrix multiplication. Recent work has shown that it’s possible to reduce the number of parameters necessary to represent a fully connected layer by using quaternions, four-dimensional generalizations of complex numbers.

A complex number is one that combines real numbers and the imaginary number i, the square root of -1. By extension, a quaternion combines real numbers and three imaginary numbers, i, j, and k.

Because they have four components, quaternions need only one-fourth as many parameters to represent the operations of a fully connected layer. Zhang and his collaborators’ paper explains how to extend this concept to even higher-dimensional hypercomplex numbers — with four imaginary components, or 20, or as many as you like — with even greater savings in parameter count.
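To make the one-fourth figure concrete, here is a minimal Python sketch (not code from the paper; the function names are illustrative). Multiplying by a quaternion a + bi + cj + dk can be written as a real 4x4 matrix whose 16 entries are all built from the same four numbers, which is where the parameter sharing comes from.

```python
def hamilton_matrix(a, b, c, d):
    """Real 4x4 matrix that left-multiplies by the quaternion a + bi + cj + dk.

    Sixteen entries, but only four free parameters (a, b, c, d).
    """
    return [
        [a, -b, -c, -d],
        [b,  a, -d,  c],
        [c,  d,  a, -b],
        [d, -c,  b,  a],
    ]


def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * v for w, v in zip(row, x)) for row in m]


# Quaternions are stored as [real, i, j, k] coefficient lists.
i_times_j = matvec(hamilton_matrix(0, 1, 0, 0), [0, 0, 1, 0])  # i * j
i_times_i = matvec(hamilton_matrix(0, 1, 0, 0), [0, 1, 0, 0])  # i * i
```

Here `i_times_j` comes out as the coefficients of k and `i_times_i` as the coefficients of -1, matching the usual quaternion rules. A quaternion-valued layer applies the same sharing pattern blockwise to a large weight matrix.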

In developing a mathematical representation flexible enough to capture operations involving arbitrary hypercomplex numbers, Zhang and his collaborators found that the same representation could also capture real-numbered operations, such as matrix multiplication. They had found a way to subsume arbitrary hypercomplex numbers and real numbers under a single description.
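One way to picture such a unified description is a weight matrix built as a sum of Kronecker products, with one small "rule" matrix and one block of weights per component. The Python sketch below is our illustration under that assumption, not the authors' implementation; `kron`, `phm_weight`, `param_count`, and `QUAT_RULES` are hypothetical names. With a single component (n = 1) it reduces to an ordinary real matrix, and with the four quaternion rule matrices it reproduces quaternion multiplication.

```python
def kron(a, s):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_val * s_val for a_val in a_row for s_val in s_row]
        for a_row in a
        for s_row in s
    ]


def mat_add(x, y):
    """Elementwise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def phm_weight(rules, blocks):
    """Sum of Kronecker products: W = sum_i kron(rules[i], blocks[i])."""
    w = kron(rules[0], blocks[0])
    for a, s in zip(rules[1:], blocks[1:]):
        w = mat_add(w, kron(a, s))
    return w


def param_count(n, d_out, d_in):
    """n rule matrices of size n x n, plus n blocks of size (d_out/n) x (d_in/n)."""
    return n ** 3 + (d_out * d_in) // n


# Fixed rule matrices that encode quaternion multiplication (n = 4).
# In the learned setting described in the article, these would be trained from data.
QUAT_RULES = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],    # real part
    [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]],  # i
    [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]],  # j
    [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],  # k
]

# n = 1: the single rule [[1]] leaves the block untouched, i.e. a plain real matrix.
real_case = phm_weight([[[1]]], [[[5, 6], [7, 8]]])

# n = 4: 1x1 blocks holding a, b, c, d rebuild the 4x4 quaternion multiplication matrix.
quat_case = phm_weight(QUAT_RULES, [[[1]], [[2]], [[3]], [[4]]])
```

Under this construction, a 512-by-512 layer with n = 4 would need `param_count(4, 512, 512)` = 65,600 parameters instead of 262,144 for the dense version, roughly one-fourth.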

“The paper’s reviewers helped us improve the paper,” Zhang says. “They were the ones who suggested we see how we could empirically learn predefined multiplication rules in different spaces, such as on artificial datasets.

“There exist multiplication rules in those predefined quaternion-numbered or real-numbered systems. However, relying only on them may restrict the architectural flexibility of deep learning.

“By learning multiplication rules from data, the dimensionality of hypercomplex numbers can be flexibly specified or tuned by users based on their own applications, even when such numbers or rules do not exist mathematically.”

Zhang’s collaborators on the paper are Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, and Jie Fu. At Amazon, Zhang is currently working on completing the book “Dive into Deep Learning”, which he is co-authoring with three other primary authors, Zachary Lipton, Mu Li, and Alex Smola.

Conference organizers noted that 860 papers were submitted for this year’s program, and that a subset of them were submitted to the conference’s Outstanding Paper Committee for review. The eight winning papers will be presented during two Outstanding Paper sessions on May 5 and 6. To attend the event, individuals can register here.
