Amazon at NAACL 2021: The return of symbolic semantics

Amazon Scholar Heng Ji says that deep learning could benefit from the addition of a little linguistic intuition.

When Heng Ji, an Amazon Scholar and professor of computer science at the University of Illinois, began attending the annual meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), about 15 years ago, the conference drew around 700 people.

“This year, we’ll probably reach something like 3,000 people” says Ji, who is one of the conference’s senior area chairs for the research topic of information extraction. “When we were students, we usually just needed one chair for each area. This year, we have three senior area chairs for information extraction, and under us we have 18 area chairs. So it's just growing like crazy.”

Heng Ji
Heng Ji, an Amazon Scholar and professor of computer science at the University of Illinois.

That growth, of course, is a result of the deep-learning revolution and the role that the statistical approach to natural-language processing has played in recent artificial-intelligence research. Ji, however, looks back fondly on the conference’s prerevolutionary period.

“Methodology-wise, it's actually less heterogenous, less diverse than before,” she says. “Machine learning methods are a hammer, and now we have many nails. In the past, we didn't have a very good hammer, so we were busy inventing other tools.”

As a senior area chair, however, Ji has a good overview of the paper submissions on her research topic — information extraction — and in recent work, she sees a revival of the idea of symbolic semantics, which had fallen into neglect.

“The whole idea of deep neural networks is built on distributional semantics, which means you don't need rules or linguistic intuitions because you can simply count words, right?” she says. “So ‘apple’ and ‘orange’ are similar just because they appear in similar contexts. If I give you one billion documents, you can simply count the stats.”

Distributional semantics is the basis for most linguistic embeddings, or representations of words and strings of words as points in a multidimensional space, such that spatial relationships between points encode semantic relationships between texts. Pretrained, transformer-based embedding networks such as BERT are the basis for most recent advances in natural-language processing (NLP). They generally infer semantic relationships from words’ co-occurrence with other words.

Symbolic semantics

Symbolic semantics, by contrast, makes use of logical relationships between symbols, encoded either as rules (based on linguistic intuitions) or as syntactic relationships within sentences. There are four main ways that symbolic semantics has begun to return to information extraction, Ji says.

“The first idea is we directly change the input data,” she says. “When I say ‘John Smith’, I could refer to this John Smith or that John Smith. Where I say ‘apple’, I can refer to the company or the fruit. The idea is, let's try to do disambiguation before we learn an embedding. So instead of just saying ‘Apple’, we say ‘Apple Incorporated’ to indicate that it's a company. We change the input data to make it more knowledge aware.

“The second idea is we keep the input data the same, but we try to convert the natural language into some sort of structure. For example, we can use semantic parsing to convert the input sentence into a graph structure. Then we can initialize the embedding using a traditional embedding but propagate the embedding representation among the neighboring nodes of the graph.

“So, for example, ‘succeed’ can mean following after something, or it can mean being successful. If we only count co-occurrences, it's very hard to distinguish these two senses. But if we know whether the verb has an object, we can distinguish these meanings. If we can teach the model in advance, ‘This is the structure’, ‘this is the object’, then we can represent it better.

“The third idea is that we will use distributional semantics to discover some new types or new clusters and then use symbolic semantics to name them. One big issue in many NLP tasks, especially information extraction, is that every time we define something that requires an ontology — these are the 10 types of events I want you to extract from news articles — we just annotate data for those 10 types. Then, when we want to add 10 new types of events, the old system we trained with the old training data has become useless, because the deep-learning model is customized for those 10 types. 

“The idea here is, Let's forget about the classification paradigm. Let's try to discover clusters using embedding, right? So if all these words look similar, we put them together into one cluster. And if we can look into the representative meaning of this cluster, we can then use symbolic semantics to come up with the naming. 

“And then the fourth method is that we just let the embedding methods do their work at their low level, and then we use the symbolic-semantic resources to do the final decoding. We use some background knowledge or commonsense knowledge as a global constraint, where we pick candidates from the distributional semantics. 

“I think this is a very promising direction because all the resources we have prepared in the past, maybe, two decades, three decades, will not be dumped. We can still leverage them. And on the other hand, it also makes all the results more explainable.”

For more about Amazon’s involvement in this year’s NAACL — including publications, committee memberships, and participation in workshops and tutorials — please visit our NAACL 2021 conference page.

Research areas

Related content

IL, Tel Aviv
Come join the AWS Agentic AI science team in building the next generation models for intelligent automation. AWS, the world-leading provider of cloud services, has fostered the creation and growth of countless new businesses, and is a positive force for good. Our customers bring problems that will give Applied Scientists like you endless opportunities to see your research have a positive and immediate impact in the world. You will have the opportunity to partner with technology and business teams to solve real-world problems, have access to virtually endless data and computational resources, and to world-class engineers and developers that can help bring your ideas into the world. As part of the team, we expect that you will develop innovative solutions to hard problems, and publish your findings at peer reviewed conferences and workshops. We are looking for world class researchers with experience in one or more of the following areas - autonomous agents, API orchestration, Planning, large multimodal models (especially vision-language models), reinforcement learning (RL) and sequential decision making.
IL, Tel Aviv
Are you a Masters or PhD student interested in a 2026 Internship in Data Science? If so, we want to hear from you! We are looking for a customer obsessed Data Scientist Intern who can innovate in a business environment and is comfortable owning data to drive step-change innovation in the EMEA region or worldwide. If this describes you, come and join our Data Science teams at Amazon for an exciting internship opportunity. If you are insatiably curious and always want to learn more, then you’ve come to the right place. You can find more information about the Amazon Science community as well as our interview process via the links below; https://www.amazon.science/ https://amazon.jobs/content/en/career-programs/university/science Key job responsibilities As a Data Science Intern, you will have the following key job responsibilities: • Work closely with scientists and engineers to develop new algorithms to implement scientific solutions for Amazon problems • Design, run, and analyze A/B tests • Work on an interdisciplinary team on customer-obsessed research • Experience Amazon's customer-focused culture • Create and deliver projects that can be quickly applied starting locally and scaled to EMEA/worldwide • Create and share data with audiences of varying levels technical papers and presentations • Define metrics and design algorithms to estimate customer satisfaction and engagement A day in the life At Amazon, you will grow into the high impact person you know you’re ready to be. Every day will be filled with developing new skills and achieving personal growth. How often can you say that your work changes the world? At Amazon, you’ll say it often. Join us and define tomorrow. Some more benefits of an Amazon Science internship include; • All of our internships offer a competitive stipend/salary • Interns are paired with an experienced manager and mentor(s) • Interns receive invitations to different events such as intern program initiatives or site events • Interns can build their professional and personal network with other Amazon Scientists • Interns can potentially publish work at top tier conferences each year About the team Applicants will be reviewed on a rolling basis and are assigned to teams aligned with their research interests and experience prior to interviews. Start dates are available throughout the year and durations can vary in length from 3-6 months for full time internships or 6-12 months for part time internships. Please note these are not remote internships.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities Design, develop, and evaluate innovative deep learning and GenAI models for natural language processing (NLP), recommendation systems, and personalization. Conduct hands-on data analysis and build scalable ML pipelines. Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. Collaborate with software development engineers to deploy models into high-scale, real-time production environments. About the team We are building a new science team in Bangalore to solve some of the most impactful problems in computational advertising. This isn't about tweaking existing models as we are rethinking how ads are ranked, priced, and personalized across voice-first and screen-first surfaces. These are problems that don't have textbook solutions. Key points to note about the team: 🧪 Greenfield team - you are not joining a mature org with rigid processes. You will shape the science roadmap, pick the problems, and define the culture from day one. 📈 Direct business impact — your models directly drive revenue. No yearly cycles to see if your work matters. 🌏 Global scope, local autonomy — collaborate with scientists and engineers across Seattle, Sunnyvale, and Bangalore, but own your problem space end-to-end. 🎓 Ship AND Publish: We encourage top-tier publications (NeurIPS, ACL, EMNLP, KDD, ICML, WWW) while ensuring your research hits production.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities Design, develop, and evaluate innovative deep learning and GenAI models for natural language processing (NLP), recommendation systems, and personalization. Conduct hands-on data analysis and build scalable ML pipelines. Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. Collaborate with software development engineers to deploy models into high-scale, real-time production environments. About the team We are building a new science team in Bangalore to solve some of the most impactful problems in computational advertising. This isn't about tweaking existing models as we are rethinking how ads are ranked, priced, and personalized across voice-first and screen-first surfaces. These are problems that don't have textbook solutions. Key points to note about the team: 🧪 Greenfield team - you are not joining a mature org with rigid processes. You will shape the science roadmap, pick the problems, and define the culture from day one. 📈 Direct business impact — your models directly drive revenue. No yearly cycles to see if your work matters. 🌏 Global scope, local autonomy — collaborate with scientists and engineers across Seattle, Sunnyvale, and Bangalore, but own your problem space end-to-end. 🎓 Ship AND Publish: We encourage top-tier publications (NeurIPS, ACL, EMNLP, KDD, ICML, WWW) while ensuring your research hits production.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities Design, develop, and evaluate innovative deep learning and GenAI models for natural language processing (NLP), recommendation systems, and personalization. Conduct hands-on data analysis and build scalable ML pipelines. Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. Collaborate with software development engineers to deploy models into high-scale, real-time production environments. About the team We are building a new science team in Bangalore to solve some of the most impactful problems in computational advertising. This isn't about tweaking existing models as we are rethinking how ads are ranked, priced, and personalized across voice-first and screen-first surfaces. These are problems that don't have textbook solutions. Key points to note about the team: 🧪 Greenfield team - you are not joining a mature org with rigid processes. You will shape the science roadmap, pick the problems, and define the culture from day one. 📈 Direct business impact — your models directly drive revenue. No yearly cycles to see if your work matters. 🌏 Global scope, local autonomy — collaborate with scientists and engineers across Seattle, Sunnyvale, and Bangalore, but own your problem space end-to-end. 🎓 Ship AND Publish: We encourage top-tier publications (NeurIPS, ACL, EMNLP, KDD, ICML, WWW) while ensuring your research hits production.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities - Design, develop, and evaluate innovative machine learning and deep learning models for natural language processing (NLP), recommendation systems, and personalization. - Conduct hands-on data analysis and build scalable ML pipelines. - Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. - Collaborate with software development engineers to deploy models into high-scale, real-time production environments.
US, CA, San Francisco
Join Amazon's Frontier AI & Robotics team as a Member of Technical Staff, this Technical Program Manager will become the driving force behind breakthrough robotics innovation. You'll orchestrate complex, cross-functional programs that bridge AI research, software, hardware, and production deployment—managing the technical workstreams that enable robots to see, reason, and act in Amazon's warehouse environments. Your program leadership will directly accelerate our mission to build the next generation of embodied intelligence. Key job responsibilities · Establish and drive program management mechanisms and cadence for complex robotics and AI development initiatives spanning research, software engineering, hardware, and operations · Manage end-to-end program execution across the full robotics stack—including AI models, software engineering, and hardware deployment · Drive decision-making velocity by facilitating tradeoff discussions when there are conflicting priorities; determine whether decisions are one-way or two-way doors · Own program-level risk management, proactively identifying technical, schedule, and resource risks; escalate where necessary and drive mitigation strategies · Manage dependencies and scope changes across internal teams and partner organizations, ensuring alignment on commitments, timelines, and technical requirements · Create transparency through clear RACI frameworks, program dashboards, and communication mechanisms that keep stakeholders aligned on status, risks, and decisions · Exercise strong technical judgment to influence program-level decisions on deployment methodology, scalability requirements, and technical feasibility—acting as the voice back to research and engineering teams · Build sustainable program management processes that scale as our organization grows, adapting agile frameworks to the unique challenges of AI robotics A day in the life Your focus centers on driving velocity and alignment across our robotics programs. You might start your morning facilitating tradeoff decisions between AI researchers and software engineers on a critical prototype milestone, then transition to managing dependencies across hardware and operations teams to keep timelines on track. In the afternoon, you could be conducting risk assessments on supply chain constraints that impact our development roadmap, updating program dashboards to provide leadership visibility, or working with partner teams to align on deployment strategies. You'll establish the mechanisms and cadence that keep our fast-moving organization synchronized—from sprint planning rituals to cross-functional design reviews. Throughout the day, you balance hands-on program execution with strategic escalation, ensuring technical decisions align with our long-term vision while removing obstacles that slow teams down. You're the connective tissue that enables researchers, engineers, and operations specialists to move fast together. About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through frontier foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
US, CA, San Francisco
We are seeking a hands-on Electrical Engineer to lead the design and integration of electrical systems or subsystems for high-degree-of-freedom robotic platforms. This role involves architecting the robot’s power distribution, sensor wiring, and embedded electrical infrastructure. You will be responsible for designing across the full electrical system for advanced robotics platforms including power distribution, sensing, compute, motor controllers, communication infrastructure, battery system and power electronics in close collaboration with mechanical, controls and software engineers. You’ll play a key role in ensuring high-performance, reliable operation of complex electromechanical systems under real-world conditions. Key job responsibilities * Electrical system architect / owner for power electronics, actuation, PCBAs, battery, ware harness specs and high speed electrical/communications protocols * Design, develop and integrate power distribution, embedded electronics, motor controllers and safety-critical circuits for complex robotic systems * Own board layout of PCBAs including SoCs, microcontrollers, sensors, power devices, etc. using Cadence OrCAD/Allegro or equivalent tools. Oversee bring-up and validation * Determine appropriate high speed electrical and communication protocols (e.g., CAN, EtherCAT, USB, etc) for reliable and efficient system operation * Specify and design custom power electronics and power distribution boards to meet performance, thermal, and safety requirements * Design and route all cabling and wire harnesses across the robotic platform, considering EMI, signal integrity, serviceability, and integration with mechanical structures * Architect and integrate the robot’s battery system, including protection circuitry, battery management, charging systems, and thermal considerations * Define and implement wiring and electrical interfaces for sensors (e.g., lidar, stereo cameras, IMUs, tactile) and compute modules * Ownership over prototyping and bringing up electrical designs and creation of test & validation rigs About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through innovative foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
US, NY, New York
We are seeking an Applied Scientist to develop and optimize Visual Inertial Odometry (VIO) and sensor fusion systems for our intelligent robots. In this role, you will design, implement, and deploy state estimation and tracking algorithms that enable robots to understand their position and motion in real time, even in challenging and dynamic environments. You will own the full pipeline from algorithm development through embedded deployment, ensuring that perception systems run efficiently on resource-constrained robotic hardware. You will also leverage modern machine learning approaches to push the boundaries of classical perception methods, combining learned representations with geometric techniques to achieve robust, real-time performance. This is a deeply hands-on role. You will work directly with sensors, hardware, and real-world data, while prototyping, testing, and iterating in physical environments. The ideal candidate has strong foundations in VIO and sensor fusion, practical experience optimizing algorithms for embedded platforms, and familiarity with how modern deep learning is transforming perception. Key job responsibilities - Design and implement Visual Inertial Odometry algorithms for robust real-time state estimation on robotic platforms like Sprout - Develop multi-sensor fusion pipelines integrating cameras, IMUs, and other sensing modalities for accurate pose tracking - Optimize perception and tracking algorithms for deployment on embedded hardware (e.g., ARM, GPU-accelerated edge devices) under strict latency and power constraints - Apply modern ML-based perception techniques (learned features, depth estimation, neural odometry) to complement and improve classical geometric approaches - Build and maintain calibration, evaluation, and benchmarking infrastructure for perception systems - Collaborate with hardware, controls, and navigation teams to integrate perception outputs into the robot’s autonomy stack - Lead technical projects from research prototyping through production deployment
US, WA, Bellevue
The candidate in this role will own delivery of science products and solutions to help Amazon Devices Sales and Marketing org. make better decisions: product recommendations to customers, segmentation, financial incrementality of marketing initiatives, A/B testing etc. Key job responsibilities The Amazon Devices organization designs, produces and markets Echo Speakers, Kindle e-readers, Fire Tablets, Fire TV Streaming Media Players, Ring and Blink Smart Home & Security products. We are constantly looking to innovate on behalf of customers with new devices in existing or new categories or improving customer experience on existing platforms. The Devices Data Services (DDS) team provides Data Science, Analytics and Engineering support to the broader organization to enable Sales and Marketing activities across all these product lines. We are looking for an innovative, hands-on and customer-obsessed Data Scientist who can be a strategic partner to the product managers and engineers on the team. Our projects span multiple organizations and require coordination of experimentation, economic and causal analysis, and building predictive machine learning models. A successful candidate will be a problem solver who enjoys diving into data, is excited by difficult modeling challenges, is motivated to build something that will eventually become a production software system, and possesses strong communication skills to effectively interface between technical and business teams. In this role, you will be a technical expert with massive impact. You will take the lead on developing advanced ML systems that are key to reaching our customers with the right recommendations at the right time. Your work will directly impact the success of Amazon's growing Devices business. You will work across diverse science/engineering/business teams. You will work on critical data science problems, building high quality, reliable, accurate, and consistent code sets that are aligned with our business needs. Key Performance Areas - Implement statistical or machine learning methods to solve specific business problems. - Improve upon existing methodologies by developing new data sources, testing model enhancements, and fine-tuning model parameters. - Directly contribute to development of modern automated recommendation systems - Build customer-facing reporting tools to provide insights and metrics to track model performance and explain variance - Collaborate with researchers, software developers, and business leaders to define product requirements, provide analytical support, and communicate feedback A day in the life You will work with other scientists, engineers, product managers, and marketers to develop new products that benefit our customers and help us reach our business goals. You will own solutions from end to end: conceptualization, prioritization, development, delivery, and productionalization. About the team We are a full stack science team that empowers product, marketing, and other business leaders to better understand customers who use Amazon devices, make decisions on product development or optimization, and measure the effectiveness of their efforts against our customer’s expectation. Our focus area is to build analytical frameworks that help the organization either access data, better understand the decisions customers are making and why, or assess customer satisfaction.