Machine translation accelerates how Alexa learns new languages

As Alexa-enabled devices continue to expand into new countries, we propose an approach for quickly bootstrapping machine-learning models in new languages, with the aim of bringing Alexa to new customers around the world more efficiently. We describe our approach in a paper we’re presenting next week at the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT).

Building a natural-language-understanding (NLU) model from scratch requires gathering and annotating huge sets of training data. That is a significant time burden for both annotators and scientists, and the procedure has to be repeated for every new language, so it does not scale. An obvious solution is to try to leverage the large data sets that have been used to train NLU models in other languages. In this work, we use machine translation (MT) to translate existing data sources into a target language and then use the translated data to bootstrap an NLU system.

A common way to begin training an NLU model in a new language is to use a formal grammar, a set of syntactic and semantic rules that, combined with a lexicon of words tagged with semantic information, can generate an arbitrary number of syntactically and semantically valid sentences. Although less time-consuming than annotating huge data sets, this does require language specialists to build grammars that offer good coverage for the target application.
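To make the grammar-based starting point concrete, here is a minimal sketch of how templates plus a tagged lexicon can be expanded into labeled utterances. The templates, intent names, slot names, and the `generate` helper are all hypothetical illustrations, not the grammars used for Alexa.

```python
from itertools import product
from string import Formatter

# Hypothetical toy grammar: each intent maps to templates with {slot} placeholders.
TEMPLATES = {
    "PlayMusic": ["spiele {artist}", "spiele musik von {artist}"],
    "GetWeather": ["wie ist das wetter in {city}"],
}

# Hypothetical lexicon: slot type -> words tagged with that semantic type.
LEXICON = {
    "artist": ["Queen", "Nena"],
    "city": ["Berlin", "Hamburg"],
}


def generate(templates, lexicon):
    """Yield (utterance, intent, slots) for every template/lexicon combination."""
    for intent, patterns in templates.items():
        for pattern in patterns:
            # Collect the slot names that appear in this template.
            slot_names = [name for _, name, _, _ in Formatter().parse(pattern) if name]
            # Fill the slots with every combination of lexicon entries.
            for values in product(*(lexicon[name] for name in slot_names)):
                slots = dict(zip(slot_names, values))
                yield pattern.format(**slots), intent, slots


examples = list(generate(TEMPLATES, LEXICON))
for utterance, intent, slots in examples:
    print(intent, "|", utterance, "|", slots)
```

Even this toy grammar shows the trade-off described above: the data comes with annotations for free, but its coverage is limited to whatever templates and lexicon entries a specialist has written.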

Once this first system reaches a certain performance threshold, it can be shared with beta users. Beta users’ queries will, of course, better represent those of real users than artificially generated data will. All existing data sources are then used to train the system until it reaches a new, higher performance threshold, at which point it is made generally available to customers. Once customers begin using the system, their interactions with it generate even more training data.

However, it can take a significant amount of time and annotation effort to get enough real training data to achieve the type of feature coverage that Alexa customers in new languages will expect.

Machine translation could be a useful tool for quickly extending NLU systems to new languages and providing coverage of all Alexa features available in already supported languages. In this paper, we use a large data set of English utterances to bootstrap a German NLU system.

In addition, we explore ways to automatically identify “good” translations, i.e., the ones that improve NLU performance. First, we investigate filtering based on MT quality, rating translations according to the probability scores generated by the MT model. Next, we investigate filtering based on semantic accuracy. To measure this, we take the machine-translated text, automatically translate it back into the original language, and then rerun the NLU system on the result. The translation is scored according to how well the new semantic tags line up with those of the original.
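The two filters can be sketched as follows. This is an illustration under stated assumptions, not the implementation from the paper: the `Candidate` fields, the thresholds, the choice to apply both filters together, and the slot-overlap scoring are all assumptions, and `toy_back_translate` and `toy_nlu` are stand-ins for a real MT system and a real source-language NLU model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

NluResult = Tuple[str, Dict[str, str]]  # (intent, slots)


@dataclass
class Candidate:
    translation: str       # machine-translated utterance
    mt_score: float        # MT model score (assumed: log-probability, higher is better)
    intent: str            # intent annotated on the original utterance
    slots: Dict[str, str]  # slot annotations on the original utterance


def semantic_score(c: Candidate,
                   back_translate: Callable[[str], str],
                   nlu: Callable[[str], NluResult]) -> float:
    """Back-translate, rerun NLU, and measure agreement with the original labels."""
    intent, slots = nlu(back_translate(c.translation))
    if intent != c.intent:
        return 0.0
    if not c.slots:
        return 1.0
    matched = sum(1 for name, value in c.slots.items() if slots.get(name) == value)
    return matched / len(c.slots)


def filter_candidates(candidates: List[Candidate],
                      back_translate: Callable[[str], str],
                      nlu: Callable[[str], NluResult],
                      min_mt_score: float,
                      min_semantic: float) -> List[Candidate]:
    """Keep translations that pass both the MT-quality and the semantic filter."""
    return [c for c in candidates
            if c.mt_score >= min_mt_score
            and semantic_score(c, back_translate, nlu) >= min_semantic]


# Toy stand-ins so the sketch runs end to end.
TOY_BACK = {
    "spiele musik von Queen": "play music by Queen",
    "spiele musik von Königin": "play music by the queen",
}


def toy_back_translate(text: str) -> str:
    return TOY_BACK.get(text, text)


def toy_nlu(text: str) -> NluResult:
    prefix = "play music by "
    if text.startswith(prefix):
        return "PlayMusic", {"artist": text[len(prefix):]}
    return "Unknown", {}


labels = {"intent": "PlayMusic", "slots": {"artist": "Queen"}}
candidates = [
    Candidate("spiele musik von Queen", -0.2, **labels),
    Candidate("spiele musik von Königin", -0.3, **labels),  # artist slot is lost
    Candidate("musik Queen", -4.0, **labels),               # low MT score
]
kept = filter_candidates(candidates, toy_back_translate, toy_nlu,
                         min_mt_score=-1.0, min_semantic=1.0)
print([c.translation for c in kept])
```

In this toy run the second candidate has a good MT score but fails the round trip, because the artist name no longer matches the original annotation. That is the kind of error the semantic filter is meant to catch.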

Lastly, we apply some language-specific post-processing to the translation output. Specifically, we use target catalogues to resample the translated data. For instance, we automatically substitute the names of German cities for those of American cities mentioned in the original utterances, to better simulate data from German users. In addition, we choose to leave certain types of words, such as song and artist names, untranslated. For example, if the original utterance was “Play music by Queen,” the system would not translate the artist name “Queen” to the German word “Königin”.
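A rough sketch of this post-processing is shown below. It simplifies heavily: it translates an utterance segment by segment, whereas a real MT system translates whole sentences, and the catalogue contents, slot names, `localize` helper, and `toy_translate` lookup are hypothetical.

```python
import random

# Slot types whose values are kept verbatim (assumed names).
DO_NOT_TRANSLATE = {"artist", "song"}

# Hypothetical target-language catalogue used for resampling.
GERMAN_CATALOGUE = {"city": ["Berlin", "Hamburg", "München"]}

# Toy phrase table standing in for a real MT system.
TOY_MT = {
    "play music by": "spiele musik von",
    "what is the weather in": "wie ist das wetter in",
}


def toy_translate(text):
    return TOY_MT.get(text, text)


def localize(segments, translate, catalogue, rng):
    """Build a target-language utterance from (text, slot_type_or_None) segments."""
    out = []
    for text, slot in segments:
        if slot in catalogue:
            # Resample the slot value from the target-language catalogue.
            out.append(rng.choice(catalogue[slot]))
        elif slot in DO_NOT_TRANSLATE:
            # Leave names such as artists and songs untouched.
            out.append(text)
        else:
            out.append(translate(text))
    return " ".join(out)


rng = random.Random(0)
music = localize([("play music by", None), ("Queen", "artist")],
                 toy_translate, GERMAN_CATALOGUE, rng)
weather = localize([("what is the weather in", None), ("Boston", "city")],
                   toy_translate, GERMAN_CATALOGUE, rng)
print(music)
print(weather)
```

The artist name survives unchanged, while the American city is replaced with a German one drawn from the catalogue, so the resulting utterance looks more like something a German user might say.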

In experiments we report in the paper, systems trained on MT data performed much better than those trained on grammar-generated data, and they even outperformed a system trained on 10,000 hand-annotated German utterances. The applied filtering and post-processing techniques improved results still further.

Overall, the work shows that the use of MT can shrink the first long phase of grammar generation and in-house data gathering for a new language. In addition, MT makes it possible to offer customers more features more rapidly, as data for existing features in all supported languages can be translated immediately for new languages.
