Tools for Generating Synthetic Data Helped Bootstrap Alexa’s New-Language Releases

In the past few weeks, Amazon announced versions of Alexa in three new languages: Hindi, U.S. Spanish, and Brazilian Portuguese.

Like all new-language launches, these addressed the problem of how to bootstrap the machine learning models that interpret customer requests, without the ability to learn from customer interactions. At a high level, the solution is to use synthetic data. These three locales were the first to benefit from two new in-house tools, developed by the Alexa AI team, that produce higher-quality synthetic data more efficiently.

Each new locale has its own speech recognition model, which converts an acoustic speech signal into text. But interpreting that text — determining what the customer wants Alexa to do — is the job of Alexa’s natural-language-understanding (NLU) systems.

When a new-language version of Alexa is under development, training data for its NLU systems is scarce. Alexa feature teams will propose some canonical examples of customer requests in the new language, which we refer to as “golden utterances”; training data from existing locales can be translated by machine translation systems; crowd workers may be recruited to generate sample texts; and some data may come from Cleo, an Alexa skill that allows multilingual customers to help train new-language models by responding to voice prompts with open-form utterances.

Even when data from all these sources is available, however, it’s sometimes not enough to train a reliable NLU model. The new bootstrapping tools, from Alexa AI’s Applied Modeling and Data Science group, treat the available sample utterances as templates and generate new data by combining and varying those templates.

One of the tools, which uses a technique called grammar induction, analyzes a handful of golden utterances to learn general syntactic and semantic patterns. From those patterns, it produces a series of rewrite expressions that can generate thousands of new, similar sentences. The other tool, guided resampling, generates new sentences by recombining words and phrases from examples in the available data. Guided resampling concentrates on optimizing the volume and distribution of sentence types, to maximize the accuracy of the resulting NLU models.

Rules of Grammar

Grammars have been a tool in Alexa’s NLU toolkit since well before the first Echo device shipped. A grammar is a set of rewrite rules for varying basic template sentences through word insertions, deletions, and substitutions.

Below is a very simple grammar, which models requests to play either pop or rock music, with or without the modifiers “more” and “some”. Below the rules of the grammar is a diagram of a computational system (a finite-state transducer, or FST) that implements them.

A toy grammar, which can model requests to play pop or rock music, with or without the modifiers “some” or “more”, and a diagram of the resulting finite-state transducer. The question mark indicates that the some_more variable is optional.
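Since the grammar itself is shown only in the figure, here is a minimal Python sketch of what such a toy grammar could look like and how it expands into sentences. Only the some_more variable and the question-mark convention come from the caption above; the other rule names, the dictionary layout, and the exact sentence template ("play ... music") are assumptions made for illustration, and the sketch enumerates sentences directly rather than compiling an FST.

```python
from itertools import product

# Hypothetical toy grammar. A trailing "?" marks an optional symbol,
# mirroring the question-mark convention in the caption.
GRAMMAR = {
    "request": [["play", "some_more?", "genre", "music"]],
    "some_more": [["some"], ["more"]],
    "genre": [["pop"], ["rock"]],
}

def expand(symbol, grammar):
    """Return every word sequence that `symbol` can rewrite to."""
    optional = symbol.endswith("?")
    name = symbol[:-1] if optional else symbol
    if name in grammar:
        expansions = []
        for alternative in grammar[name]:
            parts = [expand(s, grammar) for s in alternative]
            for combo in product(*parts):
                expansions.append([w for part in combo for w in part])
    else:
        expansions = [[name]]  # a terminal: the word itself
    # An optional symbol may also rewrite to nothing.
    return ([[]] + expansions) if optional else expansions

def generate(grammar, start="request"):
    """Enumerate all sentences the grammar produces from `start`."""
    return [" ".join(words) for words in expand(start, grammar)]

sentences = generate(GRAMMAR)
```

With this grammar, `sentences` holds six requests, from "play pop music" to "play more rock music".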

Given a list of, say, 50 golden utterances, a computational linguist could probably generate a representative grammar in a day, and it could be operationalized by the end of the following day. With the Applied Modeling and Data Science (AMDS) group’s grammar induction tool, that whole process takes seconds.

AMDS research scientists Ge Yu and Chris Hench and language engineer Zac Smith experimented with a neural network that learned to produce grammars from golden utterances. But they found that an alternative approach, called Bayesian model merging, offered similar performance with advantages in reproducibility and iteration speed.

The resulting system identifies linguistic patterns in lists of golden utterances and uses them to generate candidate rules for varying sentence templates. For instance, if two words (say, “pop” and “rock”) consistently occur in similar syntactic positions, but the phrasing around them varies, then one candidate rule will be that (in some defined contexts) “pop” and “rock” are interchangeable.
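The candidate-generation step can be illustrated with a short Python sketch. It is a deliberate simplification and not the AMDS implementation: it treats two words as candidates for interchangeability only when they fill the same position in otherwise identical utterances, whereas the article describes a looser notion of "similar syntactic positions". The function name and data layout are hypothetical.

```python
from collections import defaultdict

def candidate_substitutions(utterances):
    """Group words that fill the same gap in otherwise identical utterances."""
    frames = defaultdict(set)
    for utterance in utterances:
        tokens = utterance.split()
        for i, token in enumerate(tokens):
            # The frame is the utterance with position i left open.
            frame = (tuple(tokens[:i]), tuple(tokens[i + 1:]))
            frames[frame].add(token)
    # Any frame filled by more than one word yields a candidate rule.
    return {frozenset(words) for words in frames.values() if len(words) > 1}

golden = [
    "play pop music",
    "play rock music",
    "play some pop",
    "play some rock",
]
candidates = candidate_substitutions(golden)
```

Here "pop" and "rock" share two different frames, so the only candidate rule is that the two words are interchangeable.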

After exhaustively listing candidate rules, the system uses Bayesian probability to calculate which rule accounts for the most variance in the sample data, without overgeneralizing or introducing inconsistencies. That rule becomes an eligible variable in further iterations of the process, which recursively repeats until the grammar is optimized.

Crucially, the tool’s method for creating substitution rules allows it to take advantage of existing catalogues of frequently occurring terms or phrases. If, for instance, the golden utterances were sports related, and the grammar induction tool determined that the words “Celtics” and “Lakers” were interchangeable, it would also conclude that they were interchangeable with “Warriors”, “Spurs”, “Knicks”, and all the other names of NBA teams in a standard catalogue used by a variety of Alexa services.
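As a rough sketch of the catalogue step, the following Python function widens an induced class of interchangeable words to a full catalogue when the two overlap. The catalogue name, the overlap criterion, and the `min_overlap` parameter are assumptions for illustration; the article does not say how the tool matches induced classes to catalogues.

```python
def expand_with_catalogues(word_class, catalogues, min_overlap=2):
    """Widen an induced word class to a whole catalogue if they overlap enough.

    Returns (catalogue_name, expanded_class); the name is None when no
    catalogue matches and the class is returned unchanged.
    """
    word_class = set(word_class)
    for name, entries in catalogues.items():
        if len(word_class & entries) >= min_overlap:
            return name, word_class | entries
    return None, word_class

# Hypothetical catalogue with a handful of entries.
catalogues = {"NBATeam": {"Celtics", "Lakers", "Warriors", "Spurs", "Knicks"}}
name, expanded = expand_with_catalogues({"Celtics", "Lakers"}, catalogues)
```

In this example the induced class {"Celtics", "Lakers"} is matched to the hypothetical NBATeam catalogue, so "Warriors", "Spurs", and "Knicks" become valid substitutions as well.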

From a list of 50 or 60 golden utterances, the grammar induction tool might extract 100-odd rules that can generate several thousand sentences of training data, all in a matter of seconds.

Safe Swaps

The guided-resampling tool also uses catalogues and existing examples to augment training data. Suppose that the available data contains the sentences “play Camila Cabello” and “can you play a song by Justin Bieber?”, which have been annotated to indicate that “Camila Cabello” and “Justin Bieber” are of the type ArtistName. In NLU parlance, ArtistName is a slot type, and “Camila Cabello” and “Justin Bieber” are slot values.

The guided-resampling tool generates additional training examples by swapping out slot values — producing, for instance, “play Justin Bieber” and “can you play a song by Camila Cabello?” Adding the vast Amazon Music databases of artist names and song titles to the mix produces many additional thousands of training sentences.
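The basic swap can be sketched in a few lines of Python. The representation of an annotated utterance as a (template, slot type, slot value) triple is an assumption chosen for brevity, and the sketch handles only one slot per utterance.

```python
def resample(annotated):
    """Fill every template with every observed value of its slot type."""
    values = {}
    for _template, slot_type, value in annotated:
        bucket = values.setdefault(slot_type, [])
        if value not in bucket:
            bucket.append(value)
    generated = []
    for template, slot_type, _value in annotated:
        for value in values[slot_type]:
            generated.append(template.format(**{slot_type: value}))
    return generated

annotated = [
    ("play {ArtistName}", "ArtistName", "Camila Cabello"),
    ("can you play a song by {ArtistName}?", "ArtistName", "Justin Bieber"),
]
sentences = resample(annotated)
```

Two annotated examples yield four sentences, including the two swapped variants quoted above; appending catalogue entries to the value list would multiply the output in the same way.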

Blindly swapping slot values can have unintended consequences, so the question is which slot values can be safely swapped. For example, in the sentences “play jazz music” and “read detective books”, both “jazz” and “detective” would be labeled with the slot type GenreName. But customers are unlikely to ask Alexa to play “detective music”, and unnatural training data would degrade the performance of the resulting NLU model.

AMDS’s Olga Golovneva, a research scientist, and Christopher DiPersio, a language engineer, used the Jaccard index — which measures the overlap between two sets — to evaluate pairwise similarity between slot contents in different types of requests. On that basis, they defined a threshold for valid slot mixing.
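The Jaccard index of two sets is the size of their intersection divided by the size of their union. The Python sketch below applies it to the slot values seen in two kinds of request. The threshold value and the sample genre lists are hypothetical, since the article does not report the threshold the team chose.

```python
def jaccard(a, b):
    """Jaccard index: |A intersect B| / |A union B|, or 0.0 if both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def can_mix(values_a, values_b, threshold=0.5):
    """Allow swapping slot values between two request types only if
    their observed value sets overlap enough (threshold is an assumption)."""
    return jaccard(values_a, values_b) >= threshold

play_music_genres = {"jazz", "rock", "pop", "country"}
radio_genres = {"jazz", "rock", "pop", "blues"}
book_genres = {"detective", "romance", "fantasy"}

music_vs_radio = jaccard(play_music_genres, radio_genres)   # 3 shared / 5 total
music_vs_books = jaccard(play_music_genres, book_genres)    # nothing shared
```

Under these made-up lists, the two music-related request types share enough values to be mixed, while music and book genres do not, which blocks "detective music".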

Quantifying Complexity

As there are many different ways to request music, another vital question is how many variations of each template to generate in order to produce realistic training data. One answer is simply to follow the data distributions from languages that Alexa already supports.

Comparing distributions of sentence types across languages requires representing customer requests in a more abstract form. We can encode a sentence like “play Camila Cabello” according to the word pattern other + ArtistName, where other represents the verb “play”, and ArtistName represents “Camila Cabello”. For “play ‘Havana’ by Camila Cabello”, the pattern would be other + SongName + other + ArtistName. To abstract away from syntactic differences between languages, we can condense this pattern further to other + ArtistName + SongName, which represents only the semantic concepts included in the request.
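Both levels of abstraction can be sketched in Python. The input format, a list of (text, slot type) spans with None for unannotated words, is an assumption, as is the choice to list slot types alphabetically in the condensed pattern; that ordering happens to reproduce the example above but is not stated in the article.

```python
def word_pattern(spans):
    """Replace each span by its slot type, or "other" if unannotated."""
    labels = []
    for _text, slot_type in spans:
        label = slot_type or "other"
        # Merge runs of unannotated words into a single "other".
        if label == "other" and labels and labels[-1] == "other":
            continue
        labels.append(label)
    return " + ".join(labels)

def semantic_pattern(spans):
    """Keep only which concepts occur, ignoring their order in the sentence."""
    slot_types = sorted({s for _text, s in spans if s})
    has_other = any(s is None for _text, s in spans)
    return " + ".join((["other"] if has_other else []) + slot_types)

request = [
    ("play", None),
    ("Havana", "SongName"),
    ("by", None),
    ("Camila Cabello", "ArtistName"),
]
surface = word_pattern(request)        # other + SongName + other + ArtistName
condensed = semantic_pattern(request)  # other + ArtistName + SongName
```

Because the condensed form ignores word order, a request in a language that puts the artist before the song maps to the same pattern.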

Given this level of abstraction, Golovneva and DiPersio investigated several alternative techniques for determining the semantic distributions of synthetic data.

Using Shannon entropy, which is a measure of uncertainty, Golovneva and DiPersio calculated the complexity of semantic sentence patterns, focusing on slots and their combinations. Entropy for semantic slots takes into consideration how many different values each slot might have, as well as how frequent each slot is in the data set overall. For example, the slot SongName occurs very frequently in music requests, and its potential values (different song titles) number in the millions; in contrast, GenreName also occurs frequently in music requests, but its set of possible values (music genres) is fairly small.
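For reference, the Shannon entropy of a slot whose values occur with probabilities p_i is H = -Σ p_i log2 p_i. The Python sketch below computes it from observed slot values. The sample values are invented, and the article does not specify exactly how value entropy and slot frequency are combined, so this covers only the first ingredient.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Invented samples: song titles rarely repeat, genres repeat often.
song_names = ["Havana", "Vogue", "Yesterday", "Imagine"]
genre_names = ["pop", "pop", "rock", "rock"]

song_entropy = shannon_entropy(song_names)    # four equally likely values
genre_entropy = shannon_entropy(genre_names)  # two equally likely values
```

Four equally likely values give 2 bits and two equally likely values give 1 bit, which matches the intuition that a slot with millions of possible values is far more uncertain than one with a small, fixed set.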

Customer requests to Alexa often include multiple slots (such as “play ‘Vogue’|SongName by Madonna|ArtistName” or “set a daily|RecurrenceType reminder to {walk the dog}|ReminderContent for {seven a.m.}|Time”), which increases the pattern complexity further.

In their experiments, Golovneva and DiPersio used the entropy measures from slot distributions in the data and the complexity of slot combinations to determine the optimal distribution of semantic patterns in synthetic training data. This results in proportionally larger training sets for more complex patterns than for less complex ones. NLU models trained on such data sets achieved higher performance than those trained on data sets that merely “borrowed” slot distributions from existing languages.
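One simple way to picture "proportionally larger training sets for more complex patterns" is to split a fixed budget of synthetic sentences in proportion to a per-pattern complexity score. The Python sketch below does exactly that. The scores, the budget, and the strictly linear allocation are all assumptions; the article does not give the formula the team used.

```python
def allocate(pattern_complexity, budget):
    """Split `budget` synthetic sentences across patterns, proportionally
    to each pattern's complexity score (scores are assumed inputs)."""
    total = sum(pattern_complexity.values())
    if total == 0:
        return {pattern: 0 for pattern in pattern_complexity}
    return {
        pattern: round(budget * score / total)
        for pattern, score in pattern_complexity.items()
    }

# Hypothetical complexity scores for two semantic patterns.
complexity = {
    "other + GenreName": 1.0,
    "other + ArtistName + SongName": 3.0,
}
plan = allocate(complexity, budget=1000)
```

With these made-up scores, the two-slot pattern receives three times as many synthetic sentences as the single-slot one.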

Alexa is always getting smarter, and these and other innovations from AMDS researchers help ensure the best experience possible when Alexa launches in a new locale.

Acknowledgments: Ge Yu, Chris Hench, Zac Smith, Olga Golovneva, Christopher DiPersio, Karolina Owczarzak, Sreekar Bhaviripudi, Andrew Turner

About the Author
Janet Slifka is director of research science in Alexa AI’s Natural Understanding group and leads the Applied Modeling and Data Science team.

US, WA, Seattle
The GSF (Global Specialty Fulfillment) organization leads the innovation of Amazon’s ultra-fast fulfillment initiatives. We are an Operations org that hires and manages associates for ultra-fast businesses such as online grocery delivery, sub-same day delivery etc. GSFTech sits within GSF with the mission to build world-class automated Science-Tech products that enable ultra-fast delivery speeds for Amazon customers and job market opportunities for Amazon associates. Our key vision is to transform the online experience. We’re growing in scale and volume, by orders of magnitude. We are a team of passionate tech builders who work endlessly to make life better for our associates through amazing, thoughtful, and creative new scheduling experiences. To succeed, we need senior technical leaders to forge a path into the future by building innovative, maintainable, and scalable systems.At Amazon, we are constantly inventing and re-inventing to be the most associate-centric company in the world. To get there, we need exceptionally talented, bright, and driven people. Amazon is one of the most recognizable brand names in the world and we distribute millions of products each year to our loyal customers.We are looking for an Operational Research Scientist who will be driving optimization initiatives, responsible for building models and prototypes for labor planning systems, and will require close collaboration with other scientists on the team that are developing state-of-the-art ML and forecasting algorithms to scale. Common questions include: when to post shifts given changing demand and associate acceptance? How to optimally assign shifts to associates? 
This team plays a significant role in various stages of the innovation pipeline from identifying business needs, developing new algorithms, prototyping/simulation, to implementation by working closely with colleagues in engineering, product management, operations, retail and finance.As a Senior member of the scientist team, you will play an integral part on our Operations org with the following technical and leadership responsibilities:· Interact with engineering, operations, science and business teams to develop an understanding and domain knowledge of processes, system structures, and business requirements· Apply domain knowledge and business judgment to identify opportunities and quantify the impact aligning research direction to business requirements and make the right judgment on research project prioritization· Develop scalable models to derive optimal or near-optimal solutions to existing and new scheduling challenges· Create prototypes and simulations to test devised solutions· Advocate technical solutions to business stakeholders, engineering teams, as well as executive-level decision makers· Work closely with engineers to integrate prototypes into production system· Create policy evaluation methods to track the actual performance of devised solutions in production systems, identify areas with potential for improvement and work with internal teams to improve the solution with new features· Mentor and supervise the work of junior scientists on the team for technical development and their career development and growth· Present business cases and document models, analyses, and their results in order to influence important decisions
US, WA, Seattle
The GSF (Global Specialty Fulfillment) organization leads the innovation of Amazon’s ultra-fast fulfillment initiatives. We are an Operations org that hires and manages associates for ultra-fast businesses such as online grocery delivery, sub-same day delivery etc. GSFTech sits within GSF with the mission to build world-class automated Science-Tech products that enable ultra-fast delivery speeds for Amazon customers and job market opportunities for Amazon associates. Our key vision is to transform the online experience. We’re growing in scale and volume, by orders of magnitude. We are a team of passionate tech builders who work endlessly to make life better for our associates through amazing, thoughtful, and creative new scheduling experiences. To succeed, we need senior technical leaders to forge a path into the future by building innovative, maintainable, and scalable systems.At Amazon, we are constantly inventing and re-inventing to be the most associate-centric company in the world. To get there, we need exceptionally talented, bright, and driven people. Amazon is one of the most recognizable brand names in the world and we distribute millions of products each year to our loyal customers.We are looking for a Senior Applied Scientist who will be the science lead for all key ML and forecasting initiatives, responsible for building models and prototypes for labor planning systems, and will require close collaboration with other scientists on the team that are developing state-of-the-art optimization algorithms to scale. 
This team plays a significant role in various stages of the innovation pipeline from identifying business needs, developing new algorithms, prototyping/simulation, to implementation by working closely with colleagues in engineering, product management, operations, retail and finance.As a Senior member of the scientist team, you will play an integral part on our Operations org with the following technical and leadership responsibilities:· Help the team define the forward looking Science roadmap and vision by helping to identify, disambiguate and seek out new opportunities· Interact with engineering, operations, science and business teams to develop an understanding and domain knowledge of processes, system structures, and business requirements· Apply domain knowledge and business judgment to identify opportunities and quantify the impact aligning research direction to business requirements and make the right judgment on research project prioritization· Develop scalable models to derive optimal or near-optimal solutions to existing and new scheduling challenges· Create prototypes and simulations to test devised solutions· Advocate technical solutions to business stakeholders, engineering teams, as well as executive-level decision makers· Work closely with engineers to integrate prototypes into production system· Create policy evaluation methods to track the actual performance of devised solutions in production systems, identify areas with potential for improvement and work with internal teams to improve the solution with new features· Mentor and supervise the work of junior scientists on the team for technical development and their career development and growth· Present business cases and document models, analyses, and their results in order to influence important decisions
CA, ON, Toronto
Job summaryAmazon Advertising is one of Amazon's fastest growing and most profitable businesses. As a core product offering within our advertising portfolio, Sponsored Products (SP) helps merchants, retail vendors, and brand owners succeed via native advertising, which grows incremental sales of their products sold through Amazon. The SP team's primary goals are to help shoppers discover new products they love, be the most efficient way for advertisers to meet their business objectives, and build a sustainable business that continuously innovates on behalf of customers. Our products and solutions are strategically important to enable our Retail and Marketplace businesses to drive long-term growth. We deliver billions of ad impressions and millions of clicks and break fresh ground in product and technical innovations every day!Sponsored Products helps merchants, retail vendors, and brand owners succeed via native advertising that grows incremental sales of their products sold through Amazon. The Sponsored Products Ad Marketplace organization optimizes the systems and ad placements to match advertiser demand with publisher supply using a combination of machine learning, big data analytics, ultra-low latency high-volume engineering systems, and quantitative product focus. Our systems and algorithms operate on one of the world's largest product catalogs, matching shoppers with products - with a high relevance bar and strict latency constraints. Our goals are to help buyers discover new products they love, be the most efficient way for advertisers to meet their business objectives, and to build a major, sustainable business that helps Amazon continuously innovate on behalf of all customers.As an Applied Scientist for the Sponsored Products Detail Page Allocation and Pricing team, you develop systems which make the final decision on which ads to show, where to place them on the page and how many ads to place. 
This also includes selection of various themes that would appear in detail pages. This is a challenging technical and business problem, which requires us to balance the interests of advertisers, shoppers, and Amazon. You'll develop a data-driven product strategy to define the right quantitative measures of shopper impact, using this to evaluate decisions and opportunities. You'll balance a portfolio of pragmatic and long-term investments to drive long term growth of the ads and retail businesses. You'll develop real-time algorithms to allocate billions of ads per day in advertising auctions.As an Applied Scientist on this team you will:Build machine learning models, perform proof-of-concept, experiment, optimize, and deploy your models into production.Run A/B experiments, gather data, and perform statistical analysis.Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation and serving.Work closely with software engineers to assist in productionizing your ML models.Research new machine learning approaches.Why you love this opportunityAmazon is investing heavily in building a world-class advertising business. This team is responsible for defining and delivering a collection of advertising products that drive discovery and sales. Our solutions generate billions in revenue and drive long-term growth for Amazon’s Retail and Marketplace businesses. We deliver billions of ad impressions, millions of clicks daily, and break fresh ground to create world-class products. 
We are highly motivated, collaborative, and fun-loving team with an entrepreneurial spirit - with a broad mandate to experiment and innovate.Impact and Career GrowthYou will invent new experiences and influence customer-facing shopping experiences to help suppliers grow their retail business and the auction dynamics that leverage native advertising; this is your opportunity to work within the fastest-growing businesses across all of Amazon! Define a long-term science vision for our advertising business, driven fundamentally from our customers' needs, translating that direction into specific plans for research and applied scientists, as well as engineering and product teams. This role combines science leadership, organizational ability, technical strength, product focus, and business understanding.Team video https://youtu.be/zD_6Lzw8raE
US, WA, Seattle
Job summaryDo you want to join the Alexa Artificial Intelligence (AI) team - the science team behind Amazon’s intelligence voice assistance system? Do you want to utilize cutting-edge deep-learning and machine learning algorithms to delight millions of Alexa users around the world?If your answers to these questions are “yes”, then come join the Alexa AI team, which is in charge of improving Alexa user satisfaction through continuous closed-loop self-learning. The team owns the modules that reduce user perceived defects through automatic defect detection and label generation.Key job responsibilitiesYou will be expected to:· Analyze, understand, and model dialogue context based on large scale speech and dialogue data;· Create and innovate deep learning and/or NLP based algorithms for improving accuracy of Alexa's speech recognition and natural language understanding through contextual modeling;· Perform model/data analysis and monitor user-experienced based metrics through online A/B testing;· Research and implement novel deep learning and NLP algorithms and models.A day in the life· Work collaboratively with scientists and developers to design and implement automated, scalable NLP/ML/IR models for accessing and presenting information· Drive scalable solutions from the business, to prototyping, production testing and through engineering directly to production· Drive best practices on the team, deal with ambiguity and competing objectives, and mentor and guide junior members to achieve their career growth potential.About the teamThe Alexa AI team is in charge of improving Alexa user satisfaction through continuous closed-loop self-learning. 
The team owns the modules that reduce user perceived defects through automatic defect detection and label generation.You will be working alongside a team of experienced deep learning and NLP scientists and engineers to create deep neural network based contextual dialogue modeling on tasks such as speech translation, natural language understanding, etc.
GB, London
Job summaryCome build the future of entertainment with us. Are you interested in shaping the future of movies and television? Do you want to define the next generation of how and what Amazon customers are watching?Prime Video is a premium streaming service that offers customers a vast collection of TV shows and movies - all with the ease of finding what they love to watch in one place. We offer customers thousands of popular movies and TV shows from Originals and Exclusive content to exciting live sports events. We also offer our members the opportunity to subscribe to add-on channels which they can cancel at anytime and to rent or buy new release movies and TV box sets on the Prime Video Store. Prime Video is a fast-paced, growth business - available in over 240 countries and territories worldwide. The team works in a dynamic environment where innovating on behalf of our customers is at the heart of everything we do. If this sounds exciting to you, please read on.As part of the Prime Video Automated Excellence organization, the Automated Reasoning team applies deep and cutting-edge automated reasoning techniques to detect defects automatically in Prime Video’s core systems and device-level code. The tools we build are mission-critical to the software development and release cycle of many Prime Video engineering organizations, and will represent a huge step forward in the sophistication of our approach to automated software quality. Your work on this team will help us address a new dimension of scale our business faces as we deliver our applications on an ever-expanding set of client devices.Key job responsibilitiesYou will have the opportunity to apply your deep knowledge of automated reasoning techniques, such as static analysis, formal verification, symbolic execution, etc., to concrete problems our product and engineering teams face on a daily basis. 
You will collaborate with team members to design and deliver enterprise-scale systems that will be used by both internal and external customers. You will have the opportunity to analyse and verify code to solve real-world problems and translate business and functional requirements into quick prototypes or proofs of concept. You will help set and continuously evolve a culture of innovation and curiosity that helps us find and solve our customers’ biggest problems.About the teamTo help a growing organization quickly deliver more features to Prime Video customers, Prime Video’s Automated Excellence organization is innovating on behalf of our global software development team consisting of thousands of engineers. We build services and utilities that make developer’s lives easier and more productive, and that help them deliver at higher levels of quality.
US, CA, Sunnyvale
Job summaryThe Amazon Alexa app is a companion to Alexa devices for setup, remote control, and enhanced features. The Alexa app understands a customer’s habits, preferences and delivers a personalized experience to help them manage their day by providing relevant information as customers want it. We believe voice is the most natural user interface for interacting with technology across many domains; we are inventing the future. As voice-enabled technology becomes increasingly advanced, consumers are demanding more from what their voice products can do. We’re looking for Scientists who are passionate about innovating on behalf of customers, demonstrate a high degree of product ownership, and want to have fun while they make history.As an Senior Data Scientist, you will help build a production scaled personalized recommendation and lead the team to build Machine Learning (ML) and Deep Learning (DL) models to help derive business value and new insights through the adoption of Artificial Intelligence (AI).Key job responsibilitiesThe successful candidate will be responsible for distilling user data insights for ML science applications and influence business decision with data-driven approach to increase Alexa mobile engagement and growth. 
A successful candidate will be a person who enjoys diving deep into data, doing analysis, discovering root causes, and designing long-term solutions.· Define the long-term development, science and business strategies for the team.· Expertise in the areas of data science, machine learning and statistics.· Translate business needs into advanced analytics and machine learning models and provide strong algorithm and coding execution and delivery of Machine Learning & Artificial Intelligence.· Work closely with the engineers to architect and develop the best technical design and approach.· Being able to dive a ML / DL project from beginning to end, including understanding the business need, aggregating data, exploring data, building & validating predictive models, and deploying completed models to deliver business impact to the organization.· Analyze, extract, normalize, and label relevant data.· Work with Engineers to help our customers operationalize models after they are built.A day in the life· Design and review mobile experiments for growth and engagement· Build statistical models and generate data insights to understand mobile growth and retention· Feature engineering to improve ML model performance.· Analyze, extract, normalize, and label relevant data.· Work with Engineers to deploy applications to production· Work with product manager to convert business problems to science problems and define the solutions.About the teamAlexa Mobile Intelligence team is motivated to make Alexa mobile app being the best intelligent assistant and providing personalized relevant features and content by understanding customers' habits, preferences, hence will reach high growth and retention for the app.
DE, BE, Berlin
Are you excited about building Robotics AI technology that works seamlessly with and around people? The Robotics AI team at Amazon is building high-performance, real-time robotic systems that can perceive, learn and act intelligently alongside humans, at Amazon scale.To this end, we are seeking an experienced Applied Science Manager who is interested in leading a team of Applied Scientist to bring Computer Vision innovations to Fulfillment Centers. We work on machine learning, planning, control, simulation and computer vision applied to robotics, particularly to manipulation and item understanding. We are expanding our Berlin team to meet the huge application demand in Amazon.Key responsibilities:· Lead a team of Computer Vision experts and oversee research and development projects at various stages ranging from initial exploration to deployment into production systems.· Rapidly design, prototype and test many possible hypotheses in a high-ambiguity environment, making use of both quantitative and business judgment.· Collaborate with software engineering teams to integrate successful experiments into large scale, highly complex production services.· Report results in a scientifically rigorous way.· Collaborate closely with stakeholders on developing systems from prototyping to production level.
US, WA, Seattle
Come join a team to work in the intersection of Machine Learning, Observability, Cloud, Big Data and Open Source!The AWS CloudWatch Predictions team produces Anomaly Detection solution that gives customers actionable visibility into the health of their applications and services by leveraging machine learning technologies. Our service continuously analyzes system and application metrics, detects and surfaces anomalies without requiring user intervention, enables AWS customers across the world to monitor and act on the dynamic nature of system and application behaviors.We are looking for applied scientists to help us lend meaning to vast amounts of time series data and delight our customers by finding the reasonable answers to the right problems. If you enjoyed your studies of Time Series Modeling and Predicting, Anomaly Detection Algorithms, Data Smoothing, Outlier Filtering, and innovating new features that can make a huge impact on the customer experience excites you, then we've got a good home for you here.You'll have a ground floor opportunity to work on cutting edge ways for online time series forecasting and anomaly detection, and improve the the use of advanced Machine Learning on massive scale datasets. You'll join a team of veteran service engineers who are focused on highly stable, low operational burden, continuously deployed software and help lead a new quantitative effort. It'll be an opportunity to not just change how the world understands their compute resources, but the opportunity to build world class distributed system software engineering skills as well.