Tools for Generating Synthetic Data Helped Bootstrap Alexa’s New-Language Releases

In the past few weeks, Amazon announced versions of Alexa in three new languages: Hindi, U.S. Spanish, and Brazilian Portuguese.

Like all new-language launches, these addressed the problem of how to bootstrap the machine learning models that interpret customer requests, without the ability to learn from customer interactions. At a high level, the solution is to use synthetic data. These three locales were the first to benefit from two new in-house tools, developed by the Alexa AI team, that produce higher-quality synthetic data more efficiently.

Each new locale has its own speech recognition model, which converts an acoustic speech signal into text. But interpreting that text — determining what the customer wants Alexa to do — is the job of Alexa’s natural-language-understanding (NLU) systems.

When a new-language version of Alexa is under development, training data for its NLU systems is scarce. Alexa feature teams will propose some canonical examples of customer requests in the new language, which we refer to as “golden utterances”; training data from existing locales can be translated by machine translation systems; crowd workers may be recruited to generate sample texts; and some data may come from Cleo, an Alexa skill that allows multilingual customers to help train new-language models by responding to voice prompts with open-form utterances.

Even when data from all these sources is available, however, it’s sometimes not enough to train a reliable NLU model. The new bootstrapping tools, from Alexa AI’s Applied Modeling and Data Science group, treat the available sample utterances as templates and generate new data by combining and varying those templates.

One of the tools, which uses a technique called grammar induction, analyzes a handful of golden utterances to learn general syntactic and semantic patterns. From those patterns, it produces a series of rewrite expressions that can generate thousands of new, similar sentences. The other tool, guided resampling, generates new sentences by recombining words and phrases from examples in the available data. Guided resampling concentrates on optimizing the volume and distribution of sentence types, to maximize the accuracy of the resulting NLU models.

Rules of Grammar

Grammars have been a tool in Alexa’s NLU toolkit since well before the first Echo device shipped. A grammar is a set of rewrite rules for varying basic template sentences through word insertions, deletions, and substitutions.

Below is a very simple grammar, which models requests to play either pop or rock music, with or without the modifiers “more” and “some”. Below the rules of the grammar is a diagram of a computational system (a finite-state transducer, or FST) that implements them.

Figure: A toy grammar, which can model requests to play pop or rock music, with or without the modifiers “some” or “more”, and a diagram of the resulting finite-state transducer. The question mark indicates that the some_more variable is optional.
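
To make the toy grammar concrete, here is a minimal Python sketch that expands such a grammar into all the sentences it can generate. The rule names (`some_more`, `genre`), the template, and the trailing word "music" are assumptions reconstructed from the caption, not the original figure's exact rules, and plain enumeration stands in for the finite-state transducer.

```python
from itertools import product

# Hypothetical reconstruction of the toy grammar. Each variable lists its
# alternatives; the empty string models the optional ("?") modifier.
GRAMMAR = {
    "some_more": ["", "some", "more"],
    "genre": ["pop", "rock"],
}
# Template tokens are either literal words or names of grammar variables.
TEMPLATE = ["play", "some_more", "genre", "music"]


def expand(template, grammar):
    """Return every sentence that the template can generate."""
    # A literal word has exactly one alternative: itself.
    choices = [grammar.get(token, [token]) for token in template]
    # Drop empty strings so a skipped optional leaves no double space.
    return [" ".join(w for w in combo if w) for combo in product(*choices)]


sentences = expand(TEMPLATE, GRAMMAR)
for sentence in sentences:
    print(sentence)
```

With three choices for the modifier (including "none") and two genres, the sketch yields six sentences, which shows how a few rules multiply into many training examples.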

Given a list of, say, 50 golden utterances, a computational linguist could probably generate a representative grammar in a day, and it could be operationalized by the end of the following day. With the Applied Modeling and Data Science (AMDS) group’s grammar induction tool, that whole process takes seconds.

AMDS research scientists Ge Yu and Chris Hench and language engineer Zac Smith experimented with a neural network that learned to produce grammars from golden utterances. But they found that an alternative approach, called Bayesian model merging, offered similar performance with advantages in reproducibility and iteration speed.

The resulting system identifies linguistic patterns in lists of golden utterances and uses them to generate candidate rules for varying sentence templates. For instance, if two words (say, “pop” and “rock”) consistently occur in similar syntactic positions, but the phrasing around them varies, then one candidate rule will be that (in some defined contexts) “pop” and “rock” are interchangeable.
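
As a rough illustration of how candidate substitution rules might be proposed, the sketch below pairs up words that occupy the same position in otherwise identical utterances. This is a deliberately simplified heuristic with hypothetical names (`candidate_swaps`, `golden`); the actual tool's notion of "similar syntactic positions" is not described in detail here and is likely more flexible.

```python
from itertools import combinations


def candidate_swaps(utterances):
    """Propose word pairs that differ in exactly one position between
    two utterances of the same length."""
    candidates = set()
    tokenized = [u.split() for u in utterances]
    for a, b in combinations(tokenized, 2):
        if len(a) != len(b):
            continue
        diffs = [(x, y) for x, y in zip(a, b) if x != y]
        if len(diffs) == 1:
            # frozenset makes ("pop", "rock") and ("rock", "pop") identical.
            candidates.add(frozenset(diffs[0]))
    return candidates


# Hypothetical golden utterances.
golden = ["play pop music", "play rock music", "play some pop", "play some rock"]
swaps = candidate_swaps(golden)
print(swaps)
```

Here the only proposed rule is that "pop" and "rock" are interchangeable, because they are the only words that alternate while everything around them stays fixed.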

After exhaustively listing candidate rules, the system uses Bayesian probability to calculate which rule accounts for the most variance in the sample data, without overgeneralizing or introducing inconsistencies. That rule becomes an eligible variable in further iterations of the process, which recursively repeats until the grammar is optimized.
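
The trade-off between explaining the data and avoiding overgeneralization can be sketched as a log-posterior score. The prior, the uniform likelihood, the `size_penalty` parameter, and all the numbers below are assumptions chosen for illustration, in the general spirit of Bayesian model selection; the article does not specify the scoring the tool actually uses.

```python
import math


def log_posterior(grammar_size, num_generated, num_observed, size_penalty=1.0):
    """Score a candidate grammar: simpler is better, but so is a tight fit.

    grammar_size  -- number of symbols needed to write the grammar down
    num_generated -- number of distinct sentences the grammar can produce
    num_observed  -- number of golden utterances, all assumed to be covered
    """
    # Prior: penalize large grammars, such as ones that memorize the samples.
    log_prior = -size_penalty * grammar_size
    # Likelihood: probability is spread uniformly over every generated
    # sentence, so an overgeneral grammar gives each observed utterance less.
    log_likelihood = num_observed * math.log(1.0 / num_generated)
    return log_prior + log_likelihood


# Three hypothetical grammars for the same four golden utterances.
candidates = {
    "memorized": log_posterior(grammar_size=14, num_generated=4, num_observed=4),
    "merged": log_posterior(grammar_size=5, num_generated=4, num_observed=4),
    "overgeneral": log_posterior(grammar_size=2, num_generated=1000, num_observed=4),
}
best = max(candidates, key=candidates.get)
print(best)
```

Under these assumed numbers the merged grammar wins: it is far smaller than the memorized one and far more specific than the overgeneral one.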

Crucially, the tool’s method for creating substitution rules allows it to take advantage of existing catalogues of frequently occurring terms or phrases. If, for instance, the golden utterances were sports related, and the grammar induction tool determined that the words “Celtics” and “Lakers” were interchangeable, it would also conclude that they were interchangeable with “Warriors”, “Spurs”, “Knicks”, and all the other names of NBA teams in a standard catalogue used by a variety of Alexa services.
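
The catalogue lookup described above might look like the following sketch. The catalogue name `NBATeam`, its contents, and the function `expand_with_catalogue` are hypothetical; the rule used here (expand when every induced word appears in the same catalogue) is one simple way to decide that a catalogue applies.

```python
# Hypothetical catalogue; a real one would list every NBA team.
CATALOGUES = {
    "NBATeam": {"Celtics", "Lakers", "Warriors", "Spurs", "Knicks"},
}


def expand_with_catalogue(interchangeable, catalogues):
    """If all interchangeable words belong to one catalogue, widen the
    substitution rule to cover that catalogue's full contents."""
    for name, entries in catalogues.items():
        if interchangeable <= entries:  # subset test
            return name, set(entries)
    # No catalogue matched: keep the rule as induced.
    return None, set(interchangeable)


name, values = expand_with_catalogue({"Celtics", "Lakers"}, CATALOGUES)
print(name, sorted(values))
```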

From a list of 50 or 60 golden utterances, the grammar induction tool might extract 100-odd rules that can generate several thousand sentences of training data, all in a matter of seconds.

Safe Swaps

The guided-resampling tool also uses catalogues and existing examples to augment training data. Suppose that the available data contains the sentences “play Camila Cabello” and “can you play a song by Justin Bieber?”, which have been annotated to indicate that “Camila Cabello” and “Justin Bieber” are of the type ArtistName. In NLU parlance, ArtistName is a slot type, and “Camila Cabello” and “Justin Bieber” are slot values.

The guided-resampling tool generates additional training examples by swapping out slot values — producing, for instance, “play Justin Bieber” and “can you play a song by Camila Cabello?” Adding the vast Amazon Music databases of artist names and song titles to the mix produces many additional thousands of training sentences.
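
A minimal sketch of the swapping step, assuming the annotated sentences have already been turned into templates with `{SlotType}` placeholders and that slot values are grouped by type. The function name `resample` and the data layout are illustrative, not the tool's actual interface.

```python
from itertools import product
from string import Formatter


def resample(templates, slot_values):
    """Fill every template with every combination of known slot values."""
    sentences = []
    for template in templates:
        # Collect the placeholder names that appear in this template.
        slots = [name for _, name, _, _ in Formatter().parse(template) if name]
        for combo in product(*(slot_values[s] for s in slots)):
            sentences.append(template.format(**dict(zip(slots, combo))))
    return sentences


templates = ["play {ArtistName}", "can you play a song by {ArtistName}?"]
slot_values = {"ArtistName": ["Camila Cabello", "Justin Bieber"]}

generated = resample(templates, slot_values)
for sentence in generated:
    print(sentence)
```

Two templates and two artist names give four sentences; extending `slot_values` with a large catalogue multiplies the output accordingly.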

Blindly swapping slot values can lead to unintended consequences, however, which raises the question of which slot values can be swapped safely. For example, in the sentences “play jazz music” and “read detective books”, both “jazz” and “detective” would be labeled with the slot type GenreName. But customers are unlikely to ask Alexa to play “detective music”, and unnatural training data would degrade the performance of the resulting NLU model.

AMDS’s Olga Golovneva, a research scientist, and Christopher DiPersio, a language engineer, used the Jaccard index — which measures the overlap between two sets — to evaluate pairwise similarity between slot contents in different types of requests. On that basis, they defined a threshold for valid slot mixing.
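
The Jaccard index of two sets is the size of their intersection divided by the size of their union. The sketch below applies it to slot values observed in different request types; the example value sets and the threshold of 0.5 are hypothetical, since the article does not state the threshold that was actually chosen.

```python
def jaccard(a, b):
    """Jaccard index: |A intersect B| / |A union B|, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical GenreName values seen in three kinds of requests.
music_genres = {"jazz", "rock", "pop", "blues"}
radio_genres = {"jazz", "rock", "pop", "classical"}
book_genres = {"detective", "romance", "fantasy", "thriller"}

THRESHOLD = 0.5  # assumed cutoff for valid slot mixing


def can_mix(a, b, threshold=THRESHOLD):
    """Allow swapping slot values only between sufficiently similar sets."""
    return jaccard(a, b) >= threshold


print(jaccard(music_genres, radio_genres), can_mix(music_genres, radio_genres))
print(jaccard(music_genres, book_genres), can_mix(music_genres, book_genres))
```

Music and radio requests share three of five distinct genres (index 0.6), so mixing is allowed; music and book requests share none (index 0.0), which blocks swaps like "detective music".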

Quantifying Complexity

As there are many different ways to request music, another vital question is how many variations of each template to generate in order to produce realistic training data. One answer is simply to follow the data distributions from languages that Alexa already supports.

Comparing distributions of sentence types across languages requires representing customer requests in a more abstract form. We can encode a sentence like “play Camila Cabello” according to the word pattern other + ArtistName, where other represents the verb “play”, and ArtistName represents “Camila Cabello”. For “play ‘Havana’ by Camila Cabello”, the pattern would be other + SongName + other + ArtistName. To abstract away from syntactic differences between languages, we can condense this pattern further to other + ArtistName + SongName, which represents only the semantic concepts included in the request.
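
The two levels of abstraction can be sketched as follows, assuming each word arrives with a label that is either a slot type or `other`. The function names and the choice to order slot types alphabetically in the condensed pattern are assumptions made for illustration; alphabetical order happens to reproduce the example above.

```python
def word_pattern(tokens):
    """Collapse labeled tokens into a word pattern, merging runs of the
    same label (so a two-word artist name counts once)."""
    pattern = []
    for _, label in tokens:
        if not pattern or pattern[-1] != label:
            pattern.append(label)
    return pattern


def semantic_pattern(pattern):
    """Keep only which concepts occur, independent of word order."""
    slots = sorted({p for p in pattern if p != "other"})
    return (["other"] if "other" in pattern else []) + slots


tokens = [
    ("play", "other"),
    ("Havana", "SongName"),
    ("by", "other"),
    ("Camila", "ArtistName"),
    ("Cabello", "ArtistName"),
]

words = word_pattern(tokens)
print(" + ".join(words))
print(" + ".join(semantic_pattern(words)))
```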

Given this level of abstraction, Golovneva and DiPersio investigated several alternative techniques for determining the semantic distributions of synthetic data.

Using Shannon entropy, which is a measure of uncertainty, Golovneva and DiPersio calculated the complexity of semantic sentence patterns, focusing on slots and their combinations. Entropy for semantic slots takes into consideration how many different values each slot might have, as well as how frequent each slot is in the data set overall. For example, the slot SongName occurs very frequently in music requests, and its potential values (different song titles) number in the millions; in contrast, GenreName also occurs frequently in music requests, but its set of possible values (music genres) is fairly small.
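
Shannon entropy for a slot can be computed from the observed frequencies of its values, H = -sum(p * log2(p)). The sketch below uses tiny hypothetical samples to show why a slot with many distinct values scores higher than one with few; how the researchers combined entropy with slot frequency is not detailed here.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Entropy in bits of the empirical distribution over values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical observations: few genres repeated, many distinct songs.
genre_values = ["pop", "rock", "pop", "rock"]
song_values = ["Havana", "Vogue", "Sorry", "Hello"]

genre_entropy = shannon_entropy(genre_values)
song_entropy = shannon_entropy(song_values)
print(genre_entropy, song_entropy)
```

Two equally likely genres give 1 bit, while four equally likely song titles give 2 bits, which reflects the greater uncertainty of the SongName slot.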

Customer requests to Alexa often include multiple slots (such as “play ‘Vogue’|SongName by Madonna|ArtistName” or “set a daily|RecurrenceType reminder to {walk the dog}|ReminderContent for {seven a.m.}|Time”), which increases the pattern complexity further.

In their experiments, Golovneva and DiPersio used the entropy measures from slot distributions in the data and the complexity of slot combinations to determine the optimal distribution of semantic patterns in synthetic training data. This results in proportionally larger training sets for more complex patterns than for less complex ones. NLU models trained on such data sets achieved higher performance than those trained on data sets that merely “borrowed” slot distributions from existing languages.
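
One simple way to turn complexity scores into training-set sizes is to split a fixed budget of synthetic sentences in proportion to each pattern's score. The scores, the budget, and the strictly proportional rule below are all hypothetical; the article says only that more complex patterns receive proportionally larger sets.

```python
def allocate(budget, complexity):
    """Split a sentence budget across patterns in proportion to complexity."""
    total = sum(complexity.values())
    return {
        pattern: round(budget * score / total)
        for pattern, score in complexity.items()
    }


# Hypothetical complexity scores per semantic pattern.
complexity = {
    "other + GenreName": 1.0,
    "other + ArtistName + SongName": 3.0,
}

plan = allocate(1000, complexity)
print(plan)
```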

Alexa is always getting smarter, and these and other innovations from AMDS researchers help ensure the best experience possible when Alexa launches in a new locale.

Acknowledgments: Ge Yu, Chris Hench, Zac Smith, Olga Golovneva, Christopher DiPersio, Karolina Owczarzak, Sreekar Bhaviripudi, Andrew Turner

About the Author
Janet Slifka is director of research science in Alexa AI’s Natural Understanding group and leads the Applied Modeling and Data Science team.

Related content

Amazon Science Newsletter Project Kuiper.jpg
Get more from Amazon Science
Sign up for our monthly newsletter

Work with us

See More Jobs
US, WA, Seattle
Business/Team IntroductionThe Supply Chain Optimization Technologies (SCOT) team builds technology to automate and optimize Amazon’s supply chain of physical goods. We seek a Data Scientist with strong analytical and communication skills to join our team. SCOT manages Amazon's inventory under uncertainty of demand, pricing, promotions, supply, vendor lead times, and product life cycle. We optimize complex trade-offs between customer experience, inventory costs, fulfillment costs, fulfillment center capacity, etc. We develop sophisticated algorithms that involve learning from large amounts of data such as prices, promotions, similar products, and other data from our product catalog in order to automatically act on millions of dollars’ worth of inventory weekly and establish plans for tens of thousands of employees. As a Data Scientist, you will contribute to the research community, by working with other scientists across Amazon and our Supply Chain, as well as collaborating with academic researchers and publishing papers. SCOT also engages in cutting edge research that we try to share with the community. Recent work from SCOT includes papers presented at the NIPS 2017 Time Series Workshop, SSRN, KDD 2018 Time Series Workshop, and ICML 2018 Deep Generative Models Workshop.Data Scientist ResponsibilitiesAs a Data Scientist in SCOT, will be tasked to understand and work with bleeding edge research to enable the implementation of sophisticated models on big data. As a successful data scientist in the SCOT team, you are an analytical problem solver who enjoys diving into data from various businesses, is excited about investigations and algorithms, can multi-task, and can credibly interface between scientists, engineers and business stakeholders. 
Your expertise in synthesizing and communicating insights and recommendations to audiences of varying levels of technical sophistication will enable you to answer specific business questions and innovate for the future.Major responsibilities include:· Analysis of large amounts of data from different parts of the supply chain and their associated business functions· Improving upon existing machine learning methodologies by developing new data sources, developing and testing model enhancements, running computational experiments, and fine-tuning model parameters for new models· Formalizing assumptions about how models are expected to behave, creating definitions of outliers, developing methods to systematically identify these outliers, and explaining why they are reasonable or identifying fixes for them· Communicating verbally and in writing to business customers with various levels of technical knowledge, educating them about our research, as well as sharing insights and recommendations· Utilizing code (Python, R, Scala, etc.) for analyzing data and building statistical and machine learning models and algorithms
US, WA, Seattle
Global Talent Management (GTM) is centrally responsible for creating and evolving Amazon’s human capital and talent programs and processes.People Science Team within GTM is a growing start-up team with direct impact on Amazonians across all of our businesses and locations around the world. We play a crucial role in ensuring top notch data products and insights facilitate our growth and development of talent in intelligent and curious ways. We regularly use data to pitch ideas and drive conversations with Amazon’s Senior Vice President of HR and other executives about how to improve existing talent programs to solve organizational problems focused on (but not limited to) talent differentiation, talent movement, employee-role matching, product integration, promotion practices, organization design and succession planning, and diversity and inclusion, or invent new ones that address the evolving needs of our diverse employee base.We are looking for a self-driven Economist to help shape analytics and research roadmap and enable data-driven innovation that fuel our rapidly scaling talent management mission. You will build econometric models, using our world class data systems, and apply economic theory to solve business problems in a fast moving environment. Economists at GTM will be expected to develop new techniques to process large data sets, apply a causal lens to the framework, address ambiguous business problems, and contribute to design of automated systems around the company.You will partner closely with product and program owners, as well as scientists and engineers from other disciplines (e.g. data science, software engineers, data engineering) with a clear path to business impact. You develop innovative and even frighteningly bold plans and ideas to discover new ways to advance our goals. 
You will be expected to be a thought leader as we chart new courses with our rapidly growing employee populations, and lead the way in experimenting new ideas that have not yet been explored.Key Responsibilities:· Participate in scoping and planning of GTM’s Science roadmap· Uncover drivers, impacts, and key influences on talent outcomes· Build new econometric models to improve existing talent products or those that make the case for new products· Bring a causal lens to questions in human resources employing either experiments or non-experimental approaches· Develop predictive and optimization models for key applications· Navigate a variety of data sources, such as enterprise data, customize surveys, focus groups, and/or external data sources· Ability to distill informal customer requirements into problem definitions, dealing with ambiguity and competing objectives· Work in expert cross-functional teams delivering on demanding projects
US, CA, Virtual Location - California
Amazon.com strives to be Earth's most customer-centric company where people can find and discover anything they want to buy online. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment.Economists at Amazon will be expected to work directly with senior management on key business problems faced in retail, international retail, cloud computing, third party merchants, search, Kindle, streaming video, and operations. Amazon economists will apply the frontier of economic thinking to market design, pricing, forecasting, program evaluation, online advertising and other areas. You will build econometric models, using our world class data systems, and apply economic theory to solve business problems in a fast moving environment. Economists at Amazon will be expected to develop new techniques to process large data sets, address quantitative problems, and contribute to design of automated systems around the company.
US, MA, Cambridge
The Alexa Translations team is looking for an experienced Applied Science Manager to build industry-leading technologies in speech translation. Alexa is the voice activated digital assistant powering devices like Amazon Echo, Echo Dot and Fire TV. Our team's mission is to enable Alexa to break down language barriers for our customers.As an Applied Science Manager, you will lead a team of exceptional scientists to develop novel algorithms and modeling techniques to advance the state of the art in speech translation. You will work in a hybrid, fast-paced organization where scientists, engineers and product managers work together to build novel customer facing experiences. You will collaborate with and mentor other scientists to raise the bar of scientific research in Amazon. Your work will directly impact our customers in the form of products and services that make use of speech and language technology.We are looking for a leader with strong technical experience, demonstrated progression of management scope, and a passion for managing Science talent in a fast-paced environment. In addition to technical depth, you must possess exceptional project management and communication skills, and understand how to coach a team. As a Science leader you will:· Manage and mentor other scientists and engineers, review and guide their work, help develop roadmaps for the team and provide coaching for career development· Contribute directly to our growth by hiring smart and motivated Scientists and managers to establish teams that can deliver swiftly and predictably, adjusting in an agile fashion to deliver what our customers need.· Work closely with other teams across Alexa to deliver platform features that require cross-team leadership.· Represent your business and operations to the highest level of leadership within Amazon.If you are looking for a challenging and innovative role where you can solve important problems while growing as a leader, this may be the place for you.
US, MA, Boston
Amazon Elastic File System (EFS) https://aws.amazon.com/efs/ is looking for a Data Scientist to dive deep into the vast data generated by a rapidly growing AWS storage service and applying current data analysis methods to produce insights that will improve customer experience, operational effectiveness, and business value. As a member of the Amazon EFS team, you’ll work closely with outstanding engineers and product managers to work hard, have fun, and create the future of cloud storage.Building a High-Performing & Inclusive Team CultureYou should be passionate about working with a world-class team that welcomes, celebrates, and leverages a diverse set of backgrounds and skillsets to deliver results. Driving results is your primary responsibility, and doing so in a way that builds on our inclusive culture is key to our long term success.Work/Life BalanceEFS values work-life balance. On normal days, our entire team is co-located in the Boston office, but we’re also flexible when people occasionally need to work from home. We generally keep core available hours from 10am to 4pm. Some of the team is available earlier and the rest of us work a little later.Energizing and Interesting Technical ProblemsYou will work in partnership with engineers on the team to build and operate large scale systems that move and transform customer volume data and accelerate access to their data. You’ll be working to provide solutions to both internal and external customers and engage deeply with other teams within EFS, S3, EC2, and many other services. It’s humbling and energizing to provide data movement solutions to customers at AWS scale.Mentorship & Career GrowthWe’re committed to the growth and development of every member of EFS, and that includes our engineers. 
You will have the opportunity to contribute to the culture and direction of the entire EFS org and deliver initiatives that will improve the life of all of our teams.As a Data Scientist on EFS you will be curious and dive deep into performance, business, and operational metrics. You’re excited about designing solutions that scale while also engaging individual customers in understanding their applications. You’re able to think about business opportunities, operational issues, and architectural diagrams in the course of a single conversation. You’re looking for a team of bright, capable engineers to work with directly in implementing your vision while also collaborating with other data scientists across AWS.
US, WA, Seattle
Workforce Staffing (WFS) supports Amazon Operations by hiring the hourly associates that staff our operational buildings. WFS is quickly becoming one of the world’s largest staffing organizations, forecasted to hire over one million hourly associates across North America and the European Union this year alone. Currently, we hire full time, part time, flex time and seasonal hires across Fulfillment Centers, Sort Centers, Amazon Logistics, Whole Foods, Amazon Air, Prime Now, Amazon Fresh, and emerging business lines. Interested in the businesses that Amazon creates and grows? Here’s your opportunity to be a part of this journey.The Workforce Intelligence team was created in 2018 to support the massive growth in scale and scope that WFS has experienced. The team has continued to grow rapidly in order to meet the expanding needs of the business, including: big data and machine learning solutions, innovative approaches to complex HR problems, and data-driven recommendations during a time of rapid change.Here’s where you come in:As a Research Scientist in Workforce Intelligence, your work is focused on research to deeply understand the people that make up our hourly workforce and help others do the same. You understand that even when hiring hundreds of thousands of hourly associates across multiple types of roles and businesses, the experience of each candidate matters.You use your deep expertise in surveys and statistics (regressions, multilevel models, etc.) to define and answer nebulous problems. You use experimental, quasi-experimental, and RCT methods to understand our candidates and influence critical business decisions. You relentlessly obsess over understanding our candidates and lead our survey program that seeks to amplify the voice of our candidates. 
You work with colleagues across Research, Data Science, Business Intelligence and related teams to enable Amazon find and hire the right candidates for the right roles at an unprecedented scale.This will be a highly visible role across multiple key deliverables for our global organization. If you are passionate and curious about data, obsess over customers, love questioning the status quo, and want to make the world a better place, let’s chat.
US, WA, Seattle
Amazon Science gives you insight into the company’s approach to customer-obsessed scientific innovation. Amazon fundamentally believes that scientific innovation is essential to being the most customer-centric company in the world. It’s the company’s ability to have an impact at scale that allows us to attract some of the brightest minds in artificial intelligence and related fields. Our scientists continue to publish, teach, and engage with the academic community, in addition to utilizing our working backwards method to enrich the way we live and work.Please visit https://www.amazon.science for more information.At Alexa Shopping, we strive to enable shopping in everyday life. We allow customers to instantly order whatever they need, by simply interacting with their Smart Devices such as Amazon Show, Spot, Echo, Dot or Tap. Our Services allow you to shop, no matter where you are or what you are doing, you can go from 'I want that' to 'that's on the way' in a matter of seconds. We are seeking the industry's best to help us create new ways to interact, search and shop. Join us, and you'll be taking part in changing the future of everyday life.What you will do: You will lead a team of talented and experienced scientists and engineers that implement solutions for natural language understanding of Alexa Shopping customers: this involves taking the outputs from automated speech recognition (ASR) component and producing a representation of its meaning. Additionally, your team will build Alexa Shopping Automated CX quality metrics and provide analytics. This involves exploring, developing, socializing, and implementing mechanisms for tracking automated CX quality across customer’s journey with Alexa Shopping to improve Alexa Shopping CX. And finally, you will have the satisfaction of being able to look back and say you were a key contributor to something special from its earliest stages. 
You will be working closely with executive leadership, multiple product managers and leaders from partner teams in Amazon Retail, Alexa, and Speech Recognition teams.What we are looking for: We are looking for a talented Data Science Manager with a strong technical background and solid people management skills to build, manage and develop a highly-talented and experienced data science team. We are seeking leaders that can guide technical and product innovation in the areas of voice experiences, machine learning models and the distributed systems to bring our vision together. Strong judgment and communication skills, long term technical vision, and continuous focus on engineering and operational excellence are essential for the success in this role.
US, VA, Arlington
Amazon’s Talent Assessment team designs and implements groundbreaking hiring solutions for one of the world’s fastest growing companies. We work in a fast-paced, global environment where we must solve complex problems (ranging from research-based to technical) and deliver solutions that have impact.We are seeking personnel selection researchers with a strong foundation in the development of pre-hire selection assessments, traditional and alternative legally defensible assessment validation approaches, research methodology, and data analysis. We are looking for equal parts researchers and consultant/thought leaders who are highly adaptable and continual learners who thrive in a fast paced environment.You will work closely with global teams to design and experiment new hiring solutions that predict success for highly complex roles (technical and non-technical) that have great impact on Amazon globally.What you’ll do:· Lead the tactical development and execution of large scale, highly visible personnel selection research projects· Develop and iterate on experimental research studies to optimize qualitative and quantitative hiring strategies· · Develop and innovate on new pre-hire test assessment design, validation, and implementation· · Partner with internal and external technology teams· Influence executive project sponsors and multiple business and development teams across the company· Drive effective teamwork, communication, and collaboration across multiple stakeholder groups
US, VA, Arlington
Amazon’s Talent Assessment team designs, implements, and optimizes hiring systems for one of the world’s fastest growing companies. We work in a data-focused, global environment solving complex problems with deep thought, large-sample research, and advanced quantitative methods to deliver practical solutions that make all aspects of hiring more fair, accurate, and efficient.We're looking for a curious data scientist interested in working on a multi-disciplinary team of applied scientists, psychologists, data engineers, business analysts, and program managers. In this role, you will apply your modeling skills to bust myths, create insights, and produce recommendations to help Amazon evaluate millions of potential new hires per year. You'll be involved in all phases of research and experiment design and analysis, including defining research questions, designing experiments, identifying data requirements, conducting statistical or machine learning-based modeling, and communicating insights and recommendations. You'll also be expected to continuously learn new systems, tools, and industry best practices to analyze big data and enhance our analytics.
LU, Luxembourg