Alexa’s ASRU papers concentrate on extracting high-value training data

Related data selection techniques yield benefits for both speech recognition and natural-language understanding.

This year at the IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Alexa researchers have two papers about training machine learning systems with minimal hand-annotated data. Both papers describe automated methods for producing training data, and both describe additional algorithms for extracting just the high-value examples from that data.

Each paper, however, gravitates to a different half of the workshop’s title: one is on speech recognition, or converting an acoustic speech signal to text, and the other is on natural-language understanding, or determining a text’s meaning.

The natural-language-understanding (NLU) paper is about adding new functions to a voice agent like Alexa when training data is scarce. It involves “self-training”, in which a machine learning model trained on sparse annotated data itself labels a large body of unannotated data, which in turn is used to re-train the model.

The researchers investigate techniques for winnowing down the unannotated data, to extract examples pertinent to the new function, and then winnowing it down even further, to remove redundancies.

The automatic-speech-recognition (ASR) paper is about machine-translating annotated data from a language that Alexa already supports to produce training data for a new language. There, too, the researchers report algorithms for identifying data subsets — both before and after translation — that will yield a more-accurate model.

Three of the coauthors on the NLU paper — applied scientists Eunah Cho and Varun Kumar and applied-scientist manager Bill Campbell — are also among the five Amazon organizers of the Life-Long Learning for Spoken-Language Systems workshop, which will take place on the first day of ASRU. The workshop focuses on the problem of continuously improving deployed conversational-AI systems.

Cho and her colleagues’ main-conference paper, “Efficient Semi-Supervised Learning for Natural Language Understanding by Optimizing Diversity”, addresses an instance of that problem: teaching Alexa to recognize new “intents”.

Enlarged intents

Alexa’s NLU models classify customer requests according to domain, or the particular service that should handle a request, and intent, or the action that the customer wants executed. They also identify the slot types of the entities named in the requests, or the roles those entities play in fulfilling the request. In the request “Play ‘Undecided’ by Ella Fitzgerald”, for instance, the domain is Music and the intent PlayMusic, and the names “Undecided” and “Ella Fitzgerald” fill the slots SongName and ArtistName.
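As a rough illustration of that labeling scheme, the example request could be represented as follows. This is only a sketch: the `AnnotatedUtterance` class and its field names are hypothetical and are not taken from Alexa's actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedUtterance:
    """One NLU training example (hypothetical schema, for illustration only)."""
    text: str
    domain: str   # the service that should handle the request
    intent: str   # the action the customer wants executed
    slots: tuple  # (slot_type, slot_value) pairs for the named entities


example = AnnotatedUtterance(
    text="Play 'Undecided' by Ella Fitzgerald",
    domain="Music",
    intent="PlayMusic",
    slots=(("SongName", "Undecided"), ("ArtistName", "Ella Fitzgerald")),
)


def slot_types(utterance):
    """Return just the slot types, in the order they appear."""
    return [slot_type for slot_type, _ in utterance.slots]
```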

Most intents have highly specific vocabularies (even when they’re large, as in the case of the PlayMusic intent), and ideally, the training data for a new intent would be weighted toward in-vocabulary utterances. But when Alexa researchers are bootstrapping a new intent, intent-specific data is scarce. So they need to use training data extracted from more-general text corpora.

As a first pass at extracting intent-relevant data from a general corpus, Cho and her colleagues use a simple n-gram-based linear logistic regression classifier, trained on whatever annotated, intent-specific data is available. The classifier breaks every input utterance into overlapping one-word, two-word, and three-word chunks — n-grams — and assigns each chunk a score, indicating its relevance to the new intent. The relevance score for an utterance is an aggregation of the chunks’ scores, and the researchers keep only the most relevant examples.
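The chunking and scoring steps can be sketched in a few lines. The weights below are made up for illustration; in the paper they would come from a classifier trained on the intent-specific data. Summing the chunk weights and applying a sigmoid is one common way a linear logistic model aggregates scores, assumed here rather than confirmed by the paper.

```python
import math


def ngrams(utterance, max_n=3):
    """Return all overlapping 1-, 2-, and 3-word chunks of an utterance."""
    words = utterance.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    ]


def relevance(utterance, weights, bias=0.0):
    """Sum the per-chunk weights and squash the total with a sigmoid."""
    total = bias + sum(weights.get(g, 0.0) for g in ngrams(utterance))
    return 1.0 / (1.0 + math.exp(-total))


def most_relevant(corpus, weights, k):
    """Keep only the k utterances with the highest relevance scores."""
    return sorted(corpus, key=lambda u: relevance(u, weights), reverse=True)[:k]


# Made-up weights standing in for a trained classifier.
toy_weights = {"play": 1.5, "play the": 0.5, "song": 1.0, "weather": -2.0}
corpus = ["play the song again", "what is the weather", "play something"]
top = most_relevant(corpus, toy_weights, k=2)
```

With these toy weights, the two music-related utterances are kept and the weather request is dropped.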

In an initial experiment, the researchers used sparse intent-specific data to train five different machine learning models to recognize five different intents. Then they fed unlabeled examples extracted by the regression classifier to each intent recognizer. The recognizers labeled the examples, which were then used to re-train the recognizers. On average, this reduced the recognizers’ error rates by 15%.
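The self-training loop in that experiment can be sketched with a deliberately tiny stand-in model. Everything here is an assumption made for illustration: the word-count "recognizer", the confidence threshold, and the example utterances are not from the paper, which uses real intent recognizers.

```python
from collections import Counter, defaultdict


def train(examples):
    """Toy model: count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(model, text):
    """Return (label, confidence); confidence is the winning label's vote share."""
    words = text.lower().split()
    votes = {label: sum(c[w] for w in words) for label, c in model.items()}
    total = sum(votes.values())
    if total == 0:
        return None, 0.0
    best = max(votes, key=votes.get)
    return best, votes[best] / total


def self_train(labeled, unlabeled, threshold=0.8):
    """Label the unlabeled data with the model, then re-train on both sets."""
    model = train(labeled)
    pseudo = []
    for text in unlabeled:
        label, confidence = predict(model, text)
        # The threshold is an assumption: keep only confident machine labels.
        if label is not None and confidence >= threshold:
            pseudo.append((text, label))
    return train(labeled + pseudo), pseudo


labeled = [("play some jazz", "PlayMusic"), ("set a timer", "SetTimer")]
unlabeled = ["play some blues", "set a kitchen timer", "hello there"]
model, pseudo = self_train(labeled, unlabeled)
```

After re-training, the toy model has seen "blues" in a PlayMusic context, so it can label "blues please", which the initial model could not.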

To make this process more efficient, Cho and her colleagues trained a neural network to identify paraphrases, which are defined as pairs of utterances that have the same domain, intent, and slot labels. So “I want to listen to Adele” is a paraphrase of “Play Adele”, but “Play Seal” is not.
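That definition can be written as a simple check on the labels. The sketch below reads "slot labels" as slot types together with their values, which is what the Adele and Seal example suggests; the dictionary layout is hypothetical.

```python
def is_paraphrase(a, b):
    """Two utterances are paraphrases if domain, intent, and slots all match."""
    return (
        a["domain"] == b["domain"]
        and a["intent"] == b["intent"]
        and sorted(a["slots"]) == sorted(b["slots"])
    )


play_adele = {"text": "Play Adele", "domain": "Music", "intent": "PlayMusic",
              "slots": [("ArtistName", "Adele")]}
listen_adele = {"text": "I want to listen to Adele", "domain": "Music",
                "intent": "PlayMusic", "slots": [("ArtistName", "Adele")]}
play_seal = {"text": "Play Seal", "domain": "Music", "intent": "PlayMusic",
             "slots": [("ArtistName", "Seal")]}
```

This rule only produces ground-truth labels for training pairs; the detector itself is a neural network that has to make the same judgment from the text alone.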

Augmented-data embedding
The figure above depicts embeddings of NLU training data, or geometrical representations of the data such that utterances with similar meanings are grouped together. The brown points represent annotated data specific to a new intent; the blue points represent intent-relevant data extracted from a more general data set.

The researchers wanted their paraphrase detector to be as general as possible, so they trained it on data sampled from Alexa’s full range of domains and intents. From each sample, they produced a template by substituting slot types for slot values. So, for instance, “Play Adele in the living room” became something like “Play [artist_name] in the [device_location].” From those templates, they could generate as comprehensive a set of training pairs as they wanted — paraphrases with many different sentence structures and, as negative examples, non-paraphrases with the same sentence structures.
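A minimal sketch of the templating step, assuming slot values appear verbatim in the utterance text. The helper names and the second template are hypothetical, introduced only to show how positive and negative pairs could be generated.

```python
def to_template(text, slots):
    """Replace each slot value in the text with its bracketed slot type."""
    for slot_type, value in slots:
        text = text.replace(value, f"[{slot_type}]")
    return text


def fill(template, values):
    """Substitute concrete slot values back into a template."""
    for slot_type, value in values.items():
        template = template.replace(f"[{slot_type}]", value)
    return template


template = to_template(
    "Play Adele in the living room",
    [("artist_name", "Adele"), ("device_location", "living room")],
)

templates = [
    template,
    "I want to hear [artist_name] in the [device_location]",  # hypothetical
]
adele = {"artist_name": "Adele", "device_location": "living room"}
seal = {"artist_name": "Seal", "device_location": "living room"}

# Paraphrase: different sentence structures, same slot values.
positive_pair = (fill(templates[0], adele), fill(templates[1], adele))
# Non-paraphrase: same sentence structure, different slot values.
negative_pair = (fill(templates[0], adele), fill(templates[0], seal))
```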

From the data set extracted by the logistic classifier, the paraphrase detector selects a small batch of examples that offer bad paraphrases of the examples in the intent-specific data set. The idea is that bad paraphrases will help diversify the data, increasing the range of inputs the resulting model can handle.

The bad paraphrases are added to the annotated data, producing a new augmented data set, and then the process is repeated. This method halves the amount of training data required to achieve the error rate improvements the researchers found in their first experiment.
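The selection loop might look something like the sketch below. The paper's paraphrase detector is a neural network; word-overlap (Jaccard) similarity is swapped in here purely as a stand-in so the example runs on its own, and the batch size and number of rounds are arbitrary. The seed set is assumed to be non-empty.

```python
def jaccard(a, b):
    """Stand-in paraphrase score: word overlap between two utterances."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)


def diversify(seed, candidates, batch_size, rounds, score=jaccard):
    """Repeatedly add the candidates that are the worst paraphrases of the set."""
    selected = list(seed)
    pool = [c for c in candidates if c not in selected]
    for _ in range(rounds):
        if not pool:
            break
        # A candidate's score is its best paraphrase match in the current set;
        # the lowest-scoring candidates are the most novel.
        pool.sort(key=lambda c: max(score(c, s) for s in selected))
        batch, pool = pool[:batch_size], pool[batch_size:]
        selected.extend(batch)
    return selected


seed = ["play adele"]
candidates = ["play adele now", "put on some music by adele", "play adele please"]
augmented = diversify(seed, candidates, batch_size=1, rounds=1)
```

In this toy run, the candidate with the least word overlap is the one added, since it differs most from what the seed set already covers.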

Gained in translation

The other ASRU paper, “Language Model Bootstrapping Using Neural Machine Translation for Conversational Speech Recognition”, is from applied scientist Surabhi Punjabi, senior applied scientist Harish Arsikere, and senior manager for machine learning Sri Garimella, all of the Alexa Speech group. It investigates building an ASR system in a language — in this case, Hindi — in which little annotated training data is available.

ASR systems typically have several components. One, the acoustic model, takes a speech signal as input and outputs phonetic renderings of short speech sounds. A higher-level component, the language model, encodes statistics about the probabilities of different word sequences. It can thus help distinguish between alternate interpretations of the same acoustic signal (for instance, “Pulitzer Prize” versus “pullet surprise”).
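To make the language model's role concrete, here is a minimal bigram model with add-one smoothing. It is a textbook sketch, not the model used in the paper, and the three training sentences are invented.

```python
import math
from collections import Counter


class BigramLM:
    """Toy bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.unigrams, self.bigrams = Counter(), Counter()
        for sentence in sentences:
            words = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.unigrams.update(words)
            self.bigrams.update(zip(words, words[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, sentence):
        """Log-probability of a word sequence under the smoothed model."""
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        return sum(
            math.log((self.bigrams[(prev, word)] + 1)
                     / (self.unigrams[prev] + self.vocab_size))
            for prev, word in zip(words, words[1:])
        )


lm = BigramLM([
    "she won the pulitzer prize",
    "the pulitzer prize was announced",
    "he won a prize",
])
likely = lm.log_prob("the pulitzer prize")
unlikely = lm.log_prob("the pullet surprise")
```

Because "pulitzer prize" occurs in the training text and "pullet surprise" does not, the first transcription receives the higher score.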

Punjabi and her colleagues investigated building a Hindi language model by automatically translating annotated English-language training data into Hindi. The first step was to train a neural-network-based English-Hindi translator. This required a large body of training data, which matched English inputs to Hindi translations.

Here the researchers ran into a problem similar to the one that Cho and her colleagues confronted. By design, the available English-Hindi training sets were drawn from a wide range of sources and covered a wide range of topics. But the annotated English data that the researchers wanted to translate was Alexa-specific.

Punjabi and her colleagues started with a limited supply of Alexa-specific annotated data in Hindi, collected through Cleo, an Alexa skill that allows multilingual customers to help train machine learning models in new languages. Using an off-the-shelf statistical model, they embedded that data, or represented each sentence as a point in a geometric space, such that sentences with similar meanings clustered together.

Then they embedded Hindi sentences extracted from a large, general, English-Hindi bilingual corpus and measured their distance from the average embedding of the Cleo data. To train their translator, they used just those sentences within a fixed distance of the average — that is, sentences whose meanings were similar to those of the Cleo data.
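A sketch of that filtering step, assuming the sentence embeddings have already been computed and that distance is Euclidean (the article does not say which metric was used). The two-dimensional vectors and sentence labels are placeholders.

```python
import math


def centroid(vectors):
    """Average embedding of a set of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_in_domain(candidates, in_domain_vectors, max_distance):
    """Keep sentences whose embedding lies within max_distance of the centroid."""
    center = centroid(in_domain_vectors)
    return [
        sentence
        for sentence, vector in candidates
        if math.dist(vector, center) <= max_distance
    ]


# Placeholder embeddings standing in for the in-domain (Cleo) data.
in_domain = [[1.0, 0.0], [0.0, 1.0]]
# (sentence, embedding) pairs standing in for the general bilingual corpus.
general = [("sentence A", [0.6, 0.4]), ("sentence B", [3.0, 3.0])]
kept = select_in_domain(general, in_domain, max_distance=1.0)
```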

In one experiment, they then used self-training to fine-tune the translator. After the translator had been trained, they used it to translate a subset of the English-only Alexa-specific data. Then they used the resulting English-Hindi sentence pairs to re-train the translator.

Like all neural translators, the one built by Punjabi and her colleagues outputs a list of possible translations, ranked according to the translator’s confidence that they’re accurate. In another experiment, the researchers used a simple language model, trained only on the Cleo data, to re-score the lists produced by the translator according to the probability of their word sequences. Only the top-ranked translation was added to the researchers’ Hindi data set.
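The re-scoring idea can be sketched as follows. The smoothed, length-normalized unigram scorer is a stand-in for the in-domain language model, and English placeholder sentences stand in for the Hindi hypotheses; none of this is taken from the paper's implementation.

```python
import math
from collections import Counter


def make_unigram_scorer(in_domain_sentences):
    """Build a length-normalized, add-one-smoothed unigram scorer."""
    counts = Counter(
        word for s in in_domain_sentences for word in s.lower().split()
    )
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def score(sentence):
        words = sentence.lower().split()
        log_prob = sum(math.log((counts[w] + 1) / (total + vocab)) for w in words)
        return log_prob / max(len(words), 1)

    return score


def pick_translation(n_best, lm_score):
    """Re-rank the translator's hypotheses by LM score and keep the best one."""
    return max(n_best, key=lm_score)


lm_score = make_unigram_scorer(["play the next song", "play my playlist"])
# Hypotheses in the translator's original confidence order.
n_best = ["perform the following melody", "play the next song"]
best = pick_translation(n_best, lm_score)
```

Here the translator's second choice wins after re-scoring because its wording matches the in-domain data more closely.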

In another experiment, once Punjabi and her colleagues had assembled a data set of automatically translated utterances, they used the weak, Cleo-based language model to winnow it down, discarding sentences that the model deemed too improbable. With the data that was left, they built a new, much richer language model.

Punjabi and her colleagues evaluated each of these data enrichment techniques separately, so they could measure the contribution that each made to the total error rate reduction of the resulting language model. To test each language model, they integrated it into a complete ASR system, whose performance they compared to that of an ASR system that used a language model trained solely on the Cleo data.

Each modification made a significant difference in its own right. In experiments involving a Hindi data set with 200,000 utterances, for instance, re-scoring translation hypotheses reduced the ASR system’s error rate by as much as 6.28%, and model fine-tuning reduced it by as much as 6.84%. But the best-performing language model combined all the modifications, reducing the error rate by 7.86%.

When the researchers reduced the size of the Hindi data set, to simulate the situation in which training data in a new language is particularly hard to come by, the gains were even greater. At 20,000 Hindi utterances, the error rate reduction was 13.18%; at 10,000, it was 15.65%.

Lifelong learning

In addition to Cho, Kumar, and Campbell, the seven organizers of the Life-Long Learning for Spoken-Language Systems Workshop include Hadrian Glaude, a machine learning scientist, and senior principal scientist Dilek Hakkani-Tür, both of the Alexa AI group.

The workshop, which addresses problems of continual improvement to conversational-AI systems, features invited speakers, including Nancy Chen, a principal investigator at Singapore’s Agency for Science, Technology, and Research (A*STAR), and Alex Waibel, a professor of computer science at Carnegie Mellon University and one of the workshop organizers. The poster session includes six papers, spanning topics from question answering to emotion recognition.

Related content

US, CA, Santa Clara
Job summaryAmazon is looking for a passionate, talented, and inventive Applied Scientist with a strong machine learning background to help build industry-leading language technology.Our mission is to provide a delightful experience to Amazon’s customers by pushing the envelope in Natural Language Processing (NLP), Natural Language Understanding (NLU), Dialog management, conversational AI and Machine Learning (ML).As part of our AI team in Amazon AWS, you will work alongside internationally recognized experts to develop novel algorithms and modeling techniques to advance the state-of-the-art in human language technology. Your work will directly impact millions of our customers in the form of products and services, as well as contributing to the wider research community. You will gain hands on experience with Amazon’s heterogeneous text and structured data sources, and large-scale computing resources to accelerate advances in language understanding.We are hiring primarily in Conversational AI / Dialog System Development areas: NLP, NLU, Dialog Management, NLG.This role can be based in NYC, Seattle or Palo Alto.Inclusive Team CultureHere at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences.Work/Life BalanceOur team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. 
We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.Mentorship & Career GrowthOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future.
US, NY, New York
Job summaryAmazon is looking for a passionate, talented, and inventive Applied Scientist with a strong machine learning background to help build industry-leading language technology.Our mission is to provide a delightful experience to Amazon’s customers by pushing the envelope in Natural Language Processing (NLP), Natural Language Understanding (NLU), Dialog management, conversational AI and Machine Learning (ML).As part of our AI team in Amazon AWS, you will work alongside internationally recognized experts to develop novel algorithms and modeling techniques to advance the state-of-the-art in human language technology. Your work will directly impact millions of our customers in the form of products and services, as well as contributing to the wider research community. You will gain hands on experience with Amazon’s heterogeneous text and structured data sources, and large-scale computing resources to accelerate advances in language understanding.We are hiring primarily in Conversational AI / Dialog System Development areas: NLP, NLU, Dialog Management, NLG.This role can be based in NYC, Seattle or Palo Alto.Inclusive Team CultureHere at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences.Work/Life BalanceOur team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. 
We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.Mentorship & Career GrowthOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future.
US, CA, Santa Clara
Job summaryAWS AI/ML is looking for world class scientists and engineers to join its AI Research and Education group working on building automated ML solutions for planetary-scale sustainability and geospatial applications. Our team's mission is to develop ready-to-use and automated solutions that solve important sustainability and geospatial problems. We live in a time wherein geospatial data, such as climate, agricultural crop yield, weather, landcover, etc., has become ubiquitous. Cloud computing has made it easy to gather and process the data that describes the earth system and are generated by satellites, mobile devices, and IoT devices. Our vision is to bring the best ML/AI algorithms to solve practical environmental and sustainability-related R&D problems at scale. Building these solutions require a solid foundation in machine learning infrastructure and deep learning technologies. The team specializes in developing popular open source software libraries like AutoGluon, GluonCV, GluonNLP, DGL, Apache/MXNet (incubating). Our strategy is to bring the best of ML based automation to the geospatial and sustainability area.We are seeking an experienced Applied Scientist for the team. This is a role that combines science knowledge (around machine learning, computer vision, earth science), technical strength, and product focus. It will be your job to develop ML system and solutions and work closely with the engineering team to ship them to our customers. You will interact closely with our customers and with the academic and research communities. You will be at the heart of a growing and exciting focus area for AWS and work with other acclaimed engineers and world famous scientists. You are also expected to work closely with other applied scientists and demonstrate Amazon Leadership Principles (https://www.amazon.jobs/en/principles). Strong technical skills and experience with machine learning and computer vision are required. 
Experience working with earth science, mapping, and geospatial data is a plus. Our customers are extremely technical and the solutions we build for them are strongly coupled to technical feasibility.About the teamInclusive Team CultureAt AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 14 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. Work/Life BalanceOur team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.Mentorship & Career GrowthOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded scientist and enable them to take on more complex tasks in the future.Interested in this role? Reach out to the recruiting team with questions or apply directly via amazon.jobs.
US, CA, Santa Clara
Job summaryAWS AI/ML is looking for world class scientists and engineers to join its AI Research and Education group working on building automated ML solutions for planetary-scale sustainability and geospatial applications. Our team's mission is to develop ready-to-use and automated solutions that solve important sustainability and geospatial problems. We live in a time wherein geospatial data, such as climate, agricultural crop yield, weather, landcover, etc., has become ubiquitous. Cloud computing has made it easy to gather and process the data that describes the earth system and are generated by satellites, mobile devices, and IoT devices. Our vision is to bring the best ML/AI algorithms to solve practical environmental and sustainability-related R&D problems at scale. Building these solutions require a solid foundation in machine learning infrastructure and deep learning technologies. The team specializes in developing popular open source software libraries like AutoGluon, GluonCV, GluonNLP, DGL, Apache/MXNet (incubating). Our strategy is to bring the best of ML based automation to the geospatial and sustainability area.We are seeking an experienced Applied Scientist for the team. This is a role that combines science knowledge (around machine learning, computer vision, earth science), technical strength, and product focus. It will be your job to develop ML system and solutions and work closely with the engineering team to ship them to our customers. You will interact closely with our customers and with the academic and research communities. You will be at the heart of a growing and exciting focus area for AWS and work with other acclaimed engineers and world famous scientists. You are also expected to work closely with other applied scientists and demonstrate Amazon Leadership Principles (https://www.amazon.jobs/en/principles). Strong technical skills and experience with machine learning and computer vision are required. 
Experience working with earth science, mapping, and geospatial data is a plus. Our customers are extremely technical and the solutions we build for them are strongly coupled to technical feasibility.About the teamInclusive Team CultureAt AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 14 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. Work/Life BalanceOur team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.Mentorship & Career GrowthOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded scientist and enable them to take on more complex tasks in the future.Interested in this role? Reach out to the recruiting team with questions or apply directly via amazon.jobs.
US, NY, New York
Job summaryAmazon Web Services is looking for world class scientists to join the Security Analytics and AI Research team within AWS Security Services. This group is entrusted with researching and developing core data mining and machine learning algorithms for various AWS security services like GuardDuty (https://aws.amazon.com/guardduty/) and Macie (https://aws.amazon.com/macie/). In this group, you will invent and implement innovative solutions for never-before-solved problems. If you have passion for security and experience with large scale machine learning problems, this will be an exciting opportunity.The AWS Security Services team builds technologies that help customers strengthen their security posture and better meet security requirements in the AWS Cloud. The team interacts with security researchers to codify our own learnings and best practices and make them available for customers. We are building massively scalable and globally distributed security systems to power next generation services.Inclusive Team Culture Here at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 14 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. Work/Life Balance Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. 
We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives. Mentorship & Career Growth Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. We care about your career growth and strive to assign projects based on what will help each team member develop and enable them to take on more complex tasks in the future.A day in the lifeAbout the hiring groupJob responsibilities* Rapidly design, prototype and test many possible hypotheses in a high-ambiguity environment, making use of both quantitative and business judgment.* Collaborate with software engineering teams to integrate successful experiments into large scale, highly complex production services.* Report results in a scientifically rigorous way.* Interact with security engineers, product managers and related domain experts to dive deep into the types of challenges that we need innovative solutions for.
US, Virtual
Job summaryDo you have consulting leadership experience deploying digital, data, technology strategy and execution within Fortune 500 enterprise organization? Have you built and led successful consulting practices? Do you have broad technical skills and experience across Machine Learning and Artificial Intelligence? Can you build, lead and influence machine learning engineers and data science consultants in a technical specialty team to deliver these new capabilities on the AWS platform to our enterprise customers? At AWS, we are looking for a Senior Practice Manager with a successful record of leading enterprise customers through a variety of transformative projects involving Machine Learning and Artificial Intelligence; delivering business outcomes that contribute to our customers’ transformation journey. An SPM will focus on a geography and a set of technical specialties, and will manage a team of direct reports. The SPM will develop a long-term plan to develop the right skills across the team, influence the go-to-market strategy within the region and collaborate across the entire AWS organization to bring access to product and service teams, to get the right solution delivered and drive feature innovation based upon customer needs. 
Key job responsibilities• Engage customers - collaborate with enterprise sales managers to develop strong customer and partner relationships and build a growing business, driving adoption of emerging technologies in key accounts.• Coach and teach - collaborate with field sales, pre-sales, marketing, training and support teams to help partners and customers drive business outcomes through application of AI/ML.• Deliver value - lead high quality delivery of a variety of customized engagements with partners and enterprise customers in the commercial sector.• Lead great people - attract top machine learning engineers and data scientists to build high performing teams of consultants with superior technical depth, and outstanding peer and customer relationship skills• Be a customer advocate - Work with engineering teams to convey partner and enterprise customer feedback as input to technology roadmaps
US, WA, Seattle
Job summaryAWS Insight is looking for a Data Scientist to help develop sophisticated algorithms and models that involve analyzing and learning from over 540 billion customer cost, usage, and utilization events daily. We use this data to generate recommendations and forecasts for customers to help them better understand and optimize their AWS costs and usage and reduce the complexity of managing their cloud costs. Our team's vision is to be the world's authoritative provider of AWS computing insight, where customers can understand, control and optimize usage of AWS products. We sit at the nexus of all AWS services and interact directly with end-customers, and we build relationships with teams across AWS to ensure that we offer a secure and reliable customer experience that builds trust with our customers and provides them with intelligent insights.As a successful data scientist in AWS Insights, you will be responsible for understanding and mining the large amount of data, and developing recommendations that will help improve the accuracy and relevance of our forecasting and recommendations models. You will work closely with talented data scientists, software engineers, and business groups to build enhance existing models and build new models that solve challenging customer problems. You will work with the engineers to drive implementation of the proposed models and establish testing strategies to validate the models before and after they are put into production. 
On top of that, you are an analytical problem solver who enjoys diving into data, are excited about investigating and developing algorithms, and can influence technical teams and business stakeholders to solve real-world customer problems.Key job responsibilitiesImproving upon existing forecasting statistical or machine learning methodologies by developing new data sources, testing model enhancements, running computational experiments, and fine-tuning model parameters for new forecasting modelsSupporting decision making by providing requirements to develop analytic capabilities, platforms, pipelines and metrics then using them to analyze trends and find root causes of forecast inaccuracyFormalizing assumptions about how demand forecasts are expected to behave, creating definitions of outliers, developing methods to systematically identify these outliers, and explaining why they are reasonable or identifying fixes for themTranslating forecasting business requirements into specific analytical questions that can be answered with available data using statistical and machine learning methods; working with engineers to produce the required data when it is not availableCommunicating verbally and in writing to business customers with various levels of technical knowledge, educating them about our systems, as well as sharing insights and recommendationsUtilizing code (Python, R, Scala, etc.) for analyzing data and building statistical and machine learning models and algorithms