Amazon’s new research on automatic speech recognition

Interspeech papers include novel approaches to speaker identification and the training of end-to-end speech recognition models.

As the largest conference devoted to speech technologies, Interspeech has long been a showcase for the latest research on automatic speech recognition (ASR) from Amazon Alexa. This year, Alexa researchers had 12 ASR papers accepted at the conference.

The architecture of the RNN-T ASR system. X_t indicates the current frame of the acoustic signal. Y_(u-1) indicates the sequence of output subwords corresponding to the preceding frames.
From “Efficient minimum word error rate training of RNN-transducer for end-to-end speech recognition”

One of these, “Speaker identification for household scenarios with self-attention and adversarial training”, reports the speech team’s recent innovations in speaker ID, or recognizing which of several possible speakers is speaking at a given time.

Two others, “Subword regularization: an analysis of scalability and generalization for end-to-end automatic speech recognition” and “Efficient minimum word error rate training of RNN-transducer for end-to-end speech recognition”, examine ways to improve the quality of speech recognizers that use an architecture known as a recurrent neural network-transducer, or RNN-T.

In his keynote address this week at Interspeech, Alexa director of ASR Shehzad Mavawalla highlighted both of these areas — speaker ID and the use of RNN-Ts for ASR — as ones in which the Alexa science team has made rapid strides in recent years.

Speaker ID

Speaker ID systems — which enable voice agents to personalize content to particular customers — typically rely on either recurrent neural networks or convolutional neural networks, both of which are able to track consistencies in the speech signal over short spans of time. 

In “Speaker identification for household scenarios with self-attention and adversarial training”, Amazon applied scientist Ruirui Li and colleagues at Amazon, the University of California, Los Angeles, and the University of Notre Dame instead use an attention mechanism to identify longer-range consistencies in the speech signal.

In neural networks — such as speech processors — that receive sequential inputs, attention mechanisms determine which other elements of the sequence should influence the network’s judgment about the current element. 

Speech signals are typically divided into frames, which represent power concentrations at different sound frequencies over short spans of time. For a given utterance, Li and his colleagues’ model represents each frame as a weighted sum of itself and all the other frames in the utterance. The weights depend on correlations between the frequency characteristics of the frames; the greater the correlation, the greater the weight.

This representation has the advantage of capturing the distinctive properties of a speaker’s voice conveyed by each frame but suppressing accidental properties that are unique to individual frames and less characteristic of the speaker’s voice as a whole. 
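As a rough illustration of the weighted-sum representation described above, the sketch below re-expresses each frame of an utterance as a weighted sum of all its frames. It is not the authors' model: it assumes a plain dot product as the measure of correlation and a softmax to turn the scores into weights, and the function names and toy two-dimensional frames are invented for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product, used here as a stand-in for frame correlation."""
    return sum(x * y for x, y in zip(a, b))

def self_attend(frames):
    """Re-express each frame as a weighted sum of every frame in the utterance."""
    attended = []
    for query in frames:
        # Score this frame against all frames, itself included;
        # a higher score yields a higher weight.
        weights = softmax([dot(query, other) for other in frames])
        mixed = [
            sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(len(query))
        ]
        attended.append(mixed)
    return attended

# Three toy frames; the first two resemble each other, the third does not.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attended = self_attend(frames)
for original, mixed in zip(frames, attended):
    print(original, "->", [round(v, 3) for v in mixed])
```

Because the first two toy frames are similar, each pulls the other's representation toward itself, while the dissimilar third frame contributes less to them.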

These representations pass to a neural network that, during training, learns which of these properties are the best indicators of a speaker’s identity. Finally, the sequential outputs of this network — one for each frame — are averaged together to produce a snapshot of the utterance as a whole. These snapshots are compared to stored profiles to determine the speaker’s identity.
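The last two steps, averaging the per-frame outputs and comparing the result to stored profiles, can be sketched as follows. The article does not say how the comparison is made, so cosine similarity is an assumption here, and the profile names and vectors are hypothetical.

```python
import math

def mean_pool(frame_outputs):
    """Average the per-frame vectors into one utterance-level snapshot."""
    count = len(frame_outputs)
    dim = len(frame_outputs[0])
    return [sum(frame[d] for frame in frame_outputs) / count for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors (an assumed comparison measure)."""
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return numerator / denominator

def identify(frame_outputs, profiles):
    """Return the name of the stored profile closest to the utterance snapshot."""
    snapshot = mean_pool(frame_outputs)
    return max(profiles, key=lambda name: cosine(snapshot, profiles[name]))

# Hypothetical household with two enrolled speakers.
profiles = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
frame_outputs = [[0.8, 0.2], [0.9, 0.1], [0.6, 0.4]]
speaker = identify(frame_outputs, profiles)
print(speaker)
```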

Li and his colleagues also used a few other tricks to make their system more reliable, such as adversarial training.

In tests, the researchers compared their system to four prior systems and found that its speaker identifications were more accurate across the board. Compared to the best-performing of the four baselines, the system reduced the identification error rate by about 12% on speakers whose utterances were included in the model training data and by about 30% on newly encountered speakers.

The RNN-T architecture

Another pair of papers examine ways to improve the quality of speech recognizers that use the increasingly popular recurrent-neural-network-transducer architecture, or RNN-T. An RNN-T processes a sequence of inputs in order, so that the output corresponding to each input factors in both the inputs and outputs that preceded it. 

A series of possible subword segmentations of the speech input, with the probability of each.
From “Subword regularization: an analysis of scalability and generalization for end-to-end automatic speech recognition”

In the ASR application, the RNN-T takes in frames of an acoustic speech signal and outputs text — a sequence of subwords, or word components. For instance, the output corresponding to the spoken word “subword” might be the subwords “sub” and “_word”. 

Training the model to output subwords keeps the network size small. It also enables the model to deal with unfamiliar inputs, which it may be able to break into familiar components.
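To make the idea of breaking an unfamiliar word into familiar components concrete, here is a minimal sketch that uses greedy longest-match segmentation. The toy vocabulary is invented, the greedy strategy is only one simple option (the article does not specify the segmentation algorithm), and the "_" prefix on continuation pieces simply mirrors the "sub" and "_word" example above.

```python
def segment(word, vocab):
    """Greedy longest-match split of `word` into pieces drawn from `vocab`."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, then progressively shorter ones.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            raise ValueError(f"cannot segment {word!r} from position {start}")
    # Mark every piece after the first with "_", as in the article's example.
    return [piece if i == 0 else "_" + piece for i, piece in enumerate(pieces)]

# Toy vocabulary: two multi-letter subwords plus single letters as a fallback.
vocab = {"sub", "word", "s", "u", "b", "w", "o", "r", "d"}
print(segment("subword", vocab))
print(segment("sword", vocab))  # a word with no vocabulary entry of its own
```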

In the RNN-T architecture we consider, the input at time t — the current frame of the input speech — passes to an encoder network, which extracts acoustic features useful for speech recognition. At the same time, the current, incomplete sequence of output subwords passes to a prediction network, whose output indicates likely semantic properties of the next subword in the sequence.

These two representations (the encoding of the current frame and the likely semantic properties of the next subword) pass to another network, which on the basis of both representations determines the next subword in the output sequence.
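The data flow among the three networks can be sketched in a few lines. This is only a structural illustration under heavy assumptions: the "networks" are untrained stand-ins with hand-picked weights, the prediction network here is not recurrent, and the explicit blank symbol follows common RNN-T practice rather than anything stated in the article.

```python
import math

# Hypothetical output vocabulary: a blank symbol plus two subwords.
VOCAB = ["<blank>", "sub", "_word"]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def encoder(frame):
    """Stand-in for the encoder: features of the current acoustic frame."""
    return [math.tanh(x) for x in frame]

def prediction_network(previous_subwords):
    """Stand-in for the prediction network: summarizes the subwords emitted so far."""
    counts = [float(previous_subwords.count(token)) for token in VOCAB[1:]]
    return [math.tanh(c) for c in counts]

def joint(acoustic, linguistic, weights):
    """Combine both representations and score every candidate output."""
    hidden = [math.tanh(a + b) for a, b in zip(acoustic, linguistic)]
    return softmax(matvec(weights, hidden))

# Arbitrary, untrained weights: one row per entry in VOCAB.
weights = [[0.1, 0.1], [-1.0, 1.0], [1.0, -1.0]]
frame = [0.5, -0.2]  # toy acoustic frame
probs = joint(encoder(frame), prediction_network(["sub"]), weights)
next_subword = VOCAB[probs.index(max(probs))]
print(next_subword, [round(p, 3) for p in probs])
```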

New wrinkles

“Subword regularization: an analysis of scalability and generalization for end-to-end automatic speech recognition”, by applied scientist Egor Lakomkin and his Amazon colleagues, investigates the regularization of subwords in the model, or the enforcement of greater consistency in how words are segmented into subwords. In experiments, the researchers show that using multiple segmentations of the same speech transcription during training can reduce the ASR error rate by 8.4% in a model trained on 5,000 hours of speech data.
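The following sketch shows what "multiple segmentations of the same speech transcription" can look like in practice: it enumerates every way to split a word using a toy vocabulary and draws a different one for each training pass. The vocabulary is invented, and sampling uniformly is an assumption made for simplicity; the paper's figure attaches a probability to each segmentation.

```python
import random

def all_segmentations(word, vocab):
    """Return every way to split `word` into pieces that all belong to `vocab`."""
    if not word:
        return [[]]
    results = []
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head in vocab:
            # Recurse on the remainder and prepend the accepted head piece.
            for tail in all_segmentations(word[end:], vocab):
                results.append([head] + tail)
    return results

# Toy vocabulary that allows more than one split of "subword".
vocab = {"sub", "word", "su", "b", "wor", "d"}
candidates = all_segmentations("subword", vocab)

rng = random.Random(0)  # fixed seed so repeated runs draw the same sequence
for epoch in range(3):
    # Each pass over the data may see a different segmentation of the same word.
    print(epoch, rng.choice(candidates))
```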

“Efficient minimum word error rate training of RNN-transducer for end-to-end speech recognition”, by applied scientist Jinxi Guo and six of his Amazon colleagues, investigates a novel loss function (an evaluation criterion during training) for such RNN-T ASR systems. In experiments, it reduced the systems’ error rates by 3.6% to 9.2%.

For each input, RNN-Ts output multiple possible solutions, or hypotheses, ranked according to probability. In ASR applications, RNN-Ts are typically trained to maximize the probabilities they assign to the correct transcriptions of the input speech.

But trained speech recognizers are judged, by contrast, according to their word error rates, or the rate at which they make mistakes — misinterpretations, omissions, or erroneous insertions. Jinxi Guo and his colleagues investigated efficient ways to directly train an RNN-T ASR system to minimize word error rate.
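Word error rate is conventionally computed as the word-level edit distance between the reference transcription and the recognizer's output, divided by the number of reference words. The sketch below follows that standard definition; the function names and sample sentences are made up for illustration.

```python
def word_errors(reference, hypothesis):
    """Minimum number of substitutions, omissions, and insertions between two transcripts."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j]: edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            omission = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, omission, insertion))
        prev = curr
    return prev[-1]

def word_error_rate(reference, hypothesis):
    """Word errors divided by the number of words in the reference."""
    return word_errors(reference, hypothesis) / len(reference.split())

print(word_errors("the cat sat", "the cat sat down"))  # one erroneous insertion
print(word_error_rate("the cat sat", "a cat"))         # one substitution, one omission
```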

Minimizing word error rate directly means, for each training example, minimizing the expected word errors of the most probable hypotheses. But computing the probabilities of those hypotheses isn’t as straightforward as it may sound.

That’s because the exact same sequence of output subwords can align with the sequence of input frames in different ways: one output sequence, for instance, might identify the same subword as having begun one frame earlier or later than another output sequence does. Computing the probability of a hypothesis requires summing the probabilities of all its alignments.

The brute-force solution to this problem would be computationally impractical. But Guo and his colleagues propose using the forward-backward algorithm, which exploits the overlaps between alignments, storing intermediate computations that can be re-used. The result is a computationally efficient algorithm that enables a 3.6% to 9.2% reduction in error rates for various RNN-T models.
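To show how stored intermediate results replace enumeration, the sketch below computes the total probability of one hypothesis with a forward pass over the frame-by-subword grid and checks it against a brute-force sum over every alignment. It covers only the forward half of forward-backward (the backward half is what training additionally needs for gradients), it uses the standard RNN-T convention of a blank output that advances to the next frame, and all probabilities are invented toy values.

```python
from itertools import combinations

def forward_probability(blank, emit):
    """Sum the probabilities of all alignments with one pass over the grid.

    blank[t][u]: probability of emitting blank at frame t after u subwords.
    emit[t][u]:  probability of emitting subword u + 1 at frame t after u subwords.
    """
    T, U = len(blank), len(blank[0]) - 1
    alpha = [[0.0] * (U + 1) for _ in range(T)]
    alpha[0][0] = 1.0
    for t in range(T):
        for u in range(U + 1):
            if t > 0:  # arrive by emitting blank at the previous frame
                alpha[t][u] += alpha[t - 1][u] * blank[t - 1][u]
            if u > 0:  # arrive by emitting a subword at this frame
                alpha[t][u] += alpha[t][u - 1] * emit[t][u - 1]
    return alpha[T - 1][U] * blank[T - 1][U]  # a final blank ends the sequence

def brute_force_probability(blank, emit):
    """Enumerate every alignment explicitly; feasible only for tiny grids."""
    T, U = len(blank), len(blank[0]) - 1
    steps = T - 1 + U
    total = 0.0
    for subword_steps in combinations(range(steps), U):
        t = u = 0
        prob = 1.0
        for step in range(steps):
            if step in subword_steps:
                prob *= emit[t][u]
                u += 1
            else:
                prob *= blank[t][u]
                t += 1
        total += prob * blank[T - 1][U]
    return total

# Toy grid: 3 frames, 2 output subwords (the last column of emit is unused).
blank = [[0.6, 0.5, 0.7], [0.4, 0.6, 0.8], [0.5, 0.3, 0.9]]
emit = [[0.3, 0.4, 0.0], [0.5, 0.3, 0.0], [0.4, 0.6, 0.0]]
print(forward_probability(blank, emit))
print(brute_force_probability(blank, emit))
```

The two functions agree on the toy grid, but the forward pass visits each grid cell once, whereas the number of alignments the brute-force version enumerates grows combinatorially with the number of frames and subwords.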

The other Amazon ASR papers at this year’s Interspeech are:

DiPCo - Dinner Party Corpus
Maarten Van Segbroeck, Zaid Ahmed, Ksenia Kutsenko, Cirenia Huerta, Tinh Nguyen, Björn Hoffmeister, Jan Trmal, Maurizio Omologo, Roland Maas

End-to-end neural transformer based spoken language understanding
Martin Radfar, Athanasios Mouchtaris, Siegfried Kunzmann

Improving speech recognition of compound-rich languages
Prabhat Pandey, Volker Leutnant, Simon Wiesler, Jahn Heymann, Daniel Willett

Improved training strategies for end-to-end speech recognition in digital voice assistants
Hitesh Tulsiani, Ashtosh Sapru, Harish Arsikere, Surabhi Punjabi, Sri Garimella

Leveraging unlabeled speech for sequence discriminative training of acoustic models
Ashtosh Sapru, Sri Garimella

Quantization aware training with absolute-cosine regularization for automatic speech recognition
Hieu Duy Nguyen, Anastasios Alexandridis, Athanasios Mouchtaris

Rescore in a flash: Compact, cache efficient hashing data structures for N-gram language models
Grant P. Strimel, Ariya Rastrow, Gautam Tiwari, Adrien Pierard, Jon Webb

Semantic complexity in end-to-end spoken language understanding
Joseph McKenna, Samridhi Choudhary, Michael Saxon, Grant P. Strimel, Athanasios Mouchtaris

Speech to semantics: Improve ASR and NLU jointly via all-neural interfaces
Milind Rao, Anirudh Raju, Pranav Dheram, Bach Bui, Ariya Rastrow 

Research areas

Related content

US, WA, Seattle
Join us at the cutting edge of Amazon's sustainability initiatives to work on environmental and social advancements to support Amazon's long term worldwide sustainability strategy. At Amazon, we're working to be the most customer-centric company on earth. To get there, we need exceptionally talented, bright, and driven people. The Worldwide Sustainability (WWS) organization capitalizes on Amazon’s scale & speed to build a more resilient and sustainable company. We manage our social and environmental impacts globally, driving solutions that enable our customers, businesses, and the world around us to become more sustainable. Sustainability Science and Innovation (SSI) is a multi-disciplinary team within the WW Sustainability organization that combines science, analytics, economics, statistics, machine learning, product development, and engineering expertise. We use this expertise and skills to identify, develop and evaluate the science and innovations necessary for Amazon, customers and partners to meet their long-term sustainability goals and commitments. We’re seeking a Senior Principal Scientist for Sustainability and Climate AI to drive technical strategy and innovation for our long-term sustainability and climate commitments through AI & ML. You will serve as the strategic technical advisor to science, emerging tech, and climate pledge partners operating at the Director, VPs, and SVP level. You will set the next generation modeling standards for the team and tackle the most immature/complex modeling problems following the latest sustainability/climate sciences. Staying hyper current with emergent sustainability/climate science and machine learning trends, you'll be trusted to translate recommendations to leadership and be the voice of our interpretation. You will nurture a continuous delivery culture to embed informed, science-based decision-making into existing mechanisms, such as decarbonization strategies, ESG compliance, and risk management. 
You will also have the opportunity to collaborate with the Climate Pledge team to define strategies based on emergent science/tech trends and influence investment strategy. As a leader on this team, you'll play a key role in worldwide sustainability organizational planning, hiring, mentorship and leadership development. If you see yourself as a thought leader and innovator at the intersection of climate science and tech, we’d like to connect with you. About the team Diverse Experiences: World Wide Sustainability (WWS) values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Inclusive Team Culture: It’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth: We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance: We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve.
US, WA, Seattle
Our mission is to create best-in-class AI agents that seamlessly integrate multimodal inputs like speech, images, and video, enabling natural, empathetic, and adaptive interactions. We develop cutting-edge Large Language Models (LLMs) that leverage advanced architectures, cross-modal learning, interpretability, and responsible AI techniques to provide coherent, context-aware responses augmented by real-time knowledge retrieval. We seek a talented Applied Scientist with expertise in LLMs, speech, audio, NLP, or multimodal learning to pioneer innovations in data simulation, representation, model pre-training/fine-tuning, generation, reasoning, retrieval, and evaluation. The ideal candidate will build scalable solutions for a variety of applications, such as streaming real-time conversational experiences, including multilingual support, talking avatar interactions, customizable personalities, and conversational turn-taking. With a passion for pushing boundaries and rapid experimentation, you'll deliver high-impact solutions from research to customer-facing products and services. Key job responsibilities As an Applied Scientist, you'll leverage your expertise to research novel algorithms and modeling techniques to develop data simulation approaches mimicking real-world interactions with a focus on the speech modality. You'll acquire and curate large, diverse datasets while ensuring privacy, creating robust evaluation metrics and test sets to comprehensively assess LLM performance. Integrating human-in-the-loop feedback, you'll iterate on data selection, sampling, and enhancement techniques to improve the core model performance. Your innovations in data representation, model pre-training/fine-tuning on simulated and real-world datasets, and responsible AI practices will directly impact customers through new AI products and services.
US, WA, Seattle
Our mission is to create best-in-class AI agents that seamlessly integrate multimodal inputs like speech, images, and video, enabling natural, empathetic, and adaptive interactions. We develop cutting-edge Large Language Models (LLMs) that leverage advanced architectures, cross-modal learning, interpretability, and responsible AI techniques to provide coherent, context-aware responses augmented by real-time knowledge retrieval. We seek a talented Applied Scientist with expertise in LLMs, speech, audio, NLP, or multimodal learning to pioneer innovations in data simulation, representation, model pre-training/fine-tuning, generation, reasoning, retrieval, and evaluation. The ideal candidate will build scalable solutions for a variety of applications, such as streaming real-time conversational experiences, including multilingual support, talking avatar interactions, customizable personalities, and conversational turn-taking. With a passion for pushing boundaries and rapid experimentation, you'll deliver high-impact solutions from research to customer-facing products and services. Key job responsibilities As an Applied Scientist, you'll leverage your expertise to research novel algorithms and modeling techniques to develop data simulation approaches mimicking real-world interactions with a focus on the speech modality. You'll acquire and curate large, diverse datasets while ensuring privacy, creating robust evaluation metrics and test sets to comprehensively assess LLM performance. Integrating human-in-the-loop feedback, you'll iterate on data selection, sampling, and enhancement techniques to improve the core model performance. Your innovations in data representation, model pre-training/fine-tuning on simulated and real-world datasets, and responsible AI practices will directly impact customers through new AI products and services.
US, NY, New York
Amazon is investing heavily in building a world class advertising business and developing a collection of self-service performance advertising products that drive discovery and sales. Our products are strategically important to our Retail and Marketplace businesses for driving long-term growth. We deliver billions of ad impressions and millions of clicks daily and are breaking fresh ground to create world-class products. We are highly motivated, collaborative and fun-loving with an entrepreneurial spirit and bias for action. With a broad mandate to experiment and innovate, we are growing at an unprecedented rate with a seemingly endless range of new opportunities. We are seeking a technical leader for our Supply Science team. This team is within the Sponsored Product team, and works on complex engineering, optimization, econometric, and user-experience problems in order to deliver relevant product ads on Amazon search and detail pages world-wide. The team operates with the dual objective of enhancing the experience of Amazon shoppers and enabling the monetization of our online and mobile page properties. Our work spans ML and Data science across predictive modeling, reinforcement learning (Bandits), adaptive experimentation, causal inference, data engineering. Key job responsibilities Search Supply and Experiences, within Sponsored Products, is seeking a Senior Applied Scientist to join a fast growing team with the mandate of creating new ads experience that elevates the shopping experience for our hundreds of millions customers worldwide. We are looking for a top analytical mind capable of understanding our complex ecosystem of advertisers participating in a pay-per-click model– and leveraging this knowledge to help turn the flywheel of the business. As a Senior Applied Scientist on this team you will: --Act as the technical leader in Machine Learning and drive full life-cycle Machine Learning projects. 
--Lead technical efforts within this team and across other teams. --Build machine learning models, perform proof-of-concept, experiment, optimize, and deploy your models into production. --Run A/B experiments, gather data, and perform statistical analysis. --Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation and serving. --Work closely with software engineers to assist in productionizing your ML models. --Research new machine learning approaches. --Recruit Applied Scientists to the team and act as a mentor to other scientists on the team. A day in the life The successful candidate will be a self-starter comfortable with ambiguity, with strong attention to detail, and with an ability to work in a fast-paced, high-energy and ever-changing environment. The drive and capability to shape the direction is a must. About the team We are a customer-obsessed team of engineers, technologists, product leaders, and scientists. We are focused on continuous exploration of contexts and creatives where advertising delivers value to customers and advertisers. We specifically work on new ads experiences globally with the goal of helping shoppers make the most informed purchase decision. We obsess about our customers and we are continuously innovating on their behalf to enrich their shopping experience on Amazon
US, WA, Seattle
By applying to this position, your application will be considered for all locations we hire for in the United States. Are you interested in machine learning, deep learning, automated reasoning, speech, robotics, computer vision, optimization, or quantum computing? We are looking for applied scientists capable of using a variety of domain expertise to invent, design, evangelize, and implement state-of-the-art solutions for never-before-solved problems. Our full-time opportunities are available in, but are not limited to the following domains: • Machine Learning: You will put Machine Learning theory into practice through experimentation and invention, leveraging machine learning techniques (such as random forest, Bayesian networks, ensemble learning, clustering, etc.), and implement learning systems to work on massive datasets in an effort to tackle never-before-solved problems. • Automated Reasoning: AWS Automated Reasoning teams deliver tools that are called billions of times daily. Amazon development teams are integrating automated-reasoning tools such as Dafny, P, and SAW into their development processes, raising the bar on the security, durability, availability, and quality of our products. Areas of work include: Distributed proof search, SAT and SMT solvers, Reasoning about distributed systems, Automating regulatory compliance, Program analysis and synthesis, Security and privacy, Cryptography, Static analysis, Property-based testing, Model-checking, Deductive verification, compilation into mainstream programming languages, Automatic test generation, and Static and dynamic methods for concurrent systems. • Natural Language Processing and Speech Technologies: You will tackle some of the most interesting research problems on the leading edge of natural language processing. We are hiring in all areas of spoken language understanding: NLP, NLU, ASR, text-to-speech (TTS), and more! 
• Computer Vision and Robotics: You will help build solutions where visual input helps the customers shop, anticipate technological advances, work with leading edge technology, focus on highly targeted customer use-cases, and launch products that solve problems for our customers. • Quantum: Quantum computing is rapidly emerging and our customers can the see the potential it has to address their challenges. One of our missions at AWS is to give customers access to the most innovative technology available and help them continuously reinvent their business. Quantum computing is a technology that holds promise to be transformational in many industries. We are adding quantum computing resources to the toolkits of every researcher and developer. If this sounds exciting to you - come build the future with us! Key job responsibilities You will have access to large datasets with billions of images and video to build large-scale systems Analyze and model terabytes of text, images, and other types of data to solve real-world problems and translate business and functional requirements into quick prototypes or proofs of concept Own the design and development of end-to-end systems Write technical white papers, create technical roadmaps, and drive production level projects that will support Amazon Web Services Work closely with AWS scientists to develop solutions and deploy them into production Work with diverse groups of people and cross-functional teams to solve complex business problems
US, WA, Bellevue
Are you excited about developing cutting-edge generative AI, large language models (LLMs), and foundation models? Are you looking for opportunities to build and deploy them on real-world problems at a truly vast scale with global impact? At AFT (Amazon Fulfillment Technologies) AI, a group of around 50 scientists and engineers, we are on a mission to build a new generation of dynamic end-to-end prediction models (and agents) for our warehouses based on GenAI and LLMs. These models will be able to understand and make use of petabytes of human-centered as well as process information, and learn to perceive and act to further improve our world-class customer experience – at Amazon scale. We are looking for a Sr. Applied Scientist who will become of the research leads in a team that builds next-level end-to-end process predictions and shift simulations for all systems in a full warehouse with the help of generative AI, graph neural networks, and LLMs. Together, we will be pushing beyond the state of the art in simulation and optimization of one of the most complex systems in the world: Amazon's Fulfillment Network. Key job responsibilities In this role, you will dive deep into our fulfillment network, understand complex processes, and channel your insights to build large-scale machine learning models (LLMs and Transformer-based GNNs) that will be able to understand (and, eventually, optimize) the state and future of our buildings, network, and orders. You will face a high level of research ambiguity and problems that require creative, ambitious, and inventive solutions. You will work with and in a team of applied scientists to solve cutting-edge problems going beyond the published state of the art that will drive transformative change on a truly global scale. You will identify promising research directions, define parts of our research agenda and be a mentor to members of our team and beyond. 
You will influence the broader Amazon science community and communicate with technical, scientific and business leaders. If you thrive in a dynamic environment and are passionate about pushing the boundaries of generative AI, LLMs, and optimization systems, we want to hear from you. A day in the life Amazon offers a full range of benefits that support you and eligible family members, including domestic partners and their children. Benefits can vary by location, the number of regularly scheduled hours you work, length of employment, and job status such as seasonal or temporary employment. The benefits that generally apply to regular, full-time employees include: 1. Medical, Dental, and Vision Coverage 2. Maternity and Parental Leave Options 3. Paid Time Off (PTO) 4. 401(k) Plan If you are not sure that every qualification on the list above describes you exactly, we'd still love to hear from you! At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you’re passionate about this role and want to make an impact on a global scale, please apply! About the team Amazon Fulfillment Technologies (AFT) powers Amazon’s global fulfillment network. We invent and deliver software, hardware, and data science solutions that orchestrate processes, robots, machines, and people. We harmonize the physical and virtual world so Amazon customers can get what they want, when they want it. The AFT AI team has deep expertise developing cutting edge AI solutions at scale and successfully applying them to business problems in the Amazon Fulfillment Network. These solutions typically utilize machine learning and computer vision techniques, applied to text, sequences of events, images or video from existing or new hardware. We influence each stage of innovation from inception to deployment, developing a research plan, creating and testing prototype solutions, and shepherding the production versions to launch.
US, CA, Santa Clara
Machine learning (ML) has been strategic to Amazon from the early years. We are pioneers in areas such as recommendation engines, product search, eCommerce fraud detection, and large-scale optimization of fulfillment center operations. The Generative AI team helps AWS customers accelerate the use of Generative AI to solve business and operational challenges and promote innovation in their organization. As an applied scientist, you are proficient in designing and developing advanced ML models to solve diverse challenges and opportunities. You will be working with terabytes of text, images, and other types of data to solve real-world problems. You'll design and run experiments, research new algorithms, and find new ways of optimizing risk, profitability, and customer experience. We’re looking for talented scientists capable of applying ML algorithms and cutting-edge deep learning (DL) and reinforcement learning approaches to areas such as drug discovery, customer segmentation, fraud prevention, capacity planning, predictive maintenance, pricing optimization, call center analytics, player pose estimation, event detection, and virtual assistant among others. Key job responsibilities The primary responsibilities of this role are to: • Design, develop, and evaluate innovative ML models to solve diverse challenges and opportunities across industries • Interact with customer directly to understand their business problems, and help them with defining and implementing scalable Generative AI solutions to solve them • Work closely with account teams, research scientist teams, and product engineering teams to drive model implementations and new solution A day in the life ABOUT AWS: Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. 
If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship and Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.