Amazon Releases New Public Data Set to Help Address “Cocktail Party” Problem

Amazon today announced the public release of a new data set that will help speech scientists address the difficult problem of separating speech signals in reverberant rooms with multiple speakers.

In the field of automatic speech recognition, this problem is known as the “cocktail party” or “dinner party” problem; accordingly, we call our data set the Dinner Party Corpus, or DiPCo. The dinner party problem is widely studied: it was, for instance, the subject of the fifth and most recent CHiME challenge in speech separation and recognition sponsored by the International Speech Communication Association.

We hope that the availability of a high-quality public data set will both promote research on this topic and make that research more productive.

The layout of the space in which we captured audio from simulated dinner parties. The numbered circles indicate the placement of the five microphone arrays.

We created our data set with the assistance of Amazon volunteers, who simulated the dinner-party scenario in the lab. We conducted multiple sessions, each involving four participants. At the beginning of each session, participants served themselves food from a buffet table. Most of each session took place at a dining table. At fixed points in several sessions, we piped music into the room to reproduce a noise source that is common in real-world environments.

Each participant was outfitted with a headset microphone, which captured a clear, speaker-specific signal. Also dispersed around the room were five devices with seven microphones each, which fed audio signals directly to an administrator’s laptop.

The data set we are releasing includes both the raw audio from each of the seven microphones in each device and the headset signals. The headset signals provide speaker-specific references that can be used to gauge the success of speech separation systems acting on the signals from the microphone arrays. The data set also includes transcriptions of the headset signals.
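As an illustration of how a headset reference could be used for scoring, here is a minimal sketch of one common separation metric, the scale-invariant signal-to-distortion ratio (SI-SDR). The article does not say which metric to use, so the choice of SI-SDR, the function name `si_sdr`, and the toy signals are all assumptions. The sketch also assumes that the reference and the separated estimate are already time-aligned and of equal length, which real headset and array recordings may not be without extra processing.

```python
import math
from typing import Sequence


def si_sdr(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (higher is better).

    Hypothetical helper: assumes both signals are time-aligned and equally long.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        raise ValueError("reference signal is silent")
    # Project the estimate onto the reference to find the best scaling factor.
    scale = sum(r * e for r, e in zip(reference, estimate)) / ref_energy
    target = [scale * r for r in reference]
    # Whatever is left after removing the scaled reference counts as distortion.
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    if residual_energy == 0.0:
        return math.inf
    if target_energy == 0.0:
        return -math.inf
    return 10.0 * math.log10(target_energy / residual_energy)


# Toy signals (made up for illustration, not taken from the corpus).
headset = [1.0, -1.0, 1.0, -1.0]
interference = [1.0, 1.0, -1.0, -1.0]  # orthogonal to the headset signal
good_estimate = [h + 0.1 * i for h, i in zip(headset, interference)]
poor_estimate = [h + 0.5 * i for h, i in zip(headset, interference)]
print(si_sdr(headset, good_estimate) > si_sdr(headset, poor_estimate))
```

An estimate that retains less interference scores higher, so the comparison above prints `True`.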

The division of the data into segments with and without background music enables researchers to combine clean and noisy training data in whatever way necessary to extract optimal performance from their machine learning systems.
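One way to combine the two kinds of segments is to sample a training set with a chosen proportion of music-backed material. The sketch below is only an illustration under stated assumptions: the segment identifiers, the function name `mix_training_segments`, and the 30% ratio are hypothetical and do not reflect the actual file naming or any recommended split for the corpus.

```python
import random
from typing import List, Sequence


def mix_training_segments(
    clean: Sequence[str],
    noisy: Sequence[str],
    noisy_fraction: float,
    total: int,
    seed: int = 0,
) -> List[str]:
    """Pick `total` segment IDs, with about `noisy_fraction` of them from `noisy`."""
    if not 0.0 <= noisy_fraction <= 1.0:
        raise ValueError("noisy_fraction must be between 0 and 1")
    n_noisy = round(total * noisy_fraction)
    n_clean = total - n_noisy
    if n_noisy > len(noisy) or n_clean > len(clean):
        raise ValueError("not enough segments to satisfy the requested mix")
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    selected = rng.sample(list(noisy), n_noisy) + rng.sample(list(clean), n_clean)
    rng.shuffle(selected)
    return selected


# Hypothetical segment IDs; real corpus file names will differ.
clean_ids = [f"clean_{i}" for i in range(10)]
music_ids = [f"music_{i}" for i in range(10)]
training_set = mix_training_segments(clean_ids, music_ids, noisy_fraction=0.3, total=10)
print(sum(seg.startswith("music_") for seg in training_set))
```

Sampling without replacement keeps every selected segment distinct. Varying `noisy_fraction` lets a researcher test how much noisy material a given model benefits from.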

The DiPCo data set has been released under the CDLA-Permissive license and can be downloaded here. We have also posted a paper detailing the work. DiPCo’s release follows on Amazon’s recent releases of three other public data sets, two for the development of conversational AI systems and the other for fact verification.

Acknowledgments: Maarten Van Segbroeck, Roland Maas

About the Authors
Zaid Ahmed
Maarten Van Segbroeck

Recommended for you

Work with us

See More Jobs
RO, Iasi
Location: Romania (Lasi & Bucharest)Duration: 4-6 monthsAmazon is a company of builders. A philosophy of ownership carries through everything we do — from the proprietary technologies we create to the new businesses we launch and grow. You’ll find it in every team across our company; from providing Earth’s biggest selection of products to developing ground-breaking software and devices that change entire industries, Amazon embraces invention and progressive thinking. Amazon is continually evolving; it’s a place where motivated employees thrive, and ownership and accountability lead to meaningful results. It’s as simple as this: we pioneer.With every order made and parcel delivered, customer demand at Amazon is growing. And to meet this demand, and keep our world-class service running smoothly, we're growing our teams across Europe. Delivering hundreds of thousands of products to hundreds of countries worldwide, our Operations teams possess a wide range of skills and experience and this include software developers, data engineers, operations research scientists, and more.About these internships:Whatever your background, if you are excited about modeling huge amounts of data and creating state of the art algorithms to solve real world problems, if you have a passion for using machine learning/mathematical optimization to design optimal or near optimal solution methodologies to be used by in-house decision support tools and software, if you enjoy solving operational challenges by using computer simulations, and if you’re motivated by results and driven enough to achieve them, Amazon is a great place to be. Because it’s only by coming up with new ideas and challenging the status quo that we can continue to be the most customer-centric company on Earth, we’re all about flexibility: we expect you to adapt to changes quickly and we encourage you to try new things.Amazon is looking for ambitious and enthusiastic students to join our unique world as interns. 
An Amazon EU internship will provide you with an unforgettable experience in a fast-paced, dynamic and international environment; it will boost your resume and will provide a superb introduction to our activities.As an intern in Ops Research and modelling, you could join one of the following teams: Supply Chain, Transportation, HR, Employee Relations and more.You will put your analytical and technical skills to the test and roll up your sleeves to complete a project that will contribute to improve the functionality and level of service that teams provides to our customers. This could include:· Analyze and solve business problems at their root, stepping back to understand the broader context· Apply advanced statistics /data mining/ operations research techniques to analyze and make insights from big data (data sets could include: historical production data, volumes, transportation and logistics metrics, simulation/experiment results etc.) across multiple geographies.· Closely collaborate with operations research scientists, data scientists, business analysts, BI teams, developers, economists and more on various models’ (including predictive/prescriptive models) development.· Perform quantitative, economic, and/or numerical analysis of the performance of supply chain systems under uncertainty using statistical and optimization tools to find both exact and heuristic solution strategies for optimization problems.· Create computer simulations to support operational decision-making. Identify areas with potential for improvement and work with internal teams to generate requirements that can realize these improvements.· Create software prototypes to verify and validate the devised solutions methodologies; integrate the prototypes into production systems using standard software development tools and methodologies.· Convert statistical output into detailed documents which influence business actionsAmazon is an equal opportunities employer. 
We believe passionately that employing a diverse workforce is central to our success. We make recruiting decision based on your experience and skills. We value your passion to discover, invent, simplify and build. We welcome applications from all members of society irrespective of age, sex, disability, sexual orientation, race, religion or belief.By submitting your resume and application information, you authorize Amazon to transmit and store your information in the Amazon group of companies' world-wide recruitment database, and to circulate that information as necessary for the purpose of evaluating your qualifications for this or other job vacancies.#EUInternHiring
AU, SA, ADELAIDE
At Amazon Australia, we are developing state-of-the-art large-scale Machine Learning Services and Applications on the Cloud involving Terabytes of data. We work on applying predictive technology to a wide spectrum of problems in areas such as Amazon Retail, Seller Services, Customer Service and so on. We are looking for talented and experienced Machine Learning Scientists (Ph.D. in a related area preferred) who can apply innovative Machine Learning techniques to real-world e-Commerce problems. You will get to work in a team dedicated to advancing Machine Learning technology at Amazon and converting it to business-impacting solutions.Although this position will be based in Adelaide, South Australia, for the duration of the Coronavirus-19 outbreak arrangements will be made to enable the successful candidate to observe the relevant travel restrictions, possibly by working from home, or another Amazon office.Major responsibilities· Use machine learning, data mining and statistical techniques to create new, scalable solutions for business problems· Analyze and extract relevant information from large amounts of Amazon’s historical business data to help automate and optimize key processes· Design, develop and evaluate highly innovative models for predictive learning· Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation· Research and implement novel machine learning and statistical approaches
US, WA, Seattle
Data Scientist IAmazon Web Services (AWS) provides companies of all sizes with an infrastructure web services platform in the cloud (“cloud computing”). With AWS you can requisition compute power, storage, and many other services – gaining access to a suite of elastic IT infrastructure services as your demands require them. AWS is the leading platform for designing and developing applications for the cloud and is growing rapidly with hundreds of thousands of companies in over 190 countries on the platform.The AWS Sales Analytics team uses machine learning, econometrics, and data science to optimize AWS’ sales strategy across various global programs, driving customer engagement, and developing insights across a global organization. We use detailed customer behavior and AWS usage data to predict and understand our customers and their needs and wants from AWS.The AWS Sales Analytics team is looking for a Data Scientist to join our predictive analytics team to work closely with a team of B.I. Engineers and Analysts to develop and automate data models by creating various tools and machine-learning models to answer complex questions and provide insights. The Data Scientist must be comfortable working with large volumes of data and be able to manipulate data using a variety of tools. They must be able to build scalable tools to not only process the data but transform it into actionable information. 
In addition, the ideal candidate must possess excellent interpersonal skills, strong written communication skills, and be able to provide thought leadership and guidance to the other members across the team.Role Summary:Key Responsibilities include:· Organize and structure large data sets· Work closely with the teams to define the performance indicators· Determine how the key metrics and performance indicators will be calculated, owned, and managed.· Collaborate with colleagues across AWS Sales, Data Scientists, Economists, and the Worldwide Revenue Operations team to build and manage data infrastructure and models in support of predictive research and other scientific experiments· Dive deep to understand data sets and ensure statistical accuracy to support conclusions· Analyze data and present information to support decision making
US, MA, Cambridge
A part of Amazon's Robotics Artificial Intelligence, Canvas Technology is using spatial AI to provide end-to-end autonomous delivery of goods. By using state-of-the-art cameras and other sensors, the system perceives its surroundings with unrivaled vision and fidelity. The system combines a mix of high-performance sensors with simultaneous localization and mapping software that builds and continuously updates maps in real-time, completely automatically. It has the capability to ‘see’ and identify different objects, people, vehicles, and places as it moves and react to moving people and vehicles in an intelligent way.We are seeking an experienced Manager of Applied Science to help guide and lead our team of motion planning scientists and engineers. In this role, you will be expected to help define a team direction for robot planning algorithms, object avoidance and congestion management strategies. This will include providing guidance on system architecture and algorithm selection or design. This is not solely a management role, you will be directly contributing to the implementation while you lead. A successful candidate will have strong technical ability, scientific vision, excellent project management skills, great communication skills, and a motivation to achieve results in a fast paced environment.You will be an integral part of the core robotics team and work with others to implement robot motion planning systems above and beyond the current state-of-the-art in the field.If you are an experienced Applied Science Manager, have a track record of delivering to timelines with high quality, are comfortable providing technical direction, have mentored and grown engineers and are deeply technical and innovative, we want to talk to you.
US, WA, Seattle
Not many teams aspire to zero. Welcome to the Worldwide Returns & ReCommerce team (WWR&R) at Amazon.com.WW R&R is an agile, innovative organization dedicated to ‘making zero happen’ to benefit our customers, our company, and the environment. Our goal is to achieve the three zeroes: zero cost of returns, zero waste, and zero defects. We do this by developing groundbreaking products and driving operational excellence to help customers keep what they buy, recover returned and damaged product value, keep thousands of tons of waste from landfills, and create the best customer returns experience in the world. We have an eye to the future – we create long-term value at Amazon by focusing not just on the bottom line, but on the planet. We are building the most sustainable re-use channel we can by driving multiple aspects of the Circular Economy for Amazon - returns, recommerce, and rentals.Amazon WW R&R is comprised of business, product, operational, program, software engineering and data teams that manage the life of a returned or damaged product from a customer to the warehouse and on to its next best use. 
Our work is broad and deep: we train machine learning models to automate routing and find signals to optimize re-use; we invent new channels to give products a second life; we develop innovative product support to help customers love what they buy; we pilot smarter product evaluations; we work from the customer backward to find ways to make the return experience remarkably delightful and easy; and we do it all while scrutinizing our business with laser focus.In this role, you would- Use machine learning and analytical techniques to create scalable solutions for business problems.- Analyze and extract relevant information from large amounts of historical business data to help automate and optimize key processes.- Design, development and evaluation of highly innovative models for predictive learning.- Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation.- Research and implement novel machine learning and statistical approaches.- Work closely with data & software engineering teams to build model implementations and integrate successful models and algorithms in production systems at very large scale.- Technically lead and mentor other scientists in team.We are a group that has fun at work while driving incredible customer, business, and environmental impact. We are backed by a strong leadership group dedicated to operational excellence that empowers a reasonable work-life balance. As an established, experienced team, we offer the scope and support needed for substantial career growth.Amazon is earth’s most customer-centric company and through WW R&R, the earth is our customer too. Come join us and innovate with the Amazon Worldwide Returns & ReCommerce team!
UK, BST, Bristol
Excited by using massive amounts of data to create sophisticated security analytics? Does pushing the envelope with user and entity behavior analytics (UEBA) including developing statistical, stochastic and machine learning models for security excite you? Want to help the largest global enterprises operate safely & securely in the cloud at an unprecedented scale?You will enhance real-time proactive and preemptive systems using data driven techniques. The analytic techniques are applied at each stage from detection to response mitigation. This job will challenge you to dive deep and understand the unique challenges in operating this platform at a scale unrivaled in the industry. Our scale provides unique opportunities that simply don’t exist elsewhere, but these opportunities can only be revealed by a scientific thinker with a curious mind, who is committed to improving every single day.This is an opportunity to operate in a truly groundbreaking manner given the sheer scale, breadth, and fast pace of the AWS environment. Data scientists are collaborative and highly determined, working backward from threats to create new and innovative ways to detect, assess and react to malicious activity.Your job is distilling meaningful insights from large volumes of data to address the constantly evolving threat and attack space from sophisticated adversaries. This role will give you an opportunity to use and grow a broad range of skills in data science, analytic development, and information security – all using and creating cloud services in large scale computing environments. Considering the scale and the magnitude of the technical challenge, this role is a great opportunity to make a meaningful contribution to an extremely important area. 
Operating in the cloud at our scale enables activities that have never before been possible, providing new advantages and opportunities for innovative work.We are hiring technical specialists at the convergence of the four hottest areas in tech: Data Science, Security, Software Development and Cloud Services. Worried that you do not have hands-on experience in all four areas? We are looking for solid base and expertise in two areas - you will grow expertise in the remaining areas to round out your background in all four key areas. Alongside your team, you will work directly with the biggest internal and external Amazon customers – leveraging data and analytics to create innovative ways to work securely in the cloud.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where people can find and discover anything they want to buy online. We hire the world's brightest minds, offering them a fast paced, technologically and friendly work environment. Amazon’s Digital Content and Commerce Services team (DCCS) powers ordering, subscriptions and vendor payments for all Amazon’s digital and subscription businesses including Prime, Amazon Video, Music, Alexa and Kindle. We process billions of digital transactions every year and enable our digital businesses to grow their businesses worldwide. We are seeking an experienced, results-driven economist to join our team and help build our next generation of digital solutions and offerings to delight Amazon customers.DCCS owns key customer facing experiences including content selection, subscriptions, pricing, availability, customer experience etc. You will build econometric models using our world class data systems and apply economic theory to solve business problems in a fast moving environment. Economists at Amazon will be expected to develop new techniques to process large data sets, address quantitative problems, and contribute to design of automated systems around the company.Economists will work closely with other research scientists, machine learning experts, and economists globally to design new frameworks that systematically identify low touch machine driven recommendations that propel customer growth while creating a meaningful economic impact for Amazon. Research science at Amazon is a highly experimental activity, although theoretical analysis and innovation are also welcome. Our economists and scientists work closely with software engineers to put algorithms into practice. 
They also work on cross-disciplinary efforts with other scientists within Amazon.The key strategic objectives for this role include:-Provide data-driven guidance and recommendations on strategic initiatives- Identify opportunities/tests that drive product discovery, new customer acquisition to drive growth.-Conduct, direct, and coordinate all phases of research projects, demonstrating skill in all stages of the analysis process, including defining key research questions, recommending measures, working with multiple data sources, evaluating methodology and design, executing analysis plans, interpreting and communicating results.
LU, Luxembourg
Are you a talented and inventive scientist with strong passion about Artificial Intelligence and Predictive Modeling? Would you like to develop Machine-Learning models by playing a key role within EU RME Predictive Analytics team? Our mission is to drive the Predictive Maintenance (PdM) and Spare Parts (SP) programs for Amazon EU Operations that consists of complex automation, sortation, robotic and materials handling systems.As Machine Learning Scientist you will be working with large distributed systems of data and providing predictive maintenance expertise for over 2000 maintenance engineers, managers and administrators by supporting the entire network managed by EU RME, which may include non-EU locations (such as Singapore, Australia and Japan). You will connect with world leaders in your field and you will be tackling ML challenges by carrying out a systematic literature review of Machine Learning methods applied to PdM. The appropriate choice of the ML methods and their implementation will be the key for the success of the PdM and Spare Parts programs.This role requires an individual with strong skills in the area of data science, Machine Learning and statistics. 
The successful candidate will be a self-starter comfortable with ambiguity, with strong attention to detail and outstanding ability in balancing technical leadership with strong business judgment to make the right decisions about model and method choices.Key Areas of Responsibilities:· Provide technical expertise to support team strategies that will take EU RME towards World Class predictive maintenance practices and processes, driving better equipment up-time and lower repair costs with optimized spare parts inventory and placement· Implement an advanced maintenance framework utilizing Machine Learning technologies to drive equipment performance leading to reduced unplanned downtime· Provide technical expertise to support the development of long-term spares management strategies that will ensure spares availability at an optimal level for local sites and reduce the cost of spares· Facilitate the access to data and tools for the larger Reliability Engineering team to drive reliability insights
CA, ON
The Economic Technology team (ET) is looking for a Data Scientist to join our team in building Reinforcement Learning solutions at scale. The ET applies Machine Learning, Reinforcement Learning, Causal Inference, and Econometrics/Economics to derive actionable insights about the complex economy of Amazon’s retail business. We also develop Statistical Models and Algorithms to drive strategic business decisions and improve operations. We are an interdisciplinary team of Economists, Engineers, and Scientists incubating and building day one solutions using cutting-edge technology, to solve some of the toughest business problems at Amazon.You will work with business leaders, scientists, and economists to translate business and functional requirements into concrete deliverables, including the design, development, testing, and deployment of highly scalable distributed services. You will partner with scientists, economists, and engineers to help invent and implement scalable ML, RL, and econometric models while building tools to help our customers gain and apply insights.This is a unique, high visibility opportunity for someone who wants to have business impact, dive deep into large-scale economic problems, enable measurable actions on the Consumer economy, and work closely with scientists and economists. We are particularly interested in candidates with experience building predictive models and working with distributed systems.As a Data Scientist, you bring business and industry context to science and technology decisions. You set the standard for scientific excellence and make decisions that affect the way we build and integrate algorithms. Your solutions are exemplary in terms of algorithm design, clarity, model structure, efficiency, and extensibility. You tackle intrinsically hard problems, acquiring expertise as needed. You decompose complex problems into straightforward solutions.
US, CA, Palo Alto
Amazon is investing heavily in building a world class advertising business and we are responsible for defining and delivering a collection of self-service performance advertising products that drive discovery and sales. Our products are strategically important to our Retail and Marketplace businesses driving long term growth. We deliver billions of ad impressions and millions of clicks daily and are breaking fresh ground to create world-class products. We are highly motivated, collaborative and fun-loving with an entrepreneurial spirit and bias for action. With a broad mandate to experiment and innovate, we are growing at an unprecedented rate with a seemingly endless range of new opportunities.Sponsored Products helps merchants, retail vendors, and brand owners succeed via native advertising that grows incremental sales of their products sold through Amazon. The Sponsored Products Ad Marketplace organization optimizes the systems and ad placements to match advertiser demand with publisher supply using a combination of machine learning, big data analytics, ultra-low latency high-volume engineering systems, and quantitative product focus. 
Our goals are to help buyers discover new products they love, be the most efficient way for advertisers to meet their business objectives, and to build a major, sustainable business that helps Amazon continuously innovate on behalf of all customers.We are seeking an Applied Science Manager who has a solid background in applied Machine Learning and AI, deep passion for building data-driven products; ability to communicate data insights and scientific vision, and has a proven track record of leading both applied scientists and software engineers to execute complex projects and deliver business impacts.In this role, you will:· Lead a group of both applied scientists and software engineers to deliver machine-learning and AI solutions to production.· Advance team's engineering craftsmanship and drive continued scientific innovation as a thought leader and practitioner.· Develop science and engineering roadmap, run Sprint/quarter and annual planning, and foster cross-team collaboration to execute complex projects.· Perform hands-on data analysis, build machine-learning models, run regular A/B tests, and communicate the impact to senior management.· Hire and develop top talents, provide technical and career development guidance to both scientists and engineers in the organization.
US, WA, Seattle
Do you want to join Alexa AI -- the science team behind Amazon’s intelligence voice assistance system? Do you want to utilize cutting-edge deep-learning and machine learning algorithms to delight millions of Alexa users around the world?If your answers to these questions are “yes”, then come join us at the Alexa Artificial Intelligence team, which is in charge of improving Alexa user satisfaction through real-time metrics monitoring and continuous closed-loop learning. The team owns the modules that reduce user perceived defects and frictions through utterance reformulation, contextual and personalized hypothesis ranking.With the Alexa Artificial Intelligence team, you will be working alongside a team of experienced machine/deep learning scientists and engineers to create data driven machine learning models and solutions on tasks such as sequence-to-sequence query reformulation, graph feature embedding, personalized ranking, etc..You will be expected to:· Analyze, understand, and model user-behavior and the user-experience based on large scale data, to detect key factors causing satisfaction and dissatisfaction (SAT/DSAT).· Build and measure novel online & offline metrics for personal digital assistants and user scenarios, on diverse devices and endpoints· Create and innovate deep learning and/or machine learning based algorithms for utterance reformulation and contextual hypothesis ranking to reduce user dissatisfaction in various scenarios;· Perform model/data analysis and monitor user-experienced based metrics through online A/B testing;· Research and implement novel machine learning and deep learning algorithms and models.
US, WA, Seattle
Are you passionate about the use of Machine Learning to improve the experience for Alexa and Smart Home customers? We have a team that is making revolutionary leaps forward in this space and are in need of a Machine Learning Scientist. You will have an enormous opportunity to impact the customer experience, design, architecture, and implementation of a new machine-learning driven product that will impact the lives of people you know every day.Great candidates for this position will have a passion for machine learning and signal processing and will have hands-on experience with product development. You will have the deep expertise to drive the ML vision for our products and technical breadth to make the right decisions about technology, models and methodology choices.As a Machine Learning Scientist at Amazon, you will connect with world leaders in your field working on similar problems. On this team you will analyze and model sensor data, smart home signals, and contextual data to create new experiences for customers. Meeting business requirements will involve combining several different machine learning algorithms with domain knowledge into sophisticated ML workflows. You will work with large distributed systems of data and will tackle Machine Learning challenges in Supervised, Unsupervised, and One-shot Learning, utilizing modern methods such as Deep Neural Networks and others. 
MLS’s have contributed to and are aware of the state-of-the-art in their respective field of expertise and are constantly focused on advancing the state-of-the-art for improving Amazon’s products and services.KEY RESPONSIBILITIES· Analyze and extract relevant information from large amounts of data to support new experiences for customers· Create novel ML approaches and apply them to achieve project goals· Build ML software and algorithms that cost-effectively scale to millions of customers· Work closely with other teams across Amazon to deliver platform features that require cross-team leadership· Ensure that the quality and timeliness of ML deliverables
IN, KA, Bangalore
What would you do if you had access to the world’s largest product catalog with billions of products, offers, images, reviews, searches, and much more? Amazon Selection and Catalog Systems is looking for an innovative and customer-focused applied scientist to improve the data quality of the world’s biggest product catalog, utilizing state-of-the-art machine learning techniques.

An information-rich and accurate product catalog is a critical strategic asset for Amazon. It powers unrivaled product discovery, informs customers’ buying decisions, offers a large selection, and positions Amazon as the first stop for our customers. Maintaining and improving the accuracy of the product catalog is challenging due to sheer scale (billions of products in the catalog), diversity (products ranging from electronics to groceries to instant video across multiple languages), and the multitude of input sources (millions of sellers contributing product data of varying quality).

You will conceive innovative solutions to measure and improve the quality of various aspects of our product catalog and influence the way millions of our customers discover and buy our products worldwide. The opportunity (puzzle to solve) is that there is no single solution, as the problem scope is varied and diverse. The solutions you build will range from simple rule-based systems to machine learning, semantic analysis, and text processing. You will have the opportunity to design new data analytical workflows at a scale rarely available elsewhere, utilizing state-of-the-art data science and machine learning tools such as Spark, Python, and Theano and Amazon’s cloud computing technologies such as Elastic Map Reduce (EMR), Kinesis, and Redshift.

You will apply your knowledge in data science by creating algorithmic solutions that combine techniques such as clustering, pattern mining, predictive modeling, deep learning, statistical testing, information retrieval, and natural language processing, and you will apply them to the voluminous data describing the products in the catalog and the customer interactions. You will evaluate with scientific rigor and provide inputs to business strategy and technical direction. You will collaborate with software engineering teams to integrate your algorithmic solutions into large-scale, highly complex Amazon production systems.

Responsibilities include:
· Map business requirements and customer needs to a scientific problem.
· Align the research direction to business requirements and make the right judgments on research/development schedule and prioritization.
· Research, design, implement, and deploy scalable machine-learned models, including the application of state-of-the-art deep learning, to solve problems that matter to our customers in an iterative fashion.
· Mentor and develop junior applied scientists and developers who work on data science problems in the same organization.
· Stay informed on the latest machine learning, natural language, and/or artificial intelligence trends and make presentations to the larger engineering and applied science communities.
US, WA, Seattle
The Central AWS Econ team is dedicated to bringing the most trustworthy evidence-based analysis to the most strategic decisions for AWS leadership. Our studies impact strategic investments, service business models, resource allocation, product priorities and pricing models, go-to-market motions, and more.

This senior economist role partners with AWS business leaders across the organization to define and deliver on economic questions that guide their most strategic decisions. The successful candidate will be a problem solver who enjoys diving into data, is excited by difficult modeling challenges and ambiguous starting points, and possesses strong communication skills to effectively interface and collaborate with product, finance, planning, and business teams.

Specific questions include developing supporting economics for new business models, evaluating the relationship between short- and long-term growth, and mapping and affecting the customer journey through different AWS products and cloud technologies. The Central AWS Econ team is dedicated to answering these (and many more) questions using quantitative, economic, and statistical methods.

Key Responsibilities:
· Identify and propose impactful economic studies based on business meetings
· Lead and conduct economic studies, including developing and communicating practical implications to senior leadership
· Mentor and develop junior economists and data scientists
· Develop new repeatable data analysis pipelines to be used by non-economists
US, CA, Cupertino
Are you a biochemistry research scientist? At Amazon, we are constantly inventing and re-inventing to be the most customer-centric company in the world. To get there, we need exceptionally talented, bright, and driven people. We are a smart team of doers who work passionately to apply cutting-edge advances in technology and to solve real-world problems that will transform our customers’ experiences in ways we can’t even imagine yet.

As a Research Scientist, you will be working with a unique and gifted team that is developing exciting products and collaborating with cross-functional teams.

Responsibilities:
· Collaborate to define product specifications and protocols
· Iterate through experimentation to identify optimal product parameters
· Identify and qualify new materials
· Ensure manufacturability across the design process
· Contribute to design control and regulated protocols
· Collaborate with engineering teams to design, implement, and harmonize solutions
US, WA, Seattle
** LOCATION CAN BE EITHER TEMPE, SEATTLE OR NASHVILLE **

Amazon Transportation Services (ATS) Line Haul is looking for a talented Data Scientist who will own analytics and develop solutions to drive insights and optimization for Network Planning and Forecasting. As a member of this team, you will have an opportunity to be an innovator in Amazon Logistics and work with a group of talented program managers, product managers, research scientists, software developers, and business stakeholders to design the Amazon network of the future.

This position requires innovative thinking, the ability to quickly approach large ambiguous problems, and the technical and engineering expertise to rapidly research, validate, visualize, prototype, and deliver solutions. This position also requires significant cross-functional work and integration with transportation, tech, operations, and finance. Successful candidates will thrive in a fast-paced environment.

As you further your career as a Data Scientist at Amazon, you will focus on improving corporate reporting frameworks and data visualization. You will examine performance data, discover and solve real-world problems related to forecasting, and build critical metrics and business cases. We are focused on your success and want to build and support strong pioneers within Amazon Transportation Services. You can expect to leverage your problem-solving skills and have full ownership of the projects you work on.

Responsibilities:
· Perform complex data research to identify opportunities to reduce fulfillment costs as well as improve efficiencies and customer experience
· Design, develop, and establish KPIs to provide strategic insights to drive growth and performance
· Perform and own recurring and ad hoc business intelligence projects, including the development of advanced statistical models that improve the forecasting capabilities of the Amazon transportation network
· Develop standardized metrics to evaluate and benchmark short- and long-term network planning and forecasting
· Communicate complex insights to stakeholders, both verbally and in writing
US, NY, New York
Amazon Web Services is looking for world-class scientists to join the research team within AWS Security Services. You would be entrusted with researching and developing core data mining and machine learning algorithms for various AWS security services like GuardDuty (https://aws.amazon.com/guardduty/) and Macie (https://aws.amazon.com/macie/). On this team, you will invent and implement innovative solutions for never-before-solved problems. If you have a passion for security and experience with large-scale machine learning problems, this will be an exciting opportunity.

The AWS Security Services team builds technologies that help customers strengthen their security posture and better meet security requirements in the AWS Cloud. The team interacts with security researchers to codify our own learnings and best practices and make them available for customers. We are building massively scalable and globally distributed security systems to power next-generation services.

Key Responsibilities:
· Rapidly design, prototype, and test many possible hypotheses in a high-ambiguity environment, making use of both quantitative and business judgment
· Collaborate with software engineering teams to integrate successful experiments into large-scale, highly complex production services
· Report results in a scientifically rigorous way
· Interact with security engineers and related domain experts to dive deep into the types of challenges that we need innovative solutions for
US, WA, Seattle
Amazon delights millions of customers around the world. Meet PI-Squared, the behind-the-scenes team that enables our HR and Operations Leaders to make informed decisions and improve the overall experience of a million frontline employees and leaders throughout their journey at Amazon. Our diverse team of statisticians, machine learning experts, and social scientists strives to make Amazon HR the most scientific HR organization in the world. We form hypotheses about the best talent acquisition, talent retention, and talent development techniques, and then set out to prove or disprove them with experiments and careful data collection.

The ambition of Amazon HR is to be the most scientific HR organization in the world. We bring data and machine learning into management science to deliver workforce, associate experience, and leadership insights so Amazon leaders can focus their efforts in ways that will engage, retain, and grow their talent. You will have the opportunity to work with operations leaders across different business lines to gain deep insights into Amazon’s daily operations and directly impact the productivity, quality, and safety of hundreds of thousands of employees’ everyday lives.

Roles and Responsibilities:
(1) Undertake econometric and statistical analysis to measure the impact of various initiatives in the HR space
(2) Design and measure experiments
(3) Undertake qualitative analysis to augment the findings from quantitative studies
(4) Build scalable analytic solutions using state-of-the-art tools based on large datasets

This role requires an individual with strong quantitative modeling skills and the ability to apply statistical/machine learning, econometric, and experimental design methods. Preference will be given to candidates with additional experience in qualitative analysis in a variety of settings such as focus groups, field studies, surveys, and observational studies.
US, WA, Seattle
The Try Before You Buy (TBYB) team at Amazon Fashion is looking for an Applied Scientist to join us in building our next-generation personalized recommendation systems for Personal Shopper and Prime Wardrobe. In this role, you will be responsible for researching, developing, and deploying machine learning, computer vision, and NLP models to make customers' fashion shopping experience at Amazon engaging and joyful.

The primary responsibilities of this role include:
· Build ETL pipelines to collect and process data
· Frame and transform ambiguous business challenges into science hypotheses, and design and implement offline and online experiments to evaluate them
· Develop prototypes to test new concepts and proposals for models and algorithms
· Design and build automated, scalable pipelines to train and deploy ML models