More-efficient “kernel methods” dramatically reduce training time for natural-language-understanding systems

Machine learning systems often act on “features” extracted from input data. In a natural-language-understanding system, for instance, the features might include words’ parts of speech, as assessed by an automatic syntactic parser, or whether a sentence is in the active or passive voice.

Some machine learning systems could be improved if, rather than learning from extracted features, they could learn directly from the structure of the data they’re processing. To determine parts of speech, for instance, a syntactic parser produces a tree of syntactic relationships between parts of a sentence. That tree encodes more information than is contained in simple part-of-speech tags, information that could prove useful to a machine learning system.

The problem: comparing data structures is much more time-consuming than comparing features, which means that the resulting machine learning systems are frequently too slow to be practical.

In a paper we’re presenting at the 33rd conference of the Association for the Advancement of Artificial Intelligence (AAAI), my colleagues at the University of Padova and the Qatar Computing Research Institute and I present a technique for making the direct comparison of data structures much more efficient.

In experiments involving a fundamental natural-language-understanding (NLU) task called semantic-role labeling, with syntactic trees as inputs, we compared our technique to the standard technique for doing machine learning on data structures. With slightly over four hours of training, a machine learning system using our technique achieved higher accuracy than a system trained for 7.5 days with the standard technique.

Our technique enables training using so-called mapping kernels. A kernel is a mathematical function for computing the similarity between two objects. The main idea behind mapping kernels is to decompose each object into substructures and then sum the contributions of kernel evaluations on a subset of the substructures. In our case, that means splitting syntactic trees into their constituent trees.
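To make the decompose-and-sum idea concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the kernel from the paper: trees are written as nested tuples, the substructures are just the complete subtrees rooted at each node, and two substructures contribute 1 when they are identical and 0 otherwise. The names constituents and kernel are hypothetical.

```python
from collections import Counter

# Assumed encoding: a tree is a nested tuple (label, child, child, ...);
# a leaf is a one-element tuple such as ("Mary",).

def constituents(tree):
    """Yield the complete subtree rooted at every node of `tree`."""
    yield tree
    for child in tree[1:]:
        yield from constituents(child)

def kernel(t1, t2):
    """Sum, over all pairs of constituents, 1 if the pair is identical."""
    counts1 = Counter(constituents(t1))
    counts2 = Counter(constituents(t2))
    # Identical constituents are grouped, so each group contributes n1 * n2.
    return sum(n * counts2[s] for s, n in counts1.items())

t1 = ("S", ("NP", ("Mary",)), ("VP", ("V", ("brought",))))
t2 = ("S", ("NP", ("Mary",)), ("VP", ("V", ("saw",))))

print(kernel(t1, t2))  # prints 2: the shared pieces are NP(Mary) and Mary
```

The two toy trees differ only in the verb, so only the noun-phrase constituents match.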

Typically, learning from structural examples requires that a machine learning system compare every training example it sees to all the others, in an attempt to identify structural features useful for the task at hand.

We show that, when training with tree structures, it’s possible to instead compress the complete set of constituent trees (the “forest”) into a single data structure called a directed acyclic graph (DAG).

Our analysis shows that many substructures repeat many times in the tree forest, and counts of those repetitions are encoded into the DAG. Then we can compute the similarity between the DAG and any other tree in one shot.
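The benefit of storing repetition counts can be sketched in a few lines of Python. This is a simplified stand-in rather than the paper's algorithm: it pools counts in a flat table instead of a DAG, uses complete subtrees as the substructures, and scores a pair of substructures as 1 only when they are identical. All function names are hypothetical.

```python
from collections import Counter

def constituents(tree):
    """Yield the complete subtree rooted at every node of a nested-tuple tree."""
    yield tree
    for child in tree[1:]:
        yield from constituents(child)

def pairwise_kernel(t1, t2):
    """Similarity between two individual trees."""
    c1, c2 = Counter(constituents(t1)), Counter(constituents(t2))
    return sum(n * c2[s] for s, n in c1.items())

def compress(forest):
    """Merge the constituents of every tree into one table of counts."""
    pooled = Counter()
    for tree in forest:
        pooled.update(constituents(tree))
    return pooled

def forest_kernel(pooled, tree):
    """Similarity between the whole compressed forest and one tree."""
    return sum(n * pooled[s] for s, n in Counter(constituents(tree)).items())

forest = [
    ("NP", ("D", ("a",)), ("N", ("cat",))),
    ("NP", ("D", ("a",)), ("N", ("dog",))),
    ("NP", ("D", ("the",)), ("N", ("cat",))),
]
query = ("NP", ("D", ("a",)), ("N", ("cat",)))

pooled = compress(forest)
one_shot = forest_kernel(pooled, query)
one_by_one = sum(pairwise_kernel(t, query) for t in forest)
print(one_shot, one_by_one)  # the two totals agree
```

Comparing the query against the pooled counts gives the same total as comparing it against each tree in turn, but the repeated substructures are visited only once.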

An example of our method: (a) two trees to be compared; (b) the kernels for each tree — in this case, each tree’s complete set of “subset trees”; (c) a dictionary of the subset trees, which includes their frequency of occurrence; (d) the DAG (directed acyclic graph) representing the complete “forest” of subset trees.

A tree is a data structure that starts with a root node, usually depicted as a circle. The root can have any number of children, also depicted as circles, to which it is connected by edges, usually depicted as line segments. The children can have children, which can have children, and so on.

What defines a tree is that every node except the root has exactly one parent, and the root has none. (Put differently, between any two nodes, there is exactly one path.) In NLU, trees often indicate syntactic relationships between words of a sentence.

The parse tree for the sentence “Mary brought a cat to school”. Abbreviations indicate parts of speech. For instance, VP means verb phrase, PP prepositional phrase, D determiner, and so on.

In our paper, we define several different kernels for comparing trees. One involves comparing partial trees, or any collections of a tree’s nodes and edges that are themselves trees. One involves comparing elastic trees, or trees that indicate the paths of descent between pairs of nodes (between, say, great-grandparent nodes and great-grandchildren nodes), without representing the intervening nodes. But the kernels we used in our experiments are called subset trees.

To create the set of subset trees for a given tree, you first lop off the root node and add the resulting tree or trees to your collection. Then you lop off the roots of each of those trees and add the results to your collection, and so on, until you’re left with the individual nodes of the tree’s last layer, which also join the collection (see section (b) of the first figure, above).
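The root-lopping procedure can be sketched as follows. This is a literal reading of the description above, written for nested-tuple trees; the formal definition of subset trees in the paper may differ, and the function name lop_roots and the toy verb-phrase tree are assumptions made for illustration.

```python
from collections import deque

def lop_roots(tree):
    """Repeatedly remove roots, collecting every tree that results.

    A tree is a nested tuple (label, child, child, ...); a leaf is (label,).
    """
    collection = []
    queue = deque([tree])
    while queue:
        current = queue.popleft()
        # Removing the root of `current` frees each child as its own tree.
        for child in current[1:]:
            collection.append(child)
            queue.append(child)
    return collection

tree = ("VP",
        ("V", ("brought",)),
        ("NP", ("D", ("a",)), ("N", ("cat",))))

pieces = lop_roots(tree)
for piece in pieces:
    print(piece)
```

The loop stops when only single nodes from the last layer remain, since a leaf has no children left to free.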

With our method, we pool the subset trees of all the trees we wish to compare, then select a single representative example of each subset tree in the pool. A given example may be unique to one tree, or it may be common to many. Along with each example, we also record the number of times it occurs in all of our sample trees (section (c), above).

Finally, we decompose the representative examples, too, into their constituent parts and record the frequency with which any node descends from any other (section (d), above). The result is a comparatively simple model that captures the statistical relationships between all the nodes in all the examples our machine learning system has seen.
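One way to picture the resulting structure is a graph in which every distinct subtree is stored once and shared by all the trees that contain it. The sketch below builds such a graph in Python; it is an assumption-laden illustration rather than the construction from the paper. Here each DAG node stands for one distinct complete subtree, and its count records how many times that subtree occurs anywhere in the forest. The name build_dag is hypothetical.

```python
def build_dag(forest):
    """Return a list of DAG nodes, each stored as [label, child_ids, count]."""
    ids = {}    # (label, child_ids) -> index of the shared node
    nodes = []  # index -> [label, child_ids, count]

    def intern(tree):
        # Children are interned first, so a child's index is always
        # smaller than its parent's and the graph cannot contain a cycle.
        child_ids = tuple(intern(child) for child in tree[1:])
        key = (tree[0], child_ids)
        if key not in ids:
            ids[key] = len(nodes)
            nodes.append([tree[0], child_ids, 0])
        nodes[ids[key]][2] += 1  # one more occurrence of this subtree
        return ids[key]

    for tree in forest:
        intern(tree)
    return nodes

forest = [
    ("NP", ("D", ("a",)), ("N", ("cat",))),
    ("NP", ("D", ("a",)), ("N", ("dog",))),
]

nodes = build_dag(forest)
for index, (label, child_ids, count) in enumerate(nodes):
    print(index, label, child_ids, count)
```

The two toy trees contain ten nodes between them, but the shared determiner means the graph needs only eight.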

With this approach, we can train a semantic-role-labeling system on two million examples in 4.3 hours. With the standard approach, using the same kernels, it takes more than a week to train a system on half as many examples.

Acknowledgments: Giovanni Da San Martino, Alessandro Sperduti, Fabio Aiolli

About the Author
Alessandro Moschitti is a principal scientist on the Alexa Search team.

Related content

View from space of a connected network around planet Earth representing the Internet of Things.
Get more from Amazon Science
Sign up for our monthly newsletter

Work with us

See more jobs
US, WA, Bellevue
Do you enjoy solving complex problems? Are you eager to change the world with data science? At Amazon Taskless, we challenge ourselves with questions like, what if we can verify documentation in seconds instead of days? What if we could quickly automate complex processes which are not well documented? What if we can improve customer retention?By adopting technologies such as machine learning, computer vision (Amazon Rekognition & Textract) and natural language processing(Amazon Lex), Amazon Taskless transforms tedious businesses processing with Intelligent Automation and Robotic Process Automation. We built an identity management system, which simplify compliance across all Amazon businesses including Twitch, Flex, Amazon sellers, Kindle Direct Publishing authors globally.As a Data Scientist, you will work on our Science team and partner closely with other data scientists , data engineers as well as product managers, UX designers, and business partners across Amazon to accurately model and remove tasks from their processes. Outputs from your models will directly improve customer experience across Amazon while delivering cost savings. You will be responsible for building data science prototypes that optimize business processes and innovate for our customers in new ways.You are skeptical. When someone gives you a data source or walks you through their process, you pepper them with questions about, accuracy, coverage, and the need of steps in their process. When you’re told a model can make assumptions, you aggressively try to break those assumptions.You do whatever it takes to add value. You don’t care whether you’re building complex machine learning models, writing blazing fast code, integrating multiple disparate data-sets, or creating baseline models - you care passionately about stakeholders and know that as a curator of data insight you can unlock massive cost savings and retain customers.You have a limitless curiosity. 
You constantly ask questions about the technologies and approaches we are taking and are constantly learning about industry best practices you can bring to our team.You have excellent business and communication skills to be able to work with product owners to understand key business questions and earn the trust of senior leaders. You will need to make the complex simple to understand.You are comfortable juggling competing priorities and handling ambiguity. You thrive in an agile and fast-paced environment on highly visible projects and initiatives. The tradeoffs of cost savings and customer experience are constantly up for debate among senior leadership - you will help drive this conversation.
US, WA, Bellevue
Do you enjoy solving complex problems? Are you eager to change the world with data science? At Amazon Taskless, we challenge ourselves with questions like, what if we can verify documentation in seconds instead of days? What if we could quickly automate complex processes which are not well documented? What if we can improve customer retention?By adopting technologies such as machine learning, computer vision (Amazon Rekognition & Textract) and natural language processing(Amazon Lex), Amazon Taskless transforms tedious businesses processing with Intelligent Automation and Robotic Process Automation. We built an identity management system, which simplify compliance across all Amazon businesses including Twitch, Flex, Amazon sellers, Kindle Direct Publishing authors globally.As a Data Scientist, you will work on our Science team and partner closely with other data scientists , data engineers as well as product managers, UX designers, and business partners across Amazon to accurately model and remove tasks from their processes. Outputs from your models will directly improve customer experience across Amazon while delivering cost savings. You will be responsible for building data science prototypes that optimize business processes and innovate for our customers in new ways.You are skeptical. When someone gives you a data source or walks you through their process, you pepper them with questions about, accuracy, coverage, and the need of steps in their process. When you’re told a model can make assumptions, you aggressively try to break those assumptions.You do whatever it takes to add value. You don’t care whether you’re building complex machine learning models, writing blazing fast code, integrating multiple disparate data-sets, or creating baseline models - you care passionately about stakeholders and know that as a curator of data insight you can unlock massive cost savings and retain customers.You have a limitless curiosity. 
You constantly ask questions about the technologies and approaches we are taking and are constantly learning about industry best practices you can bring to our team.You have excellent business and communication skills to be able to work with product owners to understand key business questions and earn the trust of senior leaders. You will need to make the complex simple to understand.You are comfortable juggling competing priorities and handling ambiguity. You thrive in an agile and fast-paced environment on highly visible projects and initiatives. The tradeoffs of cost savings and customer experience are constantly up for debate among senior leadership - you will help drive this conversation.
US, WA, Bellevue
Do you enjoy solving complex problems? Are you eager to change the world with data science? At Amazon Taskless, we challenge ourselves with questions like, what if we can verify documentation in seconds instead of days? What if we could quickly automate complex processes which are not well documented? What if we can improve customer retention?By adopting technologies such as machine learning, computer vision (Amazon Rekognition & Textract) and natural language processing(Amazon Lex), Amazon Taskless transforms tedious businesses processing with Intelligent Automation and Robotic Process Automation. We built an identity management system, which simplify compliance across all Amazon businesses including Twitch, Flex, Amazon sellers, Kindle Direct Publishing authors globally.As a Data Scientist, you will work on our Science team and partner closely with other data scientists , data engineers as well as product managers, UX designers, and business partners across Amazon to accurately model and remove tasks from their processes. Outputs from your models will directly improve customer experience across Amazon while delivering cost savings. You will be responsible for building data science prototypes that optimize business processes and innovate for our customers in new ways.You are skeptical. When someone gives you a data source or walks you through their process, you pepper them with questions about, accuracy, coverage, and the need of steps in their process. When you’re told a model can make assumptions, you aggressively try to break those assumptions.You do whatever it takes to add value. You don’t care whether you’re building complex machine learning models, writing blazing fast code, integrating multiple disparate data-sets, or creating baseline models - you care passionately about stakeholders and know that as a curator of data insight you can unlock massive cost savings and retain customers.You have a limitless curiosity. 
You constantly ask questions about the technologies and approaches we are taking and are constantly learning about industry best practices you can bring to our team.You have excellent business and communication skills to be able to work with product owners to understand key business questions and earn the trust of senior leaders. You will need to make the complex simple to understand.You are comfortable juggling competing priorities and handling ambiguity. You thrive in an agile and fast-paced environment on highly visible projects and initiatives. The tradeoffs of cost savings and customer experience are constantly up for debate among senior leadership - you will help drive this conversation.
US, WA, Bellevue
Do you enjoy solving complex problems? Are you eager to change the world with data science? At Amazon Taskless, we challenge ourselves with questions like, what if we can verify documentation in seconds instead of days? What if we could quickly automate complex processes which are not well documented? What if we can improve customer retention?By adopting technologies such as machine learning, computer vision (Amazon Rekognition & Textract) and natural language processing(Amazon Lex), Amazon Taskless transforms tedious businesses processing with Intelligent Automation and Robotic Process Automation. We built an identity management system, which simplify compliance across all Amazon businesses including Twitch, Flex, Amazon sellers, Kindle Direct Publishing authors globally.As a Data Scientist, you will work on our Science team and partner closely with other data scientists , data engineers as well as product managers, UX designers, and business partners across Amazon to accurately model and remove tasks from their processes. Outputs from your models will directly improve customer experience across Amazon while delivering cost savings. You will be responsible for building data science prototypes that optimize business processes and innovate for our customers in new ways.You are skeptical. When someone gives you a data source or walks you through their process, you pepper them with questions about, accuracy, coverage, and the need of steps in their process. When you’re told a model can make assumptions, you aggressively try to break those assumptions.You do whatever it takes to add value. You don’t care whether you’re building complex machine learning models, writing blazing fast code, integrating multiple disparate data-sets, or creating baseline models - you care passionately about stakeholders and know that as a curator of data insight you can unlock massive cost savings and retain customers.You have a limitless curiosity. 
You constantly ask questions about the technologies and approaches we are taking and are constantly learning about industry best practices you can bring to our team.You have excellent business and communication skills to be able to work with product owners to understand key business questions and earn the trust of senior leaders. You will need to make the complex simple to understand.You are comfortable juggling competing priorities and handling ambiguity. You thrive in an agile and fast-paced environment on highly visible projects and initiatives. The tradeoffs of cost savings and customer experience are constantly up for debate among senior leadership - you will help drive this conversation.
US, WA, Bellevue
At Amazon we're working to be the most customer-centric company on earth. Within the Access Points team, we do this by creating delivery experiences that delight customers, growing our worldwide network of Amazon Hub Lockers and Counters, providing away from home pickup options, and by creating new delivery initiatives that solve the changing needs of our Customers. At any Access Point, customers should expect to return, or redirect their Amazon deliveries. We measure our impact in transportation savings, revenue, and downstream customer acquisition / engagement /purchasing with Amazon.We are looking for an accomplished Manager, Research Science for Amazon Access Point’s worldwide data science team. You will define the research science direction for the team and work with our engineers to create an advanced system solving mathematically complex constraint problems. You will lead the team to own development of novel algorithmic architectures, toward the ultimate goal of accurately predicting customer purchase propensity, demand pattern and optimizing for site selection topology future Access Point locations and eligible products worldwide.Access Point has a rapidly growing customer base and an exciting science charter in front of us that includes solving highly complex algorithmic problems. 
You will work closely with and learn from data professionals from various disciplines (eg data engineers, analysts, machine learning engineers, economists and other fellow research scientists).Key responsibilities:· Hire, manager and grow a team of scientists and be the thought leader on the team· Collaborate with product managers and engineering teams to design and implement software solutions for Amazon problems· Contribute to progress of the Amazon and broader research communities by producing publications· Be hands-on when needed, to mine the large amount of data, prototype and implement new learning algorithms and prediction techniques to improve forecast accuracy or optimization performance
CA, ON, Toronto
Amazon Sponsored Ads is one of the fastest growing business domains and we are looking for talented scientists to join this team of incredible scientists to contribute to this growth. We are still in Day 1 and there is an abundance of opportunities that are yet to be explored. We are a team of highly motivated and collaborative team of machine learning and data scientists, with an entrepreneurial spirit and bias for action. We have a broad mandate to experiment and innovate, and we are growing at an unprecedented rate with a seemingly endless range of new opportunities. Sponsored Products (SP) Bids and Budgets team is focussed on helping advertisers set their campaign bids and budgets in an optimized fashion.As an Research Scientist on this team you will:· Build machine learning models and utilize data analysis to deliver scalable solutions to business problems.· Perform hands-on analysis and modeling with very large data sets to develop insights that increase traffic monetization and merchandise sales without compromising shopper experience.· Work closely with software engineers on detailed requirements, technical designs and implementation of end-to-end solutions in production.· Design and run A/B experiments that affect hundreds of millions of customers, evaluate the impact of your optimizations and communicate your results to various business stakeholders.· Work with scientists and economists to model the interaction between organic sales and sponsored content and to further evolve Amazon's marketplace.· Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation and serving.· Research new predictive learning approaches for the sponsored products business.Why you love this opportunityAmazon is investing heavily in building a world-class advertising business. This team is responsible for defining and delivering a collection of advertising products that drive discovery and sales. 
Our solutions generate billions in revenue and drive long-term growth for Amazon’s Retail and Marketplace businesses. We deliver billions of ad impressions, millions of clicks daily, and break fresh ground to create world-class products. We are highly motivated, collaborative, and fun-loving team with an entrepreneurial spirit - with a broad mandate to experiment and innovate.Impact and Career GrowthYou will invent new experiences and influence customer-facing shopping experiences to help suppliers grow their retail business and the auction dynamics that leverage native advertising; this is your opportunity to work within the fastest-growing businesses across all of Amazon! Define a long-term science vision for our advertising business, driven fundamentally from our customers' needs, translating that direction into specific plans for research and applied scientists, as well as engineering and product teams. This role combines science leadership, organizational ability, technical strength, product focus, and business understanding.Team video https://youtu.be/zD_6Lzw8raE
US, MA, Cambridge
Alexa is Amazon’s intelligent cloud-based voice recognition and natural language understanding virtual assistant. We’re building the speech and language solutions behind Amazon Alexa and other Amazon products and services. Come join our team and help improve the customer experience for the growing base of Alexa users!The Alexa Artificial Intelligence (AI) team is seeking a talented Applied Scientist to build ML models to detect issues that end-users have in their interactions with Alexa (defects and their possible root causes). These models are then used to monitor trends over time with Customer Experience (CX) metrics, guardrail metrics in weblabs, setting defect reduction goals, and defect discovery and resolution.A day in the life· Design, build, test and release predictive ML models· Ensure data quality throughout all stages of acquisition and processing, including such areas as data sourcing/collection, ground truth generation, normalization, and transformation.· Collaborate with colleagues from science, engineering and business backgrounds.· Present proposals and results to partner teams in a clear manner backed by data and coupled with actionable conclusions· Work with engineers to develop efficient data querying and inference infrastructure for both offline and online use casesAbout the hiring groupAlexa AI is an analytics and science team within Alexa. Our mission is to provide an understanding of the customer experience that allows Alexa teams to improve system performance and customer engagement. Our primary deliverables are CX metrics, analytics tools, and customer insights.Job responsibilitiesAs an Applied Scientist with our Alexa AI team, you will work on assessing Alexa's performance using predictive ML models. You will build and improve models to classify Alexa’s responses as correct/incorrect, and predict the most likely cause of failure in cases of incorrect action. 
Your work will directly impact our customers in the form of products and services that make use of speech and language technology, particularly in developing predictive models to continuously improve the Alexa experience for our customers.Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.
LU, Luxembourg
Are you a talented and inventive engineer with strong passion about Artificial Intelligence and Predictive Modeling? Would you like to develop Machine-Learning tools by playing a key role within EU RME Predictive Analytics team? Our mission is to drive the Predictive Maintenance (PdM) and Spare Parts (SP) programs for Amazon EU Operations that consists of complex automation, sortation, robotic and materials handling systems.As Machine Learning Tool Specialist you will be working with large distributed systems of data and providing predictive maintenance expertise for over 2000 maintenance engineers, managers and administrators by supporting the entire network managed by EU RME, which may include non-EU locations (such as Singapore, Australia and Japan). You will connect with world leaders in your field and you will be tackling ML challenges by carrying out a systematic review of existing solutions. The appropriate choice of the ML methods and their deployment into effective tools will be the key for the success in this role.The successful candidate will be a self-starter comfortable with ambiguity, with strong attention to detail and outstanding ability in balancing technical leadership with strong business judgment to make the right decisions about model and method choices.Key Areas of Responsibilities:· Provide technical expertise to support team strategies that will take EU RME towards World Class predictive maintenance practices and processes, driving better equipment up-time and lower repair costs with optimized spare parts inventory and placement· Implement an advanced maintenance framework utilizing Machine Learning technologies to drive equipment performance leading to reduced unplanned downtime· Provide technical expertise to support the development of long-term spares management strategies that will ensure spares availability at an optimal level for local sites and reduce the cost of spares
LU, Luxembourg
Are you a talented and inventive engineer with strong passion about Artificial Intelligence and Predictive Modeling? Would you like to develop Machine-Learning tools by playing a key role within EU RME Predictive Analytics team? Our mission is to drive the Predictive Maintenance (PdM) and Spare Parts (SP) programs for Amazon EU Operations that consists of complex automation, sortation, robotic and materials handling systems.As Machine Learning Tool Specialist you will be working with large distributed systems of data and providing predictive maintenance expertise for over 2000 maintenance engineers, managers and administrators by supporting the entire network managed by EU RME, which may include non-EU locations (such as Singapore, Australia and Japan). You will connect with world leaders in your field and you will be tackling ML challenges by carrying out a systematic review of existing solutions. The appropriate choice of the ML methods and their deployment into effective tools will be the key for the success in this role.The successful candidate will be a self-starter comfortable with ambiguity, with strong attention to detail and outstanding ability in balancing technical leadership with strong business judgment to make the right decisions about model and method choices.Key Areas of Responsibilities:· Provide technical expertise to support team strategies that will take EU RME towards World Class predictive maintenance practices and processes, driving better equipment up-time and lower repair costs with optimized spare parts inventory and placement· Implement an advanced maintenance framework utilizing Machine Learning technologies to drive equipment performance leading to reduced unplanned downtime· Provide technical expertise to support the development of long-term spares management strategies that will ensure spares availability at an optimal level for local sites and reduce the cost of spares
US, WA, Seattle
Do you want to join Alexa Artificial Intelligence (AI), the science team behind Amazon’s intelligence voice assistance system? Do you want to utilize cutting-edge deep-learning and machine learning algorithms to delight millions of Alexa users around the world?If your answers to these questions are “yes”, then come join the Alexa AI team, which is in charge of improving Alexa user satisfaction through real-time metrics monitoring and continuous closed-loop learning. The team owns the modules that reduce user perceived defects and frictions through utterance reformulation, contextual and personalized hypothesis ranking.A day in the lifeAs a Senior Applied Scientist, you will be working alongside a team of experienced machine/deep learning scientists and engineers to create data driven machine learning models and solutions on tasks such as sequence-to-sequence query reformulation, graph feature embedding, personalized ranking, etc..About the hiring groupThe Alexa AI team is in charge of improving Alexa user satisfaction through real-time metrics monitoring and continuous closed-loop learning. 
The team owns the modules that reduce user perceived defects and frictions through utterance reformulation, contextual and personalized hypothesis ranking.Job responsibilitiesYou will be expected to:· Analyze, understand, and model user-behavior and the user-experience based on large scale data, to detect key factors causing satisfaction and dissatisfaction (SAT/DSAT).· Build and measure novel online & offline metrics for personal digital assistants and user scenarios, on diverse devices and endpoints· Create and innovate deep learning and/or machine learning based algorithms for utterance reformulation and contextual hypothesis ranking to reduce user dissatisfaction in various scenarios;· Perform model/data analysis and monitor user-experienced based metrics through online A/B testing;· Research and implement novel machine learning and deep learning algorithms and models.Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.