Mitigating social bias in knowledge graph embeddings

Method significantly reduces bias while maintaining comparable performance on machine learning tasks.

Question-answering systems frequently rely on knowledge graphs, large collections of facts about real-world entities (people, organizations, countries, etc.). To make use of the information in knowledge graphs, machine learning models often employ knowledge graph embeddings, vector representations of the entities in the graphs. 

A potential problem with this approach is that the distributions of data in knowledge graphs reflect current and historical social biases. For instance, most knowledge graphs include more male entities than female with the profession “banker”, or more “white American” entities than “African-American” entities with the profession “ballet dancer”. 

If knowledge graph embeddings end up encoding these biases, so will the question-answering systems that use them. If a little girl talking to a chatbot asks, “What should I be when I grow up?”, a biased embedding might rule out possible answers that are predominantly associated with men in the knowledge graph. For some professions — “baritone”, for instance — that may be fine. But in other cases, the biases may be relics of a less egalitarian past.

A two-dimensional representation of our method for measuring bias in knowledge graph embeddings.
A two-dimensional representation of our method for measuring bias in knowledge graph embeddings. In each diagram, the blue dots labeled person1 indicate the shift in an embedding as we tune its parameters. The orange arrows represent relation vectors and the orange dots the resulting sums. As we shift the gender relation toward maleness, the profession relation shifts away from nurse and closer to doctor.
Credit: Glynis Condon

Earlier this year, at the AKBC Workshop on Bias in Knowledge Graphs, we presented a paper that examines this problem. Using a standard embedding technique, we looked for correlations between the professions of people listed in Wikidata and demographic factors, such as gender, ethnicity, and religion, to see whether the embeddings do indeed encode harmful social biases. 

Following on from this, at last week’s Conference on Empirical Methods in Natural Language Processing (EMNLP), we presented “Debiasing knowledge graph embeddings”, in which we attempt to address this problem by developing a lightweight alteration to the standard method of training graph embeddings that reduces bias. 

As knowledge graph embeddings become more widely used within the machine learning community, we hope this work raises awareness of the biases they may encode and moves us closer to the goal of effective debiasing.

Knowledge graph embedding

Knowledge graph consisting of a left entity, a relation, and a right entity.
Knowledge graphs generally store facts as triples consisting of a left entity, a relation, and a right entity.
Credit: Joseph Fisher

A standard knowledge graph represents data using triples, each of which consists of two entities and the relationship between them: for instance, the entities emmanuelle_charpentier and germany are related by the relation lives_in.

Knowledge graph embeddings represent the entities in a knowledge graph as points in a multidimensional space. The idea is that spatial relationships between the points encode the relationships captured by the graph. 

With the common embedding framework TransE, for instance, adding the vector representing the relationship lives_in to the point representing emmanuelle_charpentier should bring us close to the location of the point representing germany. 

During training, the embedding model learns to maximize the accuracy of these spatial relationships across all the triples captured in the knowledge graph. Among other applications, embeddings can be used for link prediction, or inferring relationships between entities that do not yet feature in the graph.

Do trained knowledge graph embeddings encode social biases?

To see why knowledge graph embeddings might encode social biases, let’s look at the counts of male and female entities in Wikidata, the most extensive open-source knowledge graph.

Counts of male and female entities in Wikidata
Counts of male and female entities in Wikidata
Credit: Joseph Fisher

There are more than four times as many male entities in Wikidata as there are female, a reflection of long-persisting social biases in the real world. 

In our paper “Measuring social bias in knowledge graph embeddings”, we determine whether such differences in counts become encoded in embeddings. To do this, we take the embedding of a human entity and tune it so that the addition of a relation vector — such has has_religion or has_gender — edges closer to the embedding for some particular right-hand attribute — such as “Catholic” or “female”.

List of top 20 'most female' professions in Wikidata according to TransE embeddings. 
The top 20 “most female” professions in Wikidata according to TransE embeddings. B_p denotes the bias score, C_fem the counts of female entities in the knowledge graph with these professions, and C_male the counts of male entities with these professions.
From “Measuring social bias in knowledge graph embeddings”

As we tune the embedding, we observe how the result of adding the has_profession vector changes. That is, for each potential profession, we determine whether the model assigns it to the person with greater or lesser probability as the embedding changes. 

Running this calculation across all humans and professions, we are able to identify the professions that the embeddings encode as the “most male” and the “most female”. The table at right shows the top 20 “most female” professions according to our measure. (The number of entities in Wikipedia with non-binary genders is comparatively negligible; although this represents another bias in the data, it also means that the resulting embeddings would be too noisy to yield meaningful results in our study.)

List of the top 20 'most male' professions in Wikidata according to TransE embeddings.
Top 20 "most male" professions in Wikidata according to TransE embeddings.
From “Measuring social bias in knowledge graph embeddings”

The differences in the counts of entities in the knowledge graph with these professions appear to translate to biases in the embeddings. There are some professions, such as “homekeeper”, that we would prefer were not associated with a particular gender; others, such as “woman of letters”, may be less controversial. 

We also calculate the top 20 “male” professions, where the conclusions are similar.

Can we adjust the training of knowledge graph embeddings to reduce encoded biases?

In “Debiasing knowledge graph embeddings”, we turn our attention to reducing such biases and their potentially harmful consequences for downstream applications, such as chatbots. To do this, we train the embedding model not only on how faithfully it reconstructs triples but also on how well it approximates even distributions for gender and other sensitive characteristics, such as religion. 

Put another way, we update the embedding of person1 so that it becomes impossible for the model to predict gender. If this is done precisely, it should also break correlations between gender and profession.

Diagram that demonstrates measurement of how well a knowledge graph embedding scheme matches target distributions.
Our debiasing approach uses Kullback-Leibler (KL) divergence to measure how well a knowledge graph embedding scheme matches our target distributions.
Credit: Joseph Fisher

A potential drawback is that this approach prevents the model from using gender, religion, nationality, or ethnicity to predict noncontroversial triples. For instance, we may like the embeddings to reflect that a nun is more likely to be female than male.

To allow this, we introduce attribute embeddings. In cases where we wish to make use of sensitive information, we can simply add these attribute vectors back in to the embeddings.

Attribute vector: the male attribute embedding back into the model for the profession nun but not for the profession doctor.
Here, we add the male attribute embedding back into the model for the profession nun but not for the profession doctor.
Credit: Joseph Fisher

We evaluate our model against a Basic TransE model with no debiasing and against the debiasing approach adopted by Bose et al., which uses neural-network filters proposed in the literature previously. We measure the usefulness of the embeddings for link prediction (according to mean reciprocal rank, or MRR), their bias, and training time.

During training, embeddings are scored on their accuracy — the degree to which they reproduce the corresponding triples in the knowledge graph. We measure bias as the difference between those scores for entities that fall into one category or another — religion, gender, and so on. We find that our model incurs a slight (roughly 3%) dropoff in link prediction accuracy in exchange for a dramatic reduction in bias.

MRRGender biasSeconds per epoch
Basic0.682.7968.4
Bose et al.0.4262.75533.3
Ours0.660.1989.4

Related content

US, CA, San Diego
Do you want to join an innovative team of scientists who use deep learning, natural language processing, large language models to help Amazon provide the best seller experience across the entire Seller life cycle, including recruitment, growth, support and provide the best customer and seller experience by automatically mitigating risk? Do you want to build advanced algorithmic systems that help manage the trust and safety of millions of customer interactions every day? Are you excited by the prospect of analyzing and modeling terabytes of data and creating state-of-the-art algorithms to solve real world problems? Are you excited by the opportunity to leverage GenAI and innovate on top of the state-of-the-art large language models to improve customer and seller experience? Do you like to build end-to-end business solutions and directly impact the profitability of the company? Do you like to innovate and simplify processes? If yes, then you may be a great fit to join the Machine Learning Accelerator team in the Amazon Selling Partner Services (SPS) group. Key job responsibilities The scope of an Applied Scientist III in the Selling Partner Services (SPS) Machine Learning Accelerator (MLA) team is to research and prototype Machine Learning applications that solve strategic business problems across SPS domains. Additionally, the scientist collaborates with engineers and business partners to design and implement solutions at scale when they are determined to be of broad benefit to SPS organizations. They develop large-scale solutions for high impact projects, introduce tools and other techniques that can be used to solve problems from various perspectives, and show depth and competence in more than one area. They influence the team’s technical strategy by making insightful contributions to the team’s priorities, approach and planning. They develop and introduce tools and practices that streamline the work of the team, and they mentor junior team members and participate in hiring. We are open to hiring candidates to work out of one of the following locations: San Diego, CA, USA
US, WA, Seattle
Amazon is looking for a strategic, innovative science leader within the Global Talent and Compensation (GTMC) organization to lead an interdisciplinary team charged with developing data-driven solutions to model, automate, and inform high judgement decision making by bringing together science and technology in consumer grade internal talent products. GTMC delivers employee-focused experiences by providing scalable and responsive mechanisms for employees, as well as listening and signaling mechanisms for managers and leaders. They do this through intelligent, flexible, and extensible products and scalable data and science services. They set out to deliver a singular experience supporting multiple employee talent journeys (e.g., onboarding, evaluation, compensation, movement, promotion, exit), to generate and capture signals from product data, surface outliers, increase personalization, and improve the efficacy of “next best action” recommendations, for 1.6 million Amazonians around the world. In this role you will lead multiple research teams across the disciplines of Talent Management, Diversity Equity and Inclusion, and Compensation. You will interface with the most senior leaders at Amazon to develop and deliver on a strategic research roadmap that crosses all lines of Amazon businesses (e.g., Consumer, AWS, Devices, Advertising). This role will then partner with engineering and product management leader to deliver the outcomes of this research in production environments. Successful candidates will have an established background expertise in machine learning with some experience in applying this expertise to the fields of talent management, product management and/or software development. We are open to hiring candidates to work out of one of the following locations: Seattle, WA, USA
IN, KA, Bangalore
Are you interested in changing the Digital Reading Experience? We are from Kindle Books Team looking for a set of Scientists to take the reading experience in Kindle to next level with a set of innovations! We envision Kindle as the place where readers find the best manifestation of all written content optimized with features that enable them to get the most out of reading, and creators are able to realize their vision to customers quickly and at scale. Every time customers open their content, regardless of surface, they start or restart their reading in a familiar, useful and engaging place. We achieve this by building a strong foundation of core experiences and act as a force multiplier and partner for content creators (directly or indirectly) to easily innovate on top of Kindle's purpose built content experience stack in a simple and extensible way. We will achieve this by providing a best-in-class reading experience, unique content experiences, and remaining agile in meeting the evolving needs and preferences of our users. Our goal is to foster long-lasting reading habits and make us the preferred destination for enriching literary experiences. We are building a In The Book Science team and looking for Scientists, who are passionate about Reading and are willing to take Reading to the next level. Every Book is a complex structure with different entities, layout, format and semantics, with more than 17MM eBooks in our catalog. We are looking for experts in all domains like core NLP, Generative AI, CV and Deep Learning Techniques for unlocking capabilities like analysis, enhancement, curation, moderation, translation, transformation and generation in Books based on Content structure, features, Intent & Synthesis. Scientists will focus on Inside the book content and semantically learn the different entities to enhance the Reading experience overall (Kindle & beyond). They have an opportunity to influence in 2 major phases of life-cycle - Publishing (Creation of Books process) and Reading experience (building engaging features & representation in the book thereby driving reading engagement). Key job responsibilities - 5+ years of building machine learning models for business application experience - PhD, or Master's degree and 6+ years of applied research experience - Knowledge of programming languages such as C/C++, Python, Java or Perl - Experience programming in Java, C++, Python or related language - You have expertise in one of the applied science disciplines, such as machine learning, natural language processing, computer vision, Deep learning - You are able to use reasonable assumptions, data, and customer requirements to solve problems. - You initiate the design, development, execution, and implementation of smaller components with input and guidance from team members. - You work with SDEs to deliver solutions into production to benefit customers or an area of the business. - You assume responsibility for the code in your components. You write secure, stable, testable, maintainable code with minimal defects. - You understand basic data structures, algorithms, model evaluation techniques, performance, and optimality tradeoffs. - You follow engineering and scientific method best practices. You get your designs, models, and code reviewed. You test your code and models thoroughly - You participate in team design, scoping and prioritization discussions. You are able to map a business goal to a scientific problem and map business metrics to technical metrics. - You invent, refine and develop your solutions to ensure they are meeting customer needs and team goals. You keep current with research trends in your area of expertise and scrutinize your results. - Experience in mentoring junior scientists A day in the life You will be working with a group of talented scientists on researching algorithm and running experiments to test solutions to improve our experience. This will involve collaboration with partner teams including engineering, PMs, data annotators, and other scientists to discuss data quality, model development and productionizing the same. You will mentor other scientists, review and guide their work, help develop roadmaps for the team. We are open to hiring candidates to work out of one of the following locations: Banagalore, KA, IND | Bangalore, IND | Bangalore, KA, IND
IN, KA, Bangalore
Are you interested in changing the Digital Reading Experience? We are from Kindle Books Team looking for a set of Scientists to take the reading experience in Kindle to next level with a set of innovations! We envision Kindle as the place where readers find the best manifestation of all written content optimized with features that enable them to get the most out of reading, and creators are able to realize their vision to customers quickly and at scale. Every time customers open their content, regardless of surface, they start or restart their reading in a familiar, useful and engaging place. We achieve this by building a strong foundation of core experiences and act as a force multiplier and partner for content creators (directly or indirectly) to easily innovate on top of Kindle's purpose built content experience stack in a simple and extensible way. We will achieve this by providing a best-in-class reading experience, unique content experiences, and remaining agile in meeting the evolving needs and preferences of our users. Our goal is to foster long-lasting reading habits and make us the preferred destination for enriching literary experiences. We are building a In The Book Science team and looking for Scientists, who are passionate about Reading and are willing to take Reading to the next level. Every Book is a complex structure with different entities, layout, format and semantics, with more than 17MM eBooks in our catalog. We are looking for experts in all domains like core NLP, Generative AI, CV and Deep Learning Techniques for unlocking capabilities like analysis, enhancement, curation, moderation, translation, transformation and generation in Books based on Content structure, features, Intent & Synthesis. Scientists will focus on Inside the book content and semantically learn the different entities to enhance the Reading experience overall (Kindle & beyond). They have an opportunity to influence in 2 major phases of life-cycle - Publishing (Creation of Books process) and Reading experience (building engaging features & representation in the book thereby driving reading engagement). Key job responsibilities - 3+ years of building machine learning models for business application experience - PhD, or Master's degree and 2+ years of applied research experience - Knowledge of programming languages such as C/C++, Python, Java or Perl - Experience programming in Java, C++, Python or related language - You have expertise in one of the applied science disciplines, such as machine learning, natural language processing, computer vision, Deep learning - You are able to use reasonable assumptions, data, and customer requirements to solve problems. - You initiate the design, development, execution, and implementation of smaller components with input and guidance from team members. - You work with SDEs to deliver solutions into production to benefit customers or an area of the business. - You assume responsibility for the code in your components. You write secure, stable, testable, maintainable code with minimal defects. - You understand basic data structures, algorithms, model evaluation techniques, performance, and optimality tradeoffs. - You follow engineering and scientific method best practices. You get your designs, models, and code reviewed. You test your code and models thoroughly - You participate in team design, scoping and prioritization discussions. You are able to map a business goal to a scientific problem and map business metrics to technical metrics. - You invent, refine and develop your solutions to ensure they are meeting customer needs and team goals. You keep current with research trends in your area of expertise and scrutinize your results. A day in the life You will be working with a group of talented scientists on researching algorithm and running experiments to test solutions to improve our experience. This will involve collaboration with partner teams including engineering, PMs, data annotators, and other scientists to discuss data quality, model development and productionizing the same. You will mentor other scientists, review and guide their work, help develop roadmaps for the team. We are open to hiring candidates to work out of one of the following locations: Bangalore, IND | Bangalore, KA, IND
IN, KA, Bengaluru
Do you want to join an innovative team of scientists who use machine learning and statistical techniques to create state-of-the-art solutions for providing better value to Amazon’s customers? Do you want to build and deploy advanced algorithmic systems that help optimize millions of transactions every day? Are you excited by the prospect of analyzing and modeling terabytes of data to solve real world problems? Do you like to own end-to-end business problems/metrics and directly impact the profitability of the company? Do you like to innovate and simplify? If yes, then you may be a great fit to join the Machine Learning and Data Sciences team for India Consumer Businesses. If you have an entrepreneurial spirit, know how to deliver, love to work with data, are deeply technical, highly innovative and long for the opportunity to build solutions to challenging problems that directly impact the company's bottom-line, we want to talk to you. Major responsibilities - Use machine learning and analytical techniques to create scalable solutions for business problems - Analyze and extract relevant information from large amounts of Amazon’s historical business data to help automate and optimize key processes - Design, development, evaluate and deploy innovative and highly scalable models for predictive learning - Research and implement novel machine learning and statistical approaches - Work closely with software engineering teams to drive real-time model implementations and new feature creations - Work closely with business owners and operations staff to optimize various business operations - Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation - Mentor other scientists and engineers in the use of ML techniques We are open to hiring candidates to work out of one of the following locations: Bengaluru, KA, IND
IN, KA, Bengaluru
How to use the world’s richest collection of e-commerce data to improve payments experience for our customers? Amazon Payments Global Data Science team seeks a Senior Data Scientist for building analytical and scientific solutions that will address increasingly complex business questions in the Gift-Cards space. Amazon.com has a culture of data-driven decision-making and demands intelligence that is timely, accurate, and actionable. This team operates at WW level and provides a fast-paced environment where every day brings new challenges and opportunities. As a Senior Data Scientist in this team, you will be driving the Data Science/ML roadmap for business continuity & growth. You will develop statistical and machine learning models to solve for complex business problems in Gift-Cards space, design and run global experiments, and find new ways to optimize the customer experience. You will need to collaborate effectively with internal stakeholders, cross-functional teams to solve problems, create operational efficiencies, and deliver successfully against high organizational standards. You will explore GenAI use-cases within Gift-Cards space and also work on cross-disciplinary efforts with other scientists within Amazon. Key job responsibilities - You should be detail-oriented and must have an aptitude for solving unstructured and ambiguous problems. You should work in a self-directed environment, own tasks and drive them to completion - You should be passionate about working with huge data sets and be someone who loves to bring datasets together to answer business questions - You should demonstrate thorough technical expertise on feature engineering of massive datasets, exploratory data analysis, and model building using state-of-art ML algorithms - Random Forest, Gradient Boosting, SVM, Neural Nets, DL, Reinforcement Learning etc. You should be aware of automating feedback loops for algorithms in production - You should work closely with internal stakeholders like the business teams, engineering teams and partner teams and align them with respect to your focus areas - You should have excellent business and communication skills to be able to work with business owners to develop and define key business questions and build mechanisms that answer those questions We are open to hiring candidates to work out of one of the following locations: Bengaluru, KA, IND
US, NY, New York
The Automated Reasoning Group in AWS Platform is looking for an Applied Scientist with experience in building scalable solver solutions that delight customers. You will be part of a world-class team building the next generation of automated reasoning tools and services. AWS has the most services and more features within those services, than any other cloud provider–from infrastructure technologies like compute, storage, and databases–to emerging technologies, such as machine learning and artificial intelligence, data lakes and analytics, and Internet of Things. You will apply your knowledge to propose solutions, create software prototypes, and move prototypes into production systems using modern software development tools and methodologies. In addition, you will support and scale your solutions to meet the ever-growing demand of customer use. You will use your strong verbal and written communication skills, are self-driven and own the delivery of high quality results in a fast-paced environment. Each day, hundreds of thousands of developers make billions of transactions worldwide on AWS. They harness the power of the cloud to enable innovative applications, websites, and businesses. Using automated reasoning technology and mathematical proofs, AWS allows customers to answer questions about security, availability, durability, and functional correctness. We call this provable security, absolute assurance in security of the cloud and in the cloud. See https://aws.amazon.com/security/provable-security/ As an Applied Scientist in AWS Platform, you will play a pivotal role in shaping the definition, vision, design, roadmap and development of product features from beginning to end. You will: - Define and implement new solver applications that are scalable and efficient approaches to difficult problems - Apply software engineering best practices to ensure a high standard of quality for all team deliverables - Work in an agile, startup-like development environment, where you are always working on the most important stuff - Deliver high-quality scientific artifacts - Work with the team to define new interfaces that lower the barrier of adoption for automated reasoning solvers - Work with the team to help drive business decisions The AWS Platform is the glue that holds the AWS ecosystem together. From identity features such as access management and sign on, cryptography, console, builder & developer tools, to projects like automating all of our contractual billing systems, AWS Platform is always innovating with the customer in mind. The AWS Platform team sustains over 750 million transactions per second. Learn and Be Curious. We have a formal mentor search application that lets you find a mentor that works best for you based on location, job family, job level etc. Your manager can also help you find a mentor or two, because two is better than one. In addition to formal mentors, we work and train together so that we are always learning from one another, and we celebrate and support the career progression of our team members. Inclusion and Diversity. Our team is diverse! We drive towards an inclusive culture and work environment. We are intentional about attracting, developing, and retaining amazing talent from diverse backgrounds. Team members are active in Amazon’s 10+ affinity groups, sometimes known as employee resource groups, which bring employees together across businesses and locations around the world. These range from groups such as the Black Employee Network, Latinos at Amazon, Indigenous at Amazon, Families at Amazon, Amazon Women and Engineering, LGBTQ+, Warriors at Amazon (Military), Amazon People With Disabilities, and more. Key job responsibilities Work closely with internal and external users on defining and extending application domains. Tune solver performance for application-specific demands. Identify new opportunities for solver deployment. About the team Solver science is a talented team of scientists from around the world. Expertise areas include solver theory, performance, implementation, and applications. Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Hybrid Work We value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our U.S. Amazon offices. We are open to hiring candidates to work out of one of the following locations: New York, NY, USA
US, WA, Bellevue
Amazon’s Automated Inventory Management (AIM) Planning Organization is looking for a Data Scientist to help invent the next generation of Amazon's Capacity and Constraint Management system - Automated Planning System (APS). APS will herald a a new era in Sales and Operations Planning (S&OP). APS emerges as a next-generation decision-making framework for Amazon's Worldwide (WW) fulfillment networks. In an industry first, APS seamlessly aligns Amazon's business controls by uniting cutting-edge supply and demand forecasts with a state-of-the-art coordination framework – respecting the distributed ownership of business logic and outcomes. As the centralized planning system, APS takes charge of coordinating all fulfillment, inventory, and operational decisions, maximizing WW Long Term Free Cash Flow (LTFCF) over a 1-year horizon The AIM team is part of the Supply Chain Optimization Technology (SCOT) Team within the Operations Organization. The charter of the SCOT team is to maximize Amazon’s return on our inventory investment in terms of Free Cash Flow and customer satisfaction. The planning organization within Amazon leads the S&OP, IPE and Capacity Planning functions. As a Data Scientist on the this team, you will build a deep understanding of Amazon's supply chain systems, lead innovation in our forecasting capabilities and build principled solutions to identify improvement opportunities in our supply chain using the latest machine learning techniques. You will also work with a team of Product Managers, Business Intelligence Engineers and Software Engineers to research and build accurate predictive models and deploy automated software solutions to provide insights to business leaders at the most senior levels throughout the company. You will build models that make our data more actionable and help us make complex business decisions at scale. To help describe some of our challenges, we created a short video about Supply Chain Optimization at Amazon - http://bit.ly/amazon-scot Key job responsibilities - Implement statistical and machine learning methods to solve complex business problems - Research new ways to improve predictive and explanatory models - Directly contribute to the design and development of automated prediction systems and ML infrastructure - Build models that can detect supply chain defects and explain variance to the optimal state - Collaborate with other researchers, software developers, and business leaders to define the scientific roadmap for this team We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA
US, WA, Seattle
Do you want to join an innovative team of scientists who use machine learning to help Amazon provide the best experience to our Selling Partners by automatically understanding and addressing their challenges, needs and opportunities? Do you want to build advanced algorithmic systems that are powered by state-of-art ML, such as Natural Language Processing, Large Language Models, Deep Learning, Computer Vision and Causal Modeling, to seamlessly engage with Sellers? Are you excited by the prospect of analyzing and modeling terabytes of data and creating cutting edge algorithms to solve real world problems? Do you like to build end-to-end business solutions and directly impact the profitability of the company and experience of our customers? Do you like to innovate and simplify? If yes, then you may be a great fit to join the Selling Partner Experience Science team. Key job responsibilities - Use statistical and machine learning techniques to create the next generation of the tools that empower Amazon's Selling Partners to succeed. - Design, develop and deploy highly innovative models to interact with Sellers and delight them with solutions. - Work closely with teams of scientists and software engineers to drive real-time model implementations and deliver novel and highly impactful features. - Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation. - Research and implement novel machine learning and statistical approaches. - Participate in strategic initiatives to employ the most recent advances in ML in a fast-paced, experimental environment. About the team Selling Partner Experience Science is a growing team of scientists, engineers and product leaders engaged in the research and development of the next generation of ML-driven technology to empower Amazon's Selling Partners to succeed. We draw from many science domains, from Natural Language Processing to Computer Vision to Optimization to Economics, to create solutions that seamlessly and automatically engage with Sellers, solve their problems, and help them grow. Focused on collaboration, innovation and strategic impact, we work closely with other science and technology teams, product and operations organizations, and with senior leadership, to transform the Selling Partner experience. We are open to hiring candidates to work out of one of the following locations: Denver, CO, USA | Seattle, WA, USA
US, WA, Seattle
Amazon is investing heavily in building a world class advertising business and developing a collection of self-service performance advertising products that drive discovery and sales. Our products are strategically important to our Retail and Marketplace businesses for driving long-term growth. We deliver billions of ad impressions and millions of clicks daily and are breaking fresh ground to create world-class products. We are highly motivated, collaborative and fun-loving with an entrepreneurial spirit and bias for action. With a broad mandate to experiment and innovate, we are growing at an unprecedented rate with a seemingly endless range of new opportunities. Key job responsibilities Search Supply and Experiences, within Sponsored Products, is seeking a Senior Data Scientist to join a fast growing team with the mandate of creating new ads experience that elevates the shopping experience for our hundreds of millions customers worldwide. We are looking for a top analytical mind capable of understanding our complex ecosystem of advertisers participating in a pay-per-click model– and leveraging this knowledge to help turn the flywheel of the business. As a Senior Data Scientist on this team you will: - Lead Data Science solutions from beginning to end. - Deliver with independence on challenging large-scale problems with ambiguity. - Manage and drive the technical and analytical aspects of Advertiser segmentation; continually advance approach and methods. - Write code (Python, R, Scala, etc.) to analyze data and build statistical models to solve specific business problems - Retrieve, synthesize, and present critical data in a format that is immediately useful to answering specific questions or improving system performance. - Analyze historical data to identify trends and support decision making. - Improve upon existing methodologies by developing new data sources, testing model enhancements, and fine-tuning model parameters. - Provide requirements to develop analytic capabilities, platforms, and pipelines. - Apply statistical and machine learning knowledge to specific business problems and data. - Formalize assumptions about how our systems should work, create statistical definitions of outliers, and develop methods to systematically identify outliers. Work out why such examples are outliers and define if any actions needed. - Given anecdotes about anomalies or generate automatic scripts to define anomalies, deep dive to explain why they happen, and identify fixes. - Build decision-making models and propose solution for the business problem you defined - Conduct written and verbal presentation to share insights and recommendations to audiences of varying levels of technical sophistication. - Write code (python or another object-oriented language) for data analyzing and modeling algorithms. A day in the life The Senior Data Scientist will have the opportunity to use one of the world's largest eCommerce and advertising data sets to influence the evolution of our products. This role requires an individual with excellent business, communication, and technical skills, enabling collaboration with various functions, including product managers, software engineers, economists and data scientists, as well as senior leadership. This role will create and enhance performance monitoring reports to find insights that product and business team should focus on. The successful candidate will be a self-starter comfortable with ambiguity, with strong attention to detail, and with an ability to work in a fast-paced, high-energy and ever-changing environment. The drive and capability to shape the direction is a must. This role will influence the direction of the business by leveraging our data to deliver insights that drive decisions and actions. The role will involve translating broad business problems into specific analytics projects, conducting deep quantitative analyses, and communicating results effectively. The role will help the organization identify, evaluate, and evangelize new techniques and tools to continue to improve our ability to deliver value to Amazon’s customers. About the team We are a customer-obsessed team of engineers, technologists, product leaders, and scientists. We are focused on continuous exploration of contexts and creatives where advertising delivers value to customers and advertisers. We specifically work on new ads experiences globally with the goal of helping shoppers make the most informed purchase decision. We obsess about our customers and we are continuously innovating on their behalf to enrich their shopping experience on Amazon We are open to hiring candidates to work out of one of the following locations: Seattle, WA, USA