The University of Oxford insignia on a sign outside the Pitt Rivers Museum, which houses the university's anthropological and archaeological collections. Oxford Internet Institute academics Sandra Wachter, Brent Mittelstadt, and Chris Russell, now an Amazon senior applied scientist, “proposed a new test for ensuring fairness in algorithmic modelling and data driven decisions, called ‘Conditional Demographic Disparity’.”
georgeclerk/Getty Images

How a paper by three Oxford academics influenced AWS bias and explainability software

Why conditional demographic disparity matters for developers using SageMaker Clarify.

SageMaker Clarify helps detect statistical bias in data and machine learning models. It also helps explain why those models are making specific predictions. Doing so requires applying a collection of metrics that assess data for potential bias. One Clarify metric in particular, conditional demographic disparity (CDD), was inspired by research done at the Oxford Internet Institute (OII) at the University of Oxford.

The research paper's authors: Oxford Internet Institute academics Sandra Wachter, left, associate professor and senior research fellow in law and ethics; Brent Mittelstadt, middle, senior research fellow in data ethics; and Chris Russell, a group leader in Safe and Ethical AI at the Alan Turing Institute, and now an Amazon senior applied scientist.

In the paper “Why Fairness Cannot Be Automated: Bridging the gap between EU non-discrimination law and AI”, Sandra Wachter, associate professor and senior research fellow in law and ethics at OII; Brent Mittelstadt, senior research fellow in data ethics at OII; and Chris Russell, a group leader in Safe and Ethical AI at the Alan Turing Institute, and now an Amazon senior applied scientist, “proposed a new test for ensuring fairness in algorithmic modelling and data driven decisions, called ‘Conditional Demographic Disparity’.”

CDD is defined as “the weighted average of demographic disparities for each of the subgroups, with each subgroup disparity weighted in proportion to the number of observations it contains.”

“Demographic disparity asks: ‘Is the disadvantaged class a bigger proportion of the rejected outcomes than the proportion of accepted outcomes for the same class?’” explained Sanjiv Das, the William and Janice Terry professor of finance and data science at Santa Clara University's Leavey School of Business, and an Amazon Scholar.
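The two definitions quoted above can be written compactly. The notation below is an assumption chosen to mirror the wording of the quotes, not a formula reproduced from the paper: for a class $d$, $n^{(0)}$ and $n^{(1)}$ are the numbers of rejected and accepted outcomes, $n_d^{(0)}$ and $n_d^{(1)}$ are the rejected and accepted outcomes belonging to class $d$, and the data is split into subgroups $i$ containing $n_i$ observations each, with $n = \sum_i n_i$.

```latex
\[
  \mathrm{DD}_d \;=\; \frac{n_d^{(0)}}{n^{(0)}} \;-\; \frac{n_d^{(1)}}{n^{(1)}},
  \qquad
  \mathrm{CDD} \;=\; \frac{1}{n} \sum_i n_i \,\mathrm{DD}_i
\]
```

Read this way, a positive $\mathrm{DD}_d$ means class $d$ makes up a larger share of the rejected outcomes than of the accepted ones, and CDD averages the per-subgroup disparities $\mathrm{DD}_i$ with weights proportional to subgroup size.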

Das came across the paper during his review of relevant literature while working on the team that developed Clarify.

“I read the first few pages and the writing just sucked me in,” he said. “It's the only paper I can honestly say, out of all of those I read, that really was a delight to read. I just found it beautifully written.”

The idea for the paper was rooted in research the OII group had done previously.

“Before we did this paper, we were working primarily in the space of machine learning and explainable artificial intelligence,” Mittelstadt said. “We got interested in this question of: Imagine you want to explain how AI works or how an automated decision was actually made, how can you do that in a way that is ethically desirable, legally compliant, and technically feasible?”

In pursuing that question, the researchers discovered that some of the technical standards for fairness that developers were relying on did not reflect how legal and ethical institutions understand fairness. That gap between technical and legal or ethical standards of fairness meant developers might be unaware of normative bias in their models.

“Essentially, the question we asked was, ‘OK, how well does the technical work, which quite often drives the conversation, actually match up with the law and philosophy?’” Mittelstadt explained. “And we found that a lot of what's out there isn't necessarily going to be helpful for how fairness or how equality is operationalized. We found a fairly significant gap between the majority of the work that was out there on the technical side and how the law is actually applied.”

RAAIS 2020 - Sandra Wachter, Brent Mittelstadt and Chris Russell, University of Oxford

As a result, the OII team set about working on a way to bridge that gap.

“We tried to figure out, what's the legal notion of fairness in law, and does it have an equivalent in the tech community?” Wachter said. “And we found one where there's the greatest overlap between the two: conditional demographic disparity (CDD). There is a certain idea of fairness inside the law that says, ‘This is the ideal way, how things ought to be.’ And this way of measuring evidence, this way of deciding if something is unequal has a counterpart in computer science and that's CDD. So now we have a measure that is informed by the legal notion of fairness.”

Das said the paper helped him see the appeal immediately.

“I was able to see the value not because I had an epiphany, but because the paper brings it out really well,” he said. “In fact, it's my favorite metric in the product.”

Das said the OII paper is useful for a couple of reasons, including the ability to discover when something that appears to be bias might not actually be bias.

Sanjiv Das is the William and Janice Terry professor of finance and data science at Santa Clara University's Leavey School of Business, and an Amazon Scholar.

“It also allowed us to measure whether we were seeing a bias, but the bias was not truly a bias because we hadn't checked for something called Simpson's Paradox,” he said. “The paper actually deals with Simpson's Paradox.” The paradox describes cases where a trend that appears in aggregate data disappears, or even reverses, when that data is broken down into subgroups.

“This came up with Berkeley's college admissions in the 1970s,” Das explained. “There was a concern that the school was admitting more men than women and so its admission process might be biased. But when people took the data and looked at the admission rates by school — engineering versus law versus arts and sciences — they found a very strange thing: In almost every department, more women were being admitted than men. It turns out that the reason those two things are reconciled is that women were applying to departments that were harder to get into and had lower admission rates. And so, even though department by department more women got admitted, because they were applying more often to departments where fewer people got admitted, a fewer number of women overall ended up at the university.”

The approach outlined by the OII researchers accounts for that paradox by using summary statistics.

“Summary statistics essentially let you see how outcomes compare across different groups within the entire population of people that were affected by a system,” Mittelstadt explained. “We're shifting the conversation to what is the right feature or the right variable to condition on when you are measuring fairness.”
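A small numeric sketch may help show why the choice of conditioning variable matters. The counts below are hypothetical and are not the real Berkeley figures. The function names are illustrative and do not come from Clarify's API. The code computes demographic disparity for women over the whole applicant pool, then the conditional version, weighting each department's disparity by its number of applicants.

```python
from collections import namedtuple

# Hypothetical admissions counts (NOT the real Berkeley data).
Group = namedtuple("Group", "department gender applied admitted")

ROWS = [
    Group("A", "men", 800, 480),    # department A is easier to get into
    Group("A", "women", 200, 130),
    Group("B", "men", 200, 40),     # department B is harder to get into
    Group("B", "women", 800, 200),
]


def demographic_disparity(rows, facet="women"):
    """Facet's share of rejections minus its share of acceptances.

    Assumes the rows contain at least one rejection and one acceptance.
    """
    accepted = sum(r.admitted for r in rows)
    rejected = sum(r.applied - r.admitted for r in rows)
    facet_accepted = sum(r.admitted for r in rows if r.gender == facet)
    facet_rejected = sum(
        r.applied - r.admitted for r in rows if r.gender == facet
    )
    return facet_rejected / rejected - facet_accepted / accepted


def conditional_demographic_disparity(rows, facet="women"):
    """Average of per-department disparities, weighted by applicant count."""
    total = sum(r.applied for r in rows)
    weighted_sum = 0.0
    for dept in sorted({r.department for r in rows}):
        subgroup = [r for r in rows if r.department == dept]
        size = sum(r.applied for r in subgroup)
        weighted_sum += size * demographic_disparity(subgroup, facet)
    return weighted_sum / total


dd = demographic_disparity(ROWS)
cdd = conditional_demographic_disparity(ROWS)
print(f"Demographic disparity (all applicants):     {dd:+.3f}")
print(f"Conditional demographic disparity (by dept): {cdd:+.3f}")
```

With these made-up numbers the aggregate disparity is positive (about +0.194), which on its own would suggest women are disadvantaged, while the department-conditioned value is slightly negative (about -0.039). This is the pattern Das describes: women are admitted at a higher rate in each department but apply mostly to the department with the lower admission rate.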

The OII team members said they are thrilled to see their work implemented in Clarify and hope their paper proves useful for developers.

“There is an interest on the part of developers to test for bias as vigorously as possible,” Wachter said. “So, I’m hoping those who are actually developing and deploying the algorithms can easily implement our research in their daily practices. And it's extremely exciting to see that it’s actually useful for practical applications.”

“The Amazon implementation is exactly the sort of impact I was hoping to see,” Mittelstadt agreed. “You actually have to get a tool like this into the hands of people that will be working with AI systems and who are developing AI systems.”

For more information on how Clarify can help identify and limit bias, visit the Amazon SageMaker Clarify page.
