Prem Natarajan, Alexa AI vice president of natural understanding, giving a presentation
Credit: Micron Technology, Inc.

3 questions: Prem Natarajan on issues of AI fairness and bias

Alexa AI vice president of natural understanding Prem Natarajan discusses the upcoming cycle for the National Science Foundation collaboration on fairness in AI, his participation on the Partnership on AI board, and issues related to bias in natural language processing.

A year ago, Amazon and the National Science Foundation (NSF) announced a $20 million collaboration to fund academic research on fairness in AI over a three-year period. Recently, Erwin Gianchandani, deputy assistant director for Computer and Information Science and Engineering at NSF, discussed the work of the first ten recipients of the program’s grants. Here, Prem Natarajan, Alexa AI vice president of natural understanding and the Amazon executive who helped launch the collaboration with NSF, discusses the next cycle of proposals from academic researchers, his work with the Partnership on AI, and what can be done to address bias in natural language processing models.

The 2020 award cycle for the Fairness in AI program in conjunction with the NSF recently launched. Full proposals are due by July 13th. What are you hoping to see in the next round of proposals?

We collaborated with the NSF to launch the Fairness in AI program with the goal of promoting academic research in this important aspect of AI. Our primary objective for engaging with academia on issues related to fairness and transparency in AI is to get many different and diverse perspectives focused on the challenge. The teams selected by NSF in the first round are addressing a variety of topics – from principled frameworks for developing and certifying fair AI, to domain-focused applications such as fair recommender systems for foster care services. In keeping with that objective, I hope that the second round will build upon the success of the first round by bringing an even greater diversity of perspectives on definitions and perceptions of fairness. Without such diversity, the entire field of research into fair AI will become a self-defeating exercise.

Another hope I have for the second round, and indeed for all rounds of this program, is that it will drive the creation of a portfolio of open-source artifacts – such as data sets, metrics, tools, and testing methodologies – which all stakeholders in AI can use to promote the use of fair AI. Such readily available artifacts will make it easier for the community to learn from one another, promote the replication of research results, and, ultimately, advance the state of the art more rapidly. Put differently, we hope that open access to the research under this program will form a rising tide that lifts all boats. It also seems natural that methodologies for fairness will benefit from broad and inclusive discussion across relevant academic and scientific communities.

The deadline for this next round of proposal submissions is July 13th. We hope that the response to this round will be even stronger than for the first. NSF selects the recipients, and I am sure NSF’s reviewers are looking forward to a summer of interesting reading!

You are Amazon’s representative on the Partnership on AI (PAI) board of directors. This unique organization has thematic pillars related to safety-critical AI; fair, transparent and accountable AI; AI labor and the economy; collaborations between AI systems and people; social and societal influences of AI; and AI and social good. It’s an ambitious, broad agenda. You’re fairly new in your role with PAI; what most excites you about the work being done there?

The most exciting aspect of the Partnership on AI is that it is a unique multi-sector forum where I get to listen to and learn from an incredible diversity of perspectives – from industry, academia, non-profits, and social justice groups. PAI today counts amongst its members about 59 non-profits, 24 academic institutions, and 18 industrial organizations. While I joined the board just a few months ago, I have already attended several meetings and participated in discussions with other PAI members as well as PAI staff. While every member has their own unique perspective on AI, it’s been really interesting and encouraging to see that we all share the same values and many of the same concerns. It should come as no surprise that equity is top of mind, with a concomitant focus on fairness considerations.

Alexa & Friends Twitch show features Prem Natarajan

Earlier this month, Alexa evangelist Jeff Blankenburg interviewed Prem Natarajan live on the 'Alexa & Friends' Twitch show. In the video, they discuss recent advances in natural understanding and how those advances translate into better experiences for customers, developers, and third-party device manufacturers.

From a technical perspective, I am excited by the number and quality of research initiatives underway at PAI. Many of these initiatives are of critical importance to the future development of the field of AI. Let me give you a couple of examples.

One is the area of fairness, accountability and transparency. There are several projects underway in this area, but I will mention one that to me exemplifies the kind of work that an organization like PAI can do. PAI researchers interviewed practitioners at twenty different organizations and performed an in-depth case study of how explainable AI is used today. This kind of research is very important to AI practitioners because it gives them a referential basis to assess their own work and to identify useful areas for future contributions.

Another example is ABOUT ML, which is focused on developing and sharing best practices as well as on advancing public understanding of AI. A couple of years ago, some researchers proposed the development of an AI model scorecard, along the lines of the nutritional information you get on the back of most food items we buy today. The scorecard would describe the attributes of the data used to train the models, the way in which the models were tested, etc. The motivation behind the scorecard is to give other developers or model builders a sense of the strengths and limitations of the model, so they can better estimate and address potential weaknesses in the model for their target use cases. ABOUT ML goes well beyond such a scorecard, focusing on documentation, provenance of data and code artifacts, and other critical attributes of the model development process. Ultimately, only multisector organizations like PAI can successfully drive this kind of initiative, bringing together people across organizations and sectors.

Lastly, PAI serves an educational role that I believe is unique: it acts as a bridge between AI technologists and other stakeholders within society, making sure AI technologists are appropriately factoring in the perspectives and concerns of those stakeholders. One example is PAI’s collaborative work with First Draft, a PAI Partner, to help technologists and journalists at digital platforms address growing issues around manipulated media. PAI also helps those stakeholders understand more about how AI technology works, its strengths and its limitations.

You oversee Alexa’s natural understanding team. Natural language processing models have drawn criticism for capturing common social biases with respect to gender and race. A large body of work is emerging related to bias in word embedding and classifiers, and there are many proposals for countermeasures. Can you describe the challenge of bias in NLP models, and give us insight into some of the countermeasures you think are, or could be, effective?

A word embedding is a vector of real numbers representing a word; the core idea is that words with similar meanings map to vectors that are “close” to each other. Word embeddings have become a central feature of modern NLP. While embeddings can be computed using a variety of different techniques, deep learning techniques have proven to be tremendously effective at numerically representing the semantics of words and concepts. Today, deep-learning-based embeddings are used for all kinds of processing, from named entity recognition to question answering and natural language generation. As a result, the semantics that these embeddings encode greatly influence how we interpret text, the accuracy of those interpretations, and the actions we take in response to those interpretations.
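To make the notion of “closeness” concrete, here is a minimal sketch that compares vectors by cosine similarity, one common choice of measure (the interview does not say which measure is meant). The three-dimensional vectors and the words chosen are invented purely for illustration; real embeddings are learned from data and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, hand-written for this example only.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

sim_cat_kitten = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
sim_cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

print(f"cat vs kitten: {sim_cat_kitten:.3f}")
print(f"cat vs car:    {sim_cat_car:.3f}")
```

In this toy setup the pair with related meanings scores higher than the unrelated pair, which is the property a trained embedding is meant to have.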

As word embeddings became prevalent, researchers naturally started looking into their fragilities and shortcomings. One of those fragilities is that the embeddings derive and encode meaning from context, which means that the meaning of a word is largely controlled by the different contexts in which that word is observed in the training data. While that seems like a reasonable basis for inferring meaning, it leads to undesirable consequences. My friend Kai-Wei Chang at UCLA is one of the early investigators of bias in NLP, and he uses the following example: take the vector for ‘doctor’ and subtract the vector for ‘man’; when you then add the vector for ‘woman,’ you should in principle get the vector for ‘doctor’ again, or for a female doctor. But instead the resulting vector is close to the vector for ‘nurse.’ What this example shows is that the latent biases in human-generated text get encoded into the embeddings. One example of a system that is affected by these biases is natural language generation. Many studies have shown that such biases can result in the generation of text that exhibits the same biases and prejudices as humans, sometimes in an amplified manner. Left unmitigated, such systems could reinforce human biases and stereotypes.
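The vector arithmetic in that example can be sketched in a few lines. The vectors below are hypothetical and were hand-constructed so that ‘doctor’ leans toward the ‘man’ end of the first axis; they are not taken from any trained model and only reproduce the shape of the effect described above.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical toy embeddings: axis 0 ~ gender, axis 1 ~ medicine, axis 2 ~ other.
# "doctor" and "nurse" are deliberately skewed along axis 0 to mimic biased training text.
toy_embeddings = {
    "man": [1.0, 0.0, 0.0],
    "woman": [-1.0, 0.0, 0.0],
    "doctor": [0.5, 1.0, 0.0],
    "nurse": [-0.5, 1.0, 0.0],
    "hospital": [0.0, 1.0, 0.3],
    "teacher": [0.0, 0.1, 1.0],
}

def analogy(a, b, c, embeddings):
    """Return the word nearest to vec(a) - vec(b) + vec(c), skipping the query words."""
    target = [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = (w for w in embeddings if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine_similarity(embeddings[w], target))

nearest = analogy("doctor", "man", "woman", toy_embeddings)
print(nearest)
```

Because the skew was built into the toy vectors, the nearest remaining word is ‘nurse’; with trained embeddings, the skew comes from the contexts in which the words appear in the training text.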

Bias can also manifest in other ways because any system that is based on data can exhibit a majoritarian bias. So, for example, different groups in different parts of the world may speak the same language in different dialects, but the most frequent dialect will likely see the best performance only because it makes up the largest proportion of the training data. But we don’t want dialect or accent to determine how well the system will work for an individual. We want our systems to work equally well for everyone, regardless of geography, dialect, gender, or any other irrelevant factor.
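One simple way to surface this kind of majoritarian effect is to report accuracy per group rather than a single overall number. The sketch below assumes evaluation results are available as (group, is_correct) pairs; the group names and counts are made up for illustration and do not describe any real system.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: fraction correct} for an iterable of (group, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        if is_correct:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}

# Hypothetical results: 8 utterances in a majority dialect, 4 in a minority dialect.
toy_records = (
    [("dialect_a", True)] * 7 + [("dialect_a", False)] * 1
    + [("dialect_b", True)] * 2 + [("dialect_b", False)] * 2
)

per_group = accuracy_by_group(toy_records)
overall = sum(ok for _, ok in toy_records) / len(toy_records)
gap = max(per_group.values()) - min(per_group.values())

print(f"overall accuracy: {overall:.3f}")
for group, acc in sorted(per_group.items()):
    print(f"{group}: {acc:.3f}")
print(f"largest gap between groups: {gap:.3f}")
```

In this toy data the overall figure looks acceptable while the minority dialect fares much worse, which is the pattern an aggregate metric can hide.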

Methodologically, we counter the impact of bias by using a principled approach to characterize the dimensions of bias and associated impact, and by developing techniques that are robust to these biasing factors. For example, it stands to reason that speech recognition systems should ignore parts of the signal that are not useful for recognizing the words that were spoken. It shouldn’t really matter whether the voice is male or female; only the actual words should matter. Similarly, for natural language understanding, we want to be able to understand the queries of different groups of people regardless of the stylistic or syntactic variations of the language used. Scientists at Amazon and elsewhere are exploring a broad variety of approaches such as de-biasing techniques, adversarial invariance, active learning, and selective sampling. Personally, I find the adversarial approaches, both to testing and to generating bias- or nuisance-invariant representations, most appealing because of their scalability, but in the next few years, we will all find out what works best for different problems!
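As one concrete instance of a de-biasing technique, the sketch below removes the component of a word vector that lies along an estimated bias direction, a projection-based approach described in the research literature on word-embedding bias. It is not presented as the method used at Amazon, it is much simpler than the adversarial approaches mentioned above, and the vectors are hypothetical toy values.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def remove_direction(vector, direction):
    """Subtract from `vector` its projection onto `direction`."""
    scale = dot(vector, direction) / dot(direction, direction)
    return [x - scale * d for x, d in zip(vector, direction)]

# Hypothetical toy embeddings: axis 0 ~ gender, axis 1 ~ medicine, axis 2 ~ other.
man = [1.0, 0.0, 0.0]
woman = [-1.0, 0.0, 0.0]
doctor = [0.5, 1.0, 0.0]  # deliberately skewed toward "man" on axis 0

# Estimate the bias direction from a single definitional pair (an assumption;
# practical methods usually combine many such pairs).
gender_direction = [m - w for m, w in zip(man, woman)]

neutral_doctor = remove_direction(doctor, gender_direction)
print(neutral_doctor)
print(dot(neutral_doctor, gender_direction))
```

After the projection, the toy ‘doctor’ vector has no component along the estimated gender direction while its other coordinates are unchanged.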
