Can computational linguists find a home in the technology industry?
Alexa senior applied scientist provides career advice to graduate students considering a research role in industry.
Editor’s Note: Christos Christodoulopoulos is a senior applied scientist within the Alexa Knowledge team based in Cambridge, UK. His research focuses on knowledge extraction, knowledge graph question answering and fact verification. Christodoulopoulos joined Amazon in 2016 as a research scientist — his first non-academic position.
His background is in computational linguistics: the study of human language using computational methods. After earning his undergraduate degree in digital systems and technology education, Christodoulopoulos obtained his master’s degree in computational linguistics at the University of Edinburgh, with a thesis on computational models for linguistic phenomena like entailment and polarity.
His doctoral research focused on the underlying structure of syntactic categories across languages and how (or if) they relate to semantic primitives. During his post-doctoral work at the University of Illinois at Urbana-Champaign, Christodoulopoulos worked on computational models of child language acquisition (based on the Syntactic Bootstrapping hypothesis) and machine-learning models for extending semantic role labeling (SRL). In the article below, Christodoulopoulos, who has transitioned from more theoretical research on language to more applied research on knowledge extraction, shares his advice on how young researchers can transition to an industry research position.
A friend who teaches at Cornell recently asked me to share career advice for graduate students who are deciding whether they want to work in industry. He teaches natural language processing and computational linguistics. Some of his students come from a traditional (non-computational) linguistics background and wanted to know whether there are career paths for them within the technology industry. Having not had any industry experience before joining Amazon, I tried to think of advice I wish someone had given me when I first started. Here’s what I shared:
- Pursue more than one internship, if possible. Try different companies or research groups. Find projects that lie just beyond your current research — close enough to hit the ground running and finish within three to six months, but challenging enough that you learn something new.
- During your internship, talk to as many people as possible. Start with your interview (I decided to accept my current position after my conversation with two of my panel members), arrange 1:1s with other team members and leaders, and attend talks, seminars, reading groups, and other activities that provide a more multidisciplinary perspective.
- Consciously expand your research to other areas, or use tools other than the ones you rely on in your day-to-day research.
- When writing research papers, whether academic or industry, think about the implications of your work. What will the reader take away? Can they incorporate your findings into their work? ("Our system performs x% better than our competitors" is not a finding.) Will your paper be relevant in six months, two years, or even five years? At Amazon, we use a working backwards model where we start from a customer need and work our way back to the solution. This gives us confidence that the problem or end state is important, even if the solution changes.
- Review research papers for as many conferences as you can. Try to gain a sense of the quality, and breadth, of work in your area. Read other reviewers' comments. See what they spotted and what they missed (or chose not to mention). Be respectful in your comments, but don't shy away from pointing out issues that stand out. Be constructive in your criticism and try to offer counterexamples or suggestions for improvement. Try to highlight the positives of the work, focusing on what the community can learn from it. Always include an executive summary for the area chair (they will thank you).
- Don't confuse tools with ways of thinking about a problem. If I ask you how you would solve sentiment analysis, BERT isn't an answer. Think of the underlying reason why such a technique would work, and try to generalize it. A company will not hire you because you're an expert in a tool/technique — you need to show you can learn a new one when the first one goes out of style (or better yet, develop the new one).
- Be frugal with your resources. Do you need this amount of computation? This much data? How much effort would it take to transfer to other languages? What can the typological differences between languages tell us about the potential to generalize the model? This is academia's edge over industry.
- Try to collaborate with other researchers as you pursue your PhD. Learn how to share the workload, but also resources like code and data. Use this opportunity to develop best practices for version control, code commenting, lab notes, and unit testing.
- Before starting your PhD journey (or during the first year or so), decide if the academic model of research is for you. Getting a PhD is a long, arduous process (especially in the US) and can be very lonely even within a big research lab; the end state of your studies, after all, is to be the sole expert in your (admittedly tiny) research area. If the extreme focus on a tiny sub-area isn't your thing, that’s OK: you can usually convert the first couple of years of your PhD into a master’s. Most research positions require a PhD, although some companies will hire researchers with master’s degrees.
- Pursuing a PhD is a long process, but it provides the opportunity to demonstrate what research can be. As my advisor used to say, a PhD is just a "driver's license for research". In retrospect, this was when I had the most time to work on ideas that excited me and to discover as much about my field as I could. Even if your thesis is on a very narrow topic, make sure you get a chance to expand your research horizons by collaborating with other students on their projects, or simply during your literature review.
- Learn good administration practices. Look at how big companies organize their teams and programs (for example, Scrum and Kanban). Learn what makes a good meeting and adopt a meeting code of conduct (ask for an agenda, try to ensure everyone is heard, take notes and share them).
- Be a good teammate and, eventually, a good leader. Unfortunately, academics are never taught management skills (people or project), and not everyone is a natural team player or leader. Be aware of your unconscious biases, be self-critical, and earn trust. If you aren’t sure whether you should take management courses (I haven't), try to observe how management is done around you, and learn from what works and what doesn't. I have found that Amazon’s list of leadership principles makes for excellent day-to-day guidelines (even for non-managers like me).
- The big technology companies, and a lot of start-ups, are interested in non-computational linguists. The difference is whether the positions offered are research- and publications-oriented or more engineering- and analysis-focused. At Amazon, we have a number of roles, such as Language Engineer, Language Data Researcher, Data Linguist, and Data Associate, that consider linguists without a computational background as candidates (data handling and scripting skills are required though; see below). You can also meet some of the Amazonians in these positions by visiting the Alexa AI team page and clicking on Kat, Melanie, or Saumil.
- Coding in Python is vital, even for non-computational linguists. It's steadily replacing R as the default data analysis language, and it's very versatile: it can be used for everything from hacky scripts to production systems (and of course it's the language of deep nets). Take programming courses and try to participate in Kaggle competitions or other shared challenges in your area. Our recent FEVER challenge is a good example of a standalone competition that requires a big chunk of the standard NLP pipeline.
I hope you find this advice useful, and that your career journey is as challenging and rewarding as mine has been. As extra homework, I highly recommend reading Chris Manning’s excellent position paper “Computational Linguistics and Deep Learning” from the “Last Words” column of the journal Computational Linguistics. In his article in the same column, my PhD advisor Mark Steedman writes: “Human knowledge is expressed in language. So computational linguistics is very important.”