Static code analyzers are now a common part of the code review process. These automated tools integrate into the code review process by commenting on code changes and suggesting improvements, in the same way as human reviewers. The comments made by static analyzers often trigger a conversation between developers to align on if and how the issue should be fixed. Because developers rarely give feedback directly to the tool, understanding the sentiment and intent in the conversation triggered by the tool comments can be used to measure the usefulness of the static analyzer.
In this paper, we report on an experiment where we use large language models to automatically label and categorize the sentiment and intent of such conversations triggered by static analyzer comments. Our experiment demonstrates that LLMs not only classify and interpret complex developer-analyzer conversations, but can be more accurate than human experts.
Understanding developer-analyzer interactions in code reviews
2024
Research areas