Description
Software developers and testers have long struggled with how to
elicit proactive responses from their coworkers when reviewing code
for security vulnerabilities and errors. For a code review to be
successful, it must not only identify potential problems but also
elicit an active response from the colleague responsible for
modifying the code. To understand the factors that contribute to
this outcome, we analyze a novel dataset of more than one million
code reviews for the Google Chromium
project, from which we extract linguistic features of feedback that
elicited responsive actions from coworkers. Using a
manually labeled subset of reviewer comments, we train a highly
accurate classifier to identify 'acted-upon' comments
(AUC = 0.85). Our results demonstrate the utility of our dataset,
the feasibility of using NLP for this new task, and the potential
of NLP to improve our understanding of how communications between
colleagues can be authored to elicit positive, proactive responses.
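To make the classification setup concrete, here is a minimal sketch of the kind of pipeline described above: a generic binary classifier trained on per-comment numeric linguistic features and evaluated with ROC AUC. This is illustrative only; the features and data below are synthetic placeholders, not the paper's actual features, model, or results.

```python
# Illustrative sketch: train a binary "acted-upon" classifier over numeric
# linguistic features and report ROC AUC. Features and labels are synthetic
# stand-ins, not the real dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
# Stand-ins for per-comment feature vectors (the paper uses nine linguistic features).
X = rng.normal(size=(n, 9))
# Stand-in labels: 1 = acted-upon, 0 = not (known-to-be) acted-upon.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```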
Datasets
There are two datasets available for download. In each one, we have
(to the best of our knowledge) de-identified all usernames and
email addresses of developers involved in the Chromium Project. The
datasets described below were exported from the PostgreSQL database
used in our research. Everything needed to re-collect and recreate
our database is available here.
A README.md file, which explains the structure of the
datasets, is included in the download below.
CONVERSATIONS: This is the full dataset containing over 1.5 million comments posted by developers reviewing proposed code changes. The dataset also includes the values we calculated for all nine linguistic features (described in Section 4 of the paper cited below).
ANNOTATIONS: This dataset is a subset of the CONVERSATIONS dataset. It contains the data used in the classification experiment outlined in Section 5 of the paper cited below (2,994 comments automatically identified as acted-upon and 800 comments manually identified as not (known-to-be) acted-upon).
Download: chromium_conversations.tar.gz | 270MB Compressed; 1GB Raw
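The README.md bundled in the download documents the actual file layout and column names. As a rough sketch of how one might inspect the archive and load a table with pandas, assuming the tables are exported as CSV, see below; the member name annotations.csv and the assumption of CSV are placeholders, so check the README for the real structure.

```python
# Sketch of inspecting the downloaded archive and loading one table.
# Assumes a CSV export; "annotations.csv" is a placeholder member name --
# consult the bundled README.md for the actual layout and column names.
import tarfile
import pandas as pd

archive = "chromium_conversations.tar.gz"

with tarfile.open(archive, "r:gz") as tar:
    print(tar.getnames())                          # list the files actually shipped
    member = tar.extractfile("annotations.csv")    # placeholder member name
    df = pd.read_csv(member)

print(df.shape)
print(df.head())
```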
Citation
We encourage you to use this dataset in your research. If you do, we ask that you please cite:
A Dataset for Identifying Actionable Feedback in Collaborative Software Development.
Proceedings of the 2018 Annual Meeting of the Association for Computational Linguistics (ACL).
License
Rietveld, the system that facilitates code review in Chromium, is
licensed under the Apache v2.0 license.
The datasets we are releasing are licensed under the Creative Commons Attribution-ShareAlike license (CC BY-SA).