Quantifying the Development Value of Code Contributions

Hezheng Yin

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2018-174
December 14, 2018

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-174.pdf

Counting the amount of source code that a developer contributes to a project does not reflect the value of those contributions. Quantifying the value of code contributions, rather than only their amount, is useful to instructors grading students in massive online courses, managers reviewing employees' performance, developers collaborating on open source projects, and researchers measuring development activity. In this paper, we define the concept of development value and design a framework to quantify the development value of code contributions. The framework consists of structural analysis and non-structural analysis. In structural analysis, we parse the code structure and construct a new PageRank-type algorithm; in non-structural analysis, we classify the impact of code changes and use the natural-language artifacts in repositories to train machine learning models to automate the process. An empirical study in a software engineering course with 10 group projects, a survey of 35 open source developers with 772 responses, and a large-scale analysis of 250k commit messages demonstrate the effectiveness of our solution.
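
To make the structural-analysis idea concrete, the following is a minimal sketch of a standard PageRank computed over a function call graph, with the resulting scores summed per author. It is not the algorithm from the report, which describes its own PageRank-type variant; the graph, the function names, the authorship map, and the helper names pagerank and developer_scores are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a directed graph.

    edges: iterable of (caller, callee) pairs. A caller passes part of
    its score to each function it calls. This is textbook PageRank,
    not the variant defined in the report.
    """
    edges = list(edges)
    nodes = sorted({name for edge in edges for name in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by functions that call nothing is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def developer_scores(rank, authorship):
    """Sum function scores per author (authorship: function -> author)."""
    scores = defaultdict(float)
    for function, author in authorship.items():
        scores[author] += rank.get(function, 0.0)
    return dict(scores)


# Hypothetical call graph: main calls parse and render; both call escape.
call_graph = [
    ("main", "parse"),
    ("main", "render"),
    ("parse", "escape"),
    ("render", "escape"),
]
# Hypothetical authorship of each function.
authors = {"main": "alice", "parse": "alice", "render": "bob", "escape": "bob"}

ranks = pagerank(call_graph)
scores = developer_scores(ranks, authors)
for name in sorted(scores):
    print(name, round(scores[name], 3))
```

In this sketch a function that many others depend on (here, escape) accumulates a higher score than an entry point that nothing calls, which is one way a graph-based measure can differ from a simple line count.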

Advisor: Armando Fox


BibTeX citation:

@mastersthesis{Yin:EECS-2018-174,
    Author = {Yin, Hezheng},
    Title = {Quantifying the Development Value of Code Contributions},
    School = {EECS Department, University of California, Berkeley},
    Year = {2018},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-174.html},
    Number = {UCB/EECS-2018-174},
    Abstract = {Counting the amount of source code that a developer contributes to a project does not reflect the value of those contributions. Quantifying the value of code contributions, rather than only their amount, is useful to instructors grading students in massive online courses, managers reviewing employees' performance, developers collaborating on open source projects, and researchers measuring development activity. In this paper, we define the concept of development value and design a framework to quantify the development value of code contributions. The framework consists of structural analysis and non-structural analysis. In structural analysis, we parse the code structure and construct a new PageRank-type algorithm; in non-structural analysis, we classify the impact of code changes and use the natural-language artifacts in repositories to train machine learning models to automate the process.
An empirical study in a software engineering course with 10 group projects, a survey of 35 open source developers with 772 responses, and a large-scale analysis of 250k commit messages demonstrate the effectiveness of our solution.}
}

EndNote citation:

%0 Thesis
%A Yin, Hezheng
%T Quantifying the Development Value of Code Contributions
%I EECS Department, University of California, Berkeley
%D 2018
%8 December 14
%@ UCB/EECS-2018-174
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-174.html
%F Yin:EECS-2018-174