Funding: This work was supported by the National Key Research and Development Program of China (2018YFB1004202).
Abstract: Pull-based development has become an important paradigm for distributed software development. In this model, each developer works independently on a copy of the central repository (i.e., a fork). It is essential for developers to maintain awareness of the state of other forks to improve collaboration efficiency. In this paper, we propose a method to automatically generate a summary of a fork. We first use the random forest method to generate the label of a fork, i.e., feature implementation or bug fix. Based on the information in the fork-related commits, we then use the TextRank algorithm to generate detailed activity information for the fork. Finally, we apply a set of rules to integrate all related information into a complete fork summary. To validate the effectiveness of our method, we conduct 30 groups of manual experiments and 77 groups of case studies on GitHub. We propose Fea_avg to evaluate the performance of the generated fork summary, considering content accuracy, content integrity, sentence fluency, and label extraction accuracy. The results show that the average Fea_avg of the fork summaries generated by this method is 0.672. More than 63% of project maintainers and contributors believe that the fork summary can improve development efficiency.
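The ranking and integration steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the word-overlap similarity (taken from the original TextRank formulation), the damping factor, and the one-line summary template are all assumptions made for illustration, and the fork label is simply passed in as if a separately trained random forest classifier had already produced it.

```python
import math
import re


def tokenize(text):
    """Lowercase a commit message and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Word-overlap similarity between two token sets (assumed measure)."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # skip very short messages to avoid a zero denominator
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank(messages, damping=0.85, iterations=50):
    """Score messages with weighted PageRank over their similarity graph."""
    tokens = [tokenize(m) for m in messages]
    n = len(messages)
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize_fork(label, messages, top_k=2):
    """Combine a given label with the top-ranked commit messages.

    The template "[label] msg; msg" is a hypothetical integration rule.
    """
    scores = textrank(messages)
    ranked = sorted(range(len(messages)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:top_k])  # keep the original commit order
    return f"[{label}] " + "; ".join(messages[i] for i in chosen)


commits = [
    "Fix null pointer crash in parser",
    "Fix parser crash on empty input",
    "Add regression test for parser crash",
    "Update README badge",
]
summary = summarize_fork("bug fix", commits)
print(summary)
```

In this sketch, a commit message that shares no words with the others receives only the baseline score, so it is unlikely to appear in the summary. A real pipeline would need richer preprocessing (stop-word removal, stemming) and the rule set described in the paper.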