Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects


Sangeeta Lal, Neetu Sardana, Ashish Sureka
ISBN13: 9781799824602 | ISBN10: 1799824608 | EISBN13: 9781799824619
DOI: 10.4018/978-1-7998-2460-2.ch039

MLA

Lal, Sangeeta, et al. "Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects." Cognitive Analytics: Concepts, Methodologies, Tools, and Applications, edited by Information Resources Management Association, IGI Global, 2020, pp. 740-772. https://doi.org/10.4018/978-1-7998-2460-2.ch039

APA

Lal, S., Sardana, N., & Sureka, A. (2020). Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects. In Information Resources Management Association (Ed.), Cognitive Analytics: Concepts, Methodologies, Tools, and Applications (pp. 740-772). IGI Global. https://doi.org/10.4018/978-1-7998-2460-2.ch039

Chicago

Lal, Sangeeta, Neetu Sardana, and Ashish Sureka. "Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects." In Cognitive Analytics: Concepts, Methodologies, Tools, and Applications, edited by Information Resources Management Association, 740-772. Hershey, PA: IGI Global, 2020. https://doi.org/10.4018/978-1-7998-2460-2.ch039


Abstract

Deciding what to log is an important yet difficult decision for open-source software (OSS) developers. Machine-learning models can support several steps of OSS development, including logging, and several recent studies propose models that predict logged code constructs. The prediction performance of these models is limited by the class-imbalance problem: the number of logged code constructs is small compared with the number of non-logged code constructs. No previous study analyzes the class-imbalance problem for logged code construct prediction. First, the authors analyze the performance of the J48, RF, and SVM classifiers in predicting logged catch-blocks and if-blocks on imbalanced datasets. Second, they propose LogIm, an ensemble and threshold-based machine-learning model. Third, they evaluate LogIm on three open-source projects. On average, LogIm improves the performance of the baseline classifiers J48, RF, and SVM by 7.38%, 9.24%, and 4.6%, respectively, for catch-block logging prediction, and by 12.11%, 14.95%, and 19.13%, respectively, for if-block logging prediction.
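The abstract describes LogIm only as "ensemble and threshold-based" and gives no further detail, so the following is not the authors' method. It is a minimal Python sketch of the general idea, under stated assumptions: the scores of several base classifiers are averaged, and the decision threshold is then chosen on held-out data instead of being fixed at 0.5. The function names, the toy scores, the simple averaging rule, and the use of F1 as the selection criterion are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive (logged) class; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_scores(per_model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Ensemble step (assumed): average the positive-class scores of the base models."""
    n_models = len(per_model_scores)
    return [sum(column) / n_models for column in zip(*per_model_scores)]


def pick_threshold(
    y_true: Sequence[int], scores: Sequence[float], candidates: Sequence[float]
) -> Tuple[float, float]:
    """Threshold step (assumed): keep the first candidate with the highest F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical validation set: 2 logged constructs (1) among 10 code constructs.
y_true = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]

# Hypothetical positive-class scores from three base classifiers. Trained on
# imbalanced data, they rarely score the minority class above 0.5.
model_a = [0.45, 0.10, 0.20, 0.05, 0.30, 0.15, 0.40, 0.10, 0.25, 0.05]
model_b = [0.40, 0.20, 0.10, 0.15, 0.20, 0.05, 0.35, 0.20, 0.15, 0.10]
model_c = [0.35, 0.15, 0.30, 0.10, 0.25, 0.10, 0.45, 0.15, 0.20, 0.15]

ensemble_scores = average_scores([model_a, model_b, model_c])

# Baseline: the conventional 0.5 cut-off applied to the averaged scores.
default_preds = [1 if s >= 0.5 else 0 for s in ensemble_scores]
default_f1 = f1_score(y_true, default_preds)

# Tuned: search a grid of candidate thresholds (0.05, 0.10, ..., 0.95).
candidates = [i / 20 for i in range(1, 20)]
best_t, best_f1 = pick_threshold(y_true, ensemble_scores, candidates)

print(f"F1 at threshold 0.5: {default_f1:.2f}")
print(f"best threshold: {best_t:.2f}, F1: {best_f1:.2f}")
```

On this toy data, the 0.5 cut-off predicts no logged constructs at all, while a lower threshold recovers both. In practice the threshold should be tuned on a validation split kept separate from the test data, so that the reported improvement is not inflated.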
