Data in DevOps and Its Importance in Code Analytics

Girish Babu (Cisco Systems Inc., Canada) and Charitra Kamalaksh Patil (MNP LLP, Canada)
DOI: 10.4018/978-1-7998-1863-2.ch007

Abstract

Robust DevOps plays a huge role in the health and sanity of software. The metadata generated during DevOps needs to be harnessed to derive useful insights into the health of the software. This area of work can be classified as code analytics and comprises the following (but is not limited to): 1. commit history from the source code management system (SCM); 2. the engineers that worked on the commit; 3. the reviewers on the commit; 4. the extent of build (if applicable) and test validation prior to the commit, the types of failures found in iterative processes, and the fixes done; 5. the extent of test coverage on the commit; 6. any static profiling of the code in the commit; 7. the size and complexity of the commit; and 8. more. This chapter articulates many ways the above information can be used for effective software development.
Chapter Preview

Introduction

Robust DevOps plays a huge role in the health and sanity of software, and the metadata generated during this activity needs to be harnessed to derive useful insights. This area of work can be classified as Code Analytics and comprises the following (but is not limited to):

  1. Commit history from the Source Code Management system (SCM)
  2. The engineers that worked on the commit
  3. The reviewers on the commit
  4. The extent of build (if applicable) and test validation prior to the commit, the types of failures found in iterative processes, and the fixes done
  5. The extent of test coverage on the commit
  6. Any static profiling of the code in the commit
  7. The size and complexity of the commit
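The attributes enumerated above can be thought of as one metadata record per commit. A minimal sketch of such a record follows; the field names are illustrative, not a schema from the chapter:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CommitRecord:
    """One commit's DevOps metadata, mirroring the list above (illustrative fields)."""
    sha: str                  # identifier from the SCM (item 1)
    author: str               # engineer on the commit (item 2)
    reviewers: List[str]      # reviewers on the commit (item 3)
    build_passed: bool        # build/test validation prior to commit (item 4)
    failures_seen: List[str]  # failure types found in iterative runs (item 4)
    test_coverage_pct: float  # extent of test coverage (item 5)
    static_findings: int      # static-profiling issue count (item 6)
    lines_changed: int        # size of the commit (item 7)
    cyclomatic_delta: int     # complexity change introduced (item 7)

record = CommitRecord(
    sha="a1b2c3d", author="alice", reviewers=["bob", "carol"],
    build_passed=True, failures_seen=["unit: test_parser"],
    test_coverage_pct=82.5, static_findings=3,
    lines_changed=240, cyclomatic_delta=4,
)
```

Collecting records of this shape per commit is what makes the downstream analytics in the following sections possible.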

The proposed chapter introduces the various attributes available during DevOps, means to use them effectively, and their application in source code analytics to help produce good-quality software at increasing velocity. Each section also gives a pictorial view of the role played by each piece of metadata and how it can be represented visually for effective insights.

Section 1 describes how commit history can be used to derive BugSpots/BugCache (Rahman et al., 2011). The techniques in that paper are extended to give a 'phase-containment' view of bugs and commits, which is used at Cisco Systems by the author.
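The hot-spot idea can be sketched as a time-weighted count of bug-fix commits per file, in the spirit of the BugCache family of heuristics; the weighting function and inputs below are an illustrative assumption, not the exact formulation from Rahman et al.:

```python
import math
from typing import Dict, List, Tuple

def bugspot_scores(fix_commits: List[Tuple[str, float]]) -> Dict[str, float]:
    """Score each file by its bug-fix churn, weighting recent fixes more.

    fix_commits: (file_path, timestamp) pairs for commits that fixed bugs.
    Each fix adds 1 / (1 + exp(-12t + 12)), where t is the commit's
    timestamp normalized to [0, 1] over the history window, so fresh
    fixes contribute close to 0.5 and old fixes close to 0.
    """
    if not fix_commits:
        return {}
    times = [ts for _, ts in fix_commits]
    lo, hi = min(times), max(times)
    span = (hi - lo) or 1.0
    scores: Dict[str, float] = {}
    for path, ts in fix_commits:
        t = (ts - lo) / span
        scores[path] = scores.get(path, 0.0) + 1.0 / (1.0 + math.exp(-12.0 * t + 12.0))
    return scores

# Files with many recent bug fixes surface at the top of the ranking.
scores = bugspot_scores([
    ("parser.c", 100.0), ("parser.c", 990.0), ("ui.c", 50.0),
])
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A ranking like this is one way commit history alone can point reviewers and testers at the riskiest areas of the codebase.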

Section 2 delves into code review practices, the metadata available during this activity, and its application to sound peer reviews in the development life cycle. Effective illustrations are described in the papers Search-Based Peer Reviewers Recommendation in Modern Code Review (Ouni et al., 2016) and A Large-Scale Study on Source Code Reviewer Recommendation (Lipčák & Rossi, 2018).
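Reviewer recommendation can be illustrated with a deliberately simple frequency heuristic: suggest the people who most often reviewed past changes to the files being touched. This is a minimal sketch, not the search-based algorithm of Ouni et al.; the data shapes are assumptions:

```python
from collections import Counter
from typing import Dict, List

def recommend_reviewers(changed_files: List[str],
                        review_history: Dict[str, List[str]],
                        top_n: int = 2) -> List[str]:
    """Rank candidate reviewers by how often they reviewed the changed files.

    review_history maps a file path to the reviewers of its past changes.
    """
    votes: Counter = Counter()
    for path in changed_files:
        votes.update(review_history.get(path, []))
    return [name for name, _ in votes.most_common(top_n)]

history = {
    "net/tcp.c": ["bob", "carol", "bob"],  # bob reviewed tcp.c twice
    "net/udp.c": ["bob"],
}
suggested = recommend_reviewers(["net/tcp.c", "net/udp.c"], history)
```

Real systems weight recency, ownership, and workload on top of raw frequency, but even this baseline captures the core signal the cited studies evaluate.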

Section 3 forays into code coverage measures from various test cycles and their potential use in determining the efficacy of test activities. A paper of note in this area is Examining the Effectiveness of Testing Coverage Tools: An Empirical Study (Alemerien & Magel, 2014).
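The basic coverage computation behind such comparisons can be sketched as the fraction of executable lines a test cycle exercised; combining cycles shows whether they complement each other. The set-based representation here is an illustrative simplification:

```python
from typing import Set

def line_coverage(executed: Set[int], executable: Set[int]) -> float:
    """Percentage of executable lines exercised by a test cycle."""
    if not executable:
        return 100.0
    return 100.0 * len(executed & executable) / len(executable)

EXECUTABLE = {1, 2, 3, 4, 5}            # executable lines of one source file
unit_lines = {1, 2, 5}                  # lines hit by the unit-test cycle
regression_lines = {1, 2, 3, 4}         # lines hit by the regression cycle

unit_cov = line_coverage(unit_lines, EXECUTABLE)              # 60.0
regression_cov = line_coverage(regression_lines, EXECUTABLE)  # 80.0
combined = line_coverage(unit_lines | regression_lines, EXECUTABLE)  # 100.0
```

Comparing per-cycle numbers against the combined figure is one concrete way to judge whether a test activity adds unique coverage or merely repeats another cycle.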

Section 4 goes into static analysis and static profiling of software, one of the earliest indicators of quality and stability. It uses recommendations described in the papers Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric (Watson & McCabe, 1996) and The Correlation among Software Complexity Metrics with Case Study (Tashtoush et al., 2014).
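McCabe's cyclomatic complexity, central to both cited papers, counts independent paths through code: one plus the number of decision points. A minimal sketch over Python source follows; the set of node types counted is a simplification of full McCabe counting:

```python
import ast

# Decision points that add an independent path (simplified McCabe counting).
_DECISIONS = (ast.If, ast.For, ast.While, ast.BoolOp, ast.ExceptHandler)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, _DECISIONS) for node in ast.walk(tree))

snippet = """
def classify(x):
    if x < 0:
        return "neg"
    for _ in range(x):
        if x % 2:
            return "odd"
    return "even"
"""
# Two ifs and one for loop -> 1 + 3 = 4.
```

Tracking this number per commit (and flagging increases) is a cheap static signal that needs no test execution at all.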

Section 5 leverages the study on cyclomatic complexity analysis (Watson & McCabe, 1996) to expand its use in code analytics.

Section 6 ventures into the DevOps workflow itself and why it is relevant to interject many of the earlier insights directly into the CI/CD (Continuous Integration / Continuous Delivery) pipeline. It also draws together the first five sections and how they contribute the metadata necessary for code analytics.
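Interjecting those insights into the pipeline usually takes the form of a quality gate: a pre-merge check that combines coverage, complexity, and static-analysis metadata into a pass/fail decision. The thresholds and function below are illustrative assumptions, not prescribed values:

```python
from typing import List

def quality_gate(coverage_pct: float, complexity: int, static_findings: int,
                 min_coverage: float = 75.0, max_complexity: int = 10,
                 max_findings: int = 0) -> List[str]:
    """Return the gate violations for a commit; an empty list means it may merge."""
    problems: List[str] = []
    if coverage_pct < min_coverage:
        problems.append(f"coverage {coverage_pct:.1f}% below {min_coverage}%")
    if complexity > max_complexity:
        problems.append(f"cyclomatic complexity {complexity} exceeds {max_complexity}")
    if static_findings > max_findings:
        problems.append(f"{static_findings} unresolved static-analysis findings")
    return problems

# A healthy commit passes; a risky one is blocked with actionable reasons.
ok = quality_gate(coverage_pct=82.5, complexity=6, static_findings=0)
blocked = quality_gate(coverage_pct=60.0, complexity=14, static_findings=2)
```

Running such a gate in CI turns the chapter's metadata from after-the-fact reporting into a feedback loop that acts on every commit.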

Key Terms in this Chapter

Static Analysis: The static profiling of software prior to actual testing, and the various techniques for using its results in code analytics.

Code Complexity: The area of work that defines a measure of complexity for software, which can then be used to forecast the stability of that software when deployed or used by customers.

Code Analytics: The area of study that applies data-driven analysis techniques to understanding and predicting how software will perform, from development and testing through deployment and customer usage.

Coverage Analysis: The application of data-driven techniques to data collected from the multiple test/validation phases of a piece of software, including the various techniques and methods for capturing and harnessing test coverage meta-information.

BugSpots/HotSpots: Areas of a codebase that churn, i.e., receive many code commits to service bugs or defects.
