Log File Template Detection as a Multi-Objective Optimization Problem

Mathi Murugan T. (Sathyabama Institute of Science and Technology, India) and E. Baburaj (Department of Computer Science Engineering, Marian Engineering College, India)
Copyright: © 2022 | Pages: 20
DOI: 10.4018/IJSIR.2022010107


There is a need for an automatic log file template detection tool that can match all the log messages by searching the space of candidate templates. Such a tool must cope with two constraints: (i) a template must not be too general and (ii) it must not be too specific. These constraints contradict one another, so template detection can be treated as a multi-objective optimization problem. This paper therefore proposes a novel multi-objective optimization-based log file template detection approach named LTD-MO. It uses a new multi-objective swarm intelligence algorithm, chicken swarm optimization, to solve this hard optimization problem. Moreover, it analyzes all templates in the search space and selects a Pareto-optimal solution set that balances the competing objectives. The proposed approach is implemented and evaluated on eight publicly available benchmark log datasets. The empirical analysis shows that LTD-MO detects a large number of appropriate templates and significantly outperforms the existing techniques on all datasets.
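The idea of selecting a Pareto front over two conflicting objectives can be illustrated with a minimal Python sketch. This is not the LTD-MO algorithm and does not implement chicken swarm optimization; the candidate names and the two penalty scores (one for being too general, one for being too specific) are hypothetical values chosen only to show how non-dominated solutions are picked.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, generality_penalty, specificity_penalty).
# Both penalties are assumed to be "lower is better"; the real objective
# functions of LTD-MO are not given in this preview.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on at least one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up scores for illustration only.
candidates = [
    ("A", 0.1, 0.9),  # very specific, hardly general
    ("B", 0.5, 0.5),  # balanced
    ("C", 0.9, 0.1),  # very general, hardly specific
    ("D", 0.6, 0.6),  # worse than B on both objectives
    ("E", 0.9, 0.9),  # worse than every other candidate
]
front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
```

Here `front_names` contains A, B and C: each is better than the others on at least one objective, so none can be discarded without giving something up. D and E are dropped because another candidate beats them on both objectives.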
Article Preview


Nowadays, large-scale computer processors comprise a number of software applications and components running on thousands of operating nodes. The runtime statistics of these processors are continuously gathered and accumulated in log files, which are then analyzed to detect the cause and exact location of issues when a system fails or malfunctions (Bao et al., 2018). In general, logging is a routine process for storing system functional data, which helps developers and support engineers analyze the behavior of systems and track down difficulties that may arise in the future. Thus, log files play an essential role in the maintenance and development of software-based computing systems. Additionally, the rich data present in log files facilitate a wide variety of system analytics practices, such as ensuring software security (Latib et al., 2018), analyzing application statistics (Patel & Parikh, 2017), detecting performance anomalies (Vaarandi et al., 2018), and identifying crashes and errors (Suman et al., 2018; Adam et al., 2016). Despite the valuable data available in logs, effective analysis is difficult because of the following challenges. Firstly, recent software systems regularly produce large volumes of logs (e.g., a commercial cloud device can generate gigabytes of data every hour) (Astekin et al., 2019). Such high-volume logs make manual inspection for key diagnostics impossible, even with search and grep utilities. As a result, traditional log analysis methods that mostly depend on manual operation have become infeasible and prohibitive (Jia et al., 2018; Li et al., 2018). Secondly, the messages in log files are intrinsically unstructured, as developers normally record system activities in a free-text format for better accessibility and flexibility (Rath, 2016).
Therefore, there is great demand for automated log analysis across all kinds of applications (He et al., 2017; El-Masri et al., 2020). Automatic log analysis based on ad hoc scripts that search for keywords such as “CRITICAL” or “ERROR” has been found inadequate for fixing many problems (Baudart, 2018; Vega et al., 2017). Rule-based methods are more advanced; however, it is difficult to formulate every rule needed for the analysis (Khan & Parkinson, 2018). The drawbacks of these early approaches significantly increase the difficulty of log file data analysis. To overcome these limitations, recent studies and industry offer alternatives with powerful word search and machine learning analytics, such as Splunk (Carasso, 2012), ELK (Smith, 2015), and Logentries (Jaunin & Burdick, 2011). Nevertheless, the first and foremost step in enabling such log analysis is log parsing, through which free-text raw log data are parsed into a stream of structured data (He et al., 2016).
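To make the notion of log parsing concrete, the following Python sketch matches free-text log lines against a small set of templates and extracts the variable parts as structured fields. It is only an illustration under stated assumptions: the `<*>` wildcard notation, the function names, the templates, and the sample log lines are all hypothetical and are not taken from the paper or its datasets.

```python
import re
from typing import Dict, List, Optional

WILDCARD = "<*>"  # assumed placeholder for a variable token in a template


def template_to_regex(template: str) -> re.Pattern:
    """Turn a template into a regex: literal text is escaped, each wildcard captures one token."""
    literal_parts = [re.escape(part) for part in template.split(WILDCARD)]
    return re.compile("^" + r"(\S+)".join(literal_parts) + "$")


def parse_line(line: str, templates: List[str]) -> Optional[Dict[str, object]]:
    """Return the first matching template and its parameters, or None if nothing matches."""
    for template in templates:
        match = template_to_regex(template).match(line)
        if match:
            return {"template": template, "params": list(match.groups())}
    return None


# Hypothetical templates and log lines, for illustration only.
templates = [
    "Connection from <*> closed",
    "Disk usage at <*> percent on <*>",
]
parsed = parse_line("Connection from 10.0.0.5 closed", templates)
parsed_disk = parse_line("Disk usage at 91 percent on node7", templates)
unparsed = parse_line("Kernel panic", templates)
```

In this sketch, `parsed` pairs the first template with the parameter `10.0.0.5`, while `unparsed` is `None` because no template covers that line. The example also hints at the trade-off described in the abstract: a template made only of wildcards would match many unrelated lines (too general), whereas a template with no wildcards would match exactly one message (too specific).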
