Intelligent Log Analysis Using Machine and Deep Learning

Steven Yen, Melody Moh

Source Title: Machine Learning and Cognitive Science Applications in Cyber Security

DOI: 10.4018/978-1-5225-8100-0.ch007

Abstract

Computers generate a large volume of logs recording various events of interest. These logs are a rich source of information and can be analyzed to extract various insights about the system. However, due to their overwhelmingly large volume, logs are often mismanaged and not utilized effectively. The goal of this chapter is to help researchers and industrial professionals make more informed decisions about their logging solutions. It first lays the foundation by describing log sources and formats. Then it describes all the components involved in logging. The remainder of the chapter provides a survey of different log analysis techniques and their applications, consisting of conventional techniques using rules and event correlators that can detect known issues, plus more advanced techniques such as statistical, machine learning, and deep learning techniques that can also detect unknown issues. The chapter concludes by describing the underlying concepts of the techniques, their application to log analysis, and their comparative effectiveness.

Chapter Preview

Introduction

Logging was used in various fields long before the advent of computers. Examples include physical logbooks, accounting transaction ledgers, and car maintenance records. They are used to record any events of interest based on the context. The information in the logs can then be used in the future to troubleshoot problems, improve operating procedures, act as an audit trail, and so on.

The practice of logging was adopted in computing systems from the very beginning. Developers used printf statements throughout their code to print relevant information that would help them debug when issues arose. Some of these messages were used only during development and removed before release; others were placed strategically to help with troubleshooting or monitoring later on. These log messages can be shown directly to the user or sent to specific output channels, such as a file. Because of its usefulness, logging became common practice, and nowadays almost every piece of software has logging capability. In modern computing systems, logs can come from operating systems, network devices, and various application software. They are meant to record interesting events that occur when programs are run.

These logs from various devices and processes have proved to be extremely useful for detecting security issues. Operating system logs (or host logs) can be analyzed to detect unauthorized access, such as that by an attacker using an SSH scanner (Chuvakin, Schmidt, & Philips, 2013). Network logs can be analyzed to detect unusual traffic, such as that between malware and a remote attacker’s device (Stamp, 2006). Web application logs can be analyzed to detect attacks such as cross-site scripting, SQL injection, and invalid resource access (Liang, Zhao, & Ye, 2017). Many, if not all, cyberattacks leave traces in logs somewhere; one just needs to know what to look for.
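As a minimal sketch of the host-log case, the Python below counts failed SSH logins per source address and flags any address that reaches a threshold. The sample lines imitate the "Failed password" messages written by OpenSSH's sshd; the addresses, the threshold, and the name flag_ssh_scanners are assumptions made for illustration and are not taken from the cited works.

```python
import re
from collections import Counter

# Matches sshd-style failure messages, for example:
# "Failed password for invalid user admin from 203.0.113.5 port 4242 ssh2"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def flag_ssh_scanners(lines, threshold=3):
    """Return {ip: count} for sources with at least `threshold` failed logins."""
    failures = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group("ip")] += 1
    return {ip: count for ip, count in failures.items() if count >= threshold}


# Hypothetical sample log lines (addresses are from documentation ranges).
SAMPLE_LOG = [
    "Jan 10 03:12:01 host sshd[811]: Failed password for invalid user admin from 203.0.113.5 port 4242 ssh2",
    "Jan 10 03:12:03 host sshd[811]: Failed password for root from 203.0.113.5 port 4243 ssh2",
    "Jan 10 03:12:05 host sshd[811]: Failed password for invalid user test from 203.0.113.5 port 4244 ssh2",
    "Jan 10 08:30:11 host sshd[902]: Failed password for alice from 198.51.100.7 port 5151 ssh2",
    "Jan 10 08:30:20 host sshd[902]: Accepted password for alice from 198.51.100.7 port 5151 ssh2",
]

if __name__ == "__main__":
    print(flag_ssh_scanners(SAMPLE_LOG))
```

A single mistyped password from a legitimate user stays below the threshold, while the burst of failures from one address is flagged. A real deployment would also need a time window, which this sketch omits.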

However, because log generation in computing systems is automated, the volume of logs generated became very large. An unfortunate consequence is that many users began to view logs as an annoyance rather than a helpful tool. Logs were seldom looked at and were often simply deleted when space ran out. To address these issues, log management systems were developed to facilitate the collection, storage, and analysis of logs.

Log analysis can be done manually by inspecting raw text files directly or by using event viewers provided by log management systems. Such manual inspection is labor-intensive and often not timely enough for real-time incident response. To address these limitations, rule-based systems were developed that can evaluate log events against a library of known issues (known as a rule-base). These tools proved to be quite effective and have helped organizations prevent many incidents in a timely fashion. The drawback is that they can only detect known issues for which there are exact rules in the rule-base, and they miss unknown issues. To help detect new and unknown issues, anomaly detection approaches were introduced, which are based on identifying unusual or abnormal behavior. Statistical, machine learning, and deep learning techniques proved to be quite suitable for this application, because they can form their own detection criteria from training data rather than relying on human operators to specify rules. Over the years, more and more of these techniques have been applied to log analysis with impressive results.
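The contrast between the two approaches can be sketched in a few lines of Python. The rule-base below contains two deliberately simplistic, hypothetical patterns, and the anomaly detector uses a plain z-score against a baseline of per-minute event counts as a stand-in for the statistical techniques mentioned above. The names RULE_BASE, match_rules, and find_anomalies, along with the threshold, are assumptions for illustration rather than the chapter's own method.

```python
import re
from statistics import mean, stdev

# Hypothetical rule-base: each entry maps a rule name to a pattern for a known issue.
RULE_BASE = {
    "sql_injection": re.compile(r"union\s+select|or\s+1=1", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./"),
}


def match_rules(event):
    """Rule-based check: return the names of all rules the log event triggers."""
    return [name for name, pattern in RULE_BASE.items() if pattern.search(event)]


def find_anomalies(baseline, observed, z_threshold=3.0):
    """Anomaly check: return indices of observed counts that deviate from the
    baseline mean by more than `z_threshold` standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A constant baseline makes any different value unusual.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > z_threshold]


if __name__ == "__main__":
    # A known attack pattern is caught by a rule.
    print(match_rules("GET /item?id=1 UNION SELECT password FROM users"))
    # A traffic spike has no rule, but stands out against the learned baseline.
    baseline_counts = [100, 104, 98, 101, 97, 103, 99, 102]
    print(find_anomalies(baseline_counts, [101, 250, 99]))
```

The rule engine only reacts to events that someone anticipated and wrote a pattern for, whereas the detection criterion in find_anomalies is derived from the baseline data itself, so it can flag behavior nobody described in advance.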

Key Terms in this Chapter

Recurrent Neural Networks (RNN): Class of ANN that have recurrent connections that allow the network to maintain internal state/memory.

Long Short-Term Memory (LSTM): Type of RNN that incorporates multiplicative gates that allow the network to have long- and short-term memory.

Anomaly detection: Analyzing data to detect unusual or abnormal behavior.

Rules Engine: Software that allows the user to specify rules in a library (known as a rule-set), which the software then applies for various purposes. In the context of log analysis, rules engines use the rules to evaluate log events and take appropriate actions.

K-Means: Machine learning technique that identifies groups/clusters of data points that are similar to each other.

Log Analysis: The analysis of logs to extract useful information for troubleshooting, monitoring, auditing, and other purposes.

Multilayer Perceptron (MLP): Class of ANN that are feedforward and fully connected in construction.

Event Correlation: Looking across different events to extract global insights based on their relationships.

Principal Component Analysis (PCA): Machine learning technique that transforms the data to identify important relationships and reduce the dimensionality of the data.

Artificial Neural Networks (ANN): Computing systems that use networks of interconnected nodes to process and gain knowledge from training data, then apply the knowledge to make predictions.
