Massively Threaded Digital Forensics Tools

Lodovico Marziale, Santhi Movva, Golden G. Richard III, Vassil Roussev, Loren Schwiebert
DOI: 10.4018/978-1-60566-836-9.ch010

Abstract

Digital forensics comprises the set of techniques to recover, preserve, and examine digital evidence, and has applications in a number of important areas, including investigation of child exploitation, identity theft, counter-terrorism, and intellectual property disputes. Digital forensics tools must exhaustively examine and interpret data at a low level, because data of evidentiary value may have been deleted, partially overwritten, obfuscated, or corrupted. While forensics investigation is typically seen as an off-line activity, improving case turnaround time is crucial, because in many cases lives or livelihoods may hang in the balance. Furthermore, if more computational resources can be brought to bear, we believe that preventative network security (which must be performed on-line) and digital forensics can be merged into a common research focus. In this chapter we consider recent hardware trends and argue that multicore CPUs and Graphics Processing Units (GPUs) offer one solution to the problem of maximizing available compute resources.
Chapter Preview

Introduction

The complexity of digital forensic analysis continues to grow in lockstep with the rapidly increasing size of forensic targets—as the rate at which digital content is generated keeps climbing, so does the amount of data that ends up in the forensic lab. According to FBI statistics (Federal Bureau of Investigation, 2007), the average amount of data examined per criminal case has been growing at an average annual rate of 35%—from 83GB in 2003 to 277GB in 2007. However, this is just the tip of the iceberg—the vast majority of forensic analyses are in support of either civil cases or internal investigations and can easily involve the examination of terabyte-scale data sets.

Ultimately, a tiny fraction of that information ends up being relevant—the proverbial ‘needle in a haystack’—so there is a pressing need for high-performance forensic tools that can quickly sift through the data with increasing sophistication. As an illustration of the difficulty of the problem, consider the 2002 Department of Defense investigation into a leaked memo with Iraq war plans. It has been reported (Roberts, 2005) that a total of 60TB of data were seized in an attempt to identify the source. Several months later, the investigation was closed with no results. The Enron case involved over 30TB of raw data and took many months to complete. While these examples might seem exceptional, it is not difficult to come up with similar, plausible scenarios in a corporate environment involving large amounts of data. As media capacity continues to double every two years, such huge data sets will be increasingly the norm, not the exception.

Current state-of-the-art forensic labs use a private network of high-end workstations backed by a Storage Area Network as their hardware platform. Almost all processing for a case is done on a single workstation—the target is first pre-processed (indexed) and subsequently queried. Current technology trends (Patterson, 2004) unambiguously render such an approach unsustainable: storage capacity is growing at a significantly faster rate than I/O throughput and latency are improving.

This means that, in relative terms, we are falling behind in our ability to access data on the forensic target. At the same time, our raw hardware capability to process the data has kept up with capacity growth. The basic problem is two-fold: a) current tools do a poor job of maximizing the use of available compute resources; b) the current index-query model of forensic computation effectively neutralizes most of the gains in compute power by pushing the data across the I/O bottleneck multiple times.

Before we look at the necessary changes in the computational model, let us briefly review recent hardware trends. Starting in 2005, with AMD's introduction of the dual-core Opteron processor, single-chip multiprocessors entered the commodity market. The main reason for their introduction is that chip manufacturing technologies are approaching fundamental limits, and the decades-old pursuit of speedup by doubling transistor density every two years, a.k.a. keeping up with Moore's Law, had to make a 90-degree turn. Instead of shrinking the processor and increasing its clock rate, manufacturers now pack more processing units onto the same chip, and each processor can simultaneously execute multiple threads of computation. This is an abrupt paradigm shift towards massive CPU parallelism, and existing forensic tools are clearly not designed to take advantage of it.
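To make the opportunity concrete, the sketch below shows one way a forensic tool might spread work across all available cores. It is an illustration rather than anything taken from the chapter: the function names (parallelScan, scanChunk), the single fixed keyword, and the even chunking of an in-memory disk-image buffer are assumptions made for the example. Each hardware thread scans its own slice of the buffer, so the image is read once while every core stays busy.

```cpp
// Hypothetical sketch: divide an in-memory disk image among CPU hardware
// threads and let each one scan its chunk for a keyword in a single pass.
#include <algorithm>
#include <atomic>
#include <cstring>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Count occurrences of `needle` whose starting offset lies in [begin, end).
// Each worker may read up to needle.size()-1 bytes past `end` so that matches
// straddling a chunk boundary are neither missed nor double-counted.
static void scanChunk(const unsigned char* image, size_t imageLen,
                      size_t begin, size_t end, const std::string& needle,
                      std::atomic<size_t>& hits) {
    size_t limit = std::min(end + needle.size() - 1, imageLen);
    for (size_t i = begin; i + needle.size() <= limit; ++i)
        if (std::memcmp(image + i, needle.data(), needle.size()) == 0)
            ++hits;
}

size_t parallelScan(const unsigned char* image, size_t imageLen,
                    const std::string& needle) {
    if (needle.empty() || imageLen == 0) return 0;
    unsigned cores = std::max(1u, std::thread::hardware_concurrency());
    size_t chunk = (imageLen + cores - 1) / cores;
    std::atomic<size_t> hits{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < cores; ++t) {
        size_t begin = t * chunk;
        if (begin >= imageLen) break;
        size_t end = std::min(begin + chunk, imageLen);
        workers.emplace_back(scanChunk, image, imageLen, begin, end,
                             std::cref(needle), std::ref(hits));
    }
    for (auto& w : workers) w.join();
    return hits.load();
}
```

The only subtlety is the chunk boundary: each worker is allowed to read slightly past its nominal end so that a keyword straddling two chunks is still found exactly once. A production tool would also overlap this scanning with asynchronous reads from the evidence drive rather than assume the whole image fits in memory.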

Another important hardware development that gives us a peek into how massively parallel computation on the desktop will look in the near future is the rise of Graphics Processing Units (GPUs) as a general-purpose compute platform. GPUs evolved out of the need to speed up graphics computations, which tend to be highly parallelizable and follow a very regular pattern. As a result, GPU architectures have followed a different evolutionary path from that of the CPU: instead of relatively few, very complex processing units and large caches, GPUs provide hundreds of simpler processing units and very little on-board cache.
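The data-parallel style this architecture favors is easiest to see in a small example. The CUDA sketch below is illustrative only—the kernel name, the choice of the three-byte JPEG header FF D8 FF as the search target, and the fixed-size result buffer are assumptions made for this example, not something prescribed by the chapter. Every GPU thread examines a single byte offset of the disk image, so hundreds of simple cores test offsets simultaneously.

```cuda
// Hypothetical CUDA sketch: one GPU thread per byte offset, each testing
// whether a JPEG header (FF D8 FF) starts at its offset in the disk image.
#include <cuda_runtime.h>

__global__ void findJpegHeaders(const unsigned char* image, size_t len,
                                size_t* hitOffsets, unsigned int* hitCount,
                                unsigned int maxHits) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i + 3 > len) return;                         // stay inside the image
    if (image[i] == 0xFF && image[i + 1] == 0xD8 && image[i + 2] == 0xFF) {
        unsigned int slot = atomicAdd(hitCount, 1u); // claim an output slot
        if (slot < maxHits) hitOffsets[slot] = i;    // record the match offset
    }
}

// Host-side launch: cover every offset with 256-thread blocks. The device
// buffers (d_image, d_hitOffsets, d_hitCount) are assumed to have been
// allocated and populated with cudaMalloc/cudaMemcpy beforehand.
void launchHeaderScan(const unsigned char* d_image, size_t len,
                      size_t* d_hitOffsets, unsigned int* d_hitCount,
                      unsigned int maxHits) {
    const int threadsPerBlock = 256;
    // Assumes the block count fits in the grid's x dimension.
    int blocks = (int)((len + threadsPerBlock - 1) / threadsPerBlock);
    findJpegHeaders<<<blocks, threadsPerBlock>>>(d_image, len, d_hitOffsets,
                                                 d_hitCount, maxHits);
    cudaDeviceSynchronize();
}
```

The kernel keeps almost no per-thread state and relies on sheer thread count rather than caching, which is exactly the kind of regular, data-parallel work the GPU's simple cores are built for.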

Key Terms in this Chapter

Single Instruction Multiple Thread (SIMT): SIMT is an approach to parallel computing where multiple threads execute the same computations on different data items.

Graphics Processing Unit (GPU): A GPU is a computing device that was traditionally designed specifically to render computer graphics. Modern GPU designs more readily support general computations.

File Carving: File carving is the process of extracting deleted files or file fragments from a disk image without reliance on filesystem metadata.

Single Instruction Multiple Data (SIMD): SIMD is an approach to parallel computing where multiple processors execute the same instruction stream but on different data items.

Digital Forensics: Digital forensics is the application of forensic techniques to the legal investigation of computers and other digital devices.

String Matching Algorithm: A string matching algorithm is a procedure for finding all occurrences of a string in a block of text.

Multicore CPU: A multicore CPU is a single-chip processor that contains multiple processing elements.

Multi-Pattern String Matching: A multi-pattern string matching algorithm is a procedure for finding all occurrences of any of a set of text strings in a block of text (a minimal code sketch follows this list).

Beowulf Cluster: A Beowulf cluster is a parallel computer built from commodity PC hardware.
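As noted in the multi-pattern string matching entry above, here is a minimal, deliberately naive sketch of that idea; the names multiPatternScan and Match are invented for this example and do not come from the chapter. Every offset in the buffer is tested against every pattern, which makes the definition concrete even though practical tools replace the inner loop with a single-pass structure such as an Aho-Corasick automaton.

```cpp
// Hypothetical sketch of multi-pattern string matching in its simplest form:
// report every (offset, pattern) pair at which some pattern occurs.
#include <cstring>
#include <string>
#include <vector>

struct Match {
    size_t offset;        // where in the data the pattern starts
    size_t patternIndex;  // which pattern matched
};

std::vector<Match> multiPatternScan(const unsigned char* data, size_t len,
                                    const std::vector<std::string>& patterns) {
    std::vector<Match> matches;
    for (size_t i = 0; i < len; ++i) {
        for (size_t p = 0; p < patterns.size(); ++p) {
            const std::string& pat = patterns[p];
            if (!pat.empty() && pat.size() <= len - i &&
                std::memcmp(data + i, pat.data(), pat.size()) == 0) {
                matches.push_back({i, p});
            }
        }
    }
    return matches;
}
```

The naive scan does work proportional to the buffer size times the number of patterns, which is precisely why forensic tools that search for thousands of keywords at once depend on true multi-pattern algorithms.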
