Migrating a Legacy Web-Based Document-Analysis Application to Hadoop and HBase: An Experience Report

Himanshu Vashishtha, Michael Smit, Eleni Stroulia

Source Title: Migrating Legacy Applications: Challenges in Service Oriented Architecture and Cloud Computing Environments

DOI: 10.4018/978-1-4666-2488-7.ch010

Abstract

Migrating a legacy application to a more modern computing platform is a recurring software-development activity. This chapter describes the authors’ experience with a contemporary rendition of this activity, migrating a Web-based system to a service-oriented application on two different cloud software platforms, Hadoop and HBase. Using the case study as a running example, they review the information needed for a successful migration and examine the trade-offs between development/re-design effort and performance/scalability improvements. The two levels of re-design, towards Hadoop and HBase, require notably different levels of effort, and as the authors found through exercising the migrated applications, they achieve different benefits. The authors found that both redesigns led to substantial benefit in performance improvement, and that expending the additional effort required by the more complex migration resulted in notable improvements in the ability to leverage the benefits of the platform.

Chapter Preview

Introduction

Migrating applications to cloud-computing environments is a software-engineering activity attracting increasing attention, as cloud environments become more accessible and better supported. Such migrations pose questions regarding the changes necessary to the code and to the architecture of the original software system, the effort necessary to perform these changes, and the possible performance improvements to be gained by the migration. The software-development team undertaking a migration-to-the-cloud project needs to address the following questions.

• What types of software (i.e., components and/or libraries) can developers expect when undertaking a migration project?
• What modifications are typically required for the migrated application to better leverage the potential of the target cloud platform? What are the implications of the various platforms for the architectural and detailed design of the software deployed on them?
• Will the particular software application benefit from its migration to a cloud environment? How might one assess the trade-off between the costs of the planned modifications and the improvements anticipated in the application post-migration?

The term cloud computing characterizes the perspective of end users, who are offered a service (which could be in the form of a computing platform or infrastructure) while being agnostic about its underlying technology. The implementation details of the service are abstracted away, and it is consumed on a pay-per-use basis, as opposed to being acquired as an asset. In principle, one distinguishes among three different types of cloud-based services. When infrastructure is offered as a service (IaaS), end users are able to procure virtualized hardware. When a software platform is offered as a service (PaaS), end users consume a software platform, i.e., a combination of an operating system, basic tools and libraries. Finally, when a software application is offered as a service (SaaS), end users consume as clients a specific application that is independently deployed and managed. Of course, these offerings can be combined into a stack of service offerings.

All three scenarios above promote improved scalability, albeit through different mechanisms. The first scenario eliminates the need for users to acquire, manage, and replace hardware, since any number of appropriately configured virtual machines can be easily procured (and abandoned), for example, through Amazon Web Services¹. The second scenario promises improved scalability with novel tools and computational metaphors, such as those of the Hadoop ecosystem for storing and manipulating “big data.” Finally, when a software system is offered as a service, such as SalesForce², its consumers are offered state-of-the-art functionality, regularly maintained and extended, with guaranteed quality, at negotiable costs.

In this chapter, we report on our experience migrating a legacy application, TAPoR, to take advantage of IaaS (using AWS) and PaaS (in two scenarios, Hadoop and HBase). The original version of TAPoR had severe performance limitations, and it was the promise of scalability through its migration “to the cloud” that motivated our study. The original application ran on a single machine, in a single thread, within a single process. Taking advantage of the IaaS model, we modified it to incorporate a load-balancing component that distributes incoming requests to multiple identical processes running on multiple virtual machines (Smit, Nisbet, Stroulia, Iszlai, & Edgar, 2009). This change, however, did not address the fundamental inability of the application to scale to large documents. To that end, we investigated an architectural shift to exploit the advantages of (two variants of) the Hadoop ecosystem as a platform. To summarize, we have performed three types of modification to the original system:

• No architectural changes; deploy the software (with a load balancer) to multiple machines (on Amazon EC2, for instance);
• Rearchitecting towards the MapReduce paradigm; modify the architecture and implementation to make use of the distributed computation features of Hadoop; and
• Rearchitecting to use a NoSQL database; further change the implementation to also make use of HBase³, the distributed database of the Hadoop ecosystem.
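
To make the second modification more concrete, the following is a minimal single-machine sketch of the MapReduce paradigm in Python. It is not TAPoR's code and it does not use the Hadoop API. The word-frequency operation is assumed here only as a plausible document-analysis task, and every name (`split_document`, `map_chunk`, `shuffle`, `reduce_group`, `word_frequencies`) is hypothetical. A thread pool stands in for the cluster nodes that would run the map tasks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_document(text, chunk_lines=2):
    """Split a document into groups of lines (a stand-in for input splits)."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + chunk_lines])
            for i in range(0, len(lines), chunk_lines)]


def map_chunk(chunk):
    """Map phase: emit a (word, 1) pair for every word in one chunk."""
    pairs = []
    for token in chunk.split():
        word = token.strip(".,;:!?\"'()").lower()
        if word:  # skip tokens that were punctuation only
            pairs.append((word, 1))
    return pairs


def shuffle(mapped):
    """Shuffle phase: group the emitted values by key across all chunks."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce phase: combine all values emitted for one key."""
    return key, sum(values)


def word_frequencies(text, workers=4):
    """Run the three phases; the thread pool is an assumed stand-in for a cluster."""
    chunks = split_document(text)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    groups = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample = ("The cloud scales.\n"
              "The legacy app does not.\n"
              "Migrate the app to the cloud.")
    print(word_frequencies(sample))
```

The point of the sketch is that each chunk is processed independently in the map phase, so a large document no longer has to pass through a single thread in a single process; only the grouped intermediate pairs need to be brought together for the reduce phase.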
