Hadoop Tools

Copyright: © 2019 | Pages: 47
ISBN13: 9781522537908 | ISBN10: 1522537902 | ISBN13 Softcover: 9781522586951 | EISBN13: 9781522537915
DOI: 10.4018/978-1-5225-3790-8.ch009
Cite Chapter

MLA

T. Revathi, et al. "Hadoop Tools." Big Data Processing With Hadoop, IGI Global, 2019, pp.169-215. https://doi.org/10.4018/978-1-5225-3790-8.ch009

APA

T. Revathi, K. Muneeswaran, & M. Blessa Binolin Pepsi (2019). Hadoop Tools. IGI Global. https://doi.org/10.4018/978-1-5225-3790-8.ch009

Chicago

T. Revathi, K. Muneeswaran, and M. Blessa Binolin Pepsi. "Hadoop Tools." In Big Data Processing With Hadoop. Hershey, PA: IGI Global, 2019. https://doi.org/10.4018/978-1-5225-3790-8.ch009

Abstract

This chapter explains the additional tools that ship with the Hadoop distribution: Hadoop Streaming, Hadoop Archives, DistCp, Rumen, GridMix, and Scheduler Load Simulator. Hadoop Streaming is a utility that lets the user supply any executable or script as the mapper and the reducer. Hadoop Archives is used for archiving old files and directories. DistCp copies files within a cluster and across different clusters. Rumen extracts meaningful data from JobHistory files and analyzes it; it is used for statistical analysis. GridMix is a benchmark for Hadoop. It takes a job trace, which can be generated by Rumen, and creates a synthetic job with the same pattern as the trace. Scheduler Load Simulator simulates different loads and scheduling methods such as FIFO and the Fair Scheduler. The chapter covers each tool and gives the syntax of its commands, so that after reading it the reader should be able to use all of these tools effectively.
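
To make the Hadoop Streaming description concrete, here is a minimal word-count sketch in Python. It is not taken from the chapter. It assumes the usual Streaming convention that the mapper and the reducer read lines from standard input and write tab-separated key/value records to standard output, and that the reducer receives its input sorted by key. The function names and the "map"/"reduce" command-line argument are hypothetical choices made for this illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each key.

    Assumes the input is already sorted by key, so equal keys are adjacent.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the role,
    # "map" for the mapper and anything else for the reducer.
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

If the script were saved under a hypothetical name such as wordcount.py, it could be passed to the streaming jar through the -mapper and -reducer options (for example "python wordcount.py map" and "python wordcount.py reduce"), with the job's data locations given by -input and -output. The same logic can be checked locally by sorting the mapper output before handing it to the reducer.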