End-to-End Dataflow Parallelism for Transfer Throughput Optimization

Esma Yildirim, Tevfik Kosar

Source Title: Advancements in Distributed Computing and Internet Technologies: Trends and Issues

DOI: 10.4018/978-1-61350-110-8.ch002

Abstract

The emerging petascale growth in the data produced by large-scale scientific applications necessitates innovative solutions for transferring data efficiently over the advanced infrastructure provided by today’s high-speed networks and complex computer architectures (e.g. supercomputers, parallel storage systems). Although current optical networking technology has reached transport speeds of 100 Gbps, applications still suffer from inadequate transport protocols and end-system bottlenecks such as processor speed, disk I/O speed and network interface card limits. These cause underutilization of the existing network infrastructure and let the application achieve only a small portion of the theoretical performance. Fortunately, the parallelism provided by the multiple CPUs/nodes and multiple disks present in today’s systems could eliminate these bottlenecks. However, it is necessary to understand the characteristics of the end-systems and the transport protocol used. In this book chapter, we analyze methodologies that improve the data transfer speed of applications and provide the maximal speeds obtainable from the available end-system resources and high-speed networks through the use of end-to-end dataflow parallelism.

Introduction

Data transfer throughput is a major factor affecting the performance of applications in many scientific areas (e.g. high-energy physics, bioinformatics, numerical relativity and computational fluid dynamics). Advances in optical networking technology have pushed link speeds far beyond the throughput that applications actually achieve; the same speedup is not seen in application performance for reasons such as protocol inadequacy, poorly tuned protocol parameters and underutilized end-system capacity. The most common protocols in use today (e.g. TCP) were not originally designed for high-bandwidth networks. Because of its additive increase multiplicative decrease policy, TCP takes a long time to fill long fat network pipes. Many transport-layer protocols have been designed for high-bandwidth networks (Kola & Vernon, 2007; Jin et al, 2005; Floyd, 2003) to overcome this problem; however, they have failed to replace TCP.
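
To make the cost of additive increase concrete, the following minimal Python sketch estimates how long a single TCP stream needs to refill a long fat pipe after one loss. It assumes standard congestion avoidance (the window grows by one segment per RTT and is halved on loss). The function names and the 10 Gbps, 100 ms, 1500-byte example are illustrative choices, not values taken from this chapter.

```python
def bdp_segments(bandwidth_bps: float, rtt_s: float, mss_bytes: int = 1500) -> float:
    """Window size, in segments, needed to keep the pipe full (bandwidth-delay product)."""
    return bandwidth_bps * rtt_s / (8 * mss_bytes)


def rtts_to_refill(window_segments: float) -> int:
    """Count the RTTs needed to grow from W/2 back to W at one segment per RTT."""
    cwnd = window_segments / 2  # multiplicative decrease after a single loss
    rounds = 0
    while cwnd < window_segments:
        cwnd += 1.0  # additive increase: one segment per round trip
        rounds += 1
    return rounds


def aimd_recovery_time_s(bandwidth_bps: float, rtt_s: float, mss_bytes: int = 1500) -> float:
    """Closed-form estimate: (W / 2) round trips, each lasting one RTT."""
    window = bdp_segments(bandwidth_bps, rtt_s, mss_bytes)
    return (window / 2) * rtt_s


if __name__ == "__main__":
    # Hypothetical link parameters chosen only for illustration.
    bandwidth, rtt = 10e9, 0.1
    window = bdp_segments(bandwidth, rtt)
    print(f"window to fill the pipe: {window:.0f} segments")
    print(f"RTTs to recover from one loss: {rtts_to_refill(window)}")
    print(f"recovery time: {aimd_recovery_time_s(bandwidth, rtt) / 60:.1f} minutes")
```

Under these assumptions the window must reach roughly 83,000 segments, and recovering from a single loss takes more than an hour, which is why a single untuned stream rarely fills such a link.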

Besides transport-layer protocols, some application-level solutions have been proposed as well. Two very common methods are tuning the buffer size and using parallel streams. While some buffer tuning methods require modifications to the kernel (Cohen & Cohen, 2002; Semke, Madavi & Mathis, 1998; Torvalds et al, 2010; Weigle & Feng, 2001), others are applied at the application level (Jain, Prasad & Davrolis, 2003; Prasad, Jain & Davrolis, 2004; Hasegawa et al, 2001; Morajko, 2004). Even when the buffer size is properly tuned, a single stream does not perform better than parallel streams, because parallel streams recover from packet loss more quickly than a single stream with a tuned buffer. Parallel streams achieve high throughput because each one behaves like an individual stream, so together they obtain an unfair share of the available bandwidth (Sivakumar, Bailey & Grossman, 2000; Lee et al, 2001; Balakrishman et al, 1998; Hacker, Noble & Atley, 2005; Eggert, Heideman & Touch, 2000; Karrer, Park & Kim, 2006; Lu, Qiao & Dinda, 2005). However, excessive use of parallel streams drives the network to a congestion point, and this point is hard to predict. Few studies try to find the optimal number of streams, and most are based on approximate theoretical models (Hacker, Noble & Atley, 2002; Lu et al, 2005; Altman et al, 2006; Kola & Vernon, 2007). They all have specific constraints and assumptions. In addition, the correctness of the proposed models is mostly demonstrated only with simulation results.
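
As a rough illustration of why parallel streams help, and of why their benefit eventually stops, the sketch below uses the widely known Mathis et al. approximation for steady-state TCP throughput, roughly (MSS / RTT) * C / sqrt(p) with C = sqrt(3/2). This formula is not taken from this chapter and is not the optimization method the chapter presents. The sketch assumes that the loss rate p and the RTT stay constant as streams are added, which is plausible only below the congestion point, and it simply caps the result at the link capacity. All names and parameter values are hypothetical.

```python
import math

MATHIS_C = math.sqrt(3.0 / 2.0)  # constant in the Mathis et al. approximation


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state throughput of one TCP stream, in bits per second."""
    return (mss_bytes * 8 / rtt_s) * MATHIS_C / math.sqrt(loss_rate)


def aggregate_throughput_bps(n_streams: int, mss_bytes: int, rtt_s: float,
                             loss_rate: float, link_bps: float) -> float:
    """Naive aggregate of n independent streams, capped at the link capacity.

    Assumption: loss rate and RTT do not change with n. Real networks violate
    this once the added streams start to congest the path.
    """
    if n_streams < 1:
        raise ValueError("n_streams must be at least 1")
    total = n_streams * mathis_throughput_bps(mss_bytes, rtt_s, loss_rate)
    return min(total, link_bps)


if __name__ == "__main__":
    # Illustrative values only: 1460-byte MSS, 100 ms RTT, 0.01% loss, 1 Gbps link.
    for n in (1, 2, 4, 8, 16, 32, 64, 128):
        mbps = aggregate_throughput_bps(n, 1460, 0.1, 1e-4, 1e9) / 1e6
        print(f"{n:3d} streams -> {mbps:8.1f} Mbps")
```

In this simplified model the aggregate grows linearly with the number of streams until it reaches the cap. In practice the added streams raise the loss rate before that point, which is the hard-to-predict congestion behavior described above.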

The aforementioned solutions only remove the disadvantages of the protocols used. At some point, however, end-system resources such as the CPU, the disk and the NIC itself become the bottleneck. Additional parallelism is then needed through striping, but the optimal level of striping is an open research area. Existing tools such as the GridFTP striped server (Allcock et al, 2005) and Dmover (Nathan et al, 2010) provide a means to utilize striping across multiple CPUs and nodes of an end-system architecture, but they leave it to the user to define the parallelism level. A dynamic and autonomic system that decides this level would have to take many factors into account.

In this book chapter we discuss the many factors that affect end-to-end application throughput, such as buffer size, parallel streams, CPU speed, disk speed and access methods, in systems that use high-speed networks. The major purpose of this chapter is to provide insight into the characteristics of the end-systems that cause throughput bottlenecks and to discuss future directions. We also present a method to optimize the number of parallel streams, and we have seen that this model gives very accurate results regardless of the type of the network.
