Total results: 1,299
Big Data Management, Technologies, and Applications
Wen-Chen Hu, Naima Kaabouch.
Because of the tremendous amount of data generated daily in fields such as business, research, and the sciences, big data is everywhere. Alternative management and processing methods therefore have to be created to handle data that is this large, complex, and unstructured.
Technologies for Big Data
This chapter provides a review and analysis of several key Big Data technologies. Currently, there are many Big Data technologies in development and implementation; hence, a comprehensive review of all of these technologies is beyond the scope of this chapter. This chapter focuses on the most popularly...
Applying the K-Means Algorithm in Big Raw Data Sets with Hadoop and MapReduce
Ilias K. Savvas, Georgia N. Sofianidou, M-Tahar Kechadi.
Big data refers to data sets whose size is beyond the capabilities of most current hardware and software technologies. The Apache Hadoop software library is a framework for distributed processing of large data sets, while HDFS is a distributed file system that provides high-throughput access to...
Synchronizing Execution of Big Data in Distributed and Parallelized Environments
Gueyoung Jung, Tridib Mukherjee.
In the modern information era, the amount of data has exploded. Current trends further indicate exponential growth of data in the future. This prevalent humungous amount of data—referred to as big data—has given rise to the problem of finding the “needle in the haystack” (i.e., extracting meaningful...
Parallel Data Reduction Techniques for Big Datasets
Ahmet Artu Yildirim, Cem Özdogan, Dan Watson.
Data reduction is perhaps the most critical component in retrieving information from big data (i.e., petascale-sized data) in many data-mining processes. The central issue of these data reduction techniques is to save time and bandwidth in enabling the user to deal with larger datasets even in minimal...
Techniques for Sampling Online Text-Based Data Sets
Lynne M. Webb, Yuanxin Wang.
The chapter reviews traditional sampling techniques and suggests adaptations relevant to big data studies of text downloaded from online media such as email messages, online gaming, blogs, micro-blogs (e.g., Twitter), and social networking websites (e.g., Facebook). The authors review methods of...
Big Data Warehouse Automatic Design Methodology
Francesco Di Tria, Ezio Lefons, Filippo Tangorra.
Traditional data warehouse design methodologies are based on two opposite approaches. One is data oriented and aims to realize the data warehouse mainly by reengineering the well-structured data sources alone, while minimizing the involvement of end users. The other is...
Big Data Management in the Context of Real-Time Data Warehousing
M. Asif Naeem, Gillian Dobbie, Gerald Weber.
In order to make timely and effective decisions, businesses need the latest information from big data warehouse repositories. To keep these repositories up to date, real-time data integration is required. An important phase in real-time data integration is data transformation where a stream of updates...
Big Data Sharing Among Academics
The goal of this chapter is to explore the practice of big data sharing among academics and issues related to this sharing. The first part of the chapter reviews literature on big data sharing practices using current technology. The second part presents case studies on disciplinary data repositories in...
Scalable Data Mining, Archiving, and Big Data Management for the Next Generation Astronomical Telescopes
Chris A. Mattmann, Andrew Hart, Luca Cinquini, Joseph Lazio, Shakeh Khudikyan, Dayton Jones, Robert Preston, Thomas Bennett, Bryan Butler, David Harland, Brian Glendenning, Jeff Kern, James Robnett.
Big data as a paradigm focuses on data volume, velocity, and on the number and complexity of various data formats and metadata, a set of information that describes other data types. This is nowhere better seen than in the development of the software to support next generation astronomical instruments...
Efficient Metaheuristic Approaches for Exploration of Online Social Networks
Zorica Stanimirovic, Stefan Miškovic.
This study presents a novel approach in analyzing big data from social networks based on optimization techniques for efficient exploration of information flow within a network. Three mathematical models are proposed, which use similar assumptions on a social network and different objective functions...
Big Data at Scale for Digital Humanities: An Architecture for the HathiTrust Research Center
Stacy T. Kowalczyk, Yiming Sun, Zong Peng, Beth Plale, Aaron Todd, Loretta Auvil, Craig Willis, Jiaan Zeng, Milinda Pathirage, Samitha Liyanage, Guangchen Ruan, J. Stephen Downie.
Big Data in the humanities is a new phenomenon that is expected to revolutionize the process of humanities research. The HathiTrust Research Center (HTRC) is a cyberinfrastructure to support humanities research on big humanities data. The HathiTrust Research Center has been designed to make the...
GeoBase: Indexing NetCDF Files for Large-Scale Data Analysis
Data-rich scientific disciplines increasingly need end-to-end systems that ingest large volumes of data, make it quickly available, and enable processing and exploratory data analysis in a scalable manner. Key-value stores have attracted attention, since they offer highly available data storage, but...
Large-Scale Sensor Network Analysis: Applications in Structural Health Monitoring
Joaquin Vanschoren, Ugo Vespier, Shengfa Miao, Marvin Meeng, Ricardo Cachucho, Arno Knobbe.
Sensors are increasingly being used to monitor the world around us. They measure movements of structures such as bridges, windmills, and plane wings, human vital signs, atmospheric conditions, and fluctuations in power and water networks. In many cases, this results in large networks with different...
Accelerating Large-Scale Genome-Wide Association Studies with Graphics Processors
Mian Lu, Qiong Luo.
Large-scale Genome-Wide Association Studies (GWAS) are a Big Data application due to the great amount of data to process and high computation intensity. Furthermore, numerical issues (e.g., floating point underflow) limit the data scale in some applications. Graphics Processors (GPUs) have been used to...
Excess Entropy in Computer Systems
Modern data centers house tens of thousands of servers in complex layouts. That requires sophisticated reporting – turning available terabytes of data into information. The classical approach was introduced decades ago to handle a small number of lightly connected computers. Today, we also need to...
Innovations in Database Design, Web Applications, and Information Systems Management
New techniques and tools for database and database technologies are continuously being introduced. These technologies are the heart of many business information systems and can benefit from theories, models, and research results from other disciplines. Innovations in Database Design, Web Applications...
A Study of Open Source Software Development from Control Perspective
Bo Xu, Zhangxi Lin, Yan Xu.
Open source software (OSS) has achieved great success and exerted significant impact on the software industry. OSS development takes the online community as its organizational form, and developers work on the project voluntarily. In the project execution process, control aligns individual behaviors toward...
A Survey of Approaches to Web Service Discovery in Service-Oriented Architectures
Marco Crasso, Alejandro Zunino, Marcelo Campo.
Discovering services acquires importance as Service-Oriented Computing (SOC) becomes an adopted paradigm. SOC’s most popular materializations, namely Web Services technologies, have different challenges related to service discovery and, in turn, many approaches have been proposed. As these approaches...
Multi-Level Modeling of Web Service Compositions with Transactional Properties
K. Vidyasankar, Gottfried Vossen.
Web services have become popular as a vehicle for the design, integration, composition, reuse, and deployment of distributed and heterogeneous software. However, although industry standards for the description, composition, and orchestration of Web services have been under development, their conceptual...