The Library Big Data Research: Status and Directions

Shaochun Xu, Wencai Du, Chunning Wang, Dapeng Liu
DOI: 10.4018/978-1-5225-3914-8.ch042

Abstract

Libraries are widely used by governments, universities, research institutes, and the public because they store and manage intellectual assets. Both the information held in libraries and records of how people interact with libraries can be transformed into accessible data, which researchers can then use to help libraries serve users better. Librarians need to understand how to transform, analyze, and present data in order to facilitate such knowledge creation; the challenges they face include making big datasets more useful, visible, and accessible. Fortunately, with new and powerful big data analytics, such as information visualization tools, researchers and users can look at data in new ways and mine it for the information they need. Moreover, librarians now take the interaction between users and stored information into consideration when improving library service quality. In this work, the authors discuss the characteristics of library datasets and argue against the common misconception that the data involved in library research is not big enough, review the research work on library big data, and summarize the applications and research directions in this field. The status of big data research in Chinese libraries is discussed, and the associated challenges are explored.
Chapter Preview

“Big data” describes innovative techniques and technologies to capture, store, distribute, manage, and analyze datasets that traditional data management methods normally cannot handle. The concept of big data was first defined by Laney in his research note (Laney, 2001). According to that definition, big data is mainly characterized by three Vs: Volume, Velocity, and Variety (Zikopoulos et al., 2012). The first V, volume, refers to the size of the data. Generally speaking, big data sets are huge compared to regular data, although there is no fixed threshold for how large a dataset must be to qualify as big data, and the threshold varies across disciplines. Traditional software can usually handle megabyte- and even gigabyte-sized data sets, while big data tools should be able to handle terabyte- and petabyte-sized data sets. The second V, velocity, refers to data that is created dynamically and must be accessed quickly: data may arrive every second or so, and access often has to finish in a fraction of a second. Sometimes processing must be done in real time, which requires the software system to sustain a high throughput. The third V, variety, indicates data heterogeneity, which makes big data sets harder to organize and analyze. The regular data collected by researchers or businesses is strictly structured, such as data entered into a spreadsheet with specific rows and columns, whereas big data sets often contain unstructured data and mixed data types, such as email messages or notes.
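To make the variety point concrete, the following minimal Python sketch (an illustration added for this discussion, not taken from the chapter; the sample library records are hypothetical) contrasts structured tabular data, which standard tools can aggregate directly, with unstructured free text, which needs processing before it yields even a simple summary.

    import csv
    import io
    from collections import Counter

    # Structured data: fixed rows and columns with a known schema,
    # so standard tools can parse and aggregate it directly.
    circulation = io.StringIO(
        "patron_id,checkouts,year\n"
        "1001,12,2016\n"
        "1002,3,2016\n"
    )
    rows = list(csv.DictReader(circulation))
    print(sum(int(r["checkouts"]) for r in rows))  # total checkouts: 15

    # Unstructured data: free text such as a reference-desk note has no
    # schema; even counting word frequencies requires text processing.
    note = "Patron asked about interlibrary loan; loan request forwarded."
    tokens = note.lower().strip(".").replace(";", "").split()
    print(Counter(tokens).most_common(2))  # [('loan', 2), ('patron', 1)]

In a real library setting the structured side would come from an integrated library system's circulation tables, while the unstructured side might include emails, chat transcripts, or catalog notes, which is precisely why variety makes big data sets harder to organize and analyze.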
