Massive Digital Libraries (MDLs)

Andrew Philip Weiss (California State University – Northridge, USA)
DOI: 10.4018/978-1-5225-7659-4.ch038


Massive digital library (MDL) is a term coined to describe a class of digital libraries that gather mass-digitized print books and monographs and rival brick-and-mortar libraries in size. Specific examples of MDLs are presented, including Google Books, HathiTrust, DPLA, and the Internet Archive. The issues raised by MDLs include the mass-aggregation of digital content and the ability to maintain source-material accuracy and veracity; copyright, fair use, and the mass-digitization of materials not in the public domain; and disparities in the level of diversity, especially with regard to Spanish-language, Japanese-language, and Hawaii-Pacific materials. Finally, the impact of MDLs on Digital Humanities is investigated, especially with regard to the Google Books digital corpus and the Google Ngram Viewer.
Chapter Preview


To provide a clearer framework for analyzing the growth of digital libraries, Weiss and James have proposed the term Massive Digital Libraries (MDLs), which is based on the size, scope, and increasing scalability of digitized book collections. Such MDLs rival the size, breadth, and depth of a physical library’s print holdings, and often reach a scale seen among library consortia collections (Weiss and James, 2013a, 2013b, 2014, 2015; Weiss, 2016).

The concept has its roots in late 2004, when Google made its “resounding announcement” that it would digitize millions of the world’s books, including works still under copyright protection, and place them all online (Jeanneney, 2005). Jean-Noel Jeanneney, head of the Bibliothèque nationale de France at the time, interpreted Google’s planned project as a wake-up call for European countries. Failure to catch up to the American company, he argued, would result in significant problems for non-American organizations.

Twelve years on, it is hard to imagine that Google’s desire to create an online digital library on such a large scale should have come as such a shock. Yet at the time, Google’s announcement caused significant hand-wringing and soul-searching among institutions traditionally charged with producing or preserving cultural artifacts (Jeanneney, 2005; Venkatraman, 2009). In retrospect, the controversy seems almost quaint in comparison to the current crop of issues, especially the “disruptions” of established economic models by Uber/Lyft, Facebook, Twitter, Spotify, Snapchat, e-readers, and others, and the encroachments on civil rights via electronic surveillance and other intrusions on privacy.

A number of mass-digitization projects have grown in the wake of Google’s announcement, including the HathiTrust, Internet Archive, Digital Public Library of America (DPLA), California Digital Library, Texas Digital Library, Gallica, and Europeana. These projects each transcend their roots as localized digital libraries and have simultaneously adapted to and altered the digital landscape. These various MDLs have allowed for and contributed to the ascendancy of our current mass-digitization online culture.

This chapter will describe the characteristics of Massive Digital Libraries (MDLs) and outline their impact upon contemporary information science issues, especially with regard to digital collection metadata, copyright, and the diversity of the source collections. Traditionally, libraries have been created to serve particular communities defined by geography, intellectual discipline, or specific end users. However, MDLs in their current trajectories promise, for better and for worse, to transcend such limits.
