Visualizations of the GRUBA Bibliographic Database: From Printed Sources to the Maps of Science

Anna Małgorzata Kamińska
DOI: 10.4018/978-1-5225-4990-1.ch009


This chapter describes the author's experience of building a research environment for bibliometric research on mining science as it developed in Poland in 1945-1989, based on periodicals published by the major technical universities involved in teaching and research in that field at that time. The study was conducted on data entered (by typing), collected, and processed in a relational database. The data, covering more than 36,000 articles and more than 22,000 authors, form a bibliographic database named “GRUBA” (an acronym for a Polish phrase meaning “Mining Register Enabling Bibliometric Analysis” and also a word meaning “mine” in the Silesian dialect). The aim of this chapter is not to present comprehensive and extensive bibliometric research results. Only a small part of them serves as background for presenting the experience gained during the research, with the primary emphasis on the final stages: modeling and analyzing the visual maps, created mainly with Gephi software, that represent the development of science.
Chapter Preview


Research to discover rules hidden in collections of related documents became necessary as a result of the growing number of scientific publications in the second half of the twentieth century. This growth followed from the rapid increase in scientific research, which led to a doubling in the production of books and a fourfold increase in the number of scientific journals. In part to keep track of this increase, Eugene Garfield, in an article published in Science (Garfield, 1955), proposed building a citation index for science as a tool for the evaluation of scientific journals. The first Science Citation Index (SCI) was published in 1963 and included 102,000 articles published in 1961 in 613 selected journals.

Since then, quantitative methods have been applied ever more purposefully to identify rules hidden in communication artifacts, resulting in the establishment of the discipline of bibliometrics, a term first proposed in 1969 by Alan Pritchard.

Initially, bibliographic data were collected in tabular form, and to this day they are most often managed by relational database management systems that store them in data tables. The use of such mechanisms naturally suggests presenting the results of bibliometric analyses in tabular form. But such tabular presentations are not conducive to recognizing certain phenomena, for which graphical displays of networks may be more convincing.

It is worth noting that bibliographic data naturally have a network (or graph) structure rather than a tabular structure. Whether these graphs represent citations between articles or co-operation between the authors, the nodes of the graph represent the analyzed units, and the edges of the graph represent the relationships between these entities.
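The network view described above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from article identifiers to author names (the records below are invented for illustration and do not come from GRUBA) and derives a co-authorship graph in which nodes are authors and each edge is weighted by the number of jointly written articles. It is only one simple way of building such a graph, not necessarily the procedure used in this chapter.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records (assumption): article id -> list of author names.
articles = {
    "A1": ["Kowalski", "Nowak"],
    "A2": ["Kowalski", "Nowak", "Wojcik"],
    "A3": ["Zielinski"],
}


def build_coauthorship_graph(articles):
    """Return (nodes, edges) of an undirected co-authorship graph.

    nodes: set of author names.
    edges: Counter mapping a sorted author pair to the number of
           articles the two authors wrote together.
    """
    nodes = set()
    edges = Counter()
    for authors in articles.values():
        unique = sorted(set(authors))  # ignore duplicate names within one article
        nodes.update(unique)
        # Every pair of co-authors of one article contributes one unit of weight.
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return nodes, edges


nodes, edges = build_coauthorship_graph(articles)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

An author who published only alone (here the hypothetical "Zielinski") appears as an isolated node with no edges. The resulting node and edge lists could then be exported to a graph tool for layout and analysis.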

The above observation has led to the hypothesis of this chapter: the visualization techniques implemented in the proposed analytical environment allow one to gain better general and specific knowledge about co-authorship data. The second, minor aim of this chapter is to present some research results on the state of mining science as it developed in Poland in the post-war years from 1945 to 1989. Since the source data used for these analyses have not previously been collected in any national bibliographic database, these results are novel. To achieve the above, the chapter pursues the following partial objectives:

  • Description of the structure and scope of the source data,

  • Presentation of the experience of performing research using an analytical visualization environment,

  • Description of the visualizations obtained as a result of the research,

  • Verification of the initial hypothesis underpinning this study.

This chapter may primarily be of interest to researchers studying the development of science and scholarly publication, who can adapt these visualization methods to their own ends. But it will also prove relevant to those interested in the development of mining science in Poland after World War II, as this country was then a major player in European coal mining. Lastly, researchers in other fields who are looking for an effective visualization tool for large volumes of network data may find the author’s experience with the proposed visualization tool useful.

The purpose of this chapter, however, is not to present the results of the author’s bibliometric research (which will be presented in separate publications), but specifically to show the application of selected visualization methods to the results obtained by applying social network analysis (SNA) metrics to bibliographic data. The author wishes to point out that it is possible to conduct such research on bibliometric data collected independently of commercial sources such as Web of Science Core Collection (WoS), and encourages other researchers to follow this path.



Equitable stimulation of the development of science centers and individual researchers, by providing better conditions for more efficient entities, requires the ability to correctly quantify their contributions. One way to measure this contribution is to aggregate bibliographic data to the level of institutions or authors; the latter is the subject of this chapter. It should not be surprising that such analyses have been carried out for quite some time.
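As a rough illustration of aggregation to the author level, the sketch below counts articles per author from hypothetical (article, author) pairs, such as might be read from an authorship table in a relational database. The records and the function name are assumptions made for this example, and simple full counting (each author receives full credit for each article) is only one of several possible counting schemes.

```python
from collections import Counter

# Hypothetical authorship rows (assumption): (article id, author name).
records = [
    ("A1", "Kowalski"),
    ("A1", "Nowak"),
    ("A2", "Kowalski"),
    ("A3", "Zielinski"),
    ("A3", "Zielinski"),  # duplicated row, e.g. a data-entry repeat
]


def articles_per_author(records):
    """Count distinct articles for each author (full counting)."""
    distinct = set(records)  # drop duplicated (article, author) rows
    return Counter(author for _, author in distinct)


counts = articles_per_author(records)
for author, n in counts.most_common():
    print(author, n)
```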
