Using Article Networks on Wikipedia to Explore Public Understandings of Academic Domains and Address Observed Gaps

Copyright: © 2019 |Pages: 14
DOI: 10.4018/978-1-5225-7528-3.ch011


Structural relationships, expressed as networks, have come to the fore in data analytics. One example is the article network built from pages on the open, crowd-sourced Wikipedia. An analysis of real-world article networks from Wikipedia (with the MediaWiki understructure) shows the promise of this analytical approach for understanding related ideas. This chapter suggests ways to address observed article network gaps in particular informational domains through value-added linking, creating article stubs, authoring works, and other efforts. This approach has additional applications on other MediaWiki-based sites.
Chapter Preview


If the Social Web is about how people relate to each other through a range of electronic means, it is also about relationships between inanimate shared digital objects such as words, folksonomic tags, crowd-sourced articles, and other byproducts of intense sharing. Article-article networks on Wikipedia consist of pages related to each other through http outlinks from a target article. They are a form of link network based on co-related http links. Such direct pagelink networks of articles (normal pages on Wikipedia) are conceptualized as indicating “that there is a topical relation between articles, since completely unrelated articles would not refer to each other” (Suchecki, Salah, Gao, & Scharnhorst, 2012, p. 12), or as topically related “document networks.”

Figure 1 shows the “Wikipedia” article on the English Wikipedia at the center of the article-article network graph; the other nodes are articles pointed to from the Wikipedia article, each with a one-degree connection (only one hop out from the target article). If the target in the middle represented a person or an entity (a collection of people), then this type of network could be called an “ego neighborhood,” a direct network with a single central node and the “alters” connected by one degree.

A “directed graph” is a network diagram with arrows on the edges or links showing directionality; an “undirected graph” is a network diagram without arrows, with the lines showing only that some relationship exists (but not its direction). An outlink is an edge that leaves a focal node and points to a receiving node (in this case, an article page); an in-link is an edge that points to a particular focal (or target) node. In-degree describes how many edges point to a particular page, and out-degree describes how many edges point from a particular page.
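The terms above can be illustrated with a minimal Python sketch. It is not the chapter's method: the function names `ego_network` and `degree_counts` are hypothetical, and the three outlinked titles are illustrative placeholders rather than data from Figure 1. The sketch builds a one-degree directed network from a target article and counts in-degree and out-degree for each page.

```python
from collections import Counter

def ego_network(target, outlinks):
    """Return directed (source, destination) edges from the target to each outlinked article."""
    # dict.fromkeys drops repeated links while keeping first-seen order;
    # a link from the target to itself is skipped.
    return [(target, title) for title in dict.fromkeys(outlinks) if title != target]

def degree_counts(edges):
    """Count how many edges point to (in-degree) and from (out-degree) each page."""
    out_degree = Counter(source for source, _ in edges)
    in_degree = Counter(destination for _, destination in edges)
    return in_degree, out_degree

# Illustrative outlinks only (assumed for this example, with one repeated link).
edges = ego_network("Wikipedia", ["Encyclopedia", "MediaWiki", "Wiki", "MediaWiki"])
in_degree, out_degree = degree_counts(edges)

# The target has out-degree 3 and in-degree 0; each alter has in-degree 1.
print(out_degree["Wikipedia"], in_degree["Wikipedia"])  # prints: 3 0
```

In a one-degree network of this kind, only the target has a nonzero out-degree, which is why no edges are reciprocated.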

Figure 1. “Wikipedia” article network on Wikipedia (1 deg.)


Table 1 shows some of the graph metrics for Figure 1. This one-degree network captured 571 vertices (nodes) and 570 unique edges, that is, one edge from the target article to each of the 570 linked articles. The network graph was drawn using the Fruchterman-Reingold force-directed layout algorithm, with the target node in the middle. A light perusal of the graph gives the gist of the linkages. There is only one group, or one large connected component, in this network. The maximum geodesic distance of this network (the graph diameter) is two: moving from one peripheral article to another requires passing through the target node, as is to be expected with one-degree network graphs.

Table 1. Graph metrics for the Wikipedia article network on Wikipedia (1 deg.)

Graph Metric                                Value
Graph Type                                  Directed
Unique Edges                                570
Edges With Duplicates                       0
Total Edges                                 570
Reciprocated Vertex Pair Ratio              0
Reciprocated Edge Ratio                     0
Connected Components                        1
Single-Vertex Connected Components          0
Maximum Vertices in a Connected Component   571
Maximum Edges in a Connected Component      570
Maximum Geodesic Distance (Diameter)        2
Average Geodesic Distance                   1.993001
Graph Density                               0.001751313
Modularity                                  Not Applicable
NodeXL Version                              1.0.1.336
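Several of the values in Table 1 can be checked with a short Python sketch. It rests on assumptions that the chapter does not state outright: that the network is a pure star (one edge from the target to each of 570 alters, inferred from the vertex and edge counts), that directed graph density is edges / (n × (n − 1)), that geodesic distances ignore edge direction, and that the average geodesic distance is taken over all n × n ordered vertex pairs including each vertex paired with itself. The last assumption is an inference that happens to agree with the reported value, not a documented description of how NodeXL computes it. The function names are hypothetical.

```python
from collections import deque

def star_edges(n_alters):
    """Directed edges from a hub (node 0) to each of n_alters alters."""
    return [(0, alter) for alter in range(1, n_alters + 1)]

def bfs_distances(adjacency, source):
    """Hop counts from source to every reachable node (breadth-first search)."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

def graph_metrics(edges):
    """Density, diameter, and average geodesic distance for a connected graph."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    # Assumed density formula for a directed graph without self-loops.
    density = len(edges) / (n * (n - 1))
    # Assumption: distances are measured on the undirected view of the graph.
    adjacency = {node: set() for node in nodes}
    for source, destination in edges:
        adjacency[source].add(destination)
        adjacency[destination].add(source)
    total_distance, diameter = 0, 0
    for node in nodes:
        distances = bfs_distances(adjacency, node)
        total_distance += sum(distances.values())
        diameter = max(diameter, max(distances.values()))
    # Assumption: average over all n * n ordered pairs, self-pairs included.
    average_distance = total_distance / (n * n)
    return {"vertices": n, "edges": len(edges), "density": density,
            "diameter": diameter, "average_distance": average_distance}

metrics = graph_metrics(star_edges(570))
for name, value in metrics.items():
    print(name, round(value, 9))
```

Under these assumptions the sketch yields a diameter of 2, a density that rounds to 0.001751313 (that is, 1/571), and an average geodesic distance that rounds to 1.993001. Averaging over distinct unordered pairs instead would give roughly 1.9965, which does not match the reported figure.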

Key Terms in this Chapter

Article-Article Network: Articles connected or related by embedded outgoing http links from a target article.

Directed Graph: A network diagram whose edges carry arrows on one or both ends to indicate relationship direction.

Ego Neighborhood: A direct network with a target node and alters connected to that central node by one degree.

Wiki: A collaboratively edited website with a fast-edit understructure enabling various functionalities.

Link Network: An interconnected group of co-related http links.

Undirected Graph: A network diagram whose edges have no arrows on the ends (indicating a relationship but not its direction).
