Introduction
With the rapid development of next generation networks (NGN), proposed by ITU-T (2006), heterogeneity and complexity in networks have become increasingly prominent. Meanwhile, information visualization has become imperative for administrators who must manage and secure these networks. Under these circumstances, many researchers have devoted themselves to web usage mining and have reported numerous results. This paper focuses on exploring latent interrelationships between web objects on the World Wide Web (WWW).
One of the most widely used models for web structural analysis is the Web graph, in which the nodes are web pages and the edges are the hyperlinks between them. Broder (2000) conducted the first large-scale study of the Web graph and presented a bow-tie picture, consisting of three distinct components of almost equal size, to describe the macroscopic structure of the web. On this basis, Donato (2005) offered a better understanding of the inner structures of the Web graph. In addition, Huang & Lai (2003) proposed a new approach to clustering the Web graph and applied it to web visualization. Nevertheless, Web graphs represent only the hyperlink relationships between pages, without considering the context. Sethu & Yates (2010) classified hyperlinks using text mining analysis and built a multi-relational web graph. The mobile web is structurally different from the fixed web. Jindal (2008) found that the mobile web is more sparsely connected than the fixed web and that its node degree distributions fall off much more rapidly. Liu & Ansari (2014a) successfully identified website communities in the mobile Internet based on affinity measurement.
This paper introduces the notion of the Bipartite Request Dependency Graph (BRDG), a graph derived from the dependency graph model of Liu (2014b), which uses a two-step algorithm to identify user clicks among a large number of HTTP requests. In other words, the BRDG is an application of the dependency graph model. The authors' study finds that the BRDG is very large, sparse and seemingly complex. To explore latent web structural patterns, the authors apply a tNMF-based graph decomposition method to the BRDG and extract a number of interesting structural properties. The tNMF is a co-clustering algorithm that has been shown to be useful in many applications, such as identifying suspicious activities through DNS failure graphs (Jiang, 2010) and understanding the intensive and continuous data usage patterns of mobile users (Jin, 2012).
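To make the idea of co-clustering a bipartite graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes that tNMF refers to a nonnegative matrix tri-factorization A ≈ R H Cᵀ of the bipartite adjacency matrix, and it uses plain multiplicative updates without the orthogonality constraints that a full tNMF formulation may impose. The function names (`build_bipartite_matrix`, `tri_factorize`), the cluster counts, and the toy host names are all hypothetical and chosen only for illustration.

```python
import numpy as np


def build_bipartite_matrix(edges):
    """Build the click-by-object adjacency matrix of a bipartite graph.

    `edges` is a list of (user_click, embedded_object) pairs; both labels
    are hypothetical placeholders, not data from the paper.
    """
    clicks = sorted({c for c, _ in edges})
    objects = sorted({o for _, o in edges})
    row = {c: i for i, c in enumerate(clicks)}
    col = {o: j for j, o in enumerate(objects)}
    A = np.zeros((len(clicks), len(objects)))
    for c, o in edges:
        A[row[c], col[o]] += 1.0  # edge weight = number of co-occurrences
    return A, clicks, objects


def tri_factorize(A, k, l, n_iter=200, seed=0, eps=1e-9):
    """Approximate A (m x n) as R (m x k) @ H (k x l) @ C.T (l x n).

    Simplified sketch: standard multiplicative updates that keep all
    factors nonnegative; no orthogonality constraints are enforced.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    R = rng.random((m, k)) + eps  # row (click) cluster memberships
    H = rng.random((k, l)) + eps  # interactions between cluster pairs
    C = rng.random((n, l)) + eps  # column (object) cluster memberships
    for _ in range(n_iter):
        R *= (A @ C @ H.T) / (R @ H @ C.T @ C @ H.T + eps)
        C *= (A.T @ R @ H) / (C @ H.T @ R.T @ R @ H + eps)
        H *= (R.T @ A @ C) / (R.T @ R @ H @ C.T @ C + eps)
    return R, H, C


# Toy graph with two obvious groups of clicks and embedded objects.
edges = [
    ("news.example/home", "cdn.example/a.js"),
    ("news.example/home", "cdn.example/b.css"),
    ("news.example/sport", "cdn.example/a.js"),
    ("news.example/sport", "cdn.example/b.css"),
    ("shop.example/cart", "img.example/x.png"),
    ("shop.example/cart", "img.example/y.png"),
    ("shop.example/item", "img.example/x.png"),
    ("shop.example/item", "img.example/y.png"),
]
A, clicks, objects = build_bipartite_matrix(edges)
R, H, C = tri_factorize(A, k=2, l=2)
click_cluster = R.argmax(axis=1)    # hard cluster label per click node
object_cluster = C.argmax(axis=1)   # hard cluster label per object node
```

In this sketch, each row of `R` and `C` gives the soft membership of a click node or an embedded object node in a cluster, and `H` records how strongly each pair of clusters is connected, which is the kind of information a decomposition into dense subgraphs relies on.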
The major contributions of this paper are threefold: (i) the authors propose the bipartite request dependency graph to describe the interrelationships between user click requests and embedded web object requests in the mobile Internet; (ii) the authors implement the tNMF-based decomposition method on the BRDG; (iii) based on the decomposition results, the authors classify the results into five structural patterns and reveal the causes of these patterns. The rest of this paper is organized as follows. Section 2 introduces related work. Section 3 presents the definition and overall characteristics of the BRDGs. Section 4 presents the graph decomposition and classification methodology. Section 5 summarizes the decomposition results as five structural patterns and interprets them. Section 6 briefly concludes the paper and proposes future work.