Introduction
Semantic relationships are at the heart of ontologies. They connect words, terms and entities through meaning, and thus enable a graph representation of knowledge with rich semantics. A complex semantic relationship, also known as a semantic association (Aleman-Meza et al., 2003), is a sequence of consecutive properties that links two resource entities; in an RDF graph, it is a path of labeled edges that connects two entity nodes. Semantic association mining is a critical step towards obtaining useful semantic information for better integration, search and decision-making. A number of search techniques and query languages have been developed for discovering semantic associations, such as ρ-Queries (Anyanwu & Sheth, 2003) and SPARQLeR (Kochut & Janik, 2007). A semantic association search query consists of a pair of entities, and its results contain all the semantic associations between them.
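To make the notion of a semantic association as a path of labeled edges concrete, the following is a minimal sketch in Python. It is not the algorithm behind ρ-Queries or SPARQLeR; it is a plain depth-first enumeration over a handful of made-up triples. The triples, the function names `build_index` and `find_associations`, the `max_len` bound, and the choice to traverse edges in both directions (marking inverse steps with `^-1`) are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy (subject, property, object) triples; illustrative only, not Freebase data.
TRIPLES = [
    ("Harry_Potter", "parent", "James_Potter"),
    ("Harry_Potter", "member_of", "Gryffindor"),
    ("James_Potter", "member_of", "Gryffindor"),
    ("Harry_Potter", "friend_of", "Ron_Weasley"),
    ("Ron_Weasley", "member_of", "Gryffindor"),
]


def build_index(triples):
    """Index every triple in both directions (assumption: edges may be
    traversed either way; inverse steps carry a '^-1' suffix)."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, o))
        index[o].append((p + "^-1", s))
    return index


def find_associations(triples, source, target, max_len):
    """Return all simple paths (no node visited twice) from source to
    target that use at most max_len edges. Each path is a list of
    (node, property, next_node) steps."""
    index = build_index(triples)
    results = []

    def dfs(node, visited, path):
        if node == target and path:
            results.append(list(path))  # record a copy of the current path
            return
        if len(path) == max_len:
            return  # length bound reached without hitting the target
        for prop, nxt in index.get(node, ()):
            if nxt not in visited:
                visited.add(nxt)
                path.append((node, prop, nxt))
                dfs(nxt, visited, path)
                path.pop()
                visited.remove(nxt)

    dfs(source, {source}, [])
    return results


if __name__ == "__main__":
    for path in find_associations(TRIPLES, "Harry_Potter", "James_Potter", 3):
        print(" -> ".join(f"{s} [{p}]" for s, p, _ in path) + f" -> {path[-1][2]}")
```

Even on this five-triple toy graph the number of paths grows as the length bound is relaxed, which hints at why exhaustive enumeration on a large knowledge base produces more results than a user can inspect.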
With the number, scale and complexity of ontologies growing rapidly, the number of semantic associations between a pair of entities is becoming overwhelming. A semantic association search is therefore very likely to return too many results for a user to digest. For example, we parsed the entire fictional_universe domain of Freebase linked open data (Google, 2011) into an RDF knowledge base containing 192K resources and 411K properties. We observed that in such a knowledge base, even a simple query (e.g., between Harry Potter and James Potter) with a strict path length restriction (e.g., 10) returns thousands of semantic associations. Thus, an effective ranking technique is needed to identify the most relevant results.
A fundamental challenge in semantic association ranking is to understand user preferences. Different users can have different preferences in terms of personal interests and search intentions. For example, Figure 1 shows a small fraction of the RDF knowledge base we created from Freebase data (Google, 2011). Given the query “finding semantic associations between Harry Potter and James Potter”, a few typical search results are listed in Table 1. Among these results, a user who is familiar with the Harry Potter fiction would find complicated relationships such as Result 4 more informative, while another user interested in the topic of superpowers may want relationships like Result 5 to rank higher. Given that such preferences are difficult to express explicitly in current semantic association query languages (Anyanwu & Sheth, 2003; Kochut & Janik, 2007), it is the ranking method’s responsibility to cater for each individual user’s specific preferences.
Figure 1. A small fraction of the RDF knowledge base we created from Freebase data under the topic “fictional_universe”. The color of each instance node denotes its class.
Table 1. Typical results of the semantic association search between “Harry Potter” and “James Potter”