Searching Semantic Data Warehouses

Alfredo Cuzzocrea (ICAR-CNR and University of Calabria, Italy)
DOI: 10.4018/978-1-4666-5888-2.ch190

Chapter Preview



The problem of searching Semantic Data Warehouses lies at the intersection of two leading and well-understood research areas, i.e., Search Computing (also extended by means of semantic technologies, e.g., (Ceri & Brambilla, 2010)) and Semantic Data Warehouses (e.g., (Spaccapietra et al., 2009)). This problem has recently attracted renewed attention, due to the important applications that are founded on a typical Semantic Data Warehouse (SDW) architecture. Among these, the relevant ones fall within the large family of Semantic Web applications, such as Ontology-based Web Information Systems (e.g., (Wache et al., 2001)), RDF-based Complex Systems (e.g., (Nejdl et al., 2003)), Analytical Tools over Large Resource-based Systems (e.g., (Cuzzocrea et al., 2011)), Web Warehouses (e.g., (Bonifati & Cuzzocrea, 2007)), and so forth.

In more detail, Search Computing is a novel discipline whose main goal is to enable a new paradigm for modeling, defining, and devising so-called search services that may be deployed over a large spectrum of architectures, ranging from centralized ones to more demanding distributed ones (e.g., Cloud Infrastructures (Dikaiakos et al., 2009)). Search Computing is a key component of modern Web search engines, such as Google and Yahoo!, where the main issue is the problem of locating, annotating, indexing and retrieving Web information from the deep Web via simple natural language queries. This problem has been investigated by means of a plethora of approaches, but modern proposals, mainly relying on semantic technologies, seem to be playing a leading role in this scientific area. Briefly, these proposals pursue the idea of exploiting semantic approaches (e.g., Ontologies, RDF) to model Web resources, and then using this information at query execution time in order to improve the effectiveness of the Web search engine.
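The idea of modeling resources semantically and using that model at query execution time can be illustrated with a minimal sketch. It is not a method taken from the cited works: the toy ontology, the annotated resources, and the function names `expand` and `semantic_search` are all hypothetical, and a real system would rely on an ontology language and an RDF store rather than Python dictionaries.

```python
# Hypothetical toy ontology: each concept maps to its direct sub-concepts.
ONTOLOGY = {
    "vehicle": ["car", "bicycle"],
    "car": ["sedan"],
}

# Hypothetical Web resources, each annotated with ontology concepts.
RESOURCES = {
    "page1": {"sedan"},
    "page2": {"bicycle"},
    "page3": {"recipe"},
}


def expand(concept, ontology):
    """Return the concept together with all of its transitive sub-concepts."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(ontology.get(current, []))
    return seen


def semantic_search(concept, ontology, resources):
    """Return the ids of resources annotated with the concept or a sub-concept."""
    terms = expand(concept, ontology)
    return sorted(rid for rid, annotations in resources.items() if annotations & terms)


if __name__ == "__main__":
    # A query for "vehicle" also matches resources annotated only with "sedan".
    print(semantic_search("vehicle", ONTOLOGY, RESOURCES))
```

A purely keyword-based lookup for "vehicle" would miss both matching pages here, since neither is annotated with that exact term; the ontology supplies the missing link at query time.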

Semantic Data Warehouses, instead, follow the main paradigm of equipping the architectural layers of classical Data Warehouses (e.g., (Chaudhuri & Dayal, 1997)), namely ETL (Extraction, Transformation, and Loading), Storage, and DW Data Deployment/Presentation (e.g., OLAP (Gray et al., 1997)), with semantic-inspired methodologies. Indeed, thanks to the evident synergies between this paradigm and the related area of semantic-based Web resource modeling, annotation and discovery (as in early research experiences), so-called Semantic Web Data Warehousing (e.g., (Binh et al., 2003)) seems to be one of the most promising initiatives in the context investigated by our research. Semantic Data Warehousing research also opens up a wide collection of research challenges that can reasonably be considered "hot topics" for current Database and Data Warehousing research. Among these, some that deserve mention are: the critical problem of defining semantic-inspired methodologies for supporting ETL (e.g., (Pequeno & Aparício, 2005)); semantic-inspired techniques for annotating internal repositories of Data Warehouse storage layers (e.g., (Feng & Dillon, 2003)); semantic OLAP (e.g., (Lakshmanan et al., 2002)), and so forth. From this, the enormous potential of Semantic Data Warehousing research in future years clearly follows, including important industrial applications (e.g., (Chute et al., 2010; McCusker et al., 2009)).
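To make the layered view above more concrete, the following sketch shows one possible way semantic mappings could drive a tiny ETL step followed by an OLAP-style roll-up. It is only an illustration under stated assumptions: the source names, field-to-concept mappings, the city-to-country hierarchy, and the functions `transform`, `load`, and `roll_up` are hypothetical and do not come from the cited proposals.

```python
from collections import defaultdict

# Hypothetical mapping from source-specific field names to shared concepts.
FIELD_TO_CONCEPT = {
    "src_a": {"town": "city", "amt": "sales"},
    "src_b": {"location": "city", "revenue": "sales"},
}

# Hypothetical dimension hierarchy used for the roll-up (city -> country).
CITY_TO_COUNTRY = {"Rome": "Italy", "Milan": "Italy", "Paris": "France"}


def transform(source, record):
    """Rename source-specific fields to shared concepts; drop unmapped fields."""
    mapping = FIELD_TO_CONCEPT[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def load(extracted):
    """Build a uniform fact list from (source, record) pairs."""
    return [transform(source, record) for source, record in extracted]


def roll_up(facts, hierarchy):
    """Aggregate the sales measure from the city level to the parent level."""
    totals = defaultdict(float)
    for fact in facts:
        totals[hierarchy[fact["city"]]] += fact["sales"]
    return dict(totals)


if __name__ == "__main__":
    extracted = [
        ("src_a", {"town": "Rome", "amt": 10.0}),
        ("src_b", {"location": "Milan", "revenue": 5.0}),
        ("src_b", {"location": "Paris", "revenue": 7.0, "note": "unmapped"}),
    ]
    print(roll_up(load(extracted), CITY_TO_COUNTRY))
```

The point of the sketch is that the heterogeneous sources never need to agree on field names; the shared concepts carry the integration, and the same concept hierarchy then supports aggregation at the presentation layer.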

Key Terms in this Chapter

Search Computing: Set of techniques to enhance the effectiveness and expressive power of search engines.

Semantic Web: A collection of models, techniques and algorithms that aim at annotating Web content via semantics, for more efficient search and indexing.

Semantic Search: Data searching technique in which a search query provides more meaningful and relevant search results (in a website, database or any other data repository) by evaluating the contextual meaning of the words used for search.

Data Warehouse: A central repository of current and historical data made by integrating data from heterogeneous sources.

Annotation: Comment, explanation or presentational markup attached to text, image, or other data and used to add information about the desired visual presentation, or machine-readable semantic information.

OLAP: On-Line Analytical Processing, or OLAP, designates a set of software techniques for interactive analysis of large amounts of multidimensional data from multiple perspectives.

Semantics: Branch of linguistics that studies the meaning of words, individual letters, phrases and texts.

ETL: The process of Extraction, Transformation and Loading of data coming from a collection of heterogeneous data sources, needed to make them suitable for Data Warehouse usage.

Semantic Data Warehouses: Repositories of historical data capable of providing semantic information regarding the stored data.
