Remarks on a Fuzzy Approach to Flexible Database Querying, Its Extension and Relation to Data Mining and Summarization

Janusz Kacprzyk (Polish Academy of Sciences, Poland), Guy de Tré (Ghent University, Belgium) and Slawomir Zadrozny (Polish Academy of Sciences, Poland)
Copyright: © 2013 | Pages: 20
DOI: 10.4018/978-1-4666-2455-9.ch014

Abstract

Effective and efficient information search in databases requires several issues to be resolved. A very important one, though still usually neglected by traditional database management systems, is the proper representation of user preferences and intentions and their expression in query languages. In many scenarios these preferences are not clear-cut and are originally formulated in natural language, which implies a need for flexible querying. Although research on introducing elements of natural language into database query languages dates back to the late 1970s, practical commercial solutions are still not widely available. This chapter is meant to revive the line of research on flexible query languages based on fuzzy logic. It recalls the details of a basic technique of flexible fuzzy querying, discusses some of the newest developments in this area and, moreover, shows how other relevant tasks may be implemented in the framework of such a query interface. In particular, it considers fuzzy queries with linguistic quantifiers and shows their intrinsic relation to linguistic data summarization. Moreover, the chapter mentions so-called “bipolar queries” and advocates them as the next relevant breakthrough in flexible querying based on fuzzy logic and possibility theory.

Introduction

Databases are a crucial element of all kinds of information systems, which are in turn the “backbone” of virtually all kinds of nontrivial human activities. The growing power and falling prices of computer hardware and software, including those that have a direct impact on database technology, have led to an avalanche-like growth in the volume of data stored all over the world. That huge volume makes an effective and efficient use of the information resources in databases difficult. On the other hand, the use of databases is no longer an area in which only database professionals are active; in fact, most users nowadays are novices. This implies a need for a proper human-computer (database) interaction that adapts to the specifics of the human being, mainly, in our context, to the fact that for the human user the only fully natural means of articulation and communication is natural language with its inherent imprecision.

The aspects mentioned above, the importance of which has been growing over the last years and decades, have triggered many research efforts, notably related to what is generally termed flexible querying, and to some human-consistent approaches to data mining and knowledge discovery, including the use of natural language, for instance in linguistic data summarization.

Basically, constructing a database query consists in spelling out the conditions that should be met by the data sought. Very often, the meaning of these conditions is deeply rooted in natural language, i.e., their original formulation is available in the form of natural language utterances. It is then translated, often with difficulty, into the mathematical formulas required by traditional query languages. For example, when looking for a suitable house in a real estate agency database, one may prefer a cheap one. In order to pose a query, the concept of “cheap” has to be expressed by an interval of prices. The bounds of such an interval will usually be rather difficult to assess. Thus, a tool for defining the notion of “being cheap” may substantially ease the construction of a query. The same definition may then be reused in other queries referring to this concept, also in combination with other words such as “very”. Words of this kind, interpreted as so-called modifiers, modify the meaning of the original concept in a way that may be assumed to be context-independent and expressed by a strict mathematical formula.
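The idea of defining “cheap” once and reusing it with a modifier can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names, the price bounds (100,000 and 200,000) and the piecewise-linear shape are assumptions chosen for illustration, and modeling “very” by squaring the degree is only one common convention from the fuzzy logic literature.

```python
def cheap(price, full=100_000.0, zero=200_000.0):
    """Degree (0..1) to which a price is 'cheap'.

    Assumed shape: fully cheap up to `full`, not cheap at all from `zero`,
    with a linear decrease in between. Both bounds are hypothetical.
    """
    if price <= full:
        return 1.0
    if price >= zero:
        return 0.0
    return (zero - price) / (zero - full)


def very(degree):
    """Modifier 'very', modeled here (by assumption) as squaring the degree."""
    return degree ** 2


if __name__ == "__main__":
    for price in (90_000, 150_000, 180_000, 250_000):
        print(price, cheap(price), very(cheap(price)))
```

Because the modifier operates on degrees rather than on prices, the same `very` function could be applied to any other fuzzy term without knowing what attribute it describes, which is the context-independence mentioned above.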

It seems obvious that a condition referring to such terms as “cheap”, “large”, etc. should be considered, in general, to be satisfied to a degree rather than simply satisfied or not satisfied, as is assumed in the classical approach to database querying. Thus, the notion of a matching degree is one of the characteristic features of flexible fuzzy queries.
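The contrast between crisp and graded satisfaction can be sketched as follows. The records, the crisp price limit and the membership function for “cheap” are all hypothetical values introduced only for this illustration; the point is that a fuzzy query returns a ranking by matching degree instead of a yes/no selection.

```python
# Hypothetical records of a real estate database.
houses = [
    {"id": "A", "price": 95_000},
    {"id": "B", "price": 140_000},
    {"id": "C", "price": 185_000},
    {"id": "D", "price": 230_000},
]


def cheap(price, full=100_000.0, zero=200_000.0):
    """Assumed piecewise-linear membership function for 'cheap'."""
    if price <= full:
        return 1.0
    if price >= zero:
        return 0.0
    return (zero - price) / (zero - full)


def crisp_query(rows, limit=150_000):
    """Classical query: a row either satisfies price <= limit or it does not."""
    return [row["id"] for row in rows if row["price"] <= limit]


def fuzzy_query(rows):
    """Flexible query: keep rows with a positive matching degree, best first."""
    scored = [(row["id"], cheap(row["price"])) for row in rows]
    matched = [(row_id, degree) for row_id, degree in scored if degree > 0.0]
    return sorted(matched, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(crisp_query(houses))
    print(fuzzy_query(houses))
```

In this sketch house C, which the crisp query rejects outright, still appears in the fuzzy result with a low matching degree, so the user can decide whether it is worth considering.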

Moreover, a query usually comprises more than just one condition. In such a case, the user may require various combinations of conditions to be met. Classically, only two requirements can be expressed directly: that all conditions be satisfied, or that at least one condition be satisfied. However, these are in fact only the extreme cases of conceivable aggregation requirements. For instance, a user may be completely satisfied with data satisfying most of his or her conditions.
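The three aggregation requirements mentioned above can be compared in a short sketch. It assumes that each condition already yields a matching degree in [0, 1], uses minimum and maximum for “all” and “any”, and evaluates “most” in the style of Zadeh's calculus of linguistically quantified propositions, i.e. by applying a quantifier membership function to the mean degree. The breakpoints 0.3 and 0.8 of the quantifier are an assumption made for this example, not a definition taken from the chapter.

```python
def most(proportion):
    """Assumed membership function of the linguistic quantifier 'most'.

    0 for proportions up to 0.3, 1 from 0.8 upwards, linear in between.
    """
    if proportion <= 0.3:
        return 0.0
    if proportion >= 0.8:
        return 1.0
    return (proportion - 0.3) / 0.5


def all_of(degrees):
    """'All conditions are satisfied': the weakest condition decides."""
    return min(degrees)


def any_of(degrees):
    """'At least one condition is satisfied': the strongest condition decides."""
    return max(degrees)


def most_of(degrees):
    """'Most conditions are satisfied': quantifier applied to the mean degree."""
    return most(sum(degrees) / len(degrees))


if __name__ == "__main__":
    # Hypothetical matching degrees of one record against four conditions.
    degrees = [1.0, 0.9, 0.8, 0.1]
    print(all_of(degrees), any_of(degrees), most_of(degrees))
```

For a record that meets three conditions well and one poorly, `all_of` is dominated by the single poor match and `any_of` ignores it entirely, while `most_of` lies between the two extremes.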

The study of modeling of such natural language terms as “cheap”, “very” or “most” for the purposes of database querying is the most important part of the agenda of the flexible fuzzy querying research.

In this chapter we will present a focused overview of the main research results on the development of flexible querying techniques that are based on fuzzy set theory (Zadeh, 1965). The scope of the chapter is further limited to an overview of those techniques that aim to enhance database querying by introducing various forms of user-specified fuzzy preferences (Bosc, Kraft & Petry, 2005). We will not consider other techniques that are relevant in this area, exemplified by self-correcting, navigational, cooperative, etc. querying systems.

For our purposes, we will view a fuzzy query as a combination of a number of imprecisely specified (fuzzy) conditions on attribute values to be met. Fuzzy preferences in queries are introduced both inside query conditions and between query conditions. In the former case, they are introduced via flexible search criteria, which make it possible to indicate a graded desirability of particular values. In the latter case, they are given via grades of importance of particular query conditions.
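A minimal sketch of how importance grades between conditions might enter the evaluation is given below. It assumes a conjunctive query and uses one scheme known from the fuzzy querying literature, in which a condition with importance w and matching degree d contributes max(1 - w, d) and the contributions are combined with the minimum. This is only one of several possible schemes, and the chapter preview does not state which one the authors adopt; the function name and the sample values are hypothetical.

```python
def weighted_conjunction(conditions):
    """Importance-weighted 'and' over (importance, degree) pairs.

    Assumed scheme: each condition contributes max(1 - importance, degree),
    so an unimportant condition (importance near 0) can hardly lower the
    overall result, while a fully important one (importance 1) counts with
    its plain matching degree.
    """
    return min(max(1.0 - importance, degree) for importance, degree in conditions)


if __name__ == "__main__":
    # Hypothetical record: poorly matches 'cheap' (0.4), matches 'large' well (0.9).
    price_matters_most = [(1.0, 0.4), (0.5, 0.9)]
    price_matters_little = [(0.3, 0.4), (0.5, 0.9)]
    print(weighted_conjunction(price_matters_most))
    print(weighted_conjunction(price_matters_little))
```

Lowering the importance of the price condition raises the overall matching degree of the same record, which reflects the intended role of importance grades: they control how strongly a poor match on a given condition penalizes the result.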
