Total results: 908
Innovative Techniques and Applications of Entity Resolution
Entity resolution is an essential tool in processing and analyzing data in order to draw precise conclusions from the information being presented. Further research in entity resolution is necessary to help promote information quality and improved data reporting in multidisciplinary fields requiring...
Overview of Entity Resolution
Entity resolution is one of many important operations for data quality management, information retrieval, and data management. It has wide applications in Web search, e-commerce search, data cleaning, and information integration. Due to its importance, entity resolution has been studied by researchers...
Measures of Entity Resolution Result
In this chapter, the authors introduce how to measure the quality of an Entity Resolution (ER) result. Once the entity resolution process has been carried out, they need to know how good its result is. This is often done by comparing the ER result with the ground truth. First, two important parameters...
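As an illustration of comparing an ER result with the ground truth, the sketch below computes pairwise precision and recall over clusters of record identifiers. The truncated abstract does not say which measures the chapter uses, so the choice of pairwise precision and recall is an assumption, and the function names and sample record IDs are hypothetical.

```python
from itertools import combinations


def pairs(clusters):
    """Return the set of unordered record pairs that share a cluster."""
    result = set()
    for cluster in clusters:
        # Sorting gives each pair one canonical orientation.
        for a, b in combinations(sorted(cluster), 2):
            result.add((a, b))
    return result


def pairwise_scores(predicted, truth):
    """Pairwise precision and recall of predicted clusters against ground truth.

    Assumption: quality is judged on record pairs placed in the same cluster.
    """
    predicted_pairs = pairs(predicted)
    truth_pairs = pairs(truth)
    correct = len(predicted_pairs & truth_pairs)
    # With no pairs on one side, the corresponding score is defined as 1.0 here.
    precision = correct / len(predicted_pairs) if predicted_pairs else 1.0
    recall = correct / len(truth_pairs) if truth_pairs else 1.0
    return precision, recall


# Hypothetical example: r1, r2, r3 are truly one entity; r4 is another.
truth = [{"r1", "r2", "r3"}, {"r4"}]
predicted = [{"r1", "r2"}, {"r3", "r4"}]
precision, recall = pairwise_scores(predicted, truth)
```

Here one of the two predicted pairs is correct, and one of the three true pairs is recovered.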
Entity Resolution on Names
Errors with names occur frequently. “California” and “CA” refer to the same state of the USA; however, they may both appear as records in a database at the same time. Several techniques need to be proposed to solve these problems. In this chapter, the authors introduce the methods of entity resolution...
Context-Based Entity Resolution
Prior work on entity resolution involves expensive similarity comparison and clustering approaches. Additionally, the quality of entity resolution may be low due to insufficient information. To address these problems, by adopting context information of data objects, the authors present a novel...
Entity Resolution on Single Relation
A basic task of entity resolution is to detect duplicate records in a single relation. To address this problem, many different approaches have been proposed for different areas. The basic step of entity resolution is attribute similarity computation. Based on the attribute similarity computation methods...
Entity Resolution on Multiple Relations
Entity resolution is a central issue in data quality management. It has been proven extremely useful in data fusion, inconsistency and inaccuracy detection, knowledge extraction, and data repairing. Nevertheless, in the real world, entities often have two or more representations in databases. The...
XML Object Identification
Because it can represent data from a wide variety of sources, XML is rapidly emerging as the new standard for data representation and exchange on the Web and in e-government. To effectively use XML data in practice, entity resolution, which has been proven extremely useful in data fusion, inconsistency...
Entity Resolution on Graph Data Set
In this chapter, the authors study entity resolution on graph data sets. In order to conduct entity resolution on graph data, the authors need to define a distance between graphs. The authors compute these distances exactly, or approximate them for time efficiency. Finally, the authors utilize the...
Entity Resolution on Complex Network
Complex networks can be used to describe the Internet, social networks, or, more broadly, a binary relation on a set of objects. The structural information of a complex network helps identify the entities corresponding to nodes in the network. There is much research in this area, and the...
Entity Resolution on Cloud
Large quantities of records need to be read and analyzed in cloud computing, and many records referring to the same entity bring challenges for data processing and analysis. Entity resolution has become one of the hot issues in database research. Clustering based on record similarity is one of the most...
Basic Data Operators for Entity Resolution
This chapter focuses on the basic data operators for entity resolution, which include similarity search, similarity join, and clustering on sets or strings. These three problems are of increasing complexity, and the solutions to the simpler problems are the building blocks for the harder ones. The authors...
Data Cleaning Based on Entity Resolution
Data quality is one of the most prevalent problems in data management. A traditional data management application typically concerns the creation, maintenance, and use of a large amount of data, focusing only on clean datasets. However, real-life data are often dirty: inconsistent, duplicated...
Query Processing Based on Entity Resolution
Dirty data exist in many systems, and efficient and effective management of dirty data is in demand. Since data cleaning may cause useful data to be lost and new dirty data to be introduced, this research attempts to manage dirty data without cleaning and to retrieve query results according to the quality requirements of users....
Duplicate Record Detection for Data Integration
In information integration systems, duplicate records bring problems in data processing and analysis. To represent the similarity between two records from different data sources with different schemas, optimal bipartite graph matching is applied to their attributes, and the similarity is...
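To make the idea of matching attributes across schemas concrete, the sketch below scores two records by the best one-to-one pairing of their attribute values. It is only an illustration under stated assumptions: the truncated abstract does not give the attribute similarity or the matching algorithm, so token-level Jaccard similarity and a brute-force search over assignments (practical only for records with few attributes) are stand-ins, and all names and sample values are hypothetical.

```python
from itertools import permutations


def token_jaccard(a, b):
    """Jaccard similarity of the lowercase token sets of two attribute values."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def record_similarity(rec_a, rec_b):
    """Similarity of two records via the best one-to-one attribute matching.

    Each record is a list of attribute values; no schema alignment is assumed.
    Brute force over assignments stands in for a proper matching algorithm.
    """
    short, long_ = sorted((list(rec_a), list(rec_b)), key=len)
    if not long_:
        return 1.0  # two empty records are treated as identical
    best = 0.0
    # Try every way of assigning each attribute of the shorter record
    # to a distinct attribute of the longer record.
    for assignment in permutations(range(len(long_)), len(short)):
        total = sum(token_jaccard(short[i], long_[j])
                    for i, j in enumerate(assignment))
        best = max(best, total)
    # Normalize by the longer record so unmatched attributes lower the score.
    return best / len(long_)


# Hypothetical records from two sources with different attribute order.
rec_a = ["entity resolution survey", "J Smith", "2014"]
rec_b = ["2014", "survey entity resolution", "Smith"]
score = record_similarity(rec_a, rec_b)
```

For larger records, a polynomial-time assignment solver would replace the brute-force loop; the scoring and normalization would stay the same.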
Entity Resolution in Bibliography Information Management
Entity resolution, that is, building correspondences between objects and entities in dirty data, plays an important role in data cleaning. In bibliography information management systems, the confusion between authors and their names often results in dirty data. That is, different authors may...
Product Entity Resolution in E-Commerce
With the rapid development of e-commerce, there is a huge amount of commodity data on the Internet. Users often spend a lot of time looking for the exact product. Therefore, finding products representing the same entity is an effective way to improve the efficiency of purchasing. Due to...
Entity Resolution in Healthcare
Abbreviations are common in biomedical documents, and many are ambiguous in the sense that they have several potential expansions. Identifying the correct expansion is necessary for language understanding and important for applications such as document retrieval. Identifying the correct expansion can...
Evolving Application Domains of Data Warehousing and Mining: Trends and Solutions
Pedro Nuno San-Banto Furtado.
Data warehousing and mining technologies are key assets today in many areas of human knowledge, from scientific to commercial and industrial settings, and the last decades have seen tremendous advances in those fields. Evolving Application Domains of Data Warehousing and Mining: Trends and Solutions...
From User Requirements to Conceptual Design in Warehouse Design: A Survey
Conceptual design and requirement analysis are two of the key steps within the data warehouse design process. They are to a great extent responsible for the success of a data warehouse project since, during these two phases, the expressivity of the multidimensional schemata is completely defined. This...
Data Extraction, Transformation and Integration Guided by an Ontology
Chantal Reynaud, Nathalie Pernelle, Marie-Christine Rousset.
This chapter deals with integration of XML heterogeneous information sources into a data warehouse with data defined in terms of a global abstract schema or ontology. The authors present an approach supporting the acquisition of data from a set of external sources available for an application of...
X-WACoDa: An XML-Based Approach for Warehousing and Analyzing Complex Data
Hadj Mahboubi, Jean-Christian Ralaivao, Sabine Loudcher, Omar Boussaïd, Fadila Bentayeb.
Data warehousing and OLAP applications must nowadays handle complex data that are not only numerical or symbolic. The XML language is well-suited to logically and physically represent complex data. However, its usage induces new theoretical and practical challenges at the modeling, storage and analysis...
Designing Data Marts from XML and Relational Data Sources
Yasser Hachaichi, Jamel Feki, Hanene Ben-Abdallah.
Due to the international economic competition, enterprises are ever looking for efficient methods to build data marts/warehouses to analyze the large data volume in their decision making process. On the other hand, even though the relational data model is the most commonly used model, any data mart/...
Intelligent Techniques for Warehousing and Mining Sensor Network Data
Sensor network data management poses new challenges outside the scope of conventional systems where data is represented and regulated. Intelligent Techniques for Warehousing and Mining Sensor Network Data presents fundamental and theoretical issues pertaining to data management. Covering a broad range...