Using Weakly Structured Documents at the User-Interface Level to Fill in a Classical Database

Frederique Laforest (The National Institute of Applied Sciences, France) and Andre Flory (The National Institute of Applied Sciences, France)
Copyright: © 2002 | Pages: 21
DOI: 10.4018/978-1-930708-41-9.ch010

Abstract

Electronic documents have become a universal means of communication with the expansion of the Web. Structured information stored in databases nevertheless remains essential for data coherence management, querying facilities, etc. We thus face a classical problem, known in the database world as “impedance mismatch”: two antagonistic approaches have to collaborate. Using documents at the end-user interface level provides simplicity and flexibility, but documents can serve as data sources only with human help, because automatic document analysis systems have a significant error rate. Databases take the opposite approach: the semantics and format of information are strict, and queries via SQL provide 100% correct responses. The aim of this work is to provide a system that combines the freedom of document capture with the structure of database storage. The system we propose is not intended to be universal. It can be used in specific cases where people usually work with technical documents dedicated to a particular domain. Our examples concern medicine, and more specifically medical records. Physicians have very often rejected computerization because it requires too much standardization, and form-based user interfaces are not easily adapted to their daily practice. In this domain, we think that this study provides a viable alternative approach. The system offers freedom to doctors: they fill in documents with the information they want to store, in the order that suits them and in a freer way. We have developed a system that allows a database to be filled in quasi-automatically from document paragraphs. The database is an already existing one that can be queried in a classical way for statistical studies or epidemiological purposes. In this system, the document collection and the database containing extractions from documents coexist. Queries are sent to the database; answers include data from the database and references to source documents.
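
As a rough illustration of the coexistence described in the abstract, the following minimal Python sketch stores document paragraphs next to structured rows extracted from them, so that a query can return both the data and a reference to its source document. Everything here is an assumption made for illustration and not the authors' design: the schema, the table and function names, the `kind` tag (standing in for the human help the abstract mentions), and the regular expression used for extraction.

```python
import re
import sqlite3

# Hypothetical schema: the abstract does not describe the real database.
SCHEMA = """
CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE paragraphs (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents(id),
    kind TEXT NOT NULL,   -- tag chosen by the author of the document
    body TEXT NOT NULL    -- free text, kept exactly as written
);
CREATE TABLE blood_pressure (
    id INTEGER PRIMARY KEY,
    systolic INTEGER NOT NULL,
    diastolic INTEGER NOT NULL,
    paragraph_id INTEGER NOT NULL REFERENCES paragraphs(id)
);
"""

# Assumed extraction rule: two numbers separated by a slash, e.g. "150/95".
BP_PATTERN = re.compile(r"(\d{2,3})\s*/\s*(\d{2,3})")


def store_paragraph(conn, document_id, kind, body):
    """Keep the paragraph, and fill the database when its tag allows it."""
    cur = conn.execute(
        "INSERT INTO paragraphs (document_id, kind, body) VALUES (?, ?, ?)",
        (document_id, kind, body),
    )
    paragraph_id = cur.lastrowid
    if kind == "blood_pressure":
        match = BP_PATTERN.search(body)
        if match:
            conn.execute(
                "INSERT INTO blood_pressure (systolic, diastolic, paragraph_id)"
                " VALUES (?, ?, ?)",
                (int(match.group(1)), int(match.group(2)), paragraph_id),
            )
    return paragraph_id


def query_high_systolic(conn, threshold):
    """Return structured values together with their source document and text."""
    return conn.execute(
        """
        SELECT bp.systolic, bp.diastolic, d.title, p.body
        FROM blood_pressure AS bp
        JOIN paragraphs AS p ON p.id = bp.paragraph_id
        JOIN documents AS d ON d.id = p.document_id
        WHERE bp.systolic >= ?
        ORDER BY bp.id
        """,
        (threshold,),
    ).fetchall()


def build_demo():
    """Create an in-memory database holding one document with two paragraphs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    doc_id = conn.execute(
        "INSERT INTO documents (title) VALUES (?)", ("Consultation note 1",)
    ).lastrowid
    store_paragraph(conn, doc_id, "blood_pressure", "BP measured at 150/95 after rest.")
    store_paragraph(conn, doc_id, "free_text", "Patient reports mild headaches.")
    return conn
```

In this sketch, calling `query_high_systolic(build_demo(), 140)` returns the extracted values along with the title of the consultation note and the original paragraph text, while the untagged free-text paragraph stays in the document store without producing a database row.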
