Business Intelligence Applications and the Web: Models, Systems and Technologies
Book Citation Index

Marta E. Zorrilla (University of Cantabria, Spain), Jose-Norberto Mazón (University of Alicante, Spain), Óscar Ferrández (University of Alicante, Spain), Irene Garrigós (University of Alicante, Spain), Florian Daniel (University of Trento, Italy) and Juan Trujillo (University of Alicante, Spain)
Release Date: September, 2011|Copyright: © 2012 |Pages: 374
ISBN13: 9781613500385|ISBN10: 1613500386|EISBN13: 9781613500392|DOI: 10.4018/978-1-61350-038-5

Description

Over the last decade, we have witnessed an increasing use of Business Intelligence (BI) solutions that allow business people to query, understand, and analyze their business data in order to make better decisions. Traditionally, BI applications allow management and decision-makers to acquire useful knowledge about the performance and problems of business from the data of their organization by means of a variety of technologies, such as data warehousing, data mining, business performance management, OLAP, and periodical business reports. Research in these areas has produced consolidated solutions, techniques, and methodologies, and there are a variety of commercial products available that are based on these results.

Business Intelligence Applications and the Web: Models, Systems and Technologies summarizes current research advances in BI and the Web, emphasizing research solutions, techniques, and methodologies which combine both areas in the interest of building better BI solutions. This comprehensive collection aims to emphasize the interconnections that exist between the two research areas and to highlight the benefits of combined use of BI and Web practices, which so far have acted rather independently, often in cases where their joint application would have been sensible.

Topics Covered

The many academic areas covered in this publication include, but are not limited to:

  • BI for Designing Adaptive Websites
  • BI with Unstructured Data and Semi-Structured Data
  • Extraction, Transformation, and Load of Web Data
  • NLP Applied to BI
  • Semantic Web Technologies
  • The Role of Web 2.0/3.0 in BI
  • Web Data Quality
  • Web Engineering Techniques for BI Applications
  • Web Integration
  • Web Intelligence

Reviews and Testimonials

Zorrilla, Mazón, Ferrández, Garrigós, Daniel, and Trujillo have done a great job in compiling a book that fills the gap in the related bibliography with an interesting and comprehensive coverage of the challenges that we face for linking business intelligence and the Web. The book contains a mixture of research insights, system architectures, discussion of tools, and real-world cases and organizes the discussion of these issues in two areas of coverage. First, the book covers the area of exploitation of data that are already present in the Web for the purposes of business intelligence. The book discusses topics like the management of unstructured text and its combination with super-structured environments like data warehouses and OLAP tools, as well as the management of relevance, freshness, and in general, quality of the Web data with a view to its exploitation via business intelligence tools. The second area involves the automation of data processing and the usage of the Web as a platform for business intelligence, including topics like BI-as-a-service, mashups and the Semantic Web for business intelligence, and collaborative business intelligence. The book mainly acts as a reference for the state of the art in several areas, including problems and challenges that are not straightforward to address as well as suggestions for paths to follow. The book's primary target is breadth of coverage, with suggestions of the relevant readings for further probing and initial insights for solutions, rather than an in-depth investigation of technical problems in the typical research-oriented fashion. In this sense, the text is easy to follow without losing its interest or significance. In my opinion, the book's primary audience is the interested researcher or practitioner who wants to get a broader view of the environment around the combination of Web and business intelligence, with pointers to the state of the art, as well as the broader challenges that remain open.

– Panos Vassiliadis, Ioannina

Table of Contents and List of Contributors

Preface

Over the last decade, we have witnessed an increasing use of Business Intelligence (BI) solutions that allow business people to query, understand, and analyze their business data in order to make better decisions. Traditionally, BI applications allow management and decision makers to acquire useful knowledge about the performance and problems of business from the data of their organization by means of a variety of technologies, such as data warehousing, data mining, business performance management, OLAP, periodical business reports, and the like. Research in these areas has produced consolidated solutions, techniques, and methodologies, and there is a variety of commercial products available that are based on these results.

More recently, a new trend in BI applications has emerged: BI applications no longer limit their analysis to the data of just one organization or company. Increasingly, they also source their data from the outside, thus complementing internal company data with value-adding information from the Web (e.g., retail prices of products sold by competitors), in order to provide richer insights into the dynamics of today's business and to better support decision-making processes. As a result, BI applications aim to assist the dynamics of modern management practices, where decision-making requires a comprehensive view of the market and the business ecosystem as a whole, and hence, BI using just internal company data no longer suffices.

Interestingly, in parallel to the movement of data from the Web into BI applications, we are now also experiencing the movement of BI applications from internal company information systems to the Web: Business Intelligence as a service (e.g., hosted BI platforms for small- and medium-sized companies) or software support to manage business outsourcing or crowd sourcing is the target of huge investments and the focus of enormous research efforts by both industry and academia. The underlying idea is moving the processing and analysis of large bodies of data into the cloud and consuming BI via the Web.

In light of these trends, conciliation of Business Intelligence and the Web is of paramount importance for further progress in the area. Specifically, this book presents a selection of chapters falling into two main topics:  

  • Data from the Web feeding BI applications. In the last decade, the amount and complexity of data available on the Web has been growing rapidly. As a consequence, designers of BI applications making use of data from the Web have to deal with several issues. Many interesting research challenges arise when the Web is seen as a data repository: Web warehousing models, data quality issues, integration of Semantic Web technologies, Web mining technologies, BI with unstructured or semi-structured data, Web intelligence methodologies, or the application of natural language processing (NLP) techniques to the BI field.
  • BI applications moving to the Web. The movement of BI applications from internal company Information Systems to applications that are accessible over the Web implies the need for Web-specific design competencies. In this context, research is focused on using Web engineering methodologies and technologies in BI: real-time BI and business performance management applications, Web mashups and RIA for BI development, usability and accessibility for BI applications, security issues in BI, and so on.

In short, the aim of this book is to provide an overview of the two main current research lines about (i) how to fully exploit the huge amount of data available on the Web with BI applications, and (ii) how to apply Web engineering methods and techniques to the design of BI applications. This book aims to share theoretical or applied models and systems regarding decision making with data from the Web, showing emerging technologies and tendencies regarding BI systems and the Web and their applications in different fields, and provide the academic community with a base text that could serve as a reference in business and computer science undergraduate and graduate courses. 

The target audience of this book is varied and spans researchers and academics working on both BI and the Web, practitioners and software developers, managers and executives of companies operating in the BI market, and lecturers and students of related courses. Researchers and academics will find an analysis of the state of the art and an outline of current and future research challenges. Practitioners and software developers will find hints to cutting-edge implementation solutions and technologies. Managers and executives will find a comprehensive spectrum of current trends and market needs. Finally, lecturers and students will gain an interesting insight into the relationship between BI and the Web.

The book is structured into two sections that group chapters into thematically related areas: 

BI with Web Data
This section comprises 8 chapters that specifically focus on the problem of processing and analyzing data that are sourced from the Web. The first five chapters describe how BI systems deal with these issues from different points of view: quality, storing complex data, semantic integration, and text OLAP analysis, among others. The following three chapters discuss the importance of NLP in BI.

Marotta et al. present in Chapter 1, “Quality Management in Web Warehouses,” a reference architecture for quality-aware Web Warehouses, which is useful for evaluating and managing quality aspects throughout the life cycle of a Web Warehouse. The challenging task of designing a Web Warehouse requires the management of quality aspects to (i) properly select Web sources with which to populate the Web Warehouse, and to (ii) measure and offer quality attributes to final users of the Web Warehouse to improve their decision making.

Next, in Chapter 2, “Innovative Approaches for efficiently Warehousing Complex Data from the Web,” Bentayeb et al. propose extracting information from the Web, and transforming and loading it into a Web Warehouse, which provides uniform access methods for automatic processing of the data. Specifically, the authors present three research lines: (i) the use of XML as a logical and physical model for complex data warehouses, (ii) associating data mining with OLAP to allow elaborate analysis tasks for complex data, and (iii) schema evolution in complex data warehouses for personalized analyses.

In Chapter 3, “An Extraction, Transformation and Loading Tool applied to a Fuzzy Data Mining System,” by Carrasco et al., the authors present a tool with which to semantically integrate heterogeneous data from various websites into a BI system to empower the decision making process. They also present a real case study from the Business School at the University of Granada (Spain).

As unstructured documents, including text data from the Web, constitute the majority of business data, in Chapter 4, “Incorporating Text OLAP in Business Intelligence,” Park and Song present a Text OLAP solution to perform multidimensional analysis of text documents in the same way structured relational data is analyzed. The aim of their approach is to seamlessly analyze structured and unstructured data, thus realizing total BI.

In Chapter 5, “A Semantic Approach for News Recommendation,” Frasincar et al. describe the importance of news items in the business decision process. Moreover, they present the problem of traditional news recommenders, and how this problem is overcome by using semantic similarity measures. Existing semantic similarities as well as new proposals for semantic similarity are discussed and evaluated in this chapter. The results point out that the application of semantics successfully improves traditional recommenders.

Pallotta et al., in Chapter 6, “Interaction Business Analytics: Making business sense of customers conversations through semantic and pragmatic analysis,” propose an interaction business analytics perspective focused on unstructured customers’ interactions. Such a perspective extends the understanding of business data achieved by statistical methods through deriving valuable information from unstructured data. They present a new approach for interaction business analytics based on argumentative analysis obtained by a deep linguistic processing of conversations. Examples from three different scenarios are presented, showing the benefits of the proposed approach.

Chapter 7, “OpAL: a System for Mining Opinion from Text for Business Applications,” presents the need for computational approaches capable of dealing with subjective data on the Web. A feature-based opinion mining system is described and evaluated in several scenarios. Different natural language processing techniques applied to opinion mining are discussed throughout the chapter. Furthermore, the authors also describe a robust and multilingual method for opinion retrieval. The results reported by this system are very encouraging, demonstrating the business potential behind the processing of subjective information.

Lastly, Henschel et al., in Chapter 8, “A unified Approach for Taxonomy-based Technology Forecasting,” propose a technique using bibliometric indicators that allows decision makers and researchers alike to understand the state of the art of their area of interest. As a concrete example, the authors discuss a case study in the field of renewable energy.

Engineering Web-Enabled BI
This section collects 6 contributions that concentrate on the problem of engineering advanced BI applications that leverage Web technologies. Chapters 9 and 10 focus on service-oriented technologies. Chapters 11 and 12 deal with collaborative BI. The remaining chapters survey two relevant topics within engineering Web-enabled BI: real-time BI and situational analysis, and the Semantic Web.

In Chapter 9, “Business Intelligence-as-a-Service: Studying the Functional and the Technical Architectures,” by Essaidi et al., the authors provide a study on the benefits of the SaaS model for the design of business intelligence architectures and propose a functional architecture to support common on-demand business intelligence services. Next, they describe and discuss the utility of a service, which helps developers to design data warehouses based on MDA and 2TUP. Finally, they offer an open-source solution for the implementation of their approach. 

Zorrilla and García, in Chapter 10, “A Data Mining Service to Assist Instructors involved in Virtual Education,” describe an on-demand data mining service developed in their university, which aims to help instructors involved in distance education to discover their students’ behavior profiles and obtain models about how they navigate and work in their virtual courses. In the chapter, the authors justify its necessity and utility for both professors and students and describe its architecture based on SOA and standard Web technologies. 

Chapter 11, “BIN: Business Intelligence Networks,” proposes a framework, called Business Intelligence Network, for sharing BI functionalities over complex networks of companies that are pursuing mutual advantages through the sharing of strategic information. After proposing their architecture, the authors outline the main research issues involved in building and operating it, and focus on the definition of an ad-hoc language for expressing semantic mappings between the multidimensional schemata owned by the different peers, aimed at enabling query reformulation over the network.

Berthold et al. envision in Chapter 12, “Towards Ad-hoc and Collaborative Business Intelligence,” a highly scalable and flexible BI platform that is able to perform ad-hoc analyses in a collaborative manner. The authors describe the main blocks of their proposal which aim to complement traditional BI environments in order to overcome these challenges and empower the business users.

In Chapter 13, “Real-Time BI and Situational Analysis,” the authors provide a detailed and forward-looking survey of current data warehousing trends from the perspective of both the applications and the database systems and review state-of-the-art techniques that may help address the typical problems that emerge when building a real-time data warehouse. The chapter also connects to the domain of the Web by extending the typical architecture of real-time data warehouses toward mashup situational data analysis.

Berlanga et al., in Chapter 14, “Semantic Web Technologies for Business Intelligence,” describe in detail the convergence of BI with one of the most influential technologies of the last decade: the Semantic Web. They survey the use of Semantic Web technologies in the different stages of the development of BI applications: data integration, multidimensional modeling, intelligent BI querying, scalability issues, and so on.

As the above summary shows, this book summarizes current research advances in BI and the Web, emphasizing research solutions, techniques, and methodologies which combine both areas in the interest of building better BI solutions. Novel proposals are presented throughout the two sections in which the book is structured, giving the reader a general view as well as detailed descriptions of approaches addressing the main issues posed in this book.

To contribute to the sharing of knowledge, the book stresses the use of technologies capable of dealing with data obtained from the Web, i.e., semi-structured or unstructured information, with the aim of achieving better understanding and analysis of business data, and likewise, Web engineering methods and techniques, which can be applied to the design of BI applications hosted on the Web.

To the best of our knowledge, this is the first book that puts together topics about BI and the Web, and as such, aims to emphasize the interconnections that exist between the two research areas and to highlight the benefits of a joint use of BI and Web practices, which so far have acted rather independently, often in cases where their joint application would have been sensible.

Author(s)/Editor(s) Biography

Marta Elena Zorrilla Pantaleón is an Assistant Professor in Computer Science at the University of Cantabria (Spain). She earned her Bachelor’s degree in Telecommunication Engineering and PhD in Computer Science at the University of Cantabria in 1994 and 2001, respectively. She has participated in and managed more than 20 research projects, most of them with companies, and she is author of a database book and more than 40 works published in international journals, books, and conferences. She is an active reviewer of several international journals and conferences (DSS, IJCSA, IEEE-Education, IEEE-RITA, SCI, BEWEB, etc.). Her research interests are the design and development of Information Systems and intelligent systems for companies, and, inside the educational area, the application of data mining techniques and OLAP technologies in order to analyse and improve Web-based learning sites.
Jose-Norberto Mazón is an Assistant Professor at the Department of Software and Computing Systems in the University of Alicante (Spain). He obtained his Ph.D. in Computer Science from the University of Alicante (Spain) within the Lucentia Research Group. He has published several papers about data warehouses and requirement engineering in national and international workshops and conferences (such as DAWAK, ER, DOLAP, BNCOD, JISBD and so on) and in several journals such as Decision Support Systems (DSS), SIGMOD Record or Data and Knowledge Engineering (DKE). He has also been co-organizer of the International Workshop on Business Intelligence and the Web (BEWEB 2010) and the International Workshop on The Web and Requirements Engineering (WeRE 2010). His research interests are: business intelligence, design of data warehouses, multidimensional databases, requirement engineering, and model driven development.
Óscar Ferrández is a postdoctoral researcher in the BioMedical Informatics department at the University of Utah (USA). He obtained his Ph.D. in Computational Linguistics at the University of Alicante (Spain) within the Natural Language Processing and Information Systems research group. He has publications in international journals (such as Information Sciences, Data and Knowledge Engineering, and Information Processing and Management) as well as communications in relevant conferences and workshops related to his research field. He has been involved in several national and European research projects together with other international research institutions. He is an active member and reviewer of several international journals and conferences. His research interests are focused on human system interaction; ontologies and Semantic Web; machine learning; knowledge discovery; and natural language processing and its application to clinical records.
Irene Garrigós is an Assistant Professor and post-doc researcher at the University of Alicante (Spain), from which she holds a PhD and a Master’s in Computer Science. She has published several papers in national and international workshops, conferences, and journals (such as ICWE, ER, WISE, APWEB, JISBD, Information and Software Technologies, Journal of Web Engineering, and so on). Dr. Garrigós has served as a Program Committee member of several workshops and conferences such as ER, JISBD, WISM, MDA, FPUML, UWA, and has served as assistant referee in several international conferences such as WWW and ICWE. She has done research stays in Belgium (Vrije Universiteit Brussel) and the Netherlands (Technische Universiteit Eindhoven). Her research interests are: Web engineering, personalization, model driven development, requirement engineering, Web and business intelligence, and adaptive systems.
Florian Daniel is a postdoctoral researcher at the Department of Information Engineering and Computer Science of the University of Trento, Italy. He has a PhD in Information Technology from Politecnico di Milano, Italy. His main research interests are mash-ups and Web/services engineering, and compliance, quality, and privacy in business intelligence applications. He is co-author of the book “Engineering Web Applications” (Springer, 2009) and has published more than 50 scientific papers in international conferences and journals. Florian is co-organizer of the international workshops ComposableWeb and BEWEB and is involved in the organization of international conferences like BPM, ICSOC, and ICWE.
Juan Trujillo is a Full-time Professor at the Department of Software and Computing Systems in the University of Alicante (Spain). His main research topics include business intelligence applications, data warehouse development, OLAP, data mining, UML, MDA, data warehouse security and quality, etc. He has advised 9 PhD students and published more than 120 papers in different national and international high impact conferences such as ER, UML, ADBIS or CAiSE, and more than 30 papers in highly ranked international journals indexed by JCR such as DKE, DSS, ISOFT, IS, or JDBM. He has also been co-editor of five special issues in different JCR journals (e.g. DKE). He has also been PC member of different events and JCR journals such as ER, DAWAK, CIKM, ICDE, DOLAP, DSS, JDM, ISOFT, and DKE, and PC Chair of DOLAP'05, DAWAK'05-'06 and FP-UML’05-’09. Further information on his main research publications can be found at: http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/t/Trujillo:Juan.html.

Indices

Editorial Board

  • Rafael Berlanga, Universitat Jaume I, Spain
  • Sven Casteleyn, Vrije Universiteit Brussel, Belgium 
  • Maristella Matera, Politecnico di Milano, Italy
  • Andrés Montoyo, Universidad de Alicante, Spain
  • Jesús Pardillo, Universidad de Alicante, Spain
  • Stefano Rizzi, University of Bologna, Italy
  • Alkis Simitsis, Hewlett–Packard Laboratories, USA
  • Mario Piattini, University of Castilla-La Mancha, Spain
  • Gustavo Rossi, University of La Plata, Argentina
  • Michael Thelwall, University of Wolverhampton, UK