1. Introduction
The massive adoption of web-based social media in the daily activity of e-commerce users, from customers to marketing departments, is increasingly attracting the attention of Business Intelligence (BI) companies. So far, BI has been confined to corporate data, with little attention paid to external data. Capturing external data to contextualize data analysis operations is a time-consuming and complex task that would nevertheless bring large benefits to current BI environments (Pérez et al., 2008a). The main external contexts for e-commerce applications are the Voice of the Customer (VoC) and the Voice of the Market (VoM) forums. The former concerns customer opinions about the products and services offered by a company, and the latter comprises all the information related to the target market that can affect the company's business. Listening to the VoM allows a business to set its strategic direction based on in-depth consumer insights, whereas listening to the VoC helps to identify better ways of targeting and retaining customers. As pointed out by Reidenbach (2009), both perspectives are important for building long-term competitive advantage.
The traditional scenario for performing BI tasks has changed dramatically with the consolidation of Web 2.0 and the proliferation of opinion feeds, blogs, and social networks. Nowadays, we can listen to the VoM and VoC directly in these new social spaces thanks to the surge of automatic methods for performing sentiment analysis over them (Liu, 2012). These methods work directly on the posted texts to identify global assessments of target items (i.e., reputation) and to detect the subject of each opinion (i.e., aspects) and its orientation (i.e., polarity). From now on, we will use social data to mean the collective information produced by customers and consumers as they actively participate in online social activities, and sentiment data to mean all the data elements extracted from social data by means of sentiment analysis tools.
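To make this terminology concrete, the following is a minimal sketch of how sentiment data might be represented and aggregated. The class `OpinionFact`, its fields, the sample values, and the mean-based `reputation` function are illustrative assumptions, not the output format of any particular sentiment analysis tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class OpinionFact:
    """One sentiment-data element extracted from a posted text (hypothetical schema)."""
    item: str      # target item, e.g. a product
    aspect: str    # subject of the opinion
    polarity: int  # orientation: +1 positive, -1 negative, 0 neutral


def reputation(facts):
    """Global assessment per item, taken here as the mean polarity (an assumption)."""
    by_item = defaultdict(list)
    for fact in facts:
        by_item[fact.item].append(fact.polarity)
    return {item: mean(values) for item, values in by_item.items()}


# Made-up sample data, for illustration only.
facts = [
    OpinionFact("camera-x", "battery", -1),
    OpinionFact("camera-x", "lens", 1),
    OpinionFact("camera-x", "price", 1),
    OpinionFact("phone-y", "screen", -1),
]
scores = reputation(facts)
```

In this sketch each aspect-level opinion counts equally towards the reputation of its item; a real system would likely weight opinions by source, date, or extraction confidence.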
A good number of commercial tools have recently appeared on the market for listening to and analyzing social media and product review forums, for example Salesforce Radian6 (http://synthesio.com), to mention just a few. Unfortunately, these commercial tools aim to provide customized reports for end-users, and the sentiment data on which these reports rely are not publicly available (indeed, this is the key to their business). Consequently, critical aspects such as the quality and reliability of the delivered data can be neither checked nor validated by analysts. This contrasts with the high quality that BI requires of corporate data in order to make reliable decisions.
Figure 1. BI contexts and their relation to the Web 3.0 data infrastructure
Apart from the sentiment analysis approaches, there is also great interest in publishing strategic data for BI tasks within the Linked Open Data (LOD) cloud (Heath & Bizer, 2011). The Web 3.0 and LOD are about publishing data that are identified and linked to each other through Uniform Resource Identifiers (URIs), and providing data with well-defined semantics so that users and machines can interpret them correctly. Projects like Schema.org are enabling the massive publication of product offers as microdata, as well as specific vocabularies for e-commerce applications. Unfortunately, there is currently no open data infrastructure that allows users and applications to directly perform analysis tasks over huge amounts of opinions published on the Web.
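As an illustration of URI-identified, linked data, the sketch below serializes a single product review as N-Triples statements using Schema.org terms. The `http://example.org/` namespace, the resource names, the review text, and the helper `triple` are hypothetical. The code is not the infrastructure discussed in this article, only a minimal example of the publishing style.

```python
SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical namespace for the example resources


def triple(subject, predicate, obj, literal=False):
    """Serialize one statement as an N-Triples line.

    Literal escaping is omitted for brevity; the sample text below
    contains no quotes or backslashes.
    """
    value = f'"{obj}"' if literal else f"<{obj}>"
    return f"<{subject}> <{predicate}> {value} ."


review = BASE + "review/1"
product = BASE + "product/camera-x"

lines = [
    triple(review, RDF_TYPE, SCHEMA + "Review"),
    triple(review, SCHEMA + "itemReviewed", product),
    triple(review, SCHEMA + "reviewBody", "Great lens, weak battery.", literal=True),
    triple(product, RDF_TYPE, SCHEMA + "Product"),
]
print("\n".join(lines))
```

Because both the review and the product are identified by URIs, any other dataset can link to them, which is the property that makes such data usable outside the application that produced it.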