Adaptable Services for Novelty Mining

Flora S. Tsai, Agus T. Kwee, Wenyin H. S. Tang, Kap Luk Chan
DOI: 10.4018/jssoe.2010040105


Novelty mining is the process of mining novel yet relevant information on a given topic. However, designing adaptable services for real-world novelty mining faces several challenges, such as real-time processing of incoming documents, computational efficiency, multi-user working environments, diverse system requirements, and integration of domain knowledge from different users. In this paper, the authors bridge the gap between generic data mining methodologies and domain-specific constraints by providing adaptable services for intelligent novelty mining. These services model user preferences by synthesizing the parameters of novelty scoring, threshold setting, performance monitoring, and contextual information access. The resulting novelty mining system has been tested in a variety of performance situations and user settings. By considering the special issues that arise from domain knowledge, the authors’ adaptable novelty mining services can be used to support a real-life enterprise.


With the fast growth of technology, the Web is changing from a data-centric Web into a Web of semantic data and a Web of services (Yee, Tiong, Tsai, & Kanagasabai, 2009). Moreover, demand has increased rapidly for Web services that enable users to run formerly offline, standalone applications over the Internet. The World Wide Web Consortium (W3C) defines a Web service as “a software system designed to support interoperable machine-to-machine interaction over a network” (Hugo & Allan, 2004). As a result, more and more software applications are available online and can be accessed on a remote system from the client side. To make these online applications easy to identify, each is assigned a unique URI (Uniform Resource Identifier), which serves as the address of the individual application or service. Individual services can be built and combined with each other to create other services with more comprehensive functionality at little additional cost (Zheng & Bouguettaya, 2009). These Web services are significant in the business domain, where they are used as a means of communicating or exchanging data between businesses and clients (Kwee & Tsai, 2009).

Another consequence of the rapid growth of technology is the flood of content from news articles, scientific papers, blogs (Chen, Tsai, & Chan, 2007), social networks (Tsai, Han, Xu, & Chua, 2009), and mobile content (Tsai et al., 2010). Because much of the information in these documents is irrelevant or redundant, people tend to suffer from information overload. Novelty mining (NM), or novelty detection, is one solution to this problem. Novelty mining is the process of retrieving novel yet relevant information, based on a topic given by the user (Ng, Tsai, & Goh, 2007; Ong, Kwee, & Tsai, 2009). It can be used to solve many business problems, such as in corporate intelligence (Tsai, Chen, & Chan, 2007) and cyber security (Tsai, 2009; Tsai & Chan, 2007). The pioneering work on novelty mining was proposed by Zhang et al. at the document level (Zhang, Callan, & Minka, 2002). They defined “novelty” as the opposite of “redundancy”. Given any set of documents, a document that was less similar to its history documents was regarded as a “novel” document. Although users can retrieve all the novel documents, each document still needs to be read to find the novel sentences within it (Tsai & Chan, 2010). Therefore, to serve users better, later studies of novelty mining were performed at the sentence level (Allan, Wade, & Bolivar, 2003; Kwee, Tsai, & Tang, 2009; Tang & Tsai, 2010; Zhang, Xu, Bai, Wang, & Cheng, 2004; Zhang & Tsai, 2009b). To the best of our knowledge, no previous work has been reported on designing and developing adaptable services for a novelty mining system in the business enterprise. For enterprise users, this service-oriented novelty mining system conveniently helps retrieve new information about events of interest, so they do not need to read through all documents or passages to find the novel information.
Creating Web services for the novelty mining components allows for the rapid deployment and availability of these services for this diverse set of users, which can balance technical significance and business concerns in business processes and enterprise systems.
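
As an illustration of the document-level idea described above, in which a document that is less similar to its history documents is regarded as novel, the following minimal sketch scores each incoming document against the documents seen so far. It is not the authors' system. The use of cosine similarity over bag-of-words term counts, the function names, the choice to add every processed document to the history, and the default threshold of 0.5 are all assumptions made for this example.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a bag-of-words term-frequency vector (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def novelty_score(vector, history):
    """Assumed scoring rule: 1 minus the highest similarity to any history document."""
    if not history:
        return 1.0  # nothing seen yet, so the document is treated as fully novel
    return 1.0 - max(cosine_similarity(vector, past) for past in history)


def mine_novel(documents, threshold=0.5):
    """Return the documents whose novelty score meets the (hypothetical) threshold."""
    history = []
    novel = []
    for text in documents:
        vector = term_vector(text)
        if novelty_score(vector, history) >= threshold:
            novel.append(text)
        # Assumption: every processed document joins the history, novel or not.
        history.append(vector)
    return novel


if __name__ == "__main__":
    stream = [
        "stocks fall sharply today",
        "stocks fall sharply today",
        "new vaccine trial begins",
    ]
    for doc in mine_novel(stream):
        print(doc)
```

Raising the threshold makes the filter stricter, so fewer documents are reported as novel; lowering it lets more near-duplicates through. The same scoring could be applied to sentences instead of whole documents, in the spirit of the sentence-level studies cited above.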
