Current Issues and Future Analysis in Text Mining for Information Security Applications

Shuting Xu (Virginia State University, USA) and Xin Luo (Virginia State University, USA)
DOI: 10.4018/978-1-60566-230-5.ch010


Text mining is an instrumental technology that today’s organizations can employ to extract information and to develop valuable knowledge for more effective knowledge management. It is also an important tool in the arena of information systems security (ISS). While a great deal of text mining research has pursued new technological developments, relatively little attention has been paid to how text mining can be applied in ISS. In this chapter, we address a variety of technological applications of text mining to security issues. The techniques are categorized according to the types of knowledge to be discovered and the text formats to be analyzed. Privacy issues of text mining as well as future trends are also discussed.
Chapter Preview


Defined as “the discovery by computer of new, previously unknown, information by automatically extracting information from different written resources” (Fan et al. 2006), text mining is an emerging technology characterized by a set of tools that allow information to be extracted from unstructured text. With the exponential growth of the internet, it has become cumbersome for individuals and companies alike to process the overwhelming volume of information. Unlike data mining techniques that discover knowledge only from structured data, such as numeric data, text mining finds knowledge in unstructured textual data, including e-mails, Web pages, business reports, and articles. Moving beyond traditional information retrieval toward information and knowledge discovery, text mining applies the same analytical functions of data mining to the domain of textual information and relies on sophisticated text analysis techniques that distill information from free-text documents (Dörre et al. 1999).

Voluminous corporate information must be merged and managed, and the dynamic business environment pushes decision makers to promptly and effectively locate, read, and analyze relevant documents in order to make well-informed decisions. In this setting, discovering hidden patterns from the structured data plays an important role in business, where patterns are paramount for strategic decision making. Text mining pursues knowledge discovery from textual databases by isolating key bits of information from large amounts of text, by identifying relationships among documents, and by inferring new knowledge from them (Durfee 2006). Furthermore, Fan et al. (2006) indicated that the key to text mining is creating technology that combines a human’s linguistic capabilities with the speed and accuracy of a computer. Combining the generic process model for text mining applications proposed by Fan et al. (2006) with the general text mining framework suggested by Durfee (2006), we think that the following model captures the processes involved in text mining, from text collection and distillation to knowledge representation (see Figure 1).

Figure 1. Processes involved in text mining (adapted from Durfee 2006; Fan et al. 2006)
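
As a rough illustration of two of the steps named above (isolating key terms from free text and identifying relationships among documents), the following Python sketch may help. It is not the model of Fan et al. or Durfee: the sample documents, the stopword list, and every function name are hypothetical, and term-frequency vectors compared by cosine similarity are used only as one simple, common choice.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "on", "for", "is", "was", "from"}


def tokenize(text):
    """Distillation: lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def term_vector(text):
    """Represent a document as a bag of words (term -> raw count)."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_related_pair(docs):
    """Return (name_x, name_y, score) for the two most similar documents."""
    vectors = {name: term_vector(text) for name, text in docs.items()}
    names = sorted(vectors)
    score, x, y = max(
        (cosine_similarity(vectors[x], vectors[y]), x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
    )
    return x, y, score


# Hypothetical unstructured inputs: two security-related e-mails and one unrelated report.
docs = {
    "mail_1": "Password reset requested for the admin account from an unknown host",
    "mail_2": "Unknown host attempted admin password reset on the server",
    "report": "Quarterly sales figures rose in the northern region",
}

x, y, score = most_related_pair(docs)
print(f"Most related documents: {x} and {y} (cosine similarity {score:.2f})")
```

With these sample inputs, the two e-mails share five of their seven distinct terms and are paired with each other, while the sales report shares no terms with either. A realistic system would add steps this sketch omits, such as stemming, term weighting, and a representation of the discovered knowledge.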
