Introduction
Textual semantics can be divided into two types: surface semantics and deep semantics. Surface semantics comes from the text itself and comprises textual keywords, keyword associations, sentence patterns, and even the structure of the article. Deep semantics is a type of implicit semantics that is not only contained in the text but also hidden in background knowledge; it is constituted by the domain knowledge of the text and is necessary for a machine or a human being to understand the text better.
The acquisition of deep textual semantics is a key issue in the text understanding process, and it can significantly improve the performance of e-learning, web search, and web knowledge services. Although many models have been proposed to acquire textual semantics, this remains a challenging issue because current models extract only the surface semantics from the text itself.
There are four main types of surface textual semantic acquisition models: 1) statistical models, such as the vector space model (VSM) (Salton & Wong, 1975); 2) cognition-based models, such as the element fuzzy cognitive map (EFCM) (Zhuge & Luo, 2006), the concept algebra based model (Wang, 2006, 2007b), and the associated linked network model (ALN) (Luo & Xu, 2010); 3) probabilistic topic models, such as the author-topic model (ATM) (Rosen-Zvi, 2004), the author-recipient-topic model (ART) (McCallum & Wang, 2004), and correlated topic models (CTM) (Blei & Lafferty, 2006); and 4) ontology-based models, such as the Ontology Inference Layer (OIL) (Fikes & McGuinness, 2001) and the Web Ontology Language (OWL) (Smith, 2002). However, most of these models are based on the text itself and do not consider domain knowledge. Much of the deep textual semantics, such as the textual background (i.e., a part of the domain knowledge), is not contained in the text. Machines and human beings find it hard to understand a text without background knowledge, because the absence of domain knowledge makes the surface semantics incoherent and the text difficult to comprehend. Therefore, compared with deep textual semantics, surface semantics is rough, limited, and narrow, and it lacks the coherence and cohesion needed for the text understanding process.
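To make this limitation concrete, the following is a minimal bag-of-words sketch in the spirit of a vector space model. It is an illustration only, not the formulation of Salton and Wong (1975): the raw term counts, the whitespace tokenization, and the three example sentences are all assumptions introduced here.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector (lowercased, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Counter returns 0 for terms that are absent from b.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentences: doc_a and doc_b describe a similar event in
# different words; doc_a and doc_c share most of their words.
doc_a = "the physician prescribed aspirin"
doc_b = "a doctor recommended painkillers"
doc_c = "the physician prescribed rest"

sim_ab = cosine_similarity(term_vector(doc_a), term_vector(doc_b))
sim_ac = cosine_similarity(term_vector(doc_a), term_vector(doc_c))

print(sim_ab)  # no shared terms, so the similarity is 0.0
print(sim_ac)  # three of four terms shared, so the similarity is 0.75
```

Because the vectors are built only from the words on the page, the first pair scores zero even though a reader with medical background knowledge would see the two sentences as closely related. This is the kind of gap that domain knowledge is expected to fill.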
Several models have been proposed to acquire deep textual semantics, such as “WordNet” and “HowNet” (Tang, 2007), but they are stiff and inflexible in the acquisition process. Current studies identify two key issues in the acquisition of deep textual semantics (Luo & Lu, 2011): 1) how to obtain and organize the domain knowledge extracted from a domain text set, and 2) how to activate the domain knowledge in order to obtain the deep textual semantics. The acquisition of domain knowledge from a domain text set, based on the human reading cognitive process, has been studied by Luo and Cai (2010).