1. Introduction
As novel social media platforms such as Twitter, Facebook, and micro-blogs emerge, a large volume of short messages is transmitted on the web. Within this volume, two related short texts that belong to the same topic may lie far apart, and the association between them is weak because it is diluted by the ocean of short texts on the web. The result is a mass of decentralized topics, heavy redundancy, and abundant noise. It is therefore a significant and practical problem to study how to link unordered short texts in a semantically coherent way in a large-scale web data environment. However, direct research on sentence linking is difficult because there are neither well-rounded mathematical methods nor an existing standard dataset of short texts. For simplicity, we use "sentences" to refer to short texts in the following parts, since short texts and sentences are similar in length.
Coherence is defined as a “continuity of senses” and “the mutual access and relevance within a configuration of concepts and relations” [Beaugrande and Dressler 1981]. In human discourse processing, semantic coherence is a key problem, and readers routinely attempt to construct coherent meanings by inference [Graesser et al. 1994; Ferstl and Cramon 2001; Kintsch 1988; Singer 1994]. Among these inferences, bridging inference is particularly central to textual semantic coherence: it adds bridges between sentences to narrow the semantic gaps between them [Kim 1999; McKoon and Ratcliff 1992; Graesser et al. 1994; Singer 1990].
Figure 1 shows the bridging inference process that human readers follow when faced with unordered sentences. The reader first acquires meaning at the term and sentence level as explicit knowledge (steps 1-2). The reader then makes bridging inferences, using this explicit knowledge, to narrow the semantic gaps between the sentences (step 3). If the linked sentences are semantically incoherent, the bridging inference process activates implicit bridges from domain knowledge to link them (the loop in steps 3-6) [McKoon and Ratcliff, 1992; O’Brien et al., 1988; Singer and Ferreira, 1983]. The less semantically coherent the sentences are, the more bridging inferences are added to link them [Johnson et al., 1973] (steps 7-8).
Figure 1. Human beings’ bridging inference process to link unordered sentences
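To make the loop in Figure 1 concrete, the following is a minimal toy sketch in Python. It is not the linking model proposed in this paper. The function names (`terms`, `coherence`, `link`), the overlap-based coherence score, the `knowledge` dictionary standing in for domain knowledge, and the `threshold` and `max_rounds` parameters are all assumptions made purely for illustration.

```python
import re


def terms(sentence):
    """Steps 1-2 (assumed form): explicit knowledge as a set of lowercase word tokens."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def coherence(a, b):
    """Overlap coefficient of two term sets, used here as a stand-in coherence score."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def link(s1, s2, knowledge, threshold=0.2, max_rounds=3):
    """Try to link two sentences, activating bridge concepts while they stay incoherent.

    `knowledge` is a hypothetical domain-knowledge map: term -> related concepts.
    Returns (linked, bridges, rounds).
    """
    t1, t2 = terms(s1), terms(s2)
    bridges = set()
    rounds = 0
    # Loop of steps 3-6: keep adding implicit bridges while coherence is too low.
    while coherence(t1, t2) < threshold and rounds < max_rounds:
        activated = set()
        for t in t1:
            activated.update(knowledge.get(t, ()))
        activated -= t1
        if not activated:
            break  # no further domain knowledge to activate
        bridges |= activated
        t1 |= activated
        rounds += 1
    return coherence(t1, t2) >= threshold, bridges, rounds


if __name__ == "__main__":
    kb = {"rain": ["wet", "water"]}  # illustrative domain knowledge
    print(link("Rain fell all night", "The streets were wet", kb))
```

In this sketch, two sentences that already share enough terms are linked without any bridge, while less coherent pairs trigger more rounds of activation, mirroring steps 7-8 only in a very rough sense.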
Obviously, the manual human process described above is too time-consuming for linking large-scale unordered sentences on the web. To address this issue, inspired by cognitive informatics and cognitive computation [Wang et al., 2012, 2013], we propose a bridging-inference-based linking model that simulates bridging inference in the discourse process to link unordered sentences on the web.