With the increasing popularity of XML for data representation, keyword queries over XML have attracted considerable interest, and many algorithms have been proposed for them. However, existing approaches fall short in their ability to analyze the logical relationships between keywords in spatiotemporal data. To overcome this limitation, in this chapter the authors first propose the concept of a query time series (QTS) based on the data revision degree. For the logical relationships between keywords in a QTS, the authors study the intra-coupling and inter-coupling logic relationships separately. A method for calculating keyword similarity is then proposed, and the best value of its parameter is determined experimentally. Finally, the authors compare this method with others. Experimental results show that this method is superior to previous approaches.
1 Introduction
The Extensible Markup Language (XML) has evolved into a paradigm for data exchange over the network since its introduction in 1998 (An et al., 2005). XML is perceived as an adaptable hierarchical model that is well suited to communicating large amounts of data without a rigid structure (Ahuja & Gadicha, 2014). Hence, the prospects for acquiring knowledge from XML documents for decision support are promising, and XML has become a dominant format on the web (Ahuja & Gadicha, 2014). Besides, XML’s self-describing property enables it to represent data without losing semantic information (Chang & Chen, 2012).
Keyword queries on XML documents have received wide attention. Their query semantics and algorithms have been studied extensively in the literature (Bhalotia et al., 2002; Chang & Chen, 2012; Cost & Salzberg, 1993; Deerwester et al., 1990; Guo et al., 2003; Hristidis et al., 2006; Hristidis & Papakonstantinou, 2002; Hu & Hammad, 2005; Kong et al., 2009; Li et al., 2007; Li et al., 2009; Li et al., 2012; Li et al., 2004; Wang & Aggarwal, 2010; Chen et al., 2009; Tian et al., 2011). Keyword search semantics on XML documents mainly build on the lowest common ancestor (LCA), including the LCA semantics itself and variants designed to improve search quality (Cost & Salzberg, 1993; Hristidis et al., 2006; Hristidis & Papakonstantinou, 2002; Li et al., 2009; Li et al., 2012; Li et al., 2004). Xu and Papakonstantinou (2005) propose the smallest lowest common ancestor (SLCA) semantics for keyword query processing on XML documents and present two algorithms for it. Guo et al. (2003) introduce the exclusive lowest common ancestor (ELCA) semantics for keyword queries on XML documents, and an efficient algorithm named Indexed Stack for answering keyword queries under the ELCA semantics is presented in (Xu & Papakonstantinou, 2008). The valuable lowest common ancestor (VLCA) semantics is proposed by Li et al. (2007) to answer keyword queries effectively: it not only improves the accuracy of LCAs by eliminating redundant LCAs that should not contribute to the answer, but also retrieves the false negatives wrongly filtered out by SLCA. Sun et al. (2007) propose a multiway SLCA (MSLCA) semantics that supports keyword search with both AND and OR Boolean operators, and present two algorithms named basic multiway-SLCA (BMS) and incremental multiway-SLCA (IMS).
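To make the LCA and SLCA semantics discussed above concrete, the following is a minimal brute-force sketch. It assumes that every XML node is identified by a Dewey label (a tuple of child positions along the path from the root), so that the LCA of several nodes is the longest common prefix of their labels. The function names `lca`, `is_proper_ancestor`, and `slca` and the sample labels are hypothetical and chosen for illustration only. The sketch enumerates all keyword-match combinations and is not one of the optimized algorithms from the cited works.

```python
from itertools import product


def lca(labels):
    """Return the lowest common ancestor of nodes given as Dewey labels.

    With Dewey labels, the LCA is the longest common prefix of the labels.
    """
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def is_proper_ancestor(a, b):
    """True if node a is a proper ancestor of node b (a is a strict prefix of b)."""
    return len(a) < len(b) and b[:len(a)] == a


def slca(match_lists):
    """Brute-force SLCA: LCAs of keyword matches that have no descendant LCA.

    match_lists holds, for each keyword, the Dewey labels of the nodes
    that contain that keyword (an assumed, precomputed inverted index).
    """
    if any(not matches for matches in match_lists):
        return set()  # some keyword never occurs, so there is no answer
    # One LCA per combination that picks one match for each keyword.
    candidates = {lca(combo) for combo in product(*match_lists)}
    # Keep only the candidates that are not ancestors of another candidate.
    return {
        c for c in candidates
        if not any(is_proper_ancestor(c, d) for d in candidates)
    }


# Hypothetical matches for two keywords in a small document rooted at (0,).
xml_matches = [(0, 0, 0), (0, 1, 1)]
query_matches = [(0, 0, 1), (0, 2)]
result = slca([xml_matches, query_matches])
print(result)
```

In this example the candidate LCAs are the root `(0,)` and the node `(0, 0)`. Because the root is an ancestor of `(0, 0)`, only `(0, 0)` remains under the SLCA semantics, whereas the plain LCA semantics would return both. The enumeration grows with the product of the match-list sizes, which is why the cited works develop more efficient algorithms.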
Li and M (2018) propose a structure-based approach to keyword querying over XML data, which integrates a structural query language into keyword queries to obtain more meaningful and comprehensive results. Hu and Hammad (2005) use indexes to improve the performance of query engines for XML data.