The term “linked data” was proposed in the area of the Semantic Web (W3C, 2018; Wikipedia, 2018). Linking is the process of extracting, setting, and jointly using links between elements of data, information, and knowledge (linkeddata.org, 2018). In government reports, data linking is treated as the process of setting links between elements from different sources. Links are set on the basis of properties that these sources have in common (Data.gov.au, 2018).
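Linking on common properties can be illustrated with a minimal sketch. The records, field names, and the `link_records` helper below are hypothetical and chosen only for illustration. Exact matching on shared fields is the simplest possible rule; practical linking usually needs fuzzier comparison.

```python
from collections import defaultdict

def link_records(source_a, source_b, keys):
    """Return pairs (a, b) of records that agree on every property in `keys`."""
    # Index the second source by the values of the common properties.
    index = defaultdict(list)
    for rec in source_b:
        key = tuple(rec.get(k) for k in keys)
        if None not in key:  # skip records that lack a common property
            index[key].append(rec)

    links = []
    for rec in source_a:
        key = tuple(rec.get(k) for k in keys)
        if None in key:
            continue
        for match in index.get(key, []):
            links.append((rec, match))
    return links

# Hypothetical records from two different sources.
registry = [
    {"id": "r1", "name": "Ann Lee", "birth_year": 1980},
    {"id": "r2", "name": "Bob Ray", "birth_year": 1975},
]
clinic = [
    {"id": "c7", "name": "Ann Lee", "birth_year": 1980},
    {"id": "c9", "name": "Bob Ray", "birth_year": 1990},
]

links = link_records(registry, clinic, keys=("name", "birth_year"))
linked_ids = [(a["id"], b["id"]) for a, b in links]
print(linked_ids)
```

Only the pair that agrees on both `name` and `birth_year` is linked. The second pair shares a name but differs in birth year, so no link is set.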
Many definitions of linked data exist. The main idea of linking is to set relations (or links) between different information elements. These relations define interdependencies between elements and enrich the original data.
Contemporary methods for data linking are mostly based on models and methods of intelligent data analysis and machine learning. They make it possible to investigate indirect links and to recover non-obvious interdependencies, including those found in large datasets or in high-dimensional data.
The following groups of intelligent data analysis methods can be used for data linking: clustering, classification and forecasting, template matching, etc. (Graesser, 2018; Hastie et al., 2009; Nikolenko & Tulup’ev, 2009; Pospelov, 1990; Russell & Norvig, 2009; Witten & Frank, 2005; Zaki & Meira, 2014).
Two types of analysis are used for dependency discovery: associative and sequential. Associative analysis methods discover logical dependencies between information elements. Sequential analysis methods process sequences of events.
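The difference between the two types of analysis can be sketched in a few lines. The measures used here (support and confidence for an association rule, and subsequence support for an ordered pattern) are standard in the data mining literature but are an assumption of this example and are not named in the text. The data and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

def sequence_support(sequences, pattern):
    """Fraction of event sequences containing `pattern` as an ordered subsequence."""
    def contains(seq, pat):
        it = iter(seq)
        # `event in it` consumes the iterator, so the order of events is enforced.
        return all(event in it for event in pat)
    return sum(1 for s in sequences if contains(s, pattern)) / len(sequences)

# Associative analysis: order inside a transaction is irrelevant.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
rule_support = support(transactions, {"a", "b"})
rule_confidence = confidence(transactions, {"a"}, {"b"})

# Sequential analysis: the same events in a different order are a different pattern.
sequences = [
    ["login", "search", "buy"],
    ["search", "login", "buy"],
    ["login", "buy"],
]
forward = sequence_support(sequences, ["search", "buy"])
backward = sequence_support(sequences, ["buy", "search"])
print(rule_support, rule_confidence, forward, backward)
```

In the associative case the rule `a -> b` is scored purely by co-occurrence. In the sequential case the pattern `search, buy` is found in two of three sequences, while the reversed pattern is found in none.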
Most classical dependency discovery methods are based on brute-force search. The complexity of the algorithms implementing these methods is reduced through optimization. A classification and evaluation of associative algorithms can be found in (Tushkanova & Gorodeckij, 2015a, 2015b). Some of these algorithms are described in (Zaki & Meira, 2014). In the last few years, a number of new effective associative models have been created, e.g. an associative Bayesian network (Gorodeckij & Samojlov, 2009).
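As an illustration of how a brute-force search can be made cheaper, the sketch below enumerates frequent itemsets level by level and prunes candidates using the Apriori property (an itemset can be frequent only if all of its subsets are frequent). The text does not name a specific optimization, so this choice is an assumption made for the example. The function name, the data, and the threshold are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_count):
    """Level-wise search for itemsets contained in at least `min_count` transactions.

    A naive brute-force search would count all 2**n - 1 non-empty itemsets.
    Here a k-itemset is counted only if every (k-1)-subset was frequent.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})

    frequent = {}
    level = [frozenset([i]) for i in items]  # candidate 1-itemsets
    k = 1
    while level:
        # Count each surviving candidate against the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(survivors)

        # Join frequent k-itemsets into (k+1)-candidates, then prune.
        k += 1
        candidates = {a | b for a in survivors for b in survivors if len(a | b) == k}
        level = [
            c for c in candidates
            if all(frozenset(s) in survivors for s in combinations(c, k - 1))
        ]
    return frequent

# Hypothetical transactions; item "d" occurs only once.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "d"},
]
result = frequent_itemsets(transactions, min_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Because `d` is infrequent on its own, no itemset containing `d` is ever generated or counted. This is where the saving over exhaustive enumeration comes from.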