1. Introduction
As the amount of information available on the Internet grows, the technology for censoring network information is also improving. Anonymous network communication protocols have been developed to meet the need for privacy protection. One example is the onion router (Tor) (Dingledine et al., 2004), one of the most popular anonymity networks. Tor wraps data in multiple layers of onion encryption, which effectively prevents attackers from learning the identity of the communication terminals or the content of the data. This has led many attackers and researchers to propose a variety of targeted attacks centered on traffic analysis (Biryukov et al., 2013; Galteland & Gjøsteen, 2018; Jansen et al., 2018; Kwon et al., 2015). The core objective is to de-anonymize the communicating users by observing patterns in the encrypted traffic around the onion nodes. Nevertheless, for each attack method there are traffic countermeasures, such as traffic obfuscation and node protection (Imani et al., 2018; Johnson et al., 2017), which challenge traditional attack methods. In addition to the communication terminals, the anonymous transmission circuits are an important part of the communication process. Hence, analysis of the transmission circuits poses a serious threat compared with attacks on the communication terminals. Galteland & Gjøsteen (2018) show that, by analyzing and constructing the transmission circuit, any access behavior at the communication terminal becomes transparent, which facilitates de-anonymization attacks.
Conventional onion circuits use a cooperative transmission mode among three nodes, in which only the communication sender is anonymized. The hidden service onion circuit, proposed subsequently, anonymizes both the sender and the receiver by splicing two sets of regular circuits. In this paper, a novel attack technique based on middle node traffic analysis is proposed; it reconstructs the hidden service communication circuits by controlling circuit nodes. The primary goal of this technique is to locate the middle node between the hidden service and the client. The internal traffic on both sides is then correlated and analyzed, and the result is used to accurately reconstruct the complete communication circuits within the network. Determining the initial location of the middle node is critical because the middle node makes the circuits transparent and evades the existing protection for critical nodes. Hence, knowing the middle node enables attackers to track the entry nodes at both ends of the communication and to mount attacks using minimal resources.
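To make the idea of correlating the traffic on both sides of a middle node concrete, the following is a minimal sketch and not the method proposed in this paper. It assumes that packet timestamps have already been captured on each side, bins them into fixed-width time windows, and compares the resulting packet-count series with a Pearson correlation coefficient. All function names, parameters, and traces (`bin_counts`, `pearson`, `correlate_sides`, `bin_width`, and the sample timestamps) are hypothetical and chosen only for illustration.

```python
from math import sqrt


def bin_counts(timestamps, bin_width, num_bins):
    """Count packets per fixed-width time bin (timestamps in seconds)."""
    counts = [0] * num_bins
    for t in timestamps:
        idx = int(t // bin_width)
        if 0 <= idx < num_bins:
            counts[idx] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlate_sides(client_side, service_side, bin_width=0.5, num_bins=10):
    """Score how similar two traces look after binning (assumed metric)."""
    a = bin_counts(client_side, bin_width, num_bins)
    b = bin_counts(service_side, bin_width, num_bins)
    return pearson(a, b)


# Hypothetical traces: packet arrival times observed on each side.
client_side = [0.1, 0.2, 0.3, 1.1, 1.2, 3.0, 3.1, 3.2, 3.3]
same_circuit = [t + 0.05 for t in client_side]  # same flow, small delay
other_circuit = [0.6, 0.7, 2.1, 2.2, 2.3, 4.1, 4.6, 4.7, 4.8]

score_same = correlate_sides(client_side, same_circuit)
score_other = correlate_sides(client_side, other_circuit)
print(score_same > score_other)
```

In this toy setting the delayed copy of the client-side trace scores higher than the unrelated trace. Real traffic would require handling of noise, variable delay, and padding, which a simple binned correlation does not address.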
In this paper, based on the conditions derived from the node location determination results, the traffic correlation target is divided into client area traffic and hidden service area traffic. The advantage of this approach is that the transmission features of the unidirectional data within a circuit are independent of each other yet interrelated. In addition, previous work (Guan et al., 2020) has shown that traffic correlation attacks can be performed in a high-traffic environment under continuous observation. However, such attacks still have deficiencies, including poor noise immunity, long traffic observation times, and a high demand for positive and negative sample datasets. Moreover, they require capturing a large amount of labeled traffic data in advance over a long period, and they rely on pre-processing models to eliminate noise interference when identifying non-smooth data.