Radio Frequency Fingerprint Identification Based on Metric Learning

With the popularization of the Internet of Things (IoT), its security has become an increasingly prominent concern. Radio-frequency fingerprinting (RFF) is a physical-layer method for providing security in wireless networks. However, two problems urgently need to be resolved in practical RFF applications: poor performance in highly noisy environments and insufficient consideration of computational resources. The authors propose a new RFF identification method based on metric learning. They use power spectrum density (PSD) to extract the RFF from the nonlinearity of the RF front end and then adopt the large margin nearest neighbor (LMNN) classification algorithm to identify eight software-defined radio (SDR) devices. Different from existing RFF identification algorithms, the proposed LMNN method is more general and can learn the optimal metric from the wireless communication environment. Furthermore, they propose a new training and test strategy based on mixed SNR, which significantly improves the performance of conventional low-complexity RFF identification methods. Experimental results show that the proposed method can achieve 99.8% identification accuracy at 30 dB SNR and 96.83% at 10 dB SNR. In conclusion, the study demonstrates the effectiveness of the proposed method in terms of recognition efficiency and computational complexity.


INTRODUCTION
A radio-frequency fingerprint (RFF) is the set of intrinsic characteristics of a wireless device that arise from hardware imperfections. Because these imperfections are unique to each wireless device, RFF identification has become an emerging device authentication technique (Danev et al., 2012).
In general, RFF identification includes two steps: feature extraction and classification. Feature extraction determines the quality of the RFF and directly affects classification accuracy. Many studies have explored the characteristics of different electronic components to extract effective RFF features, for example, in-phase and quadrature offset (Brik et al., 2008), phase offset (Nguyen et al., 2011), carrier frequency offset (Wheeler et al., 2017), differential constellation trace figure (Peng et al., 2019), and signal spectrum (Rehman et al., 2014). Recently, Wang et al. (2016) built a theoretical model for the entire wireless communication link to analyze the effectiveness of different RFF features. The results show that power spectrum density (PSD) can characterize the nonlinearity of the RF front end, which contributes the most significant RFF feature.
On the other hand, classification algorithm design is another key part of RFF identification, and many machine learning algorithms have been used for it. Danev et al. (2009) successfully classified 50 radio-frequency identification (RFID) transponders using principal component analysis (PCA) and k-nearest neighbor (KNN). Baldini et al. (2017) compared the performance of KNN, support vector machine (SVM), and decision tree algorithms. Wang et al. (2017) used Fisher linear discriminant analysis (LDA) based on the Mahalanobis distance metric to analyze the user capacity of wireless physical-layer identification. However, the performance of existing classification algorithms in RFF identification degrades severely as the receive SNR decreases. For example, six devices are classified with 98% accuracy at 30 dB but only 90% accuracy at 10 dB (Patel et al., 2014). Only 51% identification accuracy was achieved for a non-line-of-sight (NLOS) channel model at an SNR of 15 dB (Wang et al., 2016). Peng et al. (2019) presented a method based on a convolutional neural network (CNN) that achieves 99.1% accuracy at high SNR but quickly drops to 80% at 10 dB.
We propose a new RFF identification method based on metric learning (Weinberger et al., 2009) that can adapt to different SNRs. We use the large margin nearest neighbor (LMNN) algorithm, which has not been used in existing RFF identification works, to learn the optimal distance metric directly from training samples. Moreover, we propose a new training and test strategy based on mixed SNR that significantly improves the performance of conventional RFF identification methods with low complexity. We built a real testbed in which eight devices are identified under different SNRs. The experimental results show that the LMNN algorithm achieves higher identification accuracy at low SNR than existing algorithms, for example, 96.83% accuracy at 10 dB.
The main contributions of this article are summarized as follows.
Methodologically, we propose a new RFF identification method based on metric learning. We adopted the LMNN classification algorithm to identify eight software-defined radio (SDR) devices. Different from existing RFF identification algorithms, the proposed LMNN method is more general and can learn the optimal metric from the wireless communication environment.
Experimentally, we propose a new training and test strategy based on mixed SNR that significantly improves the performance of conventional low-complexity RFF identification methods on poor-quality datasets. Specifically, we evaluate the method at SNRs as low as 0 dB. Experimental results on such poor datasets also show the high applicability of our methods in IoT scenarios where physical and computational resources are extremely scarce.

RFF GENERATION
The key idea behind RFF identification is exploiting the unique characteristics of hardware to identify wireless devices. A simplified transmitter circuit and the signal transmit procedure are shown in Figure 1. Based on digital modulation, the baseband binary bits are first mapped to the in-phase and quadrature (I/Q) channels. A digital-to-analog converter (DAC) is then used to convert the I/Q signal into a time-continuous analog signal. After the DAC, the mixer and RF front end move the analog signal to the passband. Note that almost no element of the transmitter circuit is perfect, and the imperfection mainly arises from the following aspects. First, the imperfection of the local oscillator (LO) affects the carrier frequency of devices (i.e., frequency offset and phase offset). Second, the quadrature mixer is often impaired by gain and phase mismatches that result in I/Q imbalance and random phase noise. Finally, the passband signals go through the RF front-end amplifier and filter to gain enough power for radiation, which generates the most significant RFF (i.e., the nonlinearity of the RF front end).
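To make the three impairment sources above concrete, the following is a minimal Python sketch of a toy impairment chain applied to complex baseband samples. The function name `apply_impairments`, its parameters, and the specific models (a simple gain/phase mismatch on the Q branch, a constant carrier frequency offset, and a memoryless cubic nonlinearity) are illustrative assumptions and are not taken from the transmitter model used in this article.

```python
import numpy as np

def apply_impairments(x, cfo_hz, fs, gain_imb, phase_imb, a3):
    """Apply a toy impairment chain to complex baseband samples x.

    All models here are simplifying assumptions for illustration only.
    """
    n = np.arange(len(x))
    # I/Q imbalance: gain and phase mismatch on the quadrature branch.
    i = x.real
    q = (1.0 + gain_imb) * (x.imag * np.cos(phase_imb) + x.real * np.sin(phase_imb))
    y = i + 1j * q
    # LO error: a carrier frequency offset rotates the samples over time.
    y = y * np.exp(2j * np.pi * cfo_hz * n / fs)
    # Front-end nonlinearity: memoryless third-order (cubic) compression.
    return y - a3 * y * np.abs(y) ** 2
```

With all impairment parameters set to zero the function returns its input unchanged, so each device can be thought of as one particular nonzero parameter set.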
Figure 1 shows how the digital-to-analog converter (DAC) introduces quantization error and integral nonlinearity. The local oscillators (LOs) introduce frequency offset, the quadrature mixers introduce I/Q imbalance, and the front end introduces nonlinearity distortions.
Moreover, almost all modern digital communication systems contain a stable preamble signal at the front of the data packets; this signal does not change from one transmission to the next. Consequently, the received preamble signal can be used to analyze the impairment of transmit circuits. Note that the impairment of transmit circuits is unique for different devices, and this impairment can be used for RFF identification.
For example, if we assume the received signal sequence is an $M \times 1$ vector $s$, then $s$ can be expressed as shown in equation (1), in which $s_w$ is the noise signal, $s_p$ is the preamble signal, and $s_d$ is the data signal. Using I/Q modulation, we can further express the $m$th sample of $s$ as shown in equation (2):

$$s(m) = s_I(m) + j\,s_Q(m) \tag{2}$$

In equation (2), $s_I(m)$ and $s_Q(m)$ are the in-phase and quadrature components, respectively.
The amplitude of the $m$th sample of $s$ is formulated as shown in equation (3):

$$s_a(m) = \sqrt{s_I(m)^2 + s_Q(m)^2} \tag{3}$$

Then the amplitude changes of $s(m)$ can be detected by using the variance of $s_a(m)$, which can be expressed using the formula shown in equation (4). In equation (4), $\alpha$ is a scaling factor that can be determined by experiment, $L$ is the length of the sliding window, and $a_L$ is the mean value of the $L$ amplitude samples in the sliding window. When the discrete variance vector $v$ is derived, the threshold detection approach (Rasmussen et al., 2007) and the cumulative sum (CUSUM) algorithm (Pignatiello et al., 1990) can be used to find the preamble signal $s_p$.
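The sliding-window variance described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the window is taken to start at each sample index, the default scaling factor `alpha=1.0` is arbitrary, and simple threshold detection is shown in place of CUSUM. The function names `sliding_variance` and `detect_start` are hypothetical.

```python
import numpy as np

def sliding_variance(s, L, alpha=1.0):
    """Scaled variance of the amplitude |s| over each length-L window."""
    a = np.abs(s)  # amplitude sqrt(I^2 + Q^2) of every sample
    windows = np.lib.stride_tricks.sliding_window_view(a, L)
    return alpha * windows.var(axis=1)  # one value per window start index

def detect_start(v, threshold):
    """Index of the first window whose variance exceeds threshold, or -1."""
    idx = np.flatnonzero(v > threshold)
    return int(idx[0]) if idx.size else -1
```

Because a window reacts as soon as its last sample reaches the packet, the detected window index can precede the true signal onset by up to L - 1 samples.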
Because time-domain I/Q signals are sensitive to noise, directly using the preamble $s_p$ to identify different wireless devices will result in poor performance. Consequently, most existing RFF identification algorithms include a feature extraction step in which the nonlinearity of the RF front end is extracted as the most significant RFF. The PSD of the preamble $s_p$ can be calculated as shown in equation (5): In equation (5

RFF IDENTIFICATION
Existing works (Wang et al., 2016; Peng et al., 2019) have tried various algorithms to classify different PSDs. However, the performance of most algorithms is weak at low SNR (from 0 dB to 10 dB). In this section, we propose an RFF identification framework based on metric learning in which LMNN is used to learn a more effective distance metric at low SNR.

Distance Metric
Similarity measurement (i.e., the calculation of the distance metric) is the critical procedure for most RFF identification algorithms; it can affect the overall identification accuracy. For example, the well-known supervised classification algorithm k-nearest neighbor (KNN) is affected profoundly by the distance metric, which should be carefully selected through experience.
Define the dataset of the PSD as $\{(x_i, y_i)\}$, where $x_i \in \mathbb{R}^{d \times 1}$ represents the PSD of the $i$th preamble in the entire dataset, $d$ is the dimension of the vector $x_i$, and $y_i$ is the label for different transmitters. The Euclidean distance between two samples $x_i$, $x_j$ can be expressed as shown in equation (6):

$$D_e^2(x_i, x_j) = \sum_{k=1}^{d} (x_{i,k} - x_{j,k})^2 \tag{6}$$

In equation (6), $x_{i,k} - x_{j,k}$ represents the distance in the $k$th dimension of $x_i$ and $x_j$. However, it has been shown in Figure 2 that high-frequency components are easily corrupted by noise under low SNR. Therefore, we can assign weights to different spectrum components. A simple weighted distance metric between $x_i$ and $x_j$ can be expressed as shown in equation (7):

$$D_w^2(x_i, x_j) = (x_i - x_j)^T W (x_i - x_j) \tag{7}$$

In equation (7), $W$ is a diagonal matrix with $W(i, i) = w_i \geq 0$. Note that the nondiagonal elements of $W$ are all zeros, which implies that different spectral components are uncorrelated. However, there are usually correlations between different spectral components in practical datasets. Thus, a more general symmetric positive semi-definite matrix $M$ will outperform the diagonal matrix $W$. Matrix $M$ can be used to construct a new distance metric using the formula shown in equation (8):

$$D_M^2(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j) \tag{8}$$

In equation (8), the distance $D_M$ is exactly the parameterized Mahalanobis distance. It is easy to see that $D_M$ contains the Euclidean distance $D_e$ (with $M = I$) and the weighted distance metric $D_w$ (with $M = W$) as special cases.
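The relation between the three metrics can be checked numerically. The short Python sketch below evaluates the squared form of the parameterized Mahalanobis distance for three choices of the matrix: the identity, a diagonal weight matrix, and a general positive semi-definite matrix. The sample vectors, the weights `w`, and the random matrix `A` are made-up values for illustration only.

```python
import numpy as np

def mahalanobis_sq(xi, xj, M):
    """Squared parameterized Mahalanobis distance (xi - xj)^T M (xi - xj)."""
    diff = xi - xj
    return float(diff @ M @ diff)

d = 4
xi = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical PSD feature vectors
xj = np.array([0.0, 2.0, 1.0, 4.0])
w = np.array([1.0, 1.0, 0.5, 0.0])    # hypothetical per-bin weights

d_e = mahalanobis_sq(xi, xj, np.eye(d))    # M = I -> squared Euclidean
d_w = mahalanobis_sq(xi, xj, np.diag(w))   # M = W -> weighted distance
A = np.random.default_rng(0).normal(size=(d, d))
d_m = mahalanobis_sq(xi, xj, A.T @ A)      # M = A^T A is positive semi-definite
```

Writing the general matrix as a product of a matrix with its own transpose is one common way to guarantee positive semi-definiteness, so the resulting value can never be negative.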

Robust Classification
Using the parameterized Mahalanobis distance $D_M$, we now introduce a new RFF classification algorithm based on LMNN (Weinberger et al., 2009). LMNN implements a more robust KNN classification based on two simple intuitions. First, each training sample $x_i$ should share the same label $y_i$ with its $k$ nearest neighbors; second, the training samples with different labels should be widely separated. The formulation of LMNN constraints can be expressed as shown in equation (9). In equation (9), $x_i$ is the $i$th training sample, $x_j$ represents the $j$th target sample among its $k$ nearest neighbors, and $x_l$ denotes an impostor (i.e., a sample that does not belong to the same class as $x_i$).
Based on these two intuitive principles, we can further define the loss functions. The first term pulls target neighbors closer by penalizing the large distance between homogeneous samples. This term can be formulated as shown in equation (10), in which $m$ is the number of training samples. The second term pushes heterogeneous samples away by penalizing the small distance between the impostors and the perimeter. This term can be expressed as shown in equation (11), in which $\mathrm{hinge}(z) = \max(z, 0)$ is the standard hinge loss function and $y_{il}$ is given by equation (12):

$$y_{il} = \begin{cases} 1, & y_i = y_l \\ 0, & \text{otherwise} \end{cases} \tag{12}$$

The loss functions shown in equations (10) and (11) can be expressed together as shown in equation (13), in which a weighting parameter is used to balance the two loss functions. Combining the formulas shown in equations (9) and (13), we can formulate the RFF identification problem as shown in equations (14a) and (14b). Note that the above optimization is non-convex and is hard to solve. Nevertheless, because the loss in equation (13) is a piecewise linear convex function of $M$, we can introduce non-negative slack variables $\xi_{ijl}$ to relax equation (9) as a semi-definite program (SDP), as shown in equations (15a) and (15b). In these equations, $\xi_{ijl}$ is a substitution for the hinge loss function $\mathrm{hinge}(z)$. Note that the above optimization is convex and can be efficiently solved using the CVX tools (Boyd & Vandenberghe, 2004).
Figure 3(a) shows a photo of eight ADALM-PLUTO SDR devices. Figure 3(b) shows a photo of the experiment platform.
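As an illustration of the pull and push terms, the Python sketch below evaluates an LMNN-style loss for a given matrix M. It follows the standard formulation of Weinberger et al. (2009), with a unit margin, squared Mahalanobis distances, and a balance weight `mu`; the exact symbols and normalization of equations (10) to (13) in this article may differ. Target neighbors are chosen by Euclidean distance within the same class, and the function evaluates the loss only. It does not solve the SDP.

```python
import numpy as np

def lmnn_loss(X, y, M, k=3, mu=0.5):
    """Pull + push loss of metric M (evaluation only, no optimization)."""
    diff = X[:, None, :] - X[None, :, :]
    d_euc = np.einsum("ijk,ijk->ij", diff, diff)      # squared Euclidean
    d_m = np.einsum("ijk,kl,ijl->ij", diff, M, diff)  # squared Mahalanobis
    pull, push = 0.0, 0.0
    for i in range(len(X)):
        same = np.flatnonzero((y == y[i]) & (np.arange(len(X)) != i))
        targets = same[np.argsort(d_euc[i, same])[:k]]  # k target neighbors
        impostors = np.flatnonzero(y != y[i])           # differently labeled
        for j in targets:
            pull += d_m[i, j]
            # hinge(z) = max(z, 0) applied to the unit-margin violation
            push += np.maximum(0.0, 1.0 + d_m[i, j] - d_m[i, impostors]).sum()
    return (1.0 - mu) * pull + mu * push
```

A well-separated dataset gives a loss dominated by the small pull term, whereas a degenerate metric such as the zero matrix is penalized through the push term because every impostor then violates the margin.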

EXPERIMENTAL EVALUATION

Experimental Setup
In our experiment, we used nine ADALM-PLUTO software-defined radio devices (one receiver and eight transmitters, as shown in Figure 3). These devices were produced in the same batch by Analog Devices (ADI) and worked at 2.4 GHz (Analog, 2020). The modulation scheme was designed according to the IEEE 802.15.4 protocol (IEEE, 2020), including spread spectrum, OQPSK modulation, and a half-sine shaping filter (the oversampling factor was 4). After spread spectrum and modulation, each I/Q channel signal had 128 symbols. The symbols were then passed through the shaping filter and DAC and transmitted in the designated frequency band by carrier modulation. The baseband transmitting rate was 1 MHz. The receiver was set at a distance of 0.1 m from the transmitter, and its sampling rate was 4 MHz. All devices were connected to the computer via a universal serial bus (USB), and Matlab 2019b was used for data processing.
A total of 7,600 samples (950 samples per device) were obtained in the experiment. To get a reliable and stable model, 80% of the data was used as the training set, and the remaining 20% was used as the test set. Different levels of white Gaussian noise were added to simulate various SNR environments (from 0 dB to 30 dB), where the additive white Gaussian noise (AWGN) channel module in Matlab was used.
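The noise injection and data split can be sketched in Python as follows. This is not the Matlab AWGN module used in the experiment; it is an assumed equivalent that measures the signal power of each capture and scales complex Gaussian noise to reach the target SNR. The function names are hypothetical, and the split shown is a plain random split because the article does not state whether the split was stratified per device.

```python
import numpy as np

def add_awgn(s, snr_db, rng):
    """Add complex white Gaussian noise to s at the requested SNR in dB."""
    p_signal = np.mean(np.abs(s) ** 2)             # measured signal power
    p_noise = p_signal / 10.0 ** (snr_db / 10.0)   # required noise power
    noise = rng.normal(size=s.shape) + 1j * rng.normal(size=s.shape)
    return s + np.sqrt(p_noise / 2.0) * noise      # half the power per branch

def split_80_20(n_samples, rng):
    """Random 80/20 split of sample indices into training and test sets."""
    order = rng.permutation(n_samples)
    cut = (n_samples * 4) // 5
    return order[:cut], order[cut:]
```

For the 7,600 samples reported above, this split yields 6,080 training samples and 1,520 test samples.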

Classic Training and Test Strategy
The visualizations of RFFs with and without metric learning are shown in Figure 4, where t-distributed stochastic neighbor embedding (t-SNE) was used (van der Maaten & Hinton, 2008). As shown in Figure 4(a), the RFF samples of different devices lie close to each other in the two-dimensional embedding of the original sample space. In contrast, with metric learning the RFF samples are far away from each other, as shown in Figure 4(b). The results show that metric learning greatly improves the separability of RFFs.
The performances of LMNN, KNN, LDA, and support vector machine (SVM) are compared in Figure 5. It is clear that the identification accuracy of SVM is better than that of other existing algorithms under high SNR (≥ 13 dB). However, the accuracy decreases at low SNR, where the conventional LDA method is better than other existing algorithms. Nevertheless, the identification accuracy of LMNN is higher than that of most existing algorithms with SNR ranging from 0 dB to 30 dB. More detailed results are shown in Table 1. At 30 dB SNR, the identification accuracy of KNN, LMNN, and SVM is nearly equivalent, at about 99%. At 0 dB SNR, the identification accuracy of the KNN algorithm quickly drops to 80.42%. Note that the LMNN algorithm maintains 87.29% at 0 dB.
In Figure 5, both KNN and SVM use PCA to reduce dimensionality (95% variance). LDA reduces the sample to seven dimensions. LMNN reduces the sample to 20 dimensions.
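The PCA step with a 95% variance criterion, followed by KNN, can be sketched in Python with NumPy alone. This is an assumed baseline pipeline for illustration, not the Matlab implementation used in the experiment; the function names and the choice of `k=3` are hypothetical, and class labels are assumed to be non-negative integers.

```python
import numpy as np

def pca_fit(X, var_keep=0.95):
    """Mean and leading principal axes that explain var_keep of the variance."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)      # cumulative variance
    n_comp = int(np.searchsorted(ratio, var_keep) + 1)
    return mean, vt[:n_comp]

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training samples (Euclidean)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(y_train[row]).argmax() for row in nearest])
```

Features are projected with `(X - mean) @ axes.T` before being passed to `knn_predict`, and the same mean and axes fitted on the training set should be reused for the test set.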

Novel Training and Test Strategy Based on Mixed SNR
In the conventional training and test strategy, which is adopted by most existing works, the training set and model parameters must be updated according to different SNRs. However, this strategy results in high memory and time consumption for different SNRs and is difficult to implement in low-cost devices. Moreover, it requires precise estimation of the SNR of test samples, which is hard to achieve in practice. To simplify the conventional strategy, existing works proposed simplifying the training process (i.e., training the algorithm under a fixed SNR while testing it under different SNRs). However, the simplified strategy results in decreased identification accuracy. To overcome this problem, we propose a new training and test strategy in which training is performed on a mixed dataset (consisting of the original received signals with different noise levels from 0 to 30 dB). It is clear in Figure 6 that the identification accuracy of the LMNN algorithm trained by the conventional strategy drops rapidly as the SNR decreases (less than 80% below 25 dB). Note that the proposed training and test strategy is much better than the conventional method primarily because, in the conventional method, the training set and the test set are not independent and identically distributed (i.i.d.) (Bishop, 2006).
In Figure 7, the parameter selection is the same as in Figure 5.
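One way to build such a mixed-SNR training set is sketched below in Python. Two points are assumptions rather than details stated in the article: each captured signal is replicated once at every SNR level, and the SNR grid is taken to be 0 to 30 dB in 2 dB steps, which gives 16 levels. The function name `mixed_snr_training_set` is hypothetical.

```python
import numpy as np

def mixed_snr_training_set(signals, labels, snr_levels_db, rng):
    """Pool noisy copies of every captured signal at every SNR level."""
    xs, ys = [], []
    for snr_db in snr_levels_db:
        for s, label in zip(signals, labels):
            p_noise = np.mean(np.abs(s) ** 2) / 10.0 ** (snr_db / 10.0)
            noise = rng.normal(size=s.shape) + 1j * rng.normal(size=s.shape)
            xs.append(s + np.sqrt(p_noise / 2.0) * noise)
            ys.append(label)  # the device label is unchanged by the noise
    return np.array(xs), np.array(ys)

snr_levels_db = np.arange(0, 31, 2)  # assumed 2 dB grid from 0 to 30 dB
```

A single model trained on the pooled set can then be tested at any SNR without estimating the SNR of the test samples, which is the practical benefit argued for above.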
In the last example, we compared the performance of the proposed algorithm with other existing algorithms under the new training and test strategy. The results in Figure 7 show that the LMNN algorithm maintains stable and high identification accuracy under different SNRs. Moreover, to evaluate the overall performance of the algorithms under different SNRs, we used test samples from 16 different SNR levels to calculate the average identification accuracy of the two strategies, as shown in Table 2. The proposed training and test strategy is better than existing strategies. The average identification accuracy of the LMNN algorithm is about 95.58%, which is higher than that of the other methods.

CONCLUSION
We proposed a new RFF identification method based on metric learning. The nonlinear characteristics of the RF front end from eight SDR devices were used as RFFs. Different from existing works, our research is based on using the LMNN algorithm for RFF identification, which can be formulated as a convex optimization. Moreover, we proposed a novel training and test strategy based on mixed SNR to improve the performance of existing RFF identification methods. Experimental results showed that the proposed LMNN algorithm under the new training strategy achieved 95.58% identification accuracy, which is better than existing methods. Recognizing RFFs remains a challenging task, particularly when the channel environment is unknown and there is a lack of prior information on signal modulation schemes. Although our proposed algorithms showed promising results, further evaluation is necessary to assess their effectiveness in a more complex environment. The data used to support the findings of this study are included within the article.

CONFLICTS OF INTEREST
We declare that there is no conflict of interest regarding the publication of this paper.

FUNDING STATEMENT
This research received no external funding.
), $N_{FFT}$ is the number of FFT points, and $N_{FFT} \geq N$ should be satisfied. The PSDs of two devices under different SNRs are shown in Figure 2. Figure 2(a) shows that the difference between the PSDs is large at high frequency and small at low frequency. Consequently, the PSDs at high frequency can be used to identify different devices. However, Figure 2(b) shows that the PSDs at low SNR (15 dB) are severely affected by noise.
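The PSD feature discussed above can be sketched in Python as a zero-padded periodogram. The scaling by the preamble length, the centering of the zero-frequency bin, and the normalization to a 0 dB peak are assumptions for illustration; the exact form of equation (5) may differ. The function name `psd_feature` is hypothetical.

```python
import numpy as np

def psd_feature(s_p, n_fft):
    """Normalized periodogram (dB) of preamble s_p using an n_fft-point FFT."""
    n = len(s_p)
    if n_fft < n:
        raise ValueError("n_fft must be at least the preamble length")
    spectrum = np.fft.fft(s_p, n=n_fft)   # zero-pads s_p to n_fft points
    psd = np.abs(spectrum) ** 2 / n       # periodogram scaling (assumed)
    psd = np.fft.fftshift(psd)            # center the zero-frequency bin
    return 10.0 * np.log10(psd / psd.max() + 1e-12)  # peak normalized to 0 dB
```

The small constant inside the logarithm only guards against taking the logarithm of zero in empty bins.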
Figure 2(a) shows the normalized PSD of devices 1 and 2 under 30 dB SNR. Figure 2(b) shows the normalized PSD of devices 1 and 2 under 15 dB SNR. These two figures use dB as the PSD unit, and the curves in them are smoothed.

Figure 2. PSDs of two devices under different SNRs

Figure 3. SDR devices and experiment platform

Figure 4. Visualization with t-SNE for different sample space: (a) Original sample space, (b) the new sample space after metric learning

Figure 5. Identification accuracy with different training strategies

Figure 6. Identification accuracy with different training and test strategies

Figure 7. Identification accuracy of different algorithms trained under mixed SNR