One-Factor Cancellable Fingerprint Template Protection Based on Index Self-Encoding

Existing one-factor cancellable biometrics algorithms generally require random sequences to reorder the biometrics, which reduces the discriminability of the transformed biometrics. Some algorithms hide and transmit the random sequence by XORing it with the original biometrics, which may leak the original biometrics. Therefore, this paper proposes a one-factor cancellable fingerprint template protection scheme based on index self-encoding. First, the integer sequence generated by the hash function is used as the index. The index values are used directly for encoding, and the generated binary sequence retains the original biometric characteristics to the greatest extent. Second, the self-encoded binary sequence and the random binary sequence are XORed to obtain the encoded key, so the random binary sequence is never stored directly. Experiments on the FVC2002 and FVC2004 fingerprint databases show that the recognition rate is improved while the scheme satisfies the design criteria of cancellable biometrics.


INTRODUCTION
With the progress of technology in daily life, a variety of data security issues emerge one after another (Dong et al., 2022). People have begun to pay attention to data security and privacy protection (Turesson et al., 2021), and information security has become increasingly important (Bolle et al., 2002; Lin et al., 2020). In identity management, the identification and protection of biometrics are also receiving increasing attention. Common biometrics include fingerprint (Yang et al., 2022), iris (Lai et al., 2017), finger vein (Kirchgasser et al., 2019), and so on. Fingerprint identification technology (Wang & Hu, 2016) is the most convenient and widely used biometric technology, with strong adaptability, easy operation, and high stability. At the same time, biometric templates raise some problems worth noting: because biometric features are irrevocable, they cannot be reissued once they are compromised, so the generated cancellable biometric template must be reproducible; in addition, biometrics are unique, and new biometrics cannot be generated if a user's characteristics are stolen (Ratha et al., 2001). Therefore, biometric template protection is a pivotal and urgent matter.
Cancellable biometric template protection falls into two types: the two-factor cancellable method and the one-factor cancellable method. A two-factor cancellable biometric template protection algorithm needs an extra user-specific parameter, which is a token or password, along with the biometrics as input, to guarantee the unlinkability and revocability of the converted template. For example, Biohash (Teoh et al., 2004, 2006) generates a cancellable biometric template by taking the inner product of a feature vector and a user-specific nonsquare orthogonal random matrix and then binarizing the result with a threshold. Another scheme proposed a cancellable fingerprint template using the local Hadamard transform, in which a randomly generated token k is used to construct a submatrix of a Hadamard matrix for the local Hadamard transformation. A further approach used a randomly generated token matrix A to hide the original biometric information. In a typical two-factor cancellable scheme, user-specific parameters (password or token) are important input factors, but these external factors also cause some problems: (i) it is necessary to keep the token or remember the password; (ii) external factors may be lost, forgotten, or stolen; (iii) the exposure of user-specific parameters may put the converted template at risk of intrusion, especially for some salt-based schemes.
The one-factor cancellable biometric template protection algorithm, as its name suggests, does not require additional input factors and only requires biometrics as input; it therefore avoids the problems caused by external factors. Lee et al. (2018) put forward a one-factor cancellable template protection algorithm based on extended feature vector hashing (EFV). After copying and expanding the original biometrics, a hash function is used to generate a permutation factor that reorders the random sequence to obtain a revocable template. Kong et al. (2021) proposed a one-factor sliding window algorithm based on fingerprints (WSE). After the extended binary biometric vector is combined through the sliding window jumping value, the biometric information is hidden by subsequent steps such as the hash function. Zhang et al. (2021) proposed a one-factor cancellable fingerprint template protection algorithm called feature enhanced hashing (FE). After copying and expanding the original biometric features, an improved hash function is used to calculate the permutation factor, the random sequence is reordered, and a cancellable template is generated by trimming the same length from the beginning and end of the random sequence. Li and Wang (2022) proposed a one-factor fingerprint feature template protection scheme based on the novel minimum hash signature (NMHS) and the secure extended feature vector (SEFV). NMHS generates the fused hash code, and SEFV is used for mapping. A pseudo identifier is generated for matching during registration and verification, without additional storage of keys and biometrics. Finally, the pseudo identifier is matched and identified.
When the above one-factor cancellable biometric template protection schemes reorder the random sequence after computing the hash function, different permutation factors may rearrange the sequence into the same cancellable template, resulting in information leakage. To store the random binary sequence, the existing algorithms perform a simple XOR operation with the original biometric vector to obtain the encoded key. If the database is stolen, some original feature vectors can be restored. Therefore, building on the biometric template protection framework of Lee et al. (2018), this paper presents a one-factor cancellable fingerprint template protection algorithm based on index self-encoding. The algorithm primarily improves the method of converting decimal values to binary and the method of storing random binary sequences. First, this paper uses the integer sequence generated by the hash function as an index and marks the positions corresponding to the index values as '1' and the other positions as '0' to obtain a new binary sequence, that is, the binary sequence generated by index self-encoding. The encoding keeps features of different classes distinct and unique while keeping features of the same class consistent. Second, when the random binary sequence is stored, the binary sequence generated by index self-encoding is XORed with the random binary sequence to obtain the encoded key, which is stored in the database. This improves the security of random binary sequence storage.
This article uses fingerprint templates in binary vector form (Jin et al., 2016), and the experiments use the fingerprint datasets FVC2002 (Maio et al., 2002) and FVC2004 (Maio et al., 2004). It is also demonstrated that the scheme resists both security attacks considered and meets the design criteria of cancellable biometrics.
The main contributions are as follows:
1. An algorithm based on index self-encoding is proposed. The integer sequence generated by the hash function is used as the index, and the index values are used directly for encoding without introducing additional random sequences. The generated binary sequence preserves the original biometric characteristics to the greatest extent.
2. A new random sequence storage algorithm is designed. The two binary sequences obtained by index self-encoding are each XORed with the random binary sequence to obtain, respectively, the revocable fingerprint template used for matching and the encoded key stored in the database. This realizes the revocability of biometrics and prevents attackers from recovering the original biometrics from the encoded key, thereby preventing information leakage.

Locality Sensitive Hashing
Locality sensitive hashing (LSH) (Datar et al., 2004) reduces the dimensionality of high-dimensional data by hashing while preserving similarity as far as possible, so that similar features are mapped to the same position after hashing with increased probability (Aydar & Ayvaz, 2019). An LSH family H is characterized by the property that similar inputs collide with higher probability than dissimilar inputs.
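The formula itself does not appear in this text. A common statement of the LSH property, in the form usually attributed to Datar et al. (2004), is sketched below. The symbols $d$, $r_1$, $r_2$, $p_1$, and $p_2$ are notation assumed here rather than taken from this paper. A family $H$ is called $(r_1, r_2, p_1, p_2)$-sensitive if, for any two points $u$ and $v$ and a hash function $h$ drawn uniformly from $H$:

```latex
\begin{aligned}
d(u, v) \le r_1 &\;\Longrightarrow\; \Pr_{h \in H}\,[\,h(u) = h(v)\,] \ge p_1, \\
d(u, v) \ge r_2 &\;\Longrightarrow\; \Pr_{h \in H}\,[\,h(u) = h(v)\,] \le p_2,
\end{aligned}
\qquad r_1 < r_2,\quad p_1 > p_2 .
```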

Biometric Identification Scheme
Biometric template protection (Rathgeb & Uhl, 2011) is generally divided into cancellable biometrics (Patel et al., 2015) and biometric cryptosystems (Jain et al., 2008), of which cancellable biometrics are the focus of this paper. A cancellable biometric scheme generates an irreversible biometric template from an ordinary biometric template through a transformation function and user-specific parameters. A cancellable biometric template protection scheme needs to possess four properties (Patel et al., 2015): noninvertibility, unlinkability, revocability, and performance preservation. Cancellable biometric authentication methods include two-factor cancellable methods, which combine external factors with biometrics, and one-factor cancellable methods, which use only biometrics. This subsection presents related work on these two methods.
The extended feature vector hashing (EFV) algorithm is a one-factor cancellable biometric template protection algorithm. Specifically, the enrollment process has two inputs: a biometric vector $x$ and a random binary vector $r$. The two vectors undergo a series of operations and are then arranged into a "cancellable template". The random binary vector (the original key) and the fingerprint feature vector are XORed to generate the encoded key, and the encoded key is stored in the database together with the revocable template. Notably, the original key is deleted after enrollment. In the verification process, a query biometric and the encoded key are XORed to generate the decoded key, and the query template is generated from the decoded key. Because the permutation factors are derived from the enrollment and query biometrics themselves, this scheme does not require permutation factors as a second input, unlike typical permutation-based methods.
Biohash is a two-factor cancellable biometric template protection algorithm. Specifically, a random matrix is generated from each user's unique token, the inner product of this matrix with the biometric vector is computed, and the resulting vector is thresholded to obtain a binarized hash code. The Biohash algorithm (Teoh et al., 2004, 2006) was first applied in the field of fingerprint biometrics. Because the Biohash method offers noninvertibility and unlinkability, tests on the dataset achieve good results, with a recognition rate of 100% in the ideal case. The algorithm also satisfies revocability: when template information leaks, the user token can be replaced to cancel the original template and issue a new protected template. The Biohash algorithm takes a biometric vector $x \in \mathbb{R}^{n}$ and an orthogonal random matrix $R \in \mathbb{R}^{n \times q}$ together as input, where $q \leq n$. The cancellable template is generated as follows: (i) obtain the inner product vector $y$ by calculating $y = R^{T} x$; (ii) binarize $y$ against a predefined threshold $\tau$ to obtain the bioCode $b$, with $b_i = 0$ if $y_i \leq \tau$ and $b_i = 1$ if $y_i > \tau$, where $i = 1, \ldots, q$. The Biohash algorithm has also been applied to biometric modalities other than fingerprints, but it must combine biometrics with user tokens to maintain its accuracy. If the user token is leaked, the accuracy of the algorithm is seriously affected. External factors therefore play a decisive role. Attacks that exploit a compromised key and the orthogonal matrices have also been shown to recover the original biometrics.
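The two Biohash steps can be illustrated with a minimal sketch. It assumes that the user token seeds a pseudorandom generator and that the orthonormal columns of $R$ are obtained by Gram-Schmidt orthogonalization; the function names, the seeding scheme, and the default threshold are illustrative choices, not details taken from Teoh et al.

```python
import random


def random_orthonormal_columns(n, q, token):
    """Return q orthonormal vectors of length n, derived from a token.

    Assumption: the token is used as the seed of a pseudorandom generator.
    """
    if q > n:
        raise ValueError("q must not exceed n")
    rng = random.Random(token)
    basis = []
    while len(basis) < q:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Gram-Schmidt: remove the components along the vectors found so far.
        for u in basis:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-9:
            basis.append([a / norm for a in v])
    return basis


def biohash(x, token, q, tau=0.0):
    """Project x onto q token-derived directions and threshold at tau."""
    columns = random_orthonormal_columns(len(x), q, token)
    y = [sum(a * b for a, b in zip(col, x)) for col in columns]  # y = R^T x
    return [1 if yi > tau else 0 for yi in y]


x = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1, 0.0, -0.7, 0.9, 0.4]
code = biohash(x, token=1234, q=6)
```

Calling `biohash` again with the same token reproduces the same code, while a different token yields a different projection, which is what makes the token both the source of revocability and a single point of failure.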
The two-factor cancellable biometric template protection algorithms represented by Biohash require external factors as input, so an attacker may steal the token or password during enrollment and authentication; in addition, storing the token or password is a burden for the user. Therefore, this paper proposes a one-factor cancellable biometric template protection algorithm, which improves the recognition rate and security.

METHOD
This section mainly elaborates on four aspects: framework, algorithm steps, index self-encoding process, and cancellable template generation.

System Framework
Figure 1 shows the framework of the proposed one-factor cancellable template authentication system based on index self-encoding. The method uses only the biometric features $x$ as user input; $r$ is the ancillary data, which is a random binary vector. The cancellable biometric template is obtained from these two vectors through the operations described below.
The authentication system is applied to the fingerprint matching scenario. In the enrollment stage, the input binary biometric vector $x$ is copied and expanded to obtain the sequence $\hat{x}$. The bits of $\hat{x}$ are combined through the sliding window jumping value (Kong et al., 2021) and then converted into a sequence of decimal numbers $\tilde{x}$. Next, the index value $y_1$ is computed by the hash function, and a new binary sequence $w_1$ is obtained by self-encoding $y_1$. Finally, the sequence $w_1$ and the key $r$ are XORed to obtain the final cancellable template $m$. Meanwhile, another hash function generates a hash code $y_2$, which is self-encoded to obtain a binary sequence $w_2$; $w_2$ and the key $r$ are XORed to obtain the encoded random binary vector $t$. After the enrollment phase, the database stores the encoded random binary vector $t$ and the cancellable template $m$. In the verification phase, the key $r'$ is decoded from the ciphertext $t$ and used to generate a cancellable template $m'$. Finally, $m$ and $m'$ are matched to decide whether the match is genuine. In this paper, $w_1$ and $w_2$ are irreversible binary sequences obtained by self-encoding after hashing, and the stored sequences are obtained by XORing $w_1$ and $w_2$ with the random binary sequence $r$. Compared with XORing $x$ and $r$ as in the previous work, this makes it more difficult to derive the original biometric characteristics, which improves irreversibility and further ensures the security of the data. With the improved self-encoding method and the new way of storing the random binary vector, noninvertibility, security, and the recognition rate are all improved.
Figure 2 shows the specific flow chart of the index self-encoding hash algorithm. The detailed steps of the index self-encoding algorithm are as follows:

Hash Algorithm Based on Index Self-Encoding
1. Let $x \in \{0,1\}^{h}$ be a binary biometric vector. Copy $x$ to compose an extended binary biometric vector $\hat{x} \in \{0,1\}^{hn}$, in which $h$ is the length of the vector and $n$ is the system parameter. This replication step allows for better protection of the biometric characteristics.
2. Each subbit block is generated by joining consecutive bits of $\hat{x}$ under a sliding window, where $k$ is the system parameter and is called the window size.
3. Each subbit block is converted into a decimal number, and the two hash functions, each ending in a modulo operation, are computed to generate a real-valued vector. For both hash functions, all elements are incremented by 1 to avoid the case $x_i = 0$, $x_j = 0$, because the index values obtained with and without the increment can differ greatly, which may adversely affect the recognition rate. The modulus $hn + 1$ ensures that the maximum value of $y$ obtained from $x_i$ is $hn$. If the result of the modulo operation is 0, set $y = 1$. This procedure generates $y_1 \in [1, hn]^{hn}$ as a vector of integers, and $y_2 \in [1, hn]^{hn}$, which helps to store the random binary sequence.
4. Generate an all-zero vector of the same length as $y_1$, use the entries of $y_1$ as indices, mark every position that occurs as an index as '1', and leave the rest as '0'. For example, if an entry of $y_1$ equals 300, the 300th position of the all-zero vector is marked as '1'. By checking the set $[1, hn]$ of all position indices, a new binary sequence $w_1$ is generated. In the same way, $y_2$ generates a new binary sequence $w_2$.
5. Let $r \in \{0,1\}^{hn}$ be a binary vector serving as the ancillary data of the algorithm. The cancellable template $m$ is obtained by XORing the binary sequence $w_1$ obtained from index self-encoding with $r$.
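The first three steps (extension, sliding-window blocks, hashing to indices) can be sketched as follows. The exact hash functions are not recoverable from this text, so `toy_hash` is a hypothetical stand-in that only mirrors the stated properties: inputs are incremented by 1, the result is reduced modulo $hn + 1$, and a result of 0 is replaced by 1. Wrapping the window around the end of the vector and pairing positions through a fixed `offset` are likewise assumptions.

```python
def extend(x, n):
    """Step 1: repeat the h-bit vector x n times, giving h*n bits."""
    return list(x) * n


def window_values(x_ext, k):
    """Step 2: read k consecutive bits at each position as one decimal number.

    Assumption: the window wraps around at the end of the vector.
    """
    length = len(x_ext)
    values = []
    for i in range(length):
        value = 0
        for j in range(k):
            value = value * 2 + x_ext[(i + j) % length]
        values.append(value)
    return values


def toy_hash(values, offset):
    """Step 3 stand-in (hypothetical, NOT the paper's hash function).

    Combines element i with element i + offset, incrementing both by 1,
    and maps the result into the index range [1, hn].
    """
    length = len(values)
    indices = []
    for i in range(length):
        j = (i + offset) % length
        y = ((values[i] + 1) * (values[j] + 1) * (i + 1)) % (length + 1)
        indices.append(y if y != 0 else 1)
    return indices


x = [1, 0, 1, 1]                   # h = 4
x_ext = extend(x, 3)               # n = 3, so hn = 12
blocks = window_values(x_ext, 2)   # k = 2
y1 = toy_hash(blocks, offset=1)    # hypothetical first hash
y2 = toy_hash(blocks, offset=5)    # hypothetical second hash
```

Every entry of `y1` and `y2` lies in $[1, hn]$, so each can be used directly as a 1-based position in a vector of length $hn$.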
The following is the pseudocode of the hash algorithm for index self-encoding.
Step 1: Extend the binary biometric vector
For i = 1 : n
    Copy x n times and assign the extension to the extended vector
End for
Step 2: Generate subbit blocks
For i = 1 : hn
    Join the bits of the i-th window into a subbit block, where | is the connection operation
End for
Step 3: Convert each subbit block and compute the hash function
For i = 1 : hn

The Process of Index Self-Encoding
As shown in Figure 3, the combination of binary vectors is converted into decimal numbers through the sliding window hopping value, and the hash function is used to compute the hash values. Then, an all-zero vector is generated, and for each hash value the corresponding position of the all-zero vector is located and marked as '1'. If a position is repeated, it remains '1'. Index positions that do not appear are marked with '0'. This continues until all hash values have been traversed. Finally, a binary sequence based on index self-encoding is obtained.
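The marking procedure described here is short enough to write out directly. The sketch below assumes 1-based index values, as in the example where the value 300 marks the 300th position; the function name is illustrative.

```python
def index_self_encode(indices, length):
    """Mark each position named by an index value as 1; all others stay 0."""
    w = [0] * length
    for y in indices:
        w[y - 1] = 1  # index values are 1-based; repeats leave the bit at 1
    return w


w = index_self_encode([3, 1, 3, 5], length=5)
# Positions 1, 3 and 5 are marked, so w is [1, 0, 1, 0, 1].
```

Because repeated index values collapse onto one bit and the order of the values is discarded, many different index sequences map to the same binary sequence.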

Cancellable Template Generation
If the cancellable template is compromised, the cancellable template $m$ and its encoded key $t$ can readily be replaced by using a new random binary vector $r \in \{0,1\}^{hn}$ to obtain a new cancellable template $m \in \{0,1\}^{hn}$. This step reflects the revocability and replaceability of the scheme.
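Enrollment, verification, and revocation can be sketched together under the relations given in this paper: the template is the XOR of $w_1$ with $r$, and the encoded key is the XOR of $w_2$ with $r$. The decoding step (XORing the query's $w_2$ with $t$) is inferred from those relations, and the similarity score is a hypothetical matcher, since the matching rule is not stated in this section.

```python
import secrets


def xor(a, b):
    """Bitwise XOR of two equal-length bit lists."""
    return [p ^ q for p, q in zip(a, b)]


def enroll(w1, w2, r=None):
    """Return (m, t); the key r itself is discarded and never stored."""
    if r is None:
        r = [secrets.randbits(1) for _ in w1]
    return xor(w1, r), xor(w2, r)


def query_template(w1_query, w2_query, t):
    """Decode r' from t with the query's w2, then rebuild the template m'."""
    r_decoded = xor(w2_query, t)
    return xor(w1_query, r_decoded)


def similarity(a, b):
    """Hypothetical matcher: fraction of positions where the bits agree."""
    return sum(p == q for p, q in zip(a, b)) / len(a)


w1 = [1, 0, 1, 0, 0, 1]
w2 = [0, 1, 1, 0, 1, 0]
m, t = enroll(w1, w2, r=[1, 1, 0, 0, 1, 0])
m_query = query_template(w1, w2, t)            # identical query biometric
m_new, t_new = enroll(w1, w2, r=[0, 0, 1, 1, 0, 1])  # revocation: fresh r
```

With an identical query the decoded key equals the original key, so `m_query` matches `m` exactly. A noisy query changes some bits of both self-encoded sequences, which lowers the similarity score rather than breaking the match outright.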

EXPERIMENTS AND DISCUSSIONS
This paper uses a 256-bit binary fingerprint vector, which is generated in three steps: 1) tiny descriptor extraction (Cappelli et al., 2010), 2) transformation based on kernel learning, and 3) feature vector binarization (Lim et al., 2012). To show that the method has better recognition performance, experiments are conducted on four publicly available fingerprint datasets: FVC2002 (DB1, DB2) (Maio et al., 2002) and FVC2004 (DB1, DB2) (Maio et al., 2004). Each database involves 100 users. The assessment criteria of this experiment follow Cappelli et al. (2006), which assesses the accuracy of the authentication system according to the Genuine/Imposter matching scores and the Equal Error Rate (EER). For each database, five samples per class are used to generate cancellable templates, which yields $100 \times C_{5}^{2} = 1000$ genuine match scores and $C_{100}^{2} = 4950$ imposter match scores.
Because random binary sequences are used, the scheme is tested with five different ancillary data $r$ to make the evaluation rigorous. The resulting equal error rates (EER(%)) are averaged to give the final result.
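The two score counts follow from elementary combinatorics and can be checked in a few lines; the variable names are illustrative.

```python
from math import comb

users = 100
samples_per_user = 5

# Genuine scores: every pair of samples from the same user, for each user.
genuine_scores = users * comb(samples_per_user, 2)

# Imposter scores: one comparison for every pair of distinct users.
imposter_scores = comb(users, 2)
```

The imposter count of 4950 corresponds to a single comparison per pair of users rather than all cross-user sample pairs.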
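As an illustration of how an EER can be estimated from the two score sets, the sketch below sweeps a decision threshold over the observed scores and reports the point where the false accept rate and the false reject rate are closest. It assumes similarity scores (higher means more alike) and is not the evaluation code used in this paper.

```python
def equal_error_rate(genuine, imposter):
    """Return (eer, threshold) where FAR and FRR are closest.

    A comparison is accepted when its score is >= threshold.
    """
    best = None
    for threshold in sorted(set(genuine) | set(imposter)):
        frr = sum(s < threshold for s in genuine) / len(genuine)
        far = sum(s >= threshold for s in imposter) / len(imposter)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


genuine = [0.9, 0.8, 0.7, 0.6]
imposter = [0.1, 0.2, 0.3, 0.65]
eer, threshold = equal_error_rate(genuine, imposter)
```

In this toy data one genuine score falls below 0.65 and one imposter score reaches it, so both error rates equal 0.25 at that threshold. Averaging over five keys, as described above, amounts to repeating this computation for each $r$ and taking the mean.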

Parameters for Index Self-Encoding
This section investigates the effect of the parameters on the authentication performance of the method in terms of EER (%). A lower EER indicates better recognition performance. The two system parameters in this scheme are: (i) the window size $k$ ($k \geq 2$); (ii) the number of repetitive concatenations $n$ ($n \geq 1$).

Influence of Parameter k
For the parameter $k$ ($k \geq 2$), the values 2, 3, 4, and 5 are tested in turn, with experiments conducted at expansion multiples $n$ of 1, 5, 10, and 15. In Figure 4 "EER(%)-vs-k" (FVC2002 DB1), the EER(%) increases as $k$ increases. As described in Algorithm 1, $k$ is the size of each subbit block, so more bits are included in each block as $k$ increases, which raises the probability that a subbit block is affected by noise bits, and thus the EER(%) increases. It can also be observed that, for a fixed $k$, the EER(%) decreases as $n$ increases.

Influence of Parameter n
For the parameter $n$ ($n \geq 1$), with $k = 2$, the expansion multiple $n$ is set to 1, 5, 10, 15, 20, 50, 100, 200, 500, 800, 1000, and 1200. Figure 5 "EER(%)-vs-n" (FVC2004) shows that the EER(%) decreases dramatically as the expansion multiple increases up to 100. When the multiple exceeds 100, the EER(%) decreases only slowly as $n$ increases. To reduce the gap between the two random sequences generated from the enrollment and query biometrics, $n$ should be large. However, $n$ cannot be increased arbitrarily, because a template that is too long wastes storage and is more exposed to theft by attackers.

Performance Evaluation
In this section, the parameter values that gave the best performance in the previous section are selected, and experiments are performed on the databases FVC2002 (DB1, DB2) and FVC2004 (DB1, DB2) with $k = 2$ and $n = 1200$. Table 1 lists the results of several methods for comparative analysis. The experimental comparison shows that the one-factor cancellable fingerprint template protection scheme based on index self-encoding has significantly lower equal error rates (EER) on the four databases than other one-factor cancellable biometric template protection algorithms. Compared with the two-factor cancellable biometric template protection algorithms, the EER of the ISC algorithm is lower in almost all cases; only GRP-IOM Hashing on the fourth database has a smaller EER than the ISC algorithm. A lower error rate means a higher recognition rate. The index self-encoding algorithm enhances the recognition rate because more information from the original features is retained when the hashed integer sequence is used as an index and self-encoded, so that the similarity in biometric template matching is higher. On the whole, the index self-coding (ISC) algorithm improves the recognition rate and removes the problems caused by external factors, which improves security. Therefore, the index self-coding algorithm is better.

Average Processing Time
Table 2 shows the ISC processing time for $n = 1200$ and $k = 2$ in MATLAB R2017a. The processing time includes the total time of the enrollment stage and the verification stage. Table 2 shows that the average processing time for both stages is 0.029 seconds.

Analysis of Noninvertibility
For noninvertibility, this paper XORs the binary sequence $w_2$, generated from $y_2$ by self-encoding, with a random binary sequence $r$ to obtain $t$, which is stored in the database. If an attacker who steals the encoded key $t$ and the revocable template $m$ from the database can deduce $x$ or $\hat{x}$, then noninvertibility is not satisfied. The previous algorithm may allow part of an original biometric template to be restored from the encoded key $t$ and the template $m$. In this article, $t = w_2 \oplus r$, and an attacker can only steal $t$ from the database. The two operands, $w_2$ (derived from $y_2$) and $r$, are both unknown, so the attacker cannot determine either one and therefore cannot infer the other. Hence neither $r$ nor $y_2$ can be deduced. A brute force attack requires $2^{hn} = 2^{256 \times 1200} = 2^{307200}$ guesses, which is computationally infeasible, so recovering the biometric vector by guessing is hard. This enhances the security of the template.

Revocability Analysis
Revocability means that a new template can be generated to replace a compromised template. The Genuine match score, Imposter match score, and Mated-Genuine match score distributions were computed and evaluated. Figure 6 shows the revocability analysis on the four databases, from which it can be observed that the Mated-Genuine and Imposter score distributions overlap considerably. This indicates that templates generated for the same user with different keys $r$ are as dissimilar as templates of different users, so a replaced template cannot be matched against its successor, which satisfies revocability.

Unlinkability Analysis
Unlinkability requires that templates generated with different keys $r$ cannot be linked. Different cancellable biometric templates $m$ are generated by XORing the same biometrics with different keys $r$. This paper verifies the unlinkability of ISC by following the benchmark framework of Gomez-Barrero et al. (2018). The cancellable biometric templates generated by ISC are cross-matched to obtain the mated/nonmated sample score distributions. The unlinkability of the templates is then calculated with the two measures proposed by Gomez-Barrero et al. (2018), a local measure and a global measure, both computed from the mated/nonmated score distributions. The local measure D(s) is a score-wise measure that represents the degree of linkability of the cancellable templates at each score. The global measure Dsys evaluates the unlinkability of the whole system; its value ranges from 0 to 1, and it allows a fairer comparison with other cancellable schemes. As the global measure Dsys decreases, the unlinkability of the cancellable template increases. The unlinkability analysis of the four databases in Figure 7 clearly shows that the mated/nonmated sample score distribution curves overlap, indicating that it cannot be determined whether two cancellable templates come from the same user. Therefore, the index self-coding (ISC) algorithm satisfies the unlinkability criterion. Table 3 lists the detailed values of Dsys for all test datasets for ISC and other one-factor schemes. The Dsys values of the ISC algorithm on the databases are smaller, indicating that the ISC algorithm has higher unlinkability and better security.
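To make the two measures concrete, the sketch below estimates them from histograms of mated and nonmated scores. It follows our reading of the Gomez-Barrero et al. (2018) framework, in which the local measure is derived from the likelihood ratio of the two score densities and the global measure averages it over the mated distribution. The bin count, the prior ratio `omega`, and the function name are assumptions, and this is not the code used to produce Table 3.

```python
def d_sys(mated, nonmated, bins=20, omega=1.0):
    """Histogram estimate of the global linkability measure in [0, 1]."""
    lo = min(min(mated), min(nonmated))
    hi = max(max(mated), max(nonmated))
    width = (hi - lo) / bins or 1.0  # avoid zero width if all scores are equal

    def histogram(scores):
        counts = [0] * bins
        for s in scores:
            counts[min(int((s - lo) / width), bins - 1)] += 1
        return [c / len(scores) for c in counts]

    p_mated, p_nonmated = histogram(mated), histogram(nonmated)
    total = 0.0
    for pm, pn in zip(p_mated, p_nonmated):
        if pm == 0:
            continue  # no mated mass in this bin, nothing to add
        if pn == 0:
            local = 1.0  # scores here occur only for mated pairs
        else:
            lr = pm / pn  # likelihood ratio for this score bin
            local = 2 * lr * omega / (1 + lr * omega) - 1 if lr * omega > 1 else 0.0
        total += pm * local  # weight the local measure by the mated mass
    return total


overlapping = d_sys([0.4, 0.5, 0.6], [0.4, 0.5, 0.6])
separated = d_sys([0.9, 0.95], [0.1, 0.15])
```

Identical distributions give a value of 0 (fully unlinkable), while completely separated distributions give 1 (fully linkable), which matches the interpretation that smaller Dsys values are better.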

Security Analysis
A brute force attack (Najafabadi et al., 2014) is a security attack in which the attacker tries to guess, by enumeration, the query cancellable template that a user presents for matching. For the ISC algorithm, the template has $hn$ bits, so, as computed in the noninvertibility analysis, a brute force attack requires $2^{hn} = 2^{307200}$ guesses. A false accept attack is more critical and practical than a brute force attack, because it aims to access the system illegally with fewer attempts. In an authentication system, access is allowed if the matching score exceeds the threshold $\tau$ set by the system. Taking the fingerprint database FVC2002 DB1 as a verification example, the parameter values are $n = 1200$ and $k = 2$. Figure 8 illustrates that the threshold is $\tau = 0.54$, which means that the minimum number of matching bits for a false accept attack is $hn\tau = 256 \times 1200 \times 0.54 = 165888$. Therefore, the complexity of the false accept attack is $2^{hn\tau} = 2^{165888}$. Although the exponent is only slightly more than half that of a brute force attack, this number of attempts is still infeasible in practice.
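The two attack complexities can be reproduced from the stated parameters. The short calculation below works with exponents only, since the guess counts themselves are far too large to print; the variable names are illustrative.

```python
import math

h, n, tau = 256, 1200, 0.54

brute_force_bits = h * n                    # exponent of the brute force attack
false_accept_bits = round(h * n * tau)      # exponent of the false accept attack

# Number of decimal digits in 2**brute_force_bits, computed without
# building the integer's decimal representation.
brute_force_digits = math.floor(brute_force_bits * math.log10(2)) + 1

exponent_ratio = false_accept_bits / brute_force_bits
```

The ratio of the two exponents equals the threshold, so lowering the decision threshold directly lowers the effort required for a false accept attack.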

CONCLUSION
This paper proposes a one-factor cancellable biometric template protection scheme based on index self-encoding (ISC). In this scheme, binary biometrics are the only input, which avoids the security issues caused by introducing external factors in two-factor cancellable biometric template protection algorithms. Theory and experiments show that index self-encoding preserves the accuracy of the biometric characteristics and further improves the security of the scheme, with recognition rates that compare favorably with other schemes. The ISC algorithm also satisfies noninvertibility, revocability, and unlinkability. The current one-factor scheme only involves fingerprint biometrics; future work can try to protect biometric templates obtained after fusion while maintaining the security and recognition rate of the algorithm.
Liang Tao received the B.S. degree in radio technology and the M.S. degree in circuit and system from Anhui University, Hefei, China, in 1985 and 1988, respectively, and the Ph.D. degree