When an attacker has access to an SDK, for instance the same SDK as the one used by the system he is targeting, he can carry out a hill-climbing attack or a wolf attack.
Hill climbing iteratively reconstructs an image (or feature) starting from an unrelated probe sample: the attacker sends modified versions to the system and keeps the modifications that raise the similarity score. By gradually improving the result, the attacker can eventually compromise the system’s security. Hill climbing can be used to bypass either the sensor or the feature extractor.
A wolf attack differs from hill climbing in that, instead of synthesizing new samples, it starts with a large database of biometric samples and attempts to find the closest match.
In this tutorial, we shall combine the wolf attack with the hill-climbing attack to demonstrate the potential vulnerability of a minutiae-based fingerprint recognition system to these attacks.
According to Doddington’s original paper, a wolf is a biometric sample that belongs to a different person than the target user (victim) yet can produce a very high similarity score.
However, not every target user (victim) is susceptible to a wolf attack. In particular, the biometric traits of subjects who are more unique are more resistant to it. It is therefore sensible to combine both attacks, starting with the wolf attack and following it with hill climbing, in order to further expose the potency of this threat.
The procedure outlined below starts with a wolf attack, using a database to find out which fingerprint templates (references) are closest to the target fingerprint template (reference). This is similar to a dictionary attack since a database is used.
Then, we shall take the top three closest matching fingerprint templates and start hill climbing from them by randomly perturbing their minutiae. The top 500 candidates are retained in each generation, and the algorithm is repeated for a number of iterations (generations).
Why the top three closest templates, and why cap each generation at 500 candidates? The reason is that we want to keep the computation to a minimum while ensuring some diversity. These numbers are parameters that you can optimise.
We shall continue from a previous post in which a database of fingerprint features was prepared using NIST’s minutiae detector mindtct.
A text file named filelist_xyt.txt lists all the processed files in .xyt format. This file list serves as a gallery for zero-effort attacks from which potential wolves can be identified. There are 200 unique fingers with 5 impressions each, making a total of 1000 samples. The 1000 fingerprint templates in *.xyt format are stored in the features directory. The following Octave code generates the content of the file filelist_xyt.txt.
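For readers who work in Python rather than Octave, the same step might be sketched as follows. This is only an illustration: the function name write_file_list is hypothetical, while the features directory and the filelist_xyt.txt file follow the names used in this post.

```python
from pathlib import Path


def write_file_list(feature_dir="features", list_file="filelist_xyt.txt"):
    """Write one .xyt path per line to list_file and return the paths."""
    # Sort so that the line order (and hence the score order) is reproducible.
    paths = sorted(str(p) for p in Path(feature_dir).glob("*.xyt"))
    Path(list_file).write_text("".join(p + "\n" for p in paths))
    return paths
```

Calling write_file_list() from the folder that contains features should leave one line per template in filelist_xyt.txt, i.e. 1000 lines for the database described above.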
Let us now prepare the fingerprint to be attacked by extracting its .xyt features using NIST’s mindtct command line tool below. You should replace fingerprint2attack.jpg with your own file name. In practice, of course, an attacker would not have access to this raw data or its corresponding feature file in .xyt format.
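As a rough Python sketch of this step, the helper below wraps the mindtct call. It assumes the usual invocation form, mindtct followed by the image file and an output root, and that mindtct is on your PATH; check the usage message of your NBIS build before relying on it. The function names are hypothetical.

```python
import subprocess


def mindtct_command(image_file, output_root):
    """Build the argument list: mindtct <image_file> <output_root>."""
    return ["mindtct", image_file, output_root]


def extract_minutiae(image_file, output_root):
    """Run mindtct and return the name of the .xyt file it should produce."""
    # check=True raises CalledProcessError if mindtct exits with an error.
    subprocess.run(mindtct_command(image_file, output_root), check=True)
    return output_root + ".xyt"
```

With the file names used in this post, the call would be extract_minutiae("fingerprint2attack.jpg", "file2attack").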
The above command outputs file2attack.xyt among other auxiliary files (which we won’t need).
We shall now invoke bozorth3 to match the target file file2attack.xyt against the gallery of templates listed in filelist_xyt.txt; the result is stored in file2attack.scores.
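In Python, the matching step might be sketched as below. It assumes bozorth3 is on your PATH and uses its -p option (a single probe file) and -G option (a text file listing the gallery files, one per line), with the scores captured from standard output. The function names are hypothetical, and the flags are worth confirming against the usage message of your own NBIS build.

```python
import subprocess


def bozorth3_command(probe_xyt, gallery_list):
    """Build the bozorth3 call: one probe against a list of gallery files."""
    return ["bozorth3", "-p", probe_xyt, "-G", gallery_list]


def match_against_gallery(probe_xyt, gallery_list, score_file):
    """Run bozorth3 and store whatever it prints (the scores) in score_file."""
    result = subprocess.run(
        bozorth3_command(probe_xyt, gallery_list),
        check=True, capture_output=True, text=True,
    )
    with open(score_file, "w") as fh:
        fh.write(result.stdout)
    return score_file
```

For this tutorial the call would be match_against_gallery("file2attack.xyt", "filelist_xyt.txt", "file2attack.scores").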
We could have run the command in bash directly.
As you might have guessed, the command unix gives you access to the shell environment, where you can execute a bash command.
The next lines of code load the scores, sort them and plot the sorted scores.
The last line saves the figure in .png format.
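A minimal Python sketch of the loading and sorting part is given below. It assumes that each line of the score file starts with an integer score, and it leaves the plotting to whichever library you prefer. The function names are hypothetical.

```python
def load_scores(score_file):
    """Read one integer score per non-empty line (first field of the line)."""
    with open(score_file) as fh:
        return [int(line.split()[0]) for line in fh if line.strip()]


def sort_scores(scores):
    """Return (sorted_scores, order), highest score first.

    order[i] is the position in the gallery list of the i-th best score.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [scores[i] for i in order], order
```

The first list is what would go on the vertical axis of the sorted-score plot; the second list lets you map each rank back to a line of filelist_xyt.txt.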
What does the wolf fingerprint, i.e., the fingerprint that is closest to the target template, look like? To answer this question, I have written a function, display_xyt, to display the minutiae. Because it is an image, we have to reverse the x-axis, and this custom function does this for you. The function also takes a magnitude variable, mag, which controls the length of the arrow drawn for each minutia, for visualisation purposes only.
Simply call this function for a given minutia template and its corresponding image.
The code below finds the closest template.
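As a Python sketch of this lookup, the helper below returns the best-scoring gallery entries together with their scores. The function name top_k_wolves and its parameters are hypothetical; k=3 mirrors the three seed templates used later for hill climbing, and k=1 gives the single closest template.

```python
def top_k_wolves(scores, gallery_files, k=3):
    """Return the k gallery files with the highest scores as (file, score)."""
    if len(scores) != len(gallery_files):
        raise ValueError("scores and gallery_files must have the same length")
    # Sort on the score only, highest first; ties keep their gallery order.
    ranked = sorted(zip(scores, gallery_files),
                    key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in ranked[:k]]
```

Here scores is assumed to be in the same order as the lines of filelist_xyt.txt, which is how the matcher was fed the gallery.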
We now get the filename of the original fingerprint in .png format. If you want to copy and paste the code below, you should replace the variable original_png_folder with the folder where your original fingerprint images in .png format are stored.
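In Python, the mapping from a template to its image might be sketched as follows. It assumes that each .png image shares its base name with the corresponding .xyt file, which may not hold for your own data; the function name is hypothetical.

```python
from pathlib import Path


def png_for_template(xyt_path, original_png_folder):
    """Map e.g. features/NAME.xyt to <original_png_folder>/NAME.png."""
    return str(Path(original_png_folder) / (Path(xyt_path).stem + ".png"))
```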
Finally, having loaded the image and the xyt file, we are ready to call the display_xyt function.
This produces an output similar to the one below:
In order to carry out the hill-climbing attack, we shall define a supporting function which interfaces with NIST’s bozorth3 matcher. This function takes in a fingerprint template in .xyt format, which has four features per minutia, namely row, column, orientation and quality, and then outputs a set of slightly modified templates with some perturbation as controlled by pertube.
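A possible Python sketch of the supporting pieces is shown below: reading and writing a template, and scoring one in-memory candidate against the target. It assumes that each line of an .xyt file holds four whitespace-separated integers, and that bozorth3 accepts a pair of files on the command line and prints a single score. All function names are hypothetical.

```python
import os
import subprocess
import tempfile


def read_xyt(path):
    """Load a template as a list of integer tuples, one tuple per minutia."""
    with open(path) as fh:
        return [tuple(int(v) for v in line.split()) for line in fh if line.strip()]


def write_xyt(path, template):
    """Write one minutia per line, fields separated by single spaces."""
    with open(path, "w") as fh:
        for minutia in template:
            fh.write(" ".join(str(v) for v in minutia) + "\n")


def score_candidate(template, target_xyt):
    """Match one candidate against the target file and return the score."""
    fd, tmp = tempfile.mkstemp(suffix=".xyt")
    os.close(fd)
    try:
        write_xyt(tmp, template)
        out = subprocess.run(["bozorth3", tmp, target_xyt], check=True,
                             capture_output=True, text=True).stdout
        return int(out.split()[0])
    finally:
        os.remove(tmp)  # the temporary probe file is no longer needed
```

Writing every candidate to disk and starting one process per match is slow; it is shown this way only because it is the simplest interface to a command-line matcher.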
The function below, evolve_sample, simply produces 30 novel fingerprint templates, each of which differs from its parent by a simultaneous change in the row and column of one minutia. Note that this is just one of the many possible ways to evolve novel fingerprint templates. You can add or remove a minutia or change the orientation of a minutia. This is left as a hands-on exercise.
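To make the idea concrete, here is a Python sketch of one way such a function could work. It reuses the name evolve_sample and the pertube parameter from this post, but the body is an illustration and not the original Octave code; n_offspring and rng are hypothetical parameters, and a template is assumed to be a list of four-field tuples.

```python
import random


def evolve_sample(template, n_offspring=30, pertube=3, rng=None):
    """Return n_offspring copies of template, each with one minutia moved.

    The move changes the first two fields (the position) at the same time,
    by a non-zero amount of at most pertube in each direction.
    """
    rng = rng or random.Random()
    offspring = []
    for _ in range(n_offspring):
        child = list(template)            # shallow copy; tuples are immutable
        i = rng.randrange(len(child))     # pick the minutia to move
        a, b, theta, quality = child[i]
        da = rng.choice([-1, 1]) * rng.randint(1, pertube)
        db = rng.choice([-1, 1]) * rng.randint(1, pertube)
        # Clamp at zero so that coordinates never become negative.
        child[i] = (max(0, a + da), max(0, b + db), theta, quality)
        offspring.append(child)
    return offspring
```

Passing a seeded random.Random instance as rng makes a run repeatable, which helps when comparing parameter settings.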
As you may have noticed, we have not made use of the temperature parameter in any way. You could, for instance, use it to control the amount (magnitude) of perturbation. The algorithm used here is reminiscent of simulated annealing.
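One way to put a temperature to work, sketched in Python below, is a linear cooling schedule that shrinks the maximum shift as the generations go by. The schedule, the function name and the default bounds are all assumptions made for illustration; other schedules (for example geometric cooling) would serve equally well.

```python
def perturbation_magnitude(generation, n_generations, max_shift=8, min_shift=1):
    """Largest allowed shift for this generation, cooling linearly over time."""
    if n_generations <= 1:
        return min_shift
    # frac runs from 0.0 (first generation) to 1.0 (last generation).
    frac = generation / (n_generations - 1)
    return max(min_shift, round(max_shift - frac * (max_shift - min_shift)))
```

Large shifts early on encourage exploration, while small shifts near the end only fine-tune candidates that already score well.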
We now need to write the core function, which maintains, say, 100 generations. In each generation, we need to cap the maximum number of candidates that we evaluate in order to reduce the computational complexity. This cap is set to 500.
The steps taken are as follows:
At the beginning of the loop, the algorithm takes all the candidates of the current generation and evaluates their fitness in terms of similarity score. Once the fitness values are evaluated, only the top 500 candidates are kept, ready for the next round of evolution; the process repeats in this way until the desired number of generations is reached (100 in this case). Below are the code fragments that implement this idea.
The last statement saves the key variables: xyt_genration stores all the top candidates (up to 500), whereas fitness stores the candidates’ fitness values.
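The loop described above can be sketched in Python as follows. The matcher and the mutation step are passed in as functions so that the sketch stays independent of bozorth3. Keeping the survivors alongside their offspring is an assumption of this sketch, and the names hill_climb, fitness_fn and evolve_fn are hypothetical.

```python
def hill_climb(seeds, fitness_fn, evolve_fn, n_generations=100, cap=500):
    """Evaluate, keep the top `cap` candidates, evolve them, and repeat.

    Returns (survivors, fitness, best_per_generation) for the last generation.
    """
    population = list(seeds)
    if not population:
        raise ValueError("at least one seed template is required")
    survivors, fitness, best_per_generation = [], [], []
    for _ in range(n_generations):
        # Score every candidate of the current generation, best first.
        scored = sorted(((fitness_fn(c), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)[:cap]
        fitness = [score for score, _ in scored]
        survivors = [cand for _, cand in scored]
        best_per_generation.append(fitness[0])
        # Next generation: the survivors plus all of their offspring.
        population = survivors + [child for cand in survivors
                                  for child in evolve_fn(cand)]
    return survivors, fitness, best_per_generation
```

Because the survivors are carried over unchanged, the best score can never drop from one generation to the next. In this tutorial the seeds would be the three closest wolf templates, fitness_fn a call to the matcher against the target, and evolve_fn the template perturbation step.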
We shall analyse the number of (capped) candidates in each generation.
Then, plot the best scores along with the wolf scores for comparison.
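Before plotting, the numbers behind both figures can be gathered with a small helper such as the Python sketch below. It assumes you have kept the fitness values of every generation in a list of lists; the function name and the return layout are hypothetical.

```python
def summarise_generations(generations, wolf_score):
    """Summarise a run, given one list of fitness values per generation.

    Returns (counts, best, overtaken_at), where overtaken_at is the index of
    the first generation whose best score exceeds wolf_score, or None.
    """
    counts = [len(g) for g in generations]
    best = [max(g) for g in generations]
    overtaken_at = next((i for i, b in enumerate(best) if b > wolf_score), None)
    return counts, best, overtaken_at
```

counts gives the number of (capped) candidates per generation, and best is the curve to draw against the horizontal line of the wolf score.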
What do we observe here? The wolf attack is more efficient than the hill-climbing attack in this case. There is certainly room for improvement in the hill-climbing attack.
So, what improvements can we make here?
We can also plot the top candidate from one generation to another.