December 31, 2017

# Tutorial: An alternative biometric experimental protocol

### Purpose

In a previous post, I discussed an exhaustive biometric experimental protocol. This is not only the simplest way to generate scores for evaluating a biometric experiment, but it also makes use of all available biometric samples. However, the downside of this approach is that if there are $N$ samples, there will be $N(N-1)/2$ comparisons. Among these comparisons, the ratio of genuine scores to impostor scores is roughly $1:N$, so it grows with $N$, which can be significant for large $N$. In this article, I discuss an alternative experimental protocol which aims to reduce the number of impostor scores needed, thus reducing the disparity between the number of genuine and impostor scores, without significantly sacrificing the accuracy of the performance estimate.

#### Motivation

Suppose we have five sessions of data, so $S=5$, and five unique fingers, so $J=5$. We shall reserve the samples in session 1 as templates and use the samples in sessions 2 through 5 as queries. The comparisons can then be arranged in four matrices, each of which is the result of comparing the templates in the gallery (aligned along the rows) with the probes (aligned along the columns) of sessions 2 through 5, respectively, as shown below.

The diagonal elements of each of the four matrices are hence the genuine scores, whereas the off-diagonal elements are the impostor scores. Therefore, the number of genuine scores is $J(S-1)$ and the number of impostor scores is $J(J-1)(S-1)$. So, the genuine-to-impostor score ratio is $1:(J-1)$. This is still not satisfactory since the ratio changes with an increasing number of users (or unique fingers).

One way is to randomly select the impostor samples from the query sessions (all sessions except session 1, which is reserved for templates). So, we shall harvest all the genuine scores from the matrices but sub-sample the impostor scores, randomly taking only one of the $S-1$ query sessions for each impostor pair. This process is illustrated in the figure below.

As a result, the number of impostor scores is $J(J-1)$. The ratio between the number of genuine scores and the impostor ones is therefore $(S-1):(J-1)$.
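As a quick sanity check of these counts, a minimal sketch with $J=5$ fingers and $S=5$ sessions:

```python
# Score counts for the two protocols with J unique fingers and S sessions
# (session 1 is reserved for templates, sessions 2..S are queries).
J, S = 5, 5

# Single-session template protocol: every query is compared with every template.
genuine = J * (S - 1)                  # one genuine comparison per query sample
impostor_full = J * (J - 1) * (S - 1)  # each query vs. all non-matching templates

# Minimalist protocol: keep all genuine scores, but for each impostor pair
# sample only one of the S-1 query sessions.
impostor_min = J * (J - 1)

print(genuine, impostor_full, impostor_min)  # 20 80 20
```

With these values the genuine-to-impostor ratio improves from $1:4$ to $1:1$.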

#### Exercise

Let us continue from the previous tutorial, in which 1000 fingerprints were extracted and stored in the features directory. We first get the filenames using the code fragment below.
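A minimal sketch of the listing step. The directory name `features` and the `.xyt` extension (bozorth3's minutiae format) are assumptions carried over from the previous post; adjust them to your own layout.

```python
import glob
import os

def list_features(directory):
    """Return the sorted feature filenames (assumed to be .xyt minutiae files)."""
    return sorted(glob.glob(os.path.join(directory, "*.xyt")))

# e.g. filenames = list_features("features")  # the 1000 files from the previous post
```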

Write two helper functions.

Get the user and session indices
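One possible sketch of the two helpers. The filename convention `u<user>_s<session>.xyt` is a hypothetical example, not necessarily the one used in the original post; adapt the regular expression to your own naming scheme.

```python
import os
import re

# Assumed naming convention: u<user>_s<session>.xyt, e.g. u0007_s3.xyt
_PATTERN = re.compile(r"u(\d+)_s(\d+)\.xyt$")

def get_user(filename):
    """User (unique finger) index parsed from the feature filename."""
    return int(_PATTERN.search(os.path.basename(filename)).group(1))

def get_session(filename):
    """Session index parsed from the feature filename."""
    return int(_PATTERN.search(os.path.basename(filename)).group(2))

print(get_user("features/u0007_s3.xyt"), get_session("features/u0007_s3.xyt"))  # 7 3
```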

Write the query samples to a file.
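A sketch of this step, again assuming the hypothetical `u<user>_s<session>.xyt` naming convention (the session parser is repeated here so the snippet is self-contained):

```python
import os
import re

_PATTERN = re.compile(r"u(\d+)_s(\d+)\.xyt$")

def get_session(filename):
    """Session index parsed from the (assumed) u<user>_s<session>.xyt name."""
    return int(_PATTERN.search(os.path.basename(filename)).group(2))

def write_queries(filenames, out="queries.tmp"):
    """Write every sample from sessions 2 onwards (the queries) to a list file."""
    queries = [f for f in filenames if get_session(f) > 1]
    with open(out, "w") as fh:
        fh.writelines(q + "\n" for q in queries)
    return queries

qs = write_queries(["features/u0001_s1.xyt", "features/u0001_s2.xyt",
                    "features/u0002_s1.xyt", "features/u0002_s5.xyt"])
print(qs)  # ['features/u0001_s2.xyt', 'features/u0002_s5.xyt']
```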

The top rows of the file `queries.tmp` contain the following lines.

We shall compare each of these query samples with a list of templates, which by definition belong to session 1.

The following code achieves this by invoking the NIST fingerprint matcher, `bozorth3`, in order to obtain the score matrix `scores`.
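A minimal sketch of the matching step, using the simplest pairwise invocation of `bozorth3` (one score per call). For large experiments the matcher's batched list options are far faster, so treat this as an illustration rather than the original implementation.

```python
import subprocess

def bozorth3_cmd(probe, template):
    """Command line for one pairwise comparison of two .xyt minutiae files."""
    return ["bozorth3", probe, template]

def bozorth3_score(probe, template):
    """Run the NIST bozorth3 matcher and return its integer similarity score."""
    out = subprocess.run(bozorth3_cmd(probe, template),
                         capture_output=True, text=True, check=True)
    return int(out.stdout.split()[0])

# scores[i][j]: query sample i against the session-1 template of finger j
# scores = [[bozorth3_score(q, t) for t in templates] for q in queries]
```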

We now generate the genuine score mask using the `user` and `session` metadata associated with the list of query samples.
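One way to build the mask is to broadcast the template user indices against the query user indices; a sketch with a toy example of $J=3$ fingers and two query sessions:

```python
import numpy as np

def genuine_mask(template_users, query_users):
    """mask[i, j] is True when template i and query j come from the same finger."""
    t = np.asarray(template_users)[:, None]  # one row per template (finger)
    q = np.asarray(query_users)[None, :]     # one column per query sample
    return t == q

# Toy example: 3 fingers, 2 query sessions per finger.
mask = genuine_mask([1, 2, 3], [1, 1, 2, 2, 3, 3])
print(mask.astype(int))
```

The impostor mask is simply the logical negation of the genuine one.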

Let us see how the genuine score matrix looks by zooming into the top 20 rows (corresponding to the first 20 unique fingers).
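A sketch of the zoomed view, built on a toy mask with 20 fingers and four query sessions (matplotlib is imported lazily so the data code still runs in a headless environment):

```python
import numpy as np

def plot_top_rows(mask, rows=20):
    """Display the first `rows` rows of a score mask as an image."""
    import matplotlib.pyplot as plt
    plt.imshow(mask[:rows], interpolation="nearest", aspect="auto")
    plt.xlabel("query samples (sessions 2-5)")
    plt.ylabel("templates (unique fingers)")
    plt.show()

# Toy mask with J = 20 fingers and S - 1 = 4 query sessions: each row holds
# a run of four consecutive genuine pixels.
template_users = np.arange(20)
query_users = np.repeat(np.arange(20), 4)
mask = template_users[:, None] == query_users[None, :]
print(mask.shape)  # (20, 80)
# plot_top_rows(mask)  # uncomment to display
```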

As can be observed, the genuine scores are arranged in runs of four consecutive pixels along each row, corresponding to the query samples in sessions 2, 3, 4, and 5.

Let us plot the masks for the genuine and impostor scores, as well as their corresponding score matrices.

The upper left and right diagrams show where the genuine and impostor scores are located. The locations are indicated by red pixels (with values of one) against the blue background pixels (with values of zero).

To simplify the process, we shall create two helper functions so that you can reuse them in the future. The first one, `generate_mask`, generates the mask, and the second one, `get_scores`, extracts the scores given the mask:
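One possible sketch of the two helpers (the signatures here are assumptions, not the original implementation):

```python
import numpy as np

def generate_mask(template_users, query_users, genuine=True):
    """Boolean mask over the score matrix.

    genuine=True selects the genuine cells (same finger in row and column);
    genuine=False selects the impostor cells.
    """
    m = np.asarray(template_users)[:, None] == np.asarray(query_users)[None, :]
    return m if genuine else ~m

def get_scores(scores, mask):
    """Flatten the score matrix entries selected by the mask."""
    return np.asarray(scores)[mask]

# Toy example: 2 fingers, 1 query session each.
scores = np.array([[40, 5], [7, 55]])
gen = get_scores(scores, generate_mask([1, 2], [1, 2], genuine=True))
imp = get_scores(scores, generate_mask([1, 2], [1, 2], genuine=False))
print(gen, imp)  # [40 55] [5 7]
```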

Apply the two functions and plot the various curves using `wer`.
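`wer` is the curve-plotting helper from the previous post. As a library-free stand-in, here is a minimal sketch that estimates the equal error rate (EER) from the two score sets by sweeping thresholds; it is a generic substitute, not the original function:

```python
import numpy as np

def eer(genuine, impostor):
    """Roughly estimate the equal error rate by sweeping score thresholds."""
    genuine, impostor = np.asarray(genuine), np.asarray(impostor)
    best = 1.0
    for t in np.unique(np.concatenate([genuine, impostor])):
        frr = np.mean(genuine < t)    # genuine attempts rejected
        far = np.mean(impostor >= t)  # impostor attempts accepted
        best = min(best, max(far, frr))
    return best

print(eer([50, 60, 70], [5, 10, 80]))  # 0.3333...
```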

Again, we can do the same using `logit_transform`, defined below.
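A sketch of `logit_transform`. Since bozorth3 scores are unbounded non-negative integers, they are first squashed into $(0, 1)$ with a min-max normalisation; the exact normalisation used in the original post is an assumption here.

```python
import numpy as np

def logit_transform(scores, eps=1e-6):
    """Map scores to the logit domain so heavy-tailed distributions spread out."""
    s = np.asarray(scores, dtype=float)
    p = (s - s.min()) / (s.max() - s.min())  # squash into [0, 1]
    p = np.clip(p, eps, 1 - eps)             # keep log() finite at the edges
    return np.log(p / (1 - p))               # logit(p)

out = logit_transform([0, 5, 10])
print(out)  # the midpoint score maps to 0
```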

Apply `logit_transform` to the scores so that the distributions become more visible before plotting the curves.

### Summary

In this post, I have discussed how to define a biometric experiment protocol such that we need only a minimal number of comparisons. An exhaustive comparison protocol makes $\frac {JS (JS-1)} 2 = O((JS)^2)$ comparisons (in the big O notation), where $J$ is the number of unique fingers and $S$ is the number of samples per unique finger.

A protocol that holds only session-1 samples as templates (the single-session template protocol) requires $J(S-1) + J(J-1)(S-1) = J^2(S-1) = O(J^2 S)$ comparisons.

Finally, an even simpler, minimalist protocol needs $J(S-1) + J(J-1) = J(J+S-2) = O(J(J+S))$ comparisons, which is significantly smaller.