# IBBS - Exam Preparation
### Common frame of reference and fiducial points
For example, if you have multiple images of the same scene, but each image was taken from a different location or at a different angle, it can be difficult to compare or align the images. By using a common frame of reference, such as a set of fiducial points that are visible in each image, it is possible to align the images so that they can be compared and analyzed.
A common frame of reference is a **standardized coordinate system** that allows for consistent and **accurate description** of the **position** and **orientation** of features in an image, and enables **comparison**, **alignment** and **positioning** of different images.
### Biometrics
Use of **physical** or **behavioral characteristics**, such as fingerprints, face, iris, voice, gait,..., for the purpose of identification and authentication.
**Positive recognition:** a sample represents a subject **known** to the system.
**Negative recognition:** a sample represents a subject **unknown** to the system
**Biometric modality**: physical or behavioral property we use for biometric recognition
**Identification:** The goal is to determine the identity of an individual based on their characteristics or credentials
**Verification:** The goal is to confirm that the identified person is who they claim to be.
In short, identification is the process of determining who someone is, while verification is the process of confirming that the identified person is who they claim to be.
**Authentication** refers to the process of confirming an individual's identity (authentication confirms the identity of an individual that has already been identified).
**Biometric sample**: raw or preprocessed data collected from the subject by a sensor
### Templates
A set of **features** or characteristics that are **extracted** from an individual's biometric sample, such as a fingerprint, face, iris, or voice. These templates are used to represent the individual's biometric identity.
These templates are often used in biometric systems for the **purpose** of **matching** and **identification**.
**Assumptions** are typically made by feature extractors when extracting features from a biometric sample.
### Matching
Matching is the process of **comparing** a **biometric feature** (such as a face or fingerprint) from an **image** to a **stored template** in order to determine if there is a match.
**Matching assumptions:** assumptions made about the biometric data and the matching algorithm used to compare it
### Enrollment
Enrollment refers to the process of **capturing**, **storing**, and **maintaining** **biometric data** of an individual in a biometric system. A **template** can be either a **vector of features** extracted from the image or the **image itself**. The process of enrollment is usually done at the time of **initial registration**, and the stored template is later used for **comparison** during the **matching** process. The enrollment process is a **one-time** process for each individual, and the template is then used for **future identification** and **verification**.
**Selecting the best samples**: have the highest image quality and provide the best representation of the biometric feature
**Merging the best samples:** merged together to create a composite template that is more robust and accurate than any single sample
**Quality control:** to ensure that the selected and merged samples are of high quality and accurate representation of the individual's biometric feature
### Error statistics
**False Match:** a situation where the system incorrectly matches a sample to a stored template that belongs to a **different individual** (an impostor sample is accepted).

**False Non-Match:** a genuine sample did not match its reference template.

#### Problems:
* Problem 1: **Biometric verification** – "Why am I rejected?": users may not understand why they are being rejected by the system, despite their biometric information being legitimate.
* Problem 2: **Biometric (mis)identification** –“Why am I delayed as a suspect?”: users may be incorrectly identified as a suspect, despite their biometric information being legitimate. (FMR for example in terrorists databases)
* Problem 3: **Biometric identification** – "Who can I be today?": the same person may be able to present different biometric characteristics and thus hold **multiple identities**, with different biometric information, and use them to bypass the system.
**FAR** - false acceptance rate (or FMR)
FAR is the proportion of **imposters** that are **incorrectly accepted** by the system. FAR is calculated as the number of false acceptances divided by the total number of imposters. A high FAR means that the system is too permissive and is allowing unauthorized access
**FRR** - false rejection rate (or FNMR)
FRR is the proportion of **genuine users** that are **incorrectly rejected** by the system. It is calculated as the number of false rejections divided by the total number of genuine attempts. A high FRR means that the system is too restrictive and is rejecting legitimate users.
**FRR** and **FAR** are in a **trade-off relation**: as the decision threshold is changed, one decreases while the other increases. The optimal balance between FRR and FAR depends on the specific requirements of the system and the acceptable level of security and user convenience.
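A minimal numpy sketch (not from the course material) of how FAR and FRR can be estimated from genuine and impostor similarity scores at a chosen threshold; the scores and the threshold below are hypothetical:

```python
import numpy as np

# Hypothetical similarity scores (higher = more similar).
genuine_scores = np.array([0.91, 0.85, 0.78, 0.66, 0.95])   # same-person comparisons
impostor_scores = np.array([0.12, 0.45, 0.60, 0.33, 0.05])  # different-person comparisons

threshold = 0.55  # accept a comparison if score >= threshold

# FAR (FMR): fraction of impostor attempts that are wrongly accepted.
far = np.mean(impostor_scores >= threshold)
# FRR (FNMR): fraction of genuine attempts that are wrongly rejected.
frr = np.mean(genuine_scores < threshold)

print(f"FAR = {far:.2f}, FRR = {frr:.2f}")
```

Sweeping the threshold and plotting FAR against FRR gives the DET curve described below.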

**A Detection Error Tradeoff** (DET) curve represents the trade-off between the false acceptance rate (FAR) and the false rejection rate (FRR) for various decision thresholds.
### ROC curve

True Positive Rate (**TPR**), also known as sensitivity or recall, is the proportion of actual positive cases (e.g. **genuine biometric samples**) that are **correctly classified** as positive by the classifier.
False Positive Rate (**FPR**) is the proportion of actual negative cases (e.g. **impostor samples**) that are classified as positive by the classifier.
**AUC**: The area under the ROC curve (AUC) is commonly used to summarize the overall performance of a classifier and can range from 0 to 1, with 1 indicating perfect classification and 0.5 indicating random classification.
#### EER (equal error rate)

It is the point at which the false acceptance rate (**FAR**) and the false rejection rate (**FRR**) are **equal**.
The system is not performing better for one type of error over the other.
#### CMC (cumulative match curve)

The rank-1 identification rate is the probability that the true match is ranked first in the list of candidate matches. The rank-n identification rate is the probability that the true match is ranked among the first n candidates in the list.
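A small sketch of how rank-n identification rates for a CMC curve can be computed; the similarity matrix and the function name are illustrative assumptions:

```python
import numpy as np

def cmc(similarity, true_gallery_index, max_rank=5):
    """Rank-1..rank-max_rank identification rates from a (probes x gallery) similarity matrix.

    similarity[i, j]      -- similarity of probe i to gallery subject j
    true_gallery_index[i] -- index of the correct gallery subject for probe i
    """
    # For each probe, sort gallery subjects from most to least similar.
    order = np.argsort(-similarity, axis=1)
    # Rank (1-based) at which the correct subject appears for each probe.
    ranks = np.array([np.where(order[i] == true_gallery_index[i])[0][0] + 1
                      for i in range(similarity.shape[0])])
    return np.array([np.mean(ranks <= r) for r in range(1, max_rank + 1)])

# Hypothetical 3 probes vs. 4 gallery subjects.
sim = np.array([[0.9, 0.2, 0.4, 0.1],
                [0.3, 0.8, 0.7, 0.2],
                [0.6, 0.5, 0.1, 0.4]])
print(cmc(sim, true_gallery_index=[0, 2, 1], max_rank=3))  # rank-1, rank-2, rank-3 rates
```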
#### Accuracy

Accuracy is calculated by dividing the number of **correct classifications** by the total number of classifications made.
Where the number of correct classifications is the sum of the number of genuine samples correctly accepted by the system and the number of impostor samples correctly rejected by the system, and the total number of classifications is the sum of the number of genuine samples and the number of impostor samples.
#### Recall

Recall is the number of retrieved images that are also relevant, divided by the total number of **relevant** images. Retrieved images are the set of images returned by the face recognition system (the images most similar to the query); relevant images are those related to the search query, i.e. all images of the same face.
#### Precision

Number of retrieved images that are also relevant, divided by the total number of retrieved images.
#### F1 score
The F1 Score is a way to measure the **balance** between **precision** and **recall** of a model. Imagine you have a binary classification problem (e.g. spam or not spam) and your model makes predictions.
* Precision: out of all the items your model classified as positive (spam), what proportion were actually positive (spam)?
* Recall: out of all the items that were truly positive (spam), what proportion did your model correctly classify as positive (spam)?
A good model should have both high precision and high recall, meaning that it correctly identifies most of the positive instances (high recall) and also doesn't misclassify many negative instances as positive (high precision).
The F1 Score is the harmonic mean of these two measures, which means that it will be high only if both precision and recall are high.
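The measures from the sections above (accuracy, precision, recall, F1) can be computed directly from the confusion-matrix counts; a minimal sketch with made-up labels:

```python
import numpy as np

# Hypothetical binary decisions: 1 = positive (e.g. "same face"), 0 = negative.
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 0, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives
tn = np.sum((y_pred == 0) & (y_true == 0))   # true negatives

precision = tp / (tp + fp)                    # of all predicted positives, how many are correct
recall = tp / (tp + fn)                       # of all true positives, how many were found
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two
accuracy = (tp + tn) / len(y_true)            # correct decisions over all decisions

print(precision, recall, f1, accuracy)
```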
#### K-fold validation
The steps of 10-fold cross validation are typically:
* Randomly split the data into 10 equal-sized "folds"
* For each fold, train the model on the 9 remaining folds and evaluate it on the current fold. This will give you 10 evaluation scores.
* The final evaluation score is obtained by averaging the 10 evaluation scores.
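A minimal sketch of 10-fold cross validation using scikit-learn; the dataset and the classifier below are placeholders, not biometric data:

```python
from sklearn.datasets import load_iris          # stand-in dataset, not biometric data
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=3)     # any classifier could stand in here

# 10 folds: train on 9, evaluate on the held-out fold, repeat, then average.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
print(scores.mean(), scores.std())
```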
**Uniqueness** refers to the idea that a biometric trait, such as a fingerprint, is unique to each individual and cannot be found in another person.
**Permanence** refers to the idea that a biometric trait, such as a fingerprint, does not change over time. This is important for biometric identification systems, as it means that the trait can be used to identify an individual over time.
Some biometric traits, such as fingerprints, may change over time due to factors such as injury or aging. Other biometric traits, such as facial recognition, may be affected by changes in lighting or the individual's appearance.
**Fact 1**: **Biometric systems rely** only on **digital measurements**. This means that biometric systems only use digital data that is captured by sensors, such as cameras or fingerprint scanners, to identify individuals. This digital data is then compared to a reference template to determine if it belongs to a specific individual.
**Fact 2**: Sensing introduces **variations in the samples** of the same **biometric trait** of a user obtained **over a period of time**. This means that the digital data captured by sensors can vary depending on factors such as the quality of the sensor, the lighting conditions, or the position of the individual's body. As a result, the digital data captured at different times may not be identical, even if it belongs to the same individual.
**Fact 3**: specific characteristics of the biometric trait that are used for identification, may vary from sample to sample.
This means that the system is designed to find a match that is similar, rather than identical, to the reference template. This is done to account for the variations in the digital data that can occur due to sensing and other factors.
#### The design cycle of a biometric system:
1. **Problem definition:** clearly define the problem that the biometric system is intended to solve. This involves identifying the specific use case and the requirements of the system.
1. **Data collection:** collect a representative sample of the biometric data that the system will be using. This typically involves capturing images or other data using sensors, such as cameras or fingerprint scanners.
1. **Feature extraction:** The next step is to extract the relevant features of the biometric data that will be used for identification.
1. **Model design:** The next step is to design the model that will be used to match the biometric data to the reference templates.
1. **Evaluation:** The next step is to evaluate the performance of the biometric system.
#### Understand nature of application
1. **Verification vs. identification**
1. **Cooperative vs. non-cooperative users**
1. **Overt vs. covert deployment**: Overt deployment refers to biometric systems that are visible and known to the users. Covert deployment refers to biometric systems that are hidden or disguised and not known to the users.
1. **Habituated vs. non-habituated users**: habituated users are familiar with the biometric system and have been using it for an extended period of time; non-habituated users are not.
1. **Attended vs. unattended operation**: Attended operation refers to biometric systems that are operated by a human, such as a security guard who is responsible for monitoring the system. Unattended operation refers to biometric systems that operate autonomously, without human intervention.
1. **Controlled vs. uncontrolled operation**: Controlled operation refers to biometric systems that are operated in a controlled environment, such as a laboratory or a controlled access area. Uncontrolled operation refers to biometric systems that are operated in an uncontrolled environment.
1. **Open vs. closed system**: Open system refers to biometric systems that are accessible to a wide range of users and can be used for a variety of applications. Closed system refers to biometric systems that are restricted to a specific group of users and are used for a specific application.
#### Choice of biometric trait
1. **Universality**: refers to the property of a trait that is possessed by all individuals in a population
1. **Uniqueness**: refers to the property of a trait that is different between individuals
1. **Permanence**: refers to the property of a trait that remains unchanged over a period of time
1. **Measurability**: refers to the property of a trait that can be easily acquired and digitized, meaning that it can be captured and analyzed using sensors and digital processing.
1. **Performance**: how many samples can we process in a certain amount of time
1. **Acceptability**: refers to the willingness of individuals to present the trait, meaning that it should not be too invasive and should be acceptable to the users.
1. **Circumvention**: the trait should be difficult to replicate or bypass.
**Data collection**: the data should be representative of the population that the system will be used on and should include samples that exhibit realistic variations in different sessions, times and conditions.
**Choice of features and matching algorithm:**
1. Use the **prior knowledge** about the selected **trait**
1. Check **state-of-the-art results** in scientific literature: The choice of features and matching algorithm should also be based on the state-of-the-art results in the scientific literature.
1. **Problem of interoperability**: This means that the algorithms should be able to operate seamlessly across different sensors and should be able to handle variations in the data that can occur due to different sensors or different conditions.
**Protocols**: are important because they provide a standardized way to evaluate and compare different biometric techniques, which allows for comparison and repeatability of results. This ensures that the results are reliable and can be replicated by other researchers, which is essential for the progress of the field.
**Henry 5 classes system**: In the Henry Classification System, a fingerprint is classified into one of five main classes: arch, tented arch, left loop, right loop, and whorl. Each class is defined by specific characteristics of the fingerprint pattern.
Additionally, the system allows for the identification of fingerprints even if they are of low quality, such as latent prints.
### Fingerprint recognition
The features used in fingerprint recognition systems have a physical interpretation. These features can be classified into three levels:
* **L1:** only observe the **ridge flow** (direction of the ridges) and **ridge frequency** (number of ridges per unit area) of the fingerprint, and ignore the exact location and dimensional details of the ridges.
**Singular points** are locations on a fingerprint where the **ridge orientations change abruptly**.
**Loop singular point** occurs when the ridge makes a complete loop.
**Core point** corresponds to the northernmost loop-type singular point.
Abstract representation of the fingerprint is the **fingerprint class**. This representation is based on the number of loops and deltas and the spatial relationship between them. The pattern type can be used to classify fingerprints into different classes, such as loops, whorls, arches, and tented arches.

* **L2**: based on the **exact location** and **dimensional details** of the **ridges** in the fingerprint. They can be extracted from high-resolution fingerprint images.
Minutia points are the locations where a ridge ends or bifurcates.
Each **minutia point** has two other properties: **direction** (angle of the ridge) and **type** (ridge ending, bifurcation).

**Minutia points**:
* abstract **representation** of the **ridge skeleton**.
* Represents fingerprints in a much more compact form and can be **stored** and **matched** more **efficiently** than the original images.
* **L3**: include the inner holes (**sweat pores**) and **outer contours** (edges, no longer viewed as one-pixel-wide lines) of the fingerprint, as well as the patterns formed by **incipient ridges** (thinner ridges that contain no sweat pores). **High-resolution** fingerprint images (1000 ppi) are needed.

**Latent fingerprints** are fingerprints left on a surface by the oils and sweat on a person's skin. Latent fingerprints are usually of **lower quality** than rolled fingerprints, which are taken with a fingerprint scanner, and often **contain less minutiae points**. Thus, extracting L3 details from latent fingerprints may be useful in cases where the number of minutiae is low and the quality of the fingerprints is poor.
### Live scan sensors
* **Optical FTIR-based sensor**: based on **frustrated total internal reflection**. The finger touches the top of a glass prism that is illuminated from one side. Light is totally reflected at the valleys (which do not touch the glass), while at the ridges (which are in contact with the glass) it is scattered and absorbed, so the reflected light captured by the camera forms an image of the ridge pattern.

* **Capacitance-based sensor**: When a finger is placed on the sensor, the fingerprint skin acts as the other electrode, forming a miniature capacitor. The magnitude of electrical charges stored on the electrodes depends on the **distance** between the **fingerprint surface** and the **electrodes**. The **capacitance** due to the **ridges is higher** than those formed by valleys, and this difference is used to generate a digital image of the fingerprint.

* **Piezoelectric effect based sensors**
type of **pressure-sensitive** sensor that generates an **electrical signal** when **pressure** is applied. This is because of the piezoelectric effect, a phenomenon in which certain materials generate an **electrical charge** when they are subjected to **mechanical stress**. The sensor measures the pressure applied by the finger on its surface, but it does not capture the relief of the fingerprint accurately because of its **low sensitivity**.
* **Temperature differential based sensors**
Typically made of **pyro-electric** materials, which are materials that **generate an electric current** when they are subjected to a **change in temperature**.
The sensor captures the fingerprint by measuring the **temperature changes** caused by the **pressure of the finger** on the sensor, thus creating an image of the fingerprint.
One of the limitations is that the captured **image** has only **four-bit gray-scale resolution**, which reduces the accuracy of the recognition system. The thermal image also fades very quickly (in roughly 0.1 s), as soon as the finger and the sensor reach thermal equilibrium.
### Image quality of the fingerprint
Is determined by **image resolution**, finger **area**, and **clarity** of the **ridge pattern**.
A **higher resolution image** with a larger finger area and **clear ridge pattern** will result in a more accurate system. However, in order to reduce cost, lower resolution sensors are often used in civilian applications.
**Evaluate the quality of fingerprint:**
* Two main approaches to evaluate the quality: use of **local properties** and use of **global properties**.
* **local properties** involves dividing the fingerprint image into smaller blocks and evaluating the quality of each block.
* Local properties can be used to identify **specific areas** of the image of **lower quality**.
* Use of **global properties** involves **evaluating** the **overall quality** of the image using a **single metric**. (useful for identifying overall trends in the image quality such as **overall brightness** or **contrast**)
* A combination of **both approaches** can be used to **evaluate** the **quality** of the fingerprint image
**Pixel intensity method**:
* Use histogram to determine background value (b) based on a decision threshold of peak in the lighter part > 1k
* Subtract the background value from each pixel
* Divide the image into 9x9 pixel blocks
* Calculate mean, variance, gradient for each block and compare against decision thresholds to classify as background or fingerprint
* If a block is still unclassified, calculate value based on gradients within the block and subject it to a threshold
* If the percentage of background blocks is greater than 50%, reject the image.
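A very rough sketch of the block-wise idea above (the 9x9 block size follows the description, but the thresholds and the exact per-block test are assumptions):

```python
import numpy as np

def classify_blocks(img, block=9, var_thr=150.0, grad_thr=8.0):
    """Rough sketch of block-wise fingerprint/background segmentation.

    img is a 2-D grayscale array; the thresholds are made up and would have
    to be tuned for a real sensor."""
    h, w = img.shape
    labels = np.zeros((h // block, w // block), dtype=bool)  # True = fingerprint block
    gy, gx = np.gradient(img.astype(float))
    grad_mag = np.hypot(gx, gy)
    for i in range(h // block):
        for j in range(w // block):
            patch = img[i*block:(i+1)*block, j*block:(j+1)*block]
            g = grad_mag[i*block:(i+1)*block, j*block:(j+1)*block]
            # Keep blocks with enough intensity variation and gradient as fingerprint.
            labels[i, j] = (patch.var() > var_thr) and (g.mean() > grad_thr)
    background_ratio = 1.0 - labels.mean()
    return labels, background_ratio

# Demo with a random "image" (real use: a grayscale fingerprint scan).
img = np.random.default_rng(0).integers(0, 256, size=(90, 90))
labels, bg = classify_blocks(img)
print("background fraction:", bg, "-> reject" if bg > 0.5 else "-> keep")
```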
**Features counting method**:
A way to evaluate the quality of a fingerprint image by counting the **number of minutiae points** present in the image. This method is based on the idea that images of poor quality will have either unusually few or unusually many minutiae, and thus can be rejected.
If the number of minutiae **falls within the expected range**, the image is considered to be of **good quality**; otherwise it is rejected.
**How to evaluate quality**

This method converts the image into an 11-dimensional feature vector that includes the number of **fingerprint blocks**, the number of **features** (minutiae), the number of features with a quality greater than or equal to a certain threshold (e.g. 0.5, 0.6, 0.7, 0.8, 0.9), and the percentage of blocks with a certain quality (1, 2, 3, 4).
This feature vector is then used to classify the image into one of five classes, with class 1 being perfect quality and class 5 being bad quality.
**How to improve quality**
* **Well defined regions**: These regions have clear and distinct ridges, and are easy to identify and match.
* **Renewable regions**: These regions have some degree of distortion or noise, but the ridges can be enhanced or restored through image processing techniques.
* **Nonrenewable regions**: These regions have severe distortion or noise and the ridges cannot be restored or enhanced.
The goal of quality improvement methods is to:
* **Enhance** the ridge structure in **well-defined** and **renewable regions** to make them more distinct and easier to match.
* **Mark** or identify **nonrenewable regions** so that they can be excluded from the matching process.
#### Hongs method
Using a 2D Gabor filter with the obtained **orientation** and **frequency** information to **enhance** the image.
#### 1D smoothing method

The method uses an **orientation field** to determine the **best fit line** for the **ridges** within each block. The method then uses 11 points along this line to **smooth the ridges** in the block, resulting in a clearer and more defined image.
#### Calculating Poincare index
Used to **identify singularity points** by computing the cumulative change of orientations along a closed path in an orientation field.
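A minimal sketch of the Poincaré index computation, assuming the ridge orientations (defined modulo π, in radians) have already been sampled counter-clockwise along a closed path around the candidate point:

```python
import numpy as np

def poincare_index(orientations):
    """Cumulative orientation change along a closed path, divided by 2*pi.

    Roughly: +1/2 -> loop, -1/2 -> delta, +1 -> whorl, 0 -> no singularity."""
    total = 0.0
    n = len(orientations)
    for k in range(n):
        d = orientations[(k + 1) % n] - orientations[k]
        # Orientations are only defined modulo pi, so keep each step in (-pi/2, pi/2].
        if d > np.pi / 2:
            d -= np.pi
        elif d <= -np.pi / 2:
            d += np.pi
        total += d
    return total / (2 * np.pi)

# Hypothetical 8-point path whose orientation turns by pi -> index 0.5 (loop-like).
path = np.linspace(0, np.pi, 8, endpoint=False)
print(poincare_index(path))
```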

#### Direction of singularity:
**Define** the **reference orientation field** for a loop and a delta
Once these singularity points are identified, it can be helpful to determine the **direction** of the **singularity**. This can be done by **comparing** the **local orientation** field around the singularity point to a **reference orientation** field.
#### Ridge extraction:
The process of creating a thin, **one-pixel wide** image of the **ridges** in a fingerprint, which is used to **extract minutiae** points. We convert the enhanced fingerprint image to a **binary image** using a **thresholding** method like Otsu thresholding. This separates the pixels that represent ridges from the pixels that represent the background. Then, a morphological operation called **thinning** is applied to reduce the width of the ridges to one pixel.
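A minimal sketch of this binarize-then-thin step using scikit-image (the function and variable names are mine, and ridges are assumed to be darker than the background):

```python
from skimage.filters import threshold_otsu
from skimage.morphology import skeletonize

def extract_ridge_skeleton(enhanced):
    """Binarize an enhanced fingerprint image with Otsu's threshold and thin
    the ridges to one-pixel width."""
    t = threshold_otsu(enhanced)
    ridges = enhanced < t          # True where (darker) ridge pixels are
    return skeletonize(ridges)     # one-pixel-wide ridge skeleton

# skeleton = extract_ridge_skeleton(enhanced_fingerprint_image)
```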
#### Minutia extraction

#### Minutia direction

For minutiae that are endings, the ridge is traced to a fixed distance and the direction of the ridge is calculated. For minutiae that are bifurcations, three points are obtained by tracing the ridges to a fixed distance and the mean of the two directions is calculated.
#### Minutia filtering
* Process of **removing false minutiae** from the image
* **Heuristic methods** are used to filter minutiae, such as **removing** minutiae located at **image boundaries**, minutiae that are **close in location** and **have opposite directions**, and minutiae that are **too numerous** in a **small neighborhood**.
* **Duality of minutiae** means that the process of filtering is done on both the original image and its negative version, to ensure that all false minutiae are removed.
#### Matching
**Alignment** refers to the process of **adjusting** the **position** and **orientation** of a fingerprint template so that it is in the same coordinate system as a query fingerprint.
This is done so that the minutiae points in the two images can be accurately compared and matched. One common method to **estimate** the spatial **transformation** between two point sets is using the **Generalized Hough Transform**.
**Score generation** is the process of computing a **match score** based on how well the corresponding **minutiae points match**. This score can be used to determine the **likelihood of a match** between the two fingerprints.

**Pairing**:
* Minutia a is in correspondence with b if **distance** is within a predefined distance threshold
* the **angle** between their directions is within another predefined angle threshold
* a is allowed to **match** to at most **one b** and vice versa
**Match score**
Is measure of how **similar** two fingerprints are to each other. It is calculated by **comparing the minutiae** of the two fingerprints and determining how many of them **match**.
The match score is then compared to a predefined **threshold** to classify the two fingerprints as a **genuine** match or an **impostor** match (two-class classification)
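A simplified sketch of pairing and score generation for two already-aligned minutiae sets; the thresholds and the greedy pairing strategy are illustrative assumptions, not the exact course algorithm:

```python
import numpy as np

def match_score(minutiae_a, minutiae_b, dist_thr=15.0, angle_thr=np.pi / 6):
    """Greedy pairing of two aligned minutiae sets; each minutia is (x, y, angle).

    Returns the fraction of possible pairs that were matched."""
    used_b = set()
    pairs = 0
    for xa, ya, ta in minutiae_a:
        for j, (xb, yb, tb) in enumerate(minutiae_b):
            if j in used_b:
                continue
            close = np.hypot(xa - xb, ya - yb) <= dist_thr
            # Compare directions on a circle.
            d_angle = np.abs(np.angle(np.exp(1j * (ta - tb))))
            if close and d_angle <= angle_thr:
                used_b.add(j)      # each minutia may match at most one counterpart
                pairs += 1
                break
    return pairs / max(len(minutiae_a), len(minutiae_b))

a = [(10, 12, 0.3), (40, 52, 1.2), (80, 20, 2.0)]
b = [(12, 10, 0.35), (41, 50, 1.1), (200, 200, 0.0)]
print(match_score(a, b))   # 2 of 3 minutiae pair up -> score ~ 0.67
```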
### IRIS recognition
1. **Image segmentation**: the iris region is segmented from the rest of the image. This is done by detecting the **boundary** of the **iris** and **pupil**, and removing any eyelashes, eyelids, or other artifacts that may be present in the image.
2. **Feature extraction**: features are extracted from the segmented iris image. **Gabor wavelets** are used to extract features from the iris image, by **analyzing** the **texture** and **pattern** of the iris. The convolution of the image with the Gabor kernel results in the Gabor transformed image, where each pixel represents the **response** of the image to a specific **frequency** and **orientation** of the kernel.
3. **Encoding:** The **phase angle** of **each pixel** in each of the Gabor transformed images is extracted and concatenated into a **single vector**.
4. **Matching**: the iris code of the input image is compared with a database of iris codes to find a match.
#### Daugman’s Integro-Differential Operator (Circular Edge Detector)
It works by using the **contrast** in image intensity between the **pupil** and the **iris**. Both these boundaries are approximated using circles and the magnitude of the edge pixels contributing to these boundaries is stronger than those relating to other circular contours in the image.
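A heavily simplified sketch of the idea: for a fixed, assumed center, look for the radius at which the (smoothed) change of the average intensity along the circle is largest. The function and parameter names are mine:

```python
import numpy as np

def circular_boundary(img, cx, cy, radii, n_points=64):
    """Simplified integro-differential search over the radius only."""
    angles = np.linspace(0, 2 * np.pi, n_points, endpoint=False)
    means = []
    for r in radii:
        xs = np.clip((cx + r * np.cos(angles)).astype(int), 0, img.shape[1] - 1)
        ys = np.clip((cy + r * np.sin(angles)).astype(int), 0, img.shape[0] - 1)
        means.append(img[ys, xs].mean())          # contour "integral" of intensity
    diff = np.abs(np.diff(np.array(means)))       # derivative with respect to the radius
    diff = np.convolve(diff, np.ones(3) / 3, mode="same")  # simple smoothing (Daugman uses a Gaussian)
    return radii[np.argmax(diff)]                 # radius of the strongest circular edge

# The real operator also searches over candidate centers (x0, y0),
# not just over the radius as in this sketch.
```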
**How to limit search space**

We limit the search space for the iris boundaries by first **identifying** the **bounding box** of the **largest dark region** in the image, which is likely to correspond to the **pupil**. The **center** of this bounding box (x0, y0) can be used as a **starting point** for the search, and the radius (r) can be constrained to some range.
**Eyelid detection**

We identify the region of the image that corresponds to the eyelid, which can interfere with the iris segmentation. This is done by searching for a **parabolic edge** within the region defined by the outer circle, using a spline-fitting procedure (constructing a smooth curve that passes through a set of data points).
**Eyelashes detection**
Eyelashes can also interfere with the iris segmentation. We search for **strong near-vertical edges** in the segmented iris, which are likely to correspond to the eyelashes.
**Noise mask**

* records the locations of **undesired iris occlusions** such as eyelids, eyelashes, shadows, and specular reflections.
* These occlusions can interfere with the iris recognition process and hence need to be eliminated.
**Normalization**

Normalizing the size of iris images. It works by stretching or compressing the iris image to a standard size frame.
The model aims to **eliminate variations in pupil size**.
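A sketch of one way to implement such a normalization: unwrapping the iris ring into a fixed-size rectangle (radius × angle), here under the simplifying assumption that the pupil and iris boundaries are concentric circles:

```python
import numpy as np

def normalize_iris(img, pupil_center, pupil_r, iris_r, out_h=64, out_w=256):
    """Remap the iris ring into a fixed-size (radius x angle) rectangle."""
    out = np.zeros((out_h, out_w), dtype=img.dtype)
    cx, cy = pupil_center
    thetas = np.linspace(0, 2 * np.pi, out_w, endpoint=False)
    for i, frac in enumerate(np.linspace(0, 1, out_h)):   # 0 = pupil edge, 1 = iris edge
        r = pupil_r + frac * (iris_r - pupil_r)
        xs = np.clip((cx + r * np.cos(thetas)).astype(int), 0, img.shape[1] - 1)
        ys = np.clip((cy + r * np.sin(thetas)).astype(int), 0, img.shape[0] - 1)
        out[i] = img[ys, xs]
    return out   # same size regardless of pupil dilation or image resolution
```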
**IRIS encoding -> IRIS code**
**Gabor filter** **emphasizes** **certain frequencies** and **orientations** in the image, while de-emphasizing others.
The **result** of a Gabor filter is a **value in complex-plane**.
To obtain this code, 1024 sample points are taken from the iris, which are spread across 8 radii and 128 samples per radius.

**Normalized Hamming distance**
Use to compare 2 iris codes. The Hamming distance is the number of **bits** that **differ** between the two codes. A mask is applied to the comparison, with 1s representing the iris information and 0s representing occlusions or non-iris information. The resulting fraction of differing bits is then normalized by the total number of bits. A value of 0 means that the two iris codes are identical, while a value of 1 means that they are completely different. This **normalized value** is used as a **measure of similarity** between the two iris codes.

**Rotation of the iris**
The iris code is **rotated multiple times** and the matching **metric** is calculated for each rotation. The **rotation** that results in the **lowest metric** is considered the **best match**, and the iris codes are compared using this best match rotation. This helps to account for **head tilt**.
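A minimal sketch combining the masked, normalized Hamming distance with the rotation search (toy 8-bit codes for readability; real iris codes have 2048 bits):

```python
import numpy as np

def hamming_distance(code_a, mask_a, code_b, mask_b):
    """Normalized Hamming distance over bits that are valid (mask == 1) in both codes."""
    valid = mask_a & mask_b
    n = valid.sum()
    if n == 0:
        return 1.0
    return np.sum((code_a ^ code_b) & valid) / n

def best_rotation_distance(code_a, mask_a, code_b, mask_b, max_shift=8):
    """Circularly shift one code and keep the smallest distance (compensates head tilt)."""
    return min(hamming_distance(np.roll(code_a, s), np.roll(mask_a, s), code_b, mask_b)
               for s in range(-max_shift, max_shift + 1))

# Hypothetical short codes; mask 1 = valid iris bit, 0 = occluded bit.
a = np.array([1, 0, 1, 1, 0, 0, 1, 0]); ma = np.ones(8, dtype=int)
b = np.roll(a, 2);                      mb = np.ones(8, dtype=int)
print(best_rotation_distance(a, ma, b, mb))   # 0.0 -> same code, just rotated
```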
**Iris quality assessment**
* **Occlusion**: eyelids, eyelashes, or other objects that are in the way of the iris.
* **Defocus/Out-of-focus**: refers to the image being blurry or out of focus
* **Motion blur**: image being blurred due to movement of the subject or camera during the capture.
* **Non-uniform illumination**: refers to uneven lighting in the image, which can cause shadows or other distortions that affect the visibility of the iris.
* **Low resolution/Large imaging distance**: refers to the image having a low resolution or being captured from a large distance
* **Iris/Pupil dilation**: refers to the iris or pupil being dilated (expanded) in the image
* **Off-angled imaging**: refers to the image being captured at an angle that is not straight on to the iris
* **Presence of accessories**: refers to the presence of any accessories such as fake or printed contact lenses

#### Sharpness evaluation:
It is calculated by assessing the **intensity gradient** of the pixels near the **pupillary boundary**. A **higher gradient magnitude** indicates a **sharper** and more **in-focus image**, while a lower gradient magnitude indicates a blurrier and out-of-focus image.
#### Sharpness/Focus Evaluation Using 2D Fourier Transform
* **Defocusing** primarily **reduces** the **highest frequencies** in the image, while the **lower** frequency components are **unaffected**.
* To identify a defocused image, the technique measures the **total power** of the image in the Fourier domain at **higher spatial frequencies**.
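A small sketch of such a focus measure: the fraction of spectral power above a (hypothetical) spatial-frequency cutoff, computed with a 2D FFT. Defocused images score low because blur suppresses the high frequencies:

```python
import numpy as np

def high_frequency_power(img, cutoff=0.25):
    """Fraction of spectral power above a normalized spatial-frequency cutoff."""
    f = np.fft.fftshift(np.fft.fft2(img.astype(float)))
    power = np.abs(f) ** 2
    h, w = img.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)          # normalized spatial frequency
    return power[radius > cutoff].sum() / power.sum()

# A sharp iris crop should give a noticeably larger value than a blurred copy of it.
```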
#### Measuring the energy from 2D wavelets at local concentric bands of segmented iris
The method uses **higher weights** for **inner regions** of the iris that are **more stable** compared to the **outer regions** that are more **prone to occlusions**.
The method then **partitions** the iris image into **multiple concentric bands** of fixed width and **measures the energy** from 2D wavelets of each band. The energy of the band is then used as a **quality measure** for that band of the iris.
In summary, this method uses a technique called the weighted Hamming distance to compare two iris images and the **weight of each bit** is based on the **index of the band** that contains the bit, with the idea of giving **more importance** to the **bits** that are **more stable** and less prone to occlusions.
#### Dilation influence
The conclusion of the experiment is that when the **difference in pupil dilation** between two images being compared for a match is **large**, the False Non-Match Rate (**FNMR**) **increases**. This means that the accuracy of the iris recognition system decreases as the difference in dilation between the two images increases.
#### Daugman’s Patent
He claims that the iris of every human eye has a **unique texture** of **high complexity** that proves to be essentially **immutable** over a person's life. He argues that this unique texture can be used as a biometric identifier and that a **single enrollment** can last a **lifetime**.
This narrative has been **widely adopted** and is the basis of many iris recognition systems.
#### Open research questions:
* **Acquisition**: How well can iris recognition be performed in situations where the user is not cooperating, or when the iris is moving and difficult to capture.
* **Masking**: How well can iris texture occlusions be found
* **Quality:** How can iris quality be assessed and incorporated into real-time recognition systems.
* **Dilation:** How to deal with the effects of differences in pupil dilation between images.
* **Lenses:** Can the presence of contact lenses be automatically detected and can the artifacts created by them be reversed or masked out.
* **Aging:** Does template aging occur in iris recognition and can aging-resistant algorithms be developed.
## Face recognition:
* Human face also gives other attributes such as **gender**, **age**, **ethnicity**, and **emotional state** of a person
* The face is considered to be the most commonly used biometric trait by humans
* Face recognition is challenging due to **variations** in **pose**, **illumination**, **expression**, and **imperfections** such as glasses, sunglasses, caps, scarfs, make-up, and facial hair
* **Not enough inter-class variations**: twins, family
* **Advantages** of face recognition:
* Can be **captured** at a **longer distance** using non-contact sensors, useful for surveillance applications
* **Large** legacy face **databases** (e.g. U.S. driver's license repositories cover over 95% of the adult population)
* People are generally more **willing** to **share** their **face** images in the **public** domain (e.g. Facebook, Instagram)
Facial features refer to the size and shape of the eyes, nose, mouth, and other features.
**Anthropometric** studies try to **measure** and **understand** these **facial features** by identifying specific points on the face, called **landmark** or fiducial points.
These measurements have been used to study how the **face changes over time**, and to **identify differences** between **genders** and **ethnicities**.
Anthropometric measurements have limited usefulness for automated facial recognition (FR) systems, as they may **not** be **distinctive** enough to accurately **identify** individuals.
### Feature levels
* **Level 1** (L1) features include gross facial characteristics such as the **overall shape** of the face and **global skin color**. These features can be extracted even from **low resolution images**, and can be used to **distinguish** between different **face shapes** and **races**.
* **Level 2** (L2) features include **localized information** about the **structure of the face**, such as the **relationship** between **facial components** and the **precise shape** of the **face**. These features can be used for more **accurate face** recognition, and can be **represented** using **geometric** or **texture** **descriptors** (structure of face components (eyes), relationship between them, general skin texture and precise shape of the face)
* **Level 3** (L3) features include **micro-level** information such as **scars**, **freckles**, **skin discoloration** and **moles**. These features can be used to **differentiate** between people with identical facial features, such as identical **twins**.
### Viola jones detector
* It scans through the input image with **detection windows** of **different sizes** and decides whether each window contains a face or not.
* Simple local features are derived using **Haar-like** filters, to **extract features** such as edges, lines, and textures in the image that are characteristic of a face.
* A classifier is then applied to these features.
* The **responses** of the **filters** are **combined** to create a **feature vector** that represents the window.
* When the **combination of filter responses** (feature vectors) in a certain **window** exceeds a **threshold**, a face is said to have been **detected**.
* The **threshold** is **set empirically** and is used to minimize false positives.
* It is **efficient** and **fast**, and has a good trade-off between detection rate and false positives.
* It is **widely used** in many applications such as surveillance, security, and facial recognition systems.
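A usage sketch of a pre-trained Viola-Jones (Haar-cascade) detector as shipped with OpenCV; the input file name is hypothetical:

```python
import cv2

# Load the Haar-cascade face model bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("photo.jpg")                      # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Scan detection windows at multiple scales; each hit is (x, y, w, h).
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", img)
```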
### Integral image
An integral image is a pre-processed image where the **sum of all pixel intensities** **above** and to the **left** of a certain point (**x, y**) in the original image is calculated. This **allows** for the **sum** of **pixel values** within **any rectangular region** in the original image to be **computed quickly**, based on just **4 array accesses** in the integral image.
The use of an integral image allows the Viola-Jones face detector to efficiently apply the Haar-like filters and combine the filter responses to detect a face. This makes the algorithm faster and more efficient
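A minimal sketch of building an integral image and of the 4-access rectangle sum:

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of all pixels above and to the left of (y, x), inclusive."""
    return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1+1, x0:x1+1] from only four accesses into the integral image."""
    total = ii[y1, x1]
    if y0 > 0: total -= ii[y0 - 1, x1]
    if x0 > 0: total -= ii[y1, x0 - 1]
    if y0 > 0 and x0 > 0: total += ii[y0 - 1, x0 - 1]
    return total

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2), img[1:3, 1:3].sum())   # both 30
```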
### Adaboost
* The Viola-Jones face detector uses the Adaboost algorithm, which is a **supervised learning method** that **combines weak learners** into a **more accurate classifier**, to **select** the most **discriminative features** and **train** the **classifier function**.
* A weak learner is a classifier that only needs to exceed chance performance.
* The training process consists of **multiple boosting rounds**.
* During **each round**, the Adaboost algorithm **selects** a **weak learner** that does well on examples that were hard for the weak learners in previous rounds.
* The **weights** are **increased for examples** that were **misclassified** by the current weak classifier.
* The **final strong classifier** is a **weighted linear combination** of the T (most discriminative features) best weak classifiers based on the selected features
* This allows for **efficient feature selection** and classifier training, which is essential for achieving real-time performance.
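A minimal sketch of boosting weak learners with scikit-learn; the feature vectors are synthetic stand-ins for Haar-filter responses, not the actual Viola-Jones training pipeline:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical feature vectors standing in for Haar-like filter responses
# (label 1 = face window, 0 = non-face window).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# AdaBoost combines T weak learners (by default, depth-1 decision stumps) into a
# weighted vote; each round focuses on previously misclassified examples.
boosted = AdaBoostClassifier(n_estimators=50, random_state=0)
boosted.fit(X, y)
print(boosted.score(X, y))
```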
### Adaboost and cascades
The idea is to use a **simple** and smaller set of **filters** to **reject** a large number of **non-face** images in the **early stage**, and then more powerful classifiers can be used in later stages.

#### Alignment of faces by geometric warping
A method that uses the **positions** of the **eyes** in an image to create a **scaled**, **rotated** and **cropped image** as output. The limitation of this method is that it only corrects **in-plane rotation** (rotation within the image plane); it cannot compensate for out-of-plane head pose.
### Facial landmark detection
A method that aims to find the location of **facial landmarks** in order **to align** the image. The process involves finding specific points on the face, such as the **corners** of the **eyes** and **mouth**, which can then be used to align and normalize the image.
### Pose adaptive feature extractors
Faces are often captured at different **angles** and in different **poses**, which can make recognition more difficult. Pose adaptive feature extractors **adapt** the **feature extraction process** to **account** for the **different poses** present in the **image**.

### Landmark detection using Cascade regression
Cascaded regression breaks the process of **detecting landmarks** into **multiple stages**. In the **first stage**, a **simple model** is used to roughly estimate the positions of the landmarks. In the **second stage**, a **more complex model** is used to **refine the positions** of the landmarks based on the estimated positions from the first stage. This process can be repeated for multiple stages, each stage using a more complex model and the output from the previous stage as input.
In each stage, the system uses a regression model to **predict** the **required displacement** for the landmarks based on the **features extracted** from the input image and the **current shape estimate**. This means that in each stage, the system takes the current estimate of where the landmarks are and adjusts it slightly based on the features of the image.
To **train** a landmark detection model, one needs a **set of sample images** with **annotated landmarks**.

### Lighting normalization
Make sure that **lighting conditions** of the **face** image being captured do **not affect** the **performance** of the **facial recognition** system.
* **Active methods** solve the problem at the **time** of **image acquisition**, for example by using thermal infrared or near-infrared images, or 3D information.
* **Passive methods** analyze the acquired images **after** they have been **captured**.
### Appearance based Face recognition
**Idea:** the pixel value at location (x,y) in a face image can be expressed as a **weighted sum** of pixel values in **all the training images at (x,y)**
* (how much one pixel contributes to the overall unique characteristics of a face)
* (unique characteristics of a face can be represented as a combination of different "parts" of other faces)
* The idea is that by **analyzing** a large number of **faces**, the system can learn **which combinations** of these **features** are **most characteristic** of a specific **person**
* Images can be **matched** by directly comparing their **vector of weights**
### Face subspace
* is a **lower-dimensional representation** of a **face** that **captures** the most **important characteristics** of a face.
* In a high-dimensional space, faces that are similar to each other will be close together, but it's difficult to identify and compare them.
* A face subspace reduces the dimensionality of this space by finding a lower-dimensional space that captures the most important information about a face, making it easier for the system to identify and compare new faces by comparing them to this subspace.
### PCA analysis
* The first step is to use the **training data** to learn a **subspace**, this subspace aims to capture **as much variability** in the training data **as possible**.
* Next, the technique uses an **eigenvalue decomposition** of the **covariance matrix** of the **data**.
* to **identify** **eigenvectors** and **eigenvalues** (these contain information about the unique characteristics of the faces)
* PCA algorithm uses these eigenvectors and eigenvalues to project the images into a lower dimensional space (**Eigen coefficients**) where the unique characteristics of the faces are better captured and can be used for recognition.
#### Matching
Matching in PCA is the process of **comparing the unique characteristics** of a **new face image** to a **set of known face images** to find the **best match** and identify the person in the new image.
1. We **calculate** the **Eigen coefficients** of the **probe** image (new image) and each of the **gallery** images (known images). The Eigen coefficients are a set of numerical values that represent the unique characteristics of a face, as captured by the eigenvectors from PCA.
1. The **Euclidean distance** between the Eigen **coefficients** of the **probe** image and each of the **gallery** images is **calculated**.
2. The **gallery image** with the **smallest Euclidean distance** to the probe image is considered the **best match**
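A minimal eigenfaces-style matching sketch using scikit-learn PCA; the "faces" here are random vectors standing in for flattened, aligned face images:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: each row is a flattened, aligned face image.
rng = np.random.default_rng(0)
gallery = rng.random((20, 32 * 32))                 # 20 enrolled faces
probe = gallery[7] + 0.01 * rng.random(32 * 32)     # noisy copy of gallery face #7

# Learn the face subspace (eigenfaces) from the gallery.
pca = PCA(n_components=10)
gallery_coeffs = pca.fit_transform(gallery)         # Eigen coefficients of the gallery
probe_coeffs = pca.transform(probe.reshape(1, -1))  # Eigen coefficients of the probe

# Nearest gallery image in the subspace (smallest Euclidean distance) is the match.
distances = np.linalg.norm(gallery_coeffs - probe_coeffs, axis=1)
print("best match:", np.argmin(distances))          # expected: 7
```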
### Eigenfaces vs Fisherfaces
**Eigenfaces** attempt to **maximize** the **scatter** of the **training images** in **face space**. This means that it aims to **spread out** the **images** as much as possible in the lower-dimensional space.
**Fisherfaces** attempt to **maximize** the **between-class scatter** (maximize the difference between images of different people). At the same time, it also aims to **minimize** the **within-class scatter** (minimize the difference between images of the same person).
Fisherfaces are based on **LDA**, which is used to find the best linear combination of features for **discriminating** between different **classes**.

### Model based face recognition
A technique for recognizing faces that should work **regardless of the position** of the **face** in the image (e.g. whether the face is turned to the left or the right).
To achieve this, model-based face recognition techniques typically **require** the **detection** of several fiducial or **landmark** **points** on the face (corners of the eyes, the tip of the nose, etc.) Once these **points** are detected, the system uses them to **align** the **face** and create a **pose-invariant representation**
### Elastic Bunch Graph Matching (EBGM) - FBG (Face Bunch Graph)
EBGM represents a **face** as a **labeled image graph**, where **nodes** are **fiducial points**, labeled with **Gabor coefficients** (**Jet**) for **local texture information** and the **connections** between nodes are **labeled** based on the **average distance** between the **corresponding fiducial points**.
This allows the system to take into account the **local texture information** and also the **spatial relationship** between the **different fiducial points** on the face.
#### Gabor coefficients
The Gabor coefficient at a location in the image is obtained by **convolving** the image with a complex **2D Gabor filter** **centered** at that **location**
* By **varying** the **orientation** and **frequency** of the Gabor filter, a **set of coefficients** or a **Gabor jet** is obtained
#### Construction of FBG model
Stage 1:
- Manually mark fiducial points
- Obtain the rest semi-automatically, by comparing new images to images that have already been marked
Stage 2:
- Combine a representative set of graphs in a stack-like structure
### Texture based face recognition
This approach uses a more robust feature representation scheme by characterizing the texture of an image using the **distribution** of **local pixel values**. This means that instead of relying on the overall intensity values of the image, the system **focuses** on the **specific patterns** and **textures present** in the image. This can make the recognition **more robust** to changes in **ambient lighting** and **facial expressions**, which can affect the raw pixel intensity values.
#### SIFT
(SIFT) is a technique used to **extract unique** and **robust features** from images. It works by detecting **key points** in an image and **creating** a **descriptor** for **each one** based on the **local pixel values** in a **neighborhood** around the point. This descriptor is **tolerant** to changes in **pose** and **lighting**, making it a robust method for face recognition. The SIFT method is also **scale-invariant**, meaning it can identify features in an image **regardless** of the **size** of the image.
It has two main steps: **key point extraction** and **descriptor calculation**. The key point extraction step is used to locate **unique** and **distinctive** **points** in an image. The descriptor calculation step is used to extract information from the image's texture **around each key point**, represented as a **histogram** of **gradient orientations** within a local neighborhood. The final descriptor is obtained by **concatenating** all the descriptors from **all** the **patches**.
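A usage sketch of SIFT in OpenCV (real API, hypothetical image file names), with Lowe's ratio test used to keep only unambiguous descriptor matches:

```python
import cv2

img1 = cv2.imread("face_a.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical images
img2 = cv2.imread("face_b.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
# Key point extraction + 128-dimensional descriptors (histograms of gradient orientations).
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match descriptors between the two images and keep only unambiguous matches
# (Lowe's ratio test: best match clearly better than the second best).
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(len(good), "good matches")
```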