ML model using Computer Vision for classifying a picture based on the presence of occlusion on the face


Healthcare, EMR, Remote patient monitoring


Java, Spring Boot, Spring MVC, JavaScript, Angular

Client Requirements

The client is a leading biometric service provider based in India whose objective is to classify images as accepted or rejected based on the presence of facial obstructions, such as earphones, sunglasses, scarves, caps, etc., which could indicate a potential attempt to fake identity.

Images without such obstructions should be accepted. However, the challenge lies in the Indian dataset, where cultural practices such as wearing a turban, ghoonghat (veil), or other religious attire that covers the forehead are common.

The client provided us with a dataset of over 10,000 images for training the model and required an accuracy of over 95%. To reach that target, the model must account for the cultural context of the image when deciding whether to accept or reject the face.

Thinkitive Solution

Thinkitive Technologies assembled a team of computer vision experts and data scientists to train an ML model. Applying transfer learning to a state-of-the-art image classification model with a vision-transformer architecture, we initially achieved an accuracy of only 93%, short of the client's 95% target.

The shortfall arose because conventional image classification models classify the image as a whole, taking in both the person's face and the background. We therefore needed a model that focuses solely on the person's face.

To address this, the model must first detect the face in the picture and then classify the detected face based on the presence of occlusion, while accounting for cultural attire. We incorporated the YOLOv8 object detection algorithm, which identifies and locates objects in images or videos and is used in applications such as autonomous driving, object tracking, and image analysis. In our case, the model was trained to detect faces and label each one as accepted or rejected, addressing both the classification task and the cultural aspect.
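The detect-then-decide step described above can be illustrated with a minimal sketch. It assumes the detector returns, for each face, a class label ("accepted" or "rejected"), a confidence score, and a bounding box. The `FaceDetection` structure, the `decide` function, the 0.5 threshold, and the policy of rejecting images with no detected face are all hypothetical choices made for illustration, not the production logic.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FaceDetection:
    """One detected face with its predicted class (hypothetical structure)."""
    label: str                       # "accepted" or "rejected"
    confidence: float                # detector score in [0, 1]
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def decide(detections: List[FaceDetection], min_confidence: float = 0.5) -> str:
    """Return "accepted" or "rejected" for a single image.

    Assumed policy: ignore detections below the threshold, trust the most
    confident remaining face, and reject the image when no face is found.
    """
    candidates = [d for d in detections if d.confidence >= min_confidence]
    if not candidates:
        return "rejected"
    best = max(candidates, key=lambda d: d.confidence)
    return best.label


# Illustrative detections, not real model output.
detections = [
    FaceDetection("accepted", 0.91, (40, 30, 200, 220)),
    FaceDetection("rejected", 0.35, (10, 10, 50, 60)),
]
print(decide(detections))  # accepted
```

Because the label is attached to the face box rather than to the whole image, background content does not influence the decision in this scheme.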


Solution Highlights

  • The model was able to achieve a testing accuracy of 95.5%.

  • The latency of the model was less than 2 milliseconds per image on GPU (Graphics Processing Unit) enabled systems.

  • The model was initially trained using the PyTorch framework, which produces models in the .pth format. However, the model was converted into the TFLite format to accommodate the client's TensorFlow production environment. This conversion allows seamless integration and deployment within the client's existing TensorFlow infrastructure.

  • The object detection model outperformed image classification in its ability to accurately identify and reject occlusions on the face, such as earphones or hands. This is because the object detection model focuses explicitly on classifying the face rather than considering the entire image.

  • The model was able to distinguish between cultural cases and fakes reasonably well. For example, a person wearing a religious outfit was accepted, whereas one wearing a scarf with a similar pattern to the religious outfit was rejected.

  • The model demonstrated the ability to distinguish between different types of eyewear. It accurately identified individuals wearing spectacles as accepted while rejecting those wearing sunglasses.
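
A per-image latency figure like the one quoted above is typically obtained by timing the prediction call over many images. The sketch below shows one way to do that with the standard library. `measure_latency_ms` and `dummy_predict` are hypothetical names, and `dummy_predict` merely stands in for the real model call, so the numbers it produces say nothing about the actual model.

```python
import statistics
import time
from typing import Callable, Iterable, List


def measure_latency_ms(predict: Callable[[object], object],
                       images: Iterable[object],
                       warmup: int = 3) -> List[float]:
    """Time predict() on each image and return latencies in milliseconds."""
    images = list(images)
    # Warm-up calls are not timed, so one-off setup cost does not skew results.
    for image in images[:warmup]:
        predict(image)
    latencies = []
    for image in images:
        start = time.perf_counter()
        predict(image)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def dummy_predict(image: object) -> str:
    """Placeholder for the real model call (hypothetical)."""
    return "accepted"


latencies = measure_latency_ms(dummy_predict, range(100))
print(f"median latency: {statistics.median(latencies):.4f} ms")
```

When the model runs on a GPU, the prediction call would need to block until the result is ready. Otherwise the timer may stop before the work has finished and understate the latency.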

Value Delivered

  • Streamlined Workflow: By automating the detection of potential identity fakes, the model eliminated the need to check each picture manually. This significantly increased the productivity and efficiency of the client's system, allowing them to process a larger volume of images in a shorter amount of time.

  • Enhanced Accuracy: The model achieved a testing accuracy of 95.5%, surpassing the client's requirement of 95%. This high level of accuracy ensured reliable identification and classification of individuals, reducing the chances of false positives or negatives.

  • Cultural Sensitivity: The model successfully addressed the challenge of cultural diversity within the Indian dataset. It accurately distinguished between cultural cases and fakes, taking specific religious attire into account. This ensured that individuals adhering to cultural practices were correctly accepted.
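
The accuracy and the false positive and false negative rates mentioned above can be computed from labeled test predictions. The sketch below is a minimal illustration: the function name `classification_metrics`, the choice of "rejected" as the positive class, and the four toy labels are all assumptions, and the toy labels are not project data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[str],
                           y_pred: Sequence[str],
                           positive: str = "rejected") -> Dict[str, float]:
    """Compute accuracy, false-positive rate, and false-negative rate.

    "rejected" is treated as the positive class (an assumption), so a false
    positive is a legitimate face that was wrongly rejected.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Toy labels for illustration only.
y_true = ["accepted", "accepted", "rejected", "rejected"]
y_pred = ["accepted", "rejected", "rejected", "rejected"]
metrics = classification_metrics(y_true, y_pred)
print(metrics["accuracy"])  # 0.75
```

Reporting the two error rates alongside accuracy shows whether the remaining mistakes fall on legitimate users, such as wrongly rejected cultural attire, or on missed occlusions.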

Technologies and tools

  • Object Detection, Image Classification, Face detection

  • Python, PyTorch, TensorFlow, ONNX, OpenCV, Google Colab