Research Papers



Deep Learning-based Person Tracking using Facial Recognition: A Smart Approach to Security and Civic Monitoring

Title: Deep Learning-based Person Tracking using Facial Recognition: A Smart Approach to Security and Civic Monitoring

Authors: Shailendra Singh Kathait, Ashish Kumar, Samay Sawal

Summary: Person tracking using facial recognition has emerged as a crucial technology in surveillance, security, and human-computer interaction applications. This paper presents a comprehensive framework that integrates advanced facial detection, feature extraction, and tracking methodologies to robustly identify and monitor individuals in video streams. The approach combines state-of-the-art computer vision techniques with deep learning-based facial recognition to achieve real-time performance while maintaining high accuracy. The system integrates YOLO for object detection and DeepFace for facial recognition, offering an efficient solution for real-time person tracking. Additionally, the framework extends beyond individual tracking by incorporating intelligent analysis for detecting traffic violations, monitoring criminal activities, and identifying civic issues such as unauthorized encroachments or safety hazards. By leveraging existing surveillance infrastructure, this system enhances preventive policing and response times, making urban spaces safer and more efficient. The system is built using widely available open-source libraries and is designed for scalability across various camera setups. Experimental results demonstrate that this framework provides effective tracking and identification even under challenging conditions such as occlusions, varied lighting, and rapid movements.
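The paper's pipeline pairs per-frame YOLO detections with DeepFace identities and carries them across frames. As a minimal illustrative sketch (not the authors' code — the thresholds and the greedy matching strategy are assumptions), associating new detections with existing tracks by bounding-box overlap could look like:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def associate(tracks, detections, threshold=0.3):
    """Greedily match each track ID to the unused detection with best IoU.

    `tracks` maps track_id -> last known box; `detections` is a list of
    boxes from the current frame. Returns {track_id: detection_index}.
    """
    matches, used = {}, set()
    for tid, tbox in tracks.items():
        best, best_iou = None, threshold
        for i, dbox in enumerate(detections):
            if i in used:
                continue
            score = iou(tbox, dbox)
            if score > best_iou:
                best, best_iou = i, score
        if best is not None:
            matches[tid] = best
            used.add(best)
    return matches
```

In a full system, unmatched detections would spawn new tracks and a face-recognition model would attach an identity to each track; this sketch covers only the frame-to-frame association step.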

Real-Time Restricted Zone Violation Detection Using YOLOv8 and Centroid-Based Tracking

Deep Learning-based Person Tracking: A Smart Approach to Security and Civic Monitoring

Title: Deep Learning-based Person Tracking: A Smart Approach to Security and Civic Monitoring

Authors: Shailendra Singh Kathait – Co-Founder & Chief Data Scientist; Ashish Kumar – Principal Data Scientist; Samay Sawal – Intern Data Scientist; Ram Patidar – Data Scientist; Khushi Agrawal – Intern Data Scientist (all Valiance Solutions, Noida, India)

Summary: This paper introduces a deep learning-based framework designed for real-time detection and surveillance of individuals violating designated restricted zones, such as vehicle-only areas. Utilizing advanced object detection algorithms, specifically YOLOv8, the system focuses on head detection and spatial reasoning to accurately track individuals entering these zones. A centroid-based tracking mechanism ensures each individual is flagged only once per frame, enhancing detection precision. To further improve accuracy, the framework incorporates modifications to bounding boxes and employs region-specific polygonal filtering, allowing for more precise violation detection. Visual feedback is provided through overlaying boundary boxes and labels on detected individuals, while cumulative violation counts are recorded for monitoring purposes. The proposed system demonstrates stable performance under varying conditions, making it suitable for applications in crowd management, security, and surveillance. Its flexible architecture allows for the integration of additional capabilities, such as movement direction and speed analysis, to provide more context-aware violation assessments. By leveraging existing surveillance infrastructure, this approach offers a cost-effective solution for enhancing urban safety and monitoring.
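The core of the violation check described above — a centroid tested against a region-specific polygon, with each tracked individual flagged at most once — can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation; the polygon shape and the flag-once bookkeeping are assumptions:

```python
def centroid(box):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def point_in_polygon(pt, polygon):
    """Ray-casting test: is pt inside the polygon (list of (x, y) vertices)?"""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray's level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

class ZoneMonitor:
    """Flags each tracked person at most once for entering a restricted zone."""
    def __init__(self, zone_polygon):
        self.zone = zone_polygon
        self.flagged = set()
        self.violations = 0

    def update(self, track_id, box):
        """Return True only the first time this track enters the zone."""
        if track_id not in self.flagged and point_in_polygon(centroid(box), self.zone):
            self.flagged.add(track_id)
            self.violations += 1
            return True
        return False
```

The `flagged` set is what prevents double counting: subsequent frames showing the same track inside the zone leave the cumulative count unchanged.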


Deep Learning-based Approach for Detecting Traffic Violations Involving No Helmet Use and Wrong Cycle Lane Usage

Title: Deep Learning-based Approach for Detecting Traffic Violations Involving No Helmet Use and Wrong Cycle Lane Usage

Authors: Shailendra Singh Kathait – Co-Founder & Chief Data Scientist; Ashish Kumar – Principal Data Scientist; Samay Sawal – Intern Data Scientist; Ram Patidar – Data Scientist; Khushi Agrawal – Intern Data Scientist (all Valiance Solutions, Noida, India)

Summary: Urban road safety is significantly compromised by traffic violations such as motorcyclists riding without helmets and unauthorized use of cycle lanes. This study introduces a deep learning-based framework designed for the automated, real-time detection of these specific infractions. By leveraging advanced object detection and tracking algorithms, notably the YOLO (You Only Look Once) architecture, combined with spatial reasoning techniques, the system effectively identifies motorcyclists without helmets and detects bicycles operating outside designated lanes. Enhancements like bounding box adjustments, centroid-based relationships, and region-specific filtering are employed to improve detection accuracy. Additional analyses, including speed and direction assessments, provide contextual understanding of the violations. The system offers visual feedback and maintains cumulative violation counts, demonstrating robust performance across diverse urban traffic scenarios. Its scalable architecture allows for extension to detect a broader range of traffic violations, aiming to reduce reliance on manual monitoring and bolster road safety enforcement.
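One of the centroid-based relationships mentioned above is deciding whether a detected helmet belongs to a detected rider. A minimal sketch of that association, assuming a simple "top fraction of the rider box is the head region" heuristic (the fraction and the rule are illustrative, not taken from the paper):

```python
def helmet_violations(rider_boxes, helmet_boxes, head_fraction=0.25):
    """Return rider boxes that have no helmet centroid inside their head region.

    Boxes are (x1, y1, x2, y2) in image coordinates (y grows downward).
    The head region is taken as the top `head_fraction` of the rider box —
    an illustrative heuristic, not the paper's exact rule.
    """
    def centroid(b):
        return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)

    violators = []
    for rx1, ry1, rx2, ry2 in rider_boxes:
        head_y2 = ry1 + (ry2 - ry1) * head_fraction  # bottom of head region
        has_helmet = any(
            rx1 <= cx <= rx2 and ry1 <= cy <= head_y2
            for cx, cy in map(centroid, helmet_boxes)
        )
        if not has_helmet:
            violators.append((rx1, ry1, rx2, ry2))
    return violators
```

A wrong-lane check for bicycles would follow the same shape, testing the bicycle centroid against the cycle-lane polygon instead of a head region.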

Computer Vision and Deep Learning based Approach for Violations due to Illegal Parking Detection

Title: Computer Vision and Deep Learning based Approach for Violations due to Illegal Parking Detection

Authors: Shailendra Singh Kathait – Co-Founder & Chief Data Scientist; Ashish Kumar – Principal Data Scientist; Ram Patidar – Data Scientist; Samay Sawal – Intern Data Scientist; Khushi Agrawal – Intern Data Scientist (all Valiance Solutions)

Summary: This paper presents a cost-effective and scalable method for monitoring traffic violations using non-specialized public cameras. Traditional traffic enforcement relies on expensive infrastructure like Automatic Number Plate Recognition (ANPR) cameras and radar guns, which are often limited to specific locations due to their high costs. In contrast, this study leverages widely available public surveillance cameras, repurposing them for traffic monitoring without significant additional investments. The proposed system integrates state-of-the-art deep learning object detection models, specifically YOLO (You Only Look Once) architectures, with advanced computer vision techniques. By analyzing video feeds, the system identifies vehicles and tracks their movements, enabling the detection of violations. Experimental results demonstrate the robustness, accuracy, and real-time capabilities of the approach, highlighting its potential for practical deployment in urban traffic surveillance. The modular design and reliance on general-purpose cameras facilitate widespread and affordable implementation, offering a viable solution for enhancing traffic law enforcement and road safety in rapidly urbanizing areas.


Computer Vision and Deep Learning Based Approach for Traffic Violations Due To Over-Speeding and Wrong Direction Detection

Title: Computer Vision and Deep Learning Based Approach for Traffic Violations Due To Over-Speeding and Wrong Direction Detection

Authors: Shailendra Singh Kathait – Co-Founder and Chief Data Scientist; Ashish Kumar – Principal Data Scientist; Samay Sawal – Intern Data Scientist; Ram Patidar – Data Scientist; Khushi Agrawal – Intern Data Scientist (all Valiance Solutions, Noida, India)

Summary: This paper presents a cost-effective and scalable method for monitoring traffic violations using non-specialized public cameras. Traditional traffic enforcement relies on expensive infrastructure like Automatic Number Plate Recognition (ANPR) cameras and radar guns, which are often limited to specific locations due to their high costs. In contrast, this study leverages widely available public surveillance cameras, repurposing them for traffic monitoring without significant additional investments. The proposed system integrates state-of-the-art deep learning object detection models, specifically YOLO (You Only Look Once) architectures, with advanced computer vision techniques to accurately estimate vehicle speed and detect direction in real-time. By analyzing video feeds, the system identifies vehicles, tracks their movements, calculates speeds, and determines travel directions. This enables the detection of critical traffic violations such as over-speeding and wrong-direction driving. Experimental results demonstrate the robustness, accuracy, and real-time capabilities of the approach, highlighting its potential for practical deployment in urban traffic surveillance. The modular design and reliance on general-purpose cameras facilitate widespread and affordable implementation, offering a viable solution for enhancing traffic law enforcement and road safety in rapidly urbanizing areas.
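The speed and direction estimates described above boil down to measuring a vehicle centroid's displacement between frames and converting pixels to real-world units. A minimal sketch, assuming a known meters-per-pixel scale and camera frame rate (the paper does not publish its calibration values, so the numbers here are illustrative):

```python
def estimate_speed_kmh(p_prev, p_curr, meters_per_pixel, fps):
    """Speed in km/h from a centroid's displacement over one frame interval."""
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    pixels = (dx * dx + dy * dy) ** 0.5          # displacement in pixels
    meters_per_second = pixels * meters_per_pixel * fps
    return meters_per_second * 3.6               # m/s -> km/h

def direction_of_travel(p_prev, p_curr):
    """Coarse travel direction from the dominant displacement axis
    (image coordinates: y grows downward)."""
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    if abs(dx) >= abs(dy):
        return "right" if dx >= 0 else "left"
    return "down" if dy >= 0 else "up"
```

A wrong-direction violation then reduces to comparing `direction_of_travel` against the lane's permitted direction; real systems would average displacement over several frames to smooth detection jitter.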

MobileNetV2: Transfer Learning for Elephant Detection

Title: MobileNetV2: Transfer Learning for Elephant Detection

Authors: Samay Sawal – Intern Data Scientist, Valiance Solutions, Noida, India; Shailendra Singh Kathait – Co-Founder and Chief Data Scientist, Valiance Solutions, Noida, India

Summary: Wildlife conservation and ecological monitoring rely heavily on accurate species classification. This research presents a deep learning-based approach for elephant detection using MobileNetV2 and transfer learning techniques. Traditional classification methods are labor-intensive and prone to human error, making automated solutions essential for improving efficiency and accuracy. The study utilizes images captured from a specific reserved region, structured into two categories: “elephants” and “others.” Data augmentation techniques, including rotation, shifting, zooming, and flipping, enhance model robustness. MobileNetV2, a lightweight and efficient convolutional neural network, is employed as the feature extractor, leveraging pre-trained ImageNet weights. Custom layers such as Global Average Pooling, Fully Connected Layers, and Dropout are integrated to optimize performance. Comparative analysis with CNN and VGG16 models demonstrated that MobileNetV2 achieved superior classification performance, with a test accuracy of 98.31% and significantly lower computational costs. Transfer learning expedited model training and improved generalization across diverse environmental conditions. This research highlights the effectiveness of MobileNetV2 in wildlife monitoring and conservation. Future work includes expanding the dataset, deploying real-time monitoring systems on edge devices, and implementing individual elephant identification for enhanced conservation efforts. The proposed model serves as a scalable solution for automated wildlife classification tasks.
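The custom head described above starts with a Global Average Pooling layer, which collapses MobileNetV2's spatial feature maps into one value per channel before the fully connected layers. As a pure-Python illustration of what that operation computes (the actual model of course uses a framework layer, not this loop):

```python
def global_average_pooling(feature_map):
    """Collapse an H x W x C feature map (nested lists) to a length-C vector
    by averaging each channel over all spatial positions — exactly what a
    GAP layer computes before the classifier head."""
    h = len(feature_map)
    w = len(feature_map[0])
    c = len(feature_map[0][0])
    pooled = [0.0] * c
    for row in feature_map:
        for pixel in row:
            for ch in range(c):
                pooled[ch] += pixel[ch]
    return [v / (h * w) for v in pooled]
```

GAP is a common choice for transfer-learning heads because it produces a fixed-size vector regardless of input resolution and adds no trainable parameters, which helps when fine-tuning on a small two-class dataset.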

Artificial Intelligence for Human-Animal Conflict Mitigation: Image Classification and Human Tracking in Tadoba Andhari Tiger Reserve

Title: Artificial Intelligence for Human-Animal Conflict Mitigation: Image Classification and Human Tracking in Tadoba Andhari Tiger Reserve

Authors: Mothukuri Sujith, Shailendra Singh Kathait, Piyush Dhuliya (all Valiance Analytics Private Limited)

Summary: This study presents an advanced AI-driven approach to mitigating human-animal conflicts within the Tadoba Andhari Tiger Reserve (TATR), located in the Chandrapur region. This area faces significant issues as it harbors a diverse population of flora and fauna, including tigers, leopards, and bears, which frequently come into contact with surrounding communities. The Human-Animal Conflict Mitigation System (HACMS) developed for TATR utilizes edge AI cameras, deep learning-based image classification, and human tracking systems to predict and prevent potential conflict scenarios. Central to this approach are daytime-specific deep learning models that detect and classify animals in real time, leveraging the YOLO v5 architecture. Three distinct models comprise this system: a custom detection model trained on species-specific data, a pre-trained model based on YOLO for public datasets, and a segmentation model to resolve specific challenges in detecting animals like bears and bison that often appear similar in images. Each model serves a specific function within the detection pipeline, achieving robust accuracy in species identification and human recognition.

To build the models, a custom dataset of 7,959 images from TATR was utilized, with 73% allocated for training, 16% for validation, and 11% for testing. Data augmentation techniques such as rotation, brightness adjustment, and image preprocessing were applied to increase model generalization, enabling it to handle varied lighting and forest conditions. The YOLO v5 architecture's use of anchor-free detection and mini-batch normalization significantly boosted efficiency and precision, allowing the model to adapt to various object shapes and sizes. Through this setup, the system achieved an overall test accuracy of 94.82%, with near-perfect (~100%) accuracy for critical species like tigers, leopards, and bears, meeting forest authorities' requirements for reliable animal identification and alerting.

For human detection, the system integrates the Nanotrack algorithm from OpenCV, which provides lightweight, real-time tracking of human movement within forest areas. When the AI-enabled cameras detect human presence, this tracking mechanism initiates and follows the individual's movement using bounding boxes across frames. This process aids in monitoring human entry into restricted zones, alerting authorities if a person is close to potentially dangerous wildlife. Additionally, adjustments were made to the pre-trained model by replacing common vehicle classes with a 'Human' class, improving detection accuracy by focusing on forest-relevant categories. This paper emphasizes that effective conflict mitigation relies not only on accurate animal classification but also on tracking human activities to preemptively raise alerts and deter risky encounters. By harnessing edge analytics, the HACMS operates with limited dependence on cloud computing, making it well-suited to remote areas where connectivity may be sporadic.

The system's design is both scalable and adaptive, offering a template for future implementations in other high-conflict zones. Ultimately, this research demonstrates the transformative potential of AI and deep learning in human-animal conflict management, combining real-time image analysis with proactive alerting to create a safer environment for both humans and animals. The solution offers a promising step toward sustainable coexistence, supporting local communities, wildlife authorities, and conservation efforts by leveraging innovative technology to address the complex dynamics of shared ecosystems.
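The 73/16/11 partitioning of the 7,959-image dataset can be sketched as a deterministic split; the authors' actual partitioning procedure is not described, so this positional split (real pipelines typically shuffle with a fixed seed first) is purely illustrative:

```python
def split_dataset(items, train_frac=0.73, val_frac=0.16):
    """Split a list into train/val/test by position, mirroring the paper's
    reported 73% / 16% / 11% proportions. Illustrative only — a real
    pipeline would shuffle with a fixed seed before slicing."""
    n = len(items)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # the remaining ~11%
    return train, val, test
```

On 7,959 items this yields 5,810 training, 1,273 validation, and 876 test images, matching the stated proportions.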

Smart Screening: Non-Invasive Detection of Severe Neonatal Jaundice using Computer Vision and Deep Learning

Paper Title: Smart Screening: Non-Invasive Detection of Severe Neonatal Jaundice using Computer Vision and Deep Learning

Authors: Kartikya Gupta, Valiance Solutions; Vaibhav Sharma, Valiance Solutions; Shailendra Singh Kathait, Valiance Solutions

Summary: The research paper titled “Smart Screening: Non-Invasive Detection of Severe Neonatal Jaundice using Computer Vision and Deep Learning” presents a novel approach to detecting severe neonatal jaundice through non-invasive techniques, using advanced computer vision and deep learning algorithms. Neonatal jaundice is a common condition among newborns, and early detection is critical in preventing severe complications, such as kernicterus, a form of brain damage. Traditionally, detection methods involve blood tests, which are invasive, time-consuming, and expensive. This study proposes an innovative solution that could address these limitations.

The research focuses on utilizing image processing techniques to analyze visual data of newborns' skin to classify jaundice severity. The authors developed a custom convolutional neural network (CNN) model and compared its performance against several state-of-the-art transfer learning models, including MobileNet, EfficientNet, and Vision Transformer. These models were trained using a dataset of medical images specifically aimed at diagnosing jaundice. The deep learning models successfully identified the degree of jaundice with high accuracy, particularly in detecting severe cases that require medical attention. One of the key advantages of this system is its non-contact, affordable nature, which makes it an ideal solution for resource-limited healthcare settings. The proposed model could easily be deployed in remote or underdeveloped areas, where access to traditional diagnostic tools may be restricted. By leveraging smartphone cameras or other imaging devices, healthcare professionals and caregivers can screen infants in a timely and efficient manner, enabling earlier intervention and reducing the risk of complications.

The paper also discusses the potential scalability of the system, as well as its possible integration into telemedicine platforms. The findings indicate that the solution could significantly enhance the early detection of jaundice while minimizing the need for invasive procedures. Additionally, the cost-effectiveness and ease of use of the system suggest its potential as a widespread tool in neonatal care. In conclusion, this study highlights the promising role of computer vision and deep learning in healthcare, specifically in providing a non-invasive, affordable, and accessible solution for the early detection of severe neonatal jaundice. It represents a step forward in improving neonatal care, especially in areas with limited medical resources.
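The paper's models learn jaundice-related skin features from images. Purely to illustrate why skin color carries signal for this task — this naive hand-crafted index is emphatically not the authors' method, which uses learned CNN features — a "yellow dominance" score over skin-region pixels could look like:

```python
def yellowness(r, g, b):
    """Naive yellow-dominance score for one RGB pixel (0-255 channels):
    yellow shows as high red+green relative to blue. Illustrative only —
    the paper uses learned CNN features, not a hand-crafted index."""
    return ((r + g) / 2 - b) / 255.0

def mean_yellowness(pixels):
    """Average yellow-dominance over a list of (r, g, b) skin pixels."""
    return sum(yellowness(*p) for p in pixels) / len(pixels)
```

A threshold on such a score would be far too fragile under real lighting variation, which is precisely why the study trains CNN and transfer-learning models instead.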

Individual Tiger Identification Using Transfer Learning

Paper Title: Individual Tiger Identification using Transfer Learning

Authors: Shailendra Singh Kathait – Co-Founder and Chief Data Scientist, Valiance Analytics Pvt. Ltd., Noida, Uttar Pradesh; Vaibhav Singh – Data Scientist, Valiance Analytics Pvt. Ltd., Noida, Uttar Pradesh; Ashish Kumar – Principal Data Scientist, Valiance Analytics Pvt. Ltd., Noida, Uttar Pradesh

Summary: The research paper “Individual Tiger Identification using Transfer Learning” presents a novel approach to classify images of tigers into their respective classes, where each class represents an individual tiger. The study leverages deep learning models, specifically the integration of YOLOv8 and EfficientNetB3 models through transfer learning, to accurately identify individual tigers from a dataset of images collected via motion-activated cameras in a large tiger reserve. The paper outlines the significance of individual animal identification for wildlife preservation, particularly in tracking movements and understanding the population density of tigers in their natural habitats. The existing challenge is the classification of species into individual entities, a task made difficult by the high similarity between individuals of the same species and the vast differences in environmental conditions in which images are captured.

The methodology proposed involves a two-step process. Initially, the YOLOv8 model is employed for object detection to create bounding boxes around tigers in images, even though it inaccurately labels them as zebras. This step is crucial for isolating the tiger within each image despite the background complexity. Subsequently, the EfficientNetB3 model, pre-trained on the ImageNet dataset, is fine-tuned for the specific task of tiger identification, using a dataset comprising images of 98 unique tigers. This subset was selected based on a minimum availability of 15 images per tiger, from a larger pool of 192 tigers, to ensure sufficient data for reliable model training. Data augmentation techniques, including rotation, horizontal flipping, width and height shifts, and zooming, were employed to address class imbalance and enhance the robustness of the model. The paper discusses the importance of data preprocessing and augmentation in detail, emphasizing the need for a standardized image resolution and the creation of a balanced training dataset to improve model accuracy.

The results demonstrate the model's high efficiency, achieving a validation accuracy of over 85% and a test accuracy of 88.49% across 98 different tiger classes. The performance is underscored by the precision and recall values, indicating the model's reliability in individual tiger identification. The paper also discusses the challenges encountered, such as the variability in feature descriptor counts across different images of the same tiger, and the impact of varying backgrounds on SIFT feature extraction results. To conclude, the paper highlights the potential of deep learning and transfer learning in wildlife conservation efforts, particularly in individual animal identification. It suggests that future research could expand the model's applicability to different geographical locations and diverse environmental conditions, further enhancing its accuracy and robustness. The ultimate goal is to facilitate real-time monitoring of tigers, contributing to better understanding and sustainable human-wildlife interactions.
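The class-selection and balancing steps above are simple to state precisely. A sketch of filtering tiger IDs by the 15-image minimum and computing how many augmented images each class would need to reach a common target (the target value is an assumption for illustration):

```python
def select_classes(image_counts, min_images=15):
    """Keep only tiger IDs with at least `min_images` images, mirroring the
    paper's selection of 98 of 192 tigers. `image_counts` maps ID -> count."""
    return sorted(tid for tid, n in image_counts.items() if n >= min_images)

def augmentation_plan(image_counts, target):
    """Number of augmented images needed per class to reach `target` images —
    rotation, flips, shifts, and zooms would generate these copies."""
    return {tid: max(0, target - n) for tid, n in image_counts.items()}
```

Balancing toward a common per-class count is one standard way to address the imbalance the paper describes; the exact target the authors used is not stated in the summary.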

Deep Learning-based Model for Wildlife Species Classification

Paper Title: Deep Learning-based Model for Wildlife Species Classification

Authors: Shailendra Singh Kathait, Ashish Kumar, Piyush Dhuliya, and Ikshu Chauhan

Summary: Motion-activated cameras have become ubiquitous in ecological parks and wildlife sanctuaries, capturing images upon sensor-triggered motion, including infrared visuals that were once impractical. Despite this technological leap, extracting pertinent wildlife information from the vast image dataset remains time- and labor-intensive. This paper presents a solution employing deep learning models, particularly the VGG16 ConvNet architecture through transfer learning, to achieve near-human-level accuracy in information extraction. The study focuses on a dataset of 33,511 images representing 19 species from the Ladakh region of India. Training and testing the model yielded an accuracy of 89.12%. The established pipeline exhibits vast potential for wildlife monitoring in various national parks, advancing ecological research and conservation. The methodology involved utilizing 80% of the dataset for training and 20% for validation. Subsequently, 3,309 unseen images were tested, leading to a confusion matrix. The matrix highlights accurate species classifications, such as correctly identifying 284 out of 302 bird images. However, the study acknowledges geographical limitations, emphasizing the need for a region-specific model. Enhancements in overall test accuracy are anticipated through increased and diverse training data, optimizing the model's efficiency beyond the Ladakh region.
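The reported per-class figures (e.g. 284 of 302 bird images correct) come straight from the confusion matrix. A minimal sketch of reading per-class and overall accuracy from such a matrix, using a made-up two-class matrix for illustration:

```python
def per_class_accuracy(confusion, labels):
    """Per-class accuracy (recall) from a confusion matrix whose rows are
    true labels and columns are predicted labels."""
    acc = {}
    for i, label in enumerate(labels):
        total = sum(confusion[i])                 # images truly in this class
        acc[label] = confusion[i][i] / total if total else 0.0
    return acc

def overall_accuracy(confusion):
    """Fraction of all images on the matrix diagonal (correctly classified)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total
```

For the bird class above, 284 correct out of a 302-image row gives roughly 94% per-class accuracy, consistent with the matrix-reading convention the paper describes.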
