DeepFake Face Detection Using AI/ML: A Project in Odisha
- Project Centre

With the exponential rise in the creation and circulation of deepfake videos, digital misinformation has reached alarming levels. Deepfakes, generated using AI techniques like GANs (Generative Adversarial Networks), can convincingly superimpose a person’s face onto another’s body or alter facial expressions and speech. This poses a serious threat to digital integrity, especially in sensitive areas such as politics, media, and security. In Odisha, a growing hub for AI and machine learning research, this project focuses on leveraging AI-driven solutions—particularly Long Short-Term Memory (LSTM) networks—to detect deepfake videos effectively.
What Are DeepFakes?
Deepfakes are synthetic media in which a person in an existing image or video is replaced with someone else's likeness using deep learning models. Although the technology has legitimate applications in entertainment and education, it is increasingly being misused for spreading disinformation, creating non-consensual content, and conducting identity fraud. Deepfake detection, therefore, has become a critical challenge in the digital era.
Why Odisha for DeepFake Detection Research?
Odisha has emerged as a rising contributor in the Indian AI and tech ecosystem. With premier institutions like IIIT Bhubaneswar, NIT Rourkela, and growing tech incubators, the state provides a fertile ground for advanced machine learning research. The availability of academic talent, increasing governmental interest in digital safety, and expanding infrastructure make Odisha a promising location for a project of this magnitude and social impact.
Project Objective
The primary objective of this machine learning project is to develop a robust deepfake detection system using LSTM networks. The system aims to analyze facial patterns over time within a video to identify inconsistencies typical of manipulated content. By doing so, it not only detects deepfakes with high accuracy but also aids in curbing the spread of malicious content on digital platforms.
Technical Approach: Why LSTM?
LSTM (Long Short-Term Memory) networks are a type of Recurrent Neural Network (RNN) especially suited to sequence data such as video. Unlike feed-forward neural networks, LSTMs can retain long-term dependencies and identify patterns across time steps, which is essential for detecting subtle anomalies in deepfake videos. Manipulated videos often introduce artifacts or inconsistencies in facial expressions that go unnoticed in individual frames but become apparent when the frames are analyzed sequentially.
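To make the recurrence concrete, here is a minimal, illustrative LSTM cell in NumPy processing a toy "video" of per-frame embedding vectors. The dimensions and random weights are placeholders for illustration, not the project's actual model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [x_t; h_prev] to the four gate pre-activations."""
    z = W @ np.concatenate([x_t, h_prev]) + b
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate: decides what long-term state to keep
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # cell state carries information across frames
    h = o * np.tanh(c)         # hidden state summarizes the sequence so far
    return h, c

# Toy sequence: 8 frame embeddings of size 16, hidden state of size 32.
rng = np.random.default_rng(0)
D, H, T = 16, 32, 8
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(T, D)):
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape)  # (32,) — one vector summarizing all frames seen so far
```

The final hidden state `h` depends on every frame in order, which is exactly why a classifier built on it can react to temporal inconsistencies that no single frame reveals.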
Data Preprocessing and Augmentation
The success of any machine learning model largely depends on the quality of the data it is trained on. For this project, each video is first segmented into frames. Every face in a frame is extracted using a face detector such as MTCNN or a Haar cascade classifier, and the resulting facial crops are normalized and resized for consistency.
To enhance model generalization, data augmentation techniques like random cropping, brightness adjustment, flipping, and noise addition are applied. This ensures the model is robust enough to handle variations in real-world video conditions such as lighting, background clutter, and camera motion.
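The augmentation step can be sketched in NumPy. This assumes the face detector (e.g. MTCNN) has already produced normalized float crops in [0, 1]; the crop size of 128 and output size of 112 are illustrative choices, not the project's fixed values:

```python
import numpy as np

def random_crop(frame, out_h, out_w, rng):
    """Crop a random out_h x out_w window from an H x W x 3 frame."""
    h, w = frame.shape[:2]
    top = int(rng.integers(0, h - out_h + 1))
    left = int(rng.integers(0, w - out_w + 1))
    return frame[top:top + out_h, left:left + out_w]

def augment_face(frame, rng):
    """Simple augmentations for one face crop with pixel values in [0, 1]."""
    if rng.random() < 0.5:                    # horizontal flip
        frame = frame[:, ::-1, :]
    frame = frame * rng.uniform(0.8, 1.2)     # brightness jitter
    frame = frame + rng.normal(scale=0.02, size=frame.shape)  # additive noise
    frame = np.clip(frame, 0.0, 1.0)          # keep values in valid range
    return random_crop(frame, 112, 112, rng)

rng = np.random.default_rng(42)
face = rng.random((128, 128, 3))   # stand-in for a detected face crop
aug = augment_face(face, rng)
print(aug.shape)  # (112, 112, 3)
```

Applying the same kind of randomness at training time forces the model to rely on manipulation artifacts rather than incidental properties such as lighting or framing.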
Model Training and Optimization
The LSTM model is trained on a curated dataset that includes both authentic and manipulated videos. Popular datasets such as FaceForensics++, DeepFake Detection Challenge Dataset, and Celeb-DF are utilized. Each video is fed into the model as a sequence of facial embeddings obtained using pre-trained CNNs (e.g., VGGFace or ResNet50).
Hyperparameter tuning plays a critical role in performance. Parameters such as the number of LSTM layers, hidden units, learning rate, and batch size are optimized using grid search and cross-validation methods. The Adam optimizer is employed to minimize the loss function, and dropout layers are used to prevent overfitting.
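The grid-search loop over those hyperparameters can be sketched as follows. The search space values are illustrative, and `train_and_validate` is a placeholder for the real training run (here it returns a dummy score so the loop itself is runnable):

```python
from itertools import product

# Hypothetical search space — values are illustrative, not the project's config.
search_space = {
    "lstm_layers": [1, 2],
    "hidden_units": [64, 128],
    "learning_rate": [1e-3, 1e-4],
    "batch_size": [16, 32],
}

def train_and_validate(config):
    """Placeholder: in the real project this would train the LSTM with
    `config` and return accuracy on a held-out validation split."""
    return 0.85 + 0.02 * config["lstm_layers"] + 0.0001 * config["hidden_units"]

best_score, best_config = -1.0, None
for values in product(*search_space.values()):
    config = dict(zip(search_space.keys(), values))
    score = train_and_validate(config)
    if score > best_score:           # keep the best-scoring configuration
        best_score, best_config = score, config

print(best_config)
```

In practice each candidate configuration would be scored with k-fold cross-validation rather than a single split, and the dummy scoring function would be replaced by an actual fit of the LSTM with the Adam optimizer and dropout.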
Evaluation Metrics
To determine the effectiveness of the deepfake detection system, multiple evaluation metrics are used:
Accuracy: Measures the overall correctness of the model.
Precision: Indicates how many of the detected deepfakes are actually fake.
Recall: Shows how many of the actual deepfakes the model was able to detect.
F1 Score: Provides a balance between precision and recall.
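All four metrics follow directly from the confusion-matrix counts. A minimal sketch, with toy labels purely for illustration (1 = deepfake, 0 = genuine):

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1, treating 1 = deepfake."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0   # of flagged, how many fake
    recall = tp / (tp + fn) if tp + fn else 0.0      # of fakes, how many caught
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy predictions for 8 videos
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
m = classification_metrics(y_true, y_pred)
print(m)  # accuracy 0.75, precision 0.75, recall 0.75, f1 0.75
```

In production one would typically use a library implementation (e.g. scikit-learn's `precision_recall_fscore_support`), but the arithmetic above is exactly what those functions compute for the binary case.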
Preliminary results show that the LSTM-based model achieves an accuracy of over 90%, with high F1 scores indicating a strong ability to detect even well-crafted deepfakes.
Challenges Faced
Despite promising results, the project faces certain challenges:
False Positives/Negatives: Some genuine videos may be incorrectly flagged, and some deepfakes may slip through undetected.
Real-Time Processing: Detecting deepfakes in real time remains computationally intensive.
Generalization: Models trained on specific datasets may not perform well on new, unseen deepfakes.
Efforts are being made to incorporate real-time detection capabilities using lightweight models and further training on diverse datasets to improve generalization.
Real-World Applications
The application of this technology extends to various sectors:
Social Media: Platforms like Facebook, Instagram, and YouTube can integrate this model to flag and review suspicious content.
Law Enforcement: Authorities can use it to verify the authenticity of video evidence.
Media Houses: Journalists can use the tool to validate user-submitted or viral videos before publishing.
Project Includes:
PPT
Synopsis
Report
Project Source Code
Base Research Paper
Video Tutorials
Contact us for the Project files, Development, IT Services & Consultancy
Contact Us: contactvatshayan.com