Anomaly Analysis
Fraud Detection Through Anomalies
Project Overview
This project focuses on detecting fraudulent transactions using machine learning models and anomaly detection techniques. Given the extreme class imbalance, the main objective is to improve fraud detection by minimizing false negatives, that is, to achieve high recall without compromising precision.
Implemented Techniques
• Logistic Regression: Initial model without adjustments, achieving high accuracy but poor fraud detection.
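Simple arithmetic shows how high accuracy can coexist with poor fraud detection. The sketch below assumes a hypothetical dataset of 100,000 transactions containing 170 frauds (0.17%, an illustrative figure rather than the project's data) and a degenerate model that labels every transaction as legitimate. The function name `accuracy_and_recall` is likewise an assumption made for this example.

```python
def accuracy_and_recall(tp, fp, tn, fn):
    """Return (accuracy, recall) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall

# Hypothetical counts, chosen only for illustration.
n_total = 100_000
n_fraud = 170

# A model that predicts "legitimate" for everything:
# no true positives, every fraud becomes a false negative.
accuracy, recall = accuracy_and_recall(tp=0, fp=0, tn=n_total - n_fraud, fn=n_fraud)

print(f"accuracy={accuracy:.4f} recall={recall:.2f}")
# prints: accuracy=0.9983 recall=0.00
```

Accuracy is dominated by the majority class here, so it says little about how many frauds are actually caught.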
Anomaly Detection Approaches
• Minimum Covariance Determinant (MCD): Identifies outliers based on robust covariance estimation.
• Isolation Forest: Detects anomalies by analyzing data point isolation in the feature space.
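Both approaches can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the project's pipeline. It assumes `EllipticEnvelope` as the MCD-based detector (it fits a robust covariance estimate and flags points that lie far from it), a `contamination` of 1%, and hypothetical 2-feature data in which 10 scattered points stand in for fraud.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical data: 1000 "legitimate" points near the origin and
# 10 "fraud" points scattered far away from it.
inliers = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))
signs = rng.choice([-1.0, 1.0], size=(10, 2))
outliers = signs * rng.uniform(6.0, 12.0, size=(10, 2))
X = np.vstack([inliers, outliers])
is_fraud = np.r_[np.zeros(1000, dtype=bool), np.ones(10, dtype=bool)]

# contamination is the assumed share of anomalies in the data (about 1% here).
detectors = {
    "MCD (EllipticEnvelope)": EllipticEnvelope(contamination=0.01, random_state=0),
    "Isolation Forest": IsolationForest(contamination=0.01, random_state=0),
}

caught = {}
for name, detector in detectors.items():
    labels = detector.fit_predict(X)  # -1 = anomaly, 1 = normal
    flagged = labels == -1
    caught[name] = int((flagged & is_fraud).sum())
    print(f"{name}: flagged {int(flagged.sum())} points, "
          f"{caught[name]} of them injected outliers")
```

Neither detector uses the labels during fitting; `is_fraud` is only used afterwards to count how many injected points were flagged. The MCD-based detector assumes roughly Gaussian inliers, which real transaction features may not satisfy.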
Class Balancing Strategies
• Undersampling: Reduces majority class to balance distribution.
• Oversampling: Increases the number of fraud cases to prevent model bias toward the majority class.
• SMOTE (Synthetic Minority Over-sampling Technique): Generates synthetic fraud samples to improve representation.
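The three strategies above can be sketched in plain Python. This is a simplified illustration under stated assumptions: the data are hypothetical 2-feature points, the function names are invented for this example, and `smote_like` only implements the core interpolation idea of SMOTE rather than the full algorithm. In practice a library such as imbalanced-learn (`RandomUnderSampler`, `RandomOverSampler`, `SMOTE`) would normally be used.

```python
import math
import random

def random_undersample(majority, minority, rng):
    """Keep all minority rows and an equally sized random subset of majority rows."""
    return rng.sample(majority, len(minority)), list(minority)

def random_oversample(majority, minority, rng):
    """Keep all majority rows and repeat minority rows (with replacement) to match."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

rng = random.Random(0)
# Hypothetical data: 1000 legitimate rows and 10 fraud rows.
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
fraud = [(rng.gauss(4, 0.5), rng.gauss(4, 0.5)) for _ in range(10)]

under_legit, under_fraud = random_undersample(legit, fraud, rng)
over_legit, over_fraud = random_oversample(legit, fraud, rng)
smote_fraud = fraud + smote_like(fraud, n_new=len(legit) - len(fraud), k=3, rng=rng)

print(len(under_legit), len(under_fraud))  # prints: 10 10
print(len(over_legit), len(over_fraud))    # prints: 1000 1000
print(len(legit), len(smote_fraud))        # prints: 1000 1000
```

Resampling is normally applied to the training split only, so that the evaluation data keep the original class distribution.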
Key Considerations
The main challenge in fraud detection lies in handling skewed distributions, where fraudulent transactions constitute less than 0.2% of the dataset. Traditional classifiers struggle in such scenarios, making precision-recall trade-offs crucial. Anomaly detection methods leverage the inherent rarity of fraud cases, while data balancing techniques prevent models from being biased toward the majority class.
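The precision-recall trade-off can be made concrete by sweeping the decision threshold over model scores. The sketch below uses ten hypothetical transactions with made-up scores and labels (1 = fraud). The helper `precision_recall` is an assumption for this example, not part of the project.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are flagged as fraud."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for y, f in zip(y_true, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(y_true, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(y_true, flagged) if y == 1 and not f)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical labels and fraud scores, sorted by score for readability.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.45, 0.55, 0.60, 0.80, 0.95]

for t in (0.25, 0.50, 0.70):
    p, r = precision_recall(y_true, scores, t)
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
# prints:
# threshold=0.25 precision=0.50 recall=1.00
# threshold=0.50 precision=0.75 recall=1.00
# threshold=0.70 precision=1.00 recall=0.67
```

Lowering the threshold catches every fraud at the cost of more false alarms, while raising it removes false alarms but starts missing frauds. When false negatives are the main concern, the threshold is usually chosen to meet a recall target first.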