Anomaly Analysis
Fraud Detection Through Anomalies
Project Overview
This project detects fraudulent transactions using machine learning models and anomaly detection techniques. Because the classes are extremely imbalanced, the main objective is to minimize false negatives (missed fraud), keeping recall high without giving up too much precision.
Implemented Techniques
• Logistic Regression: Baseline model trained without any adjustment for class imbalance. It achieves high accuracy but detects fraud poorly.
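To see why high accuracy can coexist with poor fraud detection, here is a minimal sketch using hypothetical counts (100,000 transactions, 170 of them fraudulent). The numbers and the `classification_metrics` helper are illustrative assumptions, not taken from the project's data or code. A degenerate model that never flags fraud still scores very high accuracy:

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Fall back to 0.0 when a ratio is undefined (nothing flagged / no fraud present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, not taken from the project's dataset.
n_total, n_fraud = 100_000, 170

# A "model" that labels every transaction as legitimate:
# no true or false positives, every fraud case is a false negative.
accuracy, precision, recall = classification_metrics(
    tp=0, fp=0, tn=n_total - n_fraud, fn=n_fraud
)
print(f"accuracy={accuracy:.4f} precision={precision:.2f} recall={recall:.2f}")
```

Under these assumed counts the accuracy is 99.83% while recall is zero, which is why accuracy alone says little about fraud detection quality.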
Anomaly Detection Approaches
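• Minimum Covariance Determinant (MCD): Fits a robust covariance estimate to the data and flags points that lie far from it (large Mahalanobis distance) as outliers.
• Isolation Forest: Isolates data points with random splits of the feature space. Anomalies need fewer splits to isolate, so they end up with shorter paths in the trees.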
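The isolation idea can be sketched in plain Python. This is a simplified illustration, not the project's implementation: every tree is built on the full dataset (the original algorithm subsamples per tree), and the data, function names, and parameters below are all assumptions made for the example. MCD is not shown here. In practice a library implementation would normally be used, for example scikit-learn's `IsolationForest`, and `MinCovDet` or `EllipticEnvelope` for MCD; the source does not say which library the project relies on.

```python
import math
import random


def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split on a random feature at a random threshold until points are isolated."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    threshold = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < threshold]
    right = [p for p in points if p[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Splits needed to reach the point's leaf, plus a correction for unsplit leaves."""
    if "feature" not in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if point[tree["feature"]] < tree["threshold"] else "right"
    return path_length(tree[branch], point, depth + 1)


def anomaly_scores(data, n_trees=100, seed=0):
    """Score each point between 0 and 1; higher means easier to isolate."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(len(data)))
    forest = [build_tree(data, 0, max_depth, rng) for _ in range(n_trees)]
    norm = c_factor(len(data))
    return [
        2 ** (-sum(path_length(tree, p) for tree in forest) / (n_trees * norm))
        for p in data
    ]


# Synthetic data: 200 "normal" points plus one hand-placed outlier.
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))

scores = anomaly_scores(data)
most_anomalous = max(range(len(data)), key=lambda i: scores[i])
print(data[most_anomalous], round(scores[most_anomalous], 3))
```

The hand-placed point at (8, 8) is separated from the cluster after very few random splits, so its average path is short and its score is the highest in the set.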
Class Balancing Strategies
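• Undersampling: Removes majority-class (legitimate) transactions to balance the class distribution.
• Oversampling: Increases the number of fraud cases, typically by duplicating them, so the model is not biased toward the majority class.
• SMOTE (Synthetic Minority Over-sampling Technique): Generates synthetic fraud samples by interpolating between existing fraud cases and their nearest fraud neighbours.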
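The three balancing strategies above can be sketched in plain Python as follows. This is an illustrative sketch on synthetic data, not the project's code: the function names, the random variant of under- and oversampling, and the parameter `k` are assumptions. Libraries such as imbalanced-learn provide maintained versions (`RandomUnderSampler`, `RandomOverSampler`, `SMOTE`).

```python
import math
import random


def undersample(majority, minority, rng):
    """Randomly keep only as many majority rows as there are minority rows."""
    return rng.sample(majority, len(minority)), list(minority)


def oversample(majority, minority, rng):
    """Randomly duplicate minority rows until both classes have equal size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra


def smote(minority, n_new, k, rng):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row, excluding the row itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Synthetic two-feature data: 1,000 legitimate rows and 10 fraud rows (made up).
rng = random.Random(42)
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
fraud = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(10)]

under_legit, under_fraud = undersample(legit, fraud, rng)
over_legit, over_fraud = oversample(legit, fraud, rng)
synthetic_fraud = smote(fraud, n_new=len(legit) - len(fraud), k=5, rng=rng)

print(len(under_legit), len(under_fraud))            # 10 10
print(len(over_legit), len(over_fraud))              # 1000 1000
print(len(legit), len(fraud) + len(synthetic_fraud)) # 1000 1000
```

Whichever strategy is chosen, resampling should be applied to the training split only, so that evaluation still reflects the real class distribution.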
Key Considerations
The main challenge in fraud detection is the skewed class distribution: fraudulent transactions make up less than 0.2% of the dataset. A classifier tuned for overall accuracy can exceed 99.8% simply by labeling every transaction as legitimate, so accuracy is misleading here and the precision-recall trade-off becomes the relevant measure. Anomaly detection methods exploit the inherent rarity of fraud cases, while data balancing techniques keep models from being biased toward the majority class.
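The precision-recall trade-off can be made concrete by sweeping the decision threshold over model scores. The sketch below uses ten hand-made scores and labels (hypothetical, not project results) and a helper named `precision_recall` introduced only for this example:

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are flagged as fraud."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    # Fall back to 0.0 when a ratio is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical fraud scores and true labels (1 = fraud, 0 = legitimate).
scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05, 0.02]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

for threshold in (0.9, 0.75, 0.5, 0.25):
    precision, recall = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, lowering the threshold from 0.9 to 0.5 raises recall from 0.33 to 1.00 while precision drops from 1.00 to 0.75, and lowering it further to 0.25 only adds false positives (precision 0.50).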