tezzytezzy/credit-risk-anomaly-detection

Credit Card Anomaly Detection

Objective

Experiment with the binary classification models listed below, combined with Principal Component Analysis (PCA), in Apache Spark, and select the most appropriate one based on the area under the ROC curve (AUC).

  • Logistic Regression
  • Random Forest Classification
  • Linear Support Vector Classification
  • Gradient Boosted Tree Classification
  • Naive Bayes Classification
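
The comparison described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the column names (`features`, `label`, `model_input`), the default `k=5`, and the helper names `evaluate_candidates` and `select_best` are all assumptions made for the example. One detail worth noting is that Spark's `NaiveBayes` rejects negative feature values during training, so the sketch rescales the PCA output with `MinMaxScaler` before it reaches any classifier.

```python
def evaluate_candidates(train_df, test_df, k=5):
    """Fit each candidate on train_df and return {model name: test AUC}.

    Assumes (hypothetically) that both DataFrames carry a numeric vector
    column "features" and a 0/1 column "label", and that k does not
    exceed the number of features.
    """
    # Imported inside the function so this module loads without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import (
        GBTClassifier,
        LinearSVC,
        LogisticRegression,
        NaiveBayes,
        RandomForestClassifier,
    )
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import PCA, MinMaxScaler, StandardScaler

    # Shared preprocessing: standardise, project onto k principal components,
    # then rescale to [0, 1] because NaiveBayes needs non-negative inputs.
    prep = [
        StandardScaler(inputCol="features", outputCol="scaled",
                       withMean=True, withStd=True),
        PCA(k=k, inputCol="scaled", outputCol="pca"),
        MinMaxScaler(inputCol="pca", outputCol="model_input"),
    ]

    common = dict(featuresCol="model_input", labelCol="label")
    classifiers = {
        "Logistic Regression": LogisticRegression(**common),
        "Random Forest": RandomForestClassifier(**common),
        "Linear SVC": LinearSVC(**common),
        "Gradient Boosted Trees": GBTClassifier(**common),
        "Naive Bayes": NaiveBayes(**common),
    }

    evaluator = BinaryClassificationEvaluator(
        rawPredictionCol="rawPrediction",
        labelCol="label",
        metricName="areaUnderROC",
    )

    scores = {}
    for name, clf in classifiers.items():
        model = Pipeline(stages=prep + [clf]).fit(train_df)
        scores[name] = evaluator.evaluate(model.transform(test_df))
    return scores


def select_best(auc_by_model):
    """Return (name, auc) for the model with the highest AUC."""
    name = max(auc_by_model, key=auc_by_model.get)
    return name, auc_by_model[name]
```

With a train/test split in hand, `select_best(evaluate_candidates(train_df, test_df))` would return the name and AUC of the strongest candidate. Fitting the preprocessing stages inside each pipeline keeps the PCA projection learned from the training split only.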

Installation

The following package needs to be installed:

pyspark                   2.4.5                      py_0 

Dataset

Statlog (German Credit Data) Data Set
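
As a rough sketch of how one row of this dataset might be prepared, assuming the original whitespace-separated `german.data` layout (20 attributes followed by a class code, where 1 means good credit and 2 means bad): the helper name `parse_german_row` and the choice to map "bad" to the positive label 1 are assumptions for illustration, and the repository may load the data differently.

```python
def parse_german_row(line):
    """Split one row into (attributes, label).

    Numeric attributes become ints; categorical codes such as "A11"
    stay as strings. The class code 2 (bad credit) is mapped to 1 and
    the class code 1 (good credit) to 0, an assumption made for this sketch.
    """
    fields = line.split()
    if len(fields) != 21:
        raise ValueError("expected 20 attributes plus a class code, got %d fields"
                         % len(fields))
    *attributes, cls = fields
    if cls not in ("1", "2"):
        raise ValueError("unexpected class code: %r" % cls)
    parsed = [int(v) if v.isdigit() else v for v in attributes]
    label = 1 if cls == "2" else 0
    return parsed, label


# Illustrative row in the dataset's format (not necessarily a real record).
sample = ("A11 6 A34 A43 1169 A65 A75 4 A93 A101 4 "
          "A121 67 A143 A152 2 A173 1 A192 A201 1")
attributes, label = parse_german_row(sample)
```

The categorical codes would still need indexing or one-hot encoding before they can be assembled into a Spark feature vector.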

Reference

Machine Learning with PySpark (ISBN 978-1-4842-4130-1)
