CipherCredit: Decoding Creditworthiness with Machine Learning

Credit scoring is becoming increasingly vital in financial decisions. Forbes reported an average credit card debt of $5,474 per borrower in Q3 2022, totaling $38 billion. The intersection of technology and finance, notably in credit evaluation, is rapidly evolving. This project uses machine learning to classify applicants as 'good' or 'bad' credit risks, using a dataset found on Kaggle, and offers insights into improving traditional financial models.

Dataset

The project dataset consists of two files, application_record.csv and credit_record.csv, which can be merged on the client number (ID).
a. application_record.csv includes personal and financial information (gender, car ownership, income, etc.): 17 columns and ~440,000 rows
b. credit_record.csv tracks monthly credit history, overdue days, and payments: 3 columns and ~1,000,000 rows
The dataset was found on Kaggle during research on credit scoring and machine learning in finance.
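As a rough illustration of joining the two files on ID, the sketch below uses pandas with tiny in-memory stand-ins for the CSVs. Every column name other than ID (CODE_GENDER, AMT_INCOME_TOTAL, MONTHS_BALANCE, STATUS) and all values are assumptions made for the example, and the inner join is one possible choice, not necessarily the one used in this project.

```python
import pandas as pd

# Tiny stand-ins for the two files. In practice they would be loaded with
# pd.read_csv("application_record.csv") and pd.read_csv("credit_record.csv").
# Column names other than ID are assumptions for illustration.
applications = pd.DataFrame({
    "ID": [1, 2, 2, 3],
    "CODE_GENDER": ["F", "M", "M", "F"],
    "AMT_INCOME_TOTAL": [90000.0, 120000.0, 120000.0, 60000.0],
})
credit = pd.DataFrame({
    "ID": [1, 1, 2, 4],
    "MONTHS_BALANCE": [0, -1, 0, 0],
    "STATUS": ["0", "C", "2", "X"],
})

# Keep one application row per client before merging.
applications = applications.drop_duplicates(subset="ID")

# An inner join keeps only clients that appear in both files:
# client 3 has no credit history and client 4 has no application.
merged = applications.merge(credit, on="ID", how="inner")
print(merged)
```

Because credit_record.csv holds one row per client per month, the merged frame has several rows per client. A per-client label would still need to be derived from the monthly history before modeling.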

Data Preparation

De-duplication
Dealing with Sparse Columns
Handling Outliers
Imputing Missing Values using MICE
Balancing dataset using SMOTE
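
The last step above, SMOTE, balances classes by creating synthetic minority samples along the line segment between a minority point and one of its nearest minority neighbours. The following is a dependency-free sketch of that interpolation idea only. The function name smote_sketch, its parameters, and the sample points are hypothetical, and a real pipeline would more likely rely on an existing library implementation.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority class ("bad" credit) with two scaled features.
bad = [(0.2, 0.1), (0.3, 0.2), (0.25, 0.4), (0.5, 0.3)]
new_points = smote_sketch(bad, n_new=6)
print(new_points)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points never leak into the evaluation data.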

ML Pipeline

Results

Random Forest performed best after hyperparameter tuning. The results are shown below:

Based on the feature importances of the variables, we recommend:

Treat age and employment as critical factors when approving credit card applications
Develop tailored strategies for different age groups and employment categories
Consider personalized credit offerings based on family dynamics
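
For readers who want to see how tuning a Random Forest and reading its feature importances might look, here is a minimal scikit-learn sketch on synthetic data. The README does not state which hyperparameters were tuned or which search method was used, so the grid search, the grid values, the F1 scoring, the feature names (age, years_employed, family_size), and the label rule are all assumptions for illustration, not the project's actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in features; the names are hypothetical.
feature_names = ["age", "years_employed", "family_size"]
X = np.column_stack([
    rng.integers(21, 70, n),
    rng.integers(0, 40, n),
    rng.integers(1, 6, n),
])

# Synthetic label driven by age and employment plus noise (illustration only).
score = X[:, 0] + 2 * X[:, 1] + rng.normal(0, 5, n)
y = (score < 70).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Small, assumed hyperparameter grid searched with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=3,
    scoring="f1",
)
search.fit(X_train, y_train)

best = search.best_estimator_
importances = dict(zip(feature_names, best.feature_importances_))
test_accuracy = best.score(X_test, y_test)

print(search.best_params_)
print(importances)
```

Impurity-based importances such as feature_importances_ sum to 1 and rank features relative to each other. They describe what the fitted model relies on, not a causal effect on creditworthiness.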

masadshoaib/CipherCredit-Decoding-Creditworthiness-with-ML

Folders and files

Latest commit

History

Repository files navigation

CipherCredit: Decoding Creditworthiness with Machine Learning

Dataset

Data Preparation

ML Pipeline

Results

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages