🧠 Visa Approval Prediction using Machine Learning

This project builds a predictive model that estimates the likelihood of visa approval from candidate and job-related features. The goal is to help automate parts of the visa evaluation process with data-driven insights. Through exploratory data analysis (EDA), feature engineering, and model tuning, it demonstrates the workflow of developing an end-to-end classification model.


🎯 Objectives

  • Understand key factors influencing visa approvals.
  • Perform in-depth univariate and bivariate analysis of applicant and job attributes.
  • Apply data preprocessing, feature selection, and model tuning.
  • Compare different classification algorithms and evaluate their performance.
  • Identify the most significant predictors contributing to visa approval.

🧩 Key Methods

  • Data Preprocessing: Handling missing values, encoding categorical variables, and winsorizing outliers.
  • Exploratory Data Analysis (EDA): Univariate and bivariate analyses to uncover variable distributions and relationships.
  • Feature Engineering: Transformation and selection based on importance scores.
  • Model Development: Logistic Regression, Random Forest, and Gradient Boosting (before and after hyperparameter tuning).
  • Evaluation Metrics: F1-score, precision, recall, and accuracy.
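
As a rough illustration of the model development and evaluation steps listed above, the sketch below fits the three model families and reports the four metrics. It is not the notebook's code: the data is a synthetic stand-in generated with `make_classification`, and the split ratio, class weights, and hyperparameters are assumptions chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the visa data (assumption, not the project's dataset).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.35, 0.65], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Untuned baselines for the three model families named above.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Fit each model and collect the four evaluation metrics on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "f1": f1_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "accuracy": accuracy_score(y_test, pred),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 4) for metric, value in scores.items()})
```

The scores printed here describe the synthetic data only and say nothing about the results reported for this project.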

📊 Visualizations

📊 Categorical Analysis

Univariate Analysis Categorical 1
Univariate Analysis Categorical 2

📈 Numerical Analysis

Univariate Analysis Numerical

🔄 Bivariate Relationships

Bivariate Analysis 1
Bivariate Analysis 2

💰 Wage Analysis

Wage Analysis
Boxplot Prevailing Wage
Prevailing Wage After Winsorization
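
For readers unfamiliar with winsorization, here is a minimal sketch of one common way to cap extreme wage values. The 5th/95th percentile limits, the `winsorize` helper, and the sample wages are assumptions for illustration; the notebook may use different limits or a library routine.

```python
import numpy as np

def winsorize(values, lower_pct=5.0, upper_pct=95.0):
    """Cap values below/above the given percentiles (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    low, high = np.percentile(arr, [lower_pct, upper_pct])
    # Values outside [low, high] are pulled in to the nearest limit.
    return np.clip(arr, low, high)

# Made-up prevailing wages with one extreme outlier.
wages = np.array([32000, 41000, 45500, 52000, 58000,
                  61000, 67500, 72000, 88000, 310000])
capped = winsorize(wages)

print("max before:", wages.max(), "max after:", capped.max())
```

Unlike dropping outliers, this keeps every row and only limits how far the extreme values can pull the distribution.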

🤖 Model Performance

Models Before vs After Tuning F1 Score Comparison Before vs After Tuning Boosting Models

🌟 Feature Importance

Top 10 Feature Importances Random Forest
Top 10 Important Features Tuned Gradient Boosting


💡 Key Insights & Outcomes

🔍 Model Performance & Interpretability

  • The Tuned Random Forest achieved an F1 score of 82.06%, tied with Gradient Boosting for the highest, while being easier to interpret.
  • Handles class imbalance effectively.
  • Reduces overfitting by aggregating multiple diverse decision trees.
  • Robust to noise and scalable to high-dimensional data.
  • Offers faster training and easier tuning than Gradient Boosting.

⚙️ Automated Application Screening

  • Classifies visa petitions as likely to be certified or denied, helping prioritize workload and improve processing efficiency.

🚨 Risk Flagging

  • Automatically flags borderline or uncertain cases for manual verification by legal teams, minimizing decision errors.
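
One possible way to flag borderline cases is to look at the predicted probability of certification and route anything near the decision boundary to manual review. The sketch below assumes such probabilities are available (for example from a classifier's `predict_proba`); the `triage` helper and the 0.4/0.6 band are hypothetical choices, not values from this project.

```python
import numpy as np

def triage(probabilities, lower=0.4, upper=0.6):
    """Label each case from its predicted probability of certification."""
    probs = np.asarray(probabilities, dtype=float)
    # Confident cases: at or above the upper bound, otherwise treated as denied.
    labels = np.where(probs >= upper, "likely certified", "likely denied")
    # Cases strictly inside the band are uncertain and go to manual review.
    labels = np.where((probs > lower) & (probs < upper), "manual review", labels)
    return labels

probs = [0.95, 0.55, 0.10, 0.42, 0.60]
print(triage(probs).tolist())
# ['likely certified', 'manual review', 'likely denied', 'manual review', 'likely certified']
```

Widening the band sends more cases to manual review and reduces the number of automated decisions, so the thresholds are a workload trade-off rather than a fixed rule.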

📈 Transparent, Data-Driven Dashboards

  • Feature importance visualizations explain the reasons behind predictions, enhancing trust and transparency for stakeholders and applicants.
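
A top-10 importance ranking like the ones shown in the visualizations can be read from a fitted Random Forest's `feature_importances_` attribute. This is a sketch on synthetic data; the `feature_{i}` names are placeholders, not the project's columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with placeholder feature names (assumptions for illustration).
X, y = make_classification(n_samples=500, n_features=15, n_informative=6, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Sort impurity-based importances in descending order and keep the top 10.
order = np.argsort(forest.feature_importances_)[::-1][:10]
top10 = [(feature_names[i], float(forest.feature_importances_[i])) for i in order]

for name, score in top10:
    print(f"{name:<12} {score:.4f}")
```

These importances describe how much the model as a whole relies on each feature; they do not by themselves explain an individual prediction.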

🧭 Policy Optimization

  • Helps discover patterns (e.g., wage, experience, education) that impact certification rates, guiding internal policy and strategy decisions.
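
As a small example of this kind of pattern discovery, certification rates can be compared across groups with a pandas `groupby`. The sample rows and the column names `education` and `case_status` are made up for illustration and may not match the dataset's actual schema.

```python
import pandas as pd

# Tiny made-up sample; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "education": ["Bachelor's", "Master's", "High School",
                  "Master's", "Bachelor's", "High School"],
    "case_status": ["Certified", "Certified", "Denied",
                    "Certified", "Denied", "Denied"],
})

# Share of certified cases within each education level, highest first.
certified = df["case_status"].eq("Certified")
rates = certified.groupby(df["education"]).mean().sort_values(ascending=False)
print(rates)
```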

🏆 Final Model Selection

The Tuned Random Forest Classifier combines:

  • Strong predictive performance
  • Excellent interpretability
  • Speed and scalability
  • Ease of deployment

💡 Conclusion:
The Tuned Random Forest model proved to be the most practical and effective solution for EasyVisa’s automated visa prediction system, balancing accuracy, explainability, and operational efficiency.


🛠️ Technologies Used

  • Python
  • Pandas, NumPy for data manipulation
  • Matplotlib, Seaborn for visualization
  • Scikit-learn for model building and evaluation
  • Jupyter Notebook for analysis workflow

⚙️ Setup & Installation Instructions

1. Clone the repository:

git clone https://github.com/indu-explores-data/Visa-Approval-Prediction-using-ML.git
cd Visa-Approval-Prediction-using-ML

2. Install dependencies:

pip install -r requirements.txt

3. Open the Jupyter Notebook:

jupyter notebook "Visa Approval Prediction using ML.ipynb"

▶️ Usage Instructions

  • Run each cell in the notebook sequentially to reproduce the analysis.
  • Modify parameters (e.g., test size, model hyperparameters) to experiment with model behavior.
  • Visualizations are auto-generated during execution for interactive exploration.
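
When experimenting with hyperparameters, a grid search scored on F1 is one straightforward approach. The sketch below uses synthetic data, and the parameter grid, fold count, and test size are assumptions, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption); swap in the notebook's features and target.
X, y = make_classification(n_samples=400, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Small illustrative grid; larger grids increase run time quickly.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=1), param_grid, scoring="f1", cv=3
)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("cross-validated F1:", round(search.best_score_, 4))
print("held-out F1:", round(search.score(X_test, y_test), 4))
```

Keeping the test split out of the search, as done here, gives a less optimistic estimate than the cross-validated score alone.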

🔗 Connect with Me

Let’s connect on LinkedIn for project discussions or data-driven collaborations:

LinkedIn


🙌 Feedback & Support

If you found this project helpful, please ⭐ star the repository and share your thoughts. Suggestions and contributions are always welcome!
