🧠 Visa Approval Prediction using Machine Learning

This project focuses on building a predictive model to determine the likelihood of visa approval based on candidate and job-related features. The goal is to assist in automating parts of the visa evaluation process by leveraging data-driven insights and machine learning models. Through exploratory data analysis (EDA), feature engineering, and model tuning, this project demonstrates the workflow of developing an end-to-end classification model.

🎯 Objectives

Understand key factors influencing visa approvals.
Perform in-depth univariate and bivariate analysis of applicant and job attributes.
Apply data preprocessing, feature selection, and model tuning.
Compare different classification algorithms and evaluate their performance.
Identify the most significant predictors contributing to visa approval.

🧩 Key Methods

Data Preprocessing: Handling missing values, encoding categorical variables, and winsorizing outliers.
Exploratory Data Analysis (EDA): Univariate and bivariate analyses to uncover variable distributions and relationships.
Feature Engineering: Transformation and selection based on importance scores.
Model Development: Logistic Regression, Random Forest, and Gradient Boosting (before and after hyperparameter tuning).
Evaluation Metrics: F1-score, precision, recall, and accuracy.
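The preprocessing steps listed above could be sketched roughly as follows. This is a minimal illustration on a made-up frame, not the notebook's actual code: the column names (`prevailing_wage`, `education`, `case_status`), the 5th/95th percentile limits, the median imputation, and the helper `winsorize_column` are all assumptions.

```python
import pandas as pd


def winsorize_column(values: pd.Series, lower: float = 0.05, upper: float = 0.95) -> pd.Series:
    """Clip values to the given lower/upper quantiles (hypothetical helper)."""
    lo, hi = values.quantile(lower), values.quantile(upper)
    return values.clip(lower=lo, upper=hi)


# Toy data; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "prevailing_wage": [30_000, 45_000, 52_000, 61_000, 58_000, 49_000, 900_000, None],
    "education": ["Bachelor's", "Master's", "Doctorate", "High School",
                  "Master's", "Bachelor's", "Master's", "Bachelor's"],
    "case_status": ["Certified", "Denied", "Certified", "Denied",
                    "Certified", "Certified", "Certified", "Denied"],
})

# 1. Missing values: fill the numeric gap with the column median.
df["prevailing_wage"] = df["prevailing_wage"].fillna(df["prevailing_wage"].median())

# 2. Outliers: winsorize (cap) extreme wages instead of dropping the rows.
df["prevailing_wage"] = winsorize_column(df["prevailing_wage"])

# 3. Encoding: one-hot for the categorical feature, 0/1 for the target.
X = pd.get_dummies(df[["prevailing_wage", "education"]], columns=["education"])
y = (df["case_status"] == "Certified").astype(int)

print(X.shape)
print(y.tolist())
```

Winsorizing keeps every row, which matters when the outliers are real applications rather than data-entry errors.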

📊 Visualizations

📊 Categorical Analysis

📈 Numerical Analysis

🔄 Bivariate Relationships

💰 Wage Analysis

🤖 Model Performance

🌟 Feature Importance

💡 Key Insights & Outcomes

🔍 Model Performance & Interpretability

The Tuned Random Forest achieved the highest F1 score (82.06%), matching Gradient Boosting while being easier to interpret.
It handles class imbalance effectively.
It reduces overfitting by aggregating multiple diverse decision trees.
It is robust to noise and scales to high-dimensional data.
It trains faster and is easier to tune than Gradient Boosting.
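A comparison of this kind could look something like the sketch below. It runs on synthetic data from `make_classification`, so the scores it prints are unrelated to the results reported here. The parameter grid, the 75/25 split, and the use of `class_weight="balanced"` for the imbalance are assumptions for illustration, not the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, mildly imbalanced stand-in for the visa data (an assumption).
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           weights=[0.35, 0.65], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Baseline: fit each model with default settings and score it on the test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

# Tuning: small cross-validated grid search over the random forest, scored by F1.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8, None]}
search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=42),
    param_grid, scoring="f1", cv=3)
search.fit(X_train, y_train)
scores["Tuned Random Forest"] = f1_score(y_test, search.predict(X_test))

for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
print("Best parameters:", search.best_params_)
```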

⚙️ Automated Application Screening

Classifies visa petitions as likely to be certified or denied, helping prioritize workload and improve processing efficiency.

🚨 Risk Flagging

Automatically flags borderline or uncertain cases for manual verification by legal teams, minimizing decision errors.
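One simple way to flag borderline cases is to threshold the predicted probability of certification, as sketched below. The function name `triage`, the label strings, and the 0.4/0.6 band are hypothetical choices for illustration. A real deployment would pick the band from validation data.

```python
import numpy as np


def triage(probabilities, low=0.4, high=0.6):
    """Label each case from its predicted probability of certification.

    Cases whose probability falls inside [low, high] are routed to manual
    review. The thresholds are illustrative assumptions, not tuned values.
    """
    p = np.asarray(probabilities, dtype=float)
    labels = np.where(p > high, "likely_certified", "likely_denied")
    labels = np.where((p >= low) & (p <= high), "manual_review", labels)
    return labels.tolist()


# In practice these would come from something like model.predict_proba(X)[:, 1].
example_probs = [0.92, 0.55, 0.10, 0.40, 0.61]
print(triage(example_probs))
```

Widening the band sends more cases to reviewers and fewer to automatic decisions, so the thresholds act as a direct control on reviewer workload.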

📈 Transparent, Data-Driven Dashboards

Feature importance visualizations explain the reasons behind predictions, enhancing trust and transparency for stakeholders and applicants.
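As a rough sketch of where such a chart's numbers could come from, the snippet below ranks the impurity-based importances of a fitted random forest. The data is synthetic and the feature names are made up for readability. They are assumptions, not the columns of `EasyVisa.csv`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with made-up feature names (assumptions, not the real columns).
feature_names = ["prevailing_wage", "education_level", "has_job_experience",
                 "company_size", "region_code"]
X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort so the strongest predictors come first.
importances = (pd.Series(model.feature_importances_, index=feature_names)
               .sort_values(ascending=False))
print(importances.round(3))
```

These scores are global rankings: they show which features the forest relies on overall rather than why one particular petition received its prediction.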

🧭 Policy Optimization

Helps discover patterns (e.g., wage, experience, education) that impact certification rates, guiding internal policy and strategy decisions.
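Patterns like these can be checked with a plain group-by before any modeling. The sketch below computes certification rates on a toy frame. The column names and values are illustrative assumptions, not taken from the dataset.

```python
import pandas as pd

# Toy records; columns and values are illustrative assumptions.
df = pd.DataFrame({
    "education": ["Bachelor's", "Master's", "High School", "Master's",
                  "High School", "Doctorate", "Bachelor's", "Doctorate"],
    "has_job_experience": ["Y", "Y", "N", "N", "N", "Y", "N", "Y"],
    "case_status": ["Certified", "Certified", "Denied", "Certified",
                    "Denied", "Certified", "Denied", "Certified"],
})

df["certified"] = (df["case_status"] == "Certified").astype(int)

# Share of certified cases within each education level.
rate_by_education = (df.groupby("education")["certified"]
                     .mean()
                     .sort_values(ascending=False))

# Two-way view: education crossed with prior job experience.
rate_table = df.pivot_table(index="education", columns="has_job_experience",
                            values="certified", aggfunc="mean")

print(rate_by_education)
print(rate_table)
```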

🏆 Final Model Selection

The Tuned Random Forest Classifier combines:

Strong predictive performance
Excellent interpretability
Speed and scalability
Ease of deployment

💡 Conclusion:
The Tuned Random Forest model proved to be the most practical and effective solution for EasyVisa’s automated visa prediction system, balancing accuracy, explainability, and operational efficiency.

🛠️ Technologies Used

Python
Pandas, NumPy for data manipulation
Matplotlib, Seaborn for visualization
Scikit-learn for model building and evaluation
Jupyter Notebook for analysis workflow

⚙️ Setup & Installation Instructions

1. Clone the repository:

git clone https://github.com/indu-explores-data/Visa-Approval-Prediction-using-ML.git
cd Visa-Approval-Prediction-using-ML

2. Install dependencies:

pip install -r requirements.txt

3. Open the Jupyter Notebook:

jupyter notebook "Visa Approval Prediction using ML.ipynb"

▶️ Usage Instructions

Run each cell in the notebook sequentially to reproduce the analysis.
Modify parameters (e.g., test size, model hyperparameters) to experiment with model behavior.
Visualizations are auto-generated during execution for interactive exploration.

🔗 Connect with Me

Let’s connect on LinkedIn for project discussions or data-driven collaborations.

🙌 Feedback & Support

If you found this project helpful, please ⭐ star the repository and share your thoughts. Suggestions and contributions are always welcome!
