Welcome to Practical Machine Learning, a hands-on course focused on developing and applying machine learning models to solve real business problems. Throughout this course, I explored a wide range of statistical and machine learning methodologies while learning to implement modern data science workflows with open-source tools.
This course introduced the core concepts and practices of machine learning with business applications, including:
- Machine learning workflows and evaluation frameworks
- Linear and non-linear regression models
- Classification methods and diagnostics
- Regularization techniques & feature selection
- Principal components & dimensionality reduction
- Tree-based ensemble models (Random Forest, Gradient Boosting)
- Unsupervised learning and clustering techniques
- Deep learning architectures:
  - Neural networks
  - Convolutional neural networks (CNNs)
By the end of this course, I was able to:
- Understand and explain the machine learning model development lifecycle.
- Prepare, explore, and preprocess data for supervised and unsupervised learning.
- Train, tune, and evaluate a wide range of machine learning models.
- Apply model diagnostics and evaluate performance using standard metrics.
- Implement feature engineering and dimensionality reduction techniques.
- Build deep learning models for vision and sequence applications.
- Communicate machine learning insights and recommendations effectively.
I used industry-standard, open-source tools such as:
- Python for modeling and analysis
- Jupyter Notebook for reproducible ML workflows
- scikit-learn, TensorFlow, Keras
- Pandas, NumPy, Matplotlib, Seaborn for data exploration
The data used in every assignment is the Home Equity dataset, which is used for credit risk analysis and loan default prediction. It contains information on 5,960 home equity loan applications, and its primary purpose is to predict whether an applicant will default on their loan. The dataset has 13 variables, including BAD, the target variable for predicting loan default, and an amount variable used to determine how much was lost. A data dictionary lists the type of each variable along with a short description of its values.
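As a minimal sketch of a first sanity check on this dataset, the snippet below counts rows and columns and computes the share of defaults. It assumes the data has been loaded into a pandas DataFrame and that BAD is coded 1 for a default and 0 otherwise. The helper name `summarize_target`, the `LOAN` column name, and the toy values are illustrative assumptions, not part of the course materials.

```python
import pandas as pd


def summarize_target(df: pd.DataFrame, target: str = "BAD") -> dict:
    """Return the row count, column count, and share of rows where target == 1.

    Assumes the target is coded 1 for default and 0 otherwise.
    """
    if target not in df.columns:
        raise KeyError(f"expected a '{target}' column")
    return {
        "n_rows": len(df),
        "n_columns": df.shape[1],
        "default_rate": float((df[target] == 1).mean()),
    }


# Tiny made-up frame standing in for the real file; the LOAN column name
# and all values here are assumptions for illustration only.
toy = pd.DataFrame({"BAD": [0, 1, 0, 0], "LOAN": [1100, 1300, 1500, 1500]})
summary = summarize_target(toy)
print(summary)
```

Run against the real file (loaded with `pd.read_csv` from wherever the dataset is stored), the same helper should report 5,960 rows and 13 columns if the file matches the description above.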
Each learning objective has its own folder with a more detailed README.md file that fully describes the learning objectives, tools used, and data necessary for each assignment.
If you have any questions or inquiries, feel free to reach out to Joshua Pasaye.