Welcome back! In this level, we are going to build a model to solve the problem we chose in Level 1. You may have noticed that we categorized the example focus questions as 'classification' or 'clustering' rather than by the specific algorithm they use, e.g. Naive Bayes or K-means. This is because we want you to choose the algorithm that you think is best suited for your problem!
This is where the magic begins! In this level, you choose the algorithm that best fits the focus question you chose.
Depending on which focus question you chose and what category it falls into (classification, clustering, or other), the algorithm you choose will differ. Even with the same focus question and category, you might find yourself choosing a different algorithm than someone else, since there are so many different approaches! This is the part where you get to be creative and choose what you think is best for your problem.
Similar to Level_1, you should keep track of your answers to the following questions:
- What is your plan for creating this model? (you don't have to go into too much detail, just a general idea of what you are going to do)
- Describe the algorithm that you chose.
- Detailed explanations on why you chose your algorithm and why you think it is best suited for your problem.
- Mention any resources you used to help you in this portion of the challenge (we love links!).
- Any extra information you'd like to include
The code you use to load and clean your chosen dataset should be located in the Submissions folder. You may title this notebook as you wish.
- Using Python
- Very nice code comments (up to the grader's discretion)
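As a rough illustration of what a commented load-and-clean step might look like, here is a minimal sketch using pandas. The column names (`age`, `income`, `label`) and the inline CSV are hypothetical stand-ins, not part of the challenge; your own dataset will need its own cleaning steps.

```python
import io

import pandas as pd

# Hypothetical stand-in for a dataset file. In your notebook you would
# pass the path of your chosen dataset to pd.read_csv instead.
raw_csv = io.StringIO(
    "age,income,label\n"
    "34,52000,yes\n"
    "34,52000,yes\n"
    "41,,no\n"
    "29,48000,no\n"
)

# Load the data into a DataFrame.
df = pd.read_csv(raw_csv)

# Drop exact duplicate rows.
df = df.drop_duplicates()

# Fill missing numeric values with the column median
# (one of several reasonable options; dropping the rows is another).
df["income"] = df["income"].fillna(df["income"].median())

# Encode the text label as 0/1 for convenience.
df["label"] = df["label"].map({"no": 0, "yes": 1})

print(df)
```

Which cleaning steps make sense depends on the dataset, so treat each step above as something to justify in your write-up rather than a fixed recipe.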
How do I start? There are a lot of algorithms out there, and it can be overwhelming to choose one! Here are some resources that might help you: Naive Bayes, K-Nearest Neighbors (KNN), and Random Forest are some popular algorithms to use for classification problems. This is a great tutorial on how to create a Random Forest model in Python using the Titanic dataset. The documentation for scikit-learn also has some great examples of different algorithms.
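If you are unsure which of the classification algorithms mentioned above to pick, one possible starting point is to try each on your data and compare cross-validated accuracy. The sketch below uses scikit-learn with synthetic data from `make_classification` as a placeholder (an assumption; swap in your own cleaned features `X` and labels `y`). The hyperparameters shown are arbitrary defaults, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic placeholder data (assumption); replace with your cleaned dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# The three classifiers named in the text, with illustrative settings.
candidates = {
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score each model with 5-fold cross-validation and record the mean accuracy.
scores = {}
for name, model in candidates.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```

A higher score here is only one input to your decision. How easy the model is to explain, and how well it matches your focus question, are just as worth discussing in your answers.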
I'm still confused! No worries! If a cursory Google search doesn't help, feel free to reach out to us and we'll be happy to help you out. This is the hardest part of the coding challenge and we want to make sure you have all the resources you need!