K-Nearest Neighbours Clustering

K-Nearest Neighbours (KNN) is a supervised ML model that uses labelled data points for classification and regression. The model is simple and quick, so no indexes, as used in my other repos, are needed.

By: Oscar Sharaz Spencer


  • Given a data point, calculate the distance from that point to every other point in the dataset.
  • Return the $K$ nearest points (smallest distances).
  • $K$ is a hyper-parameter pre-defined by the user.
  • For regression, take the average of the values of the $K$ nearest neighbours.
  • For classification, take the majority-vote class of the $K$ nearest neighbours.

To calculate the distance from one data point to another, we can use the Euclidean distance equation:

$$d(\mathbf{p}, \mathbf{q}) = \sqrt{\sum_{i=1}^n (q_i - p_i)^2}$$
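The Euclidean distance formula can be sketched in a few lines of Python. This is a minimal illustration and not necessarily how main.py computes it; the function name `euclidean_distance` and the tuple representation of points are assumptions made for the example.

```python
import math


def euclidean_distance(p, q):
    """Return the Euclidean distance between two equal-length points."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared per-dimension differences, then take the square root.
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


# A 3-4-5 right triangle: the distance from the origin to (3, 4) is 5.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```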

From here, we can sort the Euclidean distances from nearest to furthest and take the $K$ nearest neighbours with their associated classes.
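The sort-and-select step might look like the sketch below. The helper name `k_nearest`, the toy dataset, and the use of `math.dist` (available from Python 3.8) are assumptions for illustration, not code taken from main.py.

```python
import math


def k_nearest(query, points, labels, k):
    """Return the k (distance, label) pairs closest to query, nearest first."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Pair each point's distance to the query with that point's label.
    distances = [(math.dist(query, p), label) for p, label in zip(points, labels)]
    # Sort by distance only; the sort is stable, so ties keep dataset order.
    distances.sort(key=lambda pair: pair[0])
    return distances[:k]


# Hypothetical toy dataset: two points near the origin, two far away.
points = [(1.0, 1.0), (2.0, 2.0), (8.0, 8.0), (9.0, 9.0)]
labels = [0, 0, 1, 1]
neighbours = k_nearest((0.0, 0.0), points, labels, k=3)
print([label for _, label in neighbours])  # [0, 0, 1]
```

Sorting every distance costs O(n log n) per query, which is fine for small datasets like the one assumed here.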

From there, we take the average of the $K$ nearest neighbours' values for regression, or the majority-vote class (using a simple iterative counter) for classification. Classification also works with label-encoded classes, and this implementation uses label encoding.
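As a rough sketch of these two prediction rules, the functions below take the values or labels of the $K$ nearest neighbours as input. The names `knn_regress` and `knn_classify` are hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption; the repository may handle ties differently.

```python
def knn_regress(neighbour_values):
    """Average the target values of the K nearest neighbours."""
    if not neighbour_values:
        raise ValueError("at least one neighbour is required")
    return sum(neighbour_values) / len(neighbour_values)


def knn_classify(neighbour_labels):
    """Return the majority-vote class using a simple iterative counter."""
    if not neighbour_labels:
        raise ValueError("at least one neighbour is required")
    counts = {}
    for label in neighbour_labels:
        counts[label] = counts.get(label, 0) + 1
    # max() keeps the first label with the highest count, so ties go to
    # the label that appeared first among the neighbours.
    return max(counts, key=counts.get)


print(knn_regress([2.0, 4.0, 6.0]))  # 4.0
print(knn_classify([1, 0, 1]))       # 1
```

Choosing an odd $K$ avoids ties in the binary case, so the tie-breaking rule only matters for even $K$ or more than two classes.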

main.py

The main.py file is the main entry point for running the K-Nearest Neighbours predictions. It initializes the dataset, label-encodes features, runs the Euclidean distance calculation on the given points, and outputs the predicted class of the point (0 or 1).
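Label encoding, as mentioned above, replaces each distinct category with an integer code. The sketch below shows one common way to do this; the function name `label_encode` and the sample values are hypothetical, and main.py may encode its data differently.

```python
def label_encode(values):
    """Map each distinct value to an integer code, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            # The next unused code equals the number of values seen so far.
            mapping[value] = len(mapping)
    return [mapping[value] for value in values], mapping


# Hypothetical categorical column with two distinct values.
encoded, mapping = label_encode(["no", "yes", "no", "yes"])
print(encoded)  # [0, 1, 0, 1]
print(mapping)  # {'no': 0, 'yes': 1}
```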

Installation and Execution

To run the predictions, follow these steps:

  1. Ensure you have Python 3 installed on your system.
  2. Clone this repository and navigate to the project directory.
  3. Run the following commands to install any necessary dependencies and execute the script:
# Clone the repository
git clone https://github.com/oskccy/knn-from-scratch.git
cd knn-from-scratch

# Input any point using the `classifier` function

# Run the predictions
python3 main.py
