katistix/spam-filtering

spam-filtering - implementing a Naive Bayes classifier in Go

Heavily inspired by Tsoding's "Email Spam Filter in Go" video.

Formula:

$$ P(C|D) = \frac{P(D|C) \cdot P(C)}{P(D)} $$

where:

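  • $P(C|D)$ - posterior probability that document $D$ belongs to class $C$ (here, the SPAM class)
  • $P(D|C)$ - probability of observing document $D$ given that it belongs to class $C$
  • $P(D)$ - probability of observing document $D$ under the current Bag-of-Words (word order is ignored)
  • $P(C)$ - prior probability that a document belongs to class $C$ (in this case the SPAM class)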

Get the dataset

Go to https://www2.aueb.gr/users/ion/data/enron-spam/, download and extract the archives (enron1, enron2, etc.), and put the extracted folders in data/.

Progress:

  • generate Bag-of-Words for a directory
  • compute $P(D)$ for a given document (at a filePath)
  • compute $P(C)$
  • compute $P(D|C)$
