🤦♂️ Naive Bayes
= a simple (“naive”) 👮♂️ Supervised machine learning method for building a 🏷 Classifier based on Bayes' rule
Formal definition
Assumptions
- Bag of words assumption: Word position doesn’t matter
- Conditional independence: Features (words) are independent of each other given the class
Formula
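The formula itself is missing from these notes. As a sketch, this is the standard textbook form of the Naive Bayes decision rule under the two assumptions above. The symbols are notation assumed for this sketch, not taken from the notes: c is a class, C the set of classes, and w_1 … w_n the words of the document.

```latex
\begin{align}
P(c \mid w_1, \dots, w_n) &\propto P(c) \prod_{i=1}^{n} P(w_i \mid c) \\
c_{NB} &= \operatorname*{argmax}_{c \in C} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)
\end{align}
```

The product form is where conditional independence is used; dropping word order from the product is the bag of words assumption.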
Calculating
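Using the formula above, the probabilities it needs can be estimated from training data:
- 🍔 Maximum likelihood estimation: P(word | topic) = count(word, topic) / count(all words in topic)
- = fraction of times the word appears among all words of the topic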
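A minimal sketch of that estimate in Python, assuming whitespace-tokenised documents. The training data and the helper name `mle_word_probs` are made up for illustration.

```python
from collections import Counter

# Toy training data (made up for illustration): (topic, document) pairs.
train = [
    ("sports", "ball goal team ball"),
    ("politics", "vote election team"),
]

def mle_word_probs(docs):
    """Return {topic: {word: P(word | topic)}} using raw relative frequencies."""
    counts = {}
    for topic, text in docs:
        counts.setdefault(topic, Counter()).update(text.split())
    probs = {}
    for topic, counter in counts.items():
        total = sum(counter.values())  # all word tokens seen for this topic
        probs[topic] = {word: n / total for word, n in counter.items()}
    return probs

word_probs = mle_word_probs(train)
print(word_probs["sports"]["ball"])           # 2 of the 4 sports words are "ball"
print(word_probs["sports"].get("vote", 0.0))  # never seen in sports, so the estimate is 0
```

The second print shows why the raw estimate is fragile: any word that never occurred with a topic gets probability 0 for that topic.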
Problems
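- Zero probabilities: a word never seen with a topic gets probability 0, which zeroes the whole product → smoothing (= add 1 to each word count)
- Unknown words: words that never occur in the training data → ignore them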
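Both fixes can be sketched together in a tiny classifier. This is an illustration under assumptions, not a reference implementation: the function names `train_nb` and `classify` and the toy data are hypothetical, and summing logs instead of multiplying probabilities is a choice of this sketch to avoid numeric underflow.

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (topic, text). Returns priors, per-topic word counts, vocabulary."""
    doc_counts = Counter()
    word_counts = {}
    for topic, text in docs:
        doc_counts[topic] += 1
        word_counts.setdefault(topic, Counter()).update(text.split())
    vocab = {w for counter in word_counts.values() for w in counter}
    priors = {t: n / len(docs) for t, n in doc_counts.items()}
    return priors, word_counts, vocab

def classify(text, priors, word_counts, vocab):
    """Return the topic with the highest log-posterior score."""
    scores = {}
    for topic, prior in priors.items():
        counter = word_counts[topic]
        total = sum(counter.values())
        score = math.log(prior)
        for word in text.split():
            if word not in vocab:
                continue  # unknown word: ignore it
            # Add-one smoothing: (count + 1) / (total + vocabulary size)
            score += math.log((counter[word] + 1) / (total + len(vocab)))
        scores[topic] = score
    return max(scores, key=scores.get)

# Toy training data (made up for illustration).
train = [
    ("sports", "ball goal team ball"),
    ("politics", "vote election team"),
]
model = train_nb(train)
print(classify("ball vote ball zebra", *model))
```

Here "zebra" is skipped because it is not in the training vocabulary, and "vote" no longer rules out the sports topic because its smoothed count is 1 instead of 0.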
👍 Pros
- fast, low storage
- robust to irrelevant features
- good all-rounder (“Bayes has never failed me”)