🤦‍♂️ Naive Bayes

= a simple (“naive”) 👮‍♂️ Supervised machine learning method for building a 🏷 Classifier, based on Bayes’ rule

Formal definition

Assumptions

  • Bag of words assumption: Word position doesn’t matter
  • Conditional independence: Feature probabilities are independent of each other given the class

Formula
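
A common textbook way to write the classifier, given here as a sketch. The symbols are assumptions, not notation from these notes: c is a class (topic), d a document, and w_1 … w_n the words in d. The denominator P(d) is dropped because it is the same for every class, and the last step uses the two assumptions listed above.

```latex
\hat{c}
  = \arg\max_{c} P(c \mid d)
  = \arg\max_{c} \frac{P(d \mid c)\, P(c)}{P(d)}
  = \arg\max_{c} P(c) \prod_{i=1}^{n} P(w_i \mid c)
```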

Calculating

Using the formula above, the probabilities it needs can be estimated from training data:

  1. 🍔 Maximum likelihood estimation:
    • P(word | topic) = fraction of times the word appears among all words of the topic
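
The counting estimate above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `mle_word_probs`, the whitespace tokenization, and the toy corpus are all made up for this example.

```python
from collections import Counter


def mle_word_probs(docs_by_topic):
    """Estimate P(word | topic) as count(word, topic) / total words in topic."""
    probs = {}
    for topic, docs in docs_by_topic.items():
        # Pool every word of every document that belongs to this topic.
        counts = Counter(word for doc in docs for word in doc.split())
        total = sum(counts.values())
        probs[topic] = {word: n / total for word, n in counts.items()}
    return probs


# Hypothetical toy corpus, invented for illustration only.
toy_corpus = {
    "sports": ["ball game ball", "team game"],
    "food": ["burger fries", "burger burger salad"],
}

probs = mle_word_probs(toy_corpus)
print(probs["sports"]["ball"])       # 2 of the 5 "sports" words are "ball"
print("burger" in probs["sports"])   # never seen with "sports", so no estimate
```

A word that never occurs with a topic has no entry at all here, which corresponds to a probability of zero.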

Problems

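  • Zero-probabilities: a word never seen with a topic gets probability 0, which zeroes the whole product → smoothing (= add 1 to each word count)
  • Unknown words: words that never appeared in the training data → ignore them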
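
Both fixes can be sketched together in Python. The assumptions here go beyond the notes: the helper names (`train`, `smoothed_prob`, `classify`) and the toy corpus are hypothetical, the class prior is estimated as the fraction of training documents per topic, and scores are summed as logarithms, a common choice to avoid numeric underflow.

```python
import math
from collections import Counter


def train(docs_by_topic):
    """Collect per-topic word counts, the vocabulary, and class priors."""
    counts = {
        topic: Counter(word for doc in docs for word in doc.split())
        for topic, docs in docs_by_topic.items()
    }
    vocab = {word for topic_counts in counts.values() for word in topic_counts}
    n_docs = sum(len(docs) for docs in docs_by_topic.values())
    # Assumption: prior = fraction of training documents with this topic.
    priors = {topic: len(docs) / n_docs for topic, docs in docs_by_topic.items()}
    return counts, vocab, priors


def smoothed_prob(word, topic, counts, vocab):
    """Add-one smoothing: (count + 1) / (words in topic + vocabulary size)."""
    total = sum(counts[topic].values())
    return (counts[topic][word] + 1) / (total + len(vocab))


def classify(text, counts, vocab, priors):
    """Return the topic with the highest log score for the given text."""
    scores = {}
    for topic in counts:
        score = math.log(priors[topic])
        for word in text.split():
            if word not in vocab:
                continue  # unknown word: ignore it
            score += math.log(smoothed_prob(word, topic, counts, vocab))
        scores[topic] = score
    return max(scores, key=scores.get)


# Hypothetical toy corpus, invented for illustration only.
toy_corpus = {
    "sports": ["ball game ball", "team game"],
    "food": ["burger fries", "burger burger salad"],
}
counts, vocab, priors = train(toy_corpus)
print(classify("burger game pizza", counts, vocab, priors))
```

In this toy run "pizza" is skipped because it is not in the vocabulary, and "burger" still gets a small non-zero probability under "sports" (1/11 instead of 0), so one unseen word no longer wipes out a topic's score.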

👍 Pros

  • fast, low storage
  • robust to irrelevant features
  • good all-rounder (“Bayes has never failed me”)

📖 Example: