Alex' Gardenアレックスの庭

Jul 26, 2025, 1 min read

🤦‍♂️ Naive Bayes

= a simple (“naive”) 👮‍♂️ Supervised machine learning method for building a 🏷 Classifier, based on Bayes' rule

Formal definition

Assumptions

Bag of words assumption: Word position doesn’t matter
Conditional independence: Feature probabilities $P (w_{i} \mid c_{j})$ are independent of each other given the class
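
To illustrate the bag of words assumption, here is a minimal Python sketch. The helper name `bag_of_words` and the lowercase, whitespace-based tokenization are assumptions made for illustration, not part of the original note.

```python
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Map a text to its word counts, discarding word order (hypothetical helper)."""
    return Counter(text.lower().split())

a = bag_of_words("the movie was great")
b = bag_of_words("great was the movie")

# Same words in a different order give the same representation.
print(a == b)
```

Under this representation a classifier only sees how often each word occurs, never where it occurs.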

Formula

$c_{NB} = \underset{c_{j} \in C}{\operatorname{argmax}} \; P (c_{j}) \prod_{i \in \text{positions}} P (w_{i} \mid c_{j})$
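
The decision rule can be sketched in Python as follows. Summing log-probabilities instead of multiplying probabilities is a common way to avoid floating-point underflow, and it leaves the argmax unchanged because the logarithm is monotonic. The function name `classify` and the probability tables are hypothetical values chosen for illustration.

```python
import math

def classify(words, priors, likelihoods):
    """Return the class c maximizing log P(c) + sum_i log P(w_i | c)."""
    scores = {}
    for c, prior in priors.items():
        scores[c] = math.log(prior) + sum(math.log(likelihoods[c][w]) for w in words)
    return max(scores, key=scores.get)

# Hypothetical, hand-picked probabilities (not estimated from real data).
priors = {"pos": 0.5, "neg": 0.5}
likelihoods = {
    "pos": {"great": 0.6, "boring": 0.1, "movie": 0.3},
    "neg": {"great": 0.1, "boring": 0.6, "movie": 0.3},
}

print(classify(["great", "movie"], priors, likelihoods))  # pos
```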

Calculating

The conditional probabilities $P (w_{i} \mid c_{j})$ used in the formula above can be calculated as follows:

$P (w_{i} \mid c_{j}) = \frac{P (w_{i}, c_{j})}{P (c_{j})}$ (definition of conditional probability)
🍔 Maximum likelihood estimation: $\hat{P} (w_{i} \mid c_{j}) = \frac{count (w_{i}, c_{j})}{\sum_{w \in V} count (w, c_{j})}$
- = fraction of times word $w_{i}$ appears among all words in documents of class (topic) $c_{j}$
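
A minimal sketch of the maximum likelihood estimate, assuming the training data is a list of `(words, label)` pairs. The function name `mle_likelihoods` and the toy documents are hypothetical.

```python
from collections import Counter, defaultdict

def mle_likelihoods(docs):
    """docs: list of (words, label) pairs. Returns {label: {word: P(word | label)}}."""
    counts = defaultdict(Counter)
    for words, label in docs:
        counts[label].update(words)  # count(w, c) for every word in the class
    result = {}
    for label, c in counts.items():
        total = sum(c.values())      # sum over w in V of count(w, c)
        result[label] = {w: n / total for w, n in c.items()}
    return result

# Hypothetical toy data.
docs = [
    (["great", "great", "movie"], "pos"),
    (["boring", "movie"], "neg"),
]
probs = mle_likelihoods(docs)
print(probs["pos"]["great"])  # 2 of the 3 "pos" tokens are "great"
```

Note that "great" has no entry under "neg" here: the unsmoothed estimate for a word never seen with a class is zero.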

Problems

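Zero probabilities: a word never seen with a class gets $\hat{P} = 0$, which zeroes the whole product → add-one (Laplace) smoothing (= add 1 to each word count, and correspondingly $|V|$ to the denominator)
Unknown words: words that appear at test time but not in the training vocabulary → ignore them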
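
Both fixes can be sketched in a few lines of Python. The function names `smoothed_likelihood` and `drop_unknown`, as well as the toy vocabulary and counts, are assumptions made for illustration.

```python
from collections import Counter

def smoothed_likelihood(word, class_counts, vocab):
    """Add-one estimate: (count(w, c) + 1) / (sum over V of count(w', c) + |V|)."""
    total = sum(class_counts[w] for w in vocab)
    return (class_counts[word] + 1) / (total + len(vocab))

def drop_unknown(words, vocab):
    """Ignore words that never occurred in the training vocabulary."""
    return [w for w in words if w in vocab]

# Hypothetical vocabulary and word counts for one class.
vocab = {"great", "boring", "movie"}
neg_counts = Counter({"boring": 1, "movie": 1})

# "great" was never seen with this class, yet its estimate is (0 + 1) / (2 + 3).
print(smoothed_likelihood("great", neg_counts, vocab))
print(drop_unknown(["great", "plot", "movie"], vocab))
```

Adding $|V|$ to the denominator keeps the smoothed estimates for one class summing to 1.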

👍 Pros

fast, low storage
robust to irrelevant features
good all-rounder (“Bayes has never failed me”)

📖 Example:
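
A minimal end-to-end sketch on a hypothetical three-document corpus, combining the decision rule, add-one smoothing, and ignoring unknown words. The corpus, the labels, and the function name `predict` are invented for illustration. Estimating the prior $P (c_{j})$ as the fraction of training documents in class $c_{j}$ is an assumption, though a common one.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set of (text, label) pairs.
train = [
    ("great fun movie", "pos"),
    ("great acting", "pos"),
    ("boring movie", "neg"),
]

doc_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
for text, label in train:
    word_counts[label].update(text.split())
vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    """Return the most probable label under add-one-smoothed Naive Bayes."""
    words = [w for w in text.split() if w in vocab]  # ignore unknown words
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / len(train))  # log prior
        for w in words:
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(predict("great plot"))   # pos
print(predict("boring plot"))  # neg
```

In both test sentences the word "plot" is unknown and is dropped, so the decision rests on the prior and a single smoothed word probability.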
