🔮 Word Prediction
How?
- Determine the 🎲 probability of each word via maximum-likelihood estimation (MLE)
- Use an N-gram model to obtain the most likely follow-up word (see the sketch below)
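
A minimal sketch of both steps, assuming a bigram model; the corpus and all names here are illustrative, not from the notes:

```python
from collections import Counter, defaultdict

# Toy corpus (made up for illustration)
corpus = "the cat sat on the mat the cat ate".split()

unigram_counts = Counter(corpus)
bigram_counts = defaultdict(Counter)
for prev, curr in zip(corpus, corpus[1:]):
    bigram_counts[prev][curr] += 1

def mle_prob(prev, curr):
    # MLE: P(curr | prev) = count(prev, curr) / count(prev)
    # (using the unigram count as the context count is fine for this toy corpus)
    return bigram_counts[prev][curr] / unigram_counts[prev]

def predict_next(prev):
    # Most likely follow-up word under the bigram model
    return bigram_counts[prev].most_common(1)[0][0]

print(predict_next("the"))     # -> "cat" ("cat" follows "the" twice)
print(mle_prob("the", "cat"))  # -> 2/3
```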
Problems
- Sparse data:
    - Long word sequences hardly ever occur, even in large corpora, so their probabilities cannot be estimated reliably
    - → Markov assumption → only look at N-grams (typically N = 2 or 3); see the equation sketch after this list
- Zeroes:
    - Some input words/N-grams are not in the training set → MLE estimates P(w) = 0
    - → Smoothing: assign small non-zero probabilities to events with P(w) = 0
    - → Back-off: use lower-order N-grams when higher-order ones aren't available (see the sketch after this list)
- Underflow:
    - Multiplying many small probabilities can underflow to zero → loss of numerical precision
    - → Do all calculations in log space: sum log-probabilities instead of multiplying probabilities (see the sketch after this list)
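
For reference, the Markov assumption above in equation form: the exact chain-rule factorization is approximated by conditioning only on the previous word (bigram case, N = 2):

```latex
P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
                   \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})
```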
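A minimal sketch of add-one (Laplace) smoothing plus a crude back-off, reusing the toy bigram counts from the sketch above; the 0.4 factor follows the common "stupid backoff" heuristic and is an assumption here:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()
vocab = set(corpus)
unigram_counts = Counter(corpus)
bigram_counts = defaultdict(Counter)
for prev, curr in zip(corpus, corpus[1:]):
    bigram_counts[prev][curr] += 1

def laplace_prob(prev, curr):
    # Add-one smoothing: unseen bigrams get a small non-zero probability
    return (bigram_counts[prev][curr] + 1) / (unigram_counts[prev] + len(vocab))

def backoff_prob(prev, curr):
    # Back off to the (scaled) unigram estimate when the bigram was never seen
    if bigram_counts[prev][curr] > 0:
        return bigram_counts[prev][curr] / unigram_counts[prev]
    return 0.4 * unigram_counts[curr] / len(corpus)

print(laplace_prob("the", "sat"))  # unseen bigram, but still > 0
print(backoff_prob("the", "sat"))  # falls back to 0.4 * P("sat")
```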
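And a small demonstration of the underflow problem and the log-space fix (the probability values are made up):

```python
import math

# 80 small per-word probabilities, e.g. from a long sentence
probs = [1e-5] * 80

product = 1.0
for p in probs:
    product *= p
print(product)   # 0.0 -- the true value 1e-400 underflowed to zero

log_sum = sum(math.log(p) for p in probs)
print(log_sum)   # ~ -921.0, still a perfectly usable number
# Rank candidate sentences directly by log-probability;
# only convert back with math.exp(...) when absolutely necessary.
```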