🔪 Tokenization

= the process of splitting text into a list of tokens (e.g. words, punctuation symbols)

How? In the simplest case, by splitting on whitespace and punctuation (see the sketch after the example)

📖 Example:

  • “I like cookies” → “I” “like” “cookies”
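A minimal sketch in Python, assuming a simple regex-based split; the `tokenize` name and pattern are illustrative, and real tokenizers also handle contractions, Unicode, subwords, etc.:

```python
import re

def tokenize(text: str) -> list[str]:
    # Illustrative rule: a token is either a run of word characters
    # or a single non-space, non-word character (punctuation).
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("I like cookies"))  # ['I', 'like', 'cookies']
```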