🏷️ Text classification

= the process of assigning a text (e.g. a document or message) to one of a fixed set of predefined classes

Formal definition

  • Input:
    • a pattern x (here: the text to classify)
    • a fixed set of classes C = {c_1, ..., c_k}
  • Output:
    • a predicted class c ∈ C

Methods

Multiple classes

  • Build a binary classifier for each class (this class vs. all others)
  • Compare the scores of all classifiers and choose the class with the highest score
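The two steps above can be sketched as follows. This is only a minimal illustration, not a recommended model: the per-class scorer (a sum of smoothed word-count log-ratios), the function names `train_one_vs_rest`, `score`, and `predict`, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter

def train_one_vs_rest(examples):
    """Build one binary scorer per class from (text, label) pairs."""
    classes = sorted({label for _, label in examples})
    scorers = {}
    for cls in classes:
        pos, neg = Counter(), Counter()  # word counts: this class vs. the rest
        for text, label in examples:
            target = pos if label == cls else neg
            target.update(text.lower().split())
        scorers[cls] = (pos, neg)
    return scorers

def score(scorer, text):
    """Sum of add-one smoothed log-ratios; higher means 'more like this class'."""
    pos, neg = scorer
    return sum(math.log((pos[tok] + 1) / (neg[tok] + 1))
               for tok in text.lower().split())

def predict(scorers, text):
    """Run every per-class scorer and return the class with the highest score."""
    return max(scorers, key=lambda cls: score(scorers[cls], text))

# Hypothetical toy training data, invented for this sketch.
toy_data = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "work"),
    ("project deadline moved", "work"),
    ("dinner with family tonight", "personal"),
    ("happy birthday dear friend", "personal"),
]
scorers = train_one_vs_rest(toy_data)
print(predict(scorers, "buy cheap pills"))
```

Any binary classifier that returns a comparable score could replace the toy scorer; the comparison step in `predict` stays the same.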

Evaluation

Measures

Averaging

  • Macroaveraging: Compute performance for each class, then average
  • Microaveraging: Collect the decisions for all classes into a single contingency table, then evaluate
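The difference between the two averages shows up as soon as the classes are imbalanced. The sketch below uses precision as the measure, which is an assumption (the notes do not fix one), and the helper names and the gold/predicted labels are made up for illustration.

```python
from collections import Counter

def per_class_counts(gold, predicted):
    """Count true positives and false positives for every class."""
    tp, fp = Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted, but the true class was g
    return tp, fp

def precision(tp, fp):
    """Precision from raw counts; defined as 0.0 when nothing was predicted."""
    return tp / (tp + fp) if tp + fp else 0.0

def macro_precision(gold, predicted):
    """Compute precision per class, then average over classes."""
    tp, fp = per_class_counts(gold, predicted)
    classes = sorted(set(gold) | set(predicted))
    return sum(precision(tp[c], fp[c]) for c in classes) / len(classes)

def micro_precision(gold, predicted):
    """Pool all decisions into one table, then compute precision once."""
    tp, fp = per_class_counts(gold, predicted)
    return precision(sum(tp.values()), sum(fp.values()))

# Hypothetical labels: 8 "ham" and 2 "spam" messages.
gold = ["ham"] * 8 + ["spam"] * 2
predicted = ["ham"] * 7 + ["spam"] + ["spam", "ham"]

print(macro_precision(gold, predicted))
print(micro_precision(gold, predicted))
```

In this toy case the macroaverage is lower than the microaverage, because the small "spam" class, where the classifier does poorly, counts as much as the large "ham" class. The microaverage is dominated by the frequent class.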

📖 Example:

Spam or not?