Natural language engineering 1

⚠️ Important

  • Assessment: in-person exam (22.02.2022)
  • Textbook: Jurafsky & Martin 3rd edition

Lecture

📑 VL01 NLE Overview

📑 VL02 Basic NLE Pipeline

  1. What is ⚙ Natural Language Processing?
  2. What is 🧐 Linguistic analysis?
  3. Which 🪜 Linguistic analysis levels exist?
  4. How does the 🚪 GATE NLP pipeline work?

📑 VL03

→ public holiday

📑 VL04 RegEx & FSE

  1. What are 🔡 Regular Expressions?
  2. What’s a 📕 Formal Language and how does it relate to 📖 Formal grammar?
  3. How to classify formal grammars?
  4. What are 🎰 Automata and which classes are there?
  5. How to define 🏁 Finite state automata and which types are there?
  6. 1️⃣ Deterministic finite automata vs. 🔢 Non-deterministic finite automata
  7. How to implement RegEx using an FSA?

📑 VL05 Preprocessing

  1. What is 🧽 Preprocessing (NLP) and why is it important?
  2. 🦾 State machines in detail

📑 VL06 Word Prediction

  1. What’s 🎲 Probability?
  2. How do 🧮 Frequentist Probability and 👨‍🦱 Subjective Probability differ from each other?
  3. What’s the difference between 🧍‍♀️ Prior Probability and 👫 Conditional Probability?
  4. What is a 🪙 Trial and what does it consist of?
  5. What does the ⛓ Chain Rule (Probability) state?
  6. What does the 📜 Bayes Theorem state?
  7. How to do 🔮 Word Prediction?
  8. What’s Maximum likelihood estimation?
  9. What is a 📙 Language Model?
  10. What are some problems during word prediction?
  11. What does the 💭 Markov Assumption (Language) state?

📑 VL07 Text classification

  1. What’s a 🏷 Classifier?
  2. How does 🤦‍♂️ Naive Bayes work?
  3. 🎯 Accuracy vs. 🏹 Precision vs. 🛒 Recall
  4. What’s the ⚖ Balanced F measure?

📑 VL08 POS Tagging

  1. Which categories of POS tags are there? → 🧩 Parts of Speech
  2. Which POS-tagging methods are there?
  3. How to do ✍ Hand coded POS-Tagging?
  4. What’s the 🕶 Brill tagger algorithm?
  5. What is a ⏩ Markov model?
  6. How can the 🥷 Hidden Markov model be used for 🏷 POS-Tagging?

📑 VL09 Logistic Regression

  1. What are some 🏷 ML Classifiers?
  2. Which types of 🚶 Logistic Regression are there?
  3. How does 1️⃣ Binary logistic regression work?
  4. What’s the Cross-entropy loss function?
  5. What’s Stochastic gradient descent?
  6. How to calculate the Z-Score?
  7. What’s the Logistic Sigmoid Function?

📑 VL10 Text embeddings 1

  1. What is a 🆖 Lemma and what is ❓ Lexical semantics?
  2. What’s a 〰️ Word embedding?
  3. 🦒 Sparse vector vs. Dense vector
  4. What’s 🧮 tf-idf? (WTF)
  5. What are Pointwise Mutual Information (PMI) and PPMI?
  6. How to calculate the 📏 Vector length?
  7. How to calculate the ⚫️ Dot-product and 📐 Cosine similarity?

📑 VL11 Word2Vec

  1. How does 📠 word2vec work?

📑 VL12 Formal grammars

  1. What is 🧑 BERT?
  2. 👀 Contextual embedding vs. 🤷‍♂️ Non-Contextual embedding
  3. What’s the 👔 Chomsky normal form?
  4. What’s a 🏗 Formal generative grammar?
  5. What are some Phenomena?

📑 VL13 Syntax & semantic analysis

  1. What’s 🌳 Parsing?
  2. What’s 🎲 Probabilistic parsing?
  3. What’s the 1️⃣ Predicate Calculus?
  4. What are the units of a formal grammar?
  5. Define 🟩 Syntax
  6. What’s ➕ Syntax-driven semantic analysis?
  7. What are the time complexities of different 🎰 Automata?

📑 VL14 NLE Applications

  1. 🧩 Applications

📑 VL15 Revision

  1. 🚪 GATE NLP pipeline
  2. 🔡 Regular Expressions, 🎰 Automata
  3. 1️⃣ Type 1 error, 2️⃣ Type 2 error

ℹ️ Course topics

  • Motivation
  • Regular expressions
  • Basic statistical natural language processing
  • Part-of-speech tagging
  • Text classification
  • Lexical semantics (embeddings)
  • Context-free grammars
  • Parsing principles + Complexity
  • Applications: E, IR, QA,

📑 Extra resources