Natural language engineering 1
⚠️ Important
- Assessment: in-person exam (22.02.2022)
- Textbook: Jurafsky & Martin 3rd edition
Lecture
📑 VL01 NLE Overview
- What is 🗣 Natural Language Engineering?
- What’s the state of the art in NLE?
📑 VL02 Basic NLE Pipeline
- What is ⚙ Natural Language Processing?
- What is 🧐 Linguistic analysis?
- Which 🪜 Linguistic analysis levels exist?
- How does the 🚪 GATE NLP pipeline work?
📑 VL03
→ public holiday
📑 VL04 RegEx & FSA
- What are 🔡 Regular Expressions?
- What’s a 📕 Formal Language and how does it relate to a 📖 Formal grammar?
- How to classify formal grammars?
- What are 🎰 Automata and which classes are there?
- How to define 🏁 Finite state automata and which types are there?
- 1️⃣ Deterministic finite automata Vs. 🔢 Non-deterministic finite automata
- How to implement RegEx using an FSA?
📑 VL05 Preprocessing
- What is 🧽 Preprocessing (NLP) and why is it important?
- 🦾 State machines in detail
📑 VL06 Word Prediction
- What’s 🎲 Probability?
- How do 🧮 Frequentist Probability and 👨‍🦱 Subjective Probability differ from each other?
- What’s the difference between 🧍‍♀️ Prior Probability and 👫 Conditional Probability?
- What is a 🪙 Trial and what does it consist of?
- What does the ⛓ Chain Rule (Probability) state?
- What does the 📜 Bayes Theorem state?
- How to do 🔮 Word Prediction?
- What’s 🍔 Maximum likelihood estimation?
- What is a 📙 Language Model?
- What are some problems during word prediction?
- What does the 💭 Markov Assumption (Language) state?
📑 VL07 Text classification
- What’s a 🏷 Classifier?
- How does 🤦‍♂️ Naive Bayes work?
- 🎯 Accuracy Vs. 🏹 Precision Vs. 🛒 Recall
- What’s the ⚖ Balanced F measure?
📑 VL08 POS Tagging
- Which categories of POS tags are there? → 🧩 Parts of Speech
- Which POS-tagging methods are there?
- How to do ✍ Hand coded POS-Tagging?
- What’s the 🕶 Brill tagger algorithm?
- What is a ⏩ Markov model?
- How can the 🥷 Hidden Markov model be used for 🏷 POS-Tagging?
📑 VL09 Logistic Regression
- What are some 🏷 ML Classifiers?
- Which types of 🚶 Logistic Regression are there?
- How does 1️⃣ Binary logistic regression work?
- What’s the Cross-entropy loss function?
- What’s Stochastic gradient descent?
- How to calculate the Z-Score?
- What’s the Logistic Sigmoid Function?
📑 VL10 Text embeddings 1
- What is a 🆖 Lemma and ❓ Lexical semantics?
- What’s a 〰️ Word embedding?
- 🦒 Sparse vector Vs. 🐀 Dense vector
- What’s 🧮 tf-idf? (WTF)
- What are Pointwise Mutual Information (PMI) and Positive PMI (PPMI)?
- How to calculate the 📏 Vector length?
- How to calculate the ⚫️ Dot-product and 📐 Cosine similarity?
📑 VL11 Word2Vec
- How does 📠 word2vec work?
📑 VL12 Formal grammars
- What is 🧑 BERT?
- 👀 Contextual embedding Vs. 🤷‍♂️ Non-Contextual embedding
- What’s the 👔 Chomsky normal form?
- What’s a 🏗 Formal generative grammar?
- What are some [[Natural language#Phenomena|🗣 Natural language#Phenomena]]?
📑 VL13 Syntax & semantic analysis
- What’s 🌳 Parsing?
- What’s 🎲 Probabilistic parsing?
- What’s the 1️⃣ Predicate Calculus?
- What are the units of a formal grammar?
- Define 🟩 Syntax
- What’s ➕ Syntax-driven semantic analysis?
- What are the time complexities of different 🎰 Automata?
📑 VL14 NLE Applications
- [[Natural Language Engineering#🧩 Applications|🗣 Natural Language Engineering#🧩 Applications]]
📑 VL15 Revision
ℹ️ Course topics
- Motivation
- Regular expressions
- Basic statistical natural language processing
- Part-of-speech tagging
- Text classification
- Lexical semantics (embeddings)
- Context-free grammars
- Parsing principles + Complexity
- Applications: IE, IR, QA