General

  • Course survey

Exam (todo)

  • Read up on the references
  • Read the reference book

Exam

  • 30.03.21: probably an in-person exam
  • Reference book EXTREMELY important for overall understanding
  • 50/50 bookwork and open-ended questions
    • Bookwork: specific questions about the lectures
    • Open-ended: "Imagine that…", "Find a solution for…"

Old exam questions

  1. Basics
    • List 3 examples of lexical ambiguity
    • Write down a sentence that matches the RegEx
    • Why use tf in tf-idf?
    • Assign POS-Tags to this sentence
      • Nouns, verbs, adjectives
    • What problems might a tokenizer have when processing a "."?
    • Why did sparse word embeddings become so popular?
      • Sparse = most weights are 0
  2. More applied knowledge
    • Develop a grammar that
      • Accepts x
      • Rejects y
      • Extend it, so that it also accepts other sentences like z
        • -> add more high-level rules
      • How would a chart parser parse x
  3. Applications (demonstrating knowledge)
    • Build a word processor, explain how it would predict
      • -> unigram/bigram models etc. (see the sketch after this list)
    • Motivation for smoothing?
    • Implications of Zipf's law? Larger corpus -/-> better predictions
      • The distribution's properties stay the same: a long tail of rare words remains
    • How to evaluate the word processor
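
A minimal sketch of how such a word predictor could rank continuations with bigram counts; the toy corpus and the helper name predict are invented for illustration:

```python
from collections import Counter, defaultdict

# Toy corpus; a real predictor would train on a large text collection.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count bigrams: how often does `nxt` follow `prev`?
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev_word, k=3):
    """Return the k most frequent continuations of prev_word."""
    return [w for w, _ in bigrams[prev_word].most_common(k)]

print(predict("the"))  # ['cat', 'mat', 'fish']
```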

Module overview

  • NLP as a pipeline
  • RegEx, Automata
    • Automata
      • Finite-state transducers
    • RegEx
      • Good quick-and-dirty solution (see the sketch below)
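
As a quick-and-dirty illustration, a regex that pulls DD.MM.YY date strings (like the exam date above) out of free text; the pattern is an invented example, not from the lecture:

```python
import re

text = "Exam on 30.03.21, retake probably later."

# Quick and dirty: matches DD.MM.YY shapes, including impossible dates.
date_pattern = re.compile(r"\b\d{2}\.\d{2}\.\d{2}\b")
print(date_pattern.findall(text))  # ['30.03.21']
```
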
  • Errors (see the precision/recall sketch below)
    • Type 1: false positive
      • Countered by increasing accuracy/precision
    • Type 2: false negative
      • Countered by increasing coverage/recall
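
A small sketch of how the two error types map to the metrics (counts are invented):

```python
def precision(tp, fp):
    # Fewer type-1 errors (false positives) -> higher precision.
    return tp / (tp + fp)

def recall(tp, fn):
    # Fewer type-2 errors (false negatives) -> higher recall/coverage.
    return tp / (tp + fn)

print(precision(tp=80, fp=20))  # 0.8
print(recall(tp=80, fn=40))     # ~0.67
```
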
  • Text normalization
    • First step in all NLP approaches (minimal sketch below)
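
A minimal normalization sketch, assuming lowercasing plus punctuation/whitespace tokenization is enough; real pipelines are language- and task-specific:

```python
import re

def normalize(text):
    # Lowercase, split punctuation off words, then tokenize on whitespace.
    text = text.lower()
    text = re.sub(r"([.,!?])", r" \1 ", text)
    return text.split()

print(normalize("The cat sat on the mat."))
# ['the', 'cat', 'sat', 'on', 'the', 'mat', '.']
```
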
  • Language modeling
    • Sequence of tokens/tags used to model a language
    • Predictions are often a choice among alternatives
      • -> predict the most likely one with a statistical approach
      • Markov assumption (formula below)
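
The Markov assumption in formula form: instead of conditioning each token on the full history, condition only on the previous token (bigram case):

```latex
P(w_1,\dots,w_n) = \prod_{i=1}^{n} P(w_i \mid w_1,\dots,w_{i-1})
                 \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})
```
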
  • POS tagging is a basic NLP application
    • = assigning tags to words
    • Combine frequency and contextual information (scoring sketch below)
      • Frequency: How common is Token t
      • Context: How likely is this (Bi)gram
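
A sketch of combining the two signals: score each candidate tag by how often the word carries it (frequency) times how likely it follows the previous tag (context). All probabilities are invented toy numbers, not estimates from a real corpus:

```python
# P(tag | word): frequency information (toy values).
word_tag = {"book": {"NOUN": 0.7, "VERB": 0.3}}

# P(tag | previous tag): contextual bigram information (toy values).
tag_bigram = {("DET", "NOUN"): 0.6, ("DET", "VERB"): 0.1}

def best_tag(word, prev_tag):
    # Pick the tag with the highest combined score.
    scores = {t: p * tag_bigram.get((prev_tag, t), 0.01)
              for t, p in word_tag[word].items()}
    return max(scores, key=scores.get)

print(best_tag("book", "DET"))  # NOUN: 0.7*0.6 beats 0.3*0.1
```
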
  • Text classification
    • Assign category to text
    • Naive Bayes
      • Not so naive (hand-rolled sketch below)
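
A hand-rolled naive Bayes sketch with add-one smoothing and log probabilities; the two training documents are invented:

```python
import math
from collections import Counter

# Toy training data: (label, tokens).
docs = [("pos", "great fun great".split()),
        ("neg", "boring plot".split())]

labels = [label for label, _ in docs]
vocab = {w for _, toks in docs for w in toks}
word_counts = {label: Counter() for label in set(labels)}
for label, toks in docs:
    word_counts[label].update(toks)

def log_posterior(label, tokens):
    # log P(label) + sum of log P(token | label), add-one smoothed.
    prior = math.log(labels.count(label) / len(labels))
    total = sum(word_counts[label].values())
    return prior + sum(
        math.log((word_counts[label][t] + 1) / (total + len(vocab)))
        for t in tokens)

query = "great plot".split()
print(max(set(labels), key=lambda l: log_posterior(l, query)))  # pos
```
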
  • Vector semantics
    • Vectors used to represent words and their relationships/meaning
    • Nearby vectors = similar words
    • Knowledge-Driven approach
        • Hand-crafted: you define the representations yourself
    • Data-Driven approach
      • Embeddings, get info from corpus
        • Tf-idf
          • = sparse vectors
          • count nearby words
        • Word2Vec
          • = dense vectors
          • learn by training
        • Compute the cosine between vectors to measure similarity (worked sketch below)
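
A sketch tying the pieces together: sparse tf-idf vectors for a toy three-document corpus, then cosine similarity between them (all data invented):

```python
import math
from collections import Counter

docs = ["the cat sat", "the dog sat", "stocks fell sharply"]
tokenized = [d.split() for d in docs]
vocab = sorted({w for toks in tokenized for w in toks})

def tf_idf_vector(tokens):
    tf = Counter(tokens)
    vec = []
    for w in vocab:
        df = sum(w in toks for toks in tokenized)  # document frequency
        vec.append(tf[w] * math.log(len(docs) / df))
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

v = [tf_idf_vector(toks) for toks in tokenized]
print(cosine(v[0], v[1]))  # > 0: the cat/dog sentences share terms
print(cosine(v[0], v[2]))  # 0.0: no shared terms
```
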
    • Evaluating
      • Extrinsic
        • Does the downstream application's result get better than …?
      • Intrinsic
        • Does the component itself score better than … on a direct benchmark?
    • BERT
      • Dense, contextual
      • Trained on a prediction task similar to word2vec's (masked tokens)
  • Formal Grammars
    • Capturing structure of natural languages
    • Parsing maps input to a structure according to the grammar
      • Top Down
      • Bottom Up
      • Chart parsing (NLTK sketch after this block)
      • Probabilistic Parsing
    • Semantics
      • Required: a representation, the meaning of each word/phrase, a logical form for the result
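
A sketch of the "develop a grammar that accepts x / rejects y" exercise, assuming NLTK is installed; the grammar and sentence are toy examples:

```python
import nltk

# Toy grammar: accepts "the dog sat", rejects e.g. "dog the sat".
# Extending coverage means adding more high-level rules (e.g. VP -> V NP).
grammar = nltk.CFG.fromstring("""
S  -> NP VP
NP -> Det N
VP -> V
Det -> 'the'
N  -> 'dog'
V  -> 'sat'
""")

# Chart parsing: reuses partial analyses stored in a chart.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog sat".split()):
    print(tree)  # (S (NP (Det the) (N dog)) (VP (V sat)))
```
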
  • Application areas (lecture 12 notes)
    • Question answering system
    • Chatbots
      • IR-based vs. generation-based
    • Rule-based vs. ML-based vs. hybrid
    • Industry vs. academia