General §
What causes Deadlocks? §
How to manage Deadlocks? §
- Prevent:
  - Lock records at the beginning of a transaction
  - Use two-phase locking protocol
    - growing phase (locks are only acquired)
    - shrinking phase (locks are only released)
- Resolve:
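The prevention bullets above can be sketched in a few lines. This is a minimal illustration, not a database implementation: the `locks` and `balances` dictionaries and the `transfer` function are hypothetical names. Acquiring the locks in sorted order is an added assumption that keeps two transactions from waiting on each other in a cycle.

```python
import threading

# Hypothetical record locks and data, keyed by record id.
locks = {"A": threading.Lock(), "B": threading.Lock()}
balances = {"A": 100, "B": 100}

def transfer(src, dst, amount):
    # Growing phase: acquire every needed lock up front, always in the
    # same (sorted) order, before touching any data.
    needed = sorted({src, dst})
    for key in needed:
        locks[key].acquire()
    try:
        balances[src] -= amount
        balances[dst] += amount
    finally:
        # Shrinking phase: release the locks; none are acquired afterwards.
        for key in reversed(needed):
            locks[key].release()

# Two transactions that touch the same records in opposite directions.
threads = [
    threading.Thread(target=transfer, args=("A", "B", 10)),
    threading.Thread(target=transfer, args=("B", "A", 25)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the fixed lock order, the first thread could hold A while waiting for B and the second could hold B while waiting for A, which is the classic deadlock.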
Alternative to Locking: MultiVersioning §
- Read
- Manipulate
- Commit
- Commit only succeeds if the version that was read is still the current version (nobody else committed in between)
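A minimal sketch of the read, manipulate, commit cycle above, assuming a single in-memory value with a version counter. `VersionedStore` and its methods are hypothetical names chosen for illustration.

```python
class VersionedStore:
    """Toy store: one value plus a version number that grows on each commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # Return the value together with the version it belongs to.
        return self.value, self.version

    def commit(self, new_value, read_version):
        # Reject the commit if someone else committed since our read.
        if read_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True

store = VersionedStore(10)
v1, ver1 = store.read()              # transaction 1 reads version 0
v2, ver2 = store.read()              # transaction 2 reads version 0
first = store.commit(v1 + 1, ver1)   # accepted, store is now at version 1
second = store.commit(v2 + 5, ver2)  # rejected, version 0 is stale
```

The rejected transaction would normally read again and retry, so no locks are held while it works.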
Sequential Consistency §
- = Interpreting parallel manipulations as if they had been executed sequentially
Big Data §
- Volume
- Variety
- Velocity
- Veracity
- Value
Schema on Read Vs. Schema on Write §
- Schema on Write (traditional)
  - Design Schema before writing data
- Schema on Read (big data)
  - Design Schema after writing data
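The difference between the two approaches can be sketched as follows. This is only an illustration: `SCHEMA`, `write_with_schema` and `read_with_schema` are hypothetical names, and JSON lines stand in for raw stored data.

```python
import json

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"id": int, "name": str}

def write_with_schema(table, row):
    # Schema on Write: validate before storing, reject anything that does not fit.
    if set(row) != set(SCHEMA):
        raise ValueError("row does not match schema fields")
    for field, expected_type in SCHEMA.items():
        if not isinstance(row[field], expected_type):
            raise ValueError(f"wrong type for {field}")
    table.append(row)

def read_with_schema(raw_lines):
    # Schema on Read: raw data was stored as-is; structure is applied only now.
    for line in raw_lines:
        record = json.loads(line)
        yield {"id": int(record["id"]), "name": str(record.get("name", ""))}

table = []
write_with_schema(table, {"id": 1, "name": "Ada"})
try:
    write_with_schema(table, {"id": "2", "name": "Bob", "extra": True})
    rejected = False
except ValueError:
    rejected = True

# The same kind of "messy" records are accepted when stored raw.
raw = ['{"id": "2", "name": "Bob", "extra": true}', '{"id": 3}']
parsed = list(read_with_schema(raw))
```

With Schema on Write the bad row never reaches the table; with Schema on Read it is stored untouched and the reader decides how to interpret it.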
Not Only SQL (NOSQL) §
- Do not need to understand relations before making changes
  - -> Better performance, because relations are optional
  - -> Data that does not fit a predefined structure in the DB can still be saved
- Data storage/retrieval technologies that fit cloud environments naturally
- Typically not ACID compliant
- BASE instead:
  - basically available
  - soft state
  - eventually consistent
NOSQL Classifications §
- Key-value stores
- Document stores
- Wide-Column stores
- Graph
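To make the four classes above more concrete, here is one hypothetical user record shaped the way each class would roughly hold it. Plain Python dictionaries stand in for the stores; the keys, field names and the `neighbours` helper are invented for illustration and do not correspond to any specific product.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"user:1": '{"name": "Ada", "city": "London"}'}

# Document store: the value is a structured document whose fields can be queried.
documents = {"user:1": {"name": "Ada", "city": "London", "tags": ["math"]}}

# Wide-column store: a row key maps to column families,
# and each row may carry a different set of columns.
wide_column = {
    "user:1": {"profile": {"name": "Ada"}, "location": {"city": "London"}},
    "user:2": {"profile": {"name": "Charles"}},
}

# Graph: nodes plus labelled edges between them.
nodes = {"user:1": {"name": "Ada"}, "user:2": {"name": "Charles"}}
edges = [("user:1", "KNOWS", "user:2")]

def neighbours(node_id, label):
    # Follow all outgoing edges with the given label.
    return [dst for src, lbl, dst in edges if src == node_id and lbl == label]

known_by_ada = neighbours("user:1", "KNOWS")
```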
HADOOP §
- = open-source framework that implements MapReduce
- How to analyze data if it is stored across multiple computers (cloud)?
Hadoop Distributed File System (HDFS) §
- File system for data stored in cloud
- Data -> broken into blocks -> blocks stored on nodes -> nodes grouped into clusters
- Cluster:
  - consists of NameNode (master server) and DataNodes (slaves)
  - Overall control through YARN
- No in-place updates, only appending
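The block and node layout above can be mimicked with a toy model. This is a sketch under strong simplifications and not how HDFS is actually implemented: `TinyCluster` is a hypothetical class, blocks are placed round-robin, every append starts a new block, and the block size is a few bytes so the result is easy to inspect.

```python
class TinyCluster:
    """Toy model: a NameNode index plus DataNodes that hold the block contents."""

    def __init__(self, data_nodes, block_size):
        self.block_size = block_size
        # DataNode name -> {block_id: bytes}
        self.data_nodes = {name: {} for name in data_nodes}
        # NameNode metadata: filename -> ordered list of (block_id, DataNode name)
        self.name_node = {}
        self._next_block = 0

    def append(self, filename, data):
        # Only appending is offered; there is no method to overwrite a block.
        entries = self.name_node.setdefault(filename, [])
        names = list(self.data_nodes)
        for i in range(0, len(data), self.block_size):
            block_id = self._next_block
            self._next_block += 1
            node = names[block_id % len(names)]  # assumed round-robin placement
            self.data_nodes[node][block_id] = data[i:i + self.block_size]
            entries.append((block_id, node))

    def read(self, filename):
        # Ask the NameNode where the blocks are, then fetch them in order.
        return b"".join(
            self.data_nodes[node][block_id]
            for block_id, node in self.name_node[filename]
        )

cluster = TinyCluster(["dn1", "dn2"], block_size=4)
cluster.append("log.txt", b"abcdefghij")
cluster.append("log.txt", b"KL")
content = cluster.read("log.txt")
```

The NameNode holds only metadata (which block lives where), while the DataNodes hold the bytes.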
MapReduce Design Pattern §
- Programming model that HADOOP is built on
- Enables parallel processing of data stored across multiple servers
- Map
  - Divide a task so that multiple nodes can work on it
- Reduce
  - Integrate two results into one
  - Repeat until a single result remains
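The pattern above is often illustrated with a word count. The following single-process sketch only imitates the idea: `map_phase`, `shuffle` and `reduce_phase` are hypothetical helper names, each string in `chunks` stands for data held by a different node, and no Hadoop API is involved.

```python
from collections import defaultdict
from functools import reduce

def map_phase(chunk):
    # Map: each node turns its own chunk into (key, value) pairs independently.
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    # Group all values that belong to the same key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(values):
    # Reduce: integrate two results into one, repeated until one value is left.
    return reduce(lambda a, b: a + b, values)

chunks = ["big data big", "data value"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = {word: reduce_phase(values) for word, values in shuffle(mapped).items()}
```

Because each chunk is mapped without looking at the others, the map calls could run on different machines, and only the small intermediate pairs need to be moved.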
Important: §
- Map-Reduce
- Schema on Read vs. Schema on Write
Resources: §