General §
What causes Deadlocks? §
How to manage Deadlocks? §
- Prevent:
- Lock records at the beginning of a transaction
- Use two-phase locking protocol
- growing phase
- shrinking phase
- Resolve:
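A minimal sketch of the "Prevent" bullets above, assuming a toy in-memory table: every lock is taken at the start of the transaction in a fixed (sorted) order, which is the growing phase, and all locks are released only at the end, which is the shrinking phase. The names `locks`, `balances`, and `transfer` are made up for illustration and do not come from any database API.

```python
import threading

# Hypothetical record locks and data, keyed by record id (illustrative only).
locks = {"a": threading.Lock(), "b": threading.Lock()}
balances = {"a": 100, "b": 50}

def transfer(src, dst, amount):
    # Growing phase: acquire every lock the transaction needs up front,
    # always in the same sorted order, so two transfers cannot wait on each other.
    needed = sorted({src, dst})
    for key in needed:
        locks[key].acquire()
    try:
        balances[src] -= amount
        balances[dst] += amount
    finally:
        # Shrinking phase: only release locks, never acquire new ones.
        for key in reversed(needed):
            locks[key].release()

# Two transactions touching the same records in opposite directions.
threads = [
    threading.Thread(target=transfer, args=("a", "b", 10)),
    threading.Thread(target=transfer, args=("b", "a", 5)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balances)
```

Without the sorted order, one thread could hold `a` and wait for `b` while the other holds `b` and waits for `a`, which is the classic deadlock.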
Alternative to Locking: Multiversioning §
- Read
- Manipulate
- Commit
- Commit only succeeds if the version that was read is still the current version
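The read / manipulate / commit cycle can be sketched with a version counter. This is a simplified illustration, assuming a single value and a hypothetical `VersionedStore` class; real systems keep versions per record.

```python
class VersionedStore:
    """Toy single-value store with a version counter (hypothetical, for illustration)."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # Return the value together with the version it was read at.
        return self.value, self.version

    def commit(self, new_value, read_version):
        # Accept the write only if nobody has committed since our read.
        if read_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True

store = VersionedStore(10)

# Two users read the same version, then both try to commit.
value_a, ver_a = store.read()
value_b, ver_b = store.read()
first = store.commit(value_a + 1, ver_a)   # accepted, version becomes 1
second = store.commit(value_b * 2, ver_b)  # rejected, read version 0 is stale
```

The rejected user would have to read again and redo the manipulation on the new version.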
Sequential Consistency §
- = Interpreting parallel manipulations as if they had been executed sequentially
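One way to read this definition: the result of parallel manipulations is acceptable if at least one sequential order of the same manipulations would produce it. The check below is a simplified sketch under that reading; `is_sequentially_consistent`, `add_ten`, and `double` are illustrative names.

```python
from itertools import permutations

def is_sequentially_consistent(initial, operations, observed):
    # True if some sequential order of the operations yields the observed result.
    for order in permutations(operations):
        state = initial
        for op in order:
            state = op(state)
        if state == observed:
            return True
    return False

def add_ten(x):
    return x + 10

def double(x):
    return x * 2

ok = is_sequentially_consistent(5, [add_ten, double], 30)           # (5 + 10) * 2
also_ok = is_sequentially_consistent(5, [add_ten, double], 20)      # 5 * 2 + 10
lost_update = is_sequentially_consistent(5, [add_ten, double], 15)  # double was lost
```

A result of 15 matches no sequential order, so it signals that one manipulation overwrote the other.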
Big Data §
- Volume
- Variety
- Velocity
- Veracity
- Value
Schema on Read Vs. Schema on Write §
- Schema on Write (traditional)
- Design Schema before writing data
- Schema on Read (big data)
- Design Schema after writing data
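The difference can be sketched in a few lines. With schema on write, a record is checked against a predefined schema before it is stored. With schema on read, raw data is stored as-is and a structure is applied only when querying. The schema, field names, and sample records below are assumptions made up for this example.

```python
import json

# Schema on write: the schema exists first, records are validated before storing.
SCHEMA = {"name": str, "age": int}  # hypothetical schema

def write_with_schema(table, record):
    for field, ftype in SCHEMA.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"field {field!r} violates the schema")
    table.append(record)

# Schema on read: raw lines are stored untouched, fields are picked at query time.
def read_with_schema(raw_lines, fields):
    for line in raw_lines:
        doc = json.loads(line)
        yield {f: doc.get(f) for f in fields}

table = []
write_with_schema(table, {"name": "Ada", "age": 36})

raw = ['{"name": "Ada", "age": 36, "city": "London"}', '{"name": "Bob"}']
rows = list(read_with_schema(raw, ["name", "city"]))
```

The second raw line would be rejected by `write_with_schema` because it has no `age`, but schema on read still stores it and simply returns `None` for the missing field.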
Not Only SQL (NOSQL) §
- Do not need to understand relations before making changes
- -> Better performance, because relations are optional
- -> Data that’s not defined in the DB can still be saved
- data storage/retrieval technologies natural for cloud environment
- typically not ACID compliant
- BASE
- basically available
- soft state
- eventually consistent
NOSQL Classifications §
- Key-value stores
- Document stores
- Wide-Column stores
- Graph
HADOOP §
- = open source implementation framework of MapReduce
- How to analyze data if it is stored across multiple computers (cloud)?
Hadoop Distributed File System (HDFS) §
- File system for data stored in cloud
- Data -> broken into Blocks -> stored in nodes -> stored in clusters
- Cluster:
- consists of a NameNode (master server) and DataNodes (slaves)
- Overall control through YARN
- No in-place updates, only appending
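The "data -> blocks -> nodes" idea can be illustrated with a toy model. This is not the HDFS API: the functions `split_into_blocks` and `place_blocks`, the round-robin placement, and the tiny block size are assumptions for illustration (real HDFS uses large blocks, 128 MB by default, and replicates each block on several DataNodes).

```python
def split_into_blocks(data: bytes, block_size: int):
    # Break the file content into fixed-size blocks; the last one may be shorter.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, datanodes):
    # NameNode-style metadata: block index -> DataNode that stores it.
    # Round-robin placement is an assumption made for this sketch.
    return {i: datanodes[i % len(datanodes)] for i in range(len(blocks))}

data = b"abcdefghij"
blocks = split_into_blocks(data, 4)
placement = place_blocks(blocks, ["node1", "node2"])

# Reading the file back means fetching the blocks in order and joining them.
restored = b"".join(blocks)
```

The NameNode only needs the small `placement` table, while the DataNodes hold the actual block contents.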
MapReduce Design Pattern §
- Requirement for HADOOP
- Enables parallel processing of data that is stored across multiple servers
- Map
- Divide tasks so that multiple nodes can work on it
- Reduce
- integrate two results into one
- repeat
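The pattern above can be sketched as a word count running in a single process. This is plain Python, not the Hadoop API: each string in `chunks` stands in for the data on one node, and the function names are illustrative.

```python
from collections import defaultdict
from functools import reduce

def map_phase(chunk):
    # Map: each node turns its own chunk into (key, value) pairs independently.
    return [(word, 1) for word in chunk.split()]

def shuffle(pairs):
    # Group all values that belong to the same key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(values):
    # Reduce: integrate two results into one, repeated until one value is left.
    return reduce(lambda a, b: a + b, values)

chunks = ["big data big", "data value"]  # one chunk per (hypothetical) node
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = {word: reduce_phase(vals) for word, vals in shuffle(mapped).items()}
print(counts)
```

Because `map_phase` needs only its own chunk, every node can run it at the same time; only the grouped results have to be brought together for the reduce step.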
Important: §
- Map-Reduce
- Schema read/write
Resources: §