Why is it hard to build big storage?
We want performance -> so we shard data across many servers -> with many servers there are faults (some storage will fail) -> so we want fault tolerance -> so we use replication -> now replicas can become inconsistent -> keeping them consistent takes a lot of network communication -> low performance
So there is a trade-off between consistency and performance
bad replication design
We want to keep the 2 tables (one on each server) identical
Nothing makes the 2 servers handle requests in the same order, so clients that read the same key might end up with different values
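A minimal sketch of the problem above, assuming each server is just an in-memory table that applies writes in the order they arrive. The function name `apply_writes` and the key `x` are made up for illustration.

```python
# Each server keeps its own copy of the table and applies writes
# in whatever order the network delivers them.
def apply_writes(writes):
    table = {}
    for key, value in writes:
        table[key] = value  # last write to arrive wins on this server
    return table

# Two clients write the same key at about the same time.
# The two messages reach the two servers in different orders.
server1 = apply_writes([("x", 1), ("x", 2)])
server2 = apply_writes([("x", 2), ("x", 1)])

# A read of "x" now depends on which server answers.
print(server1["x"], server2["x"])
```

Both servers saw the same two writes, yet they disagree, because nothing forced a single order.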
GFS
Big sequential access (not random)
Single data center, internal use
chunk handle = identifier for a chunk; the master uses it to tell where the data is
A primary is only allowed to be primary for a certain lease time; it must stop at lease expiration
nv = non-volatile (needs to be written to disk)
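The lease rule above can be sketched as follows. This is an illustration only: the class `Lease` and the function `can_grant_new_primary` are hypothetical names, time is passed in as a plain number so the example is deterministic, and the 60 second duration is the initial timeout given in the GFS paper, not something stated in these notes.

```python
LEASE_SECONDS = 60  # assumption: initial lease timeout from the GFS paper

class Lease:
    def __init__(self, holder, granted_at, duration=LEASE_SECONDS):
        self.holder = holder
        self.expires_at = granted_at + duration

    def valid(self, now):
        # The primary may only act as primary while this is True.
        return now < self.expires_at

def can_grant_new_primary(current_lease, now):
    # The master waits for the old lease to run out before choosing
    # another primary, so two primaries never overlap in time.
    return current_lease is None or not current_lease.valid(now)

lease = Lease("cs1", granted_at=100)
print(lease.valid(120), can_grant_new_primary(lease, 120))
print(lease.valid(160), can_grant_new_primary(lease, 160))
```

Even if the master cannot reach the old primary, it can safely pick a new one once the lease time has passed.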
read
1. Client sends the file name and offset to the master
2. Master sends back the chunk handle and the list of chunkservers that hold the chunk the client wants
Client caches this answer, so it doesn't need to ask the master again for the same chunk
3. Client receives the data from one of the chunkservers
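The read steps above can be sketched like this. It is a toy in-memory model, not GFS code: `Master`, `Client`, the handles `h1`/`h2`, the server names and the file name are all made up, and the 64 MB chunk size is taken from the GFS paper.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # assumption: 64 MB chunks, as in the GFS paper

class Master:
    def __init__(self):
        self.files = {}      # file name -> list of chunk handles (by chunk index)
        self.locations = {}  # chunk handle -> list of chunkserver names

    def lookup(self, name, offset):
        handle = self.files[name][offset // CHUNK_SIZE]
        return handle, self.locations[handle]

class Client:
    def __init__(self, master, chunkservers):
        self.master = master
        self.chunkservers = chunkservers  # server name -> {handle: bytes}
        self.cache = {}                   # (name, chunk index) -> (handle, servers)
        self.master_calls = 0

    def read(self, name, offset, length):
        key = (name, offset // CHUNK_SIZE)
        if key not in self.cache:         # steps 1 and 2: ask the master once
            self.master_calls += 1
            self.cache[key] = self.master.lookup(name, offset)
        handle, servers = self.cache[key]
        data = self.chunkservers[servers[0]][handle]  # step 3: any replica will do
        start = offset % CHUNK_SIZE
        return data[start:start + length]

master = Master()
master.files["/logs/a"] = ["h1", "h2"]
master.locations = {"h1": ["cs1", "cs2"], "h2": ["cs2", "cs3"]}
chunkservers = {
    "cs1": {"h1": b"hello world"},
    "cs2": {"h1": b"hello world", "h2": b"second chunk"},
    "cs3": {"h2": b"second chunk"},
}
client = Client(master, chunkservers)
first = client.read("/logs/a", 6, 5)
again = client.read("/logs/a", 0, 5)  # same chunk, answered from the cache
print(first, again, client.master_calls)
```

The second read never touches the master, which is the point of the client-side cache.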
write
If there is no primary, the master needs to find the chunkservers that hold the latest copy of the chunk (data)
But a server that has the latest data might be down, and the master might then treat the second-latest data as up to date, which is bad
version number = even if the master itself crashes, it can use this number to work out which chunkservers can be primary and secondary, because version numbers are non-volatile (written to disk)
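A rough sketch of how the version number guards against picking stale data, assuming the master remembers one version per chunk and each chunkserver reports the version it holds. The function names are hypothetical. The version bump when a new primary is chosen follows the GFS paper rather than these notes.

```python
def up_to_date_replicas(master_version, reported):
    """reported maps chunkserver name -> chunk version it holds on disk."""
    return [s for s, v in sorted(reported.items()) if v == master_version]

def choose_primary(master_version, reported):
    fresh = up_to_date_replicas(master_version, reported)
    if not fresh:
        # No server with the latest version is reachable: wait,
        # rather than accept the second-latest data as current.
        return None
    primary, secondaries = fresh[0], fresh[1:]
    # Assumption from the GFS paper: the version is increased when a
    # new lease is granted, and written to disk before it is used.
    return primary, secondaries, master_version + 1

print(choose_primary(5, {"cs1": 4, "cs2": 5, "cs3": 5}))
print(choose_primary(5, {"cs1": 4}))
```

In the second call only a stale replica is up, so no primary is chosen.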
Primary picks the offset -> all replicas write the data at that offset -> if all reply "yes, we did", the primary replies "success" to the client, else "failed (no)" -> on failure the client needs to reissue (send again) the append request
Example: B failed on one replica, but the primary still chooses the offset (where to append) for every later record
So C is appended at a weird place on replica 3 (after a gap where B should be), and the same goes for D
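The append example above can be replayed with a small model. Everything here is an assumption made for illustration: offsets are counted in records instead of bytes, replica index 2 plays the role of "replica 3", the failure is injected by hand, and the retried B is sent after C.

```python
class Replica:
    def __init__(self):
        self.data = {}  # offset -> record; a missing offset is a gap

    def write(self, offset, record, fail=False):
        if fail:
            return False  # this replica never stored the record
        self.data[offset] = record
        return True

class Primary:
    def __init__(self, replicas):
        self.replicas = replicas
        self.next_offset = 0

    def append(self, record, failing=()):
        offset = self.next_offset
        self.next_offset += 1  # the offset is used up even if the append fails
        acks = [r.write(offset, record, fail=(i in failing))
                for i, r in enumerate(self.replicas)]
        return all(acks)       # "success" only if every replica said yes

def layout(replica, length):
    return [replica.data.get(i) for i in range(length)]

replicas = [Replica(), Replica(), Replica()]
primary = Primary(replicas)

ok_a = primary.append("A")
ok_b = primary.append("B", failing={2})  # B is lost on the third replica
ok_c = primary.append("C")               # C still goes to the next offset
ok_retry = primary.append("B")           # client reissues the failed append

for r in replicas:
    print(layout(r, 4))
```

The first two replicas end up with B twice, and the third has a hole where the first B should be, so replicas are not byte-for-byte identical even though every acknowledged record is present on all of them.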
GFS ++
https://www.youtube.com/watch?v=eRgFNW4QFDc&ab_channel=DefogTech