
Lecture 3: GFS (Google File System)

Tony Lim 2022. 1. 31. 14:10

Why is it hard to build Big Storage?

want performance -> do sharding -> faults (some storage servers might fail) -> we want fault tolerance -> so we use replication -> now there can be inconsistency -> we need lots of network talking to achieve consistency -> low performance

so a trade-off happens between consistency and performance

 

bad replication design

we want to keep these 2 tables (one per server) identical

we haven't set up anything that makes the 2 servers handle requests in the same order, so a client reading the same key from each server might end up with different values
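A tiny toy (my own Go sketch, not from the lecture) of what goes wrong: both replicas see the same two writes to key "x", but nothing forces them to apply the writes in the same order, so a read can return different values depending on which replica answers.

```go
package main

import "fmt"

func main() {
	replica1 := map[string]int{}
	replica2 := map[string]int{}

	// Two concurrent client writes to the same key; no agreed-upon order.
	writes := []struct {
		key string
		val int
	}{{"x", 1}, {"x", 2}}

	// Replica 1 happens to apply write(x=1) then write(x=2).
	for _, w := range writes {
		replica1[w.key] = w.val
	}
	// Replica 2 happens to apply them in the opposite order.
	for i := len(writes) - 1; i >= 0; i-- {
		replica2[writes[i].key] = writes[i].val
	}

	fmt.Println("read x from replica1:", replica1["x"]) // 2
	fmt.Println("read x from replica2:", replica2["x"]) // 1 -> inconsistent
}
```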

 

GFS

big sequential access (not random)

Single data center, internal use

 

chunk handle = identifier that tells where to find the data

the primary is only allowed to act as primary for a certain lease time (until lease expiration)

nv = non-volatile (needs to be written to disk)
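Roughly, the master state described here could look like the Go structs below (my own sketch, not GFS source): the file table and the per-chunk version number are the nv parts, while the chunkserver list, the current primary, and the lease expiration are volatile and can be rebuilt or re-granted after a restart.

```go
package main

import (
	"fmt"
	"time"
)

type ChunkHandle uint64

// ChunkInfo is the per-chunk metadata kept by the master.
type ChunkInfo struct {
	Version     uint64    // nv: written to disk; identifies up-to-date replicas
	Servers     []string  // volatile: chunkservers currently holding the chunk
	Primary     string    // volatile: current primary, empty if none
	LeaseExpire time.Time // volatile: primary stops being primary after this
}

// MasterState is the master's view of the file system.
type MasterState struct {
	Files  map[string][]ChunkHandle   // nv: filename -> ordered chunk handles
	Chunks map[ChunkHandle]*ChunkInfo // chunk handle -> metadata
}

func main() {
	m := MasterState{
		Files:  map[string][]ChunkHandle{"/logs/web.log": {1, 2}},
		Chunks: map[ChunkHandle]*ChunkInfo{1: {Version: 3}, 2: {Version: 7}},
	}
	fmt.Println("files:", len(m.Files), "chunks:", len(m.Chunks))
}
```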

 

read

1. the client sends a file name and offset to the master

2. the master sends back the chunk handle and the list of chunkservers that hold the chunk the client wants
the client caches this answer, so it doesn't need to ask the master again for the same chunk

3. the client receives the data from one of those chunkservers (rough sketch below)
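The client-side read path as a hedged Go sketch; lookupChunk and readChunk are made-up stand-ins for the real RPCs, and the 64 MB chunk size is the one from the paper.

```go
package main

import (
	"errors"
	"fmt"
)

const chunkSize = 64 << 20 // 64 MB chunks, as in the GFS paper

type ChunkLocation struct {
	Handle  uint64
	Servers []string // chunkservers that hold this chunk
}

// Client wires the two RPCs as plain functions so the sketch stays self-contained.
type Client struct {
	lookupChunk func(filename string, chunkIndex int64) (ChunkLocation, error)
	readChunk   func(server string, handle uint64, off int64, n int) ([]byte, error)
	cache       map[string]ChunkLocation
}

func (c *Client) Read(filename string, offset int64, n int) ([]byte, error) {
	chunkIndex := offset / chunkSize
	key := fmt.Sprintf("%s#%d", filename, chunkIndex)

	// Steps 1+2: ask the master for (chunk handle, server list) only on a cache
	// miss; the cached answer lets later reads of this chunk skip the master.
	loc, ok := c.cache[key]
	if !ok {
		var err error
		loc, err = c.lookupChunk(filename, chunkIndex)
		if err != nil {
			return nil, err
		}
		c.cache[key] = loc
	}
	if len(loc.Servers) == 0 {
		return nil, errors.New("no chunkserver holds this chunk")
	}

	// Step 3: fetch the bytes from any one replica (here: the first one).
	return c.readChunk(loc.Servers[0], loc.Handle, offset%chunkSize, n)
}

func main() {
	c := &Client{
		cache: map[string]ChunkLocation{},
		lookupChunk: func(f string, i int64) (ChunkLocation, error) {
			return ChunkLocation{Handle: 42, Servers: []string{"cs1", "cs2", "cs3"}}, nil
		},
		readChunk: func(server string, h uint64, off int64, n int) ([]byte, error) {
			return []byte("hello")[:n], nil
		},
	}
	data, _ := c.Read("/logs/web.log", 0, 5)
	fmt.Println(string(data))
}
```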

 

write

if there is no primary, the master needs to find the chunkservers holding the latest version of the chunk (data)

but the server that has the latest data might be down, and the master might treat the second-latest data as up to date, which is bad

version number = even if the master itself crashes, it can still tell which chunkservers hold up-to-date replicas (and so which are safe to serve as primary and secondaries), because the version number is non-volatile (written to disk)
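A tiny toy (simplified, my own) of how the non-volatile version number helps: only a chunkserver whose reported version matches the master's recorded version is safe to promote; a stale replica is skipped even if it is alive and answers first.

```go
package main

import "fmt"

// pickPrimary returns a chunkserver whose reported chunk version matches the
// master's recorded (non-volatile) version, i.e. a replica that is up to date.
func pickPrimary(masterVersion uint64, reported map[string]uint64) (string, bool) {
	for server, v := range reported {
		if v == masterVersion {
			return server, true
		}
	}
	// No up-to-date replica responded: better to wait than to promote stale data.
	return "", false
}

func main() {
	reported := map[string]uint64{
		"cs1": 17, // up to date
		"cs2": 16, // stale: was down when the version was last bumped
	}
	fmt.Println(pickPrimary(17, reported)) // cs1 true
}
```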

the primary picks an offset -> all replicas write the data at that offset -> if every replica answers "yes we did", the primary replies "success" to the client, otherwise "failed (no)" -> the client then needs to reissue (resend) the append request
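A rough Go sketch of that decision at the primary (my own simplification; real GFS also handles padding across chunk boundaries): it picks the offset, asks every replica to write the record there, and reports success only if all of them did.

```go
package main

import "fmt"

// recordAppend: the primary chooses the offset, every replica writes the record
// at that exact offset, and the client hears "success" only if all of them did.
func recordAppend(replicas []string,
	writeAt func(server string, off int64, rec []byte) bool,
	nextOffset *int64, rec []byte) (int64, bool) {

	off := *nextOffset
	*nextOffset += int64(len(rec)) // the offset advances even if a replica fails

	allOK := true
	for _, r := range replicas {
		if !writeAt(r, off, rec) {
			allOK = false // one "no" is enough to make the client reissue
		}
	}
	return off, allOK
}

func main() {
	next := int64(0)
	// Pretend replica3 drops record "B" (it crashed or the message was lost).
	write := func(server string, off int64, rec []byte) bool {
		return !(server == "replica3" && string(rec) == "B")
	}
	replicas := []string{"replica1", "replica2", "replica3"}
	for _, rec := range []string{"A", "B", "C"} {
		off, ok := recordAppend(replicas, write, &next, []byte(rec))
		fmt.Printf("append %q at offset %d, success=%v\n", rec, off, ok)
	}
	// "B" fails but the offset still advances, so when the client re-sends "B"
	// it lands at a later offset -- which is how the holes described below arise.
}
```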

 

B failed on one replica, and since the primary chooses the offset (where to append),

C gets appended at a weird place on replica 3 (leaving a gap where B should have been), and the same happens for D.

 

 

 

GFS ++

https://www.youtube.com/watch?v=eRgFNW4QFDc&ab_channel=DefogTech 

 

 

 
