Architect Training Camp Week 6 Summary
Distributed DBMS
Distributed databases
One master, multiple slaves
Distributes the read load
Master/master replication
Both masters cannot write the same data concurrently
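A minimal sketch of how a master/slave setup can spread load: writes go to the master and reads rotate across the slaves. The class name `ReadWriteRouter` and the "starts with SELECT" check are illustrative assumptions, not part of any specific database driver.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the master and spread reads across slaves (hypothetical helper)."""

    def __init__(self, master, slaves):
        self.master = master
        # Assumption: with no slaves configured, reads fall back to the master.
        self._slaves = itertools.cycle(slaves or [master])

    def route(self, sql):
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._slaves) if is_read else self.master


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
print(router.route("INSERT INTO t VALUES (1)"))  # goes to the master
print(router.route("SELECT * FROM t"))           # goes to one of the slaves
```

Because replication is asynchronous in most setups, a read routed to a slave may briefly miss a write that just reached the master.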
Data sharding
Hard-coded in application code
Mapping table in external storage
Middleware
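The first two routing options can be sketched in a few lines. This is only an illustration, assuming a hypothetical four-shard layout; `shard_by_hash` and `ShardMap` are made-up names rather than the API of any sharding middleware.

```python
import zlib

NUM_SHARDS = 4  # assumption: four database shards


def shard_by_hash(key, num_shards=NUM_SHARDS):
    """Hard-coded routing: a stable hash of the key, modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


class ShardMap:
    """Mapping-table routing: key -> shard entries kept outside the databases."""

    def __init__(self):
        self._table = {}

    def assign(self, key, shard):
        self._table[key] = shard

    def lookup(self, key):
        # Keys without an explicit entry fall back to hash routing.
        return self._table.get(key, shard_by_hash(key))


shard_map = ShardMap()
shard_map.assign("vip-customer", 3)
print(shard_map.lookup("vip-customer"), shard_map.lookup("user-42"))
```

The mapping table costs an extra lookup per query but lets individual keys be moved between shards, which plain modulo hashing cannot do without rehashing.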
Challenges
Needs extra code
Cannot use SQL joins across shards
Cannot use transactions across shards
Needs more servers
Middleware
Cluster scaling
Deployment
One service, one database
Master / slave
Two services, two databases
Complex
CAP theorem
Consistency
Availability
Partition tolerance
Data inconsistency
Eventual consistency
Write conflicts under eventual consistency
Resolved by timestamp: the latest write overwrites older ones
Resolved by the client
Voting (Cassandra)
Cassandra voting structure
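Two of the conflict-resolution strategies above, timestamp overwrite and voting, can be sketched as follows. This is a simplified illustration and not Cassandra's actual protocol; `Version`, `last_write_wins`, and `majority_vote` are hypothetical names, and the timestamp approach assumes replica clocks are comparable.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # assumption: clocks are comparable across replicas


def last_write_wins(versions):
    """Timestamp strategy: keep the version with the newest timestamp."""
    return max(versions, key=lambda v: v.timestamp)


def majority_vote(replies):
    """Voting strategy: return the value a strict majority agrees on, else None."""
    value, count = Counter(replies).most_common(1)[0]
    return value if count > len(replies) // 2 else None


print(last_write_wins([Version("old", 1.0), Version("new", 2.0)]).value)
print(majority_vote(["x", "x", "y"]))
```

Both functions expect at least one input; an empty list raises an error rather than guessing a result.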
ACID
Atomicity
Consistency
Isolation
Durability
BASE
Basically Available
Soft state
Allow latency
Eventually consistent
ZooKeeper
Split-brain
Different servers receive conflicting commands, leaving the cluster and its data in an inconsistent state
Paxos - distributed consensus algorithm
Cluster management and Failover
Search Engine
Crawler system
Robots exclusion protocol
robots.txt
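A crawler is expected to check robots.txt before fetching a page. The sketch below uses Python's standard `urllib.robotparser`; the rules in `ROBOTS_TXT`, the `example-crawler` user agent, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(url, user_agent="example-crawler"):
    """Return True if the rules above permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("https://example.com/index.html"))
print(allowed("https://example.com/private/report.html"))
```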
Inverted index
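An inverted index maps each term to the set of documents that contain it, so a query looks up terms instead of scanning documents. A minimal sketch, assuming whitespace tokenization and AND semantics for multi-term queries (real engines add stemming, scoring, and positional data):

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


docs = {
    1: "distributed database sharding",
    2: "distributed search engine",
    3: "search engine crawler",
}
index = build_inverted_index(docs)
print(search(index, "search engine"))
```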
Lucene structure
Lucene inverted index
Lucene
When the data set is large, rebuilding the whole index takes a long time, so Lucene introduces "segments"
The index is split into segments, and every segment is independent
Segments need to be merged regularly
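The segment idea can be illustrated with a toy in-memory index: each batch of new documents becomes its own small index, searches consult every segment, and a periodic merge folds them into one. This is a conceptual sketch only; `SegmentedIndex` is a hypothetical class and does not reflect Lucene's on-disk format or merge policy.

```python
class SegmentedIndex:
    """Toy segmented inverted index (conceptual, not Lucene's implementation)."""

    def __init__(self):
        self.segments = []  # each segment: dict of term -> set of doc ids

    def add_batch(self, docs):
        """Index a batch as a new independent segment; old segments are untouched."""
        segment = {}
        for doc_id, text in docs.items():
            for term in text.lower().split():
                segment.setdefault(term, set()).add(doc_id)
        self.segments.append(segment)

    def search(self, term):
        """Union the postings for the term across all segments."""
        hits = set()
        for segment in self.segments:
            hits |= segment.get(term.lower(), set())
        return hits

    def merge(self):
        """Fold all segments into one so searches touch fewer structures."""
        merged = {}
        for segment in self.segments:
            for term, ids in segment.items():
                merged.setdefault(term, set()).update(ids)
        self.segments = [merged]


index = SegmentedIndex()
index.add_batch({1: "lucene segment"})
index.add_batch({2: "segment merge"})
print(index.search("segment"), len(index.segments))
index.merge()
print(index.search("segment"), len(index.segments))
```

Adding a batch never rewrites existing segments, which is why indexing new data stays cheap; the cost is that search work grows with the segment count until the next merge.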
ElasticSearch
How to dispatch
Assistant robot sample
https://github.com/zhihuili/robo
Doris case study
Doris Architecture
Doris storage
Data partition
Visit structure
Write 2 copies and read 1 to ensure availability (2W, 1R)
Partition algorithm to locate nodes
Data recovery and data sync
Redo log
Update log
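The general idea behind log-based recovery can be sketched as follows: writes that a failed replica misses are appended to a log and replayed once the replica returns. The notes do not describe how Doris implements its redo and update logs, so `Replica`, `RedoLog`, and `write` here are hypothetical names for a generic illustration.

```python
class Replica:
    """A storage node that may be temporarily unavailable."""

    def __init__(self):
        self.data = {}
        self.up = True


class RedoLog:
    """Ordered record of writes a replica missed while it was down."""

    def __init__(self):
        self.entries = []  # (key, value) pairs in write order

    def append(self, key, value):
        self.entries.append((key, value))

    def replay(self, replica):
        """Apply missed writes in order, clear the log, and return how many ran."""
        for key, value in self.entries:
            replica.data[key] = value
        applied = len(self.entries)
        self.entries.clear()
        return applied


def write(key, value, replica, log):
    """Write directly when the replica is up; otherwise record the write for later."""
    if replica.up:
        replica.data[key] = value
    else:
        log.append(key, value)


replica, log = Replica(), RedoLog()
replica.up = False
write("k", "v1", replica, log)
write("k", "v2", replica, log)
replica.up = True
print(log.replay(replica), replica.data)
```

Replaying in write order matters: the replica ends with the last value written for each key, not an earlier one.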
Cluster health check
Failover
Scaling and data migration
Logical storage structure
Doris consistent hash
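Consistent hashing places nodes and keys on the same hash ring, so adding or removing a node only moves the keys next to it. The sketch below is a generic version with virtual nodes, not Doris's own code; `ConsistentHashRing`, the MD5 hash, and the default of 100 virtual nodes are illustrative assumptions.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._keys = []  # sorted ring positions
        self._ring = {}  # ring position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each node occupies several positions to even out the key distribution.
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._ring[pos] = node
            bisect.insort(self._keys, pos)

    def remove_node(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            del self._ring[pos]
            self._keys.remove(pos)

    def get_node(self, key):
        """Return the node owning the first ring position clockwise from the key."""
        if not self._keys:
            raise LookupError("ring is empty")
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[self._keys[idx]]


ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.get_node("user-42"))
```

After `ring.add_node("node-d")`, a key either stays on its previous node or moves to `node-d`; no key moves between the existing nodes, which keeps data migration small when the cluster scales.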