Architect Training Camp: Week 6 Summary

ilake
Published: November 1, 2020

Distributed DBMS

Distributed databases

  • One master, multiple slaves

  • Load is distributed across the replicas

  • Master-master replication

  • With master-master replication, the two masters must not be written to concurrently
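The master/slave setup above is usually paired with read/write splitting. Below is a minimal sketch of that idea, assuming writes go to the master and reads are spread round-robin over the slaves. `ReplicatedRouter` and the server names are hypothetical, and the "starts with SELECT" check is a deliberate simplification.

```python
import itertools


class ReplicatedRouter:
    """Route writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin iterator over the read replicas.
        self._slaves = itertools.cycle(slaves)

    def route(self, sql):
        # Simplification: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._slaves) if is_read else self.master


router = ReplicatedRouter("master-db", ["slave-1", "slave-2"])
```

Because slaves replicate asynchronously, a read routed to a slave may briefly return stale data.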



Data sharding

  • Hard-coded in application code

  • Mapping table kept in external storage

  • Middleware
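The first two approaches can be sketched as follows. This is only an illustration: the shard names, `shard_by_hash`, `shard_by_table`, and the in-memory `MAPPING_TABLE` (standing in for a table held in external storage) are all assumptions.

```python
import zlib

SHARDS = ["db0", "db1", "db2", "db3"]  # hypothetical shard names


def shard_by_hash(user_id: str) -> str:
    """Hard-coded routing: hash the key and take it modulo the shard count."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    return SHARDS[zlib.crc32(user_id.encode("utf-8")) % len(SHARDS)]


# Stand-in for a mapping table stored outside the application.
MAPPING_TABLE = {"tenant-a": "db0", "tenant-b": "db3"}


def shard_by_table(tenant: str) -> str:
    """Mapping-table routing: look the shard up instead of computing it."""
    return MAPPING_TABLE[tenant]
```

Modulo routing needs no extra storage, but changing the shard count remaps most keys. A mapping table allows moving one tenant at a time, at the cost of an extra lookup.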



Challenges

  • Requires extra code

  • SQL joins across shards are not possible

  • Database transactions across shards are not possible

  • Requires more servers



Middleware



Cluster scaling



Deployment

  • One service, one database

  • Master/slave replication

  • Two services, two databases

  • Complex deployment

CAP theorem

  • Consistency

  • Availability

  • Partition tolerance



Data inconsistency

Eventual consistency

Resolving write conflicts under eventual consistency

  • By timestamp: the most recent write overwrites the others

  • By the client: the client decides which version to keep

  • By voting (Cassandra)
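The timestamp strategy is often called "last write wins". A minimal sketch, assuming each version carries the writer's timestamp (`Version` and `last_write_wins` are hypothetical names):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # writer's clock, in seconds


def last_write_wins(a: Version, b: Version) -> Version:
    """Keep the version with the newer timestamp; ties keep `a`."""
    return b if b.timestamp > a.timestamp else a
```

This relies on reasonably synchronized clocks. With skewed clocks a newer write can be silently discarded, which is one reason to let the client or a vote decide instead.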



Cassandra voting structure
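As a rough illustration of voting, the sketch below returns a value only when a majority of replicas report it. It shows the general majority-vote idea under that assumption and is not Cassandra's actual read path; `quorum_read` is a hypothetical helper.

```python
from collections import Counter


def quorum_read(replica_values, quorum=None):
    """Return the value reported by at least `quorum` replicas, else None."""
    if not replica_values:
        return None
    if quorum is None:
        # Default to a strict majority of the replicas that answered.
        quorum = len(replica_values) // 2 + 1
    value, votes = Counter(replica_values).most_common(1)[0]
    return value if votes >= quorum else None
```

For example, with three replicas answering `["v2", "v2", "v1"]`, the majority value `"v2"` is returned and the stale replica can be repaired afterwards.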





ACID

  • Atomicity

  • Consistency

  • Isolation

  • Durability



BASE

  • Basically Available

  • Soft state: replicas may be temporarily out of sync, so synchronization latency is allowed

  • Eventually consistent



ZooKeeper

Split-brain

  • Different servers receive conflicting commands, leaving the cluster and its data in an inconsistent state



Paxos - distributed consensus algorithm
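A minimal single-decree Paxos sketch, assuming synchronous in-process calls and no failures or message loss. It only illustrates the prepare/accept phases; `Acceptor` and `propose` are hypothetical names, and proposal numbers are assumed to be positive integers.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0       # highest proposal number promised so far
        self.accepted_n = 0     # number of the accepted proposal (0 = none)
        self.accepted_v = None  # value of the accepted proposal

    def prepare(self, n):
        # Phase 1b: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_v
        return False, 0, None

    def accept(self, n, v):
        # Phase 2b: accept unless a higher-numbered promise was made.
        if n >= self.promised:
            self.promised = n
            self.accepted_n, self.accepted_v = n, v
            return True
        return False


def propose(acceptors, n, value):
    """Run one proposal round; return the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    # Phase 1a: ask every acceptor for a promise.
    replies = [a.prepare(n) for a in acceptors]
    promises = [(an, av) for ok, an, av in replies if ok]
    if len(promises) < majority:
        return None
    # If any acceptor already accepted a value, adopt the one with the
    # highest proposal number instead of our own.
    prior_n, prior_v = max(promises, key=lambda p: p[0])
    if prior_n > 0:
        value = prior_v
    # Phase 2a: ask the acceptors to accept the (possibly adopted) value.
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None
```

Once a value has been chosen, a later proposer with a higher number adopts it instead of its own, so every successful round agrees on the same value.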







Cluster management and Failover

Search Engine

Crawler system

Robots exclusion protocol

  • robots.txt
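A crawler is expected to check robots.txt before fetching a page. The sketch below uses Python's standard `urllib.robotparser`; the rules, the `demo-crawler` agent name, and the `allowed` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Example rules; a real crawler would download /robots.txt from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url: str, agent: str = "demo-crawler") -> bool:
    """Return True if the rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)
```

Here `/private/` pages are rejected and everything else is allowed. The protocol is advisory, so honoring it is up to the crawler.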

Inverted index
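An inverted index maps each term to the set of documents containing it, so a query becomes a set intersection. A minimal sketch, assuming whitespace tokenization and lowercase matching (`build_inverted_index` and `search` are hypothetical names):

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)
```

Real engines also store positions and term frequencies in the postings so results can be ranked, which this sketch leaves out.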





Lucene structure



Lucene inverted index



Lucene

  • When the data set is large, rebuilding the whole index takes a long time, so Lucene introduces the "segment"

  • The index is split into segments, and every segment is an independent index

  • Segments need to be merged regularly
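The segment idea can be sketched as below: new documents go into a fresh small index, a search consults every segment, and a merge folds them into one. This is a conceptual illustration with hypothetical function names, not Lucene's API or file format.

```python
def make_segment(docs):
    """Build an independent inverted index for one batch of documents."""
    segment = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            segment.setdefault(term, set()).add(doc_id)
    return segment


def search_segments(segments, term):
    """Query every segment and union the hits."""
    hits = set()
    for segment in segments:
        hits |= segment.get(term.lower(), set())
    return hits


def merge_segments(segments):
    """Combine several segments into a single larger one."""
    merged = {}
    for segment in segments:
        for term, doc_ids in segment.items():
            merged.setdefault(term, set()).update(doc_ids)
    return merged
```

Adding a batch only costs building one small segment. Merging keeps the number of segments, and therefore the per-query work, from growing without bound.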

ElasticSearch



How to dispatch



Assistant robot sample

https://github.com/zhihuili/robo





Doris case study



Product Goals
* Functional goals
  * Data structure
    * KV engine
    * Logical storage structure: Namespace
  * Data access
    * KV API
    * KV client, abstract API, routing framework
    * High-performance communication
* Non-functional goals
  * Massive data volumes
    * Transparent cluster management, storage replacement
  * Scalability
    * Linear and smooth expansion
    * Partitioning, better routing algorithm
  * Availability
    * Automatic fault tolerance and failover
    * Transparent cluster management, config management
  * Performance
    * High concurrency, low latency
  * Feature extensibility
    * Easy to add new features
  * Low maintenance cost
    * Easy to manage
    * Easy to monitor
  * Eventual consistency
* Key technical points
  * Failover
  * Scaling and data migration
  * Logical storage structure
    * Namespaces to separate business logic

Doris Architecture

Doris storage 

Data partition

Visit structure

  • Each write goes to two nodes to ensure availability, and a read needs only one (W = 2, R = 1)

  • A partitioning algorithm locates the nodes for a key

  • Data recovery and data synchronization

  • Redo log

  • Update log
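The write-two/read-one idea with log-based recovery can be sketched as follows. This is an assumption-laden illustration, not Doris's implementation: `Node`, `PairStore`, and the in-memory `redo_log` are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class PairStore:
    """Write each key to two nodes; read from the first healthy one."""

    def __init__(self, first, second):
        self.nodes = [first, second]
        self.redo_log = []  # (node_name, key, value) writes a down node missed

    def put(self, key, value):
        written = 0
        for node in self.nodes:
            if node.up:
                node.data[key] = value
                written += 1
            else:
                # Remember the write so it can be replayed on recovery.
                self.redo_log.append((node.name, key, value))
        return written > 0

    def get(self, key):
        for node in self.nodes:
            if node.up and key in node.data:
                return node.data[key]
        return None

    def recover(self, node):
        """Bring a node back and replay, in order, the writes it missed."""
        node.up = True
        remaining = []
        for name, key, value in self.redo_log:
            if name == node.name:
                node.data[key] = value
            else:
                remaining.append((name, key, value))
        self.redo_log = remaining
```

With one node down, reads and writes still succeed on the other, and the recovered node catches up from the log before serving traffic again.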



Cluster health check

Failover



Scaling and data migration



Logical storage structure



Doris consistent hash 

https://github.com/itisaid/Doris/tree/master/common/doris.common/doris.algorithm/src/main/java/com/alibaba/doris/algorithm/vpm
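For comparison, a generic consistent hash ring with virtual nodes can be sketched as below. This is the textbook technique, not the algorithm in the Doris repository linked above; `HashRing`, `_hash`, and the choice of 100 virtual nodes per server are assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable integer hash of a string."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Place `vnodes` points per physical node on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # The first point clockwise from the key's hash owns the key;
        # the modulo wraps around past the last point.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[i][1]
```

When a node is added, the only keys that change owner are the ones the new node takes over. That keeps data migration during scaling much smaller than with plain modulo hashing.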


