架構師訓練營 week5 作業
发布于: 2020 年 10 月 25 日
用你熟悉的编程语言实现一致性 hash 算法。
ref: https://github.com/domnikl/consistent-hashing
A generic implementation of the Consistent Hashing algorithm using an AVL tree.
virtual_point.rb => nodes/points in the ring
avl_tree.rb => use avl_tree to build the ring => better performance for sorting to find nodes
ring.rb => main logics
module ConsistentHashing
class VirtualPoint
attr_accessor :index
attr_reader :node
def initialize(node, index)
@node = node
@index = index.to_i
end
end
end
复制代码
require 'avl_tree'
module ConsistentHashing
class AVLTree < ::AVLTree
# Return the key with the smallest key value
def minimum_pair
return nil if @root.empty?
current_node = @root
while !current_node.left.empty?
current_node = current_node.left
end
[current_node.key, current_node.value]
end
# Returns the key/value pair with a key that follows the provided key in sorted order
def next_gte_pair(key)
node = next_gte_node(@root, key)
[node.key, node.value] if !node.empty?
end
private
def next_gte_node(node, key)
return AVLTree::Node::EMPTY if node.empty?
if key < node.key
# The current key qualifies as after the provided key. However, we need
# to check the tree on the left to see if there's a key in there also
# greater than the provided key but less than the current key.
after = next_gte_node(node.left, key)
after = node if after.empty?
elsif key > node.key
# The current key will not be after the provided key, but something
# in the right branch maybe. Check the right branch for the first key
# larger than our value.
after = next_gte_node(node.right, key)
elsif node.key == key
# An exact match qualifies as the next largest node.
after = node
end
after
end
end
end
复制代码
require 'digest/md5'
require 'set'
require './virtual_point.rb'
require './avl_tree.rb'
# Public: the hash ring containing all configured nodes
module ConsistentHashing
class Ring
# returns a new ring object
def initialize(nodes = [], replicas = 3)
@replicas = replicas
@ring = AVLTree.new
# add_init_nodes
nodes.each { |node| add(node) }
end
# returns the (virtual) points in the hash ring
#
# Returns: a Fixnum
def length
@ring.length
end
# adds a new node into the hash ring
def add(node)
@replicas.times do |i|
# generate the key of this (virtual) point in the hash
key = hash_key(node, i)
@ring[key] = VirtualPoint.new(node, key)
end
end
# removes a node from the hash ring
def delete(node)
@replicas.times do |i|
key = hash_key(node, i)
@ring.delete key
end
end
# gets the point for an arbitrary key
def point_for(key)
return if @ring.empty?
key = hash_key(key)
_, value = @ring.next_gte_pair(key)
_, value = @ring.minimum_pair unless value
value
end
# gets the node where to store the key
def node_for(key)
point_for(key).node
end
# get all nodes in the ring
def nodes
nodes = points.map { |point| point.node }
nodes.uniq
end
# get all points in the ring
def points
@ring.map { |point| point[1] }
end
private
# hashes the key
#
# returns: a String
def hash_key(key, index = nil)
key = "#{key}:#{index}" if index
Digest::MD5.hexdigest(key.to_s)[0..16].hex
end
end
end
复制代码
编写测试用例测试这个算法,测试 100 万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。
require 'descriptive_statistics'
require './ring.rb'
require 'pry'
servers = []
(0..9).each do |i|
servers << "server-#{i}"
end
standard_deviation = []
size = []
(1..60).each do |replicas|
h_count = {}
servers.map { |server| h_count[server] = 0 }
ring = ConsistentHashing::Ring.new(servers, replicas)
(1..1000000).each do |i|
n = "n#{i}"
server = ring.node_for(n)
h_count[server] += 1
end
puts ring.points.size
size << ring.points.size
standard_deviation << h_count.values.standard_deviation
end
puts size.inspect
puts standard_deviation.inspect
复制代码
servers: 10
virtual points: 10~600
data: 1,000,000
x: points
y: standard deviation
we can know the standard_deviation is stable around 400+ points.
划线
评论
复制
发布于: 2020 年 10 月 25 日阅读数: 45
ilake
关注
还未添加个人签名 2019.04.15 加入
还未添加个人简介
评论