写点什么

架構師訓練營 week5 作業

用户头像
ilake
关注
发布于: 2020 年 10 月 25 日
  1. 用你熟悉的编程语言实现一致性 hash 算法。

ref: https://github.com/domnikl/consistent-hashing

A generic implementation of the Consistent Hashing algorithm using an AVL tree.


  • virtual_point.rb => nodes/points in the ring

  • avl_tree.rb => use avl_tree to build the ring => better performance for sorting to find nodes

  • ring.rb => main logics


module ConsistentHashing  class VirtualPoint    attr_accessor :index    attr_reader :node
def initialize(node, index) @node = node @index = index.to_i end endend
复制代码


require 'avl_tree'
module ConsistentHashing
class AVLTree < ::AVLTree
# Return the key with the smallest key value def minimum_pair return nil if @root.empty?
current_node = @root
while !current_node.left.empty? current_node = current_node.left end
[current_node.key, current_node.value] end
# Returns the key/value pair with a key that follows the provided key in sorted order def next_gte_pair(key) node = next_gte_node(@root, key) [node.key, node.value] if !node.empty? end
private
def next_gte_node(node, key) return AVLTree::Node::EMPTY if node.empty?
if key < node.key # The current key qualifies as after the provided key. However, we need # to check the tree on the left to see if there's a key in there also # greater than the provided key but less than the current key. after = next_gte_node(node.left, key) after = node if after.empty? elsif key > node.key # The current key will not be after the provided key, but something # in the right branch maybe. Check the right branch for the first key # larger than our value. after = next_gte_node(node.right, key) elsif node.key == key # An exact match qualifies as the next largest node. after = node end
after end endend
复制代码


require 'digest/md5'require 'set'require './virtual_point.rb'require './avl_tree.rb'
# Public: the hash ring containing all configured nodesmodule ConsistentHashing class Ring
# returns a new ring object def initialize(nodes = [], replicas = 3) @replicas = replicas @ring = AVLTree.new
# add_init_nodes nodes.each { |node| add(node) } end
# returns the (virtual) points in the hash ring # # Returns: a Fixnum
def length @ring.length end
# adds a new node into the hash ring def add(node) @replicas.times do |i| # generate the key of this (virtual) point in the hash key = hash_key(node, i) @ring[key] = VirtualPoint.new(node, key) end end
# removes a node from the hash ring def delete(node) @replicas.times do |i| key = hash_key(node, i) @ring.delete key end end
# gets the point for an arbitrary key def point_for(key) return if @ring.empty?
key = hash_key(key) _, value = @ring.next_gte_pair(key) _, value = @ring.minimum_pair unless value value end
# gets the node where to store the key def node_for(key) point_for(key).node end
# get all nodes in the ring def nodes nodes = points.map { |point| point.node } nodes.uniq end
# get all points in the ring def points @ring.map { |point| point[1] } end
private
# hashes the key # # returns: a String def hash_key(key, index = nil) key = "#{key}:#{index}" if index Digest::MD5.hexdigest(key.to_s)[0..16].hex end
endend
复制代码


  1. 编写测试用例测试这个算法,测试 100 万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。

require 'descriptive_statistics'require './ring.rb'require 'pry'
servers = [](0..9).each do |i| servers << "server-#{i}"end
standard_deviation = []size = []
(1..60).each do |replicas| h_count = {} servers.map { |server| h_count[server] = 0 }
ring = ConsistentHashing::Ring.new(servers, replicas) (1..1000000).each do |i| n = "n#{i}" server = ring.node_for(n) h_count[server] += 1 end
puts ring.points.size
size << ring.points.size standard_deviation << h_count.values.standard_deviationend
puts size.inspectputs standard_deviation.inspect
复制代码


servers: 10

virtual points: 10~600

data: 1,000,000

x: points

y: standard deviation


we can know the standard_deviation is stable around 400+ points.

用户头像

ilake

关注

还未添加个人签名 2019.04.15 加入

还未添加个人简介

评论

发布
暂无评论
架構師訓練營 week5 作業