架构师训练营第五周作业 一致性哈希
发布于: 2020 年 07 月 08 日
用你熟悉的编程语言实现一致性 hash 算法。
编写测试用例测试这个算法,测试 100 万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。
先写了一个工具类,用于计算标准差:
import java.util.Collection;public class StandardDivision { public double calculate(Collection<Integer> nums, int average) { if(nums == null || nums.isEmpty()) throw new IllegalArgumentException(); int n = nums.size(); double variance = 0; for(int num : nums) { variance += (num - average) * (num - average); } variance /= n; return Math.sqrt(variance); }}
主要功能类,构造函数接收两个参数,一个是物理机器数,一个是每个机器创建的虚拟节点数。
generateVirtualNodes() 是主要功能方法,针对每一个物理机器,循环调用生成虚拟节点方法,同时将虚拟节点跟物理节点的关系存在一个Map中。
assignValueToVirtualNode()是用于当接收到一个value后,分配到哪个虚拟节点中,同时将对应的物理节点的映射数据总数加1.
import java.util.*;public class ConsistentHash { private final Random random = new Random(); private final Map<Integer, Integer> virtualPhysicalMapping = new HashMap<>();//key is virtual node, value is physical node id private final List<Integer> virtualNodes = new ArrayList<>(); //keep all virtualNodes private final int machineNumbers; private final int virtualNodesNumbersForOneMachine; // key is index of physical machine, value is number of data in this physical machine Map<Integer, Integer> physicalMachineCountMap = new HashMap<>(); public ConsistentHash(int machineNumbers, int virtualNodesNumbersForOneMachine) { if(machineNumbers <= 0) throw new IllegalArgumentException("machineNumbers needs to be positive"); if(virtualNodesNumbersForOneMachine <= 0) throw new IllegalArgumentException("virtualNodesNumbersForOneMachine needs to be positive"); this.machineNumbers = machineNumbers; this.virtualNodesNumbersForOneMachine = virtualNodesNumbersForOneMachine; generateVirtualNodes(); } public void assignValueToVirtualNode(int value) { int index = Collections.binarySearch(virtualNodes, value); if(index < 0) index = -index - 1; //if index < 0, get the correct insert position if(index == virtualNodes.size()) index = 0; //reach the end of the hash cycle, use the first virtual node int physicalMachineId = virtualPhysicalMapping.get(virtualNodes.get(index)); int count = physicalMachineCountMap.getOrDefault(physicalMachineId, 0); physicalMachineCountMap.put(physicalMachineId, count + 1); } public Map<Integer, Integer> getPhysicalMachineCountMap() { return physicalMachineCountMap; } private void generateVirtualNodes() { for(int i=0; i<machineNumbers; i++) { List<Integer> virtualNodesForOneMachine = generateVirtualNodesForOneMachine(); for(Integer virtual : virtualNodesForOneMachine) { virtualPhysicalMapping.put(virtual, i); } virtualNodes.addAll(virtualNodesForOneMachine); } virtualNodes.sort(Comparator.comparingInt(x -> x)); } private List<Integer> generateVirtualNodesForOneMachine() { List<Integer> virtualNodesForOneMachine = new ArrayList<>(); for(int i=0; i<virtualNodesNumbersForOneMachine; i++) { virtualNodesForOneMachine.add(random.nextInt()); } return virtualNodesForOneMachine; }}
最后,就是测试了,简单写了一个main方法用于测试10个机器,100万数据情况下的标准差:
public static void main(String[] args) { int numberOfTestData = 1000000; int numberOfMachine = 10; int numberOfVirtualNodes = 200; ConsistentHash consistentHash = new ConsistentHash(numberOfMachine, numberOfVirtualNodes); Random r = new Random(); for(int i=0; i<numberOfTestData; i++) { int value = r.nextInt(); consistentHash.assignValueToVirtualNode(value); } Map<Integer, Integer> physicalMachineCountMap = consistentHash.getPhysicalMachineCountMap(); StandardDivision standardDivision = new StandardDivision(); double standDivision = standardDivision.calculate(physicalMachineCountMap.values(), numberOfTestData/numberOfMachine); System.out.println("Standard Division is: " + standDivision); }
经过10次左右的测试,输出的数值均为6900上下。
划线
评论
复制
发布于: 2020 年 07 月 08 日阅读数: 52
版权声明: 本文为 InfoQ 作者【sunnywhy】的原创文章。
原文链接:【http://xie.infoq.cn/article/735aefcd86f51776bec7c64f1】。文章转载请联系作者。
sunnywhy
关注
还未添加个人签名 2019.04.25 加入
还未添加个人简介
评论