架构师训练营第五周作业 一致性哈希

发布于: 2020 年 07 月 08 日

  • 用你熟悉的编程语言实现一致性 hash 算法。

  • 编写测试用例测试这个算法,测试 100 万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。

先写了一个工具类,用于计算标准差:

import java.util.Collection;
public class StandardDivision {
public double calculate(Collection<Integer> nums, int average) {
if(nums == null || nums.isEmpty()) throw new IllegalArgumentException();
int n = nums.size();
double variance = 0;
for(int num : nums) {
variance += (num - average) * (num - average);
}
variance /= n;
return Math.sqrt(variance);
}
}

主要功能类,构造函数接收两个参数,一个是物理机器数,一个是每个机器创建的虚拟节点数。

generateVirtualNodes() 是主要功能方法,针对每一个物理机器,循环调用生成虚拟节点方法,同时将虚拟节点跟物理节点的关系存在一个Map中。

assignValueToVirtualNode()是用于当接收到一个value后,分配到哪个虚拟节点中,同时将对应的物理节点的映射数据总数加1.

import java.util.*;
public class ConsistentHash {
private final Random random = new Random();
private final Map<Integer, Integer> virtualPhysicalMapping = new HashMap<>();//key is virtual node, value is physical node id
private final List<Integer> virtualNodes = new ArrayList<>(); //keep all virtualNodes
private final int machineNumbers;
private final int virtualNodesNumbersForOneMachine;
// key is index of physical machine, value is number of data in this physical machine
Map<Integer, Integer> physicalMachineCountMap = new HashMap<>();
public ConsistentHash(int machineNumbers, int virtualNodesNumbersForOneMachine) {
if(machineNumbers <= 0) throw new IllegalArgumentException("machineNumbers needs to be positive");
if(virtualNodesNumbersForOneMachine <= 0) throw new IllegalArgumentException("virtualNodesNumbersForOneMachine needs to be positive");
this.machineNumbers = machineNumbers;
this.virtualNodesNumbersForOneMachine = virtualNodesNumbersForOneMachine;
generateVirtualNodes();
}
public void assignValueToVirtualNode(int value) {
int index = Collections.binarySearch(virtualNodes, value);
if(index < 0) index = -index - 1; //if index < 0, get the correct insert position
if(index == virtualNodes.size()) index = 0; //reach the end of the hash cycle, use the first virtual node
int physicalMachineId = virtualPhysicalMapping.get(virtualNodes.get(index));
int count = physicalMachineCountMap.getOrDefault(physicalMachineId, 0);
physicalMachineCountMap.put(physicalMachineId, count + 1);
}
public Map<Integer, Integer> getPhysicalMachineCountMap() {
return physicalMachineCountMap;
}
private void generateVirtualNodes() {
for(int i=0; i<machineNumbers; i++) {
List<Integer> virtualNodesForOneMachine = generateVirtualNodesForOneMachine();
for(Integer virtual : virtualNodesForOneMachine) {
virtualPhysicalMapping.put(virtual, i);
}
virtualNodes.addAll(virtualNodesForOneMachine);
}
virtualNodes.sort(Comparator.comparingInt(x -> x));
}
private List<Integer> generateVirtualNodesForOneMachine() {
List<Integer> virtualNodesForOneMachine = new ArrayList<>();
for(int i=0; i<virtualNodesNumbersForOneMachine; i++) {
virtualNodesForOneMachine.add(random.nextInt());
}
return virtualNodesForOneMachine;
}
}

最后,就是测试了,简单写了一个main方法用于测试10个机器,100万数据情况下的标准差:

public static void main(String[] args) {
int numberOfTestData = 1000000;
int numberOfMachine = 10;
int numberOfVirtualNodes = 200;
ConsistentHash consistentHash = new ConsistentHash(numberOfMachine, numberOfVirtualNodes);
Random r = new Random();
for(int i=0; i<numberOfTestData; i++) {
int value = r.nextInt();
consistentHash.assignValueToVirtualNode(value);
}
Map<Integer, Integer> physicalMachineCountMap = consistentHash.getPhysicalMachineCountMap();
StandardDivision standardDivision = new StandardDivision();
double standDivision = standardDivision.calculate(physicalMachineCountMap.values(), numberOfTestData/numberOfMachine);
System.out.println("Standard Division is: " + standDivision);
}

经过10次左右的测试,输出的数值均为6900上下。

发布于: 2020 年 07 月 08 日 阅读数: 9
用户头像

sunnywhy

关注

还未添加个人签名 2019.04.25 加入

还未添加个人简介

评论

发布
暂无评论
架构师训练营第五周作业 一致性哈希