架构师训练营 第 5 周作业

用户头像
Lingjun
关注
发布于: 2020 年 07 月 04 日

用你熟悉的编程语言实现一致性 hash 算法。编写测试用例测试这个算法,测试100万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。

代码实现如下:

public class Node {
private String ip;
public Node(String ip) {
this.ip = ip;
}
public String getIp() {
return ip;
}
@Override
public String toString() {
return "Node{" +
"ip='" + ip + '\'' +
'}';
}
}
public class ConsistentHash {
private int numberOfVirtualNodes;
private SortedMap<Integer, Node> hashCircle = new TreeMap();
public ConsistentHash(int numberOfVirtualNodes, List<Node> nodes) {
this.numberOfVirtualNodes = numberOfVirtualNodes;
this.addNodes(nodes);
}
public void addNodes(List<Node> nodes) {
for (Node node : nodes) {
addNode(node);
}
}
public void addNode(Node node) {
for (int i = 0; i < this.numberOfVirtualNodes; i++) {
hashCircle.put(getVirtualNodeHash(node, i), node);
}
}
public void removeNodes(List<Node> nodes) {
for (Node node : nodes) {
removeNode(node);
}
}
public void removeNode(Node node) {
for (int i = 0; i < this.numberOfVirtualNodes; i++) {
hashCircle.remove(getVirtualNodeHash(node, i));
}
}
public Node getNode(String key) {
if (hashCircle.isEmpty()) {
return null;
}
int keyHash = Math.abs(key.hashCode());
if (hashCircle.containsKey(keyHash)) {
return hashCircle.get(keyHash);
}
SortedMap<Integer, Node> tailMap = hashCircle.tailMap(keyHash);
int virtualNodeHash = tailMap.isEmpty() ? hashCircle.firstKey() : tailMap.firstKey();
return hashCircle.get(virtualNodeHash);
}
/**
* FNV1_32_HASH
*
* @param node
* @param factor
* @return
*/
private int getVirtualNodeHash(Node node, int factor) {
final int p = 16777619;
int hash = (int) 2166136261L;
String key = node.getIp() + factor;
for (int i = 0; i < key.length(); i++) {
hash = (hash ^ key.charAt(i)) * p;
}
hash += hash << 13;
hash ^= hash >> 7;
hash += hash << 3;
hash ^= hash >> 17;
hash += hash << 5;
return hash < 0 ? Math.abs(hash) : hash;
}
}
public class ConsistentHashDemo {
public static final int NODE_COUNT = 10;
public static final String NODE_IP_PREFIX = "192.168.0.";
public static final int TESTING_DATA_COUNT = 1000000;
public static final int AVERAGE_NODE_TESTING_DATA_COUNT = TESTING_DATA_COUNT / NODE_COUNT;
public static void main(String[] args) {
// initialize nodes
List<Node> nodes = new ArrayList<Node>();
for (int i = 0; i < NODE_COUNT; i++) {
nodes.add(new Node(NODE_IP_PREFIX + i));
}
testConsistentHashStandardDeviation(nodes, 100);
testConsistentHashStandardDeviation(nodes, 150);
testConsistentHashStandardDeviation(nodes, 200);
testConsistentHashStandardDeviation(nodes, 300);
testConsistentHashStandardDeviation(nodes, 400);
testConsistentHashStandardDeviation(nodes, 500);
}
private static void testConsistentHashStandardDeviation(List<Node> nodes, int virtualNodeCount) {
ConsistentHash consistentHash = new ConsistentHash(virtualNodeCount, nodes);
Map<String, Integer> nodeTestingDataCount = generateNodeTestingDataCountCache(nodes);
// assign testing data to node
for (int i = 0; i < TESTING_DATA_COUNT; i++) {
String testingDataKey = UUID.randomUUID().toString() + new Date().getTime();
Node node = consistentHash.getNode(testingDataKey);
nodeTestingDataCount.put(node.getIp(), nodeTestingDataCount.get(node.getIp()) + 1);
}
double standardDeviation = calculateStandardDeviation(nodeTestingDataCount);
System.out.println("Standard Deviation for " + virtualNodeCount + " virtual nodes per node is: " + standardDeviation);
for (int i = 0; i < NODE_COUNT; i++) {
System.out.println(NODE_IP_PREFIX + i + ": " + nodeTestingDataCount.get(NODE_IP_PREFIX + i));
}
}
private static double calculateStandardDeviation(Map<String, Integer> nodeTestingDataCount) {
long powSum = 0;
for (Map.Entry<String, Integer> entry : nodeTestingDataCount.entrySet()) {
powSum += Math.pow(AVERAGE_NODE_TESTING_DATA_COUNT - entry.getValue(), 2);
}
return Math.sqrt(powSum / NODE_COUNT);
}
private static Map<String, Integer> generateNodeTestingDataCountCache(List<Node> nodes) {
Map<String, Integer> nodeTestingDataCount = new HashMap<String, Integer>();
for (Node node : nodes) {
nodeTestingDataCount.put(node.getIp(), 0);
}
return nodeTestingDataCount;
}
}



以上代码运行3次结果如下,平均来看,基于 FNV1_32_HASH 算法 和 10 个节点的情况下,随机生成 100万个 key 的负载均衡情况,每个节点设置 200 个左右的虚拟节点可以达到较好的平衡(标准差在4600左右)。

10个节点,每个节点有100个虚拟节点的情况

标准差分别为:8693,8683,8730

10个节点,每个节点有150个虚拟节点的情况

标准差分别为:5496,5386,5523

10个节点,每个节点有200个虚拟节点的情况

标准差分别为:4679,4515,4860

10个节点,每个节点有300个虚拟节点的情况

标准差分别为:5321,5227,5364

10个节点,每个节点有400个虚拟节点的情况

标准差分别为:5213,5058,5067

10个节点,每个节点有500个虚拟节点的情况

标准差分别为:5150,5074,5280



详细运行结果如下:

第一次运行:

Standard Deviation for 100 virtual nodes per node is: 8693.312544709295

192.168.0.0: 118791

192.168.0.1: 92913

192.168.0.2: 103922

192.168.0.3: 107664

192.168.0.4: 88269

192.168.0.5: 105893

192.168.0.6: 97984

192.168.0.7: 97126

192.168.0.8: 96432

192.168.0.9: 91006

Standard Deviation for 150 virtual nodes per node is: 5496.127090961416

192.168.0.0: 107064

192.168.0.1: 101915

192.168.0.2: 95442

192.168.0.3: 103929

192.168.0.4: 89248

192.168.0.5: 99471

192.168.0.6: 105860

192.168.0.7: 105254

192.168.0.8: 95207

192.168.0.9: 96610

Standard Deviation for 200 virtual nodes per node is: 4679.677552994437

192.168.0.0: 107498

192.168.0.1: 101070

192.168.0.2: 104508

192.168.0.3: 93105

192.168.0.4: 96091

192.168.0.5: 99643

192.168.0.6: 106865

192.168.0.7: 97630

192.168.0.8: 95203

192.168.0.9: 98387

Standard Deviation for 300 virtual nodes per node is: 5321.195730284689

192.168.0.0: 95601

192.168.0.1: 106554

192.168.0.2: 102375

192.168.0.3: 90274

192.168.0.4: 101705

192.168.0.5: 100535

192.168.0.6: 109052

192.168.0.7: 94047

192.168.0.8: 99808

192.168.0.9: 100049

Standard Deviation for 400 virtual nodes per node is: 5213.945147390793

192.168.0.0: 99042

192.168.0.1: 100168

192.168.0.2: 105234

192.168.0.3: 90787

192.168.0.4: 96798

192.168.0.5: 102825

192.168.0.6: 109440

192.168.0.7: 103093

192.168.0.8: 98993

192.168.0.9: 93620

Standard Deviation for 500 virtual nodes per node is: 5150.529487343995

192.168.0.0: 97573

192.168.0.1: 98579

192.168.0.2: 103108

192.168.0.3: 91767

192.168.0.4: 96772

192.168.0.5: 103092

192.168.0.6: 108037

192.168.0.7: 107834

192.168.0.8: 94261

192.168.0.9: 98977



第二次运行:

Standard Deviation for 100 virtual nodes per node is: 8683.248355310356

192.168.0.0: 119135

192.168.0.1: 92794

192.168.0.2: 104242

192.168.0.3: 107241

192.168.0.4: 88359

192.168.0.5: 105521

192.168.0.6: 98146

192.168.0.7: 96932

192.168.0.8: 96066

192.168.0.9: 91564

Standard Deviation for 150 virtual nodes per node is: 5386.41940439101

192.168.0.0: 107350

192.168.0.1: 101576

192.168.0.2: 95606

192.168.0.3: 103232

192.168.0.4: 88816

192.168.0.5: 100135

192.168.0.6: 105525

192.168.0.7: 104738

192.168.0.8: 95658

192.168.0.9: 97364

Standard Deviation for 200 virtual nodes per node is: 4515.555004647823

192.168.0.0: 107422

192.168.0.1: 100751

192.168.0.2: 104330

192.168.0.3: 92717

192.168.0.4: 96310

192.168.0.5: 100518

192.168.0.6: 105979

192.168.0.7: 97894

192.168.0.8: 95480

192.168.0.9: 98599

Standard Deviation for 300 virtual nodes per node is: 5227.366258451764

192.168.0.0: 95233

192.168.0.1: 106445

192.168.0.2: 101948

192.168.0.3: 90681

192.168.0.4: 101499

192.168.0.5: 100582

192.168.0.6: 109191

192.168.0.7: 94415

192.168.0.8: 99779

192.168.0.9: 100227

Standard Deviation for 400 virtual nodes per node is: 5058.7798924246545

192.168.0.0: 98366

192.168.0.1: 100031

192.168.0.2: 104660

192.168.0.3: 91247

192.168.0.4: 97449

192.168.0.5: 103261

192.168.0.6: 109327

192.168.0.7: 103020

192.168.0.8: 99018

192.168.0.9: 93621

Standard Deviation for 500 virtual nodes per node is: 5074.836154990622

192.168.0.0: 97299

192.168.0.1: 98732

192.168.0.2: 102883

192.168.0.3: 91981

192.168.0.4: 97387

192.168.0.5: 102633

192.168.0.6: 107817

192.168.0.7: 108204

192.168.0.8: 94319

192.168.0.9: 98745



第三次运行:

Standard Deviation for 100 virtual nodes per node is: 8730.952754424914

192.168.0.0: 119173

192.168.0.1: 93661

192.168.0.2: 104095

192.168.0.3: 107548

192.168.0.4: 88132

192.168.0.5: 105199

192.168.0.6: 98534

192.168.0.7: 96809

192.168.0.8: 96078

192.168.0.9: 90771

Standard Deviation for 150 virtual nodes per node is: 5523.682738898026

192.168.0.0: 107269

192.168.0.1: 101227

192.168.0.2: 95658

192.168.0.3: 103547

192.168.0.4: 88636

192.168.0.5: 100025

192.168.0.6: 106281

192.168.0.7: 104698

192.168.0.8: 95401

192.168.0.9: 97258

Standard Deviation for 200 virtual nodes per node is: 4860.374265424423

192.168.0.0: 108006

192.168.0.1: 100720

192.168.0.2: 104379

192.168.0.3: 92756

192.168.0.4: 96353

192.168.0.5: 100688

192.168.0.6: 106872

192.168.0.7: 97235

192.168.0.8: 94657

192.168.0.9: 98334

Standard Deviation for 300 virtual nodes per node is: 5364.141403803595

192.168.0.0: 94960

192.168.0.1: 106411

192.168.0.2: 102442

192.168.0.3: 90665

192.168.0.4: 101865

192.168.0.5: 100248

192.168.0.6: 109595

192.168.0.7: 94311

192.168.0.8: 99893

192.168.0.9: 99610

Standard Deviation for 400 virtual nodes per node is: 5067.849938583423

192.168.0.0: 99335

192.168.0.1: 100605

192.168.0.2: 104311

192.168.0.3: 91079

192.168.0.4: 96490

192.168.0.5: 102606

192.168.0.6: 109127

192.168.0.7: 103753

192.168.0.8: 99055

192.168.0.9: 93639

Standard Deviation for 500 virtual nodes per node is: 5280.030965818288

192.168.0.0: 97015

192.168.0.1: 98175

192.168.0.2: 103242

192.168.0.3: 91869

192.168.0.4: 97478

192.168.0.5: 102680

192.168.0.6: 108318

192.168.0.7: 108347

192.168.0.8: 93974

192.168.0.9: 98902



发布于: 2020 年 07 月 04 日 阅读数: 35
用户头像

Lingjun

关注

还未添加个人签名 2018.11.22 加入

还未添加个人简介

评论

发布
暂无评论
架构师训练营 第 5 周作业