写点什么

Week 5 作业

用户头像
Shawn
关注
发布于: 2020 年 07 月 07 日
  • 用你熟悉的编程语言实现一致性 hash 算法。

  • 编写测试用例测试这个算法,测试 100 万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。


java 实现的一致性 hash 算法:


import java.util.*;import java.util.stream.Collectors;
public class Main { public static void main(String[] args) { int nodeCount = 10; int dataSize = 1_000_000;
Map<String, Integer> data = new HashMap<>(); for (int j = 0; j < dataSize; j++) { data.put("key-" + Math.random(), j); }
int base = 500; for (int i = 10; i <= 1000; i += 5) { HashRing<String, Integer> hashRing = new HashRing<>(nodeCount, i); data.forEach(hashRing::put); int sd = (int) hashRing.calcStandardDeviation(); System.out.printf("%4d => %6d ", i, sd); int count = sd / base; for (int j = 0; j < count; j++) { System.out.print("."); } System.out.println(); } }}
class HashRing<K, V> { private static final Random random = new Random(); private int nodeCount; private int virtualNodeSize; private TreeMap<Integer, DataNode<K, V>> hashRingMap; private List<DataNode<K, V>> dataNodes;
public HashRing(int nodeCount, int virtualNodeSize) { this.nodeCount = nodeCount; this.virtualNodeSize = virtualNodeSize; init(); }
private void init() { hashRingMap = new TreeMap<>(); dataNodes = new ArrayList<>(nodeCount); for (int i = 0; i < nodeCount; i++) { addDataNode(); } }
public void addDataNode() { DataNode<K, V> dataNode = new DataNode<>(); dataNodes.add(dataNode); for (int j = 0; j < virtualNodeSize; j++) { hashRingMap.put(random.nextInt(), dataNode); } }
public void removeDataNode(int i){ DataNode<K, V> kvDataNode = dataNodes.get(i); if (kvDataNode == null){ throw new IllegalArgumentException(); } dataNodes.remove(i); hashRingMap.entrySet().removeIf(integerDataNodeEntry -> integerDataNodeEntry.getValue() == kvDataNode); }
public V get(K k) { return getDataNode(k).get(k); }
public V put(K k, V v) { return getDataNode(k).put(k, v); }
private DataNode<K, V> getDataNode(K k) { assert k != null; return Optional.ofNullable(hashRingMap.ceilingEntry(k.hashCode())).orElse(hashRingMap.firstEntry()).getValue(); }
public double calcStandardDeviation() { double average = dataNodes.stream().map(DataNode::size).collect(Collectors.summarizingInt(Integer::intValue)).getAverage(); return Math.sqrt(dataNodes.stream().map(DataNode::size).map(size -> Math.pow(size - average, 2)).reduce(0D, Double::sum) / dataNodes.size()); }

private static class DataNode<K, V> extends HashMap<K, V> { }}
复制代码


总数据 100 万 KV、10 个数据节点不变,虚拟节点个数从 10 逐渐增加到 1000 步进值为 5,记录标准差的变化,多次运行结果可能不一样,其中之一如下所示,标准差的变化规律为:

10 ~ 150 之间快速下降

150 ~ 400 之间缓慢下降

400 之后随机波动不再下降

因此虚拟节点 400 以上为佳,200 左右亦可,少于 150 则会有较大的数据倾斜。


  10 =>  30542 .............................................................  15 =>  25151 ..................................................  20 =>  19787 .......................................  25 =>  15100 ..............................  30 =>  14915 .............................  35 =>  10430 ....................  40 =>  10432 ....................  45 =>  12076 ........................  50 =>  14701 .............................  55 =>  12178 ........................  60 =>   8165 ................  65 =>  13423 ..........................  70 =>  10831 .....................  75 =>  11920 .......................  80 =>  13061 ..........................  85 =>   8093 ................  90 =>  14513 .............................  95 =>   8206 ................ 100 =>  10173 .................... 105 =>   9210 .................. 110 =>   9260 .................. 115 =>   9755 ................... 120 =>   6674 ............. 125 =>   7931 ............... 130 =>   7544 ............... 135 =>   7299 .............. 140 =>   7296 .............. 145 =>   8821 ................. 150 =>   7742 ............... 155 =>   3395 ...... 160 =>   7306 .............. 165 =>   8690 ................. 170 =>   8565 ................. 175 =>   7199 .............. 180 =>   9408 .................. 185 =>   6208 ............ 190 =>   7257 .............. 195 =>   4889 ......... 200 =>   4636 ......... 205 =>   8274 ................ 210 =>   9178 .................. 215 =>   7881 ............... 220 =>   5861 ........... 225 =>   5108 .......... 230 =>   8181 ................ 235 =>   6631 ............. 240 =>   3819 ....... 245 =>   6797 ............. 250 =>   6104 ............ 255 =>   7939 ............... 260 =>   4585 ......... 265 =>   8550 ................. 270 =>   5161 .......... 275 =>   6196 ............ 280 =>   3424 ...... 285 =>   5403 .......... 290 =>   3812 ....... 295 =>   4287 ........ 300 =>   6187 ............ 305 =>   5059 .......... 310 =>   3897 ....... 315 =>   5463 .......... 320 =>   8230 ................ 325 =>   4347 ........ 330 =>   5951 ........... 335 =>   4779 ......... 340 =>   4265 ........ 345 =>   5768 ........... 350 =>   4798 ......... 355 =>   4406 ........ 360 =>   3672 ....... 365 =>   3955 ....... 370 =>   5607 ........... 375 =>   2656 ..... 380 =>   5713 ........... 385 =>   5396 .......... 390 =>   5068 .......... 395 =>   5720 ........... 400 =>   2626 ..... 405 =>   4581 ......... 410 =>   6259 ............ 415 =>   2669 ..... 420 =>   4049 ........ 425 =>   3796 ....... 430 =>   4607 ......... 435 =>   3564 ....... 440 =>   2947 ..... 445 =>   2971 ..... 450 =>   2459 .... 455 =>   2481 .... 460 =>   3773 ....... 465 =>   3989 ....... 470 =>   6942 ............. 475 =>   2898 ..... 480 =>   4617 ......... 485 =>   6044 ............ 490 =>   4217 ........ 495 =>   3374 ...... 500 =>   5312 .......... 505 =>   3115 ...... 510 =>   4212 ........ 515 =>   5001 .......... 520 =>   3437 ...... 525 =>   4463 ........ 530 =>   4394 ........ 535 =>   5161 .......... 540 =>   3811 ....... 545 =>   2919 ..... 550 =>   2903 ..... 555 =>   4181 ........ 560 =>   3842 ....... 565 =>   2985 ..... 570 =>   4257 ........ 575 =>   2751 ..... 580 =>   2616 ..... 585 =>   2275 .... 590 =>   3788 ....... 595 =>   3884 ....... 600 =>   4610 ......... 605 =>   2767 ..... 610 =>   4245 ........ 615 =>   4664 ......... 620 =>   4200 ........ 625 =>   2573 ..... 630 =>   2156 .... 635 =>   3063 ...... 640 =>   3278 ...... 645 =>   3342 ...... 650 =>   3185 ...... 655 =>   3914 ....... 660 =>   4025 ........ 665 =>   3744 ....... 670 =>   4392 ........ 675 =>   2866 ..... 680 =>   3864 ....... 685 =>   4608 ......... 690 =>   3334 ...... 695 =>   3203 ...... 700 =>   3085 ...... 705 =>   2755 ..... 710 =>   4331 ........ 715 =>   3261 ...... 720 =>   4452 ........ 725 =>   3255 ...... 730 =>   3491 ...... 735 =>   3768 ....... 740 =>   4017 ........ 745 =>   3301 ...... 750 =>   2200 .... 755 =>   4483 ........ 760 =>   3757 ....... 765 =>   3059 ...... 770 =>   4131 ........ 775 =>   3124 ...... 780 =>   2648 ..... 785 =>   3940 ....... 790 =>   2706 ..... 795 =>   3488 ...... 800 =>   3849 ....... 805 =>   2795 ..... 810 =>   2568 ..... 815 =>   2844 ..... 820 =>   2422 .... 825 =>   3518 ....... 830 =>   2920 ..... 835 =>   3418 ...... 840 =>   3818 ....... 845 =>   3683 ....... 850 =>   3252 ...... 855 =>   3200 ...... 860 =>   3517 ....... 865 =>   3194 ...... 870 =>   3241 ...... 875 =>   2654 ..... 880 =>   4007 ........ 885 =>   3835 ....... 890 =>   2670 ..... 895 =>   2799 ..... 900 =>   3591 ....... 905 =>   2980 ..... 910 =>   3484 ...... 915 =>   2956 ..... 920 =>   2890 ..... 925 =>   3760 ....... 930 =>   1691 ... 935 =>   2195 .... 940 =>   2718 ..... 945 =>   2243 .... 950 =>   1779 ... 955 =>   1856 ... 960 =>   2641 ..... 965 =>   3052 ...... 970 =>   3334 ...... 975 =>   3005 ...... 980 =>   2895 ..... 985 =>   1887 ... 990 =>   2617 ..... 995 =>   2872 .....1000 =>   2608 .....
复制代码


发布于: 2020 年 07 月 07 日阅读数: 60
用户头像

Shawn

关注

还未添加个人签名 2017.10.23 加入

还未添加个人简介

评论

发布
暂无评论
Week 5 作业