Week 5 作业
发布于: 2020 年 07 月 07 日
- 用你熟悉的编程语言实现一致性 hash 算法。 
- 编写测试用例测试这个算法,测试 100 万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。 
java 实现的一致性 hash 算法:
import java.util.*;import java.util.stream.Collectors;
public class Main {    public static void main(String[] args) {        int nodeCount = 10;        int dataSize = 1_000_000;
        Map<String, Integer> data = new HashMap<>();        for (int j = 0; j < dataSize; j++) {            data.put("key-" + Math.random(), j);        }
        int base = 500;        for (int i = 10; i <= 1000; i += 5) {            HashRing<String, Integer> hashRing = new HashRing<>(nodeCount, i);            data.forEach(hashRing::put);            int sd = (int) hashRing.calcStandardDeviation();            System.out.printf("%4d => %6d ", i, sd);            int count = sd / base;            for (int j = 0; j < count; j++) {                System.out.print(".");            }            System.out.println();        }    }}
class HashRing<K, V> {    private static final Random random = new Random();    private int nodeCount;    private int virtualNodeSize;    private TreeMap<Integer, DataNode<K, V>> hashRingMap;    private List<DataNode<K, V>> dataNodes;
    public HashRing(int nodeCount, int virtualNodeSize) {        this.nodeCount = nodeCount;        this.virtualNodeSize = virtualNodeSize;        init();    }
    private void init() {        hashRingMap = new TreeMap<>();        dataNodes = new ArrayList<>(nodeCount);        for (int i = 0; i < nodeCount; i++) {            addDataNode();        }    }
    public void addDataNode() {        DataNode<K, V> dataNode = new DataNode<>();        dataNodes.add(dataNode);        for (int j = 0; j < virtualNodeSize; j++) {            hashRingMap.put(random.nextInt(), dataNode);        }    }
    public void removeDataNode(int i){        DataNode<K, V> kvDataNode = dataNodes.get(i);        if (kvDataNode == null){            throw new IllegalArgumentException();        }        dataNodes.remove(i);        hashRingMap.entrySet().removeIf(integerDataNodeEntry -> integerDataNodeEntry.getValue() == kvDataNode);    }
    public V get(K k) {        return getDataNode(k).get(k);    }
    public V put(K k, V v) {        return getDataNode(k).put(k, v);    }
    private DataNode<K, V> getDataNode(K k) {        assert k != null;        return Optional.ofNullable(hashRingMap.ceilingEntry(k.hashCode())).orElse(hashRingMap.firstEntry()).getValue();    }
    public double calcStandardDeviation() {        double average = dataNodes.stream().map(DataNode::size).collect(Collectors.summarizingInt(Integer::intValue)).getAverage();        return Math.sqrt(dataNodes.stream().map(DataNode::size).map(size -> Math.pow(size - average, 2)).reduce(0D, Double::sum) / dataNodes.size());    }
    private static class DataNode<K, V> extends HashMap<K, V> {    }}
复制代码
 总数据 100 万 KV、10 个数据节点不变,虚拟节点个数从 10 逐渐增加到 1000 步进值为 5,记录标准差的变化,多次运行结果可能不一样,其中之一如下所示,标准差的变化规律为:
10 ~ 150 之间快速下降
150 ~ 400 之间缓慢下降
400 之后随机波动不再下降
因此虚拟节点 400 以上为佳,200 左右亦可,少于 150 则会有较大的数据倾斜。
  10 =>  30542 .............................................................  15 =>  25151 ..................................................  20 =>  19787 .......................................  25 =>  15100 ..............................  30 =>  14915 .............................  35 =>  10430 ....................  40 =>  10432 ....................  45 =>  12076 ........................  50 =>  14701 .............................  55 =>  12178 ........................  60 =>   8165 ................  65 =>  13423 ..........................  70 =>  10831 .....................  75 =>  11920 .......................  80 =>  13061 ..........................  85 =>   8093 ................  90 =>  14513 .............................  95 =>   8206 ................ 100 =>  10173 .................... 105 =>   9210 .................. 110 =>   9260 .................. 115 =>   9755 ................... 120 =>   6674 ............. 125 =>   7931 ............... 130 =>   7544 ............... 135 =>   7299 .............. 140 =>   7296 .............. 145 =>   8821 ................. 150 =>   7742 ............... 155 =>   3395 ...... 160 =>   7306 .............. 165 =>   8690 ................. 170 =>   8565 ................. 175 =>   7199 .............. 180 =>   9408 .................. 185 =>   6208 ............ 190 =>   7257 .............. 195 =>   4889 ......... 200 =>   4636 ......... 205 =>   8274 ................ 210 =>   9178 .................. 215 =>   7881 ............... 220 =>   5861 ........... 225 =>   5108 .......... 230 =>   8181 ................ 235 =>   6631 ............. 240 =>   3819 ....... 245 =>   6797 ............. 250 =>   6104 ............ 255 =>   7939 ............... 260 =>   4585 ......... 265 =>   8550 ................. 270 =>   5161 .......... 275 =>   6196 ............ 280 =>   3424 ...... 285 =>   5403 .......... 290 =>   3812 ....... 295 =>   4287 ........ 300 =>   6187 ............ 305 =>   5059 .......... 310 =>   3897 ....... 315 =>   5463 .......... 320 =>   8230 ................ 325 =>   4347 ........ 330 =>   5951 ........... 335 =>   4779 ......... 340 =>   4265 ........ 345 =>   5768 ........... 350 =>   4798 ......... 355 =>   4406 ........ 360 =>   3672 ....... 365 =>   3955 ....... 370 =>   5607 ........... 375 =>   2656 ..... 380 =>   5713 ........... 385 =>   5396 .......... 390 =>   5068 .......... 395 =>   5720 ........... 400 =>   2626 ..... 405 =>   4581 ......... 410 =>   6259 ............ 415 =>   2669 ..... 420 =>   4049 ........ 425 =>   3796 ....... 430 =>   4607 ......... 435 =>   3564 ....... 440 =>   2947 ..... 445 =>   2971 ..... 450 =>   2459 .... 455 =>   2481 .... 460 =>   3773 ....... 465 =>   3989 ....... 470 =>   6942 ............. 475 =>   2898 ..... 480 =>   4617 ......... 485 =>   6044 ............ 490 =>   4217 ........ 495 =>   3374 ...... 500 =>   5312 .......... 505 =>   3115 ...... 510 =>   4212 ........ 515 =>   5001 .......... 520 =>   3437 ...... 525 =>   4463 ........ 530 =>   4394 ........ 535 =>   5161 .......... 540 =>   3811 ....... 545 =>   2919 ..... 550 =>   2903 ..... 555 =>   4181 ........ 560 =>   3842 ....... 565 =>   2985 ..... 570 =>   4257 ........ 575 =>   2751 ..... 580 =>   2616 ..... 585 =>   2275 .... 590 =>   3788 ....... 595 =>   3884 ....... 600 =>   4610 ......... 605 =>   2767 ..... 610 =>   4245 ........ 615 =>   4664 ......... 620 =>   4200 ........ 625 =>   2573 ..... 630 =>   2156 .... 635 =>   3063 ...... 640 =>   3278 ...... 645 =>   3342 ...... 650 =>   3185 ...... 655 =>   3914 ....... 660 =>   4025 ........ 665 =>   3744 ....... 670 =>   4392 ........ 675 =>   2866 ..... 680 =>   3864 ....... 685 =>   4608 ......... 690 =>   3334 ...... 695 =>   3203 ...... 700 =>   3085 ...... 705 =>   2755 ..... 710 =>   4331 ........ 715 =>   3261 ...... 720 =>   4452 ........ 725 =>   3255 ...... 730 =>   3491 ...... 735 =>   3768 ....... 740 =>   4017 ........ 745 =>   3301 ...... 750 =>   2200 .... 755 =>   4483 ........ 760 =>   3757 ....... 765 =>   3059 ...... 770 =>   4131 ........ 775 =>   3124 ...... 780 =>   2648 ..... 785 =>   3940 ....... 790 =>   2706 ..... 795 =>   3488 ...... 800 =>   3849 ....... 805 =>   2795 ..... 810 =>   2568 ..... 815 =>   2844 ..... 820 =>   2422 .... 825 =>   3518 ....... 830 =>   2920 ..... 835 =>   3418 ...... 840 =>   3818 ....... 845 =>   3683 ....... 850 =>   3252 ...... 855 =>   3200 ...... 860 =>   3517 ....... 865 =>   3194 ...... 870 =>   3241 ...... 875 =>   2654 ..... 880 =>   4007 ........ 885 =>   3835 ....... 890 =>   2670 ..... 895 =>   2799 ..... 900 =>   3591 ....... 905 =>   2980 ..... 910 =>   3484 ...... 915 =>   2956 ..... 920 =>   2890 ..... 925 =>   3760 ....... 930 =>   1691 ... 935 =>   2195 .... 940 =>   2718 ..... 945 =>   2243 .... 950 =>   1779 ... 955 =>   1856 ... 960 =>   2641 ..... 965 =>   3052 ...... 970 =>   3334 ...... 975 =>   3005 ...... 980 =>   2895 ..... 985 =>   1887 ... 990 =>   2617 ..... 995 =>   2872 .....1000 =>   2608 .....
复制代码
 划线
评论
复制
发布于: 2020 年 07 月 07 日阅读数: 60
版权声明: 本文为 InfoQ 作者【Shawn】的原创文章。
原文链接:【http://xie.infoq.cn/article/8261b2ae097373256bf17ffc8】。未经作者许可,禁止转载。
Shawn
关注
还未添加个人签名 2017.10.23 加入
还未添加个人简介











 
    
评论