Week 5 作业
发布于: 2020 年 07 月 07 日
用你熟悉的编程语言实现一致性 hash 算法。
编写测试用例测试这个算法,测试 100 万 KV 数据,10 个服务器节点的情况下,计算这些 KV 数据在服务器上分布数量的标准差,以评估算法的存储负载不均衡性。
java 实现的一致性 hash 算法:
import java.util.*;import java.util.stream.Collectors;
public class Main { public static void main(String[] args) { int nodeCount = 10; int dataSize = 1_000_000;
Map<String, Integer> data = new HashMap<>(); for (int j = 0; j < dataSize; j++) { data.put("key-" + Math.random(), j); }
int base = 500; for (int i = 10; i <= 1000; i += 5) { HashRing<String, Integer> hashRing = new HashRing<>(nodeCount, i); data.forEach(hashRing::put); int sd = (int) hashRing.calcStandardDeviation(); System.out.printf("%4d => %6d ", i, sd); int count = sd / base; for (int j = 0; j < count; j++) { System.out.print("."); } System.out.println(); } }}
class HashRing<K, V> { private static final Random random = new Random(); private int nodeCount; private int virtualNodeSize; private TreeMap<Integer, DataNode<K, V>> hashRingMap; private List<DataNode<K, V>> dataNodes;
public HashRing(int nodeCount, int virtualNodeSize) { this.nodeCount = nodeCount; this.virtualNodeSize = virtualNodeSize; init(); }
private void init() { hashRingMap = new TreeMap<>(); dataNodes = new ArrayList<>(nodeCount); for (int i = 0; i < nodeCount; i++) { addDataNode(); } }
public void addDataNode() { DataNode<K, V> dataNode = new DataNode<>(); dataNodes.add(dataNode); for (int j = 0; j < virtualNodeSize; j++) { hashRingMap.put(random.nextInt(), dataNode); } }
public void removeDataNode(int i){ DataNode<K, V> kvDataNode = dataNodes.get(i); if (kvDataNode == null){ throw new IllegalArgumentException(); } dataNodes.remove(i); hashRingMap.entrySet().removeIf(integerDataNodeEntry -> integerDataNodeEntry.getValue() == kvDataNode); }
public V get(K k) { return getDataNode(k).get(k); }
public V put(K k, V v) { return getDataNode(k).put(k, v); }
private DataNode<K, V> getDataNode(K k) { assert k != null; return Optional.ofNullable(hashRingMap.ceilingEntry(k.hashCode())).orElse(hashRingMap.firstEntry()).getValue(); }
public double calcStandardDeviation() { double average = dataNodes.stream().map(DataNode::size).collect(Collectors.summarizingInt(Integer::intValue)).getAverage(); return Math.sqrt(dataNodes.stream().map(DataNode::size).map(size -> Math.pow(size - average, 2)).reduce(0D, Double::sum) / dataNodes.size()); }
private static class DataNode<K, V> extends HashMap<K, V> { }}
复制代码
总数据 100 万 KV、10 个数据节点不变,虚拟节点个数从 10 逐渐增加到 1000 步进值为 5,记录标准差的变化,多次运行结果可能不一样,其中之一如下所示,标准差的变化规律为:
10 ~ 150 之间快速下降
150 ~ 400 之间缓慢下降
400 之后随机波动不再下降
因此虚拟节点 400 以上为佳,200 左右亦可,少于 150 则会有较大的数据倾斜。
10 => 30542 ............................................................. 15 => 25151 .................................................. 20 => 19787 ....................................... 25 => 15100 .............................. 30 => 14915 ............................. 35 => 10430 .................... 40 => 10432 .................... 45 => 12076 ........................ 50 => 14701 ............................. 55 => 12178 ........................ 60 => 8165 ................ 65 => 13423 .......................... 70 => 10831 ..................... 75 => 11920 ....................... 80 => 13061 .......................... 85 => 8093 ................ 90 => 14513 ............................. 95 => 8206 ................ 100 => 10173 .................... 105 => 9210 .................. 110 => 9260 .................. 115 => 9755 ................... 120 => 6674 ............. 125 => 7931 ............... 130 => 7544 ............... 135 => 7299 .............. 140 => 7296 .............. 145 => 8821 ................. 150 => 7742 ............... 155 => 3395 ...... 160 => 7306 .............. 165 => 8690 ................. 170 => 8565 ................. 175 => 7199 .............. 180 => 9408 .................. 185 => 6208 ............ 190 => 7257 .............. 195 => 4889 ......... 200 => 4636 ......... 205 => 8274 ................ 210 => 9178 .................. 215 => 7881 ............... 220 => 5861 ........... 225 => 5108 .......... 230 => 8181 ................ 235 => 6631 ............. 240 => 3819 ....... 245 => 6797 ............. 250 => 6104 ............ 255 => 7939 ............... 260 => 4585 ......... 265 => 8550 ................. 270 => 5161 .......... 275 => 6196 ............ 280 => 3424 ...... 285 => 5403 .......... 290 => 3812 ....... 295 => 4287 ........ 300 => 6187 ............ 305 => 5059 .......... 310 => 3897 ....... 315 => 5463 .......... 320 => 8230 ................ 325 => 4347 ........ 330 => 5951 ........... 335 => 4779 ......... 340 => 4265 ........ 345 => 5768 ........... 350 => 4798 ......... 355 => 4406 ........ 360 => 3672 ....... 365 => 3955 ....... 370 => 5607 ........... 375 => 2656 ..... 380 => 5713 ........... 385 => 5396 .......... 390 => 5068 .......... 395 => 5720 ........... 400 => 2626 ..... 405 => 4581 ......... 410 => 6259 ............ 415 => 2669 ..... 420 => 4049 ........ 425 => 3796 ....... 430 => 4607 ......... 435 => 3564 ....... 440 => 2947 ..... 445 => 2971 ..... 450 => 2459 .... 455 => 2481 .... 460 => 3773 ....... 465 => 3989 ....... 470 => 6942 ............. 475 => 2898 ..... 480 => 4617 ......... 485 => 6044 ............ 490 => 4217 ........ 495 => 3374 ...... 500 => 5312 .......... 505 => 3115 ...... 510 => 4212 ........ 515 => 5001 .......... 520 => 3437 ...... 525 => 4463 ........ 530 => 4394 ........ 535 => 5161 .......... 540 => 3811 ....... 545 => 2919 ..... 550 => 2903 ..... 555 => 4181 ........ 560 => 3842 ....... 565 => 2985 ..... 570 => 4257 ........ 575 => 2751 ..... 580 => 2616 ..... 585 => 2275 .... 590 => 3788 ....... 595 => 3884 ....... 600 => 4610 ......... 605 => 2767 ..... 610 => 4245 ........ 615 => 4664 ......... 620 => 4200 ........ 625 => 2573 ..... 630 => 2156 .... 635 => 3063 ...... 640 => 3278 ...... 645 => 3342 ...... 650 => 3185 ...... 655 => 3914 ....... 660 => 4025 ........ 665 => 3744 ....... 670 => 4392 ........ 675 => 2866 ..... 680 => 3864 ....... 685 => 4608 ......... 690 => 3334 ...... 695 => 3203 ...... 700 => 3085 ...... 705 => 2755 ..... 710 => 4331 ........ 715 => 3261 ...... 720 => 4452 ........ 725 => 3255 ...... 730 => 3491 ...... 735 => 3768 ....... 740 => 4017 ........ 745 => 3301 ...... 750 => 2200 .... 755 => 4483 ........ 760 => 3757 ....... 765 => 3059 ...... 770 => 4131 ........ 775 => 3124 ...... 780 => 2648 ..... 785 => 3940 ....... 790 => 2706 ..... 795 => 3488 ...... 800 => 3849 ....... 805 => 2795 ..... 810 => 2568 ..... 815 => 2844 ..... 820 => 2422 .... 825 => 3518 ....... 830 => 2920 ..... 835 => 3418 ...... 840 => 3818 ....... 845 => 3683 ....... 850 => 3252 ...... 855 => 3200 ...... 860 => 3517 ....... 865 => 3194 ...... 870 => 3241 ...... 875 => 2654 ..... 880 => 4007 ........ 885 => 3835 ....... 890 => 2670 ..... 895 => 2799 ..... 900 => 3591 ....... 905 => 2980 ..... 910 => 3484 ...... 915 => 2956 ..... 920 => 2890 ..... 925 => 3760 ....... 930 => 1691 ... 935 => 2195 .... 940 => 2718 ..... 945 => 2243 .... 950 => 1779 ... 955 => 1856 ... 960 => 2641 ..... 965 => 3052 ...... 970 => 3334 ...... 975 => 3005 ...... 980 => 2895 ..... 985 => 1887 ... 990 => 2617 ..... 995 => 2872 .....1000 => 2608 .....
复制代码
划线
评论
复制
发布于: 2020 年 07 月 07 日阅读数: 60
版权声明: 本文为 InfoQ 作者【Shawn】的原创文章。
原文链接:【http://xie.infoq.cn/article/8261b2ae097373256bf17ffc8】。未经作者许可,禁止转载。
Shawn
关注
还未添加个人签名 2017.10.23 加入
还未添加个人简介











评论