第五周作业

关注
发布于: 2020 年 11 月 22 日
作业一：用java实现一致性哈希
根据定义，实现带虚拟节点的一致性哈希的要点如下
1，创建节点时，为每个真实节点随机生成若干数量的虚拟节点，并维护虚拟节点与实际节点的映射关系。
2，数据存储/取得的时候，根据key的哈希值查找对应的实际节点，保存/获取数据。
3，删除节点时，删除对应的虚拟节点。
4，可以动态新增节点
﻿
具体实现过程：
注意：为了方便检查，代码做了简化，命名变得有些奇怪，请见谅
1，初始化时，读入需要创建的真实节点以及每个真实节点对应的每个虚拟节点数， 默认虚拟节点为150
﻿
    private int slot = 150;
    // 虚拟节点最大可能数
    private final long MAX_COUNT = (1l << 32) - 1;
    
    public Manager(int nodeCount, int slot) {
        this.slot = slot;
        init(nodeCount);
    }
﻿
    public Manager(int nodeCount) {
        init(nodeCount);
    }
﻿
﻿
2，创建节点，随机生成若干个虚拟节点并维护映射关系
    // 映射关系
    private Map<Long, Node> nodeMap = new HashMap<Long, Node>();
    private List<Long> keyList = new ArrayList<Long>();
    private void init(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            addNode();
        }
        // 为了加快节点查找，对虚拟节点做了排序操作
        keyList.sort(new Comparator<Long>() {
            @Override
            public int compare(Long o1, Long o2) {
                return o1 > o2 ? 1 : o1 == o2 ? 0 : -1;
            }
        });
    }
    // 动态新增真实节点时使用
   public void addNode() {
        Node node = new Node();
        nodeList.add(node);
        addNode(node);
    }
﻿
    private void addNode(Node node) {
        Random rand = new Random();
        int index = 0;
        List<Long> allocatedSlots = new ArrayList<Long>();
        node.setSlots(allocatedSlots);
        while(index < slot) {
            long nextSlot = rand.nextInt() + (long)Integer.MAX_VALUE + 1;
            if (!nodeMap.containsKey(nextSlot)) {
                nodeMap.put(nextSlot, node);
                keyList.add(nextSlot);
                allocatedSlots.add(nextSlot);
                index++;
            }
        }
    }
    
3，数据插入和查找
    public String get(String key) {
        long hashCode = key.hashCode() + (long)Integer.MAX_VALUE + 1;
        long slot = hashCode % MAX_COUNT;
        Node node = getNodeBySlot(slot);
        return node.get(key);
    }
﻿
    public void put(String key, String value) {
        long hashCode = key.hashCode() + (long)Integer.MAX_VALUE + 1;
        long slot = hashCode % MAX_COUNT;
        Node node = getNodeBySlot(slot);
        node.put(key, value);
    }
    // 查找节点的方式还可以进行优化
    private Node getNodeBySlot(long slot) {
        if (keyList.contains(slot)) {
            return nodeMap.get(slot);
        }
        long key = 0l;
        int index = 0;
        while(key < slot && index < keyList.size()) {
            key = keyList.get(index++);
        }
        Node node = nodeMap.get(key);
        return node != null ? node : nodeMap.get(keyList.get(0));
    }
﻿
4，节点删除，未实现
﻿
PS github链接：https://github.com/softthink-csp/consistent-hashing
﻿
作业二：编写测试用例测试这个算法，测试 100 万 KV 数据，10 个服务器节点的情况下，计算这些 KV 数据在服务器上分布数量的标准差，以评估算法的存储负载不均衡性。
﻿
测试代码：
    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 10; i++) {
            test(10);
        }
    }
﻿
    public static void test(int slot) throws InterruptedException {
﻿
        Manager manager = new Manager(10, slot);
        CountDownLatch latch = new CountDownLatch(10);
        ExecutorService executors = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 10; i++) {
            executors.execute(() -> {
                for (int j = 0; j < 100000; j++) {
                    String key = TestUtil.generatorRandomString(10);
                    String value = "";
                    manager.put(key, value);
                }
                latch.countDown();
            });
        }
        latch.await();
        List<Integer> all = manager.getAllNode();
        int[] totalNums = new int[10];
        for (int i = 0; i < 10; i++) {
            totalNums[i] = all.get(i);
        }
﻿
        System.out.println(getStd(totalNums));
    }
﻿
    private static double getStd(int[] nums) {
        int total = 0;
        for (int val : nums) {
            total += val;
        }
        double average = total / 10;
        double variance = 0;
        for (int val : nums) {
            variance += (val - average) * (val - average);
        }
        return Math.sqrt(variance / 10);
    }
﻿
测试结果
虚拟节点/真实节点	   十次测试结果平均标准差
10	              28869.219296129588
50	              13168.856671704878
100	              9205.791157924474
150	              7662.509956242087
200	              6241.896504474808
300	              5593.145756056335
﻿
从测试结果可看出，真实节点对应的虚拟节点越多，标准差越小，但是到150之后，增加虚拟节点带来的效果也越来越小，节点查找时间也越来越长。老师所说的150应该是一个综合考虑之后的一个最优值。
﻿
发布于: 2020 年 11 月 22 日阅读数: 28
小兵

关注
还未添加个人签名 2018.10.07 加入
还未添加个人简介
发布
暂无评论
创作场景

第五周作业

小兵

评论