Architecture Training Camp: Week 5 Assignment

Published: July 8, 2020

  • Implement the consistent hashing algorithm in a programming language you are familiar with.

  • Write a test case for the algorithm: with 1,000,000 KV entries and 10 server nodes, compute the standard deviation of the number of entries that land on each server, to evaluate how unevenly the algorithm distributes storage load.


Hash Algorithm Design

First, define the hash algorithm interface:

public interface HashAlgrth {
    int getHashCode(String key);
}

After consulting online resources, I chose the FNV-1 32-bit algorithm to generate a 32-bit hash code:

public class FNVHashAlgrth implements HashAlgrth {
    private final static int PRIME = 16777619;
    private final static long OFFSET_BASIC = 2166136261L;

    @Override
    public int getHashCode(String data) {
        // FNV-1: start from the offset basis, then multiply and xor per character
        int hashCode = (int) OFFSET_BASIC;
        for (int i = 0; i < data.length(); i++) {
            hashCode = (hashCode ^ data.charAt(i)) * PRIME;
        }
        // Extra avalanche mixing to spread the bits further
        hashCode += hashCode << 13;
        hashCode ^= hashCode >> 7;
        hashCode += hashCode << 3;
        hashCode ^= hashCode >> 17;
        hashCode += hashCode << 5;
        if (hashCode == Integer.MIN_VALUE) {
            // Math.abs(Integer.MIN_VALUE) is still negative, so pin this case to 0
            hashCode = 0;
        } else if (hashCode < 0) {
            hashCode = Math.abs(hashCode);
        }
        return hashCode;
    }
}
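As a quick sanity check, the hash should be deterministic and non-negative for any input. A standalone sketch of the same FNV-1 logic (a variant that uses unsigned shifts and a sign-bit mask instead of Math.abs, so its values differ slightly from the class above):

```java
// Standalone sketch of a 32-bit FNV-1 hash with extra avalanche mixing.
// The sign bit is masked off so results always land in [0, 2^31 - 1].
public class FnvSanityCheck {
    static final int PRIME = 16777619;
    static final long OFFSET_BASIS = 2166136261L;

    static int fnvHash(String data) {
        int h = (int) OFFSET_BASIS;
        for (int i = 0; i < data.length(); i++) {
            h = (h ^ data.charAt(i)) * PRIME;
        }
        // Avalanche mixing, analogous to the implementation above
        h += h << 13;
        h ^= h >>> 7;
        h += h << 3;
        h ^= h >>> 17;
        h += h << 5;
        // Clear the sign bit so the value maps cleanly onto the integer ring
        return h & 0x7FFFFFFF;
    }

    public static void main(String[] args) {
        System.out.println(fnvHash("192.168.0.1") == fnvHash("192.168.0.1")); // same input, same hash
        System.out.println(fnvHash("192.168.0.1") >= 0);                      // always non-negative
    }
}
```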

Next, define a Node class. Since we need to track how many objects land on each node, it carries a countNum field for the object count:

public class Node {
    private String IpAdress;
    // Tracks how many objects currently land on this server
    private int countNum = 0;

    public String getIpAdress() {
        return IpAdress;
    }

    public void setIpAdress(String ipAdress) {
        IpAdress = ipAdress;
    }

    public int getCountNum() {
        return countNum;
    }

    public void setCountNum(int countNum) {
        this.countNum = countNum;
    }

    public void addOne() {
        this.countNum += 1;
    }

    // Virtual IPs take the form "XXX.XXX.XXX.XXX{INT}"
    public String createVirtualIP(int replicaNumber) {
        return IpAdress + Integer.toString(replicaNumber);
    }

    public Node(String ipAdress) {
        IpAdress = ipAdress;
    }
}

The implementation follows the definition of consistent hashing. The key design question is which data structure to use for the hash ring. I first considered a circular linked list, but when the entries are sparse, lookups are inefficient: O(N) per query. Following online references, I switched to a sorted tree instead. Java's sorted-map implementation, TreeMap, is a red-black tree, giving O(log N) lookups.

import java.util.Collection;
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHash {
    // Number of virtual nodes per physical node
    private int replicasCount;
    private HashAlgrth hashAlgrth = new FNVHashAlgrth();
    private SortedMap<Integer, Node> hashCircle = new TreeMap<>();

    public ConsistentHash(int replicasCount, Collection<Node> nodes) {
        this.replicasCount = replicasCount;
        nodes.forEach(n -> addNode(n));
    }

    public void addNewNode(Node node) {
        if (node == null)
            throw new IllegalArgumentException("Node must not be null");
        if (hashCircle.containsKey(hashAlgrth.getHashCode(node.createVirtualIP(0))))
            throw new IllegalArgumentException("Node already exists!");
        addNode(node);
    }

    private void addNode(Node node) {
        for (int i = 0; i < replicasCount; i++) {
            hashCircle.put(hashAlgrth.getHashCode(node.createVirtualIP(i)), node);
        }
    }

    public Node getNode(String keyData) {
        if (hashCircle.isEmpty())
            return null;
        int hashCode = hashAlgrth.getHashCode(keyData);
        if (!hashCircle.containsKey(hashCode)) {
            // Walk clockwise: take the first ring position at or after the hash,
            // wrapping around to the smallest position if none exists
            SortedMap<Integer, Node> tailMap = hashCircle.tailMap(hashCode);
            hashCode = tailMap.isEmpty() ? hashCircle.firstKey() : tailMap.firstKey();
        }
        return hashCircle.get(hashCode);
    }
}
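The ring lookup can be exercised in isolation. A minimal sketch of the tailMap trick with plain strings standing in for nodes (it uses TreeMap's ceilingEntry, which is an equivalent, slightly more direct alternative to tailMap):

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch of the hash-ring lookup: positions are Integer keys in a
// TreeMap, and a data key is served by the first node clockwise from its hash.
public class RingLookupSketch {
    public static String lookup(TreeMap<Integer, String> ring, int hash) {
        // First node at or after the hash position...
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash);
        // ...or wrap around to the smallest position on the ring
        return e != null ? e.getValue() : ring.firstEntry().getValue();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> ring = new TreeMap<>();
        ring.put(100, "nodeA");
        ring.put(500, "nodeB");
        ring.put(900, "nodeC");
        System.out.println(lookup(ring, 250)); // -> nodeB (first node clockwise)
        System.out.println(lookup(ring, 950)); // -> nodeA (wraps around the ring)
    }
}
```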

Putting the whole assignment together: 10 machines with IP addresses 192.168.0.1 through 192.168.0.10, query keys generated with Java's built-in UUID, and the standard deviation computed at the end:

import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ConsistentHashDemo {
    private static final String[] nodesIP = new String[] {
        "192.168.0.1",
        "192.168.0.2",
        "192.168.0.3",
        "192.168.0.4",
        "192.168.0.5",
        "192.168.0.6",
        "192.168.0.7",
        "192.168.0.8",
        "192.168.0.9",
        "192.168.0.10"
    };
    private static final long requestCount = 1000000L;
    private static final int replicasCount = 50;

    public static void main(String[] args) {
        List<Node> nodes = Stream.of(nodesIP).map(s -> new Node(s)).collect(Collectors.toList());
        ConsistentHash consistentHash = new ConsistentHash(replicasCount, nodes);
        for (long i = 0; i < requestCount; i++) {
            consistentHash.getNode(UUID.randomUUID().toString()).addOne();
        }
        long average = requestCount / nodesIP.length;
        double sum = 0;
        for (Node node : nodes) {
            System.out.println(node.getIpAdress() + ": " + node.getCountNum());
            sum += (average - node.getCountNum()) * (average - node.getCountNum());
        }
        // Population standard deviation of the per-node counts
        System.out.println(Math.sqrt(sum / nodes.size()));
    }
}

With 1,000,000 KV entries, the standard deviation was about 15,000 with 50 virtual nodes, about 7,000 with 100, about 5,600 with 150, and about 4,600 with 200.
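That trend can be reproduced without the full classes above. A self-contained sketch that sweeps the replica count and reports the standard deviation; it uses deterministic keys and String.hashCode for brevity instead of the FNV implementation, so the absolute numbers will differ from those reported:

```java
import java.util.TreeMap;

// Self-contained sweep: distribute deterministic keys over 10 nodes on a
// TreeMap ring and compute the standard deviation for several replica counts.
public class ReplicaSweep {
    static int hash(String s) {
        return s.hashCode() & 0x7FFFFFFF; // non-negative ring position
    }

    static double stdDev(int nodes, int replicas, int keys) {
        TreeMap<Integer, Integer> ring = new TreeMap<>();
        int[] counts = new int[nodes];
        // Place replicas * nodes virtual positions on the ring
        for (int n = 0; n < nodes; n++) {
            for (int r = 0; r < replicas; r++) {
                ring.put(hash("192.168.0." + (n + 1) + "#" + r), n);
            }
        }
        // Route each key to the first virtual position clockwise from its hash
        for (int k = 0; k < keys; k++) {
            Integer pos = ring.ceilingKey(hash("key-" + k));
            counts[ring.get(pos != null ? pos : ring.firstKey())]++;
        }
        double mean = (double) keys / nodes, sum = 0;
        for (int c : counts) sum += (c - mean) * (c - mean);
        return Math.sqrt(sum / nodes);
    }

    public static void main(String[] args) {
        for (int replicas : new int[] {50, 100, 150, 200}) {
            System.out.printf("%d replicas: stddev %.0f%n",
                    replicas, stdDev(10, replicas, 100000));
        }
    }
}
```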

The code is available at: https://github.com/Albertsirius/consistenthashingdemo

子豪sirius
