Consistent Hashing: Algorithm and Related Techniques

2021-11-08 03:53:06

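When dealing with big data, scaling up stops being viable and the system has to be extended by scaling out.

At that point data sharding becomes unavoidable: how do we map records to physical nodes?

1. Load balance: avoid hotspots.

2. When the set of nodes changes (a node fails, is newly added, or leaves), the sharding result should not be affected.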

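Approaches in use:

1. Master-based, as in Google's Bigtable: a master coordinates everything, avoids hotspots, and makes the corresponding adjustments when a node fails or joins.

All information about how the data is distributed lives on the master, so the obvious problem is that the master is a single point of failure.

2. Content-based partitioning, for example by time or location. The problem is that it cannot solve hotspots.

3. Hash-based partitioning (partition = key mod (total_vns)). This handles hotspots fairly effectively, but it cannot cope with node changes.

When the set of nodes changes, total_vns changes with it, which invalidates almost all of the previous partition assignments.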

4. Consistent hashing: a more ideal solution, and one with a decentralized design.
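To make the weakness of approach 3 concrete, here is a minimal sketch that counts how many keys change partition under plain modulo hashing when a cluster grows from 4 to 5 nodes. The helper names (`stable_hash`, `mod_partition`, `moved_fraction`) and the choice of MD5 as the hash are assumptions made for illustration; they do not come from the original post.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() of str is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

def mod_partition(key: str, total_nodes: int) -> int:
    """Approach 3: partition = hash(key) mod total_nodes."""
    return stable_hash(key) % total_nodes

def moved_fraction(keys, old_total: int, new_total: int) -> float:
    """Fraction of keys whose partition differs between the two cluster sizes."""
    moved = sum(
        1 for k in keys
        if mod_partition(k, old_total) != mod_partition(k, new_total)
    )
    return moved / len(keys)

keys = [f"record-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.1%} of keys change partition when going from 4 to 5 nodes")
```

A key keeps its partition only when its hash gives the same remainder mod 4 and mod 5, which holds for 4 out of every 20 hash values, so with a uniform hash roughly 80% of the keys should move.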

The idea of consistent hashing was introduced by David Karger et al. in 1997 (cf. [kll+97]) in the context of a paper about “a family of caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots in the networks”.

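From (a) to (b): the transition from ordinary hashing to consistent hashing.

(c): when nodes are added or removed, consistent hashing copes easily.

(d): for load balance, the concept of virtual nodes is introduced, and each physical node can be given a different number of virtual nodes according to its capacity.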
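The ring with virtual nodes described above can be sketched as follows. This is a minimal illustration, not any particular system's implementation: the class name `HashRing`, its methods, the `node#vnN` naming of virtual nodes, and the use of MD5 are all assumptions.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

class HashRing:
    def __init__(self):
        self._points = []   # sorted ring positions of all virtual nodes
        self._owner = {}    # ring position -> physical node name

    def add_node(self, node: str, vnodes: int = 100) -> None:
        """Place `vnodes` virtual nodes for `node`; more vnodes means a larger share."""
        for i in range(vnodes):
            point = stable_hash(f"{node}#vn{i}")
            if point in self._owner:
                continue  # position already taken (extremely unlikely with 64 bits)
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        """Drop every virtual node that belongs to `node`."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first virtual node clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]

ring = HashRing()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"record-{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}

# A more capable machine is given more virtual nodes.
ring.add_node("node-d", vnodes=200)
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.1%} of keys moved, all of them to node-d")
```

Every key that moves lands on the new node, because adding virtual nodes can only take over arcs of the ring from their previous owners. Keys never shuffle between the existing nodes, which is the property that plain modulo hashing lacks.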

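Does the client need to store the consistent hash ring information, and if so, how is it kept up to date?

If one extra hop of latency is tolerable, the client does not need to store any ring information: it sends the request to any nearby server, and that server acts as the coordinator.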
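The coordinator pattern can be sketched like this: the client knows nothing about the ring and picks an arbitrary server, which either serves the request itself or forwards it to the owner at the cost of one hop. The `Server` class, the shared `cluster` dict standing in for the network, and the single ring point per server are simplifying assumptions for illustration.

```python
import bisect
import hashlib
import random

def stable_hash(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

class Server:
    def __init__(self, name: str, cluster: dict):
        self.name = name
        self.cluster = cluster  # name -> Server; stands in for the network
        self.store = {}

    def _owner(self, key: str) -> str:
        # Every server knows the full ring (one point per server for brevity).
        ring = sorted((stable_hash(n), n) for n in self.cluster)
        points = [p for p, _ in ring]
        idx = bisect.bisect_right(points, stable_hash(key)) % len(ring)
        return ring[idx][1]

    def put(self, key: str, value) -> int:
        """Store the value and return the number of extra hops that were needed."""
        owner = self._owner(key)
        if owner == self.name:
            self.store[key] = value
            return 0
        return 1 + self.cluster[owner].put(key, value)  # forward to the owner

    def get(self, key: str):
        owner = self._owner(key)
        if owner == self.name:
            return self.store.get(key)
        return self.cluster[owner].get(key)

cluster = {}
for name in ("server-1", "server-2", "server-3"):
    cluster[name] = Server(name, cluster)

# The client holds no ring information: it just picks some server.
rng = random.Random(0)
hops = []
for i in range(200):
    entry = rng.choice(list(cluster.values()))
    hops.append(entry.put(f"user:{i}", i))

print("extra hops per request:", sorted(set(hops)))
```

The trade-off is the one stated above: the client stays simple and never has to refresh ring state, while a request that lands on a non-owner pays one forwarding hop.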

To provide high reliability from individually unreliable resources, we need to replicate the data partitions.

In a partitioned database where nodes may join and leave the system at any time without impacting its operation, all nodes have to communicate with each other, especially when membership changes.

Consistent hashing handles dynamic node changes effectively, but it only confines the impact to a small range: when nodes change, the adjacent nodes still have to make some adjustments and transfer data.
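The text does not say how replicas are placed. One common scheme, assumed here purely for illustration, is to store each key on the first N distinct physical nodes found walking clockwise from the key's position on the ring. The function names `build_ring` and `replica_nodes` are hypothetical.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

def build_ring(nodes, vnodes: int = 50):
    """Sorted list of (position, physical node) pairs, `vnodes` points per node."""
    return sorted((stable_hash(f"{n}#vn{i}"), n) for n in nodes for i in range(vnodes))

def replica_nodes(ring, key: str, n_replicas: int = 3):
    """First `n_replicas` distinct physical nodes clockwise from the key."""
    points = [p for p, _ in ring]
    start = bisect.bisect_right(points, stable_hash(key))
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in chosen:  # skip further virtual nodes of a node already chosen
            chosen.append(node)
            if len(chosen) == n_replicas:
                break
    return chosen

ring = build_ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
replicas = replica_nodes(ring, "user:42")
print("user:42 is stored on", replicas)
```

Skipping virtual nodes that map to an already chosen machine matters: without that check, two "replicas" could end up on the same physical node and fail together.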

When a new node joins the system, the following actions have to happen (cf. [ho09a]):

1. The newly arriving node announces its presence and its identifier to adjacent nodes or to all nodes via broadcast.

2. The neighbors of the joining node react by adjusting their object and replica ownerships.

3. The joining node copies the datasets it is now responsible for from its neighbors. This can be done in bulk and also asynchronously.

4. If, in step 1, the membership change has not been broadcast to all nodes, the joining node now announces its arrival.
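The join steps above can be sketched in a single process as follows. This is a simplified model under stated assumptions: `stores` (a dict of node name to that node's data) stands in for the cluster, membership is shared instantly rather than announced over a network, each node has one ring point, and replicas are ignored. The `owner` and `join` helpers are hypothetical names.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

def owner(members, key: str) -> str:
    """First member clockwise from the key (one ring point per node for brevity)."""
    ring = sorted((stable_hash(m), m) for m in members)
    points = [p for p, _ in ring]
    return ring[bisect.bisect_right(points, stable_hash(key)) % len(ring)][1]

def join(stores: dict, new_node: str) -> int:
    """Add `new_node` and hand over the keys it now owns; return how many moved."""
    # Step 1: the new node announces itself (here: it simply appears in the membership).
    stores[new_node] = {}
    members = list(stores)
    # Steps 2 and 3: existing nodes re-evaluate ownership, and the joining node
    # takes over the keys it is now responsible for, in bulk.
    for node, data in stores.items():
        if node == new_node:
            continue
        handoff = [k for k in data if owner(members, k) == new_node]
        for k in handoff:
            stores[new_node][k] = data.pop(k)  # copy to the joiner, drop the old copy
    return len(stores[new_node])

stores = {"node-a": {}, "node-b": {}, "node-c": {}}
for i in range(1000):
    key = f"record-{i}"
    stores[owner(list(stores), key)][key] = i

transferred = join(stores, "node-d")
print(f"node-d took over {transferred} of 1000 keys")
```

With one ring point per node, only the new node's clockwise successor has anything to hand over; every other node's data stays where it is.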

When a node leaves the system, the following actions have to occur (cf. [ho09a]):

1. Nodes within the system need to detect whether a node has left, as it might have crashed and not been able to notify the other nodes of its departure.

2. If a node’s departure has been detected, the neighbors of the node have to react by exchanging data with each other and adjusting their object and replica ownerships.
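The text does not specify a detection mechanism. The sketch below assumes a simple heartbeat timeout, then recomputes ownership to show which keys the neighbors have to pick up. `detect_departed`, the timestamps, and the 5-second timeout are illustrative assumptions, and the actual data exchange between neighbors is left out.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

def owner(members, key: str) -> str:
    """First member clockwise from the key (one ring point per node for brevity)."""
    ring = sorted((stable_hash(m), m) for m in members)
    points = [p for p, _ in ring]
    return ring[bisect.bisect_right(points, stable_hash(key)) % len(ring)][1]

def detect_departed(last_heartbeat: dict, now: float, timeout: float = 5.0):
    """Step 1: nodes whose last heartbeat is older than `timeout` count as departed."""
    return sorted(n for n, seen in last_heartbeat.items() if now - seen > timeout)

# Seconds at which each node was last heard from (made-up values).
last_heartbeat = {"node-a": 100.0, "node-b": 92.0, "node-c": 99.5}
departed = detect_departed(last_heartbeat, now=101.0)

members_before = list(last_heartbeat)
members_after = [m for m in members_before if m not in departed]

# Step 2: ownership is adjusted; only the departed node's keys get a new owner.
keys = [f"record-{i}" for i in range(2000)]
changed = [k for k in keys if owner(members_before, k) != owner(members_after, k)]
print(f"departed: {departed}; {len(changed)} keys need a new owner")
```

Only keys that belonged to the departed node are reassigned. If replicas are kept on neighboring nodes, those neighbors may already hold the affected data and mainly need to restore the replica count.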

This article is excerpted from cnblogs (部落格園). Original publication date: 2013-04-13
