以下涉及到的源碼均為redis5.0-rc3版本的代碼【點選檢視官方源碼】
文章目錄
-
- 壓縮清單(ziplist)
- 資料結構
-
-
- 壓縮清單
- 壓縮清單節點
- prvlen
- encoding
-
- 壓縮清單存儲樣例
- API
-
-
- entry的定義
- 删除操作
- 插入操作
- 空間擴充
-
壓縮清單(ziplist)
- 壓縮清單是一個特殊編碼的清單,為節約記憶體而開發。
- 壓縮清單可以儲存字元串和整型值,其中整數存放時仍是整數,而不會變成字元。
- 壓縮清單允許push和pop操作在時間複雜度為O(1)的情況下完成。但是每個操作都可能需要重新配置設定ziplist使用的記憶體,是以實際的複雜性受ziplist使用記憶體量及其變動的影響。
- 壓縮清單在redis中是清單鍵和哈希鍵的底層實作之一。
- 由于壓縮清單本就為節約記憶體而開發,且其改動的操作都可能會重新配置設定entry的記憶體,導緻操作的時間複雜度增加,是以為了性能上不受影響,其适用于存儲小整數值和較短的字元串。
資料結構
壓縮清單
壓縮清單的資料結構是在ziplist.c檔案中注釋說明中這樣描述的:
<zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
其中各部分的含義如下:
-
zlbytes
值類型為uint32_t,占用4個位元組。是一個無符号整型值,儲存着整個壓縮清單占用的位元組數(包含自己占用的4個位元組)。此值的作用在于當對ziplist的記憶體進行重新配置設定的時候直接使用該值,而無需進行周遊整個ziplist;
-
zltail
值類型為uint32_t,占用4個位元組。清單最後一個節點的位址偏移量,這樣就可以使得進行pop操作的時候無需周遊整個ziplist;
-
zllen
值類型為uint16_t,占用2個位元組。記錄整個ziplist中節點的總數,當其大小大于216-2的時候,zllen=216-1,此時,便需要周遊整個ziplist才能确定節點的數量;
-
entry
ziplist的節點
-
zlend
值類型為uint8_t,占用1個位元組。值為255,表示ziplist結束的位置,當一個位元組被設定值為255時即表示後續不會再有節點;
下面用圖來舉例說明:
![](https://img.laitimes.com/img/__Qf2AjLwojIjJCLyojI0JCLiAzNvwVZ2x2bzNXak9CX90TQNNkRrFlQKBTSvwFbslmZvwFMwQzLcVmepNHdu9mZvwFVywUNMZTY18CX052bm9CX6FEVOhXVE9ENJpHW4Z0MMBjVtJWd0ckW65UbM5WOHJWa5kHT20ESjBjUIF2LcRHelR3LcJzLctmch1mclRXY39DNyMTNwkTNwITNwgDM4EDMy8CX0Vmbu4GZzNmLn9Gbi1yZtl2Lc9CX6MHc0RHaiojIsJye.jpg)
壓縮清單節點
壓縮清單節點的組成一般情況下如下所示:
<prevlen> <encoding> <entry-data>
然而,如果節點存儲的是小整型時,節點就不會有這個資料,因為真正的值可以用encoding的值表示,然後節點的組成又會如下所示:
<prevlen> <encoding>
由上可見,每個entry節點都會有字首資訊,且其字首都由兩方面資訊組成:prvlen、encoding。
prvlen
- 代表前一個節點占用的位元組長度,友善ziplist從後往前周遊。如p1代表壓縮清單位址,p2代表最後一個節點位址,p3代表倒數第二個節點位址,由此可見:
p2 = p1 + zltail
p3 = p2 - p2->prvlen
p(pre) = p3 - p3->prvlen
- 如果長度小于254位元組,prvlen則占用1個位元組。
<prevlen from 0 to 253> <encoding> <entry>
- 如果長度大于254位元組,prvlen則占用5個位元組長度,第一個位元組代值為254(FE),後面的四個位元組用于儲存前一個節點的長度。
0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>
encoding
- 表示目前節點的編碼,可以是integer或者string。
- 當其節點存儲的是一個string時,encoding的第一個位元組的前2個bit(00、01、10)代表了ecoding的類型,其他的bit則表示string的長度。
- 當encoding的第一個位元組的前2個bit為11時,代表entry節點存儲的是整數,且後續的2個bit則可以用來表示integer的值(即節點存儲小整數的時候)
下面直接貼身源碼中的例子,來舉例當類型不同時的encoding的樣子。
//---------------字元串類型---------------//
* |00pppppp| - 1 byte
* String value with length less than or equal to 63 bytes (6 bits).
* "pppppp" represents the unsigned 6 bit length.
* |01pppppp|qqqqqqqq| - 2 bytes
* String value with length less than or equal to 16383 bytes (14 bits).
* IMPORTANT: The 14 bit number is stored in big endian.
* |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
* String value with length greater than or equal to 16384 bytes.
* Only the 4 bytes following the first byte represents the length
* up to 32^2-1. The 6 lower bits of the first byte are not used and
* are set to zero.
* IMPORTANT: The 32 bit number is stored in big endian.
//-----------------整數類型------------//
* |11000000| - 3 bytes
* Integer encoded as int16_t (2 bytes).
* |11010000| - 5 bytes
* Integer encoded as int32_t (4 bytes).
* |11100000| - 9 bytes
* Integer encoded as int64_t (8 bytes).
* |11110000| - 4 bytes
* Integer encoded as 24 bit signed (3 bytes).
* |11111110| - 2 bytes
* Integer encoded as 8 bit signed (1 byte).
* |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
* Unsigned integer from 0 to 12. The encoded value is actually from
* 1 to 13 because 0000 and 1111 can not be used, so 1 should be
* subtracted from the encoded 4 bit value to obtain the right value.
* |11111111| - End of ziplist special entry.
壓縮清單存儲樣例
這裡用源碼中的兩個例子來具體表示壓縮清單的存儲情況,如下所示:
//-----------------整數類型------------//
* The following is a ziplist containing the two elements representing
* the strings "2" and "5". It is composed of 15 bytes, that we visually
* split into sections:
*
* [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
* | | | | | |
* zlbytes zltail entries "2" "5" end
*
* The first 4 bytes represent the number 15, that is the number of bytes
* the whole ziplist is composed of. The second 4 bytes are the offset
* at which the last ziplist entry is found, that is 12, in fact the
* last entry, that is "5", is at offset 12 inside the ziplist.
* The next 16 bit integer represents the number of elements inside the
* ziplist, its value is 2 since there are just two elements inside.
* Finally "00 f3" is the first entry representing the number 2. It is
* composed of the previous entry length, which is zero because this is
* our first entry, and the byte F3 which corresponds to the encoding
* |1111xxxx| with xxxx between 0001 and 1101. We need to remove the "F"
* higher order bits 1111, and subtract 1 from the "3", so the entry value
* is "2". The next entry has a prevlen of 02, since the first entry is
* composed of exactly two bytes. The entry itself, F6, is encoded exactly
* like the first entry, and 6-1 = 5, so the value of the entry is 5.
* Finally the special entry FF signals the end of the ziplist.
*
//-----------------字元串類型------------//
* Adding another element to the above string with the value "Hello World"
* allows us to show how the ziplist encodes small strings. We'll just show
* the hex dump of the entry itself. Imagine the bytes as following the
* entry that stores "5" in the ziplist above:
*
* [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
*
* The first byte, 02, is the length of the previous entry. The next
* byte represents the encoding in the pattern |00pppppp| that means
* that the entry is a string of length <pppppp>, so 0B means that
* an 11 bytes string follows. From the third byte (48) to the last (64)
* there are just the ASCII characters for "Hello World".
API
在ziplist.h頭檔案中給出了如下api:
//建立壓縮清單
unsigned char *ziplistNew(void);
//合并兩個壓縮清單,将second追加在first後面
unsigned char *ziplistMerge(unsigned char **first, unsigned char **second);
//新加節點
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where);
//傳回給定索引的節點
unsigned char *ziplistIndex(unsigned char *zl, int index);
//下一個節點
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p);
//前一個節點
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p);
//擷取節點中的值
unsigned int ziplistGet(unsigned char *p, unsigned char **sval, unsigned int *slen, long long *lval);
//插入節點
unsigned char *ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen);
//删除節點
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p);
//删除多個節點
unsigned char *ziplistDeleteRange(unsigned char *zl, int index, unsigned int num);
//比較節點值大小
unsigned int ziplistCompare(unsigned char *p, unsigned char *s, unsigned int slen);
//查找
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip);
//擷取節點個數
unsigned int ziplistLen(unsigned char *zl);
//壓縮清單暫用總位元組數
size_t ziplistBlobLen(unsigned char *zl);
//格式化列印壓縮清單資訊
void ziplistRepr(unsigned char *zl);
下面将隻對節點的定義,以及節點的插入和删除操作的源碼做介紹,其餘的若想要了解可去浏覽源碼。而為什麼要單獨隻介紹删除和插入兩個操作呢?因為其涉及到壓縮清單的一個影響性能的重要操作——空間擴充
entry的定義
typedef struct zlentry {
//prevrawlensize占用的位元組情況
unsigned int prevrawlensize;
//前一個節點的長度
unsigned int prevrawlen;
//節點的長度(string有1、2、5位元組情況,integer隻有單位元組的情況)
unsigned int lensize;
//節點占用位元組數
unsigned int len;
//prevrawlensize + lensize
unsigned int headersize;
//編碼類型:ZIP_STR_* or ZIP_INT_*
unsigned char encoding;
//指向節點起始的指針
unsigned char *p;
} zlentry;
删除操作
//删除元素為num的節點
unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
unsigned int i, totlen, deleted = 0;
size_t offset;
int nextdiff = 0;
zlentry first, tail;
zipEntry(p, &first);
for (i = 0; p[0] != ZIP_END && i < num; i++) {
p += zipRawEntryLength(p);
deleted++;
}
//計算需要删除的空間
totlen = p-first.p;
if (totlen > 0) {
if (p[0] != ZIP_END) {
nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
p -= nextdiff;
zipStorePrevEntryLength(p,first.prevrawlen);
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);
zipEntry(p, &tail);
if (p[tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
//移動後面的節點
memmove(first.p,p,
intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
} else {
//尾節點删除,不需要進行空間擴充
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe((first.p-zl)-first.prevrawlen);
}
//重新計算尾節點的偏移量和計算節點總位元組數
offset = first.p-zl;
zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
ZIPLIST_INCR_LENGTH(zl,-deleted);
p = zl+offset;
if (nextdiff != 0)
//進行空間擴充更新
zl = __ziplistCascadeUpdate(zl,p);
}
return zl;
}
插入操作
//插入節點p
unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
unsigned int prevlensize, prevlen = 0;
size_t offset;
int nextdiff = 0;
unsigned char encoding = 0;
long long value = 123456789;
zlentry tail;
//查找要插入節點的前一個節點
if (p[0] != ZIP_END) {
ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
} else {
unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
if (ptail[0] != ZIP_END) {
prevlen = zipRawEntryLength(ptail);
}
}
//判斷entry是否要encoding
if (zipTryEncoding(s,slen,&value,&encoding)) {
//設定int類型
reqlen = zipIntSize(encoding);
} else {
reqlen = slen;
}
reqlen += zipStorePrevEntryLength(NULL,prevlen);
reqlen += zipStoreEntryEncoding(NULL,encoding,slen);
//當插入的節點不是在最後的時候,我們需要確定下一個節點有足夠的記憶體來儲存目前節點的長度
int forcelarge = 0;
nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
if (nextdiff == -4 && reqlen < 4) {
nextdiff = 0;
forcelarge = 1;
}
/* Store offset because a realloc may change the address of zl. */
offset = p-zl;
zl = ziplistResize(zl,curlen+reqlen+nextdiff);
p = zl+offset;
//當不是插入到最後的時候
if (p[0] != ZIP_END) {
/* Subtract one because of the ZIP_END bytes */
memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);
/* Encode this entry's raw length in the next entry. */
if (forcelarge)
zipStorePrevEntryLengthLarge(p+reqlen,reqlen);
else
zipStorePrevEntryLength(p+reqlen,reqlen);
/* Update offset for tail */
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);
//當要插入位置後面有多個節點時
zipEntry(p+reqlen, &tail);
if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
} else {
//當是插入到最後的時候
ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
}
//判斷是否進行空間擴充
if (nextdiff != 0) {
offset = p-zl;
zl = __ziplistCascadeUpdate(zl,p+reqlen);
p = zl+offset;
}
//寫入節點
p += zipStorePrevEntryLength(p,prevlen);
p += zipStoreEntryEncoding(p,encoding,slen);
if (ZIP_IS_STR(encoding)) {
memcpy(p,s,slen);
} else {
zipSaveInteger(p,value,encoding);
}
ZIPLIST_INCR_LENGTH(zl,1);
return zl;
}
空間擴充
/* When an entry is inserted, we need to set the prevlen field of the next
* entry to equal the length of the inserted entry. It can occur that this
* length cannot be encoded in 1 byte and the next entry needs to be grow
* a bit larger to hold the 5-byte encoded prevlen. This can be done for free,
* because this only happens when an entry is already being inserted (which
* causes a realloc and memmove). However, encoding the prevlen may require
* that this entry is grown as well. This effect may cascade throughout
* the ziplist when there are consecutive entries with a size close to
* ZIP_BIG_PREVLEN, so we need to check that the prevlen can be encoded in
* every consecutive entry.
*
* Note that this effect can also happen in reverse, where the bytes required
* to encode the prevlen field can shrink. This effect is deliberately ignored,
* because it can cause a "flapping" effect where a chain prevlen fields is
* first grown and then shrunk again after consecutive inserts. Rather, the
* field is allowed to stay larger than necessary, because a large prevlen
* field implies the ziplist is holding large entries anyway.
*
* The pointer "p" points to the first entry that does NOT need to be
* updated, i.e. consecutive fields MAY need an update. */
unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
size_t offset, noffset, extra;
unsigned char *np;
zlentry cur, next;
while (p[0] != ZIP_END) {
zipEntry(p, &cur);
rawlen = cur.headersize + cur.len;
rawlensize = zipStorePrevEntryLength(NULL,rawlen);
//當以及時最後一個節點的時候,結束操作
if (p[rawlen] == ZIP_END) break;
zipEntry(p+rawlen, &next);
//當prvlen記憶體不需要改變的時候,結束操作
if (next.prevrawlen == rawlen) break;
if (next.prevrawlensize < rawlensize) {
offset = p-zl;
extra = rawlensize-next.prevrawlensize;
zl = ziplistResize(zl,curlen+extra);
p = zl+offset;
np = p+rawlen;
noffset = np-zl;
if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
}
memmove(np+rawlensize,
np+next.prevrawlensize,
curlen-noffset-next.prevrawlensize-1);
zipStorePrevEntryLength(np,rawlen);
p += rawlen;
curlen += extra;
} else {
if (next.prevrawlensize > rawlensize) {
zipStorePrevEntryLengthLarge(p+rawlen,rawlen);
} else {
zipStorePrevEntryLength(p+rawlen,rawlen);
}
break;
}
}
return zl;
}