Redis源碼剖析之壓縮清單

以下涉及到的源碼均為redis5.0-rc3版本的代碼【點選檢視官方源碼】

文章目錄

- 壓縮清單（ziplist）
- 資料結構
- - - 壓縮清單
    - 壓縮清單節點
    - prvlen
    - encoding
- 壓縮清單存儲樣例
- API
- - - entry的定義
    - 删除操作
    - 插入操作
    - 空間擴充

壓縮清單（ziplist）

壓縮清單是一個特殊編碼的清單，為節約記憶體而開發。
壓縮清單可以儲存字元串和整型值，其中整數存放時仍是整數，而不會變成字元。
壓縮清單允許push和pop操作在時間複雜度為O(1)的情況下完成。但是每個操作都可能需要重新配置設定ziplist使用的記憶體，是以實際的複雜性受ziplist使用記憶體量及其變動的影響。
壓縮清單在redis中是清單鍵和哈希鍵的底層實作之一。
由于壓縮清單本就為節約記憶體而開發，且其改動的操作都可能會重新配置設定entry的記憶體，導緻操作的時間複雜度增加，是以為了性能上不受影響，其适用于存儲小整數值和較短的字元串。

資料結構

壓縮清單

壓縮清單的資料結構是在ziplist.c檔案中注釋說明中這樣描述的：

<zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>

其中各部分的含義如下：

zlbytes

值類型為uint32_t，占用4個位元組。是一個無符号整型值，儲存着整個壓縮清單占用的位元組數（包含自己占用的4個位元組）。此值的作用在于當對ziplist的記憶體進行重新配置設定的時候直接使用該值，而無需進行周遊整個ziplist；
zltail

值類型為uint32_t，占用4個位元組。清單最後一個節點的位址偏移量，這樣就可以使得進行pop操作的時候無需周遊整個ziplist；
zllen

值類型為uint16_t，占用2個位元組。記錄整個ziplist中節點的總數，當其大小大于216-2的時候，zllen=216-1，此時，便需要周遊整個ziplist才能确定節點的數量；
entry

ziplist的節點
zlend

值類型為uint8_t，占用1個位元組。值為255，表示ziplist結束的位置，當一個位元組被設定值為255時即表示後續不會再有節點；

下面用圖來舉例說明：

Redis源碼剖析之壓縮清單

壓縮清單節點

壓縮清單節點的組成一般情況下如下所示：

<prevlen> <encoding> <entry-data>

然而，如果節點存儲的是小整型時，節點就不會有這個資料，因為真正的值可以用encoding的值表示，然後節點的組成又會如下所示：

<prevlen> <encoding>

由上可見，每個entry節點都會有字首資訊，且其字首都由兩方面資訊組成：prvlen、encoding。

prvlen

代表前一個節點占用的位元組長度，友善ziplist從後往前周遊。如p1代表壓縮清單位址，p2代表最後一個節點位址，p3代表倒數第二個節點位址，由此可見：

p2 = p1 + zltail
    p3 = p2 - p2->prvlen
    p(pre) = p3 - p3->prvlen

如果長度小于254位元組，prvlen則占用1個位元組。

<prevlen from 0 to 253> <encoding> <entry>

如果長度大于254位元組，prvlen則占用5個位元組長度，第一個位元組代值為254（FE），後面的四個位元組用于儲存前一個節點的長度。

0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>

encoding

表示目前節點的編碼，可以是integer或者string。
當其節點存儲的是一個string時，encoding的第一個位元組的前2個bit（00、01、10）代表了ecoding的類型，其他的bit則表示string的長度。
當encoding的第一個位元組的前2個bit為11時，代表entry節點存儲的是整數，且後續的2個bit則可以用來表示integer的值（即節點存儲小整數的時候）

下面直接貼身源碼中的例子，來舉例當類型不同時的encoding的樣子。

//---------------字元串類型---------------//
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 *      "pppppp" represents the unsigned 6 bit length.
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
 *      IMPORTANT: The 14 bit number is stored in big endian.
 * |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
 *      Only the 4 bytes following the first byte represents the length
 *      up to 32^2-1. The 6 lower bits of the first byte are not used and
 *      are set to zero.
 *      IMPORTANT: The 32 bit number is stored in big endian.
//-----------------整數類型------------//
 * |11000000| - 3 bytes
 *      Integer encoded as int16_t (2 bytes).
 * |11010000| - 5 bytes
 *      Integer encoded as int32_t (4 bytes).
 * |11100000| - 9 bytes
 *      Integer encoded as int64_t (8 bytes).
 * |11110000| - 4 bytes
 *      Integer encoded as 24 bit signed (3 bytes).
 * |11111110| - 2 bytes
 *      Integer encoded as 8 bit signed (1 byte).
 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 * |11111111| - End of ziplist special entry.

壓縮清單存儲樣例

這裡用源碼中的兩個例子來具體表示壓縮清單的存儲情況，如下所示：

//-----------------整數類型------------//
 * The following is a ziplist containing the two elements representing
 * the strings "2" and "5". It is composed of 15 bytes, that we visually
 * split into sections:
 *
 *  [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
 *        |             |          |       |       |     |
 *     zlbytes        zltail    entries   "2"     "5"   end
 *
 * The first 4 bytes represent the number 15, that is the number of bytes
 * the whole ziplist is composed of. The second 4 bytes are the offset
 * at which the last ziplist entry is found, that is 12, in fact the
 * last entry, that is "5", is at offset 12 inside the ziplist.
 * The next 16 bit integer represents the number of elements inside the
 * ziplist, its value is 2 since there are just two elements inside.
 * Finally "00 f3" is the first entry representing the number 2. It is
 * composed of the previous entry length, which is zero because this is
 * our first entry, and the byte F3 which corresponds to the encoding
 * |1111xxxx| with xxxx between 0001 and 1101. We need to remove the "F"
 * higher order bits 1111, and subtract 1 from the "3", so the entry value
 * is "2". The next entry has a prevlen of 02, since the first entry is
 * composed of exactly two bytes. The entry itself, F6, is encoded exactly
 * like the first entry, and 6-1 = 5, so the value of the entry is 5.
 * Finally the special entry FF signals the end of the ziplist.
 *
//-----------------字元串類型------------//
 * Adding another element to the above string with the value "Hello World"
 * allows us to show how the ziplist encodes small strings. We'll just show
 * the hex dump of the entry itself. Imagine the bytes as following the
 * entry that stores "5" in the ziplist above:
 *
 * [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
 *
 * The first byte, 02, is the length of the previous entry. The next
 * byte represents the encoding in the pattern |00pppppp| that means
 * that the entry is a string of length <pppppp>, so 0B means that
 * an 11 bytes string follows. From the third byte (48) to the last (64)
 * there are just the ASCII characters for "Hello World".

API

在ziplist.h頭檔案中給出了如下api：

//建立壓縮清單
unsigned char *ziplistNew(void);
//合并兩個壓縮清單，将second追加在first後面
unsigned char *ziplistMerge(unsigned char **first, unsigned char **second);
//新加節點
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where);
//傳回給定索引的節點
unsigned char *ziplistIndex(unsigned char *zl, int index);
//下一個節點
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p);
//前一個節點
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p);
//擷取節點中的值
unsigned int ziplistGet(unsigned char *p, unsigned char **sval, unsigned int *slen, long long *lval);
//插入節點
unsigned char *ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen);
//删除節點
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p);
//删除多個節點
unsigned char *ziplistDeleteRange(unsigned char *zl, int index, unsigned int num);
//比較節點值大小
unsigned int ziplistCompare(unsigned char *p, unsigned char *s, unsigned int slen);
//查找
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip);
//擷取節點個數
unsigned int ziplistLen(unsigned char *zl);
//壓縮清單暫用總位元組數
size_t ziplistBlobLen(unsigned char *zl);
//格式化列印壓縮清單資訊
void ziplistRepr(unsigned char *zl);

下面将隻對節點的定義，以及節點的插入和删除操作的源碼做介紹，其餘的若想要了解可去浏覽源碼。而為什麼要單獨隻介紹删除和插入兩個操作呢？因為其涉及到壓縮清單的一個影響性能的重要操作——空間擴充

entry的定義

typedef struct zlentry {
    //prevrawlensize占用的位元組情況
    unsigned int prevrawlensize;
    //前一個節點的長度
    unsigned int prevrawlen; 
    //節點的長度（string有1、2、5位元組情況，integer隻有單位元組的情況）
    unsigned int lensize; 
    //節點占用位元組數
    unsigned int len; 
    //prevrawlensize + lensize
    unsigned int headersize;
    //編碼類型：ZIP_STR_* or ZIP_INT_*
    unsigned char encoding;      
    //指向節點起始的指針
    unsigned char *p;
} zlentry;

删除操作

//删除元素為num的節點
unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
    unsigned int i, totlen, deleted = 0;
    size_t offset;
    int nextdiff = 0;
    zlentry first, tail;

    zipEntry(p, &first);
    for (i = 0; p[0] != ZIP_END && i < num; i++) {
        p += zipRawEntryLength(p);
        deleted++;
    }

    //計算需要删除的空間
    totlen = p-first.p; 
    if (totlen > 0) {
        if (p[0] != ZIP_END) {
            nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);

            p -= nextdiff;
            zipStorePrevEntryLength(p,first.prevrawlen);

            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);

            zipEntry(p, &tail);
            if (p[tail.headersize+tail.len] != ZIP_END) {
                ZIPLIST_TAIL_OFFSET(zl) =
                   intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
            }
            //移動後面的節點
            memmove(first.p,p,
                intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
        } else {
            //尾節點删除，不需要進行空間擴充
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe((first.p-zl)-first.prevrawlen);
        }

        //重新計算尾節點的偏移量和計算節點總位元組數
        offset = first.p-zl;
        zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
        ZIPLIST_INCR_LENGTH(zl,-deleted);
        p = zl+offset;

        if (nextdiff != 0)
            //進行空間擴充更新
            zl = __ziplistCascadeUpdate(zl,p);
    }
    return zl;
}

插入操作

//插入節點p
unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
    long long value = 123456789;
    zlentry tail;

    //查找要插入節點的前一個節點
    if (p[0] != ZIP_END) {
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
    } else {
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
        if (ptail[0] != ZIP_END) {
            prevlen = zipRawEntryLength(ptail);
        }
    }

    //判斷entry是否要encoding
    if (zipTryEncoding(s,slen,&value,&encoding)) {
        //設定int類型
        reqlen = zipIntSize(encoding);
    } else {
        reqlen = slen;
    }
    
    reqlen += zipStorePrevEntryLength(NULL,prevlen);
    reqlen += zipStoreEntryEncoding(NULL,encoding,slen);

    //當插入的節點不是在最後的時候，我們需要確定下一個節點有足夠的記憶體來儲存目前節點的長度
    int forcelarge = 0;
    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
    if (nextdiff == -4 && reqlen < 4) {
        nextdiff = 0;
        forcelarge = 1;
    }

    /* Store offset because a realloc may change the address of zl. */
    offset = p-zl;
    zl = ziplistResize(zl,curlen+reqlen+nextdiff);
    p = zl+offset;

    //當不是插入到最後的時候
    if (p[0] != ZIP_END) {
        /* Subtract one because of the ZIP_END bytes */
        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

        /* Encode this entry's raw length in the next entry. */
        if (forcelarge)
            zipStorePrevEntryLengthLarge(p+reqlen,reqlen);
        else
            zipStorePrevEntryLength(p+reqlen,reqlen);

        /* Update offset for tail */
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

        //當要插入位置後面有多個節點時
        zipEntry(p+reqlen, &tail);
        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
        }
    } else {
        //當是插入到最後的時候
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
    }

    //判斷是否進行空間擴充
    if (nextdiff != 0) {
        offset = p-zl;
        zl = __ziplistCascadeUpdate(zl,p+reqlen);
        p = zl+offset;
    }

    //寫入節點
    p += zipStorePrevEntryLength(p,prevlen);
    p += zipStoreEntryEncoding(p,encoding,slen);
    if (ZIP_IS_STR(encoding)) {
        memcpy(p,s,slen);
    } else {
        zipSaveInteger(p,value,encoding);
    }
    ZIPLIST_INCR_LENGTH(zl,1);
    return zl;
}

空間擴充

/* When an entry is inserted, we need to set the prevlen field of the next
 * entry to equal the length of the inserted entry. It can occur that this
 * length cannot be encoded in 1 byte and the next entry needs to be grow
 * a bit larger to hold the 5-byte encoded prevlen. This can be done for free,
 * because this only happens when an entry is already being inserted (which
 * causes a realloc and memmove). However, encoding the prevlen may require
 * that this entry is grown as well. This effect may cascade throughout
 * the ziplist when there are consecutive entries with a size close to
 * ZIP_BIG_PREVLEN, so we need to check that the prevlen can be encoded in
 * every consecutive entry.
 *
 * Note that this effect can also happen in reverse, where the bytes required
 * to encode the prevlen field can shrink. This effect is deliberately ignored,
 * because it can cause a "flapping" effect where a chain prevlen fields is
 * first grown and then shrunk again after consecutive inserts. Rather, the
 * field is allowed to stay larger than necessary, because a large prevlen
 * field implies the ziplist is holding large entries anyway.
 *
 * The pointer "p" points to the first entry that does NOT need to be
 * updated, i.e. consecutive fields MAY need an update. */
 
unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
    size_t offset, noffset, extra;
    unsigned char *np;
    zlentry cur, next;

    while (p[0] != ZIP_END) {
        zipEntry(p, &cur);
        rawlen = cur.headersize + cur.len;
        rawlensize = zipStorePrevEntryLength(NULL,rawlen);

        //當以及時最後一個節點的時候，結束操作
        if (p[rawlen] == ZIP_END) break;
        zipEntry(p+rawlen, &next);

        //當prvlen記憶體不需要改變的時候，結束操作
        if (next.prevrawlen == rawlen) break;

        if (next.prevrawlensize < rawlensize) {
            offset = p-zl;
            extra = rawlensize-next.prevrawlensize;
            zl = ziplistResize(zl,curlen+extra);
            p = zl+offset;

            np = p+rawlen;
            noffset = np-zl;

            if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) {
                ZIPLIST_TAIL_OFFSET(zl) =
                    intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
            }

            memmove(np+rawlensize,
                np+next.prevrawlensize,
                curlen-noffset-next.prevrawlensize-1);
            zipStorePrevEntryLength(np,rawlen);

            p += rawlen;
            curlen += extra;
        } else {
            if (next.prevrawlensize > rawlensize) {
               zipStorePrevEntryLengthLarge(p+rawlen,rawlen);
            } else {
                zipStorePrevEntryLength(p+rawlen,rawlen);
            }
            break;
        }
    }
    return zl;
}

Redis源碼剖析之壓縮清單

文章目錄

壓縮清單（ziplist）

資料結構

壓縮清單

壓縮清單節點

prvlen

encoding

壓縮清單存儲樣例

API

entry的定義

删除操作

插入操作

空間擴充

繼續閱讀

秒懂JVM的三大參數類型，就靠這十個小實驗了初級—中級—進階三個級别的大廠面試真題阿裡雲——Java 實習生/初級美團——Java 中級螞蟻金服——Java 進階基礎篇JVM 篇MySQL 篇Redis 篇

CentOs7 安裝redis4.0 遇到的坑一、CentOs7連接配接網絡二、make檔案的時候出錯三、安裝ruby的redis插件時候報錯最後

docker 搭建Redis 問題解決

如何解決Redis緩存擊穿、雪崩、穿透問題

java 版本的redis-stat不能運作在背景和daemon

redis管理常用指令

django短信驗證碼的後端實作

Redis訂閱了一段時間後訂閱失效了（ redisTemplate.convertAndSend）

2022秋招面試總結（cpp+java+測開）百度測開一面位元組後端一面蝦皮後端一面蝦皮後端二面

資料遷移方法資料遷移原則資料遷移之雙寫方案資料遷移之級聯同步方案

微服務-性能壓測\緩存redis和分布式鎖redisson和SpringCache

Nacos 2.0 更新前後性能對比壓測

Spring資料和Redis

redis叢集資料一緻性_RedisRaft為Redis叢集帶來強大的資料一緻性

supervisor 管理redis 和httpd 環境centos7

Redis簡介一(單機版)發展曆程Redis