
Cassandra: backup, restore, and scripts with snapshots and sstableloader

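I. Backing up and restoring Cassandra

Cassandra backup and restore is done mainly through snapshots.

Steps:

Backup phase:
1. Create a snapshot.
Restore phase:
1. Empty the table (truncate table tablename), or create the table schema if it does not exist yet;
2. Copy each table's snapshot files into that table's data directory, overwriting the existing data;
3. Run refresh to load the restored data.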
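
The restore phase can be sketched as an ordered list of shell commands. This is a minimal dry-run sketch, not a definitive procedure: `restore_plan` is a hypothetical helper, the `cassandra:cassandra` owner and the `cqlsh -e` invocation are assumptions about the environment, and nothing is executed.

```python
import shlex

def restore_plan(keyspace, table, table_dir, snapshot_name,
                 host="localhost", jmx_port=7199):
    """Return the restore commands in order; nothing is executed here."""
    snap = "{}/snapshots/{}".format(table_dir, snapshot_name)
    truncate = shlex.quote("TRUNCATE TABLE {};".format(table))
    return [
        # 1. Empty the table (the schema must already exist).
        "cqlsh -k {} -e {}".format(keyspace, truncate),
        # 2. Copy the snapshot files back into the live table directory
        #    and hand them to the (assumed) cassandra user.
        "cp {}/* {}/".format(snap, table_dir),
        "chown -R cassandra:cassandra {}".format(table_dir),
        # 3. Ask the node to load the newly placed SSTables.
        "nodetool -h {} -p {} refresh -- {} {}".format(host, jmx_port, keyspace, table),
    ]

for cmd in restore_plan(
        "xn_dolphin_1", "dolphin_conversation_result",
        "/var/lib/cassandra/data/xn_dolphin_1/"
        "dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91",
        "xn_dolphin_1-20181010"):
    print(cmd)
```

Printing the plan first makes it easy to review the paths before running anything against a live node.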
           

1. Create a snapshot

Syntax:
nodetool -h server_ip -p port snapshot keyspace_name # snapshot of a whole keyspace
nodetool -h server_ip -p port snapshot -t snapshot_name -kt keyspace_name.table_name # snapshot of a single table
           

Note: if -t is not given, a timestamp string is generated automatically and used as the snapshot name.
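
The auto-generated name in the transcript below (1539180816386) looks like Unix time in milliseconds. Assuming that holds, a short sketch can turn such a name back into a readable UTC time; `snapshot_name_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

def snapshot_name_to_utc(name):
    """Interpret a numeric snapshot name as epoch milliseconds (an assumption)."""
    ms = int(name)
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    whole_seconds = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return whole_seconds + timedelta(milliseconds=ms % 1000)

print(snapshot_name_to_utc("1539180816386").isoformat())
# 2018-10-10T14:13:36.386000+00:00
```

That result is about a minute before the `date` output shown in the transcript, which is consistent with the assumption.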

Snapshots are stored in the snapshots directory under each table's directory in the data directory, e.g. table-uuid/snapshots/snapshot_name
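
The layout described above can be expressed as a small path helper. This is a sketch only: `snapshot_dir` and `split_table_dir` are hypothetical names, and the data root `/var/lib/cassandra/data` is taken from the transcript below rather than from any configuration.

```python
from pathlib import PurePosixPath

def snapshot_dir(data_root, keyspace, table_dir, snapshot_name):
    """Build <data_root>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>."""
    return PurePosixPath(data_root) / keyspace / table_dir / "snapshots" / snapshot_name

def split_table_dir(table_dir):
    """Split a directory name of the form <table>-<id> at the last hyphen."""
    name, _, table_id = table_dir.rpartition("-")
    return name, table_id

table_dir = "dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91"
print(snapshot_dir("/var/lib/cassandra/data", "xn_dolphin_1", table_dir,
                   "xn_dolphin_1-20181010"))
print(split_table_dir(table_dir))
```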

[email protected]:~# nodetool -h localhost -p 7199 snapshot -t xn_dolphin_1-20181010  xn_dolphin_1
Requested creating snapshot(s) for [xn_dolphin_1] with snapshot name [xn_dolphin_1-20181010] and options {skipFlush=false}
Snapshot directory: xn_dolphin_1-20181010

[email protected]:~# nodetool -h localhost -p 7199 snapshot   xn_dolphin_1
Requested creating snapshot(s) for [xn_dolphin_1] with snapshot name [1539180816386] and options {skipFlush=false}
Snapshot directory: 1539180816386
[email protected]:~# date
Wed Oct 10 14:14:45 UTC 2018
[email protected]:~# nodetool listsnapshots
Snapshot Details:
Snapshot name         Keyspace name Column family name               True size Size on disk
1539180816386         xn_dolphin_1  dolphin_conversation_result      5.1 MiB   5.1 MiB
1539180816386         xn_dolphin_1  dolphin_conversation_member      0 bytes   1.02 KiB
1539180816386         xn_dolphin_1  dolphin_wchat_openid             0 bytes   895 bytes
1539180816386         xn_dolphin_1  zoogate_login_info               0 bytes   1.02 KiB
1539180816386         xn_dolphin_1  dolphin_conversation_list        0 bytes   946 bytes
1539180816386         xn_dolphin_1  dolphin_leaving_msg              0 bytes   1.27 KiB
1539180816386         xn_dolphin_1  dolphin_conversation             0 bytes   1.1 KiB
1539180816386         xn_dolphin_1  dolphin_member_inout             0 bytes   1.05 KiB
1539180816386         xn_dolphin_1  dolphin_conversation_message     0 bytes   1.18 KiB
1539180816386         xn_dolphin_1  zoogate_blacklist                0 bytes   1.01 KiB
1539180816386         xn_dolphin_1  dolphin_conversation_visitorinfo 0 bytes   1.2 KiB
1539180816386         xn_dolphin_1  dolphin_conversation_statistics  0 bytes   1 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_conversation_result      5.1 MiB   5.1 MiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_conversation_member      0 bytes   1.02 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_wchat_openid             0 bytes   895 bytes
xn_dolphin_1-20181010 xn_dolphin_1  zoogate_login_info               0 bytes   1.02 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_conversation_list        0 bytes   946 bytes
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_leaving_msg              0 bytes   1.27 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_conversation             0 bytes   1.1 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_member_inout             0 bytes   1.05 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_conversation_message     0 bytes   1.18 KiB
xn_dolphin_1-20181010 xn_dolphin_1  zoogate_blacklist                0 bytes   1.01 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_conversation_visitorinfo 0 bytes   1.2 KiB
xn_dolphin_1-20181010 xn_dolphin_1  dolphin_conversation_statistics  0 bytes   1 KiB
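
For scripting, the `nodetool listsnapshots` table above can be parsed into records. The sketch below assumes the column layout shown in this transcript (three name columns followed by two "number unit" sizes); other Cassandra versions may print a different layout, and `parse_listsnapshots` is a hypothetical helper.

```python
from collections import namedtuple

SnapshotRow = namedtuple("SnapshotRow", "snapshot keyspace table true_size size_on_disk")

def parse_listsnapshots(text):
    """Parse data rows of `nodetool listsnapshots` output as shown above."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly 7 tokens: snapshot, keyspace, table,
        # then "<number> <unit>" twice. Header lines have a different count.
        if len(parts) != 7:
            continue
        snapshot, keyspace, table = parts[:3]
        rows.append(SnapshotRow(snapshot, keyspace, table,
                                " ".join(parts[3:5]), " ".join(parts[5:7])))
    return rows

sample = """Snapshot Details:
Snapshot name         Keyspace name Column family name               True size Size on disk
1539180816386         xn_dolphin_1  dolphin_conversation_result      5.1 MiB   5.1 MiB
1539180816386         xn_dolphin_1  dolphin_conversation_visitorinfo 0 bytes   1.2 KiB
"""
for row in parse_listsnapshots(sample):
    print(row.table, row.true_size, row.size_on_disk)
```

Splitting on any whitespace rather than on column positions is deliberate: the longest table name in the output is separated from its size by a single space.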

[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91/snapshots/xn_dolphin_1-20181010# pwd
/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91/snapshots/xn_dolphin_1-20181010

[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# ls
backups			     mc-1-big-Data.db	    mc-1-big-Filter.db	mc-1-big-Statistics.db	mc-1-big-TOC.txt
mc-1-big-CompressionInfo.db  mc-1-big-Digest.crc32  mc-1-big-Index.db	mc-1-big-Summary.db	snapshots

[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91/snapshots# ls
1539180816386  testdb_bak  xn_dolphin_1-20181010

[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91/snapshots# cd xn_dolphin_1-20181010/
[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91/snapshots/xn_dolphin_1-20181010# ls
manifest.json		     mc-1-big-Data.db	    mc-1-big-Filter.db	mc-1-big-Statistics.db	mc-1-big-TOC.txt
mc-1-big-CompressionInfo.db  mc-1-big-Digest.crc32  mc-1-big-Index.db	mc-1-big-Summary.db	schema.cql

           

2. Delete the data:

cqlsh:xn_dolphin_1> select count(*) from dolphin_conversation_result;

 count
-------
 53426

(1 rows)

Warnings :
Aggregation query used without partition key
cqlsh:xn_dolphin_1> truncate table  dolphin_conversation_result;

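# When a table is truncated, Cassandra automatically takes a snapshot of it (the default auto_snapshot behavior); everything in the table directory except the backups and snapshots directories is deleted, including the *.db files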
[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# nodetool listsnapshots
Snapshot Details:
Snapshot name                                       Keyspace name Column family name               True size Size on disk
truncated-1539182023411-dolphin_conversation_result xn_dolphin_1  dolphin_conversation_result      5.1 MiB   5.1 MiB
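
The automatic snapshot name above follows the pattern truncated-<timestamp>-<table>. Assuming that pattern is stable, a sketch like the following can pick the most recent truncation snapshot for a table; both function names are hypothetical.

```python
import re

# Pattern inferred from the transcript: truncated-<epoch millis>-<table name>
_TRUNCATED = re.compile(r"^truncated-(\d+)-(.+)$")

def parse_truncated(name):
    """Return (timestamp_ms, table) for a truncation snapshot name, else None."""
    m = _TRUNCATED.match(name)
    return (int(m.group(1)), m.group(2)) if m else None

def latest_truncated(names, table):
    """Return the newest truncation snapshot name for `table`, or None."""
    best = None
    for name in names:
        parsed = parse_truncated(name)
        if parsed and parsed[1] == table:
            if best is None or parsed[0] > best[0]:
                best = (parsed[0], name)
    return best[1] if best else None

names = ["1539180816386", "xn_dolphin_1-20181010",
         "truncated-1539182023411-dolphin_conversation_result"]
print(latest_truncated(names, "dolphin_conversation_result"))
```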
           

3. Copy the snapshot files

[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# ls
backups  snapshots

[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# cp snapshots/truncated-1539182023411-dolphin_conversation_result/* .
# Fix ownership
[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# chown -R cassandra.cassandra *

           

4. Restore the data

Syntax: nodetool -h server -p port refresh -- keyspace_name table_name

Note: the port is 7199 (the JMX port)

# A snapshot restore only loads data into an existing schema, so make sure the schema exists first
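
The snapshot directory listed earlier contains a schema.cql file next to the SSTable components, which can help recreate a missing table. As a minimal sketch, a pre-restore check might confirm that file and at least one Data.db component are present; `check_snapshot` is a hypothetical helper and the file names are taken from the listing above.

```python
from pathlib import Path

def check_snapshot(snapshot_dir):
    """Return (has_schema_cql, sorted list of *-Data.db file names)."""
    d = Path(snapshot_dir)
    has_schema = (d / "schema.cql").is_file()
    data_files = sorted(p.name for p in d.glob("*-Data.db"))
    return has_schema, data_files

if __name__ == "__main__":
    # Path taken from the transcript; adjust for your own node.
    has_schema, data_files = check_snapshot(
        "/var/lib/cassandra/data/xn_dolphin_1/"
        "dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91/"
        "snapshots/xn_dolphin_1-20181010")
    print("schema.cql present:", has_schema)
    print("data files:", data_files)
```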

[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# cqlsh -k xn_dolphin_1
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.3 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh:xn_dolphin_1> desc dolphin_conversation_result ;

CREATE TABLE xn_dolphin_1.dolphin_conversation_result (
    siteid text,
    converid text,
    type int,
    content text,
    customerid text,
    deal_content text,
    deal_time bigint,
    isdeal int,
    submit_time bigint,
    supplierid text,
    PRIMARY KEY (siteid, converid, type)
) WITH CLUSTERING ORDER BY (converid ASC, type ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# ls
backups        mc-1-big-CompressionInfo.db  mc-1-big-Digest.crc32  mc-1-big-Index.db	   mc-1-big-Summary.db	schema.cql
manifest.json  mc-1-big-Data.db		    mc-1-big-Filter.db	   mc-1-big-Statistics.db  mc-1-big-TOC.txt	snapshots
[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# nodetool refresh -- xn_dolphin_1  dolphin_conversation_result
[email protected]:/var/lib/cassandra/data/xn_dolphin_1/dolphin_conversation_result-d9e929d0cc9511e8a7ad6d2c86545d91# cqlsh -k xn_dolphin_1
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.3 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh:xn_dolphin_1> select count(*) from dolphin_conversation_result;

 count
-------
 53426

(1 rows)

Warnings :
Aggregation query used without partition key

           

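II. The sstableloader migration tool

Cassandra ships an sstableloader tool in its bin directory, designed for loading a table's SSTable files into a new cluster.

Note: when migrating from three old nodes to three new ones, sstableloader only streams the data held on the node where it runs, so it has to be run on every old node: old1->new1, old2->new2, old3->new3.

1. Old cluster: the table is mykeyspace.mytable. The data lives in a 3-node cluster, and each node stores its data under /opt/data.

2. New cluster: the address is 192.168.31.185. First create a keyspace and table with the same names and schema in the new cluster.

3. Run on the old cluster: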

bin/sstableloader -d 192.168.31.185 -u cassandra -pw cassandra -t 100 /opt/data/mykeyspace/mytable

Here -u is the username, -pw is the password, and -t throttles the transfer to 100 Mbit/s.
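
To get a feel for what the throttle means, the arithmetic can be sketched as follows. It assumes -t is in megabits per second with 1 Mbit = 1,000,000 bits; whether the tool uses decimal or binary megabits is not confirmed here, and both function names are hypothetical.

```python
def mbit_to_mib_per_s(mbit_per_s):
    """Convert megabits/s (decimal, an assumption) to MiB/s."""
    bytes_per_s = mbit_per_s * 1_000_000 / 8
    return bytes_per_s / (1024 * 1024)

def min_transfer_seconds(size_mib, throttle_mbit_per_s):
    """Lower bound on transfer time when the throttle is the only limit."""
    return size_mib / mbit_to_mib_per_s(throttle_mbit_per_s)

print(round(mbit_to_mib_per_s(100), 2))            # MiB/s allowed by -t 100
print(round(min_transfer_seconds(1024, 100), 1))   # seconds for 1 GiB at that cap
```

Under these assumptions, -t 100 allows roughly 12 MiB/s, so a 1 GiB table needs at least about 86 seconds per stream.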

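Once every node has finished, the table's data has been loaded into the new cluster. If disk I/O and network capacity allow, you can run it on several nodes concurrently.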
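
Since the same command has to be run on every old node, it can help to generate it from one place. The sketch below only builds argument lists using the flags shown above (-d, -u, -pw, -t); the node names and the helper `sstableloader_argv` are hypothetical, and nothing is executed.

```python
def sstableloader_argv(table_dir, target_host, user, password, throttle_mbit=100):
    """Build the sstableloader argument list for one table directory."""
    return ["sstableloader",
            "-d", target_host,
            "-u", user,
            "-pw", password,
            "-t", str(throttle_mbit),
            table_dir]

# Hypothetical old-node names; each one runs the same command locally.
old_nodes = ["old1", "old2", "old3"]
for node in old_nodes:
    argv = sstableloader_argv("/opt/data/mykeyspace/mytable",
                              "192.168.31.185", "cassandra", "cassandra")
    print(node, ":", " ".join(argv))
```

Keeping the arguments as a list rather than a single string avoids shell-quoting problems if the list is later passed to `subprocess.run`.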

Example:

[email protected]:/$ sstableloader -d 192.168.31.185 -u cassandra -pw cassandra -t 100 /opt/bitnami/cassandra/data/data/xn_dolphin_1/dolphin_conversation_result-2c8866e0ce3711e89b4687b65adcf047/


WARN  16:25:38,358 Small commitlog volume detected at /opt/bitnami/cassandra/bin/../data/commitlog; setting commitlog_total_space_in_mb to 4348.  You can override this in cassandra.yaml
WARN  16:25:38,377 Small cdc volume detected at /opt/bitnami/cassandra/bin/../data/cdc_raw; setting cdc_total_space_in_mb to 2174.  You can override this in cassandra.yaml
WARN  16:25:38,669 Only 6.847GiB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /bitnami/cassandra/data/data/xn_dolphin_1/dolphin_conversation_result-2c8866e0ce3711e89b4687b65adcf047/mc-1-big-Data.db  to [/192.168.31.185]
progress: [/192.168.31.185]0:1/1 100% total: 100% 792.200KiB/s (avg: 792.200KiB/s)
progress: [/192.168.31.185]0:1/1 100% total: 100% 0.000KiB/s (avg: 634.013KiB/s)

Summary statistics:
   Connections per host    : 1
   Total files transferred : 1
   Total bytes transferred : 5.053MiB
   Total duration          : 8175 ms
   Average transfer rate   : 632.865KiB/s
   Peak transfer rate      : 792.200KiB/s
           

Migration complete!
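
As a quick sanity check on the summary statistics above, the average rate should be roughly the total bytes divided by the duration. A small sketch using the numbers from that run (`average_rate_kib_per_s` is a hypothetical helper):

```python
def average_rate_kib_per_s(total_mib, duration_ms):
    """Average transfer rate in KiB/s from total size and duration."""
    return total_mib * 1024 / (duration_ms / 1000)

rate = average_rate_kib_per_s(5.053, 8175)
print(round(rate, 1))
```

The result lands within a fraction of a KiB/s of the reported 632.865 KiB/s; the small gap comes from the total size being rounded in the output. It is also far below what -t 100 would allow, so in this run the throttle was not the limiting factor.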

Script:

A Python backup script for Cassandra running in k8s or Docker

#-*- coding:utf-8 -*-
# Author:json
import os,re
import subprocess
import datetime
import zipfile

def exeCommd(cmd):
    subp = subprocess.getstatusoutput(cmd)
    return subp[1]
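def copySnapshots(containerId,snapshot_path,backcup_path):
        # Copy a snapshot directory out of the pod to a local backup path
        copyCommd = "kubectl cp {containerId}:{snapshot_path} {backcup_path}".format(containerId=containerId,snapshot_path=snapshot_path,backcup_path=backcup_path)
        exeCommd(copyCommd)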

def mkdirFile(path):
    if os.path.isdir(path):
        print('Directory already exists!')
    else:
        os.makedirs(path)

def exeCommd2(cmd):
    subp = subprocess.Popen(cmd,shell=True,stdout=subprocess.PIPE,encoding='utf8')
    data = subp.stdout.readlines()
    return data

def writeFile(path,filename,content):
    with open('{path}/{filename}'.format(path=path,filename=filename),'w+',encoding='utf8') as f:
        f.write(content)

def zipFile(zipfname,LOCAL_BACKUP_PATH):
    try:
        print('Compressing files')
        # The with-block closes the archive even if a write fails
        with zipfile.ZipFile(zipfname,'w',zipfile.ZIP_DEFLATED,allowZip64=True) as z:
            for dirpath,dirnames,filenames in os.walk(LOCAL_BACKUP_PATH):
                for filename in filenames:
                    z.write(os.path.join(dirpath,filename))
        print('Backup finished')
    except Exception as e:
        print(e)

if __name__ == "__main__":

    # Parameters to adjust: BACKUP_DIR, CONFIG_YAML, CONTAINER_Id, KEYSPACES_NAME
    BACKUP_DIR = '/opt/backup/cassandra'
    CONFIG_YAML = '/etc/cassandra/cassandra.yaml'
    # CONTAINER_Id = "docker ps|grep 'cassandra'|awk -F' ' '{print $1}'"
    CONTAINER_Id = "cassandra-0"
    # CURRENT_IP = "ip addr | grep 'eth0' |grep 'inet'|awk '{print $2}'|cut -f1 -d'/'"
    KEYSPACES_NAME = ['dolphin','im']

    # %M is minutes; the original %I (12-hour clock hour) repeated the hour instead
    CurrmentDate = datetime.datetime.now().strftime('%Y%m%d%H%M%S')
    # containerId = exeCommd(CONTAINER_Id)

    # Create the snapshot
    # CASSANDRA_SNAPSHOT = "docker exec -it {containerId} nodetool snapshot -t {CURRENT_DATE} ".format(CURRENT_DATE=CURRENT_DATE,containerId=containerId)
    CASSANDRA_SNAPSHOT = "kubectl exec -it {containerId} nodetool snapshot".format(CURRENT_DATE=CurrmentDate,containerId=CONTAINER_Id)
    # Look up the Cassandra data directory
    CASSANDRA_DataFile="kubectl exec -it {CONTAINER_Id} cat  {CONFIG_YAML}|grep 'data_file_directories' -A1|head -2|tail -n1".format(CONFIG_YAML=CONFIG_YAML,CONTAINER_Id=CONTAINER_Id)
    a,dataPath = exeCommd(CASSANDRA_DataFile).split()
    data_list = []

    print(CASSANDRA_SNAPSHOT)
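    # Extract the snapshot name from the nodetool output
    Snapshot_Str = exeCommd(CASSANDRA_SNAPSHOT)
    # Use a separate variable so the re module is not shadowed
    snapshotMatch = re.search(r'\[\d+\]', Snapshot_Str).group()
    SnapshotName = snapshotMatch.replace('[', '').replace(']', '')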
    print(SnapshotName)
    try:
        for keyspaceName in KEYSPACES_NAME:
            # path = "{dataPath}/{keyspacename}".format(dataPath=dataPath,keyspacename=keyspaceName)
            keyspacePath = os.path.join(dataPath,'{keyspacename}'.format(keyspacename=keyspaceName))
            print('-------------------------',keyspaceName)
            CASSANDRA_TABLE = "kubectl exec -it {containerId} ls {path} ".format(path=keyspacePath,
                                                                                 containerId=CONTAINER_Id)
            table_uuid = exeCommd(CASSANDRA_TABLE).split('\n')
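            for  tableFileName in table_uuid:
                    # Skip blank lines and stray carriage returns in the ls output
                    tableFileName = tableFileName.strip()
                    if not tableFileName:
                        continue
                    # Directory names have the form <table>-<id>
                    tableName,_,uuid = tableFileName.rpartition('-')
                    # Path where the snapshot is stored inside the pod
                    snapshotPath = "{path}/{tableFileName}/snapshots/{SnapshotName}".format(tableFileName=tableFileName,
                                                                                          SnapshotName=SnapshotName,
                                                                                          path=keyspacePath)
                    # Local path to back up to
                    bakcupPath = '{backdir}/{keyspacename}/{tableFileName}'.format(backdir=BACKUP_DIR,
                                                                                    tableFileName=tableFileName,
                                                                                    keyspacename=keyspaceName)
                    print(bakcupPath)
                    # Create the backup directory
                    mkdirFile('{back_path}'.format(back_path=bakcupPath))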

                    #copy snapshot backup file
                    CopySnapshot = 'kubectl cp  {containerId}:{table_path} {bakcup_path}/{SnapshotName} '.format(table_path=snapshotPath,containerId=CONTAINER_Id,bakcup_path=bakcupPath,SnapshotName=SnapshotName)
                    print(CopySnapshot,'>>>:',tableFileName)
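                    # Judge success by the exit status, not by whether the command printed anything
                    if subprocess.getstatusoutput(CopySnapshot)[0] == 0:
                        writeFile(BACKUP_DIR, '{CurrmentDate}.log'.format(CurrmentDate=CurrmentDate), 'ok')
                        print('Copy succeeded!')
                    else:
                        writeFile(BACKUP_DIR, '{CurrmentDate}.log'.format(CurrmentDate=CurrmentDate), 'no')
    except Exception as e:
            print(e)
            # f.write() needs a string, not the exception object
            writeFile(BACKUP_DIR, '{CurrmentDate}.log'.format(CurrmentDate=CurrmentDate), str(e))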
    zipfname = BACKUP_DIR + '/' + CurrmentDate + CONTAINER_Id + '.zip'
    LOCAL_BACKUP_PATH = os.path.join(BACKUP_DIR,'dolphin')
    zipFile(zipfname, LOCAL_BACKUP_PATH)
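
The script above reads the snapshot name by matching a bracketed run of digits, which only works for auto-generated names. A slightly more general sketch, assuming the "with snapshot name [...]" wording shown in the nodetool output earlier in this post, could look like this (`extract_snapshot_name` is a hypothetical helper):

```python
import re

def extract_snapshot_name(nodetool_output):
    """Pull the snapshot name out of `nodetool snapshot` output, or return None."""
    m = re.search(r"snapshot name \[([^\]]+)\]", nodetool_output)
    return m.group(1) if m else None

auto = ("Requested creating snapshot(s) for [xn_dolphin_1] with snapshot name "
        "[1539180816386] and options {skipFlush=false}")
named = ("Requested creating snapshot(s) for [xn_dolphin_1] with snapshot name "
         "[xn_dolphin_1-20181010] and options {skipFlush=false}")
print(extract_snapshot_name(auto))
print(extract_snapshot_name(named))
```

Returning None instead of raising lets the caller decide how to handle unexpected output, for example when kubectl prints an error instead of the nodetool message.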
           
