Using itchat in Python to fetch and analyze WeChat friend information (gender, region, avatar, signature, etc.)

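Introduction to itchat

itchat official site: https://itchat.readthedocs.io/zh/latest/

itchat is an open-source interface to personal WeChat accounts.

With it you can fetch your WeChat friends' information, send messages, receive messages, and so on,

which makes it possible to build a personal WeChat bot.

(Sending messages and similar features now appear to have been blocked by WeChat.)

Here we use only the most basic feature, fetching friend information, to collect data on our WeChat friends and run some preliminary statistics on it.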

1. Log in and fetch friend information

First, of course, install the package:

pip install itchat

A single call is enough to log in to WeChat by scanning a QR code:

import itchat
itchat.auto_login()

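After scanning, you are told that you have logged in to the web version of WeChat. (PS: newly registered WeChat accounts no longer seem to be able to log in to the web version.)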

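Once logged in, the latest friend information can be fetched with itchat.get_friends(update=True).

The update parameter controls whether the latest information is fetched again; if it is False, the cached information is used.

To get a clear view of the internal data structure, we save the friend information to a local file (which has the same effect as update=False).

The code is as follows: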

import json
import os

import itchat

if not os.path.exists('friends.json'):
    # Log in and fetch first, so a failed login does not leave an empty file behind
    itchat.auto_login()
    friends = itchat.get_friends(update=True)
    with open('friends.json', 'w') as f:
        json.dump(friends, f)
    print("save friends info.")
else:
    with open('friends.json', 'r') as lf:
        friends = json.load(lf)
    print('load friends info.')

At this point the friend information has been saved to friends.json.
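The load-or-fetch pattern used here can be factored into a small helper. The following is only a sketch: load_or_fetch and fake_fetch are hypothetical names, and fake_fetch stands in for the real itchat calls so that the example runs without logging in.

```python
import json
import os
import tempfile


def load_or_fetch(cache_path, fetch):
    """Return cached JSON data if cache_path exists, otherwise call fetch() and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as f:
            return json.load(f)
    data = fetch()  # fetch before opening the file, so a failure leaves no empty cache
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)
    return data


# Hypothetical stand-in for the real itchat calls, so the sketch runs offline.
calls = []


def fake_fetch():
    calls.append(1)
    return [{"Sex": 1}, {"Sex": 2}]


cache_file = os.path.join(tempfile.mkdtemp(), "friends.json")
first = load_or_fetch(cache_file, fake_fetch)   # calls fake_fetch and writes the cache
second = load_or_fetch(cache_file, fake_fetch)  # served from the cache file
print(first == second, len(calls))
```

In the real script, fetch would be a function that calls itchat.auto_login() and then returns itchat.get_friends(update=True).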

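We can take a look at how the information is stored in it:

[Image: the structure of a friend entry in friends.json]

Based on the structure shown above, we can run statistics on our friends' city, province, signature, gender, avatar, and so on.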
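To make the fields concrete, here is a small sketch that picks the analysed fields out of one record. The record itself is made up; only the key names (UserName, Sex, Province, City, Signature) come from the code in this article, and summarize is a hypothetical helper.

```python
# A made-up friend record containing only the fields this article uses.
sample_friend = {
    "UserName": "@abc123",  # hypothetical id
    "Sex": 1,
    "Province": "Zhejiang",
    "City": "Hangzhou",
    "Signature": "Stay hungry",
}

FIELDS = ["Sex", "Province", "City", "Signature"]


def summarize(friend, fields=FIELDS):
    """Pick the analysed fields out of one friend record, defaulting to ''."""
    return {name: friend.get(name, "") for name in fields}


summary = summarize(sample_friend)
for name, value in summary.items():
    print(f"{name}: {value!r}")
```

Using friend.get with a default keeps the later statistics from failing on records that lack a field.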

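2. Gender statistics

The Sex field holds the friend's gender: 1 is male and 2 is female. Other values cannot be ruled out, of course, so everything else is counted as "other".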

import matplotlib.pyplot as plt

def sex_analysis(friends):
    labels = ['Male', 'Female', 'Other']
    data = [0, 0, 0]

    # friends[0] is the logged-in user's own record
    for friend in friends[1:]:
        sex = friend["Sex"]
        if sex == 1:
            data[0] += 1
        elif sex == 2:
            data[1] += 1
        else:
            data[2] += 1
    # Append the count to each label, e.g. "Male:12"
    labels = [labels[i] + ':' + str(data[i]) for i in range(len(labels))]
    plt.title("WeChat friends by gender")
    plt.pie(data, labels=labels, autopct="%.2f%%")
    plt.savefig("sex.jpg")
    plt.show()

We can use matplotlib to turn these counts into a pie chart:

[Image: pie chart of friends by gender]
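The counting step can also be separated from the plotting, which makes it easy to check on its own. This is a sketch using collections.Counter with made-up data; sex_counts and percentages are hypothetical helper names, not part of itchat.

```python
from collections import Counter

SEX_LABELS = {1: "male", 2: "female"}  # any other value is reported as "other"


def sex_counts(friends):
    """Count friends per gender label, skipping friends[0] (the account owner)."""
    return Counter(SEX_LABELS.get(f.get("Sex"), "other") for f in friends[1:])


def percentages(counts):
    """Convert counts to percentages rounded to two decimals."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: round(100 * v / total, 2) for k, v in counts.items()}


# Made-up data: the first entry stands for the account owner.
friends = [{"Sex": 1}, {"Sex": 1}, {"Sex": 2}, {"Sex": 2}, {"Sex": 0}, {"Sex": 2}]
counts = sex_counts(friends)
print(counts)
print(percentages(counts))
```

The resulting counts can be passed to plt.pie in the same way as the data list above.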

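3. Province statistics

We count the number of friends in each province, sort the counts, and take the top 10.

Some friends have an empty ('') province, which has to be removed.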

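from collections import defaultdict

import matplotlib.pyplot as plt

def prov_analysis(friends):
    prov_dict = defaultdict(int)

    for friend in friends[1:]:
        prov = friend['Province']
        prov_dict[prov] += 1
    # Drop friends with no province; the default avoids a KeyError when there are none
    prov_dict.pop('', None)
    prov_top10 = sorted(prov_dict.items(), key=lambda x: x[1], reverse=True)[:10]
    prov_name = [x[0] for x in prov_top10]
    prov_num = [x[1] for x in prov_top10]
    # Province names are Chinese, so matplotlib needs a font with CJK glyphs
    plt.bar(prov_name, prov_num, width=0.5, align='center', color='#87CEFA')
    # Write each count just above its bar
    for i in range(len(prov_num)):
        x = prov_name[i]
        y = prov_num[i]
        plt.text(x, y + 1, '%s' % y, fontsize=10, ha='center')
    plt.ylabel("Number of friends")
    plt.xlabel("Province")
    plt.title("Friends by province, top 10")
    plt.savefig("province.jpg")
    plt.show()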

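The result is again plotted with plt:

[Image: bar chart of the top 10 provinces by number of friends]

With a distribution this lopsided, it is almost certain that the owner of this account is from Zhejiang.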

4. City statistics

This is essentially the same as the province statistics.
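Since the two analyses differ only in the field they read, the counting part could be shared. The sketch below uses collections.Counter in place of the defaultdict plus sorted approach; top_values is a hypothetical helper and the data is made up.

```python
from collections import Counter


def top_values(friends, field, n=10):
    """Return the n most common non-empty values of field, skipping friends[0]."""
    counter = Counter(f.get(field, "") for f in friends[1:])
    counter.pop("", None)  # friends who left the field blank
    return counter.most_common(n)


# Made-up data: the first entry stands for the account owner and is skipped.
friends = [
    {"Province": "Zhejiang", "City": "Hangzhou"},
    {"Province": "Zhejiang", "City": "Hangzhou"},
    {"Province": "Zhejiang", "City": "Shaoxing"},
    {"Province": "Sichuan", "City": "Chengdu"},
    {"Province": "", "City": ""},
    {"Province": "Zhejiang", "City": "Hangzhou"},
]
print(top_values(friends, "Province"))
print(top_values(friends, "City", n=2))
```

The list of (name, count) pairs it returns can be split into names and counts and passed to plt.bar as in the functions in this article.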

from collections import defaultdict

import matplotlib.pyplot as plt

def city_analysis(friends):
    city_dict = defaultdict(int)
    for friend in friends[1:]:
        city = friend['City']
        city_dict[city] += 1
    # Drop friends with no city; the default avoids a KeyError when there are none
    city_dict.pop('', None)
    city_top10 = sorted(city_dict.items(), key=lambda x: x[1], reverse=True)[:10]
    city_name = [x[0] for x in city_top10]
    city_num = [x[1] for x in city_top10]
    plt.bar(city_name, city_num, width=0.5, align='center', color='#87CEFA')
    # Write each count just above its bar
    for i in range(len(city_num)):
        x = city_name[i]
        y = city_num[i]
        plt.text(x, y + 1, '%s' % y, fontsize=10, ha='center')
    plt.ylabel("Number of friends")
    plt.xlabel("City")
    plt.title("Friends by city, top 10")
    plt.savefig("city.jpg")
    plt.show()

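[Image: bar chart of the top 10 cities by number of friends]

Reading the chart above:

the owner of this account was born or works in Hangzhou and Shaoxing,

and lived in Chengdu for a while.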

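5. Fetching friends' avatars

There is not much to analyze here; we can just look at whether our friends' avatars are on the non-mainstream side.

One thing to note:

the avatar addresses can be read from the cached friend information in friends.json,

but downloading the avatars requires logging in again.

We cache once more here, saving the avatars in the photos/ folder.

get_photo: downloads the friends' avatar images

photo_merge: tiles the avatars into one large image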

import math
import os
import random

import itchat
import matplotlib.pyplot as plt
from PIL import Image

def get_photo(path):
    itchat.auto_login()
    friends = itchat.get_friends(update=True)
    # Save each avatar as 0.jpg, 1.jpg, ... inside path
    for num, friend in enumerate(friends):
        img = itchat.get_head_img(userName=friend["UserName"])
        imgpath = path + str(num) + '.jpg'
        try:
            with open(imgpath, 'wb') as imgf:
                imgf.write(img)
        except Exception as e:
            print("get err:", e)

def photo_merge(friends):
    path = 'photos/'

    if not os.path.exists(path):
        os.mkdir(path)

    # Download the avatars only if the cache folder is empty
    flist = os.listdir(path)
    if len(flist) == 0:
        get_photo(path)
        flist = os.listdir(path)
    line = int(math.sqrt(len(flist)))   # avatars per row
    each_size = int(640 / line)         # edge length of one avatar in pixels
    image = Image.new('RGB', (line * each_size, int(len(flist) / line) * each_size))
    x, y = 0, 0
    # Place the avatars in random order
    poslist = list(range(len(flist)))
    random.shuffle(poslist)
    for pos in poslist:
        try:
            img = Image.open(path + str(pos) + '.jpg')
        except IOError:
            continue  # skip files that are missing or not valid images
        # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
        img = img.resize((each_size, each_size), Image.LANCZOS)
        image.paste(img, (x * each_size, y * each_size))
        x += 1
        if x == line:
            x = 0
            y += 1
    image.save("friends_photo.jpg")
    img = plt.imread("friends_photo.jpg")
    plt.imshow(img)
    plt.axis('off')
    plt.show()

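Note: to keep the tiled image free of black borders and better looking,

some avatars are dropped when tiling.

Suppose we have num photos and n = int(sqrt(num)).

The tiled image has n avatars per row and int(num / n) rows, so it is n x n when num is a perfect square; avatars left over after the last full row are dropped. The image is about 640 pixels wide, a value you can change freely.

Each friend's avatar is resized to (640/n x 640/n), rounded down.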
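The layout arithmetic can be checked without any images. The sketch below mirrors the calculations in photo_merge (columns from the integer square root, rows from integer division); grid_layout is a hypothetical helper written for this illustration.

```python
import math


def grid_layout(num_photos, canvas=640):
    """Mirror the arithmetic in photo_merge: columns, rows, and cell size in pixels."""
    if num_photos < 1:
        raise ValueError("need at least one photo")
    cols = int(math.sqrt(num_photos))
    cell = canvas // cols
    rows = num_photos // cols
    return {
        "cols": cols,
        "rows": rows,
        "cell": cell,
        "size": (cols * cell, rows * cell),
        "dropped": num_photos - cols * rows,
    }


for n in (9, 11, 100):
    print(n, grid_layout(n))
```

With 11 photos, for example, this gives 3 columns and 3 rows, so 2 avatars are left out of the final image.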

The result looks like this:

[Image: friends' avatars tiled into one large image]

It looks rather impressive.

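6. Signature analysis

Here we use the Chinese word-segmentation tool jieba: https://github.com/fxsjy/jieba

We strip stop words from each friend's signature (only a rough, manual pass here, far from complete),

concatenate the signatures and segment the text with jieba,

count how often each resulting word occurs,

and display the words as a word cloud (wordcloud): https://github.com/amueller/word_cloud

The more frequent a word, the larger it is drawn.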
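The cleaning and counting steps can be tried on their own before bringing in jieba and wordcloud. The sketch below rests on assumptions: the emoji markup is assumed to look like an empty span with an "emoji ..." class, str.split() stands in for jieba.cut so that no third-party package is needed, and clean_signature and word_frequencies are hypothetical helpers rather than the approach used in sig_analysis.

```python
import re
from collections import Counter

# Assumed shape of the emoji markup embedded in signatures.
EMOJI_SPAN = re.compile(r'<span class="emoji[^"]*"></span>')


def clean_signature(sig):
    """Drop emoji markup and line breaks, then trim whitespace."""
    return EMOJI_SPAN.sub("", sig).replace("\n", " ").strip()


def word_frequencies(tokens, min_len=2):
    """Count tokens, ignoring those shorter than min_len characters."""
    return Counter(t for t in tokens if len(t) >= min_len)


sig = 'Stay hungry<span class="emoji emoji1f604"></span>\nstay foolish'
cleaned = clean_signature(sig)
print(cleaned)

# str.split() stands in for jieba.cut here, which only works for space-separated text.
freq = word_frequencies(cleaned.lower().split())
print(freq.most_common(1))
```

A dictionary of this kind, mapping each word to its count, is what WordCloud.generate_from_frequencies expects.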

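import re
from collections import Counter

import jieba
import matplotlib.pyplot as plt
import wordcloud

def sig_analysis(friends):
    text = ''
    # Matches leftover emoji codes (1f followed by digits) and stray markup characters
    rule = re.compile(r"1f\d+\w*|[<>/=]")
    for fri in friends:
        sig = fri['Signature'].strip()
        # Skip empty signatures and those that start with emoji markup
        if len(sig) > 0 and not sig.startswith('<span'):
            sig = sig.replace("span", '').replace("class", '').replace("emoji", "").replace("\n", "").replace("00", "")
            sig = rule.sub("", sig)
            text += sig + ' '
    # Segment the combined text and keep only words longer than one character
    jiebatext = list(jieba.cut(text, cut_all=True))
    jiebatext = [x for x in jiebatext if len(x) > 1]

    wordDic = dict(Counter(jiebatext))
    bgimg = plt.imread('bk.jpg')        # local image used as the cloud's mask
    mywordcloud = wordcloud.WordCloud(
            font_path='jdxs.TTF',       # local font file used to draw the words
            background_color="white",
            mask=bgimg,
            width=1200,
            height=1200)
    mywordcloud.generate_from_frequencies(wordDic)
    plt.imshow(mywordcloud)
    plt.axis("off")
    plt.show()
    mywordcloud.to_file("sigimg.png")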

The result looks like this:

[Image: word cloud generated from friends' signatures]

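Summary

Although we fetched only the most basic information about our WeChat friends,

we can already extract useful insights from this unremarkable data:

the number of friends, their gender ratio, the cities the user has lived in, and so on.

It is unsettling when you think about it.

How thoroughly might the information that is not exposed to us already lay us bare?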
