天天看點

python3.6 爬取微信好友清單和個性簽名,繪制個性簽名雲圖

python3.6 爬取微信好友清單和個性簽名,繪制個性簽名雲圖

1. 簡要介紹

本次實驗主要用到下面幾個庫 : 

 1)itchat---用于微信接口,實作生成QR碼,用于微信掃描登陸

 2)re(正則化)---由于微信好友個性簽名含有中英文,本次隻提取中文,需要使用re子產品去除其他無關字元

 3)wordcloud(雲圖)---使用該子產品生成中文雲圖

 4)jieba(中文分詞)--- 号稱最好的中文分詞工具

2. 安裝以上幾個庫

pip install re
pip install jieba
pip install itchat
pip install wordcloud
           

3. 實驗代碼

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# @Time    : 2018/1/19 14:37
# @Author  : Z.C.Wang
# @Email   : 
# @File    : spider_wechat.py
# @Software: PyCharm Community Edition
"""
Description :

"""
import re
import jieba
import itchat
from pandas import DataFrame
import matplotlib.pyplot as plt
from wordcloud import WordCloud, ImageColorGenerator
import numpy as np
import PIL.Image as Image
import pickle

def get_var(var):
    variable = []
    for i in friends:
        value = i[var]
        variable.append(value)
    return variable

def list2str(wordlist):
    string = ' '
    for word in wordlist:
        string = string + ' ' + word
    return string

if __name__ == '__main__':
    itchat.login()
    friends = itchat.get_friends(update=True)
    male = female = other = 0
    for i in friends[1:]:
        sex = i['Sex']
        if sex == 1: male += 1
        elif sex == 2: female += 1
        else: other += 1
    total = len(friends[1:])
    # print('男性好友:%.2f%%' % float(male/total*100))
    # print('女性好友:%.2f%%' % float(female/total*100))
    # print('不明性别好友:%.2f%%' % float(other/total*100))
    Nickname = get_var('NickName')
    Sex = get_var('Sex')
    Province = get_var('Province')
    print(Province)
    City = get_var('City')
    Signature = get_var('Signature')
    data = {'Nickname': Nickname, 'Sex': Sex, 'Province': Province,
            'City': City, 'Signature': Signature}
    pickle.dump(data, open('data.txt', 'wb'))
    frame = DataFrame(data)
    frame.to_csv('info.csv', index=True, encoding='utf-8-sig')

    siglist = []
    for i in friends:
        signature = i['Signature'].strip().replace('spam', '').replace('class', '').replace('emoji', '')
        # rep = re.compile('1f\d+\w*|[<>/=]')
        rep = re.compile("[^\u4e00-\u9fa5^]")
        signature = rep.sub('', signature)
        siglist.append(signature)
    text = ''.join(siglist)
    wordlist = jieba.cut(text, cut_all=True)
    wordlist = list(wordlist)
    String = list2str(wordlist)

    coloring = np.array(Image.open('alice.png'))
    my_wordcloud = WordCloud(background_color='white', max_words=2000,
                             mask=coloring, max_font_size=55, random_state=42,
                             scale=2, font_path=r'C:\Windows\Fonts\simhei.ttf').generate(String)
    image_colors = ImageColorGenerator(coloring)
    plt.imshow(my_wordcloud.recolor(color_func=image_colors))
    plt.imshow(my_wordcloud)
    plt.axis('off')
    plt.show()
           

4. 實驗結果

python3.6 爬取微信好友清單和個性簽名,繪制個性簽名雲圖
python3.6 爬取微信好友清單和個性簽名,繪制個性簽名雲圖