
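Cleaning, Aggregating, and Scatter-Plotting PM2.5 Data (with Source Data Links)

Related data:

Beijing air quality (2012-2018)

Click to open the link

National air quality historical data | Beijing air quality historical data (updated weekly)

Click to open the link

Related programs:

Program: fill in the missing values in each CSV file, using each station's median for that day, and write the result to a CSV file with three columns: date, type, and mean.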

# -*- coding: UTF-8 -*-
import csv
import datetime

import pandas as pd

DATA_DIR = './beijing_20180101-20180324/'


def writer_data_extra(date, type_, mean):
    # Append one (date, type, mean) row to the summary CSV.
    with open(DATA_DIR + 'aqi1.csv', 'a', newline='') as csvfile:
        csv.writer(csvfile).writerow([date, type_, mean])


def run_extra():
    begin = datetime.date(2018, 1, 1)
    end = datetime.date(2018, 3, 24)
    d = begin
    delta = datetime.timedelta(days=1)
    q = 0
    while d <= end:
        num = d.strftime('%m%d')
        df = pd.read_csv(DATA_DIR + 'beijing_extra_2018' + num + '.csv')
        for j in range(0, 8, 2):
            # Take every 8th row starting at row j. This assumes the rows
            # cycle through 8 types, so each slice holds a single type.
            nf = df[j::8].copy()
            # Odd-numbered items: x[::2]
            # Even-numbered items: x[1::2]
            stations = nf.columns[3:]
            # Fill each station's missing values with that station's own
            # median for the day (one median per column).
            nf[stations] = nf[stations].fillna(nf[stations].median())
            date = nf['date'].iloc[0]
            type_ = nf['type'].iloc[0]
            # Average the per-station daily means.
            mean = round(nf[stations].mean().mean(), 1)
            # print('date:{} type:{} val:{}'.format(date, type_, mean))
            writer_data_extra(date, type_, mean)
            q += 1
            if q % 10 == 0:
                print("Transcribing...")
        d += delta
    print("**********Transcription complete**************")


if __name__ == '__main__':
    run_extra()
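
To make the median-filling step easier to check by hand, here is a minimal sketch on made-up data. The station names (`station_a`, `station_b`), the values, and the helper `fill_with_column_median` are all hypothetical; only the column layout (date, hour, type, then one column per station) follows the program above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: three hourly readings at two made-up stations.
toy = pd.DataFrame({
    'date': [20180101, 20180101, 20180101],
    'hour': [0, 1, 2],
    'type': ['NO2', 'NO2', 'NO2'],
    'station_a': [10.0, np.nan, 30.0],
    'station_b': [np.nan, 4.0, 8.0],
})


def fill_with_column_median(frame, first_value_col=3):
    """Return a copy with NaNs in each value column replaced by that column's median."""
    out = frame.copy()
    cols = out.columns[first_value_col:]
    # median() yields one value per column; fillna aligns them by column name.
    out[cols] = out[cols].fillna(out[cols].median())
    return out


filled = fill_with_column_median(toy)
# Mean of the per-station means, rounded to one decimal as in the program above.
daily_mean = round(filled[filled.columns[3:]].mean().mean(), 1)
print(filled)
print(daily_mean)
```

Working on a copy leaves the input frame untouched, which also avoids pandas' chained-assignment warning when the input is a slice of a larger frame.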
           

Program: combine the two tables on their shared column (date)

import csv

import pandas as pd

DATA_DIR = './beijing_20180101-20180324/'


def writer_data_all(date, type_, val):
    # Append one (date, type, val) row to aqi_all.csv.
    # Note: main() below does not call this helper.
    with open(DATA_DIR + 'aqi_all.csv', 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        # writer.writerow(('date', 'type', 'val'))
        writer.writerow([date, type_, val])


def main():
    # Both files need a header row that includes a 'date' column.
    df1 = pd.read_csv(DATA_DIR + 'aqi1.csv')
    df2 = pd.read_csv(DATA_DIR + 'aqi2.csv')
    # Stack the two tables, then sort by date so rows for the same day sit together.
    combined = pd.concat([df1, df2])
    combined = combined.sort_values(by='date', ascending=True).reset_index(drop=True)
    print(combined.T)


if __name__ == '__main__':
    main()
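
The scatter-plot program reads a `data.csv` with one column per indicator, while the tables handled here are in long form (date, type, value). The original post does not show how that file was produced. One possible way, assuming each (date, type) pair occurs exactly once, is a pivot. The sample rows and the helper name `long_to_wide` below are hypothetical.

```python
import pandas as pd

# Hypothetical long-format rows in a (date, type, val) layout.
long_df = pd.DataFrame({
    'date': [20180101, 20180101, 20180102, 20180102],
    'type': ['NO2', 'PM2.5', 'NO2', 'PM2.5'],
    'val': [41.5, 33.0, 58.2, 70.1],
})


def long_to_wide(frame):
    """Pivot (date, type, val) rows into one row per date and one column per type."""
    # pivot raises ValueError if a (date, type) pair appears more than once.
    wide = frame.pivot(index='date', columns='type', values='val')
    wide.columns.name = None  # drop the leftover 'type' label on the column axis
    return wide.reset_index()


wide_df = long_to_wide(long_df)
print(wide_df)
# wide_df.to_csv('./data.csv', index=False)  # uncomment to write the file
```

If duplicates are possible, `pivot_table` with an explicit aggregation function would be the safer choice.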
           

Program: draw the scatter plots

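import matplotlib.pyplot as plt
import pandas as pd


def main():
    df = pd.read_csv('./data.csv')
    items = ['NO2', 'SO2', 'O3', 'CO', 'PM10', 'AQI']
    pm2_5 = df['PM2.5']
    for name in items:
        # Start a fresh figure so points from earlier plots do not carry over.
        plt.figure()
        # One scatter plot per indicator, with PM2.5 on the y-axis.
        plt.scatter(df[name], pm2_5)
        plt.title(name + ' And PM2.5')
        plt.xlabel(name)
        plt.ylabel('PM2.5')
        # Save before calling show(), which may clear the figure.
        plt.savefig('./' + name + 'AndPM2.5.png')
        plt.show()


if __name__ == '__main__':
    main()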
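
A scatter plot shows a relationship only visually. As a rough numeric companion, which is not part of the original post, the Pearson correlation of each indicator with PM2.5 can be computed with `DataFrame.corr`. The values below are made up for illustration and are not taken from the Beijing data.

```python
import pandas as pd

# Made-up values purely for illustration; not taken from the Beijing data set.
demo = pd.DataFrame({
    'NO2': [20.0, 40.0, 60.0, 80.0],
    'O3': [90.0, 60.0, 50.0, 20.0],
    'PM2.5': [15.0, 35.0, 50.0, 75.0],
})

# Pearson correlation of every other column with PM2.5.
corr_with_pm25 = demo.corr()['PM2.5'].drop('PM2.5').round(2)
print(corr_with_pm25)
```

Values close to 1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one. Correlation only captures linear trends, so the plots remain worth inspecting.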
           

Results:

[Figures: six scatter plots of NO2, SO2, O3, CO, PM10, and AQI against PM2.5]
