
Fetching Province/City/District Administrative Division Data and Downloading GeoJson Maps

Contents

    • 1. Background
    • 2. Fetching administrative division data
    • 3. Fetching GeoJson data

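1. Background

The project needs a three-level province/city/district selector. The data found online did not match the latest administrative divisions, and its completeness was hard to verify.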

When building a regional distribution chart with echarts, you also need to supply a map of the region in geoJson format.
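
For readers unfamiliar with the format, the sketch below builds a tiny GeoJSON FeatureCollection in Python and lists the region names it contains. The polygon is a hand-written placeholder rather than a real boundary, and the `adcode` property is only an assumption about what a map file might carry. As far as I know, echarts matches map regions to data through each feature's `properties.name` by default, which is why the helper reads that field.

```python
import json

# A hand-written, simplified FeatureCollection. The square below is a
# placeholder polygon, not a real administrative boundary.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "洛陽市", "adcode": 410300},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [112.0, 34.0], [113.0, 34.0],
                    [113.0, 35.0], [112.0, 35.0],
                    [112.0, 34.0],
                ]],
            },
        }
    ],
}


def region_names(geojson: dict) -> list:
    """Return the `properties.name` of every feature in a FeatureCollection."""
    if geojson.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    return [f["properties"]["name"] for f in geojson.get("features", [])]


print(region_names(sample_geojson))
# Serialize without escaping non-ASCII names, as the scripts in this post do.
print(json.dumps(sample_geojson, ensure_ascii=False)[:60])
```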

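2. Fetching administrative division data

The Amap (高德) Open Platform offers a rich set of data APIs. Administrative division information is available through the following endpoint: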

https://restapi.amap.com/v3/config/district?keywords=中國&subdistrict=3&key=5b138cc729f37d29702ff904ca9cedeb
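
As a rough sketch, the same request can be assembled from a parameter dictionary so that the Chinese keyword is percent-encoded correctly and the key is easy to swap. The helper names `build_url`, `extract_districts`, and `fetch_districts` are hypothetical, and the check on `status == "1"` is an assumption about how the API signals success. Only the standard library is used here so the sketch runs without extra packages.

```python
import json
import urllib.parse
import urllib.request

AMAP_DISTRICT_URL = "https://restapi.amap.com/v3/config/district"


def build_url(key, keywords="中國", subdistrict=3):
    """Build the request URL from the three query parameters shown above."""
    query = urllib.parse.urlencode(
        {"keywords": keywords, "subdistrict": subdistrict, "key": key}
    )
    return "{0}?{1}".format(AMAP_DISTRICT_URL, query)


def extract_districts(payload):
    """Return the top-level district list, or raise if the API reports an error."""
    # Assumption: a successful response carries status == "1".
    if payload.get("status") != "1":
        raise RuntimeError("Amap API error: {0}".format(payload.get("info")))
    return payload.get("districts", [])


def fetch_districts(key):
    """Perform the request (needs network access and a valid key)."""
    with urllib.request.urlopen(build_url(key), timeout=10) as resp:
        return extract_districts(json.load(resp))


# Usage (not executed here because it needs a real key):
# districts = fetch_districts("<your-amap-key>")
```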

The endpoint returns data nested by administrative level. To simplify later processing, the data is converted into a flat list.

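The code below flattens the administrative divisions into a list, adds id and parent_id fields, and saves the result both as a JSON array (area_dict.json) and as a MySQL script (area_dict.sql).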

#-*-coding:UTF-8-*-
"""
Author: Gray Snail
Date: 2020-06-30

Fetch the latest administrative divisions
Data is retrieved through the Amap API
https://restapi.amap.com/v3/config/district?keywords=中國&subdistrict=3&key=5b138cc729f37d29702ff904ca9cedeb

"""
import json
import requests

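def parse_district(districtObj : dict, idx=1, parent_id=0):
    """Flatten the nested district tree into a list, assigning sequential ids and parent ids."""
    res = []
    if 'name' in districtObj:
        # Street-level entries are skipped
        if districtObj['level'] == 'street':
            return res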
        
        lng, lat = districtCenter(districtObj['center'])
        level = districtLevel(districtObj['level'])
        citycode = districtObj['citycode'] if isinstance(districtObj['citycode'], str) else ''
        
        # {"citycode":"0379","adcode":"410300","name":"洛陽市","center":"112.434468,34.663041","level":"city"}
        # idx, districtObj['adcode'], districtObj['name'], level, citycode, lng, lat, parent_id
        item = {
            'id'        : idx,
            'adcode'    : districtObj['adcode'],
            'name'      : districtObj['name'],
            'level'     : level,
            'citycode'  : citycode,
            'lng'       : lng,
            'lat'       : lat,
            'parent_id' : parent_id
        }
        res.append(item)
        parent_id = idx
        idx = idx + 1

    if isinstance(districtObj.get('districts'), list) and len(districtObj['districts']) > 0:
        for subitem in districtObj['districts']:
            subs = parse_district(subitem, idx, parent_id)
            res += subs
            idx += len(subs)
    return res

def districtLevel(levelStr):
    map_val = {
        'country': 0,
        'province': 1,
        'city': 2,
        'district': 3
    }
    return map_val[levelStr]

def districtCenter(center):
    items = center.split(',')
    return float(items[0]), float(items[1])

# Save the result as a JSON array
def saveJson(data):
    with open('area_dict.json', 'w', encoding='utf-8') as fp:
        json.dump(data, fp, ensure_ascii=False, indent=4)
    print('Save json file: area_dict.json')

# Save as a SQL script
def saveSqlFile(data, includeCreate=True):
    # +--------------+-------------+------+-----+---------+----------------+
    # | Field        | Type        | Null | Key | Default | Extra          |
    # +--------------+-------------+------+-----+---------+----------------+
    # | area_id      | int(11)     | NO   | PRI | NULL    | auto_increment |
    # | area_code    | char(6)     | NO   | MUL | NULL    |                |
    # | area_name    | varchar(20) | NO   | MUL | NULL    |                |
    # | level        | tinyint(1)  | NO   | MUL | 0       |                |
    # | city_code    | char(4)     | YES  |     | NULL    |                |
    # | lng          | int(11)     | YES  |     | 0       |                |
    # | lat          | int(11)     | YES  |     | 0       |                |
    # | parent_id    | int(11)     | NO   | MUL | -1      |                |
    # +--------------+-------------+------+-----+---------+----------------+
    createCode = """
CREATE TABLE `area_dict` (
    `area_id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Area id',
    `area_code` char(6) NOT NULL COMMENT 'Area code',
    `area_name` varchar(20) NOT NULL COMMENT 'Area name',
    `level` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'Area level (0: country, 1: province, 2: city, 3: district, 4: street)',
    `city_code` char(4) DEFAULT NULL COMMENT 'City code',
    `lng` int(11) DEFAULT '0' COMMENT 'Longitude of the area center (degrees * 1e6)',
    `lat` int(11) DEFAULT '0' COMMENT 'Latitude of the area center (degrees * 1e6)',
    `parent_id` int(11) NOT NULL DEFAULT '-1' COMMENT 'Parent area id',
    PRIMARY KEY (`area_id`),
    KEY `areaCode` (`area_code`),
    KEY `parentId` (`parent_id`),
    KEY `level` (`level`),
    KEY `areaName` (`area_name`)
) ENGINE=InnoDB AUTO_INCREMENT=3261 DEFAULT CHARSET=utf8 COMMENT='Area code table';
"""
    with open('area_dict.sql', 'w', encoding='utf-8') as fp:
        if includeCreate:
            fp.write(createCode)
        for item in data:
            # Store coordinates as integers (degrees * 1e6). Work on a copy so the
            # caller's list is not modified, and round to avoid float truncation.
            row = dict(item, lng=round(item['lng'] * 1e6), lat=round(item['lat'] * 1e6))
            sql = "INSERT INTO area_dict(`area_id`,`area_code`,`area_name`,`level`,`city_code`,`lng`,`lat`,`parent_id`) " + \
                "VALUES({id},'{adcode}','{name}',{level},'{citycode}',{lng},{lat},{parent_id});\n".format(**row)

            fp.write(sql)
            
    print('Save sql file: area_dict.sql')

if __name__ == "__main__":
    url = 'https://restapi.amap.com/v3/config/district?keywords=中國&subdistrict=3&key=5b138cc729f37d29702ff904ca9cedeb'

    response = requests.get(url)
    if response.ok and response.status_code == 200:
        data = response.json()
        data = parse_district(data)
        print('Download data successful, total:{0}!'.format(len(data)))
        saveJson(data)
        saveSqlFile(data)
    else:
        print('Request error!')
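

Since each row carries id and parent_id, the flat list can be turned back into a nested structure when a cascading province/city/district selector needs one. The following is a minimal sketch under the assumption that root rows have parent_id 0, as in the script above; `build_tree` and the sample rows are hypothetical and only mirror a subset of the fields in area_dict.json.

```python
from collections import defaultdict


def build_tree(rows: list) -> list:
    """Nest flat rows under their parents using the id / parent_id fields."""
    children = defaultdict(list)
    for row in rows:
        children[row["parent_id"]].append(row)

    def attach(parent_id):
        # Recursively collect the rows whose parent_id matches.
        return [
            {"id": r["id"], "name": r["name"], "children": attach(r["id"])}
            for r in children.get(parent_id, [])
        ]

    # Rows whose parent_id is 0 are treated as roots (the country node here).
    return attach(0)


# Hand-made sample in the same shape as area_dict.json (subset of fields).
sample_rows = [
    {"id": 1, "name": "中華人民共和國", "parent_id": 0},
    {"id": 2, "name": "河南省", "parent_id": 1},
    {"id": 3, "name": "洛陽市", "parent_id": 2},
    {"id": 4, "name": "鄭州市", "parent_id": 2},
]
tree = build_tree(sample_rows)
print(tree[0]["children"][0]["name"])
```

Grouping rows by parent_id first keeps the construction linear in the number of rows, which matters little for a few thousand areas but avoids rescanning the list for every node.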
           

3. Fetching GeoJson data

Data source: Alibaba Cloud datav. The data files are named by area code.

Using the area_dict.json file saved in the previous step, the script automatically downloads the corresponding GeoJson files.

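Because administrative divisions are updated over time, the data on the two platforms may differ: the same area can carry different area codes, in which case the map for that area cannot be downloaded. In the code, errorCodes records the area codes that failed to download. As of 2020-07-01, fewer than 30 entries failed.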

#-*-coding:UTF-8-*-
"""
Author: Gray Snail
Date: 2020-06-30

Download GeoJson map data
Based on Alibaba Cloud datav
http://datav.aliyun.com/tools/atlas
"""
import requests
import json
import os

def loadDistrict(filename):
    # {"citycode":"0379","adcode":"410300","name":"洛陽市","center":"112.434468,34.663041","level":"city"}
    data = []
    with open(filename, 'r', encoding='utf-8') as fp:
        data = json.load(fp)
    return data

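def saveGeoJson(areaCode, force=False):
    """Download the GeoJson for one area code; return the code on failure, None otherwise."""
    saveName = 'geo/{0}.json'.format(areaCode)
    # Skip files that were already downloaded unless force is set
    if not force and os.path.isfile(saveName):
        return None

    # Codes ending in '00' use the _full URL; all other codes use the plain one
    baseUrl = 'https://geo.datav.aliyun.com/areas_v2/bound/{0}_full.json'
    baseUrl2 = 'https://geo.datav.aliyun.com/areas_v2/bound/{0}.json'

    if areaCode[-2:] == '00':
        url = baseUrl.format(areaCode)
    else:
        url = baseUrl2.format(areaCode)
    print(url)

    response = requests.get(url)
    if response.ok and response.status_code == 200:
        res_json = response.json()
        # Make sure the output directory exists before writing
        os.makedirs('geo', exist_ok=True)
        with open(saveName, 'w', encoding='utf-8') as fp:
            json.dump(res_json, fp, ensure_ascii=False)
        return None
    else:
        return areaCode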

if __name__ == "__main__":
    districts = loadDistrict('area_dict.json')

    errorCodes = []
    for district in districts:
        code = saveGeoJson(district['adcode'])
        if code is not None:
            errorCodes.append(code)
    print(errorCodes)
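

One possible way to shrink the list of failed codes is to try the other URL pattern before giving up. The sketch below is only an illustration: `candidate_urls` and `download_with_fallback` are hypothetical helpers, the two URL patterns are the ones used in the script above, and there is no guarantee that the alternate file exists for any given code. The download itself is passed in as a callable so the logic can be exercised without network access.

```python
BOUND_URL = 'https://geo.datav.aliyun.com/areas_v2/bound/{0}.json'
FULL_URL = 'https://geo.datav.aliyun.com/areas_v2/bound/{0}_full.json'


def candidate_urls(area_code: str) -> list:
    """Return both URL patterns for an area code, preferred one first."""
    if area_code.endswith('00'):
        return [FULL_URL.format(area_code), BOUND_URL.format(area_code)]
    return [BOUND_URL.format(area_code), FULL_URL.format(area_code)]


def download_with_fallback(area_code: str, fetch) -> bool:
    """Try each candidate URL in order and stop at the first success.

    `fetch(url)` is a caller-supplied callable that returns True when the
    download succeeded and False otherwise.
    """
    return any(fetch(url) for url in candidate_urls(area_code))


# Usage with a fake fetch that only "finds" the plain boundary file.
attempted = []


def fake_fetch(url):
    attempted.append(url)
    return url.endswith('410300.json')


print(download_with_fallback('410300', fake_fetch))
print(attempted)
```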