Python: A NetEase Cloud Music Search and Download Script

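Preface
Development Environment and Required Libraries
- Development environment
- Required libraries
Approach
- Search
- Download
- Save
Implementation
- Search
- Download and save
- Main function and helper functions
Results
Conclusion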

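Preface

I previously wrote about how to download songs from the NetEase Cloud Music web page, but the process felt too cumbersome, so I decided to write a script that handles both searching and downloading.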

Development Environment and Required Libraries

Development environment

Windows 10

Python 3.7

Required libraries

requests

re

time

selenium

prettytable

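Approach

Search

Use a crawler to visit https://music.163.com/#/search/m/?s=<song name>&type=1 and obtain the song list, the singers, and the song IDs (needed for downloading).

Download

When the NetEase Cloud Music web page plays a song, it fetches the audio from http://music.163.com/song/media/outer/url?id=<song ID>.mp3, which is why the song ID is needed.

Save

Simply write the fetched MP3 data to a file.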
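
The two URLs described above can be built with small helpers. This is a minimal sketch, not part of the original script: the function names are hypothetical, the song ID 12345 is a made-up example, and it assumes the site accepts a percent-encoded song name in the URL fragment.

```python
import re
from urllib.parse import quote

SEARCH_URL = "https://music.163.com/#/search/m/?s={query}&type=1"
DOWNLOAD_URL = "http://music.163.com/song/media/outer/url?id={song_id}.mp3"


def build_search_url(name):
    """Return the search page URL, percent-encoding the song name."""
    # safe="" also encodes "/" and "&", which would otherwise alter the URL
    return SEARCH_URL.format(query=quote(name, safe=""))


def build_download_url(song_id):
    """Return the outer media URL for a numeric song ID."""
    song_id = str(song_id).strip()
    if not re.fullmatch(r"[0-9]+", song_id):
        raise ValueError("song ID must be numeric: %r" % song_id)
    return DOWNLOAD_URL.format(song_id=song_id)


print(build_search_url("hello world"))
print(build_download_url(12345))  # 12345 is a hypothetical ID
```

Encoding the name keeps characters such as spaces or "&" from changing the meaning of the URL, and rejecting non-numeric IDs catches typos before any request is sent.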

Implementation

Imports

from selenium.webdriver.firefox.options import Options
import requests
import re
from selenium import webdriver
from time import sleep
from prettytable import PrettyTable

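Search

I originally wanted to scrape the song list with the requests library, but the page embeds an iframe, so the list could not be retrieved that way (or maybe I am just not skilled enough). I therefore used selenium instead.

I use Firefox. First, configure selenium to run without showing a GUI: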

# Set the browser options
firefox_options = Options()
firefox_options.add_argument("--headless")
firefox_options.add_argument("--disable-gpu")

The music search function:

# Search for music
def SearchMusic(name):
	url_search = "https://music.163.com/#/search/m/?s="+name+"&type=1"
	# Initialize the browser (newer Selenium releases pass the driver path through a Service object instead of executable_path)
	browser = webdriver.Firefox(executable_path="geckodriver.exe", options=firefox_options)
	# Load the search page
	browser.get(url=url_search)
	# Switch into the iframe that holds the results
	browser.switch_to.frame("g_iframe")
	# Give the results a moment to render
	sleep(0.5)
	# Grab the rendered HTML of the iframe
	page_text = browser.execute_script("return document.documentElement.outerHTML")
	# Quit the browser
	browser.quit()
	# Regular expressions for the song ID, song name, and singer cell
	re1 = '<a.*?id="([0-9]*?)"'
	re2 = '<b title="(.*?)">.*?<span class="s-fc7">'
	re3 = 'class="td w1"><div class="text"><a href=".*?">(.*?)</div></div>'

	id_list = re.findall(re1, page_text, re.M)[::2]
	song_list = re.findall(re2, page_text, re.M)
	singer_list = re.findall(re3, page_text, re.M)

	# Each entry is (song name, singer, song ID)
	total_list = list(zip(song_list, singer_list, id_list))

	# Command-line table
	table = PrettyTable(["No.", "Song", "Singer", "Song ID"])

	for i in range(len(total_list)):
		# Handle multiple singers
		# The handling is incomplete and may contain bugs
		if "<a href=" in total_list[i][1]:
			re4 = '(.*?)</a>'
			temp = re.findall(re4, total_list[i][1], re.M)
			re5 = '<a href=".*?">(.*?)</a>'
			singer = ""
			for s in temp:
				t = re.findall(re5, s+"</a>", re.M)
				if t == []:
					singer += s + "/"
				else:
					singer += t[0] + "/"
			total_list[i] = (total_list[i][0], singer[:-1], total_list[i][2])
		else:
			# Single singer: drop the trailing </a>
			if total_list[i][1][-1] != ">":
				temp = total_list[i][1].replace("</a>","")
			else:
				temp = total_list[i][1][:-4]
			total_list[i] = (total_list[i][0], temp, total_list[i][2])
		# Add the row to the table
		table.add_row([str(i+1), total_list[i][0], total_list[i][1], total_list[i][2]])
	# Print the table
	print(table,"\n")
	return total_list

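The WebDriver for Firefox is geckodriver.exe. It can be downloaded online and placed in the script's directory.

The regular expressions in the middle of the function match the song name, the singer, and the song ID. When a song has several singers, a single regular expression is not enough. I only got it working after many attempts and cannot guarantee that it always works, so there may be bugs.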
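
One way to make the multi-singer case less fragile is to strip every HTML tag from the captured singer cell instead of matching each link separately. The sketch below is an alternative to the approach in the function above, not the original method; the helper name clean_artists and the sample fragments are hypothetical and only mimic the shape of text that the singer regular expression captures.

```python
import html
import re

# Matches any HTML tag, opening or closing
TAG_RE = re.compile(r"<[^>]+>")


def clean_artists(fragment):
    """Strip HTML tags from a captured singer cell and return plain text."""
    text = TAG_RE.sub("", fragment)
    # Decode entities such as &amp; after the tags are gone
    return html.unescape(text).strip()


# Hypothetical fragments in the shape captured from the page
print(clean_artists('Alice</a>'))
print(clean_artists('Alice</a>/<a href="/artist?id=2">Bob</a>'))
```

Because the "/" separators between singers are plain text in the cell, removing the tags leaves the singers already joined by "/", the same format the original loop builds by hand.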

To print the list, I use prettytable, which makes the command-line output look nicer.

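Download and save

Remember to set the request headers when downloading; otherwise the server returns a response saying that network access is too frequent.

The download-and-save function: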

# Download the music
def GetMusic(music_id):
	# Browser-like headers; without them the server may reject the request
	head = {"User-Agent":'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50',
			"Referer": "https://music.163.com/"}
	url_music = "http://music.163.com/song/media/outer/url?id="+music_id+".mp3"
	html = requests.get(url_music, headers=head)
	# Save the response body under the song ID
	with open(music_id+".mp3", "wb") as f:
		f.write(html.content)
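
Since a rejected request still returns a body, the function above would save an error response as a .mp3 file. The following is a heuristic sketch for detecting that case before writing the file. The helper name looks_like_mp3 is hypothetical, and the check assumes that an error response arrives as text (for example HTML) rather than as audio data.

```python
def looks_like_mp3(content_type, body):
    """Heuristically decide whether a response is MP3 audio, not an error page."""
    # A text/* content type signals an HTML or plain-text error response
    if content_type and content_type.lower().startswith("text/"):
        return False
    # Many MP3 files start with an ID3v2 tag
    if body[:3] == b"ID3":
        return True
    # Otherwise look for an MPEG audio frame sync: eleven set bits
    return len(body) >= 2 and body[0] == 0xFF and (body[1] & 0xE0) == 0xE0


print(looks_like_mp3("audio/mpeg", b"ID3\x03\x00"))
print(looks_like_mp3("text/html; charset=utf-8", b"<!DOCTYPE html>"))
```

Inside GetMusic, it could be called as looks_like_mp3(html.headers.get("Content-Type"), html.content) before the file is opened, printing a message instead of saving when it returns False.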

The file is saved locally under the song ID, so when downloading several songs you need to keep track of which ID belongs to which song.
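
To avoid having to remember which ID is which song, the file name could be built from the search result instead. This is a sketch under assumptions: make_filename is a hypothetical helper, the naming pattern is an arbitrary choice, and the character list covers the characters Windows rejects in file names.

```python
import re

# Characters that Windows does not allow in file names
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def make_filename(song, singer, music_id):
    """Build 'song - singer (id).mp3', replacing characters invalid on Windows."""
    stem = "%s - %s (%s)" % (song, singer, music_id)
    # Replace invalid characters, then trim spaces and dots from both ends
    stem = INVALID_CHARS.sub("_", stem).strip(" .")
    return stem + ".mp3"


# Hypothetical entry in the (song, singer, id) shape returned by SearchMusic
print(make_filename("Hello", "Adele", "42"))
```

Keeping the ID in the name means two songs with the same title and singer still get distinct files.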

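Main function and helper functions

# Print the menu
def menu():
	print("NetEase Cloud Music search and download")
	print("1. Search for music")
	print("2. Download music (requires the song ID)")
	print("3. Quit")
	print("Note: searching for music gives you the song ID\n")

# Main function
def main():
	while True:
		menu()
		num = input("Enter an option (1-3): ")
		try:
			num = int(num)
		except ValueError:
			print("Invalid option!\n")
			continue
		if num not in [1,2,3]:
			print("Invalid option!\n")
			continue

		if num == 1:
			name = input("\nEnter a song name (q to go back): ")
			if name == "q":
				continue
			total_list = SearchMusic(name)
		elif num == 2:
			music_id = input("\nEnter a song ID (q to go back): ")
			if music_id == "q":
				continue
			print("Downloading")
			GetMusic(music_id)
			print("Download complete\n")
		elif num == 3:
			break

# Run the menu loop only when executed as a script
if __name__ == "__main__":
	main()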

The menu function prints the menu.

The main function runs in a while loop until option 3 is chosen.

Results

Files:

[Screenshot: files]

Menu and search:

[Screenshot: menu and search]

Download:

[Screenshot: download]

Conclusion

I hope you like it. I will put the code and geckodriver.exe on GitHub, and I would appreciate a star.
