Paper
De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes

image.png
Part of the data and code is public, so today we will try to reproduce Figure S29 from the paper's supplementary material.
image.png
This heatmap was drawn with Python's seaborn module. The plotting code is walked through below.
Import the required modules
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
Read in the dataset
A screenshot of part of the data is shown below
image.png
file = "matrix-b73-ref.csv"
# Desired order of the maize lines, used for both rows and columns
order = ["B97", "Ky21", "M162W",
         "Ms71", "Oh43", "Oh7B", "M37W", "Mo18W", "Tx303", "HP301", "P39",
         "Il14H", "CML52", "CML69", "CML103", "CML228", "CML247", "CML277",
         "CML322", "CML333", "Ki3", "Ki11", "NC350", "NC358", "Tzi8"]
# Use the first column as row labels, then put the rows in the desired order
b73Ref = pd.read_csv(file, index_col=0).reindex(order)
# Put the columns in the same order
b73Ref = b73Ref[order]
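Here, index_col=0 uses the first column of the dataset as the row names, the reindex() function puts the rows in the order you specify, and indexing the DataFrame with a list of column names ([[...]]) puts the columns in the specified order.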
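To make the row and column reordering concrete, here is a minimal sketch on a toy 3x3 table. The values and the shortened list of line names are made up for illustration; they are not taken from matrix-b73-ref.csv.

```python
import pandas as pd

# Toy matrix standing in for the real CSV (values are invented).
toy = pd.DataFrame(
    [[0, 1, 2], [3, 4, 5], [6, 7, 8]],
    index=["Ky21", "B97", "Oh43"],
    columns=["Ky21", "B97", "Oh43"],
)

order = ["B97", "Ky21", "Oh43"]

rows_sorted = toy.reindex(order)   # reorder the rows by label
both_sorted = rows_sorted[order]   # reorder the columns by label

# A label that is missing from the index yields a row of NaN with reindex().
with_missing = toy.reindex(["B97", "NotALine"])

print(both_sorted)
print(with_missing)
```

One thing worth keeping in mind: reindex() does not complain about a misspelled name, it just returns an all-NaN row, so it is worth checking the result with head() as in the next step.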
View the first 5 rows of the dataset
b73Ref.head(5)
The most basic heatmap
sns.heatmap(b73Ref)
image.png
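Keep only the lower triangle
The data read directly from the file are integers, and we need to convert them to floating point. The code provided with the paper does not convert the data type, so running it exactly as given may raise an error. This may be due to a different Python version; I am currently using Python 3.8.3.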
# b73Ref only contains the 25 line columns, so the whole frame can be cast at once
df = b73Ref.astype(np.float64)
# True marks the cells to hide: the upper triangle, including the diagonal
mask = np.triu(np.ones_like(df, dtype=bool))
sns.heatmap(df, mask=mask)
image.png
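To see what the mask looks like, here is a small sketch on a 4x4 array of made-up values. The mask argument of sns.heatmap hides every cell where the mask is True, so np.triu on an all-True array hides the upper triangle together with the diagonal. The k=1 variant is not used in this post; it is only shown as a way to keep the diagonal visible.

```python
import numpy as np

n = 4
values = np.arange(n * n, dtype=float).reshape(n, n)  # made-up data

# Upper triangle plus the diagonal is True (hidden by sns.heatmap).
mask_with_diag = np.triu(np.ones_like(values, dtype=bool))

# Only cells strictly above the diagonal are True, so the diagonal stays visible.
mask_keep_diag = np.triu(np.ones_like(values, dtype=bool), k=1)

print(mask_with_diag.astype(int))
print(mask_keep_diag.astype(int))
```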
Change the color scheme
cmap = sns.diverging_palette(370, 120, n=80, as_cmap=True)
sns.heatmap(df, mask=mask, cmap=cmap, robust=True,
square=True, linewidths=.5, cbar_kws={"shrink": .5})
image.png
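The robust=True argument tells seaborn to choose the color range from quantiles rather than from the smallest and largest values, so a few extreme cells do not wash out the rest of the map. The sketch below illustrates the idea with plain NumPy on made-up numbers; the 2nd and 98th percentiles are my assumption about the cutoffs, so check the seaborn source if the exact limits matter.

```python
import numpy as np

# Made-up values: 0..98 plus one extreme outlier.
values = np.append(np.arange(99, dtype=float), 10000.0)

# Range driven by the extremes: the outlier stretches the color scale.
full_range = (values.min(), values.max())

# Quantile-based range (assumed cutoffs): the outlier is ignored.
robust_range = (np.percentile(values, 2), np.percentile(values, 98))

print("min/max range:", full_range)
print("robust range: ", robust_range)
```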
Add guide lines and remove the y-axis title
f, ax = plt.subplots(figsize=(14, 14))
cmap = sns.diverging_palette(370, 120, n=80, as_cmap=True)
sns.heatmap(df, mask=mask, cmap=cmap, robust=True,
square=True, linewidths=.5, cbar_kws={"shrink": .5})
plt.ylabel('')
ax.axvline(x=6, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.76)
ax.axvline(x=9, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.64)
ax.axvline(x=10, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.6)
ax.axvline(x=12, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.52)
ax.axhline(y=6, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.24)
ax.axhline(y=9, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.36)
ax.axhline(y=10, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.4)
ax.axhline(y=12, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.48)
image.png
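The ymax and xmax values in the axvline and axhline calls look like magic numbers, but they appear to follow from the matrix size. Matplotlib measures ymax and xmax as fractions of the axes (0 at the bottom or left, 1 at the top or right), and the heatmap draws row 0 at the top. With 25 lines, a vertical line at x=k that should stop at the diagonal needs ymax = (25 - k) / 25, and a horizontal line at y=k needs xmax = k / 25. The helper name guide_extents below is hypothetical, just for this sketch.

```python
n = 25                        # number of rows and columns in the matrix
boundaries = [6, 9, 10, 12]   # positions of the guide lines in the calls above

def guide_extents(k, n):
    """Return (ymax for axvline at x=k, xmax for axhline at y=k) as axes fractions."""
    ymax = (n - k) / n   # the y-axis is inverted, so measure up from the bottom
    xmax = k / n         # measure across from the left edge
    return round(ymax, 2), round(xmax, 2)

extents = {k: guide_extents(k, n) for k in boundaries}

for k, (ymax, xmax) in extents.items():
    print(f"k={k}: ymax={ymax}, xmax={xmax}")
```

These reproduce the hard-coded values (0.76, 0.64, 0.6, 0.52 and 0.24, 0.36, 0.4, 0.48), so the same formula should carry over if the number of lines changes.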
Color the axis tick labels
f, ax = plt.subplots(figsize=(14, 14))
cmap = sns.diverging_palette(370, 120, n=80, as_cmap=True)
sns.heatmap(df, mask=mask, cmap=cmap, robust=True,
square=True, linewidths=.5, cbar_kws={"shrink": .5})
plt.ylabel('')
ax.axvline(x=6, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.76)
ax.axvline(x=9, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.64)
ax.axvline(x=10, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.6)
ax.axvline(x=12, color ='blue', lw = 1.5, alpha = 0.75, ymax = 0.52)
ax.axhline(y=6, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.24)
ax.axhline(y=9, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.36)
ax.axhline(y=10, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.4)
ax.axhline(y=12, color ='black', lw = 1.5, alpha = 0.75, xmax = 0.48)
mycol = ["#4169E1", "#4169E1", "#4169E1", "#4169E1", "#4169E1", "#4169E1", "#787878", "#787878", "#787878", "#DA70D6", "#FF4500", "#FF4500", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32", "#32CD32"]
for tick, color in zip(ax.get_xticklabels(), mycol): tick.set_color(color)
for tick, color in zip(ax.get_yticklabels(), mycol): tick.set_color(color)
plt.savefig("1.pdf")
image.png
This is the final result.
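The long mycol list can also be built from the color of each block and the number of lines in it, which makes it easier to see that the block boundaries line up with the guide lines at 6, 9, 10 and 12. This is only a sketch of an alternative way to write the same list; the name groups is hypothetical, and what each color stands for is not stated here.

```python
from itertools import accumulate

# (color, number of consecutive lines with that color), in axis order
groups = [
    ("#4169E1", 6),
    ("#787878", 3),
    ("#DA70D6", 1),
    ("#FF4500", 2),
    ("#32CD32", 13),
]

# Repeat each color once per line in its block.
mycol = [color for color, size in groups for _ in range(size)]

# Running totals of the block sizes give the positions where the color changes.
boundaries = list(accumulate(size for _, size in groups))[:-1]

print(len(mycol), boundaries)
```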
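Feel free to follow my WeChat official account, 小明的資料分析筆記本.
It mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics, and population genetics papers related to horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.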