論文
https://www.pnas.org/content/118/20/e2010588118
Death rates at specific life stages mold the sex gap in life expectancy

image.png
論文本地存儲
e2010588118.full.pdf
很有意思的一篇論文,研究的内容是為什麼女生比男生活的時間長(Why do women live longer than men?)哈哈哈。但是整篇論文我還沒有看明白,是以先不給大家介紹結論了。
這篇論文的資料和代碼是公開的,連結是 https://github.com/CPop-SDU/sex-gap-e0-pnas,我們按照他提供的代碼和資料試着複原一下論文裡的圖。今天的推文重複的内容是論文中的Figure1A
image.png
分組折線圖
用到的資料集是連結裡的dat檔案夾下的
df4qx.rda
檔案,
首選是導入資料
load("data/df4qx.rda")
head(df4qx)
複制
image.png
這個是一個長格式資料,把它轉變成寬格式
#install.packages("tidyverse")
library(tidyverse)
df4qx %>%
pivot_wider(names_from = sex,values_from = qx) %>%
head()
複制
image.png
這一步是為了友善計算不同年齡男女死亡率的比例
ggplot2作圖
df4qx %>%
pivot_wider(names_from = sex,values_from = qx) -> dftemp
複制
最基本的圖
library(ggplot2)
dftemp %>%
ggplot(aes(age,y=m/f,color=country))+
geom_smooth(se=F,size=1,color="#ffffff",span=0.25)+
geom_smooth(se = F, size = .5, span = .25)+
theme_minimal(base_size = 16)
複制
image.png
這裡原始代碼還設定字型了,我這裡就跳過了,因為我的電腦沒有這個字型
接下來做細節調整
添加一條水準輔助線
dftemp %>%
ggplot(aes(age,y=m/f,color=country))+
geom_smooth(se=F,size=1,color="#ffffff",span=0.25)+
geom_smooth(se = F, size = .5, span = .25)+
theme_minimal(base_size = 16,base_family = "serif")+
geom_hline(yintercept = 1, color = "gray25", size = .5)
複制
image.png
更改x軸刻度範圍
dftemp %>%
ggplot(aes(age,y=m/f,color=country))+
geom_smooth(se=F,size=1,color="#ffffff",span=0.25)+
geom_smooth(se = F, size = .5, span = .25)+
theme_minimal(base_size = 16,base_family = "serif")+
geom_hline(yintercept = 1, color = "gray25", size = .5)+
scale_x_continuous(breaks = c(0, 15, 40, 60, 80))
複制
image.png
對y軸進行log2轉換
dftemp %>%
ggplot(aes(age,y=m/f,color=country))+
geom_smooth(se=F,size=1,color="#ffffff",span=0.25)+
geom_smooth(se = F, size = .5, span = .25)+
theme_minimal(base_size = 16,base_family = "serif")+
geom_hline(yintercept = 1, color = "gray25", size = .5)+
scale_x_continuous(breaks = c(0, 15, 40, 60, 80))+
scale_y_continuous(
trans = "log",
breaks = c(.5, 1, 2, 3),
labels = c("", 1, 2, 3),
limits = c(.75, 3.5))
複制
image.png
這一步為啥要做轉化呢 有些沒看明白
自定義配色
pal_safe_five <- c(
"#eec21f", # default R 4.0 yellow
"#009C9C", # light shade of teal: no red, equal green and blue
"#df356b", # default R 4.0 red
"#08479A", # blues9[8] "#08519C" made a bit darker
"#003737" # very dark shade of teal
)
pal_safe_five_ordered <- pal_safe_five[c(5,2,1,3,4)]
pal_four <- pal_safe_five_ordered[c(2,5,3,4)]
dftemp %>%
ggplot(aes(age,y=m/f,color=country))+
geom_smooth(se=F,size=1,color="#ffffff",span=0.25)+
geom_smooth(se = F, size = .5, span = .25)+
theme_minimal(base_size = 16,base_family = "serif")+
geom_hline(yintercept = 1, color = "gray25", size = .5)+
scale_x_continuous(breaks = c(0, 15, 40, 60, 80))+
scale_y_continuous(
trans = "log",
breaks = c(.5, 1, 2, 3),
labels = c("", 1, 2, 3),
limits = c(.75, 3.5))+
scale_color_manual(NULL, values = pal_four)
複制
image.png
添加文本注釋
dftemp %>%
ggplot(aes(age,y=m/f,color=country))+
geom_smooth(se=F,size=1,color="#ffffff",span=0.25)+
geom_smooth(se = F, size = .5, span = .25)+
theme_minimal(base_size = 16,base_family = "serif")+
geom_hline(yintercept = 1, color = "gray25", size = .5)+
scale_x_continuous(breaks = c(0, 15, 40, 60, 80))+
scale_y_continuous(
trans = "log",
breaks = c(.5, 1, 2, 3),
labels = c("", 1, 2, 3),
limits = c(.75, 3.5))+
scale_color_manual(NULL, values = pal_four)+
annotate(
"text", x = 50, y = .9,
label = "Most recent year",
size = 8.5, color = "grey50", alpha = .5,
vjust = 1, family = "serif", fontface = 2
)
複制
image.png
去掉圖例并更改坐标軸标題
dftemp %>%
ggplot(aes(age,y=m/f,color=country))+
geom_smooth(se=F,size=1,color="#ffffff",span=0.25)+
geom_smooth(se = F, size = .5, span = .25)+
theme_minimal(base_size = 16,base_family = "serif")+
geom_hline(yintercept = 1, color = "gray25", size = .5)+
scale_x_continuous(breaks = c(0, 15, 40, 60, 80))+
scale_y_continuous(
trans = "log",
breaks = c(.5, 1, 2, 3),
labels = c("", 1, 2, 3),
limits = c(.75, 3.5))+
scale_color_manual(NULL, values = pal_four)+
annotate(
"text", x = 50, y = .9,
label = "Most recent year",
size = 8.5, color = "grey50", alpha = .5,
vjust = 1, family = "serif", fontface = 2
)+
theme(
legend.position = "none",
panel.grid.minor = element_blank()
)+
labs(
y = "Sex ratio, log scale",
x = "Age"
)
複制
image.png
歡迎大家關注我的公衆号
小明的資料分析筆記本
今天推文的示例資料和代碼可以在公衆号背景留言
20210829
擷取
(精确比對開頭結尾都不能有空格)
小明的資料分析筆記本 公衆号 主要分享:1、R語言和python做資料分析和資料可視化的簡單小例子;2、園藝植物相關轉錄組學、基因組學、群體遺傳學文獻閱讀筆記;3、生物資訊學入門學習資料及自己的學習筆記!
後記
今天發現視訊号和公衆号現在可以帶貨了,京東和拼多多平台的商品可以生成我自己的連結,如果有人通過這個連結購買商品 我就可以得到相應比例的傭金。比如我今天買了兩雙鞋,總共花費400多,我拿到的傭金是20幾塊。大家如果經常在京東或者拼多多買東西的話可以加一下下面的微信群,比如你想買一件東西,可以先把商品的連結發給我,我生成我專屬的連結,然後你再通過我的專屬連結買,這樣我就能有收入,我可以将收入的一半再轉給你,你能省幾塊錢,我也能賺幾塊錢。