
Soil traits with k-means clustering

Goal: select one trait from each cluster to simplify the GWAS results.

1. Calculate the Pearson correlation coefficients in R and plot them as a heatmap

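library(pheatmap)
# Read the precomputed Pearson correlation matrix (first column holds the trait names)
data <- read.table("pro_alti_rm.cor", header = TRUE, row.names = 1)
# Draw the clustered heatmap into a 7 x 7 inch PDF; trait labels are hidden
pdf(file = "pro_soil.cor.pdf", height = 7, width = 7)
pheatmap(data, show_rownames = FALSE, show_colnames = FALSE)
dev.off()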
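The heatmap code reads a correlation matrix that was computed beforehand. As a minimal sketch of how such a matrix could be produced, assuming a table with one row per sample and one column per trait (the toy data frame `traits` and its column names are made up for illustration and are not the real data):

```r
# Toy trait table: rows are samples, columns are soil traits (hypothetical names).
set.seed(1)
traits <- data.frame(
  total_C_16.6mm = rnorm(50),
  total_N_16.6mm = rnorm(50),
  pH             = rnorm(50)
)
# A second depth that is strongly correlated with the first column.
traits$total_C_28.9mm <- traits$total_C_16.6mm + rnorm(50, sd = 0.2)
traits$pH[c(3, 17)] <- NA  # a few missing measurements

# Pairwise Pearson correlations; missing values are dropped pair by pair.
cor_mat <- cor(traits, method = "pearson", use = "pairwise.complete.obs")

# Write the matrix with trait names as both header and first column.
out_file <- tempfile(fileext = ".cor")
write.table(cor_mat, out_file, quote = FALSE, sep = "\t", col.names = NA)
```

With `use = "pairwise.complete.obs"`, each coefficient is computed from the samples available for that pair of traits, so traits with many missing values rest on fewer samples than the others.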

2. Choose a suitable k

library(ClusterR)
# Altitude and the soil properties without enough data were removed, so k comes out as 11 rather than 14
corm <- as.matrix(read.table("pro_alti_rm.cor", header = TRUE, row.names = 1))
# Choose k by variance explained; the AIC criterion is not used here (it gave k = 4)
Optimal_Clusters_KMeans(corm, max_clusters = 25, criterion = "variance_explained")
k <- 11
bob <- KMeans_rcpp(corm, clusters = k)
clusters <- data.frame(colnames(corm), bob$clusters)
# Randomly select one trait from each cluster
for (i in 1:k) {
  print(sample(clusters[which(clusters$bob.clusters == i), 1], 1))
}
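To make the choice of k easier to inspect, here is a rough sketch of the same idea using base R `kmeans` in place of ClusterR. It computes, for each candidate k, the within-cluster sum of squares divided by the total sum of squares, which appears to be what the `variance_explained` criterion reports. The matrix `toy` is simulated and only stands in for `corm`.

```r
# Simulated matrix with three well-separated groups of rows (hypothetical data).
set.seed(42)
toy <- rbind(
  matrix(rnorm(60, mean = 0),  ncol = 3),
  matrix(rnorm(60, mean = 5),  ncol = 3),
  matrix(rnorm(60, mean = 10), ncol = 3)
)

# For each candidate k: share of the total sum of squares left inside the clusters.
ks <- 1:8
within_ratio <- sapply(ks, function(k) {
  fit <- kmeans(toy, centers = k, nstart = 20)
  fit$tot.withinss / fit$totss
})

# The "elbow" is the k after which extra clusters barely reduce the ratio.
plot(ks, within_ratio, type = "b",
     xlab = "k", ylab = "within-cluster SS / total SS")
```

On this toy data the curve should flatten after k = 3. On the real correlation matrix the elbow is usually less sharp, so the final k remains a judgment call.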

3. Draw Manhattan plots for the 11 selected traits
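The plotting code for this step is not shown, so the following is only a base R sketch of a Manhattan plot for one trait. The GWAS table is simulated, and the column names `CHR`, `BP`, and `P` are assumptions about the result format, not the actual output files.

```r
# Simulated GWAS results for one trait (hypothetical columns CHR, BP, P).
set.seed(7)
gwas <- data.frame(
  CHR = rep(1:5, each = 200),
  BP  = rep(seq(1e5, by = 1e5, length.out = 200), times = 5),
  P   = runif(1000)
)

# Cumulative x position so that chromosomes sit side by side.
# Assumes CHR holds consecutive integers starting at 1.
chr_len <- tapply(gwas$BP, gwas$CHR, max)
offset  <- c(0, cumsum(as.numeric(chr_len))[-length(chr_len)])
gwas$pos <- gwas$BP + offset[gwas$CHR]

# Alternate two colours by chromosome and label each chromosome at its midpoint.
plot(gwas$pos, -log10(gwas$P), pch = 20, cex = 0.6,
     col = c("grey30", "steelblue")[gwas$CHR %% 2 + 1],
     xlab = "Chromosome", ylab = "-log10(P)", xaxt = "n")
axis(1, at = tapply(gwas$pos, gwas$CHR, mean), labels = names(chr_len))
# Bonferroni line, shown as one common choice of threshold.
abline(h = -log10(0.05 / nrow(gwas)), lty = 2)
```

Wrapping the plotting part in a function and calling it once per selected trait would give the 11 plots.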


Problem: keeping only one trait per cluster loses information such as the example below, and the QTLs listed in Supplementary Table 16 are lost as well.

For example: organic carbon at 0-0.045 m


Possible fixes: consider PCA, or:

If the same soil property measured at different depths falls into one cluster, pick one trait per soil property.

For example, if total carbon (16.6 mm), total carbon (28.9 mm), total N (16.6 mm), and total N (28.9 mm) all fall into one cluster, randomly select one total carbon trait and one total N trait, rather than keeping only one trait for the whole cluster.
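The per-property selection described above can be sketched as follows. The cluster assignments are made up, and the code assumes trait names follow a `property (depth)` pattern, which may not match the real column names.

```r
# Hypothetical cluster assignments; trait names follow "<property> (<depth>)".
clusters <- data.frame(
  trait = c("total carbon (16.6mm)", "total carbon (28.9mm)",
            "total N (16.6mm)", "total N (28.9mm)", "pH (16.6mm)"),
  cluster = c(1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Strip the trailing "(depth)" part to recover the soil property.
clusters$property <- sub("\\s*\\(.*\\)$", "", clusters$trait)

# Within each cluster, pick one trait at random for every soil property.
set.seed(1)
groups <- split(clusters$trait,
                list(clusters$cluster, clusters$property), drop = TRUE)
picked <- vapply(groups, function(x) x[sample.int(length(x), 1)], character(1))
print(unname(picked))
```

Indexing with `sample.int(length(x), 1)` rather than calling `sample(x, 1)` sidesteps the case where `sample` treats a single number as a range to draw from. That does not matter for character names but is a safe habit.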

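For the PCA option mentioned above, a minimal sketch with base R `prcomp` is given here. The trait table is simulated, and using the principal component scores as GWAS phenotypes is only one possible way to apply it, not something tested in this note.

```r
# Toy trait table (hypothetical): two depths of one property plus an unrelated trait.
set.seed(3)
base_c <- rnorm(100)
traits <- data.frame(
  total_C_16.6mm = base_c + rnorm(100, sd = 0.1),
  total_C_28.9mm = base_c + rnorm(100, sd = 0.1),
  pH             = rnorm(100)
)

# Centre and scale the traits so that none dominates because of its units.
pca <- prcomp(traits, center = TRUE, scale. = TRUE)

# Proportion of variance captured by each component.
var_share <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_share, 3))

# Scores per sample on the first two components.
pc_scores <- pca$x[, 1:2]
```

Here the two carbon depths load mostly on the first component, so one component summarises both without discarding either depth.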