Cluster traits with k-means
Goal: select one representative trait per cluster to simplify the GWAS results.
1. Calculate the Pearson correlation coefficients in R and plot them as a heatmap
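library(pheatmap)
# Read the precomputed trait-by-trait correlation matrix (first column holds the row names)
data <- read.table("pro_alti_rm.cor", header = TRUE, row.names = 1)
# Save the clustered heatmap as a 7 x 7 inch PDF, with trait labels hidden
pdf(file = "pro_soil.cor.pdf", height = 7, width = 7)
pheatmap(data, show_rownames = FALSE, show_colnames = FALSE)
dev.off()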
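
The correlation file read in step 1 is assumed to have been computed beforehand. A minimal sketch of how such a matrix could be produced in R is given here. The data frame `traits`, its column names, and the output file `example.cor` are hypothetical stand-ins, not the real data set, and pairwise deletion of missing values is an assumed choice.

```r
# Hypothetical example data: 20 samples x 4 traits (not the real data set)
set.seed(1)
traits <- data.frame(
  t1 = rnorm(20),
  t2 = rnorm(20),
  t3 = rnorm(20),
  t4 = rnorm(20)
)
traits$t2 <- traits$t1 + rnorm(20, sd = 0.1)  # make t1 and t2 strongly correlated
traits$t3[c(2, 5)] <- NA                      # simulate missing measurements

# Pearson correlation between traits (columns); NAs are dropped pair by pair
cor_mat <- cor(traits, method = "pearson", use = "pairwise.complete.obs")

# Write with row names and a header that is one field shorter, the layout
# that read.table(..., header = TRUE, row.names = 1) reads back
out_file <- file.path(tempdir(), "example.cor")
write.table(cor_mat, out_file, quote = FALSE)

# Read it back the same way as in step 1
back <- as.matrix(read.table(out_file, header = TRUE, row.names = 1))
```

With pairwise deletion each coefficient may be based on a different number of samples, so traits with many missing values can give unstable correlations.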

2. Choose a suitable k
library("ClusterR")
# The altitude trait and the soil properties without enough data were removed, so k is 11 rather than 14
corm <- as.matrix(read.table("pro_alti_rm.cor", header = TRUE, row.names = 1))
# Choose k from the variance-explained curve, not by minimizing AIC (which gives k = 4)
Optimal_Clusters_KMeans(corm, max_clusters = 25, criterion = "variance_explained")
k <- 11
bob <- KMeans_rcpp(corm, clusters = k)
clusters <- data.frame(colnames(corm), bob$clusters)
# Randomly select one trait from each cluster
for (i in seq_len(k)) {
  print(sample(clusters[which(clusters$bob.clusters == i), 1], 1))
}
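
Because the trait drawn from each cluster is random, the selection changes between runs unless a seed is fixed. A minimal sketch of a repeatable draw, under stated assumptions: the toy matrix `corm_toy` is simulated, and `stats::kmeans` stands in for `ClusterR::KMeans_rcpp` only so that the example needs no extra packages.

```r
# Hypothetical stand-in data: six traits forming two correlated blocks
set.seed(42)
base1 <- rnorm(30)
base2 <- rnorm(30)
toy <- cbind(a1 = base1 + rnorm(30, sd = 0.2),
             a2 = base1 + rnorm(30, sd = 0.2),
             a3 = base1 + rnorm(30, sd = 0.2),
             b1 = base2 + rnorm(30, sd = 0.2),
             b2 = base2 + rnorm(30, sd = 0.2),
             b3 = base2 + rnorm(30, sd = 0.2))
corm_toy <- cor(toy)

# Assumption: base R kmeans replaces KMeans_rcpp for this illustration
k_toy <- 2
fit <- kmeans(corm_toy, centers = k_toy, nstart = 10)
clusters_toy <- data.frame(trait = rownames(corm_toy),
                           cluster = fit$cluster,
                           stringsAsFactors = FALSE)

# Draw one trait per cluster; the fixed seed makes the draw repeatable
set.seed(123)
picked <- vapply(seq_len(k_toy), function(i) {
  members <- clusters_toy$trait[clusters_toy$cluster == i]
  members[sample.int(length(members), 1)]  # index-based draw, safe for any vector type
}, character(1))
print(picked)
```

Recording the seed alongside the selected traits makes it possible to regenerate the same trait list later.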
3. Manhattan plot for the 11 traits
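
The notes give no plotting code for this step. A minimal base R sketch of a Manhattan plot for one trait follows. The table `gwas`, its columns `CHR`, `BP` and `P`, the simulated p-values, the Bonferroni threshold, and the output file name are all assumptions made for illustration.

```r
# Hypothetical GWAS result: 3 chromosomes, 200 SNPs each (simulated p-values)
set.seed(7)
gwas <- data.frame(
  CHR = rep(1:3, each = 200),
  BP  = rep(sort(sample.int(1e6, 200)), times = 3),
  P   = runif(600)
)

# Offset each chromosome so SNPs are laid out end to end on the x axis
chr_len <- tapply(gwas$BP, gwas$CHR, max)
offset  <- c(0, cumsum(as.numeric(chr_len))[-length(chr_len)])
names(offset) <- names(chr_len)
gwas$pos <- gwas$BP + unname(offset[as.character(gwas$CHR)])

# Bonferroni threshold (an assumed choice; the notes do not specify one)
thr <- -log10(0.05 / nrow(gwas))

out_pdf <- file.path(tempdir(), "manhattan_example.pdf")
pdf(file = out_pdf, height = 4, width = 8)
plot(gwas$pos, -log10(gwas$P),
     col = c("grey30", "steelblue")[gwas$CHR %% 2 + 1], pch = 20,
     xaxt = "n", xlab = "Chromosome", ylab = "-log10(P)")
# Label each chromosome at the middle of its SNP positions
axis(1, at = tapply(gwas$pos, gwas$CHR, mean), labels = names(chr_len))
abline(h = thr, lty = 2, col = "red")
dev.off()
```

To cover all 11 selected traits, the same code could be wrapped in a loop over the per-trait result tables, writing one page or one file per trait.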
Question: with this approach, information such as the example below is lost, and so are the QTLs listed in Supplementary Table 16.
For example: organic carbon at 0-0.045 m