Seed plants comprise angiosperms and gymnosperms, and gymnosperms are commonly divided into four main groups: cycads, ginkgo, conifers, and gnetophytes. Gymnosperm genomes are large, rich in repetitive sequences, and structurally complex. Until now, the cycads, the most ancient lineage of seed plants, lacked a complete genome map. The release of the cycad genome fills in the last piece of the puzzle for studying the evolution of seed plant genomes.
On April 18, 2022, Nature Plants published online a study titled "The Cycas genome and the early evolution of seed plants". The research team released a cycad reference genome, filling a gap in seed plant genome research. With it, genomes from all major lineages of seed plants are now available, laying the foundation for subsequent comparative genomics research.

The team chose Cycas panzhihuaensis (the Panzhihua cycad) as the sequencing target: it belongs to the basal lineage of cycads and is the extant cycad species with the northernmost distribution. Using long-read sequencing together with MGISEQ sequencing, the researchers assembled a 10.5 Gb genome with a contig N50 of 12 Mb and, with Hi-C data, anchored it to 11 chromosomes (Figure 1). A total of 32,353 protein-coding genes were annotated, and BUSCO completeness was 91.6%, making this the highest-quality large genome assembly among gymnosperms to date.
Figure 1: Genomic features of the Panzhihua cycad across its 11 chromosomes
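The contig N50 quoted above is a standard contiguity statistic: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that calculation follows. The function name and the toy contig lengths are illustrative assumptions and do not come from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least half
    of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not data from the paper.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))
```

A contig N50 of 12 Mb therefore means that half of the assembled sequence lies in contigs of at least 12 Mb.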
Gymnosperms comprise about 1,118 species in five major lineages (the conifers are often counted as two). The phylogenetic relationships among these lineages have long been debated. Phylogenetic analyses of 3,282 orthologous low-copy nuclear genes from 15 vascular plant genomes (Fig. 2a), 1,569 orthologous genes from 90 seed plant transcriptomes, and chloroplast and mitochondrial genome data from 72 vascular plant species showed that cycads, either alone (mitochondrial data) or together with ginkgo (nuclear genes, chloroplast data), form the sister group to all other gymnosperms (Fig. 2b).
Figure 2: a) Seed plant phylogenetic tree based on nuclear genes. b) Phylogenetic relationships of seed plants inferred from different data sets and methods.
Whole-genome duplication is an important driver of plant evolution and adaptation, and whether the common ancestor of gymnosperms experienced such an event has been controversial. Using synonymous substitution (Ks) analysis of duplicate genes and phylogenomics, verified by comparison of collinear regions within the genome, the researchers found that the most recent common ancestor of extant gymnosperms may have experienced an ancient whole-genome duplication event (named omega, Figure 3a).
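One common way to look for an ancient duplication in this kind of analysis is to bin the synonymous substitution rates (Ks) of duplicate gene pairs and look for a peak, since genes duplicated in a single event tend to share a similar Ks. The sketch below only illustrates that idea with a plain histogram. It is not the authors' pipeline (published analyses typically fit mixture models and cross-check with phylogenomics), and the function names, bin width, and Ks values are hypothetical.

```python
from collections import Counter


def ks_histogram(ks_values, bin_width=0.1, max_ks=3.0):
    """Count duplicate-gene pairs per Ks bin.

    Bin i covers the interval [i * bin_width, (i + 1) * bin_width).
    Values outside (0, max_ks] are skipped, because very large Ks
    estimates are saturated and unreliable.
    """
    counts = Counter()
    for ks in ks_values:
        if 0 < ks <= max_ks:
            counts[int(ks / bin_width)] += 1
    return counts


def peak_bin(counts):
    """Return the index of the most populated Ks bin."""
    if not counts:
        raise ValueError("no Ks values in range")
    return max(counts, key=counts.get)


# Hypothetical Ks values for paralogous gene pairs, not data from the paper.
toy_ks = [0.05, 0.12, 0.15, 0.82, 0.85, 0.87, 0.88, 1.45, 2.2]
hist = ks_histogram(toy_ks)
print(peak_bin(hist), hist[peak_bin(hist)])
```

In this toy input the most populated bin is the one starting at Ks 0.8. In a real analysis, a peak like this would still need support from gene trees and collinear blocks before being read as a whole-genome duplication.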
The origin of seed plants was accompanied by the innovation or expansion of many gene families associated with key novel traits such as seed development, pollen, and secondary growth. A total of 663 newly acquired gene families and 368 expanded gene families were found at the ancestral node of seed plants. Among them, 106 newly acquired and 55 significantly expanded gene families were associated with seed physiology and development, including regulation of early embryonic development, seed dormancy and germination, seed energy and nutrient metabolism, seed coat formation, and seed immunity and stress response (Figure 3b).
Figure 3: a) Whole-genome duplication events in seed plants inferred from phylogenetic relationships. b) Gene family innovation and expansion in seed plants.
The most significantly expanded family related to seed physiology is the cupin family. The Panzhihua cycad encodes a new class of vicilin-like storage proteins, the vicilin-like antimicrobial peptides (v-AMPs). Their genes are arranged in tandem arrays in the genome and are expressed mainly in ovules at the late pollination and fertilization stages, after which expression gradually declines, suggesting that the v-AMP genes play an important role in a specific period of seed development (Figure 4a, b). The LAFL family (LEC1, ABI3, LEC2 and FUS3) contains the core regulatory genes for seed development. The FUS3 and LEC2 genes of cycads and other gymnosperms form a separate evolutionary branch, defined as the FUS3/LEC2-like type, which is sister to the FUS3 and LEC2 genes of angiosperms. The FUS3/LEC2-like type is unique to gymnosperms. It is strongly expressed after pollination in the Panzhihua cycad, indicating that it may play a specific role in the early stages of gymnosperm embryogenesis.
Figure 4: a) Phylogenetic tree of the cupin family. b) Transcript expression levels of cycad cupin genes during pollination and fertilization.
Cycads originated in the Permian, in the late Paleozoic, and are at least 270 million years old. Having survived mass extinctions, modern cycads are mostly descendants of recent evolutionary radiations. Today cycads comprise 2 families and 10 genera. Based on transcriptome data from 339 extant cycad species, the researchers reconstructed the phylogenetic relationships within cycads. Molecular clock analysis shows that extant cycad lineages diversified almost simultaneously between 11 and 20 million years ago, as a result of drastic climate changes since the Miocene (Figure 5).
Figure 5: The cycad phylogenetic tree supports extant cycads as the product of an evolutionary radiation.
The separation of the sexes into male and female plants (dioecy) is an evolved trait. Dioecious species account for 6% of angiosperms but 65% of the 1,118 reported gymnosperm species, and all cycads are dioecious (Figure 6). Male and female cycads look alike and can only be told apart when they produce cones, but cycads grow slowly and do so only after more than ten years of growth in a suitable environment. Understanding the molecular mechanism of cycad sex determination would allow plants to be sexed at the seedling stage, which is of great significance for the in situ and ex situ conservation of cycads, as well as for garden cultivation.
In this study, 62 male and female cycad plants from the Panzhihua Cycad National Reserve in Sichuan were resequenced, and genome-wide association analysis located the sex-differentiated region on chromosome 8. By single-molecule sequencing and assembly of a male Panzhihua cycad, the researchers obtained a 45.5 Mb male-specific Y chromosome sequence, roughly 75 Mb shorter than the corresponding 120 Mb X chromosome sequence in the female, showing clear sex chromosome differentiation. Transcriptome analysis of female megasporophylls and male microsporophylls showed that one of the most differentially expressed genes lies on the Y chromosome of the male plant. It encodes a MADS-box transcription factor and is thought to be involved in regulating the development of the male microsporophylls. Homologous sequences of this transcription factor can also be detected in the genomes of male plants of representative species of Cycadaceae and Zamiaceae, illustrating that this sex-determining mechanism is conserved across cycads.
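The logic behind locating a sex-linked region from resequencing data can be illustrated with a much simpler heuristic than a genome-wide association test. In an XY system, sites in the sex-differentiated region tend to be heterozygous in males and homozygous in females. The sketch below scans genotype calls for that pattern. It is a simplified illustration and not the study's association analysis, and the function name, threshold, site labels, and genotypes are all hypothetical.

```python
def xy_like_sites(genotypes, sexes, min_fraction=0.9):
    """Return sites whose genotype pattern is consistent with XY sex linkage.

    genotypes: dict mapping a site label to a list of genotype strings
               ("0/0", "0/1" or "1/1"), one per sample.
    sexes:     list of "M" or "F", in the same sample order.
    A site is reported when at least min_fraction of males are
    heterozygous and at least min_fraction of females are homozygous.
    """
    hits = []
    for site, calls in genotypes.items():
        male = [g for g, s in zip(calls, sexes) if s == "M"]
        female = [g for g, s in zip(calls, sexes) if s == "F"]
        if not male or not female:
            continue  # both sexes are needed for the comparison
        male_het = sum(g == "0/1" for g in male) / len(male)
        female_hom = sum(g in ("0/0", "1/1") for g in female) / len(female)
        if male_het >= min_fraction and female_hom >= min_fraction:
            hits.append(site)
    return hits


# Hypothetical genotype calls for two males and two females.
toy_sexes = ["M", "M", "F", "F"]
toy_genotypes = {
    "chr8:1000": ["0/1", "0/1", "0/0", "0/0"],  # male-het, female-hom
    "chr1:500": ["0/1", "0/0", "0/1", "0/0"],   # no sex-related pattern
}
print(xy_like_sites(toy_genotypes, toy_sexes))
```

Sites flagged this way would cluster in the sex-differentiated region. A real analysis would use a statistical association test with multiple-testing correction across all variants.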
By assembling the cycad X and Y chromosome sequences, the study revealed the genetic mechanism of cycad sex determination and provides a theoretical basis for studying the spatial distribution of cycads in the wild, including that of male and female plants.
Figure 6: a) Panzhihua cycad. b) Sporophytes of male and female cycads (source: reference [1])
The sperm of early vascular plants were all flagellated and could swim. In the course of evolution, the flagella were lost. Among extant seed plants, only cycads and ginkgo retain flagellated sperm. The researchers found that both cycads and ginkgo retain a large number of the genes needed for flagellar assembly, but compared with cycads, ginkgo has lost some radial spoke protein (RSP) genes (RSP2, RSP3, RSP9, and RSP11, among others). In addition, outer dense fiber genes (ODFs), which are closely related to flagellar function, are present only in the cycad and ginkgo genomes and are completely lost in other seed plants (Figure 7), further confirming the ancient position of cycads in seed plant evolution.
Figure 7: a) Cycad flagella. b) Loss of flagella during plant evolution. c) Distribution of ODF and other flagellum-related genes (source: reference [1])
Horizontal gene transfer is the exchange of genes between different species, and it has helped drive the adaptive evolution of land plants. The researchers found a cytotoxin protein gene (fitD) in the Panzhihua cycad genome that originated in bacteria and was passed to fungi and cycads through horizontal gene transfer (Figure 8a). Based on transcriptome data from 339 cycad species, the researchers found that the toxin protein is present only in cycads. The fitD gene is also highly expressed in seeds and roots, which may be one of the reasons why cycad seeds and roots are toxic. The toxin protein, produced recombinantly in E. coli, was significantly lethal to diamondback moths and cotton bollworms (Figure 8b-f), suggesting that it has potential agricultural applications.
Figure 8: a) Evolutionary history of the horizontal transfer of the cycad toxin protein gene. b-f) Expression of the cycad toxin protein gene and its toxicity to insects (source: reference [1])
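References
[1] The Cycas genome and the early evolution of seed plants. Nature Plants, published online April 18, 2022.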
Author: Liu Yang
Editor: Crispy fish
Layout: Yin Ningliu
Wikimedia Commons, Mercy / CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/deed.en)
Research team
Co-first authors: Liu Yang (Shenzhen Huada Life Science Research Institute; Shenzhen Xianhu Botanical Garden); Wang Sibo, Li Linzhou, Yang Ting, and Wei Tong (Shenzhen Huada Life Science Research Institute); Dong Shanshan (Shenzhen Xianhu Botanical Garden); Wu Shengyan (Lanzhou University); Liu Yongbo (Chinese Research Academy of Environmental Sciences)
Corresponding authors: Zhang Shouzhou (Shenzhen Xianhu Botanical Garden); Liu Huan (Shenzhen Huada Life Science Research Institute); Gong Xun (Kunming Institute of Botany, Chinese Academy of Sciences); Douglas E. Soltis (University of Florida); Yves Van de Peer (Ghent University, Belgium)