2016 · Research Progress
Amphibians occupy a critical evolutionary transition between fish and amniotes (reptiles and mammals), which makes them valuable for developmental biology, neurobiology, toxicology, and environmental science. Yet compared with zebrafish and mice, amphibian genomic resources have long been sparse, and many non-model species lack even basic transcriptome annotation. The Amphibase project aims to systematically integrate transcriptomic data from 22 amphibian species and provide a unified reference platform for amphibian functional genomics.
The clawed frogs Xenopus laevis and X. tropicalis are currently the best-resourced amphibian model organisms: the X. tropicalis genome was published in 2010, and the allotetraploid X. laevis genome followed in 2016. Genomic work on other amphibians (salamanders, caecilians, and other frogs) still lags far behind. Amphibase attempts to close this gap through systematic transcriptome sequencing and annotation.
Amphibian genomes are generally large (some salamander genomes exceed 30 Gb) and rich in repetitive sequence, which complicates the assembly of short RNA-seq reads. Polyploid species such as X. laevis carry highly similar homeologous subgenomes, which makes it even harder to assign transcripts to the correct copy. Amphibase uses a hybrid strategy: de novo assembly with tools such as Trinity, combined with alignment to Xenbase reference genomes, followed by clustering and cd-hit de-redundancy to improve annotation quality.
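The de-redundancy step can be illustrated with a toy version of greedy incremental clustering, the general approach behind cd-hit: sequences are visited from longest to shortest, and each one either joins the first representative it resembles or becomes a new representative. This is a minimal sketch and not the Amphibase pipeline or the cd-hit implementation. The k-mer containment score, the 0.9 threshold, the function names, and the example sequences are all assumptions made for illustration; cd-hit itself combines a short-word filter with sequence alignment.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int = 5) -> Set[str]:
    """Return the set of k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_fraction(query: str, rep: str, k: int = 5) -> float:
    """Fraction of the query's k-mers that also occur in the representative.

    Assumption: a crude stand-in for the alignment-based identity a real tool uses.
    """
    q = kmers(query, k)
    if not q:
        return 0.0
    return len(q & kmers(rep, k)) / len(q)


def greedy_cluster(seqs: Dict[str, str], threshold: float = 0.9,
                   k: int = 5) -> Dict[str, List[str]]:
    """Cluster sequences greedily, longest first.

    Returns a mapping from representative ID to the IDs in its cluster.
    """
    clusters: Dict[str, List[str]] = {}
    # Longest first; ties broken by ID so the result is deterministic.
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        for rep in clusters:
            if shared_fraction(seqs[name], seqs[rep], k) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical transcripts: t2 is a fragment of t1, t3 is unrelated.
t1 = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
example = {"t1": t1, "t2": t1[2:24], "t3": "TTTTTTTTTTCCCCCCCCCCGGGGGGGG"}
clusters = greedy_cluster(example)
representatives = sorted(clusters)
```

Keeping only `representatives` yields the non-redundant set. Because similarity is measured against the longer representative, a fragment collapses into its full-length transcript rather than the other way round. For a polyploid such as X. laevis the threshold matters: set too low, it would merge homeologous copies that should stay separate.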
Many amphibian-specific genes (orphan genes) have no homologs in other vertebrates, so BLAST-based annotation is ineffective for them. Amphibase integrates protein domain databases (Pfam, InterPro), GO and KEGG pathway annotation, and phylogenetic analysis to infer functions for orphan genes wherever possible.
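One way to picture this layered annotation is as a fallback: use homology where a BLAST hit exists, fall back to domain evidence where it does not, and flag whatever remains. The sketch below only illustrates that idea under stated assumptions. The tier names, the function names, and the input layout (a set of transcript IDs with hits, and a dict of domain accessions per transcript) are hypothetical and do not describe Amphibase's data model.

```python
from typing import Dict, Iterable, Set


def annotation_tier(has_homolog: bool, domains: Set[str]) -> str:
    """Pick the strongest available evidence tier for one transcript."""
    if has_homolog:
        return "homology"      # a BLAST hit in another vertebrate
    if domains:
        return "domain_only"   # candidate orphan gene with a recognisable domain
    return "unannotated"       # no homolog and no domain evidence


def classify(transcripts: Iterable[str],
             blast_hits: Set[str],
             domain_hits: Dict[str, Set[str]]) -> Dict[str, str]:
    """Assign an evidence tier to every transcript ID."""
    return {
        t: annotation_tier(t in blast_hits, domain_hits.get(t, set()))
        for t in transcripts
    }


# Hypothetical inputs; the accession string is a placeholder, not a real result.
tiers = classify(
    transcripts=["tx1", "tx2", "tx3"],
    blast_hits={"tx1"},
    domain_hits={"tx2": {"PF00001"}},
)
```

Transcripts in the "domain_only" tier are the ones for which domain, pathway, and phylogenetic evidence carries the functional inference. Those left "unannotated" remain open candidates.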
The first Amphibase release is planned to include transcriptomic data from 22 amphibian species (3 caecilians, 5 salamanders, and 14 frogs), covering multiple tissues and developmental stages. The initial data were released in 2017 and continue to be updated. Users can access the annotation data integrated with Xenopus through Xenbase (xenbase.org).
Related resources: for the origin of the Xenopus laevis genome, see Xenopus genome origin; for applications of the Xenbase database, see Xenopus as a model organism.