The Ancestral Angiosperm Genome Project

  • dePamphilis, Claude Walker (PI)
  • Soltis, Pamela P.S. (CoPI)
  • Soltis, Douglas D.E. (CoPI)
  • Leebens-Mack, James J.H. (CoPI)
  • Ma, Hong H. (CoPI)

Project: Research project

Project Details


PI: Claude dePamphilis (Penn State University)

Co-PIs: Hong Ma (Penn State University), Jim Leebens-Mack (University of Georgia), Pamela Soltis (University of Florida), Douglas Soltis (University of Florida)

Senior Personnel: Sandra Clifton (Washington University), Naomi Altman (Penn State University).

The origin and early diversification of flowering plants (angiosperms) had profound impacts on Earth's biota and provided the raw genetic material from which all economically important angiosperm crop plants were derived. The evolution of genomes, genes, regulatory processes, and numerous specific adaptations in monocot and eudicot angiosperms cannot be adequately interpreted without a comparative framework firmly rooted with genome sequences from basal angiosperms. However, little is known about the genomes or, for that matter, most genes of the basalmost extant flowering plants, which represent our sole surviving links to the earliest angiosperms. Limited EST data, generated from early flower development in the Floral Genome Project, provide our only clues to the gene set present in the Ancestral Angiosperm. These data suggest that the earliest flowering plants had a large and diverse set of genes contributing to the flowering process, and may have undergone a polyploid event just prior to diversification of extant lineages.

The Ancestral Angiosperm Genome Project (AAGP) is built around three primary experimental and analytical objectives, designed to provide resources that are essential to the development of a comparative framework for all angiosperms:

1) Comparative 'deep transcriptome sequencing' of basal angiosperm lineages and one gymnosperm - The AAGP will tag and sequence, as fully as possible, the transcriptomes of five phylogenetically critical, basal angiosperms: Amborella, Nuphar, Persea, Liriodendron, and Aristolochia. Together with a comparable EST dataset for the cycad Zamia and other existing gymnosperm data, these species will surround the ancestral angiosperm node and will provide strong evidence for inferring the ancestral state of the angiosperm transcriptome and a large majority of flowering plant genes and gene families.

2) BAC library fingerprinting and physical mapping of Amborella - An existing high-quality BAC library of Amborella will be end-sequenced, fingerprinted, and assembled to generate a physical map of Amborella. Forty clones will be selected for proof-of-concept sequencing of small but scientifically important portions of the genome of this phylogenetically pivotal angiosperm. This map should make it possible to identify genome-scale duplication events, while the BAC end sequences will provide the first information about the linkage patterns of genes and other sequences in the ancestral angiosperm.

3) Bioinformatic platform - Existing informatic structures will be expanded to make all of the data readily accessible, with tools that help researchers access and study any plant gene, gene family, or otherwise defined gene set in a comparative genomic framework.

4) Outreach - The AAGP will focus outreach efforts in two areas: a) a pilot cross-cultural mentoring program with faculty and students at historically black universities, with a focus on bioinformatic skills and genomic analysis, and b) scientific outreach on the Ancestral Angiosperm. Public involvement in understanding the Ancestral Angiosperm will be encouraged through two open calls for public input on Amborella BAC clones to be sequenced.

Access to project outcomes

All sequence data will be deposited immediately in GenBank. The project will also provide comprehensive database access and analytical tools for the interactive examination of the gene space, sequences, phylogeny, and expression information for each of the study species and inferences about the Ancestral Angiosperm.

Effective start/end date: 1/1/07 to 12/31/11


  • National Science Foundation: $2,967,113.00

