De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum

Joshua P. Der, Michael S. Barker, Norman J. Wickett, Claude Walker Depamphilis, Paul G. Wolf

Research output: Contribution to journal › Article

85 Citations (Scopus)

Abstract

Background: Because of their phylogenetic position and the unique characteristics of their biology and life cycle, ferns represent an important lineage for studying the evolution of land plants. Large and complex genomes in ferns, combined with the absence of economically important species, have been a barrier to the development of genomic resources. However, high-throughput sequencing technologies are now being widely applied to non-model species. We used the Roche 454 GS-FLX Titanium pyrosequencing platform to sequence the gametophyte transcriptome of bracken fern (Pteridium aquilinum) to develop genomic resources for evolutionary studies.

Results: A total of 681,722 quality- and adapter-trimmed reads totaling 254 Mbp were assembled de novo into 56,256 unique sequences (i.e. unigenes) with a mean length of 547.2 bp, a total assembly size of 30.8 Mbp, and an average read-depth coverage of 7.0×. We estimate that 87% of the complete transcriptome has been sequenced and that all transcripts have been tagged. Of the unigenes, 61.8% had blastx hits in the NCBI nr protein database, representing 22,596 unique best hits. The longest open reading frame in 52.2% of the unigenes had positive domain matches in InterProScan searches. We assigned a GO functional annotation to 46.2% of the unigenes and an enzyme code annotation to 16.0%. Enzyme codes were used to retrieve and color KEGG pathway maps. A comparative genomics approach revealed that a substantial proportion of genes expressed in bracken gametophytes are shared across the genomes of Arabidopsis, Selaginella and Physcomitrella, and identified a substantial number of potentially novel fern genes. By comparing the list of Arabidopsis genes identified by BLAST with a list of gametophyte-specific Arabidopsis genes taken from the literature, we identified a set of potentially conserved gametophyte-specific genes. We screened unigenes for repetitive sequences and identified 548 potentially amplifiable simple sequence repeat loci and 689 expressed transposable elements.

Conclusions: This study is the first comprehensive transcriptome analysis for a fern and represents an important scientific resource for comparative evolutionary and functional genomics studies in land plants. We demonstrate the utility of high-throughput sequencing of a normalized cDNA library for de novo transcriptome characterization and gene discovery in a non-model plant.
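The assembly figures in the Results can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the function names are hypothetical, and the unigene counts derived from percentages are approximations computed here from the rounded values in the abstract, not numbers reported by the study.

```python
# Figures quoted in the abstract (all rounded as published).
N_READS = 681_722          # quality- and adapter-trimmed reads
READ_BASES = 254e6         # total read bases (254 Mbp)
N_UNIGENES = 56_256        # assembled unique sequences
MEAN_UNIGENE_LEN = 547.2   # mean unigene length in bp


def assembly_size_mbp(n_unigenes: int, mean_len_bp: float) -> float:
    """Total assembly size in Mbp = number of unigenes x mean length."""
    return n_unigenes * mean_len_bp / 1e6


def mean_read_length_bp(total_bases: float, n_reads: int) -> float:
    """Average read length implied by total bases and read count."""
    return total_bases / n_reads


def approx_count(percent: float, total: int) -> int:
    """Approximate number of items corresponding to a reported percentage."""
    return round(percent / 100 * total)


if __name__ == "__main__":
    size = assembly_size_mbp(N_UNIGENES, MEAN_UNIGENE_LEN)
    print(f"Implied assembly size: {size:.1f} Mbp")

    read_len = mean_read_length_bp(READ_BASES, N_READS)
    print(f"Implied mean read length: {read_len:.0f} bp")

    # Hypothetical labels for the percentages given in the abstract.
    reported = {
        "blastx hit in NCBI nr": 61.8,
        "InterProScan domain match": 52.2,
        "GO annotation": 46.2,
        "enzyme code annotation": 16.0,
    }
    for label, pct in reported.items():
        print(f"{label}: about {approx_count(pct, N_UNIGENES)} unigenes")
```

The product of unigene count and mean length reproduces the reported 30.8 Mbp assembly size. Because the inputs are rounded, the derived counts should be read as estimates only.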

Original language: English (US)
Article number: 99
Journal: BMC Genomics
Volume: 12
DOI: 10.1186/1471-2164-12-99
State: Published - Feb 8 2011

Fingerprint

Pteridium
Ferns
Transcriptome
Arabidopsis
Embryophyta
Genes
Genomics
Selaginellaceae
Plant Germ Cells
Bryopsida
Genome
Protein Databases
DNA Transposable Elements
Nucleic Acid Repetitive Sequences
Genetic Association Studies
Gene Expression Profiling
Enzymes
Titanium
Life Cycle Stages
Gene Library

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Genetics

Cite this

Der, Joshua P. ; Barker, Michael S. ; Wickett, Norman J. ; Depamphilis, Claude Walker ; Wolf, Paul G. / De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum. In: BMC Genomics. 2011 ; Vol. 12.
@article{895d7ca9f835433eaac2fbe756fd6978,
title = "De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum",
author = "Der, {Joshua P.} and Barker, {Michael S.} and Wickett, {Norman J.} and Depamphilis, {Claude Walker} and Wolf, {Paul G.}",
year = "2011",
month = "2",
day = "8",
doi = "10.1186/1471-2164-12-99",
language = "English (US)",
volume = "12",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",

}

De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum. / Der, Joshua P.; Barker, Michael S.; Wickett, Norman J.; Depamphilis, Claude Walker; Wolf, Paul G.

In: BMC Genomics, Vol. 12, 99, 08.02.2011.

TY - JOUR

T1 - De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum

AU - Der, Joshua P.

AU - Barker, Michael S.

AU - Wickett, Norman J.

AU - Depamphilis, Claude Walker

AU - Wolf, Paul G.

PY - 2011/2/8

Y1 - 2011/2/8

UR - http://www.scopus.com/inward/record.url?scp=79551687108&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79551687108&partnerID=8YFLogxK

U2 - 10.1186/1471-2164-12-99

DO - 10.1186/1471-2164-12-99

M3 - Article

VL - 12

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 99

ER -