Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions

Xavier Argout, Olivier Fouet, Patrick Wincker, Karina Gramacho, Thierry Legavre, Xavier Sabau, Ange Marie Risterucci, Corinne Da Silva, Julio Cascardo, Mathilde Allegre, David Kuhn, Joseph Verica, Brigitte Courtois, Gaston Loor, Regis Babin, Olivier Sounigo, Michel Ducamp, Mark J. Guiltinan, Manuel Ruiz, Laurence Alemanno, Regina Machado, Wilberth Phillips, Ray Schnell, Martin Gilmour, Eric Rosenquist, David Butler, Siela Maximova, Claire Lanaud

Research output: Contribution to journal › Article

79 Citations (Scopus)

Abstract

Background: Theobroma cacao L. is a tree that originated in the tropical rainforest of South America. It is one of the major cash crops for many tropical countries. T. cacao is mainly produced on smallholdings, providing resources for 14 million farmers. Disease resistance and quality improvement are two important challenges for all actors in cocoa and chocolate production. T. cacao is seriously affected by pests and fungal diseases, which are responsible for yield losses of more than 40%, and improving nutritional and organoleptic quality is also important for consumers. An international collaboration was formed to develop an EST genomic resource database for cacao.

Results: Fifty-six cDNA libraries were constructed from different organs, genotypes and environmental conditions. A total of 149,650 valid EST sequences were generated, corresponding to 48,594 unigenes (12,692 contigs and 35,902 singletons). A total of 29,849 unigenes shared significant homology with public sequences from other species. Gene Ontology (GO) annotation was applied to distribute the ESTs among the main GO categories. A dedicated information system (ESTtik) was constructed to process, store and manage this EST collection and to let users query it as a database. To check the representativeness of the EST collection, we looked for genes known to be involved in two metabolic pathways that are extensively studied in other plant species and important for T. cacao quality: the flavonoid and terpene pathways. Most of the enzymes described in other crops for these two pathways were found in the EST collection. The EST collection also provided a large set of new genetic markers.

Conclusion: This EST collection gives a good representation of the T. cacao transcriptome and is suitable for analysis of biochemical pathways using oligonucleotide microarrays derived from these ESTs. It will provide numerous genetic markers that will allow the construction of a high-density gene map of T. cacao. The collection represents a unique and important molecular resource for T. cacao study and improvement, facilitating the discovery of candidate genes underlying variation in important T. cacao traits.
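The counts reported in the Results can be cross-checked with a little arithmetic. The sketch below is not part of the published analysis. It only recombines the figures quoted in the abstract, and it assumes that every valid EST ended up either in a contig or as a singleton and that each singleton corresponds to exactly one EST. The function name `summarize` and the dictionary keys are hypothetical, chosen for illustration.

```python
# Counts quoted in the abstract (BMC Genomics 9:512, 2008).
VALID_ESTS = 149_650
UNIGENES = 48_594
CONTIGS = 12_692
SINGLETONS = 35_902
ANNOTATED_UNIGENES = 29_849  # unigenes with significant homology to public sequences


def summarize():
    """Derive simple summary figures from the reported counts.

    Assumption (not stated in the abstract): each singleton is one EST and
    all remaining valid ESTs were assembled into contigs.
    """
    ests_in_contigs = VALID_ESTS - SINGLETONS
    return {
        # Unigenes should equal contigs plus singletons.
        "unigenes_from_parts": CONTIGS + SINGLETONS,
        # ESTs left over once singletons are set aside.
        "ests_in_contigs": ests_in_contigs,
        # Average number of ESTs assembled into each contig.
        "mean_ests_per_contig": ests_in_contigs / CONTIGS,
        # Share of ESTs that did not add a new unigene.
        "redundancy": 1 - UNIGENES / VALID_ESTS,
        # Share of unigenes with a significant database hit.
        "annotated_fraction": ANNOTATED_UNIGENES / UNIGENES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions the contig and singleton counts add up to the reported number of unigenes, and roughly three in five unigenes have a significant match in public databases.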

Original language: English (US)
Article number: 512
Journal: BMC Genomics
Volume: 9
DOI: https://doi.org/10.1186/1471-2164-9-512
State: Published - Oct 30 2008

Fingerprint

Cacao
Expressed Sequence Tags
Gene Expression Profiling
Gene Ontology
Quality Improvement
Metabolic Networks and Pathways
Genetic Markers
Datasets
Databases
Molecular Sequence Annotation
Disease Resistance
Mycoses
South America
Terpenes
Genetic Association Studies
Oligonucleotide Array Sequence Analysis
Gene Library
Flavonoids
Information Systems
Genes

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Genetics

Cite this

Argout, Xavier; Fouet, Olivier; Wincker, Patrick; Gramacho, Karina; Legavre, Thierry; Sabau, Xavier; Risterucci, Ange Marie; Da Silva, Corinne; Cascardo, Julio; Allegre, Mathilde; Kuhn, David; Verica, Joseph; Courtois, Brigitte; Loor, Gaston; Babin, Regis; Sounigo, Olivier; Ducamp, Michel; Guiltinan, Mark J.; Ruiz, Manuel; Alemanno, Laurence; Machado, Regina; Phillips, Wilberth; Schnell, Ray; Gilmour, Martin; Rosenquist, Eric; Butler, David; Maximova, Siela; Lanaud, Claire. / Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions. In: BMC Genomics. 2008; Vol. 9.
@article{7707a825f0f04f54824c9f29af4ab5d2,
title = "Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions",
author = "Xavier Argout and Olivier Fouet and Patrick Wincker and Karina Gramacho and Thierry Legavre and Xavier Sabau and Risterucci, {Ange Marie} and {Da Silva}, Corinne and Julio Cascardo and Mathilde Allegre and David Kuhn and Joseph Verica and Brigitte Courtois and Gaston Loor and Regis Babin and Olivier Sounigo and Michel Ducamp and Guiltinan, {Mark J.} and Manuel Ruiz and Laurence Alemanno and Regina Machado and Wilberth Phillips and Ray Schnell and Martin Gilmour and Eric Rosenquist and David Butler and Siela Maximova and Claire Lanaud",
year = "2008",
month = "10",
day = "30",
doi = "10.1186/1471-2164-9-512",
language = "English (US)",
volume = "9",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",

}

Argout, X, Fouet, O, Wincker, P, Gramacho, K, Legavre, T, Sabau, X, Risterucci, AM, Da Silva, C, Cascardo, J, Allegre, M, Kuhn, D, Verica, J, Courtois, B, Loor, G, Babin, R, Sounigo, O, Ducamp, M, Guiltinan, MJ, Ruiz, M, Alemanno, L, Machado, R, Phillips, W, Schnell, R, Gilmour, M, Rosenquist, E, Butler, D, Maximova, S & Lanaud, C 2008, 'Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions', BMC genomics, vol. 9, 512. https://doi.org/10.1186/1471-2164-9-512

TY - JOUR

T1 - Towards the understanding of the cocoa transcriptome

T2 - Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions

AU - Argout, Xavier

AU - Fouet, Olivier

AU - Wincker, Patrick

AU - Gramacho, Karina

AU - Legavre, Thierry

AU - Sabau, Xavier

AU - Risterucci, Ange Marie

AU - Da Silva, Corinne

AU - Cascardo, Julio

AU - Allegre, Mathilde

AU - Kuhn, David

AU - Verica, Joseph

AU - Courtois, Brigitte

AU - Loor, Gaston

AU - Babin, Regis

AU - Sounigo, Olivier

AU - Ducamp, Michel

AU - Guiltinan, Mark J.

AU - Ruiz, Manuel

AU - Alemanno, Laurence

AU - Machado, Regina

AU - Phillips, Wilberth

AU - Schnell, Ray

AU - Gilmour, Martin

AU - Rosenquist, Eric

AU - Butler, David

AU - Maximova, Siela

AU - Lanaud, Claire

PY - 2008/10/30

Y1 - 2008/10/30

UR - http://www.scopus.com/inward/record.url?scp=60549092591&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=60549092591&partnerID=8YFLogxK

U2 - 10.1186/1471-2164-9-512

DO - 10.1186/1471-2164-9-512

M3 - Article

C2 - 18973681

AN - SCOPUS:60549092591

VL - 9

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 512

ER -