A bi-Poisson model for clustering gene expression profiles by RNA-seq

Ningtao Wang, Yaqun Wang, Han Hao, Luojun Wang, Zhong Wang, Jianxin Wang, Rongling Wu

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene expression in response to changing environment. The model capitalizes on the Poisson distribution to capture the count property of RNA-seq data. A two-stage hierarchical expectation-maximization (EM) algorithm is implemented to estimate an optimal number of groups and mean expression amounts of each group across two environments. A procedure is formulated to test whether and how a given group shows a plastic response to environmental changes. The impact of gene-environment interactions on the phenotypic plasticity of the organism can also be visualized and characterized. The model was used to analyse an RNA-seq dataset measured from two cell lines of breast cancer that respond differently to an anti-cancer drug, from which genes associated with the resistance and sensitivity of the cell lines are identified. We performed simulation studies to validate the statistical behaviour of the model. The model provides a useful tool for clustering gene expression data by RNA-seq, facilitating our understanding of gene functions and networks.
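To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each gene contributes one read count per environment, that the two counts are independent Poisson variables given the gene's group, and that the number of groups k is fixed in advance (the paper estimates it, and uses a two-stage hierarchical EM rather than the plain EM shown here). The names fit_bipoisson_mixture and plasticity_lrt, the deterministic initialisation, and the plug-in likelihood-ratio test with a chi-square reference on 1 degree of freedom are all illustrative assumptions. No library-size normalisation is applied.

```python
import math

EPS = 1e-12  # floor that keeps logarithms and divisions finite


def fit_bipoisson_mixture(counts, k, n_iter=200, tol=1e-8):
    """Fit a k-component mixture of independent Poisson pairs by plain EM.

    counts: list of (y1, y2) read counts per gene in environments 1 and 2.
    Returns (weights, means, resp, loglik). loglik omits the log-factorial
    terms, which do not depend on the parameters.
    """
    n = len(counts)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of genes")
    # Deterministic start (an assumption of this sketch): order genes by
    # y1 - y2 and use the average of each of k chunks as an initial mean pair.
    order = sorted(range(n), key=lambda i: counts[i][0] - counts[i][1])
    means = []
    for j in range(k):
        chunk = order[j * n // k:(j + 1) * n // k]
        means.append((sum(counts[i][0] for i in chunk) / len(chunk) + 0.5,
                      sum(counts[i][1] for i in chunk) / len(chunk) + 0.5))
    weights = [1.0 / k] * k
    prev = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each group for each gene,
        # computed in log space for numerical stability.
        resp, loglik = [], 0.0
        for y1, y2 in counts:
            logp = [math.log(w) + y1 * math.log(m1) - m1 + y2 * math.log(m2) - m2
                    for w, (m1, m2) in zip(weights, means)]
            top = max(logp)
            norm = top + math.log(sum(math.exp(v - top) for v in logp))
            resp.append([math.exp(v - norm) for v in logp])
            loglik += norm
        # M-step: responsibility-weighted proportions and mean counts.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), EPS)
            weights[j] = max(nj / n, EPS)
            means[j] = (max(sum(r[j] * y[0] for r, y in zip(resp, counts)) / nj, EPS),
                        max(sum(r[j] * y[1] for r, y in zip(resp, counts)) / nj, EPS))
        if loglik - prev < tol:
            break
        prev = loglik
    return weights, means, resp, loglik


def plasticity_lrt(group_counts):
    """Likelihood-ratio test of equal Poisson means in the two environments.

    Treats the genes in group_counts as known members of one group, which
    ignores clustering uncertainty. Returns (statistic, p_value).
    """
    n = len(group_counts)
    s1 = sum(y1 for y1, _ in group_counts)
    s2 = sum(y2 for _, y2 in group_counts)
    pooled = (s1 + s2) / (2 * n)  # common mean under the null hypothesis
    stat = 0.0
    for s in (s1, s2):
        if s > 0:
            stat += 2.0 * s * math.log((s / n) / pooled)
    stat = max(stat, 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Toy data (made up for illustration): four genes that are higher in
# environment 2 and four that are higher in environment 1.
toy_counts = [(4, 52), (6, 48), (5, 50), (3, 55),
              (50, 5), (47, 6), (53, 4), (49, 7)]
weights, means, resp, loglik = fit_bipoisson_mixture(toy_counts, k=2)
labels = [max(range(2), key=r.__getitem__) for r in resp]
stat, p_value = plasticity_lrt([c for c, g in zip(toy_counts, labels) if g == 0])
```

Here labels holds the most probable group for each gene, and a small p_value for a group suggests that its mean expression differs between the two environments, which is one simple way to read a plastic response. Choosing k, for example by comparing fits with an information criterion, is left out of this sketch.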

Original language: English (US)
Article number: bbt029
Pages (from-to): 534-541
Number of pages: 8
Journal: Briefings in bioinformatics
Volume: 15
Issue number: 4
DOI: https://doi.org/10.1093/bib/bbt029
State: Published - Jul 2014

Fingerprint

RNA
Transcriptome
Gene expression
Cluster Analysis
Genes
Poisson Distribution
Cell Line
Gene-Environment Interaction
Cells
Gene Regulatory Networks
Statistical Models
Plastics
Plasticity
Breast Neoplasms
Availability
Pharmaceutical Preparations
Neoplasms

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Molecular Biology

Cite this

Wang, N., Wang, Y., Hao, H., Wang, L., Wang, Z., Wang, J., & Wu, R. (2014). A bi-Poisson model for clustering gene expression profiles by RNA-seq. Briefings in bioinformatics, 15(4), 534-541. [bbt029]. https://doi.org/10.1093/bib/bbt029
Wang, Ningtao ; Wang, Yaqun ; Hao, Han ; Wang, Luojun ; Wang, Zhong ; Wang, Jianxin ; Wu, Rongling. / A bi-Poisson model for clustering gene expression profiles by RNA-seq. In: Briefings in bioinformatics. 2014 ; Vol. 15, No. 4. pp. 534-541.
@article{269cac05c54a45bc990f956b9cae2f05,
title = "A bi-Poisson model for clustering gene expression profiles by RNA-seq",
abstract = "With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene expression in response to changing environment. The model capitalizes on the Poisson distribution to capture the count property of RNA-seq data. A two-stage hierarchical expectation-maximization (EM) algorithm is implemented to estimate an optimal number of groups and mean expression amounts of each group across two environments. A procedure is formulated to test whether and how a given group shows a plastic response to environmental changes. The impact of gene-environment interactions on the phenotypic plasticity of the organism can also be visualized and characterized. The model was used to analyse an RNA-seq dataset measured from two cell lines of breast cancer that respond differently to an anti-cancer drug, from which genes associated with the resistance and sensitivity of the cell lines are identified. We performed simulation studies to validate the statistical behaviour of the model. The model provides a useful tool for clustering gene expression data by RNA-seq, facilitating our understanding of gene functions and networks.",
author = "Ningtao Wang and Yaqun Wang and Han Hao and Luojun Wang and Zhong Wang and Jianxin Wang and Rongling Wu",
year = "2014",
month = "7",
doi = "10.1093/bib/bbt029",
language = "English (US)",
volume = "15",
pages = "534--541",
journal = "Briefings in Bioinformatics",
issn = "1467-5463",
publisher = "Oxford University Press",
number = "4",

}

Wang, N, Wang, Y, Hao, H, Wang, L, Wang, Z, Wang, J & Wu, R 2014, 'A bi-Poisson model for clustering gene expression profiles by RNA-seq', Briefings in bioinformatics, vol. 15, no. 4, bbt029, pp. 534-541. https://doi.org/10.1093/bib/bbt029

A bi-Poisson model for clustering gene expression profiles by RNA-seq. / Wang, Ningtao; Wang, Yaqun; Hao, Han; Wang, Luojun; Wang, Zhong; Wang, Jianxin; Wu, Rongling.

In: Briefings in bioinformatics, Vol. 15, No. 4, bbt029, 07.2014, p. 534-541.


TY - JOUR

T1 - A bi-Poisson model for clustering gene expression profiles by RNA-seq

AU - Wang, Ningtao

AU - Wang, Yaqun

AU - Hao, Han

AU - Wang, Luojun

AU - Wang, Zhong

AU - Wang, Jianxin

AU - Wu, Rongling

PY - 2014/7

Y1 - 2014/7

AB - With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene expression in response to changing environment. The model capitalizes on the Poisson distribution to capture the count property of RNA-seq data. A two-stage hierarchical expectation-maximization (EM) algorithm is implemented to estimate an optimal number of groups and mean expression amounts of each group across two environments. A procedure is formulated to test whether and how a given group shows a plastic response to environmental changes. The impact of gene-environment interactions on the phenotypic plasticity of the organism can also be visualized and characterized. The model was used to analyse an RNA-seq dataset measured from two cell lines of breast cancer that respond differently to an anti-cancer drug, from which genes associated with the resistance and sensitivity of the cell lines are identified. We performed simulation studies to validate the statistical behaviour of the model. The model provides a useful tool for clustering gene expression data by RNA-seq, facilitating our understanding of gene functions and networks.

UR - http://www.scopus.com/inward/record.url?scp=84904722796&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84904722796&partnerID=8YFLogxK

U2 - 10.1093/bib/bbt029

DO - 10.1093/bib/bbt029

M3 - Article

C2 - 23665510

AN - SCOPUS:84904722796

VL - 15

SP - 534

EP - 541

JO - Briefings in Bioinformatics

JF - Briefings in Bioinformatics

SN - 1467-5463

IS - 4

M1 - bbt029

ER -

Wang N, Wang Y, Hao H, Wang L, Wang Z, Wang J et al. A bi-Poisson model for clustering gene expression profiles by RNA-seq. Briefings in bioinformatics. 2014 Jul;15(4):534-541. bbt029. https://doi.org/10.1093/bib/bbt029