Tree-guided Bayesian inference of population structures

Yu Zhang

Research output: Contribution to journal (Article)

6 Citations (Scopus)

Abstract

Motivation: Inferring population structures using genetic data sampled from a group of individuals is a challenging task. Many methods either consider a fixed population number or ignore the correlation between populations. As a result, they can lose sensitivity and specificity in detecting subtle stratifications. In addition, when a large number of genetic markers are used, many existing algorithms perform rather inefficiently.

Results: We propose a new Bayesian method to infer population structures using multiple unlinked single nucleotide polymorphisms (SNPs). Our approach explicitly models the correlation between populations through a tree hierarchy, and treats the population number as a random variable. Using both simulated and real datasets of worldwide samples, we demonstrate that incorporating a tree can consistently improve the power to detect subtle population stratifications. A tree-based model often involves a large number of unknown parameters, and the corresponding estimation procedure can be highly inefficient. We therefore implement a partition method to analytically integrate out all nuisance parameters in the tree. As a result, our method can analyze large SNP datasets with a significantly improved convergence rate.

Original language: English (US)
Pages (from-to): 965-971
Number of pages: 7
Journal: Bioinformatics
Volume: 24
Issue number: 7
DOI: 10.1093/bioinformatics/btn070
State: Published - Apr 1 2008

Fingerprint

Population Structure
Bayesian inference
Nucleotides
Polymorphism
Random variables
Single nucleotide Polymorphism
Population
Stratification
Single Nucleotide Polymorphism
Nuisance Parameter
Bayesian Methods
Unknown Parameters
Specificity
Convergence Rate
Bayes Theorem
Genetic Structures
Random variable
Integrate
Partition
Genetic Markers

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Medicine (all)
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Zhang, Yu. / Tree-guided Bayesian inference of population structures. In: Bioinformatics. 2008 ; Vol. 24, No. 7. pp. 965-971.
@article{682e6a6709dd48ad9662b76723ce8bc4,
title = "Tree-guided Bayesian inference of population structures",
author = "Yu Zhang",
year = "2008",
month = "4",
day = "1",
doi = "10.1093/bioinformatics/btn070",
language = "English (US)",
volume = "24",
pages = "965--971",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "7",
}

TY - JOUR

T1 - Tree-guided Bayesian inference of population structures

AU - Zhang, Yu

PY - 2008/4/1

Y1 - 2008/4/1

UR - http://www.scopus.com/inward/record.url?scp=41349097940&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=41349097940&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btn070

DO - 10.1093/bioinformatics/btn070

M3 - Article

C2 - 18296461

AN - SCOPUS:41349097940

VL - 24

SP - 965

EP - 971

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 7

ER -