Using Galaxy to perform large-scale interactive data analyses.

James Taylor, Ian Schenck, Daniel James Blankenberg, Anton Nekrutenko

Research output: Contribution to journal › Article

Abstract

While most experimental biologists know where to download genomic data, few have a concrete plan for how to analyze it. This situation can be corrected by: (1) providing unified portals serving genomic data and (2) building Web applications to allow flexible retrieval and on-the-fly analyses of the data. Powerful resources, such as the UCSC Genome Browser, already address the first issue. The second issue, however, remains open. For example, how does one find human protein-coding exons with the highest density of single nucleotide polymorphisms (SNPs) and extract orthologous sequences from all sequenced mammals? Indeed, one can access all relevant data from the UCSC Genome Browser. But once the data is downloaded, how would one deal with millions of SNPs and gigabytes of alignments? Galaxy (http://g2.bx.psu.edu) is designed specifically for that purpose. It amplifies the strengths of existing resources (such as the UCSC Genome Browser) by allowing the user to access and, most importantly, analyze data within a single interface in an unprecedented number of ways.

Original language: English (US)
Journal: Current Protocols in Bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.]
Volume: Chapter 10
State: Published - Sep 1 2007

Fingerprint

Galaxies
Genes
Genome
Polymorphism
Single Nucleotide Polymorphism
Nucleotides
Mammals
World Wide Web
Exons
Concretes
Proteins

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry

Cite this

@article{df26b6b5af9c4f7c89c48a82b94a5423,
title = "Using {G}alaxy to perform large-scale interactive data analyses.",
abstract = "While most experimental biologists know where to download genomic data, few have a concrete plan on how to analyze it. This situation can be corrected by: (1) providing unified portals serving genomic data and (2) building Web applications to allow flexible retrieval and on-the-fly analyses of the data. Powerful resources, such as the UCSC Genome Browser already address the first issue. The second issue, however, remains open. For example, how to find human protein-coding exons with the highest density of single nucleotide polymorphisms (SNPs) and extract orthologous sequences from all sequenced mammals? Indeed, one can access all relevant data from the UCSC Genome Browser. But once the data is downloaded how would one deal with millions of SNPs and gigabytes of alignments? Galaxy (http://g2.bx.psu.edu) is designed specifically for that purpose. It amplifies the strengths of existing resources (such as UCSC Genome Browser) by allowing the user to access and, most importantly, analyze data within a single interface in an unprecedented number of ways.",
author = "Taylor, James and Schenck, Ian and Blankenberg, {Daniel James} and Nekrutenko, Anton",
year = "2007",
month = "9",
day = "1",
language = "English (US)",
volume = "Chapter 10",
journal = "Current Protocols in Bioinformatics",
issn = "1934-3396",
publisher = "John Wiley and Sons Inc.",

}

Using Galaxy to perform large-scale interactive data analyses. / Taylor, James; Schenck, Ian; Blankenberg, Daniel James; Nekrutenko, Anton.

In: Current Protocols in Bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.], Vol. Chapter 10, 01.09.2007.

TY - JOUR
T1 - Using Galaxy to perform large-scale interactive data analyses.
AU - Taylor, James
AU - Schenck, Ian
AU - Blankenberg, Daniel James
AU - Nekrutenko, Anton
PY - 2007/9/1
Y1 - 2007/9/1
N2 - While most experimental biologists know where to download genomic data, few have a concrete plan on how to analyze it. This situation can be corrected by: (1) providing unified portals serving genomic data and (2) building Web applications to allow flexible retrieval and on-the-fly analyses of the data. Powerful resources, such as the UCSC Genome Browser already address the first issue. The second issue, however, remains open. For example, how to find human protein-coding exons with the highest density of single nucleotide polymorphisms (SNPs) and extract orthologous sequences from all sequenced mammals? Indeed, one can access all relevant data from the UCSC Genome Browser. But once the data is downloaded how would one deal with millions of SNPs and gigabytes of alignments? Galaxy (http://g2.bx.psu.edu) is designed specifically for that purpose. It amplifies the strengths of existing resources (such as UCSC Genome Browser) by allowing the user to access and, most importantly, analyze data within a single interface in an unprecedented number of ways.

AB - While most experimental biologists know where to download genomic data, few have a concrete plan on how to analyze it. This situation can be corrected by: (1) providing unified portals serving genomic data and (2) building Web applications to allow flexible retrieval and on-the-fly analyses of the data. Powerful resources, such as the UCSC Genome Browser already address the first issue. The second issue, however, remains open. For example, how to find human protein-coding exons with the highest density of single nucleotide polymorphisms (SNPs) and extract orthologous sequences from all sequenced mammals? Indeed, one can access all relevant data from the UCSC Genome Browser. But once the data is downloaded how would one deal with millions of SNPs and gigabytes of alignments? Galaxy (http://g2.bx.psu.edu) is designed specifically for that purpose. It amplifies the strengths of existing resources (such as UCSC Genome Browser) by allowing the user to access and, most importantly, analyze data within a single interface in an unprecedented number of ways.

UR - http://www.scopus.com/inward/record.url?scp=43349095809&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=43349095809&partnerID=8YFLogxK
M3 - Article
C2 - 18428782
VL - Chapter 10
JO - Current Protocols in Bioinformatics
JF - Current Protocols in Bioinformatics
SN - 1934-3396
ER -