Enabling cloud bursting for life sciences within Galaxy

Enis Afgan, Nate Coraor, John M. Chilton, Dannon Baker, James Taylor

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Fueled by the radically increased capacity to generate data over the past decade, the field of biomedical research has been constrained by the ability to analyze data. Galaxy, a Web-based, open-source data integration and analysis platform for life science research, has been democratizing access to data analysis tools. However, the scale of data and the scope of tools required have proven to be a significant challenge for any monolithic deployment of the Galaxy application. We have found that a distributed and federated approach to utilizing compute and storage resources is necessary. This paper describes the ongoing efforts in creating a ubiquitous platform capable of simultaneously utilizing dedicated as well as on-demand cloud resources. Specifically, the requirements, process, and an implementation of a cloud-bursting system are detailed.

Original language: English (US)
Pages (from-to): 4330-4343
Number of pages: 14
Journal: Concurrency Computation
Volume: 27
Issue number: 16
DOI: https://doi.org/10.1002/cpe.3536
State: Published - Nov 1 2015

Fingerprint

Galaxies
Bursting
Life sciences
Data integration
Data analysis
Resources
Open Source
Web-based
Necessary
Requirements

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Computer Science Applications
  • Computer Networks and Communications
  • Computational Theory and Mathematics

Cite this

Afgan, Enis; Coraor, Nate; Chilton, John M.; Baker, Dannon; Taylor, James. / Enabling cloud bursting for life sciences within Galaxy. In: Concurrency Computation. 2015; Vol. 27, No. 16. pp. 4330-4343.

@article{065378073a9445e8bd44a67663adc424,
title = "Enabling cloud bursting for life sciences within Galaxy",
abstract = "Fueled by the radically increased capacity to generate data over the past decade, the field of biomedical research has been constrained by the ability to analyze data. Galaxy, a Web-based, open-source data integration and analysis platform for life science research, has been democratizing access to data analysis tools. However, the scale of data and the scope of tools required have proven to be a significant challenge for any monolithic deployment of the Galaxy application. We have found that a distributed and federated approach to utilizing compute and storage resources is necessary. This paper describes the ongoing efforts in creating a ubiquitous platform capable of simultaneously utilizing dedicated as well as on-demand cloud resources. Specifically, the requirements, process, and an implementation of a cloud-bursting system are detailed.",
author = "Enis Afgan and Nate Coraor and Chilton, {John M.} and Dannon Baker and James Taylor",
year = "2015",
month = "11",
day = "1",
doi = "10.1002/cpe.3536",
language = "English (US)",
volume = "27",
pages = "4330--4343",
journal = "Concurrency Computation Practice and Experience",
issn = "1532-0626",
publisher = "John Wiley and Sons Ltd",
number = "16",

}

Afgan, E, Coraor, N, Chilton, JM, Baker, D & Taylor, J 2015, 'Enabling cloud bursting for life sciences within Galaxy', Concurrency Computation, vol. 27, no. 16, pp. 4330-4343. https://doi.org/10.1002/cpe.3536

Enabling cloud bursting for life sciences within Galaxy. / Afgan, Enis; Coraor, Nate; Chilton, John M.; Baker, Dannon; Taylor, James.

In: Concurrency Computation, Vol. 27, No. 16, 01.11.2015, p. 4330-4343.

TY  - JOUR
T1  - Enabling cloud bursting for life sciences within Galaxy
AU  - Afgan, Enis
AU  - Coraor, Nate
AU  - Chilton, John M.
AU  - Baker, Dannon
AU  - Taylor, James
PY  - 2015/11/1
Y1  - 2015/11/1
AB  - Fueled by the radically increased capacity to generate data over the past decade, the field of biomedical research has been constrained by the ability to analyze data. Galaxy, a Web-based, open-source data integration and analysis platform for life science research, has been democratizing access to data analysis tools. However, the scale of data and the scope of tools required have proven to be a significant challenge for any monolithic deployment of the Galaxy application. We have found that a distributed and federated approach to utilizing compute and storage resources is necessary. This paper describes the ongoing efforts in creating a ubiquitous platform capable of simultaneously utilizing dedicated as well as on-demand cloud resources. Specifically, the requirements, process, and an implementation of a cloud-bursting system are detailed.
UR  - http://www.scopus.com/inward/record.url?scp=84944883160&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84944883160&partnerID=8YFLogxK
U2  - 10.1002/cpe.3536
DO  - 10.1002/cpe.3536
M3  - Article
AN  - SCOPUS:84944883160
VL  - 27
SP  - 4330
EP  - 4343
JO  - Concurrency Computation Practice and Experience
JF  - Concurrency Computation Practice and Experience
SN  - 1532-0626
IS  - 16
ER  -