Democratization of Data Analysis in Life Sciences Through Galaxy

Project: Research project

Description

DESCRIPTION (provided by applicant): Biomedical research has been rapidly transformed into an informatics-intensive discipline. This has created challenges at many levels, including the availability of computational infrastructure and expertise, the burden of keeping up with rapidly developing tools and best practices, communication difficulties between experimentalists and computational researchers, and difficulties ensuring reproducibility. Over the last six years we have developed an open-source software framework, Galaxy (http://usegalaxy.org), to address these issues. Galaxy provides an accessible analysis environment that allows experimentalists to use cutting-edge tools on large datasets, with automated tracking to ensure reproducibility. Galaxy also makes it easy for tool developers to quickly put their tools into experimentalists' hands. Galaxy has become an indispensable resource for the genomic research community: first, for the thousands of experimentalists using Galaxy's tools in their research (as evidenced in many publications); beyond that, as the local analysis infrastructure adopted by many dozens of labs and institutes. Galaxy is flexible enough to be deployed on a variety of compute resources, which is particularly important as data production is increasingly decentralized. At Galaxy's core is a powerful, extensible framework that other important community resource projects are now integrating or building on. Thus Galaxy is ideally positioned to become a substrate for sharing and communicating analysis. We propose to expand the Galaxy resource with novel approaches for accessible, transparent, and reproducible analysis in a decentralized world. Driven by biological projects, we will build best-practice workflows for several sequencing-based experiments. We will create innovative methods to automate packaging and deploying analysis tools. We will build and maintain the Galaxy Tool Shed, a hub for sharing tools, best-practice workflows, and analysis strategies. We will develop a novel approach for publishing analysis. We will create a framework for visual analytics leveraging existing Galaxy tools. Finally, we will build a complete solution for managing sequencing workflows, including sample tracking and instrument integration.

PUBLIC HEALTH RELEVANCE: The rapid proliferation of genomic approaches is revolutionizing the medical field by creating novel diagnostic applications. This project will make cutting-edge biomedical analysis tools available to every clinical researcher, fulfilling the translational promise of sequencing technologies.
Status: Active
Effective start/end date: 2/22/12 to 12/31/20

Funding

  • National Institutes of Health: $1,740,000.00
  • National Institutes of Health: $1,402,446.00
  • National Institutes of Health: $1,476,748.00
  • National Institutes of Health: $1,373,678.00
  • National Institutes of Health: $1,409,638.00
  • National Institutes of Health: $1,740,000.00
  • National Institutes of Health: $1,740,000.00
  • National Institutes of Health: $1,740,000.00

Fingerprint

Galaxies
Throughput
Cloud computing
Computational methods
Statistical methods
DNA
Inspection
Experiments