Community-Driven Data Analysis Training for Biology

Galaxy Training Network

Research output: Contribution to journal › Article › peer-review

71 Scopus citations


The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static but a continuously improved collection of tutorials. By coupling tutorials with a web-based analysis framework, biomedical researchers can learn by performing computation themselves through a web browser without the need to install software or search for example datasets. Our ultimate goal is to expand the breadth of training materials to include fundamental statistical and data science topics and to precipitate a complete re-engineering of undergraduate and graduate curricula in life sciences. This project is accessible at

We developed an infrastructure that facilitates data analysis training in life sciences. It is an interactive learning platform tuned for current types of data and research problems. Importantly, it provides a means for community-wide content creation and maintenance and, finally, enables trainers and trainees to use the tutorials in a variety of situations, such as those where reliable Internet access is unavailable.

Original language: English (US)
Pages (from-to): 752-758.e1
Journal: Cell Systems
Issue number: 6
State: Published - Jun 27 2018

All Science Journal Classification (ASJC) codes

  • Pathology and Forensic Medicine
  • Histology
  • Cell Biology

