MicroRNAs are short RNA molecules (~20 to 24 nucleotides) that regulate gene expression in animals and plants. Their targets function in many key cellular processes, including development and responses to biotic and abiotic stress. In crop plants, microRNAs have demonstrated roles in supporting phenotypes important for fertility and yield, and thus contribute to food security and the agricultural economy. As with all genomic features, microRNA genes must be cataloged and tracked consistently across genomes using appropriate gene names. This requires uniform and rigorous assignment of identifiers, periodic reassessment and screening using the latest criteria, and community input. Different publications and databases must use the same name for a particular microRNA, and homologs in different species should ideally receive consistent names. A single, central registry for microRNA nomenclature and quality control, miRBase, has existed since 2004 to name and catalog microRNAs. miRBase faces challenges in the current era of 'big data'. In particular, registration of new microRNAs currently relies on time-consuming manual curation. Inexpensive sequencing of genomes and small RNAs has massively increased the discovery rate of microRNAs, exceeding the capacity of miRBase for manual curation of new entries. Yet the need for a single trusted resource is more acute than ever, both for rapid registration of new microRNAs and for assessment of existing and new annotations.
This collaborative project between the Donald Danforth Plant Science Center and Pennsylvania State University in the US and the University of Manchester in the UK will develop an autonomous and automated microRNA registry system for miRBase. The system will be integrated with miRBase naming protocols and will add quality controls. The project will resolve a critical problem for the continued growth of the small RNA field, particularly in plants. By automating submission, curation, and release of new annotations, this project will eliminate the current reliance on time-consuming manual annotation to maintain miRBase. This will allow miRBase to continue to catalyze fundamental discoveries in all areas of biology. Specifically, the project will build on existing tools to annotate high-confidence microRNAs from plant small RNA deep sequencing datasets. It will develop a pipeline to automatically assign consistent gene names to novel plant microRNA annotations generated by users from these deep sequencing datasets. Finally, it will incorporate new interfaces into the miRBase infrastructure to accept and display user submissions of novel microRNA data, and to expand options for users to visualize data. The project will also train researchers in the analysis of biological big data and full-stack web development, skill sets in high demand across the sciences and beyond. In addition, the project will integrate research with undergraduate education and build on programs to broaden participation by under-represented groups.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Effective start/end date: 12/1/21 → 11/30/24
- National Science Foundation: $60,638.00