GeneLab: Open Science for Life in Space
To enable scientific discovery and space exploration through multi-omics data-driven research.
- Design and deploy a unique repository, the GeneLab Data System (GLDS), housing molecular data generated from DNA, RNA, proteins, and metabolites extracted from spaceflight or spaceflight-relevant samples (collectively called “omics” data: transcriptomic, proteomic, epigenomic, metagenomic, and metabolomic)
- Partner with spaceflight-relevant projects through sample sharing or augmentation of experimental samples to expand omics analyses
- Process spaceflight-relevant samples to address knowledge gaps, establish best practices for sample processing, and provide common omics processing platforms for the space biology community, leading to a more cohesive set of independent datasets
- Curate spaceflight-relevant datasets beyond best practice, and make GeneLab-processed data publicly available as quickly as possible
- Provide web-based analytical and visualization tools for raw and processed data, democratizing access to spaceflight-relevant data and disseminating knowledge of how life responds to the space environment. This platform will be an essential tool for discovering new countermeasures for human exploration of space, and will benefit life on Earth as well.
Long-duration human exploration of space faces major hurdles, including risks to astronaut health and challenges in environmental control and life support. NASA initiated the GeneLab project on the premise that mining omics data from spaceflight experiments offers an immense opportunity to understand the effects of spaceflight on biological systems, and that this is best accomplished by ensuring open access to these data for as many researchers as possible. GeneLab collects data from a variety of model organisms. These studies enable queries of how DNA, RNA, and proteins, the building blocks of life, adapt and respond to the space environment. Because no single omics analysis can fully unravel the complexities of fundamental systems biology, GeneLab is building on the experiments designed by investigators to provide multiple layers of omics information that can be studied in an integrated fashion. This approach enables the scientific community to obtain a more complete understanding of how biological systems adapt to spaceflight, leading to translational research.
“The whole is greater than the sum of its parts.” - Aristotle
The GeneLab project is a multi-year, multi-phase project that includes a unique data repository and partnerships with investigators performing spaceflight and spaceflight-related experiments in order to generate additional data. The GeneLab team comprises computer scientists, physicists, biologists, and bioinformaticians at NASA Ames Research Center, Moffett Field, California. Science direction and funding are provided by the Space Life and Physical Sciences Division at NASA Headquarters. Initial project funding was also provided by the International Space Station Research Integration Office at NASA Johnson Space Center.
GeneLab Data System
The GeneLab Data System (GLDS), developed by the GeneLab team, is NASA’s premier open-access omics data platform for biological experiments. The GLDS houses standards-compliant, high-throughput sequencing and other omics data from spaceflight-relevant experiments. GLDS Version 1.0 went online in April 2015, and by October 2016 the GLDS contained 80 datasets and had basic search and download capabilities. Version 2.0 was released in September 2017 and introduced integrated search capabilities leveraging other public omics databases (NCBI GEO, PRIDE, MG-RAST) as well as elastic search for keywords. GLDS 2.0 also introduced a workspace that provides easier access to the data repository and allows users to upload their data. The current GLDS, Version 3.0, provides a collaborative platform equipped with tools, powered by the Galaxy user interface, for analyzing omics data. Version 3.0 also includes processed data that are generated by the GeneLab team using omics-species-specific workflows vetted by a large cohort of bioinformaticians and scientists from the space biology community gathered under GeneLab’s Analysis Working Group. Future versions in this multi-phase project will provide a means for data visualization so that users will be able to easily identify and navigate the biological changes that result from spaceflight.
As of GeneLab’s latest release, Version 3.0, in October 2018, over 200 omics datasets are housed in the GLDS repository. These datasets include data from experiments worldwide that explore the biological effects of the spaceflight environment on a wide variety of model organisms including rodents, invertebrates, plants, and microbes. Human datasets are currently limited to samples from unidentifiable donors (e.g. cultured cell lines and non-astronaut data). The datasets include NASA-funded experiments as well as those funded by other international space agencies. The GLDS ensures prompt release and open access to high-throughput genomics, transcriptomics, proteomics, and metabolomics data from spaceflight and ground-based simulations of microgravity, radiation, or other space environment factors. The data are meticulously curated to ensure that accurate experimental and sample processing metadata are included with each dataset. The GLDS is an international public repository with unrestricted access worldwide; dataset download volumes indicate strong interest in these data by the scientific community.
Analyzing GLDS data to explore the network of molecular responses of terrestrial biology to the space environment will contribute fundamental knowledge of how that environment affects biological systems. The knowledge gained will also yield benefits on Earth, because some effects observed during exposure to space environments mimic terrestrial diseases (e.g. osteoporosis and aging), and strategies developed to mitigate those effects may apply to them.
GeneLab partners with as many spaceflight missions as possible to maximize the amount of omics data from every biology payload. Currently, GeneLab primarily partners with experiments that are flown on the International Space Station and funded through NASA’s Space Life and Physical Sciences Division. These partnerships serve several purposes and can include sample sharing, omics analyses beyond those in the originally designed experiment, or data sharing. To date, GeneLab has partnered with multiple biology missions including plant, mouse, microbe, and fruit fly experiments. Through these partnerships, GeneLab has generated a growing volume of publicly available data in the GLDS.
Analysis Working Groups
To foster the generation of new knowledge from data housed in the GLDS, establish consensus data analysis pipelines, and improve the functionality of the GLDS, GeneLab has established Analysis Working Groups (AWGs). These are groups of subject matter experts tasked with analyzing all GLDS data within a specific domain (plants, microbes, animals, multi-omics). Through this intensive utilization of GLDS data, these groups of excellence improve the utility of the GLDS while accelerating the pace of discovery from limited space biology experiments.