GeneLab’s In-house Omics Data Generation

Valery Boyko
Use of liquid handlers for high-throughput sample processing.

The opportunities for biological experimentation in space are limited and costly. To maximize the scientific discoveries made aboard the International Space Station (ISS) and space shuttle, NASA GeneLab’s Sample Processing Lab (SPL) has developed and validated standard operating procedures to process biological samples with high efficiency.  

It is important to compare results across different experiments to gain additional insights into the effects of spaceflight on biological systems. A centralized sample processing operation such as the SPL makes this harmonization of datasets possible. The lab has developed protocols for the generation of raw and processed DNA and RNA-sequencing data from mammalian tissues, cells, and bacteria. Once the lab’s work is completed and the data processing and curation steps have taken place, the data become immediately available to the public worldwide through the GeneLab Data Repository. The SPL is fully equipped to process numerous model organisms, including rodents, plants, microbes, fruit flies, and worms, for many downstream applications.

State-of-the-art omics lab

Metadata curation and an open science database are what set GeneLab’s Sample Processing Lab apart from other labs. The SPL has the experience required to anticipate the type of experimental details needed to provide scientists with sequencing information, and to allow them to use these data in a way that augments their hypothesis-driven research. Metadata collection starts with the experimental design, from how animals were housed to which preservation method was used to store samples, and continues through the whole workflow.

The highly skilled SPL team continuously trains to expand its knowledge of next-generation sequencing (NGS) technologies. NGS enables researchers to perform a wide variety of applications and to study biological systems at a level never before possible, with ultra-high throughput, scalability, and speed. The state-of-the-art sequencing facility has the latest equipment:

  • NovaSeq 6000 sequencer
  • iSeq 100 sequencers
  • Bullet Blender 24 Gold
  • Maxwell® RSC
  • epMotion 5073t liquid handler
  • epMotion 5075t liquid handler
  • Qubit 4 Fluorometer
  • 2100 Bioanalyzer
  • 4200 TapeStation System
  • BluePippin size selection tool
  • QIAgility
  • Covaris

Yi-Chun Chen
GeneLab provides sequencing service to Space-Biology-funded PIs with the production-scale sequencer, NovaSeq 6000.

Quality data with high-throughput processing

Sample processing starts with gathering background information on all samples to be processed. Once the required information is at hand, a protocol workflow that suits the goal is selected. The SPL has a well-defined workflow of standard operating procedures that are meticulously controlled and tested on many different sample types.

Laboratory automation capitalizes on technology and avoids operator bias: a Promega Maxwell RSC system extracts bacterial RNA and DNA; the Eppendorf epMotion liquid handlers prepare sequencing libraries. Many quality-control steps are incorporated into the workflows: laboratory automation protocols ensure safe handling of precious samples, a Qubit fluorometer is used for quantification, and the Agilent 2100 Bioanalyzer and 4200 TapeStation systems provide quality assessments.

SPL’s new NovaSeq 6000 is the highest-throughput short-read sequencer currently on the market. To get quality data from every run, the team can first use a smaller-scale benchtop sequencer, the iSeq 100, to get an accurate picture of library quality prior to high-throughput sequencing.
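One way a small pilot run can inform a larger one is by showing how evenly the libraries in a pool are represented. The sketch below is only an illustration of that idea: it assumes per-library read counts from a pilot run are available, the library names and counts are invented, and the function `rebalance_factors` is hypothetical rather than a description of how the SPL uses the iSeq 100.

```python
def rebalance_factors(pilot_reads: dict[str, int]) -> dict[str, float]:
    """Volume multiplier per library so each would reach an equal share of reads."""
    if not pilot_reads or any(n <= 0 for n in pilot_reads.values()):
        raise ValueError("every library needs a positive pilot read count")
    total = sum(pilot_reads.values())
    target_share = 1 / len(pilot_reads)
    # A library that got half its target share needs twice the volume, and so on.
    return {
        name: target_share / (count / total)
        for name, count in pilot_reads.items()
    }


# Invented pilot-run read counts, for illustration only.
pilot = {"lib_A": 100_000, "lib_B": 50_000, "lib_C": 150_000}
factors = rebalance_factors(pilot)
for name, factor in factors.items():
    print(f"{name}: x{factor:.2f}")
```

In this made-up pool, the under-represented library would be added at about twice its original volume and the over-represented one at about two thirds.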

The NovaSeq 6000 can generate up to 6000 Gb (gigabases) of data, or 20 billion reads, in a single two-day run. This enormous output makes the NovaSeq 6000 the most cost-efficient sequencer on the market. Following sequencing, the data are transferred to the GeneLab Data Processing Cluster for pre-processing and analysis. GeneLab’s Data Processing Team has developed standardized processing pipelines to convert BCL files to FASTQ, perform quality control, and carry out further downstream analysis.
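As a rough back-of-the-envelope check on these figures, the sketch below divides the quoted maximum output by the quoted read count. It assumes the 6000 figure counts gigabases of sequence rather than gigabytes of storage, and the per-sample read depth is a hypothetical planning number, not a GeneLab standard.

```python
# Back-of-the-envelope run capacity, using the maximum figures quoted above.
# Assumption: the 6000 figure is gigabases of sequence, not gigabytes on disk.
RUN_OUTPUT_BASES = 6000 * 10**9   # up to 6000 gigabases per run
RUN_READS = 20 * 10**9            # up to 20 billion reads per run


def bases_per_read(total_bases: int, total_reads: int) -> float:
    """Average number of bases sequenced per read."""
    return total_bases / total_reads


def samples_per_run(total_reads: int, reads_per_sample: int) -> int:
    """How many samples fit in one run at a given read depth."""
    return total_reads // reads_per_sample


# Hypothetical depth, for illustration only.
EXAMPLE_READS_PER_SAMPLE = 50 * 10**6

print(bases_per_read(RUN_OUTPUT_BASES, RUN_READS))            # 300.0
print(samples_per_run(RUN_READS, EXAMPLE_READS_PER_SAMPLE))   # 400
```

An average of 300 bases per read would be consistent with, for example, a paired-end 2 x 150 configuration. Under the assumed depth of 50 million reads per sample, a full run would hold about 400 samples.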

GeneLab sequencing standards

The Sample Processing Lab uses NGS capabilities to process samples returned to Earth from space. All the standard protocols used for nucleic acid and protein extraction, library preparation, and RNA sequencing have been established in collaboration with the scientific community through GeneLab’s Analysis Working Groups. These standard operating procedures cover sample extraction, library generation, internal and external controls, and sequencing parameters, and will soon be publicly available on the GeneLab website. To ensure consistency across all spaceflight biology experiments, once a protocol is established and verified for a particular tissue type, that same protocol is performed on every sample from every experiment containing that tissue type.
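The "one verified protocol per tissue type" rule can be pictured as a simple lookup that refuses to proceed when no protocol has been verified. This is a minimal sketch, not GeneLab's actual tooling: the tissue names, protocol IDs, and the function `protocol_for` are all hypothetical placeholders.

```python
# Sketch of "one verified protocol per tissue type".
# All tissue names and protocol IDs below are hypothetical placeholders.
VERIFIED_PROTOCOLS = {
    "liver": "RNA-EXTRACT-A",
    "spleen": "RNA-EXTRACT-B",
}


def protocol_for(tissue_type: str) -> str:
    """Return the single verified protocol for a tissue type.

    Raises KeyError if no protocol has been verified yet, so a sample
    is never processed with an ad hoc procedure.
    """
    key = tissue_type.strip().lower()
    if key not in VERIFIED_PROTOCOLS:
        raise KeyError(f"no verified protocol for tissue type: {tissue_type!r}")
    return VERIFIED_PROTOCOLS[key]


# Samples from different experiments with the same tissue get the same protocol.
samples = [("experiment-1", "Liver"), ("experiment-2", "liver")]
assigned = {protocol_for(tissue) for _, tissue in samples}
print(assigned)  # a set containing one protocol ID
```

Keying the lookup on tissue type alone, rather than on experiment, is what keeps the resulting datasets comparable across missions.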

All newly processed omics data hosted on GeneLab are carefully curated to include all essential metadata associated with each spaceflight experiment. Meticulous curation and organization of data allow users to easily navigate through datasets and identify key information necessary for accurate interpretation of results. The SPL team has perfected its RNA sequencing protocols and is currently working on metagenomics and DNA sequencing procedures.

Sequencing services for space biology principal investigators

GeneLab is working with principal investigators (PIs) to generate non-proprietary data using its budget, or proprietary data using the PI’s budget. Currently, only scientists funded by NASA’s Biological and Physical Sciences Program or NASA employees are able to take advantage of these services. Using GeneLab services allows PIs to benefit from GeneLab’s experience processing space-flown tissues and to harmonize their data through GeneLab’s standard assays. All primary sequencing data are run through a standard GeneLab data processing pipeline (developed and approved by GeneLab Analysis Working Groups) to generate a minimal set of processed data that will be delivered to the PI along with the raw data.

To contact GeneLab for data processing quote requests, fill out this form.