Frequently Asked Questions

General

Submission

Details

Sample Processing and Sequencing Group FAQ

 


General

What is GeneLab?

The multi-year GeneLab project is both a science collaboration initiative and a data system effort to establish a public bioinformatics repository. The mission of GeneLab is to maximize the utilization of the valuable biological research resources aboard the International Space Station (ISS) by collecting genomic, transcriptomic, proteomic, and metabolomic data, collectively known as “omics” data. These omics data will enable exploration of the molecular network responses of terrestrial biology to the space environment, and will be made available to researchers worldwide. The ultimate goal of GeneLab is to enable space exploration by allowing researchers to understand the complex responses of biological systems to the space environment. GeneLab data will potentially be useful for developing countermeasures, monitoring the microbes that colonize the space station, understanding how food plants could be modified to grow better in space, and unraveling the responses of humans and other organisms to the combined effects of altered gravity and space radiation.

What are omics data?

Omics refers to the data types named by adding the suffix -ome ('totality') to a field of study. Data derived from DNA (which encodes the genes comprising the genome) are genomic data. Data from RNA, which is transcribed from DNA, are transcriptomic data; data from proteins are proteomic data; and data from metabolic substances are metabolomic data. Other data types include epigenomics and epitranscriptomics (data showing modifications of DNA and RNA, respectively, that are involved in gene regulation) and lipidomics (characterizations of the lipids, or fats, in a sample). Metagenomics is the study of genetic material from an environmental or clinical sample. The sample may contain many different microbial types (bacteria, archaea, fungi, protists, and viruses) and might be quite complex in nature. Data sets generated by high-throughput analysis of these various omics types are often quite large.

What is the GeneLab Data System (GLDS)?

The GeneLab Data System (GLDS) is NASA’s premier open-access omics data platform for biological experiments. GLDS archives, houses, and freely distributes standards-compliant, high-throughput sequencing and other omics data from spaceflight-relevant experiments. The data are meticulously curated by GeneLab project bioinformaticians and enhanced by associated experimental metadata that include flight information, project details, sample/tissue processing protocols, omics analysis details and other ancillary metadata. For more information about various aspects of the GLDS or the GeneLab project please see our documentation.

Who sponsors GeneLab?

GeneLab is developed and managed at NASA’s Ames Research Center. Science direction and project funding are provided by NASA’s Division of Biological and Physical Sciences (BPS) at NASA Headquarters.

What data types does the GLDS support?

GeneLab accepts studies whose design types, assay types, and data types match those listed in the table below.

| Study Design Type | Assay Type | Data Type | Accepted Data File Format(s) |
|---|---|---|---|
| Genotyping Design | Genotyping Assay | DNA Sequence Data | FASTA, FASTQ |
| Genotyping by Array Design | Genotyping by Array Assay | DNA Sequence Data | CEL, Tab-delimited Text |
| Genotyping by High Throughput Sequencing Design | Genotyping By High Throughput Sequencing Assay | DNA Sequence Data | FASTA, FASTQ, SAM, HDF4, NetCDF, SFF, GTF/GFF3/VCF/BED |
| Comparative Genome Hybridization by Array Design | Comparative Genomic Hybridization By Array Assay | DNA Sequence Data | CEL, Tab-delimited Text |
| ChIP-seq Design | ChIP-seq Assay | DNA Sequence Data | FASTA, FASTQ, SAM, HDF4, NetCDF, SFF, GTF/GFF3/VCF/BED |
| DNA Methylation Profiling By High Throughput Sequencing Design | DNA Methylation Profiling By High Throughput Sequencing Assay | DNA Sequence Data | FASTA, FASTQ, SAM, HDF4, NetCDF, SFF, GTF/GFF3/VCF/BED |
| MicroRNA Profiling by Array Design | MicroRNA Profiling by Array Assay | Transcription Profiling Data | CEL, Tab-delimited Text |
| MicroRNA Profiling by High Throughput Sequencing Design | MicroRNA Profiling by High Throughput Sequencing Assay | Transcription Profiling Data | FASTA, FASTQ, SAM, HDF4, NetCDF, SFF, GTF/GFF3/VCF/BED |
| Transcription Profiling by Array Design | Transcription Profiling by Array Assay | Transcription Profiling Data | CEL, Tab-delimited Text |
| Transcription Profiling by High Throughput Sequencing Design | Transcription Profiling by High Throughput Sequencing (RNA-seq) | Transcription Profiling Data | FASTA, FASTQ, SAM, HDF4, NetCDF, SFF, GTF/GFF3/VCF/BED |
| Transcription Profiling by RT-PCR Design | Transcription Profiling by RT-PCR Assay | Transcription Profiling Data | Tab-delimited Text |
| Transcription Profiling by Tiling Array Design | Transcription Profiling by Tiling Array Assay | Transcription Profiling Data | CEL, Tab-delimited Text |
| Translation Profiling Design | Poly(A)-Site Sequencing Assay; RNA Bind-n-Seq Assay; Ribosomal Profiling by Sequencing Assay; Self-transcribing Active Regulatory Region Sequencing Assay; Serial Analysis of Gene Expression; Transcript Leader Sequencing; Transcription Profiling by MPSS Assay; Translation-associated Transcript Leader Sequencing; Translation Profiling Assay | Transcription Profiling Data | FASTA, FASTQ, SAM, HDF4, NetCDF, SFF, GTF/GFF3/VCF/BED, CEL, Tab-delimited Text |
| Proteomic Profiling Design | Mass Spectrometry Assay | Mass Spectrometry Assay Data | mzML, mzQuantML, spML, pdb, pdbseq, pdbnuc, pdbnucseq, CID, HCD, mzData, DTA, PKL, MS2, MGF, ETD, RAW |
| Proteomic Profiling Design | Electrophoresis System | Electrophoresis System Data | GelML, spML |
| Proteomic Profiling Design | Western Blot Analysis | Western Blot Analysis Data | spML |
| Proteomic Profiling by Array Design | Proteomic Profiling By Array Assay | Protein Microarray Assay Data | CEL, Tab-delimited Text |
| Metabolomic Profiling Design | Mass Spectrometry Assay | Mass Spectrometry Assay Data | mzML, mzQuantML, spML, mzData, DTA, PKL, MS2, MGF, ETD, RAW |
| Metabolomic Profiling Design | NMR Spectroscopy Assay | | Spectra: .doc, .docx, .txt, .pdf, .tif formats |
| Metabolomic Profiling Design | Other Metabolite Profiling Assay (TBD)[1] | Metabolomic Data (TBD)[1] | CSV, Excel, and (TBD)[2] |

[1] Pending concept specification by OBI (Ontology for Biomedical Investigation)

[2] Pending format(s) specification by life science research community
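
If you want to check a planned submission against the table above before contacting GeneLab, a small lookup can help. The sketch below is illustrative only: it encodes just a handful of rows from the table, the names ACCEPTED and is_accepted are hypothetical, and it assumes that "GTF/GFF3/VCF/BED" denotes four separate formats. It is not part of any GeneLab tool.

```python
from typing import Dict, FrozenSet, Tuple

# Formats listed for high-throughput sequencing assays in the table above.
# Assumption: "GTF/GFF3/VCF/BED" is read as four separate formats.
SEQUENCING_FORMATS = frozenset(
    {"FASTA", "FASTQ", "SAM", "HDF4", "NetCDF", "SFF", "GTF", "GFF3", "VCF", "BED"}
)
ARRAY_FORMATS = frozenset({"CEL", "Tab-delimited Text"})

# Hypothetical lookup keyed by (study design type, assay type).
# Only a few rows of the table are encoded here, as an illustration.
ACCEPTED: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("Genotyping Design", "Genotyping Assay"): frozenset({"FASTA", "FASTQ"}),
    ("ChIP-seq Design", "ChIP-seq Assay"): SEQUENCING_FORMATS,
    ("Transcription Profiling by Array Design",
     "Transcription Profiling by Array Assay"): ARRAY_FORMATS,
    ("Transcription Profiling by RT-PCR Design",
     "Transcription Profiling by RT-PCR Assay"): frozenset({"Tab-delimited Text"}),
    ("Proteomic Profiling Design", "Western Blot Analysis"): frozenset({"spML"}),
}


def is_accepted(design: str, assay: str, file_format: str) -> bool:
    """Return True if file_format is listed for the given design/assay pair."""
    formats = ACCEPTED.get((design, assay))
    if formats is None:
        raise KeyError(f"design/assay pair not in this sketch: {(design, assay)!r}")
    return file_format in formats


if __name__ == "__main__":
    print(is_accepted("ChIP-seq Design", "ChIP-seq Assay", "FASTQ"))
    print(is_accepted("Genotyping Design", "Genotyping Assay", "SAM"))
```

The table itself remains the authoritative list; extend the dictionary with any further rows you need.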

 

Submission

Submission process - How do I submit data?

For instructions on how to submit data to GeneLab, please see this link.

Post-submission process - How do I modify a dataset or metadata?

Please contact GeneLab using this link. A GeneLab team member will assist you in making the corrections.

When will my data receive a GeneLab accession number?

Each dataset will receive a GeneLab accession number in this format: GLDS-xxx. Submitters may request to receive their accession number and link before the public release of their dataset.
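
If you handle accession numbers in scripts, a small validator can catch typos early. The following is a minimal sketch, assuming that "xxx" stands for one or more decimal digits (as in GLDS-249) and that dataset pages follow the URL pattern of the GLDS-249 citation shown elsewhere in this FAQ. The function names are hypothetical and not part of any GeneLab tool.

```python
import re

# Assumption: "xxx" in GLDS-xxx is one or more decimal digits (e.g. GLDS-249).
ACCESSION_RE = re.compile(r"GLDS-(\d+)")


def parse_accession(text: str) -> int:
    """Return the numeric part of a GLDS accession, or raise ValueError."""
    match = ACCESSION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a GLDS accession: {text!r}")
    return int(match.group(1))


def accession_url(text: str) -> str:
    """Build a dataset URL, assuming the pattern used in the GLDS-249 citation."""
    number = parse_accession(text)
    return f"https://genelab-data.ndc.nasa.gov/genelab/accession/GLDS-{number}/"


if __name__ == "__main__":
    print(parse_accession("GLDS-249"))
    print(accession_url("GLDS-249"))
```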

I'm a reviewer. How do I access and evaluate pre-publication data?

Currently, the GeneLab Data System cannot provide limited access to datasets. As a reviewer, you may ask the submitter to share their metadata and data files via the GeneLab collaborative workspace. We are evaluating ways to improve reviewer access in future releases of the GLDS.

How do I cite GeneLab datasets?

Acknowledgement:

A general statement crediting NASA GeneLab for data, assistance, and/or review. This statement is included in a paragraph at the end of an article, before the reference list.

Example Acknowledgement for GeneLab data:

"GeneLab data are courtesy of the NASA GeneLab Data Repository (https://genelab-data.ndc.nasa.gov/genelab/projects/)."

Reference Citation:

The preferred way to cite GeneLab datasets is to use the dataset citation provided in the “Citation” panel of the “Description” section. This citation can be downloaded in BibTeX or RIS format.

For example, the citation for GLDS-249 is:

Galazka JM, Green SJ, Lai Polo S, Saravia-Butler AM, Fogle HW, Bense NB, Boyko V, Dinh MT, Chen Y, Walton T, Kunstman KJ, Costes SV, Gebre SG, Lee MD. "Metagenomic analysis of feces from mice flown on the RR-6 mission", GeneLab, Version 11, https://genelab-data.ndc.nasa.gov/genelab/accession/GLDS-249/

GeneLab Database Citation:

When referencing the GeneLab database, please use the following citations:

Berrios DC, Galazka J, Grigorev K, Gebre S, and Costes S. NASA GeneLab: interfaces for the exploration of space omics data. Nucleic Acids Res, 2021. 49(D1): p. D1515 - D1522.

Ray S, Gebre S, Fogle H, Berrios DC, Tran PB, Galazka JM, and Costes S. GeneLab: Omics database for spaceflight experiments. Bioinformatics, 2019. 35(10): p. 1753 - 1759.

Details

What technologies is the GeneLab Data System (GLDS) built on?

The current GLDS Phase 1 system is built on a NASA-customized, web-based collaboration and knowledge-sharing software platform called the Center for Cross-discipline Collaboration (C3 for short). It is developed by NASA Ames Research Center's Intelligent Systems Division (Code TI) using the open-source Python web framework Django, the Python and JavaScript programming languages, and the MySQL relational database management system (RDBMS).

Based on the results of the realigned FY2016 GLDS software platform trade study, we are building a customizable new GeneLab software platform on top of the GenomeSpace integration platform, developed by the Broad Institute of MIT and Harvard, for GLDS Phase 2 and beyond. This choice was made for extensibility, scalability, modularity, and performance.

Why should I deposit data in GL rather than in one of the other available online repositories?

GeneLab's metadata are more meticulously curated than those of the majority of other online bioinformatics data repositories. GeneLab also focuses on data sets and studies specific to the space biology domain, whereas other repositories may concentrate on other data domains. In the coming years, GeneLab plans to host a collaborative workspace housing bioinformatics and other analysis and data display tools.

Can I get an Accession Number to include in my manuscript prior to the data being posted on GL?

Yes, you may request a GeneLab accession number to include in your manuscript before your data are public in the GeneLab Repository. When submitting your data, please request an accession number from your GeneLab curator.

Does my dataset need to be related to ISS research or does GL also host suborbital and ground results?

All data sets must be relevant to spaceflight research or to the study of gravity as a continuum. This includes ground results that simulate aspects of the spaceflight environment and suborbital experiments.

How can I programmatically (bulk) download data files from the GeneLab Data System (GLDS)?

One way is to use the 'GL-download-GLDS-data' program included in the genelab-utils package. You can find installation and usage information for that here.

Sample Processing and Sequencing Group FAQ

What services does GeneLab offer?

  • RNA/DNA extraction from mouse tissues, bacteria, and cells using column-based kits
  • RNA/DNA extraction from mouse tissues, bacteria, and cells using TRIzol
  • DNA extraction from bacterial cells using the Promega Maxwell RSC Instrument
  • RNA and DNA quantification with Qubit v.3 and v.4/NanoDrop
  • RNA and DNA quality check with 2100 Bioanalyzer/4200 TapeStation
  • Manual/automated library preparation for ribo-depleted RNA sequencing (TruSeq Stranded Total RNA, Illumina) with up to 384 unique dual indexes (UDIs) with unique molecular identifiers (UMIs)
  • Manual/automated library preparation for metagenomics samples (Nextera DNA Flex, Illumina)
  • Sequencing library quantification with Qubit/qPCR
  • Sequencing library quality assessment with 4200 TapeStation
  • iSeq sequencing
  • NovaSeq sequencing
  • Positive workflow controls: spike-in mixes as well as Universal Mouse/Human RNA samples
  • Assistance with data analysis

How do I initiate a project with GeneLab?

To initiate a project with GeneLab, fill in the inquiry form. For a quick turnaround, include as much information as possible. At a minimum, provide the following: services requested, number of samples, type of sample (tissue/cells/RNA/DNA), requested sequencing length, and desired clusters per sample.

What type of library preparation and sequencing services are available?

  • Manual/automated library preparation for ribo-depleted RNA sequencing (TruSeq Stranded Total RNA, Illumina) with up to 384 unique dual indexes (UDIs) with unique molecular identifiers (UMIs)
  • Manual/automated library preparation for metagenomics samples (Nextera DNA Flex, Illumina)
  • iSeq sequencing
  • NovaSeq sequencing using an SP, S1, S2, or S4 flow cell
  • NovaSeq Xp workflow

What sequencing standards does GeneLab recommend?

The GeneLab sequencing group's recommended sequencing depth and standards for total RNA sequencing are outlined here.

How do I acknowledge or cite the GeneLab sequencing group?

The GeneLab sequencing group acknowledgement guide can be found here.

In what format will I receive my data?

GeneLab will provide the raw data in FASTQ format with corresponding QC results. Files will be compressed (gzip, .gz).
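
Gzip-compressed FASTQ files can be read directly without decompressing them to disk first. The following is a minimal sketch using only the Python standard library. It assumes the common four-lines-per-record FASTQ layout (no line-wrapped sequences), and the function name read_fastq_gz is hypothetical rather than part of any GeneLab tool.

```python
import gzip
from typing import Iterator, Tuple


def read_fastq_gz(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a gzip-compressed FASTQ file.

    Assumes four lines per record: '@' header, sequence, '+' line, quality.
    """
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\r\n")
            if not header:
                break  # end of file (or a trailing blank line)
            sequence = handle.readline().rstrip("\r\n")
            handle.readline()  # '+' separator line, ignored
            quality = handle.readline().rstrip("\r\n")
            if not header.startswith("@"):
                raise ValueError(f"malformed FASTQ header: {header!r}")
            if len(sequence) != len(quality):
                raise ValueError(f"sequence/quality length mismatch in {header!r}")
            yield header[1:], sequence, quality


def count_reads(path: str) -> int:
    """Count the records in a gzip-compressed FASTQ file."""
    return sum(1 for _ in read_fastq_gz(path))
```

For example, count_reads("sample_R1.fastq.gz") would return the number of reads in a file of that (hypothetical) name. For production analyses, an established FASTQ parser is likely a better choice than this sketch.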

How do I download my sequencing data?

To download your sequencing data, sign into your collaborative workspace account to retrieve your files. Detailed instructions will be provided by the GeneLab sequencing group.

For instructions on how to access and download files in the workspace, please review the Collaborative Workspace section in the User Manual.