BMC Syst Biol. 2014;8 Suppl 2(Suppl 2):S9.
doi: 10.1186/1752-0509-8-S2-S9. Epub 2014 Mar 13.

STATegra EMS: an Experiment Management System for complex next-generation omics experiments

Rafael Hernández-de-Diego et al. BMC Syst Biol. 2014.

Abstract

High-throughput sequencing assays are now routinely used to study different aspects of genome organization. As decreasing costs and widespread availability of sequencing enable more laboratories to use sequencing assays in their research projects, the number of samples and replicates in these experiments can quickly grow to several dozen, which requires standardized annotation, storage and management of preprocessing steps. As part of the STATegra project, we have developed an Experiment Management System (EMS) for high-throughput omics data that supports different types of sequencing-based assays, such as RNA-seq, ChIP-seq and Methyl-seq, as well as proteomics and metabolomics data. The STATegra EMS provides metadata annotation of experimental design, samples and processing pipelines, as well as storage of different types of data files, from raw data to ready-to-use measurements. The system has been developed to provide research laboratories with a freely available, integrated system that offers a simple and effective way to annotate experiments and track analysis procedures.
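The annotation scheme described above (experiments, samples, and processing steps running from raw data to ready-to-use measurements) can be pictured as a small data model. The following is a minimal sketch only, not the STATegra EMS code: every class, field and method name is hypothetical, and the sample hierarchy follows the wording of the Figure 2 caption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnalyticalSample:
    name: str


@dataclass
class BiologicalReplicate:
    name: str
    analytical_samples: List[AnalyticalSample] = field(default_factory=list)


@dataclass
class BiologicalCondition:
    name: str
    replicates: List[BiologicalReplicate] = field(default_factory=list)


@dataclass
class AnalysisStep:
    name: str
    stage: str  # assumed labels: "raw", "intermediate" or "processed"
    inputs: List[str] = field(default_factory=list)  # names of upstream steps


@dataclass
class Experiment:
    title: str
    conditions: List[BiologicalCondition] = field(default_factory=list)
    analyses: List[AnalysisStep] = field(default_factory=list)

    def analytical_samples(self) -> List[AnalyticalSample]:
        """Flatten condition -> replicate -> analytical sample."""
        return [
            sample
            for condition in self.conditions
            for replicate in condition.replicates
            for sample in replicate.analytical_samples
        ]

    def lineage(self, step_name: str) -> List[str]:
        """Return the named step preceded by all of its upstream steps."""
        by_name: Dict[str, AnalysisStep] = {s.name: s for s in self.analyses}
        ordered: List[str] = []

        def visit(name: str) -> None:
            for parent in by_name[name].inputs:
                visit(parent)
            if name not in ordered:
                ordered.append(name)

        visit(step_name)
        return ordered


# Hypothetical usage: one condition, one replicate, a three-step pipeline.
experiment = Experiment(title="demo experiment")
replicate = BiologicalReplicate("rep1", [AnalyticalSample("rep1_rna")])
experiment.conditions.append(BiologicalCondition("untreated", [replicate]))
experiment.analyses += [
    AnalysisStep("reads.fastq", "raw"),
    AnalysisStep("mapped.bam", "intermediate", inputs=["reads.fastq"]),
    AnalysisStep("counts.tsv", "processed", inputs=["mapped.bam"]),
]
```

Storing each step's inputs by name keeps the provenance of a processed file recoverable: `experiment.lineage("counts.tsv")` walks back through the intermediate file to the raw data.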

Figures

Figure 1. Overview of the STATegra EMS architecture.
Figure 2. Metadata Module structure in STATegra EMS. The Sample module stores information on biological conditions, biological replicates and the associated analytical samples. The Analysis module contains all analysis steps from raw to processed data. Both samples and analyses are associated with one or more experiments within the Experiment module.
Figure 3. STATegra EMS analysis workflow components. The workflow is linked to an analytical sample object and consists of raw, intermediate and processed data IUs.
Figure 4. Example of primary and secondary workflows for a DNase-seq analysis. The primary workflow (a) involves calling DNase hypersensitivity regions (DHR) by applying a peak-calling algorithm to a BAM file of mapped reads, whereas the secondary workflow (b) involves merging DHR.bed files from different samples to obtain a set of consolidated regions and then counting the reads of each sample in the consolidated region set to generate a per-sample signal value file.
Figure 5. Analysis module input window.
Figure 6. Sample scheme for cell line K562 ENCODE use case data. See main text for description.
Figure 7. Annotation details in the Experiment module.
Figure 8. Sample form. The sample form provides fields to annotate biological condition details, including data on the associated biological replicates and analytical samples.
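
The secondary workflow in the Figure 4 caption (merge per-sample region files into a consolidated set, then count each sample's reads per consolidated region) can be illustrated with a small pure-Python sketch. It is an assumption-laden illustration, not the tooling used by the authors: the function names are hypothetical, intervals are half-open (chromosome, start, end) tuples as in BED, touching intervals are merged by choice, and the read counting is a naive scan suitable only for tiny inputs.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open coordinates as in BED files
Interval = Tuple[str, int, int]


def merge_regions(region_sets: List[List[Interval]]) -> List[Interval]:
    """Merge overlapping or touching regions from all samples into one set."""
    all_regions = sorted(r for regions in region_sets for r in regions)
    merged: List[Interval] = []
    for chrom, start, end in all_regions:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous region instead of opening a new one.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def count_reads(regions: List[Interval], reads: List[Interval]) -> List[int]:
    """Count the reads overlapping each region (naive O(regions * reads))."""
    counts = []
    for chrom, start, end in regions:
        overlapping = sum(
            1
            for read_chrom, read_start, read_end in reads
            if read_chrom == chrom and read_start < end and read_end > start
        )
        counts.append(overlapping)
    return counts


# Hypothetical per-sample regions and mapped reads for one sample.
sample_a_regions = [("chr1", 100, 200), ("chr1", 500, 600)]
sample_b_regions = [("chr1", 150, 250), ("chr2", 10, 50)]
consolidated = merge_regions([sample_a_regions, sample_b_regions])

sample_a_reads = [
    ("chr1", 120, 156),
    ("chr1", 240, 276),
    ("chr1", 590, 626),
    ("chr2", 60, 96),
]
signal_a = count_reads(consolidated, sample_a_reads)
```

Counting every sample against the same consolidated region set is what makes the resulting per-sample signal values comparable row by row.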
