RNA-seq: impact of RNA degradation on transcript quantification.

Author(s): Gallego Romero I, Pai AA, Tung J, Gilad Y

Publication: BMC Biol, 2014, Vol. 12, Page 42

PubMed ID: 24885439

Review Paper? No

Purpose of Paper

The purpose of this paper was to determine how decreased RNA integrity number (RIN), caused by room temperature storage of peripheral blood mononuclear cells (PBMC), affects whole transcriptome sequencing (WTSS) data. The effects of transcript characteristics and data handling were also investigated.

Conclusion of Paper

As the duration of room temperature storage of PBMC increased, the RIN of the resultant RNA declined. RIN was positively associated with the number of uniquely mapped reads and the number of reads mapped to genes, and negatively associated with the proportion of reads attributable to spiked-in control material. Specimens with high RIN (mean RIN ≥7.9) clustered by individual, whereas those with a RIN <7.9 clustered with specimens with a similar level of degradation. Although almost all transcripts were detected at all timepoints, degradation rates increased with increasing %GC content and with increasing length of either the 3' untranslated region (UTR) or the coding DNA sequence (CDS). Including RIN as a covariate in the generalized linear model, or regressing the data to account for RIN, allowed the identification of more genes that were differentially expressed between individuals.
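The idea of regressing expression data to account for RIN can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of each gene's expression on RIN and keeps the residuals; the exact procedure used in the paper is not described in this summary. The function name `regress_out_rin` and all input values are hypothetical.

```python
from statistics import mean


def regress_out_rin(expression, rin):
    """Return per-gene residuals after a simple linear fit on RIN.

    expression: list of genes, each a list of per-sample values
                (for example log-scale expression); hypothetical input.
    rin:        list of per-sample RIN scores, same sample order.
    """
    rin_mean = mean(rin)
    sxx = sum((r - rin_mean) ** 2 for r in rin)
    if sxx == 0:
        raise ValueError("RIN values must not all be identical")

    residuals = []
    for gene_values in expression:
        y_mean = mean(gene_values)
        sxy = sum((r - rin_mean) * (y - y_mean)
                  for r, y in zip(rin, gene_values))
        slope = sxy / sxx                      # change in expression per RIN unit
        intercept = y_mean - slope * rin_mean
        # Residual = observed value minus the value predicted from RIN alone.
        residuals.append([y - (intercept + slope * r)
                          for r, y in zip(rin, gene_values)])
    return residuals


# Made-up example: four samples with decreasing RIN, two genes.
rin_scores = [9.3, 7.9, 6.0, 3.8]
genes = [
    [2.0 * r + 1.0 for r in rin_scores],  # expression fully explained by RIN
    [5.0, 5.0, 5.0, 5.0],                 # expression unrelated to RIN
]
corrected = regress_out_rin(genes, rin_scores)
```

In this toy input the first gene tracks RIN exactly, so its residuals are essentially zero. With real data, some residual degradation signal would be expected to remain.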

Studies

  1. Study Purpose

    The purpose of this study was to determine the effects of room temperature storage of PBMC on RIN and the effects of using specimens with a low RIN on whole transcriptome sequencing data. The effects of GC content, transcript length, and UTR length were also investigated, as were data-handling choices such as the mapping algorithm used, exclusion of regions >1000 nt from the 3' UTR, and accounting for RIN in the statistical analysis. Aliquots of PBMC from 4 patients were stored at room temperature, lysed at the appropriate timepoint, and stored frozen until extraction using the RNeasy kit. Reads of 50 bp were obtained using the Illumina HiSeq2000.

    Summary of Findings:

    RINs declined from an average of 9.3 in specimens extracted immediately to 3.8 in specimens stored at room temperature for 84 h before RNA extraction. RIN was positively associated with the number of uniquely mapped reads (p<0.01) and the number of reads mapped to genes (p<0.00), and negatively associated with the proportion of reads attributable to spiked-in control material. Importantly, 28.9% of the variance in gene expression of principal component 1 was associated with RIN score (p<0.000001). Further, specimens with high RIN (mean RIN ≥7.9) clustered by individual, but those with a RIN <7.9 were more correlated with specimens with a similar level of degradation than with intact specimens from the same individual. This effect was observed regardless of distance from the 3' UTR and of the mapping algorithm used. As RIN decreased, the mean reads per kilobase of transcript per million mapped reads (RPKM) increased (p<0.0001), but the median RPKM decreased, reflecting non-uniform degradation resulting in a less complex library. Although almost all transcripts were detected at all timepoints, degradation rates were transcript dependent. The rate of degradation was correlated with CDS length (ρ= -0.068, p<10^-12), %GC content (ρ= -0.039, p<0.001), and 3' UTR length (ρ= -0.136, p<10^-15), with faster degradation occurring with higher %GC content and increased length of either the 3' UTR or the CDS. Degradation of pseudogenes tended to be slower than that of protein-coding genes (p<10^-16). When RIN was included as a covariate in the generalized linear model, the number of genes differentially expressed across timepoints decreased dramatically, and the number of genes found to be differentially expressed between individuals increased. Regression of the data for RIN was better at eliminating the effects of degradation than including RIN as a covariate, but neither method, alone or together, was able to completely eliminate the effects.
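    The opposite movement of mean and median RPKM can be illustrated with a toy calculation. The sketch below uses the standard RPKM definition (reads × 10^9 / (transcript length in nt × total mapped reads)). The read counts and transcript lengths are made up for illustration and are not taken from the paper. It shows only the arithmetic: when reads concentrate in a few transcripts, the mean can rise while the median falls.

```python
from statistics import mean, median


def rpkm(read_counts, lengths_nt):
    """Reads per kilobase of transcript per million mapped reads.

    read_counts: reads mapped to each transcript.
    lengths_nt:  transcript lengths in nucleotides, same order.
    The total mapped reads is taken as the sum of read_counts,
    a simplifying assumption for this toy example.
    """
    total = sum(read_counts)
    return [count * 1e9 / (length * total)
            for count, length in zip(read_counts, lengths_nt)]


# Hypothetical library of four transcripts (one short, three long).
lengths = [500, 2000, 2000, 2000]

# "Intact" case: reads spread in proportion to transcript length.
intact = rpkm([50, 200, 200, 200], lengths)

# "Degraded" case: same total reads, but concentrated in one transcript.
degraded = rpkm([350, 100, 100, 100], lengths)

print("mean RPKM:  ", mean(intact), "->", mean(degraded))
print("median RPKM:", median(intact), "->", median(degraded))
```

    In this made-up input, a single transcript with a very high RPKM pulls the mean up, while most transcripts lose reads and the median drops.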

    Biospecimens
    Preservative Types
    • None (Fresh)
    Diagnoses:
    • Not specified
    Platform:
    Analyte    Technology Platform
    RNA        Next generation sequencing
    RNA        Bioanalyzer
    Pre-analytical Factors:
    Classification               Pre-analytical Factor       Value(s)
    Storage                      Time at room temperature    0 h
                                                             12 h
                                                             24 h
                                                             36 h
                                                             48 h
                                                             60 h
                                                             72 h
                                                             84 h
    Next generation sequencing   Specific Data handling      All reads considered
                                                             Only reads within 1000 nt of 3' UTR considered
                                                             Mapped with BWA 0.63
                                                             Mapped with TopHat 2.08
                                                             RIN included as covariate
                                                             Regressed for RIN
