National Genomics Infrastructure Stockholm - Technical Note
1474-2_LowInputRNA-SeqTechnote.pdf, NGI Technical Note Doc #1474:2, 2014-12-16

Low Input RNA-Seq

Author: Phil Ewels [email protected]

Introduction

For a standard RNA-Seq transcriptome library, SciLifeLab NGI Stockholm currently requires at least 2.0 µg RNA. To obtain this, users will typically need somewhere in the region of 2-5 x 10^7 primary cells (ref). In many experimental setups, it may not be possible to generate this much source material.

NGI Stockholm will accept RNA samples with less than 2.0 µg RNA, though we do not provide the same results guarantee that typically comes with transcriptome analysis. The aim of this study is to quantify the effect of using low concentration RNA inputs in the RNA-Seq pipeline.

Experimental setup

Two biological samples were used. AM7852 is total RNA prepared from HeLa-S3, bought directly from Life Technologies (see product guide). GM12878 is RNA prepared in-house from the GM12878 cell line, ordered from the Coriell Institute.

Nine libraries were prepared with varying input concentrations using the Illumina TruSeq RNA HT kit with polyA-selection, multiplexed in 1 lane and sequenced on an Illumina HiSeq with High Output, PE 2x100bp. Data was processed using the tuxedo suite (tophat, cufflinks, cuffdiff and cummeRbund), using the Human genome assembly GRCh37. Samples clustered as expected (Fig 1).

Fig 1. Dendrogram showing sample clustering. Libraries can be seen to cluster by sample type, with close correlation between different input concentrations.

Library complexity

To inspect how different input concentrations of RNA affect library complexity (the number of unique molecules in a sample), we used preseq to extrapolate unique versus total molecules (Fig 2). Decreasing the input concentration does result in lower diversity, as expected. However, the difference between the two samples has a larger effect. This is likely due to the difference in RIN values of the input RNA.

Fig 2. Preseq complexity curves. The number of reads sequenced for each library is plotted as a point. The low input GM12878 libraries are sequenced nearly to saturation, whereas further sequencing of the AM7852 libraries would reveal more unique reads.

Number of observed genes

For a greater understanding of the biological impact of this difference in library complexity, we calculated the number of observed genes at different sub-sampling points within each library. This gives curves with a similar profile to the preseq plot, yet with a more tangible meaning (Fig 3). Here, the difference between the two samples is more pronounced and input concentration appears to have little effect.

Fig 3. Cufflinks gene observations at increasing sub-sampling levels. Number of genes and the slope of the curve are similar across input concentrations. Biological variation and sample quality have a greater impact than input concentration.

Correlation between replicates

To check that replicates of the same sample yield similar counts for each transcript, we plotted a matrix of FPKM scatter plots with histograms (Fig 4). Replicates show a high degree of correlation, indicating excellent reproducibility. However, the final 50 ng input sample has a drop in the left peak of the histogram, showing a loss of information about lowly expressed genes. There is also greater variation in the scatter plots involving this sample. This suggests that for transcript level analysis this sample may be less reliable than the others.

Fig 4. Scatter plots and histograms of FPKM values for the AM7852 samples (left:right, top:bottom - 1 µg, 1 µg, 1 µg, 500 ng, 200 ng, 50 ng).

Conclusion

In summary, we conclude that sequencing of RNA samples can give reliable data down to an input concentration of 200 ng. Our results again show the importance of high quality RNA extractions, with the RIN score of the input RNA having a far larger impact than the input concentration.
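
The sub-sampling analysis described under "Number of observed genes" can be illustrated with a minimal sketch. This is not the pipeline used for Fig 3, which counted Cufflinks gene observations; it substitutes a simpler stand-in that counts genes with at least one assigned read at each sub-sampling level. The function name observed_genes_curve, the min_reads parameter and the toy read list are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def observed_genes_curve(read_gene_ids, fractions, min_reads=1, seed=0):
    """Return (reads sampled, genes observed) for each sub-sampling fraction.

    read_gene_ids: one gene identifier per aligned read (assumed input format).
    fractions: increasing fractions of the library to sample, e.g. 0.1 to 1.0.
    min_reads: reads required before a gene counts as observed (assumption).
    """
    rng = random.Random(seed)
    shuffled = list(read_gene_ids)
    # Shuffle once; each prefix of the shuffled list is a random sub-sample,
    # and larger sub-samples contain the smaller ones.
    rng.shuffle(shuffled)

    curve = []
    for frac in fractions:
        n_reads = int(len(shuffled) * frac)
        counts = Counter(shuffled[:n_reads])
        n_genes = sum(1 for c in counts.values() if c >= min_reads)
        curve.append((n_reads, n_genes))
    return curve


# Toy library (hypothetical data): one highly expressed gene, then rarer ones.
toy_reads = ["geneA"] * 500 + ["geneB"] * 50 + ["geneC"] * 5 + ["geneD"]
fractions = [0.1, 0.25, 0.5, 0.75, 1.0]
curve = observed_genes_curve(toy_reads, fractions)

for n_reads, n_genes in curve:
    print(f"{n_reads:>4} reads -> {n_genes} genes observed")
```

In a curve of this kind, lowly expressed genes tend to appear only at deeper sub-sampling levels, so a library whose curve flattens early has little left to reveal with further sequencing. That is the same reading applied to the preseq curves in Fig 2.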