Fig. 5. | BMC Biology

Fig. 5.

From: Enhanced transcriptome-wide RNA G-quadruplex sequencing for low RNA input samples with rG4-seq 2.0

The influence of rG4-seq protocols on library efficiency, nucleotide bias and RTS site identification outcomes. A Overall bioinformatic workflow for identifying RT-stalled positions, RTS sites and rG4s from rG4-seq datasets using rG4-seeker. Starting from deduplicated read pairs, the template-strand reads (in the same direction as the RNA) were first selected. The 5′ nucleotide of each template-strand read was the position where reverse transcription stopped. Next, the K+ and Li+ libraries were compared using statistical models to identify RT-stalled positions, which were defined as the transcript positions with significantly more RT stops in K+ than in Li+. Each RT-stalled position can be supported by a varying number of template-strand reads in K+ libraries. The read number depended on both the abundance of the parent transcript and the strength of the RTS effect. Since each rG4 can induce an RTS effect at more than one position (typically at position 0 and/or position +1), adjacent RT-stalled positions were merged into one RTS site spanning multiple positions. Meanwhile, each singleton RT-stalled position was considered one RTS site spanning one position. RTS sites were then assigned rG4 categories based on their adjacent sequence. B Plot of the number of raw read pairs and deduplicated read pairs as a function of sequencing depth for the rG4-seq 2.0 and 1.0 libraries. The variation in sequencing depth was simulated by random subsampling of the sequencing datasets. The lines and their shaded areas in the plot indicate the means and standard deviations of the read pair numbers in the respective rG4-seq libraries. Libraries with raw sequencing depth <100 million read pairs were not up-sampled and became data dropouts. Fig. S6 in Additional file 1 shows the breakdown plots for each library. C Distribution of the 5′ nucleotides of template-strand, deduplicated reads in rG4-seq libraries. The 5′ nucleotide was the position where reverse transcription stopped.
D Distribution of the 5′ nucleotides of template-strand, deduplicated, RTS-supporting reads in rG4-seq libraries (K+ conditions only). E, F Nucleotide distribution of the RT-stalled positions detected in rG4-seq libraries, comparing (E) guanine and non-guanine positions and (F) nucleotides within non-guanine positions. G Nucleotide distribution of the 3′ flanking nucleotide (corresponding to position +1 as described in Fig. 5A) of the detected rG4 motifs in rG4-seq libraries. While an rG4 motif always has a G nucleotide at its 3′ end (position 0), its 3′ flanking nucleotide (position +1) can be any of A/T/C/G. Nucleotide bias in RT-stalled positions can favor the detection of rG4s with a specific 3′ flanking nucleotide. Asterisks represent significant differences in A-bias between two libraries in the respective plots (Alexander-Govern test; ***p<0.001; **p<0.01; *p<0.05). Raw data values are provided in Additional file 8.
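
The merging step described in panel A (adjacent RT-stalled positions become one RTS site, and a singleton becomes a one-position site) can be illustrated with a minimal Python sketch. This is not rG4-seeker's code: the function name `merge_rt_stalled_positions`, the `(chrom, strand, pos)` tuple layout, and the rule that "adjacent" means coordinates differing by exactly 1 on the same chromosome and strand are all assumptions made for illustration.

```python
def merge_rt_stalled_positions(positions):
    """Merge adjacent RT-stalled positions into RTS sites.

    `positions` is an iterable of (chrom, strand, pos) tuples (hypothetical
    layout). Returns a list of (chrom, strand, start, end) tuples with
    inclusive coordinates. Adjacency is assumed to mean a coordinate
    difference of exactly 1 on the same chromosome and strand.
    """
    sites = []
    # Sorting groups positions by chromosome and strand, then by coordinate;
    # set() drops duplicate entries.
    for chrom, strand, pos in sorted(set(positions)):
        last = sites[-1] if sites else None
        if last and last[0] == chrom and last[1] == strand and pos - last[3] == 1:
            last[3] = pos  # extend the current RTS site by one position
        else:
            sites.append([chrom, strand, pos, pos])  # start a new RTS site
    return [tuple(site) for site in sites]


example = [
    ("chr1", "+", 101),
    ("chr1", "+", 100),
    ("chr1", "+", 105),
    ("chr1", "-", 102),
]
rts_sites = merge_rt_stalled_positions(example)
```

In this example, positions 100 and 101 on the plus strand form one RTS site spanning two positions, while position 105 and the minus-strand position 102 each remain a singleton site.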
