Mice, plasmids, and antibodies
C57BL/6 and 129/SvCP mice were obtained from the Model Animal Research Center of Nanjing University. C57BL/6 ×129S-Gt (Rosa) 26Sor/J mice were obtained from the Jackson Laboratory. All mice were housed with 12/12-h light/dark cycles, at 22 °C, and allowed free access to water and food.
The pLKO.1 shRNA vector and lentivirus packaging plasmids (pmd-REV and pmd-1G/pmd-LG) were provided by Dr. Xin Wu (State Key Laboratory of Reproductive Medicine, Nanjing Medical University, China). Two pLKO.1 shRNA constructs (shMov10-832 and shMov10-833) were designed with the shRNA sequences shown in Additional file 6: Table S7. To construct the pCDH-MOV10 plasmid, the full-length Mov10 CDS cloned from mouse testis was inserted into the pCDH-EF1-MCS-T2A-puro plasmid using EcoRI/BamHI restriction sites. Plasmids for the expression of FLAG-MOV10 (ENSMUSG00000002227, 1004 aa), HA-SRSF1 (ENSMUSG00000018379, 248 aa), HA-DDX5 (ENSMUSG00000020719, 615 aa), and HA-DDX17 (ENSMUSG00000055065, 652 aa) were constructed by cloning the full-length Mov10, Srsf1, Ddx5, and Ddx17 CDS into the pRK5 vector with either a FLAG or an HA tag at the N terminus.
Primary antibodies used in this study were rabbit anti-MOV10 (10370-1-AP, Proteintech, RRID:AB_2297897), rabbit anti-MOV10L1 (UP2175, [35]), rabbit anti-ETV5 (13011-1-AP, Proteintech, RRID:AB_2278092), rabbit anti-BCL6b (DF9075, Affinity), goat anti-PLZF/ZBTB16 (AF2944, R&D, RRID:AB_2218943), rabbit anti-LIN28 (ab46020, Abcam, RRID:AB_776033), mouse anti-β-Actin (A5316, Sigma, RRID:AB_476743), rabbit anti-β-Tubulin (ab6046, Abcam, RRID:AB_2210370), mouse anti-GAPDH (MB001, Bioworld Tech), goat anti-RPL22 (NBP1-06069, Novus Biologicals, RRID:AB_2181599), mouse anti-Histone H3 (05-499, Millipore, RRID:AB_309763), rabbit anti-ELAVL1 (ab200342, Abcam, RRID:AB_2784506), rabbit anti-ILF3 (ab92355, Abcam, RRID:AB_2049804), rabbit anti-DHX9 (ab26271, Abcam, RRID:AB_777725), rabbit anti-DDX17 (ab70184, Abcam, RRID:AB_1209629), mouse anti-SRSF1 (32-4500, Thermo, RRID:AB_2533079), rabbit anti-SNRPA1 (ab128937, Abcam, RRID:AB_11139816), rabbit anti-SNRPD2 (ab198296, Abcam), rabbit anti-HNRNPC (ab133607, Abcam), rabbit anti-DDX5 (ab21696, Abcam, RRID:AB_446484), mouse anti-SYNCRIP (ab184946, Abcam), rabbit anti-MVH (ab13840, Abcam, RRID:AB_443012), rabbit anti-MILI (ab36764, Abcam, RRID:AB_777284), rabbit anti-DROSHA (55001-1-AP, Proteintech, RRID:AB_10859254), rabbit anti-DGCR8 (10996-1-AP, Proteintech, RRID:AB_2090987), rabbit anti-DICER (A6021, ABclonal, RRID:AB_2766716), and normal rabbit IgG.
MOV10 antibody validation
In alignment with ENCODE guidelines (version 2.0, 9 January 2012), the rabbit anti-MOV10 (10370-1-AP, Proteintech) which was used for all experiments in this study was validated as follows. Outcomes of western blot analyses were consistent with results obtained with a second commercially available rabbit anti-MOV10 antibody (ab80613, Abcam); both antibodies identified the same single band in MOV10 IP complexes from testis produced using 10370-1-AP (Additional file 1: Figure S1), and in lysates from FLAG-MOV10 overexpressing HEK293T cells (Additional file 1: Figure S1); both 10370-1-AP (Proteintech) and anti-FLAG antibodies identified the same band in FLAG IP complexes from HEK293T cells (Additional file 1: Figure S1). Crosslinking IP (CLIP)-WB (Fig. 5b) and IP-MS (Fig. 7a) using 10370-1-AP (Proteintech) identified a single strong band of the predicted size, and among cytoplasmic proteins, MOV10 had the largest number of unique peptides identified and highest iBAQ value, confirming high affinity and specificity of the antibody used for IP (Additional file 5: Table S6). No cross reactivity with similar proteins was observed when anti-MOV10L1 was used for MOV10 IP complex (Fig. 7d). MOV10 immunostaining of cryosections from E16.5, P1, P10 testes, and SPCs and purified Spg from P6–8 testes produced a signal (Fig. 1e–g and Additional file 1: Figure S1) consistent with nuclear IP and WB results (Fig. 7a).
Purification of spermatogenic cells
We isolated three representative types of germ cells, spermatogonia (Spg), pachytene spermatocytes (PS), and round spermatids (RS), using a small-scale version of the STA-PUT method of sedimentation velocity at unit gravity [95]. Seminiferous tubules were isolated from decapsulated testes from 2-month-old adult (for PS and RS) or P6-P8 mice (for Spg) by incubation in DMEM with collagenase (1 mg/ml) at 37 °C for 15 min with shaking. Tubules were digested in trypsin (0.25%) and DNase (1 mg/ml) in DMEM at 37 °C for 15 min and then filtered to obtain single cell suspensions. Germ cell populations were separated on a BSA gradient using the STA-PUT Velocity Sedimentation Cell Separator (ProScience Inc. Canada). Sequential fractions were collected, and cell types were determined based on morphology. The purity of the isolated stage-specific germ cell populations was determined from morphologic characteristics and cell diameter. The purity of isolated Spg, PS, and RS was approximately 80%, 80%, and 90%, respectively. Cell fractions of uniform populations were pooled, pelleted, and stored for subsequent analyses.
Immunofluorescence microscopy
To prepare frozen sections, testes were fixed in 4% paraformaldehyde (PFA) at 4 °C overnight, embedded and sectioned (6 μm). The sections were treated with 6 μM DTT and 10% serum in TBST (1 × TBS containing 0.1% Tween 20) for 1 h at room temperature (RT), then incubated overnight at 4 °C with primary antibodies diluted in 10% serum in TBST. After incubation with secondary antibodies, the sections were then washed in TBS three times and stained with DAPI (Vector Laboratories). For whole-mount assay, seminiferous tubules from adult testis were prepared as previously described with modifications [54]. Testis tubules were digested with trypsin and collagenase, washed with PBS, and fixed with 4% PFA for 2 h. After blocking with 10% serum in TBST (1 × TBS containing 0.1% Tween 20) for 1 h at RT, the samples were incubated overnight at 4 °C with anti-MOV10 antibody (1:25) and anti-PLZF antibodies (1:200), washed three times in TBS, and incubated with Texas red or FITC-conjugated secondary antibodies (Jackson Immuno Research) for 1 h at RT. Samples were washed as before and mounted in microslide shield with DAPI. Immunofluorescence for all samples was examined under laser scanning confocal microscope (Carl Zeiss, LSM700).
SPC cell culture, transduction, and transplantation
Establishment of long-term SPC cultures has been described previously [20]. For cell culture, we isolated Thy1-positive cells from 6- to 8-day-old B6;129S-Gt (Rosa) 26Sor/J mice and cultured them on 12-well plates with mitotically inactivated STO (SIM mouse embryo-derived thioguanine and ouabain-resistant feeder, SNLP76/7-4, ATCC) feeder layers in a defined serum-free medium consisting of minimal essential medium (MEMα, Life Technology) supplemented with 2% bovine serum albumin (BSA, Sigma-Aldrich, St. Louis, MO, USA), 20 ng/ml GDNF (R&D Systems, Minneapolis, MN, USA), 150 ng/ml GFRA1 (R&D Systems), 1 ng/ml basic fibroblast growth factor (FGF2; BD Biosciences), 10 μg/ml transferrin (Sigma-Aldrich), 50 μM free fatty acid mixture (5.6 mM linolenic acid, 13.4 mM oleic acid, 2.8 mM palmitoleic acid, 35.6 mM linoleic acid, 31.0 mM palmitic acid, 76.9 mM stearic acid; all from Sigma-Aldrich), 30 nM Na2SeO3 (Sigma-Aldrich), 2 mM L-glutamine (Life Technology), 50 μM 2-mercaptoethanol (Sigma-Aldrich), 5 μg/ml insulin (Sigma-Aldrich), 10 mM HEPES (Sigma-Aldrich), and 60 μM putrescine (Sigma-Aldrich). Medium was replaced every 2–3 days.
Lentivirus particles were generated following Addgene protocols (http://www.addgene.org/tools/protocols/plko/). The pLKO.1 shRNA plasmids (shVector or shMov10) and lentivirus packaging plasmids (pmd-REV and pmd-1G/pmd-LG) were co-transfected into HEK293T cells, and supernatant containing lentivirus particles was harvested 48 h after transfection. For Mov10 knockdown, 2.5 × 10⁵ SPCs were plated onto 12-well plates pre-coated with 0.1% gelatin (Sigma-Aldrich) and, 12 h later, transduced overnight with an equal mixture of culture medium and lentiviral supernatant supplemented with 5 μg/ml polybrene. Cells were washed, replated on STO feeder layers, cultured for 72 h, and harvested for RNA and protein isolation.
For transplantation experiments, cell suspensions (transduced with shVector or shMov10) were harvested at 72 h post transduction, and live cells were enriched by 30% Percoll density gradient centrifugation to remove dead cells and debris. One hundred thousand SPCs (GT-Rosa 26Sor/J) were transplanted into each testis of 129/SvCP ×C57BL/6 F1 male mice, in which endogenous spermatogenesis had been depleted by treatment with busulfan (55 mg/kg; Sigma-Aldrich) at 8 weeks of age, 5 weeks prior to transplantation. Two months after transplantation, testes were harvested and donor-cell derived colonies were visualized with X-gal staining. All animals used in this study were housed under 12-h light/12-h dark cycles in a specific pathogen-free barrier facility. All experiments and procedures were approved by the Institutional Animal Care and Use Committee of Nanjing Medical University (ID: IACUC-1601287).
Vector construction, lentivirus packaging, and testis transduction
Two pairs of cDNA oligonucleotides targeting the mouse Mov10 mRNA were designed. Potential off-target effects of both Mov10 shRNAs were assessed with the Sylamer program; the analysis indicated that the seed sequences of both shRNAs were specifically enriched on the Mov10 gene only and not on other genes. The oligonucleotides were then synthesized (see Additional file 6: Table S7) and inserted into pSilencer-H1-LV [57, 58], which carries a CMV-driven EGFP reporter downstream of the H1-driven shRNA. shRNA and Flag-Mov10 overexpression vectors were first co-transfected into 293T cells to test the silencing efficacy of each shRNA before lentiviral packaging. For efficient shRNAs, pSilencer-shRNA lentiviral vectors and packaging plasmids were co-transfected into 293T cells using the calcium phosphate method to produce recombinant lentivirus. Forty-eight hours after transfection, the viral supernatant was harvested and filtered through 0.45-μm cellulose acetate filters. The supernatant was then centrifuged at 120,000×g for 90 min at 4 °C, and the viral pellet was resuspended in an appropriate volume of PBS to prepare high-titer lentivirus (> 10⁸ transduction units/ml). Adult mice were anesthetized with tribromoethanol, and one testis was exteriorized from the abdominal cavity. A mixture of 10 μL of fresh high-titer lentivirus and 1 μL trypan blue was injected into the seminiferous tubules using a microinjection apparatus (FemtoJet 4i, Eppendorf) under a stereoscopic microscope. The testis was returned to the abdominal cavity, and the abdominal wall and skin were closed with sutures. Each injected mouse was marked and kept warm until it woke up. Testes were harvested after 1–2 weeks of recovery and immediately fixed in 4% paraformaldehyde. After the testes were embedded in O.C.T. compound, 5-μm cryosections were cut and used for TUNEL assay.
Flow cytometric studies
Apoptosis was examined using the FITC Annexin V Apoptosis Detection Kit I (BD Bioscience). SPCs were harvested 96 h after transfection, washed twice with cold PBS buffer, and resuspended in 1× binding buffer at a final concentration of ~ 1 × 10⁶ cells/ml. Five microliters of FITC Annexin V and 5 μl of PI solution were added to 100 μl of cell suspension, which was then incubated in the dark for 15 min at RT, followed by addition of 400 μl 1× binding buffer within 1 h. For cell cycle analysis, we used the PI/RNase Staining Buffer (BD Bioscience) followed by fluorescence activated cell sorting (FACS). For each assay, two independent experiments with triplicate samples were performed.
Preparation of CLIP-seq libraries
For each CLIP replicate, 15 pairs of P10 testes were detunicated, UV-crosslinked, flash-frozen in liquid nitrogen, and then stored as pellet at − 80 °C. When processed for CLIP, pellets were lysed with PMPG buffer, treated with DNase, and then centrifuged. The supernatant from the treated lysates was precleared by rabbit IgG and then immunoprecipitated with ~ 5 μg anti-MOV10 antibody using protein A Dynabeads. Meanwhile, 3′-RNA linkers (RL3) were labeled with 32P and ligated to CIP (calf intestinal phosphatase)-treated RNA on beads. After stringent wash steps, crosslinked MOV10 RNPs were eluted from beads with Novex reducing loading buffer, separated by electrophoresis in NuPAGE precast gels (4–12% gradient) with MOPS buffer, and transferred onto nitrocellulose (Invitrogen LC2001). Membranes were exposed to film overnight, and fragments containing the main radioactive signal were excised. Library construction, including RNA extraction, 5′ linker ligation, RT-PCR, second PCR, electrophoretic separation, and extraction were performed as described previously [38, 96]. For deep sequencing, we prepared a multiplexed library consisting of three independent MOV10 HITS-CLIP libraries with identifying 3′ barcodes (RL5i1(AUCACG), RL5i3(UUAGGC), and RL5i7(CAGAUC)).
Isolation of nuclear and cytoplasmic fractions
Subcellular extracts were prepared as described [97], with minor modifications (see related data in supplemental material). One hundred-milligram P10 testis tissue was homogenized in 1 ml Cytoplasmic Extraction Buffer (250 mM sucrose, 10 mM Tris-HCl (pH 8.0), 10 mM MgCl2, 1 mM EGTA, 1× protease inhibitor cocktail III) with 100 strokes. Nuclei were pelleted by centrifugation at 300×g for 5 min, and the supernatant was collected as cytoplasmic fraction. The nuclear pellet was washed three times in Cytoplasmic Extraction Buffer, resuspended in Nuclear Extraction Buffer (250 mM sucrose, 10 mM Tris-HCl (pH 8.0), 10 mM MgCl2, 1 mM EGTA, 0.1% Triton X-100, 0.25% NP-40, and 1× protease inhibitor cocktail III) with 40 strokes, and centrifuged at 100×g for 30 s. The supernatant was collected as nuclear fraction. Fractionation efficiency was validated by western blot using antibodies specific for cytoplasmic (GAPDH) and nuclear (histone H3) proteins.
Immunoprecipitation and mass spectrometry (IP-MS)
Nuclear and cytoplasmic fractions were separated by centrifugation at 300×g for 5 min as described above. For nuclear MOV10 IP, the nuclear pellet was washed three times with Cytoplasmic Extraction Buffer and lysed separately in 3 ml of each of the following IP buffers, with or without 10 μg/ml RNase A/T1 (EN0551, Thermo Fisher): RIPA buffer (50 mM Tris, 150 mM NaCl, 1% NP-40, 1 mM DTT, 0.5% sodium deoxycholate, 0.05% SDS, 1 mM EDTA, and protease inhibitor cocktail), CHAPS buffer (40 mM Hepes, 120 mM NaCl, 1 mM EDTA, 0.3% CHAPS, 50 mM NaF, 10 mM β-glycerophosphate, 10 mM sodium pyrophosphate, 0.5 mM sodium orthovanadate and protease inhibitor cocktail), and gentle lysis buffer (50 mM Tris, 150 mM NaCl, 1 mM EDTA, 0.5% Triton X-100, 10% Glycerol, 50 mM NaF, 5 mM β-glycerophosphate, and protease inhibitor cocktail). For cytoplasmic IP, RIPA buffer was used with no RNase treatment. Nuclear or cytoplasmic lysates were centrifuged at 12,000×g for 10 min, and the supernatant was collected for IP and precleared with Protein A Agarose Beads (Millipore) for 30 min at 4 °C with rotation, followed by centrifugation and supernatant collection. Rabbit polyclonal anti-MOV10 antibodies (10 μl) or rabbit IgG (3 μl) were added to the supernatant, followed by overnight incubation at 4 °C with rotation. Fresh beads were then added to the tube and incubated for 3 h at 4 °C with rotation. Beads were then washed three times with lysis buffer, resuspended in sample loading buffer, boiled, and separated by SDS-PAGE.
MS analysis of proteins was performed by Proteomics Core Facility of Nanjing Medical University. Briefly, gel slices were dissolved in 0.1% formic acid and filtered through a 0.45-μm membrane. Samples were separated by an Ultimate 3000 nano-LC system (Dionex) by loading onto a trap column, followed by automatic submission to a matrix-assisted laser desorption/ionization time-of-flight/time-of-flight (MALDI TOF/TOF) apparatus (ultrafleXtreme, BrukerDaltonics, Bremen) operated in the positive ion mode. MassLynx™ software (version 4.1, Waters Corporation) was used for information collection and MS analysis.
Cell line co-expression and co-immunoprecipitation (co-IP)
We co-expressed FLAG-tagged MOV10 with HA-tagged SRSF1, HA-tagged DDX5, or HA-tagged DDX17 in HEK293T cells. Forty-eight hours after transfection, cells in 10-cm dishes were washed three times with cold PBS and then lysed for 10 min on ice in 50 mM Tris, 150 mM NaCl, 1 mM EDTA, 0.5% Triton, 10% Glycerol, 50 mM NaF, 5 mM β-glycerophosphate, and protease inhibitor cocktail. The total cell lysate was centrifuged at 12,000×g for 10 min at 4 °C, and the supernatant was pre-cleared with 50 μl sepharose protein A beads. Fifty microliters of protein A beads were washed three times with 20 mM sodium phosphate (pH 7.0) and incubated with anti-FLAG or anti-HA antibodies at 4 °C for 3 h. The pre-cleared lysate was then incubated with the antibody-coated beads at 4 °C for 3 h. The beads were collected by gentle centrifugation (500×g for 1 min at 4 °C), washed three times with wash buffer (50 mM Tris-HCl (pH 7.4), 150 mM NaCl, 0.1% Triton X-100, 1 mM EDTA), and incubated at 95 °C for 8 min with SDS loading buffer. IP complexes were analyzed by western blot with anti-FLAG and anti-HA antibodies.
RNA immunoprecipitation (RIP)
Protein A Agarose Beads were washed three times in NT2 buffer (50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 1 mM MgCl2, 0.05% NP40, protease inhibitor cocktail, and RNase inhibitor) and coated with 5 μg anti-MOV10 antibody or normal rabbit IgG in 500 μl NT2 buffer for 3 h at 4 °C. Fifteen pairs of detunicated testes (P10) were homogenized in lysis buffer (100 mM KCl, 5 mM MgCl2, 10 mM HEPES, 0.5% NP-40 containing 10 U/ml RNase inhibitor (Promega) and a protease inhibitor cocktail (Roche)) for 1 h at 4 °C. The lysate was centrifuged at 20,000×g for 30 min, and the supernatant was precleared by incubation with beads for 1 h, followed by incubation with antibody- or IgG-bound beads for 5 h. After stringent washing with NT2 buffer and digestion with proteinase K, total RNA was extracted with TRIzol.
Reverse transcription-quantitative real-time polymerase chain reaction (RT-qPCR)
RNA samples were reverse transcribed using the PrimeScript RT reagent Kit (Takara) and random primers. As amplification of individual mature miRNAs is inefficient due to their short length, we used specific stem-loop primers that extend the miRNA template for subsequent qPCR detection [98]. qPCR analysis was performed using SYBR Premix Ex Taq II mixture (Takara) on the StepOnePlus real-time PCR system (Life Technology). 36b4 served as the internal control for transcripts and U6 for mature miRNAs. Oligonucleotide primer sequences are provided in Additional file 6: Table S7.
Sucrose gradient polysome fractionation
Approximately 20 P10 testes were harvested for polysome fractionation assays. Testis lysates were prepared in a buffer containing 100 mM KCl, 0.1% Triton X-100, 50 mM HEPES, 2 mM MgCl2, 10% glycerol, 1 mM DTT, 20 U/ml Protector RNase Inhibitor (Promega), and 1 × EDTA-free protease inhibitor cocktail (Roche), and kept on ice for 15 min before centrifugation at 10,000×g for 10 min. The supernatant was carefully loaded onto a 20–50% (w/v) linear sucrose density gradient (Gradient Master, Biocomp, Fredericton, NB, Canada) and centrifuged at 38,000 rpm for 3 h (Beckman Coulter Optima L-100XP Ultracentrifuge, Brea, CA, USA). RNP, 40S to 80S ribosome, and polysome fractions were collected using a piston gradient fractionator (Biocomp). The efficiency of polysome separation was verified by western blot analysis of individual fractions using antibodies against ribosomal protein L22 and TUBULIN.
RNA-seq, small RNA-seq, and CLIP-seq
Strand-directional RNA-seq libraries were prepared from total RNA (depleted of ribosomal RNA) from SPC control (shVector) and Mov10 knockdown (shMov10-832) samples using the TruSeq Stranded Total RNA Sample Preparation kit (Illumina, USA) according to the manufacturer’s instructions. Small RNA libraries were prepared using small RNAs (size range of 15–50 nt) isolated from total RNA samples using the TruSeq Small RNA library prep kit (Illumina, USA) according to the manufacturer’s protocols. MOV10 CLIP libraries were prepared as described above. Sequencing of 3 CLIP libraries, 6 RNA-seq libraries, and 2 small RNA libraries (with triplicate samples) was performed using Illumina Hiseq2500 according to the manufacturer’s instructions.
Shanghai Biotech Co. (Shanghai, China) and Vazyme Biotech Co., Ltd. (Nanjing, China) performed bioinformatics computations. The quality of sequencing data was validated using CASAVA (version 1.8). Raw reads were pre-processed using Fastx (fastx_toolkit-0.0.13.2) to obtain clean data by filtering rRNAs and trimming adaptors. For RNA-seq and CLIP-seq, clean reads were mapped to the mouse genome (mm10) using TopHat (version 2.0.9) with a GTF file downloaded from the Ensembl database, allowing a maximum of two mismatched bases. Reads from small RNA-seq were mapped to miRBase (version 21.0) using CLC Genomics Workbench (version 5.5) with no mismatches allowed.
CLIP read pre-processing and genomic mapping
Qualified reads were sorted into three separate libraries according to their barcodes and then processed into clean reads for further analysis. All clean reads no shorter than 15 nt were mapped to the mouse reference genome (UCSC mm10 assembly). Only reads mapping uniquely to the genome were used for further analysis. Alignment data generated using TopHat were converted from BAM into bigWig format and visualized using the UCSC genome browser. Next, we quantified the distribution of reads aligning with different genomic regions (5′-UTR, CDS, 3′-UTR and intergenic). The genomic coordinates for repeat elements were downloaded from the UCSC website (“rmsk.txt.gz”, output from “repeat masker”). The aligned reads were annotated using the GTF file <GTF_FILE> (downloaded from Ensembl). RNA expression values (FPKM) were calculated by Cufflinks v2.1.1. Scatter plots of correlations between libraries were plotted using an R script.
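For orientation, the FPKM unit can be illustrated with its basic count-based definition. The sketch below is a simplification under stated assumptions: Cufflinks estimates transcript abundances with a statistical model rather than by direct counting, and the function and argument names are hypothetical.

```python
def fpkm(fragments, transcript_length_nt, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Simplified count-based definition; this is NOT the Cufflinks estimator.
    """
    if transcript_length_nt <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_nt / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Hypothetical example: 100 fragments on a 2,000-nt transcript
# in a library of 10 million mapped fragments.
print(fpkm(100, 2_000, 10_000_000))
```

Normalizing by both transcript length and library size is what makes values comparable between transcripts and between the three CLIP libraries.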
CLIP read coverage across genic regions
A total of 13,210 MOV10-bound mRNAs met the selection criterion of FPKM ≥ 0.5 in at least two libraries, and 32,617 corresponding transcripts were extracted from Ensembl. Genomic coordinates of mRNA regions (5′-UTR, CDS and 3′-UTR) were acquired from Ensembl. Each region was divided into 100 bins. The normalized sequencing depth per bin was plotted by scanning read density on every transcript across three independent CLIP libraries. To analyze the correlation of 3′-UTR length with MOV10 binding intensity, we ranked transcripts according to 3′-UTR length and partitioned them into terciles: long (10,872 transcripts), medium (10,872 transcripts), and short (10,873 transcripts). The normalized sequencing depth for the three groups was plotted separately.
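The binning and tercile steps can be sketched as follows. This is an illustrative outline, not the pipeline used in the study: the function names are hypothetical, per-base depth is assumed to be already normalized, and regions shorter than the number of bins are handled by letting adjacent bins share a base.

```python
def bin_depth(per_base_depth, n_bins=100):
    """Average per-base depth into n_bins consecutive bins along one region."""
    n = len(per_base_depth)
    if n == 0:
        raise ValueError("region is empty")
    binned = []
    for i in range(n_bins):
        start = i * n // n_bins
        # Guarantee at least one base per bin for regions shorter than n_bins.
        end = max((i + 1) * n // n_bins, start + 1)
        chunk = per_base_depth[start:end]
        binned.append(sum(chunk) / len(chunk))
    return binned


def split_terciles(utr_length_by_transcript):
    """Rank transcripts by 3'-UTR length (longest first) and split into thirds.

    Any remainder goes to the 'short' group, which reproduces the
    10,872 / 10,872 / 10,873 split for 32,617 transcripts.
    """
    ranked = sorted(utr_length_by_transcript,
                    key=utr_length_by_transcript.get, reverse=True)
    k = len(ranked) // 3
    return {"long": ranked[:k], "medium": ranked[k:2 * k], "short": ranked[2 * k:]}
```

Averaging within bins rather than summing keeps profiles comparable between regions of different length.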
CLIP-seq analysis on deletion-based crosslinked sites
Reverse transcription of CLIP-captured target RNA produces a deletion event at the crosslinked site. Therefore, deletion sites are by default considered sites of protein-RNA binding [67]. We extracted all MOV10 CLIP tags with mutations (insertions, deletions, or mismatches) and, using samtools and a Perl script, identified a total of 33,468 deletion sites relative to the genome from 297,083 deletion residues. Genomic information for deletion sites was acquired from the mm10 GTF file (downloaded from Ensembl). To calculate nucleotide composition within 100 nt on both sides, deletion sites were set as position zero. To predict the RNA secondary structure, we assessed the probability of base pairing and unpairing in the indicated ranges. The percentage of base pairing at each position reflects the local potential for secondary structure. This analysis was verified by the expected result that miRNA stem nucleotides paired at a higher frequency than those in the miRNA loop.
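As an illustration of how deletion positions can be read from alignments, the sketch below parses SAM CIGAR strings in plain Python. It is not the samtools/Perl workflow used in the study. The function names are hypothetical, and treating "sites" as distinct genomic positions and "residues" as all deleted bases is an assumption about how the two counts relate.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance the reference


def deletion_positions(pos, cigar):
    """Return 1-based reference coordinates of every deleted base in a read.

    pos is the 1-based leftmost mapping position from the SAM record.
    """
    deleted = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "D":
            deleted.extend(range(ref, ref + length))
        if op in REF_CONSUMING:
            ref += length
    return deleted


def summarize_deletions(alignments):
    """Count distinct deletion sites and total deleted residues.

    alignments: iterable of (chrom, pos, cigar) tuples.
    """
    counts = Counter()
    for chrom, pos, cigar in alignments:
        for p in deletion_positions(pos, cigar):
            counts[(chrom, p)] += 1
    return len(counts), sum(counts.values())
```

Spliced gaps (`N`) advance the reference coordinate but are not counted as deletions, which matters for reads aligned across exon junctions by TopHat.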
To evaluate MOV10 binding to pre-mRNA introns, CLIP tags containing crosslinked sites falling within intronic regions were extracted [74]. Exon-intron junctions were located on a genome-wide level, and each was marked as position zero. Next, genomic windows were set around each site that consisted of the downstream 200 nt relative to the 5′ splice site and of the upstream 200 nt relative to 3′ splice site. The number of genomic crosslinked sites or crosslinked nucleotides were mapped, calculated, and plotted within these set windows. Further, ± 100 nt genomic windows flanking each exon-intron junction were scanned, and the windows with deletion sites identified on the intronic side were reserved for calculating nucleotide composition.
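The window assignment around splice sites can be sketched as below. The coordinate conventions are assumptions made for illustration: intron coordinates are 1-based and inclusive, position zero is taken as the first intronic base at the 5′ splice site and the last intronic base at the 3′ splice site, and the function name is hypothetical.

```python
def splice_site_offsets(site, intron_start, intron_end, strand, window=200):
    """Place an intronic crosslink site relative to the intron's splice sites.

    Returns a list of (label, offset) pairs: offsets from the 5' splice site
    are >= 0 (downstream), offsets from the 3' splice site are <= 0 (upstream).
    Sites outside the intron or outside both windows give an empty list.
    """
    if not intron_start <= site <= intron_end:
        return []
    if strand == "+":
        from_5ss = site - intron_start
        from_3ss = site - intron_end
    elif strand == "-":
        # On the minus strand the 5' splice site is at the higher coordinate.
        from_5ss = intron_end - site
        from_3ss = intron_start - site
    else:
        raise ValueError("strand must be '+' or '-'")
    hits = []
    if from_5ss < window:
        hits.append(("5ss", from_5ss))
    if -from_3ss < window:
        hits.append(("3ss", from_3ss))
    return hits
```

A site in an intron shorter than the window can fall into both windows, so the function returns a list rather than a single assignment.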
CLIP-seq analysis on MOV10-bound miRNA transcripts
Genomic coordinates of miRNA hairpins were obtained from miRBase (release 21.0). Similarly to a previously reported approach [68], we examined CLIP-seq reads to identify reads located within or with an overlap of at least one nucleotide (nt) of a 100 nt window flanking either side of the miRNA hairpins. Each miRNA region was scanned to determine whether MOV10 bound to mature, pre- or pri-miRNA on the basis of read pileup patterns. We classified retained reads into five categories: in detail, category I, reads that map fully to mature miRNAs (no more than 1 nt excess sequence); category II, reads within pre-miRNA sequences that overlap with regions of the mature miRNA (at least 1 nt sequence overlapping with pre-miRNA); category III, reads mapping to pre-miRNA sequences but without overlap with mature miRNA sequence; category IV, reads that overstep the boundary of pre-miRNA sequences (an overlap of at least 1 nt sequence with pre-miRNA); and category V, reads completely outside of the pre-miRNA boundaries. A minimum of five CLIP reads was required to define each form of miRNA: mature (category I), pre-miRNA (category II plus III), and pri-miRNAs (category IV plus V). To characterize the global distribution of CLIP reads relative to the position of miRNA secondary structure, the ± 200 nt windows flanking the stem loop midpoint were extracted and divided into 30 bins for read coverage plotting.
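The five-category scheme can be expressed as a small decision procedure. The sketch below is one possible reading of the rules, not the authors' script. It assumes 1-based inclusive coordinates on the same chromosome and strand, counts "excess" as the total number of read bases outside the mature miRNA, and checks category I before the boundary test. All names are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of shared bases between two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def classify_read(read, mature, pre, max_excess=1):
    """Assign a CLIP read to category I-V relative to one miRNA hairpin."""
    read_len = read[1] - read[0] + 1
    in_pre = overlap(*read, *pre)
    in_mature = overlap(*read, *mature)
    if in_pre == 0:
        return "V"    # completely outside the pre-miRNA
    if in_mature > 0 and read_len - in_mature <= max_excess:
        return "I"    # maps (almost) fully to the mature miRNA
    if in_pre < read_len:
        return "IV"   # oversteps the pre-miRNA boundary
    return "II" if in_mature > 0 else "III"


def detected_forms(category_counts, min_reads=5):
    """Apply the minimum-read threshold to each miRNA form."""
    c = category_counts
    totals = {
        "mature": c.get("I", 0),
        "pre-miRNA": c.get("II", 0) + c.get("III", 0),
        "pri-miRNA": c.get("IV", 0) + c.get("V", 0),
    }
    return {form for form, n in totals.items() if n >= min_reads}
```

For example, with a hypothetical pre-miRNA at 100–170 and mature miRNA at 105–126, a read at 103–126 carries 2 nt of excess and is therefore assigned to category II rather than I.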
RNA-seq analysis on mRNA transcripts with differential 3′-UTR lengths
A total of 689 genes with significant differences in transcript level were selected (p < 0.05, fold change > 1.5). The genomic coordinates of all transcripts corresponding to these genes were obtained from Ensembl. All transcripts were ranked according to 3′-UTR length from long to short and then divided into terciles (long, medium, and short). Transcript levels (FPKM values) were calculated by Cufflinks v2.1.1. Transcripts without mapped reads were discarded. Expression changes were measured using log2 ratios (Mov10 knockdown vs control cells).
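The selection criterion and the log2 ratio can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the p value is assumed to come from the upstream differential expression step, and applying the fold-change threshold in either direction is an assumption, since the direction is not specified here.

```python
import math


def log2_ratio(fpkm_knockdown, fpkm_control):
    """log2(knockdown / control); None when either value has no signal."""
    if fpkm_knockdown <= 0 or fpkm_control <= 0:
        return None  # mirrors discarding transcripts without mapped reads
    return math.log2(fpkm_knockdown / fpkm_control)


def is_differential(fpkm_knockdown, fpkm_control, p_value,
                    min_fold=1.5, alpha=0.05):
    """True when p < alpha and the change exceeds min_fold in either direction."""
    ratio = log2_ratio(fpkm_knockdown, fpkm_control)
    if ratio is None:
        return False
    return p_value < alpha and abs(ratio) > math.log2(min_fold)
```

Working on the log2 scale makes up- and down-regulation symmetric: a 1.5-fold increase and a 1.5-fold decrease are the same distance from zero.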
RNA-seq analysis on miRNA transcript levels
Referring to Fig. 3d, the colored parts of miRNA transcripts are matched between the left and right panels. Left: black solid lines represent pri-miRNA regions extended from stem-loop; black dashed lines represent that the two sides of a pri-miRNA region may either join or separate after initial processing; green lines represent the mature miRNA and/or its stem side; red lines represent the other stem side; and gray circles represent loop region. Right: reads mapping to ± 200 nt window flanking the miRNA hairpin, shown as staggered black lines on top of the two initial miRNA transcripts (before and after processing, respectively), were extracted from RNA-seq library. Note that these two initial transcripts cannot be separately evaluated as they would likely be overrepresented by their cognate RNA-seq reads; that both pre-miRNA (green+gray+red) and mature miRNA (green) are excluded from RNA-seq library; and that the small RNA library contains relatively short reads that are mostly mature miRNAs. The RNA-seq reads (staggered black lines) were collectively evaluated as a normalized density that roughly reflects the whole transcript level of each miRNA, equivalent to the seed levels of mature miRNA, i.e., the maximum potential of miRNA generation.
RNA-seq analysis on alternative splicing events
To detect differentially regulated exons or isoforms at a genome-wide level, we applied the mixture-of-isoforms (MISO) statistical model to RNA-seq data [99]. Sequencing reads were aligned to known and predicted regions for alternative splicing, including exon-intron junction boundaries annotated from genome mm10. We discarded any events with fewer than five supporting reads. Specific events were identified from either the Mov10 knockdown or control libraries and then classified into five categories.
Statistical analysis
Data are reported as mean ± SE unless otherwise noted in the figure legends. Significance between groups was determined using the two-tailed unpaired Student's t test (*p < 0.05; **p < 0.01; ***p < 0.001).
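The summary statistics named above can be sketched with the Python standard library. This is an illustrative outline with hypothetical function names. It computes the mean ± SE, the equal-variance (pooled) t statistic with its degrees of freedom, and the asterisk notation. Converting the t statistic into a two-tailed p value requires the t distribution from a statistics package, which is deliberately left out here.

```python
import math
import statistics


def mean_se(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def student_t(a, b):
    """Unpaired, equal-variance t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    if pooled_var == 0:
        raise ValueError("both groups have zero variance")
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, df


def stars(p):
    """Map a p value to the asterisk notation used in the figures."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""
```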