Application of long single-stranded DNA donors in genome editing: generation and validation of mouse mutants

Background: Recent advances in clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) genome editing have led to the use of long single-stranded DNA (lssDNA) molecules for generating conditional mutations. However, data on the efficiency and reliability of this method remain limited.
Results: We generated conditional mouse alleles using lssDNA donor templates and performed extensive characterization of the resulting mutations. We observed that the use of lssDNA molecules as donors efficiently yielded founders bearing the conditional allele, with seven out of nine projects giving rise to modified alleles. However, rearranged alleles including nucleotide changes, indels, local rearrangements and additional integrations were also frequently generated by this method. Specifically, alleles containing unexpected point mutations were found in three of the nine projects analyzed. Alleles originating from illegitimate repairs or partial integration of the donor were detected in eight projects. Furthermore, additional integrations of donor molecules were identified in four out of the seven projects analyzed by copy counting. This highlights the requirement for thorough allele validation by polymerase chain reaction, sequencing and copy counting of the mice generated through this method. We also demonstrated the feasibility of using lssDNA donors to generate thus far problematic point mutations distant from active CRISPR cutting sites by targeting two distinct genes (Gckr and Rims1). We propose a strategy to perform extensive quality control and validation of both types of mouse models generated using lssDNA donors.
Conclusion: lssDNA donors reproducibly generate conditional alleles and can be used to introduce point mutations away from CRISPR/Cas9 cutting sites in mice. However, our work demonstrates that thorough quality control of new models is essential prior to reliably experimenting with mice generated by this method. These advances in genome editing techniques shift the challenge of mutagenesis from generation to the validation of new mutant models.
Electronic supplementary material: The online version of this article (10.1186/s12915-018-0530-7) contains supplementary material, which is available to authorized users.


Background
Classical gene targeting employing embryonic stem cells has long been the principal method to introduce complex alleles into the mouse genome [1]. More recently, microinjection of an RNA-guided engineered nuclease (RGEN) together with a single-stranded oligodeoxynucleotide (ssODN) has revolutionized our ability to direct mutations in vivo [2]. However, clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9)-aided knock-ins of larger cassettes or loxP sites directly into one-cell mouse embryos [3,4] were breakthroughs that have remained technically very challenging [5]. Equally, CRISPR/Cas9 reagents and ssODNs have become widely used for the introduction of point mutations in one-cell embryos (see examples in [6][7][8]). However, particular locations within genomes, including sequences that are highly conserved and/or repeated, regions with a low number or absence of -NGG tri-nucleotides or sequences without active single guide RNA (sgRNA) close to the target can represent a barrier to the generation of specific mutants [9].
Miura and colleagues [10] first proposed long single-stranded DNA (lssDNA) molecules, larger than standard chemically synthesized oligonucleotides, as an efficient alternative donor template for RGEN-aided homologous recombination (HR). The authors recently extended the method to the creation of conditional alleles and tag insertions, showing the generation of sequence-perfect alleles [11]. We and others documented that CRISPR/Cas9-aided genome editing can give rise to unexpected allele rearrangements ("illegitimate repairs" [7], "KI + indels" [9,12]); therefore, thorough validation of new models is essential to ensure reproducibility of the studies employing these models [12][13][14][15]. However, limited data are available on unexpected events arising from the use of lssDNA and the associated requirements for the quality control (QC) of new models. With our extensive experience in the generation of conditional alleles through large-scale mouse model production [16,17], we have developed a strategy for validation of these alleles.
Here, we have extended the application of lssDNA to the generation of more conditional knock-out (cKO) alleles directly in the embryo. We also produced point mutations where the desired nucleotide change is remote from active CRISPR cutting sites, which so far had proved technically challenging with the available protocols. Although not all attempts were successful, we confirm that new designs employing lssDNA indeed facilitated mutant production for cKOs and particular point mutations that had previously been challenging to generate. Furthermore, we show that novel point mutations and imperfect and/or off-target donor integration(s) can occur in the process of mutagenesis. This work emphasizes the importance of a comprehensive strategy for the QC of new mutants. We conclude that the utilization of lssDNA donor templates shifts the challenge of mutagenesis from generation to the validation of new mutant models.

Results
Generation of a conditional knock-out allele
Production of F 0 animals
Proof of principle for the RGEN-aided generation of conditional alleles employing two CRISPR/Cas9 cuts and two separate ssODN templates as donors was published in the early days of CRISPR/Cas9-aided mutagenesis [3]. However, the use of this strategy for allele generation has not flourished in the literature in the same way as other CRISPR-directed mutagenesis applications [18]. This is most likely because its success requires two concurrent events of homology-directed recombination occurring on the same allele, which remain less frequent than non-homologous end joining (NHEJ) events [5]; this is in keeping with our own experience of the approach (see examples below). We therefore decided to pilot the use of lssDNAs as a possible alternative to ssODN donors.
As a first test case, we aimed to generate a conditional allele in Syt7 by flanking the critical exon ENSMUSE00000225700 with loxP sites (Fig. 1a). This exon was chosen as defined by Skarnes and colleagues [19]. Specifically, the exon is common to the majority of coding transcripts in the gene, and its ablation results in frame-shift transcripts. Two pairs of sgRNAs were designed, centred on each of the genomic sequences to be interrupted by loxP (Fig. 1a), and synthesized to enhance the likelihood of simultaneous cuts on both sides of the same allele. A lssDNA donor corresponding to the floxed allele was generated as per Miura and colleagues ( [10], and see Methods). Specifically, a double-stranded DNA template including a T7 transcription promoter followed by the 1149 bp sequence of the donor was obtained commercially (gBlock®, Integrated DNA Technologies (IDT); Fig. 1). A lssDNA was synthesized by in vitro transcription (IVT) and reverse transcription (detailed in Methods). The sgRNAs and lssDNA (the sequences are provided in Additional file 1: Table S1) were co-injected with Cas9 mRNA into one-cell embryos. One hundred thirty-eight injected embryos were re-implanted in pseudopregnant females. Seventeen pups were weaned and ear biopsies taken for screening of new alleles (the numbers are summarized in Additional file 1: Table S2, Syt7).

Screening of F 0 generation and genotyping of F 1 animals
As animals of the F 0 generation were likely to be mosaic, we analyzed them by screening for the presence of the allele of interest [13]. Polymerase chain reaction (PCR) amplicons were produced from genomic DNA with primers flanking the homology arms and external to the donor (Syt7 primers R1 and F1, Fig. 1a). Their analysis on agarose showed two founders (Fig. 1b, Animals Syt7-1 and Syt7-6) containing deletions. The PCR products from founder animals were purified and sequenced by Sanger sequencing. The sequencing showed that a total of 10 animals out of 17 were mutated on target (Syt7, Table 1). Among them, five pups had indels at either or both 5′ and 3′ guide target sites. Three other animals (Syt7-1, Syt7-6 and Syt7-9) carried alleles with deletions of the sequence flanked by the two pairs of sgRNAs corresponding to non-cKO alleles. The remaining two mutants (Syt7-4 and Syt7-8) were carriers of the designed cKO allele, with sequencing traces suggesting Syt7-8 to be homozygous and Syt7-4 compound heterozygous with one cKO allele and one allele including the 3′ loxP and an indel in 5′ (Additional file 2: Figure S1).
Positive founders Syt7-4 and Syt7-8 were mated to wild-type (WT) animals, and the progeny (F 1 ) were analyzed. In contrast to the analysis of mosaic F 0 animals, sequencing of PCR fragments amplified from F 1 individuals allowed for definitive characterization of the edited alleles [13]. The outcome of the analysis of F 1 animals by PCR and sequencing, employing the same primers used for screening F 0 animals, is summarized in Table 2. Sequencing showed successful transmission of the correctly mutated sequence (cKO allele) by both founders to their progeny (individuals Syt7-4.1d and Syt7-8.1c, e, f and g).
Screening of mutants obtained by co-injection of transcription activator-like effector nuclease (TALEN) and ssODNs showed that random integration of ssODNs can occur when using such a mutagenesis approach [20], illustrating the requirement for further validation of positive animals by a method allowing copy counting. We therefore checked for the presence of additional copies of the lssDNA donor sequence in the genome of F 0 and F 1 animals using digital droplet PCR (ddPCR) and a TaqMan™ assay centred on the critical exon present in the donor sequence run against a known two-copy reference assay (Syt7 exon 7, Dot1l reference assay, as per [13]). Table 2 shows the copy number of the donor sequence in each individual, illustrating the presence of additional copies in some F 0 (Syt7-8) and F 1 individuals (Syt7-8.1c, d, g and h). In particular, copy counting for founder Syt7-8 (which PCR and sequencing had suggested to be potentially homozygous for the cKO allele) also revealed additional integrations of the lssDNA donor (close to 2.8 copies per genome, Table 2). The copy number obtained in the founder is not a clear integer, which is not unexpected in a mosaic animal. Analysis of the F 1 progeny confirmed the presence of an additional integration (Syt7-8.1c, d, g and h) and strongly suggested that this event was not physically linked to the targeted allele in the founder, as this integration could be segregated from the mutated allele in other F 1 progeny (Syt7-8.1e and f).
Fig. 1 Generation of a Syt7 floxed allele. a Diagrammatic representation of the genomic sequence with the Syt7 critical exon highlighted, the corresponding template for lssDNA synthesis and the position of sgRNAs for in vivo delivery together with the primer locations used for reverse transcription and for genotyping. Note loxP sites in the lssDNA prevent reprocessing of repaired alleles by CRISPR-Cas9 complex. Diagram shows the process for the generation of lssDNA through in vitro transcription and reverse transcription. HA homology arm. b PCR products amplified from genomic DNA extracted from the 17 F 0 born from the microinjection session using Syt7-F1 and Syt7-R1 primers. L1 = 1 kb DNA molecular weight ladder (thick band is 3 kb). L2 = 100 bp DNA molecular weight ladder (thick bands are 1000 and 500 bp). Sequence trace data derived from animals Syt7-4 and Syt7-8 are displayed in Additional file 2: Figure S1.
Table 1 [...] are shown in Additional file 2: Figure S1, Additional file 3: Figure S2, Additional file 4: Figure S3, Additional file 5: Figure S4, Additional file 6: Figure S5, Additional file 7: Figure S6, Additional file 8: Figure S7, Additional file 9: Figure S8, Additional file 10: Figure S9 and Additional file 11: Figure [...]
Copy counting of the critical exon also confirmed deletions of the target region in some F 0 (Syt7-4) and F 1 individuals (Syt7-4.1a, b and c; Syt7-8.1a; Table 2). These F 1 animals had initially been thought to be WT, as an exon deletion had not been detected by standard PCR with external primers, yet ddPCR showed a reduced copy number of exon 7. This suggests that these animals bore a deletion larger than the segment flanked by the genotyping primers.
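The copy counting described above compares a target assay with a reference assay assumed to be present at two copies per genome. The arithmetic can be sketched as follows, assuming the usual Poisson correction of droplet counts; the function names and droplet counts are hypothetical and are not taken from the study.

```python
import math

def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of template copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    negative_fraction = (total - positive) / total
    return -math.log(negative_fraction)

def copy_number(target_pos: int, target_total: int,
                ref_pos: int, ref_total: int, ref_copies: int = 2) -> float:
    """Copies of the target per genome, relative to a reference assay
    assumed to be present at `ref_copies` copies per genome."""
    target = copies_per_droplet(target_pos, target_total)
    reference = copies_per_droplet(ref_pos, ref_total)
    if reference == 0:
        raise ValueError("reference assay has no positive droplets")
    return ref_copies * target / reference

# Hypothetical droplet counts, for illustration only.
estimate = copy_number(target_pos=4200, target_total=15000,
                       ref_pos=3100, ref_total=15000)
print(f"estimated copies per genome: {estimate:.2f}")
```

In such a scheme, a value clearly above two in an F 1 animal would point to an additional integration, a value near one to a deletion of the assayed region, and a non-integer value in a founder would be compatible with mosaicism.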
In summary, the delivery of lssDNA donor together with CRISPR/Cas9 reagent to a modest number of one-cell embryos produced mosaic animals that transmitted a conditional allele. Some of the transmitting progeny were excluded upon further validation steps due to additional integrations of donor sequence.

Other conditional alleles
Production of F 0 animals
The pilot was next extended to include a further eight genes with the same design principles (Table 1 and Additional file 1: Table S2): two sgRNAs were selected on each side of a critical exon in the genomic sequences to be interrupted by the loxP sites (details of sequences are given in Additional file 1: Table S1, designs in Additional file 4: Figure S3). Refining our strategy in the process of extending the pilot, we introduced standard sequences flanking the loxP sites in the designs, thus allowing us to re-use established diagnostic tests for the validation of alleles (restriction enzyme sites or LoxP-F and LoxP-R primers in Additional file 4: Figure S3). This facilitated the analysis of animals. CRISPR/Cas9 reagents and lssDNA were delivered to C57BL/6NTac one-cell embryos by pronuclear injection.
Screening of F 0 generation and genotyping of F 1 animals
F 0 and F 1 animals were analyzed according to the same strategy as that used for the Syt7 conditional allele: PCR using primers external to the donor homology arms (or two PCRs bridging the homology arms, depending on PCR efficiency) and a PCR amplifying the region flanked by the two loxP sites, all of which were analyzed by Sanger sequencing (Additional file 5: Figure S4, Additional file 6: Figure S5, Additional file 7: Figure S6, Additional file 8: Figure S7, Additional file 9: Figure S8, Additional file 10: Figure S9, Additional file 11: Figure S10 and Additional file 12: Figure S11). A total of 279 F 0 animals were analyzed, and 129 animals were identified as bearing mutations. Seven out of nine projects yielded founders bearing the conditional allele, with an additional one yielding a floxed allele with an unwanted point mutation. One project (Rapgef5) yielded only one founder bearing a conditional allele, which died before mating age. Correct conditional alleles were transmitted to the F 1 generation for four out of the seven projects where founder progeny were analyzed (Table 1). However, in at least three out of nine projects, other alleles were detected which contained unexpected point mutations identified at the F 0 generation (Inpp5k project, Additional file 12: Figure S11h; 6430573F11Rik project, Additional file 13: Figure S12a; Cx3cl1 project, Additional file 13: Figure S12b and c).
Table 1 notes: [...] Revealed by copy number, on or off target. c Deletion including at least one external genotyping primer site.
It is also noteworthy that illegitimate repairs [7] or partial integration(s) of the donor were detected frequently (in eight out of nine projects analyzed; see example in Additional file 12: Figure S11d), highlighting the requirement for extensive allele validation by PCR and sequencing. These events (point mutations, partial and/or rearranged integrations) are reported as illegitimate repairs in Table 1.
Interestingly, F 0 animals with exon deletions were generated in all but one project as a by-product. Whenever null animals were required for ongoing research, these founders were also mated (numbers in brackets, Table 1). So far, germline transmission (GLT) of this additional allele was obtained in five out of six projects where positive founders were bred.
It is noteworthy that two out of these nine projects (Ikzf2 and Usp45) had been previously attempted employing ssODNs or plasmids without yielding founders with conditional alleles, in contrast to subsequent attempts with lssDNA donors (Additional file 1: Table S3). F 0 and F 1 animals containing the cKO alleles were further validated by copy counting with a TaqMan™ assay centred on the floxed region. Importantly, copy counting of the floxed region in combination with the outcome of the targeted allele validation showed additional integrations in four out of seven projects analyzed (Table 1).
Point mutations remote from active sgRNA cutting site
Production of F 0 animals
Finally, we assessed whether the production of a point mutation distal from an active sgRNA cutting site, the generation of which has so far been unsuccessful by repeated attempts using other methods, could also be facilitated by the use of lssDNA. The first target for this pilot was the generation of the Gckr P446L point mutation in C57BL/6NTac mouse embryos (sequence change illustrated in Additional file 15: Figure S14). We initially designed a strategy according to the standard approach, employing a ssODN and one efficient and specific sgRNA cutting as close as possible to the targeted nucleotide. However, some factors limited options for design, such as the close proximity of the target to the exon-intron junction and splice sites that should not be altered. Furthermore, the poor specificity of the target sequence (sequence conserved and repeated at two additional locations in the mouse genome; GRCm38.p5:10:82265447-82265469/12:21568953-21568975) rendered many guides unspecific. The closest sgRNA to the target nucleotide (sgRNA_20 (Fig. 2a)) was shown to be inactive by a Guide-it™ assay, where the CRISPR/Cas9 nuclease activity is assessed on a target DNA fragment in vitro (Fig. 3). This was subsequently confirmed by the fact that no mutagenesis was detected in microinjection session 1 where this sgRNA was used. Therefore, the closest efficient (as confirmed by Guide-it™ assay) and specific sgRNA that could be selected was cutting 34 nt away from the targeted base pair (sgRNA_3, Figs. 2a and 3). Thus, our next strategy employed sgRNA_3 and a ssODN donor, although a distance larger than 30 bp between the target sequence and the cutting site of the sgRNA can represent a barrier to the generation of a specific point mutation [9].
In addition to the targeted nucleotide mutation, a silent mutation was included in the ssODN donor template in order to abolish the protospacer adjacent motif (PAM) of the selected sgRNA and prevent re-processing of the mutated allele by the CRISPR/Cas9 system (Fig. 2a). The sgRNA activities were checked in vitro (Fig. 3), and each RNA was co-injected with Cas9 mRNA and the ssODN, as per the designs shown in Fig. 2a and Additional file 1: Table S1.
We anticipated that generating the desired mutation would be challenging, as the target base is a sub-optimal 34 base pairs away from sgRNA_3's cut site. We therefore performed multiple injection sessions with two different ssODN designs (Gckrdonor_2 and Gckrdonor_3, centred or offset towards the targeted mutation, respectively; sequences in Additional file 1: Table S1) to enhance the likelihood of obtaining the desired point mutation. The outcome of these microinjections was analyzed by PCR and sequencing of the region of interest in a total of 90 pups and is summarized in Table 3. Although the silent mutation was detected in F 0 animals on five occasions, it was not accompanied by the mutation of interest (Table 3 and example in Fig. 4a, ssO-Gckr P446L -54). Sequencing data from founders are shown in Additional file 16.
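The distance between a candidate cut site and the targeted base, which drove the design choices above, can be estimated in silico before guides are tested. The following is a minimal sketch, assuming an SpCas9-type nuclease with an NGG PAM, a 20-nt protospacer and a blunt cut 3 bp from the PAM; the function names and the example sequence are hypothetical, and the distance convention (bases lying between the cut and the target base) is one of several in use. It lists candidate sites only and says nothing about guide activity or specificity, which had to be checked experimentally here.

```python
import re

def spcas9_cut_sites(seq: str):
    """Return sorted (cut_index, strand) pairs for each NGG PAM that has a
    full 20-nt protospacer inside `seq`. The blunt cut lies immediately 5'
    of seq[cut_index] on the top strand, 3 bp from the PAM."""
    seq = seq.upper()
    sites = []
    # Top-strand guides: 20-nt protospacer followed by NGG.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam = match.start()
        if pam >= 20:
            sites.append((pam - 3, "+"))
    # Bottom-strand guides appear on the top strand as CCN followed by 20 nt.
    for match in re.finditer(r"(?=CC[ACGT])", seq):
        pam = match.start()
        if pam + 23 <= len(seq):
            sites.append((pam + 6, "-"))
    return sorted(sites)

def distance_to_cut(target_index: int, cut_index: int) -> int:
    """Number of bases lying between the cut and the target base."""
    if target_index >= cut_index:
        return target_index - cut_index
    return cut_index - 1 - target_index

def nearest_cut(seq: str, target_index: int):
    """Candidate cut site closest to the target base, or None if no PAM."""
    sites = spcas9_cut_sites(seq)
    if not sites:
        return None
    return min(sites, key=lambda site: distance_to_cut(target_index, site[0]))

# Hypothetical sequence with a single top-strand PAM (TGG at index 20).
example = "A" * 20 + "TGG" + "A" * 40
site = nearest_cut(example, target_index=50)
print(site, distance_to_cut(50, site[0]))
```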
We subsequently designed an alternative strategy employing a larger (339 bases) lssDNA sequence and two sgRNAs flanking the region containing the targeted nucleotide. The sgRNAs were selected to introduce double-stranded breaks on each side of the target (40 and 98 nt away in 5′ and 3′, respectively), and their activity was checked in vitro. We consequently selected sgRNA_5.2 and sgRNA_3.1 as they were shown to be most active in vitro (Figs. 2b and 3). The donor sequence was designed with 100 nt homology arms flanking the cut sites, silent mutations that modify the seed sequences of the selected sgRNAs to prevent re-processing and the targeted base change (Fig. 2b). The lssDNA was synthesized in accordance with prior experiments and co-injected with Cas9 mRNA and the two sgRNAs in a single session, the outcome of which is shown in Table 3. Twenty-two pups were weaned, and ear biopsies were taken to screen for new alleles.
Screening of F 0 generation and genotyping of F 1 animals
Primers were designed in genomic regions flanking, but external to, the donor sequence to span the donor integration (Gckr P446L -F2 and Gckr P446L -R2 primers, Additional file 1: Table S1 and Fig. 2b). Sequencing of the PCR products showed that 14 animals out of 22 were mutated on target. Among them, eight individuals carried the designed knock-in (KI) allele (Table 3), with sequencing traces suggesting that four animals were homozygous for the KI (Fig. 4b). Three other individuals showed illegitimately repaired alleles (Table 3; silent mutation only, Fig. 4b).
Two of the four apparently homozygous positive F 0 s (lss-Gckr P446L -11, lss-Gckr P446L -19) were mated to WT animals for GLT of the mutated allele. The analysis of F 1 animals (summarized in Table 4) showed the successful transmission of the correctly mutated sequence by both founders (i.e. lss-Gckr P446L -11.1f, Fig. 4b).

Further model validation
We also checked for the presence of additional copies of the donor sequence in the genome of F 0 and F 1 animals using ddPCR and a TaqMan™ assay centred on the donor sequence (as per [13]). Table 4 shows the copy number of the donor sequence in each individual, illustrating a deletion likely spanning a fragment larger than the segments flanked by the genotyping primers (individuals lss-Gckr P446L -11.1a, b, d, e and h, Table 4). Although both founders appeared homozygous for the point mutation by Sanger sequencing, lss-Gckr P446L -11 also transmitted a deletion allele to its progeny, confirming mosaicism in this individual.
We next attempted to employ lssDNA donors for the generation of a mouse line bearing a point mutation in the Rims1 gene, which also had not been achieved with standard ssODN donors (Additional file 17: Figure S15 and Additional file 18: Figure S16; Additional file 1: Table S4, 1 positive founder/155 animals born (0.6%); this founder did not yield GLT, Additional file 1: Table S5). The new design employing lssDNA (Additional file 17: Figure S15) yielded founders bearing the correct mutation at a much higher frequency (4 positive founders/39 animals born (10%) with lssDNA donors), one of which achieved GLT of this second challenging point mutation (Additional file 1: Tables S4 and S5; Additional file 19: Figure S17; sequencing data in Additional file 20). Sequencing data from all founders for the point mutation (with ssODNs and lssDNA donors) are shown in Additional file 20.
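The founder frequencies quoted above follow directly from the counts (1 of 155 animals with ssODNs, 4 of 39 with lssDNA). The sketch below reproduces that arithmetic and adds a one-sided Fisher's exact test as one possible way to compare the two donor types; this test is an assumption of the sketch and was not part of the reported analysis, and the function names are hypothetical.

```python
from math import comb

def rate_percent(positive: int, total: int) -> float:
    """Positive founders as a percentage of animals born."""
    return 100.0 * positive / total

def fisher_one_sided(pos_a: int, n_a: int, pos_b: int, n_b: int) -> float:
    """Probability of observing at least `pos_b` positives in group B if the
    pooled positives were distributed at random between the two groups
    (hypergeometric upper tail)."""
    positives = pos_a + pos_b
    total = n_a + n_b
    upper = min(positives, n_b)
    numerator = sum(comb(positives, k) * comb(total - positives, n_b - k)
                    for k in range(pos_b, upper + 1))
    return numerator / comb(total, n_b)

# Counts reported for the Rims1 point mutation.
ssodn_positive, ssodn_born = 1, 155
lss_positive, lss_born = 4, 39

print(f"ssODN:  {rate_percent(ssodn_positive, ssodn_born):.1f}%")
print(f"lssDNA: {rate_percent(lss_positive, lss_born):.1f}%")
print("one-sided Fisher p:",
      fisher_one_sided(ssodn_positive, ssodn_born, lss_positive, lss_born))
```

With so few positive founders, any such comparison should be read as indicative only.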

Discussion
Novel strategy for challenging point mutations
Standard methods employing chemically synthesized oligonucleotides had not permitted the introduction of the Gckr P446L point mutation (Table 3), although evidence of partial integration of the donor (silent mutation) was recorded in five animals. This is likely due to the distance between the available sgRNA and the target sequence (34 bp). We have extended the pilot to a second challenging point mutation and also found that the use of a lssDNA donor yielded the generation and GLT of the point mutation (Additional file 1: Tables S4 and S5; Additional file 19: Figure S17), reinforcing the proposition that the use of lssDNA can rescue such unsuccessful projects. This study is the first proof of principle that the use of lssDNAs can lift the barrier to the introduction of hitherto challenging point mutations into the mouse genome, where no active and/or specific sgRNA is available in the immediate vicinity of the target site.
Table 3 legend: The table shows the numbers of embryos and animals involved in mutagenesis attempts employing the injection of CRISPR/Cas9 reagents and oligonucleotides or lssDNA donors. The percentage of transferred embryos yielding live animals at weaning is shown in parentheses. The outcome of these attempts is also summarized. Note that sgRNA_20 was employed for the first microinjection session with ssODN_20 and substituted to sgRNA_3 and relevant donor ssODNs for subsequent sessions, as it was confirmed to be inactive. Sequencing data from this project are displayed in Fig. 4 (additional raw sequencing data are provided in Additional file 16). MS microinjection session, n.d. not determined, SM silent mutation.
Extending our capacity to generate point mutations further away from available optimal sgRNA target sites is of crucial importance, as it will enable the generation of thus far challenging mutants, including those models essential for the validation of candidate mutations causing human disease arising from whole genome sequencing (WGS) or quantitative trait locus (QTL) analysis [21].
Alternative methods for production of lssDNA donor
We chose IVT followed by reverse transcription as a method to obtain lssDNAs [10]. Alternative methods employing combined nickase and nuclease digestion of a plasmid [22], use of a biotin-labelled primer [23], conversion of double-stranded DNA to ssDNA by nucleases (Guide-it™ Long ssDNA Production System, Takara) or chemical synthesis [11] have been proposed. However, synthesizing lssDNA donor molecules remains a challenge: the IVT-based method is both lengthy and expensive; the use of nucleases can give limited yield and requires DNA of impeccable quality; and chemical synthesis is expensive and also has size limitations. It will be important to refine or replace these methods to facilitate access to high-quality donors.
Fig. 4 [...] Figure S14. a ssODN donors only yielded introduction of the intended silent mutations, while (b) lssDNA yielded the desired mutation in some individuals (F 0 11 transmitting to 11.f) and only the silent mutations in others (F 0 10). Note that founders appeared homozygous (ssO-Gckr P446L -54, lss-Gckr P446L -11 and lss-Gckr P446L -10) when analyzed by Sanger sequencing, but also could contain deletion alleles in trans, as suggested by copy counting (lss-Gckr P446L -11 in Table 4). A summary of the microinjection session outcomes is detailed in Table 3, and raw sequencing data are provided in Additional file 16.

Efficiency of model generation
Many advancements in the rapidly evolving genome editing field have been published on the basis of a small number of experiments, and these have sometimes proven to be difficult to reproduce [24,25]. Our results support the view that lssDNAs facilitate the production of complex alleles, suggesting that the method as described by Quadros and colleagues [11] is sufficiently robust for reproducibility between laboratories. Two of these projects (Ikzf2 and Usp45) were initially attempted employing ssODNs or plasmids as donors, but only the switch to lssDNA has yielded founders with conditional alleles, suggesting it is a more successful method (previous approaches and their outcomes are summarized in Additional file 1: Table S3). We note that other labs have encountered some successes with ssODN donors and otherwise very similar methods for the generation of cKOs ([3]; Lanza et al., this issue [18]). However, the use of lssDNA as donors has proven more efficient in our hands than that of ssODNs, when compared for the generation of the same mutations (Ikzf2 conditional allele and Gckr and Rims1 point mutations). In particular, it alleviates the challenge of integrating both loxP sites in the same allele when generating cKOs and facilitates the introduction of point mutations away from active sgRNA cutting sites.
It is not yet clear why lssDNAs are proving to be superior donor molecules in this context, but their particular efficiency is likely not due to the length of homology arms used in lssDNA donors (up to 100 bases), as much larger homologous sequences were present in plasmid donors.
However, not all projects were successful. The efficiency of this method is likely to be reliant on sufficiently active sgRNAs on both sides of the sequence to be integrated (i.e. the Acvr2b project did not yield conditional alleles or any deletions). It is therefore prudent to check the activity of sgRNAs in vitro and design the donor sequence according to which sgRNAs are the most active. Also, GLT of the floxed allele relies on the viability and fertility of mosaic founders, as illustrated by the failure so far of the Rapgef5 project to yield a conditional allele. Finally, some failures were due to unwanted single nucleotide changes (examples in Additional file 13: Figure S12), most likely picked up during the lssDNA generation process. It is our prediction that some of these failures, but not all, will be reversed by further repeat attempts.
In summary, our data support the efficiency of the method, although not all of the intended models were obtained. Interestingly, the process also produced exon deletion alleles as a by-product of the generation of cKOs, allowing rapid access to null alleles.

Mutant validation
Mutant validation was performed by PCR, employing genomic primers external to the donor sequence and systematic sequencing of the integration, as well as copy counting of the donor sequence.

Validation of mutated allele
We and others have previously described that imperfect alleles can be generated when using ssODNs as donors ("illegitimate repairs" [7], "KI + indels" [9]). Further, rearranged alleles have also been detected when no donor is included in the mutagenesis strategy [7,12,26].
Here we show that rearrangements also occur in the presence of lssDNA donors (Table 1 and example in Additional file 14: Figure S13). As such, the use of lssDNA does not lessen the requirement for allele validation by full sequencing, as rearrangements (including indels and partial integrations) may occur during the double-strand break repair event. In addition, the synthesis of lssDNA itself can be a source of errors [27], potentially introducing unwanted sequence changes early in the process that will require monitoring by full sequencing of the allele. The use of new high-fidelity enzymes (including a replacement of standard reverse transcriptase) might contribute to reducing the frequency of sequence errors in the edited alleles. Inclusion in the donor of sequences of known primers that are specific and efficient in PCR or restriction enzyme sites can simplify screening for mutated loci but does not replace QC by sequencing. Alternative methods for validation of new alleles, involving string sequencing for example, could further facilitate QC.

Additional integrations
Our results show that additional donor integrations are common (five out of six projects; this was also found in [18]). Even when there is no evidence of such an event in the founder generation, it is essential to check for their presence at the F 1 stage, as there is a clonal event at the point of GLT. Furthermore, if the mutant-specific genotyping assay used in subsequent generations is internal to the donor sequence, it will not discriminate between on-target and unidentified additional integrations. Copy counting can be performed by quantitative PCR (qPCR) or most easily by ddPCR, employing an assay centred on the donor that will recognize both WT and mutant alleles (universal) or a mutation-specific assay in correlation with sequencing of a locus-specific amplicon (amplified with primers external to the donor). The locations of random integrations were not identified, so it is unclear whether they were associated with CRISPR/Cas9 off-target activity.
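The combination of locus-specific sequencing and copy counting described above can be expressed as a simple decision rule for F 1 animals. The sketch below assumes a universal assay centred on the donor region, for which an animal with no deletion and no additional integration is expected to score two copies; the class, the tolerance of 0.3 copies, the animal names and the values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class F1Animal:
    name: str
    allele_by_sequencing: str   # e.g. "cKO", "WT" or "rearranged"
    donor_region_copies: float  # from a universal copy-counting assay

def classify(animal: F1Animal, expected: int = 2, tolerance: float = 0.3) -> str:
    """Combine the sequencing call with the copy number of the donor region."""
    delta = animal.donor_region_copies - expected
    if delta > tolerance:
        return "exclude: additional donor integration suspected"
    if delta < -tolerance:
        return "check: deletion of the assayed region suspected"
    if animal.allele_by_sequencing == "cKO":
        return "retain: intended allele at the expected copy number"
    return "no intended allele detected"

# Hypothetical animals, for illustration only.
animals = [
    F1Animal("F1-a", "cKO", 2.05),
    F1Animal("F1-b", "cKO", 3.10),
    F1Animal("F1-c", "WT", 1.02),
]
for animal in animals:
    print(animal.name, "->", classify(animal))
```

A rule of this kind only flags candidates; animals in the "check" and "exclude" categories would still need to be examined by sequencing of the locus.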

Standards for quality control
We found examples of sequence changes, indels, locus rearrangements or random insertion of lssDNA donors in all projects attempted, showing that mutagenesis artefacts are very common. Full model validation at the F1 stage is therefore essential, and it constitutes a labor-intensive exercise involving the sequencing of large or several overlapping amplicons and copy counting of donor insertions. The need for extensive model validation is not specific to the use of lssDNA in genome editing [9,13,20], but it is not alleviated by the use of this new donor type.
Publications reporting proof-of-principle cases for using the CRISPR/Cas9 system for genome engineering focus on the novelty of methods and often do not cover the intricacies of QC of mutants [2,3,11]. However, thorough validation of new models is essential to the reproducibility of research employing mutated laboratory animals. This can be a complex exercise, as genome editing can yield many unpredicted events, both on-target and in other loci. There are profound consequences to using mouse lines harbouring additional mutations in ongoing research, including misleading results, erroneous interpretation of studies and avoidable animal wastage. Therefore, the dissemination of good practice for QC is just as essential as the distribution of efficient protocols for mutagenesis. In addition, extensive validation of mouse mutants is indispensable for providing complete documentation of the animals used in research [14].

Conclusion
Prior to the use of lssDNA, the reliable generation of complex alleles and some point mutations remote from efficacious sgRNA target sequences was out of reach. Here, we have shown the application of lssDNA to both the generation of cKO alleles and challenging point mutations. However, the technique can also produce a variety of artefacts: point mutations, indels, locus rearrangements and additional donor integrations. A comprehensive mutant validation strategy involving sequencing of the locus and copy counting of the donor is therefore essential. The utilization of lssDNA as a donor sequence lifts the barrier to the generation of complex alleles and shifts the challenge of the exercise from the production of founders bearing these new alleles towards the validation of these new mutants.

Methods

sgRNAs
Guide sequence selection was carried out using the following online tools: CRISPOR [28] and Wellcome Trust Sanger Institute (WTSI) Genome Editing (WGE) [29]. sgRNA sequences were selected with as few predicted off-target events as possible, particularly on the same chromosome as the intended modification. sgRNAs used in this study are shown in Additional file 1: Table S1. sgRNAs were synthesized directly from gBlock® (IDT, Skokie, IL, USA) templates containing the T7 promoter using the HiScribe™ T7 High Yield RNA Synthesis Kit (New England BioLabs®, Ipswich, MA, USA) following manufacturer's instructions. RNAs were purified using the MEGAclear Kit (Ambion). RNA quality was assessed using a NanoDrop spectrophotometer (ThermoScientific) and by electrophoresis on 2% agarose gel containing ethidium bromide (Fisher Scientific). A Guide-it™ assay was performed as per manufacturer instructions (Takara, Kyoto, Japan).

Templates for lssDNA synthesis
Templates for lssDNA synthesis were either assembled by cloning in a plasmid or, when possible, were obtained from IDT as a single gBlock®. Additional file 1: Table S1 details the generation of the lssDNA employed in this study.

Donor sequences
Donor ssODNs (desalted grade) were obtained from IDT. Donor lssDNAs were initially generated following a method adapted from [10]. Briefly, templates for IVT (donor sequence flanked by the T7 promoter) were obtained as a gBlock® (IDT) or cloned in a plasmid that was subsequently linearized. Typically, 150 ng of double-stranded gBlock® template or 2 μg of plasmid template was transcribed using the HiScribe T7 High Yield RNA Synthesis Kit (New England BioLabs®). At the end of the reaction, DNase I was added to remove the DNA template. RNA was purified employing the MEGAclear Transcription Clean-Up Kit (Ambion). Single-stranded DNA was synthesized by reverse transcription from 20 μg of RNA template employing SuperScript III Reverse Transcriptase (Invitrogen), treated with RNase H (Ambion) and purified employing the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany). Donor concentration was quantified using the NanoDrop (Thermo Scientific), and the integrity was checked on 1.5% agarose gel containing ethidium bromide (Fisher Scientific).

Mice
All animals were housed and maintained in the Mary Lyon Centre, MRC Harwell Institute under specific-pathogen-free (SPF) conditions, in individually ventilated cages adhering to environmental conditions as outlined in the Home Office Code of Practice. Mice were euthanized by Home Office Schedule 1 methods. Colonies established during the course of this study are available for distribution and are detailed in Additional file 1: Table S6.

Pronuclear microinjection of zygotes
All embryos were obtained by superovulation. Pronuclear microinjection was performed as per Gardiner and Teboul [30], employing a FemtoJet (Eppendorf AG, Hamburg, Germany) and C57BL/6NTac embryos for all projects shown here, apart from Rims1, which was performed with C57BL/6J embryos. Specifically, the injection pressure (Pi) was set between 100 and 700 hPa, depending on the needle opening; the injection time (Ti) was set at 0.5 s and the compensation pressure (Pc) was set at 10 hPa. Mixes were centrifuged at high speed for a further minute prior to microinjection. Injected embryos were re-implanted in CD-1 pseudopregnant females. Host females were allowed to litter and rear F0s.

Breeding for germline transmission
F0 animals in which a desired allele was detected were mated to WT isogenic animals. The resulting F1 animals were used to assess the GLT of the allele of interest and to permit the definitive validation of its integrity.

Genomic DNA extraction from ear biopsies
Genomic DNA from F0 and F1 animals was extracted from ear clip biopsies using the DNA Extract All Reagents Kit (Applied Biosystems) according to the manufacturer's instructions. The crude lysate was stored at −20°C.

PCR amplification and sequencing
New primer pairs were set up in a PCR containing 500 ng genomic DNA extracted from a WT mouse, 1× Expand Long Range Buffer with 12.5 mM MgCl2 (Roche), 500 μM PCR Nucleotide Mix (dATP, dCTP, dGTP, dTTP at 10 mM, Roche), 0.3 μM of each primer, 3% dimethyl sulfoxide (DMSO) and 1.8 U Expand Long Range Enzyme mix (Roche) in a total volume of 25 μl. Using a T100 thermocycler (Bio-Rad, Hercules, CA, USA), PCRs were subjected to the following thermal conditions: 92°C for 2 min, followed by 40 cycles of 92°C for 10 s, a gradient of annealing temperatures between 55 and 65°C for 15 s and 68°C for 1 min/kilobase, and a final elongation step of 10 min at 68°C. The PCR outcome was analyzed on a 1.5-2% agarose gel, depending on the amplicon size, and the highest annealing temperature giving efficient amplification was identified for the primer pair. If no temperature allowed for an efficient and/or specific PCR amplification, the assay was repeated with an increased DMSO concentration (up to 12%). Using the optimized conditions defined above, PCRs for each project were run and an aliquot was analyzed on agarose gel. The PCR products were purified employing a QIAquick Gel Extraction Kit (Qiagen) and sent for Sanger sequencing (Source Bioscience, Oxford, UK). Genotyping primers were chosen to be at least 200 bp away from the extremity of donors, depending on available sequences for design.
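The cycling conditions described above depend on amplicon length through the extension step (1 min per kilobase). The short Python sketch below shows one way to derive the extension time and an approximate run time for a given amplicon. The function names are hypothetical, rounding the extension time up to whole seconds is an assumption, and temperature ramping is ignored, so the total is only a lower bound on the actual run time.

```python
import math

# Cycling parameters taken from the protocol text above.
INITIAL_DENATURATION_S = 120   # 92°C for 2 min
DENATURATION_S = 10            # 92°C for 10 s
ANNEALING_S = 15               # 55-65°C gradient for 15 s
EXTENSION_S_PER_KB = 60        # 68°C for 1 min/kilobase
FINAL_ELONGATION_S = 600       # 68°C for 10 min
CYCLES = 40


def extension_seconds(amplicon_bp: int) -> int:
    """Extension time at 1 min/kb, rounded up to whole seconds (assumption)."""
    if amplicon_bp <= 0:
        raise ValueError("amplicon length must be positive")
    return math.ceil(amplicon_bp * EXTENSION_S_PER_KB / 1000)


def program_seconds(amplicon_bp: int) -> int:
    """Total hold time of the program, ignoring temperature ramping."""
    per_cycle = DENATURATION_S + ANNEALING_S + extension_seconds(amplicon_bp)
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_ELONGATION_S


if __name__ == "__main__":
    # 1594 bp is the size of the Ikzf2-F3/Ikzf2-3R2 amplicon (Additional file 5).
    bp = 1594
    print(f"{bp} bp: extension {extension_seconds(bp)} s, "
          f"program at least {program_seconds(bp) / 60:.0f} min")
```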

Sequencing data analysis
Sequencing data were analyzed differently depending on whether they were obtained from F0s or F1s (as per [13]). At the F0 stage, animals were screened for evidence of the expected change, i.e. the presence of loxP sites for conditional allele projects or the presence of the expected base change for the Gckr P446L point mutation project. F0 animals should be considered mosaic. All F1 animals are heterozygous, containing one WT allele and one allele to be determined, as they are obtained by mating F0 animals carrying the desired gene edits to WT animals. The F1 stage therefore enables definitive characterization of the new mutant.

Sub-cloning of PCR products
PCR products amplified from F0 DNA showing complex sequencing traces were sub-cloned using a Zero-Blunt PCR Cloning Kit (Invitrogen). An appropriate number of clones per founder (usually 12-24), chosen according to the complexity of the traces observed prior to sub-cloning, was picked and grown overnight. Plasmids were isolated using a QIAprep Miniprep Kit (Qiagen) and analyzed by Sanger sequencing (Source Bioscience) using the M13R oligonucleotide or gene-specific primers.

ddPCR
Copy number variation experiments were performed as duplex reactions, where the sequence employed as a donor was amplified using a fluorescein amidite (FAM)-labelled assay (sourced from Biosearch Technologies, Petaluma, CA, USA), in parallel with a VIC-labelled reference gene assay (Dot1l, sourced from ThermoFisher) set at two copies (CNV2) on the Bio-Rad QX200 ddPCR System (Bio-Rad) as per Codner and colleagues [31]. Reaction mixes (22 μl) contained 2 μl crude DNA lysate or 50 ng of phenol/chloroform purified genomic DNA, 1× ddPCR Supermix for probes (Bio-Rad), 225 nM of each primer (two primers per assay) and 50 nM of each probe (one VIC-labelled probe for the reference gene assay and one FAM-labelled for the ssODN sequence assay). These reaction mixes were loaded either into DG8 cartridges together with 70 μl droplet oil per sample and the droplets generated using the QX100 Droplet Generator or loaded in plate format into the Bio-Rad QX200 AutoDG and the droplets generated as per the manufacturer's instructions. Post droplet generation, the oil/reagent emulsion was transferred to a 96-well semi-skirted plate (Eppendorf), and the samples were amplified on a Bio-Rad C1000 Touch thermocycler (95°C for 10 min, followed by 40 cycles of 94°C for 30 s and 58°C for 60 s, with a final elongation step of 98°C for 10 min, where all temperature ramping was set to 2.5°C/s). The plate containing the droplet amplicons was subsequently loaded into the QX200 Droplet Reader (Bio-Rad). Standard reagents and consumables supplied by Bio-Rad were used, including cartridges and gaskets, droplet generation oil and droplet reader oil. Copy numbers were assessed using the QuantaSoft software using at least 10,000 accepted droplets per sample. The copy numbers were calculated by applying Poisson statistics to the fraction of end-point positive reactions, and the 95% confidence interval of this measurement is shown.
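The Poisson calculation referred to above can be sketched in a few lines of Python. The example below is a simplified illustration and not the algorithm implemented in QuantaSoft: it estimates the mean number of template copies per droplet from the fraction of positive droplets, scales the target (FAM) to the reference (VIC) set at two copies, and derives an approximate 95% confidence interval with a delta-method approximation on the log ratio. The function names and the confidence interval method are assumptions made for this example.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson estimate of mean copies per droplet: lambda = -ln(1 - p)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def _relative_se(positive: int, total: int) -> float:
    """Approximate relative standard error of lambda (delta method)."""
    p = positive / total
    lam = -math.log(1.0 - p)
    return math.sqrt(p / (total * (1.0 - p))) / lam


def copy_number(target_pos: int, ref_pos: int, total: int, ref_copies: float = 2.0):
    """Return (estimate, lower, upper) for the target copy number.

    The reference assay is assumed to be present at `ref_copies` per genome.
    The interval is an approximation, not the QuantaSoft calculation.
    """
    if target_pos <= 0 or ref_pos <= 0:
        raise ValueError("both assays need at least one positive droplet")
    lam_target = copies_per_droplet(target_pos, total)
    lam_ref = copies_per_droplet(ref_pos, total)
    estimate = ref_copies * lam_target / lam_ref
    se_log = math.sqrt(_relative_se(target_pos, total) ** 2
                       + _relative_se(ref_pos, total) ** 2)
    return (estimate,
            estimate * math.exp(-1.96 * se_log),
            estimate * math.exp(1.96 * se_log))


if __name__ == "__main__":
    # Hypothetical droplet counts, for illustration only.
    est, low, high = copy_number(target_pos=4200, ref_pos=3000, total=12000)
    print(f"copy number {est:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Because the estimate is a ratio of two Poisson means, the width of the interval depends on the number of accepted droplets, which is one reason a minimum droplet count per sample is useful.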

Additional files
Additional file 1: Table S1. Sequences of reagents used in the study. The table shows the sequences of the oligonucleotides and lssDNA donors, primers and TaqMan assays employed in this study. LoxP sites (for all conditional projects) and point mutations (for Gckr and Rims1 projects) are underlined. Sequences added for diagnostic purposes (for all conditional projects except Syt7) and silent mutations (for Gckr and Rims1 projects) are shown in italics. For the plasmids, sequences flanked by and including homology arms are shown. The ddPCR reference copy counting assay is labelled with VIC. All other ddPCR copy counting assays are labelled with fluorescein amidite (FAM). Copy counting assays labelled as UNIV ddPCR assays recognize both WT and engineered alleles; MUT ddPCR assays recognize engineered allele only. Table S2. Production of founders for conditional alleles. The table shows the numbers of embryos and animals involved in mutagenesis attempts employing the injection of CRISPR/Cas9 reagents and lssDNA donors. Table S3. Generation of conditional alleles employing different donor types. The table shows the numbers of embryos and animals involved in mutagenesis attempts employing the injection of CRISPR/Cas9 reagents and oligonucleotides, plasmids or lssDNA donors. The results of the analysis of the founders obtained from these attempts are also summarized. Table S4. Generation of a Rims1 R655H point mutation. Further genotype screening data for this project are shown in Additional file 18: Figure S16 and Additional file 19: Figure S17. Table S5. Analysis of the Rims1 R655H project. The table details the results of screening of five positive F0 animals obtained for the generation of a Rims1 R655H point mutation and the subsequent characterization of the F1 animals obtained from mating of these F0 animals to WT mice. Table S6. Nomenclature of new mouse lines established in the course of the study. (XLS 81 kb)
Additional file 2: Figure S1. Screening by Sanger sequencing of animals for the generation of a Syt7 conditional allele. The figure shows the sequencing traces from PCR products amplified from founder Syt7-4 (a) and founder Syt7-8 (b) that reveal the integration of two loxP sites in both animals. Note that Syt7-8 appears to be homozygous (a single trace detected), while Syt7-4 appears to contain at least two different alleles. The PCR products from which the sequence traces were derived are shown in Fig. 1. (PNG 377 kb)
Additional file 3: Figure S2. Additional animal analysis information. (DOCX 19408 kb)
Additional file 4: Figure S3. The figure shows the designs of reagents employed for the generation of conditional alleles. Red triangles mark loxP sites. RNA is transcribed in vitro from a double-stranded DNA template containing the T7 promoter and the donor sequence. The resulting RNA is reverse-transcribed employing a primer that is specific to the donor sequence. Additional sequences (orange boxes, marked as universal) were added to the design for the purpose of facilitating initial screening of animals employing restriction enzyme sites and/or validated primer pairs, with the exception of the Syt7 conditional allele (described in Fig. 1). (PNG 91 kb)
Additional file 5: Figure S4. Analysis of the Ikzf2 project. PCR amplification of the genomic region of interest from (a, b) F0 animals and (f, g) Ikzf2-2's offspring with (a, f) Ikzf2-F3 and Ikzf2-3R2 primers (1594-bp amplicon) and (b, g) LoxPF and LoxPR primers (906-bp amplicon) from biopsies. (a, b, f, g) Animals' IDs are shown. + is positive control amplified from an unrelated (a) WT, (b) plasmid template. Sequencing of PCR amplicon from (c) the founder Ikzf2-2, (h) Ikzf2-2.1f and (i) Ikzf2-2.1h with Ikzf2-F3 and Ikzf2-3R2 primers. LoxP sequences are highlighted in blue. (d) ID and outcome of PCR analysis of the region of interest and the conclusion for each F0 individual. (e) ID, outcome of sequencing and copy counting of the region of interest as well as the conclusion for each individual of the first litter obtained by mating Ikzf2-2 with a WT mouse. *Animal mated; **deletion not picked up by Ikzf2 PCR, likely encompassing at least one primer sequence; ***allele detailed in Additional file 14: Figure S13. Evidence of deletion is highlighted in blue. L1 = 1 kb DNA molecular weight ladder (thick band is 3 kb).
Sequencing data showing a correct conditional allele are shown in Additional file 3: Figure S2d.
Additional file 10: Figure S9. Analysis of the 6430573F11Rik project. PCR amplification of genomic DNA of (a) F0 animals, (f) 6430573F11Rik-11's offspring or (i) 6430573F11Rik-28's offspring with (a, f) 6430573F11Rik-F3 and 6430573F11Rik-R2 (1721-bp amplicon) and (b, f) LoxPF and LoxPR (999-bp amplicon). Sequencing of PCR amplicons from (c) 6430573F11Rik-11 and (g) 6430573F11Rik-11.1a with 6430573F11Rik-F3 and 6430573F11Rik-R2. LoxPs are in blue. ID, outcome of PCR analysis and conclusion for (d) each F0 animal and (e) the first litter obtained by mating 6430573F11Rik-11 with a WT mouse. Two founders were mated for cKO GLT. *Mated; ⁑no evidence of loxP in 6430573F11Rik amplicon, suggesting donor integrated randomly (6430573F11Rik-28 sequence trace in Additional file 3: Figure S2q). (g) Only WT sequence is found, indicating random donor insertion. (f, i) Animal IDs are shown. + is positive control from unrelated WT and conditional floxed animal for 6430573F11Rik and LoxP PCR, respectively. L1 = 1 kb DNA molecular weight ladder (thick band is 3 kb). (h) First litter obtained by mating 6430573F11Rik-28 with a WT mouse. ID, outcome of sequencing and copy counting of the region of interest and the conclusion for each individual. (j) Sequencing of amplicons obtained with 6430573F11Rik-F3 and 6430573F11Rik-R2 and 6430573F11Rik-28.1a. Only WT sequence is found, indicating random donor insertion. Sequencing of deletion allele in founder 6430573F11Rik-6, summary of analysis of F1 animals derived from 6430573F11Rik-6 and transmitted deletion allele are shown in Additional file 3: Figure S2r