General TA drive principles and systems
The underlying design principle of all TA drive systems is that the drive alleles contain a toxin together with an antidote that rescues the effect of the toxin. We assume that the toxin is a CRISPR nuclease targeting an essential gene that will be disrupted and rendered nonfunctional when mutations are introduced at the cut sites through end-joining or homology-directed repair. The antidote consists of a recoded version of the gene, which does not match the gRNAs and therefore cannot be cleaved by the drive. Cells or individuals exposed to the toxin will often be nonviable, unless rescued by a drive allele. In contrast to homing drives that spread by directly increasing the number of drive alleles, TA drives spread by reducing the number of wild-type alleles (and thus still increasing the relative frequency of the drive). Various potential arrangements and targets for TA systems can be conceived. In this study, we will focus on three general classes of such systems:
TARE (Toxin-Antidote Recessive Embryo). These drives target an essential but haplosufficient gene. Disrupted alleles are recessive lethal (i.e., one functional copy of the gene is required for viability, which can be a drive or wild-type allele).
TADE (Toxin-Antidote Dominant Embryo). These drives target a haplolethal gene (i.e., two functional copies of the gene are required for viability).
TADS (Toxin-Antidote Dominant Sperm). These drives target a gene that is transcribed in gametocytes after meiosis I in males, with this expression being critical for successful spermatogenesis. We assume that all sperm with a disrupted target allele are nonviable.
Figure 1a shows which genotypes are rendered nonviable in each of these classes of drive. The detailed features of these systems will be discussed in the relevant sections below. Generally, TARE systems are aimed at population modification, while TADE and TADS systems can be used for both population modification and suppression. For TADE or TADS suppression, we assume that the drive and target loci are unlinked and that the drive is placed in an essential but haplosufficient fertility gene (one that affects fertility in only one sex for TADE, and specifically in males for TADS), disrupting the gene with its presence (not by targeting it with gRNAs) so that drive homozygotes of one sex are sterile. We further discuss a population modification variant of TADE that we term Toxin-Antidote Double Dominant Embryo (TADDE) drive, which still targets a haplolethal gene but has a stronger rescue element such that a single drive allele is sufficient for an individual to be viable (Fig. 1a), allowing the drive to spread faster and with a lower invasion threshold than TADE drive. Finally, we will discuss a variant of TADS where the drive is located on the Y chromosome (termed TADS Y-suppression). Such a system is expected to exhibit similar dynamics to previously studied X-shredder drives [41,42,43].
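The viability rules of these classes can be summarized in a few lines of code. The following is a minimal Python sketch, not part of the model described in the Methods. It assumes a same-site arrangement in which every allele at the target locus is a drive allele (D), a wild-type allele (W), or a disrupted allele (X). The function names and the allele encoding are hypothetical and chosen only for illustration.

```python
# Illustrative assumption: drive alleles (which carry the recoded rescue copy)
# and wild-type alleles both count as functional copies of the target gene.
FUNCTIONAL = {"D", "W"}


def is_viable(drive_class, genotype):
    """Return True if a diploid genotype (pair of alleles) is viable."""
    functional_copies = sum(allele in FUNCTIONAL for allele in genotype)
    if drive_class == "TARE":
        # Haplosufficient target: one functional copy is enough.
        return functional_copies >= 1
    if drive_class == "TADE":
        # Haplolethal target: two functional copies are required.
        return functional_copies == 2
    if drive_class == "TADDE":
        # Haplolethal target, but a single drive allele provides full rescue.
        return "D" in genotype or functional_copies == 2
    raise ValueError(f"unknown drive class: {drive_class}")


def tads_sperm_is_viable(allele):
    """TADS acts on haploid sperm: a disrupted target allele is lethal."""
    return allele in FUNCTIONAL
```

For example, `is_viable("TADE", ("D", "X"))` is false while `is_viable("TADDE", ("D", "X"))` is true, which is the difference between the two haplolethal classes described above.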
Overview of population dynamics
Before conducting a full analysis of each individual drive, we will provide an overview of their expected population dynamics, as compared to a homing drive, X-shredder, and Medea system. For these initial analyses, we assume “ideal” drives (no resistance evolution, 100% target cutting activity in the germline for the homing and TADE drives and both germline and early embryo cutting for TARE, TADDE, and TADS). For Medea, we assume that all offspring of mothers with a Medea allele will be nonviable unless they receive a Medea allele from either parent. We further assume a panmictic population of infinite size. This allows us to use a deterministic model, specified by recursion equations for the expected changes in genotype frequencies between discrete generations (see the “Methods” section).
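To make the structure of such a deterministic model concrete, the following Python sketch iterates genotype frequencies for a same-site TARE drive under random mating. It is an illustration under stated assumptions, not the recursion equations given in the Methods. Embryo cutting is assumed to act independently on each wild-type allele in the offspring of drive-carrying mothers, and fitness costs are assumed to be multiplicative per drive allele and applied as parental weights. All function and parameter names are hypothetical.

```python
from itertools import product


def gametes(genotype, germline_cut=1.0):
    """Allele frequencies among gametes; W is cut (W -> X) only in drive carriers."""
    freqs = {"D": 0.0, "W": 0.0, "X": 0.0}
    cut = germline_cut if "D" in genotype else 0.0
    for allele in genotype:
        if allele == "W":
            freqs["W"] += 0.5 * (1 - cut)
            freqs["X"] += 0.5 * cut
        else:
            freqs[allele] += 0.5
    return freqs


def embryo_outcomes(maternal, paternal, cut):
    """Yield (genotype, probability) after independent embryo cutting of W alleles."""
    def fates(allele):
        if allele == "W":
            return [("W", 1 - cut), ("X", cut)]
        return [(allele, 1.0)]

    for (a, pa), (b, pb) in product(fates(maternal), fates(paternal)):
        yield tuple(sorted((a, b))), pa * pb


def next_generation(freqs, germline_cut=1.0, embryo_cut=1.0, hom_fitness=1.0):
    """Advance genotype frequencies by one discrete generation (infinite population)."""
    # Assumed cost model: multiplicative per drive allele, applied to both parents.
    weights = {g: f * hom_fitness ** (g.count("D") / 2) for g, f in freqs.items()}
    total = sum(weights.values())
    parents = {g: w / total for g, w in weights.items()}
    offspring = {}
    for (mother, pm), (father, pf) in product(parents.items(), repeat=2):
        # Maternally deposited nuclease is present only if the mother has a drive.
        cut = embryo_cut if "D" in mother else 0.0
        eggs, sperm = gametes(mother, germline_cut), gametes(father, germline_cut)
        for (am, qm), (ap, qp) in product(eggs.items(), sperm.items()):
            for genotype, q in embryo_outcomes(am, ap, cut):
                if genotype == ("X", "X"):
                    continue  # no functional copy of the target gene: nonviable
                offspring[genotype] = offspring.get(genotype, 0.0) + pm * pf * qm * qp * q
    total = sum(offspring.values())
    return {g: p / total for g, p in offspring.items()}


def drive_allele_frequency(freqs):
    return sum(f * g.count("D") / 2 for g, f in freqs.items())


def carrier_frequency(freqs):
    return sum(f for g, f in freqs.items() if "D" in g)


# Hypothetical release: drive homozygotes make up 20% of the population.
population = {("D", "D"): 0.2, ("W", "W"): 0.8}
for _ in range(100):
    population = next_generation(population)
```

Setting `hom_fitness` below 1 and varying the release frequency is one way to explore threshold behavior in this toy model, although its quantitative results should not be expected to match the figures, which use the full model.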
All of the TA systems we tested are able to spread quickly through the population in this idealized model (Figure S1), but most have frequency thresholds below which the drive will not invade if it carries a fitness cost (Fig. 1b). Realistic drives should carry at least a small fitness cost from the drive components themselves (from expression of the CRISPR nuclease, for example), and payload genes would likely add further fitness costs (although these might be removed if mutations render the payload nonfunctional). When introduced above the threshold frequency, the drive is expected to increase in frequency and spread successfully, while below this threshold, it would likely be eliminated from the population. In contrast, TADS drives, as well as homing drives [4, 44] and X-shredders [44], all have a zero-threshold introduction frequency unless fitness costs are very high (drive homozygote fitness < 0.5 in idealized forms). These drives would therefore be expected to spread from any release frequency if they are expected to spread at all. Note, though, that there exists a narrow range of fitness values under which homing [4, 44] and TADS drives also have a nonzero introduction frequency threshold.
The presence of an introduction threshold can allow a drive to be confined to a target population if the migration rate into a connected population is below a “migration threshold.” Note that this migration threshold is different from the “introduction threshold” because migrants could accumulate over time to eventually exceed the introduction threshold [45,46,47,48,49,50]. However, if an introduction threshold exists, a migration threshold will exist as well. To assess the migration thresholds for our drives in a simple scenario, we studied a wild-type population experiencing a fixed rate of immigration from drive-carrying individuals each generation (presumably from a separate population where the drive is already established). The migration threshold then represents the minimum rate of immigration (as a fraction of the population) needed for the drive to eventually spread through the population (Fig. 1c). These migration thresholds follow the same pattern as the introduction thresholds, though all are lower. Note that such thresholds are also representative of the level of effort needed in a continual release strategy, rather than the single-release strategy considered elsewhere in this manuscript.
TARE, TADDE, and Medea drives are not expected to go to fixation but instead reach equilibrium frequencies that are dependent on fitness costs (Fig. 1b). At equilibrium, all individuals are expected to carry at least one copy of the drive (Figure S1B), but some will carry disrupted alleles as well. Suppression forms of the drives are potentially capable of inducing high genetic loads (defined as the average net fitness reduction relative to a wild-type population of the same size after the drive reaches an equilibrium), though fitness costs can allow modification-type drives to induce a modest genetic load as well (Fig. 1d). However, loads based on such fitness costs in modification drives will usually be insufficient to eradicate a population or even substantially reduce its numbers, depending on ecological characteristics.
We will next study the individual TA systems more closely, exploring how their dynamics change as drive parameters are varied from the idealized model. The following analyses no longer assume a deterministic model of an infinite population as used in Fig. 1. Instead, they are based on our individual-based simulations, which seek to model a more realistic population of finite size with density regulation. These simulations therefore take stochastic effects into account, which can become particularly relevant for suppression approaches as population size decreases.
TARE drive
These drives constitute modification drives that target a gene that is essential and haplosufficient (disrupted alleles are recessive lethal), with the drive providing rescue (Fig. 2a). One consequence of this mechanism is that TARE drive will have threshold-dependent invasion dynamics (Figs. 1b and 2b). Another consequence is that embryo Cas9 cleavage from maternally deposited Cas9, which poses a major problem for homing-type drives, actually makes a TARE drive more efficient (Fig. 2c). For example, when a heterozygous female mates with a wild-type male, most of their offspring will end up carrying the drive. This is because those that did not inherit a drive allele from their mother will likely inherit a disrupted target allele, and the wild-type allele inherited from the father will then become disrupted due to maternal Cas9 activity, rendering those individuals nonviable. TARE drives should therefore be highly tolerant of variation in expression from the nuclease promoter. Indeed, the promoter of a TARE drive need not even be restricted to expression in the germline and early embryo. Constitutively active promoters would presumably work equally well (though they may have a higher fitness cost), as long as there is expression in germline or germline precursor cells.
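The cross between a heterozygous female and a wild-type male can be worked through explicitly. The sketch below computes the fraction of viable offspring that carry the drive as a function of the germline and embryo cut rates. It assumes a same-site drive, no fitness costs, and independent embryo cutting of each wild-type allele. These assumptions and the function name are ours for illustration and are not taken from the Methods.

```python
def tare_drive_fraction_het_mother(germline_cut, embryo_cut):
    """Drive-carrying fraction of viable offspring from a D/W mother and W/W father."""
    # Maternal gametes: the W allele is disrupted in the germline at rate germline_cut.
    egg_drive = 0.5
    egg_wild_type = 0.5 * (1 - germline_cut)
    egg_disrupted = 0.5 * germline_cut
    # The father always contributes a wild-type allele.
    # Drive-carrying offspring are viable whether or not the paternal W is cut.
    viable_drive = egg_drive
    # W/W offspring die only if both wild-type alleles are cut in the embryo.
    viable_wild_type = egg_wild_type * (1 - embryo_cut ** 2)
    # X/W offspring die if the single paternal wild-type allele is cut.
    viable_disrupted = egg_disrupted * (1 - embryo_cut)
    viable_total = viable_drive + viable_wild_type + viable_disrupted
    return viable_drive / viable_total
```

With complete germline cutting but no embryo cutting, this fraction stays at 0.5, since X/W offspring are rescued by the paternal allele. With complete cutting in both stages, it rises to 1.0, which illustrates why maternal Cas9 activity helps a TARE drive.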
The TARE drive can be “same-site” as in Fig. 2a or a “distant-site” drive in which the drive allele is not located at the same genomic site as the target allele (Figure S2A) (note that Fig. 1a shows genotypes for “same-site” drives). Successful same-site [39] and distant-site (called ClvR) [40] systems have already been engineered with high germline and embryo cut rates and little to no observable fitness costs. Same-site and distant-site systems should have nearly equivalent performance when cut rates are high (Figure S2B and the ClvR study [40]), but the distant-site drive retains higher performance when both the germline and embryo cut rates are low (Figure S3C) since it often has two wild-type alleles available to cleave in this parameter space, rather than one as for the same-site drive. On the other hand, a same-site drive may be easier to engineer since the recoded region is smaller and the natural target gene promoter would drive expression of the rescue allele. The natural promoter and genomic site of the rescue element may also avoid the pitfall of incomplete rescue that is a more significant consideration for distant-site drives.
In our model, TARE systems reach all individuals quickly with a modest release size (Figure S1B), but their rate of increase becomes slowed at high frequencies (Figure S1A), which could be an issue for a population modification strategy where the payload is substantially more effective in homozygotes than heterozygotes. To avoid this, the target of a TARE system could be located on the X chromosome, so that males with only one copy of the disrupted target gene are nonviable (Figure S3D). This would allow the drive allele to fix substantially more quickly than autosomal TARE systems (Figure S3E). However, X-linked TARE drives would not have any cleavage activity in the germline of males and therefore have a slower rate of spread than autosomal systems (Figure S3F), at least until the drive has reached most individuals.
TADE drive
These drives target a haplolethal gene, with the drive allele providing rescue (Fig. 3a). Like TARE, such a drive is expected to show threshold-dependent dynamics (Figs. 1b and 3b). However, nuclease cleavage should occur only in germline gametocytes for a TADE drive, rather than in both germline and early embryo. Otherwise, drive/wild-type heterozygotes will not have two functioning copies of the haplolethal gene in all cells, which will likely result in low fitness or death, depending on the magnitude of expression outside the germline. Similarly, embryo cleavage activity would render some drive-carrying individuals nonviable. Though the nuclease promoter should be germline-restricted, it could still have expression before or after the narrow window for homology-directed repair in early meiosis, allowing TADE drives to be somewhat more flexible for promoters than homing drives. With a suitable promoter, the offspring of both males and females that fail to inherit the drive will perish. This allows the TADE drive to spread more rapidly than the TARE drive and quickly fix (Figure S1A). However, substantial embryo resistance would likely thwart such a drive (Fig. 3c).
As with the TARE drive, a TADE drive can be same-site or distant-site (Figure S3A). Both configurations are expected to have similar performance (Figure S3B), but the distant-site drive may remain viable for higher embryo resistance rates when germline cleavage is low (Figure S3C). This is because a low rate of embryo cleavage can help remove wild-type alleles that were not cleaved in the germline due to low germline cleavage rates. The drive alleles in this situation should still remain viable in most instances, since the other wild-type target allele would often remain undisrupted.
TADE suppression drive
The TADE suppression drive is a form of distant-site TADE in which the drive is located in an essential but haplosufficient female (or male, but not both) fertility (or viability) gene, disrupting the gene with its presence (Fig. 4a). Thus, female drive homozygotes are sterile. If the germline cleavage rate is less than 100%, this drive would not fix but instead impose a genetic load on the population (Fig. 4b), defined as the average net fitness reduction relative to a wild-type population of the same size after the drive reaches its maximum frequency. This includes direct fitness effects of the drive regardless of genotype, drive-induced sterility in certain drive homozygotes, and loss of offspring due to nonviable genotypes formed by the drive. In our stochastic model with density regulation, complete eradication is expected to occur when the genetic load is equal to or greater than 1 − (1/population growth rate at low density), though eradication may occur before this point due to stochastic effects or if Allee effects begin to contribute to suppression [51]. For the germline cut rates observed experimentally in mosquito and Drosophila systems [14,15,16,17,18,19,20,21,22,23], this would likely be sufficient to cause complete population eradication. High genetic loads are also possible even if the target gene shows only partial haploinsufficiency (Figure S4, defined as the fitness cost to individuals with a single functioning copy of the target gene). Note that unlike homing drives and X-shredders, TADE suppression drives are expected to show threshold-dependent dynamics (Figs. 1b and 4c), making them regionally confinable systems. In an effective TADE suppression drive, the parameter space for embryo and germline cut rates is even more restricted than for a TADE drive (Fig. 4d), though still within the range demonstrated in mosquito drives [26].
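The eradication condition stated above can be written as a short calculation. In the sketch below, the function names are hypothetical, and the growth rate of 10 used in the usage note is an illustrative value rather than a parameter taken from this study.

```python
def critical_genetic_load(low_density_growth_rate):
    """Genetic load at or above which complete eradication is expected."""
    if low_density_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    return 1 - 1 / low_density_growth_rate


def eradication_expected(genetic_load, low_density_growth_rate):
    """Deterministic criterion only; stochastic or Allee effects are ignored."""
    return genetic_load >= critical_genetic_load(low_density_growth_rate)
```

For a population that could grow 10-fold per generation at low density, `critical_genetic_load(10)` gives a required load of 0.9. The intuition is that a load L scales the low-density growth rate r to r(1 − L), and the population can no longer replace itself once this product falls to 1 or below.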
Note that if a TARE drive was similarly placed in a female fertility gene, it would likely lack the power to eradicate the population and only be able to induce a modest genetic load (Fig. 1d).
TADDE drive
TADDE drives are simply TADE drives in which the rescue element either has two recoded copies of the haplolethal gene or a sufficiently altered promoter to increase expression of the rescue element, such that a single drive allele is sufficient to provide rescue even if paired with a disrupted allele (Fig. 5a). TADDE drives thus allow for the removal of wild-type alleles immediately after disruption in both males and females, while preventing removal of drive-carrying individuals, which occurs in TADE drive offspring when two drive heterozygotes mate. This allows a TADDE drive to spread more quickly (Fig. 2b) with a lower threshold (Fig. 1b) than similarly efficient TADE or TARE systems, while retaining similar threshold-based dynamics (Fig. 5b). Because drive alleles are not automatically removed when paired with disrupted targets, embryo cleavage can be fully tolerated (as well as somatic expression, like in TARE), even though it would not significantly increase the rate of spread of this drive when germline cleavage is already high (Fig. 5c). Same-site and distant-site TADDE drives are expected to have very similar performance except when both germline and embryo cleavage rates are very low (Figure S5).
TADS drive
These drives target a gene that is transcribed in male gametocytes after meiosis I, and this expression must be critical for successful spermatogenesis such that sperm with a disrupted target allele are nonviable. Thus, only sperm with drive or wild-type alleles can successfully fertilize eggs (Fig. 6a), resulting in rapid spread of the drive (Figure S1A). However, the rate of spread would be somewhat reduced if females can mate with multiple males and sperm could be competing to fertilize eggs. The mechanism by which such a drive spreads is similar to a homing drive, and it therefore has a zero-threshold introduction frequency (Fig. 6b) unless fitness costs are very high, meaning that it would not be expected to remain regionally confined. Somatic expression would likely be fully tolerated for such drives, and they should also allow for a wide variety of promoters varying in both germline and embryo cut rates (Fig. 6c). Nevertheless, finding a suitable target gene could be difficult. Distant-site and same-site configurations of TADS drives should be similar (Figure S6A), although as with the other types of drive, distant-site TADS should perform somewhat better than same-site TADS when both germline and embryo cleavage rates are very low (Figure S6B-C).
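The transmission advantage of a TADS drive in males follows directly from removing sperm with disrupted alleles. The sketch below computes the fraction of viable sperm carrying the drive for a heterozygous male. It assumes a same-site drive and no fitness costs, and it ignores sperm competition between males. The function name is hypothetical.

```python
def tads_drive_fraction_het_male(germline_cut):
    """Drive-carrying fraction of viable sperm from a drive/wild-type male."""
    sperm_drive = 0.5
    # Wild-type alleles that escape germline cutting remain functional.
    sperm_wild_type = 0.5 * (1 - germline_cut)
    # Sperm with a disrupted allele (0.5 * germline_cut) are nonviable
    # and therefore do not appear in the denominator.
    return sperm_drive / (sperm_drive + sperm_wild_type)
```

Under these assumptions the result simplifies to 1 / (2 − c) for germline cut rate c, so transmission rises from Mendelian inheritance (0.5) without cutting to complete drive inheritance (1.0) with complete cutting.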
TADS suppression drive
A distant-site TADS drive can be configured for population suppression by placing it in an essential but haplosufficient male fertility (or viability) gene, disrupting the gene with its presence (Fig. 7a). Thus, male drive homozygotes would be sterile. Note that because the drive works during spermatogenesis, it would be unable to provide any substantial suppression if located in a female (or both-sex) fertility or viability gene. However, in a male fertility gene, it would be expected to cause complete population eradication with a zero-threshold invasion frequency (Fig. 7b) unless fitness costs are very high, similar to homing drives targeting a fertility or viability gene or to X-shredders (Figure S1A). The suppression form of TADS should be less tolerant of low embryo and germline cut rates than modification TADS, but such drives can still achieve success over a wide range of values (Fig. 7c).
TADS Y-linked suppression drive
If a distant-site TADS drive is located on the Y chromosome (with the target on a different chromosome), it will bias inheritance in favor of males (Fig. 8a). This is expected to induce a germline cut rate-dependent genetic load (Fig. 8b) on the population after the drive fixes. According to our deterministic model, this genetic load should be halfway between one and that of a Y-linked X-shredder with a similar X-shredding rate. The overall dynamics of a TADS Y-linked suppression drive should be similar to those of an ideal X-shredder (Figure S1A). Such a drive would have a zero-threshold invasion frequency unless fitness costs are very high and should be highly tolerant of both fitness costs (Fig. 8c) and low germline cut rates (Fig. 8d), though the germline cut rate will still need to induce a sufficient genetic load if complete eradication is desired. A TADS suppression system could also be located on the X chromosome, similarly biasing inheritance in favor of females and thereby eventually inducing population suppression.
Resistance to TA systems
With a modest degree of multiplexing, TA systems should generate substantially fewer resistance alleles than homing-type drives without sacrificing drive performance, since there is no need for homology-directed repair. To study the rates at which r1 resistance alleles (those which preserve the function of the target gene) are expected to form in such systems, we assumed that cleavage repair at a single site had a 10% probability of forming an r1 allele (instead of a disrupted allele), placing it near the upper end of the likely range of this parameter based on experiments [17,18,19] (by careful targeting, a significantly lower rate could probably be achieved [26]). The presence of a single disrupted site was considered to be sufficient to render the target gene disrupted, so to form a complete r1 allele, each gRNA target site needed to get an r1 sequence.
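The effect of multiplexing on complete r1 allele formation under these assumptions can be illustrated with a short calculation. The sketch below assumes that every target site is cut and repaired independently, so it does not account for simultaneous cutting at several sites. The function name is hypothetical, and the default of 10% is the per-site r1 probability assumed above.

```python
def complete_r1_probability(n_grnas, per_site_r1=0.1):
    """Probability that all gRNA target sites acquire a function-preserving sequence."""
    if n_grnas < 1:
        raise ValueError("at least one gRNA is required")
    # A single disrupted site disrupts the gene, so every site must be r1.
    return per_site_r1 ** n_grnas


# Each additional gRNA lowers the probability by a further factor of 10.
r1_by_grna_count = {n: complete_r1_probability(n) for n in range(1, 5)}
```

With one gRNA the probability of a complete r1 allele is 0.1 per repaired allele, and with four gRNAs it falls to about 0.0001 under this independence assumption.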
In drives with this high r1 formation rate, 100% efficiency, and assuming that drive homozygotes had a relative fitness of 95% compared to wild-type homozygotes, a single gRNA was not sufficient to allow for success of TARE, TADE, TADDE, or TADS same-site modification drives (Fig. 9). Though all drives initially increased in frequency rapidly, the (relatively low) fitness cost of the drive coupled with the high rate of r1 formation resulted in elimination of most drive alleles after 100 generations for TARE, TADE, and TADS. TADDE performed somewhat better, since r1 alleles would not be viable in the presence of a disrupted allele for this drive, while drive alleles would remain viable. Nonetheless, the final frequency of r1 alleles was still high for a scenario with only one gRNA.
As the number of gRNAs is increased, the number of r1 alleles that remain decreases drastically (Fig. 9), indicating that even for very large populations, a modest number of gRNAs would likely be sufficient to preclude formation of resistance against the TA drives. Indeed, our calculations may substantially overestimate the number of r1 alleles formed, perhaps by more than 100-fold. This is not only because we assumed a high proportion of repair resulting in r1 sequences, but also because the possibility of simultaneous cutting was not included in our deterministic model. Yet such events should take place quite often, particularly as the number of gRNAs increases, and even one instance of simultaneous gRNA cleavage would likely cause a large enough deletion to prevent formation of an r1 allele [18, 27]. Additionally, homology-directed repair of drive cleavage using disrupted alleles as a template would likely preclude the formation of r1 alleles, and this was not taken into account in our model. Widely spaced gRNAs could reduce the chance of such events taking place, but also increase the chance of successful disruption of the gene, making optimization of gRNA target spacing a potentially important consideration when designing these drives.