A number of very serious human disorders are characterized by the appearance of protein aggregates, including neurodegenerative diseases such as Huntington’s disease (HD), metabolic disorders, and certain developmental disorders [1, 2]. Hereditary forms of these diseases are caused by gain-of-function mutations that result in expression of destabilized, aggregation-prone proteins [3, 4], accompanied by the disruption of protein-folding homeostasis (proteostasis) and cellular dysfunction [5–11]. Although these mutations themselves are causative, many disease characteristics, such as age of onset, penetrance, and severity of specific symptoms, can vary widely between individuals. For example, in HD, the age of neurological onset is strongly associated with the length of the polyglutamine (polyQ) expansion in the huntingtin protein. Yet age of onset can vary by several decades among people carrying polyQ expansions of the same length, and a large proportion of this residual variation is genetic in nature and may be due to polymorphisms in other genes [12–15]. This variability is also seen in animal models of disease; for example, the different genetic backgrounds of common laboratory mouse strains differentially modify somatic CAG repeat expansion and the onset of nuclear accumulation of mutant huntingtin in a knock-in HD mouse model [16–18]. It is generally well appreciated that genetic variation influences the phenotypic expression of mutations and transgenes, as well as the effects of environmental perturbations [19–23]. However, the nature of the modifying alleles segregating in populations remains elusive for protein conformational diseases [15, 24].
Much of the current knowledge about modifiers of conformational diseases has been gathered from linkage/association studies, candidate approaches, and unbiased genetic screens. The latter two approaches often use induced mutations, including knockouts/knockdowns and overexpression, in fixed genetic backgrounds. Although genetic variants identified in this way are enormously informative about pathways that affect disease processes and that can, when perturbed, modify disease phenotypes [25–30], they are likely to differ from naturally occurring variants. First, these approaches are designed to identify individual modifier genes with strong phenotypic effects. By contrast, naturally occurring genetic variation, in addition to singleton genes with strong effects, is likely to be represented by complex networks of interactions among multiple common variants with small effects, and/or rare or even private alleles with larger contributions [31–37], as well as cryptic genetic variation [38–40]. This makes it very challenging to associate phenotypic differences with their underlying genotypes. Protein aggregation diseases present an additional unique challenge, owing to the sheer number of different processes and pathways that control proteostasis [30, 41–43]. Modulation of proteostasis has been shown to buffer the phenotypic expression of certain mutations and coding polymorphisms. For example, expression of aggregation-prone polyQ and SOD1 proteins [5, 6], or down-modulation of the activity of molecular chaperones, such as heat shock protein 90 (HSP90), in plants and animals, unmasks buffered phenotypes caused by mildly destabilizing mutations segregating in populations [44–48], a phenomenon similar to heat-induced phenocopies [49, 50]. At the same time, the presence of such mildly destabilizing variants in the genetic background modulates the aggregation and toxicity of aggregation-prone proteins, including polyQ [5, 6].
An additional consideration in protein conformation diseases is the possibility that certain induced mutations, especially knockouts/knockdowns, imbalance proteostasis directly, for example by targeting a member of a multimeric complex and thereby inducing folding stress in the cell. Because an imbalance in proteostasis can indirectly affect the aggregation phenotype of the disease-causing mutation, such a genetic intervention may appear to be a true modifier when it is not.
Second, because natural variants have been acted upon by selection, they are largely compatible with normal organismal development and function, in contrast to many of the modifiers of the induced-mutation type, which are not phenotypically innocuous. Moreover, genetic manipulations that protect cells and organisms from the toxic effects of protein aggregation can themselves be deleterious to the organism; many lifespan-extending mutations protect against proteotoxicity, but are incompatible with normal development or are maladaptive under competitive conditions [51–53]. Similarly, overexpression of molecular chaperones or heat shock factor, which consistently protects against proteotoxicity, can be accompanied by slow growth and development, decreased fecundity, and aberrant signaling, and may even support transformed phenotypes [54–57]. Thus, identifying proteostasis modifiers from among natural variants shaped by selection may pinpoint genes and networks that are naturally plastic and thus can be pharmacologically modulated without negative effects on the organism.
In contrast to induced-mutation approaches, association studies do examine natural variants. However, the complexity of the genetic variation involved in the control of proteostasis, as discussed above, makes these studies very difficult. Indeed, only a small number of genes have been firmly identified as modifiers of aggregation diseases by either candidate or association approaches [15, 24]. Furthermore, these modifiers are often not replicated in different populations, and replicated modifiers account for only a small fraction of the heritable variation [15, 58–60]. The genetic tractability of model animals, together with the ability to control the contribution of the environment, offers a unique opportunity to examine how natural variation in the genetic background of individuals affects their ability to resist protein aggregation; what the nature of the genetic variants that modify proteostasis is; whether these variants are distinct from the spectrum of induced mutations; and whether natural variation can predict or modify susceptibility to protein conformation diseases.
Recombinant inbred lines (RILs) have been used successfully, in organisms from yeast to mice, to answer similar questions about non-transgenic physiological traits, including complex traits [18, 61–66]. Here, we establish polyglutamine-transgenic Caenorhabditis elegans as one such model, taking advantage of the availability of genetically diverse wild isolates and the ability to detect and measure protein aggregation in a live animal throughout its lifespan. In addition to the short generation time and lifespan of C. elegans, and the wealth of knowledge and reagents available for the pathways known to modulate aging and proteostasis, the facultative sexual reproduction mode of this organism allows for multifactorial perturbation of genetic variants through outcrossing, combined with the capture of true-breeding genotypes by self-fertilization. This provides a rich resource for future mapping of the natural modifier alleles of protein aggregation and toxicity.