Polyploid species are numerous and economically important. Cycles of polyploidization and diploidization have recurred throughout the evolutionary history of eukaryotes [1, 2], such that many eukaryotic species possess more than the two chromosome sets expected in a diploid. There are polyploid animals, fungi and protists [3, 4], but polyploidization has been especially prevalent in flowering plants. The entire angiosperm lineage underwent at least one round of ancient polyploidization early in its evolution . Many family- or lineage-specific polyploidization events have occurred since then (examples reviewed in ), and most of our main food and economic crop species are recent polyploids, including cotton, bread wheat, maize, potatoes, brassicas, bananas, tobacco and coffee [6, 7].
Polyploidization is often associated with genome plasticity. Chromosomal rearrangements, new transposable element activity, DNA mutation, duplicate gene deletion, and changes in gene expression and epigenetic state are commonly observed following polyploid formation [2, 5, 6, 8]. Polyploidy maintains higher levels of heterozygosity in a population, reducing the occurrence of inbreeding depression, although mutant alleles accumulate in polyploid populations more quickly than in diploids . Polyploids can also adapt faster than diploids as long as beneficial mutant alleles are not masked too strongly by wild-type alleles . Each polyploidization event combines genes in novel ways, potentially producing a lineage with new phenotypes that is capable of surviving and adapting to environments outside the range of its parent lineages [5, 9, 10]. Thus polyploidization may be a vital mechanism for adaptation to rapidly changing environments, such as those expected under anthropogenic climate change.
Practical difficulties have limited our current understanding of polyploid evolution, diversification and population dynamics. Nuclear DNA sequence is the most informative data source for phylogenetic inference. Haploid organellar sequence data can be useful, but nuclear regions must be included to obtain multiple unlinked markers. These are necessary where the evolutionary history of a lineage is complicated by incomplete lineage sorting or hybridization, because a single marker has a low probability of recovering the true evolutionary tree [11–14]. Nuclear sequence data are often useful in investigating complex evolutionary histories [15–17]. Since polyploids contain multiple distinct copies of each nuclear gene, known as homeologues, it is usually impossible to amplify a single, homogeneous amplicon by PCR. DNA sequencing by the Sanger method can only be performed on a single pure amplicon. If more than one copy is amplified with a particular PCR primer pair, direct sequencing will give double peaks at sites that differ within and between homeologues, and the phase of many such double peaks cannot be determined. Furthermore, if insertion-deletion polymorphisms (indels) distinguish the copies, then direct sequencing will fail because all sites after an indel will be undecipherable double peaks. The resulting practical difficulties with gene sequencing have been well documented [13, 16, 18].
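The effect of co-amplified copies on a direct sequencing read can be illustrated with a small simulation. The sketch below is illustrative only: the sequences are invented, the function names are hypothetical, and a double peak is simply written as the IUPAC ambiguity code for the two superimposed bases, which ignores peak heights and base-calling error.

```python
# IUPAC ambiguity codes for a mixture of two different bases.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def mixed_trace(copy_a, copy_b):
    """Simulate direct sequencing of two co-amplified gene copies.

    Both templates are read position by position from the same primer.
    Identical bases give a clean call; different bases give a double
    peak, written here as an IUPAC ambiguity code.
    """
    calls = []
    for a, b in zip(copy_a, copy_b):
        calls.append(a if a == b else IUPAC[frozenset((a, b))])
    return "".join(calls)


def count_double_peaks(trace):
    """Count positions that are not a clean A, C, G or T call."""
    return sum(base not in "ACGT" for base in trace)


# Invented example sequences (assumption: not real Poa data).
copy_a = "ACGTTGCAAGCTTACGGATCCATG"
snp_copy = copy_a[:10] + "T" + copy_a[11:]   # one substitution at index 10
indel_copy = copy_a[:5] + copy_a[6:]         # 1-bp deletion at index 5

snp_trace = mixed_trace(copy_a, snp_copy)
indel_trace = mixed_trace(copy_a, indel_copy)

print(snp_trace, count_double_peaks(snp_trace))
print(indel_trace, count_double_peaks(indel_trace))
```

A single substitution produces one ambiguous site, whereas a 1-bp deletion shifts one template relative to the other, so that most downstream positions become double peaks and the read is no longer interpretable.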
Several approaches have been used to overcome this problem, but all require more time and money than equivalent work in diploid or haploid taxa. Many researchers have used bacterial cloning to separate gene copies [17, 19, 20]. This is time-consuming and expensive, meaning that few individuals and few genes are investigated. A single species is often represented by a single individual [17, 20, 21], which may be adequate for investigating the hybrid origins of species , but not for investigating intraspecific variation. Comparisons based on single individuals may also lead to the wrong conclusion about the evolutionary history of related taxa, due to undetected incomplete lineage sorting and/or reticulate evolution . Other researchers have designed primers specific to each known homeologue of a multi-copy nuclear gene [16, 22], but this method requires extensive preliminary experimentation in poorly characterized species [13, 22, 23]. It may also be unsuitable for autopolyploids, or for allopolyploid genes for which good distinguishing primers cannot be designed. In some studies, DNA is extracted from a mixture of individuals from a single population , but this might not accurately estimate the frequency of alleles in a population, as preferential PCR amplification may occur.
Here we describe a new method that exploits the capacity of next-generation sequencing technologies to sequence mixtures of DNA rather than pure PCR products. We sequence multiple gene copies, multiple genes, and multiple individuals in a single run, using barcoded samples. We demonstrate the utility of 454 sequencing for phylogenetic inference in a group of polyploid grasses that appear to have undergone a recent, rapid radiation. The method could be applied to any experimental design that aims to sequence amplicon mixtures from many independent samples.
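The principle of pooling barcoded samples in one run can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: the barcode sequences, sample names and reads are invented, the function name is hypothetical, and barcodes are assumed to sit at the 5' end of each read and to match exactly, with no tolerance for sequencing errors.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign pooled reads to samples by their leading barcode.

    reads    -- iterable of read sequences (strings)
    barcodes -- dict mapping barcode sequence to sample name
    Returns a dict mapping sample name to a list of reads with the
    barcode trimmed off; reads with no matching barcode are kept
    under the key None.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            by_sample[None].append(read)
    return dict(by_sample)


# Invented barcodes, sample names and reads (assumptions for illustration).
barcodes = {"ACGAGT": "plant_1", "TCTACG": "plant_2"}
reads = [
    "ACGAGTGGCATTC",
    "TCTACGGGCATAC",
    "ACGAGTGGCTTTC",
    "GGGGGGGGCATTC",
]

sorted_reads = demultiplex(reads, barcodes)
for sample, sample_reads in sorted_reads.items():
    print(sample, sample_reads)
```

After sorting, the two distinct reads assigned to `plant_1` would be candidates for different gene copies within that individual, which is the information that direct Sanger sequencing of the same mixture cannot resolve.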
Tussock grasses in the genus Poa (Poaceae) dominate the alpine communities of the Australian Alps. These grasses were initially lumped as Poa australis R. Br. or P. caespitosa Forst. f., despite variation in morphology and habitat associations . More recently, approximately 50 Australian Poa taxa have been described , about 12 of which occur in the alpine and/or subalpine region [25, 26]. The genus Poa has long been regarded as taxonomically difficult [21, 27, 28] and even viewed as a massive polyploid complex . The taxonomy of the Australian alpine species is also considered to be imperfect [25, 26].
Initial results from a pilot study using microsatellite markers, chloroplast non-coding DNA and nuclear regions amplified in 25 herbarium specimens indicated that these Poa species are tetraploid. This was consistent with evidence from a previous study by Patterson et al.  that found two copies of each of the two nuclear genes analyzed in a range of Poa species, and with the very common occurrence of polyploidy in the entire grass family .