The authors of a recent article in BMC Biology adopted exactly this approach. Li and colleagues [4] report mitochondrial and Y chromosome data for 23 inhumations at the Xiaohe cemetery in the Tarim Basin of western China. Here, ancient DNA techniques were employed to characterize the genetic diversity of samples fixed in both time and space. The Xiaohe community existed around 4,000 years ago on the ancient east-west network of trade routes known collectively as the Silk Road. Although the region straddles the border of East Asia, 'Caucasoid' mummies, artwork influenced by ancient western civilizations and records written in Tocharian, an extinct branch of the Indo-European language family, amply illustrate its close connections with the west. Li and colleagues add to this extensive evidence of east-west links by reporting West Eurasian mtDNA and Y chromosome lineages among Xiaohe's dead.
However, mtDNA and Y chromosome records are necessarily biased towards the sex-specific movements of women and men, respectively. Further, demographic reconstructions made from mtDNA and the Y chromosome have considerable uncertainty due to the substantially higher rates of stochastic genetic drift in these uniparentally inherited systems compared to autosomal loci. Ultimately, mtDNA and the Y chromosome are just two small loci in a vast ocean of human genetic variation. In contrast, the autosomes consist of millions of independently evolving regions, any number of which can be surveyed to characterize gene flow in humans. Today, the rapid advent of new genotyping technologies is allowing massive numbers of markers to be screened across the human genome. Xu and colleagues [5] adopted this approach in a recent article in BMC Genetics. They report approximately 50,000 single nucleotide polymorphisms (SNPs, or point mutations) drawn from the nuclear genome. These markers were genotyped in individuals from two indigenous communities in Thailand that show linguistic and anthropological evidence of prehistoric connections. Using a suite of clustering methods, Xu and colleagues demonstrate that the Mlabri and Htin share more nuclear variants with each other than either does with surrounding populations.
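The earlier point about drift in uniparentally inherited systems can be made concrete with a minimal sketch. It uses the classical textbook expressions for effective population size by inheritance mode, which assume random mating and Poisson-distributed offspring numbers. The function name and the census numbers (500 breeding males and 500 breeding females) are hypothetical and chosen only for illustration; they are not taken from any of the studies discussed here.

```python
def effective_sizes(n_males: float, n_females: float) -> dict:
    """Classical effective population sizes (diploid-equivalent units)
    for loci with different modes of inheritance."""
    nm, nf = n_males, n_females
    return {
        "autosome": 4 * nm * nf / (nm + nf),   # inherited through both sexes
        "X": 9 * nm * nf / (4 * nm + 2 * nf),  # two copies in females, one in males
        "Y": nm / 2,                           # haploid, passed through males only
        "mtDNA": nf / 2,                       # haploid, passed through females only
    }


# Hypothetical breeding population with an equal sex ratio.
sizes = effective_sizes(n_males=500, n_females=500)

# Express each effective size relative to the autosomal value.
relative = {locus: size / sizes["autosome"] for locus, size in sizes.items()}

for locus, ratio in relative.items():
    print(f"{locus:>8}: {ratio:.2f} x autosomal effective size")
```

Under these assumptions, mtDNA and the Y chromosome each have one quarter of the autosomal effective size, and the X chromosome three quarters. Because the strength of drift scales inversely with effective size, the two uniparental loci drift roughly four times faster than an autosomal locus in the same population. Unequal sex ratios or sex-biased variance in reproductive success would change these ratios.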
While determining whether two populations share genetic variation is a relatively simple exercise, identifying and quantifying the amount of gene flow between them requires more advanced modeling and inferential statistics. Such inference is usually carried out within the framework of coalescent theory. Large quantities of genetic information are required to infer rates of gene flow, and data sets of this size have only recently become feasible. In a 2008 article in BMC Genetics, we adopted exactly this sort of strategy to quantify rates of gene flow on a global scale [6]. Rather than screening many pre-ascertained point mutations, as Xu and colleagues did, we fully sequenced 20 large genomic regions distributed across the human X chromosome in 90 individuals from six globally distributed populations (Mandenka, Biaka and San in Africa, and French Basques, Han Chinese and New Guinea Highlanders outside Africa). This research was unique because these 20 genomic regions were chosen specifically to be recombinationally unlinked (that is, independent) and selectively neutral (that is, located far away from genes), in marked contrast to most studies where genetic variants are tightly linked to functional sites that are potentially affected by natural selection (for example, mtDNA and the Y chromosome) or even occur within selected loci (for example, SNP chip data where many polymorphisms are located in, or close to, genes). We found that worldwide rates of gene flow (m) were approximately five-fold higher among non-African populations relative to African groups [6]. Interestingly though, effective population sizes (N) and migration rates (m) are inversely proportional across African and non-African groups: although migration rates are approximately five-fold lower in Africa, effective population sizes are approximately five-fold higher. Consequently, population migration rates (Nm) are globally very similar (Nm = 2.4).
In other words, we found that approximately two migrants per generation have moved between these globally distributed populations on average through time. Some of these movements may have been recent, despite the large geographical distances between populations, but most would have occurred when these populations were geographically much closer (for example, during the initial expansion of anatomically modern humans out of Africa). Developing statistical methods to determine exactly when this gene flow occurred remains an important outstanding task.
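The arithmetic behind the similar population migration rates can be sketched as follows. Only the five-fold contrasts and the value Nm = 2.4 come from the text above; the absolute effective population size used here (2,000) is a hypothetical placeholder, as are the function and variable names. The published estimates of N and m are not reproduced here.

```python
import math


def population_migration_rate(n_e: float, m: float) -> float:
    """Population migration rate Nm: the product of effective population
    size (N) and the per-generation migration rate (m)."""
    return n_e * m


NM_REPORTED = 2.4  # from the text: Nm is approximately 2.4 worldwide
FOLD = 5.0         # from the text: approximately five-fold contrasts

# Hypothetical non-African effective size; m is then fixed by Nm = 2.4.
n_non_african = 2_000.0
m_non_african = NM_REPORTED / n_non_african

# African groups: effective size five-fold higher, migration rate five-fold lower.
n_african = FOLD * n_non_african
m_african = m_non_african / FOLD

nm_non_african = population_migration_rate(n_non_african, m_non_african)
nm_african = population_migration_rate(n_african, m_african)

print(f"non-African: N = {n_non_african:.0f}, m = {m_non_african:.2e}, Nm = {nm_non_african:.2f}")
print(f"African:     N = {n_african:.0f}, m = {m_african:.2e}, Nm = {nm_african:.2f}")

# The five-fold differences cancel, so both products agree.
print("Nm equal across groups:", math.isclose(nm_non_african, nm_african))
```

Whatever placeholder value is chosen for N, the two five-fold differences cancel in the product. A small population exchanging a large fraction of its members and a large population exchanging a small fraction therefore receive the same absolute number of migrants per generation.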
Today, sequencing and comparing entire human genomes is the surveying method of choice. The complete genomes of many individuals have now been sequenced, including such famous names as Archbishop Desmond Tutu [7] and Nobel Laureate James Watson [8]. Genome sequences yield orders of magnitude more information than either the SNPs screened by Xu and colleagues or the comparatively small genomic regions sequenced in our 2008 research program.