We have provided here a detailed characterization of the zebra finch MHC. There is clear cytogenetic evidence that MHC genes map to at least two different chromosome pairs in the zebra finch. If the chicken MHC represents the ancestral state, the situation in the zebra finch may have arisen through fission of chromosome 16 or a translocation of part of it to another pair of microchromosomes. The hypothesis of chromosomal fission is consistent with the finding that the MHC BACs did not map to zebra finch chromosomes 9 to 15 or 17 to 28, and that the microchromosomes recognized by these probes were small.
The finding of MHC genes on two chromosomes in the zebra finch is particularly intriguing because TAP genes map to one of them, whereas an expressed Class I gene (and a number of other MHC-associated genes) maps to a distinct chromosome (Figure 1). This finding is unexpected because TAP and Class I genes functionally interact and are syntenic in most MHCs studied to date including both chicken and humans [reviewed in , but see [55, 56]]. In chicken this tight linkage is thought to result in coevolution between TAP and Class I genes and strong correlations between MHC haplotype and disease resistance [reviewed in ]. TAP genes in mammals, while generally syntenic, are not as closely linked to Class I as they are in Galliform birds. The separation of TAP and Class I in mammals has been hypothesized to have resulted in their evolutionary independence and in turn led to high levels of duplication and divergence in Class I genes . This dissociation is perhaps most clearly illustrated by the tammar wallaby Macropus eugenii in which Class I sequences have been found dispersed across seven chromosomes . The separation of TAP and Class I genes in the zebra finch may therefore represent convergent dissociation of these genes.
An alternative explanation for the separation of Class I and TAP genes in the zebra finch is that the regions sequenced here could represent duplication blocks. The sequenced Class I locus could even be related to the MHC-Y region of chicken. Phylogenetic analyses of zebra finch Class I and chicken Class I (MHC-B and MHC-Y), however, suggest that our sequenced Class I gene is not the ortholog of a chicken MHC-Y gene, as chicken (including MHC-Y) and zebra finch sequences are reciprocally monophyletic (Additional file 6). It is also possible that a second Class I gene resides on the same chromosome as TAP1 and TAP2 and, therefore, that Class I and TAP are actually syntenic. In fact, a sequenced BAC was positive for both MHC Class I and TNXB; another four clones were positive for TNXB and TAP2, suggesting a possible linkage between this MHC Class I sequence and TAP2. Based on a divergent sequence and a lack of expression, we suggest that this Class I sequence is a pseudogene. Even if it were not a pseudogene, TAP and this Class I gene would be much more distantly located in zebra finch than they are in chicken and would be free of the linkage seen in the chicken. The whole genome assembly, digital expression profiling  and EST data suggest only one full-length, expressed, Class I gene. It is also possible that there is a second set of TAP genes that we have not sequenced. Given the extremely low coverage of TAP genes in the genome trace archives (for example, only one read covering TAP2), it is unlikely that TAP genes have been duplicated. FISH mapping of five pairs of putative TAP2 and MHC Class I clones further supports the lack of synteny among TAP and Class I genes (Table 4). Together these findings suggest that Class I and TAP are not linked in the zebra finch. In addition to Class I loci identified in the BACs, we identified three distinct Class I sequences that appear to be pseudogenes.
One of the putative pseudogenes contains only exon 2, one contains only exons 4 to 6, and the third contains exons 1 to 3. Because the probes used in RFLP analyses target exon 3, only one of these pseudogenes would be reflected in the RFLP banding patterns. While the zebra finch appears to possess only one expressed Class I locus, the great reed warbler Acrocephalus arundinaceus, another passerine species, expresses multiple Class I loci . An intriguing possibility is that the dissociation of TAP and Class I in ancestral passerines preceded the radiation of Class I genes in some passerine groups  as has been suggested for the wallaby [55, 56].
Class IIB genes in the zebra finch are highly duplicated, as evidenced by the genome assembly, BAC sequencing and the RFLP analysis. We identified 10 distinct Class IIB sequences in the genome assembly (Table 1), some of which appear to be pseudogenes. These findings corroborate previous surveys of Class IIB variation in other passerine birds [40, 58, 59]. Another feature of zebra finch Class IIB regions is their high LTR content, mostly in the form of ERV elements (Figure 7). The finding of multiple zinc-finger genes and retroelements in proximity to Class II genes was also presaged by multikilobase MHC sequences from red-winged blackbirds, which showed a similar pattern [43, 48]. Given the large number of Class IIB duplicates and pseudogenes, we speculate that duplication may have been related to the presence of retroviral sequences. Thus, the passerine MHC Class IIB may have been invaded by endogenous retroviruses much like the primate Class I . Endogenous retroviruses have also been implicated in the duplication of wallaby Class I genes and their spread across multiple chromosomes .
Given the FISH mapping results and the whole genome assembly, MHC genes appear to be located on more than two chromosomes. The genome assembly suggests that homologs of chicken MHC genes have been dispersed in the genome. There are at least three possible explanations for this: 1) There have been chromosome rearrangements for these genes between the chicken and zebra finch; 2) The contigs containing these genes have been misplaced in either the chicken or the zebra finch genome assembly; 3) The zebra finch gene identified is not the true ortholog of the chicken gene. Chicken MHC genes placed on different chromosomes in the zebra finch assembly compared to the chicken include MHC Class I (Chr22_random), CD1 and CD2 (Chr12), and NKR, Blec1 and TRIM27 (ChrZ) (Table 1). The MHC Class I gene placed on chromosome 22 and its surrounding region in the assembly is essentially identical to that in our sequenced BAC. This sequenced BAC did not cohybridize with two known chromosome 22 BACs (Figure 4B; Additional file 5), so the placement of this Class I region on chromosome 22 appears to be an assembly artifact. Rather, the FISH mapping results suggest that these genes are in fact on chromosome 16, as they are in chicken. The genome assembly data underlying the placement of CD1 genes on chromosome 12 are also somewhat uncertain, with no BAC-end sequences linking contigs containing these genes to chromosome 12. Further work will be needed to test whether the genome assembly has properly placed these genes. Contigs containing Blec1, NKR and TRIM27, however, are linked by BAC-end sequence pairs to the Z chromosome, making it likely that these are appropriately placed in the assembly.
A number of core MHC-associated genes, including DMA, BG, C4, TNXB, TAP2 and TAPBP, are conspicuous by their absence in the zebra finch genome assembly (Table 1). There is no reason, however, to believe that these are truly absent in the zebra finch, as they are present in a wide range of other vertebrates and are crucial for MHC function. More likely, these genes cannot be identified due to the incomplete assembly of zebra finch chromosome 16. TAP2, TAPBP and TNXB-like sequences, for example, were found in the BAC sequences but are not represented in the genome assembly. Many of the zebra finch MHC-related genes identified in the genome scan map to linkage groups that the assembly places on chromosome "unknown". This again appears to be a result of the incomplete assembly of chromosome 16. The difficulty of assembling chromosome 16 is likely due in part to the highly duplicated MHC region in combination with the high repeat content in these regions.
BAC sequencing revealed two genes, FLOT and DAXX, that are MHC-linked in non-avian vertebrates [10, 11] but have not been described in chicken. The relatively close linkage of FLOT, TUBB and DAXX to MHC Class I and II genes in the zebra finch is actually more similar to the organisation in some teleost MHCs [for example, ] than it is to either the Xenopus or the human MHC, where DAXX is physically distant from the FLOT and TUBB genes. Chicken chromosome 16, like that of the zebra finch, is not well assembled at this point, so it is possible that these genes will be found as the chicken assembly continues to improve.
Phylogenetic analyses highlight the clustering of Class IIB loci by species rather than by orthology relationships, suggesting a history of concerted evolution, at least on portions of the genes [38, 60, 61]. We did, however, identify a unique Class IIB lineage that falls at the base of all other passerine Class II sequences. This appears to be a novel locus that has not previously been sequenced in birds, and it is unknown whether it is expressed and/or polymorphic. Further analysis will be needed to clarify the role of this locus, but its discovery underscores the utility of genomic approaches (rather than PCR amplification using degenerate primers) for characterizing MHC genes in birds. Tests of selection using zebra finch and other passerine MHC sequences support a strong role of selection in shaping patterns of polymorphism in the peptide binding region of Class I and Class II genes in passerines. The specific sites under positive selection are similar to those previously identified for other bird groups [53, 54] and they closely match the peptide binding regions in humans [51, 52]. High variability among individuals in RFLP banding patterns supports the prediction that MHC Class IIB genes are influenced by balancing selection.
Among birds, there is tremendous variation among lineages in the number of MHC genes. In quail , red-winged blackbird [42, 48] and the zebra finch, there are multiple Class II genes. Most non-passerine species, in contrast, appear to have only between one and three loci [60, 62, 63]. Given the derived phylogenetic position of passerines , these patterns imply that in terms of Class II genes, a minimal MHC may be ancestral for birds [60, 62]. Because of the extensive variation among avian lineages in the number of Class I genes [for example, [34, 44, 65]], it remains unclear what the ancestral condition for Class I genes might be.