TonB-dependent transporters and their occurrence in cyanobacteria

Background Different iron transport systems have evolved in Gram-negative bacteria. Most of them depend on outer membrane localized TonB-dependent transporters (TBDTs), a periplasm-facing TonB protein and a plasma membrane localized machinery (ExbBD). So far, iron chelators (siderophores), oligosaccharides and polypeptides have been identified as substrates of TBDTs. For iron transport, three uptake systems are defined: the lactoferrin/transferrin binding proteins, the porphyrin-dependent transporters and the siderophore-dependent transporters. However, for cyanobacteria almost nothing is known about possible TonB-dependent uptake systems for iron or other substrates. Results We have screened all publicly available eubacterial genomes for sequences representing (putative) TBDTs. Based on sequence similarity, we identified 195 clusters, where the members of one cluster may recognize similar substrates. For Anabaena sp. PCC 7120 we identified 22 genes as putative TBDTs covering almost all known TBDT subclasses. This is a high number of TBDTs compared to other cyanobacteria. The expression of the 22 putative TBDTs individually depends on the presence of iron, copper or nitrogen. Conclusion Using TBDTs as an example, we demonstrate the power of CLANS-based classification and its potential for future application in systems biology. In addition, the tentative substrate assignment based on characterized proteins will stimulate research on TBDTs in different species. For cyanobacteria, the atypical dependence of TBDT gene expression on different nutrients points to a yet unknown regulatory mechanism. Furthermore, by identifying the corresponding sequences, we were able to resolve the hypothesized absence of TonB in cyanobacteria.


Background
Filamentous cyanobacteria contain molecular machines for oxygenic photosynthesis under all growth conditions [1]. These machines, as well as those involved in respiration and nitrogen metabolism, depend on non-proteinaceous cofactors such as iron [2,3]. The level of iron found in cyanobacteria is generally one order of magnitude higher than in non-photosynthetic bacteria [4] and represents about 0.1% of their biomass [5]. Even though iron and copper are required for the function of respiratory and photosynthetic complexes, their intracellular level has to be tightly controlled as these ions pose a risk of oxidation [3]. Therefore, the uptake of iron is highly regulated in order to avoid intoxication. On the other hand, it is hypothesized that iron limitation might have been one of the selective forces in the evolution of cyanobacteria [6], and one might speculate that those cyanobacteria with the most efficient iron uptake systems might have had an evolutionary advantage. To enhance iron uptake, eubacteria secrete low-molecular-weight iron chelators (siderophores) under iron-limiting conditions to complex environmental iron [7]. The siderophore-iron complexes are bound by receptor proteins (TonB-dependent transporters, TBDTs) in the outer membrane, which are composed of a transmembrane β-barrel domain, a so-called plug domain and a periplasmically exposed TonB box. The siderophore-iron complex is subsequently transferred to the cytoplasm by transport proteins in the cytoplasmic membrane [8,9]. This process is dependent on TonB, which provides the energy required for the translocation of siderophore-iron complexes across the outer membrane [10]. In order to facilitate this translocation, the periplasmic domain of TonB interacts with the TonB box of the loaded TBDT. It is proposed that TonB exerts a pulling force on the TonB box and thereby partially unfolds the plug domain, enabling the translocation of the siderophore into the periplasmic space [11].
Several TBDTs have been identified. Besides the ones for iron transport [12,13], TBDTs for nickel [14], disaccharides (for sucrose SuxA; [15], for maltose MalA; [16]), oligosaccharides (CsuF; [17]), polysaccharides (SusC; [18]) or large degradation products of proteins (RagA; [19]) are described. The most intensively studied function of TBDTs is iron uptake in Gram-negative bacteria. Three large classes are defined, namely transferrin-/lactoferrin-binding proteins, porphyrin transporters and siderophore transporters [20]. In addition to the transport of iron across the outer membrane by TBDTs, an additional ferric iron uptake system is postulated, but the corresponding outer membrane receptor has not yet been identified [21]. The TBDTs TbpA (transferrin-binding protein A) and LbpA (lactoferrin-binding protein A) facilitate the uptake of iron from transferrin and lactoferrin, respectively; the uptake is also assisted by the lipoproteins TbpB and LbpB, which face the extracellular side [22]. The porphyrin-transporting TBDTs include HasR, HgbA, HmbR (heme; [12,22]) and BtuB, which transports the cobalt-complexing vitamin B12 (cobalamin [23]). Heme uptake is especially important in bacterial pathogens, where various heme-containing compounds are utilized [13]. The siderophore TBDTs are further sub-classified according to their substrate, that is, the chemical nature of the siderophore they bind. Siderophores belong inter alia to hydroxamates, catecholates, phenolates, citrates or combinations thereof [9]. For example, the siderophore transporters FepA, ViuA and IroN recognize catecholates, FhuA, FoxA and FhuE hydroxamates and FecA citrate.
The iron uptake system in cyanobacteria is not well understood. For the non-filamentous cyanobacterium Synechocystis sp. PCC 6803 the TBDTs encoded by sll1206, sll1406, sll1409 and slr1490 were partially characterized [24,25]. For filamentous cyanobacteria such as Anabaena sp. PCC 7120 (also termed Nostoc sp. PCC 7120) only siderophore secretion [26][27][28] and the influence of enhanced or reduced iron levels on growth [29][30][31][32] were investigated. Anabaena sp. PCC 7120 secretes the hydroxamate-type siderophore schizokinen, allegedly the only siderophore secreted [26,27]. Only recently, a TBDT encoded by schT (alr0397) involved in the uptake of schizokinen was identified. The expression of schT was mildly increased under a shortage of Fe3+. A schT knock-out mutant showed a moderate phenotype of iron starvation, and the characterization of its siderophore-dependent iron uptake demonstrated the function of SchT as a TonB-dependent schizokinen transporter [33].
To learn more about iron transport systems in general, and in cyanobacteria in particular, we searched for genes coding for TBDTs, starting from previously experimentally characterized TBDTs. Subsequently, we assigned putative substrates to so far uncharacterized TBDTs according to their sequence similarity to already known TBDTs. We observed a substantial difference in the number of TBDT genes among the analysed cyanobacteria. The expression pattern of the TBDT genes in Anabaena sp. PCC 7120 was analysed with respect to iron, copper and nitrogen availability.

Classification of TonB-dependent transporters
Ninety-eight TBDTs and the (putative) substrates (for example, metallophores or sugars) were extracted from the published literature (see Additional file 1) [14][15][16][17][18][19]22,. In order to classify the TBDTs with unknown substrates, we first searched for putative TBDTs in 686 sequenced genomes. We identified 4600 putative TBDTs in 347 species (see Additional file 2). Compared to previously published bioinformatic analyses [15,125], we identified fewer sequences in the species which had been analysed in the past, owing to a more stringent cutoff (not shown). More specifically, within the species analysed by Koebnik, we selected seven sequences not previously identified, but did not consider 103 sequences [125]. A similar ratio was found for the species analysed in [15], where 3020 sequences had been selected; here the discrepancy was about 5%.
We subsequently performed a cluster analysis of the identified sequences of putative TBDTs (see Methods), leading to 195 clusters with at least two sequences. Figure 1A shows the consensus tree used to highlight 'regions' on the two-dimensional sequence landscape. A region is marked by roman numerals if the substrate of at least one TBDT in this region is experimentally verified (expTBDTs) or predicted (pTBDTs), and by upper case letters if no substrate is known for any TBDT in the region (Figure 1). Figure 1B shows the expTBDT regions I-VII, XI, XII and XIII and the pTBDT regions VIII, IX and X together with the uncharacterized regions A-N. Figure 2 shows an enlarged version of the dashed rectangle in Figure 1B. The colours describe the substrate that binds to the corresponding TBDTs. Figure 2 (bottom) shows a magnification of the expTBDT regions, where the numbers refer to sequences with a known substrate (Additional file 1, [14][15][16][17][18][19]22,). In the following, we characterize the regions according to the 98 TBDTs that have been experimentally verified or predicted (Additional file 1, [14][15][16][17][18][19]22,).

Region IV
The largest cluster in region IV (No. 82) contains the sequences of the ferric rhizoferrin (carboxylate) transporter RumA and the diferric dicitrate transporter FecA (sequences 44 and 45).

Region VI
This region represents transporters for phenolates, catecholates or hexylsulfate and contains several clusters. A hexylsulfate transporting TBDT (sequence 77) can be found in cluster 45, a vibriobactin (catecholate) transporter (sequence 74) in cluster 140 and proteins transporting yersiniabactin (phenolate; sequences 72, 73) in cluster 79. As already observed in region V, we also detected two sequences (75,76) in cluster 118 that are putative thiamin transporters.

Region VII (cluster 67)
Cluster 67 contains SuxA (sequence 78), an experimentally verified sucrose transporter. Note that sequence 79 has been predicted to transport sucrose [15]. The prediction was based on the co-localization of the corresponding gene with the transcriptional regulator ScrR. Thus, our bioinformatic analysis provides additional evidence for this functional assignment.

Region VIII (cluster 52)
This region contains predicted nickel and cobalt TBDTs with unknown metallophore specificity and no representative of the expTBDTs.

Region IX
This consists of eight sequences in one cluster (No. 32), where two of the eight are putative thiamin transporters. However, proteins assigned as thiamin transporters were also found in regions V (sequences 67, 68, cluster 7) and VI (sequences 75 and 76, cluster 118). Their genes are colocalized on the genome with a cytoplasmic membrane transporter for thiamin (PnuT, [128]), however, the functional assignment remains to be proven.

Region X
This contains a TBDT predicted to transport cobalt-complexing vitamin B12 (sequence 43, cluster 166). However, it is far away from the BtuB cluster in region II (Figure 2). Hence, the assigned function should be experimentally confirmed.

Region XI
The region is clearly separated from the rest and contains cluster 26. The experimentally characterized TBDTs include oligosaccharide transporters (CsuF, sequence 88), polysaccharide transporters (SusC, sequence 87) and transporters for degradation products of proteins (RagA, sequences 85 and 86). While many taxa are represented by sequences in regions I-X, region XI consists almost exclusively of bacteroidetes with the exception of one δ-proteobacterial sequence (gi|108757959, Myxococcus xanthus). Thus, sequences in this region may indicate a special adaptation of these organisms, which may be due to their lifestyle. Bacteroidetes are involved in food digestion in the intestinal tract of mammals. Hence, a specific TBDT class for the uptake of substrates provided by the host seems plausible.

Region XII
This also appears as an outlier ( Figure 2 72. It appears that this region is composed of di- and oligosaccharide transporters. In line with this notion, the δ-proteobacterial TBDTs are from Myxobacteria (Myxococcus xanthus, Sorangium cellulosum), which are found on decaying plant material consuming their saccharides. Most of the sequences in this region stem from and -proteobacteria (18.4%, 76%) and a few bacteroidetes, and -proteobacteria taxa.

Region XIII
Positioned between region XI and the crowded area on the right side, this region is defined by a fibronectin-binding TBDT (sequence 98, cluster 41). As in most of the sequences in region XI, the sequences of this region consist almost exclusively of bacteroidetes. The close proximity of regions XI and XIII is consistent with the observed interaction of the TBDT with a glycoprotein.

Other regions
For regions I to XIII we were able to infer at least putative functions for ~3700 sequences. (The putative annotation can be viewed at http://www.cibiv.at/TBDT.) However, of the 4600 sequences from GenBank, ~900 sequences remain in regions A-N, to which we were unable to assign any function (Figure 1). While we cannot discuss potential substrates for clusters in regions A-N, we can at least point to some regions that show a peculiar taxonomic composition. In regions A and B sequences from mostly -(74%) and proteobacteria (19%), but also a few -proteobacterial
Figure 1. Clustering of the sequences of putative TonB-dependent transporters (TBDTs). The sequences found by the described genome-wide searches were analysed by CLANS as depicted. (A) shows the consensus tree of the pair-wise mean cluster distances. The branches are coloured according to their respective bootstrap value in shades of grey as indicated by the legend in the middle of the tree. The numbers at each leaf are of the format 'x_y', where 'x' is the cluster number and 'y' the number of sequences belonging to this cluster. We have further indicated the transported substrates, and the regions shown in Figure 1B are marked by I to XII and A to N. Brackets indicate that the metal ion is known, but the metallophore has not yet been identified. An asterisk marks predicted substrates. (B) shows the result of two-dimensional clustering in CLANS. The regions from Figure 1A are marked by red polygons (containing at least a single exp/pTBDT) and red circles (no functionally characterized TBDT). Sequences with a high similarity (P-value < 10⁻⁹⁰) are connected by lines coloured in shades of grey (the darker, the smaller the P-value). The regions shown in Figure 2 (grey dashed line) and Figure 3 (grey dashed-dotted line) are highlighted.

Classification of TonB-dependent transporters in cyanobacteria
One of our aims was the identification and classification of cyanobacterial TBDTs. Hence, we searched for sequences of putative TBDTs in 32 cyanobacterial genomes (the proteins are listed according to their accession code in Table 1, column 1). We additionally extracted the automated annotation from GenBank (Table 1, column 2). At present, this annotation is mostly limited to CirA, FhuE or BtuB. Hence, we analysed the location of cyanobacterial sequences on the CLANS plot (Figure 3) [...]ized TBDTs (see Figure 2 and dashed frames in Figure 3). To further confirm the classification determined with CLANS, we also constructed a phylogenetic tree for the cyanobacterial sequences (Figure 4). Seven 'subtrees' (a-g) were identified and mapped to regions I-X.
Figure 3. Section of Figure 1B [...] Figure 1. The colour code shows the different species as indicated in the right corner. The numbers are according to Table 1.
Table 1 (legend). The annotation of the sequences is indicated in column 1; the initial annotation in the database is given in column 2; the classification according to Figure 3, using the name of a representative transporter of the related category, is given in column 3; the spot number according to Figure 3 is indicated in column 4; the accession code is given in column 5; and the source organism in column 6. (a: OMC, outer membrane channel; '*', incomplete sequence; '?', no clear assignment possible.)
The six sequences in subtree 'a' belong to region I (Figures 3, 4) and show a relation to heme transporters such as HutA (Figures 1, 2, sequence 13). The sequences are found in Synechococcus sp., Acaryochloris marina and Anabaena sp. PCC 7120 (see new assignment in Table 1, column 3). Subtrees 'b' and 'c' contain only sequences from Gloeobacter violaceus. Subtree 'b' is within region I and is equidistant to enterobactin and heme transporters. Therefore, a clear assignment to a characterized TBDT family currently appears impossible. Subtree 'd' is close to the BtuB transporter cluster (region II) (Figure 1). In this region we find sequences from most of the analysed cyanobacteria (8 of 12), suggesting that transporters with similarity to BtuB are common. Subtree 'e' (Figures 3, 4) represents transporters which can clearly be assigned as specific for aerobactin/rhizobactin (IutA-/RhtA-type). Subtree 'f' represents sequences of transporters with the closest relation to FhuA-type transporters of cluster 0. The sequences of subtree 'g' (cluster 1), closely related to ViuA, are probably transporters for catecholates. The sequences of subtree 'g' are also close to cluster 118, which contains putative thiamin transporters. Nevertheless, since the two putative thiamin transporters have not yet been experimentally confirmed, we consider these cyanobacterial TBDTs to be iron transporters of the ViuA-type.
In summary, the assignment of the cyanobacterial TBDTs to regions with functional characterization was successful with the exception of some TBDTs from Gloeobacter violaceus (subtrees 'b' and 'c'). Although BtuB-like transporters and hydroxamate-type metallophore transporters were found in cyanobacteria, we did not find FecA-type (diferric dicitrate) TBDTs, even though they occur in -, -, -, and -proteobacteria, bacteroidetes and spirochaetes.

Identification of TBDTs in Anabaena sp. PCC 7120
In order to explore the cyanobacterial TBDTs in more detail, we analysed the full genome of Anabaena sp. PCC 7120. We identified 21 TBDT genes carrying the plug domain and β-barrel domain characteristic of TBDTs. In addition, we identified four genes (all2620, alr2179, all2578, alr4028) containing the plug domain of the TBDT, but an incomplete β-barrel domain. Downstream of all2620 (Figure 5A) and alr4028 (Figure 5B), a gene coding for the 'missing part' of the β-barrel domain is present (all2619 and alr4029, respectively). Consequently, we checked the stop codon separating the genes of each pair. We confirmed the stop codon between all2620 and all2619 (Figure 5A) and could not identify a frame shift in the region 500 bp upstream or downstream of the stop codon. If All2620 is, indeed, part of a TBDT, it has to form a heterodimer. A putative interaction partner would be All2619. It would, therefore, be interesting to investigate whether such a complex exists, or whether the gene pair is just a remnant of a genetic accident which split the TBDT gene into all2620 and all2619. In contrast, for alr4028 and alr4029 we found a T to C exchange when comparing our sequence with the deposited sequence. Hence, we conclude that the stop codon does not exist and that the two genes alr4028 and alr4029 encode one protein. Therefore, 22 TBDTs exist in Anabaena sp. PCC 7120.
Figure 4. Distribution of cyanobacterial sequences. An alignment of sequences of TonB-dependent transporters listed in Table 1 was used to reconstruct a maximum likelihood phylogeny. Bootstrap values were calculated from 1000 phylogenetic trees. To indicate the probability of occurrence of an edge in these trees, the edges are shown in shades of grey.
Figure 5. The genomic structure of the loci coding for TonB-dependent transporters (TBDTs) in Anabaena sp. PCC 7120.
For 19 TBDTs the genomic organization suggests the integration of the gene in an operon (Figure 5C). Twelve TBDTs are directly positioned behind a gene coding for a (putative) transcriptional regulator (Figure 5C, violet), and most of the (putative) operon structures contain genes coding for proteins involved in iron transport. The gene coding for a ViuA-type transporter is in a putative operon with subunits of a cytochrome D ubiquinol oxidase, which is rather unexpected because, to date, a relation between this oxidase and iron transport has not been reported (Figure 5C). Of the BtuB transporters one is a single gene (all3310), whereas the other (the gene which we confirmed and which is still annotated as alr4028/alr4029) is in a rather typical genomic environment, namely in front of three genes encoding the periplasmic and the plasma membrane localized iron transport machinery. The same holds true for the hutA-like gene alr3242. The other hutA-like gene (alr2153) is in a putative operon with a gene encoding a tetracenomycin C synthesis protein and a gene of unknown function. Again, the relation between the TBDT and the downstream genes is rather questionable.
Three genes are classified as iutA-like. alr0397 (schT) stands alone in the genome. Downstream of alr2581 we found two genes coding for an unknown protein and a dicitrate binding protein, respectively. alr2209 is a component of a large genomic region (~14 kbp, alr2208-alr2215) containing upstream a transcription regulator and downstream a cluster with three genes coding for periplasmic dicitrate binding proteins and one fhuA-like gene (alr2211). Thirteen of the 14 fhuA-like genes are upstream of a gene coding for a protein annotated as dicitrate-binding. However, most of the genes found in the putative operons defined by the 14 fhuA-like genes encode proteins of unknown function. Three of the fhuA-like genes (alr2588, alr2592, alr2596) are in the same chromosomal region. Upstream of these, a gene coding for a transcription regulator is found, and downstream a gene encoding a dicitrate binding protein. However, the phylogenetic analysis (Figure 4) argues against recent gene duplication.

Variations of the number of genes encoding TBDTs in cyanobacteria
The results presented in Figures 3, 4 [129] was not observed [128], which is supported by our analysis (not shown).
TBDTs are regulated by TonB proteins. Hence, the large number of TBDTs leads to the question of whether each TBDT is regulated individually or (at least a sub-population of the TBDTs) in concert by one TonB protein. We, therefore, screened the genomes for the presence of tonB (Table 2). One to three tonB genes were detected. Hence, the number of TBDTs largely exceeds the number of TonB proteins. Please note that we identified a TonB-like protein (Slr1484) in Synechocystis sp. PCC 6803, which corrects a previous statement excluding the presence of a TonB-like protein in this species [130].

Expression of genes in Anabaena sp. coding for TBDTs
We analysed the gene expression of the 22 TBDT genes and of all2620, which only codes for the N-terminal portion of a TBDT (Figure 5A), in Anabaena sp. PCC 7120 (Figure 6). To this end, Anabaena sp. PCC 7120 was grown in normal medium (BG11), medium without iron (BG11-Fe), medium without copper (BG11-Cu) or medium lacking both (BG11-Fe-Cu). The presence of transcript was then determined by non-quantitative reverse transcription polymerase chain reaction (RT-PCR; primers are listed in Table 3). Iron and copper were chosen because iron is known to be involved in the regulation of the gene expression of TBDTs and copper was recently found to induce the expression of a gene cluster involved in siderophore synthesis [131]. Remarkably, 13 TBDT-gene transcripts were present under normal growth conditions in such amounts that they could be amplified and visualized by RT-PCR (Figure 6A, lane 2; Figure 6C, grey lines and black dashed line). It should be noted that the absence of a transcript for the other genes might only reflect low transcript abundance. For 19 genes, we detected transcripts under Fe-minus and/or Cu-minus conditions (Figure 6A, lane 4; Figure 6B, lanes 1, 2). The analysis of the detection pattern revealed the following: (1) the genes all2148 and all2236, both encoding hydroxamate-type TBDTs, were down-regulated upon iron and/or copper starvation compared to transcript levels under normal conditions; (2) the expression of seven genes (iutA-like genes alr2209 and alr2581, the btuB-like gene alr4028, the hutA-like gene alr3242 and the fhuA-like genes all2674, all4924 and alr2592) not detected under normal growth conditions is increased in response to copper, but not iron, limitation in the BG11 medium (Figure 6B, lane 1). This is notable because, for four of these seven genes, the expression in the absence of one metal ion (either Cu or Fe) is higher than in the absence of both iron and copper.
One viuA-like gene (all4026) is expressed at a low level in BG11, but not in BG11 deficient in iron. An exclusive dependence of (upregulation of) expression in BG11 medium on iron limitation was only observed for alr0397 (iutA-like) and all2610 (fhuA-like).
Figure 6. Iron-dependent expression of TonB-dependent transporters (TBDTs).
Finally, we investigated the expression pattern of TBDTs under conditions enforcing heterocyst formation by growth in medium without a nitrogen source (BG11₀). We again analysed the amount of transcript in the four different media. Strikingly, in BG11₀ medium 17 genes are expressed (Figure 6A, lane 6; Figure 6C, black hemicircle), but seven of them are not expressed in BG11. Moreover, we found four genes, alr2153 and alr3242 (hutA-like) and alr2626 and alr2185 (fhuA-like), for which a transcript was detected only under additional metal starvation (BG11₀ -Fe, -Cu or -Fe/-Cu). Remarkably, all2620 is expressed under all conditions without a nitrogen source, which suggests that all2620 is not a pseudogene. In general, one can conclude that not only metal starvation but also nitrogen starvation induces transcription of TBDT-encoding genes in Anabaena sp. PCC 7120. As only all5036 encodes a TonB-like protein in Anabaena sp. PCC 7120, we analysed its expression under the conditions outlined (Figure 7). As expected, the all5036 transcript can be detected under all conditions tested. Assuming that the function of all identified TBDTs in Anabaena sp. depends on TonB, All5036 is required for iron homeostasis in general.

Conclusion
By clustering ~4,600 TBDTs we found that they group by their substrate and not according to their taxonomy, with the exception of regions IX, XI, XIII and C. The latter are specific for sequences from bacteroidetes and -proteobacteria, respectively. Hence, the transported molecule dominates the sequence variation among TBDTs. According to the occurrence of expTBDTs within clusters, we were able to assign a tentative substrate for almost two-thirds of the analysed sequences. We have developed a website for a more detailed inspection of the clustering of individual sequences (http://www.cibiv.at/TBDT). Here, the individual clusters or sequences can be highlighted based on the presentation in Figure 2. However, the current assignment has to be viewed with care, as Schauer and colleagues pointed out that further substrates might be discovered in future [128]; these will then be introduced into the web interface. We identified several clusters of TBDTs whose substrates are so far unknown. Further research on a few candidate proteins of each of these clusters would be of great interest, as it would significantly advance the knowledge of substrate uptake by bacteria at the protein level and might also reveal new potential drug targets.
Large differences to previously suggested classifications were not observed for iron-transporting TBDTs. Generally, our approach resembles previous classifications of TBDTs according to their substrates based on a smaller number of sequences and a phylogenetic tree reconstruction [53,80,82,111,112], but the positioning of the IutA and of the ViuA sequences differs with respect to distances previously proposed [82,112]. In contrast to the report by LeVier and Guerinot who placed ViuA between the lactoferrin and transferrin recognizing transporters [82], we found that ViuA (sequence 74, Region VI) clearly clusters with FyuA (sequence 72) sequences. This discrepancy might reflect the fact that: (i) more sequences of TBDTs are available nowadays; and (ii) the methodology to analyse sequence relationships has improved.
A deviation from this general picture was found for the predicted BtuBs, which are spread over a long stripe from regions II to V. Hence, BtuBs might show a diffuse distribution pattern similar to that of the heme and hydroxamate transporters (regions I and V, respectively). The predicted BtuBs might, therefore, transport substrates that are only structurally related to cobalt-complexing vitamin B12.

TBDTs in Anabaena sp
Based on database searches, we have identified 25 sequences with TonB-box signature [39] Table 3) to visualize expression.
limitations in the environment per se. Therefore, symbiotic cyanobacteria such as Nostoc punctiforme may possibly contain a rather low number of TBDTs because iron is provided by the host. Unfortunately, to the best of our knowledge, the source of Anabaena sp. strain PCC 7120, formerly named Nostoc muscorum ISU ([133]; further synonyms are Anabaena sp. ATCC 27893 and Nostoc sp. strain PCC 7120), is unknown, and it is considered to be a 'free living cyanobacterium'. The observation that this cyanobacterium is susceptible to viruses isolated from Lake Mendota, Dane County, Wisconsin, USA [133], might suggest that a similar environment was its place of isolation. This would be in line with an original natural habitat of Anabaena sp. PCC 7120 that contained rather limited iron sources, because it has been reported that the iron concentration in rivers is higher than in lakes [134]. The variety of TBDT classes found in Anabaena sp. is consistent with iron-limited environmental conditions. The only TBDT type which could not be identified in the analysed cyanobacterial species in general, and thereby also not in Anabaena sp. PCC 7120, is the FecA-type (diferric dicitrate), which can be found in many other bacteria. To date, schizokinen is the only confirmed siderophore which is secreted by Anabaena sp. PCC 7120 [27] and, recently, its transporter was identified [33]. However, additional siderophores are secreted by Anabaena sp. [33,131], but they have not yet been characterized. Nevertheless, other interpretations of the variable number of TBDTs might still be possible.

The environment influences the expression of TBDT genes in Anabaena sp
In line with iron limitation in the native environment, several differential expression regimes have been observed. For instance, six out of 14 genes encoding hydroxamate-recognizing FhuA-like transporters are expressed under (almost) all tested conditions (Figure 6C, grey and grey dashed line, Table 1). The same holds true for one BtuB-like transporter encoded by all3310, which is in accordance with its identification in a proteome analysis of cells grown under standard conditions [135,136]. Interestingly, the other BtuB-like transporter, encoded by the joint gene alr4028/alr4029, is only expressed under iron-limiting conditions (Figure 6C, black dotted line). Furthermore, the iutA-like genes are always expressed under nitrogen-limiting conditions, whereas hutA-like genes are only expressed upon metal starvation (Figure 6C, black dashed dotted line). Also, for the gene encoding the schizokinen transporter SchT (Alr0397) only a moderate and intermediate influence of iron starvation on expression was observed [33]. The gene encoding the only putative catecholate transporter (All4026) appears to be expressed under non-limiting conditions as well as after nitrogen starvation. To our surprise, we did not observe a transcript under iron limitation, but did observe one under copper limitation in BG11 or in the absence of both metals in BG11 and BG11₀. Such a clear relation to copper starvation was detected for four FhuA-type transporters as well (Figure 6C). The relation between copper and the expression of genes encoding TBDTs in Anabaena sp. agrees with the recent observation that genes involved in siderophore production are also induced by copper starvation [131]. Nevertheless, the components of the network regulating the expression of TBDT-encoding genes still need to be identified. Even though a complex network of TBDTs was discovered, only a single TonB protein was found in 58% of Gram-negative bacteria [137]. The tonB gene (all5036) is expressed under all tested conditions and, hence, has to be considered a master 'regulator' of the large group of TBDTs.

Identification of TonB-dependent transporters
Ninety-eight TBDT sequences were extracted from the NCBI database after an extensive literature search. For 67 of them, experimental data are available, but for four of these the substrate is still unknown. Information on predicted substrates is available for the remaining 27 (see Additional file 1, [14][15][16][17][18][19]22,). These predictions are based on co-localization with genes of a specific metabolic pathway or on co-regulation by either transcription factors or a riboswitch [128]. Moreover, we downloaded the 686 completely sequenced eubacterial genomes that were available on the NCBI ftp server (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/) in June 2008. To locate putative TBDTs in the genomes, we searched for open reading frames containing both the TBDT β-barrel domain and the plug domain. To this end, we used hmmsearch (hidden Markov model search) from the hmmer package (http://hmmer.janelia.org/) and the profile hidden Markov models PF00593 and PF07715 provided by the PFAM database [138,139]. The hmmsearch output files were parsed considering only hits with an E-value < 10⁻¹⁰. Only sequences with a significant hit for both domains were used for further analysis.
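The two-domain filter described above can be illustrated with a minimal sketch. It assumes the hmmsearch hits have already been parsed into (sequence ID, Pfam accession, E-value) tuples; this tuple layout, the function name `select_putative_tbdts` and the example identifiers are hypothetical and not taken from the original analysis.

```python
from collections import defaultdict

# Pfam accessions named in the text (barrel and plug domain models).
REQUIRED_DOMAINS = {"PF00593", "PF07715"}
E_VALUE_CUTOFF = 1e-10


def select_putative_tbdts(hits, cutoff=E_VALUE_CUTOFF):
    """Return IDs of sequences with a significant hit for every required domain.

    `hits` is an iterable of (sequence_id, pfam_accession, e_value) tuples.
    This layout is an assumption, not the native hmmsearch output format.
    """
    domains_by_sequence = defaultdict(set)
    for sequence_id, accession, e_value in hits:
        if accession in REQUIRED_DOMAINS and e_value < cutoff:
            domains_by_sequence[sequence_id].add(accession)
    return sorted(
        sequence_id
        for sequence_id, found in domains_by_sequence.items()
        if found == REQUIRED_DOMAINS
    )


# Hypothetical example hits.
example_hits = [
    ("seqA", "PF00593", 1e-40),
    ("seqA", "PF07715", 1e-25),
    ("seqB", "PF00593", 1e-30),  # barrel hit only, no plug hit
    ("seqC", "PF00593", 1e-50),
    ("seqC", "PF07715", 1e-5),   # plug hit above the cutoff
]
print(select_putative_tbdts(example_hits))
```

In this example only seqA is retained, because it is the only sequence with a hit below the cutoff for both domain models.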

Phylogenetic analysis and clustering
The 97 cyanobacterial TBDT sequences were aligned with MAFFT [140] and a maximum likelihood tree was constructed with IQPNNI v3.3.b4 [141]. As a substitution model we selected VT [142] with gamma-distributed rate heterogeneity. Support values were calculated from 1000 bootstrap replicates. The consensus tree was reconstructed with Tree-Puzzle v5.2 [143] applying the majority consensus rule. The program CLANS [144] was used to cluster the 4648 putative TBDTs detected in the complete genomes and to visualize their degree of similarity. In CLANS we set the cut-off such that only P-values < 10⁻¹⁰ obtained by pairwise BLAST searches were used for the clustering. In the context of this manuscript, we use the term 'cluster' to refer to an aggregation of sequences in which each sequence has at least one correspondent within the cluster with a BLAST P-value < 10⁻⁹⁰. This definition leads to 195 clusters with at least two elements.
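One way to read the cluster definition above is as single-linkage grouping: two sequences belong to the same cluster whenever they are connected by a chain of pairwise P-values below the threshold. The following sketch implements this reading with a union-find structure. It is an illustration under that assumption, not the procedure implemented in CLANS, and the function name and example identifiers are hypothetical.

```python
def cluster_by_threshold(pairwise_p_values, threshold=1e-90, min_size=2):
    """Group sequences linked by at least one pair with a P-value below threshold.

    `pairwise_p_values` maps (sequence_a, sequence_b) tuples to P-values.
    Returns a sorted list of clusters, each a sorted list of sequence IDs.
    """
    parent = {}

    def find(item):
        # Locate the representative of the set containing `item`.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for (first, second), p_value in pairwise_p_values.items():
        if p_value < threshold:
            root_a, root_b = find(first), find(second)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two sets

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return sorted(
        sorted(members) for members in groups.values() if len(members) >= min_size
    )


# Hypothetical pairwise P-values.
example_pairs = {
    ("a", "b"): 1e-120,
    ("b", "c"): 1e-95,
    ("c", "d"): 1e-20,   # too weak to join the two groups
    ("d", "e"): 1e-100,
}
print(cluster_by_threshold(example_pairs))
```

Here a, b and c form one cluster and d and e a second one, since the link between c and d does not pass the threshold.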
To further elucidate the relationship between the 195 clusters, we ran CLANS 100 times, each time with a random initial configuration of the sequences in 3D space. In each run we determined the cluster centres and computed the pairwise distances between the centres. With the PHYLIP package v3.68 [145] we constructed a neighbour-joining tree for each of the resulting 100 distance matrices and inferred the majority rule consensus tree, with support values for the splits in the consensus tree.
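The per-run computation of cluster centres and their pairwise distances can be sketched as follows. The sketch assumes that a centre is the arithmetic mean of the 3D coordinates of the cluster members and that distances are Euclidean; both are assumptions, as are the function names and example coordinates. Repeating the computation for each CLANS run would give one distance matrix per run.

```python
import math


def cluster_centres(coordinates, clusters):
    """Return the mean 3D position of each cluster.

    `coordinates` maps sequence IDs to (x, y, z) tuples;
    `clusters` maps cluster names to lists of sequence IDs.
    """
    centres = {}
    for name, members in clusters.items():
        points = [coordinates[member] for member in members]
        centres[name] = tuple(sum(axis) / len(points) for axis in zip(*points))
    return centres


def distance_matrix(centres):
    """Return cluster names (sorted) and the matrix of Euclidean distances."""
    names = sorted(centres)
    matrix = [
        [math.dist(centres[a], centres[b]) for b in names] for a in names
    ]
    return names, matrix


# Hypothetical coordinates from a single run.
example_coordinates = {
    "s1": (0.0, 0.0, 0.0),
    "s2": (2.0, 0.0, 0.0),
    "s3": (1.0, 3.0, 0.0),
    "s4": (1.0, 5.0, 0.0),
}
example_clusters = {"A": ["s1", "s2"], "B": ["s3", "s4"]}

names, matrix = distance_matrix(
    cluster_centres(example_coordinates, example_clusters)
)
print(names, matrix)
```

`math.dist` requires Python 3.8 or later. The resulting matrices would still need to be written in the input format expected by the tree-building program, which is not shown here.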

Genome loci of TonB-dependent transporters in Anabaena sp. PCC 7120
The annotations of genes upstream and downstream of the TBDT loci, shown in Figure 5, were done manually.

Analysis of the operon structure
Genomic DNA of Anabaena sp. was isolated as described [146]. The intergenic sequences between all2619 and all2620 and between alr4028 and alr4029, respectively, together with an additional ~250 bp inside each flanking gene, were amplified with 5' Prime PCR Extender Polymerase (5' Prime, Hamburg, Germany) according to the manufacturer's protocol. The PCR product was cloned into pCR2.1 (Invitrogen, Karlsruhe, Germany) and transformed into DH5α (GibcoBRL, Eggenstein, Germany), and the resulting plasmids were purified for sequencing.

RNA isolation and analysis
Total RNA was isolated from 50 ml of log-phase cultures (OD 750  2) as described [147]. RT-PCRs were performed according to the protocol of the Invitrogen SuperScript® III First-Strand Synthesis System for Random Hexamer Primers (Invitrogen, Carlsbad, USA). The oligonucleotides used are listed in Table 3.