Genome mining for methanobactins

Background: Methanobactins (Mbns) are a family of copper-binding natural products involved in copper uptake by methanotrophic bacteria. The few Mbns that have been structurally characterized feature copper coordination by two nitrogen-containing heterocycles next to thioamide groups embedded in a peptidic backbone of varying composition. Mbns are proposed to derive from post-translational modification of ribosomally synthesized peptides, but only a few genes encoding potential precursor peptides have been identified. Moreover, the relevance of neighboring genes in these genomes has been unclear.
Results: The potential for Mbn production in a wider range of bacterial species was assessed by mining microbial genomes. Operons encoding Mbn-like precursor peptides, MbnAs, were identified in 16 new species, including both methanotrophs and, surprisingly, non-methanotrophs. Along with MbnA, the core of the operon is formed by two putative biosynthetic genes denoted MbnB and MbnC. The species can be divided into five groups on the basis of their MbnA and MbnB sequences and their operon compositions. Additional biosynthetic proteins, including aminotransferases, sulfotransferases and flavin adenine dinucleotide (FAD)-dependent oxidoreductases, were also identified in some families. Beyond biosynthetic machinery, a conserved set of transporters was identified, including MATE multidrug exporters and TonB-dependent transporters. Additional proteins of interest include a di-heme cytochrome c peroxidase and a partner protein, the roles of which remain a mystery.
Conclusions: This study indicates that Mbn-like compounds may be more widespread than previously thought, but are not present in all methanotrophs. This distribution of species suggests a broader role in metal homeostasis. These data provide a link between precursor peptide sequence and Mbn structure, facilitating predictions of new Mbn structures and supporting a post-translational modification biosynthetic pathway. In addition, testable models for Mbn transport and for methanotrophic copper regulation have emerged. Given the unusual modifications observed in Mbns characterized thus far, understanding the roles of the putative biosynthetic proteins is likely to reveal novel pathways and chemistry.


Background
Methanotrophs are Gram-negative bacteria that use methane, a potent greenhouse gas, as their sole source of carbon and energy [1]. As the only biological methane sink, methanotrophs have attracted much attention as a means of mitigating methane emissions [2][3][4]. The first step in their metabolic pathway, the oxidation of methane to methanol, is catalyzed by methane monooxygenase (MMO) enzymes, which are of broad interest in the quest to exploit abundant natural gas reserves as fuel and chemical feedstocks. Most methanotrophs utilize particulate methane monooxygenase (pMMO), a copper-dependent integral membrane enzyme [5,6]. Under copper-limiting growth conditions, some methanotrophs can also express an alternative, soluble form of MMO (sMMO) that utilizes iron [7]. In these methanotroph strains, the switch between pMMO and sMMO is controlled by copper: copper represses transcription of the sMMO genes and causes formation of intracytoplasmic membranes that house pMMO [8][9][10]. The details of this "copper switch" regulatory mechanism are not understood and represent a major outstanding question in the field.
An important part of the copper switch puzzle is the discovery of methanobactins (Mbns), a family of copper-binding natural products initially detected in the methanotroph Methylosinus trichosporium OB3b [11][12][13], and potentially useful in applications ranging from wastewater copper removal in the semiconductor industry [14] to treatment of Wilson disease, a human disorder of copper metabolism [15]. Mbns are believed to be secreted under copper-limiting conditions in a copper-free (apo) form to acquire copper from the environment and then internalized in a copper-loaded form to provide essential copper to the methanotroph [16,17]. In support of this model, methanobactin (Mbn) promotes the copper switch [18,19] and can mediate release of copper from insoluble mineral sources [19,20]. In addition, direct uptake of copper-loaded Mbn (CuMbn) by Methylosinus trichosporium OB3b has been demonstrated, and proceeds via an active transport process [21]. Because this model for Mbn function as well as aspects of its structure (vide infra) are reminiscent of iron siderophores, Mbn has also been referred to as a chalkophore [13] (chalko- is derived from the Greek word for copper, whereas sidero- is from the Greek word for iron).
Mbn molecules from Methylosinus trichosporium OB3b, Methylocystis strain SB2, Methylocystis hirsuta CSC-1, Methylocystis strain M and Methylocystis rosea SV97T have been characterized by mass spectrometry, nuclear magnetic resonance (NMR) and crystallography (Figure 1; Additional file 1, Figure S1). These data reveal a peptidic backbone and copper coordination by two nitrogen-containing heterocycles next to thioamide groups [13,22-25]. The Methylosinus trichosporium OB3b Mbn backbone has the sequence 1-(N-(mercapto-[...] is thought to derive from the peptide backbone LCGSCYPCSCM (Figure 1A). By comparison, Methylocystis Mbns are alanine-rich, and the first nitrogen-containing heterocycle is not an oxazolone [24]. All the Methylocystis Mbns have a similar backbone. The N-terminal residue is either arginine- or methionine-derived (the latter only in Methylocystis hirsuta CSC-1), and immediately precedes the first heterocycle. The heterocycle/thioamide pair (pyrazinedione in all structures except the NMR structure of Methylocystis strain SB2) is followed by an alanine, a serine and a sulfonated threonine (Additional file 1, Figure S1). Next is an oxazolone/thioamide pair and an alanine followed by a methionine (Methylocystis sp. M) or a second alanine (Additional file 1, Figure S1). Additional C-terminal residues are present in some forms of the molecule. The Methylocystis rosea SV97T Mbn contains a Thr-Asn sequence [24], and likely derives from a peptide backbone containing the sequence RCASTCAATN (Figure 1B). Despite these structural differences, these Mbns retain their strong and specific affinity for copper [24].
Figure 1. Post-translational modifications required to produce methanobactins from Methylosinus trichosporium OB3b and Methylocystis rosea SV97T. (A) Mbn from Methylosinus trichosporium OB3b is generated from the precursor peptide MTVKIAQKKVLPVIGRAAALCGSCYPCSCM. Post-translational modifications needed to produce the final natural product include leader peptide cleavage and subsequent N-terminal transamination (orange), oxazolone formation (green), thioamide formation (blue), and disulfide bond formation (yellow). (B) Mbn from Methylocystis rosea SV97T is generated from the precursor peptide MTIRIAKRITLNVIGRASARCASTCAATNG. Post-translational modifications needed to produce the final natural product include leader peptide cleavage, pyrazinedione formation (purple), oxazolone formation (green), thioamide formation (blue), and threonine sulfonation (pink). Several residues that are present in the precursor peptide are missing in the reported structure (gray); the loss of a C-terminal threonine and asparagine had been previously reported, but identification of the precursor peptide indicates that a final glycine is also lost.
The Mbn biosynthetic pathway has not been elucidated, and was initially suggested to involve nonribosomal peptide synthetases [16,26], similar to production of many siderophores [27]. However, sequencing of the Methylosinus trichosporium OB3b genome [28] led to the identification of a 30-amino acid open-reading frame (ORF) with similarities to the peptidic Mbn backbone [23], supporting previous suggestions that Mbn is produced via post-translational modification of a ribosomally synthesized precursor peptide [22]. A similar precursor peptide was identified in an unrelated species, Azospirillum sp. B510, along with several conserved neighboring genes [17,22], but analogous ORFs were not detected in other available methanotroph genomes, and the relevance of many of the neighboring genes surrounding the precursors was unclear.
Genes encoding the precursors of small ribosomally produced natural products can be difficult to detect and annotate, and the underdetection of biologically relevant small ORFs is a known problem [29][30][31]. However, the ever-increasing rate at which bacterial genomes are released has prompted the design of genome mining tools for widespread classes of ribosomally synthesized and post-translationally modified peptide natural products (RiPPs), such as lantibiotics [32][33][34][35][36][37]. With the aim of identifying the potential for Mbn production in a wider range of bacterial species, we mined the available microbial genomes in the National Center for Biotechnology Information (NCBI) and Joint Genome Institute (JGI)/Integrated Microbial Genomes (IMG) databases, identifying 18 new Mbn-like precursors and accompanying biosynthetic genes from 16 species, including unknown or provisionally identified species present in metagenomic samples. Surprisingly, many of these precursor peptides and their operons are from non-methanotrophic species, and several well-studied methanotrophic species seem to lack Mbn operons similar to that of Methylosinus trichosporium OB3b. Beyond biosynthesis-related genes, we also identified a widely conserved set of transporters and sigma factors, which has implications for Mbn export and import as well as its involvement in cellular copper homeostasis. Finally, this bioinformatics study provides new tools to better detect Mbn-like gene clusters in novel genomes.

Results and discussion
Using a variety of bioinformatics techniques, we were able to detect putative biosynthesis operons for Mbn-like natural products in 14 new species, as well as several unidentified or tentatively identified species present in metagenomic studies (Figure 2; Additional file 2, Table S1). While five of the identified species are Type II methanotrophs like the first identified Mbn-producer, Methylosinus trichosporium OB3b, the remaining species are not. Operons were detected in β- and γ-proteobacteria as well as α-proteobacteria, to which the Type II methanotrophs belong. Both the precursor peptides and the range of non-core biosynthesis genes present in the operon hint at a set of potential modifications that may define the Mbn family. Furthermore, genes likely to be related to export, import and copper regulation are found in almost every operon. Based on sequence analysis, the presence of specific Mbn-related genes and the overall operon structure, we have provisionally divided the operons into five groups (Figure 2).

Locating the precursor peptide MbnA
Automated detection of small peptide sequences in newly sequenced genomes is problematic [29]. Short sequences are poorly detected by Basic Local Alignment Search Tool (BLAST) and similar sequence analysis methods, and uncurated small ORF detection results in the annotation of many spurious small ORFs. For well-established classes of small ribosomally produced natural products, such as bacteriocins, hidden Markov model (HMM)-based tools, such as BAGEL and BAGEL2, have been developed to better detect precursors in newly sequenced genomes [32,33]. With only two published precursor peptide sequences (MbnAs), Mbn was not a good candidate for this detection method [17,23]. A TIGRFAM group (TIGR04071) does exist for the precursor, and is a member of the GenProp0962 family (which also includes TIGRFAM groups for MbnC and half of MbnB) [38], but it is based on only the two previously published precursor peptide sequences and a third suggested MbnA homologue from Gluconacetobacter sp. SXCC-1 [38]. Four possible MbnAs detected here are also mentioned in the 2013 TIGRFAM update [38], but are not included in the available HMM.
Figure 2. The genetic organization of Mbn-producing operons. Genomic regions containing all identified Mbn operons are depicted. Operons are grouped into five families on the basis of operon content, MbnB conservation, and MbnA sequence. All operons contain MbnA (black) and MbnB (orange) and at least one transport protein, a MATE efflux pump (purple), a TonB-dependent transporter (blue) or both. Most operons contain additional biosynthesis-related genes, including MbnC (green), aminotransferases (MbnN, brown) or sulfotransferases (MbnS, dark purple). Additional genes may be related to regulation (MbnR, yellow and MbnI, red) or may play an unknown role in copper homeostasis (MbnP, gold and MbnH, dark blue). Genetic mobility elements of several varieties (light teal) are common in the vicinity of these operons. Metagenomic samples corresponding to unidentified or provisionally identified species are marked (†).
Because of the limitations of direct precursor peptide detection, we pursued an alternate genome mining strategy focusing on the detection of biosynthetic proteins, followed by manual identification of unannotated precursors. This method has been used with some success for a variety of natural products, including radical S-adenosyl methionine (SAM)-modified peptides, bacteriocins in cyanobacteria, and a new class of lantibiotic-like natural products stemming from nitrile hydratase or Nif11 leader peptides [34-36,39]. We used the MbnB and MbnC sequences from Methylosinus trichosporium OB3b [38] as seeds in a tBLASTn search through the NCBI's Nonredundant (NR) and Whole Genome Shotgun (WGS) databases, as well as the microbial genomes available at JGI/IMG. For every MbnB homologue detected, a 2 kb region preceding and following that gene was manually examined for 45 to 150 bp ORFs coding for short peptides with at least one cysteine in the last 10 amino acids and an N-terminal region containing multiple arginine or lysine residues.
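The screening criteria described above were applied manually in this study; purely as an illustration, they can be approximated by the following minimal Python sketch. Several details are assumptions that the text does not specify: only ATG start codons are considered, the 45 to 150 bp length is taken to exclude the stop codon, the "N-terminal region" is taken to be the first 12 residues, and "multiple" is taken to mean at least two. All function and parameter names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def small_orfs(seq, min_bp=45, max_bp=150):
    """Yield peptides from ATG-initiated ORFs on one strand.

    Assumption: the length window applies to the coding region
    without the stop codon. ORFs lacking an in-frame stop inside
    the window are ignored.
    """
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        peptide = []
        for pos in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = ambiguous codon
            if aa == "*":
                if min_bp <= pos - start <= max_bp:
                    yield "".join(peptide)
                break
            peptide.append(aa)


def looks_like_mbna(peptide, n_term_len=12, min_basic=2):
    """Apply the two sequence criteria from the text.

    n_term_len and min_basic are illustrative thresholds, not values
    taken from the study.
    """
    has_late_cys = "C" in peptide[-10:]
    basic = sum(peptide[:n_term_len].count(res) for res in "KR")
    return has_late_cys and basic >= min_basic


def candidate_precursors(window):
    """Scan both strands of a genomic window (e.g. 2 kb flanking MbnB)."""
    window = window.upper()
    hits = []
    for strand in (window, reverse_complement(window)):
        hits.extend(p for p in small_orfs(strand) if looks_like_mbna(p))
    return hits
```

A scan of this kind would still return nested ORFs and spurious hits, so candidates would need the same manual inspection that was used here.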
A total of 18 novel MbnA-like ORFs were identified using these methods, one preceding every close MbnB homologue excluding truncated homologues from metagenomic sequencing. Two Methylosinus species (Methylosinus sp. LW3 and Methylocystis parvus OBBP, which may be misclassified as Methylocystis [40]) have two distinct MbnA genes encoding unique Mbns. While it is not uncommon for bacteria to produce multiple siderophores to control iron acquisition in different environments [41], a similar phenomenon has not yet been observed for chalkophores. As shown in the Multiple Alignment with Fast Fourier Transform (MAFFT) alignment (Figure 3A), both the leader and core sequences exhibit some conservation over the 20 complete sequences. MbnA-like sequences range from 23 to 35 amino acids (aa), with predicted core sequences ranging from 7 to 15 aa. The leader peptides are better conserved than the core peptides, perhaps indicating the involvement of the leader peptide in interactions with biosynthesis proteins [42]. The leader sequences are lysine/arginine rich, with at least two such residues occurring near the beginning and one present in a conserved area immediately prior to the core sequence (Figure 3B). The core sequences are more variable, but all contain at least one C(G|A|S)(S|T) motif. Of the complete MbnA sequences, 18 have a second core cysteine and 11 contain one or two additional cysteines.
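The core-sequence features noted above (cysteine count and the C(G|A|S)(S|T) motif) are simple to tabulate. The short Python sketch below is illustrative only; the function names are hypothetical, and the two core peptides used in the usage note are those proposed for the Methylosinus trichosporium OB3b and Methylocystis rosea SV97T Mbns.

```python
import re

# C followed by G, A or S, then S or T, as described in the text.
CORE_MOTIF = re.compile(r"C[GAS][ST]")


def motif_positions(core):
    """Return 0-based start positions of every C(G|A|S)(S|T) motif."""
    return [m.start() for m in CORE_MOTIF.finditer(core)]


def summarize_core(core):
    """Collect the core-peptide features discussed in the text."""
    return {
        "length": len(core),
        "cysteines": core.count("C"),
        "motif_starts": motif_positions(core),
    }
```

For example, `summarize_core("LCGSCYPCSCM")` and `summarize_core("RCASTCAATNG")` summarize the two structurally characterized cores; applied to a full alignment, the same function would give the per-sequence cysteine counts quoted above.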
One basis for the proposed five operon groups (Figure 3C) is the nature of the MbnA sequences, including the structurally, but not genomically, characterized Methylocystis Mbns. The Group I MbnA sequences, primarily from Methylosinus species, are long (11 to 15 aa), with four nonadjacent core peptide cysteines, and contain core prolines. It is unknown whether the presence of four cysteines allows for the formation of disulfide bonds as found in Methylosinus trichosporium OB3b Mbn or whether they lead to the production of additional oxazole/thioamide pairs, analogous to the multiple thiazoles and oxazoles present in many bacteriocins [42].
The primarily Methylocystis Group II MbnA sequences are shorter, contain only two or three cysteines, and many have a conserved threonine which, based on NMR and crystal structures [23,24], is likely to be a sulfotransferase target. Interestingly, the sequences from Methylocystis strain SC2 and Methylocystis rosea SV97T appear to be merged with an extracytoplasmic function (ECF) sigma factor, at least based on the annotation [43]. It is not clear whether the precursor peptide is cleaved from these sigma factors and whether sigma factor activity remains or is altered. Although there is no structure for Mbn from Methylocystis strain SC2, its MbnA sequence and the similarity of its operon structure to that of Methylocystis rosea SV97T suggest that its Mbn will resemble Methylocystis rosea SV97T Mbn and will be identical to Methylocystis hirsuta CSC-1 Mbn [24]. Similarly, although there are no genomes for Methylocystis strain SB2, Methylocystis strain M and Methylocystis hirsuta CSC-1, we can predict that the core peptides for their structurally characterized Mbns will be RCASTCAA, RCASTCAMT and MCASTCAAT, respectively (likely followed by -TNG, -NG and -NG), and that their leader sequences will resemble those from Methylocystis rosea SV97T and Methylocystis strain SC2 [24,43]. A subfamily of Group II MbnAs from Methylosinus or related species (Methylosinus sp. LW3, Methylocystis parvus OBBP and a bioreactor metagenome) does not have the CASTCA(A) motif. Instead, the second cysteine is followed by a tryptophan. If the core peptide sequence dictates cysteine modification, these residues lack the C(G|A|S) motif associated with cyclization and thioamide formation in existing Mbn structures.
The remaining families include MbnAs from a variety of non-methanotrophic species. The species that have Group III MbnA sequences include two Pseudomonas species, two Azospirillum species and single species each from the Cupriavidus, Tistrella and Methylobacterium genera. In this group, the two Pseudomonas sequences, the two Azospirillum sequences and the Methylobacterium sequence are most similar, with somewhat lengthy and near-identical MbnA core sequences containing two cysteine doublets. The less similar Cupriavidus basilensis B-8 MbnA preserves the cysteine doublets whereas the Tistrella mobilis KA081020-065 sequence contains only two non-adjacent cysteines.
The Group IV MbnA sequences are currently only found in the two Gluconacetobacter species. These two sequences are nearly identical, and feature only two core cysteines, with a leader sequence potentially extended by two amino acids. Finally, the Group V MbnA-like sequences are found in Vibrio caribbenthicus ATCC BAA-2122 and Photorhabdus luminescens subsp. laumondii TT01. These sequences are short and somewhat divergent, containing only a single cysteine, which may suggest that they represent a natural product with some structural similarities to Mbn that either does not chelate copper or does not chelate copper in the same way that other chalkophores do. This overall classification scheme extends to the MbnB and MbnC sequences (vide infra), and will be subject to future modification as more MbnA sequences are identified in new genomes.

The first unknown biosynthesis protein: MbnB
MbnB is the core protein in the Mbn biosynthesis operon, and was detected in 19 operons, including truncated forms in several metagenomic samples (Figures 2 and 4; Additional file 2, Table S1). However, the initial identification of this protein in Methylosinus trichosporium OB3b has been problematic. In Methylosinus trichosporium OB3b, and one other operon detected (Methylobacterium sp. B34), MbnB is split into two ORFs, formerly annotated as MettrDRAFT_3894 and MettrDRAFT_3895, but reannotated as one entity (MettrDRAFT_3422) in a recently assembled genome build available on IMG [23]. A TIGRFAM HMM (TIGR04159) exists for the half of the protein that resembles MettrDRAFT_3894, but does not cover MettrDRAFT_3895 [38]. Therefore, a conjugate with a glycine replacing the stop codon between the two ORFs was used for BLAST detection and annotation.
Despite the addition of new members to the MbnB family, no motifs or domains of known function have been identified beyond occasional classification as TIM-barrel proteins [44]. MbnB homologues may be a subfamily in the larger DUF692 family (PFAM class PF05114). However, when conducting a BLAST search or an HMM-based search for homologues, MbnB-like proteins represent a distinct subgroup, with a sharp drop-off in expectation value between the last MbnB-like protein (E < 1E-50, except for sequences truncated by the end of a contig/scaffold) and other DUF692-like proteins. Notably, in the Group V operons, the MbnB gene is separated from the MbnA-like precursor by a gene-sized ORF.
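One simple way to locate a drop-off of the kind described above is to rank hits by expectation value and cut at the largest gap on a log10 scale. The Python sketch below is an assumption about how such a cut could be made, not the procedure used in this study; the function name is hypothetical and any hit list passed to it would come from the reader's own search output.

```python
import math


def split_at_dropoff(hits, floor=1e-300):
    """Split (name, evalue) hits at the largest log10 E-value gap.

    Returns (close_homologues, distant_homologues). E-values reported
    as 0.0 are clamped to `floor` so that the logarithm is defined.
    With fewer than two hits, no gap exists and all hits are 'close'.
    """
    ranked = sorted(hits, key=lambda hit: hit[1])
    if len(ranked) < 2:
        return ranked, []
    logs = [math.log10(max(evalue, floor)) for _, evalue in ranked]
    gaps = [logs[i + 1] - logs[i] for i in range(len(logs) - 1)]
    cut = gaps.index(max(gaps)) + 1  # first hit after the largest gap
    return ranked[:cut], ranked[cut:]
```

Hits truncated by the end of a contig or scaffold can have weaker expectation values than their true similarity warrants, as noted above, so a purely numerical cut like this one would misplace them; such cases still call for manual review.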
A comparison of MbnB sequences (Figure 4A) strongly supports the five operon families assigned on the basis of the MbnA sequences (Figure 4B). There are about six different regions that are strikingly well-preserved, even in the Group V homologues. Without knowledge of the structure or function of MbnB, it is difficult to interpret which of these conserved regions are important. However, given that MbnB and MbnC are the only proteins with unassigned functions that are preserved in both the Methylosinus trichosporium OB3b and Methylocystis rosea SV97T operons, it is possible that one or both are responsible for the nitrogen-containing heterocycles and the neighboring thioamides that have been present in every Mbn structure obtained thus far.

The second unknown biosynthesis protein: MbnC
MbnC is the second unknown Mbn biosynthesis protein, and as with MbnB, there is an existing, if limited, TIGRFAM class (TIGR04061) [38]. We detected MbnC-like proteins in 17 novel operons, a number that includes two fragmentary hits in a bioreactor metagenome (Figure 5A). As with MbnB, there is a broader class of distantly related hits (with high Pseudomonas representation and a more divergent C-terminal region), visible after a sharp decline in expectation value quality. This set of more distant relatives appears to correspond to the TIGRFAM family TIGR04061.
MbnC homologues are present in Groups I to IV Mbn operons. For those operons, the phylogenetic tree constructed for MbnC resembles that for MbnB and MbnA, supporting the proposed classification scheme (Figure 5B). In families with a true MbnC homologue, the predicted MbnC ORF frequently overlaps MbnB by a significant number of residues, but it is not in frame with MbnB. As with MbnB, multiple alignment of MbnC homologues confirms the broad conservation of several regions of the gene, but the relationship between the conserved regions and MbnC's potential role in biosynthesis of thioamide and nitrogen-containing heterocycles remains unclear.

The Group V operons, which appear to be the most distantly related to the Methylosinus trichosporium OB3b operon, diverge with MbnC. There do not appear to be clear Group V homologues for MbnC as there are for MbnB. There is, however, an unidentified ORF immediately neighboring the precursor, conserved primarily in these two species. This ORF could possibly encode a core biosynthetic protein for the Group V operons (Additional file 1, Figure S2). These sequences appear to have no close homologues in other species, and have a weak N-terminal similarity to the DUF692-like domain (PF05114), which is more like MbnB than MbnC.

Other biosynthesis proteins: MbnN, MbnS and MbnF
The Mbns from Methylosinus and Methylocystis species exhibit post-translational modifications beyond the formation of nitrogen-containing heterocycles and neighboring thioamides. Mbn biosynthesis in Methylosinus trichosporium OB3b requires a transamination reaction on the N-terminal amine group of the core peptide following leader peptide removal, as well as the formation of a disulfide bond, and all four Methylocystis Mbns contain a sulfonated threonine group [22,24]. Although specific proteases and disulfide-forming proteins are not evident, we have discovered proteins likely responsible for transamination and threonine sulfonation in the Mbn biosynthesis operons of several genomes. Transaminases are present in three operons only: Methylosinus trichosporium OB3b (annotated as "histidinol phosphate transaminase/cobyric acid decarboxylase" and with a PFAM classification of PF00155 or Class I/II aminotransferase), Methylosinus sp. LW4 (also PF00155 or class I/II aminotransferase), and Gluconacetobacter sp. SXCC-1 (classified as PF00202 or Class III aminotransferase) (Figure 2). The transaminase has tentatively been designated MbnN. The paucity of transaminases in Mbn operons suggests that the N-terminal transamination present in Methylosinus trichosporium OB3b Mbn may not be a common modification.
Like the N-terminal transamination, threonine sulfonation may only be present in a subset of Mbns. To date, it has only been observed in the four structures of Mbns produced by Methylocystis species [23,24]. Sulfotransferases with domains corresponding to Pfam family PF00685 were detected only in the two Group II Methylocystis operons. Although no structure for Mbn from Methylocystis strain SC2 is available, the similarity of its MbnA to that of Methylocystis rosea SV97T combined with the presence of a sulfotransferase in its operon strongly suggests that its Mbn will also be sulfonated, presumably at the same threonine. This sulfotransferase has been designated MbnS.
Finally, the gene encoding MbnF, generally annotated as a flavin adenine dinucleotide (FAD)-dependent monooxygenase or an FAD-dependent oxidoreductase (Pfam PF01494), is also present in six Group I and II operons (including all known Methylocystis genomes and some Methylosinus genomes), always following MbnM (Figure 2). The function of MbnF is unclear, but given its presence in the Methylocystis rosea SV97T operon and absence in the Methylosinus trichosporium OB3b operon, it could play a role in pyrazinedione biosynthesis (Figure 1), possibly hydroxylating the heterocycle. Without structures of Mbn-like products from non-methanotrophs, it is difficult to connect other neighboring genes (annotated or not) to potential biosynthetic modifications and to determine the effective ending point of the operon and potentially the end of any multicistronic mRNA transcripts. In both Methylocystis species, MbnS is followed by a gene resembling MoaA, a protein responsible for the first step in molybdenum cofactor biosynthesis [45] (which involves the conversion of a guanosine derivative to precursor Z) and a gene generally annotated as a 3-hydroxyisobutyrate dehydrogenase. Hypothetical unknown proteins (including the MbnC replacement in Group V operons) are present in several operons, and a range of proteins of unknown relevance, including several varieties of known copper-related proteins, appear in a few operons only (Figure 2).

Exporting methanobactin via MbnM
A proton/sodium-dependent multidrug export pump (MATE), belonging to the PFAM class PF01554, is found in 13 of the identified operons (Figure 2). Of the remaining operons, several are on small contigs in more fragmented draft genomes, making it difficult to rule out the presence of a similar exporter. Excluding Vibrio caribbenthicus and Photorhabdus luminescens, which appear to have dissimilar MATE transporters, perhaps reflecting a less similar final Mbn-like product, this exporter is well-conserved, even in the non-methanotrophs Pseudomonas fluorescens NZI7 and Azospirillum sp. B510 and B506 (Additional file 1, Figure S3). In prokaryotes, MATE transporters primarily function as exporters of antibiotics and similar toxic compounds, simultaneously importing Na+ or H+ and exporting mostly cationic natural products [46][47][48]. Native natural products are primarily exported by non-MATE efflux pumps, such as the resistance-nodulation-cell division (RND) or major facilitator superfamily (MFS) exporters that are believed to transport some siderophores out of the cell [49][50][51]. However, many MATE transporters do not have known substrates, and MATE transporters are even found in antibiotic hypersensitive strains [52]. Thus, the ability of a MATE transporter to secrete Mbn-like compounds is plausible, if unprecedented.

Importing copper-loaded methanobactin via MbnT
A family of small molecule importers, known as TonB-dependent transporters (TBDTs), is also commonly associated with the Mbn biosynthesis operons. The only genomes for which nearby TBDTs are not observed are Vibrio caribbenthicus and Photorhabdus luminescens, as well as the second Mbn operon in Methylosinus sp. LW3, which is small and surrounded by transposon elements; contig truncation of several other operons may be hiding additional potential transporters in other species. We have shown previously that CuMbn is imported via an active process [17,21], and TBDTs are good candidates for importers since they play a similar role for siderophores [53][54][55][56]. TBDTs found in the vicinity of Mbn operons are generally annotated as siderophore receptors and classifiable under models including TIGR01783 (full siderophore-specific TBDT model), PF00593 (TBDT barrel only), PF07715 (TBDT plug domain only) and in some cases PF07660 (an extended N-terminal region, which appears to approximate the published N-terminal extension (NExT) domain [57]); they have provisionally been designated the MbnT family. Conservation of these TBDTs is weaker than that of MbnB, MbnC or MbnM; even the plug domain displays less homology (Additional file 1, Figure S4A, B). However, differences in the core peptide backbone sequence may require markedly different binding approaches. While methanotroph Mbn-related genes are generally relatively similar, the plug domain sequences of Methylocystis Group II TBDTs and Methylosinus Group I TBDTs diverge markedly, perhaps reflecting the structural differences of the final compounds (Additional file 1, Figure S4A, B).
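The model accessions listed in this section can be collected into a simple lookup for summarizing the domain content of a candidate TBDT. The following is a minimal sketch, not the procedure used in this study. It assumes that model hits for each protein have already been obtained from a prior HMM search and are supplied as accession strings. The function name `summarize_tbdt`, the output keys, and the rule that a protein is "TBDT-like" if it matches the full model or both the barrel and plug models are all illustrative assumptions.

```python
# Accessions and descriptions come from the text; the decision rule is an assumption.
TBDT_MODELS = {
    "TIGR01783": "full siderophore-specific TBDT",
    "PF00593": "TBDT barrel",
    "PF07715": "TBDT plug domain",
    "PF07660": "extended N-terminal region",
}


def summarize_tbdt(hits):
    """Summarize which TBDT-related models a protein matches.

    `hits` is an iterable of model accessions (hypothetical input format);
    score thresholds are outside the scope of this sketch.
    """
    hits = set(hits)
    matched = [acc for acc in TBDT_MODELS if acc in hits]
    # Assumed rule: full model, or barrel plus plug, marks a TBDT-like protein.
    tbdt_like = "TIGR01783" in hits or {"PF00593", "PF07715"} <= hits
    return {
        "matched_models": matched,
        "tbdt_like": tbdt_like,
        # PF07660 is only taken as an extension when a TBDT core is present.
        "n_terminal_extension": tbdt_like and "PF07660" in hits,
    }
```

A protein matching only PF07660 would be reported as not TBDT-like under this rule, which is deliberately conservative.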

MbnT may have a FecIRA-like regulation system in Methylosinus species
In four operons from Methylosinus species, the TBDT has an extra N-terminal domain (Additional file 1, Figure S4C). These larger TBDTs are preceded by an ORF generally annotated as an "Fe(III) dicitrate membrane sensor" (Pfam PF04773) and an "ECF sigma factor" (with conserved σ-70-like regions 2 (Pfam PF04542) and 4 (Pfam PF08281)), designated MbnR and MbnI, respectively (Figure 2). This pairing is generally observed for FecIRA-like systems, in which the holo siderophore-bound TBDT interacts with the membrane sensor, which then interacts with the ECF sigma factor to regulate expression of siderophore biosynthesis and transport proteins [57][58][59][60]. The earliest example of this system is the eponymous FecIRA system, which controls the transcription of iron citrate transporters [57,59,61,62]. Similar systems exist for siderophores, such as pseudobactins BN7 and BN8 (the PupBRI system) [63], pyoverdines (FpvARI/PvdS) [64] and a range of other siderophores. Not all of these systems have identical regulatory pathways. The pyoverdine transport system has two ECF sigma factors (FpvI and PvdS), which regulate different operons [64], and the HasISR system, which transports heme, has an unusual regulatory scheme in which the membrane-bound anti-sigma factor HasS inhibits the activity of the ECF sigma factor HasI until heme binds to the TBDT HasR [65].
Strikingly, only the four Methylosinus MbnT TBDTs have the N-terminal extensions necessary for FecIRA signaling [57], suggesting a possible regulatory mechanism for Mbn production and transport in Group I operons (Figure 6). In this model, when CuMbn binds to MbnT, a periplasmic TonB-mediated interaction with MbnR results in an altered cytoplasm-side interaction with MbnI. The MbnI ECF sigma factor may then interact with RNA polymerase to either upregulate or inhibit Mbn biosynthesis and transport and may also regulate other operons that are highly expressed at low copper, such as the sMMO operon. If MbnIRT is a positive regulation system, a negative regulator that binds copper and represses Mbn biosynthesis and transport, among other systems, may also be present.
The TBDTs in other operons beyond the Methylosinus (Group I) species lack N-terminal extension domains and are not adjacent to FecIR homologues. Although a FecIRA-like system could still be present in these species in a distant small operon, it is less likely. It may be that models analogous to different siderophore regulatory systems are more relevant to these Mbn operons. For example, iron-loaded pyochelin is taken up into the cell and binds to the transcription factor PchR, which regulates its biosynthesis and transport [66][67][68][69]. If such a system exists for chalkophores (Additional file 1, Figure S5), the regulators do not appear to be consistently encoded near the biosynthesis operon. However, genes encoding periplasmic binding proteins, commonly associated with natural product import via adenosine triphosphate (ATP)-binding cassette (ABC) transporters, are located downstream of TBDTs in both complete Methylocystis and Azospirillum operons, and could be relevant to the need for cytoplasmic uptake in a PchR-like model (Figure 2).
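The genomic signature described in this section for a FecIRA-like arrangement (a TBDT with an N-terminal extension, preceded by a membrane sensor and an ECF sigma factor) can be expressed as a simple neighborhood check. The sketch below is illustrative only and is not the method used in this study. It assumes that genes are supplied in genomic order with their Pfam hits already assigned, that PF07660 can serve as a proxy for the N-terminal extension, and that "preceded by" means within the two genes immediately upstream; strand is ignored. The `Gene` class, the function name and the `window` parameter are hypothetical.

```python
from dataclasses import dataclass

SENSOR = {"PF04773"}             # "Fe(III) dicitrate membrane sensor" (MbnR-like)
SIGMA = {"PF04542", "PF08281"}   # sigma-70-like regions 2 and 4 (MbnI-like)
TBDT_CORE = {"PF00593", "PF07715"}  # TBDT barrel and plug
EXTENSION = "PF07660"            # assumed proxy for the N-terminal extension


@dataclass
class Gene:
    name: str
    domains: frozenset  # Pfam accessions assigned to this gene product


def find_fecira_like(genes, window=2):
    """Return names of extended TBDT genes that have both a sensor and an
    ECF sigma factor among the `window` genes immediately upstream."""
    found = []
    for i, gene in enumerate(genes):
        is_extended_tbdt = TBDT_CORE <= gene.domains and EXTENSION in gene.domains
        if not is_extended_tbdt:
            continue
        upstream = genes[max(0, i - window):i]
        has_sensor = any(SENSOR <= g.domains for g in upstream)
        has_sigma = any(SIGMA <= g.domains for g in upstream)
        if has_sensor and has_sigma:
            found.append(gene.name)
    return found
```

Under these assumptions, a TBDT lacking PF07660, or one without both upstream partners, is not reported, which mirrors the distinction drawn here between Group I operons and the others.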

MbnP and MbnH: mysterious partners
The genes encoding MbnP and MbnH are conserved as a pair far beyond the group of Mbn producers analyzed here and are defined by an existing set of TIGRFAM HMMs (TIGR04039 and TIGR04052) and an associated genome property (GenProp0940). The pair consists of the di-heme cytochrome c peroxidase MbnH, frequently annotated as resembling MauG, and its neighboring partner protein, MbnP. In two non-Group V genomes (Methylosinus sp. LW3 and LW4), there are cases where this pair is not immediately proximal to an Mbn operon, but is present elsewhere in the genome. Methylosinus trichosporium OB3b has two such additional pairs. Interestingly, these isolated pairs are located near MbnT-like TBDTs that also have adjacent MbnI and MbnR homologues.
A somewhat similar pair of proteins is found in some methanotroph species that lack Mbn operons. In Methylococcus capsulatus (Bath), the proteins are called SACCP (the di-heme cytochrome c peroxidase) and MopE (the partner protein). MopE is known to be the subject of a post-translational modification (possibly by SACCP, which is similar to MauG [70]) in which a tryptophan converted to kynurenine participates in a copper binding site [71]. Additionally, while the intact MopE protein is surface-associated, a C-terminal region is fully secreted [72]. In Methylomicrobium album BG8, these proteins are called CorB (the di-heme cytochrome c peroxidase) and CorA (the partner protein) [73,74]. The genes encoding these proteins are downregulated in the presence of copper [75][76][77]. However, although there are several well-conserved tryptophans in the MbnP proteins, the sequence is not markedly similar to MopE or CorA (Additional file 1, Figure S6), and there are no data linking any close MbnP homologues or their di-heme cytochrome c peroxidase partners to copper. The relevance of this gene pair to Mbn biosynthesis, regulation or transport thus remains unclear.

Overall structure of the Mbn operon
The core of the Mbn operon (Figure 2) is the MbnB biosynthesis gene, located directly downstream of MbnA in all operons except for the two Group V operons, which have an unknown gene between MbnA and MbnB. MbnC encodes a secondary core protein, present immediately downstream of MbnB in all operons except Group V operons. All components beyond that core are more flexible. When present, MbnM follows the core biosynthesis peptides. Other biosynthesis-related genes, such as MbnN and MbnS, follow MbnM. In some cases, the MbnP/MbnH pair appears after the biosynthesis proteins. In others, it is present before them on the same strand, or before them but on the complementary strand. MbnT, downstream of MbnI/R in Group I operons, primarily occurs prior to the biosynthesis cluster on the same strand and frequently neighbors the MbnP/MbnH pair as well.
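The core arrangement described above lends itself to a small rule-based check. The following is a minimal sketch, assuming that the genes of a candidate locus have already been labeled (for example "MbnA", "MbnB", "MbnC", or any other string for unassigned genes) and are listed in the direction of transcription. The function name `classify_core` and the returned labels are hypothetical and are not part of the analysis reported here.

```python
def classify_core(gene_order):
    """Classify the core of a putative Mbn operon.

    `gene_order` is a list of gene labels in transcription order
    (assumed input format). Returns one of:
      "canonical"    : MbnA directly followed by MbnB, then MbnC
      "group_v_like" : one gene between MbnA and MbnB, no MbnC after MbnB
      "atypical"     : MbnA present but neither pattern matches
      "no_mbnA"      : no precursor peptide gene in the list
    """
    if "MbnA" not in gene_order:
        return "no_mbnA"
    start = gene_order.index("MbnA")
    after = gene_order[start + 1:start + 4]  # up to three genes downstream
    if after[:2] == ["MbnB", "MbnC"]:
        return "canonical"
    inserted_gene = len(after) >= 2 and after[0] != "MbnB" and after[1] == "MbnB"
    if inserted_gene and after[2:3] != ["MbnC"]:
        return "group_v_like"
    return "atypical"
```

Only the first MbnA in the list is examined, so a locus with several precursor genes would need to be split before classification.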
In many of the operons, factors related to genetic mobility, such as insertion sequences, transposases, integrases, insertion sites, shufflons and conjugation-related proteins, occur on one or both sides of the Mbn operon or within several kilobases (Figure 2). These elements may suggest an explanation for the seemingly unrelated assortment of species in which these operons have been detected, and for the lack of operon detection in several well-studied methanotroph species, including Methylocystis str. Rockwell [78]. Siderophores are sometimes transported between species on virulence or fitness cassettes [79]. Similarly, it may be that chalkophores are transported in this fashion and adapted by species that have a special need for copper-binding compounds.
Figure 6 Proposed Mbn signal transduction pathway featuring the MbnIRT triad. Mbn is secreted from the cell via the MATE multi-drug exporter MbnM and an unknown outer membrane partner. CuMbn is readmitted to the cell via the TonB-dependent transducer MbnT. CuMbn binding to MbnT induces a conformational change that results in contact with both the inner-membrane TonB-ExbD-ExbB complex and MbnR via the unique N-terminal extension of MbnT. CuMbn may or may not enter the cytoplasm intact, but either way, MbnR activates MbnI analogous to a standard FecIRA system. MbnI replaces σ70 in the active RNA polymerase complex, activating transcription of Mbn biosynthesis and transport genes, and potentially other operons needed in low copper conditions. In siderophore systems, the negative regulator Fur binds iron as intracellular iron levels rise, and the holo Fur binds to siderophore biosynthesis and transport promoter regions, inhibiting transcription. A similar negative regulator might be needed to trigger the copper switch to pMMO production.

Conclusions
We have detected a total of 18 novel Mbn-like precursors located in full or partial biosynthesis/transport operons in 16 species or metagenomic samples. Of the methanotroph species, operons are present in both strains that undergo the copper switch from sMMO to pMMO (for example, Methylosinus trichosporium OB3b [28], Methylocystis str. M [80,81], Methylocystis hirsuta CSC-1 [82]) and those that only express pMMO (for example, Methylocystis parvus OBBP [40], Methylocystis rosea SV97T [83]). The 16 species are not limited to methanotrophic bacteria, providing compelling evidence that Mbn-like compounds may play a broader role in proteobacterial metal homeostasis. This analysis reveals the precursor peptide for Methylocystis rosea SV97T Mbn [24] and identifies in the same operon genes encoding enzymes that would be necessary to produce the novel features of this Mbn, specifically the sulfonated threonine. Moreover, these data allow us to predict that the Mbn produced by Methylocystis strain SC2 will be very similar to that of Methylocystis rosea SV97T and likely identical to that of Methylocystis hirsuta CSC-1. Conversely, we can predict that the Mbn operons of Methylocystis str. SB2, Methylocystis str. M and Methylocystis hirsuta CSC-1 will have the same core components as the two Methylocystis operons presented here. Taken together, these findings provide strong new support for a post-translational modification biosynthetic pathway.
Beyond the four Methylocystis Mbns, the only other structurally characterized Mbn is the original compound from Methylosinus trichosporium OB3b, which has a Group I Mbn operon. As the related natural products from Group I, III, IV and V families are characterized, the extent of structural diversity in the Mbn family should become clearer. The roles of MbnB and MbnC as well as the less universal MbnN, MbnS and MbnF proteins in biosynthesis are unknown or unconfirmed and need to be investigated biochemically. This is particularly important since Mbns contain uncommon post-translational modifications, such as thioamide groups, a modification rare enough that Mbns have doubled the number of compounds known to contain it [84]. In addition, there are no other examples of RiPPs containing pyrazinediones [85,86], and even oxazolone rings are uncommon, with oxazoles and thiazoles constituting the more common products of serine, threonine and cysteine cyclization. The combination of these motifs with the possibility of more unknown post-translational modifications in Mbns from Groups I and III to V suggests that novel biochemical mechanisms may be involved in Mbn biosynthesis.
The two identified Group V operons may represent a different natural product subfamily, albeit one that shares some similar biosynthesis proteins and modifications with the main Mbn family. Notably, their MbnA sequences contain only a single modifiable cysteine, suggesting that if the final products bind copper at all, they do not use the paired heterocycle/thioamide coordination scheme. Instead of MbnC homologues, these operons include a third unidentified putative protein which neighbors MbnA, and Vibrio caribbenthicus also has a second unknown protein following MbnB. Both have nearby exporters, but no TBDT-like importers.
The identification of MbnM and MbnT as common members of the Mbn operon provides candidate transporters for both Mbn import and export. The possible involvement of MATE-type exporters is somewhat surprising, but the ability of TBDTs to import metal-loaded siderophores is well documented, and the association of such transporters with Mbn operons supports experimental work showing that Mbn uptake is an active process [21][53][54][55]. Furthermore, in the case of Group I operons, the N-terminal transduction element in MbnT combined with the presence of MbnI and MbnR is consistent with FecIRA-style regulation. This model, along with a hypothetical pyochelin-like route for non-Group I operons, provides testable mechanisms for CuMbn involvement in methanotrophic copper regulation, and may help unravel the mystery of the copper switch.
A final point of interest lies in what was not found in this analysis. There are a variety of methanotroph genomes, including but not limited to Methylococcus capsulatus (Bath) [87], Methylocella silvestris BL2 [88], Methylocystis str. Rockwell (ATCC 49242) [78] and Methylomicrobium album BG8, in which we detect no Mbn biosynthesis/transport operons. Based on their genomes, if these species produce a chalkophore as suggested [89], it is not similar to existing structurally characterized Mbns and its biosynthetic enzymes do not closely resemble MbnB and MbnC. While one of these species only produces sMMO, the rest produce pMMO and some, including Methylocystis str. Rockwell, produce only pMMO. If these methanotrophs do not produce their own chalkophores, they might scavenge chalkophores from other species, similar to what is observed for siderophores [90], and may still possess Mbn-transporting TBDTs. Alternatively, these strains may have other, yet to be identified, mechanisms of copper uptake. Taken together, these data provide new insight into Mbn and Mbn-like compounds and their biosynthesis, provide new tools for investigating these processes, and have implications for the broader question of bacterial heavy metal homeostasis.