Bacilli are aerobic, rod-shaped, Gram-positive bacteria with low G+C content. They are widely distributed in soil, air and water, and form oval endospores in response to adverse environmental conditions. Representatives of this genus, which comprises some 51 validly described species, are used in a wide range of industrial processes, mainly because of their ability to produce extracellular enzymes, antibiotics and insecticides and to secrete them in high concentrations.
In particular, Bacillus amyloliquefaciens and its numerous natural isolates serve for the production of α-amylase, an enzyme necessary for the liquefaction of starch prior to saccharification in the production of sugar syrups in the food industry [2, 3]. The habitat of this species is the soil, especially the rhizosphere, where it colonizes plant roots [4, 5]. The commercially available strain B. amyloliquefaciens FZB24 is applied as a bio-fertilizer, as it stimulates plant growth and suppresses plant-pathogenic organisms. These abilities are shared by strain FZB42.
The soil is also the natural environment of Bacillus subtilis, the best characterized member of the genus. Strain 168 was the first Gram-positive bacterium to be sequenced and has been used as a model organism to study the behavior of microorganisms for more than half a century. It is closely related to Bacillus amyloliquefaciens strain FZB42, but does not promote plant growth. The features and mechanisms governing the biocontrol-related function of B. amyloliquefaciens FZB42, which are evidently not active in the domesticated strain B. subtilis 168, have not yet been fully characterized.
The genome contains the complete set of genetic information that an organism requires to live and thrive. Complete sequencing of an organism's genomic DNA therefore offers a better understanding of the mechanisms the organism uses to withstand its environment. The function of many sequenced genes can be predicted on the basis of the genetic organisation of the gene's surrounding region, the conserved regions within the gene and the degree of its similarity to other genes of established function. Databases that compile information about sequenced organisms and genes, combined with powerful bioinformatics tools that perform multi-level gene comparisons and annotations within seconds, make the task of assigning a putative function to a gene easier, faster and more successful. The information obtained by comparisons with such databases can serve as a basis for further molecular and biochemical work. Finally, comparisons of complete genome sequences are very informative from an evolutionary perspective, as they allow a better phylogenetic classification of sequenced organisms and highlight the genetic reorganisation that evolution has imposed on closely related organisms.
The first complete genome sequence of a microbial organism was that of Haemophilus influenzae Rd KW20, published in 1995. Bacillus subtilis was the first Gram-positive bacterium to be sequenced, as its biochemistry, physiology and genetics had already been studied thoroughly for many years. In the following years, various other Bacilli were sequenced [9, 10, 11]. The number of fully sequenced microbial genomes has risen rapidly, from only 30 in the year 2000 to more than 300 today. Shortly after the first prokaryotic genome was completed, the complete sequences of major eukaryotic organisms (such as Drosophila, mouse and human) were determined, which marked a major breakthrough in science [12, 13, 14]. It is therefore apparent that information from completely sequenced genomes is accumulating exponentially and that, combined with the development of more powerful databases (which reflect the advances in bioinformatics), it provides a better understanding and prediction of the abilities and functions of newly sequenced organisms.
Antibiotics are a diverse group of chemical substances produced by both prokaryotes and eukaryotes. They are of great importance in medicine because they disrupt the metabolism of pathogenic microbes by various mechanisms. They can be classified according to their structure or their mode of action (Fig. 1). In the medical field, the two most important groups of antibiotics are the β-lactams and the tetracyclines. Members of the first group, such as penicillins and cephalosporins, are produced by fungi and are potent inhibitors of bacterial cell wall synthesis. The tetracyclines consist of a naphthacene ring system that can be substituted at several positions to form new analogs. They are produced by prokaryotes and inhibit almost all Gram-positive and Gram-negative bacteria by interfering with the function of the 30S ribosomal subunit. Aminoglycoside antibiotics, a separate group, act in the same way. They contain amino sugars linked to each other by glycosidic bonds, as in streptomycin and kanamycin. Furthermore, macrolide antibiotics, which are widely used in medicine, contain large lactone rings connected to sugar moieties. Erythromycin belongs to this group and inhibits protein synthesis at the 50S subunit of the ribosome (Fig. 1).
|Figure 1: Chemical structural representation of different classes of antibiotics with major importance.|
|A. Penicillin core, B. Tetracycline core, C. Neomycin; aminoglycoside antibiotic and D. Erythromycin; a macrolide antibiotic|
Bacilli are widely used for the production of a broad range of antibiotics, such as the polymyxins (B. polymyxa), which destroy membrane integrity, and the edeines (B. brevis), which inhibit the formation of the initiation complex on the 30S ribosomal subunit. The predominant class of antibiotics produced by Bacilli, however, is the peptide antibiotics. These exhibit highly rigid, hydrophobic and/or cyclic structures with unusual constituents such as D-amino acids, and are generally resistant to hydrolysis by peptidases and proteases. Furthermore, they are insensitive to oxidation, because cysteine residues are either oxidized to disulphides and/or modified to characteristic C-S (thioether) linkages. The peptides are synthesized either ribosomally, followed by post-translational modification, or nonribosomally by multienzyme complexes.
For the production of proteins and peptides, three basic components are required: aminoacyl-tRNA (aa-tRNA) synthetases, tRNAs and the ribosome. First, the aa-tRNA synthetase selects the cognate amino acid and loads it onto the 2’- or 3’-hydroxyl group of the corresponding tRNA [20, 21]. Subsequently, with the help of the elongation factor EF-Tu, the ribosome selects the correct aa-tRNA during each cycle of polypeptide elongation, according to the mRNA sequence. To this end, a complex comprising aa-tRNA, EF-Tu and GTP enters the acceptor site of the ribosome. The large ribosomal subunit stimulates GTP hydrolysis when there is complementary base-pairing between the mRNA and the cognate aa-tRNA. A peptide bond is formed once the aa-tRNA has been accommodated in the acceptor site, whereupon translocation can occur, regenerating the ribosome. Finally, post-translational modifications complete the synthesis of these peptide antibiotics.
Lantibiotics are the major group of ribosomally synthesized antibiotics in Bacilli. They contain lanthionine, which is formed post-translationally through dehydration of serine or threonine residues followed by addition of neighbouring cysteine thiol groups, leading to inter-residual thioether bonds [24, 25]. Based on structural properties, two types of lantibiotics are distinguished: type A, with a more linear secondary structure, and type B, with a more globular one .
Subtilin and ericin are members of the type A group and kill Gram-positive bacteria by forming voltage-dependent pores in the cytoplasmic membrane. Mersacidin belongs to the type B group and inhibits cell wall biosynthesis by complexing lipid II. Other unusual lantibiotics produced by Bacilli are sublancin and subtilosin, which also act against Gram-positive bacteria through as yet unknown mechanisms. The organization of these gene clusters is shown in Figure 2.
Subtilin biosynthesis starts from the prepeptide SpaS, which is post-translationally modified by SpaBC. The translocator SpaT then exports the lantibiotic. Immunity of the producer strain is conferred by the lipoprotein SpaI and the ABC transporter SpaFEG. In a positive feedback loop, subtilin activates the two-component regulatory system SpaRK (response regulator and sensor histidine kinase) and directly stimulates expression of the genes involved in biosynthesis and immunity [32, 33]. SpaRK expression is also controlled by the sporulation transcription factor SigH.
Ericin has the same gene cluster organization as subtilin but, surprisingly, contains two structural genes, eriA and eriS. Nevertheless, the production of ericin A and ericin S, which differ in amino acid composition and ring structure, is under the control of the same synthetase (EriBC).
|Figure 2: Bacillus subtilis lantibiotics, lantibiotic-like peptides and specifying gene clusters.|
|The organisation of the gene clusters (boxed) specifying lantibiotics and lantibiotic-like peptides is presented along with schematic structure representations of the mature peptides. The size of the gene clusters is given in kilobases (kb). Black boxes indicate structural genes and genes involved in post-translational modification and transport; grey boxes indicate regulatory genes; filled boxes stand for immunity genes. The figure is reproduced from .|
mrsA is the structural gene in the mersacidin gene cluster, whereas the genes mrsM and mrsD are involved in its post-translational modification . Furthermore, mrsT, coding for a transporter with an associated protease domain, mediates the transport while the operon mrsFGE, an ABC transporter, confers self-protection against the lantibiotic. mrsR1 is a response regulator that controls biosynthesis of mersacidin whereas the putative two component regulatory system mrsR2K2 controls immunity [36, 37].
The structural gene for sublancin biosynthesis is sunA, which belongs to the B. subtilis temperate bacteriophage SPβ. An ABC transporter (SunT) and two thiol-disulphide oxidoreductases (BdbAB) belong to the same locus. So far, only BdbB has been proven to be involved in sublancin production, most probably in the formation of the disulphide bonds. The genes conferring immunity have not been identified.
Finally, the gene cluster of subtilosin (sbo-alb) encodes AlbA protein, probably involved in post-translational modification of presubtilosin, and AlbBCD proteins, a putative ATP-binding transporter, involved in immunity . The expression of alb genes is under the negative control of AbrB .
Structural diversity is a predominant feature of nonribosomally synthesized peptides, as they are assembled from an exceedingly heterogeneous group of precursors. There are more than 300 members in this group including pseudo, nonproteinogenic, hydroxy, N-methylated and D-amino acids . In contrast, ribosomal synthesis of peptides is restricted to 20 amino acids.
In spite of their structural heterogeneity, the peptide antibiotics of this group share a common mode of synthesis, the multicarrier thiotemplate mechanism. According to this model, peptide bond formation takes place on multienzymes designated nonribosomal peptide synthetases (NRPSs), which are arranged in modules. Modules are the units responsible for the incorporation and/or modification of a specific amino acid in the peptide product, and their arrangement and number are usually colinear with the amino acid sequence and the length of the peptide, respectively (colinearity rule) [45, 46]. Modules are further divided into domains, the enzymatic units involved in a specific step of synthesis, such as substrate activation, covalent binding or elongation.
According to the multicarrier thiotemplate mechanism, the carboxy group of the amino acid is activated to the corresponding adenylate by ATP hydrolysis and is then transferred onto the free thiol group of an enzyme-bound 4'-phosphopantetheinyl cofactor (4'-PP), forming a thioester. At this stage, the substrates can undergo modifications such as epimerization or N-methylation. The peptide is assembled through successive peptide bond formation steps, in which the thioester-activated carboxyl group held by the upstream module reacts with the free amino group of the substrate on the adjacent downstream module. During this stepwise N-to-C elongation, the intermediates remain covalently attached to the multienzyme complex. Synthesis is terminated by the release of the thioester-bound peptide product through hydrolysis, cyclization or transfer to a functional group [19, 44, 48]. As an example, Figure 3 shows a prototype NRPS assembly line for the cyclic lipoheptapeptide surfactin.
|Figure 3: Surfactin assembly line.|
|The multienzyme complex consists of seven modules, which are responsible for the incorporation of seven amino acids. 24 domains catalyse the same number of chemical reactions. The peptide chain is elongated stepwise from the N- to the C-terminus. The last domain is responsible for the release and cyclization of surfactin. The figure is reproduced from .|
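The colinearity rule and the module/domain counts given for surfactin can be illustrated with a minimal Python sketch. The per-module substrates and domain strings below are an assumption based on the commonly reported organisation of the surfactin synthetase; they are not listed in this text. The names `Module`, `SURFACTIN_LINE`, `predicted_peptide` and `domain_count` are hypothetical helpers introduced only for illustration. The sketch also assumes that an E-domain in a module always yields the D-isomer at that position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    substrate: str   # amino acid selected by the module's A-domain
    domains: tuple   # ordered domain abbreviations of the module


# Assumed surfactin assembly line (illustrative, not taken from the text).
SURFACTIN_LINE = (
    Module("Glu", ("C", "A", "T")),
    Module("Leu", ("C", "A", "T")),
    Module("Leu", ("C", "A", "T", "E")),
    Module("Val", ("C", "A", "T")),
    Module("Asp", ("C", "A", "T")),
    Module("Leu", ("C", "A", "T", "E")),
    Module("Leu", ("C", "A", "T", "TE")),
)


def predicted_peptide(modules):
    """Read the peptide off the module order (colinearity rule).

    Simplification: a module containing an E-domain contributes the D-isomer.
    """
    residues = []
    for module in modules:
        prefix = "D-" if "E" in module.domains else "L-"
        residues.append(prefix + module.substrate)
    return residues


def domain_count(modules):
    """Total number of domains across all modules."""
    return sum(len(module.domains) for module in modules)


if __name__ == "__main__":
    print("-".join(predicted_peptide(SURFACTIN_LINE)))
    print(len(SURFACTIN_LINE), "modules,", domain_count(SURFACTIN_LINE), "domains")
```

Under these assumptions the sketch reproduces the seven modules and 24 domains stated in the caption of Figure 3; it does not model the fatty acid moiety or the cyclization step.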
Domains are not just imaginary sections in the module. They are enzymatically active, as well as structurally and catalytically independent. They can be excised from the peptide chain and still retain their activity .
Nonribosomal peptide synthesis is initiated by the recognition and activation of the designated substrate. This is the role of the adenylation domain (A), which recognizes the suitable amino acid substrate and incorporates it into the peptide. At the expense of Mg2+-ATP and with release of PPi, the amino acid is activated as an aminoacyl adenylate (Fig. 4A) [50, 51, 52]. There is a specific adenylation domain for each amino acid included in the peptide antibiotic, and its location indicates the primary structure of the product. Sequence comparison of A-domains (ca. 550 aa) from various genes coding for peptide synthetases revealed 10 residues as the major determinants of substrate specificity; this result was confirmed by introducing specific point mutations at these sites [45, 53, 54].
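The idea that ten residues largely determine A-domain specificity suggests a simple lookup: extract the ten-residue signature of a new A-domain and compare it with signatures of known specificity. The following Python sketch shows only the comparison step. The reference strings in `PLACEHOLDER_CODES` are invented placeholders, not real specificity codes, and `identity` and `best_match` are hypothetical helper names; extracting the ten positions from a full sequence is not shown.

```python
def identity(signature_a, signature_b):
    """Fraction of identical positions between two equal-length signatures."""
    if len(signature_a) != len(signature_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(signature_a, signature_b))
    return matches / len(signature_a)


def best_match(signature, reference_codes):
    """Return (substrate_name, identity) of the closest reference signature."""
    scored = [
        (identity(signature, code), name)
        for name, code in reference_codes.items()
    ]
    score, name = max(scored)
    return name, score


# Invented placeholder signatures, for illustration only.
PLACEHOLDER_CODES = {
    "substrate_1": "AAAAACCCCC",
    "substrate_2": "DDDDDEEEEE",
    "substrate_3": "GGGGGHHHHH",
}

if __name__ == "__main__":
    print(best_match("AAAAACCCCD", PLACEHOLDER_CODES))
```

A real prediction would need curated reference signatures and a threshold below which no substrate is assigned; both are outside the scope of this sketch.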
The thiolation domain (T), also known as the peptidyl carrier protein (PCP) domain, accepts the activated amino acid. The prerequisite for the functionality of the T-domain is its post-translational modification with the 4'-phosphopantetheine cofactor (4'-PP). Associated 4'-phosphopantetheinyl transferases catalyze the transfer of the 4'-PP moiety from coenzyme A to a conserved serine residue of the T-domain, thus converting the inactive apo-T into the active holo-T (see also Fig. 6) [43, 55, 56]. The aminoacyl adenylate from the A-domain then forms a thioester with the cysteamine thiol group of the 4'-PP cofactor and can therefore be transported to the next module (Fig. 4B) [43, 47, 55, 57, 58]. The thiolation domain comprises around 100 amino acid residues and is located downstream of the adenylation domain. It represents the transport unit that enables the elongation intermediates to move between the catalytic centers. The combination of adenylation and thiolation domains is referred to as the initiation module, since both domains are required to activate and covalently tether the first building block in peptide synthesis.
|Figure 4: Domain catalyzed reactions.|
|A) The adenylation domain recognizes and activates the suitable amino acid as aminoacyl adenylate at the expense of ATP. B) Covalent attachment of the activated aminoacyl adenylate onto the free thiol group of the 4'-phosphopantetheine cofactor bound to the peptidyl carrier domain. C) Peptide elongation by the condensation domain, which catalyses the attack of the nucleophilic amine of the acceptor substrate onto the electrophilic thioester of the donor substrate. A1- A2, adenylation domains; PCP, thiolation / peptidyl carrier domain; C, condensation domain; d and a, donor and acceptor sites on condensation domain; ppan, 4'-phosphopantetheine cofactor. Domains in action are indicated in red. The figure is reproduced from .|
The condensation domain (C), ca. 450 amino acids in length, is responsible for the formation of the peptide bond between two activated amino acids on adjacent modules and therefore controls the elongation of the growing peptide chain. It catalyses the attack of the nucleophilic aminoacyl-S-4'-PP-T on the electrophilic aminoacyl/peptidyl-S-4'-PP-T, which lie downstream and upstream of the C-domain, respectively (Fig. 4C). For this purpose the C-domain harbors two selective substrate-binding sites: an enantioselective electrophilic donor site and an amino acid-selective nucleophilic acceptor site. The acceptor site is responsible for preventing internal mis-initiation as well as for controlling the timing of substrate epimerization, whereas the donor site is responsible for incorporating the correct isomer [60, 62].
The C-domain is found between two consecutive initiation modules located on the same synthetase (intramolecular amino acid transfer). In case the initiation modules belong to different synthetases, the C-domain is located at the N-terminus of the one accepting the substrates (intermolecular amino acid transfer). Peptide synthetases involved in lipopeptide biosynthesis contain an additional C-domain preceding the first initiation module, probably involved in the coupling of the fatty acid moiety to the first amino acid of the peptide moiety .
Variations in the peptide backbone can be obtained by the replacement of C-domains with the structurally and mechanistically related heterocyclization (Cy) domains. Five-membered heterocyclic rings such as oxazoline in vibriobactin or thiazoline in bacitracin are common features of nonribosomal peptides and are significant for chelating metals and for interactions with proteins, RNA and DNA [48, 64]. The formation of such heterocyclic rings and the subsequent peptide elongation are catalyzed by Cy-domains through the nucleophilic attack of a T-bound cysteine, threonine or serine acceptor substrate on the thioester of the donor substrate. As observed for C-domains, the free α-amino group of the cysteine, threonine or serine is the nucleophile. Subsequently, the side-chain hydroxyl or thiol group carries out a nucleophilic attack on the α-carbonyl C atom of the donor amino acid, producing a heterocyclic ring. Finally, the product is dehydrated to form oxazoline or thiazoline (Fig. 5A) [65, 66].
|Figure 5: Schematic representation of the catalytic functions of Cy-, TE-, E- and N-MT-domains.|
|A) Formation of thiazoline heterocyclic rings from cysteine precursors catalyzed by the Cy-domain. Three reactions are catalyzed by Cy-domains: amide bond formation, cyclization and dehydration. Cy, heterocyclization domain. Example from the yersiniabactin nonribosomal synthetase present in Yersinia pestis. ArCP, aryl carrier protein; Sal-S-ArCP, activated salicyl group on the N-terminal ArCP. The figure is reproduced from .|
B) Peptide release by the TE-domain. Peptide release is achieved either by external nucleophile water resulting in a linear product (A) or by an internal nucleophile resulting in a cyclic product (B), depending on the NRPS template. The figure is reproduced from .
C) Peptide synthesis order in the presence of an E-domain within the elongation modules. 1. Substrate adenylation by the A-domain. 2. Transfer of the activated amino acid to the PCP domain. 3. Binding to the upstream C-domain acceptor site and formation of the peptide bond. 4. The resulting peptidyl-PCP has lower affinity for the acceptor site and is transferred to the subsequent E-domain. An equilibrium of D/L isomers is produced. 5. Binding of the D-isomer to the donor site of the downstream C-domain. AA, amino acid; AAx, upstream peptidyl chain; E, epimerization domain. The figure is reproduced from
D) Cyclization strategies. The majority of cyclization reactions within NRPS are catalyzed by TE-domains. A putative C-domain accounts for cyclization of cyclosporine A while a T-C domain controls oligomerization of the trilactone enniatin. A reductase domain (R) is responsible for cyclization of the imine nostocyclopeptide. The figure is reproduced from 
E) N-Methylation of nonribosomal peptides by embedded N-MT domains. N-methylation occurs on the aminoacyl thioester monomer prior to amide bond formation with the upstream peptidyl chain. Example from the yersiniabactin nonribosomal synthetase in Yersinia pestis. The figure is reproduced from .
The thioesterase domain (TE), ca. 250 amino acids in length, is responsible for the release of the peptide from the multienzyme complex. During synthesis, the growing peptide chain is transported between the T-domains of successive modules from the N- to the C-terminus of the synthetase until it reaches the final module. This module usually contains the TE-domain, which liberates the product in a two-step process. This involves an acyl-O-TE-enzyme intermediate that is attacked by either a peptide-internal nucleophile [67, 68] or water, and results in a macrocyclic or a linear product (Fig. 5B).
TE-domains are very diverse, since they catalyze various reactions (Fig. 5D). In the case of tyrocidine (B. brevis), head-to-tail cyclization is achieved by amide bond formation between the N-terminal amine and the C-terminus of the peptide, yielding a lactam product, whereas for surfactin and mycosubtilin (B. subtilis et al.) lipo-branched-chain cyclization is accomplished by connecting a β-hydroxy or a β-amino fatty acid to the C-terminus, yielding a lactone or a lactam, respectively [63, 70]. The same situation is observed for the calcium-dependent antibiotic (CDA) produced by Streptomyces coelicolor A3(2). For fengycin (B. subtilis et al.) and syringomycin (P. syringae), amino acid-branched-chain cyclization uses a tyrosine and a serine from the peptide chain, respectively, as nucleophiles, which distinguishes them from other peptide antibiotics that use the β-hydroxyl group of the attached fatty acid moiety [74, 75, 76]. Some TE-domains do not cyclize a single peptide chain, but force the multienzyme to repeat the synthesis once or twice more. They are thus able to count the assembled peptide monomers and to initiate release by cyclic dimer or trimer formation once the desired length has been reached. This mechanism, though not yet fully characterized, applies to the synthesis of gramicidin S (B. brevis), enterobactin (E. coli) and bacillibactin (B. subtilis). Because they control such different cyclization mechanisms, TE-domains show a high degree of specialization and therefore share low sequence homology (10%-15%).
Nevertheless, cyclization is not accomplished exclusively by TE-domains. For cyclosporin A (Tolypocladium niveum), a putative C-domain is responsible for the final peptide bond, whereas for enniatin (Fusarium scirpi), a T-C didomain accounts for the oligomerization. In the case of nostocyclopeptide (Nostoc sp.), the C-terminal residue of the linear peptide is reduced by a reductase domain (R-domain) to give an aldehyde, which is captured intramolecularly by the α-amino group of the N-terminal amino acid residue to produce a cyclic imine.
The epimerization domain (E) controls the conversion of amino acids belonging to the attached growing peptide chain from the L- to the D-configuration. These domains (ca. 450 amino acids in length) are usually located internally in the synthetases, upstream of the condensation domain. They represent a class of cofactor-independent amino acid epimerases that catalyze the de- and reprotonation of the α-carbon atom of an enzyme-bound aminoacyl- or peptidyl-S-4'-PP thioester in both directions (L-to-D and D-to-L), resulting in a mixture of both isomers. However, the L-isomer is rejected by the enantioselective donor site of the following C-domain, whereas the D-isomer is used by the same domain for the elongation of the peptide chain [83, 84].
If the E-domain is part of the initiation module, the two isomers equilibrate while the amino acid is bound as a thioester to the thiolation domain, prior to peptide bond formation [83, 85, 86]. The downstream C-domain is selective for the D-isomer, which is eventually incorporated [60, 62]. However, if an E-domain is embedded in an elongation module, epimerization occurs at the peptidyl-4'-PP-T stage. The corresponding A-domain recognizes and activates the L-isomer, which is then transferred onto the following T-domain and then to the upstream C-domain for peptide bond formation. The E-domain then acts to produce a D/L equilibrium of peptidyl-S-4'-PP thioesters. Finally, the downstream C-domain catalyses the transfer of only the D-isomer to the next elongation module (Fig. 5C) [61, 84, 87].
Quite rarely, D-amino acids are present in the peptides independently of the catalytic function of E-domains. These substrates are first epimerized by racemases, which are not integrated in the peptide synthetase, and are then recognized and incorporated by the corresponding A-domain. This is the case for D-Ala1 in the cyclosporine synthetase.
The N-methyltransferase (N-MT) and C-methyltransferase (C-MT) domains are responsible for the N- or C-methylation of amino acid residues, which makes the peptide less susceptible to proteolytic breakdown. N-MT, which is usually located between the corresponding A- and T-domains, catalyzes the transfer of the S-methyl group from S-adenosyl methionine (SAM) to the α-amino group of the thioesterified amino acid (Fig. 5E). This reaction takes place prior to peptide bond formation, as determined for the enniatin synthetase. C-MT domains appear rarely in nonribosomal peptide synthetases, but also use SAM as the methyl donor.
Nonribosomal peptide synthetases require post-translational modification to be functionally active. As already mentioned, thiolation domains are unable to serve as transport proteins immediately after translation, which blocks peptide synthesis. Transfer of the 4'-PP moiety of coenzyme A onto a conserved serine residue of each T-domain converts the latter from the apo- to the holo-form and unblocks the synthesis. The mobile 4'-PP prosthetic group is about 20 Å in length and, since it is covalently bound as a phosphodiester to the multienzyme, it serves as a “flexible arm”, which initially accepts the activated substrates and later delivers them to the next building block [43, 55, 57]. The conversion of the T-domain is catalyzed by a dedicated 4'-phosphopantetheinyl transferase (4'-PPTase) in a Mg2+-dependent manner, thereby releasing 3',5'-ADP (Fig. 6) [92, 93]. The Sfp and Gsp proteins control this reaction in B. subtilis and B. brevis, respectively [56, 92, 93].
Recent studies have shown that Sfp accepts CoA derivatives, such as acetyl-CoA and aminoacyl-CoA, as substrates [60, 94]. It is therefore likely that PPTases also modify the T-domains of NRPSs with acyl-4'-PP, rendering the enzyme inactive, as misprimed transport units are unable to accept activated amino acids. The activity can be restored by type II thioesterases (TE-IIs), which are found in association with the peptide synthetases and hydrolyze the acyl-4'-PP, leaving only the 4'-PP bound. TE-IIs act as proofreading enzymes, since they preferentially hydrolyze acetyl-Ts rather than aminoacyl- or peptidyl-Ts. Consequently, holo-T domains capable of nonribosomal peptide synthesis are generated either by direct priming of the apo-forms, catalyzed by PPTases, or by deblocking of misprimed forms, catalyzed by TE-IIs.
|Figure 6: Conversion of thiolation domain from apo- to holo-form.|
|The 4'-phosphopantetheine moiety of coenzyme A is covalently attached onto an invariant serine residue of the thiolation domain (PCP) by dedicated phosphopantetheinyl transferases; thus PCP-domains are activated. The figure is reproduced from .|
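The two routes to an active holo-T domain described above (direct priming by a PPTase, or mispriming followed by deblocking by a TE-II) can be written as a small state machine. This is a minimal Python sketch under simplifying assumptions: `TState`, `pptase`, `te2` and `can_accept_amino_acid` are hypothetical names, every CoA derivative other than free CoA is treated as mispriming, and the substrate preference of TE-IIs is ignored.

```python
from enum import Enum


class TState(Enum):
    APO = "apo"              # unmodified, inactive T-domain
    HOLO = "holo"            # carries free 4'-PP, active
    MISPRIMED = "misprimed"  # carries acyl-4'-PP, blocked


def pptase(state, donor):
    """PPTase step: acts only on apo-T.

    donor == "CoA" gives holo-T; any other CoA derivative
    (e.g. "acetyl-CoA") gives a misprimed T-domain.
    """
    if state is not TState.APO:
        return state
    return TState.HOLO if donor == "CoA" else TState.MISPRIMED


def te2(state):
    """TE-II step: hydrolyzes the acyl group of a misprimed T-domain."""
    return TState.HOLO if state is TState.MISPRIMED else state


def can_accept_amino_acid(state):
    """Only holo-T can accept an activated amino acid."""
    return state is TState.HOLO


if __name__ == "__main__":
    direct = pptase(TState.APO, "CoA")
    rescued = te2(pptase(TState.APO, "acetyl-CoA"))
    print(direct, rescued)
```

Both paths end in the holo state, which mirrors the statement that functional T-domains arise either by direct priming or by deblocking of misprimed forms.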
In recent years, an increasing number of peptide synthetases have been identified that contain domains normally present in fatty acid synthases (FASs) or polyketide synthases (PKSs). The first mixed NRPS-PKS biosynthetic gene cluster to be determined was that of rapamycin in Streptomyces hygroscopicus, which contains an NRPS module for the incorporation of pipecolic acid into the polyketide [97, 98]. In addition, the synthesis of melithiazol and myxothiazol requires six multifunctional enzymes that switch back and forth between NRPS and PKS [99, 100]. Furthermore, hybrid systems of peptide synthetase and fatty acid synthase, such as those for mycosubtilin and iturin, have been characterized in various Bacillus strains [63, 101]. Most recently, a genomic island (54 kb) consisting of three nonribosomal peptide synthetases, three polyketide synthases and two hybrid NRPS/PKS synthases was identified in pathogenic E. coli strains of the B2 group. Interestingly, E. coli strains expressing this gene cluster were shown to induce double-strand breaks in eukaryotic cells, leading to cell death.
Fatty acids are essential for primary and secondary metabolism, because they are used as a form of energy storage, but also as building blocks for cell membranes or for nonribosomally synthesized peptides. The fatty acid synthase (FAS) of bacteria is a multienzyme complex that consists of individual, highly conserved enzymes [103, 104].
The first step in fatty acid production is the synthesis of malonyl-CoA from acetyl-CoA and CO2, which involves the biotin carboxyl carrier protein and is catalyzed by biotin carboxylase [105, 106]. The malonyl units are subsequently transferred to the 4'-PP of the holo-acyl carrier protein (ACP) by the action of malonyl-CoA:ACP transacylase. The acylated β-ketoacyl-ACP synthase III is then in a position to initiate chain elongation via condensation with malonyl-ACP and release of CO2, resulting in an ACP-bound acyl chain that is extended by a C2 unit. The β-carbon of the intermediate tethered to the ACP is reduced by a ketoacyl-ACP reductase (KR) and then dehydrated by a β-hydroxyacyl-ACP dehydratase (DH) (Fig. 7A). Finally, the enoyl-ACP reductase (ER) catalyzes the reduction of the β-carbon to CH2. This elongated acyl-ACP can participate in subsequent rounds of synthesis that involve additional keto synthases (KSs) with different substrate selectivities [19, 100].
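Because each round of condensation with malonyl-ACP extends the chain by one C2 unit, the number of elongation rounds needed for a given chain length follows from simple arithmetic. The Python sketch below assumes a C2 (acetyl) primer by default; the function name `elongation_rounds` and the `primer_carbons` parameter are hypothetical and only serve the illustration.

```python
def elongation_rounds(target_carbons, primer_carbons=2):
    """Number of C2 elongation rounds needed to reach a chain length.

    Each round consumes one malonyl unit and, after loss of CO2,
    adds two carbons to the ACP-bound acyl chain.
    """
    extension = target_carbons - primer_carbons
    if extension < 0 or extension % 2 != 0:
        raise ValueError("chain length is not reachable from this primer in C2 steps")
    return extension // 2


if __name__ == "__main__":
    # A C16 chain built on a C2 primer needs (16 - 2) / 2 rounds.
    print(elongation_rounds(16))
```

For a C16 chain on a C2 primer this gives seven rounds, each followed by the KR, DH and ER steps described above. A different primer length can be passed through `primer_carbons`.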
Polyketides are secondary metabolites which are synthesized on modularly organized giant multienzymes (polyketide synthases, PKSs) by decarboxylative Claisen condensations. In general, their biosynthetic pathway shares similarities to nonribosomally synthesized peptides and requires at least three domains .
The acyltransferase (AT) domain is responsible for the selection of the substrate, which can be malonyl-, methyl-, ethyl- or propylmalonyl-CoA. This is a significant difference from FASs, whose substrate selectivity is limited to malonyl-CoA. The AT-domain then transfers the chosen substrate to the 4'-PP of the corresponding holo-ACP, which is analogous to the transport protein of FASs. As in NRPSs, ACPs are post-translationally modified by 4'-phosphopantetheinyl transferases. The malonyl derivative is then relocated to an active-site cysteine residue of the KS-domain. The substrate of the next module binds to the ACP-domain and is decarboxylated, resulting in the free nucleophile necessary for the subsequent Claisen condensation with the KS-bound ketide. In this way, an enzyme-bound β-ketoacyl intermediate is generated. The intermediates are passed along the synthase according to the indicated elongation steps, and finally a TE-domain catalyzes the cleavage of the product by macrocyclization. As in the case of NRPSs, the order of the modules determines the sequence of polyketide synthesis (Fig. 7B).
“Optional” domains, such as KR-, DH- and ER-domains, are also observed in PKSs, and they operate in a similar manner to those used by FASs [110, 111]. In general, even though fatty acid and polyketide synthases share striking architectural and organizational similarities with the peptide synthetases, they are more closely related to each other.
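The effect of the optional domains on the β-carbon can be sketched as a short decision chain: each domain acts on the product of the previous one, so DH requires KR and ER requires DH. In the Python sketch below, the function name `beta_carbon_state` and the state labels are hypothetical; the order keto, hydroxyl, enoyl, methylene follows the KR, DH and ER steps described for FASs.

```python
def beta_carbon_state(optional_domains):
    """Oxidation state of the beta-carbon left by one PKS module.

    optional_domains: iterable of domain names present in the module,
    drawn from "KR", "DH" and "ER". A domain only acts if the domain
    producing its substrate is also present.
    """
    present = set(optional_domains)
    state = "keto"                  # product of the Claisen condensation
    if "KR" in present:
        state = "hydroxyl"          # keto group reduced
        if "DH" in present:
            state = "enoyl"         # water eliminated, double bond formed
            if "ER" in present:
                state = "methylene" # double bond reduced to CH2
    return state


if __name__ == "__main__":
    for domains in ((), ("KR",), ("KR", "DH"), ("KR", "DH", "ER")):
        print(domains, "->", beta_carbon_state(domains))
```

A module with only a KR-domain therefore leaves a hydroxyl group, which matches the dimodular example in Figure 7B; a FAS, which always uses all three activities, corresponds to the fully reduced case.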
|Figure 7: FASs and PKSs; multienzyme complexes with distinct domains.|
|A. Fatty acid synthases (FASs). A malonyl residue loaded onto the central ACP is condensed with an acyl chain bound to the KS. After condensation with release of CO2, the β-keto group is first reduced by a KR, dehydrated by a DH and finally reduced to the methylene group by an ER. ACP, acyl carrier protein; KS, keto synthase; KR, ketoacyl-ACP reductase; DH, β-hydroxyacyl-ACP dehydratase; ER, enoyl-ACP reductase. B. A fictitious dimodular polyketide synthase (PKS). The ACP of the first module is loaded with propionyl by the AT domain of the first module, while the second AT domain loads its ACP with methylmalonyl. The propionyl residue is translocated to an active-site cysteine of the KS-domain, whereas the methylmalonyl is decarboxylated, resulting in the nucleophile for the condensation with the KS-bound propionyl. The product of condensation is covalently tethered to the 4'-PP present at the ACP of the second module. The KR domain reduces the β-carbonyl group to a hydroxyl group. ACP, acyl carrier protein; AT, acyl transferase; KS, keto synthase; KR, ketoacyl-ACP reductase. The figure is reproduced from .|
Nonribosomally synthesized peptide antibiotics are widespread among Bacilli. Some of them are characteristically produced by only one member of the genus, whereas others are more conserved. More information concerning their diversity and distribution has now accumulated, partly as a result of the increased number of sequenced genomes. Due to their conserved genetic structure and huge size, these synthetases can be easily recognized. Together with the polyketide synthases, they are the largest operons in the genome. This section summarizes current knowledge of how the best-studied antibiotics of this group are organized and operate.
Different Bacillus strains produce small cyclic peptides with a long fatty acid moiety, the so-called lipopeptides. Based on their structure, they can be generally classified into three different groups: i) the surfactin, ii) the fengycin [76, 113, 114] and iii) the iturin group.
Surfactin is a heptapeptide linked via a lactone bond to a β-hydroxy fatty acid composed of 13 to 15 carbon atoms (Fig. 8A) [116, 117]. Its operon comprises four open reading frames (ORFs) encoding the proteins SrfAA, SrfAB, SrfAC and SrfAD (Fig. 9A) [49, 118, 119, 120, 121]. The SrfAC protein ends with a TE-domain, responsible for peptide release and cyclization, whereas the following protein, SrfAD, shows high homology to TE-IIs. Remarkably, disruption of this gene leads to a severe reduction, but not abolishment, of the antibiotic’s production [95, 96]. Furthermore, SrfAD has a dual role: it hydrolyzes 4'-PP-bound acetyl groups of misprimed NRPSs, in keeping with its TE-II function, and it mediates the transfer of the fatty acid substrate to the Glu-module, stimulating β-hydroxyacyl-glutamate formation. In general, the number of amino acids and their configuration agree fully with the organization of modules and domains on the surfactin synthetase, confirming the colinearity rule mentioned earlier. An example is the presence of two D-configured amino acids that correspond exactly to the positions of the two epimerization domains.
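The colinearity rule can be made concrete with a minimal sketch in which the peptide is read directly off an ordered list of modules. The residue assignments and the distribution of modules over SrfAA, SrfAB and SrfAC used here (Glu, Leu, D-Leu, Val, Asp, D-Leu, Leu; three, three and one module) are taken from the general literature on surfactin rather than from the text above, and the class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    protein: str                 # synthetase subunit carrying the module
    substrate: str               # amino acid activated by the A-domain
    has_e_domain: bool = False   # epimerization domain present?


# Assumed module order of the surfactin synthetase (literature values).
SURFACTIN_SYNTHETASE = [
    Module("SrfAA", "Glu"),
    Module("SrfAA", "Leu"),
    Module("SrfAA", "Leu", has_e_domain=True),
    Module("SrfAB", "Val"),
    Module("SrfAB", "Asp"),
    Module("SrfAB", "Leu", has_e_domain=True),
    Module("SrfAC", "Leu"),
]


def predict_peptide(modules):
    """Apply the colinearity rule: one residue per module, in module order.

    A module with an epimerization domain yields the D-configured residue.
    """
    return [("D-" if m.has_e_domain else "L-") + m.substrate for m in modules]


peptide = predict_peptide(SURFACTIN_SYNTHETASE)
d_positions = [i + 1 for i, res in enumerate(peptide) if res.startswith("D-")]
print("-".join(peptide))
print("D-residues at positions:", d_positions)
```

In this representation the two D-configured residues appear exactly at the positions of the two modules flagged with an epimerization domain, which is the correspondence described in the paragraph above.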
Surfactin is one of the best characterized lipopeptides, since it possesses various beneficial abilities. Firstly, surfactin is able to lower surface and interfacial tension, thanks to its amphiphilic structure. In particular, surfactin produced by B. subtilis ATCC 21332 is considered one of the most powerful biosurfactants, since it can lower the surface tension of water from 72 to 28 mN/m at concentrations as low as 24 μM [123, 124]. Furthermore, surfactin is responsible for the inhibition of fibrin clot formation and for erythrocyte lysis. Other beneficial properties, with potential biotechnological and pharmaceutical applications, are: i) antitumor activity, ii) activity against enveloped viruses, and iii) antibiotic function against the protoplasts of B. megaterium and against Mycoplasma [129, 130]. Furthermore, the srf operon encodes the regulatory gene comS, which is involved in the development of genetic competence, an active process aimed at acquiring new genetic material that enables the cell to survive under changing environmental conditions. Surfactin is also essential for swarming motility [132, 133, 134, 135], a flagellum-driven social form of surface locomotion, as well as for the formation of biofilms, i.e. surface-associated multicellular communities [136, 137].
Fengycin, synonymous to plipastatin, is a cyclic decapeptide linked to a β-hydroxy fatty acid moiety, with lengths that vary from 14 to 18 carbon atoms (Fig. 8E, 9B) [138, 139, 140]. Fengycin demonstrates strong surface activity, although lower compared to surfactin . Fengycin is active against filamentous fungi [76, 139, 140], and inhibits the enzymes phospholipase A2  and aromatase .
Iturin, mycosubtilin and bacillomycin belong to the same group of lipopeptides. These compounds consist of seven α-amino acids and one β-amino fatty acid, which distinguishes them from the groups already mentioned. The peptide moiety contains a tyrosine in the D-configuration at the second amino acid position as well as two additional D-amino acids at positions three and six (Fig. 8B, 8C, 8D). Gene sequences encoding enzymes for the biosynthesis of iturin A and mycosubtilin, but not bacillomycin D, have been reported (Fig. 9C) [63, 101]. These revealed that the lipopeptides are synthesized on hybrid synthases, since domains homologous to fatty acid and polyketide synthases are situated at their N-terminus. These domains are absent from the peptide synthetases of the surfactin and fengycin groups, so it appears very likely that they are involved in the incorporation of the β-amino fatty acid moiety into the peptides of the iturin group. Moreover, these antibiotics exhibit strong antifungal and hemolytic activities, whereas their antibacterial function is more limited [76, 115].
|Figure 8: Schematic structure of various lipopeptides produced by Bacilli.|
|A Surfactin, n = 10-12, B Iturin A, n = 10-13, C Mycosubtilin, n = 10-13, D Bacillomycin D, n = 10-13, E Fengycin, n = 13-17, F Lichenysin, n = 9 -14|
The above-mentioned lipopeptides are produced by different Bacilli, such as B. subtilis and B. cereus. However, one lipopeptide with a structure similar to surfactin is produced exclusively by B. licheniformis [144, 145, 146]. It is designated lichenysin and is a cyclic heptapeptide with a β-hydroxy fatty acid moiety composed of 12-17 carbon atoms (Fig. 8F, 9D). It demonstrates antimicrobial properties and reduces the surface tension of water [144, 146]. In particular, lichenysin A can cause a reduction in water surface tension similar to that of surfactin from B. subtilis ATCC 21332, albeit at a lower concentration (12 μM versus 24 μM).
Another nonribosomally synthesized antibiotic compound is bacitracin, found in B. licheniformis [147, 148]. This thiazoline ring-containing dodecapeptide is synthesized by the large multienzyme complex BacABC (Fig. 9E). Bacitracin is a prominent inhibitor of cell wall biosynthesis and is most active against Gram-positive bacteria. However, B. licheniformis and several other Gram-positive bacteria are not susceptible to this antibiotic, suggesting the existence of specific resistance mechanisms. Its primary mode of action is the formation of a tight ternary complex with the peptidoglycan carrier C55-isoprenyl pyrophosphate (IPP) and a divalent metal cation. This carrier is responsible for the translocation of cell envelope building blocks from the cytosol to the external side of the cytoplasmic membrane, where they are incorporated into the macromolecular network of the cell envelope (i.e. peptidoglycan, teichoic acids and polysaccharide capsule). Binding of bacitracin to IPP prevents its recycling by dephosphorylation to the monophosphate form that is normally reloaded on the inner face of the membrane [150, 151].
Another member of the Bacillus genus, B. brevis, produces two cyclic decapeptides, tyrocidine and gramicidin S (Fig. 5D, 9F) [52, 84, 152]. The first characteristically contains a nonproteinogenic residue, L-ornithine, and acts as an antibiotic by membrane perturbation [17, 52]. Gramicidin S is synthesized on the enzymes GrsTAB, where only five amino acids are activated and incorporated. However, the peptide is dimerized to the decapeptide prior to its release. Furthermore, gramicidin S exhibits strong antibacterial activity against Gram-positive and Gram-negative bacteria [153, 154], probably due to an interaction with membrane phospholipids. Gramicidin S thereby causes a phase separation of negatively charged phospholipids from other lipids, leading to a disturbance of the membrane’s osmotic barrier [155, 156].
|Figure 9: Schematic representation of peptide synthetase operons in Bacilli.|
|The genes comprising each peptide synthetase operon and their sizes are indicated. Organisation within the modules is presented, while the respective activated amino acid are depicted within the adenylation domains. A. Surfactin operon in B. subtilis . B. Fengycin operon in B. subtilis . C. Iturin A and mycosubtilin operons in B. subtilis [63, 101]. D. Lichenysin A operon in B. licheniformis . E. bacitracin operon in B. licheniformis . F. Tyrocidine and gramicidin S operons in B. brevis [52, 152]. The figure is adapted from .|
In the last few decades, the pathways that govern the synthesis of antibiotics on large multienzymes have been thoroughly studied. Significant progress has been made on the functional analysis of various domains as well as on the role of their assembly in the peptide synthetases. Moreover, high-resolution structures obtained for several enzymatic subunits from different antibiotics have led to a better understanding of their architectural organization, substrate specificity and catalytic action [70, 158, 159, 160]. In contrast, our knowledge of how the organism regulates expression of these systems, or of the mechanisms which govern export of the peptides and/or resistance to them, is rather limited. An exception is the case of surfactin, for which the regulation of gene expression has received increased attention due to its connection with the development of genetic competence.
The expression of surfactin is growth-phase dependent and is induced during the transition to stationary phase. Its transcription is driven by a σA-dependent promoter and its expression is regulated via a complex network, including the two-component regulatory system ComAP [161, 162]. ComP is the sensor histidine kinase that is autophosphorylated after sensing an increase in the concentration of the pheromone ComX. The phosphoryl group is then transferred to the response regulator ComA and activates it. Phosphorylated ComA can bind upstream of the srf operon and induce its expression. Therefore, systems involved in the phosphorylation or DNA-binding ability of ComA (ComXQ, RapC-CSF, RapF) indirectly modify the antibiotic’s expression [163, 164, 165, 166, 167]. PerR, a general repressor of the peroxide stress regulon, has been shown to positively regulate surfactin in a direct manner, independently of ComA. In contrast, CodY, a GTP-activated global regulator, acts as a direct repressor under casamino acid-rich conditions. Furthermore, YerP, a protein homologous to the RND (resistance, nodulation and cell division) family of efflux pumps in Gram-negative bacteria, seems to contribute to the secretion of surfactin and to the self-resistance of the producer strain against it.
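The regulatory relationships named above can be summarized as a small signed graph. This is only a bookkeeping sketch of the interactions stated in the paragraph, not a quantitative model; the Rap proteins and CSF are omitted because the direction of their effect is not specified here, and all identifiers are illustrative.

```python
# (regulator, target, sign): "+" activates, "-" represses.
EDGES = [
    ("ComX", "ComP", "+"),   # pheromone sensed by the histidine kinase
    ("ComP", "ComA", "+"),   # phosphotransfer activates the response regulator
    ("ComA", "srf", "+"),    # phosphorylated ComA induces the srf operon
    ("PerR", "srf", "+"),    # direct positive regulation, independent of ComA
    ("CodY", "srf", "-"),    # direct repression (casamino acid-rich conditions)
]


def direct_regulators(target, sign):
    """Return regulators acting directly on `target` with the given sign."""
    return [src for src, dst, s in EDGES if dst == target and s == sign]


def upstream(target):
    """Return every node from which `target` can be reached in the graph."""
    found, frontier = set(), [target]
    while frontier:
        node = frontier.pop()
        for src, dst, _ in EDGES:
            if dst == node and src not in found:
                found.add(src)
                frontier.append(src)
    return found


print("direct activators:", direct_regulators("srf", "+"))
print("direct repressors:", direct_regulators("srf", "-"))
print("all upstream nodes:", sorted(upstream("srf")))
```

Separating direct from upstream regulators mirrors the distinction drawn in the text between factors that bind at the srf promoter and systems that act indirectly through ComA.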
Knowledge on transcriptional regulation of the remaining lipopeptides is rather limited. The promoters of fengycin and iturin operons have been successfully identified and show similarity to a housekeeping σA promoter [101, 113]. Furthermore, deletion of degQ, a pleiotropic regulator gene that controls the production of several secreted and degradative enzymes , reduces severely the production of these antibiotics, via an unidentified mechanism [172, 173].
All these lipopeptides are post-translationally regulated by sfp, a 4'-phosphopantetheinyl transferase which converts T-domains to their active form (see corresponding chapter) [92, 137]. The importance of this gene is demonstrated in strains that contain intact synthetases but a dysfunctional sfp. B. subtilis strain 168 contains intact srf and fen operons but is unable to produce the antibiotics, due to a frameshift mutation in the sfp gene. However, when the strain is complemented with a functional 4'-PPTase, its antibiotic production is restored [172, 175].
The mechanisms that govern the regulation of lichenysin and bacitracin have been studied only on a preliminary basis. Lichenysin expression is dependent on the two-component regulatory system ComAP. In the case of bacitracin, an ABC transporter (BcrABC) conferring resistance to the producer strain against the antibiotic has been identified [177, 178]. It is located about 3 kb downstream of the bacitracin biosynthetic operon bacABC and its expression is induced by the dodecapeptide [150, 179]. Moreover, a two-component regulatory system, BacRS, situated between the bac operon and the bcrABC genes, negatively regulates expression of the transporter genes.
Transcription of the tyrocidine operon is driven by a typical σA promoter and its expression is induced at the end of exponential phase of growth. Spo0A, Spo0B and Spo0E, involved in the sporulation process, are required for full activation of the operon, whereas AbrB, a transition-phase regulator, acts as its repressor . Further studies revealed that AbrB inhibits tyrocidine expression directly by binding to the upstream region of tycA . Moreover, tycD and tycE, which are located downstream of the operon, show high similarity to members of the ABC transporter family and thus may confer immunity to the producer strain . However, their role remains to be verified.
Years of research have revealed that NRPSs and NRPS-PKS hybrids can produce biologically active compounds exhibiting high antimicrobial activity. Their modular architecture makes it possible to manipulate the enzymatic machinery in order to increase or alter their biological action. In the last decade, successful steps have been made in creating novel, improved antibiotics by genetically redesigning naturally synthesized compounds.
Genetic engineering has been achieved using different approaches. The first approach was based on exchanging the A-T units of the terminal module of surfactin synthetase that is originally responsible for the incorporation of leucine. Different A-T units have replaced the already existing one and novel surfactins with aliphatic (Val), charged (Orn) and aromatic (Phe) residues at position 7 were created. However, their hemolytic activity did not differ significantly from that of the wild type product . Nevertheless, swapping of numerous domains indicated for the first time that a rational design of antibiotics is accomplishable .
A further strategy for constructing synthetic antibiotics involves swapping entire modules as well as inserting or deleting modules. Consistent with this concept, deletion of the second module of the srf operon produced a new hexapeptide surfactin. Alternatively, manipulation of the A-domain’s specificity via point mutagenesis can also result in novel antibiotics. The altered A-domain recognizes and activates a different amino acid, which is then incorporated into the polypeptide chain to yield a new product. Another route to novel antibiotics involves the replacement of TE-domains on the synthetase to force earlier release and cyclization. It has already been shown that the bioactivity of many peptide antibiotics is attributable to small heterocyclic structures, such as thiazoline and oxazoline, which are formed by heterocyclization domains present on the synthetases. Therefore, incorporation of such domains into peptide synthetases could lead to new pharmaceutical substances.
There is now an increasing number of examples of functional engineered peptide synthetases. Genetic redesign requires well-defined sequence information about the biosynthetic system that will be altered. Although this is often provided, manipulation has been unsuccessful in some cases, possibly due to disruption of one or more catalytic sites. Therefore, information on domain structures as well as on possible protein-protein interaction sites between domains would improve the engineering of novel antibiotic compounds.
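The module-level manipulations described in this section (exchange of the terminal A-T unit and deletion of the second module of the surfactin synthetase) can be sketched as simple list operations on an ordered set of modules. The wild-type residue order used below (Glu, Leu, Leu, Val, Asp, Leu, Leu) comes from the general literature on surfactin, the function names are hypothetical, and the sketch deliberately ignores whether the engineered enzyme would remain catalytically functional.

```python
# Assumed substrate specificity of the seven surfactin modules, in order.
WILD_TYPE = ["Glu", "Leu", "Leu", "Val", "Asp", "Leu", "Leu"]


def delete_module(modules, position):
    """Return a new assembly line lacking the module at a 1-based position."""
    if not 1 <= position <= len(modules):
        raise ValueError("position outside the assembly line")
    return modules[:position - 1] + modules[position:]


def swap_substrate(modules, position, new_substrate):
    """Return a new assembly line whose module at `position` (1-based)
    activates `new_substrate`, mimicking an A-T unit exchange."""
    if not 1 <= position <= len(modules):
        raise ValueError("position outside the assembly line")
    engineered = list(modules)
    engineered[position - 1] = new_substrate
    return engineered


# Deletion of the second module gives a hexapeptide product.
hexapeptide = delete_module(WILD_TYPE, 2)

# Exchange of the terminal A-T unit gives variants at position 7.
variants = {aa: swap_substrate(WILD_TYPE, 7, aa) for aa in ("Val", "Orn", "Phe")}

print("-".join(hexapeptide))
for aa, product in variants.items():
    print(aa, "variant:", "-".join(product))
```

Both helpers return new lists rather than modifying the wild-type definition, so several engineered variants can be derived from the same starting point.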
Bacilli produce not only peptide antibiotics, but also several other secondary metabolites such as polyketides. Their biosynthesis occurs on PKSs by stepwise decarboxylative condensations (see the section on polyketide synthases). Difficidin, oxydifficidin and bacillaene are polyketides that are produced by various Bacillus strains and exhibit antibacterial activity. Posttranslational modification is carried out by the 4'-PPTase Sfp. Therefore, strains containing intact PKSs but a defective sfp gene are unable to produce polyketides.
Furthermore, some new antibiotics have recently been isolated from various Bacilli. One of them is bacilysocin, a phospholipid that accumulates within the cells. It possibly derives from phosphatidylglycerol via acyl ester hydrolysis, a reaction controlled by YtpA. Bacilysocin inhibits the growth of various organisms such as Staphylococcus aureus, Saccharomyces cerevisiae and the fungi Candida pseudotropicalis and Cryptococcus neoformans. Furthermore, Bacilli produce low-molecular-weight phenylpropanol derivatives named isocoumarins, which have antibacterial and anti-inflammatory activity. Among them, amicoumacins could be used for the treatment of chronic gastritis and peptic ulcer in humans, as they act against Helicobacter pylori. Moreover, 3,3'-neotrehalosadiamine (NTD), an aminosugar antibiotic produced by B. pumilus and B. circulans, inhibits the growth of Staphylococcus aureus and Klebsiella pneumoniae. In B. subtilis, production is achieved only in strains with RNA polymerase mutations that confer resistance to rifampicin, and is driven by the operon ntdABC. Expression is induced by NTD itself, via the regulatory protein NtdR.
Bacillus amyloliquefaciens and its numerous natural isolates are closely related to the already sequenced “Methuselah of the labs”, Bacillus subtilis 168, but at the same time are of broad biotechnological interest and often show unique and remarkable characteristics. Since no representative of the B. amyloliquefaciens species had yet been sequenced, molecular and biochemical work on these strains was hindered and the elucidation of the pathways that contribute to the organism’s characteristics remained incomplete. Therefore, our laboratory, in collaboration with the GenoMik Network in Göttingen, set out to determine the genome sequence of the plant growth promoting strain B. amyloliquefaciens FZB42. This strenuous work, which started in 2001, has made up a large part of my research over the past years.
Nevertheless, the primary focus of my work has been the elucidation and characterisation of pathways involved in the beneficial features of Bacillus amyloliquefaciens FZB42. To this end, and since the genome sequencing project did not immediately deliver results, alternative methods had to be employed in order to compare the FZB42 strain with its sequenced relative B. subtilis 168 and to find the unique genomic regions that might be associated with the plant growth promoting abilities of Bacillus amyloliquefaciens FZB42.
The finding of such gene candidates (by both genomic and non-genomic approaches) generated new questions that I set out to answer. What are their products and what is their mechanism of action? When and how are they produced? How is their expression regulated? What is the effect of global regulators on their expression? But before all these questions could be answered, a protocol had to be established for the genetic manipulation of the natural isolate Bacillus amyloliquefaciens FZB42.
To conclude, my thesis aimed to provide a first insight into the unique features that enable Bacillus amyloliquefaciens FZB42 to promote plant growth. To accomplish this task, a dual genomics and functional genomics approach was adopted. In parallel, the elucidation of the organism’s genome sequence lays the groundwork for future work with other isolates of the same species and adds important information on the function and evolution of the genus Bacillus.