3.1 Abstract


The potential of 16S rRNA and 16S-23S rRNA intergenic spacer region (ISR) sequence-based analyses to identify bacteria as part of a polyphasic approach was evaluated. These approaches were used to assess whether the taxonomies of plant-inhabiting diazotrophic bacteria belonging to the Pseudomonads and the genus Bacillus, as previously identified by phenotypic methods, could be further defined. Bacterial isolates were identified to genus level, and their 16S rRNA gene and 16S-23S rRNA ISR segments were then sequenced and analysed as a prelude to sequence-based identification. For one genus, however, bacterial identification was complicated because, according to GenBank entries, Bacillus licheniformis and B. subtilis had up to 100% similar 16S rRNA gene sequences. Furthermore, significant sequence variability within and between strains of the same species also led to difficulties in identification. However, we found that 16S-23S rRNA ISR sequence-based analysis provided more specific identification when 16S rRNA sequence-based identification failed, although the potential of ISR analysis for bacterial identification is likely to depend on species-characteristic sequence stretches of the ISR or on the bacterial species studied. Moreover, we performed similarity searches using different databases (EMBL-Bank, RDP-II and RIDOM) and programs (FASTA, BLAST and RDP-II) to evaluate their potential for bacterial identification. Here, we report that molecular identification methods may also lead to misidentification unless they are performed in combination with basic phenotypic tests. Finally, we conclude that 16S-23S rRNA ISR sequence-based identification in conjunction with analysis of 16S rRNA sequences has great potential in future bacterial identification studies, but only when the databases contain sufficient numbers of reliable sequence entries.

Key words 

bacteria identification – BIOLOG test – sequence-based identification – 16S rRNA – 16S-23S rRNA ISR.

3.2 Introduction


Harmless bacterial isolates, for which the turnaround time to identification is less critical than for clinical isolates, can generally be identified with conventional tests such as morphological, physiological and chemotaxonomic characterization. However, when conventional methods are used to identify bacteria, interpretation of test results involves substantial subjective judgement (Stager and Davis 1992). Several commercial identification systems, such as the carbon source utilisation system developed by Biolog, Inc., offer computer-assisted identification of a wide variety of bacterial isolates (Miller and Rhoden 1991, Holmes et al. 1994, Tang et al. 1998). However, while such systems may reduce subjectivity and save labour time, they still rely on phenotypic identification (Drancourt et al. 2000). Since some microorganisms share phenotypic profiles but are segregated based on polymorphisms in their genomes (Nakamura et al. 1999), the application of only morphological and biochemical approaches can lead to inaccurate bacterial identification. Moreover, these approaches are sometimes unable to identify bacteria to the species level.

With the advent of molecular biology-based techniques, comparative DNA sequence analysis of genes that carry phylogenetic information has become commonplace in microbiology as a tool for the classification of microorganisms. Compared with phenotypic identification methods, molecular techniques offer two primary advantages: (1) they are faster and (2) they are more accurate (Springer et al. 1996, Patel et al. 2000). Two of the most useful and extensively investigated taxonomic marker molecules for bacterial identification studies are the 16S rRNA gene (Mahenthiralingam et al. 2004, Clarridge 2004) and ribosomal intergenic spacers, e.g. the 16S-23S rRNA ISR (Drebot et al. 1996, Roth et al. 1998, Blackwood et al. 2004). However, some researchers found that for certain species, rRNA intergenic spacer regions do not contain sufficient stretches of identical sequence to allow identification at the phylogenetic level (Yoon et al. 1997, Yoon et al. 1998, Kuwahara et al. 2001). On the other hand, phylogenetic analysis of the 16S-23S ISR sequences of some species was confirmed by 16S rRNA gene-based phylogenetic identification and had the added benefit of providing a higher resolution (Leblond-Bourget et al. 1996, Aakra et al. 1999, Goncalves and Rosato 2002, Song et al. 2004). Thus, evaluating the potential of ISR sequence-based analysis as part of a polyphasic approach to bacterial identification is both important and necessary.

The most common molecular method of bacterial identification is the amplification of marker genes, followed by probe hybridisation, restriction fragment length polymorphism (RFLP) analysis or sequencing. All of these molecular methods have become both routine and more affordable, and they offer the most accurate level of bacterial identification (Turenne et al. 2001). Although molecular identification of bacteria based on sequencing of taxonomic marker genes is regarded as the best method to date (Patel et al. 2000), it depends on the quality of the available sequence databases, and many of them are currently not optimal for this purpose. Known problems include faulty and/or redundant sequence entries (due to error-prone sequencing techniques used earlier, e.g. reverse transcriptase sequencing), ragged sequence ends (resulting in wrong 'best' matches in similarity searches), non-characterised entries, outdated nomenclature, the absence of quality control of sequence entries and, finally, a lack of type strains for many important microorganisms. Therefore, it is imperative to assess the quality of a database before using it for bacterial identification purposes.


In this article, conventional tests (such as morphological, physiological and chemotaxonomic characterization) and commercial identification systems (such as the BIOLOG carbon source utilisation system) are referred to as phenotypic identification methods. In this study, bacterial isolates were identified to genus level by morphological characterization; no additional phenotypic investigations were performed. Instead, 16S rRNA gene (Mahenthiralingam et al. 2004, Clarridge 2004) and 16S-23S rRNA ISR sequence analyses were used as a complement to phenotypic identification. We compared our sequences to those deposited in public databases, such as EMBL-Bank (Cochrane et al. 2006), RDP-II (Maidak et al. 2001) and RIDOM (Harmsen et al. 2003). Although the RIDOM database was developed specifically for the identification of clinical microorganisms, we were also interested in using it to identify harmless bacterial isolates, some of which may have clinical counterparts. The results presented here emphasize the need to take a polyphasic approach, i.e. to combine phenotypic and sequence-based approaches, when identifying bacterial isolates, particularly when correct assignment of the bacteria to the species level is required.

3.3 Materials and methods

3.3.1 Bacteria isolation 

Wheat plants were uprooted 21 days after sowing in salty soil (Syrdarya, Uzbekistan). Roots were washed in running tap water to remove adhering soil, cut into 1 cm pieces and surface sterilised in 0.7% NaOCl solution for 30 minutes. Root pieces (1 g) were placed in tubes containing 20 ml sterile water and macerated by vigorous shaking for 2 hours. The suspension was then spread on Petri dishes containing Ashby medium (Methods of soil microbiology and biochemistry, 1991) to allow growth of endophytic bacteria. After 4 to 7 days of incubation, single colonies were transferred to fresh Petri dishes containing the same medium. This process was repeated 3 times to purify the bacterial culture (strain 148), which was then stored in tubes containing Ashby agar medium. Isolate BL43 was taken from the Culture Collection of the Institute of Microbiology, Uzbekistan Academy of Sciences, Tashkent.

3.3.2 Phenotypic characterisation of bacterial isolates

The Gram reaction was determined as described previously (Suslow et al. 1982) using a 3% KOH test in parallel with traditional Gram staining (Gram 1884). According to their Gram type, carbon source utilisation patterns of pure bacterial cultures were analysed using the BIOLOG® test as described below. Single colonies were picked, subcultured on BUGM (Biolog Universal Growth Medium) and incubated overnight at 28°C. A homogeneous suspension of inoculum was made in 0.85% saline and diluted to a transmittance of 55 to 60% at 590 nm. From this suspension, 150 µl was dispensed into each well of the Gram-negative (GN2) or Gram-positive (GP2) MicroPlates™ (Oxoid GmbH, Wesel, Germany), which were then incubated for 24 h at 28°C. Colour development was measured at 590 nm at 4 and 24 h with a computer-controlled MicroPlate reader (Miller and Rhoden 1991, Holmes et al. 1994, Tang et al. 1998). The purified bacterial isolates were identified by comparing their substrate utilisation patterns with those found in the MicroLog System 2 database, release 4.01B (Biolog, Inc., Hayward, CA).


Further identification was performed using morphological characterisation and basic biochemical tests. After 24 and 48 h of growth on PA (Peptone Agar) at 28°C, colonies of purified bacterial isolates were characterised for the following traits: colour, shape, length, breadth and width, surface, opacity and texture. Motility, cell morphology, size and division mode were also evaluated by phase-contrast microscopy. Respiration type was determined by growth of bacteria in MPB (Meat Peptone Broth) at 28°C for 3 d; the temperature range of growth was determined on MPA (Meat Peptone Agar) by incubating at 4°C (for 14 d) and 50°C (for 5 d). Oxidase activity was tested using Bactident-Oxidase test strips (Merck) according to the manufacturer's instructions. Catalase activity was tested by suspending a loopful of cells in a 10% (vol/vol) H2O2 solution.

Plant growth-stimulating effects of isolates were analysed for different agricultural crops, as described in previous studies (Egamberdiyeva et al. 2003, 2004). 

3.3.3 Extraction of bacterial DNA

Pure bacterial cultures were grown at 28°C in Standard I nutrient broth (Merck, Darmstadt, Germany) for 48 hours, and bacterial DNA was extracted using the MO BIO Ultra Clean™ Microbial DNA isolation kit (MO BIO Laboratories, Inc., Hamburg, Germany) according to the manufacturer’s instructions. The concentration and purity of the DNA samples were determined from their optical densities at 260 and 280 nm using an Eppendorf spectrophotometer. The DNA concentration was adjusted to 20 ng μl⁻¹ by diluting in deionised rRNA-free H2O, and the samples were stored at -20°C.
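The quantification and dilution step can be illustrated with a short sketch. It assumes the common spectrophotometric convention that one A260 unit corresponds to about 50 ng μl⁻¹ of double-stranded DNA at a 1 cm path length, which is not stated in the protocol above; the function names are hypothetical and serve only as an illustration.

```python
def dsdna_conc_ng_per_ul(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration from absorbance at 260 nm.

    Assumption (not from the protocol): 1 A260 unit ~ 50 ng/µl dsDNA
    at a 1 cm path length.
    """
    return a260 * 50.0 * dilution_factor / path_cm


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio used as a rough purity indicator."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def dilution_volumes(stock_ng_per_ul, final_ul, target_ng_per_ul=20.0):
    """Return (stock volume, water volume) in µl using C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul
```

For instance, `dilution_volumes(100.0, 50.0)` gives the volumes of stock and water needed to prepare 50 μl at the 20 ng μl⁻¹ working concentration from a 100 ng μl⁻¹ extract.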

3.3.4 16S rRNA gene amplification and sequencing


16S rRNA genes were amplified by PCR using the primer set 27f and 1492r (Martin-Laurent et al. 2001) (Tab. 11). The PCR mixture contained 12.5 μl Master Mix (Qiagen, Hilden, Germany), 2.5 μl of each 10 μM primer and 2.5 μl of bacterial DNA (approximately 50 ng) as template, and was brought to a final volume of 25 μl by the addition of 5 μl of H2O. Sterile water was used for the no-template negative control. PCR amplification was carried out in 96-well PCR plates with a Bio-Rad iCycler as follows: 94°C for 15 min, followed by 35 cycles of 95°C for 30 s, 56°C for 30 s and 72°C for 1 min 15 s, and a single final extension step at 72°C for 10 min. The resulting PCR products were examined by agarose gel electrophoresis (2%) using GeneRuler™ DNA ladder mix, Marker SMO 0328 (MBI Fermentas, St. Leon-Rot, Germany) as size standard (Fig. 6). PCR products were purified using the MinElute PCR Purification Kit (Qiagen, Hilden, Germany) and sequenced (the value read, MWG-Biotech).
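The reaction set-up above can be checked with simple arithmetic. The sketch below only restates the volumes and stock concentrations given in the text; the dictionary keys and function names are illustrative and not part of the protocol.

```python
# Component volumes (µl) of one 16S rRNA PCR reaction, as listed in the text.
REACTION_UL = {
    "master_mix": 12.5,
    "primer_27f": 2.5,
    "primer_1492r": 2.5,
    "water": 5.0,
    "template_dna": 2.5,
}


def total_volume(components):
    """Sum the component volumes of a reaction mix (µl)."""
    return sum(components.values())


def final_concentration(stock_conc, added_ul, total_ul):
    """Concentration after dilution into the reaction (same unit as stock)."""
    return stock_conc * added_ul / total_ul


def template_mass_ng(conc_ng_per_ul, volume_ul):
    """Mass of template DNA added to the reaction (ng)."""
    return conc_ng_per_ul * volume_ul


total = total_volume(REACTION_UL)
primer_um = final_concentration(10.0, REACTION_UL["primer_27f"], total)
template_ng = template_mass_ng(20.0, REACTION_UL["template_dna"])
print(total, primer_um, template_ng)
```

With the stated volumes, the components add up to the 25 μl final volume, each 10 μM primer stock is diluted tenfold in the reaction, and 2.5 μl of the 20 ng μl⁻¹ DNA stock corresponds to the approximately 50 ng of template quoted above.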

3.3.5 16S-23S ISR amplification and sequencing

To obtain the 16S-23S rRNA ISR sequence information of selected bacterial strains, nested PCR was performed using universal prokaryotic primers (Tab. 11). The intergenic spacer regions were first amplified using primers 785 and 422. This was followed by a second, nested PCR using primers 3-17R and EricM. PCRs were performed as described by Rumpf et al. (1999) with the following exceptions: (i) 300 nM of each primer was added to each reaction mixture and (ii) as the annealing temperature of the primer pair was not given in the original protocol, the primers of the second amplification were annealed at 60°C, the optimum determined experimentally in this study using gradient real-time PCR (data not shown). The reaction mixture contained 12.5 μl QuantiTect mastermix (Qiagen, Hilden, Germany) and 2.5 μl of each 300 nM primer. As template, 2.5 μl of pure bacterial DNA (approximately 20 ng μl⁻¹) was used for the first PCR, and a 1:10 or 1:100 dilution of the purified product of the first PCR was used for the nested PCR. In both cases, the total reaction mixture was brought to a final volume of 25 μl by the addition of 5 μl RNA-free H2O (Qiagen, Hilden, Germany). The templates for the second PCR were prepared as follows: (i) the DNA fragments of the first PCR product were separated by electrophoresis in 2.5% agarose in TBA (thiobarbituric acid), (ii) gels were stained with ethidium bromide, (iii) gel pieces containing the DNA band of the desired size (2000 bp) were cut out and (iv) the DNA was cleaned with the MinElute Gel Extraction Kit (Qiagen, Hilden, Germany). The amplified nested PCR products were excised from a 1% agarose gel after electrophoresis, purified using a QIAquick gel extraction kit (Qiagen, Hilden, Germany) and sequenced (the value read, MWG-Biotech).

3.3.6 Sequence data analysis

The Internet tools CLUSTAL W, BLAST and FASTA3 provided by the European Bioinformatics Institute (http://www.ebi.ac.uk) and the RDP-II Sequence Match tool (version 9.0) provided by the Ribosomal Database Project (Maidak et al. 2001; http://rdp.cme.msu.edu/) were used for identifying the isolates. The sequencing data were analysed as follows: (i) assembly of the reverse and forward sequences into a consensus sequence; (ii) comparison of the consensus sequences with sequences deposited in the EMBL-Bank (Release 85, December 2005), RDP-II and RIDOM databases using the similarity search tools BLAST, FASTA and RDP-II Sequence Match. The newly determined sequences were aligned with related sequences retrieved from EMBL-Bank using the CLUSTAL W (1.8) graphical multiple alignment program. The 16S rRNA and 16S-23S rRNA ISR sequences determined for strains BL43 and Xs148 have been deposited in the EMBL-Bank database under accession numbers EF601575 – EF601582.

Tab. 11: Specificity and nucleotide sequences of PCR primers used in this study.

Fig. 6: Gel electrophoresis analysis of amplified 16S rRNA from bacterial isolates BL43 and Xs148. Lane M: Marker SMO 0328; lane 1 – E. coli (positive control), lane 2 – isolate BL43, lane 3 – isolate Xs148.

3.3.7 Criteria for bacterial isolate identification


To define identification at the genus or species level, the following similarity score values were used: (i) when the comparison of the determined sequence with a reference sequence of a classified species deposited in the databases yielded a similarity score ≥ 99%, the unknown isolate was assigned to this species; (ii) when the score was < 99% and > 96%, the unknown isolate was assigned to the corresponding genus; and (iii) when the score was < 96% and > 92%, the unknown isolate was assigned to a family (Bosshard et al. 2003).
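The criteria above amount to a simple threshold rule, sketched below. The function name is hypothetical. Note that the stated ranges leave a score of exactly 96% unassigned, and the sketch deliberately reproduces this rather than guessing how such a score was handled.

```python
def assign_rank(similarity_pct):
    """Map a percent similarity score to the taxonomic rank of assignment.

    Thresholds follow the criteria as stated in the text (after Bosshard
    et al. 2003): >= 99 species; < 99 and > 96 genus; < 96 and > 92 family.
    Scores outside these ranges, including exactly 96, are not covered by
    the stated criteria and are returned as "unassigned".
    """
    if not 0.0 <= similarity_pct <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_pct >= 99.0:
        return "species"
    if 96.0 < similarity_pct < 99.0:
        return "genus"
    if 92.0 < similarity_pct < 96.0:
        return "family"
    return "unassigned"


# Scores reported later in this chapter, used here only as examples.
for score in (99.72, 98.53, 88.0):
    print(score, assign_rank(score))
```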

3.4 Results

3.4.1 Conventional bacterial identification

In this study, two bacterial isolates were tested: BL43 and Xs148. In the KOH test, which lyses Gram-negative bacteria, isolate Xs148 was observed to lyse, indicating that this isolate could be Gram negative. In the commercial BIOLOG test, Gram-positive isolate BL43 did not respond, while isolate Xs148 was most similar to Burkholderia glumae, with an identity score of 0.590 (Tab. 12a). Using the range of common bacteriological tests listed in Materials and methods, classic bacteriological profiles of the isolates were obtained and both bacteria were classified to genus level accordingly. The Gram-negative strain, isolate Xs148, showed key classic characteristics of the Pseudomonads, e.g. the strain was aerobic and motile. The cells were rod-shaped, approximately 0.5 – 1.0 μm by 1 – 1.5 μm. The strain grew best between 25°C and 37°C, but poorly at 45°C, and produced beehive-shaped colonies on GPA medium after 48 hours of incubation at 28°C, with a predominant amber-coloured colony type approximately 2 – 3 mm in diameter. In addition, the strain was positive in the oxidase and catalase assays. Isolate BL43 belonged to the genus Bacillus. The strain was a facultative aerobe and motile. It was also rod-shaped (approximately 0.5 – 0.6 μm by 1.2 – 1.5 μm), oxidase and catalase positive, and capable of forming endospores. The strain grew between 25°C and 37°C, but poorly at 45°C, and formed colonies at 28°C that were mucoid and slimy and tended to spread. Both bacteria were capable of growth on N-free Ashby medium.

3.4.2 16S rRNA sequence-based bacterial isolate identification 

The value of comparing 16S rRNA sequences derived from unknown bacterial isolates to the EMBL-Bank and RDP-II databases is addressed in Tab. 12a and Tab. 12b. In these databases, the degree of relatedness is expressed differently: EMBL-Bank’s main measure is percent identity, whereas RDP-II’s measure is a relatedness value somewhat close to (but lower than) the EMBL-Bank percent identity (Clarridge 2004). Bacterial isolates were identified from their 16S rRNA gene sequences by choosing the species with the highest identity match. Sequence similarity searches against the EMBL-Bank database were performed using the FASTA and BLASTN programs. Identification was complicated by numerous ambiguous results. For example, a BLASTN search of the EMBL-Bank database showed that isolate BL43 was 99% identical to four species (among the 100 hits obtained) of the Bacillus subtilis group, namely B. licheniformis, B. subtilis, B. mojavensis and B. vallismortis, and also to B. amyloliquefaciens. Moreover, BLAST searches showed 99% identity of the query sequence to a microorganism belonging to another family, Pseudomonas sp. (EM_PRO:AB211031). When the FASTA program was used, almost the same results were obtained: isolate BL43 yielded an ambiguous result, showing 99.72% identity to two species, B. licheniformis and B. subtilis, and 99.37% identity to B. mojavensis. For isolate Xs148, different results were obtained even when the sequence was compared against the same database (EMBL-Bank) using different programs (FASTA and BLASTN), resulting in the assignment of different identities: BLAST searches gave the highest matches to the genus Pseudomonas, whereas with the FASTA program isolate Xs148 was closest in identity to Xanthomonas species (Tab. 12a). A more detailed analysis of the FASTA search results for isolate Xs148 revealed that this isolate was 96% identical to Xanthomonas oryzae pv. oryzae and Xanthomonas campestris, and 95% identical to a different genus, Pseudomonas. To assess the quality and accuracy of the results provided by different programs and databases, sequence similarity searches were further performed using RDP-II (Sequence Match, version 9.0). Using RDP-II with dataset-2 (consisting of both type and non-type strain and isolate sequences), the best match for the isolate Xs148 sequence was Azotobacter salinestris, with a similarity score of 0.776 (Tab. 12b), while the highest matches obtained using FASTA and BLASTN were Xanthomonas oryzae pv. oryzae (sequence similarity 96%) and Pseudomonas sp. (sequence similarity 97%), respectively. Of the two data sets of the RDP-II database used, dataset-1 showed highly ambiguous results, most probably due to the short target sequence or to the limited number of sequences included in the sequence match search (Tab. 12b), whereas dataset-2 had almost 17 times more sequences in the search.

3.4.3 16S-23S ISR-based bacterial isolate identification

For isolate Xs148, the 16S rRNA (based on the FASTA analysis) and 16S-23S rRNA ISR designations offered a consensus at the genus level (98.53% similarity to EM_PRO:DQ003220), showing that this bacterium belongs to the genus Xanthomonas. Among the 100 hits obtained from BLAST searches with the isolate Xs148 sequence (240 bp), 93% belonged to the genus Xanthomonas, mainly to Xanthomonas oryzae and Xanthomonas campestris, and 7% belonged to other groups, such as Stenotrophomonas maltophilia (5%), Pseudomonas (1%), rice phyllosphere bacterium (1%) and uncultured bacteria (1%). Considering that isolate Xs148 was identified as Burkholderia glumae by the BIOLOG test, we performed a CLUSTAL W alignment of the 16S-23S rRNA ISR sequences of randomly selected strains of the genus Burkholderia with the ISR sequence derived from isolate Xs148. The Burkholderia sequences showed only 76-78% similarity to the partial 16S-23S rRNA ISR sequence of isolate Xs148, whereas Xanthomonas sequences were 100% similar, strongly suggesting that, from the molecular point of view, this bacterium belongs to the genus Xanthomonas. We must note, however, that the BLAST hit with the highest sequence similarity to isolate Xs148 (99.63%), a rice phyllosphere bacterium (EM_PRO:AY485407), was not considered, since this entry was not fully characterised.
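The distribution of hits across genera described above can be tallied with a few lines of code. This is a minimal sketch, not the procedure used in this study: it assumes that each hit is available as an organism name string, takes the first word as the genus, and groups names that start with a lower-case word (e.g. "uncultured bacterium") as uncharacterised. The function names and the example list are hypothetical.

```python
from collections import Counter


def genus_of(organism):
    """Return the genus (first word) of an organism name.

    Names starting with a lower-case word, such as "uncultured bacterium",
    and empty names are grouped as "uncharacterised".
    """
    words = organism.split()
    if not words or not words[0][0].isupper():
        return "uncharacterised"
    return words[0]


def genus_shares(hit_organisms):
    """Return {genus: percent of hits}, most frequent genus first."""
    counts = Counter(genus_of(name) for name in hit_organisms)
    total = sum(counts.values())
    return {genus: 100.0 * n / total for genus, n in counts.most_common()}


# Hypothetical hit list for illustration only (not the search output).
example_hits = [
    "Xanthomonas oryzae pv. oryzae",
    "Xanthomonas campestris",
    "Stenotrophomonas maltophilia",
    "uncultured bacterium",
]
print(genus_shares(example_hits))
```

Grouping uncharacterised entries separately makes it explicit when a top-scoring hit, such as the rice phyllosphere bacterium entry mentioned above, cannot support an identification.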

The ISR sequence analysis of isolate BL43 showed that among 100 hits, there were only three matches to the genus Bacillus, namely B. subtilis (EM_PRO:BSTGRG16), B. licheniformis (EM_PRO:CP000002) and B. clausii (EM_PRO:AP006627), showing 88%, 88% and 75% identity, respectively, while the 16S rRNA sequence of this bacterium was more than 99% similar to different Bacillus species (Tab. 12b). Since one of the most closely related species, B. amyloliquefaciens, was not in the 100-hit list, a CLUSTAL W analysis was performed comparing the 16S-23S ISR sequences of isolate BL43 and B. amyloliquefaciens (EM_PRO:AF478079). The result showed a low similarity of 47%. The rest of the hits included microorganisms belonging to the same order of Firmicutes (order Bacillales, class Bacilli) as our selected bacterium, with very low similarity: Staphylococcus haemolyticus (81% identical), Listeria innocua (74% identical) and Oceanobacillus iheyensis (71% identical). This did not provide enough information to identify isolate BL43 based on 16S-23S rRNA ISR sequences alone. If we consider previous reports on the maximum level of ISR divergence between strains belonging to the same species, e.g. 13% (Leblond-Bourget et al. 1996) and 16% (Saccharomonospora, Yoon et al. 1997; Yoon et al. 1998), the 12% difference between the EMBL-Bank-deposited Bacillus licheniformis and Bacillus subtilis sequences and that of the studied isolate BL43 is within the range of acceptable divergence.

According to both FASTA and BLAST, 43 out of 100 hits for the target sequence belonged to the Gram-negative bacilli (Tang et al. 1998) of the genus Acinetobacter, with identities ranging from 61 to 98%. Since the genus Acinetobacter is Gram negative, the respective hits were excluded from the identification analysis. Moreover, when the partial 16S rRNA sequences of Acinetobacter baumannii (AJ247197) deposited in EMBL-Bank and of isolate BL43 derived in this study were aligned using CLUSTAL W, the two organisms showed an identity score of only 63%.
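The exclusion of hits whose Gram type conflicts with that of the isolate can be sketched as a simple filter. The lookup table below is a small hand-made illustration covering only genera named in this chapter, and the function name is hypothetical; hits from genera missing from the table are kept for manual review rather than silently dropped.

```python
# Illustrative lookup of Gram type by genus (not an exhaustive resource).
GRAM_TYPE = {
    "Bacillus": "positive",
    "Staphylococcus": "positive",
    "Listeria": "positive",
    "Acinetobacter": "negative",
    "Pseudomonas": "negative",
    "Xanthomonas": "negative",
}


def filter_by_gram(hits, isolate_gram):
    """Split (organism, percent_identity) hits into kept and excluded lists.

    A hit is excluded only when its genus is in GRAM_TYPE and its Gram type
    differs from that determined phenotypically for the isolate.
    """
    kept, excluded = [], []
    for organism, identity in hits:
        words = organism.split()
        gram = GRAM_TYPE.get(words[0]) if words else None
        if gram is not None and gram != isolate_gram:
            excluded.append((organism, identity))
        else:
            kept.append((organism, identity))
    return kept, excluded


# Example with identities quoted in the text for isolate BL43 (Gram positive).
hits = [
    ("Acinetobacter sp.", 98.0),
    ("Bacillus subtilis", 88.0),
    ("Staphylococcus haemolyticus", 81.0),
]
kept, excluded = filter_by_gram(hits, "positive")
print(kept)
print(excluded)
```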

Tab. 12a: Results obtained from phenotypic and molecular biological methods used to identify plant-inhabiting bacteria.

| Isolate | Phenotypic identification | BIOLOG identification | 16S rRNA sequence-based identification: NCBI-BLASTN 2.2.13 / EMBL database | 16S rRNA sequence-based identification: NCBI-FASTA version 3.4t / EMBL database* |
|---|---|---|---|---|
| Included sequences | | | | |
| BL43 | Bacillus sp. | | B. licheniformis, B. subtilis, B. mojavensis, B. vallismortis | B. licheniformis, B. subtilis |
| Xs148 | | Burkholderia glumae | Pseudomonas sp. | Xanthomonas oryzae |

Tab. 12b: Results obtained from phenotypic and molecular biological methods used to identify plant-inhabiting bacteria.

| Isolate | 16S rRNA sequence based identification: RDP-II database, SeqMatch - Dataset 2 | 16S-23S ISR sequence based identification: NCBI-BLASTN 2.2.13 / EMBL database* | 16S-23S ISR sequence based identification: NCBI-FASTA version 3.4t / EMBL database* | Seq length |
|---|---|---|---|---|
| # of included sequences | | | | |
| BL43 | B. licheniformis, B. subtilis | B. licheniformis, B. subtilis | B. licheniformis, B. subtilis | |
| Xs148 | Azotobacter salinestris | X. oryzae, X. campestris | Xanthomonas sp. | |

* The number of sequences included in the similarity search was not provided.

3.5 Discussion

The exact taxonomic affiliation of some microorganisms can only be ascertained by using a polyphasic taxonomic approach. The present study genetically characterised two bacterial isolates from the plant endorhizosphere and compared the results with those derived from bacteriological and metabolic identification techniques to assess the need for a polyphasic approach to bacterial identification. Although the BIOLOG system has been evaluated as having the largest database (Holmes et al. 1994), its identifications did not always agree with biochemical criteria for some genera, including Bacillus (Tang et al. 1998). Therefore, the failure of the BIOLOG system to identify isolate BL43 could be explained by the fact that this isolate belongs to the genus Bacillus, as established using rRNA sequence analysis. Moreover, reliance on biochemical identification alone can lead to inaccurate bacterial identification because some closely related microorganisms within the genus Bacillus share phenotypic properties but have been classified as different species based on DNA re-association values (Nakamura et al. 1999). Furthermore, 16S rRNA sequence-based identification has revealed inaccurate conventional identification of Bacillus species resulting from mismatched Gram and biochemical profiles as well as growth requirements (Drancourt et al. 2000). In this study, genotypic methods based on rRNA sequence analysis improved the identification of plant-inhabiting diazotrophic bacteria, thereby complementing the identification results obtained using conventional biochemical methods and the BIOLOG® test.

3.5.1 16S rRNA sequence-based bacteria identification and conflicting results

To identify microorganisms correctly, sequences deposited in databases must be correct and appropriately named. To date, there are only a few reports on the quality of commonly used databases, e.g. GenBank-EMBL, RIDOM (Harmsen et al. 2003) and RDP-II (Maidak et al. 2001), or on user preferences for these databases and programs with respect to the quality and/or number of sequences (Turenne et al. 2001, Cloud et al. 2002). As sequences can be deposited in the GenBank-EMBL and RDP-II databases without undergoing any checks, it is not surprising that errors in species assignment occur (Harmsen et al. 2003). For example, among the EMBL-Bank-deposited sequences of Pseudomonas spp., we found one that we believe is misidentified: EM_PRO:AB211031, “Pseudomonas sp. SSCS3 gene for 16S rRNA, partial sequence (1479 bp)”, shows 99% total 16S rRNA similarity to Bacillus subtilis, but only 72-78% to sequences derived from strains belonging to the genus Pseudomonas. Therefore, EM_PRO:AB211031 is most probably the sequence of a strain belonging to the genus Bacillus. This apparently incorrect database entry illustrates the presence of faulty sequence entries.

Turenne et al. (2001) reported that the quality and/or the number of the sequences derived from a certain group of bacteria in the RIDOM database are higher. However, although the RIDOM database has a particularly good collection of mycobacterial sequences, this does not extend to all other bacterial categories. We found, for example, that the RIDOM database is not useful for identification of the bacterial strains investigated in this study although Blackwood et al. (2004) reported that 16S rRNA gene sequences for 65 (of all 83) type strains of the Bacillus genus have been submitted to the RIDOM database at http://rdna.ridom.de/. It is therefore likely that submission of all these sequences is either not yet complete or that the database has not been updated. Note that it was last modified on 05.08.2005.


In this study, the 16S rRNA-derived sequences were analysed by comparing the results from the RDP-II database with those of the EMBL-Bank. RDP-II allows user-submitted sequences to be compared against data subsets (data sets) restricted to sequences from type material; we reasoned that this would avoid including taxonomically misidentified and non-cultured bacterial sequences in searches, which may result in ambiguous identification. Therefore, we performed searches in RDP-II for sequence similarities to strains BL43 and Xs148 using two data subsets consisting of sequences of (i) type strains (data set-1) and (ii) both type and non-type strains and isolates (data set-2).

For isolate BL43, the search results of the three programs (BLASTN, FASTA and RDP-II) were in agreement when the RDP-II search was performed with data set-2. With data set-1, however, we obtained results that did not agree with the FASTA and BLASTN results, and these were therefore not considered in this study. We suggest that although sequence similarity searches restricted to the type strain sequences provided by RDP-II should help to avoid inaccurate identification, there are probably not enough type strain sequence entries in the current RDP-II database. The observed difference in results between the two RDP-II data sets may thus be partly explained by the vastly lower number of sequences included in searches with data set-1 compared to data set-2 (Tab. 12a).

Because the 16S rDNA gene is highly conserved, the distinction of species and strains relies upon the resolution of only small differences between sequences. Moreover, as two distinct species, including species of Bacillus, may possess identical 16S rRNA sequences, gene sequence identification is not foolproof (Ash et al. 1991, Fox et al. 1992, Clarridge 2004). In BLAST searches of the EMBL database, the identity match values of isolate BL43 to different species (Bacillus licheniformis and Bacillus subtilis) were equal, so it was not possible to assign this isolate to species level. Furthermore, since isolate BL43 was assigned to Bacillus licheniformis and Bacillus subtilis with sequence similarities of 99.72% and the next classified species in the scoring list (B. mojavensis and B. vallismortis) showed less than 0.5% additional sequence divergence, isolate BL43 was marked as “Bacillus licheniformis or Bacillus subtilis with low demarcation to the next species, such as B. mojavensis and B. vallismortis” (Bosshard et al. 2003). Finally, besides ambiguous results regarding related species, the search results also showed high matches to microorganisms of a different Gram type. Therefore, we were careful to consider only matches with the Gram type of the target bacteria. Taken together, the results presented here emphasise that a number of basic biochemical tests alongside sequence-based analysis are of critical importance.
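The "low demarcation" rule applied above can be written down as a short sketch. It assumes the 99% species cut-off from section 3.3.7 and the 0.5% divergence margin mentioned in the text; the function name and the returned dictionary layout are hypothetical.

```python
def call_species(hits, species_cutoff=99.0, demarcation=0.5):
    """Report best-matching species and close runners-up.

    hits: list of (species, percent_similarity) tuples.
    Returns the species sharing the top score (if it reaches the species
    cut-off) and any further species whose score is less than
    `demarcation` percentage points below the top score.
    """
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    if not ranked or ranked[0][1] < species_cutoff:
        return {"best": [], "low_demarcation_to": []}
    top = ranked[0][1]
    best = [species for species, pct in ranked if pct == top]
    near = [species for species, pct in ranked
            if pct < top and top - pct < demarcation]
    return {"best": best, "low_demarcation_to": near}


# FASTA similarities reported for isolate BL43 in section 3.4.2.
bl43_hits = [
    ("Bacillus licheniformis", 99.72),
    ("Bacillus subtilis", 99.72),
    ("Bacillus mojavensis", 99.37),
]
result = call_species(bl43_hits)
print(result)
```

With these values, two species share the top score and B. mojavensis lies within the 0.5% margin, which corresponds to the ambiguous assignment described above.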

3.5.2 Comparison of 16S rRNA sequence-based identification results and conventional bacteria identification

Previous studies have shown that both Pseudomonas and Xanthomonas species are regularly misidentified as Burkholderia species (Burdge et al. 1995, Urakami et al. 1994). Recent studies based on rRNA sequences showed that Burkholderia species, although they share characteristics with members of the genus Pseudomonas, are distinct and separate, and some species were therefore transferred from one genus to the other (Kersters et al. 1996). Our results for isolate Xs148 were in agreement with the aforementioned studies reporting conflicting results between conventional and sequence-based identification: the BIOLOG test identified the isolate as Burkholderia glumae, whereas 16S rRNA sequence-based analysis using FASTA, BLAST and RDP-II (with data set-2) showed the highest matches to Xanthomonas oryzae pv. oryzae, Pseudomonas sp. and Azotobacter salinestris, respectively. Although the best sequence matches given by these three programs differed, the EMBL-Bank sequences of Burkholderia did not even appear on the BLAST match list because their similarity to the query was too low (identity score by multiple alignment: 77%); we therefore strongly suspect that this strain was misidentified by conventional methods.


For some genera or species, the conflict between sequence-based and phenotypic identification is simply due to too few entries deposited in the databases, so that the similarity level for a particular query does not reach the species (99%) or even the genus (96%) level (Drancourt et al. 2000). However, we found more than 600 entries for cultured Burkholderia species in the EMBL-Bank (January 2006). Such a number should be enough to assign the sequence in question at least to the genus level if the target bacterium belonged to this genus. On the other hand, it may be argued that the conflicting results obtained for isolate Xs148 in this study are due to the very short sequence analysed. However, Wilck et al. (2001) reported that DNA sequences of even less than 200 bp were sufficient for identification purposes. Although only a partial sequence of isolate Xs148 was used, it suggests that this isolate belongs to a genus different from the one identified by conventional methods. Taken together, we suggest that 16S rRNA-based identification of strains of some genera at the species level is not reliable enough and requires additional tests. Finally, reliance on a single molecular method for species definition, such as 16S rRNA gene sequencing, cannot take into account small evolutionary changes, such as point mutations (Stackebrandt et al. 2002). Thus, in practice, a polyphasic approach that includes alternate gene targets, performed in parallel with the examination of a number of phenotypic properties, is necessary for definitive species identification.
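The similarity cut-offs quoted above can be written as a small decision function. The sketch below is illustrative only: the function name `rank_for_similarity` and the use of "greater than or equal to" at each cut-off are assumptions, while the default values of 99% and 96% are the ones cited in the text.

```python
def rank_for_similarity(similarity, species_cutoff=99.0, genus_cutoff=96.0):
    """Return the taxonomic rank supported by a percent similarity value."""
    if not 0.0 <= similarity <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity >= species_cutoff:
        return "species"
    if similarity >= genus_cutoff:
        return "genus"
    # Below the genus cut-off the match supports no assignment at all.
    return "unassigned"


print(rank_for_similarity(99.72))  # BL43 versus its best Bacillus matches
print(rank_for_similarity(77.0))   # Xs148 versus Burkholderia sequences
```

Under these assumptions, the 77% identity between isolate Xs148 and the Burkholderia sequences falls well below the genus cut-off, which is consistent with the suspicion that the conventional identification was incorrect.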

3.5.3 16S-23S ISR-based bacteria identification

Since 16S rRNA-based identification did not enable us to identify isolate Xs148 or to classify isolate BL43 to the species level, we considered alternate gene targets. A review of current taxonomic molecular marker genes for bacteria identification prompted us to examine the use of the 16S-23S rRNA ISR (Harrel et al. 1995, Dong and Cote 2003) for further sequence-based identification. The 16S-23S rRNA ISR proved to be the best alternate target to 16S rRNA because we found a high consensus between the results of 16S rRNA and 16S-23S rRNA ISR sequence-based identification for isolate Xs148. Goncalves and Rosato (2002) reported that among Xanthomonas species, 16S-23S rRNA ISR sequence similarity ranged from 63 to 99%; however, the topology of the 16S-23S rRNA ISR phylogenetic tree was very similar to that of the 16S rDNA tree, despite the higher level of diversity among the ISR (ITS) sequences (16.2%) compared with the 16S rDNA sequences (1.8%). In this study, both 16S rRNA and 16S-23S rRNA ISR analyses showed that isolate Xs148 fell into Cluster I of the Xanthomonas species (as determined by Goncalves and Rosato 2002), showing the highest similarity to Xanthomonas oryzae and Xanthomonas campestris. For isolate Xs148, direct sequence determination of 16S-23S ISR fragments represented a highly accurate method for bacterial identification to the species level, even though the species in question was notoriously difficult to identify by 16S rRNA sequence-based identification. This is because 16S-23S rRNA ISR similarities reflect phylogenetic relationships and the ISR has more discriminative nucleotide stretches, which allow identification to the species and even to the strain level (Blackwood et al. 2004).

The 16S-23S rRNA ISR-based approach failed to identify isolate BL43, producing results that differed from both the basic bacteriological and the 16S rRNA sequence-based identification results. We suggest that the potential of ISR analysis for bacteria identification is likely to depend on specific nucleotide stretches of the ISR or on the bacterial species studied (Yoon et al. 1997, Yoon et al. 1998, Kuwahara et al. 2001).

3.6 Conclusion


Since no single identification method is able to identify all microorganisms and each has its advantages and disadvantages, we tested the potential of 16S rRNA and 16S-23S rRNA ISR sequence-based molecular identification methods in combination with conventional identification methods (the morphological-biochemical method and the phenotypic BIOLOG test) as part of a polyphasic approach to identify two plant root-colonizing bacteria as Bacillus subtilis and Xanthomonas sp.

The present study is unique in that (i) results derived from 16S rRNA and 16S-23S rRNA ISR sequence-based identification were compared to better allow further species definition and (ii) two bacterial strains belonging to different taxonomic groups were examined to ensure that the findings reported here are applicable to a wide range of bacteria.

From our analyses, we conclude that (i) rRNA sequence-based identification should be performed in conjunction with traditional biochemical bacteriological tests that provide basic information about the microorganism in query, (ii) when the similarity scores of different species belonging to the same genus differ by less than 1%, the FASTA program displays more precise values, i.e. to 3 decimal places, than BLAST and (iii) 16S rRNA in conjunction with 16S-23S rRNA ISR sequence-based identification can be used to identify and differentiate between species only if the quality of the database is high enough, i.e. it contains a high number of reliable sequence entries taken from a comprehensive range of species.
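The effect of display precision noted in point (ii) can be illustrated with a short calculation. The sketch below is a minimal example under stated assumptions: the match counts and alignment length are hypothetical, the helper names `percent_identity` and `report` are invented for illustration, and the two rounding levels are not a description of the actual output format of either program.

```python
def percent_identity(matches, aligned_length):
    """Percent identity of an alignment from its matching positions."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("need 0 <= matches <= aligned_length and length > 0")
    return 100.0 * matches / aligned_length


def report(matches, aligned_length, decimals):
    """Format the percent identity with a given number of decimal places."""
    return f"{percent_identity(matches, aligned_length):.{decimals}f}"


# Two hypothetical hits that differ by a single mismatch over 1442 positions.
for matches in (1438, 1437):
    coarse = report(matches, 1442, 0)  # whole percent
    fine = report(matches, 1442, 3)    # three decimal places
    print(matches, coarse, fine)
```

In this constructed case both hits round to the same whole percent, and only the three-decimal values separate them, which is why the number of reported decimal places matters when closely related species differ by less than 1%.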

The present data suggest that an integrated genetic and phenotypic approach for taxonomic classification, a so-called “polyphasic approach” (Vandamme et al. 1996), provides more specific and accurate species definition.


Chapter 4: Enumeration of 16S-23S ISR of two diazotrophic bacteria in plant samples by real-time PCR with SYBR Green I approach

DiML DTD Version 4.0. Certified document server of the Humboldt-Universität zu Berlin