The sequences of the five identical 16S rRNA genes of strain 1H11T were compared using NCBI BLAST [10] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database [11]. The relative frequencies of taxa and keywords (reduced to their stem [12]) were then determined and weighted by BLAST scores. The most frequently occurring genera were Halomonas (50.7%), Chromohalobacter (46.3%), ‘Haererehalobacter’ (1.7%), Bacillus (0.8%) and Pseudomonas (0.5%) (214 hits in total). For 16 hits to sequences from members of the species C. salexigens, the average identity within HSPs was 99.9% and the average coverage by HSPs was 97.9%. For 22 hits to sequences from other members of the genus Chromohalobacter, the average identity within HSPs was 98.2% and the average coverage by HSPs was 98.6%. Among all other species, the one yielding the highest score was Chromohalobacter marismortui (X87222), which corresponded to an identity of 99.9% and an HSP coverage of 100.0%. (Note that the Greengenes database uses the INSDC (= EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification.) The highest-scoring environmental sequence was EU799899 (‘It’s all ranking aquatic Newport Harbor RI clone 1C227569’), which showed an identity of 100.0% and an HSP coverage of 100.0%. The most frequently occurring keywords within the labels of environmental samples that yielded hits were ‘soil’ (12.1%), ‘lake’ (3.6%), ‘salin’ (3.0%), ‘agricultur’ (2.9%) and ‘alkalin, chang, flood, former, mexico, texcoco’ (2.6%) (36 hits in total). The most frequently occurring keyword within the labels of environmental samples that yielded hits with a higher score than the highest-scoring species was ‘aquat, harbour, newport, rank’ (25.0%) (2 hits in total). These keywords fit reasonably well with the ecological and physiological properties reported for strain 1H11T in the original description [1]. Figure 1 shows the phylogenetic neighborhood of C. salexigens in a 16S rRNA-based tree.
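The score-weighted relative frequencies reported above can be illustrated with a minimal sketch. The text does not spell out the exact weighting scheme, so this assumes the simplest reading: each hit adds its BLAST score to the total for its label (a genus or a stemmed keyword), and the totals are normalized to percentages. The function name `weighted_frequencies` and the example hits are hypothetical and are not data from this study.

```python
from collections import defaultdict


def weighted_frequencies(hits):
    """Return {label: percent}, with each hit weighted by its BLAST score.

    `hits` is an iterable of (label, score) pairs, where the label is,
    for example, the genus name or a stemmed keyword taken from the hit.
    Assumption: a label's weight is the plain sum of its hit scores.
    """
    totals = defaultdict(float)
    for label, score in hits:
        totals[label] += score

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}

    # Normalize the per-label score sums to percentages of the overall sum.
    return {label: 100.0 * s / grand_total for label, s in totals.items()}


# Hypothetical (genus, score) pairs for illustration only.
example_hits = [
    ("Halomonas", 2000.0),
    ("Halomonas", 1000.0),
    ("Chromohalobacter", 1000.0),
]
freqs = weighted_frequencies(example_hits)
```

Under this reading, a genus represented by a few high-scoring hits can outweigh one represented by many low-scoring hits, which is why the percentages need not match the raw hit counts.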
The sequences of the five identical 16S rRNA gene copies in the genome differ by two nucleotides from the previously published 16S rRNA sequence (AJ295146), which contains three ambiguous base calls.

Figure 1 Phylogenetic tree highlighting the position of C. salexigens relative to the type strains of the other species within the genus and the type species of the other genera within the family Halomonadaceae. The tree was inferred from 1,440 aligned characters …

Table 1 Classification and general features of C.