Comprehensive Analysis of Stop Codon Usage in Bacteria and its Correlation with Release Factor Abundance [RNA]

September 16th, 2014 by Korkmaz, G., Holm, M., Wiens, T., Sanyal, S.

We present a comprehensive analysis of stop codon usage in bacteria by analyzing over 8 billion nucleotide sequences of 4684 bacterial sequences. Using a newly developed program called "stop codon counter", the frequencies of the three classical stop codons TAA, TAG and TGA have been analyzed and a publicly available stop codon database has been built. Our analysis shows that as genomic GC content increases, the frequency of the TAA codon decreases and that of the TGA codon increases in a reciprocal manner. Interestingly, the release factor 1 specific codon TAG maintains a more or less uniform frequency (~20%), irrespective of the GC content. The low abundance of TAG also holds when genes are grouped by expression level. In contrast, highly expressed genes predominantly end with TAA, which can be read by either of the two release factors. Using three model bacteria with different stop codon usage (Escherichia coli, Mycobacterium smegmatis, and Bacillus subtilis), we show that the frequency of TAG and TGA codons correlates well with the relative steady-state amounts of mRNA and protein for release factors RF1 and RF2 during exponential growth. Furthermore, using available microarray data for gene expression, we show that under both fast-growth and contrasting biofilm-formation conditions, the relative level of RF1 correlates well with the expression level of the genes ending with TAG.
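
The kind of tally described above can be illustrated with a minimal Python sketch. This is not the authors' "stop codon counter" program, whose implementation is not shown here. It assumes each coding sequence is a plain DNA string that ends with its stop codon, and the names `stop_codon_frequencies`, `gc_content`, and `toy_cds` are hypothetical, introduced only for illustration.

```python
from collections import Counter

# The three classical stop codons named in the abstract.
STOP_CODONS = ("TAA", "TAG", "TGA")


def gc_content(seq):
    """Return the fraction of G and C bases in a DNA string (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def stop_codon_frequencies(cds_list):
    """Return the relative frequency of each stop codon across coding sequences.

    Assumption: every sequence is given 5'->3' and ends with its stop codon.
    Sequences whose last three bases are not a classical stop codon are skipped.
    """
    counts = Counter()
    for cds in cds_list:
        last_codon = cds.upper()[-3:]
        if last_codon in STOP_CODONS:
            counts[last_codon] += 1
    total = sum(counts.values())
    return {codon: (counts[codon] / total if total else 0.0)
            for codon in STOP_CODONS}


# Toy input, invented for illustration only (not real genome data).
toy_cds = ["ATGGCTTAA", "ATGAAATGA", "ATGCCCTAA", "ATGGGGTAG"]
freqs = stop_codon_frequencies(toy_cds)
genome_gc = gc_content("".join(toy_cds))
print(freqs, round(genome_gc, 3))
```

Running such a tally per genome and plotting each stop codon's frequency against that genome's GC content would be one way to look for the reciprocal TAA/TGA trend the abstract reports.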