Research
Does codon bias have an evolutionary origin?
Homulus Foundation, 612 S Flower St., #1220, Los Angeles, 90017 CA, USA
Theoretical Biology and Medical Modelling 2008, 5:16. doi:10.1186/1742-4682-5-16
Additional files

Additional file 1: CUB Tables – Summary of 113 Species.
Format: XLS. Size: 154 KB.

Additional file 2: Calculation of Codon Usage Bias (CUB) – Explanation and Example. The 64 codons were sorted into 21 subgroups (fractions) corresponding to the 20 coded amino acids and the stop signal. The sum of the synonymous codon frequencies within each subgroup was always regarded as 100%, i.e. the sum of all codon frequencies is 2100% (color-coded columns). The fractional frequency (CUFij %) of a synonymous codon is the contribution of that codon to this 100%. The theoretical, natural frequencies of the synonymous codons are regarded as equal to each other (for example, the natural fractional frequency of each synonymous codon of Arg is 100%/6 = 16.7%). The difference between this theoretical (calculated) frequency and the real (counted) fractional frequency of a codon is the CUBij %. However, the absolute value |CUBij %| must be used to calculate and compare the total CUB values of entire proteins (i.e. the sum of 64 CUB values), because the signed values within each subgroup sum to zero. A theoretical extreme case of codon usage is when only one of all synonymous codons is used (CUB% 1 max column). The maximal possible CUB of all codons will in this case be 2416.7%, which is regarded as the CUBmax. In the real case of Homo sapiens the sum of the |CUBij %| values is 456%, which is 18.9% of the theoretical CUBmax.
Format: XLS. Size: 44 KB.

Additional file 3: Correlation Analyses of Codon Usage Bias (CUB) in 113 Species. CUBs of 113 species (each containing 64 values) were compared to the virtual CUB values in the Pan-Genomic Codon Usage Table by linear regression analyses. The -log C values were used as a measure of similarity and are indicated by horizontal bars at the right edge of the table (C: significance of correlation). The subgroups, corresponding to larger phylogenetic categories, are color coded, and mean values for the groups are also indicated. The numbers of species in the subgroups are given in the "Mean" rows.
Format: XLS. Size: 503 KB.

Additional file 4: CUB commitment and variation. The 64 codons in 113 species were sorted according to the size and +/- orientation of their CUB. Some manual adjustments were made to segregate the data into four approximately symmetrical subgroups (corresponding to the color codes).
Format: XLS. Size: 119 KB.

Additional file 5: CUF – Pan-Genomic Codon Correlations. Codon frequencies were collected from 113 Codon Usage Frequency Tables and the significances of correlations (C, 64 × 64) were calculated (n = 113). The table displays the -log C values. A minus sign was added to the -log C value to indicate negative correlations. Significant positive correlations (values > 2) are indicated by bold numbers and a gray background, while significant negative correlations (values < -2) are indicated by italic numbers and a pink background. The collected data are sorted into 4 × 4 × 3 × 3 = 144 different subgroups corresponding to the 4 × 4 codon letter combinations and the 3 × 3 codon positions (red letters).
Format: XLS. Size: 883 KB.

Additional file 6: Prediction of Wobble Bases. List of correlations between the frequency of a codon and the frequencies of other codons that contain the 4 × 4 permutations of bases at the 1st and 2nd codon positions. 64 × 16 equations were calculated from these correlations. Only the strongest correlations, those used in codon predictions, are listed and color coded. Positive correlations are indicated by bold letters on a blue background and negative correlations by italic letters on a pink background. F is as defined in fig. 9.
Format: XLS. Size: 265 KB.
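The CUB calculation described for Additional file 2 can be sketched in a few lines of Python. This is a minimal illustration and not the author's spreadsheet: the function names (`codon_usage_bias`, `total_abs_cub`, `cub_max`), the use of the standard genetic code, and the sign convention (observed minus natural frequency) are assumptions made here for the example.

```python
from collections import defaultdict

# Standard genetic code in TCAG order; '*' marks the stop signal.
# (Assumption: the standard code applies to the species of interest.)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO_ACIDS))

# The 21 subgroups: 20 amino acids plus the stop signal.
GROUPS = defaultdict(list)
for codon, aa in CODON_TO_AA.items():
    GROUPS[aa].append(codon)


def codon_usage_bias(counts):
    """Return {codon: signed CUB in %} for a {codon: count} table.

    CUF is the share of a codon within its synonymous subgroup (sums to 100%
    per subgroup); the natural frequency is 100% / number of synonyms.
    Sign convention (an assumption): CUB = observed CUF - natural frequency.
    """
    cub = {}
    for aa, codons in GROUPS.items():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            raise ValueError(f"no counts for subgroup {aa!r}; CUF is undefined")
        natural = 100.0 / len(codons)
        for c in codons:
            cuf = 100.0 * counts.get(c, 0) / total
            cub[c] = cuf - natural
    return cub


def total_abs_cub(counts):
    """Sum of |CUB| over all 64 codons (signed values cancel per subgroup)."""
    return sum(abs(v) for v in codon_usage_bias(counts).values())


def cub_max():
    """Total |CUB| in the extreme case where each subgroup uses one codon only."""
    one_codon_only = {codons[0]: 1 for codons in GROUPS.values()}
    return total_abs_cub(one_codon_only)


if __name__ == "__main__":
    print(round(cub_max(), 1))              # 2416.7
    print(round(100 * 456 / cub_max(), 1))  # 18.9 (the Homo sapiens figure)
```

The extreme-case total reproduces the 2416.7% quoted above, and dividing the reported human total of 456% by it gives the stated 18.9%.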












