- 11:35 AM
765c

Predicting Translation Initiation Rate from RNA Sequence

Howard Salis (1), Ethan Mirsky (2), and Christopher Voigt (1). (1) Department of Pharmaceutical Chemistry, UC San Francisco, 1700 4th St., 408 Byers Hall, San Francisco, CA 94158, (2) Biophysics Graduate Program, UC San Francisco, 1700 4th St., 408 Byers Hall, San Francisco, CA 94158

The reliable genetic engineering of bacterial systems would be greatly aided by a more quantitative and predictive understanding of gene expression. The production rate of a protein can be precisely tuned by varying the mRNA sequence of the 5' UTR (untranslated region), including the ribosome binding site (RBS), that precedes the protein's coding sequence. However, there is currently no quantitative model that can predict an RNA sequence that yields a desired production rate of protein.

We have developed and experimentally validated a statistical thermodynamic model of translation initiation in Escherichia coli. We combine the quantitative model with Monte Carlo sampling to create a design method that generates synthetic 5' UTR RNA sequences. The user inputs a protein coding sequence and a desired translation initiation rate, and the design method generates an RNA sequence predicted to yield that rate. We measure the accuracy of the design method by generating numerous synthetic 5' UTRs that drive the expression of the fluorescent protein RFP in a simplified test system, and then compare the design method's predictions with flow cytometry data. The design method accurately generates RNA sequences that yield the desired translation initiation rate, with rates varying across three orders of magnitude.
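
The pairing of a scoring model with Monte Carlo sampling described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the model's functional form, so the free-energy function `toy_delta_g` (which only counts matches to the Shine-Dalgarno consensus AGGAGG), the Boltzmann-type relation in `predicted_rate`, and the parameters `beta`, `temperature`, `length`, and `steps` are all hypothetical stand-ins. The coding sequence, which the real method takes as input, is omitted here for brevity.

```python
import math
import random

SD = "AGGAGG"   # Shine-Dalgarno consensus; used only by the toy score below
BASES = "ACGU"


def toy_delta_g(utr):
    """Hypothetical stand-in for a free-energy model (NOT the authors' model).

    Returns -1.0 (arbitrary units) per base matching SD in the
    best-aligned window of the sequence.
    """
    best = 0
    for i in range(len(utr) - len(SD) + 1):
        matches = sum(a == b for a, b in zip(utr[i:i + len(SD)], SD))
        best = max(best, matches)
    return -1.0 * best


def predicted_rate(utr, beta=1.0):
    """Assumed Boltzmann-type relation: rate proportional to exp(-beta * dG)."""
    return math.exp(-beta * toy_delta_g(utr))


def design_utr(target_rate, length=20, steps=5000, temperature=0.1, seed=0):
    """Metropolis-style search for a sequence whose predicted rate hits the target.

    Returns the best sequence found and its error on a log scale.
    """
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(length)]

    def error(s):
        return abs(math.log(predicted_rate("".join(s))) - math.log(target_rate))

    cur_err = error(seq)
    best, best_err = list(seq), cur_err
    for _ in range(steps):
        # Propose a single-base substitution at a random position.
        cand = list(seq)
        cand[rng.randrange(length)] = rng.choice(BASES)
        cand_err = error(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        accept = cand_err <= cur_err or rng.random() < math.exp(
            (cur_err - cand_err) / temperature
        )
        if accept:
            seq, cur_err = cand, cand_err
            if cur_err < best_err:
                best, best_err = list(seq), cur_err
    return "".join(best), best_err


if __name__ == "__main__":
    utr, err = design_utr(target_rate=math.exp(4.0))
    print(utr, predicted_rate(utr), err)
```

In this sketch the error is measured on a log scale so that targets spanning several orders of magnitude are treated evenly; a realistic free-energy model would replace `toy_delta_g` and would also account for the downstream coding sequence.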

The presented design method enables the rational tuning of the production rates of one or more proteins in a synthetic or natural biological system and speeds up the optimization of metabolic pathways or genetic networks. With the emergence of lower-cost, large-scale DNA synthesis and the rapid assembly of genetic systems, the development bottleneck now shifts towards identifying the DNA sequence that ultimately yields a desired system behavior. This and other quantitative, predictive models aim to provide that much-needed missing link.