- 3:15 PM

De Novo Protein Design Based on Binding Affinity Predictions for the Discovery of Novel Inhibitors for Complement 3 and HIV-1

Meghan L. Bellows, Ho Ki Fung, and Christodoulos A. Floudas. Department of Chemical Engineering, Princeton University, Engineering Quadrangle, Princeton, NJ 08544

Protein design, also known as the inverse folding problem, seeks the amino acid sequence that will fold into a given 3-dimensional template. The problem is degenerate because many amino acid sequences fold into the same template. It is therefore important to examine all possible sequences for a given template and rank them by the specific properties being designed for (activity, specificity, etc.). A new de novo design framework based on binding affinity calculations is introduced. The framework consists of two stages: a sequence selection stage and a binding affinity calculation stage. The sequence selection stage produces a rank-ordered list of the lowest-energy amino acid sequences by solving an integer programming sequence selection model [1]. The model incorporates backbone flexibility into the design process by using the set of structures obtained from NMR experiments, so that the pairwise energy between two residues can take on a range of values depending on the range of distances observed across those structures. The second stage employs Monte Carlo simulations, first to predict the structure of each sequence from stage one (using Rosetta Abinitio [2]) and then to dock the new sequence against the target protein (using Rosetta Dock [3]). Rotamer-based ensembles of structures for the new peptide, the target protein, and the peptide-protein complex are generated using Rosetta Design [4] and used to calculate approximate molecular partition functions for each of the three species. From these approximate partition functions, a binding affinity is calculated [5]. The more accurate the partition functions, the more accurate the predicted binding affinity. This new framework was applied to a complex of Complement C3c with compstatin variant E1 and to a complex of HIV-1 gp120 with CCR5.
The compstatin studies calculate the binding affinity of known compstatin sequences obtained from previous designs in the literature. The calculated binding affinities follow the trends in activity of the designed sequences relative to the native sequence. The studies of an inhibitor for HIV-1 gp120 based on CCR5 examine sequences generated in the sequence selection stage by two models: the original selection model, which does not incorporate backbone flexibility, and the model that does. The sequences examined so far indicate that the model incorporating backbone flexibility yields more sequences with binding affinities higher than that of the native CCR5 sequence. The novel framework predicts improved binding affinities for a number of the candidate sequences designed from the structure of compstatin for C3c inhibition and from the structure of CCR5 for gp120 inhibition.
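
As an illustration of the binding affinity stage, the sketch below computes an ensemble-based score as the ratio of approximate partition functions, q_complex / (q_protein * q_peptide), in the spirit of the approach cited as [5]. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the temperature, the use of kcal/mol, and the example energies are all hypothetical, and each ensemble is assumed to be given as a plain list of conformer energies. The score is returned in logarithmic form to avoid overflow when energies are large and negative.

```python
import math

# Gas constant in kcal/(mol*K); the energy units are an assumption.
R_KCAL = 0.0019872041


def log_partition_function(energies, temperature=298.15):
    """Return ln(q), where q = sum_i exp(-E_i / (R*T)) over the ensemble."""
    if not energies:
        raise ValueError("ensemble must contain at least one conformer")
    beta = 1.0 / (R_KCAL * temperature)
    scaled = [-beta * e for e in energies]
    # Log-sum-exp shift keeps exp() in a safe numerical range.
    shift = max(scaled)
    return shift + math.log(sum(math.exp(s - shift) for s in scaled))


def log_binding_score(complex_energies, protein_energies, peptide_energies,
                      temperature=298.15):
    """Return ln(q_complex / (q_protein * q_peptide)).

    A larger value indicates a more favorable predicted binding affinity.
    """
    return (log_partition_function(complex_energies, temperature)
            - log_partition_function(protein_energies, temperature)
            - log_partition_function(peptide_energies, temperature))


if __name__ == "__main__":
    # Hypothetical conformer energies, for illustration only.
    complex_e = [-52.1, -51.7, -50.9]
    protein_e = [-30.4, -30.1]
    peptide_e = [-12.3, -11.8, -11.5]
    print(log_binding_score(complex_e, protein_e, peptide_e))
```

Because only a finite ensemble of conformers is enumerated, each partition function is an approximation; adding low-energy conformers to an ensemble changes the score most, which is why the quality of the ensembles limits the accuracy of the predicted affinity.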

[1] H. K. Fung, M. S. Taylor, and C. A. Floudas. Novel formulations for the sequence selection problem in de novo protein design with flexible templates. Optimization Methods and Software, 22:51-71, 2007.

[2] C. A. Rohl, C. E. M. Strauss, K. M. S. Misura, and D. Baker. Protein structure prediction using Rosetta. Methods in Enzymology, 383:66-93, 2004.

[3] J. J. Gray, S. Moughon, C. Wang, O. Schueler-Furman, B. Kuhlman, C. A. Rohl, and D. Baker. Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. Journal of Molecular Biology, 331:281-299, 2003.

[4] Y. Liu and B. Kuhlman. RosettaDesign server for protein design. Nucleic Acids Research, 34:W235-W238, 2006.

[5] R. H. Lilien, B. W. Stevens, A. C. Anderson, and B. R. Donald. A novel ensemble-based scoring and search algorithm for protein redesign and its application to modify the substrate specificity of the Gramicidin Synthetase A Phenylalanine Adenylation enzyme. Journal of Computational Biology, 12:740-761, 2005.