Directed evolution protocols involve multiple rounds of mutation, screening, and amplification, so their total time and cost increase in proportion to the number of rounds. A potential alternative, recently studied experimentally by Fellouse et al. (1,2,3), is to engineer antibodies with a reduced amino acid alphabet: an 'optimal' subset of 2 or 4 amino acids instead of the full 20-letter code. In theory, a smaller alphabet reduces the sequence space to be explored by combinatorial methods by a factor (20/Q)^N, where N = 100 is the typical length of the variable regions and Q is the size of the optimal subset (2 or 4). However, it remains an open question whether the dissociation constants achievable with a reduced alphabet are comparable to those obtained with the complete 20 amino acid code. Indeed, the dissociation constants reported by Fellouse et al. (1,2,3) are higher for smaller alphabets, and orders of magnitude higher than the lowest dissociation constant obtained so far with traditional phage display methods (4).
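To give a sense of scale for the factor (20/Q)^N, the short sketch below evaluates its base-10 logarithm for the values quoted above. The function name `log10_reduction` and its parameter names are illustrative choices, not part of the original work.

```python
import math

def log10_reduction(q, n=100, full=20):
    """Base-10 logarithm of the reduction factor (full/q)**n.

    q    : size of the reduced alphabet (Q in the text)
    n    : length of the variable region (N in the text)
    full : size of the complete amino acid alphabet
    """
    return n * math.log10(full / q)

# Evaluate the factor for the two reduced alphabets mentioned in the text.
for q in (2, 4):
    print(f"Q = {q}: sequence space shrinks by about 10^{log10_reduction(q):.1f}")
```

With N = 100, the factor is 10^100 for Q = 2 and roughly 10^69.9 for Q = 4.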
In the present work, we address part of this question by applying a theory from statistical mechanics, the generalized NK model (5), to simulate directed evolution experiments with amino acid alphabets of different sizes: Q = 2, 5 and 20. Our theoretical results reveal that a larger amino acid alphabet leads, in the long term, to lower evolved free energies, and therefore to lower dissociation constants, in agreement with the trend observed in the experimental results (1,2,3,4).
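The general shape of such a simulation can be illustrated with a toy example. The sketch below is not the generalized NK model of ref. (5), which is more detailed; it assumes a plain NK-style random energy landscape, point mutations, and selection of the lowest-energy sequences in each round. All names and parameter values (`make_nk_energy`, `evolve`, `pool`, `keep`, and so on) are hypothetical.

```python
import random

def make_nk_energy(n, k, seed=0):
    """Return an energy function for a toy NK-style landscape.

    Each site i contributes a random Gaussian term that depends on its own
    letter and on the letters of its k right-hand neighbours (cyclically).
    Contributions are drawn lazily and cached, so the landscape is fixed
    once a given local configuration has been seen.
    """
    rng = random.Random(seed)
    table = {}

    def energy(seq):
        total = 0.0
        for i in range(n):
            key = (i, tuple(seq[(i + j) % n] for j in range(k + 1)))
            if key not in table:
                table[key] = rng.gauss(0.0, 1.0)
            total += table[key]
        return total / n

    return energy

def evolve(n=20, k=3, q=4, pool=50, keep=5, rounds=30, seed=0):
    """Run rounds of mutation and selection over an alphabet of size q.

    Returns the lowest energy found in the pool at each round.
    """
    rng = random.Random(seed + 1)
    energy = make_nk_energy(n, k, seed)
    population = [[rng.randrange(q) for _ in range(n)] for _ in range(pool)]
    best_per_round = []
    for _ in range(rounds):
        # Screening: rank the pool and keep the lowest-energy sequences.
        population.sort(key=energy)
        parents = population[:keep]
        best_per_round.append(energy(parents[0]))
        # Amplification with mutation: refill the pool with point mutants.
        population = [list(p) for p in parents]
        while len(population) < pool:
            child = list(rng.choice(parents))
            child[rng.randrange(n)] = rng.randrange(q)
            population.append(child)
    return best_per_round

if __name__ == "__main__":
    for q in (2, 5, 20):
        print(q, evolve(q=q)[-1])
```

Because the selected parents are carried over unchanged, the best energy in this toy scheme cannot increase from one round to the next. No conclusion about the dependence on Q should be drawn from a single run of such a simplified landscape.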
Finally, we present experimental support for our theoretical model by comparing the amino acid usage distributions obtained from our simulations with the ranked usage frequency distributions obtained from antibody sequences corresponding to the human CDR-H3 hypervariable loop. For this purpose, we processed the data reported by Zemlin et al. (6,7) and obtained usage frequency histograms and Shannon entropy calculations, both in fair agreement with our simulations.
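The two quantities mentioned above, ranked usage frequencies and Shannon entropy, can be computed as in the following minimal sketch. It assumes sequences given as strings of one-letter amino acid codes and entropy measured in bits; both choices, the function names, and the toy input are assumptions made for illustration and are not taken from the data of refs. (6,7).

```python
import math
from collections import Counter

def ranked_usage(sequences):
    """Return usage frequencies of all residues, sorted from most to least used."""
    counts = Counter(residue for seq in sequences for residue in seq)
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)

def shannon_entropy(freqs):
    """Shannon entropy, in bits, of a frequency distribution."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)

# Hypothetical toy input, not real CDR-H3 data.
toy_sequences = ["ARDYYGSS", "ARGGYFDY", "AKDWYFDV"]
freqs = ranked_usage(toy_sequences)
print(freqs)
print(shannon_entropy(freqs))
```

For reference, a uniform distribution over all 20 amino acids gives the maximum entropy of log2(20), about 4.32 bits, so lower values indicate more biased usage.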
1) F. A. Fellouse, C. Wiesmann, and S. S. Sidhu. Proc. Natl. Acad. Sci. USA, 101:12467-12472, 2004.
2) F. A. Fellouse, P. A. Barthelemy, R. F. Kelley, and S. S. Sidhu. J. Mol. Biol., 357:100-114, 2006.
3) F. A. Fellouse, B. Li, D. M. Compaan, A. A. Peden, S. G. Hymowitz, and S. S. Sidhu. J. Mol. Biol., 348:1153-1162, 2005.
4) E. T. Boder, K. S. Midelfort, and K. D. Wittrup. Proc. Natl. Acad. Sci. USA, 97:10701-10705, 2000.
5) L. D. Bogarad and M. W. Deem. Proc. Natl. Acad. Sci. USA, 96:2591-2596, 1999.
6) M. Zemlin, M. Klinger, J. Link, C. Zemlin, K. Bauer, J. A. Engler, H. W. Schroeder Jr., and P. M. Kirkham. J. Mol. Biol., 334:820-824, 2003.
7) M. Zemlin, M. Klinger, J. Link, C. Zemlin, K. Bauer, J. A. Engler, H. W. Schroeder Jr., and P. M. Kirkham. J. Mol. Biol., 334:820-824, 2003. Supplementary information. Data available at doi:10.1016/j.jmb.2003.10.007.