Accurate Structure Prediction of CDR H3 Loops Enabled By a Structure-Based C-Terminal Constraint within Rosetta
RosettaAntibody models an antibody by breaking the structure into eight distinct structural components: the heavy- and light-chain frameworks; CDR loops L1–3; and CDR loops H1–3. Because the non-H3 CDR loops adopt canonical conformations, accurate backbone conformations for them can usually be found in known structures. RosettaAntibody exploits this by selecting templates from curated structural databases by BLAST bit-score for CDRs L1–3, H1 and H2 and the framework regions. The structural components are defined so that adjacent components share overlapping residues, which can then be superposed to create a grafted model. An initial VH–VL orientation is also selected from databases, and the grafted heavy and light chains are each superposed onto the corresponding chain in the orientation template. After this, the CDR H3 loop is modeled de novo while sampling the VH–VL orientation.
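The per-component template selection described above can be sketched as follows. This is a minimal illustration, not the RosettaAntibody implementation: the hit list, the component labels, the template names and the function `select_templates` are all hypothetical, and the sketch assumes the BLAST searches have already been run and reduced to (component, template, bit-score) tuples.

```python
# Hypothetical BLAST results: (structural component, template id, bit-score).
# All names and scores are made up for illustration.
hits = [
    ("L1", "tmplA", 28.5),
    ("L1", "tmplB", 31.2),
    ("H2", "tmplC", 40.1),
    ("H2", "tmplD", 35.0),
    ("FRH", "tmplE", 150.3),
]


def select_templates(hits):
    """Return the highest bit-score template for each structural component."""
    best = {}
    for component, template_id, bit_score in hits:
        # Keep a hit only if it beats the current best for this component.
        if component not in best or bit_score > best[component][1]:
            best[component] = (template_id, bit_score)
    return {component: tid for component, (tid, _) in best.items()}


templates = select_templates(hits)
for component, template_id in sorted(templates.items()):
    print(component, template_id)
```

Each component is chosen independently here, which is why the overlapping residues matter: they provide the common frame needed to superpose templates drawn from different structures.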
We have presented the performance of RosettaAntibody in Antibody Modeling Assessment II (AMA II). With few exceptions, RosettaAntibody selects templates for the framework regions and the non-H3 CDR loops with sub-angstrom RMSDs from the native structure. The most difficult aspects of antibody homology modeling remain accurately predicting the VH–VL orientation and the CDR H3 conformation.
A large majority of CDR H3 loops have a C-terminal kink, and in AMA II we found that producing low-RMSD models required filtering out non-kinked H3 conformations. However, the scores of the kinked structures were higher (less favorable) than those of some of the extended structures that Rosetta produced. In response to these findings, we developed new geometric parameters that describe the kink.
Because the CDR H3 loop lies at the interface between the heavy and light chains, incorrect VH–VL orientations can frustrate identifying correct CDR H3 conformations. Since AMA II, progress has been made in predicting VH–VL orientation from sequence by training a random forest model on a set of “fingerprint” residues at the VH–VL interface using ABangle's six degree-of-freedom description of orientation. Similarly, effort has been made to develop a CDR H3-specific loop modeling routine, but successful predictions require extremely accurate atomic coordinates for the rest of the FV, which may make these tools better suited to refining crystal structures with poor electron density around the CDR H3 loop than to homology modeling.
De novo loop modeling has endured as a challenging problem in part because of the large number of degrees of freedom that need to be sampled, as well as the difficulty of accurately ranking different structures that may appear to be very similar when using a coarse-grained measurement such as RMSD. Additionally, side-chain interactions may play key roles in stabilizing observed loop conformations, potentially complicating low-resolution searches. Complicating the task even further is the most common source of the reference coordinates: crystal structures. Crystals are extremely crowded environments in which each protein molecule is surrounded by several others; this may or may not influence the observed conformation within the asymmetric unit. Unless a crystal structure of the same protein exists in more than one distinct crystal form, it cannot be determined whether these “crystal contacts” perturb the conformation of any region of the protein.
Similarly, another complication of loop modeling is the search for a single set of coordinates. Proteins in physiological conditions are not completely rigid, and estimating the conformational entropy of a loop requires supplying a model to describe the modes of flexibility accessible to the loop. Nevertheless, the possible existence of multiple degenerate-energy conformations cannot be dismissed.
In this study, we use the parameters defined in our previous work on CDR H3 structures to constrain the kink during the course of a simulation. To limit the uncertainty in the crystallographic coordinates, we have constructed a set of extremely high-resolution H3 loops. Given the high degree of confidence in the atomic coordinates, computed RMSD values are also better defined. The constraint is tested by predicting H3 conformations on the crystal framework structure across the set of benchmark structures. Finally, to test the utility of the constraint, we also assess docking of an antibody with a modeled H3 loop and CDR H3 modeling on a homology-modeled framework.
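To make the idea of constraining a geometric parameter concrete, the sketch below computes a dihedral angle from four points and applies a flat-bottomed harmonic penalty when the angle leaves an allowed window. It is an illustration under stated assumptions, not the constraint implemented in Rosetta: this section does not specify the kink parameters or their functional form, so the choice of a pseudo-dihedral over four Cα positions, the function names `dihedral` and `flat_bottom_penalty`, and the window values (center 40°, half-width 30°) are all hypothetical placeholders.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle, in degrees, defined by four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [x / norm for x in b1]
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = [b0[i] - _dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - _dot(b2, b1) * b1[i] for i in range(3)]
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def flat_bottom_penalty(angle, center, half_width, weight=1.0):
    """Zero inside [center - half_width, center + half_width], harmonic outside."""
    # Wrap the difference into [-180, 180) so the penalty respects periodicity.
    delta = (angle - center + 180.0) % 360.0 - 180.0
    excess = max(0.0, abs(delta) - half_width)
    return weight * excess ** 2


# Hypothetical C-alpha coordinates and placeholder window values.
ca = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
angle = dihedral(*ca)
penalty = flat_bottom_penalty(angle, center=40.0, half_width=30.0)
print(round(angle, 1), round(penalty, 1))
```

A flat-bottomed form is used in this sketch so that conformations inside the window are ranked by the rest of the score function alone, while conformations outside it are penalized increasingly with distance from the window.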
The method is integrated into the computational protocols implemented within Rosetta for predicting the three-dimensional structure of an antibody from sequence and subsequently docking the antibody to its cognate antigen. The approach employed is robust; antibody modeling leverages the existence of canonical loop conformations to graft large segments from experimentally determined antibody structures, and it uses (1) energetic calculations to minimize canonical loops, (2) docking methodology to refine the relative orientation of the VH and VL domains, and (3) de novo loop structure prediction to model the elusive CDR H3 loop. Similarly, to alleviate model uncertainty, antibody–antigen docking resamples CDR loop conformations (either through minimization or explicit refinement) in the binding context and can use a set of input models to represent an ensemble of conformations for the antibody, the antigen or both.
Our methods are freely available to academic users and can be run fully automated, either on a user’s computer or via the ROSIE web server, or semi-automated on a user’s computer.