mBuild: A Hierarchical, Component-Based Molecule Builder
The biophysics simulation community has put considerable effort into creating tools and databases for building and parameterizing biological molecules with minimal effort, e.g., the Protein Data Bank[1], VMD[2], and AmberTools[3]. Such toolchains allow researchers to generate input files for complex structures, such as proteins and DNA, that run on most molecular dynamics simulation engines with little to no manual intervention. However, while these tools provide excellent functionality for setting up biological systems, they do not make it easy to generate arbitrary structures, and in many cases they offer no support for systems that contain semi-infinite substrates or require chemical bonding between separate species (e.g., self-assembled monolayers). Non-biochemical systems, such as monolayers, heterogeneous polymer melts, and surface-bound polymer brushes, require a different approach. These systems may not be regular, so it is not always possible to define a small unit cell that can be duplicated.
mBuild is a hierarchical, component-based molecule builder that relies on equivalence relations for component composition. Every component can recursively contain particles and other components, producing arbitrary hierarchical structures in which every particle is a leaf of the hierarchy. Bottom-level components, such as an alkyl group or a monomer, can be hand-drawn using software like Avogadro[4] and are then connected using an equivalence operator, which matches defined attachment sites between two components. mBuild supports parameterized structures through generative modeling. This allows repetitive structures (e.g., polymer chains, crystal structures, planar or spatial tilings) to be expressed declaratively, and it allows the affine transformations applied to the subcomponents of a composite component to be parameterized. Accompanying mBuild is a version-controlled library of components, which promotes reuse as well as collaborative component development and curation. Additionally, mBuild contains an interface to a molecular dynamics forcefield database. After generating a structure with mBuild, users can choose a forcefield with which to generate their system’s topology, parameterize the system, and ultimately produce a usable simulation input script.
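The composition scheme described above can be illustrated with a small, self-contained sketch. This is not mBuild's API: the names `Particle`, `Component`, `make_monomer`, and `build_chain` are hypothetical, coordinates are in arbitrary units, and the "equivalence" step is reduced to a pure translation that makes two attachment sites coincide (a general equivalence transform would also involve a rotation).

```python
from dataclasses import dataclass, field


def add(a, b):
    """Component-wise sum of two 3-vectors."""
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise difference of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


@dataclass
class Particle:
    """A leaf of the hierarchy (hypothetical, for illustration only)."""
    name: str
    pos: tuple


@dataclass
class Component:
    """A node that holds particles and/or other components."""
    name: str
    children: list = field(default_factory=list)
    ports: dict = field(default_factory=dict)  # label -> attachment site

    def particles(self):
        """Yield every leaf particle below this component, depth first."""
        for child in self.children:
            if isinstance(child, Particle):
                yield child
            else:
                yield from child.particles()

    def translate(self, shift):
        """Rigidly translate the attachment sites and all nested content."""
        self.ports = {k: add(v, shift) for k, v in self.ports.items()}
        for child in self.children:
            if isinstance(child, Particle):
                child.pos = add(child.pos, shift)
            else:
                child.translate(shift)


def make_monomer():
    """A CH2-like unit with 'down' and 'up' attachment sites (made-up geometry)."""
    atoms = [
        Particle("C", (0.0, 0.0, 0.0)),
        Particle("H", (1.0, 0.0, 0.0)),
        Particle("H", (-1.0, 0.0, 0.0)),
    ]
    return Component("CH2", atoms, {"down": (0.0, 0.0, -1.0), "up": (0.0, 0.0, 1.0)})


def build_chain(n):
    """Parameterized chain: the 'down' site of each monomer is matched to the
    'up' site of the previous one (translation-only stand-in for equivalence)."""
    chain = Component("chain")
    previous = None
    for _ in range(n):
        monomer = make_monomer()
        if previous is not None:
            monomer.translate(sub(previous.ports["up"], monomer.ports["down"]))
        chain.children.append(monomer)
        previous = monomer
    return chain


chain = build_chain(3)
print([p.pos for p in chain.particles() if p.name == "C"])
```

Because the chain length is an ordinary function argument, the repetitive structure is expressed once and generated for any `n`, which is the sense in which the abstract speaks of parameterized, generative components.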
As a case study, we generated an ensemble of poly(ethylene glycol) monolayers in which the surface density and patterning of the monomers, as well as their height, are easily tuned through component parameters in the Python interface, allowing rapid generation of input files for parameter sweep studies. We also demonstrate the scalability of mBuild through a more complex example: a system of poly(2-methacryloyloxyethyl phosphorylcholine) (pMPC) brushes attached to a silica substrate via an atom transfer radical polymerization initiator. As in the first example, the structure can be tuned in terms of surface coverage, patterning, and brush height. Such systems, containing from a hundred thousand to several million atoms, can be generated and parameterized in seconds to minutes of computing time on a conventional workstation.
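A parameter sweep of the kind described in the case study might be organized as in the following sketch. It only enumerates attachment sites and chain lengths; it does not build atoms or call mBuild. The function names (`grid_pattern`, `checkerboard`, `monolayer_spec`), the lattice spacing, and the chain lengths are all assumptions chosen for illustration.

```python
from itertools import product


def grid_pattern(nx, ny, spacing):
    """Attachment sites on every point of an nx-by-ny square lattice."""
    return [(i * spacing, j * spacing) for i, j in product(range(nx), range(ny))]


def checkerboard(nx, ny, spacing):
    """Attachment sites on every other lattice point (half coverage)."""
    return [
        (i * spacing, j * spacing)
        for i, j in product(range(nx), range(ny))
        if (i + j) % 2 == 0
    ]


def monolayer_spec(pattern, nx, ny, spacing, chain_length):
    """Describe one monolayer: where chains attach, how long they are,
    and the resulting number of chains per unit surface area."""
    sites = pattern(nx, ny, spacing)
    area = (nx * spacing) * (ny * spacing)
    return {
        "pattern": pattern.__name__,
        "sites": sites,
        "chain_length": chain_length,
        "density": len(sites) / area,
    }


# Sweep over patterning and chain length (values are illustrative only).
sweep = [
    monolayer_spec(pattern, nx=4, ny=4, spacing=0.5, chain_length=n)
    for pattern in (grid_pattern, checkerboard)
    for n in (5, 10, 20)
]

for spec in sweep:
    print(spec["pattern"], spec["chain_length"], spec["density"])
```

Each entry in `sweep` carries the parameters a builder would need to place chains on the surface, so varying coverage, pattern, or height amounts to changing the arguments of a single call.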
[1] F. C. Bernstein, T. F. Koetzle, G. J. Williams, E. F. Meyer, M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi, and M. Tasumi, “The protein data bank: A computer-based archival file for macromolecular structures,” Archives of Biochemistry and Biophysics 185, 584–591 (1978).
[2] W. Humphrey, A. Dalke, and K. Schulten, “VMD: Visual molecular dynamics,” Journal of Molecular Graphics 14, 33–38 (1996).
[3] R. Salomon-Ferrer, D. A. Case, and R. C. Walker, “An overview of the Amber biomolecular simulation package,” Wiley Interdisciplinary Reviews: Computational Molecular Science 3, 198–210 (2013).
[4] M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Vandermeersch, E. Zurek, and G. R. Hutchison, “Avogadro: An advanced semantic chemical editor, visualization, and analysis platform,” Journal of Cheminformatics 4, 17 (2012).
Group/Topical: Computational Molecular Science and Engineering Forum