466166 A Heterogeneous High Performance Computing Implementation of Thin Film Growth Simulation

Tuesday, November 15, 2016: 9:06 AM
Carmel I (Hotel Nikko San Francisco)
Xuelei Zhang and Dong Ni, College of Control Science and Engineering, Zhejiang University, Hangzhou, China

Thin film growth processes are among the most important nanofabrication processes, with many industrial applications such as semiconductor manufacturing, solar cell fabrication, and nanomaterial coating. Of the many methods available for simulating various kinds of thin film growth, the kinetic Monte Carlo (kMC) method has proven very promising because of its computational efficiency, its high-fidelity description of the process, and its flexibility in accommodating complex process mechanisms.

However, because the kMC algorithm is inherently serial, it is challenging to exploit the high performance computing (HPC) power available today, since most HPC capability comes from parallelization. Furthermore, as HPC architectures move from homogeneous (all-CPU computation) to heterogeneous (CPU + GPU, or CPU + MIC, the Intel Many Integrated Core Architecture [1]), implementing the kMC algorithm efficiently on HPC platforms has become even less straightforward.
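The serial dependence described above can be seen in a minimal sketch of a single rejection-free kMC step. This is an illustration only: the function name `kmc_step` and the flat list of event rates are assumptions, not taken from this work. Each step needs the complete, current rate list, and the clock advance depends on the total rate, so the next step cannot begin until the previous one has updated the lattice.

```python
import math
import random


def kmc_step(rates, rng):
    """Select one event with probability proportional to its rate.

    Returns (event_index, dt), where dt is an exponentially distributed
    waiting time with mean 1 / sum(rates).
    """
    total = sum(rates)
    target = rng.random() * total
    cumulative = 0.0
    index = 0
    for index, rate in enumerate(rates):
        cumulative += rate
        if target < cumulative:
            break
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    dt = -math.log(1.0 - rng.random()) / total
    return index, dt
```

In a real simulation the rate list would be rebuilt (or incrementally updated) after every executed event, which is the step that resists straightforward parallelization.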

In this work, we carefully investigate the implementation of the kMC algorithm on a heterogeneous computing platform consisting of two Xeon CPUs and four Xeon Phi many-integrated-core coprocessors. Our parallel algorithm follows the synchronous method proposed by Martínez et al. [2], in which time synchronization is exact but the computation is only semi-rigorous because of boundary conflicts. We simulate a thin film growth process using a solid-on-solid lattice model with surface adsorption, migration, and desorption. We study the performance of the algorithm in terms of speedup factor and boundary error in the different Xeon Phi execution modes (offload, native, and symmetric), as well as the dependence of both on simulation parameters such as lattice size and the activation energies of the microscopic events. Finally, recommendations on implementation strategy are provided, together with methods to minimize boundary conflicts.
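As a rough illustration of the synchronous scheme, the sketch below emulates one synchronized step on a 1+1-dimensional solid-on-solid lattice split into equal strips. It is a sequential emulation under stated assumptions, not the implementation studied here: the rate expressions, the parameter names (`w_ads`, `nu0`, `e_bond`, `e_des`, `temp`), and the numerical values are hypothetical. Every domain draws against the same maximum rate `r_max`, with the shortfall treated as a null event, so all domains share one time increment. Migration hops that land in a neighboring strip are counted as potential boundary conflicts.

```python
import math
import random

K_B = 8.617e-5  # Boltzmann constant in eV/K


def site_rates(h, i, w_ads, nu0, e_bond, e_des, temp):
    """Hypothetical [adsorption, migration, desorption] rates for site i."""
    if h[i] == 0:
        return [w_ads, 0.0, 0.0]  # bare substrate: nothing to move or remove
    n = len(h)
    # Count lateral neighbors at least as high as the top atom of site i.
    bonds = (h[(i - 1) % n] >= h[i]) + (h[(i + 1) % n] >= h[i])
    mig = nu0 * math.exp(-bonds * e_bond / (K_B * temp))
    des = nu0 * math.exp(-(e_des + bonds * e_bond) / (K_B * temp))
    return [w_ads, mig, des]


def sync_step(h, n_domains, rng, **p):
    """One synchronous step; returns (dt, number of cross-boundary hops)."""
    n = len(h)
    width = n // n_domains  # assumes n is divisible by n_domains
    events = []
    for d in range(n_domains):
        ev = []
        for i in range(d * width, (d + 1) * width):
            for kind, rate in enumerate(site_rates(h, i, **p)):
                ev.append((rate, i, kind))
        events.append(ev)
    totals = [sum(rate for rate, _, _ in ev) for ev in events]
    r_max = max(totals)
    boundary_hops = 0
    for d, ev in enumerate(events):
        target = rng.random() * r_max
        if target >= totals[d]:
            continue  # null event: this domain idles during the step
        cumulative = 0.0
        for rate, i, kind in ev:
            cumulative += rate
            if target < cumulative:
                break
        if kind == 0:    # adsorption
            h[i] += 1
        elif kind == 1:  # migration to a random lateral neighbor
            j = (i + rng.choice((-1, 1))) % n
            h[i] -= 1
            h[j] += 1
            if j // width != d:
                boundary_hops += 1
        else:            # desorption
            h[i] -= 1
    dt = -math.log(1.0 - rng.random()) / r_max
    return dt, boundary_hops


# Illustrative parameters only; they are not fitted to any material.
rng = random.Random(1)
heights = [0] * 64
params = dict(w_ads=1.0, nu0=100.0, e_bond=0.1, e_des=0.3, temp=600.0)
t = 0.0
hops = 0
for _ in range(2000):
    dt, crossed = sync_step(heights, 4, rng, **params)
    t += dt
    hops += crossed
```

Because all rates are computed from the lattice state at the start of the step, an event in one strip can act on a neighborhood that another strip has just changed. This is the kind of boundary conflict that makes the scheme semi-rigorous, and tracking `hops` against lattice size or activation energies gives a simple proxy for how often it can occur.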


[1] R. Rahman, Intel Xeon Phi Coprocessor Architecture and Tools, Apress, 2013.

[2] E. Martínez, J. Marian, M.H. Kalos, J.M. Perlado, Synchronous parallel kinetic Monte Carlo for continuum diffusion-reaction systems, Journal of Computational Physics, 2008, 227, pp. 3804–3823.
