442785 Computationally-Accelerated Rule-Based Chemical Reaction Network Generation in Modeling Coke Formation during Biomass Catalytic Pyrolysis for Novel Catalyst Development

Monday, November 9, 2015
Exhibit Hall 1 (Salt Palace Convention Center)
Sam Pradhan (1), Shoucheng Du (1) and George M. Bollas (2); (1) Chemical and Biomolecular Engineering, University of Connecticut, Storrs, CT; (2) Department of Chemical and Biomolecular Engineering, University of Connecticut, Storrs, CT

Phone: (609)-680-2388, email: sam.pradhan@uconn.edu


Catalysts such as ZSM-5 enhance the chemical reaction pathways of biomass pyrolysis by enabling hydrogen transfer reactions on their active sites. The challenge this process faces is the multiplicity of radical olefins and paraffins that react to form stable polyaromatic hydrocarbons known as coke. Experiments have shown that coking often occurs inside the zeolite pores, where the coke occupies the process-enhancing active sites and deactivates the catalyst. To prevent this deactivation, the specific reactions that lead to coke must be studied so that novel, coke-tolerant catalysts can be designed.


Our approach is to use cheminformatics to deduce the pyrolysis reaction intermediates, so that commonalities among compounds that lead to coke may be analyzed. The objective of this project is to generate a complete network of chemical reactions that lead to coke as a final product and that satisfy constraints (e.g. NMR constraints, zeolite pore diameter) determined by prior experimental data. The network must be exhaustive, yet selective enough to satisfy these constraints. For this purpose, the ChemAxon suite of cheminformatics products is used as a set of “plug-ins” to the Java- and Eclipse-based KNIME Analytics Platform, which handles the big-data mining and clustering algorithms required to generate the reaction network. Figure 1 shows an example of the ChemAxon products organized in the KNIME user interface. An iterative algorithm governs the project: a list of chemical reactants is initialized and sent to a parallelized set of virtual chemical reactors, each handling one unique chemical reaction taken from the literature. The products generated are concatenated to the list, and molecules that violate the aforementioned constraints are deleted from it. The remaining product list is then sent back as the reactant list for a new iteration, and the process is repeated for a user-defined number of iterations until a complete reaction tree is generated.

Figure 1.  KNIME workflow structure
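The iterative algorithm described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the KNIME workflow itself: molecules are represented by plain strings, and the names generate_network, grow_chain and fits_in_pore are hypothetical. The early exit when an iteration adds nothing new is an added convenience that the abstract does not describe.

```python
def generate_network(seeds, reactors, constraints, n_iterations):
    """Iteratively expand a pool of molecules (hypothetical sketch).

    seeds        -- initial reactant list
    reactors     -- callables mapping a set of molecules to a set of products
    constraints  -- callables returning True when a molecule is acceptable
    n_iterations -- user-defined maximum number of passes
    """
    pool = set(seeds)
    for _ in range(n_iterations):
        # Every reactor receives an identical copy of the reactant list.
        products = set()
        for reactor in reactors:
            products |= reactor(pool)
        # Concatenate products to the list, then delete constraint violators.
        candidates = pool | products
        filtered = {m for m in candidates if all(ok(m) for ok in constraints)}
        if filtered == pool:
            break  # assumption: stop early once no new species appear
        pool = filtered
    return pool


# Toy stand-ins: a "molecule" is a string of C atoms, nothing more.
def grow_chain(pool):
    return {m + "C" for m in pool}


def fits_in_pore(molecule):
    # Placeholder for a size constraint such as zeolite pore diameter.
    return len(molecule) <= 4


network = generate_network({"C"}, [grow_chain], [fits_in_pore], n_iterations=10)
print(sorted(network, key=len))
```

In this toy run the pool stops growing once the size constraint rejects every new product, which mirrors how the experimental constraints keep the real network selective.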

Data Handling

As a data mining project, the design of the workflow focuses on simplifying the project and saving system resources. The parallelization of the reactions shown in Figure 2 sends identical copies of the reactant list to each of the reactors to generate the maximum number of possible products. Only one copy of each unique molecule is required in the reactant list for all reactors, rather than one per reactor. Additionally, if a reaction is defined to require multiple reactants, the reactor reacts every possible combination of reactants, so that a reaction is not limited by the number of molecules in the list. Multiple reactions may produce an identical product, so a grouping algorithm is inserted to remove redundant products after the reactions complete. In Figure 2 below, the initial reactant list includes toluene and a proton. The list is sent to three different reactions, none of which uses the proton because none is a catalytic reaction that facilitates proton transfer. The products and unreacted species are concatenated into a new product list, which is then filtered by removing duplicates (in this example, the two redundant protons). The product list then returns to the beginning as a list of reactants and the process is repeated.


Figure 2. Workflow fragment showing parallel reactions, product concatenation, and redundant molecule grouping
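A minimal Python sketch of one pass through the workflow fragment in Figure 2 is given here for illustration. It assumes that species are identified by text labels, that each reactor forwards the species it did not react, and that the three toy rules shown (benzylic_ch_homolysis, aryl_methyl_scission, radical_recombination) stand in for the literature reactions; none of these names come from the actual workflow.

```python
from itertools import combinations_with_replacement


def run_reactor(rule, arity, reactants):
    """Apply one rule to every unordered combination of `arity` reactants."""
    products, consumed = [], set()
    for combo in combinations_with_replacement(reactants, arity):
        result = rule(*combo)
        if result:
            products.extend(result)
            consumed.update(combo)
    # Assumption: species that did not react are forwarded unchanged.
    unreacted = [m for m in reactants if m not in consumed]
    return products + unreacted


def concatenate_and_group(outputs):
    """Concatenate reactor outputs and keep one copy of each species."""
    merged = [m for out in outputs for m in out]
    return list(dict.fromkeys(merged))  # order-preserving de-duplication


# Hypothetical toy rules acting on labels rather than real structures.
def benzylic_ch_homolysis(m):
    return ["benzyl•", "H•"] if m == "toluene" else []


def aryl_methyl_scission(m):
    return ["phenyl•", "CH3•"] if m == "toluene" else []


def radical_recombination(a, b):
    both_radicals = a.endswith("•") and b.endswith("•")
    return [f"{a}+{b} adduct"] if both_radicals else []


rules = [(benzylic_ch_homolysis, 1), (aryl_methyl_scission, 1),
         (radical_recombination, 2)]
reactants = ["toluene", "H+"]
outputs = [run_reactor(rule, arity, reactants) for rule, arity in rules]
next_reactants = concatenate_and_group(outputs)
print(next_reactants)
```

With these toy rules the proton passes through all three reactors untouched, so the concatenated list holds three copies of it and the grouping step removes the two redundant ones, as in the example above.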

Reaction Processing

In this study, all reactions are categorized as either thermal or catalytic. Thermal reactions include but are not limited to initiations, radical scissions, propagations, terminations, radical substitutions, and cyclizations, all of which are radical reactions that occur under pyrolysis conditions with or without a catalyst. Catalytic reactions include protonations, rearrangements, cationic scissions, cationic cyclizations, and proton abstractions, which are reactions that involve proton transfers among species. These are handled by the ChemAxon Reactor KNIME node; each instance of the Reactor node contains a user-defined chemical reaction that is applied to the molecule lists that flow through it. The Reactor node is designed for cheminformatics, so instead of looking only for the specific reactants drawn in the reaction, it searches for instances of those reactant patterns within every molecule checked against it. A reaction may be defined as the following:

(R1)  CH3-CH3 → CH3• + CH3•


Reaction R1 splits ethane into two methyl radicals, but R1 is not limited to ethane reactants and methyl products. Any hydrogens shown are “implied,” meaning the program places them so that each carbon appears tetravalent wherever possible. What reaction R1 actually means is that any two singly-bonded carbons in an aliphatic system can split into two monoradical components by scission of the single bond. Figure 3 below shows the list of reactions generated by the Reactor node containing reaction R1 for an input reactant list containing only 1-pentene:

Figure 3. Reaction output by Reactor node when fed 1-pentene with a single bond carbon initiation reaction, R1

Starting from 1-pentene, we see that wherever two singly-bonded carbons were present, the molecule was split into two monoradical components. All possibilities are considered by the Reactor node to satisfy one key function of cheminformatics, that is, to generate a complete set of chemical structures relevant to its application (in this study, the application being coke formation).
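The pattern-matching behaviour of R1 can be imitated with a small pure-Python sketch. It does not use the ChemAxon Reactor; instead it assumes a hypothetical representation in which a molecule is a list of carbon-carbon bonds (i, j, order), and it skips ring bonds, since cleaving one would not yield two separate fragments (a case the abstract does not discuss).

```python
def component(start, edges):
    """Return the set of atoms connected to `start` through `edges`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a, b in edges:
            for u, v in ((a, b), (b, a)):
                if u == node and v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen


def single_cc_scissions(bonds):
    """Enumerate every cleavage of a C-C single bond into two fragments.

    bonds -- list of (i, j, order) tuples between carbon atom indices.
    Returns a list of (fragment_a, fragment_b) pairs of sorted atom indices;
    the radical centres are the two atoms of the broken bond.
    """
    cuts = []
    for i, j, order in bonds:
        if order != 1:
            continue  # R1 only matches singly-bonded carbons
        remaining = [(a, b) for a, b, _ in bonds if (a, b) != (i, j)]
        frag_i = component(i, remaining)
        if j in frag_i:
            continue  # assumption: ring bonds are skipped in this sketch
        cuts.append((sorted(frag_i), sorted(component(j, remaining))))
    return cuts


# 1-pentene, carbons numbered 0..4: C0=C1-C2-C3-C4
pentene = [(0, 1, 2), (1, 2, 1), (2, 3, 1), (3, 4, 1)]
cuts = single_cc_scissions(pentene)
for frag_a, frag_b in cuts:
    print(frag_a, frag_b)
```

For 1-pentene this yields one pair of fragments per carbon-carbon single bond, while the double bond is left intact.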

Present Work

The goal of this project is to develop a robust method for reaction engineering through computer-automated reaction pathway generation, constrained by experimental evidence. The application of this project is the generation of reaction pathways for coke formation in zeolites during biomass pyrolysis. Catalysts such as ZSM-5 enhance biomass pyrolysis by enabling hydrogen transfer reactions on the catalyst active sites. Coke formation in the zeolite pores deactivates the catalyst, since the active sites become occupied by coke. The reaction pathways must therefore be studied so that novel, coke-tolerant catalysts can be designed.

This study began with toluene as the primary reactant in order to model reaction pathways exclusive to hydrocarbons. With only toluene and a proton defined as initial reactants, the goal is to generate a complete reaction tree that includes coke structures further down the tree. The most generalized reactions are defined from the literature so that the workflow can generate the maximum number of feasible reactions. When this is complete, the project will be extended to react oxygenates such as furan. The current workflow version can readily generate numerous bi- and tri-aromatic fused ring structures identified by prior experimental data as coke precursors (e.g. naphthalenes, anthracenes).

To generate larger trees, more processing power is required so that the increasingly large molecule lists can be split into clusters and handled in manageable parallelized chunks rather than all at once. This is necessary because considering all possible pathways requires that all previous intermediates be forwarded through every subsequent iteration of the workflow's iterative execution algorithm, which consumes more system resources with each iteration. As a future solution, we are considering exporting the workflow in a format that can be executed on a compute cluster.
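The idea of handling a large molecule list in parallelized chunks can be sketched as follows. This is a hedged illustration only: chunked, react_in_chunks and append_carbon are hypothetical names, a thread pool stands in for the cluster nodes that would do the real work, and the split is valid as written only for single-reactant reactions, because multi-reactant reactions would also need combinations that span chunks.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[k:k + size] for k in range(0, len(items), size)]


def react_in_chunks(molecules, react_chunk, chunk_size, max_workers=4):
    """Apply `react_chunk` to each chunk in parallel and merge the products.

    Assumption: `react_chunk` models single-reactant reactions only, so each
    chunk can be processed independently of the others.
    """
    chunks = chunked(list(molecules), chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_results = list(pool.map(react_chunk, chunks))
    merged = set()
    for products in partial_results:
        merged.update(products)  # merging also removes duplicate products
    return merged


# Toy stand-in for a reactor: a "molecule" is just a string of C atoms.
def append_carbon(chunk):
    return {m + "C" for m in chunk}


molecules = ["C", "CC", "CCC", "CCCC", "CCCCC"]
products = react_in_chunks(molecules, append_carbon, chunk_size=2)
print(sorted(products, key=len))
```

The chunked result matches what a single pass over the whole list would give, which is the property a cluster deployment would need to preserve.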
