J. M. Caruthers (1), L. Delgass (1), H. Wang (2), S. Dunlop (3), and K. Novstrup (1). (1) School of Chemical Engineering, Purdue University, 480 Stadium Mall Drive, West Lafayette, IN 47907-2100, (2) School of Industrial Engineering, Purdue University, 480 Stadium Mall Drive, West Lafayette, IN 47907-2100, (3) Information Technology at Purdue, Purdue University, 480 Stadium Mall Drive, West Lafayette, IN 47907-2100
The scientific community is currently awash in data, and with the continuing growth of high-throughput experimental methods and high-throughput computations this flood of data is ever increasing. However, data by itself is of only minimal value. What is needed is knowledge derived from the data, with the hope that new insights will eventually follow. In our opinion, for the foreseeable future the human researcher will remain the critical component in this knowledge generation, and an important objective of cyberinfrastructure is to enable the researcher to develop knowledge from data more effectively. The eye is the only human sense with sufficient bandwidth to take in information fast enough to keep pace with this explosion of data. However, the typical presentation of data as tables and graphs is not always the most natural way to represent information in the chemical sciences. For example, the official chemical name, e.g. [rac-(C2H4(1-Ind)2)ZrMe][MeB(C6F5)3], although precise, conveys much less information than a 2D or 3D image of the chemical structure. Spectra, stress-strain curves, molecular orbitals, and many other forms of information are likewise much more naturally represented graphically. We have begun developing a visualization environment for the chemical sciences in which all types of graphical chemical information can be used inside tables and graphs. As one example, molecular structures can be displayed, resized, rotated, etc. inside the cells of a spreadsheet in order to compare differences and similarities across a series of molecular structures. As a second example, when one hovers over a point in a 2D or 3D graph, the molecular structure or some other property of that point appears; alternatively, every point in the graph can itself be rendered as a molecular structure, a spectrum, or a spider graph.
This rich visualization environment enables the researcher to navigate complex, multi-dimensional data more naturally, allowing scientific postulates to be developed more efficiently from these large, complex data sets. We will demonstrate a variety of rich visualization techniques using examples from a current research problem in polymerization catalysis.