Sunday, November 4, 2007
6bi

Developing Predictive Statistical Models to Understand the Dynamics of Inflammatory Cell Signals

Arthur C. Goldsipe1, Christopher W. Espelin2, Peter K. Sorger3, and Douglas A. Lauffenburger1. (1) Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave. 56-353, Cambridge, MA 02139, (2) Pfizer Research Technology Center, Cambridge, MA 02139, (3) Systems Biology, Harvard Medical School, 200 Longwood Ave, WAB 438, Boston, MA 02115

Inflammation involves coordinated communication and response by the vascular and immune systems. When properly regulated, inflammation assists an organism in dealing with injury or pathogenic invasion. Unregulated inflammation, however, is associated with diseases such as rheumatoid arthritis, an autoimmune disorder in which the joints become inflamed. Although the details of the inflammatory signaling cascades are not completely known, many of the key players have been identified. In particular, both c-Jun N-terminal kinases (JNKs) and p38 are mitogen-activated protein kinases (MAPKs) with critical roles in the mammalian response to stress.

To better understand the signaling events surrounding the inflammatory response, high-throughput techniques have been used to measure dynamic phospho-protein levels and cytokine release from U937 cells treated with various inflammatory stimuli. The U937 cell line serves as a model for monocytes and macrophages, both of which are present in inflamed tissue. The cells were treated with lipopolysaccharide (LPS) and either pro-inflammatory cytokines or small-molecule inhibitors.

The experimental data have been modeled by a class of predictive statistical techniques known as partial least-squares regression (PLSR). For such biological data sets, PLSR offers several advantages, including the ability to handle highly collinear data, robustness to missing data and biological noise, and the ability to summarize complex data. Because the underlying experimental data are multidimensional in nature, multi-block and multi-way variations of PLSR were also investigated. Various applications of these PLSR models will be illustrated, such as the prediction of cytokine release, the identification of the most informative experimental measurements, and the elucidation of the potential for crosstalk and autocrine feedback.
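
As a rough illustration of how PLSR copes with collinear predictors, the following is a minimal sketch of single-response PLSR (PLS1) fitted with the NIPALS algorithm in plain Python. It is not the analysis code used in this work: the function names `pls1_fit` and `pls1_predict` are hypothetical, and the toy numbers are invented for illustration and are not experimental data. The second predictor column is an exact multiple of the first, a situation in which ordinary least squares has no unique solution.

```python
from math import sqrt


def pls1_fit(X, y, n_components):
    """Fit single-response PLS regression (PLS1) by NIPALS on lists of floats."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered predictors
    f = [v - y_mean for v in y]                                 # centered response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in predictor space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # no remaining covariance to explain
            break
        w = [v / norm for v in w]
        # Scores: projection of each sample onto the weight vector.
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings for the predictors (p) and the response (q).
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explains before extracting the next.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}


def pls1_predict(model, x):
    """Predict the response for one sample by repeating the deflation steps."""
    e = [xj - mj for xj, mj in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in model["components"]:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Hypothetical toy data: column 2 is exactly twice column 1 (perfect collinearity),
# and the response was constructed as y = 3 * column1 + column3.
X = [[1.0, 2.0, 2.0],
     [2.0, 4.0, 1.0],
     [3.0, 6.0, 4.0],
     [4.0, 8.0, 3.0],
     [5.0, 10.0, 6.0]]
y = [5.0, 7.0, 13.0, 15.0, 21.0]

model = pls1_fit(X, y, n_components=2)
print(pls1_predict(model, [6.0, 12.0, 5.0]))
```

Each component is a latent variable chosen to maximize covariance with the response, so two components suffice here even though there are three measured columns. In practice the number of components would typically be chosen by cross-validation.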

More generally, high-throughput biological data are being generated at an increasing rate. Therefore, it is also important to automate and simplify repetitive and error-prone aspects of data analysis and model development. To this end, we are developing modules for PLSR analysis that will be incorporated into a general data workflow for systems biology.