Significance Analysis of Time-Series High-Throughput Transcriptional Profiling Data: Applied to Arabidopsis thaliana Liquid Cultures Subjected to Environmental Stresses

Bhaskar Dutta (1), Robert W. Snyder (2), and Maria I. Klapa (1). (1) University of Maryland, 2113 Chemical and Nuclear Engineering Building, College Park, MD 20742-2111, (2) Department of Chemical and Biomolecular Engineering, University of Maryland College Park, College Park, MD 20742

Hypothesis testing methods such as the t-test and ANOVA are applied to high-throughput transcriptional profiling data to identify genes that are differentially expressed between two or more experimental conditions. Significance Analysis of Microarrays (SAM) is a hypothesis testing method based on the t-test but tailored to microarray data. However, these methods are not well suited to time-series data, because none of them takes into account the order in which the measurements were obtained, which is an inherent property of time-series data.

We developed algorithms for the systematic analysis of time-series high-throughput -omic data, based on the existing SAM method. The algorithms identify the genes that are differentially expressed between two experimental conditions at each timepoint, so each timepoint can be analyzed separately rather than observing only the overall effect. The sequential information about the timepoints was used to construct metrics and scores for ranking the genes. The significant genes were subjected to Gene Ontology (GO) analysis to reveal how the significance of different GO terms changes with time. All of these algorithms were integrated into software that will be made available to the scientific community shortly.
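As an illustration only, the per-timepoint comparison described above can be sketched with the standard SAM relative-difference statistic, d = (mean difference) / (s + s0), computed independently at each timepoint. This is not the authors' software: the function names, the array layout (genes x timepoints x replicates), the user-supplied fudge factor `s0`, and the fixed cutoff are all assumptions made for the sketch. SAM itself selects `s0` from the data and sets the cutoff from a permutation-based false discovery rate estimate, both of which are omitted here.

```python
import numpy as np


def sam_d_statistic(control, treated, s0=0.0):
    """SAM-style relative difference for each gene at a single timepoint.

    control, treated: arrays of shape (genes, replicates).
    s0: small positive constant added to the denominator (assumed to be
        supplied by the caller in this sketch). If s0 is 0 and a gene has
        zero variance in both groups, the result is inf or nan.
    """
    n1, n2 = control.shape[1], treated.shape[1]
    diff = treated.mean(axis=1) - control.mean(axis=1)
    # Pooled within-group sum of squared deviations for each gene.
    ss = ((control - control.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    ss += ((treated - treated.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    # Gene-specific standard error of the mean difference.
    s = np.sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))
    return diff / (s + s0)


def per_timepoint_scores(control, treated, s0=0.0):
    """Apply the statistic separately at every timepoint.

    control, treated: arrays of shape (genes, timepoints, replicates).
    Returns an array of shape (genes, timepoints).
    """
    n_time = control.shape[1]
    columns = [
        sam_d_statistic(control[:, t, :], treated[:, t, :], s0)
        for t in range(n_time)
    ]
    return np.stack(columns, axis=1)


def significant_timepoints(scores, threshold):
    """Boolean mask of gene/timepoint pairs whose |d| reaches the cutoff.

    A fixed threshold is a simplification assumed for this sketch.
    """
    return np.abs(scores) >= threshold
```

Given two such arrays, `significant_timepoints(per_timepoint_scores(control, treated, s0), threshold)` yields a genes x timepoints mask. Reading each row in time order gives one possible starting point for sequence-aware ranking metrics such as the number or the first occurrence of significant timepoints, although the abstract does not specify which metrics were actually used.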

Using full-genome DNA microarrays, we applied the algorithms to the transcriptional response of A. thaliana liquid cultures subjected to environmental perturbations applied individually or in combination. The results were analyzed in the context of A. thaliana physiology. The wealth of information that these algorithms provide makes them a valuable tool for time-series high-throughput -omic analysis.