481919 Regression Strategies for Large Data Sets (poster version)
The quantity of data that must be analyzed to draw the desired inferences is invariably large, consisting of hundreds or thousands of quantities sampled at many millions of points in time. It is seldom possible to hold data sets of this size in a computer's main memory.
A number of frameworks for partitioning large data sets have been developed and popularized, though adoption by industrial companies has been notably slower than by IT-centric enterprises. This paper endeavors to demystify the processing of data that do not fit in memory for an engineering audience.
Regression, which is fundamental in data analysis, is used to motivate the techniques. The paper begins with the computation of simple statistics. It then discusses strategies for matrix manipulations relevant to both direct and iterative methods. Finally, it presents an illustrative regression on a large data set.
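As a rough illustration of the partitioning idea, the sketch below fits a linear least-squares model by accumulating the normal-equation terms X^T X and X^T y one chunk at a time, so that only a single chunk is in memory at once. It is a minimal sketch under stated assumptions, not the method presented in the poster: the function names `chunked_least_squares` and `synthetic_chunks`, the synthetic data, and the choice of the normal equations as the direct method are all illustrative.

```python
import numpy as np


def chunked_least_squares(chunks, n_features):
    """Fit y ~ X @ beta by accumulating the normal equations chunk by chunk.

    `chunks` is any iterable yielding (X, y) pairs, where X has shape
    (rows_in_chunk, n_features). Only one chunk is held in memory at a time.
    """
    xtx = np.zeros((n_features, n_features))  # running sum of X^T X
    xty = np.zeros(n_features)                # running sum of X^T y
    n_rows = 0
    for X, y in chunks:
        xtx += X.T @ X
        xty += X.T @ y
        n_rows += X.shape[0]
    # Solve the small (n_features x n_features) system once at the end.
    beta = np.linalg.solve(xtx, xty)
    return beta, n_rows


def synthetic_chunks(n_chunks, chunk_size, beta_true, seed=0):
    """Hypothetical stand-in for reading successive blocks from disk."""
    rng = np.random.default_rng(seed)
    for _ in range(n_chunks):
        X = rng.normal(size=(chunk_size, beta_true.size))
        y = X @ beta_true + 0.01 * rng.normal(size=chunk_size)
        yield X, y


beta_true = np.array([2.0, -1.0, 0.5])
beta_hat, n_rows = chunked_least_squares(
    synthetic_chunks(n_chunks=20, chunk_size=1000, beta_true=beta_true),
    n_features=3,
)
print(n_rows, beta_hat)
```

The same running-sum pattern yields simple statistics such as column sums and means. One caveat of this design is that forming X^T X squares the condition number of the problem, so for ill-conditioned data a blockwise QR factorization or an iterative solver may be the safer choice.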
See more of this Group/Topical: Spring Meeting Poster Session and Networking Reception