430076 Learning Models of Unspecified Functional Form through Symbolic Regression

Tuesday, November 10, 2015: 2:00 PM
Salon F (Salt Lake Marriott Downtown at City Creek)
Alison Cozad (1), Zachary Wilson (1) and Nick Sahinidis (2); (1) Carnegie Mellon University, Pittsburgh, PA, (2) Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA

We address the problem of learning simple algebraic models from data obtained from simulations or experiments. Standard regression techniques seek to develop models with a pre-determined model structure or set of alternative model structures. However, in a practical setting, data is often acquired from a number of sources without a clear understanding of the system at hand. Algebraic models could make the system more amenable to optimization, prediction, and control; however, without insight into which functional forms to use, it is difficult to set up a regression at all. Symbolic regression addresses this problem by learning an algebraic model of unspecified functional form from exogenous data [1].
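To illustrate why a pre-determined structure can be limiting, the following is a minimal sketch in Python with NumPy. The data, the helper name `fit_fixed_structure`, and both candidate structures are assumptions made for illustration; none of them come from the abstract. The sketch fits the same data with a structure guessed without insight and with one that happens to match the underlying system.

```python
import numpy as np

def fit_fixed_structure(basis, x, y):
    """Least-squares fit of y ~ sum_j c_j * basis_j(x) for a structure chosen in advance."""
    A = np.column_stack([f(x) for f in basis])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = float(np.sum((A @ coef - y) ** 2))
    return coef, sse

# Hypothetical noise-free data from y = 2 * exp(x), standing in for simulation output.
x = np.linspace(0.0, 2.0, 21)
y = 2.0 * np.exp(x)

# Structure guessed without insight into the system: y ~ c0 + c1 * x
linear_basis = [lambda t: np.ones_like(t), lambda t: t]
# Structure that happens to match the system: y ~ c0 * exp(x)
exp_basis = [lambda t: np.exp(t)]

_, sse_linear = fit_fixed_structure(linear_basis, x, y)
coef_exp, sse_exp = fit_fixed_structure(exp_basis, x, y)

print(f"SSE with linear structure:      {sse_linear:.3g}")
print(f"SSE with exponential structure: {sse_exp:.3g}")
```

Standard regression only tunes the coefficients of whichever structure is supplied; choosing the structure itself is the part that symbolic regression takes over.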

Symbolic regression is traditionally approached with genetic programming and other heuristic algorithms. These stochastic approaches to model identification offer no guarantee of either local or global optimality, and often perform poorly in practice [2]. We show that symbolic regression can be formulated as a nonlinear, nonconvex disjunctive program and can be solved to global optimality. Our approach includes steps to avoid redundant solutions and to ensure that the optimal solution can be found efficiently using a branch-and-bound framework. We present extensive computational results comparing our symbolic regression approach to approaches in the existing literature.
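The idea of a deterministic search over model structures can be sketched on a toy scale. The Python example below is not the disjunctive-programming formulation or the branch-and-bound method described above; it swaps in plain exhaustive enumeration of small expression trees, which is optimal only over the finite set it enumerates. The operator set, the depth limit, the single fitted scale factor `c`, the test data, and all function names are assumptions for illustration. Mirrored operands of commutative operators are skipped as one simple way of avoiding redundant candidates.

```python
import itertools
import numpy as np

UNARY = {"exp": np.exp, "sq": np.square}
BINARY = {"+": np.add, "*": np.multiply, "/": np.divide}
COMMUTATIVE = {"+", "*"}

def enumerate_trees(n_vars, depth):
    """Return expression trees (nested tuples) up to a given depth, without duplicates."""
    if depth == 0:
        return [("x", i) for i in range(n_vars)]
    smaller = enumerate_trees(n_vars, depth - 1)
    trees = list(smaller)
    for op in UNARY:
        trees += [(op, t) for t in smaller]
    for op in BINARY:
        for a, b in itertools.product(smaller, repeat=2):
            # a + b and b + a are the same model: keep one canonical ordering.
            if op in COMMUTATIVE and repr(a) > repr(b):
                continue
            trees.append((op, a, b))
    return list(dict.fromkeys(trees))  # drop exact repeats, keep order

def evaluate(tree, X):
    """Evaluate an expression tree on the rows of X."""
    if tree[0] == "x":
        return X[:, tree[1]]
    if tree[0] in UNARY:
        return UNARY[tree[0]](evaluate(tree[1], X))
    return BINARY[tree[0]](evaluate(tree[1], X), evaluate(tree[2], X))

def size(tree):
    """Number of nodes in the tree, used to prefer simpler models on ties."""
    return 1 if tree[0] == "x" else 1 + sum(size(t) for t in tree[1:])

def best_model(X, y, depth=2):
    """Exhaustively search y ~ c * g(x) over all enumerated trees g."""
    best = None
    with np.errstate(all="ignore"):
        for tree in enumerate_trees(X.shape[1], depth):
            g = evaluate(tree, X)
            if not np.all(np.isfinite(g)) or not np.any(g):
                continue
            c = float(g @ y) / float(g @ g)  # closed-form least-squares scale
            sse = float(np.sum((c * g - y) ** 2))
            if not np.isfinite(sse):
                continue
            key = (round(sse, 9), size(tree))
            if best is None or key < best[0]:
                best = (key, tree, c)
    return best[1], best[2], best[0][0]

# Hypothetical data generated from y = 3 * x0 * exp(x1).
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 3.0, size=(30, 2))
y = 3.0 * X[:, 0] * np.exp(X[:, 1])

tree, c, sse = best_model(X, y, depth=2)
print(tree, c, sse)
```

Enumeration grows very quickly with tree depth and operator count, which is one reason a formulation that can prune the search space, such as branch-and-bound, becomes attractive for anything beyond toy problems.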


[1]    J.R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1992.

[2]    M.F. Korns. Accuracy in Symbolic Regression. In Genetic Programming Theory and Practice IX, pages 129-151. Springer, 2011.

See more of this Session: Data Analysis and Big Data in Chemical Engineering
See more of this Group/Topical: Computing and Systems Technology Division