The disease transmission parameters in classic models are typically estimated by fitting one-step-ahead predictions. Models estimated in this fashion are useful mainly for predicting short-term dynamics or for roughly characterizing the magnitude of disease transmission. Their lack of long-term predictive capability makes them difficult to use in public health decisions. In this work, we develop a nonlinear programming approach for estimating long-term dynamics that allows simultaneous estimation of the susceptible dynamics and the seasonality of transmission in the Time-Series SIR model. We illustrate the effectiveness of this approach on simulated data and on real measles data from Thailand.
A classic model for describing infectious disease dynamics is the SIR (Susceptible-Infected-Recovered) compartment model[1,2], in which the population is divided into these three distinct groups. In the most basic version of the model, individuals enter the susceptible compartment shortly following birth and, upon exposure to an infected individual, have a probability of becoming infected themselves. A variety of assumptions exist for modeling the infectious period. In every case, after some time the individual overcomes the disease and enters the recovered state. From here, immunity may be lifelong or temporary; in the latter case, the individual may later return to the susceptible compartment. Typically, this conceptual idea is modeled as a system of differential equations relating accumulation in, and flow between, the compartments. Arbitrary complexities can be introduced into this simple model through additional expressions and compartments (e.g. maternal immunity, death from disease, age-structured models).
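As a minimal sketch of the compartment idea, the following integrates the textbook SIR equations with forward Euler steps. It assumes a closed population (no births, deaths, or waning immunity) and frequency-dependent mixing. The function name `simulate_sir` and the parameter values are illustrative assumptions, not quantities from this work.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the basic SIR equations with forward Euler steps.

    dS/dt = -beta*S*I/N
    dI/dt =  beta*S*I/N - gamma*I
    dR/dt =  gamma*I

    Assumption: closed population, so births, deaths, and loss of
    immunity are deliberately left out of this sketch.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt   # flow S -> I during this step
        new_recoveries = gamma * i * dt          # flow I -> R during this step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory


# Illustrative (hypothetical) rates: transmission 0.5/day, recovery 0.1/day.
trajectory = simulate_sir(beta=0.5, gamma=0.1, s0=990, i0=10, r0=0, days=100)
peak_time, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
```

Each step moves individuals between compartments without creating or removing any, so the three compartments always sum to the initial population. Additional compartments would add further flow terms of the same form.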
Estimating the parameters of these models is complicated. Often, the only observed variable is the reported number of infections, which is related to the true number of infections through an unknown (and possibly time-varying) reporting factor. Furthermore, the unobserved susceptible dynamics have been shown to be the dominant driving force behind the infection dynamics. Finkenstadt & Grenfell[3] developed a Time-Series SIR (TSIR) model to deal with some of these difficulties. This model is essentially a discrete-time SIR model written in terms of generations of the disease, and it includes a seasonal transmission parameter and a constant mixing exponent on the number of infectives. Considering measles data for England & Wales, they presented a two-phase approach in which the susceptible dynamics and reporting factor are estimated first using a susceptible reconstruction procedure. This procedure relies on stationarity of the susceptible dynamics and on the eventual infection of all new susceptibles. Following susceptible reconstruction, the disease parameters are estimated by least squares using a one-step-ahead fitting procedure on the log-transformed model.
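The first phase of the two-phase approach can be sketched as follows, under simplifying assumptions that are ours rather than those of [3]: a deterministic TSIR recursion, an exponent of one on the susceptibles, and a constant reporting factor estimated by a single global regression of cumulative births on cumulative reported cases. All names and parameter values (`simulate_tsir`, `reconstruct_susceptibles`, `MEAN_BETA`, and so on) are hypothetical.

```python
import math


def simulate_tsir(log_beta, alpha, births, s0, i0):
    """Deterministic TSIR recursion, one step per disease generation.

    I[t+1] = beta[t mod P] * I[t]**alpha * S[t]
    S[t+1] = S[t] + births[t] - I[t+1]
    """
    period = len(log_beta)
    susceptibles, infected = [float(s0)], [float(i0)]
    for t, b in enumerate(births):
        beta = math.exp(log_beta[t % period])
        i_next = beta * infected[-1] ** alpha * susceptibles[-1]
        susceptibles.append(susceptibles[-1] + b - i_next)
        infected.append(i_next)
    return susceptibles, infected


def reconstruct_susceptibles(births, reported_next):
    """Regress cumulative births on cumulative reported cases.

    If every newborn is eventually infected, then
        cum_births = intercept + rho * cum_cases + Z,
    where rho is the (assumed constant) reporting factor and the residual
    Z tracks the deviation of the susceptibles from their mean level.
    """
    cum_births, cum_cases = [0.0], [0.0]
    for b, c in zip(births, reported_next):
        cum_births.append(cum_births[-1] + b)
        cum_cases.append(cum_cases[-1] + c)
    n = len(cum_births)
    mean_x = sum(cum_cases) / n
    mean_y = sum(cum_births) / n
    sxx = sum((x - mean_x) ** 2 for x in cum_cases)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(cum_cases, cum_births))
    rho = sxy / sxx
    intercept = mean_y - rho * mean_x
    deviations = [y - intercept - rho * x
                  for x, y in zip(cum_cases, cum_births)]
    return rho, deviations


# Hypothetical setup: 26 generations per year, 20 years, constant births.
PERIOD, MEAN_BETA, REPORT_FRACTION = 26, 4e-5, 0.5
log_beta = [math.log(MEAN_BETA * (1 + 0.1 * math.cos(2 * math.pi * k / PERIOD)))
            for k in range(PERIOD)]
births = [300.0] * (PERIOD * 20)
susceptibles, infected = simulate_tsir(log_beta, alpha=0.97, births=births,
                                       s0=30000, i0=300)
reported = [REPORT_FRACTION * i for i in infected]   # under-reported cases
rho_hat, deviations = reconstruct_susceptibles(births, reported[1:])
```

With half of all cases reported, the fitted slope `rho_hat` should be close to 2. The regression is only meaningful when the susceptibles fluctuate around a fixed level, which is the stationarity requirement mentioned above.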
Here we develop an alternative approach that uses nonlinear programming (NLP) to simultaneously estimate the susceptible dynamics and reporting factor, along with the seasonal disease transmission parameters. This approach does not rely on stationarity of the susceptibles. We first demonstrate, using simulated data, that the NLP formulation yields results comparable to those of the two-step procedure when the susceptible dynamics are indeed stationary. However, the NLP approach is also able to reproduce the susceptible dynamics and seasonal transmission parameters when the simulated data include non-stationary susceptible dynamics, whereas the two-step approach fails to estimate these quantities accurately.
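The idea of simultaneous estimation can be illustrated with a toy example. This is not the NLP formulation of this work: a coarse grid search stands in for an NLP solver, the mixing exponent is held fixed at an assumed known value, the reporting factor is constant, and the data are noise-free simulations with declining births so that the susceptibles are non-stationary. The name `profile_objective` and all numerical values are hypothetical.

```python
import math

PERIOD, ALPHA, TRUE_S0, TRUE_RHO = 26, 0.97, 30000.0, 2.0
true_log_beta = [math.log(4e-5 * (1 + 0.1 * math.cos(2 * math.pi * k / PERIOD)))
                 for k in range(PERIOD)]
# Declining births make the susceptible dynamics non-stationary.
births = [360.0 - 0.25 * t for t in range(PERIOD * 20)]

# Synthetic "truth" from the TSIR recursion, then under-reported case counts.
s, infected = TRUE_S0, [350.0]
for t, b in enumerate(births):
    i_next = math.exp(true_log_beta[t % PERIOD]) * infected[-1] ** ALPHA * s
    s += b - i_next
    infected.append(i_next)
reported = [i / TRUE_RHO for i in infected]


def profile_objective(s0, rho, alpha, births, reported, period):
    """Sum of squared log-scale one-step-ahead residuals for a candidate
    initial susceptible level s0 and reporting factor rho.

    The susceptibles are rebuilt from the balance
        S[t+1] = S[t] + births[t] - rho * reported[t+1],
    and each seasonal log(beta) is set to its least-squares optimum,
    which is the within-season mean of
        log I[t+1] - alpha * log I[t] - log S[t].
    Returns (objective, log_beta); infeasible candidates give (inf, None).
    """
    s = s0
    by_season = [[] for _ in range(period)]
    for t, b in enumerate(births):
        if s <= 0:                      # candidate exhausts the susceptibles
            return math.inf, None
        i_now, i_next = rho * reported[t], rho * reported[t + 1]
        by_season[t % period].append(
            math.log(i_next) - alpha * math.log(i_now) - math.log(s))
        s += b - i_next
    log_beta = [sum(v) / len(v) for v in by_season]
    sse = sum((y - m) ** 2 for v, m in zip(by_season, log_beta) for y in v)
    return sse, log_beta


# Coarse grid over the two unknowns; a real NLP solver would search continuously.
candidates = [(profile_objective(s0, rho, ALPHA, births, reported, PERIOD)[0], s0, rho)
              for s0 in (20000.0, 25000.0, 30000.0, 35000.0, 40000.0)
              for rho in (1.5, 2.0, 2.5, 3.0)]
best_sse, best_s0, best_rho = min(candidates)
_, best_log_beta = profile_objective(best_s0, best_rho, ALPHA,
                                     births, reported, PERIOD)
```

Because the initial susceptible level, the reporting factor, and the seasonal transmission parameters are all scored against the same residuals, no separate reconstruction step is needed and no stationarity assumption enters the objective.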
We show the effectiveness of the simultaneous NLP approach on the existing England & Wales data, as well as on recently obtained measles data from Thailand. We will discuss the seasonality of transmission in the Thai data, as well as differences observed across various provinces. The NLP framework is general in nature, and we will outline, as future work, the extension of this formulation to continuous-time models and the use of NLP decomposition approaches for estimation over large spatially coupled domains[4].
[1] Anderson, R.M. and May, R.M., Infectious Diseases of Humans: Dynamics and Control, Oxford University Press, Oxford, UK.
[2] Hethcote, H.W., "The Mathematics of Infectious Diseases", SIAM Review, Vol. 42, No. 4 (Dec., 2000), pp. 599-653.
[3] Finkenstadt, B.F. and Grenfell, B.T., "Time series modelling of childhood diseases: a dynamical systems approach", Journal of the Royal Statistical Society, Series C, 49:187-205.
[4] Laird, C.D. and Biegler, L.T., "Large-Scale Nonlinear Programming for Multi-scenario Optimization", accepted for publication in proceedings of the International Conference on High Performance Scientific Computing, Hanoi, Vietnam, 2006.