• ### A History Matching Approach for Calibrating Hydrological Models(1709.02907)

March 25, 2019 stat.CO, stat.AP
Calibration of hydrological time-series models is a challenging task since these models give a wide spectrum of output series and calibration procedures require a significant amount of time. From a statistical standpoint, this model parameter estimation problem simplifies to finding an inverse solution of a computer model that generates pre-specified time-series output (i.e., realistic output series). In this paper, we propose a modified history matching approach for calibrating time-series rainfall-runoff models with respect to real data collected from the state of Georgia, USA. We present the methodology and illustrate the application of the algorithm by carrying out a simulation study and two case studies. Several goodness-of-fit statistics were calculated to assess the model performance. The results showed that the proposed history matching algorithm led to a significant improvement, of 30% and 14% (in terms of root mean squared error) and 26% and 118% (in terms of peak percent threshold statistics), for the two case studies with Matlab-Simulink and SWAT models, respectively.
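As a rough illustration of the general idea behind history matching (not the modified algorithm proposed in the paper), the sketch below rules out parameter settings whose simulated series lie too far from the observed series. The simulator `toy_simulator`, the implausibility definition (maximum standardized distance over time points), the cutoff of 3, and all numerical values are assumptions made for this example only.

```python
import numpy as np

def toy_simulator(theta, t):
    """Hypothetical stand-in for a rainfall-runoff model: an exponential recession curve."""
    return theta[0] * np.exp(-theta[1] * t)

def implausibility(sim, obs, obs_var, disc_var=0.0):
    """Maximum standardized distance between simulated and observed series."""
    return float(np.max(np.abs(sim - obs) / np.sqrt(obs_var + disc_var)))

def history_match(candidates, t, obs, obs_var, cutoff=3.0):
    """Return the candidate parameter vectors that are not ruled out by the cutoff."""
    scores = np.array([implausibility(toy_simulator(th, t), obs, obs_var)
                       for th in candidates])
    return candidates[scores <= cutoff]

# Illustrative run with synthetic "observations" (all values are assumed).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 20)
obs_sd = 0.05
obs = toy_simulator(np.array([2.0, 0.7]), t) + rng.normal(0.0, obs_sd, size=t.size)
candidates = rng.uniform([0.5, 0.1], [4.0, 2.0], size=(2000, 2))
not_ruled_out = history_match(candidates, t, obs, obs_sd ** 2)
print(f"{len(not_ruled_out)} of {len(candidates)} candidates remain plausible")
```

In practice the simulator would typically be replaced by a cheap statistical emulator, with the emulator variance added to the denominator, and the procedure would be repeated in waves over the shrinking plausible region.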
• ### Robust Hierarchical Bayes Small Area Estimation for Nested Error Regression Model(1702.05832)

Oct. 24, 2018 stat.ME, stat.AP
National statistical institutes in many countries are now mandated to produce reliable statistics for important variables such as population, income, unemployment, health outcomes, etc. for small areas, defined by geography and/or demography. Due to small samples from these areas, direct sample-based estimates are often unreliable. Model-based small area estimation is now extensively used to generate reliable statistics by "borrowing strength" from other areas and related variables through suitable models. Outliers adversely influence standard model-based small area estimates. To deal with outliers, Sinha and Rao (2009) proposed a robust frequentist approach. In this article, we present a robust Bayesian alternative to the nested error regression model for unit-level data to mitigate outliers. We consider a two-component scale mixture of normal distributions for the unit-level error to model outliers and present a computational approach to produce Bayesian predictors of small area means under a noninformative prior for model parameters. A real example and extensive simulations convincingly show the robustness of our Bayesian predictors to outliers. A simulation comparison of these two procedures with Bayesian predictors by Datta and Ghosh (1991) and M-quantile estimators by Chambers et al. (2014) shows that our proposed procedure is better than the others in terms of bias, variability, and coverage probability of prediction intervals when there are outliers. The superior frequentist performance of our procedure shows its dual (Bayes and frequentist) dominance, and makes it attractive to all practitioners, both Bayesian and frequentist, of small area estimation.
• ### D-optimal Designs with Ordered Categorical Data(1502.05990)

Sept. 10, 2016 math.ST, stat.TH
Cumulative link models have been widely used for ordered categorical responses. Uniform allocation of experimental units is commonly used in practice, but often suffers from a lack of efficiency. We consider D-optimal designs with ordered categorical responses and cumulative link models. For a predetermined set of design points, we derive the necessary and sufficient conditions for an allocation to be locally D-optimal and develop efficient algorithms for obtaining approximate and exact designs. We prove that the number of support points in a minimally supported design only depends on the number of predictors, which can be much less than the number of parameters in the model. We show that a D-optimal minimally supported allocation in this case is usually not uniform on its support points. In addition, we provide EW D-optimal designs as a highly efficient surrogate to Bayesian D-optimal designs. Both of them can be much more robust than uniform designs.
• ### Using particle swarm optimization to search for locally $D$-optimal designs for mixed factor experiments with binary response(1602.02187)

Feb. 5, 2016 stat.AP
Identifying optimal designs for generalized linear models with a binary response can be a challenging task, especially when there are both continuous and discrete independent factors in the model. Theoretical results rarely exist for such models, and the handful that do exist come with restrictive assumptions. This paper investigates the use of particle swarm optimization (PSO) to search for locally $D$-optimal designs for generalized linear models with discrete and continuous factors and a binary outcome and demonstrates that PSO can be an effective method. We provide two real applications using PSO to identify designs for experiments with mixed factors: one to redesign an odor removal study and the second to find an optimal design for an electrostatic discharge study. In both cases we show that the $D$-efficiencies of the designs found by PSO are much better than the implemented designs. In addition, we show PSO can efficiently find $D$-optimal designs on a prototype or an irregularly shaped design space, provide insights on the existence of minimally supported optimal designs, and evaluate sensitivity of the $D$-optimal design to mis-specifications in the link function.
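To make the idea concrete, here is a minimal sketch of a textbook global-best PSO applied to a locally $D$-optimal design search with one continuous and one two-level factor under a logistic model. It is not the authors' implementation: the particle encoding, the number of support points `m`, the parameter vector `beta`, and the PSO constants are all assumptions chosen for illustration.

```python
import numpy as np

def pso(objective, lower, upper, n_particles=30, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO that minimizes `objective` over a box."""
    rng = np.random.default_rng(seed)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)  # keep particles inside the box
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

def neg_log_det(particle, beta, m=4):
    """Negative log-determinant of the logistic information matrix of an m-point design.

    Assumed encoding: m continuous levels x, m raw values thresholded to a
    two-level factor z, and m raw weights that are normalized to sum to one.
    """
    x, z_raw, w_raw = particle[:m], particle[m:2 * m], particle[2 * m:]
    z = np.where(z_raw >= 0.0, 1.0, -1.0)
    weights = w_raw / w_raw.sum()
    F = np.column_stack([np.ones(m), x, z])
    pi = 1.0 / (1.0 + np.exp(-F @ beta))
    nu = pi * (1.0 - pi)
    info = F.T @ (F * (weights * nu)[:, None])
    sign, logdet = np.linalg.slogdet(info)
    return -logdet if sign > 0 else 1e10  # penalize singular designs

m = 4
beta = np.array([0.5, 1.0, -0.5])  # assumed local parameter values
lower = np.concatenate([-np.ones(m), -np.ones(m), 0.01 * np.ones(m)])
upper = np.ones(3 * m)
best, best_val = pso(lambda p: neg_log_det(p, beta, m), lower, upper)
print("best log-determinant found:", -best_val)
```

Thresholding a continuous coordinate is only one simple way to handle the discrete factor. PSO is a stochastic heuristic, so the returned design should be checked, for example against an equivalence-theorem condition or by rerunning with several seeds.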
• ### A two-component normal mixture alternative to the Fay-Herriot model(1510.04482)

Oct. 22, 2015 stat.ME
This article considers a robust hierarchical Bayesian approach to deal with random effects of small area means when some of these effects assume extreme values, resulting in outliers. In the presence of outliers, the standard Fay-Herriot model, used for modeling area-level data, under normality assumptions of the random effects may overestimate the random effects variance, and thus provides less than ideal shrinkage towards the synthetic regression predictions and inhibits borrowing information. Even a small number of substantive outliers of random effects result in a large estimate of the random effects variance in the Fay-Herriot model, thereby achieving little shrinkage to the synthetic part of the model or little reduction in posterior variance associated with the regular Bayes estimator for any of the small areas. While a scale mixture of normal distributions with known mixing distribution for the random effects has been found to be effective in the presence of outliers, the solution depends on the mixing distribution. As a possible alternative solution to the problem, a two-component normal mixture model has been proposed based on noninformative priors on the model variance parameters, regression coefficients and the mixing probability. Data analysis and simulation studies based on real, simulated and synthetic data show the advantage of the proposed method over the standard Bayesian Fay-Herriot solution derived under normality of random effects.
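The shrinkage effect described above can be illustrated with the standard Fay-Herriot Bayes predictor for known variance components. The sketch below is not the two-component mixture model of the paper; the symbols `A` (random effects variance) and `D` (sampling variance), and all numbers, are assumptions for illustration.

```python
import numpy as np

def fay_herriot_predictor(y, synthetic, D, A):
    """Bayes predictor of small area means under the normal Fay-Herriot model.

    y         : direct estimates
    synthetic : regression (synthetic) predictions x_i' beta
    D         : sampling variances (assumed known)
    A         : random effects variance (treated as known here)
    """
    B = D / (A + D)                        # shrinkage factor toward the synthetic part
    post_mean = (1.0 - B) * y + B * synthetic
    post_var = (1.0 - B) * D               # equals A * D / (A + D)
    return post_mean, post_var, B

y = np.array([10.0, 12.0, 30.0])           # the third area is an outlier
synthetic = np.array([11.0, 11.0, 11.0])
D = 4.0

for A in (1.0, 100.0):                     # a moderate versus an inflated variance estimate
    mean, var, B = fay_herriot_predictor(y, synthetic, D, A)
    print(f"A={A}: shrinkage B={B:.3f}, predictors={np.round(mean, 2)}, posterior variance={var:.2f}")
```

When `A` is inflated, `B` is close to zero for every area, so the predictors stay near the direct estimates and the posterior variance stays near `D`, which is the loss of borrowing strength the abstract refers to.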
• ### Optimal Designs for 2^k Factorial Experiments with Binary Response(1109.5320)

Jan. 27, 2015 math.ST, stat.TH
We consider the problem of obtaining D-optimal designs for factorial experiments with a binary response and $k$ qualitative factors each at two levels. We obtain a characterization for a design to be locally D-optimal. Based on this characterization, we develop efficient numerical techniques to search for locally D-optimal designs. Using prior distributions on the parameters, we investigate EW D-optimal designs, which are designs that maximize the determinant of the expected information matrix. It turns out that these designs can be obtained very easily using our algorithm for locally D-optimal designs and are very good surrogates for Bayes D-optimal designs. We also investigate the properties of fractional factorial designs and study the robustness with respect to the assumed parameter values of locally D-optimal designs.
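A small numerical sketch may help fix the two criteria. Under a logit link with a main-effects model (both assumptions for this example), the information matrix of an allocation $p$ is $X^\top \mathrm{diag}(p_i \nu_i) X$ with $\nu_i = \pi_i(1-\pi_i)$; the locally D-optimal criterion evaluates the determinant at assumed parameter values, while the EW criterion replaces each $\nu_i$ by its prior expectation. The parameter values and the prior below are hypothetical.

```python
import itertools
import numpy as np

def design_matrix(k):
    """Main-effects model matrix of a 2^k full factorial: intercept plus k columns of +/-1."""
    levels = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    return np.hstack([np.ones((2 ** k, 1)), levels])

def logit_weights(X, beta):
    """nu_i = pi_i * (1 - pi_i) under the logit link."""
    pi = 1.0 / (1.0 + np.exp(-X @ beta))
    return pi * (1.0 - pi)

def d_criterion(p, X, nu):
    """det(X' diag(p * nu) X) for allocation p and weights nu."""
    return float(np.linalg.det(X.T @ (X * (p * nu)[:, None])))

def ew_weights(X, beta_draws):
    """Monte Carlo estimate of E[nu_i] over prior draws of beta."""
    return np.mean([logit_weights(X, b) for b in beta_draws], axis=0)

X = design_matrix(2)
uniform = np.full(X.shape[0], 1.0 / X.shape[0])

beta_assumed = np.array([1.0, 0.5, -0.5])                     # hypothetical local values
rng = np.random.default_rng(1)
beta_draws = rng.uniform([0.0, 0.0, -1.0], [2.0, 1.0, 0.0], size=(5000, 3))  # hypothetical prior

print("local criterion, uniform design:", d_criterion(uniform, X, logit_weights(X, beta_assumed)))
print("EW criterion, uniform design:   ", d_criterion(uniform, X, ew_weights(X, beta_draws)))
```

Because the EW criterion has the same form as the local criterion with $\nu_i$ replaced by $E[\nu_i]$, any routine that maximizes `d_criterion` over `p` can be reused unchanged, which is consistent with the abstract's remark that EW designs are easily obtained from the algorithm for locally D-optimal designs.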
• ### Maximin and maximin-efficient event-related fMRI designs under a nonlinear model(1401.1631)

Jan. 8, 2014 stat.AP
Previous studies on event-related functional magnetic resonance imaging experimental designs are primarily based on linear models, in which a known shape of the hemodynamic response function (HRF) is assumed. However, the HRF shape is usually uncertain at the design stage. To address this issue, we consider a nonlinear model to accommodate a wide spectrum of feasible HRF shapes, and propose efficient approaches for obtaining maximin and maximin-efficient designs. Our approaches involve a reduction in the parameter space and a search algorithm that helps to efficiently search over a restricted class of designs for good designs. The obtained designs are compared with traditional designs widely used in practice. We also demonstrate the usefulness of our approaches via a motivating example.
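The distinction between maximin and maximin-efficient designs can be sketched on a finite grid. In the hypothetical example below, `crit[d, j]` is the value of a design criterion (larger is better) for candidate design `d` at the `j`-th plausible parameter value, and the best candidate in each column stands in for the locally optimal design. These numbers are invented for illustration and are unrelated to the paper's fMRI model.

```python
import numpy as np

def maximin_design(crit):
    """Index of the design maximizing the worst-case criterion value."""
    return int(np.argmax(crit.min(axis=1)))

def maximin_efficient_design(crit):
    """Index of the design maximizing the worst-case relative efficiency.

    Efficiency is measured against the best candidate at each parameter value,
    used here as a proxy for the locally optimal design (an assumption).
    """
    eff = crit / crit.max(axis=0, keepdims=True)
    return int(np.argmax(eff.min(axis=1))), eff

# Rows: candidate designs; columns: plausible parameter values (hypothetical).
crit = np.array([[10.0, 1.00],
                 [ 4.0, 1.10],
                 [ 9.0, 1.05]])

d_mm = maximin_design(crit)
d_mme, eff = maximin_efficient_design(crit)
print("maximin design:", d_mm)
print("maximin-efficient design:", d_mme)
```

The two criteria can disagree when the attainable criterion values differ greatly in scale across parameter values, since the raw maximin rule is then dominated by the parameter value with the smallest attainable values.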
• ### D-optimal Factorial Designs under Generalized Linear Models(1301.3581)

May 3, 2013 math.ST, stat.TH
Generalized linear models (GLMs) have been used widely for modelling the mean response both for discrete and continuous random variables with an emphasis on categorical response. Recently Yang, Mandal and Majumdar (2013) considered full factorial and fractional factorial locally D-optimal designs for binary response and two-level experimental factors. In this paper, we extend their results to a general setup with response belonging to a single-parameter exponential family and for multi-level predictors.
• ### Robustness of Optimal Designs for 2^2 Experiments with Binary Response(1005.1982)

May 12, 2010 stat.ME
We consider an experiment with two qualitative factors at two levels each and a binary response that follows a generalized linear model. In Mandal, Yang and Majumdar (2010) we obtained basic results and characterizations of locally D-optimal designs for special cases. As locally optimal designs depend on the assumed parameter values, a critical issue is the sensitivity of the design to misspecification of these values. In this paper we study the sensitivity theoretically and by simulation, and show that the optimal designs are quite robust. We use the method of cylindrical algebraic decomposition to obtain locally D-optimal designs in the general case.
• ### Optimal Designs for Two-Level Factorial Experiments with Binary Response(1003.1557)

May 12, 2010 math.ST, stat.TH, stat.ME
We consider the problem of obtaining locally D-optimal designs for factorial experiments with qualitative factors at two levels each with binary response. Our focus is primarily on the 2^2 experiment. In this paper, we derive analytic results for some special cases and indicate how to handle the general case. The performance of the uniform design is examined and we show that this design is highly efficient in general. For the general 2^k case we show that the uniform design has a maximin property.
• ### $\mathcal{G}$-SELC: Optimization by sequential elimination of level combinations using genetic algorithms and Gaussian processes(0906.1433)

June 8, 2009 stat.AP
Identifying promising compounds from a vast collection of feasible compounds is an important and yet challenging problem in the pharmaceutical industry. An efficient solution to this problem will help reduce the expenditure at the early stages of drug discovery. In an attempt to solve this problem, Mandal, Wu and Johnson [Technometrics 48 (2006) 273--283] proposed the SELC algorithm. Although powerful, it fails to extract substantial information from the data to guide the search efficiently, as this methodology is not based on any statistical modeling. The proposed approach uses Gaussian Process (GP) modeling to improve upon SELC, and is hence named $\mathcal{G}$-SELC. The performance of the proposed methodology is illustrated using four- and five-dimensional test functions. Finally, we implement the new algorithm on a real pharmaceutical data set for finding a group of chemical compounds with optimal properties.