• Collaborative forecasting involves exchanging information on how much of an item will be needed by a buyer and how much can be supplied by a seller or manufacturer in a supply chain. This exchange allows parties to plan their operations based on the needs and limitations of their supply chain partner. The success of this system critically depends on the healthy flow of information. This paper focuses on methods to easily analyze and visualize this process. To understand how the information travels on this network and how parties react to new information from their partners, this paper proposes a Gaussian Graphical Model based method, and finds certain inefficiencies in the system. To simplify and better understand the update structure, a Continuum Canonical Correlation based method is proposed. The analytical tools introduced in this article are implemented as a part of a forecasting solution software developed to aid the forecasting practice of a large company.
  • This study focuses on the inventory pooling problem under the newsvendor framework. The specific focus is the change in inventory levels when product inventories are pooled. We provide analytical conditions under which an increase (or decrease) in the total inventory levels should be expected. We introduce the copula framework to model a wide range of dependence structures between pooled demands, and provide a numerical study that shows how the marginal demand distributions and the dependence structure influence the effect of pooling on inventory levels.
  • Object Oriented Data Analysis is a new area in statistics that studies populations of general data objects. In this article we consider populations of tree-structured objects as our focus of interest. We develop improved analysis tools for data lying in a binary tree space analogous to classical Principal Component Analysis methods in Euclidean space. Our extensions of PCA are analogs of one dimensional subspaces that best fit the data. Previous work was based on the notion of tree-lines. In this paper, a generalization of the previous tree-line notion is proposed: k-tree-lines. Previously proposed tree-lines are k-tree-lines where k=1. New sub-cases of k-tree-lines studied in this work are the 2-tree-lines and tree-curves, which explain much more variation per principal component than tree-lines. The optimal principal component tree-lines were computable in linear time. Because 2-tree-lines and tree-curves are more complex, they are computationally more expensive, but yield improved data analysis results. We provide a comparative study of all these methods on a motivating data set consisting of brain vessel structures of 98 subjects.
  • The statistical analysis of tree structured data is a new topic in statistics with wide application areas. Some Principal Component Analysis (PCA) ideas were previously developed for binary tree spaces. In this study, we extend these ideas to the more general space of rooted and labeled trees. We re-define concepts such as tree-line and forward principal component tree-line for this more general space, and generalize the optimal algorithm that finds them. We then develop an analog of the classical dimension reduction technique in PCA for the tree space. To do this, we define the components that carry the least amount of variation of a tree data set, called backward principal components. We present an optimal algorithm to find them. Furthermore, we investigate the relationship of these to the forward principal components, and prove a path-independency property between the forward and backward techniques. We apply our methods to a brain artery data set of 98 subjects. Using our techniques, we investigate how aging affects the brain artery structure of males and females. We also analyze a data set of the organization structure of a large US company and explore the structural differences across different types of departments within the company.
  • This study introduces a new method of visualizing complex tree structured objects. The usefulness of this method is illustrated in the context of detecting unexpected features in a data set of very large trees. The major contribution is a novel two-dimensional graphical representation of each tree, with a covariate coded by color. The motivating data set contains three-dimensional representations of brain artery systems of 105 subjects. Due to inaccuracies inherent in the medical imaging techniques, issues with the reconstruction algorithms and inconsistencies introduced by manual adjustment, various discrepancies are present in the data. The proposed representation enables quick visual detection of the most common discrepancies. For our driving example, this tool led to the modification of 10% of the artery trees and deletion of 6.7%. The benefits of our cleaning method are demonstrated through a statistical hypothesis test on the effects of aging on vessel structure. The data cleaning resulted in improved significance levels.
  • The active field of Functional Data Analysis (about understanding the variation in a set of curves) has been recently extended to Object Oriented Data Analysis, which considers populations of more general objects. A particularly challenging extension of this set of ideas is to populations of tree-structured objects. We develop an analog of Principal Component Analysis for trees, based on the notion of tree-lines, and propose numerically fast (linear time) algorithms to solve the resulting optimization problems. The solutions we obtain are used in the analysis of a data set of 73 individuals, where each data object is a tree of blood vessels in one person's brain.
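
The inventory pooling question in the newsvendor abstract above can be illustrated with a small worked example. The sketch below is not the study's method or its analytical conditions. It assumes two products with normally distributed demands whose dependence is a single correlation `rho` (that is, a Gaussian copula with normal marginals), so the pooled demand is again normal. The function names, costs, and demand parameters are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def newsvendor_quantity(mean, sd, critical_ratio):
    """Newsvendor order quantity for normal demand: mean + z * sd,
    where z is the standard normal quantile of the critical ratio."""
    z = NormalDist().inv_cdf(critical_ratio)
    return mean + z * sd


def pooling_effect(means, sds, rho, underage_cost, overage_cost):
    """Return (total separate inventory, pooled inventory) for two products.

    Assumption: demands are jointly normal with correlation rho, so the
    pooled demand is normal with the variance computed below.
    """
    ratio = underage_cost / (underage_cost + overage_cost)
    separate = sum(newsvendor_quantity(m, s, ratio) for m, s in zip(means, sds))
    pooled_sd = sqrt(sds[0] ** 2 + sds[1] ** 2 + 2 * rho * sds[0] * sds[1])
    pooled = newsvendor_quantity(sum(means), pooled_sd, ratio)
    return separate, pooled


if __name__ == "__main__":
    means, sds = (100.0, 150.0), (20.0, 30.0)  # hypothetical demand parameters
    # Critical ratio above 0.5: pooling lowers the total inventory.
    print(pooling_effect(means, sds, rho=0.3, underage_cost=3.0, overage_cost=1.0))
    # Critical ratio below 0.5: pooling raises the total inventory.
    print(pooling_effect(means, sds, rho=0.3, underage_cost=1.0, overage_cost=3.0))
```

In this special case the direction of the change depends only on whether the critical ratio is above or below 0.5, because the pooled standard deviation never exceeds the sum of the individual ones. Other marginal distributions or copulas can behave differently, which is the setting the abstract addresses.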