Collaborative forecasting involves exchanging information on how much of an
item will be needed by a buyer and how much can be supplied by a seller or
manufacturer in a supply chain. This exchange allows parties to plan their
operations based on the needs and limitations of their supply chain partner.
The success of this system critically depends on the healthy flow of
information. This paper focuses on methods to easily analyze and visualize this
process. To understand how information travels through this network and how
parties react to new information from their partners, this paper proposes a
method based on Gaussian graphical models and uses it to identify certain
inefficiencies in the system. To simplify and better understand the update
structure, a method based on Continuum Canonical Correlation is also proposed.
The analytical tools introduced
in this article are implemented as a part of a forecasting solution software
developed to aid the forecasting practice of a large company.
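As a rough illustration of the Gaussian graphical model idea, the sketch below reads conditional dependencies off the inverse covariance (precision) matrix: two variables are joined by an edge when their partial correlation is clearly nonzero. The three-variable chain, the coefficients a and b, the threshold, and all function names are hypothetical choices made for this example, not taken from the paper, and thresholding partial correlations is a simplification of what one would do with an estimated covariance matrix.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(cov):
    """Partial correlation of i and j given all other variables: -P_ij / sqrt(P_ii * P_jj)."""
    precision = invert(cov)
    n = len(cov)
    return [[1.0 if i == j else
             -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]

def ggm_edges(cov, threshold=0.05):
    """Keep an edge (i, j) when the absolute partial correlation exceeds the threshold."""
    pcor = partial_correlations(cov)
    n = len(cov)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(pcor[i][j]) > threshold]

# Hypothetical chain of forecast updates with unit-variance noise:
# X0 -> X1 = a * X0 + noise -> X2 = b * X1 + noise.
a, b = 0.8, 0.5
var1 = a * a + 1.0
cov = [
    [1.0,   a,        a * b],
    [a,     var1,     b * var1],
    [a * b, b * var1, b * b * var1 + 1.0],
]
edges = ggm_edges(cov)
print(edges)
```

In this toy chain the first and third variables are correlated, yet they are conditionally independent given the middle one, so no direct edge between them should be kept. That distinction between marginal and conditional dependence is what makes such a graph useful for tracing how information moves between partners.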
This study focuses on the inventory pooling problem under the newsvendor
framework. The specific focus is the change in inventory levels when product
inventories are pooled. We provide analytical conditions under which an
increase (or decrease) in the total inventory levels should be expected. We
introduce the copula framework to model a wide range of dependence structures
between pooled demands, and provide a numerical study that shows how the
marginal demand distributions and the dependence structure influence the
effect of pooling on inventory levels.
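A minimal simulation sketch of this comparison follows. It relies on the standard newsvendor result that the optimal stocking quantity is the demand quantile at the critical ratio, and it couples two demands through a Gaussian copula. The exponential marginals, the mean of 100, the correlation values, the sample size, and every function name are assumptions made for illustration only; they are not the settings of the study.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def exponential_quantile(u, mean):
    """Inverse CDF of an exponential distribution (assumed marginal)."""
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    return -mean * math.log1p(-u)

def empirical_quantile(samples, ratio):
    """Smallest sample value whose empirical CDF reaches `ratio`."""
    ordered = sorted(samples)
    index = max(0, math.ceil(ratio * len(ordered)) - 1)
    return ordered[index]

def pooling_effect(critical_ratio, rho, mean=100.0, n=20000, seed=0):
    """Return (total separate inventory, pooled inventory) for two demands.

    Dependence comes from a Gaussian copula with correlation `rho`:
    correlated normals are mapped to uniforms, then to the marginals.
    """
    rng = random.Random(seed)
    d1, d2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        d1.append(exponential_quantile(STD_NORMAL.cdf(z1), mean))
        d2.append(exponential_quantile(STD_NORMAL.cdf(z2), mean))
    separate = (empirical_quantile(d1, critical_ratio)
                + empirical_quantile(d2, critical_ratio))
    pooled = empirical_quantile([x + y for x, y in zip(d1, d2)], critical_ratio)
    return separate, pooled

for ratio in (0.2, 0.9):
    separate, pooled = pooling_effect(ratio, rho=0.3)
    print(f"critical ratio {ratio}: separate={separate:.1f}, pooled={pooled:.1f}")
```

With skewed marginals such as these, the pooled quantity can fall below the separate total at a high critical ratio and rise above it at a low one, which is the kind of behavior the analytical conditions are meant to characterize. Swapping `exponential_quantile` for another inverse CDF, or changing `rho`, gives a quick way to explore other marginals and dependence strengths.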
Object Oriented Data Analysis is a new area in statistics that studies
populations of general data objects. In this article we consider populations of
tree-structured objects as our focus of interest. We develop improved analysis
tools for data lying in a binary tree space analogous to classical Principal
Component Analysis methods in Euclidean space. Our extensions of PCA are
analogs of one-dimensional subspaces that best fit the data. Previous work was
based on the notion of tree-lines.
In this paper, a generalization of the previous tree-line notion is proposed:
k-tree-lines. Previously proposed tree-lines are k-tree-lines where k=1. New
sub-cases of k-tree-lines studied in this work are the 2-tree-lines and
tree-curves, which explain much more variation per principal component than
tree-lines. The optimal principal component tree-lines were computable in
linear time. Because 2-tree-lines and tree-curves are more complex, they are
computationally more expensive, but yield improved data analysis results.
We provide a comparative study of all these methods on a motivating data set
consisting of brain vessel structures of 98 subjects.
The statistical analysis of tree-structured data is a new topic in statistics
with wide application areas. Some Principal Component Analysis (PCA) ideas were
previously developed for binary tree spaces. In this study, we extend these
ideas to the more general space of rooted and labeled trees. We re-define
concepts such as tree-line and forward principal component tree-line for this
more general space, and generalize the optimal algorithm that finds them.
We then develop an analog of the classical dimension reduction technique in PCA
for the tree space. To do this, we define the components that carry the least
amount of variation of a tree data set, called backward principal components.
We present an optimal algorithm to find them. Furthermore, we investigate the
relationship between these and the forward principal components, and prove a
path-independency property between the forward and backward techniques.
We apply our methods to a brain artery data set of 98 subjects.
Using our techniques, we investigate how aging affects the brain artery
structure of males and females. We also analyze a data set describing the
organizational structure of a large US company and explore the structural
differences across
different types of departments within the company.
This study introduces a new method of visualizing complex tree-structured
objects. The usefulness of this method is illustrated in the context of
detecting unexpected features in a data set of very large trees. The major
contribution is a novel two-dimensional graphical representation of each tree,
with a covariate coded by color. The motivating data set contains
three-dimensional representations of brain artery systems of 105 subjects. Due to
inaccuracies inherent in the medical imaging techniques, issues with the
reconstruction algorithms and inconsistencies introduced by manual
adjustment, various discrepancies are present in the data. The proposed
representation enables quick visual detection of the most common discrepancies.
For our driving example, this tool led to the modification of 10% of the artery
trees and deletion of 6.7%. The benefits of our cleaning method are
demonstrated through a statistical hypothesis test on the effects of aging on
vessel structure. The data cleaning resulted in improved significance levels.
The active field of Functional Data Analysis, which studies the variation in
a set of curves, has recently been extended to Object Oriented Data Analysis,
which considers populations of more general objects. A
particularly challenging extension of this set of ideas is to populations of
tree-structured objects. We develop an analog of Principal Component Analysis
for trees, based on the notion of tree-lines, and propose numerically fast
(linear time) algorithms to solve the resulting optimization problems. The
solutions we obtain are used in the analysis of a data set of 73 individuals,
where each data object is a tree of blood vessels in one person's brain.
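The sketch below illustrates one way a best-fitting tree-line can be found with a single bottom-up pass. It rests on assumptions that the abstract does not spell out: each binary tree is stored as a set of heap-style node indices (root 1, children 2v and 2v+1), trees are compared by the size of their symmetric difference, and a tree-line grows from a starting tree by repeatedly adding a child of the node added last. Under those assumptions, minimizing the total distance from the data to their projections amounts to choosing the downward path with the largest total node count. The function names and the toy data are hypothetical, and this is not necessarily the authors' algorithm.

```python
from collections import Counter

def first_pc_tree_line(trees, start):
    """Return the node path (v1, ..., vm) of a best-fitting tree-line from `start`.

    trees: list of sets of node indices; start: set of node indices (the starting tree).
    """
    # weight[v] = number of data trees that contain node v
    weight = Counter(v for tree in trees for v in tree)
    best = {}       # best[v]: largest total weight of a downward path beginning at v
    next_node = {}  # next_node[v]: child continuing that path, or None at its end
    # Children have larger indices than parents, so descending order visits them first.
    for v in sorted(weight, reverse=True):
        if v in start:
            continue
        children = [c for c in (2 * v, 2 * v + 1) if c in best]
        nxt = max(children, key=best.get, default=None)
        best[v] = weight[v] + (best[nxt] if nxt is not None else 0)
        next_node[v] = nxt
    # The path must attach to the starting tree through its first node.
    candidates = [v for v in sorted(best) if v // 2 in start]
    if not candidates:
        return []
    v = max(candidates, key=best.get)
    path = []
    while v is not None:
        path.append(v)
        v = next_node[v]
    return path

def project(tree, start, path):
    """Closest tree-line member to `tree`: start plus the longest path prefix inside it."""
    projected = set(start)
    for v in path:
        if v not in tree:
            break
        projected.add(v)
    return projected

# Toy data: four small binary trees, all containing the root.
trees = [{1, 2, 3, 4}, {1, 2, 4, 8}, {1, 2, 3, 4, 8}, {1, 3, 6}]
start = set.intersection(*trees)  # here just the root
path = first_pc_tree_line(trees, start)
projections = [project(tree, start, path) for tree in trees]
print(path, projections)
```

The sort makes this pass slightly more than linear in the number of distinct nodes; a traversal of the support tree would avoid it. The projection of each data tree onto the fitted tree-line then plays the role that scores play in classical PCA.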