
We propose a new class of spatio-temporal models with unknown and banded
autoregressive coefficient matrices. The setting represents a sparse structure
for high-dimensional spatial panel dynamic models when panel members represent
economic (or other type) individuals at many different locations. The structure
is practically meaningful when the order of panel members is arranged
appropriately. Note that the implied autocovariance matrices are unlikely to be
banded, and therefore, the proposal is radically different from the existing
literature on the inference for high-dimensional banded covariance matrices.
Due to the innate endogeneity, we apply the least squares method based on a
Yule-Walker equation to estimate autoregressive coefficient matrices. The
estimators based on multiple Yule-Walker equations are also studied. A
ratio-based method for determining the bandwidth of autoregressive matrices is
also proposed. Some asymptotic properties of the inference methods are
established. The proposed methodology is further illustrated using both
simulated and real data sets.

Precision matrices play important roles in many practical applications.
Motivated by temporally dependent multivariate data in modern social and
scientific studies, we consider the statistical inference of precision matrices
for high-dimensional time-dependent observations. Specifically, we propose a
data-driven procedure to construct a class of simultaneous confidence regions
for the precision coefficients within an index set of interest. The confidence
regions can be applied to test for specific structures of a precision matrix
and to recover its nonzero components. We first construct an estimator of the
underlying precision matrix via penalized nodewise regressions, and then
develop Gaussian approximation results for the maximal difference between
the estimated and true precision matrices. A computationally feasible
parametric bootstrap algorithm is developed to implement the proposed
procedure. Theoretical results indicate that the proposed procedure works well
without the second-order cross-time stationarity assumption on the data and
sparse structure conditions on the long-run covariance of the estimates.
Simulation studies and a real example on S&P 500 stock return data confirm the
performance of the proposed approach.

We propose a new approach to represent nonparametrically the linear
dependence structure of a spatio-temporal process in terms of latent common
factors. Though it is formally similar to the existing reduced rank
approximation methods (Section 7.1.3 of Cressie and Wikle, 2011), the
fundamental difference is that the low-dimensional structure is completely
unknown in our setting, which is learned from the data collected irregularly
over space but regularly over time. Furthermore a graph Laplacian is
incorporated in the learning in order to take advantage of the continuity
over space, and a new aggregation method via randomly partitioning space is
introduced to improve the efficiency. We do not impose any stationarity
conditions over space either, as the learning is facilitated by the
stationarity in time. Krigings over space and time are carried out based on the
learned low-dimensional structure, which is scalable to the cases when the data
are taken over a large number of locations and/or over a long time period.
Asymptotic properties of the proposed methods are established. Illustration
with both simulated and real data sets is also reported.

We propose a new and easy-to-use method for identifying cointegrated
components of nonstationary time series, consisting of an eigenanalysis for a
certain nonnegative definite matrix. Our setting is model-free, and we allow
the integer-valued integration orders of the observable series to be unknown,
and to possibly differ. Consistency of estimates of the cointegration space and
cointegration rank is established both when the dimension of the observable
time series is fixed as sample size increases, and when it diverges slowly. The
proposed methodology is also extended and justified in a fractional setting. A
Monte Carlo study of finite-sample performance, and a small empirical
illustration, are reported.

While it is common practice in applied network analysis to report various
standard network summary statistics, these numbers are rarely accompanied by
some quantification of uncertainty. Yet any error inherent in the measurements
underlying the construction of the network, or in the network construction
procedure itself, necessarily must propagate to any summary statistics
reported. Here we study the problem of estimating the density of edges in a
noisy network, as a canonical prototype of the more general problem of
estimating density of arbitrary subgraphs. Under a simple model of network
error, we show that consistent estimation of such densities is impossible when
the rates of error are unknown and only a single network is observed. We then
develop method-of-moment estimators of network edge density and error rates for
the case where a minimal number of network replicates are available. These
estimators are shown to be asymptotically normal as the number of vertices
increases to infinity. We also provide the confidence intervals for quantifying
the uncertainty in these estimates based on the asymptotic normality. We
illustrate the use of our estimators in the context of gene co-expression
networks.

We propose a new approach to represent nonparametrically the linear
dependence structure of a multivariate spatio-temporal process in terms of
latent common factors. The matrix structure of observations from the
multivariate spatio-temporal process is well preserved through the matrix factor
model configuration. The space loading functions are estimated
nonparametrically by sieve approximation and the variable loading matrix is
estimated via an eigenanalysis of a symmetric nonnegative definite matrix.
Though it is similar to the low rank approximation methods in spatial
statistics, the fundamental difference is that the low-dimensional structure is
completely unknown in our setting. Additionally, our method accommodates
nonstationarity over space. Asymptotic properties of the proposed methods are
established.

We extend the principal component analysis (PCA) to second-order stationary
vector time series in the sense that we seek a contemporaneous linear
transformation for a $p$-variate time series such that the transformed series
is segmented into several lower-dimensional subseries, and those subseries are
uncorrelated with each other both contemporaneously and serially. Therefore
those lower-dimensional series can be analysed separately as far as the linear
dynamic structure is concerned. Technically it boils down to an eigenanalysis
for a positive definite matrix. When $p$ is large, an additional step is
required to perform a permutation in terms of either maximum cross-correlations
or FDR based on multiple tests. The asymptotic theory is established for both
fixed $p$ and diverging $p$ when the sample size $n$ tends to infinity.
Numerical experiments with both simulated and real data sets indicate that the
proposed method is an effective initial step in analysing multiple time series
data, which leads to substantial dimension reduction in modelling and
forecasting high-dimensional linear dynamical structures. Unlike PCA for
independent data, there is no guarantee that the required linear transformation
exists. When it does not, the proposed method provides an approximate
segmentation which leads to advantages in, for example, forecasting
future values. The method can also be adapted to segment multiple volatility
processes.

We propose a new omnibus test for vector white noise using the maximum
absolute autocorrelations and cross-correlations of the component series.
Based on the newly established approximation by the $L_\infty$-norm of a normal
random vector, the critical value of the test can be evaluated by bootstrapping
from a multivariate normal distribution. In contrast to the conventional white
noise test, the new method is proved to be valid for testing the departure from
non-IID white noise. We illustrate the accuracy and the power of the proposed
test by simulation, which also shows that the new test outperforms several
commonly used methods including, for example, the Lagrange multiplier test and
the multivariate Box-Pierce portmanteau tests especially when the dimension of
time series is high in relation to the sample size. The numerical results also
indicate that the performance of the new test can be further enhanced when it
is applied to the pre-transformed data obtained via the time series principal
component analysis proposed by Chang, Guo and Yao (2014). The proposed
procedures have been implemented in an R package, HDtest, which is available online
at CRAN.
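
As a rough illustration of the test statistic described above, the following Python sketch computes the maximum absolute auto- and cross-correlation over the first K lags, scaled by the square root of the sample size. It is not the HDtest implementation and does not reproduce its interface; the function name `max_cross_corr_stat`, the scaling, and the choice of K are assumptions made for illustration. The bootstrap step that yields the critical value is deliberately omitted, so the sketch only contrasts the statistic on white noise with the statistic on a serially dependent series.

```python
import numpy as np

def max_cross_corr_stat(y, K):
    """sqrt(n) times the largest absolute sample auto/cross-correlation
    over lags 1..K (hypothetical helper, not part of HDtest)."""
    n, p = y.shape
    # Standardize each component so lagged cross-products are correlations.
    z = (y - y.mean(axis=0)) / y.std(axis=0)
    stat = 0.0
    for k in range(1, K + 1):
        rho_k = z[k:].T @ z[:n - k] / n   # p x p lag-k correlation matrix
        stat = max(stat, np.sqrt(n) * np.max(np.abs(rho_k)))
    return stat

rng = np.random.default_rng(4)
n, p, K = 500, 5, 2

# Case 1: Gaussian white noise (the null hypothesis holds).
white = rng.standard_normal((n, p))

# Case 2: independent AR(1) components with coefficient 0.5 (not white noise).
ar = np.zeros((n, p))
for t in range(1, n):
    ar[t] = 0.5 * ar[t - 1] + rng.standard_normal(p)

stat_white = max_cross_corr_stat(white, K)
stat_ar = max_cross_corr_stat(ar, K)
print(f"white noise: {stat_white:.2f}, AR(1): {stat_ar:.2f}")
```

The statistic for the autoregressive series should be markedly larger than for white noise; turning this contrast into a formal test requires the normal-approximation bootstrap described in the abstract.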

We propose a hybrid approach for the modelling and the short-term forecasting
of electricity loads. Two building blocks of our approach are (i) modelling the
overall trend and seasonality by fitting a generalised additive model to the
weekly averages of the load, and (ii) modelling the dependence structure across
consecutive daily loads via curve linear regression. For the latter, a new
methodology is proposed for linear regression with both curve response and
curve regressors. The key idea behind the proposed methodology is the dimension
reduction based on a singular value decomposition in a Hilbert space, which
reduces the curve regression problem to several ordinary (i.e. scalar) linear
regression problems. We illustrate the hybrid method using the French
electricity loads between 1996 and 2009, on which we also compare our method
with other available models including the EDF operational model.
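
The reduction from curve regression to scalar regressions can be sketched in Python as follows. This is a simplified discretized version under stated assumptions, not the paper's procedure: curves are sampled on a common uniform grid, the Hilbert-space singular value decomposition is replaced by a matrix SVD of the sample cross-covariance, and the names `fit_curve_regression`, `predict_curves`, and the number of retained pairs `d` are hypothetical. The data are synthetic and unrelated to the French load series.

```python
import numpy as np

def fit_curve_regression(X, Y, d):
    """X, Y: (n, m) arrays of regressor/response curves on a grid.
    Keeps d singular pairs and fits one scalar regression per pair."""
    x_bar, y_bar = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_bar, Y - y_bar
    C = Yc.T @ Xc / X.shape[0]            # discretized cross-covariance of (Y, X)
    U, s, Vt = np.linalg.svd(C)
    U, V = U[:, :d], Vt[:d].T
    xi, eta = Yc @ U, Xc @ V              # scalar scores of response and regressor
    # d separate scalar least-squares fits: xi_j = beta_j * eta_j + error
    beta = np.sum(xi * eta, axis=0) / np.sum(eta * eta, axis=0)
    return {"x_bar": x_bar, "y_bar": y_bar, "U": U, "V": V, "beta": beta}

def predict_curves(model, X):
    """Predict response curves from new regressor curves."""
    eta = (X - model["x_bar"]) @ model["V"]
    return model["y_bar"] + (eta * model["beta"]) @ model["U"].T

# Synthetic example: two latent scores drive both curves.
rng = np.random.default_rng(3)
n, m = 300, 24
u = np.linspace(0.0, 1.0, m, endpoint=False)
a, b = rng.standard_normal((2, n, 1))
X = a * np.sin(2 * np.pi * u) + b * np.cos(2 * np.pi * u) \
    + 0.1 * rng.standard_normal((n, m))
Y = 0.8 * a * np.sin(2 * np.pi * u) - 0.5 * b * np.cos(4 * np.pi * u) \
    + 0.1 * rng.standard_normal((n, m))

model = fit_curve_regression(X[:200], Y[:200], d=2)
Y_hat = predict_curves(model, X[200:])
mse_fit = np.mean((Y[200:] - Y_hat) ** 2)
mse_mean = np.mean((Y[200:] - model["y_bar"]) ** 2)   # baseline: mean curve
print(f"curve regression MSE: {mse_fit:.4f}, mean-curve MSE: {mse_mean:.4f}")
```

Because the singular vectors pair each response score with exactly one regressor score, each fit involves a single scalar coefficient, which is what makes the approach cheap even for finely sampled curves.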

We consider a class of vector autoregressive models with banded coefficient
matrices. The setting represents a type of sparse structure for
high-dimensional time series, though the implied autocovariance matrices are
not banded. The structure is also practically meaningful when the order of
component time series is arranged appropriately. The convergence rates for the
estimated banded autoregressive coefficient matrices are established. We also
propose a Bayesian information criterion for determining the width of the bands
in the coefficient matrices, which is proved to be consistent. By exploring
some approximate banded structure for the autocovariance functions of banded
vector autoregressive processes, consistent estimators for the autocovariance
matrices are constructed.
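
To make the banded structure concrete, the Python sketch below fits a first-order banded vector autoregression by row-wise least squares, regressing each component only on the lagged components within the band. This is a minimal sketch under assumptions not stated in the abstract: the model order is one, the bandwidth `k` is treated as known rather than selected by the information criterion, and the helper names `simulate_var1` and `fit_banded_var1` are hypothetical.

```python
import numpy as np

def simulate_var1(A, n, rng):
    """Simulate y_t = A y_{t-1} + e_t with standard normal innovations."""
    p = A.shape[0]
    y = np.zeros((n, p))
    for t in range(1, n):
        y[t] = A @ y[t - 1] + rng.standard_normal(p)
    return y

def fit_banded_var1(y, k):
    """Row-wise least squares using only regressors j with |i - j| <= k."""
    n, p = y.shape
    X, Y = y[:-1], y[1:]
    A_hat = np.zeros((p, p))
    for i in range(p):
        lo, hi = max(0, i - k), min(p, i + k + 1)
        coef = np.linalg.lstsq(X[:, lo:hi], Y[:, i], rcond=None)[0]
        A_hat[i, lo:hi] = coef            # entries outside the band stay zero
    return A_hat

rng = np.random.default_rng(0)
p, k = 8, 1
idx = np.arange(p)
band = np.abs(idx[:, None] - idx[None, :]) <= k
# True coefficients: 0.3 on the diagonal, 0.2 on the first off-diagonals.
A_true = np.where(band, 0.2, 0.0) + 0.1 * np.eye(p)

y = simulate_var1(A_true, 5000, rng)
A_hat = fit_banded_var1(y, k)
max_err = np.max(np.abs(A_hat - A_true))
print(f"largest absolute estimation error: {max_err:.3f}")
```

Each row involves at most 2k + 1 regressors regardless of the dimension p, which is the source of the sparsity the abstract refers to.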

We consider a class of spatio-temporal models which extend popular
econometric spatial autoregressive panel data models by allowing the scalar
coefficients for each location (or panel) to differ from each other. To
overcome the innate endogeneity, we propose a generalized Yule-Walker
estimation method which applies the least squares estimation to a Yule-Walker
equation. The asymptotic theory is developed under the setting that both the
sample size and the number of locations (or panels) tend to infinity under a
general setting for stationary and alpha-mixing processes, which includes
spatial autoregressive panel data models driven by i.i.d. innovations as
special cases. The proposed methods are illustrated using both simulated and
real data.
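
The idea of applying least squares to a Yule-Walker equation can be sketched in Python for a simplified model. The specification below is an assumption made for illustration and is narrower than the class in the abstract: each location i follows y_{i,t} = lam0_i * (w_i' y_t) + lam1_i * y_{i,t-1} + e_{i,t} with a known row-normalized weight matrix W, and only the lag-1 equation is used. Multiplying by y_{t-1}' and taking expectations removes the innovation term, giving Sigma_1[i, :] = lam0_i * (w_i' Sigma_1) + lam1_i * Sigma_0[i, :], which is then solved by least squares. The function name `yule_walker_ls` is hypothetical.

```python
import numpy as np

def yule_walker_ls(y, W):
    """Estimate (lam0_i, lam1_i) for each location from the lag-1
    Yule-Walker equation (simplified, illustrative model)."""
    n, p = y.shape
    yc = y - y.mean(axis=0)
    S0 = yc[:-1].T @ yc[:-1] / n          # estimate of Cov(y_{t-1}, y_{t-1})
    S1 = yc[1:].T @ yc[:-1] / n           # estimate of Cov(y_t, y_{t-1})
    lam = np.zeros((p, 2))
    for i in range(p):
        # p equations, 2 unknowns: S1[i] = lam0_i * (W[i] @ S1) + lam1_i * S0[i]
        X = np.column_stack([W[i] @ S1, S0[i]])
        lam[i] = np.linalg.lstsq(X, S1[i], rcond=None)[0]
    return lam

rng = np.random.default_rng(1)
p, n = 6, 50000

# Ring-shaped neighbourhood: each location averages its two neighbours.
W = np.zeros((p, p))
for i in range(p):
    W[i, (i - 1) % p] = W[i, (i + 1) % p] = 0.5

lam0 = np.linspace(0.1, 0.3, p)           # location-specific spatial coefficients
lam1 = np.linspace(0.5, 0.3, p)           # location-specific temporal coefficients

# Reduced form: y_t = Phi y_{t-1} + S e_t with S = (I - D(lam0) W)^{-1}.
S = np.linalg.inv(np.eye(p) - lam0[:, None] * W)
Phi = S * lam1[None, :]
eps = rng.standard_normal((n, p)) @ S.T
y = np.zeros((n, p))
for t in range(1, n):
    y[t] = Phi @ y[t - 1] + eps[t]

lam_hat = yule_walker_ls(y, W)
err0 = np.max(np.abs(lam_hat[:, 0] - lam0))
err1 = np.max(np.abs(lam_hat[:, 1] - lam1))
print(f"max error lam0: {err0:.3f}, lam1: {err1:.3f}")
```

Ordinary least squares on the structural equation would be biased here because w_i' y_t is correlated with e_{i,t}; the Yule-Walker equation avoids this by using y_{t-1}, which is uncorrelated with the current innovation.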

We consider a multivariate time series model which represents a
high-dimensional vector process as a sum of three terms: a linear regression of some
observed regressors, a linear combination of some latent and serially
correlated factors, and a vector white noise. We investigate the inference
without imposing stationarity conditions on the target multivariate time series,
the regressors and the underlying factors. Furthermore we deal with the
endogeneity that there exist correlations between the observed regressors and
the unobserved factors. We also consider the model with a nonlinear regression
term which can be approximated by a linear regression function with a large
number of regressors. The convergence rates for the estimators of regression
coefficients, the number of factors, factor loading space and factors are
established under the settings when the dimension of time series and the number
of regressors may both tend to infinity together with the sample size. The
proposed method is illustrated with both simulated and real data examples.

The following conversation is partly based on an interview that took place in
the Hong Kong University of Science and Technology in July 2013.

For discrete panel data, the dynamic relationship between successive
observations is often of interest. We consider a dynamic probit model for short
panel data. A problem with estimating the dynamic parameter of interest is that
the model contains a large number of nuisance parameters, one for each
individual. Heckman proposed to use maximum likelihood estimation of the
dynamic parameter, which, however, does not perform well if the individual
effects are large. We suggest new estimators for the dynamic parameter, based
on the assumption that the individual parameters are random and possibly large.
Theoretical properties of our estimators are derived and a simulation study
shows they have some advantages compared to Heckman's estimator.

We propose a new method for estimating the extreme quantiles for a function
of several dependent random variables. In contrast to the conventional approach
based on extreme value theory, we do not impose the condition that the tail of
the underlying distribution admits an approximate parametric form, and,
furthermore, our estimation makes use of the full observed data. The proposed
method is semiparametric as no parametric forms are assumed on all the marginal
distributions. But we select appropriate bivariate copulas to model the joint
dependence structure by taking advantage of the recent development in
constructing large dimensional vine copulas. Consequently a sample quantile
resulting from a large bootstrap sample drawn from the fitted joint distribution
is taken as the estimator for the extreme quantile. This estimator is proved to
be consistent. The reliable and robust performance of the proposed method is
further illustrated by simulation.

The curve time series framework provides a convenient vehicle to accommodate
some nonstationary features into a stationary setup. We propose a new method to
identify the dimensionality of curve time series based on the dynamical
dependence across different curves. The practical implementation of our method
boils down to an eigenanalysis of a finite-dimensional matrix. Furthermore, the
determination of the dimensionality is equivalent to the identification of the
nonzero eigenvalues of the matrix, which we carry out in terms of some
bootstrap tests. Asymptotic properties of the proposed method are investigated.
In particular, our estimators for zero-eigenvalues enjoy the fast convergence
rate $n$ while the estimators for nonzero eigenvalues converge at the standard
$\sqrt{n}$-rate. The proposed methodology is illustrated with both simulated
and real data sets.

This paper deals with the factor modeling for high-dimensional time series
based on a dimension-reduction viewpoint. Under stationary settings, the
inference is simple in the sense that both the number of factors and the factor
loadings are estimated in terms of an eigenanalysis for a nonnegative definite
matrix, and is therefore applicable when the dimension of time series is on the
order of a few thousand. Asymptotic properties of the proposed method are
investigated under two settings: (i) the sample size goes to infinity while the
dimension of time series is fixed; and (ii) both the sample size and the
dimension of time series go to infinity together. In particular, our estimators
for zero-eigenvalues enjoy faster convergence (or slower divergence) rates,
hence making the estimation for the number of factors easier. Moreover,
when the sample size and the dimension of time series go to infinity together,
the estimators for the eigenvalues are no longer consistent. However, our
estimator for the number of the factors, which is based on the ratios of the
estimated eigenvalues, still works fine. Furthermore, this estimation shows the
so-called "blessing of dimensionality" property in the sense that the
performance of the estimation may improve when the dimension of time series
increases. A two-step procedure is investigated when the factors are of
different degrees of strength. Numerical illustration with both simulated and
real data is also reported.
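
The eigenanalysis and the ratio-based estimator can be sketched in Python as follows. The abstract does not specify the matrix, so the construction here is an assumption: M is taken as the sum over lags k = 1, ..., k0 of S(k) S(k)', where S(k) is the sample lag-k autocovariance matrix, and the number of factors is estimated by minimizing the ratio of successive eigenvalues of M. The names `autocov`, `estimate_factors`, `k0`, and `r_max` are hypothetical.

```python
import numpy as np

def autocov(y, k):
    """Sample lag-k autocovariance matrix of an (n, p) series."""
    n = y.shape[0]
    yc = y - y.mean(axis=0)
    return yc[k:].T @ yc[:n - k] / n

def estimate_factors(y, k0=2, r_max=None):
    """Eigenanalysis of M = sum_k S(k) S(k)' and ratio-based choice of r."""
    n, p = y.shape
    r_max = p // 2 if r_max is None else r_max
    M = np.zeros((p, p))
    for k in range(1, k0 + 1):
        S = autocov(y, k)
        M += S @ S.T                      # nonnegative definite by construction
    evals, evecs = np.linalg.eigh(M)      # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]
    ratios = evals[1:r_max + 1] / evals[:r_max]
    r_hat = int(np.argmin(ratios)) + 1    # sharpest drop in the eigenvalues
    return r_hat, evecs[:, :r_hat], evals

# Synthetic data: 3 strong AR(1) factors observed through 20 series.
rng = np.random.default_rng(2)
n, p, r = 1000, 20, 3
phi = np.array([0.8, -0.7, 0.6])
x = np.zeros((n, r))
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal(r)
# Loading matrix with orthogonal columns of norm sqrt(p) (strong factors).
A = np.sqrt(p) * np.linalg.qr(rng.standard_normal((p, r)))[0]
y = x @ A.T + rng.standard_normal((n, p))

r_hat, Q, evals = estimate_factors(y)
# Relative distance between the true and estimated loading spaces.
space_err = np.linalg.norm(A - Q @ (Q.T @ A)) / np.linalg.norm(A)
print(f"estimated number of factors: {r_hat}, loading-space error: {space_err:.3f}")
```

Using only nonzero lags means that white-noise idiosyncratic terms contribute nothing to M in the population, so the leading eigenvectors isolate the factor loading space.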

Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H.
Tong [arXiv:1104.3073]

This paper deals with the dimension reduction for high-dimensional time
series based on common factors. In particular we allow the dimension of time
series $p$ to be as large as, or even larger than, the sample size $n$. The
estimation for the factor loading matrix and the factor process itself is
carried out via an eigenanalysis for a $p\times p$ nonnegative definite
matrix. We show that when all the factors are strong in the sense that the norm
of each column in the factor loading matrix is of the order $p^{1/2}$, the
estimator for the factor loading matrix, as well as the resulting estimator for
the precision matrix of the original $p$-variate time series, are weakly
consistent in $L_2$-norm with the convergence rates independent of $p$. This
result exhibits clearly that the `curse' is canceled out by the `blessings' in
dimensionality. We also establish the asymptotic properties of the estimation
when not all factors are strong. For the latter case, a two-step estimation
procedure is preferred according to the asymptotic theory. The proposed
methods together with their asymptotic properties are further illustrated in a
simulation study. An application to a real data set is also reported.

We propose to approximate the conditional expectation of a spatial random
variable given its nearest-neighbour observations by an additive function. The
setting is meaningful in practice and requires no unilateral ordering. It is
capable of catching nonlinear features in spatial data and exploring local
dependence structures. Our approach is different from both Markov field methods
and disjunctive kriging. The asymptotic properties of the additive estimators
have been established for $\alpha$-mixing spatial processes by extending the
theory of the backfitting procedure to the spatial case. This facilitates the
confidence intervals for the component functions, although the asymptotic
biases have to be estimated via (wild) bootstrap. Simulation results are
reported. Applications to real data illustrate that the improvement in
describing the data over the auto-normal scheme is significant when
nonlinearity or non-Gaussianity is pronounced.

Motivated by applications to prediction and forecasting, we suggest methods
for approximating the conditional distribution function of a random variable Y
given a dependent random d-vector X. The idea is to estimate not the
distribution of Y|X, but that of Y|\theta^TX, where the unit vector \theta is
selected so that the approximation is optimal under a least-squares criterion.
We show that \theta may be estimated root-n consistently. Furthermore,
estimation of the conditional distribution function of Y, given \theta^TX, has
the same first-order asymptotic properties that it would enjoy if \theta were
known. The proposed method is illustrated using both simulated and real-data
examples, showing its effectiveness for both independent datasets and data from
time series. Numerical work corroborates the theoretical result that \theta can
be estimated particularly accurately.

We propose to model multivariate volatility processes based on the newly
defined conditionally uncorrelated components (CUCs). This model provides a
parsimonious representation for matrix-valued processes. It is flexible in the
sense that we may fit each CUC with any appropriate univariate volatility
model. Computationally it splits one high-dimensional optimization problem into
several lower-dimensional subproblems. Consistency for the estimated CUCs has
been established. A bootstrap test is proposed for testing the existence of
CUCs. The proposed methodology is illustrated with both simulated and real data
sets.