
We study factor models augmented by observed covariates that have explanatory
power for the unknown factors. In financial factor models, the unknown factors
can be reasonably well explained by a few observable proxies, such as the
Fama-French factors. In diffusion index forecasts, the identified factors are
strongly related to several directly measurable economic variables such as the
consumption-wealth variable, financial ratios, and the term spread. With those
covariates, both the factors and loadings are identifiable up to a rotation
matrix even with only a finite dimension. To incorporate the explanatory power
of these covariates, we propose a smoothed principal component analysis (PCA):
(i) regress the data onto the observed covariates, and (ii) take the principal
components of the fitted data to estimate the loadings and factors. This allows
us to accurately estimate the percentage of both explained and unexplained
components in factors and thus to assess the explanatory power of covariates.
We show that both the factors and the loadings can be estimated with
improved rates of convergence compared to the benchmark method. The degree of
improvement depends on the strength of the signals, representing the
explanatory power of the covariates on the factors. The proposed estimator is
robust to possibly heavy-tailed distributions. We apply the model to forecast
US bond risk premia, and find that the observed macroeconomic characteristics
have strong explanatory power for the factors. The forecast gain is more
substantial when the characteristics are used to estimate the common factors
than when they are used directly for forecasting.
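
The two-step procedure described above can be sketched in a few lines of
NumPy. This is only an illustration under stated assumptions, not the authors'
estimator: step (i) is done here with ordinary least squares on demeaned data
(the abstract does not say which regression is used), the loadings are
normalised so that loadings' loadings / p is the identity, and the function
name smoothed_pca, the simulated data, and the crude "explained share" are all
hypothetical choices made for this example.

```python
import numpy as np

def smoothed_pca(Y, X, K):
    """Sketch of regress-then-PCA. Y: (p, T) data, X: (T, d) covariates."""
    p, T = Y.shape
    Yc = Y - Y.mean(axis=1, keepdims=True)   # demean each series
    Xc = X - X.mean(axis=0)                  # demean each covariate
    # Step (i): least-squares regression of every series on the covariates
    # (assumption: a linear fit; other regressions could be substituted).
    coef, *_ = np.linalg.lstsq(Xc, Yc.T, rcond=None)   # shape (d, p)
    Y_fit = (Xc @ coef).T                               # fitted data, (p, T)
    # Step (ii): principal components of the fitted data.
    eigval, eigvec = np.linalg.eigh(Y_fit @ Y_fit.T / T)
    top = eigvec[:, np.argsort(eigval)[::-1][:K]]       # top-K eigenvectors
    loadings = np.sqrt(p) * top                          # (p, K)
    factors = loadings.T @ Yc / p                        # (K, T)
    explained = loadings.T @ Y_fit / p                   # part fitted by X
    # Crude share of factor variation captured by the covariates.
    share = (explained ** 2).sum() / (factors ** 2).sum()
    return loadings, factors, share

# Simulated example: factors partly driven by the covariates.
rng = np.random.default_rng(0)
p, T, d, K = 50, 200, 3, 2
X = rng.normal(size=(T, d))
F = X @ rng.normal(size=(d, K)) + 0.5 * rng.normal(size=(T, K))
Lam = rng.normal(size=(p, K))
Y = Lam @ F.T + rng.normal(size=(p, T))
loadings, factors, share = smoothed_pca(Y, X, K)
```

Because the fitted data are a projection of the demeaned data onto the
covariates, the share computed here always lies between 0 and 1; values near 1
would suggest that the covariates account for most of the factor variation.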

This paper studies model selection consistency for high dimensional sparse
regression when the data exhibit both cross-sectional and serial dependence. Most
commonly used model selection methods fail to consistently recover the true
model when the covariates are highly correlated. Motivated by econometric
studies, we consider the case where covariate dependence can be reduced through
a factor model, and propose a consistent strategy named Factor-Adjusted
Regularized Model Selection (FarmSelect). By separating the latent factors from
idiosyncratic components, we transform the problem from model selection with
highly correlated covariates to that with weakly correlated variables. Model
selection consistency as well as optimal rates of convergence are obtained
under mild conditions. Numerical studies demonstrate good finite-sample
performance in terms of both model selection and out-of-sample prediction.
Moreover, our method is flexible in the sense that it pays no price in weakly
correlated and uncorrelated cases. Our method is applicable to a wide range of
high dimensional sparse regression problems. An R package, FarmSelect, is also
provided for implementation.
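
The idea of separating latent factors from idiosyncratic components before
selecting variables can be illustrated with a short NumPy sketch. It is not
the FarmSelect R package or its interface: the factor step uses plain PCA via
the SVD, the selection step uses a hand-written coordinate-descent LASSO, and
the names factor_adjust, lasso_cd, farm_select_sketch, the penalty level 0.3,
and the simulated data are all assumptions made for this example.

```python
import numpy as np

def factor_adjust(X, K):
    """Split demeaned X into estimated factors and idiosyncratic parts."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F_hat = np.sqrt(n) * U[:, :K]            # factors, F_hat' F_hat / n = I
    B_hat = Xc.T @ F_hat / n                 # loadings, shape (p, K)
    U_hat = Xc - F_hat @ B_hat.T             # idiosyncratic components
    return F_hat, U_hat

def lasso_cd(A, b, lam, n_iter=200):
    """Minimise (1/2n)||b - A w||^2 + lam * ||w||_1 by coordinate descent."""
    n, p = A.shape
    w = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0) / n
    r = b.copy()                             # current residual b - A w
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = A[:, j] @ r / n + col_sq[j] * w[j]
            w_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += A[:, j] * (w[j] - w_new)    # keep residual in sync
            w[j] = w_new
    return w

def farm_select_sketch(X, y, K, lam):
    """Select covariates using the weakly correlated idiosyncratic parts."""
    n = X.shape[0]
    F_hat, U_hat = factor_adjust(X, K)
    yc = y - y.mean()
    # Remove the (unpenalised) factor part of the response by projection.
    y_res = yc - F_hat @ (F_hat.T @ yc) / n
    w = lasso_cd(U_hat, y_res, lam)
    return np.flatnonzero(w)

# Simulated example: covariates made highly correlated by two factors.
rng = np.random.default_rng(1)
n, p, K = 200, 50, 2
X = rng.normal(size=(n, K)) @ rng.normal(size=(K, p)) + rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = 3.0                               # only the first three matter
y = X @ beta + 0.5 * rng.normal(size=n)
selected = farm_select_sketch(X, y, K, lam=0.3)
```

In practice the number of factors and the penalty level would be chosen from
the data rather than fixed in advance as they are here.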

In this paper, we study the model selection and structure specification for
the generalised semi-varying coefficient models (GSVCMs), where the number of
potential covariates is allowed to be larger than the sample size. We first
propose a penalised likelihood method with the LASSO penalty function to obtain
the preliminary estimates of the functional coefficients. Then, using the
quadratic approximation for the local log-likelihood function and the adaptive
group LASSO penalty (or the local linear approximation of the group SCAD
penalty) with the help of the preliminary estimation of the functional
coefficients, we introduce a novel penalised weighted least squares procedure
to select the significant covariates and identify the constant coefficients
among the coefficients of the selected covariates, which could thus specify the
semiparametric modelling structure. The developed model selection and structure
specification approach not only inherits many nice statistical properties from
the local maximum likelihood estimation and nonconcave penalised likelihood
method, but is also computationally attractive thanks to the algorithm
proposed to implement it. Under some mild conditions,
we establish the asymptotic properties for the proposed model selection and
estimation procedure such as sparsity and the oracle property. We also conduct
simulation studies to examine the finite sample performance of the proposed
method, and finally apply the method to analyse a real data set, which leads to
some interesting findings.