
High-dimensional prediction typically comprises two steps: variable selection
and subsequent least-squares refitting on the selected variables. However, the
standard variable selection procedures, such as the lasso, hinge on tuning
parameters that need to be calibrated. Cross-validation, the most popular
calibration scheme, is computationally costly and lacks finite-sample
guarantees. In this paper, we introduce an alternative scheme, easy to
implement and both computationally and theoretically efficient.
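
The two-step pipeline described above (variable selection followed by least-squares refitting) can be sketched as follows. This is a minimal illustration and not the calibration scheme the paper introduces: the lasso is solved by plain coordinate descent, the tuning parameter `lam` is simply supplied by hand, and all function names are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, sweeps=200):
    """Coordinate descent for (1/2) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumes A is nonsingular, i.e. the selected columns are linearly independent.
    """
    k = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for q in range(c, k + 1):
                M[r][q] -= f * M[c][q]
    x = [0.0] * k
    for c in reversed(range(k)):
        tail = sum(M[c][q] * x[q] for q in range(c + 1, k))
        x[c] = (M[c][k] - tail) / M[c][c]
    return x


def refit(X, y, support):
    """Ordinary least squares restricted to the selected columns."""
    n = len(X)
    gram = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in support]
            for a in support]
    rhs = [sum(X[i][a] * y[i] for i in range(n)) for a in support]
    return solve(gram, rhs)


def select_then_refit(X, y, lam):
    """Step 1: lasso support. Step 2: least-squares refit on that support."""
    beta = lasso_cd(X, y, lam)
    support = [j for j, b in enumerate(beta) if b != 0.0]
    coefs = refit(X, y, support) if support else []
    return support, coefs


# Toy example with an identity design, where the lasso reduces to
# soft-thresholding y and the refit restores the unshrunken values.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.5, -2.0]
support, coefs = select_then_refit(X, y, lam=1.0)
print(support, coefs)  # [0, 2] [3.0, -2.0]
```

The toy run shows why the outcome hinges on `lam`: a value below 0.5 would also select the second variable, while a value above 3 would select none.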

We study the behavior of a real $p$-dimensional Wishart random matrix with
$n$ degrees of freedom when $n,p\rightarrow\infty$ but $p/n\rightarrow 0$. We
establish the existence of phase transitions when $p$ grows at the order
$n^{(K+1)/(K+3)}$ for every $K\in\mathbb{N}$, and derive expressions for
approximating densities between every two phase transitions. To do this, we
make use of a novel tool we call the G-transform of a distribution, which is
closely related to the characteristic function. We also derive an extension of
the $t$-distribution to the real symmetric matrices, which naturally appears as
the conjugate distribution to the Wishart under a G-transformation, and show
its empirical spectral distribution obeys a semicircle law when $p/n\rightarrow
0$. Finally, we discuss how the phase transitions of the Wishart distribution
might originate from changes in rates of convergence of symmetric $t$
statistics.
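
To make the growth orders of the phase transitions concrete, the exponents $(K+1)/(K+3)$ can be tabulated exactly. The snippet below is only an arithmetic illustration; the function name is hypothetical, and starting the index at $K=1$ is an assumption, since the text does not say whether $\mathbb{N}$ includes 0 (for $K=0$ the exponent would be 1/3).

```python
from fractions import Fraction


def transition_exponent(K):
    """Exponent e such that the K-th transition occurs when p grows like n**e."""
    return Fraction(K + 1, K + 3)


# Exact exponents for the first few K (index range chosen for illustration).
for K in range(1, 6):
    print(K, transition_exponent(K))
# 1 1/2
# 2 3/5
# 3 2/3
# 4 5/7
# 5 3/4
```

The exponents increase with $K$ and stay strictly below 1, which is consistent with the regime $p/n\rightarrow 0$ considered here.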

The greatest root statistic arises as the test statistic in several
multivariate analysis settings. Suppose there is a global null hypothesis that
consists of different independent sub-null hypotheses, and suppose the greatest
root statistic is used as the test statistic for each sub-null hypothesis. Such
problems may arise when conducting a batch MANOVA or several batches of
pairwise testing for equality of covariance matrices. Using the
union-intersection testing approach and by letting the problem dimension tend
to infinity faster than the number of batches, we show that the global null can
be tested using a Gumbel distribution to approximate the critical values.
Although the theoretical results are asymptotic, simulation studies indicate
that the approximations are very good even for small to moderate dimensions.
The results are general and can be applied in any setting where the greatest
root statistic is used, not just for the two methods we use for illustrative
purposes.
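
The testing recipe above can be sketched in a few lines: under the union-intersection principle the global null is rejected when the largest batch statistic exceeds a critical value, and that critical value is read off a Gumbel distribution. This is a rough sketch under stated assumptions: the centering and scaling constants `loc` and `scale` are hypothetical placeholders (the text does not give them), the batch statistics are assumed to be already normalized accordingly, and the function names are illustrative.

```python
import math


def gumbel_cdf(x, loc=0.0, scale=1.0):
    """CDF of the Gumbel (maximum) distribution: exp(-exp(-(x - loc) / scale))."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_critical_value(alpha, loc=0.0, scale=1.0):
    """Value c with P(G > c) = alpha for a Gumbel(loc, scale) variable G."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return loc - scale * math.log(-math.log(1.0 - alpha))


def reject_global_null(batch_statistics, alpha, loc=0.0, scale=1.0):
    """Union-intersection rule: reject if the largest batch statistic is extreme.

    loc and scale are placeholder normalizing constants (an assumption here).
    """
    return max(batch_statistics) > gumbel_critical_value(alpha, loc, scale)


# Standard Gumbel critical value at the 5% level, roughly 2.97.
print(gumbel_critical_value(0.05))
```

With the standard Gumbel, a set of normalized statistics such as `[1.0, 3.5, 2.0]` would lead to rejection at the 5% level, because the maximum 3.5 exceeds the critical value.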

We consider the problem of estimating covariance and precision matrices, and
their associated discriminant coefficients, from normal data when the rank of
the covariance matrix is strictly smaller than its dimension and the available
sample size. Using unbiased risk estimation, we construct novel estimators by
minimizing upper bounds on the difference in risk over several classes. Our
proposed estimators are shown empirically to offer substantial
improvement over classical approaches.

The problem of estimating a spiked covariance matrix in high dimensions under
Frobenius loss, and the parallel problem of estimating the noise in spiked PCA,
are investigated. We propose an estimator of the noise parameter by minimizing
an unbiased estimator of the invariant Frobenius risk using calculus of
variations. The resulting estimator is shown, using random matrix theory, to be
strongly consistent and essentially asymptotically normal and minimax for the
noise estimation problem. We use this construction to obtain a robust
spiked covariance matrix estimator with consistent eigenvalues.

We consider the problem of estimating the mean vector of a p-variate normal
$(\theta,\Sigma)$ distribution under invariant quadratic loss,
$(\delta-\theta)'\Sigma^{-1}(\delta-\theta)$, when the covariance is unknown.
We propose a new class of estimators that dominate the usual estimator
$\delta^0(X)=X$. The proposed estimators of $\theta$ depend upon X and an
independent Wishart matrix S with n degrees of freedom; however, S is singular
almost surely when p>n. The proof of domination involves the development of
some new unbiased estimators of risk for the p>n setting. We also find some
relationships between the amount of domination and the magnitudes of n and p.
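
The singularity of S when p>n follows from a short rank argument, written out below as a supporting step. The vectors $Z_1,\dots,Z_n$ are assumed notation that does not appear in the text: they stand for the usual representation of a Wishart matrix with n degrees of freedom as a sum of outer products of independent normal vectors.

```latex
% Assumed notation (not in the source): Z_1, ..., Z_n are i.i.d. N_p(0, \Sigma).
\[
  S = \sum_{i=1}^{n} Z_i Z_i', \qquad
  \operatorname{rank}(S) \le n < p
  \quad\Longrightarrow\quad \det S = 0 .
\]
```

Consequently $S^{-1}$ does not exist in this setting and cannot stand in for $\Sigma^{-1}$ in the loss $(\delta-\theta)'\Sigma^{-1}(\delta-\theta)$. A generalized inverse such as the Moore-Penrose inverse is one common substitute, although the text here does not say which device the proposed estimators use.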