
In high-dimensional linear models, the sparsity assumption is typically made,
stating that most of the parameters are equal to zero. Under the sparsity
assumption, estimation and, recently, inference have been well studied.
However, in practice, the sparsity assumption is not checkable and, more importantly,
is often violated; a large number of covariates might be expected to be
associated with the response, indicating that possibly all, rather than just a
few, parameters are nonzero. A natural example is genome-wide gene
expression profiling, where all genes are believed to affect a common disease
marker. We show that existing inferential methods are sensitive to the sparsity
assumption and may, in turn, result in a severe lack of control of the Type I
error. In this article, we propose a new inferential method, named CorrT, which
is robust to model misspecification such as heteroscedasticity and lack of
sparsity. CorrT is shown to have Type I error approaching the nominal level for
\textit{any} model and Type II error approaching zero for sparse and many
dense models.
In fact, CorrT is also shown to be optimal in a variety of frameworks:
sparse, non-sparse and hybrid models where sparse and dense signals are mixed.
Numerical experiments show a favorable performance of the CorrT test compared
to the state-of-the-art methods.

The purpose of this paper is to construct confidence intervals for the
regression coefficients in high-dimensional Cox proportional hazards regression
models where the number of covariates may be larger than the sample size. Our
debiased estimator construction is similar to those in Zhang and Zhang (2014)
and van de Geer et al. (2014), but the time-dependent covariates and censored
risk sets introduce considerable additional challenges. Our theoretical
results, which provide conditions under which our confidence intervals are
asymptotically valid, are supported by extensive numerical experiments.
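
For orientation, the debiasing step in the linear-model setting of van de Geer et al. (2014) can be written as below. This is only the linear-model analogue, not the Cox construction of this paper; the symbols $\hat\beta$ (lasso estimate), $\hat\Theta$ (an approximate inverse of $\hat\Sigma = X^\top X/n$) and $\hat\sigma$ (noise-level estimate) are notation assumed here. In a Cox model the residual term would presumably be replaced by the score of the partial likelihood.

```latex
\[
  \hat b \;=\; \hat\beta \;+\; \frac{1}{n}\,\hat\Theta X^\top \bigl(Y - X\hat\beta\bigr),
  \qquad
  \hat b_j \;\pm\; z_{1-\alpha/2}\,\hat\sigma\,
  \sqrt{\frac{\bigl(\hat\Theta \hat\Sigma \hat\Theta^\top\bigr)_{jj}}{n}} .
\]
```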

This paper studies hypothesis testing and confidence interval construction in
high-dimensional linear models with possible non-sparse structures. For a given
component of the parameter vector, we show that the difficulty of the problem
depends on the sparsity of the corresponding row of the precision matrix of the
covariates, not the sparsity of the model itself. We develop new concepts of
uniform and essentially uniform non-testability that allow the study of
limitations of tests across a broad set of alternatives. Uniform
non-testability identifies an extensive collection of alternatives such that
the power of any test, against any alternative in this group, is asymptotically
at most equal to the nominal size, whereas minimaxity shows the existence of
one particularly "bad" alternative. Implications of the new constructions
include new minimax testability results that, in sharp contrast to existing
results, do not depend on the sparsity of the model parameters. We identify new
tradeoffs between testability and feature correlation. In particular, we show
that in models with weak feature correlations the minimax lower bound can be
attained by a confidence interval whose width has the parametric rate
regardless of the size of the model sparsity.

Many scientific and engineering challenges, ranging from pharmacokinetic
drug dosage allocation and personalized medicine to marketing mix (4Ps)
recommendations, require an understanding of the unobserved heterogeneity in
order to develop the best decision-making processes. In this paper, we develop
a hypothesis test and the corresponding p-value for testing the
significance of the homogeneous structure in linear mixed models. A robust
matching moment construction is used for creating a test that adapts to the
size of the model sparsity. When unobserved heterogeneity at a cluster level is
constant, we show that our test is both consistent and unbiased even when the
dimension of the model is extremely high. Our theoretical results rely on a new
family of adaptive sparse estimators of the fixed effects that do not require
consistent estimation of the random effects. Moreover, our inference results do
not require consistent model selection. We showcase that moment matching can be
extended to nonlinear mixed effects models and to generalized linear mixed
effects models. In numerical and real data experiments, we find that the
developed method is extremely accurate, that it adapts to the size of the
underlying model and is decidedly powerful in the presence of irrelevant
covariates.

Models with many signals, that is, high-dimensional models, often impose structures on
the signal strengths. The common assumption is that only a few signals are
strong and most of the signals are zero or close (collectively) to zero.
However, such a requirement might not be valid in many real-life applications.
In this article, we are interested in conducting large-scale inference in
models that might have signals of mixed strengths. The key challenge is that
the signals that are not under testing might be collectively non-negligible
(although individually small) and cannot be accurately learned. This article
develops a new class of tests that arise from a moment matching formulation. A
virtue of these moment-matching statistics is their ability to borrow strength
across features, adapt to the sparsity size and adjust for testing a
growing number of hypotheses. The GRoup-level Inference of Parameter (GRIP) test
harvests effective sparsity structures with hypothesis formulation for an
efficient multiple testing procedure. Simulated data showcase that GRIP's error
control is far better than that of the alternative methods. We develop a minimax
theory, demonstrating optimality of GRIP for a broad range of models, including
those where the model is a mixture of sparse and high-dimensional dense
signals.

The purpose of this paper is to construct confidence intervals for the
regression coefficients in the Fine-Gray model for competing risks data with
random censoring, where the number of covariates can be larger than the sample
size. Despite strong motivation from biomedical applications, the
high-dimensional Fine-Gray model has attracted relatively little attention
in the methodological or theoretical literature. We fill this gap by
developing confidence intervals based on a one-step bias-correction for a
regularized estimator. We develop a theoretical framework for the partial
likelihood, which does not have independent and identically distributed entries
and therefore presents many technical challenges. We also study the
approximation error from the weighting scheme under random censoring for
competing risks and establish new concentration results for time-dependent
processes. In addition to the theoretical results and algorithms, we present
extensive numerical experiments and an application to a study of non-cancer
mortality among prostate cancer patients using the linked Medicare-SEER data.

We provide comments on the article "High-dimensional simultaneous inference
with the bootstrap" by Ruben Dezeure, Peter Bühlmann and Cun-Hui Zhang.

This article develops a framework for testing general hypotheses in
high-dimensional models where the number of variables may far exceed the number
of observations. Existing literature has considered less than a handful of
hypotheses, such as testing individual coordinates of the model parameter.
However, the problem of testing general and complex hypotheses remains widely
open. We propose a new inference method developed around the hypothesis-adaptive
projection pursuit framework, which solves the testing problems in the
most general case. The proposed inference is centered around a new class of
estimators defined as $l_1$ projection of the initial guess of the unknown onto
the space defined by the null. This projection automatically takes into account
the structure of the null hypothesis and allows us to study formal inference
for a number of longstanding problems. For example, we can directly conduct
inference on the sparsity level of the model parameters and the minimum signal
strength. This is especially significant given the fact that the former is a
fundamental condition underlying most of the theoretical development in
high-dimensional statistics, while the latter is a key condition used to
establish variable selection properties. Moreover, the proposed method is
asymptotically exact and has satisfactory power properties for testing very
general functionals of the high-dimensional parameters. The simulation studies
lend further support to our theoretical claims and additionally show excellent
finite-sample size and power properties of the proposed test.

Hypothesis tests in models whose dimension far exceeds the sample size can be
formulated much like the classical studentized tests only after the initial
bias of estimation is removed successfully. The theory of debiased estimators
can be developed in the context of quantile regression models for a fixed
quantile value. However, it is frequently desirable to formulate tests based on
the quantile regression process, as this leads to more robust tests and more
stable confidence sets. Additionally, inference in quantile regression requires
estimation of the so-called sparsity function, which depends on the unknown
density of the error. In this paper we consider a debiasing approach for the
uniform testing problem. We develop high-dimensional regression rank scores and
show how to use them to estimate the sparsity function, as well as how to adapt
them for inference involving the quantile regression process. Furthermore, we
develop a Kolmogorov-Smirnov test in location-shift high-dimensional models
and confidence sets that are uniformly valid for many quantile values. The main
technical results are the development of a Bahadur representation of the
debiased estimator that is uniform over a range of quantiles and the uniform
convergence of the quantile process to the Brownian bridge process, which are
of independent interest. Simulation studies illustrate finite-sample properties
of our procedure.

We propose a methodology for testing linear hypotheses in high-dimensional
linear models. The proposed test does not impose any restriction on the size of
the model, i.e., model sparsity or the loading vector representing the
hypothesis. Providing asymptotically valid methods for testing general linear
functions of the regression parameters in high dimensions is extremely
challenging, especially without making restrictive or unverifiable
assumptions on the number of nonzero elements. We propose to test the moment
conditions related to the newly designed restructured regression, where the
inputs are transformed and augmented features. These new features incorporate
the structure of the null hypothesis directly. The test statistics are
constructed in such a way that lack of sparsity in the original model parameter
does not present a problem for the theoretical justification of our procedures.
We establish asymptotically exact control on Type I error without imposing any
sparsity assumptions on the model parameter or the vector representing the linear
hypothesis. Our method is also shown to achieve certain optimality in detecting
deviations from the null hypothesis. We demonstrate the favorable finite-sample
performance of the proposed methods via a number of numerical examples and a
real data example.

Understanding efficiency in high-dimensional linear models is a longstanding
problem of interest. Classical work with smaller dimensional problems dating
back to Huber and Bickel has illustrated the benefits of efficient loss
functions. When the number of parameters $p$ is of the same order as the sample
size $n$, $p \approx n$, an efficiency pattern different from the one of Huber
was recently established. In this work, we consider the effects of model
selection on the estimation efficiency of penalized methods. In particular, we
explore whether sparsity results in new efficiency patterns when $p > n$. In
the interest of deriving the asymptotic mean squared error for regularized
$M$-estimators, we use the powerful framework of approximate message passing. We
propose a novel, robust and sparse approximate message passing algorithm
(RAMP) that is adaptive to the error distribution. Our algorithm includes many
non-quadratic and non-differentiable loss functions. We derive its asymptotic
mean squared error and show its convergence, while allowing $p, n, s \to
\infty$, with $n/p \in (0,1)$ and $n/s \in (1,\infty)$. We identify new
patterns of relative efficiency regarding a number of penalized $M$-estimators,
when $p$ is much larger than $n$. We show that the classical information bound
is no longer reachable, even for light-tailed error distributions. We show
that the penalized least absolute deviation estimator dominates the penalized
least squares estimator in cases of heavy-tailed distributions. We observe
this pattern for all choices of the number of nonzero parameters $s$, both $s
\leq n$ and $s \approx n$. In non-penalized problems where $s = p \approx n$,
the opposite regime holds. Therefore, we discover that the presence of model
selection significantly changes the efficiency patterns.
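
Approximate message passing for $\ell_1$-penalized problems is usually built around a componentwise shrinkage step. A minimal sketch of the familiar soft-thresholding map is given below; it is the standard proximal operator of the $\ell_1$ penalty and not the RAMP algorithm itself, and the function names are illustrative assumptions.

```python
def soft_threshold(x, threshold):
    """Proximal map of threshold * |.|: shrink x toward zero by `threshold`.

    Illustrative helper (hypothetical name); not taken from the RAMP paper.
    """
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


def soft_threshold_vector(values, threshold):
    """Apply soft-thresholding to each coordinate of a list of floats."""
    return [soft_threshold(v, threshold) for v in values]
```

Coordinates whose magnitude does not exceed the threshold are set exactly to zero, which is how the shrinkage step produces sparse iterates.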

In analyzing high-dimensional models, sparsity of the model parameter is a
common but often undesirable assumption. In this paper, we study the following
two-sample testing problem: given two samples generated by two high-dimensional
linear models, we aim to test whether the regression coefficients of the two
linear models are identical. We propose a framework named TIERS (short for
TestIng Equality of Regression Slopes), which solves the two-sample testing
problem without making any assumptions on the sparsity of the regression
parameters. TIERS builds a new model by convolving the two samples in such a
way that the original hypothesis translates into a new moment condition. A
self-normalization construction is then developed to form a moment test. We
provide rigorous theory for the developed framework. Under very weak conditions
on the feature covariance, we show that the accuracy of the proposed test in
controlling Type I errors is robust both to the lack of sparsity in the
features and to the heavy tails in the error distribution, even when the sample
size is much smaller than the feature dimension. Moreover, we discuss minimax
optimality and efficiency properties of the proposed test. Simulation analysis
demonstrates excellent finite-sample performance of our test. In deriving the
test, we also develop tools that are of independent interest. The test is built
upon a novel estimator, called Auto-aDaptive Dantzig Selector (ADDS), which not
only automatically chooses an appropriate scale of the error term but also
incorporates prior information. To effectively approximate the critical value
of the test statistic, we develop a novel high-dimensional plug-in approach
that complements the recent advances in Gaussian approximation theory.

This paper develops robust confidence intervals in high-dimensional and
left-censored regression. Type-I censored regression models are extremely
common in practice, where a competing event makes the variable of interest
unobservable. However, techniques developed for entirely observed data do not
directly apply to the censored observations. In this paper, we develop smoothed
estimating equations that augment the debiasing method, such that the
resulting estimator is adaptive to censoring and is more robust to the
misspecification of the error distribution. We propose a unified class of
robust estimators, including Mallows', Schweppe's and Hill-Ryan's one-step
estimators.
In the ultrahigh-dimensional setting, where the dimensionality can grow
exponentially with the sample size, we show that as long as the preliminary
estimator converges faster than $n^{-1/4}$, the one-step estimator inherits
the asymptotic distribution of the fully iterated version. Moreover, we show that the
size of the residuals of the Bahadur representation matches that of the simple
linear models, $s^{3/4} (\log (p \vee n))^{3/4} / n^{1/4}$; that is, the
effects of censoring asymptotically disappear. Simulation studies demonstrate
that our method is adaptive to the censoring level and asymmetry in the error
distribution, and does not lose efficiency when the errors are from symmetric
distributions. Finally, we apply the developed method to a real data set from
the MAQC-II repository that is related to the HIV-1 study.
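
To get a feel for the Bahadur remainder rate quoted above, the expression $s^{3/4} (\log (p \vee n))^{3/4} / n^{1/4}$ can be evaluated numerically. The helper below is a small sketch with a hypothetical function name; it ignores constants and is meant only to show how the rate scales with the sparsity $s$, dimension $p$ and sample size $n$.

```python
import math


def bahadur_remainder_rate(s, p, n):
    """Evaluate s^{3/4} * (log(max(p, n)))^{3/4} / n^{1/4}, constants ignored.

    Hypothetical helper for illustration; s, p, n are positive numbers.
    """
    return (s * math.log(max(p, n))) ** 0.75 / n ** 0.25


if __name__ == "__main__":
    # Print the rate for a fixed sparsity and dimension as n grows.
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, bahadur_remainder_rate(s=5, p=10**4, n=n))
```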

This paper examines the role and efficiency of the nonconvex loss functions
for binary classification problems. In particular, we investigate how to design
a simple and effective boosting algorithm that is robust to the outliers in the
data. The role of a particular nonconvex loss in prediction
accuracy varies depending on the diminishing tail properties of the gradient of
the loss (the ability of the loss to efficiently adapt to the outlying data),
the local convex properties of the loss and the proportion of the contaminated
data. In order to use these properties efficiently, we propose a new family of
nonconvex losses named $\gamma$-robust losses. Moreover, we present a new
boosting framework, {\it Arch Boost}, designed for augmenting the existing work
such that its corresponding classification algorithm is significantly more
adaptable to the unknown data contamination. Along with the Arch Boosting
framework, the nonconvex losses lead to the new class of boosting algorithms,
named adaptive, robust, boosting (ARB). Furthermore, we present theoretical
examples that demonstrate the robustness properties of the proposed algorithms.
In particular, we develop a new breakdown point analysis and a new influence
function analysis that demonstrate gains in robustness. Moreover, we present
new theoretical results, based only on local curvatures, which may be used to
establish statistical and optimization properties of the proposed Arch boosting
algorithms with highly nonconvex loss functions. Extensive numerical
calculations are used to illustrate these theoretical properties and reveal
advantages over the existing boosting methods when the data exhibit a number of
outliers.
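
The "diminishing tail" property of the gradient can be illustrated by comparing the convex exponential loss with a bounded nonconvex loss. The sketch below uses the sigmoid loss $1/(1+e^{v})$ of the margin $v = y f(x)$ purely as a stand-in; it is not the $\gamma$-robust family proposed in the paper, whose form is not given in this abstract, and all function names are assumptions.

```python
import math


def _sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def exp_loss_grad(margin):
    """Derivative of the exponential loss exp(-margin) w.r.t. the margin."""
    return -math.exp(-margin)


def sigmoid_loss(margin):
    """Bounded nonconvex stand-in loss: 1 / (1 + exp(margin))."""
    return _sigmoid(-margin)


def sigmoid_loss_grad(margin):
    """Derivative of sigmoid_loss w.r.t. the margin: -s * (1 - s)."""
    s = _sigmoid(-margin)
    return -s * (1.0 - s)
```

For a grossly misclassified point (a large negative margin), the exponential-loss gradient grows without bound, so an outlier can dominate the boosting update, whereas the gradient of the bounded loss vanishes and the outlier is effectively ignored.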

We introduce a very general method for sparse and large-scale variable
selection. The large-scale regression setting is such that both the number of
parameters and the number of samples are extremely large. The proposed method
is based on a careful combination of penalized estimators, each applied to a
random projection of the sample space into a low-dimensional space. In one
special case that we study in detail, the random projections are divided into
non-overlapping blocks, each consisting of only a small portion of the original
data. Within each block we select the projection yielding the smallest
out-of-sample error. Our random ensemble estimator then aggregates the results
according to a new maximal-contrast voting scheme to determine the final selected
set. Our theoretical results illuminate the effect on performance of increasing
the number of non-overlapping blocks. Moreover, we demonstrate that statistical
optimality is retained along with the computational speedup. The proposed
method achieves minimax rates for approximate recovery over all estimators
using the full set of samples. Furthermore, our theoretical results allow the
number of subsamples to grow with the subsample size and do not require
the irrepresentable condition. The estimator is also compared empirically with
several other popular high-dimensional estimators via an extensive simulation
study, which reveals its excellent finite-sample performance.
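
The block-then-aggregate structure described above can be sketched in a few lines. The code below is a simplified illustration under stated assumptions: blocks are contiguous index ranges, the per-block `selector` is a user-supplied placeholder for a penalized estimator, and a plain frequency vote stands in for the maximal-contrast voting scheme, whose details are not given in this abstract.

```python
from collections import Counter


def split_into_blocks(n_samples, n_blocks):
    """Partition indices 0..n_samples-1 into contiguous non-overlapping blocks.

    Leftover samples (when n_samples is not divisible) are dropped.
    """
    size = n_samples // n_blocks
    return [list(range(b * size, (b + 1) * size)) for b in range(n_blocks)]


def vote(selected_sets, min_fraction=0.5):
    """Keep variables chosen in more than `min_fraction` of the blocks.

    A simple frequency vote, used here as a stand-in aggregation rule.
    """
    counts = Counter(j for s in selected_sets for j in set(s))
    cutoff = min_fraction * len(selected_sets)
    return sorted(j for j, c in counts.items() if c > cutoff)


def block_vote_select(X, y, n_blocks, selector, min_fraction=0.5):
    """Run `selector(X_block, y_block)` on each block and aggregate by voting."""
    blocks = split_into_blocks(len(y), n_blocks)
    selected = [
        selector([X[i] for i in idx], [y[i] for i in idx]) for idx in blocks
    ]
    return vote(selected, min_fraction)
```

Because each block touches only a fraction of the rows, the per-block fits can run in parallel, which is where the computational speedup comes from.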

To better understand the interplay of censoring and sparsity, we develop
finite-sample properties of the nonparametric Cox proportional hazards model. Due
to the high impact of sequencing data, carrying genetic information of each
individual, we work with an over-parametrized problem and propose a general class of
group penalties suitable for sparse structured variable selection and
estimation. Novel non-asymptotic sandwich bounds for the partial likelihood are
developed. We establish how they extend Le Cam's notion of local asymptotic
normality (LAN). Such non-asymptotic LAN principles are further extended to
high-dimensional spaces where $p \gg n$. Finite-sample prediction properties of
the penalized estimator in the nonparametric Cox proportional hazards model, under
suitable censoring conditions, agree with those of the penalized estimator in
linear models.

High-throughput genetic sequencing arrays with thousands of measurements per
sample and a great amount of related censored clinical data have increased
the need for better measurement-specific model selection. In this paper
we establish strong oracle properties of nonconcave penalized methods for
non-polynomial (NP) dimensional data with censoring in the framework of Cox's
proportional hazards model. A class of folded-concave penalties is employed,
and both LASSO and SCAD are discussed specifically. We address the question of
under which dimensionality and correlation restrictions an oracle estimator can
be constructed and grasped. It is demonstrated that nonconcave penalties lead
to a significant reduction of the "irrepresentable condition" needed for LASSO
model selection consistency. A large deviation result for martingales,
of interest in its own right, is developed for characterizing the strong oracle
property. Moreover, the nonconcave regularized estimator is shown to achieve
asymptotically the information bound of the oracle estimator. A coordinate-wise
algorithm is developed for finding the grid of solution paths for penalized
hazard regression problems, and its performance is evaluated on simulated and
gene association study examples.

In high-dimensional model selection problems, penalized simple least-squares
approaches have been extensively used. This paper addresses the question of
both robustness and efficiency of penalized model selection methods, and
proposes a data-driven weighted linear combination of convex loss functions,
together with a weighted $L_1$-penalty. It is completely data-adaptive and does
not require prior knowledge of the error distribution. The weighted
$L_1$-penalty is used both to ensure the convexity of the penalty term and to
ameliorate the bias caused by the $L_1$-penalty. In the setting with
dimensionality much larger than the sample size, we establish a strong oracle
property of the proposed method that possesses both the model selection
consistency and estimation efficiency for the true nonzero coefficients. As
specific examples, we introduce a robust method of composite L1-L2 and an optimal
composite quantile method, and evaluate their performance in both simulated and
real data examples.
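
A composite L1-L2 criterion with a weighted $L_1$-penalty can be written down directly. The sketch below fixes the loss weights `w_l1`, `w_l2` and the penalty weights by hand, whereas the paper chooses them in a data-driven way; the unnormalized sum over observations and all names are assumptions made for illustration.

```python
def composite_l1_l2_objective(X, y, beta, w_l1, w_l2, lam, penalty_weights):
    """Weighted sum of absolute and squared residual losses plus weighted L1 penalty.

    X is a list of rows, y and beta are lists of floats. The weights are
    supplied by the caller here (hypothetical interface, not the paper's).
    """
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(xi, beta)) for xi, yi in zip(X, y)
    ]
    # Convex combination-style loss: L1 part for robustness, L2 part for efficiency.
    loss = sum(w_l1 * abs(r) + w_l2 * r * r for r in residuals)
    # Coordinate-specific weights on |beta_j| reduce the bias of a plain L1 penalty.
    penalty = lam * sum(d * abs(b) for d, b in zip(penalty_weights, beta))
    return loss + penalty
```

Setting `w_l2 = 0` recovers a penalized least absolute deviation criterion and `w_l1 = 0` a penalized least-squares criterion, so the two weights interpolate between robustness and efficiency.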