In the setting of high-dimensional linear regression models, we propose two
frameworks for constructing pointwise and group confidence sets for penalized
estimators which incorporate prior knowledge about the organization of the
non-zero coefficients. This is done by desparsifying the estimator as in van de
Geer et al. and van de Geer and Stucky, then using an appropriate
estimator for the precision matrix $\Theta$. To estimate the precision
matrix, a corresponding structured matrix norm penalty has to be introduced.
After normalization, the result is an asymptotic pivot.
We study the asymptotic behavior and use simulations to compare the
two schemes.
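
The desparsifying step described above can be sketched as follows. This is a minimal illustration of the generic one-step correction, not the structured procedure proposed here: the function names `desparsify` and `pointwise_interval` are hypothetical, and the precision matrix estimate `theta_hat`, the initial estimate `beta_init`, and the noise level estimate `sigma_hat` are all assumed to be supplied by the user rather than computed with a structured matrix norm penalty.

```python
import numpy as np


def desparsify(X, y, beta_init, theta_hat):
    """One-step correction: beta_init + theta_hat @ X.T @ (y - X @ beta_init) / n.

    theta_hat is an assumed, user-supplied estimate of the precision matrix.
    """
    n = X.shape[0]
    residual = y - X @ beta_init
    return beta_init + theta_hat @ X.T @ residual / n


def pointwise_interval(X, b, theta_hat, sigma_hat, j, z=1.96):
    """Approximate 95% interval for coefficient j from the normalized pivot.

    Uses the plug-in variance (theta_hat @ Sigma_hat @ theta_hat.T)[j, j],
    where Sigma_hat = X.T @ X / n is the empirical Gram matrix.
    """
    n = X.shape[0]
    gram = X.T @ X / n
    var_j = (theta_hat @ gram @ theta_hat.T)[j, j]
    half_width = z * sigma_hat * np.sqrt(var_j / n)
    return b[j] - half_width, b[j] + half_width
```

As a sanity check on the formula, in a low-dimensional design where `theta_hat` is the exact inverse of `X.T @ X / n`, the corrected estimator coincides with ordinary least squares for any initial estimate.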
We study a set of regularization methods for high-dimensional linear
regression models. These penalized estimators have the square root of the
residual sum of squared errors as loss function, and any weakly decomposable
norm as penalty function. This loss is chosen because the resulting
estimator does not depend on the unknown standard deviation of the
noise. The generalized weakly decomposable norm penalty, in turn, can
accommodate different underlying sparsity structures: we can choose a
different sparsity-inducing norm depending on how we want to interpret
the unknown parameter vector $\beta$. Structured sparsity
norms, as defined in Micchelli et al., are special cases of weakly
decomposable norms. Our framework therefore includes the square root LASSO
(Belloni et al.), the group square root LASSO (Bunea et al.) and a new method
called the square root SLOPE (analogous to the SLOPE of Bogdan et
al.). For this collection of estimators, our results provide sharp oracle
inequalities, derived via the Karush-Kuhn-Tucker conditions. We discuss some examples
of estimators. Based on a simulation we illustrate some advantages of the
square root SLOPE.
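
To make the objective concrete, the loss and two of the penalties mentioned above can be written down as in the sketch below. It only evaluates the criteria and does not solve the optimization problem. The function names are hypothetical, the $\sqrt{n}$ scaling of the loss is one common convention, and the $\sqrt{|g|}$ group weights are an assumption rather than something fixed by the text.

```python
import numpy as np


def sqrt_loss(X, y, beta):
    """Square root of the residual sum of squares, scaled by sqrt(n)."""
    n = X.shape[0]
    return np.linalg.norm(y - X @ beta) / np.sqrt(n)


def sorted_l1_norm(beta, lam):
    """SLOPE-type penalty: sum_i lam[i] * |beta|_(i).

    |beta|_(1) >= |beta|_(2) >= ... are the sorted magnitudes and lam is
    assumed to be a non-increasing, non-negative sequence.
    """
    lam = np.asarray(lam, dtype=float)
    magnitudes = np.sort(np.abs(beta))[::-1]
    return float(np.dot(lam, magnitudes))


def group_norm(beta, groups):
    """Group penalty: sum over groups g of sqrt(|g|) * ||beta_g||_2."""
    beta = np.asarray(beta, dtype=float)
    return float(sum(np.sqrt(len(g)) * np.linalg.norm(beta[list(g)])
                     for g in groups))


def objective(X, y, beta, penalty):
    """Square-root loss plus an arbitrary norm penalty (a callable)."""
    return sqrt_loss(X, y, beta) + penalty(beta)
```

Because the loss is not squared, multiplying both `y` and `beta` by a constant multiplies the loss and each penalty by the same constant, so the two terms of the objective stay on the same scale whatever the noise level.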
We study a high-dimensional regression model. The aim is to construct a
confidence set for a given group of regression coefficients, treating all other
regression coefficients as nuisance parameters. We apply a one-step procedure
with the square-root Lasso as initial estimator and a multivariate square-root
Lasso for constructing a surrogate Fisher information matrix. The multivariate
square-root Lasso is based on a nuclear norm loss with an $\ell_1$-penalty. We show
that this procedure leads to an asymptotically $\chi^2$-distributed pivot, with
a remainder term depending only on the $\ell_1$-error of the initial estimator.
We show that under $\ell_1$-sparsity conditions on the regression coefficients
$\beta^0$, the square-root Lasso produces a consistent estimator of the noise
variance, and we establish sharp oracle inequalities which show that the
remainder term is small under further sparsity conditions on $\beta^0$ and
compatibility conditions on the design.
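
For orientation, the estimators named above can be written out as follows. The notation is an assumption made for this sketch: $X \in \mathbb{R}^{n \times p}$ is the design, $Y$ the response, $J$ the group of interest, $X_J$ and $X_{-J}$ the corresponding column blocks, and $\lambda_0, \lambda > 0$ tuning parameters. The scaling by $\sqrt{n}$ and the exact form of the multivariate problem are likewise assumed conventions.

```latex
% Square-root Lasso (initial estimator) and the induced noise variance estimate.
\hat\beta := \arg\min_{b \in \mathbb{R}^p}
  \left\{ \|Y - Xb\|_2 / \sqrt{n} + \lambda_0 \|b\|_1 \right\},
\qquad
\hat\sigma^2 := \|Y - X\hat\beta\|_2^2 / n .

% Multivariate square-root Lasso for the group J (assumed form):
% nuclear norm loss with an entrywise \ell_1 penalty.
\hat\Gamma_J := \arg\min_{\Gamma_J}
  \left\{ \|X_J - X_{-J}\Gamma_J\|_{\mathrm{nuclear}} / \sqrt{n}
          + \lambda \|\Gamma_J\|_1 \right\}.
```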