
Convex estimators such as the Lasso, the matrix Lasso and the group Lasso
have been studied extensively in the last two decades, demonstrating great
success in both theory and practice. Two quantities are introduced, the noise
barrier and the large scale bias, that provides insights on the performance of
these convex regularized estimators. It is now well understood that the Lasso
achieves fast prediction rates, provided that the correlations of the design
satisfy some Restricted Eigenvalue or Compatibility condition, and provided
that the tuning parameter is large enough. Using the two quantities introduced
in the paper, we show that the compatibility condition on the design matrix is
actually unavoidable to achieve fast prediction rates with the Lasso. The Lasso
must incur a loss due to the correlations of the design matrix, measured in
terms of the compatibility constant. This results holds for any design matrix,
any active subset of covariates, and any tuning parameter. It is now well known
that the Lasso enjoys a dimension reduction property: the prediction error is
of order $\lambda\sqrt k$ where $k$ is the sparsity; even if the ambient
dimension $p$ is much larger than $k$. Such results require that the tuning
parameters is greater than some universal threshold. We characterize sharp
phase transitions for the tuning parameter of the Lasso around a critical
threshold dependent on $k$. If $\lambda$ is equal or larger than this critical
threshold, the Lasso is minimax over $k$sparse target vectors. If $\lambda$ is
equal or smaller than critical threshold, the Lasso incurs a loss of order
$\sigma\sqrt k$ which corresponds to a model of size $k$ even if the target
vector has fewer than $k$ nonzero coefficients. Remarkably, the lower bounds
obtained in the paper also apply to random, datadriven tuning parameters. The
results extend to convex penalties beyond the Lasso.

Minimax lower bounds are pessimistic in nature: for any given estimator,
minimax lower bounds yield the existence of a worstcase target vector
$\beta^*_{worst}$ for which the prediction error of the given estimator is
bounded from below. However, minimax lower bounds shed no light on the
prediction error of the given estimator for target vectors different than
$\beta^*_{worst}$. A characterization of the prediction error of any convex
regularized leastsquares is given. This characterization provide both a lower
bound and an upper bound on the prediction error. This produces lower bounds
that are applicable for any target vector and not only for a single, worstcase
$\beta^*_{worst}$. Finally, these lower and upper bounds on the prediction
error are applied to the Lasso is sparse linear regression. We obtain a lower
bound involving the compatibility constant for any tuning parameter, matching
upper and lower bounds for the universal choice of the tuning parameter, and a
lower bound for the Lasso with small tuning parameter.

Upper and lower bounds are derived for the Gaussian mean width of the
intersection of a convex hull of $M$ points with an Euclidean ball of a given
radius. The upper bound holds for any collection of extreme point bounded in
Euclidean norm. The upper bound and the lower bound match up to a
multiplicative constant whenever the extreme points satisfy a one sided
Restricted Isometry Property.
This bound is then applied to study the Lasso estimator in fixeddesign
regression, the Empirical Risk Minimizer in the anisotropic persistence
problem, and the convex aggregation problem in density estimation.

Following recent success on the analysis of the Slope estimator, we provide a
sharp oracle inequality in term of prediction error for GraphSlope, a
generalization of Slope to signals observed over a graph. In addition to
improving upon best results obtained so far for the Total Variation denoiser
(also referred to as GraphLasso or Generalized Lasso), we propose an efficient
algorithm to compute GraphSlope. The proposed algorithm is obtained by
applying the forwardbackward method to the dual formulation of the GraphSlope
optimization problem. We also provide experiments showing the interest of the
method.