• Consider the standard nonparametric regression model and take the penalized least squares function as the estimator. In this article, we study the trade-off between closeness to the true function and complexity penalization of the estimator, where complexity is described by a seminorm on a class of functions. First, we present an exponential concentration inequality showing that the trade-off of the penalized least squares estimator concentrates around a nonrandom quantity, which depends on the problem under consideration. Then, under some conditions and for a proper choice of the tuning parameter, we obtain bounds for this nonrandom quantity. We illustrate our results with some examples that include the smoothing splines estimator.
  • This study contributes to lower bounds for empirical compatibility constants and empirical restricted eigenvalues. Such bounds are important in compressed sensing and in the theory of $\ell_1$-regularized estimators. Let $X$ be an $n \times p$ data matrix whose rows are independent copies of a $p$-dimensional random variable. Let $\hat \Sigma := X^T X / n$ be the inner product matrix. We show that the quadratic forms $u^T \hat \Sigma u$ are lower bounded by a value converging to one, uniformly over the set of vectors $u$ with $u^T \Sigma_0 u$ equal to one and $\ell_1$-norm at most $M$. Here $\Sigma_0 := {\bf E} \hat \Sigma$ is the theoretical inner product matrix, which we assume to exist. The constant $M$ is required to be of small order $\sqrt {n / \log p}$. We moreover assume $m$-th order isotropy for some $m > 2$, and either sub-exponential tails or moments up to order $\log p$ for the entries of $X$. As a consequence, we obtain convergence of the empirical compatibility constant to its theoretical counterpart, and similarly for the empirical restricted eigenvalue. If the data matrix $X$ is first normalized so that its columns all have equal length, we obtain lower bounds assuming only isotropy and no further moment conditions on its entries. The isotropy condition is shown to hold for certain martingale situations.
  • We consider an additive regression model consisting of two components $f^0$ and $g^0$, where the first component $f^0$ is in some sense "smoother" than the second component $g^0$. Here, smoothness is described in terms of a semi-norm on the class of regression functions. We use a penalized least squares estimator $(\hat f, \hat g)$ of $(f^0, g^0)$ and show that the rate of convergence for $\hat f$ is faster than the rate of convergence for $\hat g$. In fact, both rates are generally as fast as in the case where one of the two components is known. The theory is illustrated by a simulation study. Our proofs rely on recent results from empirical process theory.
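
As a rough numerical illustration of the quantities named in the second abstract above, the following sketch computes the empirical inner product matrix $\hat \Sigma = X^T X / n$ and evaluates the quadratic forms $u^T \hat \Sigma u$ over randomly drawn directions $u$. Everything beyond those two formulas is an assumption made for illustration only: the entries of $X$ are standard Gaussian (so that $\Sigma_0$ is the identity and $u^T \Sigma_0 u = 1$ reduces to unit Euclidean norm), the directions are $s$-sparse (so their $\ell_1$-norm is at most $\sqrt{s}$), and the helper names `empirical_gram` and `sparse_unit_vectors` are hypothetical. Random sampling of directions does not certify a uniform lower bound; it only shows what the objects look like.

```python
import numpy as np


def empirical_gram(X):
    """Return the empirical inner product matrix X^T X / n."""
    n = X.shape[0]
    return X.T @ X / n


def sparse_unit_vectors(p, s, count, rng):
    """Draw `count` random s-sparse vectors in R^p with unit Euclidean norm."""
    U = np.zeros((count, p))
    for i in range(count):
        support = rng.choice(p, size=s, replace=False)
        v = rng.standard_normal(s)
        U[i, support] = v / np.linalg.norm(v)
    return U


# Illustrative sizes and seed (assumptions, not taken from the abstract).
rng = np.random.default_rng(0)
n, p, s = 500, 50, 3

# Rows are i.i.d. standard Gaussian vectors, so Sigma_0 is the identity.
X = rng.standard_normal((n, p))
sigma_hat = empirical_gram(X)

# Directions with u^T Sigma_0 u = 1 and l1-norm at most sqrt(s).
U = sparse_unit_vectors(p, s, 200, rng)

# Quadratic forms u^T Sigma_hat u, one value per sampled direction.
quad_forms = np.einsum("ij,jk,ik->i", U, sigma_hat, U)
l1_norms = np.abs(U).sum(axis=1)

print("smallest sampled quadratic form:", quad_forms.min())
print("largest sampled l1-norm:", l1_norms.max())
```

Increasing `n` while keeping `p` and `s` fixed should pull the sampled quadratic forms closer to one in this Gaussian setting, which is the qualitative behavior the abstract describes for the uniform lower bound.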