
Consider a sequence of real data points $X_1,\ldots, X_n$ with underlying
means $\theta^*_1,\dots,\theta^*_n$. This paper begins by studying the
setting in which $\theta^*_i$ is both piecewise constant and monotone as a function
of the index $i$. For this, we establish the exact minimax rate of estimating
such monotone functions, and thus give a nontrivial answer to an open problem
in the shape-constrained analysis literature. The minimax rate involves an
interesting iterated logarithmic dependence on the dimension, a phenomenon that
is revealed through characterizing the interplay between the isotonic shape
constraint and model selection complexity. We then develop a penalized
least-squares procedure for estimating the vector
$\theta^*=(\theta^*_1,\dots,\theta^*_n)^T$. This estimator is shown to achieve
the derived minimax rate adaptively. For the proposed estimator, we further
allow the model to be misspecified and derive oracle inequalities with the
optimal rates, and show there exists a computationally efficient algorithm to
compute the exact solution.
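
As a point of reference for the isotonic shape constraint, the sketch below implements the classical pool-adjacent-violators algorithm (PAVA) for unpenalized isotonic least squares. It is a swapped-in baseline, not the penalized estimator proposed above; the function name `pava` and the toy data are illustrative assumptions. Its output is nondecreasing and piecewise constant by construction.

```python
def pava(x):
    """Unpenalized isotonic least squares via pool-adjacent-violators.

    Returns the nondecreasing vector closest to x in squared error.
    Illustrative baseline only; not the penalized estimator of the paper.
    """
    # Each block stores [sum of values, count]; its fitted value is sum/count.
    blocks = []
    for value in x:
        blocks.append([float(value), 1])
        # Merge backwards while the last two block means violate monotonicity.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block back to one fitted value per observation.
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


fit = pava([1.0, 3.0, 2.0, 4.0, 3.5])
print(fit)
```

The two violating pairs are pooled into their averages, so the fit has three constant pieces.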

In this article, we develop methods for estimating a low rank tensor from
noisy observations on a subset of its entries to achieve both statistical and
computational efficiency. There has been a lot of recent interest in this
problem of noisy tensor completion. Much of the attention has been focused on
the fundamental computational challenges often associated with problems
involving higher order tensors, yet very little is known about their
statistical performance. To fill in this void, in this article, we characterize
the fundamental statistical limits of noisy tensor completion by establishing
minimax optimal rates of convergence for estimating a $k$th order low rank
tensor under the general $\ell_p$ ($1\le p\le 2$) norm, which suggest
significant room for improvement over the existing approaches. Furthermore, we
propose a polynomial-time computable estimating procedure based upon power
iteration and a second-order spectral initialization that achieves the optimal
rates of convergence. Our method is fairly easy to implement and numerical
experiments are presented to further demonstrate the practical merits of our
estimator.

The Lasso is biased. Concave penalized least squares estimation (PLSE) takes
advantage of signal strength to reduce this bias, leading to sharper error
bounds in prediction, coefficient estimation and variable selection. For
prediction and estimation, the bias of the Lasso can also be reduced by taking
a smaller penalty level than what selection consistency requires, but such a
smaller penalty level depends on the sparsity of the true coefficient vector.
The sorted L1 penalized estimation (Slope) was proposed for adaptation to such
smaller penalty levels. However, the advantages of concave PLSE and Slope do
not subsume each other. We propose sorted concave penalized estimation to
combine the advantages of concave and sorted penalizations. We prove that
sorted concave penalties adaptively choose the smaller penalty level and at the
same time benefit from signal strength, especially when a significant
proportion of signals are stronger than the corresponding adaptively selected
penalty levels. A local convex approximation, which extends the local linear
and quadratic approximations to sorted concave penalties, is developed to
facilitate the computation of sorted concave PLSE and proven to possess desired
prediction and estimation error bounds. We carry out a unified treatment of
penalty functions in a general optimization setting, including the penalty
levels and concavity of the above-mentioned sorted penalties and mixed
penalties motivated by Bayesian considerations. Our analysis of prediction and
estimation errors requires nothing beyond the restricted eigenvalue condition
on the design, and additionally provides selection consistency under a minimum
signal strength condition. Thus, our results also sharpen existing
results on concave PLSE by removing the upper sparse eigenvalue component of
the sparse Riesz condition.
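
To make the sorted penalization concrete, the sketch below evaluates the sorted L1 (Slope) penalty, assuming its standard form $\sum_j \lambda_j |\beta|_{(j)}$ with $\lambda_1\ge\cdots\ge\lambda_p\ge 0$ and $|\beta|_{(1)}\ge\cdots\ge|\beta|_{(p)}$. The function name `sorted_l1_penalty` and the numbers are illustrative assumptions; the sorted concave penalties proposed above are not implemented here.

```python
def sorted_l1_penalty(beta, lambdas):
    """Sorted L1 (Slope) penalty: sum_j lambdas[j] * |beta|_(j).

    |beta|_(1) >= |beta|_(2) >= ... are the ordered magnitudes, and lambdas
    must be nonnegative and nonincreasing.
    """
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be nonincreasing")
    if any(lam < 0 for lam in lambdas):
        raise ValueError("lambdas must be nonnegative")
    # Pair the largest magnitude with the largest penalty level, and so on.
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * mag for lam, mag in zip(lambdas, magnitudes))


beta = [0.5, -3.0, 1.0]
slope_value = sorted_l1_penalty(beta, [2.0, 1.0, 0.5])
# With constant lambdas the penalty reduces to the ordinary Lasso penalty.
lasso_value = sorted_l1_penalty(beta, [1.0, 1.0, 1.0])
```

Because the largest coefficients meet the largest penalty levels, the effective level applied to the bulk of the coefficients can be smaller than a single Lasso level.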

The Bonferroni adjustment, or the union bound, is commonly used to study rate
optimality properties of statistical methods in high-dimensional problems.
However, in practice, the Bonferroni adjustment is overly conservative.
Extreme value theory has been shown to provide more accurate multiplicity
adjustments in a number of settings, but only on an ad hoc basis. Recently,
Gaussian approximation has been used to justify bootstrap adjustments in large
scale simultaneous inference in some general settings when $n \gg (\log p)^7$,
where $p$ is the multiplicity of the inference problem and $n$ is the sample
size. The thrust of this theory is the validity of the Gaussian approximation
for maxima of sums of independent random vectors in high dimension. In this
paper, we reduce the sample size requirement to $n \gg (\log p)^5$ for the
consistency of the empirical bootstrap and the multiplier/wild bootstrap in the
Kolmogorov-Smirnov distance, possibly in the regime where the Gaussian
approximation is not available. New comparison and anti-concentration theorems,
which are of considerable interest in and of themselves, are developed as
existing ones interwoven with Gaussian approximation are no longer applicable.
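
The conservativeness of the union bound can be seen in a small simulation. The sketch below compares a Bonferroni critical value with a Gaussian multiplier bootstrap critical value for the maximum of centered, normalized coordinate sums. It assumes roughly unit-variance coordinates so that both values are on the same scale; the function names, the toy data and the bootstrap size are illustrative assumptions, and the code does not reproduce the theory of the paper.

```python
import random
from statistics import NormalDist


def bonferroni_critical_value(p, alpha):
    """Two-sided Gaussian critical value with a union bound over p coordinates."""
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * p))


def multiplier_bootstrap_critical_value(data, alpha, n_boot=500, seed=0):
    """(1 - alpha) quantile of max_j |n^{-1/2} sum_i e_i (x_ij - mean_j)|.

    The multipliers e_i are i.i.d. standard normal, drawn independently of data.
    """
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    maxima = []
    for _ in range(n_boot):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sums = [sum(e[i] * centered[i][j] for i in range(n)) for j in range(p)]
        maxima.append(max(abs(s) for s in sums) / n ** 0.5)
    maxima.sort()
    return maxima[min(n_boot - 1, int((1.0 - alpha) * n_boot))]


# Toy data with strongly dependent coordinates: every column is the same draw.
data_rng = random.Random(1)
n, p, alpha = 100, 20, 0.05
data = [[z] * p for z in (data_rng.gauss(0.0, 1.0) for _ in range(n))]
boot_cv = multiplier_bootstrap_critical_value(data, alpha)
bonf_cv = bonferroni_critical_value(p, alpha)
```

With perfectly dependent coordinates the maximum behaves like a single statistic, so the bootstrap value stays near the one-dimensional critical value while the Bonferroni value grows with $p$.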

Additive regression provides an extension of linear regression by modeling
the signal of a response as a sum of functions of covariates of relatively low
complexity. We study penalized estimation in high-dimensional nonparametric
additive regression where functional seminorms are used to induce smoothness
of component functions and the empirical $L_2$ norm is used to induce sparsity.
The functional seminorms can be of Sobolev or bounded variation types and are
allowed to be different amongst individual component functions. We establish
new oracle inequalities for the predictive performance of such methods under
three simple technical conditions: a subgaussian condition on the noise, a
compatibility condition on the design and the functional classes under
consideration, and an entropy condition on the functional classes. For random
designs, the sample compatibility condition can be replaced by its population
version under an additional condition to ensure suitable convergence of
empirical norms. In homogeneous settings where the complexities of the
component functions are of the same order, our results provide a spectrum of
explicit convergence rates, from the so-called slow rate without requiring the
compatibility condition to the fast rate under the hard sparsity or certain
$L_q$ sparsity to allow many small components in the true regression function.
These results significantly broaden and sharpen existing ones in the
literature.

The problem of estimating the mean of a normal vector with known but unequal
variances introduces substantial difficulties that impair the adequacy of
traditional empirical Bayes estimators. By taking a different approach that
treats the known variances as part of the random observations, we restore
symmetry and thus the effectiveness of such methods. We suggest a group-linear
empirical Bayes estimator, which collects observations with similar variances
and applies a spherically symmetric estimator to each group separately. The
proposed estimator is motivated by a new oracle rule which is stronger than the
best linear rule, and thus provides a more ambitious benchmark than that
considered in previous literature. Our estimator asymptotically achieves the
new oracle risk (under appropriate conditions) and at the same time is minimax.
The group-linear estimator is particularly advantageous in situations where the
true means and observed variances are empirically dependent. To demonstrate the
merits of the proposed methods in real applications, we analyze the baseball
data used in Brown (2008), where the group-linear methods achieved the
prediction error of the best nonparametric estimates that have been applied to
the dataset, and significantly lower error than other parametric and
semiparametric empirical Bayes estimators.

We develop some theoretical results for a robust similarity measure named
"generalized min-max" (GMM). This similarity has direct applications in machine
learning as a positive definite kernel and can be efficiently computed via
probabilistic hashing. Owing to the discrete nature, the hashed values can also
be used for efficient near neighbor search. We prove the theoretical limit of
GMM and the consistency result, assuming that the data follow an elliptical
distribution, which is a very general family of distributions and includes the
multivariate $t$-distribution as a special case. The consistency result holds
as long as the data have a bounded first moment (an assumption which essentially
holds for datasets commonly encountered in practice). Furthermore, we establish
the asymptotic normality of GMM. Compared to the "cosine" similarity which is
routinely adopted in current practice in statistics and machine learning, the
consistency of GMM requires much weaker conditions. Interestingly, when the
data follow the $t$-distribution with $\nu$ degrees of freedom, GMM typically
provides a better measure of similarity than "cosine" roughly when $\nu<8$
(which is already very close to normal). These theoretical results will help
explain the recent success of GMM in learning tasks.
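
The abstract does not spell out the similarity itself. The sketch below assumes the definition commonly used for GMM: each coordinate is split into its positive and negative parts, and the similarity is the sum of coordinate-wise minima divided by the sum of coordinate-wise maxima of the resulting nonnegative vectors. The function name `gmm_similarity` is an illustrative assumption.

```python
def gmm_similarity(x, y):
    """Generalized min-max similarity between two real vectors.

    Assumed definition: split each coordinate into its positive and negative
    parts (so every entry is nonnegative), then take sum(min) / sum(max).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    def split(v):
        # v_i -> (max(v_i, 0), max(-v_i, 0))
        parts = []
        for value in v:
            parts.append(max(value, 0.0))
            parts.append(max(-value, 0.0))
        return parts

    xs, ys = split(x), split(y)
    denom = sum(max(a, b) for a, b in zip(xs, ys))
    if denom == 0.0:
        raise ValueError("similarity is undefined when both vectors are zero")
    return sum(min(a, b) for a, b in zip(xs, ys)) / denom


same = gmm_similarity([1.0, -2.0, 3.0], [1.0, -2.0, 3.0])
opposite = gmm_similarity([1.0, -1.0], [-1.0, 1.0])
partial = gmm_similarity([2.0, -1.0], [1.0, -3.0])
```

Under this definition the similarity lies in $[0,1]$, equals 1 for identical vectors and equals 0 when the two vectors never share a sign in any coordinate.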

We propose a residual and wild bootstrap methodology for individual and
simultaneous inference in high-dimensional linear models with possibly
non-Gaussian and heteroscedastic errors. We establish asymptotic consistency
for simultaneous inference for parameters in groups $G$, where $p \gg n$, $s_0
= o(n^{1/2}/\{\log(p) \log(G)^{1/2}\})$ and $\log(G) = o(n^{1/7})$, with
$p$ the number of variables, $n$ the sample size and $s_0$ denoting the
sparsity. The theory is complemented by many empirical results. Our proposed
procedures are implemented in the R package hdi.

In this paper, we investigate the sample size requirement for a general class
of nuclear norm minimization methods for higher order tensor completion. We
introduce a class of tensor norms by allowing for different levels of
coherence, which allows us to leverage the incoherence of a tensor. In
particular, we show that a $k$th order tensor of rank $r$ and dimension
$d\times\cdots\times d$ can be recovered perfectly from as few as
$O((r^{(k-1)/2}d^{3/2}+r^{k-1}d)(\log(d))^2)$ uniformly sampled entries through
an appropriate incoherent nuclear norm minimization. Our results demonstrate
some key differences between completing a matrix and a higher order tensor:
They not only point to potential room for improvement over the usual nuclear
norm minimization but also highlight the importance of explicitly accounting
for incoherence when dealing with higher order tensors.

We study confidence regions and approximate chi-squared tests for variable
groups in high-dimensional linear regression. When the size of the group is
small, low-dimensional projection estimators for individual coefficients can be
directly used to construct efficient confidence regions and p-values for the
group. However, the existing analyses of low-dimensional projection estimators
do not directly carry through for chi-squared-based inference of a large group
of variables without inflating the sample size by a factor of the group size.
We propose to debias a scaled group Lasso for chi-squared-based statistical
inference for potentially very large groups of variables. We prove that the
proposed methods capture the benefit of group sparsity under proper conditions,
for statistical inference of the noise level and variable groups, large and
small. Such benefit is especially strong when the group size is large.

We provide a principled way for investigators to analyze randomized
experiments when the number of covariates is large. Investigators often use
linear multivariate regression to analyze randomized experiments instead of
simply reporting the difference of means between treatment and control groups.
Their aim is to reduce the variance of the estimated treatment effect by
adjusting for covariates. If there are a large number of covariates relative to
the number of observations, regression may perform poorly because of
overfitting. In such cases, the Lasso may be helpful. We study the resulting
Lasso-based treatment effect estimator under the Neyman-Rubin model of
randomized experiments. We present theoretical conditions that guarantee that
the estimator is more efficient than the simple difference-of-means estimator,
and we provide a conservative estimator of the asymptotic variance, which can
yield tighter confidence intervals than the difference-of-means estimator.
Simulation and data examples show that Lasso-based adjustment can be
advantageous even when the number of covariates is less than the number of
observations. Specifically, a variant using Lasso for selection and OLS for
estimation performs particularly well, and it chooses a smoothing parameter
based on combined performance of Lasso and OLS.

The Gaussian graphical model, a popular paradigm for studying relationships
among variables in a wide range of applications, has attracted great attention
in recent years. This paper considers a fundamental question: When is it
possible to estimate low-dimensional parameters at the parametric square-root rate
in a large Gaussian graphical model? A novel regression approach is proposed to
obtain asymptotically efficient estimation of each entry of a precision matrix
under a sparseness condition relative to the sample size. When the precision
matrix is not sufficiently sparse, or equivalently the sample size is not
sufficiently large, a lower bound is established to show that it is no longer
possible to achieve the parametric rate in the estimation of each entry. This
lower bound result, which provides an answer to the delicate sample size
question, is established with a novel construction of a subset of sparse
precision matrices in an application of Le Cam's lemma. Moreover, the proposed
estimator is proven to have the optimal convergence rate when the parametric rate
cannot be achieved, under a minimal sample requirement. The proposed estimator
is applied to test the presence of an edge in the Gaussian graphical model or
to recover the support of the entire model, to obtain adaptive rate-optimal
estimation of the entire precision matrix as measured by the matrix $\ell_q$
operator norm and to make inference in latent variables in the graphical model.
All of this is achieved under a sparsity condition on the precision matrix and
a side condition on the range of its spectrum. This significantly relaxes the
commonly imposed uniform signal strength condition on the precision matrix,
irrepresentability condition on the Hessian tensor operator of the covariance
matrix or the $\ell_1$ constraint on the precision matrix. Numerical results
confirm our theoretical findings. The ROC curve of the proposed algorithm,
Asymptotic Normal Thresholding (ANT), for support recovery significantly
outperforms that of the popular GLasso algorithm.

We study the use of very sparse random projections for compressed sensing
(sparse signal recovery) when the signal entries can be either positive or
negative. In our setting, the entries of a Gaussian design matrix are randomly
sparsified so that only a very small fraction of the entries are nonzero. Our
proposed decoding algorithm is simple and efficient in that the major cost is
one linear scan of the coordinates. We have developed two estimators: (i) the
{\em tie estimator}, and (ii) the {\em absolute minimum estimator}. Using only
the tie estimator, we are able to recover a $K$-sparse signal of length $N$
using $1.551 eK \log K/\delta$ measurements (where $\delta\leq 0.05$ is the
confidence). Using only the absolute minimum estimator, we can detect the
support of the signal using $eK\log N/\delta$ measurements. For a particular
coordinate, the absolute minimum estimator requires fewer measurements (i.e.,
with a constant $e$ instead of $1.551e$). Thus, the two estimators can be
combined to form an even more practical decoding framework.
Prior studies have shown that existing one-scan (or roughly one-scan)
recovery algorithms using sparse matrices would require substantially more
(e.g., one order of magnitude) measurements than L1 decoding by linear
programming, when the nonzero entries of signals can be either negative or
positive. In this paper, following a known experimental setup, we show that, at
the same number of measurements, the recovery accuracies of our proposed method
are (at least) similar to the standard L1 decoding.

Consider a linear regression model where the design matrix X has n rows and p
columns. We assume (a) p is much larger than n, (b) the coefficient vector beta
is sparse in the sense that only a small fraction of its coordinates is
nonzero, and (c) the Gram matrix G = X'X is sparse in the sense that each row
has relatively few large coordinates (diagonals of G are normalized to 1).
The sparsity in G naturally induces the sparsity of the so-called graph of
strong dependence (GOSD). We find an interesting interplay between the signal
sparsity and the graph sparsity, which ensures that in a broad context, the set
of true signals decomposes into many different small-size components of GOSD,
where different components are disconnected.
We propose Graphlet Screening (GS) as a new approach to variable selection,
which is a two-stage Screen and Clean method. The key methodological innovation
of GS is to use GOSD to guide both the screening and cleaning. Compared to
m-variate brute-force screening that has a computational cost of p^m, the GS
only has a computational cost of p (up to some multi-log(p) factors) in
screening.
We measure the performance of any variable selection procedure by the minimax
Hamming distance. We show that in a very broad class of situations, GS achieves
the optimal rate of convergence in terms of the Hamming distance. Somewhat
surprisingly, the well-known procedures subset selection and the lasso are rate
non-optimal, even in very simple settings and even when their tuning parameters
are ideally set.

Many problems can be formulated as recovering a low-rank tensor. Although an
increasingly common task, tensor recovery remains a challenging problem because
of the delicacy associated with the decomposition of higher order tensors. To
overcome these difficulties, existing approaches often proceed by unfolding
tensors into matrices and then applying techniques for matrix completion. We show
here that such matricization fails to exploit the tensor structure and may lead
to suboptimal procedures. More specifically, we investigate a convex
optimization approach to tensor completion by directly minimizing a tensor
nuclear norm and prove that this leads to an improved sample size requirement.
To establish our results, we develop a series of algebraic and probabilistic
techniques such as characterization of the subdifferential of the tensor nuclear norm
and concentration inequalities for tensor martingales, which may be of
independent interest and could be useful in other tensor-related problems.

We study concentration in spectral norm of nonparametric estimates of
correlation matrices. We work within the confines of a Gaussian copula model.
Two nonparametric estimators of the correlation matrix, the sine
transformations of Kendall's tau and Spearman's rho correlation
coefficients, are studied. An expected spectral error bound is obtained for both
estimators. A general large deviation bound for the maximum spectral error
of a collection of submatrices of a given dimension is also established. These
results prove that when both the number of variables and sample size are large,
the spectral error of the nonparametric estimators is of no greater order than
that of the latent sample covariance matrix, at least when compared with some
of the sharpest known error bounds for the latter. As an application, we
establish the minimax optimal convergence rate in the estimation of
high-dimensional bandable correlation matrices via tapering off of these
nonparametric estimators. An optimal convergence rate for sparse principal
component analysis is also established as another example of possible
applications of the main results.
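
For a single pair of variables, the sine transformation of Kendall's tau can be sketched as follows. Under a Gaussian copula the latent correlation $r$ and the population Kendall's tau satisfy $r=\sin(\pi\tau/2)$, so plugging in the sample tau gives a rank-based estimate of $r$. The function names and the toy data are illustrative assumptions, and the quadratic-time tau below assumes no ties.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau (tau-a) for paired samples, assuming no ties."""
    n = len(x)
    concordant_minus_discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            # +1 for a concordant pair, -1 for a discordant pair.
            concordant_minus_discordant += (product > 0) - (product < 0)
    return concordant_minus_discordant / (n * (n - 1) / 2)


def sine_transformed_tau(x, y):
    """Correlation estimate sin(pi/2 * tau), valid under a Gaussian copula."""
    return math.sin(math.pi / 2 * kendall_tau(x, y))


x = [0.1, 0.4, 0.9, 1.3, 2.0]
y = [1.0, 1.5, 1.2, 3.0, 2.5]
tau = kendall_tau(x, y)
r_hat = sine_transformed_tau(x, y)
```

Applying the same transformation to every pair of variables yields a full matrix estimate; because it depends only on ranks, it is unchanged by monotone transformations of the marginals.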

Compressed sensing (sparse signal recovery) often encounters nonnegative data
(e.g., images). Recently we developed the methodology of using (dense)
Compressed Counting for recovering nonnegative K-sparse signals. In this paper,
we adopt very sparse Compressed Counting for nonnegative signal recovery. Our
design matrix is sampled from a maximally-skewed p-stable distribution (0<p<1),
and we sparsify the design matrix so that on average a (1-g)-fraction of the
entries become zero. The idea is related to very sparse stable random
projections (Li et al 2006 and Li 2007), the prior work for estimating summary
statistics of the data.
In our theoretical analysis, we show that, when p->0, it suffices to use M =
K/(1-exp(-gK)) log N measurements, so that all coordinates can be recovered in
one scan of the coordinates. If g = 1 (i.e., dense design), then M = K log N.
If g= 1/K or 2/K (i.e., very sparse design), then M = 1.58K log N or M = 1.16K
log N. This means the design matrix can be indeed very sparse at only a minor
inflation of the sample complexity.
Interestingly, as p->1, the required number of measurements is essentially M
= 2.7K log N, provided g= 1/K. It turns out that this result is a general
worst-case bound.
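
The constants quoted for the small-p regime can be checked numerically. The sketch below reads the measurement bound as M = K/(1-exp(-gK)) log N and evaluates the factor 1/(1-exp(-gK)) multiplying K log N. The function name `measurement_factor` and the value K = 100 are illustrative assumptions.

```python
import math


def measurement_factor(g, K):
    """Factor c in M = c * K * log N, using c = 1 / (1 - exp(-g * K))."""
    return 1.0 / (1.0 - math.exp(-g * K))


K = 100  # hypothetical sparsity level, chosen only for illustration
dense = measurement_factor(1.0, K)         # g = 1, dense design
sparse_1 = measurement_factor(1.0 / K, K)  # g = 1/K, very sparse design
sparse_2 = measurement_factor(2.0 / K, K)  # g = 2/K, very sparse design
print(round(dense, 2), round(sparse_1, 2), round(sparse_2, 2))
```

For g = 1/K and g = 2/K the factor depends only on the product gK, which is why the inflation constants 1.58 and 1.16 do not depend on K.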

This paper addresses the following simple question about sparsity. For the
estimation of an $n$-dimensional mean vector $\boldsymbol{\theta}$ in the
Gaussian sequence model, is it possible to find an adaptive optimal threshold
estimator in a full range of sparsity levels where nonadaptive optimality can
be achieved by threshold estimators? We provide an explicit affirmative answer
as follows. Under the squared loss, adaptive minimaxity in strong and weak
$\ell_p$ balls with $0\le p<2$ is achieved by a class of smooth threshold
estimators with the threshold level of the Benjamini-Hochberg FDR rule or a
certain approximation of it, provided that the minimax risk is between
$n^{-\delta_n}$ and $\delta_n n$ for some $\delta_n\to 0$. For $p=0$, this
means adaptive minimaxity in $\ell_0$ balls when $1\le
\|\boldsymbol{\theta}\|_0\ll n$. The class of smooth threshold estimators
includes the soft and firm threshold estimators but not the hard threshold
estimator. The adaptive minimaxity in such a wide range is a delicate problem
since the same is not true for the FDR hard threshold estimator at certain
threshold and nominal FDR levels. The above adaptive minimaxity of the FDR
smooth-threshold estimator is established by proving a stronger notion of
adaptive ratio optimality for the soft threshold estimator in the sense that
the risk for the FDR threshold level is uniformly within an infinitesimal
fraction of the risk for the optimal threshold level for each unknown vector,
when the minimum risk of nonadaptive soft threshold estimator is between
$n^{-\delta_n}$ and $\delta_n n$. It is an interesting consequence of this
adaptive ratio optimality that the FDR smooth-threshold estimator outperforms
the sample mean in the common mean model $\theta_i=\mu$ when $|\mu|<n^{-1/2}$.
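
The threshold rules named above can be written out for a single observation. The sketch below gives the soft and hard threshold functions together with a firm threshold in its usual two-level form, which is an assumption here since the abstract does not define it. The function names and test values are illustrative. The soft and firm rules are continuous in the observation, whereas the hard rule jumps at the threshold.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t, setting |x| <= t to zero."""
    if abs(x) <= t:
        return 0.0
    return x - t if x > 0 else x + t


def hard_threshold(x, t):
    """Keep x unchanged when |x| > t, otherwise set it to zero."""
    return x if abs(x) > t else 0.0


def firm_threshold(x, t1, t2):
    """Firm threshold with 0 <= t1 < t2: linear ramp between t1 and t2."""
    if abs(x) <= t1:
        return 0.0
    if abs(x) >= t2:
        return x
    sign = 1.0 if x > 0 else -1.0
    # The ramp rises from 0 at |x| = t1 to t2 at |x| = t2, so it is continuous.
    return sign * t2 * (abs(x) - t1) / (t2 - t1)


values = [-3.0, -0.5, 1.5, 3.0]
soft = [soft_threshold(v, 1.0) for v in values]
hard = [hard_threshold(v, 1.0) for v in values]
firm = [firm_threshold(v, 1.0, 2.0) for v in values]
```

In an FDR-based procedure the level `t` would be chosen from the data by the Benjamini-Hochberg rule; here it is fixed only to show the shapes of the three rules.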

We propose a new method, semi-penalized inference with direct false discovery
rate control (SPIDR), for variable selection and confidence interval
construction in high-dimensional linear regression. SPIDR first uses a
semi-penalized approach to construct estimators of the regression
coefficients. We show that the SPIDR estimator is ideal in the sense that it
equals an ideal least squares estimator with high probability under sparsity
and other suitable conditions. Consequently, the SPIDR estimator is
asymptotically normal. Based on this distributional result, SPIDR determines
the selection rule by directly controlling false discovery rate. This provides
an explicit assessment of the selection error. This also naturally leads to
confidence intervals for the selected coefficients with a proper confidence
statement. We conduct simulation studies to evaluate its finite sample
performance and demonstrate its application on a breast cancer gene expression
data set. Our simulation studies and data example suggest that SPIDR is a
useful method for high-dimensional statistical inference in practice.

We propose a new method of learning a sparse nonnegative-definite target
matrix. Our primary example of the target matrix is the inverse of a population
covariance or correlation matrix. The algorithm first estimates each column of
the target matrix by the scaled Lasso and then adjusts the matrix estimator to
be symmetric. The penalty level of the scaled Lasso for each column is
completely determined by data via convex minimization, without using
cross-validation.
We prove that this scaled Lasso method guarantees the fastest proven rate of
convergence in the spectrum norm under conditions of weaker form than those in
the existing analyses of other $\ell_1$ regularized algorithms, and has a faster
guaranteed rate of convergence when the ratio of the $\ell_1$ and spectrum
norms of the target inverse matrix diverges to infinity. A simulation study
demonstrates the computational feasibility and superb performance of the
proposed method.
Our analysis also provides new performance bounds for the Lasso and scaled
Lasso to guarantee higher concentration of the error at a smaller threshold
level than previous analyses, and to allow the use of the union bound in
column-by-column applications of the scaled Lasso without an adjustment of the
penalty level. In addition, the least squares estimation after the scaled Lasso
selection is considered and proven to guarantee performance bounds similar to
those of the scaled Lasso.

Compressed sensing (sparse signal recovery) has been a popular and important
research topic in recent years. By observing that natural signals are often
nonnegative, we propose a new framework for nonnegative signal recovery using
Compressed Counting (CC). CC is a technique built on maximally-skewed p-stable
random projections originally developed for data stream computations. Our
recovery procedure is computationally very efficient in that it requires only
one linear scan of the coordinates. Our analysis demonstrates that, when
0<p<=0.5, it suffices to use M= O(C/eps^p log N) measurements so that all
coordinates will be recovered within eps additive precision, in one scan of the
coordinates. The constant C=1 when p->0 and C=pi/2 when p=0.5. In particular,
when p->0 the required number of measurements is essentially M=K\log N, where K
is the number of nonzero coordinates of the signal.

We study the absolute penalized maximum partial likelihood estimator in
sparse, high-dimensional Cox proportional hazards regression models where the
number of time-dependent covariates can be larger than the sample size. We
establish oracle inequalities based on natural extensions of the compatibility
and cone invertibility factors of the Hessian matrix at the true regression
coefficients. Similar results based on an extension of the restricted
eigenvalue can also be proved by our method. However, the presented oracle
inequalities are sharper since the compatibility and cone invertibility factors
are always greater than the corresponding restricted eigenvalue. In the Cox
regression model, the Hessian matrix is based on time-dependent covariates in
censored risk sets, so that the compatibility and cone invertibility factors,
and the restricted eigenvalue as well, are random variables even when they are
evaluated for the Hessian at the true regression coefficients. Under mild
conditions, we prove that these quantities are bounded from below by positive
constants for time-dependent covariates, including cases where the number of
covariates is of greater order than the sample size. Consequently, the
compatibility and cone invertibility factors can be treated as positive
constants in our oracle inequalities.

Many applications concern sparse signals, for example, detecting anomalies
from the differences between consecutive images taken by surveillance cameras.
This paper focuses on the problem of recovering a K-sparse signal x in N
dimensions. In the mainstream framework of compressed sensing (CS), the vector
x is recovered from M nonadaptive linear measurements y = xS, where S (of size
N x M) is typically a Gaussian (or Gaussian-like) design matrix, through some
optimization procedure such as linear programming (LP).
In our proposed method, the design matrix S is generated from an
$\alpha$-stable distribution with $\alpha\approx 0$. Our decoding algorithm
mainly requires one linear scan of the coordinates, followed by a few
iterations on a small number of coordinates which are "undetermined" in the
previous iteration. Comparisons with two strong baselines, linear programming
(LP) and orthogonal matching pursuit (OMP), demonstrate that our algorithm can
be significantly faster in decoding speed and more accurate in recovery
quality, for the task of exact sparse recovery. Our procedure is robust against
measurement noise. Even when the measurements are insufficient, our
algorithm can still reliably recover a significant portion of the nonzero
coordinates.
To provide the intuition for understanding our method, we also analyze the
procedure by assuming an idealistic setting. Interestingly, when K=2, the
"idealized" algorithm achieves exact recovery with merely 3 measurements,
regardless of N. For general K, the required sample size of the "idealized"
algorithm is about 5K.

This paper concerns the problem of matrix completion, which is to estimate a
matrix from observations in a small subset of indices. We propose a calibrated
spectrum elastic net method with a sum of the nuclear and Frobenius penalties
and develop an iterative algorithm to solve the convex minimization problem.
The iterative algorithm alternates between imputing the missing entries in the
incomplete matrix by the current guess and estimating the matrix by a scaled
soft-thresholding singular value decomposition of the imputed matrix until the
resulting matrix converges. A calibration step follows to correct the bias
caused by the Frobenius penalty. Under proper coherence conditions and for
suitable penalty levels, we prove that the proposed estimator achieves an
error bound of nearly optimal order and in proportion to the noise level. This
provides a unified analysis of the noisy and noiseless matrix completion
problems. Simulation results are presented to compare our proposal with
previous ones.

The purpose of this paper is to propose methodologies for statistical
inference of low-dimensional parameters with high-dimensional data. We focus on
constructing confidence intervals for individual coefficients and linear
combinations of several of them in a linear regression model, although our
ideas are applicable in a much broader context. The theoretical results presented
here provide sufficient conditions for the asymptotic normality of the proposed
estimators along with a consistent estimator for their finite-dimensional
covariance matrices. These sufficient conditions allow the number of variables
to far exceed the sample size. The simulation results presented here
demonstrate the accuracy of the coverage probability of the proposed confidence
intervals, strongly supporting the theoretical results.