
This paper is concerned with the problem of top$K$ ranking from pairwise
comparisons. Given a collection of $n$ items and a few pairwise comparisons
across them, one wishes to identify the set of $K$ items that receive the
highest ranks. To tackle this problem, we adopt the logistic parametric model
 the BradleyTerryLuce model, where each item is assigned a latent
preference score, and where the outcome of each pairwise comparison depends
solely on the relative scores of the two items involved. Recent works have made
significant progress towards characterizing the performance (e.g. the mean
square error for estimating the scores) of several classical methods, including
the spectral method and the maximum likelihood estimator (MLE). However, where
they stand regarding top$K$ ranking remains unsettled.
We demonstrate that under a natural random sampling model, the spectral
method alone, or the regularized MLE alone, is minimax optimal in terms of the
sample complexity  the number of paired comparisons needed to ensure exact
top$K$ identification, for the fixed dynamic range regime. This is
accomplished via optimal control of the entrywise error of the score estimates.
We complement our theoretical studies by numerical experiments, confirming that
both methods yield low entrywise errors for estimating the underlying scores.
Our theory is established via a novel leaveoneout trick, which proves
effective for analyzing both iterative and noniterative procedures. Along the
way, we derive an elementary eigenvector perturbation bound for probability
transition matrices, which parallels the DavisKahan $\sin\Theta$ theorem for
symmetric matrices. This also allows us to close the gap between the $\ell_2$
error upper bound for the spectral method and the minimax lower limit.

This paper considers the problem of solving systems of quadratic equations,
namely, recovering an object of interest
$\mathbf{x}^{\natural}\in\mathbb{R}^{n}$ from $m$ quadratic equations/samples
$y_{i}=(\mathbf{a}_{i}^{\top}\mathbf{x}^{\natural})^{2}$, $1\leq i\leq m$. This
problem, also dubbed as phase retrieval, spans multiple domains including
physical sciences and machine learning.
We investigate the efficiency of gradient descent (or Wirtinger flow)
designed for the nonconvex least squares problem. We prove that under Gaussian
designs, gradient descent  when randomly initialized  yields an
$\epsilon$accurate solution in $O\big(\log n+\log(1/\epsilon)\big)$ iterations
given nearly minimal samples, thus achieving nearoptimal computational and
sample complexities at once. This provides the first global convergence
guarantee concerning vanilla gradient descent for phase retrieval, without the
need of (i) carefullydesigned initialization, (ii) sample splitting, or (iii)
sophisticated saddlepoint escaping schemes. All of these are achieved by
exploiting the statistical models in analyzing optimization algorithms, via a
leaveoneout approach that enables the decoupling of certain statistical
dependency between the gradient descent iterates and the data.

We study the problem of computerassisted teaching with explanations.
Conventional approaches for machine teaching typically only provide feedback at
the instance level e.g., the category or label of the instance. However, it is
intuitive that clear explanations from a knowledgeable teacher can
significantly improve a student's ability to learn a new concept. To address
these existing limitations, we propose a teaching framework that provides
interpretable explanations as feedback and models how the learner incorporates
this additional information. In the case of images, we show that we can
automatically generate explanations that highlight the parts of the image that
are responsible for the class label. Experiments on human learners illustrate
that, on average, participants achieve better test set performance on
challenging categorization tasks when taught with our interpretable approach
compared to existing methods.

We consider the problem of recovering lowrank matrices from random rankone
measurements, which spans numerous applications including covariance sketching,
phase retrieval, quantum state tomography, and learning shallow polynomial
neural networks, among others. Our approach is to directly estimate the
lowrank factor by minimizing a nonconvex quadratic loss function via vanilla
gradient descent, following a tailored spectral initialization. When the true
rank is small, this algorithm is guaranteed to converge to the ground truth (up
to global ambiguity) with nearoptimal sample complexity and computational
complexity. To the best of our knowledge, this is the first guarantee that
achieves nearoptimality in both metrics. In particular, the key enabler of
nearoptimal computational guarantees is an implicit regularization phenomenon:
without explicit regularization, both spectral initialization and the gradient
descent iterates automatically stay within a region incoherent with the
measurement vectors. This feature allows one to employ much more aggressive
step sizes compared with the ones suggested in prior literature, without the
need of sample splitting.

In realworld applications of education and human teaching, an effective
teacher chooses the next example intelligently based on the learner's current
state. However, most of the existing works in algorithmic machine teaching
focus on the batch setting, where adaptivity plays no role. In this paper, we
study the case of teaching consistent, version space learners in an interactive
settingat any time step, the teacher provides an example, the learner
performs an update, and the teacher observes the learner's new state. We
highlight that adaptivity does not speed up the teaching process when
considering existing models of version space learners, such as the "worstcase"
model (the learner picks the next hypothesis randomly from the version space)
and "preferencebased" model (the learner picks hypothesis according to some
global preference). Inspired by human teaching, we propose a new model where
the learner picks hypothesis according to some local preference defined by the
current hypothesis. We show that our model exhibits several desirable
properties, e.g., adaptivity plays a key role, and the learner's transitions
over hypotheses are smooth/interpretable. We develop efficient teaching
algorithms for our model, and demonstrate our results via simulations as well
as user studies.

We consider a simple periodicallyforced 1D Langevin equation which
possesses two stable periodic orbits in the absence of noise. We ask the
question: is there a most likely transition path between the stable orbits that
would allow us to identify a preferred phase of the periodic forcing for which
tipping occurs? The regime where the forcing period is long compared to the
adiabatic relaxation time has been well studied. Our work complements this by
focusing on the regime where the forcing period is comparable to the relaxation
time. We compute optimal paths using the least action method which involves the
OnsagerMachlup functional and validate results with Monte Carlo simulations in
a regime where noise and drift are balanced. Results for the preferred tipping
phase are compared with the deterministic aspects of the problem. We identify
parameter regimes where nullclines, associated with the deterministic problem
in a 2D extended phase space, form passageways through which the optimal paths
transit. As the nullclines are independent of the relaxation time and the noise
strength, this leads to a robust deterministic predictor of a preferred tipping
phase for the noise and drift balanced regime.

We consider the optimal value of information (VoI) problem, where the goal is
to sequentially select a set of tests with a minimal cost, so that one can
efficiently make the best decision based on the observed outcomes. Existing
algorithms are either heuristics with no guarantees, or scale poorly (with
exponential run time in terms of the number of available tests). Moreover,
these methods assume a known distribution over the test outcomes, which is
often not the case in practice. We propose an efficient samplingbased online
learning framework to address the above issues. First, assuming the
distribution over hypotheses is known, we propose a dynamic hypothesis
enumeration strategy, which allows efficient information gathering with strong
theoretical guarantees. We show that with sufficient amount of samples, one can
identify a nearoptimal decision with high probability. Second, when the
parameters of the hypotheses distribution are unknown, we propose an algorithm
which learns the parameters progressively via posterior sampling in an online
fashion. We further establish a rigorous bound on the expected regret. We
demonstrate the effectiveness of our approach on a realworld interactive
troubleshooting application and show that one can efficiently make highquality
decisions with low cost.

Logistic regression is used thousands of times a day to fit data, predict
future outcomes, and assess the statistical significance of explanatory
variables. When used for the purpose of statistical inference, logistic models
produce pvalues for the regression coefficients by using an approximation to
the distribution of the likelihoodratio test. Indeed, Wilks' theorem asserts
that whenever we have a fixed number $p$ of variables, twice the loglikelihood
ratio (LLR) $2\Lambda$ is distributed as a $\chi^2_k$ variable in the limit of
large sample sizes $n$; here, $k$ is the number of variables being tested. In
this paper, we prove that when $p$ is not negligible compared to $n$, Wilks'
theorem does not hold and that the chisquare approximation is grossly
incorrect; in fact, this approximation produces pvalues that are far too small
(under the null hypothesis). Assume that $n$ and $p$ grow large in such a way
that $p/n\rightarrow\kappa$ for some constant $\kappa < 1/2$. We prove that for
a class of logistic models, the LLR converges to a rescaled chisquare, namely,
$2\Lambda~\stackrel{\mathrm{d}}{\rightarrow}~\alpha(\kappa)\chi_k^2$, where the
scaling factor $\alpha(\kappa)$ is greater than one as soon as the
dimensionality ratio $\kappa$ is positive. Hence, the LLR is larger than
classically assumed. For instance, when $\kappa=0.3$,
$\alpha(\kappa)\approx1.5$. In general, we show how to compute the scaling
factor by solving a nonlinear system of two equations with two unknowns. Our
mathematical arguments are involved and use techniques from approximate message
passing theory, nonasymptotic random matrix theory and convex geometry. We
also complement our mathematical study by showing that the new limiting
distribution is accurate for finite sample sizes. Finally, all the results from
this paper extend to some other regression models such as the probit regression
model.

This paper investigates the information rate loss in analog channels when the
sampler is designed to operate independent of the instantaneous channel
occupancy. Specifically, a multiband linear timeinvariant Gaussian channel
under universal subNyquist sampling is considered. The entire channel
bandwidth is divided into $n$ subbands of equal bandwidth. At each time only
$k$ constantgain subbands are active, where the instantaneous subband
occupancy is not known at the receiver and the sampler. We study the
information loss through a capacity loss metric, that is, the capacity gap
caused by the lack of instantaneous subband occupancy information. We
characterize the minimax capacity loss for the entire subNyquist rate regime,
provided that the number $n$ of subbands and the SNR are both large. The
minimax limits depend almost solely on the band sparsity factor and the
undersampling factor, modulo some residual terms that vanish as $n$ and SNR
grow. Our results highlight the power of randomized sampling methods (i.e. the
samplers that consist of random periodic modulation and lowpass filters),
which are able to approach the minimax capacity loss with exponentially high
probability.

Various applications involve assigning discrete label values to a collection
of objects based on some pairwise noisy data. Due to the discreteand hence
nonconvexstructure of the problem, computing the optimal assignment
(e.g.~maximum likelihood assignment) becomes intractable at first sight. This
paper makes progress towards efficient computation by focusing on a concrete
joint alignment problemthat is, the problem of recovering $n$ discrete
variables $x_i \in \{1,\cdots, m\}$, $1\leq i\leq n$ given noisy observations
of their modulo differences $\{x_i  x_j~\mathsf{mod}~m\}$. We propose a
lowcomplexity and modelfree procedure, which operates in a lifted space by
representing distinct label values in orthogonal directions, and which attempts
to optimize quadratic functions over hypercubes. Starting with a first guess
computed via a spectral method, the algorithm successively refines the iterates
via projected power iterations. We prove that for a broad class of statistical
models, the proposed projected power method makes no errorand hence
converges to the maximum likelihood estimatein a suitable regime. Numerical
experiments have been carried out on both synthetic and real data to
demonstrate the practicality of our algorithm. We expect this algorithmic
framework to be effective for a broad range of discrete assignment problems.

A variety of information processing tasks in practice involve recovering $n$
objects from singleshot graphbased measurements, particularly those taken
over the edges of some measurement graph $\mathcal{G}$. This paper concerns the
situation where each object takes value over a group of $M$ different values,
and where one is interested to recover all these values based on observations
of certain pairwise relations over $\mathcal{G}$. The imperfection of
measurements presents two major challenges for information recovery: 1)
$\textit{inaccuracy}$: a (dominant) portion $1p$ of measurements are
corrupted; 2) $\textit{incompleteness}$: a significant fraction of pairs are
unobservable, i.e. $\mathcal{G}$ can be highly sparse.
Under a natural random outlier model, we characterize the $\textit{minimax
recovery rate}$, that is, the critical threshold of noncorruption rate $p$
below which exact information recovery is infeasible. This accommodates a very
general class of pairwise relations. For various homogeneous random graph
models (e.g. Erdos Renyi random graphs, random geometric graphs, small world
graphs), the minimax recovery rate depends almost exclusively on the edge
sparsity of the measurement graph $\mathcal{G}$ irrespective of other graphical
metrics. This fundamental limit decays with the group size $M$ at a square root
rate before entering a connectivitylimited regime. Under the Erdos Renyi
random graph, a tractable combinatorial algorithm is proposed to approach the
limit for large $M$ ($M=n^{\Omega(1)}$), while orderoptimal recovery is
enabled by semidefinite programs in the small $M$ regime.
The extended (and most updated) version of this work can be found at
(http://arxiv.org/abs/1504.01369).

We consider the Bayesian active learning and experimental design problem,
where the goal is to learn the value of some unknown target variable through a
sequence of informative, noisy tests. In contrast to prior work, we focus on
the challenging, yet practically relevant setting where test outcomes can be
conditionally dependent given the hidden target variable. Under such
assumptions, common heuristics, such as greedily performing tests that maximize
the reduction in uncertainty of the target, often perform poorly. In this
paper, we propose ECED, a novel, computationally efficient active learning
algorithm, and prove strong theoretical guarantees that hold with correlated,
noisy tests. Rather than directly optimizing the prediction error, at each
step, ECED picks the test that maximizes the gain in a surrogate objective,
which takes into account the dependencies between tests. Our analysis relies on
an informationtheoretic auxiliary function to track the progress of ECED, and
utilizes adaptive submodularity to attain the nearoptimal bound. We
demonstrate strong empirical performance of ECED on two problem instances,
including a Bayesian experimental design task intended to distinguish among
economic theories of how people make risky decisions, and an active preference
learning task via pairwise comparisons.

Motivated by applications in domains such as social networks and
computational biology, we study the problem of community recovery in graphs
with locality. In this problem, pairwise noisy measurements of whether two
nodes are in the same community or different communities come mainly or
exclusively from nearby nodes rather than uniformly sampled between all nodes
pairs, as in most existing models. We present an algorithm that runs nearly
linearly in the number of measurements and which achieves the information
theoretic limit for exact recovery.

This paper is concerned with jointly recovering $n$ nodevariables $\left\{
x_{i}\right\}_{1\leq i\leq n}$ from a collection of pairwise difference
measurements. Imagine we acquire a few observations taking the form of
$x_{i}x_{j}$; the observation pattern is represented by a measurement graph
$\mathcal{G}$ with an edge set $\mathcal{E}$ such that $x_{i}x_{j}$ is
observed if and only if $(i,j)\in\mathcal{E}$. To account for noisy
measurements in a general manner, we model the data acquisition process by a
set of channels with given input/output transition measures. Employing
informationtheoretic tools applied to channel decoding problems, we develop a
\emph{unified} framework to characterize the fundamental recovery criterion,
which accommodates general graph structures, alphabet sizes, and channel
transition measures. In particular, our results isolate a family of
\emph{minimum} \emph{channel divergence measures} to characterize the degree of
measurement corruption, which together with the size of the minimum cut of
$\mathcal{G}$ dictates the feasibility of exact information recovery. For
various homogeneous graphs, the recovery condition depends almost only on the
edge sparsity of the measurement graph irrespective of other graphical metrics;
alternatively, the minimum sample complexity required for these graphs scales
like \[ \text{minimum sample complexity }\asymp\frac{n\log
n}{\mathsf{Hel}_{1/2}^{\min}} \] for certain information metric
$\mathsf{Hel}_{1/2}^{\min}$ defined in the main text, as long as the alphabet
size is not superpolynomial in $n$. We apply our general theory to three
concrete applications, including the stochastic block model, the outlier model,
and the haplotype assembly problem. Our theory leads to orderwise tight
recovery conditions for all these scenarios.

We consider the fundamental problem of solving quadratic systems of equations
in $n$ variables, where $y_i = \langle \boldsymbol{a}_i, \boldsymbol{x}
\rangle^2$, $i = 1, \ldots, m$ and $\boldsymbol{x} \in \mathbb{R}^n$ is
unknown. We propose a novel method, which starting with an initial guess
computed by means of a spectral method, proceeds by minimizing a nonconvex
functional as in the Wirtinger flow approach. There are several key
distinguishing features, most notably, a distinct objective functional and
novel update rules, which operate in an adaptive fashion and drop terms bearing
too much influence on the search direction. These careful selection rules
provide a tighter initial guess, better descent directions, and thus enhanced
practical performance. On the theoretical side, we prove that for certain
unstructured models of quadratic systems, our algorithms return the correct
solution in linear time, i.e. in time proportional to reading the data
$\{\boldsymbol{a}_i\}$ and $\{y_i\}$ as soon as the ratio $m/n$ between the
number of equations and unknowns exceeds a fixed numerical constant. We extend
the theory to deal with noisy systems in which we only have $y_i \approx
\langle \boldsymbol{a}_i, \boldsymbol{x} \rangle^2$ and prove that our
algorithms achieve a statistical accuracy, which is nearly unimprovable. We
complement our theoretical study with numerical examples showing that solving
random quadratic systems is both computationally and statistically not much
harder than solving linear systems of the same sizehence the title of this
paper. For instance, we demonstrate empirically that the computational cost of
our algorithm is about four times that of solving a leastsquares problem of
the same size.

A particular sequence of patterns, "$\text{gaps} \to \text{labyrinth} \to
\text{spots}$," occurs with decreasing precipitation in previously reported
numerical simulations of PDE dryland vegetation models. These observations have
led to the suggestion that this sequence of patterns can serve as an early
indicator of desertification in some ecosystems. Since parameter values can
take on a range of plausible values in the vegetation models, it is important
to investigate whether the pattern sequence prediction is robust to variation.
For a particular model, we find that a quantity calculated via
bifurcationtheoretic analysis appears to serve as a proxy for the pattern
sequences that occur in numerical simulations across a range of parameter
values. We find in further analysis that the quantity takes on values
consistent with the standard sequence in an ecologically relevant limit of the
model parameter values. This suggests that the standard sequence is a robust
prediction of the model, and we conclude by proposing a methodology for
assessing the robustness of the standard sequence in other models and
formulations.

This work considers blackbox Bayesian inference over highdimensional
parameter spaces. The wellknown adaptive Metropolis (AM) algorithm of (Haario
etal. 2001) is extended herein to scale asymptotically uniformly with respect
to the underlying parameter dimension for Gaussian targets, by respecting the
variance of the target. The resulting algorithm, referred to as the
dimensionindependent adaptive Metropolis (DIAM) algorithm, also shows improved
performance with respect to adaptive Metropolis on nonGaussian targets. This
algorithm is further improved, and the possibility of probing highdimensional
targets is enabled, via GPUaccelerated numerical libraries and periodically
synchronized concurrent chains (justified a posteriori). Asymptotically in
dimension, this GPU implementation exhibits a factor of four improvement versus
a competitive CPUbased Intel MKL parallel version alone. Strong scaling to
concurrent chains is exhibited, through a combination of longer time per sample
batch (weak scaling) and yet fewer necessary samples to convergence. The
algorithm performance is illustrated on several Gaussian and nonGaussian
target examples, in which the dimension may be in excess of one thousand.

This paper explores the preferencebased top$K$ rank aggregation problem.
Suppose that a collection of items is repeatedly compared in pairs, and one
wishes to recover a consistent ordering that emphasizes the top$K$ ranked
items, based on partially revealed preferences. We focus on the
BradleyTerryLuce (BTL) model that postulates a set of latent preference
scores underlying all items, where the odds of paired comparisons depend only
on the relative scores of the items involved.
We characterize the minimax limits on identifiability of top$K$ ranked
items, in the presence of random and nonadaptive sampling. Our results
highlight a separation measure that quantifies the gap of preference scores
between the $K^{\text{th}}$ and $(K+1)^{\text{th}}$ ranked items. The minimum
sample complexity required for reliable top$K$ ranking scales inversely with
the separation measure irrespective of other preference distribution metrics.
To approach this minimax limit, we propose a nearly lineartime ranking scheme,
called \emph{Spectral MLE}, that returns the indices of the top$K$ items in
accordance to a careful score estimate. In a nutshell, Spectral MLE starts with
an initial score estimate with minimal squared loss (obtained via a spectral
method), and then successively refines each component with the assistance of
coordinatewise MLEs. Encouragingly, Spectral MLE allows perfect top$K$ item
identification under minimal sample complexity. The practical applicability of
Spectral MLE is further corroborated by numerical experiments.

Statistical inference and information processing of highdimensional data
often require efficient and accurate estimation of their secondorder
statistics. With rapidly changing data, limited processing power and storage at
the acquisition devices, it is desirable to extract the covariance structure
from a single pass over the data and a small number of stored measurements. In
this paper, we explore a quadratic (or rankone) measurement model which
imposes minimal memory requirements and low computational complexity during the
sampling process, and is shown to be optimal in preserving various
lowdimensional covariance structures. Specifically, four popular structural
assumptions of covariance matrices, namely low rank, Toeplitz low rank,
sparsity, jointly rankone and sparse structure, are investigated, while
recovery is achieved via convex relaxation paradigms for the respective
structure.
The proposed quadratic sampling framework has a variety of potential
applications including streaming data processing, highfrequency wireless
communication, phase space tomography and phase retrieval in optics, and
noncoherent subspace detection. Our method admits universally accurate
covariance estimation in the absence of noise, as soon as the number of
measurements exceeds the information theoretic limits. We also demonstrate the
robustness of this approach against noise and imperfect structural assumptions.
Our analysis is established upon a novel notion called the mixednorm
restricted isometry property (RIP$\ell_{2}/\ell_{1}$), as well as the
conventional RIP$\ell_{2}/\ell_{2}$ for nearisotropic and bounded
measurements. In addition, our results improve upon the bestknown phase
retrieval (including both dense and sparse signals) guarantees using PhaseLift
with a significantly simpler approach.

The performance analysis of random vector channels, particularly
multipleinputmultipleoutput (MIMO) channels, has largely been established in
the asymptotic regime of large channel dimensions, due to the analytical
intractability of characterizing the exact distribution of the objective
performance metrics. This paper exposes a new nonasymptotic framework that
allows the characterization of many canonical MIMO system performance metrics
to within a narrow interval under moderatetolarge channel dimensionality,
provided that these metrics can be expressed as a separable function of the
singular values of the matrix. The effectiveness of our framework is
illustrated through two canonical examples. Specifically, we characterize the
mutual information and power offset of random MIMO channels, as well as the
minimum mean squared estimation error of MIMO channel inputs from the channel
outputs. Our results lead to simple, informative, and reasonably accurate
control of various performance metrics in the finitedimensional regime, as
corroborated by the numerical simulations. Our analysis framework is
established via the concentration of spectral measure phenomenon for random
matrices uncovered by Guionnet and Zeitouni, which arises in a variety of
random matrix ensembles irrespective of the precise distributions of the matrix
entries.

The paper explores the problem of \emph{spectral compressed sensing}, which
aims to recover a spectrally sparse signal from a small random subset of its
$n$ time domain samples. The signal of interest is assumed to be a
superposition of $r$ multidimensional complex sinusoids, while the underlying
frequencies can assume any \emph{continuous} values in the normalized frequency
domain. Conventional compressed sensing paradigms suffer from the basis
mismatch issue when imposing a discrete dictionary on the Fourier
representation. To address this issue, we develop a novel algorithm, called
\emph{Enhanced Matrix Completion (EMaC)}, based on structured matrix completion
that does not require prior knowledge of the model order. The algorithm starts
by arranging the data into a lowrank enhanced form exhibiting multifold
Hankel structure, and then attempts recovery via nuclear norm minimization.
Under mild incoherence conditions, EMaC allows perfect recovery as soon as the
number of samples exceeds the order of $r\log^{4}n$, and is stable against
bounded noise. Even if a constant portion of samples are corrupted with
arbitrary magnitude, EMaC still allows exact recovery, provided that the sample
complexity exceeds the order of $r^{2}\log^{3}n$. Along the way, our results
demonstrate the power of convex relaxation in completing a lowrank multifold
Hankel or Toeplitz matrix from minimal observed entries. The performance of our
algorithm and its applicability to super resolution are further validated by
numerical experiments.

Maximum a posteriori (MAP) inference over discrete Markov random fields is a
fundamental task spanning a wide spectrum of realworld applications, which is
known to be NPhard for general graphs. In this paper, we propose a novel
semidefinite relaxation formulation (referred to as SDR) to estimate the MAP
assignment. Algorithmically, we develop an accelerated variant of the
alternating direction method of multipliers (referred to as SDPADLR) that can
effectively exploit the special structure of the new relaxation. Encouragingly,
the proposed procedure allows solving SDR for largescale problems, e.g.,
problems on a grid graph comprising hundreds of thousands of variables with
multiple states per node. Compared with prior SDP solvers, SDPADLR is capable
of attaining comparable accuracy while exhibiting remarkably improved
scalability, in contrast to the commonly held belief that semidefinite
relaxation can only been applied on smallscale MRF problems. We have evaluated
the performance of SDR on various benchmark datasets including OPENGM2 and PIC
in terms of both the quality of the solutions and computation time.
Experimental results demonstrate that for a broad class of problems, SDPADLR
outperforms stateoftheart algorithms in producing better MAP assignment in
an efficient manner.

This paper investigates the effect of subNyquist sampling upon the capacity
of an analog channel. The channel is assumed to be a linear timeinvariant
Gaussian channel, where perfect channel knowledge is available at both the
transmitter and the receiver. We consider a general class of rightinvertible
timepreserving sampling methods which include irregular nonuniform sampling,
and characterize in closed form the channel capacity achievable by this class
of sampling methods, under a sampling rate and power constraint. Our results
indicate that the optimal sampling structures extract out the set of
frequencies that exhibits the highest signaltonoise ratio among all spectral
sets of measure equal to the sampling rate. This can be attained through
filterbank sampling with uniform sampling at each branch with possibly
different rates, or through a single branch of modulation and filtering
followed by uniform sampling. These results reveal that for a large class of
channels, employing irregular nonuniform sampling sets, while typically
complicated to realize, does not provide capacity gain over uniform sampling
sets with appropriate preprocessing. Our findings demonstrate that aliasing or
scrambling of spectral components does not provide capacity gain, which is in
contrast to the benefits obtained from random mixing in spectrumblind
compressive sampling schemes.

We propose a minimal model of predatorswarm interactions which captures many
of the essential dynamics observed in nature. Different outcomes are observed
depending on the predator strength. For a "weak" predator, the swarm is able to
escape the predator completely. As the strength is increased, the predator is
able to catch up with the swarm as a whole, but the individual prey are able to
escape by "confusing" the predator: the prey forms a ring with the predator at
the center. For higher predator strength, complex chasing dynamics are observed
which can become chaotic. For even higher strength, the predator is able to
successfully capture the prey. Our model is simple enough to be amenable to a
full mathematical analysis which is used to predict the shape of the swarm as
well as the resulting predatorprey dynamics as a function of model parameters.
We show that as the predator strength is increased, there is a transition (due
to a Hopf bifurcation) from confusion state to chasing dynamics, and we compute
the threshold analytically. Our analysis indicates that the swarming behaviour
is not helpful in avoiding the predator, suggesting that there are other
reasons why the species may swarm. The complex shape of the swarm in our model
during the chasing dynamics is similar to the shape of a flock of sheep
avoiding a shepherd.

We explore a fundamental problem of superresolving a signal of interest from
a few measurements of its lowpass magnitudes. We propose a 2stage tractable
algorithm that, in the absence of noise, admits perfect superresolution of an
$r$sparse signal from $2r^22r+2$ lowpass magnitude measurements. The spike
locations of the signal can assume any value over a continuous disk, without
increasing the required sample size. The proposed algorithm first employs a
conventional superresolution algorithm (e.g. the matrix pencil approach) to
recover unlabeled sets of signal correlation coefficients, and then applies a
simple sorting algorithm to disentangle and retrieve the true parameters in a
deterministic manner. Our approach can be adapted to multidimensional spike
models and random Fourier sampling by replacing its first step with other
harmonic retrieval algorithms.