-
The \emph{state complexity} of a regular language $L_m$ is the number $m$ of
states in a minimal deterministic finite automaton (DFA) accepting $L_m$. The
state complexity of a regularity-preserving binary operation on regular
languages is defined as the maximal state complexity of the result of the
operation where the two operands range over all languages of state complexities
$\le m$ and $\le n$, respectively. We find a tight upper bound on the state
complexity of the binary operation \emph{overlap assembly} on regular
languages. This operation was introduced by Csuhaj-Varj\'u, Petre, and Vaszil
to model the process of self-assembly of two linear DNA strands into a longer
DNA strand, provided that their ends "overlap". We prove that the state
complexity of the overlap assembly of languages $L_m$ and $L_n$, where $m\ge 2$
and $n\ge1$, is at most $2 (m-1) 3^{n-1} + 2^n$. Moreover, for $m \ge 2$ and $n
\ge 3$ there exist languages $L_m$ and $L_n$ over an alphabet of size $n$ whose
overlap assembly meets the upper bound and this bound cannot be met with
smaller alphabets. Finally, we prove that $m+n$ is a tight upper bound on the
overlap assembly of unary languages, and that there are binary languages whose
overlap assembly has exponential state complexity at least $m(2^{n-1}-2)+2$.
-
Significant success has been realized recently on applying machine learning
to real-world applications. There have also been corresponding concerns on the
privacy of training data, which relates to data security and confidentiality
issues. Differential privacy provides a principled and rigorous privacy
guarantee on machine learning models. While it is common to design a model
satisfying a required differential-privacy property by injecting noise, it is
generally hard to balance the trade-off between privacy and utility. We show
that stochastic gradient Markov chain Monte Carlo (SG-MCMC) -- a class of
scalable Bayesian posterior sampling algorithms proposed recently -- satisfies
strong differential privacy with carefully chosen step sizes. We develop theory
on the performance of the proposed differentially-private SG-MCMC method. We
conduct experiments to support our analysis and show that a standard SG-MCMC
sampler without any modification (under a default setting) can reach
state-of-the-art performance in terms of both privacy and utility on Bayesian
learning.
-
The statistical analysis of covariance matrices occurs in many important
applications, e.g. in diffusion tensor imaging and longitudinal data analysis.
We consider the situation where it is of interest to estimate an average
covariance matrix, describe its anisotropy, to carry out principal geodesic
analysis and to interpolate between covariance matrices. There are many choices
of metric available, each with its advantages. The particular choice of what is
best will depend on the particular application. The use of the Procrustes
size-and-shape metric is particularly appropriate when the covariance matrices
are close to being deficient in rank. We discuss the use of different metrics
for diffusion tensor analysis, and we also introduce certain types of
regularization for tensors.
-
The radiofrequency (RF) transmit field is severely inhomogeneous at ultrahigh
field due to both RF penetration and RF coil design issues. This particularly
impairs image quality for sequences that use inversion pulses such as
magnetization prepared rapid acquisition gradient echo and limits the use of
quantitative arterial spin labeling sequences such as flow-attenuated inversion
recovery. Here we have used a search algorithm to produce inversion pulses
tailored to take into account the heterogeneity of the RF transmit field at 7
T. This created a slice selective inversion pulse that worked well (good slice
profile and uniform inversion) over the range of RF amplitudes typically
obtained in the head at 7 T while still maintaining an experimentally
achievable pulse length and pulse amplitude in the brain at 7 T. The pulses
used were based on the frequency offset correction inversion technique, as well
as time dilation of functions, but the RF amplitude, frequency sweep, and
gradient functions were all generated using a genetic algorithm with an
evaluation function that took into account both the desired inversion profile
and the transmit field inhomogeneity.