
Wing disc pouches of fruit flies are a powerful genetic model for studying
physiological intercellular calcium ($Ca^{2+}$) signals for dynamic analysis of
cell signaling in organ development and disease studies. A key to analyzing
spatialtemporal patterns of $Ca^{2+}$ signal waves is to accurately align the
pouches across image sequences. However, pouches in different image frames may
exhibit extensive intensity oscillations due to $Ca^{2+}$ signaling dynamics,
and commonly used multimodal nonrigid registration methods may fail to achieve
satisfactory results. In this paper, we develop a new twophase nonrigid
registration approach to register pouches in image sequences. First, we conduct
segmentation of the region of interest. (i.e., pouches) using a deep neural
network model. Second, we obtain an optimal transformation and align pouches
across the image sequences. Evaluated using both synthetic data and real pouch
data, our method considerably outperforms the stateoftheart nonrigid
registration methods.

Image segmentation is a fundamental problem in biomedical image analysis.
Recent advances in deep learning have achieved promising results on many
biomedical image segmentation benchmarks. However, due to large variations in
biomedical images (different modalities, image settings, objects, noise, etc),
to utilize deep learning on a new application, it usually needs a new set of
training data. This can incur a great deal of annotation effort and cost,
because only biomedical experts can annotate effectively, and often there are
too many instances in images (e.g., cells) to annotate. In this paper, we aim
to address the following question: With limited effort (e.g., time) for
annotation, what instances should be annotated in order to attain the best
performance? We present a deep active learning framework that combines fully
convolutional network (FCN) and active learning to significantly reduce
annotation effort by making judicious suggestions on the most effective
annotation areas. We utilize uncertainty and similarity information provided by
FCN and formulate a generalized version of the maximum set cover problem to
determine the most representative and uncertain areas for annotation. Extensive
experiments using the 2015 MICCAI Gland Challenge dataset and a lymph node
ultrasound image segmentation dataset show that, using annotation suggestions
by our method, stateoftheart segmentation performance can be achieved by
using only 50% of training data.

In this paper, we consider the problem of automatically segmenting neuronal
cells in dualcolor confocal microscopy images. This problem is a key task in
various quantitative analysis applications in neuroscience, such as tracing
cell genesis in Danio rerio (zebrafish) brains. Deep learning, especially using
fully convolutional networks (FCN), has profoundly changed segmentation
research in biomedical imaging. We face two major challenges in this problem.
First, neuronal cells may form dense clusters, making it difficult to correctly
identify all individual cells (even to human experts). Consequently,
segmentation results of the known FCNtype models are not accurate enough.
Second, pixelwise ground truth is difficult to obtain. Only a limited amount
of approximate instancewise annotation can be collected, which makes the
training of FCN models quite cumbersome. We propose a new FCNtype deep
learning model, called deep complete bipartite networks (CBNet), and a new
scheme for leveraging approximate instancewise annotation to train our
pixelwise prediction model. Evaluated using seven real datasets, our proposed
new CBNet model outperforms the stateoftheart FCN models and produces
neuron segmentation results of remarkable quality

Convolution is a fundamental operation in many applications, such as computer
vision, natural language processing, image processing, etc. Recent successes of
convolutional neural networks in various deep learning applications put even
higher demand on fast convolution. The high computation throughput and memory
bandwidth of graphics processing units (GPUs) make GPUs a natural choice for
accelerating convolution operations. However, maximally exploiting the
available memory bandwidth of GPUs for convolution is a challenging task. This
paper introduces a general model to address the mismatch between the memory
bank width of GPUs and computation data width of threads. Based on this model,
we develop two convolution kernels, one for the general case and the other for
a special case with one input channel. By carefully optimizing memory access
patterns and computation patterns, we design a communicationoptimized kernel
for the special case and a communicationreduced kernel for the general case.
Experimental data based on implementations on Kepler GPUs show that our kernels
achieve 5.16X and 35.5% average performance improvement over the latest cuDNN
library, for the special case and the general case, respectively.

Segmentation of 3D images is a fundamental problem in biomedical image
analysis. Deep learning (DL) approaches have achieved stateoftheart
segmentation perfor mance. To exploit the 3D contexts using neural networks,
known DL segmentation methods, including 3D convolution, 2D convolution on
planes orthogonal to 2D image slices, and LSTM in multiple directions, all
suffer incompatibility with the highly anisotropic dimensions in common 3D
biomedical images. In this paper, we propose a new DL framework for 3D image
segmentation, based on a com bination of a fully convolutional network (FCN)
and a recurrent neural network (RNN), which are responsible for exploiting the
intraslice and interslice contexts, respectively. To our best knowledge, this
is the first DL framework for 3D image segmentation that explicitly leverages
3D image anisotropism. Evaluating using a dataset from the ISBI Neuronal
Structure Segmentation Challenge and inhouse image stacks for 3D fungus
segmentation, our approach achieves promising results comparing to the known
DLbased 3D segmentation approaches.

The Thomson Problem, arrangement of identical charges on the surface of a
sphere, has found many applications in physics, chemistry and biology. Here we
show that the energy landscape of the Thomson Problem for $N$ particles with
$N=132, 135, 138, 141, 144, 147$ and $150$ is single funnelled, characteristic
of a structureseeking organisation where the global minimum is easily
accessible. Algorithmically constructing starting points close to the global
minimum of such a potential with spherical constraints is one of Smale's 18
unsolved problems in mathematics for the 21st century because it is important
in the solution of univariate and bivariate random polynomial equations. By
analysing the kinetic transition networks, we show that a randomly chosen
minimum is in fact always `close' to the global minimum in terms of the number
of transition states that separate them, a characteristic of small world
networks.

In this paper, we study the problem of moving $n$ sensors on a line to form a
barrier coverage of a specified segment of the line such that the maximum
moving distance of the sensors is minimized. Previously, it was an open
question whether this problem on sensors with arbitrary sensing ranges is
solvable in polynomial time. We settle this open question positively by giving
an $O(n^2 \log n)$ time algorithm. For the special case when all sensors have
the samesize sensing range, the previously best solution takes $O(n^2)$ time.
We present an $O(n \log n)$ time algorithm for this case; further, if all
sensors are initially located on the coverage segment, our algorithm takes
$O(n)$ time. Also, we extend our techniques to the cycle version of the problem
where the barrier coverage is for a simple cycle and the sensors are allowed to
move only along the cycle. For sensors with the samesize sensing range, we
solve the cycle version in $O(n)$ time, improving the previously best $O(n^2)$
time solution.

A fundamental problem in computational geometry is to compute an
obstacleavoiding Euclidean shortest path between two points in the plane. The
case of this problem on polygonal obstacles is well studied. In this paper, we
consider the problem version on curved obstacles, commonly modeled as
splinegons. A splinegon can be viewed as replacing each edge of a polygon by a
convex curved edge (polygons are special splinegons). Each curved edge is
assumed to be of O(1) complexity. Given in the plane two points s and t and a
set of $h$ pairwise disjoint splinegons with a total of $n$ vertices, we
compute a shortest stot path avoiding the splinegons, in
$O(n+h\log^{1+\epsilon}h+k)$ time, where k is a parameter sensitive to the
structures of the input splinegons and is upperbounded by $O(h^2)$. In
particular, when all splinegons are convex, $k$ is proportional to the number
of common tangents in the free space (called "free common tangents") among the
splinegons. We develop techniques for solving the problem on the general
(nonconvex) splinegon domain, which also improve several previous results. In
particular, our techniques produce an optimal outputsensitive algorithm for a
basic visibility problem of computing all free common tangents among $h$
pairwise disjoint convex splinegons with a total of $n$ vertices. Our algorithm
runs in $O(n+h\log h+k)$ time and $O(n)$ space, where $k$ is the number of all
free common tangents. Even for the special case where all splinegons are convex
polygons, the previously best algorithm for this visibility problem takes
$O(n+h^2\log n)$ time.

Let $\mathcal{P}$ be a set of $h$ pairwisedisjoint polygonal obstacles with
a total of $n$ vertices in the plane. We consider the problem of building a
data structure that can quickly compute an $L_1$ shortest obstacleavoiding
path between any two query points $s$ and $t$. Previously, a data structure of
size $O(n^2\log n)$ was constructed in $O(n^2\log^2 n)$ time that answers each
twopoint query in $O(\log^2 n+k)$ time, i.e., the shortest path length is
reported in $O(\log^2 n)$ time and an actual path is reported in additional
$O(k)$ time, where $k$ is the number of edges of the output path. In this
paper, we build a new data structure of size $O(n+h^2\cdot \log h \cdot
4^{\sqrt{\log h}})$ in $O(n+h^2\cdot \log^{2} h \cdot 4^{\sqrt{\log h}})$ time
that answers each query in $O(\log n+k)$ time. Note that $n+h^2\cdot \log^{2} h
\cdot 4^{\sqrt{\log h}}=O(n+h^{2+\epsilon})$ for any constant $\epsilon>0$.
Further, we extend our techniques to the weighted rectilinear version in which
the "obstacles" of $\mathcal{P}$ are rectilinear regions with "weights" and
allow $L_1$ paths to travel through them with weighted costs. Our algorithm
answers each query in $O(\log n+k)$ time with a data structure of size
$O(n^2\cdot \log n\cdot 4^{\sqrt{\log n}})$ that is built in $O(n^2\cdot
\log^{2} n\cdot 4^{\sqrt{\log n}})$ time (note that $n^2\cdot \log^{2} n\cdot
4^{\sqrt{\log n}}= O(n^{2+\epsilon})$ for any constant $\epsilon>0$).

We consider the problem of finding k centers for n weighted points on a real
line. This (weighted) kcenter problem was solved in O(n log n) time previously
by using Cole's parametric search and other complicated approaches. In this
paper, we present an easier O(n log n) time algorithm that avoids the
parametric search, and in certain special cases our algorithm solves the
problem in O(n) time. In addition, our techniques involve developing
interesting data structures for processing queries that find a lowest point in
the common intersection of a certain subset of halfplanes. This subproblem is
interesting in its own right and our solution for it may find other
applications as well.

In the classic $k$center problem, we are given a metric graph, and the
objective is to open $k$ nodes as centers such that the maximum distance from
any vertex to its closest center is minimized. In this paper, we consider two
important generalizations of $k$center, the matroid center problem and the
knapsack center problem. Both problems are motivated by recent content
distribution network applications. Our contributions can be summarized as
follows:
1. We consider the matroid center problem in which the centers are required
to form an independent set of a given matroid. We show this problem is NPhard
even on a line. We present a 3approximation algorithm for the problem on
general metrics. We also consider the outlier version of the problem where a
given number of vertices can be excluded as the outliers from the solution. We
present a 7approximation for the outlier version.
2. We consider the (multi)knapsack center problem in which the centers are
required to satisfy one (or more) knapsack constraint(s). It is known that the
knapsack center problem with a single knapsack constraint admits a
3approximation. However, when there are at least two knapsack constraints, we
show this problem is not approximable at all. To complement the hardness
result, we present a polynomial time algorithm that gives a 3approximate
solution such that one knapsack constraint is satisfied and the others may be
violated by at most a factor of $1+\epsilon$. We also obtain a 3approximation
for the outlier version that may violate the knapsack constraint by
$1+\epsilon$.

Given a simple polygon P in the plane, we present new algorithms and data
structures for computing the weak visibility polygon from any query line
segment in P. We build a data structure in O(n) time and O(n) space that can
compute the visibility polygon for any query line segment s in O(k log n) time,
where k is the size of the visibility polygon of s and n is the number of
vertices of P. Alternatively, we build a data structure in O(n^3) time and
O(n^3) space that can compute the visibility polygon for any query line segment
in O(k + log n) time.

Given a point $s$ and a set of $h$ pairwise disjoint polygonal obstacles of
totally $n$ vertices in the plane, we present a new algorithm for building an
$L_1$ shortest path map of size O(n) in $O(T)$ time and O(n) space such that
for any query point $t$, the length of the $L_1$ shortest obstacleavoiding
path from $s$ to $t$ can be reported in $O(\log n)$ time and the actual
shortest path can be found in additional time proportional to the number of
edges of the path, where $T$ is the time for triangulating the free space. It
is currently known that $T=O(n+h\log^{1+\epsilon}h)$ for an arbitrarily small
constant $\epsilon>0$. If the triangulation can be done optimally (i.e.,
$T=O(n+h\log h)$), then our algorithm is optimal. Previously, the best
algorithm computes such an $L_1$ shortest path map in $O(n\log n)$ time and
O(n) space. Our techniques can be extended to obtain improved results for other
related problems, e.g., computing the $L_1$ geodesic Voronoi diagram for a set
of point sites in a polygonal domain, finding shortest paths with fixed
orientations, finding approximate Euclidean shortest paths, etc.

Given $n$ points in a circular region $C$ in the plane, we study the problems
of moving the $n$ points to its boundary to form a regular $n$gon such that
the maximum (minmax) or the sum (minsum) of the Euclidean distances traveled
by the points is minimized. The problems have applications, e.g., in mobile
sensor barrier coverage of wireless sensor networks. The minmax problem
further has two versions: the decision version and optimization version. For
the minmax problem, we present an $O(n\log^2 n)$ time algorithm for the
decision version and an $O(n\log^3 n)$ time algorithm for the optimization
version. The previously best algorithms for the two problem versions take
$O(n^{3.5})$ time and $O(n^{3.5}\log n)$ time, respectively. For the minsum
problem, we show that a special case with all points initially lying on the
boundary of the circular region can be solved in $O(n^2)$ time, improving a
previous $O(n^4)$ time solution. For the general minsum problem, we present a
3approximation $O(n^2)$ time algorithm, improving the previous
$(1+\pi)$approximation $O(n^2)$ time algorithm. A byproduct of our techniques
is an algorithm for dynamically maintaining the maximum matching of a circular
convex bipartite graph; our algorithm can handle each vertex insertion or
deletion on the graph in $O(\log^2 n)$ time. This result is interesting in its
own right.

In this paper, we study the following problem of reconstructing a simple
polygon: Given a cyclically ordered vertex sequence of an unknown simple
polygon P of n vertices and, for each vertex v of P, the sequence of angles
defined by all the visible vertices of v in P, reconstruct the polygon P (up to
similarity). An O(n^3 log n) time algorithm has been proposed for this problem.
We present an improved algorithm with running time O(n^2), based on new
observations on the geometric structures of the problem. Since the input size
(i.e., the total number of input visibility angles) is O(n^2) in the worst
case, our algorithm is worstcase optimal.