
In physics, biology and engineering, network systems abound. How does the
connectivity of a network system combine with the behavior of its individual
components to determine its collective function? We approach this question for
networks with linear time-invariant dynamics by relating internal network
feedbacks to the statistical prevalence of connectivity motifs, a set of
surprisingly simple and local statistics of connectivity. This results in a
reduced-order model of the network input-output dynamics in terms of motif
structures. As an example, the new formulation dramatically simplifies the
classic Erdős-Rényi graph, reducing the overall network behavior to one
proportional feedback wrapped around the dynamics of a single node. For general
networks, higher-order motifs systematically provide further layers and types
of feedback to regulate the network response. Thus, the local connectivity
shapes temporal and spectral processing by the network as a whole, and we show
how this enables robust, yet tunable, functionality such as extending the time
constant with which networks remember past signals. The theory also extends to
networks composed from heterogeneous nodes with distinct dynamics and
connectivity, and patterned input to (and readout from) subsets of nodes. These
statistical descriptions provide a powerful theoretical framework to understand
the functionality of real-world network systems, as we illustrate with examples
including the mouse brain connectome.
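
The Erdős-Rényi reduction can be illustrated numerically. The sketch below is
not the paper's derivation: it assumes each node has a scalar gain h at a single
frequency, that the network obeys x = h (W x + u), and that input and readout
are uniform over nodes. The function names and parameter values are
hypothetical. Under those assumptions, the full network gain should lie close to
that of one node wrapped in proportional feedback of strength n * p * w.

```python
import numpy as np

def mean_response_full(W, h):
    """Gain from uniform input to uniform readout: h * e^T (I - h W)^{-1} e."""
    n = W.shape[0]
    e = np.ones(n) / np.sqrt(n)  # normalized uniform input/readout vector
    return h * e @ np.linalg.solve(np.eye(n) - h * W, e)

def mean_response_reduced(n, p, w, h):
    """One node with gain h inside a proportional feedback loop of strength n*p*w."""
    return h / (1.0 - h * n * p * w)

rng = np.random.default_rng(0)
n, p = 400, 0.1                      # network size and connection probability (assumed)
w = 0.5 / (n * p)                    # weight chosen so the loop gain n*p*w is 0.5
W = w * (rng.random((n, n)) < p)     # Erdős-Rényi connectivity with uniform weight w
h = 1.0                              # e.g. the DC gain of a node h(s) = 1 / (1 + s)

full = mean_response_full(W, h)
reduced = mean_response_reduced(n, p, w, h)
```

Comparing `full` with `reduced` for other values of `p` or `w` (keeping
h * n * p * w below 1) gives a quick check of how well the single-loop picture
holds for a given random draw.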

With pervasive applications of medical imaging in healthcare, biomedical
image segmentation plays a central role in quantitative analysis, clinical
diagnosis, and medical intervention. Since manual annotation suffers from
limited reproducibility, arduous effort, and excessive time, automatic
segmentation is desired to process increasingly large-scale histopathological
data. Recently, deep neural networks (DNNs), particularly fully convolutional
networks (FCNs), have been widely applied to biomedical image segmentation,
attaining much improved performance. At the same time, quantization of DNNs has
become an active research topic, which aims to represent weights with less
memory (precision) to considerably reduce memory and computation requirements
of DNNs while maintaining acceptable accuracy. In this paper, we apply
quantization techniques to FCNs for accurate biomedical image segmentation.
Unlike existing literature on quantization, which primarily targets memory and
computation complexity reduction, we apply quantization as a method to reduce
overfitting in FCNs for better accuracy. Specifically, we focus on a
state-of-the-art segmentation framework, suggestive annotation [22], which
judiciously extracts representative annotation samples from the original
training dataset, obtaining an effective small-sized balanced training dataset.
We develop two new quantization processes for this framework: (1) suggestive
annotation with quantization for highly representative training samples, and
(2) network training with quantization for high accuracy. Extensive experiments
on the MICCAI Gland dataset show that both quantization processes can improve
the segmentation performance, and our proposed method exceeds the current
state-of-the-art performance by up to 1%. In addition, our method reduces
memory usage by up to 6.4x.
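
To make "representing weights with less precision" concrete, here is a minimal
sketch of uniform weight quantization. It is a generic scheme, not necessarily
the one used in the paper; the function name `quantize_uniform`, the 5-bit
width, and the random weight tensor are all illustrative assumptions.

```python
import numpy as np

def quantize_uniform(weights, num_bits):
    """Round weights onto 2**num_bits evenly spaced levels spanning their range."""
    w_min, w_max = float(weights.min()), float(weights.max())
    if w_max == w_min:
        return weights.copy()            # constant tensor: nothing to quantize
    step = (w_max - w_min) / (2 ** num_bits - 1)
    # Map each weight to the index of its nearest level, then back to a value.
    return w_min + np.round((weights - w_min) / step) * step

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for a layer's weights
quantized = quantize_uniform(weights, num_bits=5)
```

After quantization each weight can be stored as a `num_bits`-bit level index
plus two shared floats (`w_min` and `step`), which is where the memory saving
comes from.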

An essential step toward understanding neural circuits is linking their
structure and their dynamics. In general, this relationship can be almost
arbitrarily complex. Recent theoretical work has, however, begun to identify
some broad principles underlying collective spiking activity in neural
circuits. The first is that local features of network connectivity can be
surprisingly effective in predicting global statistics of activity across a
network. The second is that, for the important case of large networks with
excitatory-inhibitory balance, correlated spiking persists or vanishes
depending on the spatial scales of recurrent and feedforward connectivity. We
close by showing how these ideas, together with plasticity rules, can help to
close the loop between network structure and activity statistics.

In this paper, we propose commonsense knowledge enhanced embeddings (KEE) for
solving the Pronoun Disambiguation Problems (PDP). The PDP task we investigate
in this paper is a complex coreference resolution task which requires the
utilization of commonsense knowledge. This task is a standard first-round test
set in the 2016 Winograd Schema Challenge. In this task, traditional linguistic
features that are useful for coreference resolution, e.g. context and gender
information, are no longer effective. Therefore, the KEE models are
proposed to provide a general framework to make use of commonsense knowledge
for solving the PDP task. Since the PDP task doesn't have training data,
the KEE models would be used during the unsupervised feature extraction
process. To evaluate the effectiveness of the KEE models, we propose to
incorporate various commonsense knowledge bases, including ConceptNet, WordNet,
and CauseCom, into the KEE training process. We achieved the best performance
by applying the proposed methods to the 2016 Winograd Schema Challenge. In
addition, experiments conducted on the standard PDP task indicate that the
proposed KEE models achieve 66.7% accuracy, which is a new state-of-the-art
result.

In this paper, we propose a new deep learning approach, called neural
association model (NAM), for probabilistic reasoning in artificial
intelligence. We propose to use neural networks to model association between
any two events in a domain. Neural networks take one event as input and compute
a conditional probability of the other event to model how likely these two
events are to be associated. The actual meaning of the conditional
probabilities varies between applications and depends on how the models are
trained. In this work, as two case studies, we have investigated two NAM
structures, namely deep neural networks (DNN) and relation-modulated neural
nets (RMNN), on several probabilistic reasoning tasks in AI, including
recognizing textual entailment, triple classification in multi-relational
knowledge bases and commonsense reasoning. Experimental results on several
popular datasets derived from WordNet, FreeBase and ConceptNet have all
demonstrated that both DNNs and RMNNs perform equally well and they can
significantly outperform the conventional methods available for these reasoning
tasks. Moreover, compared with DNNs, RMNNs are superior in knowledge transfer,
where a pretrained model can be quickly extended to an unseen relation after
observing only a few training samples. To further prove the effectiveness of
the proposed models, in this work, we have applied NAMs to solving challenging
Winograd Schema (WS) problems. Experiments conducted on a set of WS problems
prove that the proposed models have the potential for commonsense reasoning.

This paper proposes a model to learn word embeddings with weighted contexts
based on part-of-speech (POS) relevance weights. POS is a fundamental element
in natural language. However, state-of-the-art word embedding models fail to
consider it. This paper proposes to use position-dependent POS relevance
weighting matrices to model the inherent syntactic relationship among words
within a context window. We utilize the POS relevance weights to model each
word-context pair during the word embedding training process. The model
proposed in this paper jointly optimizes word vectors and the POS
relevance matrices. Experiments conducted on popular word analogy and word
similarity tasks all demonstrated the effectiveness of the proposed method.

In this paper, we propose a novel neural network structure, namely
\emph{feedforward sequential memory networks (FSMN)}, to model long-term
dependency in time series without using recurrent feedback. The proposed FSMN
is a standard fully connected feedforward neural network equipped with some
learnable memory blocks in its hidden layers. The memory blocks use a
tapped-delay line structure to encode the long context information into a
fixed-size representation as a short-term memory mechanism. We have evaluated the
proposed FSMNs in several standard benchmark tasks, including speech
recognition and language modelling. Experimental results have shown FSMNs
significantly outperform the conventional recurrent neural networks (RNN),
including LSTMs, in modeling sequential signals like speech or language.
Moreover, FSMNs can be learned much more reliably and faster than RNNs or LSTMs
due to the inherent non-recurrent model structure.
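
The tapped-delay-line idea can be sketched in a few lines. The abstract does not
give the exact formula, so the form below (a per-dimension weighted sum of the
current and previous hidden activations, with zero padding before the start of
the sequence) is an assumption, and `memory_block` and `taps` are hypothetical
names.

```python
import numpy as np

def memory_block(hidden, taps):
    """Tapped-delay-line memory: out[t] = sum_i taps[i] * hidden[t - i].

    hidden: (T, D) array of hidden-layer activations over T time steps.
    taps:   (N + 1, D) array of learnable coefficients, one row per delay.
    Frames before the start of the sequence are treated as zeros.
    """
    T = hidden.shape[0]
    out = np.zeros_like(hidden, dtype=float)
    for i, a in enumerate(taps):
        if i >= T:
            break                        # delay longer than the sequence
        out[i:] += a * hidden[:T - i]    # contribution of the frame i steps back
    return out

hidden = np.arange(8, dtype=float).reshape(4, 2)   # toy activations, T=4, D=2
taps = np.array([[1.0, 1.0], [0.5, 0.5]])          # current frame plus one delay
encoded = memory_block(hidden, taps)
```

Because the output at time t depends only on a fixed window of past
activations, the whole block is a feedforward computation and needs no
recurrent feedback during training.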

Over repeat presentations of the same stimulus, sensory neurons show variable
responses. This "noise" is typically correlated between pairs of cells, and a
question with rich history in neuroscience is how these noise correlations
impact the population's ability to encode the stimulus. Here, we consider a
very general setting for population coding, investigating how information
varies as a function of noise correlations, with all other aspects of the
problem (neural tuning curves, etc.) held fixed. This work yields unifying
insights into the role of noise correlations. These are summarized in the form
of theorems, and illustrated with numerical examples involving neurons with
diverse tuning curves. Our main contributions are as follows.
(1) We generalize previous results to prove a sign rule (SR): if noise
correlations between pairs of neurons have opposite signs vs. their signal
correlations, then coding performance will improve compared to the independent
case. This holds for three different metrics of coding performance, and for
arbitrary tuning curves and levels of heterogeneity. This generality is true
for our other results as well.
(2) As also pointed out in the literature, the SR does not provide a
necessary condition for good coding. We show that a diverse set of correlation
structures can improve coding. Many of these violate the SR, as do
experimentally observed correlations. There is structure to this diversity: we
prove that the optimal correlation structures must lie on boundaries of the
possible set of noise correlations.
(3) We provide a novel set of necessary and sufficient conditions, under
which the coding performance (in the presence of noise) will be as good as it
would be if there were no noise present at all.
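
The sign rule in (1) can be checked on a two-neuron toy example. The sketch
below uses linear Fisher information, I = f'^T C^{-1} f', as the coding metric.
That choice is an assumption made for illustration (the abstract mentions three
metrics without naming them here), and the slopes and correlation values are
arbitrary.

```python
import numpy as np

def linear_fisher_information(tuning_slopes, noise_cov):
    """I = f'^T C^{-1} f' for tuning-curve slopes f' and noise covariance C."""
    return float(tuning_slopes @ np.linalg.solve(noise_cov, tuning_slopes))

def two_neuron_cov(rho):
    """Unit-variance noise covariance with noise correlation rho."""
    return np.array([[1.0, rho], [rho, 1.0]])

# Slopes of the same sign, so the signal correlation is positive.
slopes = np.array([1.0, 1.0])

info_independent = linear_fisher_information(slopes, two_neuron_cov(0.0))
info_opposite_sign = linear_fisher_information(slopes, two_neuron_cov(-0.3))
info_same_sign = linear_fisher_information(slopes, two_neuron_cov(0.3))
```

For this pair the information equals 2 / (1 + rho), so a noise correlation whose
sign is opposite to the signal correlation raises it above the independent
case, while a same-sign noise correlation lowers it.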

How does connectivity impact network dynamics? We address this question by
linking network characteristics on two scales. On the global scale we consider
the coherence of overall network dynamics. We show that such \emph{global
coherence} in activity can often be predicted from the \emph{local structure}
of the network. To characterize local network structure we use "motif
cumulants," a measure of the deviation of pathway counts from those expected in
a minimal probabilistic network model.
We extend previous results in three ways. First, we give a new combinatorial
formulation of motif cumulants that relates to the allied concept in
probability theory. Second, we show that the link between global network
dynamics and local network architecture is strongly affected by heterogeneity
in network connectivity. However, we introduce a network-partitioning method
that recovers a tight relationship between architecture and dynamics. Third,
for a particular set of models we generalize the underlying theory to treat
dynamical coherence at arbitrary orders (i.e. triplet correlations, and
beyond). We show that at any order only a highly restricted set of motifs
impacts dynamical correlations.

Emerging technologies are revealing the spiking activity in ever larger
neural ensembles. Frequently, this spiking is far from independent, with
correlations in the spike times of different cells. Understanding how such
correlations impact the dynamics and function of neural ensembles remains an
important open problem. Here we describe a new, generative model for correlated
spike trains that can exhibit many of the features observed in data. Extending
prior work in mathematical finance, this generalized thinning and shift (GTaS)
model creates marginally Poisson spike trains with diverse temporal correlation
structures. We give several examples which highlight the model's flexibility
and utility. For instance, we use it to examine how a neural network responds
to highly structured patterns of inputs. We then show that the GTaS model is
analytically tractable, and derive cumulant densities of all orders in terms of
model parameters. The GTaS framework can therefore be an important tool in the
experimental and theoretical exploration of neural dynamics.
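
The thinning-and-shift construction can be sketched for two spike trains. This
is a simplified special case written for illustration, not the full GTaS model:
it assumes independent thinning per train with a single keep probability and
exponentially distributed shifts, and all names and parameter values are
hypothetical.

```python
import numpy as np

def thin_and_shift(rate, duration, p_keep, shift_scale, rng):
    """Two correlated spike trains derived from one Poisson 'mother' process."""
    n_events = rng.poisson(rate * duration)
    mother = rng.uniform(0.0, duration, size=n_events)   # mother event times
    trains = []
    for _ in range(2):
        keep = rng.random(n_events) < p_keep                      # thinning step
        shifts = rng.exponential(shift_scale, size=keep.sum())    # shifting step
        trains.append(np.sort(mother[keep] + shifts))
    return trains

rng = np.random.default_rng(0)
train_a, train_b = thin_and_shift(rate=20.0, duration=200.0,
                                  p_keep=0.5, shift_scale=0.01, rng=rng)
```

Each train has rate close to `rate * p_keep`, and mother events kept by both
trains produce near-coincident spikes whose relative timing is set by the shift
distribution.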

Motifs are patterns of subgraphs of complex networks. We studied the impact
of such patterns of connectivity on the level of correlated, or synchronized,
spiking activity among pairs of cells in a recurrent network model of
integrate-and-fire neurons. For a range of network architectures, we find that the
pairwise correlation coefficients, averaged across the network, can be closely
approximated using only three statistics of network connectivity. These are the
overall network connection probability and the frequencies of two second-order
motifs: diverging motifs, in which one cell provides input to two others, and
chain motifs, in which two cells are connected via a third intermediary cell.
Specifically, the prevalence of diverging and chain motifs tends to increase
correlation. Our method is based on linear response theory, which enables us to
express spiking statistics using linear algebra, and a resumming technique,
which extrapolates from second-order motifs to predict the overall effect of
coupling on network correlation. Our motif-based results seek to isolate the
effect of network architecture perturbatively from a known network state.
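
The three connectivity statistics can be computed directly from an adjacency
matrix. The normalization below (motif counts divided by n^3, minus the value
p^2 expected for independent connections) is an assumption made for this
sketch, and it does not exclude self-loops or repeated cells. The function name
is hypothetical.

```python
import numpy as np

def motif_statistics(A):
    """Connection probability and second-order motif frequencies of a 0/1 matrix.

    A[i, j] = 1 means cell j projects to cell i.
    Returns (p, q_div, q_chain), where the q values are motif frequencies in
    excess of the p**2 expected for independent connections.
    """
    n = A.shape[0]
    p = A.sum() / n**2
    in_deg = A.sum(axis=1)    # number of inputs each cell receives
    out_deg = A.sum(axis=0)   # number of outputs each cell sends
    q_div = (out_deg @ out_deg) / n**3 - p**2    # one cell drives two others
    q_chain = (in_deg @ out_deg) / n**3 - p**2   # two cells linked via a third
    return p, q_div, q_chain

# Toy example: a hub (cell 0) projecting to the three other cells.
star = np.zeros((4, 4))
star[1:, 0] = 1.0
p_star, q_div_star, q_chain_star = motif_statistics(star)
```

In the hub network diverging motifs are overrepresented and chains are absent,
so `q_div_star` is positive and `q_chain_star` is negative. A fully connected
matrix gives zero for both.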

Novel experimental techniques reveal the simultaneous activity of larger and
larger numbers of neurons. As a result there is increasing interest in the
structure of cooperative (or correlated) activity in neural populations,
and in the possible impact of such correlations on the neural code. A
fundamental theoretical challenge is to understand how the architecture of
network connectivity along with the dynamical properties of single cells shape
the magnitude and timescale of correlations. We provide a general approach to
this problem by extending prior techniques based on linear response theory. We
consider networks of general integrate-and-fire cells with arbitrary
architecture, and provide explicit expressions for the approximate
cross-correlation between constituent cells. These correlations depend strongly
on the operating point (input mean and variance) of the neurons, even when
connectivity is fixed. Moreover, the approximations admit an expansion in
powers of the matrices that describe the network architecture. This expansion
can be readily interpreted in terms of paths between different cells. We apply
our results to large excitatory-inhibitory networks, and demonstrate first how
precise balance (or lack thereof) between the strengths and timescales of
excitatory and inhibitory synapses is reflected in the overall correlation
structure of the network. We then derive explicit expressions for the average
correlation structure in randomly connected networks. These expressions help to
identify the important factors that shape coordinated neural activity in such
networks.
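
The expansion in powers of the connectivity matrices can be illustrated with a
generic matrix series. The sketch assumes the relevant quantity has the form
(I - K)^{-1} for some interaction matrix K with spectral radius below 1, so
that it equals I + K + K^2 + ..., where the n-th term collects paths of length
n between cells. The matrix K here is random and purely hypothetical.

```python
import numpy as np

def path_expansion(K, order):
    """Partial sum I + K + K^2 + ... + K^order approximating (I - K)^{-1}."""
    n = K.shape[0]
    total = np.eye(n)
    term = np.eye(n)
    for _ in range(order):
        term = term @ K      # next power of K: paths that are one step longer
        total += term
    return total

rng = np.random.default_rng(0)
n = 50
K = 0.05 * rng.standard_normal((n, n))   # hypothetical weak interaction matrix
exact = np.linalg.inv(np.eye(n) - K)
errors = [np.abs(path_expansion(K, m) - exact).max() for m in (1, 2, 4, 8)]
```

The entries of `errors` shrink as longer paths are included, which is why
truncating the series at low order can already capture most of the effect of
coupling when interactions are weak.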