• In physics, biology and engineering, network systems abound. How does the connectivity of a network system combine with the behavior of its individual components to determine its collective function? We approach this question for networks with linear time-invariant dynamics by relating internal network feedbacks to the statistical prevalence of connectivity motifs, a set of surprisingly simple and local statistics of connectivity. This results in a reduced-order model of the network input-output dynamics in terms of motif structures. As an example, the new formulation dramatically simplifies the classic Erdős-Rényi graph, reducing the overall network behavior to one proportional feedback wrapped around the dynamics of a single node. For general networks, higher-order motifs systematically provide further layers and types of feedback to regulate the network response. Thus, the local connectivity shapes temporal and spectral processing by the network as a whole, and we show how this enables robust, yet tunable, functionality such as extending the time constant with which networks remember past signals. The theory also extends to networks composed of heterogeneous nodes with distinct dynamics and connectivity, and patterned input to (and readout from) subsets of nodes. These statistical descriptions provide a powerful theoretical framework to understand the functionality of real-world network systems, as we illustrate with examples including the mouse brain connectome.
  • With pervasive applications of medical imaging in health-care, biomedical image segmentation plays a central role in quantitative analysis, clinical diagnosis, and medical intervention. Since manual annotation suffers limited reproducibility, arduous efforts, and excessive time, automatic segmentation is desired to process increasingly larger scale histopathological data. Recently, deep neural networks (DNNs), particularly fully convolutional networks (FCNs), have been widely applied to biomedical image segmentation, attaining much improved performance. At the same time, quantization of DNNs has become an active research topic, which aims to represent weights with less memory (precision) to considerably reduce memory and computation requirements of DNNs while maintaining acceptable accuracy. In this paper, we apply quantization techniques to FCNs for accurate biomedical image segmentation. Unlike existing literature on quantization which primarily targets memory and computation complexity reduction, we apply quantization as a method to reduce overfitting in FCNs for better accuracy. Specifically, we focus on a state-of-the-art segmentation framework, suggestive annotation [22], which judiciously extracts representative annotation samples from the original training dataset, obtaining an effective small-sized balanced training dataset. We develop two new quantization processes for this framework: (1) suggestive annotation with quantization for highly representative training samples, and (2) network training with quantization for high accuracy. Extensive experiments on the MICCAI Gland dataset show that both quantization processes can improve the segmentation performance, and our proposed method exceeds the current state-of-the-art performance by up to 1%. In addition, our method has a reduction of up to 6.4x on memory usage.
  • An essential step toward understanding neural circuits is linking their structure and their dynamics. In general, this relationship can be almost arbitrarily complex. Recent theoretical work has, however, begun to identify some broad principles underlying collective spiking activity in neural circuits. The first is that local features of network connectivity can be surprisingly effective in predicting global statistics of activity across a network. The second is that, for the important case of large networks with excitatory-inhibitory balance, correlated spiking persists or vanishes depending on the spatial scales of recurrent and feedforward connectivity. We close by showing how these ideas, together with plasticity rules, can help to close the loop between network structure and activity statistics.
  • In this paper, we propose commonsense knowledge enhanced embeddings (KEE) for solving the Pronoun Disambiguation Problems (PDP). The PDP task we investigate in this paper is a complex coreference resolution task which requires the utilization of commonsense knowledge. This task is a standard first round test set in the 2016 Winograd Schema Challenge. In this task, traditional linguistic features that are useful for coreference resolution, e.g. context and gender information, are no longer effective. Therefore, the KEE models are proposed to provide a general framework to make use of commonsense knowledge for solving the PDP task. Since the PDP task has no training data, the KEE models are used during the unsupervised feature extraction process. To evaluate the effectiveness of the KEE models, we propose to incorporate various commonsense knowledge bases, including ConceptNet, WordNet, and CauseCom, into the KEE training process. We achieved the best performance by applying the proposed methods to the 2016 Winograd Schema Challenge. In addition, experiments conducted on the standard PDP task indicate that the proposed KEE models solve the PDP task with 66.7% accuracy, which is a new state-of-the-art performance.
  • In this paper, we propose a new deep learning approach, called neural association model (NAM), for probabilistic reasoning in artificial intelligence. We propose to use neural networks to model association between any two events in a domain. Neural networks take one event as input and compute a conditional probability of the other event to model how likely these two events are to be associated. The actual meaning of the conditional probabilities varies between applications and depends on how the models are trained. In this work, as two case studies, we have investigated two NAM structures, namely deep neural networks (DNN) and relation-modulated neural nets (RMNN), on several probabilistic reasoning tasks in AI, including recognizing textual entailment, triple classification in multi-relational knowledge bases and commonsense reasoning. Experimental results on several popular datasets derived from WordNet, FreeBase and ConceptNet have all demonstrated that both DNNs and RMNNs perform equally well and they can significantly outperform the conventional methods available for these reasoning tasks. Moreover, compared with DNNs, RMNNs are superior in knowledge transfer, where a pre-trained model can be quickly extended to an unseen relation after observing only a few training samples. To further prove the effectiveness of the proposed models, in this work, we have applied NAMs to solving challenging Winograd Schema (WS) problems. Experiments conducted on a set of WS problems prove that the proposed models have the potential for commonsense reasoning.
  • This paper proposes a model to learn word embeddings with weighted contexts based on part-of-speech (POS) relevance weights. POS is a fundamental element in natural language. However, state-of-the-art word embedding models fail to consider it. This paper proposes to use position-dependent POS relevance weighting matrices to model the inherent syntactic relationship among words within a context window. We utilize the POS relevance weights to model each word-context pair during the word embedding training process. The model proposed in this paper jointly optimizes word vectors and the POS relevance matrices. Experiments conducted on popular word analogy and word similarity tasks all demonstrated the effectiveness of the proposed method.
  • In this paper, we propose a novel neural network structure, namely \emph{feedforward sequential memory networks (FSMN)}, to model long-term dependency in time series without using recurrent feedback. The proposed FSMN is a standard fully-connected feedforward neural network equipped with some learnable memory blocks in its hidden layers. The memory blocks use a tapped-delay line structure to encode the long context information into a fixed-size representation as a short-term memory mechanism. We have evaluated the proposed FSMNs in several standard benchmark tasks, including speech recognition and language modelling. Experimental results have shown that FSMNs significantly outperform the conventional recurrent neural networks (RNN), including LSTMs, in modeling sequential signals like speech or language. Moreover, FSMNs can be learned much more reliably and faster than RNNs or LSTMs due to the inherent non-recurrent model structure.
  • Over repeat presentations of the same stimulus, sensory neurons show variable responses. This "noise" is typically correlated between pairs of cells, and a question with rich history in neuroscience is how these noise correlations impact the population's ability to encode the stimulus. Here, we consider a very general setting for population coding, investigating how information varies as a function of noise correlations, with all other aspects of the problem - neural tuning curves, etc. - held fixed. This work yields unifying insights into the role of noise correlations. These are summarized in the form of theorems, and illustrated with numerical examples involving neurons with diverse tuning curves. Our main contributions are as follows. (1) We generalize previous results to prove a sign rule (SR) - if noise correlations between pairs of neurons have opposite signs vs. their signal correlations, then coding performance will improve compared to the independent case. This holds for three different metrics of coding performance, and for arbitrary tuning curves and levels of heterogeneity. This generality is true for our other results as well. (2) As also pointed out in the literature, the SR does not provide a necessary condition for good coding. We show that a diverse set of correlation structures can improve coding. Many of these violate the SR, as do experimentally observed correlations. There is structure to this diversity: we prove that the optimal correlation structures must lie on boundaries of the possible set of noise correlations. (3) We provide a novel set of necessary and sufficient conditions, under which the coding performance (in the presence of noise) will be as good as it would be if there were no noise present at all.
  • How does connectivity impact network dynamics? We address this question by linking network characteristics on two scales. On the global scale we consider the coherence of overall network dynamics. We show that such \emph{global coherence} in activity can often be predicted from the \emph{local structure} of the network. To characterize local network structure we use "motif cumulants," a measure of the deviation of pathway counts from those expected in a minimal probabilistic network model. We extend previous results in three ways. First, we give a new combinatorial formulation of motif cumulants that relates to the allied concept in probability theory. Second, we show that the link between global network dynamics and local network architecture is strongly affected by heterogeneity in network connectivity. However, we introduce a network-partitioning method that recovers a tight relationship between architecture and dynamics. Third, for a particular set of models we generalize the underlying theory to treat dynamical coherence at arbitrary orders (i.e. triplet correlations, and beyond). We show that at any order only a highly restricted set of motifs impact dynamical correlations.
  • Emerging technologies are revealing the spiking activity in ever larger neural ensembles. Frequently, this spiking is far from independent, with correlations in the spike times of different cells. Understanding how such correlations impact the dynamics and function of neural ensembles remains an important open problem. Here we describe a new, generative model for correlated spike trains that can exhibit many of the features observed in data. Extending prior work in mathematical finance, this generalized thinning and shift (GTaS) model creates marginally Poisson spike trains with diverse temporal correlation structures. We give several examples which highlight the model's flexibility and utility. For instance, we use it to examine how a neural network responds to highly structured patterns of inputs. We then show that the GTaS model is analytically tractable, and derive cumulant densities of all orders in terms of model parameters. The GTaS framework can therefore be an important tool in the experimental and theoretical exploration of neural dynamics.
  • Motifs are patterns of subgraphs of complex networks. We studied the impact of such patterns of connectivity on the level of correlated, or synchronized, spiking activity among pairs of cells in a recurrent network model of integrate and fire neurons. For a range of network architectures, we find that the pairwise correlation coefficients, averaged across the network, can be closely approximated using only three statistics of network connectivity. These are the overall network connection probability and the frequencies of two second-order motifs: diverging motifs, in which one cell provides input to two others, and chain motifs, in which two cells are connected via a third intermediary cell. Specifically, the prevalence of diverging and chain motifs tends to increase correlation. Our method is based on linear response theory, which enables us to express spiking statistics using linear algebra, and a resumming technique, which extrapolates from second order motifs to predict the overall effect of coupling on network correlation. Our motif-based results seek to isolate the effect of network architecture perturbatively from a known network state.
  • Novel experimental techniques reveal the simultaneous activity of larger and larger numbers of neurons. As a result there is increasing interest in the structure of cooperative -- or correlated -- activity in neural populations, and in the possible impact of such correlations on the neural code. A fundamental theoretical challenge is to understand how the architecture of network connectivity along with the dynamical properties of single cells shape the magnitude and timescale of correlations. We provide a general approach to this problem by extending prior techniques based on linear response theory. We consider networks of general integrate-and-fire cells with arbitrary architecture, and provide explicit expressions for the approximate cross-correlation between constituent cells. These correlations depend strongly on the operating point (input mean and variance) of the neurons, even when connectivity is fixed. Moreover, the approximations admit an expansion in powers of the matrices that describe the network architecture. This expansion can be readily interpreted in terms of paths between different cells. We apply our results to large excitatory-inhibitory networks, and demonstrate first how precise balance --- or lack thereof --- between the strengths and timescales of excitatory and inhibitory synapses is reflected in the overall correlation structure of the network. We then derive explicit expressions for the average correlation structure in randomly connected networks. These expressions help to identify the important factors that shape coordinated neural activity in such networks.
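
The last abstract mentions an expansion in powers of the matrices that describe the network architecture, interpreted in terms of paths between cells. As a rough illustration only, and not the authors' formulation, the sketch below uses a generic linear coupling matrix `W` (a hypothetical random network, with an assumed connection probability and scaling) and the standard Neumann series (I - W)^{-1} = I + W + W^2 + ..., where entry (i, j) of W^k sums the weights of all k-step paths from cell j to cell i. The function name `path_expansion` and all parameter values are assumptions made for the example.

```python
import numpy as np

def path_expansion(W, order):
    """Partial sum I + W + ... + W^order of the Neumann series for (I - W)^{-1}."""
    n = W.shape[0]
    total = np.eye(n)
    term = np.eye(n)
    for _ in range(order):
        term = term @ W          # term now holds the next power of W
        total = total + term
    return total

rng = np.random.default_rng(0)
n, p = 50, 0.1                   # hypothetical network size and connection probability

# Hypothetical random adjacency matrix: A[i, j] = 1 if cell j projects to cell i.
A = (rng.random((n, n)) < p).astype(float)
np.fill_diagonal(A, 0.0)

# Scale the coupling so the spectral radius is at most 0.5 and the series converges.
spectral_radius = np.abs(np.linalg.eigvals(A)).max()
W = 0.5 * A / max(1.0, spectral_radius)

# Entry (i, j) of A @ A counts the two-step paths j -> k -> i.
two_step_paths = sum(A[0, k] * A[k, 1] for k in range(n))

# Compare truncated path expansions against the exact matrix inverse.
exact = np.linalg.inv(np.eye(n) - W)
orders = (1, 2, 4, 8, 16)
errors = [np.abs(path_expansion(W, k) - exact).max() for k in orders]

for k, err in zip(orders, errors):
    print(f"paths up to length {k:2d}: max abs error = {err:.2e}")
```

Because the entries of `W` are nonnegative here, each added order can only shrink the truncation error, so the printout shows how much of the full network interaction is captured by short paths alone. In the abstract's setting the matrices also carry single-cell response properties, which this sketch leaves out.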