• Inspired by applications to theories of coding and communication in networks of nervous tissue, we study maximum entropy distributions on weighted graphs with a given expected degree sequence. These distributions are characterized by independent edge weights parameterized by a shared vector of vertex potentials. Using the general theory of exponential family distributions, we derive the existence and uniqueness of the maximum likelihood estimator (MLE) of the vertex parameters. We also prove consistency of the MLE from a single sample in the limit of large graphs, extending results of Chatterjee, Diaconis, and Sly in the unweighted case (the "beta-model" in statistics). Interestingly, our proofs require tight estimates on the norms of inverses of symmetric, diagonally dominant positive matrices. Along the way, we derive analogues of the Erdos-Gallai criterion of graphical degree sequences for weighted graphs.
  • Maximum entropy models are increasingly being used to describe the collective activity of neural populations with measured mean neural activities and pairwise correlations, but the full space of probability distributions consistent with these constraints has not been explored. We provide upper and lower bounds on the entropy for the {\em minimum} entropy distribution over arbitrarily large collections of binary units with any fixed set of mean values and pairwise correlations. We also construct specific low-entropy distributions for several relevant cases. Surprisingly, the minimum entropy solution has entropy scaling logarithmically with system size for any set of first- and second-order statistics consistent with arbitrarily large systems. We further demonstrate that some sets of these low-order statistics can only be realized by small systems. Our results show how only small amounts of randomness are needed to mimic low-order statistical properties of highly entropic distributions, and we discuss some applications for engineered and biological information transmission systems.
  • The Hopfield recurrent neural network is a classical auto-associative model of memory, in which collections of symmetrically-coupled McCulloch-Pitts neurons interact to perform emergent computation. Although previous researchers have explored the potential of this network to solve combinatorial optimization problems and store memories as attractors of its deterministic dynamics, a basic open problem is to design a family of Hopfield networks with a number of noise-tolerant memories that grows exponentially with neural population size. Here, we discover such networks by minimizing probability flow, a recently proposed objective for estimating parameters in discrete maximum entropy models. By descending the gradient of the convex probability flow, our networks adapt synaptic weights to achieve robust exponential storage, even when presented with vanishingly small numbers of training patterns. In addition to providing a new set of error-correcting codes that achieve Shannon's channel capacity bound, these networks also efficiently solve a variant of the hidden clique problem in computer science, opening new avenues for real-world applications of computational models originating from biology.
  • We present an algorithm to store binary memories in a Hopfield neural network using minimum probability flow, a recent technique to fit parameters in energy-based probabilistic models. In the case of memories without noise, our algorithm provably achieves optimal pattern storage (which we show is at least one pattern per neuron) and outperforms classical methods both in speed and memory recovery. Moreover, when trained on noisy or corrupted versions of a fixed set of binary patterns, our algorithm finds networks which correctly store the originals. We also demonstrate this finding visually with the unsupervised storage and clean-up of large binary fingerprint images from significantly corrupted samples.
  • The Little-Hopfield network is an auto-associative computational model of neural memory storage and retrieval. This model is known to robustly store collections of randomly generated binary patterns as stable-states of the network dynamics. However, the number of binary memories so storable scales linearly in the number of neurons, and it has been a long-standing open problem whether robust exponential storage of binary patterns was possible in such a network memory model. In this note, we design simple families of Little-Hopfield networks that provably solve this problem affirmatively. As a byproduct, we produce a set of novel (nonlinear) binary codes with an efficient, highly parallelizable denoising mechanism.
  • We prove that multilinear (tensor) analogues of many efficiently computable problems in numerical linear algebra are NP-hard. Our list here includes: determining the feasibility of a system of bilinear equations, deciding whether a 3-tensor possesses a given eigenvalue, singular value, or spectral norm; approximating an eigenvalue, eigenvector, singular vector, or the spectral norm; and determining the rank or best rank-1 approximation of a 3-tensor. Furthermore, we show that restricting these problems to symmetric tensors does not alleviate their NP-hardness. We also explain how deciding nonnegative definiteness of a symmetric 4-tensor is NP-hard and how computing the combinatorial hyperdeterminant of a 4-tensor is NP-, #P-, and VNP-hard. We shall argue that our results provide another view of the boundary separating the computational tractability of linear/convex problems from the intractability of nonlinear/nonconvex ones.
  • In the article "Distilling free-form natural laws from experimental data", Schmidt and Lipson introduced the idea that free-form natural laws can be learned from experimental measurements in a physical system using symbolic (genetic) regression algorithms. An important claim in this work is that the algorithm finds laws in data without having incorporated any prior knowledge of physics. Upon close inspection, however, we show that their method implicitly incorporates Hamilton's equations of motions and Newton's second law, demystifying how they are able to find Hamiltonians and special classes of Lagrangians from data.
  • We describe a general framework for large-scale computational experiments in mathematics using computer resources that are available in most mathematics departments. This framework was developed for an experiment that is helping to formulate and test conjectures in the real Schubert calculus. Largely using machines in instructional computer labs during off-hours and University breaks, it consumed in excess of 350 GigaHertz-years of computing in its first six months of operation, solving over 1.1 billion polynomial systems.
  • We define a word in two positive definite (complex Hermitian) matrices $A$ and $B$ as a finite product of real powers of $A$ and $B$. The question of which words have only positive eigenvalues is addressed. This question was raised some time ago in connection with a long-standing problem in theoretical physics, and it was previously approached by the authors for words in two real positive definite matrices with positive integral exponents. A large class of words that do guarantee positive eigenvalues is identified, and considerable evidence is given for the conjecture that no other words do.
  • A generalized word in two letters $A$ and $B$ is an expression of the form $W=A^{\alpha_1}B^{\beta_1}A^{\alpha_2}B^{\beta_2}... A^{\alpha_N}B^{\beta_N}$ in which the exponents $\alpha_i$, $\beta_i$ are nonzero real numbers. When independent positive definite matrices are substituted for $A$ and $B$, we are interested in whether $W$ necessarily has positive eigenvalues. This is known to be the case when N=1 and has been studied in case all exponents are positive by two of the authors. When the exponent signs are mixed, however, the situation is quite different (even for 2-by-2 matrices), and this is the focus of the present work.