
Inspired by applications to theories of coding and communication in networks
of nervous tissue, we study maximum entropy distributions on weighted graphs
with a given expected degree sequence. These distributions are characterized by
independent edge weights parameterized by a shared vector of vertex potentials.
Using the general theory of exponential family distributions, we derive the
existence and uniqueness of the maximum likelihood estimator (MLE) of the
vertex parameters. We also prove consistency of the MLE from a single sample in
the limit of large graphs, extending results of Chatterjee, Diaconis, and Sly
in the unweighted case (the "betamodel" in statistics). Interestingly, our
proofs require tight estimates on the norms of inverses of symmetric,
diagonally dominant positive matrices. Along the way, we derive analogues of
the ErdosGallai criterion of graphical degree sequences for weighted graphs.

Maximum entropy models are increasingly being used to describe the collective
activity of neural populations with measured mean neural activities and
pairwise correlations, but the full space of probability distributions
consistent with these constraints has not been explored. We provide upper and
lower bounds on the entropy for the {\em minimum} entropy distribution over
arbitrarily large collections of binary units with any fixed set of mean values
and pairwise correlations. We also construct specific lowentropy distributions
for several relevant cases. Surprisingly, the minimum entropy solution has
entropy scaling logarithmically with system size for any set of first and
secondorder statistics consistent with arbitrarily large systems. We further
demonstrate that some sets of these loworder statistics can only be realized
by small systems. Our results show how only small amounts of randomness are
needed to mimic loworder statistical properties of highly entropic
distributions, and we discuss some applications for engineered and biological
information transmission systems.

The Hopfield recurrent neural network is a classical autoassociative model
of memory, in which collections of symmetricallycoupled McCullochPitts
neurons interact to perform emergent computation. Although previous researchers
have explored the potential of this network to solve combinatorial optimization
problems and store memories as attractors of its deterministic dynamics, a
basic open problem is to design a family of Hopfield networks with a number of
noisetolerant memories that grows exponentially with neural population size.
Here, we discover such networks by minimizing probability flow, a recently
proposed objective for estimating parameters in discrete maximum entropy
models. By descending the gradient of the convex probability flow, our networks
adapt synaptic weights to achieve robust exponential storage, even when
presented with vanishingly small numbers of training patterns. In addition to
providing a new set of errorcorrecting codes that achieve Shannon's channel
capacity bound, these networks also efficiently solve a variant of the hidden
clique problem in computer science, opening new avenues for realworld
applications of computational models originating from biology.

We present an algorithm to store binary memories in a Hopfield neural network
using minimum probability flow, a recent technique to fit parameters in
energybased probabilistic models. In the case of memories without noise, our
algorithm provably achieves optimal pattern storage (which we show is at least
one pattern per neuron) and outperforms classical methods both in speed and
memory recovery. Moreover, when trained on noisy or corrupted versions of a
fixed set of binary patterns, our algorithm finds networks which correctly
store the originals. We also demonstrate this finding visually with the
unsupervised storage and cleanup of large binary fingerprint images from
significantly corrupted samples.

The LittleHopfield network is an autoassociative computational model of
neural memory storage and retrieval. This model is known to robustly store
collections of randomly generated binary patterns as stablestates of the
network dynamics. However, the number of binary memories so storable scales
linearly in the number of neurons, and it has been a longstanding open problem
whether robust exponential storage of binary patterns was possible in such a
network memory model. In this note, we design simple families of
LittleHopfield networks that provably solve this problem affirmatively. As a
byproduct, we produce a set of novel (nonlinear) binary codes with an
efficient, highly parallelizable denoising mechanism.

We prove that multilinear (tensor) analogues of many efficiently computable
problems in numerical linear algebra are NPhard. Our list here includes:
determining the feasibility of a system of bilinear equations, deciding whether
a 3tensor possesses a given eigenvalue, singular value, or spectral norm;
approximating an eigenvalue, eigenvector, singular vector, or the spectral
norm; and determining the rank or best rank1 approximation of a 3tensor.
Furthermore, we show that restricting these problems to symmetric tensors does
not alleviate their NPhardness. We also explain how deciding nonnegative
definiteness of a symmetric 4tensor is NPhard and how computing the
combinatorial hyperdeterminant of a 4tensor is NP, #P, and VNPhard. We
shall argue that our results provide another view of the boundary separating
the computational tractability of linear/convex problems from the
intractability of nonlinear/nonconvex ones.

In the article "Distilling freeform natural laws from experimental data",
Schmidt and Lipson introduced the idea that freeform natural laws can be
learned from experimental measurements in a physical system using symbolic
(genetic) regression algorithms. An important claim in this work is that the
algorithm finds laws in data without having incorporated any prior knowledge of
physics. Upon close inspection, however, we show that their method implicitly
incorporates Hamilton's equations of motions and Newton's second law,
demystifying how they are able to find Hamiltonians and special classes of
Lagrangians from data.

We describe a general framework for largescale computational experiments in
mathematics using computer resources that are available in most mathematics
departments. This framework was developed for an experiment that is helping to
formulate and test conjectures in the real Schubert calculus. Largely using
machines in instructional computer labs during offhours and University breaks,
it consumed in excess of 350 GigaHertzyears of computing in its first six
months of operation, solving over 1.1 billion polynomial systems.

We define a word in two positive definite (complex Hermitian) matrices $A$
and $B$ as a finite product of real powers of $A$ and $B$. The question of
which words have only positive eigenvalues is addressed. This question was
raised some time ago in connection with a longstanding problem in theoretical
physics, and it was previously approached by the authors for words in two real
positive definite matrices with positive integral exponents. A large class of
words that do guarantee positive eigenvalues is identified, and considerable
evidence is given for the conjecture that no other words do.

A generalized word in two letters $A$ and $B$ is an expression of the form
$W=A^{\alpha_1}B^{\beta_1}A^{\alpha_2}B^{\beta_2}... A^{\alpha_N}B^{\beta_N}$
in which the exponents $\alpha_i$, $\beta_i$ are nonzero real numbers. When
independent positive definite matrices are substituted for $A$ and $B$, we are
interested in whether $W$ necessarily has positive eigenvalues. This is known
to be the case when N=1 and has been studied in case all exponents are positive
by two of the authors. When the exponent signs are mixed, however, the
situation is quite different (even for 2by2 matrices), and this is the focus
of the present work.