
Bayesian matrix factorization (BMF) is a powerful tool for producing lowrank
representations of matrices and for predicting missing values and providing
confidence intervals. Scaling up the posterior inference for massivescale
matrices is challenging and requires distributing both data and computation
over many workers, making communication the main computational bottleneck.
Embarrassingly parallel inference would remove the communication needed, by
using completely independent computations on different data subsets, but it
suffers from the inherent unidentifiability of BMF solutions. We introduce a
hierarchical decomposition of the joint posterior distribution, which couples
the subset inferences, allowing for embarrassingly parallel computations in a
sequence of at most three stages. Using an efficient approximate
implementation, we show improvements empirically on both real and simulated
data. Our distributed approach is able to achieve a speedup of almost an order
of magnitude over the full posterior, with a negligible effect on predictive
accuracy. Our method outperforms stateoftheart embarrassingly parallel MCMC
methods in accuracy, and achieves results competitive to other available
distributed and parallel implementations of BMF.

Bayesian networks, and especially their structures, are powerful tools for
representing conditional independencies and dependencies between random
variables. In applications where related variables form a priori known groups,
chosen to represent different "views" to or aspects of the same entities, one
may be more interested in modeling dependencies between groups of variables
rather than between individual variables. Motivated by this, we study prospects
of representing relationships between variable groups using Bayesian network
structures. We show that for dependency structures between groups to be
expressible exactly, the data have to satisfy the socalled groupwise
faithfulness assumption. We also show that one cannot learn causal relations
between groups using only groupwise conditional independencies, but also
variablewise relations are needed. Additionally, we present algorithms for
finding the groupwise dependency structures.

Learning a Bayesian network structure from data is an NPhard problem and
thus exact algorithms are feasible only for small data sets. Therefore, network
structures for larger networks are usually learned with various heuristics.
Another approach to scaling up the structure learning is local learning. In
local learning, the modeler has one or more target variables that are of
special interest; he wants to learn the structure near the target variables and
is not interested in the rest of the variables. In this paper, we present a
scorebased local learning algorithm called SLL. We conjecture that our
algorithm is theoretically sound in the sense that it is optimal in the limit
of large sample size. Empirical results suggest that SLL is competitive when
compared to the constraintbased HITON algorithm. We also study the prospects
of constructing the network structure for the whole node set based on local
results by presenting two algorithms and comparing them to several heuristics.

The fastest known exact algorithms for scorebased structure discovery in
Bayesian networks on n nodes run in time and space 2nnO(1). The usage of these
algorithms is limited to networks on at most around 25 nodes mainly due to the
space requirement. Here, we study spacetime tradeoffs for finding an optimal
network structure. When little space is available, we apply the GurevichShelah
recurrenceoriginally proposed for the Hamiltonian path problemand obtain time
22nsnO(1) in space 2snO(1) for any s = n/2, n/4, n/8, . . .; we assume the
indegree of each node is bounded by a constant. For the more practical setting
with moderate amounts of space, we present a novel scheme. It yields running
time 2n(3/2)pnO(1) in space 2n(3/4)pnO(1) for any p = 0, 1, . . ., n/2; these
bounds hold as long as the indegrees are at most 0.238n. Furthermore, the
latter scheme allows easy and efficient parallelization beyond previous
algorithms. We also explore empirically the potential of the presented
techniques.

We present a new Markov chain Monte Carlo method for estimating posterior
probabilities of structural features in Bayesian networks. The method draws
samples from the posterior distribution of partial orders on the nodes; for
each sampled partial order, the conditional probabilities of interest are
computed exactly. We give both analytical and empirical results that suggest
the superiority of the new method compared to previous methods, which sample
either directed acyclic graphs or linear orders on the nodes.