
A matching in a twosided market often incurs an externality: a matched
resource may become unavailable to the other side of the market, at least for a
while. This is especially an issue in online platforms involving human experts
as the expert resources are often scarce. The efficient utilization of experts
in these platforms is made challenging by the fact that the information
available about the parties involved is usually limited.
To address this challenge, we develop a model of a taskexpert matching
system where a task is matched to an expert using not only the prior
information about the task but also the feedback obtained from the past
matches. In our model the tasks arrive online while the experts are fixed and
constrained by a finite service capacity. For this model, we characterize the
maximum task resolution throughput a platform can achieve. We show that the
natural greedy approaches where each expert is assigned a task most suitable to
her skill is suboptimal, as it does not internalize the above externality. We
develop a throughput optimal backpressure algorithm which does so by accounting
for the `congestion' among different task types. Finally, we validate our model
and confirm our theoretical findings with datadriven simulations via logs of
Math.StackExchange, a StackOverflow forum dedicated to mathematics.

We study the canonical problem of maximizing a stochastic submodular function
subject to a cardinality constraint, where the goal is to select a subset from
a ground set of items with uncertain individual performances to maximize their
expected group value. Although nearoptimal algorithms have been proposed for
this problem, practical concerns regarding scalability, compatibility with
distributed implementation, and expensive oracle queries persist in largescale
applications. Motivated by online platforms that rely on individual item scores
for content recommendation and team selection, we propose a special class of
algorithms that select items based solely on individual performance measures
known as test scores. The central contribution of this work is a novel and
systematic framework for designing test score based algorithms for a broad
class of naturally occurring utility functions. We introduce a new scoring
mechanism that we refer to as replication test scores and prove that as long as
the objective function satisfies a diminishing returns property, one can
leverage these scores to compute solutions that are within a constant factor of
the optimum. We then extend our results to the more general stochastic
submodular welfare maximization problem, where the goal is to select items and
assign them to multiple groups to maximize the sum of the expected group
values. For this more difficult problem, we show that replication test scores
can be used to develop an algorithm that approximates the optimum solution up
to a logarithmic factor. The techniques presented in this work bridge the gap
between the rigorous theoretical work on submodular optimization and simple,
scalable heuristics that are useful in certain domains.

Parallel implementations of stochastic gradient descent (SGD) have received
significant research attention, thanks to excellent scalability properties of
this algorithm, and to its efficiency in the context of training deep neural
networks. A fundamental barrier for parallelizing largescale SGD is the fact
that the cost of communicating the gradient updates between nodes can be very
large. Consequently, lossy compression heuristics have been proposed, by which
nodes only communicate quantized gradients. Although effective in practice,
these heuristics do not always provably converge, and it is not clear whether
they are optimal.
In this paper, we propose Quantized SGD (QSGD), a family of compression
schemes which allow the compression of gradient updates at each node, while
guaranteeing convergence under standard assumptions. QSGD allows the user to
trade off compression and convergence time: it can communicate a sublinear
number of bits per iteration in the model dimension, and can achieve
asymptotically optimal communication cost. We complement our theoretical
results with empirical data, showing that QSGD can significantly reduce
communication cost, while being competitive with standard uncompressed
techniques on a variety of real tasks.
In particular, experiments show that gradient quantization applied to
training of deep neural networks for image classification and automated speech
recognition can lead to significant reductions in communication cost, and
endtoend training time. For instance, on 16 GPUs, we are able to train a
ResNet152 network on ImageNet 1.8x faster to full accuracy. Of note, we show
that there exist generic parameter settings under which all known network
architectures preserve or slightly improve their full accuracy when using
quantization.

We consider the estimation accuracy of individual strength parameters of a
Thurstone choice model when each input observation consists of a choice of one
item from a set of two or more items (so called top1 lists). This model
accommodates the wellknown choice models such as the Luce choice model for
comparison sets of two or more items and the BradleyTerry model for pair
comparisons.
We provide a tight characterization of the mean squared error of the maximum
likelihood parameter estimator. We also provide similar characterizations for
parameter estimators defined by a rankbreaking method, which amounts to
deducing one or more pair comparisons from a comparison of two or more items,
assuming independence of these pair comparisons, and maximizing a likelihood
function derived under these assumptions. We also consider a related binary
classification problem where each individual parameter takes value from a set
of two possible values and the goal is to correctly classify all items within a
prescribed classification error.

We consider the communication complexity of finding an approximate maximum
matching in a graph in a multiparty messagepassing communication model. The
maximum matching problem is one of the most fundamental graph combinatorial
problems, with a variety of applications.
The input to the problem is a graph $G$ that has $n$ vertices and the set of
edges partitioned over $k$ sites, and an approximation ratio parameter
$\alpha$. The output is required to be a matching in $G$ that has to be
reported by one of the sites, whose size is at least factor $\alpha$ of the
size of a maximum matching in $G$.
We show that the communication complexity of this problem is $\Omega(\alpha^2
k n)$ information bits. This bound is shown to be tight up to a $\log n$
factor, by constructing an algorithm, establishing its correctness, and an
upper bound on the communication cost. The lower bound also applies to other
graph combinatorial problems in the messagepassing communication model,
including maxflow and graph sparsification.

We describe a seriation algorithm for ranking a set of items given pairwise
comparisons between these items. Intuitively, the algorithm assigns similar
rankings to items that compare similarly with all others. It does so by
constructing a similarity matrix from pairwise comparisons, using seriation
methods to reorder this matrix and construct a ranking. We first show that this
spectral seriation algorithm recovers the true ranking when all pairwise
comparisons are observed and consistent with a total order. We then show that
ranking reconstruction is still exact when some pairwise comparisons are
corrupted or missing, and that seriation based spectral ranking is more robust
to noise than classical scoring methods. Finally, we bound the ranking error
when only a random subset of the comparions are observed. An additional benefit
of the seriation formulation is that it allows us to solve semisupervised
ranking problems. Experiments on both synthetic and real datasets demonstrate
that seriation based spectral ranking achieves competitive and in some cases
superior performance compared to classical ranking methods.

We consider collaborative systems where users make contributions across
multiple available projects and are rewarded for their contributions in
individual projects according to a local sharing of the value produced. This
serves as a model of online social computing systems such as online Q&A forums
and of credit sharing in scientific coauthorship settings. We show that the
maximum feasible produced value can be well approximated by simple local
sharing rules where users are approximately rewarded in proportion to their
marginal contributions and that this holds even under incomplete information
about the player's abilities and effort constraints. For natural instances we
show almost 95% optimality at equilibrium. When players incur a cost for their
effort, we identify a threshold phenomenon: the efficiency is a constant
fraction of the optimal when the cost is strictly convex and decreases with the
number of players if the cost is linear.

We introduce a framework for studying the effect of cooperation on the
quality of outcomes in utility games. Our framework is a coalitional analog of
the smoothness framework of noncooperative games. Coalitional smoothness
implies bounds on the strong price of anarchy, the loss of quality of
coalitionally stable outcomes, as well as bounds on coalitional versions of
coarse correlated equilibria and sink equilibria, which we define as
outofequilibrium myopic behavior as determined by a natural coalitional
version of bestresponse dynamics.
Our coalitional smoothness framework captures existing results bounding the
strong price of anarchy of network design games. We show that in any monotone
utilitymaximization game, if each player's utility is at least his marginal
contribution to the welfare, then the strong price of anarchy is at most 2.
This captures a broad class of games, including games with a very high price of
anarchy. Additionally, we show that in potential games the strong price of
anarchy is close to the price of stability, the quality of the best Nash
equilibrium.

We consider the convergence time for solving the binary consensus problem
using the interval consensus algorithm proposed by B\' en\' ezit, Thiran and
Vetterli (2009). In the binary consensus problem, each node initially holds one
of two states and the goal for each node is to correctly decide which one of
these two states was initially held by a majority of nodes.
We derive an upper bound on the expected convergence time that holds for
arbitrary connected graphs, which is based on the location of eigenvalues of
some contact rate matrices. We instantiate our bound for particular networks of
interest, including complete graphs, paths, cycles, starshaped networks, and
Erd\" osR\' enyi random graphs; for these graphs, we compare our bound with
alternative computations. We find that for all these examples our bound is
tight, yielding the exact order with respect to the number of nodes.
We pinpoint the fact that the expected convergence time critically depends on
the voting margin defined as the difference between the fraction of nodes that
initially held the majority and the minority states, respectively. The
characterization of the expected convergence time yields exact relation between
the expected convergence time and the voting margin, for some of these graphs,
which reveals how the expected convergence time goes to infinity as the voting
margin approaches zero.
Our results provide insights into how the expected convergence time depends
on the network topology which can be used for performance evaluation and
network design. The results are of interest in the context of networked
systems, in particular, peertopeer networks, sensor networks and distributed
databases.

We consider a dynamical system for computing Nash bargaining solutions on
graphs and focus on its rate of convergence. More precisely, we analyze the
edgebalanced dynamical system by Azar et al and fully specify its convergence
for an important class of elementary graph structures that arise in Kleinberg
and Tardos' procedure for computing a Nash bargaining solution on general
graphs. We show that all these dynamical systems are either linear or
eventually become linear and that their convergence times are quadratic in the
number of matched edges.