Oct. 26, 2018 cs.AI, cs.LG, stat.ML
• ### Submodular Maximization using Test Scores(1605.07172)

May 9, 2019 math.CO, cs.DS
We study the canonical problem of maximizing a stochastic submodular function subject to a cardinality constraint, where the goal is to select a subset from a ground set of items with uncertain individual performances to maximize their expected group value. Although near-optimal algorithms have been proposed for this problem, practical concerns regarding scalability, compatibility with distributed implementation, and expensive oracle queries persist in large-scale applications. Motivated by online platforms that rely on individual item scores for content recommendation and team selection, we propose a special class of algorithms that select items based solely on individual performance measures known as test scores. The central contribution of this work is a novel and systematic framework for designing test score based algorithms for a broad class of naturally occurring utility functions. We introduce a new scoring mechanism that we refer to as replication test scores and prove that as long as the objective function satisfies a diminishing returns property, one can leverage these scores to compute solutions that are within a constant factor of the optimum. We then extend our results to the more general stochastic submodular welfare maximization problem, where the goal is to select items and assign them to multiple groups to maximize the sum of the expected group values. For this more difficult problem, we show that replication test scores can be used to develop an algorithm that approximates the optimum solution up to a logarithmic factor. The techniques presented in this work bridge the gap between the rigorous theoretical work on submodular optimization and simple, scalable heuristics that are useful in certain domains.
• ### QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks(1610.02132)

May 25, 2017 cs.DS, cs.LG
• ### Parameter Estimation for Thurstone Choice Models(1705.00136)

April 29, 2017 math.ST, stat.TH
We consider the estimation accuracy of individual strength parameters of a Thurstone choice model when each input observation consists of a choice of one item from a set of two or more items (so called top-1 lists). This model accommodates the well-known choice models such as the Luce choice model for comparison sets of two or more items and the Bradley-Terry model for pair comparisons. We provide a tight characterization of the mean squared error of the maximum likelihood parameter estimator. We also provide similar characterizations for parameter estimators defined by a rank-breaking method, which amounts to deducing one or more pair comparisons from a comparison of two or more items, assuming independence of these pair comparisons, and maximizing a likelihood function derived under these assumptions. We also consider a related binary classification problem where each individual parameter takes value from a set of two possible values and the goal is to correctly classify all items within a prescribed classification error.
• ### Communication complexity of approximate maximum matching in the message-passing model(1704.08462)

April 27, 2017 cs.DS
We consider the communication complexity of finding an approximate maximum matching in a graph in a multi-party message-passing communication model. The maximum matching problem is one of the most fundamental graph combinatorial problems, with a variety of applications. The input to the problem is a graph $G$ that has $n$ vertices and the set of edges partitioned over $k$ sites, and an approximation ratio parameter $\alpha$. The output is required to be a matching in $G$ that has to be reported by one of the sites, whose size is at least factor $\alpha$ of the size of a maximum matching in $G$. We show that the communication complexity of this problem is $\Omega(\alpha^2 k n)$ information bits. This bound is shown to be tight up to a $\log n$ factor, by constructing an algorithm, establishing its correctness, and an upper bound on the communication cost. The lower bound also applies to other graph combinatorial problems in the message-passing communication model, including max-flow and graph sparsification.
• ### Spectral Ranking using Seriation(1406.5370)

March 10, 2016 cs.AI, cs.LG, stat.ML
We describe a seriation algorithm for ranking a set of items given pairwise comparisons between these items. Intuitively, the algorithm assigns similar rankings to items that compare similarly with all others. It does so by constructing a similarity matrix from pairwise comparisons, using seriation methods to reorder this matrix and construct a ranking. We first show that this spectral seriation algorithm recovers the true ranking when all pairwise comparisons are observed and consistent with a total order. We then show that ranking reconstruction is still exact when some pairwise comparisons are corrupted or missing, and that seriation based spectral ranking is more robust to noise than classical scoring methods. Finally, we bound the ranking error when only a random subset of the comparions are observed. An additional benefit of the seriation formulation is that it allows us to solve semi-supervised ranking problems. Experiments on both synthetic and real datasets demonstrate that seriation based spectral ranking achieves competitive and in some cases superior performance compared to classical ranking methods.
• ### Incentives and Efficiency in Uncertain Collaborative Environments(1308.0990)

Aug. 5, 2013 cs.GT
We consider collaborative systems where users make contributions across multiple available projects and are rewarded for their contributions in individual projects according to a local sharing of the value produced. This serves as a model of online social computing systems such as online Q&A forums and of credit sharing in scientific co-authorship settings. We show that the maximum feasible produced value can be well approximated by simple local sharing rules where users are approximately rewarded in proportion to their marginal contributions and that this holds even under incomplete information about the player's abilities and effort constraints. For natural instances we show almost 95% optimality at equilibrium. When players incur a cost for their effort, we identify a threshold phenomenon: the efficiency is a constant fraction of the optimal when the cost is strictly convex and decreases with the number of players if the cost is linear.
• ### Strong Price of Anarchy and Coalitional Dynamics(1307.2537)

July 9, 2013 cs.GT
We introduce a framework for studying the effect of cooperation on the quality of outcomes in utility games. Our framework is a coalitional analog of the smoothness framework of non-cooperative games. Coalitional smoothness implies bounds on the strong price of anarchy, the loss of quality of coalitionally stable outcomes, as well as bounds on coalitional versions of coarse correlated equilibria and sink equilibria, which we define as out-of-equilibrium myopic behavior as determined by a natural coalitional version of best-response dynamics. Our coalitional smoothness framework captures existing results bounding the strong price of anarchy of network design games. We show that in any monotone utility-maximization game, if each player's utility is at least his marginal contribution to the welfare, then the strong price of anarchy is at most 2. This captures a broad class of games, including games with a very high price of anarchy. Additionally, we show that in potential games the strong price of anarchy is close to the price of stability, the quality of the best Nash equilibrium.
• ### Convergence Speed of Binary Interval Consensus(1202.1083)

Feb. 6, 2012 math.PR, cs.DM, math.OC
We consider the convergence time for solving the binary consensus problem using the interval consensus algorithm proposed by B\' en\' ezit, Thiran and Vetterli (2009). In the binary consensus problem, each node initially holds one of two states and the goal for each node is to correctly decide which one of these two states was initially held by a majority of nodes. We derive an upper bound on the expected convergence time that holds for arbitrary connected graphs, which is based on the location of eigenvalues of some contact rate matrices. We instantiate our bound for particular networks of interest, including complete graphs, paths, cycles, star-shaped networks, and Erd\" os-R\' enyi random graphs; for these graphs, we compare our bound with alternative computations. We find that for all these examples our bound is tight, yielding the exact order with respect to the number of nodes. We pinpoint the fact that the expected convergence time critically depends on the voting margin defined as the difference between the fraction of nodes that initially held the majority and the minority states, respectively. The characterization of the expected convergence time yields exact relation between the expected convergence time and the voting margin, for some of these graphs, which reveals how the expected convergence time goes to infinity as the voting margin approaches zero. Our results provide insights into how the expected convergence time depends on the network topology which can be used for performance evaluation and network design. The results are of interest in the context of networked systems, in particular, peer-to-peer networks, sensor networks and distributed databases.
• ### Bargaining Dynamics in Exchange Networks(1202.1089)

Feb. 6, 2012 cs.GT, math.OC
We consider a dynamical system for computing Nash bargaining solutions on graphs and focus on its rate of convergence. More precisely, we analyze the edge-balanced dynamical system by Azar et al and fully specify its convergence for an important class of elementary graph structures that arise in Kleinberg and Tardos' procedure for computing a Nash bargaining solution on general graphs. We show that all these dynamical systems are either linear or eventually become linear and that their convergence times are quadratic in the number of matched edges.