• We introduce the problem of assigning resources to improve their utilization. The motivation comes from settings where agents have uncertainty about their own values for using a resource, and where it is in the interest of a group that resources be used and not wasted. Done in the right way, improved utilization maximizes social welfare--- balancing the utility of a high value but unreliable agent with the group's preference that resources be used. We introduce the family of contingent payment mechanisms (CP), which may charge an agent contingent on use (a penalty). A CP mechanism is parameterized by a maximum penalty, and has a dominant-strategy equilibrium. Under a set of axiomatic properties, we establish welfare-optimality for the special case CP(W), with CP instantiated for a maximum penalty equal to societal value W for utilization. CP(W) is not dominated for expected welfare by any other mechanism, and second, amongst mechanisms that always allocate the resource and have a simple indirect structure, CP(W) strictly dominates every other mechanism. The special case with no upper bound on penalty, the contingent second-price mechanism, maximizes utilization. We extend the mechanisms to assign multiple, heterogeneous resources, and present a simulation study of the welfare properties of these mechanisms.
  • In the standard single-dimensional model of position auctions, bidders agree on the relative values of the positions and each of them submits a single bid that is interpreted in terms of these values. Motivated by current practice in sponsored search we consider a situation where the auctioneer uses estimates of the relative values, which may be imprecise, and show that under both complete and incomplete information a non-truthful mechanism is able to support an efficient outcome in equilibrium for a wider range of these estimates than the VCG mechanism. We thus exhibit a property of the VCG mechanism that may help explain the surprising rarity with which it is used even in settings with unit demand, a relative lack of robustness to misspecification of the bidding language. The result for complete information concerns the generalized second-price mechanism and lends additional theoretical support to the use of this mechanism in practice. Particularly interesting from a technical perspective is the result for incomplete information, which is driven by a surprising connection between equilibrium bids in the VCG mechanism and the generalized first-price mechanism.
  • Without monetary payments, the Gibbard-Satterthwaite theorem proves that under mild requirements all truthful social choice mechanisms must be dictatorships. When payments are allowed, the Vickrey-Clarke-Groves (VCG) mechanism implements the value-maximizing choice, and has many other good properties: it is strategy-proof, onto, deterministic, individually rational, and does not make positive transfers to the agents. By Roberts' theorem, with three or more alternatives, the weighted VCG mechanisms are essentially unique for domains with quasi-linear utilities. The goal of this paper is to characterize domains of non-quasi-linear utilities where "reasonable" mechanisms (with VCG-like properties) exist. Our main result is a tight characterization of the maximal non quasi-linear utility domain, which we call the largest parallel domain. We extend Roberts' theorem to parallel domains, and use the generalized theorem to prove two impossibility results. First, any reasonable mechanism must be dictatorial when the utility domain is quasi-linear together with any single non-parallel type. Second, for richer utility domains that still differ very slightly from quasi-linearity, every strategy-proof, onto and deterministic mechanism must be a dictatorship.
  • Ridesharing platforms match drivers and riders to trips, using dynamic prices to balance supply and demand. A challenge is to set prices that are appropriately smooth in space and time, so that drivers will choose to accept their dispatched trips, rather than drive to another area or wait for higher prices or a better trip. We work in a complete information, discrete time, multi-period, multi-location model, and introduce the Spatio-Temporal Pricing (STP) mechanism. The mechanism is incentive-aligned, in that it is a subgame-perfect equilibrium for drivers to accept their dispatches, and the mechanism is also welfare-optimal, envy-free, individually rational and budget balanced from any history onward. The proof of incentive alignment makes use of the $M^\natural$ concavity of min-cost flow objectives. We also give an impossibility result, that there can be no dominant-strategy mechanism with the same economic properties. An empirical analysis conducted in simulation suggests that the STP mechanism can achieve significantly higher social welfare than a myopic pricing mechanism.
  • Designing an incentive compatible auction that maximizes expected revenue is an intricate task. The single-item case was resolved in a seminal piece of work by Myerson in 1981. Even after 30-40 years of intense research the problem remains unsolved for settings with two or more items. In this work, we initiate the exploration of the use of tools from deep learning for the automated design of optimal auctions. We model an auction as a multi-layer neural network, frame optimal auction design as a constrained learning problem, and show how it can be solved using standard machine learning pipelines. We prove generalization bounds and present extensive experiments, recovering essentially all known analytical solutions for multi-item settings, and obtaining novel mechanisms for settings in which the optimal mechanism is unknown.
  • Cluster-based randomized experiments are popular designs for mitigating the bias of standard estimators when interference is present and classical causal inference and experimental design assumptions (such as SUTVA or ITR) do not hold. Without an exact knowledge of the interference structure, it can be challenging to understand which partitioning of the experimental units is optimal to minimize the estimation bias. In the paper, we introduce a monotonicity condition under which a novel two-stage experimental design allows us to determine which of two cluster-based designs yields the least biased estimator. We then consider the setting of online advertising auctions and show that reserve price experiments verify the monotonicity condition and the proposed framework and methodology applies. We validate our findings on an advertising auction dataset.
  • Peer prediction is a method to promote contributions of information by users in settings in which there is no way to verify the quality of responses. In multi-task peer prediction, the reports from users across multiple tasks are used to score contributions. This paper extends the correlated agreement (CA) multi-task peer prediction mechanism to allow the reports from users to be on heterogeneous tasks, each associated with different distributions on responses. The motivation comes from wanting to elicit user-generated content about places in a city, where tasks vary because places, and questions about places, vary. We prove that the generalized CA mechanism is informed truthful under weak conditions, meaning that it is strictly beneficial for a user to invest effort and acquire information, and that truthful reporting is the best strategy when investing effort, as well as an equilibrium. We demonstrate that the mechanism has good incentive properties when tested in simulation on distributions derived from user reports on Google Local Guides.
  • We study fairness within the stochastic, \emph{multi-armed bandit} (MAB) decision making framework. We adapt the fairness framework of "treating similar individuals similarly" to this setting. Here, an `individual' corresponds to an arm and two arms are `similar' if they have a similar quality distribution. First, we adopt a {\em smoothness constraint} that if two arms have a similar quality distribution then the probability of selecting each arm should be similar. In addition, we define the {\em fairness regret}, which corresponds to the degree to which an algorithm is not calibrated, where perfect calibration requires that the probability of selecting an arm is equal to the probability with which the arm has the best quality realization. We show that a variation on Thompson sampling satisfies smooth fairness for total variation distance, and give an $\tilde{O}((kT)^{2/3})$ bound on fairness regret. This complements prior work, which protects an on-average better arm from being less favored. We also explain how to extend our algorithm to the dueling bandit setting.
  • Planned experiments are the gold standard in reliably comparing the causal effect of switching from a baseline policy to a new policy. One critical shortcoming of classical experimental methods, however, is that they typically do not take into account the dynamic nature of response to policy changes. For instance, in an experiment where we seek to understand the effects of a new ad pricing policy on auction revenue, agents may adapt their bidding in response to the experimental pricing changes. Thus, causal effects of the new pricing policy after such adaptation period, the {\em long-term causal effects}, are not captured by the classical methodology even though they clearly are more indicative of the value of the new policy. Here, we formalize a framework to define and estimate long-term causal effects of policy changes in multiagent economies. Central to our approach is behavioral game theory, which we leverage to formulate the ignorability assumptions that are necessary for causal inference. Under such assumptions we estimate long-term causal effects through a latent space approach, where a behavioral model of how agents act conditional on their latent behaviors is combined with a temporal model of how behaviors evolve over time.
  • The problem of peer prediction is to elicit information from agents in settings without any objective ground truth against which to score reports. Peer prediction mechanisms seek to exploit correlations between signals to align incentives with truthful reports. A long-standing concern has been the possibility of uninformative equilibria. For binary signals, a multi-task mechanism [Dasgupta-Ghosh '13] achieves strong truthfulness, so that the truthful equilibrium strictly maximizes payoff. We characterize conditions on the signal distribution for which this mechanism remains strongly-truthful with non-binary signals, also providing a greatly simplified proof. We introduce the Correlated Agreement (CA) mechanism, which handles multiple signals and provides informed truthfulness: no strategy profile provides more payoff in equilibrium than truthful reporting, and the truthful equilibrium is strictly better than any uninformed strategy (where an agent avoids the effort of obtaining a signal). The CA mechanism is maximally strongly truthful, in that no mechanism in a broad class of mechanisms is strongly truthful on a larger family of signal distributions. We also give a detail-free version of the mechanism that removes any knowledge requirements on the part of the designer, using reports on many tasks to learn statistics while retaining epsilon-informed truthfulness.
  • In this paper, we take a statistical decision-theoretic viewpoint on social choice, putting a focus on the decision to be made on behalf of a system of agents. In our framework, we are given a statistical ranking model, a decision space, and a loss function defined on (parameter, decision) pairs, and formulate social choice mechanisms as decision rules that minimize expected loss. This suggests a general framework for the design and analysis of new social choice mechanisms. We compare Bayesian estimators, which minimize Bayesian expected loss, for the Mallows model and the Condorcet model respectively, and the Kemeny rule. We consider various normative properties, in addition to computational complexity and asymptotic behavior. In particular, we show that the Bayesian estimator for the Condorcet model satisfies some desired properties such as anonymity, neutrality, and monotonicity, can be computed in polynomial time, and is asymptotically different from the other two rules when the data are generated from the Condorcet model for some ground truth parameter.
  • We consider the design of experiments to evaluate treatments that are administered by self-interested agents, each seeking to achieve the highest evaluation and win the experiment. For example, in an advertising experiment, a company wishes to evaluate two marketing agents in terms of their efficacy in viral marketing, and assign a contract to the winner agent. Contrary to traditional experimental design, this problem has two new implications. First, the experiment induces a game among agents, where each agent can select from multiple versions of the treatment it administers. Second, the action of one agent -- selection of treatment version -- may affect the actions of another agent, with the resulting strategic interference complicating the evaluation of agents. An incentive-compatible experiment design is one with an equilibrium where each agent selects its natural action, which is the action that would maximize the performance of the agent if there was no competition (e.g., expected number of conversions if agent was assigned the contract). Under a general formulation of experimental design, we identify sufficient conditions that guarantee incentive-compatible experiments. These conditions rely on the existence of statistics that can estimate how agents would perform without competition, and their use in constructing score functions to evaluate the agents. In the setting with no strategic interference, we also study the power of the design, i.e., the probability that the best agent wins, and show how to improve the power of incentive-compatible designs. From the technical side, our theory uses a range of statistical methods such as hypothesis testing, variance-stabilizing transformations and the Delta method, all of which rely on asymptotics.
  • The well-studied problem of statistical rank aggregation has been applied to comparing sports teams, information retrieval, and most recently to data generated by human judgment. Such human-generated rankings may be substantially different from traditional statistical ranking data. In this work, we show that a recently proposed generalized random utility model reveals distinctive patterns in human judgment across three different domains, and provides a succinct representation of variance in both population preferences and imperfect perception. In contrast, we also show that classical statistical ranking models fail to capture important features from human-generated input. Our work motivates the use of more flexible ranking models for representing and describing the collective preferences or decision-making of human participants.
  • Economic mechanisms administer the allocation of resources to interested agents based on their self-reported types. One objective in mechanism design is to design a strategyproof process so that no agent will have an incentive to misreport its type. However, typical analyses of the incentives properties of mechanisms operate under strong, usually untestable assumptions. Empirical, data-oriented approaches are, at best, under-developed. Furthermore, mechanism/policy evaluation methods usually ignore the dynamic nature of a multi-agent system and are thus inappropriate for estimating long-term effects. We introduce the problem of estimating the causal effects of mechanisms on incentives and frame it under the Rubin causal framework \citep{rubin74, rubin78}. This raises unique technical challenges since the outcome of interest (agent truthfulness) is confounded with strategic interactions and, interestingly, is typically never observed under any mechanism. We develop a methodology to estimate such causal effects that using a prior that is based on a strategic equilibrium model. Working on the domain of kidney exchanges, we show how to apply our methodology to estimate causal effects of kidney allocation mechanisms on hospitals' incentives. Our results demonstrate that the use of game-theoretic prior captures the dynamic nature of the kidney exchange multiagent system and shrinks the estimates towards long-term effects, thus improving upon typical methods that completely ignore agents' strategic behavior.
  • We design a protocol for dynamic prioritization of data on shared routers such as untethered 3G/4G devices. The mechanism prioritizes bandwidth in favor of users with the highest value, and is incentive compatible, so that users can simply report their true values for network access. A revenue pooling mechanism also aligns incentives for sellers, so that they will choose to use prioritization methods that retain the incentive properties on the buy-side. In this way, the design allows for an open architecture. In addition to revenue pooling, the technical contribution is to identify a class of stochastic demand models and a prioritization scheme that provides allocation monotonicity. Simulation results confirm efficiency gains from dynamic prioritization relative to prior methods, as well as the effectiveness of revenue pooling.
  • This paper discusses {General Random Utility Models (GRUMs)}. These are a class of parametric models that generate partial ranks over alternatives given attributes of agents and alternatives. We propose two preference elicitation scheme for GRUMs developed from principles in Bayesian experimental design, one for social choice and the other for personalized choice. We couple this with a general Monte-Carlo-Expectation-Maximization (MC-EM) based algorithm for MAP inference under GRUMs. We also prove uni-modality of the likelihood functions for a class of GRUMs. We examine the performance of various criteria by experimental studies, which show that the proposed elicitation scheme increases the precision of estimation.
  • Since economic mechanisms are often applied to very different instances of the same problem, it is desirable to identify mechanisms that work well in a wide range of circumstances. We pursue this goal for a position auction setting and specifically seek mechanisms that guarantee good outcomes under both complete and incomplete information. A variant of the generalized first-price mechanism with multi-dimensional bids turns out to be the only standard mechanism able to achieve this goal, even when types are one-dimensional. The fact that expressiveness beyond the type space is both necessary and sufficient for this kind of robustness provides an interesting counterpoint to previous work on position auctions that has highlighted the benefits of simplicity. From a technical perspective our results are interesting because they establish equilibrium existence for a multi-dimensional bid space, where standard techniques break down. The structure of the equilibrium bids moreover provides an intuitive explanation for why first-price payments may be able to support equilibria in a wider range of circumstances than second-price payments.
  • The publication culture in Computer Science is different from that of all other disciplines. Whereas other disciplines focus on journal publication, the standard practice in CS has been to publish in a conference and then (sometimes) publish a journal version of the conference paper. We discuss the role of journal publication in CS. Indeed, it is through publication in selective, leading conferences that the quality of CS research is typically assessed.
  • Random utility theory models an agent's preferences on alternatives by drawing a real-valued score on each alternative (typically independently) from a parameterized distribution, and then ranking the alternatives according to scores. A special case that has received significant attention is the Plackett-Luce model, for which fast inference methods for maximum likelihood estimators are available. This paper develops conditions on general random utility models that enable fast inference within a Bayesian framework through MC-EM, providing concave loglikelihood functions and bounded sets of global maxima solutions. Results on both real-world and simulated data provide support for the scalability of the approach and capability for model selection among general random utility models including Plackett-Luce.
  • In mechanism design it is typical to impose incentive compatibility and then derive an optimal mechanism subject to this constraint. By replacing the incentive compatibility requirement with the goal of minimizing expected ex post regret, we are able to adapt statistical machine learning techniques to the design of payment rules. This computational approach to mechanism design is applicable to domains with multi-dimensional types and situations where computational efficiency is a concern. Specifically, given an outcome rule and access to a type distribution, we train a support vector machine with a special discriminant function structure such that it implicitly establishes a payment rule with desirable incentive properties. We discuss applications to a multi-minded combinatorial auction with a greedy winner-determination algorithm and to an assignment problem with egalitarian outcome rule. Experimental results demonstrate both that the construction produces payment rules with low ex post regret, and that penalizing classification errors is effective in preventing failures of ex post individual rationality.
  • Online double auctions (DAs) model a dynamic two-sided matching problem with private information and self-interest, and are relevant for dynamic resource and task allocation problems. We present a general method to design truthful DAs, such that no agent can benefit from misreporting its arrival time, duration, or value. The family of DAs is parameterized by a pricing rule, and includes a generalization of McAfee's truthful DA to this dynamic setting. We present an empirical study, in which we study the allocative-surplus and agent surplus for a number of different DAs. Our results illustrate that dynamic pricing rules are important to provide good market efficiency for markets with high volatility or low volume.
  • Consider a multi-agent system in a dynamic and uncertain environment. Each agent's local decision problem is modeled as a Markov decision process (MDP) and agents must coordinate on a joint action in each period, which provides a reward to each agent and causes local state transitions. A social planner knows the model of every agent's MDP and wants to implement the optimal joint policy, but agents are self-interested and have private local state. We provide an incentive-compatible mechanism for eliciting state information that achieves the optimal joint plan in a Markov perfect equilibrium of the induced stochastic game. In the special case in which local problems are Markov chains and agents compete to take a single action in each period, we leverage Gittins allocation indices to provide an efficient factored algorithm and distribute computation of the optimal policy among the agents. Distributed, optimal coordinated learning in a multi-agent variant of the multi-armed bandit problem is obtained as a special case.
  • In addressing the challenge of exponential scaling with the number of agents we adopt a cluster-based representation to approximately solve asymmetric games of very many players. A cluster groups together agents with a similar "strategic view" of the game. We learn the clustered approximation from data consisting of strategy profiles and payoffs, which may be obtained from observations of play or access to a simulator. Using our clustering we construct a reduced "twins" game in which each cluster is associated with two players of the reduced game. This allows our representation to be individually- responsive because we align the interests of every individual agent with the strategy of its cluster. Our approach provides agents with higher payoffs and lower regret on average than model-free methods as well as previous cluster-based methods, and requires only few observations for learning to be successful. The "twins" approach is shown to be an important component of providing these low regret approximations.
  • Strategyproof mechanisms provide robust equilibrium with minimal assumptions about knowledge and rationality but can be unachievable in combination with other desirable properties such as budget-balance, stability against deviations by coalitions, and computational tractability. In the search for maximally-strategyproof mechanisms that simultaneously satisfy other desirable properties, we introduce a new metric to quantify the strategyproofness of a mechanism, based on comparing the payoff distribution, given truthful reports, against that of a strategyproof "reference" mechanism that solves a problem relaxation. Focusing on combinatorial exchanges, we demonstrate that the metric is informative about the eventual equilibrium, where simple regretbased metrics are not, and can be used for online selection of an effective mechanism.
  • A fundamental result in mechanism design theory, the so-called revelation principle, asserts that for many questions concerning the existence of mechanisms with a given outcome one can restrict attention to truthful direct revelation-mechanisms. In practice, however, many mechanism use a restricted message space. This motivates the study of the tradeoffs involved in choosing simplified mechanisms, which can sometimes bring benefits in precluding bad or promoting good equilibria, and other times impose costs on welfare and revenue. We study the simplicity-expressiveness tradeoff in two representative settings, sponsored search auctions and combinatorial auctions, each being a canonical example for complete information and incomplete information analysis, respectively. We observe that the amount of information available to the agents plays an important role for the tradeoff between simplicity and expressiveness.