
Webbased services often run randomized experiments to improve their
products. A popular way to run these experiments is to use geographical regions
as units of experimentation, since this does not require tracking of individual
users or browser cookies. Since users may issue queries from multiple
geographical locations, georegions cannot be considered independent and
interference may be present in the experiment. In this paper, we study this
problem, and first present GeoCUTS, a novel algorithm that forms geographical
clusters to minimize interference while preserving balance in cluster size. We
use a random sample of anonymized traffic from Google Search to form a graph
representing user movements, then construct a geographically coherent
clustering of the graph. Our main technical contribution is a statistical
framework to measure the effectiveness of clusterings. Furthermore, we perform
empirical evaluations showing that the performance of GeoCUTS is comparable to
handcrafted georegions with respect to both novel and existing metrics.

We introduce a new model of stochastic bandits with adversarial corruptions
which aims to capture settings where most of the input follows a stochastic
pattern but some fraction of it can be adversarially changed to trick the
algorithm, e.g., click fraud, fake reviews and email spam. The goal of this
model is to encourage the design of bandit algorithms that (i) work well in
mixed adversarial and stochastic models, and (ii) whose performance
deteriorates gracefully as we move from fully stochastic to fully adversarial
models.
In our model, the rewards for all arms are initially drawn from a
distribution and are then altered by an adaptive adversary. We provide a simple
algorithm whose performance gracefully degrades with the total corruption the
adversary injected in the data, measured by the sum across rounds of the
biggest alteration the adversary made in the data in that round; this total
corruption is denoted by $C$. Our algorithm provides a guarantee that retains
the optimal guarantee (up to a logarithmic term) if the input is stochastic and
whose performance degrades linearly to the amount of corruption $C$, while
crucially being agnostic to it. We also provide a lower bound showing that this
linear degradation is necessary if the algorithm achieves optimal performance
in the stochastic setting (the lower bound works even for a known amount of
corruption, a special case in which our algorithm achieves optimal performance
without the extra logarithm).

Clusterbased randomized experiments are popular designs for mitigating the
bias of standard estimators when interference is present and classical causal
inference and experimental design assumptions (such as SUTVA or ITR) do not
hold. Without an exact knowledge of the interference structure, it can be
challenging to understand which partitioning of the experimental units is
optimal to minimize the estimation bias. In the paper, we introduce a
monotonicity condition under which a novel twostage experimental design allows
us to determine which of two clusterbased designs yields the least biased
estimator. We then consider the setting of online advertising auctions and show
that reserve price experiments verify the monotonicity condition and the
proposed framework and methodology applies. We validate our findings on an
advertising auction dataset.

We study revenue optimization in a repeated auction between a single seller
and a single buyer. Traditionally, the design of repeated auctions requires
strong modeling assumptions about the bidder behavior, such as it being myopic,
infinite lookahead, or some specific form of learning behavior. Is it possible
to design mechanisms which are simultaneously optimal against a multitude of
possible buyer behaviors? We answer this question by designing a simple
statebased mechanism that is simultaneously approximately optimal against a
$k$lookahead buyer for all $k$, a buyer who is a noregret learner, and a
buyer who is a policyregret learner. Against each type of buyer our mechanism
attains a constant fraction of the optimal revenue attainable against that type
of buyer. We complement our positive results with almost tight impossibility
results, showing that the revenue approximation tradeoffs achieved by our
mechanism for different lookahead attitudes are nearoptimal.

Randomized composable coresets were introduced recently as an effective
technique for solving matching and vertex cover problems in various models of
computation. In this technique, one partitions the edges of a graph randomly
into multiple pieces, compresses each piece into a smaller subgraph, namely a
coreset, and solves the problem on the union of these coresets to find the
solution. By designing small size randomized coresets, one can obtain efficient
algorithms, in a blackbox way, in multiple computational models including
streaming, distributed communication, and the massively parallel computation
(MPC) model.
We develop randomized coresets of size $\widetilde{O}(n)$ that for any
constant $\varepsilon > 0$, give a $(3/2+\varepsilon)$approximation to
matching and a $(3+\varepsilon)$approximation to vertex cover. Our coresets
improve upon the previously best approximation ratio of $O(1)$ for matching and
$O(\log{n})$ for vertex cover. Our result for matching goes beyond a
2approximation, which is a natural barrier for maximum matching in many models
of computation.
We further build on our coresets to achieve a $(1+\varepsilon)$approximation
to matching and $O(1)$approximation to vertex cover in the MPC model with only
$O(n/poly\log{(n)})$ memory per machine and $O(\log\log{(n)})$ MPC rounds. Our
algorithm for vertex cover is the first $O(1)$approximation MPC algorithm with
$o(\log{n})$ rounds with $O(n)$ memory per machine and our matching algorithm
improves upon the stateoftheart $O((\log\log{n})^2)$ round algorithm of
Czumaj et. al. (STOC 2018).
A key technical ingredient of our paper is a novel application of edge degree
constrained subgraphs (EDCS). At the heart of our proofs are new structural
properties of EDCS that identify these subgraphs as sparse certificates for
matchings and vertex covers which are quite robust to sampling and composition.

We present ASYMP, a distributed graph processing system developed for the
timely analysis of graphs with trillions of edges. ASYMP has several
distinguishing features including a robust fault tolerance mechanism, a
lockless architecture which scales seamlessly to thousands of machines, and
efficient data access patterns to reduce permachine overhead. ASYMP is used to
analyze the largest graphs at Google, and the graphs we consider in our
empirical evaluation here are, to the best of our knowledge, the largest
considered in the literature.
Our experimental results show that compared to previous graph processing
frameworks at Google, ASYMP can scale to larger graphs, operate on more crowded
clusters, and complete realworld graph mining analytic tasks faster. First, we
evaluate the speed of ASYMP, where we show that across a diverse selection of
graphs, it runs Connected Component 350x faster than state of the art
implementations in MapReduce and Pregel. Then we demonstrate the scalability
and parallelism of this framework: first by showing that the running time
increases linearly by increasing the size of the graphs (without changing the
number of machines), and then by showing the gains in running time while
increasing the number of machines. Finally, we demonstrate the faulttolerance
properties for the framework, showing that inducing 50% of our machines to fail
increases the running time by only 41%.

Designing algorithms for balanced allocation of clients to servers in dynamic
settings is a challenging problem for a variety of reasons. Both servers and
clients may be added and/or removed from the system periodically, and the main
objectives of allocation algorithms are: the uniformity of the allocation, and
the number of moves after adding or removing a server or a client. The most
popular solution for our dynamic settings is Consistent Hashing. However, the
load balancing of consistent hashing is no better than a random assignment of
clients to servers, so with $n$ of each, we expect many servers to be
overloaded with $\Theta(\log n/ \log\log n)$ clients. In this paper, with $n$
clients and $n$ servers, we get a guaranteed maxload of 2 while only moving an
expected constant number of clients for each update.
We take an arbitrary user specified balancing parameter $c=1+\epsilon>1$.
With $m$ balls and $n$ bins in the system, we want no load above $\lceil
cm/n\rceil$. Meanwhile we want to bound the expected number of balls that have
to be moved when a ball or server is added or removed. Compared with general
lower bounds without capacity constraints, we show that in our algorithm when a
ball or bin is inserted or deleted, the expected number of balls that have to
be moved is increased only by a multiplicative factor $O({1\over \epsilon^2})$
for $\epsilon \le 1$ (Theorem 4) and by a factor $1+O(\frac{\log c}c)$ for
$\epsilon\ge 1$ (Theorem 3). Technically, the latter bound is the most
challenging to prove. It implies that we for superconstant $c$ only pay a
negligible cost in extra moves. We also get the same bounds for the simpler
problem where we instead of a user specified balancing parameter have a fixed
bin capacity $C$ for all bins.

This paper formulates a novel problem on graphs: find the minimal subset of
edges in a fully connected graph, such that the resulting graph contains all
spanning trees for a set of specifed subgraphs. This formulation is motivated
by an unsupervised grammar induction problem from computational linguistics.
We present a reduction to some known problems and algorithms from graph theory,
provide computational complexity results, and describe an approximation
algorithm.

This paper considers a traditional problem of resource allocation, scheduling
jobs on machines. One such recent application is cloud computing, where jobs
arrive in an online fashion with capacity requirements and need to be
immediately scheduled on physical machines in data centers. It is often
observed that the requested capacities are not fully utilized, hence offering
an opportunity to employ an overcommitment policy, i.e., selling resources
beyond capacity. Setting the right overcommitment level can induce a
significant cost reduction for the cloud provider, while only inducing a very
low risk of violating capacity constraints. We introduce and study a model that
quantifies the value of overcommitment by modeling the problem as a bin packing
with chance constraints. We then propose an alternative formulation that
transforms each chance constraint into a submodular function. We show that our
model captures the risk pooling effect and can guide scheduling and
overcommitment decisions. We also develop a family of online algorithms that
are intuitive, easy to implement and provide a constant factor guarantee from
optimal. Finally, we calibrate our model using realistic workload data, and
test our approach in a practical setting. Our analysis and experiments
illustrate the benefit of overcommitment in cloud services, and suggest a cost
reduction of 1.5% to 17% depending on the provider's risk tolerance.

Coverage problems are central in optimization and have a wide range of
applications in data mining and machine learning. While several distributed
algorithms have been developed for coverage problems, the existing methods
suffer from several limitations, e.g., they all achieve either suboptimal
approximation guarantees or suboptimal space and memory complexities. In
addition, previous algorithms developed for submodular maximization assume
oracle access, and ignore the computational complexity of communicating large
subsets or computing the size of the union of the subsets in this subfamily. In
this paper, we develop an improved distributed algorithm for the $k$cover and
the set cover with $\lambda$ outliers problems, with almost optimal
approximation guarantees, almost optimal memory complexity, and linear
communication complexity running in only four rounds of computation. Finally,
we perform an extensive empirical study of our algorithms on a number of
publicly available real data sets, and show that using sketches of size $30$ to
$600$ times smaller than the input, one can solve the coverage maximization
problem with quality very close to that of the stateoftheart singlemachine
algorithm.

Maximum coverage and minimum set cover problems collectively called
coverage problems have been studied extensively in streaming models. However,
previous research not only achieve suboptimal approximation factors and space
complexities, but also study a restricted set arrival model which makes an
explicit or implicit assumption on oracle access to the sets, ignoring the
complexity of reading and storing the whole set at once. In this paper, we
address the above shortcomings, and present algorithms with improved
approximation factor and improved space complexity, and prove that our results
are almost tight. Moreover, unlike most of previous work, our results hold on a
more general edge arrival model. More specifically, we present (almost) optimal
approximation algorithms for maximum coverage and minimum set cover problems in
the streaming model with an (almost) optimal space complexity of
$\tilde{O}(n)$, i.e., the space is {\em independent of the size of the sets or
the size of the ground set of elements}. These results not only improve over
the best known algorithms for the set arrival model, but also are the first
such algorithms for the more powerful {\em edge arrival} model. In order to
achieve the above results, we introduce a new general sketching technique for
coverage functions: This sketching scheme can be applied to convert an
$\alpha$approximation algorithm for a coverage problem to a
$(1\eps)\alpha$approximation algorithm for the same problem in streaming, or
RAM models. We show the significance of our sketching technique by ruling out
the possibility of solving coverage problems via accessing (as a black box) a
$(1 \pm \eps)$approximate oracle (e.g., a sketch function) that estimates the
coverage function on any subfamily of the sets.

Lately, the problem of designing multistage dynamic mechanisms has been
shown to be both theoretically challenging and practically important. In this
paper, we consider the problem of designing revenue optimal dynamic mechanism
for a setting where an auctioneer sells a set of items to a buyer in multiple
stages. At each stage, there could be multiple items for sale but each item can
only appear in one stage. The type of the buyer at each stage is thus a
multidimensional vector characterizing the buyer's valuations of the items at
that stage and is assumed to be stagewise independent.
In particular, we propose a novel class of mechanisms called bank account
mechanisms. Roughly, a bank account mechanism is no different from any
stagewise individual mechanism except for an augmented structure called bank
account, a real number for each node that summarizes the history so far. We
first establish that the optimal revenue from any dynamic mechanism in this
setting can be achieved by a bank account mechanism, and we provide a simple
characterization of the set of incentive compatible and expost individually
rational bank account mechanisms. Based on these characterizations, we then
investigate the problem of finding the (approximately) optimal bank account
mechanisms. We prove that there exists a simple, randomized bank account
mechanism that approximates optimal revenue up to a constant factor. Our result
is general and can accommodate previous approximation results in singleshot
multidimensional mechanism design. Based on the previous mechanism, we further
show that there exists a deterministic bank account mechanism that achieves
constantfactor approximation as well. Finally, we consider the problem of
computing optimal mechanisms when the type space is discrete and provide an
FPTAS via linear and dynamic programming.

The problem of column subset selection has recently attracted a large body of
research, with feature selection serving as one obvious and important
application. Among the techniques that have been applied to solve this problem,
the greedy algorithm has been shown to be quite effective in practice. However,
theoretical guarantees on its performance have not been explored thoroughly,
especially in a distributed setting. In this paper, we study the greedy
algorithm for the column subset selection problem from a theoretical and
empirical perspective and show its effectiveness in a distributed setting. In
particular, we provide an improved approximation guarantee for the greedy
algorithm which we show is tight up to a constant factor, and present the first
distributed implementation with provable approximation factors. We use the idea
of randomized composable coresets, developed recently in the context of
submodular maximization. Finally, we validate the effectiveness of this
distributed algorithm via an empirical study.

We give a deterministic nearlylinear time algorithm for approximating any
point inside a convex polytope with a sparse convex combination of the
polytope's vertices. Our result provides a constructive proof for the
Approximate Carath\'{e}odory Problem, which states that any point inside a
polytope contained in the $\ell_p$ ball of radius $D$ can be approximated to
within $\epsilon$ in $\ell_p$ norm by a convex combination of only $O\left(D^2
p/\epsilon^2\right)$ vertices of the polytope for $p \geq 2$. We also show that
this bound is tight, using an argument based on anticoncentration for the
binomial distribution.
Along the way of establishing the upper bound, we develop a technique for
minimizing norms over convex sets with complicated geometry; this is achieved
by running Mirror Descent on a dual convex function obtained via Sion's
Theorem.
As simple extensions of our method, we then provide new algorithms for
submodular function minimization and SVM training. For submodular function
minimization we obtain a simplification and (provable) speedup over Wolfe's
algorithm, the method commonly found to be the fastest in practice. For SVM
training, we obtain $O(1/\epsilon^2)$ convergence for arbitrary kernels; each
iteration only requires matrixvector operations involving the kernel matrix,
so we overcome the obstacle of having to explicitly store the kernel or compute
its Cholesky factorization.

Balanced partitioning is often a crucial first step in solving largescale
graph optimization problems, e.g., in some cases, a big graph can be chopped
into pieces that fit on one machine to be processed independently before
stitching the results together. In other cases, links between different parts
may show up in the running time and/or network communications cost. We study a
distributed balanced partitioning problem where the goal is to partition the
vertices of a given graph into k pieces so as to minimize the total cut size.
Our algorithm is composed of a few steps that are easily implementable in
distributed computation frameworks. The algorithm first embeds nodes of the
graph onto a line, and then processes nodes in a distributed manner guided by
the linear embedding order. We examine various ways to find the first
embedding, e.g., via a hierarchical clustering or Hilbert curves. Then we apply
four different techniques including local swaps, minimum cuts on the boundaries
of partitions, as well as contraction and dynamic programming. As our empirical
study, we compare the above techniques with each other, and also to previous
work in distributed graph algorithms, e.g., a label propagation method, FENNEL
and Spinner. We report our results both on a private map graph and several
public social networks, and show that our results beat previous distributed
algorithms: e.g., compared to the label propagation algorithm, we report an
improvement of 1525% in the cut value. We also observe that our algorithms
allow for scalable distributed implementation for any number of partitions.
Finally, we apply our techniques for the Google Maps Driving Directions to
minimize the number of multishard queries with the goal of saving in CPU
usage. During live experiments, we observe an ~40% drop in the number of
multishard queries when comparing our method with a standard geographybased
method.

Designing distributed and scalable algorithms to improve network connectivity
is a central topic in peertopeer networks. In this paper we focus on the
following wellknown problem: given an $n$node $d$regular network for
$d=\Omega(\log n)$, we want to design a decentralized, local algorithm that
transforms the graph into one that has good connectivity properties (low
diameter, expansion, etc.) without affecting the sparsity of the graph. To this
end, Mahlmann and Schindelhauer introduced the random "flip" transformation,
where in each time step, a random pair of vertices that have an edge decide to
`swap a neighbor'. They conjectured that performing $O(n d)$ such flips at
random would convert any connected $d$regular graph into a $d$regular
expander graph, with high probability. However, the best known upper bound for
the number of steps is roughly $O(n^{17} d^{23})$, obtained via a delicate
Markov chain comparison argument.
Our main result is to prove that a natural instantiation of the random flip
produces an expander in at most $O(n^2 d^2 \sqrt{\log n})$ steps, with high
probability. Our argument uses a potentialfunction analysis based on the
matrix exponential, together with the recent beautiful results on the
higherorder Cheeger inequality of graphs. We also show that our technique can
be used to analyze another wellstudied random process known as the `random
switch', and show that it produces an expander in $O(n d)$ steps with high
probability.

An effective technique for solving optimization problems over massive data
sets is to partition the data into smaller pieces, solve the problem on each
piece and compute a representative solution from it, and finally obtain a
solution inside the union of the representative solutions for all pieces. This
technique can be captured via the concept of {\em composable coresets}, and
has been recently applied to solve diversity maximization problems as well as
several clustering problems. However, for coverage and submodular maximization
problems, impossibility bounds are known for this technique \cite{IMMM14}. In
this paper, we focus on efficient construction of a randomized variant of
composable coresets where the above idea is applied on a {\em random
clustering} of the data. We employ this technique for the coverage, monotone
and nonmonotone submodular maximization problems. Our results significantly
improve upon the hardness results for nonrandomized coresets, and imply
improved results for submodular maximization in a distributed and streaming
settings.
In summary, we show that a simple greedy algorithm results in a
$1/3$approximate randomized composable coreset for submodular maximization
under a cardinality constraint. This is in contrast to a known $O({\log k\over
\sqrt{k}})$ impossibility result for (nonrandomized) composable coreset. Our
result also extends to nonmonotone submodular functions, and leads to the
first 2round MapReducebased constantfactor approximation algorithm with
$O(n)$ total communication complexity for either monotone or nonmonotone
functions. Finally, using an improved analysis technique and a new algorithm
$\mathsf{PseudoGreedy}$, we present an improved $0.545$approximation algorithm
for monotone submodular maximization, which is in turn the first
MapReducebased algorithm beating factor $1/2$ in a constant number of rounds.

In this paper, we initiate the study of the multiplicative bidding language
adopted by major Internet search companies. In multiplicative bidding, the
effective bid on a particular search auction is the product of a base bid and
bid adjustments that are dependent on features of the search (for example, the
geographic location of the user, or the platform on which the search is
conducted). We consider the task faced by the advertiser when setting these bid
adjustments, and establish a foundational optimization problem that captures
the core difficulty of bidding under this language. We give matching
algorithmic and approximation hardness results for this problem; these results
are against an informationtheoretic bound, and thus have implications on the
power of the multiplicative bidding language itself. Inspired by empirical
studies of search engine price data, we then codify the relevant restrictions
of the problem, and give further algorithmic and hardness results. Our main
technical contribution is an $O(\log n)$approximation for the case of
multiplicative prices and monotone values. We also provide empirical
validations of our problem restrictions, and test our algorithms on real data
against natural benchmarks. Our experiments show that they perform favorably
compared with the baseline.

Constraints on agent's ability to pay play a major role in auction design for
any setting where the magnitude of financial transactions is sufficiently
large. Those constraints have been traditionally modeled in mechanism design as
\emph{hard budget}, i.e., mechanism is not allowed to charge agents more than a
certain amount. Yet, real auction systems (such as Google AdWords) allow more
sophisticated constraints on agents' ability to pay, such as \emph{average
budgets}. In this work, we investigate the design of Pareto optimal and
incentive compatible auctions for agents with \emph{constrained quasilinear
utilities}, which captures more realistic models of liquidity constraints that
the agents may have. Our result applies to a very general class of allocation
constraints known as polymatroidal environments, encompassing many settings of
interest such as multiunit auctions, matching markets, videoondemand and
advertisement systems.
Our design is based Ausubel's \emph{clinching framework}. Incentive
compatibility and feasibility with respect to abilitytopay constraints are
direct consequences of the clinching framework. Paretooptimality, on the other
hand, is considerably more challenging, since the notrade condition that
characterizes it depends not only on whether agents have their budgets
exhausted or not, but also on prices {at} which the goods are allocated. In
order to get a handle on those prices, we introduce novel concepts of dropping
prices and saturation. These concepts lead to our main structural result which
is a characterization of the tight sets in the clinching auction outcome and
its relation to dropping prices.

Motivated by applications of largescale graph clustering, we study
randomwalkbased LOCAL algorithms whose running times depend only on the size
of the output cluster, rather than the entire graph. All previously known such
algorithms guarantee an output conductance of $\tilde{O}(\sqrt{\phi(A)})$ when
the target set $A$ has conductance $\phi(A)\in[0,1]$. In this paper, we improve
it to $$\tilde{O}\bigg( \min\Big\{\sqrt{\phi(A)},
\frac{\phi(A)}{\sqrt{\mathsf{Conn}(A)}} \Big\} \bigg)\enspace, $$ where the
internal connectivity parameter $\mathsf{Conn}(A) \in [0,1]$ is defined as the
reciprocal of the mixing time of the random walk over the induced subgraph on
$A$.
For instance, using $\mathsf{Conn}(A) = \Omega(\lambda(A) / \log n)$ where
$\lambda$ is the second eigenvalue of the Laplacian of the induced subgraph on
$A$, our conductance guarantee can be as good as
$\tilde{O}(\phi(A)/\sqrt{\lambda(A)})$. This builds an interesting connection
to the recent advance of the socalled improved Cheeger's Inequality [KKL+13],
which says that global spectral algorithms can provide a conductance guarantee
of $O(\phi_{\mathsf{opt}}/\sqrt{\lambda_3})$ instead of
$O(\sqrt{\phi_{\mathsf{opt}}})$.
In addition, we provide theoretical guarantee on the clustering accuracy (in
terms of precision and recall) of the output set. We also prove that our
analysis is tight, and perform empirical evaluation to support our theory on
both synthetic and real data.
It is worth noting that, our analysis outperforms prior work when the cluster
is wellconnected. In fact, the better it is wellconnected inside, the more
significant improvement (both in terms of conductance and accuracy) we can
obtain. Our results shed light on why in practice some randomwalkbased
algorithms perform better than its previous theory, and help guide future
research about local clustering.

Auctions for perishable goods such as internet ad inventory need to make
realtime allocation and pricing decisions as the supply of the good arrives in
an online manner, without knowing the entire supply in advance. These
allocation and pricing decisions get complicated when buyers have some global
constraints. In this work, we consider a multiunit model where buyers have
global {\em budget} constraints, and the supply arrives in an online manner.
Our main contribution is to show that for this setting there is an
individuallyrational, incentivecompatible and Paretooptimal auction that
allocates these units and calculates prices on the fly, without knowledge of
the total supply. We do so by showing that the Adaptive Clinching Auction
satisfies a {\em supplymonotonicity} property.
We also analyze and discuss, using examples, how the insights gained by the
allocation and payment rule can be applied to design better ad allocation
heuristics in practice. Finally, while our main technical result concerns
multiunit supply, we propose a formal model of online supply that captures
scenarios beyond multiunit supply and has applications to sponsored search. We
conjecture that our results for multiunit auctions can be extended to these
more general models.

In light of the growing market of Ad Exchanges for the realtime sale of
advertising slots, publishers face new challenges in choosing between the
allocation of contractbased reservation ads and spot market ads. In this
setting, the publisher should take into account the tradeoff between shortterm
revenue from an Ad Exchange and quality of allocating reservation ads. In this
paper, we formalize this combined optimization problem as a stochastic control
problem and derive an efficient policy for online ad allocation in settings
with general joint distribution over placement quality and exchange bids. We
prove asymptotic optimality of this policy in terms of any tradeoff between
quality of delivered reservation ads and revenue from the exchange, and provide
a rigorous bound for its convergence rate to the optimal policy. We also give
experimental results on data derived from real publisher inventory, showing
that our policy can achieve any paretooptimal point on the quality vs. revenue
curve. Finally, we study a parametric trainingbased algorithm in which instead
of learning the dual variables from a sample data (as is done in nonparametric
trainingbased algorithms), we learn the parameters of the distribution and
construct those dual variables from the learned parameter values. We compare
parametric and nonparametric ways to estimate from data both analytically and
experimentally in the special case without the ad exchange, and show that
though both methods converge to the optimal policy as the sample size grows,
our parametric method converges faster, and thus performs better on smaller
samples.

A central issue in applying auction theory in practice is the problem of
dealing with budgetconstrained agents. A desirable goal in practice is to
design incentive compatible, individually rational, and Pareto optimal auctions
while respecting the budget constraints. Achieving this goal is particularly
challenging in the presence of nontrivial combinatorial constraints over the
set of feasible allocations.
Toward this goal and motivated by AdWords auctions, we present an auction for
{\em polymatroidal} environments satisfying the above properties. Our auction
employs a novel clinching technique with a clean geometric description and only
needs an oracle access to the submodular function defining the polymatroid. As
a result, this auction not only simplifies and generalizes all previous
results, it applies to several new applications including AdWords Auctions,
bandwidth markets, and video on demand. In particular, our characterization of
the AdWords auction as polymatroidal constraints might be of independent
interest. This allows us to design the first mechanism for Ad Auctions taking
into account simultaneously budgets, multiple keywords and multiple slots.
We show that it is impossible to extend this result to generic polyhedral
constraints. This also implies an impossibility result for multiunit auctions
with decreasing marginal utilities in the presence of budget constraints.

Lookahead search is perhaps the most natural and widely used game playing
strategy. Given the practical importance of the method, the aim of this paper
is to provide a theoretical performance examination of lookahead search in a
wide variety of applications.
To determine a strategy play using lookahead search}, each agent predicts
multiple levels of possible reactions to her move (via the use of a search
tree), and then chooses the play that optimizes her future payoff accounting
for these reactions. There are several choices of optimization function the
agents can choose, where the most appropriate choice of function will depend on
the specifics of the actual game  we illustrate this in our examples.
Furthermore, the type of search tree chosen by computationallyconstrained
agent can vary. We focus on the case where agents can evaluate only a bounded
number, $k$, of moves into the future. That is, we use depth $k$ search trees
and call this approach {\em klookahead search}.
We apply our method in five wellknown settings: AdWord auctions; industrial
organization (Cournot's model); congestion games; validutility games and
basicutility games; costsharing network design games. We consider two
questions. First, what is the expected social quality of outcome when agents
apply lookahead search? Second, what interactive behaviours can be exhibited
when players use lookahead search?

The spread of influence in social networks is studied in two main categories:
the progressive model and the nonprogressive model (see e.g. the seminal work
of Kempe, Kleinberg, and Tardos in KDD 2003). While the progressive models are
suitable for modeling the spread of influence in monopolistic settings,
nonprogressive are more appropriate for modeling nonmonopolistic settings,
e.g., modeling diffusion of two competing technologies over a social network.
Despite the extensive work on the progressive model, nonprogressive models
have not been studied well. In this paper, we study the spread of influence in
the nonprogressive model under the strict majority threshold: given a graph
$G$ with a set of initially infected nodes, each node gets infected at time
$\tau$ iff a majority of its neighbors are infected at time $\tau1$. Our goal
in the \textit{MinPTS} problem is to find a minimumcardinality initial set of
infected nodes that would eventually converge to a steady state where all nodes
of $G$ are infected.
We prove that while the MinPTS is NPhard for a restricted family of graphs,
it admits an improved constantfactor approximation algorithm for powerlaw
graphs. We do so by proving lower and upper bounds in terms of the minimum and
maximum degree of nodes in the graph. The upper bound is achieved in turn by
applying a natural greedy algorithm. Our experimental evaluation of the greedy
algorithm also shows its superior performance compared to other algorithms for
a set of realworld graphs as well as the random powerlaw graphs. Finally, we
study the convergence properties of these algorithms and show that the
nonprogressive model converges in at most $O(E(G))$ steps.