
Recent developments in formal verification have identified approximate
liftings (also known as approximate couplings) as a clean, compositional
abstraction for proving differential privacy. This construction can be defined
in two styles. Earlier definitions require the existence of one or more witness
distributions, while a recent definition by Sato uses universal quantification
over all sets of samples. These notions have each have their own strengths: the
universal version is more general than the existential ones, while existential
liftings are known to satisfy more precise composition principles.
We propose a novel, existential version of approximate lifting, called
$\star$lifting, and show that it is equivalent to Sato's construction for
discrete probability measures. Our work unifies all known notions of
approximate lifting, yielding cleaner properties, more general constructions,
and more precise composition theorems for both styles of lifting, enabling
richer proofs of differential privacy. We also clarify the relation between
existing definitions of approximate lifting, and consider more general
approximate liftings based on $f$divergences.

Proof by coupling is a classical technique for proving properties about pairs
of randomized algorithms by carefully relating (or coupling) two probabilistic
executions. In this paper, we show how to automatically construct such proofs
for probabilistic programs. First, we present fcoupled postconditions, an
abstraction describing two correlated program executions. Second, we show how
properties of fcoupled postconditions can imply various probabilistic
properties of the original programs. Third, we demonstrate how to reduce the
proofsearch problem to a purely logical synthesis problem of the form $\exists
f\ldotp \forall X\ldotp \phi$, making probabilistic reasoning unnecessary. We
develop a prototype implementation to automatically build coupling proofs for
probabilistic properties, including uniformity and independence of program
expressions.

We tackle the problem of deciding whether two probabilistic programs are
equivalent in Probabilistic NetKAT, a formal language for specifying and
reasoning about the behavior of packetswitched networks. We show that the
problem is decidable for the historyfree fragment of the language by
developing an effective decision procedure based on stochastic matrices. The
main challenge lies in reasoning about iteration, which we address by designing
an encoding of the program semantics as a finitestate absorbing Markov chain,
whose limiting distribution can be computed exactly. In an extended case study
on a realworld data center network, we automatically verify various
quantitative properties of interest, including resilience in the presence of
failures, by analyzing the Markov chain semantics.

Research on deductive verification of probabilistic programs has considered
expectationbased logics, where pre and postconditions are realvalued
functions on states, and assertionbased logics, where pre and postconditions
are boolean predicates on state distributions. Both approaches have developed
over nearly four decades, but they have different standings today.
Expectationbased systems have managed to formalize many sophisticated case
studies, while assertionbased systems today have more limited expressivity and
have targeted simpler examples.
We present Ellora, a sound and relatively complete assertionbased program
logic, and demonstrate its expressivity by verifying several classical examples
of randomized algorithms using an implementation in the EasyCrypt proof
assistant. Ellora features new proof rules for loops and adversarial code, and
supports richer assertions than existing program logics. We also show that
Ellora allows convenient reasoning about complex probabilistic concepts by
developing a new program logic for probabilistic independence and distribution
law, and then smoothly embedding it into Ellora. Our work demonstrates that the
assertionbased approach is not fundamentally limited and suggests that some
notions are potentially easier to reason about in assertionbased systems.

We develop a semantics framework for verifying recent relaxations of
differential privacy: R\'enyi differential privacy and zeroconcentrated
differential privacy. Both notions require a bound on a particular statistical
divergence between two probability distributions. In order to reason about such
properties compositionally, we introduce approximate spanliftings,
generalizing approximate relational liftings previously developed for standard
differential privacy to a more general class of divergences, and to continuous
distributions. To enable verification of possibly nonterminating programs, our
framework supports generalized divergences between subprobability measures. As
a concrete application, we use approximate spanliftings to develop a program
logic that can prove relaxations of differential privacy and other
probabilistic properties based on statistical

We define Almost Sure Productivity (ASP), a probabilistic generalization of
the productivity condition for coinductively defined structures. Intuitively, a
probabilistic coinductive stream or tree is ASP if it produces infinitely many
outputs with probability 1. Formally, we define almost sure productivity using
a final coalgebra semantics of programs inspired from Kerstan and K\"onig.
Then, we introduce a core language for probabilistic streams and trees, and
provide two approaches to verify ASP: a sufficient syntactic criterion, and a
reduction to modelchecking pCTL* formulas on probabilistic pushdown automata.
The reduction shows that ASP is decidable for our core language.

Differential privacy has emerged as a promising probabilistic formulation of
privacy, generating intense interest within academia and industry. We present a
pushbutton, automated technique for verifying $\epsilon$differential privacy
of sophisticated randomized algorithms. We make several conceptual,
algorithmic, and practical contributions: (i) Inspired by the recent advances
on approximate couplings and randomness alignment, we present a new proof
technique called coupling strategies, which casts differential privacy proofs
as discovering a strategy in a game where we have finite privacy resources to
expend. (ii) To discover a winning strategy, we present a constraintbased
formulation of the problem as solving a set of Horn modulo couplings (HMC)
constraints, a novel combination of firstorder Horn clauses and probabilistic
constraints. (iii) We present a technique for solving HMCconstraints by
transforming probabilistic constraints into logical constraints with
uninterpreted functions. (iv) Finally, we implement our technique and provide
the first automated privacy proofs for a number of challenging algorithms from
the differential privacy literature, including Report Noisy Max, the
Exponential Mechanism, and the Sparse Vector Mechanism.

Program sensitivity, also known as Lipschitz continuity, describes how small
changes in a program's input lead to bounded changes in the output. We propose
an average notion of program sensitivity for probabilistic programsexpected
sensitivitythat averages a distance function over a probabilistic coupling
of two output distributions from two nearby inputs. By varying the distance,
expected sensitivity captures useful notions of probabilistic function
sensitivity, including algorithmic stability of machine learning algorithms and
convergence of Markov chains.
Furthermore, expected sensitivity satisfies clean compositional properties
and is amenable to formal verification. We develop a relational program logic
called EpRHL for proving expected sensitivity properties. Our logic features
two key ideas. First, relational preconditions and postconditions are
expressed using distances, which can be seen as a realvalued generalization of
typical booleanvalued (relational) assertions. Second, judgments are
interpreted in terms of expectation coupling, a novel, quantitative
generalization of probabilistic couplings which supports compositional
reasoning.
We demonstrate our logic on two classes of examples: formalizing uniform
stability for the stochastic gradient algorithm from machine learning, and
rapid mixing for a model of asexual population dynamics. We also extend our
logic with a transitivity principle for expectation couplings to capture the
path coupling proof technique by Bubley and Dyer, and we demonstrate our
extension by proving rapid mixing of the Glauber dynamics from statistical
physics.

Program sensitivity measures how robust a program is to small changes in its
input, and is a fundamental notion in domains ranging from differential privacy
to cyberphysical systems. A natural way to formalize program sensitivity is in
terms of metrics on the input and output spaces, requiring that an
$r$sensitive function map inputs that are at distance $d$ to outputs that are
at distance at most $r \cdot d$. Program sensitivity is thus an analogue of
Lipschitz continuity for programs.
Reed and Pierce introduced Fuzz, a functional language with a linear type
system that can express program sensitivity. They show soundness operationally,
in the form of a metric preservation property. Inspired by their work, we study
program sensitivity and metric preservation from a denotational point of view.
In particular, we introduce metric CPOs, a novel semantic structure for
reasoning about computation on metric spaces, by endowing CPOs with a
compatible notion of distance. This structure is useful for reasoning about
metric properties of programs, and specifically about program sensitivity. We
demonstrate metric CPOs by giving a model for the deterministic fragment of
Fuzz.

Proof by coupling is a classical proof technique for establishing
probabilistic properties of two probabilistic processes, like stochastic
dominance and rapid mixing of Markov chains. More recently, couplings have been
investigated as a useful abstraction for formal reasoning about relational
properties of probabilistic programs, in particular for modeling
reductionbased cryptographic proofs and for verifying differential privacy. In
this paper, we demonstrate that probabilistic couplings can be used for
verifying nonrelational probabilistic properties. Specifically, we show that
the program logic pRHLwhose proofs are formal versions of proofs by
couplingcan be used for formalizing uniformity and probabilistic
independence. We formally verify our main examples using the EasyCrypt proof
assistant.

In this paper, we develop compositional methods for formally verifying
differential privacy for algorithms whose analysis goes beyond the composition
theorem. Our methods are based on the observation that differential privacy has
deep connections with a generalization of probabilistic couplings, an
established mathematical tool for reasoning about stochastic processes. Even
when the composition theorem is not helpful, we can often prove privacy by a
coupling argument.
We demonstrate our methods on two algorithms: the Exponential mechanism and
the Above Threshold algorithm, the critical component of the famous Sparse
Vector algorithm. We verify these examples in a relational program logic
apRHL+, which can construct approximate couplings. This logic extends the
existing apRHL logic with more general rules for the Laplace mechanism and the
onesided Laplace mechanism, and new structural rules enabling pointwise
reasoning about privacy; all the rules are inspired by the connection with
coupling. While our paper is presented from a formal verification perspective,
we believe that its main insight is of independent interest for the
differential privacy community.

In mechanism design, the gold standard solution concepts are dominant
strategy incentive compatibility and Bayesian incentive compatibility. These
solution concepts relieve the (possibly unsophisticated) bidders from the need
to engage in complicated strategizing. While incentive properties are simple to
state, their proofs are specific to the mechanism and can be quite complex.
This raises two concerns. From a practical perspective, checking a complex
proof can be a tedious process, often requiring experts knowledgeable in
mechanism design. Furthermore, from a modeling perspective, if unsophisticated
agents are unconvinced of incentive properties, they may strategize in
unpredictable ways.
To address both concerns, we explore techniques from computeraided
verification to construct formal proofs of incentive properties. Because formal
proofs can be automatically checked, agents do not need to manually check the
properties, or even understand the proof. To demonstrate, we present the
verification of a sophisticated mechanism: the generic reduction from Bayesian
incentive compatible mechanism design to algorithm design given by Hartline,
Kleinberg, and Malekian. This mechanism presents new challenges for formal
verification, including essential use of randomness from both the execution of
the mechanism and from the prior type distributions. As an immediate
consequence, our work also formalizes Bayesian incentive compatibility for the
entire family of mechanisms derived via this reduction. Finally, as an
intermediate step in our formalization, we provide the first formal
verification of incentive compatibility for the celebrated
VickreyClarkeGroves mechanism.

Couplings are a powerful mathematical tool for reasoning about pairs of
probabilistic processes. Recent developments in formal verification identify a
close connection between couplings and pRHL, a relational program logic
motivated by applications to provable security, enabling formal construction of
couplings from the probability theory literature. However, existing work using
pRHL merely shows existence of a coupling and does not give a way to prove
quantitative properties about the coupling, which are need to reason about
mixing and convergence of probabilistic processes. Furthermore, pRHL is
inherently incomplete, and is not able to capture some advanced forms of
couplings such as shift couplings. We address both problems as follows.
First, we define an extension of pRHL, called xpRHL, which explicitly
constructs the coupling in a pRHL derivation in the form of a probabilistic
product program that simulates two correlated runs of the original program.
Existing verification tools for probabilistic programs can then be directly
applied to the probabilistic product to prove quantitative properties of the
coupling. Second, we equip pRHL with a new rule for while loops, where
reasoning can freely mix synchronized and unsynchronized loop iterations. Our
proof rule can capture examples of shift couplings, and the logic is relatively
complete for deterministic programs.
We show soundness of xpRHL and use it to analyze two classes of examples.
First, we verify rapid mixing using different tools from coupling: standard
coupling, shift coupling, and path coupling, a compositional principle for
combining local couplings into a global coupling. Second, we verify
(approximate) equivalence between a source and an optimized program for several
instances of loop optimizations from the literature.

We consider a private variant of the classical allocation problem: given k
goods and n agents with individual, private valuation functions over bundles of
goods, how can we partition the goods amongst the agents to maximize social
welfare? An important special case is when each agent desires at most one good,
and specifies her (private) value for each good: in this case, the problem is
exactly the maximumweight matching problem in a bipartite graph.
Private matching and allocation problems have not been considered in the
differential privacy literature, and for good reason: they are plainly
impossible to solve under differential privacy. Informally, the allocation must
match agents to their preferred goods in order to maximize social welfare, but
this preference is exactly what agents wish to hide. Therefore, we consider the
problem under the relaxed constraint of joint differential privacy: for any
agent i, no coalition of agents excluding i should be able to learn about the
valuation function of agent i. In this setting, the full allocation is no
longer publishedinstead, each agent is told what good to get. We first show
that with a small number of identical copies of each good, it is possible to
efficiently and accurately solve the maximum weight matching problem while
guaranteeing joint differential privacy. We then consider the more general
allocation problem, when bidder valuations satisfy the gross substitutes
condition. Finally, we prove that the allocation problem cannot be solved to
nontrivial accuracy under joint differential privacy without requiring
multiple copies of each type of good.

Differential privacy is a promising formal approach to data privacy, which
provides a quantitative bound on the privacy cost of an algorithm that operates
on sensitive information. Several tools have been developed for the formal
verification of differentially private algorithms, including program logics and
type systems. However, these tools do not capture fundamental techniques that
have emerged in recent years, and cannot be used for reasoning about
cuttingedge differentially private algorithms. Existing techniques fail to
handle three broad classes of algorithms: 1) algorithms where privacy depends
accuracy guarantees, 2) algorithms that are analyzed with the advanced
composition theorem, which shows slower growth in the privacy cost, 3)
algorithms that interactively accept adaptive inputs.
We address these limitations with a new formalism extending apRHL, a
relational program logic that has been used for proving differential privacy of
noninteractive algorithms, and incorporating aHL, a (nonrelational) program
logic for accuracy properties. We illustrate our approach through a single
running example, which exemplifies the three classes of algorithms and explores
new variants of the Sparse Vector technique, a wellstudied algorithm from the
privacy literature. We implement our logic in EasyCrypt, and formally verify
privacy. We also introduce a novel coupling technique called \emph{optimal
subset coupling} that may be of independent interest.

We present PrivInfer, an expressive framework for writing and verifying
differentially private Bayesian machine learning algorithms. Programs in
PrivInfer are written in a rich functional probabilistic programming language
with constructs for performing Bayesian inference. Then, differential privacy
of programs is established using a relational refinement type system, in which
refinements on probability types are indexed by a metric on distributions. Our
framework leverages recent developments in Bayesian inference, probabilistic
programming languages, and in relational refinement types. We demonstrate the
expressiveness of PrivInfer by verifying privacy for several examples of
private Bayesian inference.

Walrasian equilibrium prices can be said to coordinate markets: They support
a welfare optimal allocation in which each buyer is buying bundle of goods that
is individually most preferred. However, this clean story has two caveats.
First, the prices alone are not sufficient to coordinate the market, and buyers
may need to select among their most preferred bundles in a coordinated way to
find a feasible allocation. Second, we don't in practice expect to encounter
exact equilibrium prices tailored to the market, but instead only approximate
prices, somehow encoding "distributional" information about the market. How
well do prices work to coordinate markets when tiebreaking is not coordinated,
and they encode only distributional information?
We answer this question. First, we provide a genericity condition such that
for buyers with Matroid Based Valuations, overdemand with respect to
equilibrium prices is at most 1, independent of the supply of goods, even when
tiebreaking is done in an uncoordinated fashion. Second, we provide
learningtheoretic results that show that such prices are robust to changing
the buyers in the market, so long as all buyers are sampled from the same
(unknown) distribution.

When analyzing probabilistic computations, a powerful approach is to first
find a martingalean expression on the program variables whose expectation
remains invariantand then apply the optional stopping theorem in order to
infer properties at termination time. One of the main challenges, then, is to
systematically find martingales.
We propose a novel procedure to synthesize martingale expressions from an
arbitrary initial expression. Contrary to stateoftheart approaches, we do
not rely on constraint solving. Instead, we use a symbolic construction based
on Doob's decomposition. This procedure can produce very complex martingales,
expressed in terms of conditional expectations.
We show how to automatically generate and simplify these martingales, as well
as how to apply the optional stopping theorem to infer properties at
termination time. This last step typically involves some simplification steps,
and is usually done manually in current approaches. We implement our techniques
in a prototype tool and demonstrate our process on several classical examples.
Some of them go beyond the capability of current semiautomatic approaches.

We propose a probabilistic Hoare logic aHL based on the union bound, a tool
from basic probability theory. While the union bound is simple, it is an
extremely common tool for analyzing randomized algorithms. In formal
verification terms, the union bound allows flexible and compositional reasoning
over possible ways an algorithm may go wrong. It also enables a clean
separation between reasoning about probabilities and reasoning about events,
which are expressed as standard firstorder formulas in our logic. Notably,
assertions in our logic are nonprobabilistic, even though we can conclude
probabilistic facts from the judgments.
Our logic can also prove accuracy properties for interactive programs, where
the program must produce intermediate outputs as soon as pieces of the input
arrive, rather than accessing the entire input at once. This setting also
enables adaptivity, where later inputs may depend on earlier intermediate
outputs. We show how to prove accuracy for several examples from the
differential privacy literature, both interactive and noninteractive.

We present a practical, differentially private algorithm for answering a
large number of queries on high dimensional datasets. Like all algorithms for
this task, ours necessarily has worstcase complexity exponential in the
dimension of the data. However, our algorithm packages the computationally hard
step into a concisely defined integer program, which can be solved
nonprivately using standard solvers. We prove accuracy and privacy theorems
for our algorithm, and then demonstrate experimentally that our algorithm
performs well in practice. For example, our algorithm can efficiently and
accurately answer millions of queries on the Netflix dataset, which has over
17,000 attributes; this is an improvement on the state of the art by multiple
orders of magnitude.

Probabilistic coupling is a powerful tool for analyzing pairs of
probabilistic processes. Roughly, coupling two processes requires finding an
appropriate witness process that models both processes in the same probability
space. Couplings are powerful tools proving properties about the relation
between two processes, include reasoning about convergence of distributions and
stochastic dominancea probabilistic version of a monotonicity property.
While the mathematical definition of coupling looks rather complex and
cumbersome to manipulate, we show that the relational program logic pRHLthe
logic underlying the EasyCrypt cryptographic proof assistantalready
internalizes a generalization of probabilistic coupling. With this insight,
constructing couplings is no harder than constructing logical proofs. We
demonstrate how to express and verify classic examples of couplings in pRHL,
and we mechanically verify several couplings in EasyCrypt.

We investigate the problem of heterogeneous task assignment in crowdsourcing
markets from the point of view of the requester, who has a collection of tasks.
Workers arrive online one by one, and each declare a set of feasible tasks they
can solve, and desired payment for each feasible task. The requester must
decide on the fly which task (if any) to assign to the worker, while assigning
workers only to feasible tasks. The goal is to maximize the number of assigned
tasks with a fixed overall budget.
We provide an online algorithm for this problem and prove an upper bound on
the competitive ratio of this algorithm against an arbitrary (possibly
worstcase) sequence of workers who want small payments relative to the
requester's total budget. We further show an almost matching lower bound on the
competitive ratio of any algorithm in this setting. Finally, we propose a
different algorithm that achieves an improved competitive ratio in the random
permutation model, where the order of arrival of the workers is chosen
uniformly at random. Apart from these strong theoretical guarantees, we carry
out experiments on simulated data which demonstrates the practical
applicability of our algorithms.

Recent works have shown the power of linear indexed type systems for
enforcing complex program properties. These systems combine linear types with a
language of typelevel indices, allowing more finegrained analyses. Such
systems have been fruitfully applied in diverse domains, including implicit
complexity and differential privacy. A natural way to enhance the
expressiveness of this approach is by allowing the indices to depend on runtime
information, in the spirit of dependent types. This approach is used in DFuzz,
a language for differential privacy. The DFuzz type system relies on an index
language supporting real and natural number arithmetic over constants and
variables. Moreover, DFuzz uses a subtyping mechanism to make types more
flexible. By themselves, linearity, dependency, and subtyping each require
delicate handling when performing type checking or type inference; their
combination increases this challenge substantially, as the features can
interact in nontrivial ways. In this paper, we study the typechecking problem
for DFuzz. We show how we can reduce type checking for (a simple extension of)
DFuzz to constraint solving over a firstorder theory of naturals and real
numbers which, although undecidable, can often be handled in practice by
standard numeric solvers.

In this paper, we give efficient algorithms and lower bounds for solving the
heavy hitters problem while preserving differential privacy in the fully
distributed local model. In this model, there are n parties, each of which
possesses a single element from a universe of size N. The heavy hitters problem
is to find the identity of the most common element shared amongst the n
parties. In the local model, there is no trusted database administrator, and so
the algorithm must interact with each of the $n$ parties separately, using a
differentially private protocol. We give tight informationtheoretic upper and
lower bounds on the accuracy to which this problem can be solved in the local
model (giving a separation between the local model and the more common
centralized model of privacy), as well as computationally efficient algorithms
even in the case where the data universe N may be exponentially large.

In this paper we present an extremely general method for approximately
solving a large family of convex programs where the solution can be divided
between different agents, subject to joint differential privacy. This class
includes multicommodity flow problems, general allocation problems, and
multidimensional knapsack problems, among other examples. The accuracy of our
algorithm depends on the \emph{number} of constraints that bind between
individuals, but crucially, is \emph{nearly independent} of the number of
primal variables and hence the number of agents who make up the problem. As the
number of agents in a problem grows, the error we introduce often becomes
negligible.
We also consider the setting where agents are strategic and have preferences
over their part of the solution. For any convex program in this class that
maximizes \emph{social welfare}, there is a generic reduction that makes the
corresponding optimization \emph{approximately dominant strategy truthful} by
charging agents prices for resources as a function of the approximately optimal
dual variables, which are themselves computed under differential privacy. Our
results substantially expand the class of problems that are known to be
solvable under both privacy and incentive constraints.