
Consider a queueing system consisting of multiple servers. Jobs arrive over
time and enter a queue for service; the goal is to minimize the size of this
queue. At each opportunity for service, at most one server can be chosen, and
at most one job can be served. Service is successful with a probability (the
service probability) that is a priori unknown for each server. An algorithm
that knows the service probabilities (the "genie") can always choose the server
of highest service probability. We study algorithms that learn the unknown
service probabilities. Our goal is to minimize queueregret: the (expected)
difference between the queuelengths obtained by the algorithm, and those
obtained by the "genie."
Since queueregret cannot be larger than classical regret, results for the
standard multiarmed bandit problem give algorithms for which queueregret
increases no more than logarithmically in time. Our paper shows surprisingly
more complex behavior. In particular, as long as the bandit algorithm's queues
have relatively long regenerative cycles, queueregret is similar to cumulative
regret, and scales (essentially) logarithmically. However, we show that this
"early stage" of the queueing bandit eventually gives way to a "late stage",
where the optimal queueregret scaling is $O(1/t)$. We demonstrate an algorithm
that (orderwise) achieves this asymptotic queueregret in the late stage. Our
results are developed in a more general model that allows for multiple job
classes as well.

We introduce a new malware detector  ShapeGD  that aggregates permachine
detectors into a robust global detector. ShapeGD is based on two insights: 1.
Structural: actions such as visiting a website (waterhole attack) by nodes
correlate well with malware spread, and create dynamic neighborhoods of nodes
that were exposed to the same attack vector. However, neighborhood sizes vary
unpredictably and require aggregating an unpredictable number of local
detectors' outputs into a global alert. 2. Statistical: feature vectors
corresponding to true and false positives of local detectors have markedly
different conditional distributions  i.e. their shapes differ. The shape of
neighborhoods can identify infected neighborhoods without having to estimate
neighborhood sizes  on 5 years of Symantec detectors' logs, ShapeGD reduces
false positives from ~1M down to ~110K and raises alerts 345 days (on average)
before commercial antivirus products; in a waterhole attack simulated using
Yahoo webservice logs, ShapeGD detects infected machines when only ~100 of
~550K are compromised.

We consider the task of learning the parameters of a {\em single} component
of a mixture model, for the case when we are given {\em side information} about
that component, we call this the "search problem" in mixture models. We would
like to solve this with computational and sample complexity lower than solving
the overall original problem, where one learns parameters of all components.
Our main contributions are the development of a simple but general model for
the notion of side information, and a corresponding simple matrixbased
algorithm for solving the search problem in this general setting. We then
specialize this model and algorithm to four common scenarios: Gaussian mixture
models, LDA topic models, subspace clustering, and mixed linear regression. For
each one of these we show that if (and only if) the side information is
informative, we obtain parameter estimates with greater accuracy, and also
improved computation complexity than existing moment based mixture model
algorithms (e.g. tensor methods). We also illustrate several natural ways one
can obtain such side information, for specific problem instances. Our
experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the
practicality of our algorithms showing significant improvement in runtime and
accuracy.

We consider the problem of contextual bandits with stochastic experts, which
is a variation of the traditional stochastic contextual bandit with experts
problem. In our problem setting, we assume access to a class of stochastic
experts, where each expert is a conditional distribution over the arms given a
context. We propose upperconfidence bound (UCB) algorithms for this problem,
which employ two different importance sampling based estimators for the mean
reward for each expert. Both these estimators leverage information leakage
among the experts, thus using samples collected under all the experts to
estimate the mean reward of any given expert. This leads to instance dependent
regret bounds of $\mathcal{O}\left(\lambda(\pmb{\mu})\mathcal{M}\log T/\Delta
\right)$, where $\lambda(\pmb{\mu})$ is a term that depends on the mean rewards
of the experts, $\Delta$ is the smallest gap between the mean reward of the
optimal expert and the rest, and $\mathcal{M}$ quantifies the information
leakage among the experts. We show that under some assumptions
$\lambda(\pmb{\mu})$ is typically $\mathcal{O}(\log N)$. We implement our
algorithm with stochastic experts generated from costsensitive classification
oracles and show superior empirical performance on realworld datasets, when
compared to other state of the art contextual bandit algorithms.

We consider learningbased variants of the $c \mu$ rule  a classic and
wellstudied scheduling policy  in single and multiserver settings for
multiclass queueing systems. In the single server setting, the $c \mu$ rule is
known to minimize the expected holdingcost (weighted queuelengths summed both
over classes and time). We focus on the setting where the service rates $\mu$
are unknown, and are interested in the holdingcost regret  the difference in
the expected holdingcosts between that induced by a learningbased rule (that
learns $\mu$) and that from the $c \mu$ rule (which has knowledge of the
service rates) over any fixed time horizon. We first show that empirically
learning the service rates and then scheduling using these learned values
results in a regret of holdingcost that does not depend on the time horizon.
The key insight that allows such a constant regret bound is that a
workconserving scheduling policy in this setting allows explorefree learning,
where no penalty is incurred for exploring and learning server rates.
We next consider the multiserver setting. We show that in general, the $c
\mu$ rule is not stabilizing (i.e. there are stabilizable arrival and service
rate parameters for which the multiserver $c \mu$ rule results in unstable
queues). We then characterize sufficient conditions for stability (and also
concentrations on busy periods). Using these results, we show that
learningbased variants of the $c\mu$ rule again result in a constant regret
(i.e. does not depend on the time horizon). This result hinges on (i) the busy
period concentrations of the multiserver $c \mu$ rule, and that (ii) our
learningbased rule is designed to dynamically explore server rates, but in
such a manner that it eventually satisfies an explorefree condition.

We consider the problem of nonparametric Conditional Independence testing
(CI testing) for continuous random variables. Given i.i.d samples from the
joint distribution $f(x,y,z)$ of continuous random vectors $X,Y$ and $Z,$ we
determine whether $X \perp Y  Z$. We approach this by converting the
conditional independence test into a classification problem. This allows us to
harness very powerful classifiers like gradientboosted trees and deep neural
networks. These models can handle complex probability distributions and allow
us to perform significantly better compared to the prior state of the art, for
highdimensional CI testing. The main technical challenge in the classification
problem is the need for samples from the conditional product distribution
$f^{CI}(x,y,z) = f(xz)f(yz)f(z)$  the joint distribution if and only if $X
\perp Y  Z.$  when given access only to i.i.d. samples from the true joint
distribution $f(x,y,z)$. To tackle this problem we propose a novel nearest
neighbor bootstrap procedure and theoretically show that our generated samples
are indeed close to $f^{CI}$ in terms of total variational distance. We then
develop theoretical results regarding the generalization bounds for
classification for our problem, which translate into error bounds for CI
testing. We provide a novel analysis of Rademacher type classification bounds
in the presence of noni.i.d nearindependent samples. We empirically validate
the performance of our algorithm on simulated and real datasets and show
performance gains over previous methods.

Behavioral malware detectors promise to expose previously unknown malware and
are an important security primitive. However, even the best behavioral
detectors suffer from high false positives and negatives. In this paper, we
address the challenge of aggregating weak perdevice behavioral detectors in
noisy communities (i.e., ones that produce alerts at unpredictable rates) into
an accurate and robust global anomaly detector (GD).
Our system  Shape GD  combines two insights: Structural: actions such as
visiting a website (waterhole attack) or membership in a shared email thread
(phishing attack) by nodes correlate well with malware spread, and create
dynamic neighborhoods of nodes that were exposed to the same attack vector; and
Statistical: feature vectors corresponding to true and false positives of local
detectors have markedly different conditional distributions. We use
neighborhoods to amplify the transient lowdimensional structure that is latent
in highdimensional feature vectors  but neighborhoods vary unpredictably, and
we use shape to extract robust neighborhoodlevel features that identify
infected neighborhoods.
Unlike prior works that aggregate local detectors' alert bitstreams or
cluster the feature vectors, Shape GD analyzes the feature vectors that led to
local alerts (alertFVs) to separate true and false positives. Shape GD first
filters these alertFVs into neighborhoods and efficiently maps a
neighborhood's alertFVs' statistical shapes into a scalar score. Shape GD then
acts as a neighborhood level anomaly detector  training on benign program
traces to learn the ShapeScore of false positive neighborhoods, and classifying
neighborhoods with anomalous ShapeScores as malicious.
Shape GD detects malware early (~100 infected nodes in a ~100K node system
for waterhole and ~10 of 1000 for phishing) and robustly (with ~100% global TP
and ~1% global FP rates).

Content Delivery Networks (CDNs) deliver a majority of the userrequested
content on the Internet, including web pages, videos, and software downloads. A
CDN server caches and serves the content requested by users. Designing caching
algorithms that automatically adapt to the heterogeneity, burstiness, and
nonstationary nature of realworld content requests is a major challenge and
is the focus of our work. While there is much work on caching algorithms for
stationary request traffic, the work on nonstationary request traffic is very
limited. Consequently, most prior models are inaccurate for production CDN
traffic that is nonstationary.
We propose two TTLbased caching algorithms and provide provable guarantees
for content request traffic that is bursty and nonstationary. The first
algorithm called dTTL dynamically adapts a TTL parameter using a stochastic
approximation approach. Given a feasible target hit rate, we show that the hit
rate of dTTL converges to its target value for a general class of bursty
traffic that allows Markov dependence over time and nonstationary arrivals.
The second algorithm called fTTL uses two caches, each with its own TTL. The
firstlevel cache adaptively filters out nonstationary traffic, while the
secondlevel cache stores frequentlyaccessed stationary traffic. Given
feasible targets for both the hit rate and the expected cache size, fTTL
asymptotically achieves both targets. We implement dTTL and fTTL and evaluate
both algorithms using an extensive nineday trace consisting of 500 million
requests from a production CDN server. We show that both dTTL and fTTL
converge to their hit rate targets with an error of about 1.3%. But, fTTL
requires a significantly smaller cache size than dTTL to achieve the same hit
rate, since it effectively filters out the nonstationary traffic for
rarelyaccessed objects.

Motivated by applications in computational advertising and systems biology,
we consider the problem of identifying the best out of several possible soft
interventions at a source node $V$ in an acyclic causal directed graph, to
maximize the expected value of a target node $Y$ (located downstream of $V$).
Our setting imposes a fixed total budget for sampling under various
interventions, along with cost constraints on different types of interventions.
We pose this as a best arm identification bandit problem with $K$ arms where
each arm is a soft intervention at $V,$ and leverage the information leakage
among the arms to provide the first gap dependent error and simple regret
bounds for this problem. Our results are a significant improvement over the
traditional best arm identification results. We empirically show that our
algorithms outperform the state of the art in the Flow Cytometry dataset, and
also apply our algorithm for model interpretation of the Inceptionv3 deep net
that classifies images.

Motivated by online recommendation and advertising systems, we consider a
causal model for stochastic contextual bandits with a latent lowdimensional
confounder. In our model, there are $L$ observed contexts and $K$ arms of the
bandit. The observed context influences the reward obtained through a latent
confounder variable with cardinality $m$ ($m \ll L,K$). The arm choice and the
latent confounder causally determines the reward while the observed context is
correlated with the confounder. Under this model, the $L \times K$ mean reward
matrix $\mathbf{U}$ (for each context in $[L]$ and each arm in $[K]$)
factorizes into nonnegative factors $\mathbf{A}$ ($L \times m$) and
$\mathbf{W}$ ($m \times K$). This insight enables us to propose an
$\epsilon$greedy NMFBandit algorithm that designs a sequence of interventions
(selecting specific arms), that achieves a balance between learning this
lowdimensional structure and selecting the best arm to minimize regret. Our
algorithm achieves a regret of $\mathcal{O}\left(L\mathrm{poly}(m, \log K) \log
T \right)$ at time $T$, as compared to $\mathcal{O}(LK\log T)$ for conventional
contextual bandits, assuming a constant gap between the best arm and the rest
for each context. These guarantees are obtained under mild sufficiency
conditions on the factors that are weaker versions of the wellknown
Statistical RIP condition. We further propose a class of generative models that
satisfy our sufficient conditions, and derive a lower bound of
$\mathcal{O}\left(Km\log T\right)$. These are the first regret guarantees for
online matrix completion with bandit feedback, when the rank is greater than
one. We further compare the performance of our algorithm with the state of the
art, on synthetic and real world datasets.

With a vast number of items, webpages, and news to choose from, online
services and the customers both benefit tremendously from personalized
recommender systems. Such systems however provide great opportunities for
targeted advertisements, by displaying ads alongside genuine recommendations.
We consider a biased recommendation system where such ads are displayed without
any tags (disguised as genuine recommendations), rendering them
indistinguishable to a single user. We ask whether it is possible for a small
subset of collaborating users to detect such a bias. We propose an algorithm
that can detect such a bias through statistical analysis on the collaborating
users' feedback. The algorithm requires only binary information indicating
whether a user was satisfied with each of the recommended item or not. This
makes the algorithm widely appealing to real world issues such as
identification of search engine bias and pharmaceutical lobbying. We prove that
the proposed algorithm detects the bias with high probability for a broad class
of recommendation systems when sufficient number of users provide feedback on
sufficient number of recommendations. We provide extensive simulations with
real data sets and practical recommender systems, which confirm the trade offs
in the theoretical guarantees.

We consider a server serving a timeslotted queued system of multiple
packetbased flows, with exogenous packet arrivals and timevarying service
rates. At each time, the server can observe instantaneous service rates for
only a subset of flows (from within a fixed collection of observable subsets)
before scheduling a flow in the subset for service. We are interested in
queuelength aware scheduling to keep the queues short, and develop scheduling
algorithms that use only partial service rate information from subsets of
channels to minimize the likelihood of queue overflow in the system.
Specifically, we present a new joint subsetsampling and scheduling algorithm
called MaxExp that uses only the current queue lengths to pick a subset of
flows, and subsequently schedules a flow using the Exponential rule. When the
collection of observable subsets is disjoint, we show that MaxExp achieves the
best exponential decay rate, among all scheduling algorithms using partial
information, of the tail of the longest queue in the system. Towards this, we
employ novel analytical techniques for studying the performance of scheduling
algorithms using partial state, which may be of independent interest. These
include new samplepath large deviations results for processes obtained by
nonrandom, predictable sampling of sequences of independent and identically
distributed random variables. A consequence of these results is that scheduling
with partial state information yields a rate function significantly different
from scheduling with full channel information. In the special case when the
observable subsets are singleton flows, i.e., when there is effectively no a
priori channelstate information, MaxExp reduces to simply serving the flow
with the longest queue; thus, our results show that to always serve the longest
queue in the absence of any channelstate information is largedeviations
optimal.

Motivated by emerging big streaming data processing paradigms (e.g., Twitter
Storm, Streaming MapReduce), we investigate the problem of scheduling graphs
over a large cluster of servers. Each graph is a job, where nodes represent
compute tasks and edges indicate dataflows between these compute tasks. Jobs
(graphs) arrive randomly over time, and upon completion, leave the system. When
a job arrives, the scheduler needs to partition the graph and distribute it
over the servers to satisfy load balancing and cost considerations.
Specifically, neighboring compute tasks in the graph that are mapped to
different servers incur load on the network; thus a mapping of the jobs among
the servers incurs a cost that is proportional to the number of "broken edges".
We propose a low complexity randomized scheduling algorithm that, without
service preemptions, stabilizes the system with graph arrivals/departures; more
importantly, it allows a smooth tradeoff between minimizing average
partitioning cost and average queue lengths. Interestingly, to avoid service
preemptions, our approach does not rely on a Gibbs sampler; instead, we show
that the corresponding limiting invariant measure has an interpretation
stemming from a loss system.

We investigate the sensitivity of epidemic behavior to a bounded
susceptibility constraint  susceptible nodes are infected by their neighbors
via the regular SI/SIS dynamics, but subject to a cap on the infection rate.
Such a constraint is motivated by modern social networks, wherein messages are
broadcast to all neighbors, but attention spans are limited. Bounded
susceptibility also arises in distributed computing applications with download
bandwidth constraints, and in human epidemics under quarantine policies.
Network epidemics have been extensively studied in literature; prior work
characterizes the graph structures required to ensure fast spreading under the
SI dynamics, and long lifetime under the SIS dynamics. In particular, these
conditions turn out to be meaningful for two classes of networks of practical
relevance  dense, uniform (i.e., cliquelike) graphs, and sparse, structured
(i.e., starlike) graphs. We show that bounded susceptibility has a surprising
impact on epidemic behavior in these graph families. For the SI dynamics,
bounded susceptibility has no effect on starlike networks, but dramatically
alters the spreading time in cliquelike networks. In contrast, for the SIS
dynamics, cliquelike networks are unaffected, but starlike networks exhibit a
sharp change in extinction times under bounded susceptibility.
Our findings are useful for the design of diseaseresistant networks and
infrastructure networks. More generally, they show that results for existing
epidemic models are sensitive to modeling assumptions in nonintuitive ways,
and suggest caution in directly using these as guidelines for real systems.

In this paper we look at content placement in the highdimensional regime:
there are n servers, and O(n) distinct types of content. Each server can store
and serve O(1) types at any given time. Demands for these content types arrive,
and have to be served in an online fashion; over time, there are a total of
O(n) of these demands. We consider the algorithmic task of content placement:
determining which types of content should be on which server at any given time,
in the setting where the demand statistics (i.e. the relative popularity of
each type of content) are not known apriori, but have to be inferred from the
very demands we are trying to satisfy. This is the high dimensional regime
because this scaling (everything being O(n)) prevents consistent estimation of
demand statistics; it models many modern settings where large numbers of users,
servers and videos/webpages interact in this way. We characterize the
performance of any scheme that separates learning and placement (i.e. which use
a portion of the demands to gain some estimate of the demand statistics, and
then uses the same for the remaining demands), showing it is orderwise
strictly suboptimal. We then study a simple adaptive scheme  which myopically
attempts to store the most recently requested content on idle servers  and
show it outperforms schemes that separate learning and placement. Our results
also generalize to the setting where the demand statistics change with time.
Overall, our results demonstrate that separating the estimation of demand, and
the subsequent use of the same, is strictly suboptimal.

A common phenomena in modern recommendation systems is the use of feedback
from one user to infer the `value' of an item to other users. This results in
an exploration vs. exploitation tradeoff, in which items of possibly low value
have to be presented to users in order to ascertain their value. Existing
approaches to solving this problem focus on the case where the number of items
are small, or admit some underlying structure  it is unclear, however, if
good recommendation is possible when dealing with contentrich settings with
unstructured content.
We consider this problem under a simple natural model, wherein the number of
items and the number of itemviews are of the same order, and an `accessgraph'
constrains which user is allowed to see which item. Our main insight is that
the presence of the accessgraph in fact makes good recommendation possible 
however this requires the exploration policy to be designed to take advantage
of the accessgraph. Our results demonstrate the importance of `serendipity' in
exploration, and how higher graphexpansion translates to a higher quality of
recommendations; it also suggests a reason why in some settings, simple
policies like Twitter's `LatestFirst' policy achieve a good performance.
From a technical perspective, our model presents a way to study
explorationexploitation tradeoffs in settings where the number of `trials' and
`strategies' are large (potentially infinite), and more importantly, of the
same order. Our algorithms admit competitiveratio guarantees which hold for
the worstcase user, under both finitepopulation and infinitehorizon
settings, and are parametrized in terms of properties of the underlying graph.
Conversely, we also demonstrate that improperlydesigned policies can be highly
suboptimal, and that in many settings, our results are orderwise optimal.

We study epidemic spreading processes in large networks, when the spread is
assisted by a small number of external agents: infection sources with bounded
spreading power, but whose movement is unrestricted vis\`avis the underlying
network topology. For networks which are `spatially constrained', we show that
the spread of infection can be significantly speeded up even by a few such
external agents infecting randomly. Moreover, for general networks, we derive
upperbounds on the order of the spreading time achieved by certain simple
(random/greedy) externalspreading policies. Conversely, for certain common
classes of networks such as line graphs, grids and random geometric graphs, we
also derive lower bounds on the order of the spreading time over all
(potentially networkstate aware and adversarial) externalspreading policies;
these adversarial lower bounds match (up to logarithmic factors) the spreading
time achieved by an external agent with a random spreading policy. This
demonstrates that random, stateoblivious infectionspreading by an external
agent is in fact orderwise optimal for spreading in such spatially constrained
networks.

We consider the problem of detecting an epidemic in a population where
individual diagnoses are extremely noisy. The motivation for this problem is
the plethora of examples (influenza strains in humans, or computer viruses in
smartphones, etc.) where reliable diagnoses are scarce, but noisy data
plentiful. In flu/phoneviruses, exceedingly few infected people/phones are
professionally diagnosed (only a small fraction go to a doctor) but less
reliable secondary signatures (e.g., people staying home, or
greaterthantypical upload activity) are more readily available. These
secondary data are often plagued by unreliability: many people with the flu do
not stay home, and many people that stay home do not have the flu. This paper
identifies the precise regime where knowledge of the contact network enables
finding the needle in the haystack: we provide a distributed, efficient and
robust algorithm that can correctly identify the existence of a spreading
epidemic from highly unreliable local data. Our algorithm requires only
localneighbor knowledge of this graph, and in a broad array of settings that
we describe, succeeds even when false negatives and false positives make up an
overwhelming fraction of the data available. Our results show it succeeds in
the presence of partial information about the contact network, and also when
there is not a single "patient zero", but rather many (hundreds, in our
examples) of initial patientzeroes, spread across the graph.

We study the effect of external infection sources on phase transitions in
epidemic processes. In particular, we consider an epidemic spreading on a
network via the SIS/SIR dynamics, which in addition is aided by external agents
 sources unconstrained by the graph, but possessing a limited infection rate
or virulence. Such a model captures many existing models of externally aided
epidemics, and finds use in many settings  epidemiology, marketing and
advertising, network robustness, etc. We provide a detailed characterization of
the impact of external agents on epidemic thresholds. In particular, for the
SIS model, we show that any external infection strategy with constant virulence
either fails to significantly affect the lifetime of an epidemic, or at best,
sustains the epidemic for a lifetime which is polynomial in the number of
nodes. On the other hand, a random externalinfection strategy, with rate
increasing linearly in the number of infected nodes, succeeds under some
conditions to sustain an exponential epidemic lifetime. We obtain similar sharp
thresholds for the SIR model, and discuss the relevance of our results in a
variety of settings.

The history of infections and epidemics holds famous examples where
understanding, containing and ultimately treating an outbreak began with
understanding its mode of spread. Influenza, HIV and most computer viruses,
spread person to person, device to device, through contact networks; Cholera,
Cancer, and seasonal allergies, on the other hand, do not. In this paper we
study two fundamental questions of detection: first, given a snapshot view of a
(perhaps vanishingly small) fraction of those infected, under what conditions
is an epidemic spreading via contact (e.g., Influenza), distinguishable from a
"random illness" operating independently of any contact network (e.g., seasonal
allergies); second, if we do have an epidemic, under what conditions is it
possible to determine which network of interactions is the main cause of the
spread  the causative network  without any knowledge of the epidemic, other
than the identity of a minuscule subsample of infected nodes?
The core, therefore, of this paper, is to obtain an understanding of the
diagnostic power of network information. We derive sufficient conditions
networks must satisfy for these problems to be identifiable, and produce
efficient, highly scalable algorithms that solve these problems. We show that
the identifiability condition we give is fairly mild, and in particular, is
satisfied by two common graph topologies: the grid, and the ErdosRenyi graphs.

We consider information dissemination in a large $n$user wireless network in
which $k$ users wish to share a unique message with all other users. Each of
the $n$ users only has knowledge of its own contents and state information;
this corresponds to a onesided pushonly scenario. The goal is to disseminate
all messages efficiently, hopefully achieving an orderoptimal spreading rate
over unicast wireless random networks. First, we show that a randompush
strategy  where a user sends its own or a received packet at random  is
orderwise suboptimal in a random geometric graph: specifically,
$\Omega(\sqrt{n})$ times slower than optimal spreading. It is known that this
gap can be closed if each user has "full" mobility, since this effectively
creates a complete graph. We instead consider velocityconstrained mobility
where at each time slot the user moves locally using a discrete random walk
with velocity $v(n)$ that is much lower than full mobility. We propose a simple
twostage dissemination strategy that alternates between individual message
flooding ("self promotion") and random gossiping. We prove that this scheme
achieves a close to optimal spreading rate (within only a logarithmic gap) as
long as the velocity is at least $v(n)=\omega(\sqrt{\log n/k})$. The key
insight is that the mixing property introduced by the partial mobility helps
users to spread in space within a relatively short period compared to the
optimal spreading time, which macroscopically mimics message dissemination over
a complete graph.

We seek to develop network algorithms for function computation in sensor
networks. Specifically, we want dynamic joint aggregation, routing, and
scheduling algorithms that have analytically provable performance benefits due
to innetwork computation as compared to simple data forwarding. To this end,
we define a class of functions, the FullyMultiplexible functions, which
includes several functions such as parity, MAX, and k th order statistics. For
such functions we exactly characterize the maximum achievable refresh rate of
the network in terms of an underlying graph primitive, the minmincut. In
acyclic wireline networks, we show that the maximum refresh rate is achievable
by a simple algorithm that is dynamic, distributed, and only dependent on local
information. In the case of wireless networks, we provide a MaxWeightlike
algorithm with dynamic flow splitting, which is shown to be throughputoptimal.

Greedy Maximal Scheduling (GMS) is an attractive lowcomplexity scheme for
scheduling in wireless networks. Recent work has characterized its throughput
for the case when there is no fading/channel variations. This paper aims to
understand the effect of channel variations on the relative throughput
performance of GMS visavis that of an optimal scheduler facing the same
fading. The effect is not apriori obvious because, on the one hand, fading
could help by decoupling/precluding global states that lead to poor GMS
performance, while on the other hand fading adds another degree of freedom in
which an event unfavourable to GMS could occur.
We show that both these situations can occur when fading is adversarial. In
particular, we first define the notion of a {\em Fading Local Pooling factor
(FLPF)}, and show that it exactly characterizes the throughput of GMS in this
setting. We also derive general upper and lower bounds on FLPF. Using these
bounds, we provide two example networks  one where the relative performance of
GMS is worse than if there were no fading, and one where it is better.

We propose a new yet natural algorithm for learning the graph structure of
general discrete graphical models (a.k.a. Markov random fields) from samples.
Our algorithm finds the neighborhood of a node by sequentially adding nodes
that produce the largest reduction in empirical conditional entropy; it is
greedy in the sense that the choice of addition is based only on the reduction
achieved at that iteration. Its sequential nature gives it a lower
computational complexity as compared to other existing comparisonbased
techniques, all of which involve exhaustive searches over every node set of a
certain size. Our main result characterizes the sample complexity of this
procedure, as a function of node degrees, graph size and girth in factorgraph
representation. We subsequently specialize this result to the case of Ising
models, where we provide a simple transparent characterization of sample
complexity as a function of model and graph parameters.
For tree graphs, our algorithm is the same as the classical ChowLiu
algorithm, and in that sense can be considered the extension of the same to
graphs with cycles.

We consider the problem of broadcasting a viral video (a large file) over an
ad hoc wireless network (e.g., students in a campus). Many smartphones are GPS
enabled, and equipped with peertopeer (ad hoc) transmission mode, allowing
them to wirelessly exchange files over short distances rather than use the
carrier's WAN. The demand for the file however is transmitted through the
social network (e.g., a YouTube link posted on Facebook).
To address this couplednetwork problem (demand on the social network;
bandwidth on the wireless network) where the two networks have different
topologies, we propose a file dissemination algorithm. In our scheme, users
query their social network to find geographically nearby friends that have the
desired file, and utilize the underlying ad hoc network to route the data via
multihop transmissions. We show that for many popular models for social
networks, the file dissemination time scales sublinearly with n; the number of
users, compared to the linear scaling required if each user who wants the file
must download it from the carrier's WAN.