• Learning Unknown Service Rates in Queues: A Multi-Armed Bandit Approach(1604.06377)

Nov. 21, 2019 cs.SY
Consider a queueing system consisting of multiple servers. Jobs arrive over time and enter a queue for service; the goal is to minimize the size of this queue. At each opportunity for service, at most one server can be chosen, and at most one job can be served. Service is successful with a probability (the service probability) that is a priori unknown for each server. An algorithm that knows the service probabilities (the "genie") can always choose the server of highest service probability. We study algorithms that learn the unknown service probabilities. Our goal is to minimize queue-regret: the (expected) difference between the queue-lengths obtained by the algorithm, and those obtained by the "genie." Since queue-regret cannot be larger than classical regret, results for the standard multi-armed bandit problem give algorithms for which queue-regret increases no more than logarithmically in time. Our paper shows surprisingly more complex behavior. In particular, as long as the bandit algorithm's queues have relatively long regenerative cycles, queue-regret is similar to cumulative regret, and scales (essentially) logarithmically. However, we show that this "early stage" of the queueing bandit eventually gives way to a "late stage", where the optimal queue-regret scaling is $O(1/t)$. We demonstrate an algorithm that (order-wise) achieves this asymptotic queue-regret in the late stage. Our results are developed in a more general model that allows for multiple job classes as well.
• The Shape of Alerts: Detecting Malware Using Distributed Detectors by Robustly Amplifying Transient Correlations(1803.00883)

March 1, 2018 cs.CR
We introduce a new malware detector - Shape-GD - that aggregates per-machine detectors into a robust global detector. Shape-GD is based on two insights: 1. Structural: actions such as visiting a website (waterhole attack) by nodes correlate well with malware spread, and create dynamic neighborhoods of nodes that were exposed to the same attack vector. However, neighborhood sizes vary unpredictably and require aggregating an unpredictable number of local detectors' outputs into a global alert. 2. Statistical: feature vectors corresponding to true and false positives of local detectors have markedly different conditional distributions - i.e. their shapes differ. The shape of neighborhoods can identify infected neighborhoods without having to estimate neighborhood sizes - on 5 years of Symantec detectors' logs, Shape-GD reduces false positives from ~1M down to ~110K and raises alerts 345 days (on average) before commercial anti-virus products; in a waterhole attack simulated using Yahoo web-service logs, Shape-GD detects infected machines when only ~100 of ~550K are compromised.
• The Search Problem in Mixture Models(1610.00843)

Feb. 24, 2018 cs.LG, stat.ML
We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component, we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each one of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy, and also improved computation complexity than existing moment based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms showing significant improvement in runtime and accuracy.
• Contextual Bandits with Stochastic Experts(1802.08737)

Feb. 23, 2018 cs.IT, math.IT, cs.AI, cs.LG, stat.ML
We consider the problem of contextual bandits with stochastic experts, which is a variation of the traditional stochastic contextual bandit with experts problem. In our problem setting, we assume access to a class of stochastic experts, where each expert is a conditional distribution over the arms given a context. We propose upper-confidence bound (UCB) algorithms for this problem, which employ two different importance sampling based estimators for the mean reward for each expert. Both these estimators leverage information leakage among the experts, thus using samples collected under all the experts to estimate the mean reward of any given expert. This leads to instance dependent regret bounds of $\mathcal{O}\left(\lambda(\pmb{\mu})\mathcal{M}\log T/\Delta \right)$, where $\lambda(\pmb{\mu})$ is a term that depends on the mean rewards of the experts, $\Delta$ is the smallest gap between the mean reward of the optimal expert and the rest, and $\mathcal{M}$ quantifies the information leakage among the experts. We show that under some assumptions $\lambda(\pmb{\mu})$ is typically $\mathcal{O}(\log N)$. We implement our algorithm with stochastic experts generated from cost-sensitive classification oracles and show superior empirical performance on real-world datasets, when compared to other state of the art contextual bandit algorithms.
• On Learning the $c\mu$ Rule: Single and Multiserver Settings(1802.06723)

Feb. 2, 2018 cs.PF, math.OC
We consider learning-based variants of the $c \mu$ rule -- a classic and well-studied scheduling policy -- in single and multi-server settings for multi-class queueing systems. In the single server setting, the $c \mu$ rule is known to minimize the expected holding-cost (weighted queue-lengths summed both over classes and time). We focus on the setting where the service rates $\mu$ are unknown, and are interested in the holding-cost regret -- the difference in the expected holding-costs between that induced by a learning-based rule (that learns $\mu$) and that from the $c \mu$ rule (which has knowledge of the service rates) over any fixed time horizon. We first show that empirically learning the service rates and then scheduling using these learned values results in a regret of holding-cost that does not depend on the time horizon. The key insight that allows such a constant regret bound is that a work-conserving scheduling policy in this setting allows explore-free learning, where no penalty is incurred for exploring and learning server rates. We next consider the multi-server setting. We show that in general, the $c \mu$ rule is not stabilizing (i.e. there are stabilizable arrival and service rate parameters for which the multi-server $c \mu$ rule results in unstable queues). We then characterize sufficient conditions for stability (and also concentrations on busy periods). Using these results, we show that learning-based variants of the $c\mu$ rule again result in a constant regret (i.e. does not depend on the time horizon). This result hinges on (i) the busy period concentrations of the multi-server $c \mu$ rule, and that (ii) our learning-based rule is designed to dynamically explore server rates, but in such a manner that it eventually satisfies an explore-free condition.
• Model-Powered Conditional Independence Test(1709.06138)

Sept. 18, 2017 cs.IT, math.IT, cs.AI, cs.LG, stat.ML
We consider the problem of non-parametric Conditional Independence testing (CI testing) for continuous random variables. Given i.i.d samples from the joint distribution $f(x,y,z)$ of continuous random vectors $X,Y$ and $Z,$ we determine whether $X \perp Y | Z$. We approach this by converting the conditional independence test into a classification problem. This allows us to harness very powerful classifiers like gradient-boosted trees and deep neural networks. These models can handle complex probability distributions and allow us to perform significantly better compared to the prior state of the art, for high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution $f^{CI}(x,y,z) = f(x|z)f(y|z)f(z)$ -- the joint distribution if and only if $X \perp Y | Z.$ -- when given access only to i.i.d. samples from the true joint distribution $f(x,y,z)$. To tackle this problem we propose a novel nearest neighbor bootstrap procedure and theoretically show that our generated samples are indeed close to $f^{CI}$ in terms of total variational distance. We then develop theoretical results regarding the generalization bounds for classification for our problem, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher type classification bounds in the presence of non-i.i.d near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
• Exploiting Latent Attack Semantics for Intelligent Malware Detection(1708.01864)

Aug. 6, 2017 cs.CR
Behavioral malware detectors promise to expose previously unknown malware and are an important security primitive. However, even the best behavioral detectors suffer from high false positives and negatives. In this paper, we address the challenge of aggregating weak per-device behavioral detectors in noisy communities (i.e., ones that produce alerts at unpredictable rates) into an accurate and robust global anomaly detector (GD). Our system - Shape GD - combines two insights: Structural: actions such as visiting a website (waterhole attack) or membership in a shared email thread (phishing attack) by nodes correlate well with malware spread, and create dynamic neighborhoods of nodes that were exposed to the same attack vector; and Statistical: feature vectors corresponding to true and false positives of local detectors have markedly different conditional distributions. We use neighborhoods to amplify the transient low-dimensional structure that is latent in high-dimensional feature vectors - but neighborhoods vary unpredictably, and we use shape to extract robust neighborhood-level features that identify infected neighborhoods. Unlike prior works that aggregate local detectors' alert bitstreams or cluster the feature vectors, Shape GD analyzes the feature vectors that led to local alerts (alert-FVs) to separate true and false positives. Shape GD first filters these alert-FVs into neighborhoods and efficiently maps a neighborhood's alert-FVs' statistical shapes into a scalar score. Shape GD then acts as a neighborhood level anomaly detector - training on benign program traces to learn the ShapeScore of false positive neighborhoods, and classifying neighborhoods with anomalous ShapeScores as malicious. Shape GD detects malware early (~100 infected nodes in a ~100K node system for waterhole and ~10 of 1000 for phishing) and robustly (with ~100% global TP and ~1% global FP rates).
• Adaptive TTL-Based Caching for Content Delivery(1704.04448)

April 14, 2017 cs.NI
Content Delivery Networks (CDNs) deliver a majority of the user-requested content on the Internet, including web pages, videos, and software downloads. A CDN server caches and serves the content requested by users. Designing caching algorithms that automatically adapt to the heterogeneity, burstiness, and non-stationary nature of real-world content requests is a major challenge and is the focus of our work. While there is much work on caching algorithms for stationary request traffic, the work on non-stationary request traffic is very limited. Consequently, most prior models are inaccurate for production CDN traffic that is non-stationary. We propose two TTL-based caching algorithms and provide provable guarantees for content request traffic that is bursty and non-stationary. The first algorithm called d-TTL dynamically adapts a TTL parameter using a stochastic approximation approach. Given a feasible target hit rate, we show that the hit rate of d-TTL converges to its target value for a general class of bursty traffic that allows Markov dependence over time and non-stationary arrivals. The second algorithm called f-TTL uses two caches, each with its own TTL. The first-level cache adaptively filters out non-stationary traffic, while the second-level cache stores frequently-accessed stationary traffic. Given feasible targets for both the hit rate and the expected cache size, f-TTL asymptotically achieves both targets. We implement d-TTL and f-TTL and evaluate both algorithms using an extensive nine-day trace consisting of 500 million requests from a production CDN server. We show that both d-TTL and f-TTL converge to their hit rate targets with an error of about 1.3%. But, f-TTL requires a significantly smaller cache size than d-TTL to achieve the same hit rate, since it effectively filters out the non-stationary traffic for rarely-accessed objects.
• Identifying Best Interventions through Online Importance Sampling(1701.02789)

March 9, 2017 cs.IT, math.IT, cs.LG, stat.ML
Motivated by applications in computational advertising and systems biology, we consider the problem of identifying the best out of several possible soft interventions at a source node $V$ in an acyclic causal directed graph, to maximize the expected value of a target node $Y$ (located downstream of $V$). Our setting imposes a fixed total budget for sampling under various interventions, along with cost constraints on different types of interventions. We pose this as a best arm identification bandit problem with $K$ arms where each arm is a soft intervention at $V,$ and leverage the information leakage among the arms to provide the first gap dependent error and simple regret bounds for this problem. Our results are a significant improvement over the traditional best arm identification results. We empirically show that our algorithms outperform the state of the art in the Flow Cytometry data-set, and also apply our algorithm for model interpretation of the Inception-v3 deep net that classifies images.
• Contextual Bandits with Latent Confounders: An NMF Approach(1606.00119)

Oct. 27, 2016 cs.LG, stat.ML, cs.SY
Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are $L$ observed contexts and $K$ arms of the bandit. The observed context influences the reward obtained through a latent confounder variable with cardinality $m$ ($m \ll L,K$). The arm choice and the latent confounder causally determines the reward while the observed context is correlated with the confounder. Under this model, the $L \times K$ mean reward matrix $\mathbf{U}$ (for each context in $[L]$ and each arm in $[K]$) factorizes into non-negative factors $\mathbf{A}$ ($L \times m$) and $\mathbf{W}$ ($m \times K$). This insight enables us to propose an $\epsilon$-greedy NMF-Bandit algorithm that designs a sequence of interventions (selecting specific arms), that achieves a balance between learning this low-dimensional structure and selecting the best arm to minimize regret. Our algorithm achieves a regret of $\mathcal{O}\left(L\mathrm{poly}(m, \log K) \log T \right)$ at time $T$, as compared to $\mathcal{O}(LK\log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context. These guarantees are obtained under mild sufficiency conditions on the factors that are weaker versions of the well-known Statistical RIP condition. We further propose a class of generative models that satisfy our sufficient conditions, and derive a lower bound of $\mathcal{O}\left(Km\log T\right)$. These are the first regret guarantees for online matrix completion with bandit feedback, when the rank is greater than one. We further compare the performance of our algorithm with the state of the art, on synthetic and real world data-sets.
• Detecting Sponsored Recommendations(1504.03713)

April 14, 2015 cs.IR
With a vast number of items, web-pages, and news to choose from, online services and the customers both benefit tremendously from personalized recommender systems. Such systems however provide great opportunities for targeted advertisements, by displaying ads alongside genuine recommendations. We consider a biased recommendation system where such ads are displayed without any tags (disguised as genuine recommendations), rendering them indistinguishable to a single user. We ask whether it is possible for a small subset of collaborating users to detect such a bias. We propose an algorithm that can detect such a bias through statistical analysis on the collaborating users' feedback. The algorithm requires only binary information indicating whether a user was satisfied with each of the recommended item or not. This makes the algorithm widely appealing to real world issues such as identification of search engine bias and pharmaceutical lobbying. We prove that the proposed algorithm detects the bias with high probability for a broad class of recommendation systems when sufficient number of users provide feedback on sufficient number of recommendations. We provide extensive simulations with real data sets and practical recommender systems, which confirm the trade offs in the theoretical guarantees.
• Wireless Scheduling with Partial Channel State Information: Large Deviations and Optimality(1405.6307)

March 31, 2015 cs.IT, math.IT, cs.NI
We consider a server serving a time-slotted queued system of multiple packet-based flows, with exogenous packet arrivals and time-varying service rates. At each time, the server can observe instantaneous service rates for only a subset of flows (from within a fixed collection of observable subsets) before scheduling a flow in the subset for service. We are interested in queue-length aware scheduling to keep the queues short, and develop scheduling algorithms that use only partial service rate information from subsets of channels to minimize the likelihood of queue overflow in the system. Specifically, we present a new joint subset-sampling and scheduling algorithm called Max-Exp that uses only the current queue lengths to pick a subset of flows, and subsequently schedules a flow using the Exponential rule. When the collection of observable subsets is disjoint, we show that Max-Exp achieves the best exponential decay rate, among all scheduling algorithms using partial information, of the tail of the longest queue in the system. Towards this, we employ novel analytical techniques for studying the performance of scheduling algorithms using partial state, which may be of independent interest. These include new sample-path large deviations results for processes obtained by non-random, predictable sampling of sequences of independent and identically distributed random variables. A consequence of these results is that scheduling with partial state information yields a rate function significantly different from scheduling with full channel information. In the special case when the observable subsets are singleton flows, i.e., when there is effectively no a priori channel-state information, Max-Exp reduces to simply serving the flow with the longest queue; thus, our results show that to always serve the longest queue in the absence of any channel-state information is large-deviations optimal.
• Scheduling Storms and Streams in the Cloud(1502.05968)

Feb. 20, 2015 cs.DC, cs.NI, math.OC
Motivated by emerging big streaming data processing paradigms (e.g., Twitter Storm, Streaming MapReduce), we investigate the problem of scheduling graphs over a large cluster of servers. Each graph is a job, where nodes represent compute tasks and edges indicate data-flows between these compute tasks. Jobs (graphs) arrive randomly over time, and upon completion, leave the system. When a job arrives, the scheduler needs to partition the graph and distribute it over the servers to satisfy load balancing and cost considerations. Specifically, neighboring compute tasks in the graph that are mapped to different servers incur load on the network; thus a mapping of the jobs among the servers incurs a cost that is proportional to the number of "broken edges". We propose a low complexity randomized scheduling algorithm that, without service preemptions, stabilizes the system with graph arrivals/departures; more importantly, it allows a smooth trade-off between minimizing average partitioning cost and average queue lengths. Interestingly, to avoid service preemptions, our approach does not rely on a Gibbs sampler; instead, we show that the corresponding limiting invariant measure has an interpretation stemming from a loss system.
• The Behavior of Epidemics under Bounded Susceptibility(1502.05096)

Feb. 18, 2015 cs.SI
We investigate the sensitivity of epidemic behavior to a bounded susceptibility constraint -- susceptible nodes are infected by their neighbors via the regular SI/SIS dynamics, but subject to a cap on the infection rate. Such a constraint is motivated by modern social networks, wherein messages are broadcast to all neighbors, but attention spans are limited. Bounded susceptibility also arises in distributed computing applications with download bandwidth constraints, and in human epidemics under quarantine policies. Network epidemics have been extensively studied in literature; prior work characterizes the graph structures required to ensure fast spreading under the SI dynamics, and long lifetime under the SIS dynamics. In particular, these conditions turn out to be meaningful for two classes of networks of practical relevance -- dense, uniform (i.e., clique-like) graphs, and sparse, structured (i.e., star-like) graphs. We show that bounded susceptibility has a surprising impact on epidemic behavior in these graph families. For the SI dynamics, bounded susceptibility has no effect on star-like networks, but dramatically alters the spreading time in clique-like networks. In contrast, for the SIS dynamics, clique-like networks are unaffected, but star-like networks exhibit a sharp change in extinction times under bounded susceptibility. Our findings are useful for the design of disease-resistant networks and infrastructure networks. More generally, they show that results for existing epidemic models are sensitive to modeling assumptions in non-intuitive ways, and suggest caution in directly using these as guidelines for real systems.
• Serving Content with Unknown Demand:the High-Dimensional Regime(1412.6463)

Dec. 19, 2014 cs.NI
In this paper we look at content placement in the high-dimensional regime: there are n servers, and O(n) distinct types of content. Each server can store and serve O(1) types at any given time. Demands for these content types arrive, and have to be served in an online fashion; over time, there are a total of O(n) of these demands. We consider the algorithmic task of content placement: determining which types of content should be on which server at any given time, in the setting where the demand statistics (i.e. the relative popularity of each type of content) are not known a-priori, but have to be inferred from the very demands we are trying to satisfy. This is the high- dimensional regime because this scaling (everything being O(n)) prevents consistent estimation of demand statistics; it models many modern settings where large numbers of users, servers and videos/webpages interact in this way. We characterize the performance of any scheme that separates learning and placement (i.e. which use a portion of the demands to gain some estimate of the demand statistics, and then uses the same for the remaining demands), showing it is order-wise strictly suboptimal. We then study a simple adaptive scheme - which myopically attempts to store the most recently requested content on idle servers - and show it outperforms schemes that separate learning and placement. Our results also generalize to the setting where the demand statistics change with time. Overall, our results demonstrate that separating the estimation of demand, and the subsequent use of the same, is strictly suboptimal.
• Online Collaborative-Filtering on Graphs(1411.2057)

Nov. 7, 2014 cs.LG
A common phenomena in modern recommendation systems is the use of feedback from one user to infer the value' of an item to other users. This results in an exploration vs. exploitation trade-off, in which items of possibly low value have to be presented to users in order to ascertain their value. Existing approaches to solving this problem focus on the case where the number of items are small, or admit some underlying structure -- it is unclear, however, if good recommendation is possible when dealing with content-rich settings with unstructured content. We consider this problem under a simple natural model, wherein the number of items and the number of item-views are of the same order, and an access-graph' constrains which user is allowed to see which item. Our main insight is that the presence of the access-graph in fact makes good recommendation possible -- however this requires the exploration policy to be designed to take advantage of the access-graph. Our results demonstrate the importance of serendipity' in exploration, and how higher graph-expansion translates to a higher quality of recommendations; it also suggests a reason why in some settings, simple policies like Twitter's Latest-First' policy achieve a good performance. From a technical perspective, our model presents a way to study exploration-exploitation tradeoffs in settings where the number of trials' and strategies' are large (potentially infinite), and more importantly, of the same order. Our algorithms admit competitive-ratio guarantees which hold for the worst-case user, under both finite-population and infinite-horizon settings, and are parametrized in terms of properties of the underlying graph. Conversely, we also demonstrate that improperly-designed policies can be highly sub-optimal, and that in many settings, our results are order-wise optimal.
• Epidemic Spreading with External Agents(1206.3599)

April 13, 2014 physics.soc-ph, cs.IT, math.IT, cs.NI, cs.SI
We study epidemic spreading processes in large networks, when the spread is assisted by a small number of external agents: infection sources with bounded spreading power, but whose movement is unrestricted vis-\a-vis the underlying network topology. For networks which are spatially constrained', we show that the spread of infection can be significantly speeded up even by a few such external agents infecting randomly. Moreover, for general networks, we derive upper-bounds on the order of the spreading time achieved by certain simple (random/greedy) external-spreading policies. Conversely, for certain common classes of networks such as line graphs, grids and random geometric graphs, we also derive lower bounds on the order of the spreading time over all (potentially network-state aware and adversarial) external-spreading policies; these adversarial lower bounds match (up to logarithmic factors) the spreading time achieved by an external agent with a random spreading policy. This demonstrates that random, state-oblivious infection-spreading by an external agent is in fact order-wise optimal for spreading in such spatially constrained networks.
• Localized epidemic detection in networks with overwhelming noise(1402.1263)

Feb. 6, 2014 cs.SI, cs.LG
We consider the problem of detecting an epidemic in a population where individual diagnoses are extremely noisy. The motivation for this problem is the plethora of examples (influenza strains in humans, or computer viruses in smartphones, etc.) where reliable diagnoses are scarce, but noisy data plentiful. In flu/phone-viruses, exceedingly few infected people/phones are professionally diagnosed (only a small fraction go to a doctor) but less reliable secondary signatures (e.g., people staying home, or greater-than-typical upload activity) are more readily available. These secondary data are often plagued by unreliability: many people with the flu do not stay home, and many people that stay home do not have the flu. This paper identifies the precise regime where knowledge of the contact network enables finding the needle in the haystack: we provide a distributed, efficient and robust algorithm that can correctly identify the existence of a spreading epidemic from highly unreliable local data. Our algorithm requires only local-neighbor knowledge of this graph, and in a broad array of settings that we describe, succeeds even when false negatives and false positives make up an overwhelming fraction of the data available. Our results show it succeeds in the presence of partial information about the contact network, and also when there is not a single "patient zero", but rather many (hundreds, in our examples) of initial patient-zeroes, spread across the graph.
• Epidemic Thresholds with External Agents(1312.0938)

Dec. 3, 2013 physics.soc-ph, cs.SI
We study the effect of external infection sources on phase transitions in epidemic processes. In particular, we consider an epidemic spreading on a network via the SIS/SIR dynamics, which in addition is aided by external agents - sources unconstrained by the graph, but possessing a limited infection rate or virulence. Such a model captures many existing models of externally aided epidemics, and finds use in many settings - epidemiology, marketing and advertising, network robustness, etc. We provide a detailed characterization of the impact of external agents on epidemic thresholds. In particular, for the SIS model, we show that any external infection strategy with constant virulence either fails to significantly affect the lifetime of an epidemic, or at best, sustains the epidemic for a lifetime which is polynomial in the number of nodes. On the other hand, a random external-infection strategy, with rate increasing linearly in the number of infected nodes, succeeds under some conditions to sustain an exponential epidemic lifetime. We obtain similar sharp thresholds for the SIR model, and discuss the relevance of our results in a variety of settings.
• Distinguishing Infections on Different Graph Topologies(1309.6545)

Sept. 25, 2013 q-bio.PE, cs.SI
The history of infections and epidemics holds famous examples where understanding, containing and ultimately treating an outbreak began with understanding its mode of spread. Influenza, HIV and most computer viruses, spread person to person, device to device, through contact networks; Cholera, Cancer, and seasonal allergies, on the other hand, do not. In this paper we study two fundamental questions of detection: first, given a snapshot view of a (perhaps vanishingly small) fraction of those infected, under what conditions is an epidemic spreading via contact (e.g., Influenza), distinguishable from a "random illness" operating independently of any contact network (e.g., seasonal allergies); second, if we do have an epidemic, under what conditions is it possible to determine which network of interactions is the main cause of the spread -- the causative network -- without any knowledge of the epidemic, other than the identity of a minuscule subsample of infected nodes? The core, therefore, of this paper, is to obtain an understanding of the diagnostic power of network information. We derive sufficient conditions networks must satisfy for these problems to be identifiable, and produce efficient, highly scalable algorithms that solve these problems. We show that the identifiability condition we give is fairly mild, and in particular, is satisfied by two common graph topologies: the grid, and the Erdos-Renyi graphs.
• On the Role of Mobility for Multi-message Gossip(1204.3114)

Feb. 7, 2013 cs.IT, math.IT, cs.NI, cs.SI
We consider information dissemination in a large $n$-user wireless network in which $k$ users wish to share a unique message with all other users. Each of the $n$ users only has knowledge of its own contents and state information; this corresponds to a one-sided push-only scenario. The goal is to disseminate all messages efficiently, hopefully achieving an order-optimal spreading rate over unicast wireless random networks. First, we show that a random-push strategy -- where a user sends its own or a received packet at random -- is order-wise suboptimal in a random geometric graph: specifically, $\Omega(\sqrt{n})$ times slower than optimal spreading. It is known that this gap can be closed if each user has "full" mobility, since this effectively creates a complete graph. We instead consider velocity-constrained mobility where at each time slot the user moves locally using a discrete random walk with velocity $v(n)$ that is much lower than full mobility. We propose a simple two-stage dissemination strategy that alternates between individual message flooding ("self promotion") and random gossiping. We prove that this scheme achieves a close to optimal spreading rate (within only a logarithmic gap) as long as the velocity is at least $v(n)=\omega(\sqrt{\log n/k})$. The key insight is that the mixing property introduced by the partial mobility helps users to spread in space within a relatively short period compared to the optimal spreading time, which macroscopically mimics message dissemination over a complete graph.
• Towards a Queueing-Based Framework for In-Network Function Computation(1105.5651)

June 22, 2012 cs.IT, math.IT, cs.NI
We seek to develop network algorithms for function computation in sensor networks. Specifically, we want dynamic joint aggregation, routing, and scheduling algorithms that have analytically provable performance benefits due to in-network computation as compared to simple data forwarding. To this end, we define a class of functions, the Fully-Multiplexible functions, which includes several functions such as parity, MAX, and k th -order statistics. For such functions we exactly characterize the maximum achievable refresh rate of the network in terms of an underlying graph primitive, the min-mincut. In acyclic wireline networks, we show that the maximum refresh rate is achievable by a simple algorithm that is dynamic, distributed, and only dependent on local information. In the case of wireless networks, we provide a MaxWeight-like algorithm with dynamic flow splitting, which is shown to be throughput-optimal.
• On the Effect of Channel Fading on Greedy Scheduling(1203.1997)

March 9, 2012 cs.NI
Greedy Maximal Scheduling (GMS) is an attractive low-complexity scheme for scheduling in wireless networks. Recent work has characterized its throughput for the case when there is no fading/channel variations. This paper aims to understand the effect of channel variations on the relative throughput performance of GMS vis-a-vis that of an optimal scheduler facing the same fading. The effect is not a-priori obvious because, on the one hand, fading could help by decoupling/precluding global states that lead to poor GMS performance, while on the other hand fading adds another degree of freedom in which an event unfavourable to GMS could occur. We show that both these situations can occur when fading is adversarial. In particular, we first define the notion of a {\em Fading Local Pooling factor (F-LPF)}, and show that it exactly characterizes the throughput of GMS in this setting. We also derive general upper and lower bounds on F-LPF. Using these bounds, we provide two example networks - one where the relative performance of GMS is worse than if there were no fading, and one where it is better.
• Greedy Learning of Markov Network Structure(1202.1787)

Feb. 8, 2012 stat.ML
We propose a new yet natural algorithm for learning the graph structure of general discrete graphical models (a.k.a. Markov random fields) from samples. Our algorithm finds the neighborhood of a node by sequentially adding nodes that produce the largest reduction in empirical conditional entropy; it is greedy in the sense that the choice of addition is based only on the reduction achieved at that iteration. Its sequential nature gives it a lower computational complexity as compared to other existing comparison-based techniques, all of which involve exhaustive searches over every node set of a certain size. Our main result characterizes the sample complexity of this procedure, as a function of node degrees, graph size and girth in factor-graph representation. We subsequently specialize this result to the case of Ising models, where we provide a simple transparent characterization of sample complexity as a function of model and graph parameters. For tree graphs, our algorithm is the same as the classical Chow-Liu algorithm, and in that sense can be considered the extension of the same to graphs with cycles.
• On Sharing Viral Video over an Ad Hoc Wireless Network(1101.5088)

Jan. 26, 2011 cs.SI
We consider the problem of broadcasting a viral video (a large file) over an ad hoc wireless network (e.g., students in a campus). Many smartphones are GPS enabled, and equipped with peer-to-peer (ad hoc) transmission mode, allowing them to wirelessly exchange files over short distances rather than use the carrier's WAN. The demand for the file however is transmitted through the social network (e.g., a YouTube link posted on Facebook). To address this coupled-network problem (demand on the social network; bandwidth on the wireless network) where the two networks have different topologies, we propose a file dissemination algorithm. In our scheme, users query their social network to find geographically nearby friends that have the desired file, and utilize the underlying ad hoc network to route the data via multi-hop transmissions. We show that for many popular models for social networks, the file dissemination time scales sublinearly with n; the number of users, compared to the linear scaling required if each user who wants the file must download it from the carrier's WAN.