• We present and study the Static-Routing-Resiliency problem, motivated by routing on the Internet: Given a graph $G$, a unique destination vertex $d$, and an integer constant $c>0$, does there exist a static and destination-based routing scheme such that the correct delivery of packets from any source $s$ to the destination $d$ is guaranteed so long as (1) no more than $c$ edges fail and (2) there exists a physical path from $s$ to $d$? We embark upon a systematic exploration of this fundamental question in a variety of models (deterministic routing, randomized routing, with packet-duplication, with packet-header-rewriting) and present both positive and negative results that relate the edge-connectivity of a graph, i.e., the minimum number of edges whose deletion partitions $G$, to its resiliency.
  • Recent work has made great progress in verifying the forwarding correctness of networks . However, these approaches cannot be used to verify networks containing middleboxes, such as caches and firewalls, whose forwarding behavior depends on previously observed traffic. We explore how to verify reachability properties for networks that include such "mutable datapath" elements. We want our verification results to hold not just for the given network, but also in the presence of failures. The main challenge lies in scaling the approach to handle large and complicated networks, We address by developing and leveraging the concept of slices, which allow network-wide verification to only require analyzing small portions of the network. We show that with slices the time required to verify an invariant on many production networks is independent of the size of the network itself.
  • Control planes for global carrier networks should be programmable (so that new functionality can be easily introduced) and scalable (so they can handle the numerical scale and geographic scope of these networks). Neither traditional control planes nor new SDN-based control planes meet both of these goals. In this paper, we propose a framework for recursive routing computations that combines the best of SDN (programmability) and traditional networks (scalability through hierarchy) to achieve these two desired properties. Through simulation on graphs of up to 10,000 nodes, we evaluate our design's ability to support a variety of routing and traffic engineering solutions, while incorporating a fast failure recovery mechanism.
  • In this paper we address a seemingly simple question: Is there a universal packet scheduling algorithm? More precisely, we analyze (both theoretically and empirically) whether there is a single packet scheduling algorithm that, at a network-wide level, can match the results of any given scheduling algorithm. We find that in general the answer is "no". However, we show theoretically that the classical Least Slack Time First (LSTF) scheduling algorithm comes closest to being universal and demonstrate empirically that LSTF can closely, though not perfectly, replay a wide range of scheduling algorithms in realistic network settings. We then evaluate whether LSTF can be used {\em in practice} to meet various network-wide objectives by looking at three popular performance metrics (mean FCT, tail packet delays, and fairness); we find that LSTF performs comparable to the state-of-the-art for each of them.
  • Several recently proposed techniques achieve latency reduction by trading it off for some amount of additional bandwidth usage. But how would one quantify whether the tradeoff is actually beneficial in a given system? We develop an economic cost vs. benefit analysis for answering this question. We use the analysis to derive a benchmark for wide-area client-server applications, and demonstrate how it can be applied to reason about a particular latency saving technique --- redundant DNS requests.
  • Great progress has been made recently in verifying the correctness of router forwarding tables. However, these approaches do not work for networks containing middleboxes such as caches and firewalls whose forwarding behavior depends on previously observed traffic. We explore how to verify isolation properties in networks that include such "dynamic datapath" elements using model checking. Our work leverages recent advances in SMT solvers, and the main challenge lies in scaling the approach to handle large and complicated networks. While the straightforward application of model checking to this problem can only handle very small networks (if at all), our approach can verify simple realistic invariants on networks containing 30,000 middleboxes in a few minutes.
  • Low latency is critical for interactive networked applications. But while we know how to scale systems to increase capacity, reducing latency --- especially the tail of the latency distribution --- can be much more difficult. In this paper, we argue that the use of redundancy is an effective way to convert extra capacity into reduced latency. By initiating redundant operations across diverse resources and using the first result which completes, redundancy improves a system's latency even under exceptional conditions. We study the tradeoff with added system utilization, characterizing the situations in which replicating all tasks reduces mean latency. We then demonstrate empirically that replicating all operations can result in significant mean and tail latency reduction in real-world systems including DNS queries, database servers, and packet forwarding within networks.
  • Shark is a new data analysis system that marries query processing with complex analytics on large clusters. It leverages a novel distributed memory abstraction to provide a unified engine that can run SQL queries and sophisticated analytics functions (e.g., iterative machine learning) at scale, and efficiently recovers from failures mid-query. This allows Shark to run SQL queries up to 100x faster than Apache Hive, and machine learning programs up to 100x faster than Hadoop. Unlike previous systems, Shark shows that it is possible to achieve these speedups while retaining a MapReduce-like execution engine, and the fine-grained fault tolerance properties that such engines provide. It extends such an engine in several ways, including column-oriented in-memory storage and dynamic mid-query replanning, to effectively execute SQL. The result is a system that matches the speedups reported for MPP analytic databases over MapReduce, while offering fault tolerance properties and complex analytics capabilities that they lack.
  • Many modern network designs incorporate "failover" paths into routers' forwarding tables. We initiate the theoretical study of the conditions under which such resilient routing tables can guarantee delivery of packets.
  • Source-controlled routing has been proposed as a way to improve flexibility of future network architectures, as well as simplifying the data plane. However, if a packet specifies its path, this precludes fast local re-routing within the network. We propose SlickPackets, a novel solution that allows packets to slip around failures by specifying alternate paths in their headers, in the form of compactly-encoded directed acyclic graphs. We show that this can be accomplished with reasonably small packet headers for real network topologies, and results in responsiveness to failures that is competitive with past approaches that require much more state within the network. Our approach thus enables fast failure response while preserving the benefits of source-controlled routing.
  • We present the Scalable Nucleotide Alignment Program (SNAP), a new short and long read aligner that is both more accurate (i.e., aligns more reads with fewer errors) and 10-100x faster than state-of-the-art tools such as BWA. Unlike recent aligners based on the Burrows-Wheeler transform, SNAP uses a simple hash index of short seed sequences from the genome, similar to BLAST's. However, SNAP greatly reduces the number and cost of local alignment checks performed through several measures: it uses longer seeds to reduce the false positive locations considered, leverages larger memory capacities to speed index lookup, and excludes most candidate locations without fully computing their edit distance to the read. The result is an algorithm that scales well for reads from one hundred to thousands of bases long and provides a rich error model that can match classes of mutations (e.g., longer indels) that today's fast aligners ignore. We calculate that SNAP can align a dataset with 30x coverage of a human genome in less than an hour for a cost of $2 on Amazon EC2, with higher accuracy than BWA. Finally, we describe ongoing work to further improve SNAP.