• In the following, we present example illustrative and experimental results comparing fair schedulers allocating resources from multiple servers to distributed application frameworks. Resources are allocated so that at least one resource is exhausted in every server. Schedulers considered include DRF (DRFH) and Best-Fit DRF (BF-DRF), TSF, and PS-DSF. We also consider server selection under Randomized Round Robin (RRR) and based on their residual (unreserved) resources. In the following, we consider cases with frameworks of equal priority and without server-preference constraints. We first give typical results of a illustrative numerical study and then give typical results of a study involving Spark workloads on Mesos which we have modified and open-sourced to prototype different schedulers.
  • Efficient and fair allocation of multiple types of resources is a crucial objective in a cloud/distributed computing cluster. Users may have diverse resource needs. Furthermore, diversity in server properties/ capabilities may mean that only a subset of servers may be usable by a given user. In platforms with such heterogeneity, we identify important limitations in existing multi-resource fair allocation mechanisms, notably Dominant Resource Fairness (DRF) and its follow-up work. To overcome such limitations, we propose a new server-based approach; each server allocates resources by maximizing a per-server utility function. We propose a specific class of utility functions which, when appropriately parameterized, adjusts the trade-off between efficiency and fairness, and captures a variety of fairness measures (such as our recently proposed Per-Server Dominant Share Fairness). We establish conditions for the proposed mechanism to satisfy certain properties that are generally deemed desirable, e.g., envy-freeness, sharing incentive, bottleneck fairness, and Pareto optimality. To implement our resource allocation mechanism, we develop an iterative algorithm which is shown to be globally convergent. Finally, we show how the proposed mechanism could be implemented in a distributed fashion. We carry out extensive trace-driven simulations to show the enhanced performance of our proposed mechanism over the existing ones.
  • We study the broadcast transmission of a single file to an arbitrary number of receivers using Random Linear Network Coding (RLNC) in a network with unreliable channels. Due to the increased computational complexity of the decoding process (especially for large files) we apply chunked RLNC (i.e. RLNC is applied within non-overlapping subsets of the file). In our work we show the optimality of the Least Received (LR) batch scheduling policy (which was introduced in our prior work) with regards to the expected file transfer completion time. Furthermore, we refine some of our earlier results, namely the expected file transfer completion time of the LR policy and the minimum achievable coding window size in the case of a user defined delay constraint. Finally, we experimentally evaluate a modification of the LR policy in a more realistic system setting with reduced feedback from the receivers.
  • Users of cloud computing platforms pose different types of demands for multiple resources on servers (physical or virtual machines). Besides differences in their resource capacities, servers may be additionally heterogeneous in their ability to service users - certain users' tasks may only be serviced by a subset of the servers. We identify important shortcomings in existing multi-resource fair allocation mechanisms - Dominant Resource Fairness (DRF) and its follow up work - when used in such environments. We develop a new fair allocation mechanism called Per-Server Dominant-Share Fairness (PS-DSF) which we show offers all desirable sharing properties that DRF is able to offer in the case of a single "resource pool" (i.e., if the resources of all servers were pooled together into one hypothetical server). We evaluate the performance of PS-DSF through simulations. Our evaluation shows the enhanced efficiency of PS-DSF compared to the existing allocation mechanisms. We argue how our proposed allocation mechanism is applicable in cloud computing networks and especially large scale data-centers.
  • Random linear network coding (RLNC) has been shown to efficiently improve the network performance in terms of reducing transmission delays and increasing the throughput in broadcast and multicast communications. However, it can result in increased storage and computational complexity at the receivers end. In our previous work we considered the broadcast transmission of large file to N receivers. We showed that the storage and complexity requirements at the receivers end can be greatly reduced when segmenting the file into smaller blocks and applying RLNC to these blocks. To that purpose, we proposed a packet scheduling policy, namely the Least Received. In this work we will prove the optimality of our previously proposed policy, in terms of file transfer completion time, when N = 2. We will model our system as a Markov Decision Process and prove the optimality of the policy using Dynamic Programming. Our intuition is that the Least Received policy may be optimal regardless of the number of receivers. Towards that end, we will provide experimental results that verify that ntuition.
  • Network Coding is a packet encoding technique which has recently been shown to improve network performance (by reducing delays and increasing throughput) in broadcast and multicast communications. The cost for such an improvement comes in the form of increased decoding complexity (and thus delay) at the receivers end. Before delivering the file to higher layers, the receiver should first decode those packets. In our work we consider the broadcast transmission of a large file to N wireless users. The file is segmented into a number of blocks (each containing K packets - the Coding Window Size). The packets of each block are encoded using Random Linear Network Coding (RLNC).We obtain the minimum coding window size so that the completion time of the file transmission is upper bounded by a used defined delay constraint.
  • In this paper, a multi-user multi-server queuing system is studied in which each user is constrained to get service from a subset of servers. In the studied system, rate allocation in the sense of max-min fairness results in multi-level fair rates. To achieve such fair rates, we propose $CM^4FQ$ algorithm. In this algorithm users are chosen for service on a packet by packet basis. The priority of each user $i$ to be chosen at time $t$ is determined based on a parameter known as service tag (representing the amount of work counted for user $i$ till time $t$). Hence, a free server will choose to serve an eligible user with the minimum service tag. Based on such simple selection criterion, $CM^4FQ$ aims at guaranteed fair throughput for each demanding user without explicit knowledge of each server service rate. We argue that $CM^4FQ$ can be applied in a variety of practical queuing systems specially in mobile cloud computing architecture.
  • We study the problem of assigning $K$ identical servers to a set of $N$ parallel queues in a time-slotted queueing system. The connectivity of each queue to each server is randomly changing with time; each server can serve at most one queue and each queue can be served by at most one server during each time slot. Such a queueing model has been used in addressing resource allocation problems in wireless networks. It has been previously proven that Maximum Weighted Matching (MWM) is a throughput-optimal server assignment policy for such a queueing system. In this paper, we prove that for a system with i.i.d. Bernoulli packet arrivals and connectivities, MWM minimizes, in stochastic ordering sense, a broad range of cost functions of the queue lengths such as total queue occupancy (which implies minimization of average queueing delays). Then, we extend the model by considering imperfect services where it is assumed that the service of a scheduled packet fails randomly with a certain probability. We prove that the same policy is still optimal for the extended model. We finally show that the results are still valid for more general connectivity and arrival processes which follow conditional permutation invariant distributions.
  • We consider a problem of supplying electricity to a set of $\mathcal{N}$ customers in a smart-grid framework. Each customer requires a certain amount of electrical energy which has to be supplied during the time interval $[0,1]$. We assume that each demand has to be supplied without interruption, with possible duration between $\ell$ and $r$, which are given system parameters ($\ell\le r$). At each moment of time, the power of the grid is the sum of all the consumption rates for the demands being supplied at that moment. Our goal is to find an assignment that minimizes the {\it power peak} - maximal power over $[0,1]$ - while satisfying all the demands. To do this first we find the lower bound of optimal power peak. We show that the problem depends on whether or not the pair $\ell, r$ belongs to a "good" region $\mathcal{G}$. If it does - then an optimal assignment almost perfectly "fills" the rectangle $time \times power = [0,1] \times [0, A]$ with $A$ being the sum of all the energy demands - thus achieving an optimal power peak $A$. Conversely, if $\ell, r$ do not belong to $\mathcal{G}$, we identify the lower bound $\bar{A} >A$ on the optimal value of power peak and introduce a simple linear time algorithm that almost perfectly arranges all the demands in a rectangle $[0, A /\bar{A}] \times [0, \bar{A}]$ and show that it is asymptotically optimal.
  • In this paper, we present a probabilistic analysis of iterative node-based verification-based (NB-VB) recovery algorithms over irregular graphs in the context of compressed sensing. Verification-based algorithms are particularly interesting due to their low complexity (linear in the signal dimension $n$). The analysis predicts the average fraction of unverified signal elements at each iteration $\ell$ where the average is taken over the ensembles of input signals and sensing matrices. The analysis is asymptotic ($n \rightarrow \infty$) and is similar in nature to the well-known density evolution technique commonly used to analyze iterative decoding algorithms. Compared to the existing technique for the analysis of NB-VB algorithms, which is based on numerically solving a large system of coupled differential equations, the proposed method is much simpler and more accurate. This allows us to design irregular sensing graphs for such recovery algorithms. The designed irregular graphs outperform the corresponding regular graphs substantially. For example, for the same recovery complexity per iteration, we design irregular graphs that can recover up to about 40% more non-zero signal elements compared to the regular graphs. Simulation results are also provided which demonstrate that the proposed asymptotic analysis matches the performance of recovery algorithms for large but finite values of $n$.
  • In this paper, we characterize the stability region of multi-queue multi-server (MQMS) queueing systems with stationary channel and packet arrival processes. Toward this, the necessary and sufficient conditions for the stability of the system are derived under general arrival processes with finite first and second moments. We show that when the arrival processes are stationary, the stability region form is a polytope for which we explicitly find the coefficients of the linear inequalities which characterize the stability region polytope.
  • In this paper, we investigate the problem of assignment of $K$ identical servers to a set of $N$ parallel queues in a time slotted queueing system. The connectivity of each queue to each server is randomly changing with time; each server can serve at most one queue and each queue can be served by at most one server per time slot. Such queueing systems were widely applied in modeling the scheduling (or resource allocation) problem in wireless networks. It has been previously proven that Maximum Weighted Matching (MWM) is a throughput optimal server assignment policy for such queueing systems. In this paper, we prove that for a symmetric system with i.i.d. Bernoulli packet arrivals and connectivities, MWM minimizes, in stochastic ordering sense, a broad range of cost functions of the queue lengths including total queue occupancy (or equivalently average queueing delay).
  • In this paper, we characterize the network stability region (capacity region) of multi-queue multi-server (MQMS) queueing systems with stationary channel distribution and stationary arrival processes. The stability region is specified by a finite set of linear inequalities. We first show that the stability region is a polytope characterized by the finite set of its facet defining hyperplanes. We explicitly determine the coefficients of the linear inequalities describing the facet defining hyperplanes of the stability region polytope. We further derive the necessary and sufficient conditions for the stability of the system for general arrival processes with finite first and second moments. For the case of stationary arrival processes, the derived conditions characterize the system stability region. Furthermore, we obtain an upper bound for the average queueing delay of Maximum Weight (MW) server allocation policy which has been shown in the literature to be a throughput optimal policy for MQMS systems. Using a similar approach, we can characterize the stability region for a fluid model MQMS system. However, the stability region of the fluid model system is described by an infinite number of linear inequalities since in this case the stability region is a convex surface. We present an example where we show that in some cases depending on the channel distribution, the stability region can be characterized by a finite set of non-linear inequalities instead of an infinite number of linear inequalities.
  • In this paper, we present a new approach for the analysis of iterative node-based verification-based (NB-VB) recovery algorithms in the context of compressive sensing. These algorithms are particularly interesting due to their low complexity (linear in the signal dimension $n$). The asymptotic analysis predicts the fraction of unverified signal elements at each iteration $\ell$ in the asymptotic regime where $n \rightarrow \infty$. The analysis is similar in nature to the well-known density evolution technique commonly used to analyze iterative decoding algorithms. To perform the analysis, a message-passing interpretation of NB-VB algorithms is provided. This interpretation lacks the extrinsic nature of standard message-passing algorithms to which density evolution is usually applied. This requires a number of non-trivial modifications in the analysis. The analysis tracks the average performance of the recovery algorithms over the ensembles of input signals and sensing matrices as a function of $\ell$. Concentration results are devised to demonstrate that the performance of the recovery algorithms applied to any choice of the input signal over any realization of the sensing matrix follows the deterministic results of the analysis closely. Simulation results are also provided which demonstrate that the proposed asymptotic analysis matches the performance of recovery algorithms for large but finite values of $n$. Compared to the existing technique for the analysis of NB-VB algorithms, which is based on numerically solving a large system of coupled differential equations, the proposed method is much simpler and more accurate.
  • We investigate an optimal scheduling problem in a discrete-time system of L parallel queues that are served by K identical, randomly connected servers. Each queue may be connected to a subset of the K servers during any given time slot. This model has been widely used in studies of emerging 3G/4G wireless systems. We introduce the class of Most Balancing (MB) policies and provide their mathematical characterization. We prove that MB policies are optimal; we define optimality as minimization, in stochastic ordering sense, of a range of cost functions of the queue lengths, including the process of total number of packets in the system. We use stochastic coupling arguments for our proof. We introduce the Least Connected Server First/Longest Connected Queue (LCSF/LCQ) policy as an easy-to-implement approximation of MB policies. We conduct a simulation study to compare the performance of several policies. The simulation results show that: (a) in all cases, LCSF/LCQ approximations to the MB policies outperform the other policies, (b) randomized policies perform fairly close to the optimal one, and, (c) the performance advantage of the optimal policy over the other simulated policies increases as the channel connectivity probability decreases and as the number of servers in the system increases.
  • In this paper, we present a new approach for the analysis of iterative node-based verification-based (NB-VB) recovery algorithms in the context of compressive sensing. These algorithms are particularly interesting due to their low complexity (linear in the signal dimension $n$). The asymptotic analysis predicts the fraction of unverified signal elements at each iteration $\ell$ in the asymptotic regime where $n \rightarrow \infty$. The analysis is similar in nature to the well-known density evolution technique commonly used to analyze iterative decoding algorithms. To perform the analysis, a message-passing interpretation of NB-VB algorithms is provided. This interpretation lacks the extrinsic nature of standard message-passing algorithms to which density evolution is usually applied. This requires a number of non-trivial modifications in the analysis. The analysis tracks the average performance of the recovery algorithms over the ensembles of input signals and sensing matrices as a function of $\ell$. Concentration results are devised to demonstrate that the performance of the recovery algorithms applied to any choice of the input signal over any realization of the sensing matrix follows the deterministic results of the analysis closely. Simulation results are also provided which demonstrate that the proposed asymptotic analysis matches the performance of recovery algorithms for large but finite values of $n$. Compared to the existing technique for the analysis of NB-VB algorithms, which is based on numerically solving a large system of coupled differential equations, the proposed method is much simpler and more accurate.
  • Network capacity region of multi-queue multi-server queueing system with random ON-OFF connectivities and stationary arrival processes is derived in this paper. Specifically, the necessary and sufficient conditions for the stability of the system are derived under general arrival processes with finite first and second moments. In the case of stationary arrival processes, these conditions establish the network capacity region of the system. It is also shown that AS/LCQ (Any Server/Longest Connected Queue) policy stabilizes the system when it is stabilizable. Furthermore, an upper bound for the average queue occupancy is derived for this policy.
  • In this paper, we propose a general framework for the asymptotic analysis of node-based verification-based algorithms. In our analysis we tend the signal length $n$ to infinity. We also let the number of non-zero elements of the signal $k$ scale linearly with $n$. Using the proposed framework, we study the asymptotic behavior of the recovery algorithms over random sparse matrices (graphs) in the context of compressive sensing. Our analysis shows that there exists a success threshold on the density ratio $k/n$, before which the recovery algorithms are successful, and beyond which they fail. This threshold is a function of both the graph and the recovery algorithm. We also demonstrate that there is a good agreement between the asymptotic behavior of recovery algorithms and finite length simulations for moderately large values of $n$.