• Small-cell architecture is widely adopted by cellular network operators to increase network capacity. By reducing the size of cells, operators can pack more (low-power) base stations in an area to better serve the growing demand, without causing extra interference. However, this approach suffers from low spectrum temporal efficiency. When a cell becomes smaller and covers fewer users, its total traffic fluctuates significantly due to insufficient traffic aggregation and exhibits a large "peak-to-mean" ratio. As operators customarily provision spectrum for peak traffic, large temporal fluctuation in traffic inevitably leads to low spectrum temporal efficiency. In this paper, we advocate device-to-device (D2D) load balancing as a useful mechanism to address this fundamental drawback of small-cell architecture. The idea is to shift traffic from a congested cell to its adjacent under-utilized cells by leveraging inter-cell D2D communication, so that the traffic can be served without using extra spectrum, effectively improving the spectrum temporal efficiency. We provide theoretical modeling and analysis to characterize the benefit of D2D load balancing, in terms of the total spectrum requirement of all individual cells. We also derive the corresponding cost, in terms of the incurred D2D traffic overhead. We carry out empirical evaluations based on real-world 4G data traces to gauge the benefit and cost of D2D load balancing under practical settings. The results show that D2D load balancing can reduce the spectrum requirement by 25% compared with the standard scenario without D2D load balancing, at the expense of a negligible 0.7% D2D traffic overhead.
  • We study a delay-sensitive information flow problem in which a source streams information to a sink over a directed graph G(V,E) at a fixed rate R, possibly using multiple paths, so as to minimize the maximum end-to-end delay. We call this the Min-Max-Delay problem. Transmission over an edge incurs a constant delay as long as the flow stays within the edge capacity. We prove that Min-Max-Delay is weakly NP-complete, and demonstrate that it becomes strongly NP-complete if we require an integer flow solution. We propose an optimal pseudo-polynomial time algorithm for Min-Max-Delay, with time complexity O(\log (Nd_{\max}) (N^5d_{\max}^{2.5})(\log R+N^2d_{\max}\log(N^2d_{\max}))), where N = \max\{|V|,|E|\} and d_{\max} is the maximum edge delay. In addition, we show that the integrality gap, defined as the ratio of the maximum delay of an optimal integer flow to the maximum delay of an optimal fractional flow, can be arbitrarily large.
  • There is a pressing need to build an architecture that could subsume these networks under a unified framework that achieves both higher performance and less overhead. To this end, two fundamental issues are yet to be addressed. The first is how to implement back propagation when neuronal activations are discrete. The second is how to remove the full-precision hidden weights in the training phase to break the bottlenecks of memory and computation consumption. To address the first issue, we present a multi-step neuronal activation discretization method and a derivative approximation technique that enable the implementation of the back propagation algorithm on discrete DNNs. For the second issue, we propose a discrete state transition (DST) methodology to constrain the weights in a discrete space without saving the hidden weights. In this way, we build a unified framework that subsumes binary and ternary networks as special cases, and under which a heuristic algorithm is provided at the website https://github.com/AcrossV/Gated-XNOR. In particular, we find that when both the weights and the activations become ternary values, the DNNs can be reduced to sparse binary networks, termed gated XNOR networks (GXNOR-Nets), since only the event of a non-zero weight and a non-zero activation enables the control gate to start the XNOR logic operations of the original binary networks. This promises an event-driven hardware design for efficient mobile intelligence. We achieve advanced performance compared with state-of-the-art algorithms. Furthermore, the computational sparsity and the number of states in the discrete space can be flexibly modified to suit various hardware platforms.
  • The rich interaction phenomena at antiferromagnet (AFM)/ferromagnet (FM) interfaces are key ingredients in AFM spintronics, where many underlying mechanisms remain unclear. Here we report a correlation observed between the interfacial Dzyaloshinskii-Moriya interaction (DMI) Ds and the effective spin mixing conductance g at the IrMn/CoFeB interface. Both Ds and g are quantitatively determined with Brillouin light scattering measurements, and both increase with IrMn thickness in the range of 2.5 to 7.5 nm. This correlation likely originates from AFM-states-mediated spin-flip transitions in the FM, which promote both the interfacial DMI and the spin pumping effect. Our findings provide deeper insight into AFM-FM interfacial coupling for future spintronic design.
  • Batch Normalization (BN) has been proven to be quite effective at accelerating and improving the training of deep neural networks (DNNs). However, BN brings additional computation, consumes more memory, and generally slows down the training process by a large margin, which aggravates the training effort. Furthermore, the nonlinear square and root operations in BN also impede low bit-width quantization techniques, which draw much attention in the deep learning hardware community. In this work, we propose an L1-norm BN (L1BN) with only linear operations in both the forward and the backward propagations during training. L1BN is shown to be approximately equivalent to the original L2-norm BN (L2BN) up to a multiplicative scaling factor. Experiments on various convolutional neural networks (CNNs) and generative adversarial networks (GANs) reveal that L1BN maintains almost the same accuracies and convergence rates as L2BN but with higher computational efficiency. On an FPGA platform, the proposed signum and absolute operations in L1BN achieve a 1.5× speedup and save 50% of the power consumption, compared with the original costly square and root operations, respectively. This hardware-friendly normalization method not only surpasses L2BN in speed, but also simplifies the hardware design of ASIC accelerators with higher energy efficiency. Last but not least, L1BN promises fully quantized training of DNNs, which is crucial to future adaptive terminal devices.
  • We perform decoy-state quantum key distribution between a low-Earth-orbit satellite and multiple ground stations located in Xinglong, Nanshan, and Graz, which establishes satellite-to-ground secure keys at a ~kHz rate per passage of the satellite Micius over a ground station. The satellite thus establishes a secure key between itself and, say, Xinglong, and another key between itself and, say, Graz. Then, upon request from the ground command, Micius acts as a trusted relay. It performs bitwise exclusive OR operations between the two keys and relays the result to one of the ground stations. In this way, a secret key is created between China and Europe at locations separated by 7600 km on Earth. These keys are then used for intercontinental quantum-secured communication. First, images were transmitted in a one-time-pad configuration from China to Austria as well as from Austria to China. Second, a videoconference was held between the Austrian Academy of Sciences and the Chinese Academy of Sciences, which also included a 280 km optical ground connection between Xinglong and Beijing. Our work points towards an efficient solution for an ultralong-distance global quantum network, laying the groundwork for a future quantum internet.
  • Compared with artificial neural networks (ANNs), spiking neural networks (SNNs) are promising for exploring brain-like behaviors, since spikes can encode more spatio-temporal information. Although pre-training from an ANN or direct training based on backpropagation (BP) makes the supervised training of SNNs possible, these methods exploit only the networks' spatial-domain information, which leads to a performance bottleneck and requires many complicated training skills. Another fundamental issue is that spike activity is naturally non-differentiable, which causes great difficulties in training SNNs. To this end, we build an iterative LIF model that is friendlier to gradient descent training. By simultaneously considering the layer-by-layer spatial domain (SD) and the timing-dependent temporal domain (TD) in the training phase, together with an approximated derivative for the spike activity, we propose a spatio-temporal backpropagation (STBP) training framework that does not rely on any complicated technique. With a multi-layered perceptron (MLP), we achieve the best performance compared with existing state-of-the-art algorithms on the static MNIST and the dynamic N-MNIST datasets, as well as on a custom object detection dataset. This work provides a new perspective for exploring high-performance SNNs for future brain-like computing paradigms with rich spatio-temporal dynamics.
  • Quantum key distribution (QKD) uses individual light quanta in quantum superposition states to guarantee unconditional communication security between distant parties. In practice, the achievable distance for QKD has been limited to a few hundred kilometers, due to the channel loss of fibers or terrestrial free space, which exponentially reduces the photon rate. Satellite-based QKD promises to establish a global-scale quantum network by exploiting the negligible photon loss and decoherence in empty outer space. Here, we develop and launch a low-Earth-orbit satellite to implement decoy-state QKD with a key rate above 1 kHz from the satellite to the ground over a distance of up to 1200 km, which is up to 20 orders of magnitude more efficient than that expected using an optical fiber (with 0.2 dB/km loss) of the same length. The establishment of a reliable and efficient space-to-ground link for faithful quantum state transmission constitutes a key milestone for global-scale quantum networks.
  • Driven by green communications, energy efficiency (EE) has become an important new criterion for designing wireless communication systems. However, high EE often leads to low spectral efficiency (SE), which spurs research on the EE-SE tradeoff. In this paper, we focus on how to maximize the utility in the physical layer for an uplink multi-user multiple-input multiple-output (MU-MIMO) system, where we not only consider the EE-SE tradeoff in a unified way, but also ensure user fairness. We first formulate the utility maximization problem, which turns out to be non-convex. By exploiting the structure of this problem, we find a convexization procedure that converts the original non-convex problem into an equivalent convex problem with the same global optimum. Following the convexization procedure, we present a centralized algorithm to solve the utility maximization problem, but it requires global information about all users. We therefore propose a primal-dual distributed algorithm that does not need global information and incurs only a small amount of overhead. Furthermore, we prove that the distributed algorithm converges to the global optimum. Finally, the numerical results show that our approach can both capture user diversity for the EE-SE tradeoff and ensure user fairness, and they also validate the effectiveness of our primal-dual distributed algorithm.
  • Conventional single image based localization methods usually fail to localize a querying image when there exist large variations between the querying image and the pre-built scene. To address this, we propose an image-set querying based localization approach. When the localization by a single image fails to work, the system will ask the user to capture more auxiliary images. First, a local 3D model is established for the querying image set. Then, the pose of the querying image set is estimated by solving a nonlinear optimization problem, which aims to match the local 3D model against the pre-built scene. Experiments have shown the effectiveness and feasibility of the proposed approach.
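
The peak-provisioning argument in the small-cell abstract above can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the traffic numbers are hypothetical, spectrum is assumed to scale linearly with traffic, and the "ideal" figure assumes traffic can be shifted between cells freely and at no cost, which is an upper bound on the benefit rather than the paper's model.

```python
# Hypothetical per-slot traffic (arbitrary units) for three adjacent small cells.
cell_traffic = {
    "cell_a": [2, 9, 3, 1],
    "cell_b": [8, 2, 1, 3],
    "cell_c": [1, 2, 9, 2],
}


def peak_to_mean(series):
    """Ratio of the largest sample to the average sample."""
    return max(series) / (sum(series) / len(series))


# Without load balancing, every cell is provisioned for its own peak.
spectrum_without = sum(max(series) for series in cell_traffic.values())

# Idealized bound: if traffic could be shifted freely between cells,
# only the peak of the aggregate traffic would need to be provisioned.
aggregate = [sum(slot) for slot in zip(*cell_traffic.values())]
spectrum_ideal = max(aggregate)

for name, series in cell_traffic.items():
    print(name, "peak-to-mean:", round(peak_to_mean(series), 2))
print("aggregate peak-to-mean:", round(peak_to_mean(aggregate), 2))
print("sum of per-cell peaks:", spectrum_without)
print("peak of aggregate:", spectrum_ideal)
```

Because the cells in this toy example peak in different slots, the aggregate has a much smaller peak-to-mean ratio than any single cell, which is the aggregation effect the abstract refers to.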
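
The L1-norm batch normalization abstract above states that L1BN matches L2BN up to a scaling factor. The sketch below shows a forward pass only, in plain Python on a one-dimensional batch. The function names, the `eps` value, and the Gaussian test data are assumptions made for illustration. The factor sqrt(pi/2) is the standard ratio between the standard deviation and the mean absolute deviation of a normal distribution, not a value quoted in the abstract.

```python
import math
import random


def l1_batch_norm(xs, eps=1e-5):
    """Normalize by the mean absolute deviation (no square or root needed)."""
    n = len(xs)
    mean = sum(xs) / n
    mean_abs_dev = sum(abs(x - mean) for x in xs) / n
    return [(x - mean) / (mean_abs_dev + eps) for x in xs]


def l2_batch_norm(xs, eps=1e-5):
    """Conventional normalization by the standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


random.seed(0)
# Hypothetical activations, drawn from a Gaussian for the comparison.
batch = [random.gauss(3.0, 2.0) for _ in range(10000)]

l1 = l1_batch_norm(batch)
l2 = l2_batch_norm(batch)

# Both outputs are the same centered data divided by different constants,
# so the ratio of their largest magnitudes is the ratio of those constants.
ratio = max(abs(v) for v in l1) / max(abs(v) for v in l2)
gaussian_scale = math.sqrt(math.pi / 2)

print("observed L1/L2 output ratio:", ratio)
print("sqrt(pi/2):", gaussian_scale)
```

For inputs that are far from Gaussian the ratio would differ, so a learned or calibrated scale might be needed in practice.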
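
The trusted-relay step described in the satellite QKD abstract above (XOR of the two satellite-to-ground keys, relayed to one station) can be sketched as follows. This is a minimal illustration, not the mission's implementation: the keys are generated locally with `secrets.token_bytes` as stand-ins for QKD output, and the variable names and message are hypothetical.

```python
import secrets


def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Stand-ins for the keys the satellite shares with each ground station.
key_xinglong = secrets.token_bytes(32)
key_graz = secrets.token_bytes(32)

# The satellite XORs the two keys and relays the result to one station (here Graz).
relayed = xor_bytes(key_xinglong, key_graz)

# Graz removes its own key and is left with the key held by Xinglong.
recovered_at_graz = xor_bytes(relayed, key_graz)

# One-time pad: the message must be no longer than the key, and key bytes
# must never be reused for another message.
message = b"hello from Xinglong"
pad = key_xinglong[: len(message)]
ciphertext = xor_bytes(message, pad)
decrypted = xor_bytes(ciphertext, recovered_at_graz[: len(ciphertext)])

print(recovered_at_graz == key_xinglong)
print(decrypted)
```

The relayed value alone does not reveal either key, but the satellite itself knows both, which is why the abstract calls it a trusted relay.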