• The volume of time series data has exploded due to the popularity of new applications, such as data center management and IoT. Subsequence matching is a fundamental task in mining time series data. All index-based approaches only consider raw subsequence matching (RSM) and do not support subsequence normalization. UCR Suite can deal with normalized subsequence match problem (NSM), but it needs to scan full time series. In this paper, we propose a novel problem, named constrained normalized subsequence matching problem (cNSM), which adds some constraints to NSM problem. The cNSM problem provides a knob to flexibly control the degree of offset shifting and amplitude scaling, which enables users to build the index to process the query. We propose a new index structure, KV-index, and the matching algorithm, KV-match. With a single index, our approach can support both RSM and cNSM problems under either ED or DTW distance. KV-index is a key-value structure, which can be easily implemented on local files or HBase tables. To support the query of arbitrary lengths, we extend KV-match to KV-match$_{DP}$, which utilizes multiple varied-length indexes to process the query. We conduct extensive experiments on synthetic and real-world datasets. The results verify the effectiveness and efficiency of our approach.
  • Often, it is required to estimate the probability that a quantity such as toxicity level, plutonium, temperature, rainfall, damage, wind speed, wave size, earthquake magnitude, risk, etc., exceeds an unsafe high threshold. The probability in question is then very small. To estimate such a probability, information is needed about large values of the quantity of interest. However, in many cases, the data only contain values below or even far below the designated threshold, let alone exceedingly large values. It is shown that by repeated fusion of the data with externally generated random data, more information about small tail probabilities is obtained with the aid of certain new statistical functions. This provides relatively short, yet reliable interval estimates based on moderately large samples. A comparison of the approach with a method from extreme values theory (Peaks over Threshold, or POT), using both artificial and real data, points to the merit of repeated out of sample fusion.
  • This paper presents a non-iterative solution to RGB-D-inertial odometry system. Traditional odometry methods resort to iterative algorithms which are usually computationally expensive or require well-designed initialization. To overcome this problem, this paper proposes to combine a non-iterative front-end (odometry) with an iterative back-end (loop closure) for the RGB-D-inertial SLAM system. The main contribution lies in the novel non-iterative front-end, which leverages on inertial fusion and kernel cross-correlators (KCC) to match point clouds in frequency domain. Dominated by the fast Fourier transform (FFT), our method is only of complexity $\mathcal{O}(n\log{n})$, where $n$ is the number of points. Map fusion is conducted by element-wise operations, so that both time and space complexity are further reduced. Extensive experiments show that, due to the lightweight of the proposed front-end, the framework is able to run at a much faster speed yet still with comparable accuracy with the state-of-the-arts.
  • In this paper, we prove two conjectural supercongruences on the $(p-1)$th Ap\'ery number, which were recently proposed by Z.-H. Sun.
  • Many Engineering Problems could be mathematically described by Final Value Problem, which is the inverse problem of Initial Value Problem. Accordingly, the paper studies the final value problem in the field of ODE problems and analyses the differences and relations between initial and final value problems. The more general new concept of the endpoints-value problem which could describe both initial and final problems is proposed. Further, we extend the concept into inner-interval value problem and arbitrary value problem and point out that both endpoints-value problem and inner-interval value problem are special forms of arbitrary value problem. Particularly, the existence and uniqueness of the solutions of final value problem and inner-interval value problem of first order ordinary differential equation are proved for discrete problems. The numerical calculation formulas of the problems are derived, and for each algorithm, we propose the convergence and stability conditions of them. Furthermore, multivariate and high-order final value problems are further studied, and the condition of fixed delay is also discussed in this paper. At last, the effectiveness of the considered methods is validated by numerical experiment.
  • Vector quantization aims to form new vectors/matrices with shared values close to the original. It could compress data with acceptable information loss, and could be of great usefulness in areas like Image Processing, Pattern Recognition and Machine Learning. In recent years, the importance of quantization has been soaring as it has been discovered huge potentials in deploying practical neural networks, which is among one of the most popular research topics. Conventional vector quantization methods usually suffer from their own flaws: hand-coding domain rules quantization could produce poor results when encountering complex data, and clustering-based algorithms have the problem of inexact solution and high time consumption. In this paper, we explored vector quantization problem from a new perspective of sparse least square optimization and designed multiple algorithms with their program implementations. Specifically, deriving from a sparse form of coefficient matrix, three types of sparse least squares, with $l_0$, $l_1$, and generalized $l_1 + l_2$ penalizations, are designed and implemented respectively. In addition, to produce quantization results with given amount of quantized values(instead of penalization coefficient $\lambda$), this paper proposed a cluster-based least square quantization method, which could also be regarded as an improvement of information preservation of conventional clustering algorithm. The algorithms were tested on various data and tasks and their computational properties were analyzed. The paper offers a new perspective to probe the area of vector quantization, while the algorithms proposed could provide more appropriate options for quantization tasks under different circumstances.
  • An ultra-wideband (UWB) aided localization system is presented. Existing filter-based methods using UWB need kinetic model which is generally hard to obtain because of the complex structure of robots. This paper proposes a graph optimization approach to localization with inertial measurement unit (IMU) and UWB measurements which converts the localization problem into an optimization problem on Lie-Manifold, so that kinetic model can be avoided. This method makes full use of measurement data to localize a robot with acceptable computational complexity. It is achieved by minimizing the trajectory error based on several constrained equations arising from different types of sensor measurements. Our first contribution is graph realization in Manifold instead of Euclidean space for UWB-based systems, and new type of edge is defined and created in order to apply exiting graph optimization tool. Our second contribution is online approach that enables this approach to robots with ultra-low power processor. Experiments under a variety of scenarios verify the stability of this method and demonstrate that the algorithm achieves a much higher localization accuracy than the filter-based methods.
  • Cross-correlator plays a significant role in many visual perception tasks, such as object detection and tracking. Beyond the linear cross-correlator, this paper proposes a kernel cross-correlator (KCC) that breaks traditional limitations. First, by introducing the kernel trick, the KCC extends the linear cross-correlation to non-linear space, which is more robust to signal noises and distortions. Second, the connection to the existing works shows that KCC provides a unified solution for correlation filters. Third, KCC is applicable to any kernel function and is not limited to circulant structure on training data, thus it is able to predict affine transformations with customized properties. Last, by leveraging the fast Fourier transform (FFT), KCC eliminates direct calculation of kernel vectors, thus achieves better performance yet still with a reasonable computational cost. Comprehensive experiments on visual tracking and human activity recognition using wearable devices demonstrate its robustness, flexibility, and efficiency. The source codes of both experiments are released at https://github.com/wang-chen/KCC
  • In this paper we propose a method to achieve relative positioning and tracking of a target by a quadcopter using Ultra-wideband (UWB) ranging sensors, which are strategically installed to help retrieve both relative position and bearing between the quadcopter and target. To achieve robust localization for autonomous flight even with uncertainty in the speed of the target, two main features are developed. First, an estimator based on Extended Kalman Filter (EKF) is developed to fuse UWB ranging measurements with data from onboard sensors including inertial measurement unit (IMU), altimeters and optical flow. Second, to properly handle the coupling of the target's orientation with the range measurements, UWB based communication capability is utilized to transfer the target's orientation to the quadcopter. Experiments results demonstrate the ability of the quadcopter to control its position relative to the target autonomously in both cases when the target is static and moving.
  • Robust velocity and position estimation is crucial for autonomous robot navigation. The optical flow based methods for autonomous navigation have been receiving increasing attentions in tandem with the development of micro unmanned aerial vehicles. This paper proposes a kernel cross-correlator (KCC) based algorithm to determine optical flow using a monocular camera, which is named as correlation flow (CF). Correlation flow is able to provide reliable and accurate velocity estimation and is robust to motion blur. In addition, it can also estimate the altitude velocity and yaw rate, which are not available by traditional methods. Autonomous flight tests on a quadcopter show that correlation flow can provide robust trajectory estimation with very low processing power. The source codes are released based on the ROS framework.
  • The superconducting fluxonium circuit is an artificial atom with a strongly anharmonic spectrum: when biased at a half flux quantum, the lowest qubit transition is an order of magnitude smaller in frequency than those to higher levels. Similar to conventional atomic systems, such a frequency separation between the computational and noncomputational subspaces allows independent optimizations of the qubit coherence and two-qubit interactions. Here we describe a controlled-Z gate for two fluxoniums connected either capacitively or inductively, with qubit transitions fixed near 500 MHz. The gate is activated by a microwave drive at a resonance involving the second excited state. We estimate intrinsic gate fidelities over 99.9% with gate times below 100 ns.
  • We present a large scale survey of CO outflows in the Gem OB1 molecular cloud complex and its surroundings using the Purple Mountain Observatory Delingha 13.7 m telescope. A total of 198 outflow candidates were identified over a large area ($\sim$ 58.5 square degrees), of which 193 are newly detected. Approximately 68% (134/198) are associated with the Gem OB1 molecular cloud complex, including clouds GGMC 1, GGMC 2, BFS 52, GGMC 3 and GGMC 4. Other regions studied are: Local Arm (Local Lynds, West Front), Swallow, Horn, and Remote cloud. Outflow candidates in GGMC 1, BFS 52, and Swallow are mainly located at ring-like or filamentary structures. To avoid excessive uncertainty in distant regions ($\gtrsim$ 3.8 kpc), we only estimated the physical parameters for clouds in the Gem OB1 molecular cloud complex and in the Local arm. In those clouds, the total kinetic energy and the energy injection rate of the identified outflow candidates are $\lesssim$ 1% and $\lesssim$ 3% of the turbulent energy and the turbulent dissipation rate of each cloud, indicating that the identified outflow candidates cannot provide enough energy to balance turbulence of their host cloud at the scale of the entire cloud (several pc to dozens of pc). The gravitational binding energy of each cloud is $\gtrsim$ 135 times the total kinetic energy of the identified outflow candidates within the corresponding cloud, indicating that the identified outflow candidates cannot cause major disruptions to the integrity of their host cloud at the scale of the entire cloud.
  • The three-terminal heat device consisting of a cavity and coupled to a heat bath is established. By tuning the temperatures of the electrodes and the phonon bath, the device can function as a heat engine or a refrigerator. We study the characteristic performance in the linear and nonlinear regime for both setups. It is our focus here to analyze how the efficiency of the heat engine and coefficient of performance of the refrigerator are affected by the nonlinear transport. With such considerations, the maximum efficiency and power are then optimized for various energy levels, temperatures and other parameters.
  • In biostatistics, propensity score is a common approach to analyze the imbalance of covariate and process confounding covariates to eliminate differences between groups. While there are an abundant amount of methods to compute propensity score, a common issue of them is the corrupted labels in the dataset. For example, the data collected from the patients could contain samples that are treated mistakenly, and the computing methods could incorporate them as a misleading information. In this paper, we propose a Machine Learning-based method to handle the problem. Specifically, we utilize the fact that the majority of sample should be labeled with the correct instance and design an approach to first cluster the data with spectral clustering and then sample a new dataset with a distribution processed from the clustering results. The propensity score is computed by Xgboost, and a mathematical justification of our method is provided in this paper. The experimental results illustrate that xgboost propensity scores computing with the data processed by our method could outperform the same method with original data, and the advantages of our method increases as we add some artificial corruptions to the dataset. Meanwhile, the implementation of xgboost to compute propensity score for multiple treatments is also a pioneering work in the area.
  • Entangling gates between qubits are a crucial component for performing algorithms in quantum computers. However, any quantum algorithm must ultimately operate on error-protected logical qubits encoded in high-dimensional systems. Typically, logical qubits are encoded in multiple two-level systems, but entangling gates operating on such qubits are highly complex and have not yet been demonstrated. Here, we realize a controlled NOT (CNOT) gate between two multiphoton qubits in two microwave cavities. In this approach, we encode a qubit in the high-dimensional space of a single cavity mode, rather than in multiple two-level systems. We couple two such encoded qubits together through a transmon, which is driven by an RF pump to apply the gate within 190 ns. This is two orders of magnitude shorter than the decoherence time of the transmon, enabling a high-fidelity gate operation. These results are an important step towards universal algorithms on error-corrected logical qubits.
  • We investigate nonequilibrium energy transfer in a single-site Bose-Hubbard model coupled to two thermal baths. By including a quantum kinetic equation combined with full counting statistics, we investigate the steady state energy flux and noise power. The influence of the nonlinear Bose-Hubbard interaction on the transfer behaviors is analyzed, and the nonmonotonic features are clearly exhibited. Particularly, in the strong on-site repulsion limit, the results become identical with the nonequilibrium spin-boson model. We also extend the quantum kinetic equation to study the geometric-phase-induced energy pump. An interesting reversal behavior is unraveled by enhancing the Bose-Hubbard repulsion strength.
  • We study the quantum Zeno effect (QZE) and quantum anti-Zeno effect (QAZE) in a two-level system(TLS) interacting with an environment owning 1/f noise. Using a numerically exact method based on the thermo field dynamics(TFD) theory and the matrix product states(MPS), we obtain exact evolutions of the TLS and bath(environment) under repetitive measurements at both zero and finite temperatures. At zero temperature, we observe a novel transition from a pure QZE in the short time scale to a QZE-QAZE crossover in the long time scale, by considering the measurement induced non-Markvoian effect. At finite temperature, we exploit that the thermal fluctuation suppresses the decay of the survival probability in the short time scale, whereas it enhances the decay in the long time scale.
  • This paper proposes an ultra-wideband (UWB) aided localization and mapping system that leverages on inertial sensor and depth camera. Inspired by the fact that visual odometry (VO) system, regardless of its accuracy in the short term, still faces challenges with accumulated errors in the long run or under unfavourable environments, the UWB ranging measurements are fused to remove the visual drift and improve the robustness. A general framework is developed which consists of three parallel threads, two of which carry out the visual-inertial odometry (VIO) and UWB localization respectively. The other mapping thread integrates visual tracking constraints into a pose graph with the proposed smooth and virtual range constraints, such that an optimization is performed to provide robust trajectory estimation. Experiments show that the proposed system is able to create dense drift-free maps in real-time even running on an ultra-low power processor in featureless environments.
  • We establish lower bounds for the $p$-divisibility of the quantity $\#\operatorname{Hom}(G,GL_n(\mathbb{F}_q))$, the number of homomorphisms from $G$ to a general linear group, where $G$ is an Abelian $p$-group. This is in analogy to the result of Krattenthaler and M\"{u}ller \cite{MR3383810} on homomorphisms to symmetric groups.
  • Despite outperforming the human in many tasks, deep neural network models are also criticized for the lack of transparency and interpretability in decision making. The opaqueness results in uncertainty and low confidence when deploying such a model in model sharing scenarios, when the model is developed by a third party. For a supervised machine learning model, sharing training process including training data provides an effective way to gain trust and to better understand model predictions. However, it is not always possible to share all training data due to privacy and policy constraints. In this paper, we propose a method to disclose a small set of training data that is just sufficient for users to get the insight of a complicated model. The method constructs a boundary tree using selected training data and the tree is able to approximate the complicated model with high fidelity. We show that traversing data points in the tree gives users significantly better understanding of the model and paves the way for trustworthy model sharing.
  • Behavioral economics show us that emotions play an important role in individual behavior and decision-making. Does this also affect collective decision making in a community? Here we investigate whether the community sentiment energy of a topic is related to the spreading popularity of the topic. To compute the community sentiment energy of a topic, we first analyze the sentiment of a user on the key phrases of the topic based on the recent tweets of the user. Then we compute the total sentiment energy of all users in the community on the topic based on the Markov Random Field (MRF) model and graph entropy model. Experiments on two communities find the linear correlation between the community sentiment energy and the real spreading popularity of topics. Based on the finding, we proposed two models to predict the popularity of topics. Experimental results show the effectiveness of the two models and the helpful of sentiment in predicting the popularity of topics. Experiments also show that community sentiment affects collective decision making of spreading a topic or not in the community.
  • While Markov Random Fields (MRFs) are widely used in computer vision, they present a quite challenging inference problem. MRF inference can be accelerated by pre-processing techniques like Dead End Elimination (DEE) or QPBO-based approaches which compute the optimal labeling of a subset of variables. These techniques are guaranteed to never wrongly label a variable but they often leave a large number of variables unlabeled. We address this shortcoming by interpreting pre-processing as a classification problem, which allows us to trade off false positives (i.e., giving a variable an incorrect label) versus false negatives (i.e., failing to label a variable). We describe an efficient discriminative rule that finds optimal solutions for a subset of variables. Our technique provides both per-instance and worst-case guarantees concerning the quality of the solution. Empirical studies were conducted over several benchmark datasets. We obtain a speedup factor of 2 to 12 over expansion moves without preprocessing, and on difficult non-submodular energy functions produce slightly lower energy.
  • The goal of this paper is to create a new framework for dense SLAM that is light enough for micro-robot systems based on depth camera and inertial sensor. Feature-based and direct methods are two mainstreams in visual SLAM. Both methods minimize photometric or reprojection error by iterative solutions, which are computationally expensive. To overcome this problem, we propose a non-iterative framework to reduce computational requirement. First, the attitude and heading reference system (AHRS) and axonometric projection are utilized to decouple the 6 Degree-of-Freedom (DoF) data, so that point clouds can be matched in independent spaces respectively. Second, based on single key-frame training, the matching process is carried out in frequency domain by Fourier transformation, which provides a closed-form non-iterative solution. In this manner, the time complexity is reduced to $\mathcal{O}(n \log{n})$, where $n$ is the number of matched points in each frame. To the best of our knowledge, this method is the first non-iterative and online trainable approach for data association in visual SLAM. Compared with the state-of-the-arts, it runs at a faster speed and obtains 3-D maps with higher resolution yet still with comparable accuracy.
  • We present a large-scale mapping toward the GEM OB1 association in the galactic anti-center direction. The 9^deg * 6.5^deg area was mapped in 12CO, 13CO, and C18O with 50" angular resolution at 30" sampling. The region was divided into four main components based on spatial distribution and velocity: the Gemini OB1 Giant Molecular Cloud (GGMC) Complex, the Lynds Dark Clouds and the West Front Clouds, the Swallow and Horn, and the Remote Clouds. The GGMC Complex is located in the Perseus arm, while the Lynds Dark Clouds and the West Front Clouds are located in the Local arm. Swallow and Horn are revealed for the first time in this paper. The two clouds have a similar velocity interval ([11, 21] km s^-1) and have similar sizes (0.6 and 0.8 deg^2). We analyzed the structure of these clouds in detail and calculated their parameters (mass, temperature, etc.). Two elongated structures were discovered in a longitude-velocity map in the velocity interval [11, 30] km s^-1. We also found an interesting filament that shows a 0.8 km s^-1 pc^-1 gradient perpendicular to the direction of the long axis.
  • Eliminating duplicate data in primary storage of clouds increases the cost-efficiency of cloud service providers as well as reduces the cost of users for using cloud services. Existing primary deduplication techniques either use inline caching to exploit locality in primary workloads or use post-processing deduplication running in system idle time to avoid the negative impact on I/O performance. However, neither of them works well in the cloud servers running multiple services or applications for the following two reasons: Firstly, the temporal locality of duplicate data writes may not exist in some primary storage workloads thus inline caching often fails to achieve good deduplication ratio. Secondly, the post-processing deduplication allows duplicate data to be written into disks, therefore does not provide the benefit of I/O deduplication and requires high peak storage capacity. This paper presents HPDedup, a Hybrid Prioritized data Deduplication mechanism to deal with the storage system shared by applications running in co-located virtual machines or containers by fusing an inline and a post-processing process for exact deduplication. In the inline deduplication phase, HPDedup gives a fingerprint caching mechanism that estimates the temporal locality of duplicates in data streams from different VMs or applications and prioritizes the cache allocation for these streams based on the estimation. HPDedup also allows different deduplication threshold for streams based on their spatial locality to reduce the disk fragmentation. The post-processing phase removes duplicates whose fingerprints are not able to be cached due to the weak temporal locality from disks. Our experimental results show that HPDedup clearly outperforms the state-of-the-art primary storage deduplication techniques in terms of inline cache efficiency and primary deduplication efficiency.