• We study stochastic routing in the PAth-CEntric (PACE) uncertain road network model. In the PACE model, uncertain travel times are associated with not only edges but also some paths. The uncertain travel times associated with paths are able to well capture the travel time dependency among different edges. This significantly improves the accuracy of travel time distribution estimations for arbitrary paths, which is a fundamental functionality in stochastic routing, compared to classic uncertain road network models where uncertain travel times are associated with only edges. We investigate a new formulation of the shortest path with on-time arrival reliability (SPOTAR) problem under the PACE model. Given a source, a destination, and a travel time budget, the SPOTAR problem aims at finding a path that maximizes the on-time arrival probability. We develop a generic algorithm with different speed-up strategies to solve the SPOTAR problem under the PACE model. Empirical studies with substantial GPS trajectory data offer insight into the design properties of the proposed algorithm and confirm that the algorithm is effective. This report extends the paper "Stochastic Shortest Path Finding in Path-Centric Uncertain Road Networks", to appear in IEEE MDM 2018, by providing a concrete running example of the studied SPOTAR problem in the PACE model and additional statistics of the used GPS trajectories in the experiments.
  • Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noises. In addition to various regularizers, example reweighting algorithms are popular solutions to these problems, but they require careful tuning of additional hyperparameters, such as example mining schedules and regularization hyperparameters. In contrast to past reweighting methods, which typically consist of functions of the cost value of each example, in this work we propose a novel meta-learning algorithm that learns to assign weights to training examples based on their gradient directions. To determine the example weights, our method performs a meta gradient descent step on the current mini-batch example weights (which are initialized from zero) to minimize the loss on a clean unbiased validation set. Our proposed method can be easily implemented on any type of deep network, does not require any additional hyperparameter tuning, and achieves impressive performance on class imbalance and corrupted label problems where only a small amount of clean validation data is available.
  • In this paper, we propose a multi-level attention model to solve the weakly labelled audio classification problem. The objective of audio classification is to predict the presence or absence of audio events in an audio clip. Recently, Google published a large scale weakly labelled dataset called Audio Set, where each audio clip contains only the presence or absence of the audio events, without the onset and offset time of the audio events. Our multi-level attention model is an extension to the previously proposed single-level attention model. It consists of several attention modules applied on intermediate neural network layers. The output of these attention modules are concatenated to a vector followed by a multi-label classifier to make the final prediction of each class. Experiments shown that our model achieves a mean average precision (mAP) of 0.360, outperforms the state-of-the-art single-level attention model of 0.327 and Google baseline of 0.314.
  • Motivated by the increasing availability of vehicle trajectory data, we propose learn-to-route, a comprehensive trajectory-based routing solution. Specifically, we first construct a graph-like structure from trajectories as the routing infrastructure. Second, we enable trajectory-based routing given an arbitrary (source, destination) pair. In the first step, given a road network and a collection of trajectories, we propose a trajectory-based clustering method that identifies regions in a road network. If a pair of regions are connected by trajectories, we maintain the paths used by these trajectories and learn a routing preference for travel between the regions. As trajectories are skewed and sparse, many region pairs are not connected by trajectories. We thus transfer routing preferences from region pairs with sufficient trajectories to such region pairs and then use the transferred preferences to identify paths between the regions. In the second step, we exploit the above graph-like structure to achieve a comprehensive trajectory-based routing solution. Empirical studies with two substantial trajectory data sets offer insight into the proposed solution, indicating that it is practical. A comparison with a leading routing service offers evidence that the paper's proposal is able to enhance routing quality. This is an extended version of "Learning to Route with Sparse Trajectory Sets" [1], to appear in IEEE ICDE 2018.
  • Motivated by many practical applications in logistics and mobility-as-a-service, we study the top-k optimal sequenced routes (KOSR) querying on large, general graphs where the edge weights may not satisfy the triangle inequality, e.g., road network graphs with travel times as edge weights. The KOSR querying strives to find the top-k optimal routes (i.e., with the top-k minimal total costs) from a given source to a given destination, which must visit a number of vertices with specific vertex categories (e.g., gas stations, restaurants, and shopping malls) in a particular order (e.g., visiting gas stations before restaurants and then shopping malls). To efficiently find the top-k optimal sequenced routes, we propose two algorithms PruningKOSR and StarKOSR. In PruningKOSR, we define a dominance relationship between two partially-explored routes. The partially-explored routes that can be dominated by other partially-explored routes are postponed being extended, which leads to a smaller searching space and thus improves efficiency. In StarKOSR, we further improve the efficiency by extending routes in an A* manner. With the help of a judiciously designed heuristic estimation that works for general graphs, the cost of partially explored routes to the destination can be estimated such that the qualified complete routes can be found early. In addition, we demonstrate the high extensibility of the proposed algorithms by incorporating Hop Labeling, an effective label indexing technique for shortest path queries, to further improve efficiency. Extensive experiments on multiple real-world graphs demonstrate that the proposed methods significantly outperform the baseline method. Furthermore, when k=1, StarKOSR also outperforms the state-of-the-art method for the optimal sequenced route queries.
  • The problem of identifying end-use electrical appliances from their individual consumption profiles, known as the appliance identification problem, is a primary stage in both Non-Intrusive Load Monitoring (NILM) and automated plug-wise metering. Therefore, appliance identification has received dedicated studies with various electric appliance signatures, classification models, and evaluation datasets. In this paper, we propose a neural network ensembles approach to address this problem using high resolution measurements. The models are trained on the raw current and voltage waveforms, and thus, eliminating the need for well engineered appliance signatures. We evaluate the proposed model on a publicly available appliance dataset from 55 residential buildings, 11 appliance categories, and over 1000 measurements. We further study the stability of the trained models with respect to training dataset, sampling frequency, and variations in the steady-state operation of appliances.
  • Recently, and with the growing development of big energy datasets, data-driven learning techniques began to represent a potential solution to the energy disaggregation problem outperforming engineered and hand-crafted models. However, most proposed deep disaggregation models are load-dependent in the sense that either expert knowledge or a hyper-parameter optimization stage is required prior to training and deployment (normally for each load category) even upon acquisition and cleansing of aggregate and sub-metered data. In this paper, we present a feasibility study on the development of a generic disaggregation model based on data-driven learning. Specifically, we present a generic deep disaggregation model capable of achieving state-of-art performance in load monitoring for a variety of load categories. The developed model is evaluated on the publicly available UK-DALE dataset with a moderately low sampling frequency and various domestic loads.
  • In this paper, we propose a multi-generator extension to the adversarial training framework, in which the objective of each generator is to represent a unique component of a target mixture distribution. In the training phase, the generators cooperate to represent, as a mixture, the target distribution while maintaining distinct manifolds. As opposed to traditional generative models, inference from a particular generator after training resembles selective sampling from a unique component in the target distribution. We demonstrate the feasibility of the proposed architecture both analytically and with basic Multi-Layer Perceptron (MLP) models trained on the MNIST dataset.
  • Considering both non-line-of-sight (NLoS) and line-of-sight (LoS) transmissions, the transitional behaviors from noise-limited regime to dense interference-limited regime have been investigated for the fifth generation (5G) small cell networks (SCNs). Besides, we identify four performance regimes based on base station (BS) density, i.e., (i) the noise-limited regime, (ii) the signal-dominated regime, (iii) the interference-dominated regime, and (iv) the interference-limited regime. To characterize the performance regime, we propose a unified framework analyzing the future 5G wireless networks over generalized shadowing/fading channels, in which the user association schemes based on the strongest instantaneous received power (SIRP) and the strongest average received power (SARP) can be studied, while NLoS/LoS transmissions and multi-slop path loss model are considered. Simulation results indicate that different factors, i.e., noise, desired signal, and interference, successively and separately dominate the network performance with the increase of BS density. Hence, our results shed new light on the design and management of SCNs in urban and rural areas with different BS deployment densities.
  • Conventional deep convolutional neural networks (CNNs) apply convolution operators uniformly in space across all feature maps for hundreds of layers - this incurs a high computational cost for real time applications. For many problems such as object detection and semantic segmentation, we are able to obtain a low-cost computation mask, either from a priori problem knowledge, or from a low resolution segmentation network. We show that such computation masks can be used to reduce computation in the high resolution main network. Variants of sparse activation CNNs have previously been explored on small scale tasks, and showed no degradation in terms of object classification accuracy, but often measured gains in terms of theoretical FLOPs without realizing a practical speed-up when compared to highly optimized dense convolution implementations. In this work, we leverage the sparsity structure of computation masks and propose a novel tiling-based sparse convolution algorithm. We verified the effectiveness of our sparse CNN on LiDAR based 3D object detection, and we report significant wall-clock speed-ups compared to dense convolution, as well as improved detection accuracy.
  • Isotopic ratios in comets are critical to understanding the origin of cometary material and the physical and chemical conditions in the early solar nebula. Comet C/2015 ER61 (PANSTARRS) underwent an outburst with a total brightness increase of 2 magnitudes on the night of 2017 April 4. The sharp increase in brightness offered a rare opportunity to measure the isotopic ratios of the light elements in the coma of this comet. We obtained two high-resolution spectra of C/2015 ER61 with UVES/VLT on the nights of 2017 April 13 and 17. At the time of our observations, the comet was fading gradually following the outburst. We measured the nitrogen and carbon isotopic ratios from the CN violet (0,0) band and found that $^{12}$C/$^{13}$C=100 $\pm$ 15, $^{14}$N/$^{15}$N=130 $\pm$ 15. In addition, we determined the $^{14}$N/$^{15}$N ratio from four pairs of NH$_2$ isotopolog lines and measured $^{14}$N/$^{15}$N=140 $\pm$ 28. The measured isotopic ratios of C/2015 ER61 do not deviate significantly from those of other comets.
  • We investigate the design of the optimal routing path in a moving vehicles involved the internet of things (IoT). In our model, jammers exist that may interfere with the information exchange between wireless nodes, leading to worsened quality of service (QoS) in communications. In addition, the transmit power of each battery-equipped node is constrained to save energy. We propose a three-step optimal routing path algorithm for reliable and energy-efficient communications. Moreover, results show that with the assistance of moving vehicles, the total energy consumed can be reduced to a large extend. We also study the impact on the optimal routing path design and energy consumption which is caused by path loss, maximum transmit power constrain, QoS requirement, etc.
  • We review the evidence for buried ice in the asteroid belt; specifically the questions around the so-called Main Belt Comets (MBCs). We summarise the evidence for water throughout the Solar System, and describe the various methods for detecting it, including remote sensing from ultraviolet to radio wavelengths. We review progress in the first decade of study of MBCs, including observations, modelling of ice survival, and discussion on their origins. We then look at which methods will likely be most effective for further progress, including the key challenge of direct detection of (escaping) water in these bodies.
  • (349) Dembowska, a large, bright main-belt asteroid, has a fast rotation and oblique spin axis. It may have experienced partial melting and differentiation. We constrain Dembowska's thermophysical properties, e.g., thermal inertia, roughness fraction, geometric albedo and effective diameter within 3$\sigma$ uncertainty of $\Gamma=20^{+12}_{-7}\rm~Jm^{-2}s^{-0.5}K^{-1}$, $f_{\rm r}=0.25^{+0.60}_{-0.25}$, $p_{\rm v}=0.309^{+0.026}_{-0.038}$, and $D_{\rm eff}=155.8^{+7.5}_{-6.2}\rm~km$, by utilizing the Advanced Thermophysical Model (ATPM) to analyse four sets of thermal infrared data obtained by IRAS, AKARI, WISE and Subaru/COMICS at different epochs. In addition, by modeling the thermal lightcurve observed by WISE, we obtain the rotational phases of each dataset. These rotationally resolved data do not reveal significant variations of thermal inertia and roughness across the surface, indicating the surface of Dembowska should be covered by a dusty regolith layer with few rocks or boulders. Besides, the low thermal inertia of Dembowska show no significant difference with other asteroids larger than 100 km, indicating the dynamical lives of these large asteroids are long enough to make the surface to have sufficiently low thermal inertia. Furthermore, based on the derived surface thermophysical properties, as well as the known orbital and rotational parameters, we can simulate Dembowska's surface and subsurface temperature throughout its orbital period. The surface temperature varies from $\sim40$ K to $\sim220$ K, showing significant seasonal variation, whereas the subsurface temperature achieves equilibrium temperature about $120\sim160$ K below $30\sim50$ cm depth.
  • The state-of-the-art performance for object detection has been significantly improved over the past two years. Besides the introduction of powerful deep neural networks such as GoogleNet and VGG, novel object detection frameworks such as R-CNN and its successors, Fast R-CNN and Faster R-CNN, play an essential role in improving the state-of-the-art. Despite their effectiveness on still images, those frameworks are not specifically designed for object detection from videos. Temporal and contextual information of videos are not fully investigated and utilized. In this work, we propose a deep learning framework that incorporates temporal and contextual information from tubelets obtained in videos, which dramatically improves the baseline performance of existing still-image detection frameworks when they are applied to videos. It is called T-CNN, i.e. tubelets with convolutional neueral networks. The proposed framework won the recently introduced object-detection-from-video (VID) task with provided data in the ImageNet Large-Scale Visual Recognition Challenge 2015 (ILSVRC2015).
  • Main Belt Comets are a recently identified population of minor bodies with stable asteroid-like orbits but cometary appearances. Sublimation of water ice is the most likely mechanism for their recurrent activity (i.e. dust tails and dust comae), although there has been no direct detection of gas. These peculiar objects could hold the key to the origin of water on Earth. In this paper we present a search for the gas responsible for lifting dust from P/2012 T1 (Pan-STARRS), and review previous attempts at such measurements. To date such searches have mainly been indirect, looking for the common cometary gas CN rather than gasses related to water itself. We use the VLT and X-shooter to search for emission from OH in the UV, a direct dissociation product of water. We do not detect any emission lines, and place an upper limit on water production rate from P/2012 T1 of $8-9\times10^{25}$ molecules s$^{-1}$. This is similar to limits derived from observations using the Herschel space telescope. We conclude that the best current facilities are incapable of detecting water emission at the exceptionally low levels required to produce the observed activity in Main Belt Comets.
  • In this paper, we propose a unified framework to analyze the performance of dense small cell networks (SCNs) in terms of the coverage probability and the area spectral efficiency (ASE). In our analysis, we consider a practical path loss model that accounts for both non-line-of-sight (NLOS) and line-of-sight (LOS) transmissions. Furthermore, we adopt a generalized fading model, in which Rayleigh fading, Rician fading and Nakagami-m fading can be treated in a unified framework. The analytical results of the coverage probability and the ASE are derived, using a generalized stochastic geometry analysis. Different from existing work that does not differentiate NLOS and LOS transmissions, our results show that NLOS and LOS transmissions have a significant impact on the coverage probability and the ASE performance, particularly when the SCNs grow dense. Furthermore, our results establish for the first time that the performance of the SCNs can be divided into four regimes, according to the intensity (aka density) of BSs, where in each regime the performance is dominated by different factors.
  • In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 $km^2$ of land, 8439 $km$ of road and around 400,000 buildings. Our benchmark provides different perspectives of the world captured from airplanes, drones and cars driving around the city. Manually labeling such a large scale dataset is infeasible. Instead, we propose to utilize different sources of high-precision maps to create our ground truth. Towards this goal, we develop algorithms that allow us to align all data sources with the maps while requiring minimal human supervision. We have designed a wide variety of tasks including building height estimation (reconstruction), road centerline and curb extraction, building instance segmentation, building contour extraction (reorganization), semantic labeling and scene type classification (recognition). Our pilot study shows that most of these tasks are still difficult for modern convolutional neural networks.
  • Asteroid (130) Elektra belongs to one of the six known triple asteroids in the main belt, so its mass has been reliably determined. We aim to use all available disk-resolved images of (130) Elektra obtained by the SPHERE instrument at VLT and by the Nirc2 of the Keck telescope together with the disk-integrated photometry to determine its shape model and its size. The volume can be then used in combination with the known mass to derive the bulk density of the primary. We apply the All-Data Asteroid Modeling (ADAM) algorithm to the optical disk-integrated data, 2 disk-resolved images obtained by the SPHERE instrument and 13 disk-resolved images from the Nirc2 of the Keck telescope, and derive the shape model and size of Elektra. We present the shape model, volume-equivalent diameter (199$\pm$7 km) and bulk density (1.60$\pm$0.13 g cm$^{-3}$) of the C-type asteroid Elektra.
  • Here we explore a technique for constraining physical properties of near-Earth asteroids (NEAs) based on variability in thermal emission as a function of viewing aspect. We present case studies of the low albedo, near-Earth asteroids (285263) 1998 QE2 and (175706) 1996 FG3. The Near-Earth Asteroid Thermal Model (NEATM) is used to fit signatures of thermal emission in near-infrared (0.8 - 2.5 micron) spectral data. This analysis represents a systematic study of thermal variability in the near-IR as a function of phase angle. The observations of QE2 imply that carefully timed observations from multiple viewing geometries can be used to constrain physical properties like retrograde versus prograde pole orientation and thermal inertia. The FG3 results are more ambiguous with detected thermal variability possibly due to systematic issues with NEATM, an unexpected prograde rotation state, or a surface that is spectrally and thermally heterogenous. This study highlights the potential diagnostic importance of high phase angle thermal measurements on both sides of opposition. We find that the NEATM thermal beaming parameters derived from our near-IR data tend to be of order 10's of percent higher than parameters from ensemble analyses of longer wavelength data sets. However, a systematic comparison of NEATM applied to data in different wavelength regimes is needed to understand whether this offset is simply a reflection of small number statistics or an intrinsic limitation of NEATM when applied to near-IR data. With the small sample presented here, it remains unclear whether NEATM modeling at near-IR wavelengths can robustly determine physical properties like pole orientation and thermal inertia.
  • The visual cues from multiple support regions of different sizes and resolutions are complementary in classifying a candidate box in object detection. Effective integration of local and contextual visual cues from these regions has become a fundamental problem in object detection. In this paper, we propose a gated bi-directional CNN (GBD-Net) to pass messages among features from different support regions during both feature learning and feature extraction. Such message passing can be implemented through convolution between neighboring support regions in two directions and can be conducted in various layers. Therefore, local and contextual visual patterns can validate the existence of each other by learning their nonlinear relationships and their close interactions are modeled in a more complex way. It is also shown that message passing is not always helpful but dependent on individual samples. Gated functions are therefore needed to control message transmission, whose on-or-offs are controlled by extra visual evidence from the input sample. The effectiveness of GBD-Net is shown through experiments on three object detection datasets, ImageNet, Pascal VOC2007 and Microsoft COCO. This paper also shows the details of our approach in wining the ImageNet object detection challenge of 2016, with source code provided on \url{https://github.com/craftGBD/craftGBD}.
  • In this article, a network model incorporating both line-of-sight (LOS) and non-line-of-sight (NLOS) transmissions is proposed to investigate impacts of blockages in urban areas on heterogeneous network coverage performance. Results show that co-existence of NLOS and LOS transmissions has a significant impact on network performance. We find in urban areas, that deploying more BSs in different tiers is better than merely deploying all BSs in the same tier in terms of coverage probability.
  • Object detection is a fundamental problem in image understanding. One popular solution is the R-CNN framework and its fast versions. They decompose the object detection problem into two cascaded easier tasks: 1) generating object proposals from images, 2) classifying proposals into various object categories. Despite that we are handling with two relatively easier tasks, they are not solved perfectly and there's still room for improvement. In this paper, we push the "divide and conquer" solution even further by dividing each task into two sub-tasks. We call the proposed method "CRAFT" (Cascade Region-proposal-network And FasT-rcnn), which tackles each task with a carefully designed network cascade. We show that the cascade structure helps in both tasks: in proposal generation, it provides more compact and better localized object proposals; in object classification, it reduces false positives (mainly between ambiguous categories) by capturing both inter- and intra-category variances. CRAFT achieves consistent and considerable improvement over the state-of-the-art on object detection benchmarks like PASCAL VOC 07/12 and ILSVRC.
  • As the ESA Rosetta mission approached, orbited, and sent a lander to comet 67P/Churyumov-Gerasimenko in 2014, a large campaign of ground-based observations also followed the comet. We constrain the total activity level of the comet by photometry and spectroscopy to place Rosetta results in context and to understand the large-scale structure of the comet's coma pre-perihelion. We performed observations using a number of telescopes, but concentrate on results from the 8m VLT and Gemini South telescopes in Chile. We use R-band imaging to measure the dust coma contribution to the comet's brightness and UV-visible spectroscopy to search for gas emissions, primarily using VLT/FORS. In addition we imaged the comet in near-infrared wavelengths (JHK) in late 2014 with Gemini-S/Flamingos 2. We find that the comet was already active in early 2014 at heliocentric distances beyond 4 au. The evolution of the total activity (measured by dust) followed previous predictions. No gas emissions were detected despite sensitive searches. The comet maintains a similar level of activity from orbit to orbit, and is in that sense predictable, meaning that Rosetta results correspond to typical behaviour for this comet. The gas production (for CN at least) is highly asymmetric with respect to perihelion, as our upper limits are below the measured production rates for similar distances post-perihelion in previous orbits.
  • Using the growing volumes of vehicle trajectory data, it becomes increasingly possible to capture time-varying and uncertain travel costs in a road network, including travel time and fuel consumption. The current paradigm represents a road network as a graph, assigns weights to the graph's edges by fragmenting trajectories into small pieces that fit the underlying edges, and then applies a routing algorithm to the resulting graph. We propose a new paradigm that targets more accurate and more efficient estimation of the costs of paths by associating weights with sub-paths in the road network. The paper provides a solution to a foundational problem in this paradigm, namely that of computing the time-varying cost distribution of a path. The solution consists of several steps. We first learn a set of random variables that capture the joint distributions of sub-paths that are covered by sufficient trajectories. Then, given a departure time and a path, we select an optimal subset of learned random variables such that the random variables' corresponding paths together cover the path. This enables accurate joint distribution estimation of the path, and by transferring the joint distribution into a marginal distribution, the travel cost distribution of the path is obtained. The use of multiple learned random variables contends with data sparseness, the use of multi-dimensional histograms enables compact representation of arbitrary joint distributions that fully capture the travel cost dependencies among the edges in paths. Empirical studies with substantial trajectory data from two different cities offer insight into the design properties of the proposed solution and suggest that the solution is effective in real-world settings.