• Sequential recommendation is a fundamental task for network applications, and it usually suffers from the item cold start problem due to insufficient user feedback. There are currently three kinds of popular approaches, based respectively on matrix factorization (MF) for collaborative filtering, Markov chains (MC), and recurrent neural networks (RNN). Although widely used, they have some limitations. MF-based methods cannot capture users' dynamic interests. The strong Markov assumption greatly limits the performance of MC-based methods. RNN-based methods are still at an early stage of incorporating additional information. Built on these basic models, many methods that use additional information only validate the incorporation of a single modality separately. In this work, to perform sequential recommendation and deal with the item cold start problem, we propose a Multi-View Recurrent Neural Network (MV-RNN) model. Given the latent feature, MV-RNN can alleviate the item cold start problem by incorporating visual and textual information. First, at the input of MV-RNN, three different combinations of multi-view features are studied: concatenation, fusion by addition, and fusion by reconstructing the original multi-modal data. MV-RNN applies the recurrent structure to dynamically capture the user's interest. Second, we design a separate structure and a united structure on the hidden state of MV-RNN to explore a more effective way to handle multi-view features. Experiments on two real-world datasets show that MV-RNN can effectively generate the personalized ranking list, tackle the missing modalities problem and significantly alleviate the item cold start problem.
  • Owl is a new numerical library developed in the OCaml language. It focuses on providing a comprehensive set of high-level numerical functions so that developers can quickly build up data analytical applications. In this abstract, we will present Owl's design, core components, and its key functionality.
  • We present a statistical study of the planet-metallicity (P-M) correlation, comparing the 744 stars with candidate planets (SWPs) in the Kepler field that have been observed with LAMOST against a sample of distance-independent, fake "twin" stars in the Kepler field with no planet reported yet (CKSNPs). With these well-defined and carefully selected large samples, we find for the first time a turn-off P-M correlation of Delta[Fe/H]_(SWPs-SNPs), which on average increases from ~0.00±0.03 dex to 0.06±0.03 dex and to 0.12±0.03 dex for stars with Earth-, Neptune-, and Jupiter-sized planets successively, and then declines to ~-0.01±0.03 dex for more massive planets or brown dwarfs. Moreover, the percentage of systems with positive Delta[Fe/H] shows the same turn-off pattern. We also find that FG-type stars follow this general trend, but K-type stars are different. Moderate metal enhancement (~0.1-0.2 dex) is observed for K-type stars with planets of radii between 2 and 4 Earth radii as compared to CKSNPs, which indicates that much higher metallicities are required for super-Earths and Neptune-sized planets to form around K-type stars. We point out that the P-M correlation is actually metallicity-dependent, i.e., the correlation is positive at solar and super-solar metallicities, and negative at subsolar metallicities. No steady increase of Delta[Fe/H] with planet size is observed for rocky planets, excluding the pollution scenario as a major mechanism for the P-M correlation. All these clues suggest that giant planets probably form differently from rocky planets or more massive planets/brown dwarfs, that the core-accretion scenario is highly favoured, and that high metallicity is a prerequisite for massive planets to form.
  • With the fast development of effective and low-cost human skeleton capture systems, skeleton-based action recognition has attracted much attention recently. Most existing methods use Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) to extract spatio-temporal information embedded in the skeleton sequences for action recognition. However, these approaches are limited in their ability to model relations within a single skeleton, because important structural information is lost when the raw skeleton data is converted to fit the CNN or RNN input. In this paper, we propose an Attentional Recurrent Relational Network-LSTM (ARRN-LSTM) to simultaneously model spatial configurations and temporal dynamics in skeletons for action recognition. The spatial patterns embedded in a single skeleton are learned by a Recurrent Relational Network, followed by a multi-layer LSTM to extract temporal features in the skeleton sequences. To exploit the complementarity between different geometries in the skeleton for sufficient relational modeling, we design a two-stream architecture to learn the relationship among joints and explore the underlying patterns among lines simultaneously. We also introduce an adaptive attentional module for focusing on potentially discriminative parts of the skeleton for a given action. Extensive experiments are performed on several popular action recognition datasets, and the results show that the proposed approach is competitive with state-of-the-art methods.
  • Skeleton-based action recognition has made great progress recently, but many problems still remain unsolved. For example, most of the previous methods model the representations of skeleton sequences without abundant spatial structure information and detailed temporal dynamics features. In this paper, we propose a novel model with spatial reasoning and temporal stack learning (SR-TSL) for skeleton based action recognition, which consists of a spatial reasoning network (SRN) and a temporal stack learning network (TSLN). The SRN can capture the high-level spatial structural information within each frame by a residual graph neural network, while the TSLN can model the detailed temporal dynamics of skeleton sequences by a composition of multiple skip-clip LSTMs. During training, we propose a clip-based incremental loss to optimize the model. We perform extensive experiments on the SYSU 3D Human-Object Interaction dataset and NTU RGB+D dataset and verify the effectiveness of each network of our model. The comparison results illustrate that our approach achieves much better results than state-of-the-art methods.
  • We present the first results of applying Gaussian Mixture Models in the stellar kinematic space of normalized angular momentum and binding energy on NIHAO high resolution galaxies to separate the stars into multiple components. We exemplify this method using a simulated Milky Way analogue, whose stellar component hosts thin and thick discs, classical and pseudo bulges, and a stellar halo. The properties of these stellar structures are in good agreement with observational expectations in terms of sizes, shapes and rotational support. Interestingly, the two kinematic discs show surface mass density profiles more centrally concentrated than exponentials, while the bulges and the stellar halo are purely exponential. We trace back in time the Lagrangian mass of each component separately to study their formation history. Between z~3 and the end of halo virialization, z~1.3, all components lose a fraction of their angular momentum. The classical bulge loses the most (~95%) and the thin disc the least (~60%). Both bulges formed their stars in-situ at high redshift, while the thin disc formed ~98% in-situ, but with a constant SFR~1.5 M$_{\rm\odot}$ yr$^{\rm-1}$ over the last ~11 Gyr. Accreted stars (6% of total stellar mass) are mainly incorporated into the thick disc or the stellar halo, which formed ex-situ 8% and 45% of their respective masses. Our analysis pipeline is freely available at https://github.com/aobr/gsf.
  • We present the relation between stellar specific angular momentum $j_*$, stellar mass $M_*$, and bulge-to-total light ratio $\beta$ for THINGS, CALIFA and Romanowsky \& Fall datasets, exploring the existence of a fundamental plane between these parameters as first suggested by Obreschkow \& Glazebrook. Our best-fit $M_*-j_*$ relation yields a slope of $\alpha = 1.03 \pm 0.11$ with a trivariate fit including $\beta$. When ignoring the effect of $\beta$, the exponent $\alpha = 0.56 \pm 0.06$ is consistent with $\alpha = 2/3$ predicted for dark matter halos. There is a linear $\beta - j_*/M_*$ relation for $\beta \lesssim 0.4$, exhibiting a general trend of increasing $\beta$ with decreasing $j_*/M_*$. Galaxies with $\beta \gtrsim 0.4$ have higher $j_*$ than predicted by the relation. Pseudobulge galaxies have preferentially lower $\beta$ for a given $j_*/M_*$ than galaxies that contain classical bulges. Pseudobulge galaxies follow a well-defined track in $\beta - j_*/M_*$ space, consistent with Obreschkow \& Glazebrook, while galaxies with classical bulges do not. These results are consistent with the hypothesis that while growth in either bulge type is linked to a decrease in $j_*/M_*$, the mechanisms that build pseudobulges seem to be less efficient at increasing bulge mass per decrease in specific angular momentum than those that build classical bulges.
  • The plasmoid instability in evolving current sheets has been widely studied due to its effects on the disruption of current sheets, the formation of plasmoids, and the resultant fast magnetic reconnection. In this Letter, we study the role of the plasmoid instability in two-dimensional magnetohydrodynamic (MHD) turbulence by means of high-resolution direct numerical simulations. At sufficiently large magnetic Reynolds number ($R_m=10^6$), the combined effects of dynamic alignment and turbulent intermittency lead to a copious formation of plasmoids in a multitude of intense current sheets. The disruption of current sheet structures facilitates the energy cascade towards small scales, leading to the breaking and steepening of the energy spectrum. In the plasmoid-mediated regime, the energy spectrum displays a scaling that is close to the spectral index $-2.2$ proposed by recent analytic theories. We also demonstrate that dynamic alignment exists in 2D MHD turbulence and that the corresponding slope of the alignment angle is close to 0.25.
  • This paper describes our system for SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge. We use Three-way Attentive Networks (TriAN) to model interactions between the passage, question and answers. To incorporate commonsense knowledge, we augment the input with relation embeddings from the general knowledge graph ConceptNet (Speer et al., 2017). As a result, our system achieves 2nd place with 83.95% accuracy on the official test data. Code is publicly available at https://github.com/intfloat/commonsense-rc
  • Using a small complex network of coupled chaotic Hindmarsh-Rose circuits, we experimentally study the stability of network synchronization against the removal of shortcut links. It is shown that the removal of a single shortcut link may destroy the network synchronization either completely or partially. Interestingly, when the network is partially desynchronized, the oscillators can be organized into different groups, with oscillators within each group being highly synchronized while oscillators from different groups are not, showing the intriguing phenomenon of cluster synchronization. The experimental results are analyzed by the method of eigenvalue analysis, which implies that the formation of cluster synchronization is crucially dependent on the network symmetries. Our study demonstrates the observability of cluster synchronization in realistic systems, and indicates the feasibility of controlling network synchronization by adjusting network topology.
  • Many current Internet services rely on inferences from models trained on user data. Commonly, both the training and inference tasks are carried out using cloud resources fed by personal data collected at scale from users. Holding and using such large collections of personal data in the cloud creates privacy risks to the data subjects, but is currently required for users to benefit from such services. We explore how to provide for model training and inference in a system where computation is pushed to the data in preference to moving data to the cloud, obviating many current privacy risks. Specifically, we take an initial model learnt from a small set of users and retrain it locally using data from a single user. We evaluate on two tasks: one supervised learning task, using a neural network to recognise users' current activity from accelerometer traces; and one unsupervised learning task, identifying topics in a large set of documents. In both cases the accuracy is improved. We also analyse the robustness of our approach against adversarial attacks, as well as its feasibility by presenting a performance evaluation on a representative resource-constrained device (a Raspberry Pi).
  • Sponsored search is an indispensable business model and a major revenue contributor for almost all search engines. From the advertisers' side, paying for sponsored search advertisements to participate in the ranking of search results attracts more awareness and purchases, which facilitates their commercial goals. From the users' side, presenting personalized advertisements that reflect their propensities would make their online search experience more satisfactory. Sponsored search platforms rank the advertisements with a ranking function to determine the list of advertisements to show and the price charged to the advertisers. Hence, it is crucial to find a good ranking function that can simultaneously satisfy the platform, the users and the advertisers. Moreover, advertisement display positions under different queries from different users may be associated with advertisement candidates that have different bid price distributions and click probability distributions, which requires the ranking functions to be optimized adaptively to the traffic characteristics. In this work, we propose a generic framework to optimize the ranking functions with deep reinforcement learning methods. The framework is composed of two parts: an offline learning part, which initializes the ranking functions by learning from a simulated advertising environment, allowing adequate exploration of the ranking function parameter space without hurting the performance of the commercial platform; and an online learning part, which further optimizes the ranking functions by adapting to the online data distribution. Experimental results on a large-scale sponsored search platform confirm the effectiveness of the proposed method.
  • We studied the role of electron physics in 3D two-fluid 10-moment simulations of Ganymede's magnetosphere. The model captures non-ideal physics such as the Hall effect, electron inertia, and anisotropic, non-gyrotropic pressure effects. A series of analyses were carried out: 1) The resulting magnetic field topology and electron and ion convection patterns were investigated. The magnetic fields were shown to agree reasonably well with in-situ measurements by the Galileo satellite. 2) The physics of collisionless magnetic reconnection was carefully examined in terms of the current sheet formation and the decomposition of the generalized Ohm's law. The importance of pressure anisotropy and non-gyrotropy in supporting the reconnection electric field is confirmed. 3) We compared surface "brightness" morphology, represented by surface electron and ion pressure contours, with oxygen emission observed by the Hubble Space Telescope (HST). The correlation between the observed emission morphology and spatial variability in electron/ion pressure was demonstrated. Potential extension to multi-ion species in the context of Ganymede and other magnetospheric systems is also discussed.
  • Sequential recommendation is one of the fundamental tasks for Web applications. Previous methods are mostly based on Markov chains with a strong Markov assumption. Recently, recurrent neural networks (RNNs) have become increasingly popular and have demonstrated their effectiveness in many tasks. The last hidden state is usually used as the sequence's representation to make recommendations. Benefiting from the natural characteristics of RNNs, the hidden state is, to some degree, a combination of long-term dependency and short-term interest. However, the monotonic temporal dependency of RNNs impairs the user's short-term interest. Consequently, the hidden state is not sufficient to reflect the user's final interest. In this work, to deal with this problem, we propose a Hierarchical Contextual Attention-based GRU (HCA-GRU) network. The first level of HCA-GRU is conducted on the input. We construct a contextual input from several recent inputs based on the attention mechanism. This can model the complicated correlations among recent items and strengthen the hidden state. The second level is executed on the hidden state. We fuse the current hidden state and a contextual hidden state built by the attention mechanism, which leads to a more suitable representation of the user's overall interest. Experiments on two real-world datasets show that HCA-GRU can effectively generate the personalized ranking list and achieve significant improvement.
  • Most machine learning and deep neural network algorithms rely on certain iterative algorithms to optimise their utility/cost functions, e.g. Stochastic Gradient Descent (SGD). In distributed learning, the networked nodes have to work collaboratively to update the model parameters, and the way they proceed is referred to as synchronous parallel design (or barrier control). The synchronous parallel protocol is the building block of any distributed learning framework, and its design has a direct impact on the performance and scalability of the system. In this paper, we propose a new barrier control technique - Probabilistic Synchronous Parallel (PSP). Compared to the previous Bulk Synchronous Parallel (BSP), Stale Synchronous Parallel (SSP), and Asynchronous Parallel (ASP), the proposed solution effectively improves both the convergence speed and the scalability of the SGD algorithm by introducing a sampling primitive into the system. Moreover, we also show that the sampling primitive can be applied atop the existing barrier control mechanisms to derive fully distributed PSP-based synchronous parallel. We not only provide a thorough theoretical analysis of the convergence of the PSP-based SGD algorithm, but also implement a full-featured distributed learning framework called Actor and perform intensive evaluation atop it.
  • The Bitcoin peer-to-peer network has drawn significant attention from researchers, but work so far has mostly focused on publicly visible portions of the network, i.e., publicly reachable peers. This mostly ignores the hidden parts of the network: unreachable Bitcoin peers behind NATs and firewalls. In this paper, we characterize Bitcoin peers that might be behind NATs or firewalls from different perspectives. Using a special-purpose measurement tool, we conduct a large-scale measurement study of the Bitcoin network and discover several previously unreported usage patterns: a small number of peers are involved in the propagation of 89% of all Bitcoin transactions, public cloud services are being used for Bitcoin network probing and crawling, and a large number of transactions are generated from only two mobile applications. We also empirically evaluate a method that uses timing information to re-identify the peer that created a transaction against unreachable peers. We find this method very accurate for peers that use the latest version of the Bitcoin Core client.
  • The online ads trading platform plays a crucial role in connecting publishers and advertisers and generates tremendous value in facilitating the convenience of our lives. It has been evolving into an increasingly complicated structure. In this paper, we consider the problem of maximizing the revenue for the seller side by dynamically setting proper reserve prices for the auctions. Predicting the optimal reserve price for each auction in repeated auction marketplaces is a non-trivial problem. However, we were able to come up with an efficient method of improving the seller revenue by mainly focusing on adjusting the reserve price for high-value inventories. Previously, no dedicated work has been performed from this perspective. Inspired by Paul and Michael, our model first identifies the value of the inventory by predicting the top bid price bucket using a cascade of classifiers. The cascade is essential in significantly reducing the false positive rate of a single classifier. Based on the output of the first step, we build another cluster of classifiers to predict the price separations between the top two bids. We show that although high-value auctions are only a small portion of all the traffic, successfully identifying them and setting the correct reserve price results in a significant revenue lift. Moreover, our optimization is compatible with all other reserve price models in the system and does not impact their performance. In other words, when combined with other models, the enhancements to exchange revenue are aggregated. Simulations on randomly sampled Yahoo ads exchange (YAXR) data showed a stable and expected lift after applying our model.
  • Convolutional kernels are basic and vital components of deep Convolutional Neural Networks (CNN). In this paper, we equip convolutional kernels with shape attributes to generate deep Irregular Convolutional Neural Networks (ICNN). Compared to traditional CNNs, which apply regular convolutional kernels such as ${3\times3}$, our approach trains irregular kernel shapes to better fit the geometric variations of input features. In other words, shapes are learnable parameters in addition to weights. The kernel shapes and weights are learned simultaneously during end-to-end training with the standard back-propagation algorithm. Experiments on semantic segmentation are conducted to validate the effectiveness of our proposed ICNN.
  • With the rapid growth of social media, massive misinformation is also spreading widely on social media platforms such as microblogs, bringing negative effects to human life. Nowadays, automatic misinformation identification has drawn attention from academic and industrial communities. Because an event on social media usually consists of multiple microblogs, current methods are mainly based on global statistical features. However, information on social media is full of noise and outliers, whose influence should be alleviated. Moreover, most microblogs about an event contribute little to the identification of misinformation, so useful information can be easily overwhelmed by useless information. Thus, it is important to mine significant microblogs for a reliable misinformation identification method. In this paper, we propose an Attention-based approach for Identification of Misinformation (AIM). Based on the attention mechanism, AIM can select the microblogs with the largest attention values for misinformation identification. The attention mechanism in AIM contains two parts: content attention and dynamic attention. Content attention is calculated based on the textual features of each microblog. Dynamic attention is related to the time interval between the posting time of a microblog and the beginning of the event. To evaluate AIM, we conduct a series of experiments on the Weibo dataset and the Twitter dataset, and the experimental results show that the proposed AIM model outperforms the state-of-the-art methods.
  • Sketch-based image retrieval (SBIR) is challenging due to the inherent domain gap between sketch and photo. Compared with the pixel-perfect depictions in photos, sketches are highly abstract, iconic renderings of the real world. Therefore, matching sketch and photo directly using low-level visual cues is insufficient, since a common low-level subspace that traverses semantically across the two modalities is non-trivial to establish. Most existing SBIR studies do not directly tackle this cross-modal problem. This naturally motivates us to explore the effectiveness in SBIR of cross-modal retrieval methods, which have been applied successfully in image-text matching. In this paper, we introduce and compare a series of state-of-the-art cross-modal subspace learning methods and benchmark them on two recently released fine-grained SBIR datasets. Through thorough examination of the experimental results, we demonstrate that subspace learning can effectively model the sketch-photo domain gap. In addition, we draw a few key insights to drive future research.
  • Information-centric networking extensively uses universal in-network caching. However, developing an efficient and fair collaborative caching algorithm for selfish caches is still an open question. In addition, the communication overhead induced by collaboration is especially poorly understood in a general network setting such as realistic ISP and Autonomous System networks. In this paper, we address these two problems by modeling the in-network caching problem as a Nash bargaining game. We show that the game is a convex optimization problem and further derive the corresponding distributed algorithm. We analytically investigate the collaboration overhead on general graph topologies, and theoretically show that collaboration has to be constrained within a small neighborhood due to its cost growing exponentially. Our proposed algorithm achieves at least 16% performance gain over its competitors on different network topologies in the evaluation, and guarantees provable convergence, Pareto efficiency and proportional fairness.
  • Stimulated Brillouin scattering (SBS) has been demonstrated in silicon waveguides in recent years. However, due to the weak interaction between photons and acoustic phonons in these waveguides, a long interaction length is typically necessary. Here, we experimentally show that forward stimulated Brillouin scattering in the short interaction length of a 20 μm radius silicon microring resonator can give 1.2 dB peak gain at only 10 mW coupled pump power. The experimental results demonstrate that optical and acoustic modes can interact efficiently over a short interaction length. The observed Brillouin gain varies with coupled pump power in good agreement with theoretical prediction. The work shows the potential of SBS in silicon for moving the demonstrated fiber SBS applications to the integrated silicon photonics platform.
  • Recently, skeleton-based action recognition has gained popularity due to cost-effective depth sensors coupled with real-time skeleton estimation algorithms. Traditional approaches based on handcrafted features are limited in their ability to represent the complexity of motion patterns. Recent methods that use Recurrent Neural Networks (RNN) to handle raw skeletons only focus on the contextual dependency in the temporal domain and neglect the spatial configurations of articulated skeletons. In this paper, we propose a novel two-stream RNN architecture to model both temporal dynamics and spatial configurations for skeleton-based action recognition. We explore two different structures for the temporal stream: stacked RNN and hierarchical RNN. The hierarchical RNN is designed according to human body kinematics. We also propose two effective methods to model the spatial structure by converting the spatial graph into a sequence of joints. To improve the generalization of our model, we further exploit 3D transformation based data augmentation techniques, including rotation and scaling transformations, to transform the 3D coordinates of skeletons during training. Experiments on 3D action recognition benchmark datasets show that our method brings a considerable improvement for a variety of actions, i.e., generic actions, interaction activities and gestures.
  • Recent research has shown a substantial active presence of bots in online social networks (OSNs). In this paper we utilise our past work on studying bots (Stweeler) to comparatively analyse the usage and impact of bots and humans on Twitter, one of the largest OSNs in the world. We collect a large-scale Twitter dataset and define various metrics based on tweet metadata. We divide and filter the dataset into four popularity groups in terms of number of followers. Using a human annotation task we assign 'bot' and 'human' ground-truth labels to the dataset, and compare the annotations against an online bot detection tool for evaluation. We then ask a series of questions to discern important behavioural characteristics of bots and humans using metrics within and among the four popularity groups. From the comparative analysis we draw important differences as well as surprising similarities between the two entities, thus paving the way for reliable classification of automated political infiltration, advertisement campaigns, and general bot detection.
  • A novel Brillouin optical time-domain analysis (BOTDA) system is proposed that uses an intensity-modulated optical orthogonal frequency division multiplexing probe signal with direct detection (IM-DD-OOFDM) and requires no frequency sweep operation. The influence of the peak-to-average power ratio (PAPR) of the OFDM probe signal on the recovery of the Brillouin gain spectrum (BGS) is analyzed theoretically and experimentally. The complex BGS is reconstructed by a channel estimation algorithm, and the Brillouin frequency shift (BFS) is located by curve fitting of the intensity spectrum. The IM-DD-OOFDM BOTDA is demonstrated experimentally with 25 m spatial resolution over 2 km of standard single-mode fiber.
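
As an illustration of the last entry above, locating the Brillouin frequency shift (BFS) by curve fitting can be sketched as follows. The abstract does not say which curve model or fitting routine the authors use, so everything here is an assumption: the gain spectrum is modelled as a Lorentzian (a common choice for Brillouin gain spectra), and the fit uses a simple linearisation (a least-squares parabola through the reciprocal of the gain) rather than a nonlinear least-squares solver. The function names `lorentzian` and `fit_bfs`, the GHz units, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def lorentzian(f, peak, f0, fwhm):
    """Lorentzian gain with peak value `peak`, centre `f0`, full width `fwhm`."""
    half = fwhm / 2.0
    return peak * half ** 2 / ((f - f0) ** 2 + half ** 2)


def _det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def _solve3(m, v):
    """Solve the 3x3 linear system m @ x = v by Cramer's rule."""
    d = _det3(m)
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]
        solution.append(_det3(mc) / d)
    return solution


def fit_bfs(freqs, gains):
    """Estimate (f0, fwhm, peak) of an assumed Lorentzian gain spectrum.

    For a Lorentzian, 1/g(f) is exactly a parabola a*x^2 + b*x + c in
    x = f - mean(f), so an ordinary least-squares parabola fit to 1/g
    gives the centre as mean(f) - b / (2a). Gains must be positive.
    """
    mean_f = sum(freqs) / len(freqs)
    xs = [f - mean_f for f in freqs]          # centre the axis for conditioning
    ys = [1.0 / g for g in gains]
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Normal equations for the coefficient vector [c, b, a].
    m = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]
    c, b, a = _solve3(m, t)
    x0 = -b / (2.0 * a)                       # vertex of the parabola
    min_val = c - b * b / (4.0 * a)           # equals 1 / peak
    return mean_f + x0, 2.0 * math.sqrt(min_val / a), 1.0 / min_val


if __name__ == "__main__":
    # Synthetic, noise-free spectrum in GHz; the values are illustrative only.
    freqs = [10.70 + 0.005 * i for i in range(41)]
    gains = [lorentzian(f, 1.0, 10.82, 0.03) for f in freqs]
    f0, fwhm, peak = fit_bfs(freqs, gains)
    print(f"estimated BFS: {f0:.4f} GHz, linewidth: {fwhm * 1e3:.1f} MHz")
```

The reciprocal fit is exact only for noise-free Lorentzian data. With measured spectra, samples far from the peak dominate 1/g and bias the estimate, so a weighted or nonlinear least-squares fit restricted to the region around the peak would likely be the more robust choice.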