• Video recommendation has become an essential way of helping people explore massive video collections and discover the videos that may interest them. Existing video recommender systems make recommendations based on user-video interactions and a single specific content feature. When that specific content feature is unavailable, the performance of the existing models deteriorates seriously. Inspired by the fact that videos contain rich content (e.g., text, audio, motion, and so on), in this paper we explore how to use this rich content to overcome the limitations caused by the unavailability of a specific feature. Specifically, we propose a novel general framework, named the collaborative embedding regression (CER) model, that incorporates an arbitrary single content feature with user-video interactions to make effective video recommendations in both in-matrix and out-of-matrix scenarios. Our extensive experiments on two real-world large-scale datasets show that CER beats the existing recommender models with any single content feature and is more time efficient. In addition, we propose a priority-based late fusion (PRI) method to gain the benefit of integrating multiple content features. The corresponding experiment shows that PRI brings real performance improvement over the baseline and outperforms the existing fusion methods.
  • In this work, we propose a novel scenario to probe the interactions between dark matter (DM) particles and electrons, via hydrogen-atmosphere pulsating white dwarfs (DAVs) in globular clusters. In this special configuration, the DM particles, which are predominantly captured by multiple scattering with the electrons in a DAV, would annihilate in pairs and provide an extra energy source to the DAV. This mechanism slows down the natural cooling evolution, which can be characterized by the period variation rates of the pulsation modes. The differences between the secular rates predicted by precise asteroseismology and the secular rates obtained from observation can reveal the DM-electron interactions. An important observable has been proposed and the corresponding estimates have been made. According to these estimates, if this scenario could be implemented in the near future, the potential sensitivity on $m_{\chi}$ (the DM particle's mass) and $\sigma_{\chi,e}$ (the elastic scattering cross section between DM and electron) could be extended to the region $5 \mathrm{GeV} \lesssim m_{\chi} \lesssim 10^{4} \mathrm{GeV}$ and $\sigma_{\chi,e} \gtrsim 10^{-40} \mathrm{cm}^{2}$. Combined with indirect DM detection results, this could give us, to some extent, a cross check on the existence of such leptophilic DM particles.
  • This research studies finite element (FE) model updating through sum of squares (SOS) optimization to minimize modal dynamic residuals. In the past few decades, many FE model updating algorithms have been studied to improve the similitude between a numerical model and the as-built structure. FE model updating usually requires solving nonconvex optimization problems, while most off-the-shelf optimization solvers can only find local optima. To improve the model updating performance, this paper proposes the SOS global optimization method for minimizing modal dynamic residuals of the generalized eigenvalue equations in structural dynamics. The proposed method is validated through both numerical simulation and experimental study of a four-story shear frame structure.
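The objective named in the abstract above, the modal dynamic residual of the generalized eigenvalue equation $K\psi = \lambda M\psi$, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the affine stiffness parameterization `k0 + sum(theta_j * k_j)`, the function name, and all argument names are assumptions, and the SOS optimization step itself is not shown.

```python
import numpy as np

def modal_dynamic_residual(theta, k0, k_list, m, eigvals, modes):
    """Sum of squared residuals ||(K(theta) - lambda_i M) psi_i||^2 over modes.

    Assumes (hypothetically) an affine stiffness model
    K(theta) = k0 + sum_j theta_j * k_list[j]; `modes` holds one mode
    shape per column, paired with the entries of `eigvals`.
    """
    k = k0 + sum(t * kj for t, kj in zip(theta, k_list))
    total = 0.0
    for lam, psi in zip(eigvals, modes.T):
        r = (k - lam * m) @ psi  # residual of the eigenvalue equation
        total += float(r @ r)
    return total
```

The residual is zero when the updated stiffness reproduces the measured eigenpairs exactly, and it is a polynomial in `theta`, which is what makes a polynomial (SOS-type) global optimization method applicable in principle.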
  • Boson sampling, thought to be intractable classically, can be solved by a quantum machine composed of merely generation, linear evolution and detection of single photons. Such an analog quantum computer for this specific problem provides a shortcut to boost the absolute computing power of quantum computers to beat classical ones. However, the capacity bound of classical computers for simulating boson sampling has not yet been identified. Here we simulate boson sampling on the Tianhe-2 supercomputer which occupied the first place in the world ranking six times from 2013 to 2016. We computed the permanent of the largest matrix using up to 312,000 CPU cores of Tianhe-2, and inferred from the current most efficient permanent-computing algorithms that an upper bound on the performance of Tianhe-2 is one 50-photon sample per ~100 min. In addition, we found a precision issue with one of two permanent-computing algorithms.
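Simulating boson sampling reduces to computing matrix permanents, as the abstract above notes. The abstract does not name the two algorithms it compares; as an illustration only, here is a minimal sketch of Ryser's formula, a well-known inclusion-exclusion method for the permanent. The function name is hypothetical, and this plain version takes on the order of $2^n n^2$ operations with no Gray-code or parallel optimizations.

```python
from itertools import combinations

def permanent_ryser(a):
    """Permanent of a square matrix (list of rows) via Ryser's formula.

    perm(A) = (-1)^n * sum over non-empty column subsets S of
              (-1)^|S| * prod_i sum_{j in S} a[i][j]
    """
    n = len(a)
    total = 0
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            prod = 1
            for row in a:
                prod *= sum(row[j] for j in cols)
            total += (-1) ** size * prod
    return (-1) ** n * total
```

For example, `permanent_ryser([[1, 2], [3, 4]])` evaluates 1*4 + 2*3. The alternating signs in the sum are one reason floating-point precision can become a concern for large matrices.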
  • We generalize the compatible tower condition given by Strichartz to the almost-Parseval-frame tower and show that non-trivial examples of almost-Parseval-frame towers exist. By doing so, we demonstrate the first singular fractal measure which has only finitely many mutually orthogonal exponentials (and hence does not admit any exponential orthonormal basis), but still admits Fourier frames.
  • Person re-identification (re-ID) requires rapid, flexible yet discriminant representations to quickly generalize to unseen observations on-the-fly and recognize the same identity across disjoint camera views. Recent effective methods are developed in a pair-wise similarity learning system to detect a fixed set of features from distinct regions, which are mapped to their vector embeddings for distance measuring. However, the most relevant and crucial parts of each image are detected independently, without reference to their dependency on one another. Also, these region-based methods rely on spatial manipulation to position the local features for comparable similarity measuring. To combat these limitations, in this paper we introduce the Deep Co-attention based Comparators (DCCs) that fuse the co-dependent representations of the paired images so as to focus on the relevant parts of both images and produce their \textit{relative representations}. Given a pair of pedestrian images to be compared, the proposed model mimics the foveation of human eyes to detect distinct regions concurrent on both images, namely co-dependent features, and alternately attends to relevant regions to fuse them into the similarity learning. Our comparator is capable of producing dynamic representations relative to a particular sample every time, and is thus well-suited to the case of re-identifying pedestrians on-the-fly. We perform extensive experiments to provide insights and demonstrate the effectiveness of the proposed DCCs in person re-ID. Moreover, our approach achieves state-of-the-art performance on three benchmark data sets: DukeMTMC-reID \cite{DukeMTMC}, CUHK03 \cite{FPNN}, and Market-1501 \cite{Market1501}.
  • This short paper describes our solution to the 2018 IEEE World Congress on Computational Intelligence One-Minute Gradual-Emotional Behavior Challenge, whose goal was to estimate continuous arousal and valence values from short videos. We designed four base regression models using visual and audio features, and then used a spectral approach to fuse them to obtain improved performance.
  • In this paper, we propose a novel deep generative approach to cross-modal retrieval that learns hash functions in the absence of paired training samples through a cycle consistency loss. Our proposed approach employs an adversarial training scheme to learn a pair of hash functions enabling translation between modalities while assuming the underlying semantic relationship. To endow the hash codes with the semantics of the input-output pair, a cycle consistency loss is further imposed on top of the adversarial training to strengthen the correlations between inputs and corresponding outputs. Our approach is generative: it learns hash functions such that the learned hash codes maximally correlate each input-output correspondence and can also regenerate the inputs so as to minimize the information loss. The learning-to-hash embedding is thus performed by jointly optimizing the parameters of the hash functions across modalities as well as the associated generative models. Extensive experiments on a variety of large-scale cross-modal data sets demonstrate that our proposed method achieves better retrieval results than state-of-the-art methods.
  • Recently, DAMPE has released its first results on the high-energy cosmic-ray electrons and positrons (CREs) from about $25$ GeV to $4.6$ TeV, which directly detect a break at $\sim 1$ TeV. This result gives us an excellent opportunity to study the source of the CRE excess. In this work, we use the data of the proton and helium fluxes (from AMS-02 and CREAM), the $\bar{\mathrm{p}}/\mathrm{p}$ ratio (from AMS-02), the positron flux (from AMS-02) and the CRE flux (from DAMPE without the peak signal point at $\sim 1.4$ TeV) to perform a simultaneous global fit, which can account precisely for the influence of the propagation model, the primary source injection of nuclei and electrons, and the secondary lepton production. As the extra source needed to interpret the excess in the lepton spectrum, we consider two separate scenarios (pulsar and dark matter annihilation via leptonic channels) to construct the bump ($\gtrsim 100$ GeV) and the break at $\sim 1$ TeV. The results show that: (i) in the pulsar scenario, the spectral index of the injection should be $\nu_{\mathrm{psr}} \sim 0.65$ and the cut-off should be $R_{c} \sim 650$ GV; (ii) in the dark matter scenario, the dark matter particle's mass is $m_{\chi} \sim 1208$ GeV and the cross section is $\langle \sigma v \rangle \sim 1.48 \times 10^{-23} \mathrm{cm}^{3} \mathrm{s}^{-1}$. Moreover, in the dark matter scenario, the $\tau \bar{\tau}$ annihilation channel is highly suppressed, and a DM model is built to satisfy the fitting results.
  • Safety analysis is a predominant activity in developing safety-critical systems. It is a highly cooperative task among multiple functional departments due to the increasingly sophisticated safety-critical systems and close-knit development processes. Communication occurs pervasively. Effective communication channels among multiple functional departments influence safety analysis quality as well as a safe product delivery. However, the use of communication channels during safety analysis is sometimes arbitrary and poses challenges. In this article, we aim to investigate the existing communication channels, the usage frequencies, their purposes and challenges during safety analysis in industry. We conducted a multiple case study by surveying 39 experts and interviewing 21 experts in safety-critical companies including software developers, quality engineers and functional safety managers. Direct observations and documentation review were also conducted. Popular communication channels during safety analysis include formal meetings, project coordination tools, documentation and telephone. Email, personal discussion, training, internal communication software and boards are also in use. Training involving safety analysis happens 1-4 times per year, while other aforementioned communication channels happen ranging from 1-4 times per day to 1-4 times per month. We summarise 28 purposes of using these aforementioned communication channels. Communication happens mostly for the purpose of clarifying safety requirements, fixing temporary problems, conflicts and obstacles and sharing safety knowledge.
  • In this paper, we present BigDL, a distributed deep learning framework for Big Data platforms and workflows. It is implemented on top of Apache Spark, and allows users to write their deep learning applications as standard Spark programs (running directly on large-scale big data clusters in a distributed fashion). It provides an expressive, "data-analytics integrated" deep learning programming model, so that users can easily build the end-to-end analytics + AI pipelines under a unified programming paradigm; by implementing an AllReduce like operation using existing primitives in Spark (e.g., shuffle, broadcast, and in-memory data persistence), it also provides a highly efficient "parameter server" style architecture, so as to achieve highly scalable, data-parallel distributed training. Since its initial open source release, BigDL users have built many analytics and deep learning applications (e.g., object detection, sequence-to-sequence generation, visual similarity, neural recommendations, fraud detection, etc.) on Spark.
  • The intra-cluster light (ICL) in observations is usually identified through the surface brightness limit method. In this paper, for the first time we produce mock images of galaxy groups and clusters using a cosmological hydrodynamical simulation, to investigate the ICL fraction and its dependence on observational parameters, e.g., the surface brightness limit (SBL), the effects of cosmological redshift dimming, the point spread function and the CCD pixel size. Detailed analyses suggest that the width of the point spread function has a significant effect on the measured ICL fraction, while the relatively small pixel size shows almost no influence. It is found that the measured ICL fraction depends strongly on the SBL. At a fixed SBL and redshift, the measured ICL fraction decreases with increasing halo mass, while with a much fainter SBL, it does not depend on halo mass at low redshifts. In our work, the measured ICL fraction shows a clear dependence on the cosmological redshift dimming effect. It is found that there is more mass than light locked in the ICL component, suggesting that the use of a constant mass-to-light ratio at high surface brightness levels will lead to an underestimate of the ICL mass. Furthermore, it is found that the radial profile of the ICL shows a characteristic radius which is almost independent of halo mass. The current measurement of ICL from observations has a large dispersion due to different methods, and we emphasize the importance of using the same definition when observational results are compared with theoretical predictions.
  • Context: Agile development is in widespread use, even in safety-critical domains. Motivation: However, there is a lack of an appropriate safety analysis and verification method in agile development. Objective: In this paper, we investigate the use of Behavior Driven Development (BDD) instead of standard User Acceptance Testing (UAT) for safety verification with System-Theoretic Process Analysis (STPA) for safety analysis in agile development. Method: We evaluate the effect of this combination in a controlled experiment with 44 students in terms of productivity, test thoroughness, fault detection effectiveness and communication effectiveness. Results: The results show that BDD is more effective for safety verification regarding the impact on communication effectiveness than standard UAT, whereas productivity, test thoroughness and fault detection effectiveness show no statistically significant difference in our controlled experiment. Conclusion: The combination of BDD and STPA seems promising with an enhancement on communication, but its impact needs more research.
  • Agile techniques have recently received attention in developing safety-critical systems. However, a lack of empirical knowledge about performing safety assurance techniques in practice, especially about integrating safety analysis into agile development processes, prevents further steps. In this article, we aim at investigating the feasibility and the effects of our S-Scrum development process, and at stepwise improving it to propose an Optimized S-Scrum development process for safety-critical systems in a real environment. We conducted an exploratory case study in a one-year student project "Smart Home" at the University of Stuttgart, Germany. We participated in the project and collected quantitative and qualitative data from questionnaires, interviews, participant observation, physical artifacts, and documentation review. Furthermore, we evaluated the Optimized S-Scrum in industry by conducting interviews. The first-stage results showed that integrating STPA (System-Theoretic Process Analysis) can ensure safety during each sprint and enhance the safety of delivered products, while the agility of S-Scrum is slightly worse than that of the original Scrum. Six challenges have been explored: Management changes the team's priorities during an iteration; Disturbed safety-related communication; Non-functional requirements are determined too late; Insufficient upfront planning; Insufficient well-defined completion criteria; Excessive time to perform upfront planning. We further investigated the causalities and optimizations. The second-stage results revealed that safety and agility improved after the optimizations. We have gained a positive assessment and suggestions from industry. The Optimized S-Scrum is feasible for developing safety-critical systems concerning the capability to ensure safety and the acceptable agility in a student project. Further work is still needed in industrial projects.
  • It has recently been shown that a convolutional neural network can learn optical flow estimation with unsupervised learning. However, the performance of the unsupervised methods still lags well behind that of their supervised counterparts. Occlusion and large motion are some of the major factors that limit current unsupervised learning of optical flow. In this work we introduce a new method which models occlusion explicitly, together with a new warping method that facilitates the learning of large motion. Our method shows promising results on the Flying Chairs, MPI-Sintel and KITTI benchmark datasets. In particular, on the KITTI dataset, where abundant unlabeled samples exist, our unsupervised method outperforms its counterpart trained with supervised learning.
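The surface brightness limit method mentioned in the abstract above can be sketched as follows: each pixel's flux is converted to a surface brightness in mag/arcsec^2, and pixels fainter than the SBL are counted as ICL. This is a minimal sketch under stated assumptions, not the paper's pipeline: the function and argument names are hypothetical, fluxes are assumed non-negative, and effects such as the point spread function and redshift dimming are ignored.

```python
import numpy as np

def icl_fraction(flux, pixel_scale_arcsec, zero_point, sbl):
    """Fraction of total light in pixels fainter than a surface brightness limit.

    flux: 2-D array of non-negative pixel fluxes (assumed units matching
    `zero_point`); sbl: surface brightness limit in mag/arcsec^2.
    """
    flux = np.asarray(flux, dtype=float)
    pixel_area = pixel_scale_arcsec ** 2  # arcsec^2 per pixel
    with np.errstate(divide="ignore"):    # zero flux maps to mu = +inf
        mu = zero_point - 2.5 * np.log10(flux / pixel_area)
    icl_mask = mu > sbl                   # larger magnitude means fainter
    return flux[icl_mask].sum() / flux.sum()
```

Because the mask depends directly on `sbl`, the measured fraction shifts with the chosen limit, which is the dependence the abstract reports.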
  • We report ultrafast time-resolved optical reflectivity investigation of the dynamic densities and relaxations of pseudogap (PG) and superconducting (SC) quasiparticles (QPs) in the underdoped $\rm{Bi_2Sr_2CaCu_2O_{8+\delta}}$ ($T_c$ = 82 K). We find evidence of two distinct PG components in the positive reflectivity changes in the PG state, characterized by relaxation timescales of $\tau_{fast}$ $\approx$ 0.2 ps and $\tau_{slow}$ $\approx$ 2 ps with abrupt changes in both amplitudes $A_{fast}$ and $A_{slow}$ at $T^*$. The former presents no obvious change at $T_c$ and coexists with the SC QP. The latter's amplitude starts decreasing at the SC phase fluctuation $T_p$ and vanishes at $T_c$ followed by a negative amplitude signifying the emergence of the SC QP, therefore suggesting a competition with superconductivity.
  • A bipartite bilinear program (BBP) is a quadratically constrained quadratic optimization problem where the variables can be partitioned into two sets such that fixing the variables in any one of the sets results in a linear program. We propose a new second order cone representable (SOCP) relaxation for BBP, which we show is stronger than the standard SDP relaxation intersected with the boolean quadratic polytope. We then propose a new branching rule inspired by the construction of the SOCP relaxation. We describe a new application of BBP called the finite element model updating problem, which is a fundamental problem in structural engineering. Our computational experiments on this problem class show that the new branching rule together with a polyhedral outer approximation of the SOCP relaxation outperforms a state-of-the-art commercial global solver in obtaining dual bounds.
  • Learning to estimate 3D geometry in a single image by watching unlabeled videos via a deep convolutional network is attracting significant attention. In this paper, we introduce a "3D as-smooth-as-possible (3D-ASAP)" prior inside the pipeline, which enables joint estimation of edges and 3D scene, yielding results with significant improvement in accuracy for fine detailed structures. Specifically, we define the 3D-ASAP prior by requiring that any two points recovered in 3D from an image should lie on an existing planar surface if no other cues are provided. We design an unsupervised framework that Learns Edges and Geometry (depth, normal) all at Once (LEGO). The predicted edges are embedded into depth and surface normal smoothness terms, where pixels without edges in-between are constrained to satisfy the prior. In our framework, the predicted depths, normals and edges are forced to be consistent at all times. We conduct experiments on KITTI to evaluate our estimated geometry and on CityScapes to perform edge evaluation. We show that in all of the tasks, i.e., depth, normal and edge, our algorithm vastly outperforms other state-of-the-art (SOTA) algorithms, demonstrating the benefits of our approach.
  • Low-Rank Representation (LRR) is arguably one of the most powerful paradigms for multi-view spectral clustering, which elegantly encodes the multi-view local graph/manifold structures into an intrinsic low-rank self-expressive data similarity embedded in high-dimensional space, to yield a better graph partition than its single-view counterparts. In this paper we revisit it from a fundamentally different perspective by discovering that LRR is essentially a latent clustered orthogonal projection based representation winged with an optimized local graph structure for spectral clustering; each column of the representation is fundamentally a cluster basis orthogonal to the others to indicate its members, which intuitively projects the view-specific feature representation to be the one spanned by all orthogonal bases to characterize the cluster structures. Upon this finding, we propose our technique with the following contributions: (1) We decompose LRR into a latent clustered orthogonal representation via low-rank matrix factorization, to encode more flexible cluster structures than LRR over primal data objects; (2) We convert the problem of LRR into that of simultaneously learning the orthogonal clustered representation and the optimized local graph structure for each view; (3) The learned orthogonal clustered representations and local graph structures enjoy the same magnitude across views, so that the ideal multi-view consensus can be readily achieved. The experiments over multi-view datasets validate its superiority.
  • Soft-cast, a cross-layer design for wireless video transmission, is proposed to solve the drawbacks of digital video transmission: threshold transmission framework achieving the same effect. Specifically, in the encoder, we carry out power allocation on the transformed coefficients and encode the coefficients based on the new formulation of power distortion. In the decoder, the process of the LLSE estimator is also improved. Accompanied by the inverse nonlinear transform, the DCT coefficients can be recovered from the scaling factors, LLSE estimator coefficients and metadata. Experimental results show that our proposed framework outperforms Soft-cast by 1.08 dB in PSNR, and the MSSIM gain reaches 2.35%, when transmitting under the same bandwidth and total power.
  • We report on atomic-scale visualization of the structure of infinite-layer cuprate SrCuO2 thin films grown on Nb-doped SrTiO3 substrates by molecular beam epitaxy. In-situ scanning tunneling microscopy study reveals stoichiometric copper oxide (CuO2) plane with a 2 x 2 surface reconstruction, prompted by preferential clustering of four adjacent CuO2 plaquettes. By imaging the subsurface Sr atoms, intra-unit-cell rotational symmetry breaking is observed, which, together with the adjacent CuO2 clustering, can be well accounted for by a periodic up-down buckling of oxygen ions on the CuO2 plane. Further post-annealing leads to an incommensurate stripe structure of the surface layer. Our findings provide important structural information for deeply understanding the electronic structure of superconducting CuO2 plane as well as high temperature superconductivity in cuprates.
  • Portfolio selection is the central task of asset management, but it turns out to be very challenging. Methods based on pattern matching, particularly the CORN-K algorithm, have achieved promising performance on several stock markets. A key shortcoming of the existing pattern matching methods, however, is that risk is largely ignored when optimizing portfolios, which may lead to unreliable profits, particularly in volatile markets. We present a risk-aversion CORN-K algorithm, RACORN-K, that penalizes risk when searching for optimal portfolios. Experiments on four datasets (DJIA, MSCI, SP500(N), HSI) demonstrate that the new algorithm can deliver notable and reliable improvements in terms of return, Sharpe ratio and maximum drawdown, especially on volatile markets.
  • Models based on deep convolutional neural networks (CNN) have significantly improved the performance of semantic segmentation. However, learning these models requires a large number of training images with pixel-level labels, which are very costly and time-consuming to collect. In this paper, we propose a method for learning CNN-based semantic segmentation models from images with several types of annotations that are available for various computer vision tasks, including image-level labels for classification, box-level labels for object detection and pixel-level labels for semantic segmentation. The proposed method is flexible and can be used together with any existing CNN-based semantic segmentation network. Experimental evaluation on the challenging PASCAL VOC 2012 and SIFT-flow benchmarks demonstrates that the proposed method can effectively make use of diverse training data to improve the performance of the learned models.
  • Modeling statistical regularities is the problem of representing the pixel distributions in natural images, and is usually applied to solve ill-posed image processing problems. In this paper, we present an extremely efficient CNN architecture for modeling statistical regularities. Our method is based on the observation that, by randomly sampling the pixels in natural images, we can obtain a set of pixel ensembles in which the pixel values are independent and identically distributed. This leads to the idea of using a 1*1 (point-wise) convolution kernel instead of a k*k convolution kernel to learn the feature representation efficiently. Accordingly, we design a novel architecture with fully point-wise convolutions to greatly reduce the model complexity while maintaining the representation ability. Experiments on three applications (color constancy, image dehazing and underwater image enhancement) demonstrate the superior performance of our proposed network over the existing architectures, i.e., it uses 1/10 to 1/100 of the network parameters and computational cost of state-of-the-art networks while achieving comparable accuracy. Codes and models will be made publicly available.
  • The intrinsic alignment of galaxies is an important systematic effect in weak-lensing surveys, which can affect the derived cosmological parameters. One direct way to distinguish different alignment models and quantify their effects on the measurement is to produce mock weak-lensing surveys. In this work, we use a full-sky ray-tracing technique to produce mock images of galaxies from the ELUCID $N$-body simulation run with the WMAP9 cosmology. In our model we assume that the shape of a central elliptical galaxy follows that of the dark matter halo, and that a spiral galaxy follows the halo spin. Using the mock galaxy images, which combine the galaxy intrinsic shape and the gravitational shear, we compare the predicted tomographic shear correlations to the results of KiDS and DLS. It is found that our predictions lie between the KiDS and DLS results. We rule out a model in which the satellite galaxies are radially aligned with the central galaxy, since otherwise the shear correlations on small scales are too high. Most importantly, we find that although the intrinsic alignment of spiral galaxies is very weak, they induce a positive correlation between the gravitational shear signal and the intrinsic galaxy orientation (GI). This is because spiral galaxies are tangentially aligned with the nearby large-scale overdensity, contrary to the radial alignment of elliptical galaxies. Our results explain the origin of the positive GI term detected in weak-lensing surveys. We conclude that in future analyses, the GI model must include the dependence on galaxy type in more detail. The full-sky mock data introduced in this work can be made available to interested readers.
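Two of the evaluation metrics named in the abstract above have standard definitions that can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function names are hypothetical, the Sharpe ratio is computed per period without annualization, and the default zero risk-free rate is an assumption.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample std."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return excess.mean() / excess.std(ddof=1)

def max_drawdown(returns):
    """Largest relative drop of cumulative wealth from its running peak."""
    growth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    wealth = np.concatenate(([1.0], growth))  # start from unit wealth
    peak = np.maximum.accumulate(wealth)
    return ((peak - wealth) / peak).max()
```

A higher Sharpe ratio and a lower maximum drawdown both indicate a more favorable risk profile, which is why they complement raw return when comparing strategies on volatile markets.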
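The point-wise convolution idea in the abstract above can be illustrated with a short sketch: a 1*1 convolution applies the same linear map to each pixel's channel vector independently, so it is unaffected by how pixels are ordered and needs far fewer weights than a k*k kernel. The function names and shapes below are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def pointwise_conv(image, weight, bias):
    """1*1 convolution: the same linear map applied at every pixel.

    image: (H, W, C_in), weight: (C_in, C_out), bias: (C_out,).
    """
    return image @ weight + bias

def conv_params(k, c_in, c_out):
    """Number of weights and biases in a k*k convolution layer."""
    return k * k * c_in * c_out + c_out
```

For example, with 3 input and 16 output channels, `conv_params(1, 3, 16)` gives 64 parameters against 448 for `conv_params(3, 3, 16)`. Because each output pixel depends only on the input pixel at the same location, shuffling the pixels before the layer gives the same result as shuffling them after it.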