• A robot that can carry out a natural-language instruction has been a dream since before the Jetsons cartoon series imagined a life of leisure mediated by a fleet of attentive robot helpers. It is a dream that remains stubbornly distant. However, recent advances in vision and language methods have made incredible progress in closely related areas. This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering. Both tasks can be interpreted as visually grounded sequence-to-sequence translation problems, and many of the same methods are applicable. To enable and encourage the application of vision and language methods to the problem of interpreting visually-grounded navigation instructions, we present the Matterport3D Simulator -- a large-scale reinforcement learning environment based on real imagery. Using this simulator, which can in future support a range of embodied vision and language tasks, we provide the first benchmark dataset for visually-grounded natural language navigation in real buildings -- the Room-to-Room (R2R) dataset.
  • In this paper, we exploit a memory-augmented neural network to predict accurate answers to visual questions, even when those answers occur rarely in the training set. The memory network incorporates both internal and external memory blocks and selectively pays attention to each training exemplar. We show that memory-augmented neural networks are able to maintain a relatively long-term memory of scarce training exemplars, which is important for visual question answering due to the heavy-tailed distribution of answers in a general VQA setting. Experimental results on two large-scale benchmark datasets show that the proposed algorithm performs favorably in comparison with the state of the art.
  • Here we report the observation of pressure-induced melting of antiferromagnetic (AFM) order and the emergence of a new quantum state in the honeycomb-lattice halide alpha-RuCl3, a candidate compound in proximity to the quantum spin liquid state. Our high-pressure heat capacity measurements demonstrate that the AFM order smoothly melts away at a critical pressure (Pc) of 0.7 GPa. Intriguingly, the AFM transition temperature increases with applied pressure below Pc, in stark contrast to usual phase diagrams, for example those of pressurized parent compounds of unconventional superconductors. Furthermore, in the high-pressure phase an unusually steady magnetoresistance is observed. These observations suggest that the high-pressure phase is an exotic gapped quantum state that is robust against pressure up to ~140 GPa.
  • Here we propose the existence of a non-static ground state in the Kondo insulator SmB6, with a unique accompany-type valence fluctuation of Sm ions in its bulk. Whether SmB6 is a kind of time crystal is an intriguing issue.
  • We report the observation of extraordinarily robust zero-resistance superconductivity in the pressurized (TaNb)0.67(HfZrTi)0.33 high entropy alloy, a new kind of material with a body-centered cubic crystal structure made from five randomly distributed transition metal elements. The superconducting transition temperature (TC) increases from 7.7 K at ambient pressure to 10 K at ~60 GPa, and then slowly decreases to 9 K by 190.6 GPa, a pressure that falls within the range found in the outer core of the Earth. We infer that the continuous existence of zero-resistance superconductivity from one atmosphere up to such a high pressure requires a special combination of electronic and mechanical characteristics. This high entropy alloy superconductor thus may have a bright future for applications under extreme conditions, and also poses a challenge for understanding the underlying quantum physics.
  • We report the discovery of superconductivity in pressurized CeRhGe3, until now the only remaining non-superconducting member of the isostructural family of non-centrosymmetric heavy-fermion compounds CeTX3 (T = Co, Rh, Ir and X = Si, Ge). Superconductivity appears in CeRhGe3 at a pressure of 19.6 GPa and the transition temperature Tc reaches a maximum value of 1.3 K at 21.5 GPa. This finding provides an opportunity to establish systematic correlations between superconductivity and materials properties within this family. Though ambient-pressure unit-cell volumes and critical pressures for superconductivity vary substantially across the series, all family members reach a maximum Tcmax at a common critical cell volume Vcrit, and Tcmax at Vcrit increases with increasing spin-orbit coupling strength of the d-electrons. These correlations show that substantial Kondo hybridization and spin-orbit coupling favor superconductivity in this family, the latter reflecting the role of broken centro-symmetry.
  • Visual Question Answering (VQA) has attracted a lot of attention in both the Computer Vision and Natural Language Processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions which are answerable by direct analysis of the question and image alone. The set of such questions that require no external information to answer is interesting, but very limited. It excludes questions which require common sense, or basic factual knowledge to answer, for example. Here we introduce FVQA, a VQA dataset which requires, and supports, much deeper reasoning. FVQA only contains questions which require external information to answer. We thus extend a conventional visual question answering dataset, which contains image-question-answer triplets, through additional image-question-answer-supporting fact tuples. The supporting fact is represented as a structural triplet, such as <Cat,CapableOf,ClimbingTrees>. We evaluate several baseline models on the FVQA dataset, and describe a novel model which is capable of reasoning about an image on the basis of supporting facts.
  • The classification of medical images and illustrations in the literature aims to label a medical image according to the modality by which it was produced, or to label an illustration according to its production attributes. It is an essential and challenging research topic in the area of automated literature review, retrieval and mining. The significant intra-class variation and inter-class similarity caused by the diverse imaging modalities and various illustration types make the problem very difficult. In this paper, we propose a synergic deep learning (SDL) model to address this issue. Specifically, a dual deep convolutional neural network with a synergic signal system is designed to mutually learn image representation. The synergic signal is used to verify whether the input image pair belongs to the same category and to give corrective feedback if a synergic error exists. Our SDL model can be trained 'end to end'. In the test phase, the class label of an input can be predicted by averaging the likelihood probabilities obtained by the two convolutional neural network components. Experimental results on the ImageCLEF2016 Subfigure Classification Challenge suggest that our proposed SDL model achieves state-of-the-art performance in this medical image classification problem and that its accuracy is higher than that of the first-place solution on the Challenge leader board so far.
  • Visual relationship detection aims to capture interactions between pairs of objects in images. Relationships between objects and humans represent a particularly important subset of this problem, with implications for challenges such as understanding human behaviour, and identifying affordances, amongst others. In addressing this problem we first construct a large-scale human-centric visual relationship detection dataset (HCVRD), which provides many more types of relationship annotation (nearly 10K categories) than previously released datasets. This large label space better reflects the reality of human-object interactions, but gives rise to a long-tail distribution problem, which in turn demands a zero-shot approach to labels appearing only in the test set. This is the first time this issue has been addressed. We propose a webly-supervised approach to these problems and demonstrate that the proposed model provides a strong baseline on our HCVRD dataset.
  • Recently, the BESIII Collaboration reported two new decay processes $h_c(1P)\to \gamma \eta$ and $\gamma \eta^\prime$. Inspired by this measurement, we propose to study the radiative decays of $h_c$ via intermediate charmed meson loops in an effective Lagrangian approach. Within the acceptable cutoff parameter range, the calculated branching ratios of $h_c(1P)\to \gamma \eta$ and $\gamma \eta^\prime$ are of the order of $10^{-4}\sim 10^{-3}$ and $10^{-3} \sim 10^{-2}$, respectively. The ratio $R_{h_c}= \mathcal{B}( h_c\to \gamma \eta )/\mathcal{B}( h_c\to \gamma \eta^\prime )$ can reproduce the experimental measurements within the commonly accepted $\alpha$ range. This ratio provides some information on the $\eta-\eta^\prime$ mixing, which may be helpful for testing SU(3)-flavor symmetries in QCD.
  • In layered transition metal dichalcogenides (LTMDCs) that display both charge density waves (CDWs) and superconductivity, the superconducting state generally emerges directly on suppression of the CDW state. Here, however, we report a different observation for pressurized TaTe2, a non-superconducting CDW-bearing LTMDC at ambient pressure. We find that a superconducting state does not occur in TaTe2 after the full suppression of its CDW state, which we observe at about 3 GPa, but, rather, a non-superconducting semimetal state is observed. At a higher pressure, ~21 GPa, where both the semimetal state and the corresponding positive magnetoresistance effect are destroyed, superconductivity finally emerges and remains present up to ~50 GPa, the high pressure limit of our measurements. Our pressure-temperature phase diagram for TaTe2 demonstrates that the CDW and the superconducting phases in TaTe2 do not directly transform one to the other, but rather are separated by a semimetal state. This is the first experimental case in which the CDW and superconducting states are separated by an intermediate phase in LTMDC systems.
  • Deep convolutional neural networks (CNNs) have demonstrated advanced performance on single-label image classification, and progress has also been made in applying CNN methods to multi-label image classification, which requires annotating objects, attributes, scene categories, etc. in a single shot. Recent state-of-the-art approaches to multi-label image classification exploit the label dependencies in an image at the global level, largely improving the labeling capacity. However, predicting small objects and visual concepts is still challenging due to the limited discrimination of the global visual features. In this paper, we propose a Regional Latent Semantic Dependencies model (RLSD) to address this problem. The model includes a fully convolutional localization architecture to localize the regions that may contain multiple highly-dependent labels. The localized regions are then passed to recurrent neural networks (RNNs) to characterize the latent semantic dependencies at the regional level. Experimental results on several benchmark datasets show that our proposed model achieves the best performance compared to the state-of-the-art models, especially for predicting small objects in the images. In addition, we set up an upper bound model (RLSD+ft-RPN) that uses bounding box coordinates during training; the experimental results show that our RLSD can approach the upper bound without using the bounding-box annotations, which is more realistic in the real world.
  • We report high pressure studies of the structural stability of Ru2Sn3, a new type of three dimensional topological insulator (3D-TI) with unique quasi-one dimensional Dirac electron states throughout the surface Brillouin zone of its one-atmosphere low temperature orthorhombic form. Our in-situ high-pressure synchrotron x-ray diffraction and electrical resistance measurements reveal that upon increasing pressure the tetragonal-to-orthorhombic transition shifts to higher temperature. We find that the stability of the orthorhombic phase that hosts the non-trivial topological ground state can be pushed up to room temperature by an applied pressure of ~20 GPa. This is in contrast to the commonly known 3D-TIs, whose ground state is usually destroyed under pressure. Our results indicate that pressure provides a possible pathway for realizing a room-temperature topological insulating state in Ru2Sn3.
  • One of the most intriguing features of the Visual Question Answering (VQA) challenge is the unpredictability of the questions. Extracting the information required to answer them demands a variety of image operations from detection and counting, to segmentation and reconstruction. To train a method to perform even one of these operations accurately from {image,question,answer} tuples would be challenging, but to aim to achieve them all with a limited set of such training data seems ambitious at best. We propose here instead a more general and scalable approach which exploits the fact that very good methods to achieve these operations already exist, and thus do not need to be trained. Our method thus learns how to exploit a set of external off-the-shelf algorithms to achieve its goal, an approach that has something in common with the Neural Turing Machine. The core of our proposed method is a new co-attention model. In addition, the proposed approach generates human-readable reasons for its decision, and can still be trained end-to-end without ground truth reasons being given. We demonstrate the effectiveness on two publicly available datasets, Visual Genome and VQA, and show that it produces the state-of-the-art results in both cases.
  • Much recent progress in Vision-to-Language problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. In this paper we first propose a method of incorporating high-level concepts into the successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art in both image captioning and visual question answering. We further show that the same mechanism can be used to incorporate external knowledge, which is critically important for answering high level visual questions. Specifically, we design a visual question answering model that combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions. It particularly allows questions to be asked about the contents of an image, even when the image itself does not contain a complete answer. Our final model achieves the best reported results on both image captioning and visual question answering on several benchmark datasets.
  • One of the most strikingly universal features of the high temperature superconductors is that the superconducting phase emerges in close proximity to the antiferromagnetic phase, and the interplay between these two phases poses a long-standing challenge. It is commonly believed that, as the antiferromagnetic transition temperature is continuously suppressed to zero, there appears a quantum critical point, around which antiferromagnetic fluctuations are responsible for the development of superconductivity. In contrast to this scenario, we report the discovery of a bi-critical point identified at 2.88 GPa and 26.02 K in a pressurized high-quality single crystal of Ca0.73La0.27FeAs2 by complementary in situ high pressure measurements. At the critical pressure, we find that the antiferromagnetism suddenly disappears and superconductivity simultaneously emerges at almost the same temperature, and that an external magnetic field suppresses the superconducting transition temperature but hardly affects the antiferromagnetic transition temperature.
  • The influence of carrier type on superconductivity has been an important issue for understanding both conventional and unconventional superconductors [1-7]. For elements that superconduct, it is known that hole-carriers govern the superconductivity for transition and main group metals [8-10]. The role of hole-carriers in elements that are not normally conducting but can be converted to superconductors, however, remains unclear due to the lack of experimental data. Here we report the first in-situ high pressure Hall effect measurements on single crystal black phosphorus, measured up to ~50 GPa, and find a correlation between the Hall coefficient and the superconducting transition temperature (TC). Our results reveal that hole-carriers play a vital role in developing superconductivity and enhancing TC. Importantly, we also find a Lifshitz transition in the high-pressure cubic phase at ~17.2 GPa, which uncovers the origin of a puzzling valley in the superconducting TC-pressure phase diagram. These results offer insight into the role of hole-carriers in developing superconductivity in simple semiconducting solids under pressure.
  • In recent years, the study of the Kondo insulator SmB6, a strongly correlated electron material with decades-old puzzles, has again become one of the most attractive topics because of the discovery that its unusual metallic surface state coexists with an insulating bulk. Many efforts have been made to understand the underlying physics of SmB6, but some of its puzzles, still hotly debated, have not been solved. In this article, based on the latest progress in our high pressure studies and the accumulating results reported by other groups on SmB6, we propose a notion that we call the accompany-type valence fluctuation state, which possibly coexists with the Kondo ground state of SmB6. The purpose of this article is to search for a common starting point from which most of the accumulated low-temperature phenomena observed in different experimental investigations of SmB6 could be understood in a unified way. Although this notion reflects only our personal understanding from a phenomenological point of view and may be immature, we expect that it could attract rigorous theoretical interpretations and further experimental investigations, or stimulate better thinking on the physics of SmB6.
  • Non-centrosymmetric superconductors, whose crystal structures lack inversion symmetry, have recently received special attention due to the expectation of unconventional pairings and exotic physics associated with such pairings. The newly discovered superconductors A2Cr3As3 (A=K, Rb), characterized by a quasi-one-dimensional structure with conducting CrAs chains, belong to this class. In this study, we are the first to report that the superconductivity of A2Cr3As3 (A=K, Rb) has a positive correlation with the extent of non-centrosymmetry. Our in-situ high pressure ac susceptibility and synchrotron x-ray diffraction measurements reveal that the larger As-Cr-As bond angle in the CrAs chains can be taken as a key factor controlling superconductivity, while the smaller bond angle and the distance between the CrAs chains also affect the superconductivity through their structural connections with that angle. We find that a larger difference between the larger and smaller angles, which is associated with the extent of the non-centrosymmetry of the lattice structure, favors superconductivity. These results are expected to shed new light on the underlying mechanism of the superconductivity in these Q1D superconductors and also to provide a new perspective for understanding other non-centrosymmetric superconductors.
  • Visual Question Answering (VQA) is a challenging task that has received increasing attention from both the computer vision and the natural language processing communities. Given an image and a question in natural language, it requires reasoning over visual elements of the image and general knowledge to infer the correct answer. In the first part of this survey, we examine the state of the art by comparing modern approaches to the problem. We classify methods by their mechanism to connect the visual and textual modalities. In particular, we examine the common approach of combining convolutional and recurrent neural networks to map images and questions to a common feature space. We also discuss memory-augmented and modular architectures that interface with structured knowledge bases. In the second part of this survey, we review the datasets available for training and evaluating VQA systems. The various datasets contain questions at different levels of complexity, which require different capabilities and types of reasoning. We examine in depth the question/answer pairs from the Visual Genome project, and evaluate the relevance of the structured annotations of images with scene graphs for VQA. Finally, we discuss promising future directions for the field, in particular the connection to structured knowledge bases and the use of natural language processing models.
  • In this work, we investigate the production of $X_b$ in the process $\Upsilon(5S,6S)\to \gamma X_b$, where $X_b$ is assumed to be the counterpart of $X(3872)$ in the bottomonium sector as a $B {\bar B}^*$ molecular state. We use the effective Lagrangian based on the heavy quark symmetry to explore the rescattering mechanism and calculate the production ratios. Our results show that the production ratios for $\Upsilon(5S,6S) \to \gamma X_b$ are of the order of $10^{-5}$ within the reasonable cutoff parameter range $\alpha \simeq 2\sim 3$. Such sizeable production ratios may be accessible at future experiments such as the forthcoming Belle II, which will provide important clues to the inner structure of the exotic state $X_b$.
  • SmB6 is a promising candidate material for elucidating the connection between strong correlations and topological electronic states, which is a major challenge in condensed matter physics. The electron correlations are responsible for the development of multiple gaps in SmB6, whose elucidation is sorely needed. Here we do so by studying the evolution of the gaps and other corresponding behaviors under pressure. Our measurements of the valence, Hall effect and electrical resistivity clearly identify the gap which is associated with the bulk Kondo hybridization and, moreover, uncover a pressure-induced quantum phase transition from the putative topological Kondo insulating state to a Fermi-liquid state at ~4 GPa. We provide evidence for the transition in the form of a jump in the inverse Hall coefficient, a diverging tendency of the electron-electron scattering coefficient and, thereby, a destruction of the Kondo entanglement in the ground state. These effects take place in a mixed-valence background. Our results raise a new prospect for studying topological electronic states in quantum critical material settings.
  • Much of the recent progress in Vision-to-Language (V2L) problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. We propose here a method of incorporating high-level concepts into the very successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art performance in both image captioning and visual question answering. We also show that the same mechanism can be used to introduce external semantic information and that doing so further improves performance. In doing so we provide an analysis of the value of high level semantic information in V2L problems.
  • We propose a method for visual question answering which combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions. This allows more complex questions to be answered using the predominant neural network-based approach than has previously been possible. It particularly allows questions to be asked about the contents of an image, even when the image itself does not contain the whole answer. The method constructs a textual representation of the semantic content of an image, and merges it with textual information sourced from a knowledge base, to develop a deeper understanding of the scene viewed. Priming a recurrent neural network with this combined information, and the submitted question, leads to a very flexible visual question answering approach. We are specifically able to answer questions posed in natural language, that refer to information not contained in the image. We demonstrate the effectiveness of our model on two publicly available datasets, Toronto COCO-QA and MS COCO-VQA and show that it produces the best reported results in both cases.
  • In-situ hydrostatic and uniaxial high pressure studies were performed on the recently discovered CrAs-based quasi-one-dimensional superconductors A2Cr3As3 (A=K and Rb). The pressure-temperature phase diagram established in this study clearly demonstrates that both hydrostatic pressure and uniaxial pressure globally suppress the superconducting transition temperature (Tc), and that the latter is more effective than the former. Interestingly, in the same hydrostatic pressure environment, the suppression rate of Tc in Rb2Cr3As3 is nearly twice that of K2Cr3As3. Significantly, the reduced Tc in these superconductors fully recovers to its ambient-pressure value after the applied pressure is entirely released. Our results suggest that the Cr-Cr bonding distance and angle in the Cr3As3 chains are the key factors in determining Tc and that the optimal lattice for superconductivity is hosted in the pristine K2Cr3As3.