• Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire, raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two U.S. academic medical centers with 216,221 adult patients hospitalized for at least 24 hours. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting in-hospital mortality (AUROC across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed state-of-the-art traditional predictive models in all cases. We also present a case-study of a neural-network attribution system, which illustrates how clinicians can gain some transparency into the predictions. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios, complete with explanations that directly highlight evidence in the patient's chart.
  • In this paper, we introduce a novel end-to-end framework for multi-oriented scene text detection from an instance-aware semantic segmentation perspective. We present Fused Text Segmentation Networks, which combine multi-level features during feature extraction, as text instances may rely on finer feature expression compared to general objects. The framework detects and segments text instances jointly and simultaneously, leveraging the merits of both the semantic segmentation task and the region-proposal-based object detection task. Without involving any extra pipelines, our approach surpasses the current state of the art on multi-oriented scene text detection benchmarks: ICDAR2015 Incidental Scene Text and MSRA-TD500, reaching Hmean 84.1% and 82.0% respectively. Moreover, we report a baseline on Total-Text, which contains curved text, that suggests the effectiveness of the proposed approach.
  • Current end-to-end machine reading and question answering (Q\&A) models are primarily based on recurrent neural networks (RNNs) with attention. Despite their success, these models are often slow for both training and inference due to the sequential nature of RNNs. We propose a new Q\&A architecture called QANet, which does not require recurrent networks: Its encoder consists exclusively of convolution and self-attention, where convolution models local interactions and self-attention models global interactions. On the SQuAD dataset, our model is 3x to 13x faster in training and 4x to 9x faster in inference, while achieving equivalent accuracy to recurrent models. The speed-up gain allows us to train the model with much more data. We hence combine our model with data generated by backtranslation from a neural machine translation model. On the SQuAD dataset, our single model, trained with augmented data, achieves 84.6 F1 score on the test set, which is significantly better than the best published F1 score of 81.8.
  • High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e.g. those that require detecting objects from video streams in real time. The key to this problem is to trade accuracy for efficiency in an effective way, i.e. reducing the computing cost while maintaining competitive performance. To seek a good balance, previous efforts usually focus on optimizing the model architectures. This paper explores an alternative approach, that is, to reallocate the computation over a scale-time space. The basic idea is to perform expensive detection sparsely and propagate the results across both scales and time with substantially cheaper networks, by exploiting the strong correlations among them. Specifically, we present a unified framework that integrates detection, temporal propagation, and across-scale refinement on a Scale-Time Lattice. On this framework, one can explore various strategies to balance performance and cost. Taking advantage of this flexibility, we further develop an adaptive scheme in which the detector is invoked on demand, and thus obtain an improved tradeoff. On the ImageNet VID dataset, the proposed method can achieve a competitive mAP of 79.6% at 20 fps, or 79.0% at 62 fps, as a performance/speed tradeoff.
  • Here we present linearly and circularly polarized soft x-ray absorption spectroscopy (XAS) data at the Ce $M_{4,5}$ edges of the electron (Ir) and hole-doped (Re) Kondo semiconductor CeOs$_2$Al$_{10}$. Both substitutions have a strong impact on the unusually high Néel temperature, $T_N$=28.5\,K, and, in the case of Ir, also on the direction of the ordered moment. The substitution dependence of the linear dichroism is weak, thus validating the crystal-field description of CeOs$_2$Al$_{10}$ as representative for the Re and Ir substituted compounds. The impact of electron- and hole-doping on the hybridization between conduction and 4$f$ electrons is related to the amount of $f^0$ in the ground state and the reduction of x-ray magnetic circular dichroism. A relationship between $cf$-hybridization strength and enhanced $T_N$ is discussed. The direction and doping dependence of the circular dichroism is in agreement with strong Kondo screening along the crystallographic $a$ direction.
  • Modern switches have a packet processing capacity of up to multiple terabits per second, and they are also becoming more and more programmable. We seek to understand whether this programmability can translate packet processing capacity into computational power for parallel computing applications. In this paper, we first develop a simple mathematical model to understand the costs and overheads of data plane computation. Then we validate the performance benefits of offloading computation to the network. Using experiments on a real data center network, we find that offloading computation to the data plane results in up to a 20x speed-up for a simple Map-Reduce application. Motivated by this, we propose a parallel programming framework, p4mr, to help users efficiently program multiple switches. We successfully build and test a prototype of p4mr on a simulated testbed.
  • The adaptive algorithm based on multi-channel linear prediction is an effective dereverberation method that balances well between the attenuation of long-term reverberation and the quality of the dereverberated speech. However, an abrupt change of the speech source position, usually caused by a change of speaker, is an obstacle for the adaptive algorithm and makes it difficult to guarantee both fast convergence and optimal steady-state behavior. In this paper, the RLS-based adaptive multi-channel linear prediction method is investigated, and a time-varying forgetting factor based on the relative weighted change of the adaptive filter coefficients is proposed to effectively track abrupt changes of the target speaker position. The advantages of the proposed scheme are demonstrated in simulations and experiments.
  • The TRINICON ('Triple-N ICA for convolutive mixtures') framework is an effective blind signal separation (BSS) method for separating sound sources from convolutive mixtures. It makes full use of the non-whiteness, non-stationarity and non-Gaussianity properties of the source signals and can be implemented either in the time domain or in the frequency domain, avoiding the notorious internal permutation problem. It usually performs best when the sources are continuously mixed. In this paper, the offline dual-channel frequency-domain TRINICON implementation for sparsely mixed signals is investigated, and a multi-source activity detection is proposed to locate the active period of each source, based on which the filter updating strategy is regularized to improve the separation performance. The objective metric provided by the BSSEVAL toolkit is used to evaluate the performance of the proposed scheme.
  • A Membership Inference Attack (MIA) determines the presence of a record in a machine learning model's training data by querying the model. Prior work has shown that the attack is feasible when the model is overfitted to its training data or when the adversary controls the training algorithm. However, when the model is not overfitted and the adversary does not control the training algorithm, the threat is not well understood. In this paper, we report a study that discovers overfitting to be a sufficient but not a necessary condition for an MIA to succeed. More specifically, we demonstrate that even a well-generalized model contains vulnerable instances subject to a new generalized MIA (GMIA). In GMIA, we use novel techniques for selecting vulnerable instances and detecting their subtle influences ignored by overfitting metrics. Specifically, we successfully identify individual records with high precision in real-world datasets by querying black-box machine learning models. Further, we show that a vulnerable record can even be indirectly attacked by querying other related records, and that existing generalization techniques are less effective in protecting the vulnerable instances. Our findings sharpen the understanding of the fundamental cause of the problem: the unique influences the training instance may have on the model.
  • The popularity of ASR (automatic speech recognition) systems, such as Google Voice and Cortana, brings security concerns, as demonstrated by recent attacks. The impacts of such threats, however, are less clear, since the attacks are either less stealthy (producing noise-like voice commands) or require the physical presence of an attack device (using ultrasound). In this paper, we demonstrate that not only are more practical and surreptitious attacks feasible but they can even be automatically constructed. Specifically, we find that voice commands can be stealthily embedded into songs, which, when played, can effectively control the target system through ASR without being noticed. For this purpose, we developed novel techniques that address a key technical challenge: integrating the commands into a song in a way that can be effectively recognized by ASR through the air, in the presence of background noise, while not being detected by a human listener. Our research shows that this can be done automatically against real-world ASR applications. We also demonstrate that such CommanderSongs can be spread through the Internet (e.g., YouTube) and radio, potentially affecting millions of ASR users. We further present a new mitigation technique that controls this threat.
  • The measurement-device-independent quantum key distribution (MDI-QKD) protocol was proposed to remove all detector side-channel attacks, while its security relies on trusted encoding systems. Here we propose a one-sided MDI-QKD (1SMDI-QKD) protocol, which enjoys the detection-loophole-free advantage and at the same time weakens the state preparation assumption in MDI-QKD. The 1SMDI-QKD protocol can be regarded as a modified MDI-QKD in which Bob's encoding system is trusted, while Alice's is uncharacterized. For practical implementation, we also provide a scheme that uses a coherent light source with an analytical two-decoy-state estimation method. Simulation with realistic experimental parameters shows that the protocol has promising performance and can thus be applied to practical QKD applications.
  • We investigated the crystal-electric field ground state of the 4$f$ manifold in the strongly correlated topological insulator SmB$_6$ using core level non-resonant inelastic x-ray scattering (NIXS). The directional dependence of the scattering function that arises from higher multipole transitions establishes unambiguously that the $\Gamma_8$ quartet state of the Sm $f^5$ $J$=$5/2$ configuration governs the ground-state symmetry and hence the topological properties of SmB$_6$. Our findings contradict the results of density functional calculations reported so far.
  • Today's cloud networks are shared among many tenants. Bandwidth guarantees and work conservation are two key properties to ensure predictable performance for tenant applications and high network utilization for providers. Despite significant efforts, very little prior work actually achieves both properties simultaneously, even though some of it claims to. In this paper, we present QShare, an in-network solution that achieves bandwidth guarantees and work conservation simultaneously. QShare leverages weighted fair queuing on commodity switches to slice network bandwidth for tenants, and solves the challenge of queue scarcity through balanced tenant placement and dynamic tenant-queue binding. QShare is readily implementable with existing switching chips. We have implemented a QShare prototype and evaluated it via both testbed experiments and simulations. Our results show that QShare ensures bandwidth guarantees while driving network utilization to over 91% even under unpredictable traffic demands.
  • In this paper, we seek to better understand Android obfuscation and depict a holistic view of the usage of obfuscation through a large-scale investigation in the wild. In particular, we focus on four popular obfuscation approaches: identifier renaming, string encryption, Java reflection, and packing. To obtain meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We have learned several interesting facts from the results. For example, malware authors use string encryption more frequently, and more apps on third-party markets than on Google Play are packed. We are also interested in the explanation of each finding. Therefore, we carry out in-depth code analysis on a sample of Android apps. We believe our study will help developers select the most suitable obfuscation approach, and at the same time help researchers improve code analysis systems in the right direction.
  • Hypergraph states, a generalization of graph states, constitute a large class of quantum states with intriguing non-local properties and have promising applications in quantum information science and technology. In this paper, we generalize hypergraph states to qudit hypergraph states, i.e., each vertex in the generalized hypergraph (multi-hypergraph) represents a $d$-level quantum system instead of a qubit. It is shown that multi-hypergraphs and $d$-level hypergraph states have a one-to-one correspondence. We prove that if one part of a multi-hypergraph is connected with the other part, the corresponding subsystems are entangled. More generally, the structure of a multi-hypergraph reveals the entanglement property of the corresponding quantum state. Furthermore, we discuss their relationship with some well-known state classes, e.g., real equally weighted states and stabilizer states. These states' responses to the generalized $Z$ ($X$) operations and $Z$ ($X$) measurements are studied. The Bell non-locality, an important resource in fulfilling many quantum information tasks, is also investigated.
  • Conventional video segmentation methods often rely on temporal continuity to propagate masks. Such an assumption suffers from issues like drifting and an inability to handle large displacements. To overcome these issues, we formulate an effective mechanism to prevent the target from being lost via adaptive object re-identification. Specifically, our Video Object Segmentation with Re-identification (VS-ReID) model includes a mask propagation module and a ReID module. The former module produces an initial probability map by flow warping, while the latter module retrieves missing instances by adaptive matching. With these two modules iteratively applied, our VS-ReID records a global mean (Region Jaccard and Boundary F measure) of 0.699, the best performance in the 2017 DAVIS Challenge.
  • Despite the remarkable progress in recent years, detecting objects in a new context remains a challenging task. Detectors learned from a public dataset can only work with a fixed list of categories, while training from scratch usually requires a large amount of training data with detailed annotations. This work aims to explore a novel approach -- learning object detectors from documentary films in a weakly supervised manner. This is inspired by the observation that documentaries often provide dedicated exposition of certain object categories, where visual presentations are aligned with subtitles. We believe that object detectors can be learned from such a rich source of information. Towards this goal, we develop a joint probabilistic framework, where individual pieces of information, including video frames and subtitles, are brought together via both visual and linguistic links. On top of this formulation, we further derive a weakly supervised learning algorithm, where object model learning and training set mining are unified in an optimization procedure. Experimental results on a real world dataset demonstrate that this is an effective approach to learning new object detectors.
  • Quantum state transfer from flying photons to stationary matter qubits is an important element in the realization of quantum networks. Self-assembled semiconductor quantum dots provide a promising solid-state platform hosting both single photons and spins, with an inherent light-matter interface. Here, we develop a method to coherently and actively control the single-photon frequency bins in superposition using electro-optic modulators, and measure the spin-photon entanglement with a fidelity of $0.796\pm0.020$. Further, by Greenberger-Horne-Zeilinger-type state projection on the frequency, path and polarization degrees of freedom of a single photon, we demonstrate quantum state transfer from a single photon to a single electron spin confined in an InGaAs quantum dot, separated by 5 meters. The quantum state mapping from the photon's polarization to the electron's spin is demonstrated along three different axes on the Bloch sphere, with an average fidelity of $78.5\%$.
  • In the upgrade of the ATLAS experiment, the front-end electronics components are subjected to a large radiation background. Meanwhile, high-speed optical links are required for data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project were designed by CERN to support bidirectional high-speed data transmission at a 4.8 Gbps line rate, which is called the GBT link. In the ATLAS upgrade, besides the link with the on-detector electronics, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end; correspondingly, for the off-detector electronics, the GBT architecture is implemented in Field Programmable Gate Arrays (FPGAs). CERN launched the GBT-FPGA project to provide examples for different types of FPGA. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system is used to interface with the front-end electronics of several ATLAS subsystems. The GBT link is used between them to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations of the GBT-FPGA IP core are introduced to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface with different front-ends, with support for both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core is able to switch between the GBT modes without FPGA reprogramming. The system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
  • In this paper, we propose a polar coding scheme for parallel Gaussian channels. The encoder knows the sum rate of the parallel channels but does not know the rate of any individual channel. By using the nesting property of polar codes, we design a coding/decoding scheme that achieves the sum rate.
  • A logarithmic oscillator has recently been proposed to serve as a thermostat, since it has the peculiar property of infinite heat capacity according to the virial theorem. In order to examine its feasibility by numerical simulations, a modified logarithmic potential has been applied in previous studies to eliminate the singularity at the origin. The role played by this modification is elucidated in the present study. We argue that the virial theorem is practically violated for the modified log-oscillator, as illustrated by a linear dependence of kinetic temperature on energy. Furthermore, as far as a thermalized log-oscillator is concerned, the generalized equipartition theorem with respect to the position coordinate is broken if the temperature is higher than a critical temperature. Finally, we show that log-oscillators fail to serve as thermostats because they are incapable of maintaining a nonequilibrium steady state even though their energy is appropriately assigned.
  • This paper presents a Convolutional Neural Network (CNN) based page segmentation method for handwritten historical document images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as one of the predefined classes. Traditional methods in this area rely on carefully hand-crafted features or large amounts of prior knowledge. In contrast, we propose to learn features from raw image pixels using a CNN. While many researchers focus on developing deep CNN architectures to solve different problems, we train a simple CNN with only one convolution layer. We show that the simple architecture achieves competitive results against other deep architectures on different public datasets. Experiments also demonstrate the effectiveness and superiority of the proposed method compared to previous methods.
  • Singlet exciton fission (SF), the conversion of one spin-singlet exciton (S1) into two spin-triplet excitons (T1), could provide a means to overcome the Shockley-Queisser limit in photovoltaics. SF as measured by the decay of S1 has been shown to occur efficiently and independently of temperature even when the energy of S1 is as much as 200 meV less than 2T1. Here, we study films of TIPS-tetracene using transient optical spectroscopy and show that the initial rise of the triplet pair state (TT) occurs in 300 fs, matched by rapid loss of S1 stimulated emission, and that this process is mediated by the strong coupling of electronic and vibrational degrees of freedom. This is followed by a slower 10 ps morphology-dependent phase of S1 decay and TT growth. We observe the TT to be thermally dissociated on 10-100 ns timescales to form free triplets. This provides a model for "temperature independent", efficient TT formation and thermally activated TT separation.
  • Inspired by the boom of the consumer IoT market, many device manufacturers, start-up companies and technology giants have jumped into the space. Unfortunately, the exciting utility and rapid marketization of IoT come at the expense of privacy and security. Industry reports and academic work have revealed many attacks on IoT systems, resulting in privacy leakage, property loss and large-scale availability problems. To mitigate such threats, a few solutions have been proposed. However, it remains unclear what impact they can have on the IoT ecosystem. In this work, we perform a comprehensive study of reported attacks and defenses in the realm of IoT, aiming to find out what we know, where the current studies fall short and how to move forward. To this end, we first build a toolkit that searches through a massive amount of online data using semantic analysis to identify over 3000 IoT-related articles. Further, by clustering the collected data using machine learning technologies, we are able to compare academic views with the findings from industry and other sources, in an attempt to understand the gaps between them, the trend of IoT security risks and new problems that need further attention. We systematize this process by proposing a taxonomy for the IoT ecosystem and organizing IoT security into five problem areas. We use this taxonomy as a beacon to assess each IoT work across a number of properties we define. Our assessment reveals that relevant security and privacy problems are far from solved. We discuss how each proposed solution can be applied to a problem area and highlight their strengths, assumptions and constraints. We stress the need for a security framework for IoT vendors and discuss the trend of shifting security liability to external or centralized entities. We also identify open research problems and provide suggestions towards a secure IoT ecosystem.
  • Polar codes are the first class of constructive channel codes achieving the symmetric capacity of binary-input discrete memoryless channels. However, the corresponding code length is limited to powers of two. In this paper, we establish a systematic framework to design rate-compatible punctured polar (RCPP) codes with arbitrary code length. A new theoretical tool, called polar spectra, is proposed to count the number of paths on the code tree with the same number of zeros or ones respectively. Furthermore, a spectrum distance SD0 (SD1) and a joint spectrum distance (JSD) are presented as performance criteria to optimize the puncturing tables. For the capacity-zero puncturing mode (punctured bits are unknown to the decoder), we propose a quasi-uniform puncturing algorithm, analyze the number of equivalent puncturings and prove that this scheme maximizes SD1 and JSD. Similarly, for the capacity-one mode (punctured bits are known to the decoder), we devise a reversal quasi-uniform puncturing scheme and prove that it has the maximum SD0 and JSD. Both schemes have a universal puncturing table that requires no exhaustive search. These optimal RCPP codes outperform the turbo codes used in LTE wireless communication systems.
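
To make the puncturing-table idea in the last abstract concrete, here is a minimal Python sketch of one way a quasi-uniform puncturing table could be built. The abstract does not spell out the construction, so the rule used here is an assumption: mark the first N - M positions in natural order as punctured, then apply a bit-reversal permutation so that the punctured positions are spread almost evenly across the mother code of length N. The function names `bit_reverse` and `quasi_uniform_puncturing_table` are hypothetical and chosen only for illustration.

```python
from typing import List


def bit_reverse(index: int, n_bits: int) -> int:
    """Reverse the n_bits-wide binary representation of index."""
    result = 0
    for _ in range(n_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def quasi_uniform_puncturing_table(mother_length: int, code_length: int) -> List[int]:
    """Return a 0/1 table of length mother_length (1 = transmitted, 0 = punctured).

    Assumed rule (not taken from the abstract): puncture the first
    mother_length - code_length natural-order indices, then map each of
    them through a bit-reversal permutation.
    """
    if mother_length < 1 or mother_length & (mother_length - 1):
        raise ValueError("mother_length must be a power of two")
    if not 0 < code_length <= mother_length:
        raise ValueError("code_length must satisfy 0 < code_length <= mother_length")

    n_bits = mother_length.bit_length() - 1
    table = [1] * mother_length
    for i in range(mother_length - code_length):
        table[bit_reverse(i, n_bits)] = 0
    return table


if __name__ == "__main__":
    # Shorten a length-8 mother code to length 5 (three punctured positions).
    print(quasi_uniform_puncturing_table(8, 5))
```

Because the table depends only on the two lengths, it can be computed once per (N, M) pair without any search over candidate patterns, which matches the "universal puncturing table" property described in the abstract. Whether this particular rule maximizes SD1 and JSD is not something the sketch demonstrates.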