• We introduce a large-scale MAchine Reading COmprehension dataset, which we name MS MARCO. The dataset comprises 1,010,916 anonymized questions---sampled from Bing's search query logs---each with a human-generated answer, and 182,669 completely human-rewritten answers. In addition, the dataset contains 8,841,823 passages---extracted from 3,563,535 web documents retrieved by Bing---that provide the information necessary for curating the natural language answers. A question in the MS MARCO dataset may have multiple answers or no answers at all. Using this dataset, we propose three different tasks with varying levels of difficulty: (i) predict if a question is answerable given a set of context passages, and extract and synthesize the answer as a human would, (ii) generate a well-formed answer (if possible) based on the context passages that can be understood with the question and passage context, and finally (iii) rank a set of retrieved passages given a question. The size of the dataset and the fact that the questions are derived from real user search queries distinguish MS MARCO from other well-known publicly available datasets for machine reading comprehension and question-answering. We believe that the scale and the real-world nature of this dataset make it attractive for benchmarking machine reading comprehension and question-answering models.
  • We introduce a novel generative model for interpretable subgroup analysis for causal inference applications, Causal Rule Sets (CRS). A CRS model uses a small set of short rules to capture a subgroup where the average treatment effect is elevated compared to the entire population. We present a Bayesian framework for learning a causal rule set. The Bayesian framework consists of a prior that favors simpler models and a Bayesian logistic regression that characterizes the relation between outcomes, attributes and subgroup membership. We find maximum a posteriori models using discrete Monte Carlo steps in the joint solution space of rule sets and parameters. We provide theoretically grounded heuristics and bounding strategies to improve search efficiency. Experiments show that the search algorithm can efficiently recover a true underlying subgroup and that CRS shows consistently competitive performance compared to other state-of-the-art baseline methods.
  • We propose a two-stage neural model to tackle question generation from documents. First, our model estimates the probability that word sequences in a document are ones that a human would pick when selecting candidate answers by training a neural key-phrase extractor on the answers in a question-answering corpus. Predicted key phrases then act as target answers and condition a sequence-to-sequence question-generation model with a copy mechanism. Empirically, our key-phrase extraction model significantly outperforms an entity-tagging baseline and existing rule-based approaches. We further demonstrate that our question generation system formulates fluent, answerable questions from key phrases. This two-stage system could be used to augment or generate reading comprehension datasets, which may be leveraged to improve machine reading systems or in educational settings.
  • Many Natural Language Processing and Computational Linguistics applications involve the generation of new texts based on some existing texts, such as summarization, text simplification and machine translation. However, a serious problem has haunted these applications for decades: how to automatically and accurately assess the quality of these applications. In this paper, we present some preliminary results on one especially useful and challenging problem in NLP system evaluation: how to pinpoint content differences between two text passages (especially for large passages such as articles and books). Our idea is intuitive and very different from existing approaches. We treat one text passage as a small knowledge base, and ask it a large number of questions to exhaustively identify all content points in it. By comparing the correctly answered questions from two text passages, we are able to compare their content precisely. An experiment using the DUC 2007 summarization corpus shows promising results.
  • Stories are a vital form of communication in human culture; they are employed daily to persuade, to elicit sympathy, or to convey a message. Computational understanding of human narratives, especially high-level narrative structures, remains limited to date. Multiple literary theories for narrative structures exist, but operationalization of the theories has remained a challenge. We developed an annotation scheme by consolidating and extending existing narratological theories, including Labov and Waletzky's (1967) functional categorization scheme and Freytag's (1863) pyramid of dramatic tension, and present 360 annotated short stories collected from online sources. In the future, this research will support an approach that enables systems to intelligently sustain complex communications with humans.
  • Interpretable machine learning models have received increasing interest in recent years, especially in domains where humans are involved in the decision-making process. However, some loss of task performance in exchange for interpretability is often inevitable. This performance degradation puts practitioners in a dilemma of choosing between a top-performing black-box model with no explanations and an interpretable model with unsatisfying task performance. In this work, we propose a novel framework for building a Hybrid Decision Model that integrates an interpretable model with any black-box model to introduce explanations in the decision-making process while preserving or possibly improving the predictive accuracy. We propose a novel metric, explainability, to measure the percentage of data that are sent to the interpretable model for decision. We also design a principled objective function that considers predictive accuracy, model interpretability, and data explainability. Under this framework, we develop the Collaborative Black-box and RUle Set Hybrid (CoBRUSH) model, which combines logic rules and any black-box model into a joint decision model. An input instance is first sent to the rules for decision. If a rule is satisfied, a decision is generated directly. Otherwise, the black-box model is activated to decide on the instance. To train a hybrid model, we design an efficient search algorithm that exploits theoretically grounded strategies to reduce computation. Experiments show that CoBRUSH models are able to achieve the same or better accuracy than their black-box collaborator working alone while gaining explainability. They also have smaller model complexity than interpretable baselines.
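The decision flow described in this abstract (rules first, black box as fallback) and the explainability metric can be sketched in Python as below. This is only an illustration under stated assumptions: the names `hybrid_predict` and `explainability`, the representation of a rule as a (condition, label) pair, the toy rules, and the stand-in black box are all hypothetical and are not taken from the authors' CoBRUSH implementation. The training and search procedure is not shown.

```python
def hybrid_predict(x, rules, black_box):
    """Return (prediction, explained) for one instance.

    `rules` is a list of (condition, label) pairs (an assumed representation).
    The first satisfied rule decides; otherwise the black box decides.
    """
    for condition, label in rules:
        if condition(x):
            return label, True       # decided by an interpretable rule
    return black_box(x), False       # fallback: decided by the black box


def explainability(data, rules, black_box):
    """Fraction of instances whose decision comes from the rules."""
    explained = sum(hybrid_predict(x, rules, black_box)[1] for x in data)
    return explained / len(data)


# Toy, made-up rules and black box, for illustration only.
toy_rules = [
    (lambda x: x["age"] < 18, 0),
    (lambda x: x["income"] > 100, 1),
]
toy_black_box = lambda x: 1 if x["income"] > 50 else 0

toy_data = [
    {"age": 15, "income": 0},    # caught by the first rule
    {"age": 40, "income": 120},  # caught by the second rule
    {"age": 30, "income": 60},   # falls through to the black box
    {"age": 50, "income": 20},   # falls through to the black box
]

print([hybrid_predict(x, toy_rules, toy_black_box) for x in toy_data])
print(explainability(toy_data, toy_rules, toy_black_box))
```

In this sketch, explainability depends only on how many instances the rules cover, so adding rules raises it while the objective mentioned in the abstract would also weigh accuracy and rule complexity.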
  • Multi-Value Rule Sets (1710.05257)

    Oct. 15, 2017 cs.AI, cs.DS
    We present the Multi-vAlue Rule Set (MARS) model for interpretable classification with feature-efficient presentations. MARS introduces a more generalized form of association rules that allows multiple values in a condition. Rules of this form are more concise than traditional single-valued rules in capturing and describing patterns in data. MARS mitigates the problem of dealing with continuous features and high-cardinality categorical features faced by rule-based models. Our formulation also pursues a higher efficiency of feature utilization, which reduces the cognitive load of understanding the decision process. We propose an efficient inference method for learning a maximum a posteriori model, incorporating theoretically grounded bounds to iteratively reduce the search space and improve search efficiency. Experiments with synthetic and real-world data demonstrate that MARS models have significantly smaller complexity and fewer features, providing better interpretability while being competitive in predictive accuracy. We conducted a usability study with human subjects, and the results show that MARS is the easiest to use among competing rule-based models in terms of correct rate and response time. Overall, MARS introduces a new approach to rule-based models that balances accuracy and interpretability with feature-efficient representations.
  • Participatory sensing (PS) is a novel and promising sensing network paradigm for achieving flexible and scalable sensing coverage with a low deployment cost, by encouraging mobile users to participate and contribute their smartphones as sensors. In this work, we consider a general PS system model with location-dependent and time-sensitive tasks, which generalizes the existing models in the literature. We focus on task scheduling in the user-centric PS system, where each participating user makes an individual task scheduling decision (including both the task selection and the task execution order) in a distributed manner. Specifically, we formulate the interaction of users as a strategic game called the Task Scheduling Game (TSG) and perform a comprehensive game-theoretic analysis. First, we prove that the proposed TSG is a potential game, which guarantees the existence of a Nash equilibrium (NE). Then, we analyze the efficiency loss and the fairness index at the NE. Our analysis shows that the efficiency at the NE may increase or decrease with the number of users, depending on the level of competition. This implies that it is not always better to employ more users in the user-centric PS system, which is important for the system designer in determining the optimal number of users to employ in a practical system.
  • An important and difficult challenge in building computational models for narratives is the automatic evaluation of narrative quality. Quality evaluation connects narrative understanding and generation, as generation systems need to evaluate their own products. To circumvent difficulties in acquiring annotations, we employ upvotes in social media as an approximate measure of story quality. We collected 54,484 answers from a crowd-powered question-and-answer website, Quora, and then used active learning to build a classifier that labeled 28,320 answers as stories. To predict the number of upvotes without the use of social network features, we create neural networks that model textual regions and the interdependence among regions, which serve as strong benchmarks for future research. To the best of our knowledge, this is the first large-scale study of automatic evaluation of narrative quality.
  • We propose a generative machine comprehension model that learns jointly to ask and answer questions based on documents. The proposed model uses a sequence-to-sequence framework that encodes the document and generates a question (answer) given an answer (question). Significant improvement in model performance is observed empirically on the SQuAD corpus, confirming our hypothesis that the model benefits from jointly learning to perform both tasks. We believe the joint model's novelty offers a new perspective on machine comprehension beyond architectural engineering, and serves as a first step towards autonomous information seeking.
  • We propose a recurrent neural model that generates natural-language questions from documents, conditioned on answers. We show how to train the model using a combination of supervised and reinforcement learning. After teacher forcing for standard maximum likelihood training, we fine-tune the model using policy gradient techniques to maximize several rewards that measure question quality. Most notably, one of these rewards is the performance of a question-answering system. We motivate question generation as a means to improve the performance of question answering systems. Our model is trained and evaluated on the recent question-answering dataset SQuAD.
  • We present NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs. Crowdworkers supply questions and answers based on a set of over 10,000 news articles from CNN, with answers consisting of spans of text from the corresponding articles. We collect this dataset through a four-stage process designed to solicit exploratory questions that require reasoning. A thorough analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment. We measure human performance on the dataset and compare it to several strong neural models. The performance gap between humans and machines (0.198 in F1) indicates that significant progress can be made on NewsQA through future research. The dataset is freely available at https://datasets.maluuba.com/NewsQA.
  • We present a comprehensive investigation of the polarization properties of non-polar a-plane InGaN quantum dots (QDs) and their origin with statistically significant experimental data and rigorous k.p modelling. The unbiased selection and study of 180 individual QDs allow us to compute an average polarization degree of 0.90, with a standard deviation of only 0.08. When coupled with theoretical insights, we show that a-plane InGaN QDs are highly insensitive to size differences, shape anisotropies, and indium content fluctuations. Furthermore, 91% of the studied QDs exhibit a polarization axis along the crystal [1-100] axis, with the other 9% polarized orthogonal to this direction. When coupled with their ability to emit single-photons, a-plane QDs are good candidates for the generation of linearly polarized single-photons, a feature attractive for quantum cryptography protocols.
  • A crucial requirement for the realisation of efficient and scalable on-chip quantum communication is an ultrafast polarised single photon source operating beyond the Peltier cooling barrier of 200 K. While a few systems based on different materials and device structures have achieved single photon generation above this threshold, there has been no report of single quantum emitters with deterministic polarisation properties at the same high temperature conditions. Here, we report the first device to simultaneously achieve single photon emission with a g(2)(0) of only 0.21, a high polarisation degree of 0.80, a fixed polarisation axis determined by the underlying crystallography, and a GHz repetition rate with a radiative lifetime of 357 ps at 220 K. The temperature insensitivity of these properties, together with the simple planar growth method, and absence of complex device geometries, makes this system an excellent candidate for on-chip applications in integrated systems.
  • Inferring topics from the overwhelming amount of short texts has become a critical but challenging task for many content analysis tasks, such as content characterization, user interest profiling, and emerging topic detection. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. This paper studies how to incorporate external word correlation knowledge into short texts to improve the coherence of topic modeling. Based on recent results in word embeddings that learn semantic representations for words from a large corpus, we introduce a novel method, Embedding-based Topic Model (ETM), to learn latent topics from short texts. ETM not only solves the problem of very limited word co-occurrence information by aggregating short texts into long pseudo-texts, but also utilizes a Markov Random Field regularized model that gives correlated words a better chance to be put into the same topic. Experiments on real-world datasets validate the effectiveness of our model compared with state-of-the-art models.
  • We demonstrate single photon emission from self-assembled m-plane InGaN quantum dots (QDs) embedded on the side-walls of GaN nanowires. A combination of electron microscopy, cathodoluminescence, time-resolved micro-PL and photon autocorrelation experiments gives a thorough evaluation of the QDs' structural and optical properties. The QD exhibits anti-bunched emission up to 100 K, with a measured autocorrelation function of g^2(0) = 0.28 (0.03) at 5 K. Studies on a statistically significant number of QDs show that these m-plane QDs exhibit very fast radiative lifetimes (260 +/- 55 ps), suggesting smaller internal fields than any of the previously reported c-plane and a-plane QDs. Moreover, the observed single photons are almost completely linearly polarized, aligned perpendicular to the crystallographic c-axis, with a degree of linear polarization of 0.84 +/- 0.12. Such InGaN QDs incorporated in a nanowire system meet many of the requirements for implementation into quantum information systems and could potentially open the door to wholly new device concepts.
  • Text simplification (TS) aims to reduce the lexical and structural complexity of a text while still retaining its semantic meaning. Current automatic TS techniques either are limited to lexical-level applications or require manually defining a large number of rules. Since deep neural networks are powerful models that have achieved excellent performance on many difficult tasks, in this paper we propose to use the Long Short-Term Memory (LSTM) Encoder-Decoder model for sentence-level TS, which makes minimal assumptions about word sequence. Preliminary experiments show that the model is able to learn operation rules such as reversing, sorting and replacing from sequence pairs, which suggests that the model may potentially discover and apply rules such as modifying sentence structure, substituting words, and removing words for TS.
  • Link prediction aims to uncover the underlying relationships behind networks, which can be utilized to predict missing edges or identify spurious edges, and it attracts much attention from various fields. The key issue in link prediction is to estimate the likelihood of a link between two nodes in a network. Most current approaches to link prediction are based on static structural analysis and ignore the temporal aspects of evolving networks. Unlike previous work, in this paper we propose a popularity-based structural perturbation method (PBSPM) that characterizes the similarity of an edge not only from the existing connections of the network, but also from the popularity of its two endpoints, since popular nodes are much more likely to form links between themselves. By taking the popularity of nodes into account, PBSPM can suppress nodes that have high importance but gradually become inactive. Therefore, the proposed method is inclined to predict potential edges between active nodes rather than edges between inactive nodes. Experimental results on four real networks show that the proposed method outperforms state-of-the-art methods in both accuracy and robustness in evolving networks.
  • Blood exhibits a heterogeneous nature of hematocrit, velocity, and effective viscosity in microcapillaries. Microvascular bifurcations have a significant influence on the distribution of the blood cells and blood flow behavior. This paper presents a simulation study of the two-dimensional motions and deformation of multiple red blood cells in microvessels with diverging and converging bifurcations. Fluid dynamics and membrane mechanics were incorporated. The effects of cell shape, hematocrit, and deformability of the cell membrane on the rheological behavior of the red blood cells and the hemodynamics have been investigated. It was shown that the blood entering the daughter branch with a higher flow rate tended to receive disproportionally more cells. The results also demonstrate that red blood cells in microvessels experienced lateral migration in the parent channel and blunted velocity profiles in both the straight section and the daughter branches, and this effect was influenced by the shape and the initial position of the cells, the hematocrit, and the membrane deformability. In addition, a cell-free region around the tip of the confluence was observed. The simulation results are qualitatively consistent with existing experimental findings. This study may provide fundamental knowledge for a better understanding of the hemodynamic behavior of micro-scale blood flow.
  • Or's of And's (OA) models are comprised of a small number of disjunctions of conjunctions, also called disjunctive normal form. An example of an OA model is as follows: If ($x_1 = $ `blue' AND $x_2=$ `middle') OR ($x_1 = $ `yellow'), then predict $Y=1$, else predict $Y=0$. Or's of And's models have the advantage of being interpretable to human experts, since they are a set of conditions that concisely capture the characteristics of a specific subset of data. We present two optimization-based machine learning frameworks for constructing OA models, Optimized OA (OOA) and its faster version, Optimized OA with Approximations (OOAx). We prove theoretical bounds on the properties of patterns in an OA model. We build OA models as a diagnostic screening tool for obstructive sleep apnea that achieve high accuracy with a substantial gain in interpretability over other methods.
  • Antiferromagnetic order at $T_{\mathrm{N}} = 23$ K has been identified in Mn(III)F(salen), salen = H$_{14}$C$_{16}$N$_2$O$_2$, an $S = 2$ linear-chain system. Using single crystals, specific heat studies performed in magnetic fields up to 9 T revealed the presence of a field-independent cusp at the same temperature where $^1$H NMR studies conducted at 42 MHz observed dramatic changes in the spin-lattice relaxation time, $T_1$, and in the linewidths. Neutron powder diffraction performed on a randomly-oriented, as-grown, deuterated (12 of 14 H replaced by D) sample of 2.2 g at 10 K and 100 K did not resolve the magnetic ordering, while low-field (less than 0.1 T) magnetic susceptibility studies of single crystals and randomly-arranged microcrystalline samples reveal subtle features associated with the transition. Taken together, these data suggest that a magnetic signature previously detected at 3.8 T for temperatures below nominally 500 mK is a spin-flop field of small net moments arising from alternating subsets of three Mn spins along the chains.
  • We present a machine learning algorithm for building classifiers that are comprised of a small number of disjunctions of conjunctions (or's of and's). An example of a classifier of this form is as follows: If X satisfies (x1 = 'blue' AND x3 = 'middle') OR (x1 = 'blue' AND x2 = '<15') OR (x1 = 'yellow'), then we predict that Y=1, ELSE predict Y=0. An attribute-value pair is called a literal and a conjunction of literals is called a pattern. Models of this form have the advantage of being interpretable to human experts, since they produce a set of conditions that concisely describe a specific class. We present two probabilistic models for forming a pattern set, one with a Beta-Binomial prior, and the other with Poisson priors. In both cases, there are prior parameters that the user can set to encourage the model to have a desired size and shape, to conform with a domain-specific definition of interpretability. We provide two scalable MAP inference approaches: a pattern level search, which involves association rule mining, and a literal level search. We show stronger priors reduce computation. We apply the Bayesian Or's of And's (BOA) model to predict user behavior with respect to in-vehicle context-aware personalized recommender systems.
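The example classifier given in this abstract can be written out as a short Python sketch to make the literal / pattern / pattern-set terminology concrete. The dictionary representation and the names `PATTERN_SET`, `satisfies`, and `predict` are assumptions made for illustration; the sketch only evaluates a fixed pattern set and does not implement the Bayesian model or the MAP search described in the abstract.

```python
# A literal is an attribute-value pair; a pattern is a conjunction of literals,
# represented here (by assumption) as a dict mapping attribute -> required value.
# The pattern set below encodes the example from the abstract:
# (x1 = 'blue' AND x3 = 'middle') OR (x1 = 'blue' AND x2 = '<15') OR (x1 = 'yellow')
PATTERN_SET = [
    {"x1": "blue", "x3": "middle"},
    {"x1": "blue", "x2": "<15"},
    {"x1": "yellow"},
]


def satisfies(x, pattern):
    """True if instance x matches every literal in the pattern (the AND)."""
    return all(x.get(attr) == value for attr, value in pattern.items())


def predict(x, pattern_set=PATTERN_SET):
    """Predict Y=1 if x satisfies at least one pattern (the OR), else Y=0."""
    return 1 if any(satisfies(x, p) for p in pattern_set) else 0


# Made-up instances; attribute values are treated as already-discretized strings.
print(predict({"x1": "yellow", "x2": ">=15", "x3": "low"}))
print(predict({"x1": "blue", "x2": "<15", "x3": "low"}))
print(predict({"x1": "blue", "x2": ">=15", "x3": "low"}))
```

Each pattern that fires is itself the explanation for a positive prediction, which is the sense in which such a model describes a class with a concise set of conditions.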
  • In this paper, we consider a multihop wireless sensor network with multiple relay nodes for each hop where the amplify-and-forward scheme is employed. We present algorithmic strategies to jointly design linear receivers and the power allocation parameters via an alternating optimization approach subject to different power constraints which include global, local and individual ones. Two design criteria are considered: the first one minimizes the mean-square error and the second one maximizes the sum-rate of the wireless sensor network. We derive constrained minimum mean-square error and constrained maximum sum-rate expressions for the linear receivers and the power allocation parameters that contain the optimal complex amplification coefficients for each relay node. An analysis of the computational complexity and the convergence of the algorithms is also presented. Computer simulations show good performance of our proposed methods in terms of bit error rate and sum-rate compared to the method with equal power allocation and an existing power allocation scheme.
  • In this paper, we show that coding can be used in storage area networks (SANs) to improve various quality of service metrics under normal SAN operating conditions, without requiring additional storage space. For our analysis, we develop a model which captures modern characteristics such as constrained I/O access bandwidth. Using this model, we consider two important cases: single-resolution (SR) and multi-resolution (MR) systems. For SR systems, we use blocking probability as the quality of service metric and propose the network coded storage (NCS) scheme as a way to reduce blocking probability. The NCS scheme codes across file chunks in time, exploiting file striping and file duplication. Under our assumptions, we illustrate cases where SR NCS provides an order of magnitude savings in blocking probability. For MR systems, we introduce saturation probability as a quality of service metric to manage multiple user types, and we propose the uncoded resolution-aware storage (URS) and coded resolution-aware storage (CRS) schemes as ways to reduce saturation probability. In MR URS, we align our MR layout strategy with traffic requirements. In MR CRS, we code videos across MR layers. Under our assumptions, we illustrate that URS can in some cases provide an order of magnitude gain in saturation probability over classic non-resolution-aware systems. Further, we illustrate that CRS provides additional saturation probability savings over URS.
  • Ehrenfeucht-Fraisse games are very useful in studying separation and equivalence results in logic. The standard finite Ehrenfeucht-Fraisse game characterizes equivalence in first order logic. The standard Ehrenfeucht-Fraisse game in infinitary logic characterizes equivalence in $L_{\infty\omega}$. The logic $L_{\omega_1\omega}$ is the extension of first order logic with countable conjunctions and disjunctions. There was no Ehrenfeucht-Fraisse game for $L_{\omega_1\omega}$ in the literature. In this paper we develop an Ehrenfeucht-Fraisse Game for $L_{\omega_1\omega}$. This game is based on a game for propositional and first order logic introduced by Hella and Vaananen. Unlike the standard Ehrenfeucht-Fraisse games which are modeled solely after the behavior of quantifiers, this new game also takes into account the behavior of connectives in logic. We prove the adequacy theorem for this game. We also apply the new game to prove complexity results about infinite binary strings.