
We introduce a large-scale MAchine Reading COmprehension dataset, which we
name MS MARCO. The dataset comprises 1,010,916 anonymized questions, sampled
from Bing's search query logs, each with a human-generated answer, and 182,669
completely human-rewritten generated answers. In addition, the dataset
contains 8,841,823 passages, extracted from 3,563,535 web documents retrieved
by Bing, that provide the information necessary for
curating the natural language answers. A question in the MS MARCO dataset may
have multiple answers or no answers at all. Using this dataset, we propose
three different tasks with varying levels of difficulty: (i) predict if a
question is answerable given a set of context passages, and extract and
synthesize the answer as a human would, (ii) generate a well-formed answer (if
possible) based on the context passages that can be understood with the
question and passage context, and finally (iii) rank a set of retrieved
passages given a question. The size of the dataset and the fact that the
questions are derived from real user search queries distinguish MS MARCO from
other well-known publicly available datasets for machine reading comprehension
and question answering. We believe that the scale and the real-world nature of
this dataset make it attractive for benchmarking machine reading comprehension
and question-answering models.

We introduce a novel generative model for interpretable subgroup analysis for
causal inference applications, Causal Rule Sets (CRS). A CRS model uses a small
set of short rules to capture a subgroup where the average treatment effect is
elevated compared to the entire population. We present a Bayesian framework for
learning a causal rule set. The Bayesian framework consists of a prior that
favors simpler models and a Bayesian logistic regression that characterizes the
relation between outcomes, attributes and subgroup membership. We find maximum
a posteriori models using discrete Monte Carlo steps in the joint solution
space of rule sets and parameters. We provide theoretically grounded
heuristics and bounding strategies to improve search efficiency. Experiments
show that the search algorithm can efficiently recover a true underlying
subgroup and CRS shows consistently competitive performance compared to other
state-of-the-art baseline methods.

We propose a two-stage neural model to tackle question generation from
documents. First, our model estimates the probability that word sequences in a
document are ones that a human would pick when selecting candidate answers by
training a neural key-phrase extractor on the answers in a question-answering
corpus. Predicted key phrases then act as target answers and condition a
sequence-to-sequence question-generation model with a copy mechanism.
Empirically, our key-phrase extraction model significantly outperforms an
entity-tagging baseline and existing rule-based approaches. We further
demonstrate that our question generation system formulates fluent, answerable
questions from key phrases. This two-stage system could be used to augment or
generate reading comprehension datasets, which may be leveraged to improve
machine reading systems or in educational settings.

Many Natural Language Processing and Computational Linguistics applications
involve the generation of new texts based on some existing texts, such as
summarization, text simplification and machine translation. However, a serious
problem has haunted these applications for decades: how to automatically and
accurately assess the quality of their output. In this
paper, we will present some preliminary results on one especially useful and
challenging problem in NLP system evaluation: how to pinpoint content
differences of two text passages (especially for large passages such as
articles and books). Our idea is intuitive and very different from existing
approaches. We treat one text passage as a small knowledge base, and ask it a
large number of questions to exhaustively identify all content points in it. By
comparing the correctly answered questions from two text passages, we will be
able to compare their content precisely. An experiment using the 2007 DUC
summarization corpus shows promising results.

Stories are a vital form of communication in human culture; they are employed
daily to persuade, to elicit sympathy, or to convey a message. Computational
understanding of human narratives, especially high-level narrative structures,
remains limited to date. Multiple literary theories for narrative structures
exist, but operationalization of the theories has remained a challenge. We
developed an annotation scheme by consolidating and extending existing
narratological theories, including Labov and Waletzky's (1967) functional
categorization scheme and Freytag's (1863) pyramid of dramatic tension, and
present 360 annotated short stories collected from online sources. In the
future, this research will support an approach that enables systems to
intelligently sustain complex communications with humans.

Interpretable machine learning models have received increasing interest in
recent years, especially in domains where humans are involved in the
decision-making process. However, some loss of task performance in exchange
for interpretability is often inevitable. This performance degradation puts
practitioners in a dilemma of choosing between a top-performing black-box model
with no explanations and an interpretable model with unsatisfactory task
performance.
In this work, we propose a novel framework for building a Hybrid Decision
Model that integrates an interpretable model with any black-box model to
introduce explanations in the decision-making process while preserving or
possibly improving the predictive accuracy. We propose a novel metric,
explainability, to measure the percentage of data that are sent to the
interpretable model for decision. We also design a principled objective
function that considers predictive accuracy, model interpretability, and data
explainability. Under this framework, we develop the Collaborative Black-box and
RUle Set Hybrid (CoBRUSH) model, which combines logic rules and any black-box
model into a joint decision model. An input instance is first sent to the rules
for decision. If a rule is satisfied, a decision is generated directly.
Otherwise, the black-box model is activated to decide on the instance. To train
a hybrid model, we design an efficient search algorithm that exploits
theoretically grounded strategies to reduce computation. Experiments show that
CoBRUSH models achieve the same or better accuracy than their black-box
collaborator working alone while gaining explainability. They also have lower
model complexity than interpretable baselines.
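
The decision flow described above (rules first, black box as fallback) and the
explainability metric can be illustrated with a minimal sketch. This is not
the authors' implementation: the rule representation (a dictionary of
feature/value conditions plus a label), the function names, and the toy data
are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, object]
# Hypothetical rule format: (conditions that must all hold, predicted label).
Rule = Tuple[Dict[str, object], int]


def rule_fires(conditions: Dict[str, object], x: Instance) -> bool:
    """Return True when every condition of the rule holds for instance x."""
    return all(x.get(feature) == value for feature, value in conditions.items())


def hybrid_predict(
    rules: List[Rule],
    blackbox: Callable[[Instance], int],
    x: Instance,
) -> Tuple[int, bool]:
    """Return (prediction, explained); explained is True if a rule decided."""
    for conditions, label in rules:
        if rule_fires(conditions, x):
            return label, True
    # No rule is satisfied, so the black-box model decides.
    return blackbox(x), False


def explainability(
    rules: List[Rule],
    blackbox: Callable[[Instance], int],
    data: List[Instance],
) -> float:
    """Fraction of instances decided by the rules rather than the black box."""
    if not data:
        return 0.0
    explained = sum(hybrid_predict(rules, blackbox, x)[1] for x in data)
    return explained / len(data)


# Toy usage with a stand-in black box that always predicts 0.
toy_rules: List[Rule] = [({"color": "blue"}, 1)]
toy_blackbox = lambda x: 0
toy_data: List[Instance] = [{"color": "blue"}, {"color": "red"}]
```

With the toy inputs, the first instance is decided by the rule and the second
falls through to the black box, so half of the data counts as explained.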

We present the Multi-vAlue Rule Set (MARS) model for interpretable
classification with feature-efficient representations. MARS introduces a more
generalized form of association rules that allows multiple values in a
condition. Rules of this form are more concise than traditional single-valued
rules in capturing and describing patterns in data. MARS mitigates the problem
of dealing with continuous features and high-cardinality categorical features
faced by rule-based models. Our formulation also pursues a higher efficiency of
feature utilization, which reduces the cognitive load to understand the
decision process. We propose an efficient inference method for learning a
maximum a posteriori model, incorporating theoretically grounded bounds to
iteratively reduce the search space and improve search efficiency. Experiments
with synthetic and real-world data demonstrate that MARS models have
significantly smaller complexity and fewer features, providing better
interpretability while being competitive in predictive accuracy. We conducted a
usability study with human subjects, and the results show that MARS is the
easiest to use compared with other competing rule-based models, in terms of the
correct rate and response time. Overall, MARS introduces a new approach to
rule-based models that balance accuracy and interpretability with
feature-efficient representations.
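
To make the conciseness argument concrete, the sketch below represents a
multi-value rule as a mapping from each feature to the set of values it
accepts, and counts how many single-valued conjunctions would be needed to
express the same condition. The representation, function names, and example
features are hypothetical and are not taken from the MARS paper.

```python
from typing import Dict, Set

# Hypothetical format: each feature maps to the SET of values it accepts.
MultiValueRule = Dict[str, Set[str]]


def satisfies(rule: MultiValueRule, x: Dict[str, str]) -> bool:
    """True when, for every feature in the rule, x takes an accepted value."""
    return all(x.get(feature) in allowed for feature, allowed in rule.items())


def equivalent_single_valued_rules(rule: MultiValueRule) -> int:
    """Number of single-valued conjunctions needed to cover the same cases."""
    count = 1
    for allowed in rule.values():
        count *= len(allowed)
    return count


# Illustrative rule: state in {CA, TX, NY} AND marital in {single, divorced}.
example_rule: MultiValueRule = {
    "state": {"CA", "TX", "NY"},
    "marital": {"single", "divorced"},
}
```

Here a single multi-value rule over two features stands in for 3 x 2 = 6
single-valued conjunctions, which is the kind of saving the abstract refers to
for high-cardinality categorical features.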

Participatory sensing (PS) is a novel and promising sensing network paradigm
for achieving flexible and scalable sensing coverage at a low deployment
cost, by encouraging mobile users to participate and contribute their
smartphones as sensors. In this work, we consider a general PS system model
with location-dependent and time-sensitive tasks, which generalizes the
existing models in the literature. We focus on the task scheduling in the
user-centric PS system, where each participating user makes an individual
task scheduling decision (including both the task selection and the task
execution order) distributively. Specifically, we formulate the interaction of
users as a strategic game called Task Scheduling Game (TSG) and perform a
comprehensive game-theoretic analysis. First, we prove that the proposed TSG
is a potential game, which guarantees the existence of a Nash equilibrium
(NE). Then, we analyze the efficiency loss and the fairness index at the NE.
Our analysis shows that the efficiency at the NE may increase or decrease with
the number of users, depending on the level of competition. This implies that it is
not always better to employ more users in the user-centric PS system, which is
important for the system designer to determine the optimal number of users to
be employed in a practical system.

An important and difficult challenge in building computational models for
narratives is the automatic evaluation of narrative quality. Quality evaluation
connects narrative understanding and generation as generation systems need to
evaluate their own products. To circumvent difficulties in acquiring
annotations, we employ upvotes in social media as an approximate measure for
story quality. We collected 54,484 answers from a crowd-powered
question-and-answer website, Quora, and then used active learning to build a
classifier that labeled 28,320 answers as stories. To predict the number of
upvotes without the use of social network features, we create neural networks
that model textual regions and the interdependence among regions, which serve
as strong benchmarks for future research. To the best of our knowledge, this is
the first large-scale study for automatic evaluation of narrative quality.

We propose a generative machine comprehension model that learns jointly to
ask and answer questions based on documents. The proposed model uses a
sequence-to-sequence framework that encodes the document and generates a
question (answer) given an answer (question). Significant improvement in model
performance is observed empirically on the SQuAD corpus, confirming our
hypothesis that the model benefits from jointly learning to perform both tasks.
We believe the joint model's novelty offers a new perspective on machine
comprehension beyond architectural engineering, and serves as a first step
towards autonomous information seeking.

We propose a recurrent neural model that generates natural-language questions
from documents, conditioned on answers. We show how to train the model using a
combination of supervised and reinforcement learning. After teacher forcing for
standard maximum likelihood training, we fine-tune the model using policy
gradient techniques to maximize several rewards that measure question quality.
Most notably, one of these rewards is the performance of a question-answering
system. We motivate question generation as a means to improve the performance
of question-answering systems. Our model is trained and evaluated on the recent
question-answering dataset SQuAD.
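
For readers unfamiliar with policy-gradient fine-tuning, the generic REINFORCE
estimator with a baseline is written out below. It is the textbook form, not
necessarily the exact objective used by the authors. All symbols are
assumptions introduced here: $\theta$ are the model parameters, $d$ the
document, $a$ the answer, $\hat{q}$ a sampled question, $r_k$ the individual
rewards with weights $\lambda_k$, and $b$ a baseline used to reduce variance.

```latex
R(\hat{q}) = \sum_{k} \lambda_k \, r_k(\hat{q}),
\qquad
\nabla_\theta J(\theta)
  = \mathbb{E}_{\hat{q} \sim p_\theta(\cdot \mid d, a)}
    \Big[ \big( R(\hat{q}) - b \big) \,
          \nabla_\theta \log p_\theta(\hat{q} \mid d, a) \Big]
```

In this form, a reward such as the accuracy of a question-answering system on
the generated question enters only through $R(\hat{q})$, so it does not need to
be differentiable.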

We present NewsQA, a challenging machine comprehension dataset of over
100,000 human-generated question-answer pairs. Crowdworkers supply questions
and answers based on a set of over 10,000 news articles from CNN, with answers
consisting of spans of text from the corresponding articles. We collect this
dataset through a four-stage process designed to solicit exploratory questions
that require reasoning. A thorough analysis confirms that NewsQA demands
abilities beyond simple word matching and recognizing textual entailment. We
measure human performance on the dataset and compare it to several strong
neural models. The performance gap between humans and machines (0.198 in F1)
indicates that significant progress can be made on NewsQA through future
research. The dataset is freely available at
https://datasets.maluuba.com/NewsQA.

We present a comprehensive investigation of the polarization properties of
non-polar a-plane InGaN quantum dots (QDs) and their origin with statistically
significant experimental data and rigorous k.p modelling. The unbiased
selection and study of 180 individual QDs allow us to compute an average
polarization degree of 0.90, with a standard deviation of only 0.08. When
coupled with theoretical insights, we show that a-plane InGaN QDs are highly
insensitive to size differences, shape anisotropies, and indium content
fluctuations. Furthermore, 91% of the studied QDs exhibit a polarization axis
along the crystal [1-100] axis, with the other 9% polarized orthogonal to this
direction. When coupled with their ability to emit single photons, a-plane QDs
are good candidates for the generation of linearly polarized single photons, a
feature attractive for quantum cryptography protocols.

A crucial requirement for the realisation of efficient and scalable on-chip
quantum communication is an ultrafast polarised single photon source operating
beyond the Peltier cooling barrier of 200 K. While a few systems based on
different materials and device structures have achieved single photon
generation above this threshold, there has been no report of single quantum
emitters with deterministic polarisation properties at the same high
temperature conditions. Here, we report the first device to simultaneously
achieve single photon emission with a g(2)(0) of only 0.21, a high polarisation
degree of 0.80, a fixed polarisation axis determined by the underlying
crystallography, and a GHz repetition rate with a radiative lifetime of 357 ps
at 220 K. The temperature insensitivity of these properties, together with the
simple planar growth method, and absence of complex device geometries, makes
this system an excellent candidate for on-chip applications in integrated
systems.

Inferring topics from the overwhelming amount of short texts has become a
critical but challenging problem for many content analysis tasks, such as
content characterization, user interest profiling, and emerging topic
detection. Existing methods such as probabilistic latent semantic analysis
(PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very
well since only very limited word co-occurrence information is available in
short texts. This paper studies how to incorporate external word correlation
knowledge into short texts to improve the coherence of topic modeling. Based on
recent results in word embeddings that learn semantic representations for words
from a large corpus, we introduce a novel method, Embedding-based Topic Model
(ETM), to learn latent topics from short texts. ETM not only solves the problem
of very limited word co-occurrence information by aggregating short texts into
long pseudo-texts, but also utilizes a Markov Random Field regularized model
that gives correlated words a better chance to be put into the same topic. The
experiments on real-world datasets validate the effectiveness of our model
compared with state-of-the-art models.

We demonstrate single photon emission from self-assembled m-plane InGaN
quantum dots (QDs) embedded on the sidewalls of GaN nanowires. A combination
of electron microscopy, cathodoluminescence, time-resolved micro-PL and photon
autocorrelation experiments gives a thorough evaluation of the QDs' structural
and optical properties. The QD exhibits antibunched emission up to 100 K, with
a measured autocorrelation function of g^(2)(0) = 0.28 (0.03) at 5 K. Studies on
a statistically significant number of QDs show that these m-plane QDs exhibit
very fast radiative lifetimes (260 +/- 55 ps), suggesting smaller internal
fields than any of the previously reported c-plane and a-plane QDs. Moreover,
the observed single photons are almost completely linearly polarized, aligned
perpendicular to the crystallographic c-axis, with a degree of linear
polarization of 0.84 +/- 0.12. Such InGaN QDs incorporated in a nanowire system
meet many of the requirements for implementation into quantum information
systems and could potentially open the door to wholly new device concepts.

Text simplification (TS) aims to reduce the lexical and structural complexity
of a text, while still retaining the semantic meaning. Current automatic TS
techniques are limited to either lexical-level applications or manually
defining a large number of rules. Since deep neural networks are powerful
models that have achieved excellent performance over many difficult tasks, in
this paper, we propose to use the Long Short-Term Memory (LSTM) Encoder-Decoder
model for sentence-level TS, which makes minimal assumptions about word
sequence. Preliminary experiments show that the model is able to
learn operation rules such as reversing, sorting and replacing from sequence
pairs, which shows that the model may potentially discover and apply rules such
as modifying sentence structure, substituting words, and removing words for TS.

Link prediction aims to uncover the underlying relationships behind networks,
which can be used to predict missing edges or identify spurious edges, and it
attracts much attention from various fields. The key issue in link prediction
is to estimate the likelihood that two nodes in a network are linked. Most
current approaches to link prediction are based on static structural analysis
and ignore the temporal aspects of evolving networks. Unlike previous work, in
this paper, we propose a popularity-based structural perturbation method
(PBSPM) that characterizes the similarity of an edge not only from existing
connections of networks, but also from the popularity of its two endpoints,
since popular nodes are much more likely to form links between themselves. By
taking the popularity of nodes into account, PBSPM can suppress nodes that have
high importance but gradually become inactive. Therefore the proposed method is
inclined to predict potential edges between active nodes, rather than edges
between inactive nodes. Experimental results on four real networks show that
the proposed method outperforms the state-of-the-art methods in both accuracy
and robustness in evolving networks.

Blood exhibits a heterogeneous nature of hematocrit, velocity, and effective
viscosity in microcapillaries. Microvascular bifurcations have a significant
influence on the distribution of the blood cells and blood flow behavior. This
paper presents a simulation study performed on the two-dimensional motions and
deformation of multiple red blood cells in microvessels with diverging and
converging bifurcations. Fluid dynamics and membrane mechanics were
incorporated. Effects of cell shape, hematocrit, and deformability of the cell
membrane on rheological behavior of the red blood cells and the hemodynamics
have been investigated. It was shown that the blood entering the daughter
branch with a higher flow rate tended to receive disproportionately more cells.
The results also demonstrate that red blood cells in microvessels experienced
lateral migration in the parent channel and blunted velocity profiles in both
the straight section and the daughter branches, and this effect was influenced
by the shape and the initial position of the cells, the hematocrit, and the
membrane deformability. In addition, a cell-free region around the tip of the confluence
was observed. The simulation results are qualitatively consistent with existing
experimental findings. This study may provide fundamental knowledge for a
better understanding of hemodynamic behavior of microscale blood flow.

Or's of And's (OA) models are comprised of a small number of disjunctions of
conjunctions, also called disjunctive normal form. An example of an OA model is
as follows: If ($x_1 = $ `blue' AND $x_2=$ `middle') OR ($x_1 = $ `yellow'),
then predict $Y=1$, else predict $Y=0$. Or's of And's models have the advantage
of being interpretable to human experts, since they are a set of conditions
that concisely capture the characteristics of a specific subset of data. We
present two optimization-based machine learning frameworks for constructing OA
models, Optimized OA (OOA) and its faster version, Optimized OA with
Approximations (OOAx). We prove theoretical bounds on the properties of
patterns in an OA model. We build OA models as a diagnostic screening tool for
obstructive sleep apnea that achieves high accuracy with a substantial gain in
interpretability over other methods.
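
The example OA model quoted above can be evaluated with a few lines of code.
The sketch below is only an illustration of how a disjunction of conjunctions
produces a prediction; it is not the OOA or OOAx learning procedure. The data
structures and the function name are assumptions.

```python
from typing import Dict, List

Pattern = Dict[str, str]   # a conjunction of attribute = value conditions
OAModel = List[Pattern]    # a disjunction of patterns


def predict(model: OAModel, x: Dict[str, str]) -> int:
    """Predict 1 if any pattern is fully satisfied by x, otherwise 0."""
    return int(
        any(all(x.get(attr) == value for attr, value in pattern.items())
            for pattern in model)
    )


# The model from the text:
# (x1 = 'blue' AND x2 = 'middle') OR (x1 = 'yellow')
oa_model: OAModel = [
    {"x1": "blue", "x2": "middle"},
    {"x1": "yellow"},
]
```

Because each pattern is a short, readable condition, the prediction for any
instance can be traced back to the specific pattern that fired.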

Antiferromagnetic order at $T_{\mathrm{N}} = 23$ K has been identified in
Mn(III)F(salen), salen = H$_{14}$C$_{16}$N$_2$O$_2$, an $S = 2$ linear-chain
system. Using single crystals, specific heat studies performed in magnetic
fields up to 9 T revealed the presence of a field-independent cusp at the same
temperature where $^1$H NMR studies conducted at 42 MHz observed dramatic
changes in the spin-lattice relaxation time, $T_1$, and in the linewidths.
Neutron powder diffraction performed on a randomly oriented, as-grown,
deuterated (12 of 14 H replaced by D) sample of 2.2 g at 10 K and 100 K did not
resolve the magnetic ordering, while low-field (less than 0.1 T) magnetic
susceptibility studies of single crystals and randomly arranged
microcrystalline samples reveal subtle features associated with the transition.
Taken together, these data suggest that a magnetic signature previously
detected at 3.8 T for temperatures below nominally 500 mK is a spin-flop field of small net
moments arising from alternating subsets of three Mn spins along the chains.

We present a machine learning algorithm for building classifiers that are
comprised of a small number of disjunctions of conjunctions (or's of and's). An
example of a classifier of this form is as follows: If X satisfies (x1 = 'blue'
AND x3 = 'middle') OR (x1 = 'blue' AND x2 = '<15') OR (x1 = 'yellow'), then we
predict that Y=1, else predict Y=0. An attribute-value pair is called a literal
and a conjunction of literals is called a pattern. Models of this form have the
advantage of being interpretable to human experts, since they produce a set of
conditions that concisely describe a specific class. We present two
probabilistic models for forming a pattern set, one with a Beta-Binomial prior,
and the other with Poisson priors. In both cases, there are prior parameters
that the user can set to encourage the model to have a desired size and shape,
to conform with a domain-specific definition of interpretability. We provide
two scalable MAP inference approaches: a pattern-level search, which involves
association rule mining, and a literal-level search. We show that stronger
priors reduce computation. We apply the Bayesian Or's of And's (BOA) model to predict
user behavior with respect to in-vehicle context-aware personalized recommender
systems.

In this paper, we consider a multi-hop wireless sensor network with multiple
relay nodes for each hop where the amplify-and-forward scheme is employed. We
present algorithmic strategies to jointly design linear receivers and the power
allocation parameters via an alternating optimization approach subject to
different power constraints which include global, local and individual ones.
Two design criteria are considered: the first one minimizes the mean-square
error and the second one maximizes the sum-rate of the wireless sensor network.
We derive constrained minimum mean-square error and constrained maximum
sum-rate expressions for the linear receivers and the power allocation
parameters that contain the optimal complex amplification coefficients for each
relay node. An analysis of the computational complexity and the convergence of
the algorithms is also presented. Computer simulations show good performance of
our proposed methods in terms of bit error rate and sum-rate compared to the
method with equal power allocation and an existing power allocation scheme.

In this paper, we show that coding can be used in storage area networks
(SANs) to improve various quality of service metrics under normal SAN operating
conditions, without requiring additional storage space. For our analysis, we
develop a model which captures modern characteristics such as constrained I/O
access bandwidth. Using this model, we consider two important
cases: single-resolution (SR) and multi-resolution (MR) systems. For SR
systems, we use blocking probability as the quality of service metric and
propose the network coded storage (NCS) scheme as a way to reduce blocking
probability. The NCS scheme codes across file chunks in time, exploiting file
striping and file duplication. Under our assumptions, we illustrate cases where
SR NCS provides an order of magnitude savings in blocking probability. For MR
systems, we introduce saturation probability as a quality of service metric to
manage multiple user types, and we propose the uncoded resolution-aware
storage (URS) and coded resolution-aware storage (CRS) schemes as ways to
reduce saturation probability. In MR URS, we align our MR layout strategy with
traffic requirements. In MR CRS, we code videos across MR layers. Under our
assumptions, we illustrate that URS can in some cases provide an order of
magnitude gain in saturation probability over classic non-resolution-aware
systems. Further, we illustrate that CRS provides additional saturation
probability savings over URS.

Ehrenfeucht-Fraïssé games are very useful in studying separation and
equivalence results in logic. The standard finite Ehrenfeucht-Fraïssé game
characterizes equivalence in first-order logic. The standard
Ehrenfeucht-Fraïssé game in infinitary logic characterizes equivalence in
$L_{\infty\omega}$. The logic $L_{\omega_1\omega}$ is the extension of
first-order logic with countable conjunctions and disjunctions. There was no
Ehrenfeucht-Fraïssé game for $L_{\omega_1\omega}$ in the literature.
In this paper we develop an Ehrenfeucht-Fraïssé game for
$L_{\omega_1\omega}$. This game is based on a game for propositional and
first-order logic introduced by Hella and Väänänen. Unlike the standard
Ehrenfeucht-Fraïssé games, which are modeled solely after the behavior of
quantifiers, this new game also takes into account the behavior of connectives
in logic. We prove the adequacy theorem for this game. We also apply the new
game to prove complexity results about infinite binary strings.