-
Unambiguous automata are nondeterministic automata in which every word has at
most one accepting run. In this paper we give a polynomial-time algorithm for
model checking discrete-time Markov chains against $\omega$-regular
specifications represented as unambiguous automata. We furthermore show that
the complexity of this model checking problem lies in NC: the subclass of P
comprising those problems solvable in poly-logarithmic parallel time. These
complexity bounds match the known bounds for model checking Markov chains
against specifications given as deterministic automata, notwithstanding the
fact that unambiguous automata can be exponentially more succinct than
deterministic automata. We report on an implementation of our procedure,
including an experiment in which the implementation is used to model check LTL
formulas on Markov chains.
-
Given two labelled Markov decision processes (MDPs), the trace-refinement
problem asks whether for all strategies of the first MDP there exists a
strategy of the second MDP such that the induced labelled Markov chains are
trace-equivalent. We show that this problem is decidable in polynomial time if
the second MDP is a Markov chain. The algorithm is based on new results on a
particular notion of bisimulation between distributions over the states.
However, we show that the general trace-refinement problem is undecidable, even
if the first MDP is a Markov chain. Decidability of those problems was stated
as open in 2008. We further study the decidability and complexity of the
trace-refinement problem provided that the strategies are restricted to be
memoryless.
-
We prove results on the decidability and complexity of computing the total
variation distance (equivalently, the $L_1$-distance) of hidden Markov models
(equivalently, labelled Markov chains). This distance measures the difference
between the distributions on words that two hidden Markov models induce. The
main results are: (1) it is undecidable whether the distance is greater than a
given threshold; (2) approximation is #P-hard and in PSPACE.
-
We study 2-player turn-based perfect-information stochastic games with
countably infinite state space. The players aim at maximizing/minimizing the
probability of a given event (i.e., measurable set of infinite plays), such as
reachability, B\"uchi, omega-regular or more general objectives.
These games are known to be weakly determined, i.e., they have value.
However, strong determinacy of threshold objectives (given by an event and a
threshold $c \in [0,1]$) was open in many cases: is it always the case that the
maximizer or the minimizer has a winning strategy, i.e., one that enforces,
against all strategies of the other player, that the objective is satisfied
with probability $\ge c$ (resp. $< c$)?
We show that almost-sure objectives (where $c=1$) are strongly determined.
This vastly generalizes a previous result on finite games with almost-sure tail
objectives. On the other hand we show that $\ge 1/2$ (co-)B\"uchi objectives
are not strongly determined, not even if the game is finitely branching.
Moreover, for almost-sure reachability and almost-sure B\"uchi objectives in
finitely branching games, we strengthen strong determinacy by showing that one
of the players must have a memoryless deterministic (MD) winning strategy.
-
We study countably infinite MDPs with parity objectives, and special cases
with a bounded number of colors in the Mostowski hierarchy (including
reachability, safety, Buchi and co-Buchi).
In finite MDPs there always exist optimal memoryless deterministic (MD)
strategies for parity objectives, but this does not generally hold for
countably infinite MDPs. In particular, optimal strategies need not exist. For
countable infinite MDPs, we provide a complete picture of the memory
requirements of optimal (resp., $\epsilon$-optimal) strategies for all
objectives in the Mostowski hierarchy. In particular, there is a strong
dichotomy between two different types of objectives. For the first type,
optimal strategies, if they exist, can be chosen MD, while for the second type
optimal strategies require infinite memory. (I.e., for all objectives in the
Mostowski hierarchy, if finite-memory randomized strategies suffice then also
MD strategies suffice.) Similarly, some objectives admit $\epsilon$-optimal
MD-strategies, while for others $\epsilon$-optimal strategies require infinite
memory. Such a dichotomy also holds for the subclass of countably infinite MDPs
that are finitely branching, though more objectives admit MD-strategies here.
-
We consider the problem of minimising the number of states in a multiplicity
tree automaton over the field of rational numbers. We give a minimisation
algorithm that runs in polynomial time assuming unit-cost arithmetic. We also
show that a polynomial bound in the standard Turing model would require a
breakthrough in the complexity of polynomial identity testing by proving that
the latter problem is logspace equivalent to the decision version of
minimisation. The developed techniques also improve the state of the art in
multiplicity word automata: we give an NC algorithm for minimising multiplicity
word automata. Finally, we consider the minimal consistency problem: does there
exist an automaton with $n$ states that is consistent with a given finite
sample of weight-labelled words or trees? We show that this decision problem is
complete for the existential theory of the rationals, both for words and for
trees of a fixed alphabet rank.
-
Nonnegative matrix factorization (NMF) is the problem of decomposing a given
nonnegative $n \times m$ matrix $M$ into a product of a nonnegative $n \times
d$ matrix $W$ and a nonnegative $d \times m$ matrix $H$. A longstanding open
question, posed by Cohen and Rothblum in 1993, is whether a rational matrix $M$
always has an NMF of minimal inner dimension $d$ whose factors $W$ and $H$ are
also rational. We answer this question negatively, by exhibiting a matrix for
which $W$ and $H$ require irrational entries.
-
Nonnegative matrix factorization (NMF) is the problem of decomposing a given
nonnegative $n \times m$ matrix $M$ into a product of a nonnegative $n \times
d$ matrix $W$ and a nonnegative $d \times m$ matrix $H$. Restricted NMF
requires in addition that the column spaces of $M$ and $W$ coincide. Finding
the minimal inner dimension $d$ is known to be NP-hard, both for NMF and
restricted NMF. We show that restricted NMF is closely related to a question
about the nature of minimal probabilistic automata, posed by Paz in his seminal
1971 textbook. We use this connection to answer Paz's question negatively, thus
falsifying a positive answer claimed in 1974. Furthermore, we investigate
whether a rational matrix $M$ always has a restricted NMF of minimal inner
dimension whose factors $W$ and $H$ are also rational. We show that this holds
for matrices $M$ of rank at most $3$ and we exhibit a rank-$4$ matrix for which
$W$ and $H$ require irrational entries.
-
Hidden Markov Chains (HMCs) are commonly used mathematical models of
probabilistic systems. They are employed in various fields such as speech
recognition, signal processing, and biological sequence analysis. We consider
the problem of distinguishing two given HMCs based on an observation sequence
that one of the HMCs generates. More precisely, given two HMCs and an
observation sequence, a distinguishing algorithm is expected to identify the
HMC that generates the observation sequence. Two HMCs are called
distinguishable if for every $\varepsilon > 0$ there is a distinguishing
algorithm whose error probability is less than $\varepsilon$. We show that one
can decide in polynomial time whether two HMCs are distinguishable. Further, we
present and analyze two distinguishing algorithms for distinguishable HMCs. The
first algorithm makes a decision after processing a fixed number of
observations, and it exhibits two-sided error. The second algorithm processes
an unbounded number of observations, but the algorithm has only one-sided
error. The error probability, for both algorithms, decays exponentially with
the number of processed observations. We also provide an algorithm for
distinguishing multiple HMCs. Finally, we discuss an application in stochastic
runtime verification.
-
Herman's self-stabilisation algorithm, introduced 25 years ago, is a
well-studied synchronous randomised protocol for enabling a ring of $N$
processes collectively holding any odd number of tokens to reach a stable state
in which a single token remains. Determining the worst-case expected time to
stabilisation is the central outstanding open problem about this protocol. It
is known that there is a constant $h$ such that any initial configuration has
expected stabilisation time at most $h N^2$. Ten years ago, McIver and Morgan
established a lower bound of $4/27 \approx 0.148$ for $h$, achieved with three
equally-spaced tokens, and conjectured this to be the optimal value of $h$. A
series of papers over the last decade gradually reduced the upper bound on $h$,
with the present record (achieved in 2014) standing at approximately $0.156$.
In this paper, we prove McIver and Morgan's conjecture and establish that $h =
4/27$ is indeed optimal.
-
A cost Markov chain is a Markov chain whose transitions are labelled with
non-negative integer costs. A fundamental problem on this model, with
applications in the verification of stochastic systems, is to compute
information about the distribution of the total cost accumulated in a run. This
includes the probability of large total costs, the median cost, and other
quantiles. While expectations can be computed in polynomial time, previous work
has demonstrated that the computation of cost quantiles is harder but can be
done in PSPACE. In this paper we show that cost quantiles in cost Markov chains
can be computed in the counting hierarchy, thus providing evidence that
computing those quantiles is likely not PSPACE-hard. We obtain this result by
exhibiting a tight link to a problem in formal language theory: counting the
number of words that are both accepted by a given automaton and have a given
Parikh image. Motivated by this link, we comprehensively investigate the
complexity of the latter problem. Among other techniques, we rely on the
so-called BEST theorem for efficiently computing the number of Eulerian
circuits in a directed graph.
-
We show that the Kth largest subset problem and the Kth largest m-tuple
problem are in PP and hard for PP under polynomial-time Turing reductions.
Several problems from the literature were previously shown NP-hard via
reductions from those two problems, and by our main result they become PP-hard
as well. We also provide complementary PP-upper bounds for some of them.
-
We study the pattern frequency vector for runs in probabilistic Vector
Addition Systems with States (pVASS). Intuitively, each configuration of a
given pVASS is assigned one of finitely many patterns, and every run can thus
be seen as an infinite sequence of these patterns. The pattern frequency vector
assigns to each run the limit of pattern frequencies computed for longer and
longer prefixes of the run. If the limit does not exist, then the vector is
undefined. We show that for one-counter pVASS, the pattern frequency vector is
defined and takes only finitely many values for almost all runs. Further, these
values and their associated probabilities can be approximated up to an
arbitrarily small relative error in polynomial time. For stable two-counter
pVASS, we show the same result, but we do not provide any upper complexity
bound. As a byproduct of our study, we discover counterexamples falsifying some
classical results about stochastic Petri nets published in the~80s.
-
In runtime verification, the central problem is to decide if a given program
execution violates a given property. In online runtime verification, a monitor
observes a program's execution as it happens. If the program being observed has
hard real-time constraints, then the monitor inherits them. In the presence of
hard real-time constraints it becomes a challenge to maintain enough
information to produce error traces, should a property violation be observed.
In this paper we introduce a data structure, called tree buffer, that solves
this problem in the context of automata-based monitors: If the monitor itself
respects hard real-time constraints, then enriching it by tree buffers makes it
possible to provide error traces, which are essential for diagnosing defects.
We show that tree buffers are also useful in other application domains. For
example, they can be used to implement functionality of capturing groups in
regular expressions. We prove optimal asymptotic bounds for our data structure,
and validate them using empirical data from two sources: regular expression
searching through Wikipedia, and runtime verification of execution traces
obtained from the DaCapo test suite.
-
Given Markov chains and Markov decision processes (MDPs) whose transitions
are labelled with non-negative integer costs, we study the computational
complexity of deciding whether the probability of paths whose accumulated cost
satisfies a Boolean combination of inequalities exceeds a given threshold. For
acyclic Markov chains, we show that this problem is PP-complete, whereas it is
hard for the PosSLP problem and in PSPACE for general Markov chains. Moreover,
for acyclic and general MDPs, we prove PSPACE- and EXP-completeness,
respectively. Our results have direct implications on the complexity of
computing reward quantiles in succinctly represented stochastic systems.
-
Labelled Markov chains (LMCs) are widely used in probabilistic verification,
speech recognition, computational biology, and many other fields. Checking two
LMCs for equivalence is a classical problem subject to extensive studies, while
the total variation distance provides a natural measure for the "inequivalence"
of two LMCs: it is the maximum difference between probabilities that the LMCs
assign to the same event.
In this paper we develop a theory of the total variation distance between two
LMCs, with emphasis on the algorithmic aspects: (1) we provide a
polynomial-time algorithm for determining whether two LMCs have distance 1,
i.e., whether they can almost always be distinguished; (2) we provide an
algorithm for approximating the distance with arbitrary precision; and (3) we
show that the threshold problem, i.e., whether the distance exceeds a given
threshold, is NP-hard and hard for the square-root-sum problem. We also make a
connection between the total variation distance and Bernoulli convolutions.
-
We consider the state-minimisation problem for weighted and probabilistic
automata. We provide a numerically stable polynomial-time minimisation
algorithm for weighted automata, with guaranteed bounds on the numerical error
when run with floating-point arithmetic. Our algorithm can also be used for
"lossy" minimisation with bounded error. We show an application in image
compression. In the second part of the paper we study the complexity of the
minimisation problem for probabilistic automata. We prove that the problem is
NP-hard and in PSPACE, improving a recent EXPTIME-result.
-
We study the qualitative and quantitative zero-reachability problem in
probabilistic multi-counter systems. We identify the undecidable variants of
the problems, and then we concentrate on the remaining two cases. In the first
case, when we are interested in the probability of all runs that visit zero in
some counter, we show that the qualitative zero-reachability is decidable in
time which is polynomial in the size of a given pMC and doubly exponential in
the number of counters. Further, we show that the probability of all
zero-reaching runs can be effectively approximated up to an arbitrarily small
given error epsilon > 0 in time which is polynomial in log(epsilon),
exponential in the size of a given pMC, and doubly exponential in the number of
counters. In the second case, we are interested in the probability of all runs
that visit zero in some counter different from the last counter. Here we show
that the qualitative zero-reachability is decidable and SquareRootSum-hard, and
the probability of all zero-reaching runs can be effectively approximated up to
an arbitrarily small given error epsilon > 0 (these result applies to pMC
satisfying a suitable technical condition that can be verified in polynomial
time). The proof techniques invented in the second case allow to construct
counterexamples for some classical results about ergodicity in stochastic Petri
nets.
-
Basic Parallel Processes (BPPs) are a well-known subclass of Petri Nets. They
are the simplest common model of concurrent programs that allows unbounded
spawning of processes. In the probabilistic version of BPPs, every process
generates other processes according to a probability distribution. We study the
decidability and complexity of fundamental qualitative problems over
probabilistic BPPs -- in particular reachability with probability 1 of
different classes of target sets (e.g. upward-closed sets). Our results concern
both the Markov-chain model, where processes are scheduled randomly, and the
MDP model, where processes are picked by a scheduler.
-
This paper is concerned with the computational complexity of equivalence and
minimisation for automata with transition weights in the field Q of rational
numbers. We use polynomial identity testing and the Isolation Lemma to obtain
complexity bounds, focussing on the class NC of problems within P solvable in
polylogarithmic parallel time. For finite Q-weighted automata, we give a
randomised NC procedure that either outputs that two automata are equivalent or
returns a word on which they differ. We also give an NC procedure for deciding
whether a given automaton is minimal, as well as a randomised NC procedure that
minimises an automaton. We consider probabilistic automata with rewards,
similar to Markov Decision Processes. For these automata we consider two
notions of equivalence: expectation equivalence and distribution equivalence.
The former requires that two automata have the same expected reward on each
input word, while the latter requires that each input word induce the same
distribution on rewards in each automaton. For both notions we give algorithms
for deciding equivalence by reduction to equivalence of Q-weighted automata.
Finally we show that the equivalence problem for Q-weighted visibly pushdown
automata is logspace equivalent to the polynomial identity testing problem.
-
Given a basic process algebra (BPA) and two stack symbols, the BPA
bisimilarity problem asks whether the two stack symbols are bisimilar. We show
that this problem is EXPTIME-hard.
-
Given two pushdown systems, the bisimilarity problem asks whether they are
bisimilar. While this problem is known to be decidable our main result states
that it is nonelementary, improving EXPTIME-hardness, which was the previously
best known lower bound for this problem. Our lower bound result holds for
normed pushdown systems as well.
-
We study the bisimilarity problem for probabilistic pushdown automata (pPDA)
and subclasses thereof. Our definition of pPDA allows both probabilistic and
non-deterministic branching, generalising the classical notion of pushdown
automata (without epsilon-transitions). Our first contribution is a general
construction that reduces checking bisimilarity of probabilistic transition
systems to checking bisimilarity of non-deterministic transition systems. This
construction directly yields decidability of bisimilarity for pPDA, as well as
an elementary upper bound for the bisimilarity problem on the subclass of
probabilistic basic process algebras, i.e., single-state pPDA. We further show
that, with careful analysis, the general reduction can be used to prove an
EXPTIME upper bound for bisimilarity of probabilistic visibly pushdown
automata. Here we also provide a matching lower bound, establishing
EXPTIME-completeness. Finally we prove that deciding bisimilarity of
probabilistic one-counter automata, another subclass of pPDA, is
PSPACE-complete. Here we use a more specialised argument to obtain optimal
complexity bounds.
-
Stochastic branching processes are a classical model for describing random
trees, which have applications in numerous fields including biology, physics,
and natural language processing. In particular, they have recently been
proposed to describe parallel programs with stochastic process creation. In
this paper, we consider the problem of model checking stochastic branching
process. Given a branching process and a deterministic parity tree automaton,
we are interested in computing the probability that the generated random tree
is accepted by the automaton. We show that this probability can be compared
with any rational number in PSPACE, and with 0 and 1 in polynomial time. In a
second part, we suggest a tree extension of the logic PCTL, and develop a
PSPACE algorithm for model checking a branching process against a formula of
this logic. We also show that the qualitative fragment of this logic can be
model checked in polynomial time.
-
We study termination time and recurrence time in programs with unbounded
recursion, which are either randomized or operate on some statistically
quantified inputs. As the underlying formal model for such programs we use
probabilistic pushdown automata (pPDA) which are equivalent to probabilistic
recursive state machines. We obtain tail bounds for the distribution of
termination time for pPDA. We also study the recurrence time for probabilistic
recursive programs that are not supposed to terminate (such as system daemons,
network servers, etc.). Typically, such programs react to certain requests
generated by their environment, and hence operate in finite request-service
cycles. We obtain bounds for the frequency of long request-service cycles.