
A methodology is developed for data analysis based on empirically constructed
geodesic metric spaces. For a probability distribution, the length along a path
between two points can be defined as the amount of probability mass accumulated
along the path. The geodesic, then, is the shortest such path and defines a
geodesic metric. Such metrics are transformed in a number of ways to produce
parametrised families of geodesic metric spaces, empirical versions of which
allow computation of intrinsic means and associated measures of dispersion.
These reveal geometric properties of the data that are difficult to see from
the raw Euclidean distances. Examples of application
include clustering and classification. For certain parameter ranges, the spaces
become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal
spanning tree of a graph based on the data becomes CAT(0). In another, a
so-called "metric cone" construction allows extension to CAT($k$) spaces. It is
shown how to empirically tune the parameters of the metrics, making it possible
to apply them to a number of real cases.
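
As a rough illustration of an empirical intrinsic mean, the sketch below computes shortest-path (geodesic) distances on a weighted graph and picks the vertex minimising the sum of squared geodesic distances. The function names, the restriction of the mean to data vertices, and the use of a plain edge-weight matrix are assumptions made for this example; the parametrised metric transformations of the abstract are not reproduced.

```python
import numpy as np


def geodesic_distances(weights):
    """All-pairs shortest-path distances (Floyd-Warshall).

    weights[i][j] is the edge length between vertices i and j,
    with np.inf where there is no edge (hypothetical input format).
    """
    d = np.array(weights, dtype=float)
    np.fill_diagonal(d, 0.0)
    for k in range(len(d)):
        # Allow paths that pass through vertex k.
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    return d


def empirical_intrinsic_mean(weights):
    """Vertex minimising the sum of squared geodesic distances.

    Returns (index of the minimising vertex, dispersion), where the
    dispersion is the minimised mean squared geodesic distance.
    """
    d = geodesic_distances(weights)
    cost = (d ** 2).sum(axis=1)
    best = int(np.argmin(cost))
    return best, cost[best] / len(d)
```

On a path graph with three vertices and unit edge lengths, the middle vertex is returned as the mean, which matches the intuition that the intrinsic mean lies "between" the data along the graph.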

Loewner hulls are determined by their real-valued driving functions. We study
the geometric effect on the Loewner hulls when the driving function is composed
with a random time change, such as the inverse of an $\alpha$-stable
subordinator. In contrast to SLE, we show that for a large class of random time
changes, the time-changed Brownian motion process does not generate a simple
curve. Further we develop criteria which can be applied in many situations to
determine whether the Loewner hull generated by a time-changed driving function
is simple or non-simple. To aid our analysis of an example with a time-changed
deterministic driving function, we prove a deterministic result that a driving
function that moves faster than $at^r$ for $r \in (0,1/2)$ generates a hull
that leaves the real line tangentially.

Machine learning has played an important role in information retrieval (IR)
in recent times. In search engines, for example, query keywords are accepted
and documents are returned in order of relevance to the given query; this can
be cast as a multi-label ranking problem in machine learning. Generally, the
number of candidate documents is extremely large (from several thousand to
several million); thus, the classifier must handle many labels. This problem is
referred to as extreme multi-label classification (XMLC). In this paper, we
propose a novel approach to XMLC termed the Sparse Weighted Nearest-Neighbor
Method. This technique can be derived as a fast implementation of
state-of-the-art (SOTA) one-versus-rest linear classifiers for very sparse
datasets. In addition, we show that the classifier can be written as a sparse
generalization of a representer theorem with a linear kernel. Furthermore, our
method can be viewed as the vector space model used in IR. Finally, we show
that the Sparse Weighted Nearest-Neighbor Method can process data points in
real time on XMLC datasets with equivalent performance to SOTA models, with a
single thread and smaller storage footprint. In particular, our method exhibits
superior performance to the SOTA models on a dataset with 3 million labels.

We demonstrated visualization of Au nanoparticles buried 300 nm into a
polymer matrix by measurement of the thermal noise spectrum of a
microcantilever with a tip in contact with the polymer surface. The subsurface Au
nanoparticles were detected as the variation in the contact stiffness and
damping reflecting the viscoelastic properties of the polymer surface. The
variation in the contact stiffness agreed well with the effective stiffness of
a simple one-dimensional model, which is consistent with the fact that the
maximum depth range of the technique is far beyond the extent of the contact
stress field.

Makespan, defined as the time difference between the starting time and the
finishing time of a sequence of jobs or tasks (for example, the time to traverse
a belt conveyor system), is well known as one of the most important criteria in
scheduling problems. It is often used by manufacturing firms in practice in
order to improve the operational efficiency with respect to the order of job
processing to be performed. It is known that the performance of a machine
depends on the particular timing of the job processing even if the job
processing order is fixed. That is, the performance of a system with respect to
flowshop processing depends on the procedure of scheduling. In the present
work, we first discuss the relationship between makespan and several scheduling
procedures in detail by using a small example and provide an algorithm for
deriving the makespan. Using our proposed algorithm, several numerical
experiments are conducted so as to reveal the relationship between the typical
behavior of makespan and the position of the fiducial machine, with respect to
several distinguished distributions of the processing time. We also discuss the
behavior of makespan by using the properties of the shape functions used in the
context of percolation theory. Our contributions are a detailed discussion of
the universality of makespan in flowshop problems and several novel properties
of makespan, as follows: (1) makespan possesses
universality in the sense of being little affected by a change in the
probability distribution of the processing time, (2) makespan can be decomposed
into the sum of two shape functions, and (3) makespan is less affected by the
dispatching rule than by the scheduling procedure.
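
For orientation, the makespan of a fixed job order can be computed with the standard completion-time recurrence for a permutation flowshop. The sketch below is not the algorithm proposed in the paper; it assumes unlimited buffers between machines and that every operation starts as early as possible, which is only one of the scheduling procedures the abstract alludes to. The function name and input layout are hypothetical.

```python
from typing import Sequence


def flowshop_makespan(proc: Sequence[Sequence[float]]) -> float:
    """Makespan of a permutation flowshop under earliest-start scheduling.

    proc[j][m] is the processing time of the j-th job (in the given
    processing order) on machine m; all jobs visit machines 0, 1, ... in turn.
    """
    if not proc:
        return 0.0
    n_machines = len(proc[0])
    # finish[m]: completion time of the most recent job on machine m.
    finish = [0.0] * n_machines
    for job in proc:
        prev_done = 0.0  # completion time of this job on the previous machine
        for m in range(n_machines):
            # A job starts on machine m once the machine is free and the
            # job has left machine m - 1.
            start = max(finish[m], prev_done)
            prev_done = start + job[m]
            finish[m] = prev_done
    return finish[-1]
```

With two jobs on two machines and processing times (3, 2) and (1, 4), the order matters: running the (3, 2) job first gives a makespan of 9, while running the (1, 4) job first gives 7.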

This paper establishes a discretization scheme for a large class of
stochastic differential equations driven by a time-changed Brownian motion with
drift, where the time change is given by a general inverse subordinator. The
scheme involves two types of errors: one generated by application of the
Euler-Maruyama scheme and the other ascribed to simulation of the inverse
subordinator. With the two errors carefully examined, the orders of strong and
weak convergence are derived. Numerical examples are attached to support the
convergence results.
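
The two error sources can be made concrete with a small sketch. It assumes an SDE of the form $dX_t = b(X_t)\,dE_t + \sigma(X_t)\,dB_{E_t}$ with state-dependent coefficients only, approximates the inverse $E$ of a subordinator $D$ on a grid of step `delta`, runs the Euler-Maruyama scheme for $dY_s = b(Y_s)\,ds + \sigma(Y_s)\,dB_s$ on the same grid, and sets $X_t \approx Y_{E_t}$. A gamma subordinator is used purely because its increments are easy to sample with NumPy; the function names, the choice of subordinator, and the parameter `rate` are assumptions of this example, not the scheme analysed in the paper.

```python
import numpy as np


def simulate_inverse_subordinator(t_grid, delta, rng, rate=1.0):
    """Approximate E_t = inf{s : D_s > t} for a gamma subordinator D.

    t_grid must be sorted and non-negative. Returns (idx, n_steps), where
    E_t is approximated by idx * delta and n_steps is the number of
    simulated increments of D.
    """
    horizon = float(t_grid[-1])
    d_path = [0.0]
    # Extend D on the grid {k * delta} until it first exceeds the horizon.
    while d_path[-1] <= horizon:
        d_path.append(d_path[-1] + rng.gamma(shape=rate * delta, scale=1.0))
    d_path = np.asarray(d_path)
    # (first index k with D[k] > t) - 1 approximates E_t / delta.
    idx = np.searchsorted(d_path, t_grid, side="right") - 1
    return idx, len(d_path) - 1


def euler_maruyama_time_changed(x0, drift, diffusion, t_grid, delta, rng):
    """Approximate X_t = Y_{E_t}, where dY = drift(Y) ds + diffusion(Y) dB."""
    idx, n_steps = simulate_inverse_subordinator(t_grid, delta, rng)
    y = np.empty(n_steps + 1)
    y[0] = x0
    # Brownian increments, drawn independently of the subordinator.
    db = rng.normal(0.0, np.sqrt(delta), size=n_steps)
    for k in range(n_steps):
        y[k + 1] = y[k] + drift(y[k]) * delta + diffusion(y[k]) * db[k]
    return delta * idx, y[idx]
```

A call such as `euler_maruyama_time_changed(1.0, lambda x: -x, lambda x: 0.5, np.linspace(0.0, 1.0, 11), 0.01, np.random.default_rng(0))` returns the approximate time change and the approximate solution on the time grid. Refining `delta` reduces both the Euler-Maruyama error and the error in locating $E_t$.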

This paper establishes small ball probabilities for a class of time-changed
processes $X\circ E$, where $X$ is a self-similar process and $E$ is an
independent continuous process, each with a certain small ball probability. In
particular, examples of the outer process $X$ and the time change $E$ include
an iterated fractional Brownian motion and the inverse of a general
subordinator with infinite L\'evy measure, respectively. The small ball
probabilities of such time-changed processes show power law decay, and the rate
of decay depends not on the small deviation order of the outer process $X$
but on the self-similarity index of $X$.

A novel type of permutation test for dendrogram data is studied with respect
to two types of metrics for measuring the difference between dendrograms.
First, the Frobenius norm is used, and we prove the consistency and efficiency
of the permutation tests. Next, the geodesic distance on a dendrogram space is
used. The uniqueness of the geodesics on every dendrogram space is proved and
some existing algorithms for computing geodesics are applied. Mental lexicons
of English words are analyzed as an application example of the proposed
permutation tests. The difference in mental lexicons between native and
non-native English speakers is examined by analyzing sorting task data that
used English words taken from various word classes.
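
A generic two-sample permutation test with a Frobenius-norm statistic can be sketched as follows. It assumes each dendrogram is encoded as a square matrix (for instance its matrix of merge heights) and uses the Frobenius norm of the difference between the group mean matrices as the test statistic. Both the encoding and the statistic are assumptions of this example and may differ from the construction studied in the paper.

```python
import numpy as np


def frobenius_permutation_test(group_a, group_b, n_perm=999, seed=0):
    """Two-sample permutation test on stacks of equally sized square matrices.

    Returns (observed statistic, permutation p-value).
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    pooled = np.concatenate([a, b], axis=0)
    n_a = a.shape[0]
    rng = np.random.default_rng(seed)

    def statistic(x, y):
        # Frobenius norm of the difference between group mean matrices.
        return np.linalg.norm(x.mean(axis=0) - y.mean(axis=0), ord="fro")

    observed = statistic(a, b)
    exceed = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled matrices to the two groups.
        perm = rng.permutation(pooled.shape[0])
        stat = statistic(pooled[perm[:n_a]], pooled[perm[n_a:]])
        if stat >= observed - 1e-12:  # small tolerance for rounding ties
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (1 + exceed) / (1 + n_perm)
```

When the two groups contain identical matrices the p-value is 1, and when the groups are well separated the p-value becomes small.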

A strong link between information geometry and algebraic statistics is made
by investigating statistical manifolds which are algebraic varieties. In
particular, it is shown how first- and second-order efficient estimators can be
constructed, such as bias-corrected maximum likelihood and more general
estimators, for which the estimating equations are purely algebraic. In
addition it is shown how Gr\"obner basis technology, which is at the heart of
algebraic statistics, can be used to reduce the degrees of the terms in the
estimating equations. This points the way to the feasible use of special
methods for solving polynomial equations, such as homotopy continuation
methods, to find the estimators. Simple examples are given showing both
equations and computations. *** The proof of Theorem 2 was corrected in the
latest version. Some minor errors were also corrected.

This paper establishes Fokker-Planck-Kolmogorov type equations for
time-changed Gaussian processes. Examples include those equations for a
time-changed fractional Brownian motion with time-dependent Hurst parameter and
for a time-changed Ornstein-Uhlenbeck process. The time-change process
considered is the inverse of either a stable subordinator or a mixture of
independent stable subordinators.

It is shown that under a certain condition on a semimartingale and a
time-change, any stochastic integral driven by the time-changed semimartingale
is a time-changed stochastic integral driven by the original semimartingale. As
a direct consequence, a specialized form of the It\^o formula is derived. When a
standard Brownian motion is the original semimartingale, classical It\^o
stochastic differential equations driven by the Brownian motion with drift
extend to a larger class of stochastic differential equations involving a
time-change with continuous paths. A form of the general solution of linear
equations in this new class is established, followed by consideration of some
examples analogous to the classical equations. Through these examples, each
coefficient of the stochastic differential equations in the new class is given
meaning. The new feature is the coexistence of a usual drift term along with a
term related to the time-change.

It is known that the transition probabilities of a solution to a classical
It\^o stochastic differential equation (SDE) satisfy in the weak sense the
associated Kolmogorov equation. The Kolmogorov equation is a partial
differential equation with coefficients determined by the corresponding SDE.
Time-fractional Kolmogorov type equations are used to model complex processes
in many fields. However, the class of SDEs that is associated with these
equations is unknown except in a few special cases. The present paper shows
that in the cases of either time-fractional order or more general
time-distributed order differential equations, the associated class of SDEs can
be described within the framework of SDEs driven by semimartingales. These
semimartingales are time-changed L\'evy processes where the independent
time-change is given, respectively, by the inverse of a single stable
subordinator or of a mixture of independent stable subordinators. Examples are
provided, including a fractional analogue of the Feynman-Kac formula.

In this paper, Fokker-Planck-Kolmogorov type equations associated with
stochastic differential equations driven by a time-changed fractional Brownian
motion are derived. Two equivalent forms are suggested. The time-change process
considered is the first hitting time process for either a stable subordinator
or a mixture of stable subordinators. A family of operators arising in the
representation of the Fokker-Planck-Kolmogorov equations is shown to have the
semigroup property.

We consider Bayesian shrinkage predictions for the Normal regression problem
under the frequentist Kullback-Leibler risk function.
Firstly, we consider the multivariate Normal model with an unknown mean and a
known covariance. While the unknown mean is fixed, the covariance of future
samples can be different from that of training samples. We show that the Bayesian
predictive distribution based on the uniform prior is dominated by that based
on a class of priors if the prior distributions for the covariance and future
covariance matrices are rotation invariant.
Then, we consider a class of priors for the mean parameters depending on the
future covariance matrix. With such a prior, we can construct a Bayesian
predictive distribution dominating that based on the uniform prior.
Lastly, applying this result to the prediction of response variables in the
Normal linear regression model, we show that there exists a Bayesian predictive
distribution dominating that based on the uniform prior. Minimaxity of these
Bayesian predictions follows from these results.