• Empirical geodesic graphs and CAT(k) metrics for data analysis(1401.3020)

March 14, 2019 math.ST, stat.TH
A methodology is developed for data analysis based on empirically constructed geodesic metric spaces. For a probability distribution, the length along a path between two points can be defined as the amount of probability mass accumulated along the path. The geodesic, then, is the shortest such path and defines a geodesic metric. Such metrics are transformed in a number of ways to produce parametrised families of geodesic metric spaces, empirical versions of which allow computation of intrinsic means and associated measures of dispersion. These reveal properties of the data, based on geometry, such as those that are difficult to see from the raw Euclidean distances. Examples of application include clustering and classification. For certain parameter ranges, the spaces become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal spanning tree of a graph based on the data becomes CAT(0). In another, a so-called "metric cone" construction allows extension to CAT($k$) spaces. It is shown how to empirically tune the parameters of the metrics, making it possible to apply them to a number of real cases.
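To make the idea of an empirical geodesic metric and its intrinsic mean concrete, here is a minimal sketch. It rests on assumptions that are not taken from the paper: the graph joins data points whose Euclidean distance is at most a hypothetical `radius`, edge lengths are plain Euclidean distances rather than accumulated probability mass, and the intrinsic mean is searched for only among the data points. The function names are illustrative.

```python
import numpy as np

def geodesic_distances(points, radius):
    """All-pairs shortest-path distances on an epsilon-neighbourhood graph.

    Assumption: an edge joins two points whose Euclidean distance is <= radius.
    """
    x = np.asarray(points, dtype=float)
    diff = x[:, None, :] - x[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep only short edges; missing edges get infinite length.
    dist = np.where(euclid <= radius, euclid, np.inf)
    # Floyd-Warshall: allow paths through each intermediate vertex k in turn.
    for k in range(len(x)):
        dist = np.minimum(dist, dist[:, k, None] + dist[None, k, :])
    return dist

def intrinsic_mean_index(dist):
    """Index of the data point minimising the sum of squared geodesic distances."""
    return int(np.argmin((dist ** 2).sum(axis=1)))

# Four corners of a unit square: the diagonal is not an edge, so the
# geodesic between opposite corners runs along two sides.
square = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
d_square = geodesic_distances(square, radius=1.0)
```

For the square, the geodesic distance between opposite corners is 2 rather than the Euclidean value of about 1.41, which is the kind of geometric information the raw distances hide.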
• Effect of Random Time Changes on Loewner Hulls(1802.09466)

March 2, 2018 math.PR, math.CV
Loewner hulls are determined by their real-valued driving functions. We study the geometric effect on the Loewner hulls when the driving function is composed with a random time change, such as the inverse of an $\alpha$-stable subordinator. In contrast to SLE, we show that for a large class of random time changes, the time-changed Brownian motion process does not generate a simple curve. Further we develop criteria which can be applied in many situations to determine whether the Loewner hull generated by a time-changed driving function is simple or non-simple. To aid our analysis of an example with a time-changed deterministic driving function, we prove a deterministic result that a driving function that moves faster than $at^r$ for $r \in (0,1/2)$ generates a hull that leaves the real line tangentially.
• Revisiting the Vector Space Model: Sparse Weighted Nearest-Neighbor Method for Extreme Multi-Label Classification(1802.03938)

Feb. 12, 2018 cs.LG, stat.ML
Machine learning has played an important role in information retrieval (IR) in recent times. In search engines, for example, query keywords are accepted and documents are returned in order of relevance to the given query; this can be cast as a multi-label ranking problem in machine learning. Generally, the number of candidate documents is extremely large (from several thousand to several million); thus, the classifier must handle many labels. This problem is referred to as extreme multi-label classification (XMLC). In this paper, we propose a novel approach to XMLC termed the Sparse Weighted Nearest-Neighbor Method. This technique can be derived as a fast implementation of state-of-the-art (SOTA) one-versus-rest linear classifiers for very sparse datasets. In addition, we show that the classifier can be written as a sparse generalization of a representer theorem with a linear kernel. Furthermore, our method can be viewed as the vector space model used in IR. Finally, we show that the Sparse Weighted Nearest-Neighbor Method can process data points in real time on XMLC datasets with equivalent performance to SOTA models, with a single thread and smaller storage footprint. In particular, our method exhibits superior performance to the SOTA models on a dataset with 3 million labels.
• Visualization of Au Nanoparticles Buried in a Polymer Matrix by Scanning Thermal Noise Microscopy(1609.02999)

Sept. 10, 2016 cond-mat.mes-hall
We demonstrated visualization of Au nanoparticles buried 300 nm into a polymer matrix by measurement of the thermal noise spectrum of a microcantilever with a tip in contact with the polymer surface. The subsurface Au nanoparticles were detected as the variation in the contact stiffness and damping reflecting the viscoelastic properties of the polymer surface. The variation in the contact stiffness agreed well with the effective stiffness of a simple one-dimensional model, which is consistent with the fact that the maximum depth range of the technique is far beyond the extent of the contact stress field.
• Universality of Makespan in Flowshop Scheduling Problem(1607.07303)

Makespan, defined as the difference between the start time and the termination time of a sequence of jobs or tasks, or as the time to traverse a belt conveyor system, is well known as one of the most important criteria in scheduling problems. Manufacturing firms often use it in practice to improve operational efficiency with respect to the order in which jobs are processed. It is known that the performance of a machine depends on the particular timing of the job processing even if the job processing order is fixed. That is, the performance of a system with respect to flowshop processing depends on the scheduling procedure. In the present work, we first discuss the relationship between makespan and several scheduling procedures in detail by using a small example and provide an algorithm for deriving the makespan. Using our proposed algorithm, we carry out several numerical experiments to reveal the relationship between the typical behavior of makespan and the position of the fiducial machine, with respect to several distinguished distributions of the processing time. We also discuss the behavior of makespan by using the properties of the shape functions used in the context of percolation theory. Our contributions are a detailed discussion of the universality of makespan in flowshop problems and several novel properties of makespan, as follows: (1) makespan possesses universality in the sense of being little affected by a change in the probability distribution of the processing time, (2) makespan can be decomposed into the sum of two shape functions, and (3) makespan is less affected by the dispatching rule than by the scheduling procedure.
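As a point of reference for how a makespan can be computed, the sketch below uses the textbook recurrence for a permutation flowshop with unlimited buffers: a job starts on a machine once both the machine and the job's previous operation are finished. This is not necessarily the algorithm proposed in the paper, and the function name `makespan` and the input layout are assumptions made for illustration.

```python
def makespan(proc_times):
    """Makespan of a permutation flowshop.

    proc_times[j][m] is the processing time of job j on machine m;
    jobs are processed in list order and visit machines 0, 1, 2, ... in turn.
    Assumption: unlimited buffers between machines.
    """
    if not proc_times:
        return 0.0
    n_machines = len(proc_times[0])
    # completion[m]: time at which machine m finished the previous job.
    completion = [0.0] * n_machines
    for job in proc_times:
        prev_op_done = 0.0  # when this job left the previous machine
        for m, p in enumerate(job):
            start = max(completion[m], prev_op_done)
            completion[m] = start + p
            prev_op_done = completion[m]
    return completion[-1]

# Two jobs on two machines.
example = [[3, 2], [1, 4]]
total = makespan(example)  # 9.0
```

In the example, the second job waits for machine 1 until time 5, so the schedule ends at 5 + 4 = 9. The max-plus form of this recurrence is the same one that appears in last-passage percolation, which is one way to see why shape functions from percolation theory enter the analysis.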
• A strong and weak approximation scheme for stochastic differential equations driven by a time-changed Brownian motion(1408.4377)

Nov. 11, 2015 math.PR
This paper establishes a discretization scheme for a large class of stochastic differential equations driven by a time-changed Brownian motion with drift, where the time change is given by a general inverse subordinator. The scheme involves two types of errors: one generated by application of the Euler-Maruyama scheme and the other ascribed to simulation of the inverse subordinator. With the two errors carefully examined, the orders of strong and weak convergence are derived. Numerical examples are attached to support the convergence results.
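The two error sources named above can be seen in a small simulation sketch. It makes several assumptions that are not from the paper: the equation has the form dX = mu(X) dE + sigma(X) dB_E, the subordinator is a gamma process (chosen only because its increments are easy to sample with NumPy), and X is obtained as Y evaluated at E, where Y solves the corresponding classical SDE. The paper's scheme and its convergence orders may differ in detail.

```python
import numpy as np

def simulate_time_changed_sde(mu, sigma, x0, t_max, n_t=200, delta=1e-2, seed=0):
    """Sketch: approximate X_t = Y_{E_t} on [0, t_max].

    E is the inverse of a gamma subordinator D (an illustrative choice),
    Y solves dY = mu(Y) du + sigma(Y) dB_u by Euler-Maruyama with step delta.
    """
    rng = np.random.default_rng(seed)
    # 1. Subordinator D on the grid {0, delta, 2*delta, ...} until it passes t_max.
    d_path = [0.0]
    while d_path[-1] <= t_max:
        d_path.append(d_path[-1] + rng.gamma(shape=delta, scale=1.0))
    d_path = np.array(d_path)
    # 2. Inverse subordinator: E_t ~ delta * (first n with D[n] > t, minus 1).
    t_grid = np.linspace(0.0, t_max, n_t + 1)
    idx = np.searchsorted(d_path, t_grid, side="right") - 1
    e_path = idx * delta
    # 3. Euler-Maruyama for Y on the same delta-grid.
    n_steps = len(d_path) - 1
    y = np.empty(n_steps + 1)
    y[0] = x0
    db = rng.normal(0.0, np.sqrt(delta), size=n_steps)
    for n in range(n_steps):
        y[n + 1] = y[n] + mu(y[n]) * delta + sigma(y[n]) * db[n]
    # 4. Time change: read Y at the approximated inverse subordinator.
    return t_grid, e_path, y[idx]

# Mean-reverting drift with constant noise, as a usage example.
t, e, x = simulate_time_changed_sde(lambda v: -v, lambda v: 0.5,
                                    x0=1.0, t_max=1.0, n_t=50, seed=1)
```

Step 2 carries the error from approximating the inverse subordinator and step 3 the Euler-Maruyama error; both shrink as `delta` decreases.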
• Small ball probabilities for a class of time-changed self-similar processes(1502.07777)

Feb. 26, 2015 math.PR
This paper establishes small ball probabilities for a class of time-changed processes $X\circ E$, where $X$ is a self-similar process and $E$ is an independent continuous process, each with a certain small ball probability. In particular, examples of the outer process $X$ and the time change $E$ include an iterated fractional Brownian motion and the inverse of a general subordinator with infinite L\'evy measure, respectively. The small ball probabilities of such time-changed processes show power law decay, and the rate of decay does not depend on the small deviation order of the outer process $X$, but on the self-similarity index of $X$.
• Permutation test for dendrograms and its application to the analysis of mental lexicons(1403.2845)

March 12, 2014 math.ST, stat.TH
A novel type of permutation test for dendrogram data is studied with respect to two types of metrics for measuring the difference between dendrograms. First, the Frobenius norm is used, and we prove the consistency and efficiency of the permutation tests. Next, the geodesic distance on a dendrogram space is used. The uniqueness of the geodesics on every dendrogram space is proved and some existing algorithms for computing geodesics are applied. Mental lexicons of English words are analyzed as an application example of the proposed permutation tests. The difference in mental lexicons between native and non-native English speakers is examined by analyzing sorting task data that used English words taken from various word classes.
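A generic two-sample permutation test with a Frobenius-norm statistic can be sketched as follows. The encoding is an assumption: each dendrogram is represented by a square matrix (for instance its cophenetic distance matrix), and the statistic is the Frobenius norm of the difference between the two group mean matrices. The paper's exact statistic and its theoretical guarantees are not reproduced here; `permutation_test` is an illustrative name.

```python
import numpy as np

def permutation_test(group_a, group_b, n_perm=999, seed=0):
    """Two-sample permutation test on stacks of square matrices.

    group_a, group_b: arrays of shape (n_samples, k, k).
    Returns the observed statistic and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    pooled = np.concatenate([a, b], axis=0)
    n_a = a.shape[0]

    def stat(x, y):
        # Frobenius norm of the difference between group mean matrices.
        return np.linalg.norm(x.mean(axis=0) - y.mean(axis=0), ord="fro")

    observed = stat(a, b)
    count = 0
    for _ in range(n_perm):
        # Randomly reassign group labels and recompute the statistic.
        idx = rng.permutation(pooled.shape[0])
        if stat(pooled[idx[:n_a]], pooled[idx[n_a:]]) >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Two clearly separated groups of 2x2 matrices.
zeros = np.zeros((5, 2, 2))
ones = np.ones((5, 2, 2))
obs, p = permutation_test(zeros, ones)
```

With these inputs the observed statistic is the Frobenius norm of a 2x2 matrix of ones, and only label assignments that separate the groups perfectly reach it, so the p-value is small.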
• Computational algebraic methods in efficient estimation(1310.6515)

Jan. 10, 2014 math.ST, stat.TH
A strong link between information geometry and algebraic statistics is made by investigating statistical manifolds which are algebraic varieties. In particular it is shown how first and second order efficient estimators can be constructed, such as bias corrected Maximum Likelihood and more general estimators, and for which the estimating equations are purely algebraic. In addition it is shown how Gr\"obner basis technology, which is at the heart of algebraic statistics, can be used to reduce the degrees of the terms in the estimating equations. This points the way to the feasible use, to find the estimators, of special methods for solving polynomial equations, such as homotopy continuation methods. Simple examples are given showing both equations and computations. *** The proof of Theorem 2 was corrected in the latest version. Some minor errors were also corrected.
• On time-changed Gaussian processes and their associated Fokker-Planck-Kolmogorov equations(1011.2473)

Nov. 10, 2010 math.PR
This paper establishes Fokker-Planck-Kolmogorov type equations for time-changed Gaussian processes. Examples include those equations for a time-changed fractional Brownian motion with time-dependent Hurst parameter and for a time-changed Ornstein-Uhlenbeck process. The time-change process considered is the inverse of either a stable subordinator or a mixture of independent stable subordinators.
• Stochastic Calculus for a Time-changed Semimartingale and the Associated Stochastic Differential Equations(0906.5385)

Oct. 25, 2010 math.PR
It is shown that under a certain condition on a semimartingale and a time-change, any stochastic integral driven by the time-changed semimartingale is a time-changed stochastic integral driven by the original semimartingale. As a direct consequence, a specialized form of the Ito formula is derived. When a standard Brownian motion is the original semimartingale, classical Ito stochastic differential equations driven by the Brownian motion with drift extend to a larger class of stochastic differential equations involving a time-change with continuous paths. A form of the general solution of linear equations in this new class is established, followed by consideration of some examples analogous to the classical equations. Through these examples, each coefficient of the stochastic differential equations in the new class is given meaning. The new feature is the coexistence of a usual drift term along with a term related to the time-change.
• SDEs driven by a time-changed L\'evy process and their associated time-fractional order pseudo-differential equations(0907.0253)

June 22, 2010 math.PR
It is known that the transition probabilities of a solution to a classical It\^o stochastic differential equation (SDE) satisfy in the weak sense the associated Kolmogorov equation. The Kolmogorov equation is a partial differential equation with coefficients determined by the corresponding SDE. Time-fractional Kolmogorov type equations are used to model complex processes in many fields. However, the class of SDEs that is associated with these equations is unknown except in a few special cases. The present paper shows that in the cases of either time-fractional order or more general time-distributed order differential equations, the associated class of SDEs can be described within the framework of SDEs driven by semimartingales. These semimartingales are time-changed L\'evy processes where the independent time-change is given respectively by the inverse of a single stable subordinator or of a mixture of independent stable subordinators. Examples are provided, including a fractional analogue of the Feynman-Kac formula.
• Fokker-Planck-Kolmogorov equations associated with SDEs driven by time-changed fractional Brownian motion(1002.1494)

Feb. 7, 2010 math-ph, math.MP
In this paper Fokker-Planck-Kolmogorov type equations associated with stochastic differential equations driven by a time-changed fractional Brownian motion are derived. Two equivalent forms are suggested. The time-change process considered is either the first hitting time process for a stable subordinator or a mixture of stable subordinators. A family of operators arising in the representation of the Fokker-Planck-Kolmogorov equations is shown to have the semigroup property.
• Bayesian shrinkage prediction for the regression problem(math/0701583)

Jan. 21, 2007 math.ST, stat.TH
We consider Bayesian shrinkage predictions for the Normal regression problem under the frequentist Kullback-Leibler risk function. Firstly, we consider the multivariate Normal model with an unknown mean and a known covariance. While the unknown mean is fixed, the covariance of future samples can be different from training samples. We show that the Bayesian predictive distribution based on the uniform prior is dominated by that based on a class of priors if the prior distributions for the covariance and future covariance matrices are rotation invariant. Then, we consider a class of priors for the mean parameters depending on the future covariance matrix. With such a prior, we can construct a Bayesian predictive distribution dominating that based on the uniform prior. Lastly, applying this result to the prediction of response variables in the Normal linear regression model, we show that there exists a Bayesian predictive distribution dominating that based on the uniform prior. Minimaxity of these Bayesian predictions follows from these results.