• The widespread use of generalized linear models in case-control genetic studies has helped identify many disease-associated risk factors, typically defined as DNA variants, or single nucleotide polymorphisms (SNPs). Up to now, most of the literature has focused on selecting a unique best subset of SNPs based on some statistical criterion. In the presence of pronounced noise, however, multiple biological paths are often found to be equally supported by a given dataset when dealing with complex genetic diseases. We address the ambiguity related to SNP selection by constructing a list of models called a variable selection confidence set (VSCS), which contains the collection of all well-supported SNP combinations at a user-specified confidence level. The VSCS extends the familiar notion of confidence intervals to the variable selection setting and provides the practitioner with new tools that aid variable selection beyond trusting a single model. Based on the VSCS, we consider natural graphical and numerical statistics measuring the inclusion importance of a SNP based on its frequency in the most parsimonious VSCS models. This work is motivated by available case-control genetic data on age-related macular degeneration, a widespread complex disease and leading cause of vision loss.
  • The low ionization energy of an $A$ site molecule is a very important factor that determines the thermodynamic stability of $A$PbI$_3$ hybrid halide perovskites, while the size of the molecule governs the stable phase at room temperature and, eventually, the bandgap. It is challenging to achieve both a low ionization energy and a reasonable size for the PbI$_3$ cage to circumvent the stability issue inherent to hybrid halide perovskites. Here we propose a new three-membered charged ring radical, which demonstrates a low ionization energy that renders good stability for its corresponding perovskite and a reasonable cation size that translates into a suitable bandgap for photovoltaic applications. We use ab initio calculations to evaluate the polymorphism of the crystal structure of the proposed hybrid halide perovskite, its stability, and its electronic properties in comparison to mainstream perovskites, such as methylammonium and formamidinium lead iodide. Our results highlight the importance of van der Waals interactions for correctly predicting the polymorphism between the perovskite and hexagonal crystal structures.
  • The traditional activity of model selection aims at discovering a single model superior to other candidate models. In the presence of pronounced noise, however, multiple models are often found to explain the same data equally well. To resolve this model selection ambiguity, we introduce the general approach of model selection confidence sets (MSCSs) based on likelihood ratio testing. An MSCS is defined as a list of models statistically indistinguishable from the true model at a user-specified level of confidence, which extends the familiar notion of confidence intervals to the model-selection framework. Our approach guarantees asymptotically correct coverage probability of the true model when both sample size and model dimension increase. We derive conditions under which the MSCS contains all the relevant information about the true model structure. In addition, we propose natural statistics based on the MSCS to measure the importance of variables in a principled way that accounts for the overall model uncertainty. When the space of feasible models is large, the MSCS is implemented by an adaptive stochastic search algorithm which samples MSCS models with high probability. The MSCS methodology is illustrated through numerical experiments on synthetic data and real data examples.
  • In this paper, we study the problem of testing the mean vectors of high dimensional data in both one-sample and two-sample cases. The proposed testing procedures employ maximum-type statistics and parametric bootstrap techniques to compute the critical values. Unlike existing tests that rely heavily on structural conditions on the unknown covariance matrices, the proposed tests allow general covariance structures of the data and therefore enjoy a wide scope of applicability in practice. To enhance the power of the tests against sparse alternatives, we further propose two-step procedures with a preliminary feature screening step. Theoretical properties of the proposed tests are investigated. Through extensive numerical experiments on synthetic datasets and a human acute lymphoblastic leukemia gene expression dataset, we illustrate the performance of the new tests and how they may assist in detecting disease-associated gene-sets. The proposed methods have been implemented in the R package HDtest, which is available on CRAN.
  • Instability of hybrid organic-inorganic halide perovskites hinders their development for photovoltaic applications. First-principles calculations are used to evaluate the decomposition reaction enthalpy of hybrid halide perovskites, which is linked to the experimentally observed degradation of device characteristics. However, simple criteria for predicting the stability of halide perovskites are lacking, since Goldschmidt's tolerance and octahedral geometrical factors do not fully capture the formability of those perovskites. In this paper, we extend the Born-Haber cycle to partition the reaction enthalpy of various perovskite structures into lattice, ionization, and molecularization energy components. The analysis of the various contributions to the reaction enthalpy points to the ionization energy of a molecule and a cage as an additional criterion for predicting chemical trends in the stability of hybrid halide perovskites. Prospects of finding new perovskite structures with improved chemical stability aimed at photovoltaic applications are discussed.
  • Composite likelihood estimation has an important role in the analysis of multivariate data for which the full likelihood function is intractable. An important issue in composite likelihood inference is the choice of the weights associated with lower-dimensional data sub-sets, since the presence of incompatible sub-models can deteriorate the accuracy of the resulting estimator. In this paper, we introduce a new approach for simultaneous parameter estimation, called discriminative composite likelihood estimation (D-McLE), which tilts, or re-weights, each sub-likelihood component. The data-adaptive weights maximize the composite likelihood function, subject to moving a given distance from uniform weights; the resulting weights can then be used to rank lower-dimensional likelihoods in terms of their influence on the composite likelihood function. Our analytical findings and numerical examples support the stability of the resulting estimator compared to estimators constructed using standard composition strategies based on uniform weights. The properties of the new method are illustrated through simulated data and real spatial data on multivariate precipitation extremes.
  • This paper considers the problem of testing the equality of two unspecified distributions. The classical omnibus tests such as the Kolmogorov-Smirnov and Cram\'er-von Mises tests are known to suffer from low power against essentially all but location-scale alternatives. We propose a new two-sample test that modifies Neyman's smooth test and extends it to the multivariate case based on the idea of projection pursuit. The asymptotic null property of the test and its power against local alternatives are studied. The multiplier bootstrap method is employed to compute the critical value of the multivariate test. We establish the validity of the bootstrap approximation in the case where the dimension is allowed to grow with the sample size. Numerical studies show that the new testing procedures perform well even for small sample sizes and are powerful in detecting local features or high-frequency components.
  • Introduced in the field of many-body statistical mechanics, the Yang-Baxter equation has become an important tool in a variety of fields of physics. In this work, we report the first direct experimental simulation of the Yang-Baxter equation using linear quantum optics. The equality between the two sides of the Yang-Baxter equation in two dimensions has been demonstrated directly, and the spectral parameter transformation in the Yang-Baxter equation is explicitly confirmed.
  • Finding and realizing the optimal evolution between two states is significant both in theory and in application. In quantum mechanics, the minimal evolution time is bounded by the gap between the largest and smallest eigenvalues of the Hamiltonian. In the parity-time-symmetric (PT-symmetric) Hamiltonian theory, it was predicted that the optimized evolution time can be reduced drastically compared to the bound in the Hermitian case, and can even become zero. In this Letter, we report the experimental observation of the fast evolution of a PT-symmetric Hamiltonian in a nuclear magnetic resonance (NMR) quantum system. The experimental results demonstrate that the PT-symmetric Hamiltonian can indeed evolve much faster than in a Hermitian quantum system, and that the time it takes can be arbitrarily close to zero.
  • Masillo [1] commented on our manuscript [2], "Observation of a Fast Evolution in a Parity-time-symmetric System", pointing out a contradiction between our work and Ref. [3]. In this reply, we point out that there is no disagreement between Masillo's comment and our work in Ref. [2]. The efficiency cost pointed out in Ref. [1] exists: to obtain the PT-symmetric Hamiltonian evolution, one has to make a measurement on the auxiliary qubit, and the auxiliary qubit is in state $|0\rangle$ only probabilistically. This is reflected in the amplitude of the spectrum in the NMR quantum simulation. As a result, we made a small modification in a new version of Ref. [2], and Fig. 2 of Ref. [2] has been replaced by spectra for two different values of $\alpha$ in order to illustrate this fact.
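
The abstract on testing high-dimensional mean vectors describes maximum-type statistics whose critical values come from a parametric bootstrap. A minimal one-sample sketch of that idea is given below. It is not the HDtest implementation: the function name `max_type_mean_test`, the Gaussian bootstrap based on the sample covariance, the marginal standardization, and the default arguments are all assumptions made for illustration.

```python
import numpy as np


def max_type_mean_test(X, n_boot=1000, alpha=0.05, seed=0):
    """Illustrative one-sample test of H0: mean vector = 0.

    Hypothetical helper, not part of any package. Assumes a Gaussian
    parametric bootstrap driven by the sample covariance matrix.
    Returns (statistic, critical value, bootstrap p-value).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, p = X.shape

    xbar = X.mean(axis=0)
    S = np.atleast_2d(np.cov(X, rowvar=False))  # p x p sample covariance
    sd = np.sqrt(np.diag(S))                    # marginal standard deviations

    # Maximum-type statistic: largest standardized coordinate of the mean.
    stat = np.sqrt(n) * np.max(np.abs(xbar) / sd)

    # Parametric bootstrap: draw Z ~ N(0, S), which mimics sqrt(n) * xbar
    # under H0, and recompute the maximum for each draw.
    Z = rng.multivariate_normal(np.zeros(p), S, size=n_boot)
    boot = np.max(np.abs(Z) / sd, axis=1)

    crit = np.quantile(boot, 1.0 - alpha)
    pval = (1.0 + np.sum(boot >= stat)) / (1.0 + n_boot)
    return stat, crit, pval


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 20))
    X[:, 0] += 2.0  # sparse alternative: only one coordinate is shifted
    stat, crit, pval = max_type_mean_test(X)
    print(f"statistic={stat:.2f}, critical value={crit:.2f}, p-value={pval:.4f}")
```

Because the bootstrap draws use the full estimated covariance, the sketch does not impose any structural condition on it, which is the property the abstract emphasizes. The two-step procedure with preliminary feature screening is not shown here.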