• ### Ultrahigh-dimensional Robust and Efficient Sparse Regression using Non-Concave Penalized Density Power Divergence(1802.04906)

April 7, 2018 stat.ME
We propose a sparse regression method based on the non-concave penalized density power divergence loss function which is robust against infinitesimal contamination in very high dimensionality. Present methods of sparse and robust regression are based on $\ell_1$-penalization, and their theoretical properties are not well-investigated. In contrast, we use a general class of folded concave penalties that ensure sparse recovery and consistent estimation of regression coefficients. We propose an alternating algorithm based on the Concave-Convex procedure to obtain our estimate, and demonstrate its robustness properties using influence function analysis. Under some conditions on the fixed design matrix and penalty function, we prove that this estimator possesses large-sample oracle properties in an ultrahigh-dimensional regime. The performance and effectiveness of our proposed method for parameter estimation and prediction, compared to state-of-the-art methods, are demonstrated through simulation studies.
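To make the robustness idea concrete, the sketch below evaluates the density power divergence (DPD) loss for a linear model with normal errors, using the form commonly given in the DPD literature. It is an illustration only: the function name `dpd_loss`, the tuning parameter `alpha` and the fixed `sigma` are assumptions, and neither the folded concave penalty nor the paper's alternating algorithm is included.

```python
import math

def dpd_loss(residuals, sigma, alpha):
    """Average density power divergence loss for normal errors.

    residuals: list of y_i - x_i' beta (hypothetical inputs)
    sigma:     error standard deviation, treated as known here
    alpha:     robustness tuning parameter, alpha > 0
    """
    if alpha <= 0 or sigma <= 0:
        raise ValueError("alpha and sigma must be positive")
    scale = (2.0 * math.pi) ** (-alpha / 2.0) * sigma ** (-alpha)
    # Integral of f^(1+alpha) for a normal density; identical for every observation.
    const = scale / math.sqrt(1.0 + alpha)
    total = 0.0
    for r in residuals:
        # f(y_i)^alpha decays to zero for large |r|, so an outlier's term is bounded.
        f_alpha = scale * math.exp(-alpha * r * r / (2.0 * sigma ** 2))
        total += const - (1.0 + 1.0 / alpha) * f_alpha
    return total / len(residuals)
```

A gross outlier contributes at most the constant term to the sum, whereas under squared-error loss its contribution would grow without bound.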
• ### Joint Estimation and Inference for Data Integration Problems based on Multiple Multi-layered Gaussian Graphical Models(1803.03348)

March 9, 2018 math.ST, stat.TH, stat.ME, stat.ML
The rapid development of high-throughput technologies has enabled the generation of data from biological or disease processes that span multiple layers, like genomic, proteomic or metabolomic data, and further pertain to multiple sources, like disease subtypes or experimental conditions. In this work, we propose a general statistical framework based on Gaussian graphical models for horizontal (i.e. across conditions or subtypes) and vertical (i.e. across different layers containing data on molecular compartments) integration of information in such datasets. We start with decomposing the multi-layer problem into a series of two-layer problems. For each two-layer problem, we model the outcomes at a node in the lower layer as dependent on those of other nodes in that layer, as well as all nodes in the upper layer. We use a combination of neighborhood selection and group-penalized regression to obtain sparse estimates of all model parameters. Following this, we develop a debiasing technique and asymptotic distributions of inter-layer directed edge weights that utilize already computed neighborhood selection coefficients for nodes in the upper layer. Subsequently, we establish global and simultaneous testing procedures for these edge weights. Performance of the proposed methodology is evaluated on synthetic data.
• ### Simultaneous Selection of Multiple Important Single Nucleotide Polymorphisms in Familial Genome Wide Association Studies Data(1802.01141)

Feb. 22, 2018 stat.AP
We propose a resampling-based fast variable selection technique for selecting important Single Nucleotide Polymorphisms (SNPs) in multi-marker mixed effect models used in twin studies. Due to computational complexity, current practice includes testing the effect of one SNP at a time, commonly termed 'single SNP association analysis'. Joint modeling of genetic variants within a gene or pathway may have better power to detect the relevant genetic variants, hence we adapt our recently proposed framework of $e$-values to address this. In this paper, we propose a computationally efficient approach for single SNP detection in families while utilizing information on multiple SNPs simultaneously. We achieve this through improvements in two aspects. First, unlike other model selection techniques, our method only requires training a model with all possible predictors. Second, we utilize a fast and scalable bootstrap procedure that only requires Monte-Carlo sampling to obtain bootstrapped copies of the estimated vector of coefficients. Using this bootstrap sample, we obtain the $e$-value for each SNP, and select SNPs having $e$-values below a threshold. We illustrate through numerical studies that our method is more effective in detecting SNPs associated with a trait than either single-marker analysis using family data or model selection methods that ignore the familial dependency structure. We also use the $e$-values to perform gene-level analysis in nuclear families and detect several SNPs that have been implicated to be associated with alcohol consumption.
• ### Correlations of the feedback energy and BCG radio luminosity in galaxy clusters(1802.00454)

Feb. 1, 2018 astro-ph.CO, astro-ph.GA
We study the excess entropy and the corresponding non-gravitational feedback energy ($E_{feedback}$) in the intra-cluster medium (ICM) by considering a sample of 22 galaxy clusters using Chandra X-ray and NRAO VLA Sky Survey (NVSS)/Giant Metrewave Radio Telescope (GMRT) radio observations. We find moderate to strong correlation of the brightest cluster galaxy (BCG) radio luminosity ($L_R$) with the feedback energy and various other cluster thermal properties. We show conclusively that the active galactic nucleus (AGN) is more efficient in transferring feedback energy to the ICM in less massive clusters. In particular, we find that for the sample of 11 clusters with optically confirmed BCGs, BCG radio luminosity scales with feedback energy as $L_R\propto E_{feedback}^{4.52\pm1.84}$ and $E_{feedback}/L_{R} \propto T_{obs}^{-6.0}$ (or $E_{feedback}/L_{R} \propto M_{500}^{-3.36}$). Finally, we discuss the implications of our results with regard to feedback in clusters.
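A scaling relation such as $L_R\propto E_{feedback}^{4.52}$ is a straight line in log-log space, so its exponent can be read off as a slope. The sketch below shows this with ordinary least squares on synthetic, noise-free numbers. The function name `fit_power_law` and the data are hypothetical, and the abstract does not say which fitting procedure the authors used (a real analysis would also account for measurement uncertainties).

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of log10(y) = a + b*log10(x); returns (a, b)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    slope = sxy / sxx
    return my - slope * mx, slope

# Synthetic, noise-free values in arbitrary units (not the paper's data).
energy = [1.0, 2.0, 4.0, 8.0]
lum = [3.0 * e ** 4.52 for e in energy]
intercept, slope = fit_power_law(energy, lum)
```

Because the synthetic data follow an exact power law, the fitted slope recovers the input exponent and `10 ** intercept` recovers the normalisation.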
• ### X-ray and SZ constraints on the properties of hot CGM(1801.06557)

Jan. 19, 2018 astro-ph.CO, astro-ph.GA
We use observations of stacked X-ray luminosity and Sunyaev-Zel'dovich (SZ) signal from a cosmological sample of $\sim 80,000$ and $104,000$ massive galaxies, respectively, with $10^{12.6}\lesssim M_{500} \lesssim 10^{13} M_{\odot}$ and mean redshift $\bar{z} \sim 0.1$ - $0.14$, to constrain the hot Circumgalactic Medium (CGM) density and temperature. The X-ray luminosities constrain the density and hot CGM mass, while the SZ signal helps in breaking the density-temperature degeneracy. We consider a simple power-law density distribution ($n_e \propto r^{-3\beta}$) as well as a hydrostatic hot halo model, with the gas assumed to be isothermal in both cases. The datasets are best described by the mean hot CGM profile $\propto r^{-1.2}$, which is shallower than an NFW profile. For halo virial mass $\sim 10^{12}$ - $10^{13} M_{\odot}$, the hot CGM contains $\sim$ 20 - 30\% of galactic baryonic mass for the power-law model and 4 - 11\% for the hydrostatic halo model, within the virial radii. For the power-law model, the hot CGM profile broadly agrees with observations of the Milky Way. The mean hot CGM mass is comparable to or larger than the mass contained in other phases of the CGM for $L^*$ galaxies.
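For the power-law model, the enclosed gas mass follows directly from the density profile. A short worked step, assuming the quoted $r^{-1.2}$ slope refers to the density (so $3\beta = 1.2$) and writing $\mu_e m_p$ for the mean mass per electron (a symbol introduced here, not taken from the abstract):

$$M_{\rm gas}(<R) = 4\pi \mu_e m_p \int_0^R n_e(r)\, r^2\, dr \;\propto\; \int_0^R r^{2-3\beta}\, dr = \frac{R^{3-3\beta}}{3-3\beta}, \qquad \beta < 1 .$$

Under that assumption $M_{\rm gas}(<R) \propto R^{1.8}$, so the enclosed hot gas mass keeps growing steeply with radius, and the mass fraction within the virial radius is sensitive to the outer profile.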
• ### AGN feedback with the Square Kilometer Array (SKA) and implications for cluster physics and cosmology(1705.04444)

Oct. 17, 2017 astro-ph.CO
AGN feedback is regarded as an important non-gravitational process in galaxy clusters, providing useful constraints on large-scale structure formation. It modifies the structure and energetics of the intra-cluster medium (ICM), and hence understanding it is crucial for using clusters as high-precision cosmological probes. In this context, particularly keeping in mind the upcoming high quality radio data expected from radio surveys like SKA with its higher sensitivity and high spatial and spectral resolutions, we review our current understanding of AGN feedback, its cosmological implications and the impact that SKA can have in revolutionizing our understanding of AGN feedback in large-scale structures. Recent developments regarding AGN outbursts and their possible contribution to excess entropy in the hot atmospheres of groups and clusters, their correlation with the feedback energy in the ICM, quenching of cooling flows and the possible connection between cool core clusters and radio mini-halos are discussed. We describe current major issues regarding modeling of AGN feedback and its impact on the surrounding medium. With regard to the future of AGN feedback studies, we examine the possible breakthroughs that can be expected from SKA observations. In the context of cluster cosmology, for example, we point out the importance of SKA observations for cluster mass calibration by noting that most of the $z>1$ clusters discovered by the eROSITA X-ray mission can be expected to be followed up through a 1000 hour SKA-1 mid programme. Moreover, approximately $1000$ radio mini halos and $\sim 2500$ radio halos at $z<0.6$ can potentially be detected by SKA1 and SKA2 and used as tracers of galaxy clusters and for determining the cluster selection function.
• ### Excess entropy and energy feedback from within cluster cores up to r$_{200}$(1703.00028)

Aug. 3, 2017 astro-ph.CO
We estimate the "non-gravitational" entropy-injection profiles, $\Delta K$, and the resultant energy feedback profiles, $\Delta E$, of the intracluster medium for 17 clusters using their Planck SZ and ROSAT X-ray observations, spanning a large radial range from $0.2r_{500}$ up to $r_{200}$. The feedback profiles are estimated by comparing the observed entropy, at fixed gas mass shells, with theoretical entropy profiles predicted from non-radiative hydrodynamic simulations. We include non-thermal pressure and gas clumping in our analysis. The inclusion of non-thermal pressure and clumping changes the estimates for $r_{500}$ and $r_{200}$ by 10\%-20\%. Not considering clumpiness leads to an under-estimation of $\Delta K\approx300$ keV cm$^2$ at $r_{500}$ and $\Delta K\approx1100$ keV cm$^2$ at $r_{200}$. On the other hand, neglecting non-thermal pressure results in an over-estimation of $\Delta K\approx 100$ keV cm$^2$ at $r_{500}$ and under-estimation of $\Delta K\approx450$ keV cm$^2$ at $r_{200}$. For the estimated feedback energy, we find that ignoring clumping leads to an under-estimation of energy per particle $\Delta E\approx1$ keV at $r_{500}$ and $\Delta E\approx1.5$ keV at $r_{200}$. Similarly, neglect of the non-thermal pressure results in an over-estimation of $\Delta E\approx0.5$ keV at $r_{500}$ and under-estimation of $\Delta E\approx0.25$ keV at $r_{200}$. We find that an entropy floor of $\Delta K\approx300$ keV cm$^2$ is ruled out at $\approx3\sigma$ throughout the entire radial range and $\Delta E\approx1$ keV at more than 3$\sigma$ beyond $r_{500}$, strongly constraining ICM pre-heating scenarios. We also demonstrate robustness of the results with respect to sample selection, X-ray analysis procedures, entropy modeling, etc.
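The entropy quoted in keV cm$^2$ is the usual cluster "entropy" $K = k_B T / n_e^{2/3}$, and the excess is the difference between the observed and theoretical values in the same gas mass shell. A minimal sketch of that bookkeeping follows. The function names and the numbers in the usage note are hypothetical, and the corrections for clumping and non-thermal pressure described in the abstract are not modelled.

```python
def entropy_kev_cm2(kT_keV, n_e_cm3):
    """Gas entropy K = kT / n_e^(2/3), in keV cm^2."""
    if n_e_cm3 <= 0:
        raise ValueError("electron density must be positive")
    return kT_keV / n_e_cm3 ** (2.0 / 3.0)

def entropy_excess(kT_obs, ne_obs, K_theory):
    """Delta K = K_obs - K_theory, compared in the same gas mass shell."""
    return entropy_kev_cm2(kT_obs, ne_obs) - K_theory
```

For example, gas at $k_BT = 1$ keV with $n_e = 10^{-3}$ cm$^{-3}$ has $K = 100$ keV cm$^2$, which makes clear how demanding a floor of 300 keV cm$^2$ would be.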
• ### Fast and General Model Selection using Data Depth and Resampling(1706.02429)

July 29, 2017 math.ST, stat.TH, stat.ME
We propose a technique using data-depth functions and resampling to simultaneously assign a score, called an \textit{e-value}, to statistical models and conduct inference using that model, in a very general framework. The \textit{e-value} may be used to select models, and we establish that under general conditions, it can separate statistical models that adequately explain properties of the data from those that do not. Our resampling-based approach achieves concurrent ranking of models and consistent approximation of sampling distribution of parameter estimators under any model, thus enabling inference within each model. Consequently, our proposal is one of simultaneous model discovery and inference. This results in a fast and parallel algorithm that fits only a single model and evaluates $p +1$ models, as opposed to the traditional requirement of fitting and evaluating $2^{p}$ models. We illustrate in simulation experiments that our proposed method typically performs better than or competitively with currently used methods for model selection, in linear models and fixed effect selection in linear mixed models. As a real data application, we use our procedure to elicit climatic drivers of Indian summer monsoon precipitation.
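The computational claim can be made concrete with a line of arithmetic: evaluating $p+1$ models grows linearly in the number of predictors, while exhaustive search grows exponentially. The helper below is purely illustrative; its name is an assumption and it does not implement the e-value procedure itself.

```python
def models_to_evaluate(p):
    """Return (e-value approach, exhaustive search) model counts for p predictors."""
    return p + 1, 2 ** p
```

With `p = 20` this gives 21 evaluations against 1,048,576 candidate models.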
• In the context of the ESA M5 (medium mission) call we proposed a new satellite mission, Theia, based on relative astrometry and extreme precision to study the motion of very faint objects in the Universe. Theia is primarily designed to study the local dark matter properties, the existence of Earth-like exoplanets in our nearest star systems and the physics of compact objects. Furthermore, about 15 $\%$ of the mission time was dedicated to an open observatory for the wider community to propose complementary science cases. With its unique metrology system and "point and stare" strategy, Theia's precision would have reached the sub micro-arcsecond level. This is about 1000 times better than ESA/Gaia's accuracy for the brightest objects and represents a factor 10-30 improvement for the faintest stars (depending on the exact observational program). In the version submitted to ESA, we proposed an optical (350-1000nm) on-axis TMA telescope. Due to ESA Technology readiness level, the camera's focal plane would have been made of CCD detectors but we anticipated an upgrade with CMOS detectors. Photometric measurements would have been performed during slew time and stabilisation phases needed for reaching the required astrometric precision.
• ### Nonconvex penalized multitask regression using data depth-based penalties(1610.07540)

June 23, 2017 stat.ME
We propose a new class of nonconvex penalty functions, based on data depth functions, for multitask sparse penalized regression. These penalties quantify the relative position of rows of the coefficient matrix from a fixed distribution centered at the origin. We derive the theoretical properties of an approximate one-step sparse estimator of the coefficient matrix using local linear approximation of the penalty function, and provide an algorithm for its computation. For orthogonal design and independent responses, the resulting thresholding rule enjoys near-minimax optimal risk performance, similar to the adaptive lasso (Zou, 2006). A simulation study and real data analysis demonstrate its effectiveness compared to some of the present methods that provide sparse solutions in multivariate regression.
• ### Constraining the X-ray AGN halo occupation distribution: implications for eROSITA(1608.05184)

Jan. 21, 2017 astro-ph.HE
The X-ray emission from active galactic nuclei (AGNs) is a major component of the extragalactic X-ray sky. In this paper, we use the X-ray luminosity function (XLF) and halo occupation distribution (HOD) formalism to construct a halo model for the X-ray emission from AGNs. Verifying that the two inputs (XLF and HOD) are in agreement with each other, we compute the auto-correlation power spectrum in the soft X-ray band (0.5-2 keV) due to the AGNs potentially resolved by the eROSITA (extended ROentgen Survey with an Imaging Telescope Array) mission and explore the redshift and mass dependence of the power spectrum. Studying the relative contribution of the Poisson and the clustering terms to the total power, we find that at multipoles $l\lesssim 1000$ (i.e. large scales), the clustering term is larger than the Poisson term. We also forecast the potential of the X-ray auto-correlation power spectrum and the X-ray-lensing cross-correlation power spectrum using eROSITA and eROSITA-LSST (Large Synoptic Survey Telescope) surveys, respectively, to constrain the HOD parameters and their redshift evolution. In addition, we compute the power spectrum of the AGNs lying below the flux resolution limit of eROSITA, which is essential to understand in order to extract the X-ray signal from the hot diffuse gas present in galaxies and clusters.
• ### The many scales to cosmic homogeneity: Use of multiple tracers from the SDSS(1611.07915)

Nov. 23, 2016 astro-ph.CO
We carry out multifractal analyses of multiple tracers, namely the main galaxy sample, the LRG sample and the quasar sample from the SDSS, to test the assumption of cosmic homogeneity and identify the scale of transition to homogeneity, if any. We consider the behaviour of the scaled number counts and the scaling relations of different moments of the galaxy number counts in spheres of varying radius $R$ to calculate the spectrum of the Minkowski-Bouligand general dimension $D_{q} (R)$ for $-4 \leq q \leq 4$. The present analysis provides us the opportunity to study the spectrum of the generalized dimension $D_{q}(R)$ for multiple tracers of the cosmic density field over a wide range of length scales and allows us to confidently test the validity of the assumption of cosmic homogeneity. Our analysis indicates that the SDSS main galaxy sample is homogeneous on length scales of $80\, h^{-1}\, {\rm Mpc}$ and beyond, whereas the SDSS quasar sample and the SDSS LRG sample show a transition to homogeneity on even larger length scales, at $\sim 150\, h^{-1}\, {\rm Mpc}$ and $\sim 230\, h^{-1}\, {\rm Mpc}$ respectively. These differences in the scale of homogeneity arise due to the effective mass and redshift scales probed by the different tracers in a Universe where structures form hierarchically. Our results reaffirm the validity of cosmic homogeneity on large scales irrespective of the tracers used and strengthen the foundations of the Standard Model of Cosmology.
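One common counts-in-spheres estimator of the generalized dimension takes the moment $Z_q(R) = \frac{1}{M}\sum_i n_i(<R)^{q-1}$ over $M$ centres and sets $D_q = \frac{1}{q-1}\, d\ln Z_q / d\ln R$ for $q \neq 1$. The sketch below applies a finite-difference version between two radii. Whether the paper uses exactly this estimator is an assumption, and the function name and inputs are hypothetical.

```python
import math

def generalized_dimension(counts_r1, counts_r2, r1, r2, q):
    """Finite-difference estimate of D_q between radii r1 < r2 (q != 1).

    counts_r1, counts_r2: neighbour counts n_i(<R) around the same centres.
    """
    if q == 1:
        raise ValueError("q = 1 needs the separate logarithmic limit")
    z1 = sum(n ** (q - 1) for n in counts_r1) / len(counts_r1)
    z2 = sum(n ** (q - 1) for n in counts_r2) / len(counts_r2)
    return (math.log(z2) - math.log(z1)) / ((q - 1) * (math.log(r2) - math.log(r1)))
```

If every count scales as $R^3$, as expected for a homogeneous distribution in three dimensions, the function returns 3 for any allowed $q$; a departure from 3 at a given scale signals inhomogeneity at that scale.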
• ### Little evidence for entropy and energy excess beyond $r_{500}$ - An end to ICM preheating?(1606.00014)

Oct. 27, 2016 astro-ph.CO
Non-gravitational feedback affects the nature of the intra-cluster medium (ICM). X-ray cooling of the ICM and in situ energy feedback from AGNs and SNe, as well as {\it preheating} of the gas at epochs preceding the formation of clusters, are proposed mechanisms for such feedback. While cooling and AGN feedback are dominant in cluster cores, the signatures of a preheated ICM are expected to be present even at large radii. To estimate the degree of preheating, with minimum confusion from AGN feedback/cooling, we study the excess entropy and non-gravitational energy profiles up to $r_{200}$ for a sample of 17 galaxy clusters using joint data sets of {\it Planck} SZ pressure and {\it ROSAT/PSPC} gas density profiles. The canonical value of the preheating entropy floor of $\gtrsim 300$ keV cm$^2$, needed in order to match cluster scalings, is ruled out at $\approx 3\sigma$. We also show that a feedback energy of 1 keV/particle is ruled out at 5.2$\sigma$ beyond $r_{500}$. Our analysis takes both non-thermal pressure and clumping into account, which can be important in outer regions. Our results, based on the direct probe of the ICM in the outermost regions, do not support any significant preheating.
• ### Robust estimation of principal components from depth-based multivariate rank covariance matrix(1502.07042)

March 9, 2016 math.ST, stat.TH
Analyzing principal components for multivariate data from its spatial sign covariance matrix (SCM) has been proposed as a computationally simple and robust alternative to normal PCA, but it suffers from poor efficiency properties and is actually inadmissible with respect to the maximum likelihood estimator. Here we use data depth-based spatial ranks in place of spatial signs to obtain the orthogonally equivariant Depth Covariance Matrix (DCM) and use its eigenvector estimates for PCA. We derive asymptotic properties of the sample DCM and influence functions of its eigenvectors. The shapes of these influence functions indicate robustness of estimated principal components, and good efficiency properties compared to the SCM. Finite sample simulation studies show that principal components of the sample DCM are robust with respect to deviations from normality, and are more efficient than the SCM and its affine equivariant version, Tyler's shape matrix. Through two real data examples, we also show the effectiveness of DCM-based PCA in analyzing high-dimensional data and outlier detection, and compare it with other methods of robust PCA.
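For reference, the spatial sign of a centred observation is $x/\lVert x\rVert$, and the sample SCM is the average outer product of these unit vectors. The sketch below computes it in plain Python. The function name and the requirement to pass a `center` are assumptions, and the depth-based ranks that define the DCM are not implemented here.

```python
import math

def spatial_sign_covariance(points, center):
    """Sample spatial sign covariance matrix about a given center.

    points: list of equal-length coordinate lists; center: location estimate.
    """
    d = len(center)
    n = len(points)
    scm = [[0.0] * d for _ in range(d)]
    for x in points:
        diff = [xi - ci for xi, ci in zip(x, center)]
        norm = math.sqrt(sum(v * v for v in diff))
        if norm == 0.0:
            continue  # the spatial sign of the zero vector is taken as zero
        sign = [v / norm for v in diff]
        for i in range(d):
            for j in range(d):
                scm[i][j] += sign[i] * sign[j]
    return [[v / n for v in row] for row in scm]
```

Every nonzero observation is projected onto the unit sphere, so a point at distance 1000 from the centre carries the same weight as a point at distance 1. This is the source of the robustness, and also of the efficiency loss, since all magnitude information is discarded.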
• ### Probing the circumgalactic baryons through cross-correlations(1505.03658)

Dec. 15, 2015 astro-ph.CO
We study the cross-correlation of distribution of galaxies, the Sunyaev-Zel'dovich (SZ) and X-ray power spectra of galaxies from current and upcoming surveys and show these to be excellent probes of the nature, i.e. extent, evolution and energetics, of the circumgalactic medium (CGM). The SZ-galaxy cross-power spectrum, especially at large multipoles, depends on the steepness of the pressure profile of the CGM. This property of the SZ signal can, thus, be used to constrain the pressure profile of the CGM. The X-ray cross power spectrum also has a similar shape. However, it is much more sensitive to the underlying density profile. We forecast the detectability of the cross-correlated galaxy distribution, SZ and X-ray signals by combining South Pole Telescope-Dark Energy Survey (SPT-DES) and eROSITA-DES/eROSITA-LSST (extended ROentgen Survey with an Imaging Telescope Array-Large Synoptic Survey Telescope) surveys, respectively. We find that, for the SPT-DES survey, the signal-to-noise ratio (SNR) peaks at high mass and redshift with SNR $\sim 9$ around $M_h\sim 10^{13} h^{-1} M_{\odot}$ and $z\sim 1.5\hbox{--} 2$ for flat density and temperature profiles. The SNR peaks at $\sim 6 (12 )$ for the eROSITA-DES (eROSITA-LSST) surveys. We also perform a Fisher matrix analysis to find the constraint on the gas fraction in the CGM in the presence or absence of an unknown redshift evolution of the gas fraction. Finally, we demonstrate that the cross-correlated SZ-galaxy and X-ray-galaxy power spectrum can be used as powerful probes of the CGM energetics and potentially discriminate between different feedback models recently proposed in the literature; for example, one can distinguish a 'no active galactic nuclei feedback' scenario from a CGM energized by 'fixed-velocity hot winds' at greater than $3\sigma$.
• ### CMB distortion from circumgalactic gas(1408.4896)

Jan. 26, 2015 astro-ph.CO
We study the Sunyaev-Zel'dovich (SZ) distortion of the cosmic microwave background radiation (CMBR) from extensive circumgalactic gas (CGM) in massive galactic halos. Recent observations have shown that galactic halos contain a large amount of X-ray emitting gas at the virial temperature, as well as a significant amount of warm OVI absorbing gas. We consider the SZ distortion from the hot gas in those galactic halos in which the gas cooling time is longer than the halo destruction time scale. We show that the SZ distortion signal from the hot gas in these galactic halos at redshifts $z\approx 1\hbox{--}8$ can be significant at small angular scales ($\ell\sim 10^4$), and dominate over the signal from galaxy clusters. The estimated SZ signal for most massive galaxies (halo mass $\ge 10^{12.5}$ M$_\odot$) is consistent with the marginal detection by {\it Planck} at these mass scales. We also consider the SZ effect from warm circumgalactic gas. The integrated Compton distortion from the warm OVI absorbing gas is estimated to be $y\sim 10^{-8}$, which could potentially be detected by experiments planned for the near future. Finally, we study the detectability of the SZ signal from circumgalactic gas in two types of surveys, a simple extension of the SPT survey and a more futuristic cosmic variance-limited survey. We find that these surveys can easily detect the kSZ signal from CGM. With the help of a Fisher Matrix analysis, we find that it will be possible for these surveys to constrain the gas fraction in CGM, after marginalizing over cosmological parameters, to $\le 33$\%, in case of no redshift evolution of the gas fraction.
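For orientation, the integrated Compton distortion quoted above is the standard thermal SZ $y$-parameter, a line-of-sight integral of the electron pressure. The symbols below are defined here rather than taken from the abstract: $\sigma_T$ is the Thomson cross-section, $m_e c^2$ the electron rest energy, and $n_e$ and $T_e$ the electron density and temperature along the path $l$:

$$y = \frac{\sigma_T}{m_e c^2} \int n_e\, k_B T_e \, dl .$$

Because $y$ depends on the product $n_e T_e$, the signal is linear in density, which is why it remains a useful probe of diffuse halo gas, whereas X-ray emission scales as $n_e^2$.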
• ### Post-Planck Dark Energy Constraints(1310.6161)

March 12, 2014 hep-th, hep-ph, gr-qc, astro-ph.CO
We constrain plausible dark energy models, parametrized by multiple candidate equations of state, using the recently published Cosmic Microwave Background (CMB) temperature anisotropy data from Planck together with the WMAP-9 low-$\ell$ polarization data and data from low redshift surveys. To circumvent the limitations of any particular equation of state towards describing all existing dark energy models, we work with three different equations of state covering a broader class of dark energy models and, hence, provide more robust and generic constraints on the dark energy properties. We show that a clear tension exists between dark energy constraints from CMB and non-CMB observations when one allows for dark energy models having both phantom and non-phantom behavior; while the CMB is more favorable to phantom models, the low-z data prefer models with behavior close to a Cosmological Constant. Further, we reconstruct the equation of state of dark energy as a function of redshift using the results from combined CMB and non-CMB data and find that the Cosmological Constant lies outside the 1$\sigma$ band for multiple dark energy models allowing phantom behavior. Considerable fine tuning is needed to keep models with a strict non-phantom history inside the 2$\sigma$ allowed range. This result might motivate one to construct phantom models of dark energy, which is achievable in the presence of higher derivative operators as in string theory. However, disallowing phantom behavior, based only on a strong theoretical prior, leads to both CMB and non-CMB datasets agreeing on the nature of dark energy, with the mean equation of state being very close to the Cosmological Constant. Finally, to illustrate the impact of additional dark energy parameters on other cosmological parameters, we provide the cosmological parameter constraints for different dark energy models.
• ### AGN feedback and entropy injection in galaxy cluster cores(1211.3817)

July 25, 2013 astro-ph.CO, astro-ph.GA
The non-gravitational energy feedback is of crucial importance in modeling/simulating clusters to be used as cosmological probes. AGNs are, arguably, of primary importance in injecting energy in the cluster cores. We make the first estimate of non-gravitational energy {\it profiles} in galaxy cluster cores (and beyond) from observational data. Comparing the observed entropy profiles within $r_{500}$, from the Representative {\it XMM-Newton} Cluster Structure Survey (REXCESS), to simulated entropy profiles from both AMR and SPH non-radiative simulations, we estimate the amount of non-gravitational energy, $E_{\rm ICM}$, contained in the ICM. Adding the radiative losses we estimate the total energy feedback, $E_{\rm Feedback}$, into the clusters. The profiles for the energy deposition, $\Delta E_{\rm ICM}(x)$, in the inner regions differ for Cool-Core (CC) and Non Cool-Core (NCC) clusters, decreasing after accounting for the radiative cooling. The total feedback energy scales with the mean spectroscopic temperature as $E_{\rm Feedback} \propto T_{\rm sp}^{2.52 \pm0.08}$ and $E_{\rm Feedback} \propto T_{\rm sp}^{2.17 \pm 0.11}$, when compared with the baseline SPH and AMR profiles respectively. The scatter in the two cases is 15% and 23%, respectively. The mean non-gravitational energy per particle within $r_{500}$, is $\epsilon_{\rm ICM} = {2.8} \pm {0.8}$ keV for the SPH theoretical relation and $\epsilon_{\rm ICM} = {1.7} \pm {0.9}$ keV for the AMR theoretical relation. We use the {\it NRAO/VLA Sky Survey} (NVSS) source catalog to determine the radio luminosity, $L_R$, at 1.4 GHz of the central source(s) of our sample. For $T_{\rm sp} > 3$ keV, the $E_{\rm Feedback}$ correlates with $L_R$. We show that AGNs could provide a significant component of the feedback. (Abridged)
• ### Adjusting for Treatment Effects in Studies of Quantitative Traits(1305.7284)

May 31, 2013 stat.AP
A population-based study of a quantitative trait, e.g. blood pressure (BP), may be seriously compromised when the trait is subject to the effects of a treatment. Without appropriate corrections this can lead to a considerable reduction of statistical power. Here we demonstrate this in the scenario of QTL mapping through Single-Marker Analysis. The data are simulated from a normal mixture for different values of allele frequencies, separation between normal populations and Linkage Disequilibrium, and several methods of correction are compared to check which can best compensate for the loss of power if treatment effects are ignored. In one of these methods, underlying BPs are approximated by subtracting an estimate of the mean value of the medicine effect from observed BPs in treated subjects. We demonstrate the efficacy of this method across different choices of parameters. Finally, to account for quantitative traits that follow non-normal distributions, data are similarly simulated from lognormal mixtures and the Kruskal-Wallis test is used to obtain estimates of power for different methods of analysis.
• ### Adapting the Interrelated Two-way Clustering method for Quantitative Structure-Activity Relationship (QSAR) Modeling of a Diverse Set of Chemical Compounds(1305.7285)

May 31, 2013 stat.CO
Interrelated Two-way Clustering (ITC) is an unsupervised clustering method developed to divide samples into two groups in gene expression data obtained through microarrays, selecting important genes simultaneously in the process. This has been found to be a better approach than conventional clustering methods like K-means or self-organizing maps for scenarios where the number of samples is much smaller than the number of variables ($n \ll p$). In this paper we use the ITC approach for classification of a diverse set of 508 chemicals regarding mutagenicity. A large number of topological indices (TIs), 3-dimensional, and quantum chemical descriptors, as well as atom pairs (APs), have been used as explanatory variables. ITC has been used only for predictor selection, after which ridge regression is employed to build the final predictive model. The proper leave-one-out (LOO) method of cross-validation in this scenario is to take as holdout each of the 508 compounds before predictor thinning and compare the predicted values with the experimental data. ITC-based results obtained here are comparable to those developed earlier.
• ### Searching for systematics in SNIa and galaxy cluster data using the cosmic duality relation(1212.1277)

May 20, 2013 astro-ph.CO
We compare two different probes of the expansion history of the universe, namely, luminosity distances from type Ia supernovae and angular diameter distances from galaxy clusters, using the Bayesian interpretation of Crossing statistic [1, 2] in conjunction with the assumption of cosmic duality relation. Our analysis is conducted independently of any a priori assumptions about the nature of dark energy. The model-independent method which we invoke searches for inconsistencies between SNIa and galaxy cluster data sets. If detected, such an inconsistency would imply the presence of systematics in either of the two data sets. Simulating observations based on expected WFIRST supernovae data and X-ray eROSITA + SZ Planck cluster data, we show that our method allows one to detect systematics with high precision and without advancing any hypothesis about the nature of dark energy.
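The cosmic duality (distance duality) relation states that the luminosity distance and angular diameter distance to the same redshift satisfy $d_L = (1+z)^2 d_A$. A minimal consistency check is sketched below. The function name and the idea of using a simple ratio are illustrative assumptions; the paper itself works with the Crossing statistic rather than this ratio.

```python
def duality_ratio(d_lum, d_ang, z):
    """eta(z) = d_L / ((1 + z)^2 d_A); equals 1 if the duality relation holds."""
    return d_lum / ((1.0 + z) ** 2 * d_ang)
```

For example, `duality_ratio(2250.0, 1000.0, 0.5)` is 1, whereas a value that differs from 1 by more than the measurement errors would point to systematics in one of the two data sets, given that the relation itself is assumed to hold.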
• ### Deriving the velocity distribution of Galactic Dark Matter particles from rotation curve data(1210.2328)

April 5, 2013 hep-ph, astro-ph.CO, astro-ph.GA
The velocity distribution function (VDF) of the hypothetical Weakly Interacting Massive Particles (WIMPs), currently the most favored candidate for the Dark Matter (DM) in the Galaxy, is determined directly from the circular speed ("rotation") curve data of the Galaxy assuming isotropic VDF. This is done by "inverting", using Eddington's method, the Navarro-Frenk-White universal density profile of the DM halo of the Galaxy, the parameters of which are determined, using the Markov Chain Monte Carlo (MCMC) technique, from a recently compiled set of observational data on the Galaxy's rotation curve extended to distances well beyond the visible edge of the disk of the Galaxy. The derived most-likely local isotropic VDF strongly differs from the Maxwellian form assumed in the "Standard Halo Model" (SHM) customarily used in the analysis of the results of WIMP direct-detection experiments. A parametrized (non-Maxwellian) form of the derived most-likely local VDF is given. The astrophysical "g-factor" that determines the effect of the WIMP VDF on the expected event rate in a direct-detection experiment can be lower for the derived most-likely VDF than that for the best Maxwellian fit to it by as much as two orders of magnitude at the lowest WIMP mass threshold of a typical experiment.
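For readers unfamiliar with the inversion, the textbook form of Eddington's formula gives the isotropic phase-space distribution $f$ as a function of relative energy $\mathcal{E}$ from the density $\rho$ expressed in terms of the relative potential $\Psi$. The notation here follows the standard textbook convention and may differ from the paper's:

$$f(\mathcal{E}) = \frac{1}{\sqrt{8}\,\pi^2}\left[\int_0^{\mathcal{E}} \frac{d^2\rho}{d\Psi^2}\,\frac{d\Psi}{\sqrt{\mathcal{E}-\Psi}} + \frac{1}{\sqrt{\mathcal{E}}}\left(\frac{d\rho}{d\Psi}\right)_{\Psi=0}\right],$$

with the Navarro-Frenk-White profile supplying the DM density,

$$\rho(r) = \frac{\rho_s}{(r/r_s)\,(1+r/r_s)^2},$$

where $\rho_s$ and $r_s$ are the scale density and scale radius. The velocity distribution at a given radius then follows by evaluating $f$ at $\mathcal{E} = \Psi(r) - v^2/2$.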
• ### Cosmology with the largest galaxy cluster surveys: Going beyond Fisher matrix forecasts(1210.5586)

Feb. 6, 2013 astro-ph.CO
We make the first detailed MCMC likelihood study of cosmological constraints that are expected from some of the largest ongoing and proposed cluster surveys in different wave-bands, and compare the estimates to the prevalent Fisher matrix forecasts. Mock catalogs of cluster counts expected from the surveys (eROSITA, WFXT, RCS2, DES and Planck), along with a mock dataset of follow-up mass calibrations, are analyzed for this purpose. A fair agreement between MCMC and Fisher results is found only in the case of minimal models. For many cases, however, the marginalized constraints obtained from the Fisher and MCMC methods can differ by 30-100%. The discrepancy can be alarmingly large for a time-dependent dark energy equation of state, w(a); the Fisher methods are seen to underestimate the constraints by as much as a factor of 4--5. Typically, Fisher estimates become more and more inappropriate as we move away from LCDM, first to constant-w and then to varying-w dark energy cosmologies. Fisher analysis also predicts incorrect parameter degeneracies. From the point of view of mass-calibration uncertainties, a high value of unknown scatter about the mean mass-observable relation, and its redshift dependence, is seen to have large degeneracies with the cosmological parameters sigma_8 and w(a) and can degrade the cosmological constraints considerably. We find that the addition of mass-calibrated cluster datasets can improve dark energy and sigma_8 constraints by factors of 2--3 over what can be obtained from CMB+SNe+BAO alone. Since details of future cluster surveys are still being planned, we emphasize that optimal survey design must be done using MCMC analysis rather than Fisher forecasting. [abridged]
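To make concrete what a Fisher forecast provides, the minimal sketch below takes a two-parameter Fisher matrix and returns the marginalized errors, sqrt((F^-1)_ii), alongside the conditional errors, 1/sqrt(F_ii). The function name and matrix entries are hypothetical and are not taken from the paper. The Fisher approach assumes a Gaussian posterior, and the abstract reports that this approximation can fail for extended dark energy models.

```python
import math


def fisher_errors_2x2(f11, f12, f22):
    """Return (marginalized, conditional) 1-sigma errors for a symmetric
    2x2 Fisher matrix [[f11, f12], [f12, f22]]."""
    det = f11 * f22 - f12 * f12
    if f11 <= 0 or f22 <= 0 or det <= 0:
        raise ValueError("Fisher matrix must be positive definite")
    # Diagonal of the inverse matrix gives the marginalized variances.
    cov11 = f22 / det
    cov22 = f11 / det
    marginalized = (math.sqrt(cov11), math.sqrt(cov22))
    # Conditional errors hold the other parameter fixed.
    conditional = (1.0 / math.sqrt(f11), 1.0 / math.sqrt(f22))
    return marginalized, conditional


# Hypothetical Fisher matrix for two correlated parameters.
marg, cond = fisher_errors_2x2(4.0, 1.0, 1.0)
```

The marginalized errors are never smaller than the conditional ones, and the gap between them grows with the degeneracy between parameters. An MCMC run samples the full posterior instead, so it can capture the non-Gaussian degeneracies that this quadratic approximation misses.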
• ### Energy Deposition Profiles and Entropy in Galaxy Clusters(1203.6535)

April 11, 2012 astro-ph.CO, astro-ph.GA
We report the results of our study of fractional entropy enhancement in the intra-cluster medium (ICM) of the clusters from the Representative XMM-Newton Cluster Structure Survey (REXCESS). We compare the observed entropy profile of these clusters with that expected for the ICM without any feedback, as well as with the introduction of preheating and entropy change due to gas cooling. We make the first estimate of the total, as well as radial, non-gravitational energy deposition up to r500 for a large, nearly flux-limited sample of clusters. We find that the total energy deposition corresponding to the entropy enhancement is proportional to the cluster temperature (and hence mass), and that the energy deposition per particle as a function of gas mass shows a similar profile in all clusters, being more pronounced in the central region than in the outer region. Our results support models of entropy enhancement through AGN feedback.
• ### Constraining Thawing Quintessence(1109.4112)

Feb. 6, 2012 gr-qc, astro-ph.CO
We look at observational constraints on the thawing class of scalar field models proposed to explain the late-time acceleration of the universe. Using the recently introduced Statefinder Hierarchy, we compare this thawing class of models with other widely studied dark energy (and modified gravity) models to check the underlying parameter degeneracies. We put constraints on the deviations of these thawing models from the canonical \Lambda CDM model using a large class of observational data, e.g., the Supernova Type Ia data, the BAO data, the CMB data and data from the measurements of the Hubble parameter using red-envelope galaxies. We also forecast constraints using a simulated dataset for the future JDEM SNe survey. Our study shows that, although with current data it is difficult to distinguish different thawing models from \Lambda CDM, a future JDEM-like mission would be able to tell apart thawing models from \Lambda CDM for currently acceptable values of \Omega_{m0}.