• We present an analysis of anomaly detection for machine learning redshift estimation. Anomaly detection allows the removal of poor training examples, which can adversely influence redshift estimates. Anomalous training examples may be photometric galaxies with incorrect spectroscopic redshifts, or galaxies with one or more poorly measured photometric quantities. We select 2.5 million 'clean' SDSS DR12 galaxies with reliable spectroscopic redshifts, and 6730 'anomalous' galaxies with spectroscopic redshift measurements which are flagged as unreliable. We contaminate the clean base galaxy sample with galaxies with unreliable redshifts and attempt to recover the contaminating galaxies using the Elliptical Envelope technique. We then train four machine learning architectures for redshift analysis on both the contaminated sample and on the preprocessed 'anomaly-removed' sample and measure redshift statistics on a clean validation sample generated without any preprocessing. We find an improvement on all measured statistics of up to 80% when training on the anomaly-removed sample as compared with training on the contaminated sample for each of the machine learning routines explored. We further describe a method to estimate the contamination fraction of a base data sample.
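The contaminate-then-recover step described in this abstract can be sketched as follows. This is a minimal illustration, assuming scikit-learn's `EllipticEnvelope` as the Elliptical Envelope implementation and using synthetic Gaussian features in place of the SDSS DR12 photometry; the feature choice, sample sizes, and contamination fraction are all illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): two colour-like features plus a redshift.
n_clean, n_anom = 2000, 40
clean = rng.multivariate_normal(
    mean=[0.0, 0.0, 0.3],
    cov=[[1.0, 0.6, 0.1], [0.6, 1.0, 0.1], [0.1, 0.1, 0.05]],
    size=n_clean,
)
# Contaminants: plausible colours but a redshift far from the clean relation.
anomalous = rng.multivariate_normal(
    mean=[0.0, 0.0, 3.0],
    cov=np.diag([1.0, 1.0, 0.05]),
    size=n_anom,
)

X = np.vstack([clean, anomalous])
is_anom = np.r_[np.zeros(n_clean, dtype=bool), np.ones(n_anom, dtype=bool)]

# Fit a robust Gaussian envelope; `contamination` is the assumed outlier fraction.
detector = EllipticEnvelope(contamination=n_anom / len(X), random_state=0)
labels = detector.fit_predict(X)  # +1 for inliers, -1 for flagged outliers

flagged = labels == -1
recall = (flagged & is_anom).sum() / n_anom  # fraction of contaminants recovered
X_cleaned = X[~flagged]                      # the 'anomaly-removed' training sample

print(f"flagged {flagged.sum()} objects, recovered {recall:.0%} of contaminants")
```

In this toy setup the contaminants are well separated, so recovery is easy; real unreliable redshifts would overlap the clean population far more, and the contamination fraction would itself have to be estimated.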
  • We present cosmological constraints from the Dark Energy Survey (DES) using a combined analysis of angular clustering of red galaxies and their cross-correlation with weak gravitational lensing of background galaxies. We use a 139 square degree contiguous patch of DES data from the Science Verification (SV) period of observations. Using large scale measurements, we constrain the matter density of the Universe as Omega_m = 0.31 +/- 0.09 and the clustering amplitude of the matter power spectrum as sigma_8 = 0.74 +/- 0.13 after marginalizing over seven nuisance parameters and three additional cosmological parameters. This translates into S_8 = sigma_8(Omega_m/0.3)^{0.16} = 0.74 +/- 0.12 for our fiducial lens redshift bin at 0.35 <z< 0.5, while S_8 = 0.78 +/- 0.09 using two bins over the range 0.2 <z< 0.5. We study the robustness of the results under changes in the data vectors, modelling and systematics treatment, including photometric redshift and shear calibration uncertainties, and find consistency in the derived cosmological parameters. We show that our results are consistent with previous cosmological analyses from DES and other data sets and conclude with a joint analysis of DES angular clustering and galaxy-galaxy lensing with Planck CMB data, Baryon Acoustic Oscillations and Type Ia supernova measurements.
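The definition S_8 = sigma_8(Omega_m/0.3)^{0.16} quoted in this abstract can be evaluated directly. The sketch below plugs in the marginalized central values only; the quoted S_8 and its uncertainty come from the full posterior, so this is a consistency check rather than a reproduction. The function name `s8` is a hypothetical helper.

```python
def s8(sigma_8: float, omega_m: float, alpha: float = 0.16, pivot: float = 0.3) -> float:
    """Degeneracy-direction parameter S_8 = sigma_8 * (Omega_m / pivot) ** alpha."""
    return sigma_8 * (omega_m / pivot) ** alpha


# Central values from the abstract (fiducial lens bin, 0.35 < z < 0.5).
value = s8(sigma_8=0.74, omega_m=0.31)
print(f"S_8 at the central values: {value:.3f}")
```

Because the exponent is small, S_8 stays close to sigma_8 whenever Omega_m is near the pivot value of 0.3.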
  • We present analyses of data augmentation for machine learning redshift estimation. Data augmentation makes a training sample more closely resemble a test sample, if the two base samples differ, in order to improve measured statistics of the test sample. We perform two sets of analyses by selecting 800k (1.7M) SDSS DR8 (DR10) galaxies with spectroscopic redshifts. We construct a base training set by imposing an artificial r band apparent magnitude cut to select only bright galaxies and then augment this base training set by using simulations and by applying the K-correct package to artificially place training set galaxies at a higher redshift. We obtain redshift estimates for the remaining faint galaxy sample, which is not used during training. We find that data augmentation reduces the error on the recovered redshifts by 40% in both sets of analyses, when compared to the difference in error between the ideal case and the non-augmented case. The outlier fraction is also reduced by at least 10% and up to 80% using data augmentation. We finally quantify how the recovered redshifts degrade as one probes to deeper magnitudes past the artificial magnitude limit of the bright training sample. We find that at all apparent magnitudes explored, the use of data augmentation with tree-based methods provides an estimate of the galaxy redshift with a negligible bias, although the error on the recovered values increases as we probe to deeper magnitudes. These results have applications for surveys which have a spectroscopic training set which forms a biased sample of all photometric galaxies, for example if the spectroscopic detection magnitude limit is shallower than the photometric limit.
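The 40% figure in this abstract is stated relative to the gap between the ideal and non-augmented cases. One plausible reading of that statistic, written out as a sketch, is the fraction of the gap that augmentation closes. The function name and the example error values below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def gap_fraction_recovered(err_base: float, err_augmented: float, err_ideal: float) -> float:
    """Fraction of the (non-augmented minus ideal) error gap closed by augmentation."""
    gap = err_base - err_ideal
    if gap <= 0:
        raise ValueError("the non-augmented error must exceed the ideal error")
    return (err_base - err_augmented) / gap


# Hypothetical redshift errors: non-augmented 0.10, augmented 0.08, ideal 0.05.
fraction = gap_fraction_recovered(err_base=0.10, err_augmented=0.08, err_ideal=0.05)
print(f"augmentation closes {fraction:.0%} of the gap")
```

Under this reading, a value of 1 would mean augmentation performs as well as training on a fully representative sample, and 0 would mean no improvement.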
  • Galaxy-galaxy weak lensing is a direct probe of the mean matter distribution around galaxies. The depth and sky coverage of the CFHT Legacy Survey yield statistically significant galaxy halo mass measurements over a much wider range of stellar masses ($10^{8.75}$ to $10^{11.3} M_{\odot}$) and redshifts ($0.2 < z < 0.8$) than previous weak lensing studies. At redshift $z \sim 0.5$, the stellar-to-halo mass ratio (SHMR) reaches a maximum of $4.0\pm0.2$ percent as a function of halo mass at $\sim 10^{12.25} M_{\odot}$. We find, for the first time from weak lensing alone, evidence for significant evolution in the SHMR: the peak ratio falls as a function of cosmic time from $4.5 \pm 0.3$ percent at $z \sim 0.7$ to $3.4 \pm 0.2$ percent at $z \sim 0.3$, and shifts to lower stellar mass haloes. These evolutionary trends are dominated by red galaxies, and are consistent with a model in which the stellar mass above which star formation is quenched "downsizes" with cosmic time. In contrast, the SHMR of blue, star-forming galaxies is well-fit by a power law that does not evolve with time. This suggests that blue galaxies form stars at a rate that is balanced with their dark matter accretion in such a way that they evolve along the SHMR locus. The redshift dependence of the SHMR can be used to constrain the evolution of the galaxy population over cosmic time.
  • We present a novel way of using neural networks (NN) to estimate the redshift distribution of a galaxy sample. We are able to obtain a probability density function (PDF) for each galaxy using a classification neural network. The method is applied to 58714 galaxies in CFHTLenS that have spectroscopic redshifts from DEEP2, VVDS and VIPERS. Using these data we show that the stacked PDFs give an excellent representation of the true $N(z)$ using information from 5, 4 or 3 photometric bands. We show that the fractional error due to using $N(z_{\rm phot})$ instead of $N(z_{\rm truth})$ is <=1% on the lensing power spectrum $P_{\kappa}$ in several tomographic bins. Further, we investigate how well this method performs when few training samples are available and show that in this regime the neural network slightly overestimates the $N(z)$ at high z. Finally, the case where the training sample is not representative of the full data set is investigated. An IPython notebook accompanying this paper is made available here: https://bitbucket.org/christopher_bonnett/nn_notebook
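The idea of stacking per-galaxy PDFs from a classifier can be sketched with NumPy. This is not the network from the paper: the "logits" are synthetic stand-ins for the outputs of a trained classification network over redshift bins, and the bin grid, sample size, and PDF width are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_gal, n_bins = 5000, 40
z_edges = np.linspace(0.0, 2.0, n_bins + 1)
z_mid = 0.5 * (z_edges[:-1] + z_edges[1:])
dz = z_edges[1] - z_edges[0]

# Stand-in for network outputs (assumption): logits peaked near each true redshift.
z_true = rng.gamma(shape=4.0, scale=0.15, size=n_gal).clip(0.0, 1.999)
logits = -0.5 * ((z_mid[None, :] - z_true[:, None]) / 0.1) ** 2


def softmax(x: np.ndarray) -> np.ndarray:
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


# Each row is one galaxy's probability over redshift bins (sums to 1).
pdfs = softmax(logits)

# Stack: average the per-galaxy probabilities, then convert to a density in z.
n_of_z = pdfs.mean(axis=0) / dz

mean_z_stacked = (n_of_z * z_mid).sum() * dz
print(f"mean z from stacked PDFs: {mean_z_stacked:.3f}, from truth: {z_true.mean():.3f}")
```

Treating redshift estimation as classification over bins is what makes each output row a discrete PDF, so the stacked estimate is normalised by construction.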
  • We present a study of the relation between dark matter halo mass and the baryonic content of host galaxies, quantified via luminosity and stellar mass. Our investigation uses 154 deg^2 of Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS) lensing and photometric data, obtained from the CFHT Legacy Survey. We employ a galaxy-galaxy lensing halo model which allows us to constrain the halo mass and the satellite fraction. Our analysis is limited to lenses at redshifts between 0.2 and 0.4. We express the relationship between halo mass and baryonic observable as a power law. For the luminosity-halo mass relation we find a slope of 1.32 +/- 0.06 and a normalisation of 1.19 (+0.06/-0.07) x 10^13 h70^-1 Msun for red galaxies, while for blue galaxies the best-fit slope is 1.09 (+0.20/-0.13) and the normalisation is 0.18 (+0.04/-0.05) x 10^13 h70^-1 Msun. Similarly, we find a best-fit slope of 1.36 (+0.06/-0.07) and a normalisation of 1.43 (+0.11/-0.08) x 10^13 h70^-1 Msun for the stellar mass-halo mass relation of red galaxies, while for blue galaxies the corresponding values are 0.98 (+0.08/-0.07) and 0.84 (+0.20/-0.16) x 10^13 h70^-1 Msun. For red lenses, the fraction which are satellites tends to decrease with luminosity and stellar mass, with the sample being nearly all satellites for a stellar mass of 2 x 10^9 h70^-2 Msun. The satellite fractions are generally close to zero for blue lenses, irrespective of luminosity or stellar mass. This, together with the shallower relation between halo mass and baryonic tracer, is a direct confirmation from galaxy-galaxy lensing that blue galaxies reside in less clustered environments than red galaxies. We also find that the halo model, while matching the lensing signal around red lenses well, is prone to over-predicting the large-scale signal for faint and less massive blue lenses. This could be a further indication that these galaxies tend to be more isolated than assumed. [abridged]
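The power-law relation in this abstract can be written as M_halo = M_0 (x / x_0)^beta, with normalisation M_0 and slope beta. The sketch below evaluates it with the best-fit luminosity values quoted above. The abstract does not state the pivot x_0, so the observable is expressed here in units of the pivot; that convention and the function name `halo_mass` are assumptions.

```python
def halo_mass(x_over_pivot: float, norm: float, slope: float) -> float:
    """Power-law halo mass: norm * (x / x_pivot) ** slope, in the units of `norm`."""
    return norm * x_over_pivot ** slope


# Best-fit luminosity-halo mass values from the abstract (units: h70^-1 Msun).
RED = {"norm": 1.19e13, "slope": 1.32}
BLUE = {"norm": 0.18e13, "slope": 1.09}

# At the pivot, the halo mass equals the normalisation.
m_red_pivot = halo_mass(1.0, **RED)
m_blue_pivot = halo_mass(1.0, **BLUE)

# Doubling the luminosity scales the halo mass by 2 ** slope.
red_doubling = halo_mass(2.0, **RED) / m_red_pivot
blue_doubling = halo_mass(2.0, **BLUE) / m_blue_pivot

print(f"red/blue halo mass ratio at the pivot: {m_red_pivot / m_blue_pivot:.1f}")
print(f"mass gain on doubling L: red x{red_doubling:.2f}, blue x{blue_doubling:.2f}")
```

The steeper red slope means halo mass grows faster with luminosity for red lenses than for blue ones, which is the "shallower relation" for blue galaxies that the abstract refers to.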
  • We use weak gravitational lensing to analyse the dark matter halos around satellite galaxies in galaxy groups in the CFHTLenS dataset. This dataset is derived from the CFHTLS-Wide survey, and encompasses 154 sq. deg of high-quality shape data. Using the photometric redshifts, we divide the sample of lens galaxies with stellar masses in the range 10^9 Msun to 10^10.5 Msun into those likely to lie in high-density environments (HDE) and those likely to lie in low-density environments (LDE). Through comparison with galaxy catalogues extracted from the Millennium Simulation, we show that the sample of HDE galaxies should primarily (~61%) consist of satellite galaxies in groups, while the sample of LDE galaxies should consist of mostly (~87%) non-satellite (field and central) galaxies. Comparing the lensing signals around samples of HDE and LDE galaxies matched in stellar mass, we find that the signal around HDE galaxies clearly shows a positive contribution from their host groups at radii of ~500--1000 kpc, the typical separation between satellites and group centres. More importantly, the subhalos of HDE galaxies are less massive than those around LDE galaxies by a factor of 0.65 +/- 0.12, significant at the 2.9 sigma level. A natural explanation is that the halos of satellite galaxies are stripped through tidal effects in the group environment. Our results are consistent with a typical tidal truncation radius of ~40 kpc.
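The quoted 2.9 sigma significance is consistent with a simple check: how many standard deviations the measured mass ratio lies from 1, the value expected if HDE and LDE subhalos were equally massive. This is a sketch assuming a Gaussian uncertainty on the ratio; the abstract does not say how the significance was actually computed.

```python
ratio = 0.65      # HDE-to-LDE subhalo mass ratio from the abstract
ratio_err = 0.12  # quoted 1-sigma uncertainty

# Distance of the measurement from the no-stripping value (ratio = 1), in sigma.
n_sigma = (1.0 - ratio) / ratio_err
print(f"deviation from equal masses: {n_sigma:.1f} sigma")
```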
  • We present cosmological constraints from 2D weak gravitational lensing by the large-scale structure in the Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS) which spans 154 square degrees in five optical bands. Using accurate photometric redshifts and measured shapes for 4.2 million galaxies between redshifts of 0.2 and 1.3, we compute the 2D cosmic shear correlation function over angular scales ranging between 0.8 and 350 arcmin. Using non-linear models of the dark-matter power spectrum, we constrain cosmological parameters by exploring the parameter space with Population Monte Carlo sampling. The best constraints from lensing alone are obtained for the small-scale density-fluctuations amplitude sigma_8 scaled with the total matter density Omega_m. For a flat LambdaCDM model we obtain sigma_8(Omega_m/0.27)^0.6 = 0.79 +/- 0.03. We combine the CFHTLenS data with WMAP7, BOSS and an HST distance-ladder prior on the Hubble constant to get joint constraints. For a flat LambdaCDM model, we find Omega_m = 0.283 +/- 0.010 and sigma_8 = 0.813 +/- 0.014. In the case of a curved wCDM universe, we obtain Omega_m = 0.27 +/- 0.03, sigma_8 = 0.83 +/- 0.04, w_0 = -1.10 +/- 0.15 and Omega_K = 0.006 (+0.006/-0.004). We calculate the Bayesian evidence to compare flat and curved LambdaCDM and dark-energy CDM models. From the combination of all four probes, we find models with curvature to be moderately disfavoured with respect to the flat case. A simple dark-energy model is indistinguishable from LambdaCDM. Our results therefore do not necessitate any deviations from the standard cosmological model.
  • We present the Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS) that accurately determines a weak gravitational lensing signal from the full 154 square degrees of deep multi-colour data obtained by the CFHT Legacy Survey. Weak gravitational lensing by large-scale structure is widely recognised as one of the most powerful but technically challenging probes of cosmology. We outline the CFHTLenS analysis pipeline, describing how and why every step of the chain from the raw pixel data to the lensing shear and photometric redshift measurement has been revised and improved compared to previous analyses of a subset of the same data. We present a novel method to identify data which contributes a non-negligible contamination to our sample and quantify the required level of calibration for the survey. Through a series of cosmology-insensitive tests we demonstrate the robustness of the resulting cosmic shear signal, presenting a science-ready shear and photometric redshift catalogue for future exploitation.