• Parallaxes for 331 classical Cepheids, 31 Type II Cepheids and 364 RR Lyrae stars in common between Gaia and the Hipparcos and Tycho-2 catalogues are published in Gaia Data Release 1 (DR1) as part of the Tycho-Gaia Astrometric Solution (TGAS). To test these first parallax measurements of the primary standard candles of the cosmological distance ladder, which are based on astrometry collected by Gaia during the initial 14 months of science operation, we compared them with literature estimates and derived new period-luminosity ($PL$) and period-Wesenheit ($PW$) relations for classical and Type II Cepheids, and infrared $PL$, $PL$-metallicity ($PLZ$) and optical luminosity-metallicity ($M_V$-[Fe/H]) relations for the RR Lyrae stars, with zero points based on TGAS. The new relations were computed using multi-band ($V,I,J,K_{\mathrm{s}},W_{1}$) photometry and spectroscopic metal abundances available in the literature, applying three alternative approaches: (i) linear least-squares fitting of the absolute magnitudes inferred from direct transformation of the TGAS parallaxes, (ii) adopting astrometry-based luminosities, and (iii) a Bayesian fitting approach. TGAS parallaxes bring significant added value to the previous Hipparcos estimates. The relations presented in this paper are the first Gaia-calibrated relations and form a "work-in-progress" milestone report pending Gaia-only parallaxes, a first solution of which will become available with Gaia Data Release 2 (DR2) in 2018.
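The first two approaches named in this abstract start from simple transformations of a measured parallax. The Python sketch below illustrates them under stated assumptions: the function names are hypothetical, extinction is ignored, and the parallax is taken in milliarcseconds. It is not the authors' pipeline, only the standard definitions of the absolute magnitude and of the astrometry-based luminosity (ABL).

```python
import math


def absolute_magnitude(m_app, parallax_mas):
    """Absolute magnitude from apparent magnitude and parallax in mas.

    Uses M = m + 5*log10(parallax_mas) - 10 and ignores extinction
    (an assumption of this sketch). Only defined for positive parallaxes.
    """
    if parallax_mas <= 0:
        raise ValueError("log-based transformation needs a positive parallax")
    return m_app + 5.0 * math.log10(parallax_mas) - 10.0


def astrometry_based_luminosity(m_app, parallax_mas):
    """ABL = parallax_mas * 10**(0.2*m - 2), which equals 10**(0.2*M).

    The expression is linear in the parallax, so zero or negative
    measured parallaxes can still be evaluated.
    """
    return parallax_mas * 10.0 ** (0.2 * m_app - 2.0)


# Illustrative values: a star at 100 pc (parallax 10 mas), apparent magnitude 10.
M = absolute_magnitude(10.0, 10.0)
abl = astrometry_based_luminosity(10.0, 10.0)
print(M, abl)
```

The direct transformation fails for non-positive parallaxes, which forces a sample cut. The ABL stays defined for any measured parallax, which is one reason to fit in that space when relative errors are large.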
• Context. The first Gaia Data Release contains the Tycho-Gaia Astrometric Solution (TGAS). This is a subset of about 2 million stars for which, besides the position and photometry, the proper motion and parallax are calculated using Hipparcos and Tycho-2 positions in 1991.25 as prior information. Aims. We investigate the scientific potential and limitations of the TGAS component by means of the astrometric data for open clusters. Methods. Mean cluster parallax and proper motion values are derived taking into account the error correlations within the astrometric solutions for individual stars, an estimate of the internal velocity dispersion in the cluster, and, where relevant, the effects of the depth of the cluster along the line of sight. Internal consistency of the TGAS data is assessed. Results. Values given for standard uncertainties are still inaccurate and may lead to unrealistic unit-weight standard deviations of least squares solutions for cluster parameters. Reconstructed mean cluster parallax and proper motion values are generally in very good agreement with earlier Hipparcos-based determinations, although the Gaia mean parallax for the Pleiades is a significant exception. We currently have no explanation for that discrepancy. Most clusters are observed to extend to nearly 15 pc from the cluster centre, and it will be up to future Gaia releases to establish whether those potential cluster-member stars are still dynamically bound to the clusters. Conclusions. The Gaia DR1 provides the means to examine open clusters far beyond their more easily visible cores, and can provide membership assessments based on proper motions and parallaxes. A combined HR diagram shows the same features as observed before using the Hipparcos data, with clearly increased luminosities for older A and F dwarfs.
• ### The Gaia astrophysical parameters inference system (Apsis). Pre-launch description (1309.2157)

The Gaia satellite will survey the entire celestial sphere down to 20th magnitude, obtaining astrometry, photometry, and low resolution spectrophotometry on one billion astronomical sources, plus radial velocities for over one hundred million stars. Its main objective is to take a census of the stellar content of our Galaxy, with the goal of revealing its formation and evolution. Gaia's unique feature is the measurement of parallaxes and proper motions with hitherto unparalleled accuracy for many objects. Because Gaia is a survey mission, the physical properties of most of these objects are unknown. Here we describe the data analysis system put together by the Gaia consortium to classify these objects and to infer their astrophysical properties using the satellite's data. This system covers single stars, (unresolved) binary stars, quasars, and galaxies, spanning a wide parameter space. Multiple methods are used for many types of stars, producing multiple results for the end user according to different models and assumptions. Prior to its application to real Gaia data the accuracy of these methods cannot be assessed definitively. As an example of the current performance, however, we can attain internal accuracies (RMS residuals) on F, G, K, M dwarfs and giants at G=15 (V=15-17) for a wide range of metallicities and interstellar extinctions of around 100 K in effective temperature (Teff), 0.1 mag in extinction (A0), 0.2 dex in metallicity ([Fe/H]), and 0.25 dex in surface gravity (logg). The accuracy is a strong function of the parameters themselves, varying by a factor of more than two up or down over this parameter range. After its launch in November 2013, Gaia will nominally observe for five years, during which the system we describe will continue to evolve in light of experience with the real data.
• ### A data-driven model for spectra: Finding double redshifts in the Sloan Digital Sky Survey (1201.3370)

May 7, 2012 astro-ph.CO, astro-ph.IM
We present a data-driven method - heteroscedastic matrix factorization, a kind of probabilistic factor analysis - for modeling or performing dimensionality reduction on observed spectra or other high-dimensional data with known but non-uniform observational uncertainties. The method uses an iterative inverse-variance-weighted least-squares minimization procedure to generate a best set of basis functions. The method is similar to principal components analysis, but with the substantial advantage that it uses measurement uncertainties in a responsible way and accounts naturally for poorly measured and missing data; it models the variance in the noise-deconvolved data space. A regularization can be applied, in the form of a smoothness prior (inspired by Gaussian processes) or a non-negative constraint, without making the method prohibitively slow. Because the method optimizes a justified scalar (related to the likelihood), the basis provides a better fit to the data in a probabilistic sense than any PCA basis. We test the method on SDSS spectra, concentrating on spectra known to contain two redshift components: These are spectra of gravitational lens candidates and massive black-hole binaries. We apply a hypothesis test to compare one-redshift and two-redshift models for these spectra, utilizing the data-driven model trained on a random subset of all SDSS spectra. This test confirms 129 of the 131 lens candidates in our sample and all of the known binary candidates, and turns up very few false positives.
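The iterative inverse-variance-weighted least-squares procedure described in this abstract can be illustrated with a deliberately reduced sketch: a single basis function, no regularization, and plain Python lists. The function name `rank1_hmf` and all variable names are hypothetical. The published method uses several basis functions and optional smoothness or non-negativity constraints, none of which are reproduced here.

```python
def rank1_hmf(data, ivar, n_iter=50):
    """Fit data[i][j] ~ a[i] * g[j], weighting each pixel by ivar[i][j].

    Alternates two closed-form weighted least-squares updates:
    coefficients a with the basis g fixed, then g with a fixed.
    Pixels with ivar == 0 (missing data) do not influence the fit.
    """
    n_obj, n_pix = len(data), len(data[0])
    g = [1.0] * n_pix  # flat initial basis function
    a = [0.0] * n_obj
    for _ in range(n_iter):
        # Update each spectrum's coefficient given the current basis.
        for i in range(n_obj):
            num = sum(ivar[i][j] * data[i][j] * g[j] for j in range(n_pix))
            den = sum(ivar[i][j] * g[j] * g[j] for j in range(n_pix))
            a[i] = num / den if den > 0 else 0.0
        # Update each basis pixel given the current coefficients.
        for j in range(n_pix):
            num = sum(ivar[i][j] * data[i][j] * a[i] for i in range(n_obj))
            den = sum(ivar[i][j] * a[i] * a[i] for i in range(n_obj))
            if den > 0:
                g[j] = num / den
    return a, g


# Toy example: three noiseless "spectra" sharing one shape, one pixel missing.
shape = [1.0, 2.0, 0.5, 4.0]
spectra = [[amp * s for s in shape] for amp in (1.0, 2.0, 3.0)]
weights = [[1.0] * 4 for _ in spectra]
spectra[1][2] = 999.0  # corrupted value ...
weights[1][2] = 0.0    # ... flagged as missing via zero inverse variance
a, g = rank1_hmf(spectra, weights)
print(a[1] * g[2])  # model prediction for the missing pixel
```

Because every term is weighted by the inverse variance, the corrupted pixel is ignored during fitting, yet the model still predicts a value for it. An unweighted PCA would instead treat the bad value as signal.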
• ### A semi-empirical library of galaxy spectra for Gaia classification based on SDSS data and PEGASE models (1110.6806)

Oct. 31, 2011 astro-ph.CO
Aims: This paper is the third in a series implementing a classification system for Gaia observations of unresolved galaxies. The system makes use of template galaxy spectra in order to determine spectral classes and estimate intrinsic astrophysical parameters. In previous work we used synthetic galaxy spectra produced by the PEGASE.2 code to simulate Gaia observations and to test the performance of Support Vector Machine (SVM) classifiers and parametrizers. Here we produce a semi-empirical library of galaxy spectra by fitting SDSS spectra with the previously produced synthetic libraries. We present (1) the semi-empirical library of galaxy spectra, (2) a comparison between the observed and synthetic spectra, and (3) first results of classification and parametrization experiments with simulated Gaia spectrophotometry of this library. Methods: We use $\chi^2$ fitting to fit SDSS galaxy spectra with the synthetic library in order to construct a semi-empirical library of galaxy spectra in which (1) the real spectra are extended by the synthetic ones in order to cover the full wavelength range of Gaia, and (2) astrophysical parameters are assigned to the SDSS spectra by the best-fitting synthetic spectrum. The SVM models were trained with and applied to semi-empirical spectra. Tests were performed for the classification of spectral types and the estimation of the most significant galaxy parameters (in particular redshift, mass-to-light ratio and star formation history). Results: We produce a semi-empirical library of 33670 galaxy spectra covering the wavelength range 250 to 1050 nm at a sampling of 1 nm or less. Using the results of the fitting of the SDSS spectra with our synthetic library, we investigate the range of the input model parameters that produces spectra which are in good agreement with observations. (abridged)
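The template-matching step described in the Methods can be sketched as a chi-square comparison of one observed spectrum against every template in a library. The sketch below is written under assumptions not stated in the abstract: each template is allowed one free multiplicative amplitude, all spectra share the same wavelength grid, and the function names `fit_template` and `best_template` are hypothetical.

```python
def fit_template(flux, ivar, template):
    """Best-fit amplitude and chi-square of one template against one spectrum.

    Minimizes chi2 = sum_j ivar[j] * (flux[j] - amp * template[j])**2,
    which has the closed-form solution amp = sum(w*f*t) / sum(w*t*t).
    """
    num = sum(w * f * t for w, f, t in zip(ivar, flux, template))
    den = sum(w * t * t for w, t in zip(ivar, template))
    if den <= 0:
        raise ValueError("template has no weight on the observed pixels")
    amp = num / den
    chi2 = sum(w * (f - amp * t) ** 2 for w, f, t in zip(ivar, flux, template))
    return amp, chi2


def best_template(flux, ivar, library):
    """Return (index, amplitude, chi2) of the lowest-chi-square template."""
    fits = [fit_template(flux, ivar, t) for t in library]
    index = min(range(len(library)), key=lambda k: fits[k][1])
    return index, fits[index][0], fits[index][1]


# Toy library of three "spectra" on a four-pixel grid (illustrative values).
library = [[1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
observed = [2.0, 4.0, 6.0, 8.0]
ivar = [1.0, 1.0, 1.0, 1.0]
print(best_template(observed, ivar, library))
```

Once the best-fitting template is known, its parameters can be attached to the observed spectrum and its flux can stand in at wavelengths the observation does not cover, which is the role the abstract assigns to the synthetic spectra.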
• ### A systematic search for massive black hole binaries in SDSS spectroscopic sample (1106.1180)

June 6, 2011 astro-ph.CO
We present the results of a systematic search for massive black hole binaries in the Sloan Digital Sky Survey spectroscopic database. We focus on bound binaries, under the assumption that one of the black holes is active. In this framework, the broad lines associated with the accreting black hole are expected to show systematic velocity shifts with respect to the narrow lines, which trace the rest frame of the galaxy. For a sample of 54586 quasars and 3929 galaxies at redshifts 0.1<z<1.5 we brute-force model each spectrum as a mixture of two quasars at two different redshifts. The spectral model is a data-driven dimensionality reduction of the SDSS quasar spectra based on a matrix factorization. We identified 32 objects with peculiar spectra. Nine of them can be interpreted as black hole binaries. This doubles the number of known black hole binary candidates. We also report on the discovery of a new class of extreme double-peaked emitters with exceptionally broad and faint Balmer lines. For all the interesting sources, we present a detailed analysis of the spectra and discuss possible interpretations.
• ### Towards a library of synthetic galaxy spectra and preliminary results of classification and parametrization of unresolved galaxies for Gaia - II (0907.1671)

July 9, 2009 astro-ph.GA
This paper is the second in a series, implementing a classification system for Gaia observations of unresolved galaxies. Our goals are to determine spectral classes and estimate intrinsic astrophysical parameters via synthetic templates. Here we describe (1) a new extended library of synthetic galaxy spectra, (2) its comparison with various observations, and (3) first results of classification and parametrization experiments using simulated Gaia spectrophotometry of this library. Using the PEGASE.2 code, based on galaxy evolution models that take account of metallicity evolution, extinction correction, and emission lines (with stellar spectra based on the BaSeL library), we improved our first library and extended it to cover the domain of most of the SDSS catalogue. We produce an extended library of 28885 synthetic galaxy spectra at zero redshift covering four general Hubble types of galaxies, over the wavelength range between 250 and 1050 nm at a sampling of 1 nm or less. The library is also produced for 4 random values of redshift in the range of 0-0.2. It is computed on a random grid of four key astrophysical parameters (infall timescale and 3 parameters defining the SFR) and, depending on the galaxy type, on two values of the age of the galaxy. The synthetic library was compared and found to be in good agreement with various observations. The first results from the SVM classifiers and parametrizers are promising, indicating that Hubble types can be reliably predicted and several parameters estimated with low bias and variance.
• ### Towards a library of synthetic galaxy spectra and preliminary results of classification and parametrization of unresolved galaxies for Gaia (0705.2152)

June 14, 2007 astro-ph
Aims: The Gaia astrometric survey mission will, as a consequence of its scanning law, obtain low resolution optical (330-1000 nm) spectrophotometry of several million unresolved galaxies brighter than V=22. We present the first steps in a project to design and implement a classification system for these data. The goal is both to determine morphological classes and to estimate intrinsic astrophysical parameters via synthetic templates. Here we describe (1) a new library of synthetic galaxy spectra, and (2) first results of classification and parametrization experiments using simulated Gaia spectrophotometry of this library. Methods: We have created a large grid of synthetic galaxy spectra using the PEGASE.2 code, which is based on galaxy evolution models that take into account metallicity evolution, extinction correction, and emission lines (with stellar spectra based on the BaSeL library). Our classification and regression models are Support Vector Machines (SVMs), which are kernel-based nonlinear estimators. Results: We produce a basic library of about 4000 zero-redshift galaxy spectra covering the main Hubble types over the wavelength range 250 to 1050 nm at a sampling of 1 nm or less. It is computed on a regular grid of four key astrophysical parameters for each type and for intermediate random values of the same parameters. An extended library reproduces this at a series of redshifts. Initial results from the SVM classifiers and parametrizers are promising, indicating that Hubble types can be reliably predicted and several parameters estimated with low bias and variance. Comparing the colours of our synthetic library with Sloan Digital Sky Survey (SDSS) spectra, we find good agreement over the full range of Hubble types and parameters.
• ### Star-forming Regions in the Small Magellanic Cloud. Multi-wavelength Properties of Stellar Complexes (astro-ph/0703761)

March 29, 2007 astro-ph
We trace the star formation regions in the SMC and study their properties. The size and spatial distribution of these regions are found to support the hierarchical scenario of star formation, whereas the evaluation of their intensity contributes to the understanding of the various stages of star formation. Their connection to the LMC-SMC close encounter, about $(0.9-2) \times 10^{8}$ years ago, is investigated as well. The SMC, being almost edge-on, does not reveal these areas as easily as the LMC does. However, a study using multi-wavelength images (optical, IR and radio) has proved very useful. A selection of areas with enhanced 60 and 100-$\mu$m infrared flux and emission in all IRAS bands identifies the star forming regions. All of the identified regions are dominated by early-type stars and, ordered by increasing overall size, a total of 24 aggregates, 23 complexes, and 3 super-complexes were found. We present their coordinates, dimensions, and IR fluxes. Moreover, we correlate their positions with known associations, SNRs, and H II regions and discuss their activity.
• ### Luminous AGB stars in nearby galaxies. A study using Virtual Observatory tools (astro-ph/0510307)

Oct. 13, 2005 astro-ph
Aims. This study focuses on very luminous Mbol<-6.0 mag AGB stars with J-Ks>1.5 mag and H-Ks>0.4 mag in the LMC, SMC, M31, and M33 from 2MASS data. Methods. The data were taken from the 2MASS All-Sky Point Source catalogue archive. We used Virtual Observatory tools and took advantage of their capabilities at various stages in the analysis. Results. It is well known that stars with the colors we selected correspond mainly to carbon stars. Although the most luminous AGBs detected here contain a large number of carbon stars, they are not included in existing catalogues produced from data in the optical domain, where they are not visible since they are dust-enshrouded. A comparison of the AGB stars detected with combined near and mid-infrared data from MSX and 2MASS in the LMC shows that 10% of the bright AGB stars are bright carbon stars never detected before and that another 50% are OH/IR oxygen-rich stars, whereas the remaining 40% were not cross-matched. Conclusions. The catalogues of the most luminous AGB stars compiled here are an important complement to existing data. In the LMC, these bright AGB stars are centrally located, whereas they are concentrated in an active star-formation ring in M31. In the SMC and M33, there are not enough of them to draw definite conclusions, although they tend to be centrally located. Their luminosity functions are similar for the four galaxies we studied.