L. M. Sarro

Radboud University Nijmegen, Nijmegen, Gelderland, Netherlands

Publications (47) · 69.66 Total impact

  •
    ABSTRACT: Astrometry and photometry from the DANCe project are used to derive statistical models of the distribution of sources in the Pleiades region of the sky, in the space of proper motions, colours and magnitudes. These models are subsequently used to estimate membership probabilities for the Pleiades cluster. This catalog contains the original data from the DANCe project and the inferred membership probabilities obtained in various setups. (1 data file).
    02/2014;
  •
    ABSTRACT: Context: With the advent of deep wide surveys, large photometric and astrometric catalogues of literally all nearby clusters and associations have been produced. The unprecedented accuracy and sensitivity of these data sets and their broad spatial, temporal and wavelength coverage make obsolete the classical membership selection methods that were based on a handful of colours and luminosities. We present a new technique designed to take full advantage of the high dimensionality (photometric, astrometric, temporal) of such a survey to derive self-consistent and robust membership probabilities of the Pleiades cluster. Aims: We aim to develop a methodology to infer membership probabilities for the Pleiades cluster from the DANCe multidimensional astro-photometric data set in a consistent way throughout the entire derivation. The determination of the membership probabilities has to be applicable to censored data and must incorporate the measurement uncertainties into the inference procedure. Methods: We use Bayes' theorem and a curvilinear forward model for the likelihood of the measurements of cluster members in the colour-magnitude space to infer posterior membership probabilities. The distribution of the cluster members' proper motions and the distribution of contaminants in the full multidimensional astro-photometric space are modelled with a mixture-of-Gaussians likelihood. Results: We analyse several representation spaces composed of the proper motions plus a subset of the available magnitudes and colour indices. We select two prominent representation spaces composed of variables selected using feature relevance determination techniques based on Random Forests, and analyse the resulting samples of high probability candidates. We consistently find lists of high probability (p > 0.9975) candidates with ≈1000 sources, 4 to 5 times more than obtained in the most recent astro-photometric studies of the cluster. Conclusions: Multidimensional data sets require statistically sound multivariate analysis techniques to fully exploit their scientific information content. Proper motions in particular are, as expected, critical for the correct separation of contaminants. The methodology presented here is ready for application in data sets that include more dimensions, such as radial and/or rotational velocities, spectral indices, and variability. Membership probability catalogs for the DANCe Pleiades data are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/563/A45
    02/2014;
  •
    ABSTRACT: Following the recent discovery of a large population of young stars in front of the Orion Nebula, we carried out an observational campaign with the DECam wide-field camera covering ~10 deg^2 centered on NGC 1980 to confirm, probe the extent of, and characterize this foreground population of pre-main-sequence stars. We confirm the presence of a large foreground population towards the Orion A cloud. This population contains several distinct subgroups, including NGC 1980 and NGC 1981, and stretches across several degrees in front of the Orion A cloud. By comparing the location of their sequence in various color-magnitude diagrams with other clusters, we found a distance of 380 pc and an age of 5-10 Myr, in good agreement with previous estimates. Our final sample includes 2123 candidate members and is complete from below the hydrogen-burning limit to about 0.3 Msun, where the data start to be limited by saturation. Extrapolating the mass function to high masses, we estimate a total number of ~2600 members in the surveyed region. We confirm the presence of a rich, contiguous, and essentially coeval population of about 2600 foreground stars in front of the Orion A cloud, loosely clustered around NGC 1980, NGC 1981, and a new group in the foreground of the OMC-2/3. For the area of the cloud surveyed, this result implies that there are more young stars in the foreground population than young stars inside the cloud. Assuming a normal initial mass function, we estimate that between one and a few supernovae must have exploded in the foreground population in the past few million years, close to the surface of Orion A, which might be responsible, together with stellar winds, for the structure and star formation activity in these clouds. This long-overlooked foreground stellar population is of great significance, calling for a revision of the star formation history in this region of the Galaxy.
    02/2014;
  •
    ABSTRACT: We present a new technique designed to take full advantage of the high dimensionality (photometric, astrometric, temporal) of the DANCe survey to derive self-consistent and robust membership probabilities of the Pleiades cluster. We aim to develop a methodology to infer membership probabilities for the Pleiades cluster from the DANCe multidimensional astro-photometric data set in a consistent way throughout the entire derivation. The determination of the membership probabilities has to be applicable to censored data and must incorporate the measurement uncertainties into the inference procedure. We use Bayes' theorem and a curvilinear forward model for the likelihood of the measurements of cluster members in the colour-magnitude space to infer posterior membership probabilities. The distribution of the cluster members' proper motions and the distribution of contaminants in the full multidimensional astro-photometric space are modelled with a mixture-of-Gaussians likelihood. We analyse several representation spaces composed of the proper motions plus a subset of the available magnitudes and colour indices. We select two prominent representation spaces composed of variables selected using feature relevance determination techniques based on Random Forests, and analyse the resulting samples of high probability candidates. We consistently find lists of high probability (p > 0.9975) candidates with ≈1000 sources, 4 to 5 times more than obtained in the most recent astro-photometric studies of the cluster. The methodology presented here is ready for application in data sets that include more dimensions, such as radial and/or rotational velocities, spectral indices and variability.
    01/2014;
  •
    ABSTRACT: The Gaia satellite will survey the entire celestial sphere down to 20th magnitude, obtaining astrometry, photometry, and low resolution spectrophotometry on one billion astronomical sources, plus radial velocities for over one hundred million stars. Its main objective is to take a census of the stellar content of our Galaxy, with the goal of revealing its formation and evolution. Gaia's unique feature is the measurement of parallaxes and proper motions with hitherto unparalleled accuracy for many objects. As a survey, the physical properties of most of these objects are unknown. Here we describe the data analysis system put together by the Gaia consortium to classify these objects and to infer their astrophysical properties using the satellite's data. This system covers single stars, (unresolved) binary stars, quasars, and galaxies, all covering a wide parameter space. Multiple methods are used for many types of stars, producing multiple results for the end user according to different models and assumptions. Prior to its application to real Gaia data the accuracy of these methods cannot be assessed definitively. But as an example of the current performance, we can attain internal accuracies (RMS residuals) on F, G, K, M dwarfs and giants at G=15 (V=15-17) for a wide range of metallicities and interstellar extinctions of around 100 K in effective temperature (Teff), 0.1 mag in extinction (A0), 0.2 dex in metallicity ([Fe/H]), and 0.25 dex in surface gravity (logg). The accuracy is a strong function of the parameters themselves, varying by a factor of more than two up or down over this parameter range. After its launch in November 2013, Gaia will nominally observe for five years, during which the system we describe will continue to evolve in light of experience with the real data.
    Astronomy and Astrophysics 09/2013; 559. · 5.08 Impact Factor
  •
    ABSTRACT: The WFCAM Transit Survey is a photometric survey in the near-infrared that aims at finding Earth-like planets transiting M-dwarf stars. As a by-product of the survey, a variety of variable stars has been detected. We report the discovery and classification of 192 periodic variable stars in the WFCAM Transit Survey. 185 of those objects are previously unknown variable sources. The derived parameters of their light curves will be helpful for the creation of a robust sample of light curves (and their parameters) of classified variable stars in the near-infrared for the automatic classification of light curves of stellar objects in the J-band.
    04/2013;
  •
    ABSTRACT: Aims: We present an improved method for automated stellar variability classification, using fundamental parameters derived from high resolution spectra, with the goal of improving the variability classification obtained using information derived from CoRoT light curves only. Although we focus on Giraffe spectra and CoRoT light curves in this work, the methods are much more widely applicable. Methods: In order to improve the variability classification obtained from the photometric time series, only rough estimates of the stellar physical parameters (Teff and log (g)) are needed because most variability types that overlap in the space of time series parameters are well separated in the space of physical parameters (e.g. γ Dor/SPB or δ Sct/β Cep). In this work, several state-of-the-art machine learning techniques are combined to estimate these fundamental parameters from high resolution Giraffe spectra. Next, these parameters are used in a multi-stage Gaussian-Mixture classifier to perform an improved supervised variability classification of CoRoT light curves. The variability classifier can be used independently of the regression module that estimates the physical parameters, so that non-spectroscopic estimates derived e.g. from photometric colour indices can be used instead. Results: Teff and log (g) are derived from Giraffe spectra for 6832 CoRoT targets. The use of those parameters, in addition to information extracted from the CoRoT light curves, significantly improves the results of our previous automated stellar variability classification. Several new pulsating stars are identified with high confidence levels, including hot pulsators such as SPB and β Cep, and several γ Dor-δ Sct hybrids. From our samples of new γ Dor and δ Sct stars, we find strong indications that the instability domains for both types of pulsators are larger than previously thought. The CoRoT space mission, launched on 27 December 2006, has been developed and is operated by CNES, with the contribution of Austria, Belgium, Brazil, ESA (RSSD and Science Programmes), Germany, and Spain. Full Table 2 is only available in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/550/A120
    Astronomy and Astrophysics 02/2013; 550:1-12. · 5.08 Impact Factor
  •
    ABSTRACT: We present an automated classification of stars exhibiting periodic, non-periodic and irregular light variations. The Hipparcos catalogue of unsolved variables is employed to complement the training set of periodic variables of Dubath et al. with irregular and non-periodic representatives, leading to 3881 sources in total which describe 24 variability types. The attributes employed to characterize light-curve features are selected according to their relevance for classification. Classifier models are produced with random forests and a multistage methodology based on Bayesian networks, achieving overall misclassification rates under 12 per cent. Both classifiers are applied to predict variability types for 6051 Hipparcos variables associated with uncertain or missing types in the literature.
    Monthly Notices of the Royal Astronomical Society 01/2013; 427(4). · 5.52 Impact Factor
  • M J Marquez, T Budavári, L M Sarro
    ABSTRACT: The identification of the same astronomical objects in different exposures taken with different instruments is a fundamental but difficult problem, which has long been studied for its statistical and computational complexity. We typically consider the celestial coordinates of detections to decide whether they belong to the same object, but crowded areas often yield degenerate cases when multiple matching configurations have similar likelihoods. We applied Bayesian inference to alleviate the problem by including photometric measurements. The spectral energy distribution of a candidate association is compared with models to test whether the photometric evidence points toward a good match or not. We discuss our preliminary results from simulated data and the COSMOS catalog.
  •
    ABSTRACT: We aimed to assess the accuracy of the Gaia Teff and logg estimates as derived with current models and observations. We assessed the validity of several inference techniques for deriving the physical parameters of ultra-cool dwarf stars. We used synthetic spectra derived from ultra-cool dwarf models to construct (train) the regression models. We derived the intrinsic uncertainties of the best inference models and assessed their validity by comparing the estimated parameters with the values derived in the bibliography for a sample of ultra-cool dwarf stars observed from the ground. We estimated the total number of ultra-cool dwarfs per spectral subtype, and obtained values that can be summarised (in orders of magnitude) as 400000 objects in the M5-L0 range, 600 objects between L0 and L5, 30 objects between L5 and T0, and 10 objects between T0 and T8. A bright ultra-cool dwarf (with Teff = 2500 K and logg = 3.5) will be detected by Gaia out to approximately 220 pc, while for Teff = 1500 K (spectral type L5) and the same surface gravity, this maximum distance reduces to 10-20 pc. The RMSE of the prediction deduced from ground-based spectra of ultra-cool dwarfs simulated at the Gaia spectral range and resolution, and for a Gaia magnitude G=20, is 213 K and 266 K for the models based on k-nearest neighbours and Gaussian process regression, respectively. These are total errors in the sense that they include the internal and external errors, with the latter caused by the inability of the synthetic spectral models (used for the construction of the regression models) to exactly reproduce the observed spectra, and by the large uncertainties in the current calibrations of spectral types and effective temperatures.
    Astronomy and Astrophysics 12/2012; · 5.08 Impact Factor
  •
    ABSTRACT: This table contains the parameters used to classify CoRoT targets observed with the Giraffe spectrograph at the Very Large Telescope (VLT). It contains both parameters derived from the photometric time series and physical parameters (Teff and logg) derived from the spectra. We also include the final classification obtained with these parameters. (1 data file).
    VizieR Online Data Catalog. 11/2012;
  •
    ABSTRACT: We started a systematic search for periodic variable-star candidates in the EROS-2 database in the context of preparatory work for the Gaia satellite mission. The goal is to evaluate different classification tools and strategies, and to identify a large sample of variable candidates. In this paper we present the results of an assessment study of a three-step identification and classification process. In the study we took a sample of about 80,000 stars from one of the LMC EROS fields.
    Proceedings of the International Astronomical Union 04/2012; 7(S285):309-311.
  •
    ABSTRACT: The Hipparcos catalogue (ESA 1997, Cat. I/239) and the AAVSO Variable Star Index (Watson et al., 2011, Cat. B/vsx) are employed to complement the training set of periodic variables of Dubath et al. (2011, Cat. J/MNRAS/414/2602) with irregular and non-periodic representatives, leading to 3904 sources in total which describe 24 variability types. The attributes employed to characterize light-curve features are selected according to their relevance for classification. Classifier models are produced with random forests and a multi-stage methodology based on Bayesian networks, achieving overall misclassification rates around 12 per cent. Both classifiers are applied to predict variability types for 6048 Hipparcos variables still associated with uncertain or missing types in the literature. (3 data files).
    VizieR Online Data Catalog. 03/2012;
  •
    ABSTRACT: The Hipparcos catalogue (ESA 1997, Cat. I/239) and the AAVSO Variable Star Index (Watson et al., 2011, Cat. B/vsx) are employed to complement the training set of periodic variables of Dubath et al. (2011, Cat. J/MNRAS/414/2602) with irregular and non-periodic representatives, leading to 3881 sources in total which describe 24 variability types. The attributes employed to characterize light-curve features are selected according to their relevance for classification. Classifier models are produced with random forests and a multi-stage methodology based on Bayesian networks, achieving overall misclassification rates under 12%. Both classifiers are applied to predict variability types for 6051 Hipparcos variables associated with uncertain or missing types in the literature. (6 data files).
    VizieR Online Data Catalog. 03/2012;
  •
    ABSTRACT: We present an evaluation of the performance of an automated classification of the Hipparcos periodic variable stars into 26 types. The sub-sample with the most reliable variability types available in the literature is used to train supervised algorithms to characterize the type dependencies on a number of attributes. The most useful attributes evaluated with the random forest methodology include, in decreasing order of importance, the period, the amplitude, the V-I colour index, the absolute magnitude, the residual around the folded light-curve model, the magnitude distribution skewness and the amplitude of the second harmonic of the Fourier series model relative to that of the fundamental frequency. (2 data files).
    VizieR Online Data Catalog. 02/2012;
  •
    ABSTRACT: VOSED (http://sdc.cab.inta-csic.es/vosed/) is a tool developed in the framework of the Spanish VO to ease the generation of Spectral Energy Distributions (SEDs) by gathering information from the spectroscopic services available in VO. These datasets can be complemented with photometric information from a number of Vizier catalogues as well as with data provided by the user. The SEDs are provided in VOTable format and can be uploaded in other VO tools (like, for instance, VOSpec) for further visualisation and analysis. The main functionalities of VOSED are described in this poster.
    11/2011;
  •
    ABSTRACT: We present a novel automated methodology to detect and classify periodic variable stars in a large database of photometric time series. The methods are based on multivariate Bayesian statistics and use a multistage approach. We applied our method to the ground-based data of the Trans-Atlantic Exoplanet Survey (TrES) Lyr1 field, which is also observed by the Kepler satellite, covering ∼26 000 stars. We found many eclipsing binaries as well as classical non-radial pulsators, such as slowly pulsating B stars, γ Doradus, β Cephei and δ Scuti stars. Also a few classical radial pulsators were found.
    Monthly Notices of the Royal Astronomical Society 09/2011; 418(1):96 - 106. · 5.52 Impact Factor
  •
    ABSTRACT: A complete periodic star extraction and classification scheme is set up and tested with the Hipparcos catalogue. The efficiency of each step is derived by comparing the results with prior knowledge coming from the catalogue or from the literature. A combination of two variability criteria is applied in the first step to select 17 006 variability candidates from a complete sample of 115 152 stars. Our candidate sample turns out to include 10 406 known variables (i.e., 90% of the total of 11 597) and 6600 contaminating constant stars. A random forest classification is used in the second step to extract 1881 (82%) of the known periodic objects while removing entirely constant stars from the sample and limiting the contamination of non-periodic variables to 152 stars (7.5%). The confusion introduced by these 152 non-periodic variables is evaluated in the third step using the results of the Hipparcos periodic star classification presented in a previous study (Dubath et al. [1]).
    07/2011;
  • Luis M. Sarro, Angel Berihuete
    ABSTRACT: Solar explosive events are commonly explained as small scale magnetic reconnection events, although unambiguous confirmation of this scenario remains elusive due to the lack of spatial resolution and of the statistical analysis of large enough samples of this type of events. In this work, we propose a sound statistical treatment of data cubes consisting of a temporal sequence of long slit spectra of the solar atmosphere. The analysis comprises all the stages from the explosive event detection to its characterization and the subsequent sample study. We have designed two complementary approaches based on the combination of standard statistical techniques (Robust Principal Component Analysis in one approach and wavelet decomposition and Independent Component Analysis in the second) in order to obtain least biased samples. These techniques are implemented in the spirit of letting the data speak for themselves. The analysis is carried out for two spectral lines: the C IV line at 1548.2 angstroms and the Ne VIII line at 770.4 angstroms. We find significant differences between the characteristics of the line profiles emitted in the proximities of two active regions, and in the quiet Sun, most visible in the relative importance of a separate population of red shifted profiles. We also find a higher frequency of explosive events near the active regions, and in the C IV line. The distribution of the explosive events characteristics is interpreted in the light of recent numerical simulations. Finally, we point out several regions of the parameter space where the reconnection model has to be refined in order to explain the observations.
    Astronomy and Astrophysics 01/2011; 528. · 5.08 Impact Factor
  •
    ABSTRACT: We present an evaluation of the performance of an automated classification of the Hipparcos periodic variable stars into 26 types. The sub-sample with the most reliable variability types available in the literature is used to train supervised algorithms to characterize the type dependencies on a number of attributes. The most useful attributes evaluated with the random forest methodology include, in decreasing order of importance, the period, the amplitude, the V-I colour index, the absolute magnitude, the residual around the folded light-curve model, the magnitude distribution skewness and the amplitude of the second harmonic of the Fourier series model relative to that of the fundamental frequency. Random forests and a multi-stage scheme involving Bayesian network and Gaussian mixture methods lead to statistically equivalent results. In standard 10-fold cross-validation experiments, the rate of correct classification is between 90 and 100%, depending on the variability type. The main mis-classification cases, up to a rate of about 10%, arise due to confusion between SPB and ACV blue variables and between eclipsing binaries, ellipsoidal variables and other variability types. Our training set and the predicted types for the other Hipparcos periodic stars are available online.
    Monthly Notices of the Royal Astronomical Society 01/2011; 414. · 5.52 Impact Factor
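
The DANCe Pleiades abstracts listed above infer cluster membership with Bayes' theorem and Gaussian-mixture likelihoods that incorporate the measurement uncertainties. The following is only a minimal one-dimensional sketch of that general idea, not the authors' model: it assumes a single Gaussian for the cluster and a single Gaussian for the contaminants in one proper-motion component, and the function names and all numerical values are hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Density of a one-dimensional normal distribution at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def membership_probability(x, sigma_obs, prior_member, cluster, field):
    """Posterior probability that a source belongs to the cluster.

    cluster and field are (mean, intrinsic_sigma) pairs (hypothetical
    parameters). The measurement uncertainty sigma_obs is added in
    quadrature to each intrinsic width, which is how a Gaussian
    measurement error convolves with a Gaussian population model.
    """
    like_cluster = gaussian_pdf(x, cluster[0], math.hypot(cluster[1], sigma_obs))
    like_field = gaussian_pdf(x, field[0], math.hypot(field[1], sigma_obs))
    numerator = prior_member * like_cluster
    return numerator / (numerator + (1.0 - prior_member) * like_field)


# Illustrative values only (not taken from the DANCe analysis):
# a narrow cluster component and a broad field component, in mas/yr.
CLUSTER = (20.0, 1.0)
FIELD = (0.0, 10.0)

p_near = membership_probability(20.0, 1.0, 0.01, CLUSTER, FIELD)
p_far = membership_probability(0.0, 1.0, 0.01, CLUSTER, FIELD)
print(f"near cluster mean: {p_near:.3f}, near field mean: {p_far:.3g}")
```

In this sketch a source measured near the assumed cluster mean receives a much higher posterior than one near the field mean, even with a small prior membership fraction. The analysis described in the abstracts works in a multidimensional astro-photometric space with mixtures of several Gaussians and a curvilinear colour-magnitude model, which this toy example does not attempt to reproduce.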