Abel Dasylva

Statistics Canada | STATCAN · International Cooperation and Corporate Statistical Methods Division

Doctor of Statistics


Publications: 28
Reads: 1,088
Citations: 138
Introduction
I am a methodologist with a general background in survey methodology and biostatistics. I am currently developing state-of-the-art methods for linking massive data sets, assessing linkage errors and analyzing linked data while accounting for linkage errors.

Publications

Presentation
Duplicate records are records from the same unit in a given data source, regardless of whether they are identical. Their identification is required when the source is used to produce official statistics, such as a sampling frame or a census. To date, many Bayesian models have been described to perform this task in an automated manner. Yet, they inv...
Article
In official statistics, record linkage is used to find records from the same entity in many data sources, often without a unique identifier. Consequently, linkage errors arise that are commonly measured by recall and precision, two finite population parameters that require the identification of all the record pairs, where the records are fr...
Presentation
Full-text available
The probabilistic method of record linkage aims at making optimal linkage decisions for a given set of features. Yet it does not prescribe how to select these features. Another challenge is the estimation of the decision parameters due to the common lack of training data. This presentation addresses both issues with a model and a recursive parti...
Article
When linking massive data sets, blocking is used to select a manageable subset of record pairs at the expense of losing a few matched pairs. This loss is an important component of the overall linkage error, because blocking decisions are made early on in the linkage process, with no way to revise them in subsequent steps. Yet, measuring this contri...
Presentation
In the context of its "admin-first" paradigm, Statistics Canada is prioritizing the use of non-survey sources to produce official statistics. This paradigm critically relies on non-survey sources that may have nearly perfect coverage of some target populations, including administrative files or big data sources. Yet, this coverage must be measure...
Conference Paper
The accurate and cost-effective estimation of linkage errors remains a major challenge for the automated production and use of linked data. However, this exercise is worthwhile only if the linked data are fit for use. A new model is proposed to estimate the errors without clerical reviews, training data or conditional independence assumptions, under...
Preprint
When linking massive data sets, blocking is used to select a manageable subset of record pairs at the expense of losing a few matched pairs. This loss is an important component of the overall linkage error, because blocking decisions are made early on in the linkage process, with no way to revise them in subsequent steps. Yet, measuring this contri...
Presentation
Full-text available
This presentation describes a new model for the estimation of linkage errors without clerical reviews and without assumptions of conditional independence.
Preprint
Full-text available
In theory, the probabilistic linkage method provides two distinct advantages over non-probabilistic methods, namely minimal rates of linkage error and accurate measures of these rates for data users. However, implementations can fall short of these expectations either because the conditional independence assumption is made, or because a model wi...
Conference Paper
A new estimating equation methodology is proposed for the primary analysis of linked data, i.e. an analysis by someone having unfettered access to the related microdata and project information. It is described when the data come from the linkage of two registers with exhaustive coverage of the same population, or from the linkage of two overl...
Article
Full-text available
This article looks at the estimation of an association parameter between two variables in a finite population, when the variables are separately recorded in two population registers that are also imperfectly linked. The main problem is the occurrence of linkage errors that include bad links and missing links. A methodology is proposed when clerical...
Conference Paper
We propose an optimal estimating equation for logistic regression with linked data while accounting for false positives. It builds on a previous solution but estimates the regression coefficients with a smaller variance, in large samples.
Article
Full-text available
We propose an optimal estimating equation for logistic regression with linked data while accounting for false positives. It builds on a previous solution but estimates the regression coefficients with a smaller variance, in large samples.
Article
Background: This study summarizes the linkage of the Canadian Community Health Survey (CCHS) and the Canadian Mortality Database (CMDB), which was performed to examine relationships between social determinants, health behaviours and mortality in the household population. Data and methods: The 2000/2001-to-2011 Canadian Community Health Surveys w...
Conference Paper
Probabilistic linkage is susceptible to linkage errors such as missed links and false links. In many cases, these errors may be reliably measured through clerical reviews, i.e. the visual inspection of a sample of record pairs to determine if they are matched. A framework is described to effectively carry out such clerical reviews. It is based on a...
Article
This paper presents new constructions of multistage wave-mixing networks with arbitrary b×b space-switching elements, where b ≥ 2. In these networks, for a size of F fiber links and W wavelengths per link, converter requirements are O(F log_b W) or O(FW/b) for rearrangeable nodes, and O(F log_b W log_b(FW)) or O(FW log_b...
Article
There is current interest in differentiated service architectures where packets with different priorities can share the same queue. In the case of congestion, packets marked with higher drop probability are preferentially dropped in order to make buffer room for packets marked with lower drop probability. Active queue management (AQM) based on rand...
Article
Multistage cross connects with wave-mixing conversion have two essential characteristics. First, individual converters are simultaneously shared by a significant number of channels. Second, individual channels may be converted through one or more cascaded wave-mixing conversions. The combination of both design principles contributes to the degradat...
Article
We describe a general construction for space-wavelength log_2(FW, m, p) networks capable of wave-mixing conversion, including previous designs proposed in the literature, where F is the number of fibers and W is the number of wavelengths per fiber. In these networks, a lightpath is converted through at most log_2 W + min(m, log_2 W) cascaded conversio...
Article
The health status of the control plane and that of the data plane of a GMPLS-controlled optical network are independent in the physically separated control network implementation. In most control plane designs, besides the topology information, the entities of the routing protocol only record the number of available wavelengths on each link. However, the st...
Article
We describe what we believe to be new designs for all-optical cross connects, capable of wavelength conversion. They are based on two-dimensional, space–wavelength, Benes or Cantor topologies, and they exploit cascaded wave-mixing bulk frequency conversion. In these cross connects many channels at distinct frequencies can be simultaneously frequenc...
Conference Paper
The LDP (label distribution protocol) is used in the control plane to control an optical network. The data plane and the control plane of an optical network could be physically separate. A failure in the control plane therefore does not necessarily imply a data plane failure, nor that user communications have to be interrupted. The standard LDP, however, d...
Article
We consider single-hop wavelength-division multiplexed networks in which the transmitters take a nonzero amount of time, called tuning latency, to tune from one wavelength to another. For such networks, we show that, under certain conditions on the traffic matrix, there exist polynomial-time algorithms that produce the optimal schedule. Further, th...
Conference Paper
We consider the problem of obtaining non-trivial lower bounds on the lost revenue under any routing and admission control scheme in a multi-class loss network. First, we use the following simple idea to bound the performance of any coordinate-convex admission policy on a single link: the blocking probability of any call class is lower bounded b...
Conference Paper
We consider the problem of routing permanent virtual circuits over general-topology networks under some shortest-path rule. We derive bounds on the competitiveness ratio of any online algorithm as a function of the cost function used to describe the congestion on each network element, link or node.
Conference Paper
We propose competitive policies for admission control and routing in general topology networks, with bandwidth and buffer resources. The goal of these solutions is to maximize the network revenue, when the traffic demand is not known ahead of time, and over-allocation of network resources is allowed with some probability. On top of appropriate reso...
Article
We consider single-hop wavelength-division multiplexed (WDM) networks in which the transmitters take a nonzero amount of time, called tuning latency, to tune from one wavelength to another. For such networks, we show that, under certain conditions on the traffic matrix, there exist polynomial-time algorithms that produce the optimal schedule. Furth...