Carlos Alberto de Bragança PereiraInstitute of Mathematical Statistics, University of São Paulo, Brazil · Statistics
Carlos Alberto de Bragança Pereira
PhD in statistics
About
387
Publications
71,470
Reads
7,966
Citations
Introduction
On May 19th, 2015, I completed 46 years as a teacher and researcher at USP. On August 30th, 2015, I retired from my position. My papers are on different subjects; I believe I am a multidisciplinary scientist. At this moment my research is directed to foundations and to differential gene expression. In administration, I have been President of the Brazilian Statistical Association, Dean of the Institute, and Head of Department. In 2014 I was the Patron of all graduate courses. After being awarded a one-year research project by CNPq, I moved to Campo Grande to work for the Federal University of Mato Grosso do Sul (UFMS). I am very happy to be living with my wife in this very nice town.
Publications (387)
This article gives a conceptual review of the e-value, ev(H|X) -- the epistemic value of hypothesis H given observations X. This statistical significance measure was developed in order to allow logically coherent and consistent tests of hypotheses, including sharp or precise hypotheses, via the Full Bayesian Significance Test (FBST). Arguments of a...
In this paper, we propose a hierarchical Bayesian approach for modeling the evolution of the 7-day moving average for the number of deaths due to COVID-19 in a country, state or city. The proposed approach is based on a Gaussian process regression model. The main advantage of this model is that it assumes that a nonlinear function f used for modeli...
We present a new way to estimate the lifetime distribution of a repairable system consisting of similar (identical) components. We consider a repairable system to be one in which a failed component can be replaced by a new one. Assuming that the lifetime distribution of all components (originals and replaced ones) is the same, the position of a single com...
This article describes a coherent Bayesian measure of evidence for precise or sharp null hypotheses, the evidence value, or e‐value, derived from the fully Bayesian significance test (FBST), based solely on the posterior distribution of the parameters of the statistical model. The method can be easily implemented using modern numerical optimization...
A method for statistical analysis of multimodal and/or highly distorted data is presented. The new methodology combines different clustering methods with the GAMLSS (generalized additive models for location, scale, and shape) framework, and is therefore called c-GAMLSS, for “clustering GAMLSS.” In this new extended structure, a latent variable (cl...
The pandemic scenario caused by the new coronavirus, called SARS-CoV-2, increased interest in statistical models capable of projecting the evolution of the number of cases (and associated deaths) due to COVID-19 in countries, states and/or cities. This interest is mainly due to the fact that the projections may help government agencies in making...
Data Science: Measuring Uncertainties
A modeling procedure for task-based functional magnetic resonance imaging (fMRI) data analysis using a Bayesian matrix-variate dynamic linear model (MVDLM) is presented. With this type of model, less complex than the more traditional temporal-spatial models, it is possible to take into account the temporal and, at least locally, the spatial structu...
This paper presents a discussion regarding regression models, especially those belonging to the location class. Our main motivation is that, with simple distributions having simple interpretations, in some cases, one gets better results than the ones obtained with overly complex distributions. For instance, with the reverse Gumbel (RG) distribution...
In December of 2019, a new coronavirus was discovered in the city of Wuhan, China. The World Health Organization officially named this coronavirus as COVID-19. Since its discovery, the virus has spread rapidly around the world and is currently one of the main health problems, causing an enormous social and economic burden. Due to this, there is a g...
This special issue of the Brazilian Journal of Biometrics (BJB) contains 16 papers. The main topic of the special issue is "Biostatistics and Biometry in the Era of Data Science", and it is the result of a collaboration between the BJB and the Brazilian Region of the International Biometric Society (RBras) in the sequence of the cancelled 2020 RBra...
This article is a direct consequence of the authors' desire to discuss the role of statistics in data analysis. The analysis of coronavirus (COVID-19) databases is used to show simple but powerful statistical frameworks. We do believe that models for assessing future trends in temporal data in general, and in cases and/or deaths of COVID-19, b...
This is the introduction to a special issue of Entropy and to a book collecting its papers.
In this article, we propose a Bayesian criterion for the identification of differentially expressed genes by using the Kullback-Leibler divergence. The advantage of using the Kullback-Leibler divergence is that it allows measuring the influence of the treatment average on the posterior distribution of the parameters of the control distribution. To...
To perform statistical inference for time series, one should be able to assess if they present deterministic or stochastic trends. For univariate analysis, one way to detect stochastic trends is to test if the series has unit roots, and for multivariate studies it is often relevant to search for stationary linear relationships between the series, o...
To perform statistical inference for time series, one should be able to assess if they present deterministic or stochastic trends. For univariate analysis one way to detect stochastic trends is to test if the series has unit roots, and for multivariate studies it is often relevant to search for stationary linear relationships between the series, or...
Optimization and Stochastic Processes Applied to Economy and Finance is the English translation of this book's title. It has been used at IME-USP, the Institute of Mathematics and Statistics of the University of Sao Paulo, since 1993. Contents: Ch.1: Linear Programming; Ch.2: Non-Linear Programming; Ch.3: Quadratic Programming; Ch.4: Markowi...
This article gives a survey of the e-value, a statistical significance measure a.k.a. the evidence rendered by observational data, X, in support of a statistical hypothesis, H, or, the other way around, the epistemic value of H given X. The e-value and the accompanying FBST, the Full Bayesian Significance Test, constitute the core of a research p...
This article gives a survey of the e-value, a statistical significance measure a.k.a. the evidence rendered by observational data, X, in support of a statistical hypothesis, H, or, the other way around, the epistemic value of H given X. The e-value and the accompanying FBST, the Full Bayesian Significance Test, constitute the core of a research pro...
The breast cancer stromal compartment may influence responsiveness to chemotherapy. Our aim was to detect a stromal cell signature (using a direct approach of microdissected stromal cells) associated with response to neoadjuvant chemotherapy (neoCT) in locally advanced breast cancer (LABC). The tumor samples were collected from 44 patients with LABC (...
Background: Obsessive-compulsive disorder (OCD) is often a life-long disorder with high psychosocial impairment. Serotonin reuptake inhibitors (SRIs) are the only FDA approved drugs, and approximately 50% of patients are non-responders when using a criterion of 25% to 35% improvement with the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS). About 30...
Science. The descriptors were low-back pain, back pain, lower-back pain, prevalence, and elderly in Portuguese and English. Two independent reviewers conducted a search for studies and evaluated their methodological quality. The search strategy returned 2186 titles, and 35 were included in this review. The studies evaluated 135,059 elderly individu...
This paper presents an integrated approach for the estimation of the parameters of a mixture model in the context of data clustering. The method is designed to estimate the unknown number of clusters from observed data. For this, we marginalize out the weights for getting allocation probabilities that depend on the number of clusters but not on the...
In this work, we propose a modeling procedure for fMRI data analysis using a Bayesian Matrix-Variate Dynamic Linear Model (MVDLM). With this type of model, less complex than the more traditional temporal-spatial models, we are able to take into account the temporal and -- at least locally -- the spatial structures that are usually present in this t...
These slides illustrate our work on significance tests.
In this paper, we consider a Bayesian mixture model that allows us to integrate out the weights of the mixture in order to obtain a procedure in which the number of clusters is an unknown quantity. To determine clusters and estimate parameters of interest, we develop an MCMC algorithm called the sequential data-driven allocation sampler. In thi...
Briefly presents the methodology developed for the Quality Control survey of the School Census (Censo Escolar), whose importance grew after the enactment of Constitutional Amendment No. 14, which established a link between the allocation of federal funds and the number of students enrolled in elementary education (Ensino Fundamental). Describes the methodology and the ...
The reliability of a coherent system of components depends on the reliability of each component and the initial statistical work should be an estimation of the reliability of each component. This work represents a challenging task because if the system fails, the failure time of a given component cannot be observed, that is, the phenomenon of censo...
Elite judo demands high levels of physical and psychological skills. The brain-derived neurotrophic factor (BDNF) may be of particular interest in sports medicine for its ability to promote neuroplasticity. We investigated the plasma BDNF before and after a judo training session (Randori) and the maximal incremental ramp test (MIRT) in athletes fro...
Given the relevance of systematic review and meta-analysis techniques for obtaining consistent and reliable information in diverse fields of knowledge, this article presents preliminary version 1.0.0 of the Calculadora Metanalítica software, developed in R and C#, with a user-friendly graphical interface, which performs a Bayesian meta-analysis...
This article argues that researchers do not need to completely abandon the p-value, the best-known significance index, but should instead stop using significance levels that do not depend on sample sizes. A testing procedure is developed using a mixture of frequentist and Bayesian tools, with a significance level that is a function of sample size,...
Obsessive-compulsive disorder (OCD) is a psychiatric disorder characterized by obsessions and/or compulsions. Different striatal subregions belonging to the cortico-striato-thalamic circuitry (CSTC) play an important role in the pathophysiology of OCD. The transcriptomes of 3 separate striatal areas (putamen (PT), caudate nucleus (CN) and accumbens...
de Bragança Pereira was born in 1946 in the city of Rio de Janeiro and has held a degree in Statistics from the National School of Statistical Sciences (ENCE) since 1968. As an undergraduate, he was a teaching assistant at the Public Health Teaching Foundation, Oswaldo Cruz Institute. Shortly thereafter, in May 1969, he began his teaching career at th...
The purpose of these notes is to present an assessment of the probability of a candidate being elected in a two-round presidential election. In the first round, all candidates can be voted on. If one of them has more than 50% of the vote, (s)he is elected and there is no second round. If none of the candidates obtains more than 50% of the votes, then th...
The 37th edition of MaxEnt was held in Brazil, hosting several distinguished researchers and students. The workshop offered four tutorials, nine invited talks, twenty four oral presentations and twenty seven poster presentations. All submissions received their first choice between oral and poster presentations. The event held a celebration to Julio...
This is a special issue of Entropy: Foundations of Statistics. It will be turned into a book. It can be seen at
http://www.mdpi.com/journal/entropy/special_issues/Foundations_of_Statistics
Hypothesis testing in contingency tables is usually based on asymptotic results, thereby restricting its proper use to large samples. To study these tests in small samples, we consider the likelihood ratio test (LRT) and define an accurate index for the celebrated hypotheses of homogeneity, independence, and Hardy-Weinberg equilibrium. The aim is t...
Objective:
This sequential multiple assignment randomized trial (SMART) tested the effect of beginning treatment of childhood OCD with fluoxetine (FLX) or group cognitive-behavioral therapy (GCBT) accounting for treatment failures over time.
Methods:
A two-stage, 28-week SMART was conducted with 83 children and adolescents with OCD. Participants...
Meta-analysis is a procedure that combines results from studies (or experiments) with a common interest: inferences about an unknown parameter. We present a meta-analytic measure based on a combination of the posterior density functions obtained in each of the studies. Clearly, the point of view is from a Bayesian perspective. The measure preserves...
In this work, we propose two methods, a Bayesian and a maximum likelihood model, for estimating the failure time distribution of components in a repairable series system with a masked (i.e., unknown) cause of failure. As our proposed estimators also consider latent variables, they yield better performance results compared to commonly used estimator...
Functional magnetic resonance imaging or functional MRI (fMRI) is a non-invasive way to assess brain activity by detecting changes associated with blood flow. In this work, we propose a full Bayesian procedure to analyze fMRI data for individual and group stages. For the individual stage we use a multivariate dynamic linear model (MDLM), where the...
This work is about the statistical analysis of reliability models.
Usually, methods evaluating system reliability require engineers to quantify the reliability of each of the system components. For series and parallel systems, there are some options to handle the estimation of each component's reliability. We will treat the reliability estimation of complex problems of two classes of coherent systems: series-paral...
ABSTRACT: Tests for separate families of hypotheses were first considered by Cox (1961, 1962). In this article, we examine the Fully Bayesian Significance Test, the FBST, for discriminating between the lognormal and Weibull models, whose families of distributions are separate. Here, the problem is approached in the context of a linear mixture of the m...
Horizontal gene transfer (HGT) has a major impact on the evolution of prokaryotic genomes, as it allows genes evolved in different contexts to be combined in a single genome, greatly enhancing the ways evolving organisms can explore the gene content space and adapt to the environment. A systematic analysis of HGT in a large number of genomes is of...
The first step in statistical reliability studies of coherent systems is the estimation of the reliability of each system component. For the cases of parallel and series systems, the literature is abundant, but it seems that the present paper is the first to present the general case of component inferences in coherent systems. The failure time mode...
This letter explains how to use adaptive significance levels.
Measuring the dependence between random variables is one of the most fundamental problems in statistics, and therefore, determining the joint distribution of the relevant variables is crucial. Copulas have recently become an important tool for properly inferring the joint distribution of the variables of interest. Although many studies have address...
We use the logistic normal transformation to obtain the correct confidence intervals for prevalences.
The main objective of this paper is to find the relation between the adaptive significance level presented here and the sample size. We statisticians know of the inconsistency, or paradox, in the current classical tests of significance that are based on p-value statistics that are compared to the canonical significance levels (10%, 5%, and 1%): “Ra...
Analytical methods for granting credit have made great advances in recent decades, particularly in the field of statistical methods for classifying individuals into groups with different default rates. Most existing works suggest decisions of the type grant credit or not, and regard just marginally the expected financial outcome...
The main objective of this paper is to find a close link between the adaptive level of significance, presented here, and the sample size. We statisticians know of the inconsistency, or paradox, in the current classical tests of significance that are based on p-value statistics that are compared to the canonical significance levels (10%, 5% and 1%)...
The main objective of this paper is to find a close link between the adaptive level of significance, presented here, and the sample size. We statisticians know of the inconsistency, or paradox, in the current tests of significance that are based on p-value statistics that are compared to the canonical significance levels (10%, 5% and 1%): "Raise t...
The present work is the natural companion of Rodrigues et al. (2017). The reliability of a coherent system depends on the reliability of each component of the system. Thus, the initial statistical work should be the estimation of the reliability of each component of the system. This is not an easy task because in the observed time of the system fail...
The first step in statistical reliability studies of coherent systems is the estimation of the reliability of each system component. For the cases of parallel and series systems the literature is abundant. It seems that the present paper is the first that presents the general case of component inferences in coherent systems. The failure time model...
Current research to explore genetic susceptibility factors in obsessive-compulsive disorder (OCD) has resulted in the tentative identification of a small number of genes. However, findings have not been readily replicated. It is now broadly accepted that a major limitation to this work is the heterogeneous nature of this disorder, and that an appro...
Tests of separate families of hypotheses were initially considered by Cox (1961, 1962). In this work, the Fully Bayesian Significance Test, FBST, is evaluated for discriminating between the lognormal, gamma and Weibull models whose families of distributions are separate. Considering a linear mixture model including all candidate distributions, the FB...
An evaluation of the FBST, the Fully Bayesian Significance Test, restricted to survival models is the main objective of the present paper. A survival distribution should be chosen among the three celebrated ones: lognormal, gamma, and Weibull. For this discrimination, a linear mixture of the three distributions, for which the mixture weights are defined by...
Abstract
We aimed to investigate which items of the Yale-Brown Obsessive-Compulsive Severity Scale best discriminate the reduction in total scores in obsessive-compulsive disorder patients after 4 and 12 weeks of pharmacological treatment. Data from 112 obsessive-compulsive disorder patients who received fluoxetine (⩽80 mg/day) for 12 weeks were in...
In this study, the effects of the heavy metal cadmium on the stress protein HSP70 are investigated in the freshwater mollusk Biomphalaria glabrata. Adult snails were exposed for 96 h to CdCl2 at concentrations ranging from 0.09 to 0.7 mg/L (LC50/96h = 0.34 (0.30-0.37)). Time and concentration-dependent increases in the expression of HSP70 were observed...
This chapter presents frequentist statistical methods. Hypothesis tests, namely, the Cox test and alternatives, are described. An interpretation of the test results is provided. Applications to the exponential, gamma, Weibull, and lognormal distributions are presented. Misspecification and the efficiencies of false regression models are studied....
This chapter addresses the pure likelihood approach to model choice. The concepts of normalized, adjusted, relative, and profile likelihood are introduced. A relative likelihood approach for discriminating separate models is presented using an example. The concepts of computer simulations, the Monte Carlo method, Monte Carlo simulations, and...