Science topic
Epidemiological Statistics - Science topic
Explore the latest questions and answers in Epidemiological Statistics, and find Epidemiological Statistics experts.
Questions related to Epidemiological Statistics
Is it necessary to test the questionnaire before starting the study? And how should this be done?
The Newcastle-Ottawa Scale is suited to cohort and case-control observational studies, but I am doing a meta-analysis that includes both randomized and non-randomized clinical trials. Some of the non-randomized trials have a single arm (no comparison group), and I don't think the Newcastle-Ottawa Scale can be used here.
How do I calculate R_0, R_t, beta (the average number of contacts per person per unit time), and gamma?
I am trying to define an SIR model; my data contain counts of infected, dead, and recovered cases.
My country publishes this information, but I do not know how to calculate R_t from the variables mentioned above.
Thank you.
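One way to see how these quantities relate, as a minimal sketch rather than a fit to the actual country data: gamma is often fixed at 1/(mean infectious period) from the clinical literature, beta is then obtained by fitting the model to the observed case counts (or from the early growth rate), and once both are chosen, R_0 = beta/gamma and R_t = R_0 * S(t)/N. The R code below (deSolve package; population size, beta, and gamma are all assumed values for illustration) shows the mechanics:
library(deSolve)
N     <- 1e6          # assumed population size
gamma <- 1 / 10       # assumed recovery rate = 1 / (infectious period of 10 days)
beta  <- 0.25         # assumed transmission rate
sir <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dS <- -beta * S * I / N
    dI <-  beta * S * I / N - gamma * I
    dR <-  gamma * I          # "removed": recovered plus deaths can be folded in here
    list(c(dS, dI, dR))
  })
}
out <- as.data.frame(ode(y = c(S = N - 10, I = 10, R = 0),
                         times = 0:180, func = sir,
                         parms = c(beta = beta, gamma = gamma, N = N)))
R0 <- beta / gamma            # basic reproduction number
Rt <- R0 * out$S / N          # effective reproduction number over time
head(data.frame(day = out$time, R_t = round(Rt, 2)))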

I am a bit stuck with the following problem. I am currently performing a meta-analysis of observational studies using CMA, where my outcome variable is performance on a cognitive test as a function of adherence to an exposure variable (on a scale of 0 to 9). Normally, the results are presented either as mean differences between tertiles according to the level of adherence (low, middle, or high) or as a regression coefficient per additional unit of the exposure variable.
My main question is: how can I pool both types of studies into the same meta-analysis? I have found a similar question on risk estimates that suggests estimating the linear trend of the categorical results. I don't have access to raw data and only know sample sizes, mean differences, and confidence intervals. Is it possible to do the same in this case? If so, how should I do it?
I was thinking of simply including each tertile comparison as a subgroup of the same study and leaving the continuous variables as they are, but I am not sure whether this rough approach is acceptable.
Thanks.
Studies have shown that as many as 50% of submissions are declined outright by editors after submission. If the paper receives a “yay” instead of a “nay,” the journal sends it to reviewers. How do journals select competent reviewers?
Common sense says that more experience and a higher rank translate to better reviewing skills. However, a PLOS Medicine study in 2007 showed no such relationship. The authors examined 2,856 reviews by 308 reviewers for Annals of Emergency Medicine, a revered journal that for over 15 years has rated the quality of each review using a numerical scoring system. The results showed that experience, academic rank, and formal training in epidemiology or statistics did not significantly predict subsequent performance of higher-quality reviews. It also suggested that, in general, younger reviewers submitted stronger reviews.
So what? When presented with the opportunity, any physician can and would produce a scrupulous review of a manuscript, right? Wrong.
Flashback to 1998, when Annals of Emergency Medicine cleverly put together a fictitious manuscript riddled with errors and distributed it to 203 reviewers for evaluation. The errors were divided into major and minor categories. The major errors included such blunders as faulty or plainly unscientific methods, as well as blatantly erroneous data analyses. Minor errors consisted of failure to observe or report negative effects on study participants, incorrect statistical analysis, and fabricated references — just to mention a few. According to the authors, the majority of peer reviewers failed to identify two-thirds of the major errors in the manuscript. Forty-one percent of reviewers indicated that the manuscript should be accepted for publication.
What about consistency? In 1982, two authors took twelve papers that had been published by prestigious psychology journals within the previous three years and resubmitted them to the respective journals. The names of the authors for the resubmitted papers, and the names of their affiliations, were all changed to fictitious ones. Three manuscripts were recognized as being duplicates. Of the nine remaining papers, eight were rejected for “methodological inconsistencies,” not for lack of originality. One paper was accepted again.
Last week, I received an email from a well-respected medical journal. The editor wanted my help reviewing a manuscript that was being considered for publication. Noticing the request was addressed to “Dr. Spencer,” I shot back a quick reply saying there’d been a mistake. I’m not a doctor; I’m a medical student.
Hours later, I got this response:
Thank you for your email. We would very much appreciate your review of this manuscript. Your degree information has been corrected.
The peer review process clearly has flaws. It’s no wonder so many publications are retracted every year, or that each issue of a journal includes several corrections of previously published articles. Without universal standards, manuscript reviews remain subjective, imperfect, and inconsistent at best. The time has come to re-examine the quality of this system.
Meanwhile, those who rely on medical journals for practice guidelines should be educated on the flawed nature of this system and should be encouraged to judge publications on their merit rather than the apparent prestige of a journal.
Now, if you’ll please excuse me, I have a manuscript to review.
Robert Spencer is a medical student.
What is the justification for a 1:4 ratio in a case-control study?
What are the epidemiological and statistical reasons for using a 1:4 ratio in a case-control study? How should we choose four controls per case?
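For context, the usual statistical argument is one of diminishing returns: with r controls per case, the efficiency of the study relative to an unlimited control pool is roughly r/(r + 1), so little power is gained beyond about four controls per case. A minimal illustration in R (this is the standard textbook approximation, not a result from any particular study here):
r <- 1:8
data.frame(controls_per_case = r,
           relative_efficiency = round(r / (r + 1), 2))
# 1:1 gives 0.50, 1:2 gives 0.67, 1:3 gives 0.75, 1:4 gives 0.80, 1:5 gives 0.83 ...
# Beyond roughly four controls per case the extra gain in efficiency is small,
# which is the usual justification for stopping at 1:4.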
Dear All,
I need to estimate sensitivity, specificity, PPV, and NPV for clustered data using GEE, programming in SAS. I will use PROC GENMOD with dist=binomial link=log. However, it is not clear to me how the model should be specified, e.g., how do I tell SAS that, for sensitivity, the probability of interest is true positives / (true positives + false negatives)?
Is anybody there who can help me?
Thanks in advance
Federica
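One common way to set this up is to condition on the reference standard: sensitivity is P(test positive | truly positive), so the GEE is fitted to the truly positive records only with the test result as the outcome; specificity uses the truly negative records with the negated test result, and PPV/NPV condition on the test result instead. A minimal sketch of that idea in R with the geepack package (simulated data, illustrative variable names); the same structure should translate to PROC GENMOD with a REPEATED statement for the cluster:
library(geepack)
set.seed(1)
dat <- data.frame(cluster = rep(1:50, each = 8),
                  disease = rbinom(400, 1, 0.3))        # reference standard
dat$test <- rbinom(400, 1, ifelse(dat$disease == 1, 0.85, 0.10))
# Sensitivity: P(test = 1) among truly positive records
sens_fit <- geeglm(test ~ 1, id = cluster, family = binomial(link = "log"),
                   data = subset(dat, disease == 1), corstr = "exchangeable")
exp(coef(sens_fit))
# Specificity: P(test = 0) among truly negative records
spec_fit <- geeglm(I(1 - test) ~ 1, id = cluster, family = binomial(link = "log"),
                   data = subset(dat, disease == 0), corstr = "exchangeable")
exp(coef(spec_fit))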
Hello, I am working on heart disease prediction using data mining techniques. For that, I need a dataset with more than 1,000 patient records; could anyone please send me a link? Thank you.
I'm conducting a meta-analysis on hypoglycemic risk associated with diabetic drugs. Some studies report only the incidence rate of hypoglycemic events and the number of patients. Data are in the following format: 3.2 hypos/patient/year with 100 patients on drug A versus 2.1 hypos/patient/year with 100 patients on drug B; the event can occur more than once in each patient. Is there any way of estimating the standard error when neither confidence intervals nor standard deviations are given? Thanks
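One rough option, if the follow-up time in each arm can be recovered or assumed, is to reconstruct the event counts and use the Poisson approximation SE(log rate) ≈ 1/sqrt(events). A minimal sketch in R with the numbers from the question and an assumed one year of follow-up per patient (the follow-up time is the key assumption here, and recurrent events within patients make the true SE larger than this):
rate_A <- 3.2; n_A <- 100; T_A <- 1      # events per patient-year, patients, assumed years of follow-up
rate_B <- 2.1; n_B <- 100; T_B <- 1
events_A <- rate_A * n_A * T_A           # reconstructed event counts
events_B <- rate_B * n_B * T_B
log_irr    <- log(rate_A / rate_B)                 # log incidence rate ratio
se_log_irr <- sqrt(1 / events_A + 1 / events_B)    # Poisson SE of the log IRR
c(IRR   = exp(log_irr),
  lower = exp(log_irr - 1.96 * se_log_irr),
  upper = exp(log_irr + 1.96 * se_log_irr))
# Caveat: repeated events in the same patient violate the Poisson independence
# assumption (overdispersion), so this SE should be treated as a lower bound.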
Annals of Internal Medicine published today a very interesting paper introducing the "E-value" as a way of assessing the robustness of relative risks, odds ratios, hazard ratios, etc. to unmeasured confounding, which may change how we interpret and present these statistics. I'd like to hear the opinion of statisticians.
If anyone wants to play with the E-value, I've attached a calculator.
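For anyone who wants the formula itself: for a risk ratio RR ≥ 1, the point-estimate E-value from VanderWeele and Ding's paper is RR + sqrt(RR*(RR−1)); for RR < 1 the reciprocal is taken first. A minimal R version (ORs and HRs need the approximate conversions described in the paper):
e_value <- function(rr) {
  rr <- ifelse(rr < 1, 1 / rr, rr)   # protective estimates are inverted first
  rr + sqrt(rr * (rr - 1))
}
e_value(c(0.5, 1.5, 2, 3))
# e_value(2) is about 3.41: an unmeasured confounder would need to be associated
# with both exposure and outcome by risk ratios of at least 3.41 to fully explain
# away an observed RR of 2.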
Does it even make sense to study other outcomes in a matched nested case-control study? In this type of study, do we always have to use case/control status as our outcome?
If I do a meta-analysis of incidence rates from observational studies, can I also include the incidence rate from the control group of an RCT? Could the control group of an RCT be treated as a comparable cohort? (I am not interested in effect size.)
When the prevalence is not known and it is difficult to obtain a mean and standard deviation, how should the sample size be calculated in such cases? Does it matter whether the study is descriptive, analytical, or empirical?
I am planning to perform a meta-analysis of randomized studies. However, I would like to control the risk of Type I and Type II errors due to repeated hypothesis testing. Does anyone have experience with trial sequential analysis (TSA)? Is it compatible with statistical software other than RevMan?
I am planning to perform a meta-analysis of side effects for a medication. The outcome is the number of people who experienced the side effect of interest out of the total exposed. The duration of follow-up (weeks to years) and the sample sizes of the studies (14 to 30,000) vary. The incidence is in the range of 0.3%. Do I need to transform the data, and which form of transformation would be best? Secondly, which measure would be most appropriate for the meta-analysis?
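For proportions this small, the logit or Freeman-Tukey double-arcsine transformation is commonly used to stabilise the variances before pooling. A minimal sketch with the metafor package in R, using made-up counts purely for illustration:
library(metafor)
events <- c(2, 5, 40, 1)               # illustrative numbers of patients with the side effect
n      <- c(600, 1800, 14000, 300)     # illustrative study sizes
dat <- escalc(measure = "PFT", xi = events, ni = n)   # Freeman-Tukey double-arcsine transform
res <- rma(yi, vi, data = dat, method = "REML")       # random-effects pooling
# Back-transform the pooled estimate to a proportion (the harmonic mean of the
# sample sizes is conventionally used for the PFT back-transformation).
predict(res, transf = transf.ipft.hm, targs = list(ni = n))
If follow-up time differs widely across studies, pooling incidence rates (events per person-time, the "IR" measures in escalc) may be more appropriate than pooling simple proportions.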
I was trying to do a meta-analysis of the effect of ART on congenital anomalies, with subgroup analyses. I am getting data for several subgroups (ART, PI, NRTI, NNRTI) from a single study. Can I enter them together in RevMan? When I did, the same study appeared twice in the pooled analysis. Does anyone have information on how to enter this type of data in RevMan or Stata?
At the moment I am critically appraising a meta-analysis in which the researchers used a random-effects meta-analysis with the inverse-variance method to obtain odds ratios.
My PICOT is as follows: for nurses at a jail setting, how does implementing a policy of EBP guidelines on TDM influence serum drug lab testing?
I plan on comparing the percentage of completed serum drug lab testing pre-intervention to the percentage of completed serum drug lab testing post-intervention.
1. Is it correct to mix all study types (randomized controlled trials, cohort, and case-control studies) in a single meta-analysis to obtain an overall RR, and then stratify the analysis by study type (randomized controlled trials vs. cohort/case-control studies)? If so, is the overall RR reliable for the final report once both heterogeneity and publication bias are taken into account?
Would you agree that the advantages of including both RCTs and observational studies in a final meta-analysis could outweigh the disadvantages in many situations?
2. In our study, to explore the source of heterogeneity, we performed subgroup analyses according to type of study design. We also identified the studies with both the largest variance (wide intervals) and extreme outlier weights in each clinical outcome group. We then conducted a leave-one-out sensitivity analysis to assess the impact of individual studies: the average RR was estimated in the absence of each study, and heterogeneity was quantified using both the I² and τ² statistics. What do you think about the "leave-one-out sensitivity analysis"? Do you agree with the following statement?
"In meta-analyses, the rationale of deleting studies should be clearly stated and medically or methodologically sound, not only based on variance."
3. Is the following statement correct?
"if heterogeneity is observed, the common estimate is not of much meaning and should not be cited."
I am completing a systematic review of interventions for my MA thesis. The main focus of my thesis was going to be a meta-analysis. I have 5 articles that meet my inclusion criteria, but only one study contains the data necessary for a meta-analysis. Two of these studies use multi-level modelling and do not provide group standard deviations (SD). My concern with using standard errors (SE) rather than SD, or even converting the SE to SD myself, is confounding variables. Is there a way to include these studies in a meta-analysis without original data or SDs? Because I don't know specifically which variables the original researchers statistically controlled for across studies, would it make sense to include them, or would it be better to use a narrative synthesis to summarize the data rather than comparing overall effect sizes? I know a meta-analysis is only as good as the data included, and I want to make sure I am contributing to my field in a meaningful way rather than inflating effects or circulating inaccurate information. Any information and/or articles you could give me on pooling effects across studies or on using SE in meta-analysis would be much appreciated.
Recently a collaborator came to me with a project that aims to validate, in his setting, several prediction models for digestive bleeding: the Rockall score, Glasgow-Blatchford score, AIMS65 score, Charlson comorbidity index, Child-Pugh classification, and MELD. His question was how to estimate a reasonable sample size. I took a look at the original papers, and their validation sets ranged from 197 subjects with a 41% outcome rate to 32,504 subjects with a 2% outcome rate. It was not reassuring to see that, in the largest validation set, there were significant coefficients as low as 0.3 with an SE of 0.18. I also looked around for some guidance and found the following comments in Steyerberg's book, in the section on sample size for validation studies: "modest sample sizes are required for model validation"; "to have reasonable power, we need at least 100 events and at least 100 non-events in external validation studies, but preferably more (>250 events). With lower numbers the uncertainty in performance measures is large." But the text also presents several simulation results showing that the answer depends a lot on the coefficients and SEs, which could lead, even with these numbers of outcomes, to power as low as 50%. Taking this rule of thumb, and expecting a 4% outcome rate in a validation cohort, it would be necessary to include 2,500 to 6,250 subjects. Pretty scary, and a very wide range, which does not help much at the planning stage. I found a logistic regression sample size formula, but it did not help much because it allows only two predictors at a time, and when permuting the predictors' coefficients and SEs in the formula, the N ranged from a few hundred to tens of thousands. http://www.dartmouth.edu/~eugened/power-samplesize.php
I would very much like to be comfortable recommending a sample size of 2,500, taking Steyerberg's rule of thumb as the basis for it. I would like to hear from those who have experience with this issue.
I would like to calculate the Fleiss kappa for a number of nominal fields that were audited from patients' charts. I have a situation where charts were audited by 2 or 3 raters. Is anyone aware of a way to calculate the Fleiss kappa when the number of raters differs? I've been trialling a macro in SAS by Chen called MKAPPA that accounts for "missing" observations, but I don't think this really addresses my needs. Any suggestions are greatly appreciated!
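One practical alternative when the number of raters varies between charts is Krippendorff's alpha, which is designed to tolerate missing ratings. A minimal sketch with the irr package in R and made-up nominal ratings (NA marks charts a rater did not audit):
library(irr)
ratings <- rbind(                        # rows = raters, columns = charts
  rater1 = c(1, 2, 2, 1, 3, 2),
  rater2 = c(1, 2, 1, 1, 3, 2),
  rater3 = c(1, NA, 2, NA, 3, NA))       # this rater audited only some charts
kripp.alpha(ratings, method = "nominal")
# Fleiss' kappa (irr::kappam.fleiss) assumes the same number of ratings per
# subject, which is why it becomes awkward when 2 vs 3 raters audited different charts.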
I have faced difficulties in calculating the sample size for a cluster randomized controlled trial with three groups (one control and two different intervention groups). Do you have any advice or recommendations? Are there assumptions about what the ICC (rho), coefficient of variation, number of clusters, and cluster size should be? Which statistical software is appropriate for calculating the sample size for a cluster randomized controlled trial?
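As a starting point, the usual approach is to compute the individually randomised sample size and inflate it by the design effect DEFF = 1 + (m − 1)·ICC, where m is the average cluster size. A minimal sketch in base R with assumed numbers (the outcome proportions, ICC, and cluster size below are placeholders, not recommendations):
p1 <- 0.30; p2 <- 0.20     # assumed outcome proportions, control vs intervention
icc <- 0.05                # assumed intracluster correlation
m   <- 20                  # assumed average cluster size
n_ind <- power.prop.test(p1 = p1, p2 = p2, power = 0.80, sig.level = 0.05)$n
deff  <- 1 + (m - 1) * icc                  # design effect
n_per_arm        <- ceiling(n_ind * deff)   # individuals needed per arm
clusters_per_arm <- ceiling(n_per_arm / m)  # clusters needed per arm
c(n_individual = ceiling(n_ind), DEFF = deff,
  n_per_arm = n_per_arm, clusters_per_arm = clusters_per_arm)
# With three arms, repeat this for each pairwise comparison of interest and
# consider a multiplicity adjustment (e.g. a smaller sig.level) for the two
# intervention-versus-control tests.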
Recently, we completed a TST survey among a diabetic population in some Malaysian primary care clinics. We used 2 tuberculin units (2 TU of RT23) during the TST survey. I would like to know how the use of different tuberculin units (2 TU, 5 TU, 10 TU) may affect TST results and how the results should be interpreted. Is there any evidence in the literature to substantiate that 2 TU reduces false positives in a high-burden country with wide BCG coverage?
I am combining multiple studies on mortality in CKD across different categories of 25(OH)D. One study reported a hazard ratio per 1-standard-deviation increase in 25(OH)D, and some reported ordinary hazard ratios (HR, or relative risk). What is the difference between HR/SD and HR? Is there any possibility of converting HR/SD to HR? Assuming both types of study measure mortality, can both be combined together? Thank you.
In epidemiological studies, we often find the terms "predictor" and "risk factor". What is the difference between them, and when should each be used? Is this also related to the statistical analysis that we use?
Dear RG members.
I'm planning to conduct a systematic review about the choice of surgical approach for the treatment of a surgical condition.
My problem is that the majority of the studies are observational and report results like this: "Patients with the condition were admitted to the ER. Intervention X was done in 15 of the 45 patients, while intervention Y was done in 30 of the 45 patients." These could be taken as comparison groups, but I don't want to compare them in a meta-analysis, since the data come from observational research.
Instead of conducting a meta-analysis of comparisons, I want to do a proportion meta-analysis: taking the proportion of outcome A among patients who received intervention X and the proportion of outcome A among patients who received intervention Y, and pooling these proportions across studies instead of comparing them. Do you believe this is possible?
For example, if one of my outcomes is "re-operation", I would say something like: re-operation for bleeding was done in approximately 56% (95% CI XX-XX) of the patients with intervention X. On the other hand, re-operation for bleeding was done in approximately 30% of patients with intervention Y. (NOT COMPARING.)
Do you think it is better to perform a classic meta-analysis using a random-effects model, comparing outcomes between groups and across studies, or do you believe my approach is a good one?
Thanks.
Ramiro
We will be conducting a project to assess the risk factors for developing type 2 diabetes in a young population through statistical analysis of the data we generate from volunteers in a cohort study of around 95 volunteers; it will involve data such as blood analysis, BMI, diet, age, etc.
What statistical test using SPSS would be useful to include and why?
I am a PhD student trying to see whether there is a statistically significant difference from my educational intervention to minimize self-medication with antibiotics among students. I have 100 pharmacy students and 160 dental students. How can I calculate the required sample size to detect a significant difference?
Thank you.
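A minimal sketch in base R and the pwr package, with purely illustrative proportions (the expected self-medication percentages have to come from your own literature or pilot data); it shows both the required sample size per group and the power achievable with the 100 and 160 students already available:
p1 <- 0.35   # assumed proportion self-medicating before the intervention / in group 1
p2 <- 0.20   # assumed proportion afterwards / in group 2
# Required sample size per group for 80% power at alpha = 0.05 (equal groups):
power.prop.test(p1 = p1, p2 = p2, power = 0.80, sig.level = 0.05)
# Power achievable with the unequal groups already available (100 vs 160):
library(pwr)
h <- ES.h(p1, p2)                                  # Cohen's effect size for two proportions
pwr.2p2n.test(h = h, n1 = 100, n2 = 160, sig.level = 0.05)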
Please guide me: how can I use, or even convert, Mean (%) to Mean (mm)?
I wish to demonstrate the ordinal property of a set of items that are supposed to form a Guttman scale. I was not able to find any useful suggestions on the internet about how to compute that analysis in Stata 13 (nor "by paper and pencil").
Could anyone help me?
Thanks for any suggestion.
We assess regulatory cell markers on CD4+ T cells from patients by flow cytometry and would like to compare patients with "mild disease" versus "severe disease". My question is which statistical test I should apply, and why. We compare frequencies of marker-positive cells among the CD4+ T cell population. Patient numbers are small: in one analysis we have only 13 patients in the "mild disease" group and 3 in the "severe disease" group (for other markers we have 20 versus 10, or 30 versus 15). The results range from 7-34% for mild disease and from 32-39% for the severe disease group. We used Student's t-test throughout the results, but during the revision process one reviewer pointed out that he/she would use a non-parametric test for small n (without specifying further).
My question is: which test would you use? Do you think there is a cut-off for n below which you would prefer non-parametric tests (n < 10, for example)? I already consulted a biostatistician at our institution, and he said that he was not convinced the data were not normally distributed, but suggested a cut-off of n < 10 for using non-parametric tests. As this biostatistician was not at all familiar with immunological data, I would like to hear other opinions. What would you do?
Thank you for your help!
I have censored survival data. The Gaussian distribution appears to be an excellent fit. How can I compute an R-square statistic? The model has no covariates; it only estimated mu and sigma.

I am attempting to track down reliable data on BMI/BSA (anthropometrics) in the German population. So far, my search has not turned up any suitable data. Ideally, I would be interested in data from the last 10 years.
Thank you
Cystic fibrosis is increasing in our area, and we need to establish a registry.
Hi,
Does anybody know if there is a command in STATA that allows you to explore the relative differences in the distribution of continuous variables between groups?
I have found a reference on the internet to a command called 'reldist' that was written back in 2008, but I don't think it is available via ssc download. Does anybody know if this has been superseded by anything else?
Many thanks
Elaine
Today, there are many screening tests for a variety of diseases. Most of these tests are valid and reliable, and sensitivity, specificity, positive predictive value, negative predictive value, and other diagnostic accuracy criteria have been calculated for them. Now, I want to know how to compare the diagnostic accuracy of two or more screening tests.
For example, we have three tests A, B, C and profile of each test is as follows (Profile of each test has been extracted from a separate article.):
Test A:
Sensitivity 87%, specificity 79%, positive predictive value 75%, and negative predictive value 64%
Test B:
Sensitivity 84%, specificity 81%, positive predictive value 70%, and negative predictive value 71%
Test c:
Sensitivity 90%, specificity 85%, positive predictive value 82%, and negative predictive value of 73%
Now I would like to compare the sensitivity, specificity, and other indicators of these tests statistically, and determine whether the differences between the tests are statistically significant or not.
Please suggest me related books and articles.
Please guide me. Thanks a lot
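Because each profile comes from a separate article (different subjects), the comparison is between independent proportions, which requires recovering the denominators (numbers of diseased and non-diseased subjects) from each source paper; the percentages alone are not enough. A minimal sketch in R with assumed denominators, comparing the sensitivities of tests A and B (the same idea works for specificity using the non-diseased denominators; McNemar's test would be used instead if the tests had been applied to the same subjects):
n_diseased_A <- 150; sens_A <- 0.87   # assumed number of diseased subjects in study A
n_diseased_B <- 120; sens_B <- 0.84   # assumed number of diseased subjects in study B
tp_A <- round(sens_A * n_diseased_A)  # reconstructed true positives
tp_B <- round(sens_B * n_diseased_B)
prop.test(x = c(tp_A, tp_B), n = c(n_diseased_A, n_diseased_B))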
I am working with human plasma and PBMCs. I measure immunoglobulins in plasma and interleukins in stimulated PBMCs. I am unsure which measure of central tendency to use in the analysis, because I have seen some articles using the mean and others using the median. Which would be optimal to use?
I am currently writing a systematic review, and the majority, if not all, of my studies are descriptive. I looked for quality assessment tools and found that the QAT is widely used: http://www.nccmt.ca/registry/view/eng/14.html, but it seems applicable to intervention rather than descriptive studies.
I also came across Circum, which seems appropriate, but I haven't seen any review that used it before: http://circum.com/index.cgi?en:appr
Do you think I should be using the QAT? What other tools would you suggest?
Thank you
Mohamad
I have research on the distribution of disease samples using spatial analysis and statistical analysis. At the end of the work, I want to predict future hotspot clusters using the hotspot clusters detected in the statistical analysis.
Dear Colleagues,
Our research group is preparing a meta-analysis of neonatal imitation. We are hereby requesting any unpublished data, regardless of the size of the sample or the pattern of results.
To accomplish the meta-analysis, we don't necessarily need full datasets, but we do require a description of the methodology, the resulting means and standard deviations, summary statistics, effect sizes, and/or test statistics.
If you have an unpublished neonatal imitation study that you are willing to contribute, can you please send it to Jonathan Redshaw: j.redshaw@uq.edu.au.
Please also include a citation for the unpublished work so that we can acknowledge the source.
With sincere thanks,
Siobhan
What is the best clinical interpretation of an odds ratio of 0.5? Does it make sense to speak of a 200% risk (or odds) reduction for exposed people? This OR was extracted from a hospital-based case-control study.
I have repeated-measures data and a single-time-point outcome variable. I want to see the effect of each time point's measurement on the outcome variable, and also which time point has more influence on the outcome after adjustment for some covariates. It would be a great help if anyone could tell me the appropriate statistical method for this analysis.
I have 240 human plasma samples divided into 8 treatment groups (30 in each) in which I want to measure endothelial activation and other markers by Bio-Plex, but I don't have enough money to run all the samples. What is the best approach: 1. pool samples from the different treatment groups, or 2. randomly select representative samples from each group?
Another question: could I run only one replicate of each sample?
An odds ratio of 3.25 with a confidence interval of 0.975 to 10.901, P = 0.055. Is it significant or not? Please see the attached figure.

Earwax composition among Caucasians, among Ethnic Indians of Canada and South America, among Asians and among Indigenous Africans.
Actually, we are doing a validation study of a Chinese-version questionnaire. We would like to perform a Pearson correlation test to examine (1) its concurrent reliability between telephone interview and clinician interview, and (2) its test-retest reliability by telephone interview at time 1 and time 2.
The sample size is 200, including 80 respondents classified as "cases", 65 as "subthreshold cases", and 55 as "non-cases" (in SPSS, we would code the variable "diagnosis" as 0 = non-case, 1 = subthreshold case, 2 = case). However, in the real-world situation, the prevalences of cases and subthreshold cases are 10% and 20%, respectively. Therefore, cases and subthreshold cases are oversampled in our sample.
The study that we replicated described that "The completed clinician interviews were weighted to adjust for the oversampling of positives (i.e. case and subthreshold case)" and then described in the statistical methods that "The Taylor series linearization method was used to adjust estimates of statistical significance for the effects of weighting."
I just wonder how to make the weighting and Taylor series linearization in SPSS.
In a systematic review, if the outcome information in some articles is not uniform (e.g., some report means, some medians, and some proportions), how can it be summarized in a forest plot?
MOOSE: Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group (2000)
PRISMA: Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement (2009)
I have data on the monthly number of patients with disease X admitted to tertiary care hospitals in Pakistan. The data are zero-inflated and show seasonality (one cycle per year) but no trend. I want to evaluate the association between the number of cases and climate variables. Could you please guide me on which analysis I should use?
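One option worth considering, sketched below in R under the assumption that the counts are monthly with annual seasonality and genuine excess zeros, is a zero-inflated negative binomial model with harmonic terms for season plus the climate variables; the data frame df and its variables (month, cases, rainfall, temperature) are placeholders:
library(pscl)
df$sin12 <- sin(2 * pi * df$month / 12)   # harmonic terms for the annual cycle
df$cos12 <- cos(2 * pi * df$month / 12)
fit <- zeroinfl(cases ~ sin12 + cos12 + rainfall + temperature | 1,
                data = df, dist = "negbin")
summary(fit)
# Comparing this against an ordinary negative binomial with the same terms
# (MASS::glm.nb), e.g. by AIC, indicates whether the zero-inflation part is needed.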
Dear researchers, I want to analyse the association between disease (absence or presence: dependent variable) and SNPs (independent variables) and other parameters using binary logistic regression (SPSS). How can I adjust for age and sex? Thanks.
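In practice, "adjusting for age and sex" just means entering them as additional covariates in the same logistic regression; in SPSS they go into the covariates box alongside the SNP. A minimal sketch of the equivalent model in R, with placeholder data and variable names:
# 'mydata', 'disease', 'snp', 'age' and 'sex' are placeholder names for your own data.
fit <- glm(disease ~ snp + age + sex, family = binomial, data = mydata)
summary(fit)
exp(cbind(OR = coef(fit), confint(fit)))   # age- and sex-adjusted odds ratios with 95% CIs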
Can you suggest any databases and/or sociological studies on the life courses of people affected by thalidomide in Italy? Thank you.
How many cases of influenza A(H1N1) complicated by meningitis or meningoencephalitis have been reported worldwide?
Does anyone know if there is a population attributable risk equation for continuous outcomes?
I am studying long term epidemiology trends of dengue in Barbados. I am looking at ways and means to define endemicity and possibly quantify the burden of dengue so as to be able to gauge the changes in epidemiological characteristics over time.
I've been doing meta-analyses on medication use and/or exposure during pregnancy for a while. Because ethical barriers exist regarding RCTs, the best evidence in this field usually comes from prospective cohort and case-control studies, both of which may have different types of biases.
My question is: would you combine the case-control and cohort data in the same meta-analysis, or would you prefer to analyse them separately? Is combining both in the same meta-analysis definitely unacceptable, or is it a choice?
I'd very much appreciate your suggestions.
Dear all,
We have observed and catalogued clinical pneumonia patients into three groups: patients infected with bacterium P, patients infected with bacterium S, and patients infected with both P and S. The prognosis for the patients with combined P and S infection is considerably worse than for the other two groups; there seems to be a synergistic effect of these two pathogens within a patient. Which statistical method should I use to analyse these three groups of samples?
Thank you.
Does anyone know the correct data format for the poprisktime command in Stata for age-period-cohort analysis? Could someone please help me with an example?
I am reviewing studies to calculate pooled prevalence of a disease in the country. I need to calculate pooled prevalence and to plot Forest Plots for overall prevalence and for each subgroup. Please suggest the type of review I have to use (Methodology, Flexible, etc.) and any reference link I can read. Thanks.
Specifically, my review is a systematic review of the construct of financial well-being. It is systematic because I want to respect all the guidelines for systematic reviews (e.g. PRISMA), but my aim is not to evaluate a clinical intervention's efficacy; rather, it is to synthesize how previous studies have defined and operationalized the construct of financial well-being.
Registries such as PROSPERO or Cochrane accept only reviews that have clinical outcomes.
Due to the different sampling weights, I have concerns. I have included some background info for context and questions are at the bottom.
ISSUE--Developing a database: Individual-level collision records data will be merged with community characteristics gathered from several data sources (Census, population-based surveys, other individual-level data).
Population-based survey data have different population sampling weights.
All 4 datasets will be linked to the collision data using a geographic variable contained within each dataset.
DATASETS---
Individual-level Collision Data (linking variable=drilled down geographic variable which can be modified for ease of linking to different geographic levels)
+ Census data
+ County-level population-based survey #1 (weighted at the county level & has zip code)
+ County-level population-based survey #2 (weighted at the county level & has zip code)
QUESTIONS:
1) If the appropriate sampling weights are applied to the population surveys, is it allowable to link them to the collision data and other datasets?
2) The population survey data are representative of the County. However, we would like to present data at a more granular level (ex: census tract, planning area, etc.).
- Is there a methodological approach to accomplish this?
Thank you!
I have been documenting strange step-like changes in deaths in a number of countries and would like others to check and see if these observations can be replicated using small-area death statistics. Attached is a paper documenting the parallel effects of these events on medical admissions to hospital and it gives an idea of the sort of analysis which could be required.
If needed, many of the supporting studies can be accessed at www.hcaf.biz on the 'Emergency Admissions' web page, which also contains the published studies on deaths.
Much appreciated if you can assist.
Is it possible to estimate hepatitis B virus prevalence in a population from the data collected during blood donation and sample testing?
I am conducting a meta-analysis and I plan to only include quasi-experimental studies if the treatment and comparison groups are closely matched on particular baseline characteristics. I have seen "closely matched" defined many different ways in various meta-analyses. I am looking for an efficient way to consistently determine the equivalence of groups when coding studies. Thanks in advance for any suggestions you can provide!
In a meta-analysis of 3 RCTs, there was unlucky randomisation of baseline (±SD) and end-of-study (±SD) PRO psychometric scores. Cochrane Handbook section 17.7 discourages using the mean difference. The result is non-significant when the end-of-study measure is used alone, but a significant benefit is shown if the change in the measure is used, because of the unlucky randomisation (all three trials had a worse baseline in the treatment arm, thus reducing the end score, despite a significant improvement). Handbook section 16.1.3.2 advises imputing an SD for the change, but even worst-case assumptions for the correlation still only approximate the average of the baseline and end SDs.
This issue has also been sent to Gotzsche and Glasziou, as the benefit is 'obvious' but the conclusion is negative!
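For reference, the Handbook imputation mentioned here rests on the identity SD_change = sqrt(SD_baseline² + SD_final² − 2·r·SD_baseline·SD_final) for an assumed baseline-final correlation r. A minimal sketch in R with made-up SDs, showing how the assumed correlation drives the imputed change-score SD:
sd_baseline <- 12
sd_final    <- 11
r           <- c(0.5, 0.7, 0.9)   # assumed baseline-final correlations to try
sd_change <- sqrt(sd_baseline^2 + sd_final^2 - 2 * r * sd_baseline * sd_final)
data.frame(correlation = r, sd_change = round(sd_change, 2))
# Lower assumed correlations give larger (more conservative) change-score SDs;
# r = 0.5 reproduces roughly the average of the two SDs, consistent with the
# observation in the question.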
This is a question for all the statisticians out there! I am conducting an interrater reliability study. I have decided that using the intraclass correlation coefficient (ICC) would be the most appropriate statistical procedure, given that I have more than 2 raters involved (k = 4). I have also calculated my appropriate sample size. My question relates to the type of model. I have chosen absolute agreement over consistency, as I want to measure the ratings with absolute agreement to test the reliability of a proposed assessment rubric. I am not really sure whether it is a two-way mixed or a two-way random model, given that I have "randomly" selected and invited raters from different disciplines; however, all four raters are fixed and will rate the same subjects, drawn randomly from a larger population pool. The raters do not know the subjects. I am inclined to think this is a two-way mixed model rather than a two-way random model. What is the general consensus? I would greatly appreciate any feedback.
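If it helps, here is a minimal sketch of the corresponding computation with the irr package in R, using made-up ratings; with model = "twoway" and type = "agreement" the single-rater, absolute-agreement ICC is computed, and the random-versus-mixed distinction mainly affects how far the result generalises to other raters rather than the numerical estimate itself:
library(irr)
set.seed(2)
ratings <- matrix(sample(1:5, 20 * 4, replace = TRUE), nrow = 20, ncol = 4,
                  dimnames = list(NULL, paste0("rater", 1:4)))  # 20 subjects x 4 raters
icc(ratings, model = "twoway", type = "agreement", unit = "single")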
The model includes one binary outcome (0/1; incidence rate ~1.2%), one main exposure, and 13 covariates. The whole model is significant and the goodness-of-fit is acceptable. However, the model diagnostics are completely questionable (see figure): almost all the observations in the y = 1 category have residuals larger than 3. I wonder whether this is due to the low event rate of the outcome. Can anyone give me some advice on how to deal with this problem? If the purpose of the model is to explain an exposure-disease relationship rather than to predict, can I just ignore the model diagnostic results?

Hi, I am looking for recent articles on future projections of the HIV/TB co-infection burden in low-income settings (especially sub-Saharan Africa): epidemiological studies, mathematical modelling, etc.
Thanks.
I am running a survival analysis using SAS. I have a very large sample size (>19,000) and I am finding very narrow CIs, for example 1.066 (1.062-1.069). The model is also weighted because it is a complex survey with mortality follow-up. What would explain the narrow CI?
I want to understand the physical significance of the product of D·ΔI (the diffusion term, with D the diffusion coefficient and Δ the Laplace operator) and S, the number of susceptibles.
Thanks
I am doing a cross-sectional study, using surveys, on the prevalence of tobacco chewing. I will study the prevalence and its characteristics in the population. The survey includes sociodemographic factors and behavioural questions about risk factors. How are we going to calculate the reliability and validity of this type of questionnaire? I have adapted it from state surveillance forms.
Interested in improved glycemic control RCTs and Meta-Analysis looking at MDI or glargine injections versus CSII pump administration
Hi,
I have data from a randomized controlled trial which assessed the effect of drug A versus B on stroke. Now I want to select some specific subgroup (eg. the patients with cancer) and see what the risk factors for death in this specific subgroup would be.
Please note, the trial has heterogeneous groups, e.g. some patients with cancer, some with heart disease, some with COPD, etc. Now my research question is: what are the risk factors for death in patients with cancer? That means I want to use the data from only one subgroup, but my interest is not the interventions in the trial protocol (i.e., drugs A and B). Similarly, the outcome (death) is not the one in the trial protocol (stroke).
The drug (A or B) may be one of the risk factors for death in the subgroup patients. I did some preliminary analysis for this subgroup stratified by drugs A and B and found that their baseline characteristics were not significantly different, which suggests the randomization was well preserved in this subgroup.
Now I wonder: is it feasible to do such a project? Of course, it has limitations to its validity because we are using trial data (rather than cohort data). But is it feasible and clinically meaningful to run a project that chooses a subgroup and assesses the risk factors for death in that specific subgroup? To be clear, I am not interested in the effect of the drug (A or B) on stroke (as in the trial protocol) in this subgroup.
Many many thanks for any advice and suggestion in advance!!
Regards,
GL
I carried out a DerSimonian-Laird random-effects meta-analysis using Cochrane (RevMan) software to create a summary effect estimate of the weighted mean difference. I notice that the distribution of study effects (n = 20) is negatively skewed. Do you recommend using the permutation method for non-parametric RE meta-analysis (e.g. in Stata)? How skewed do the estimates have to be to require this form of RE meta-analysis? Thanks.
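For what it's worth, a permutation test of the pooled effect is also available in R via metafor; a minimal sketch with made-up mean differences and variances, shown only to illustrate the mechanics rather than any particular threshold of skewness:
library(metafor)
yi <- c(-0.8, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0, 0.1, 0.3, 1.2)  # illustrative mean differences
vi <- rep(0.04, length(yi))                                       # illustrative variances
res <- rma(yi, vi, method = "DL")   # DerSimonian-Laird random-effects model
permutest(res, exact = TRUE)        # permutation (sign-flip) test of the pooled effect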
Assume you have 4 treatment groups (A, B, C, D) with five subjects per group. Should one expect the two-sample t-test result and conclusion for A versus B to be similar to the one-way ANOVA F-test for the A, B, C, and D treatment groups?
Besides the type of epidemic, how do we determine the overall severity (i.e., mild, moderate, or severe)?
I want to enter the data I have gathered with my study tool into Epi Info. I am designing the data-entry form at the moment, but I am unsure whether I should include the answers as they are or just enter the codes pre-assigned to the responses. Also, is it possible to assign codes to the choices at a later point, or would I even need to assign codes at all?
Meta-analyses, epidemiology, public health
How can we get the number needed to harm from a pooled odds ratio in a meta-analysis?
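A minimal sketch of the usual Cochrane-style conversion, which requires an assumed control-group (baseline) risk in addition to the pooled OR; both numbers below are illustrative:
or  <- 1.8    # pooled odds ratio for the harm (illustrative)
acr <- 0.10   # assumed control-group risk of the event
risk_exposed <- or * acr / (1 - acr + or * acr)   # convert the OR to an exposed-group risk
nnh <- 1 / (risk_exposed - acr)
round(c(risk_exposed = risk_exposed, NNH = nnh), 2)
# The NNH depends strongly on the assumed baseline risk, so it is usually reported
# for a range of plausible control-group risks rather than as a single number.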
- If we get a significant odds ratio of very high magnitude, say more than 10 or 15, can we say that there is a very, very strong association?
- If we do not get a significant association in an unadjusted model but do get one in the adjusted model, does that mean there is still an association?
Sudan is a country where consanguineous marriage is very common. Recently the country was divided into two countries, so we do not have a recent population figure. Many diseases associated with different patterns of inheritance are common.
I would like to measure the percentage of consanguinity in the country. That is why I am looking for a biostatistical or epidemiological method with which I can answer the question of what the percentage of consanguinity in Sudan is.
Using ILI cumulative summation currently. Is there another method that has proven valuable?
If we are taking daily samples to study Aspergillus development as surveillance in a high-complexity hospital, which kind of epidemiological and statistical approach do you suggest: a time series or an incidence study? The outcome is a positive culture.
A lot has been written about quarantine and closing borders domestically (US), but not much else.
I know that this isn't recommended, but I don't know why. Can anyone explain it to me?
Testing for a significant difference between the means of two groups is well known. But if I use medians, is it possible to test for a significant difference between the medians of two groups?
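Yes; two common options are the Wilcoxon rank-sum (Mann-Whitney) test and Mood's median test, the latter of which directly compares the proportions of observations above the pooled median. A minimal sketch in R with simulated skewed data:
set.seed(3)
g1 <- rexp(30, rate = 1/10)   # illustrative skewed samples
g2 <- rexp(30, rate = 1/14)
wilcox.test(g1, g2)           # Wilcoxon rank-sum test
# Mood's median test: 2x2 table of counts above/below the pooled median
pooled_median <- median(c(g1, g2))
above1 <- table(factor(g1 > pooled_median, levels = c(FALSE, TRUE)))
above2 <- table(factor(g2 > pooled_median, levels = c(FALSE, TRUE)))
chisq.test(rbind(group1 = above1, group2 = above2))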
For binary exposures, Levin's formula is available. But how do we calculate it when the study has a multi-categorical exposure and confounding variables?
The data were coded wrongly, mixing up the false positives and false negatives (and likewise the true values), and it is impossible to retrieve the detailed data. Can I still do Cohen's kappa test?
I am running a multiple regression model and I am realizing there is an interaction between two covariates; with this interaction term, both covariates remain in the model and the R² increases by about 10%. Because the interaction is not with the main exposure of interest, the regression coefficient for the main exposure of interest does not change when this interaction term is included. Should I still keep it in the model? How best do I explain the importance of this interaction for the association between the exposure of interest and the outcome? Thank you for your answers.
In Leon Gordis's textbook, these methods were explained under validity, which I felt was logically correct, but a few other books cover these measures under diagnostic agreement.
I am trying to collect data on between- and within-person variance for common environmental and medical biomarkers. Right now, I am focusing on hydroxy-PAHs such as OH-pyrene in urine, cytokines, lipids (cholesterol) in blood, pesticides in various biological media, volatiles (chloroform, BTEX) in breath, etc.
Surprisingly, these data are difficult to find in the literature. Would love some assistance.
Dear all,
I am using Epi Info 7 to create a form for survey data entry.
The survey contains a general section and sections specific for male, female and children.
The form has 8 pages, and I used the following code so that entries for children will skip the irrelevant pages.
Page [P1 Demographic]
After
//add code here
IF FormCoding = "Children" THEN
GOTO 7
ELSE
GOTO 2
END-IF
However, when I fill in the form, entries for children do not skip pages 2-6.
I have tried variations of the codes above, and have used similar codes on different pages.
However, none of them worked.
The variable "FormCoding" is created using as Option.
Have also tried creating it as a legal value but also did not seem to work.
What could be the possible reasons?
Is there any rule which I have accidentally omitted?
Please advise.
Thank you very much in advance.
I have been using the Cox proportional hazards model option in SPSS to estimate the relative risk between two groups (which appears as the Exp(B) statistic in the output). But the estimates I get don't seem to reflect the survival plots. I tried to do some calculations by hand using the life table outputs; these estimates are different from the SPSS outputs. Has anyone experienced a difference between manually calculated relative risk and the SPSS-computed RR, and if so, how did you reconcile the difference?
How should sample size calculations be adjusted if a Hawthorne effect might occur during the conduct of the study? Is there a publication addressing this problem?
Using observational data where the data cannot be assumed to be missing at random,
I need to show the statistical power of a significant p-value obtained for a survival curve in melanoma.
I am trying to estimate parameters driving the spatial spread of a disease. For the moment, I am using simulated data to check whether the estimation process performs well.
My objective would be to estimate the transmission parameters using a maximum likelihood approach.
So I used the optim() function in R, from which I extracted the Hessian matrix. To derive the confidence intervals, I computed the standard errors by taking the square root of the diagonal elements of the inverse of the Hessian (http://stats.stackexchange.com/questions/27033/in-r-given-an-output-from-optim-with-a-hessian-matrix-how-to-calculate-paramet). My problem is that the confidence intervals derived are too wide (the CIs for my probabilities range from 0 to 1). This problem seems to be relatively common, but I have no clue where it comes from or how to solve it.
Is there anyone to give me some tips?
PS: Here is my code:
fit <- optim(inits, MinusLogLikelihood_function, method = "BFGS", hessian = TRUE)
fisher_info <- solve(fit$hessian)        # observed information -> variance-covariance matrix
prop_sigma <- sqrt(diag(fisher_info))    # standard errors of the parameter estimates
upper <- fit$par + 1.96 * prop_sigma     # Wald upper confidence limits
lower <- fit$par - 1.96 * prop_sigma     # Wald lower confidence limits
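One thing that sometimes helps here, sketched under the assumption that the parameters being estimated are probabilities, is to optimise on the logit scale: the Wald intervals are then built for unconstrained parameters and back-transformed, so they stay inside (0, 1), and the Hessian is often better conditioned. It does not add information the data lack, so very wide intervals can still signal a nearly flat likelihood. MinusLogLikelihood_function and inits are the objects from the question:
logit     <- function(p) log(p / (1 - p))
inv_logit <- function(x) 1 / (1 + exp(-x))
# Wrap the original objective so that optimisation happens on the logit scale
nll_logit <- function(theta) MinusLogLikelihood_function(inv_logit(theta))
fit_l <- optim(logit(inits), nll_logit, method = "BFGS", hessian = TRUE)
se_l  <- sqrt(diag(solve(fit_l$hessian)))
inv_logit(cbind(lower = fit_l$par - 1.96 * se_l,
                upper = fit_l$par + 1.96 * se_l))   # CIs back on the probability scale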
For example, suppose a certain method was chosen for power/sample size calculation for a future experiment, with the intent to use the same method to analyze the experimental data. But the experimental data turned out to violate the assumptions of this method. What is an appropriate strategy to analyze these data?
If we conducted a control study without determining the sample size and power of the study in advance, is it possible to calculate the power of the study at the end (after data collection is completed)?
We are in the process of analyzing the data from a case control study to identify the various risk factors for esophageal cancer and want to assess the possible interaction among such factors in determining the risk associated with certain genetic markers.