Science topic

Longitudinal Analysis - Science topic

Explore the latest questions and answers in Longitudinal Analysis, and find Longitudinal Analysis experts.
Questions related to Longitudinal Analysis
  • asked a question related to Longitudinal Analysis
Question
2 answers
I have 15 treatments, and my main interest is to find the best one. The response is measured every day for up to 30 days, and my model includes an interaction between time and treatment. I will use a suitable effect size, but I need 80% power with a 5% type I error rate. How can I calculate the sample size by simulation?
Relevant answer
Answer
Did you calculate it eventually? I also need guidance.
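In case it helps, a minimal sketch of a simulation-based power calculation in R with lme4; every effect size and variance component below is a hypothetical placeholder to be replaced with values from the planned design, and power is estimated at alpha = 0.05 against a target of 0.80:
# power by simulation for a 15-treatment x 30-day design with a treatment-by-time interaction
library(lme4)
simulate_power <- function(n_per_trt, n_trt = 15, n_days = 30, n_sims = 200,
                           sd_id = 1, sd_resid = 1, slope_step = 0.05) {
  p_values <- replicate(n_sims, {
    d <- expand.grid(day = 1:n_days, id = 1:(n_per_trt * n_trt))
    d$trt <- factor(((d$id - 1) %/% n_per_trt) + 1)
    # assumed data-generating model: treatment-specific time slopes (hypothetical values)
    slopes <- seq(0, slope_step * (n_trt - 1), length.out = n_trt)
    d$y <- slopes[as.integer(d$trt)] * d$day +
      rnorm(n_per_trt * n_trt, 0, sd_id)[d$id] +
      rnorm(nrow(d), 0, sd_resid)
    full <- lmer(y ~ trt * day + (1 | id), data = d, REML = FALSE)
    null <- lmer(y ~ trt + day + (1 | id), data = d, REML = FALSE)
    anova(null, full)[2, "Pr(>Chisq)"]   # likelihood-ratio test of the interaction
  })
  mean(p_values < 0.05)                  # estimated power
}
# increase n_per_trt until the estimated power reaches about 0.80
simulate_power(n_per_trt = 10)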
  • asked a question related to Longitudinal Analysis
Question
6 answers
I am searching for a step-by-step procedure explaining all the commands and possible commands and models when using the mixed models command in SPSS (I have unbalanced longitudinal data).
Relevant answer
Answer
This is an excellent site that includes SPSS syntax and accompanies the book
Longitudinal Analysis: Modeling Within-Person Fluctuation and Change (by Lesa Hoffman).
  • asked a question related to Longitudinal Analysis
Question
3 answers
I am conducting an intensive longitudinal analysis with multilevel models.
Some participants have no variation at all on the predictor variables, so of course they don't really contribute any useful information to the model.
However, I have also heard that keeping such cases in the dataset could bias the results, and I can't find any literature on this topic.
Does anyone know a source that discusses this and recommends excluding these cases from the sample?
Relevant answer
Answer
Hello Pascal,
I'm a bit unclear about your query. Do you mean that a batch of cases has exactly the same score on each predictor? If so, omitting them would bias estimates of score variance on the variables (and standard errors) and most likely the regression coefficient estimates associated with them.
Aside from the fact that you might find these instances inconvenient, do you have a substantive basis, whether theoretical, or empirical, to warrant jettisoning such data? (For example, do you have after-the-fact information to suggest that such instances were not accurate portrayals of the true values for individuals?) If not, it sounds as if doing so would be ill-advised.
Good luck with your work.
  • asked a question related to Longitudinal Analysis
Question
3 answers
Hello,
I am interested in how mental health evolves over a 10-year period (N=10000). My analysis plan is to first factor out age, gender, and other demographics in each wave, and then take these residuals into a linear model with years as predictors. Does that sound sensible?
Best,
Relevant answer
Demographics can be considered as factors; however, psychological and other factors could also be predictors of mental health.
Kind Regards,
  • asked a question related to Longitudinal Analysis
Question
13 answers
Are there any literature available on how much time should lapse to be considered a longitudinal study?
Relevant answer
Answer
The timescale required for a longitudinal study to demonstrate effects depends on a number of factors.
These include the phenomenon under investigation and the comparative data already available. For example, if you are looking at a novel intervention, there may be comparison data from studies of the effects of other approaches against comparable non-intervention controls; in that context, the timescale for effects may already have been established.
Change may also be evaluated against a stable baseline, where the phenomenon is not otherwise expected to change over time and there are data to substantiate this.
Some phenomena have an expected chronobiology; for example, with menstrual-cycle-related phenomena it may take longer to establish whether change has taken place against a monthly pattern.
Epileptic seizure and migraine frequency can vary markedly over time, making a stable baseline more difficult to establish.
Changes in seasonal affective disorder, asthma, and certain allergies show strong seasonality and annual patterns, some of which are hemisphere-, day-length- and latitude-dependent, and might require several years of data.
In short, for a research study it depends on what is known about the H0 phenomenon you are studying and its variability.
Hope this helps.
Ken
  • asked a question related to Longitudinal Analysis
Question
5 answers
I have two separate data sets that I need to combine: time point 1 and time point 2.
Not all participants in time point 1 are in time point 2 (attrition, etc.), so I need to know how to match participants and keep the duplicates; too many instructional videos only tell you how to remove duplicates.
Next, I need the time points to be stacked on top of each other, so that time 1 sits above time 2 rather than next to it. Again, too many videos show how to join data sets side by side (e.g., dplyr left join, right join), and it is very hard to find ones that teach you how to stack them so the columns line up. I want the data structure to be suitable for longitudinal analysis, or at least some form of repeated measures, where adding data sets to the left or right may not work.
Please help! Either in R or excel!
Relevant answer
Answer
You are looking for the functions merge and reshape. Many packages, such as data.table, also have their own versions, but merge and reshape are the ones in base R. Just look at their help files.
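For example, a minimal base-R sketch, assuming both data sets share a participant column named id and a measured score (all names are placeholders):
# merge by participant, keeping everyone from time 1 even if absent at time 2
wide <- merge(time1, time2, by = "id", all.x = TRUE, suffixes = c("_t1", "_t2"))
# stack the two time points on top of each other (long format)
long <- reshape(wide,
                direction = "long",
                varying   = c("score_t1", "score_t2"),
                v.names   = "score",
                timevar   = "time",
                times     = c(1, 2),
                idvar     = "id")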
  • asked a question related to Longitudinal Analysis
Question
1 answer
Meta-analysis is a good method to estimate the average correlation between two variables based on previous findings. When examining the associations between two variables, we also want to figure out the direction of the relationship (or in other words, who influences who). Is it possible for us to infer the causal direction by examining the moderating effect of the measurement order of two variables in a meta-analysis (i.e., comparing the correlations measured between two variables when one was measured before or after the other, or when they were measured concurrently)? The cross-lagged panel model states A to be the likely cause of B when the correlation between A measured at an earlier point and B measured at a later point is larger than the correlation between B measured at an earlier point and A measured at a later point (Kearney, 2017). Following this logic, can we say that, in a meta-analysis, if the correlation was more negative when A was measured after B than before or concurrently, A may be the cause of B and play a dominant role in the relation? If yes, are there some published examples? Thank you very much!
Relevant answer
Answer
No, cause-and-effect conclusions cannot be made from subgroup analyses or meta-regression analyses of potential moderator variables. These analyses are done across studies, not within studies where cause-and-effect conclusions can be made (e.g., treated group vs. untreated group).
  • asked a question related to Longitudinal Analysis
Question
5 answers
What statistical test should I use to investigate how changes in variables A (continuous; defined as A at Time2- A at Time1) affect changes in Variables B (continuous; defined as B at Time2 - B at Time1) using longitudinal data? I really appreciate your help.
Relevant answer
Answer
Hi,
Repeated measure analysis. Here are a few references:
Schober P, Vetter TR. Repeated Measures Designs and Analysis of Longitudinal Data: If at First You Do Not Succeed-Try, Try Again. Anesth Analg. 2018;127(2):569-575. doi:10.1213/ANE.0000000000003511
For theoretical insight, try this book:
M. Ataharul Islam & Rafiqul I. Chowdhury, Analysis of Repeated Measures Data. Springer Singapore, 2017.
ISBN: 978-981-10-3793-1, 978-981-10-3794-8
  • asked a question related to Longitudinal Analysis
Question
5 answers
I have collected data at three different time periods (say, one each year). For the first time period, I analysed the data using SEM (SPSS + AMOS). What type of statistical analysis can be done to draw inferences across the three time periods?
Relevant answer
Answer
Why did you need to collect data at three different time periods? Your answer lies there!
  • asked a question related to Longitudinal Analysis
Question
6 answers
Hi all,
I'm analyzing a set of longitudinal data obtained from subjects at regular time intervals following an intervention. I want to look for the effect of time, but I also want to adjust the observed time effect for age and sex of subjects. My initial thought was to run a linear model like this:
lm(Signal ~ SubjectID + Time + Age + Sex)
However, age and sex are both attributes of subjectID, and, therefore, are not independent covariates. Should I rather simplify the model like this:
lm(Signal ~ SubjectID + Time) ?
Any suggestions are much appreciated.
Thank you very much!
Relevant answer
Answer
You need something more than standard linear models to analyse longitudinal data. Your dependent variable will be time-varying, but age and sex are time-invariant and apply to the person. The natural method of analysis is the two-level random-effects model, which can handle both within-individual and between-individual variation.
This is a short practical example of what can be done.
This is an overview of alternative models.
This is a comprehensive site that considers a range of software:
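For instance, a two-level random-intercept model for the question above could be sketched in R with lme4 (the data-frame name is a placeholder):
library(lme4)
# subjects get their own intercepts; Age and Sex enter as person-level fixed effects
fit <- lmer(Signal ~ Time + Age + Sex + (1 | SubjectID), data = mydata)
summary(fit)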
  • asked a question related to Longitudinal Analysis
Question
5 answers
I have a dataset where the same survey was sent out to a community of individuals over the course of seven waves. I will note that not all participants started at the same wave, nor did all participants complete all waves.
One question in the dataset asked participants whether a particular event had occurred in their life, with a binary answer format (yes/no). I want to look at whether there are significant differences within participants on certain dependent outcome variables before vs. after this event occurred. I have used Excel coding to identify ~90 participants from this dataset who stated the event had not occurred in their life at their first wave but stated it had taken place in at least one subsequent wave. This means not all participants reported the event at the same wave.
Therefore, I'm wondering what the best course of analysis would be. It was originally suggested that I do a repeated measures ANOVA, but I cannot neatly select two within-subject variables when participants have different pre/post timepoints. It was also suggested that I dummy code each case for whether it occurred before or after the event for each participant, which makes sense, but I am unsure how to use this grouping variable in a repeated measures ANOVA (this task is also tedious by hand, but doable).
Relevant answer
Answer
It looks good except that I would code "event" as 0 and 1 and "wave" as 0, 1, 2, 3, .... Doing so will facilitate your interpretation of the regression coefficients (in particular, the intercept) because then both independent variables have a meaningful zero point.
  • asked a question related to Longitudinal Analysis
Question
1 answer
Hello!
We have a question about implementing the ‘Mundlak’ approach in a multilevel (3-levels) nested hierarchical model. We have employees (level 1), nested within year-cohorts (level 2), nested within firms (level 3).
In terms of data structure, the dependent variable is employee satisfaction (ordinal measure) at employee level (i) over time (t) and across firms (j) (let’s call this Y_itj), noting that we have repeated cross sections with different individuals observed in every period, while as regressors, we are mainly interested in the impact of a firm level time-variant, but employee invariant, variable (let’s call it X_tj). We apply a 3-level ordered probit model (meoprobit in Stata).
We are concerned with endogeneity issues of the X_tj variable, which we hope to (at least partially) resolve by using some form of a Mundlak approach, by including firm specific averages for X_tj, as well as for all other time-varying explanatory variables. The idea is that if the firm-specific averages are added as additional control variables, then the coefficients of the original variables represent the ‘within effect’, i.e. how changing X_tj affects Y_itj (employee satisfaction).
However, we are not sure whether approach 1 or 2 below is more appropriate, because X_tj is a level 2 (firm level) variable.
1. The firm specific averages of X_tj (as well as other explanatory variables measured at level 2) need to be calculated by averaging over individuals, even though the variable itself is a Level 2 variable (varies only over time for each firm). That is, in Stata: bysort firm_id: egen mean_X= mean(X). As our data set is unbalanced (so the number of observations for each firm varies over time), these means are driven by the time periods with more observations. For example, in a 2-period model, if a company has a lot of employee reviews in t=1 but very few in t=2, the observations in t=1 will dominate this mean.
2. Alternatively, as the X_tj variable is a level 2 variable, the firm specific averages need to be calculated by averaging over time periods. That is: we first create a tag that is one only for the first observation in the sample per firm/year, and then do: bysort firm_id: egen mean_X= mean(X) if tag==1. This gives equal weight to each time period, irrespective of how many employee-level observations we have in that period. For example, although a company has a lot of employee reviews in t=1 and very few in t=2, the firm specific mean will treat the two periods as equally important.
The two means are different, and we are unsure which approach is the correct one (and which mean is the ‘true’ contextual effect of X_tj on Y_itj). We have been unable to locate in the literature a detailed treatment of the issue for 3-level models (as opposed to 2-level models where the situation is straightforward). Any advice/suggestions on the above would be very much appreciated.
Relevant answer
Answer
For general orientation you may be interested in the following.
In this manual, see Chapter 8, where we use another multilevel model to get precision-weighted estimates of the group means.
For a fuller discussion of the multilevel model as a measurement model, and a more convincing example of using precision-weighted estimates of the group mean, see:
Finally, as with all research involving a judgement call, do it both ways and see whether it makes a substantive difference.
  • asked a question related to Longitudinal Analysis
Question
6 answers
Hi,
I am testing this hypothesis:
The sustainability of IC at T1 and T2 is moderated by the difficulties to readjust at work after an international assignment.
{IC = intercultural competences (continuous); difficulties to readjust (1 = Yes / 2 = No)}. The sample size is 72.
For the continuous variables, I used a scale (7 Likert scales) to measure intercultural competence (IC) at two time points (T1, and then T2 after one year). I then computed the results for each time point into one continuous variable.
I am using SPSS to test my hypothesis. I want to add that my data are not normally distributed. I tried to use multinomial logistic regression, so I converted the variables (IC T1 and T2) into categorical variables (High, Average, Low). I added IC T2 as the dependent variable and IC T1 as a predictor along with the moderator.
So, after converting the continuous variables into categories, the data I have are:
Dependent variable: IC T2 (nominal)
Predictor: IC T1 (nominal)
Moderator: difficulties to readjust (1 = Yes / 2 = No)
My question is, do you think what I have done so far is correct? If not, what are your suggestions?
And If you think it is correct, how could I interpret the results?
I am new to work with statistics, and this is part of my PhD research. I hope I can learn something from you about this issue.
Kind regards,
Relevant answer
Answer
If you have an ANCOVA style model with pre-score as a covariate it isn't repeated measures. If you use a repeated measures model in the ANOVA context this is equivalent to an analysis of the difference scores in a non-repeated measures model.
So if you just have two time points, it isn't always necessary to use a repeated measures model. You can see this with something as simple as a paired t test, which is equivalent to a one-sample test of the differences between pre- and post-. So there is no problem using an ANCOVA-style model if you'd like to; it may also have greater statistical power than the standard approach and arguably makes weaker assumptions.
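To make the ANCOVA-style option concrete, a minimal sketch in R, using the asker's variable names with a placeholder data frame:
# post-score regressed on pre-score; the interaction with the (0/1) moderator
# tests whether readjustment difficulties moderate change in IC
fit <- lm(IC_T2 ~ IC_T1 * difficulties, data = mydata)
summary(fit)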
  • asked a question related to Longitudinal Analysis
Question
1 answer
A few days ago a colleague of mine made me think about the impact the COVID19 crisis will have on cohort studies. Especially those focused on causes of mortality and the elderly will be deeply impacted by the number of deaths due to the pandemic.
Is this something manageable?
How big is this matter in your mind?
How can this be handled?
Relevant answer
Answer
Well, the cohort study is highly relevant during the COVID-19 pandemic. However, COVID-19 is no longer only a cohort situation; it is also being studied through observational and pre/post-test designs involving training and other interventions.
  • asked a question related to Longitudinal Analysis
Question
5 answers
Hi everyone!
I'm doing a PhD in Clinical Psychology and I have some treatment data to analyse. The design is a 2 (condition: self-compassion, cognitive restructuring) x 5 (time: baseline, mid-treatment, post-treatment, 1-week follow-up, and 5-week follow-up) design. 119 participants were randomized and engaged in their respective interventions for a two-week period, with follow-up assessments. The aim was to reduce social anxiety.
One analysis I'm trying to do is mediation, and preferably I would use a more simple strategy such as Hayes' PROCESS macro on SPSS. However, my understanding is that I won't be able to use all five waves of my data if I use PROCESS. Does anyone know if that is an appropriate strategy for multiple waves? Should I be using all information? And if so, how?
Relevant answer
Answer
Hi Ying, please see our paper to read about what we eventually did:
  • asked a question related to Longitudinal Analysis
Question
8 answers
Hi scholars, I am confused about the difference between theoretical & practical implications and theoretical & practical contributions. How can I distinguish between them?
Thanks a lot in advance.
Relevant answer
Answer
Contribution = the new knowledge you are bringing in
Implication = Benefits from the derived knowledge
  • asked a question related to Longitudinal Analysis
Question
2 answers
Hi everyone, I need help deciding which analysis is better for this longitudinal study. In my study, I measured the attachment security of children at 5 time points. My doubts are: 1) the spacing between time points is not equal (T1 = beginning; T2 = after 1 month; T3 = after 2 months; T4 = after 6 months; T5 = after 15 months); 2) the total sample is 148 children, but not all of them have all 5 observations/scores (T1 = 148; T2 = 140; T3 = 112; T4 = 20; T5 = 50), so there are many missing values, especially at T4.
Aim: I would like to examine if attachment scores change significantly over time and if these are affected by other variables such as gender, age, etc.
My questions are:
- Focusing on the first period, as a preliminary analysis for T1-T2-T3 I used repeated measures ANOVA, because the spacing between these time points is equal (however, there are some missing values and I lose some information). I then analyzed the means with repeated measures ANOVA and post-hoc tests (Bonferroni), with e.g. "gender" as a between-subjects factor. Does that work?
- The study then continued at T4 and T5. Which analysis can I use now? Does it make sense to drop T4, with so few subjects?
- Which analysis should I use to consider the role played by other variables? A growth curve model?
Thanks so much
Relevant answer
Answer
Hello Alessia, I would use mixed models with "subject" as a random intercept and "gender" as a fixed effect: mixed models allow you to use alternative distributions if your dependent variable is not normally distributed, and they are less sensitive to missing data.
That said, the later time points are so unbalanced that, in my opinion, I would suggest considering just T1-T2-T3.
You can then repeat the same analysis including T4 and T5 on only the participants assessed at T4, i.e. with a reduced sample.
This seems to be the only reliable solution, in my opinion, but I'm curious to know whether someone else has better suggestions.
Good work and good analyses!
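For reference, a minimal sketch of the suggested random-intercept model in R with lme4, assuming a long-format data frame and placeholder variable names:
library(lme4)
# one row per child per assessment; children get their own intercepts
fit <- lmer(attachment ~ time + gender + age + (1 | child_id), data = dat_long)
summary(fit)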
  • asked a question related to Longitudinal Analysis
Question
9 answers
Most of the recent books on longitudinal data analysis I have come across mention the issue of unbalanced data but do not actually present a solution for it. Take, for example:
  • Hoffman, L. (2015). Longitudinal analysis: modeling within-person fluctuation and change (1 Edition). New York, NY: Routledge.
  • Liu, X. (2015). Methods and applications of longitudinal data analysis. Elsevier.
Unbalanced measurements in longitudinal data occur when the participants of a study are not measured at exactly the same points in time. We gathered big, complex and unbalanced data: arousal level was measured automatically every minute for a group of students while they engaged in learning activities, and the students were also asked to report what they felt during the activities. Considering that not all students participated in similar activities at the same time, and not all of them were active in reporting their feelings, we end up with unstructured and uncontrolled data that do not form a systematic, regular longitudinal dataset. Add to this the complexity of the arousal level itself: most longitudinal analyses assume linearity (the outcome changes positively/negatively with the predictors), and that clearly does not apply to our case, since arousal fluctuates over time.
My questions:
Can you please point me to a useful resource (e.g., a book, article, or forum of experts) for analysing unbalanced panel data?
Do you have yourself any idea on how one can handle unbalanced data analysis?
Relevant answer
Answer
I do not have experience of this level of complexity in repeated measures, but it seems to me that you actually have multiple episodes - what you call events. So I am suggesting a three-level structure, with an identifier at level 3 for individuals, level 2 for episodes within individuals, and level 1 for repeated occasions within a specific episode (the minute-by-minute recording of repeated measurements). You could have variables measured at all three levels. The general approach is considered here:
Fiona Steele (2008). Multilevel Models for Longitudinal Data. Journal of the Royal Statistical Society, Series A (Statistics in Society), Vol. 171, No. 1, pp. 5-19.
  • asked a question related to Longitudinal Analysis
Question
8 answers
Hi Everyone,
I am currently looking at decline/change over time. 
The data (unbalanced) are in long format, with each individual having two or more assessments. Thus each individual has data in two or more rows, depending on the number of assessments he/she has had.
I am uncertain how to create a variable, based on the unique ID, that calculates the time difference between each individual's assessments, given that there are multiple assessments per individual.
I am also uncertain how to look at decline over time for each individual as well as the average decline of each group.
I would be really grateful if anyone could possibly help me understand how to go about this or point me towards a good reference.
Relevant answer
Answer
Eoin Finegan , the IDRE site at UCLA also has multiple examples. See this page, for example:
Scroll down to Exercise example, model 2 using MIXED Command. You can also find examples from the well-known book by Singer & Willett on this page:
HTH.
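One way to compute the per-individual time differences in R with dplyr, assuming hypothetical columns id, assessment_date (a Date) and score in the long-format data:
library(dplyr)
dat <- dat %>%
  group_by(id) %>%
  arrange(assessment_date, .by_group = TRUE) %>%
  mutate(
    years_since_baseline = as.numeric(assessment_date - first(assessment_date)) / 365.25,
    change_from_baseline = score - first(score)
  ) %>%
  ungroup()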
  • asked a question related to Longitudinal Analysis
Question
5 answers
Hello, panel data professionals!
I have a longitudinal dataset containing yearly data collected for 20 companies over 10 years.
The dataset contains the dependent variable, one independent variable (the task is to find if this variable Granger causes the dependent variable), and 3 additional variables that can be potentially correlated with the dependent variable.
In summary, we have the following conditions:
  1. Non-stationary series (for each variable)
  2. Additional control variables
  3. Longitudinal data, T=10, N=20.
Which test for Granger causality would be appropriate in this case?
Thanks a lot!
Relevant answer
Answer
Hi Andrey,
If you have balanced panel data, the proper test for this case is Dumitrescu and Hurlin (2012) for Granger causality in panel datasets. You can install the command in Stata with -ssc install xtgcause-. Then you may try:
xtgcause Y X, lags(#)
and the other way around:
xtgcause X Y, lags(#)
H0: Y does not Granger-cause X. H1: Y does Granger-cause X for at least one panel (id).
For more information see these papers:
I hope this helps.
  • asked a question related to Longitudinal Analysis
Question
6 answers
I have been using mixed effect models to analyze neuroimaging datasets with multiple scanning sessions per participant. All my previous models included only a random intercept, and it was not until recently that I heard about random slopes.
I am still uncertain about when I should use a random intercept and slope in my mixed effect models. Any practical or theoretical insight would be greatly appreciated.
This is a general question, but for the sake of example let's say I am testing the hypothesis that there is a negative relationship between total brain volume and depression severity symptoms (both variables are numeric). In this case, I am trying to control for age at scan, since there are inconsistent intervals between scan sessions across participants. Therefore, should I include a random intercept and a random slope of age?
gamm4(GreyMatterVol ~ s(AgeAtScan, k = 4) + DepressionSeverity, random = ~ (1 | sub), data = alltimepoints, REML = TRUE)$gam
Relevant answer
Answer
I am assuming that you specifically want to compare a standard random intercept model (participants allowed their own intercepts only) to a standard random coefficient model (participants are allowed their own intercept AND slope). In other words we are not talking about other types of models (e.g. shared intercept, but random slope)
There are two basic approaches to choosing between these two models.
The first approach (purely empirical) is to compare the model fit of a random intercept model to that of a random coefficients model (i.e. random intercepts AND random slopes) using something like AIC or BIC. The advantage of these measures (and other IC statistics) is that they penalize for model complexity. The basic idea is that if the AIC (or BIC) is lower for a random coefficient model compared to a random intercept model then the gain in fit is worth the extra model complexity.
Personally, I think this approach alone is a little too 'black box'. I like to think about the problem in terms of the data (and the research problem itself). For a random intercept model we are assuming that participants are allowed to have their own baseline values (and these are normally distributed around the average intercept). BTW, remember this is AFTER we account for differences between participants due to treatment group and/or risk factors (i.e. we can assume the fixed-effect predictors don't interfere with these decisions). The subtext of the random intercept model is that we assume, despite their different starting points, that participants respond to time in exactly the same way (in other words, all of the individual participants' regression lines are parallel). So the important question is: do we think such an assumption is valid, or might participants respond to time differently (e.g. different patients may respond to therapy in different ways, beyond what is explained by the fixed-effect predictors in the model)? To summarize: if you had to fit a regression line individually for each participant, would you constrain them to a shared slope (random intercept) or not (random coefficient)? In other words, have a look at your spaghetti plot.
Hope this helps
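A minimal sketch of that empirical comparison in R with lme4 (y, time and subject are placeholder names):
library(lme4)
ri <- lmer(y ~ time + (1 | subject),    data = dat, REML = FALSE)  # random intercept only
rc <- lmer(y ~ time + (time | subject), data = dat, REML = FALSE)  # add a random slope for time
AIC(ri, rc)    # lower AIC favours paying for the extra complexity
anova(ri, rc)  # likelihood-ratio test (boundary caveat: the p-value is conservative)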
  • asked a question related to Longitudinal Analysis
Question
4 answers
Hi all,
I'm doing research in a big corporation that has implemented a purpose program, which consists of a 6-hour workshop where employees reflect on their purpose at work and the purpose of the firm. We are designing a longitudinal study to measure the impact of this program on attendants' perceived productivity, collaboration and environmental awareness.
We are going to do a multilevel longitudinal analysis, surveying both employees and their immediate supervisors. We have developed the scales for each variable, but we have doubts regarding the time lag between waves. Could you help me with that?
Thank you so much in advance for your help,
Alvaro
Relevant answer
Answer
Thanks again for your response, Kelvyn. I can only ask for one measure after the program. Definitely, one week is a short time; I am torn between one and three months. The company wants to evaluate one month later to have the results as soon as possible. However, as we want to measure the real effect of the course on the attitudes and behavior of attendants, I don't know whether one month is too little time. What do you think?
  • asked a question related to Longitudinal Analysis
Question
16 answers
Over the past couple of years, I have been getting into methods to infer causality (after a strong focus on correlation, survey-based research). Methods such as longitudinal analysis using fixed effects as well as DiD-analysis are contested. But are there other valid methods to infer causality apart from actual experiments or the above-mentioned methods? And what are the benefits and costs of such methods? Looking forward to hearing what you think.
Relevant answer
Answer
Qualitative methods and case study research are by far the strongest ways to study causality.
  • asked a question related to Longitudinal Analysis
Question
9 answers
I have read multiple articles that have used machine learning algorithms (convolutional neural network, random forest, support vector regression, and gaussian process regression) on cross-sectional MRI data. I am wondering whether it is possible to apply these same methods to longitudinal or clustered data with repeated measures? If so, is there an algorithm that might be better to use?
I would be interested in seeing how adding longitudinal data could improve the performance of these types of machine learning models. So far, I am only aware of using mixed effect-models or generalized estimating equation on longitudinal data, but I am reading books and papers to learn more. Any advice or resources would be greatly appreciated.
Relevant answer
Answer
Hello Robert, there are extensions of recursive partitioning and trees for longitudinal and clustered data. They essentially include a mixed model element into the algorithm. I have used the RE-EM algorithm in the past (see DOI: 10.1007/s10994-011-5258-3 and DOI: 10.1016/j.csda.2015.02.004). There are also binary partitioning for continuous longitudinal data (DOI: 10.1002/sim.1266) and mixed-effect random forest (DOI: 10.1080/00949655.2012.741599). Implementations can be found in R packages: REEMtree, longRPart2, MixRF.
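For illustration, a rough sketch of fitting one such model with the REEMtree package in R; the variable names are placeholders and the exact argument names may differ between package versions, so check the package documentation:
# install.packages("REEMtree")
library(REEMtree)
# regression tree with a subject-level random effect for clustered/longitudinal data
fit <- REEMtree(outcome ~ predictor1 + predictor2 + time,
                data = dat, random = ~ 1 | subject_id)
fit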
  • asked a question related to Longitudinal Analysis
Question
4 answers
Hello,
assume I have data from two (and later three) time points that are in a three-level structure. I want to test mediation. The predictor is binary, the mediator is (usually) continuous (in some analyses also binary), the outcome is continuous (and in some analyses binary).
So, it's
- multilevel
- longitudinal
- mediation
analyses.
Can you recommend how to analyse this data? I was thinking about latent growth modeling with mediation but I am not sure if this is either possible or the best.
Thanks!
Relevant answer
Answer
Kelvyn Jones thank you so much for your help. I like the idea of building the model stepwise and kind of adding more complexity to it step by step. I'll do some more research to see which procedure fits our data best. Now I have some starting points for finding out, which is great. Thanks!
  • asked a question related to Longitudinal Analysis
Question
8 answers
I collected data from 21 participants daily over 14 weeks (98 measurement points per participant; physiological data). Each participant is part of an intervention group, and all interventions have the potential to improve the dependent variable. I assume that there was no improvement from day to day, but perhaps from week to week (I know this is not perfect, because all interventions could improve the dependent variable). My questions are:
  • should I combine the daily measurements into weekly measurements?
  • should I use the time variable (measurement or week) as continuous or as a factor?
  • Is the following code a reasonable specification (here using the lmer function)?
model1_fit <- lmer(dependent_variable ~ time + intervention + time:intervention + (1 | id),
                   data = data,            # 'time' = daily measurement or week
                   na.action = na.exclude)
summary(model1_fit)
  • How should I interpret the interaction when using the time variable as a factor (assuming that's the better choice)?
Thanks for your help.
Relevant answer
Answer
Should I combine the daily measurements into weekly measurements?
No - you would be losing valuable information.
Should I use the time variable (measurement or week) as continuous or as a factor?
A bit of both.
In a repeated-measures random-effects multilevel model you have occasions at level 1 nested within individuals. The residual at the individual level will give 21 differences, one for each individual, around an intercept, and the 98-by-21 occasion residuals will give differences from each individual's overall mean (intercept plus individual residual) for each and every occasion.
You can then put variables into the fixed part to try to account for these differences. These may be a function of continuous time (linear or quadratic, etc., to capture the overall trends), categorical time (e.g. a dummy for weekend or morning), and dummies distinguishing the three groups. You may also need interactions, e.g. a linear trend by group interaction, but 21 individuals is not many for this type of modelling.
It is also usual to fit a random-slope model, whereby the differences between individuals may increase over time. A potential sequence of models is given in my answer to this question.
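A possible sequence of such models, sketched in R with lme4 (placeholder names: y for the daily measure, week for continuous time, group for the intervention, id for the participant):
library(lme4)
m1 <- lmer(y ~ week + group + (1 | id),    data = dat, REML = FALSE)  # random intercepts
m2 <- lmer(y ~ week * group + (1 | id),    data = dat, REML = FALSE)  # add trend-by-group interaction
m3 <- lmer(y ~ week * group + (week | id), data = dat, REML = FALSE)  # add random slopes for week
anova(m1, m2, m3)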
  • asked a question related to Longitudinal Analysis
Question
7 answers
I need to perform a linear regression analysis in SPSS where:
-the predictor is a continuous variable representing the SD of changes over time.
-the outcome is a continuous variable measured at one time point.
-other co-variates are measured repeatedly over time.
What would be the best approach to test the association between predictor and outcome while co-varying for the other time-varying variables?
Thank you
Relevant answer
Answer
Let me see if I understand. You have a single variable as an outcome, and a bunch of sets of covariates, where each set is a single variable with multiple measurements. And then there is the SD of something. First, what is this SD of, and is it there to address heteroskedasticity? On the sets of covariates, is the issue how to deal with the collinearity within each of these sets? Are you expecting similar covariances within these sets (and are the times the same)? Are you after something like estimating the slope and mean of each set, and then using these? Or wanting some more model-ish thing? How many variables do you have, how big a sample, how many time points? Are you testing a specific hypothesis or just trying to find a model with good fit?
  • asked a question related to Longitudinal Analysis
Question
3 answers
I have assessed a trait measure of attribution (for positive and negative situations) for 85 sports persons. They all went on to play a competitive match. After the match (T1), they filled in a state attribution questionnaire based on the result of the match (win/lose); 51 won and 34 lost. They all went on to play their next game (T2) and again filled in the state attribution scale based on the result (win/lose). Of the 51 who won their match at T1, 35 won and 16 lost the second game. Of the 34 who lost their game at T1, 16 won their second game and 18 lost again.
I want to see whether attributions remain the same (trait-state-state),
and whether people behave in line with their trait attributions.
Any help regarding statistical approach would be appreciated.
  • asked a question related to Longitudinal Analysis
Question
3 answers
Hello readers
I am studying attributions in sports. I have a trait-like measure which assesses attributions on six dimensions (Internal-External, Stable-Unstable, Global-Specific, Controllability, Intentionality). I also have a state measure of attributions, which I assessed after two performances, T1 and T2.
I want to test whether people who are optimistic (trait) remain optimistic (state) irrespective of the results of T1 and T2, and whether people who are pessimistic remain pessimistic.
Can anyone help me with a suitable analysis?
Any help would be appreciated
Relevant answer
Answer
Parametric tests.
  • asked a question related to Longitudinal Analysis
Question
3 answers
In Criminology, as in all social science, longitudinal analysis is NOT hurting causal science. If, owing to poor conceptualization and a lack of proper measurement and data, longitudinal analysis is premature, then so would be cross-sectional analysis. Classic research survives not because of the findings, the theory or the methods but because of the scholarship. Advancing our understanding of any complex phenomenon requires much more than analysis over time or space. It requires our conceptualization to advance beyond these two dimensions and include interactive, simultaneous, multilayered dynamic processes contingent in time and space. So instead of waiting to model time until we have more knowledge or better data, I would contend that waiting would only worsen our understanding by continuing to isolate and truncate how the world actually works. Associations at time t are not just time dependent; they may be transformed or die out by time t plus 1. Don’t fear time, lean into it. Let’s collaborate across disciplines to conceptualize, invent and test new heuristics. It’s messy. Let’s get out there and model the heck out of it!
Relevant answer
Answer
Modern legal research appeals not only to classical methodology but also to postmodern approaches. Not only the result obtained but also its interpretation by the author is important.
  • asked a question related to Longitudinal Analysis
Question
1 answer
A government body performs an annual rating of schools, and the rating is given a score (called the performance rating). In addition, the PISA and TIMSS rankings and other global competitiveness rankings show the education performance of the country. I want to test whether the schools' ratings have an impact on those indicators: do improvements (increases or decreases) in the annual assessment scores affect PISA, TIMSS and the GCI? This calls for longitudinal analysis, but I am struggling to enter the data in SPSS and to identify which test to use. I have three years of school assessment scores and the indicator results.
Relevant answer
Answer
Thank you Gregory. I have 224 schools, grades 8-12, rated since 2011. I want to examine whether the authority's performance review is worthwhile and relevant to improving the country's ranking.
  • asked a question related to Longitudinal Analysis
Question
7 answers
I am doing a meta-analysis of longitudinal data in which a few studies report an OR as the summary estimate while others report a rate ratio (incidence). Except for the estimate used, there is no heterogeneity between the studies. It doesn't seem sensible to me to conduct separate meta-analyses just for this reason. I will be grateful if somebody can advise. The years of follow-up in both the exposed and unexposed groups are similar.
According to my knowledge 
Incidence rate ratio = (cases in exposed group / person-years in exposed group) / (cases in unexposed group / person-years in unexposed group)
Relative risk = (cases in exposed group / total persons in exposed group) / (cases in unexposed group / total persons in unexposed group)
Given that the years of follow-up in both the exposed and unexposed are similar, can't we consider IRR as equivalent to Relative Risk. 
Please correct me if I am wrong and let me know if there is some way to do this 
Relevant answer
Answer
I am doing a meta-analysis in which some studies report ORs while others report RRs. The study designs are case-crossover or time series. I use Comprehensive Meta-Analysis software version 2.0 for the analysis. How can I convert an OR to an RR?
  • asked a question related to Longitudinal Analysis
Question
5 answers
I am trying to decide which longitudinal analysis I should choose for my study. I collected data at four different time points:
1. Before the intervention
2. After the intervention
3. Three months after the intervention
4. Six months after the intervention
Study design: 12 participants took an intervention made to foster a sense of purpose in life. I want to see whether the toolkit fostered purpose and if the effects held over a period of time.
Relevant answer
Answer
I do recommend " Applied Longitudinal Analysis, 2nd Edition"
by Garrett Fitzmaurice, Nan Laird & James Ware
  • asked a question related to Longitudinal Analysis
Question
4 answers
I am carrying out research on longitudinal changes in migrant health. For one of my objectives I am investigating how BMI changes over time for migrants compared to the native-born. In wave 1 I have around 16,000 participants, in wave 2 around 8,700, and in wave 3 around 3,700. As indicated, attrition is very high, and some participants had their BMI recorded for only some waves. When I attempted repeated measures ANOVA, only the roughly 900 participants with complete cases were included in the analysis. Are there other longitudinal methods that can include incomplete cases?
Relevant answer
Answer
Yes, mixed or multilevel models are quite forgiving about imbalance and missingness. That is, they do not require MCAR but only that the response is MAR. However, if the missingness depends on the response variable itself (i.e. NMAR), then the estimates will be biased.
Here is a straightforward annotated example.
This is an outstanding site for beginners (with lots of different software):
Longitudinal Analysis: Modeling Within-Person Fluctuation and Change (by Lesa Hoffman)
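As a rough sketch of what such a model could look like in R with lme4 (variable names are placeholders), note that incomplete cases contribute whichever waves they do have:
library(lme4)
# long format: one row per person per wave; the wave-by-migrant interaction
# captures differential BMI change for migrants vs. the native-born
fit <- lmer(BMI ~ wave * migrant + age + sex + (1 | person_id), data = dat_long)
summary(fit)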
  • asked a question related to Longitudinal Analysis
Question
4 answers
For studying the transversely isotropic elastic behaviour of cylindrical structures of a given length, subjected to radial and axial pressure simultaneously, which plane of isotropy will be considered for the analysis?
TIA
Relevant answer
Answer
  • asked a question related to Longitudinal Analysis
Question
3 answers
I'm looking at the impact of a policy change on crime rates, but there are two policy interventions over the time span I'm studying. Would it be appropriate to do a time series analysis with two interventions, or would an ANOVA be better?
Relevant answer
Answer
Yes. You can set your DVs to match your time-frame periods and apply an ANOVA.
  • asked a question related to Longitudinal Analysis
Question
8 answers
I have a retrospective longitudinal medical dataset, with records spanning an observation period from 2000 to the end of 2016. For many reasons, not every medical record spans that entire time frame: the patient may have died, or they may have transferred into the study halfway through or transferred out at some stage.
A particular event (or exposure) is seen as a clinical event e.g., going to the doctor and saying or being told that you have a particular disease, e.g., a chest infection. That patient will also have a categorical variable to indicate whether they are a smoker or not.
I wish to count the frequency of chest infections per patient and split them by whether they smoke or not. I can imagine this as a box plot with the UQ and LQ defined, frequency of disease on the Y axis, and Smoker YES/NO on the X axis. This would be very easy to do. The problem I have, though, is that I am not sure how to deal with medical records of varying length. Surely there is bias if a smoker and a non-smoker both have twenty chest infections but there is a four-year difference in the length of their medical records?
Thanks
Relevant answer
Answer
You are about to discover the concept of incidence density! This is the rate of events per unit time. In your case, the rate of events per 100 or 1000 person-years.
You can tackle the problem using Poisson regression, with length of observation set as the exposure time variable.
Alternatively, you can treat the data as time-to-event data with repeated events. This has the advantage that the probability of chest infections probably rises with age. You can use age as the time variable in the analysis, with subjects entering at the age they were first seen and exiting at the age of last follow up. In order to avoid immortal time bias, you need to declare them to be at risk from some point, say from age 18.
This is pretty straightforward in Stata, if you have it.
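If you are working in R instead, a minimal sketch of the Poisson-with-offset approach (column names are placeholders), with follow-up time entering as an offset so that smokers and non-smokers are compared on rates rather than raw counts:
# one row per patient: number of chest infections and years of record coverage
fit <- glm(n_infections ~ smoker,
           family = poisson(link = "log"),
           offset = log(follow_up_years),
           data   = patients)
exp(coef(fit))   # incidence rate ratio for smokers vs. non-smokers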
  • asked a question related to Longitudinal Analysis
Question
4 answers
Mixed effect model and baseline dependent variable as covariate
I would like to determine which predictors are associated with the rate of change in a continuous dependent variable repeatedly/longitudinally measured over time in each patient. The analysis will use a mixed effect model. The model will therefore include an interaction term between the predictor and time (c.time in STATA) and the coefficient for the interaction term will give what is essentially a difference in slopes. So, if the predictor is gender, and males (coded 0) have a c.time coefficient of 1.5 (slope) then the interaction term could give a coefficient of say 0.5 and tell you that females (coded 1) have a slope of 2.0.
This will be tough since I have many variables which will involve many interaction terms (one for each variable) and the interpretation may get complicated. I recently came across an article that states the following where FEV1 (lung function) is the outcome that is being repeatedly measured and mixed effect model is also used:
" All models include baseline lung function as a covariate; therefore, the regression coefficients express the influence of predictor variables upon the annual rate of decline of lung function. " (PMC2078677)
However, is this correct? I tried this with sample data to see if the answers match and they do not. I am not sure if I am doing it right.
I did the following in STATA:
1) xtmixed y gender##c.time || id:, var
and compared this to
2) xtmixed y gender time baseline_y || id:, var
I guess I was expecting the gender coefficient from 2) to match the gender##c.time coefficient from 1) which seems very silly now.
My only other option would be to group individual slopes of the dependent variable into high and low slope patients to use in a logistic regression model which doesn't adjust for covariance. I am wondering what you all think of this as well.
Relevant answer
Answer
I did something like this some time ago. I just did a multiple linear regression of the rate of FEV1 loss, defined as (FEV1 at time x - initial FEV1) / time between measurements, as the dependent variable, against age, height, dust exposure, smoking, alcohol, etc. as independent variables. You could easily add sex.
see attached paper
  • asked a question related to Longitudinal Analysis
Question
3 answers
dear all,
I have a dataset with almost 2 million observations nested within (European) countries. My DV is the probability of weekly religious practice.
I want to disentangle age, period and cohort effects, and there is the well-known identification problem.
Given that I have so many observations and quite a wide time span (years from 1970 to 2015, cohorts from 1900 to 2000, ages from 15 to 100), what is the best strategy to apply?
I know this is a very broad question and that there is a huge debate behind but I really need to collect some opinions about this.
Thanks in advance!
Francesco Molteni
Relevant answer
Answer
You may want to look at this project on ResearchGate;
you will see that we are very sceptical about automatic fail-safe procedures, but it shows what can be done.
  • asked a question related to Longitudinal Analysis
Question
7 answers
This data is from a 9-month intervention with children with Down syndrome (n=12). I know my sample size is very small, but we do have a lot of data points per child.
The dependent variable is essentially a measure of performance (ability to independently activate and drive a powered mobility device, biweekly assessment). The independent variable is practice time (from an activity log, in minutes per day).
First, I am stuck in how to categorize the dependent variable. I was thinking of just picking the date in which independent driving emerges, but ideally, I would like to capture more of the variability in driving patterns and abilities over time.
Another option is to use the percentage of time they were independently activating the car in each assessment OR categorizing each session as novice/intermediate/advanced behavior.
This is all fairly new research, so there is little to follow in the literature. The goal for these analyses would be to provide recommendations to clinicians on when learning is expected to occur based on usage patterns (and yes, recommend with caution of course, given that the scope of inference should likely be restricted to this sample).
Is there a way to analyze the association between two patterns over time for a sample of this size? Or should I treat them as single cases and report all children individually?
Any and all advice appreciated - I am a Master's student, and this level of statistics is a bit daunting to me!
Relevant answer
Answer
I think my best advice here is to try to plot out or describe the data in a way that you would if you were trying to convey the results to an audience. This will guide you as to what analysis you want to do. What would you want to point out to your audience if you were giving a talk presenting your data?
As you mention, you might just report average date or average practice time to some performance benchmark, perhaps several benchmarks. Reporting these plus some measure of dispersion ---- range, IQR, or standard deviation ---- might be the best way to convey what you want to convey.
Or maybe you want to plot the curve of performance vs. practice time. As David Morgan points out, with repeated measures, the statistics here can be a little complicated.
I think to answer your question, you probably want to find a way to present the data from all subjects together, not treat them individually. This gives some sense of the variability among your sample. Although, with 12 subjects, you can pull out a couple cases that exemplify some features you wish to highlight.
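For example, a quick spaghetti plot in R with ggplot2 (placeholder column names), showing each child's performance against practice time with an overall smooth:
library(ggplot2)
ggplot(dat, aes(x = practice_minutes, y = performance, group = child_id)) +
  geom_line(alpha = 0.4) +                   # one line per child
  geom_point(size = 1) +
  geom_smooth(aes(group = 1), se = FALSE) +  # overall trend across children
  labs(x = "Practice time (min/day)", y = "Driving performance")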
  • asked a question related to Longitudinal Analysis
Question
3 answers
I have a number of questions regarding a multinomial logistic mixed model analysis that I would like to ask. Essentially the questions can be reduced to: "Is a multinomial logistic mixed model analysis appropriate for only two time points?", and "Is it possible to measure multiple outcomes using this or an alternative, more appropriate analysis method?".
I am currently conducting an analysis on a subsample drawn from a larger longitudinal cohort study. So far, only two waves of the study have been conducted, with 2-3 years between them. I will be using both waves. My subsample includes only those individuals who were aged 15-25 years at baseline (Wave 1), following them on to Wave 2. I am investigating the relationship between ecstasy use and psychological variables, over time.
At each wave, there are separate questionnaires for those aged 15-17 and 18+. This is not an issue for most variables I am interested in, with the exception of quality of life (QoL). I'll expand on this in a moment.
By Wave 2, the age range of my sample was 17-28 years of age. Therefore, at Wave 2, most of my sample completed the 18+ questionnaire except for those aged 17 years. This will become relevant when explaining the QoL measures I have.
From what I have read, I believe a multinomial logistic mixed model analysis is the best way to analyse my data. My exposure variable, "ecstasy use", is categorical. The outcome variables I am interested in relate to depression and quality of life (QoL). I have one depression scale (PHQ9) and two QoL scales; none of these is normally distributed, so I will have to categorise them. That being the case, I have a categorical exposure variable and categorical outcome variables.
I also have several dichotomous and categorical covariates. Of my covariates, many are time-varying (marijuana use etc.), but some (sexuality, socioeconomic status, indigenous status and language spoken at home) were only measured at baseline and are assumed to be time-invariant. 
My first question is whether a multinomial logistic mixed model analysis is appropriate with a dataset for which only 2 time points are available.
Secondly, whether or not the mixed modelling approach is appropriate, is it possible to measure two outcomes (depression and QoL) in the same model? If so, will the likely high correlation between depression score and QoL score be an issue?
Finally, QoL is measured using different, incompatible scales for 15-17 years and 18+ years. Is it possible to use both of these when including QoL in the model? I assume this is unlikely as Wave 2 data for the 15-17 years scale is obviously only available for those who are aged 17 years by wave 2. The alternative is only assessing QoL for adults, and ignoring the QoL scale I have for the younger group. If this is necessary, am I still able to measure two outcomes in the same model, or will this be an issue due to one outcome (depression) being only measured for those individuals aged 15-25 years at baseline and the other outcome (QoL) being only measured for those individuals aged 18+ at baseline?
Regards,
Rowan
Relevant answer
Answer
Hi Rowan,
Since you are likely to have more measurement points later and you plan to adjust for multiple covariates, the mixed model (or generalized linear mixed model, to be precise, because of the categorization) framework seems to me to be more useful in the long run to get familiarity with the model. Mixed model also permits you to test various contrasts rather easily considering that you have more than one outcome.
About your research question: you could potentially have a number of research questions in mind with this setting, but I assume it is the usual one: is there change over time points in the effect of ecstasy use on the outcome (when adjusted for the covariates)? Is this it (i.e. are you interested in the time-by-ecstasy-use interaction)?
Another approach might be to compute the differences and presumably use a linear model for the analysis, but since you have categorized variables, this depends on how the categorization was done (did you use the same cut points at each time point, etc.) and whether you are happy to report the rank-based results. Also, in this type of analysis it is difficult to accommodate the other potential outcome variables in the same model.
But a few questions about your variables: Is ecstasy use a dichotomous variable? What do you mean by not normally distributed - are there outliers, a skewed distribution, or something else? I would advise against categorizing: you will lose information from the variables and also power from your test statistics. Categorization is also an easy target for reviewer criticism. Instead, skewed distributions can easily be treated as log-normally distributed, or you could treat them as censored variables to account for floor or ceiling effects (the latter is easier in the structural equation modeling framework I suggest below). If you are adamant about the distributional assumptions, there are also non-parametric longitudinal models that you could consider; I have constructed some scripts to run such models in R.
If you would still like to proceed with the categorical data, then the number of categories in the variables is important, because it directly relates to the kind of model it is useful to consider. So why would you consider the multinomial logistic model the best for your data and not, e.g., the ordinal logistic model? I would imagine this results from having more than two categories in the categorized variables, and you may expect the predictors to have different coefficients for the outcome category pairings. If you have just two categories, then a binary logistic model would suffice.
Any of the cases described above can always be analysed as a mixed model (continuous variables) or generalized linear mixed model (other outcome types). You can then include the two outcomes (e.g depression and QoL) in the same mixed model, if you have them as outcomes and specify no relationship between them except the within- and between-outcome covariances. Then, you just need to be careful in speficying the variance-covariance matrices for the outcomes. Mixed model is also better than alternatives models (except for its 'easier' cousin, generalized estimation equation (GEE) approach), because it permits specifying arbitrary covariance matrices, so it is not dependent of the assumptions of constant covariance and variance among the time-points like the repeated measures ANOVA. Since the covariances are part of the mixed model, it is fine if QoL and depression covary. However, if you have more than one or two variables as outcomes over time points, the covariance specifications may become a tedious job to do.
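In R, one common way to set this up is to stack the two outcomes into a single response column so the model can estimate outcome-specific fixed effects and variances plus a person-level covariance between them; a minimal nlme sketch, assuming a long-format data frame d with hypothetical columns id, wave, outcome ('dep' or 'qol'), value, ecstasy, age and sex:

# Sketch only: stacked bivariate mixed model; all column names are assumptions.
library(nlme)

fit <- lme(
  value ~ outcome * (ecstasy * wave + age + sex),   # separate fixed effects per outcome
  random = ~ 0 + outcome | id,                      # correlated person-level effects, one per outcome (outcome is a factor)
  weights = varIdent(form = ~ 1 | outcome),         # separate residual variance per outcome
  data = d, na.action = na.omit
)
summary(fit)

The random part supplies the between-person covariance of depression and QoL; modelling a within-occasion residual covariance as well needs a richer correlation structure, which is where this approach starts to become tedious, as noted above.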
If you wish to model more complex structures among the outcomes, I would suggest becoming familiar with structural equation modelling, which permits more flexible models of the relationships between outcomes and lets you test more specific hypotheses about them. Software such as Mplus is very useful for models like this, and it is also among the easiest to learn compared with the alternatives (see statmodel.com for further information).
Finally, about QoL: which QoL measure is it? How to link the different measures depends heavily on the kinds of items presented to the subjects, but also on how you think the subjects perceive the items when completing them. You are in the best position to know this, since you know the measure and the people who responded to it. Perhaps you could build some kind of ordinal index from the QoL measures that connects the 15-17 scale to the 18+ scale? It is difficult to answer without more detailed knowledge of the instruments.
Regards,
Timo
  • asked a question related to Longitudinal Analysis
Question
3 answers
I am running a multilevel model in Stata, but the data only contain replicate sampling weights. I was advised to use -meglm-, but -meglm- is incompatible with replicate weights. Are there any alternative approaches or commands I can try in order to run the multilevel model with replicate sampling weights?
Also, the data do not have longitudinal weights (only wave-specific weights). Which weights would you suggest using in this situation, given that the data structure is longitudinal?
Relevant answer
Answer
Thanks Ajit. I saw the IDRE website before, but it doesn't entirely apply to my situation. It has been confirmed that Stata cannot fit a multilevel model using BRR weights, and I don't think R or SAS can do that either.
  • asked a question related to Longitudinal Analysis
Question
7 answers
Hello, 
I have recently come across the following in an article I am assessing for my own data analysis of platelet refractoriness as a function of multiple variables at several points in time following transfusion: "Risk factors contributing to platelet count increments within 1 hour and between 18 and 24 hours after transfusion and to the interval between transfusions were analyzed by longitudinal linear regression using a random effects model derived by generalized estimating equations". I was under the impression that generalized estimating equations do not use random-effects modelling? Could someone please clarify this for me?
Regards, 
Derek
Relevant answer
Answer
Without seeing the article, my guess is that they had repeated measures within patients (or some other unit), so they used GEE to model the correlated nature of the data. I suspect that their use of the term "random effects" refers to the repeated measurements within patients or units. GEE does not model random effects; rather, it treats the clusters or units as nuisance parameters, used only to account for the lack of independence among observations. Their approach to the data analysis, however, may be satisfactory -- only their description is confusing.
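For illustration, a hypothetical GEE sketch in R's geepack (the article's actual model and variables are not shown in this thread) handles the within-patient correlation in exactly this nuisance sense, through the working correlation and robust standard errors:

# Hypothetical sketch: repeated post-transfusion increments clustered within patients.
library(geepack)

fit <- geeglm(increment ~ time_hr + abo_mismatch + splenomegaly,
              id = patient_id, data = d,
              family = gaussian, corstr = "exchangeable")
summary(fit)   # coefficients with robust (sandwich) standard errors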
Paul
  • asked a question related to Longitudinal Analysis
Question
4 answers
I conducted a multiple logistic regression to assess the effect of parents' nationality, mother's weight at conception, mother's weight at delivery, and father's weight on adverse pregnancy outcomes. Block 0 of the analysis, which includes only the constant and none of the explanatory variables, should show that without any independent variable the best prediction is that no participant shows the adverse pregnancy outcome. However, my Block 0 table did not show this: it shows that, without any independent variable, all participants are predicted to show the adverse outcome. I saw this result for adverse pregnancy outcomes 1 and 2, whereas for outcome 3 the prediction was that no participant shows the adverse outcome. What is wrong with the analysis of my first two outcomes?
Thanks in advance,
Relevant answer
Answer
This free online course covers logit modelling.
You can follow it in MLwiN, Stata, and R.
These are the current modules:
Modules
Using quantitative data in research (watch video introduction)
Introduction to quantitative data analysis (watch video introduction)
Multiple regression
Multilevel structures and classifications (watch video introduction)
Introduction to multilevel modelling
Regression models for binary responses
Multilevel models for binary responses
Multilevel modelling in practice: Research questions, data preparation and analysis
Single-level and multilevel models for ordinal responses
Single-level and multilevel models for nominal responses
Three-level multilevel models
Cross-classified multilevel models
Multiple membership multilevel models
Missing Data
Multilevel Modelling of Repeated Measures Data
  • asked a question related to Longitudinal Analysis
Question
5 answers
Hello, I wish to calculate the relative risk of perinatal death (dead = 1; alive = 0) adjusted for other risk factors. Given that perinatal death is a common outcome, the adjusted odds ratio would overestimate the relative risk. So how do I calculate an adjusted relative risk?
Relevant answer
Answer
A Cox regression model can be applied.
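One concrete way to operationalise this, sketched here with hypothetical covariate names rather than as a prescription, is Cox regression with a constant follow-up time and robust variance, which yields adjusted risk ratios for a common binary outcome; log-binomial or modified Poisson regression with robust variance are common alternatives:

# Sketch: adjusted relative risk via Cox regression with constant follow-up time.
library(survival)

d$follow_up <- 1                                   # same nominal follow-up time for everyone
fit <- coxph(Surv(follow_up, dead) ~ maternal_age + parity + anaemia,
             data = d, robust = TRUE)
exp(coef(fit))     # adjusted risk ratios
exp(confint(fit))  # 95% confidence intervals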
  • asked a question related to Longitudinal Analysis
Question
7 answers
Respected Researchers
I have panel data of 1,252 firm-year observations on 182 firms over the period 2010-2016. I have 14 independent variables (including 3 control variables), 1 mediator, and 1 dependent variable. I want to use Stata to test the direct and mediation models; I have 11 direct hypotheses from independent variables to the dependent variable and 11 mediation hypotheses.
Data: Unbalanced panel 
QUESTIONS:
1) Which tests should be carried out, other than the choice between fixed and random effects?
2) How can the mediation analysis be performed?
Relevant answer
Answer
The 'MEDIATION' package written by Hicks & Tingley may be more up-to-date than 'medeff'. Type "findit mediation" or search for it under the help or use the link below.
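For anyone working in R rather than Stata, the companion R package by Tingley and colleagues, mediation, follows a parallel workflow; a minimal sketch with hypothetical variable names (x = independent variable, m = mediator, y = dependent variable), ignoring the panel structure for brevity:

# Hypothetical sketch using the R 'mediation' package (Tingley et al.).
library(mediation)

med_fit <- lm(m ~ x + size + leverage, data = d)       # mediator model
out_fit <- lm(y ~ m + x + size + leverage, data = d)   # outcome model
res <- mediate(med_fit, out_fit, treat = "x", mediator = "m",
               sims = 1000, boot = TRUE)
summary(res)   # ACME (indirect effect), ADE (direct effect) and total effect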
  • asked a question related to Longitudinal Analysis
Question
3 answers
I am running a difference-in-differences regression to assess the early impact of the minimum wage introduced in 2015 on workers' satisfaction. I actually have data from 2010 to 2015, but the panel is unbalanced since some individuals are missing in different years. Would you suggest focusing on just the 2014 and 2015 waves with the same individuals (balanced data), or should I also consider the other years before the introduction of the minimum wage?
Relevant answer
Answer
Hi David, I think you need the years before the minimum wage too, because with DiD you are interested not only in groups with and without the treatment but also in how their difference evolves over time.
My only concern is whether you genuinely have two distinct groups -- those who benefited from the minimum wage and those who did not.
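To make the two-group, two-period logic concrete, the basic estimate is the interaction term in a regression like the following sketch (hypothetical names: treated = affected by the minimum wage, post = observed in 2015); the earlier waves can then be added to check pre-trends:

# Minimal difference-in-differences sketch with individual-clustered standard errors.
library(lmtest)
library(sandwich)

did <- lm(satisfaction ~ treated * post, data = d)    # add controls as needed
coeftest(did, vcov = vcovCL(did, cluster = ~ id))     # treated:post is the DiD estimate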
  • asked a question related to Longitudinal Analysis
Question
2 answers
Dear all,
I need some support with a problem in statistics. Please explain using an example; otherwise I suspect, unfortunately, that I won't understand the answer. Attached you will find a screenshot (.jpg) of the example:
There is a sample of 5 taxis. I know the kilometres travelled and the number of accidents for each taxi. The goal is to calculate the variance of the average number of accidents per 100,000 km (and afterwards to calculate the confidence interval).
Importantly, I want to calculate the average number of accidents by dividing the summed accidents by the summed kilometres travelled of all taxis (and not by calculating the number of accidents per 100,000 km for each single taxi and then taking the mean).
That means I will not calculate the variance of the single values (accidents per 100,000 km); instead I want to estimate the resulting variance from the variance of the kilometres travelled and the variance of the accidents.
Thanks a lot for any support!
Andreas
Relevant answer
Answer
Hello Fabrice,
Sorry for my late response. I was pleasantly surprised by your immediate feedback.
Thanks a lot for your support. You understood it very well :)
With your help I found other equations (which look a little easier for me to understand).
I hope they are more or less equivalent (but I really don't know).
However, these equations always require μ (the population mean, I think), which is unknown. On the other hand, my N (population size) is very large (but also unknown), so I can drop the factor (N-n)/N.
My sample size (n) is about 60.
Nevertheless, unfortunately I don't understand the equations in the link you gave me. Might it be possible to explain them with the help of the example? (But please don't feel obliged!)
Many thanks again!
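The quantity being discussed is the linearised (Taylor) variance of a ratio estimator; a short sketch with made-up numbers standing in for the attachment's data (which is not reproduced here), dropping the finite population correction because N is very large:

# Ratio estimate R_hat = total accidents / total kilometres, with linearised variance.
km        <- c(80, 120, 150, 90, 110) * 1000   # kilometres per taxi (placeholder values)
accidents <- c(1, 2, 3, 1, 2)                  # accidents per taxi (placeholder values)
n     <- length(km)

R_hat <- sum(accidents) / sum(km)
e     <- accidents - R_hat * km                # linearisation residuals
var_R <- var(e) / (n * mean(km)^2)             # fpc (N - n)/N dropped, as N is very large
se_R  <- sqrt(var_R)

# estimate and approximate 95% confidence interval per 100,000 km
c(estimate = R_hat, lower = R_hat - 1.96 * se_R, upper = R_hat + 1.96 * se_R) * 1e5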
  • asked a question related to Longitudinal Analysis
Question
3 answers
Hi all,
I'm currently trying to analyse the correlates of change in the county-level mortality rate (M) between 2010 and 2014. Since I'm particularly interested in the within-county change in mortality, I thought the most appropriate approach would be either to regress a difference score (M2 - M1) on my county-level predictors, or to regress Time-2 mortality (M2) on the predictors while including Time-1 mortality (M1) as a covariate. I've read that the latter method is equivalent to ANCOVA and should be avoided in nonrandomized, observational studies (due to Lord's Paradox). So in my case, a multiple regression with difference scores as my DV seems appropriate (please correct me if I'm wrong).
However, I'm struggling with some theoretical questions regarding my county-level predictor variables. My predictors are mostly coming from Census data, which comes aggregated into 5-year periods. Therefore, my county-level predictors (e.g., median income, % population with college degrees) represent two 5-year time periods: 2005-2009 and 2010-2014. My goal is to analyze how the overall between-county differences (a) and within-county trends over time (b) relate to the change in mortality over time (M2 - M1). My questions are as follows:
1) Is it possible to disaggregate the between-county and within-county effects with only two time points (2005-2009 and 2010-2014) for each county? My inclination is to follow the advice of Curran & Bauer, 2011 (pg. 9), and compute county-level means (collapsed across time) of my predictor variables to represent between-county effects, and then compute the within-county trends by subtracting the county-level mean (of the predictor) from the T2 (2010-2014) predictor value. Is this an appropriate approach when I only have two time points for my predictor variables?
2) Is it a problem that my outcome variable (change in M from 2010 to 2014) does not match up temporally with my predictor variables (change in IV from 2005-2009 to 2010-2014)? I realize this would be a problem if I wanted to infer a simultaneous change in M and IV, but I want to test something slightly different: whether between-county differences (in the 2010-2014 period) or within-county trends (between 2005-2009 and 2010-2014) account for the change in mortality in the later period (2010 to 2014).
Thanks very much,
Jake
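A short sketch of the decomposition described in question 1, assuming a long-format county data frame with hypothetical columns county, period (1 = 2005-2009, 2 = 2010-2014), median_income, and a precomputed mortality change mort_change = M2 - M1:

# Between-county mean and within-county deviation with two periods per county.
library(dplyr)

d <- d %>%
  group_by(county) %>%
  mutate(income_between = mean(median_income),              # county mean across both periods
         income_within  = median_income - income_between) %>%  # deviation from the county's own mean
  ungroup()

# With only two periods, the within term at period 2 is half the period-to-period change.
fit <- lm(mort_change ~ income_between + income_within,
          data = filter(d, period == 2))
summary(fit)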
  • asked a question related to Longitudinal Analysis
Question
2 answers
I am working on a meta-analysis of RCTs, and I have to run metabias (several tests, including Egger's) for continuous data (means and standard deviations). What is the process, and which commands are used?
Thank you so much.
Relevant answer
Answer
Thank you so much dear Mr. Weaver.
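The Stata commands referred to above are not reproduced in this thread; for reference, the equivalent steps in R's metafor package look roughly like the following (the trial-level column names are hypothetical):

# Sketch: standardised mean differences, random-effects model, Egger-type test.
library(metafor)

dat <- escalc(measure = "SMD",
              m1i = mean_trt, sd1i = sd_trt, n1i = n_trt,
              m2i = mean_ctl, sd2i = sd_ctl, n2i = n_ctl,
              data = trials)
res <- rma(yi, vi, data = dat)   # random-effects meta-analysis
regtest(res, model = "lm")       # classical Egger regression test for funnel asymmetry
funnel(res)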
  • asked a question related to Longitudinal Analysis
Question
2 answers
Good Evening Sir/Ma'am,
I have a Model with four constructs (A, B, C, D). The Constructs are related in the following manner:
                        A-------> B-------> C--------> D.
Each construct has repeated measures, i.e., each construct is a state-level variable that is measured repeatedly in order to obtain its trait level.
The respondents are employees.
I am planning to use hierarchical linear modelling to analyse the model.
So, could you please suggest the minimum sample size required at:
1) Level-1 (i.e., How many times should I measure the state-level variable?) &
2) Level-2 (i.e., How many Employees are required?)
Relevant answer
Answer
Thank you so much, for your generosity sir...
  • asked a question related to Longitudinal Analysis
Question
24 answers
Is it true that there is a linear relationship between risk and return, i.e. that high risk is associated with high return and low risk with low return?
Relevant answer
Answer
Have a look at the work by Roll & Ross (Arbitrage Pricing Theory) and some of the material taught by Damodaran. Fernández asks accounting professors each year for their view of the risk premium (though it is hardly logical to ask people who never have any money to invest). Logic dictates that there is a positive relationship, but the issue goes back to how you measure risk - a single beta is hardly a good measure and probably does not account for more than about 30% of the answer. I think you need to look at some of the work in behavioural finance to see that, unlike the assumptions behind the CAPM, investors are not really rational - and without that assumption (amongst others) there can be no linear relationship.
  • asked a question related to Longitudinal Analysis
Question
2 answers
Hi,
Imagine you have measured two variables X and Y at two points in time. You want to predict Y2 from X1 while controlling for the autoregressive effect (= temporal stability) of Y1 and for the correlation of X1 and Y1. My question: is there any statistical reason that makes it necessary and/or advantageous to implement a full CLPM, that is, to include the paths X1 -> X2 and Y1 -> X2 and the correlation X2 <-> Y2?
I am not interested in reciprocal relations between X and Y. I just wonder whether including these additional paths has an impact on my path of interest (X1 -> Y2) and, if so, why? I'd also be glad if you could provide a reference.
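For concreteness, the full cross-lagged specification described above can be written in lavaan as follows (a sketch with observed variables x1, y1, x2, y2); fitting the model with and without the extra paths makes it easy to compare the X1 -> Y2 estimate directly:

# Full cross-lagged panel sketch for two variables at two time points.
library(lavaan)

model <- '
  y2 ~ x1 + y1     # path of interest plus autoregression of y
  x2 ~ x1 + y1     # autoregression of x and the other cross-lagged path
  x1 ~~ y1         # time-1 covariance
  x2 ~~ y2         # residual covariance at time 2
'
fit <- sem(model, data = d)
summary(fit, standardized = TRUE)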
Relevant answer
Answer
Dear Philipp,
thanks for correctly completing my question ;)
And even more for your answer. I totally agree, but it feels good to be more confident now.
All the best,
Johannes
  • asked a question related to Longitudinal Analysis
Question
3 answers
I have a problem with SPSS. I want to write SPSS syntax that repeats a linear regression ten times, where the dependent variable changes each time and everything else stays the same. How can I do this?
I use this code for the linear regression and want to repeat it ten times while changing the dependent variable:
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT mmsecorretto
/METHOD=BACKWARD CerebralWM_FD CortexVol Mean_GM_temporal_lobe_volume sex anniscolarita
EstimatedTotalIntraCranialVol age
Relevant answer
Answer
Hi Barbara
You can use the Do command within a macro to do this.  The syntax below should do what you are looking for.
Andy
*************************************.
*** Part 1 - Define the macro ***.
*************************************.
* You can change the parameters of the regression command as required (e.g. plots, residuals etc).
* Highlight and run Part 1 to define the macro.
DEFINE ManyDVs (DVs = !ENCLOSE("[","]"))
!DO !I !IN (!DVs).
REGRESSION
  /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10) /NOORIGIN
  /DEPENDENT !I
  /METHOD=BACKWARD CerebralWM_FD CortexVol Mean_GM_temporal_lobe_volume
    sex anniscolarita EstimatedTotalIntraCranialVol age.
!DOEND .
!ENDDEFINE .
**********************************.
*** Part 2 - Call the macro ***.
**********************************.
* Replace the variable names in the square brackets with your own variables.
* Highlight and run Part 2 to produce the regression analyses.
ManyDVs DVs = [mmsecorretto **add other DVs here**].
  • asked a question related to Longitudinal Analysis
Question
6 answers
I don't grasp the concept. For example, if I have a sample of n = 1000 (stratified and clustered) and I form three groups from that sample to compare means, is using a regression model enough?
The IV is categorical and the DV is a continuous variable.
Does the design affect how I choose the groups? Would logistic regression be better? If I have two dependent variables, do I use a multiple regression model?
I am working with a complex survey (secondary analysis) in mental health, and I want to know whether having a history of ADHD symptoms (IV) can affect QoL (DV) in adults, so I have three groups:
1. adults without history of ADHD nor current symptoms---> QoL x
2. adults with history of adhd w/o current symptoms-----> QoL y
3. adults with history of adhd and current symptoms------> QoL z
and I want to test whether there is a statistical difference between x, y, and z.
Relevant answer
Answer
There are different views about the effects of complex sample designs on regression estimation. If, as in the link you provided, you are estimating characteristics of a single population, such as means or proportions, you should certainly take account of the survey design. For the relationship between variables it rarely makes much difference: sampling rates will have a similar effect on both explanatory and response variables and will have little effect on the regression. You could try the analysis with and without the usual sampling weights and compare the results.
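Both versions of the analysis are easy to run side by side; a sketch with R's survey package, where the design variables (psu, stratum, wt) and analysis variables (qol, adhd_group) are hypothetical names standing in for those in the secondary data set:

# Design-based comparison of mean QoL across the three ADHD-history groups.
library(survey)

des <- svydesign(ids = ~psu, strata = ~stratum, weights = ~wt,
                 data = d, nest = TRUE)

fit <- svyglm(qol ~ factor(adhd_group), design = des)   # design-based linear model
summary(fit)
svyby(~qol, ~adhd_group, des, svymean)                  # design-based group means

# Comparing with an unweighted lm(qol ~ factor(adhd_group), data = d) shows how much the design matters.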
  • asked a question related to Longitudinal Analysis
Question
2 answers
I have data modelled with categorical predictors, and a large proportion of cells have zero counts. Apart from considering adding a constant to all cells, we may collapse categories in a theoretically meaningful way.
How do we do such collapsing?
Relevant answer
Answer
You can collapse levels according to their substantive meaning in the applied area. For example, there may be different variants of a disease, some of them rare; you can merge the rare variants into one group (one level of the categorical variable).
Of course, David's book recommendation is very valuable.
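As a small illustration with hypothetical factor and level names, the collapsing itself is a one-line operation once the substantive grouping has been decided:

# Merge rare disease variants into a single level of the categorical predictor.
library(forcats)

d$variant_grp <- fct_collapse(d$variant,
                              common = c("type_A", "type_B"),
                              rare   = c("type_C", "type_D", "type_E"))
table(d$variant_grp)   # check that the zero-count cells have disappeared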
  • asked a question related to Longitudinal Analysis
Question
1 answer
I am doing longitudinal research.
There are two variables.
One was measured three times, but the other was observed five times.
In other words, one variable is missing at two of the measurement occasions.
Do you think I can develop an AR model with these variables?
Relevant answer
You could try using the average of the three available measurements to fill in the two missing occasions, so that both variables have five time points.
  • asked a question related to Longitudinal Analysis
Question
1 answer
I have found only the relationship between moment, thickness, and connectivity, but I guess it applies only to the longitudinal method.
Relevant answer
  • asked a question related to Longitudinal Analysis
Question
8 answers
I am working on a longitudinal study with 140 participants divided into 3 groups. The participants were assessed every 2 years from 2010 (4 time-points in total), so the time-points are equally spaced, but there are some dropouts, so some patients are missing some time-points.
The assessment consisted of some tests, the results of which are discrete numerical variables (e.g. one of these is the MoCA test, which is a cognitive test with different tasks and for each task the participant is given a score; the final score is the sum of the partial scores).
My goal would be to show any difference between groups in the progression of the scores through time.
After some readings I am thinking to use a mixed effect model with the random part on the single individual level and the fixed part on the group level, would that make sense? What other statistical model could I use?
Relevant answer
Answer
I agree with both Cauane and Georgio above.
You are dealing with a multi-level analysis of panel data with 4 repeated measures.
When it comes to the wonderful stats package, R, it's a fair bet that someone has faced a similar problem and shared their solutions. See the link attached: Multilevel analysis: panel data and multiple levels.
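A minimal sketch of that model in R's lme4, with assumed column names (moca for the test score, time since baseline, group, and id for the participant):

# Random intercept and slope per participant; fixed effects for group, time and their interaction.
library(lme4)

fit  <- lmer(moca ~ time * group + (1 + time | id), data = d)
fit0 <- lmer(moca ~ time + group + (1 + time | id), data = d)
anova(fit0, fit)   # likelihood-ratio test of group differences in the rate of change
summary(fit)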
  • asked a question related to Longitudinal Analysis
Question
7 answers
I am trying to analyze data (self-concept and test scores) from students before and after the transition from primary to secondary education. The aim is to show the impact of individual achievement and class achievement on self-concept both before and after the transition. I hypothesize (1) that individual achievement has a positive impact on self-concept and class achievement a negative one (controlling for individual achievement), and, more importantly, (2) that after the transition to secondary school, the class achievement of the "old" class no longer has its negative impact on self-concept measured after the transition.
Now I do not know how to set up a model, because students change classes at the transition and are therefore nested in two different groups - their classes - before and after the transition.
Does anyone have an idea how to set up a model that allows to analyse these questions or has anyone done some similar analysis?
Thank you very much for your answers!
Relevant answer
Answer
Thank you again for your answers. Unfortunately, I am still struggling with my data. I do not want to bother you with many more questions, but let me tell you what I intend to do now.
Because we measured self-concept as a latent variable with several indicators, I think I need to run a two-level CFA, checking measurement invariance between the two levels.
If this works, I realize I probably have to run a contextual-effects model. The reason is that I have the same predictor (test scores) at both the individual and the class level, and I want to see what impact class means of test scores have on self-concept beyond individual test scores (see Marsh et al., 2009).
At the moment I think I will then just try to show:
(1) that there are contextual effects of class achievement at t1 on self-concept at t1 for both groups of students (those with and those without transition, in separate models);
(2) that there is a contextual effect of class achievement at t1 on self-concept at t2 only for those students without a transition to secondary school after t1;
(3) that there are contextual effects of class achievement at t2 on self-concept at t2 for both groups of students (those with and those without transition, in separate models).
More sophisticated analyses seem to exceed my competencies at the moment.
Marsh, H. W., Lüdtke, O., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2009). Doubly-latent models of school contextual effects: Integrating multilevel and structural equation approaches to control measurement and sampling error. Multivariate Behavioral Research, 44, 764-802. 
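The first step described above, a two-level CFA for the self-concept indicators, might look roughly like this in lavaan 0.6+ (assuming indicator columns sc1-sc3 and a class identifier classid; this is only a sketch, not the doubly-latent contextual model of Marsh et al., 2009):

# Two-level CFA: within-class and between-class factors for self-concept.
library(lavaan)

model <- '
  level: within
    sc_w =~ sc1 + sc2 + sc3
  level: between
    sc_b =~ sc1 + sc2 + sc3
'
fit <- cfa(model, data = d, cluster = "classid")
summary(fit, fit.measures = TRUE)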
  • asked a question related to Longitudinal Analysis
Question
3 answers
Longitudinal data elements need to be embedded in the usual snapshot mode of a survey. What techniques can help achieve this?
Relevant answer
Answer
Good Question !
You can partially address the longitudinal aspects with questions/issues that refer to the past and the future as well as the present.
An example of this:  Perceived Chances for Promotion Among Women Associate Professors in Computing (below)
  • asked a question related to Longitudinal Analysis
Question
4 answers
Hi
I'm running a series of multilevel regression models (mixed effects or random coefficient analysis) in Stata 13 to investigate associations between a set of predictors, time (here interpreted as duration in months from time of diagnosis) and my outcome of interest which is continuous (say cholesterol in mmol/L).
The main purpose is to investigate rate of change (i.e. this is a longitudinal analysis) - does cholesterol change with duration (time) given a set of certain predictors.
I know that the modeling results in two parts; the fixed effects part and the random effects part. I know how to interpret the fixed effects part, but could someone help me understand the estimates from the random effects part, when this is run for longitudinal analysis?
Below, we see that cholesterol decreases by 19 units per month (duration is in months), and the mixed and Black ethnicity coefficients are 3.4 and 3.3 at time 0 (i.e. at diagnosis), and so on. But how do I interpret the estimates under 'Random-effects Parameters' in the bottom half of the output?
Example of output:
xtmixed hba1cifcc2 durationm durationm2 sex1 i.ethnicnew2 if diagyr>2004 & durationm<6.1 & imd4!=. || id: durationm, cov(unstr) mle var
Mixed-effects ML regression               Number of obs      =      1028
Group variable: id                        Number of groups   =       443
                                          Obs per group: min =         1
                                                         avg =       2.3
                                                         max =         7
                                          Wald chi2(9)       =    729.89
Log likelihood = -4329.3449               Prob > chi2        =    0.0000
------------------------------------------------------------------------------
  cholesterol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   durationm |  -19.43518   .7736272   -25.12   0.000    -20.95146    -17.9189
  durationm2 |   2.402219   .1226445    19.59   0.000      2.16184    2.642598
          sex|  -1.840137   1.352356    -1.36   0.174    -4.490706    .8104314
  ethnicity |
      mixed |   3.442956   2.547798     1.35   0.177    -1.550636    8.436549
      Black  |   3.286653    2.12706     1.55   0.122    -.8823077    7.455614
      asian  |   5.825651   1.820642     3.20   0.001     2.257258    9.394044
      _cons |   93.19885   2.935992    31.74   0.000     87.44441    98.95329
   -------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
              var(durationm) |   21.35454   2.9492331      16.03032     27.2855
                  var(_cons) |    352.785    38.02899     284.87269   434.91405
        cov(durationm,_cons) |  -69.80694    9.286172    -88.562195   -50.74312
-----------------------------+------------------------------------------------
                sd(Residual) |   10.55247    .4243647      9.752664    11.41787
------------------------------------------------------------------------------
LR test vs. linear regression:   chi2(3) =   214.13   Prob > chi2 = 0.0000
Relevant answer
Answer
there is quite a lot going on here so I would advise reading an extensive account of these types of models
I have found this book useful for researchers without a lot of statistical background
Longitudinal Analysis: Modeling Within-Person Fluctuation and Change (by Lesa Hoffman)
This website gives all the code for a number of software implementations including Stata
My reading of your results suggests that there are large unexplained differences between individuals (labelled id) that diminish over time (as shown by the negative covariance), and that these are large compared with the within-person variation (labelled Residual here); you have a very simple structure at the lowest, occasion level, namely homogeneity and no autocorrelation within a person.
This gives a relatively short example that will help you understand what you have done
You have been modeling complex level 2 variation between individuals as a quadratic function of duration.
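To see the 'quadratic function of duration' point, the reported estimates can be plugged straight into the level-2 variance function; a small R sketch using the numbers from the output above:

# Between-person variance implied by the random part: var(u0) + 2*t*cov(u0,u1) + t^2*var(u1).
var_cons  <- 352.785      # var(_cons): between-person variance at duration 0
var_slope <- 21.35454     # var(durationm): variance of the person-specific slopes
cov_cs    <- -69.80694    # cov(durationm, _cons)

duration    <- 0:6        # months since diagnosis, matching the estimation sample
between_var <- var_cons + 2 * duration * cov_cs + duration^2 * var_slope
round(between_var, 1)
10.55247^2                # within-person (residual) variance, for comparison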
  • asked a question related to Longitudinal Analysis
Question
2 answers
In my study, there are eleven response variables and one independent variable.
Is there a way to do this (using longitudinal ordinal-response GEE models)?
Relevant answer
Answer
Thank you very much.
  • asked a question related to Longitudinal Analysis
Question
4 answers
Hi, I did an experiment on miRNAs and their targets, but I have a problem calculating the significance of the data. Can you help me decide which test I should use and how to calculate the p value or z score for these data?
negative (no vector)
Positive ( null miRNA vector + gene)
G1, G2, G3 (miRNA vector + gene)
Relevant answer
Answer
Dear Marwa, The correct approach to the analysis of your data - and any other, for that matter - is to state your Null Hypothesis first. In other words, what is your goal? Presumably, you mean to show that two events are the same or not. You must make this clear. The way you stated your question reads "Here are data, now help me find a test". Not appropriate. What you want to say is "I want to show that ... is true (e.g., equal means, equal medians, equal variances, etc) under the given experimental conditions". Furthermore, with stating your hypothesis, you make a choice regarding the significance cut-off. Usually, at 5% (or 0.05). You may not realize it, but by stating your hypothesis, you have defined "your" test statistic - in advance, mind you, and without having gathered data. Then, you go gather data, apply the test statistic and state whether under the conditions given, you accept or reject the hypothesis. In summary, your problem is that you do not state what you wanted to discover from the data. Please do so - essential for you, and essential for those who are willing to help you with your question. Best regards, Hans
  • asked a question related to Longitudinal Analysis
Question
3 answers
Is there an easy way to check a "geeglm" model (with an AR(1) correlation structure, in R), for example via its residuals or some other diagnostic?
Thanks a lot
Relevant answer
Answer
For Michaela: the EViews program may be more practical to use. If the data exceed an AR lag of 3, they should be re-examined using a cointegration test. If the cointegration test is not passed, then use a VECM; if cointegration holds, use a VAR; or, if you are using moving data, an ARMA model.
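On the original question about checking a geeglm fit, a few quick checks are available directly in R (here m stands for the fitted geeglm object with the AR(1) working correlation):

# Basic residual checks and working-correlation comparison for a geepack fit.
library(geepack)

r <- resid(m)                    # residuals from the GEE fit
plot(fitted(m), r, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
QIC(m)                           # compare QIC across working correlation structures (ar1, exchangeable, ...)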
  • asked a question related to Longitudinal Analysis
Question
4 answers
Dear all,
I am running an MLM in which I am interested in individual and regional effects, but I only want to control for country. However, once I insert the country dummies at level 3 - while none of the previous results change - the LR test now indicates that I should use an ordinary logit model instead.
Thus I am wondering: is the country-level variance already accounted for? That is, do I need the dummies in the first place to control for the nesting of regions in countries and take out all country fixed effects, or is that already done?
Thank you very much for your time!
Best, Jo
PS: I am using Stata's built-in command, which takes forever. I have tried MLwiN but was disappointed because it crashed very often. If there is a program you would recommend, please do let me know.
Relevant answer
Answer
How many countries do you have, and how many regions are there? I presume that the outcome is discrete, and that in this initial null model there are no predictors included at any level.
If you have few countries (and/or an overly complex model), any software will struggle.
Ideas behind a three level model are considered here:
and aspects of multilevel discrete model are fully considered here:
You may want to look at this paper that considers discrete outcomes with few countries and how well MCMC estimation does in comparison to likelihood approach
Stegmueller, D. (2013). How many countries for multilevel modeling? A Monte Carlo study comparing Bayesian and frequentist approaches. American Journal of Political Science, 57(3).
MLwin has the capacity to use MCMC estimation
see
Finally - and this may be your real problem - if you are including a set of dummies at level 3, then you are fitting a fixed-effects model for countries, and there can be no variance left unexplained at level 3; it will always be zero.
See
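The last point can be seen directly by fitting both versions; a small lme4 sketch with hypothetical variable names, in which the country-level random-intercept variance collapses towards zero once country dummies enter the fixed part:

# Country dummies in the fixed part leave no country-level variance for the random part.
library(lme4)

m_random <- glmer(y ~ x + (1 | region) + (1 | country), data = d, family = binomial)
m_dummy  <- glmer(y ~ x + factor(country) + (1 | region) + (1 | country), data = d, family = binomial)

VarCorr(m_random)   # non-zero country variance
VarCorr(m_dummy)    # country variance collapses to (essentially) zero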
  • asked a question related to Longitudinal Analysis
Question
5 answers
What is the best type of study to capture changes in well-being within a city's population over a period of 3 months: cross-sectional or longitudinal?
The population will be surveyed twice: once before a major event, and a second time after the event. The time period between the two surveys will be 3 months.
Relevant answer
Answer
You may want to look at the appendix of this study to see an analysis of change due to an intervention - the outcome at 3 months is regressed on the outcome 3 months earlier, so that you are modelling change, and there is a dummy variable for those who did/did not receive the intervention.
You do not say how the intervention is determined - classically it would be assigned randomly to individuals so that potential confounders are held at bay. If that is not the case, this book usefully discusses quasi-experiments:
Dunning, Thad (2012). Natural Experiments in the Social Sciences: A Design-Based Approach. Cambridge University Press.
You may also want to look at a regression discontinuity design.
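A sketch of the regression described above, with hypothetical variable names (wellbeing_t1 measured before the event, wellbeing_t2 after it, exposed indicating who experienced the event or intervention):

# Baseline-adjusted change analysis: post score regressed on pre score plus exposure.
fit <- lm(wellbeing_t2 ~ wellbeing_t1 + exposed, data = d)
summary(fit)   # the coefficient on 'exposed' is the adjusted difference in change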
  • asked a question related to Longitudinal Analysis
Question
4 answers
I have been conducting a longitudinal study examining the association between aerobic fitness, weight status, and academic achievement. Recently the standardized testing measure, used to assess academic achievement among youth, was changed due to a switch in government contract. Is there any way to continue longitudinal analysis with this switch in the measurement tool for the outcome variable of academic achievement?
Relevant answer
Answer
Hi,
From what you describe, an unexpected change in policy is not in itself a justification for changing your proposed study design. The longitudinal design - for which I assume you have already started collecting data - requires consistency in your measurement over time.
However, if you have not yet started data collection, you may go back to your funding agency or IRB committee to seek permission to change your measures in light of the new circumstances.
  • asked a question related to Longitudinal Analysis
Question
3 answers
What is the best method to examine the dynamics of cattle colonization by antimicrobial-resistant microorganisms? The data consist of 188 cattle measured at four equally spaced time points over the course of a year. The same cows were followed, and both a binary outcome (ARM present/absent) and the number of bacteria present (log colony-forming units) are available. I am interested in exploring dynamic or longitudinal models to understand the underlying process of colonization in the herd over time. Please provide references explaining any relevant methodologies, or other examples.
Relevant answer
Answer
It would be helpful if you described the situation in some more detail. Are the 188 cattle in a single herd or in multiple herds, and what is the nature of their contact?
If there is a complex structure, you may need to be aware of these models as well as the two-level repeated-measures structure of occasions (at level 1) nested in cows (at level 2).
There is an example in the following of salmonella in chickens in different flocks (but no time dimension).
These models are not for the beginner; consulting with a biostatistician would be helpful.
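As a hedged starting point with hypothetical column names, the two-level repeated-measures versions of the two outcomes might be set up as follows, before any herd or contact structure is added:

# Occasions nested in cows: binary colonisation status and log CFU counts.
library(lme4)

m_bin <- glmer(arm_present ~ sampling_time + (1 | cow), data = d, family = binomial)
m_cfu <- lmer(log_cfu ~ sampling_time + (1 | cow), data = d)
summary(m_bin)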
  • asked a question related to Longitudinal Analysis