
# Hierarchical Linear Modeling - Science topic

Questions related to Hierarchical Linear Modeling
Question
I have data from European Social Survey (24 countries) and want to model a cross level interaction. Can I do this with a simple random intercept (fixed slope) model? Or do I have to model a more complex random slope model? And if so, are 24 countries sufficient?
I am not explicitly interested in explaining the different slopes at Level 2 via the cross-level interaction. I would do that if possible, but I suspect I would need more countries for that, right?
But I definitely want to show that trust in institutions (a Level 1 variable) depends on the level of corruption (a Level 2 variable) in a country. Can I do this with a random intercept, fixed slope model?
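A sketch of the two model variants in R with lme4 may make the choice concrete (the data frame `ess` and the variable names `trust`, `x1`, `corruption`, and `country` are placeholders for illustration, not taken from the survey itself):

```r
library(lme4)

# Random-intercept (fixed-slope) model with a cross-level interaction:
# trust      - level-1 outcome (trust in institutions)
# x1         - level-1 predictor whose slope corruption may moderate
# corruption - level-2 (country-level) predictor
m_ri <- lmer(trust ~ x1 * corruption + (1 | country), data = ess)

# Same interaction, but letting the x1 slope vary across countries,
# so the interaction explains between-country slope variation:
m_rs <- lmer(trust ~ x1 * corruption + (1 + x1 | country), data = ess)
```

A common recommendation is to prefer the random-slope version when testing cross-level interactions, since the fixed-slope model can understate the interaction's standard error; with 24 countries a single random slope is usually still estimable, though level-2 estimates will be noisy.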
Question
Hi, RG community! I am new to network analysis and am currently facing a challenge with coding, processing, and quantifying networks in a hierarchical scheme. In this scheme, nodes pertain to different hierarchical ranks, and ranks denote inclusion relationships. So, for example, if node “A” includes node “Z”, it is said that “A” is “Z”’s parent and “Z” is “A”’s daughter. However, a rather uncommon feature is that nodes at different ranks of the hierarchy can relate in a non-inclusive fashion. For example, node “A”, parent of “Z”, may have a directional link to “Y”, which is “B”’s daughter (if “A” were directionally linked to “B”, then it could be said that “A” is “Y”’s aunt). Here is a more concrete example to illustrate the plausibility of this scheme: “A” is a website in which person “Z” is signed in (inclusiveness; specifically, parentship); website “A” can advertise banners of website “B” (siblingship) or recommend following a link to person “Y”’s profile in website “B” (auntship).
OK. So, in the image below (left top panel) I present a graphical depiction of this rationale. For simplicity, a two-rank hierarchy is used, where gray and red denote the higher and lower ranks, respectively. The image displays siblingship, parentship, and auntship links. My first approach to coding this network scheme was to denote inclusiveness as one-directional relationships (green numbers) and simple links as symmetrical (two-way; brown numbers) relationships (see table in the right panel). However, I soon realized that this does not produce what I expected in the network metrics. For example, I am mainly interested in quantifying cohesiveness, and the way I coded the network in the left top panel treats it like the non-hierarchical network depicted in the left bottom panel. In short, I am not interested in the directionality of the links but in actual inclusiveness. To my mind, the network in the top panel is more cohesive than that in the bottom panel, but my coding approach does not allow me to distinguish between them formally.
The solution I conceived to solve this problem was to stipulate that a relationship between any pair of nodes implies a relationship of each with all of the other’s descendants. This indeed yields, for example, the top network being more cohesive than the bottom one, which is in line with my goals. However, this solution is not as elegant as I would have hoped. Can anyone tell me if there is a better solution? Maybe another way to code the network, or an R package allowing for qualitatively distinct relationships (not just one-way or two-way). Thank you.
Instead of listing A/B on the same level (in the matrix) as Y etc., what you describe seems (to me) to be a graph G = [X, Y, Z, W] with 2 subgraphs A and B.
A simple representation is then a 4x4 matrix plus a data structure to test whether a node is in a subgraph or not (dictionary/set).
If you want to model edges between A/B, you can define a new 2x2 matrix describing those.
Note that for large datasets adjacency matrices will scale poorly, so either an adjacency list or sparse matrices can be useful.
There is a fast data structure for this kind of problem (if A/B are disjoint).
If you find you need multiple edges, consider hyper-/multigraphs.
Hope this helps,
Ben
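The representation Ben describes could be sketched in base R roughly as follows (node and subgraph names follow the example above; the particular edges are invented for illustration):

```r
# Adjacency matrix for the lower-rank nodes X, Y, Z, W (directed links)
nodes <- c("X", "Y", "Z", "W")
adj <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
adj["Z", "Y"] <- 1                 # e.g. Z links across to Y

# Membership structure: which parent (subgraph) each node belongs to
membership <- list(A = c("X", "Z"), B = c("Y", "W"))
in_subgraph <- function(node, sg) node %in% membership[[sg]]

# Separate 2x2 adjacency for links between the parents A and B
parents <- c("A", "B")
adj_parent <- matrix(0, 2, 2, dimnames = list(parents, parents))
adj_parent["A", "B"] <- 1          # e.g. A advertises banners of B
```

Keeping the parent-level links in their own small matrix is what lets metrics on the lower-rank graph ignore inclusiveness links, which was the problem with coding inclusiveness as ordinary directed edges.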
Question
I need code for Consensus + Innovations and OCD in any programming language, preferably Matlab or R.
Aamir Nawaz, Can you provide code for Consensus + Innovations and Optimality Conditions Decomposition?
I would appreciate it if you help me.
Question
Dear colleagues,
I am asking for your kind comments or recommendations on analyzing hierarchical, multiple-response outcomes. My outcome is hierarchical and multiple-response because it is Quality of life (measured by the RAND-36 or SF-36). By scoring the 36 items, I obtain a continuous mean score for total quality of life. But, as you may know, under the SF-36 we can also calculate 8 domain scores (PF, RP, BP, GH, MH, RE, VT, SF) and 2 dimension (summary) scores (the PCS and MCS). Therefore, in a way, my outcomes are both multiple responses and hierarchical.
level 1: total mean score of quality of life
level 2: --- Physical component summary
level 3: ------ PF: physical functioning
level 3: ------ RP: role limitation due to physical problems
level 3: ------ BP: body pain
level 3: ------ GH: general health
level 2: --- Mental component summary
level 3: ------ MH: mental health
level 3: ------ RE: role limitation due to emotional problems
level 3: ------ VT: vitality (fatigue or energy)
level 3: ------ SF: social functioning
The purpose of my study (cross-sectional design) is to understand factors associated with hemodialysis patients' quality of life. Therefore, I have a series of explanatory variables (Xs) to estimate the Ys. My original analysis used "multiple regression" for each of the quality of life scores (three hierarchical levels of scores: the total QoL mean score, each of the 8 domain scores, and each of the 2 dimension scores).
But this brings me to the problem of "multiple comparisons", and I also treated each type of score (whether the total QoL mean score, a domain score, or a summary score) as "independent of the others", when in fact they are correlated. The measurement instrument itself builds in a hierarchy of, and correlations among, the three levels of scores in its designed conceptual framework (the SF-36).
Therefore, I would like to kindly ask for your comments or recommendations:
1) How can I analyze my outcomes (Ys) when they are multiple responses and hierarchical?
2) Will multilevel analysis (hierarchical linear regression) work for my Ys?
3) Are there other analysis methods I could try?
4) Could you please suggest some literature exploring the issue I am encountering?
Ronán Michael Conroy thank you so much for your kind advice. I will read through the Dirichlet distribution.
Question
Hello!
We have a question about implementing the ‘Mundlak’ approach in a multilevel (3-levels) nested hierarchical model. We have employees (level 1), nested within year-cohorts (level 2), nested within firms (level 3).
In terms of data structure, the dependent variable is employee satisfaction (an ordinal measure) at the employee level (i) over time (t) and across firms (j) (let's call this Y_itj), noting that we have repeated cross-sections with different individuals observed in every period. As regressors, we are mainly interested in the impact of a firm-level time-variant, but employee-invariant, variable (let's call it X_tj). We apply a 3-level ordered probit model (meoprobit in Stata).
We are concerned with endogeneity issues of the X_tj variable, which we hope to (at least partially) resolve by using some form of a Mundlak approach, by including firm specific averages for X_tj, as well as for all other time-varying explanatory variables. The idea is that if the firm-specific averages are added as additional control variables, then the coefficients of the original variables represent the ‘within effect’, i.e. how changing X_tj affects Y_itj (employee satisfaction).
However, we are not sure whether approach 1 or 2 below is more appropriate, because X_tj is a level 2 (firm level) variable.
1. The firm specific averages of X_tj (as well as other explanatory variables measured at level 2) need to be calculated by averaging over individuals, even though the variable itself is a Level 2 variable (varies only over time for each firm). That is, in Stata: bysort firm_id: egen mean_X= mean(X). As our data set is unbalanced (so the number of observations for each firm varies over time), these means are driven by the time periods with more observations. For example, in a 2-period model, if a company has a lot of employee reviews in t=1 but very few in t=2, the observations in t=1 will dominate this mean.
2. Alternatively, as the X_tj variable is a level 2 variable, the firm specific averages need to be calculated by averaging over time periods. That is: we first create a tag that is one only for the first observation in the sample per firm/year, and then do: bysort firm_id: egen mean_X= mean(X) if tag==1. This gives equal weight to each time period, irrespective of how many employee-level observations we have in that period. For example, although a company has a lot of employee reviews in t=1 and very few in t=2, the firm specific mean will treat the two periods as equally important.
The two means are different, and we are unsure which approach is the correct one (and which mean is the ‘true’ contextual effect of X_tj on Y_itj). We have been unable to locate in the literature a detailed treatment of the issue for 3-level models (as opposed to 2-level models where the situation is straightforward). Any advice/suggestions on the above would be very much appreciated.
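For concreteness, the two averaging schemes can also be contrasted in R (the data frame `d` with columns `firm_id`, `year`, and `X` is hypothetical, mirroring the Stata notation above):

```r
# Approach 1: firm mean over all employee-level rows, so periods with
# more employee observations get more weight
d$mean_X_emp <- ave(d$X, d$firm_id)

# Approach 2: collapse to one value per firm-year first, then average,
# so each period counts equally regardless of employee counts
period_means <- aggregate(X ~ firm_id + year, data = d, FUN = mean)
firm_means   <- aggregate(X ~ firm_id, data = period_means, FUN = mean)
d$mean_X_per <- firm_means$X[match(d$firm_id, firm_means$firm_id)]
```

Because X is constant within firm-year, the first `aggregate` simply picks the period value; the difference between the two columns comes entirely from the weighting by employee counts.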
For general orientation you may be interested in
In this manual, see Chapter 8, where we use another multilevel model to get precision-weighted estimates of the group means.
A fuller discussion of the multilevel model as a measurement model, and a more convincing example of using precision-weighted estimates of the group mean, can be found at
Finally, as in all research involving a judgement call, do it both ways and see if it makes a substantive difference.
Question
I have a sample of 138 participants. Only 6 of them reported living alone (4.3%), while the remaining 132 share a household with others (family/partner/housemate, etc.).
I am trying to decide whether I can add "living alone" as a dichotomous variable in my hierarchical regression. What worries me is the very low percentage of individuals living alone in the sample. Do you think this would be problematic?
Thank you so much for your answer!
In general I agree with Prof Wright. I would like to add a few things. First, if the DV is dichotomous, this is a logistic regression, and hierarchical regression makes little sense because you can't follow change in things like R². Second, if OLS regression is appropriate in this situation, changes in F statistics or R² will tell you nothing as far as I can see. Third, if this is a logistic regression, there are no F statistics or R² to follow, so why do hierarchical regression? Fourth, in this case I can't see how hierarchical regression will tell you anything if OLS is appropriate; thus, fit a full model of the appropriate type. Fifth, if the DV is dichotomous and you do logistic regression with a low percentage in one group, the Firth method of logistic regression is your best choice; it requires a special program, and that's why Prof Wright asked for a research question. Finally, if you have more than one DV, life gets very complicated, as it does if you have more than 2 categories in your DV. So next time please specify things like these whenever possible, to save time in getting an answer. Best wishes, David Booth
Question
I would like to perform a sensitivity analysis of a CFD solver. There are 8 input variables, for each of them there are 2-3 prescribed numerical values.
Evaluating one set of parameters requires three costly simulations (each running for 20 hours on 800 CPU cores). The budget for these simulations is limited, and due to the queuing system of the HPC it would take a long time to get the results.
I'm aware of Latin hypercube hierarchical refinement methods that allow starting the sensitivity analysis with a smaller budget and subsequently incorporating newer results as they become available.
But those methods work with continuous variables. Is there a method for categorical and ranked/ordinal variables?
Thank you, Andrea
Question
Hello,
I have a dataset with a three-level structure. Participants reported an outcome of interest for 3-4 aspects of an incident, and they reported 4 types of incidents (1 incident for each cell of the 2 × 2 factors). I drew the structure in the figure below.
I have two questions:
(1) How should I structure the covariance matrix for the residuals? I expect the residuals for each person to correlate. Also, I expect the residuals within each event to correlate more strongly. I prefer not to use an unstructured matrix, as the number of parameters would be too large. (If it cannot be done in SPSS, then R, preferably the nlme or lme4 packages.)
In SPSS, I'm using this syntax:
MIXED Y BY Valence Type Aspect
/FIXED = Valence Type Valence*Type | SSTYPE(3)
/REPEATED=Type*Valence*Aspect | SUBJECT(participant_ID) COVTYPE( )
(2) As illustrated in the figure below, one aspect of the outcome is not applicable to type 2 events (type 2 events have only 3 level-1 categories). I assume that when I want to assess the effect of incident type, I should exclude aspect number 4, which applies only to type 1 incidents, so that the results reflect the difference between types based on the same level-1 outcomes. But excluding or including number 4 makes a negligible difference. Can I just report that, and then include number 4 in subsequent analyses? Especially as excluding the data from number 4 would cost substantial statistical power.
Thank you in advance.
Kelvyn Jones - good thought: Valence × Type together should be one random factor with nominal levels 1, 2, 3, 4 (and the fixed factors should explain the variation between the "four events"). So participant, event nested in participant, and aspect nested in event nested in participant are the 3 levels.
Question
I am interested to know whether anyone has conducted online studies with a hierarchical linear modelling design in the field of education, involving teachers and their students, or in a similar field involving coaches/mentors/supervisors and their athletes/employees. What platforms have you used, and how have they been compatible with data privacy? What kind of ethical issues have you encountered?
Interesting topic
Question
Hello Experts,
I am considering a variable that has 5 sub-variables (dimensions). How should I check moderation for such a variable? Am I supposed to treat them all as separate moderators when analyzing the data in Hayes' PROCESS macro (Model 1)?
Next, if, when treating them all separately, one moderator (dimension) demonstrates a different effect (-/+) compared to the others, what approach should I adopt to interpret it? The 5 dimensions reflect mental stability; if one is missing or has a different effect, how should I address that?
Regards,
Treat each sub-variable as a moderator of the same X in successive regressions, assuming the composite variable is a moderator.
Question
I need to assess the effect of weight on the improvement in DV score over 6 time points.
The attached document includes R syntax and output of 3 HLM models.
The document includes side notes with conclusions and questions that reflect my imperfect knowledge of multilevel models.
I would appreciate it a lot if an expert could be so kind as to have a look and answer my questions. Also, I need to know how to report the results in a minimal and simple way.
All partial or perfect answers and opinions are highly welcomed and appreciated.
Kind regards,
Also, try asking this question on Stack Overflow ( https://stackoverflow.com/ ). For me, it has worked well many times.
Good Luck!
Question
I read quite a few resources on linear mixed effects models, but none are sufficient to help me address the complex data structure I am working with. I would be very grateful if someone could help me pinpoint the best way to account for the random effects in my data.
The overarching hypothesis I want to test is whether brain activation is increased when participants engage in one cognitive task compared to another. So within each subject are nested variables consisting of 2 task conditions, a varying number of trials (i.e. overall time per condition), and a varying number of intracranial electrodes that measure brain activation. I attached a figure to visualize the random effects on a subject-level basis and a schematic of the overall data structure. I do not think any of these variables are crossed, aside from each subject completing both task conditions. Afterwards, my lab may want me to analyze these random effects more directly.
lmer(activation ~ condition + (trial | subject/electrode))
The model above is my best approach so far to test my overall hypothesis, but I have several concerns and questions:
1) I think this current model treats condition as a between-subject effect, even though it should be a within-subject variable, since each subject completed both task conditions. Is there additional syntax that I could include to account for this?
2) The visualize_mixed_effects file attached shows that brain activity varies tremendously from subject to subject, electrode to electrode, and across trials. This is not essential to my main hypothesis, so I am thinking I should account for the heterogeneity by including a random intercept and slope for trial number in the model. This seems fine to me, but I am less certain whether it is logically sound, since electrodes are nested within subjects. I have never fit a doubly nested LME before. Is the formula above robust enough to test my hypothesis?
I'm a bit confused. I think that you want this basic model:
lmer(activation ~ condition + (1 | subject))
This treats subject as a random effect and condition as a within-subject effect (assuming that there are multiple trials per subject, because lme4 picks up on the structure based on the subject ids). I'm not convinced electrodes are a level here: that would imply a large number of electrodes that could have been placed, which you want to generalize to. Also, if the electrode placement is identical between subjects, I think that would make it a crossed random effect, not a nested one. So electrode is probably a fixed effect?
If trials are just replications and don't involve varying stimuli (beyond condition), then I can't see why you'd include them as a random effect. Their variation is picked up in the level-1 error term. If they represent different stimulus ids, it's possible that you need to include them as a crossed random effect (if the same items are given to each subject).
To add crossed random effects you'd get (but it's not clear to me that you want to add either or both of these):
lmer(activation ~ condition + (1 | subject) + (1 | electrode) + (1 | trial) )
Finally you might want random effects of condition:
lmer(activation ~ condition + (1 + condition | subject))
This fits a random slope that allows the condition effect to vary between subjects, along with the covariance between the intercept and the slope. Sometimes the more complex models struggle to converge. There are various options for that, but I usually refit the model in the Bayesian R package brms (which uses similar syntax).
My best guess is you want one of these basic models:
lmer(activation ~ condition * electrode + (1 | subject))
lmer(activation ~ condition + (1 | subject) + (1 | electrode))
I'd maybe start with a simple model:
lmer(activation ~ 1 + (1 | subject) )
and build up to see if any convergence issues crop up.
Question
I have been using the hier.part package in R and love it, but now I need to run hierarchical partitioning on a linear model with 10 variables. With more than 9 variables, hier.part has a rounding bias (see Olea et al., 2010).
Is there an alternative function in R to run hierarchical partitioning that works with 10 explanatory variables? I know it is a lot of explanatory variables for my candidate model, but I do not have any further ecological or statistical justification to remove any more explanatory variables before running best model selection.
I found the ghp package (https://github.com/Stan125/ghp), but it builds on hier.part, so I do not know whether I should use it.
Or would the adipart function from the vegan package accomplish the same thing? Does hierarchical null model testing also show independent and joint effects?
Forgive me if I am not clear, I am fairly new to statistical analysis! Thank you for your time and help!
try partition()
Question
We examine language outcomes in children in childcare. For most children, the nesting structure is: child in classroom in childcare center in municipality.
However, some childcare centers do not have a traditional classroom division but a more open structure. That is, there is no classroom level, or it is identical to the childcare-center level. Will the HLM work if, for some children, two levels are not distinguished?
I'm using Stata 16's "mixed" command.
Anders
No.
Hope this helps,
Matt
Question
Question edited for clarity:
My study is an observational two-wave panel study involving a single-group sample with different levels of baseline pre-outcome measures.
There are three outcome measures that will be taken at two time points (pre-rest and post-rest):
1. Subjective fatigue level (measured by visual analog score; continuous numerical data)
2. Work engagement level (measured by Likert scale; ordinal data)
3. Objective fatigue level (mean reaction time in milliseconds; continuous numerical data)
The independent variables consist of different types of data, i.e. continuous numerical (age, hours, etc.), categorical (yes/no, role, etc.), and ordinal (Likert scale).
To represent the concept of recovery, i.e. the unwinding of the initial fatigue level, I decided to measure recovery as the difference between the pre- and post-measures for each outcome; the score differences are operationally defined as recovery level (subjective recovery level, objective recovery level, and engagement recovery level).
I would like to determine whether the independent variables significantly predict each outcome (subjective fatigue, work engagement, and objective fatigue).
Currently I am considering the following statistical strategies. Kindly comment on whether they are appropriate.
1. Multiple linear regression; however, one outcome measure, work engagement, is ordinal data.
2. Hierarchical regression, hierarchical linear modelling, or multilevel modelling, though I am not very familiar with the concepts, assumptions, or other aspects of these methods.
3. I would consider reading up on beta regression (sorry, this is my first time reading about this method).
4. Structural Equation Modelling.
- Can the 3 different types of fatigue measurement act as indicators of a latent outcome construct of Fatigue?
- Can the independent variables consist of a mix of continuous, categorical, and ordinal data?
Thanks for your kind assistance.
Regards,
Don't worry about your grammar. I should have been more careful, too.
I think you have enough people for your analyses.
I am attaching an article from a highly respected methodologist / statistician that should reassure you about your work engagement variable being able to be regarded as at the equal-interval level for purposes of analysis. It has some highlighting that I placed in it. I hope that's OK.
Robt.
Question
G. David Garson, in his book Hierarchical Linear Modeling (2003), p. 64, asserts,
"The fact that the intercept component is significant means that the intraclass correlation coefficient, ICC, is also significant..."
I'm actually interested in the converse -- I have a non-significant intercept component in a null model and want to conclude that the ICC is not statistically significant -- but a referee has questioned that reasoning. Garson gives no justification or citation to support the assertion.
This represents significant between-level variation.
Hope this helps
Matt
Question
Here's what I have written for a linear model:
lin.base <- lmer(FreqTrialStim.RESP ~ zFreq_i + (zFreq_i | Subject), data = data10, REML = F)
This is the truncated script. Here's the full version, with intercepts included:
lin.base <- lmer(FreqTrialStim.RESP ~ 1 + zFreq_i + (1 + zFreq_i | Subject), data = data10, REML = F)
Your model is not linear.
You may linearize it:
log(Y) = log(a·X^b) = log(a) + b·log(X)
If you set Y' = log(Y), X' = log(X), c = b, d = log(a), the model is
Y' = cX' + d
This would make sense when the errors in Y' are conditionally normal. If not, you should instead fit the nonlinear model using nlme.
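Assuming the intended power model is Y = a·X^b with multiplicative error, the linearization can be checked numerically in R (the data here are simulated purely for illustration):

```r
set.seed(1)
a <- 2; b <- 0.5
X <- runif(100, 1, 10)
Y <- a * X^b * exp(rnorm(100, sd = 0.05))  # multiplicative lognormal error

# Fit the linearized model Y' = d + c*X', with Y' = log(Y), X' = log(X)
fit   <- lm(log(Y) ~ log(X))
b_hat <- unname(coef(fit)[2])        # estimate of b (the slope c)
a_hat <- exp(unname(coef(fit)[1]))   # estimate of a, since d = log(a)
```

The recovered `a_hat` and `b_hat` should land close to the true values 2 and 0.5, confirming that the slope estimates b and the exponentiated intercept estimates a.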
Question
In the field of intellectual disability, data are often hierarchical. As in the classic example of schools (school, class, student), data in our field are often nested (residential setting, life unit, individual).
In this context, I am looking for studies that have analyzed the impact of this hierarchy in the field of intellectual disability, or empirical studies that have at least controlled for it using multilevel models.
Is anyone aware of the existence of such studies?
Just to say that there is now no need for a strict hierarchy with random-effects multilevel models. You can have cross-classifications and multiple membership, which can take account of individuals being nested in more than one setting and spending different amounts of time in them.
Question
Dear researchers,
I am currently conducting a meta-analysis of the relation between personal values and different well-being scales across cultures. The effect size I am extracting from all studies is the Pearson correlation coefficient.
Today, I came across a relevant study that used a hierarchical linear model to predict well-being from personal values. Now I am not sure whether I can transform the results table of this study, e.g. the HLM coefficients, into a correlation.
Attached is the relevant study, with the result tables at the end. For my meta-analysis, pp. 38 and 39 are relevant, in order to get information about the relation between personal values and well-being.
It would be awesome if someone could let me know whether it is possible to calculate correlations from the results of this hierarchical linear model.
Best regards and thank you so much,
Benedict
E-mail: florencia.sortheix@helsinki.fi
It is best to ask Florencia M. Sortheix directly. She most likely still has the data, and she works at the University of Helsinki, Finland.
Question
Best Source to learn Hierarchical Linear Modeling
This part of the website gives organized access to all of CMM's training materials,
and this is the address of the LEMMA training materials.
You can follow these topics in R, Stata, and MLwiN (there are also some materials for SPSS):
1. Using quantitative data in research (watch video introduction)
2. Introduction to quantitative data analysis (watch video introduction)
3. Multiple regression
4. Multilevel structures and classifications (watch video introduction)
5. Introduction to multilevel modelling
6. Regression models for binary responses
7. Multilevel models for binary responses
8. Multilevel modelling in practice: Research questions, data preparation and analysis
9. Single-level and multilevel models for ordinal responses
10. Single-level and multilevel models for nominal responses
11. Three-level multilevel models
12. Cross-classified multilevel models
13. Multiple membership multilevel models
14. Missing Data
15. Multilevel Modelling of Repeated Measures Data
Question
The study takes the data from the clients from an RCT that studies the outcome of couple therapy (experimental and control are two different types of therapy).
However, unlike the RCT, this study does not compare the 2 types of therapy. Rather, it compares two different groups of the population (infidelity vs. non-infidelity), and sees how they did in therapy.
If it helps, they used hierarchical linear modeling analysis.
What would be the design of this study? Here is how it is described in the article:
In the present exploratory study, the authors examined the therapy outcomes of a sample of infidelity couples (n = 19) who had participated in a randomized clinical trial of marital therapy (N = 134).
I think it's a very straightforward correlational study comparing outcomes for two non-randomly selected groups of subjects. Perhaps that's an oversimplification. The question couldn't have waited until I finished teaching Research Methods this Fall? Just kidding. But I think Ryan was on the right track when he suggested pre/post intervention. There would need to be a pre-therapy survey (or similar instrument) as well as a post-therapy survey to measure at least perceived outcomes and success of therapy. The findings for each of the groups (infidelity versus non-infidelity) would be compared statistically, with fidelity vs. non-fidelity being the one single variable, assuming most other demographics between the two groups are comparable. The research description stated that it was an exploratory study. This would tend to support the idea that the researchers are using a very simple correlational design based on survey instruments; such a simple design is often used in exploratory research because there is not yet sufficient data upon which to base or test specific hypotheses via more quantitative/experimental methods. What do you think?
Jonathan
Question
How to run Hierarchical Linear model Using SPSS?
Dear statisticians,
I am analyzing a paper containing the following variables:
One dependent variable, and
Level-1: students’ characteristics, which consist of 4 independent variables
Level-2: teachers’ characteristics, which consist of 4 independent variables
Level-3: school characteristics, which consist of 2 independent variables
I would be grateful if you could help me run this analysis in SPSS.
Dear Getinet,
It is very easy: just enter the dependent variable in the Dependent box and the Level 1 variables in the Independent(s) list in the linear regression dialog and click the Next button, then add the Level 2 variables in the same way and click Next again, then add the Level 3 variables in the same way and click OK.
Question
Hi,
I am trying to find a way to express the importance/relevance of a cross-level interaction in a hierarchical linear model (so not in terms of significance). I have a level 2 moderator M (standard normal) influencing the relation between X and Y (both level 1).
Can I express the relevance in terms of the variability of the slope of the level 1 variable?
Let me give you an example: my cross-level interaction is .10. Let's say that for someone with an average value on M, the slope is .15, so for someone with a high score on M (+1), the slope is .25.
The variance of the random slope of the level 1 variable is .03 (hence the SD of the slope is sqrt(.03) = .17). Intuitively, I would say that the effect of the moderator M is quite strong, because it amounts to (.10/.17 =) .59SD of the random slope of X.
I don't think I have seen this done anywhere, but conceptually I think it makes sense. Any thoughts?
Best,
Dirk
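The arithmetic in the question can be checked quickly in R (the numbers are the ones Dirk gives):

```r
gamma_cross <- 0.10              # cross-level interaction coefficient
slope_var   <- 0.03              # variance of the random slope of X
slope_sd    <- sqrt(slope_var)   # about 0.173
gamma_cross / slope_sd           # about 0.58 (0.59 if the SD is first rounded to 0.17)
```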
Dirk
you may want to look at this current exchange
I think it has to be done by the substantive subject specialist (you?) by reflecting on how much of a difference there is in Y for different values of the Xs. Is it millimetres, metres, or kilometres, etc.?
Kelvyn
Question
Hi! I have a sample that could theoretically be split into two groups. In running a series of models, I am only able to achieve model fit when one of the two groups is filtered out (and the group that is filtered out never yields model fit when entered alone). Could that be indicative of some fundamental difference between the two groups? I'm looking for literature to further discuss the implications of this but can't seem to find any... any guidance is appreciated!
Hi Alyce. I would create a binary variable (1= group 1 and 0= group 2). Then run a logit or probit model using your binary group membership variable. Then use the post estimation predict command as a latent variable and save this new variable. Then run your original regression with the new latent variable in. Good luck, Marc
Question
Dear all,
I am looking for an option to estimate a cross-level interaction (between Level 1 and Level 2) in MLwiN with runmlwin from within Stata (using runmlwin, available via ssc install runmlwin)?
Any hints would be greatly appreciated!
Best
Perfect, thank you!
Question
I read an article published in a top-ranked journal in which the authors ran a null model before testing more complex models, in order to calculate the between- and within-group variance for job performance in their data set. I am trying to make sense of it but could not locate any helpful material.
I would appreciate it if someone could explain the purpose of running a null model in hierarchical linear modeling, its procedure, and how to interpret the output it generates.
Thank you.
One reason for doing this is to get some sense of what needs to be explained. The null or empty model contains just one fixed term (the mean) and then a variance at each level. So in an educational context you would have the overall pupil score in the typical school, the between-school variation, and the within-school, between-pupil variation. This allows you to calculate the intraclass correlation, which in this context gives the proportion of the total variation that lies at the school level, and also how similar pupils within a school are on the outcome. This variation is of course unexplained variation. You can then add variables to the fixed part at each level to see what happens to the unexplained variation. Adding in a pupil variable can reduce the school variance, but could also increase it.
This classic early paper gives a very clear example of this variance decomposition
And this site gives unrivalled resources
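The variance decomposition described above can be sketched in Python (the variance estimates are hypothetical, standing in for the output of a null model):

```python
# Hypothetical variance estimates from a null (intercept-only) two-level model
between_school_var = 0.18   # level-2 (between-school) variance
within_school_var = 0.62    # level-1 (within-school, between-pupil) variance

# Intraclass correlation: proportion of total variation at the school level
icc = between_school_var / (between_school_var + within_school_var)
```

Here the ICC is 0.225, i.e. 22.5% of the unexplained variation lies between schools.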
Question
Hello experts!
I am doing research on the moderating effect of emotional intelligence. A 3-step hierarchical analysis was used in this case; R-squared increased from .780 to .805 (.025 higher) when the interaction was added to the model. I have no idea how much change in R-squared is considered significant. If .025 is too low, is there any way to boost it?
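For reference, the significance of an increment in R-squared is usually assessed with an F test for the change, not a fixed threshold. A sketch of that test, using the R-squared values from the question but an assumed sample size and predictor count (neither is given in the question):

```python
# F test for the R-squared change when the interaction term is added.
# Assumptions (not from the question): n = 150 observations,
# full model with p = 4 predictors, 1 term (the interaction) added.
r2_reduced, r2_full = 0.780, 0.805
n, p_full, k_added = 150, 4, 1

f_change = ((r2_full - r2_reduced) / k_added) / (
    (1 - r2_full) / (n - p_full - 1)
)
```

The resulting F statistic is compared against an F distribution with (k_added, n - p_full - 1) degrees of freedom.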
Question
I am running a GLM on my dataset. As the dataset shows over-dispersion and has fewer data points than required, I need to use the QAICc value to choose the best model.
Question
Hi. I wonder whether the Mundlak (1978) procedure to control for level-2 endogeneity can be extended to a 3-level hierarchical model in a longitudinal setting. The case is as follows: time is the first level of the hierarchy, firms are at the second level, and regions are at the third level.
Question
Hi all,
I have a dataset from a camera trapping grid that has a low number of sites but spans a long period of time.
The cameras have been in the same sites since 2013, and there are 64 sites over 200 square miles. The number of sites is very low when compared to other camera trapping studies, but we have over 100,000 photos.  I am interested in whether this dataset could be used to compare species diversity over time, but most of the camera trapping literature I've read has hundreds of sampling locations.  The data has already been collected and it isn't possible to add more sites.
I'm looking for advice on the best statistical approaches to use for a dataset like this. Has anyone worked with a camera trapping dataset that has a very low site sample size and what analytical methods did you use?
Thank you,
For species distribution modelling, the number of camera sites is optimal. Evidently, the camera sites should be unbiased. The more critical question is whether your set of environmental covariates is comprehensive and of a matching spatial and attribute accuracy (WorldClim not always is).
Our paper on predicting post-fire tree diversity is attached for your inspiration. In the meantime we have seen deer impacting the post-fire development.
Question
I am working on comparing and developing pseudo R-squared statistics for logit models particularly as applied to multilevel (that is mixed or hierarchical) models. I am particularly keen on hearing about any recent developments say since 2010. The  most recent development that I have found is
Tjur, T. (2009) “Coefficients of determination in logistic regression models—A new proposal: The coefficient of discrimination.” The American Statistician 63: 366-372.
Is there anything subsequent to this or indeed anything recent that compares different measures. I am aware of these two earlier comparisons
Menard, S. (2000) “Coefficients of determination for multiple logistic regression analysis.” The American Statistician 54: 17-24.
Mittlbock, M. and M. Schemper (1996) “Explained variation in logistic regression.” Statistics in Medicine 15: 1987-1997.
Thank you
Kelvyn, this recently published article might be of interest to you:
Hemmert, G., Schons, L. M., Wieseke, J., & Schimmelpfennig, H. (2016). Log-likelihood-based Pseudo-R2 in Logistic Regression: Deriving Sample-sensitive Benchmarks. Sociological Methods Research, published online before print, March 18, 2016. DOI: 10.1177/0049124116638107
Abstract
The literature proposes numerous so-called pseudo-R2 measures for evaluating “goodness of fit” in regression models with categorical dependent variables. Unlike ordinary least square-R2, log-likelihood-based pseudo-R2s do not represent the proportion of explained variance but rather the improvement in model likelihood over a null model. The multitude of available pseudo-R2 measures and the absence of benchmarks often lead to confusing interpretations and unclear reporting. Drawing on a meta-analysis of 274 published logistic regression models as well as simulated data, this study investigates fundamental differences of distinct pseudo-R2 measures, focusing on their dependence on basic study design characteristics. Results indicate that almost all pseudo-R2s are influenced to some extent by sample size, number of predictor variables, and number of categories of the dependent variable and its distribution asymmetry. Hence, an interpretation by goodness-of-fit benchmark values must explicitly consider these characteristics. The authors derive a set of goodness-of-fit benchmark values with respect to ranges of sample size and distribution of observations for this measure. This study raises awareness of fundamental differences in characteristics of pseudo-R2s and the need for greater precision in reporting these measures.
Regards, Karin
Question
I applied an ANCOVA for my analysis and obtained beautiful results but the data were collected in three different batches, that is to say we have collected data from different people three different times in two years, around 75 people each time. I have learnt that multilevel linear models are for hierarchical structures, like different classrooms, schools etc. All our participants are enrolled in the same university. I am not sure if this applies to my situation. What do you recommend, carry on with ANCOVA or apply a multilevel linear model to get rid of the possible effects of different data collection periods?
I am not an expert in MLM, but if you would like to run MLM you need a multilevel structure. It is recommended to have at least 8 or even 10 objects (in this case: batches) at level 2, and at least 10-15 objects (here: participants) in every batch.
75 people per batch may be enough at level 1, but 3 batches at level 2 falls well short of the recommended 8, and the program will not run the analysis.
Even if you cluster people within time, you still do not have enough: 3 measurement occasions is still not 8. And what about statistical power? Sometimes even 1,000 people do not provide enough power to show statistical differences in some research.
I would recommend ANOVA :)
Question
I am studying a process of internal collaboration among salespeople belonging to 8 different business units. During the process, a sales rep “A” belonging to BU “1” makes a recommendation to a sales rep “B” who works in BU “2”. In our research we are interested in understanding the conditions under which the dyad (the connection between both salespeople) works. Drivers such as homophily, strength of tie, etc. will be tested at the dyad level.
Given that sales reps are assigned to a specific BU, of course there is an interesting problem of multilevel analysis, in which level 1 = sales rep characteristic and level 2: BU.
However, given that our research is focused on the dyad (the connection between two salespeople), I would like to know whether level 2 is in fact the product of the combination of two BUs (BU 1 and BU 2 in my example). If this is the case, how would you proceed? Would you, for example, build a variable that is the combination of both BUs?
Are the variables for which you want to test the interaction continuous or categorical?
If X1 and X2 are continuous, then you test the interaction between X1 and X2 by computing their product (literally a variable that is X1 times X2) and running the analysis with the two main effects and the interaction variable as predictors. (NOTE - you are only concerned with the significance of the interaction term; if it is significant you have evidence of an interaction and may want to split your sample by X1 and test the difference in the impact of X2.) (NOTE 2 - you can reduce the collinearity problem by centering both variables before you compute the interaction variable.)
If X1 and X2 are both categorical, then building variables that are the combinations of your groups is how you test the interaction. For example, if X1 and X2 are dichotomous, then you make three new dichotomous variables - X1-not-X2, X2-not-X1, and X1-and-X2 - and test the significance of each group against your comparison group of neither X1 nor X2. You will have one fewer dichotomous measure than there are total groups (so, 2 groups by 2 groups = 3 measures; 4 groups by 3 groups = 11 measures, etc.).
If one is categorical and one is continuous, I usually just split the sample on the categorical measure and compare the coefficient for the continuous one between the different samples.  You can compute a t value to compare the size of the coefficients but you will need to run the analysis also with the full sample - the formula for the t value would be the difference of coefficients between your groups over the standard error of the coefficient from the full sample analysis.
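The centering-then-multiplying step for two continuous predictors can be sketched as follows (the data are hypothetical):

```python
# Sketch: center two continuous predictors before forming their product term
x1 = [2.0, 4.0, 6.0, 8.0]
x2 = [1.0, 3.0, 5.0, 7.0]

m1 = sum(x1) / len(x1)
m2 = sum(x2) / len(x2)
x1c = [v - m1 for v in x1]   # centered X1 (mean 0)
x2c = [v - m2 for v in x2]   # centered X2 (mean 0)

# Product of the centered predictors: the interaction variable
interaction = [a * b for a, b in zip(x1c, x2c)]
```

The model is then fitted with x1c, x2c, and interaction as predictors; centering first reduces the correlation between the product term and the main effects.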
Question
Hello everyone,
I am trying to interpret an interaction term in HLM. How is it possible? My predictors are both on level 1.
Thanks for your help!
Just to say that we have implemented Irvin's suggestion in our multilevel software, which has a number of tools for post-estimation interpretation of results. It works by simulation, so it can make interaction plots, for example for population-average and subject-specific estimates for discrete outcomes, and put confidence intervals around the plots. You can even change the reference groups for categorical predictors without re-estimating the model.
I would argue that the graph is the most meaningful way of interpreting such results, as the effect and associated confidence intervals change with different values of the predictor variables, and it is difficult to appreciate this from a table of estimates.
Question
Have you worked with MECanalyst for laddering and creating Hierarchical Value Maps? Can you recommend the software?
No, I'm still not using any application, but I am researching two: LadderMap and MECanalyst.
Question
When analyzing a multilevel model, which option should be used for including predictor variables: uncentered, group-mean centered, or grand-mean centered?
Please also explain the implications of these three options.
Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12(2), 121-138.
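The two centering options can be sketched as follows (the data and group structure are hypothetical):

```python
# Sketch: grand-mean centering (CGM) vs. group-mean centering (CWC)
x = [4, 6, 5, 9, 8, 10]          # level-1 predictor
group = [1, 1, 1, 2, 2, 2]       # level-2 group membership

# Grand-mean centering: subtract the overall mean from every score
grand_mean = sum(x) / len(x)
grand_centered = [v - grand_mean for v in x]

# Group-mean centering: subtract each unit's own group mean
group_means = {g: sum(v for v, gg in zip(x, group) if gg == g) / group.count(g)
               for g in set(group)}
group_centered = [v - group_means[g] for v, g in zip(x, group)]
```

Group-mean centering removes all between-group variation from the predictor (each group's centered scores average to zero), whereas grand-mean centering preserves it; which to choose depends on whether within-group or between-group effects are of interest, as Enders & Tofighi discuss.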
Question
My Question is
I have 3 variables (for example A, B and C)
A is at level 2, where as B and C are at level one (lowest level)
I want test 4 models
1.  A------> C (Cross Level)
2. A-------->B (Cross Level)
3. B---------->C (at level one just)
4. A & B -------------> C (cross level, where one predictor is at level 2 (A) and the other is at level 1 (B))
Can all of these relations be tested in the HLM software, including no. 3, where both the predictor and outcome variables are at the same level? Or should that one be done using simple linear regression?
Thanks, Tae-Yeol Kim.
Kindly also guide me on how to report and interpret the outcome file produced by the HLM software.
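As a sketch, model 4 above can be written in standard two-level HLM notation (the gamma coefficients are hypothetical names; subscript i indexes level-1 units, j indexes level-2 groups):

```latex
% Level 1 (within groups): B predicts C
C_{ij} = \beta_{0j} + \beta_{1j} B_{ij} + r_{ij}

% Level 2 (between groups): A predicts the level-1 intercept
\beta_{0j} = \gamma_{00} + \gamma_{01} A_j + u_{0j}

% Level-1 slope, random but with no level-2 predictor
\beta_{1j} = \gamma_{10} + u_{1j}
```

Model 3 is the same specification without the gamma_01 A_j term, so it can still be estimated in HLM with a random intercept, which is generally preferable to single-level regression when observations are nested.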
Question
When doing multilevel modeling analysis in the HLM software, how does one interpret the output files, especially where values analogous to adjusted R-squared and the F value (as in linear regression) are shown? Please guide.
Thanks Daniel for the response.
Question
Hi every one.,
I'm working on multi-class data, and I want to predict the test data. In this regard, I have a problem with introducing the response matrix to the classifier. The most common methods are designed for two classes (0, 1), but for multi-class data there are lots of limitations...
Can anyone help me?
Many Thanks
Reihaneh
Hi Mr. Shahbazy,
As previous researchers said, it is a rather easy task. First of all you can use Linear Discriminant Analysis (LDA) approach and only define your class vector in following order:
Y = [1;1;1;1;1;1;1;1;2;2;2;2;2;2;2;3;3;3;3;3;3;3;4;4;4;4;4;4;4], with respect to the row order of the objects in X space.
Then, using the following command it would be possible to run multiclass classification approach in MATLAB.
[Yhat,e]=classify(X,X,Y);
Yhat is the predicted class vector and e is the classification error.
For those classes that are not separated linearly it would be possible to implement quadratic discriminant analysis (QDA) using the following command:
[Yhat,e] = classify(X,X,Y,'quadratic');
Finally, as Dr. Vasighi said, using one against all and one against one approaches and the majority voting strategy works better than the multi-class classification procedure.
Good Luck ;)
Question
Hi,
I am testing a 2-level model using HLM 7 and would like to test 3 variables which mediate a direct cross level relationship. Can anyone suggest references on how to test and interpret mediated hypotheses?
Thank you for your question, Amir.
We recommend an HLM with 2 or 3 levels, because you are only using three variables.
HLM is an appropriate analytical technique for data with a nested or hierarchical structure, in which individual observations are nested within groups. Analyzing hierarchical data with HLM yields information on variability at the group level that cannot be obtained from a simple linear regression model. An example implementation of HLM in education concerns the influence of intelligence and socioeconomic status on students' language test scores, where students are nested within schools. Such a study builds several 2-level HLM models and determines the best one. A 2-level HLM consists of two submodels, a level-1 model and a level-2 model; combining the two submodels gives a model with both fixed effects and random effects. The models built are a random-intercept model at level 1, a random-intercept model at level 2, and a random-coefficient model. The data are hierarchical, with level-1 units nested in level-2 groups; in this case, students grouped within schools. The data consist of a response variable (the test results) and two level-1 predictor variables (students' IQ and socioeconomic status); the level-2 predictor variable is the average socioeconomic status of each school.
Question
Interpreting a random coefficients model (one predictor at level 1 but none at level 2) with a dichotomous outcome variable is simple enough, but I can't grasp how to interpret a model that includes a predictor in level 2 as well. I understand that my level 1 predictor now can't be interpreted without considering the level 2 predictor, but I don't understand how to talk about odds ratios and probabilities when this is the case. Help would be appreciated!
MLwiN has a tool (customized predictions) to do this for logits, odds, and probabilities; see the manual supplement. Even if you do not use the software, it should give you plenty of insight.
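For the interpretation itself, converting the estimates to odds ratios and predicted probabilities can be sketched as follows (all coefficient values are hypothetical, standing in for a two-level logistic model with a level-1 predictor x, a level-2 predictor w, and their cross-level term):

```python
import math

# Hypothetical estimates: logit(p_ij) = b0 + b1*x_ij + b2*w_j + b3*x_ij*w_j
b0, b1, b2, b3 = -1.0, 0.40, 0.30, 0.20

def log_odds(x, w):
    return b0 + b1 * x + b2 * w + b3 * x * w

# The odds ratio for a one-unit increase in x now depends on the level-2 value w
or_x_at_w0 = math.exp(b1 + b3 * 0)   # at w = 0
or_x_at_w1 = math.exp(b1 + b3 * 1)   # at w = 1

def prob(x, w):
    # Predicted probability at chosen values of the predictors
    return 1.0 / (1.0 + math.exp(-log_odds(x, w)))
```

So rather than reporting a single odds ratio for x, one reports it at a few substantively meaningful values of w, and probabilities at chosen covariate combinations.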
Question
It should be easy to read for students, and include discussion of software. This will be used primarily with social sciences (mostly psychology and education) students.
This is a large and free online resource, and you can follow the course in MLwiN (a version is free to download), in Stata, and for some parts in SPSS.
Here are the modules - we have tried to write them in two parts: concepts and then practice (using particular software).
Using quantitative data in research (watch video introduction)
Introduction to quantitative data analysis (watch video introduction)
Multiple regression
Multilevel structures and classifications (watch video introduction)
Introduction to multilevel modelling - This one should meet your needs
Regression models for binary responses
Multilevel models for binary responses
Multilevel modelling in practice: Research questions, data preparation and analysis
Single-level and multilevel models for ordinal responses
Single-level and multilevel models for nominal responses
Three-level multilevel models
Cross-classified multilevel models
Multiple membership multilevel models
Missing Data
Multilevel Modelling of Repeated Measures Data
and it is part of a wider resource for multilevel modelling
Question
Data from how many respondents are required at the lowest level of analysis for a multilevel analysis? Is there any rule?
That is, if there is 1 respondent at the team level of analysis, and 1 respondent at the individual level of analysis, may we consider this for multilevel analysis in HLM and in SPSS?
That was my point about the target of inference - no one would wish to infer to each and every specific twin sibling pair: it is likely to be of little interest per se, and it would be very uncertain as it is based on two observations. But a genuine target of inference is the within- and between-pair variances across all pairs of twins, and these will be based on all the pairs that you have, so they can be reliably estimated.
This paper powerfully sets out the importance of the multilevel model in such circumstances
Guang Guo and Jianmin Wang (2002) The Mixed or Multilevel Model for Behavior Genetic Analysis, Behavior Genetics, Vol. 32, No. 1,  37
Furthermore, in relation to shrinkage: the variance at level 2 summarizes the differences between level-2 entities, but this is not the variance of the shrunken residuals; it is the variance of the raw mean residuals. The level-2 variance is not the estimated between-group variance of the sample, but the estimated between-group variance in the population.
Typically, with maximum likelihood estimation, we estimate the between and within variances, and having estimated these values we MAY go on to estimate the shrunken residuals at level 2 - but we may choose not to do this. Indeed, the SPSS implementation does not have this facility, as it is mainly geared to longitudinal analysis, where the level-2 entity - the individual - is usually of little interest per se. And this brings us back to my original answer in terms of the target of inference.
Question
For example, SEX or SES (socioeconomic status) at the pupil level (level 1), in comparison to RGIRLS (girls ratio) or MEANSES at the school level (level 2). Some authors use the term contextual variable, others compositional variable. Some even use both terms for the same variable. Is there a difference between contextual and compositional variables or not?
There is a lot of confusion in the literature, and different disciplines have different practices - I will probably add to it!
I like to talk about micro variables at the lower level and macro variables at the higher level. So in a pupils-within-schools study, the gender of the pupil would be a micro variable and the sex of the school (boys only; girls only; mixed) would be a macro variable. I often distinguish whether a macro variable is aggregate - a summary of lower-level units, e.g. the mean score of the school - or global, that is, only measured at the school level and not a summary attribute of the pupils, e.g. an inner-city school. I use the terms contextual and ecological as alternatives for macro variables when that seems appropriate for the nature of the things I am modelling (but see contextual effects below).
In a panel or longitudinal study, where occasions are nested within individuals, the micro variable is time-varying while the macro variable (e.g. gender of the individual) is time-invariant.
I find this glossary helpful
Ana V. Diez Roux, A glossary for multilevel analysis
In terms of contextual effects I find this diagram by Steve Raudenbush to be insightful
• Within-effect (βw): the expected difference in the response for two pupils that belong to the same group (school) but differ by one on the individual predictor (pupil prior attainment).
• Between-effect (βb): the expected difference in the mean response of two schools which differ by one in the mean of the group variable (school mean prior attainment).
• Contextual effect (βc): the expected difference in the response for two level-1 pupils who have the same individual value on the pupil-level predictor but who belong to schools that differ by one in their group mean; this is the potential differential effect on the response from belonging to groups or contexts with different means.
I failed to paste in the diagram and equation, so I have attached them as PowerPoint slides; they show how to estimate contextual effects in two (related) ways.
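As a sketch, the two related formulations can be written as follows, using the effect names defined above (subscript i indexes pupils, j schools; u_j and e_ij are the school- and pupil-level residuals):

```latex
% Group-mean-centred formulation: estimates the within- and between-effects directly
y_{ij} = \beta_0 + \beta_w (x_{ij} - \bar{x}_j) + \beta_b \bar{x}_j + u_j + e_{ij}

% Raw-score formulation: the coefficient on the group mean is the contextual effect
y_{ij} = \beta_0 + \beta_w x_{ij} + \beta_c \bar{x}_j + u_j + e_{ij},
\qquad \beta_c = \beta_b - \beta_w
```

The two models are re-parameterizations of each other, so either can be used to recover all three effects.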
Updated 24th July 2016
Some of this material has now  been worked up into
Question
I have used stratified sampling, in which I have drawn strata based on 'age', a categorical variable. I am confused about whether to use 'age' as a control variable or as an independent variable in hierarchical regression. I am in desperate need of the answer. Technical assistance is highly appreciated. Thanks in advance.
Thank you very much for the kind responses. Paul E. Spector, unfortunately I cannot get access to this paper; could you send it?
Question
How can this methodology be applied in a land-use optimization process?
Here is a nice report in this area, in particular on the necessity to clearly specify what is needed or not - although it appears that some of the math is missing:
Question
Hi,
I have daily level data nested within individuals from a diary study. Let's say I only have hypotheses on between level effects: in this case, is it then always wrong to aggregate the data over the days and use standard OLS type analyses (e.g. correlations)?
I know there are some concerns with aggregation (e.g., it assumes that within-person variance is zero, and it gives people with fewer observations relatively more weight). The latter is less of a problem in my data, since people reported about equal numbers of observations, and quite a lot of them.
But given that I only have expectations on between person effects, is aggregation such a bad thing?
Dear Dirk,
Models are supposed to serve your interests, not the other way around. So don't feel compelled to use a multilevel model. Psychologists, especially, often take repeated measures and derive an average; they then analyse differences between averages. This is particularly suitable if you are actually interested in average performance and have no curiosity about the actual distributions involved.
Two things to reflect on, though. One is that your between-person effects can often be modelled more precisely and accurately using a multilevel model than using an aggregated approach. The second is that the size of the difference between the two approaches can be easily verified if you use an MLM or HLM first... so assuming that the aggregation effects are irrelevant seems a little silly once you become accustomed to MLM/HLMs. The literature on aggregation bias and the ecological fallacy gives reasonable pointers on what sorts of mistakes in interpretation are likely to be made by ignoring these points; however, as I wrote, in some fields aggregation is still the standard approach.
With respect to what you wrote, I am not so sure that these points should dominate your decision. Aggregation doesn't assume that within-level/person variance is zero; it is done precisely because people have decided that the variance is large enough to matter but think it is irrelevant to what they wish to test. There is no compelling reason to worry about the weights given to participants as a function of the number of measurements if the pattern of missingness is not informative (MCAR) and you are uninterested in that level of variation. But participant averages can be weighted by their number of constituent observations - and Bayesians would likely insist on this - without much effort on your part.
So, to summarize, it is not always wrong and not always a problem; in your field aggregation may even make more sense (I can picture how sources of variation like temperature, feed humidity, and sunlight might be less interesting than the average milk output). The things to watch for are biases in interpretation and a rather unnecessary loss of information.
Hope this helps
Question
I am developing a screening tool to identify mental disorders in forced migrant populations. So far we have used CART, ROC, and sensitivity/specificity analyses to get a reasonable AUC (.801) with a cut-off score of 2 items, but we really need to increase it (i.e., specificity is too low, about 65%).
I am therefore after input from anyone proficient in Item Response Models (e.g. Rasch or other hierarchical models) that will help me to maximise the predictive accuracy of our screening tool.
Many thanks.
Debbie
Hi Debbie. Using Rasch analysis will ensure your scale is at its optimum, e.g. by removing misfitting items and allowing you to examine whether the data fit the Rasch model and whether your scoring categories are working as intended. You could then rerun the ROC to see if a Rasch-transformed scale has improved specificity. I am an occupational therapist who has used Rasch extensively; I would be happy to help, or I can suggest people near you if that would be helpful.
BW
Anita
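As background to the sensitivity/specificity figures discussed in this thread, the computation at a given cut-off can be sketched as follows (the scores and diagnoses are hypothetical):

```python
# Sketch: sensitivity and specificity of a screening cut-off
scores = [0, 1, 2, 3, 4, 1, 2, 0, 3, 1]        # screening-tool item counts
has_disorder = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]  # reference diagnosis
cutoff = 2                                      # screen positive if score >= 2

tp = sum(s >= cutoff and d for s, d in zip(scores, has_disorder))
fn = sum(s < cutoff and d for s, d in zip(scores, has_disorder))
tn = sum(s < cutoff and (not d) for s, d in zip(scores, has_disorder))
fp = sum(s >= cutoff and (not d) for s, d in zip(scores, has_disorder))

sensitivity = tp / (tp + fn)   # proportion of true cases screened positive
specificity = tn / (tn + fp)   # proportion of non-cases screened negative
```

Repeating this over all possible cut-offs traces out the ROC curve, whose area is the AUC quoted in the question.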
Question
I am working on hierarchical modulation.
Dear R.K. Mugelan,
In the following research article, whose Web address is given below, the authors employ hierarchical modulation using Karnaugh map style Gray coding. I hope, it is helpful for you.
"Hierarchical Modulation for Upgrading Digital Broadcast Systems," H. Jiang and P. Wilford, IEEE Transactions on Broadcasting, vol. 51, no. 2, pp. 223-229, 2005.
Best wishes,
Question
I have a hierarchical Bayes probit model, but I'm seeking answers in general for any random-coefficients model. Depending on the level at which I calculate DIC (summing the DICs per unit vs. the DIC for the average respondent), I get different answers when comparing the model to a regular probit model. The by-unit sum is worse than the regular probit; the other way is better. The hit rate for the HB model is superior to the regular probit, so I don't know why the sum of the by-unit DICs is worse than the regular probit.
Kalinda,
I misunderstood your calculations earlier. If you computed pd(i) for each outcome i and then summed them together, you obtained the correct value for pd.
But there's an important issue here: pd is not a measure for each outcome, but a property of the model. It's a measure of effective number of parameters, taking into account the partial pooling of coefficient estimates in multilevel models.
Using a notation similar to R functions:
pd = meanS(sumN(-2 * log p(yi | parss))) - sumN(-2 * log p(yi | pars^))
Where:
• parss is a vector of simulated values of pars (i.e., in a MCMC chain);
• sumN(-2 * log p(y | pars)) is the model deviance, computed for each simulated parameter value and then averaged (first term on the right side), and computed at the average parameter value (second term on the right side).
The problem might be the choice of the likelihood function, p(y|pars). What you choose as pars in the likelihood function changes the "focus" of the model. Spiegelhalter has some nice slides with examples of model focus in the attached link.
Simply put, the model focus is the level of the model for which you want to predict new cases. In your case, for the first level (responses within individuals?):
p(yi|ai, bi, se) = dnorm(yi, mean=ai+bi*xi, sd=se)
Where each parameter is either the posterior expected value (par^) or a simulated posterior value from the MCMC chain (pars). In case you want to focus on the second level (new individuals?), you must integrate over the first level parameters that depend on the second level. This can be achieved with:
p(yi|ai, g0, g1, se, sb) = dnorm(yi, mean=ai+(g0+g1*zi)*xi, sd=sqrt(se^2 + sb^2))
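The pd computation described above can be sketched for a simple normal likelihood (the data and the posterior draws are hypothetical, standing in for an MCMC chain):

```python
import math

def deviance(y, mu, sd):
    # -2 * log-likelihood of the data under a Normal(mu, sd) model
    return -2 * sum(-0.5 * math.log(2 * math.pi * sd ** 2)
                    - (yi - mu) ** 2 / (2 * sd ** 2) for yi in y)

y = [0.1, -0.3, 0.4, 0.2]                              # observed data
draws = [(0.0, 0.5), (0.1, 0.45), (0.15, 0.55), (0.05, 0.5)]  # (mu, sd) draws

# First term: deviance averaged over the posterior draws
mean_deviance = sum(deviance(y, m, s) for m, s in draws) / len(draws)

# Second term: deviance at the posterior mean of the parameters
mu_hat = sum(m for m, _ in draws) / len(draws)
sd_hat = sum(s for _, s in draws) / len(draws)

p_d = mean_deviance - deviance(y, mu_hat, sd_hat)  # effective no. of parameters
dic = mean_deviance + p_d
```

Changing which parameters appear in deviance() (unit-level coefficients vs. the population-level hyperparameters, integrating out the rest) is exactly the change of "focus" that makes the by-unit and average-respondent DICs disagree.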
Question
Do you know any analysis methods to analyze mediation in mixed effects logistic regression? In my design, I have a continuous and a binary fixed effect, and a binary outcome variable.
Bauer, Preacher & Gil (2006) provide a method for analyzing mediation in multilevel data. Can we use this method for logistic multilevel regression as well?
Hello Berna,
have you found a good reference for this question yet? I am searching for the same!
This paper by Finley et al http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3076684/ was helpful for me; it reports a mixed modelling mediation analyses for a dichotomous dependent variable and names relevant references (see results and discussion Experiment 1). However, our statistician is sceptical about the calculation of the standard errors, so I am still looking for other examples.
Best, Gesa
Question
I want to test a multilevel path model (e.g., A predicts B, B predicts C, C predicts D) where all of my variables are individual observations nested within groups. So far I've been doing this through multiple unique multilevel analysis in R. I would prefer to use a technique like SEM that lets me test multiple paths at the same time (A -> B -> C -> D) and still properly handle the 2-levels (individuals in groups). I understand that MPLUS can handle this. Is there an R package I can use?
Question
I'm running SPSS v. 22 and have a data set on which I want to perform hierarchical linear modeling. I have not been able to find any method to do so, and I'm wondering whether I am missing something.
First some comments on terminology. Multilevel models, hierarchical (linear) models and mixed models are the same thing. The term mixed models is used (particularly in biosciences when modeling over time) because of the two aspects of the model - fixed effects (that is averages) and random terms (that is variance- covariances). Personally I use multilevel and not HLM because it  is possible to analyze data  that is not strictly hierarchical such as when pupils are nested in neighborhoods and pupils are nested in schools but schools and neighborhoods are crossed. I believe that the SPSS algorithm  is able to handle such complex structures.
SPSS does have a mixed-model facility, but it depends on what package you or your institution have bought and what has been installed on your machine. SPSS was renamed PASW when taken over by IBM, and it has now reverted to its old name.
I have two problems with mixed models in SPSS:
1. I do not like the interface, as it presumes that you are doing a repeated-measures longitudinal analysis and everything is set within that framework, but I am sure you could get used to it.
2. (Related to 1) While SPSS will estimate the higher-level variances (e.g. the between-school variance), it does not estimate the specific random effects, e.g. the school effects; in the work that I do, these are of substantive interest. I suspect that this is due to the repeated-measures framing, as people would not be interested in the level-2 effects, which in this case would be individual people.
Chris Charlton has converted some of our online Lemma training materials into SPSS. Two modules have been completed:
• Module 3, on multiple regression, which can be used as a precursor to the multilevel part of the course
• Module 5: Introduction to Multilevel Modelling.
Importantly, the latter shows you how to use SPSS syntax to calculate higher-level residuals. The material separates out concepts from implementation.

The course already has material for MLwiN, Stata and R.
If you want to understand the different forms of data structure, see Module 4.
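A footnote on problem 2 above: once the variance components have been estimated, the shrunken (empirical-Bayes) higher-level residuals that Module 5 covers can be computed by hand for a simple variance-components model. Here is a minimal pure-Python sketch, with made-up toy data and assumed (not estimated) variance components:

```python
# Empirical-Bayes (shrunken) level-2 residuals for a variance-components
# model: u_hat_j = R_j * (group mean - grand mean), where the shrinkage
# factor R_j = sigma2_u / (sigma2_u + sigma2_e / n_j) pulls the residuals
# of small groups towards the overall mean.

def eb_residuals(groups, sigma2_u, sigma2_e):
    """groups: dict mapping group id -> list of responses."""
    all_y = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_y) / len(all_y)
    out = {}
    for j, ys in groups.items():
        n_j = len(ys)
        raw = sum(ys) / n_j - grand_mean                  # raw group residual
        shrink = sigma2_u / (sigma2_u + sigma2_e / n_j)   # in (0, 1)
        out[j] = shrink * raw                             # shrunken residual
    return out

# Toy data: two "schools" with assumed variance components.
schools = {"A": [10.0, 12.0, 11.0], "B": [14.0, 15.0]}
u_hat = eb_residuals(schools, sigma2_u=1.0, sigma2_e=2.0)
```

The group names, data and variances here are invented for illustration; in practice the variance components come from the fitted model.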
Question
In 2003, in U.S. Patent Application 20030009099, Lett et al. disclosed a component-based design pattern that facilitates hierarchical model building and model sharing. In 2009, the Randhawa, Shaffer & Tyson article in Bioinformatics used a very similar approach. While the authors acknowledged that some commercial applications implemented hierarchical composition of model components, they did not know of the public disclosure in application 20030009099. I'm curious whether the authors of Smith, Hucka, Hoops et al. might comment on similarities and differences between the approaches?
Randhawa R, Shaffer CA, Tyson JJ. Model aggregation: a building-block approach to creating large macromolecular regulatory networks. Bioinformatics 2009;25(24):3289-3295. doi:10.1093/bioinformatics/btp581.
Thank you, Tarik.
I am a big fan of the COMBINE effort, and proud to have collaborated with many pioneers in this field for over a decade. It is gratifying to see so much momentum. While I am not active in this area now, I retain an intellectual interest, hope to keep up with what's new, and am selfishly curious whether any of our contributions were of lasting value. Wishing continued success,
Scott
Question
Hi all, I'm trying to fit a couple of community occupancy models and am noticing that r-hat values for certain parameters are well above 1.1 (in the 1.3-1.6 range), even with fairly long runs (200,000 iterations with 100,000 burn-in). The model structure includes Bernoulli inclusion parameters associated with all beta parameters. At any rate, the problematic r-hat values are consistently associated with the precision hyperparameters for the betas. Covariates with essentially no model support seem to converge slightly better (or at least have lower r-hats for the precision hyperparameter) than covariates with some support. Strangely, these r-hat values also seem to stabilize pretty quickly (the difference between 15,000 iterations and 50,000 iterations is minimal). I'm curious to hear if anybody else has encountered this. My kind-of-desperate explanations are:
1. The model just needs to be run out longer.
2. I should increase the thinning rate (the trace plots indicate that the tau chains will occasionally make pretty massive leaps that might be inconsequential if culled by thinning).
3. This is related to the Bernoulli inclusion parameters turning the betas on and off, and r-hat wouldn't be expected to approach convergent values unless the inclusion probability were very close to 0 or 1.
4. This is related to tau hyper-parameters themselves, which might not be expected to be distributed roughly normally.
5. The taus are derived nodes based upon stochastic sigma hyperparameters, and I should be tracing the latter instead.
Sorry for the book.  Thanks for any help!
Hi John:
- Did you use covariates? Did you scale them?
- Consider changing the parameterization. Sometimes you can improve the convergence.
- What's the magnitude of p? As you know, with very low p values, the results would be inconsistent.
Jose
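On the r-hat values themselves, it may help to see what the diagnostic actually computes. Below is a minimal pure-Python sketch of the basic Gelman-Rubin statistic (not the split-chain variant modern samplers report), with toy chains rather than real posterior draws:

```python
# Basic Gelman-Rubin potential scale reduction factor (r-hat) computed
# from the between- and within-chain variances; values well above ~1.1
# are the usual red flag for non-convergence.
from statistics import mean, variance

def rhat(chains):
    """chains: list of equal-length lists of posterior draws."""
    n = len(chains[0])                        # draws per chain
    chain_means = [mean(c) for c in chains]
    W = mean(variance(c) for c in chains)     # within-chain variance
    B = n * variance(chain_means)             # between-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return (var_hat / W) ** 0.5

# Two chains stuck in different regions vs. two chains mixing over the
# same region (toy numbers, not from a real sampler):
bad = rhat([[1.0, 2.0, 3.0, 4.0], [101.0, 102.0, 103.0, 104.0]])
ok = rhat([[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]])
```

Explanation 3 in the question follows directly from this computation: if an inclusion indicator keeps switching a beta between its prior and posterior regimes, the chains behave like the "bad" case above and r-hat has no reason to approach 1 regardless of run length.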
Question
Research at the individual level very frequently employs self-reports from interviewees (leaders, managers, employees, patients, teachers, students). Self-reports suffer from biases on the part of interviewees (halo effect, social desirability), as interviewees seek to paint a favourable impression of themselves. One method used to correct this bias is to engage with reports from others, whether at the one-up, peer or one-down level. Other-reports, however, also have their own challenges; e.g. supervisors may not be fully aware or cognisant of the full range of employees' behaviours, some of which may not be evident to them. Supervisors may also not be objective and may bring negative subjective emotions into ratings. Has there been any work done on methods to combine other-reports with self-reports as a means of arriving at a fairer, more balanced assessment?
If the self-report and other-report measures represent the same concept, you might estimate a common latent trait score using SEM or CFA. We did this with two measures (self-report and other-report) of treatment motivation when predicting patients' behavioural engagement during treatment. Although each of the two treatment motivation measures predicted behavioural engagement only moderately, and the two motivation measures were only moderately intercorrelated, the latent (error-free) common motivation factor emerged as a very strong predictor of behavioural engagement. Put differently, although the self-report and other-report measures had only limited common variance, this common variance was a much more valid predictor of an external criterion than either measure alone.
You can find this in http://repository.ubn.ru.nl/bitstream/handle/2066/45156/45156_meastrmoa.pdf?sequence=1 and in Drieschner & Boomsma (2008) 'The Treatment Engagement Rating scale ...' .
An explanation for this phenomenon might be in line with your own analysis. Self-reports and other-reports have their limitations and thus generate partly invalid score variance. However, because they have different limitations, the variance they have in common represents the intended concept much more validly.
Question
Hi, I'm Betzy, a mathematics student at the University of Indonesia. I'm working on my undergraduate thesis right now. My topic is the jackknife ridge estimator proposed by Singh, Chaubey, and Dwivedi (1986), but something confuses me: the formula for the weighted jackknife (pseudo-value). It is a modification of the ordinary jackknife (pseudo-value) formula; because the ordinary formula neglects unbalancedness in the regression data, it is modified by adding weights. I don't know how they chose the weights and wonder whether they are related to the influence function, but it's hard for me to understand as an undergraduate student. I have attached some files. I hope you can understand what I mean. Thank you
Best
Betzy
Hi Betzy, I noticed your request of one of my articles. If you mail me on my (ordinary) e-mail address I would be happy to send you my article.
Best, Hans N
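For the ordinary (unweighted) pseudo-values that the weighted formula modifies, a small worked example may help. Here is a minimal pure-Python sketch for the sample mean, a case where the algebra collapses nicely: p_i = n*theta_hat - (n-1)*theta_hat_(-i) reduces to x_i itself.

```python
# Ordinary jackknife pseudo-values: p_i = n*theta - (n-1)*theta_(-i),
# where theta_(-i) is the estimate recomputed with observation i left
# out.  For the sample mean, algebra gives p_i = x_i exactly, which
# makes a handy sanity check.

def pseudo_values(x, estimator):
    n = len(x)
    theta = estimator(x)
    out = []
    for i in range(n):
        loo = estimator(x[:i] + x[i + 1:])       # leave-one-out estimate
        out.append(n * theta - (n - 1) * loo)    # pseudo-value
    return out

mean = lambda v: sum(v) / len(v)
pv = pseudo_values([2.0, 4.0, 6.0, 8.0], mean)   # equals the data itself
```

The jackknife estimate is then the average of the pseudo-values. The weighted variant you describe replaces the equal weights implicit here with weights reflecting each observation's leverage/influence in the regression, which is consistent with your suspicion about the influence function.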
Question
I am analyzing results from a daily diary study in R using the nlme package. I am using hierarchical linear modelling and am not sure which code to use to examine whether a later model fits better than the previous one.
I tried the code anova(model1, model2), but it is giving me the following error message: Error in anova.lme(Model1.SigControls.IIP, Null.Model.IIP) :
all fitted objects must use the same number of observations
In addition: Warning message:
In anova.lme(Model1.SigControls.IIP, Null.Model.IIP) :
fitted objects with different fixed effects. REML comparisons are not meaningful.
As Kelvin noted, you need to compare models that differ only in the fixed part using ML estimation rather than REML. One solution is to refit the models with lme4; its anova command refits REML models by maximum likelihood before comparing them, which avoids the problem of using REML fits to compare fixed effects.
The other error message refers to having different n (presumably because of missingness on one or more predictors). This is a common issue with the use of any likelihood-based statistic. The simple solution is to run the models on exactly the same data (e.g. through some kind of imputation approach or, less satisfactorily, by omitting cases).
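To see why comparable fits need identical observations and ML estimation, here is a minimal pure-Python sketch of a likelihood-ratio comparison of two nested Gaussian models fit to the same rows (the data below are invented for illustration, not from the diary study):

```python
# For OLS fit by ML (error variance estimated as RSS/n), the likelihood-
# ratio statistic for nested models on the SAME n rows reduces to
# n * log(RSS_reduced / RSS_full), referred to a chi-square with df equal
# to the difference in the number of parameters.
from math import log

def ols_rss(x, y, with_slope):
    n = len(y)
    if not with_slope:                           # intercept-only model
        yb = sum(y) / n
        return sum((yi - yb) ** 2 for yi in y)
    xb, yb = sum(x) / n, sum(y) / n
    b = (sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
         / sum((xi - xb) ** 2 for xi in x))      # least-squares slope
    a = yb - b * xb                              # least-squares intercept
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
lr = len(y) * log(ols_rss(x, y, False) / ols_rss(x, y, True))
# df = 1 here; lr > 3.84 would reject the reduced model at the 5% level.
```

If the two models were fit to different subsets of rows (as happens when a predictor has missing values), the two RSS terms would no longer refer to the same likelihood and the comparison would be meaningless, which is exactly what the nlme error guards against.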
Question
I want to estimate genetic parameters for the mean and residual variance of traits by applying a double hierarchical generalized linear model.
What do you mean by double hierarchical? Do you have two levels of grouping, like schools -> classes? Or is it a cross-classified situation, for example schools + families?
Question
In a random-slope model, the addition of level-2 (L2) covariates implies a cross-level interaction between the random level-1 variable and the L2 covariate. In Stata (and maybe R or SAS), despite its implicit nature, the cross-level interaction needs to be entered explicitly into the model syntax by creating the interaction term(s).
Is a model considered mis-specified if one does not include these cross-level interaction terms? For example, if the fixed effect of an L2 covariate is significant (and theoretically important) but its cross-level interaction is not significant, must the cross-level interaction remain specified in the model?
No. It is perfectly ok to not have a cross level interaction but still specify a random slope.
Consider the following random-intercept, random-slope model, where x is a level-1 predictor and z is a level-2 covariate:
y_ij = a_j + b_j*x_ij + e_ij
a_j = g01 + g02*z_j + u_j
b_j = g11 + g12*z_j + w_j
The g02 coefficient captures the direct effect of z, and g12 is the coefficient for the cross-level interaction. If you do not include the interaction term, then your random slope is estimated from the error term alone:
b_j = g11 + w_j
This does not affect the a_j equation, which specifies the direct effect of z, in any way. You can also remove the random slope (b_j) altogether (e.g. if it does not improve the model) and still test for the direct effect of z (as long as you keep the random intercept).
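For concreteness, the data-generating process behind these equations can be sketched in a few lines of pure Python; every parameter value below is an arbitrary illustration, not a recommendation:

```python
# Data-generating sketch for the random-intercept / random-slope model
# above: intercepts a_j and slopes b_j vary across groups.  Setting the
# cross-level coefficient g12 = 0 simply routes all slope variation
# through the group-level error w_j; the direct effect g02 of z on the
# intercept equation is untouched.
import random

def simulate(n_groups=24, n_per_group=50, g01=1.0, g02=0.5,
             g11=0.3, g12=0.0, sd_u=0.5, sd_w=0.2, sd_e=1.0, seed=1):
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        z = rng.gauss(0, 1)                       # level-2 covariate z_j
        a_j = g01 + g02 * z + rng.gauss(0, sd_u)  # random intercept
        b_j = g11 + g12 * z + rng.gauss(0, sd_w)  # random slope
        for _ in range(n_per_group):
            x = rng.gauss(0, 1)                   # level-1 predictor x_ij
            rows.append((j, z, x, a_j + b_j * x + rng.gauss(0, sd_e)))
    return rows

data = simulate()   # g12=0: random slope but no cross-level interaction
```

Fitting either specification to data generated this way (with any multilevel package) shows the point of the answer: the estimate of g02 is unaffected by whether the g12 interaction term is included.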
Question
I work with the effects of contexts such as place of residence, and use different software packages that fit multilevel models (R, Stata, MLwiN, Mplus). Almost any statistical software does this analysis nowadays (SAS, SPSS, HLM), and all provide similar estimates for the coefficients, especially for linear models. I noticed, however, some differences in the variances (i.e. the second-level variance), and I am aware they use different estimators (IGLS, REML, MLR, and so on). What are the advantages and disadvantages of the main packages? Is there any published paper comparing them for discrete outcomes and non-linear models (binomial, Poisson, negative binomial, zero-inflated, etc.)?
My experience (like that of most people) is limited to a number of software packages. HLM is very easy to use. BUGS and GLLAMM (in Stata) are very flexible and cover the widest range of models, but both are challenged by large and complex data sets, which can take a long time to estimate. MLwiN can handle large data sets very efficiently and can estimate models in likelihood and Bayesian (MCMC) mode; this allows goodish likelihood estimates to be used as starting values for the MCMC estimation. MLwiN also has lots of post-estimation procedures to help interpret the results. MIXREG is very efficient and very useful for discrete outcomes.
The Centre for Multilevel Modelling at the University of Bristol has recognised these problems and has started a large programme of work (sponsored by the UK ESRC) to provide for "inter-operability", that is, the ability to work across software platforms. The fruits of this so far are:
1: runmlwin: Runs MLwiN from within Stata
2: R2MLwiN: runs MLwiN from within R
And much more ambitiously
3: Stat-JR
“is a brand new statistical software system: it constitutes a very different data analysis experience, featuring: an interface with a range of other statistical software packages, circumventing the need to learn software-specific techniques each time functionality of a new package required, but also providing tools to help teach software-specific knowledge to those wishing to learn; its own in-house MCMC based estimation engine (eStat) for complex data modelling (including multilevel models); open source templates allowing users to write their own Stat-JR functions; an eBook interface providing an interactive way of reporting and disseminating science, and an innovative tool for teaching statistics” Put simply, you can specify a model in this environment and then ask it to create syntax for a very large range of different software packages, as well as estimate the model with its own MCMC procedures.
This has just been released; see
Question
Correlated data arise frequently in statistical analyses. This may be due to grouping of subjects (e.g., students within classrooms), to repeated measurements on each subject over time or space, or to multiple related outcome measures at one point in time. Mixed-model analysis provides a general, flexible approach in these situations, because it allows a wide variety of correlation patterns (or variance-covariance structures) to be explicitly modeled.
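A random intercept alone already induces one such pattern: every pair of observations in the same group shares the intercept variance as covariance, giving the compound-symmetry (exchangeable) structure. A minimal sketch with illustrative variance values:

```python
# A random-intercept model y_ij = mu + u_j + e_ij implies, within each
# group, Var(y_ij) = sigma2_u + sigma2_e on the diagonal and
# Cov(y_ij, y_i'j) = sigma2_u off it: the compound-symmetry structure.

def compound_symmetry(n, sigma2_u, sigma2_e):
    """n x n within-group covariance matrix implied by a random intercept."""
    return [[sigma2_u + (sigma2_e if i == k else 0.0) for k in range(n)]
            for i in range(n)]

V = compound_symmetry(3, sigma2_u=2.0, sigma2_e=1.0)
icc = 2.0 / (2.0 + 1.0)   # intraclass correlation: sigma2_u / total variance
```

Richer structures (autoregressive over time, unstructured, etc.) replace this matrix, which is what the "wide variety of correlation patterns" above refers to.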
One of the best books for mixed models (especially if you are using R) is the one by Andrew Gelman:
Question