# Hierarchical Linear Modeling - Science topic

Questions related to Hierarchical Linear Modeling

I have data from the European Social Survey (24 countries) and want to model a cross-level interaction. Can I do this with a simple random-intercept (fixed-slope) model, or do I have to fit a more complex random-slope model? And if so, are 24 countries sufficient?

I am not explicitly interested in explaining the different slopes at Level 2 via the cross-level interaction. I would do that if possible, but I suspect I would need more countries, right?

But I definitely want to show that trust in institutions (a Level 1 variable) depends on the level of corruption (a Level 2 variable) in a country. Can I do this with a random-intercept, fixed-slope model?
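A cross-level interaction is entered as the product of the level-1 and level-2 predictors. The simulation below is a minimal sketch (hypothetical numbers; plain OLS via numpy, so it ignores the random intercepts/slopes a mixed model would add) of how such a product term recovers a level-1 slope that varies with a country-level variable:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical setup: 24 countries, 50 respondents each
n_countries, n_per = 24, 50
corruption = rng.normal(size=n_countries)          # level-2 predictor
country = np.repeat(np.arange(n_countries), n_per)
x = rng.normal(size=n_countries * n_per)           # level-1 predictor
z = corruption[country]                            # broadcast level-2 value to level 1

# True model: the slope of x depends on z (cross-level interaction = 0.10)
y = 0.5 + 0.15 * x + 0.30 * z + 0.10 * x * z + rng.normal(scale=0.5, size=x.size)

# OLS with a product term recovers the interaction coefficient
X = np.column_stack([np.ones_like(x), x, z, x * z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta[3])  # close to the true 0.10
```

In practice this term would be fitted inside lmer/mixed so that country-level clustering is modeled rather than ignored.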

Hi, RG community! I am new to network analysis and I am currently facing a challenge with coding, processing, and quantifying networks in a hierarchical scheme. In this scheme, nodes pertain to differing hierarchical ranks, and ranks denote inclusion relationships. So, for example, if node “A” includes node “Z”, it is said that “A” is “Z”’s parent and “Z” is “A”’s daughter. However, a rather uncommon feature is that nodes at different ranks of the hierarchy can relate in a non-inclusive fashion. For example, node “A”, parent of “Z”, may have a directional link to “Y”, which is “B”’s daughter (if “A” were directionally linked to “B”, then it could be said that “A” is “Y”’s aunt). Here is a more concrete example to illustrate the plausibility of this scheme: “A” is a website in which person “Z” is signed in (inclusiveness; specifically, parentship); website “A” can advertise banners of website “B” (siblingship) or recommend following a link to person “Y”’s profile on website “B” (auntship).

OK. So, in the image below (top left panel) I present a graphical depiction of this rationale. For simplicity, a two-rank hierarchy is used, where gray and red denote the higher and lower ranks, respectively. The image displays siblingship, parentship, and auntship links. My first approach to coding this network scheme was to denote inclusiveness as one-directional relationships (green numbers) and simple links as symmetrical (two-way; brown numbers) relationships (see the table in the right panel). However, I soon realized that this does not yield what I expected in the networks’ metrics. For example, I am mainly interested in quantifying cohesiveness, and the way I coded the network in the top left panel amounts to something like the non-hierarchical network depicted in the bottom left panel. In short, I am not interested in the directionality of the links but in actual inclusiveness. To my mind, the network in the top panel is more cohesive than that in the bottom panel, but my coding approach does not allow me to distinguish between them formally.

The solution I conceived to solve this problem was to stipulate that a relationship between any pair of nodes implies a relationship of each with all of the other’s descendants. This indeed yields, for example, the top network being more cohesive than the bottom one, which is in line with my goals. However, this solution is not nearly as elegant as I would have hoped. Can anyone tell me if there is a better solution? Maybe another way to code the network, or an R package allowing for qualitatively distinct relationships (not just one-way or two-way). Thank you.

I need code for Consensus + Innovations and OCD in any programming language, preferably MATLAB or R.

Dear colleagues,

I am asking for your kind comments or recommendations on analyzing hierarchical and multiple responses (outcomes). I use hierarchical and multiple responses to describe my outcome variable because my outcome is quality of life (measured by the RAND-36 or SF-36). By scoring the 36 items, I obtain a continuous mean score for total quality of life. But, as you may know, under the SF-36 we can also calculate 8 domain scores (PF, RP, BP, GH, MH, RE, VT, SF) and 2 dimension (summary) scores (the PCS and MCS). Therefore, in a way, my outcomes are both multiple responses and hierarchical.

level 1: total mean score of quality of life

level 2: --- Physical component summary

level 3: ------ PF: physical functioning

level 3: ------ RP: role limitation due to physical problems

level 3: ------ BP: bodily pain

level 3: ------ GH: general health

level 2: --- Mental component summary

level 3: ------ MH: mental health

level 3: ------ RE: role limitation due to emotional problems

level 3: ------ VT: vitality (fatigue or energy)

level 3: ------ SF: social functioning

The purpose of my study (cross-sectional design) is to understand factors associated with hemodialysis patients' quality of life. Therefore, I have a series of explanatory variables (Xs) to estimate the Ys. My original analysis used "multiple regression" for each of the quality of life scores (three hierarchical levels of scores: the total QoL mean score, each of the 8 domain scores, and each of the 2 dimension scores).

But this brings me to the problem of "multiple comparisons", and I also treated each type of score (whether the total QoL mean score, a domain score, or a summary score) as "independent of each other", when they are actually correlated. The QoL measurement instrument has an inherent hierarchy and correlations among the three levels of scores in its designed conceptual framework, the SF-36.

Therefore, I would like to kindly ask for your comments or recommendations:

1) How can I analyze my Ys (outcomes) when they are multiple responses and hierarchical?

2) Will multilevel analysis (hierarchical linear regression) work for my Ys?

3) Are there other analysis methods I could try?

4) Could you please suggest some literature to explore this issue I am encountering?

Hello!

We have a question about implementing the ‘Mundlak’ approach in a multilevel **(3-level) nested hierarchical model**. We have employees (level 1), nested within year-cohorts (level 2), nested within firms (level 3).

In terms of data structure, the dependent variable is **employee satisfaction** (an ordinal measure) at the employee level (i), over time (t) and across firms (j) (call this Y_itj). Note that we have repeated cross-sections, with different individuals observed in every period. As regressors, we are mainly interested in the impact of a firm-level variable that is time-variant but employee-invariant (call it X_tj). We apply a 3-level ordered probit model (meoprobit in Stata).

We are concerned with endogeneity of the X_tj variable, which we hope to (at least partially) resolve by using some form of Mundlak approach, i.e. including firm-specific averages of X_tj, as well as of all other time-varying explanatory variables. The idea is that if the firm-specific averages are added as additional control variables, then the coefficients of the original variables represent the ‘within effect’, i.e. how changing X_tj affects Y_itj (employee satisfaction).

However, we are not sure whether approach 1 or 2 below is more appropriate, because X_tj is a level 2 (firm level) variable.

1. The firm specific averages of X_tj (as well as other explanatory variables measured at level 2) need to be calculated by averaging over individuals, even though the variable itself is a Level 2 variable (varies only over time for each firm). That is, in Stata: bysort firm_id: egen mean_X= mean(X). As our data set is unbalanced (so the number of observations for each firm varies over time), these means are driven by the time periods with more observations. For example, in a 2-period model, if a company has a lot of employee reviews in t=1 but very few in t=2, the observations in t=1 will dominate this mean.

2. Alternatively, as the X_tj variable is a level 2 variable, the firm specific averages need to be calculated by averaging over time periods. That is: we first create a tag that is one only for the first observation in the sample per firm/year, and then do: bysort firm_id: egen mean_X= mean(X) if tag==1. This gives equal weight to each time period, irrespective of how many employee-level observations we have in that period. For example, although a company has a lot of employee reviews in t=1 and very few in t=2, the firm specific mean will treat the two periods as equally important.

The two means are different, and we are unsure which approach is the correct one (and which mean is the ‘true’ contextual effect of X_tj on Y_itj). We have been unable to locate in the literature a detailed treatment of the issue for 3-level models (as opposed to 2-level models where the situation is straightforward). Any advice/suggestions on the above would be very much appreciated.
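To make the difference between the two averaging schemes concrete, here is a small sketch (hypothetical toy data; pandas rather than Stata). Approach 1 averages X over employee-level rows, so periods with more reviews dominate; approach 2 first collapses to one row per firm-year and then averages, weighting periods equally:

```python
import pandas as pd

# Toy unbalanced panel: firm 1 has four reviews in t=1 but only one in t=2
df = pd.DataFrame({
    "firm_id": [1, 1, 1, 1, 1],
    "year":    [1, 1, 1, 1, 2],
    "X":       [10, 10, 10, 10, 20],  # X varies only by firm-year
})

# Approach 1: average over individual observations (observation-weighted)
df["mean_X_obs"] = df.groupby("firm_id")["X"].transform("mean")

# Approach 2: collapse to firm-year first, then average over periods
firm_year = df.drop_duplicates(["firm_id", "year"])
period_mean = firm_year.groupby("firm_id")["X"].mean().rename("mean_X_period")
df = df.merge(period_mean, on="firm_id")

print(df[["mean_X_obs", "mean_X_period"]].iloc[0])
# observation-weighted mean = 12, period-weighted mean = 15
```

The gap between the two means grows with how unbalanced the panel is, which is exactly the ambiguity the question raises.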

I have a sample of 138 participants. Only 6 of them (4.3%) reported living alone, while the remaining 132 share a household with others (family/partner/housemate, etc.).

I am trying to decide whether I can add "living alone" as a dichotomous variable in my hierarchical regression. What worries me is the very low percentage of individuals living alone in the sample. Do you think this would be problematic?

Thank you so much for your answer!

I would like to perform a sensitivity analysis of a CFD solver. There are 8 input variables, for each of them there are 2-3 prescribed numerical values.

Evaluating one set of parameters requires three costly simulations (each running for 20 hours on 800 CPU cores). The budget for these simulations is limited, and due to the queuing system of the HPC it would take a long time to get the results.

I'm aware of Latin hypercube hierarchical refinement methods that allow starting the sensitivity analysis with a smaller budget and subsequently incorporating newer results as they become available.

But those methods work with continuous variables. Is there a method for categorical and ranked/ordinal variables?

Hello,

I have a dataset with a three-level structure. Participants reported an outcome of interest for 3-4 aspects of an incident, and they reported 4 types of incidents (1 incident for each cell of a 2 * 2 factorial design). I drew the structure in the figure below.

I have two questions:

(1) How should I structure the covariance matrix for the residuals? I expect the residuals for each person to correlate, and I expect the residuals within each event to correlate more strongly. I prefer not to use an unstructured matrix, as the number of parameters would be too large. (If it cannot be done in SPSS, then R, preferably the nlme or lme4 packages.)

In SPSS, I'm using this syntax:

MIXED Y BY Valence Type Aspect

/FIXED = Valence Type Valence*Type | SSTYPE(3)

/REPEATED=Type*Valence*Aspect | SUBJECT(participant_ID) COVTYPE( )

(2) As illustrated in the figure below, one aspect of the outcome is not applicable to the type 2 events (type 2 events only have 3 level-1 categories). I assume that when I want to assess the effect of incident type, I should exclude aspect number 4, which applies only to type 1 incidents, so that the results reflect the difference between types based on the same level-1 outcomes. But excluding or including number 4 makes a negligible difference. Can I just report that, and then include number 4 in subsequent analyses? Especially since excluding the number 4 data will cost substantial statistical power.

Thank you in advance.

I am interested to know if anyone has conducted online studies with a hierarchical linear modelling design in the field of education involving teachers and their students or in a similar field involving coach/mentor/supervisor and their athletes/employees? What platforms have you used and how has it been compatible with data privacy? What kind of ethical issues have you encountered?

Hello Expert(s),

I am considering a variable that has 5 sub-variables (dimensions). Concerning this:

How should I check moderation for such a variable? Am I supposed to treat all of them as separate moderators while analyzing the data in Hayes' PROCESS macro (Model 1)? Next, if, when treated separately, one moderator (dimension) demonstrates a different effect (-/+) compared to the others, what approach should I adopt to interpret it?

The 5 dimensions reflect mental stability; if one is missing or has a different effect, how do I address that?

Regards,

I need to assess the effect of weight on the improvement in DV score over 6 time points.

The attached document includes R syntax and output of 3 HLM models.

The document includes side notes with conclusions and questions that reflect my imperfect knowledge of multilevel models.

I would appreciate it a lot if an expert could be so kind as to have a look and answer my questions. I also need to know how to report the results in a minimal and simple way.

All partial or perfect answers and opinions are highly welcomed and appreciated.

Kind regards,

Wadie

I read quite a few resources on linear mixed effects, but none are sufficient for helping me address this complex data structure I am working with. I would be very grateful if someone could help me pinpoint the best way to account for the random effects in my data.

The overarching hypothesis I want to test is whether brain activation is increased when participants engage in one cognitive task compared to another. So within each subject is a nested structure consisting of 2 task conditions, a varying number of trials (i.e. overall time per condition), and a varying number of intracranial electrodes that measure brain activation. I attached a figure to visualize the random effects on a subject-level basis and a schematic of the overall data structure. I do not think any of these variables are cross-nested, aside from each subject completing both task conditions. Afterwards, my lab may want me to analyze these random effects more directly.

lmer(activation ~ condition + (trial | subject/electrode))

The model above is my best approach so far to test my overall hypothesis, but I have several concerns and questions:

1) I think this current model is treating condition as a between-subject effect even though it should be a within-subject variable since each subject completed both task conditions. Is there additional syntax that I could include to account for this?

2) The visualize_mixed_effects file attached shows that brain activity varies tremendously from subject to subject, from electrode to electrode, and across trials. This is not essential to my main hypothesis, so I am thinking I should account for the heterogeneity by including a random intercept and a random slope for trial number in the model. This seems fine to me, but I am less certain whether it is logically sound, since electrodes are nested within subjects. I have never fit a doubly nested LME before. Is the formula above robust enough to test my hypothesis?

I have been using the hier.part package and its functions in R and love it, but now I need to run hierarchical partitioning on a linear model with 10 variables. With more than 9 variables, hier.part has a rounding bias (see Olea et al., 2010).

Is there an alternative function in R to run hierarchical partitioning that works with 10 explanatory variables? I know it is a lot of explanatory variables for my candidate model, but I do not have any further ecological or statistical justification to remove any more explanatory variables before running best model selection.

I found the ghp package (https://github.com/Stan125/ghp), but it builds on hier.part, so I do not know if I should use it.

Or would the adipart function from the vegan package accomplish the same results? Does hierarchical null model testing also show independent and joint effects?

Forgive me if I am not clear, I am fairly new to statistical analysis! Thank you for your time and help!

We examine language outcomes in children in childcare. For most children, the nesting structure is:

**child** in **classroom** in **childcare center** in **municipality**. However, some childcare centers do not have a traditional classroom division but a more open structure. That is, there is no classroom level, or it is identical to the childcare center level. Will HLM work if, for some children, two levels are not distinguished?

I'm using Stata 16 and the "mixed" effects command.

Anders

Question edited for clarity:

My study is an observational two-wave panel study involving a single group sample with different levels of baseline pre-outcome measures.

There are three outcome measurements, each measured at two time points (pre-rest and post-rest):

1. Subjective fatigue level (measured by visual analog scale - continuous numerical data)

2. Work engagement level (measured by Likert scale - ordinal data)

3. Objective fatigue level (mean reaction time in milliseconds - continuous numerical data)

The independent variables consist of different types of data, i.e. continuous numerical (age, hours, etc.), categorical (yes/no, role, etc.) and ordinal (Likert scale).

To represent the concept of recovery, i.e. the unwinding of the initial fatigue level, I decided to measure recovery by subtracting the post-measure from the pre-measure for each outcome; the score differences are operationally defined as recovery levels (subjective recovery level, objective recovery level and engagement recovery level).

I would like to determine whether the independent variables significantly predict each outcome (subjective fatigue, work engagement and objective fatigue).

Currently I am considering the following statistical strategies. Kindly comment on whether they are appropriate.

1. Multiple linear regression; however, one outcome measure, work engagement, is ordinal data.

2. Hierarchical regression, hierarchical linear modelling or multilevel modelling, though I am not very familiar with the concepts, assumptions or other aspects of these methods.

3. I would consider reading up on beta regression (sorry, this is my first time reading about this method).

4. Structural Equation Modelling.

- Can the 3 different types of fatigue measurement act as indicators of a latent outcome construct of Fatigue?

- Can the independent variables consist of a mix of continuous, categorical and ordinal data?

Thanks for your kind assistance.

Regards,

Fadhli

G. David Garson, in his book Hierarchical Linear Modeling (2003), p. 64, asserts:

"The fact that the intercept component is significant means that the intraclass correlation coefficient, ICC, is also significant..."

I'm actually interested in the converse -- I have a non-significant intercept component in a null model and want to conclude that the ICC is not statistically significant -- but a referee has questioned that reasoning. Garson gives no justification or citation to support the assertion.

Here's what I have written for a linear model:

lin.base <- lmer(FreqTrialStim.RESP ~ zFreq_i + (zFreq_i | Subject), data = data10, REML = F)

This is the truncated script. Here's the full version, with intercepts included:

lin.base <- lmer(FreqTrialStim.RESP ~ 1 + zFreq_i + (1 + zFreq_i | Subject), data = data10, REML = F)

In the field of intellectual disability, data are often hierarchical. As in the classic example of schools (school, class, student), data in our field are often hierarchical (residential setting, life unit, individual).

In this context, I am looking for studies that have analyzed the impact of this hierarchy in the field of intellectual disability or empirical studies that have at least controlled for this using multilevel models.

Is anyone aware of the existence of such studies?

Dear researchers,

I am currently conducting a meta-analysis which analyzes the relation between personal values and different well-being scales across cultures. In my meta-analysis, the effect size I search for in all studies is the Pearson correlation coefficient.

Today I came across a relevant study that used a hierarchical linear model to predict well-being from personal values. Now I am not sure whether I can transform the results table, e.g. the HLM coefficients, in this study into a correlation.

Attached is the relevant study, with the result tables at the end. For my meta-analysis, pp. 38 and 39 are relevant for information about the relation between personal values and well-being.

It would be awesome if someone could let me know whether it is possible to calculate correlations from the results of this hierarchical linear model.

Best regards and thank you so much,

Benedict

Best Source to learn Hierarchical Linear Modeling

This study takes data from the clients of an RCT that studied the outcome of couple therapy (the experimental and control arms are two different types of therapy).

However, unlike the RCT, this study does not compare the 2 types of therapy. Rather, it compares two different groups of the population (infidelity vs. non-infidelity), and sees how they did in therapy.

If it helps, they used hierarchical linear modeling analysis.

What would be the design of this study? Here is how it is described in the article:

In the present exploratory study, the authors examined the therapy outcomes of a sample of infidelity couples (n = 19) who had participated in a randomized clinical trial of marital therapy (N = 134).

How to run Hierarchical Linear model Using SPSS?

Dear statisticians,

I am analyzing a paper that contains the following variables:

One dependent variable, and

Level 1: students' characteristics, consisting of 4 independent variables

Level 2: teachers' characteristics, consisting of 4 independent variables

Level 3: school characteristics, consisting of 2 independent variables

I would be grateful if you could help me run this analysis in SPSS.

Hi,

I am trying to find a way to express the importance/relevance of a cross-level interaction in a hierarchical linear model (so not in terms of significance). I have a level 2 moderator M (standard normal) influencing the relation between X and Y (both level 1).

Can I express the relevance in terms of the variability of the slope of the level 1 variable?

Let me give you an example: my cross-level interaction is .10. Let's say that for someone with an average value on M, the slope is .15, so for someone with a high score on M (+1), the slope is .25.

The variance of the random slope of the level 1 variable is .03 (hence the *SD* of the slope is sqrt(.03) = .17). Intuitively, I would say that the effect of the moderator M is quite strong, because it amounts to (.10/.17 =) .59 *SD* of the random slope of X. I don't think I have seen this done anywhere, but conceptually it makes sense, I think. Any thoughts?

Best,

Dirk
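The arithmetic in the proposal above is easy to reproduce; note that with the unrounded SD the ratio comes out at about .58 rather than .59. This is only a sketch of the computation, not an endorsement of the metric:

```python
import math

interaction = 0.10   # cross-level interaction coefficient
slope_var = 0.03     # variance of the random slope of X

slope_sd = math.sqrt(slope_var)   # ~0.173, the SD of the random slope
ratio = interaction / slope_sd    # ~0.577: a 1-SD shift in the moderator M
                                  # moves the slope by ~0.58 SD of its random variation
print(round(slope_sd, 3), round(ratio, 3))
```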

Hi! I have a sample that could theoretically be split into two groups. In running a series of models, I am only able to achieve model fit when one of the two groups is filtered out (and the group that is filtered out never yields model fit when entered alone). Could that be indicative of some fundamental difference between the two groups? I'm looking for literature to discuss the implications of this further but can't seem to find any. Any guidance is appreciated!

Dear all,

I am looking for a way to estimate a cross-level interaction (between Level 1 and Level 2) in MLwiN with runmlwin from within Stata (runmlwin is available via ssc install runmlwin).

Any hints would be greatly appreciated!

Best

I read an article published in a top-ranked journal in which the authors ran a null model, before testing a more complex model, to calculate the between- and within-group variance for job performance in their data set. I am trying to make sense of it but could not locate any helpful material.

I would appreciate it if someone could explain the purpose of running a null model in hierarchical linear modeling, its procedure, and the interpretation of the output it generates.

Thank you.
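For context: the null (intercept-only) model partitions the outcome variance into a between-group intercept component (tau00) and a within-group residual (sigma^2); from these two numbers one reads off the intraclass correlation, which is the usual justification for (or against) a multilevel model. A minimal sketch with made-up variance components:

```python
# Hypothetical variance components as reported by a null model
tau00 = 0.5    # between-group (intercept) variance
sigma2 = 1.5   # within-group (residual) variance

# Intraclass correlation: the share of total variance lying between groups
icc = tau00 / (tau00 + sigma2)
print(icc)  # 0.25 -> 25% of the outcome variance is between groups
```

A non-trivial ICC is typically read as evidence that observations within groups are not independent, so a multilevel model is warranted.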

Hello experts!

I am doing research on the moderating effect of emotional intelligence. A 3-step hierarchical analysis was used in this case; R-squared increased from .780 to .805 (a change of .025) when the interaction was entered into the model. I have no idea how much change in R-squared is considered significant, and if .025 is too low, is there any way to boost it?
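The significance of an R-squared change is usually judged with a hierarchical F-test rather than a fixed threshold (and artificially "boosting" ΔR² is not advisable). A sketch of that test using the two R-squared values above; the sample size and predictor count below are assumptions for illustration only:

```python
# Only the two R-squared values come from the question;
# n, m and k_full are assumed for illustration.
r2_reduced = 0.780   # model without the interaction
r2_full = 0.805      # model with the interaction
m = 1                # number of terms added (the interaction)
k_full = 4           # assumed: predictors in the full model
n = 200              # assumed sample size

delta_r2 = r2_full - r2_reduced                               # 0.025
f_change = (delta_r2 / m) / ((1 - r2_full) / (n - k_full - 1))
print(round(delta_r2, 3), round(f_change, 1))
# compare f_change against the F(m, n - k_full - 1) critical value
```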

I am running a GLM on my dataset. As the dataset has over-dispersion and fewer data points than required, I need to use the QAICc value to choose the best model.

Hi. I would like to know whether the Mundlak (1978) procedure to control for level 2 endogeneity can be extended to a 3-level hierarchical model in a longitudinal setting. The case is as follows: time is the first level of the hierarchy, firms are at the second level, and regions are at the third level.

Hi all,

I have a dataset from a camera trapping grid that has a low number of sites but spans a long period of time.

The cameras have been in the same sites since 2013, and there are 64 sites over 200 square miles. The number of sites is very low when compared to other camera trapping studies, but we have over 100,000 photos. I am interested in whether this dataset could be used to compare species diversity over time, but most of the camera trapping literature I've read has hundreds of sampling locations. The data has already been collected and it isn't possible to add more sites.

I'm looking for advice on the best statistical approaches to use for a dataset like this. Has anyone worked with a camera trapping dataset that has a very low site sample size and what analytical methods did you use?

Thank you,

I am working on comparing and developing pseudo R-squared statistics for logit models particularly as applied to multilevel (that is mixed or hierarchical) models. I am particularly keen on hearing about any recent developments say since 2010. The most recent development that I have found is

Tjur, T. (2009) “Coefficients of determination in logistic regression models—A new proposal: The coefficient of discrimination.” The American Statistician 63: 366-372.

Is there anything subsequent to this, or indeed anything recent that compares different measures? I am aware of these two earlier comparisons:

Menard, S. (2000) “Coefficients of determination for multiple logistic regression analysis.” The American Statistician 54: 17-24.

Mittlbock, M. and M. Schemper (1996) “Explained variation in logistic regression.” Statistics in Medicine 15: 1987-1997.

Thank you

I applied an ANCOVA for my analysis and obtained beautiful results, but the data were collected in three different batches; that is to say, we collected data from different people at three different times over two years, around 75 people each time. I have learned that multilevel linear models are for hierarchical structures, like different classrooms, schools, etc. All our participants are enrolled at the same university, so I am not sure whether this applies to my situation. What do you recommend: carry on with the ANCOVA, or apply a multilevel linear model to remove the possible effects of the different data collection periods?

Thanks in advance!

I am studying a process of internal collaboration among salespeople belonging to 8 different business units. During the process, a sales rep “A” belonging to BU “1” makes a recommendation to a sales rep “B” who works in BU “2”. In our research we are interested in understanding the conditions under which the dyad (the connection between the two salespeople) works. Drivers such as homophily, strength of tie, etc. will be tested at the dyad level.

Given that sales reps are assigned to a specific BU, of course there is an interesting problem of multilevel analysis, in which level 1 = sales rep characteristic and level 2: BU.

However, given that our research is focused on the dyad (the connection between two salespeople), I would like to know whether level 2 is in fact the product of the combination of two BUs (BU 1 and BU 2 in my example). If this is the case, how would you proceed? Would you, for example, build a variable which is the combination of both BUs?

Hello everyone,

I am trying to interpret an interaction term in HLM. How can this be done? My predictors are both at level 1.

Thanks for your help!

Have you worked with MECanalyst for laddering and creating Hierarchical Value Maps? Can you recommend the software?

When analyzing a multilevel model, which option should be used when including predictor variables: uncentered, group-mean centered, or grand-mean centered?

Could you also explain the implications of these three options?
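As a concrete illustration of the three options (toy data; pandas): uncentered keeps the raw scores; grand-mean centering subtracts the single overall mean, which moves the intercept to the grand mean without changing the slope's metric; group-mean centering (centering within cluster) subtracts each group's own mean, so the level-1 slope becomes a purely within-group effect (the group means are then usually re-entered as a level-2 predictor):

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x":     [1.0, 3.0, 5.0, 7.0],
})

# Grand-mean centering: subtract the overall mean of x
df["x_grand"] = df["x"] - df["x"].mean()

# Group-mean centering: subtract each group's own mean of x,
# removing all between-group variation from the predictor
df["x_group"] = df["x"] - df.groupby("group")["x"].transform("mean")

print(df)
```

Running this, `x_grand` spans the full between- plus within-group spread, while `x_group` is identical across groups: the between-group differences have been stripped out.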

My question is:

I have 3 variables (for example A, B and C).

A is at level 2, whereas B and C are at level 1 (the lowest level).

I want to test 4 models:

1. A -> C (cross-level)

2. A -> B (cross-level)

3. B -> C (at level 1 only)

4. A & B -> C (cross-level, where one predictor is at level 2 (A) and the other is at level 1 (B))

Can all these relations be tested in the HLM software, including no. 3, where both the predictor and outcome variables are at the same level? Or should that be done using simple/linear regression?

When doing multilevel modeling in the HLM software, how should I interpret the output files, especially where values such as adjusted R-squared and the F value (as in linear regression) are shown? Please advise.

Hi everyone,

I'm working on multi-class data, and I want to predict the test data. In this regard, I have a problem with introducing the response matrix to the classifier. The most common methods are adapted to two classes (0, 1), but for multi-class data there are lots of limitations...

Can anyone help me?

Many Thanks

Reihaneh

Hi,

I am testing a 2-level model using HLM 7 and would like to test 3 variables which mediate a direct cross level relationship. Can anyone suggest references on how to test and interpret mediated hypotheses?

Interpreting a random coefficients model (one predictor at level 1 but none at level 2) with a dichotomous outcome variable is simple enough, but I can't grasp how to interpret a model that includes a predictor in level 2 as well. I understand that my level 1 predictor now can't be interpreted without considering the level 2 predictor, but I don't understand how to talk about odds ratios and probabilities when this is the case. Help would be appreciated!

It should be easy to read for students, and include discussion of software. This will be used primarily with social sciences (mostly psychology and education) students.

Data from how many respondents are required for multilevel analysis at the lowest level of analysis? Is there any rule?

That is, if there is 1 respondent at the team level of analysis and 1 respondent at the individual level of analysis, can we consider this for multilevel analysis in HLM and in SPSS?

For example, SEX or SES (socioeconomic status) at the pupil level (level 1), in comparison to RGIRLS (girls ratio) or MEANSES at the school level (level 2). Some authors use the term contextual variable, others compositional variable; some even use both terms for the same variable. Is there a difference between a contextual and a compositional variable or not?

I have used stratified sampling in which I drew strata based on 'age', a categorical variable. I am confused about whether to use 'age' as a control variable or as an independent variable in hierarchical regression. I am in desperate need of an answer; technical assistance is highly appreciated. Thanks in advance.

How can this methodology be applied in a land-use optimization process?

Hi,

I have daily level data nested within individuals from a diary study. Let's say I only have hypotheses on between level effects: in this case, is it then always wrong to aggregate the data over the days and use standard OLS type analyses (e.g. correlations)?

I know there are some concerns with aggregation, e.g. it assumes that within variance is zero, gives people with fewer observations relatively more weight).The latter is less of a problem in my data since people reported about equal numbers of observations and quite a lot of them.

But given that I only have expectations on between person effects, is aggregation such a bad thing?
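For concreteness, the aggregation approach in question is just person-mean aggregation followed by an ordinary correlation; a sketch with hypothetical diary data:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily records: (person_id, mood, stress)
days = [(1, 3.0, 2.0), (1, 4.0, 1.0), (1, 3.5, 1.5),
        (2, 2.0, 4.0), (2, 2.5, 3.5), (2, 1.5, 4.5),
        (3, 4.5, 1.0), (3, 4.0, 1.5), (3, 5.0, 0.5)]

# Aggregate to person means (this discards all within-person variance)
per_person = defaultdict(lambda: ([], []))
for pid, mood, stress in days:
    per_person[pid][0].append(mood)
    per_person[pid][1].append(stress)

x = [mean(v[0]) for v in per_person.values()]
y = [mean(v[1]) for v in per_person.values()]

def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a)
           * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den

r = pearson(x, y)  # between-person correlation of the aggregates
```

The correlation of the person means estimates a purely between-person association, which is exactly the quantity the hypotheses above concern; the multilevel model's advantage is mainly in separating this cleanly from within-person effects and weighting people by the precision of their means.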

I am developing a screening tool to identify mental disorders in forced migrant populations. So far we have used CART, ROC, and sensitivity/specificity analyses to get a reasonable area under the ROC curve (.801) with a cut-off score of 2 items, but we really need to increase it (in particular, specificity is too low: about 65%).

I am therefore after input from anyone proficient in Item Response Models (e.g. Rasch or other hierarchical models) that will help me to maximise the predictive accuracy of our screening tool.

Many thanks.

Debbie
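Purely to pin down the trade-off being described, here is a sketch of how sensitivity and specificity fall out of a cut-off score on a screening total (the scores and case labels below are made up, not from the actual screening data):

```python
# Hypothetical (total_score, is_case) pairs from a screening tool
data = [(0, 0), (1, 0), (1, 0), (2, 0), (2, 1),
        (3, 1), (3, 0), (4, 1), (5, 1), (5, 1)]

def sens_spec(cutoff):
    """Sensitivity and specificity when score >= cutoff flags a case."""
    tp = sum(1 for s, y in data if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in data if y == 1 and s < cutoff)
    tn = sum(1 for s, y in data if y == 0 and s < cutoff)
    fp = sum(1 for s, y in data if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Raising the cutoff trades sensitivity for specificity:
for c in (2, 3, 4):
    print(c, sens_spec(c))
```

Any IRT-based rescoring (Rasch or otherwise) ultimately feeds back into this same trade-off: a better-discriminating total score shifts the whole ROC curve up rather than just moving along it.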

I am working on Hierarchical Modulation.

I have a Hierarchical Bayes probit model, but I'm seeking answers in general for any random-coefficients model. Depending on the level at which I calculate DIC (summing the per-unit DICs vs. the DIC for the average respondent), I get different answers when comparing against a regular probit model: the per-unit sum is worse than the regular probit, the other is better. The hit rate for the HB model is superior to the regular probit, so I don't understand why the sum of the per-unit DICs is worse.

Do you know any analysis methods to analyze mediation in mixed effects logistic regression? In my design, I have a continuous and a binary fixed effect, and a binary outcome variable.

Bauer, Preacher & Gil (2006) provide a method for analyzing mediation in multilevel data. Can we use this method for logistic multilevel regression as well?
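This isn't the Bauer, Preacher & Gil decomposition itself, but once you have estimates and standard errors for the a path (predictor to mediator) and the b path (mediator to outcome, on the logit scale), a Monte Carlo confidence interval for the indirect effect a*b is straightforward to sketch. All numbers below are made up, and note the usual caveat that in logistic models the two paths must first be put on a comparable scale, since coefficients are rescaled when the mediator enters the outcome model:

```python
import random

random.seed(1)

# Hypothetical path estimates and standard errors:
# a: predictor -> mediator; b: mediator -> outcome (logit scale)
a, se_a = 0.40, 0.10
b, se_b = 0.60, 0.15

# Simulate the sampling distribution of the product a*b
draws = sorted(random.gauss(a, se_a) * random.gauss(b, se_b)
               for _ in range(20000))

# 95% Monte Carlo confidence interval for the indirect effect
lo, hi = draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))]
print(round(a * b, 3), (round(lo, 3), round(hi, 3)))
```

If the interval excludes zero, the indirect effect is significant at the 5% level; this Monte Carlo approach makes no normality assumption about the product itself, which is known to be skewed.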

I want to test a multilevel path model (e.g., A predicts B, B predicts C, C predicts D) where all of my variables are individual observations nested within groups. So far I've been doing this through several separate multilevel analyses in R. I would prefer a technique like SEM that lets me test multiple paths at the same time (A -> B -> C -> D) while still properly handling the two levels (individuals in groups). I understand that Mplus can handle this. Is there an R package I can use?

I'm running SPSS v. 22 and have a data set on which I want to perform hierarchical linear modeling. I have not been able to find any method to do so, and I'm wondering whether I am missing something.

In 2003, in U.S. Patent Application 20030009099, Lett et al. disclosed a component-based design pattern that facilitates hierarchical model building and model sharing. In 2009, the Randhawa, Shaffer & Tyson article in Bioinformatics used a very similar approach. While the authors acknowledged that some commercial applications implemented hierarchical composition of model components, they did not know of the public disclosure in application 20030009099. I'm curious whether the authors of Smith, Hucka, Hoops et al. might comment on similarities and differences between the approaches?

Randhawa R, Shaffer CA, Tyson JJ. Model aggregation: a building-block approach to creating large macromolecular regulatory networks. Bioinformatics 2009;25(24):3289-3295. doi:10.1093/bioinformatics/btp581.

Hi all, I'm trying to fit a couple of community occupancy models and am noticing that r-hat values for certain parameters are well above 1.1 (in the 1.3 - 1.6 range) even with fairly long runs (like, 200,000 iterations w/ 100,000 burn-in). The model structure includes Bernoulli inclusion parameters associated with all beta parameters. At any rate, the problematic r-hat values are consistently associated with the precision hyper-parameters for the betas. Covariates with essentially no model support seem to converge slightly better (or at least have lower r-hats for the precision hyperparameter) than covariates with some support. Strangely, these r-hat values also seem to stabilize pretty quickly (like, the difference between 15,000 iterations and 50,000 iterations is minimal). I'm curious to hear if anybody else has encountered this. My kind of desperate explanations are

- The model just needs to be run out longer.
- I should increase the thinning rate (the trace-plots indicate that the tau chains will occasionally make pretty massive leaps that might be inconsequential if culled by thinning).
- This is related to the Bernoulli inclusion parameters turning the betas on and off, and r-hat wouldn't be expected to approach convergent values unless the inclusion probability were very close to 0 or 1.
- This is related to tau hyper-parameters themselves, which might not be expected to be distributed roughly normally.
- The taus are derived nodes based upon stochastic sigma hyperparameters, and I should be tracing the latter instead.

Sorry for the book. Thanks for any help!
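For reference, a sketch of the basic (non-split) Gelman-Rubin r-hat computation, just to make explicit what the diagnostic is comparing; the "chains" here are simulated independent draws from the same distribution, so the statistic should sit near 1:

```python
import random
from statistics import mean, variance

random.seed(42)

def gelman_rubin(chains):
    """Basic potential scale reduction factor (non-split r-hat)."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    W = mean(variance(c) for c in chains)   # within-chain variance
    B = n * variance(chain_means)           # between-chain variance
    var_plus = (n - 1) / n * W + B / n      # pooled variance estimate
    return (var_plus / W) ** 0.5

# Three well-mixed "chains": independent draws, identical distribution
chains = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(3)]
print(gelman_rubin(chains))  # should be close to 1.0
```

Since r-hat compares between-chain to within-chain variance, a parameter whose posterior is genuinely multimodal (e.g., a precision hyper-parameter whose associated beta is being switched on and off by an inclusion indicator) can hold the between-chain variance up indefinitely, which is consistent with the third explanation in the list above.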

Research at the individual level very frequently employs self-reports from interviewees (leaders, managers, employees, patients, teachers, students). Self-reports suffer from biases on the part of interviewees (halo effect, social desirability) as interviewees seek to paint a favourable impression of themselves. One method used to correct this bias is to engage with reports from others, whether at the one-up, peer, or one-down level. Other-reports, however, have their own challenges; e.g., supervisors may not be fully aware or cognisant of the full range of employees' behaviours, some of which may not be evident to them. Supervisors may also not be objective and may bring negative subjective emotions into ratings. Has any work been done on methods to combine other-reports with self-reports as a means of arriving at a fairer, more balanced assessment?

Hi, I'm Betzy, a mathematics student from the University of Indonesia, and I'm working on my undergraduate thesis. My topic is the Jackknife Ridge Estimator proposed by Singh, Chaubey, and Dwivedi (1986), but something confuses me: the formula for the weighted jackknife (pseudo-value). It is a modification of the ordinary jackknife pseudo-value formula; because the ordinary one neglects unbalancedness in the regression data, it is modified by adding weights. I don't know how they chose the weights, and I wonder whether the choice is related to the influence function, but this is hard for me to understand as an undergraduate student. I have attached some files; I hope you can understand what I mean. Thank you

Best

Betzy
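For orientation, the ordinary (unweighted) pseudo-value that the weighted version modifies is p_i = n*t - (n-1)*t_(-i), where t is the estimate on the full sample and t_(-i) the leave-one-out estimate. A sketch using the sample mean, for which the pseudo-values simply reproduce the observations (the weighted variant's specific weights are in the Singh, Chaubey & Dwivedi paper and are not reproduced here):

```python
from statistics import mean

def pseudo_values(data, estimator):
    """Ordinary jackknife pseudo-values: p_i = n*t - (n-1)*t_(-i)."""
    n = len(data)
    t = estimator(data)
    return [n * t - (n - 1) * estimator(data[:i] + data[i + 1:])
            for i in range(n)]

data = [2.0, 4.0, 6.0, 8.0]
pv = pseudo_values(data, mean)
print(pv)        # for the mean, pseudo-values equal the observations
print(mean(pv))  # the jackknife estimate equals the sample mean here
```

The mean of the pseudo-values is the jackknife estimate, and their spread feeds the jackknife variance estimate; the weighted modification changes how much each leave-one-out fit contributes when the design is unbalanced.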

I am analyzing results from a daily diary study in R using the nlme package. I am using hierarchical linear modelling and am not sure which code to use to examine whether the latter model fits better than the previous model.

I tried anova(model1, model2), but it gives the following error message:

Error in anova.lme(Model1.SigControls.IIP, Null.Model.IIP) : all fitted objects must use the same number of observations

In addition: Warning message: In anova.lme(Model1.SigControls.IIP, Null.Model.IIP) : fitted objects with different fixed effects. REML comparisons are not meaningful.

I want to estimate genetic parameters for the mean and residual variance of traits by applying a **double hierarchical generalized linear model**.

In a random-slope model, the addition of level-2 (L2) covariates implies a cross-level interaction between the random level-1 variable and the L2 covariate. In Stata (and maybe R or SAS), despite its implicit nature, the cross-level interaction needs to be explicitly entered into the model syntax by creating the interaction term(s).

Is a model considered mis-specified if one does not include these cross-level interaction terms? For example, if the fixed effect of an L2 covariate is significant (and theoretically important) but its cross-level interaction is not significant, must the cross-level interaction remain specified in the model?

I work with effects of contexts such as place of residence, and use different software packages that fit multilevel models (R, Stata, MLwiN, Mplus). Almost any package does this analysis nowadays (SAS, SPSS, HLM), and all provide similar estimates for the coefficients, especially for linear models. I noticed, however, some differences in the variances (i.e., the second-level variance), and I am aware they use different estimators (IGLS, REML, MLR, and so on). What are the advantages and disadvantages of the main packages? Is there any published paper comparing them for discrete outcomes and non-linear models (binomial, Poisson, negative binomial, zero-inflated, etc.)?

Correlated data arise frequently in statistical analyses. This may be due to grouping of subjects, e.g., students within classrooms, or to repeated measurements on each subject over time or space, or to multiple related outcome measures at one point in time. Mixed model analysis provides a general, flexible approach in these situations, because it allows a wide variety of correlation patterns (or variance-covariance structures) to be explicitly modeled.
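As a concrete illustration of the grouped-data case, a sketch that simulates a random-intercept structure (students within classrooms) and recovers the intraclass correlation, the simplest summary of the within-group correlation a mixed model accounts for; all parameter values are chosen arbitrarily for the simulation:

```python
import random
from statistics import mean, variance

random.seed(0)

# Simulate a random-intercept structure: y_ij = u_j + e_ij, with
# u_j ~ N(0, sigma_b^2) shared within group j, e_ij ~ N(0, sigma_w^2)
sigma_b, sigma_w = 2.0, 1.0                 # assumed "true" SDs
groups = []
for _ in range(200):                        # 200 classrooms
    u = random.gauss(0, sigma_b)            # classroom effect
    groups.append([u + random.gauss(0, sigma_w) for _ in range(10)])

# Method-of-moments variance components for this balanced design
n = len(groups[0])
s2_w = mean(variance(g) for g in groups)    # estimates sigma_w^2
s2_b = variance([mean(g) for g in groups]) - s2_w / n  # sigma_b^2

icc = s2_b / (s2_b + s2_w)                  # intraclass correlation
print(icc)  # true value in this simulation is 4 / (4 + 1) = 0.8
```

An ICC well above zero is precisely the situation in which ignoring the grouping (ordinary regression) understates standard errors, and a mixed model with an explicit variance-covariance structure is the appropriate remedy.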

I'm conducting analyses for a group counseling intervention study. Clients are nested within their respective counseling groups (4-9 clients). They were randomly assigned to a condition (Tx1, Tx2, or waiting list). Those receiving a treatment have an easily identifiable group. Those on the waiting list do not. So, when I go to conduct the three-level HLM (time within clients within groups) only the treatment conditions have groups that can be assigned. How can I include the waiting list condition in the analyses without losing the nested structure of the data? Is that possible? Any suggestions for how to analyze such data?

I have a simultaneous growth model with two constructs at three time points. I have estimated an unstructured covariance matrix and know that the model with the unstructured matrix fits better than a restricted matrix (it is reasonable that my intercepts and slopes are related to one another). The slopes for my constructs are VERY correlated, but I want a significance test of that correlation (to present in the paper). I am using SAS for this analysis, and it provides z tests of the covariances, but the z distribution is not appropriate for covariances, so I would like to compare model fit. However, I do not know how to restrict just this one covariance: all of the covariance matrix alternatives in SAS impose larger restrictions. I would like to isolate just the covariance between my two slope parameters.
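One route that avoids the z test entirely is to refit the model with just the slope-slope covariance fixed at zero and compare the two fits with a likelihood-ratio test on 1 df (valid under REML only if the fixed effects are identical; in SAS, the restricted fit can be obtained by holding that one covariance parameter at zero via the PARMS statement's HOLD= option in PROC MIXED). Unlike a variance, a covariance is not on the boundary of the parameter space, so the usual chi-square(1) reference distribution applies. The final arithmetic, with made-up log-likelihoods, is just:

```python
import math

def lrt_pvalue_df1(loglik_full, loglik_restricted):
    """1-df likelihood-ratio test; chi-square(1) survival function
    is available without SciPy as P(X > x) = erfc(sqrt(x / 2))."""
    stat = -2.0 * (loglik_restricted - loglik_full)
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Hypothetical log-likelihoods from the two fitted models:
stat, p = lrt_pvalue_df1(loglik_full=-1520.4, loglik_restricted=-1526.1)
print(stat, p)
```

A small p-value here would justify keeping the slope-slope covariance free, i.e., declaring the correlation between the two slopes significant.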