# Latent Variable Modeling - Science topic

Explore the latest questions and answers in Latent Variable Modeling, and find Latent Variable Modeling experts.

Questions related to Latent Variable Modeling

Imagine we have ten pictures of faces and we have them rated for attractiveness. Let's say 20 participants rate them using a 1-5 Likert scale (1=unattractive; 5=attractive). Another 20 participants rate the same faces using a 1-7 Likert scale (1=unattractive; 7=attractive). Finally, another 20 participants rate those same faces using a 1-9 Likert scale (1=unattractive; 9=attractive).

We assume there is some underlying quality, "attractiveness", that these different scales are trying to quantify. How best can we analyse these data to get at the question of this underlying measure, how well each scale measures it, and whether some scales do a better job than others, etc.?

Thanks in advance for any ideas/suggestions!

Dear SEM experienced users 😊, I’m wondering whether it is possible to do something like a “variance decomposition” of a latent variable in Structural Equation Modelling (SEM).

Let’s imagine we have a latent factor Y “determined” by two other latent variables, X and Z. We have standardized parameter estimates for X->Y (0.5) and Z->Y (-0.4). Is it possible to use these two estimates to say which latent factor is more “important” in determining Y? Ideally, is it possible to say that X accounts for x% of the variability in Y, and Z for z%? Thanks a lot in advance for any hints.
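To make the arithmetic behind this question concrete, here is a minimal sketch. With standardized coefficients, the explained variance of Y is R² = β_X² + β_Z² + 2·β_X·β_Z·r_XZ, so a clean split into "x% from X" and "z% from Z" only exists when X and Z are uncorrelated. The correlation `r_xz` is not reported in the question, so the values used below are assumptions, and the function name is hypothetical.

```python
def explained_variance(beta_x: float, beta_z: float, r_xz: float) -> dict:
    """Split R^2 of Y into the squared-path parts and the joint part.

    beta_x, beta_z: standardized paths X->Y and Z->Y.
    r_xz: correlation between the two predictors (assumed, not from the post).
    """
    direct_x = beta_x ** 2
    direct_z = beta_z ** 2
    joint = 2 * beta_x * beta_z * r_xz  # cannot be assigned to X or Z alone
    return {
        "direct_x": direct_x,
        "direct_z": direct_z,
        "joint": joint,
        "r_squared": direct_x + direct_z + joint,
    }


# Uncorrelated predictors: the squared paths add up to R^2.
uncorrelated = explained_variance(0.5, -0.4, 0.0)
# Correlated predictors (r_xz = 0.3 is a made-up value): a joint term appears.
correlated = explained_variance(0.5, -0.4, 0.3)
print(uncorrelated)
print(correlated)
```

When the joint term is non-zero, statements such as "X accounts for x% of Y" depend on how that shared part is allocated, which is a choice rather than something the two estimates determine.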

Hi,

I am looking at the three-way interaction of a latent variable (F) with ordered-categorical indicators (mostly Likert scales) and two observed variables (M1, M2), using Mplus. The following two issues arise:

1. How do I account for the ordered structure of the indicators? Do they need to be defined as categorical?

2. What does this mean for the standardization of the latent construct (F)? Standardizing the indicators does not seem appropriate!

Any advice or guidance is greatly appreciated. Thank you in advance!

I'm doing a psychology research project, and one hypothesis uses "Family Connections" as my IV and "Family Resilience" as my DV. Both are measured using a series of Likert-type items. Family Connections is unique, however, in that participants are asked about the quality of their relationship with one family member. So I have participants fill out the Family Connections questions once per family member. I allow up to 10 submissions, and I do not restrict the type of family members. This means that in my data collection, some participants will have only 1 set for 1 family member, whereas others will have 10 sets for 10 family members. The Family Resilience questions are answered only once by each participant.

How do I construct a model and then run Data Analysis?

I haven't focused on latent variable SEM, because I wasn't sure how to control/manage the number of manifest variables. I don't want participants with 1 family member to have 9 missing spots. But perhaps I need to deepen my understanding of latent variable models.

I am leaning towards multilevel modelling, because the ability to nest all family members under a participant (no matter the number of members) satisfies this problem. But then I'm left with a level-2 outcome variable, Family Resilience, and most of the literature regarding MLM only focuses on level-1 outcome variables. I also don't know how to adjust stats packages to allow for a level-2 outcome variable.

Attached is an example of the path model *IF* a participant has 3 family members. Level 2 is the top row, and Level 1 is the bottom row.

Any help is appreciated!

What is the difference between mediating and moderating variables in panel data regression?

I am trying to extract a linear regression equation from a model estimated in AMOS. I have one latent endogenous variable, one latent mediator, and one latent exogenous variable. To extract a regression equation, I have relied on the following formula:

Z = Intercept + aX + bY + error

where X is the exogenous variable and Y is the mediator. a and b are the standardized path coefficients.

Knowing that AMOS does not provide the intercept value of the latent endogenous variable, how can I calculate it?

Is it sufficient to consider the error variance calculated by AMOS as the error value?
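As a sketch of the arithmetic involved: taking expectations of the equation above (with the error having mean zero) gives Intercept = E[Z] - a·E[X] - b·E[Y]. In a fully standardized solution every variable has mean 0, so the implied intercept is 0. The error term is a random variable, so its variance describes its spread rather than giving a single value to plug in. The function name and the numbers below are hypothetical, not AMOS output.

```python
def implied_intercept(mean_z: float, a: float, mean_x: float,
                      b: float, mean_y: float) -> float:
    """Intercept implied by E[Z] = intercept + a*E[X] + b*E[Y], with E[error] = 0."""
    return mean_z - a * mean_x - b * mean_y


# Hypothetical latent means with unstandardized coefficients (illustration only).
unstandardized = implied_intercept(mean_z=3.0, a=0.5, mean_x=2.0, b=0.25, mean_y=4.0)

# Standardized solution: all means are 0, so the intercept is 0 as well.
standardized = implied_intercept(mean_z=0.0, a=0.5, mean_x=0.0, b=0.25, mean_y=0.0)

print(unstandardized, standardized)
```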

I am conducting a confirmatory factor analysis examining a three-factor model using R's lavaan package. My factors are shifting, working memory, and response inhibition. I used the modindices function to see if there was any modification index that would reduce my chi-square by 10 or more. It told me that two of my response inhibition variables highly covaried, so I redid the model and accepted the suggestion. As expected, the model improved and my loading coefficients for response inhibition evened out, but all of a sudden the standardized factor loadings for shifting became "NA." What does this mean, and why did a tweak in the response inhibition factor affect the shifting factor? In the original model, I had values for the shifting factor (though it was my worst factor). The only change I made in this new model was drawing the covariance between the two suggested response inhibition variables.

I am working on an SEM model in MPLUS with survey data (n=300) that has ordinal/binary indicators, multiple imputations (10), and survey weights.

**My variables are:**

Y1: latent with 7 ordinal indicators

Y2: Observed, single item, ordinal

Y3: latent with 6 binary indicators

Y4: Latent, 7 ordinal indicators

X1: Binary, dummy

X2: Binary, dummy

**Model is:**

Y1 on Y2 Y3 Y4 X1 X2

Y2 on Y3 Y4 X1 X2

Y3 on Y4 X1 X2

Y4 on X1

X1 with X2

My latent variable model fits very well, with CFI = 0.99, TLI = 0.98, RMSEA = 0.03, SRMR = 0.08, and chi-sq/df = 1.35. However, given the complexity of the model and the small sample size, I wanted to reduce the number of variables by calculating composite scores (sum of indicator value * factor loading) for all three latent variables and running a path model. The significance of the variables is nearly the same as in the latent model, but the fit indices changed substantially: CFI = 0.94, TLI = 0.16, SRMR = 0.03, RMSEA = 0.12. I read in the literature that when a model has few degrees of freedom, RMSEA has no interpretive value. But what about TLI: why is it so low compared to the CFI? Any thoughts on what I can test to find the cause of this strange result?
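For readers puzzling over the same CFI/TLI gap, the standard formulas can be sketched as below. CFI works with the difference chi-square minus df, while TLI works with the ratio chi-square/df, so with very few model degrees of freedom the two can diverge sharply. The chi-square values are invented for illustration (the post does not report them), and the function names are hypothetical.

```python
def cfi(chi2_model: float, df_model: float,
        chi2_base: float, df_base: float) -> float:
    """Comparative Fit Index from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    return 1.0 if denom == 0 else 1.0 - d_model / denom


def tli(chi2_model: float, df_model: float,
        chi2_base: float, df_base: float) -> float:
    """Tucker-Lewis Index; based on chi-square/df ratios, so it penalizes low df."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Made-up example: a path model with 1 df and a baseline model with 15 df.
example_cfi = cfi(5.3, 1, 90.0, 15)
example_tli = tli(5.3, 1, 90.0, 15)
print(round(example_cfi, 3), round(example_tli, 3))
```

In this invented case the model chi-square is small in absolute terms, which keeps CFI high, but its chi-square/df ratio is close to that of the baseline model, which pulls TLI down.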

I almost asked this as a "technical question" but I think it's more of a discussion topic. Let me describe where I get lost in this discussion, and what I'm seeing in practice. For context, I'm a quantitative social scientist, not "real statistician" nor a data scientist per se. I know how to run and interpret model results, and a little statistical theory, but I wouldn't call myself anything more than an applied statistician. So take that into account as you read. The differences between "prediction-based" modeling goals and "inference-based" modeling goals are just starting to crystalize for me, and I think my background is more from the "inference school" (though I wouldn't have thought to call it that until recently). By that I mean, I'm used to doing theoretically-derived regression models that include terms that can be substantively interpreted. We're interested in the regression coefficient or odds ratio more than the overall fit of the model. We want the results to make sense with respect to theory and hypotheses, and provide insight into the data generating (i.e., social/psychological/operational) process. Maybe this is a false dichotomy for some folks, but it's one I've seen in data science intro materials.

**The scenario:** This has happened to me a few times in the last few years. We're planning a regression analysis project and a younger, sharper, data-science-trained statistician or researcher suggests that we set aside 20% (or some fraction like that) of the full sample as a test sample, develop the model on our test sample, and then validate the model on the remaining 80% (validation).

**Why I don't get this (or at least struggle with it):** My first struggle point is a conceptual/theoretical one. If you use a random subset of your data, shouldn't you get the same results on that data as you would with the whole data (in expectation) for the same reason you would with a random sample from anything? By that I mean, you'd have larger variances and some "significant" results won't be significant due to sample size of course, but shouldn't any "point estimates" (e.g., regression coefficients) be the same since it's a random subset? In other words, shouldn't we see all the same relationships between variables (ignoring significance)? If the modeling is using significance as input to model steps (e.g., decision trees), that could certainly lead to a different final model. But if you're just running a basic regression, why would anyone do this?

There are also some times when a test sample just isn't practical (i.e., a data set of 200 cases). And sometimes it's impractical because there just isn't time to do it. Let's set those aside for the discussion.

Despite my struggles, there are some scenarios where the "test sample" approach makes sense to me. On a recent project we were developing relatively complex models, including machine learning models, and our goal was best prediction across methods. We wanted to choose which model predicted the outcome best. So we used the "test and validate" approach. But I've never used it on a theory/problem-driven study where we're interested in testing hypotheses and interpreting effect sizes (even when I've had tens of thousands of cases in my data file). It always just seems like a step that gets in the way. FWIW, I've been discussing this technique in terms of data science, but I first learned about it when learning factor analysis and latent variable models. The commonality is how "model-heavy" these methods are relative to other kinds of statistical analysis.

So...am I missing something? Being naive? Just old-fashioned?

If I could phrase this as a question, it's

**"Why should I use test and validation samples in my regression analyses? And please answer with more than 'you might get different results on the two samples'" :)**

Thanks! Looking forward to your insights and perspective. Open to enlightenment! :)
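The intuition in this post (a fixed, pre-specified regression gives roughly the same point estimates on a random 20% subset as on the full sample) can be checked with a small simulation. This is only a sketch under assumed conditions: the data-generating model y = 1 + 2x + noise, the sample size, and the seed are all made up, and the helper names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(a, b, xs, ys):
    """Mean squared prediction error of the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(42)
n = 2000
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]  # assumed true model

# Random 20% development subset, remaining 80% held out for validation.
idx = list(range(n))
rng.shuffle(idx)
dev_idx, val_idx = idx[: n // 5], idx[n // 5:]

a_full, b_full = fit_line(xs, ys)
a_dev, b_dev = fit_line([xs[i] for i in dev_idx], [ys[i] for i in dev_idx])
val_mse = mse(a_dev, b_dev, [xs[i] for i in val_idx], [ys[i] for i in val_idx])

print(f"slope full sample: {b_full:.3f}, slope 20% subset: {b_dev:.3f}")
print(f"held-out MSE of the subset model: {val_mse:.3f}")
```

For a model specified in advance, the two slopes differ only by sampling noise. The held-out error becomes informative mainly when the model itself (terms, cut-offs, tuning) was chosen by looking at the development data.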

I am performing a small study. The model has passed the checks for construct reliability, convergent validity, and construct validity. However, discriminant validity is not achieved. Is discriminant validity possible for a model with two latent variables? Are there any guidelines in the literature? Any advice?

Regards

We can say that these factor analysis approaches are generally used for two main purposes:

1) a more purely psychometric approach in which the objective is to verify the plausibility of a specific measurement model; and,

2) with a more ambitious aim in speculative terms, applied to represent the functioning of a psychological construct or domain, which is supposed to be reflected in the measurement model.

What do you think of these general uses?

The opinion given by Wes Bonifay and colleagues can be useful for the present discussion:

Hey,

For my research I have created 7 latent variables, each of which will consist of at least 3 items (measured by an online survey with statements and a 7-point Likert scale). My question is the following: is it possible to apply Confirmatory Factor Analysis? I want to show in the results section that I grouped the right items together under the correct latent variable before conducting multiple linear regression with my latent variables. Also, is there some literature available to back this up?

Thanks in advance for answering!

Kind Regards,

Bram ten Barge

I have questions regarding **measurement invariance** for **longitudinal data** when analyzing **latent variables**. I would like to analyze **4 cohorts** (grades 3, 4, 5, 6 at baseline) longitudinally over 2 years with four assessments.

1. If I consider the four cohorts as **one sample**, should I then show measurement invariance (**configural invariance, metric invariance, scalar invariance and strict invariance**) between the cohorts for each of the four assessments? Or would it be more appropriate to evaluate each cohort separately? As an analysis method I would like to calculate **random-intercept cross-lagged panel models**.

2. Is it necessary to test for **configural invariance, metric invariance, scalar invariance as well as strict invariance** between the four assessments for the latent variables? Or is it also sufficient under certain conditions if only configural invariance is given?

I am writing my research paper and I have noticed that the moderator variables of my model are not strongly supported theoretically. Just to clarify, there are articles explaining the phenomena, but only a few articles support them as moderator effects. I ran the model in SPSS and SmartPLS and the results were statistically significant (with high validity and reliability results). However, I also have to find theories to support them as moderator variables, since I am pursuing a deductive approach.

If you know of any academic articles supporting the use of moderator variables with little (or no) theoretical support, kindly list them so I can go through them.

I wanted to perform a confirmatory analysis on part of the Schwartz PVQ questionnaire. The construction of the questionnaire assumes that individual items combine into dimensions (latent variables), those dimensions in turn into larger dimensions, and those into even larger dimensions. For example (a fragment is in the picture uploaded below): three questions/items each create the variables SDT, SDA, POD and POR, and then, according to the instructions of the questionnaire, SDT and SDA form the superior/master dimension SD (and POD and POR make PO). Then PO and SD merge into one variable again... and this is just a fragment of the questionnaire key.

I don't understand why such a construct in SPSS AMOS is always under-identified. What is missing in it? What structure? How can I estimate this model?

Hi folks! We did a large survey and planned to use 3 indicators to represent a latent construct (Sense of Place/SoP) in a latent regression model (structural equation modeling). However, one of the indicators did not work well (model fit was too low) and I was left with only two indicators. I fear that reviewers who are used to formative scales will think that two indicators yield a very weak scale.

As reflective items are interchangeable, the latent construct should not change whether you use few or many indicators. I can argue for the face and construct validity of the indicators/scale, but there are many other suggested indicators for SoP if one were to develop a formative scale. Are there any good references for scales with few indicators? By the way: the model fit was excellent (see the enclosed structural model; the SoP construct had only two indicators).

I have a three-latent-variable model where LV1 predicts LV2 and LV3. How do I test whether the effect of LV1 is the same on both LV2 and LV3? I have set the paths to be equal, but how do I then check whether the paths are actually equal or not (i.e., whether there is a statistically significant difference)?
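One common way to frame this check is a chi-square difference (likelihood ratio) test between the model with the two paths constrained equal and the model with both paths free. The sketch below covers only the single-constraint case (1 degree of freedom) and assumes plain maximum likelihood chi-square values; with robust or weighted least squares estimators a scaled difference test is needed instead. The chi-square values are invented and the function name is hypothetical.

```python
import math


def chi2_diff_test_1df(chi2_constrained: float, chi2_free: float):
    """Chi-square difference test for one equality constraint (df = 1).

    For df = 1 the upper-tail probability is erfc(sqrt(delta / 2)).
    Returns (delta_chi2, p_value).
    """
    delta = chi2_constrained - chi2_free
    if delta < 0:
        raise ValueError("The constrained model cannot fit better than the free model.")
    p_value = math.erfc(math.sqrt(delta / 2.0))
    return delta, p_value


# Made-up fit statistics: equal-paths model vs. free-paths model.
delta, p_value = chi2_diff_test_1df(chi2_constrained=58.2, chi2_free=52.1)
print(f"delta chi-square = {delta:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest that forcing the two paths to be equal worsens the fit, i.e. that the effects of LV1 on LV2 and LV3 differ.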

I want to estimate the causal relationship between two latent variables over time using structural equation modelling. Both latent variables are measured by different (observed) indicators. The data are panel data with N=21 and T=16 years.

After going through SEM estimation techniques, cross-lagged panel models seem to be the best option. However, given the small sample size (N=21), is a cross-lagged panel model applicable in this case? I noticed that cross-lagged models usually involve a large N and a small T (max 5 waves).

The obstacle I'm facing is the small sample size given the number of indicators.

Any ideas whether cross-lagged models are applicable, or are there other techniques? (Please note that I am estimating two latent variables with many indicators.)

Dear community,

I would like to combine two subscales of a questionnaire to form one predictor: it is an instrument on intrinsic vs. extrinsic goals which assesses, for several goals, their attainment and importance with two separate questions. Since using attainment and importance as two distinct predictors won't work, I was thinking of using the difference as a variable by subtracting the importance from the attainment. For example, a positive or zero value for the scale "personal growth" would mean that the attainment is larger than or as large as its attributed importance, while a negative value indicates the lack of attainment of an important value.

Has anybody modelled something similar before, e.g. using lavaan or MPLUS? It seems to work nicely as a manifest variable in an MLR, but I would like to try it as an SEM.

Thank you very much in advance,

Kind regards

Matteo

I know that they are different, but I am kind of at a loss for how to explain the difference, though I may be overthinking it. In simple terms, my understanding is that a latent mean (intercept) is an estimate of a theoretical construct that is equal to 0. You can rescale the latent mean by fixing the variance to 1 and setting the intercept of one of the indicators to 0. In this case the latent mean will take on the scale of the indicator. An observed composite mean is simply the mean of the indicators of the factor. The main difference between the two is that the composite mean includes the measurement error of the observed variables, while the latent mean adjusts for this.

Am I missing anything else here? Also, I assume that it would be inappropriate to compare the latent mean from one study with a composite mean from another study?

Is there any possibility to get the estimated values (listed for every participant) of a specific latent variable in R or Mplus?

I wish to use the self-monitoring scale developed by **Richard D. Lennox and Raymond N. Wolfe** ("Revision of self-monitoring"). **It is a two-factor scale with 13 items in total: (1) ability to modify self-presentation (7 items) and (2) sensitivity to the expressive behaviour of others (6 items).**

But for a part of my study I am trying to show the impact of self-monitoring on consumption behaviour, so I feel only the first dimension (ability to modify self-presentation) is useful for me. Also, I have 6-7 more latent variables, so the survey is already touching 90 questions and I want to minimise the items.

**Can I just use the 7 items given by this scale (representing the first dimension of self-monitoring), without disturbing the psychometric properties, and still call the composite variable of these 7 items "SELF MONITORING"? Is this an acceptable practice in research? (I will be doing SEM eventually.)**

I am running a PLS-PM to evaluate Satisfaction "S".

I have defined a latent variable (LV) "D" and I am unsure how to define its manifest variables (MVs).

I have:

- 3 variables regarding satisfaction with "D", expressed on a 7-point Likert scale
- 1 variable regarding how many x you consume
- 1 variable regarding how much you spend to buy x

My doubt:

I expect the satisfaction MVs to be positively correlated with the LV and the last two MVs to be negatively correlated. I have seen someone transforming MVs into negative MVs to satisfy the unidimensionality condition, but I would like to avoid this solution because it does not make much sense conceptually.

- Can I consider my MVs to be reflective of the LV, given that I have an MV about consumption and an MV about expenses?
- Alternatively, can I specify my LV "D" as being explained by 3 reflective MVs and 2 formative MVs?

**Any suggestions on how to define my LV?**

Thanks a lot for your attention; any help will be highly appreciated.

Is there a package in R (or Stata) to estimate Integrated Choice and Latent Variable models (hybrid choice models)?

Hi,

I am a serious novice at AMOS (still learning terminology etc. as I go). I have been reading and researching everything and am still struggling at this particular point.

After running my model, AMOS says that the covariance matrix between my latent variables is not positive definite and that the solution is not admissible. I changed the variance of my errors to 1, as I read this was supposed to solve the problem, and it still hasn't helped.

Please see attached my model and the output.

The model is supposed to investigate how the characteristics of victim, offender and offence (e.g. observed variables are sex, age etc.) influence the outcome of the case, which is an observed variable as well.

Any help would be greatly appreciated; I have been at this for weeks.

EDIT: There is also a previous model with the disturbance terms on the latent variables; however, I removed these when AMOS wouldn't allow me to draw the covariance arrow from the latent variables, only from the disturbance terms. It did produce different output, but with the same error message, if this is any help (see attached).

Hey everyone,

currently, I am working with Gaussian process latent variable based models. In literature, the model likelihood is discussed for model selection.

Unfortunately, this does not work for my application. Currently I am using the log likelihood and the reconstruction error. The model likelihood increases and the reconstruction error decreases with increasing dimension/inducing points. BIC doesn't make sense in this context (and behaves similarly...).

Are there better parameters for model selection?

All the best,

Will

In value-added models, school selection is endogenous to school-level treatments as well as the school random effects. It is easy to justify that students in highly selective schools (e.g., middle schools) would go to highly selective schools further (e.g., high schools). Therefore, I think it would be reasonable to use the school effects for high schools (random effects) in the study of middle school outcomes, as an instrumental variable to control for the potential endogenous selection problem. What do you think?

Hello,

I have 5 variables each measured using multiple items, taken from already published studies (that is there is evidence of construct reliability and validity).

I am taking one variable as an example here, let's say V1, which is measured with the help of 6 items [V1 (i1, i2, i3, i4, i5, i6)]. In the previous study all six items loaded significantly on V1.

I ran the model for two different groups of respondents.

In Study 1, only 4 items loaded significantly on V1 (i1, i2, i3, i4), with alpha, composite reliability and AVE within acceptable thresholds. [Case A]

In Study 2, where a different set of respondents was used, 5 items loaded significantly on V1 (i1, i3, i4, i5, i6), with alpha, composite reliability and AVE within acceptable thresholds. [Case B]

Now I want to compute variables to make a comparison between the two groups.

In such situations, should I compute the variable as in the original study, where V1 was measured with 6 items, or should I compute the variables as per the loadings mentioned in Case A (4 items) and Case B (5 items), and then run a t-test for comparison?

(N.B. The variables are reflective, so I think the difference in the number of items will not matter, or will it? *thinking*)

Thanks in advance.

Ali
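For reference, the composite reliability (CR) and average variance extracted (AVE) mentioned in this question are usually computed from standardized loadings along the lines of the sketch below. The loadings are invented for illustration, the function names are hypothetical, and the formulas assume uncorrelated residuals.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances).

    Assumes standardized loadings, so each residual variance is 1 - loading^2.
    """
    common = sum(loadings) ** 2
    residual = sum(1.0 - l ** 2 for l in loadings)
    return common / (common + residual)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Made-up standardized loadings for the two item sets.
case_a = [0.80, 0.75, 0.70, 0.65]        # 4 retained items
case_b = [0.80, 0.70, 0.65, 0.60, 0.55]  # 5 retained items

for name, loadings in (("Case A", case_a), ("Case B", case_b)):
    cr = composite_reliability(loadings)
    ave = average_variance_extracted(loadings)
    print(f"{name}: CR = {cr:.3f}, AVE = {ave:.3f}")
```

Acceptable CR and AVE in each group describe reliability within that group. Whether scores built from different item sets can be compared across groups is a separate question, typically addressed through measurement invariance testing.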

I have some explanatory variables and I have designed a stated preference survey with those variables or attributes. Along with that, I have asked questions about safety perception (only three questions, measured on an ordinal scale), keeping in mind not to enlarge the questionnaire. If I do factor analysis, these three questions will merge into one factor. My question is whether I can estimate an ICLV model that directly incorporates the three perception-based questions (the three questions are related to safety but not necessarily correlated), measured on an ordinal scale, into the choice model estimation without doing factor analysis. Are there any other suggestions? Please help.

Hi there, can anyone provide me with suggestions on choosing software? I'm interested in knowing which software is best for the simultaneous estimation approach. **Matlab, Mplus, Biogeme**, etc. are all capable of this. What is your favorite software and why? What are their pros and cons? Thank you.

I, personally, am not a huge fan of **Matlab** because of the long code and the troublesome error-checking process. I'm curious whether **Latent Gold** can be used for simultaneous estimation of hybrid choice models.

Can someone please explain how to do a path analysis within a structural equation model in AMOS when the research model has only latent variables, including 3 moderators and 1 mediator? I am struggling to build it in AMOS to test the structural model. I would be grateful for any hint. Thank you.

I have two latent variable models which I have identified in separate analyses (we'll call them Model A and Model B). In each of the analyses, all parameters were freely estimated and I achieved excellent model fit. The latent variables in each model are thought to represent trait abilities measured via different assessment tools. I want to see how the latent variables from Model A predict the latent variables in Model B. Given that each model is believed to represent trait-level (static) abilities, is it acceptable to constrain each of the models based on the parameters identified in the separate analyses, and ONLY freely estimate the prediction parameters from Model A to Model B?

What are the differences between SEM and Path Model? As far as I know, SEM overcomes two of the issues with path model, with latent variables and non-recursive models. But is it necessary that SEM should always have a latent variable in the model? Please clarify. Thank You.

I'm using 4-6 parcels as indicators for each of the 4 latent factors in my structural model. I parcelled the items in 2 factors using the internal-consistency approach (each parcel containing items from 1 subscale), and the other 2 factors using the domain-representative approach (each parcel containing a representation of items from all subscales). I would like to know whether what I have done is appropriate?

Can we design a model with 1st-, 2nd-, 3rd-, 4th- and 5th-order/level variables comprising reflective and formative variables as follows: items to 1st-order variables are formative, 1st-order variables to the 2nd-order variable are reflective, then 2nd-order to 3rd-order variables are formative, 3rd to 4th is also formative, and 4th to 5th order/level is formative?

I'm planning to perform an SEM in order to investigate intention to innovate in relation to attitude toward innovation and entrepreneurial self-efficacy, as well as other variables.

Each of these factors is measured through 4 or 5 items on a five-point Likert scale. A "latent variable" is a variable that cannot be directly measured. Are these factors latent variables, because they are measured through other measures (the 4 items), or are these factors observed variables, where the observation consists of 4 items?

To my understanding, each should be seen as an observed variable, because each item alone is not a variable, but the mean of all of them is. Nevertheless, I am not sure about this definition, and would be grateful if someone could clarify it for me, since this changes the whole structure of the SEM.

Best regards,

Pedro

Hi,

I have a problem with SEM using Lisrel. All my variables are ordinal. Hence the indicator variables for the independent latent variable (intention) and the observed dependent variable (behaviour) are all ordinal. Now if I try to define an observed variable as a dependent variable of the latent variable, Lisrel assumes it to be another indicator of the latent variable. To solve this problem, I have tried using a single-indicator latent variable: I create a latent variable for which behaviour is the single indicator, and this latent variable then becomes the dependent variable, while the latent variable intention remains the independent variable. The model works, but reading the above argument I have doubts about its reliability. So I have two questions:

1) In Lisrel, how can I treat an observed variable (behaviour) as a dependent variable of a latent variable without it being mistaken for another indicator of the independent latent variable intention?

2) If this is not possible, can I use a single-indicator latent variable to define my dependent variable?

Thank you in advance

Sandeep

I am conducting latent measurement invariance analyses using MPlus. The data are best modeled by a single factor (A) and a reverse-coded method factor (RC). All six items load onto A; three of the six items load onto RC.

I would like to calculate composite reliability (omega) for A. It appears that the variance of the dual-loading items is affected (lowered) by this factor structure, which results in potentially liberal estimations of omega. Modeling the data using a correlated error structure provides markedly higher variances for the cross loading items, and, thus, a lower estimation for omega.

What is the appropriate way to calculate omega for data with this structure? Thank you very much for your help,

Chris Napolitano
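As a point of reference for this question, the usual omega-type formulas can be sketched as follows. This is only an illustration under stated assumptions: the substantive factor A and the method factor RC are treated as orthogonal with unit variance, residuals are uncorrelated, and the loadings are invented (they are not from the Mplus output). The function names are hypothetical.

```python
def omega_total(general, method, residuals):
    """Share of composite variance due to all modeled factors (A and RC)."""
    gen = sum(general) ** 2
    meth = sum(method) ** 2
    return (gen + meth) / (gen + meth + sum(residuals))


def omega_general(general, method, residuals):
    """Share of composite variance due to the substantive factor A only."""
    gen = sum(general) ** 2
    meth = sum(method) ** 2
    return gen / (gen + meth + sum(residuals))


# Made-up standardized loadings: six items on A, last three also on RC.
loadings_a = [0.7, 0.6, 0.7, 0.6, 0.5, 0.6]
loadings_rc = [0.0, 0.0, 0.0, 0.4, 0.3, 0.4]
# Standardized residual variances implied by orthogonal factors.
residual_vars = [1.0 - a ** 2 - m ** 2 for a, m in zip(loadings_a, loadings_rc)]

print(f"omega for A only: {omega_general(loadings_a, loadings_rc, residual_vars):.3f}")
print(f"omega total:      {omega_total(loadings_a, loadings_rc, residual_vars):.3f}")
```

Keeping the method-factor variance in the denominator but out of the numerator is what separates the reliability of A from the reliability of the composite as a whole.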

I am trying to include some attitudinal variables in mode choice models. Does anybody know which software package can be used to estimate the model?

I would like to know about analysing SEM images related to types of wear.

I am doing a correlational study on the relationship between teachers' sense of efficacy and school culture score (IV) and student achievement (DV). The new NJ Teacher's Evaluation system moderates/mediates the relationship.

Hello

I'm running a SEM model with many latent variables, and after generating factor scores for each of them (based on the observed indicators), I'm struggling to normalize them in order to proceed with the analysis. The regular z-score is not working.

I have found some studies using Tukey's proportion estimation formula for normalization, but I have had no success finding this transformation.

Does anyone have any experience in this area?

Thanks in advance!
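For anyone looking for the same transformation: Tukey's proportion estimate is usually given as p = (r - 1/3) / (n + 1/3), where r is the rank of a value and n the number of cases, and the normal score is the standard normal quantile of p. A minimal sketch, assuming ties receive the average rank; the function name and the example scores are hypothetical.

```python
from statistics import NormalDist


def tukey_normal_scores(values):
    """Rank-based normal scores using Tukey's proportion (r - 1/3) / (n + 1/3)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 1 / 3) / (n + 1 / 3)) for r in ranks]


# Made-up factor scores, including one tie.
factor_scores = [3.2, 1.1, 2.5, 9.9, 2.5]
normal_scores = tukey_normal_scores(factor_scores)
print([round(z, 3) for z in normal_scores])
```

Because the transformation uses only ranks, it keeps the ordering of the factor scores but discards the distances between them, which is worth keeping in mind when interpreting later results.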

My question is regarding SEM using Amos. I used only three observed variables for each latent factor, and I have six latent factors. In the CFA I found that one of these observed variables is not at a satisfactory level (its factor loading is less than 0.5). In that case, can I remove that variable and do the analysis with two observed variables for that latent factor, or is there another option?

I have a data set that covers different states of a country. In every state there are different companies, and one company in every state manages the other companies in that state (the other companies are branches of this leading company at different levels). I want to normalize (or standardize) this data set and then use Factor Analysis (FA) to combine different input features into a single performance indicator.

- Is it possible to normalize the data in every state separately, using the leading company's feature values as the denominator for the other companies in that state?
- Can we compare a company from one state with a company in another state under this structure (compared to using one leading company for the whole data set)?
- Does this normalization method affect the factor analysis assumptions?

Note: the leading company for the whole data set is so big and has such high feature values that I decided to use this normalization structure. The scales and measurement units of the features differ.

I know there are a bunch of validated 'peer friendship quality' scales out there, but the problem is that they all involve a lot of items. For my newest research I want to ask adolescents about the 'friendship quality' with at least 6 different peers. I also want to ask them a lot of other questions.

To avoid repetitive answering and questionnaire fatigue, I am now considering the use of a one- or two-item scale (e.g. *How good do you consider the friendship with peer X?*) with a range between 1 (*very low quality*) and 10 (*very high quality*).

I know latent constructs are better in terms of measurement error, but keeping respondents motivated is perhaps more difficult if you ask them to complete 20 items per peer.

I am working on my master's thesis and will be testing models that have facets of health as the outcome. Specifically, I am looking at:

- physical health (i.e., health problems, such as hypertension, pain, vision problems),
- functional health (i.e., how health problems impair or limit daily functioning, such as working, sleeping, seeing),

I'm thinking that these facets of health are formed by their indicators, rather than the indicators being reflective of the facet of health. But can an argument be made in favor of reflective?

Relatedly, if I do treat these as formative, what are the implications for treating these latent variables as endogenous outcomes? I've read Diamantopoulos et al. (2008), and I am not sure whether, or how, the recommendations for formative latent variables change if the latent variable is the outcome.

If it helps in any way, most of my indicators are categorical, but I also have a few continuous. I was planning on using robust weighted least squares as my estimator and conducting my analyses in Mplus.

Thank you in advance, and let me know if you need more details.

Thank you for reading this question.

In my data, age was captured in categories (e.g., "under 20 years", "21-30 years"). Since those categories introduce some inaccuracy, I'm not sure whether "age" has to be modeled as a manifest variable with or without a disturbance term. The point is: does measuring in categories produce an error that is modeled in SEM or not?

Moreover, if "age" is modeled as a manifest variable, it belongs to the measurement model; but since my hypotheses include "age", doesn't it also belong to the structural model?

Thank you for your help!

I have three questions regarding SEM, as below:

1- There are observed and latent variables in the models, and I know the definition of each, but I do not know whether we can treat the total score of a scale as an observed variable or not.

As I have around 150 items (across all scales used in my research), I cannot draw the measurement model and the mediation models at the item level. Do you have any suggestions, please?

2- Would you please let me know whether drawing a covariance between the mediators is wrong or not?

Mediators in the models are endogenous variables, and covariances should be drawn between exogenous variables only. However, I have seen some scholars use a covariance between mediators. Is that right?

3- Is it acceptable if I draw separate models in my thesis based on my hypotheses?

I used SPSS to cover three of my objectives, but for the last objective, which concerns the mediating effects, I used SEM in Amos. I drew four different models, one for each hypothesis. Is this correct?

The attached file is a sample to show what I am trying to ask.

Thank you for your valuable time,

I have multiple repeatedly measured risk factors and 6 time points of children's growth. I want to create a latent variable for growth change and examine how it is affected by the repeated risk factors.

Ideally, I would like to use a single threshold, e.g., categories 0/1 vs. 2/3.

In my two-segment latent class model, I set the maximum number of iterations to 50. When I ran my LIMDEP program, it showed the message maximum 50 iteration. Exit iteration status=1. I do not understand whether the results obtained with this message are valid or not.

Please reply.

Thanks in advance.

What does it mean when the captured variance of the X-block in PLSR is very low (for example, 20%) but the captured variance of the Y-block is 98%?

Is this model proper? Can this model be used for prediction?
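To make the two percentages concrete, here is a minimal sketch of how X-block and Y-block captured variance can be computed for a single PLS component with one response (a NIPALS-style PLS1 step). The data are hypothetical and already mean-centred: the response depends only on the first, low-variance predictor, while the other two predictors carry most of the X variance but are unrelated to the response. It illustrates how the two figures can diverge; it does not by itself say whether a given model is adequate for prediction.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical mean-centred data: 4 samples, 3 predictors.
X = [
    [-1.0, -3.0, 3.0],
    [1.0, -3.0, -3.0],
    [-1.0, 3.0, -3.0],
    [1.0, 3.0, 3.0],
]
y = [-1.0, 1.0, -1.0, 1.0]  # driven only by the first (low-variance) column

columns = list(zip(*X))

# One PLS1 component: weight vector proportional to X'y, normalised to length 1.
w = [dot(col, y) for col in columns]
norm = dot(w, w) ** 0.5
w = [wi / norm for wi in w]

t = [dot(row, w) for row in X]  # scores
tt = dot(t, t)
p = [dot(col, t) / tt for col in columns]  # X loadings
q = dot(y, t) / tt  # y loading

# Fraction of total sum of squares reproduced by the component in each block.
x_total = sum(v * v for row in X for v in row)
x_captured = sum((ti * pj) ** 2 for ti in t for pj in p) / x_total
y_captured = (q * q * tt) / dot(y, y)

print(f"X-block captured: {x_captured:.1%}, Y-block captured: {y_captured:.1%}")
```

In this constructed case the component reproduces the response completely while accounting for only a small share of the predictor variance, because most of the variation in X lies in directions that are unrelated to y.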

Can we use two different Likert scales (e.g., 2-point and 4-point) within a single latent variable?

Within the same structural model, can I use indicators with continuous data for one latent variable and indicators with categorical data for another?

For a latent variable, I have two indicators with a 4-point Likert scale and two that are dichotomous. Is it correct to use two different Likert scales to measure the same latent variable?

And can we use two different Likert scales to measure a single latent variable?

Within the same structural model, can we use one latent variable with continuous data and one with categorical data?

Within the same latent variable, can we use two indicators with 4-point Likert scales and two with dichotomous scales?

I work in biometrics and would like to apply latent variable analysis (LVA) to my data, but I have not been able to find any useful information to help me. Where do I start?