Stata Software - Science topic
Questions related to Stata Software
What are the processes to extract ASI microdata with STATA and SPSS?
I have microdata in STATA and SPSS formats. I want to know about the process. Is there any tutorial on YouTube for ASI microdata?
Hey all,
I need help testing for multivariate outliers using STATA for my master's thesis. The literature recommends the Minimum Covariance Determinant (MCD) (Verardi & Dehon, 2010). I found the relevant STATA commands, which can be installed with ssc install robmv.
However, I am unsure how to use the robmv command as there is limited beginner-friendly information available. Could someone guide me on detecting multivariate outliers using the MCD method in STATA?
Thank you in advance for your help!
My student has a question that I cannot answer either. She is analyzing the effect of ICT on labor productivity using an 8-year panel dataset with 4 independent variables in EViews 13. Frankly, I was quite surprised that the R-squared in her results is 0.94 with only 2 significant variables. In theory, such a high R-squared in a simple regression model most likely points to a statistical problem. Recently, I asked her to re-estimate the model in STATA, and the results show an R-squared of only 0.51 with exactly the same coefficients.
I've searched for articles about this; some say that EViews might be wrong, and some say that STATA is wrong. Can someone explain what I should do and which software I should use?
Note:
1. Some articles suggest using the areg command in Stata to obtain a value similar to the EViews one, but I have doubts because areg is used for categorical regression in Stata and does not quite fit a panel regression model.
2. Some say that the EViews calculation is wrong.
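One possible reading of the gap, offered as a sketch and not a diagnosis of these particular results: Stata's xtreg, fe reports the within R-squared (computed after the unit effects are removed), whereas areg, like a regression with one dummy per unit, reports an R-squared that also counts the variation explained by those dummies. The slope coefficients are the same in both. The example uses the nlswork dataset from the Stata manuals as a stand-in; its variables are not the poster's.

```stata
* Stand-in data (not the poster's): NLS women's panel from the Stata manuals
webuse nlswork, clear
xtset idcode year

* Fixed-effects (within) estimator: header shows within/between/overall R-squared
xtreg ln_wage age tenure, fe

* Same slopes, but R-squared now includes what the idcode dummies explain
areg ln_wage age tenure, absorb(idcode)
```

Comparing the two headers side by side shows identical coefficients on age and tenure with clearly different R-squared values, which is the pattern described in the question.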
I am creating a forest plot for a meta-analysis and require assistance in adding a superscript after the Author/Year on the left side of the graph. Any help would be greatly appreciated.
(STATA)
Hello everybody.
I would like to build a Forest Plot graph that includes all this information from the table. I am unable to build the commands to include this information in the chart. I've tried to make other posts here, but I haven't been successful. Can someone help me?
I am sending the results table with this post.
I would like to set up the graph like this example:
Hello Everyone,
I am working with a "labelled" DCE which has 20 choice sets, 4 alternatives with 6 attributes. Each respondent answers 4 choice sets.
I have my data in long form (Respondent_ID, alternatives, attributes, ...) such that Respondent_ID 1 has 16 lines for 4 choice sets with 4 alternatives. When I cmset this data with
cmset Respondent_ID alternatives
I get the error "at least one choice set has more than one instance of the same alternative r(459);", which I guess is understandable as 1 Respondent_ID has multiple alternatives, but in my case 1 Respondent_ID has answered 4 choice sets.
How should I format the data to work in this case? I have seen multiple videos on YouTube and the Stata manual, but they all had only 1 choice set per respondent.
1 additional question:
I have 4 "labelled" alternatives A,B,C,D (with A being the base option)
Based on the choice scenario, I ask respondents to pick the best choice among the four, but I also gave a 5th option of "I will pick A but I will consider C or D if the cost is lower" (cost is one of the attributes).
How can I add this 5th option in my analysis?
Thanks a lot.
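On the cmset error above, a minimal sketch, assuming Stata 16 or later, where cmset also accepts a three-variable panel form (panel ID, a time-like variable, alternatives). The choice-set index choice_set is a hypothetical variable that the real data would need to contain or have constructed; the toy data below only mimic the layout of 16 rows per respondent.

```stata
* Toy layout: 2 respondents x 4 choice sets x 4 alternatives (illustration only)
clear
set obs 32
gen Respondent_ID = ceil(_n/16)
gen choice_set    = mod(ceil(_n/4) - 1, 4) + 1   // hypothetical choice-set index 1..4
gen alternatives  = mod(_n - 1, 4) + 1           // alternatives 1..4 within each set

* Panel form: each (Respondent_ID, choice_set) pair is one choice situation
cmset Respondent_ID choice_set alternatives
```

With the choice set declared as the time-like variable, each alternative appears only once per choice situation, which is what the r(459) message complains about in the two-variable form.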
When I try to run metatrim in STATA, I get the following error:
db metatrim
. metatrim _ES _seES, reffect eform
Note: default data input format (theta, se_theta) assumed.
subcommand meta __000005 is unrecognized
r(199);
Does anyone know how to solve it? I couldn't find anything on the internet.
I want to run python codes on the current variables of Stata using PyStata. For example:
- set obs 5
- gen a = _n
- python
- a
- end
But python does not recognize "a". Instead I should run:
- python
- a = [1, 2, 3, 4, 5]
- a
- end
How can I use the existing "a" instead of entering the values again?
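A minimal sketch of one way to do this, assuming a Stata version that ships the sfi Python module (Stata 16 or later): Stata variables are not Python objects, so they have to be read explicitly, for example with Data.get() from sfi.

```stata
clear
set obs 5
gen a = _n

python
from sfi import Data      # Stata's Python interface module
a = Data.get("a")         # copy the values of Stata variable a into a Python list
print(a)
end
```

Changes made to the Python list are not written back automatically; the same Data class has store-type methods for that direction.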
What is the best software for meta-analysis?
I usually use RevMan for meta-analysis. What are the limitations of RevMan compared to other software (STATA or CMA)?
I have been looking at transmission of world market prices to domestic African markets, in the face of rapid policy changes. My challenge is to model threshold VECM in stata. Does anyone have a suggestion?
My objective is to generate a composite index of risk perceptions.
I want to assess whether there are differences in risk perceptions of 2 subpopulations. Using Likert-scale responses (scale of 1-5 where 1=extremely serious and 5=not serious) to several Likert-type questions, I will build an index of risk perceptions for each subpopulation. I have 10 Likert-type questions and for each question (or risk perception category), I understand that I have to calculate the weighted mean of responses for each respondent, X:
X = {5 × F(5) + 4 × F(4) + … + 1 × F(1)} / N, where F is the frequency of responses and N the total number of respondents/observations. See:
5 = extremely serious, ... 1= not serious
I have a total of 14 questions (or risk perception categories). To get the index (or mean score) for each subpopulation, I read that I just need to get the ratio of each weighted mean X and the total number of questions (i.e. 14). Is this correct?
If so, how do I do that in Stata? This index will then be used in an ordered logit regression as one of the dependent variables.
Thank you in advance.
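A rough sketch of the Stata side, under stated assumptions: the frequency-weighted formula for X is the ordinary mean of the responses to one question, and a per-respondent index can be built as the row mean across items. The item names q1-q14, the group indicator, and the simulated responses are all hypothetical placeholders.

```stata
* Simulated stand-in data: 200 respondents, 14 items scored 1-5, 2 subpopulations
clear
set seed 12345
set obs 200
gen group = mod(_n, 2)                  // hypothetical subpopulation indicator
forvalues q = 1/14 {
    gen q`q' = ceil(5*runiform())       // simulated Likert responses
}

* Mean of one item = {5*F(5) + ... + 1*F(1)} / N for that item
summarize q1

* Per-respondent index: average over the 14 items
egen risk_index = rowmean(q1-q14)

* Compare the index across the two subpopulations
tabstat risk_index, by(group) statistics(mean sd n)
```

Whether an unweighted row mean is an adequate index (as opposed to, say, a factor score) is a modelling choice this sketch does not settle.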
I just received the latest TOC alert for Behavior Research Methods, and this article caught my eye:
I've not had time to read it yet, but judging from a quick glance, I wonder if the main "problem" might be that users do not always take time to RTFM* and therefore, do not understand what their software is doing? In any case, I thought some members of this forum might be interested.
Cheers,
Bruce
* RTFM = Read The Fine Manual ;-)
What is the command in Stata 17 for Bartlett's test of sphericity?
I'm doing PCA and factor analysis.
(Epidemiological analysis)
Colleagues, does anyone know which commands to use to run a mediation analysis (with direct and indirect effects) in Stata?
My exposure is a categorical variable, the outcome is a dichotomous variable.
The mediating variable is numerical.
Dear respected scholars,
I am trying to estimate average treatment effects on the treated (ATT) using IPWRA, PSM, and ESR models. However, I struggle to estimate ATT using the ESR model in STATA/Stata software.
The outcome variables are production cost and net returns.
The treatment variable is 1 if the respondent has adopted the technology and 0 otherwise.
The control variables are Age, Education, Experience, Farm size, Land ownership, and Land typology.
I would be grateful if anyone could guide me with the procedure.
Thank you very much.
Sincerely,
Faruque.
I am looking for a graphical tool like visual basic software to define R codes for interactive graphical buttons and text boxes.
For example, I want to design a Windows application with a graphical interface for calculating body mass index (BMI). I want to have two boxes for weight and height input and a button to run the calculation. When the button is clicked, I want the code below to run.
BMI <- box1/(box2^2)
I've been searching for PSTR model STATA codes. Please assist if you have any idea or information about it.
I will appreciate any assistance in this regard.
Thanks in anticipation
Shaurav
There are many statistical software packages that researchers commonly use in their work. In your opinion, which one is better? Which one would you recommend starting with?
Your opinions and experience can help others, in particular younger researchers, make a selection.
Sincerely
I am trying to run a spatio-temporal autoregressive model (STAR). Therefore I need to create a spatial weight matrix W with N × T rows and N × T columns to weight country interdependencies based on yearly trade data. Could someone please tell me how to create such a matrix in R or Stata?
Hi, I am currently writing my master's thesis on early indicators of bank failure. For this I have calculated probability of default as my dependent variable and have around 15 different financial ratios as explanatory variables. I have also included two measures of interest sensitivity as possible independent variables.
My data consists of 125 companies over 20 years. I'm using STATA and need help with how I should format my data from Excel. I'm also unsure which kind of regression is best suited for my data. I've tried reading Econometric Analysis of Cross Section and Panel Data by Wooldridge, but I'm feeling a bit lost.
Attached is the Excel file.
Thank You for the help!
As an Indonesian researcher in public health with a medical background and a Taiwan doctorate-level education, I am confident that I can contribute valuable insights and expertise to any ongoing research projects. My diverse background has equipped me with a unique set of skills and knowledge that I believe would make me an excellent collaborator for your research paper.
Having been trained in a highly respected academic institution in Taiwan, I have developed a rigorous approach to research that emphasizes attention to detail, critical thinking, and creative problem-solving. I have experience using various statistical software programs such as SPSS and STATA.
Furthermore, my medical background has given me a deep understanding of health-related issues and their impact on communities. I have a particular interest in public health, and I am passionate about finding evidence-based solutions to the most pressing health challenges facing our society today.
As a co-author, I am willing to take on any necessary tasks to contribute to the project's success. I am especially comfortable with data analysis and can use my expertise to generate meaningful insights from complex data sets. Additionally, I am adept at academic writing, and I can help ensure that the paper adheres to the highest standards of clarity and coherence.
Overall, I believe that my skills, knowledge, and passion make me an excellent candidate for collaboration on any public health research project. I am excited about the opportunity to work with other researchers to generate impactful findings that can improve health outcomes and contribute to the scientific community's collective knowledge.
Don't let your research project fall short of its full potential. Reach out to me today at yosephsamodra[at]gmail.com to explore how we can collaborate to generate impactful findings in public health research. Let's work together to make a difference in the world of healthcare.
Regards,
Yoseph
Can clustered standard errors be used for cross-sectional data? If so, what is the corresponding command in Stata?
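A minimal syntax sketch: in Stata, cluster-robust standard errors are requested with the vce(cluster clustvar) option, and nothing in the option requires panel data. The auto dataset and the use of rep78 as the cluster variable are stand-ins chosen only to show the syntax; with so few clusters the resulting standard errors would not be reliable in practice.

```stata
sysuse auto, clear

* Default (homoskedastic) standard errors
regress price mpg weight

* Cluster-robust standard errors; rep78 stands in for the real grouping
* variable (e.g. village, school, firm)
regress price mpg weight, vce(cluster rep78)
```

The point estimates are unchanged by the option; only the standard errors and the quantities derived from them differ.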
Hi there!
I'm following the method by Branko Miladinovic et al. "Trial sequential boundaries for cumulative meta-analyses".
I'm using the following code: metacumbounds ln_hr se_ln_hr N, data(loghr) effect(r) id(study) surv(0.40) loss(0.00) alpha(0.05) beta(0.20) is(APIS) graph spending(1) rrr(.15) kprsrce(StataRsource.R) rdir(C:\Program Files\R\R-4.2.1\bin\x64\) shwRRR pos(10) xtitle(Information size)
but the following error is popping up:
File bounds.dta not found.
Any guess what's causing the error and how to fix it?
Thanks in advance
Hi, everyone. I am calculating NRI and IDI using STATA. I want to compare the discrimination ability of two separate models (Model A and Model B). It seems that the nri program in Stata only calculates an NRI that reflects the discrimination between a base model and the model in which a new marker is added to the base model. How should I calculate an NRI that reflects the discrimination of two models that include different covariates?
Thanks in advance!
I am doing a choice experiment using a random parameter logit model using the mixlogit command "mixlogit choice price no, rand(a b) group(group) id(id)", how do I calculate the marginal effects?
I tried the following commands and the result seems to be incorrect: the RPL coefficient is positive, but the marginal effect calculated this way is negative. Any help will be much appreciated.
preserve
set seed 12345
gen rnd = runiform()
bysort id groupid (rnd): gen alt=_n
replace a = 0 if alt==1
mixlpred p0, nrep(500)
replace a=1 if alt==1
mixlpred p1, nrep(500)
gen p_diff = p1-p0
sum p_diff if alt==1
restore
I am using -corr2data- to simulate raw data from a correlation matrix. However, some of the variables that I need should be binary. How can I convert them?
Is it possible to convert the higher values to 1 (and the others to 0) in such a way as to reach the same mean? How should I do it?
Is there a way in R?
(I want to perform a GSEM on a correlation matrix)
(I know -faux- package in R. But my problem is that just some of [not all of] my variables are binary.)
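One simple approach, sketched under assumptions: cut the continuous variable at the percentile that leaves the desired share of ones. The correlation matrix, the variable names, and the 30% target below are hypothetical. Note that dichotomizing weakens the correlations with the other variables, so the binary variable will not reproduce the original matrix exactly.

```stata
clear
* Hypothetical 3x3 correlation matrix
matrix C = (1, .5, .3 \ .5, 1, .4 \ .3, .4, 1)
corr2data x1 x2 x3, n(1000) corr(C)

* Make x3 binary with about 30% ones: cut at its 70th percentile
_pctile x3, percentiles(70)
gen byte b3 = x3 > r(r1)

summarize b3                 // mean is the share of ones
correlate x1 x2 x3 b3        // correlations with b3 are attenuated relative to x3
```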
I'm using STATA 17 with plugin & command pbis bvar cvar.
I'm wondering whether there is another command or macro that would run the calculation for multiple bvar simultaneously and put the results into a table or list? My research dataset is small (14 items & 30 candidates), so it's certainly feasible to do the calculation for each individually (and to correct the test score to get the corrected version as well), but I'm wondering what I would do in the future with a larger dataset where this wasn't a practical option...
Dear colleagues,
I am looking for a package or command that performs time series decomposition in STATA. So far I have not found anything. An example can be found here: https://towardsdatascience.com/an-end-to-end-project-on-time-series-analysis-and-forecasting-with-python-4835e6bf050b at figure 5.
Looking forward to your valuable comments.
Dear all,
I have a panel dataset in which countries are identified by string names. I want to generate a specific numeric ID for each country.
E.g.
Country ----------ID
A----------------------1
A----------------------1
B----------------------2
B----------------------2
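A short sketch of one common way to do this, using egen's group() function, which assigns consecutive integers to the distinct values of a variable. The toy data reproduce the layout shown above; the variable names Country and ID are taken from the example.

```stata
clear
input str10 Country
"A"
"A"
"B"
"B"
end

* Consecutive integer ID per distinct country name (alphabetical order)
egen ID = group(Country)
list, sepby(ID)
```

The resulting ID can then be used as the panel variable, for example in xtset ID year if a year variable exists.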
Dear community,
What is the code to perform a Fama-MacBeth regression in Stata? I understand how this works theoretically, but I do not understand how this is implemented in Stata. My variables are the 5 factors of the Fama French 5 factor model and 25 portfolios double sorted on size and book-to-market value of equity.
Additionally I have another question as well. That is, in order to test the Fama French 5 factor model, you just regress the factors on one of the portfolios right? In other words, is the correct code to test the 5 factor model:
- tsset date (in order to declare dataset to be time-series data with date as the time variable)
- reg me1bm1 markt smb hml rmw cma (where me1bm1 is the portfolio with lowest marketcap and lowest B/M and the other 5 variables are the 5 factors).
When I use this code I get very strange results, namely that almost all intercepts are significant (which is in contradiction with the Fama French papers). Hence, I am wondering whether there is something wrong with this code. I hope you all could help me with these 2 questions!
Yours truly,
Niek
Model 1
. regress ir_1yr eq_rt fi_rt pe_rt re_rt eq_al fi_al pe_al re_al ir_1yr_df

      Source |       SS           df       MS      Number of obs   =     1,857
-------------+----------------------------------   F(9, 1847)      =   9215.93
       Model |  20.3975564         9  2.26639516   Prob > F        =    0.0000
    Residual |  .454216753     1,847  .000245921   R-squared       =    0.9782
-------------+----------------------------------   Adj R-squared   =    0.9781
       Total |  20.8517732     1,856  .011234792   Root MSE        =    .01568

------------------------------------------------------------------------------
      ir_1yr | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       eq_rt |   .5248209   .0029915   175.44   0.000     .5189538    .5306879
       fi_rt |   .2278904   .0075746    30.09   0.000     .2130347    .2427461
       pe_rt |   .0831074   .0033364    24.91   0.000     .0765638     .089651
       re_rt |   .0703441   .0034776    20.23   0.000     .0635237    .0771644
       eq_al |   .0708922   .0047206    15.02   0.000      .061634    .0801504
       fi_al |   .0354871   .0064044     5.54   0.000     .0229266    .0480477
       pe_al |   .1004973   .0092323    10.89   0.000     .0823904    .1186043
       re_al |   .0333808   .0116003     2.88   0.004     .0106297     .056132
   ir_1yr_df |   .9498746   .0149831    63.40   0.000     .9204891    .9792601
       _cons |   -.050006   .0041722   -11.99   0.000    -.0581887   -.0418233
------------------------------------------------------------------------------
Model 2 - New variable is pe_al*pe_al (want to assess if pe_al turns at -b/2a)
. regress ir_1yr eq_rt fi_rt pe_rt re_rt eq_al fi_al pe_al c.pe_al#c.pe_al re_al ir_1yr_df

      Source |       SS           df       MS      Number of obs   =     1,857
-------------+----------------------------------   F(10, 1846)     =   8365.23
       Model |  20.4015612        10  2.04015612   Prob > F        =    0.0000
    Residual |  .450211929     1,846  .000243885   R-squared       =    0.9784
-------------+----------------------------------   Adj R-squared   =    0.9783
       Total |  20.8517732     1,856  .011234792   Root MSE        =    .01562

---------------------------------------------------------------------------------
         ir_1yr | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
          eq_rt |   .5241271    .002984   175.65   0.000     .5182747    .5299794
          fi_rt |   .2265008    .007551    30.00   0.000     .2116914    .2413101
          pe_rt |   .0838555   .0033277    25.20   0.000      .077329     .090382
          re_rt |   .0696112   .0034678    20.07   0.000     .0628098    .0764125
          eq_al |   .0690265   .0047235    14.61   0.000     .0597626    .0782904
          fi_al |   .0311132   .0064685     4.81   0.000     .0184268    .0437995
          pe_al |   .0195947     .02198     0.89   0.373    -.0235136     .062703
                |
c.pe_al#c.pe_al |   .3383581   .0834983     4.05   0.000     .1745971    .5021191
                |
          re_al |   .0293011    .011596     2.53   0.012     .0065584    .0520437
      ir_1yr_df |   .9509827   .0149234    63.72   0.000     .9217142    .9802512
          _cons |  -.0440566   .0044066   -10.00   0.000    -.0526991    -.035414
---------------------------------------------------------------------------------
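On the turning point mentioned in the Model 2 heading: for a quadratic a·x² + b·x, the slope changes sign at x = -b/(2a). A sketch of how that point, with a delta-method standard error, could be obtained using nlcom after regress. The auto dataset and its weight variable are stand-ins for the poster's data; the last line only plugs in the Model 2 coefficients printed above.

```stata
* Stand-in example: quadratic in weight
sysuse auto, clear
regress price c.weight##c.weight

* Turning point -b/(2a) with a delta-method standard error and CI
nlcom turning_point: -_b[weight] / (2*_b[c.weight#c.weight])

* Plugging in the Model 2 estimates (b = .0195947, a = .3383581)
display -.0195947 / (2*.3383581)      // about -.029
```

After the poster's own Model 2, the analogous call would use _b[pe_al] and _b[c.pe_al#c.pe_al]. Whether the turning point lies inside the observed range of pe_al is worth checking with summarize pe_al.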
I am using two-step system GMM to find the impact of two independent business-related variables on the governance of an economy.
Dataset size: 8 years of data for 71 countries
I am not using any control variables in the main models but am reporting 20 different results for different types of countries using the independent variables and their one-year lags. For example, for the full dataset, I checked the impact of x1, x2, L1.x1, and L1.x2. Then I ran a similar regression only for the developed countries, and so on.
If I find fairly consistent results and discuss the impact of different control variables only in an additional analysis, would that suffice?
(Prior literature used 2-4 control variables but they used only OLS. Many of those papers were published in ABDC A/B ranked journals).
Thanks for your cooperation.
I am trying to fit a multilevel binary logistic regression in STATA using the svy command. Is it necessary to estimate information criteria? If yes, which command in Stata can I use with complex sample data?
Dear colleagues,
I am testing whether a variable mediates or moderates a relationship across two groups in a SEM. All variables are continuous and latent.
Is there a way I can model latent interactions in SEM and still be able to get an LR test and overall goodness-of-fit indicators, so that I can compare the moderation model with the mediation one?
Has anyone ever attempted this with the med4way package?
Thanks in advance
Dear All
I would like to estimate an ARDL model on time series data using STATA. Could you please provide the STATA commands and the steps needed to run an ARDL model on time series data?
With best regards
Addisu
I have a Labor force survey from 2007-2016 (10 years). The survey is not panel data (respondents are selected from different strata, and N differs across the years). N = 12,581 | 12,823 | 9,561 | 6,167 | 9,269 | 8,865 | 12,319 | 12,544 | 12,731 | 9,137
I am interested in the factors (X) that influence the education-job mismatch (Y), which I plan to estimate with a probit model and then the Mincer equation. My questions are:
1. Should I use pooled data analysis? If so, how do I merge the 10 years of data together?
2. Should I analyse the 10 yearly surveys separately first, to see whether my (assumed) model is reasonably stable over time?
3. It is possible that the same person completed the survey in more than one year, but I don't know how to match people directly or turn the data into a panel, given the large sample and the number of years.
Thank you.
Can it be made linear, so that linear regression could be used?
Or should I estimate it with a nonlinear estimator?
We are conducting a meta-analysis to compare a new non-invasive measure to a gold-standard measure. Data within the retrieved studies are reported in the form of Pearson correlation coefficients and limits of agreement (Bland–Altman method). We wonder if there is a method to pool the LoAs using RevMan or STATA, and what types of data we need to obtain such pooled estimates?
In confirmatory factor analysis (CFA) in Stata, the first observed variable is constrained by default (beta coefficient = 1, mean of latent variable = constant).
I don't understand this, because other software packages report beta coefficients for all observed variables.
So, I have two questions.
1- Which variable should be constrained in confirmatory factor analysis in stata?
2- Is it possible to have a model without a constrained variable like other software packages?
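A hedged sketch of two things that may help here, based on my understanding of sem's default normalization: the loading fixed at 1 only sets the scale of the latent variable, and the scale can instead be set by constraining the latent variance to 1, in which case (as I understand it) sem does not apply the unit-loading constraint. Separately, the standardized option reports standardized loadings for every indicator. The dataset sem_1fmm from the Stata SEM manual is used as a stand-in.

```stata
* Stand-in data from the Stata [SEM] manual: one factor X, indicators x1-x4
webuse sem_1fmm, clear

* Default: first loading (x1 <- X) constrained to 1
sem (x1 x2 x3 x4 <- X)

* Alternative scaling: fix the latent variance at 1 instead
sem (x1 x2 x3 x4 <- X), variance(X@1)

* Standardized loadings for all indicators
sem, standardized
```

The two scalings describe the same model, so overall fit should be identical; only the metric of the loadings changes.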
Several software packages can estimate threshold panel regressions, such as Matlab, R, and Stata. I'm wondering which one is more accurate and easier to use (dynamic balanced panel data).
Regards
Maryam
Hi All,
Does anyone know how to obtain efficiency scores derived from a translog cost function like the one attached, where
lnTCit is the natural logarithm of total costs used by bank i in year t.
Total cost (TC) is measured as the sum of interest expense and noninterest expense.
pnit is the price of the nth input employed by bank i in year t,
ynit is the amount of the nth output produced by bank i in year t.
α, β, δ, and λ are coefficients to be estimated.
εit is stochastic or aggregate disturbance term.
What software can be used to deal with this? And what does the "m" denote?
Many thanks for your answers.
The Malmquist-Luenberger index is used to evaluate energy efficiency by considering unfavorable output. Is this index accessible by Stata software?
What are the newer unit root tests that allow for a structural break? I have seen a structural break test in EViews 10, but I don't know its name. I have to write up its theoretical background. Please share your valuable thoughts.
Thank you!
Could anybody explain whether I have to perform a unit root test when estimating a system GMM model? I have data from 46 firms for 12 years, but when I run the Levin-Lin-Chu test in Stata, it shows the following: "Levin-Lin-Chu test requires strongly balanced data".
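On the error message only (not on whether a unit root test is needed before system GMM): xtunitroot llc requires a strongly balanced panel, while the Fisher-type test in the same xtunitroot family accepts unbalanced panels. A sketch using the pennxrate dataset from the Stata manuals as a stand-in, with a few observations dropped to mimic an unbalanced panel; the lag choice is arbitrary.

```stata
* Stand-in panel from the Stata manuals (ships already xtset)
webuse pennxrate, clear

* Make the panel unbalanced by dropping the first years of one unit
quietly xtset
sort `r(panelvar)' `r(timevar)'
drop in 1/5

* xtunitroot llc would now stop with the "strongly balanced" error;
* the Fisher-type test (ADF version) runs on unbalanced panels
xtunitroot fisher lnrxrate, dfuller lags(2)
```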
After years using STATA, I am thinking of making a leap and learn Python.
Previously, I learned R and found that Rcmdr is a good package for learning R quickly. Rcmdr's point-and-click style, which generates R script, is really handy for beginners.
I wonder if a similar package or tool like Rcmdr is available for learning Python?
Hi,
To understand participants' life satisfaction, 5 questions were asked. Each question has 5 options, from disagree to agree. Please suggest how to create a single variable from these 5 categorical (ordinal) variables through MCA, preferably in STATA or else in SPSS.
I am working on my dissertation and I am trying to reproduce the GARCH model that I have attached. With those models I am trying to investigate whether any financial assets have a significant effect on Ethereum prices.
I have collected data, namely the returns on Ethereum, the FTSE index, Gold Cash and Future, and Fed interest rates.
I am struggling with the code for GARCH(1,1) with an AR(2) process: for my variance equation, I have to include in the expression the exponential of a series of independent variables, so that I can find the respective coefficients. Does anybody know which code I should add to the function to include the exponential of those values (either in Stata or EViews)?
Which one of these multilevel models is better? Should the random equation variables also be added as covariates?
Model A: with random equation variables as covariates
Model B: without random equation variables as covariates
* Model A gave the same results as a routine ologit. So, if model A is better than model B, what is the rationale for using multilevel mixed models (given the same result as ologit)?
Hi all. We are planning to conduct a study comparing new diagnostics tools that will be used at the doctor's office and compare this with the gold standard that is commonly used in the lab. The main objective of this study is to look at the sensitivity and specificity of this new tool and our aim is to have at least a sample size that has at least 90% power to detect 95% sensitivity or worse as compared to the gold standard. However, when we ask the lab, the machines/equipment in the lab that will be used as the gold standard reference for this study, don't have data on sensitivity/specificity since this machine/equipment runs quantitative assay and does not require sensitivity/specificity validation. On the contrary, the new machines that we want to test is using the semi-quantitative assay. I will be using Stata to calculate the sample size and if anyone can help/guide me, it will be really helpful. Thanks.
Hi
I'm currently writing my master's thesis and I'm analyzing data from a school-based intervention study. The statistical analysis is conducted with STATA 17.
We want to examine whether the implementation quality of a school-based intervention has an effect on the effectiveness of the intervention.
As I have just started to work with mixed models, I'm not sure whether my model is correct.
My statistical approach is adopted from a study by Humphrey et al. (2018):
I further rely on the book by Twisk (2019). You can access the relevant part here: https://1drv.ms/u/s!AggQKHp7NGMFg6NMkBiK-rjcZ-EodQ?e=23xJmS
Stress measurement
z_k_sskj_vul1: vulnerability to stress at time 1 (continuous, z-transformed)
z_k_sskj_vul2: vulnerability to stress at time 2 (continuous, z-transformed)
Implementation Quality
resp_high: high responsiveness of children (1 if high, otherwise 0)
resp_med: medium responsiveness of children (1 if medium, otherwise 0)
resp_low: low responsiveness of children (1 if low, otherwise 0)
qual_high: high responsiveness of children (1 if high, otherwise 0)
qual_med: medium responsiveness of children (1 if medium, otherwise 0)
qual_low: low responsiveness of children (1 if low, otherwise 0)
I choose a two-level approach with students on level 1 and classrooms on level 2.
First I checked the ICC in order to assess the between-group variance. Here I'm not sure whether to include z_k_sskj_vul1 in the model.
* unconditional model for z_k_sskj_vul2 (check for between-classroom variance)
mixed z_k_sskj_vul2 || classroom:
estat icc
Second, I built this model in order to assess whether the effect differs for high or medium implementation quality in comparison with low implementation quality. I included z_k_sskj_vul1 to control for baseline measurements.
* mixed model to examine implementation quality
mixed z_k_sskj_vul2 resp_high resp_med qual_high qual_med z_k_sskj_vul1 || classroom:
I would be really grateful if someone could just check if my model is correct.
Thank you
David
I have an ordinal dependent variable with six ranks (these are grades from a test) and several interval independent variables (one independent variable, sex, is nominal). I would like to determine the relative importance of each independent variable. Which method(s) would be appropriate? I have access to SPSS, STATA and R (possibly also MATLAB) and I would be grateful to know if any appropriate methods are implemented in these packages.
If you were forced to pick between the two, which software package would you prefer for analysis of medical data? Some commonly listed pros of SPSS are ease of use, robust handling of large data sets, access to a broad variety of analyses, and decent generation of figures and tables. STATA, on the other hand, is claimed to offer more user control and better handling of typical medical data (through better implementation of cluster analysis, survival analysis, imputation of missing values, etc.).
There are obviously pros and cons inherent in both, and there could well exist better alternatives (such as R). But if you had to pick between SPSS and STATA for the purpose of medical research, which would it be, and why?
Dear everyone,
I am currently trying to estimate the impact of adopting a technology on farmers' irrigation cost through an endogenous switching regression model (movestay command) using STATA/Stata software.
However, the rho0 and rho1 values derived from my analysis show a negative but significant sign for both groups (adopters and non-adopters).
I would therefore be grateful if anyone could help me interpret these values (please see the attached files).
Thank you very much.
I have a case-control study with 200 cases and 200 controls (d == 0 for controls and d == 1 for cases). Two tests y1 and y2 are applied to each case and control and the results are coded 1 for a positive test result and 0 for a negative test result. The estimated prevalence of the disease in the population, p = 0.15.
I want to calculate the PPV and NPV of these tests, the 95% CI of the predictive values, and compare the PPV and NPV of tests y1 and y2.
Can someone please help me with some Stata code for the same?
I am attaching the data file for the exercise. This is hypothetical data generated for an academic exercise only.
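A starting-point sketch for the point estimates only, under stated assumptions. In a case-control sample the ratio of cases to controls is fixed by design, so predictive values cannot be read off the 2×2 table directly; one standard route is to estimate sensitivity and specificity from the sample and combine them with the assumed prevalence via Bayes' theorem. The simulated data below are a hypothetical stand-in for the attached file, and confidence intervals and the y1 versus y2 comparison are not covered.

```stata
* Simulated stand-in: 200 controls (d == 0), 200 cases (d == 1), one test y1
clear
set seed 2024
set obs 400
gen byte d  = _n > 200
gen byte y1 = cond(d, runiform() < .85, runiform() < .10)

scalar prev = 0.15                       // assumed population prevalence

quietly summarize y1 if d == 1
scalar sens = r(mean)                    // sensitivity: P(y1 = 1 | d = 1)
quietly summarize y1 if d == 0
scalar spec = 1 - r(mean)                // specificity: P(y1 = 0 | d = 0)

* Bayes' theorem with the external prevalence
scalar ppv = sens*prev / (sens*prev + (1 - spec)*(1 - prev))
scalar npv = spec*(1 - prev) / (spec*(1 - prev) + (1 - sens)*prev)

display "sens = " %5.3f sens "  spec = " %5.3f spec
display "PPV  = " %5.3f ppv  "  NPV  = " %5.3f npv
```

Repeating the block with y2 in place of y1 gives the second test's values; interval estimates would need, for example, a bootstrap or a delta-method calculation on top of this.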
Hello Stata users. Please help.
When running Cronbach's Alpha test for internal consistency...
I have some missing values in the data set coded as 999.
Are they included in the calculations, or are they dismissed by Stata by default?
In other words, do I have to set some option in Stata before running the Cronbach's alpha calculations so that the software dismisses missing values?
Could anybody clarify? Many thanks in advance.
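A short sketch of the usual handling: Stata only treats system or extended missing values (., .a, ...) as missing, so a numeric code such as 999 is used as a real score until it is recoded, for example with mvdecode. The item names and simulated data are hypothetical.

```stata
* Simulated stand-in: 5 correlated items scored 1-5, with some 999 codes
clear
set seed 1
set obs 100
gen base = ceil(5*runiform())
forvalues j = 1/5 {
    gen item`j' = min(5, max(1, base + round(rnormal())))
    replace item`j' = 999 if runiform() < 0.05   // mimic the 999 missing code
}

alpha item1-item5                 // 999 is treated as a valid score here
mvdecode item1-item5, mv(999)     // recode 999 to system missing (.)
alpha item1-item5                 // default: uses all available pairs
alpha item1-item5, casewise       // alternative: listwise deletion
```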
I have average household income data for the 26 statistical divisions of Turkey (NUTS 2). I need to break this data down and estimate it at the city level (81 subdivisions, NUTS 3) or town level (LAU) to assess the economic conditions of each local area. I have demographic data at the city and town levels. I did some research in the small area estimation field, but I couldn't find a method that offers a solution to the research question given above. Can you please recommend any book, article, or method for this case? Thank you very much.
I'm doing my research on modelling of barriers (limitations) and determinants (social constraints) of maize farmers' adoption of adaptation options to curb the negative impacts of CVC (climate variability and change) using the MNL model.
For the results of the MNL regression model, I'm using the assumptions of IIA (the Hausman test), the VIF, and the contingency coefficient that need deep result interpretation.
But in the STATA software, both the terms contingency coefficient and correlation coefficient appear, of course with different values.
What is the conceptual difference between the contingency coefficient and correlation coefficient? What best example do we have to differentiate them for better understanding?
THANK YOU SO MUCH!!!
Hello,
I have two data sets -
1) 10 years of data on what percentage of the total patients in a hospital emergency room are diabetic
2) 10 years of data on what percentage of the total patients admitted in the same hospital are diabetic.
While I can easily compare the above data in the form of two linear trend lines (X-axis - years, Y-axis - Percentage of total patients being diabetic ), I wanted to ask how to statistically compare the two trend lines?
Thank you
Happy New Year! I need help.
I have missing values in the emission (panel) data, and they have been replaced by rnormal(sample mean, sample sd). Unfortunately, the sd is so large that some of these missing values are replaced with negative values. Since emissions can't be negative, I want to re-replace these negative values with rnormal(sample mean, sample sd) until they are all positive. Unfortunately, the loop stops after replacing negative values a single time.
Just to let yall know that although it may sound as if I am trying to fabricate data, it is not the case here. I am simply trying to see how drastically different the tsplot would be with "complete" data when compared to the original data.
The following is my code:
tsset id date, daily
tsfill, full
foreach var in emitCOD emitDUST {
    egen `var'mean = mean(`var'), by(Province)
    egen `var'sd = sd(`var'), by(Province)
    tsset id date
    bysort id: carryforward `var'mean, replace
    bysort id: carryforward `var'sd, replace
    gsort id-date
    bysort id: carryforward `var'mean, replace
    bysort id: carryforward `var'sd, replace
    forvalues i = 1/2 {
        gen `var'_`i' = `var'
        replace `var'_`i' = rnormal(`var'mean, `var'sd) if `var' == .
        while (`var'_`i' < 0) {
            replace `var'_`i' = rnormal(`var'mean, `var'sd)
            if `var'_`i' >= 0 {
                continue, break
            }
        }
    }
}
Thanks.
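A possible explanation, offered as a sketch rather than a tested fix for this dataset: in Stata, `while` and the `if` command evaluate their expression once, and a bare variable name there refers to the value in the first observation only, so the condition is never checked row by row. The `replace` inside the posted loop also has no `if` qualifier, so it would overwrite every observation. One alternative is to count the remaining negative imputed values and redraw only those until the count reaches zero. Variable names follow the post; the local macro `nneg` is introduced here for illustration.

```stata
* Redraw negative imputed values until none remain (shown for emitCOD only)
* Assumes emitCODmean and emitCODsd were created as in the post
gen double emitCOD_1 = emitCOD
replace emitCOD_1 = rnormal(emitCODmean, emitCODsd) if missing(emitCOD)

count if emitCOD_1 < 0 & missing(emitCOD)
local nneg = r(N)
while `nneg' > 0 {
    * Redraw only the imputed values that are still negative
    replace emitCOD_1 = rnormal(emitCODmean, emitCODsd) if emitCOD_1 < 0 & missing(emitCOD)
    count if emitCOD_1 < 0 & missing(emitCOD)
    local nneg = r(N)
}
```

The accepted draws follow a normal distribution truncated at zero, so their average will sit above the sample mean used to generate them.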
I want to learn how to handle and analyse panel data, and I am looking for references (handbooks, articles, or any other material, though I would prefer a good handbook), preferably using Stata and focused on social science applications. I look forward to your suggestions. Thanks in advance.
I am using "mvprobit" in Stata; however, it is not clear to me how I can estimate marginal effects after this. Any help will be much appreciated.
Hey all,
I am conducting a CFA on a well-known theory, the technology acceptance model. For one of the latent variables, I am getting both the AVE and the CR equal to approximately .75.
I believe this is because only two observed variables load on this latent variable. As this latent variable is still part of the theory and I cannot delete it, can I go ahead with my SEM and report this as a limitation? Thank you.
Hello,
I am using Stata 13, and I want to download data from Yahoo Finance into Stata using the getsymbols command:
. ssc install getsymbols
. getsymbols MSFT AAPL FB, yahoo clear
I encounter the following error:
<istmt>: 3499 m_getsymbol() not found
I don't know what it means. Can anyone tell me how I can solve this?
Thank you.
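A hedged troubleshooting sketch: error 3499 is Mata's "function not found" error, meaning Stata could not locate the compiled Mata function that getsymbols calls. Two possible causes, both assumptions here, are an incomplete installation and a Mata library built for a newer Stata release than the one in use. The commands below reinstall the package from SSC and rebuild Mata's library index; whether this resolves the error on Stata 13 is not confirmed.

```stata
* Reinstall the package from SSC, replacing any partial installation
ssc install getsymbols, replace

* Rebuild the index of Mata libraries so newly installed ones are found
mata: mata mlib index

* Confirm which copy of the ado-file Stata will run
which getsymbols

* Retry the download
getsymbols MSFT AAPL FB, yahoo clear
```

If the error persists, the package's help file or its author would be the place to check which Stata version it requires.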
Dear community,
As part of my master's thesis I am going to test a structural equation model, which I attach below. It is a multi-stage moderated mediation. My university offers me the chance to buy a student license for SPSS or Stata, so it would be great if you could share your experience of using these packages for this kind of research. Suggestions for other software packages are of course also welcome :-)
I would like to know 2 specific things:
1. Which software would you recommend for testing the model, and why?
2. What would be the necessary steps in the recommended tool?
Thanks a lot!!!
Best,
Andrea
My friend and I disagree about a statistical approach. Suppose he has 4 constructs, each measured by 3 variables. He thinks that, to test mediation, he can average the variables of each construct, use those mean scores as the constructs, and run the mediation test. My argument is that he has to obtain the latent variables through factor loadings and fit a structural equation path model. Are both methodologies valid? Which is better?
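The two approaches can be put side by side in a short sketch. It assumes hypothetical item names and, for brevity, three constructs rather than the four mentioned: a predictor (x1-x3), a mediator (m1-m3), and an outcome (y1-y3). The first model treats the mean scores as if they were measured without error; the second estimates latent variables from the items and so accounts for measurement error in the structural paths.

```stata
* Hypothetical items: x1-x3 (predictor), m1-m3 (mediator), y1-y3 (outcome)

* Approach 1: average the items of each construct, then fit a path model
egen xbar = rowmean(x1 x2 x3)
egen mbar = rowmean(m1 m2 m3)
egen ybar = rowmean(y1 y2 y3)
sem (mbar <- xbar) (ybar <- mbar xbar)
estat teffects      // direct, indirect, and total effects

* Approach 2: latent variables measured by their items (names capitalized)
sem (X -> x1 x2 x3) (M -> m1 m2 m3) (Y -> y1 y2 y3) ///
    (M <- X) (Y <- M X)
estat teffects
```

Comparing the indirect effects from the two runs gives a rough sense of how much the averaging shortcut changes the conclusion in a given dataset.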
I have tried EViews, but I have come across many research papers relevant to my research that used Stata. Many upcoming workshops also require Stata to be installed. How can I get Stata?
Hi, I am looking to estimate willingness to pay from a choice model, but for the given alternatives relative to a base alternative rather than for the product attribute variables. I have run a mixed logit model and have coefficients for the attribute variables, followed by coefficients for the case-specific variables for each alternative.
Hello everyone.
For my master's dissertation I'm researching WTP (willingness to pay) for carbon offsets. I examine WTP as the dependent variable in a double-bounded dichotomous format. I have read a lot of literature that refers to an "interval regression model", which can be estimated in Stata. However, I have no access to that software through my university and therefore need to use the double-bounded dichotomous choice format in SPSS.
The question is, how do I apply the double-bounded DC as the dependent variable for regression in the SPSS software?
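For reference, the interval regression that the literature describes can be sketched in Stata as follows. The construction of the lower and upper WTP bounds from the two answers is the part that carries over to any software. All variable names are hypothetical: bid1 is the first bid, bid_hi and bid_lo are the follow-up bids, ans1 and ans2 are the two responses (1 = yes, 0 = no), and income and age are example covariates.

```stata
* Build the WTP interval implied by each pair of answers
gen double wtp_lo = .
gen double wtp_hi = .

replace wtp_lo = bid_hi if ans1 == 1 & ans2 == 1   // yes-yes: WTP >= higher bid
replace wtp_lo = bid1   if ans1 == 1 & ans2 == 0   // yes-no: first bid <= WTP
replace wtp_hi = bid_hi if ans1 == 1 & ans2 == 0   //         WTP <= higher bid
replace wtp_lo = bid_lo if ans1 == 0 & ans2 == 1   // no-yes: lower bid <= WTP
replace wtp_hi = bid1   if ans1 == 0 & ans2 == 1   //         WTP <= first bid
replace wtp_hi = bid_lo if ans1 == 0 & ans2 == 0   // no-no: WTP <= lower bid

* Missing lower bound = left-censored; missing upper bound = right-censored
intreg wtp_lo wtp_hi income age
```

In this sketch the no-no responses are left unbounded below; setting wtp_lo to 0 for them is a common alternative when negative WTP is ruled out.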