# Econometric Applications - Science topic

A group for discussion and experience sharing on applied econometrics issues.

Questions related to Econometric Applications

What is the difference between mediating and moderating variables in panel data regression?

There are several independent variables and several dependent variables. I want to see how those independent variables affect the dependent variables. In other words, I want to analyze:

[y1, y2, y3] = [a1, a2, a3] + [b1, b2, b3]*x1 + [c1, c2, c3]*x2 + [e1, e2, e3]

The main problem is that y1, y2, y3 are correlated. An increase in y1 may lead to decreases in y2 and y3. In this situation, what multivariate multiple regression models can I use, and what are the assumptions of those models?

Hi all, I'm doing my final-year project (FYP), titled "The determinants of healthcare expenditure, 2011-2020." Here are my variables: government financing, GDP, total population.

The first model is: healthcare expenditure_t = B0 + B1*gov financing_t + B2*gdp_t + B3*population_t + e_t

The second, causal-relationship model is: healthcare expenditure per capita_t = B0 + B1*gdp per capita_t + e_t

Is it possible to use a unit root test and then ARDL for the first model, and which test can be used for the second model?

Thank you in advance to those who reply :)

In the file attached below there is a line above the theta(1) coefficient and another one directly below C(9). Also, what is the number below C(9)? There is no description.

Suppose one of the regressors is an index ranging from 1 to 15, where a higher value indicates greater flexibility and a lower value greater rigidity. If the coefficient is negative, what is the interpretation of this coefficient? Does it mean that moving toward greater rigidity, or toward greater flexibility, is better?

Dear All,

I’m conducting an event study for the yearly inclusion and exclusion of some stocks (from different industry sectors) in an index.

I need to calculate the abnormal return for each stock upon inclusion in or exclusion from the index.

I have some questions:

1- How do I decide the length of the lookback period for the "Estimation Window", and how do I justify it?

2- Stock return is calculated by:

(price today – price yesterday)/(price yesterday)

OR

LN(price today/price yesterday)?

I see both ways are used, although they give different results.

Can any of them be used to calculate CAR?
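On question 2, both definitions are legitimate; a small sketch (with hypothetical prices) shows the property that usually decides between them: log returns add up exactly across days, which makes multi-day cumulation a simple sum, while simple returns must be compounded.

```python
import math

def simple_return(p_today, p_yesterday):
    # arithmetic (discrete) return
    return (p_today - p_yesterday) / p_yesterday

def log_return(p_today, p_yesterday):
    # continuously compounded return
    return math.log(p_today / p_yesterday)

# nearly identical for small daily moves...
r_simple = simple_return(101.0, 100.0)   # 0.0100
r_log = log_return(101.0, 100.0)         # ~0.00995

# ...but only log returns add up exactly across days, which is convenient
# when cumulating abnormal returns over an event window
total = log_return(110.0, 100.0)
chained = log_return(105.0, 100.0) + log_return(110.0, 105.0)  # equals total
```

For daily data the two are very close and differences only matter for large moves; whichever you choose, use it consistently for both the stock and the benchmark.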

3- When calculating the abnormal return as the difference between the stock return and a benchmark (market) return, should the benchmark be the index itself (on which stocks are included or excluded), or the sector index related to the stock?

I'd appreciate your advice, with justification.

Many thanks in advance.

I collected 109 responses on 60 indicators to measure the status of urban sustainability as a pilot study. As far as I know, I cannot run EFA, since each indicator requires at least 5 responses, but I do not know whether I can run PCA with this limited number of responses. Could you please advise on the applicability of PCA or any other possible analysis?

I have heard some academics argue that the t-test can only be used for hypothesis testing, and that it is too weak a tool to analyse a specific objective in academic research. For example, is the t-test an appropriate analytical tool to determine the effect of credit on farm output?

Dear everyone,

I am in great distress and desperately need your advice. I have the cumulated (disaggregated) data of a survey of an industry (total export, total labour costs, etc.) covering 380 firms. The original paper uses a two-stage least squares (TSLS) model in order to analyze several industries, with one independent variable having a relationship with the dependent variable, which, according to the author, was the reason OLS could not be used. However, I want to conduct a single-industry analysis and exclude the variable with that relationship, but instead analyze the model over 3 years. What is the best econometric model to use? Can I use an OLS regression over a period of 3 years? If yes, what tests are applicable then?

Thank you so much for your help!

Hello everyone,

I would like to analyze the effect of innovation in one industry over a period of 10 years. The dependent variable is export and the independent variables are R&D and labour costs.

What is the best model to use? I am planning to use a log-linear model.

Thank you very much for your greatly needed help!
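A log-linear specification makes the slope coefficients read directly as elasticities. A minimal sketch of fitting one by OLS, with simulated data (the series, the 0.6/0.3 elasticities, and the noise level are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10                                   # ten yearly observations
rd = rng.uniform(50, 150, n)             # hypothetical R&D spending
labour = rng.uniform(200, 400, n)        # hypothetical labour costs
# simulate ln(export) with assumed elasticities 0.6 (R&D) and 0.3 (labour)
ln_export = 1.0 + 0.6 * np.log(rd) + 0.3 * np.log(labour) + rng.normal(0, 0.01, n)

# OLS on the logged regressors: the slopes are elasticities
X = np.column_stack([np.ones(n), np.log(rd), np.log(labour)])
beta, *_ = np.linalg.lstsq(X, ln_export, rcond=None)
```

Here beta[1] estimates the export elasticity of R&D: a 1% rise in R&D is associated with roughly a beta[1]% rise in exports. With only 10 yearly observations, degrees of freedom are scarce, so keep the specification small.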

I am trying to measure whether there are dynamic relations between green innovation and the implementation of environmental regulation. According to the literature, green innovation has a dynamic impact, but I found that the lag term of green patent counts is not statistically significant. Is there any other test to confirm a dynamic relation? Do you have any suggestions?

Dear colleagues,

I am planning to investigate a panel data set containing three countries and 10 variables. The time frame is a bit short, which concerns me (2011-2020 for each country). What should the sample size be in this case? Can I apply fixed effects, random effects, or pooled OLS?

Thank you for your responses beforehand.

Best

Ibrahim

Dear Colleagues,

I ran an Error Correction Model, obtaining the results depicted below. The model comes from the literature, where Dutch disease effects were tested in the case of Russia. My dependent variable was the real effective exchange rate, while oil prices (OIL_Prices), terms of trade (TOT), public deficit (GOV), and industrial productivity (PR) were independent variables. My main concern is that only the error correction term, the dummy variable, and the intercept are statistically significant. Moreover, the residuals are not normally distributed, and they are also heteroscedastic. There is no serial correlation issue according to the LM test. How can I improve my findings? Thank you beforehand.

Best

Ibrahim

Dear colleagues,

I applied the Granger Causality test in my paper and the reviewer wrote me the following:

*the statistical analysis was a bit short – usually the Granger-causality is followed by some vector autoregressive modeling...*

What can I respond in this case?

P.S. I had a small sample size and serious data limitation.

Best

Ibrahim

In a regression, I'm using household income and household-specific expenditures as independent variables, both with a natural logarithmic transformation, and I want to control for household size. I find in the literature that in such cases the natural logarithm of this last variable is used, but I don't get the logic. If I'm not wrong, household size is the number of people living in the household, so I find that the interpretation would be very weird: a 1% increase in the number of people leads to an x% change in Y?

Hello everyone. I am using the VECM model and I want to use variance decomposition, but, as you know, variance decomposition is very sensitive to the ordering of the variables. I read in some papers that it is better to use generalized variance decomposition because it is invariant to the ordering of the variables. I am using Stata, R, or Eviews, and the problem is how to perform generalized variance decomposition. Please help if anyone knows.

I am running an ARDL model in Eviews and I need to know the following, if anyone could help!

1. Is the optimal number of lags for annual data (30 observations) 1 or 2, **or** should a VAR be applied to determine the optimal number of lags?

2. When we apply the VAR, the maximum number of lags applicable was 5 (beyond 5 we got a singular-matrix error), but the problem is that as we increase the allowed number of lags, the optimal number of lags increases (when we allow 2 lags we get 2 as optimal; when we allow 5 we get 5). So what should be done?
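One common reason the "optimal" lag tracks the maximum you allow is that the estimation sample shrinks as the maximum lag grows, so criteria computed on different samples are not comparable. A sketch (with a simulated AR(1) series; the series and lag range are hypothetical) of comparing AIC across lag orders on one fixed, common sample:

```python
import numpy as np

def ar_aic(x, p, max_lag):
    """AIC of an AR(p) fitted by OLS on a common sample (obs max_lag..end),
    so that AIC values are comparable across different p."""
    x = np.asarray(x, dtype=float)
    y = x[max_lag:]
    X = np.column_stack([np.ones(len(y))] +
                        [x[max_lag - j:-j] for j in range(1, p + 1)])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ b) ** 2))
    n = len(y)
    return n * np.log(rss / n) + 2 * (p + 1)

# simulate an AR(1) with coefficient 0.7
rng = np.random.default_rng(3)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.7 * x[t - 1] + rng.normal()

max_lag = 4
aics = {p: ar_aic(x, p, max_lag) for p in range(1, max_lag + 1)}
best = min(aics, key=aics.get)   # on a common sample, AIC no longer
                                 # mechanically grows with the allowed maximum
```

The analogous idea in EViews is to hold the estimation sample fixed at the largest lag under consideration before comparing AIC/SIC across lag orders.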

Dear colleagues,

I am looking for the package or command which performs time series decomposition in Stata. So far I have not found anything. An example can be found here: https://towardsdatascience.com/an-end-to-end-project-on-time-series-analysis-and-forecasting-with-python-4835e6bf050b at figure 5.

Looking forward to valuable comments)
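In case a hand-rolled fallback helps while you search for a command: classical additive decomposition (trend via a centered moving average, seasonal via period-position means of the detrended series, residual as the remainder) is simple enough to script yourself. A sketch in Python; the trend step has a Stata counterpart in `tssmooth ma`:

```python
import numpy as np

def classical_decompose(x, period):
    """Additive classical decomposition: x = trend + seasonal + residual.
    Trend: centered moving average; seasonal: period-position means of the
    detrended series, normalized to sum to zero."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if period % 2 == 0:    # even period needs half-weights at both ends
        w = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        w = np.ones(period) / period
    trend = np.full(n, np.nan)
    half = len(w) // 2
    conv = np.convolve(x, w, mode="valid")
    trend[half:half + len(conv)] = conv
    detrended = x - trend
    seasonal = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    seasonal -= seasonal.mean()
    seasonal_full = np.tile(seasonal, n // period + 1)[:n]
    return trend, seasonal_full, x - trend - seasonal_full

# demo: linear trend plus a sinusoidal monthly season (hypothetical data)
t = np.arange(120)
season = 2 * np.sin(2 * np.pi * t / 12)
trend, seas, resid = classical_decompose(0.1 * t + season, 12)
```

For monthly data use period=12; for quarterly, period=4.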

I am doing a study on the relationship between religiosity and visits to religious healers. Before adding interaction terms to the regression, the multilevel model showed religiosity as significant, but after I added the religiosity variable (i.e., frequency of prayers) in interaction with other control variables (education, gender, urbanity, and such), the original frequency-of-prayers variable turned insignificant.

- What am I doing wrong here?

- Is it because I added too many interactions with it?

- Should I use another religiosity indicator (e.g., frequency of reading the Bible) along with the frequency of prayers as an interaction term with education, gender, and urbanity?

It would be really helpful if someone could tell me what to do in this situation.

Dear Colleagues,

I estimated the OLS models and ran several diagnostic tests; however, instability in CUSUMSQ persists, as shown in the photo. What should I do in this case?

Best

Ibrahim

Dear Colleagues,

I noticed that when I estimate an equation by least squares in Eviews, under the Options tab there is a tick mark for degrees-of-freedom (d.f.) adjustment. What is its role and importance? When I estimate an equation without the d.f. adjustment, I get two statistically significant coefficients out of five explanatory variables; with the d.f. adjustment, I do not get any significant results.

Thank you beforehand.

Dear Colleagues,

If I have 10 variables in my dataset (time series), of which 9 are explanatory and 1 dependent, and if I find that all the variables are non-stationary, should I take the first difference of the dependent variable as well?

Best

Ibrahim

For the year 1975-76 some work has been done, but how can I get the latest input-output tables for Pakistan?

Dear colleagues,

I am applying PCA to political and institutional variables to create an index and use it in regression analysis as a dependent variable. However, the variables which will form the main components are on different scales. For example, control of corruption ranges between -2.5 (weaker) and 2.5 (stronger), while freedom of the press ranges between 0 and 100, with a higher value indicating less press freedom. So I am at a loss to understand whether this difference prevents PCA from producing a valid index. In other words, is it a problem for PCA if one variable implies higher success as its values get higher, while the other shows higher success as its values get lower? What should I do in this case? Thank you beforehand.

Best

Ibrahim
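On the direction problem in the question above: PCA itself does not care which way a variable points, but the resulting index is easier to read if all inputs are first standardized (removing the scale differences, e.g. -2.5..2.5 vs 0-100) and the "reversed" variables are sign-flipped so that higher always means better. A sketch with simulated data (the two variables and their relationship are hypothetical):

```python
import numpy as np

def pca_index(X, flip=None):
    """First principal component of standardized data, usable as an index.
    flip: indices of columns whose raw scale runs the 'wrong' way
    (higher value = worse outcome); they are negated after standardization
    so every variable points in the same direction."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # z-scores remove unit/range differences
    if flip:
        Z[:, flip] *= -1.0
    eigval, eigvec = np.linalg.eigh(np.cov(Z, rowvar=False))
    v = eigvec[:, -1]                          # PC1 loadings
    if v.sum() < 0:                            # a PC's sign is arbitrary; fix it
        v = -v
    return Z @ v, eigval[-1] / eigval.sum()    # index scores, variance share

# hypothetical example: control of corruption (-2.5..2.5, higher = better)
# and a press-restrictions score (0-100, higher = less free)
rng = np.random.default_rng(0)
n = 200
corruption = rng.normal(0, 1, n)
press = 50 - 10 * corruption + rng.normal(0, 1, n)
idx, share = pca_index(np.column_stack([corruption, press]), flip=[1])
```

Flipping a column only changes the sign of its loading, not the variance explained, so this is a presentational fix rather than a statistical one.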

Dear Colleagues,

I applied a PCA (fixing two components to reduce 7 variables), and SPSS stored the two component score vectors, which, as far as I can see, can be treated as indices (especially Fac_1, which explains more of the variation in the original variables than Fac_2). My question is as follows: can I use this index as an independent variable in simple or multivariate regression? Thank you beforehand for your help and suggestions.

Best

Ibrahim

Dear colleagues,

Is it OK to include 2 or 3 dummy variables in a regression equation, or should I rotate the dummy variables across different models? The thing is, I have never come across model examples with more than 2 dummy variables in economics so far. Do you know any serious shortcomings of using more than one dummy variable in the same equation? Thank you beforehand.

Best

Ibrahim

Hi,

I am using the ARDL bounds test for my research. It indicates that there is no long-run cointegration. How can I interpret my result now? Can I use Granger causality?

Thanks.

Dear colleagues,

I am capable of estimating linear relationships between X and Y via OLS or 2SLS (in Eviews, for example); however, I also need to learn how to estimate/model non-linear relationships. If you know any source that explains this in simple language for time series, your recommendations are most welcome. Thank you beforehand.

Best

Ibrahim

I can't find the exact command to run the model.

Dear All,

I would like to perform event study analysis through website: https://www.eventstudytools.com/.

Unfortunately, they ask for the data to be uploaded in a format I don't understand; I don't know how to put my data in this form, and I can't find a user manual or an email address to contact them.

Can anyone kindly advise how to use this service and explain it in a plain easy way?

Thanks in advance.

Ahmed Samy

Dear All,

I'm conducting an event study for a sample of 25 firms that each went through a certain yearly event (inclusion in an index).

(The 25 firms (events) are collected from the last 5 years.)

I'm using daily price abnormal returns (AR), and I consolidated the daily returns across the 25 firms to get daily "average abnormal returns" (AAR).

Estimation Window (before the event)= 119 days

Event Window = 30 days

1- I tested the significance of the daily AAR with a t-test and its corresponding p-value. How can I calculate the statistical power for those daily tests?

(significance level used = 0.05, 2-tailed)

2- I calculated "cumulative average abnormal returns" (CAAR) for some period in the event window and tested its significance with a t-test and corresponding p-value. How can I calculate the statistical power of this CAAR significance test?

(significance level used = 0.05, 2-tailed)
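Statistical power is a property of the test design (the effect size you want to detect, the statistic's standard deviation, the sample size, and alpha), not of an observed p-value, so the usual approach is to compute the power to detect a chosen AAR/CAAR magnitude. A sketch using the normal approximation (the 1%/4%/25-firm numbers are hypothetical):

```python
import math
from statistics import NormalDist

def ttest_power(effect, sd, n, alpha=0.05):
    """Approximate power of a two-sided one-sample t-test, via the normal
    approximation to the t distribution (reasonable for samples like a
    119-day estimation window; exact power needs the noncentral t)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)       # two-sided critical value
    ncp = effect / (sd / math.sqrt(n))       # effect size in standard-error units
    # reject if the test statistic lands beyond either critical value
    return (1 - nd.cdf(z_crit - ncp)) + nd.cdf(-z_crit - ncp)

# e.g. the power to detect a CAAR of 1% when its cross-sectional SD is 4%,
# with 25 firms (all numbers hypothetical)
power = ttest_power(0.01, 0.04, 25)
```

For exact small-sample power you would replace the normal distribution with the noncentral t.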

Thank you for your help and guidance.

Ahmed Samy

I have seen that some researchers just compare the difference in R^2 between two models: one in which the variables of interest are included and one in which they are excluded. However, in my case this difference is small (0.05). Is there any method by which I can be sure (or at least have some support for the argument) that this change is not just due to luck or noise? To illustrate my point, here is a hypothetical case with the following equation:

wage = C + 0.5*education + 0.3*rural_area

Where "education" measures the number of years of education a person has, and "rural_area" is a dummy variable that takes the value 1 if the person lives in a rural area and 0 if she lives in an urban area.

In this situation (and assuming no other relevant factors affecting wage), my questions are:

1) Is the 0.5 coefficient on education reflecting the difference between (1) the mean marginal return of an extra year of education on the wage of an urban worker and (2) the mean marginal return of an extra year of education for a rural worker?

a) If my reasoning is wrong, what is the intuition behind the "holding constant" mechanism?

2) Mathematically, how does just adding the rural variable "hold constant" the effect of living in a rural area on the relationship between education and wage?
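Question 2 is exactly the Frisch-Waugh-Lovell theorem: the multiple-regression coefficient on education equals the coefficient from regressing wage residuals on education residuals, after the rural dummy (and the constant) has been regressed out of both. A numerical sketch with simulated data (all coefficients and distributions hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
rural = rng.integers(0, 2, n).astype(float)      # hypothetical rural dummy
educ = 12 - 2 * rural + rng.normal(0, 2, n)      # education correlated with rural
wage = 5 + 0.5 * educ + 0.3 * rural + rng.normal(0, 1, n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# full multiple regression: wage on constant, educ, rural
b_full = ols(np.column_stack([np.ones(n), educ, rural]), wage)

# FWL: partial the constant and rural out of both wage and educ,
# then regress residual on residual
Z = np.column_stack([np.ones(n), rural])
wage_res = wage - Z @ ols(Z, wage)
educ_res = educ - Z @ ols(Z, educ)
b_fwl = ols(educ_res.reshape(-1, 1), wage_res)[0]
# b_full[1] and b_fwl coincide exactly
```

So "holding rural constant" means the education coefficient is estimated only from the variation in education that is not explained by rural status.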

Dear All,

Doing financial event studies in Excel is just a horrible process: arranging and chopping huge amounts of data, complicated manual calculations, etc.

Please advise what software is out there that can do financial event studies in a neater, more time-efficient way?

Thanks

Ahmed Samy

Dear All,

I’m conducting an event study for the inclusion of companies in a certain index.

The event is the “inclusion event” for companies in this index over the last 5 years.

For the events, we have a yearly announcement date (AD) for inclusions, and also an effective change date (CD) for the inclusion in the index.

Within the same year, I have aligned all companies on (AD) as day 0, and since they are companies from the same year, CD also aligns for all of them.

The problem comes when I try to aggregate companies from different years: although I aligned them all to have the same AD, the CD differs from one year to another, so CDs don't align for companies from different years.

How can I overcome this misalignment of CDs across years, so that I am able to aggregate all the companies together?

Many Thanks.

Dears,

I'm conducting an event study on the effect of a news announcement at a certain date on stock returns.

Using the market model to estimate the expected stock return in the estimation window, we need to regress the returns of the stock under study on the returns of a market portfolio index.

1- How do we decide which market portfolio index to use in the regression?

Is it just the main index of the market?

The sector index to which the stock under study belongs? Etc.?

2- Is it necessary that the stock under study be among the constituents of this market index?

I'd appreciate it if you could justify your answers with research citations, if possible.

Many thanks

I am doing some research on the growth of joblessness in the manufacturing sector of Pakistan. In the literature I have studied, the seemingly unrelated regression (SUR) technique was used instead of the ordinary least squares (OLS) method. Can anyone help me identify the reasons why OLS is not used here, or under what conditions the SUR technique should be used instead of OLS?

I'm working with life satisfaction as my dependent variable and some independent variables that measure purchasing power (consumption, income, and specific expenditures). To take into account the diminishing marginal returns of these last variables (following the literature), I transformed them into their natural logarithms. However, now I want to compare the size of the coefficients of specific expenditures with those of consumption and income. Specifically, I would like a procedure that allows me to interpret the result like this: one unit of resources directed to a type of expenditure (say, culture) is more/less effective at improving life satisfaction than the same unit would be under the category of income. If I just do this without the natural logarithm (that is, expressed in dollars), the coefficients change in counterintuitive ways, so I would prefer to avoid that.

I was thinking about using beta coefficients, but I don't know if it makes sense to standardize an already logarithmic coefficient.
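Mechanically, a standardized (beta) coefficient is just the OLS slope rescaled by the standard deviations of the regressor and the outcome, and nothing prevents applying that to a logged regressor; the caveat is interpretive (a one-SD change in log consumption is not a fixed dollar amount). A sketch of the computation with hypothetical data:

```python
import numpy as np

def standardized_betas(X, y):
    """OLS slopes rescaled to SD units: beta_j = b_j * sd(x_j) / sd(y).
    Comparable across regressors regardless of units (logs included)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = np.column_stack([np.ones(len(y)), X])
    b = np.linalg.lstsq(Xc, y, rcond=None)[0][1:]   # drop the intercept
    return b * X.std(axis=0, ddof=1) / y.std(ddof=1)

# sanity check: with a single regressor, the standardized beta
# equals the Pearson correlation
rng = np.random.default_rng(5)
x = rng.normal(0, 3, 100)
y = 2 * x + rng.normal(0, 1, 100)
sb = standardized_betas(x.reshape(-1, 1), y)
```

An alternative that keeps the "one unit of resources" reading: since d(ln x) ≈ dx/x, the effect of one extra dollar under a logged regressor is approximately b_j divided by the expenditure level at which you evaluate it (e.g. the sample mean).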

In a regression on a database with N = 1200, I have an independent dummy variable that measures whether the respondent is unemployed or employed. The variable has the following characteristics:

Unemployment = 0 - Frequency: 1196

Unemployment = 1 - Frequency: 4

The regression gives me a significant coefficient, but also a very counterintuitive one (specifically, that life satisfaction has a positive association with unemployment). I think, however, that it's wrong to draw a valid conclusion from just 4 cases with Unemployment = 1. I also have other dummy variables where the situation is even less clear. For example:

Dummy = 0 - Frequency: 1170

Dummy = 1 - Frequency: 30

Or even more:

Categorical option A = 0 - Frequency: 1150

Categorical option B = 1 - Frequency: 30

Categorical option C = 2 - Frequency: 12

Categorical option D = 3 - Frequency: 8

Can I obtain valid conclusions from this? And, in more general terms, is there a minimum number of observations needed per category of response in each independent variable so that the conclusions drawn from it are pertinent/correct? If so, how can I calculate this number?

In order to analyze whether there is a mediation effect using Baron & Kenny's steps, is it necessary to include the control variables of my model, or is it enough to do the analysis with just the independent variable, the mediator variable, and the dependent variable of interest?

I have a dummy variable as the possible mediator of a relationship in my model. Reading Baron and Kenny's (1986) steps, I see that in the second one you have to test the relationship between the independent variable and the mediator, using the latter as the dependent variable. However, normally you wouldn't use OLS when you have a dummy as the dependent variable. Should I use a probit in this case?

In my investigation of the determinants of subjective well-being (life satisfaction), I have some variables that measure access to food and others that measure affect (whether, in the last week, the interviewee felt sad/happy, for example). These variables don't show high simple Pearson correlations or high VIFs. Experimenting with different models (including and excluding some variables), I see that access to food has a positive and significant coefficient, except in the models in which the affective variables are included. Can I make the case that this is because the affective variables mediate the effect of access to food on life satisfaction? I also tried an interaction between access to food and the affective variables, but it was not significant.

Reading Wooldridge's introductory econometrics book, I see that the F-test allows us to check whether, in a group, at least one of the coefficients is statistically significant. However, in my model one of the variables of the group I want to test is already individually statistically significant (by the t-test). So I expect that, no matter which variables I test, if I include the one that is already individually significant, the F-test will also be significant. Is there any useful purpose for the F-test in this case?
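The F-test is still useful when the hypothesis itself is joint: it controls the test size for "all q coefficients are zero at once", which q separate t-tests do not, and with many weak regressors in the group the joint F can even fail to reject although one t-statistic exceeds 1.96, because the evidence is diluted across q restrictions. A sketch of the restricted-vs-unrestricted computation (simulated data, all values hypothetical):

```python
import numpy as np

def f_test(y, X_unres, X_res):
    """F statistic for H0: the coefficients present in X_unres but not in
    X_res are jointly zero. q = number of excluded regressors."""
    def rss(X):
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(np.sum((y - X @ b) ** 2))
    n, k = X_unres.shape
    q = k - X_res.shape[1]
    rss_u, rss_r = rss(X_unres), rss(X_res)
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))

# y truly depends on only the first of five candidate regressors
rng = np.random.default_rng(4)
n = 100
X5 = rng.normal(0, 1, (n, 5))
y = 1.0 + 0.5 * X5[:, 0] + rng.normal(0, 1, n)
const = np.ones((n, 1))
F = f_test(y, np.column_stack([const, X5]), const)
# compare F with the F(5, 94) critical value (about 2.3 at the 5% level)
```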

I have a well-endowed database with almost 290 000 observations, and I want to run an analysis with more than 50 variables. What problems can arise from this situation? Can the model be overfitted? If so, why?

I have reported life satisfaction as my dependent variable and many independent variables of different kinds. One is the area in which the individual lives (urban/rural) and another is access to publicly provided water service. When the area variable is included in the model, the second variable is not significant. However, when it is excluded, the public service variable gains significance at the 95% confidence level. The two variables are moderately and negatively correlated (r = -0.45).

What possible explanations do you see for this phenomenon?

I'm studying the determinants of subjective well-being in my country; reported satisfaction with life is my dependent variable, and I have almost 40 independent variables. I ran multicollinearity tests and didn't find values bigger than 5 (in fact, just two variables had a VIF above 2). Also, my N = 22 000, so I don't expect an overfitted model. At the beginning all was going well: the variables maintained their significance and coefficient values when I added or removed variables to test the robustness of the model, and the adjusted R squared increased with the inclusion of more variables.

However, when I finally included some variables that measure satisfaction with specific life domains (family, work, profession, community, etc.), the problem started: my adjusted R squared tripled, and the significance and even the signs of some variables changed dramatically, in some cases counterintuitively. I also tested for multicollinearity and for the correlation of these variables with the other regressors, and I didn't find a problem there.

The literature says it is very likely that there are endogeneity problems between satisfaction with life domains and satisfaction with life, since it is not so much objective life conditions that affect life satisfaction as the propensity to self-report satisfaction. Can this be the cause of my problem? If so, how?

PS: I'm not trying to demonstrate causality.

I'm investigating the determinants of subjective well-being in my country, and I have a well-endowed database in which I found a lot of environmental, psychosocial, and political variables (plus the common ones) that are theory-related to my dependent variable (subjective well-being). In this context, do you see any problem with including them all (almost 35, and that is after deleting the ones that measure the same concept) in one single model (judged by the adjusted R^2)?

More researchers have applied R and Matlab software to estimate panel threshold regression models. Can we estimate panel threshold regression models in Eviews?

I am using time series for my research.

I have estimated a VECM. It reports the standard error and t-statistic instead of the p-value. So how do I check whether the coefficients are significant using the t-statistics?

*I'm using Eviews for my estimations
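A quick way to turn a reported t-statistic into an approximate two-sided p-value (or simply compare |t| with 1.96 for the 5% level), sketched in Python:

```python
from statistics import NormalDist

def pvalue_from_t(t_stat):
    """Two-sided p-value for a t-statistic via the normal approximation
    (adequate for the sample sizes typical of VECMs; with few degrees of
    freedom the exact t distribution gives slightly larger p-values)."""
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))

p = pvalue_from_t(1.96)   # roughly 0.05: the familiar |t| > 1.96 rule
```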

Hi all, I'm trying to understand the Fama-MacBeth two-step regression. I have 10 portfolios and T = 5 years. In the first step I compute 10 time-series regressions, and with 2 factors I get 20 betas. How many regressions do I have to compute in the second step? Only 5? And which betas should I use, the averages of the betas found in the first step?

Thank you in advance.
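In the standard Fama-MacBeth procedure the second step is one cross-sectional regression per time period: with monthly data over 5 years that is T = 60 regressions of each period's portfolio returns on the full-sample first-step betas (not their averages; the averaging happens afterwards, over the estimated premia). A compact sketch with simulated data (all dimensions and numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, K = 60, 10, 2                       # 60 months (5 years), 10 portfolios, 2 factors
F = rng.normal(0, 1, (T, K))              # hypothetical factor realizations
true_beta = rng.uniform(0.5, 1.5, (N, K))
R = F @ true_beta.T + rng.normal(0, 0.1, (T, N))   # portfolio returns

# Step 1: one time-series regression per portfolio -> N x K betas (10 x 2 = 20)
X = np.column_stack([np.ones(T), F])
beta_hat = np.array([np.linalg.lstsq(X, R[:, i], rcond=None)[0][1:]
                     for i in range(N)])

# Step 2: one cross-sectional regression per period -> T estimates of the premia
Xcs = np.column_stack([np.ones(N), beta_hat])
lambdas = np.array([np.linalg.lstsq(Xcs, R[t], rcond=None)[0] for t in range(T)])

lam_mean = lambdas.mean(axis=0)                     # Fama-MacBeth premia
lam_se = lambdas.std(axis=0, ddof=1) / np.sqrt(T)   # Fama-MacBeth standard errors
```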

While conducting the ARDL bounds test, I discovered that my F-statistic is greater than the lower bound but less than the upper bound, which Pesaran, Shin and Smith (2001) classify as inconclusive. What, then, is the implication? Can we conclude there is a long-run relationship or not? If so, what can be done statistically to decide between the null and alternative hypotheses?

Hi, in an RCT, I have 3 different treatment groups and one control group. The size of the control group is around 1000, while the sizes of the other groups are just above 300.

To test balance I used ANOVA and Welch's test, which show that several variables are unbalanced. Can I draw a smaller random sample (400) from the control group so that the sizes of all groups are relatively equal?

Actually, after doing that, only one variable is unbalanced. So would it cause any problems if I draw a smaller sample just for better balance? Thanks!

I have tried E-views 7 and Stata 11 but failed to find the result. With E-views I only get the result for "level shift", but no estimated result for "level shift with trend" or "regime shift". In Stata I could not find the command.

Please help me.

Hi,

When estimating the DCC-GARCH in Stata, pairwise quasi-correlations are given at the end of the output. What do they mean in practice? Are they the mean values of the dynamic correlations, or something else?

Much appreciated if anybody could clarify this.

Kind regards

Thushara

I use monthly data in my model, with 166 observations. I did the unit root tests, then chose and estimated the appropriately lagged ARDL model. The diagnostic tests, including autocorrelation, heteroskedasticity, and the Ramsey RESET test, all pass (all p-values greater than 0.05). But when I test normality, the Jarque-Bera p-value is 0.00003, so the residuals are not normal. I have also taken logarithms of the variables. Should I continue the analysis ignoring the normality test? Do you have any other advice?

Thank you in advance for your help

I have 4 time series variables, say x1, x2, x3, x4. I am mainly interested in finding a cointegrating relation between x1 and the rest of the variables. All variables are I(1). Johansen's procedure for testing cointegration suggests 1 cointegrating relation among x1, x2, x3, x4. I estimated the VECM with one cointegrating vector. The alpha coefficient in the x1 equation came out positive and significant, which is my main interest along with the beta coefficients. The alpha coefficients on x2 and x4 are also positive, and the same coefficient on x3 is negative. All coefficients are significant. I did the estimation in EViews, which also reports the significance levels of the coefficients in the beta vector; they are significant and make economic sense. Can I expect a positive alpha on the x1 variable, which I want to show as having a long-run relationship with the other variables in the system, when they are all I(1) variables? If the alpha on x1 cannot be positive, what does a positive value suggest?

Hi

I've estimated a DCC-GARCH(1,1) model using Stata. At the end of the Stata output, a correlation matrix is given, which is called the quasi-correlation matrix. Is it the conditional correlation matrix or a different one? If so, is it the average/mean value of the dynamic conditional correlations?

Much appreciated if anybody clarifies this.

(I've herewith attached the output)

Kind regards

Thushara

I have 12 portfolios per year, one for each month, constructed over a number of years. I have calculated the average return for each year. Now I need to calculate the correlation between these returns and returns from another model, and also the correlation between these returns and benchmark volumes and standard deviation.

I tried to calculate correlations with the Pearson method using percentages, using prices, using a mix, etc., and I keep getting different correlation coefficients, sometimes even opposite in sign. What is the best way to calculate correlations in this case?

Hi,

Does Eviews allow estimating ARIMA models assuming a Student's t distribution or a GED distribution? I have estimated some ARIMA models and found that some residuals are not normally distributed. I guess either a Student's t or a GED distribution would fit well.

I would much appreciate any advice in this regard.

Kind regards

Thushara

Is there any difference between an exponential moving average (EMA) and an exponentially weighted moving average (EWMA)?

thanks in advance
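In most usages the two names refer to the same recursion, s_t = alpha*x_t + (1-alpha)*s_{t-1}, and differ only in how alpha is parameterized: trading-style EMA quotes a span (alpha = 2/(span+1)), while RiskMetrics-style EWMA quotes a decay lambda (alpha = 1-lambda). A sketch:

```python
def ewma(x, alpha):
    """Exponentially weighted recursion: s_t = alpha*x_t + (1 - alpha)*s_{t-1},
    seeded with the first observation."""
    s = x[0]
    out = [s]
    for v in x[1:]:
        s = alpha * v + (1 - alpha) * s
        out.append(s)
    return out

# trading-style EMA with span 19 vs RiskMetrics-style EWMA with decay 0.9:
span_alpha = 2 / (19 + 1)        # = 0.1
lam = 0.9                        # alpha = 1 - lambda = 0.1
assert abs(span_alpha - (1 - lam)) < 1e-12   # same recursion, same smoothing
```

Some EMA implementations also differ in how they seed s_0 (first observation vs an initial simple average), which causes small early-sample differences.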

Hi,

suppose we get the following 2SLS model:

Log Y = B1 - 0.8623*X2 + 0.05*X3

where X2 is a dummy variable. Can you please interpret the coefficient of X2?

Thanks in advance.
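Since the dependent variable is in logs and X2 is a dummy, the exact effect of switching X2 from 0 to 1 is a multiplicative exp(b) change in Y; the familiar "100*b percent" reading is only a small-|b| approximation:

```python
import math

b2 = -0.8623                      # the dummy coefficient from the question
pct_change = (math.exp(b2) - 1) * 100
# switching X2 from 0 to 1 multiplies Y by exp(-0.8623), about 0.422,
# i.e. Y is about 57.8% lower when X2 = 1, holding X3 fixed;
# the naive "-86.23%" reading badly overstates the effect at this magnitude
```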

Hi everyone,

During my latest piece of research, I have found some arguments about the need to adjust daily stock prices for the exchange rate vs. the US dollar (given the adoption of the S&P 500 as benchmark). Actually, I can't quite understand the reasoning behind this point, since daily returns are stated in percentage terms, unless one wants to demonstrate the confounding effect of exchange rates on daily returns.

Could you please give me your suggestions on this point, if possible with some relevant references?

Thank you so much in advance.

Regards,

Nicola

Having a discrete variable affects its distribution, so is it appropriate to assume it is continuous, especially when there will be problems in the interpretation of the results?

Hello. I am doing a study on the relationship between foreign trade and economic growth in Albania. My data are yearly, ranging from 1993 to 2016 (export, import, GDP). I am using log form (lnexport, lnimport, lnGDP) to conduct the study, with lnGDP as the dependent variable. The unit root test (ADF) indicates the series are stationary at first difference [I(1)].

Regarding lag length, 'Lag Length Criteria' chooses lag 1 as optimal when the max lag is 1; lag 2 when the max lag is 2; lag 1 when the max lag is 3; and lag 4 when the max lag is 4 (SIC and AIC criteria). My intuition says the lag length should be 1, as the data are yearly.

When I estimate a VAR on the differenced series, the coefficients appear insignificant. When I do the Johansen cointegration test (data in levels), it shows 2 cointegrating equations in the long run. After that I fit a VECM (in levels) and the coefficients still appear insignificant. I also ran a Granger causality test (differenced data), which shows no causality in any direction.

What can I do now? Does this mean the series needs more data?

I would be very grateful for any suggestions regarding my study, which is actually very important to me, as I need it for my thesis. I am also attaching the results for a better understanding (*removed after edit).

**EDIT: Thank you everyone for your suggestions. I tried to use quarterly data and it worked. With optimal lag 7 (according to AIC) there was one cointegrating vector and VECM was successful with the error correction term being negative and significant. Also the export coefficient sign was positive and the import coefficient was negative (in the long run part of the equation) thus satisfying economic theories. I am attaching the result below.**

I am studying, via an econometric analysis, how the interaction of domestic credit growth and macroeconomic variables changed after a certain event in a country. The quarterly data run from 2012:Q3 to 2017:Q1, corresponding to 19 observations. I also took the change in each variable relative to the previous quarter.

Is it alright to conduct causality, cointegration and multiple regression analyses on these data?

I would be pleased if you could suggest a proper analysis method for this limited data or any other relevant suggestions.

Thanks a lot in advance..

I have two variables in my VAR model that exhibit the condition described above. Should I do something special in my VAR model?

Different methods give different break dates. If we should be concerned about these breaks, then which method is reliable?

Dear all,

I need assistance with interpreting the results from Z-A (1992) test on a variable with structural break point. Here are some results:

The variable is Gini index (measure of income inequality):

**zandrews gini, break(trend) trim(0.01) lagmethod(BIC) graph**

**Results:** Zivot-Andrews unit root test for gini

Allowing for break in trend

Lag selection via BIC: lags of D.gini included = 0

Minimum t-statistic: **-3.449** at 2006 (obs 27)

Critical values: 1%: -4.93, 5%: -4.42, 10%: -4.11

The break point is at 2006 and, since the calculated t-statistic (**-3.449**) does not exceed the critical values in absolute value, I suppose the variable is non-stationary. So I took the first difference, and here is the result:

**zandrews d.gini, break(trend) trim(0.01) lagmethod(BIC) graph**

**Results:** Zivot-Andrews unit root test for D.gini

Allowing for break in trend

Lag selection via BIC: lags of D.D.gini included = 0

Minimum t-statistic: -6.478 at 2011 (obs 32)

Critical values: 1%: -4.93, 5%: -4.42, 10%: -4.11

The t-statistic is now **-6.478**, which exceeds the critical values in absolute value, evidencing stationarity, but the break point has now shifted from 2006 to 2011. So, which break point is applicable for the analysis? Kindly advise, thank you.

Does a variable have to be I(1) to be cointegrated in levels with another I(1) variable, or can it be I(2)? I'm using the Phillips-Ouliaris method, and I'm only interested in the long-run coefficients, not in making an ECM.

I ran a model with two integrated variables of order I(1): a CDS spread and a bond yield, with daily data over a period of 3 years. I first ran the Dickey-Fuller test, then estimated a VAR model, choosing the lags with the Schwarz criterion. I then ran the LM test on the residuals, plus heteroscedasticity and normality tests; in every one the p-value is less than 0.05, meaning the model is not well specified. The Johansen cointegration test tells me there is at least one cointegrating relation. If the VAR model is not well specified, does this mean the variables are not cointegrated?

Hello, I am studying the effects of ICT diffusion on financial sector activity and efficiency. To do so, I am creating a GMM model using panel data across 205 countries over 24 years. The model is based on one by Asongu and Moulin (2016). The model and link to Asongu and Moulin's paper are attached.

I am using the xtabond2 command in Stata, writing my line as so:

xtabond2 fe lfe ict inf gdppcg trade inv linc hinc, twostep gmm(lfe inf gdppcg trade inv linc hinc, lag(1 24))

I am particularly confused about how to correctly reflect the (t-tau) lag subscripts on the selected independent variables.

fe is my proxy dependent variable for financial development (financial efficiency).

May I please have some help in specifying the correct Stata line to reflect the model?

Thanks!

I have to perform the Zivot-Andrews unit root test in EViews. How do I decide which type of model to use (break in intercept, in trend, or both)? Is it possible to use a formal test to decide, or something else? Also, how do I decide what lag length to use? The data are quarterly.

Hello experts! I'm conducting a study on the effect of trade openness on industry competition. My panel dataset has three dimensions: country (5), industry (19) and time (9 years). The purpose of my study is to capture a dynamic equilibrium relationship. For that I need to distinguish between the short-run impact of trade openness on industry price markups and the long-run equilibrium relationship between price markups and the competitive environment. It is important from an econometric point of view to control for the high degree of persistence of price markups: an error-correction model specification allows industries to trend to different equilibria.

Now my question concerns the Stata code for an error-correction model with three-dimensional panel data. I am also undecided between combining the country and industry information into one panel unit and keeping two separate units, which does not let me run xtset due to repeated time values within panels.

Thank you for your attention and for taking the time to consider my issue!
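On collapsing country and industry into one panel unit: in Stata this is typically `egen panelid = group(country industry)` followed by `xtset panelid year`. A pandas sketch of the same idea, with illustrative column names rather than the poster's actual variables:

```python
import pandas as pd

# Toy three-dimensional panel; names and values are assumptions.
df = pd.DataFrame({
    "country":  ["DE", "DE", "FR", "FR"],
    "industry": ["steel", "autos", "steel", "autos"],
    "year":     [2010, 2010, 2010, 2010],
    "markup":   [1.2, 1.4, 1.1, 1.3],
})

# Combine country x industry into a single panel unit, the pandas analogue
# of Stata's `egen panelid = group(country industry)`. Each country-industry
# pair then has unique time values, so xtset-style indexing becomes valid.
df["panelid"] = df.groupby(["country", "industry"], sort=True).ngroup()
```

The cost of this trick is that country-level and industry-level effects are no longer separately identified unless you add them back as fixed effects.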

I have an unbalanced panel of 12 source countries (1984-2015 x 12 = 384 obs.) investing in Pakistan. The contribution of these 12 countries to FDI inflows to Pakistan is around 70 percent. The aim is to find the macroeconomic-policy and institutional determinants of FDI in Pakistan. There is a vast literature with multiple source countries and multiple host countries (developing countries, the EU, Asian countries), but literature on a single host and multiple source countries is rare (the base paper is attached).

Is my approach right? I have only TWO bilateral variables (FDI and trade); the others, like tax, inflation, institutional quality, etc., are host-country-specific variables.

When I apply the Hausman test to choose between the RE and FE models, the result states "Cross-section test variance is invalid. Hausman statistic set to zero" (Cross-section random 0.000000, df = 6, p = 1.0000). Can anyone help me understand what this means? Should I use the FE model or the RE model?
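For background, that EViews message usually means the difference of the covariance matrices, V_FE - V_RE, is not positive definite, so the quadratic form underlying the Hausman statistic is invalid and EViews reports zero. A minimal numpy sketch of the classic statistic with made-up illustrative numbers, using a pseudo-inverse as one common (imperfect) workaround:

```python
import numpy as np
from scipy import stats

def hausman(b_fe, b_re, v_fe, v_re):
    """Classic Hausman statistic: (b_FE - b_RE)' (V_FE - V_RE)^{-1} (b_FE - b_RE)."""
    diff = b_fe - b_re
    dv = v_fe - v_re
    # When dv is not positive definite (the "variance is invalid" case),
    # the pseudo-inverse keeps the statistic computable, at some cost in rigor.
    stat = float(diff @ np.linalg.pinv(dv) @ diff)
    pval = float(stats.chi2.sf(stat, df=len(diff)))
    return stat, pval

# Illustrative numbers only, not the poster's estimates.
b_fe = np.array([0.50, -0.20])
b_re = np.array([0.45, -0.15])
v_fe = np.array([[0.020, 0.001], [0.001, 0.030]])
v_re = np.array([[0.010, 0.000], [0.000, 0.020]])
stat, pval = hausman(b_fe, b_re, v_fe, v_re)
```

When the matrix difference is badly behaved, many practitioners either use a robust/auxiliary-regression version of the test (e.g. the Mundlak approach) or simply default to fixed effects as the safer estimator.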

Thank you