# Applied Econometrics - Science topic

Questions related to Applied Econometrics

I would like to test whether the general relationship between the number of years of education and the wage is linear, exponential, etc. Or in other words, does going from 1 year to 2 years of education have the same impact on wages as going from 10 to 11. I want a general assessment for the world and not for a specific country.

I got standardized data from surveys on several countries and multiple times (since 2000). My idea is to build a multilevel mixed-effects model, with a fixed effect for the number of years of education and random effects for the country, the year of the survey and other covariates (age, sex, etc.). I’m not so used to this type of model: do you think it makes sense? Is this the most appropriate specification of the model for my needs?

Hey there. I am trying to walk through this concept in my head but unfortunately do not have the name for the type of model I am attempting to create. Let me describe the data:

1) One dependent variable on an hourly interval

2) Multiple exogenous "shock" variables that have positive effects on the dependent variable.

3) The dependent variable has 0 effect on the exogenous shock variables.

The dependent variable can be modeled as a function of its own lags and the exogenous shock variables.

I would like to model the variable with an ARIMA model with exogenous variables that have an immediate impact at time T and lagged effects for a short period of time. (This is similar to a VAR's IRFs, except that the exogenous variables are independent of the dependent variable.)

The assumption is that without the exogenous shock variables, there is an underlying behavior of the data series. It is this underlying behavior that I would like to capture with the ARIMA. The exogenous shock variables are essentially subtracted from the series in order to predict what the series would look like without exogenous interference.

The problem:

I am worried that the ARIMA will use information from the "exogenous" shocks within the dependent series when estimating the AR and MA terms. That would mean a positive bias in those terms. For example: if an exogenous shock is estimated to cause a 100 unit increase in the dependent variable, then this 100 unit increase should NOT affect the estimation of the AR or MA terms, since it is considered unrelated to the underlying function of the dependent variable.

I've attempted to write this out mathematically as well.
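One common way to write down the setup described in this question is a regression with ARIMA errors. The sketch below is only an assumed formalisation, not the poster's own equations: $x_{k,t}$ stands for the $k$-th shock variable, $J$ is a hypothetical maximum lag over which a shock still matters, and $n_t$ is the underlying series net of the shocks.

```latex
y_t = \sum_{k=1}^{K} \sum_{j=0}^{J} \beta_{k,j}\, x_{k,t-j} + n_t,
\qquad
\phi(B)\,(1-B)^{d}\, n_t = \theta(B)\,\varepsilon_t
```

In this form the AR polynomial $\phi(B)$ and the MA polynomial $\theta(B)$ describe $n_t$ rather than $y_t$. When the $\beta_{k,j}$ and the ARMA parameters are estimated jointly, the shock effects are, in principle, separated from the underlying dynamics instead of being absorbed into the AR and MA terms.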

I have a heterogeneous panel data model with N=6 and T=21. What is the appropriate regression model? I applied the CD test, and it shows that the data have cross-sectional dependence.

I used second-generation unit root tests, and the results show that my data are stationary at level.

Is it possible to use PMG? Would you please explain the appropriate regression model?

Hello

I am estimating a quadratic ARDL model in EViews for the Kuznets curve.

My significance level is 10%, and I want to run the CUSUM and CUSUM of squares stability tests at the 10% significance level in EViews. However, I only get results for 5%.

If not in EViews, where and how can I run this test?

My thesis is due next week. Can you please help me with this?

Thank you

I want to find out how the number of COVID-19 cases, deaths and lockdowns affects each sector of the Sri Lankan economy. The Hausman test suggested the fixed effects model as the most suitable one. I need to fit the fixed effects model for each sector separately, including the sector as a dummy variable in the model, but I get the error 'Near singular matrix'. Why is that? Is it impossible to fit a fixed effects model for each sector separately when the sector is included as a dummy variable?
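One possible cause of this error is perfect collinearity: inside a single-sector subsample the sector dummy is constant, so it duplicates the intercept. The sketch below uses a tiny made-up design matrix (not the poster's data) to show the determinant of X'X dropping to zero in that case.

```python
# Illustrative only: a hand-made design matrix, not the data from the question.

def gram_determinant(rows):
    """Determinant of X'X for a two-column design matrix given as row tuples."""
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    d = sum(r[1] * r[1] for r in rows)
    return a * d - b * b

# Columns: intercept, sector dummy (1 = sector A, 0 = any other sector).
pooled = [(1, 1), (1, 1), (1, 0), (1, 0)]   # both sectors in one sample
sector_a_only = [(1, 1), (1, 1)]            # subsample containing sector A only

det_pooled = gram_determinant(pooled)
det_subsample = gram_determinant(sector_a_only)

print("pooled sample:", det_pooled)          # non-zero, so X'X is invertible
print("sector A only:", det_subsample)       # zero, so X'X is singular
```

If this is what is happening, either the sector dummy can be dropped from the sector-by-sector regressions, or a single pooled model with sector dummies (and interactions, if sector-specific effects are wanted) can be fitted instead.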

Dear Scholars, I am measuring the effect of policy on firms' performance. I found a common (in both treatment and control groups) structural break 4 years before the policy intervention. I used the difference-in-difference model to find the impact of the policy. I am using 20 years of firm-level panel data. What are your suggestions?

I am currently trying to understand a possible dynamic panel model in which the number of years of observation (t) is higher than the number of units of observation (n). The dataset contains 6 cross-sectional regions observed over a span of 30 years.

I ran the ARDL approach in EViews 9, and the independent variable turns out to have a small coefficient, but it appears as zero in the long-run equation shown in the table, despite having a very small non-zero value under ordinary least squares. How can I display this small value clearly?

Using EViews 9, I ran the ARDL test, which gives one R-squared value in the initial ARDL model output and another R-squared value under the bounds test. What is the difference between these two R-squared values?

Hi! I would like to have an opinion on something, rather than a straight-out answer, so to speak. In time-series econometrics, it is common to present both the long-term coefficients from the cointegrating equation and the short-term coefficients from the error correction model. Since I have a lot of specifications, and since I'm really only interested in the long term, I only present the long-term coefficients from a cointegrating equation in a paper I'm writing. Would you say that is acceptable? I'm using the Phillips-Ouliaris single-equation approach to cointegration.

Hi, I am wondering if we can "manually generate" more instrumental variables in TSLS estimation by taking higher-order terms?

For instance, when Z is a valid instrumental variable for X, then Z^2 must also satisfy the conditions for relevance and exogeneity. Can I create more instruments in this way?

My guess is that the answer depends on the true relationship between X and Z: when X is linear in Z, the additional quadratic term will not be significant in the first-stage regression and hence is useless. But in cases where X has a quadratic relationship with Z, I think it is better to include the quadratic term in the first stage as well. However, I hardly ever see this practice in the literature. Why? Can you share some examples where quadratic or higher-order IV terms are included? Thanks

Hi,

Could anyone suggest material (ppt, pdf) on Applied Econometrics with R (intro level)?

Thanks in advance.

Can ECM term be lower than -1? How to interpret a value lower than -1? Does it suggest that something wrong with the model?

Any help?

Thank you in advance.
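A stylised way to see what an error-correction coefficient below -1 implies: if the deviation from equilibrium evolved only through the adjustment term, it would follow e_t = (1 + alpha) * e_{t-1}. This is a simplifying assumption that ignores the short-run dynamics of a full ECM; the function name and the three alpha values below are purely illustrative.

```python
def deviation_path(alpha, start=1.0, periods=6):
    """Deviation from long-run equilibrium when e_t = (1 + alpha) * e_{t-1}."""
    path = [start]
    for _ in range(periods):
        path.append((1 + alpha) * path[-1])
    return path

monotone = deviation_path(-0.5)      # 1, 0.5, 0.25, ...: smooth convergence
oscillating = deviation_path(-1.5)   # 1, -0.5, 0.25, ...: overshoots, still converges
explosive = deviation_path(-2.5)     # 1, -1.5, 2.25, ...: overshoots and diverges

for name, path in [("alpha=-0.5", monotone),
                   ("alpha=-1.5", oscillating),
                   ("alpha=-2.5", explosive)]:
    print(name, [round(x, 4) for x in path])
```

Under this simplification, a coefficient between -1 and -2 corresponds to dampened oscillation around the long-run path rather than monotone adjustment, while a value below -2 would not converge at all.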

What is the difference between mediating and moderating variables in panel data regression?

Hello, I am new to VARs and am currently building an SVAR with the following variables to analyse monetary policy shocks and their effects on house prices: house prices, GDP growth, inflation, the mortgage rate (first differenced), the central bank base rate (first differenced) and the M4 growth rate. The aim of the VAR analysis is to generate impulse responses of house prices to monetary policy shocks and to understand the forecast error variance decomposition.

I'm planning on using the base rate and the M4 growth rate as policy instruments, for a time period spanning 1995 to 2015. Whilst all other variables reject the null hypothesis of non-stationarity in an Augmented Dickey-Fuller test with 4 lags, the M4 growth rate fails to reject it.

Now, if I go ahead anyway and create a SVAR via recursive identification, the VAR is stable (eigenvalues within the unit circle), and the LM test states no autocorrelation at the 4 lag order.

Is my nonstationarity of M4 Growth Rate an issue here? As I am not interested in parameter estimates, but rather just impulse responses and the forecast error variance decomposition, there is no need for adjusting M4 Growth rate. Is that correct?

Alternatively I could first difference the M4 growth rate to get a measure of M4 acceleration, but I'm not sure what intuitive value that would add as a policy instrument.

Many thanks in advance for your help, please let me know if anything is unclear.

KB

I am using an ARDL model however I am having some difficulties interpreting the results. I found out that there is a cointegration in the long run. I provided pictures below.

I am currently replicating a study in which the dependent variable describes whether a household belongs to a certain category. Therefore, for each household the variable either takes the value 0 or the value 1 for each category. In the study that I am replicating the maximisation of the log-likelihood function yields one vector of regression coefficients, where each independent variable has got one regression coefficient. So there is one vector of regression coefficients for ALL households, independent of which category the households belong to. Now I am wondering how this is achieved, since (as I understand) a multinomial logistic regression for n categories yields n-1 regression coefficients per variable as there is always one reference category.

Hey! I need to improve an existing panel data model by adding one variable for access to technology. Is that possible, and what is the best variable for measuring technology accessibility? If possible, I would like to measure technological advancement as well. What should my variables be for this? What are the common practices so far? Thank you!

Hi all, I'm doing my final-year project (FYP) on the determinants of healthcare expenditure in 2011-2020. Here are my variables: government financing, GDP, total population.

The first model is: $\text{healthcare expenditure}_t = \beta_0 + \beta_1\,\text{gov financing}_t + \beta_2\,\text{gdp}_t + \beta_3\,\text{population}_t + e_t$

The second (causal relationship) model is: $\text{healthcare expenditure per capita}_t = \beta_0 + \beta_1\,\text{gdp per capita}_t + e_t$

Is it possible to use a unit root test and then ARDL for the first model, and which test can I use for the second model?

Thank you in advance to those who reply :)

I use a conditional logit model with income, leisure time and interaction terms of the two variables with other variables (describing individual's characteristics) as independent variables.

After running the regression, I use the predict command to obtain probabilities for each individual and category. These probabilities are then multiplied with the median working hours of the respective categories to compute expected working hours.

The next step is to increase wage by 1%, which increases the variable income by 1% and thus also affects all interaction terms which include the variable income.

After running the modified regression, again I use the predict command and should obtain slightly different probabilities. My problem is now that the probabilities are exactly the same, so that there would be no change in expected working hours, which indicates that something went wrong.

On the attached images with extracts of the two regression outputs one can see that indeed the regression coefficients of the affected variables are very, very similar and that both the value of the R² and the values of the log likelihood iterations are exactly the same. To my mind these observations should explain why probabilities are indeed very similar, but I am wondering why they are exactly the same and what I did possibly wrong. I am replicating a paper where they did the same and where they were able to compute different expected working hours for the different scenarios.
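One possible explanation, offered as a guess since the attached outputs are not visible here: if income and every interaction term containing it are all multiplied by the same factor 1.01 and the model is then re-estimated, the coefficients can simply shrink by 1/1.01, leaving utilities, log likelihood and predicted probabilities unchanged. The sketch below illustrates that invariance with a single made-up income variable and a made-up coefficient; it does not estimate anything.

```python
import math

def choice_probabilities(beta, incomes):
    """Conditional-logit choice probabilities for one person over alternatives."""
    utilities = [beta * x for x in incomes]
    top = max(utilities)                       # subtract the max for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

incomes = [10.0, 20.0, 30.0]   # made-up income in each hours category
beta = 0.08                    # made-up income coefficient

baseline = choice_probabilities(beta, incomes)
raised = [1.01 * x for x in incomes]

# What re-estimation on the rescaled data amounts to: the coefficient absorbs the factor.
refit = choice_probabilities(beta / 1.01, raised)

# Policy simulation: keep the originally estimated coefficient, change only the data.
simulated = choice_probabilities(beta, raised)

print("baseline :", [round(p, 4) for p in baseline])
print("refit    :", [round(p, 4) for p in refit])
print("simulated:", [round(p, 4) for p in simulated])
```

If this is the issue, the usual approach is to estimate once on the original data, then change the income variables and predict with the stored coefficients, rather than running the regression a second time.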

In the file that I attached below there is a line upon the theta(1) coefficient and another one exactly below C(9). In addition, what is this number below C(9)? There is no description

Hello, I am facing a problem concerning the computation of regression coefficients (necessary information in attached image):

Three regression coefficients (alpha y, m and f) of the main regression (2) are generated through three separate regressions.

Now I was wondering which would be the appropriate way to compute the alphas and gammas.

In case I first run regression (2) and obtain the three regression coefficients alpha y, m and f, can I use these for the separate regressions as dependent variables in order to then run regressions (3) and obtain the gammas?

What strikes me about this approach is that the value of the dependent variables alpha y, m and f would always be the same for each observation.

In the paper they state that the alphas are vectors, but I don't properly understand how they could be vectors (maybe that's the issue after all?).

Or is there a way to directly merge the regressions / directly integrate the regressions (3) into regression (1)? Preferably in Stata.

I appreciate any help, thank you!

Dear All,

I’m conducting an event study for the yearly inclusion and exclusion of some stocks (from different industry sectors) in an index.

I need to calculate the abnormal return per each stock upon inclusion or exclusion from the index.

I have some questions:

1- How should I decide on the length of the "Estimation Window" (how far back to go), and how can I justify it?

2- Stock return is calculated by:

(price today – price yesterday)/(price yesterday)

OR

LN(price today/price yesterday)?

I see both ways are used, although they give different results.

Can any of them be used to calculate CAR?

3- When calculating the abnormal return as the difference between the stock return and a benchmark (market) return, should the benchmark be the index itself (the one the stocks are included in or excluded from), or the sector index related to the stock?

Appreciate your advice with justification.

Many thanks in advance.
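The two return definitions in question 2 can be compared directly. The sketch below uses made-up prices and shows the main practical difference: log returns sum over time to the total log return, whereas simple returns have to be compounded.

```python
import math

prices = [100.0, 102.0, 99.0, 105.0]   # made-up daily closing prices

pairs = list(zip(prices, prices[1:]))
simple_returns = [(today - yesterday) / yesterday for yesterday, today in pairs]
log_returns = [math.log(today / yesterday) for yesterday, today in pairs]

# Log returns are additive over time.
total_log = sum(log_returns)
# Simple returns must be compounded to get the holding-period return.
total_simple = math.prod(1 + r for r in simple_returns) - 1

print("simple:", [round(r, 5) for r in simple_returns])
print("log   :", [round(r, 5) for r in log_returns])
print("summed log return:", round(total_log, 5))
print("compounded simple return:", round(total_simple, 5))
```

For small daily moves the two are close, since ln(1 + r) is approximately r. Whichever is chosen, it should be used consistently for both the stock and the benchmark.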

I have a big dataset (n>5,000) on corporate indebtedness and want to test whether SECTOR and FAMILY-OWNED are significant in explaining it. The information is a percentage (total liabilities/total assets) but is NOT bounded: many companies have an indebtedness above 100%. My hypotheses are that the SERVICES sector is more indebted than other sectors, and that FAMILY-OWNED companies are less indebted than other companies.

If the data were normally distributed and had equal variances, I'd perform a two-way ANOVA.

If the data were normally distributed but were heteroscedastic, I'd perform a two-way robust ANOVA (using the R package "WRS2")

As the data are neither normally distributed nor homoscedastic (according to many tests I performed), and there is no such thing as a "two-way Kruskal-Wallis test",

**which is the best option?**

1) perform a **generalized least squares regression** (therefore corrected for heteroscedasticity) to check for the effect of two factors on my dependent variable?

2) perform a **non-parametric ANCOVA** (with the R package "sm"? Or "fANCOVA"?)

What are the pros and cons of each alternative?

I have run an ARDL model on time-series cross-sectional data, but the output does not report the R-squared. What could be the reason(s)?

Thank you.

Maliha Abubakari

Hi colleagues,

I use Stata13 and I want to run panel ARDL on the impact of institutional quality on inequality for 20 SSA countries. I have never used the technique so I am reading up available articles that used it. But I need help with a Stata do-file because I still don't know what codes to apply, how to arrange my variables in the model, and what diagnostics to conduct.

Any help or suggestion will do....thanks in anticipation!!!

Dear research community,

I am currently working with Hofstede's dimensions, however, I do not exactly use his questionnaire. In order to calculate my index in accordance to his process, I am looking for the meaning of the constants in front of the mean scores.

For example: PDI = 35(m07 – m02) + 25(m20 – m23) ... What do 35 and 25 mean? How could I calculate them with regard to my research?

Thank you very much for your help!

Best wishes,

Katharina Franke

Dear Researchers,

I'm working on research using the DEA (Data Envelopment Analysis) method to measure provincial energy efficiency. However, due to data constraints, provincial energy consumption data are not available. Can I assume that provincial energy consumption is proportional to provincial GDP?

That is: energy consumption of province i ≈ (national energy consumption / national GDP) × GDP of province i?

Can I use the Granger causality test on monetary variables only, or do I need non-monetary variables?

Also, do I need to run any test before Granger, like a unit root test, or can I just use the raw data?

What free programs can I use to run the analysis?

I have heard some academics argue that t-test can only be used for hypothesis testing. That it is too weak a tool to be used to analyse a specific objective when carrying out an academic research. For example, is t-test an appropriate analytical tool to determine the effect of credit on farm output?

I am using "mvprobit" in Stata; however, it is not clear to me how I can estimate marginal effects after it. Any help will be much appreciated.

What is the most acceptable method to measure the impact of regulation/policy so far?

I only know the Difference-in-Difference (DID), Propensity Score Matching (PSM), Two-Step System GMM (for dynamic) are common methods. Expecting your opinion for 20 years long panel for firm-level data.

Hello everyone,

I would like to analyze the effect of innovation in one industry over a time period of 10 years. The dependent variable is exports and the independent variables are R&D and labour costs.

What is the best model to use? I am planning to use a log-linear model.

Thank you very much for your greatly needed help!

Dear colleagues,

I am planning to investigate the panel data set containing three countries and 10 variables. The time frame is a bit short that concerns me (between 2011-2020 for each country). What should be the sample size in this case? Can I apply fixed effects, random effects, or pooled OLS?

Thank you for your responses beforehand.

Best

Ibrahim

Hi Everyone,

I am investigating the change of a dependent variable (Y) over time (Years). I have plotted the dependent variable across time as a line graph and it seems to be correlated with time (i.e. Y increases over time but not for all years).

I was wondering if there is a formal statistical test to determine if this relationship exists between the time variable and Y?

Any help would be greatly appreciated!
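One formal option for this kind of question is the Mann-Kendall trend test, which checks for a monotonic trend without assuming linearity. The sketch below is a minimal version under stated assumptions: the yearly values are made up, there is no correction for ties or serial correlation, and the p-value uses the normal approximation.

```python
import math
from statistics import NormalDist

def mann_kendall(values):
    """Mann-Kendall trend test without tie correction.

    Returns the S statistic, the continuity-corrected Z score and the
    two-sided p-value from the normal approximation.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)      # +1, 0 or -1 for each pair
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, z, p_value

y = [3.1, 3.4, 3.3, 3.9, 4.2, 4.0, 4.8, 5.1, 5.0, 5.6]   # made-up yearly values
s, z, p_value = mann_kendall(y)
print("S =", s, " Z =", round(z, 3), " p =", round(p_value, 5))
```

A regression of Y on the year variable with a t-test on the slope is the parametric alternative; with autocorrelated data, both approaches need adjusted standard errors or a modified test.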

Dear Research Community,

I would like to check for structural breaks in a polynomial regression of expected excess return on excess equity-to-bond market volatility. I have found some useful references, but none of them deals with polynomials. For instance:

- Andrews, D.W.K., 1993, Tests for Parameter Instability and Structural Change With Unknown Change Point. Econometrica 61, 821-856.

- Bai, J. and P. Perron, 1998, Estimating and Testing Linear Models With Multiple Structural Changes. Econometrica 66, 47-78.

- Bai, J. and P. Perron, 2003. Computation and Analysis of Multiple Structural Change Models. Journal of Applied Econometrics 18, 1-22.

- Bai, J. and P. Perron, 2004. Multiple Structural Change Models: A Simulation Analysis. In Econometric Essays, Eds. D. Corbae, S. Durlauf, and B.E. Hansen (Cambridge, U.K.: Cambridge University Press).

As my polynomial is of order 3, I am wondering whether structural breaks have to be checked for the parameters of all three terms (X, X² and X³) as time-varying, or is there another efficient way to handle this issue?

Thank you!

Faten

Hi,

I would like to test for a unit root in a panel data series. For this, I want to use the Hadri and Rao (2008) test with structural breaks. Is there any way to perform the test in Stata or any similar statistical software?

thanks,

Sagnik

My research is to find out the determinants of FDI. I am doing bound test to see the long run relationship, cointegration test and other diagnostic tests.

Dear Colleagues,

I ran an Error Correction Model, obtaining the results depicted below. The model comes from the literature, where Dutch disease effects were tested in the case of Russia. My dependent variable was the real effective exchange rate, while oil prices (OIL_Prices), terms of trade (TOT), public deficit (GOV) and industrial productivity (PR) were the independent variables. My main concern is that only the Error Correction Term, the dummy variable, and the intercept are statistically significant. Moreover, the residuals are not normally distributed and are heteroscedastic. There is no serial correlation issue according to the LM test. How can I improve my findings? Thank you beforehand.

Best

Ibrahim

I estimated an autoregressive model in EViews. I got a parameter estimate for one additional variable that I did not include in the model. The variable is labelled 'SIGMASQ'.

What is that variable and how should I interpret it?

I am attaching the results of the autoregressive model.

Thanks in advance.

Dear colleagues,

I applied the Granger Causality test in my paper and the reviewer wrote me the following:

*the statistical analysis was a bit short – usually the Granger-causality is followed by some vector autoregressive modeling...*

What can I respond in this case?

P.S. I had a small sample size and serious data limitation.

Best

Ibrahim

Hello, dear network. I need some help.

I'm working on research, using the Event Study Approach. I have a couple of doubts about the significance of the treatment variable leads and lag coefficients.

I'm not sure to be satisfying the pre-treatment Parallel Trends Assumption: all the lags are not statistically significant and are around the 0 line. Is that enough to accomplish the identification assumption?

Also, I'm not sure about the leads coefficient's significance and their interpretation. The table with the coefficients is attached.

Thank you so much for your help.

Dear researchers,

I am working on formulating a hydrological model in which runoff (the output variable) is available at a monthly time step while rainfall (the input variable) is available at a daily time step.

I firstly wanted to explore mathematical models and techniques that can be used here. I have found MIDAS regression method, which forms relationship between mixed frequency data variables (output at monthly time step and input at daily time step). But the problem is variables in hydrological models are at the same time step. So that technique will not work, because the MIDAS model will have relation between variables sampled at different frequency.

So can anyone suggest relevant literature, in which both output and input variables of model are related at high frequency (say daily) but the model is learning through low frequency (monthly) output data and high frequency (daily) input data.

The pairwise Granger causality test can be run in EViews. Is this test alone reliable enough to explain causality? And does it test only long-run causality, or both long-run and short-run causality?

My aim is to find out the significant relationship between FDI and its determinants. I am using the bounds test and an error correction model.

Hello everyone. I am using a VECM and I want to use variance decomposition, but as you know, variance decomposition is very sensitive to the ordering of the variables. I read in some papers that it is better to use generalized variance decomposition because it is invariant to the ordering of the variables. I can use Stata, R or EViews; the problem is how to perform the generalized variance decomposition. If anyone knows, please help me.

I am running an ARDL model on eviews and I need to know the following if anyone could help!

1. Is the optimal number of lags for annual data (30 observations) 1 or 2, **OR** should a VAR be applied to determine the optimal number of lags?

2. When we applied the VAR, the maximum number of lags applicable was 5; beyond 5 we got a singular matrix error. The problem is that as we increase the maximum number of lags, the selected optimal number of lags increases too (when we allow 2 lags, we get 2 as the optimum; when we allow 5 lags, we get 5 as the optimum). What should be done?

In one of my papers, I applied a model with Newey-West standard errors to panel data for robustness purposes. I want to differentiate this model from the FMOLS and DOLS models. On what grounds can we justify this model over FMOLS and DOLS?

My research is on foreign direct investment and its determinants. I need to see whether there is any significant relationship between the variables by looking at the p-values. Should I interpret all the variables, including the lagged ones?

Dear colleagues,

I ran several models in OLS and found these results (see the attached screenshot please). My main concern is that some coefficients are extremely small, yet statistically significant. Is it a problem? Can it be that my dependent variables are index values that ranged between -2.5 and +2.5 while I have explanatory variables that have, i.e the level of measurement in Thousand tons? Thank you beforehand.

Best

Ibrahim
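The units explanation raised in this question can be checked numerically. The sketch below uses made-up numbers and a hypothetical helper `ols_slope_and_t`: dividing a regressor by 1,000 multiplies its coefficient by 1,000 while leaving the t-statistic unchanged, so a tiny coefficient on a large-scale variable is a matter of units rather than a sign of a problem.

```python
import math

def ols_slope_and_t(x, y):
    """Slope and t-statistic from a simple regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    std_error = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_error

thousand_tons = [1200.0, 1500.0, 1800.0, 2100.0, 2600.0, 3000.0]   # made-up regressor
index_value = [-1.1, -0.6, -0.7, 0.2, 0.5, 1.3]                    # made-up index in [-2.5, 2.5]

slope_k, t_k = ols_slope_and_t(thousand_tons, index_value)

million_tons = [v / 1000 for v in thousand_tons]                   # same data, larger unit
slope_m, t_m = ols_slope_and_t(million_tons, index_value)

print("per thousand tons:", slope_k, "t =", round(t_k, 3))
print("per million tons :", slope_m, "t =", round(t_m, 3))
```

Rescaling the regressors (or taking logs, where that makes economic sense) is a common way to make such coefficients easier to read in a table.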

I have GDP and MVA data, and although the MVA is stationary, the GDP is non-stationary even after log-transformation followed by detrending followed by differencing. I want to build a VAR/VEC model for ln(GDP) and ln(MVA), but this data has been haunting me for the past 3 days. I tried both detrending by linear regression and direct differencing, but nothing seems to work.

Also, the two series (ln GDP and ln MVA) pass the cointegration test, and their trends are very similar. But for a VAR/VEC I need them to be I(1), which is not the case. Any suggestions on how to handle this data will be highly appreciated!

I have attached the snapshot of the data and also the data itself.

I would like to employ within transformation in panel data analysis. Market Value Added represents dependent variable. Various value drivers (advertising expenses, number of patents, etc) are explanatory variables. Is it appropriate to use standardized coefficients? Maybe logarithmic forms of a regression is more suitable

Dear Colleagues,

I noticed that when I estimate an equation by least squares in EViews, the options tab has a check box for degrees of freedom (d.f.) adjustment. What is its role and why does it matter? When I estimate an equation without the d.f. adjustment, I get two statistically significant coefficients out of five explanatory variables; however, when I estimate with the d.f. adjustment, I do not get any significant results.

Thank you beforehand.

I built a nested logit model. Level 1: 8 choices and level 2: 22 choices.

In type 4, there is only one choice in level 1, which corresponds to one choice in level 2.

The dissimilarity parameter is equal to 1 in this case (not surprising).

Can I run the model normally when I have an IV parameter that is equal to one?

Can the results be interpreted normally, or what should I do in this case?

I tried the command "constraint 1=[type4_tau]_cons=1" but the model does not run.

What can I do?

Thanks in advance for your advice.

I am trying to estimate a Cobb-Douglas production function.

The problem is that my dataset captures each firm at a single point in time.

The dataset covers the period 1988-2012.

Each firm appears only once!

(I cannot tell whether it is panel, time series or cross-section data.)

I want to find the effect of labor and capital on value added.

I have information on intermediate inputs.

I use two methods: Olley-Pakes and Levinsohn-Petrin.

But Stata always tells me that there are no observations!

my command:

levpet lvalue, free(labour) proxy(intermediate_input) capital(capital) valueadded reps(250)

Why is the command not working and reporting that there are no observations?

(Is this due to the fact that each firm appears only once in the data?)

(If yes, what are the possible corrections for simultaneity and selection bias in this data?)

Thanks in advance for your help,

Mina

Dear All,

I have panel data that fits a difference-in-differences design.

I regress bilateral FDI on bilateral investment treaties (BIT).
BIT: a dummy taking 1 if a BIT exists and zero otherwise.
Bilateral FDI: the amount of FDI between the two economies.

**Objective:** Examine whether BIT enhances bilateral FDI.

The issue is: each country started its BIT with a partner country at a different time, so there is NO single fixed treatment time for the whole data.

**I am willing to assume different time periods in a random way and run my Diff in Diff (for robustness):**

**Year 2004**

**Year 2006**

**Year 2008**

**My questions :**

**(1) Do you think this method is appropriate?**

**(2) Any suggestions for the random selection of time?**

I am interested to know about the difference between 1st and 2nd and 3rd generation panel data techniques.....

I am currently trying to estimate the effect of energy crises on food prices. Given the link between energy and food prices, I am inclined to reason that ECM will be best to estimate the relationship between food price and energy price (fuel price). Additionally I would like to include dummy variables in the model to estimate the effects of periods of energy crises on food prices. This I know is simple to do.

Where I am confused is how to model price volatility in the context of an ECM. I am interested in how the fuel price, as well as the structural dummies for energy crises, influences not just the level of food prices but their volatility as well.

Can anyone help me to carry out mean group analysis and pooled mean group analysis. I have used Microfit and Eviews before. Appreciate if I can get some advice on how to use these panel data methods in Microfit, Eviews and STATA.

Hello,

I am estimating a bivariate probit model, where the errors of the two probit equations are correlated and therefore not independent. However, I suspect that one of the explanatory variables of both models may also cause endogeneity problems. My question is whether there is a perhaps two-stage procedure to correct this situation? Instrumental variables maybe? Could you suggest literature on this problem?

Dear All,

I would like to perform event study analysis through website: https://www.eventstudytools.com/.

Unfortunately, they ask for the data to be uploaded in a format I don't understand. I don't know how to put the data into this form, and I can't find a user manual or an email address to contact them.

Can anyone kindly advise how to use this service and explain it in a plain easy way?

Thanks in advance.

Ahmed Samy

Dear All,

I'm conducting an event study for a sample of 25 firms, each of which has gone through a certain yearly event (inclusion in an index).

(The 25 firms (events) are collected from the last 5 years.)

I'm using daily abnormal returns (AR), and I averaged them across the 25 firms to get daily "Average Abnormal Returns" (AAR).

Estimation Window (before the event)= 119 days

Event Window = 30 days

1- I tested the significance of the daily AAR through a t-test and the corresponding p-value. How can I calculate the statistical power for those daily p-values?

(significance level used = 0.05, two-tailed)

2- I calculated "Cumulative Average Abnormal Returns" (CAAR) for some period in the event window and tested its significance with a t-test and the corresponding p-value. How can I calculate the statistical power of this CAAR significance test?

(significance level used = 0.05, two-tailed)

Thank you for your help and guidance.

Ahmed Samy
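Power is computed for an assumed true effect size rather than from the observed p-values. The sketch below uses the normal approximation to a two-sided test; the true AAR of 0.5% and the daily abnormal-return standard deviation of 2% are hypothetical inputs chosen only for illustration, and abnormal returns are assumed independent across the 25 firms.

```python
from statistics import NormalDist

def two_sided_power(true_effect, std_error, alpha=0.05):
    """Approximate power of a two-sided test using the normal distribution."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = true_effect / std_error
    # Probability of landing in either rejection region when the effect is real.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

n_firms = 25
sd_abnormal_return = 0.02                     # hypothetical, e.g. taken from the estimation window
se_aar = sd_abnormal_return / n_firms ** 0.5  # standard error of the cross-sectional average

power_aar = two_sided_power(true_effect=0.005, std_error=se_aar)
print("power to detect a 0.5% AAR:", round(power_aar, 3))

# For a CAAR over L days with independent daily AARs, the standard error scales with sqrt(L).
window_days = 5
se_caar = se_aar * window_days ** 0.5
power_caar = two_sided_power(true_effect=0.005 * window_days, std_error=se_caar)
print("power to detect a 2.5% CAAR over 5 days:", round(power_caar, 3))
```

With only 25 events, a t-distribution-based calculation would give slightly lower power than this normal approximation.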

The original series is nonstationary as it has a clear increasing trend and its ACF plot gradually dampens. To make the series stationary, what optimum order of differencing (d) is needed?

Furthermore, if the ACF and PACF plots of the differenced series do not cut off after a definite value of lags but have peaks at certain intermittent lags. How to choose the optimum values of 'p' and 'q' in such a case?
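As a small illustration of how the order of differencing relates to the shape of the trend, the sketch below applies differencing to two made-up deterministic series: one difference removes a linear trend, while a quadratic trend needs two. The helper `difference` is hypothetical and written only for this example.

```python
def difference(series, d=1):
    """Apply first differencing d times."""
    for _ in range(d):
        series = [later - earlier for earlier, later in zip(series, series[1:])]
    return series

linear_trend = [2 + 3 * t for t in range(8)]
quadratic_trend = [t * t for t in range(8)]

d1_linear = difference(linear_trend, d=1)        # constant: the trend is gone
d1_quadratic = difference(quadratic_trend, d=1)  # still increasing
d2_quadratic = difference(quadratic_trend, d=2)  # constant after two differences

print(d1_linear)
print(d1_quadratic)
print(d2_quadratic)
```

For real data, the usual practice is to difference once, re-check stationarity with a unit root test and the ACF, and difference again only if needed; spikes at regular intermittent lags in the differenced series often point to a seasonal component.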

I have seen that some researchers just compare the difference in R

^{2 }in two models: one in which the variables of interest are included and one in which they are excluded. However, in my case, I have that this difference is small (0.05). Is there any method by which I can be sure (or at least have some support for the argument that) this change is not just due to luck or noise?To illustrate my point I present you an hypothetical case with the following equation:

wage = C + 0.5 × education + 0.3 × rural area

Here, the variable "education" measures the number of years of education a person has, and "rural area" is a dummy variable that takes the value 1 if the person lives in a rural area and 0 if she lives in an urban area.

In this situation (and assuming no other relevant factors affecting wage), my questions are:

1) Does the 0.5 coefficient on education reflect the difference between (1) the mean marginal return of an extra year of education on the wage of an urban worker and (2) the mean marginal return of an extra year of education on the wage of a rural worker?

a) If my reasoning is wrong, what is the intuition behind the mechanism of "holding constant"?

2) Mathematically, how does just adding the rural variable "hold constant" the effect of living in a rural area on the relationship between education and wage?
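On the "holding constant" question, the Frisch-Waugh-Lovell theorem gives the mechanics: the education coefficient in the regression with the rural dummy equals the slope obtained after removing from both wage and education whatever the dummy explains, which for a dummy means subtracting the group means. So the coefficient is a single slope shared by rural and urban workers and estimated from within-group variation, not a difference between the two groups' returns. The sketch below checks this numerically on simulated data. The sample size, the noise levels, and the assumption that rural workers have fewer years of education on average are made up for illustration; only the 0.5 and 0.3 come from the hypothetical equation.

```python
import random
from statistics import fmean

random.seed(0)
n = 500
rural = [random.randint(0, 1) for _ in range(n)]
# Assumption: education is correlated with the dummy (rural mean 9 years, urban mean 12)
edu = [random.gauss(12 - 3 * r, 2) for r in rural]
wage = [1 + 0.5 * e + 0.3 * r + random.gauss(0, 1) for e, r in zip(edu, rural)]

def demean(v):
    m = fmean(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# (a) Coefficient on education from the two-regressor OLS (closed form)
y, x1, x2 = demean(wage), demean(edu), demean(rural)
b_edu = (dot(x1, y) * dot(x2, x2) - dot(x2, y) * dot(x1, x2)) / (
    dot(x1, x1) * dot(x2, x2) - dot(x1, x2) ** 2
)

# (b) Frisch-Waugh-Lovell: partial the dummy out of wage and education
#     (equivalent to demeaning within each group), then run a simple regression
def within_group_demean(v, g):
    means = {k: fmean([a for a, gi in zip(v, g) if gi == k]) for k in set(g)}
    return [a - means[gi] for a, gi in zip(v, g)]

wage_resid = within_group_demean(wage, rural)
edu_resid = within_group_demean(edu, rural)
b_fwl = dot(edu_resid, wage_resid) / dot(edu_resid, edu_resid)

print(f"multiple regression: {b_edu:.4f}, within-group slope: {b_fwl:.4f}")
```

If the return to education is expected to differ between rural and urban workers, that would need an interaction term (education × rural area), which this equation does not include.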

I am trying to learn how to use the augmented ARDL approach, but I could not find a command for it in Stata. Can anyone please point me to user-written code for augmented ARDL?
Is there a good paper that describes the difference between the ARDL bounds test and the augmented ARDL procedure? I would be happy if you could answer these questions.

In the augmented ARDL, I find there are three tests to confirm long-run cointegration: the overall F-test, the t-test on the lagged dependent variable, and the F-test on the lagged independent variables.

- How do I find/calculate the t-statistic for the lagged dependent variable?
- How do I find/calculate the F-statistic for the lagged independent variables?

Using Stata, I find that the bounds test produces two test statistics: an F-statistic and a t-statistic. But both of them are for the overall test of cointegration. How can I find the t-statistic for the lagged dependent variable and the F-statistic for the lagged independent variables?

Thank you.
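In the conditional error-correction form of the ARDL model, both statistics can be computed by hand from ordinary regressions: the F-statistic on the lagged independent variables comes from comparing the residual sums of squares with and without those lagged levels, and the t-statistic on the lagged dependent variable is the signed square root of the single-restriction F-statistic for that term. The sketch below is an illustration in plain Python, not the Stata routine. The helper names, the simulated data, and the lag structure (one lag of levels, contemporaneous differences only) are assumptions.

```python
import random
from math import copysign, sqrt

def ols(y, X):
    """Least squares of y on the columns of X; returns (coefficients, residual SS)."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    beta = [A[i][k] for i in range(k)]
    ssr = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    return beta, ssr

def f_exclusion(y, X, cols):
    """F-statistic for the joint null that the coefficients on `cols` are zero."""
    n, k = len(y), len(X[0])
    _, ssr_u = ols(y, X)
    X_r = [[v for j, v in enumerate(r) if j not in cols] for r in X]
    _, ssr_r = ols(y, X_r)
    return ((ssr_r - ssr_u) / len(cols)) / (ssr_u / (n - k))

# Simulated data (assumption): y adjusts towards the long-run relation y = x1 + 0.5 * x2
random.seed(1)
x1, x2, yv = [0.0], [0.0], [0.0]
for _ in range(300):
    gap = yv[-1] - x1[-1] - 0.5 * x2[-1]
    yv.append(yv[-1] - 0.3 * gap + random.gauss(0, 1))
    x1.append(x1[-1] + random.gauss(0, 1))
    x2.append(x2[-1] + random.gauss(0, 1))

# Regress dy_t on: const, y_{t-1}, x1_{t-1}, x2_{t-1}, dx1_t, dx2_t
dy = [yv[t] - yv[t - 1] for t in range(1, len(yv))]
X = [[1.0, yv[t - 1], x1[t - 1], x2[t - 1], x1[t] - x1[t - 1], x2[t] - x2[t - 1]]
     for t in range(1, len(yv))]

beta, _ = ols(dy, X)
f_lagged_x = f_exclusion(dy, X, [2, 3])                       # F on lagged x levels
t_lagged_y = copysign(sqrt(max(f_exclusion(dy, X, [1]), 0.0)), beta[1])  # t on y_{t-1}
print(f"t on lagged y: {t_lagged_y:.2f}, F on lagged x: {f_lagged_x:.2f}")
```

Neither statistic follows a standard t or F distribution under the null of no cointegration, so they would need to be compared with the bounds critical values tabulated in the literature rather than with ordinary tables.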

The paper I am working on is a multivariate study. I am planning to use this model because it has two advantages:

1. It tests the stability of the long-term relationship across quantiles and provides a more flexible econometric framework.

2. It can explain the possible asymmetry in the response of one variable to changes in another variable.

For these two reasons, I prefer it to NARDL.

As I am not good at Stata coding, any help with coding this method would be highly appreciated.

Dear All,

I’m conducting an event study of the inclusion of companies in a certain index.

The event is the “inclusion event” for companies in this index over the last 5 years.

For the events, we have a yearly announcement date (AD) for inclusions and also an effective change date (CD) for the inclusion in the index.

Within the same year, I have aligned all companies on AD as day 0, and since they are companies from the same year, CD also aligns for all of them.

The problem comes when I try to aggregate companies from different years. Although I aligned them all on the same AD, the CD differs from one year to another, so CDs don’t align for companies from different years.

How can I overcome this misalignment of CDs across different years so that I can aggregate all the companies together?

Many Thanks.
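One possible workaround, offered only as a suggestion, is to build two separate event-time panels: one aligned on AD as day 0 and one aligned on CD as day 0. Each firm then contributes a window around its own AD and another around its own CD, so the varying AD-to-CD gap across years no longer matters for aggregation (as long as the windows are short enough not to overlap heavily). The sketch below shows the idea. The data layout (a dict per firm with an `ar` list and integer positions `ad` and `cd`), the function names, and the toy numbers are all hypothetical.

```python
from statistics import fmean

def event_window(returns, event_idx, before, after):
    """Abnormal returns from event_idx - before to event_idx + after (inclusive)."""
    if event_idx - before < 0 or event_idx + after >= len(returns):
        raise ValueError("window falls outside the available series")
    return returns[event_idx - before : event_idx + after + 1]

def average_by_event_time(firms, date_key, before, after):
    """Average abnormal returns across firms in event time relative to `date_key`."""
    windows = [event_window(f["ar"], f[date_key], before, after) for f in firms]
    return [fmean(day) for day in zip(*windows)]

# Toy data: the gap between AD and CD is 3 days for firm A and 5 days for firm B
firms = [
    {"name": "A", "ar": [0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0], "ad": 2, "cd": 5},
    {"name": "B", "ar": [0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 4.0, 0.0], "ad": 1, "cd": 6},
]

aar_around_ad = average_by_event_time(firms, "ad", before=1, after=1)
aar_around_cd = average_by_event_time(firms, "cd", before=1, after=1)
print("AAR around AD:", aar_around_ad)
print("AAR around CD:", aar_around_cd)
```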

Dear all,

I'm conducting an event study of the effect of a news announcement on a certain date on stock returns.

When using the market model to estimate the expected stock return in the "estimation window", we need to regress the returns of the stock under study on the returns of a market portfolio index.

1- How should we choose this market portfolio index for the regression?

Is it just the main index of the market?

The sector index to which the stock under study belongs? Something else?

2- Is it necessary that the stock under study be among the constituents of this market index?

If possible, please support your answers with research citations.

Many thanks
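For reference, the market model step described above can be sketched as follows: fit alpha and beta by least squares over the estimation window, then subtract the fitted return from the actual return in the event window to get abnormal returns. The function names and all return figures below are hypothetical. The regression itself is the same whichever index is used, so one practical check is to rerun it with each candidate index and see how sensitive the abnormal returns are to the choice.

```python
from statistics import fmean

def market_model(stock, market):
    """OLS fit of stock_t = alpha + beta * market_t over the estimation window."""
    mean_s, mean_m = fmean(stock), fmean(market)
    beta = sum((m - mean_m) * (s - mean_s) for s, m in zip(stock, market)) / sum(
        (m - mean_m) ** 2 for m in market
    )
    alpha = mean_s - beta * mean_m
    return alpha, beta

def abnormal_returns(stock, market, alpha, beta):
    """Actual return minus the return predicted by the market model."""
    return [s - (alpha + beta * m) for s, m in zip(stock, market)]

# Made-up estimation window: index returns and the stock's returns on the same days
est_market = [0.010, -0.020, 0.015, 0.000, 0.005]
est_stock = [0.001 + 1.2 * m for m in est_market]
alpha, beta = market_model(est_stock, est_market)

# Made-up event window: the stock earns 3% more than the model predicts on day 0
evt_market = [0.004, 0.010]
evt_stock = [0.001 + 1.2 * 0.004, 0.001 + 1.2 * 0.010 + 0.03]
ar = abnormal_returns(evt_stock, evt_market, alpha, beta)
print(f"alpha={alpha:.4f}, beta={beta:.2f}, AR={[round(a, 4) for a in ar]}")
```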

I am currently assisting with research on cross-border capital flows.

A common problem seems to be that both the acquisition of assets and valuation effects determine cross-border asset holdings as reported, for example, in the CPIS data. Hobza and Zeugner (2014) use the BoP statistics on portfolio investment to derive valuation effects on portfolio debt and equity (change in asset holdings minus acquisitions).

I am wondering if the valuation effect could also be estimated, because I want to distinguish not only between portfolio debt and equity but also between different types of instruments, for instance between different debt maturities.
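The identity quoted above (valuation effect = change in asset holdings minus acquisitions) can in principle be applied to any instrument breakdown for which both positions and transactions are available. A minimal sketch, with hypothetical instrument labels and made-up numbers:

```python
def valuation_effects(holdings, acquisitions):
    """Per-period valuation effect: change in holdings minus net acquisitions."""
    if len(acquisitions) != len(holdings) - 1:
        raise ValueError("need one acquisition figure per change in holdings")
    return [h1 - h0 - a for h0, h1, a in zip(holdings, holdings[1:], acquisitions)]

# Hypothetical end-of-period positions and net acquisitions during each period
positions = {
    "short_term_debt": ([100.0, 120.0, 115.0], [15.0, -2.0]),
    "long_term_debt": ([200.0, 210.0, 230.0], [4.0, 25.0]),
}

effects = {name: valuation_effects(h, a) for name, (h, a) in positions.items()}
print(effects)
```

Whether this is feasible by maturity depends on whether the flow data are published at the same level of detail as the holdings.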

Dear community,

I am struggling with statistics for price comparison.