Questions related to Time Series Econometrics
I've read Box & Jenkins, 5th edn; ARCH and GARCH appear in the later chapters, but only in the context of model checking. In Hamilton there is no mention at all. If I use an ARIMA model for hypothesis tests (assuming conditional LS and MLE for estimating the coefficients), would heteroskedasticity be a problem, as in resulting in incorrect standard errors of the coefficients?
I'm using stock price data to model and forecast volatility in EViews. When I use the static forecast, the mean absolute percentage error is high, but it is low when I use the dynamic forecast. What is the difference between these two methods, and which is more accurate?
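On static vs dynamic forecasts: a static forecast produces one-step-ahead predictions, conditioning on the actual value at each step, while a dynamic forecast conditions on its own earlier forecasts from the forecast origin onward. A minimal numpy sketch with a hypothetical AR(1) (the coefficient 0.8 and the sample split are illustrative, not taken from EViews):

```python
import numpy as np

# Hypothetical AR(1) data: y_t = 0.8*y_{t-1} + e_t
rng = np.random.default_rng(0)
y = np.zeros(120)
for t in range(1, 120):
    y[t] = 0.8 * y[t - 1] + rng.normal()

phi = 0.8          # assume this is the estimated coefficient, for illustration
h0 = 100           # forecast origin: last 20 points held out

# Static (one-step-ahead): each forecast conditions on the ACTUAL previous value
static = np.array([phi * y[t - 1] for t in range(h0, 120)])

# Dynamic (multi-step): each forecast conditions on the PREVIOUS FORECAST
dynamic = np.zeros(20)
prev = y[h0 - 1]
for i in range(20):
    prev = phi * prev          # feed the forecast back in
    dynamic[i] = prev

actual = y[h0:]
mape = lambda f: np.mean(np.abs((actual - f) / actual)) * 100
print(mape(static), mape(dynamic))
```

Because the static forecast re-anchors on realized data at every step, it is usually the more accurate of the two over a hold-out sample, while a dynamic forecast of a stationary model drifts toward the unconditional mean as the horizon grows; a lower dynamic MAPE is therefore worth double-checking (e.g., how the forecast sample is aligned).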
In many articles I have seen that the ARDL model can be used when there is only one cointegrating relationship. Therefore, to check the number of cointegrating relationships, I used the Johansen cointegration test and found that there is only one. In theory, however, the ARDL model uses the bounds F-statistic test to check whether cointegrating relationships exist. How do we identify the number of cointegrating relationships using the bounds test? Is the bounds test unnecessary if I have already used the Johansen cointegration test?
I am analyzing Sri Lankan stock market performance during COVID-19. I considered daily stock prices from 2015 to 2021. The total number of days in the series is 2555, but 905 of them are missing records due to stock exchange holidays. Is imputing those values based on the available data okay, or does imputing such a large number of observations affect the accuracy of the results? Please give me a solution. This is the only data available for the stock market.
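For exchange-holiday gaps, a common alternative to imputation is simply dropping the non-trading days and treating trading days as consecutive; imputing 905 of 2555 observations would mechanically dampen measured volatility. A small numpy sketch contrasting the two options (the price values are made up):

```python
import numpy as np

# Hypothetical price series with NaNs on exchange holidays
prices = np.array([100.0, 101.5, np.nan, np.nan, 103.0, 102.2, np.nan, 104.1])

# Option 1 (usual for stock data): drop holidays, treat trading days as consecutive
trading_only = prices[~np.isnan(prices)]

# Option 2: linear interpolation over the calendar index (use with caution --
# interpolated points add no information and bias volatility estimates downward)
idx = np.arange(len(prices))
obs = ~np.isnan(prices)
filled = np.interp(idx, idx[obs], prices[obs])

print(trading_only, filled)
```

For GARCH-type volatility modelling, Option 1 is the standard treatment of non-trading days rather than imputing roughly a third of the sample.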
I am currently trying to analyze the impact of high levels of geopolitical uncertainty (GPR) on monetary connectivity using a quantile regression model with the quantile set at 0.95. My current concern is whether the coefficients are positive or negative and whether they are significant. However, almost all coefficients are insignificant. Four variables in the current data are non-stationary (including both independent and dependent variables), and all the time series exhibit serial autocorrelation. At present I do not know how to deal with this: whether I take logarithms, differences or log-differences, the transformation changes the economic meaning of the coefficients.
I have a few time series which are cointegrated. First I differenced the non-stationary series and fitted a VAR after making them stationary. Later I fitted a vector error correction model separately and obtained the long-run and short-run relationships. As I understand it, taking differences removes the long-run relationship between variables. Does that mean I can use the VAR to explain the short-run relationship even though the series are cointegrated? If so, can I compare the results of the VAR model with the short-run results of the VECM?
I am looking for a package or command that performs time series decomposition in Stata. So far I have not found anything. An example can be found here: https://towardsdatascience.com/an-end-to-end-project-on-time-series-analysis-and-forecasting-with-python-4835e6bf050b at figure 5.
Looking forward to valuable comments.
I want to know how I can calculate bt (the dependent variable) in equation 1b.
If anyone has code for this methodology, could you kindly share it?
Do these three equations need to be estimated simultaneously? If yes, then how?
Paper: Baur, D. G., & McDermott, T. K. (2010). Is gold a safe haven? International evidence. Journal of Banking & Finance, 34(8), 1886-1898.
Running a time series model with a dependent variable and an endogenous regressor via 2SLS. Both variables are expressed as % growth rates. Cumby-Huizinga is rejected at 5%, so I am looking for possible causes of autocorrelation. The regressor clearly changes behavior after a specific period, and there is good evidence that this change is policy related. Adding a dummy that becomes 1 after the policy changes became effective seems to improve the serial correlation: the Cumby-Huizinga p-value is now 0.12. The dummy coefficient is very small but negative, as expected. I know that omitted variables can potentially cause autocorrelation, but does this also hold if the omitted variable is a dummy, or is this likely to be an artifact?
I have seen papers where PSM has been performed using cross-sectional and panel data. I want to know if PSM can be used for time series data too.
I also have a question: which quantitative method should one use for analysing the impact of a policy intervention when the dataset is time series in nature?
Hello, I am new to VARs and currently building an SVAR with the following variables to analyse monetary policy shocks and their effects on house prices: house prices, GDP growth, inflation, the mortgage rate (first differenced), the central bank base rate (first differenced) and the M4 growth rate. The aim of the VAR analysis is to generate impulse responses of house prices to monetary policy shocks, and to understand the forecast error variance decomposition.
I'm planning on using the base rate and the M4 growth rate as policy instruments, for a time period spanning 1995 to 2015. Whilst all the other variables reject the null hypothesis of non-stationarity in an Augmented Dickey-Fuller test with 4 lags, the M4 growth rate fails to reject the null hypothesis.
Now, if I go ahead anyway and create an SVAR via recursive identification, the VAR is stable (eigenvalues within the unit circle), and the LM test indicates no autocorrelation at lag order 4.
Is the non-stationarity of the M4 growth rate an issue here? Since I am not interested in parameter estimates, but only in impulse responses and the forecast error variance decomposition, there is no need to adjust the M4 growth rate. Is that correct?
Alternatively I could first difference the M4 growth rate to get a measure of M4 acceleration, but I'm not sure what intuitive value that would add as a policy instrument.
Many thanks in advance for your help, please let me know if anything is unclear.
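For readers following the recursive-identification step: with a reduced-form VAR, Cholesky identification amounts to factoring the residual covariance matrix and propagating that factor through the VAR dynamics to get orthogonalized impulse responses, which also feed the FEVD. A numpy sketch with a hypothetical stable two-variable VAR(1) (all numbers illustrative):

```python
import numpy as np

# Hypothetical reduced-form VAR(1): y_t = A @ y_{t-1} + u_t, with Cov(u_t) = Sigma
A = np.array([[0.5, 0.1],
              [0.2, 0.4]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])

# Stability check: all eigenvalues of A inside the unit circle
assert np.max(np.abs(np.linalg.eigvals(A))) < 1

# Recursive (Cholesky) identification: u_t = P @ eps_t, P lower triangular
P = np.linalg.cholesky(Sigma)

# Orthogonalized impulse responses at horizons 0..H: Theta_h = A^h @ P
H = 10
Theta = [np.linalg.matrix_power(A, h) @ P for h in range(H + 1)]

# FEVD building block: (H+1)-step forecast-error variance = sum_h Theta_h @ Theta_h'
fev = sum(T @ T.T for T in Theta)
print(Theta[0], fev)
```

Note that the zero in the impact matrix Theta_0 is exactly where the recursive ordering imposes the identifying restriction, which is why variable ordering matters for the IRFs and the FEVD.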
I have long-term rainfall data and have calculated Mann-Kendall test statistics using the XLSTAT trial version (an add-in for MS Excel). There are options for "asymptotic" and "continuity correction" in the XLSTAT drop-down menu.
- What do the terms "asymptotic" and "continuity correction" mean?
- When and under what circumstances should we apply it?
- Is there any assumption on time series before applying it?
- What are the advantages and limitations of these two processes?
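On the asymptotic option: for larger samples the discrete Mann-Kendall statistic S is treated as approximately normal, and the continuity correction shifts S by 1 toward zero before standardizing, to compensate for approximating a discrete statistic with a continuous distribution. A self-contained sketch (this uses the no-ties variance formula; XLSTAT's exact implementation may differ):

```python
import math

def mann_kendall_z(x, continuity=True):
    """Mann-Kendall trend test: S statistic and (optionally continuity-
    corrected) asymptotic normal Z.  Assumes no ties in the variance formula."""
    n = len(x)
    S = sum(math.copysign(1, x[j] - x[i]) if x[j] != x[i] else 0
            for i in range(n) for j in range(i + 1, n))
    var_S = n * (n - 1) * (2 * n + 5) / 18
    if S > 0:
        Z = (S - 1 if continuity else S) / math.sqrt(var_S)
    elif S < 0:
        Z = (S + 1 if continuity else S) / math.sqrt(var_S)
    else:
        Z = 0.0
    return S, Z

x = [3.1, 3.4, 3.3, 3.9, 4.2, 4.1, 4.6]   # made-up short rainfall-like series
print(mann_kendall_z(x, continuity=True), mann_kendall_z(x, continuity=False))
```

The continuity-corrected Z is always slightly smaller in absolute value, so it is the more conservative choice; the correction matters mainly in small samples, since for long records the shift of 1 is negligible relative to sqrt(var_S).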
Dear all, I want to replicate an EViews plot (attached as Plot 1) in Stata after performing a time series regression. I made an effort to produce this Stata plot (attached as Plot 2). However, I want Plot 2 to look exactly like Plot 1.
Please kindly help me out. Below is the Stata code I ran to produce Plot 2. What exactly do I need to include?
twoway (tsline Residual, yaxis(1) ylabel(-0.3(0.1)0.3)) (tsline Actual, yaxis(2)) (tsline Fitted, yaxis(2)), legend(on)
I've been working on a paper recently whose model variables are time series and whose dependent variable is bounded to the unit interval [0,1]. I now want to use fractional logit/probit models, but as far as I know most of the previous articles that used these kinds of models dealt with panel or cross-sectional data, and I haven't seen them applied to time series modelling.
My question is: is it appropriate to use fractional logit/probit for time series analysis? If yes/no, why, and what is the reference?
I have data with outliers and without outliers for the same financial series. Suppose I estimate the AR(1)-GARCH(1,1) separately for both series and obtain different log-likelihood values for the two models. I want to apply a likelihood ratio test to compare the models and select the better one. I am confused about the degrees of freedom of the test. Are the degrees of freedom 1 or 4 in this case?
Rosner's test will detect outliers that are either much smaller or much larger than the rest of the data. Rosner's approach is designed to avoid the problem of masking, where an outlier that is close in value to another outlier can go undetected. Rosner's test is appropriate only when the data, excluding the suspected outliers, are approximately normally distributed, and when the sample size is greater than or equal to 25. My question is: can we use it on univariate time series, or should we apply it only to univariate (non-time-series) datasets?
- Which is the correct order in data-processing of a rainfall time series: homogeneity testing followed by outlier detection & treatment, OR outlier detection & treatment followed by homogeneity testing?
- I have monthly rainfall data for 113 years. I am planning to run four homogeneity tests: the Buishand range test (BRT), the standard normal homogeneity test (SNHT), the von Neumann ratio (VNR) and the Pettitt test.
- Which is the appropriate method for identifying outliers in a non-normal distribution?
- Should descriptive statistics (DS) and exploratory data analysis (EDA) be conducted before or after treating the outliers? Or should a comparison be made in the EDA & DS before and after treating the outliers?
I'm using GARCH code, where data is a file with 204 values and train is a test sample of size 50 with a shift of +1 at each step (25 columns and 178 rows).
To make a prediction, I'm using mod_fitting = ugarchfit(train[(90:96),], spec = mode_specify, out.sample = 20) and
forc = ugarchforecast(fitORspec = mod_fitting, n.ahead = 20), but I get only one column as output, whereas with train[(90:96),] I would like to get 7 columns as a result.
So I need to shift and change the row number manually: train[(9),], train[(8),], train[(20),], ...
Could you please tell me whether it is possible to create a data frame or something similar to get a result with multiple columns?
Thank you very much for your help.
Code is below:
#y <- read.csv('lop.txt', header = TRUE)
library(rugarch)  # ugarchspec/ugarchfit/ugarchforecast come from rugarch
data <- read.csv('k.csv')
a <- data[, 1]
shift <- 50
S <- c()
for (i in 1:(length(a) - shift + 1)) {  # 'mi' was undefined: loop over 'a'; the multi-line body needs braces
  s <- a[i:(i + shift - 1)]
  S <- rbind(S, s)
}
mode_specify <- ugarchspec(mean.model = list(armaOrder = c(0, 0)),
                           variance.model = list(model = "gjrGARCH", garchOrder = c(1, 0)),
                           distribution.model = 'sstd')
# ugarchfit() fits a single series, so fit one window (row of S) at a time
# and collect each 20-step sigma forecast as one column of the result:
forc_mat <- sapply(90:96, function(r) {
  mod_fitting <- ugarchfit(spec = mode_specify, data = S[r, ], out.sample = 20)
  forc <- ugarchforecast(fitORspec = mod_fitting, n.ahead = 20)
  as.numeric(sigma(forc))
})
Hi, I'm trying to do a multivariate regression analysis using TSCS data in R, and I'm a bit lost as to where I should start. My research question is how ALMPs have affected unemployment in four European countries over a 20-year time period, using OECD expenditure data on subprograms. The resources I have from class don't show how to do a TSCS analysis, so I'm wondering if people have tips on where I should start and what to do.
1. Each time series variable is log-transformed and stationary at I(1), not at I(0). I intend to employ the ARDL method.
2. I have run ADF and PP tests without specifying lags in the command (Stata).
That is, dfuller [varname]; pperron [varname]. Is the syntax OK?
3. How do I make sure that the variables are NOT I(2)? Do I have to run the ADF and PP tests on the second difference? Or does I(1) already mean the series is not integrated of order 2?
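On ruling out I(2): a finding of I(1) means the first difference is I(0), but the way to verify this is to run the unit-root test on the first difference itself; if Δx clearly rejects a unit root, the level cannot be I(2). A small numpy illustration of the definitions:

```python
import numpy as np

rng = np.random.default_rng(1)
e = rng.normal(size=500)            # stationary shocks, I(0)
x1 = np.cumsum(e)                   # I(1): differencing once recovers e
x2 = np.cumsum(np.cumsum(e))        # I(2): only the SECOND difference is I(0)

# I(1): the first difference is stationary.  I(2): the first difference is
# still a random walk, and only the second difference is stationary.  So in
# practice: run the ADF/PP test on d.x -- if d.x rejects the unit-root null,
# the level is at most I(1).
assert np.allclose(np.diff(x1, n=1), e[1:])
assert np.allclose(np.diff(x2, n=2), e[2:])
print("first difference of an I(1) series recovers the I(0) shocks")
```

In Stata terms this would mean also running dfuller/pperron on d.[varname], not only on the level, before claiming the series is not I(2).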
Does anyone have the code (written in RATS/MATLAB/any other platform) for the rolling Hinich bicorrelation test and the rolling Hurst exponent test? I would greatly appreciate it if you could share them.
I estimated an autoregressive model in EViews. I got a parameter estimate for one additional variable which I had not included in the model; the variable is labelled 'SIGMASQ'.
What is that variable and how do I interpret it?
I am attaching the results of the autoregressive model.
Thanks in advance.
I applied the Granger Causality test in my paper and the reviewer wrote me the following: the statistical analysis was a bit short – usually the Granger-causality is followed by some vector autoregressive modeling...
How can I respond in this case?
P.S. I had a small sample size and serious data limitation.
I am currently trying to estimate the long memory parameter of various time series with several versions of "whittle estimators" à la Robinson, Shimotsu, and others.
However, the estimated value depends crucially on the bandwidth parameter.
Is there any rule on choosing it, or is there any literature about choosing this parameter adequately?
I really appreciate any help you can provide.
From the screenshot, you can see my OLS estimations between institutional variables and oil-related predictor variables. My main hypothesis was that oil-related variables have a negative impact on institutional quality (according to resource curse theory); however, my estimations produced mixed results, giving both positive and negative coefficients. In this case, what should I do? How do I accept or reject the alternative hypothesis that I have already mentioned? Thank you beforehand.
I ran several models in OLS and found these results (see the attached screenshot, please). My main concern is that some coefficients are extremely small, yet statistically significant. Is that a problem? Could it be because my dependent variables are index values ranging between -2.5 and +2.5, while I have explanatory variables measured in, e.g., thousands of tons? Thank you beforehand.
There is a general discussion that dependent variables used in the ARDL method should be stationary at I(1). However, in some studies the ARDL method was used although the dependent variable was stationary at the I(0) level. Can it be said that these analyses are not accurate? In addition, since NARDL methods are ARDL-based estimators, do their dependent variables also have to be stationary at I(1)?
I'm doing a time series analysis of the relationship between high-value patents and economic growth for six countries (growth measured as GDP and GDP per capita). Sample size: 20 years.
To check for stationarity and cointegration, I want to do ADF test and Engle-Granger test.
For both tests, when do I have to choose:
- test without constant
- test with constant
- or test with constant and trend
Second question is how do I identify the optimal lag length?
Thanks in advance.
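For the lag-length part of the question, the standard approach is to fit the test regression at several lag orders and pick the one that minimizes an information criterion, computed on a common estimation sample. A numpy sketch for a simple AR case (true order 2 by construction; all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 300
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()   # true lag order: 2

def ic(p, pmax=8):
    """AIC and BIC of an AR(p) fit by OLS, on a common sample of T - pmax obs."""
    Y = y[pmax:]
    X = np.column_stack([np.ones(T - pmax)] +
                        [y[pmax - k:T - k] for k in range(1, p + 1)])
    resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    n = T - pmax
    sigma2 = np.sum(resid ** 2) / n
    k = p + 1                                   # constant + p lag coefficients
    return n * np.log(sigma2) + 2 * k, n * np.log(sigma2) + k * np.log(n)

aic = [ic(p)[0] for p in range(1, 9)]
bic = [ic(p)[1] for p in range(1, 9)]
print("AIC picks lag", 1 + int(np.argmin(aic)), "| BIC picks lag", 1 + int(np.argmin(bic)))
```

The same logic carries over to the deterministic-terms choice in the ADF regression: the usual guidance is to include a constant for series with a nonzero mean, and constant plus trend for clearly trending series (GDP levels typically), with the no-constant case reserved for series fluctuating around zero.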
Dear colleagues, what do you think about applying Pearson's r or Spearman's rho correlation analysis to panel data? Is it possible to meaningfully interpret the results? Do you know any study or research that would fit my question? I highly appreciate your help.
I noticed that when I estimate an equation by least squares in EViews, under the Options tab there is a tick mark for degrees-of-freedom (d.f.) adjustment. What is its importance and role? When I estimate an equation without the d.f. adjustment, I get two statistically significant coefficients out of five explanatory variables; however, when I estimate with the d.f. adjustment, I do not get any significant results.
Thank you beforehand.
I am trying to use the xtabond2 command in Stata 14. My thesis uses a dynamic panel data model; the sample period runs from 2005Q1 to 2016Q4 for 24 banks. I intend to use the two-step GMM estimator and used the commands below:
xtabond2 LiqC_TA L.LiqC_TA L.(Equity_TA earningvol MSHARE npl_ratio loan_depo) if new==0, gmmstyle (L.(LiqC_TA earningvol npl_ratio loan_depo), lag(1 1) collapse) gmmstyle (L.Equity_TA, lag(2 3) collapse) ivstyle (L.MSHARE) robust twostep nodiffsargan
xtabond2 Equity_TA L.Equity_TA L.(LiqC_TA earningvol MSHARE npl_ratio loan_depo) if new==0, gmmstyle (L.(Equity_TA earningvol npl_ratio loan_depo), lag(1 1) collapse) gmmstyle (L.LiqC_TA, lag(2 6) collapse) ivstyle (L.MSHARE) robust twostep nodiffsargan
I got the results that I want. However, if I use time dummies, my independent variables and control variables are omitted. Is it therefore correct not to include time dummies in the above commands?
Thank you very much in advance for any response.
Is it okay to include 2 or 3 dummy variables in a regression equation? Or should I rotate the dummy variables across different models? The thing is, I have never come across model examples with more than 2 dummy variables in economics so far. Do you know any serious shortcomings of using more than one dummy variable in the same equation? Thank you beforehand.
- I have two series, price and sales.
- Sales is mean-reverting stationary, but price is stationary only after controlling for an intercept break.
- I want to set up a 2-equation VAR model and the research interest is to estimate the cumulative effect of price on sales through impulse response function.
My question: Is the IRF of a price shock on sales still biased even after I include the break dummy as a regressor in the two equations? Say the VAR model is:
Sales_t = β0 + β1·Sales_{t-1} + β2·Sales_{t-2} + β3·Price_{t-1} + β4·Price_{t-2} + β5·D_t + e_t
Price_t = β0 + β1·Sales_{t-1} + β2·Sales_{t-2} + β3·Price_{t-1} + β4·Price_{t-2} + β5·D_t + e_t
My answer is yes, the IRF is still biased, because the regressors Price_{t-1} and Price_{t-2} are still nonstationary.
My solution: include in both equations the interaction terms β6·Price_{t-1}·D_t and β7·Price_{t-2}·D_t?
Would you please assess my question, my answer, and my solution above?
Thank you very much!
I am interested to know about the differences between first-, second- and third-generation panel data techniques.
Hi everyone! I have a question relating to the specification form of the ARDL regression (which includes both long-run and short-run dynamics). In most of the research articles I reviewed, the specification of the regression (assuming y is the dependent variable, and x and z are explanatory) takes the form shown in the attached file. That is: the first difference of y is regressed on y(-1), x(-1), z(-1), as well as on the first differences of the lagged variables (both explanatory and dependent) based on the optimal number of lags.
I have two related questions. Firstly, why, in some articles, is the symbol (p), which represents the upper limit of the summation operator (∑) in the regression, defined as the optimal lag? In fact it should be the optimal lag minus 1, because if the optimal lag for x is 3, for example, the first difference of this variable should be represented by the three terms:
Δx, Δ[x(-1)], and Δ[x(-2)];
the last term is in fact [x(-2) − x(-3)]. So the upper limit of the ∑ symbol should be 2, which is the optimal lag minus 1.
Secondly, and related to the first question, why in most articles is the lower limit of the ∑ associated with the first differences of the explanatory variables (i=1), when it should be (i=0)? The regression results show that the first difference of the level explanatory variable is included, as well as the first differences of its lagged values. So why do we not say that (i=0) for the first differences of the explanatory variables? The expression currently used in the literature excludes the first difference of the level explanatory variable even though it is included in the regression results!
I really appreciate your kind response!
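The indexing in the question can be checked by simply enumerating the difference terms; a short Python sketch (the notation d.x(t-i) is just plain-text shorthand for Δx at lag i):

```python
# Enumerating the first-difference terms of x in an ARDL specification.
# With optimal lag q = 3 for x, the levels x(t), x(t-1), x(t-2), x(t-3)
# enter the model, and the corresponding difference terms are:

q = 3
terms = [f"d.x(t-{i})" if i else "d.x(t)" for i in range(q)]
print(terms)

# So the sum over differences of an EXPLANATORY variable runs i = 0 .. q-1:
# the upper limit is the optimal lag minus one, and the lower limit i = 0
# includes the contemporaneous difference d.x(t) -- matching the arithmetic
# in the question.  For the DEPENDENT variable the sum starts at i = 1,
# since d.y(t) is the regressand and cannot also appear on the right.
assert terms == ["d.x(t)", "d.x(t-1)", "d.x(t-2)"]
```

Papers that write the upper limit as p and the lower limit as 1 are usually relying on a notational convention (absorbing the contemporaneous difference elsewhere) rather than describing the estimated regressors literally.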
I am running a time series regression (OLS) with a stationary dependent variable and explanatory variables in log form. Very few of the logged explanatory variables are stationary. When I took first differences, I could not get any significant results; also, the first differences did not make much sense to me when I analysed them graphically. My question is: will my regression results be unreliable if I report such an analysis? I asked a similar question before and got replies that even if you log your variables, you still have to test them for a unit root; however, I see several papers with logged variables where stationarity was not taken into account. Thank you beforehand.
Dear RG colleagues,
I applied OLS regression analysis and usually I report the CUSUM and CUSUMSQ stability tests. But this time I have to report more stability tests, and I have also included heteroscedasticity tests. My question is: are these two enough, or should I incorporate additional stability tests of the coefficients or residuals? What are the most popular stability tests for such models? Thank you beforehand.
I am capable of using linear estimations between X and Y variables via OLS or 2SLS (in EViews, for example); however, I need to learn how to estimate/model non-linear relationships as well. If you know any source that explains this in simple language for time series, your recommendations are very welcome. Thank you beforehand.
I am solving some regression equations based on the OLS method in EViews software. I have 12 variables overall, and 11 of them are non-stationary. I am planning to use log forms. If so, do I need to report the ADF or Phillips-Perron unit root tests in the paper? Shouldn't the log forms of the variables become stationary? Your answers are highly appreciated; thank you beforehand.
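On whether logging induces stationarity: it does not. A log transform stabilizes variance and linearizes growth, but it leaves a unit root intact, so unit-root tests are still worth reporting for logged series; it is the first difference of the logs (growth rates) that is typically stationary. A quick numpy demonstration:

```python
import numpy as np

rng = np.random.default_rng(4)
e = rng.normal(0, 0.02, size=500)
price = 100 * np.exp(np.cumsum(e))    # exponential random walk: nonstationary

log_price = np.log(price)             # still a random walk -- logging does NOT
                                      # remove the unit root, it only rescales
log_returns = np.diff(log_price)      # first difference of logs IS stationary
                                      # (it recovers the i.i.d. shocks here)

assert np.allclose(log_returns, e[1:])
print("log levels keep the unit root; log differences recover the shocks")
```

So the papers that log their variables without unit-root tests are skipping a step, not demonstrating that the step is unnecessary.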
I am running a VECM, and the error correction term equals -0.90. I have four variables; one of my speed-of-adjustment parameters equals -0.55 and another equals -0.20; however, the other two variables are unresponsive and insignificant. I know that there is nothing wrong with the results theoretically, but I am afraid that they might be overestimated. I have run several residual and stability diagnostic tests for the model and the results were just fine. I have also checked the literature but could not find an answer, as I am using data from a different country.
Do you think that the results might be overestimated?
PS I am using annual data 1991-2018
I am currently analysing the relationship between monetary base and Unemployment and have constructed an ARDL model. When I use the BIC to determine the optimal lag length of my independent variable (monetary base), the model that is suggested only has one lag. I have a feeling this doesn't make very much economic sense. In the model with one lag, the independent variable isn't significant.
When I include 12 lags, the 5th lag of the independent variable is significant.
I have read that with monthly data, including 12 lags is reasonable.
Could I just include more lags than are suggested by the BIC in order to get a significant variable?
I have looked at econometrics books such as Brooks' time series econometrics text, but I'm looking for books and articles that specialize in GARCH models.
Is it possible to get an estimate of the d parameter greater than 1?
I want to set up 4 models, namely ARFIMA-FIGARCH, ARFIMA-FIEGARCH, ARFIMA-FIAPARCH and ARFIMA-HYGARCH, for stock returns in OxMetrics (G@RCH), but my outputs do not seem appropriate...
Could anyone help me, please?
Thank you in advance
I have an ARDL error correction model; in the CUSUM chart, as far as I can diagnose, there is a negative trend in the moving means. Before that, I found that the explanatory variable and the dependent variable are not integrated of the same order. I know I cannot use an ECM when there is no cointegration, and I don't know whether there is another model that gives more consistent results.
Could anyone offer a remedy?
Dear all, I would like to start a discussion here on the use of generalised mixed effect (or additive) models to analyse count data over time. I report here the "few" analyses I know in R, for which I list the GOOD (things) and the LIMITS/DOUBTS I found. Please feel free to comment and to add further information and additional approaches for analysing such a dataset. That said, generalised mixed effect modelling still requires further understanding (at least on my part), and since my knowledge is limited I would like to start a fruitful discussion here including both people who would like to know more about this topic and people who know more.
About my specific case: I have count data (i.e., taxa richness of fish) collected over 30 years at multiple sites (each site sampled multiple times). My idea is therefore to fit a model to predict trends in richness over the years using generalised (Poisson) mixed effect models with the fixed factor "Year" (plus a couple of environmental factors such as elevation and catchment area) and the random factor "Site". I also believe that, since I am dealing with data collected over time, I need to account for potential serial autocorrelation (let us leave spatial correlation aside for the moment!). So here are some GOOD (things) and LIMITS I found in using the different approaches:
1) glmer (lme4):
GOOD: good model residual validation plot (fitted values vs residuals) and good estimation of richness over the years, at least based on the model plot produced.
LIMITS: i) it is not possible to include a correlation structure (e.g., corARMA) for autocorrelation.
2) glmmPQL (MASS/nlme):
GOOD: possible to include corARMA in the model.
LIMITS: i) bad final residual vs fitted validation plot and a completely different estimation of richness over the years compared with glmer; ii) how to compare different models, e.g., to find the best autocorrelation structure (as far as I know, no AIC or BIC is produced)? iii) I read that glmmPQL is not recommended for Poisson distributions (?).
3) gamm (mgcv):
GOOD: possible to include corARMA, and smoothers for specific predictors (e.g., year) to add a non-linear component.
LIMITS (DOUBTS): i) how to obtain the residual validation plot (residuals vs fitted)? ii) the summary gives a double output ($gam; $lme): which one should be reported? iii) in the $gam output, variables with smoothers are not given coefficient estimates (only degrees of freedom and significance): is this reported somewhere else?
If you have any comment, please feel free to answer to this question. Also, feel free to suggest different methodologies.
Just try to keep the discussion at a level that is understandable for most readers, including non-experts.
Thank you and best regards
I am estimating the volatility of stock returns. I took daily data, and the data is stationary, but there is no ARCH effect. Can I still proceed to GARCH and TGARCH?
While running a gravity model via OLS or the PPML technique for (aggregated) panel data of 60 countries over, say, 10 years, I get a single table showing estimates of all regressors.
I don't understand how to interpret it: to which country pair do the coefficients belong?
My dependent variable is exports from i to j, while the independent variables include distance, contiguity, GDP, etc.
What is the best way to estimate the model when dependent variable is level stationary?
I am currently investigating the presence of bubbles in a particular financial market.
I have implemented a right-tailed ADF test to check for "mild-explosiveness" of my time series, along with a supremum ADF test to check for the presence of a bubble, as proposed by Phillips et al. in their 2011 paper.
The issue I am currently running into is an apparent contradiction in my results. While the initial ADF test fails to reject the null hypothesis of a unit root, the SADF rejects this same null in favour of the alternative hypothesis; the existence of a price bubble.
Is this to be expected? It seems unusual that one would both reject and fail to reject the same null hypothesis when using two relatively similar tests.
As per the theory, when the variables are not all integrated of the same order, ARDL is applied to test the relationship between them.
My doubt is whether the relationship tested in ARDL indicates a long-run cointegrating relationship or not.
Thanks in advance.
I would appreciate it very much if you could recommend the newest and most applied Non-linear Time series Econometrics Techniques and some articles. Many thanks. Kind regards, Sule
There are arguments for and against adjusting data for seasonality before estimating a VAR model (and then Granger causality). I have monthly tourist arrival data for three countries (for 18 years) and am interested in spill-over effects or causality among the arrivals. I would appreciate your views on the following.
1. Is seasonal adjustment compulsory before estimating a VAR?
2. If I take 12-month seasonally differenced data without adjusting for seasonality, will it be okay?
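On question 2: a 12-month seasonal difference removes any component that repeats exactly every 12 months, but it changes the interpretation of the series to year-on-year changes and can overdifference if the seasonality is deterministic rather than a seasonal unit root. A numpy sketch with a made-up deterministic seasonal pattern:

```python
import numpy as np

months = np.arange(120)                       # hypothetical 10 years of monthly data
seasonal = np.tile(np.sin(2 * np.pi * np.arange(12) / 12) * 50, 10)
trend = 2.0 * months
arrivals = 1000 + trend + seasonal

# 12-month seasonal difference: any pattern that repeats every 12 months cancels
sdiff = arrivals[12:] - arrivals[:-12]

# What remains is only the year-on-year trend change (a constant 24 = 12 * 2.0
# here): the seasonal component is gone, but the series now measures
# year-on-year growth rather than the monthly level.
assert np.allclose(sdiff, 24.0)
print(sdiff[:5])
```

So seasonal differencing is a legitimate alternative to seasonal adjustment before a VAR, provided the causality results are then interpreted in terms of year-on-year movements.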
Consider the long-run covariance matrix estimator for time series as proposed in Newey and West (1987). Ledoit and Wolf (2004) proposed an estimator for the covariance matrix (but not the long-run covariance matrix) based on shrinkage that is well-conditioned, and I would like a similar estimator for the long-run covariance matrix. Anyone aware of any papers addressing this?
In the regression modelling process, we sometimes need to categorize a continuous variable (DV or IVs). What are the potential problems inherent in such a transformation for:
- Estimation results
- Precision and accuracy
- Hypothesis tests...
Thank you so much for any response and clarification.
Metric Modeling and Representation of 4D Data through 3D Vector Tensoring
In representational, metric analytics, use is often made of binary vectors, classically drafted in two-dimensional (2D) grids. However, these do not reflect the complexity of manifest, part-observable reality.
Tensors can be 4D-computed in order to be represented 3D-geometrically. This can be done by applying data to a three-dimensional grid containing dynamic, non-binary-graph 3D vector objects that change, in different scenarios, along the time axes, generating multiple grids. Scenarios can be limited by applying probability weighting or targeting specific results, by eliminating transitions and breaks, or else intermittent vector changes.
If one attempted to simplify the tensor representation by neglecting the shift representations in time, one would replace many 3D, non-binary-graph vector objects with fewer complex, but 4D-graphically representable, objects in time, resulting in vector folding. This is of benefit a posteriori, for typical aggregates. In practical applications, and in comparative settings, one will rather apply unfolded 3D objects in stage representations, adhering to 3D Vector Tensoring, or Plurigrafity.
I am running two time-series regressions; both contain 3 variables in common, along with several other variables.
Set 1: A B C D E F G
Set 2: A B C H I J K L M
I am computing a covariance matrix (looking at the correlation amongst the variables) for each regression. However, since the 3 common variables are present in both regressions, do I still include these 3 variables in the covariance matrices for both regressions? I am not sure, because the correlation coefficients for the same common variables are different:
Set 1: A-B = 0.5678
Set 2: A-B = -0.4892
Any help will be greatly appreciated.
I estimated an ARIMA model with a daily gold time series. The residuals' correlogram is flat, but the squared residuals' correlogram is not. I have already tried EViews' heteroskedasticity tests >> ARCH effect, and I found a prob value of 0.00, so there is heteroskedasticity. Can I continue with ARIMA? (The flat correlogram is for the residuals; the other is for the squared residuals.) (Is it permissible to continue with ARIMA in this case?)
I recently spoke to an econometrics professor in my department and he said that in certain cases you can ignore the time dimension in long t panels (specifically this referred to a probit model for working out winners of tennis matches but I would like this discussion to be more general).
He suggested that this was possible so long as:
1. I controlled for the time dimension by including "lags" of the dependent variable (obviously these are not lags in the usual "subscript t case" but rather cross-sectional variables that state whether player i won their previous game(s))
2. Use cluster-robust standard errors, to take into account the correlation in the residuals for each player.
I'd be interested to hear whether:
1. You agree with this approach; and
2. In what circumstances you think this is appropriate and when it is inappropriate.
Thanks in advance!
I am working on testing the Arbitrage Pricing Theory. I have 5 independent variables, or factors (oil price, inflation, industrial production, FX, stock index), that affect the dependent variable (stock return), and raw data on these factors for the last 7 years to be processed. I need to find the betas by regressing on these factors in computer software (maybe Stata). But I don't have any idea how to start this process. Should I create my own .dta file to run the regression in Stata if I don't have a ready-to-use data file? Is that possible at all?
P.S.: I am considering the US stock market, specifically the S&P 500 index. Thanks a lot.
I'm testing the Fama-French three-factor model on the Italian stock market. After running my 16 time series regressions, I applied the usual diagnostic tests to the error terms, and what I see is that the errors are not normally distributed. I checked for an ARCH effect, but there isn't one. What can I do (I can't use dummy variables in this context), taking into account that each of the 16 time series consists of 96 monthly observations?
Suppose I have a regression model such as
gdp = a + b·x + c·(z/gdp) + e
Here x is a vector of explanatory variables, and z/gdp is one of my control variables. In this regression equation, GDP appears on both sides, and my data is a time series. There is an obvious issue in estimating this equation. How can I rectify it for estimation? Can I use a one-period lag of z/gdp to overcome this issue?
I want to use the GMM technique to estimate the parameters of the Fama-French three-factor model. I don't have a Stata license, so how can I do this in EViews? I saw that GMM is among the available methods, but I don't understand what to write in the "instruments" field. Has anyone ever used EViews to apply the GMM estimation technique? Thank you in advance.
I am working on a project that uses an ECM to inspect the short-run dynamics from money supply (m(t)) to loans (l(t)), since both variables are I(1). Excluding the error correction term, is it appropriate to aggregate the coefficients on l(t) to obtain its overall short-run effect? I.e., for a·l(t-2) + b·l(t-1) + c·l(t), I would obtain the total short-run coefficient as a + b + c. I am not sure whether this is an appropriate approach.
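Summing the short-run coefficients is a standard way to report the cumulative short-run effect, but its standard error must account for the covariances between the estimates (a Wald/delta-method step), not just the individual standard errors. A numpy sketch with purely illustrative numbers:

```python
import numpy as np

# Hypothetical estimates for the short-run terms in the ECM: a, b, c
coef = np.array([0.15, 0.25, 0.40])          # a, b, c (made-up values)

# The overall short-run impact IS the sum of the coefficients...
total = coef.sum()

# ...but its standard error comes from the full covariance matrix of the
# estimates: Var(a+b+c) = 1' V 1, summing all variances AND covariances.
Cov = np.array([[0.010, 0.002, 0.001],       # hypothetical V(a, b, c)
                [0.002, 0.012, 0.003],
                [0.001, 0.003, 0.008]])
g = np.ones(3)                                # gradient of a+b+c w.r.t. (a,b,c)
se_total = np.sqrt(g @ Cov @ g)

print(total, se_total)
```

In practice most packages give this directly as a linear-combination (Wald) test on a + b + c = 0, which is the cleaner way to judge whether the aggregated short-run effect is significant.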
Please, I really need technical assistance from experts here.
When I estimate a DCC-GARCH in Stata, pairwise quasi-correlations are given at the end of the output. What do they mean in practice? Are they the mean values of the dynamic correlations, or something else?
I would much appreciate it if anybody could clarify this.
I am estimating the relationships between Economic Growth (GDP), Public Debt and Private Debt through a PVAR model in which my panel data consists of 20 countries across 22 years.