Applied Microeconometrics - Science topic

Explore the latest questions and answers in Applied Microeconometrics, and find Applied Microeconometrics experts.
Questions related to Applied Microeconometrics
  • asked a question related to Applied Microeconometrics
Question
4 answers
I have run a pooled mean group (PMG) estimation on panel data using the xtpmg command in Stata. However, I could not find a method to test for autocorrelation and heteroskedasticity from my PMG results. If there is autocorrelation or heteroskedasticity in the PMG estimation, how do we remove it? I would be grateful if somebody could help me.
Relevant answer
Answer
  • Set the data set to be a time-series data set.
  • Run regression.
  • Examine for serial correlation.
  • Correct the regression for the serial correlation (a minimal Stata sketch of these steps, adapted to panel data, follows below).
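Since the question concerns panel data, a minimal sketch of the diagnostic steps in Stata, assuming a panel identified by country and year, dependent variable y, and regressors x1 and x2 (all names hypothetical; these are generic panel diagnostics, not PMG-specific, and xtserial and xttest3 are user-written commands installable via ssc install):
xtset country year
* Wooldridge test for serial correlation in panel data
xtserial y x1 x2
* Modified Wald test for groupwise heteroskedasticity after fixed effects
xtreg y x1 x2, fe
xttest3
* If either problem is present, one pragmatic remedy is cluster-robust inference:
xtreg y x1 x2, fe vce(cluster country)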
  • asked a question related to Applied Microeconometrics
Question
7 answers
I am trying to estimate a Cobb-Douglas production function.
The problem is that my dataset captures each firm at a single point in time:
I have data over the period 1988-2012,
but each firm appears only once!
(So I cannot say whether it is a panel, a time series, or a cross-section.)
I want to find the effect of labour and capital on value added.
I have information on intermediate inputs.
I am using two methods: Olley & Pakes and Levinsohn-Petrin.
But Stata keeps telling me that there are no observations!
My command:
levpet lvalue, free(labour) proxy(intermediate_input) capital(capital) valueadded reps(250)
Why is the command not working and reporting that there are no observations?
(Is this due to the fact that each firm appears only once in the data?)
(If yes, what are the possible corrections for simultaneity and selection bias in this data?)
Thanks in advance for your help,
Mina
Relevant answer
Answer
I agree with Anton.
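A side note on the error itself: the Levinsohn-Petrin estimator identifies productivity from within-firm dynamics, so levpet expects the data to be declared as a panel with repeated observations per firm; with each firm observed only once the estimation sample is empty, which is consistent with the questioner's own conjecture. A minimal sketch, assuming hypothetical firm identifier firm_id and year variable year:
* levpet needs at least two observations per firm; a pure
* cross-section of firms yields the "no observations" error
xtset firm_id year
levpet lvalue, free(labour) proxy(intermediate_input) capital(capital) valueadded reps(250)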
  • asked a question related to Applied Microeconometrics
Question
6 answers
Dear All,
I have panel data that fits a difference-in-differences design.
I regress bilateral FDI on bilateral investment treaties (BITs). BIT is a dummy taking 1 if a BIT exists and zero otherwise, while bilateral FDI is the amount of FDI between the two economies. Objective: examine whether BITs enhance bilateral FDI.
The issue is that each country started its BIT with its partner country at a different, pair-specific time: there is no single treatment date for the whole dataset.
I am willing to assume different time periods in a random way and run my diff-in-diff (for robustness):
Year 2004
Year 2006
Year 2008
My questions:
(1) Do you suggest this method is efficient?
(2) Any suggestions on the random selection of time?
Relevant answer
Answer
Interested.
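In case it is useful, note that staggered adoption need not be collapsed to a single treatment year: a two-way fixed-effects regression lets each pair's own treaty date define treatment. A minimal Stata sketch, assuming a pair identifier pair_id, a year variable, log bilateral FDI lfdi, and the bit dummy switching on in the pair-specific treaty year (all names hypothetical):
xtset pair_id year
* pair and year fixed effects; bit turns on at each pair's own treaty date
xtreg lfdi i.bit i.year, fe vce(cluster pair_id)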
  • asked a question related to Applied Microeconometrics
Question
14 answers
How can the diamond-water paradox be explained in simple language?
Relevant answer
Answer
Adam Smith noted that, even though life cannot exist without water and can easily exist without diamonds, diamonds are, pound for pound, vastly more valuable than water. The marginal-utility theory of value resolves the paradox. Water in total is much more valuable than diamonds in total because the first few units of water are necessary for life itself. But, because water is plentiful and diamonds are scarce, the marginal value of a pound of diamonds exceeds the marginal value of a pound of water. The idea that value derives from utility contradicted Karl Marx’s labour theory of value, which held that an item’s value derives from the labour used to produce it and not from its ability to satisfy human wants.
  • asked a question related to Applied Microeconometrics
Question
4 answers
Hi,
I am trying to combine parents' information with their children's file.
1. I have one .dta file with the children's information and another for their parents. The children's dataset also contains a unique identifier for each child's parents, which I can use to tell who the child's mother or father is. What I want to do is attach details such as the father's and mother's employment and education level to the children's database. Is there an efficient way to do this?
2. If I have combined the children and adult datasets into one .dta file, can I then do what I describe above, or will I have to do it separately as in (1)?
3. What if I have organised the children's data into a panel dataset and now want to add the information about the parents? Is there an efficient way to do it from there?
Looking for ideas.
Thank You.
Relevant answer
Answer
Hi Kushneel,
Here's a link to a book about database management. It contains several discussions about data relationships and crafting a relational database. It may be of some benefit.
Have a great day!
--Adrian
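A minimal Stata sketch of the merge itself, assuming children.dta has one row per child with a mother_id key, and parents.dta has one row per parent with person_id, employment, and education (all file and variable names hypothetical):
use parents, clear
rename person_id mother_id
rename (employment education) (m_employment m_education)
tempfile mothers
save `mothers'

use children, clear
* many children can share one mother, hence m:1
merge m:1 mother_id using `mothers', keep(master match) nogenerate
* repeat analogously with father_id for the father's details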
  • asked a question related to Applied Microeconometrics
Question
2 answers
I have 5 different questions in survey data which are used to measure individuals' degree of risk-taking behaviour. I need to use all of this information to construct my Y-variable (capturing individuals' risk behaviour).
I am attaching the questions here along with the dataset, and would appreciate any valuable input.
Relevant answer
Answer
You can use all the questions to determine when an individual switches between the lottery and the sure payment. This information gives you the individual's degree of risk aversion.
For example, if an individual prefers the sure payoff to the lottery in the first question, he reveals that he is risk averse (because the lottery has an expected value of 100, which is equal to the sure payoff). His degree of risk aversion is higher if he still prefers the sure payment as the sure payoff decreases, as in the next two questions.
Similarly, if the individual prefers the lottery in the first question, he reveals that he is risk tolerant. The last questions examine how strong his risk tolerance is by increasing the size of the sure payoff: rejection of higher sure payoffs indicates stronger risk tolerance (lower risk aversion). You can calculate the degree of risk aversion if you assume a particular utility function. This is then the y-variable that you are looking for.
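To make the last step concrete, a small worked example under stated assumptions: suppose the lottery pays 200 or 0 with equal probability (so its expected value is the 100 mentioned above; the payoffs are assumed here for illustration), and take CRRA utility $u(x) = x^{1-r}/(1-r)$ with $0 < r < 1$, so that $u(0) = 0$. A respondent who is exactly indifferent at sure payoff $c$ satisfies
$u(c) = \tfrac{1}{2}\,u(200)$, i.e. $c^{1-r} = \tfrac{1}{2}\,200^{1-r}$, which solves to $r = 1 + \ln 2 / \ln(c/200)$.
As a sanity check, indifference at $c = 100$ gives $r = 0$ (risk neutrality), and switch points below 100 give $r > 0$ (risk aversion), so each respondent's switch point brackets his or her risk-aversion coefficient.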
  • asked a question related to Applied Microeconometrics
Question
2 answers
Hi everyone,
Please note this question is directed to Stata users and those familiar with the pooled mean group (PMG) estimator within the ARDL framework.
I use Stata 13, and I received an error message while attempting to obtain optimal lag orders for my PMG model. I have 10 countries and 3 variables from 1980 to 2015. I used this command:
forval i = 1/10 {
    ardl dcb dr gdp if (c_id==`i'), aic ec
    matrix list e(lags)
    di
}
This tells Stata to generate the AIC-optimal lag orders automatically, which it did, BUT only for the first 9 countries, after which I got this error message:
"Maximum number of iterations exceeded. r(498)"
I also tried increasing the number of iterations by using this code:
forval i = 1/10 {
    ardl dcb dr gdp if (c_id==`i'), aic ec maxcombs(2500)
    matrix list e(lags)
    di
}
I received the same output and the same error message as before, and I still do not have the optimal lag order for the 10th country.
Please, what can I do? Or can anyone advise on another way of obtaining optimal lag orders for each variable and per country?
Relevant answer
Answer
The error message seems to be related to the iterate() option of the nlcom command, which ardl uses to compute the error-correction coefficients and standard errors. You might have to reduce the number of lags, for example with the maxlags() option of ardl, and choose a lag order by looking at the model fit.
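A minimal sketch of that suggestion for the problematic country, assuming it is the one with c_id == 10 (the cap of 2 lags is purely illustrative):
* restrict the lag search so the error-correction step can converge
ardl dcb dr gdp if c_id == 10, aic ec maxlags(2)
matrix list e(lags)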
  • asked a question related to Applied Microeconometrics
Question
2 answers
Seeking to learn about matrix stability
Relevant answer
Answer
Thanks Hasrert for your reply. I wonder how stability and consistency, which relate to the deterministic specification, could be of any help when we are dealing with estimation. Think of a differential equation: it could be stable or not, but that does not tell us how the equation should be estimated. Does it?
  • asked a question related to Applied Microeconometrics
Question
15 answers
Dear scholars,
Please can anyone assist me with the steps/codes to follow in estimating the Carrion-i-Silvestre et al. (2005) test using Stata or EViews?
Thank you
Relevant answer
Thank you, Sir, for your suggestions; I will surely contact him (the Professor).
Best wishes
Abdulkadir, A. R, PhD
  • asked a question related to Applied Microeconometrics
Question
6 answers
I want to shock a wage rate change onto the output and productivity of the textile sector. Could you advise on the type of elasticity to use, and the code for swapping the wage rate so that it becomes exogenous and can be shocked?
Relevant answer
Answer
Swap the endowment-level variable with the factor-price variable so that you can shock the wage rate directly.
Or, more specifically:
Simply shock the factor-price variable after swapping it with the exogenous factor-endowment slack variable!
  • asked a question related to Applied Microeconometrics
Question
3 answers
Dear all,
I'm using Latent Gold Choice to estimate a latent class (LC) model. The model performs well when up to three classes are considered. However, when I try four classes I get a non-convergence message ("estimation procedure did not converge, 25 gradients larger than 1.0E-3") and the maximum number of iterations is reached without convergence. Does anybody know the possible reasons? Mis-specification of the model? The large number of observations (in my case 250,000)? Something else?
Thank you
Relevant answer
Answer
Thank you Amir for your reply.
I've already tried increasing the number of iterations without success, so I think the problem is related to the first point. Even though I don't have convergence, I obtain the parameter values and the BIC and AIC measures. Are these results reliable? In the end, the lowest BIC and AIC values are obtained with 6 classes (for which I don't have convergence). Do you suggest using only the 3-class results, since the 6-class results are not reliable? Thank you.
  • asked a question related to Applied Microeconometrics
Question
5 answers
Hi! How do I interpret beta and alpha in a VECM? I know alpha is the adjustment to equilibrium, and that a significant and negative alpha coefficient shows that the variable is "caused" by the other variable. But how do I interpret it when beta (the cointegrating equation) is significant with one variable or another? Or both? I have 2 vectors.
Relevant answer
Answer
You'll find the following textbook written by Chris Brooks very useful in this regard.
Good luck
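For reference, the objects the question asks about sit in the standard VECM decomposition
$\Delta y_t = \alpha \beta' y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \varepsilon_t$
where the columns of $\beta$ are the cointegrating (long-run) relations and $\alpha$ contains the speed-of-adjustment loadings: a significant element of $\beta$ means the corresponding variable enters that long-run relation, while a significant (and appropriately signed) element of $\alpha$ means the corresponding equation adjusts to disequilibrium in that relation.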
  • asked a question related to Applied Microeconometrics
Question
3 answers
I have a problem regarding the price variable. The choice set has 4 alternatives, and one of the attributes that characterizes each alternative is price: for example, alternatives 1 to 4 have prices (10, 20, 30, 40). For some people (students, for example) prices are discounted by 50%, so the prices of the alternatives are (5, 10, 15, 20). In such a case, how should the price variable be treated? Interact the price with a discount variable? Include all the different price categories (e.g., full price, children's price, senior price) as explanatory variables? Or consider only the full price and add a dummy variable for the category?
Thank you
Relevant answer
Answer
I assume that you are talking about a discrete choice model (not an aggregate one). If you propose a set of dummy variables for the sub-categories (student, senior, etc.), you must know who they are (which observation corresponds to a student's choice, a senior's choice, etc.).
If I understand this right, you then apply the appropriate price for each individual (full or discounted).
The next issue is whether you assume the same price coefficient regardless of the individual's category, and the same mathematical form (I prefer to use Box-Cox and let the data decide on the form). I would start with the same coefficient and functional form and test this, especially for the coefficient, across the different categories.
Marc Gaudry
  • asked a question related to Applied Microeconometrics
Question
5 answers
In a conditional logit model, attributes of the alternatives and characteristics of the choosers are included as explanatory variables. However, suppose the choice is made in different contexts or choice situations that can affect the choice: can these context attributes be included as independent variables (i.e., variables related neither to the agent's characteristics nor to the attributes of the alternatives)?
Thanks
Relevant answer
Answer
It depends on what you want to estimate. In the standard MNL model, whatever is not measured by observed variables (characteristics and attributes) is absorbed into the random extreme-value component. In order to identify any effect you have to go through the observed variables somehow. Say you are interested in the effects of different locations or time periods. Then you might introduce location and/or time variables in interaction with personal characteristics and/or choice attributes. In the end, however, the effects of those variables will consist of shifting the coefficients of the characteristics and attributes. Alternatively, you might perform separate estimations of the same model using datasets collected in different contexts; the effect of the different contexts will then show up as different parameter estimates.
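A minimal Stata sketch of the interaction route, assuming long-format choice data with one row per alternative, a choice-situation identifier choice_id, a chosen indicator, a price attribute, and an urban dummy describing the context (all names hypothetical):
* a context variable is constant within a choice situation, so on its
* own it would be differenced out of the conditional logit; it can
* only enter through interaction with an attribute
clogit chosen c.price c.price#i.urban, group(choice_id)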
  • asked a question related to Applied Microeconometrics
Question
3 answers
I estimated a translog cost function (KLEM model) by dropping one equation. Now I want to test coefficients, but since one equation is dropped, how do I perform the test? For example, that coefficient q_i (i = K, L, E, M) is zero.
I also want to test whether model A is nested in model B, using the likelihood values of models A and B. Is there any difference in the likelihood values when one equation is dropped? If yes, how do I handle it?
Relevant answer
Answer
If you used the generalized translog cost function, you should also test it against more restrictive technological forms. Attached is an example I've done with a co-author, with the tests in question illustrated in Table IV. I hope this helps.
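For the nested comparison raised in the question, the usual statistic is the likelihood ratio
$LR = 2(\ln L_B - \ln L_A) \sim \chi^2_q$
under the null, where model A imposes $q$ restrictions on model B. One standard result worth recalling (stated here as general background, not taken from the attached paper): when a singular share system is estimated by maximum likelihood (iterated SUR), the likelihood value is invariant to which equation is dropped, so the comparison does not depend on that choice; with non-iterated estimation this invariance is not guaranteed.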
  • asked a question related to Applied Microeconometrics
Question
1 answer
I am trying to understand how the authors construct the Worldwide Governance Indicators (WGI), the most cited governance index nowadays. Though they try to publish as much detail as possible for public access, I am not very convinced by the technical appendix (see pages 97-99).
In short, they use an unobserved components model to estimate the non-observable characteristics of governance that the index has not covered. To derive the parameters of the model, they say they apply "standard maximum likelihood estimation" (page 100).
I do not know how to carry out such an estimation (on page 100) given the data that they have. Also, they do not provide any documents on the methods that were used.
If anyone has come across this problem, please can you explain the methodology behind that MLE? Any relevant programme/code would be greatly appreciated.
Relevant answer
Answer
Hey Canh,
I haven't calculated the MLE estimator for your problem, and this is not my field of research, but I will try to give some hints.
For MLE you have to set up your likelihood function, which is the joint density function of your data.
We know the conditional mean and the conditional variance of g_j, which depend on the parameters alpha, beta, and sigma. So before we can estimate the conditional mean and the conditional variance, we have to estimate these parameters.
Let's consider the K_j representative indicators for a certain country j. These indicators are jointly normally distributed, so we can use the density function of the multivariate normal distribution:
f_j = 1/sqrt((2*pi)^(K_j) * det(Omega)) * exp(-1/2 * (y_j - alpha)' Omega^(-1) (y_j - alpha))
So this is the density of the indicators for a certain country. The mean alpha and covariance matrix Omega are from the article.
Now we are interested in the density of all representative indicators. We therefore assume that the indicators are independent across countries; otherwise we could not calculate the joint density as the product of the marginal densities. We then get the likelihood function as the product of all the densities above, across all countries:
L(alpha, beta, sigma | y) = product_(j=1)^(J) f_j
We use the logarithmic likelihood function because it makes things easier:
lnL = sum_(j=1)^(J) ln(f_j)
You can find a reduced form of ln(f_j) in the article, see (4); the constant terms are omitted there.
Now take the first derivatives of lnL with respect to alpha, beta, and sigma, set them to zero, and solve the equations for the parameters of interest. I think this is what is meant by standard MLE.
In R, the stats4 package provides an mle function which can solve your problem; you have to define the negative lnL, and the maximization (implemented as minimization) is done by R.
Now we have estimates for alpha, beta, and sigma, so we can estimate the conditional mean and the conditional variance of g_j.
I hope this helps.
Best regards,
Flo
  • asked a question related to Applied Microeconometrics
Question
5 answers
How can product safety and quality be estimated by applying econometric techniques?
Relevant answer
Answer
For the research question you are concerned with, I think you need to develop a sound theoretical basis for a particular product by showing how safety and quality are determined at the manufacturing stage (at the factory) and at the consumption stage (in the market). Having developed this model, derive some hypotheses that underlie the answers to the main research questions of the study. This approach is deductive; otherwise you may not be able to apply quantitative methods.
You may be able to get some raw data at the firm level, and based on that you can decide which statistical methods are most useful for analysing your research question. You may be able to conduct some sound analysis at a basic level using techniques such as cross-tabulation and chi-square tests, depending on the type of data you collect.
At the market level, you may have to collect data on the particular product from all the stakeholders who are involved in any kind of business with that product (consumers, distributors, sellers, etc.). To do this you may have to develop questionnaires for each group based on your theoretical model. When developing the questionnaires, think of the type of data you need to test your hypotheses using statistical tools.
For example, in the questionnaire designed for consumers you can include some closed as well as open questions about the consumers' perspectives on the safety and quality of a particular product (this yields the dependent variable as ranks) and the reasons why they think so (this yields a set of independent variables). Do not forget to collect data on other variables such as level of education, family background, profession, etc., as these variables can affect the judgements made about the safety and quality of a particular product.
Then apply some basic tools such as cross-tabulation and chi-square techniques to investigate whether there is any association between the product rankings on quality and safety and the selected independent variables. Always use the product rankings as the dependent variable (in rows) in these basic techniques, which are available in SPSS (statistical software for the social sciences). You can run this test to check whether there is any association between safety and quality as well. Having conducted these basic tests, think of more advanced techniques if you need to do so.
Good luck for your research!
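As a concrete illustration of the cross-tabulation step (the answer mentions SPSS; the following is an equivalent one-line sketch in Stata, with quality_rank and education as hypothetical variable names):
* chi-square test of association between quality ranking and education
tabulate quality_rank education, chi2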
  • asked a question related to Applied Microeconometrics
Question
11 answers
I am interested in estimating technical efficiency based on the Battese and Coelli (1992) model. I have tried the estimation many times, both using the command prompt and using an instruction file in Frontier 4.1. But the dialog box immediately disappears once I finish the process by running FRONT41.EXE. Could anyone tell me where the problem is and how I can carry out the estimation?
Relevant answer
Answer
Dear Auro,
I think there might be an inconsistency between your .ins file and the .txt file used by the program to read the data. If you are not already using R and still want to try Frontier, send me the .ins file and the .txt file containing the first two or three rows of your data so that I can check it on my computer.
Regards
Spyros
  • asked a question related to Applied Microeconometrics
Question
8 answers
In GMM applications (xtabond2) we need to classify our variables as endogenous, predetermined, or exogenous. What are the criteria for doing so? How do we know which variable falls into the endogenous category?
Relevant answer
Answer
Your decision should be driven by economic theory. A variable x is exogenous if there is no feedback from the endogenous variable y to x. This implies that x is uncorrelated with past and future error terms. If unpredictable errors today have some impact on x in later periods, x is predetermined in the sense that it is fixed in period t but may be influenced in t+1, t+2, etc. If x is endogenous, it is correlated with contemporaneous errors. In that case, you should specify instruments for x before the estimation is done.
In general, the status of x (exogenous, predetermined, or endogenous) has implications for the list of instruments you use in your GMM analysis; a sketch of the mapping follows below.
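A minimal xtabond2 sketch of how the classification maps onto the instrument lists, assuming a panel declared with id and year, dependent variable y, an endogenous regressor x1, a predetermined regressor x2, and a strictly exogenous regressor x3 (all variable names hypothetical):
xtset id year
xtabond2 y L.y x1 x2 x3, ///
    gmm(L.y x1, lag(2 .)) ///  endogenous: lags 2 and deeper as instruments
    gmm(x2, lag(1 .)) ///      predetermined: lags 1 and deeper
    iv(x3) ///                 strictly exogenous: instruments itself
    twostep robust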
  • asked a question related to Applied Microeconometrics
Question
4 answers
I am currently working on a panel of count data with severe underdispersion (a lot of ones). Poisson is not efficient, and the generalized Poisson seems a better option. gpoisson in Stata (Harris, Yang, and Hardin, 2012) provides efficient estimates for cross-section/pooled data. Does anyone know of such a routine for panel data with endogenous regressors?
Relevant answer
Answer
ISSUE: Generalized Poisson Regression Model
GENERALIZED POISSON PROBABILITY FUNCTION: Define the number of incidences as Yi in a set (y1, y2, ..., yn), where Yi is the count response. The probability function of Yi is given by:
(1)   fi(yi, μi, α) = A·B·C
(1.1)   A = [μi / (1 + αμi)]^yi
(1.2)   B = (1 + αyi)^(yi − 1) / yi!
(1.3)   C = exp[−μi(1 + αyi) / (1 + αμi)]
The mean is given by:
(2)   E(Yi | xi) = μi
... where xi is a (k − 1)-dimensional vector of covariates; yi are the counts 0, 1, 2, ...; μi = μi(xi) = exp(xi′β), where β is the k-dimensional vector of regression parameters.
The variance is:
(3)   V(Yi | xi) = μi(1 + αμi)^2
The probability of Yi depends on three factors: yi = the count, μi = the mean, and α = the dispersion parameter. If α = 0, the probability function above reduces to the "simple" Poisson regression (PR) model. Therefore, the objective of the test or routine is to establish whether α differs from zero. If it does, the common rules of interpretation are: (i) α > 0 means the data are over-dispersed, and the probabilities fi(yi, μi, α) sum to 1; (ii) α < 0 means the data are under-dispersed, and the probabilities fi(yi, μi, α) do not sum to 1.
HYPOTHESIS AND DECISION RULE FOR TESTING: the null hypothesis is H0: α = 0 and HA: α ≠ 0. There are two options for testing α: (i) the Wald test; and (ii) the log-likelihood (ratio) test. The objective of the test is goodness of fit.
(i) Wald Test: The Wald statistic compares the estimated value to the hypothesized value and is referred to the chi-square distribution:
(4)   W = (θ^ − θ0)² / var(θ^)
... where θ^ = estimated value; and θ0 = hypothesized value (here 0).
(ii) Log-Likelihood Test: Assume that the likelihood function is given by L(θ); the log-likelihood function F(θ) is defined as the natural logarithm of L(θ), F(θ) = ln L(θ); thus:
(5)   F(θ) = Σ ln fi(yi | θ)
This completes the routine test for the generalized Poisson regression model. I hope this has been helpful. Cheers.
REFERENCES: Some relevant articles are attached.
[1] Hald, A. (1999), "On the History of Maximum Likelihood in Relation to Inverse Probability and Least Squares", Statistical Science 14 (2): 214–222, doi:10.1214/ss/1009212248, JSTOR 2676741.
[2] Pratt, J. W. (1976), "F. Y. Edgeworth and R. A. Fisher on the Efficiency of Maximum Likelihood Estimation", The Annals of Statistics 4 (3): 501–514.
  • asked a question related to Applied Microeconometrics
Question
8 answers
Is a unit root test required for panel data in nominal form?
Relevant answer
Answer
Ok, @Jay, that is clear.
STATIONARITY OF DATA: in time series, the common model used is the autoregressive type and its variations. In an autoregressive time series, we regress Yt on its own value in the previous period, Yt-1; thus:
Yt = B0 + B1*Yt-1 + et
The value of Yt throughout the period, when plotted, will not be smooth; in some periods there will be spikes (up and down). Let's call this volatility (up and down) the effect of a shock. The idea of testing for stationarity is to verify whether the effect of a shock is permanent or transitory. If the effect of a shock is transient (temporary), the value of Yt in subsequent periods will return to its long-run equilibrium. If Yt returns to its long-run equilibrium, we say that the series is stationary, meaning that the data are stable: even after a shock, Yt goes back to its long-run mean (mean reversion). However, if after the shock Yt does not go back to its long-run equilibrium, the effect of the shock is absorbed into the system and becomes part of it. This type of series is called an integrated time series. This is one rationale for checking stationarity.
AFTER THE ADF TEST, WHAT'S NEXT? If the ADF test shows that the series is stationary at first difference (as most are), the data are considered stable after differencing once. In general, if the series is stationary after I(1), i.e. after first differencing, the series still retains its mean-reverting memory. In such a case, the series is predictable and there is no need for an error correction mechanism. If the series has lost its mean-reverting characteristic, then you would need to implement an ECM. What's next? Construct your predictive function (forecast model) and test the model.
REFERENCE: I attach here a reference material that you can use from time to time when doing modeling. I hope it will be helpful.
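Since the question concerns panel data, a minimal Stata sketch of the panel analogues of the ADF test, assuming a panel declared with id and year and a nominal series y (names hypothetical):
xtset id year
xtunitroot llc y   // Levin-Lin-Chu: H0 = all panels contain unit roots
xtunitroot ips y   // Im-Pesaran-Shin: allows heterogeneous autoregressive roots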
  • asked a question related to Applied Microeconometrics
Question
22 answers
I'm analyzing panel data and would like to include and determine the firm specific and industry specific effect.
Relevant answer
Answer
1. There will be collinearity when adding the industry dummy to a firm fixed-effects model, so the industry dummy will be dropped; there is no problem with the year dummy.
2. Using random effects will solve the problem; however, the Hausman test indicates that fixed effects is better.
3. The firm-level fixed effects take into account that there are differences among firms. However, we need to know the magnitude of the industry effect if fixed effects are used.
  • asked a question related to Applied Microeconometrics
Question
8 answers
My research aim is to identify the effect of fresh graduates on labour demand and supply, and whether this supply is in surplus or deficit relative to market needs.
Relevant answer
Answer
Here is a reference that may help in one segment of your analysis:
Conley, John P., and Ali Sina Önder, "The Research Productivity of New PhDs in Economics: The Surprisingly High Non-Success of the Successful", Journal of Economic Perspectives, Volume 28, Number 3, Summer 2014, Pages 205–216.
  • asked a question related to Applied Microeconometrics
Question
3 answers
Simultaneous (or multiprocess) event history models have been developed in recent years as a very particular and advanced type of duration model. Does anyone know if:
i) there is already some type of multiprocess event history model allowing for competing risks in BOTH processes (e.g., any way of modelling two main transition processes, where each transition may take two or more different modes or routes)?
ii) there is any package that allows the estimation of these models in Stata?
The idea would be to estimate both processes jointly. One of the processes has already been studied through a discrete-time competing-risks model, but it would be nice if some methodology allowed the joint estimation of this competing-risks model (where transitions may occur through 2 routes) with another one, for a different choice problem (precisely, a multinomial choice problem where agents decide among 4 alternative occupations), in order to allow for potential interdependencies (through unobservables) between the two processes.
Relevant answer
Answer
Dear Vera,
For the outcome equation: competing risks models can be readily estimated for each destination separately; you do not need a special package for that. Just stset your data accordingly and re-estimate the same models for each outcome, marking the spells that experience the respective other event as non-failures up to that point and censored afterward (a Stata sketch follows below). Your risk set will be the same each time, but the number of events will vary.
As for the selection equation, I remember the routine had some rigidities in that respect, but maybe there, too, you can work with separate binary response models?
Sorry I cannot be of more help.
Best, Jonas
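A minimal Stata sketch of the outcome-equation advice, assuming spell data with a duration variable time, a destination code dest (1 or 2, with 0 for censored spells), and a covariate x (all names hypothetical):
* destination 1: exits to destination 2 are treated as censored
stset time, failure(dest == 1)
streg x, distribution(weibull)

* destination 2: exits to destination 1 are treated as censored
stset time, failure(dest == 2)
streg x, distribution(weibull)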
  • asked a question related to Applied Microeconometrics
Question
10 answers
I have tested for unit roots in my panel data; some of the variables are stationary in first differences and some in levels. Should I take logs of any of my variables? The dependent variable is stationary at first difference.
Relevant answer
Answer
It is not necessary to take the log (or ln) of any variable. In some cases you can try using variables in levels and variables in logs, then compare the performance of the models and choose the one with the higher performance. Given that some variables are stationary in levels and others in first differences (which is usually the case), it would be better to employ an ARDL model.
  • asked a question related to Applied Microeconometrics
Question
5 answers
Keynes postulated that the marginal propensity to consume (MPC), the rate of change of consumption for a unit (say, a dollar) change in income, is greater than zero, but less than 1. I would like to know if this always holds. If not, under which conditions may this theory fail to hold?
Relevant answer
Answer
Thanks a lot, Uddin, for the opinion shared. Your point is well noted.
  • asked a question related to Applied Microeconometrics
Question
4 answers
Heckman procedures have been widely used in empirical research to correct for selection bias. However, for duration models (survival analysis/time-to-event data), selection correction is still under development. There is an important contribution by Boehmke et al. (2006) in the American Journal of Political Science, which resulted in the program DURSEL for Stata.
Does anyone know of any subsequent advances in correcting for selection bias in duration models, especially for Stata?
Thanks in advance!
Relevant answer
Answer
Hi Vera, I am not aware of the paper by Boehmke et al. (2006), but with single-observation-per-unit data you can control for unobserved heterogeneity (UH) that is uncorrelated with X, and with multiple observations per unit even for correlated UH (see e.g. Van den Berg, 2001, Handbook). Concerning dynamic selection, you should have a look at the so-called "timing of events" approach by Abbring/Van den Berg (2003). In addition, I remember having seen something on selection and duration models by Denis Fougère or Jean-Pierre Florens. I haven't found it now on the spot, but it might be worthwhile looking at Florens/Fougère/Mouchart (2008) or Fougère/Kamionka (2008). Best, Alfred
  • asked a question related to Applied Microeconometrics
Question
13 answers
SUR (seemingly unrelated regressions) models are well suited to cross-sections whenever we have two or more equations (for the same cross-section units) whose errors are believed to be correlated. Extensions of SUR models to panel data, however, seem to be conceptually different: each "separate" equation corresponds to a time period of the panel rather than to a different dependent variable. (The same is true even using the user-written command xtsur for Stata.)
Is there any possibility of extending the SUR model to a dataset in panel format, in order to estimate two equations and allow for correlation among their errors, while still controlling for unobserved heterogeneity of the panel units?
I am using a panel dataset for several institutions and I would like to estimate two regressions for two different dependent variables (two different sources of revenue). It is believed that the errors of these two regressions may be correlated. How could we estimate such a model, using something like what SUR does for cross-sections?
Relevant answer
Answer
Hi Vera,
I didn't understand what you meant by "each 'separate' equation corresponds to each time period of the panel, rather than to a different dependent variable." I think, as you also stated later, you have two or more dependent variables as in any other SUR application, but with two or more observations per cross-sectional unit. So you want to estimate a SUR model while allowing for unobserved heterogeneity. If that is the case, estimation of SUR models with panel data (balanced or unbalanced) is possible.
The first thing I would think of is to do a within transformation on the panel and treat the transformed data as a cross-section. By doing so you avoid estimating the panel-specific intercepts, and it all becomes a standard application of SUR. For an example of this approach, see Bezlepkina et al. (2005), "Effects of subsidies in Russian dairy farming", Agricultural Economics, 33: 277-288. Note the corrections required for the standard errors in this case.
Second, if you have variables that vary across the cross-section but not over time, the above approach may not be appealing. In this case you may want a sort of random effects specification, which is also possible for a system of equations. Two papers I think may be helpful in this respect are Biorn et al. (2003), "Random coefficients and unbalanced panels: An application of data from Norwegian chemical plants", Annales d'Economie et de Statistique 69: 55-83, and Biorn (2004), "Regression system for unbalanced panel data: a stepwise maximum likelihood procedure", Journal of Econometrics 122: 281-291. The latter is more into the econometric details while the former is an application.
Software-wise, you sound like a Stata user and you can do both approaches in Stata. In the latter case, the -xtmixed- command would be useful.
Daniel
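A minimal Stata sketch of the first (within transformation) approach, assuming a panel identified by id, two revenue variables y1 and y2, and common regressors x1 and x2 (all names hypothetical; the standard errors would still need the correction noted above):
* demean each variable within panel units, then run SUR on the
* transformed data as if it were a cross-section
foreach v in y1 y2 x1 x2 {
    egen double mbar_`v' = mean(`v'), by(id)
    generate double w_`v' = `v' - mbar_`v'
}
sureg (w_y1 w_x1 w_x2) (w_y2 w_x1 w_x2)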