Data Model - Science topic
Explore the latest questions and answers in Data Model, and find Data Model experts.
Questions related to Data Model
I am currently researching control strategies for wind-excited tall buildings and am seeking the MATLAB data/models for the Wind-Excited 76-Story Building benchmark developed by Yang et al. at UC Irvine (1997/2000). The original links appear to be inactive. Could anyone direct me to these resources? Any assistance would be greatly appreciated.
Integration of AI applications with cybersecurity to create a model or agent with faster and more reliable techniques to penetrate a system, so that we may understand its flaws and improve upon them.
Hello researchers, greetings.
I want to run a panel data model in Stata. My panel data consist of a monthly time variable with 6 cross-sectional units. When I import my data into Stata, the time variable is read as a string, and when I generate a monthly time variable, it extends many periods beyond the actual sample. Can anyone help me solve this problem?
Generally life extension and anti-aging. A lower death rate cancels out a low birthrate. https://www.researchgate.net/publication/382049802_Correcting_Cell_Errors
Can someone direct me to a working link to download the Century model for SOC?
The link on the Colorado State University site below doesn't seem to work.
Hi,
I am estimating a panel data model in which I research the effects of demographic indicators associated with population aging on the at-risk-of-poverty rate. In this model, I found a statistically significant negative regression coefficient for the regressor "proportion of seniors", which would mean that a higher proportion of seniors is associated with a lower poverty-risk rate, and vice versa. How could I explain this in my thesis? The model has also been tested for heteroskedasticity, autocorrelation, and multicollinearity, and all tests come out well.
Thank you!
I am analyzing some time-series data. I wrote a script in R and used two methods from two different packages to calculate the DW statistic and the respective p-values. Surprisingly, for the same value of the DW statistic, they give me significantly different p-values. Why, and which one is more trustworthy (I assume the one calculated with durbinWatsonTest)? Part of my code is below:
dwtest(model)
durbinWatsonTest(model)
R output is the following:
dwtest(model):
data: model
DW = 1.8314, p-value = 0.1865
alternative hypothesis: true autocorrelation is greater than 0
durbinWatsonTest(model):
 lag Autocorrelation D-W Statistic p-value
   1     0.07658155      1.831371   0.348
Furthermore, durbinWatsonTest from the car package seems to involve some randomness. I executed the script a few times from the terminal, within a couple of seconds, for the same data (different from above), and the outputs were as below:
lag Autocorrelation D-W Statistic p-value
1 0.1181864 1.7536 0.216
lag Autocorrelation D-W Statistic p-value
1 0.1181864 1.7536 0.204
lag Autocorrelation D-W Statistic p-value
1 0.1181864 1.7536 0.198
The p-value is different every time I execute the script.
Any ideas why? Which method gives the correct p-values, dwtest or durbinWatsonTest?
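In case it helps: dwtest() from lmtest computes its p-value analytically and tests the one-sided alternative ("greater") by default, while durbinWatsonTest() from car bootstraps its p-value (default reps = 1000) and is two-sided by default. That explains both the roughly doubled p-value and the run-to-run variation. A minimal sketch, assuming `model` is your fitted lm object:
library(lmtest)   # dwtest(): analytic p-value, one-sided ("greater") by default
library(car)      # durbinWatsonTest(): bootstrapped p-value, two-sided by default
set.seed(123)                               # makes the bootstrap reproducible across runs
durbinWatsonTest(model, reps = 10000)       # more replications -> a more stable p-value
dwtest(model, alternative = "two.sided")    # put both tests on the same alternative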
2024 4th International Conference on Computer Technology and Media Convergence Design (CTMCD 2024) will be held in Kuala Lumpur, Malaysia on February 23-25, 2024.
---Call For Papers---
The topics of interest for submission include, but are not limited to:
1. Digital design
· Animation design
· Digital media art
· Visual media design
· Digital design analysis
· Smart design
2. Computer Technology
· Artificial intelligence
· Virtual reality and human-computer interaction
· Computer animation
· Software engineering
· Computer modeling
· Data model and method
· Big data search and information retrieval technology
· Intelligent information fusion
All accepted papers will be published in SPIE conference proceedings, which will be indexed by EI Compendex and Scopus.
Important Dates:
Full Paper Submission Date: February 06, 2024
Registration Deadline: February 13, 2024
Final Paper Submission Date: February 18, 2024
Conference Dates: February 23-25, 2024
For more details, please visit:

The website of Colorado University has not been working for months, and I don't know where to find this model!
Hi
I intend to model a thermal response we measured using ERA5 variables and, later, use this model to predict the future response with CMIP6 variables.
My doubt is: how do I use precipitation correctly?
In the ERA5 hourly analysis, total precipitation comes in metres accumulated over 1 hour (so m/h, I suppose), while in CMIP6 it comes in kg/m2/s.
kg/m2/s is the same as mm/s, so I'm wondering whether converting ERA5 precipitation from m/h to mm/s and using that is valid.
Also, am I messing up units considering the grids are not even the same in both datasets?
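For the unit part, the conversion itself is straightforward: 1 m accumulated over 1 hour equals 1000 mm / 3600 s ≈ 0.278 mm/s, and kg/m2/s is numerically equal to mm/s for liquid water (density 1000 kg/m3). A minimal sketch (variable names are placeholders):
# tp_era5: ERA5 total precipitation in metres accumulated over 1 hour
tp_era5_mm_per_s <- tp_era5 * 1000 / 3600   # m per hour -> mm per second, comparable to CMIP6 pr
The differing grids are a separate issue: the units stay valid, but you would normally regrid (e.g., bilinearly or conservatively) one dataset onto the other's grid before fitting and applying the model.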
Thank you!
Cheers, Luís Pereira.
A software design is a plan or a blueprint for building a software program. It is a high-level representation of the structure, behavior, and functionality of the software that guides the coding process. A software design typically includes a number of different components, such as:
- Architecture: This describes the overall structure of the software, including how different components will interact with each other and how data will flow through the system.
- Data structures: This describes the way that data will be organized and stored within the software, including databases, data models, and other data-related components.
- Algorithms: This describes the specific methods and procedures that will be used to perform different tasks within the software, such as sorting data or searching for information.
- User interface: This describes how users will interact with the software, including the layout of the user interface, the types of controls and widgets that will be used, and other details related to the user experience.
- Functional requirements: This describes the specific features and functionality that the software will provide, including the different tasks that it will be able to perform and the types of data that it will be able to handle.
I am estimating female labour participation rates using panel data for 7 countries, with data from 1991 to 2021. I have reviewed the literature, and it suggests using GMM only when N is large and T is small. Could you please advise which advanced or dynamic panel model should be used?
I have a panel data model with N = 42 and T = 11. I need the Stata commands for the different second-generation unit root tests.
Thanks
Suppose I have a panel data survey that collects EQ-5D scores from 100 patients via an app. One patient can submit scores more than once as time passes. Time in this model is defined as the interval between the date they registered in the app and the date they submit their scores. Some patients fill in the survey several times, but some only once or twice, so the latter can be viewed as dropouts if we want to study the trajectory of the EQ-5D scores. In this situation, how do I use IPW to weight the sample, given that early dropout may bias the model because patients who feel better may stop recording their quality of life?
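Not a full answer, but a minimal sketch of the usual two-step IPW recipe in R, assuming a long-format data frame d with one row per patient-wave, an indicator observed (1 = a score was submitted at that wave), and baseline covariates plus the previous score as predictors of remaining in the study (all names are placeholders):
# step 1: model the probability of still being observed at each wave
drop_model <- glm(observed ~ age + sex + baseline_eq5d + prev_eq5d,
                  family = binomial, data = d)
d$p_obs <- predict(drop_model, type = "response")
# step 2: weight the observed rows by the inverse of that probability
d$ipw <- ifelse(d$observed == 1, 1 / d$p_obs, NA)
# the weights can then be passed to a weighted GEE or mixed model,
# e.g. geepack::geeglm(eq5d ~ time, id = patient_id, data = d, weights = ipw)
Stabilised weights (multiplying by the marginal probability of being observed) are usually preferred to tame extreme weights.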
I have a heterogeneous panel data model with N = 6 and T = 21. What is the appropriate regression model? I applied the CD test, which shows that the data have cross-sectional dependence.
I used second-generation unit root tests, and the results show that my data are stationary at level.
Is it possible to use PMG? Would you please explain the appropriate regression model?
I have panel data comprising 5 cross-sections and 14 independent variables; the time-series dimension is 10 years. When I run the panel data model as pooled OLS or as a fixed effects model, it gives results, but for the random effects model it shows the error "RE estimation requires number of cross-sections > number of coefficients for between estimators for estimation of RE innovation variance". Can anyone help me obtain results for the random effects model?
I am seeking recommendations for potential research topics for my PhD in the field of AI in healthcare, with a particular focus on neuroscience. I am interested in exploring how artificial intelligence can be used to improve our understanding and treatment of neurological and neuropsychiatric disorders.
Could you kindly suggest some potential research topics that are currently of interest or relevance in this field? I am particularly interested in topics that leverage machine learning, deep learning, or other AI techniques to analyze neuroimaging data, model brain function, or develop diagnostic or therapeutic tools for neurological and neuropsychiatric disorders.
I have a panel data model; my sample includes 6 countries (I won't add more), T = 11 years, and 6 or 7 independent variables.
Can I use all 6 or 7 independent variables when I have only 6 countries (cross-sections)?
Is there any test like this in Stata?
Dear Scholars,
I have a stationary dependent variable and non-stationary independent variables. I employed a panel ARDL model, but I would also like to run a static panel data model. To control for country differences, I decided to use a fixed effects model, but I could not find a proper answer about taking differences.
Should I take differences of all variables or just of the non-stationary ones?
Thank you very much for your help.
I'm trying to get data on loan officers from microfinance institutions (how many borrowers they approach, loan amount outstanding, portfolio risk, the percentage of complete repayment, etc.). Can anyone suggest a database I could use to build a panel data model?
Thank you.
One of my big problems is finding articles that could suggest new thoughts/research for my work. Part of the problem is the amount of extraneous material (dirt) that is available. For example, when I see an abstract that is long (more than about 300 words in English), I simply ignore it; my experience tells me it is usually unfounded, vague or hand-waving. But there is a possibility that there is a grain of something I'm ignoring, and a possibility that I'm missing a paper that may be valuable. Then there are all the ad hominem statements, to which I respond by just ignoring those authors. I'd like to be more effective at finding new data/models while ignoring the dirt. How can I be more effective at distributing my research?
I studied a process using design of experiments. First, I screened with a fractional factorial design; the results showed that 3 of the 5 factors are significant, and I also found significant curvature in the model. So I used an RSM design (Box-Behnken) with the 3 selected factors to better understand the process. The results showed that a linear model fits the data best. I am confused by these results: why do the fractional factorial results show curvature, while the response behaves linearly in the RSM?
I am working on the development of a PMMS model. To select the best-performing tools and models, several models need to be developed and validated. Can this be replaced by some optimization algorithms?
The 5 methods of estimating dynamic panel data models using 'dynpanel" in R
library(dynpanel)                    # provides the dpd() estimator
data("Produc", package = "plm")      # US states production panel used in this example
# Fit the dynamic panel data model using the Arellano-Bond (1991) instruments
reg <- dpd(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, Produc, index = c("state","year"), 1, 4)
summary(reg)
# Fit the dynamic panel data using an automatic selection of appropriate IV matrix
#reg<-dpd(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,Produc,index=c("state","year"),1,0)
#summary(reg)
# Fit the dynamic panel data using the GMM estimator with the smallest set of instruments
#reg<-dpd(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,Produc,index=c("state","year"),1,1)
#summary(reg)
# Fit the dynamic panel data using a reduced form of IV from method 3
#reg<-dpd(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,Produc,index=c("state","year"),1,2)
#summary(reg)
# Fit the dynamic panel data using the IV matrix where the number of moments grows with kT
# K: variables number and T: time per group
#reg<-dpd(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,Produc,index=c("state","year"),1,3)
#summary(reg)
A common threshold for standardized coefficients in structural equation models is 0.1. But is this also valid for first difference models?
Dear blockchain researchers,
In the classical Nakamoto Blockchain (BC) model, transactions (TXs) are packaged in blocks and each block points specifically to its single predecessor block (I'm not going to go into technical details here). This is a linear data model, which justifies the name "chain". In DAG-based BCs, TXs may or may not be packaged into blocks, and each TX/block (say 'a') is allowed/enforced to point to more than one parent. Consequently, several child blocks/TXs (say 'b', 'c' and 'd') are similarly allowed/enforced to later point, at random, to 'a'. This is obviously a network-like data model.
Searching previous works, all the DAG-based BCs I found adopt a many-to-many cardinality model of blocks/TXs, as described above. Some do propose that children must point to several parents for higher throughput and credibility. However, none of them proposes, specifically, a relaxed one-to-many parent-child dependency.
To clarify, I specifically mean that children are enforced to point to only one parent, while each parent is allowed to be pointed to by several children. This leads to a tree-like DAG instead of a complicated dense network. I need some references that discuss such data modelling. It would also be very helpful if a comparison were made between the different types of DL data models (one-to-many vs. many-to-one vs. one-to-one vs. many-to-many).
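Not a reference, but to make the two cardinalities concrete, here is a minimal sketch of the two data models as plain tables (field names are made up):
# tree-like DAG (the relaxed one-to-many): each child stores exactly ONE parent id,
# so parent -> children is one-to-many and the ledger is a tree
tree_ledger <- data.frame(
  tx_id  = c("a", "b", "c", "d"),
  parent = c(NA,  "a", "a", "a")    # b, c and d all point back to a; none has a second parent
)
# general DAG (many-to-many): parent references live in a separate edge list,
# so a child may point to several parents and a parent may be pointed to by several children
dag_edges <- data.frame(
  child  = c("b", "b", "c", "d"),
  parent = c("a", "e", "a", "a")    # b points to two parents (a and e)
)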
Any help, explanation, or suggestions are most welcome!
Can anyone suggest any ensembling methods for the output of pre-trained models? Suppose, there is a dataset containing cats and dogs. Three pre-trained models are applied i.e., VGG16, VGG19, and ResNet50. How will you apply ensembling techniques? Bagging, boosting, voting etc.
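One simple option is late fusion of the three models' outputs. A minimal sketch of soft and hard voting in R, assuming p_vgg16, p_vgg19 and p_resnet50 are matrices of predicted class probabilities (rows = images, columns = classes) produced by the fine-tuned models (names are placeholders):
# soft voting: average the class probabilities, then take the most probable class
p_ensemble <- (p_vgg16 + p_vgg19 + p_resnet50) / 3
pred_soft  <- max.col(p_ensemble)
# hard (majority) voting: each model votes with its predicted label
votes     <- cbind(max.col(p_vgg16), max.col(p_vgg19), max.col(p_resnet50))
pred_hard <- apply(votes, 1, function(v) as.integer(names(which.max(table(v)))))
Bagging or boosting would require retraining the base models on resampled or reweighted data, so with fixed pre-trained networks the usual routes are voting, (weighted) probability averaging, or stacking a small meta-classifier on the concatenated outputs.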
Extended/edited from an earlier question for clarity.
I have temporally high-resolution outputs of modelled climate data (model x, spanning 0-5000 ka, at a low spatial resolution of 0.5 degrees). Compared to other climate models, however, I have reason to believe it is under-predicting precipitation/temperature changes at certain time intervals. Is there a way to calibrate this with better quality records (i.e., those available via WorldClim/PaleoClim)?
For example, the response to the MIS 5e (120-130 ka BP) incursion of the African Summer Monsoon and Indian Summer Monsoon into the Saharan and Arabian deserts is very weak compared to the MIS 5e data from WorldClim/PaleoClim (and corroborated by palaeoclimatic data). Can I correct/calibrate model x with these more responsive models, and how should this be done?
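One common approach is bias correction against the better-constrained dataset, for example empirical quantile mapping fitted on a shared interval (e.g., MIS 5e or the present day) and then applied to the rest of model x's output; whether that is defensible for deep-time intervals depends on how comparable the reference records really are. A minimal sketch for one grid cell, with x_model, x_ref and x_new as placeholder vectors:
# x_model: model-x values over the calibration interval
# x_ref:   reference values (e.g., WorldClim/PaleoClim) for the same interval and grid cell
# x_new:   model-x values to be corrected
qm_correct <- function(x_model, x_ref, x_new) {
  p <- ecdf(x_model)(x_new)                    # quantile of each new value in the model distribution
  quantile(x_ref, probs = p, names = FALSE)    # corresponding quantile of the reference distribution
}
# corrected <- qm_correct(x_model, x_ref, x_new)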
Dear all,
I wanted to evaluate the accuracy of a model using observation data. My problem is that the correlation of the model with the observed data is really good (greater than 0.7), but the RMSE is very high too (greater than 100 mm per month for monthly rainfall data), while the model also has low bias.
How can I explain this case?
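A high correlation with low bias but large RMSE usually means the model gets the timing right but the amplitude wrong: RMSE is then dominated by the variance mismatch rather than the mean. A small synthetic illustration:
set.seed(1)
t   <- seq_len(120)
obs <- 150 +  80 * sin(t / 6) + rnorm(120, sd = 20)   # synthetic monthly rainfall (mm)
mod <- 150 + 200 * sin(t / 6) + rnorm(120, sd = 20)   # same seasonal timing, inflated amplitude
cor(obs, mod)               # high correlation: wet and dry months line up
mean(mod - obs)             # small mean bias
sqrt(mean((mod - obs)^2))   # large RMSE, driven almost entirely by the amplitude mismatch
Reporting the standard deviations of both series (or a Taylor diagram) alongside correlation, bias and RMSE makes this easy to explain.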
Thank you all
The aim of my study is to investigate the impact of Integrated Reporting disclosure practices on operational performance (ROA) and firm value (Tobin's Q). I have applied a panel data model for my analysis. In the descriptive statistics, the standard deviation of Tobin's Q is high, i.e., 4.964. One of the reviewers commented that the high standard deviation of Tobin's Q means the variable is not normal, which may affect the results. However, I have read that normality is not required in panel data models. What justification should I give to satisfy the reviewer? Please also mention some references.
I am trying to build a model in RStudio; however, I can't find a solution. The order of my procedure is:
- data
- Shapiro-Wilk test for normality (it says the data have a non-normal distribution)
- log transformation
- Shapiro-Wilk test for normality again (it says the data still have a non-normal distribution)
What can I do?
Based on Hansen (1999), we can estimate a fixed effects threshold panel data model. In my model, the Hausman test says it is random effects; what can I do?
Hi everybody,
I am trying to run the CESM-atm model, but I don't understand where the path for the data should point; I am attaching an image of what the structure of the path must be.
By the way, I am running this model on my personal laptop, so I had to do the porting process beforehand, and I don't think that is really the problem here.
Could anyone explain to me what I must do to download the data for the model?
Thanks a lot!

I noticed that when using the gemtc package to fit a fixed-effect model with likelihood = "normal" and link = "identity" (mean difference), the burn-in iterations specified in mtc.run ("n.adapt") are not taken into account.
Example (with "parkinson" data):
model <- mtc.model(parkinson, likelihood='normal', link='identity', linearModel = 'fixed')
res <- mtc.run(model, n.adapt = 20000, n.iter = 75000)
summary(res)
#Results on the Mean Difference scale
#Iterations = 1:75000
#Thinning interval = 1
#Number of chains = 4
#Sample size per chain = 75000
If no linear model is specified, a random-effects model is fitted by default. The random-effects model works, and other likelihood/link combinations work in both models.
Is there a way to use the package on the mean-difference scale with a fixed-effect model that includes burn-in iterations? Do you see any error in the way I used likelihood='normal' / link='identity'?
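I don't know why this combination ignores n.adapt, but one workaround could be to discard the burn-in yourself with coda. This is an untested sketch that assumes the posterior draws sit in res$samples as a coda mcmc.list (which is how gemtc results are usually stored):
library(coda)
post <- window(res$samples, start = 20001)   # drop the first 20,000 iterations manually
summary(post)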
In the development of forecasting, prediction or estimation models, we rely on information criteria so that the model remains parsimonious. So why, and when, should one or another of these information criteria be used?
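For concreteness, AIC = 2k − 2·logLik targets predictive accuracy, while BIC = k·log(n) − 2·logLik penalises extra parameters more heavily as n grows, so it tends to select sparser models. A minimal sketch on simulated data:
set.seed(42)
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 1 + 0.8 * x1 + rnorm(n)        # only x1 truly matters
m_small <- lm(y ~ x1)
m_big   <- lm(y ~ x1 + x2 + x3)
AIC(m_small, m_big)                  # both criteria should prefer the smaller model here
BIC(m_small, m_big)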
I have a panel data set of 11 countries and 40 years; the data consist of two groups, developing and developed countries. The chosen method will be applied to both groups separately in order to compare the results of the two groups. Suggestions will be appreciated.
I am using transfer learning with pre-trained models in PyTorch for an image classification task.
When I modified the output layer of the pre-trained model (e.g., AlexNet) for our dataset and ran the code to view the modified architecture of AlexNet, it printed "None".
I have non-stationary time-series data for variables such as Energy Consumption, Trade, Oil Prices, etc., and I want to study the impact of these variables on the growth in electricity generation from renewable sources (I have taken natural logarithms of all the variables).
I performed a linear regression, which gave me spurious results (R-squared > 0.9).
After testing these time series for unit roots using the Augmented Dickey-Fuller test, all of them were found to be non-stationary, hence the spurious regression. However, the first differences of some of them, and the second differences of the others, were found to be stationary.
Now, when I estimate the new linear regression with each variable differenced to its order of integration (in order to have a stationary model), the statistical results are not good (high p-values for some variables and a low R-squared of 0.25).
My question is: how should I proceed now? Should I change my variables?
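Before differencing everything, it may be worth checking whether the level series are cointegrated; if they are, an error-correction (or ARDL/bounds) specification keeps the long-run information that differencing throws away. A minimal Engle-Granger style sketch, with y and x as placeholder (log-level) series:
library(tseries)
adf.test(y)              # confirm the levels are non-stationary
adf.test(x)
eg <- lm(y ~ x)          # candidate long-run (cointegrating) regression
adf.test(residuals(eg))  # stationary residuals point towards cointegration
# note: proper Engle-Granger critical values differ from the standard ADF ones;
# urca::ca.jo() offers a Johansen test as an alternative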
EDIT: According to the literature suggested in the answers, IT IS NOT POSSIBLE, because at least some calibration data are required, which, in my case, are not available.
I am looking for a technique/function to estimate soil temperature from meteorological data only, for soils covered with crops.
In particular, I need to estimate soil temperature for a field with herbaceous crops at mid-latitudes (northern Italy), but the models I found in the literature are fitted for snow-covered and/or high-latitude soils.
I have daily values of air temperature (minimum, mean and maximum), precipitation, relative humidity (minimum, mean and maximum), solar radiation and wind speed.
Thank you very much
I am running an ARDL model in EViews and I need to know the following, if anyone could help:
1. Is the optimal number of lags for annual data (30 observations) 1 or 2, or should a VAR be applied to determine the optimal number of lags?
2. When we apply the VAR, the maximum number of lags applicable was 5 (beyond 5 we got a singular-matrix error), but the problem is that as we increase the number of lags, the optimal number of lags increases (when we choose 2 lags we get 2 as the optimal, when we choose 5 lags we get 5 as the optimal), so what should be done?



Hello,
My friend is seeking a collaborator for psychology-related statistics. Current projects include personality traits and their relations to other variables (e.g., age). You will be responsible for data analysis for potential publications. Preferably you should have some knowledge of statistics and be familiar with software used for analysis (e.g., MATLAB, R, SPSS). 10 hours a week are required. Leave your email address if interested.
I'm a community ecologist (for soil microbes), and I find hurdle models are really neat/efficient for modeling the abundance of taxa with many zeros and high degrees of patchiness (separate mechanisms governing likelihood of existing in an environment versus the abundance of the organism once it appears in the environment). However, I'm also very interested in the interaction between organisms, and I've been toying with models that include other taxa as covariates that help explain the abundance of a taxon of interest. But the abundance of these other taxa also behave in a way that might be best understood with a hurdle model. I'm wondering if there's a way of constructing a hurdle model with two gates - one that is defined by the taxon of interest (as in a classic hurdle model); and one that is defined by a covariate such that there is a model that predicts the behavior of taxon 1 given that taxon 2 is absent, and a model that predicts the behavior of taxon 1 given that taxon 2 is present. Thus there would be three models total:
Model 1: Taxon 1 = 0
Model 2: Taxon 1 > 0 ~ Environment, Given Taxon 2 = 0
Model 3: Taxon 1 > 0 ~ Environment, Given Taxon 2 > 0
Is there a statistical framework / method for doing this? If so, what is it called? / where can I find more information about it? Can it be implemented in R? Or is there another similar approach that I should be aware of?
To preempt a comment I expect to receive: I don't think co-occurrence models get at what I'm interested in. These predict the likelihood of taxon 1 existing in a site given the distribution of taxon 2. These models ask the question do taxon 1 and 2 co-occur more than expected given the environment? But I wish to ask a different question: given that taxon 1 does exist, does the presence of taxon 2 change the abundance of taxon 1, or change the relationship of taxon 1 to the environmental parameters?
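Not a full answer, but one way to sketch the "two gates" idea is to let the presence of taxon 2 enter the truncated-count part of a standard hurdle model as an interaction, which effectively fits separate abundance models for taxon 2 absent versus present. A minimal sketch with pscl (all variable names are placeholders):
library(pscl)
# d: one row per sample, with counts of taxon1 and taxon2 and environmental covariates
d$taxon2_present <- as.integer(d$taxon2 > 0)
fit <- hurdle(
  taxon1 ~ (env1 + env2) * taxon2_present |  # truncated-count part: environment effects allowed
           env1 + env2,                      #   to differ by taxon 2 status (your models 2 and 3)
  data = d, dist = "negbin")                 # the part after "|" is the zero hurdle (your model 1)
summary(fit)
A fully separate second gate (different zero processes depending on taxon 2) would be closer to a finite mixture or a joint species distribution model (e.g., the Hmsc package), which may also be worth a look.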
Hello,
Does anyone have an idea about how to analyse my panel data on the exchange rates and stock markets of six countries spread over ten years? My panel data set is actually long (T greater than N) and unbalanced. I am initially using pooled regression and fixed effects models and the Wald test. But while reading, I have come to notice that panel data models are applied according to the panel data structure, so I am a bit confused. I would be glad to have more insight on which model best fits my data structure. Thanks in advance.
I need your help and expertise with the J48 decision tree algorithm, to walk me through the data analysis and the interpretation of the data model.
How will the data be consolidated, processed and analysed, and how should the resulting data model be interpreted?
I first estimated a fixed effects model using xtreg y x, fe and found that all the variables are significant and the R-squared is 0.51.
So I thought that maybe I should use two-step system GMM to account for endogeneity. But since I only have 3 years, when I include the lagged dependent variable as a predictor using xtabond2 y l.y x y*, gmm(l.y) iv(x y*, equation(level)) nodiffsargan twostep robust orthogonal small, the number of observations per group shrinks to two and I can't even run an AR(1) test or a Sargan test. Also, the output shows an insignificant lagged variable.
I am still new to dynamic panel data models. Do I need GMM with such a small sample size and so few observations? Should I use something else? If I only report the fixed effects results, would that be sufficient to be considered for publication?
I would love to hear your recommendation. Thank you very much,
I wish to investigate the effects of landscape parameters on avian community assemblages in an agricultural landscape. In order to conduct the modelling in ArcGIS, is it advisable to use BIOCLIM data in Model Builder?
I'm not going for prediction; rather, I just want to see the effects of landscape parameters on the birds' assemblages.
I have 21 JSON files containing more than 15 million rows in total, with approximately 10 features in each file. I need to first convert all the JSON files to CSV and combine all the CSV files into one to have a high-dimensional dataset. For now, if I load each individual JSON file as CSV, I only get Excel's maximum of 1,048,576 rows, which means I am losing the rest of the data. I know I can analyse them using the data model and Power Pivot in Excel; however, I need to load them into a CSV file first for dimensionality reduction.
Any idea or suggestion on loading this much data into CSV, Excel or any other accepted format which I can later use in Python?
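A minimal sketch in R (the same idea works with pandas in Python): read each JSON file, flatten it to a table, bind everything, and write a single CSV without ever opening Excel, so the row limit never applies. This assumes each file is a JSON array of records; for newline-delimited JSON use jsonlite::stream_in instead. File and folder names are placeholders:
library(jsonlite)
library(data.table)
files    <- list.files("json_dir", pattern = "\\.json$", full.names = TRUE)
tables   <- lapply(files, function(f) as.data.table(fromJSON(f, flatten = TRUE)))
combined <- rbindlist(tables, fill = TRUE)   # ~15 million rows is fine outside Excel
fwrite(combined, "combined.csv")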
Hello!
I estimate the influence of some components of the Global Competitiveness Index on the index itself for 12 countries over a period of 11 years. So in my model I have N = 12 and T = 11, while the number of components is 32. I am facing the situation where the only model that provides acceptable test results for my data is the one-step dynamic panel. In my model I use log differences of the selected variables. Yes, it contains lagged dependent variables, but I am worried whether the presence of lagged dependent variables and the acceptable test results are enough to justify the selection of a dynamic panel data model.
What useful information can be extracted from a saved model file regarding the training data?
Also from a security perspective: if someone has access to the saved model, what information can they gain?
I have to estimate a panel data model (19 countries and 37 years) with the xtscc command (regression with Driscoll-Kraay standard errors). I want to know how I can choose the optimal lag for this estimation. Thank you for any suggestions.
I am working on this corporate panel data model: LEVERAGE_it = b0 + b1 PROFITABILITY_it + b2 NON_DEBT_TAX_SHIELD_it + b3 SIZE_it + b4 LIQUIDITY_it + b5 TOBIN_Q_it + b6 TANGIBILITY_it + u_it, where:
leverage = long term debt/total assets
profitability = cash flows/total asset
non debt tax shield = depreciation/total asset
size = log (sales)
liquidity = current assets/current liabilities
tobin_q = mkt capitalization/total assets
tangibility = tangible fixed asset/total assets
What can I say about the exogeneity condition? Can I assume that the expected covariance between the error term u_it and the regressors X_i is zero? Why? A lot of papers make this assumption but do not explain why.
Thanks in advance for your response.
Can I use one Sentinel image for training my model and another one for testing? Since I have two or three images, I wanted to use one or two images for training and the rest for testing. However, I know 70/15/15 is the ideal proportion, but I don't know how to implement it across three images. Also, is it possible not to include 15 percent for validation and just use 70/30?
Dear all,
I am working with the Bacon model to establish the chronology of a lake core; however, I have a question and am seeking your help.
Is it necessary to add my 210Pb data to the model? If yes, how do I calculate the dating error?
Thanks,
Mingda
Why do some researchers report in their papers the results from static panel data models (OLS, FE and RE) alongside the results from dynamic models (first-difference GMM and system GMM), when they chose the GMM models as the best models for the research problem?
I have behavioral data (feeding latency) as the dependent variable. There are 4 populations from which the behavioral data were collected, so population becomes a random effect. I have various environmental parameters, such as dissolved oxygen, water velocity, temperature, fish diversity index, habitat complexity, etc., as the (continuous) independent variables. I want to see which of these variables, or which combination of variables, has a significant effect on the behavior.
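This sounds like a linear mixed model with population as a random intercept. A minimal sketch with lme4, assuming a data frame dat with one row per observation and the covariates named as below (placeholders):
library(lme4)
m <- lmer(feeding_latency ~ dissolved_oxygen + water_velocity + temperature +
            fish_diversity + habitat_complexity + (1 | population),
          data = dat, REML = FALSE)
summary(m)
drop1(m, test = "Chisq")   # likelihood-ratio tests for dropping each fixed effect
Latency data are often right-skewed, so a log transformation or a GLMM (e.g., a Gamma family via glmer) may fit better; candidate subsets of predictors can also be compared with AIC.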
Regarding interoperability of FEA tools:
1. Is the Dassault Abaqus input file format widely supported by other FEA tools (such as Ansys, Nastran, etc.)? Or does every FEA tool have a specific input file format that cannot be handled/used by a different tool?
2. Are there any interoperability issues between different versions of Nastran provided by different vendors (for instance, MSC Nastran and NX Nastran, etc.)? Or can we use the model developed in one Nastran version (e.g. MSC Nastran) easily in a different Nastran version (e.g. NX Nastran)?
Thanking you.
Obviously, the world is watching covid-19 transmission very carefully. Elected officials and the press are discussing what "the models" predict. As far as I can tell, they are talking about the SIR (Susceptible, Infected, Recovered) model. However, I can't tell whether they are using a spatial model and, if so, whether that spatial model is point-pattern or areal. This is critical because the disease has very obvious spatial autocorrelation and clustering in dense urban areas. However, there also appears to be a road-network effect and a social-network effect. For example, are they using a Bayesian maximum entropy SIR? A conditional autoregressive Bayesian spatio-temporal model? An agent-based model? A random walk?
I mean "they" generally. I'm sure different scholars are using different models, but right now I can find only one spatio-temporal model, and what those scholars actually did was two cross-sectional count data models (not spatial ones either) in two different time periods.
Dear researchers,
For several years now, OSM (OpenStreetMap) has been producing huge amounts of spatial data all around the world. As far as I know, some countries, like Canada, have reorganized their NTDB (National Topographic Data Base) data models to be harmonized with the OSM data layers and merged with them, and they also accept the ODbL (Open Database Licence).
I am wondering if it is possible to have a list of such countries' names.
Any help will be so appreciated
Thank you very much for your time.
With Regards
Ali Madad
Sir,
My research topic is the crowding-in and crowding-out effects of public investment on private investment in emerging Asian economies. I have panel data for 6 countries with 15 years of yearly data, 1 IV (public investment), 1 DV (private investment) and 8 control variables. As my panel data set is small, I need your suggestion on which panel data model in Stata is suitable for my data.
What variable(s) can be used as instruments for public health and education expenditure in testing for endogeneity in a static panel data model that regresses public health/education expenditure on economic growth? I am using random effects estimators since this is the most appropriate traditional panel technique (the Hausman test suggested this).
There are many variables in the literature that have a positive correlation with public health expenditure, for instance. However, these variables also have strong correlation with real GDP per capita growth rate and therefore are unsuitable instruments.
Dear all,
the panel data model I am going to analyse has some stationary and some non-stationary variables, and the non-stationary variables are integrated of order one. What would be the best estimation method to apply? Please discuss.
The conventional tests for system GMM are 1) testing for instrument validity and 2) testing for second-order serial autocorrelation.
Are there pre-estimation tests that may be relevant, i.e. normality, heteroskedasticity, panel unit root tests, panel cointegration tests?
I ask because almost 90% of the academic papers I reviewed seem to ignore these tests and stress mostly the two above.
Hi, I'm testing a serial multiple mediation model with two mediators. I tried twice, testing different data, but both attempts showed the same result: CMIN and df = 0.
First, I did a CFA to ensure the validity of the model, and the model fit was acceptable. Second, when I was doing the mediation test (I built the causal model using only unobserved variables), the Notes for Model showed:
Number of distinct sample moments: 15
Number of distinct parameters to be estimated: 15
Degrees of freedom (15 - 15): 0
Result (Default model)
Minimum was achieved
Chi-square = .000
Degrees of freedom = 0
Probability level cannot be computed
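For what it's worth, the arithmetic behind those notes: with p observed variables (and means not modelled), AMOS has p(p + 1)/2 distinct sample moments, so 15 moments corresponds to p = 5; estimating 15 free parameters leaves df = 15 - 15 = 0, i.e. a just-identified (saturated) structural model that reproduces the data perfectly by construction, which is why the chi-square is 0 and no overall fit test is possible.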
Based on these, will this model be acceptable for testing further hypotheses? Or will this model be meaningful to study? I also checked the literature, namely Byrne (2001).
The reference is: Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates.
It mentioned that "this kind of model is not scientifically interesting because it has no degrees of freedom and therefore can never be rejected" (Byrne, 2001). Anyone could give any comments and suggestions on this?
I think it might result from this particular type of causal relationship, or is it a coincidence? The CFA of this model gives:
CMIN=1170.358
df=399
CFI= 0.919
TLI=0.911
SRMR=0.045
RMSEA=0.066
which might provide evidence that the data and model could match well. So, what might be the actual reasons?
Thank you all for any comments in advance!
Thanks!
Hi, I am trying to model the effect of human perception on wildfire ignition in the United States. I want to use Google Trends data to model society's perception of wildfire. Are there any similar studies that use Google Trends data to model people's perception?
I have done uniaxial testing on a biological tissue up to 10% strain and have the data. Now I need to use the data to model the tissue in Abaqus. I believe the Fung anisotropic model suits the tissue best, but I could not find any clear reference textbooks/sources for modelling from test results.
I was able to run an analysis on AMOS.
However, I have a low fit of my data to the model, and given time constraints I doubt I would be able to double the amount of data I have (I only have 217 responses).
What could I do? At the moment I have:
- CMIN/DF: 18.5
- CFI: 0.46
- RMSEA: 0.285
I tried to go through the modification indices, but the only available covariance modification I could make, between two error terms, doesn't make sense and would only improve my model by 4.7.
Any suggestions?
I'm working with panel data on Foreign Direct Investment, using FDI flows as the endogenous variable and, among others, the FDI stock in the previous year as one of the explanatory variables. If we use the lagged endogenous variable as an explanatory variable, we have a dynamic panel data model and should use a suitable estimator (say Arellano-Bond, for example). However, in my case I'm not using the lagged endogenous variable (the flow, [y_t-1 - y_t-2]) as a regressor, but the lagged stock of FDI (y_t-1). Should this case be considered a dynamic model too? Should it be estimated using Arellano-Bond or similar to avoid inconsistency, and is there any specific alternative for this type of specification?
I would like to incorporate semi-structured surveys, satellite tracking, and eBird records into a single species distribution model, while being able to control for potential limitations and biases of each sampling approach.
See these papers for background / theory of this approach:
Hello!
Does anybody know how to estimate variance components for GLMs in R?
It can easily be done for an ordinary linear model (e.g., using the VCA package), but I am not able to find any solution for GLMs.
I would be grateful for any advice or links; R code is very much appreciated.
Here is an example of data and model I have:
N <- 200
dummydata <- rbind(
data.frame(
incidence = sample(x = 0:5, size = N/2, replace = T),
size = 12,
Pred1 = rep(c("X1", "X2", "X3", "X4"), each = 25),
Pred2 = "T1"
),
data.frame(
incidence = sample(x = 6:10, size = N/2, replace = T),
size = 12,
Pred1 = rep(c("X1", "X2", "X3", "X4"), each = 25),
Pred2 = "T2"
))
mod <- glm(
cbind(incidence, size - incidence) ~ Pred1 * Pred2,
data = dummydata,
family = binomial)
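Not a definitive answer, but one common workaround is to refit the GLM as a binomial GLMM with the factors of interest as random effects and read the variance components off VarCorr(); note that these are on the logit (link) scale, so they are not directly comparable to an LM-based VCA, and with only two levels of Pred2 the estimate is illustrative at best:
library(lme4)
mod_vc <- glmer(cbind(incidence, size - incidence) ~ 1 + (1 | Pred1) + (1 | Pred2),
                data = dummydata, family = binomial)
VarCorr(mod_vc)   # variance attributable to Pred1 and Pred2 on the link scale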
With best regards,
Igor
Thanks in advance!