Kent State University
Asked 20th Mar, 2023
Is the multiple correlation coefficient (R) undefined in the case of negative determination coefficients (Artificial Neural Networks)?
I noticed that in some very poor neural network models, the value of R² (coefficient of determination) can be negative. That is, the model is so bad that simply predicting the mean of the data gives a smaller squared error than the model does.
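A minimal sketch of how R² goes negative, assuming the usual definition R² = 1 - SSE/SST. The data and the names `r_squared`, `bad_pred` and `mean_pred` are hypothetical and chosen only for illustration; they are not the values attached to the question.

```python
def r_squared(y, y_pred):
    """Coefficient of determination computed as 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))  # model error
    sst = sum((yi - mean_y) ** 2 for yi in y)               # error of the mean
    return 1.0 - sse / sst


# Hypothetical data, for illustration only.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
bad_pred = [5.0, 4.0, 3.0, 2.0, 1.0]  # a model that gets the trend backwards
mean_pred = [3.0] * len(y)            # always predicting the mean of y

r2_bad = r_squared(y, bad_pred)    # SSE = 40, SST = 10
r2_mean = r_squared(y, mean_pred)  # SSE = SST = 10
```

With these numbers `r2_bad` is -3.0 and `r2_mean` is 0.0, so the backwards model does worse than the mean.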
In linear regression models, the multiple correlation coefficient (R) can be calculated as the square root of R². However, this is not possible for a neural network model with a negative R². In that case, is R mathematically undefined?
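One way to see why the square-root relation holds for least squares but not for an arbitrary model is the sum-of-squares decomposition below. This is a sketch under the usual definitions; the symbols SSE, SST and SSR are notation assumed here rather than taken from the question.

```latex
R^2 = 1 - \frac{SSE}{SST}, \qquad
SSE = \sum_i (y_i - \hat{y}_i)^2, \qquad
SST = \sum_i (y_i - \bar{y})^2

% Identity valid for any set of predictions \hat{y}_i:
SST = SSE + SSR + 2 \sum_i (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}),
\qquad SSR = \sum_i (\hat{y}_i - \bar{y})^2

% Least squares with an intercept: the cross term is zero, so
SST = SSE + SSR \;\Rightarrow\; 0 \le R^2 \le 1,
\qquad R = \sqrt{R^2} = \operatorname{corr}(y, \hat{y})

% Any other model: the cross term can be negative enough that
SSE > SST \;\Rightarrow\; R^2 < 0
```

Read this way, R as the correlation between y and the predictions can still be computed when R² is negative, as long as neither series is constant; it just no longer equals the square root of R².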
I tried calculating the Pearson correlation between y and y_pred, but it is mathematically undefined (division by zero). I am attaching the values.
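A sketch of one situation that produces this division by zero, assuming the network has collapsed to a constant output. The attached values are not reproduced here, so this is an assumption; the data and the function name `pearson_r` are hypothetical.

```python
import math


def pearson_r(y, y_pred):
    """Pearson correlation; returns None when either series has zero variance."""
    n = len(y)
    mean_y = sum(y) / n
    mean_p = sum(y_pred) / n
    cov = sum((a - mean_y) * (b - mean_p) for a, b in zip(y, y_pred))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_p = sum((b - mean_p) ** 2 for b in y_pred)
    if var_y == 0 or var_p == 0:
        return None  # numerator and denominator are both 0: undefined
    return cov / math.sqrt(var_y * var_p)


# Hypothetical data, for illustration only.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
constant_pred = [2.5] * len(y)             # network outputs one value everywhere
reversed_pred = [5.0, 4.0, 3.0, 2.0, 1.0]  # bad model, but not constant

r_constant = pearson_r(y, constant_pred)  # None: zero variance in the predictions
r_reversed = pearson_r(y, reversed_pred)  # -1.0: defined, although R² is negative
```

The correlation is undefined only in the constant case. For the reversed predictions it is well defined even though 1 - SSE/SST is negative for them.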
Note: The question is about artificial neural networks.
All Answers (5)
I don't know, but my first thought here is "why even dwell on it here?"
In regression analysis we often use the coefficient of determination (R**2). It is a useful measure of fit, but not of prediction, when you need to assess how stable the regression model is. For that, a suggested change to R**2 is to replace SSE with PRESS (the prediction sum of squares, i.e. the sum of squared leave-one-out prediction errors), which gives R**2(pred) = 1 - (PRESS/SSTotal). The model can be so bad that PRESS > SSTotal, and then R**2(pred) < 0. I would say that in this case it is best either to state that R**2(pred) is undefined or to set it to zero. The prediction-oriented R**2 then looks really bad.
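A minimal sketch of R**2(pred) = 1 - (PRESS/SSTotal) for a straight-line least-squares fit, with PRESS computed by explicitly refitting the line with each observation left out. The data and the helper names (`fit_line`, `press`, `r2_pred`) are hypothetical and serve only as an illustration.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def press(x, y):
    """Sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(x)):
        # Refit without observation i, then predict it.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (intercept + slope * x[i])) ** 2
    return total


def r2_pred(x, y):
    """Prediction-oriented R**2: 1 - PRESS / SSTotal."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press(x, y) / ss_total


# Hypothetical data with no linear trend, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 1.0, 2.0, 1.0, 3.0]

press_value = press(x, y)      # about 16.58, while SSTotal is 4
r2_pred_value = r2_pred(x, y)  # negative, since PRESS > SSTotal
```

For these data the ordinary R**2 of the full fit is 0, while R**2(pred) is about -3.1, which is the PRESS > SSTotal situation described above.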
Raid, PRESS, invented in the 1970s by David Allen of the University of Kentucky, is not a good idea either. Look at criteria such as AIC and BIC, and at how prediction models are handled in general. Some references can be found in the attached. In any case, R**2 from a least-squares fit with an intercept cannot be less than 0, but it is well known that adjusted R**2 can. Best wishes, David Booth