Question
Asked 20th Mar, 2023

Is the multiple correlation coefficient (R) undefined in the case of negative determination coefficients (Artificial Neural Networks)?

I noticed that in some very poor neural network models, the value of R² (coefficient of determination) can be negative. That is, the model is so bad that simply predicting the mean of the data gives a smaller error than the model does.
In linear regression models, the multiple correlation coefficient (R) can be calculated as the square root of R². However, this is not possible for a neural network model that has a negative R². In that case, is R mathematically undefined?
I tried calculating the Pearson correlation between y and y_pred, but it is mathematically undefined (division by zero). I am attaching the values.
Note: The question is about artificial neural networks.
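
A minimal sketch of the two quantities mentioned in the question, assuming R² is computed as 1 - SSE/SST and using made-up values rather than the attached data. The function names `r_squared` and `pearson_r` are hypothetical helpers. One way the division by zero can arise is a network that outputs the same value for every observation, so that y_pred has zero variance; whether that is what happened with the attached values is an assumption.

```python
import numpy as np

def r_squared(y, y_pred):
    """Coefficient of determination as 1 - SSE/SST (negative when SSE > SST)."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y - y_pred) ** 2)      # error of the model
    sst = np.sum((y - y.mean()) ** 2)    # error of predicting the mean
    return float(1.0 - sse / sst)

def pearson_r(y, y_pred):
    """Pearson correlation; NaN when either series has zero variance."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    dy = y - y.mean()
    dp = y_pred - y_pred.mean()
    denom = np.sqrt(np.sum(dy ** 2) * np.sum(dp ** 2))
    if denom == 0.0:
        return float("nan")              # 0/0: correlation is undefined
    return float(np.sum(dy * dp) / denom)

# Hypothetical observations (not the values attached to the question).
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Case 1: a model that predicts the same value everywhere.
y_const = np.full_like(y, 4.0)
r2_const = r_squared(y, y_const)   # SSE = 15, SST = 10, so R^2 = -0.5
r_const = pearson_r(y, y_const)    # NaN, predictions have zero variance

# Case 2: a model that tracks the trend but is offset by a constant.
y_shift = y + 5.0
r2_shift = r_squared(y, y_shift)   # SSE = 125, SST = 10, so R^2 = -11.5
r_shift = pearson_r(y, y_shift)    # 1.0, perfectly correlated

print(r2_const, r_const, r2_shift, r_shift)
```

In the second case the Pearson correlation is perfectly well defined even though 1 - SSE/SST is strongly negative, which suggests the two quantities only coincide (R = sqrt(R²)) in settings such as least-squares regression with an intercept.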

All Answers (5)

David Eugene Booth
Kent State University
R**2 cannot be negative. Adjusted R**2 sometimes can be, because of the degrees-of-freedom adjustment in the denominator. See most good textbooks on linear statistical models. Best wishes, David Booth
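
A small sketch of the adjustment mentioned above, assuming the common textbook form adjusted R**2 = 1 - (1 - R**2)(n - 1)/(n - p - 1), with n observations and p predictors not counting the intercept. The function name and the numbers are hypothetical. For a least-squares fit with an intercept, R**2 on the fitted data lies between 0 and 1, yet the adjusted value can still drop below zero when R**2 is small relative to p/(n - 1).

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for the adjustment to be defined")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical weak fit: R^2 = 0.05 with 10 observations and 3 predictors.
adj = adjusted_r2(0.05, n=10, p=3)   # 1 - 0.95 * 9 / 6 = -0.425
print(adj)
```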
Raid Amin
University of West Florida
I don't know, but my first thought here is "why even dwell on it here?"
In regression analysis we often use the coefficient of determination (R**2). It is a useful measure of fit, but not of prediction, when you need to judge how stable the regression model is. A suggested change to R**2 for that purpose is to replace SSE with PRESS, which gives R**2(pred) = 1 - (PRESS/SSTotal). It can happen that the model is so bad that PRESS > SSTotal, and then R**2(pred) < 0. I would say that in this case it is best either to state that R**2(pred) is undefined or to set it to zero. Either way, the prediction-oriented R**2 looks really bad.
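
The prediction-oriented R**2 described above can be sketched as follows for an ordinary least-squares fit. This assumes the usual leave-one-out shortcut, in which each deleted residual equals the ordinary residual divided by (1 - h_ii), where h_ii is the leverage of observation i. The function name `press_r2` and the random data are hypothetical, and the shortcut applies to linear least squares, not to a neural network, where each leave-one-out fit would have to be retrained.

```python
import numpy as np

def press_r2(X, y):
    """Return (PRESS, R^2_pred) for OLS with an intercept.

    Assumes the design matrix has full column rank.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares fit
    resid = y - A @ beta
    Q, _ = np.linalg.qr(A)                         # reduced QR of the design
    leverage = np.sum(Q ** 2, axis=1)              # diagonal of the hat matrix
    press = float(np.sum((resid / (1.0 - leverage)) ** 2))
    sst = float(np.sum((y - y.mean()) ** 2))
    return press, 1.0 - press / sst

# Hypothetical example: predictors that are pure noise relative to y.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
y = rng.normal(size=12)

press, r2_pred = press_r2(X, y)
print(press, r2_pred)
```

Because every leverage lies between 0 and 1, PRESS is never smaller than SSE, so R**2(pred) is never larger than the ordinary R**2. With uninformative predictors like these it can fall below zero.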
David Eugene Booth
Kent State University
Raid, PRESS, introduced in the 1970s by David Allen of the University of Kentucky, is not a good idea either. Look at criteria such as AIC and BIC, and at how prediction models are handled in general. Some references can be found in the attachment. In any case, R**2 cannot be less than 0, but it is well known that adjusted R**2 can. Best wishes, David Booth
1 Recommendation
David Eugene Booth
Kent State University
Raid, apologies, here's the attachment. David Booth
1 Recommendation
