Kent State University

Question

Asked 20th Mar, 2023

# Is the multiple correlation coefficient (R) undefined in the case of negative determination coefficients (Artificial Neural Networks)?

I noticed that in some very poor neural network models, the value of R² (coefficient of determination) can be negative. That is, the model is so bad that simply predicting the mean of the data would fit better than the model.
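A minimal sketch of how this happens, assuming R² is computed as 1 - SSE/SST (the function name and the toy data are made up for illustration and are not the values from the question):

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination computed as 1 - SSE / SST."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    sst = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares around the mean
    return 1.0 - sse / sst

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
mean_model = np.full_like(y, y.mean())  # always predicts the mean of y
bad_model = y[::-1].copy()              # predicts the values in reverse order

print(r_squared(y, mean_model))  # 0.0
print(r_squared(y, bad_model))   # -3.0
```

Predicting the mean gives exactly zero, and any prediction with a larger squared error than the mean gives a negative value.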

In linear regression models, the multiple correlation coefficient (R) can be calculated as the square root of R². However, this is not possible for a neural network model with a negative R². In that case, is R mathematically undefined?
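The following sketch illustrates the relationship on simulated data (everything here is assumed for illustration). For an ordinary least-squares fit with an intercept, the square root of R² coincides with the Pearson correlation between y and the fitted values. For predictions that do not come from such a fit, the two quantities can diverge: the correlation may still be computable while 1 - SSE/SST is negative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(size=50)

def r_squared(y_true, y_pred):
    """Coefficient of determination computed as 1 - SSE / SST."""
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - sse / sst

# Ordinary least squares with an intercept, evaluated on the fitting data
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

r2_ols = r_squared(y, y_hat)
r_ols = np.corrcoef(y, y_hat)[0, 1]
print(np.isclose(np.sqrt(r2_ols), r_ols))  # True

# Hypothetical "bad model": the same predictions shifted by a constant bias
y_shift = y_hat + 10.0
r2_shift = r_squared(y, y_shift)
r_shift = np.corrcoef(y, y_shift)[0, 1]
print(r2_shift < 0)                 # True
print(np.isclose(r_shift, r_ols))   # True
```

A constant shift leaves the correlation untouched but inflates the squared error, so the identity R = sqrt(R²) should not be expected to carry over to arbitrary predictors.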

I tried calculating the Pearson correlation between y and y_pred, but it is mathematically undefined (division by zero). I am attaching the values.

Note: The question is about artificial neural networks.
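One way a division by zero can arise in the Pearson formula is when the predictions have zero variance, for example a network that outputs the same value for every input. Whether that is the case here is an assumption, since the attached values are not reproduced in this thread. A small sketch with made-up numbers and a hypothetical helper:

```python
import numpy as np

def pearson_r(y_true, y_pred):
    """Pearson correlation, or None when either series has zero variance."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    dy = y_true - y_true.mean()
    dp = y_pred - y_pred.mean()
    denom = np.sqrt(np.sum(dy ** 2) * np.sum(dp ** 2))
    if denom == 0.0:
        return None  # 0/0: the correlation is undefined
    return float(np.sum(dy * dp) / denom)

y = np.array([1.0, 2.0, 3.0, 4.0])
constant_pred = np.full(4, 2.5)                # same output for every observation
varying_pred = np.array([4.0, 3.0, 2.0, 1.0])  # poor, but not constant

print(pearson_r(y, constant_pred))  # None
print(pearson_r(y, varying_pred))   # -1.0
```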

## All Answers (5)

Kent State University

R**2 cannot be negative. Adjusted R**2 sometimes can be, because of the degrees-of-freedom adjustment in the denominator. See most good textbooks on linear statistical models. Best wishes, David Booth
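For context, a short sketch of where this bound comes from. It assumes a least-squares fit that includes an intercept, with R**2 computed on the same data used for fitting; n (number of observations) and p (number of predictors) are symbols introduced here for illustration:

$$
SST = SSR + SSE \quad\Rightarrow\quad R^2 = 1 - \frac{SSE}{SST} = \frac{SSR}{SST} \in [0, 1]
$$

$$
R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} < 0 \quad\Longleftrightarrow\quad R^2 < \frac{p}{n - 1}
$$

The decomposition in the first line is a property of least squares with an intercept. It is not guaranteed for other fitting procedures or for held-out data, where 1 - SSE/SST can fall below zero.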

University of West Florida

I don't know, but my first thought here is "why even dwell on it here?"

In regression analysis we often use the coefficient of determination (R**2). It is a useful measure of fit, but not of prediction, when you need to assess how stable the regression model is. A suggested modification of R**2 replaces SSE with PRESS, giving R**2(pred) = 1 - (PRESS/SSTotal). It can happen that the model is so bad that PRESS > SSTotal, and then R**2(pred) < 0. In that case I would say it is best either to state that R**2(pred) is undefined or to set it to zero. The prediction-oriented R**2 then looks really bad.
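A minimal sketch of the PRESS-based statistic for an ordinary least-squares model with an intercept, using simulated data (the function name and the data are assumptions for illustration). It relies on the standard identity that the leave-one-out residual equals the ordinary residual divided by 1 - h_ii, where h_ii is the leverage of observation i:

```python
import numpy as np

def r2_and_r2_pred(X, y):
    """Return (R2, R2_pred) for OLS with an intercept; PRESS is computed from leverages."""
    Xd = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept column
    H = Xd @ np.linalg.pinv(Xd)                 # hat matrix
    resid = y - H @ y                           # ordinary residuals
    loo_resid = resid / (1.0 - np.diag(H))      # leave-one-out (deleted) residuals
    sst = np.sum((y - y.mean()) ** 2)
    sse = np.sum(resid ** 2)
    press = np.sum(loo_resid ** 2)
    return 1.0 - sse / sst, 1.0 - press / sst

rng = np.random.default_rng(1)
n = 12
X = rng.normal(size=(n, 5))  # predictors unrelated to y
y = rng.normal(size=n)       # pure noise response

r2, r2p = r2_and_r2_pred(X, y)
print(f"R2 = {r2:.3f}, R2(pred) = {r2p:.3f}")
```

Because every leverage lies between 0 and 1, PRESS is never smaller than SSE, so R**2(pred) cannot exceed R**2. With few observations and several uninformative predictors it can fall below zero.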

Kent State University

Raid, PRESS, invented in the 1970s by David Allen of the University of Kentucky, is not a good idea either. See things like AIC and BIC and how to handle prediction models in general. Some references can be found in the attached. In any case, R**2 cannot be less than 0, but it is well known that adjusted R**2 can. Best wishes, David Booth

## Similar questions and discussions

How to calculate power for a 2-level model with two L1-predictors?

- Janna Hämpke

Hello,

I am new to power analysis in multi-level models. I am looking for a way to do a power analysis for the following 2-level model:

$$Y = \gamma_{00} + \gamma_{10} D_1 + \gamma_{20} D_2 + \gamma_{01} Z + \gamma_{11} D_1 Z + \gamma_{21} D_2 Z$$

In this model, I investigate the effect of time ($D_1$ and $D_2$) and an experimental condition, as well as their interaction effect, on my outcome variable. Time is measured on three occasions and enters the model as dummy-coded contrasts ($D_1$ and $D_2$). The experimental condition is also dummy-coded.

I tried to follow the instructions for a power analysis in 2-level models by Arend & Schäfer (2019) (see R code attached). However, I do not know how to create the conditional variances for my model, and I think there must be a mistake in the model.

I would be very happy to get your advice. Thanks a lot!

**R code:**

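#Loading the packages used below

library(car) #provides recode()

library(simr) #provides makeLmer()

#Specifying standardized input parameters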

alpha.S <- .05 #Alpha level

Size.clus <- 3 #L1 sample size

N.clus <- 200 #L2 sample size

L1_DE_standardized <- .30 #L1 direct effects

L2_DE_standardized <- .50 #L2 direct effect

CLI_E_standardized <- .50 #CLI effects

ICC <- .50 #ICC

rand.sl <- .09 #Standardized random slope

#Creating variables for power simulation in z-standardized form

#Creates a dataset with two L1-predictors (time dummies D1 and D2) and one L2-predictor (EG); all predictors are dichotomous

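EG <- rep(c(0, 1), each = Size.clus*N.clus/2) #L2 experimental condition: 0 for the first half of the clusters, 1 for the second half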

x<- scale(rep(1:Size.clus))

g <- as.factor(1:N.clus)

X <- cbind(expand.grid("x"=x, "g"=g))

X <- cbind(X, EG)

X$D1 <- recode(var = X$x, recodes = "-1 = 0; 0 = 1; 1 = 0") #Dummy for the second measurement (x = 0)

X$D2 <- recode(var = X$x, recodes = "-1 = 0; 0 = 0; 1 = 1") #Dummy for the third measurement (x = 1)

#Adapting the standardized parameters

varL1 <- 1 #L1 variance component

varL2 <- ICC/(1-ICC) #L2 variance component

varRS1 <- rand.sl*varL1 #Random slope variance tau 11

varRS2 <- rand.sl*varL1 #Random slope variance tau 22

L1_DE <- L1_DE_standardized*sqrt(varL1) #L1 direct effect

L2_DE <- L2_DE_standardized*sqrt(varL2) #L2 direct effect

CLI_E <- CLI_E_standardized*sqrt(varRS1) #CLI effect (varRS was undefined; varRS1 and varRS2 are equal here)

#Creating conditional variances

**#I don’t know how to calculate this conditional variance with two L1 predictors**

**s <- sqrt((varL1)*(1-(L1_DE_standardized^2))) #L1 variance**

V1 <- varL2*(1-(L2_DE_standardized^2)) #L2 variance

rand_sl.con <- varRS1*(1-(CLI_E_standardized^2)) #Random slope variance

#Creating a population model for simulation

b <- c(0, L1_DE, L1_DE, L2_DE, CLI_E,CLI_E) #vector of fixed effects (fixed intercept, L1.1. direct, L1.2. direct, L2 direct, CLI.1 effect, CLI.2 effect)

V2 <- matrix(c(V1,0,0, 0,rand_sl.con,0, 0,0,rand_sl.con), 3) #Random effects covariance matrix with covariances set to 0

**#There must be a mistake in one of the steps above, because the model does not work**

**model <- makeLmer(y ~ D1 + D2 + EG + D1:EG + D2:EG +(D1+D2 | g), fixef = b, VarCorr = V2, sigma = s, data = X) #Model creation**

print(model)
