Hello everyone,

I have a question about selecting the best-fit model for water quality retrieval using statistical analysis.

I applied three techniques to estimate a water quality parameter and compared their performance: multiple linear regression (MLR), artificial neural networks (ANN), and gene expression programming (GEP).

The coefficient of determination (R²) between observed and estimated values for the calibration data set is 0.95 (GEP), 0.96 (ANN), and 0.81 (MLR); for the validation data set it is 0.91 (GEP), 0.87 (ANN), and 0.67 (MLR). Based on the validation results, GEP seems to estimate the values more accurately than the other two approaches.
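For reference, a minimal Python sketch of how R² between observed and estimated values can be computed. It assumes the definition R² = 1 - SS_res / SS_tot (the squared Pearson correlation is another common convention and can give a different number), and the function name and the sample values are hypothetical placeholders, not my actual data.

```python
def r_squared(observed, estimated):
    """Coefficient of determination, assuming R2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(estimated) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: observed minus model estimate
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    # Total sum of squares: observed minus mean of observed
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical placeholder values, for illustration only
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(observed, estimated), 3))
```

The same function would be applied separately to the calibration and validation sets of each model.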

To further assess the goodness of fit, I applied a one-way ANOVA with Tukey's post hoc test to the results of the three approaches and obtained the following:

Group 1   Group 2   Critical   p          Result
ANN       MLR       0.01667    0.920624   not significant
GEP       MLR       0.025      0.935268   not significant
GEP       ANN       0.05       0.986082   not significant
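For reference, the critical values in the table look like Holm-type step-down thresholds at alpha = 0.05 (0.05/3, 0.05/2, 0.05/1). That is an assumption about how they were produced, not something I have confirmed. A minimal Python sketch of that comparison, using the p-values from the table (the function name is hypothetical):

```python
ALPHA = 0.05  # assumed family-wise significance level

# (group 1, group 2, p-value) as reported in the table above
comparisons = [
    ("ANN", "MLR", 0.920624),
    ("GEP", "MLR", 0.935268),
    ("GEP", "ANN", 0.986082),
]


def holm_step_down(comparisons, alpha=ALPHA):
    """Compare sorted p-values with alpha / (m - rank), Holm style."""
    ordered = sorted(comparisons, key=lambda c: c[2])  # smallest p first
    m = len(ordered)
    results = []
    still_rejecting = True
    for rank, (g1, g2, p) in enumerate(ordered):
        critical = alpha / (m - rank)
        # Once one comparison fails, all later ones are not significant
        still_rejecting = still_rejecting and p < critical
        results.append((g1, g2, round(critical, 5), p, still_rejecting))
    return results


for row in holm_step_down(comparisons):
    print(row)
```

Under this assumption, every p-value is far above its threshold, so none of the pairwise differences is flagged as significant.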

Can I conclude from these post hoc results that GEP is better than the other two approaches, or do I need to perform some other analysis? Kindly suggest.

Thanks