# Significance of the validation

A reviewer is insisting on the presentation of the significance of the regression for an external validation of a PLS model.

As far as I can remember, no one has ever presented it, or even been asked to do it. Anybody?

Kind regards

José

Significance of a validation harks back to fundamental concepts of any statistical procedure, and poses the question "What is the probability that this result could have been obtained if the data were completely random?" In the case of validation, two sets of data are being compared (the values calculated from the calibration model and the set of values from the reference laboratory). If the probability is low enough, then we conclude that the data is NOT random, but that there is a true relationship between the two sets of data. One way this can be determined is to perform a simple (one-independent-variable) regression between the two sets of data. Statisticians have defined several statistics for the case of simple regression, including the t and F statistics, from which the significance can be determined. A good source for this is Draper and Smith's "Applied Regression Analysis" (any edition). Note that the significance is NOT the same as the accuracy or the robustness of the results, although they are related.

Over the years, chemometricians have lost sight of some of the fundamentals of the statistical concepts upon which their modern algorithms depend, so you would be hard put to find chemometric software that includes those basic statistics, further distancing them from the routine users of the algorithms.

Thank you very much, Howard. The referee asked for the p-value.

Yes, the p (or probability) is one of the values that statisticians consider key but chemometricians tend to ignore. The validation procedure based on SEE and SEP, that Ann described, is very good for estimating the expected accuracy of the model. The problem with it is that you'll get a value for SEP even if the data forms a (hyper)sphere in multidimensional space, so that the model isn't doing any actual predicting.

Under those conditions you would do as well by simply predicting the mean value of the data, and the calculated regression coefficient for any model would be a random number (which could well be between 0.9 and 1.1). While no single number can tell you everything you want to know about a calibration model, the F statistic for a regression is the best single statistic you can use. The early chemometricians who ignored or neglected the prior knowledge of the statisticians made a mistake in doing so. Although dimensionless, the F statistic is sensitive not only to the accuracy, but also to the number of samples, and also to whether the model is doing any actual predicting. If the data forms a (hyper)sphere in space, the F value will tend to 0 regardless of the SEP.

Thank you very much.
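The significance test described above can be sketched as follows: regress the reference values against the values predicted by the calibration model, then compute the F statistic and its p-value. The data and the choice of `scipy.stats.linregress` are illustrative assumptions, not taken from the thread.

```python
# Sketch: significance of a validation regression (predicted vs. reference).
# All numbers below are made-up illustrative data.
import numpy as np
from scipy import stats

reference = np.array([10.2, 11.5, 9.8, 12.1, 10.9, 11.8, 9.5, 12.4])
predicted = np.array([10.0, 11.7, 9.9, 12.0, 11.1, 11.5, 9.7, 12.2])

res = stats.linregress(predicted, reference)

n = len(reference)
# For simple regression, F = t^2 with (1, n - 2) degrees of freedom,
# so the F-test p-value equals the two-sided t-test p-value on the slope.
t = res.slope / res.stderr
F = t ** 2
p_value = stats.f.sf(F, 1, n - 2)

print(f"slope = {res.slope:.3f}, r^2 = {res.rvalue ** 2:.3f}")
print(f"F = {F:.1f}, p = {p_value:.2e}")
```

A low p-value means the agreement between predicted and reference values is very unlikely to arise from random data; it says nothing by itself about accuracy, which is why it complements, rather than replaces, SEP-type statistics.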
Hi all

I hope you are keeping well and staying at home. Happy to see that this NIRS learning resource is still alive! Thanks.

I like to recommend to my students, and to use in our papers, the validation protocol recommended by Windham et al. (1989) and Shenk et al. (2001), based on the following statistics: RMSEP (root mean square error of prediction), RMSEP(c) (standard error of prediction corrected for bias), bias, and R2v (coefficient of determination for validation). Generally, for calibration sets that encompass 100 or more samples and validation sets of nine or more samples, the following limits are established: RMSEP(c) control limit = 1.30 x RMSEE; bias control limit = ±0.60 x RMSEE; the slope of the regression line obtained between the reference data and the predicted NIRS data should vary between 0.9 and 1.1; and R2v should have a minimum value of 0.6.
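As a rough sketch, the statistics and control limits of this protocol might be computed as below. The function name, the made-up data, and the exact bias-corrected formula are my assumptions, not code from the cited references.

```python
# Sketch of the Windham et al. (1989) / Shenk et al. (2001) validation checks.
import numpy as np
from scipy import stats

def validation_stats(reference, predicted, rmsee):
    """Validation statistics and control limits (illustrative implementation)."""
    residuals = predicted - reference
    bias = residuals.mean()
    rmsep = np.sqrt(np.mean(residuals ** 2))
    # RMSEP(c): standard error of prediction corrected for bias
    rmsep_c = np.sqrt(np.sum((residuals - bias) ** 2) / (len(residuals) - 1))
    reg = stats.linregress(predicted, reference)
    r2v = reg.rvalue ** 2
    return {
        "RMSEP": rmsep,
        "RMSEP(c)": rmsep_c,
        "bias": bias,
        "slope": reg.slope,
        "R2v": r2v,
        "RMSEP(c) within limit": rmsep_c <= 1.30 * rmsee,
        "bias within limit": abs(bias) <= 0.60 * rmsee,
        "slope within limit": 0.9 <= reg.slope <= 1.1,
        "R2v within limit": r2v >= 0.6,
    }

# Nine illustrative validation samples (made up) and an assumed RMSEE of 0.12
ref = np.array([4.1, 5.0, 6.2, 5.5, 4.8, 6.0, 5.2, 4.5, 5.8])
pred = np.array([4.0, 5.1, 6.1, 5.6, 4.7, 5.9, 5.3, 4.6, 5.7])
report = validation_stats(ref, pred, rmsee=0.12)
```

Each boolean in the returned dictionary flags whether the corresponding statistic falls inside the control limit quoted above.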

Windham, W.R.; Mertens, D.R. and Barton, F.E. (1989). Protocol for NIRS calibration: sample selection and equation development and validation. In: G.C. Marten, J.S. Shenk and F.E. Barton (Eds.), Near Infrared Spectroscopy (NIRS): Analysis of Forage Quality. Agriculture Handbook No. 643 (pp. 96-103). USDA-ARS, US Government Printing Office, Washington D.C.

Shenk, J.S.; Workman, J.Jr. and Westerhaus M.O. (2001). Application of NIR Spectroscopy to Agricultural Products. In D. A. Burns and E.W. Ciurczak (Eds.), Handbook of Near Infrared Analysis (pp. 348-383). CRC Press, Florida, USA.

Hope that helps. Take care and stay at home! See you somewhere soon!

Ana

Dear Ana

Thank you very much for your input. Just to clarify the question a bit: all the statistics you mentioned were given in the manuscript. The issue is that it is uncommon to present the p-value for the validation regression. Should we present it or not? That is the question.

Kind regard

José

Jose, yes, in my opinion, you should. In addition to the other statistics, not as a replacement for them.

Thank you very much.

Hi Jose,

I agree with Ana, and I hate to argue with my friend Howard, but requesting or reporting a p-value seems pedantic to me. The stats that Ana describes are sufficient to describe the case when the NIR calibration is not performing well. Another statistic is the RPD, proposed by Phil Williams and Debbie Sobering, JNIRS 1, 25-32 (1993). The RPD is the ratio of the SD of the original validation data to the SEP. A high value shows good validation, and a value near 1.0 demonstrates Howard's case of no prediction. Phil recently gave more guidelines in "The RPD statistic: a tutorial note", NIR News, 2014, 25 (1), 22-23.
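A minimal sketch of the RPD, taking SEP as the bias-corrected standard error of prediction; the data and function name are illustrative assumptions:

```python
# Sketch: RPD = SD of the reference validation values / SEP.
import numpy as np

def rpd(reference, predicted):
    """Ratio of reference-set standard deviation to the SEP."""
    residuals = predicted - reference
    bias = residuals.mean()
    sep = np.sqrt(np.sum((residuals - bias) ** 2) / (len(residuals) - 1))
    return reference.std(ddof=1) / sep

ref = np.array([4.1, 5.0, 6.2, 5.5, 4.8, 6.0, 5.2, 4.5, 5.8])
good = np.array([4.0, 5.1, 6.1, 5.6, 4.7, 5.9, 5.3, 4.6, 5.7])
naive = np.full_like(ref, ref.mean())  # "model" that just predicts the mean

print(rpd(ref, good))   # well above 1: useful prediction
print(rpd(ref, naive))  # ~1: Howard's no-prediction case
```

Predicting the mean of the data makes SEP collapse onto the SD of the reference values, so the RPD falls to about 1, which is exactly the no-prediction situation Howard describes.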

My vote is against providing the p-statistic.

Best regards,

Dave

Posted on behalf of Dave Hopkins by Ian Michael

Dear Ian and Dave

Thank you very much for your comments. All the statistics mentioned by Ana were presented, and also the RPD value. I calculated the p-value and sent it in the reply for the sake of the reviewer, but asked the editor not to publish it. This was before getting all these contributions, just because I was pressed to submit the revised document. Thanks again to all for the valuable contributions.

Kind regards

José