All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
All reviewers' comments have been addressed. The paper can be accepted.
The reviewers are generally positive about the manuscript. Please make the suggested changes and provide a point-by-point response.
No comment.
No comment.
No comment.
The reviewer attaches a PDF document with some additional comments, suggestions and highlighted typos.
No new comments to add.
No new comments to add.
No new comments to add.
Overall, the article is good and has been improved since the last review round.
The new scenario of childhood obesity prediction is exactly what I was looking for, and it further illustrates the conclusions drawn by the authors regarding the superiority of R-squared.
Once again, it would be interesting to see a deeper and more insightful discussion on possible steps towards better measures, and I look forward to seeing it in future studies.
We have received a mixed review report. Two of the reviewers are positive and one is negative. Please revise the paper according to all three reviewers' comments and provide point-by-point responses.
No comment.
No comment.
No comment.
-The reviewer attaches a PDF document with some additional comments, suggestions and highlighted typos.
-The authors split their data into train and test sets. It would be interesting to see results with train, validation and test sets, for instance.
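The split the reviewer suggests could be sketched as follows. This is a minimal illustration only; the function name, fractions, and seed are hypothetical and not taken from the manuscript:

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle a dataset and partition it into train/validation/test subsets.

    The validation set is used for model selection (e.g. hyperparameter
    tuning), so the test set is only touched once, for the final estimate.
    """
    items = list(data)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # → 70 15 15
```

With a 70/15/15 split, metrics reported on the held-out test set are not biased by choices made while tuning on the validation set.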
The manuscript follows the defined standards. It includes a thorough review of the literature and contextualises the work within the state of the art. Methods and experiments are described thoroughly. Results are clearly and transparently presented. Data are either provided or freely available online (Lichtinghagen dataset). The language is very good, with only some typos/mistakes to correct (e.g. "deeper description of for R2" on line 122, "despite of the ranges" on line 191).
The choice of R2 and SMAPE is well justified based on informativeness among all the other surveyed metrics. The experiments cover most aspects where R2 and SMAPE differ, especially use-cases 1-5. The real medical scenario is interesting, but the manuscript would be much more complete if other real scenarios were included: other medical scenarios (e.g., COVID-19 case prediction) or other real contexts (e.g., financial predictions).
The findings are valid. The use-cases and the medical experiment cover most differences between R2 and SMAPE and truly show the benefit of using R2. However, beyond the issue of the negative space discussed by the authors in the conclusion, no other drawbacks or possible limitations of R2 have been discussed. Perhaps additional experiments in other real scenarios or the future comparison with Huber metric Hδ, LogCosh loss, and Quantile Qγ would help unveil such downsides of R2.
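For reference, the metrics under discussion follow directly from their standard definitions and can be computed in a few lines of plain Python. This is a sketch only; the sample values and the Huber delta of 1.0 are illustrative, not taken from the paper:

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 is perfect; can go below 0."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, bounded in [0, 200]."""
    return 100 / len(y_true) * sum(
        2 * abs(p - t) / (abs(t) + abs(p))
        for t, p in zip(y_true, y_pred))

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        e = abs(t - p)
        total += 0.5 * e * e if e <= delta else delta * (e - 0.5 * delta)
    return total / len(y_true)

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
print(round(r2(y_true, y_pred), 3))    # → 0.724
print(round(smape(y_true, y_pred), 1)) # → 19.4
```

Running the three metrics side by side on the same scenarios is one cheap way to probe whether the downsides of R2 mentioned above actually surface in practice.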
Nevertheless, the lack of deeper discussion on these makes the paper appear quite one-sided (in favour of R2), and also leads to the absence of proposals for improvement: what could be improved in R2, what desirable behaviours should a new R2 satisfy, and how could we change R2 to achieve them? Since the paper is merely a comparison between R2 and SMAPE that does not propose an improvement to either measure, it should at least include this deeper speculative discussion of what could be done to achieve an improved R2 metric.
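The negative-space issue the authors raise in the conclusion is easy to demonstrate: any model that predicts worse than the constant mean pushes R2 below zero, so the metric is unbounded from below. A minimal sketch with illustrative toy values:

```python
def r2(y_true, y_pred):
    """Coefficient of determination from its standard definition."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0]
# Predictions far worse than the mean of y_true (2.0) drive R2 negative.
bad_pred = [3.0, 3.0, 0.0]
print(r2(y_true, bad_pred))  # → -6.0
```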
Professional English: the writing of the article is mostly good professional English, with some minor errors.
Literature references: the article cites a good number of original sources of information.
Article structure: the article appears to follow the common structure used for medical publications.
Self-contained: The authors did not make a clear statement about the central hypotheses of this article.
Formal results: No clear criteria were given for how to judge the different metrics.
The core of the work is to compare different quality metrics; however, since the authors did not establish a concrete standard, it is very hard to say which one is better.
Originality: poor.
Research question: not well defined.
Rigor: anecdotal evaluation.
Methods: lack of key evaluation criteria.
This is an anecdotal evaluation of five different quality metrics for regression; however, due to the lack of a clear standard of comparison, it is hard to judge whether the conclusions are valid.
This article needs a clearly defined standard for comparing different quality metrics.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.