What does the reviewer mean by "statistical test"?

Author: linenus · Source: 小木虫

The third reviewer's comments are as follows:

1- English writing is good but please check and polish again.

2- My main criticism is the experimental results. The authors need to compare the results with new algorithms. The proposed algorithm compared with 7 algorithms and most of those are before 2010.

3- When the authors discuss the work on imbalanced datasets, the statistical test (Friedman test) should be considered. Please use the area under the receiver operating characteristic curve (AUC)+statistical test to validate the classification performance.

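How do I carry out the Friedman test? When several machine learning algorithms are compared, isn't the usual practice to split the data into training and test sets and then report classification performance on the test set? How should I handle and respond to this Friedman test comment? This is my first submission; does anyone here know how to reply?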
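For reference, the Friedman test does not replace the train/test evaluation. It is applied on top of it: each algorithm is scored (here by AUC) on every data set, the algorithms are ranked within each data set, and the test checks whether the average ranks differ more than chance would explain. The following is a minimal sketch of that procedure using `scipy.stats.friedmanchisquare`. The algorithm names and AUC values are made-up placeholders, not results from the paper, and the 6 x 3 layout is only an assumption for illustration.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Placeholder AUC values for illustration only (NOT real results).
# Rows are data sets, columns are algorithms.
algorithms = ["proposed", "baseline_a", "baseline_b"]  # hypothetical names
auc = np.array([
    [0.91, 0.88, 0.85],
    [0.84, 0.80, 0.81],
    [0.77, 0.75, 0.70],
    [0.95, 0.93, 0.90],
    [0.88, 0.86, 0.83],
    [0.69, 0.66, 0.68],
])


def friedman_on_scores(scores):
    """Run the Friedman test on a (data sets x algorithms) score matrix."""
    # Rank the algorithms within each data set; rank 1 is the highest score.
    ranks = np.vstack([rankdata(-row) for row in scores])
    avg_ranks = ranks.mean(axis=0)
    # friedmanchisquare expects one sequence of measurements per algorithm.
    stat, p_value = friedmanchisquare(*scores.T)
    return avg_ranks, stat, p_value


avg_ranks, stat, p_value = friedman_on_scores(auc)
for name, rank in zip(algorithms, avg_ranks):
    print(f"{name}: average rank {rank:.2f}")
print(f"Friedman chi-square = {stat:.3f}, p = {p_value:.4f}")
```

A small p-value indicates only that at least one algorithm differs from the others. Identifying which pairs differ is usually done with a post-hoc test such as the Nemenyi test, which this sketch does not cover.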

The first reviewer's comments are very detailed:
1. For each section that involves several subsections, it would be better to add a paragraph below the heading of the section to outline the main contents before starting the subsections, so that readers can understand better the structure of each section at an early point.
2. In Section 1, the authors mentioned in lines 5-7 of the left column in page 2 that there was seldom any research work on learning from imbalanced data through deep reinforcement learning. However, it would be better to justify why it is worth to explore the adoption of deep reinforcement learning for imbalanced data classification, i.e., to justify how deep reinforcement learning shows its potential to effectively deal with class imbalance, in order to better stress the motivation and relevance of this paper and to better connect to the next sentence "A deep Q-learning network (DQN) based model for imbalanced data classification is proposed in this paper."
3. Also, it is worth to clarify in Section 1 whether this paper focuses on binary classification, multi-class classification or both in terms of dealing with class imbalance.
4. At the end of Section 2.2, the authors wrote the sentence "In this paper, we propose a deep Q-network based model for imbalanced classification which is efficient in complex high-dimensional data such as image or text and demonstrates a better performance compared to other imbalanced classification methods". However, it is too early to write the sentence at this point. Instead, it is more necessary to describe what the authors aim to do and expect to achieve by proposing the deep Q-network based model.
5. In Section 4.1, the authors describe several methods used for comparison with the proposed method, but it would be better to add a paragraph at the end to justify briefly why these methods are chosen for comparison.
6. Eq. (15) is provided to show the formula of F-measure. However, the traditional F-measure is formulated as 2*Precision*Recall/(Precision+Recall), where Precision= TP/(TP+FP) and Recall= TP/(TP+FN). To the best of my knowledge, Eq. (14) represents the geometric mean of recall and specificity whereas Eq. (15) represents the geometric mean of recall and precision, i.e., they represent two different cases of G-mean, so it is necessary to make this more clear and correct the terms of the metrics if necessary.
7. In Section 4.2, for each data set, a reference can be given to make it easy for readers to retrieve the data sets.
8. In Section 4.3, the authors mention that the network architecture used for image classification consists of two convolution layers, two fully connected layers and a softmax output layer. However, when looking at the Table 3, it is clearly shown that each of the two convolution layers is followed by a max-pooling layer, which is a popular setup of the architecture of a convolutional neural network, but the text description in line 39 of the right column in page 7 needs to be modified to avoid confusion.
9. Also, the detailed parameters used for image classification tasks are given in Table 2 rather than Table 3, i.e., Table 3 is given to show the detailed parameters used for text classification tasks, so the table referencing errors need to be corrected.
10. Moreover, the authors mention that the network architecture used for text classification consists of an embedding layer, two fully connected layers and a softmax output layer. However, convolution and pooling layers (that are involved in convolutional neural networks) and long-short term memory (LSTM) layers (that are involved in recurrent neural networks) have been popularly used alongside an embedding layer for text classification tasks, so it is necessary to justify why the network architecture mentioned in Section 4.3 is used particularly for imbalanced data classification tasks.
11. This paper needs some language editing to improve the presentation of contents, e.g., in line 42 of the left column in page 11, the authors wrote the sentence "In other data sets, we get the same results", which can be rephrased to "On other data sets, the results show similar phenomenons" to make it sound more sensible. Also, there are some places where the definite article "the" is missing or the determiner "a/an" should be used. Therefore, the authors are suggested to check and edit again the full text of this paper carefully and thoroughly.
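
Point 6 above distinguishes three metrics that are easy to confuse. As a rough sketch, the snippet below computes the F-measure as the reviewer defines it, along with the two G-mean variants the reviewer describes. The paper's Eq. (14) and Eq. (15) are not reproduced in this post, so the function and key names are assumptions, and the confusion-matrix counts are invented for illustration.

```python
import math


def imbalance_metrics(tp, fp, tn, fn):
    """Compute F-measure and two G-mean variants from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    return {
        # Harmonic mean of precision and recall (the traditional F-measure).
        "f_measure": 2 * precision * recall / (precision + recall),
        # Geometric mean of recall and specificity.
        "g_mean_recall_specificity": math.sqrt(recall * specificity),
        # Geometric mean of recall and precision.
        "g_mean_recall_precision": math.sqrt(recall * precision),
    }


# Invented counts for an imbalanced problem (50 positives, 950 negatives).
scores = imbalance_metrics(tp=30, fp=10, tn=940, fn=20)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With unequal precision and recall, the three values differ, which is why the reviewer asks for the metric names to match the formulas.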
