[Bounty] Answer the question in this post and the author, 空空如也0, will award you 50 gold coins.

空空如也0

Field: Computer software

[Help] I was given one month to revise, but I finished in 10 days. Should I wait until the month is up before resubmitting?

Reviewer comments:
Reviewer #1: I understand that the AL-based approach needs both the "initial training set" and "the newly added samples" in order to select the most representative samples. However, there are two questions. First, what do you mean by representative samples? Does it mean a set of samples that covers both valid and invalid links? Second, it seems that in Algorithm 1, active learning relies on the termination condition to ensure the representativeness of the samples. If I am correct, the termination condition used in the paper (i.e., the number of labeled samples reaches a preset value) does not make any sense. A better termination condition could be that the labeled samples shall contain at least 1 valid link and 1 invalid link.
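
To make Reviewer #1's second point concrete, here is a minimal Python sketch of a pool-based active learning loop that supports both termination conditions. It is not the paper's Algorithm 1. The nearest-centroid classifier (a stand-in for the paper's Random Forest), the margin-based selection strategy, the 1-D toy features, and all names (`active_learning`, `oracle`, `budget`, `require_both_classes`) are assumptions for illustration. Checking the reviewer's condition in addition to the budget, rather than instead of it, is also just one possible reading.

```python
import random


def train_centroids(labeled):
    """Toy nearest-centroid model: the mean feature value of each class."""
    sums = {}
    for x, y in labeled:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}


def uncertainty(centroids, x):
    """Higher value = more uncertain (smaller margin between class distances)."""
    if len(centroids) < 2:
        return 0.0  # only one class seen so far: all samples tie
    dists = sorted(abs(x - c) for c in centroids.values())
    return -(dists[1] - dists[0])


def active_learning(pool, oracle, budget, seed=0, require_both_classes=False):
    """Label samples until the budget is reached; optionally also require
    at least one valid and one invalid link in the labeled set."""
    rng = random.Random(seed)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    # Randomly label a small number of samples to initialize the training set.
    labeled = [(x, oracle(x)) for x in unlabeled[:2]]
    unlabeled = unlabeled[2:]
    while unlabeled:
        labels_seen = {y for _, y in labeled}
        budget_reached = len(labeled) >= budget
        both_classes = len(labels_seen) == 2
        if budget_reached and (both_classes or not require_both_classes):
            break
        # Retrain, then ask the expert (oracle) about the most uncertain sample.
        centroids = train_centroids(labeled)
        pick = max(unlabeled, key=lambda x: uncertainty(centroids, x))
        unlabeled.remove(pick)
        labeled.append((pick, oracle(pick)))
    return labeled


# Hypothetical pool in which valid links (True) are rare.
pool = [float(i) for i in range(20)]
oracle = lambda x: x >= 15.0

by_budget = active_learning(pool, oracle, budget=3)
by_classes = active_learning(pool, oracle, budget=3, require_both_classes=True)
```

When valid links are rare, the budget-only rule can stop with a labeled set that contains a single class, which a classifier cannot learn from. With `require_both_classes=True`, labeling continues until both classes are present or the pool is exhausted.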

Reviewer #2: Tracking the relations between artifacts in a software project is important. Generally, constructing traceability links is a human-intensive task.
Traditional information retrieval techniques have been employed to automatically analyze and recover traceability links. Machine learning approaches have also been adopted to train effective predictive models for traceability link recovery, but they require humans to label traceability links.

This paper presents a TLR approach based on active learning. Evaluation experiments were conducted on seven commonly used traceability datasets. The approach was compared with an IR-based approach and a current machine learning approach. The experiments show that the AL-based approach outperforms the other two approaches in terms of F-score.

Concerns

1. Page 1, "(hereafter called AL-based approach)" is repeated in the abstract and introduction.
2. Page 1, Section 1, left column, last line: does "traceability" mean "traceability relationships" or "traceability links"?
3. Page 2, left column, "TSL-based approach is that how to select traceability links for labeling to generate traceability information." => that
4. Page 3, Section 3, Step 1: (1) "randomly selecting a small number of samples for labeling to initialize Dt" => "randomly labeling a small number of samples to initialize Dt"
5. Page 3, Section 3, Step 1: (3) "selecting an unlabeled sample from the unlabeled sample set based on sample selection strategy and requesting experts to label the sample" => The authors need to define D and Dl here. Given the context, Dt and Dl seem equivalent, so why use different symbols?
6. Page 3, Section 3, Step 4: this paper chose Random Forest as the classification algorithm. However, the authors only claimed that "The reason for choosing Random Forest is because it has been shown to be accurate and robust". It would be better to explain why Random Forest is more suitable for the task.
7. Page 4, Algorithm 1 needs to be reconstructed. It would be much better to define the input and output, as well as all the variables used in the algorithm.
8. Section 4: it would be much better to move the experimental metrics to the beginning of Section 4. The authors use the F-score before defining it.
9. The format of the references should be standardized, especially the names and abbreviations of journals and conferences.
