
Views: 401  |  Replies: 1

yuxintian

New Bug (Renowned Writer)

[Help] A paragraph to be translated into English

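Currently, most supervised annotation methods perform well when a large-scale corpus is available, but in practice annotated corpus resources are hard to obtain and seldom general-purpose. This paper proposes a prototype pattern expansion algorithm based on method A:

First, a small initial set of training data is used to build an integrated tagger of reasonable accuracy.
Second, algorithm A automatically expands the training data: candidate instances are predicted from the unlabeled data, and those whose score exceeds a threshold are added to the training set.
Finally, constraints present in the training data are used to edit out noise, and the classifier is retrained iteratively on the expanded training data until it stabilizes and the iteration terminates.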

起沃尔特与

Wood Bug (Somewhat Well-Known)

[Answer] Reply to help request

yuxintian: Gold coins +50, Translation EPI +1 2013-03-21 14:37:36
At present, most supervised annotation methods achieve good results on large-scale corpora, but in real-world applications annotated corpora are difficult to obtain and rarely general-purpose. In this paper, we propose a prototype pattern expansion algorithm based on the A method:
First, a small initial training set is used to build an integrated tagger with a certain level of accuracy.
Second, the A algorithm is used to expand the training data automatically: candidate instances are predicted from the unlabeled data, and those whose scores exceed a given threshold are added to the training set.
Finally, noise is edited out using the constraints present in the training data, and the classifier is retrained on the expanded training data, iterating until it stabilizes and the iteration terminates.
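
For readers who want to see what the three steps describe, here is a minimal Python sketch of threshold-based self-training. It is only an illustration under assumptions: the post never says what the "A" method is, so a simple nearest-centroid classifier stands in for it, and the names `centroids`, `predict`, `self_train`, the `passes_constraints` hook, and the 0.8 threshold are all hypothetical.

```python
import math
from collections import defaultdict


def centroids(labeled):
    """Mean feature vector per label; a stand-in 'prototype' for each class."""
    sums, counts = {}, defaultdict(int)
    for x, y in labeled:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(protos, x):
    """Return (label, confidence). Confidence is a softmax over negative distances."""
    weights = {y: math.exp(-math.dist(p, x)) for y, p in protos.items()}
    total = sum(weights.values())
    label = max(weights, key=weights.get)
    return label, weights[label] / total


def self_train(labeled, unlabeled, threshold=0.8, max_iter=10,
               passes_constraints=lambda x, y: True):
    """Grow the training set from unlabeled data until nothing new is accepted.

    passes_constraints is a hypothetical hook for the noise-editing step:
    it may reject a (features, predicted_label) pair that violates a
    constraint known from the training data. By default it accepts everything.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_iter):
        protos = centroids(labeled)          # step 1: (re)train the classifier
        accepted, remaining = [], []
        for x in pool:                       # step 2: predict candidates
            label, conf = predict(protos, x)
            if conf > threshold and passes_constraints(x, label):
                accepted.append((x, label))  # confident and constraint-clean
            else:
                remaining.append(x)
        if not accepted:                     # step 3: stable, so stop iterating
            break
        labeled.extend(accepted)
        pool = remaining
    return centroids(labeled), labeled, pool


# Two seed examples and three unlabeled points (made-up data).
seeds = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
unlabeled = [(1.0, 0.0), (9.0, 10.0), (5.0, 5.0)]
protos, grown, leftover = self_train(seeds, unlabeled)
print(len(grown), leftover)
```

In this toy run the two points close to a seed are added to the training set, while the point equidistant from both classes never clears the threshold and stays unlabeled. A real tagger would replace `centroids` and `predict` with whatever model the A method actually uses.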
Foresight is better than experience.
Floor 2  2013-03-21 08:52:35