| Views: 734 | Replies: 1 |
[Help]
Abstract translation (communications, computer science, machine learning)
The main access control mechanisms at present are DAC (Discretionary Access Control), MAC (Mandatory Access Control), and RBAC (Role-Based Access Control). This paper proposes a new approach: using machine learning algorithms to build a model that configures access permissions automatically. In recent years, more and more researchers in the machine learning field have focused on processing the raw data set, because if feature engineering can extract more of the data and features hidden in the raw data set, the same machine learning algorithms will achieve better results. Starting from the original data set, this paper generates many new combinations of data sets and feature sets and introduces several machine learning algorithms: logistic regression, gradient boosting decision tree, and random forest. Applying these three algorithms to the data-set/feature-set combinations produces many classifier models. Finally, on the basis of these typical classifiers, several commonly used ensemble learning algorithms are studied, and two of them are used to combine the classifiers.

Specifically, the main contributions of this paper are as follows:

(1) Four new data sets and five new feature sets are generated from the original data set. Several mathematical transformations are introduced and selectively applied to the data sets and feature sets. In particular, during the construction of the greedy data set, a greedy forward feature selection algorithm is used to choose the optimal subset from the large collection of features.

(2) Logistic regression, gradient boosting decision tree, and random forest are introduced and trained on different training sets, and 14 typical classifier models are selected (five logistic regression, four gradient boosting decision tree, and five random forest models). The AUC (Area Under Curve) scores of the logistic regression models range from 0.9109 to 0.9196, those of the gradient boosting decision tree models from 0.8756 to 0.9079, and those of the random forest models from 0.8782 to 0.9047. The three algorithms are also trained on three data sets separately, and the performance of each model on the three data sets is compared, demonstrating that feature engineering is essential for a single classification model. Logistic regression performs well on training sets containing the greedy data set, while gradient boosting decision tree and random forest perform well on training sets containing the tuples data set; overall, logistic regression performs better on certain training sets.

(3) On the basis of the above classifiers, this paper introduces the voting and stacked generalization ensemble learning algorithms and uses them to combine the 14 typical classifier models. The voting ensemble reaches an AUC of 0.9244, 0.0048 higher than the best of the 14 individual classifiers, and the stacked generalization ensemble reaches an AUC of 0.9247, an improvement of 0.0051. The experiments show that ensemble learning improves the classification ability of the final model.
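The greedy forward feature selection the abstract describes in contribution (1) can be sketched as follows. This is a minimal illustration, assuming an sklearn-style classifier and AUC scoring on a held-out validation split; the synthetic data and the function name `forward_select` are illustrative, not from the paper.

```python
# Greedy forward feature selection: repeatedly add the single feature
# that most improves validation AUC, stopping when no feature helps.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=10, n_informative=4,
                           random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

def forward_select(X_tr, y_tr, X_va, y_va):
    """Return the greedily chosen feature subset and its validation AUC."""
    remaining = list(range(X_tr.shape[1]))
    selected, best_auc = [], 0.0
    improved = True
    while improved and remaining:
        improved, best_feat = False, None
        for f in remaining:
            cols = selected + [f]
            model = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
            auc = roc_auc_score(y_va, model.predict_proba(X_va[:, cols])[:, 1])
            if auc > best_auc:
                best_auc, best_feat, improved = auc, f, True
        if improved:
            selected.append(best_feat)
            remaining.remove(best_feat)
    return selected, best_auc

subset, auc = forward_select(X_tr, y_tr, X_va, y_va)
print(subset, round(auc, 3))
```

The stopping rule here (quit as soon as no candidate raises the AUC) is one common variant; the thesis may instead fix the subset size in advance.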
灿灿妹
[Answer] Helpful reply
lionel0822 (RXMCDM代发): coins +40, thanks for the help! 2015-10-04 19:23:12
At present, the main access control mechanisms are DAC (Discretionary Access Control), MAC (Mandatory Access Control), and RBAC (Role-Based Access Control). This paper proposes a new method that uses machine learning algorithms to build a model for the automatic configuration of access permissions. In recent years, more and more researchers in the machine learning field have focused on processing the original data set, because if feature engineering can extract more of the data and features hidden in the original data set, the same machine learning algorithms will achieve better results. In this paper, many new combinations of data sets and feature sets were generated from the original data set, and several machine learning algorithms were introduced: logistic regression, gradient boosting decision tree, and random forest. Many classifier models were produced by applying these three algorithms to the data-set/feature-set combinations. Finally, on the basis of these typical classifiers, several commonly used ensemble learning algorithms were studied, and two of them were used to combine the classifiers.

Specifically, the main contributions of this paper are as follows:

(1) Four new data sets and five new feature sets were generated from the original data set. Several mathematical transformations were introduced and selectively applied to the data sets and feature sets. In particular, during the construction of the greedy data set, a greedy forward feature selection algorithm was used to choose the optimal subset from the large collection of features.

(2) Logistic regression, gradient boosting decision tree, and random forest were introduced and trained on different training sets, and 14 typical classifier models were selected (five logistic regression, four gradient boosting decision tree, and five random forest models). The AUC (Area Under Curve) scores of the logistic regression models ranged from 0.9109 to 0.9196, those of the gradient boosting decision tree models from 0.8756 to 0.9079, and those of the random forest models from 0.8782 to 0.9047. The three algorithms were also trained on three data sets separately, and the performance of each model on the three data sets was compared, demonstrating that feature engineering is essential for a single classification model. Logistic regression performed well on training sets containing the greedy data set, while gradient boosting decision tree and random forest performed well on training sets containing the tuples data set; overall, logistic regression performed better on certain training sets.

(3) On the basis of the above classifiers, this paper introduced the voting and stacked generalization ensemble learning algorithms and used them to combine the 14 typical classifier models. The voting ensemble reached an AUC of 0.9244, 0.0048 higher than the best of the 14 individual classifiers, and the stacked generalization ensemble reached an AUC of 0.9247, an improvement of 0.0051. The experiments show that ensemble learning improves the classification ability of the final model.
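The two ensembling schemes named in contribution (3), voting and stacked generalization, can be sketched with scikit-learn's `VotingClassifier` and `StackingClassifier`. This is a toy illustration on synthetic data, not the paper's setup: the three base learners here stand in for the 14 trained classifiers, and the AUC values it prints have no relation to the numbers reported in the abstract.

```python
# Soft voting averages the base models' predicted probabilities;
# stacked generalization trains a meta-learner on their outputs.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=12, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# The three algorithm families from contribution (2).
base = [("lr", LogisticRegression(max_iter=1000)),
        ("gbdt", GradientBoostingClassifier(random_state=1)),
        ("rf", RandomForestClassifier(random_state=1))]

voting = VotingClassifier(base, voting="soft").fit(X_tr, y_tr)
stacking = StackingClassifier(
    base, final_estimator=LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)

auc_vote = roc_auc_score(y_te, voting.predict_proba(X_te)[:, 1])
auc_stack = roc_auc_score(y_te, stacking.predict_proba(X_te)[:, 1])
print(round(auc_vote, 3), round(auc_stack, 3))
```

`StackingClassifier` fits the meta-learner on cross-validated predictions of the base models, which matches the usual definition of stacked generalization; the thesis's exact cross-validation scheme is not specified in the abstract.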

#2 · 2015-04-10 00:43:01