24小时热门版块排行榜

>论坛更新日志 (3646)
>文献求助 (321)
>虫友互识 (278)
>导师招生 (265)
>博后之家 (137)
>硕博家园 (135)
>休闲灌水 (126)
>论文投稿 (106)
>考博 (101)
>基金申请 (81)
>教师之家 (47)
>招聘信息布告栏 (41)
>公派出国 (38)
>考研 (33)
>绿色求助(高悬赏) (32)
>找工作 (27)

返回列表

【奖励】本帖被评价77次，作者jlcuit增加金币 59.6 个

当前只显示满足指定条件的回帖，点击这里查看本话题的所有回帖

jlcuit

木虫 (正式写手)

应助: 6 (幼儿园)
金币: 2200.7
帖子: 570
在线: 238.7小时
虫号: 444828

[资源] The Elements of Statistical Learning: Data Mining, Inference, and Prediction

最近在看数据统计方面的资料，查到一本书，分享给大家，希望有帮助~
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Second Edition
Trevor Hastie
Robert Tibshirani
Jerome Friedman
Springer，2008

内容简介
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting---the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression & path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.

目录
Preface to the Second Edition vii
Preface to the First Edition xi
1 Introduction 1
2 Overview of Supervised Learning 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Variable Types and Terminology . . . . . . . . . . . . . . 9
2.3 Two Simple Approaches to Prediction:
Least Squares and Nearest Neighbors . . . . . . . . . . . 11
2.3.1 Linear Models and Least Squares . . . . . . . . 11
2.3.2 Nearest-Neighbor Methods . . . . . . . . . . . . 14
2.3.3 From Least Squares to Nearest Neighbors . . . . 16
2.4 Statistical Decision Theory . . . . . . . . . . . . . . . . . 18
2.5 Local Methods in High Dimensions . . . . . . . . . . . . . 22
2.6 Statistical Models, Supervised Learning
and Function Approximation . . . . . . . . . . . . . . . . 28
2.6.1 A Statistical Model
for the Joint Distribution Pr(X,Y ) . . . . . . . 28
2.6.2 Supervised Learning . . . . . . . . . . . . . . . . 29
2.6.3 Function Approximation . . . . . . . . . . . . . 29
2.7 Structured Regression Models . . . . . . . . . . . . . . . 32
2.7.1 Difficulty of the Problem . . . . . . . . . . . . . 32
xiv Contents
2.8 Classes of Restricted Estimators . . . . . . . . . . . . . . 33
2.8.1 Roughness Penalty and Bayesian Methods . . . 34
2.8.2 Kernel Methods and Local Regression . . . . . . 34
2.8.3 Basis Functions and Dictionary Methods . . . . 35
2.9 Model Selection and the Bias–Variance Tradeoff . . . . . 37
Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . 39
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3 Linear Methods for Regression 43
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Linear Regression Models and Least Squares . . . . . . . 44
3.2.1 Example: Prostate Cancer . . . . . . . . . . . . 49
3.2.2 The Gauss–Markov Theorem . . . . . . . . . . . 51
3.2.3 Multiple Regression
from Simple Univariate Regression . . . . . . . . 52
3.2.4 Multiple Outputs . . . . . . . . . . . . . . . . . 56
3.3 Subset Selection . . . . . . . . . . . . . . . . . . . . . . . 57
3.3.1 Best-Subset Selection . . . . . . . . . . . . . . . 57
3.3.2 Forward- and Backward-Stepwise Selection . . . 58
3.3.3 Forward-Stagewise Regression . . . . . . . . . . 60
3.3.4 Prostate Cancer Data Example (Continued) . . 61
3.4 Shrinkage Methods . . . . . . . . . . . . . . . . . . . . . . 61
3.4.1 Ridge Regression . . . . . . . . . . . . . . . . . 61
3.4.2 The Lasso . . . . . . . . . . . . . . . . . . . . . 68
3.4.3 Discussion: Subset Selection, Ridge Regression
and the Lasso . . . . . . . . . . . . . . . . . . . 69
3.4.4 Least Angle Regression . . . . . . . . . . . . . . 73
3.5 Methods Using Derived Input Directions . . . . . . . . . 79
3.5.1 Principal Components Regression . . . . . . . . 79
3.5.2 Partial Least Squares . . . . . . . . . . . . . . . 80
3.6 Discussion: A Comparison of the Selection
and Shrinkage Methods . . . . . . . . . . . . . . . . . . . 82
3.7 Multiple Outcome Shrinkage and Selection . . . . . . . . 84
3.8 More on the Lasso and Related Path Algorithms . . . . . 86
3.8.1 Incremental Forward Stagewise Regression . . . 86
3.8.2 Piecewise-Linear Path Algorithms . . . . . . . . 89
3.8.3 The Dantzig Selector . . . . . . . . . . . . . . . 89
3.8.4 The Grouped Lasso . . . . . . . . . . . . . . . . 90
3.8.5 Further Properties of the Lasso . . . . . . . . . . 91
3.8.6 Pathwise Coordinate Optimization . . . . . . . . 92
3.9 Computational Considerations . . . . . . . . . . . . . . . 93
Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . 94
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Contents xv
4 Linear Methods for Classification 101
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.2 Linear Regression of an Indicator Matrix . . . . . . . . . 103
4.3 Linear Discriminant Analysis . . . . . . . . . . . . . . . . 106
4.3.1 Regularized Discriminant Analysis . . . . . . . . 112
4.3.2 Computations for LDA . . . . . . . . . . . . . . 113
4.3.3 Reduced-Rank Linear Discriminant Analysis . . 113
4.4 Logistic Regression . . . . . . . . . . . . . . . . . . . . . . 119
4.4.1 Fitting Logistic Regression Models . . . . . . . . 120
4.4.2 Example: South African Heart Disease . . . . . 122
4.4.3 Quadratic Approximations and Inference . . . . 124
4.4.4 L1 Regularized Logistic Regression . . . . . . . . 125
4.4.5 Logistic Regression or LDA? . . . . . . . . . . . 127
4.5 Separating Hyperplanes . . . . . . . . . . . . . . . . . . . 129
4.5.1 Rosenblatt’s Perceptron Learning Algorithm . . 130
4.5.2 Optimal Separating Hyperplanes . . . . . . . . . 132
Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . 135
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

5 Basis Expansions and Regularization 139
6 Kernel Smoothing Methods 191
7 Model Assessment and Selection 219
8 Model Inference and Averaging 261
9 Additive Models, Trees, and Related Methods 295
10 Boosting and Additive Trees 337
11 Neural Networks 389
12 Support Vector Machines and Flexible Discriminants 417
13 Prototype Methods and Nearest-Neighbors 459
14 Unsupervised Learning 485
15 Random Forests 587
16 Ensemble Learning 605
17 Undirected Graphical Models 625
18 High-Dimensional Problems: p ≫ N 649

回复此楼

» 本帖附件资源列表

欢迎监督和反馈：小木虫仅提供交流平台，不对该内容负责。
本内容由用户自主发布，如果其内容涉及到知识产权问题，其责任在于用户本人，如对版权有异议，请联系邮箱：xiaomuchong@tal.com
附件 1 : The_Elements_of_Statistical_Learning_Data_Mining,_Inference,_and_Prediction.pdf

2015-05-09 21:06:40, 12.16 M

» 收录本帖的淘帖专辑推荐

遥感图像处理专辑	科研与育人	专业书籍	李的收藏
@数学参考资料

» 猜你喜欢

» 本主题相关价值贴推荐，对您同样有帮助:

剑桥2012年Modern.Statistical.Methods.for.Astronomy.With.R.Applications 已经有31人回复
数据挖掘经典书《DataMining Practical Machine Learning Tools and Techniques》已经有293人回复
经典书籍<Statistics for Spatial Data> 已经有19人回复
【分享】概率论沉思录--清晰版PDF.pdf 已经有146人回复
【讨论】机器学习理论基础已经有33人回复

1楼2015-05-09 21:07:19

已阅回复此楼关注TA 给TA发消息送TA红花 TA的回帖

相关版块跳转我要订阅楼主 jlcuit 的主题更新

返回列表