24小时热门版块排行榜    

查看: 228  |  回复: 8
当前主题已经存档。
当前只显示满足指定条件的回帖,点击这里查看本话题的所有回帖

iangie

木虫 (著名写手)

强气受

[交流] 什么是多线性相关?

就是一个Y跟两个以上的x进行线性相关..
怎么求相关系数啊?

英文叫multiple correlation analyses

» 猜你喜欢

超越性别的爱情才是纯粹的爱情
已阅   回复此楼   关注TA 给TA发消息 送TA红花 TA的回帖

iangie

木虫 (著名写手)

强气受

Example 2
Simple Linear Regression

The example may be easier to understand if you copy it to a blank worksheet.

How do I copy an example?

Select the example in this article.
Important  Do not select the row or column headers.



Selecting an example from Help

Press CTRL+C.
In Excel, create a blank workbook or worksheet.
In the worksheet, select cell A1, and press CTRL+V.
Important  For the example to work properly, you must paste it into cell A1 of the worksheet.

To switch between viewing the results and viewing the formulas that return the results, press CTRL+` (grave accent), or on the Formulas tab, in the Formula Auditing group, click the Show Formulas button.
After you copy the example to a blank worksheet, you can adapt it to suit your needs.


--------------------------------------------------------------------------------

   
1
2
3
4
5
6
7
8

9
A B C
Month Sales  
1 3100  
2 4500  
3 4400  
4 5400  
5 7500  
6 8100  
Formula Description Result
=SUM(LINEST(B2:B7, A2:A7)*{9,1}) Estimate sales for the ninth month 11000


In general, SUM({m,b}*{x,1}) equals mx + b, the estimated y-value for a given x-value. You can also use the TREND function.

Example 3
Multiple Linear Regression

Suppose a commercial developer is considering purchasing a group of small office buildings in an established business district.

The developer can use multiple linear regression analysis to estimate the value of an office building in a given area based on the following variables.

Variable Refers to the
y Assessed value of the office building
x1 Floor space in square feet
x2 Number of offices
x3 Number of entrances
x4 Age of the office building in years

This example assumes that a straight-line relationship exists between each independent variable (x1, x2, x3, and x4) and the dependent variable (y), the value of office buildings in the area.

The developer randomly chooses a sample of 11 office buildings from a possible 1,500 office buildings and obtains the following data. "Half an entrance" means an entrance for deliveries only.

The example may be easier to understand if you copy it to a blank worksheet.

How do I copy an example?

Select the example in this article.
Important  Do not select the row or column headers.



Selecting an example from Help

Press CTRL+C.
In Excel, create a blank workbook or worksheet.
In the worksheet, select cell A1, and press CTRL+V.
Important  For the example to work properly, you must paste it into cell A1 of the worksheet.

To switch between viewing the results and viewing the formulas that return the results, press CTRL+` (grave accent), or on the Formulas tab, in the Formula Auditing group, click the Show Formulas button.
After you copy the example to a blank worksheet, you can adapt it to suit your needs.


--------------------------------------------------------------------------------

   

1
2
3
4
5
6
7
8
9
10
11
12
13

14
A B C D E
Floor space (x1) Offices (x2) Entrances (x3) Age (x4) Assessed value (y)
2310 2 2 20 142,000
2333 2 2 12 144,000
2356 3 1.5 33 151,000
2379 3 2 43 150,000
2402 2 3 53 139,000
2425 4 2 23 169,000
2448 2 1.5 99 126,000
2471 2 2 34 142,900
2494 3 3 23 163,000
2517 4 4 55 169,000
2540 2 3 22 149,000
Formula
=LINEST(E2:E12,A212,TRUE,TRUE)


Important  The formula in the example must be entered as an array formula. After copying the example to a blank worksheet, select the range A14:E18 starting with the formula cell. Press F2, and then press CTRL+SHIFT+ENTER. If the formula is not entered as an array formula, the single result is -234.2371645.

When entered as an array, the following regression statistics are returned. Use this key to identify the statistic you want.



The multiple regression equation, y = m1*x1 + m2*x2 + m3*x3 + m4*x4 + b, can be obtained by using the values from row 14:

y = 27.64*x1 + 12,530*x2 + 2,553*x3 - 234.24*x4 + 52,318

The developer can now estimate the assessed value of an office building in the same area that has 2,500 square feet, three offices, and two entrances and is 25 years old, by using the following equation:

y = 27.64*2500 + 12530*3 + 2553*2 - 234.24*25 + 52318 = $158,261

Alternatively, you can copy the following table to cell A21 of the worksheet that you created for this example.

Floor space (x1) Offices (x2) Entrances (x3) Age (x4) Assessed value (y)
2500 3 2 25 =D14*A22 + C14*B22 + B14*C22 + A14*D22 + E14

You can also use the TREND function to calculate this value.

Example 4
Using the F and r2 Statistics

In the preceding example, the coefficient of determination, or r2, is 0.99675 (see cell A17 in the output for LINEST), which would indicate a strong relationship between the independent variables and the sale price. You can use the F statistic to determine whether these results, with such a high r2 value, occurred by chance.

Assume for the moment that in fact there is no relationship among the variables, but that you have drawn a rare sample of 11 office buildings that causes the statistical analysis to demonstrate a strong relationship. The term "Alpha" is used for the probability of erroneously concluding that there is a relationship.

The F and df values in output from the LINEST function can be used to assess the likelihood of a higher F value occurring by chance. F can be compared with critical values in published F-distribution tables or the FDIST function in Excel can be used to calculate the probability of a larger F value occurring by chance. The appropriate F distribution has v1 and v2 degrees of freedom. If n is the number of data points and const = TRUE or omitted, then v1 = n – df – 1 and v2 = df. (If const = FALSE, then v1 = n – df and v2 = df.) The FDIST function — with the syntax FDIST(F,v1,v2) — will return the probability of a higher F value occurring by chance. In this example, df = 6 (cell B18) and F = 459.753674 (cell A18).

Assuming an Alpha value of 0.05, v1 = 11 – 6 – 1 = 4 and v2 = 6, the critical level of F is 4.53. Since F = 459.753674 is much higher than 4.53, it is extremely unlikely that an F value this high occurred by chance. (With Alpha = 0.05, the hypothesis that there is no relationship between known_y’s and known_x’s is to be rejected when F exceeds the critical level, 4.53.) You can use the FDIST function in Excel to obtain the probability that an F value this high occurred by chance. For example, FDIST(459.753674, 4, 6) = 1.37E-7, an extremely small probability. You can conclude, either by finding the critical level of F in a table or by using the FDIST function, that the regression equation is useful in predicting the assessed value of office buildings in this area. Remember that it is critical to use the correct values of v1 and v2 that were computed in the preceding paragraph.

Example 5
Calculating the t-Statistics

Another hypothesis test will determine whether each slope coefficient is useful in estimating the assessed value of an office building in Example 3. For example, to test the age coefficient for statistical significance, divide -234.24 (age slope coefficient) by 13.268 (the estimated standard error of age coefficients in cell A15). The following is the t-observed value:

t = m4 ÷ se4 = -234.24 ÷ 13.268 = -17.7

If the absolute value of t is sufficiently high, it can be concluded that the slope coefficient is useful in estimating the assessed value of an office building in Example 3. The following table shows the absolute values of the 4 t-observed values.

If you consult a table in a statistics manual, you will find that t-critical, two tailed, with 6 degrees of freedom and Alpha = 0.05 is 2.447. This critical value can also be found by using the TINV function in Excel. TINV(0.05,6) = 2.447. Because the absolute value of t (17.7) is greater than 2.447, age is an important variable when estimating the assessed value of an office building. Each of the other independent variables can be tested for statistical significance in a similar manner. The following are the t-observed values for each of the independent variables.

Variable t-observed value
Floor space 5.1
Number of offices 31.3
Number of entrances 4.8
Age 17.7

These values all have an absolute value greater than 2.447; therefore, all the variables used in the regression equation are useful in predicting the assessed value of office buildings in this area.



See Also
Add, change, or remove a trendline in a chart
GROWTH function
LOGEST function
Statistical functions (reference)
TREND function
超越性别的爱情才是纯粹的爱情
7楼2010-03-12 12:08:08
已阅   回复此楼   关注TA 给TA发消息 送TA红花 TA的回帖
查看全部 9 个回答

Daoist

木虫 (正式写手)

iangie(金币+5):谢谢~~只会excel... 2010-03-12 11:46
用EXCEL 分析工具 /相关工具可求,非常简单
也可以用SPSS求,也很简单
2楼2010-03-12 11:03:35
已阅   回复此楼   关注TA 给TA发消息 送TA红花 TA的回帖

Daoist

木虫 (正式写手)

再不见数据我就走了
4楼2010-03-12 11:05:37
已阅   回复此楼   关注TA 给TA发消息 送TA红花 TA的回帖

iangie

木虫 (著名写手)

强气受

手上没有数据~~~看到文章里面用了线性相关和多元相关~~~
所以想了解下如何做多元回归....

你说Excel都可以做吗?
偶去搜搜看...
超越性别的爱情才是纯粹的爱情
5楼2010-03-12 11:12:08
已阅   回复此楼   关注TA 给TA发消息 送TA红花 TA的回帖
普通表情 高级回复 (可上传附件)
信息提示
请填处理意见