
Linear regression (cont.) (Business Statistics, English edition) PPT courseware.ppt

Uploaded by: 微传9988 | Document ID: 2532427 | Uploaded: 2018-09-21 | Format: PPT | Pages: 22 | Size: 541 KB

Business Statistics BEO 1106, WEEK 10
SIMPLE LINEAR REGRESSION (cont.)
TIME-SERIES ANALYSIS
Reference: Selvanathan et al. (2004), Chapters 11, 13

EVALUATING SIMPLE LINEAR REGRESSION

There is just one independent variable in the model. The relationship between X and Y can be depicted with a straight line.

Ex 1: (Selvanathan 2004 ed., p. 494, ex. 11.72; 2000 ed., p. 419, ex. 11.61) The manager of Colonial Furniture has been reviewing weekly advertising expenditures. During the past six months all advertisements for the store have appeared in the local newspaper. The number of ads per week has varied from one to seven. The store's sales staff has been tracking the number of customers who enter the store each week. The number of ads and the number of customers per week for the past 26 weeks have been stored in the file Xr11-72.

The number of ads per week is the independent variable (X), and the number of customers per week is the dependent variable (Y). The purpose of advertising in the local newspaper is to boost sales, so we expect a positive relationship between X and Y.

Graph the paired observations of X and Y.
The scatter diagram shows that, at least in the sample, there is a rather weak positive linear relationship between the numbers of ads and customers.

Determine the sample regression line.
We are going to rely on Excel. However, at home, try to reproduce the results manually using the sums of squares: $SS_x = 86.654$, $SS_y = 463802$ and $SS_{xy} = 1850.6$.
Using Excel: $\hat{y} = 296.92 + 21.356x$

Interpret the coefficients.
$\hat{\beta}_0 = 296.92$. This means that when X = 0, i.e. no advertisement is placed in the local newspaper, the expected number of customers per week is 296.92.
$\hat{\beta}_1 = 21.356$, suggesting that in a given week, for each additional advertisement in the local newspaper, the number of customers is expected to increase by 21.356.

Find and interpret the coefficient of determination.
$R^2 = SS_{xy}^2 / (SS_x \, SS_y) \approx 0.085$. This suggests that only 8.5% of the total variation in the number of customers can be explained by, or is due to, the variation in the number of advertisements. The remaining 91.5% is unexplained, i.e. is due to factors other than ads. This model has a rather poor overall fit.

Similar to the mean of a single population, say $\mu_x$, the unknown parameters of a population regression model, $\beta_0$ and $\beta_1$, can be estimated with point estimators and with interval estimators, and we can also use hypothesis testing to verify whether the sample at hand supports certain statements about these parameters. However, to do so, first we have to study the sampling distributions of the $\hat{\beta}_0$ and $\hat{\beta}_1$ least squares estimators.

Just as the sampling distribution of the sample mean, $\bar{X}$, depends on the mean, standard deviation and shape of the X population, the sampling distributions of the $\hat{\beta}_0$ and $\hat{\beta}_1$ least squares estimators depend on the properties of the $Y_j$ sub-populations ($j = 1, \dots, n$). Given $x_j$, the properties of the $Y_j$ sub-population are determined by the error (random) variable $\varepsilon_j$.

As regards the probability distributions of $\varepsilon_j$ ($j = 1, \dots, n$), it is assumed that:
1. Each $\varepsilon_j$ is normally distributed, so $Y_j$ is also normal;
2. Each $\varepsilon_j$ has zero mean, so $E(Y_j) = \beta_0 + \beta_1 x_j$;
3. Each $\varepsilon_j$ has the same variance, $\sigma_\varepsilon^2$, so $Var(Y_j) = \sigma_\varepsilon^2$ is also constant;
4. The errors are independent of each other, so $Y_i$ and $Y_j$, $i \neq j$, are also independent;
5. The error does not depend on the independent variable(s), so the effects of X and $\varepsilon$ on Y can be separated from each other.

The first three assumptions can be illustrated as follows:
$Y_i : N(\beta_0 + \beta_1 x_i ; \sigma_\varepsilon)$
$Y_j : N(\beta_0 + \beta_1 x_j ; \sigma_\varepsilon)$

If all 5 assumptions are met, then the $\hat{\beta}_0$ and $\hat{\beta}_1$ least squares estimators are also normally distributed, and
$Z = (\hat{\beta}_0 - \beta_0) / \sigma_{\hat{\beta}_0}$ and $Z = (\hat{\beta}_1 - \beta_1) / \sigma_{\hat{\beta}_1}$
are standard normal random variables.

However, the standard errors (standard deviations) of the $\hat{\beta}_0$ and $\hat{\beta}_1$ estimators are unknown and they depend on the standard deviation of $\varepsilon$, $\sigma_\varepsilon$, which is also unknown. They have to be estimated from the sample, similarly to $\beta_0$ and $\beta_1$.

Standard error of estimate: the sample standard deviation of $\varepsilon$. In the case of simple linear regression (k = 1) it is
$s_\varepsilon = \sqrt{SSE / (n - 2)}$

Replacing $\sigma_\varepsilon$ with its estimate, $s_\varepsilon$, the estimated standard errors of $\hat{\beta}_0$ and $\hat{\beta}_1$ are
$s_{\hat{\beta}_0} = s_\varepsilon \sqrt{\sum x_i^2 / (n \, SS_x)}$ and $s_{\hat{\beta}_1} = s_\varepsilon / \sqrt{SS_x}$
and
$t = (\hat{\beta}_0 - \beta_0) / s_{\hat{\beta}_0}$ and $t = (\hat{\beta}_1 - \beta_1) / s_{\hat{\beta}_1}$
are t random variables with n - 2 degrees of freedom (df).

The C% confidence interval estimators of $\beta_0$ and $\beta_1$ are
$\hat{\beta}_0 \pm t_{\alpha/2, n-2} \, s_{\hat{\beta}_0}$ and $\hat{\beta}_1 \pm t_{\alpha/2, n-2} \, s_{\hat{\beta}_1}$

(Ex 1) Find the standard error of estimate and the standard error of the slope estimator.
Solving by hand, we have to start with the sum of squares for error, SSE. A useful computational formula for SSE is
$SSE = SS_y - SS_{xy}^2 / SS_x$
Hence
$SSE = 463802 - 1850.6^2 / 86.654 \approx 424280$
and
$s_\varepsilon = \sqrt{424280 / 24} \approx 132.96$ (standard error of estimate), $s_{\hat{\beta}_1} = 132.96 / \sqrt{86.654} \approx 14.28$

Determine the 95% confidence interval estimate of $\beta_1$.
C = 95, so $\alpha/2 = 0.025$; n = 26, so df = 24 and $t_{0.025, 24} = 2.064$.
Thus
$21.356 \pm 2.064 \times 14.28$, i.e. approximately $(-8.1, 50.8)$,
and with 95% confidence the slope coefficient is within this interval.

Notice that this interval has a negative lower limit and a positive upper limit, i.e. the slope coefficient can be negative, positive or zero. The number of advertisements does not necessarily have a positive impact on the number of customers, so it might not be worth spending on advertisements in the local newspaper.
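
The hand calculation suggested for Ex 1 can be checked with a short script. This is only a sketch: it uses the three sums of squares and the critical value 2.064 quoted in these notes, and the variable names are our own. The intercept is not recomputed because the sample means are not given here.

```python
import math

# Summary statistics quoted in Ex 1 (26 weekly observations).
n = 26
ss_x, ss_y, ss_xy = 86.654, 463802.0, 1850.6
t_crit = 2.064  # t critical value for alpha/2 = 0.025, df = 24, as quoted in the notes

b1 = ss_xy / ss_x                    # slope estimate
r2 = ss_xy ** 2 / (ss_x * ss_y)      # coefficient of determination
sse = ss_y - ss_xy ** 2 / ss_x       # sum of squares for error
s_e = math.sqrt(sse / (n - 2))       # standard error of estimate
s_b1 = s_e / math.sqrt(ss_x)         # estimated standard error of the slope
ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)  # 95% interval for the slope

print(f"slope = {b1:.3f}, R^2 = {r2:.3f}")
print(f"SSE = {sse:.1f}, s_e = {s_e:.2f}, s_b1 = {s_b1:.2f}")
print(f"95% CI for the slope: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval straddles zero, which is the numerical basis for the remark that the slope can be negative, positive or zero.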

The sampling distributions of the $\hat{\beta}_0$ and $\hat{\beta}_1$ least squares estimators can also be used for testing the regression coefficients.

Recall (see the week 9 notes) that in regression analysis the most important question is whether there is a linear relationship between X and Y ($\beta_1 \neq 0$), and if there is, whether this relationship is positive ($\beta_1 > 0$) or negative ($\beta_1 < 0$).

To this end we can conduct two-tail or one-tail t-tests about $\beta_1$, the same way as we test $\mu$ when $\sigma$ is unknown. The test statistic is
$t = (\hat{\beta}_1 - \beta_{0,1}) / s_{\hat{\beta}_1}$
where $\beta_{0,1}$ denotes the hypothetical value of $\beta_1$ (often zero). Granted that $H_0$ is true, t has a Student's t distribution with n - 2 degrees of freedom (for k = 1).

(Ex 1) Is there sufficient evidence at the 5% level to conclude that the number of advertisements and the number of customers are linearly related?

The question suggests that $H_0: \beta_1 = 0$ and $H_A: \beta_1 \neq 0$, so $\beta_{0,1} = 0$.
Since this is a two-tail test, there are two critical values: $-t_{\alpha/2} = -2.064$ and $t_{\alpha/2} = 2.064$ (see part f). Accordingly, reject $H_0$ if the value of the test statistic calculated from the sample is either smaller than -2.064 or greater than 2.064.
The test statistic calculated from the sample is t = 1.495. Since 1.495 is in the non-rejection region, we maintain $H_0$.
Equivalently, p-value > $\alpha$ = 0.05: maintain $H_0$ and conclude at the 5% level of significance that there is not sufficient evidence that the numbers of advertisements and customers are linearly related.

Ex 2: (Example 1 of week 9) Pat Statsdud postulated that the longer one studied, the better one's grade. To test this theory, Pat regressed final mark (Y) on study time (X) and obtained a sample regression equation. Can we conclude at the 1% significance level that Pat's theory is correct, i.e. that there is a positive linear relationship between study time and final mark?
Right-tail test with $H_0: \beta_1 = 0$ and $H_A: \beta_1 > 0$.
p-value < $\alpha$ = 0.01: reject $H_0$ and conclude at the 1% level of significance that Pat's theory is supported by the sample evidence.

USING THE SAMPLE REGRESSION EQUATION

If the fit of the sample regression equation is satisfactory, it can be used to predict the dependent variable or to estimate its mean value.

Prediction, for a particular element of a Y sub-population. E.g.: What is the final mark of Tom, who spent 30 hours studying? I.e., given x = 30, how large is y?
Estimation, for the expected value of a Y sub-population. E.g.: What is the mean final mark of all those students who spent 30 hours studying? I.e., given x = 30, how large is E(y)?

The dependent variable can be predicted or estimated in two ways: with a single value or with an interval.
For a given X value, the point forecast of Y and the point estimator of the mean of the Y sub-population are the same:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

Ex 2: Predict the final mark when study time is 30 hours.
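
The critical-value test of the slope and the point forecast can be sketched as two small functions. This is an illustrative sketch, not the course's own procedure: the function names are hypothetical, the figures are the Ex 1 values quoted in these notes, and x = 4 ads is an arbitrary value chosen only for demonstration (the Ex 2 equation is not reproduced here).

```python
import math

def slope_t_test(b1, s_b1, t_crit, beta1_null=0.0, tail="two"):
    """Return (t, reject) for H0: beta1 = beta1_null, using a critical-value rule."""
    t = (b1 - beta1_null) / s_b1
    if tail == "two":
        reject = abs(t) > t_crit
    elif tail == "right":
        reject = t > t_crit
    elif tail == "left":
        reject = t < -t_crit
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return t, reject

def point_forecast(b0, b1, x):
    """Point forecast of Y, and equally the point estimate of E(Y), at a given x."""
    return b0 + b1 * x

# Ex 1 figures quoted in the notes.
n, ss_x, ss_y, ss_xy = 26, 86.654, 463802.0, 1850.6
b0, b1 = 296.92, 21.356
s_b1 = math.sqrt((ss_y - ss_xy ** 2 / ss_x) / (n - 2)) / math.sqrt(ss_x)

t_stat, reject_h0 = slope_t_test(b1, s_b1, t_crit=2.064, tail="two")
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")

# Hypothetical x value, for illustration only.
print(f"forecast at x = 4 ads: {point_forecast(b0, b1, 4):.2f} customers")
```

Because the Ex 1 test does not reject the null hypothesis, a forecast from that equation should be read with caution; it is shown only to make the arithmetic concrete.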

INTRODUCTION TO TIME-SERIES ANALYSIS

According to classical time-series analysis, an observed time series is the combination of some pattern and random variations. The aim is to separate them from each other in order to describe the historical pattern in the data and to prepare forecasts by projecting the revealed historical pattern into the future.

Traditionally, there are two types of methods for identifying the pattern.
Smoothing: The random fluctuations are removed from the data by smoothing the time series.
Decomposition: The time series is broken into its components and the pattern is the combination of the systematic parts.

The pattern itself is likely to contain some, or all, of the following three components: trend, seasonal and cyclical.

Trend: The long-term general change in the level of the data, with a duration of longer than a year. It can be linear (straight line) or non-linear (smooth curve), e.g. quadratic, exponential, etc.

Seasonal variations: Regular wavelike fluctuations of constant length, repeating themselves within a period of no longer than a year. Seasonal variations are usually associated with the four seasons of the year, but they may also refer to any systematic pattern that occurs during a month, a week or even a single day.

Cyclical variations: Wavelike movements, quasi-regular fluctuations around the long-term trend, lasting longer than a year. Cyclical variations are often attributed to business cycles, i.e. to the ups and downs in the general level of business activity.

The time gap between the beginning trough and the ending trough is the length of the cycle, while the vertical distance between the trough and the peak is the amplitude of the cycle. The time period between the beginning trough and the peak is called the expansion phase, while the period between the peak and the ending trough is termed the contraction phase.

Seasonal and cyclical variations might be very similar in their appearance. However, while seasonal variations are absolutely regular and occur over calendar periods no longer than a year, cyclical variations might and do change both in their intensity (amplitude) and duration, and they last longer than a year. It is far more difficult to study and predict the cyclical component than the seasonal component.

The random variations of the data comprise the deviations of the observed time series from the underlying pattern. When this irregular component is strong compared to the (quasi-) regular components, it tends to hide the seasonal and cyclical variations, and it is difficult to detach from the pattern. However, if we manage to capture the trend, the seasonal and cyclical variations, the remaining changes do not have any discernible pattern, so they are totally unpredictable.

The four components of a time series (T: trend, S: seasonal, C: cyclical, R: random) can be combined in different ways. Accordingly, the time series model used to describe the observed data (Y) can be
Additive: $Y_t = T_t + S_t + C_t + R_t$
Multiplicative: $Y_t = T_t \times S_t \times C_t \times R_t$

E.g.: If the trend is linear, these two models look as follows:
$Y_t = (\beta_0 + \beta_1 t) + S_t + C_t + R_t$
$Y_t = (\beta_0 + \beta_1 t) \times S_t \times C_t \times R_t$

In an additive model the seasonal, cyclical and random variations are absolute deviations from the trend. They do not depend on the level of the trend.
In a multiplicative model the seasonal, cyclical and random variations are relative (percentage) deviations from the trend. The higher the trend, the more intensive these variations are.

Consider two time series that both have an increasing linear trend component: in the additive one the fluctuations around this trend have the same intensity, while in the multiplicative one the fluctuations around this trend are more and more intensive.

Though in practice the multiplicative model is the more popular, both models have their own merits and, depending on the nature of the time series to be analysed, they are equally acceptable.
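
The contrast between the two models can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: the trend coefficients, the quarterly sine-wave seasonal pattern and the amplitudes are invented for illustration, and the cyclical and random components are left out to keep the comparison visible.

```python
import math

def linear_trend(t, b0=100.0, b1=2.0):
    """Hypothetical increasing linear trend T_t = b0 + b1 * t."""
    return b0 + b1 * t

def seasonal_wave(t, period=4):
    """Hypothetical quarterly seasonal pattern, oscillating between -1 and 1."""
    return math.sin(2 * math.pi * t / period)

def additive(t, amplitude=10.0):
    # Y_t = T_t + S_t: the seasonal effect is an absolute amount.
    return linear_trend(t) + amplitude * seasonal_wave(t)

def multiplicative(t, rel_amplitude=0.10):
    # Y_t = T_t * S_t: the seasonal effect is an index around 1 (here +/- 10%).
    return linear_trend(t) * (1 + rel_amplitude * seasonal_wave(t))

def max_deviation(model, periods):
    """Largest absolute gap between the series and its trend over the given periods."""
    return max(abs(model(t) - linear_trend(t)) for t in periods)

first_year, last_year = range(0, 4), range(36, 40)
add_first, add_last = max_deviation(additive, first_year), max_deviation(additive, last_year)
mul_first, mul_last = max_deviation(multiplicative, first_year), max_deviation(multiplicative, last_year)

print(f"additive:       {add_first:.1f} -> {add_last:.1f}")
print(f"multiplicative: {mul_first:.1f} -> {mul_last:.1f}")
```

In the additive series the largest deviation from the trend is the same in the first and the last year, whereas in the multiplicative series it grows with the level of the trend.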
