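Contents
Abstract (Chinese)
Abstract (English)
1 Introduction
1.1 Research Background and Significance
1.2 Research Progress at Home and Abroad
2 Data and Research Methods
2.1 Selection of Research Data
2.2 The Regression Model and Its Extensions
2.2.1 The General Multiple Linear Regression Model
2.2.2 Diagnosing and Handling Heteroskedasticity
2.2.3 Analysis of Autocorrelation
2.2.4 Diagnosing and Handling Multicollinearity
3 Empirical Analysis
3.1 General Linear Regression Modeling of Domestic Tourism Income
3.2 Collinearity Analysis of the Domestic Tourism Income Model
3.3 Heteroskedasticity Analysis of the Domestic Tourism Income Model
3.4 Autocorrelation Analysis of the Domestic Tourism Income Model
3.5 Analysis of Results
4 Conclusions and Recommendations
References
Acknowledgements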
Abstract: This paper selects relevant development data from the China Statistical Yearbook and studies a linear regression equation for domestic tourism income. First, a general multiple linear regression model is established between the explained variable, total domestic tourism income, and the selected explanatory variables. The t-tests of the model coefficients and the variance inflation factor (VIF) test show that the VIF values of the variables exceed 10, indicating fairly serious multicollinearity. Principal component regression is therefore used to obtain an estimated regression equation based on the relationship between the principal components and the independent variables. Second, the White test and an autocorrelation test are applied to the equation between the principal components and the explained variable obtained by principal component regression. Finally, the final model is obtained from the regression on the principal components together with the equations relating the principal components to the explanatory variables. The results show that the variance inflation factors in the regression on the original independent variables are greater than 10, so a collinearity problem exists. The P values of the heteroskedasticity tests for the principal component regression equation are 0.6221 and 0.5652, so heteroskedasticity is judged to be absent. The subsequent autocorrelation test gives DW = 0.947, and the autocorrelation and partial autocorrelation coefficients of the residuals show no correlation between values s periods apart, so no autocorrelation is found. Compared with the general linear regression equation, all coefficients of the resulting principal component regression are positive, indicating that every explanatory variable promotes tourism income, which is more consistent with economic reality.
The most significant factor influencing tourism income is railway mileage, while the factor with the smallest relative impact is the per capita disposable income of urban residents.
Key words: domestic tourism income; heteroskedasticity; autocorrelation; multicollinearity; principal component regression
1.1 Research Background and Significance
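Tourism is an emerging industry, and scholars at home and abroad have long studied the factors that influence it and built statistical models for it. As early as 1988, Jean S. Holder [1] described the impact of tourism on local environmental patterns, and the factors that drive tourism development have since become a key research topic. Building on that work, the 2008 paper by Wang Zhanxiang (王占祥) [2] concluded that per capita gross product, the number of domestic tourist trips, and the price level are all positively correlated with domestic tourism income. In 2009, Cui Meijiao (崔美姣) et al. [3] proposed applying multiple regression to China's tourism industry. They chose factors that are relatively easy to quantify, such as the number of tourists, tourist attractions, the number of hotels, and railway mileage, as the independent variables of their model and, using SPSS, concluded that tourist attractions are strongly associated with domestic tourism income. Under the domestic economic conditions of the time, however, the tourism industry was already close to saturation: more domestic tourists, higher per capita spending, and better transport no longer determined by themselves the growth of domestic tourism income. To some extent these factors instead add to the load on the carrying capacity of scenic areas and to environmental pressure, while national income has become an important indicator affecting the scale and diversity of tourism demand [4]. An empirical analysis of the factors behind domestic tourism income published at that time by Xu Jianguo (许建国) of Zhoukou Vocational and Technical College illustrates this point well [5]. Citing that study, Yuan Yiming (袁翊茗) et al. carried out an econometric analysis of tourism income using data from 1994 to 2010 [6]. A 2013 paper by Xue Yuan (薛媛) [7], which also used multiple regression to analyze the factors influencing domestic tourism income, reached a slightly different conclusion. To ensure that the data were stationary when building the model, she applied stationarity and cointegration tests and, based on the test results, adopted the double-log linear function widely used in academia. She concluded that the number of tourist hotels is the key driver of tourism development.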
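For the problem of identifying influencing factors, building a multiple regression model is a fairly basic and effective method, but the model also has drawbacks. In a multiple regression model, the explanatory variables may be multicollinear. Multicollinearity can be corrected with stepwise regression [8]: regress the dependent variable on each explanatory variable separately, then add the remaining explanatory variables one at a time, keeping the one with the largest contribution. In 2017, Yahya Peranginangin et al. applied a fairly mature multiple regression model to analyze the social graph of brand awareness [9].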
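As a rough illustration of the stepwise idea described above, the following Python sketch computes variance inflation factors and then runs a greedy forward selection. It is not the procedure used in this paper or in [8]: the selection criterion (adjusted R²), the stopping threshold `min_gain`, the helper names, and the synthetic data are all assumptions made for illustration. The VIF helper uses the usual definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing variable j on the other explanatory variables.

```python
import numpy as np

def r_squared(X, y, adjusted=False):
    """R^2 (optionally adjusted) of an OLS fit of y on X plus an intercept."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
    if adjusted:
        r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2

def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R_j^2)."""
    return [1.0 / (1.0 - r_squared(np.delete(X, j, axis=1), X[:, j]))
            for j in range(X.shape[1])]

def forward_stepwise(X, y, min_gain=1e-3):
    """Greedy forward selection (illustrative criterion: adjusted R^2).

    Starts from the best single-variable regression, then repeatedly adds
    the remaining variable that raises adjusted R^2 the most, stopping when
    the improvement falls below min_gain (a hypothetical threshold).
    """
    selected, remaining = [], list(range(X.shape[1]))
    best = -np.inf
    while remaining:
        scores = {j: r_squared(X[:, selected + [j]], y, adjusted=True)
                  for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best < min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected, best

# Synthetic data (not the thesis data): x1 is almost a copy of x0.
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)
x1 = x0 + 0.01 * rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x0 + 1.0 * x2 + 0.1 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])

vifs = vif(X)
selected, score = forward_stepwise(X, y)
print("VIF:", vifs)
print("selected columns:", selected, "adjusted R^2:", score)
```

In this toy setup the two nearly identical columns have very large VIFs, and the selection keeps only one of them together with the independent column. That is the behavior stepwise regression relies on to reduce collinearity.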