Total thesis length: 71,568 characters
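Abstract
With the continued growth of e-commerce, online sales are becoming an important channel for a growing number of products. Online channels make sales data easy to obtain, and these data give companies a solid basis for analyzing markets and customers. Mobile phones are a product with a very fast update cycle, so identifying the factors that affect phone sales and understanding market demand are very important for production decisions. This thesis therefore mines online mobile phone sales data to analyze the factors that affect phone sales.
First, the thesis introduces the theoretical foundations relevant to the study. Second, taking a qualitative approach, it applies text mining to segment the scraped reviews into words, builds a document-term matrix (DTM) with TF-IDF weighting, and uses hierarchical clustering and word cloud analysis to group the segmented terms into two main categories, appearance and configuration, which are identified as the two most important factors affecting phone sales. The word cloud results also indicate that price and service quality are important factors. Third, taking a quantitative approach, the thesis fits simple linear regression and nonlinear regression equations to examine how individual factors (phone price, brand awareness, camera resolution, phone memory, screen resolution, and number of processor cores) each affect sales. It then uses multiple linear regression and stepwise regression to build a multiple linear regression model of phone sales on four variables: camera resolution, product awareness, price, and NFC support. Finally, a theoretical model of the factors affecting phone sales is constructed from these results.
Although the study yields a fairly robust regression model, the choice of research methods needs further refinement. In addition, the text mining did not consider user sentiment, and the theoretical model of sales factors was not tested or evaluated; both are worthwhile directions for future work.
Keywords: data mining; factors affecting sales; text mining; regression analysis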
Abstract
With the continuous development of e-commerce, online sales are becoming an important sales channel for more and more products. Sales data are readily available through online channels and provide a good foundation for companies to analyze the market and their customers. Because the mobile phone is a product that is updated very quickly, identifying the factors that influence mobile phone sales and grasping the development needs of the market are very important for production decisions. To this end, this paper mines online mobile phone sales data to analyze the factors affecting mobile phone sales.
First, the paper introduces the theoretical basis of the research. Second, a qualitative analysis is carried out: the collected reviews are segmented into words through text mining, and a document-term matrix (DTM) is constructed with TF-IDF weighting. Hierarchical clustering and word cloud analysis are then used to cluster the segmented words into two categories, appearance and configuration, which are found to be the two most important factors affecting mobile phone sales. The word cloud analysis also shows that price and service quality are important factors. Third, the paper uses quantitative methods to analyze the factors affecting mobile phone sales. Simple linear regression and nonlinear regression equations are constructed to examine the effect on sales of single factors such as price, brand popularity, camera pixels, memory, screen resolution, and number of processor cores. Multiple linear regression and stepwise regression are then used to establish a multiple linear regression model relating mobile phone sales to four variables: camera pixels, product popularity, price, and NFC support. Finally, based on the research results, a theoretical model of the factors affecting mobile phone sales is established.
Although this paper obtains a fairly robust regression model, the selection of research methods needs further refinement. In addition, users' sentiment is not considered in the text mining process, and the theoretical model of the factors affecting mobile phone sales has not been tested or evaluated. This work merits further development in the future.
KEY WORDS: data mining; sales influencing factors; text mining; regression analysis
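Contents
Chapter 1 Introduction
1.1 Background and Significance
1.2 Literature Review
1.3 Main Content of This Thesis
Chapter 2 Overview of Basic Theory
2.1 Data Preprocessing Theory
2.1.1 Overview of Data Preprocessing
2.1.2 Main Tasks of Data Preprocessing
2.2 Text Mining and Text Analysis Theory
2.2.1 Chinese Word Segmentation Theory: Overview of Lexical Analysis
2.2.2 Dictionaries and Rule-Based Word Segmentation
2.2.3 Hierarchical Text Clustering: Hierarchical Clustering and Inter-Cluster Computation
2.3 Regression Analysis Theory
2.3.1 Principles of the Simple Linear Regression Model
2.3.2 Measuring the Degree of Correlation
2.3.3 Principles of the Multiple Linear Regression Model
Chapter 3 Analysis of Factors Affecting Mobile Phone Sales Based on Text Mining
3.1 Research Process and Data Preprocessing
3.1.1 Research Process Design
3.1.2 Data Collection and Preprocessing
3.2.3 Building the Term-Document Matrix (DTM)
3.2.4 Dimensionality Reduction
3.3 Clustering and Analysis of Results
3.3.1 Hierarchical Clustering Analysis
3.3.2 Word Cloud Analysis
3.4 Main Conclusions of the Analysis
Chapter 4 Analysis of Factors Affecting Mobile Phone Sales Based on Regression Models
4.1 Data Collection and Preprocessing
4.1.1 Data Collection
4.1.2 Data Cleaning
4.1.2 Data Transformation
4.1.3 Data Standardization and Normalization
4.2 Single-Factor Regression Analysis
4.2.1 Effect of Single Factors on Sales Based on Simple Linear Regression
4.2.2 Effect of Single Factors on Sales Based on Nonlinear Regression
4.3 Multi-Factor Regression Analysis
4.3.1 Effect of Single Factors on Sales Based on Multiple Linear Regression
4.3.2 Effect of Multiple Factors on Sales Based on Stepwise Regression
4.3.3 Constructing the Theoretical Model of Factors Affecting Mobile Phone Sales
4.4 Main Conclusions of the Analysis
Chapter 5 Summary and Outlook
References
Appendix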