Research on an Android Malware Detection Method Based on Improved Random Forest
Published: 2019-01-25 21:37
[Abstract]: In recent years, smartphones have developed rapidly alongside the growth of the mobile Internet. Android now holds a large and still growing share of the global mobile operating system market, and it has also become the main platform on which malware spreads. Android malware exhibits a wide variety of malicious behaviors and causes serious harm and economic loss to users and to society as a whole. How to analyze and detect Android malware quickly and efficiently has therefore become an active research topic. This thesis first surveys the Android platform, analyzing its system architecture and application components, and then reviews the machine learning algorithms used and the Spark parallel computing framework, laying the foundation for the subsequent work. Next, to address the fact that the voting rule of the random forest algorithm cannot distinguish strong classifiers from weak ones, a weighted voting method is proposed, and on that basis an Improved Random Forest Classification Model (IRFCM) for detecting Android malware is presented. IRFCM takes the Permission and Intent information in the AndroidManifest.xml file as feature attributes, optimizes them with a feature selection algorithm to produce a set of feature vectors, and then classifies that set with the model. Experimental results show that IRFCM achieves relatively good classification accuracy and efficiency. Finally, to address the long decompilation time and slow feature extraction for application installation packages in big data settings, IRFCM is combined with the Spark framework to design and implement Android malware detection in a parallel environment. The sample data are converted into Spark Resilient Distributed Datasets (RDDs), and feature extraction and classification are performed on the RDDs in parallel on a virtual machine cluster. Compared with the single-machine environment, the parallel environment effectively improves the detection efficiency of Android malware.
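The weighted voting idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how IRFCM computes its weights, so this example assumes each tree's weight is its accuracy on a held-out validation set; the function names (`validation_accuracy`, `majority_vote`, `weighted_vote`) and the sample numbers are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def validation_accuracy(tree_predictions, true_labels):
    """Fraction of validation samples one tree labels correctly.

    Assumption: this accuracy is used as the tree's voting weight.
    """
    correct = sum(p == t for p, t in zip(tree_predictions, true_labels))
    return correct / len(true_labels)


def majority_vote(predictions):
    """Standard random forest vote: every tree counts equally."""
    tally = defaultdict(float)
    for label in predictions:
        tally[label] += 1.0
    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(tally), key=tally.get)


def weighted_vote(predictions, weights):
    """Weighted vote: each tree contributes its weight to its chosen label."""
    tally = defaultdict(float)
    for label, weight in zip(predictions, weights):
        tally[label] += weight
    return max(sorted(tally), key=tally.get)


if __name__ == "__main__":
    # Three weak trees say "benign", two strong trees say "malware".
    votes = ["benign", "benign", "benign", "malware", "malware"]
    weights = [0.55, 0.52, 0.50, 0.95, 0.90]  # hypothetical accuracies
    print(majority_vote(votes))
    print(weighted_vote(votes, weights))
```

With equal votes the three weak trees outnumber the two strong ones, whereas the weighted tally lets the more accurate trees decide the label, which is the distinction between strong and weak classifiers that the abstract describes.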
[Degree-granting institution]: Civil Aviation University of China
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP316; TP309