
An Anti-Obfuscation Method for Detecting Similarity Among Android Applications in Large Scale

Cited by: 9
Abstract: As code obfuscation and packing become widespread, Android application similarity detection based on behavioral features is increasingly affected. We propose an anti-obfuscation method for detecting similarity among Android applications at large scale. The method computes application similarity from the content features of specific files inside each application, so it is unaffected by code obfuscation and effectively resists the interference introduced by file obfuscation. Based on statistics of the file types found in 59,000 applications, we select image, audio, and layout files as feature files because they are widespread, representative, and measurable. For each of the three file types we propose a dedicated content-feature extraction method and similarity measure, and we assign weights to the resulting similarities through machine learning to further improve detection accuracy. We also implement a prototype system and optimize each step to speed up computation, which makes the system suitable for large-scale scenarios. Using legitimate applications and known malicious applications as references, we run similarity detection over the 59,000 applications. The results show that similarity detection based on file content accurately identifies repackaged applications and applications containing known malicious code, and that it outperforms existing schemes in both efficiency and accuracy.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, No. 7, pp. 1446-1457 (12 pages)
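Funding: National Basic Research Program of China (973 Program) (2012CB315804); Major Research Plan of the National Natural Science Foundation of China (91118006); National Natural Science Foundation of China (61073179); Beijing Natural Science Foundation (4122086)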
Keywords: file content characteristics; fuzzy hash; perceptual features; Android; application similarity; anti-obfuscation
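The abstract describes scoring two applications by comparing the contents of their image, audio, and layout files and combining the three per-type similarities with learned weights. The following Python sketch illustrates that general idea only; it is not the paper's implementation. It assumes exact SHA-256 digests as a stand-in for the perceptual features and fuzzy hashes named in the keywords, Jaccard overlap as the per-type similarity, and fixed weights. The extension groups, the weight values, and every function name here are hypothetical.

```python
import hashlib
from typing import Dict, Mapping, Set

# Hypothetical extension groups for the three feature-file types.
FEATURE_EXTENSIONS = {
    "image": (".png", ".jpg", ".gif"),
    "audio": (".mp3", ".ogg", ".wav"),
    "layout": (".xml",),
}

# Hypothetical fixed weights; the paper learns its weights from data.
WEIGHTS = {"image": 0.5, "audio": 0.2, "layout": 0.3}


def content_digests(files: Mapping[str, bytes]) -> Dict[str, Set[str]]:
    """Group content digests by feature type; file names are used only
    to pick the type, so renaming a file does not change its digest."""
    digests: Dict[str, Set[str]] = {kind: set() for kind in FEATURE_EXTENSIONS}
    for name, data in files.items():
        for kind, extensions in FEATURE_EXTENSIONS.items():
            if name.lower().endswith(extensions):
                # Assumption: an exact digest replaces the perceptual or
                # fuzzy hash a real system would need to tolerate edits.
                digests[kind].add(hashlib.sha256(data).hexdigest())
    return digests


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of digests common to both sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def app_similarity(app_a: Mapping[str, bytes], app_b: Mapping[str, bytes]) -> float:
    """Weighted mean of per-type similarities over the file types that
    occur in at least one of the two applications."""
    digests_a, digests_b = content_digests(app_a), content_digests(app_b)
    active = [kind for kind in WEIGHTS if digests_a[kind] or digests_b[kind]]
    total_weight = sum(WEIGHTS[kind] for kind in active)
    if total_weight == 0:
        return 0.0
    score = sum(
        WEIGHTS[kind] * jaccard(digests_a[kind], digests_b[kind])
        for kind in active
    )
    return score / total_weight
```

Each application is passed as a mapping from file path to file bytes, for example as read from an unpacked APK. Because only file contents are compared, renaming resources or obfuscating `classes.dex` leaves the score unchanged in this sketch. Exact digests would miss resized or re-encoded resources, which is the gap that perceptual and fuzzy hashing are meant to close.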
