Mining Modules’ Dependencies for Malware Detection
“IT Security for the Next Generation”
Asia Pacific & MEA Cup, Hong Kong
14-16 March, 2012
Masoud Narouei, Mansour Ahmadi, Ashkan Sami
Shiraz University
Outline
1) Introduction
2) Modules’ Dependencies
3) Proposed System
4) Experiments & Results
5) Conclusions & Future work
6) References
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 2 | | 14-16 March, 2012
Introduction
Malware:
Any software designed to
• Gain unauthorized access
• Steal user’s critical information
• Cause damage to computers
About 413 billion infections were detected by antivirus products in Q1 2011
Detection techniques:
• Static
• Dynamic
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 4 | | 14-16 March, 2012
Static techniques:
Analyze the executable without running it
Benefits:
• High detection rate
• Speed
Drawbacks:
• Code packing
• Polymorphism
• …
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 5 | | 14-16 March, 2012
Dynamic techniques:
Consider the behavior of malware during run time
Benefits:
• Robust against obfuscation and other techniques
Drawbacks:
• Time consuming during data gathering
• Monitoring dynamic libraries such as OCX, DLL, …, is not easy
– 60 percent of the malware collected at the KingSoft anti-malware lab are DLL files
• Debugger detection
• Virtual machine detection
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 6 | | 14-16 March, 2012
Motivation
Each approach has benefits and drawbacks
A new method is needed to:
• Utilize the benefits of both
• Cover evasion such as packing, polymorphism, …
• Cover Fake API call injection
We extract behavior without execution
We propose a heuristic static method
We consider modules’ dependencies of PE
We consider frequent subtrees as features
This is the first work to consider such features
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 7 | | 14-16 March, 2012
Modules’ dependencies Tree
PE files interact with the OS through APIs, which are grouped into DLLs.
Each PE depends on some DLLs for execution.
DLLs in turn depend on other DLLs to complete their tasks.
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 9 | | 14-16 March, 2012
Figure 1.
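The dependency tree described above can be sketched as follows. This is a minimal illustration, not the authors' extractor: the IMPORTS table, the module names in it, and the helper build_tree are hypothetical, and a real system would read each module's import table from the PE file instead of a hard-coded dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One module (PE or DLL) in the dependency tree."""
    label: str
    children: List["Node"] = field(default_factory=list)


# Hypothetical import table: module -> modules it imports directly.
IMPORTS: Dict[str, List[str]] = {
    "sample.exe": ["kernel32.dll", "user32.dll"],
    "kernel32.dll": ["ntdll.dll"],
    "user32.dll": ["gdi32.dll", "kernel32.dll"],
    "gdi32.dll": ["kernel32.dll"],
    "ntdll.dll": [],
}


def build_tree(module: str, imports: Dict[str, List[str]],
               max_depth: int, depth: int = 0) -> Node:
    """Expand a module's dependencies recursively, down to max_depth levels.

    The depth limit also keeps circular dependencies from recursing forever.
    """
    node = Node(module)
    if depth < max_depth:
        for dep in imports.get(module, []):
            node.children.append(build_tree(dep, imports, max_depth, depth + 1))
    return node


tree = build_tree("sample.exe", IMPORTS, max_depth=2)
```

With max_depth=2 the root has its direct DLLs as children and their DLLs as grandchildren, which corresponds to a depth-2 tree in the sense assumed here.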
Tree’s String Encoding
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 10 | | 14-16 March, 2012
Figure 2.
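The slide shows the encoding only as a figure, so the following is a sketch of one common way to turn a tree into a string for a subtree miner: a preorder walk that emits each label and a backtrack marker when leaving a child. The "$" marker and the nested-tuple tree layout are assumptions made for illustration and may differ from the encoding in the figure.

```python
from typing import List, Tuple

# A tree is (label, [child trees]); this layout is an assumption for the sketch.
Tree = Tuple[str, list]


def encode(tree: Tree, backtrack: str = "$") -> List[str]:
    """Preorder traversal; emit the backtrack marker after finishing each child."""
    label, children = tree
    tokens = [label]
    for child in children:
        tokens.extend(encode(child, backtrack))
        tokens.append(backtrack)
    return tokens


def decode(tokens: List[str], backtrack: str = "$") -> Tree:
    """Rebuild the tree from its token sequence using a stack of open nodes."""
    root: Tree = (tokens[0], [])
    stack = [root]
    for tok in tokens[1:]:
        if tok == backtrack:
            stack.pop()
        else:
            node: Tree = (tok, [])
            stack[-1][1].append(node)
            stack.append(node)
    return root


example: Tree = ("sample.exe", [("kernel32.dll", [("ntdll.dll", [])]),
                                ("user32.dll", [])])
encoded = encode(example)
```

Because the marker records where each subtree ends, the token sequence can be decoded back into the same tree, so no structure is lost in the string form.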
System
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 12 | | 14-16 March, 2012
Figure 3.
Closed Tree
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 13 | | 14-16 March, 2012
Figure 4. Sample tree
Figure 5. Sample closed trees with a support count of 2
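A frequent subtree is closed when no supertree of it has the same support. The sketch below illustrates that idea under a strong simplification: each tree is reduced to its set of parent-child edges and a pattern is any set of edges, so this is a brute-force stand-in and not a real closed-subtree miner such as CMTreeMiner. The sample database and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]          # (parent module, child module)
Pattern = FrozenSet[Edge]


def frequent_patterns(db: List[Pattern], min_support: int) -> Dict[Pattern, int]:
    """Brute-force support counting over all edge subsets (small inputs only)."""
    all_edges = sorted(set().union(*db))
    result: Dict[Pattern, int] = {}
    for size in range(1, len(all_edges) + 1):
        for combo in combinations(all_edges, size):
            pattern = frozenset(combo)
            support = sum(1 for tree in db if pattern <= tree)
            if support >= min_support:
                result[pattern] = support
    return result


def closed_patterns(freq: Dict[Pattern, int]) -> List[Pattern]:
    """Keep a pattern only if no strict superset has the same support."""
    return [p for p, s in freq.items()
            if not any(p < q and s == freq[q] for q in freq)]


EK, KN = ("exe", "kernel32.dll"), ("kernel32.dll", "ntdll.dll")
DB = [
    frozenset({EK, KN, ("exe", "user32.dll")}),
    frozenset({EK, KN}),
    frozenset({EK, ("exe", "ws2_32.dll")}),
]
freq = frequent_patterns(DB, min_support=2)
closed = closed_patterns(freq)
```

In this toy database the single edge kernel32.dll → ntdll.dll is frequent but not closed, because the two-edge pattern containing it occurs in exactly the same two trees.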
Detection Rate
SUP  PR     DR     ACC (%)
30   0.892  0.925  90.43
40   0.895  0.93   90.86
50   0.889  0.928  90.43
60   0.895  0.926  90.69
70   0.888  0.928  90.35
80   0.899  0.925  90.86
90   0.901  0.928  91.11
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 15 | | 14-16 March, 2012
Table 2.
SUP  PR     DR     ACC (%)
30   0.857  0.943  89.06
40   0.849  0.941  88.47
50   0.848  0.94   88.30
60   0.849  0.94   88.38
70   0.85   0.941  88.55
80   0.846  0.945  88.38
90   0.834  0.943  87.53
Results for tree depths 2 and 3, on 598 malware and 573 benign samples
Table 1.
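The slides do not expand the column abbreviations. The sketch below assumes PR is precision, DR is the detection rate (recall on the malware class), and ACC is accuracy in percent; these readings and the confusion-matrix counts used in the example are assumptions, not values from the experiments.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute the three assumed table columns from confusion-matrix counts.

    tp: malware flagged as malware     fn: malware missed
    fp: benign flagged as malware      tn: benign passed as benign
    """
    precision = tp / (tp + fp)                          # assumed meaning of PR
    detection_rate = tp / (tp + fn)                     # assumed meaning of DR
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)  # ACC in percent
    return {"PR": precision, "DR": detection_rate, "ACC": accuracy}


# Hypothetical counts, for illustration only.
metrics = detection_metrics(tp=90, fp=10, tn=90, fn=10)
```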
Rule between Depths of Tree
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 16 | | 14-16 March, 2012
Tested on 200 malware and 200 benign samples, due to limited system speed
Figure 6.
Stuxnet & Duqu
Stuxnet: a very famous piece of industrial malware
The DLL dependency tree was extracted (without unpacking)
Frequent patterns were searched for
A feature vector for Stuxnet was created
Stuxnet was successfully detected as malware without running it
Duqu was also detected in the same way
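The step from mined patterns to a feature vector can be sketched as a binary vector with one entry per pattern, set to 1 when the pattern occurs in the sample's dependency tree. As a simplification, trees and patterns are represented here as sets of parent-child edges; the pattern list and the sample are hypothetical and are not taken from the Stuxnet analysis.

```python
from typing import FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]  # (parent module, child module)


def feature_vector(tree_edges: Set[Edge],
                   patterns: List[FrozenSet[Edge]]) -> List[int]:
    """One binary feature per mined pattern: 1 if all its edges are in the tree."""
    return [1 if pattern <= tree_edges else 0 for pattern in patterns]


# Hypothetical mined patterns, in a fixed order shared by all samples.
PATTERNS: List[FrozenSet[Edge]] = [
    frozenset({("exe", "kernel32.dll")}),
    frozenset({("exe", "kernel32.dll"), ("kernel32.dll", "ntdll.dll")}),
    frozenset({("exe", "ws2_32.dll")}),
]

# Hypothetical sample, reduced to its dependency edges.
sample_edges: Set[Edge] = {
    ("exe", "kernel32.dll"),
    ("kernel32.dll", "ntdll.dll"),
    ("exe", "advapi32.dll"),
}
vector = feature_vector(sample_edges, PATTERNS)
```

A vector built this way can be passed to any standard classifier, and since it is derived from import information alone, the sample never has to run.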
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 17 | | 14-16 March, 2012
Variant Similarity Results
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 18 | | 14-16 March, 2012
Table 4.
Source PE                  Destination PE             Cosine  Jaccard Pearson Average
VIRUS.WIN32.EVOL.A         VIRUS.WIN32.EVOL.B         1       1       1       1
VIRUS.WIN32.EVUL.8192.A    VIRUS.WIN32.EVUL.8192.B    1       1       1       1
NET-WORM.WIN32.VESSER.A    NET-WORM.WIN32.VESSER.B    1       1       1       1
VIRUS.WIN32.THORIN.B       VIRUS.WIN32.THORIN.E       1       1       1       1
VIRUS.WIN32.MIAM.1727      VIRUS.WIN32.MIAM.4716      1       1       1       1
VIRUS.WIN32.SAVIOR.1696    VIRUS.WIN32.SAVIOR.1832    1       1       1       1
NET-WORM.WIN32.WELCHIA.A   NET-WORM.WIN32.WELCHIA.B   0.9541  0.9104  0.9401  0.9349
VIRUS.WIN32.INRAR.A        VIRUS.WIN32.INRAR.E        0.9177  0.8421  0.9058  0.8885
VIRUS.WIN32.DRIVALON.1876  VIRUS.WIN32.MAGIC.3038     0.8572  0.7500  0.8381  0.8151
VIRUS.WIN32.BLATEROZ       VIRUS.WIN32.CHAMP          0.8291  0.6875  0.8120  0.7762
VIRUS.WIN32.OROCH.5420     VIRUS.WIN32.PARADISE.2116  0.8291  0.6875  0.8120  0.7762
BOOTCFG                    VIRUS.WIN32.ZOMBIE         0.6370  0.4133  0.5701  0.5401
DIALER                     NET-WORM.WIN32.DOMWOOT.C   0.6170  0.3870  0.5303  0.5114
Calculator                 VIRUS.WIN32.EVOL.C         0.3853  0.2058  0.3439  0.3117
Using the feature vector as the new signature
Row groups in Table 4: Similar, Almost Similar, Not Similar
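The three measures in Table 4 can be computed on binary feature vectors as sketched below. The Average column is consistent with the arithmetic mean of the three scores (for example, (0.9541 + 0.9104 + 0.9401) / 3 ≈ 0.9349). The handling of degenerate cases such as all-zero or constant vectors is an assumption of this sketch, since the slides do not say how they were treated.

```python
import math
from typing import Sequence


def cosine(a: Sequence[int], b: Sequence[int]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_sq = sum(x * x for x in a) * sum(y * y for y in b)
    return dot / math.sqrt(norm_sq) if norm_sq else 0.0


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0  # two empty vectors: treated as equal


def pearson(a: Sequence[int], b: Sequence[int]) -> float:
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for constant vectors; fall back to equality.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / math.sqrt(var_a * var_b)


def average_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean of the three measures, matching the Average column."""
    return (cosine(a, b) + jaccard(a, b) + pearson(a, b)) / 3


same = average_similarity([1, 1, 0, 1], [1, 1, 0, 1])
```

Two variants with identical feature vectors score 1 on every measure, as in the first six rows of the table.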
Evasion Packers
Packers evade static analyzers
Manually packed with: UPX, ASPACK, Exe32pack, Exepacker, Petit, …
The packer mirrors the resource tree
Our method is resilient against common packers
Packers that eliminate parts of the tree, such as PECOMPACT, can be handled by
recovering the import address table at run time
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 19 | | 14-16 March, 2012
Figure 7.
Fake API & DLL Injection
Fake API and DLL injection does not influence our method
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 20 | | 14-16 March, 2012
Figure 8.
Conclusions & Future Work
Both dynamic and static detection have drawbacks
Our approach combines the benefits of both and can serve as a hybrid of the two
The malware’s behavioral tree was extracted
Closed frequent trees were mined to find the structures unique to malware
Our technique achieved more than 94% recall on malware
Future Work
• Expand the behavioral tree by considering more dependencies among modules
• Prune the tree by removing unimportant substructures
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 21 | | 14-16 March, 2012
Our Team
"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 22 | | 14-16 March, 2012
Masoud Narouei Mansour Ahmadi Dr. Ashkan Sami
Thank You
Q & A