Mining Modules’ Dependencies for Malware Detection

“IT Security for the Next Generation”

Asia Pacific & MEA Cup, Hong Kong, 14-16 March, 2012

Masoud Narouei, Mansour Ahmadi, Ashkan Sami

Shiraz University

Outline

1) Introduction

2) Modules’ Dependencies

3) Proposed System

4) Experiments & Results

5) Conclusions & Future work

6) References

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 2 | | 14-16 March, 2012

1 / Introduction

Introduction, Static techniques, Dynamic techniques, Motivation

Introduction

Malware:

Any software designed to

• Gain unauthorized access

• Steal user’s critical information

• Cause damage to computers

About 413 billion infections were detected by anti-virus software in Q1 2011

Detection techniques:

• Static

• Dynamic

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 4 | | 14-16 March, 2012

Static techniques:

Analyze the executable without running it

Benefits:

• High detection rate

• Speed

Drawbacks:

• Code packing

• Polymorphism

• …

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 5 | | 14-16 March, 2012

Dynamic techniques:

Consider the behavior of malware during run time

Benefits:

• Robust against obfuscation and other techniques

Drawbacks:

• Data gathering is time-consuming

• Dynamic libraries such as OCX, DLL, … are not easy to monitor

– 60 percent of the malware collected at the KingSoft anti-malware lab consists of DLL files

• Debugger detection

• Virtual machine detection

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 6 | | 14-16 March, 2012

Motivation

Each approach has benefits and drawbacks

A new method is needed to:

• Utilize the benefits of both

• Handle evasion techniques such as packing, polymorphism, …

• Handle fake API call injection

We extract behavior without execution

We propose a heuristic static method

We consider the module dependencies of PE (Portable Executable) files

We consider frequent subtrees as features

This is the first work that considers such features

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 7 | | 14-16 March, 2012

2 / Modules’ Dependencies

Modules’ dependencies Tree, Tree’s String Encoding

Modules’ dependencies Tree

PE files interact with the OS through APIs, which are grouped into DLLs.

Each PE depends on some DLLs for execution.

DLLs in turn depend on other DLLs to complete their tasks.
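The dependency relation described above can be sketched as a depth-limited tree. This is only an illustration: the `IMPORTS` table, the module names in it, and the helper names `build_tree` and `count_nodes` are hypothetical. In practice the import lists would come from a tool such as Dependency Walker [4].

```python
from typing import Dict, List

# Hypothetical import table: module name -> modules it imports directly.
# Real data would come from a dependency scanner; these entries are made up.
IMPORTS: Dict[str, List[str]] = {
    "sample.exe": ["kernel32.dll", "user32.dll"],
    "kernel32.dll": ["ntdll.dll"],
    "user32.dll": ["gdi32.dll", "kernel32.dll"],
    "gdi32.dll": ["kernel32.dll"],
}


def build_tree(module: str, imports: Dict[str, List[str]], depth: int) -> dict:
    """Expand `module` into a nested dict, at most `depth` levels below it.

    The depth limit also guarantees termination when modules import
    each other in a cycle.
    """
    children = []
    if depth > 0:
        for dep in sorted(imports.get(module, [])):
            children.append(build_tree(dep, imports, depth - 1))
    return {"label": module, "children": children}


def count_nodes(tree: dict) -> int:
    """Count all nodes in the tree, including the root."""
    return 1 + sum(count_nodes(child) for child in tree["children"])


tree = build_tree("sample.exe", IMPORTS, depth=2)
print(count_nodes(tree))
```

A module that is imported along several paths appears once per path, so the result is a tree rather than a graph.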

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 9 | | 14-16 March, 2012

Figure 1.

Tree’s String Encoding

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 10 | | 14-16 March, 2012

Figure 2.
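The encoding shown in Figure 2 is not preserved in this transcript. As an assumption, the sketch below uses one encoding that is common in the frequent-subtree mining literature: node labels in depth-first preorder, with a `-1` token each time the traversal backtracks from a child to its parent. The `node` helper and the module names are hypothetical.

```python
def node(label, *children):
    """Build a tree node as a nested dict (hypothetical helper)."""
    return {"label": label, "children": list(children)}


def encode(tree: dict) -> str:
    """Return the preorder string encoding with -1 as the backtrack token."""
    tokens = []

    def visit(current: dict) -> None:
        tokens.append(current["label"])
        for child in current["children"]:
            visit(child)
            tokens.append("-1")  # returning from child to parent

    visit(tree)
    return " ".join(tokens)


example = node(
    "sample.exe",
    node("kernel32.dll", node("ntdll.dll")),
    node("user32.dll"),
)
print(encode(example))
# sample.exe kernel32.dll ntdll.dll -1 -1 user32.dll -1
```

The string determines the tree uniquely, which lets a miner compare and count subtrees by working on strings.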

3 / Proposed System

System, Closed Tree

System

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 12 | | 14-16 March, 2012

Figure 3.

Closed Tree

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 13 | | 14-16 March, 2012

Figure 4. Sample tree

Figure 5. Sample closed trees with a support count of 2
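To illustrate what "closed" and "support count" mean, here is a deliberately simplified sketch. It is not CMTreeMiner [3]: instead of general subtrees it considers only root-to-node label paths, and it assumes every tree has the same generic root label `"PE"`. A pattern is frequent if it occurs in at least `min_support` trees, and closed if no longer pattern that extends it has the same support. All names and sample trees are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Path = Tuple[str, ...]


def node(label, *children):
    """Build a tree node as a nested dict (hypothetical helper)."""
    return {"label": label, "children": list(children)}


def root_paths(tree: dict, prefix: Path = ()) -> Set[Path]:
    """Collect every label path that starts at the root of `tree`."""
    path = prefix + (tree["label"],)
    paths = {path}
    for child in tree["children"]:
        paths |= root_paths(child, path)
    return paths


def closed_frequent_paths(trees: List[dict], min_support: int) -> Dict[Path, int]:
    """Return the closed frequent root paths with their support counts."""
    support: Counter = Counter()
    for tree in trees:
        support.update(root_paths(tree))  # a set, so each tree counts once
    frequent = {p: s for p, s in support.items() if s >= min_support}
    closed = {}
    for p, s in frequent.items():
        # Support never grows when a path is extended, so checking the
        # one-step extensions is enough to decide closedness.
        absorbed = any(
            len(q) == len(p) + 1 and q[: len(p)] == p and frequent[q] == s
            for q in frequent
        )
        if not absorbed:
            closed[p] = s
    return closed


trees = [
    node("PE", node("kernel32.dll", node("ntdll.dll")), node("user32.dll")),
    node("PE", node("kernel32.dll", node("ntdll.dll")), node("advapi32.dll")),
    node("PE", node("user32.dll")),
]
print(closed_frequent_paths(trees, min_support=2))
```

With a support count of 2, the path ending at `kernel32.dll` is dropped because its extension to `ntdll.dll` occurs in exactly the same trees, so keeping only closed patterns loses no support information.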

4 / Experiments & Results

Detection rate, Variant Similarity, Evasion, Fake API & DLL Injection

Detection Rate

SUP   PR      DR      ACC
30    0.892   0.925   90.43
40    0.895   0.93    90.86
50    0.889   0.928   90.43
60    0.895   0.926   90.69
70    0.888   0.928   90.35
80    0.899   0.925   90.86
90    0.901   0.928   91.11

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 15 | | 14-16 March, 2012

Table 2.

SUP   PR      DR      ACC
30    0.857   0.943   89.06
40    0.849   0.941   88.47
50    0.848   0.94    88.30
60    0.849   0.94    88.38
70    0.85    0.941   88.55
80    0.846   0.945   88.38
90    0.834   0.943   87.53

Depth 2, Depth 3

Dataset: 598 malware and 573 benign samples

Table 1.
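The slides do not define the column headers. Assuming PR is precision, DR is the detection rate (recall on the malware class) and ACC is accuracy in percent, the three values follow from a confusion matrix as sketched below. The counts in the example are illustrative only and are not taken from the slides.

```python
from typing import Tuple


def metrics(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, detection rate, accuracy in percent).

    tp: malware flagged as malware     fp: benign flagged as malware
    tn: benign flagged as benign       fn: malware flagged as benign
    """
    precision = tp / (tp + fp)
    detection_rate = tp / (tp + fn)
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    return precision, detection_rate, accuracy


# Illustrative counts for a set of 598 malware and 573 benign files.
pr, dr, acc = metrics(tp=555, fp=61, tn=512, fn=43)
print(f"PR={pr:.3f} DR={dr:.3f} ACC={acc:.2f}")
```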

Rule between Depths of Tree

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 16 | | 14-16 March, 2012

Tested on 200 malware and 200 benign samples because of limited system speed

Figure 6.

Stuxnet & Duqu

A well-known piece of industrial malware

The DLL dependency tree was extracted (without unpacking)

Frequent patterns were searched

A feature vector for Stuxnet was created

Successfully detected as malware without running it

Duqu was also detected in the same way
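The step from mined patterns to a feature vector can be sketched as a presence test: one binary entry per pattern. As a simplifying assumption, patterns here are root-to-node label paths rather than general subtrees, and the pattern list, module names and helper names are hypothetical.

```python
from typing import List, Set, Tuple

Path = Tuple[str, ...]


def node(label, *children):
    """Build a tree node as a nested dict (hypothetical helper)."""
    return {"label": label, "children": list(children)}


def root_paths(tree: dict, prefix: Path = ()) -> Set[Path]:
    """Collect every label path that starts at the root of `tree`."""
    path = prefix + (tree["label"],)
    paths = {path}
    for child in tree["children"]:
        paths |= root_paths(child, path)
    return paths


def feature_vector(tree: dict, patterns: List[Path]) -> List[int]:
    """Return 1 for each pattern that occurs in `tree`, otherwise 0."""
    present = root_paths(tree)
    return [1 if pattern in present else 0 for pattern in patterns]


# Hypothetical patterns, as they might come out of the mining step.
patterns = [
    ("PE", "kernel32.dll"),
    ("PE", "kernel32.dll", "ntdll.dll"),
    ("PE", "ws2_32.dll"),
]
sample = node("PE", node("kernel32.dll", node("ntdll.dll")), node("user32.dll"))
print(feature_vector(sample, patterns))
# [1, 1, 0]
```

A vector of this kind is what a classifier such as those in Weka [5] would take as input.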

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 17 | | 14-16 March, 2012

Variant Similarity Results

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 18 | | 14-16 March, 2012

Table 4.

Source PE                  Destination PE             Cosine  Jaccard  Pearson  Average
VIRUS.WIN32.EVOL.A         VIRUS.WIN32.EVOL.B         1       1        1        1
VIRUS.WIN32.EVUL.8192.A    VIRUS.WIN32.EVUL.8192.B    1       1        1        1
NET-WORM.WIN32.VESSER.A    NET-WORM.WIN32.VESSER.B    1       1        1        1
VIRUS.WIN32.THORIN.B       VIRUS.WIN32.THORIN.E       1       1        1        1
VIRUS.WIN32.MIAM.1727      VIRUS.WIN32.MIAM.4716      1       1        1        1
VIRUS.WIN32.SAVIOR.1696    VIRUS.WIN32.SAVIOR.1832    1       1        1        1
NET-WORM.WIN32.WELCHIA.A   NET-WORM.WIN32.WELCHIA.B   0.9541  0.9104   0.9401   0.9349
VIRUS.WIN32.INRAR.A        VIRUS.WIN32.INRAR.E        0.9177  0.8421   0.9058   0.8885
VIRUS.WIN32.DRIVALON.1876  VIRUS.WIN32.MAGIC.3038     0.8572  0.7500   0.8381   0.8151
VIRUS.WIN32.BLATEROZ       VIRUS.WIN32.CHAMP          0.8291  0.6875   0.8120   0.7762
VIRUS.WIN32.OROCH.5420     VIRUS.WIN32.PARADISE.2116  0.8291  0.6875   0.8120   0.7762
BOOTCFG                    VIRUS.WIN32.ZOMBIE         0.6370  0.4133   0.5701   0.5401
DIALER                     NET-WORM.WIN32.DOMWOOT.C   0.6170  0.3870   0.5303   0.5114
Calculator                 VIRUS.WIN32.EVOL.C         0.3853  0.2058   0.3439   0.3117

Using feature vector as the new signature

Row groups, top to bottom: Similar, Almost Similar, Not Similar
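The three similarity measures in Table 4 can be sketched for binary feature vectors as follows. The Average column matches the arithmetic mean of the other three (for the WELCHIA row, (0.9541 + 0.9104 + 0.9401) / 3 ≈ 0.9349). The function names and the sample vectors are hypothetical, and the exact formulas the authors used are an assumption.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b (neither may be all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def jaccard(a: Sequence[float], b: Sequence[float]) -> float:
    """Shared features divided by features present in either vector."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either  # assumes at least one feature is present


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient (neither vector may be constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def average_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Arithmetic mean of the three measures."""
    return (cosine(a, b) + jaccard(a, b) + pearson(a, b)) / 3.0


variant_a = [1, 1, 0, 1, 0]
variant_b = [1, 1, 0, 0, 0]
print(round(average_similarity(variant_a, variant_b), 4))
```

Two variants with identical feature vectors score 1 on all three measures, as in the first six rows of the table.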

Evasion Packers

Packers are used to evade static analyzers

Samples were packed manually with UPX, ASPACK, Exe32pack, Exepacker, Petit, …

The packer mirrors the resource tree

Our method is resilient against common packers

Packers that eliminate parts of the tree, such as PECOMPACT, can be handled by recovering the import address table at run time

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 19 | | 14-16 March, 2012

Figure 7.

Fake API & DLL Injection

Fake API calls and DLL injection do not affect our method

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 20 | | 14-16 March, 2012

Figure 8.

Conclusions & Future Work

Both dynamic and static detection have drawbacks

Our approach combines the benefits of both and can serve as a hybrid of the two

The malware's behavioral tree was extracted

Closed frequent trees were mined to find the structures unique to malware

Our technique achieved more than 94% recall for malware

Future Work

• Expand the behavioral tree by considering more dependencies among modules

• Prune the tree by removing unimportant substructures

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 21 | | 14-16 March, 2012

Our Team

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 22 | | 14-16 March, 2012

Masoud Narouei Mansour Ahmadi Dr. Ashkan Sami

Thank You

Q & A

References

[1] LUCIA, D. L., LINGXIAO JIANG, ADITYA BUDI 2010. Comprehensive evaluation of association measures for fault localization. ICSM.

[2] YANFANG YE, TAO LI, YONG CHEN & JIANG, Q. 2010. Automatic Malware Categorization Using Cluster Ensemble. KDD'10, Washington, DC, USA.

[3] YUN CHI, YIRONG YANG, YI XIA & MUNTZ, R. R. 2004. CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees. In: Proc. of the 8th Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD-2004), 63-73.

[4] MILLER, S. P. 2006. Dependency Walker [Online]. Available: http://www.dependencywalker.com/

[5] WAIKATO. 2008. Weka 3: Data Mining open source Software [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/

[6] STEFAN TANASE, S. S. R. 2011. Malware Report. Kaspersky lab.

"IT Security for the Next Generation", Asia Pacific & MEA Cup PAGE 24 | | 14-16 March, 2012