Data Management (Data Mining Association Rule)

Adam Mukharil Bachtiar, M.T.

Understanding the Basics of Data Mining

What is Data Mining?

The extraction of interesting knowledge, in the form of rules, regularities, patterns, constraints, and so on, from data stored in large databases.

Overview of Data Mining

Data mining, also known as KDD (Knowledge Discovery in Databases), uses historical data to extract knowledge.

How is data mining carried out?

Data mining serves two functions: a predictive function and a descriptive function.

Predictive function

Predicts the value of an attribute based on the values of other attributes.

Descriptive function

Derives patterns that summarize the underlying relationships in the data being used.

What is Know Your Customer (KYC) (https://www.youtube.com/watch?v=vLeC6khWzpM)

Business Analytics: Data Trends Let Businesses Spot New Opportunities (https://www.youtube.com/watch?v=HbHTvqZE3D8)

There are three data mining methods: Association Rule, Classification, and Clustering.

This chapter explains data mining using the Association Rule method.

Association Rule Explained

The Association Rule method is also often called Market Basket Analysis.

Association Rule is used to extract interesting associative relationships or correlations between items.

Illustration of Associative Relationships Between Items

if antecedent then consequent

For example:

if bread then jam

This means:
1. There is an associative relationship between bread and jam.
2. If someone buys bread, there is an n% chance that they will also buy jam in the same purchase.

Two parameters need to be understood in the Association Rule method: the support value and the confidence value.

Support: the ratio of the number of transactions that contain both the antecedent and the consequent to the total number of transactions.

Confidence: the ratio of the number of transactions that contain both the antecedent and the consequent to the number of transactions that contain all items in the antecedent.
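Written as formulas for a rule "if A then B", the two definitions read as follows. The symbols are notation assumed here for illustration: count(X) is the number of transactions that contain every item of X, and N is the total number of transactions.

```latex
\mathrm{support}(A \Rightarrow B) = \frac{\mathrm{count}(A \cup B)}{N},
\qquad
\mathrm{confidence}(A \Rightarrow B) = \frac{\mathrm{count}(A \cup B)}{\mathrm{count}(A)}
```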

How to calculate support and confidence

For two recorded transactions, the rules below have the following support and confidence values:

IF A THEN B,C: Support = 0.5 (50%), Confidence = 1 (100%)

IF B THEN C: Support = 1 (100%), Confidence = 1 (100%)

IF B THEN C,A: Support = 0.5 (50%), Confidence = 0.5 (50%)
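The calculation can be checked with a short Python sketch. The two transactions used here, {A, B, C} and {B, C}, are an assumption: they are one pair of transactions consistent with the three rule values listed, not data taken from the slides. The function names are likewise illustrative.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(antecedent, consequent, transactions):
    """Share of all transactions that contain antecedent and consequent."""
    return count(antecedent | consequent, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of antecedent transactions that also contain the consequent."""
    both = count(antecedent | consequent, transactions)
    return both / count(antecedent, transactions)


# Assumed transactions (hypothetical), chosen to match the rule values listed.
transactions = [{"A", "B", "C"}, {"B", "C"}]

rules = [({"A"}, {"B", "C"}), ({"B"}, {"C"}), ({"B"}, {"C", "A"})]
for ante, cons in rules:
    print(sorted(ante), "->", sorted(cons),
          "support =", support(ante, cons, transactions),
          "confidence =", confidence(ante, cons, transactions))
```

Both measures share the same numerator; only the denominator differs, which is why a rule can have low support but high confidence.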

Several algorithms can be used for association rule mining, including the Apriori algorithm and FP-Growth.

Section 1: Apriori Algorithm

Basic idea: build up frequent itemsets and prune the itemsets whose frequency falls below the minimum support, keeping only those with Support >= Minimum Support.

Apriori algorithm pseudocode
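As a rough illustration of the basic idea, the level-wise loop can be sketched in Python. This is a minimal sketch, not the pseudocode from the slides: the function name `apriori`, the use of absolute support counts, and the toy transactions (bread, jam, milk) are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support count} for every frequent itemset."""
    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({i for t in transactions for i in t})
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        counted = {c: count(c) for c in current}
        # Keep only candidates that reach the minimum support.
        level = {c: n for c, n in counted.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop candidates that have any infrequent (k-1)-subset.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


# Hypothetical toy data, used only to exercise the function.
toy = [{"bread", "jam"}, {"bread", "jam", "milk"}, {"bread"}, {"milk"}]
result = apriori(toy, min_support=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The loop stops as soon as a level produces no candidates, which is the "repeat while itemsets can still be formed" condition.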

How does it work?

Apriori Case Study

Nine transactions occurred:

TID List of Items
T1 I1, I2, I5
T2 I2, I4
T3 I2, I3
T4 I1, I2, I4
T5 I1, I3
T6 I2, I3
T7 I1, I3
T8 I1, I2, I3, I5
T9 I1, I2, I3

Step 1: Determine the minimum support and minimum confidence values.

Minimum support: the minimum number of times an itemset must occur in a set of transactions.

Minimum confidence: the minimum level of confidence required of a generated rule.

In this case, the minimum support is set to 2 (22%) and the minimum confidence to 70%.

Step 2: Generate the frequent 1-itemsets.

In this first iteration, every itemset meets the minimum support, so every item is kept as a frequent 1-itemset (L1).
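A minimal Python sketch of this step, using the nine transactions of the case (the same item lists as in the TID table) and a minimum support count of 2. The variable names `C1` and `L1` follow the usual Apriori notation and are otherwise just illustrative.

```python
from collections import Counter

transactions = [
    {"I1", "I2", "I5"},        # T1
    {"I2", "I4"},              # T2
    {"I2", "I3"},              # T3
    {"I1", "I2", "I4"},        # T4
    {"I1", "I3"},              # T5
    {"I2", "I3"},              # T6
    {"I1", "I3"},              # T7
    {"I1", "I2", "I3", "I5"},  # T8
    {"I1", "I2", "I3"},        # T9
]
MIN_SUPPORT = 2

# C1: support count of every individual item.
C1 = Counter(item for t in transactions for item in t)

# L1: the items whose support count reaches the minimum support.
L1 = {item: n for item, n in C1.items() if n >= MIN_SUPPORT}

for item in sorted(L1):
    print(item, L1[item])
```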

Step 3: Generate the frequent 2-itemsets.

1. C2 is the result of joining L1 with itself (L1 join L1).
2. L2 is the set of itemsets in C2 that meet the minimum support.
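The join and the support filter for 2-itemsets can be sketched as follows. The sketch assumes the nine transactions of the case and takes the five frequent items from Step 2 as given; `support_count` is a helper name introduced here.

```python
from itertools import combinations

transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
MIN_SUPPORT = 2
L1 = ["I1", "I2", "I3", "I4", "I5"]  # frequent 1-itemsets from Step 2


def support_count(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


# C2 = L1 join L1: every unordered pair of frequent items.
C2 = list(combinations(L1, 2))

# L2: the candidates in C2 that meet the minimum support.
L2 = {pair: support_count(pair) for pair in C2
      if support_count(pair) >= MIN_SUPPORT}

for pair, n in L2.items():
    print(pair, n)
```

Of the ten candidate pairs, four are dropped because they occur in fewer than two transactions.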

Step 4: Generate the frequent 3-itemsets. Repeat for n-itemsets for as long as new itemsets can still be formed.

1. The pruning part of the Apriori algorithm starts to take effect in this step.

2. Join step: {{I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}, {I2, I4, I5}}

3. {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}, and {I2, I4, I5} are not kept as itemsets because each of them has a subset that does not meet the minimum support (prune).
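The join and prune steps can be sketched in Python. The sketch takes the six frequent 2-itemsets of Step 3 as input; the function names `join` and `has_infrequent_subset` are illustrative, and itemsets are represented as sorted tuples so that two itemsets can be joined when they share all but their last item.

```python
from itertools import combinations

# Frequent 2-itemsets from Step 3, each as a sorted tuple.
L2 = [("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
      ("I2", "I3"), ("I2", "I4"), ("I2", "I5")]


def join(prev):
    """Join two sorted k-itemsets that agree on everything but the last item."""
    out = []
    for a, b in combinations(sorted(prev), 2):
        if a[:-1] == b[:-1]:
            out.append(a + (b[-1],))
    return out


def has_infrequent_subset(candidate, prev):
    """True if some (k-1)-subset of the candidate is not a frequent itemset."""
    prev_set = set(prev)
    return any(s not in prev_set
               for s in combinations(candidate, len(candidate) - 1))


joined = join(L2)
pruned = [c for c in joined if has_infrequent_subset(c, L2)]
C3 = [c for c in joined if not has_infrequent_subset(c, L2)]

print("joined:", joined)
print("pruned:", pruned)
print("C3:", C3)
```

Pruning works on subsets alone, so the four rejected candidates never need to be counted against the transactions.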

Step 5: Form association rules from the frequent itemsets that have been generated. Rules whose confidence is at least the minimum confidence are used (Strong Association Rules).

Selected itemsets: {{I1}, {I2}, {I3}, {I4}, {I5}, {I1,I2}, {I1,I3}, {I1,I5}, {I2,I3}, {I2,I4}, {I2,I5}, {I1,I2,I3}, {I1,I2,I5}}

As an example, {I1,I2,I5} is chosen to search for Strong Association Rules.

{I1,I2,I5} → Subsets = {{I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}}. Minimum confidence: 70%. Here sc denotes the support count.

• IF {I1,I2} THEN {I5} Confidence: sc{I1,I2,I5}/sc{I1,I2} = 2/4 = 50% (Rule Rejected!)

• IF {I1,I5} THEN {I2} Confidence: sc{I1,I2,I5}/sc{I1,I5} = 2/2 = 100% (Rule Selected!)

• IF {I2,I5} THEN {I1} Confidence: sc{I1,I2,I5}/sc{I2,I5} = 2/2 = 100% (Rule Selected!)

• IF {I1} THEN {I2,I5} Confidence: sc{I1,I2,I5}/sc{I1} = 2/6 = 33% (Rule Rejected!)

• IF {I2} THEN {I1,I5} Confidence: sc{I1,I2,I5}/sc{I2} = 2/7 = 29% (Rule Rejected!)

• IF {I5} THEN {I1,I2} Confidence: sc{I1,I2,I5}/sc{I5} = 2/2 = 100% (Rule Selected!)
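The six confidence calculations can be reproduced with a short Python sketch. It assumes the nine transactions of the case and a minimum confidence of 0.7; the helper names `sc` and `rules_from` are introduced here for illustration.

```python
from itertools import combinations

transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
MIN_CONFIDENCE = 0.7


def sc(itemset):
    """Support count: transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


def rules_from(itemset):
    """Yield (antecedent, consequent, confidence) for each proper subset."""
    items = sorted(itemset)
    for size in range(len(items) - 1, 0, -1):
        for ante in combinations(items, size):
            cons = tuple(i for i in items if i not in ante)
            yield ante, cons, sc(items) / sc(ante)


strong = []
for ante, cons, conf in rules_from({"I1", "I2", "I5"}):
    verdict = "selected" if conf >= MIN_CONFIDENCE else "rejected"
    print(f"IF {ante} THEN {cons}: {conf:.0%} ({verdict})")
    if conf >= MIN_CONFIDENCE:
        strong.append((ante, cons))
```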

Once the Strong Association Rules have been formed, the next step is to represent the knowledge.

The form of the knowledge representation depends on the goal of the data mining, which is defined by the needs of the user.

Knowledge representation

Suppose the goal of the data mining is to place items that have an associative relationship close to one another so that profit is maximized.

No. | Strong Association Rule | Knowledge Representation
1 | if {I1,I5} then {I2} | Items I1, I5, and I2 must be placed on adjacent shelves or in the same aisle
2 | if {I5} then {I1,I2} | Items I5, I1, and I2 must be placed on adjacent shelves or in the same aisle
.. | .. | ..

Section 2: FP-Growth Algorithm

Basic idea: build an FP-Tree and Conditional FP-Trees instead of generating candidate frequent itemsets.

How does it work?

Step 1: Determine the minimum support and minimum confidence values.

In this case, the minimum support is set to 2 (22%) and the minimum confidence to 70%.

Step 2: Generate the frequent 1-itemsets, as is done in the Apriori algorithm.

Step 3: Reorder the items in each transaction according to the frequent 1-itemsets sorted by descending support count.

If two or more items have the same support count, the order is decided by which item appears in the earlier transaction (T1 occurs before T2).

Sorted by support count (descending): L = {I2: 7, I1: 6, I3: 6, I4: 2, I5: 2}

TID List of Items

T1 I2, I1, I5

T2 I2, I4

T3 I2, I3

T4 I2, I1, I4

T5 I1, I3

T6 I2, I3

T7 I1, I3

T8 I2, I1, I3, I5

T9 I2, I1, I3
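The reordering can be sketched in Python. The original item order inside each transaction is an assumption (label order I1 to I5), and ties between equal support counts are broken by first appearance, following the rule stated for this step. The relative order of I4 and I5 does not affect the result, because no transaction contains both.

```python
from collections import Counter

# Transactions before sorting; item order within each one is assumed.
transactions = [
    ["I1", "I2", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I1", "I2", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I1", "I2", "I3", "I5"], ["I1", "I2", "I3"],
]

counts = Counter(item for t in transactions for item in t)

# Rank of each item's first appearance, used only to break ties.
first_seen = {}
for t in transactions:
    for item in t:
        first_seen.setdefault(item, len(first_seen))


def sort_key(item):
    """Higher support count first, then earlier first appearance."""
    return (-counts[item], first_seen[item])


sorted_transactions = [sorted(t, key=sort_key) for t in transactions]

for tid, t in enumerate(sorted_transactions, start=1):
    print(f"T{tid}", ", ".join(t))
```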

Step 4: Build the FP-Tree following the FP-Tree algorithm.

null becomes the root, and the children of the root are determined by scanning the List of Items of each transaction in turn. The tree after each transaction is shown below, with each level of indentation representing one level of the tree.

After T1 (I2, I1, I5):

null
  I2:1
    I1:1
      I5:1

After T2 (I2, I4):

null
  I2:2
    I1:1
      I5:1
    I4:1

After T3 (I2, I3):

null
  I2:3
    I1:1
      I5:1
    I4:1
    I3:1

After T4 (I2, I1, I4):

null
  I2:4
    I1:2
      I5:1
      I4:1
    I4:1
    I3:1

After T5 (I1, I3):

null
  I2:4
    I1:2
      I5:1
      I4:1
    I4:1
    I3:1
  I1:1
    I3:1

After T6 (I2, I3):

null
  I2:5
    I1:2
      I5:1
      I4:1
    I4:1
    I3:2
  I1:1
    I3:1

After T7 (I1, I3):

null
  I2:5
    I1:2
      I5:1
      I4:1
    I4:1
    I3:2
  I1:2
    I3:2

After T8 (I2, I1, I3, I5):

null
  I2:6
    I1:3
      I5:1
      I4:1
      I3:1
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

After T9 (I2, I1, I3):

null
  I2:7
    I1:4
      I5:1
      I4:1
      I3:2
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2
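The insertion procedure can be sketched in Python. The class name `FPNode` and the functions `build_fp_tree` and `dump` are illustrative names, not part of the slides, and node-links are left out to keep the sketch short. The input is the table of transactions already sorted in Step 3.

```python
class FPNode:
    """One FP-Tree node: an item, its count, and its children by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(sorted_transactions):
    root = FPNode(None)  # the "null" root
    for t in sorted_transactions:
        node = root
        for item in t:
            # Follow the existing branch on a shared prefix, else branch off.
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root


def dump(node, depth=0):
    """Return the tree as a list of indented 'item:count' lines."""
    label = "null" if node.item is None else f"{node.item}:{node.count}"
    lines = ["  " * depth + label]
    for child in node.children.values():
        lines += dump(child, depth + 1)
    return lines


sorted_transactions = [
    ["I2", "I1", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I2", "I1", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I2", "I1", "I3", "I5"], ["I2", "I1", "I3"],
]
root = build_fp_tree(sorted_transactions)
print("\n".join(dump(root)))
```

Because every transaction is sorted the same way, transactions that share a prefix share a branch, which is what keeps the tree compact.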

To help traverse the FP-Tree, node-links are used.

[Figure: illustration of the FP-Tree without node-links and of the FP-Tree with node-links]

Step 5: Build the Conditional Pattern Base, starting from the item with the lowest support count and proceeding to the item with the highest support count.

Item | Conditional Pattern Base
I5 | {I2, I1:1}, {I2, I1, I3:1}
I4 | {I2, I1:1}, {I2:1}
I3 | {I2, I1:2}, {I2:2}, {I1:2}
I1 | {I2:4}

I2 is not included because its prefix is null (the root).
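A conditional pattern base lists, for one item, every prefix path that leads to it in the FP-Tree together with its count. The Python sketch below reads those prefixes directly from the sorted transactions instead of walking the tree, a simplification that should yield the same paths and counts for this example. The function name `conditional_pattern_base` is illustrative.

```python
from collections import Counter

sorted_transactions = [
    ["I2", "I1", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I2", "I1", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I2", "I1", "I3", "I5"], ["I2", "I1", "I3"],
]


def conditional_pattern_base(item, sorted_transactions):
    """Count the prefix (items before `item`) of each transaction holding it."""
    base = Counter()
    for t in sorted_transactions:
        if item in t:
            prefix = tuple(t[:t.index(item)])
            if prefix:  # an empty prefix means the item sits under the root
                base[prefix] += 1
    return dict(base)


# Lowest support count first, as in this step.
for item in ["I5", "I4", "I3", "I1", "I2"]:
    print(item, conditional_pattern_base(item, sorted_transactions))
```

For I2 the result is empty, which corresponds to I2 being left out because its prefix is the root.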

Step 6: Build the Conditional FP-Tree, starting from the item with the lowest support count and proceeding to the item with the highest support count (apply the minimum support).

Stage 1: Conditional FP-Tree for I5 = {I2:2, I1:2}

null
  I2:2
    I1:2
      I5:1
      I3:1
        I5:1

I3 has a count of 1, which does not meet the minimum support, so it is left out.

Stage 2: Conditional FP-Tree for I4 = {I2:2}

null
  I2:2
    I1:1
      I4:1
    I4:1

Stage 3: Conditional FP-Tree for I3 = {I2:4, I1:2}, {I1:2}

null
  I2:4
    I1:2
      I3:2
    I3:2
  I1:2
    I3:2

Stage 4: Conditional FP-Tree for I1 = {I2:4}

null
  I2:4
    I1:4

Item | Conditional Pattern Base | Conditional FP-Tree
I5 | {I2, I1:1}, {I2, I1, I3:1} | {I2:2, I1:2}
I4 | {I2, I1:1}, {I2:1} | {I2:2}
I3 | {I2, I1:2}, {I2:2}, {I1:2} | {I2:4, I1:2}, {I1:2}
I1 | {I2:4} | {I2:4}
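One way to sketch this step in Python is to drop the items that fall below the minimum support from each prefix path and then accumulate counts along the remaining paths. The pattern bases are entered by hand from Step 5, and each tree node is identified by the path leading to it; both the function name `conditional_fp_tree` and this path-keyed representation are assumptions of the sketch.

```python
from collections import Counter

MIN_SUPPORT = 2

# Conditional pattern bases from Step 5: {prefix path: count}.
pattern_bases = {
    "I5": {("I2", "I1"): 1, ("I2", "I1", "I3"): 1},
    "I4": {("I2", "I1"): 1, ("I2",): 1},
    "I3": {("I2", "I1"): 2, ("I2",): 2, ("I1",): 2},
    "I1": {("I2",): 4},
}


def conditional_fp_tree(base, min_support):
    """Return {path to node: count} for the conditional FP-Tree of a base."""
    # Total count of each item across all prefix paths.
    item_counts = Counter()
    for path, n in base.items():
        for item in path:
            item_counts[item] += n
    nodes = Counter()
    for path, n in base.items():
        # Remove items that do not reach the minimum support.
        kept = tuple(i for i in path if item_counts[i] >= min_support)
        # Every prefix of the kept path is a node that gains `n`.
        for end in range(1, len(kept) + 1):
            nodes[kept[:end]] += n
    return dict(nodes)


for item, base in pattern_bases.items():
    print(item, conditional_fp_tree(base, MIN_SUPPORT))
```

In the output for I3, the node ("I2",) has count 4 and its child ("I2", "I1") has count 2, while ("I1",) forms a separate branch with count 2.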

Step 7: Form the Frequent Patterns by joining each set and subset of the conditional FP-Tree with the item.

Item | Conditional Pattern Base | Conditional FP-Tree | Frequent Patterns Generated
I5 | {I2, I1:1}, {I2, I1, I3:1} | {I2:2, I1:2} | {I2, I5:2}, {I1, I5:2}, {I2, I1, I5:2}
I4 | {I2, I1:1}, {I2:1} | {I2:2} | {I2, I4:2}
I3 | {I2, I1:2}, {I2:2}, {I1:2} | {I2:4, I1:2}, {I1:2} | {I2, I3:4}, {I1, I3:4}, {I2, I1, I3:2}
I1 | {I2:4} | {I2:4} | {I2, I1:4}
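The pattern generation can be sketched as a small recursive function that works directly on the conditional pattern bases. This is a simplified stand-in for mining the conditional FP-Trees, assumed here to give the same patterns for this example; the function name `fp_growth` and its parameters are illustrative.

```python
from collections import Counter

MIN_SUPPORT = 2

# Conditional pattern bases from Step 5: {prefix path: count}.
pattern_bases = {
    "I5": {("I2", "I1"): 1, ("I2", "I1", "I3"): 1},
    "I4": {("I2", "I1"): 1, ("I2",): 1},
    "I3": {("I2", "I1"): 2, ("I2",): 2, ("I1",): 2},
    "I1": {("I2",): 4},
}


def fp_growth(base, suffix, min_support, out):
    """Add to `out` every frequent pattern that ends with `suffix`."""
    item_counts = Counter()
    for path, n in base.items():
        for item in path:
            item_counts[item] += n
    for item, n in item_counts.items():
        if n < min_support:
            continue  # infrequent in this conditional base
        pattern = (item,) + suffix
        out[frozenset(pattern)] = n
        # Conditional pattern base of `item` inside the current base.
        sub = Counter()
        for path, m in base.items():
            if item in path:
                prefix = path[:path.index(item)]
                if prefix:
                    sub[prefix] += m
        fp_growth(dict(sub), pattern, min_support, out)


patterns = {}
for item, base in pattern_bases.items():
    fp_growth(base, (item,), MIN_SUPPORT, patterns)

for itemset, n in patterns.items():
    print(sorted(itemset), n)
```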

Step 8: Find the Strong Association Rules from the Frequent Patterns that have been formed, in the same way as with Apriori, until the knowledge representation is produced.

Exercise Time

Five transactions occurred:

Transaction ID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke

Case:
1. The goal of the data mining is to form economy bundles from items that have an associative relationship.
2. The minimum support is set to 2 and the minimum confidence to 70%.