Session 7
Basics of Probability
Definitions
Probability: a value between 0 and 1 describing the relative
likelihood that an event will occur
Experiment/trial: a process that leads to exactly one of several
possible outcomes
One roll of a die or one toss of a coin = 1 trial
Outcome: a result of a trial
Event: a collection of one or more outcomes
Approaches to Probability
Objective: Classical, Empirical
Subjective
Without prior data
Probability Distribution
Definition: a listing of all outcomes of a trial and the
probability of each outcome
Normal Probability Distribution
Purpose:
Visualizes how probability is spread across events, or the probability of a range of values of an event
From left to right, all probabilities sum to 1 (the total area under the curve is 1)
Properties:
Numerous continuous variables common in business have distributions that closely resemble the normal distribution.
The normal distribution can be used to approximate various discrete probability distributions.
The normal distribution provides the basis for classical statistical inference.
Other shapes: same mean, different stdev
Other shapes: different mean, same stdev
Standard Normal Probability Distribution
Definition: a normal probability distribution
standardized so that the mean is 0 and the stdev is 1
Purpose: standardizes and simplifies
probability calculations across distributions with
different means and stdevs
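Original formula of the Normal Distribution
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
Formula of the Standard Normal Probability Distribution
\[ Z = \frac{X - \mu}{\sigma} \]
Z = standard normal value
X = value of the variable of interest
\(\mu\) = mean, \(\sigma\) = stdev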
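The two formulas can be checked numerically. The sketch below writes the normal density out by hand, compares it with Python's standard-library `statistics.NormalDist`, and standardizes one value. The numbers (mean 45.16, stdev 10, X = 50) are illustrative inputs, and the helper names are hypothetical.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density written out directly from the formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def z_score(x, mu, sigma):
    """Standardize x: how many stdevs it lies from the mean."""
    return (x - mu) / sigma

# Illustrative inputs (assumed for this sketch).
mu, sigma, x = 45.16, 10.0, 50.0

pdf_manual = normal_pdf(x, mu, sigma)
pdf_library = NormalDist(mu, sigma).pdf(x)
z = z_score(x, mu, sigma)

print(f"manual pdf  = {pdf_manual:.6f}")
print(f"library pdf = {pdf_library:.6f}")
print(f"Z           = {z:.3f}")
```

Both density values should agree, which suggests the hand-written formula matches the library's definition.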
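Relationship between Z, mean, and stdev
\(\mu \pm 1\sigma\) (Z from -1 to +1): 68.26%
\(\mu \pm 2\sigma\) (Z from -2 to +2): 95.44%
\(\mu \pm 3\sigma\) (Z from -3 to +3): 99.74%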
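These three percentages can be reproduced as areas under the standard normal curve. A minimal sketch using the standard-library `statistics.NormalDist` follows; the `coverage` helper is a hypothetical name. The library values differ slightly in the last digit from the slide figures, which come from a four-decimal table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, stdev 1

def coverage(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

coverages = {k: coverage(k) for k in (1, 2, 3)}

for k, area in coverages.items():
    print(f"within {k} stdev of the mean: {area:.4%}")
```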
Application 1: finding the probability of a
range of an event
Example: from a sample, household
spending on coffee is known to be
normally distributed with mean $45.16 and stdev $10
What is the probability that a household
spends between $45.16 and $50 on coffee?
\[ Z = \frac{X - \mu}{\sigma} = \frac{50 - 45.16}{10} = 0.484 \]
Type 1 Standard Normal Probability Distribution
table: cumulative from mean (0 to Z)
Z = 0.484 ≈ 0.48 = 0.4 + 0.08 (row 0.4, column 0.08), giving 0.1844
The probability of coffee spending between $45.16 and $50 is about 0.18
That is, about 18% of the population spends between $45.16 and $50 on coffee
Type 2 Standard Normal Probability Distribution
table: cumulative from \(-\infty\) to Z
Using the type 2 table:
P(Z <= 0.48) = 0.6844
P(Z <= 0) = 0.5
0.6844 - 0.5 = 0.1844
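The same probability can be computed without a printed table. The sketch below mirrors both table conventions with the standard-library `statistics.NormalDist`: the "from mean" area (type 1) and the difference of two cumulative values (type 2). It uses the unrounded Z = 0.484, so the result is slightly above the table answer of 0.1844, which rounds Z to 0.48.

```python
from statistics import NormalDist

mu, sigma = 45.16, 10.0      # mean and stdev of coffee spending
lower, upper = 45.16, 50.0   # range of interest

std_normal = NormalDist()
z_upper = (upper - mu) / sigma
z_lower = (lower - mu) / sigma   # 0, because the lower bound is the mean

# Type 1 convention: area from the mean (Z = 0) up to Z.
from_mean = std_normal.cdf(z_upper) - 0.5

# Type 2 convention: difference of two cumulative probabilities.
prob = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

# Reproducing the table lookup with Z rounded to 0.48.
table_like = std_normal.cdf(0.48) - 0.5

print(f"Z = {z_upper:.3f}")
print(f"P(45.16 < X < 50) = {prob:.4f}")
print(f"with Z rounded to 0.48: {table_like:.4f}")
```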
Application 2: finding the value of X
Example: from a sample, household
spending on coffee is known to be
normally distributed with mean $45.16 and stdev $10
A company wants to target the top 60% of
this coffee market segment. What is the smallest
coffee spending in that segment?
Figure: normal curve with area 0.60 above the cutoff and 0.40 below it
With the type 2 table (cumulative from \(-\infty\) to Z), find the Z value
whose cumulative probability is closest to 0.40
A coarse lookup gives 0.4207 at Z = -0.20; the closest table entry is 0.4013 at Z = -0.25
Mean = $45.16 and stdev = $10
\[ Z = \frac{X - \mu}{\sigma} \quad\Rightarrow\quad X = \mu + Z\sigma \]
With Z = -0.20: X = 45.16 + (-0.20)(10) = $43.16
With Z = -0.25: X = 45.16 + (-0.25)(10) = $42.66
Exercise: how would you determine X using the type 1 table?
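The cutoff in the worked example can also be obtained from the inverse cumulative distribution instead of a table lookup. The sketch below uses `NormalDist.inv_cdf` from the standard library. The exact quantile is about Z = -0.25, so the cutoff comes out near $42.63, while a lookup rounded to Z = -0.20 gives $43.16.

```python
from statistics import NormalDist

mu, sigma = 45.16, 10.0   # mean and stdev of coffee spending
top_share = 0.60          # the company targets the top 60%

# The top 60% starts where 40% of the distribution lies below.
z_cut = NormalDist().inv_cdf(1 - top_share)

# Undo the standardization: X = mu + Z * sigma.
x_cut = mu + z_cut * sigma

print(f"Z cutoff = {z_cut:.4f}")
print(f"smallest spending in the top 60% = ${x_cut:.2f}")
```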
Sampling Error & Central Limit Theorem
Sampling Error
"Error" here does not mean a mistake in drawing the sample
Definition: the difference between a sample mean and the mean of
its population
Exercise
Use the data in the file employee data.xls (treat
the data as the POPULATION).
Using the salary variable, draw 5 samples of
3 respondents each
Compute the mean of each of the five samples
Do the means of the five samples (each with only
3 people) differ from the mean salary of all 474 people?
Sampling Distribution of the Sample Mean
Definition: the probability distribution of the means of
several samples.
Exercise: using the previous example, build a
probability table from the five sample means
Compute the average of all the sample means
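The exercise steps can be sketched in code. Because the employee data file is not available here, the population below is synthetic: 474 made-up salary values drawn uniformly between hypothetical bounds. The point is only to show the mechanics of sampling error and of a small sampling distribution, not the real answer.

```python
import random
from collections import Counter
from statistics import mean

random.seed(7)  # fixed seed so the sketch is repeatable

# Synthetic stand-in for the salary column (474 people); values are made up.
population = [round(random.uniform(15000, 135000), 2) for _ in range(474)]
population_mean = mean(population)

# Draw 5 samples of 3 respondents each and compute each sample mean.
sample_means = [mean(random.sample(population, 3)) for _ in range(5)]

# Sampling error: sample mean minus population mean.
sampling_errors = [m - population_mean for m in sample_means]

# Probability table: each observed sample mean gets an equal share.
probability_table = {
    m: count / len(sample_means) for m, count in Counter(sample_means).items()
}

# Mean of the sampling distribution (average of the sample means).
mean_of_means = sum(m * p for m, p in probability_table.items())

print(f"population mean = {population_mean:.2f}")
for m, err in zip(sample_means, sampling_errors):
    print(f"sample mean = {m:.2f}, sampling error = {err:+.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
```

With only three respondents per sample, individual sample means typically land far from the population mean, while their average tends to be closer.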
Central Limit Theorem
Definition: if all possible samples of a given size are drawn from a population,
the distribution of their means (the Sampling Distribution of the Sample Mean) approaches a normal distribution, and its mean equals the population mean
The larger n is, the closer it gets to a normal distribution
When n >= 30, the distribution is already close to normal
If not all samples are drawn, the mean of the Sampling Distribution of the Sample Mean is still close to the population mean
The stdev of the Sampling Distribution of the Sample Mean is
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
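A small simulation can illustrate these claims. The sketch below assumes a deliberately skewed population (exponential with mean 1 and stdev 1, chosen only for illustration), draws many samples of size 30, and compares the spread of the sample means with the population stdev divided by the square root of n.

```python
import math
import random
from statistics import mean, pstdev

random.seed(42)  # fixed seed so the sketch is repeatable

n = 30              # size of each sample
num_samples = 5000  # number of samples drawn

# Assumed population: exponential with rate 1 (mean 1, stdev 1, right-skewed).
pop_mean, pop_stdev = 1.0, 1.0

sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(sum(sample) / n)

observed_center = mean(sample_means)
observed_se = pstdev(sample_means)
theoretical_se = pop_stdev / math.sqrt(n)

print(f"mean of sample means  = {observed_center:.4f} (population mean {pop_mean})")
print(f"stdev of sample means = {observed_se:.4f}")
print(f"sigma / sqrt(n)       = {theoretical_se:.4f}")
```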
Why the Central Limit Theorem Is Useful
We have information about, or a target for, the population
(mean and stdev)
We draw a sample from that population
The sample mean will likely differ from the
population mean (or from the target we want)
Is the sample mean genuinely different from the population mean, or
different only by chance?
Conditions for the Central Limit Theorem
If the population is known to be normal
There is no minimum sample size
If the population is known to be non-normal, or its distribution is
not known at all
The sample size must be at least 30
Exercise
A housing census shows a mean house price
of $224,200 with stdev $9,000.
If we draw a sample and its computed
mean is $230,000, what is the probability that
the sample deviates from the population mean?
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
Population stdev known
\[ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \]
Population stdev unknown
Exercise
A mineral water plant sets
its machine to fill 1 liter of water into
each bottle, with a deviation of 0.01 liter
A sample of 30 bottles is taken and
the sample mean is found to be 1.05 liters. What is the
probability that this sample mean differs from 1 liter?
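One way to approach the bottle exercise is sketched below. It assumes the stated deviation of 0.01 liter is the population stdev, and it reads the question as a two-tailed one: how likely is a sample mean at least this far from 1 liter if the machine really fills 1 liter on average? The function name is hypothetical.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(sample_mean, pop_mean, pop_stdev, n):
    """Z value of a sample mean, using the standard error sigma / sqrt(n)."""
    standard_error = pop_stdev / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Values from the bottle exercise; 0.01 is assumed to be the population stdev.
z = z_for_sample_mean(sample_mean=1.05, pop_mean=1.0, pop_stdev=0.01, n=30)

# Two-tailed probability of a sample mean at least this far from the target.
two_tailed_p = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Z = {z:.2f}")
print(f"two-tailed probability = {two_tailed_p:.3g}")
```

A Z value this large means the observed mean is very unlikely to be a chance deviation from 1 liter.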
Confidence Interval Estimation
Uses
Statistical process control
Estimating the range of a population mean or proportion at a given confidence level
Determining the number of respondents (sample size)
Estimating the Population Mean: Formula
Z or t?
Population is normally distributed
Population stdev known: use the Z table
Population stdev unknown: use the t table
Population is not normal, or its distribution is unknown
Sample size >= 30: use the Z table
Sample size < 30: use a nonparametric method
Note: a large sample means n >= 30
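Z and t Formulas for Estimating the Population Mean
\[ \bar{X} \pm Z \frac{\sigma}{\sqrt{n}} \qquad\qquad \bar{X} \pm t \frac{s}{\sqrt{n}} \]
\(\bar{X}\) = sample mean
Z = z value (for the chosen confidence level)
t = t value (for the chosen confidence level)
\(\sigma\) = population stdev
s = sample stdev
n = sample size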
t Table
t Distribution
Estimating the Population Proportion
\[ p \pm Z \sqrt{\frac{p(1-p)}{n}} \]
p = sample proportion
Z = z value (for the chosen confidence level)
n = sample size
Exercise
A mineral water plant sets
its machine to fill 1 liter of water into
each bottle, with stdev 0.01 liter.
A batch of 5000 bottles is produced, and
a random sample of 30 bottles is taken with
mean 1.05 liters.
At 95% confidence, what are the maximum
and minimum values of the population mean (of the 5000 bottles)?
Is the target population mean (1 liter) still within
the minimum-maximum range above?
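A possible way to work this exercise is a Z-based interval, since the stdev is given for the filling process. The sketch below assumes 0.01 liter is the population stdev and skips the finite population correction because 30 out of 5000 bottles is well under 5%. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, pop_stdev, n, confidence):
    """Interval sample_mean +/- Z * sigma / sqrt(n) for a known population stdev."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * pop_stdev / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(sample_mean=1.05, pop_stdev=0.01, n=30, confidence=0.95)
target_inside = low <= 1.0 <= high

print(f"95% interval: {low:.4f} to {high:.4f} liters")
print(f"1 liter inside the interval? {target_inside}")
```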
Exercise
A major department store chain is interested in
estimating the average amount its credit card customers
spent on their first visit to the chain's new store in the
mall. Fifteen credit card accounts were randomly
sampled and analyzed with the following results: Xbar =
$50.50 and S² = 400. Construct a 95% confidence
interval for the average amount its credit card
customers spent on their first visit to the chain's new
store in the mall assuming that the amount spent
follows a normal distribution.
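Because the population stdev is unknown and the sample is small, this exercise calls for the t formula. The sketch below takes the t value as an input: 2.145 is the usual printed-table entry for a two-sided 95% interval with 14 degrees of freedom, and it is hard-coded here because the Python standard library has no t distribution. The function name is hypothetical.

```python
import math

def t_confidence_interval(sample_mean, sample_stdev, n, t_value):
    """Interval sample_mean +/- t * s / sqrt(n) for an unknown population stdev."""
    margin = t_value * sample_stdev / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Table value: two-sided 95%, df = n - 1 = 14 (taken from a printed t table).
T_95_DF14 = 2.145

sample_stdev = math.sqrt(400)  # the exercise gives the variance, S² = 400
low, high = t_confidence_interval(50.50, sample_stdev, 15, T_95_DF14)

print(f"95% interval: ${low:.2f} to ${high:.2f}")
```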
Exercise
An economist is interested in studying the incomes of
consumers in a particular region. The population
standard deviation is known to be $1,000. A random
sample of 50 individuals resulted in an average income
of $15,000. What is the width of the 90% confidence
interval?
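The width of a Z-based interval is twice its margin of error, so the sample mean itself does not affect it. A minimal sketch under that reading follows; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval_width(pop_stdev, n, confidence):
    """Full width (upper limit minus lower limit) of a Z-based interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 for 90%
    return 2 * z * pop_stdev / math.sqrt(n)

width = z_interval_width(pop_stdev=1000, n=50, confidence=0.90)

print(f"width of the 90% interval = ${width:.2f}")
```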
Exercise
Private colleges and universities rely on money contributed by
individuals and corporations for their operating expenses.
Much of this money is put into a fund called an endowment,
and the college spends only the interest earned by the fund. A
recent survey of 8 private colleges in the United States revealed
the following endowments (in millions of dollars): 60.2, 47.0,
235.1, 490.0, 122.6, 177.5, 95.4, and 220.0. Summary statistics
yield Xbar = 180.975 and S = 143.042. Calculate a 95%
confidence interval for the mean endowment of all the private
colleges in the United States assuming a normal distribution for
the endowments.
Exercise
As an aid to the establishment of personnel
requirements, the director of a hospital wishes to
estimate the mean number of people who are admitted
to the emergency room during a 24-hour period. The
director randomly selects 64 different 24-hour periods
and determines the number of admissions for each. For
this sample, Xbar = 19.8 and S² = 25. Estimate the mean
number of admissions per 24-hour period with a 95%
confidence interval.
Exercise
True or false? A race car driver tested his car for time
from 0 to 60 mph, and in 20 tests obtained an average
of 4.85 seconds with a standard deviation of 1.47
seconds. A 95% confidence interval for the 0 to 60 time
is 4.52 seconds to 5.18 seconds.
Exercise
A university dean is interested in determining the
proportion of students who receive some sort of
financial aid. Rather than examine the records for all
students, the dean randomly selects 200 students and
finds that 118 of them are receiving financial aid. Use a
90% confidence interval to estimate the true proportion
of students who receive financial aid.
Exercise
True or False? A sample of 100 fuses from a very
large shipment is found to have 10 that are defective.
The 95% confidence interval would indicate that, for
this shipment, the proportion of defective fuses is
between 0 and 0.28.
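The two proportion exercises above (financial aid and defective fuses) can be worked with the usual large-sample interval for a proportion. The sketch below is one way to do it with the standard library; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Interval p +/- Z * sqrt(p * (1 - p) / n) for a sample proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Financial aid: 118 of 200 students, 90% confidence.
aid_low, aid_high = proportion_confidence_interval(118, 200, 0.90)

# Defective fuses: 10 of 100, 95% confidence.
fuse_low, fuse_high = proportion_confidence_interval(10, 100, 0.95)

print(f"financial aid, 90%: {aid_low:.4f} to {aid_high:.4f}")
print(f"defective fuses, 95%: {fuse_low:.4f} to {fuse_high:.4f}")
```

Under these assumptions the fuse interval is much narrower than 0 to 0.28, which suggests how to answer the true-or-false question.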
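Note for Finite Populations
If the population size is known
And the sample size is > 5% of the population size
Multiply the standard error by the finite population correction factor:
\[ \sqrt{\frac{N - n}{N - 1}} \]
N = population size, n = sample size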
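The finite population adjustment is commonly written as a correction factor, the square root of (N - n)/(N - 1), applied to the standard error. The sketch below assumes that form and assumes the 5% threshold compares the sample size with the population size. The helper names and the second example (N = 500, n = 50) are hypothetical.

```python
import math

def needs_correction(pop_size, n, threshold=0.05):
    """True when the sample is more than `threshold` of the population."""
    return n / pop_size > threshold

def finite_population_correction(pop_size, n):
    """Correction factor sqrt((N - n) / (N - 1)) applied to the standard error."""
    return math.sqrt((pop_size - n) / (pop_size - 1))

# Bottle exercise: 30 bottles out of a batch of 5000 (0.6% of the population).
bottles_need_fpc = needs_correction(5000, 30)

# Hypothetical case: 50 sampled out of 500 (10% of the population).
small_pop_needs_fpc = needs_correction(500, 50)
small_pop_factor = finite_population_correction(500, 50)

print(f"bottles need correction? {bottles_need_fpc}")
print(f"N=500, n=50 needs correction? {small_pop_needs_fpc}")
print(f"correction factor for N=500, n=50: {small_pop_factor:.4f}")
```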
Determining the Sample Size
\[ n = \left( \frac{Z\sigma}{e} \right)^2 \]
n = sample size
Z = Z value for the chosen confidence level
\(\sigma\) = population standard deviation
e = desired error (deviation from
the population mean)
Ways to Find the Population Stdev
Look at previous studies with similar
respondents or topics
Run a pilot study with a small sample
Use the range:
(max value - min value)/6
Exercise
Suppose a department store wants to estimate the
average age of the customers of its contemporary
apparel department, correct to within 2 years, with
level of confidence equal to 95%. Management
believes that the standard deviation is 8 years. What is
the sample size they should take?
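This exercise can be worked with the standard sample size formula for estimating a mean, n = (Z * sigma / e)^2, rounded up to the next whole respondent. The sketch below assumes that formula; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(pop_stdev, max_error, confidence):
    """Smallest n with margin of error Z * sigma / sqrt(n) no larger than max_error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return math.ceil((z * pop_stdev / max_error) ** 2)

# Department store exercise: stdev 8 years, error within 2 years, 95% confidence.
n_required = required_sample_size(pop_stdev=8, max_error=2, confidence=0.95)

print(f"required sample size = {n_required}")
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the requested 2 years.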