1 Pendahuluan Statistik(1)
-
Upload
eduward-klinton-nainggolan -
Category
Documents
-
view
233 -
download
0
Transcript of 1 Pendahuluan Statistik(1)
-
8/12/2019 1 Pendahuluan Statistik(1)
1/31
Pendahuluan
Statistik
-
8/12/2019 1 Pendahuluan Statistik(1)
2/31
Statistik
Terdapat 2 tipe statistik
Statistik Deskriptif (Descriptive Statistics):
meliputi tabulasi, penyederhanaan, dan penjelasan
data. Atau menyimpulkan data yang kompleks
dengan suatu nilai.
Statistik Inferensial (Inferential Statistics):
perkiraan karakteristik dari suatu populasiberdasarkan pengetahuan karakteristik suatu
sample dalam populasi tersebut.
-
8/12/2019 1 Pendahuluan Statistik(1)
3/31
Pendahuluan Statistika: Teori dan Metodologi untuk analisis
data kuantitatif dari sampel observasi dalamhubungan-hubungan yang telah di hipotesakan
Alat untuk perencanaan dan kajian
Ilmu Statistika membantu analist yang memilikitumpukan data untuk menghasilkan susunanyang teratur dan penyederhananaan dari halyang kompleks dan tidak beraturan.
-
8/12/2019 1 Pendahuluan Statistik(1)
4/31
Perkiraan Statistik
Populasi
Sampel Acak
Parameter-Parameter
Statistik
Setiap anggota dalam
populasi mempunyai
kesempatan yang sama
untuk terpilih sebagai
sampel.
Perkiraan
-
8/12/2019 1 Pendahuluan Statistik(1)
5/31
Statistik Deskriptif, Skala Pengukuran (1)
Nominal Tidak terdapat properti numerik atau quantitatif,
klasifikasi group atau kategori Gender: Pria atau wanita Bidang: Struktur atau Sumber Daya Air
Ordinal Digunakan untuk mengurutkan level variabel yang
sedang di analisis. Tidak ada nilai spesifik yangditempatkan dalam skala rating tersebut.
Rating hotel: bintang 4, bintang 3, bintang 2, danbintang 1
-
8/12/2019 1 Pendahuluan Statistik(1)
6/31
Statistik Deskriptif, Skala Pengukuran (2)
Interval
Perbedaan antar nilai dalam skala dan interval
tersebut berukuran sama. Tidak ada nilai nol.
Dapat digunakan pembanding nilai pengukuran
Temperatur: Perbedaan antara 20 dan 30 derajat
adalah sama dengan perbedaan antara 30 dan 40
derajat. Kita tidak bisa bilang bahwa 40 derajat dua
kali lebih panas dari 20 derajat, hanya 20 derajatlebih panas.
Rasio
Skala yang mempunyani titik nol yang
mengindikasikan nilai variabel tersebut tidak ada.
Dapat dijadikan rasio
-
8/12/2019 1 Pendahuluan Statistik(1)
7/31
Statistik Deskriptif, Distribusi
Frekuensi
Dalam tabel, distribusi frekwensi di bentuk denganme-resume data dalam bentuk nilai frekwensiobservasi dalam setiap kategori, skor, atau intervalskor.
Dalam grafik, distribusi frekuensi dibentuk denganmeresume data dalam bentuk histogram ataupoligon frekuensi
-
8/12/2019 1 Pendahuluan Statistik(1)
8/31
Distribusi frekuensi, histogram dan poligon
frekuensi
Age in years
60.0
57.5
55.0
52.5
50.0
47.5
45.0
42.5
40.0
37.5
35.0
32.5
30.0
27.5
25.0
22.5
Frequency
50
40
30
20
10
0
-
8/12/2019 1 Pendahuluan Statistik(1)
9/31
Statistik Deskriptif
Kurva Normal
Positively
Skewed
Curva Bimodal
Negatively
Skewed
-
8/12/2019 1 Pendahuluan Statistik(1)
10/31
Property distribusi frekuensi: Central Tendency
Modus (Mode) Nilai yang mempunyai frekuensi paling besar
3 3 3 4 4 4 5 5 5 6 6 6 6: Modus=6
3 3 3 4 4 4 5 5 6 6 7 7 8: Modus adalah 3 dan 4
Nilai Tengah (Median)
Nilai yang membagi dua grup nilai dimana 50 % berada di atas
dan 50 % berada di bawah nilai median 3 3 3 5 8 8 8: Median=5
3 3 5 6: Median=4 (Rata-rata dari 2 nilai yang terdapat di tengah)
Nilai Rerata (Mean)
Nilai yang selalu di utamakan, dan satu-satunya properti centraltendency yang digunakan dalam analisis statistika lanjut. Lebih akurat dan reliabel
Cocok bagi perhitungan aritmatik
Pada umumnya menjumlahkan semua nilai dibagi denganbanyaknya nilai.
2 3 4 6 10: Mean=5 (25/5)
-
8/12/2019 1 Pendahuluan Statistik(1)
11/31
Properti distribusi frekuensi:
Variability/Dispersion
Rentang (Range) Dihitung dengan mengurangi nilai tertinggi dengan nilai
terendah
Hanya digunakan untuk skala Ordinal, Interval, dan Ratioscales dan data harus terurut
Contoh: 2 3 4 6 8 11 24 (Rentang=22) Varian (Variance)
Jangkauan nilai dalam distribusi frekuensi (The extent towhich individual scores in a distribution of scores differ fromone another)
Standard Deviasi (Standard Deviation) Akar kuadrat dari varian
Digunakan untuk menggambarkan dispersi dalam setobservasi pada sebuah distribusi
-
8/12/2019 1 Pendahuluan Statistik(1)
12/31
Z-Scores dan T-Scores
Z-Scores
Most widely used standard score in statistics
It is the number of standard deviations above or below the mean.
A Z score of 1.5 means that the score is 1.5 standard deviations
above the mean; a Z score of -1.5 means that the score is 1.5
standard deviations below the mean Always have the same meaning in all distributions
To find a percentile rank, first convert to a Z score and then find
percentile rank off a normal-curve table
T-Scores
Most commonly used standard score for reporting performance May be converted from Z-scores and are always rounded to two
figures; therefore, eliminating decimals
Always reported in positive numbers
The mean is always 50 and the standard deviation is always 10.
A T-score of 70 is 2 SDs above the mean
A T-score of 20 is 3 SDs below the mean
-
8/12/2019 1 Pendahuluan Statistik(1)
13/31
-
8/12/2019 1 Pendahuluan Statistik(1)
14/31
Korelasi dan Regresi Linear Korelasi atau Kovarian
(Correlation/Covariation)
Koefisien korelasi adalah summary statistik dari
derajat keterkaitan atau hubunan antara duavariabel
Dapat memililiki korelasi negatif atau positif
Regresi Linear Tujuan dari persamaan regeresi adalah untuk
perkiraaan sampel baru observasi berdasarkantemuan dari sampel sebelumnya.
-
8/12/2019 1 Pendahuluan Statistik(1)
15/31
15
Resume: Statistic Deskriptif & Inferential
Deskriptif A. For one variable ("univariate analysis"):
Measures of "CENTRAL TENDENCY") (averages) and of
DISPERSION or variance around that average.
Examples: Means, Modes, Medians, Standard Deviation,
quartiles
B. Descriptive statistics for the strength of relationship
between two variables (bivariate analysis) or among a set of
variables (multivariate analysis) are measures of
ASSOCIATION or correlation.
Inferential
Are measures of the SIGNIFICANCE of the relationship
between two or more variables. Significance refers to the
probability that the findings could be attributed to sampling
error. Appropriate statistics depend on the LEVEL OF
-
8/12/2019 1 Pendahuluan Statistik(1)
16/31
Types of Statistical Analysis -Descriptive
Quantify the degree of relationship betweenvariables
Parametric tests are used to test hypotheseswith stringent assumptions about observations e.g., t-test, ANOVA
Nonparametric tests are used with data in anominal or ordinal scale e.g., Chi-Square, Mann-Whitney U, Wilcoxon
-
8/12/2019 1 Pendahuluan Statistik(1)
17/31
Types of Statistical Analysis -Inferential
Allow generalization about populations using datafrom samples
Non-parametric Non-parametric tests do not require any
assumptions about normal distribution, but aregenerally less sensitive than parametric tests.
The test for nominal data is the Chi-Square test
The tests for ordinal data are the Kolmogorov-Smirnov test, the Mann-Whitney U test, and the
Wilcoxon Matched-Pairs Signed-Ranks test
Parametric The tests for interval and ratio data include the t-test
and etc
-
8/12/2019 1 Pendahuluan Statistik(1)
18/31
Statistics and Probability
Statistics: Procedures for describing,analyzing, and interpreting quantitativedata
The choice of statistical technique
should be guided by the research designand the type of data collected
Probability simply represents a judgment
about likelihood of outcomes, i.e., howlikely is it that I could obtain a result likethis purely by chance?
Statistical inferences significant
very unlikely the effect would occur by
-
8/12/2019 1 Pendahuluan Statistik(1)
19/31
Pendahuluan Statistika
Statistik Inferensial
-
8/12/2019 1 Pendahuluan Statistik(1)
20/31
Sampling (1)
Sampling relates to the degree to which thosesurveyed are representative of a specificpopulation
The sample frameis the set of people whohave the chance to respond to the survey
A question related to external validity is thedegree to which the sample framecorresponds to the population to which theresearcher wants to apply the results (Fowler,1988)
-
8/12/2019 1 Pendahuluan Statistik(1)
21/31
Sampling (2)
Two basic types: probability and non-probability
Probability sampling (PS) can include randomsampling, stratified random sampling, andcluster sampling
Non-probability sampling (NPS) can includequota sampling, snowball sampling, andconvenience sampling
-
8/12/2019 1 Pendahuluan Statistik(1)
22/31
Random Sampling (PS)
Every unit has an equal chance of selection
Although it is relatively simple, members of
specific subgroups may not be included in
appropriate proportions
-
8/12/2019 1 Pendahuluan Statistik(1)
23/31
Stratified Random Sampling (PS) The population is grouped according to
meaningful characteristics or strata
This method is more likely to reflect the general
population, and subgroup analysis is possible
However, it can be time consuming and costly
-
8/12/2019 1 Pendahuluan Statistik(1)
24/31
Systematic Sampling (PS)
Every xthunit is selected (e.g., every other person entering the gate was
selected)
The method is convenient and close torandom sampling if the starting point israndomly chosen
Recurring patterns can occur and should beexamined
-
8/12/2019 1 Pendahuluan Statistik(1)
25/31
Cluster/Multistage Sampling (PS)
Natural groups are sampled and then theirmembers are sampled
This method is convenient and can use existing
units
-
8/12/2019 1 Pendahuluan Statistik(1)
26/31
Quota Sampling (NPS)
The population is divided into subgroups and thesample is selected based on the proportions of
the subgroups necessary to represent the
population
This method depends on reliable data about the
proportions in the population
-
8/12/2019 1 Pendahuluan Statistik(1)
27/31
Convenience Sampling (NPS)
This method uses readily available groups orunits of individuals
It is practical and easy to use
However, it may produce a biased sample
Convenience sampling can be perfectlyacceptable if the purpose of the research is totest a hypothesis that certain variables arerelated to one another
-
8/12/2019 1 Pendahuluan Statistik(1)
28/31
Snowball Sampling (NPS)
Previously identified members identify others
This method is useful when a list of potential
names is difficult to obtain
However, it may produce a biased sample
-
8/12/2019 1 Pendahuluan Statistik(1)
29/31
Statistics & Parameters
Aparameteris a value, usually unknown (andwhich therefore has to be estimated), used torepresent a certain population characteristic. Forexample, the population mean is a parameterthat is often used to indicate the average value
of a quantity
A statistic is a quantity that is calculated from asample of data. It is used to give informationabout unknown values in the corresponding
population. For example, the average of the datain a sample is used to give information about theoverall average in the population from which thatsample was drawn.
The sampling distributiondescribes probabilities
associated with a statistic when a randomsample is drawn from a population
-
8/12/2019 1 Pendahuluan Statistik(1)
30/31
Interval Estimate & Sampling Distributions
Interval EstimateA range or band within which the parameter is thought to
lie, instead of a single point or value as the estimate oftheparameter
Sampling Distributions
The sampling distribution of the mean is a frequencydistribution, not of observations, but of means ofsamples, each based on nobservations.
The standard error of the mean is used as an estimateof the magnitude of sampling error. It is the standarddeviation of the sampling distribution of the samplemeans.
-
8/12/2019 1 Pendahuluan Statistik(1)
31/31
Inferential Statistics
Confidence Intervals Same as the percentage of cases in a normal
distribution that lie within 1, 2, or 3 standarddeviations from the mean
Central Limit Theorem States that the distribution of samples (means,
medians, variances, and most other statisticalmeasures) approaches a normal distribution as the
sample size, n, increases