1 Pendahuluan Statistik(1)

8/12/2019 1 Pendahuluan Statistik(1)

1/31

Introduction to Statistics


2/31

Statistics

There are two types of statistics

Descriptive Statistics:

covers tabulating, simplifying, and describing data, or summarizing complex data with a single value.

Inferential Statistics:

estimating the characteristics of a population based on the known characteristics of a sample drawn from that population.


3/31

Introduction

Statistics: theory and methodology for analyzing quantitative data from sample observations in terms of hypothesized relationships

A tool for planning and study

Statistics helps an analyst who has a pile of data to produce an orderly arrangement and a simplification of what is complex and irregular.


4/31

Statistical Estimation

[Diagram labels: Population, Random Sample, Parameters, Statistics, Estimation]

Random sample: every member of the population has an equal chance of being selected for the sample.


5/31

Descriptive Statistics, Measurement Scales (1)

Nominal

No numeric or quantitative properties; classifies observations into groups or categories

Gender: male or female

Field: Structures or Water Resources

Ordinal

Used to rank the levels of the variable being analyzed. No specific value is attached to the positions on the rating scale.

Hotel rating: 4 stars, 3 stars, 2 stars, and 1 star


6/31

Descriptive Statistics, Measurement Scales (2)

Interval

Differences between values on the scale are meaningful and the intervals are of equal size. There is no true zero point.

Differences between measured values can be compared

Temperature: the difference between 20 and 30 degrees is the same as the difference between 30 and 40 degrees. We cannot say that 40 degrees is twice as hot as 20 degrees, only that it is 20 degrees hotter.

Ratio

A scale with a zero point that indicates the absence of the variable.

Values can be expressed as ratios


7/31

Descriptive Statistics, Frequency Distribution

In a table, a frequency distribution is built by summarizing the data as the frequency of observations in each category, score, or score interval.

In a graph, a frequency distribution is built by summarizing the data as a histogram or a frequency polygon
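
A minimal sketch of the tabular form, assuming a small set of hypothetical ages (not the data behind the slides) grouped into 10-year intervals. The helper name `frequency_table` and the interval width are illustrative choices.

```python
from collections import Counter

# Hypothetical ages in years (illustrative data, not from the slides).
ages = [23, 25, 25, 31, 34, 34, 34, 41, 47, 52]

def frequency_table(values, width):
    """Count observations falling in each interval [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return {(start, start + width): counts[start] for start in sorted(counts)}

table = frequency_table(ages, width=10)
for (low, high), freq in table.items():
    # A row of '#' marks acts as a crude text histogram.
    print(f"{low}-{high - 1}: {'#' * freq} ({freq})")
```

Each row is one score interval with its frequency; drawing the same counts as bars gives the histogram.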


8/31

Frequency distribution, histogram, and frequency polygon

[Figure: histogram with x-axis "Age in years" (22.5 to 60.0) and y-axis "Frequency" (0 to 50)]


9/31

Descriptive Statistics

[Figure panels: Normal Curve; Positively Skewed; Bimodal Curve; Negatively Skewed]


10/31

Properties of a frequency distribution: Central Tendency

Mode

The value with the highest frequency

3 3 3 4 4 4 5 5 5 6 6 6 6: Mode = 6

3 3 3 4 4 4 5 5 6 6 7 7 8: The modes are 3 and 4

Median

The value that splits the scores into two groups, with 50% above and 50% below the median

3 3 3 5 8 8 8: Median = 5

3 3 5 6: Median = 4 (the average of the two middle values)

Mean

Generally the preferred measure, and the measure of central tendency that most advanced statistical analysis builds on. More accurate and reliable

Suited to arithmetic calculation

In general, the sum of all values divided by the number of values.

2 3 4 6 10: Mean = 5 (25/5)
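
The examples above can be checked with Python's standard `statistics` module. This is a small sketch that reuses the slide's own numbers; the variable names are illustrative.

```python
import statistics

# Data sets taken from the examples above.
modes_single = statistics.multimode([3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6])
modes_double = statistics.multimode([3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 7, 8])

median_odd = statistics.median([3, 3, 3, 5, 8, 8, 8])   # middle value
median_even = statistics.median([3, 3, 5, 6])           # average of 3 and 5

mean_value = statistics.mean([2, 3, 4, 6, 10])          # 25 / 5

print(modes_single, modes_double)
print(median_odd, median_even)
print(mean_value)
```

`multimode` returns every value tied for the highest frequency, which is why the second data set yields two modes.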


11/31

Properties of a frequency distribution: Variability/Dispersion

Range

Calculated by subtracting the lowest value from the highest value

Used only with ordinal, interval, and ratio scales, and the data must be ordered

Example: 2 3 4 6 8 11 24 (Range = 22)

Variance

The extent to which individual scores in a distribution of scores differ from one another

Standard Deviation

The square root of the variance

Used to describe the dispersion of a set of observations in a distribution
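
A short sketch of these three measures using the range example above. The slide does not say whether the variance divides by n or by n - 1, so both conventions are shown; treating the seven scores as a whole population or as a sample is an assumption made here for illustration.

```python
import statistics

scores = [2, 3, 4, 6, 8, 11, 24]  # data from the range example above

value_range = max(scores) - min(scores)             # 24 - 2

population_variance = statistics.pvariance(scores)  # divides by n
sample_variance = statistics.variance(scores)       # divides by n - 1

population_sd = statistics.pstdev(scores)           # square root of pvariance
sample_sd = statistics.stdev(scores)                # square root of variance

print(value_range)
print(population_variance, sample_variance)
print(population_sd, sample_sd)
```

The single large score (24) dominates both the range and the variance, which is one reason the standard deviation is usually reported together with the mean rather than alone.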


12/31

Z-Scores dan T-Scores

Z-Scores

Most widely used standard score in statistics

It is the number of standard deviations above or below the mean.

A Z score of 1.5 means that the score is 1.5 standard deviations above the mean; a Z score of -1.5 means that the score is 1.5 standard deviations below the mean

Always have the same meaning in all distributions

To find a percentile rank, first convert to a Z score and then find the percentile rank in a normal-curve table

T-Scores

Most commonly used standard score for reporting performance

May be converted from Z-scores and are always rounded to two figures, thereby eliminating decimals

Always reported in positive numbers

The mean is always 50 and the standard deviation is always 10.

A T-score of 70 is 2 SDs above the mean

A T-score of 20 is 3 SDs below the mean
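
The conversions described above can be sketched as follows. The function names are illustrative, and `statistics.NormalDist` stands in for the printed normal-curve table, which assumes the scores are normally distributed.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / sd

def t_score(z):
    """Rescale a Z score to mean 50 and SD 10, rounded to a whole number."""
    return round(50 + 10 * z)

def z_from_t(t):
    """Recover the Z score that a T-score represents."""
    return (t - 50) / 10

def percentile_rank(z):
    """Percentage of a normal distribution lying below the given Z score."""
    return 100 * NormalDist().cdf(z)

print(t_score(2.0), t_score(-3.0))   # the two T-score examples above
print(z_from_t(70), z_from_t(20))
print(percentile_rank(1.5))
```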


13/31


14/31

Correlation and Linear Regression

Correlation or Covariation

The correlation coefficient is a summary statistic of the degree of association or relationship between two variables

The correlation can be negative or positive

Linear Regression

The purpose of a regression equation is to predict new sample observations based on the findings from a previous sample.
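
A minimal sketch of both ideas, assuming the Pearson correlation coefficient and an ordinary least-squares line (the slides name neither method explicitly). The data and the function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical earlier sample (illustrative values only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
slope, intercept = fit_line(xs, ys)
predicted = intercept + slope * 6   # prediction for a new observation x = 6
print(r, slope, intercept, predicted)
```

The sign of `r` gives the direction of the relationship, and the fitted line is what turns the earlier sample into a prediction for a new observation.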


15/31

Summary: Descriptive & Inferential Statistics

Descriptive

A. For one variable ("univariate analysis"):

Measures of CENTRAL TENDENCY (averages) and of DISPERSION or variance around that average.

Examples: Means, Modes, Medians, Standard Deviation,

quartiles

B. Descriptive statistics for the strength of relationship

between two variables (bivariate analysis) or among a set of

variables (multivariate analysis) are measures of

ASSOCIATION or correlation.

Inferential

Are measures of the SIGNIFICANCE of the relationship

between two or more variables. Significance refers to the

probability that the findings could be attributed to sampling

error. Appropriate statistics depend on the LEVEL OF


16/31

Types of Statistical Analysis - Descriptive

Quantify the degree of relationship between variables

Parametric tests are used to test hypotheses with stringent assumptions about observations, e.g., t-test, ANOVA

Nonparametric tests are used with data in a nominal or ordinal scale, e.g., Chi-Square, Mann-Whitney U, Wilcoxon


17/31

Types of Statistical Analysis - Inferential

Allow generalization about populations using data from samples

Non-parametric

Non-parametric tests do not require any assumptions about normal distribution, but are generally less sensitive than parametric tests.

The test for nominal data is the Chi-Square test

The tests for ordinal data are the Kolmogorov-Smirnov test, the Mann-Whitney U test, and the Wilcoxon Matched-Pairs Signed-Ranks test

Parametric

The tests for interval and ratio data include the t-test, among others


18/31

Statistics and Probability

Statistics: Procedures for describing, analyzing, and interpreting quantitative data

The choice of statistical technique should be guided by the research design and the type of data collected

Probability simply represents a judgment about likelihood of outcomes, i.e., how likely is it that I could obtain a result like this purely by chance?

Statistical inferences significant

very unlikely the effect would occur by


19/31

Introduction to Statistics

Inferential Statistics


20/31

Sampling (1)

Sampling relates to the degree to which those surveyed are representative of a specific population

The sample frame is the set of people who have the chance to respond to the survey

A question related to external validity is the degree to which the sample frame corresponds to the population to which the researcher wants to apply the results (Fowler, 1988)


21/31

Sampling (2)

Two basic types: probability and non-probability

Probability sampling (PS) can include random sampling, stratified random sampling, and cluster sampling

Non-probability sampling (NPS) can include quota sampling, snowball sampling, and convenience sampling


22/31

Random Sampling (PS)

Every unit has an equal chance of selection

Although it is relatively simple, members of

specific subgroups may not be included in

appropriate proportions


23/31

Stratified Random Sampling (PS)

The population is grouped according to meaningful characteristics or strata

This method is more likely to reflect the general

population, and subgroup analysis is possible

However, it can be time consuming and costly
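
A rough sketch of stratified random sampling, assuming proportional allocation (the same fraction drawn from every stratum), which the slide does not specify. The population, the strata names, and the helper `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)          # seeded so the draw is repeatable
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 units in one stratum, 40 in another.
population = [("structures", i) for i in range(60)] + \
             [("water", i) for i in range(40)]

sample = stratified_sample(population, stratum_of=lambda u: u[0], fraction=0.1)
per_stratum = {name: sum(1 for u in sample if u[0] == name)
               for name in ("structures", "water")}
print(per_stratum)
```

Because each stratum is sampled separately, both subgroups appear in the sample in their population proportions, which is what makes subgroup analysis possible.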


24/31

Systematic Sampling (PS)

Every xth unit is selected (e.g., every other person entering the gate was selected)

The method is convenient and close to random sampling if the starting point is randomly chosen

Recurring patterns can occur and should be examined
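
A minimal sketch of systematic sampling with a randomly chosen starting point. The function name and the numbered list of units are assumptions for illustration.

```python
import random

def systematic_sample(units, step, seed=None):
    """Select every step-th unit, starting from a random offset below step."""
    rng = random.Random(seed)
    start = rng.randrange(step)   # random starting point in [0, step)
    return units[start::step]

# Hypothetical ordered list of 100 units, e.g. people in order of arrival.
units = list(range(100))
picked = systematic_sample(units, step=10, seed=1)
print(picked)
```

If the list itself has a cycle whose length matches `step`, every selected unit falls at the same point in the cycle, which is the recurring-pattern risk noted above.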


25/31

Cluster/Multistage Sampling (PS)

Natural groups are sampled and then their members are sampled

This method is convenient and can use existing

units


26/31

Quota Sampling (NPS)

The population is divided into subgroups and the sample is selected based on the proportions of

the subgroups necessary to represent the

population

This method depends on reliable data about the

proportions in the population


27/31

Convenience Sampling (NPS)

This method uses readily available groups or units of individuals

It is practical and easy to use

However, it may produce a biased sample

Convenience sampling can be perfectly acceptable if the purpose of the research is to test a hypothesis that certain variables are related to one another


28/31

Snowball Sampling (NPS)

Previously identified members identify others

This method is useful when a list of potential

names is difficult to obtain

However, it may produce a biased sample


29/31

Statistics & Parameters

A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic. For example, the population mean is a parameter that is often used to indicate the average value of a quantity

A statistic is a quantity that is calculated from a sample of data. It is used to give information about unknown values in the corresponding population. For example, the average of the data in a sample is used to give information about the overall average in the population from which that sample was drawn.

The sampling distribution describes probabilities associated with a statistic when a random sample is drawn from a population


30/31

Interval Estimate & Sampling Distributions

Interval Estimate

A range or band within which the parameter is thought to lie, instead of a single point or value as the estimate of the parameter

Sampling Distributions

The sampling distribution of the mean is a frequency distribution, not of observations, but of means of samples, each based on n observations.

The standard error of the mean is used as an estimate of the magnitude of sampling error. It is the standard deviation of the sampling distribution of the sample means.
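
A small simulation sketch of this definition. It assumes a hypothetical uniform population on [0, 1) and the usual estimate s / sqrt(n) for the standard error, neither of which comes from the slides; the seed, n, and number of samples are arbitrary.

```python
import math
import random
import statistics

def standard_error(sample):
    """Estimate the standard error of the mean from one sample: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 25                    # observations per sample
num_samples = 2000        # number of repeated samples

# Build the sampling distribution of the mean by repeated sampling.
sample_means = [
    statistics.mean([rng.random() for _ in range(n)])
    for _ in range(num_samples)
]

# Standard deviation of the sample means, i.e. the empirical standard error.
empirical_se = statistics.stdev(sample_means)

# For a uniform [0, 1) population the standard deviation is sqrt(1/12).
theoretical_se = math.sqrt(1 / 12) / math.sqrt(n)

# In practice only one sample is available, so the SE is estimated from it.
estimated_se = standard_error([rng.random() for _ in range(n)])

print(empirical_se, theoretical_se, estimated_se)
```

The first two numbers should be close to each other; the third shows how well a single sample can approximate them.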


31/31

Inferential Statistics

Confidence Intervals

Correspond to the percentage of cases in a normal distribution that lie within 1, 2, or 3 standard deviations of the mean (about 68%, 95%, and 99.7%)

Central Limit Theorem

States that the sampling distribution of the mean (and, under suitable conditions, of medians, variances, and many other statistical measures) approaches a normal distribution as the sample size, n, increases
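
A simulation sketch tying the two ideas together. It assumes a hypothetical skewed population (exponential with mean 1 and standard deviation 1) and checks what share of sample means falls within 1, 2, and 3 standard errors of the population mean. The seed, n, and number of samples are arbitrary choices.

```python
import math
import random

rng = random.Random(7)    # fixed seed so the run is repeatable
n = 50                    # observations per sample
num_samples = 5000        # number of repeated samples

# Exponential population with rate 1: mean 1 and standard deviation 1.
population_mean = 1.0
population_sd = 1.0
standard_error = population_sd / math.sqrt(n)

sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

def share_within(k):
    """Share of sample means within k standard errors of the population mean."""
    hits = sum(abs(m - population_mean) <= k * standard_error
               for m in sample_means)
    return hits / num_samples

coverage = {k: share_within(k) for k in (1, 2, 3)}
print(coverage)
```

Although individual observations are strongly skewed, the shares come out near the normal-curve percentages, which is why intervals of 1, 2, or 3 standard errors around a sample mean can be read as approximate confidence intervals.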
