Qualitative and Quantitative Data Analysis

Post on 04-Apr-2023


REGIONAL MARITIME UNIVERSITY

Unit 8

RESEARCH DATA ANALYSIS

Qualitative & Quantitative Data Analysis

G.S.K. AKAKPO

WHAT IS DATA?


• They are numerical facts and figures from which conclusions are drawn using statistical analysis.

• It is the information researchers obtain on the subjects of their study/research.

• The statistical data usually come in two forms

1. Demographic information

2. Information gathered on the basis of research objectives

Data Management

• Data management is the assembling and keeping of data accurately and securely and in a form that will be available and easy to use.

• After the instruments have been administered and collated, there is the need to review, assemble and sort them out based on well defined criteria such as community, job type, gender, religion, social status, marital status, reasons for doing something etc.

Process of data management (1)

• After data have been collated, and before they are analyzed, do the following:

i. Prepare a coding or scoring scheme,

ii. Prepare data dictionary,

iii. Edit and clean the data,

Processes Of Data Management (2)

1. Coding data

a. Questionnaires are coded as

i. 01, 02, 03, 04, etc for up to 99 elements

ii. 001, 002, 003, etc for hundreds

b. Gender is coded as

i. 1 for males

ii. 2 for females

Processes Of Data Management (3)

Coding data

c. Likert scale interpretation is coded as

1= Strongly Agree

2=Agree

3=Do not know

4=Disagree

5=Strongly Disagree
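The coding scheme above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample response are assumptions, while the codes themselves are the ones listed on the slides.

```python
# Codes taken from the slides; the sample response below is hypothetical.
GENDER_CODES = {"male": 1, "female": 2}
LIKERT_CODES = {
    "Strongly Agree": 1,
    "Agree": 2,
    "Do not know": 3,
    "Disagree": 4,
    "Strongly Disagree": 5,
}


def questionnaire_id(number, total):
    """Zero-pad the questionnaire number: 2 digits up to 99, 3 digits for hundreds."""
    width = 2 if total <= 99 else 3
    return str(number).zfill(width)


def code_response(number, total, gender, answer):
    """Turn one raw questionnaire response into its coded form."""
    return {
        "id": questionnaire_id(number, total),
        "gender": GENDER_CODES[gender.lower()],
        "answer": LIKERT_CODES[answer],
    }


coded = code_response(9, 60, "Female", "Agree")
print(coded)  # {'id': '09', 'gender': 2, 'answer': 2}
```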

Likert scale: Example 1

On a scale of 1 to 5, (1 being least and 5 being highest), rate your assessment of this course.

----------------------------------------------------

Likert scale: Example 2

In your opinion, how do you rate this course?

1 = Poor

2 = Satisfactory

3 = Good

4 = Very good

5 = Excellent

Processes Of Data Management (4)

2. Data dictionary

This is a record book to keep track of all variables, names and codes used during data collection in a computer file.

It is prepared by listing all the acronyms and codes used in the data together with their definitions.

E.g. GNPC-Ghana National Petroleum Corporation

ECG- Electricity Company of Ghana

GMG- Good Morning Ghana

1-male

2-female

01-questionnaire number 1 for 1-99 range items

009 – questionnaire 9 for 1-999 range items
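A data dictionary like the one above could be kept as a simple lookup structure. The sketch below assumes a nested Python dictionary; the layout and the helper name `describe` are illustrative choices, not something prescribed by the slides.

```python
# Entries copied from the examples above; the structure itself is an assumption.
DATA_DICTIONARY = {
    "acronyms": {
        "GNPC": "Ghana National Petroleum Corporation",
        "ECG": "Electricity Company of Ghana",
        "GMG": "Good Morning Ghana",
    },
    "gender": {1: "male", 2: "female"},
}


def describe(variable, code):
    """Look up the meaning of a code; unknown codes are flagged, not guessed."""
    return DATA_DICTIONARY[variable].get(code, "undefined code")


print(describe("gender", 2))       # female
print(describe("acronyms", "ECG"))  # Electricity Company of Ghana
print(describe("gender", 7))       # undefined code
```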

Processes Of Data Management (5)

3. Data editing: handling missing data

When questionnaires are returned and retrieved, one is likely to encounter wrong responses and/or no responses.

When this occurs, the researcher can do one of two things:

i. Remove the respondent from the analysis if there is no response

ii. Insert the average response into the missing case. Note that this can distort the results ("data torturing")

In addition, it must be clearly defined how "Don't know"/"Not applicable" responses will be used in the analysis.
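The two options for missing responses can be illustrated as follows. The response values are hypothetical, and `None` is assumed to mark a missing answer.

```python
from statistics import mean

# Hypothetical Likert responses (1-5); None marks a missing answer.
responses = [4, 5, None, 3, 4, None, 2]

# Option (i): remove respondents with no response.
complete_cases = [r for r in responses if r is not None]

# Option (ii): insert the average of the observed responses into each gap.
average = mean(complete_cases)
imputed = [average if r is None else r for r in responses]

print(len(complete_cases))  # 5
print(average)              # 3.6
print(imputed)              # [4, 5, 3.6, 3, 4, 3.6, 2]
```

Option (ii) keeps the sample size at 7 but makes the data look less variable than it really is, which is one reason the slides warn against it.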

Data Analysis

Data analysis can be thought of as making graphical and numerical meaning out of raw or processed statistical data.

Data Analysis

Data analysis involves summarizing data with tables and presenting it using graphs.

It also involves statistical calculations to check the statistical significance and relevance of the data gathered.

• We Summarize data with

• Frequency tables or

• Contingency tables

Data Analysis

• The data presented in the tables are represented using:

i. Pie charts

ii. Histograms

iii. Bar charts

iv. Scatter plots/regression lines

Further statistical calculations can be done for the purpose of statistical inferences by use of statistical tests-chi-square, t-test, etc.

Data Analysis

Most research data have 2 parts

1. Demographic Data Analysis

2. Research objective based data

Part 1: Demographic Data Analysis

Data on personal variables are best presented using basic descriptive statistical charts such as

Pie charts (nominal variables-gender, nationality) and;

Bar graphs (ordinal variables-age, qualification)

Hypothetical Example

Imagine data was collected on 60 respondents with demographic variables

Gender, and

Age

Demographic Data Analysis 1

Table 1. Frequency table showing gender distribution of respondents

Gender Number %

Males 39 65

Females 21 35

Totals 60 100
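A frequency table such as Table 1 can be produced directly from coded responses. The sketch below rebuilds the 60 gender codes from the counts in the table (1 = male, 2 = female); the variable names are illustrative.

```python
from collections import Counter

# Rebuild the 60 coded gender responses from Table 1 (1 = male, 2 = female).
genders = [1] * 39 + [2] * 21

counts = Counter(genders)
total = sum(counts.values())

# Map each code to (number, percentage of respondents).
table = {code: (n, round(100 * n / total, 1)) for code, n in counts.items()}

print(total)  # 60
print(table)  # {1: (39, 65.0), 2: (21, 35.0)}
```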

Demographic data analysis-graphs

Figure 1: A pie chart showing gender distribution of respondents (Males 65%, Females 35%)

Discussion on gender distribution

The graph (Fig.1) shows that 39 out of the total of 60 respondents, representing 65%, were males.

From this it could be inferred that about two-thirds of the target population were males, meaning that the target population was male dominated.

Demographic data analysis 2

Table 2. Frequency table showing distribution of respondents in terms of age

Age of respondent   Number   % of respondents
25-below            15       25.0
26-30               8        13.3
31-35               10       16.7
36-40               17       28.3
41-above            10       16.7
Total               60       100
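The age distribution in Table 2 can also be summarized and drawn as a simple text bar chart. This is a minimal sketch using only the counts from the table; the `#` bars are a stand-in for a plotted bar graph.

```python
# Counts taken from Table 2.
age_counts = {"25-below": 15, "26-30": 8, "31-35": 10, "36-40": 17, "41-above": 10}

total = sum(age_counts.values())

# One bar per age group: the bar length equals the number of respondents.
for group, n in age_counts.items():
    print(f"{group:<9} {'#' * n:<17} {n:>2} ({100 * n / total:.1f}%)")

# The modal group is the one with the highest frequency.
modal_group = max(age_counts, key=age_counts.get)
smallest_group = min(age_counts, key=age_counts.get)
print(modal_group, smallest_group)  # 36-40 26-30
```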

Demographic data analysis

Figure 2: A simple bar graph showing age distribution of respondents (x-axis: age group; y-axis: number of respondents)

Discussion on age distribution

The graph (Fig.2) shows that the modal age group was the 36-40 age bracket, with 17 out of 60 respondents, followed by those aged 25 and below, with 15. The smallest group, with 8 respondents, was the 26-30 age bracket.

Other demographic variables

Variables such as

Marital status

Health status

Social status

Level of education

Nationality

Occupation

Qualification

Preferences

CONTENT DATA ANALYSIS

BASED ON THE OBJECTIVES OF YOUR STUDY

CONTENT DATA ANALYSIS

• Content data is usually based on research objectives

• Content data can be analyzed using

i. Descriptive statistics

ii. Inferential statistics

Descriptive analysis can be done using any of the methods earlier indicated

Inferential statistics analysis usually takes the form of hypothesis testing/further statistical computations

Likert Scale Questionnaire analysis

Recall the item

In your opinion, how do you rate this course?

1. Poor

2. Satisfactory

3. Good

4. Very good

5. Excellent

Item Weight Frequency Value

Poor 1 10 10

Satisfactory 2 12 24

Good 3 7 21

Very good 4 17 68

Excellent 5 14 70

Total 60 193

Generated frequency table from the administered Questionnaire

Graph showing how respondents rated the item

Overall rating by the respondents

Total respondents = 60

Total points = 193

Mean choice = 193/60 = 3.22 ≈ 3

From the calculation, it can be observed and concluded that the overall rating of the course by the respondents is “3” which means that the respondents rated the course as “Good”.
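The weighted-mean calculation above can be reproduced as follows. The weights and frequencies are those in the frequency table; the variable names are illustrative.

```python
# Weights and frequencies copied from the frequency table above.
weights = {"Poor": 1, "Satisfactory": 2, "Good": 3, "Very good": 4, "Excellent": 5}
frequencies = {"Poor": 10, "Satisfactory": 12, "Good": 7, "Very good": 17, "Excellent": 14}

total_respondents = sum(frequencies.values())
# Value = weight x frequency, summed over all items.
total_points = sum(weights[item] * frequencies[item] for item in weights)
mean_choice = total_points / total_respondents

# Round to the nearest weight and translate it back to its label.
label_by_weight = {w: item for item, w in weights.items()}
overall_rating = label_by_weight[round(mean_choice)]

print(total_respondents, total_points)  # 60 193
print(round(mean_choice, 2))            # 3.22
print(overall_rating)                   # Good
```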

Analyzing statistical data

Using

Hypothesis testing

Statistical computations

Steps in hypothesis testing

a) State the null hypothesis, Ho

b) State the alternative hypothesis, H1

c) Choose the level of significance, α

d) Select an appropriate test statistic – t-test, χ², etc

e) Calculate the value of the test statistic

f) Determine the critical region

g) Reject Ho if the value of the test statistic falls in the critical region; otherwise do not reject (retain) Ho

Test statistic

Two common test statistics used in student research

Student’s t-test

Used to compare the means of a continuous variable between the two groups of a categorical variable.

Example:

Testing differences in mean performances of boys and girls in class
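As a rough sketch of how such a comparison works, the code below computes a two-sample t statistic. The scores are hypothetical, invented purely for illustration, and the pooled (equal-variance) form of the test and the 5% significance level are assumptions; the critical value is read from a standard t table.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical class scores, invented purely for illustration.
boys = [60, 65, 70, 75, 80]
girls = [66, 71, 76, 81, 86]


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic, assuming equal variances in both groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_error, df


t_value, df = pooled_t(boys, girls)

T_CRITICAL = 2.306  # two-tailed critical t for alpha = 0.05, df = 8 (from a t table)
reject_h0 = abs(t_value) >= T_CRITICAL

print(round(t_value, 2), df, reject_h0)  # -1.2 8 False
```

With these made-up scores the difference in means is not significant at the 5% level, so Ho (no difference between boys and girls) would not be rejected.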

Test statistic

Chi-square test

Used to analyze categorical variables using the frequencies (counts) observed in each category

Example:

Analyzing equality of illegal electricity connections by customers of different social classes

Let’s focus on content data analysis

Chi-Square Analysis

• Chi-square (χ²) tests are the most popular and frequently used non-parametric tests of significance in social sciences

Chi-square (χ²) analysis

In a chi-square (χ²) test, the data collected in the study (observed frequencies, O) are compared with the expected data (expected frequencies, E).

The size of the difference between them determines whether the result is significant.

Formula

$\chi^2 = \sum \frac{(O - E)^2}{E}$

When to use Chi-square (χ²) analysis?

a. Every value falls in only one category (nominal data)

b. Observations are independent: the category one subject falls in does not affect the category of any other subject

c. The expected frequency in each category is at least 5

Chi-square (χ²) analysis

• Procedure

i. A simple table called a contingency table is constructed

ii. The observed frequencies (O) are identified

iii. The expected frequencies (E) are ascertained

iv. The expected frequencies are subtracted from the observed frequencies

v. The differences are squared for each category

vi. Each squared difference is divided by the corresponding E

vii. The ratios in (vi) are summed to give the calculated χ²
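Steps (iv) to (vii) of the procedure translate almost line for line into code. The function below is a minimal sketch; the name `chi_square` and the small set of counts used to try it out are hypothetical.

```python
def chi_square(observed, expected):
    """Calculated chi-square: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        difference = o - e          # step iv: subtract E from O
        squared = difference ** 2   # step v: square the difference
        total += squared / e        # steps vi and vii: divide by E and accumulate
    return total


# Hypothetical counts: 20 subjects expected to split evenly over 2 categories.
statistic = chi_square([12, 8], [10, 10])
print(round(statistic, 2))  # 0.8
```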

Chi-square (χ²) Analysis

Table 3: Contingency table for gender involvement in illegal connection

Gender         Males   Females
Observed (O)   39      21
Expected (E)   30      30
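Applying the chi-square formula to the figures in Table 3 gives the following worked calculation. The slides do not carry this computation through, so it is shown here only as an illustration of the arithmetic; with k = 2 categories, the degree of freedom is k − 1 = 1.

$$\chi^2 = \frac{(39 - 30)^2}{30} + \frac{(21 - 30)^2}{30} = \frac{81}{30} + \frac{81}{30} = 2.7 + 2.7 = 5.4$$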

Calculations

i) Formulate the null hypothesis (Ho) & alternative hypothesis (H1)

ii) Ascertain the degrees of freedom and the critical value of Chi-square (χ²)

iii) Compare the critical value of χ² with the calculated χ² score

Calculations

iv. If the computed χ² value is greater than or equal to the critical value, the null hypothesis is rejected and the difference is considered to be significant

v. If the computed χ² value is less than the critical value, the null hypothesis is not rejected (it is retained), meaning there is no significant difference.

Example

In a study of 60 illegal electrical connections, a researcher reported 30 such cases in the lower social class, 20 in the middle class and 10 in the upper social class.

Test, at the 1% level of significance, the hypothesis that illegal electricity connections are equally likely in all 3 social classes.

Descriptive analysis: Frequency Table

The above data can be summarized and presented using simple frequency tables and graphs

Social class Frequency

Lower class 30

Middle class 20

Upper class 10

Total 60

Descriptive analysis: graph

Discussion (1)

Looking at the above graph, one can say that more people in the lower social class were involved in illegal connections of electricity.

But this statement needs to be tested statistically by performing further calculations

Hence the chi-square calculation below

Solution

Recall the hypotheses stated under chapter one

Ho: Illegal connections are the same in all 3 social classes

Ha: Illegal connections vary according to the respondent’s social classes (3 social classes)

If illegal connections are equally likely in all 3 social classes (Ho), then the expected number of illegal connections in each class equals 60/3 = 20

χ² Contingency table

• Table for analysis

There are 3 classes, hence k = 3

Degrees of freedom, df = k − 1 = 3 − 1 = 2

Critical value of χ² (0.01, 2) = 9.21 (refer to table), where 0.01 is the alpha level & 2 is the degrees of freedom

Social class Lower Middle Upper

Observed frequencies (O) 30 20 10

Expected frequencies (E) 20 20 20
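The critical value of 9.21 is normally read from a chi-square table, but for the special case of 2 degrees of freedom it can be checked by hand. The sketch below relies on the fact that, for df = 2 only, the upper-tail probability of the chi-square distribution is exp(−x/2); the function name is illustrative, and other degrees of freedom still need a table or a statistics package.

```python
import math


def chi2_critical_df2(alpha):
    """Critical value for df = 2 only, where P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)


k = 3                 # number of social classes
df = k - 1            # degrees of freedom
critical_value = chi2_critical_df2(0.01)

print(df)                         # 2
print(round(critical_value, 2))   # 9.21
```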

Preparing the Table

Social class   Lower        Middle     Upper            Total
O              30           20         10               60
E              20           20         20               60
O-E            10           0          -10
(O-E)²         10² = 100    0² = 0     (-10)² = 100
(O-E)²/E       100/20 = 5   0/20 = 0   100/20 = 5

χ² = ∑[(O-E)²/E] = 5 + 0 + 5 = 10
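The table above can be reproduced row by row in code. This is a minimal sketch using the observed counts from the example; the variable names are illustrative.

```python
classes = ["Lower", "Middle", "Upper"]
observed = [30, 20, 10]

# Under Ho every class is equally likely, so E = 60 / 3 = 20 for each class.
n = sum(observed)
expected = [n / len(classes)] * len(classes)

differences = [o - e for o, e in zip(observed, expected)]   # row O-E
squared = [d ** 2 for d in differences]                     # row (O-E)^2
ratios = [s / e for s, e in zip(squared, expected)]         # row (O-E)^2 / E
chi_square_calc = sum(ratios)

print(differences)      # [10.0, 0.0, -10.0]
print(ratios)           # [5.0, 0.0, 5.0]
print(chi_square_calc)  # 10.0
```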

Summary Of Results

• Critical value of χ² (0.01, 2) = 9.21

• Calculated value of χ² = 10

Recall the Decision rule

• When calculated value of χ² >= critical value of χ², Ho is rejected

• When calculated value of χ² < critical value of χ², Ho is not rejected (it is retained)

Decision rule invoked

Since χ² calc > χ² critical

=> 10 > 9.21

Conclusion: We reject the null hypothesis that illegal connections are equally likely in all 3 social classes.

That is, illegal electrical connections vary in terms of the inhabitants’ social class status.
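The same decision can be reached by comparing a p-value with the alpha level instead of comparing the statistic with the critical value. The sketch below again uses the closed form that holds for df = 2 only, P(χ² > x) = exp(−x/2); this p-value approach is an addition for illustration and is not part of the slides.

```python
import math

chi_square_calc = 10.0   # calculated value from the worked example
critical_value = 9.21    # table value for alpha = 0.01, df = 2
alpha = 0.01

# Decision by critical value, as in the slides.
reject_by_critical = chi_square_calc >= critical_value

# Decision by p-value; this closed form is valid for df = 2 only.
p_value = math.exp(-chi_square_calc / 2)
reject_by_p = p_value < alpha

print(reject_by_critical)               # True
print(round(p_value, 4), reject_by_p)   # 0.0067 True
```

Both routes agree: Ho is rejected at the 1% level.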

Type I & Type II errors

Type I error:

1. Rejection of a true null hypothesis is called the type I error.

2. Subsequent studies might not reproduce the result observed in the original investigation.

3. Leads to changes that are unwarranted.

Type II error:

1. Retention of false null hypothesis is called the type II error.

2. The ultimate truth remains unknown although evidence might support an alternative hypothesis.

3. Leads to maintenance of a status quo when a change is warranted.
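A Type I error can be made concrete with a small simulation. The sketch below assumes the setting of the social class example (60 cases, 3 equally likely classes, critical value 9.21) and repeatedly draws samples for which Ho is true by construction; every rejection it counts is therefore a Type I error. The number of trials and the random seed are arbitrary choices.

```python
import random


def chi_square(observed, expected):
    """Calculated chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def type_i_error_rate(trials=2000, n=60, k=3, critical=9.21, seed=1):
    """Share of samples drawn under a TRUE Ho in which Ho is wrongly rejected."""
    rng = random.Random(seed)
    expected = [n / k] * k
    rejections = 0
    for _ in range(trials):
        # Assign each of the n cases to one of k equally likely classes.
        observed = [0] * k
        for _ in range(n):
            observed[rng.randrange(k)] += 1
        if chi_square(observed, expected) >= critical:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(rate)  # expected to be close to the chosen alpha of 0.01
```

The rejection rate should land near the 1% significance level, which is what choosing α = 0.01 means: accepting roughly a 1-in-100 chance of rejecting a true null hypothesis.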

Summary

Data can be referred to as numerical facts and figures from which conclusions are drawn using statistical analysis. It is the information researchers obtain on the subjects of their study/research. The statistical data usually come in two forms. These are: (1) Demographic information, and (2) Information gathered on the basis of research objectives.

After the instruments have been administered and collated, there is the need to review, assemble and sort them out based on well-defined criteria such as community, job type, gender, religion, social status, marital status, etc.

Data management is the assembling and keeping of data accurately and securely and in a form that will be available and easy to use.

Data analysis involves summarizing data into tables and presenting it using graphs. It also involves statistical calculations to check the statistical significance and relevance of the data gathered.