Interpreting the Dynamics of Social Indicators: Methodological Issues Related to Absolute, Relative, and Time Differences

Katja Prevodnik

Vesna Dolničar

Vasja Vehovar

Faculty of Social Sciences, University of Ljubljana, Kardeljeva ploščad

5, 1000 Ljubljana, Slovenia

Abstract

In contemporary society, a growing demand for the accessibility of social

indicators can be observed. This requires increased attention to their

construction to avoid misleading interpretations, which can be the result of

inadequate knowledge, or can even be an intentional choice to imply a

specific desired outcome. This paper addresses this issue by first

summarizing research regarding the perception of numbers, statistical

thinking, and numerical literacy. The focus is then narrowed to the

comparison of social indicators observed for two units in a time

perspective. Three simple and popular measures of dynamics—most

frequently used when social change is analyzed and interpreted—are

addressed: absolute difference, relative difference, and time distance. In a

corresponding experiment, respondents evaluated the direction of change of

a certain social indicator in time (i.e., whether the differences increase,

decrease, or stagnate) for a hypothetical case where the three measures

implied contradictory interpretations. Each experimental group was

exposed to one of these measures. The results indicate that interpretations

basically followed the specific measures that respondents were exposed to.

This effect was particularly strong regarding absolute difference, followed

by relative difference, while the effect of exposure to time distance was

somewhat weaker. When only data or graphical presentation was given,

respondents tended to interpret dynamics according to absolute differences.

The results indicate that extreme methodological rigor is needed when

presenting social indicators in time, and some guidelines are provided for

this purpose.

Keywords

Statistical literacy

Numerical data

Number sense

Manipulations

Comparative analysis

1. Introduction

The proper usage of indicators, statistics, and presentations depends

primarily on whether and how the public understands them. Sicherl (2011) claims

that perceptions of welfare and social progress depend on the measure used.

Thus, the lagging of certain subjects or groups in a specific comparison can

be perceived as much more severe if presented with time-distance

methodology as opposed to presenting only absolute or relative differences.

Consequently, a ten-year time lag of a certain country appears much more

serious than the corresponding absolute difference of, say, three percentage points

(e.g., 20 vs. 23 %). Similarly, Mueller (2011) has shown that different forms

of presentation can have very specific effects on the reader’s understanding.

Decisions and actions based on such presentations are, therefore, directly

dependent not only on data quality (e.g., accuracy, reliability, validity, etc.)

but also on how the analyses have been performed, presented, and interpreted.

Mueller (2011 ) also noted that contextual issues arise not so much from the

core problem of monitoring the development but from the methodology.

According to Best (2008), an in-depth analysis of an individual statistic is

necessary to acquire a clear view of a social phenomenon. Therefore,

the authors believe that accurate statistics (e.g., absolute numbers, ratios,

percentages, etc.) are not sufficient on their own for building general knowledge of

research questions; a sound method of presenting the data is also

necessary.

The assumption of the general public is that statistics are provided by experts

(e.g., statistical offices and research institutes), which is sometimes one of the

key reasons for a rather uncritical acceptance of provided results. In this

paper, this assumption is questioned following Best (2008 ), who states that

even official statistics are social products formulated by specific people and

organizations. In general, all numbers, measures, and indicators are, thus,

products of the activities of people who decide not only what they want to

count and how to perform measurements and analyses but also how to present

and interpret the results and give them meaning. Thus, the capacity of the

public to evaluate and critically analyze data and corresponding presentations

is becoming ever more important. This is essential for social indicators,

particularly in the field of well-being and quality-of-life research, since

results have an important influence on the public perception of the current

state in a society as well as on government measures and policies. Substantial

efforts have already been devoted to various contextual and methodological

issues; however, certain very essential aspects have still not been fully

elaborated. The issue of the appropriate choice of indicator and its

presentation may sometimes be decisive to its interpretation as, for example,

Diener and Suh (1997 ) demonstrated in their quality-of-life indicators.

Many researchers (e.g., Wild and Pfannkuch 1999 ; Kaplan et al. 2010 ;

Schield 2010 ) and popular writers (e.g., Huff 1973 ; Campbell 1974 ;

Blastland and Dilnot 2007; Best 2008) believe that this discussion is

particularly important because the use of numerical and statistical data is

rapidly increasing and such data are being used by a growing number of

different groups. Data and indicators are generally produced by competent

researchers but can also be produced by less qualified persons (e.g., the

media, lobbyists, specific stakeholders, policymakers, etc.) where the

selection of numbers, graphs, and formulae might become a communication

or even a manipulation strategy. The multiplication of measurement

instruments and methods further expands the potential of heterogeneous

methods of analysis and presentation. In practice, this may complicate an

objective and conclusive judgment as it may become unclear which

approach/interpretation is correct. A researcher’s ethical duty is to cover as

many aspects as possible and answer as many questions as possible by also

taking into account current knowledge in the perceptions of numbers,

graphical presentations, and verbal interpretations (e.g., Curcio 1981;

Lewandowsky 1987, Lewandowsky 1999; Lewandowsky and Spence 1989;

Schwarz 1996; Shaughnessy et al. 1996; Friel et al. 2001; Dehaene 2011;

OECD 2011). These problems are covered in most ethics codes in the fields

of social sciences, methodology, and statistics (e.g., ISA Code of Ethics,

AAPOR Code of Professional Ethics and Practices, WAPOR Code of

Professional Ethics and Practices, ICC/ESOMAR Code, RESPECT Code of

Practice, and ISI Declaration of Professional Ethics).

The problem of interpreting the dynamics of social indicators is encountered

extremely frequently, and therefore, is considered particularly important.

Various specific questions are often discussed such as “Is the gender gap truly

increasing?”, “Is the digital divide growing or stagnating?”, and “Is the

quality-of-life indicator changing equally across subgroups?” These dilemmas

are, therefore, not only methodologically intriguing but also of substantial

practical importance. In this paper, the above issues are addressed within the

somewhat narrower context of simple comparative analyses of a social

indicator for two units/groups in time. The corresponding theory and

empirical research are elaborated to evaluate the opportunity for

misinterpretation (and manipulation) when interpreting such comparisons.

We start with an overview of a broader set of theoretical considerations

regarding numbers and the human perception of quantities (Sect. 2 ). Next,

absolute, relative, and time difference measures are presented together with an

example (Sect. 3 ). For the empirical study, the experimental design is

outlined in Sect. 4 , and the data analyses are presented in Sect. 5 . In the

conclusions (Sect. 6 ), the key findings are discussed in a broader context

together with their methodological limitations and implications for future

research. Some recommendations for practical work using these measures are

also outlined.

2. Understanding Numbers and Statistics

Theoretically, two major research streams in human perception and the

understanding of numerical data can be identified. The first arises from

psychology, and particularly neuropsychology (in relation to the so-called

number sense), while the second is related to statistical thinking and statistical

literacy.

One of the basic questions related to human perception is whether any innate

predispositions determine potential perception. Some answers can be found in

the field of neuropsychology, where important research has been related to the

so-called human “number sense” (the “number concept”, Brainerd 1979 )

defined as the ability to quickly understand, approximate, and manipulate

numerical quantities. A likely conclusion (Dehaene 2011 : 50) is that there

exists, in fact, an area of the brain specifically for identifying numbers, which

is laid down through the spontaneous maturation of cerebral neuronal

networks under genetic control and with minimal guidance from the

environment. Dehaene (2001 ) argues that the foundations of arithmetic lie in

a human’s ability to mentally represent and manipulate numerosities on a

mental “number line” and that this representation has a long evolutionary

history and a specific cerebral substrate. The findings of various studies,

summarized by Dehaene (2011 ), show that individuals very quickly

differentiate digits up to 3 (4 seems to be the inflection point). Similarly, they

determine which number is bigger or smaller more quickly if the absolute

difference between the compared amounts is greater. They also perceive the

difference between two small numbers as greater than the difference between

two large numbers (e.g., 3 and 4 vs. 103 and 104). The parameter by which

humans naturally distinguish two numbers is, thus, not so much their absolute

difference but their difference relative to their size. An experiment comparing

young children in a Western civilization with uneducated tribal adults

concluded that the representation of numbers in the uneducated adults closely

approximated a logarithmic function and not a line; whereas, in the young,

educated children, the converse was true. Thus, “a shift from logarithmic to

linear mapping occurs later in development, between first and fourth grade

depending on the experience and the range of numbers tested” (Dehaene

2011 ; see also Nunez 2011 ). Such research indicates that the bias for small

numbers can have far-reaching consequences in the way individuals conduct

and interpret statistical analyses. The biological principles behind the number

sense can be explored further (e.g., Dehaene 2001 and Dehaene 2011 ; Göbel

et al. 2011 ; Nunez 2011 ), but the intriguing fact here is that there is a real

and biologically proven possibility that a particular numerical problem will be

viewed in a certain way unless we are taught otherwise. Research has shown

that education plays an important role in developing or acquiring some

numerical abilities (e.g., Brainerd 1979 ; Göbel et al. 2011 ; Nunez 2011 ). If

this is the case, education (specific to culture or civilization) may condition

our understanding and view of numbers in everyday life.
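
As a rough illustration of why 3 and 4 may feel further apart than 103 and 104, the sketch below compares the two pairs on a linear and on a logarithmic scale. The function name `scale_gaps` is hypothetical, and using the natural logarithm as a stand-in for a compressed mental number line is only a simplifying assumption, not a model taken from the studies cited above.

```python
import math


def scale_gaps(a, b):
    """Return (linear gap, logarithmic gap) between two positive numbers.

    The logarithmic gap is an assumed proxy for a compressed mental
    number line; it depends only on the ratio b / a.
    """
    linear = b - a
    logarithmic = math.log(b) - math.log(a)
    return linear, logarithmic


for pair in [(3, 4), (103, 104)]:
    linear, logarithmic = scale_gaps(*pair)
    print(pair, "linear gap:", linear, "log gap:", round(logarithmic, 4))
```

Both pairs are exactly 1 apart on the linear scale, whereas under this assumed mapping the logarithmic gap between 3 and 4 is roughly thirty times larger than the gap between 103 and 104.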

Within this context, for the purpose of our study, the following question is

relevant: if humans are in fact conditioned (taught) to view a certain

comparison in time predominantly in one specific dimension (e.g., the

absolute difference), is our perception of the true state of inequality

(difference or gap) actually distorted or skewed?

Human perception aspects have been emphasized not only in

neuropsychology but also by researchers in statistical education. Researchers

investigated factors that can explain why some students understand statistics

better while others never understand the importance and meaning of even

simple statistical measures (e.g., Wild and Pfannkuch 1999 ; Friel et al. 2001 ;

Watson and Callingham 2003 ; Garfield and Ben-Zvi 2007 ; Kaplan et al.

2010 ; Pfannkuch et al. 2010 ; Schield 2010 ). This research is covered under

the umbrella term, statistical literacy. Typically, however, research is focused

on a narrow research problem (e.g., understanding certain types of tables and

graphs, percentages, the p value, reading data, etc.). Different terminology

and definitions have arisen in the last few decades with regards to statistical

literacy, which all encompass a certain view on understanding statistics; for

example, statistical thinking (Wild and Pfannkuch 1998, 1999 ), statistical

reasoning (Garfield and Ben-Zvi 2007 ), statistical proficiency (Kaplan et al.

2010 ), statistical literacy (Gaal 2002 ; Schield 2010 ), and numeracy (Best

2008 ).

Paulos (2001) defined innumeracy as the mathematical equivalent of illiteracy

and referred to the inability to perceive the basic meanings of numbers and

probability. Biggeri and Zuliani (1999 ) refer to numerical literacy in terms of

several competences: the ability to work with numbers and quantitative

problems, understanding basic mathematical ideas and patterns, statistical

reasoning, the importance of thinking from the aspect of probability,

collecting and presenting data, the omnipresence of variability, and the

quantification and explanation of variability. However, the most important is

an understanding of the meaning of information (e.g., limitations and source

of statistical information and differentiation between quality and questionable

data). Schield (2010 ) defines statistical literacy as the ability to read and

interpret statistics in everyday media; in graphs, tables, assertions, surveys,

and studies, and states that it is a prerequisite for all data users. Human

understanding of complex analyses, nevertheless, varies significantly with

education (in mathematics or statistics, as well as by general education,

people’s experiences with data, etc.; e.g., Friel et al. 2001 ), which

corresponds to certain assertions in research that are summarized in Sect.

3.

Applied statistics is part of the process of collecting information and learning,

by which we support the process of informed decision and policymaking.

Being in the position of presenting information (as a researcher, statistician,

policy maker, or journalist/reporter) is demanding. One needs to provide the

reader with enough information to enable him or her to form a coherent view

of the problem, and at the same time, ensure that the choice of measures and

data correctly influences the perception of the reader. According to Gaal

(2002 ), there are two interrelated components to statistical literacy: (a)

people’s abilities to interpret and critically evaluate statistical information,

and (b) their abilities to discuss or communicate their reactions to such

statistical information. The true answers to research questions are often out of

reach, and we are only able to formulate an assessment of the available

answers and interpretations with varying degrees of error. Correspondingly,

we may provide users with several measures that describe a certain problem

(data) and are relevant for the research question, and thus, enable the

individual to form an informed notion of the problem. Of course, such an

approach is very demanding and requires additional resources. In addition, to

apply such an open and expanded methodology for the dissemination of

findings, knowledge of how people perceive different views of a research

question also needs to be gained to minimize possible manipulations and

misunderstandings.

The perception of different measures is also the focus of this paper, specifically

measures of the dynamics of social indicators between units in time (i.e., absolute

difference, relative difference, and time distance). All three measures are

presumed to answer the question: is the difference (gap) between the two

units in time constant, increasing, or decreasing? A comprehensive framework

to cover these issues is still required, and so far, only specific attempts have

been developed, mainly in the field of graphical presentations. Spence and

Lewandowsky (1991) and Spence (2005), for example, concluded that the

prejudice against pie charts (in favor of bar charts) is sometimes misguided.

Similarly, Galesic and Garcia-Retamero (2011) demonstrated through

experimental research that, in both the United States and Germany, one-third

of the population has low graph literacy and low numeracy skills.

3. Basic Comparative Analyses in Time

The three most common and most basic measures of comparison in time are

absolute difference, relative difference (ratio), and time distance. For

illustrative purposes, an example is presented (Fig. 1 ), which is then also

used in the experiment. Next, previous research is overviewed. This section is

concluded by outlining the research questions and hypotheses.

Fig. 1

Graphical presentation of the example and the indicated static (A absolute

difference, R relative difference) and dynamic measures of difference (T time

distance)

The example presented here poses a standard question in

comparisons: is the difference (here, in monthly income) between two units (here,

individuals) in a certain time interval increasing, decreasing, or constant? For

illustration, as well as for the purpose of the experiment, a specific example is

presented in Fig. 1 and Table 1 , which was intentionally constructed so that

the three measures show contradictory conclusions regarding the direction of

change.

Table 1

Data and calculations of statistical measures for the indicator of monthly income (in €)

Year                 Person X  Person Y  Absolute difference  Relative difference (Y/X)  Time distance (years)
2008                 1,000     500       500                  0.50                       (Not available)
2009                 1,500     1,000     500                  0.67                       1
2010                 1,750     1,250     500                  0.71                       1.5
2011                 2,000     1,500     500                  0.75                       2
Change in gap                            Constant             Decrease                   Increase

It is also necessary to comment briefly on the choice of indicator (i.e., the

monthly income for person X and person Y for the years 2008–2011). The

aim was to choose an indicator that would immediately intrigue all potential

respondents to actively consider and analyze the problem. The indicator used

is intuitively understandable to all regardless of their education in

mathematics, statistics, or other data-related fields. Unlike issues in

information society, economic development, or medical studies (e.g.,

Svedberg 2004 ; Dolničar 2007 ; Moser et al. 2007 ; Sicherl 2007 ; Citrome

2010 ; James 2011 ), this example refers to an everyday issue affecting the

majority of the population—income. Average monthly income for two distinct

groups (e.g., the public vs. the private sector or two different countries) over

several decades could also be used as an example. However, a significant

drawback here might be that most individuals probably already have at least

some opinion or even prejudice regarding this issue, which might have an

effect on their perceptions and understandings.

Firstly, the comparison can be made based on the absolute difference, which is

the value of the leading unit minus the value of the lagging unit. In this case, the

absolute difference at all points in time is 500 € (1,000 − 500 in 2008;

1,500 − 1,000 in 2009, and so on) and is thus constant, which implies that the

difference between the two units has not changed.

Secondly, the relative difference can be calculated (i.e., the ratio of the two

values of the indicator). Similar results follow if derived measures are

calculated (e.g., various types of growth rates, indices, or the Gini coefficient);

however, these measures are more complex for the reader to understand. In

2008, the ratio is 0.5 (ratio = 500/1,000; the lagging unit achieved 50 % of the

value of the leading unit). However, by 2011 this has changed to 0.75

(ratio = 1,500/2,000). Thus, the situation can be interpreted as a decrease in

the difference between the units.
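
The two static measures described so far can be reproduced directly from the values in Table 1. The following is a minimal sketch; the function and variable names are illustrative assumptions and do not come from the original study.

```python
# Monthly income (in EUR) for the two persons, copied from Table 1.
years = [2008, 2009, 2010, 2011]
income_x = [1000, 1500, 1750, 2000]  # leading unit (person X)
income_y = [500, 1000, 1250, 1500]   # lagging unit (person Y)


def absolute_differences(leading, lagging):
    """Leading value minus lagging value at each time point."""
    return [a - b for a, b in zip(leading, lagging)]


def relative_differences(leading, lagging):
    """Ratio of the lagging value to the leading value at each time point."""
    return [b / a for a, b in zip(leading, lagging)]


abs_diff = absolute_differences(income_x, income_y)
rel_diff = relative_differences(income_x, income_y)

for year, a, r in zip(years, abs_diff, rel_diff):
    print(year, a, round(r, 2))
```

The absolute difference stays at 500 in every year, while the ratio rises from 0.50 to 0.75, which is why the two measures suggest different conclusions about the same data.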

Thirdly, as Sicherl (2004, 2007, 2011) suggests, our usual perspective

can be complemented by considering the dynamic dimension of the

comparison. The statistical measure S-time-distance expresses the distance

(proximity) in time between points i and j when two compared series reach a

specified level of indicator X, denoted $X_L$, according to:

$$S_{ij}(X_L) = T_i(X_L) - T_j(X_L)$$

where $T_i(X_L)$ and $T_j(X_L)$ denote the times at which units i and j reach the level $X_L$.

This determines how far ahead in time the leading unit is. The time distance

in 2009 is 1 year: person X achieved a level of 1,000 in 2008, which means a

1-year time lag (distance) for person Y, who achieved the same level in 2009.

In 2010, the time distance is 1.5 years, and in 2011, it is 2 years (Fig. 1 ).

Considering the situation from this perspective, the gap in time has in fact

increased; person Y needs increasingly more time to catch up with person X.
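
The time distances quoted above can be recomputed from the Table 1 values with a short sketch. It assumes linear interpolation between yearly observations (which is consistent with the value of 1.5 years for 2010) and a strictly increasing series for the leading unit; the helper names `time_reached` and `time_distances` are hypothetical.

```python
years = [2008, 2009, 2010, 2011]
income_x = [1000, 1500, 1750, 2000]  # leading unit (person X)
income_y = [500, 1000, 1250, 1500]   # lagging unit (person Y)


def time_reached(level, times, values):
    """Earliest time at which an increasing series reaches `level`.

    Interpolates linearly between neighbouring observations and returns
    None if the level lies outside the observed range.
    """
    if level == values[0]:
        return float(times[0])
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 < level <= v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None


def time_distances(times, leading, lagging):
    """Time lag of the lagging unit behind the leading unit at each time."""
    result = []
    for t, level in zip(times, lagging):
        t_lead = time_reached(level, times, leading)
        result.append(None if t_lead is None else t - t_lead)
    return result


print(time_distances(years, income_x, income_y))
```

For 2008 the sketch returns `None`, because person X was already above 500 at the first observation; this matches the "(Not available)" entry in Table 1.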

With this simple and realistic, albeit controversial, example, it has been

illustrated that the answer to “How is the difference between the two units

changing?” is not always straightforward. Considering different statistical

measures to cover various dimensions of the compared time series, it is clear

that the results can even contradict one another. All of the three presented

measures are statistically correct, legitimate, and possible but imply different

conclusions, and the case itself is not completely artificial but strongly

resembles relations in reality.

Given its importance, the above dilemma has been addressed by surprisingly

few researchers who are, in addition, from various contextual research fields.

Wallgren and Wallgren (2010 ) emphasize that statistical time-series analysis

differs from other aspects of statistical science because, instead of estimating

quantitative parameters (such as regression coefficients), the aim is often to

acquire a picture of qualitative patterns of the time series under study.

Nevertheless, the authors of the present study maintain that appropriate basic

quantitative measures are also necessary for adequate qualitative

interpretations of trends.

Mueller and Schuessler (1961) define a time series as a succession of

chronologically spaced observations designed to depict growth or decline, or

simply variations in the incidence of the subject observed;

quantitative observations of time series (measures of dynamics) may take the

form of absolute values or relative values. Wallgren and Wallgren (2010)

showed that, even when reporting only the raw values, misunderstandings

might occur. When referring to only two measures (absolute and relative

difference), several authors in various scientific areas highlight the

importance of recognizing that the two measures do not necessarily show the

same direction (e.g., Mueller and Schuessler 1961 ; Amiel and Cowell 1992 ;

Atkinson and Brandolini 2004 ; Svedberg 2004 ; Harper and Lynch 2005 ;

Atkinson and Sicherl 2007 and Sicherl 2011 ; Dolničar 2007 ; Moser et al.

2007 ; James 2009 , 2010 , 2011 ; Citrome 2010 ). For example, Moser et al.

(2007 ) compare the measures for relative inequality (ratio) and absolute

inequality to assess their implications from both static and dynamic

perspectives. They conclude that there is substantial variation between

countries in the size and direction of change in inequality, demonstrating that

the direction of trends can depend upon the way in which inequalities are

measured. In addition, Harper and Lynch (2005 ) show that, in relative terms,

the difference in stomach cancer mortality between genders from 1930 to

2000 has steadily increased, but in terms of absolute difference, there has

been a steep decline in disparity between men and women. If this is not

properly taken into account (given the specific research question and the

characteristics of the research problem), it may lead to skewed answers.

Atkinson and Brandolini (2004 ) go even further and emphasize that there is

no a priori reason to rank the relative over the absolute criterion since they are

both equally acceptable, and that the choice is a value judgment. They argue

that people differ in their views and that their evaluation patterns are more

complicated than the simple relative/absolute dichotomy.
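
A divergence of this kind, with the relative gap widening while the absolute gap narrows, is easy to reproduce with made-up numbers. In the sketch below, the rates are invented purely for illustration and are not taken from Harper and Lynch (2005) or any other cited study; the function name `gap_measures` is likewise hypothetical.

```python
# Hypothetical rates per 100,000 for two groups at two time points.
# These numbers are invented for illustration only.
rates = {
    "earlier": {"group_a": 40.0, "group_b": 25.0},
    "later": {"group_a": 10.0, "group_b": 4.0},
}


def gap_measures(high, low):
    """Return (absolute difference, ratio of the higher to the lower rate)."""
    return high - low, high / low


results = {
    period: gap_measures(r["group_a"], r["group_b"])
    for period, r in rates.items()
}

for period, (absolute, ratio) in results.items():
    print(period, "absolute:", absolute, "ratio:", ratio)
```

With these assumed values, the absolute gap falls from 15 to 6 while the ratio rises from 1.6 to 2.5, so a report based on only one of the two measures would describe the same change in opposite terms.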

Reporting of inequality is particularly problematic when only relative

measures are considered and estimates of absolute inequality are not included.

This attitude is supported by Amiel and Cowell (1992 ), who posed verbal

and numerical questions to groups of students to elicit their views on

inequality. One-third of the respondents preferred the relative approach and

one-sixth preferred the absolute approach, while the remaining fraction

rejected either approach and followed some other logic. Similarly, certain

methodological approaches (e.g., Land et al. 2011 ) also focus mainly on

relative differences. On the other hand, James (2011 ), in the context of

comparing countries (developed vs. developing), clearly advocated the use of

the absolute difference because the analysis must capture a real achievement

rather than simply be an exercise in arithmetic starting from a low base

number, which then serves for relative comparisons. The possibly contradictory

trends that can be deduced from absolute and relative differences alone are

also elaborated by Mueller and Schuessler (1961), who believe that

the choice of measure needs to be appropriately tailored according to the

research problem.

In recent years, a strong focus has been evident in using an alternative method

to compare units in time—the dynamic view, using the time distance measure

(e.g., Sicherl 2004 , 2007 , 2011 ; Vehovar et al. 2006 ; Dolničar 2007 , 2008 ;

Mueller 2011 ). This measure has the potential to provide important additional

information as a supplementary measure to existing simple comparisons. This

has already been acknowledged by the application of this measure by several

research institutions (e.g., Granger and Jeon 1997 ; Empirica 2005 ;

EUROCHAMBRES 2005 ; Eurostat 2010 ; ITU 2010 ; NSCB 2010 ; IHME

(Kulkarni et al. 2011); The Millennium Project (Glenn and Florescu 2011);

OECD 2011 ; SORS 2012 ).

In general, comparative analyses most often inquire into specific contextual

research issues rather than related general methodological questions (e.g.,

Natoli and Zuhair 2011 ; Pasimeni 2011 ; Dale and Neal 2012 ). As a more

specific methodological approach, Mueller (2011 ) recently introduced an

innovative methodology of social clocks for the monitoring of

multidimensional social development, which also includes the time distance

concept.

Nevertheless, despite several attempts, no comprehensive research has been

identified regarding how readers understand and perceive various measures of

dynamics. Consequently, even the most basic and general methodological guidance

for the usage of these three measures is still lacking. On the other hand, we do

have some guidance in the somewhat similar conceptual dilemma regarding

the usage of the measures of central tendency (mean, mode, and median). In

this case, every statistical textbook clearly indicates which measure is

preferred in certain circumstances because they too may provide contradicting

results. This is exactly what is also required for the three basic measures of

dynamics in time. In this situation, however, the context and circumstances

are, of course, much more complex.
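
The analogy with measures of central tendency can be made concrete with a small sketch using Python's standard `statistics` module. The income values are hypothetical and chosen only so that one very high value pulls the mean away from the median and the mode.

```python
import statistics

# Hypothetical monthly incomes (in EUR) with one very high earner.
incomes = [500, 500, 800, 1000, 1200, 9000]

mean_income = statistics.mean(incomes)      # sensitive to the outlier
median_income = statistics.median(incomes)  # middle of the ordered values
mode_income = statistics.mode(incomes)      # most frequent value

print("mean:", round(mean_income, 2))
print("median:", median_income)
print("mode:", mode_income)
```

In this assumed data set the mean is above 2,000, the median is 900, and the mode is 500, so the "typical" income depends heavily on which measure is reported.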

Within this framework, the contribution of this experimental study is twofold:

first, the methodological aspect will contribute to the empirical evidence

regarding how people read and understand measures of changes in time;

second, in a more substantial context, implications and practical suggestions

are presented for a more comprehensive way of presenting basic comparative

analyses to the general public.

4. The Experiment

As mentioned, there is no consensus on guidelines regarding when and how to

use one or another measure of dynamics in time. The most common

recommendation is that we should use the measure that

best answers our specific research question, but there are no guidelines on

how to achieve this. As is evident from the previous review, different authors

prefer different measures. Nevertheless, for some very specific types of time

series, such as diffusion processes (e.g., the digital divide), a more elaborate

framework has already been established. In this case, however, only very

specific (increasing) types of time-series trends are observed (e.g., penetration

of Internet usage) with a set of expected characteristics of the diffusion

process. This is not the case for time series in general (e.g., a

financial indicator, as in the above example), so for the purpose of this

experiment, a rather general case of dynamics in time was selected.

4.1. Research Question and Hypotheses

The general thesis is that manipulating the presentation of data can have

very serious consequences for the reader’s perception. As illustrated

previously (Table 1 and Fig. 1), even in very simple (mathematically almost

trivial) comparative analyses, contradictory conclusions may emerge.

The focus of this study is on the perceptions of the receivers of data (readers

and users): Do their perceptions differ according to the specific measure

used? Are these differences so important that we need to consider them when

building a methodological and interpretational framework? Is there room for

manipulation?

Based on previous research and a review of the literature, an experiment was

conducted to test the following hypotheses:

• H1: The first reaction of users when exposed to raw data in a table is

to think in terms of absolute differences.

• H2: When exposed to specific measures, users accept absolute rather than

relative differences, while time distance has a rather weak impact on their

perceptions.

• H3: To ensure that the relative difference or time distance measure is

considered, additional stimuli are required (i.e., strong, explicit, or even

exclusive exposure to this measure).

• H4: When exposed to relative differences, users follow the corresponding

interpretation instead of defaulting to absolute differences. This

effect is much weaker when they are exposed to time distance.

• H5: A graphical presentation strongly implies and reinforces the

perception of absolute differences.

• H6: The time distance measure is by far the most cognitively demanding,

while absolute differences are the easiest to understand.

The next section explains the research design as a combination of

between-group and repeated-measures designs, which allows all the

hypotheses to be tested. To determine the differences within the experimental

groups, the paired-comparison t-test is used; to test the differences between

the groups (regarding the treatment effect and the cognitive burden), one-way

ANOVA is used with corresponding post hoc tests (Bonferroni and Dunnett

tests). We expect that strong relationships and strong causal links will be

revealed as significant even in a relatively small sample.
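The test statistics named above can be illustrated with a minimal sketch. It assumes answers coded 1 (increasing), 2 (constant), and 3 (decreasing); the function names and the example data are hypothetical and are not taken from the study. Only the statistics are computed here; p-values and the Dunnett test would normally come from a statistical package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Paired-comparison t statistic and its degrees of freedom."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def one_way_anova_f(groups):
    """One-way ANOVA F statistic with between- and within-group df."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(alpha, k):
    """Per-comparison level when all k*(k-1)/2 pairs of groups are tested."""
    return alpha / (k * (k - 1) // 2)


# Hypothetical answers of one group at two consecutive steps (within-group test)
step_a = [2, 2, 3, 2, 1, 3, 2, 2]
step_b = [3, 2, 3, 3, 2, 3, 2, 3]
print(paired_t(step_a, step_b))

# Hypothetical answers of three groups at the same step (between-group test)
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
print(bonferroni_alpha(0.05, 4))
```

With four groups there are six pairwise comparisons, so a Bonferroni correction divides the nominal significance level by six.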

4.2. Experimental Design

The overall aim of the experiment was to study the perceptions and

understandings of a certain measure. Following Creswell (2009), the

experiment was designed to test the effects of an intervention on the outcome

variable. The experimental design is illustrated in Fig. 2. The measures of

outcome were all of the same kind (a judgment of difference) on the same

contextual example. First, all respondents were asked to judge the difference

based on the data in a table (Fig. 2 step A). Second, an experimental

treatment was introduced: the first group was asked to consider the absolute

difference, the second group was asked to consider the relative difference

(ratio), and the third group was asked to consider the time distance measure

(Fig. 2 step B). The control group was not exposed to this manipulation. All

groups were then shown a graphical presentation of the same example (Fig. 2

step C), and the question that followed was based on all measures and the

graphical presentation (Fig. 2 step D). After each step, a simple measure of

cognitive burden was introduced to all groups.
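To make the three treatments concrete, the following sketch computes the absolute difference, the relative difference (ratio), and the time distance for two units. It assumes that time distance is the lag between the moments at which the two units reach the same indicator level, with linear interpolation between observations. The series values and all function names are hypothetical and are not the data used in the experiment.

```python
def time_of_level(series, level):
    """Earliest interpolated time at which an increasing series reaches `level`.

    Returns None when the level lies outside the observed range.
    """
    points = sorted(series.items())
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 <= level <= v1:
            if v1 == v0:
                return float(t0)
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None


def compare(leader, lagger):
    """The three measures for every year observed for the lagging unit."""
    rows = []
    for year in sorted(lagger):
        reached = time_of_level(leader, lagger[year])
        rows.append({
            "year": year,
            "absolute": leader[year] - lagger[year],
            "relative": leader[year] / lagger[year],
            "time_distance": None if reached is None else year - reached,
        })
    return rows


def direction(values, tol=1e-9):
    """Direction of change between the first and the last value."""
    change = values[-1] - values[0]
    if abs(change) <= tol:
        return "constant"
    return "increasing" if change > 0 else "decreasing"


# Hypothetical indicator values: the lagging unit repeats the leader's
# path exactly two years later.
leader = {1998: 10, 1999: 20, 2000: 32, 2001: 46, 2002: 62, 2003: 80}
lagger = {2000: 10, 2001: 20, 2002: 32, 2003: 46}

rows = compare(leader, lagger)
for key in ("absolute", "relative", "time_distance"):
    print(key, direction([row[key] for row in rows]))
```

With these hypothetical numbers the absolute difference grows from 22 to 34, the ratio falls from 3.2 to about 1.74, and the time distance stays at two years, so each measure suggests a different answer to the same question.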

Fig. 2

The experimental design. Note: In the schematic presentation, only variable

labels and consecutive numbering (in brackets in each label) of the indicators

are presented. The survey questionnaire has subsequently been translated into

English and is available for preview at https://www.1ka.si/SURVEY. In the

“Appendix”, a screenshot of one of the steps in the survey is presented (Step

D).

From this point forward, the four experimental groups are referred to by the

following abbreviations: ABS (experimental group 1, absolute difference),

REL (experimental group 2, relative difference), TIME (experimental group

3, time distance), and CONT (control group, no experimental manipulation).

4.3. Sampling

A non-probability sample was used for this experimental study, which is

usually the case with randomized experimental studies where relations and

causality are addressed, particularly in clinical studies and marketing research

but also in social sciences (Schreuder et al. 2001). Thus, the focus here is

primarily on the internal validity, while external validity (i.e., inference to the

population) cannot be formally elaborated. However, extensive empirical

evidence shows that, while the sample may not be representative in socio-

demographic controls, the relationships and causality found are usually very

robust. In social research, this is also implicitly confirmed by the growing

usage of internet/access panels (ESOMAR 2011), which are becoming the

prevailing method for surveying the general population. In addition, our own

experience of more than 15 years with RIS (www.ris.org) research, where

face-to-face and telephone surveys were extensively compared with

(reasonably spread) non-probability web surveys, shows there is almost no

difference in averages and distributions for ordinal and ratio scale variables.

With shares (percentages), certain biases may appear towards the

characteristics of intensive Internet users (Vehovar et al. 1999). Thus, it is

highly likely that relationships found in the experiment also hold true in the

general population, particularly for the differences in means of scale

variables.

The research was conducted in Slovenia, which is, by all information society

indicators, an average European country, with around 70 % of the population

aged 16–74 using the internet weekly (Eurostat 2013).

The sample (n = 146) consisted of predominantly young, educated adults

broadly recruited on the web and via social networks. Sampling started

with a small pool of the authors’ initial contacts and continued in the form

of snowball sampling. By gender, the sample was quite balanced

(44 % male and 56 % female), approximately 80 % of the respondents were

30 years old or younger (the rest were aged between 31 and 50 years), and the

majority were well educated (more than 50 % with a higher education, and

most of the others were students). There were no significant differences in

results regarding gender, age, or education, which further supports the

robustness of the findings, so we did not perform any socio-demographic

weighting. However, weighting was used to remove the

remaining random variation in differences of the initial perceptions (i.e.,

answers to the first question, step A). This way, the same initial distribution

on the first question was effectively assured for all groups. This was achieved

with simple post-stratification weighting, which adjusted the distributions of

the four groups on the first variable of the experiment according to the known

average of all respondents on this question. As the sample was relatively

small, the focus was only on very strong effects, which may become visible

(statistically significant) in a relatively small-scale study.
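The post-stratification step can be sketched as follows. The sketch assumes that each respondent's weight is the pooled share of his or her step A answer divided by the share of that answer within the respondent's own group; the function names and the data are hypothetical, and the exact procedure used in the study may differ.

```python
from collections import Counter, defaultdict


def poststratification_weights(groups, answers):
    """Weights that align each group's step A distribution with the pooled one.

    groups[i] and answers[i] are the group label and the step A answer
    of respondent i.
    """
    n = len(answers)
    pooled = Counter(answers)
    group_size = Counter(groups)
    cell_size = Counter(zip(groups, answers))
    weights = []
    for group, answer in zip(groups, answers):
        target_share = pooled[answer] / n
        observed_share = cell_size[(group, answer)] / group_size[group]
        weights.append(target_share / observed_share)
    return weights


def weighted_shares(groups, answers, weights):
    """Weighted share of each answer within each group."""
    totals = defaultdict(float)
    cells = defaultdict(float)
    for group, answer, weight in zip(groups, answers, weights):
        totals[group] += weight
        cells[(group, answer)] += weight
    return {cell: value / totals[cell[0]] for cell, value in cells.items()}


# Hypothetical step A answers: 1 = increasing, 2 = constant, 3 = decreasing
groups = ["ABS"] * 4 + ["REL"] * 4
answers = [1, 2, 2, 3, 1, 2, 3, 3]

weights = poststratification_weights(groups, answers)
print(weighted_shares(groups, answers, weights))
```

After weighting, both hypothetical groups show the same distribution of step A answers as the pooled sample. The match is exact only when every answer category occurs at least once in every group.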

After respondents started the survey, they were randomly assigned to groups

(using the web application’s random function), so differences in the outcomes

could be attributed (besides random variation) only to the experimental

treatment.
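A minimal sketch of such an assignment is given below. It assumes simple randomization with equal probabilities for the four groups; the random function of the survey application is not documented here, so the function name and the seed are illustrative only.

```python
import random

GROUPS = ("ABS", "REL", "TIME", "CONT")


def assign_group(rng):
    """Assign one respondent to a group with equal probability."""
    return rng.choice(GROUPS)


# The fixed seed only makes this sketch reproducible; it is an assumption.
rng = random.Random(2014)
assignments = [assign_group(rng) for _ in range(146)]
print({group: assignments.count(group) for group in GROUPS})
```

Simple randomization of this kind does not guarantee equal group sizes, so small differences in the number of respondents per group are to be expected.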

In experiments, we either expose different units to different experimental

manipulations (between-group or independent design) or take a single group

of units and expose them to different experimental manipulations at different

points in time (a repeated-measures design). Here, a combination of both was

used as it is possible to compare the four groups with one another (based on

one different experimental treatment), but the differences within each group

can also be compared (based on the presumption that each step is an

experimental treatment, per se).

5. Results

The data for each experimental group are presented in a table of frequencies

across the main experimental factors (Table 2). Further, results of analyses of

the differences between and within the groups are presented. All the tests

where ANOVA was used have been repeated using even more robust

non-parametric tests (Kruskal–Wallis test and Mann–Whitney test). Where

ANOVA showed statistically significant differences, post hoc tests were used

to identify for which pairs of groups the difference was actually significant

(Bonferroni and Dunnett tests). Since one-way ANOVA and the non-parametric

tests led to identical conclusions, we report only the

significant results of one-way ANOVA.

Table 2

Frequencies of the main experimental factors (weighted data)

Group  Step           n_i(b)   %   n_c(b)   %   n_d(b)   %   Total   %    Mean(a)  SD
ABS    A: data (1)       4    10     21    56     13    34     38   100    2.3    0.62
ABS    B: ABS (4)        2     5     23    61     13    34     38   100    2.3    0.56
ABS    C: graph (10)     2     5     27    72      8    23     37   100    2.2    0.50
ABS    D: all (13)       2     5     20    53     16    42     38   100    2.4    0.59
REL    A: data (1)       3    10     19    56     12    34     34   100    2.3    0.62
REL    B: REL (6)        5    14     10    28     20    58     35   100    2.4    0.74
REL    C: graph (10)     2     7     25    74      6    19     33   100    2.1    0.50
REL    D: all (13)       3    10     18    53     13    37     34   100    2.3    0.64
TIME   A: data (1)       4    10     23    56     14    34     41   100    2.3    0.62
TIME   B: TIME (8)       7    17     21    52     13    32     41   100    2.2    0.69
TIME   C: graph (10)     0     0     34    83      7    16     41   100    2.2    0.37
TIME   D: all (13)       4    11     21    54     13    35     38   100    2.2    0.64
CONT   A: data (1)       3    10     20    56     12    34     35   100    2.3    0.62
CONT   B                 (experimental treatment was not introduced to the control group)
CONT   C: graph (10)     1     3     29    84      5    13     35   100    2.1    0.40
CONT   D: all (13)       6    16     16    45     14    39     36   100    2.2    0.72

(a) Mean answer to the question “Is the difference increasing, constant, or decreasing?”, coded 1 (increasing), 2 (constant), and 3 (decreasing)

(b) n_i, n_c, n_d = weighted data for the categories of answers: n_i = increase, n_c = constant, and n_d = decrease

(c) Mean evaluation of the cognitive burden on the scale from 1 to 5, where 1 = very easy and 5 = very difficult
