A CORRELATIONAL STUDY OF EXPLICIT AND IMPLICIT PERSONALITY TESTS

Running head: A CORRELATIONAL STUDY OF EXPLICIT AND IMPLICIT PERSONALITY TESTS

An Investigation Into the Correlations Between Explicit Self-Report

Personality Inventories and Implicit Personality Inventories, With Focus on the

Myers-Briggs Type Indicator (MBTI), Murphy-Meisgeier Type Indicator For Children

(MMTIC) and the Implicit Association Test (IAT)

as Partial Fulfillment of Requirements for

the Master of Arts Degree in General Psychology

Terrence A. Marselle

University of Hartford

January 25, 2014

Acknowledgements

This thesis, clearly one of the signature accomplishments of my life, would not have been

possible without the encouragement, tutelage, professionalism, and patience of University of Hartford

academicians as well as friends and acquaintances who individually and collectively are just plain

brilliant. These people are as follows:

Dr. Mala Matacin, whose uplifting personality and soaring encouragement inspired me in

more ways than she will ever know.

Dr. Caryn Christensen, who decided to take a chance on someone who had not been on a

college campus in over 35 years. Thank you, Caryn; I told you I would work hard!

Dr. Robert Duran, who was the first faculty member on campus to say yes to my request

to be a member of my faculty thesis committee – and who chose to remain in an active

role throughout, despite circumstances beyond his control.

Dr. Jack Powell, a clear-headed, even-keeled, and down-to-earth master thinker, whose

cultivation of the things I did right, as well as his silky way of letting me know where I

had gone wrong, was priceless and appreciated.

Dr. Olga Clark Sharp, my primary thesis advisor, who is one of the most patient,

hard-working, organized, and thorough educators I have ever met. She is living testimony to

the fact that one can teach an old dog (me) new tricks.

Dr. Robert McPeek, Director of Research at the Center for the Application of Type

(CAPT) in Gainesville, Florida. Throughout it all, Bob gave me clear-headed guidance as

well as unwanted, but needed, criticism.

Dr. Harry Bennett of Littleton, New Hampshire. In this Master’s quest, without Harry to

help me not only stay on the road but also navigate my way around the potholes,

this thesis would not exist.

Dedications

To Elizabeth “Liz” Marselle. I hereby dedicate this thesis to my mother, Elizabeth “Liz”

Marselle, who was on this earth for 97 years before she recently passed. Mom, wherever

you are, you know that I love you for a thousand reasons. Thank you not only for giving

me life, but more importantly, thank you for giving me a life in which your blood is in

my veins. I hope you are as proud of me as I am to be your son. Here is hoping I see you

again someday. I love you.

To BoBo, my mother’s cat, whom I inherited. Yes, reader, you read that correctly. I am

dedicating this thesis, in part, to (now) my cat, BoBo. Through it all, BoBo kept me

company, many times, by draping himself end-to-end atop my computer desk as I plowed

forward with this paper. Not knowing that I had a lot of work ahead, sometimes he would

just plop himself on the keyboard. In retrospect, however, maybe he really did know, and

he was just trying to get me to relax a bit. Hey, it worked! No matter how exhausted

and/or frustrated I was at times, BoBo was simply “there” to keep me company.

Thanks, Bo. Here is hoping we have many more years together.

Terrence (Terry) Marselle

January 25, 2014

Table of Contents

Chapter I: Introduction

Background of self-report techniques

Establishing the Validity of Personality Inventories

The Myers-Briggs Type Indicator (MBTI)

A Critique of the Myers-Briggs Type Indicator (MBTI)

Murphy-Meisgeier Type Indicator For Children (MMTIC)

A Critique of the Murphy-Meisgeier Type Indicator For Children (MMTIC)

Sources of Inaccuracy and Bias in the Self-Report Method

Social Desirability responses: an overview

Socially Desirable responding – research review

Taking a position on the legitimacy of the social desirability response construct

The Implicit Association Test

Critique of the Implicit Association Test

IAT and Personality Assessment

Implicit cognition and its relationship to the IAT

The neurobiology of implicit thinking and the pivotal role of emotion

Hypotheses

Chapter II: Method

Participants

Procedure

Information sessions

Confidentiality

Integrity of the test-taking environment

Order effect and counterbalancing measures

Attrition as part of the procedure

Dry-run of a generic (non-personality-type) implicit (IAT) test online and in school

Ethical considerations

Informed consent

Deception

Debriefing

Apparatus

Chapter III: Results

Data gathering and analysis

Chapter IV: Discussion

Was SDR a factor in the taking of the MBTI and/or the MMTIC?

Discussion on why the IAT / MBTI / MMTIC correlations were mixed

Current status of the IAT

Concluding thoughts

Future direction and research involving the IAT

References

List of Tables

Table 1 - McCrae and Costa’s comparison of the Big Five (FFM and MBTI)

Table 2 - Pearson Product Moment r comparisons of IAT, MBTI and MMTIC

Table 3 – Mean and standard deviation statistics for this study

Table 4 – Correlations between the MBTI and MMTIC

Table 5 - Summary of the Improved Algorithm for IAT Scoring Procedures Recommended by Greenwald et al. (2003)

Table 6 - MBTI-like descriptive words used for Hall High School IAT tests

Table 7 - Stimuli used in the qIAT

Table 8 - Correlations between the explicitly measured extraversion items and the explicit and implicit measures of extraversion on the qIAT

List of Figures

Figure 1 - The brain regions most often reported in studies of race.

Figure 2 - A model for the neural basis of implicit attitudes.

Figure 3 - Example of “wrong answer” screen on IAT test

Figure 4 – Example of 4 categories instead of 2 categories on IAT test

Appendices

Appendix A - Myers-Briggs Type Indicator (MBTI) – Preference Clarity Index - evidence of continuous scoring

Appendix B - Murphy-Meisgeier Type Indicator for Children (MMTIC) – percentage evidence of continuous scoring

Appendix C - Murphy-Meisgeier Type Indicator for Children (MMTIC) – “Strengths and Stretches” section

Appendix D - Script for participant recruiting session by the TAs to each whole class

Appendix E - Script for informational session for participants who volunteered

Appendix F – Instructions on how to go to the publisher’s websites and log-on to take each test

Appendix G - Myers-Briggs Type Indicator (MBTI) questions – 93 total

Appendix H - Murphy-Meisgeier Type Indicator for Children (MMTIC) questions – 54 total

Appendix I – The Implicit Association Test sample questions, adjective word list and actual questions from screen captures

Appendix J - Participant honor pledge to maintain a quality testing environment

Appendix K - Official permission from West Hartford Public Schools to conduct research on participants who are younger than 18 years of age

Appendix L - General Informational / Introductory Letter to Parents: of all 11th and 12th grade behavioral science students at Hall High School – West Hartford CT

Appendix M - Passive Parental or Guardian Consent Form

Appendix N – Informed Consent / Assent of student participants - Participation in a study of the validity and reliability of three different personality tests

Abstract

This correlational study investigated the relationship between the responses of the relatively new

testing instrument, the Implicit Association Test (IAT), and two previously validated explicit

self-report methods of personality assessment, namely the Myers-Briggs Type Indicator (MBTI)

and its children’s counterpart: the Murphy-Meisgeier Type Indicator for Children (MMTIC).

Using the same type of descriptors for all three assessments, it was hypothesized there would be

statistically significant correlations between the IAT responses and the responses on both the

MBTI and MMTIC, establishing evidence of convergent construct validity of the IAT. One

hundred and three eleventh- and twelfth-grade volunteer high school students completed all three

assessments. In total, eight hypotheses were tested. The results were mixed. The researcher

found that the primary reason for these unexpected results rested not with the explicit self-report

MBTI and MMTIC assessments, but with the Implicit Association Test. The researcher found

that social desirability was not a factor with the self-report assessments, while simultaneously

concluding that when the Implicit Association Test attempts to measure topics which are

inherently complex, abstract, dynamic and which may also involve high emotional content –

such as personality characteristics – it falters in several areas. This may especially be true when

adolescents take the IAT.


Chapter I: Introduction

Background of self-report techniques

Sir Francis Galton is credited with being the founder of the empirical study of personality

(Shrout, 1993). In 1884, he coined the term "lexical hypothesis" (Galton, 1884, p. 181). He

posited that if personality characteristics and differences were as stable and long-lasting as they

were thought to be, they could be encoded into language, i.e., into a culture’s vocabulary.

Because these personality characteristics were believed to be biologically rooted, they were

called traits. According to trait theory, these characteristics are present across cultures and

throughout history and remain relatively stable for one’s entire life. The question then arises, if

this is so, is it possible to extrapolate a comprehensive taxonomy of personality traits?

Galton originally called this challenge an “empirical thicket” (Shrout, 1993, p. 770).

Ultimately, the end result of Galton’s theory became factor analysis. Factor analysis is a widely

used quantitative tool for extracting a simpler structure from a dataset. That is, it uses

correlation-based analysis to identify a small set of underlying factors onto which the different

items in an inventory load. As exemplified by McCrae and Costa’s Five Factor

Model (FFM), within contemporary mainstream psychology, factor analysis of constructs has

become one of the accepted tools for measuring personality characteristics (Clark, 2007).

An alternative to the FFM is a personality assessment that is derived using a wholly

different approach: the controversial but popular personality inventory, the Myers-Briggs

Type Indicator (MBTI). In both cases, these personality inventories depend upon

self-report questionnaires.
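As a rough illustration of how correlation-based analysis can point to a small number of underlying factors, the sketch below counts the components of a correlation matrix whose eigenvalue exceeds 1 (the Kaiser rule of thumb). It is a deliberate simplification: it performs a principal-component-style extraction by power iteration, not a full factor analysis with communality estimates or rotation. The four-item correlation matrix and all function names are hypothetical and are not drawn from any inventory discussed here.

```python
import math

def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]

def dominant_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    # An uneven starting vector avoids being orthogonal to an eigenvector.
    vector = normalize([float(i + 1) for i in range(len(matrix))])
    for _ in range(iterations):
        vector = normalize(mat_vec(matrix, vector))
    eigenvalue = sum(v * w for v, w in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector

def deflate(matrix, eigenvalue, vector):
    """Remove an extracted component so the next-largest one can be found."""
    n = len(matrix)
    return [[matrix[i][j] - eigenvalue * vector[i] * vector[j] for j in range(n)]
            for i in range(n)]

def retained_eigenvalues(correlations):
    """Eigenvalues greater than 1, extracted from largest to smallest."""
    remaining = [row[:] for row in correlations]
    retained = []
    for _ in range(len(correlations)):
        eigenvalue, vector = dominant_eigenpair(remaining)
        if eigenvalue <= 1.0:
            break
        retained.append(eigenvalue)
        remaining = deflate(remaining, eigenvalue, vector)
    return retained

# Hypothetical correlations among four items: items 1-2 cluster, items 3-4 cluster.
item_correlations = [
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]

retained = retained_eigenvalues(item_correlations)
print(f"components retained: {len(retained)}")
```

With two clusters of mutually correlated items, two components pass the threshold, which is the sense in which items "fit themselves" to a smaller set of factors.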

The nature of self-report inventories is that information is ‘mined’ directly from the

individual taking the assessment. As Gerald J.S. Wilde, Professor of Psychology at Queen’s


University in Canada, said, “It should be appreciated, however, that the self-report method tacitly

assumes the validity of ‘inventory premise,’ i.e., the assumption that the individual being

assessed can and will accurately describe her or his feelings and thoughts – a premise which

cannot always be supported” (Wilde, 1972, p. 72). The crux of the matter is that in order for a

self-report answer to be accurate, when an individual responds, she or he must: be self-aware

enough to know with confidence how he or she thinks or feels about his or her answer, be willing

to share that answer as part of the assessment-taking process, and not be colored by social

desirability or cultural constraints. In the final analysis, depending upon the degree to which

these cognitive and situational processes are operative during an individual’s response mode,

answers to these self-report assessments might very well be, as an end product, an erroneous

self-report. Researchers could then justifiably ask: who is being fooled? Would it be the

individual who took the assessment and/or the individuals interpreting the assessment? With the

former, the situation would be tantamount to ‘pulling the wool over one’s own eyes’.

Academicians, of course, would call it an invalid self-report. The research is compromised by

these inaccurate responses and / or by failing to detect the true associations.

With the above in mind, the researcher tested the following research question: To what

degree does self-report demonstrate construct validity with regard to the Myers-Briggs Type Indicator

(MBTI) and its children’s counterpart, the Murphy-Meisgeier Type Indicator for Children

(MMTIC)? More specifically, to what extent does the explicit self-reporting by individuals on the

MBTI and MMTIC, i.e., their paper-and-pencil or online answers, correlate with another

MBTI-like personality assessment which is based on implicit self-reporting?


Establishing the Validity of Personality Inventories

Beyond the problem of producing objective data from subjective responses, this paper

now turns to another important consideration: the validity of the questions themselves. Validity

is the ‘holy grail’ of any testing instrument. Personality inventories should measure what they are

designed to measure (Ghiselli, Campbell, & Zedeck, 1981; Murphy & Davidshofer, 2005).

Formulating questions that are primarily rooted in subjective-qualitative thoughts and feelings,

while at the same time giving those questions properties that would allow a transduction to

directly accessible and measurable data, can be psychometrically daunting. With a personality

inventory, this can be especially difficult.

Hypothetically, what if a researcher were trying to measure the validity of a question that

asked extroverted executives to compare themselves with introverts on their degree of

self-awareness? Since self-awareness is generally considered positive in most societies, they might

rate themselves higher than they are rated by others. So, could they say with accuracy that they are

more self-aware? A legitimate question would be how the researcher might validate these

individuals’ explicit self-reports on this facet of personality. Direct behavioral observations,

anecdotal observations of peers, and physiological measures would be either difficult or

near-impossible to obtain.

In the real world, most data produced by self-reports cannot be independently verified.

Even if such efforts were feasible, there would undoubtedly be ethical concerns. Moreover, it

would be insufficient for researchers to only know whether the data were accurate. Before this

self-reported data could be interpreted, they would also have to know the magnitude to which the

data were inaccurate, as well as the probable sources of those inaccuracies. With these methodologies,

this is not a practical option.


Construct validity is the foundational core of behavioral empirical research. Construct

validity descriptors are an integral part of research in the behavioral sciences - and in particular

with self-report personality inventories (Anastasi & Urbina, 1997; Cronbach & Meehl, 1955;

Nunnally & Bernstein, 1994). These are generally estimated when trying to measure and assess

the validity of constructs that are not directly observable. Accordingly, the “construct” in

construct validity has garnered the bulk of attention in both theoretical and applied psychology.

Examples of constructs are personality characteristics such as aggression, happiness and

depression. When the construct validity of self-report personality inventories is estimated, there

are complications when an inherently subjective situation is described in an objective manner.

Efforts toward lessening this problem are represented by a subtype of construct validity:

convergent validity. Convergent validity posits that if results of a self-report inventory show

statistically significant correlations with other self-report instruments on the same topic then not

only is this a credible form of validity on its own, but if continued to be validated over time, such

assessments can be used to make predictions.
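The kind of check that convergent validity rests on can be sketched as a Pearson product-moment correlation between scores from two instruments, together with the t statistic used to test whether the correlation differs from zero. The scores below are invented for illustration only, and the function names are hypothetical; this is a minimal sketch of the general calculation, not the analysis procedure used in this study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length lists with at least 3 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def t_statistic(r, n):
    """t value for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical extraversion scores for eight respondents on two inventories.
inventory_a = [12, 18, 7, 22, 15, 9, 20, 14]
inventory_b = [30, 41, 22, 47, 35, 28, 44, 31]

r = pearson_r(inventory_a, inventory_b)
t = t_statistic(r, len(inventory_a))
print(f"r = {r:.3f}, t({len(inventory_a) - 2}) = {t:.2f}")
```

The t value would then be compared against the critical value for n - 2 degrees of freedom at the chosen significance level.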

Arguably, the current “gold standard” in determining the validity of self-report

inventories is the ability of an assessment to predict behavior, i.e., predictive validity. In addition,

great weight is also given to concurrent validity. Concurrent validity is established when the

results of research involving self-report personality inventories correlate with results that

originated from a different psychological instrument designed to measure the same construct.

This study is focused on concurrent validity.

Jerome Kagan, Harvard’s world-renowned child-development expert, may have said it

best. In 1988, he argued that personality researchers had arrived at the point where they were

relying too heavily on self-report instruments. He maintained they were/are making the


fundamental mistake of actually believing that these ratings of personality are giving us an

accurate picture of what personality really is. Kagan said further, that researchers appear

“indifferent to the possibility that the theoretical meaning of a descriptive term for any quality is

derived from its source of evidence” (Kagan, 1988, p. 614). Twenty years later, Kagan continued

to criticize “the practice of treating personality concepts as unchanging essences that transcend

all assessment contexts” and the “disambiguation of findings from single paradigms and single

response indices” (Kagan, 2008, p. 1619).

In contrast, we are reminded of comments made by the psychometrician Jum Nunnally.

Nunnally wrote, “even though self-[report] inventories definitely have their problems as

approaches to personality characteristics, attitudes, values, and a variety of other non-cognitive

traits, they represent by far, the best approach available” (Nunnally, 1978, p. 141).

As mentioned, the Myers-Briggs Type Indicator (MBTI) and its children’s counterpart,

the Murphy-Meisgeier Type Indicator For Children (MMTIC), are both self-report and

fixed-choice inventories. With this in mind, this construct validity investigation shall focus

primarily on these two instruments plus one very different and relatively newer method of

personality assessment: the Implicit Association Test. These instruments will be discussed in

the following sections.

The Myers-Briggs Type Indicator (MBTI)

Unlike the widely used trait approach, which is based on and measured by the factor-analytic

theory of the Five Factor Model, the Myers-Briggs Type Indicator (MBTI) uses Jungian theory

to assess and describe personality. Carl Jung (1875-1961) was a protégé of Freud, i.e., in the

psychoanalytic tradition. As a foundation of his theory, Jung believed all humans possessed two

opposing mental attitudes, called extroversion and introversion. These attitudes were not only


rooted in the unconscious, but they also manifested themselves according to where in our

environment we focused our conscious attention. He described an attitude as a readiness of “the psyche to

act or react in a certain way” (Jung, 1921/1971, p. 414). In his descriptions, Jung did not

characterize individuals as one-dimensional in their behavior, i.e., as only able to exhibit

introversion or only able to exhibit extroversion. Contrarily, he posited that each individual

possessed the ability to be both introverted and extroverted, stating, “There is no such thing as a

pure extrovert or a pure introvert . . . those are only terms to designate a certain penchant, a

certain tendency” (Jung, 1921/1971, p. 416). Jung went on to say that, within each individual, as

a result of personal preference, one of the attitudes generally prevailed over the other, depending

upon the immediate surroundings. Because each of us has a natural (biological) preference

between these two attitudes, the attitude that is put into use more often becomes

the part of our personality that gains greater skill and provides more comfort than the opposite

attitude. Jung explained these attitudes without judgment. He noted that the qualities of

extroversion and introversion may be expressed in either positive or negative behaviors depending

on the disposition and personality of the individual (Wehr, 1971). Introversion and extroversion

can also be explained along a continuum at the same time that they are dichotomous. That is, within

one’s attitude, Jung said individuals might vary from being slightly introverted to strongly

introverted or from slightly extroverted to strongly extroverted.

Jung defined introverts as being more interested in their inner world of feelings and

thoughts; thus, they might shy away (literally) from interest in other people or objects.

As such, Jung said because they derive their energy from within, introverts were introspective

and preferred small groups, perhaps even solitude. In addition, they are generally hesitant when

presented with new stimuli or thrust into new circumstances. Therefore they are more inclined to


make decisions with caution. Extroverts, Jung said, are oriented toward their external

environment and toward other people. They are generally outgoing, more attuned with the

expectations of others and quickly cultivate attachments. As such, they are likely to be sociable,

open and assertive (Jung, 1921/1971; Wehr, 1971).

Later, upon his recognition that there were different types of introverts and extraverts,

Jung added four further elements to his personality theory. These he called psychological

functions and labeled them feeling, thinking, sensing, and intuition (Jung, 1921/1971). They,

too, were presumed to operate in opposing pairs (Jung, 1927). Like the two opposing mental

attitudes of extroversion and introversion, Jung stated there is a similar dynamic within the four

functions themselves. That is, as they opposed each other, one operated on a conscious level and

was well developed, while the other functioned on an unconscious level and was not well

developed (Jung, 1920/1926). Therefore, by definition, only one of these functions can be

working on a conscious level at any given time.

Jung posited that two of these functions are “sensing” and “intuition” and that they

describe the processes of how our brains prefer to access and process information. In their

pure forms, Jung further stated these two functions are the antithesis of each other. As part of this

dynamic, Jung wrote, “Sensation is just as antagonistic to intuition as thinking is to feeling”

(Jung, 1930/1933, p. 106).

Intuition originates from insight garnered from perceptions that are unconscious and

holistic which are then fused with our past experiences. Intuition can extract meaning from

opaque ideas, sweeping theories, and patterns with scant attention to details. Intuitives process

their feelings and thoughts, not by way of actual sensory experience, but by way of envisioning

how phenomena are associated with each other, i.e., seeing the big picture or playing out a hunch


in one’s mind. Jung noted, “In intuitives a context presents itself whole and complete, without

our being able to explain or discover how this context came into existence” (Jung, 1921/1971, p.

453). In contrast, sensation is generated by concrete experience, facts, and tangible evidence of

one’s immediate world. It is this function that embraces real world (external) stimuli as it passes

through our five senses. As such, sensation is a conscious phenomenon.

“Thinking” and “Feeling” were Jung’s second set of opposing functions. He said the

purpose of this dichotomy was decision-making. Thinkers utilize objectivity, analysis-based

logic, and facts in their decision-making process. Feelers, on the other hand, engage in a more

subjective process in making decisions. Their decisions are rooted in a system of personal values,

e.g., the well-being of others and empathy. Unsurprisingly, this personal value system is also

based in personal experience, though it arises from a sense of mood gleaned from

one’s overall likes and dislikes. From an emotional standpoint, the considerations that shape

these decisions include pleasantness or unpleasantness, or whether something might be exciting or

dull. Said Jung, “Feeling is a kind of judgment, differing from intellectual judgment in that its

aim is not to establish conceptual relations but to set up a subjective criterion of acceptance or

rejection” (Jung, 1921/1971, p. 434).

To restate, according to Jung, of the four functions just described, in any individual, only

one can be working on a conscious level at any given time. That one function is said to be

“dominant.” In other words, it has the greatest amount of psychic and, at the same time,

conscious energy at its command. As an example, if Intuition is the dominant conscious

function, then its opposite, Sensing, will be the least conscious and have the least

psychic energy behind it. In the pecking order of this example, Feeling and Thinking are

sandwiched in between. Moreover, in this case, an individual using her/his weakest function


would be tantamount to trying to write with their non-dominant hand. When asked to explain this

relationship between the four functions, Jung replied in his characteristic Intuitive way, “I

distinguish these functions from one another because they cannot be related or reduced to one

another” (Jung, 1921/1971, p. 437). This dominant / conscious and non-dominant / unconscious

discussion will be revisited later in the investigation.

Eventually, Jung’s theories on personality saw their most visible and practical

manifestation in the Myers-Briggs Type Indicator (MBTI). Katherine Cook Briggs (1875-1968)

and her daughter, Isabel Briggs Myers (1897-1980), were already avid “people-watchers” when

they took notice of Jung’s theories in the early 1920s. Based on their everyday “type watching”

observations of ordinary people, they combined their own anecdotal information with Jung’s

theories. Literally for the rest of both their lives, Briggs and Myers researched Jung’s original

hypotheses regarding personality dynamics and built on his findings with their own ‘value

added’ model. The single most obvious example of this elaboration was their addition of

another two-letter dichotomy to Jung’s original model. After more than two decades of their own

studies on human interaction, Briggs and Myers made official their “J and P” addition. They

described their new J/P dichotomy as an indicator of how individuals orient themselves to the

outer world, i.e., how they structure their lives – or not. “J” stood for Judging, but was not meant

to mean ‘judgmental’. It was meant to describe individuals who lead a self-regimented life, who

regularly plan and enjoy schedules, are organized, and generally seek timely closure on life’s

tasks. On the other hand, “P” (Perceiving) is a descriptor for a personality characteristic of

individuals who adapt easily to their outside world. Ps usually lead curious, spontaneous, and

unplanned lives. Rather than attempting to control things, they are quite adaptive, open to

change, and generally highly tolerant (Myers & Myers, 1980). With this fourth dichotomy (JP),


Briggs and Myers elaborated further on Jung’s original typology to create 16 types of personality

represented by the four letters of each of the four dichotomies (EI, SN, TF, and JP). An example

is ENTJ, i.e., extraverted, intuitive, thinking, judging.
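The arithmetic behind the 16 types is simply one letter chosen from each of the four dichotomies (2 x 2 x 2 x 2). A minimal sketch that enumerates the combinations, with variable names chosen here purely for illustration:

```python
from itertools import product

# One letter is chosen from each of the four dichotomies.
DICHOTOMIES = ("EI", "SN", "TF", "JP")

# Every combination of one letter per dichotomy yields a four-letter type code.
type_codes = ["".join(letters) for letters in product(*DICHOTOMIES)]

print(len(type_codes))
print(" ".join(type_codes))
```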

A milestone was reached in 1942, when after having spent more than a decade

developing test items to “indicate type” from both Jung’s and their own theories, Briggs and

Myers published Form A of the Myers-Briggs Type Indicator® (MBTI). As their main premise,

Briggs and Myers stated that any personality inventory should be used “to describe valuable

differences between normal, healthy people” with a “constructive use of differences” (Myers &

Myers, 1980, p. xiii). This represented a departure from the norm of nearly all psychology

inventories, which was to measure pathology. Briggs and Myers further stated that it was their

intent to help “parents, teachers, students, counselors, clinicians, clergy, and all others who are

concerned with the realization of human potential” (Myers & Myers, 1980, p. xiii). Katherine

Briggs’s partnership with her daughter Isabel continued throughout her lifetime. Together, they

initiated decades of research on the utility of the MBTI.

After an additional 25 years of research by the mother-daughter team, Katherine Cook

Briggs died in 1968. Isabel Myers continued her efforts alone until 1975, when she teamed

up with Dr. Mary McCaulley, a clinical psychologist at the University of Florida, to co-found the

Center for Applications of Psychological Type (CAPT) and continue the research on the MBTI.

CAPT is located in Gainesville, Florida, and holds the original papers of Briggs, Myers, and

McCaulley, as well as over 12,000 bibliography entries for the MBTI. It also offers web-based MBTI and

MMTIC administration and access to the Journal of Psychological Type

(http://www.capt.org/about-capt/home.htm).


Currently, the typology of personality as conceptualized by Jung, Briggs/Myers and

McCaulley is widely recognized. The MBTI enjoys its greatest popularity, however, in the

consumer and business world. Its publisher, Consulting Psychologists Press (CPP), has reported

that as of 2005, there were three million administrations in 32 languages given each year.

Moreover, eighty-nine of the Fortune 100 companies and 80% of Fortune 500 companies were

using the MBTI for a variety of purposes, e.g., team building, conflict resolution, etc. (Paul, 2004,

2005; Suplee, 1991). The current versions of the MBTI are Forms M and Q. Response from

academia, however, has been less than enthusiastic. To a limited degree, it has been utilized by a

smattering of psychologists (e.g., in clinical and industrial/organizational (IO) psychology) as

well as some social workers and counselors. For the most part, however, it has not been

embraced by mainstream psychology for reasons that shall be discussed shortly.

A Critique of the Myers-Briggs Type Indicator (MBTI)

The scoring system used by the MBTI is built around the Item Response Theory (IRT)

and its corresponding software (Myers et al., 1998). The IRT method runs on the premise that

each item /question) is not of equal weight in terms of the information generated. Unlike simpler

methods of scoring, such as Likert scales, where "All items are assumed to be replications of

each other or in other words items are considered to be parallel instruments" (van Alphen et al.,

1994, p. 197), IRT treats the answer to each question as an independent entity. Among the largest

users of the IRT method is the Educational Testing Service which owns the rights to many of the

high stakes tests used in the United States. These include the Scholastic Aptitude Test (SAT),

Graduate Record Examination (GRE), as well as the Medical College Admissions Test (MCAT)

and Law School Admissions Test (LSAT). Using mathematical models designed around MBTI

descriptors, MBTI researchers have spent over 70 years honing the questions and studying the


answers and how these answers pull. Accordingly, each question and how it is answered is

assigned an individual weight in the MBTI’s scoring system.
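The idea that items carry unequal weight can be illustrated with a minimal sketch of a generic two-parameter logistic (2PL) IRT model. This is an assumption made for illustration only: the MBTI's actual model, software, and item parameters are not described here, and the item names and parameter values below are hypothetical.

```python
import math

def p_endorse(theta, a, b):
    """2PL model: probability of choosing the keyed response.

    theta = respondent's position on the latent preference,
    a = item discrimination, b = item location (both hypothetical here).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information a 2PL item contributes at a given theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

# Two hypothetical items located at the midpoint of the scale.
items = {"weak_item": (0.5, 0.0), "strong_item": (2.0, 0.0)}
for name, (a, b) in items.items():
    print(name, item_information(0.0, a, b))
```

Under these assumed parameters the more discriminating item contributes sixteen times the information of the weaker one at the midpoint, which is the sense in which an IRT-scored item is not treated as a replication of every other item.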

The MBTI has good convergent validity. Its publishers have documented strong

correlations between its current version, called Form M, and six other personality assessments.

These other personality assessments are: the CPI 260 ® (California Psychological Inventory),

FIRO® (Fundamental Interpersonal Relations Orientation), Adjective Check List®, Strong

Interest Inventory®, Thomas-Kilmann Conflict Mode Instrument (TKI)®, and the Birkman

Method® assessments (Schaubhut & Thompson, 2009). In addition, the publishers of the MBTI

state that these relationships remained stable over time.

Convergent validity, however, is considered a ‘back door’ method of validating a self-

report inventory – and therefore is a weak type of validity. Rarely accepted on its own merits,

convergent validity carries within its design a potentially fatal flaw. In this case, if the MBTI

Form M is not valid and six other assessments on the same general topics show strong

convergent validity, it could mean these other inventories themselves contain the same innate

weaknesses. So to speak, they could all be drinking from the same ‘poisoned’ water. Given that

convergent validity is one of the two primary kinds of evidence for the MBTI’s validity (the

other evidence is relationships with behavior), this may help explain why the MBTI has been

criticized in academia (Pittenger, 1993). Along these lines, most contemporary personality

researchers think the MBTI lacks empirical support, is too complex and extremely difficult to

measure (Eysenck, 1973; Steele & Kelly, 1976; Stricker & Ross, 1973).

Without question, however, the greatest criticism of the MBTI is its dichotomous scoring

system. That is, instead of being measured on a continuum, an individual is clustered /

categorized toward one of two possible poles. For example, depending on what answer an


individual chooses on the MBTI’s self-report questions, at some point, a person “crosses the

line” and is labeled an extravert or introvert, etc. Typical of this criticism are comments made by

Professor Dan McAdams, the current chair of the Psychology Department at Northwestern

University. McAdams comments, “The MBTI’s whole notion of traits is arbitrary. What if you

were trying to measure height? A test like the MBTI would say that if you’re above 5 feet 9

inches, you’re tall. If you’re below, you’re short. Now, in measuring height, no one would settle

for this. If you look at all the respected academic psychology journals, you’ll hardly ever see a

reference to the MBTI—and if you do, it’s dismissive. It’s just not a credible test” (Wachter,

2005, p. 26).

When faced with this criticism, academics within the MBTI community (as stated in its

manual), say, “The MBTI is different from typical trait approaches to personality that measure

variation along a continuum. Instead, the Indicator seeks to identify a respondent’s status on

either one or the other of two opposite personality categories…both of which are regarded as

neutral in relation to emotional health, intellectual functioning, and psychological adaptation.

The MBTI dichotomies are concerned with basic attitudes and mental functions that enter into

almost every aspect of behavior. Therefore, the scope of practical applications is broad rather

than narrow and includes quite varied aspects of living. The clarity indexes on the MBTI are

interpreted as how ‘clearly a respondent prefers one of two opposite poles of a dichotomy’ rather

than an abundance or lack of the trait” (Myers et al., 1998, p. 5).

In the discussion over the interpretation of continuous vs. dichotomous variables, the

publisher of the MBTI cites the commonality of the dichotomous approach typically used in

psychology, psychiatry as well as in medicine. Here, individuals must cross thresholds to meet

criteria for a diagnosis - or they are not diagnosed. Whether it’s a patient’s lab results, e.g.,


diabetes, or meeting five out of eight criteria for Oppositional Defiant Disorder (ODD), there

must be a critical mass of symptoms in order for a diagnosis to be made. Re-stated, with ODD, it

is five out of eight, not eight out of eight symptoms. In other words, adolescents are not

perceived as having a low, medium or high amount of ODD characteristics. Moreover, when

added criteria from the DSM-IV-TR are factored in for ODD, it is entirely possible that two

adolescents with a diagnosis of ODD may not have a single symptom in common. In short, the

MBTI has never sought to point out any degrees of mental wellness or pathology. Contrarily, all

results are assumed to be neutral representing personality descriptors that are adaptive – while at

the same time, not implying homogeneity within its categories (Myers et al., 1998, pp. 4–5).

A second criticism of the MBTI is concerned with its concept of “whole type”. This is an

important research consideration because the primary purpose of the MBTI is to identify whole

type. Whole type is essentially a unit of the four categories/dichotomies together. In brief, an

example is offered of whole type using only two of the dichotomies at a time. The reader is

asked to keep in mind that all four dichotomies within one’s personality are ‘in play’ at all times.

In the lettering scheme of the MBTI, Js are decision-makers. They have a natural tendency to

bring closure. An FJ (feeling), however, is an emotional decision-maker, whereas a TJ (thinking)

is a logical decision-maker. An NTJ (intuitive) is a logical decision-maker who sees the big

picture and who has a natural tendency to envision and think of things “out-of-the-box.” On the

other hand, STJs (sensing) are logical decision-makers whose decisions are based on the tangible

details of the “facts on the ground,” i.e., what is in their immediate grasp, and so on.

Whole type is exemplified by an INFP or an ESTJ and personified by a (perhaps) INFP-like

person such as the primatologist Jane Goodall, as well as by a (perhaps) ESTJ-like person such as

the TV character Judge Judy. Simply put, the MBTI is not merely trying to indicate


how an individual comes out on each of the four dichotomies. More important is the totality of

what is indicated by the four dichotomies, i.e., greater than the sum of its parts. “Characteristics

or behavior based either on whole types or on combinations of preferences demonstrating type

dynamics should not be explained based on knowledge of the preferences alone. Because the

indicator is based on a complex theory, establishing this kind of validity [of whole type] is more

difficult than acquiring evidence for the validity of the four individual scales” (Myers et al.,

1998, p. 196). Chapter 9 of the MBTI Manual gives a seemingly impressive array of studies

citing the validity of whole type. These studies suggest that whole types show personality

characteristics that are not predictable from the knowledge of individual dichotomies alone

(Myers et al., 1998, pp. 171-219). Of particular note, in McCrae & Costa’s Five Factor Model,

there is no attempt to combine any of the factors into two-letter temperaments or four-letter

personality types as the MBTI does. On the contrary, each trait is considered independent of the others.

Interestingly, the basis for nearly all criticism of whether whole type is a valid construct

largely dissipates when continuous scoring is used. This becomes doubly ironic because this is

how MBTI reliability is routinely scored. It is just that the MBTI is expressed as a dichotomous

structure. In other words, although every version of the MBTI creates continuous scores, on the

test results themselves, they are actually presented as categorical results. To oversimplify, a line

is drawn on a calculated (according to an algorithm) predetermined middle point of these

continuous scores. This line represents a “zero” score which by design, no one ever achieves.

The maximum score is 30 and these continuous scores are expressed as positive numbers. The

minimum score is negative 30. With approximately 23 questions that pull for each of the four

dichotomies, i.e., E/I; S/N; T/F; J/P, everybody with a negative score on a dichotomy gets classified as an E, S,

T, or J, and everybody with a positive score gets the alternative classification (I, N,


F, or P). On the MBTI Form M, the closest these continuous scores come to actually being

reported to the user is the Preference Clarity Index (PCI). Refer to Appendix A where

dichotomous labeling is described, but in the lower portion of the appendix, the continuous

numerical values are referenced.
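The simplified scoring description above can be sketched in a few lines of Python. This follows only the oversimplified account given in the text (a continuous score between -30 and 30 on each dichotomy, cut at a zero point that no one is assigned); the function name and the example scores are hypothetical, and this is not the publisher's algorithm.

```python
# Pole assigned to a negative score, then pole assigned to a positive score.
POLES = {"EI": ("E", "I"), "SN": ("S", "N"), "TF": ("T", "F"), "JP": ("J", "P")}

def whole_type(scores):
    """Collapse four continuous scores into a four-letter type."""
    letters = []
    for scale, (negative_pole, positive_pole) in POLES.items():
        score = scores[scale]
        if score == 0 or abs(score) > 30:
            raise ValueError(f"{scale} score must be nonzero and within -30..30")
        letters.append(negative_pole if score < 0 else positive_pole)
    return "".join(letters)

print(whole_type({"EI": 12, "SN": 3, "TF": 25, "JP": 1}))      # INFP
print(whole_type({"EI": -12, "SN": -3, "TF": -25, "JP": -1}))  # ESTJ
```

In the first example, the J/P score of 1 and the T/F score of 25 both yield a single letter, so the magnitude of the continuous score disappears from the reported type.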

The PCI is a visual expression (usually a bar chart) that indicates how clear an

individual was in expressing her/his preference for a particular pole. Higher absolute numerical

scores suggest an individual is more certain about his/her preference, while lower absolute scores

suggest that individual is less sure about that preference. Rather than being expressed as

numerical data points on a continuum, an individual’s PCI for each dichotomy is expressed in

the verbal terms representing four gradations of preference, i.e., slight, moderate, clear, and very

clear.

Statistically, since participants’ scores are interval data and this is a correlational study,

and related pairs of scores are being measured, the parametric Pearson product-moment correlation (r) is used to

calculate correlations using interval variables. However, there is a substantial loss of power when

truly continuous variables are turned into a dichotomy (Harvey & Murry, 1996). This may be

especially true when one considers that the MBTI has four dichotomies. In a hypothetical

example, if each dichotomy agreed 85% of the time in a test/retest situation, and the four dichotomies

were independent of one another, the agreement rate for all four would be .85 x .85 x .85 x .85 = .52. Therefore, even with strong agreement

for each dichotomy, the probability that all four dichotomies will agree would only be 52%. If,

on the other hand, continuous scores are used, since a researcher is not categorizing individuals

as either A or B, but as a range on a scale, a point or two change does not throw an individual

into a different category. Therefore, it is the nature of the dichotomous scoring itself – and not

necessarily the MBTI and MMTIC inventories themselves – which has the tendency to


undermine the credibility of these instruments. Perhaps this is a primary reason for skepticism of

these two instruments among academics (West & Aiken, 1991).
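The arithmetic in the hypothetical example above can be checked with a short sketch. It assumes, as the multiplication itself does, that agreement on one dichotomy is independent of agreement on the others; the function name is illustrative only.

```python
def whole_type_agreement(per_scale_agreement, n_scales=4):
    """Probability that every dichotomy agrees on retest, assuming independence."""
    return per_scale_agreement ** n_scales

print(round(whole_type_agreement(0.85), 2))  # 0.52

# Per-dichotomy agreement that would be needed for 85% whole-type agreement.
required = 0.85 ** (1 / 4)
print(round(required, 3))
```

Under the same independence assumption, each dichotomy would need to agree roughly 96% of the time for the four-letter type to agree 85% of the time.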

This compelling argument about the loss of statistical power due to the MBTI’s – and by

association – the MMTIC’s scoring system is further exemplified by research within the MBTI

community itself. Most individuals who change preferences, e.g., extroversion/introversion, are

those with scores near the midline, at the preference divide. This makes statistical sense. Again,

in a scoring system that is continuous, no one score should have that much meaning. This

becomes especially true when statisticians factor in the fact that all measurements are inexact and

full of measurement error due to transitory influences (mood, setting), misinterpretation,

falsification, self-deception, and other forms of response bias. But where MBTI and MMTIC

type theorists run into problems is when they say (hypothetically) two people of opposite

preferences but whose continuous scores are very close are more different than people with very

different scores who fall on the same side of the dividing line, e.g., slight introvert and very clear

introvert. Because the scoring system is dichotomous rather than on a continuum, scores which

would normally be in the middle of the continuum inherently take on a greater mathematical

importance (Muller, Judd & Yzerbyt, 2005).
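Why preference changes cluster near the midline can be illustrated with a minimal sketch. It assumes that an observed score equals a true score plus normally distributed measurement error, and that two administrations are independent; the standard error of 5 points is a hypothetical value chosen for illustration, not a figure from the MBTI or MMTIC manuals.

```python
import math

def prob_positive(true_score, sem):
    """Probability that an observed score falls on the positive side of zero."""
    z = true_score / sem
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def flip_probability(true_score, sem):
    """Probability that two independent administrations land on opposite sides."""
    p = prob_positive(true_score, sem)
    return 2.0 * p * (1.0 - p)

for true_score in (1, 5, 20):
    print(true_score, round(flip_probability(true_score, sem=5.0), 3))
```

Under these assumptions, a respondent whose true score sits one point from the midline changes category on retest almost half the time, while a respondent twenty points away almost never does, even though both are subject to the same amount of measurement error.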

Again, MBTI theorists unapologetically address this criticism by returning to their

perspective of “critical mass has been reached; therefore a decision has to be made.”

Hypothetical questions which exemplify this perspective might resemble the following: At what

point does an individual cease to be an extreme introvert, but instead, become slightly timid? The

theoretical bases of these hypothetical questions are supported. In the ten years following the

publication of the 1985 manual, Harvey (1996) performed meta-analytic studies, using

continuous scores of the MBTI. He reported the MBTI to be quite good with overall reliabilities


of .84 and .86. These compare quite well with reliabilities of “even the most well-established

and respected trait-based instruments. Respondents with strong preference clarity are classified

the same across the four scales 92% of the time on retesting; those of medium preference clarity

are classified identically 81% of the time” (Harvey & Murry, 1996, p. 24).

Jung posited that part of an individual’s personality is conscious and part is unconscious at any

given moment and/or situation in time. In the vernacular of MBTI theory, this is known as “type

dynamics” and its accompanying “hierarchy of functions.” More specifically, Type theory talks

of these functions in terms of “conscious energy,” asserting that the most dominant function is

the most conscious and the most inferior function is the least conscious - also known as

unconscious. Briggs-Myers likened an individual’s dominant personality function to a “captain

of a ship with undisputed authority to set its course and bring it safely to the desired port”

(Myers, 1980, pp. 9-10). She also analogized one’s dominant psychic function to his or her

dominant hand, i.e., left or right. “It’s that just we do things” (Myers, 1980, p. 10), meaning that

we, as individuals, have a propensity to think in certain directions. To invoke a term used in

contemporary conversation, one’s dominant function is that part of his or her personality where

one is in his or her comfort zone. It is where they can function at their best level and do their best

work. Myers offered an example of an individual who, as part of his or her personality, was a

dominant “N”, i.e., an intuitive person. A person who is a dominant N will find that intuition

permeates most of his or her thinking. Other functions such as S (sensing…the propensity to

notice details) “will naturally give the right of way to other thoughts arising from other functions.”

“They will enjoy, use and trust [intuition] the most. Their lives will be so shaped so as to give

maximum freedom for the pursuit of intuitive goals” (Myers, 1980, p. 10). If an individual is an

N, that individual’s brain’s first response is to naturally see the big picture, i.e., meanings and


associations and connections of how phenomena go together – at the expense of seeing or

considering the details of a situation. Writers and inventors are examples.

On the other hand, if an individual was a dominant sensor, the most comfortable part of

that person’s personality would be one in which his or her mind effortlessly gathered and

sequenced the details and facts of their immediate environment. These individuals would be

grounded in the here and now – at the expense of seeing the big picture. Actuaries and pilots

come to mind.

According to Type theory, the least comfortable part of an individual’s personality lies in

the unconscious and is called the “inferior function.” One’s inferior function is the opposite of

their dominant function. That is, it is the most uncomfortable part of one’s personality and it lies

in the unconscious – where they would like it to stay. For example, if a dominant N were

suddenly called upon to memorize all the bones in the body, or if a dominant S was surprised

with the task of creating an abstract painting, these situations would be deeply troublesome, hence

the name “inferior function.”

“Somewhere in between this energy,” according to Briggs-Myers, “are the auxiliary and

tertiary functions” (Myers et al., 1998, p. 23). They are beyond the parameters of this

investigation. One explanation might be to say that one’s auxiliary function is that zone of

personality where an individual feels the second most comfortable, while the tertiary function

represents that part of one’s personality where an individual is the “next-to-least” comfortable.

That is, it is not as bad as being in the zone of one’s inferior function, but one just does not like

being there. Perhaps an example might be a T (thinker) who was thrust into a situation where he or she

was expected to show emotion at someone’s birthday party. They would feel awkward, but get

through it.


Within MBTI theory, type dynamics and the hierarchy of type are probably the most

controversial elements of all. This part of the theory has little supporting evidence, and what

evidence exists is mixed. Efforts to establish clarity for dominant vs. auxiliary functions have

failed (Myers & McCaulley, 1985), while recent publications by Reynierse (2009) present

evidence more consistent with a two-fold rather than a four-fold order. That is, a dominant-

auxiliary as distinguished from dominant-auxiliary-tertiary-inferior.

Murphy-Meisgeier Type Indicator For Children (MMTIC)

Most early personality theory was based on the observed behaviors of adults, although

some theorists did address manifestations in early childhood. In Jung’s writings he says, “The

differentiation of type begins often very early, so early that in certain cases one must speak of it

as being innate” (Jung, 1928/1945, p. 303). He further stated that infants’ adaptation to their

surrounding environments was probably an early indicator of extroversion. In his description of

introversion in children, he made note of their “shyness, thoughtful reflection before acting, and

their fearfulness of unknown objects as key indicators” (Jung, 1928/1945, p. 307).

Because the MBTI was designed to be used for ages 14 and over (Myers, McCaulley, Quenk, &

Hammer, 1998), a parallel inventory for children was created by Charles Meisgeier, the chair of

the Educational Psychology Department at the University of Houston, and Elizabeth Murphy, a

psychologist (Meisgeier & Murphy, 1987). Earlier in her educational career, as a classroom

teacher and as a graduate student, Murphy became interested in applying the MBTI to children.

This interest led to her dissertation at the University of Houston, which investigated these

possibilities.

Meisgeier’s interest in psychological type for children was rooted in his career

advocating for special education services and the need to better understand children’s learning


abilities and styles. Eventually, in 1985, the two coauthored the MMTIC. The current version of

the MMTIC represents a newly revised version (MMTIC-R, 2008) and is also based on Jungian-

Briggs-Myers theory. As with the MBTI, the MMTIC clusters for exactly the same dichotomous

constructs, i.e., E/I; S/N; T/F; J/P. The only difference is the number of questions and their

wording. There is only one version of the MMTIC but the reporting of the results is age

appropriate for three levels: elementary, middle and high school students. The MMTIC is

designed to be taken by children from grades two to twelve, i.e., ages 7 to 18.

Regarding the earlier discussion of the MBTI and its relationship to continuous scores,

with the MMTIC, the closest these continuous scores come to being reported as a continuum is in

the Response Consistency scores (one for each dichotomy). They are expressed as

percentages. For example, if a child scored a 92% on her extroversion preference, it means that

for every question that was designed to assess extroversion / introversion, this particular child

answered toward the extroversion pole 92% of the time (Murphy & Meisgeier, 2008). Refer to

Appendix B, where sample scoring results show response consistencies. Again, the theory of

psychological type as interpreted by Myers, and Jung before her, asserts that this

typology/dichotomy is real - that sometime, either due to genetics or environment or both, there

is a split in the cognitive and developmental road. Which path an individual goes down creates a

fundamental and profoundly different route. Moreover, an individual can also go down the road

they did not choose, just not as often or as comfortably.

Like all personality inventories arising from Jungian-Briggs-Myers theory, the MMTIC is

modeled on the same avoidance of negative labeling. Emphasis is placed on enhancing the

understanding of oneself and others. Results that could potentially be framed as negative

personality descriptors are addressed as “Stretches” in its Strengths & Stretches section


(Appendix C, p. 141). In addition, the MMTIC manual provides a strong emphasis on the

application of Type in education, e.g., learning styles, etc. (Murphy & Meisgeier, 2008, pp. 43-

52).

With the 2008 revision of the MMTIC, it joins other children’s personality assessments

that have been available over many decades. All are self-report. A short list and age range is

offered in Archer and Krishnamurthy (2001):

Children’s Personality Questionnaire – 8 to 12 (Porter & Cattell, 1968)

Revised Junior Eysenck Personality Questionnaire – 7 to 17 (Eysenck & Eysenck, 1975a)

Early School Personality Questionnaire – 6 to 8 (Cattell & Coan, 1976)

Adolescent Personality Questionnaire – 12 to 18 (Cattell et al., 1984/2001)

Five Factor Personality Inventory – Children – 9 to 18 (Costa & McCrae, 1992)

MMPI-2-A Minnesota Multiphasic Personality Inventory Adolescent Version

A Critique of the Murphy-Meisgeier Type Indicator For Children (MMTIC)

Since its last revision in 2008, the MMTIC scoring system has been based on a

sophisticated subset of the IRT, called latent class analysis (LCA). A latent variable is a

variable which cannot be directly measured. Like the Item Response Theory used by the MBTI,

the scores produced by the MMTIC’s latent class analysis reflect the empirically derived

importance of each item. LCA goes a step further than the IRT in the sense that its empirical goal

is to identify unobservable subgroups within a population. In part, the results of an LCA test are

determined by how well items on a scale “hang together”. That is, hang together according to

whether that scale of measurement is appropriate for measuring a particular underlying construct.

For example, with extroversion, an algorithm is developed for obtaining the maximum likelihood

a measurement is within the model parameters and characterizations of being an extrovert. And

if so, how strongly is this item / question connected to the underlying construct of extroversion?


The vast majority of research on personality has been conducted on adults and/or the

usual pool of psychology research participants, college students. The few instruments that have

been used on children, as just cited, have been or are based on factor analysis and/or screen for

pathology. None of these tests is based on the exploration of personality type in the Jungian-

Briggs-Myers sense. Research that involves developmental changes in personality beckons. Is

high school a more difficult time than other time-frames in a child’s life? Is personality affected

by maturation? As an empirical example, looking at the MMTIC manual, it is interesting to note

that the highest internal consistency of MMTIC items seems to occur in grade 9 (see Table 1)

(Murphy & Meisgeier, 2008, p. 32).

Based on those particular ninth graders, perhaps this is a fluke. Or maybe ninth grade is a

period of relative stability after the initial changes of puberty but before more changes take place

in the intense social atmosphere of a typical high school. From a slightly different angle, perhaps

lower validities for type preferences in younger children might confirm predictions of Jung’s

earlier mentioned comments.

Looking at the MBTI (not MMTIC) manual, “Psychological type is presumed to be

innate and enduring – children are born with a predisposition to prefer some functions to others”

(Myers et al., 1998, p. 27), and “type theory assumes that type does not change over the life span”

(Myers et al., 1998, p. 28). The MMTIC manual makes reference to this enduring nature of type

preferences and suggests that if there is convergent validity between the MBTI and MMTIC

instruments, the distribution of the 16 whole personality types should be similar. That is, whole

types, e.g., INFP, that are least and most common should be similar across population samples /

norms. The manual then goes on to say, “These results, replicated in this second independent

sample, indicate that the distribution of types in adults, as measured by the MBTI instrument, is


significantly related to the distribution in children, as measured by the MMTIC” (Murphy &

Meisgeier, 2008, pp. 40-41).

Within the MMTIC manual itself, the evidence for validity of this instrument is

impressive. A long and rich body of research supports the construct validity of the Jung-Myers

conceptualization of psychological type. The MMTIC manual does not summarize the evidence

for such validity of the four dichotomous domains (E-I, S-N, T-F, and J-P) of type, as that has

been well documented elsewhere (e.g., Myers et al., 1998). Instead, “the current focus is on the

evidence that the MMTIC produces results consistent with the structure of these four domains”

(Murphy & Meisgeier, 2008, p. 35). Moreover, “The combined reliability and validity evidence

suggests that the MMTIC is an instrument able to accurately identify normal personality

preferences in children grades two through twelve. The confidence scores generated for each

child provide a means for replacing the previous undecided results while recognizing the

emergent status of type preference” (Murphy & Meisgeier, 2008, p. 41).

From these manual entries and their accompanying evidence, several questions arise.

First, the MMTIC manual makes numerous references to the MBTI, but at no point does it

appear to give any indication of concurrent validity with the MBTI, i.e., with the MBTI and the

MMTIC being administered together in the same study involving the same circumstances. The

researcher was able to find an annotated bibliography from an unpublished doctoral dissertation

which investigated concurrent validity between the MMTIC and the MBTI with 217 middle

school students (7th and 8th graders). Comparing the two instruments, significant correlations

were found between them, as well as between the two groups tested: gifted-talented and control

(Lang, 2000). Although encouraging, this particular study is not relevant to the current

investigation for two reasons. First, it involved the previous version of the MMTIC which was


retired in 2008. Second, with the minimum reading level being age 14, the MBTI was not

designed to be administered to middle school children. A second study two years earlier took

place over a two-year period. In a test-retest situation, two administrations of the MBTI and two

administrations of the MMTIC were given to high school students. The results showed a high

percentage of agreement between MMTIC and MBTI dichotomies (Gilbert, 1998). Although

this, too, is encouraging, the situation was similar in that it occurred ten years before the

introduction of the newest version of the MMTIC in 2008.

A second and major point of concern is an unsuccessful search for results of studies

involving children’s self-report personality inventories that originate from a different theoretical

perspective, i.e., congruent validity. Despite an extensive literature search as well as close scrutiny

of the MMTIC manual itself, the researcher found nothing. If true, this would appear to be a gap

in information that is conspicuous in its absence. For these two reasons alone, there is a dearth of

information about the validity of the MMTIC. In this vein, this Master’s Thesis represents

pioneering work.

Sources of Inaccuracy and Bias in the Self-Report Method

Self-report questionnaires are the most often used tools in the behavioral sciences for

gathering information. They are, however, inherently problematic. This is true for a number

of reasons. Among them is the distortion created when participants’ answers to

questions that are qualitative in nature are transformed into quantitative data. For example, in

measuring the constructs of extraversion or introversion, in many inventories, participants are

offered only a fixed choice response (e.g., A/B or yes/no answers). With a dichotomous response

format, respondents are forced into choosing between answers which, although appearing to be

neutral and positive, may be pulling toward opposite poles. Participants are not allowed more


desirable and perhaps more accurate responses. Lost in this fixed choice format are some

potentially important insights into the dynamics underlying each response choice. Are both

alternatives strongly attractive with one slightly more appealing, or are they nearly equally

unattractive with one less so? Are both close to neutral, leading to a mild endorsement? A

different method of eliciting responses is the Likert scale, in which participants are asked to rate how

strongly they agree or disagree with a statement; this method can bring its own set of problems.

Likert scales offer opportunities to express more detail than a fixed choice answer. For myriad

reasons, with Likert response sets, there is a temptation for people to gravitate toward harsh sets,

leniency sets or middle sets (Carifio & Perla, 2007; Jamieson, 2004). Whether the process is

conscious or unconscious, respondents may not want to appear extreme in their responses.

There may be various reasons why respondents give inaccurate answers on self-report

personality inventories. Among them is the possible inability for introspection. There seems to

be little question that despite all efforts to be sincere and accurate in one’s responses, there are

individuals who may be unable to do so. This is exacerbated by the difficulty of measuring one’s

ability to introspect and to self-appraise accurately. Human nature is such that

people often view themselves in a different light than how others see them. When these

situations are factored in with different life situations and experiences, it becomes a legitimate

investigative exercise to pursue a study that would shed light on the validity of the Myers-Briggs

Type Indicator (MBTI) and the Murphy-Meisgeier Type Indicator for Children (MMTIC).

A second problem with self-report inventories may be dishonesty or faking, especially

when it is coupled with one’s conscious efforts toward impression management. Topics

contained within questions on these personality inventories might determine one’s responses.

Intentional misrepresentation may be at work as participants answer in such a way as to portray


themselves in a good light, i.e., impression management. Even if respondents are self-aware,

they may simply be too embarrassed to come to grips with situations posed by certain questions.

Because they do not want to test out as having an opinion that is atypical – either in its direction

or in its degree – respondents’ answers may be skewed toward social desirability. Social

desirability is the tendency for respondents to answer questions according to what they perceive

as being socially acceptable. This may occur purposefully to present oneself in a particularly

positive manner or inadvertently if the individual perceives there is a correct response. Choosing

socially desirable answers is also known as “desirability response sets,” i.e., “SD sets” and is

related to the need for approval (Pauls & Crost, 2004).

Finally, there may be cases in which individuals are deliberately trying to control

their responses (i.e., “faking good”, or lying) (Eysenck & Eysenck, 1963). This is evidenced by

numerous studies that have concluded that both the MBTI and the Five Factor Model can be

distorted somewhat (Furnham, 1990; Harvey & Murry, 1994; Snell, 1994). Whatever the causes,

self-report bias can significantly compromise the validity of all instruments that utilize this tool

(Grum & Collani, 2007). This, of course, poses serious problems in conducting academic

research. When trying to interpret average (nomothetic) tendencies as well as differences on an

individual (idiographic) level, there may be blind spots. Perhaps even more concerning is the

usage of the self-report method in real world situations such as employee selection, usage in an

educational environment, or perhaps as a clinical diagnostic tool. For all of these reasons, it is

important that the process by which personality inventories might possibly yield inaccurate

results via socially desirable responding be discussed in detail. This would be appropriate since

the creator of the Implicit Association Test has stated that the primary reason why the IAT was

developed was to reduce the effect of social desirability responses that are explicitly prejudiced


(Greenwald et al., 1998). This discussion will come later in the paper. First, however, as focal

points of this thesis, both the self-report MBTI and the MMTIC themselves must be given

comprehensive treatment.

Social Desirability Responses – An Overview

Socially desirable responses (SDR) are typically defined as “the tendency to give positive

descriptions of oneself that match with society’s current standards and norms” (Paulhus, 1984, p.

50). With SDR almost certainly rooted in cultural and societal norms, substantial questions

naturally arise, the answers to which could possibly affect conclusions of this investigation.

Integrally related to such inquiries is the primary reason why the Implicit Association Test was

developed. According to its creator, it was to reduce the effect of social desirability of self-report

responses which are explicitly prejudiced (Greenwald et al., 1998).

From its onset, socially desirable responding has been both complex and controversial,

having to struggle from originally being considered a psychological artifact to its acceptance as a

psychological construct. For decades, agreement was elusive regarding its definition, causes, and

pervasiveness. In addition, this has led to considerations on how its various layers and

permutations might lead to biasing consequences, as well as the extent of those consequences.

Fifty years on, as widely disparate opinions among researchers on this topic rule the day,

there is no shortage of questions. For example, among the issues still debated are whether SDR is

gender based and/or age based; whether manifestations of SDR originate in the implicit or

explicit parts of our psyche; and to what extent the sensitivity of self-report questions “activate”

different parts of the brain. Especially within the discipline of psychological testing, when not

controlled for, SDR represents a direct threat to the validity of any findings. That is, SDR can

manifest itself in the form of a systematic error via its corruption of test items and subsequent


contamination of respondents’ answers. Simply put, socially desirable responding can take the

form of a quintessential confounding variable (Patel, 2006; Rossi et al., 1983; Tyson, 1992).

Restated, exploring the dynamics between socially desirable responding and the Implicit

Association Test could not be more important to the results and discussion of this paper.

Socially Desirable Responding: Research Review

One of the most accepted models of socially desirable responding has been a formulation

by Delroy Paulhus and various colleagues. Starting with “minimalist constructs” (p. 50) in 1984,

which had a “plethora of operationalizations” (p. 50), the next two decades witnessed extensive

research within social psychology. As a result, in 2002, Paulhus’s model underwent a major

revision which incorporated most of this new research (Paulhus, 1984; Paulhus & John, 2002).

His final iteration of social desirability responding was a complex two tier system, most of which

is beyond the parameters of this investigation. For the reader’s purpose, however, the evolution

of Paulhus’s model can be separated into two components: unconscious and conscious. Currently,

this most recent Paulhus premise enjoys general acceptance (Morgeson, 2007; Paulhus, 1984;

1988; 1989; Paulhus & Reid, 1991; Zerbe & Paulhus, 1987).

The unconscious part of Paulhus’s model is called “self-deception”. It represents the

motivation of individuals to see themselves favorably without themselves being aware of it.

Therefore, because it is not deliberate and there is no overt attempt to distort reality, it is by

default, an honest cognitive process, i.e., individuals actually believe their thoughts to be true.

According to Paulhus, self-deception is split into two further subdivisions of unconsciousness:

self-enhancement and self-denial. Paulhus says further that self-enhancement is a way of

exaggerating one’s intellectual prowess, creativity, social status, or emotional responsiveness in

the direction of the norms of society. Self-denial is one’s tendency to unconsciously deny


societally deviant impulses in favor of more positive “saint-like” attributes. Because both of

these subdivisions dwell in the unconscious, they process cognition in such a way that they

make no attempt to distort reality (Paulhus & John, 1998). Therefore, from this point onward,

this paper will only address that part of Paulhus’s SDR model which is conscious. Paulhus calls

this impression management (IM). Impression management (IM) is also known as fakery.

Understandably, this conscious side of Paulhus’s model represents the opposite pole of

what has just been mentioned. That is, the entire idea of IM is to consciously create a socially

acceptable and/or favorable impression of oneself so that others will notice (Paulhus, 2002;

Paulhus & John, 1998; Pauls & Stemmler, 2003). As MacCann, Ziegler, and Roberts (2011) said,

“It thus seems that most experts view faking as a deliberate act, distinguishing this from other

forms of response distortion that may not be conscious and intentional” (p. 311).

As a ground-level example toward an understanding of SDR, the following is offered. In a study

published in 1989, Rychtarik et al. examined self-reported marital satisfaction among 143 male

alcoholics and their wives. Using two well-regarded assessment instruments in this area, the

researchers used factor analysis to conclude that the alcoholics rated the success of their

marriages significantly higher than did their wives. Rychtarik and his colleagues concluded that

the self-reported satisfaction within their marriages by the alcoholics had been highly

contaminated by social desirability responding (Rychtarik et al., 1989).

When Paulhus introduced his updated model of SDR in 2002, he was careful to remind his

peers that social psychologists must always be alert that socially desirable responding can easily

compromise construct validity – particularly when researching behaviors and attitudes that are

considered to be socially deviant or unapproved - and especially when investigating personality

characteristics. At the time, Paulhus said, “I argue that no SDR measure should be used without


sufficient evidence that high or low SDR scores indicate a departure from reality” (p. 50).

“Researchers making allegations about response bias must do the work and this task requires the

collection of credible measures of personality to be parsed from self-reports” (p. 60).

A few years earlier, Ones, Viswesvaran, and Reiss (1996) offered the strongest support yet

for the validity of self-report personality measures, i.e., resisting the influences of SDR. The

authors were quite clear in their conclusions that as long as testing was in a non-experimental

setting, generally speaking, faking does not occur very often. Further, they labeled the role of

socially desirable responding a “red herring,” stating that SDR has no effect on either the

criterion-related (predictive) validity or the construct validity of personality measures, and thus is

not as big a problem as past literature reviews would have us believe (Ones, Viswesvaran, &

Reiss, 1996). Although the Ones et al. study was aimed primarily at the investigation of

personality assessments in conjunction with their relationship with job performance, their

research is relevant to this paper because it involved the use of McCrae and Costa’s Big Five

factors of personality as a basis of comparison.

This is an important point of reference because the Big Five and the MBTI have been

shown in a number of studies to significantly overlap in terms of what they are attempting to

measure – each from its very different perspective (Johnson & Sanders, 1990; McCrae & Costa,

1989a; McCrae & John, 1992). That is, the Big Five is based in the factor analysis of traits

whereas the MBTI is based in Jungian theory of personality and its descriptors (McCrae &

Costa, 1989a). In their own research, McCrae and Costa have suggested strong congruent

validity between the two inventories, citing that “each of the four [MBTI] indices showed

impressive evidence of convergence with four of the five major dimensions of normal


personality” (McCrae & Costa, 1989a, pp. 32-33). It is noted that McCrae and Costa’s fifth factor

is neuroticism, which the MBTI makes no attempt to measure. See Table 1.

Using Rosenthal and Rubin’s (1982) “A simple, general purpose display of magnitude of

experimental effect” as a guide, the following correlational “hit rates” are offered. A correlation of

.00 gives a 50%, or 50/50, chance of being accurate. A correlation of .40 gives a 70% chance of

being accurate and is usually described as a moderate correlation. This is about as high as

can be expected, because in any behavioral study the independent factors are often numerous,

and there is a nearly automatic inverse relationship between the number of factors and the size

of each correlation. That is, as the number of behavioral factors increases, the correlation

attributable to any one of them almost certainly decreases. As such, accuracy rates much higher

than 70% are unlikely. On the rare occasion that correlations are .70 or higher, McCrae and Costa

(as previously mentioned) have themselves noted that we are talking about equivalence

(Rosenthal & Rubin, 1982; McCrae & Costa, 1989a).

As cited from Table 1, the correlation between the Big Five and the MBTI on

Extraversion/Introversion was .70; Big Five Openness and its MBTI S/N equivalent was also

.70; Big Five Agreeableness and its MBTI equivalent T/F was .45; and Big Five

Conscientiousness and its MBTI equivalent J/P was .47.
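
As an illustration only, the “hit rates” described above can be reproduced with the binomial effect size display, which converts a correlation r into a success rate of .50 + r/2 (Rosenthal & Rubin, 1982). The short Python sketch below is not part of the original analysis. The function name besd_success_rate and the dictionary TABLE_1_CORRELATIONS are hypothetical labels chosen for this example, and the four correlations are simply those quoted in the text from Table 1.

```python
def besd_success_rate(r: float) -> float:
    """Binomial effect size display: success rate implied by a correlation r.

    Implements the conversion .50 + r/2 (Rosenthal & Rubin, 1982).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return 0.5 + r / 2.0


# Correlations quoted in the text from Table 1 (Big Five factor vs. MBTI index).
# The dictionary name and keys are illustrative labels, not the thesis's own.
TABLE_1_CORRELATIONS = {
    "Extraversion vs. E/I": 0.70,
    "Openness vs. S/N": 0.70,
    "Agreeableness vs. T/F": 0.45,
    "Conscientiousness vs. J/P": 0.47,
}

if __name__ == "__main__":
    # Benchmarks mentioned in the text: r = .00 and r = .40.
    for r in (0.00, 0.40):
        print(f"r = {r:.2f} -> hit rate {besd_success_rate(r):.1%}")
    # The same conversion applied to the Table 1 correlations.
    for label, r in TABLE_1_CORRELATIONS.items():
        print(f"{label}: r = {r:.2f} -> hit rate {besd_success_rate(r):.1%}")
```

Under this display, a correlation of .70 corresponds to a hit rate of 85%, while the T/F and J/P correlations of .45 and .47 correspond to hit rates of roughly 72.5% and 73.5%.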

Searching the American Psychological Association's PsycLIT database for the years 1974 to 1995,

Ones et al. (1996) reviewed over 700 studies on social desirability and construct validity. Drawing on

scales that were designed to detect socially desirable responding, this search yielded data on

409,496 individuals and 1,460 correlations (Ones et al., 1996). As such, the researchers arrived at

the conclusion that, “Removing the effects of social desirability from the Big Five dimensions of

personality leaves the criterion-related (predictive) validity of personality constructs for


predicting job performance intact” (Ones et al., 1996, p. 663). Interestingly, as stated by the

researchers themselves, the most important finding of this meta-analysis was that individuals’

social desirability scores were a direct reflection of their personality variables on the Big Five.

For example, there was a high correlation between social desirability scores and two dimensions

of the Big Five: emotional stability and conscientiousness. This was not surprising given that the

ability to respond in a socially desirable manner would seem to be associated with these two

personality traits. The authors, however, were quite clear in their statements that SDR did not

translate into playing a role in predicting on-the-job behavior and/or performance.

The 1990s witnessed considerable additional research which found little evidence that SDR

was negatively impacting either construct or predictive (criterion) validity of personality

assessments (Barrick & Mount, 1996; Christiansen, Goffin, Johnson, & Rothstein, 1994; Hough

et al., 1990; Ones, Viswesvaran, & Schmidt, 1993). Typical commentaries among these studies

were statements such as "Social desirability may not be the problem it has often assumed to be"

(Hough et al., 1990, p. 592).

Contrarily, other researchers have argued that social desirability responding does indeed

impact the validity of personality assessments in a negative way – and that previous research has

seriously underestimated its actual effects (Holden, 2008; Morgeson et al., 2007; Rosse, Stecher,

Miller, & Levin, 1998). In 1999, while presenting a symposium paper, Rosse et al. (1998) took

Ones et al. to task, concluding that, “Higher levels of faking occurs among those who rise to the

top of the applicant pool”…and…“Faking can lead employers to select applicants who will

demonstrate more negative behaviors after being hired. Faking then, is not a red herring for

employers making hiring decisions, but a great white shark” (Rosse et al., 1998, p. 5). Paulhus

himself returned to the fray addressing the original criticisms by the Ones et al. study, stating, “I


argue that the attention given to SDR research cannot be dismissed as a red herring, but

represents a process of construct validation that has now accumulated to the point where a

coherent integration is possible” (Paulhus, 2002, p. 50).

Since then, the issue of socially desirable responding has remained as contentious as ever, as

there exists a substantial amount of research to support both schools of thinking. So heavily

researched had the topic become that Dr. Kevin Murphy, then editor of the Journal of Applied

Psychology, called for the cessation of faking research. In 1999, he announced that the JOAP would

no longer accept “faking papers” effective December 31 of that year (Griffith, 2006). With

JOAP being only one of several peer-reviewed journals in which to publish, Murphy’s

pronouncement seems not to have slowed the tempo of this academic tumult. This is

exemplified by one chapter in the 2011 book, New Perspectives in Personality Assessment -

Putting the Horse Back in Front of the Cart, by Eric Heggestad. In this chapter, Heggestad

declared that social desirability responding is not a psychological construct at all. “Faking is thus

a deliberate set of behaviors motivated by a desire to present a deceptive impression to the world.

Like most other behavior, faking is caused by an interaction between person and situation

characteristics” (Heggestad, 2011, p. 87).

Taking a Position on the Legitimacy of the Social Desirability Response Construct

In light of Heggestad’s comments and with regard to the disagreement as to whether or not

conscious level SDR and its subset, impression management (IM), are “real” psychological

constructs, the author of this investigation takes the position that they are indeed real

psychological constructs.

Perhaps Griffith and McDaniel (2006) as well as Smith (2004) framed it most clearly

when they suggested that, from an evolutionary point of view, deception is adaptive. To gain a


competitive advantage is to survive. What camouflage was to a Neanderthal, padding

one’s resume, getting hired, getting promoted, or not getting fired is today. In short, SDR and IM are

merely socially constructed extensions of the Neanderthal’s camouflage (Griffith & McDaniel,

2006, p. 5; Smith, 2004, p. 42). John and Hogan (2006) further stated, “deception is a function of

natural selection and an inherent part of life” (p. 209). Other researchers have further argued that

within our routine social interactions, deception is so common, most people lie every day and

that most people believe that in a competitive situation, not faking will leave them at a

disadvantage (DePaulo, Kashy, Kirkendol, Wyer, & Epstein, 1996). With this position taken,

this investigation now moves to a more specific discussion of what the literature review shows

on this subject.

In general, the discussion over SDR has morphed into three basic questions: 1) is fakery

possible? 2) how prevalent is it? and 3) what are its effects and does it matter? The short

answers to these questions appear to be: 1) yes; 2) it depends on what is being measured and the

methodologies being used in the process; and 3) the results are mixed, i.e., either the effects are

negligible, or there is a substantial compromising of both construct and predictive validity. These

questions will now be addressed in greater depth.

Before examining this research, however, the reader is reminded again that the impression

management (IM) subset of socially desirable responding is almost always studied and discussed

within the parameters of Industrial/Organizational (I/O) Psychology - and within that framework,

it is most often conjoined with personality assessments. In short, it is not surprising that when

researching these two focuses, the tendency has been to concentrate on these specific points

within the employment world. That is, rather than “stand-alone” personality constructs of the


MBTI and MMTIC such as extraversion/introversion; thinking/feeling, etc., a substantial

majority of the research has involved SDR/IM together within the context of I/O Psychology.

Returning to the “yes” answer to question one, i.e., whether faking is possible, this is the

only area where it can be stated that researchers overwhelmingly agree. As far back as 1946,

Meehl and Hathaway stated that researchers hardly bothered to probe this question (Meehl &

Hathaway, 1946). Since then, researchers have demonstrated that the literature is substantial

(Zickar & Robie, 1999).

In reviewing faking literature, this researcher found the most widely used research

technique is simultaneously the most unrealistic, and therefore, the most subject to criticism.

Referred to as “fake good,” this technique induces participants in a lab to overtly present themselves in a

socially desirable, i.e., favorable, manner. In psychological research, it is presumed that this

instructional set reveals the upper limits of response distortion, i.e., a “worst-case” scenario of

sorts. The literature is replete with examples that when asked to do so, participants can routinely

and fundamentally change how they score on personality assessments (Becker & Colquitt, 1992;

Hough & Paullin, 1994; Ryan & Sackett, 1987; Rynes, 1993; Stanush, 1997; Viswesvaran &

Ones, 1999).

While literature reviews provide clear evidence to conclude that participants distort their

responses if instructed to do so, this methodology is considered to be deeply flawed for a number

of reasons. One of these reasons is how this information is gathered. Instead of measuring what

respondents actually do in real-life situations, and thereby applying the research to the general

population, this type of data is considered to be substantially overestimated because these

experiments are set up to deliberately push the statistical envelope (Snell & McDaniel, 1998). In

addition, most of the participants in these contrived experiments are student samples (e.g., Hogan


et al., 1996; Hough, 1997; Hough & Schneider, 1996). The only way in which results generated

in this manner could be generalizable to the population as a whole, would be if psychologists

assumed that all participants taking these assessments automatically engaged in faking their

responses.

This last statement brings this investigation toward answering question number two. In real

life situations, how prevalent is the impression management form of socially desirable

responding? Within Industrial/Organizational Psychology, while taking personality assessments,

the extent to which respondents consciously engage in impression management is widely

disputed. Unsurprisingly, a major part of what fuels this long-standing debate is the choice of

techniques used to generate the data.

Within the job application process, in order to ferret out fakery in responding, the preferred

choice is the utilization of lie scales. Social Desirability (SD) lie scales are nothing more than

self-report inventories with the special purpose of determining whether an individual is

attempting to portray him or herself in a positive light (Rosse et al., 1998). Researchers using this

method, however, sometimes develop a “blind spot” in the sense that respondents might actually

fake-out the fake scale. According to Moorman and Podsakoff (1992), because of the nature of

SD items, these items are more likely to be spotted by test-takers and therefore be ever more

vulnerable to faking. In addition, research has also indicated that scales that measure SD are not

very effective when trying to seek out fakers (Griffith et al., 2005; Snell et al., 1998; Snell et al.,

1999). With particular relevance to this paper, the reader is again reminded that the research of

Viswesvaran and Ones (1999) drew clear conclusions that SD scales were the most vulnerable to

faking when using Big Five personality traits as comparison points. Other researchers have also

suggested that SD scales are especially ineffective when applied to measures of personality


(Christiansen et al., 1994; Ellingson et al., 1999; Morgeson et al., 2007; Ones et al., 1996). This

will be addressed shortly.

The second research method used to decipher whether or not a participant is faking on an

assessment is to compare her or his scores to those of individuals in other groups. This approach is

especially applicable within I/O Psychology, where there are naturally occurring differentiated samples.

For example, with applicant / incumbent scores, most studies show that scores are commonly

higher for applicants than incumbents (Barrick & Mount, 1996; Schmit & Ryan, 1993; Stokes et

al., 1993; Wheeler et al., 1996). Simply put, incumbents already have the job and applicants do

not. The thought is that applicants will do whatever it takes to achieve their goal of getting the

job, while there is less pressure for those who already have the job to engage in fakery. Similar

conclusions in this area of research are well grounded. That is, they support the statement that

fakery is omnipresent in the world of self-report testing.

This is what Paulhus warned against, however. He said the strongest statement that can be

made here is that there is an inference – not a conclusion - that some conscious impression

management is likely going on. Paulhus further stated that in order to draw a conclusion from

these analyses, a stronger set of controls would be necessary within the experimental designs. In

this case, if it were possible, a within-subjects design, where the same participant took both the

applicant’s test as well as the incumbent’s test would be necessary. Failing that, researchers are

left to infer that applicant faking has taken place. Using inferences, of course, is not what

respected research is made of (Paulhus, 2011, p. 153).

To illustrate this needed researcher wariness, a study by Hough et al. (1990) is offered.

This is one of the most often-cited studies with regards to the idea that job applicant faking does

not occur on any meaningful level on entry-type assessments. Hough and her colleagues sought


to investigate the extent to which army recruits faked their responses when told beforehand that

scores on their personality assessment would affect their future military careers. When compared

to other groups who had no incentive to alter their responses, the experimental group actually

scored lower on many of Hough’s SD predictor scales. This led Hough et al. to conclude that

applicants, on average, did not engage in overt faking (Hough et al., 1990). Three years later,

however, a re-analysis of Hough’s study suggested that upwards of 29% of participants in this

study were in fact faking (Rynes, 1993). In the re-analysis, it was determined that participants in

this study had already enlisted in the military and therefore were not a true applicant sample. In

addition, the social desirability scale used in Hough’s study has since been determined to be

ineffective (Griffith et al., 2005; Snell et al., 1999).

Within the employment world, as a conservative estimate, there is wide agreement that a

minimum of 30% of job applicants fake it when personality assessments are conjoined with the

job application process (Converse, Peterson, & Griffith, 2009; Ellingson, 2011; Griffith,

Chmielowski, & Yoshita, 2007; Griffith & Converse, 2011; Peterson, Griffith, & Converse,

2009). Within this percentage, approximately 20% of self-report respondents were classified as

extreme fakers and 80% were categorized as slight fakers (Robie, Brown & Beaty, 2007; Zickar

et al., 2004). When interviewed afterwards, most of the fakers indicated that their thought

process fell along the lines of giving a value-added answer to what they would have answered

under normal circumstances, i.e., nudging their answer further along in the “correct” direction.

Another common theme among the fakers was that neither the slight nor the extreme fakers

faked on all questions asked.
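
As a rough illustration of what these percentages imply, the sketch below applies them to a hypothetical pool of applicants. It assumes that the 20% and 80% figures are shares of the fakers themselves (one reading of “within this percentage”) and treats the 30% figure, which the text describes as a minimum, as if it were exact. Because the two sets of figures come from different studies, the result should be read as a back-of-the-envelope combination rather than a reported finding. The function name faker_breakdown and the pool size of 1,000 are illustrative choices only.

```python
from fractions import Fraction

# Figures quoted in the text; exact fractions avoid floating-point rounding.
FAKING_RATE = Fraction(30, 100)    # minimum share of applicants who fake
EXTREME_SHARE = Fraction(20, 100)  # share of fakers classified as extreme
SLIGHT_SHARE = Fraction(80, 100)   # share of fakers classified as slight


def faker_breakdown(n_applicants: int) -> dict:
    """Split a hypothetical applicant pool into non-fakers, slight and extreme fakers.

    Counts are truncated to whole people when the pool size does not divide evenly.
    """
    fakers = FAKING_RATE * n_applicants
    return {
        "fakers": int(fakers),
        "extreme": int(fakers * EXTREME_SHARE),
        "slight": int(fakers * SLIGHT_SHARE),
        "non_fakers": n_applicants - int(fakers),
    }


if __name__ == "__main__":
    # A hypothetical pool of 1,000 applicants.
    for group, count in faker_breakdown(1000).items():
        print(f"{group}: {count}")
```

On this reading, a pool of 1,000 applicants would contain at least 300 fakers, of whom about 60 (6% of all applicants) would be extreme fakers and about 240 (24% of all applicants) slight fakers.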

With regards to this investigation, to answer the question concerning the prevalence of

faking on self-report personality assessments, it is perhaps better to return to Question 1 and


reframe it. Instead of asking: ‘do people fake?’ a likely better question is: when do people fake?

Until very recently, there has been a dearth of attention given to this topic. In MacCann’s

previously mentioned book, Ellingson (2011) pens a chapter entitled: “People fake only when

they need to fake.” The author pulls together various studies which enable her to make an

excellent case that most studies on this topic are guilty of grossly omitting situational

circumstances and/or individual personality differences of test-takers. Basically, Ellingson states

that individuals will engage in faking when, on a personal level, they believe and decide that it is

necessary to gain something which is of value - and which, given the perceived risks, they

believe they can successfully accomplish.

Ellingson reminds us yet again, that the vast majority of social desirability response studies

have been based in the I/O Psychology domain of employment literature, e.g., within the context

of external hiring or internal promotion team-building. The author chastises research in general

for not addressing this phenomenon and its possible role in being a contributing factor in

skewing of results in situations where taking self-report assessments involves either high-stakes

(getting hired) or job desirability values. Ellingson reserves her harshest criticism for some of the

more well-known studies such as Ones and Viswesvaran (1996). She further states that no matter

the situation or personality of participants, faking behavior can be accomplished.

Ziegler et al. (2008) go even further, stating that “spurious measurement error” (SME)

(Schmidt et al., 2003) has been generally neglected. SME, which is the failure to consider

interactions between individuals and their situations, has been an ongoing error in the study of

faking (Heggestad, George, & Reeve, 2006; Ziegler & Buehner, 2009; Ziegler, Toomela, &

Buehner, 2009). Added to the discussion is the work of MacCann et al. (2011), who

concluded that these points have a wide potential to mask patterns of faking behavior. They point


out that the vast majority of SDR and IM theories should be concerned, not with the faking

behavior itself, but with the antecedents of faking behavior (MacCann et al., 2011). Moreover,

these antecedents should be so named, according to what is being measured. For example, there

is motivated / non-motivated socially desirable responding and as stated earlier, whether an

individual is a job applicant or a person who is an incumbent, and whether a job itself has high

desirability or low desirability. Yet another consideration is whether the results of the testing are

tied to a high-stakes or a low-stakes situation. In short, the literature review strongly suggests that

the more motivated an individual is to obtain an end result that depends on the outcome of a

high-stakes assessment, e.g., getting hired or getting promoted to a highly desirable job, the more

likely that individual is to engage in socially desirable responding and impression management. In

one study, researchers even concluded that job desirability easily trumped social desirability in

faking during the applicant process (Kluger & Colella, 1993).

This investigation now turns its attention towards answering question 3: what are the

effects of SDR and does it matter? More specific and more relevant to the topic of this paper, is

the question: what effect does faking have on construct validity and criterion (predictive)

validity? Again, the results are mixed and accordingly, this researcher has concluded there is an

impasse on this topic. Depending on which study one reads, the effects are either negligible, or

there is a substantial compromising of both construct and predictive validity (Ellingson, Smith,

& Sackett, 2001; Ziegler & Bühner, 2009; Ziegler, Danay, Schölmerich, & Bühner, 2010). For

example, in their previously cited meta-analysis, Viswesvaran and Ones (1999) found that when

it came to the ability to fake responses, participants’ scores, on average, improved nearly one-

half of one standard deviation. Other studies have reached stronger conclusions – with findings

of improvements of one full standard deviation being routinely reported (Birkeland, Manson,


Kisamore, Brannick, & Smith, 2006; Jackson, Wroblewski, & Ashton, 2000; Ziegler, Schmidt-

Atzert, Bühner, & Krumm, 2007). In another study (Holden, 2007, Study 3), university students

were asked to fake good on a self-rating test of extraversion. Their roommates were then asked

to rate them under standard (non-faking) instructions. When the results were tabulated, the

correlation between participants’ self-ratings and their roommates’ ratings of them dropped

from .54 in the straight-take condition to .11 in the fake-good condition (Holden, 2007).
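
To give a sense of scale for these figures, the following sketch translates them into more familiar terms. It rests on assumptions that are not made in the studies cited: honest scores are taken to be normally distributed, faking is treated as a uniform upward shift, and the respondent is assumed to start at the median. The function names percentile_after_shift and shared_variance are hypothetical labels chosen for this example.

```python
from statistics import NormalDist


def percentile_after_shift(shift_in_sd: float) -> float:
    """Proportion of honest scorers falling below a median scorer whose score
    has been raised by `shift_in_sd` standard deviations.

    Assumes honest scores follow a normal distribution (an assumption of this
    sketch, not a claim made in the studies cited).
    """
    return NormalDist().cdf(shift_in_sd)


def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, i.e., r squared."""
    return r * r


if __name__ == "__main__":
    # Score gains reported in the text: one-half and one full standard deviation.
    for shift in (0.5, 1.0):
        print(f"gain of {shift} SD -> percentile {percentile_after_shift(shift):.1%}")
    # Self-roommate correlations reported by Holden (2007, Study 3).
    for label, r in (("straight-take", 0.54), ("fake-good", 0.11)):
        print(f"{label}: r = {r:.2f} -> shared variance {shared_variance(r):.1%}")
```

Under these assumptions, a gain of one-half of a standard deviation would move a median scorer to roughly the 69th percentile of honest scorers, and a gain of one full standard deviation to roughly the 84th. Likewise, the drop from .54 to .11 corresponds to a fall in shared variance from about 29% to about 1%.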

Conversely, there are many studies that have investigated SDR and have found “little

evidence of deleterious effects of the response bias on predictive validity” (Hough, Eaton,

Dunnette, Kamp, & McCloy, 1990, p. 471). Ones et al. (1996) concluded that SDR acted

as neither a predictor, a moderator, nor a suppressor in job-related studies. Moreover, McGrath et al.

(2010) arrived at the same conclusion, stating that despite 100 years of research on the usage of

socially desirable responding, “a sufficient justification for [response bias indicators] use…in

applied settings remains elusive” (p. 450).

In 2012, a study by Paunonen et al. seemed to take a middle ground. On the one hand,

Paunonen’s research team concluded,

Our study generally supported previous findings that have reported relatively minor

decrements in criterion prediction, even with personality scores that were massively

infused with desirability bias. Following normal respondent-sampling procedures, only

under the most extreme and unusual levels of distorted self-reports will the observed

criterion validity of a personality measure be dramatically affected by SDR. Moreover,

SDR will normally fail to show itself statistically as either a moderator of test validity or

a suppressor of test validity (p. 1).

Paunonen and his colleagues went on to say,


Unlike some researchers, however, we do not automatically conclude that desirability

bias is therefore, not an issue for personality inventories or other measures of typical

performance (e.g., Ones et al., 1996). We reiterate that response distortion due to SDR

can profoundly compromise the construct validity of the assessment because the obtained

scores for some of our respondents on the simulated measure departed substantially from

their true scores (Paunonen et al., 2012, p. 1).

Relative to this study, perhaps the most compelling piece of research on the correlations

between the Implicit Association Test and explicit self-report measures was published in 2005 by

Hofmann et al. Based on 126 studies, Hofmann and his colleagues concluded,

We did not find any evidence that correlations were influenced by the degree of social

desirability or introspection associated with the topic. Specifically, one could suspect that

correlations should be lower when strong social desirability concerns are triggered by the

research topic under investigation. Moreover, correlations may be higher for topics that

are associated with a high level of introspection. These assumptions were not confirmed

in the present meta-analysis (Hofmann et al., 2005, p. 1380).

The Implicit Association Test

As mentioned, the primary purpose of self-report inventories is to measure an

individual’s explicit answers. Moreover, whether measuring personality characteristics, attitudes,

biases, self-esteem, etc., this technique appears to be limited (Greenwald, McGhee & Schwartz,

1998). At minimum, being oblivious to one’s own personality characteristics or attitudes may

compromise the validity of answers. At maximum, socially desirable responding can, in theory,

throw off the accuracy of a participant’s answers. As such, the Implicit Association Test (IAT)

attempts to circumvent these problems.


In academic terms, De Houwer and Moors (2009a) said that an

implicit measure can be defined as the outcome of a measurement procedure that is

caused by the to-be-measured psychological attribute (e.g., an attitude or stereotype) [or

in MBTI descriptors, extraversion and introversion] by means of automatic processes.

Based on our definition of implicit measures, we proposed that measures can differ with

regard to internal properties (i.e., the properties of the attributes and processes underlying

the measure) and external properties (i.e., the properties of the measurement procedure)

(De Houwer & Moors, 2009a, pp. 32-33).

In other terms, the Implicit Association Test (IAT) attempts to measure, albeit indirectly,

implicit (unconscious) cognition. Specifically, the IAT measures the strengths of automatic

associations between concepts. It does this by measuring the amount of time it takes an

individual to respond to categorization choices that require different responses (typically

operationalized as pushing two different computer keys) compared to categorization decisions

that require the same response (e.g., pressing the same key). As opposed to the self-report, which

asks participants for a direct (conscious and explicit) statement regarding their beliefs and

attitudes, the IAT probes for an indirect (implicit) measure via a comparison of these response

speeds. The total time it takes to complete the test is approximately 10 to 15 minutes. Its original

version was published by social psychologist Professor Anthony Greenwald of the University of

Washington. When Greenwald first piloted this test, he used positive words like “peace” or

“happy” and hypothesized that participants would be more likely to pair up these words with

pictures of flowers. Likewise, he predicted that when presented with negative words like “ugly”

or “rotten” – individuals would associate them with photos of insects. They did.


Later, Greenwald saw the value of the IAT in testing implicit attitudes regarding ethnicity

and race. Greenwald’s hypothesis was that when participants had time constraints, i.e., had to

rapidly classify images and words (for example, White vs. Black faces with pleasant vs.

unpleasant words) they would react at a notably faster speed if they were subconsciously

harboring a biased attitude. An example is White faces paired up with pleasant words. That is, it

would be “easier” (and faster) to respond because there was no cognitive ‘heavy-lifting’ involved

if the person had a positive bias toward White faces. Because the IAT measures association

strength via a shorter response time, in this case, the definition of ‘heavy-lifting’ would be an

explicit (conscious) override of an attitude or belief that one held in his or her subconscious.

At the end of the experiment, if a participant was measured as having a faster response

time when pairing up White faces with pleasant words than Black faces with pleasant words, one

would be designated as having a bias in favor of White over Black. Fourteen years later, the

premise for Greenwald’s hypothesis has continued to be upheld – almost without exception. In

short, strong associations and their faster reaction times won over weak associations and their

slower reaction times (Nosek et al., 2007).

Since the earliest versions of his IAT, Greenwald has teamed up with former student

Mahzarin Banaji, now of Harvard, and Brian Nosek of the University of Virginia. Currently, the

three of them work closely in on-going research involving the IAT – as well as being co-authors

of the newest versions of this tool. An integral part of the team’s research is their

Harvard-based Project Implicit website – an online virtual laboratory. In the 15 years since it was

initiated, there have been approximately 12 million administrations of the IAT given to

visitors of this site. Most probably, the primary reason for the IAT’s rapid dissemination and

accompanying credibility was the controversy that arose from the ‘implicit truth’ generated with


regards to race and ethnicity. Briefly stated, until the IAT, well-respected scientific surveys had

generally found that in America, on the topics of race and ethnic bias – as presented on explicit-

based self-report surveys – European American Whites favored other European American Whites

(as opposed to people with darker skin) by approximately 15%. Accordingly, since then, the IAT

has become a center of focus, consistently demonstrating this original percentage to be

substantially under-represented. Instead, its on-going findings hover around 70% (Cunningham

& Banaji, 2004).

In 2004, Banaji undertook a study to find when explicit and implicit race attitudes first

form, and subsequently, at what point ‘above the surface’ (explicit) attitudes split from implicit

attitudes to become more conscious. In IAT tests that were specially designed to be child-

friendly, it was discovered that as young as six, White American children from New England

and Japanese children both explicitly and implicitly showed preference for people like

themselves. By ten years of age, however, their conscious and unconscious attitudes started to

diverge. As they grew older, most professed a conscious attitude of egalitarianism, but implicitly

continued to show a bias for their own group. Most noteworthy, as both groups of children got

older, they both showed implicit bias against black faces. In particular, with the Japanese

participants, attitudes toward white European faces – as compared to black - became even more

positive over time (Banaji & Baron, 2004).

According to Banaji, as time went on, she, Greenwald, and Nosek realized what a

powerful role unconscious attitudes play in ordinary decision-making and "we knew the

right thing was to take this to the public" (Lehrman, 2006, p. 1). Since then, critical reviews have

been mostly positive. For example, in their study of various methodologies and theories that

attempt to assess implicit self-esteem, Bosson, Swann, and Pennebaker (2000) commented that


“no other procedure is as psychometrically strong as the [implicit measures of the] IAT” (p.

642).

Currently, Project Implicit has 14 different IAT tests on its website. They are as follows:

Asian / European-American IAT; Disability IAT; Arab-Muslim IAT; Weapons IAT which looks

for who (black or white) may be most likely to be carrying a weapon in American society; Race

IAT; Gender-Career IAT; Sexuality IAT; Weight IAT which looks for bias for or against thin

people; Religion IAT; Gender-Science-Liberal Arts IAT; Native American / European-American

IAT; Skin-tone IAT which looks for a bias for those with lighter or darker skin; Presidents IAT

which looks for bias for or against President Obama; and finally, the Age IAT which looks for

bias for youth vs. old age.

The IAT is scored by subtracting the response time scores on the incongruent trial stages

from the scores on the congruent trial stage. By “stages”, it is meant the four dichotomies. That

is, Extroversion / Introversion (E/I) scores; Sensing / Intuition (S/N) scores; Thinking / Feeling

(T/F) scores; and Judging / Perceiving (J/P) scores. For example, an incongruent example from the

Extroversion / Introversion dichotomy would be “gregarious” and “timid”. A congruent example

from the same dichotomy would be “talkative” and “outgoing”. The basic logic is that an

association between “talkative” and “outgoing” will be much more strongly encoded than an

association between “gregarious” and “timid”, so the reaction times to the “talkative” and

“outgoing” pairing will be shorter than the reaction times to the “gregarious” and “timid” pairing. The

greater one’s score on the IAT for “talkative” and “outgoing”, the more strongly it is assumed

that one holds “congruent” evaluations of these words. More specifically, scores on the IAT are

calculated based on latency responses (delayed responses) on four-word stimuli (called double

discriminants) and not the two word stimuli. It is important to note that included in every double-


discriminant screen are the two words “self” and “other.” The two word stimuli are inserted only

for the purposes of keeping the participants primed in the sorting routine, but participants are not

told this (See Appendix I, The Implicit Association Test sample questions).
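The subtraction logic described above can be illustrated with a minimal sketch. It is not the improved scoring algorithm of Greenwald et al. (2003), nor necessarily the procedure used for the instrument in this study. The function name, the latencies, and the sign convention (incongruent minus congruent, chosen so that a larger score indicates a stronger congruent association) are all assumptions made for illustration only.

```python
from statistics import mean


def iat_latency_difference(congruent_ms, incongruent_ms):
    """Return mean incongruent latency minus mean congruent latency (ms).

    Hypothetical helper: a positive value means the congruent pairings
    were sorted faster, i.e., the congruent association is stronger.
    """
    if not congruent_ms or not incongruent_ms:
        raise ValueError("both trial lists must be non-empty")
    return mean(incongruent_ms) - mean(congruent_ms)


# Made-up latencies (in milliseconds) for one participant on one dichotomy.
congruent = [612, 580, 655, 601]      # e.g., "talkative" / "outgoing" screens
incongruent = [790, 842, 765, 803]    # e.g., "gregarious" / "timid" screens

score = iat_latency_difference(congruent, incongruent)
print(score)
```

Only latencies from the four-word (double-discriminant) screens would be passed to such a function, since the two-word screens are described above as unscored.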

Critique of the Implicit Association Test

Empirical evidence criticizing the Implicit Association Test has been sparse. A few

psychologists, however, have offered their interpretations of the findings. They have said the

IAT does not truly measure implicit prejudice, but only benign cultural knowledge which is

different from true racism. Others have argued about the meaning of implicit cognition. Perhaps

one of the reasons why criticisms have been few in number is that Greenwald, Banaji and Nosek

have themselves conducted and/or co-authored voluminous rigorous studies. Their series of

“check-ups” given the IAT every few years is encouraging. In 2001, Greenwald and Nosek

authored a study entitled: Health of the Implicit Association Test at Age 3. In the abstract, they

conclude: “Although there have also been a few studies critical of the IAT, there now exists

substantial evidence for the IAT's convergent and discriminant validity, including new evidence

reported in several of the articles in this special issue” (Greenwald & Nosek, 2001 p. 85). In

2003, Greenwald, Nosek and Banaji conducted yet another comprehensive study to test a new

scoring algorithm. They concluded, “this new algorithm strongly outperforms the earlier

[original-conventional] procedure” (Greenwald et al., 2003, p. 197). In 2004, at the annual

conference of the Society of Personality and Social Psychology, Greenwald gave a slide

presentation entitled: “Revised Top 10 List of Things Wrong with the IAT” (Greenwald, 2004).

Three years later came a study, The Implicit Association Test at 7 - A Methodological and

Conceptual Review - in which the authors concluded, “In its seventh year, the IAT is showing a


rapid growth in maturity with a solid base of evidence for its internal, construct, and predictive

validity” (Nosek et al., 2007, p. 286).

In terms of validation of the IAT, the most noteworthy achievement to date has been a

10-year meta-analysis by Greenwald et al. (2009). After summarizing 184 studies, the

authors declared the decade-long controversy over the validity of the IAT’s measurement of

racial and ethnic attitudes to be over. As part of their presentation, they asked and answered their

own question and sub-question. They asked: “Are the IAT’s findings of widespread preference

for White relative to Black scientifically valid or, alternately, are they uninteresting artifacts of

the IAT’s novel indirect method?” In addition, they queried: “Do IAT measures significantly

predict social behavior, judgment, and decision making?” (Greenwald et al., 2009, p. 1). Among

these 184 studies, in ascending numerical order, the following domains were included: political

leanings (11), close relationships (12), sexual and gender orientation (15), intergroup behavior

that was non-racial (15), drug and alcohol abuse (16), clinical issues (19), personality differences

(24), interracial (Black-White) behavior (32) and consumer preferences (40). The authors

confidently concluded that when they statistically combined all nine of the above domains (184

studies), the IAT was able to predict personal choices in judgment as well as social behavior

(Greenwald et al., 2009).

Another observation the authors found to be striking was that both self-report (explicit)

studies as well as implicit measurements of the IAT not only had predictive validity, but they

both had predictive validity independently of each other. In other words, both the explicit and

implicit scores were useful as stand-alone measurements even though they were not duplicates of

each other. With this in mind, the authors suggested that both types of measure were desirable in

research studies (Greenwald et al., 2009).


As part of the findings of their meta-analysis, a third item Greenwald et al. found

remarkable was that IAT (implicit) and self-report (explicit) scores were highly correlated in

several behavioral domains. This was especially true in political preferences and consumer

preferences. They were surprised to find that both of these domains predicted behavior, but that

explicit (self-report) measures actually had greater predictive ability (Greenwald et al., 2009).

The most significant finding of the meta-analysis was that for attitudes that were socially

sensitive, i.e., interracial and non-racial intergroup behavior, the “IAT and self-report measures

produced dissimilar measures that were only weakly correlated. In these socially sensitive

domains, IAT measures had significantly greater predictive validity than did explicit measures.

This is the result that establishes the value of using IAT measures in research designed to explore

roots of racial discrimination” (Greenwald et al., 2009, p. 2). The authors then concluded their

discussion of their meta-analysis by offering as examples, seven studies which showed predictive

validity of IAT scores. This is a core point for this investigation.

Another item deserving of the reader’s attention is that the authors of the IAT have

steadfastly encouraged the academic community to take part in their research – even inviting

criticism. An example of this was a 2007 study published by the three principals and a colleague,

subtitled: “What we know so far.” Parts of the abstract are excerpted as follows:

Because of the rapid dissemination of the IAT, researchers have correctly called for intensive

investigation into its underlying psychometric properties and mechanisms”…. and “a number

of issues remain open and in critical need of analysis. A better understanding of the

mechanism of the IAT is needed. In addition, exploration of the relationship between changes

in implicit cognitions and changes in behavior may help to identify mechanisms of

behavioral change as well as consequences of the well-documented malleability effects.


Rather than simply asking if the IAT converges with other implicit and explicit measures and

covaries with meaningful criterion variables—because there is evidence that it does—the

next generation of questions will likely continue the current shift to identifying when and

why these patterns emerge. Answers to these questions will help in building theories of

implicit social cognition, because methods are a central route to theory development. (Lane,

Banaji, Nosek, & Greenwald, 2007, p. 93)

IAT and Personality Assessment

As noted throughout this literature review, one of its central questions asks: how valid is

the self-report method as a tool when it is used in personality inventories? Thus far, on the face

of it, the IAT has shown great promise in addressing that query. The first effort to use the

IAT to measure personality variables showed that “the IAT is able to assess inter-individual

differences that are valid for the prediction of behaviour but that are not accessible with direct

measurement procedures” (Schnabel et al., 2007, p. 393). As well, after adapting the IAT to

measure implicit self-concepts from the descriptors of the Five Factor Model, Schmukle et al.

(2008) concluded that, “In two studies (N = 106 and N = 92), confirmatory factor analyses

validated the five-factor model for the implicit personality self-concept. Internal consistencies of

the IAT proved satisfactory for all Big Five personality dimensions”… “Inter-correlations were

highly similar for implicit and explicit personality measures” (Schmukle et al., 2008, p. 269).

When the IAT was originally published in 1998, some psychologists feared that it might

be used as a lie-detector, i.e., dredging up associations from deep in the subconscious and

exposing “truths” about what people otherwise cannot tell about themselves. Related to this has

been another on-going concern about explicit (self-report) measurements of personality – the

question of fakery on the part of the participants. It appears it is even more difficult to engage in


fakery while taking the IAT. In 2010, Greenwald et al. instructed participants to deliberately alter

their responses on a gender identity IAT. Perhaps the title of the study says it most clearly:

“Faking of the Implicit Association Test Is Statistically Detectable and Partly Correctable”

(Cvencek et al., 2010, p. 301). In fact, the researchers were able to detect 75% of those

participants who were intentionally faking. As well, when a pre-designed algorithm was used to adjust

for such faking, the adjusted IAT scores correlated highly with those that were

unfaked (Cvencek et al., 2010).

Yet another study involving the potentials for fakery with the IAT took place at the

University of Trier in Germany. This time, attempts at faking the explicit (self-

report) responses on a measure of McCrae and Costa’s Five Factor Model were compared with attempts at faking the

corresponding implicit responses on the IAT. The conclusion of the study was, “The results show, indeed, that

the IAT is much less susceptible to faking than questionnaire measures are, even if no selective

faking of single dimensions of the questionnaire occurred. However, given limited experience,

scores on the IAT, too, are susceptible to faking” (Steffens, 2004, p. 165).

In summary, there is little doubt over the substantial growth in science-based literature

with regards to the use of the IAT in personality assessment. The specially designed Implicit

Association Test (IAT) with MBTI-like descriptors has already been developed by CAPT. Also

noteworthy is that in 2006, Anthony Greenwald (inventor of the IAT) filed a patent application

for use of the IAT methodology in multi-factor personality measures. In his application, he

specifically mentions both the Five Factor Model and the Myers-Briggs Type Indicator. It is this

author’s understanding that Greenwald’s patent application has not been granted. However, his

filing of it attests to a perceived commercial potential for an implicit-type measure of

characteristics of personality.


To the best of this author’s knowledge, the only study that has paired the IAT with the

MBTI / MMTIC was conducted by CAPT and involved only a small sample of 50 adolescents.

This is important because some of the methods of measuring and interpreting personality

characteristics that are unique to the MBTI / MMTIC suggest this would be a very beneficial

pairing.

Originally, some researchers thought that the IAT was measuring bias, a view that

generated a lot of controversy. Soon thereafter came those who believed that

the IAT should be incorporated into personality research: not as a measurement of either

bias or personality, but as a starting point, to chart the discrepancy between the conscious and

subconscious responses to self-reported questions. The thought was that using the IAT as a tool in

the actual measurement of personality would follow.

Simply put, no matter the different methodologies, the importance of accurately assessing

personality seems a common goal. As Isabel Myers said, "Assessment via questionnaire

responses….is subject to ‘the opposing pressure of environment’ and other unknown self-report

biases that may produce a false result” (Myers & Myers, 1995, p. 181).

Jung had postulated that “as a rule, whenever such a falsification of type takes place as a

result of external influence, the individual later becomes neurotic.” Later he said, “a reversal of

type often proves exceedingly harmful to the physiological well-being of the organism, often

provoking an acute state of exhaustion” (Jung, 1923, p. 415). Agreement or disagreement of

implicit and explicit measures of type has significant potential for helping resolve personality

ambiguity and enhancing self-insight, career satisfaction, workplace performance, and even

lending insight to pathology.


Implicit Cognition and its Relationship to the IAT

In recent years, the research trajectory has yielded progressively clearer results

concerning the role of biology in implicit thinking. More specifically, most studies have

concluded that in comparison to explicit prejudice, implicit prejudice involves a significantly

greater emotional component. As an example, in their groundbreaking study, Phelps et al. (2000)

became the first researchers to link brain activity to race preference. When White subjects viewed

unknown faces of Blacks, researchers found that amygdala activity had increased significantly.

Further research has arrived at the same conclusion (Amodio, 2003; Lieberman et al., 2005;

Wheeler & Fiske, 2005). As importantly, the degree to which the amygdala was activated was

highly correlated with participants’ IAT scores. During an interview, Phelps commented,

We measured the eye-blink startle, a reflex response that people display when they hear a

loud noise, for example. A lot of studies have shown that this reflex is potentiated

[enhanced] when people are anxious or in the presence of something they think is

negative. We found that implicit preferences were correlated with potentiated startle and

that both were correlated with the amount of amygdala activation (Costandi, 2012, pp.

27-28)

At the time, however, Phelps pointed out that while welcomed, these studies were

primarily correlational and that much more research in establishing the underlying

neurobiological mechanisms of implicit bias was needed.

In 2004, Poehlman et al. conducted a meta-analysis of the IAT which strongly suggested

that when it comes to predicting discrimination against members of out-groups, the IAT was

consistent in its superiority over those instruments which attempt to measure explicit prejudice

(Poehlman et al., 2004). That would seem to coincide with Fazio’s conclusion that measures of


explicit prejudice predict deliberate behavior (Fazio et al., 1995). Nosek et al. (2007) followed up

on Fazio’s suggestion that biased behavior which is subtle and spontaneous is best measured by

implicit measures (Nosek et al., 2007). At this point in the evolution of the IAT, it can now be

posited with reasonable assurance that the IAT is reliable in its measurement of implicit

preferences toward social in-groups (Nosek et al., 2007).

The Neurobiology of Implicit Thinking and the Pivotal Role of Emotion

In a groundbreaking 2012 study titled: The Neuroscience of Race, Phelps and her

colleagues made a step toward this goal of studying the underlying biological mechanisms of

implicit cognition. While not establishing a causal relationship between unconscious social

evaluation and neuronal activity, this study showed, for the first time, that the two were strongly correlated.

With the hypothesis being that Black faces would spawn more emotion in White subjects

than White faces as a result of an involvement of the amygdala, the study singularly focused on

facial recognition amongst Blacks and Whites (i.e., how individuals “perceive and categorize

race and the attitudes that flow from it”; Costandi, 2012, p. 27). The researchers concluded that

“a network of interacting brain regions is important in the unintentional, implicit expression of

racial attitudes and its control” (Costandi, 2012, p. 27). As part of their study, via the use of

fMRI (functional magnetic resonance imaging), Phelps and her colleagues discovered that

among White participants, there was a significantly higher connectivity between the brain

regions themselves and their accompanying network of neural pathways when presented with a

picture of a Black face than with a White face. More to the point, Phelps commented, “This

network overlaps with the circuits involved in decision-making and emotion regulation, and

includes the amygdala, fusiform face area (FFA), anterior cingulate cortex (ACC) and

dorsolateral prefrontal cortex (DLPFC)” (Costandi, 2012, p. 28).


With its status of being the most sophisticated area of the brain, the prefrontal cortex

houses the two DLPFCs (right and left). They are responsible for social judgment and other

complex cognitive processes. These include executive functions such as organization, regulation

of intellectual processing and subsequent action stemming from that processing. Whether it is the

study of behavioral adaptation to a changing environment or the fine-tuning of a highly dynamic

executive control center, the last decade has witnessed many well-grounded studies suggesting

the pivotal role of the DLPFCs. Among them was a 2008 study by Phelps et al. which

demonstrated that the activation of these neural pathways and regions of the brain was highly

dependent on time. That is, when faces of those in the ‘out-group’ were flashed for only a short

time (30 milliseconds), it was the fusiform gyrus and amygdala which showed significant

activation, but there was no activation in the ACC or DLPFC. However, when the facial images

were shown for a much longer time (525 milliseconds), neuronal activity in the amygdala was

nearly non-existent. Instead, it was replaced by strong activity in both the ACC and DLPFC

(Stanley, Phelps & Banaji, 2008). This research has given great insights into what may be the

possible presence of innate prejudice and the roles that the ACC or DLPFC play in that process.

One of those possible insights is that when images or information are presented for a short

duration, our primordial (built-in) neuronal activity is strong and emotional, thus showing initial

– and sometimes unintentional – responses to members of the ‘out-group.’ Then, upon a longer

exposure, the ACC and the DLPFC spring into action, neutralizing or perhaps even abolishing the

influences of the fusiform gyrus and amygdala and, in turn, providing a more rational response.

However, researchers should exercise caution in the sense that increased activity in the amygdala

does not automatically mean negative emotion. Rational response might simply represent

increased awareness and deeper thought when processing facial images from the ‘out-group.’


See Figures 1 and 2 for the role of specific parts of the brain regarding the neural basis of

implicit attitudes.

With regards to the workings of the amygdala and prefrontal cortex, adolescents represent a

particularly interesting study group. In the study of brain maturation, until the early 1990s, it was

thought that the first three years of life were the most important and that by the age of ten, the

final product was nearly in place, with final maturity reached by 18. It has only been recently that

cognitive neuroscientists have concluded that dramatic and on-going changes in both brain

structure and function also occur in the late teens and continue well into adulthood, completing

this process at the mean age of 25 (Casey & Giedd, 2000). In short, controlling impulses and

regulating emotions are a challenge at this time in life.

Jay Giedd observed with regularity that exposing adolescents to photographs of

frightened faces resulted in overactive amygdalae. As well, their accompanying heightened

emotional states remained far longer than those of control group (beyond age 25) volunteers. According

to Casey,

there is no question that the amygdala has the upper hand during this critical period of

brain development when teenagers are exploring their world. And, it's not until the

prefrontal cortex fully matures in the mid-twenties that greater reasoning and

judgment are able to modify the more turbulent thoughts and feelings of adolescence

(PBS/Frontline, Giedd, 2002).

Noteworthy was the remark by Giedd during a television interview on the topic, “It’s

sort of unfair to expect [teens] to have adult levels of organizational skills or decision-

making before their brains are finished being built” (PBS/Frontline, Giedd, 2002).


Hypotheses

Measuring personality is challenging. Using the self-report method can be even more

problematic as preferences can be easily misidentified when they are subjected to subtle or

external pressures which are in conflict with internal inclinations. All of these phenomena have

the potential to skew measures of personality. Given the previously cited characteristics of

adolescent brains, this situation may be exacerbated when traditional problems associated with

MBTI and the MMTIC self-report personality inventories are factored in. Currently, the Implicit

Association Test is the sole available assessment approach which attempts to measure

personality without reliance on potentially fallible and misleading explicit self-report devices.

Therefore, establishing evidence of convergent validity through a correlational study involving

older adolescents and their scores on both explicit (self-report) and implicit personality

inventories is a worthy goal of research. Specifically, the correlations between Extroversion /

Introversion (E/I) scores (a); Sensing / Intuition (S/N) scores (b); Thinking / Feeling (T/F) scores

(c); and Judging / Perceiving (J/P) scores (d) will be examined. In total, eight correlations will be

computed. As such, the following hypotheses are proposed:

H1a-d: There will be statistically significant correlations between the scores on four

corresponding dimensions of the Myers-Briggs Type Indicator (MBTI) and a specially

designed Implicit Association Test (IAT) with MBTI-like descriptors.

H2a-d: There will be statistically significant correlations between the scores on four

dimensions of the Murphy-Meisgeier Type Indicator for Children (MMTIC) and a specially

designed Implicit Association Test (IAT) with MBTI-like descriptors.


Chapter II: Method

Participants

The sampling method consisted of opportunity sampling of older adolescents enrolled in

seven AP Psychology classes, two Introductory Psychology classes and one Introductory

Sociology class. Participants numbered 103 which consisted of 73 females and 31 males. All of

them were either 12th or 11th graders, with the vast majority (70%) being in the 12th grade. With

the exception of 100% of all participants being between 16 and 18 years of age, there was no

attempt to collect demographics of age or race. One hundred percent of these classes were

taught by either the researcher or by the researcher’s colleague, referred to from this point

onward as the “TA” (i.e., teacher’s assistant). All participation was voluntary.

Procedure

All participants were recruited by the TA via the following protocol. The TA and the

researcher switched classes during the recruitment session(s). Because the researcher was not

involved in any of the testing, only the TA read a written script (Appendix D, p.142). However,

so as to maintain confidentiality, there were no requests for students to raise their hands to

indicate their interest / willingness to participate. Instead, every student in the class was given the

general recruitment presentation and the handout with consent and instructions. Only those

students who brought back parental consent / student assent forms were given the full version of

the participant instructions, but only at an after school informational session on a date

determined by the TA. At this unabridged informational meeting, participants received clear

communications of the voluntary nature of the participation as well as multiple reassurances that

students’ grades would not be affected in any way by participation / non-participation in the

study. For the full reading script for this informational session for participants who volunteered,

see Appendix E. It was also reinforced that the researcher, the TA, and the students themselves


would never know their individual results. This was because of a unique identity code that was

generated by each participant according to a formula described below.

Information sessions. Before test-taking began, participants were given an information

presentation lasting approximately 30 minutes. The logistics of the study were explained at this

time. Students absent from class that day were given individual explanations. In addition,

potential participants were informed that these three assessments (i.e., MBTI, MMTIC, and the

IAT) were personality inventories. However, they were not made aware what the motives were

behind taking the IAT test until after the study was over and they were debriefed. See section

under ethical considerations - Appendix D. These presentations were made by the TA in the

students’ classrooms where they met on a daily basis. The TA was also the one who instructed

willing participants how to generate their own unique codes (see next section on confidentiality).

Each participant took three personality inventories online while at home at three separate

times. From beginning to end, the data collection process took five weeks. These tests consisted of

the following.

a test of the MBTI (Appendix G)

a test of the MMTIC (Appendix H)

a test of the specially designed and beta-tested IAT with

MBTI/MMTIC descriptors. (Appendix I)

Confidentiality. The data collection process was confidential. No personally-identifying

information was collected. When participants took these online assessments, instead of entering

their name(s), they entered a unique code number. The vast majority of the content of each of

these unique code numbers was generated by the student participants themselves and therefore

was confidential. The only commonality in any of these unique codes was for the purpose of


controlling for possible confounding variables resulting from the order in which these tests were

taken. As discussed in the section on order effect, there were three separate groups of test-takers.

Group one was scheduled to take the IAT test first in the sequence. Group two was scheduled to

take the IAT test second. Group three was scheduled to take the IAT test last in the sequence.

The formula for generating these unique codes was as follows:

Group one (the TA’s Period 1 and 2 classes as well as the researcher’s Period 3 class)

entered “1” - meaning the group taking the IAT first.

Group two (the TA’s Period 4 and 5 classes as well as the researcher’s Period 6 and 7

classes) entered “2” - meaning the group taking the IAT second.

Group three (the researcher’s Period 2 and 8 classes as well as the TA’s Period 7 class)

entered “3” - meaning the group taking the IAT third.

-----------------------------------------------------------------------------

The second part of the formula required each participant to enter either “m” for male or

“f” for female to specify the gender they consider themselves. For example, if a female in Group

1 were to initiate her unique alpha-numeric log-in code it would be: “1f”

From this point onward, the formula was as follows:

the last letter of participant’s last name. For example, if this person’s last name ended in the letter “a” – then the example up to this point: 1fa

the number of the month that participant was born: example Oct = “10”

example up to this point: 1fa10

the numerical day of the month on which participant was born: example: 19

example up to this point: 1fa1019 = this is the completed unique code.
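
The code-generation formula above can be sketched in a few lines of Python. This is only an illustration: the function name `build_code`, its parameters, and the sample last name are hypothetical, and the sketch assumes the month and day are entered without zero-padding, which the formula as described does not specify.

```python
def build_code(group, gender, last_name, birth_month, birth_day):
    """Assemble a participant code following the steps described above.

    Hypothetical helper for illustration only. Zero-padding of the month
    and day is NOT specified in the text, so none is applied here.
    """
    if group not in (1, 2, 3):
        raise ValueError("group must be 1, 2, or 3 (position of the IAT)")
    if gender not in ("m", "f"):
        raise ValueError("gender must be 'm' or 'f'")
    # Only the last letter of the last name enters the code.
    last_letter = last_name.strip()[-1].lower()
    return f"{group}{gender}{last_letter}{birth_month}{birth_day}"


# The worked example from the text: Group 1, female, last name ending in "a",
# born on October 19. "Garcia" is a made-up name used only for this sketch.
example = build_code(1, "f", "Garcia", 10, 19)
print(example)  # prints 1fa1019
```

Only the leading digit reflects information supplied by the TA; every other character comes from the participant.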

The only information that was supplied by the TA was the first digit of each unique code.

This was for the purpose of knowing that the TA’s Period 1 and 2 classes as well as the

researcher’s Period 3 class were scheduled to take the IAT first; that the TA’s Period 4 and 5

classes as well as the researcher’s Period 6 and 7 classes were scheduled to take the IAT second;

and that the TA’s Period 7 class as well as the researcher’s Period 2 and 8 classes were scheduled


to take the IAT third. In other words, someone in the study had to know the order in which

groups were scheduled to take each test. At no point was the researcher or his TA ever aware of

whose results belonged to which participant in the study. As well, at no point did the participants

ever become aware of their results, either on each test or in the aggregate.

Participants’ responses were scored automatically via computer software programs and

empirically derived scoring algorithms specific to each personality inventory. For the MBTI, it

was based on Item Response Theory (Myers et al., 1998). For the MMTIC, it was based on

Latent Class Analysis (Murphy & Meisgeier, 2008). For the IAT, it was based on the latest

scoring algorithms outlined by its authors and computed by their proprietary Inquisit software

(Greenwald, Nosek & Banaji, 2003).

Integrity of the test-taking environment. As part of ensuring that the test-taking

atmosphere was conducive to the task at hand, measures that strove toward the integrity of the

test-taking environment were instituted. This included efforts to seek an environment that was not

compromised by multi-tasking. For example, as a control measure, students purposely were not

asked to take these online inventories in school. The purpose of this was to ensure that the time

spent taking these assessments was unencumbered by external distractions. High school

computer labs are usually crowded and noisy. In addition, it was thought that the close physical

proximity of computer stations inherently ran the risk of peer influence and/or impaired the

ability to concentrate. It was thought that by avoiding these errors in procedure, possible

confounding variables would have been avoided.

Of equal importance was the avoidance of a problematic and compromising test-taking

atmosphere away from school. As part of the test-taking protocol, participants were asked to sign

a pledge both before and after taking these assessments stating they would not / did not take


these personality inventories under compromised conditions. As part of this pledge, definitions

were included of what a quality-filled test-taking atmosphere is. Relative to this point, because

the IAT test is especially vulnerable to a poor test-taking atmosphere, special attention was paid

to this topic during the in-class information sessions. As a point of information, there was

nothing to prevent the participants from taking tests at school, except the signed pledge to not do

so. Participants who did not sign the aforementioned pledge were not allowed to take part in the

testing. (Appendix J - Participant honor pledge to maintain a quality testing environment).

Order effect and counterbalancing measures. Order effects are confounding effects.

Because the order effect is an inherent weakness of the repeated measures design,

counterbalancing measures should be taken. Currently, explicit-implicit correlations and their

relationship to the order effect are not fully understood (Schnabel, Asendorpf & Greenwald,

2008). This appears to be especially true if the correlations are weak. For example, in this study,

there is the thought that the validity of implicit measures (Implicit Association Test) may be

affected by first having completed the explicit (self-report) tests (MBTI and MMTIC). If true,

correlations between these explicit and implicit measures may increase as a result. This would

make logical sense because of priming. On the other hand, if participants were called upon to

take the implicit (IAT) test first, even if they were called upon to do nothing afterward, the test

would automatically be less implicit – simply because they have done it before (Klauer &

Mierke, 2005).

Thus each participant was randomly assigned to one of the following three groups. These

groups were defined according to the sequence in which these three different tests were

scheduled to be taken. This was done to minimize any order effects errors in the study. As

previously explained, groups already knew what their order of test-taking was supposed to be.


Group 1 was to take (in order) the test for the:

1. IAT

2. MBTI

3. MMTIC


Group 2 was to take (in order) the test for the:

1. MBTI

2. IAT

3. MMTIC


Group 3 was to take (in order) the test for the:

1. MBTI

2. MMTIC

3. IAT

With regard to the amount of time between administrations of these personality

inventories, the researcher strove to ensure there was an optimal spacing between them.

Administrations should not be so close that participants might be mentally fatigued from taking

them. Therefore, it was planned that these personality inventories be spaced one week apart. In

addition, participants were encouraged to take them on weekends, i.e., not school days.

Participants were also coached to take these inventories only when they were ideally in a rested

and calm state. With these recommendations having been stated and reinforced, it was also

understood that these testing conditions were impossible to monitor and/or enforce. The regimen

/ schedule for the tests of the three different personality inventories was as follows:

It was recommended to participants that the taking of these tests should be approximately one week apart and be taken only on weekends. As well, only one test could be taken per

weekend.


It was also planned that all three tests would be taken during the months of April and May, 2013.

Participants were further informed that the publisher’s website would be open (i.e., a

participant would be able to log on and start testing) for their specific group only during

pre-stated time frames.

Attrition as part of the procedure. In order to achieve researchable results, each

participant had to have completed all three personality assessments. As such, the researcher

expected to lose, at minimum, 20% of participants via attrition. It was speculated that by the end

of the study, various situations in students’ lives would have taken their toll. These included:

illness, the fact that perhaps some students did not have internet connections at home, typical

end-of-year academic demands such as preparation for exams, the on-going and time-consuming

processes of dealing with college acceptances and applying for scholarships, social demands

such as the “prom season” and the simple lack of follow-through. Data from participants who did

not complete all three of the testing tasks were discarded from the study.

Practice-run of a generic (non-personality-type) implicit (IAT) test online and in

school. The goal was for participants to become familiar with the IAT format before the study

began, rather than “when it really counts” (at home). Toward this end, participants’ classes

were brought to various computer labs within the school and the website maintained by Project

Implicit at Harvard University was accessed. On this website, there were/are interactive

demonstrations of the IAT. Each demonstration took 10 to 15 minutes. Students were

encouraged to engage in two demonstration sessions.

Ethical considerations


In accordance with the University of Hartford’s participation in the Collaborative

Institutional Training Initiative (CITI) program for the Protection of Human Research Subjects,

this study was in full compliance with the following standard procedures:

Appendix K: official permission from West Hartford Public Schools to conduct research on

participants who were younger than 18 years of age

Appendix L: Introductory Letter to Parents: 11th and 12th grade behavioral science students at

Hall High School – West Hartford CT

Appendix M: Passive Parental or Guardian Consent Form

Appendix N: Informed Consent/Assent of student participants: Participation in a study of the

validity and reliability of three different personality tests

Informed consent. Participants gave informed consent and had the right to refuse

and/or withdraw at any time. Researchers informed participants in advance about the general

nature of the research and potential risks involved. In this case, there were no risks involved. In

addition to the participant/assent forms, all students’ parents or guardians, regardless of their

child’s age, were sent passive parental consent forms. Given that a minimum of 30% of

participants were below the age of 18, both the researcher and the Interim High School Principal

believed it was better to build in this extra layer of consent. Toward this end, a physical letter

was sent to participants’ homes informing them of this study (see Appendix E). In summary,

there was no anticipated risk of harm or discomfort associated with this study. However, the

risks were discussed both in the information sessions and encapsulated in informed consent

forms themselves, as well as in the letter to the parents.

Professional ethics precluded researchers and their colleagues from releasing data on or

sharing information about individual participants. Again, this was not an issue as all participants

took these inventories using a pre-assigned number. Researchers were not aware of individual


participants’ results until the end of the study and only as a pre-assigned unique alpha-numeric

code.

Deception. There was an innocuous amount of deception. For the administrations of the

MBTI and the MMTIC, participants were not given full disclosure. That is, they were told that

they were being given two non-threatening personality tests (i.e., the MBTI and the MMTIC),

which do not look for or report any negatives in the form of personality descriptors. Both of

these inventories are based on the constructive use of differences. On the other hand, with

regard to this specially designed version of the Implicit Association Test (with its descriptors

modeled on the MBTI and MMTIC), although participants were told that it was “just another

personality test” – the fact of the matter was that the IAT was measured via a timed process and

for accuracy of answers.

Debriefing. At the end of the entire test-taking regimen, all participants were given an

informal, yet full explanation of the research. This was done using the same procedure in the

information sessions before the study began. These debriefing sessions lasted approximately 20

minutes each. Participants absent from class that day were given individual explanations.

Apparatus

Websites. All assessments were taken on the internet on the official sites of the MBTI

and MMTIC, respectively. The official website for the MBTI is called SkillsOne.com. It is the

company-based (publisher’s) website owned by Consulting Psychologists Press (CPP) in

Mountain View, California. The official MMTIC website is called CAPT.org and is owned by

the Center for the Application of Psychological Type (CAPT). It is based in Gainesville, Florida,

and is the non-profit organization originally established by Isabel Briggs Myers for the purpose

of furthering the academic study of Type.


A home computer with an internet connection was required. If participants did not have a

computer and/or access to the internet at home, they were asked in the informational session(s) to

discreetly identify themselves (so as not to cause them embarrassment). As long as they could

assure the researcher or the TA that the testing could be accomplished in an environment that was of

equal quality inside school, permission was granted to do so, but the researcher’s TA had to be

informed first. An example of an environment that was of equal quality was in the quiet area of

the school library, i.e., with no disturbing stimuli around them. Restated, to the best of the

researcher’s knowledge at that time, at no time did participants take any of these tests while in

class or any other noisy environment within the school.

Statistical software. SPSS Graduate Pack 12.0 - Student Version by IBM.


Chapter III: Results

Data Gathering and Analysis

Computing participants’ scores. All data were gathered by the online servers of the

publishers of the Myers-Briggs Type Indicator (MBTI) and the Murphy-Meisgeier Type

Indicator for Children (MMTIC). The publisher of the MMTIC was also the developer of the

specially designed Implicit Association Test (IAT) with MBTI-like descriptors.

Tests of the hypotheses. Because this study was correlational, and interval data were

involved, eight parametric Pearson’s Product moment r tests were performed to test the eight

separate hypotheses. Participants’ scores were entered in SPSS software for the Pearson Product

Moment r, and the tests run. The IAT D score was used to measure the strength of the

associations for the IAT test. Specially created by the developers of the IAT, the D score is

derived from the Cohen’s d calculation. That is, it compares the average response latencies

of the IAT sorting conditions. It takes the difference between the average latencies of the

sorting conditions and divides it by the standard deviation of all the latencies for all of the

sorting tasks. According to

Greenwald, Nosek and Banaji, “the difference between Cohen’s d and the Implicit

Association Test (IAT) D measure is that the standard deviation in the denominator of d

is a pooled within-treatment standard deviation. The present D computes the standard

deviation with the scores in both conditions, ignoring the condition membership of

each score” (Nosek et al., 2005, p. 167). For a detailed description of how these D scores

were obtained, the actual algorithm is offered in Table 4 (Greenwald et al., 2003).
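
The core of the D calculation described above can be sketched as follows. This is a simplified illustration only, not the full published algorithm: Greenwald et al. (2003) specify additional data-cleaning steps that are not reproduced here. The function name `iat_d_score` and the latencies are made up for this sketch, and the sample (n - 1) standard deviation is used as an assumption.

```python
from statistics import mean, stdev


def iat_d_score(latencies_a, latencies_b):
    """Simplified D: mean latency difference over the SD of all latencies.

    latencies_a and latencies_b are response times (ms) from the two
    sorting conditions. The standard deviation is computed over both
    conditions together, ignoring condition membership, which is the
    feature that distinguishes D from Cohen's d in the quotation above.
    """
    all_latencies = list(latencies_a) + list(latencies_b)
    return (mean(latencies_b) - mean(latencies_a)) / stdev(all_latencies)


# Made-up latencies in milliseconds, for illustration only.
condition_a = [600, 700, 800]
condition_b = [900, 1000, 1100]
d = iat_d_score(condition_a, condition_b)
print(round(d, 3))
```

A pooled within-condition standard deviation, as in Cohen’s d, would be smaller than the combined one whenever the two condition means differ, so D is the more conservative of the two measures in that case.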

Hypotheses 1 a-d addressed the MBTI-IAT agreement of scales, one bivariate correlation

for each corresponding pair of dimensions (4 correlations):


H1a: Extroversion / Introversion (E/I) scores;

H1b: Sensing / Intuition (S/N) scores;

H1c: Thinking / Feeling (T/F) scores;

H1d: Judging/Perceiving (J/P) scores.

Hypotheses 2 a-d addressed the MMTIC-IAT agreement of scales, one bivariate

correlation for each corresponding pair of dimensions (4 correlations).

H2a: Extroversion / Introversion (E/I) scores;

H2b: Sensing / Intuition (S/N) scores;

H2c: Thinking / Feeling (T/F) scores; and

H2d: Judging/Perceiving (J/P) scores.

In addition to the Pearson’s Product moment r tests and the IAT D scores, correlations

among the different continuous scores for EI, SN, TF, and JP for the MBTI, the MMTIC, and the

IAT were measured.

It was hypothesized there would be statistically significant correlations between the

corresponding dimension scores on the explicit (conscious) responses of the MBTI and MMTIC

when compared to the implicit (unconscious) responses of an IAT test designed using the same,

above mentioned descriptors. The results were mixed. That is, four of the eight hypotheses were

supported and four were not supported. Specifically, within the dichotomies of Extroversion /

Introversion (E/I); Sensing / Intuition (S/N); Thinking/Feeling (T/F); and Judging/Perceiving

(J/P), the results were as follows:

With H1a: Extroversion / Introversion (E/I) scores between the MBTI and IAT, there was

a significant positive correlation between these two variables (r = .254, n = 103, p < .01).


With H1b: Sensing / Intuition (S/N) scores between the MBTI and IAT, there was a

significant positive correlation between these two variables (r = .238, n = 103, p < .05).

With H1c: Thinking / Feeling (T/F) scores between the MBTI and IAT, there was no

significant correlation between these two variables (r = .190, n = 103, n.s.).

With H1d: Judging/Perceiving (J/P) scores between the MBTI and IAT, there was no


With H2a: Extroversion / Introversion (E/I) scores between the MMTIC and IAT, there

was a significant positive correlation between these two variables (r = .319, n = 103, p < .01).

With H2b: Sensing / Intuition (S/N) scores between the MMTIC and IAT, there was a

significant positive correlation between these two variables (r = .268, n = 103, p < .05).

With H2c: Thinking / Feeling (T/F) scores between the MMTIC and IAT, there was no


With H2d: Judging/Perceiving (J/P) scores between the MMTIC and IAT, there was no


While not part of the hypotheses, the positive correlations between all of the dichotomies

of the MBTI and the MMTIC were all highly significant. This is noteworthy because these took

on the role of a de facto “control group”. That is, because all facets of these two self-report

personality inventories were so strongly correlated, it became a matter of great interest to see

how aligned the IAT would be in this regard. As such, they are displayed in Table 4.
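
As a rough cross-check on the significance levels reported above for the MBTI-IAT correlations, the sketch below converts a correlation and sample size into an approximate two-tailed p-value. It is an assumption-laden illustration: it uses the Fisher z normal approximation rather than the exact test a statistical package would report, and the function name `approx_two_tailed_p` is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_two_tailed_p(r, n):
    """Approximate two-tailed p-value for a Pearson r via Fisher's z.

    Assumption: a normal approximation, not the exact t-based test that
    statistical packages typically report.
    """
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# r values reported above for three MBTI-IAT pairs (n = 103).
for label, r in [("H1a", 0.254), ("H1b", 0.238), ("H1c", 0.190)]:
    print(label, round(approx_two_tailed_p(r, 103), 4))
```

Under this approximation, the H1a value falls below .01, the H1b value falls between .01 and .05, and the H1c value falls just above .05, which agrees with the pattern of results reported.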


Chapter IV: Discussion

This study attempted to address several problems associated with explicit self-report

personality inventories such as the MBTI and the MMTIC. Although not overtly stated, there

were heavy inferences throughout the introductory section of this paper that as the “new kid on

the block,” the IAT had possibilities of becoming one of the most credible tools in psychological

research. This study addressed this need by attempting to validate the IAT and its correlations

with the MBTI and the MMTIC.

The test of the eight hypotheses provided the following evidence. H1a, H2a, H1b and

H2b were supported. In contrast, H1c, H2c, H1d and H2d were not.

As with most studies, however, the results have spawned more questions than the original

hypotheses sought to answer. Given the disparities between the correlations that were significant

and not significant, one is tempted to inquire whether explicit self-report instruments such as the

MBTI and the MMTIC are invalid measurements of personality – perhaps for reasons having to

do with social desirability. Conversely, while seemingly a valuable research tool regarding

socially sensitive topics with high emotional content, (e.g., race and gender bias), is there a

possibility that the IAT falters when trying to measure personality with its characteristic

complex, fluid and situational scenarios and their accompanying emotional subtleties?

Further, within those basic questions lies a deeper set of inquiries. How implicit is the

Implicit Association Test? In measuring latency responses, are those delayed responses

manifestations of actual unconscious biases - or might it be a cognitive leap to posit this?

Perhaps those delayed responses are simply a measurement of stimuli which are familiar /

unfamiliar and nothing more. With regard to the underlying neurobiologies of implicit and

explicit cognition, well-grounded research involving the study of memory has yielded strong


evidence as to the different geographic locations within the brain of explicit and implicit

memory. Might the same be true with implicit and explicit cognitive responses to different

stimuli as well? Finally, ever present within the disciplines of psychology and cognitive

neuroscience is the nature / nurture question. Perhaps these implicit-explicit responses can be

framed as manifestations of the hardware/software (i.e., biology/environment) metaphor.

Drawing from classic theories in psychology, perhaps even Erikson’s premise of identity as

well as Tajfel and Turner’s in-group/out-group constructs have significant roles to play. These

questions will be addressed in the approximate order they were presented.

Finally, one planned procedure of this study had to be abandoned. Referenced here is the

discussion about three groups of participants, each taking these assessments in specific order.

This was for the purposes of counterbalancing possible order effects. Early-on, the TA

determined this plan had collapsed. A combination of apathy among participants (to be addressed

on the next page), their seeming inability to follow directions and, candidly, a lack of diligence

on the part of the TA himself were the root causes. In debriefing sessions, participants were

quite sincere in letting the TA and the researcher know that it just was not that important to them;

comments of the type exemplified by “oops, I forgot” seemed to rule the day. The implications

of this seem to be that original researcher concerns over the order effect were negated by

participant apathy.

Was SDR a Factor in the Taking of the MBTI and/or the MMTIC?

Given the existence of a substantial amount of literature to support both sides of the SDR

debate, relative to this thesis, the author takes the position that social desirability responding

played either no role at all, or at most, a very small role in participant answers in either the MBTI

or MMTIC self-report personality assessments. Rather, the theme of participants’ roles in this

study seemed to be along the lines of general apathy and disengagement. On this point, while


much of the information which follows is anecdotal, the researcher offers the following

description.

From the onset of this study, all participants were quite aware of the confidential aspect of

the coding system and that in accordance with the ethics of research on participants below 18

years of age, they also knew they would never know their results. In short, from this perspective

alone, there was little or no reason to engage in social desirability – consciously or

subconsciously. As far as socially desirable responding was concerned, participants took

on a “why bother?” attitude.

Further, when all possible students in the researcher’s and TA’s classes were tallied, 261

students were eligible to take part, yet the attrition rate was very high. In order to be included in

the final data tabulation, a participant had to complete all three personality assessments. Yet,

only 103 participants met this criterion. Among these 261 students were 73 who chose, from the

onset, not to take part. An additional 45 students completed either one or two, but not all three of

these assessments. As such, their data were discarded. Perhaps another indicator of a lack of

involvement in the testing process was the fact that IAT data from 39 additional students had to

be purged as a result of “key-banging.” This was done in accordance with guidance given by

Greenwald et al. who developed an algorithm that calculates when a respondent has answered

too quickly (Greenwald et al., 2003). That is, time-wise, they pressed the “A” key or the “K” key

in an automaton-like manner that makes it virtually impossible for meaningful cognition to occur in

the average person. Finally, perhaps one of the most telling indicators of the general lack of

involvement among participants was the number who chose not to take advantage of a special

reward offered to them upon completion of all three assessments. Concerned that participants’

awareness that they would never know their results would contribute to low

participation and high attrition rates, the researcher, as a preemptive move, had successfully

persuaded the publisher of the MBTI, Consulting Psychologists Press (CPP), to offer a free

administration of their most sophisticated personality inventory currently offered – the MBTI

Step II. According to the publisher, the Step II represents a “deeper-drilling” into one’s

personality in a non-judgmental way. Used by 89 of Fortune 100 companies, it also carries a

retail price tag of $100 (Pepper, 2005, p. 45). Out of 143 eligible participants, only 55 chose to

take advantage of this opportunity. Restated, social desirability was probably not a factor in this

study.
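
The participation figures quoted above can be tallied as a quick consistency check. The sketch below uses only the counts stated in the text; the variable names are illustrative, and it assumes that the 143 students eligible for the Step II offer are those who completed all three assessments.

```python
# Counts as stated in the text above.
eligible_students = 261
declined_at_outset = 73
incomplete = 45          # completed one or two assessments, not all three
purged_key_banging = 39  # IAT data removed for overly fast responding

# Students who finished all three assessments (assumed to be the group
# eligible for the Step II offer).
completed_all_three = eligible_students - declined_at_outset - incomplete

# Students remaining after the key-banging purge.
retained = completed_all_three - purged_key_banging

took_step_ii = 55
step_ii_uptake = took_step_ii / completed_all_three

print(completed_all_three)      # 143, matching the number eligible for Step II
print(retained)                 # 104, one more than the 103 reported in the text
print(f"{step_ii_uptake:.1%}")  # share of eligible students who took Step II
```

By this tally, fewer than four in ten of the students who completed all three assessments accepted the free Step II administration.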

Why the IAT Correlations Were So Mixed

When comparing participant results between the IAT and explicit self-report assessments

on the same topic, a broad review of the research literature shows historically low correlations

which are accompanied by regularly occurring patterns to indicate possible reasons why.

Mentioned earlier in this thesis, in a different context, was a seminal paper by Hofmann et al.

(2005). Hofmann’s research team conducted a meta-analysis on 126 studies which yielded a

mean correlation of .24 (Hofmann et al., 2005). This compares with a mean correlation of .21

across this paper’s eight bivariate hypotheses. Specifically, the IAT and MBTI = .21 and the IAT and

MMTIC = .20 (see Table 2). While the findings of Hofmann et al. are not necessarily the “holy

grail,” it is particularly noteworthy that this researcher’s results are very much in alignment with

them.

For this thesis, the first reason why the researcher contends the results were mixed was

possibly a subjective perception of items presented by the IAT. What Hofmann et al. call

“motivational biases in explicit self-reports” is essentially a discussion of social desirability

(Hofmann et al., 2005, p. 1369). In this case, whether the MBTI and the MMTIC incorporated


social desirability or social undesirability is a moot point. As stated earlier, it is the position of this

researcher there was little or no reason for participants to engage in social desirability –

consciously or subconsciously. Once potential subjects became aware of the fact that they all

would be taking all of these assessments under confidential identities and that they would never

know their own personal results, persuading them to participate in any manner was quite

difficult.

In an opposite twist on Hofmann’s conclusion, this researcher offers a discussion on

subjective perceptions, but from a different perspective. That is, these motivational biases

possibly arose from emotion-based (subconscious) implicit thinking while taking the IAT.

At this point in the paper, evidence produced regarding the underlying biological and

dominant role of emotion in implicit thinking has been robust. This evidence should be fused

with equally strong knowledge of how the amygdala has the upper hand in adolescent cognition.

Specifically, a further examination of the actual words flashed on the screen while these

adolescents took the IAT is warranted. The reader will recall the IAT used in this study was

specially designed using MBTI-like descriptors. See Table 6.

The reader is also asked to keep in mind that the word items in the discussion which

follows, known as single word stimuli, are supposed to collectively describe individuals’

personalities. This is inherently problematic because words used to describe one’s personality are

rarely neutral. On the contrary, it is the researcher’s position that nearly all of them are emotionally

charged, positively or negatively. As a result, they may disproportionately evoke negative or

positive responses called for by the IAT.

Given the total absence of studies that involve the MBTI and the IAT, the researcher

must reluctantly engage in anecdotal speculation about the word choices used. As part of that


speculation, the researcher asserts that some words have negative connotations that are so great,

even if the word did apply to the test-taker, the person implicitly might not want it to. Therefore,

subconsciously, they might not associate themselves with it. In taking a closer look at these

words, it is the researcher’s further contention that for one reason or another, easily 50% of these

words are invalid. This is because there is a good chance the test-taker will implicitly respond to

the emotionally charged word, and in the process, biases that might otherwise be displayed will

be drowned out. See Table 5.

To begin with the word talkative: talkative is somewhat negative. For some, talkative

may not be negative at all. Loud and show-off are very negative. Many, if not most extraverts

may not want to automatically assume they are loud. Indeed, one would be hard-pressed to make

a case that show-off fits at all. Demonstrative would be a better word, but most participants

might not understand what it meant. Adolescents would almost certainly not know its meaning.

Expressive, outgoing and sociable all have positive connotations.

Regarding the words that pulled for introversion, when an introvert sees the word quiet

flashed on the screen, many would have no discordance, (i.e., the word fits). Other introverts,

however, may not think of themselves as quiet. Whether it is loud for extroverts or quiet for

introverts, when a one word descriptor is applied to a personality test via the IAT, it is the

researcher’s contention that the response depends upon the nature of the word and how closely

one associates oneself with those words. In isolation, that association of the word is not

necessarily shared by the individual herself or himself. Reticent might be a good substitute for

quiet but most people, especially teenagers, would not know what that meant. Cautious and

modest seem not to fit at all. This researcher is not aware that either word can automatically be

associated with introversion. Introverts might not talk about themselves very much, but herein


lies the point. The point is, there is a great deal of “fuzz” with these words when they are used as

stand-alone items which are being flashed on the screen in rapid succession – and being scored

for accuracy.

When one looks at the dichotomy of sensing (noticing details) vs. intuition, literal may

be quite confusing for most adolescents. Creative is a hyper-positive word. The antonym

(possibly dwelling in the unconscious) is this: who wants to be thought of as not creative?

Probably no one. Clever, inventive and innovative are equally positive with the caveat that

clever might not even belong in this category. In terms of antonyms, does this (implicitly) mean

that the people who have a propensity to notice details, (i.e., the sensors) are not clever?

Practical and realistic fall into the same category. They are both positive words for most people.

Is a person who is intuitive not realistic? not practical? The emotional loading of the words

seems to trump the attempted alignment with what the IAT is attempting to measure. For

example, is it not possible that an intuitive person can also be practical? And if there is a

delayed response here, how is that scored? The reader will also note that the word abstract was

removed after being beta-tested because, depending on the situation, it had a multitude of

meanings. Some adolescents thought it meant something that one puts at the beginning of a

research paper – others thought that it had to do with art.

With the logical-warm dichotomy (used in place of the MBTI Thinking / Feeling

category), the word warm itself is problematic. Does this mean that by default, logical people

are cold? The word objective is acceptable, with the thought being thinkers are generally

objective whereas feelers are subjective. However, out of context, objective can be a vague

word, especially for teenagers. The term tough-minded is in a similar situation. To most adults,


tough-minded means disciplined, but to teenagers, tough-minded may take on the connotation

of being mean.

Frank is a very poor word because most teenagers today do not know the meaning of

frank. As a full-time public high school teacher of eleventh and twelfth grade students, it is the

researcher’s opinion that frank is a generationally-old and anachronistic word. It is hardly ever

used by this generation of teenagers. Candid might be a better word, but that has to do more

with honesty than anything else. Skeptical, analytical, and logical are about as close to being

neutral as any of these words. Warm, generous, and agreeable are excessive in their positivity.

To counterbalance them should cold, stingy and disagreeable be there as well? For most

adolescent boys, emotional is beyond acceptable. Harmonious can be a word with more than

one meaning. To some it might mean they can sing well. Sympathetic is generally acceptable,

but again, it may be too positive. How many people (subconsciously) want to be called

unsympathetic?

Within the dichotomy of organized vs. spontaneous, planful is not a word. Most adults

might guess what it meant, but teens might be confused. No assessment of personality should

contain a word that does not exist and expect people to know its meaning. Disciplined is a

positive word, yet people who are quite undisciplined often think of themselves as disciplined.

Orderly, organized, prompt, and thorough are all complimentary, but again, are their

antonyms part of participants’ implicit thoughts? Relaxed, informal, spontaneous, easygoing, and

casual are words which could fall either way, depending on the person taking the IAT. Flexible

is positive, but its uncomfortable antonym inflexible may implicitly (subconsciously) drive an

IAT test-taker in the opposite direction. The reader is reminded these words are being flashed on

the screen in rapid succession.


In this study, a second possible reason for the mixed or low correlations between the IAT

and the MBTI and MMTIC is a magnification of the reason just discussed. It is one consideration

when the word stimuli being flashed on IAT screens are emotionally charged. However, it is the

contention of this thesis that the ineffectiveness of these stimuli becomes exaggerated if there is

little or no ability to cognitively process them. Further, if an IAT test-taker also has a certain

amount of ignorance about what a word means, then it will be impossible to fully understand the

relationship between the words that are being flashed on the screen…knowing all the while the

“big red X” will appear if a “wrong” answer is chosen and as the clock is ticking. Reciprocally

stated, if the user does not understand what he/she is being asked, (e.g., if the person taking the

test does not fully know what introversion / extroversion means), by definition, it would then be

impossible to show bias. In the case of this investigation, depending upon what was being

flashed on the screen, those adolescents who were clueless or confused cannot have any bias. In

short, if test-takers have a heightened introspective access to implicitly assessed representations,

that possibly would also affect their responses on the IAT. In an absence of this, confusion would

trump any bias being looked for.

Related to the second possible reason this thesis gives, Hofmann’s meta-analysis used the

words “lack of introspective access to implicitly assessed representations” to describe how

a participant’s possible lack of understanding of the nature of the IAT test might influence his or

her responses (Hofmann, et al., 2005, p. 1369). Hofmann’s team, however, approached this point

without a specific group, (i.e., adolescents) in mind. For example, would the responses of an

academician who had been working in the civil rights movement for 30 years, or a feminist who

has dedicated her or his entire adult life to the cause be different from an individual who had


great sympathies with regard to these social issues but only a modicum of knowledge in these

areas?

As a possible third reason for the mixed results, this thesis puts forth the contention that a

new generation of studies now points to the position that implicit cognition is highly malleable.

Until quite recently, most researchers assumed implicit cognition to be “stable, rooted, and easily

produced” within each unique individual (Shepard, 2011, p. 122). Accordingly, as an unspoken

assumption, IAT researchers have maintained that if the IAT could meet its stated goal of

eliminating social desirability generated by explicit self-report assessments, then it would be

possible to mine the pure ore of our true (subconscious) personality (Blanton & Jaccard, 2008).

In contrast, Shepard argues the opposite is probably true. That is, the “malleability in the

activation of implicit associations should be interpreted as the default state instead of as

an exception. In other words, patterns of cognitive associations are properties less of

people than of the interaction of people and situations” (Shepard, 2011, p. 123).

A decade before Shepard’s publication, other researchers were studying how

fundamentally important an individual’s general environment and culture, as well as

personalized situational variables, were to the activation of implicit cognition. Bargh (1997)

argued “much of everyday life—thinking, feeling, and doing—is automatic in that it is driven by

current features of the environment (i.e., people, objects, behaviors of others, settings, roles,

norms, etc.) as mediated by automatic cognitive processing of those features, without any

mediation by conscious choice or reflection” (Bargh, 1997, p. 57).

In 2001, Karpinski and Hilton found that an IAT showed an average preference for

apples vs. candy bars – something that was not observed in an explicit measurement of

preference between these two edible items. With regards to these results, the researchers


commented, “In our society, there are an abundance of positive associations and virtually no

negative associations with apples. For candy bars, however, the messages are much more mixed”

– thereby suggesting that the Implicit Association Test may be influenced by environmental

associations (Karpinski & Hilton, 2001, p. 783). To further test this notion, the researchers presented

participants in their study with a large number of word pairs. They paired the word “elderly”

with various negative items and “youth” with positive terms and vice versa. When the pairings

were reversed (i.e., when “youth” was paired with negative items), the extent of the preference for

youth declined significantly. Contrarily, explicit measures of the same exercise did not change

(Karpinski & Hilton, 2001).

Drawing on the work of Karpinski and Hilton and others, Russell Fazio and Michael Olson

have become two of the greatest critics of the IAT. In the first of their four studies, they asserted

that even the mere assertion that an implicit attitude is a phenomenon which is stored differently

in memory than are explicit attitudes – and that implicit attitudes can be called a construct – has

no basis in empirical evidence (Fazio & Olson, 2003). Additionally, they have argued that the

role of society factors into implicit cognition and its accompanying associations in a central way.

Calling these associations “extra-personal,” they have maintained these associations originate in

large part, as a result of “widespread societal associations that people remember and draw upon

in completing the IAT - thereby contaminating the Implicit Association Test as a personal

measure” (Fazio & Olson, 2004, p. 654). Fazio and Olson further state that the IAT is marinating

in “environmental associations—culturally shared but not necessarily individually accepted”

(Fazio & Olson, 2004, p. 654). Along with other researchers, Olson and Fazio make note that the

distinction between personal attitudes and societal associations is not at all clear. As such, both


research teams have concluded that what the traditional IAT measures, at least in part, are associations

that have been socially learned (Olson & Fazio, 2003, 2004; Jost et al., 2009).

Regarding this point, as part of one of these previously mentioned studies, Olson and Fazio

(2004) conducted four experiments with the traditional IAT (i.e., measuring attitudes toward

race). In doing so, they introduced their “personalized IAT.” That is, instead of using the more

blunt terms of the traditional IAT such as “good” or “bad,” they used terms such as “I like” and

“I don’t like.” According to Fazio, this variant of the IAT would “reduce the contamination of

these extra-personal associations” (Olson & Fazio, 2004, p. 653). When the experiments were

complete, the researchers posited, “the ‘personalized’ IAT revealed relatively less racial prejudice

among Whites in Experiments 1 and 2. In Experiments 3 and 4, the personalized IAT correlated

more strongly with explicit measures of attitudes and behavioral intentions than did the

traditional IAT” (Olson & Fazio, 2004, p. 665). The researchers continued,

At the risk of appearing overly skeptical, we encourage caution in interpreting the results

of research using the traditional version of the IAT. To the extent that the measure is

being used in a domain that involves extra-personal associations, the IAT may not reflect

individuals’ attitudes as much as is desired. Like any lens, the IAT appears to color its

contents. The more personalized version of the IAT that we have examined in the present

research focuses the IAT on more personal associations. This more precise focus may

provide a stronger basis for interpreting the scores and their meaning (Olson & Fazio,

2004, p. 665).

Over the decades, among the greatest concerns Fazio and Olson have had with the IAT

is priming. In psychological testing, concepts are most often primed by explicit and external

stimuli. In other words, concepts that have been associated with a primed construct are activated


by our memories. For example, whether someone was a sports fan or not, if after having walked

to work, they were to be asked to list symbols of various sports teams, they might suggest

“eagle.” Why? Because they might have heard the chirping of robins on the way. That is,

consciously or not, the robins might have primed the concept of bird which then enabled ‘eagle’

to come into one’s thoughts.

Before the development of the IAT, in a well-known study on how our external

environments affect implicit cognition, Bargh et al. (1996) primed participants with stereotypes

involving the elderly. Without participants’ knowledge, they timed how long it took those who

had been primed to walk down the hall. Compared to those who had not been primed, the

experimental group took significantly longer to walk down the hall, seemingly fulfilling the

stereotype of elderly people being slow. These types of effects of priming have been observed

in a number of behaviors (Bargh & Chartrand, 1999; Dijksterhuis & Bargh, 2001; Gollwitzer

& Bargh, 1996; Hassin et al., 2005; Oyserman & Lee, 2008; Wegner & Bargh, 1998;

Wheeler & Petty, 2001; Wilson, 2002).

Once the IAT came into being, however, the problem for Fazio and Olson was that

almost no direct research had been done on priming as it related to implicit cognition

and the IAT. As such, Fazio and Olson have been unreserved in their criticisms on this

point, stating, “In contrast to priming measures, the IAT has little to do with what is

automatically activated in response to a given stimulus. Although IAT effects are often

referred to as ‘automatic preferences,’ this use of the term automatic appears to have a

very different meaning than it does in the context of priming procedures” (Fazio & Olson,

2003, p. 315).


Recently, however, studies have surfaced involving the IAT that appear to support Fazio

and Olson’s position on this point. In 2002, Rudman and Lee conducted an experiment that

involved administering the IAT to Whites who had been exposed to either violent rap music or

popular music. The researchers found that those participants who were exposed to the violent

rap music had more implicit stereotypical negative associations with Blacks on their IAT than

the control group, which was presented with popular music. Noteworthy was the fact that

participants’ levels of prejudice had already been controlled for before the experiment began

(Rudman & Lee, 2002).

While Fazio and Olson were researching situational priming, others were investigating

priming that was more on-going as well as cultural. Dasgupta and Asgari (2004) examined the

scores of gender-based IATs of two groups of women college students, all of whom had long-

term exposure to women in leadership positions. One group was students at a women’s college,

while the second group was students at a co-educational college. The women at the women’s

college lived in an environment that involved less of an association between themselves and

that of being a follower, whereas the coeds did not. After one year at their respective colleges,

their scores on the IAT differed significantly. The authors of the study concluded that, for the

students who attended the all-women’s college, simply having close associations with

women leaders living amongst them was the mediating factor in the undermining of implicit

stereotypes (Dasgupta & Asgari, 2004).

Among the most vocal critics of the IAT are academicians Philip Tetlock and Hart

Blanton. Tetlock is a Professor of Psychology and Management at the University of

Pennsylvania, while Blanton is a psychologist at Texas A & M University. Interviewed for a

New York Times article on the IAT, Tetlock commented, “there isn’t even that much consistency


in the same person’s scores if the test is taken again” (Tiernay, 2008, p. D1, NYC ed.). From a

similar interview in the Wall Street Journal, Blanton added, “One can decrease racial bias scores

on the I.A.T. by simply exposing people to pictures of African-Americans enjoying a picnic”

(Wax, 2005, p. 5).

Shepard’s earlier-mentioned paper has been well received as she has laid out a

compelling case for the position that implicit cognition is impacted by a society’s broad culture

as well as by an individual’s immediate environment. She has acknowledged, however, that

while the context in which implicit cognitive associations are activated is a challenge to study,

they also cannot be understood independently of that context. Toward the end of her essay,

Shepard suggests that the IAT may be invalid because in the end, it is testing what one’s culture

is suggesting. For example, if individuals have lived all their lives in an egalitarian society, the

notion that other societies, (e.g., the Taliban in rural Afghanistan) treat large segments of their

population as second-class citizens can be repugnant to the core of their (implicit) beings.

Therefore, any test that is so easily influenced by such things as surrounding culture and/or

impacted by events prior to taking of the test, by definition, does not measure deep-seated

thinking. When that criticism was put to two of the original developers of the IAT, both

Greenwald and Nosek agreed that the IAT is affected by external influences, but held that this is part

of the measurement. Said Nosek, “In my view, implicit associations are the sum total of

everyday associations” (Azar, 2008, p. 44). Shepard, meanwhile, is unequivocal in her

conclusions that implicit cognition is highly malleable. Toward the end of her paper, she

commented that, “Humans hold in our memories, multiple, rich, and diverse representations of

people, groups, and institutions. Whether we can recognize them and consciously articulate them

or not, is the question” (Shepard, 2011, p. 135). Swidler suggested, “We know more culture than


we use” (1986, p. 29). It may also be the case that we use more culture than we know, in ways

under our awareness and outside of explicit control.

Given the relatively new research on the stability / instability of implicit cognition, the

fact that the meta-analysis of Hofmann et al. listed as problematic “factors influencing the

retrieval of information from memory” seems, in retrospect, to be a few years ahead of its time

(Hofmann et al., 2005, p. 1369).

As supported by research to be described shortly, this thesis offers a fourth possibility for

the mixed and/or low correlations between the IAT and self-reported assessments. Namely, there are,

on several levels, stark differences between these two types of instruments. These differences involve

both the administration and the scoring of them.

Before proceeding, however, it is advisable to refresh the reader on some points. Implicit

cognition processes are automatic, effortless, fast, associative, and in theory, not accessible to

introspection. Explicit cognition processes are effortful, slower, deliberative and consciously

monitored (Kahneman, 2003).

Both the MBTI and MMTIC self-report assessments ask their questions primarily by giving

“a” or “b” choices between two situations or scenarios. For example, on the MBTI – Form G,

which has 123 questions, the instructions read: “Please choose the one that appeals to you: a) I

can play at any time or b) I must finish my work before I can play” (Myers, I., 1998/2003, p.

142). On the current 54 question edition of the MMTIC, number 34 reads: “People do better

when they: a) know someone cares about them or b) know the rules” (CAPT, 2011). In both the

MBTI and the MMTIC, there are no right or wrong answers and in the process, participants are

allowed to take as much time as they need.


The IAT is gauged for accuracy and is timed. When the participant chooses a “wrong”

answer, a large red “X” appears on the screen and does not go away until the individual presses

the “correct” key, i.e., either the “a” key or the “k” key. See Figure 4. In addition, the IAT

consists of four modules with 180 trials in each and all questions must be answered. There are

built-in breaks or pauses between each module. All four modules consist of both practice and test

questions, but participants do not know which ones they are. By the time the participant

completes the IAT, he or she will have collectively pressed either the “a” key or the “k” key 720

times (4 modules x 180 key presses). Another difference between the IAT and the MBTI/MMTIC

is that each IAT module is aimed at one personality facet at a time, (e.g., extroversion /

introversion). On the MBTI/MMTIC tests, questions from all four personality facets are

uniformly distributed throughout the test.

As previously mentioned, scores on the IAT are calculated based on latency responses

(delayed responses) on four-word stimuli (called double discriminants) and not the two-word

stimuli. It is important to note that included in every double-discriminant screen are the two

words “self” and “other.” The two-word stimuli are inserted only for the purposes of keeping the

participants primed in the sorting routine, but participants are not told this (See Appendix I, The

Implicit Association Test sample questions). These double discriminants appear on the screen in

rapid fire sequences. Longer response times on incongruent answers, (e.g., “extrovert” and

“quiet”) are subtracted from the congruent answers, (e.g., “extrovert” and “talkative”). In this

example, the premise is that “extrovert” and “talkative” will be far more strongly encoded than

an association between “extrovert” and “quiet.” Therefore, the reaction times (pressing the “a”

key or the “k” key) will be shorter for the more socially congruent combination. In this case,

socially congruent means prevalent assumptions within society. In addition, throughout the IAT,


there is pre-announced left-right screen-switching of categories. With this description of how

the IAT is physically taken, the reader can now appreciate criticisms of this process.
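
The latency subtraction described above can be illustrated with a minimal sketch. It assumes a simplified score, namely the difference between mean response times on incongruent and congruent pairings divided by the pooled standard deviation, and it is not the scoring algorithm used by the IAT software in this study. The function name, the sample latencies, and the sign convention (positive when incongruent pairings are slower) are all hypothetical.

```python
from statistics import mean, stdev


def latency_difference(congruent_ms, incongruent_ms):
    """Return (raw, standardized) latency difference for one participant.

    Assumption: positive values mean the incongruent pairings
    (e.g., "extrovert" with "quiet") took longer than the congruent ones.
    """
    # Raw difference between the two mean response times, in milliseconds.
    raw = mean(incongruent_ms) - mean(congruent_ms)
    # Standard deviation across all trials from both conditions.
    pooled_sd = stdev(list(congruent_ms) + list(incongruent_ms))
    return raw, raw / pooled_sd


# Hypothetical response times (ms) for five trials in each condition.
congruent = [600, 650, 700, 620, 680]
incongruent = [800, 850, 900, 820, 880]

raw, standardized = latency_difference(congruent, incongruent)
print(raw, round(standardized, 2))
```

A score of this kind records only that one pairing was answered more slowly than the other; it carries no information about why the delay occurred, which is the point taken up in the criticisms that follow.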

Fazio commented that the IAT is “noisy and complex and there’s no way to determine

whether it's measuring unconscious attitudes or simply associations picked up from the

environment” (Azar, 2008, p. 44). Hart Blanton, a psychologist at Texas A & M University, has

multi-faceted concerns with it. He has published several papers stating so, and his greatest criticism

concerns how the IAT is scored; he has said that the IAT deserves an “F” in psychometrics. He also

takes the original IAT developers to task for “arbitrarily changing the scoring system in recent

years” (Blanton et al., 2009, p. 582). Another of Blanton’s concerns about the IAT’s scoring system is

how it measures just one or two word (mostly abstract) concepts – not people.

Some years before, however, De Houwer (2001) had already gone further. He said, “The

data of an experiment that was designed to test these accounts showed that IAT effects reflect

attitudes toward the target concepts rather than attitudes toward the individual exemplars of those

concepts” (De Houwer, 2001, p. 443). Later in his paper, De Houwer called many of these

concepts “irrelevant” when it came to describing personality characteristics (De Houwer, 2001, p.

445). Meanwhile, Blanton was recently quoted as saying,

Unbeknownst to respondents who take this test, the labels given to them were chosen by

a small group of people who simply looked at a distribution of test scores and decided

what terms seemed about right. There's not a single study showing that people above and

below that cutoff differ in any way based on that score. This is not how science is done

(Azar, 2008, p. 44).

MIT psychologist Michael Norton agrees. “Measures of unconscious prejudice are

especially untrustworthy predictors of discriminatory behavior. There is virtually no published


research showing a systematic link between racist attitudes, overt or subconscious, and real-

world discrimination” (Tiernay, 2008, p. D1). In a new paper, De Houwer seemed to agree,

saying, “In the absence of strong empirical evidence, one should refrain from making statements

about how a measure should be interpreted, how it works, or whether it is implicit. Without a

basic level of theoretical understanding of the measures, there is little ground for predicting when

a measure will be related to which kind of behavior” (De Houwer et al., 2009, p. 364).

After this researcher’s study was complete, other issues concerning IAT methodologies

first manifested themselves via comments during debriefing sessions. Mentioned earlier was the

fact that only 103 participants out of 261 possible students actually completed all three

assessments. An additional 45 students had their data discarded because they had only completed

either one or two, but not all three of these assessments. Perhaps another indicator of a lack of

involvement with regards to the testing process was the fact that 40 additional students’ data had

to be purged from their IAT results for “key-banging.” Again, the “key-banging” exclusion was

done according to the formula set by the developers of the IAT (Greenwald et al., 2003). During

these debriefing sessions, of those 103 participants who completed the full battery of testing,

there was a pattern of comments that were particularly critical of the IAT but almost entirely

non-critical of the MBTI and the MMTIC. On the IAT, with regard to the switching from two-word categories

to four-word categories as well as finding words that were formerly located on the left side of the

screen now suddenly appearing on the right side of the screen, one participant’s comment seemed

to be representative. He called it “game-show trickery.” Another student commented that the

IAT is a “word game” that she experienced in early elementary school. She remembered her

elementary school teacher trying to trick the class by asking: What color milk comes from a

brown cow? “And they are going to tell what kind of personality I have from this? umm….I


don’t think so. This whole thing was a bit on the silly side,” she said. Others freely commented

about the tediousness as well as being mentally exhausted by their taking of the IAT (T.

Marselle, personal conversation, June 17, 2013, see Figure 6). While these participant comments

are anecdotal, they do bring to mind psychological constructs that have been studied on an

empirical level. With regards to the 103 participants who completed all three segments of the

study, this researcher asks the following questions: to what extent was participant apathy a part

of response acquiescence; mental exhaustion a part of response fatigue; and frustration over

switching categories a part of the lack of cognitive ability to adjust to this switching? Han and Olson

(2006) found that in a race IAT, switching from two-word categories to four-word categories and

having words that were formerly located on the left side of the screen now suddenly appear on

the right side of the screen, biased results in favor of the first category pairing (e.g., pairing

"Asian" with positive stimuli first, instead of pairing "Asian" with negative stimuli first) (Han &

Olson, 2006). Known as “cognitive inertia” or “IAT effects,” industrial-organizational

psychologists studying the IAT’s use in marketing consider this phenomenon a possible validity

problem. Cognitive inertia refers to switching difficulties between categories, which in turn

causes the IAT results to depend upon the order of the presentation of the various IAT modules

(Messner & Vosgerau, 2010). In their study, the researchers concluded that, “Cognitive inertia

distorts individual IAT scores and diminishes the correlations between IAT scores and predictor

variables when the block order is counterbalanced between subjects but that counterbalancing

the block order repeatedly within subjects can eliminate cognitive inertia effects on the

individual level” (Messner & Vosgerau, 2010, p. 374). On this point the authors concluded that if

IAT scores are to mean anything at an individual level, there should be repeated

counterbalancing regarding the module order (Messner & Vosgerau, 2010).


This researcher concedes that while it is understandable these devices may have been built

into the software to keep participants focused, it seems this intent may, in some cases, have had

the opposite effect. Might this be especially true of adolescents? With reference to this particular

study, the researcher openly asks: given all the “noise and complexities” that seem to be

part of the ambient nature of traditional IATs (Fazio, 2008), how many of these 103 participants’

data sets would have been placed into the “key-banging” group if the criteria for doing so were

expanded?

The IAT is a test that measures rapid response vs. latency response and no more. Drawing

from a small reservoir of 40 words, participants must make 720 choices from either two-word

screen flashes or four-word screen flashes. Add to this mix the fact that the IAT is both timed

and gauged for accuracy, thereby adding mild stress or pressure, and the result may be problematic in the

sense that it adds to the “noise and complexity” of the test. Further, IAT participants are being

scored by an algorithm devised by a select group of talented and respected academicians,

who, by way of the software, arrive at a conclusion that there is a bias being shown. In the

eyes of the IAT, any latency response is tantamount to a bias. At most it may be relative latency,

but it seems to go no deeper than that.

Given the above, this researcher takes the position that drawing broad conclusions

regarding biases (e.g., race, ethnicity, gender, politics, consumer preferences, or personality

characteristics) is a concern. This researcher posits that the only thing the IAT can conclude with

any degree of assurance is, “there is a latency response here.” Accordingly, the first question

beyond that is: exactly what does this latency response mean? To start, there appears to be a

robust debate as to whether latency responses truly reveal implicit attitudes.


Evidence has been presented in this paper that when dealing with one-dimensional topics

which also tend to polarize into “good” and “bad” categories (e.g., race, ethnicity, gender, and

religious bias), the IAT can be helpful (Fazio & Olson, 2003). In addition, topics which are

socially sensitive and thus have a greater likelihood of being accompanied by overt bias are the

equivalent of low-hanging fruit. These overt biases have a greater chance of being detected and

relegated to opposite ends of the spectrum. This researcher believes this to be true even with the

preponderance of evidence regarding Shepard’s papers on culture, Fazio and Olson’s

extrapersonal associations and Karpinski and Hilton’s cultural knowledge, social influences and

media information (Fazio & Olson, 2003; Karpinski & Hilton, 2001; Shepard, 2008).

However, as soon as the IAT attempts to delineate between mundane things like consumer

preferences, or measurements that involve subtlety, abstractness, complexity of issues, or any

of these phenomena in a dynamic arrangement, the IAT shows its weaknesses. In a hypothetical

example, if a person took an IAT that has been specially designed for MBTI-like or Big Five-like

personality descriptors, and was given time to think about what answers to choose, she or he

might think in the following terms: I’m an extrovert / introvert…nothing wrong with that. Or…I

am a big-picture person (intuitive) / detail-oriented person…nothing wrong with that. Or…I

am a person who likes to bring closure to situations / likes to be open-ended about things…

nothing wrong with that, etc. The point is that in life’s ordinary situations, none of these patterns

of thinking would be of any consequence. On the other hand, being racially or gender biased is

indeed significant in the world at large. However, being an introvert or a procrastinator is not.

This researcher further asserts that, as opposed to trying to measure a one-dimensional,

polarizing, and socially sensitive topic, the more subtle, complex, abstract, and dynamic a topic


is, the more likely it is that confounding factors will weaken any correlation

between the results.

Metaphorically speaking, at the end of the day, what the IAT is trying to do is measure a

very complex thing with a flat, straight-edged household ruler. Not in the sense that it is a blunt

instrument, but in the sense that it is a one-dimensional instrument. Personality is a complex,

three-dimensional, and dynamic phenomenon. It is not sufficient to use a one-dimensional tool to

measure a three-dimensional object, which not only happens to be moving at the same time, but

also has an ever-changing orbit in relation to other forces. To attempt to measure how a person reacts

to 720 rapid-fire, two-word or four-word screen flashes which originate from a bare-bones list of

40 words chosen without extensive field testing by the makers of the test, while being timed and

gauged for accuracy, and then to draw conclusions about major components of that person’s

personality, is not good science. In an effort to tap into our implicit (subconscious) thinking, the

creators of the IAT have purposely made their instrument simple and speedy. In doing so, the

IAT has come away from its intended moorings. In their attempts to simplify the relationship

between test participants and their personalities, they lose the opportunity for correlation.

With regard to self-report inventories such as the MBTI and Big Five, the vast majority of

the questions on these assessments, as stated, are aimed at various situations and scenarios that

are personalized in their wording. For example, the MBTI might ask: “Which do you prefer: a

loud party or a quiet conversation with a friend?” The point is, unlike the IAT, where “quiet” and

“loud” are the totality of what is being flashed on the screen and asked of the test-taker –

thereby oversimplifying the question in the process – the MBTI and the MMTIC want to know about

you. They have taken hundreds of questions and decades of research to parse out meanings of

words. In turn, they present descriptive situations that a person can emotionally, cognitively, and


intellectually associate with – in order to see the extent to which, and the direction in which, these situational

questions pull. As the creators of the MBTI, MMTIC and the Big Five would tell you, self-report

questions have been refined to the point where there is not much ambiguity in what each

question is asking. Even with all the concerns over self-report inventories (e.g., social

desirability), their questions have been crafted with a focus on measuring what they are

intended to measure (e.g., a very specific thing such as extroversion).

The IAT, on the other hand, has said, not only “in effect” but in this case literally, that

these words are the introverted words and these words are the extraverted words. And there lies

another weakness of the IAT. By design, the IAT demands a spontaneous (as quick as possible)

response. In doing so, the IAT cannot handle topics which are complex, abstract and/or which

are highly situational and therefore inherently require introspection, reflection, and thought. For

example, with the abstractness of introversion / extroversion, no two individuals would ever

define these two concepts in the same way. Therefore, by definition, the IAT cannot test for them. In

retrospect, Hofmann’s meta-analysis seems to resonate again when it lists “method-related

characteristics of the two measures” as its fourth possibility for the regularly low correlations

between the IAT and self-reported assessments (Hofmann et al., 2005, p. 1369).

Within this strand of criticisms, one new methodology that has received attention since its

introduction has been a question-based Implicit Association Test (qIAT). Developed by Iftah

Yovel and Ariela Friedman and introduced in January of 2013, the qIAT represents an attempt to

hybridize the traditional question / situation-based self-report with the traditional IAT – where

single words seek to evoke a response. In the researchers’ words,

Assessment in many self-report questionnaires is based on respondents’ ratings of the

extent to which each of a set of short statements is true for them. The qIAT was designed


to closely resemble these procedures. Compared to other implicit assessment paradigms,

this task uses more complex and elaborated semantic stimuli, and it therefore enables the

measurement of a broader range of psychological phenomena (Yovel & Friedman, 2013,

p. 79).

In the only study to date involving the qIAT, Yovel and Friedman tested 88 undergraduate

students. The protocol was to have test participants take the Big Five (Five Factor Model)

personality test. Several days later, they were given the qIAT which was designed around the Big

Five descriptors. With the qIAT, some key facts remained the same. There were still absolute

(right and wrong) answers, and it was timed. For examples of the questions that were put to the

participants, as well as correlations between the explicitly measured extraversion items and the

explicit and implicit measures of extraversion, see Table 5.

Depending on the origin of the comment, the results of this study were either disappointing

or, in the view of the research team, encouraging: “The present results indicate that an implicit assessment that

is based on the original items of standard self-reports is feasible” (Yovel & Friedman, 2013, p.

79).

From a different vantage point, including that of the author of this paper, these results are quite

similar to consistent findings across the numerous implicit assessment studies which involve

two- or four-word stimuli (i.e., the traditional IAT; Hofmann et al., 2005). Simply stated, Yovel and

Friedman have not departed in any substantial way from the methodology of the traditional IAT.

That is, they were still trying to measure response time to obviously true questions or to

obviously false questions, all of which required little or no response delay. Therein lies the

continued flaw with the qIAT. Questions / situations (see previous tables) such as “I am the life of

the party. I feel comfortable around people” place many people in scenarios in which they are not


sure they can picture themselves. A self-report, on the other hand, would

have a tendency to ask, “Which of these two would you prefer?” or “Which of these represents the

best reflection of yourself?” In summary, the qIAT is a minor upgrade from the traditional IAT,

but not much more. It is still asking participants to associate sentences with words – not

individual people. As such, a fundamental limitation of the qIAT is that, like the original

traditional IAT, it is still a one-dimensional creation. Like the traditional IAT, it is still trying to

get participants to gravitate to opposite ends of the spectrum. A respondent is put in the position

of saying yes or no (i.e., extravert / introvert). In a previously mentioned example of a

typical MBTI self-report question / situation: “I can play at any time” vs. “I must finish my work

before I can play,” there are two different scenarios, neither of which represents opposite ends of

a spectrum. The qIAT respondent would have to make a quick decision: “I am the life of the

party.” vs. “I am not the life of the party.”

Methodology aside, it is statistically noteworthy that in Yovel and

Friedman’s study, qIAT implicit extraversion scores were correlated, although weakly,

with the Big Five explicit extraversion scores but not with other explicit scales. This is a pattern

that is almost universally true when implicit and explicit personality assessments are compared

and contrasted. This point will be discussed in the next section.

As its final possible reason for the mixed and/or low correlations between the IAT

and the MBTI and MMTIC, this thesis paper agrees with a conclusion advanced by the meta-

analysis of Hofmann et al. Hofmann and his research team called it a “complete independence of

underlying constructs” (Hofmann et al., 2005, p. 1369).

In the main body of their study, Hofmann’s research team were quick to take a position

on this possibility saying, “The two measures are systematically related but that higher order


inferences and lack of conceptual correspondence can reduce the influence of automatic

associations on explicit self-reports. Explicit-implicit correlations were higher for affective as

compared to cognitive and for relative as compared to absolute self-report measures” (Hofmann et

al., 2005, p. 1380). Later in their findings, Hofmann and his fellow researchers said:

More importantly, we found a reliable increase in correlations as a function of increasing

spontaneity in the course of making an explicit judgment. This finding is consistent with

the assumption that implicit measures primarily reflect automatic associations, whereas

explicit self-reports depend on the effortful retrieval of information from memory

(Hofmann et al., 2005, p. 1382).

More specific in its relevance to this thesis is a study mentioned previously in this paper.

In 2008, Schmukle, Back and Egloff were among the first to show that the structure was there for

a comparison between the traditional explicit self-report Big Five personality

assessment and a specially designed Implicit Association Test version with the same Big Five

descriptors. Their study showed that it was possible, at least in theory, to compare “apples to

apples” and “oranges to oranges” (Schmukle et al., 2008). However, what was not commented

on earlier were the actual results of this study.

Interestingly, the results obtained by Schmukle et al. were almost identical to the results in

this study. Correlations between corresponding implicit and explicit personality dimensions were

low, with the highest correlations being observed for Extraversion (.32) and

Conscientiousness (.26), and the average being .13 (Schmukle et al., 2008). This meant that the mean

correlation for all Big Five personality facets was actually lower than the .24 mean of Hofmann’s

meta-analysis. In their conclusions, the authors of the study commented that, of the five

personality dimensions, studies have shown that the most easily observable are extroversion and


conscientiousness, hence the highest correlations (Borkenau & Liebler, 1992). Particularly

noteworthy, they also speculated that, “it may well be that the structure of the implicit

personality self-concept is less sophisticated and less complex than the explicit personality

self-concept” (Schmukle et al., 2008, p. 264). The reader is reminded this is exactly the point the author of

this thesis has repeatedly made throughout this discussion.

The reader is also reminded that the Big Five’s facet of Conscientiousness has repeatedly

been strongly correlated with the MBTI’s Judging / Perceiving dichotomy (Johnson &

Sanders, 1990; McCrae & Costa, 1989a; McCrae & John, 1992).

Current Status of the IAT

It is one thing to try to measure the implicit attitudes of polarizing social constructs

such as race and gender, but a substantially different matter to attempt to accurately measure

other complex constructs like conscientiousness. Accordingly, most industrial-organizational

psychologists have shown reluctance to utilize the IAT because of their psychometric concerns

regarding validity. On this point, Stuttgen et al. argued “that the IAT’s reliability, and thus its

validity, strongly depends on the particular application (i.e., which attitudes are measured,

which stimuli are used, and the sample). Thus, whether a given application for a given sample

will achieve sufficient reliability cannot be answered a priori” (Stuttgen et al., 2012, p. 2).

Currently, among industrial-organizational psychologists, the IAT is not considered to

be useful (Haines & Sumner, 2006). Landy (2008) was particularly critical of the IAT’s

methodologies, stating, “My review of the literature suggests that both stereotyping and

IAT research study designs are sufficiently far removed from real work settings as to render

them largely useless for drawing inferences about most, but not all, forms of employment

discrimination” (Landy, 2008, p. 379).


Also, with the developers of the IAT having stated that the main goal of their instrument is

to bypass our explicit (conscious) higher-order thinking system in order to access our implicit

(unconscious) thinking, other fundamental questions arise. If it were possible to access the holy

grail of psychology (i.e., the inner world of human beings that is not directly observable), would

that lay bare our “real” emotions, attitudes, biases, motives, beliefs and subsequent personalities?

Should cognitive neuroscientists and personality theorists consider the study of personality to be

only that phenomenon which lies deep beneath the surface – and in the process, cast aside the

evolutionary process? In determining our “real” biases, and by extension, the integral parts of

our personalities, are we the same primordial beings today that we were 40,000 years ago or do

we “get credit” for the prefrontal cortex which we have brought along with us?

As recently as February of 2013, a Yale research team concluded that clear thinking and

good mental health depend on slow-firing neurons. Senior author and professor of neurobiology

Amy Arnsten said, “Insults to these highly evolved cortical circuits impair the ability to create

and maintain our mental representations of the world, which is the basis of higher cognition.

High-order thinking depends upon our ability to generate mental representations in our brains

without any sensory stimulation from the environment. These cognitive abilities arise from

highly evolved circuits in the prefrontal cortex” (Arnsten, 2013, p. 6).

Arnsten further commented that there is little doubt that the breakdown of this system is a

contributing factor in Alzheimer’s disease. Further evidence toward this conclusion was provided

by Xiao-Jing Wang, a neurobiologist at New York University. Wang constructed mathematical

models which predicted that in order to ‘hold-on’ to these visual representations, the prefrontal

cortex depends upon maintaining a family of receptors which facilitate a slow and steady firing

of neurons. These conclusions are quite noteworthy because the Yale scientists had been


studying these glutamate signaling receptors for a decade. Now, they have identified them as

regulating neuronal firing. They are the NMDA-NR2B receptors.

At this point in this implicit/explicit investigation, the implicit portion of the discussion

seems to be occupying the majority of the focus. A legitimate question here is whether the

explicit / conscious part of one’s thinking process can override the ‘biology made me do it’ mantra

of implicit thinking. Paul Bloom, head of the Infant Cognition Lab at Yale, made a point that

Piaget’s Theory of Mind has not yet set in when he said,

Young children are obsessed with social comparison. They don’t care about fairness.

They want relatively more, also known as the ‘Gimme’ syndrome. But a funny thing

happens as children grow older. At about 8 years old, [when given a choice in a division

of resources] they start choosing the equal / fair option. They become generous…..chalk

one up for society. It’s ‘a gorgeous finding’. They become educated, acculturated. So

we can learn to temper some of those nasty tendencies we are all wired for: selfishness,

bias, etc., but the instinct is still there. [However], when life becomes difficult, and we

are stressed and under pressure, we regress to our younger selves. Depending on the

amount of stress at hand, we alternate throughout life. In the end, we end up where we

began. We go from selfishness to altruism, to justice, to altruism to bigotry to kindness to

prejudice. But the important thing is, virtue can be taught (60 Minutes, CBS News,

2012).

As repeatedly stated by the developers of the IAT, the main purpose of the Implicit

Association Test is to detect and measure something about individuals that is supposedly

unknown to them. By definition, the IAT is trying to measure things going on in the

subconscious that are supposedly inaccessible to anyone. Therefore, to say the IAT is measuring


these things is both counter-intuitive and illogical. This researcher takes the position that in their

current versions, both the traditional IATs, as well as those recent versions which have been

specially designed with MBTI, MMTIC and Big Five-like personality descriptors, are all making

unwarranted cognitive leaps in their conclusions. As Northwestern University social psychologist

Dr. Alice Eagly said, “The IAT adds something, but it’s not a direct line to the unconscious”

(Azar, 2008, p. 44). This researcher is being a bit more candid in the sense that he believes the

IAT represents the “twitterization” of personality testing. Unlike the explicit self-reports

exemplified by the MBTI and MMTIC, in which situations and scenarios are put to the participant

taking the test, the IAT gives participants bare-bones two-word or four-word choices. In one of

their most recent papers on the IAT, Blanton et al. come close to saying that the IAT is invalid.

Perhaps the title of their paper suggests this point: “Strong Claims and Weak Evidence:

Reassessing the Predictive Validity of the IAT” (Blanton et al., 2009).

Concluding Thoughts

As constructed for this one study, the IAT might be useful in ferreting out extreme biases

regarding good and bad on societally sanctioned (socially sensitive) topics, but not much more.

By purposeful intent, the IAT measures quick and simple associations whereas the MBTI and

MMTIC attempt to measure more personalized descriptions of situations and scenarios. The

complexity of what each is trying to measure is quite different.

Defenders of the IAT might be quick to say that self-report assessments like the MBTI,

MMTIC and Big Five also have small parts of their instruments which are two-word adjective

check-lists. Supporters of self-report inventories would agree, but would be quick to say these

adjective check-lists represent a small percentage of their inventories. Their primary response,

however, would be that the big difference is that self-report assessments give one time to think.


The researcher’s conclusion regarding the IAT is that it does not lend itself to complex

scenarios which have high emotional content, such as personality measurement. This seems

especially true when the IAT is taken by adolescents.

From the scientific community, perhaps a point made by Greg Mitchell, a former social

psychologist who is now a University of Virginia law professor, is noteworthy. Mitchell recently

said, "The IAT is not yet ready for prime time. I think this research is important research and the

people doing it are very good scientists with noble intentions. But noble intentions don't make

good public policy" (Azar, 2008, p. 44).

Future Direction and Research Involving the IAT

As this thesis draws to a close, pertinent questions regarding the practical and theoretical

implications of its findings should be addressed. Originally, the author predicted significant

correlations between the IAT dichotomies and their MBTI and MMTIC counterparts. The results

were that bivariate correlations between E/I and S/N were significant, while the T/F and J/P were

not significant.

Theoretically, what does this mean? The researcher’s answer is as follows: Had the

results turned out to show statistically significant correlations, and thus convergent validity in all

areas (all eight hypotheses), the IAT would have represented a welcome step in the direction

toward becoming a new and different tool to assess personality. This is because research

showing a multitude of weaknesses of the self-report method is well-grounded and is described in

detail in this paper.

From a practical standpoint, with 50% (four out of the eight bivariate hypotheses) being

significantly correlated, it can be stated that greater evidence of convergent validity may come

with further iterations of the IAT design. That is, it may come with other, more refined word choices; with an IAT that


is less “noisy and busy”; and with giving the IAT only to those who are familiar with the IAT

format. Specifically, for reasons documented in this paper, in choosing a participant population,

avoiding adolescents, at least for now, is probably important. Only test subjects familiar with the

format and requirements of IAT-type tests, as well as possessing some modicum of life

experience, should be considered viable participants. Further, another consideration might be a

statistical analysis of the IAT word choices themselves, as well as an analysis of variance

(ANOVA) in the responses to the double-discriminant word pairings. More attention, of course,

should go toward the two dichotomies that were shown not to be significant (i.e., the T/F and the

J/P). Finally, in future studies involving comparison of implicit and explicit personality

assessments, a more sophisticated statistical analysis should be considered. In addition to an

ANOVA, researchers may want to consider Fisher’s maximum-likelihood estimation (MLE) test,

which estimates the parameters of a statistical model (Fisher, 1925). A point worth noting here

is that the threshold which determined whether a correlation was significant in this study was .21.

Beyond the four bivariate hypotheses that were shown to be significant, there were two more that

came quite close. Perhaps using the Fisher MLE as standard procedure in future studies might

give more accurate results.
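
To make concrete how a significance threshold for a correlation depends on sample size, the following is a minimal sketch in Python. It rests on assumptions that are not part of this study: it uses the Fisher z-transformation approximation (a different technique from the maximum-likelihood procedure named above), a two-tailed test, and a hypothetical function name. This thesis does not state how the .21 threshold was derived, so the sketch is not offered as a reconstruction of that value.

```python
from math import sqrt, tanh
from statistics import NormalDist


def approx_critical_r(n, alpha=0.05):
    """Approximate two-tailed critical value of Pearson's r for sample size n.

    Assumption: under a true correlation of zero, atanh(r) is roughly normal
    with standard error 1 / sqrt(n - 3) (Fisher's z-transformation).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    # Normal quantile that leaves alpha / 2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Map the critical z back onto the correlation scale.
    return tanh(z_crit / sqrt(n - 3))
```

Under these assumptions the threshold shrinks as the sample grows, so a correlation that falls just short of significance in a smaller subsample could reach it in a larger one.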

This thesis ends with a quote which comes from a different perspective, literally. Perhaps

Franz Kafka was right, perhaps not, when he said, “How pathetically scanty my self-knowledge

is, compared with, say, my knowledge of my room. There is no such thing as observation of the

inner world, as there is of the outer world” (Kafka, 1917/1954).


References

Aboud, F.E. (2003). The formation of in-group favoritism and out-group prejudice in young

children: Are they distinct attitudes? Developmental Psychology, 39, 48-60.

Amodio, D., Harmon-Jones, E., & Devine, P. (2003). Individual differences in the activation and

control of affective race bias as assessed by startle eye-blink response and self-report.

Journal of Personality and Social Psychology, 84(4), 738-753. doi: 10.1037/0022-3514.84.4.738

Arnsten, A., Wang, M., Yang Yang, Ching-Jung, Wang, Nao J., Gamo, L., J., Mazer, J.,

Morrison, J., Wang, T. & Wang, (2013). NMDA receptors subserve persistent neuronal

firing during working memory in dorsolateral prefrontal cortex. Neuron 77 (2), 736-749.

Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). New York: Macmillan.

Archer, R., & Krishnamurthy, R. (2001). Essentials of MMPI-A assessment. Hoboken, NJ: John

Wiley and Sons.

Asendorpf, J. B., Banse, R., & Mucke, D. (2002). Double dissociation between implicit and

explicit personality self-concept: The case of shy behavior. Journal of Personality and

Social Psychology, 83, 380–393.

Austin, E. J., Gibson, G. J., Deary, I. J., McGregor, M. J., & Dent, J. B. (1998). Individual

response spread in self-report scales: personality correlations and consequences.

Personality and Individual Differences, 24, 421–438. Retrieved Nov 22, 2012, from

http://www.sciencedirect.com/science/article/pii/S019188699700175X

Azar, B. (2008). IAT: Fad or Fabulous? APA Monitor, July/August 39 (7), 44.

Banaji, M. R., & Baron, A. S. (2004). Implicit and explicit race attitudes: Evidence from ages 6,

10 and adulthood. Society of Personality and Social Psychology annual meeting, Austin,

TX.


Bargh, J. (1997). The automaticity of everyday life. In R. S. Wyer (Ed.), The

automaticity of everyday life: Advances in social cognition (pp. 1–61). Mahwah, NJ:

Lawrence Erlbaum Associates.

Bargh, J., & Chartrand, T. (1999). The unbearable automaticity of being. American

Psychologist, 54, 462–479.

Bar-Haim, Y., Ziv, T., Lamy, D., & Hodes, R. (2006). Nature and nurture in own-race face

processing. Psychological Science, 17 (2), 159–163.

Barrick, M., & Mount, M. (1991). The Big Five personality dimensions and job performance: A

meta-analysis. Personnel Psychology, 44, 1-26.

Barrick, M., & Mount, M. (1996). Effects of impression management and self-deception on the

predictive validity of personality constructs. Journal of Applied Psychology, 81, 261–272.

doi:10.1037/0021-9010.81.3.261

Becker, T., & Colquitt, A. (1992). Potential versus actual faking of a biodata form: An analysis

along several dimensions of item type. Personnel Psychology, 45, 389-406.

Berger, P., & Luckmann, T. (1966). The social construction of reality: A treatise in the

sociology of knowledge. New York: Anchor Books.

Birkeland, S., Manson, T., Kisamore, J., Brannick, M., & Smith, M. (2006). A meta-analytic

investigation of job applicant faking on personality measures. International Journal of

Selection and Assessment, 14, 317-335.

Blanton, H. & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61(1),

27-41. doi:10.1037/0003-066X.61.1.27. PMID 16435974

Blanton, H., & Jaccard, J. (2008). Unconscious racism: A concept in pursuit of a measure.

Annual Review of Sociology, 34, 277–297.


Blanton, H., Jaccard, J., Klick, J., Mellers, B., Mitchell, G., & Tetlock, P. (2009). Strong claims

and weak evidence: Reassessing the predictive validity of the IAT. Journal of Applied

Psychology, 94 (3), 567-582. doi: 10.1037/a0014665.

Borkenau, P., & Liebler, A. (1992). Trait inferences: Sources of validity at zero acquaintance.

Journal of Personality and Social Psychology, 62, 645–657.

Bosson, J. K., Swann, W. B., & Pennebaker, J. W. (2000). Stalking the perfect measure of

implicit self-esteem: The blind men and the elephant revisited? Journal of Personality

and Social Psychology, 79, 631-643.

Bradley, K., & Hauenstein, N. (2006). The moderating effects of sample type as evidence of the

effects of faking on personality scale correlations and factor structure. Psychology

Science, 48 (3), 313-335.

Briggs, K., Myers, I., McCaulley, M. H., Quenk, N. L., & Hammer, A. L. (1998). A guide to

the development and use of the Myers-Briggs Type Indicator. Mountain View, CA: CPP,

Inc.

Burns, G., & Christiansen, N. (2006). Use of social desirability in correcting for motivated

distortion. In R. Griffith (Ed.), A Closer Examination of Applicant Faking Behavior.

Greenwich, CT: Information Age Publishing.

Cantor, N. (1990). From thought to behavior: “Having” and “doing” in the study of personality

and cognition. American Psychologist, 45, 735-750.

Carifio, J., & Perla, R. (2007). Ten common misunderstandings, misconceptions, persistent

myths and urban legends about Likert scales and Likert response formats and their

antidotes. Journal of Social Sciences, 2, 106-116.

http://www.scipub.org/fulltext/jss/jss33106-116.pdf


Casey, B. J., Castellanos, F., & Giedd, J. (1997). Implications of right frontostriatal circuitry in

response inhibition and attention deficit/hyperactivity disorder. Journal of the American

Academy of Child and Adolescent Psychiatry, 36, 367.

Casey, B.J., Giedd, J.N., & Thomas, K.M. (2000). Structural and functional brain development

and its relation to cognitive development. Biological Psychology, 54, 241-257.

Cattell, R. B., & Coan, R. W. (1976). Early school personality questionnaire (Rev. ed.).

Champaign, IL: Institute for Personality and Ability Testing.

Cattell, R. B., Cattell, M. D., & Johns, E. (1984/2001). High School Personality Questionnaire.

Champaign, IL: Institute for Personality and Ability Testing.

Clark, L. (2007). Assessment and diagnosis of personality disorder: Perennial issues and an

emerging reconceptualization. Annual Review of Psychology, 58, 227–257.

Christiansen, N., Goffin, R., Johnston, N. & Rothstein, M. (1994). Correcting the 16PF for

faking: effects on the criterion-related validity and individual hiring decisions. Personnel

Psychology, 47, 847-60.

Converse, P., Peterson, M., & Griffith, R. (2009). Faking on personality measures: Implications

for selection involving multiple predictors. International Journal of Selection and

Assessment, 17, 47–60.

Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R) and

NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological

Assessment Resources.

Costandi, M. (2012, June 26, 27-28). How the brain views race. Nature.com. Nature Publishing

Group. Retrieved March 31, 2013, from http://www.Nature.com.

doi:10.1038/nature.2012.10886

Cronbach, L. & Meehl, P. (1955). Construct validity in psychological tests. Psychological

Bulletin, 52, 281–302.

Cunningham, W., Johnson, M., Raye, C., Gatenby, C., Gore, J., & Banaji, M. (2004). Separable

neural components in the processing of black and white faces. Psychological Science, 15,

806–813. doi: 10.1111/j.0956-7976.2004.00760.x

Cunningham, W. A., Nezlek, J. B., & Banaji, M. R. (2004). Implicit and explicit ethnocentrism:

Revisiting the ideologies of prejudice. Personality and Social Psychology Bulletin, 30,

1332-1346.

Cunningham, W., & Van Bavel, J. (2009). Separable Neural Components in the Processing of

Black and White Faces Part 2. Psychological Science Nature Reviews Neuroscience, 15,

141–152. doi:10.1038/nrn2538

Cvencek, D., Greenwald, A. G., Brown, A., Snowden, R. & Gray, N. (2010). Faking of the

Implicit Association Test is statistically detectable and partly correctable. Basic and

Applied Social Psychology, 32, 302–314. doi: 10.1080/01973533.2010.519236

Cvencek, D., Greenwald, A. G., & Meltzoff, A. N. (2011). Measuring implicit attitudes of 4-

year-old children: The Preschool Implicit Association Test. Journal of Experimental Child

Psychology, 109, 187–200. doi:10.1016/j.jecp.2010.11.002

Dalton, D., & Ortegren, M. (2011). Gender differences in ethics research: The importance of

controlling for the social desirability response bias. Journal of Business Ethics, 103 (1),

73–93. doi:10.1007/s10551-011-0843-8

Dasgupta, N., McGhee, D., Greenwald, A., & Banaji, M. (2000). Automatic preference for white

Americans: eliminating the familiarity explanation. Journal of Experimental Social

Psychology, 36, 316–28.

Dasgupta, N., & Greenwald, A. G. (2001). On the malleability of automatic attitudes: Combating

automatic prejudice with images of admired and disliked individuals. Journal of

Personality and Social Psychology, 81, 800-814.

Dasgupta, N., Greenwald, A., & Banaji, M. (2003). The first ontological challenge to the IAT:

Attitude or mere familiarity? Psychological Inquiry, 14 (3 & 4), 238-243.

Dasgupta, N., & Asgari, S. (2004). Seeing is believing: Exposure to counter stereotypical

women and its effect on the malleability of gender stereotyping. Journal of

Experimental Social Psychology, 40, 642–658.

De Houwer, J. (2001). A structural and process analysis of the Implicit Association Test. Journal

of Experimental Social Psychology, 37, 443–51.

De Houwer, J., Teige-Mocigemba, S., Spruyt, A., & Moors, A. (2009). Implicit measures: A

normative analysis and review. Psychological Bulletin, 135 (3), 347-368. doi:

10.1037/a0014211

DePaulo, B., Kashy, D., Kirkendol, S., Dwyer, M., & Epstein, J. (1996). Lying in everyday life.

Journal of Personality and Social Psychology, 70 (5), 979-995.

Dijksterhuis, A., & Bargh, J. (2001). The perception-behavior expressway: automatic

effects of social perception on social behavior. In M. P. Zanna (Ed.), Advances

in experimental social psychology, 1–40. San Diego, CA: Academic Press.

Douglas, E., McDaniel, M., & Snell, A. (1996). The validity of non-cognitive measures decays

when applicants fake. Presented at the annual meeting of the Academy of Management,

Cincinnati, Ohio (August).

Dupree, C. (2004). Cushioning hard memories. Harvard Magazine, July-August, 9-10.


Durston, S., Hulshoff Pol, H. E., Casey, B. J., Giedd, J. N., Buitelaar, J. K., & van Engeland, H.

(2001). Anatomical MRI of the developing brain: What have we learned? Journal of the

American Academy of Child and Adolescent Psychiatry, 40 (9), 1012-1020.

Egloff, B., & Schmukle, S. C. (2002). Predictive validity of an implicit association test for

assessing anxiety. Journal of Personality and Social Psychology, 83, 1441–1455.

Egloff, B., & Schmukle, S. C. (2003). Does social desirability moderate the relationship between

implicit and explicit anxiety measures? Personality and Individual Differences, 35, 1697–

1706.

Ellingson, J., Sackett, P., & Hough, L. (1999). Social desirability corrections in personality

measurement: Issues of applicant comparison and construct validity. Journal of Applied


Ellingson, J., Smith, D., & Sackett, P. (2001). Investigating the influence of social desirability on

personality factor structure. Journal of Applied Psychology, 86, 122-133.

Ellingson, J. (2011). People fake only when they need to fake. In Ziegler, M., MacCann, C., &

Roberts, R., (Eds.). New perspectives on faking in personality assessment, 19–33, New

York, NY: Oxford University Press.

Embretson, S. (1983). Construct validity: Construct representation versus nomothetic span.

Psychological Bulletin, 93, 179-197.

Ermer, E., Cope, L. M., Nyalakanti, P. K., Calhoun, V. D., & Kiehl, K. A. (2013). Aberrant

paralimbic gray matter in incarcerated male adolescents with psychopathic traits. Journal

of the American Academy of Child and Adolescent Psychiatry, 52 (1), 94-103.

doi: 10.1016/j.jaac.2012.10.013.


Eysenck, S., & Eysenck, H. (1963). An experimental investigation of "desirability" response set

in a personality questionnaire. Life Sciences, 5, 343-355.

doi: 10.1016/0024-3205(63)90168-1

Eysenck, H. J., & Eysenck, S.B.G. (1975a). Junior Eysenck Personality Questionnaire. San

Diego, CA: Educational and Industrial Testing Service.

Eysenck, H. J. (1973). Eysenck on Extraversion. New York: Wiley.

Eysenck, H. J. (1990a). Biological dimensions of personality. In E. Pervin (Ed.), Handbook of

personality, 244–276, New York: Guilford.

Eysenck, H. J. (1990b). Genetic and environmental contributions to individual differences: The

three major dimensions of personality. Journal of Personality, 58, 245–261.

Fan, X., Miller, B. C., Park, K., Winward, B. W., Christensen, M. & Grotevant, H. (2006). An

exploratory study about inaccuracy and invalidity in adolescent self-report surveys. Field

Methods, 18, 223–244. http://fmx.sagepub.com/content/18/3/223.short

Fazio, R., Jackson, J., Dunton, B., & Williams, C. (1995). Variability in automatic activation as

an unobtrusive measure of racial attitudes: a bona fide pipeline? Journal of Personality

and Social Psychology, 69 (6), 1013-27.

Fazio, R., & Olson, M. (2003). Implicit measures in social cognition research: Their meaning

and use. Annual Review of Psychology, 54 (2), 297-327.

doi: 10.1146/annurev.psych.54.101601.145225

Fazio, R., & Olson, M. (2004). Reducing the influence of extrapersonal associations on the

Implicit Association Test: Personalizing the IAT. Journal of Personality and Social

Psychology, 86 (5), 653–667. doi: 10.1037/0022-3514.86.5.653

Fernandes, M., & Randall, D. (1992). The nature of social desirability response effects in ethics

research. Business Ethics Quarterly, 2 (2), 183-205.

Fisher, R. A. (1925). Theory of statistical estimation. Proceedings of the Cambridge

Philosophical Society, 22, 700-725.

Fisher, R. (1993). Social desirability bias and the validity of indirect questioning. Journal of

Consumer Research, 20, 303-315.

Fisher, R., & Tellis, G. (1998). Removing social desirability bias with indirect questioning: Is the

cure worse than the disease? Eds. Joseph W. Alba & J. Wesley Hutchinson, Provo, UT:

Association for Consumer Research, 563-567.

Furnham, A. (1990). The fakability of the 16PF, Myers-Briggs and FIRO-B personality

measures. Personality and Individual Differences, 11, 711-716.

Gailliot, M., Peruche, B., Plant, E. & Baumeister, R. (2009). Stereotyping and prejudice in the

blood: sucrose drinks reduce prejudice and stereotyping. Journal of Experimental Social


Galton, F. (1884). Measurement of character. Fortnightly Review, 36, 179-185.

Gao, Y., Glenn, A., Schug, R., Yang, Y., & Raine, A. (2009). The neurobiology of psychopathy: A

neurodevelopmental perspective. Canadian Journal of Psychiatry, 54 (12), 813-823.

Ghiselli, E. E., Campbell, J. P, & Zedeck, S. (1981). Measurement theory for the behavioral

sciences. San Francisco, CA: W. H. Freeman and Company.

Giedd, J. (2002). Interview with PBS Frontline/Inside the Teen Brain, online at

www.pbs.org/wgbh/pages/frontline/shows/teenbrain. Retrieved July 23, 2013.

Gilbert, A. P. (1998). A test-retest study of the Myers-Briggs Type Indicator (MBTI) and the

Murphy-Meisgeier Type Indicator for Children (MMTIC) over a two year time period

(Doctoral dissertation, Ohio State University, 1998). Dissertation Abstracts International,

59 (5), 1457A (University Microfilms No. AAG98-33981)

Gollwitzer, P., & Bargh, J. (Eds.). (1996). The Psychology of Action: Linking Cognition

and Motivation to Behavior. New York: Guilford Press.

Gough, H.G., Fioravanti, M., & Lazzari, R. (1983). Some implications of real self versus ideal-

self congruence on the revised Adjective Check List. Journal of Personality and Social

Psychology, 44, 1214-1220.

Gray, N., MacCulloch, M., Smith, J., Morris, M., & Snowden, R. (2012). Implicit affective

associations to violence in psychopathic murders. Journal of Forensic Psychiatry and

Psychology, 15 (4), 620-641. doi:10.1038/nn0703-647.

Greenwald, A. G., & Banaji, M. R. (1995). Implicit social cognition: attitudes, self-esteem, and

stereotypes. Psychological Review, 102 (1), 4-27.

Greenwald, A., McGhee, D., & Schwartz, J. (1998). Measuring individual differences in implicit

cognition: The Implicit Association Test. Journal of Personality and Social Psychology,

74 (6), 1464-1480.

Greenwald, A. (2001, October). Top 10 list of things wrong with the IAT. Presentation at the

annual conference of the Society of Experimental Social Psychology, Spokane, WA.

Greenwald, A. G., & Nosek, B. A. (2001). Health of the Implicit Association Test at age 3.

Zeitschrift für Experimentelle Psychologie, 48, 85-93.

Greenwald, A. G., Nosek, B. A., & Banaji, M. R. (2003). Understanding and using the Implicit

Association Test: I. An improved scoring algorithm. Journal of Personality and Social



Greenwald, A. G. (2004, January). Revised Top 10 List of Things Wrong with the IAT. Invited

presentation at attitudes preconference of the 5th annual meeting of the Society of

Personality and Social Psychology, Austin, TX. (Powerpoint slides)

Greenwald, A., Poehlman, T., Uhlmann, E., & Banaji, M. (2009). Understanding and using the

Implicit Association Test: III. Meta-analysis of predictive validity. Journal of Personality

and Social Psychology, 97 (1), 17-41.

Greenwald, A. (2012). A description of the Implicit Association Test (IAT). University of Washington

faculty homepage. Retrieved September 12 from

http://faculty.washington.edu/agg/IATmaterials/IATdescription.htm

Griffith, R., Rutkowski, K., Gujar, A., Yoshita, Y. & Steelman, L. (2005). Modeling applicant

faking: new methods to examine an old problem. Personnel Review, 36 (3), 341-355.

Griffith, R., & McDaniel, M. (2006). The nature of deception and applicant faking behavior. In

R. L. Griffith & M. H. Peterson (Eds.). A closer examination of applicant faking

behavior, 1–19. Greenwich, CT: Information Age.

Griffith, R., Chmielowski, L., & Yoshita, Y. (2007). Do applicants fake? An examination of the

frequency of applicant faking behavior. Personnel Review, 36 (3), 341-357.

Griffith, R., & Converse, P. (2011). The rules of evidence and the prevalence of applicant

faking. In Ziegler, M., MacCann, C., & Roberts, R. (Eds.). New perspectives on faking in

personality assessment (34–52). New York, NY: Oxford University Press.

Griffith, R., Lee, L., Peterson, M., & Zickar, M. (2011). First dates and little white lies: A trait

contract classification theory of applicant faking behavior. Human Performance, 24 (4),

338-357.

Grum, M., & von Collani, G. (2007). Measuring Big-Five personality dimensions with the

implicit association test-Implicit personality traits or self-esteem? Personality and

Individual Differences, 43, 2205-2217.

Haider, A., Sexton, J., & Sriram, N. (2011). Association of unconscious race and social class bias

with vignette-based clinical assessments by medical students. Journal of the American

Medical Association, 306 (9), 942-951. doi:10.1001/jama.2011.1248

Haines, E., & Sumner, K. (2006). Implicit measurement of attitudes, stereotypes, and self-

concept in organizations: Teaching an old dogma new tricks. Organizational Research

Methods, 9 (4), 536-553.

Hambleton, R., Swaminathan, H., & Rogers, H. (1991). Fundamentals of Item Response Theory.

Newbury Park, CA: Sage Press.

Hamlin, J., Wynn, K., & Bloom, P. (2007). Social evaluation by preverbal infants. Nature, 450,

557–559.

Hamlin, J., Wynn, K. & Bloom, P. (2010). Three month olds showed a negativity bias in their

social evaluations. Developmental Science, 13 (6), 923–929. doi:10.1111/j.1467-7687.2010.00951

Han, A., Olson, M., & Fazio, R. (2006). The influence of experimentally created extrapersonal

associations on the Implicit Association Test. Journal of Experimental Social

Psychology, 42 (3), 259-72.

Han, H., Czellar, S., Olson, M., & Fazio, R. (2010). Malleability of attitudes or malleability of

the IAT? Journal of Experimental Social Psychology, 46, 286-298.


Hardin, C., & Higgins, E. (1996). Shared Reality: How social verification makes the

subjective objective. In R. Sorrentino & E. Higgins (Eds.), Handbook of

motivation and cognition, 3, The interpersonal context, 28–84. New York, NY:

Guilford Press.

Hare, R., & Neumann, C. (2008). Psychopathy as a clinical and empirical construct. Annual

Review of Clinical Psychology, 4, 217-246.

Harvey, R. J., & Murry, W. D. (1994). Scoring the Myers-Briggs Type Indicator: Empirical

comparison of preference score versus latent-trait methods. Journal of Personality

Assessment, 62, 116-129.

Harvey, R. J., Murry, W. D., & Markham, S. E. (1995). A Big Five scoring system for the

Myers-Briggs Type Indicator. Paper presented at the Annual Conference of the Society

for Industrial and Organizational Psychology, Orlando. Retrieved November 06, 2012

from http://harvey.psyc.vt.edu/Documents/BIGFIVE.pdf

Harvey, R. J. (1996). Reliability and validity. In A. L. Hammer (Ed.), MBTI applications, 5-

29. Palo Alto, CA: Consulting Psychologists Press.

Hassin, R., Uleman, J., & Bargh, J. (Eds.). (2005). The new unconscious. New York:

Oxford University Press.

Heggestad, E., George, E., & Reeve, C. (2006). Transient error in personality scores:

Considering honest and faked responses. Personality and Individual Differences, 40,

1201-1211.

Heggestad, E. (2011). A conceptual representation of faking: Putting the horse back in front of

the cart. In Ziegler, M., MacCann, C., & Roberts, R. (Eds.). New perspectives on faking in

personality assessment, 87-101. New York, NY: Oxford University Press.

Henry, E., Bartholow, B., & Arndt, J. (2010). Death on the brain: Effects of mortality salience on

the neural correlates of in-group and out-group categorization. Social Cognitive and

Affective Neuroscience, 5, 77-87.

Hogan, R., Hogan, J., & Roberts, B. (1996). Personality measurement and employment

decisions: questions and answers. American Psychologist, 51, 469-77.

Hoffman, A. (2010). IAT and personality: Implicit personality as a predictor of performance. A

thesis submitted to the graduate faculty of North Carolina State University, Raleigh.

Hofmann, W., Gawronski, B., Gschwendner, T., Le, H., & Schmitt, M. (2005). A meta-

analysis on the correlation between the implicit association test and explicit self-report

measures. Personality and Social Psychology Bulletin, 31(10), 1369–1385.

http://dx.doi.org/10.1177/0146167205275613

Holden, R. (2007). Socially desirable responding does moderate personality scale validity both in

experimental and in non-experimental contexts. Canadian Journal of Behavioural

Science/Revue Canadienne des Sciences du Comportement, 39, 184–201.

doi:10.1037/cjbs2007015

Holden, R. (2008). Underestimating the effects of faking on the validity of self-report personality

scales. Personality and Individual Differences, 44(1), 311-321.

Hough, L., Eaton, N., Dunnette, M., Kamp, J., & McCloy, R. (1990). Criterion-related validities

of personality constructs and the effect of response distortion on those validities. Journal

of Applied Psychology, 75, 581-95.

Hough, L., & Paullin, C. (1994). Construct-oriented scale construction: the rational approach, in

Stokes, G. S., Mumford, M., & Owens, W. (Eds.). The biodata handbook: Theory,

research, and use of biographical information in selection and performance prediction,

109-145, Palo Alto, CA: Consulting Psychologists Press.

Hough, L., & Schneider, R. (1995). The frontiers of I/O personality research. In K. R. Murphy

(Ed.). Individual differences and behavior in organizations. San Francisco: Jossey-Bass.

Hough, L., & Schneider, R. (1996). Personality traits, taxonomies, and applications on

organizations, in Murphy, K. (Ed.), Individual Differences on Organizations, Jossey-

Bass, San Francisco, CA.

Hough, L. (1997). An examination of the structure and usefulness of non-cognitive constructs for

predicting job performance. Paper presented at the 12th Annual Conference of the

Society for Industrial and Organizational Psychology, St Louis, MO.

Hough, L. (1998b). Personality at work: Issues and evidence. In M. Hakel (Ed). Beyond multiple

choice: Evaluating alternatives to traditional testing for selection, 131-166. Hillsdale,

NJ: Lawrence Erlbaum.

Jackson, D. N., Wroblewski, V. R., & Ashton, M. C. (2000). The impact of faking on

employment tests: Does forced choice offer a solution? Human Performance, 13, 371–

388. doi:10.1207/S15327043HUP1304_3

Jamieson, S. (2004). Likert scales: how to (ab)use them. Medical Education, 38, 1212-1218.

Johnson, J., & Hogan, R. (2006). A socioanalytic view of faking. In R. Griffith, & M. Peterson,

(Eds.). A closer examination of applicant faking behavior, 209-231. Greenwich, CT:

Information Age Publishing.

Jost, J., Rudman, L., Blair, I., Carney, D., Dasgupta, N., Glaser, J., & Hardin, D. (2009).

The existence of implicit bias is beyond reasonable doubt: A refutation of


ideological and methodological objections and executive summary of ten studies

that no manager should ignore. Research in Organizational Behavior, 29, 39–69.

Jung, C. G. (1921/English edition 1923). Psychological types. Princeton, NJ: Princeton

University Press.

Jung, C. G. (1936/1971). Psychological types. Collected works of C.G. Jung, 6. Princeton, NJ:

Princeton University Press.

Kafka, F. (1954). The third notebook, October 18, 1917. In F. Kafka, Dearest Father. Stories

and other writings. New York: Schocken Books.

Kafka, F. (n.d.). BrainyQuote.com. Retrieved August 7, 2013, from BrainyQuote.com Web site:

http://www.brainyquote.com/quotes/quotes/f/franzkafka401830.html

http://www.brainyquote.com/citation/quotes/quotes/f/franzkafka401830.html#PD2FsmdQV2YzKWQe.99

Kagan, J. (1988). The meaning of personality predicates. American Psychologist, 43, 614-620.

Kagan, J. (2008). In defense of qualitative changes in development. Child Development, 79,

1606 – 1624. doi: 10.1111/j.1467-8624.2008.01212.x

Kahneman, D. (2003). A perspective on judgment and choice: Mapping bounded

rationality. American Psychologist, 58, 697–720.

Kanwisher, N., Stanley, D., & Harris, A. (1999). The parahippocampal place area: Recognition,

navigation, or encoding? Neuron, 5 (1), 115-25.

Karpinski, A., & Hilton, J. (2001). Attitudes and the Implicit Association Test. Journal of

Personality and Social Psychology, 81, 774–788.

Kelly, D., Quinn, P., Slater, A., Lee, K., Gibson, A., Smith, M., Ge, L., & Pascalis, O. (2005).

Three-month-olds, but not newborns, prefer own-race faces. Developmental Science, 6,

F31–F36. doi: 10.1111/j.1467-7687.2005.0434a.x

Kinzler, K., Dupoux, E., & Spelke, E. (2007). The native language of social cognition. Proceedings

of the National Academy of Sciences, 104 (30), 12577–12580.

Klauer, K. C., & Mierke, J. (2005). Task-set inertia, attitude accessibility, and compatibility-

order effects: New evidence for a task-set switching account of the Implicit Association

Test effect. Personality and Social Psychology Bulletin, 37, 208-217.

Kluger, A., & Colella, A. (1993). Beyond the mean bias: The effect of warning against faking on

biodata item variances. Personnel Psychology, 46, 763-780.

Kubota, J., Banaji, M., & Phelps, E. (2012). The neuroscience of race. Nature

Neuroscience, 15, 940–948. doi: 10.1038/nn.3136

Kuhlmeier, V., Bloom, P., & Wynn, K. (2004). Do 5 month old infants see humans as material

objects? Cognition, 94, 95-103.

Kuhlmeier, V., Wynn, K., & Bloom, P. (2003). Attribution of dispositional states by 12-month-

old infants. Psychological Science, 14, 402-408.

Landy, F. (2008). Stereotypes, bias, and personnel decision: Strange and stranger. Industrial and

Organizational Psychology, 1(4), 379-392. doi:10.1111/j.1754-9434.2008.00071.x

Lane, K. A., Banaji, M. R., Nosek, B. A., & Greenwald, A. G. (2007). Understanding and using

the Implicit Association Test: IV. What we know (so far) 59–102. In B. Wittenbrink & N.

S. Schwarz (Eds.). Implicit measures of attitudes: Procedures and controversies. New

York: Guilford Press.

Lang, M. L. S. (1999). A concurrent validity study of the MBTI, MMTIC-R and SSQ with

middle school students [Myers-Briggs Type Indicator, Murphy-Meisgeier Type Indicator

for Children-Revised, Student Styles Questionnaire] (Doctoral dissertation, Texas

Woman's University, 1999). Dissertation Abstracts International, 60(09), 3271A,

University Microfilms No. AAI99-44500

Lehrman, S. (2006). The implicit prejudice. Scientific American Magazine. Retrieved November

21, 2012, from http://www.scientificamerican.com/article.cfm?id=the-implicit-prejudice

Levin, R. (1995). Self-presentation, lies, and bullshit: The impact of impression management on

employee selection. Presented at the 10th annual meeting of the Society of Industrial and

Organizational Psychology, May, Orlando, FL.

Lieberman, M., Hariri, A., Jarcho, J., Eisenberger, N., & Bookheimer, S. (2005). An fMRI

investigation of race-related amygdala activity in African-American and Caucasian-

American individuals. Nature Neuroscience, 8 (6), 720-722. doi: 10.1038/nn1465

Luo, Q., Nakic, M., Wheatley, T., Richell, R., Martin, A., & Blair, R. J. (2006). The neural basis of

implicit moral attitude—an IAT study using event-related fMRI. NeuroImage, 30, 1449–

1457.

Malpass, R., & Kravitz, J. (1969). Recognition for faces of own and other race. Journal of

Personality and Social Psychology, 13, 330–334.

MacCann, C., Ziegler, M., & Roberts, R. D. (2011). Faking in personality assessment:

Reflections and recommendations. In M. Ziegler, C. MacCann & R. D. Roberts (Eds.).

New perspectives on faking in personality assessment, 309–329. New York, NY: Oxford

University Press.

Martin, B., Bowen, C., & Hunt, S. (2002). How effective are people at faking on personality

questionnaires? Personality and Individual Differences, 32 (2), 247-256.

McCrae, R., & Costa, P., Jr. (1989). Reinterpreting the Myers-Briggs Type Indicator from the

perspective of the Five-Factor Model of Personality. Journal of Personality, 57 (1), 17-

40.

McCrae, R., & John, O. (1992). An introduction to the five-factor model and its applications.

Journal of Personality, 60 (2), 175–215. doi: 10.1111/j.1467-6494.1992.tb00970.x

McGrath, R. E., Mitchell, M., Kim, B. H., & Hough, L. (2010). Evidence for response bias as a

source of error variance in applied assessment. Psychological Bulletin, 136, 450–470.

doi:10.1037/a0019216

McKone, E., Crookes, K., Jeffery, L., & Dilks, D. (2012). A critical review of the development of

face recognition: Experience is less important than previously believed. Cognitive

Neuropsychology, 29 (1-2), Special Issue: Understanding cognitive development:

Approaches from mind and brain, 1-39. doi:10.1080/02643294.2012.660138

Meehl, P., & Hathaway, S. (1946). The K factor as a suppressor variable in the Minnesota

Multiphasic Personality Inventory. Journal of Applied Psychology, 30, 525–564.

doi:10.1037/h0053634

Messner, C., & Vosgerau, J. (2010). Cognitive inertia and the Implicit Association Test. Journal

of Marketing Research, 47(2), 374-386. http://dx.doi.org/10.1509/jmkr.47.2.374

Moorman, R., & Podsakoff, P. (1992). A meta-analytic review and empirical test of the potential

confounding effects of social desirability response sets in organization behavior research.

Journal of Occupational and Organizational Psychology, 65, 131-49.

Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J. R., Murphy, K., & Schmitt, N.

(2007). Reconsidering the use of personality tests in personnel selection contexts.

Personnel Psychology, 60, 683-729. doi:10.1111/j.1744-6570.2007.00089.x

Motzkin, J., Newman, J., Kiehl, K., & Koenigs, M. (2011). Reduced prefrontal connectivity in

psychopathy. Journal of Neuroscience, 31 (48), 17348-17357. doi:

10.1523/JNEUROSCI.4215-11.2011

Muller, D., Judd, C. M., & Yzerbyt, V. Y. (2005). When moderation is mediated and mediation

is moderated. Journal of Personality and Social Psychology, 89 (6), 852-863.

Murphy, K. R., & Davidshofer, C. O. (2005). Psychological testing (6th ed.). Upper Saddle

River, NJ: Prentice-Hall.

Murphy, E., & Meisgeier, C. (1987). MMTIC Manual – A guide to the development and use of the

Murphy-Meisgeier Type Indicator for Children, 25-26, Center for the Applications of

Type, Gainesville FL.

Murphy, E., & Meisgeier, C. (2008). MMTIC Manual – A guide to the development and use of the

Murphy-Meisgeier Type Indicator for Children, 25-26, Center for the Applications of

Type, Gainesville FL.

Myers, I. B. (1962). Manual: The Myers-Briggs Type Indicator. Princeton, NJ: Educational

Testing Service.

Myers, I. B. (1977). In M. H. McCaulley, The Myers longitudinal medical study (Monograph II).

Gainesville, FL: Center for Applications of Psychological Type.

Myers, I. B., & McCaulley, M. (1985). Manual: A guide to the development and use of the

Myers-Briggs Type Indicator (2nd ed.). Palo Alto, CA: Consulting Psychologists Press.

Myers, I. B., with Myers, P. B. (1980/1995). Gifts differing: Understanding personality type

(2nd ed.). Palo Alto, CA: Davies-Black.

Myers, I. B., McCaulley, M. H., Quenk, N. L., & Hammer, A. L. (1998/2003). MBTI Manual: A

guide to the development and use of the Myers-Briggs Type Indicator (3rd ed.). Palo

Alto, CA: Consulting Psychologists Press.

Nagourney, E. (2007). Even babies may be good judges of character. The New York Times.

Retrieved from http://www.nytimes.com/2007/12/04/health/research/04beha.html?_r=2&

Nederhof, A. (1985). Methods of coping with social desirability bias: a review. European

Journal of Social Psychology, 15, 263-280.

Nordqvist, C. (2012). Does propranolol reduce racism? Probably yes, subconsciously. Medical

News Today. Retrieved from http://www.medicalnewstoday.com/articles/242769.php.

Nosek, B. A., Greenwald, A. G., & Banaji, M. R. (2007). The Implicit Association Test at age 7:

A methodological and conceptual review, 265-292. In J.A. Bargh (Ed.). Automatic

processes in social thinking and behavior, Psychology Press.

Nosek, B., Greenwald, A., & Banaji, M. (2005). Understanding and using the Implicit

Association Test: II. Method variables and construct validity. Personality and Social

Psychology Bulletin, 31, 166–180.

Nunnally, J.C. (1978). An overview of psychological measurement. In Clinical Diagnosis of

Mental Disorders: A Handbook (ed. B.B. Wolman), 97-146. Plenum Press: New York.

Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-

Hill.

O'Brien, E., & LaHuis, D. (2011). Do applicants and incumbents respond to personality items

similarly? A comparison of dominance and ideal point response models. International

Journal of Selection and Assessment, 19, 109-118. doi: 10.1111/j.1468-2389.2011.00539.x

Ones, D., Viswesvaran, C. & Schmidt, F. (1993). Comprehensive meta-analysis of integrity test

validities: Findings and implications for personnel selection and theories of job

performance [Monograph]. Journal of Applied Psychology, 78, 679-703.

Ones, D., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in personality

testing for personnel selection: The red herring. Journal of Applied Psychology, 81(6),

660-682. doi:10.1037/0021-9010.81.6.660

Ones, D., & Viswesvaran, C. (1998). The effects of social desirability and faking on personality

and integrity assessment for personnel selection. Human Performance, 11, 245-269.

Olson, M., & Fazio, R. (2004). Reducing the influence of extrapersonal associations on the

Implicit Association Test: personalizing the IAT. Journal of Personality and Social

Psychology, 86 (5), 653-67.

Olson, M., Fazio, R., & Han, H. (2009). Conceptualizing personal and extrapersonal

associations. Social and Personality Psychology Compass, 3, 152-170.

Ottaway, S., Hayden, D., & Oakes, M. (2001). Implicit attitudes and racism: effects of word

familiarity and frequency in the Implicit Association Test. Social Cognition, 19, 97–144.

Oyserman, D., & Lee, S. (2008). Does culture influence what and how we think?

Effects of priming individualism and collectivism. Psychological Bulletin, 134,

311–342.

Patel, C. (2006). A comparative study of professional accountants’ judgements. Elsevier Jai.

Oxford.

Paul, M. (2004, 2005). The cult of personality testing: How personality tests are leading us to

miseducate our children, mismanage our companies, and misunderstand ourselves, New

York: Simon & Schuster.

Pauls, C., & Crost, N. (2004). Effects of faking on self-deception and impression management

scales. Personality and Individual Differences, 37, 1137-1151.



Payne, B., Burkley, M., & Stokes, M. (2008). Why do implicit and explicit attitude tests

diverge? The role of structural fit. Journal of Personality and Social Psychology 94 (1),

16–31. doi: 10.1037/0022-3514.94.1.16

Paulhus, D. (2002). Socially desirable responding: The evolution of a construct. In H. Braun, D.

N. Jackson, & D.E. Wiley (Eds.), The role of constructs in psychological and educational

measurement, 67-88, Hillsdale, NJ: Erlbaum.

Paulhus, D. (2011). Overclaiming on personality questionnaires. In M. Ziegler, C. MacCann,

& R. Roberts, (Eds.), New perspectives on faking in personality assessment, 151-164,

New York, NY: Oxford University Press.

Paunonen, S., & LeBel, E. (2012). Socially desirable responding and its elusive effects on the

validity of personality assessments. Journal of Personality and Social Psychology, 103,

158-175.

PBS (2002). Frontline: Inside The Teenage Brain, Retrieved from PBS online

http://www.pbs.org/wgbh/pages/frontline/shows/teenbrain/interviews/giedd.html

Pepper, T. (2005, January 15). Getting to know you, Newsweek, 145 (3), 44-47.

Peterson, M., Griffith, R., & Converse, P. (2009). Examining the role of applicant faking in

hiring decisions: Percentage of fakers hired and hiring discrepancies in single- and

multiple-predictor selection. Journal of Business and Psychology, 24, 1–14.

Phelps, E., O’Connor, K., Cunningham, W., Funayama, E., Gatenby, J., Gore, J., & Banaji, M.

(2000). Performance on indirect measures of race evaluation predicts amygdala

activation. Journal of Cognitive Neuroscience 12, 729–738.

doi: 10.1162/089892900562552



Phelps, E. (2006). Emotion and cognition: insights from studies of the human amygdala. Annual

Review of Psychology, 57, 27–53.

Pitman, R., Sanders, K., Zusman, R., Healy, A., Cheema, F., Lasko, N., Cahill, L., & Orr, S.

(2002), Pilot study of secondary prevention of posttraumatic stress disorder with

propranolol, Biological Psychiatry, 51 (2), 189-192.

Pitman, R., Rasmusson, A., Koenen, K., Shin, L., Orr, S., Gilbertson, M., Milad, M., &

Liberzon, I. (2012). Biological studies of post-traumatic stress disorder. Nature Reviews

Neuroscience 13, 769-787. doi:10.1038/nrn3339

Pittenger, D. J. (1993). Measuring the MBTI and coming up short. Journal of Career Planning

and Employment, 54 (1), 48-52.

Porter, R. B., & Cattell, R. B. (1968). Handbook for the Children’s Personality Questionnaire

(CPQ). Champaign, IL: Institute for Personality and Ability Testing Inc.

Quinn, P., Yahr, J., Kuhn, A., Slater, A., & Pascalis, O. (2002). Representation of the gender of

human faces by infants: A preference for females. Perception. 31, 1109–1121.

Raine, A. (2013). The criminal mind. The Wall Street Journal, April 26, American edition.

Retrieved from

http://online.wsj.com/article/SB10001424127887323335404578444682892520530.html

Reynierse, J. (2009). The case against type dynamics. Journal of Psychological Type, 69, 1-21.

Robie, C., Brown, D., & Beaty, J. (2007). Do people fake on personality inventories? A verbal

protocol analysis. Journal of Business and Psychology, 21, 489–509.

doi: 10.1007/s10869-007-9038-9



Rosse, J., Stecher, M., Miller, J., & Levin, R. (1998). The impact of response distortion on pre-

employment personality testing and hiring decisions. Journal of Applied Psychology, 83,

634-644.

Rosse, J., Levin, R., & Nowicki, M. (1999, April). Assessing the impact of faking on job

performance and counter-productive job behaviors - Paper presented in Paul Sackett

(Chair) New empirical research on social desirability in personality management

symposium for the 14th annual meeting for the Society of Industrial and Organizational

Psychology, Atlanta, GA., 5.

Rosenthal, R., & Rubin, D. (1982). A simple general purpose display of magnitude of

experimental effect. Journal of Educational Psychology, 74, 166–169.

doi: 10.1037/0022-0663.74.2.166

Rossi, P., Wright, J., & Anderson, A. (1983). Handbook of Survey Research. Orlando, FL.

Academic Press.

Rudman, L., Greenwald, A., Mellott, D., & Schwartz, J. (1999). Measuring the automatic

components of prejudice: flexibility and generality of the Implicit Association Test.

Social Cognition, 17, 437–65.

Rudman, L., Ashmore, R., & Gary, M. (2001). Unlearning automatic bias: The

malleability of implicit prejudice and stereotypes. Journal of Personality and

Social Psychology, 81, 856–868.



Rudman, L., & Lee, M. (2002). Implicit and explicit consequences of exposure to violent

and misogynous rap music. Group Processes and Intergroup Relations 5, 133-150.

Ryan, A., & Sackett, P. (1987). Pre-employment honesty testing: fakability, reactions of test

takers, and company image. Journal of Business and Psychology, 1, 248-56.

Rychtarik, R., Tarnowski, K., & St. Lawrence, J. (1989). Impact of social desirability response

sets on the self-report of marital adjustment in alcoholics. Journal of Studies on Alcohol

and Drugs, 50, (1), 24-29.

Rynes, S. (1993). Who’s selecting whom? Effects of selection practices on applicant attitudes

and behavior. In Schmitt, N. and Borman, W. (Eds). Personnel Selection in

Organizations, 240-74. San Francisco, CA: Jossey Bass.

Sabin, J., Marini, M., & Nosek, B. A. (2012). Implicit and explicit anti-fat bias among a large

sample of medical doctors by BMI, race/ethnicity and gender. PLoS ONE, 7 (11) e48448.

doi: 10.1371/journal.pone.0048448

Schaubhut, N., Herk, N., & Thompson, R. (2009). MBTI® Form M manual supplement, 9-17,

Mountain View, CA: CPP, Inc.

Schmit, M., & Ryan, A. (1993). The Big Five in personnel selection: Factor structure in

applicant and nonapplicant populations. Journal of Applied Psychology, 78, 966-974.

Schmidt, F., Le, H., & Ilies, R. (2003). Beyond alpha: An empirical examination of the effects of

different sources of measurement error on reliability estimates for measures of

individual-differences constructs. Psychological Methods, 8, 206-224.

Schmukle, S., Back, M., & Egloff, B. (2008). Validity of the five-factor model for the implicit

self-concept of personality. European Journal of Psychological Assessment, 24 (4), 263-

272. doi: 10.1027/1015-5759.24.4.263




Schnabel, K., Banse, R., & Asendorpf, J. (2006). Assessment of implicit personality self-concept

using the implicit association test (IAT). British Journal of Social Psychology, 45, 373–

396.

Schnabel, K., Asendorpf, J. B., & Greenwald, A. (2007). Implicit association tests: A landmark

for the assessment of implicit personality self-concept. In G. J. Boyle, G. Matthews, & H.

Saklofske (Eds.). Handbook of personality theory and testing, 508-528. London: Sage.

Schnabel, K., Asendorpf, J., & Greenwald, A. (2008). Assessment of individual differences in

implicit cognition. European Journal of Psychological Assessment, 24 (4), 210-217.

Sciencedaily.com (2011, November 30). Psychopaths' Brains Show Differences in Structure and

Function. Retrieved April 17 from

http://www.sciencedaily.com/releases/2011/11/111122230903.htm

Shepard, H. (2011). The cultural context of Cognition: What the Implicit Association Test tells

us about how culture works. Sociological Forum, 26 (1), 121-143.

doi: 10.1111/j.1573-7861.2010.01227.x

Shrout, P. (1993). Analyzing consensus in personality judgments: A variance components

approach. Journal of Personality, 61, 769-788.

Siers, B. (2008). Effects of response distortion on the validity of implicit association tests of

personality. ProQuest Information and Learning. Dissertation Abstracts International:

Section B: The Sciences and Engineering, 68 (11), 7699-7699.

Sipps, G., Alexander, R., & Friedt, L. (1985). Item analysis of the Myers-Briggs Type Indicator.

Educational and Psychological Measurement, 45 (4), 789-796.

Smith, D. (2004). Why we lie. The evolutionary roots of deception and the unconscious mind.

New York: St. Martin’s Press.


Snell, A., & McDaniel, M. (1998). Faking: getting data to answer the right questions. Paper

presented at the 13th Annual Conference of the Society for Industrial and Organizational

Psychology, Dallas, TX.

Snell, A., Sydell, E., & Lueke, S. (1999). Towards a theory of applicant faking: integrating

studies of perception. Human Resource Management Review, 19, 219-242.

Snell, K. (1994). Response distortion and the Myers-Briggs Type Indicator: Implications for

selection and organizational applications. Virginia Polytechnic Institute and State

University.

Snowden, R., Gray, N., Smith, J., Morris, M., & MacCulloch, M. (2004). Implicit affective

associations to violence in psychopathic murderers. Journal of Forensic Psychiatry and

Psychology, 15 (4), 620-641.

SPSS (Version 12 graduate pack) [software]. Tarrytown NY: IBM

Stahl, L., CBS News (Producers). (2012, November 17). The Baby Lab 60 Minutes [Television

broadcast]. New York, New York, CBS News.

http://www.cbsnews.com/video/watch/?id=50135408n

Stanley S., Phelps E., & Banaji, M. (2008). The neural basis of implicit attitudes. Current

Directions in Psychological Science, 17 (2), 164-170.

doi: 10.1111/j.1467-8721.2008.00568.x

Stanush, P. (1997). Factors that influence the susceptibility of self-report inventories to

distortion: a meta-analytic investigation, unpublished doctoral dissertation, Texas A&M

University, College Station, TX.



Steele, R., & Kelly, T. (1976). Eysenck Personality Questionnaire and Jungian Myers-Briggs

Type Indicator correlation of extraversion-introversion. Journal of Consulting and

Clinical Psychology, 44 (4), 690-691.

Steffens, M., & Inga, P. (2001). Items' cross-category associations as a confounding factor in

the Implicit Association Test. Zeitschrift fuer Experimentelle Psychologie, 48 (2), 123-34.

Steffens, M. (2004). Is the implicit association Test immune to faking? Experimental


Stokes, G., Hogan, J., & Snell, A. (1993). Comparability of incumbent and applicant samples for

the development of biodata keys: the influence of social desirability. Personnel


Stricker, L., & Ross, J. (1964). Some correlates of a Jungian personality inventory.

Psychological Reports, 14, 623-643.

Stricker, L., & Ross, J. (1973). An assessment of some structural properties of the Jungian

personality typology. Journal of Abnormal and Social Psychology, 68, 62-67.

Stuttgen, P., Vosgerau, J., Messner, C., & Boatwright, P. (2011). Adding significance to the

Implicit Association Test, working paper, Tepper School of Business, Carnegie Mellon

University, Paper 1393. http://repository.cmu.edu/tepper/1393

Suplee, C. (1991). Performance-enhancement techniques found flawed or ineffective in study.

The Washington Post, September 25 p. A3.

Tajfel, H., & Turner, J. (1979). An integrative theory of intergroup conflict. In W.G. Austin & S.

Worchel (Eds.). The social psychology of intergroup relations, 33-47. Monterey, CA:

Brooks/Cole.


Terbeck S., Kahane G., McTavish S., Savulescu, J., Cowen, P., & Hewstone, M. (2012).

Propranolol reduces implicit negative racial bias, Psychopharmacology, 3, 419-424.

doi: 10.1007/s00213-012-2657-5

Tierney, J. (2008). In bias test, shades of gray. New York Times. Nov. 17. Retrieved 2013 July

24.

Tyson, T. (1992). Does believing that everyone else is less ethical have an impact on work

behavior? Journal of Business Ethics, 11, 707-717.

van Alphen, A., Halfens, R., Hasman, A., & Imbos, T. (1994). Likert or Rasch? Nothing is more

applicable than good theory. Journal of Advanced Nursing, 20, 196-201.

Viswesvaran, C., & Ones, D. (1999). Meta-analyses of fakability estimates: Implications for

personality measurement. Educational and Psychological Measurement, 59, 197–210.

doi:10.1177/00131649921969802

University of Wisconsin-Madison image (2011, November 30). Psychopaths' brains show

differences in structure and function. ScienceDaily. Retrieved August 7, 2013, from

http://www.sciencedaily.com/releases/2011/11/111122230903.htm

von Hippel, W., Sekaquaptewa, D., Espinoza, M., Thompson, P., & Vargas, W. (2003).

Stereotypic explanatory bias: Implicit stereotyping as a predictor of discrimination.

Journal of Experimental Social Psychology, 39 (1), 75-82.

von Hippel, W. (2007). Aging, executive functioning and social control. Current Directions in

Psychological Science, 16 (5), 240-244.

Wachter, P. (2005). Jung Love. Swarthmore College Bulletin, 52 (5), 26. Swarthmore PA

Watkins, D., & Cheung, S. (1995). Culture, gender, and response bias. Journal of Cross-Cultural

Psychology, 26 (5), 490-504.



Wax, A., & Tetlock, P. (2005). We’re all racists at heart. Wall Street Journal, Dec. 01.

Retrieved 2011, June 09.

Wegner, D., & Bargh, J. (1998). Control and automaticity in social life. In D. Gilbert, S.

Fiske, & G. Lindzey (Eds.), Handbook of Social Psychology, 446–496. New York:

McGraw-Hill.

Wheeler, J., Hamill, L., & Tippins, N. (1996). Warnings against candidate misrepresentations: do

they work? Paper presented at the 11th Annual Conference of the Society for Industrial

and Organizational Psychology, San Diego, CA.

Wheeler M., & Fiske S. (2005). Controlling racial prejudice: social-cognitive goals affect

amygdala and stereotype activation. Psychological Science 16 (1), 56-63.

doi: 10.1111/j.0956-7976.2005.00780.x

Wheeler, S., Christian, S., & Petty, R. (2001). The effects of stereotype activation on

behavior: A review of possible mechanisms. Psychological Bulletin, 127, 797–826.

Wehr, G. (1971). Portrait of Jung: An illustrated biography. (W. A. Hargreaves, Trans.). New

York: Herder and Herder. (Original work published 1969.)

Wiggins, J. (1964). Convergences among stylistic response measures from objective personality

tests. Educational and Psychological Measurement, 24, 551-562.

Wilson, T., Lindsey, S., & Schooler, T. (2000). A model of dual attitudes. Psychological Review,

107, 101-126.

Wilson, T. (2002). Strangers to ourselves: Discovering the adaptive unconscious.

Cambridge, MA: Harvard University Press.

West, S., & Aiken, L. (1991). Multiple regression: Testing and interpreting interactions. Sage

Publications, Incorporated.


Wilde, G. (1977). Trait description and measurement by personality questionnaires. In R. B.

Cattell & R. Dreger (Eds.), Handbook of modern personality theory, Chapter III, 69-103.

Washington D.C.: Hemisphere Publishing Corporation.

Wright, N., & Meade, A. (2011). Predictive validity and procedural justice of the implicit

association test. Paper presented at the 26th Annual Meeting of the Society for

Industrial and Organizational Psychology, Chicago, IL (April).

Yovel, I., & Friedman, A. (2013). Bridging the gap between explicit and implicit

measurement of personality: The questionnaire-based implicit association

test. Personality and Individual Differences, 54, 76–80.

Zerbe, W., & Paulhus, D. (1987). Socially desirable responding in organizational behavior: a

reconception. Academy of Management Journal, 12 (2), 250-264.

Zickar, M., Rosse, J., & Levin, R. (1996, April). Modeling the effects of faking on personality

instruments. Paper presented at the 11th annual meeting of the Society of Industrial and

Organizational Psychology, San Diego, CA.

Zickar, M., & Robie, C. (1999). Modeling faking good on personality items: An item-level

analysis. Journal of Applied Psychology, 84, 551–563. doi:10.1037/0021-9010.84.4.551

Zickar, M., Gibby, R., & Robie, C. (2004). Uncovering faking samples in applicant, incumbent,

and experimental data sets: An application of mixed-model item response theory.

Organizational Research Methods, 7, 168–190.

Ziegler, M. (2007). Situational demand and its impact on construct and criterion validity of a

personality questionnaire: State and trait, a couple you just can’t study separately!

Dissertation, LMU München: Fakultät für Psychologie und Pädagogik.


Ziegler, M., Schmidt-Atzert, L., Bühner, M., & Krumm, S. (2007). Fakability of different

measurement methods for achievement motivation: Questionnaire, semi-projective, and

objective. Psychology Science, 49 (4), 291–307.

Ziegler, M., & Bühner, M. (2009). Modeling socially desirable responding and its effects.

Educational and Psychological Measurement, 69 (4), 548-565.

doi: 10.1177/0013164408324469

Ziegler, M., Toomela, A., & Bühner, M. (2009). A reanalysis of Toomela (2003): Spurious

measurement error as cause for common variance between personality factors.

Psychology Science Quarterly, 51, 65-75.

Ziegler, M., Danay, E., Schölmerich, F., & Bühner, M. (2010). Predicting academic success

with the Big Five rated from different points of view: Self-rated, other rated and faked.

European Journal of Personality, 24 (4) 341-355.

Ziegler, M., MacCann, C., & Roberts, R. (2011). Faking: knowns, unknowns, and points of

contention. In M. Ziegler, C. MacCann & R. D. Roberts (Eds.). New perspectives on

faking in personality assessment, 3-16. New York, NY: Oxford University Press.



Table 1

McCrae & Costa’s Comparison of the Big Five (FFM and MBTI)

MBTI scales              E/I        S/N        T/F        J/P

Big Five scales

Extraversion            -.70***     .15*       .15*       .17**

Openness                 .03        .70***     .00        .29**

Agreeableness            .05        .04        .45***    -.01

Conscientiousness        .08       -.12       -.20**     -.47***

N = 468 ***p < .001 **p < .01 *p < .05

(note: This table is adapted from McCrae and Costa’s Table 2 (1989)

where they list correlations for men (N = 267) and women (N = 201)

separately. For p levels less than .001, some of the male vs female

correlations varied by more than a few points. Also note the signs of

correlations reflect that MBTI continuous scores of 100 or higher indicate

preferences for I, N, F, and P).

(McCrae & Costa, 1989, p. 37)

A closer look at this study reveals even more noteworthy findings. Before proceeding, however,

the reader is reminded of the widely accepted standards for interpreting Pearson Product Moment r

correlations. If r =

+.70 or higher - Very strong positive relationship

+.40 to +.69 - Strong positive relationship

+.30 to +.39 - Moderate positive relationship

+.20 to +.29 - Weak positive relationship

+.01 to +.19 - No or negligible relationship

-.01 to -.19 - No or negligible relationship

-.20 to -.29 - Weak negative relationship

-.30 to -.39 - Moderate negative relationship

-.40 to -.69 - Strong negative relationship

-.70 or lower - Very strong negative relationship
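
The bands above can be expressed as a small lookup. The following is only an illustrative sketch: the function name describe_r is hypothetical, and it assumes that r is rounded to two decimals before it is compared against the cut-offs, and that a value of exactly zero falls in the "no or negligible" band.

```python
def describe_r(r: float) -> str:
    """Label a Pearson r using the interpretive bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(round(r, 2))  # assumption: compare after rounding to two decimals
    if size >= 0.70:
        strength = "Very strong"
    elif size >= 0.40:
        strength = "Strong"
    elif size >= 0.30:
        strength = "Moderate"
    elif size >= 0.20:
        strength = "Weak"
    else:
        # Covers -.19 to +.19, including zero (assumption for r = 0).
        return "No or negligible relationship"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} relationship"
```

Applied to Table 1, for example, the E/I correlation with Extraversion (-.70) would be labelled a very strong negative relationship, and the T/F correlation with Agreeableness (.45) a strong positive one.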


Table 2

Pearson Product Moment r Comparisons of IAT, MBTI and MMTIC

           E / I      S / N      T / F      J / P

MBTI       0.254**    0.238*     0.190      0.171

MMTIC      0.319**    0.268**    0.078      0.132

Note: N = 103; * p < .05, two-tailed; ** p < .01, two-tailed
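
As a rough cross-check of the significance markers in Table 2, a two-tailed p value can be approximated from r and N with the Fisher z transformation. This is a sketch only: the function name approx_p_two_tailed is hypothetical, and the normal approximation used here is an assumption that stands in for the exact t-based test a statistics package would normally report, so the values are approximate.

```python
import math
from statistics import NormalDist


def approx_p_two_tailed(r: float, n: int) -> float:
    """Approximate two-tailed p for a Pearson r via the Fisher z transformation."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    # Fisher z: atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Table 2 values with N = 103.
for label, r in [("MBTI E/I", 0.254), ("MBTI S/N", 0.238), ("MBTI T/F", 0.190)]:
    print(label, round(approx_p_two_tailed(r, 103), 4))
```

Under this approximation the MBTI E/I correlation falls below .01, the S/N correlation falls between .01 and .05, and the T/F correlation falls just above .05, which agrees with the asterisks shown in the table.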


Table 3

Means and Standard Deviations.

                    M         SD       N

e/i IAT            .10       .45     103

e/i MMTIC        69.48    428.76     103

e/i MBTI           .01     15.74     103

s/n IAT            .02       .32     103

s/n MMTIC      -130.52    418.33     103

s/n MBTI          4.83     12.64     103

t/f IAT            .33       .39     103

t/f MMTIC       120.68    477.06     103

t/f MBTI          5.99     13.18     103

j/p IAT            .14       .34     103

j/p MMTIC        68.72    393.66     103

j/p MBTI          3.72     14.59     103


Table 4

Pearson Product Moment r correlations between the MBTI and MMTIC

E / I      S / N      T / F      J / P

0.813      0.743      0.692      0.766

Note: N = 103


Table 5

Summary of the Improved Algorithm for IAT Scoring Procedures Recommended by Greenwald et

al. (2003)

__________________________________________________________________________

1. Delete trials greater than 10,000 milliseconds. These are to eliminate the scores of subjects

who have had their attention diverted during the IAT.

2. Delete subjects for whom more than 10% of trials have latency responses less than 300

milliseconds. These are the so-called: “key-bangers”.

3. Compute the “inclusive” standard deviation for all trials in Stages 3 and 6 and likewise for all

trials in Stage 4 and 7.

4. Compute the mean latency for responses for each of Stages 3, 4, 6, and 7.

5. Compute the two mean differences (Mean_Stage 6 – Mean_Stage 3) and (Mean_Stage 7 – Mean_Stage 4).

6. Divide each difference score by its associated “inclusive” standard deviation.

7. D = the equal-weight average of the two resulting ratios.

__________________________________________________________________________

Note: From Greenwald, Nosek, and Banaji (2007, Table 3.3). Copyright 2003 by the American Psychological

Association. Adapted by permission. This computation is appropriate for designs in which subjects must correctly

identify each item before the next stimulus appears. If subjects can proceed to the next stimulus following an

incorrect response, the following steps may be taken between Steps 2 and 3 in the table: (1) compute mean latency

of correct responses for each combined Stage (3, 4, 6, 7); (2) replace each error latency with an error penalty

computed optionally as “Stage mean + 600 milliseconds” or “Stage mean + twice the SD of correct responses for

that stage.” Proceed as above from Step 3 using these error-penalty latencies. Stage numbers refer to stages depicted

in Figure 3.1. SPSS and SAS syntax for implementing the new scoring algorithm are available at

faculty.washington.edu/agg/iat_materials.htm and www.briannosek.com respectively.

SAS syntax is available at:

http://projectimplicit.net/nosek/papers/scoringalgorithm.sas.txt

SPSS syntax for computing the D measure can be found in the “Generic IAT

zipfile download” at:

http://faculty.washington.edu/agg/iat_materials.htm
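
The seven steps in Table 5 can be sketched for a single subject as follows. This is a minimal illustration and not the SPSS or SAS syntax referenced above. The function name iat_d_score and the input layout (a dictionary mapping stage numbers 3, 4, 6, and 7 to lists of latencies in milliseconds) are hypothetical. It also assumes that the "inclusive" standard deviation is the sample standard deviation of all retained trials from the two stages taken together, and that the 10% check in Step 2 is applied after the Step 1 deletions.

```python
from statistics import mean, stdev


def iat_d_score(latencies_by_stage):
    """Return the D score for one subject, or None if the subject is excluded."""
    # Step 1: delete trials longer than 10,000 ms.
    kept = {s: [t for t in latencies_by_stage[s] if t <= 10_000] for s in (3, 4, 6, 7)}

    # Step 2: exclude the subject if more than 10% of trials are under 300 ms.
    all_trials = [t for trials in kept.values() for t in trials]
    too_fast = sum(1 for t in all_trials if t < 300)
    if too_fast > 0.10 * len(all_trials):
        return None

    ratios = []
    for first, second in ((3, 6), (4, 7)):
        # Step 3: "inclusive" SD over all trials of the paired stages (assumed pooling).
        inclusive_sd = stdev(kept[first] + kept[second])
        # Steps 4 and 5: stage means and their difference.
        difference = mean(kept[second]) - mean(kept[first])
        # Step 6: divide the difference by its inclusive SD.
        ratios.append(difference / inclusive_sd)

    # Step 7: equal-weight average of the two ratios.
    return mean(ratios)
```

The error-penalty variant described in the note (replacing error latencies before Step 3) is not covered by this sketch.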



Table 6

MBTI-like descriptive words used for IAT tests

Outgoing-Quiet (used in place of Extraversion-Introversion)

Talkative Cautious

Loud Modest

Show off Private

Expressive Quiet

Outgoing Calm

Sociable Reserved

Realistic-Inventive (used in place of Sensing-Intuition)

Detailed Creative

Sensible Innovative

Realistic Theoretical

Practical Unconventional

Traditional Inventive

Literal Clever

*note - Abstract did not work during beta-testing and was withdrawn before the study began

Logical-Warm (used in place of Thinking-Feeling)

Objective Warm

Tough-minded Generous

Frank Agreeable

Skeptical Emotional

Analytical Harmonious

Logical Sympathetic

Organized-Spontaneous (used in place of Judging-Perceiving)

Disciplined Relaxed

Planful Informal

Orderly Spontaneous

Organized Easygoing

Prompt Casual

Thorough Flexible


Table 7

Stimuli used in the qIAT

____________________________________________________________

Category Stimuli

____________________________________________________________

True              I’m in a building in Mount Scopus campus

                  I’m in a small room with a computer

                  I’m participating in an experiment in psychology

                  I’m in a psychology laboratory

                  I’m sitting in front of the computer

False             I’m climbing a steep mountain

                  I’m sitting on the sand at the beach

                  I’m playing my electric guitar

                  I’m playing soccer outside

                  I’m shopping at the local grocery store

Extravert person  I am the life of the party

                  I feel comfortable around people

                  I start conversations

                  I talk to a lot of different people at parties

                  I don’t mind being the center of attention

Introvert person  I don’t talk a lot

                  I keep in the background

                  I have little to say

                  I don’t like to draw attention to myself

                  I am quiet around strangers

_________________________________________________________

adapted from Yovel and Friedman’s Table 1, Yovel and Friedman, (2013, p.78).


Table 8

Correlations between the explicitly measured extraversion

items and the explicit and implicit measures of extraversion

__________________________________________________________________

Extraversion items (explicit measure)    Explicit scale total score    qIAT D score

__________________________________________________________________

I am the life of the party 0.68*** 0.37**

I don’t talk a lot (r) 0.70*** 0.27*

I feel comfortable around people 0.55*** 0.03

I keep in the background (r) 0.80*** 0.26*

I start conversations 0.66*** 0.25*

I have little to say (r) 0.65*** 0.40**

I talk to a lot of different people at parties 0.77*** 0.28**

I don’t like to draw attention to myself (r) 0.68*** 0.27*

I don’t mind being the center of attention 0.53*** 0.17

I am quiet around strangers (r) 0.76*** 0.34**

________________________________________________________________________

Note: Correlations with the explicit scale are corrected item-total correlations.

qIAT = questionnaire-based implicit association test; r = reversed.

* p< .05; ** p <.01; *** p < .001

_________________________________________________________________________

adapted from Yovel and Friedman’s Table 2, Yovel and Friedman, (2013, p.78).


Figure 1 - The brain regions most often reported in studies of race.

From Kubota & Phelps et al. (2012): “The brain regions most often reported in studies of race. The

amygdala has been linked to automatic race evaluations and the FFA is involved in the rapid

identification of other race individuals. The ACC is thought to detect conflict between implicit race

attitudes and conscious intentions to be nonbiased. When such conflicts are detected, the DLPFC may

regulate negative evaluations” (Kubota & Phelps, 2012, p. 944).


Figure 2 - A model for the neural basis of implicit attitudes

“A model for the neural basis of implicit attitudes. The evidence reviewed in this article

suggests at least 3 components: The amygdala is implicated in the automatic evaluation of

socially relevant stimuli while the anterior cingulate cortex (ACC) and the dorsolateral

prefrontal cortex (dlPFC) are implicated in the detection of such stimuli and the regulation of

the amygdala’s response, respectively. There are many open questions. For example, how does

the dlPFC exert its influence over the amygdala given that there is little evidence of direct

connectivity between the two structures (dashed red arrow; one possibility is via the

ventromedial prefrontal cortex—see Phelps and LeDoux, 2006). Another is whether the ACC

detects the presence of a social stimulus itself (dotted white arrow) or is, instead, sensitive to the

initiation of the automatic amygdala response (dotted green arrow)”

(Stanley, Phelps & Banaji 2008, p. 168).


Figure 3 - Example of “wrong answer” screen on IAT test


Figure 4 - Example of 4 categories instead of 2 categories on IAT test


APPENDIX A

Myers-Briggs Type Indicator (MBTI) – Preference Clarity Index - evidence of continuous scoring


APPENDIX B

Murphy-Meisgeier Type Indicator for Children (MMTIC) – percentage evidence of continuous scoring


APPENDIX C

Murphy-Meisgeier Type Indicator for Children (MMTIC) – “Strengths and Stretches” section


APPENDIX D

Script for participant recruiting session by the TAs to each whole class


On a voluntary basis, you are being invited to take part in research regarding three different

personality tests. This research is part of a Master’s thesis requirement currently being worked

on by a teacher at Hall High School, Mr. Terry Marselle. The purpose of this study is to compare

the results of these personality tests to each other. That is, how they are alike and how they are

different. It is stressed that you have the right not to take part – or if you have already started –

you can withdraw from participation at any time. In addition, please be aware that whether you,

with your parent’s permission, choose to participate or not, your grades in

school will not be affected in any way. It should be made known as well, that neither the

participants in this study nor any of the researchers will ever know the results.

Please be aware that all three of these tests are very non-threatening. None of them look for

negative personality characteristics, attitudes or opinions, either in the past, present or signs for

the future. Again, you will be taking three different personality tests. Then, approximately one

week later, you will be tested on a different personality test. That means, you will end up taking

three separate administrations of these tests. At no time will there be more than one personality

test “in play”. Whenever one test is completed, one week later, a second test will begin – and so

on. From beginning to end, this “one-week-at-a-time” process will take approximately three

weeks. All of this will start on a weekend in early April. If you would like to participate, take a

participant packet home and have the Parental Consent form signed by your parent/guardian as

well as the Student Assent form by Monday, April 01, 2013. Return all signed forms to Mr. Brian

Rappelfeld (TA).


APPENDIX E

Script for informational session for participants who volunteered

Thank you for volunteering to take part in this study! To briefly restate some points given in the

original recruitment/informational session, Mr. Marselle is a Master’s degree student at the

University of Hartford in West Hartford, CT and as part of his thesis, he is involved in a research

study regarding the reliability and validity of three different personality tests – particularly with

older adolescents.

All three of these personality tests are non-threatening. None of them look for negative

personality characteristics, attitudes or opinions, either in the past, present, or signs for the

future. Each of these three personality tests takes approximately 40 minutes to complete and their

names are as follows:

Myers-Briggs Type Indicator (MBTI)

Murphy-Meisgeier Type Indicator for Children – the MBTI’s children’s counterpart

Implicit Association Test (IAT)

Again, please note that official permission for me to conduct this study has been granted by the

West Hartford Public Schools district level offices, as well as by Hall High School Interim

Principal, Mr. Thomas Einhorn, as well as Hall High School Social Studies Department Head,

Mr. Steve Armstrong.

From beginning to end, this process will take approximately three weeks, intended to be one

test administration per week. This three week process will begin in early April, 2013. All


administrations of these tests will be online and intended to be taken at home in a quiet

environment. Therefore, a computer with an internet connection is required.

Taking part is voluntary. You are reminded again that your participation is voluntary. You

have a right to withdraw from this study at any time for any reason. No permission is needed.

Please also note your non-participation will not affect your grade(s) in any course at Hall High

School.

All data collection will be – and remain - confidential.

The data collection process will be confidential. No personally-identifying information

will be collected. When you take these online assessments, instead of entering your name, you

will enter a unique alpha-numeric code you will have generated yourself according to a

suggested formula. Why do it this way? In social science research, there is a phenomenon called

the “order effect”. That is, the order in which these three different tests are taken might possibly

affect results. For this reason, and because the number of students involved in this study is

expected to be over 200, there will be three different groups. Each of these groups will have a

different order in which they will be taking these online tests. Therefore, the only commonality

in any of these unique codes will be a one-digit prefix of “1”, “2”, or “3”, which will

be pre-assigned to you. Group one will take the IAT first in the sequence. Group two will

take the IAT second. Group three will take the IAT last in the sequence.




As an example, if you are pre-assigned to Group 1, the formula for generating your unique

alpha-numeric code is as follows:

start with your pre-assigned group number: example: 1

add the last letter of your last name. For example, if the last letter of your last name is “a”, then

example up to this point: 1a

add the number of the month in which you were born: example: Oct = “10”

example up to this point: 1a10

add the numerical day of the month on which you were born: example: 19

example up to this point: 1a1019 = your completed unique code.
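
The formula above can also be written as a short routine. The following Python sketch is an illustration only and was not part of the materials given to participants. The function names, the sample surname "Garcia", the lowercasing of the letter, and the decision to write the month and day without leading zeros are all assumptions, since the instructions show only the single example 1a1019.

```python
# Hypothetical helper names; not part of the study materials.
# Position of the IAT in each group's sequence, as stated in the instructions.
IAT_POSITION = {1: "first", 2: "second", 3: "last"}


def make_code(group: int, last_name: str, birth_month: int, birth_day: int) -> str:
    """Build a code: group prefix + last letter of last name + month + day."""
    if group not in IAT_POSITION:
        raise ValueError("group must be 1, 2, or 3")
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be between 1 and 12")
    if not 1 <= birth_day <= 31:
        raise ValueError("birth_day must be between 1 and 31")
    name = last_name.strip()
    if not name:
        raise ValueError("last_name must not be empty")
    # Assumption: the letter is lowercased and numbers are not zero-padded.
    last_letter = name[-1].lower()
    return f"{group}{last_letter}{birth_month}{birth_day}"


def iat_position(code: str) -> str:
    """Read the group prefix (first character) and report when the IAT is taken."""
    return IAT_POSITION[int(code[0])]


print(make_code(1, "Garcia", 10, 19))  # 1a1019
print(iat_position("1a1019"))          # first
```

Because the prefix is the group number, the position of the IAT in a participant's sequence can be read directly from the first character of the code, while nothing else in the code names the participant.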

In addition to the above, participants are again reminded that you will complete all three of these

personality tests online and without any supervision or input from the researcher. Your responses

will be scored automatically via computer software programs. Therefore, the researcher cannot

score or influence the data. Information gleaned from this study will be used in presentations

and publications. The fact that everyone has an alpha-numeric unique code ensures that there

will never be a situation where your names will be associated with any answer given on any of

these three tests.

Other than this informational session, there will be no class time used to do this.




Testing environment

One of the major guidelines of this research is that all of these personality inventories will be

taken online and at home. Therefore, you must have an internet connection to do this. Equally

important is the quality of the environment of the test-taking. It must be one of quietness and

where there will be no interruptions. Ideally, test-takers should be in a situation where they are

able to be alone with their thoughts. Specifically, there cannot be any multi-tasking whatsoever.

With two of these three tests, there is an exception to the “home alone” model of this test-taking

process. For the Myers-Briggs Type Indicator (MBTI) and the Murphy-Meisgeier Type Indicator

for Children (MMTIC) – if participants can assure the researcher by proxy (via the TAs) that this

can be accomplished in an environment that is of equal quality – permission may be granted, but

the researcher must be asked first. An example is in the quiet area of the school library, i.e., with

no disturbing or intrusive stimuli around you. Again, at no time can a participant take any of

these tests while in class or any other noisy environment within the school. This includes noisy

areas of the library.

Regarding the third test, the Implicit Association Test (IAT), there will be no alternatives to the

“online at home, quiet environment rule” allowed. Because this test is measured in time and

degree of accuracy, the IAT test is the most vulnerable to having its results affected by a

compromised testing atmosphere. Therefore, it is paramount that the IAT be taken in a highly

controlled quiet and quality environment – most probably at home.




Because a quality test-taking environment is integral to this research, all participants will be

asked to sign a pledge stating they will abide by the researcher’s requests. Returning to the point

about multi-tasking, there is a particular worry here. Examples of multi-tasking are as follows:

any media going on in the background, e.g. movies or TV, sending or receiving texts, e-mails,

tweets, FaceBooking, playing video and/or computer games, using iPads, listening to selections

on MP3 devices such as iPods, etc. Even having other family members and/or friends near you

while you are taking these tests is discouraged.

Actual instructions on how to go to the publisher’s websites and log-on to take these tests

(See Appendix F, p. 151)

For all participants, please note the following important message:

After having utilized the specific URL (link) provided for each particular test, you will

quickly arrive at the point where you will be prompted to type in your first and last names. DO

NOT do this, as the test results will be discarded.

Instead, you must type in for both the first and last name your unique alpha-numeric code

according to the formula provided. Again, the formula is as follows. If, for example, you have

been pre-assigned to Group 1, the formula for generating your unique alpha-numeric code is

as follows:

start with your pre-assigned group number: example: 1

add the last letter of your last name. For example, if the last letter of your last name is “a”, then

example up to this point: 1a

add the number of the month in which you were born: example: Oct = “10”

example up to this point: 1a10

add the numerical day of the month on which you were born: example: 19

example up to this point: 1a1019 = your completed unique code.

In our above example, therefore, you will be filling in 1a1019 for both your first and last

name(s).


APPENDIX F

Actual instructions on how to go to the publisher’s websites and log-on to take these tests

1. For the Murphy-Meisgeier Type Indicator for Children (MMTIC), please do the

following:

2. For the Myers-Briggs Type Indicator (MBTI), here are the:

Online Assessment Instructions

To Take an Assessment

1. Using a web browser (e.g., Microsoft® Internet Explorer),

access the CPP Online Assessment site.

https://online.cpp.com

2. Enter the following Login (all small letters). group 3 mbti taken last

3. Enter the following Password (all small letters).

ctspringwhps

4. Click: LOGIN

5. For both first and last names, don’t forget to enter your unique alpha-numeric code according to the formula given to

you by Hall High School researchers.

Enter your unique code for both first and last names and then click: BEGIN



APPENDIX F - continued

Actual instructions on how to go to the publisher’s websites and log-on to take these tests - continued

This is what the beginning of the MBTI test-taking screen looked like.

Do not fill in gender or e-mail address or home postal code.



APPENDIX F - continued

Actual instructions on how to go to the publisher’s websites and log-on to take these tests - continued

3. For the Implicit Association Test (IAT), please click on the following URL (link)

http://research.millisecond.com/judy/aCCBatch22613.web

The screen will look like this:

end of APPENDIX F



APPENDIX G

Myers-Briggs Type Indicator (MBTI) questions – 93 total


APPENDIX G - continued



APPENDIX H

Murphy-Meisgeier Type Indicator for Children (MMTIC) questions – 54 total


APPENDIX H – continued



APPENDIX H - continued



APPENDIX I

The Implicit Association Test (IAT) sample questions, adjective word list, and actual questions

from screen captures. These are divided into 5-minute sorting tasks.

This questionnaire is administered via computer and is a sorting task in which the person is

presented with a word on the screen and is asked to sort that word into the category that fits for it.

A word will appear in the middle of the screen. Press the “A” key on your keyboard if that word

goes better with the word on the left. Press the “L” key if it goes better with the word on the right.

OUTGOING QUIET

Sociable


goes better with the word on the left. Press the “L” key if it goes better with the word on the right.

SELF OTHER

They


goes better with one of the words on the left. Press the “L” key if it goes better with one of the words

on the right.

OUTGOING QUIET

SELF OTHER

Sociable

* The Inquisit software for the IAT test will not allow a screen capture or printing of a page.

Therefore, it is recommended that members of the Master’s Thesis Committee actually take the

instrument themselves to achieve a 100% appreciation of the nature of this test. Should any member

choose to do so, they should be made aware they would be taking the same exact IAT test as

participants in this study would be taking. Here is the link:


APPENDIX I - continued



Words to be used for this study

Outgoing-Quiet (used in place of MBTI / MMTIC descriptors of Extraversion-Introversion)

Talkative

Loud

Show off

Expressive

Outgoing

Sociable

Cautious

Modest

Private

Quiet

Calm

Reserved

Realistic-Inventive (used in place of MBTI / MMTIC descriptors of Sensing-Intuition)

Detailed

Sensible

Realistic

Practical

Traditional

Literal

Creative

Innovative

Theoretical

Unconventional

Inventive

Clever





Logical-Warm (used in place of MBTI / MMTIC descriptors of Thinking-Feeling)

Objective

Tough-minded

Frank

Skeptical

Analytical

Logical

Warm

Generous

Agreeable

Emotional

Harmonious

Sympathetic

Organized-Spontaneous (used in place of MBTI / MMTIC descriptors of Judging-Perceiving)

Disciplined

Planful

Orderly

Organized

Prompt

Thorough

Relaxed

Informal

Spontaneous

Easygoing

Casual

Flexible





APPENDIX J

Participant honor pledge to maintain a quality testing environment

As part of Mr. Marselle’s Master’s Thesis research, while online taking the:

test of the Myers-Briggs Type Indicator (MBTI)

test of the Murphy-Meisgeier Type Indicator for Children (MMTIC)

test of the Implicit Association Test (IAT)

I ________________________________________________ (printed name of student participant) pledge to maintain the integrity of

the environment in which I do so. I understand that in order to accomplish this, the testing

environment must be a quiet place, preferably when I am alone and where the chances of

interruptions and intrusions will be very low. As examples of a good testing environment, or the lack

thereof, the following are offered:

preferably at home - or in a similar situation that can assure an equally good quality atmosphere for test-taking

if taken at school (applicable to the MBTI and MMTIC only – and not the IAT), it cannot

be during any class, and must be in a situation where there is an extraordinarily quiet area

of the library, i.e., with no disturbing stimuli around you. Example: taking these in the

cafeteria or during a “class” which has a substitute in charge is not acceptable. In all

cases, taking the MBTI & MMTIC inside of school must be given advance permission by

the researcher, Mr. Marselle.

regarding the third test (the IAT), as stated above, there will be no exceptions

permitted. Because this test is timed, the IAT is the most vulnerable to having its results

affected by a compromised testing atmosphere. Therefore, it is paramount that the IAT be

taken in a highly controlled quiet and quality environment.

multi-tasking. This is a particular worry. There cannot be any multi-tasking whatsoever. Here are examples of what is not permitted:

any media going-on in the background, e.g. movies or TV

texting, e-mailing, Twittering, FaceBooking, playing video and/or computer games

using iPads or similar tablet instruments

listening to selections on MP3 devices such as iPods, etc. (no ear buds – no nothing)

having with you/near you, other family members and/or friends or children you might

be watching or babysitting for

I, _________________________________________ (full signature of student, written neatly), on this date ______________________,

have read the above and understand how important a quality test-taking environment is to this

research. As such, I agree to honor this pledge with my signature. Please do not give this to Mr.

Marselle. Give to the TA – Mr. Brian Rappelfeld.


APPENDIX K

Official permission from West Hartford Public Schools to conduct research on participants who

are younger than 18 years of age


APPENDIX L

General Informational / Introductory Letter to Parents of all 11th and 12th grade behavioral

science students at Hall High School – West Hartford CT

Dear Parent/Guardian:

I am a Master’s degree student at the University of Hartford in West Hartford, CT and as part of

my thesis, I am involved in a research study regarding the reliability and validity of three

different personality tests – particularly with older adolescents.

All of these three personality tests are non-threatening. None of them look for negative







Please note that official permission for me to conduct this study has been granted by the West

Hartford Public Schools district level offices, as well as by Hall High School Interim Principal,

Mr. Thomas Einhorn, as well as Hall High School Social Studies Department Head, Mr. Steve

Armstrong.

By allowing your student to take part in this study, please note that she/he will be taking three

different personality tests – a total of three administrations. From beginning to end, this process

will take approximately three weeks, intended to be one test administration per week. This three

week process will begin in early April, 2013. All administrations of these tests will be online and

intended to be taken at home in a quiet environment. Therefore, a computer with an internet

connection is required.

If your son/daughter is interested in taking part in this study, he/she will bring home a form

called the “Passive Parental or Guardian Consent Form” as well as a “Student Assent Form”. If

you have any questions about your rights as a parent/guardian or your son/daughter’s rights as a

research subject, please contact me:

Mr. Terrence Marselle,

Social Studies Department

Hall High School

975 North Main St.

West Hartford CT 06107

Phone: 860-232-4561 x 1173 Email: [email protected]

If you have any other questions about your student’s rights as a research subject, please contact

the University of Hartford Human Subjects Committee (HSC) at 860-768-4721. The HSC is a

group of people that reviews research studies and protects the rights of people involved in

research.



APPENDIX M

Passive Parental or Guardian Consent Form

Student's Name___________________________________Grade______(age?)

Dear Parent(s) or Guardian(s)

Mr. Terrence Marselle, a psychology teacher at Hall High School is asking permission for

your child to be in a research study on the validity and reliability of various personality

tests.

Please note that official permission for Mr. Marselle to conduct this study has been granted by

the West Hartford Public Schools district level offices, as well as by Hall High School Interim

Principal, Mr. Thomas Einhorn, as well as Hall High School Social Studies Department Head,

Mr. Steve Armstrong.

Whether you allow your son/daughter to take part in the study, or not, please understand that

his/her grade(s) will not be affected in any way.

By allowing your student to take part in this study, please note that she/he will be taking three

different personality tests – a total of three administrations. From beginning to end, this process

will take approximately three weeks, intended to be one test administration per week. This three

week process will begin in early April 2013.

All administrations of these tests will be online and intended to be taken at home in a quiet


All of these three personality tests are non-threatening. None of them look for negative







Taking part is voluntary.

Your student’s participation in the study is voluntary. Please note they have the right to withdraw

from this study at any time for any reason. No permission is needed. Please also note their non-

participation will not affect her or his grade in any course at Hall High School.

Your child's responses will remain confidential.

The data collection process will be confidential. No personally-identifying information

will be collected. When your son or daughter takes these online assessments, instead of entering

their name(s), they will enter a unique alpha-numeric code they will have generated themselves



according to a suggested formula. The only commonality in any of these unique alpha-numeric

codes will be a one-digit prefix of “1”, “2”, or “3”, which will be pre-assigned to

them. This is done to control for possible confounding variables

resulting from the three different orders in which these tests will be taken. Group one will take the IAT

test first in the sequence. So as to ensure confidentiality of this process, not even the parents may

know the formula that will be used to instruct your daughter or son on how to self-generate their

unique alpha-numeric code.

Your student will complete all instruments online without any supervision or input from the

researcher, and his / her responses will be scored automatically via a computer software program.

Therefore, the researcher cannot score or influence the data.

Information gleaned from this study will be used in presentations and publications. However, all

data will only be presented in the aggregate. There will never be a situation where participants’

names will be associated with any answer given on any of these three tests.

If you have any questions about your student’s rights as a research subject, please contact



Hall High School

975 North Main St.


Phone: 860-232-4561 x 1173 Email: [email protected]

If you have any other questions about your student’s rights as a research subject, please contact

the University of Hartford Human Subjects Committee (HSC) at 860-768-4721. The HSC is a

group of people that reviews research studies and protects the rights of people involved in

research.



APPENDIX M

Passive Parental or Guardian Consent Form - continued

Please keep this for your records

Passive parental consent form

I have read and understood the information provided to me about Mr. Marselle’s study on the

validity and reliability of various personality tests. I also understand that unless I respond via my

denial of my child’s participation in this study, it will be assumed by the researcher, Mr.

Marselle, that my permission is passively granted.

______________________________________________

(Date)

I ____ do not give my permission for my child ________________________ (child's name)

to be included in the study.

_________________________________

(Parent's or Guardian's Signature)


APPENDIX N

Informed Consent / Assent of student participants

Participation in a study of the validity and reliability of three different personality tests

The purpose of this study is to compare the results of three different personality tests to each

other. That is, how they are alike and how they are different.

You must be in the 11th grade or older to participate in this study.

In addition to your own informed consent, all participants must have at least one parent or

guardian’s passive consent to participate in this study. This includes all participants who are

18 years of age as well. A general informational letter has already been sent to your parent(s) /

guardian(s) informing them of this.

Participation in the study is voluntary. You may choose not to participate and this non-participation will not affect your grade in this course or any other course at Hall High School.

You have the right to withdraw from this study at any time for any reason. No permission is needed.

Official permission for the researcher to conduct this study has been granted by the West Hartford Public Schools district level offices, as well as by Hall High School Interim

Principal, Mr. Thomas Einhorn, as well as Hall High School Social Studies Department

Head, Mr. Steve Armstrong.

If you have questions about your rights as a research subject, please contact the University of

Hartford Human Subjects Committee (HSC) at 860-768-4721. The HSC is a group of people

that reviews research studies and protects the rights of people involved in research.

Risks of participation in this study are not greater, considering probability and magnitude, than those ordinarily encountered in daily life.

By taking part in this study, you will be taking three different personality tests – a total of three administrations. From beginning to end, this process will take approximately three

weeks, intended to be one test administration per week.

All administrations of these tests will be online and intended to be taken at home in a quiet


All of these three personality tests are non-threatening. None of them look for negative personality characteristics, attitudes or opinions, either in the past, present, or signs for the

future.

One of these tests is the Myers-Briggs Type Indicator (MBTI) and has 93 questions. It requires approximately 40 minutes to complete. A second test is the children’s counterpart of

the MBTI. It is called the Murphy-Meisgeier Type Indicator for Children. It has 54 questions

and requires approximately 25 minutes to complete. Both of these tests have no time limit for

completion. The third personality test is called the Implicit Association Test (IAT). In

approximately 80 situations, the test-taker is given word prompts whereby a decision must be

made as to which category the word should be placed. This test will chart how much time the

test-taker requires to complete the test. In most cases, the IAT test requires approximately 40

minutes.



APPENDIX N - Informed Consent/Assent of student participants - continued

Participation in a study of the validity and reliability of three different personality tests

While taking these tests, participants’ names will be confidential. Participants will be given pre-assigned code numbers to type in lieu of their proper name. Individuals at the data

gathering center will never know participants’ names and their association with any answers.

Participants will complete all instruments without any supervision or input from the

researcher, and their responses will be scored automatically via a computer software

program. Therefore, the researcher cannot score or influence the data.

Because the unique alpha-numeric code generated by the participant herself or himself ensures total confidentiality, at no point in the study will participants, the researchers, the

TAs, or the data gathering center know the identities or the results – or retroactively be able

to find out the identities or participants’ results.

Information gleaned from this study will be used in presentations and publications. However, all data will only be presented in the aggregate. There will never be a situation where

participants’ names will be associated with any answer given on any of these three tests.

Thank you for participating. If you have any questions about this survey, you may contact:



Hall High School

975 North Main St.


Phone: 860-232-4561 X 1173

Email: [email protected]



APPENDIX N - Informed Consent/Assent of student participants - continued

Please keep this for your records

Informed consent signature of student participant

After having read the above, I _____________________________________ (printed name of student participant) hereby give my

personal permission to the researcher, Mr. Marselle, to take part in his study which is

investigating the reliability and validity of the named (previous page) personality tests.

I also understand that my parent(s) or guardian(s) may exercise their right to choose that I may

not participate in this study – and that if they do so, their right not to allow my participation will

prevail over my desire to do so.

signed: _______________________________________________

signature of student participant (please write neatly)

dated: ___________________________________________2013
