A Predictive Model of Financial Risk Aversion


Economics Degree

Final Year Project

A Predictive Model of Financial Risk Aversion

Albert Dorador Chalar

Thesis advisor:

Dr. Xavier Freixas

June 2015

ACKNOWLEDGEMENTS

I would like to thank:

My thesis advisor, Dr. Xavier Freixas, Professor at Pompeu Fabra University / Barcelona GSE,

for his guidance when choosing the topic of the project and also throughout all this time that I

have been working on it while I was away taking part in an academic exchange program at

Carnegie Mellon University.

Ms. Lily Amara Morse, PhD candidate at Carnegie Mellon University, for her endless support

while I was at Carnegie Mellon and beyond, contributing decisively in the creation of the survey

I used, and therefore, making a truly decisive contribution to this project as a whole.

Dr. Joan de Marti Beltran, Professor at Pompeu Fabra University / Barcelona GSE, for his help

and always insightful suggestions regarding the mathematical treatment of the utility functions

considered and the models presented here in general.

Mr. Albert Raya Munte, Graduate student in Economics at Pompeu Fabra University /

Barcelona GSE, for his attention to detail and valuable suggestions that undoubtedly refined

this paper in countless instances.

ABSTRACT

The present study attempts to shed some light on the challenging task of predicting an agent’s

degree of risk aversion by just knowing some basic information about the agent, like gender,

age, income, etc. We present both an empirical and theoretical model in order to build a robust

predictive model.

We find that Race, Nationality, Home net income, Investment experience, and investing in

Bank deposits are statistically significant at the 95% confidence level and therefore they are

able to help predict an agent’s degree of risk aversion. In particular, we find that non-white people

are more risk averse than white people, US citizens are more risk averse than Spanish citizens,

people with lower income are more risk averse, people with less investment experience are

more risk averse, and people that typically like to invest in bank deposits are more risk averse

than others, including those that do not invest at all. Notably, we find that gender is not

statistically significant at the 95% confidence level, and that the relationship between education

level and risk-lovingness might not be positive for all education levels.

TABLE OF CONTENTS

1. INTRODUCTION

2. EMPIRICAL STUDY

2.1. Introduction

2.2. Methodology

a) Justification of the questionnaire on lotteries

b) Justification of the “demographics” questions of the survey

3. THEORETICAL MODEL

4. PREDICTIVE ANALYSIS

4.1. Introduction: Prediction vs Causality

4.2. The Model

a) Estimating the coefficient of risk aversion for agent i

b) Predicting the risk-lovingness score for an arbitrary agent j

5. APPLICATION

6. CONCLUSION

REFERENCES

APPENDIX

1. INTRODUCTION

The aim of this project is to devise a quantitative model to predict the degree of financial risk

aversion of an arbitrary agent, given a set of parameters, with the ultimate goal of being able to

understand better the risk profile of a person and thereby allowing financial institutions to

provide more accurate investment advice depending on the agent’s risk tolerance level.

This project attempts to improve the predictive power of current popular utility models in terms

of understanding the risk preferences of a given individual, instead of a so-called

“representative agent”.

Some of these common models include basic ones that are taught at undergraduate level, like

U(w) = ln(w) (see footnote 1) or, although far less popular,

U(w) = √w,

where w in both cases is the level of wealth of the person, but also more sophisticated ones

taught at graduate level, like

$$U(c, s) = \sum_{t=0}^{T} \delta^{t}\, u(c_t, s_t)$$

where δ is the agent’s discount factor, c_t is the consumption level at time t, and s_t is the savings

level at time t, as shown in Loewenstein et al. 2002, p.10, among others (see footnote 2).

All these models, no matter how sophisticated they are, tend to assume a sort of “average”

degree of risk aversion – like in the case of a logarithmic function, which has a certain degree

of concavity and therefore risk aversion implicit – or in case they are non-parametric, they don’t

specify any procedure to find out the exact coefficient of risk aversion for a particular individual

with a certain set of characteristics (because, to be fair, that’s obviously not their objective).

Footnote 1: For example, as shown in Gollier, C. et al., 2005. Economic and Financial Decisions under Risk. Princeton University Press, p. 34. Logarithmic utility functions are derived by applying L'Hôpital's rule to the CRRA functional form when γ approaches 1.

Footnote 2: For instance, a very similar model can be found in Acemoglu, D., 2009. Introduction to Modern Economic Growth. Princeton University Press, p. 156.

Therefore, we aim to do better in terms of how accurately the model approximates the level of

(financial) risk aversion of a given person.

This way, if we are able to create a satisfactory model in those terms we would be able to make

statements of the type: “if you are a 52-year old male, with this level of income, education, and

knowledge about the financial markets I advise you to avoid this kind of security, as you would

find its level of risk unacceptable (given its expected return)”.

The relevance of this project, of course, rests on two basic premises:

On the one hand, the 2011-2013 Spanish scandal regarding preferred stock in which many small

investors – who most likely didn’t have the right risk profile for such a type of security – lost

almost everything they had invested in this kind of hybrid asset (which constituted a large

proportion of their capital wealth), suggests that many people that invest in the financial markets

lack the financial knowledge required to do so since those large losses were regarded as a big

(and fatal) surprise that wrecked many families’ finances. Obviously, those small investors

were not the only ones to blame: if one assumes they didn’t have enough financial knowledge,

then it is highly likely that they didn’t know about this type of security either, so it is highly

likely that they were misguided by some financial institutions that did know the risks preferred

stocks entail and failed to communicate this effectively when they introduced this type of

security to those investors. It is risky to rely on others when their interests are not necessarily

aligned with yours.

On the other hand, it is possible that in some cases the financial institution involved did, in fact, do

its best in trying to convey the riskiness of that security, but even so, the tragedy happened

nonetheless due to one of the many biases humans have, namely, dynamic inconsistency: what

one perceives as desirable today is different from what is perceived as such tomorrow, for

many potential reasons, including the fact that people do not correctly calculate their financial

needs when the time horizon is far away, or the fact that some people may have a tendency to

underestimate the chance of a negative event happening.

Therefore, I find that trying to provide a quantitative tool to guide investment advisory – for the

benefit of the individual investor – is both interesting and socially useful.

Having said that, it would be naïve to state that my model is likely to capture all the complexity

of the human psyche and be 100% flawless, 100% of the time. There are many sources of

potential errors, including the dynamic changes in preferences (Hoch & Loewenstein, 1991),

or humans’ bounded rationality, in the sense that the five axioms of rationality – namely

Completeness, Transitivity, Continuity, Independence and Consequentialism (von Neumann &

Morgenstern, 1944, p.26) – do not always hold for all agents.

However, it is worth pointing out that my model’s accuracy (assuming a threshold level of

initial correctness) will increase as more individuals take the survey I use to estimate financial

risk aversion, since by the Law of Large Numbers, the sample mean converges to the true

population mean as n grows larger.

This project is divided into several sections and subsections:

- Introduction

- Body, in which we show the methodology and results of our work. It is divided into four subsections:
  - Empirical study, where we mainly discuss the survey employed to derive an agent’s degree of financial risk aversion.
  - Theoretical model, where we present the mathematical model chosen and explain why we chose it.
  - Predictive analysis, which is the core of this project, where we show how we estimate an agent’s coefficient of relative risk aversion as well as how we predict that coefficient for an arbitrary agent after our model has “learnt” how different characteristics of an individual affect her degree of risk aversion.
  - Application, where we briefly present a possible real-world application of the information obtained through this project.

- Conclusions

- References

- Appendix

2. EMPIRICAL STUDY

2.1. Introduction

In this section we will present how we prepared the empirical analysis we have carried out with

the goal of determining the degree of (financial) risk aversion of a person, by using a certain

type of survey created specifically for this purpose. The survey format and content are based on

a method that may be called “Point-wise utility function modelling”, whose basic idea is that a way of

building a von Neumann-Morgenstern utility function is to consider two significant monetary

values, w1 and w9, such that w9 > w1, and assign two arbitrary utility values to them, u(w1) =

0 and u(w9) = 32, for example (we chose 32, i.e. 2^5, and the convenience of using this scale

will become apparent when we discuss the survey methodology in more detail in the next

sub-section). Note that the two arbitrary values are almost meaningless: they are in different units

for different individuals (so it is not possible to make a direct comparison; see footnote 3) and u(w9) = 32 is

not intended to mean the maximum possible utility level for any given individual, it is only

intended to mean that u(w9) > u(w1), and that’s the only reason we use numbers. The utility

levels associated with the other quantities of money can be determined by trying to estimate the

Footnote 3: This has an extremely important consequence: because Ui($200,000) = 32 ≠ 32 = Uj($200,000) for i ≠ j, i.e. because a utility of “32 units” for person i may mean a completely different magnitude than “32 units” for person j, there is no problem in stating that in both cases the utility of $200,000 is 32 units, which implies that there is a unique γ that solves the equation U($200,000) = 32 ↔ 200,000^(1 - γ) / (1 - γ) = 32, but this is fine, since in reality “32” means a different number for different people, so if they were in the same units the number would not be 32 for person i and for person j (e.g. it would be 32 for person i and 25 for person j), so the value of γ would be different for person i and for person j for the same level of wealth ($200,000). This whole paper rests on this assumption: different people (may) have different values for γ, i.e. different degrees of relative risk aversion.

certainty equivalent of different lotteries (asking the agent to choose between lotteries and fixed

quantities of money). In order to simplify the task for the person taking the survey (which will

increase the reliability of the answer) we will only focus on “heads or tails” lotteries. In our case,

fix w1 = $0 and w9 = $200,000 such that u($0) = 0 and u($200,000) = 32. Now, consider a

lottery that gives as monetary outcomes either $200,000 or $0, each with probability 1/2. That is, we

play “heads or tails” with w9 and w1. Ask the agent for the certainty equivalent of this lottery,

i.e., the certain amount of money w5 that makes the agent indifferent between the lottery and

this certain amount. We know that, by definition of certainty equivalent, u(w5) = ½*u($0) +

½*u($200,000) = ½*( u($0) + u($200,000)) = ½*(0+32) = 16. This way we have evaluated the

utility function at another point: we know a third point, the point (w5, 16) in the Wealth-Utility

plane. We now ask the agent for the certainty equivalent of the lottery that gives w5 or $200,000

with equal probabilities. We call it w6. Then, we obtain the utility level of a fourth point: u(w6)

= ½*u(w5) + ½*u($200,000) = ½*(u(w5) + u($200,000)) = ½*(16+32) = 24. Now we know

yet another point in the plane: (w6, 24). By proceeding in a similar fashion, one can find all

intermediate points in the closed interval [w1, w9] and then one would obtain the graphical

representation of the utility function of the agent taking the survey, whose concavity determines

the degree of risk aversion which is what we are trying to estimate.
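The chaining procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `elicit_utility_points`, the callback `ask`, and the simulated respondent `sqrt_agent` (an expected-utility agent with u(w) = √w) are hypothetical and not part of the survey itself, and the order in which the upper and lower branches are asked is chosen for convenience.

```python
import math


def elicit_utility_points(w_min, w_max, ask, u_max=32.0):
    """Chain 50-50 lottery questions into (wealth, utility) points.

    `ask(low, high)` must return the respondent's certainty equivalent for a
    lottery that pays `low` or `high` with probability 1/2 each.
    """
    points = [(w_min, 0.0), (w_max, u_max)]

    # First question: lottery over the two bounds, so u(w5) = u_max / 2.
    w5 = ask(w_min, w_max)
    u5 = 0.5 * (0.0 + u_max)
    points.append((w5, u5))

    # Upper branch (w6, w7, w8): lottery between the previous answer and w_max.
    low, u_low = w5, u5
    for _ in range(3):
        ce = ask(low, w_max)
        u_ce = 0.5 * (u_low + u_max)
        points.append((ce, u_ce))
        low, u_low = ce, u_ce

    # Lower branch (w4, w3, w2): lottery between w_min and the previous answer.
    high, u_high = w5, u5
    for _ in range(3):
        ce = ask(w_min, high)
        u_ce = 0.5 * (0.0 + u_high)
        points.append((ce, u_ce))
        high, u_high = ce, u_ce

    return sorted(points)


def sqrt_agent(low, high):
    """Hypothetical respondent whose utility is u(w) = sqrt(w)."""
    expected_utility = 0.5 * math.sqrt(low) + 0.5 * math.sqrt(high)
    return expected_utility ** 2  # invert u to obtain the certainty equivalent


points = elicit_utility_points(0.0, 200_000.0, sqrt_agent)
for wealth, utility in points:
    print(f"w = {wealth:>10.2f}  u = {utility:>4.1f}")
```

With a real respondent, `ask` would simply return the typed survey answer for each lottery; the utility values attached to the answers (2, 4, 8, 16, 24, 28, 30) do not depend on what the respondent says, only the wealth levels do.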

Therefore, we have created a survey that tries to find all those intermediate values for the levels

of wealth that make the agent indifferent.

Essentially, our rationale is that if we are able to pair up answers to those decision-making

questions with agent characteristics, then after repeating the process with many different agents

we will begin to realize what type of people give what type of answers (if such a pattern exists,

which is our initial hypothesis), and therefore start making inferences and predicting the level

of risk aversion of a new agent by just knowing some basic, but relevant, characteristics about

her (age, gender, education level, household income, etc.). In effect, it is a really simple process

but with very powerful results potentially.

In the next sub-section we will discuss how we came up with the exact monetary quantities and

decision-making questions, as well as how we determined what specific agent characteristics

would be relevant in predicting the level of risk aversion of an agent and therefore must be

included in the empirical study.

2.2. Methodology

a) Justification of the questionnaire on lotteries

Our survey asks 7 lottery questions and allows us to find monetary quantities w2 through w8,

in which w2, w3, and w4 are mapped to utility values 2, 4, 8, then w5 gives us exactly half the

maximum utility level considered, and then w6, w7, and w8 provide utility values 24, 28, 30.

We found that increasing the number of questions made the questionnaire too tedious, thereby

increasing the risk of obtaining inaccurate answers, without providing enough additional

information to make up for that.

The highest monetary outcome possible in these fictional lotteries is $200,000 (see footnote 4), which is a

sufficiently large quantity so that I can observe risk aversion for large amounts of money, but

at the same time it is small enough so that people can make sense of it and give more accurate

answers. I tried initially with $100,000 in an attempt to benefit from the psychological easiness

of working with the number “100” but it provided intervals that were too narrow sometimes so

the risk of the lottery (measured, for instance, by the variance of the outcomes) was too low and

Footnote 4: The currency we have chosen is the US Dollar, for two reasons: first, a significant proportion of the respondents are US citizens, and second, even if a respondent is not from the United States, the US Dollar is a very well-known currency so it is highly likely the person taking the survey is familiar with it and knows the approximate exchange rate between USD and her home currency. In fact, a large proportion of respondents are from Spain and, for better or worse, the exchange rate between USD and EUR is virtually 1, for the period May-June 2015, so the currency conversion is extremely straightforward and it is very hard to make a mistake assuming they know the current USD-EUR exchange rate, which the vast majority of Spaniards do.

therefore people’s answers showed too little risk aversion in certain wealth ranges. In short, a

higher upper bound allows for wider intervals and therefore it increases the level of risk

perceived and the relevance of the questions we ask.
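As a worked illustration of the variance argument, consider only the first question, where the lottery pays a or b with equal probability (the symbols a, b, μ and X are introduced here for the illustration):

```latex
\mu = \frac{a+b}{2}, \qquad
\operatorname{Var}(X) = \tfrac{1}{2}(a-\mu)^2 + \tfrac{1}{2}(b-\mu)^2 = \frac{(b-a)^2}{4}

a = 0,\; b = 100{,}000: \quad \operatorname{Var}(X) = 2.5 \times 10^{9}, \quad \sigma = 50{,}000

a = 0,\; b = 200{,}000: \quad \operatorname{Var}(X) = 1.0 \times 10^{10}, \quad \sigma = 100{,}000
```

Doubling the upper bound therefore quadruples the variance of the first lottery (and doubles its standard deviation). Later lotteries depend on the respondent's previous answers, so their variances cannot be stated in advance.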

Here’s an example of the type of lottery question we ask:

“What's the MINIMUM amount of money I need to offer you to NOT play a game of chance

that gives you $0 or maybe $200,000 with 50-50 chance for each outcome?”

The wording has been carefully chosen, as 3 different frameworks have been tested:

1. Initially, we framed the question in terms of the maximum monetary quantity the agent

would be willing to pay to play such game of chance, because it is easier to make sense

of a question that asks you how much would you pay (at most) to do something; but

after administering the survey in person, soon we realized that some people were saying

small quantities (i.e. showing too much risk aversion) just because they didn’t have

more money to do anything, no matter how attractive that thing would be. In an attempt

to avoid such rationale, we changed to framework #2.

2. Switching the question from “how much would you pay to do X” to “how much do I

need to pay you not to do X”. Note that the two questions are logically equivalent: both

ask the agent about her monetary valuation of the game of chance. Further note that this

new version successfully avoids the budget constraint problem but creates 2 new

problems: first, it may be harder to think in terms of “not doing” than in terms of

“doing”, and second, some people may be tempted to answer a higher amount of money

than the true minimum amount they would settle for (i.e. showing too little risk

aversion).

3. In an attempt to overcome the problems of frameworks #1 and #2, we found inspiration

in Dohmen et al., 2009 (p. 13), who, instead of asking about payments, directly

offered a set of guaranteed amounts of money and asked the respondent to say whether they

preferred that guaranteed amount or a lottery with prizes of €0 or €300, each with

probability ½. This approach successfully avoids the shortcomings of frameworks #1

and #2, but it creates new problems: lack of flexibility (there’s a finite set of guaranteed

amounts of money the respondent can choose, and in each subsequent question the

amount of options is smaller, and what is more, it gets smaller at an unpredictable rate

so it is not possible to create a standard survey), and related to the lack of flexibility, the

amount of information that the researcher is able to extract decreases (at an

unpredictable speed) with each new question, eventually learning nothing when the set

of possible fixed amounts is a singleton containing only the value €0. It must be noted

that in Dohmen et al. 2009 it worked better because first, they conducted the experiment

in person, not online (so they could solve any problems that arose), and second, they

made the experiment with actual money, so respondents can make (or not make) real

money depending on what they answer so it is a more accurate measurement of risk

aversion since it involves real money to be made. However, I believe it may not be wise

to replicate the method shown in Dohmen et al. 2009, for several reasons: first, due to

the tight constraint on financial resources I have to undertake this project, it is not

possible to use the same methodology, and second, even if I could employ the same

tactics, probably I would not replicate it exactly the same way, as I believe this method

could be improved by using larger figures (we have reasons to believe risk aversion is

not constant for all values of wealth) and more than one lottery, in particular, subsequent

lotteries with prizes as a function of previous answers, just like I did, in order to derive

enough data points for a single individual as to be able to estimate her coefficient of risk

aversion. Only 1 lottery question is definitely not enough, in my opinion.

Therefore, in conclusion, after comparing all three approaches (including an optimized version

of the third one) I believe the second one is the least problematic one (although it is far from

perfect), and its shortcomings can be minimized by emphasizing that people need to answer the

minimum amount, meaning that if they were offered a lower guaranteed amount they would

reject it and choose the game of chance instead.

Another issue we considered was whether to make some of the possible lottery outcomes negative, in order to increase the perceived risk of the lottery and create a more “realistic” game that respondents might, potentially, take more seriously. However, this possibility was quickly dismissed, since for some people it would increase the risk so much that they would be willing to pay me not to play the game. This implies a negative certainty equivalent, which behaves badly in most utility functions and, in particular, would not allow me to perform a necessary logarithmic transformation that I will discuss in a later section.

This is how the utility function would look for a subject that answered $60,000, $20,000, $8,000, $4,000, $100,000, $130,000, $160,000, with boundaries $0 and $200,000.

Figure 1: Utility Function (horizontal axis: Wealth level; vertical axis: Utility level)

Source: based on empirical data collected by the author

Last but not least, we have included several subtle mechanisms to figure out more or less

automatically whether a respondent understood the questions / took the survey seriously: the

lottery questions are open-ended, so the subject can type anything they want, and so if they type

a number (let alone a word) that is outside the range or exactly equal to the lower/upper bound of the lottery, then we disregard that survey. An example from the Demographics section: after the question that asks the subject for how long she has been investing in the financial markets, a follow-up question about the type of securities she has invested in appears only if she chooses an option other than “Never”. This follow-up question also allows the subject to answer “I have never invested in the financial markets”, which reveals that she was not paying attention in the previous question when she chose something other than “Never” (which is why she has seen the follow-up question). It is extremely important to incorporate checks like these when conducting surveys online, where monitoring capabilities are seriously restricted.
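The two checks just described could be automated along the following lines. This is a minimal sketch: the function names, the cleaning of "$" and "," characters, and the exact answer strings are assumptions made for the illustration, not a description of the survey software actually used.

```python
def parse_lottery_answer(raw, low, high):
    """Return the typed answer as a float, or None if the survey is discarded.

    An answer is rejected if it is not a number, lies outside the lottery's
    range, or exactly matches the lower or upper bound.
    """
    cleaned = raw.strip().replace("$", "").replace(",", "")
    try:
        value = float(cleaned)
    except ValueError:
        return None  # a word or other non-numeric text
    if not (low < value < high):
        return None  # outside the range, or equal to one of the bounds
    return value


NEVER_INVESTED = "I have never invested in the financial markets"


def investing_answers_consistent(years_investing, securities_answer):
    """Check the attention test built into the Demographics section."""
    if years_investing == "Never":
        return True  # the follow-up question is not shown in this case
    # The respondent claimed some experience, so "never invested" contradicts it.
    return securities_answer != NEVER_INVESTED


print(parse_lottery_answer("$60,000", 0, 200_000))    # accepted
print(parse_lottery_answer("200000", 0, 200_000))     # rejected: equals upper bound
print(investing_answers_consistent("1 to 5 years", NEVER_INVESTED))
```

Rejecting answers equal to a bound is a deliberate choice taken from the text: a certainty equivalent equal to one of the two prizes of a 50-50 lottery signals that the question was probably misunderstood.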

The complete survey can be found in the Appendix at the end of this paper.

b) Justification of the “demographics” questions of the survey

Find below the parameters we used as explanatory variables for the degree of relative risk

aversion, as well as their justification:

- Gender: it has been shown that, on average, females are less risk-seeking than males (Powell & Ansic, 1997), irrespective of other potentially relevant variables like task familiarity and framing, costs or ambiguity. Set Female = 0, Male = 1. Expectation β < 0, i.e. females show, on average, more risk aversion, everything else being equal.

- Age: Dohmen et al., 2009 show that age has a significant impact on risk aversion; in particular, risk aversion tends to increase both for males and females as they age. Expectation β > 0.

- Race/ethnicity: Weber, 2013 shows that culture has an impact on risk aversion; in other words, different cultures may show different degrees of risk aversion. Set Black = 0, White = 1. Expectation β < 0.

- Nationality: similarly to race/ethnicity, nationality could play a role in determining risk aversion. We believe it is worth including since two people of the same race/ethnicity who have been raised in different countries may show different risk preferences. Define USA, Spain. Expectation unclear.

- Education: Outreville, 2015 shows that there exists a negative correlation between relative risk aversion and education level. Reverse causality can’t be rejected, but this does not concern us since we are not performing a causality analysis (more on this later). Expectation β < 0.

- Employment status, household net income and number of non-contributing members at home: it seems obvious that having a reliable source of enough income and not having too many “unproductive” people to share it with (not only do they shrink the individual share but they are also financially vulnerable) matters for one’s level of risk aversion. This relationship has been shown to exist in numerous studies, for example Cohn et al., 1975. Again, it is plausible that there exists reverse causality, but we are not concerned about it in the present paper, since we only look at correlation. Employment status: Full time, Part time, Not employed, Retired. Household net income: several intervals, from “less than $25,000” to “$150,000 to $199,999”. Number of non-contributing members: from 0 to more than 4. Expectation: the more (and safer) net income a person has access to, the less risk averse.

- Financial literacy score: we include the same 3 questions that the European Central Bank included in its 2010 study on financial literacy (Lusardi, A., 2010). The score can range from 0 to 3. Expectation β < 0.

- Experience as an investor: we use the number of years the subject has been investing in the financial markets (in a very broad sense: from bank deposits to derivatives), as well as the type of securities she has ever invested in, if any, as a proxy for her experience as an investor. There may be better proxies for that (e.g. number of operations completed in the past 24 months, YTD profitability, etc.) but the increase in quality is not enough to compensate for the increase in complexity, I believe. More importantly, these questions are taken directly from the Aptitude test the Spanish Stock Market Commission (CNMV, in Spanish) requires financial institutions to administer to any client interested in contracting portfolio management services, in order to assess her investor profile. Our hypothesis is that more experienced individuals will be less risk averse, ceteris paribus, i.e. expectation β < 0.
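As an illustration of how the codings stated above translate into regression inputs, the sketch below encodes only the variables whose numeric coding is given explicitly in the list (gender, race/ethnicity, financial literacy score) plus age. The class name `Respondent`, the field names and the function `encode` are hypothetical; the remaining variables are left out because their coding is not specified in this section.

```python
from dataclasses import dataclass

# Codings stated in the list above.
GENDER_CODES = {"Female": 0, "Male": 1}
RACE_CODES = {"Black": 0, "White": 1}


@dataclass
class Respondent:
    gender: str              # "Female" or "Male"
    age: int                 # in years
    race: str                # "Black" or "White"
    financial_literacy: int  # number of correct answers, from 0 to 3


def encode(respondent):
    """Turn a respondent into a numeric feature vector for a regression."""
    if not 0 <= respondent.financial_literacy <= 3:
        raise ValueError("financial literacy score must be between 0 and 3")
    return [
        GENDER_CODES[respondent.gender],
        respondent.age,
        RACE_CODES[respondent.race],
        respondent.financial_literacy,
    ]


example = Respondent(gender="Male", age=52, race="White", financial_literacy=2)
print(encode(example))
```

An unknown category raises a `KeyError` rather than being silently mapped to a default, which seems the safer behaviour when survey data are typed in by hand.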

3. THEORETICAL MODEL

It is possible to tell that different subjects have different degrees of relative risk aversion by

simply looking at the Wealth-Utility plane, but we wish to provide a quantitative measure of

how different their risk aversion degrees are.

Firstly, we need to define a utility function.

We will use the Constant Relative Risk Aversion (CRRA) functional form, “one set of

preferences that has been by far the most used in the literature” (Gollier et al., 2005, p. 21). In

effect, this is considered the standard functional form employed in Finance and

Macroeconomics, and there are reasons for this: Chiappori & Paiella, 2008 find evidence in

panel data that supports this parametric family of utility functions.

CRRA functions are of the form:

$$u(w) = \frac{w^{1-\gamma}}{1-\gamma}$$

for w > 0, γ ≠ 1

It is important to highlight that we are not concerned so much about the actual coefficient of

risk aversion (γ) per se but we care mostly about how different the coefficient is for different

people (with different attributes).

In order to accomplish this, a relatively straightforward approach is to estimate γ for every individual by regression: the relationship is nonlinear, but a logarithmic transformation makes it linear (see footnote 5), so that linear regression, a very simple yet powerful tool, can be used. Once γ is found, the only ex-ante unknown parameter is known, and therefore the individual’s utility function is completely determined and can be graphically represented for any value of w.
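One possible reading of this log-linearization is sketched below; it is not necessarily the exact specification estimated later in the paper. The sketch assumes that, because utility units are arbitrary for each person, the survey points satisfy u = k·w^(1-γ) for some positive scale k (which restricts it to γ < 1), so that ln u = ln k + (1-γ)·ln w and γ equals one minus the ordinary least squares slope. The function name `estimate_gamma` is hypothetical, and the boundary point ($0, 0) is dropped because its logarithm is undefined.

```python
import math


def estimate_gamma(points):
    """Estimate gamma from (wealth, utility) points via a log-log OLS fit.

    Assumes u = k * w**(1 - gamma) with k > 0, so the slope of ln(u) on ln(w)
    is (1 - gamma). All wealth and utility values must be strictly positive.
    """
    xs = [math.log(w) for w, _ in points]
    ys = [math.log(u) for _, u in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return 1.0 - slope


# Survey points of the subject shown in Figure 1, without the ($0, 0) boundary.
figure_1_subject = [
    (4_000, 2), (8_000, 4), (20_000, 8), (60_000, 16),
    (100_000, 24), (130_000, 28), (160_000, 30), (200_000, 32),
]
print(f"estimated gamma: {estimate_gamma(figure_1_subject):.3f}")
```

The intercept ln k is discarded on purpose: it only reflects the arbitrary utility scale, whereas the slope carries the curvature information.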

Note that this way we don’t obtain the exact curve depicted by the survey answers, but this is

fine because answers may change slightly if the same questions were asked a different day (i.e.,

it’s not even clear that this is the exact graph of her utility function) so we just want to keep the

“essence” of the relationship between U and W (not the exact details), which is effectively

captured in our model. In other words, what we mainly do is to combine the utility curve

obtained through the survey with a well-known theoretical model for utility functions, in order

to increase the robustness of the predictive model and avoid relying 100% on the survey

results, which cannot be guaranteed to be completely unbiased.

In effect, our hypothesis is that the true underlying function for any agent is of the form

$$u(w) = \frac{w^{1-\gamma}}{1-\gamma}$$

for w > 0, γ ∈ ℝ, γ ≠ 1

So we actually allow for any real value of γ (except for exactly 1), therefore we include the

possibility that some people might be risk neutral (γ = 0) or even risk loving (γ < 0).

As we shall see next, this functional form has many convenient properties and its

assumptions are very reasonable.

Properties of CRRA utility functions

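I. $\lim_{\gamma \to 1} u(w) = \ln(w)$, by applying L'Hôpital's rule, as mentioned in footnote 1. Strictly speaking, the rule applies to the normalized form $(w^{1-\gamma} - 1)/(1-\gamma)$, which differs from u(w) only by an additive constant for any given γ and therefore represents the same preferences.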

Footnote 5: An intuitive explanation of why this works: a logarithmic transformation achieves almost a straight line because it smooths the values, i.e. it compresses extreme values so there are fewer “jumps”, thereby achieving a straighter curve. This is so because the logarithmic function affects higher values more than small values, which provides a homogenizing effect. Note: smoother steps while having the same number of steps implies a straighter curve. In other words, it is as if you “zoomed in” enough so that any curve looks almost straight.

II. 1du

dw w 0 w 0 (which is true by assumption, since we consider only

non-negative wealth values), which implies that u(w) is increasing in wealth, i.e.

more wealth yields more utility, for all values of wealth and utility.

III. \( \frac{d^{2}u}{dw^{2}} = -\frac{\gamma}{w^{(1+\gamma)}} < 0 \iff w, \gamma > 0 \), that is, as long as both the wealth level and the coefficient of relative risk aversion are positive (which is normally the case, since most people are risk averse). The second derivative is then negative, implying that u(w) is concave, i.e. the function depicts risk aversion and so presents diminishing marginal utility of wealth (the increments in utility are ever smaller for higher values of wealth, which is what most people experience: $100 more when you have $0 provides more utility than $100 more when you already have $100 billion).

IV. Arrow-Pratt coefficient of Absolute Risk Aversion

\[ A(w) = -\frac{d^{2}u/dw^{2}}{du/dw} = \frac{\gamma}{w} \]

which means that, assuming w, γ > 0, absolute risk aversion decreases as wealth increases, for any given value of γ; i.e. no matter how risk averse you are, you become less risk averse as you get wealthier (again, a fairly reasonable assumption for most people), and the rate at which you become less risk averse depends on a constant (that’s why it is called CRRA) equal to γ. In effect,

\[ \frac{dA}{dw} = -\frac{\gamma}{w^{2}}. \]

So, it can be shown that CRRA ⟹ DARA (Decreasing Absolute Risk Aversion), but the converse is not necessarily true.

V. Relative Risk Aversion

\[ R(w) = w \cdot A(w) = w \cdot \frac{\gamma}{w} = \gamma \]

which proves that γ is indeed the coefficient of relative risk aversion.
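Properties I to V can be checked numerically. The following is only a minimal sketch: the values of γ and w are illustrative and not taken from the survey, the helper names are hypothetical, and the derivatives are approximated by finite differences. For property I the sketch uses the shifted form (w^(1−γ) − 1)/(1−γ), which is an assumption on our part; it differs from u(w) by a constant for each γ and therefore has the same derivatives.

```python
import math

def crra_utility(w, gamma):
    """CRRA utility u(w) = w**(1 - gamma) / (1 - gamma), for w > 0 and gamma != 1."""
    return w ** (1.0 - gamma) / (1.0 - gamma)

def first_derivative(f, w, h):
    """Central finite-difference approximation of f'(w)."""
    return (f(w + h) - f(w - h)) / (2.0 * h)

def second_derivative(f, w, h):
    """Central finite-difference approximation of f''(w)."""
    return (f(w + h) - 2.0 * f(w) + f(w - h)) / (h * h)

gamma = 0.5   # illustrative value, not an estimate from the survey
w = 100.0     # illustrative wealth level
h = 0.1       # finite-difference step

def u(x):
    return crra_utility(x, gamma)

du = first_derivative(u, w, h)      # property II: close to w**(-gamma), positive
d2u = second_derivative(u, w, h)    # property III: negative when gamma > 0

absolute_ra = -d2u / du             # property IV: close to gamma / w
relative_ra = w * absolute_ra       # property V: close to gamma

# Property I, using the shifted form (assumption, see the text above).
g = 1.0 - 1e-6
shifted = (w ** (1.0 - g) - 1.0) / (1.0 - g)   # close to ln(w)

print(du, d2u, absolute_ra, relative_ra, shifted, math.log(w))
```

With these inputs the printed relative risk aversion is approximately 0.5, whatever wealth level is chosen, which is what property V states.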

Finally, we would like to briefly state that we also considered two other, more sophisticated

theoretical models:

a) Nonparametric model: it has many more degrees of freedom and is therefore more flexible (i.e. its assumptions are less restrictive). This is the well-known Hyperbolic Absolute Risk Aversion (HARA) family of functions, which is so flexible that, by proper adjustment of the parameters, it is possible to obtain a utility function with absolute or relative risk aversion that is decreasing, constant or increasing (Merton, 1971, p. 389). HARA functions are of the form

\[ U(w) = \frac{1-\gamma}{\gamma}\left(\frac{\beta w}{1-\gamma} + \eta\right)^{\gamma} \]

It can be shown that CRRA functions are a special case of HARA functions, in particular

that in which γ < 1, β = (1-γ), and η = 0.
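As a worked check of this special case, and assuming the HARA form as parametrized by Merton (1971), substituting η = 0 and β = 1 − γ leaves a pure power function of wealth. Note that in this parametrization the resulting coefficient of relative risk aversion is 1 − γ, so the γ of the HARA family plays a different role from the γ of the CRRA function used in the rest of this section.

```latex
U(w) = \frac{1-\gamma}{\gamma}\left(\frac{\beta w}{1-\gamma} + \eta\right)^{\gamma}
\quad\xrightarrow{\;\eta = 0,\ \beta = 1-\gamma\;}\quad
U(w) = \frac{1-\gamma}{\gamma}\, w^{\gamma}

U'(w) = (1-\gamma)\, w^{\gamma-1}, \qquad
U''(w) = -(1-\gamma)^{2}\, w^{\gamma-2}, \qquad
-\frac{w\, U''(w)}{U'(w)} = 1-\gamma
```

The relative risk aversion is constant in w, and it is positive exactly when γ < 1, which is the condition stated above.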

b) Recursive utility function, à la Epstein-Zin (Epstein and Zin, 1989): following the suggestion of Dr. Steven Shreve (Orion Hoch Professor of Mathematical Sciences at Carnegie Mellon University), I have explored this type of function, which is of the form

\[ U_t = \left[(1-\beta)\, c_t^{\rho} + \beta\, \mu_t(U_{t+1})^{\rho}\right]^{1/\rho} \]

but unfortunately the degree of complexity was too high and I did not succeed in devising a tractable version of this model while keeping its nice dynamic property, i.e. the idea that today’s level of utility depends on tomorrow’s (and the converse is also true: by finding the inverse function one obtains that tomorrow’s utility depends on today’s).

4. PREDICTIVE ANALYSIS

4.1. Introduction: Prediction vs Causality

This is a predictive model, not a model trying to determine causality: we are primarily concerned with predicting the degree of risk aversion of a person given a set of parameters, as opposed to establishing causal relations between some variables and a person’s degree of risk aversion. This is so because, unlike in many other situations, our response variable, i.e. the degree of risk aversion of a person, is in general not something that we or anybody else wishes to maximize or minimize: it is not necessarily better or desirable to have a different risk preference, just as it is not necessarily desirable to have a different culinary taste; it is simply part of a person’s personality. Therefore, it is not so relevant to find out what causes risk aversion because, first of all, there is not much one can do about many of the explanatory variables considered, like gender, age or income (what if we found out that a particular gender causes risk aversion?) and, secondly, even if we could change the value of a variable that causes risk aversion, do we really want to try to change it? Would a change made with the purpose of altering one’s risk profile have the intended effect? Since risk aversion is a psychological trait, it is unclear whether one can purposely alter that trait even if one can alter the variables that cause it.

This approach has several consequences:

From a technical point of view, it somewhat simplifies the analysis, in several ways:

First, we don’t need as many assumptions to hold (essentially, the expected error being 0, all variables in the model having finite fourth moments, and absence of perfect multicollinearity). However, this simplification is, in actuality, of minor importance: the first assumption is trivial, as it can always be satisfied by adjusting the constant term in the linear regression; the second assumption is automatically fulfilled when the variables of the model have a finite range; and the third assumption almost always holds, since violating it requires perfect multicollinearity, which rarely occurs unless the person creating the model introduces it by mistake (including the same independent variable twice, including an independent variable that is proportional to another one, including all options for a categorical variable instead of dropping one, etc.).
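The last case mentioned above (keeping every option of a categorical variable) can be illustrated with a small sketch. The five respondents below are invented for illustration and `design_rows` is a hypothetical helper; only the category names follow the job status variable used later in this paper.

```python
# Employment status of five hypothetical respondents (illustrative data only).
statuses = ["Emp_F", "Emp_P", "Not_Emp", "Retired", "Emp_F"]
all_categories = ["Emp_F", "Emp_P", "Not_Emp", "Retired"]

def design_rows(statuses, categories):
    """Return rows [intercept, dummy_1, ..., dummy_k], one dummy per listed category."""
    return [[1] + [1 if s == c else 0 for c in categories] for s in statuses]

# Keeping every category: in each row the dummies add up to the intercept column,
# which is an exact linear dependence among regressors (perfect multicollinearity).
full = design_rows(statuses, all_categories)
dummy_trap = all(sum(row[1:]) == row[0] for row in full)

# Dropping one category (here Not_Emp, as the reference): that dependence no longer
# holds, because rows of the reference category have all dummies equal to 0.
reduced = design_rows(statuses, ["Emp_F", "Emp_P", "Retired"])
dependence_broken = any(sum(row[1:]) != row[0] for row in reduced)

print(dummy_trap, dependence_broken)
```

This is why one category per categorical variable is left out of the regression and serves as the reference.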

Second, endogeneity bias is less important: for example, omitted-variable bias is less worrying

when we don’t try to state that a certain explanatory variable causes the response variable

(perhaps there is another, not included variable that causes the one included and in turn causes

the response variable). Regarding another type of endogeneity, namely, reverse causality, it is

obviously not as problematic when we don’t make causation statements at all, so we don’t need

to worry about whether x causes y or y causes x (or both at the same time).

Additionally, having independent variables that barely explain the dependent variable is not as much of an issue when we don’t care about causality and only about finding the best possible estimation of a response variable given a set of parameters (in other words, we do not care as much about the adjusted R² as we care about R²: we need a very good fit without worrying too much about efficiency, especially when the sample size is large enough).

But more importantly, from a practical viewpoint, this approach tends to produce higher quality analysis because the researcher can focus solely on (correctly) determining correlations between the independent variables and the dependent variable, which is enough for prediction: given a set of parameter values, on average we should expect the response variable to take a certain value. There is no need to make difficult and often controversial statements about causality; the analysis rests only on observed correlations among variables.

4.2. The Model

Our model is based on performing two different sets of regressions: in the first one we obtain the coefficient of risk aversion for each agent of the sample (i.e. 88 regressions), and in the second one we obtain the predicted gamma for an arbitrary agent as a function of all the socioeconomic parameters considered.

a) Estimating the coefficient of risk aversion for agent i

We will use STATA to fit the data points of the W-U plane collected from the survey.

First of all, it is plain to see by inspecting figure no. 1 that the data points do not form a straight

line, so we cannot use linear regression. Instead, we will apply a logarithmic transformation to

linearize the relationship between wealth and utility. Observe

Figure 2 Figure 3

Source: based on empirical data collected by the author

Now, analytically,

\[ \ln(u) = \ln\left(\frac{w^{1-\gamma}}{1-\gamma}\right) = \ln\left(w^{1-\gamma}\right) - \ln(1-\gamma) = (1-\gamma)\ln(w) - \ln(1-\gamma) = \delta \ln(w) - \ln(\delta) \]

Therefore, we need to find the δ that best fits the data points, i.e. we need to use Constrained OLS. In our case, the constraint is that β₀ = −ln(β₁), where β₁ = δ = 1 − γ, so it is a non-linear constraint. Note that working with logarithms imposes an additional, implicit constraint: δ must be strictly positive, since the domain of the logarithmic function contains only positive values, but this is fine because δ > 0 simply means 1 − γ > 0 ⟺ γ < 1, which is not a serious limitation. Further note that negative values of γ are feasible, since γ < 0 simply means δ > 1, which is possible.

We can use the “nl” command in STATA to perform a linear regression with non-linear

constraints (or a non-linear regression), like in our case.

[Figures: “Utility Function” (utility level against wealth level) and “Log-transformed Utility Function” (ln(U) against ln(W))]

In particular, we coded

nl (var2 = {b1=1}*var1-ln({b1})), where var2 = ln(u) and var1 = ln(w)

A similar procedure can be performed in Excel by using the Solver optimizer add-in, and the

results we got were exactly the same (up to the 5th decimal, due to rounding errors), which

suggests the results have been computed correctly.
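The same kind of fit can be sketched outside STATA and Excel. The following is a minimal illustration, not the procedure actually used in this study: it fits ln(u) = δ·ln(w) − ln(δ) by plain Gauss-Newton iterations starting from δ = 1, on synthetic data generated from a known δ. The wealth levels, the value δ = 0.9 and the function name `fit_delta` are all assumptions made for the example.

```python
import math

def fit_delta(wealth, utility, start=1.0, max_iter=100, tol=1e-12):
    """Least-squares fit of ln(u) = delta*ln(w) - ln(delta) by Gauss-Newton iterations."""
    x = [math.log(w) for w in wealth]
    y = [math.log(u) for u in utility]
    delta = start
    for _ in range(max_iter):
        # Residuals of the log model and their derivative with respect to delta.
        resid = [yi - (delta * xi - math.log(delta)) for xi, yi in zip(x, y)]
        jac = [-xi + 1.0 / delta for xi in x]
        step = -sum(j * r for j, r in zip(jac, resid)) / sum(j * j for j in jac)
        # Halve the step if it would push delta outside the domain of the logarithm.
        while delta + step <= 0:
            step /= 2.0
        delta += step
        if abs(step) < tol:
            break
    return delta

# Synthetic check: generate utilities from a known delta and try to recover it.
true_delta = 0.9
wealth = [1000, 2000, 4000, 10000, 18000, 25000, 40000, 200000]
utility = [w ** true_delta / true_delta for w in wealth]

estimate = fit_delta(wealth, utility)
print(estimate)
```

Because the synthetic data satisfy the model exactly, the estimate should coincide with the δ used to generate them up to numerical precision; with real survey data the residuals would of course not vanish.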

After running a non-linear regression for each of the 88 subjects in our sample, we obtained their γ coefficient, but we have to interpret it in the reverse of the usual way: the higher the gamma, the higher the risk-lovingness, not the risk aversion. This is a direct consequence of working with logarithms: the logarithm of a very small number is very negative, but the logarithm enters our model with a negative sign, so the term becomes very positive and it is impossible to shift the regression line down while keeping the slope small. As a result, more risk averse agents have smaller gammas, which is the reverse of the usual interpretation of this coefficient.

This analytical limitation somewhat hinders the goodness of fit of the model, as can be verified visually by inspecting the regression line that best fits the data collected from the survey:

Figure 4

Source: based on empirical data collected by the author and analysis with STATA

but nonetheless we are still able to obtain very decent R² levels, ranging from 0.7638 to 0.9556 (the latter being that of the regression line depicted in the graph above).

However, note that this is still fine since, as we said earlier, we are not so concerned with finding people’s true gamma (there is a lot of controversy across papers by reputed mathematicians about what its actual magnitude is, so I do not expect to come up with the right answer by myself in this paper); what we do care about is being able to predict gamma for a given individual with a particular set of parameters, and we have succeeded at this: for all agents tested, the relationship between the magnitude of gamma and the (graphically apparent) concavity of her utility function is robust: the less concave (more convex, more risk-loving) it is, the higher the gamma. Therefore, we shall interpret the gamma coefficient as the coefficient of relative risk-lovingness this time. In fact, we shall refer to gamma as a “score”, to stress the idea that gamma is no longer the coefficient of relative risk aversion, but just a measure of risk-lovingness that will allow us to predict risk attitudes given a set of parameters. For this reason, we have multiplied the coefficients by 10,000 to obtain scores ranging roughly from 7,500 to 9,500.

b) Predicting the risk-lovingness score for an arbitrary agent j

After obtaining the risk-lovingness score of all 88 subjects in our sample using STATA, we

now turn to R to run a simple linear regression to predict that score as a function of all the

explanatory variables we have considered. A summary of the data and full details about the

linear regression performed can be found in the Appendix.

Example prediction: a white, 60-year-old woman from Spain, with a high school degree only, employed full time, with a home net income of $23,000 (category 1), 0 non-income-contributing people at home (category 1), a finance score of 2 out of 3, and no investment experience at all (category 1), will have a predicted Gamma

G = Intercept + 0*GenderMale + 60*Age + 1*RaceW + 0*NationalityUSA + 1*HS + 0*UndGrad + 0*Grad + 1*Emp_F + 0*Emp_P + 0*Retired + 1*HomeNetInc + 1*NoIncMem + 2*Fin_Score + 1*Inv_exp + 0*Bank_dep + 0*Pen_plan + 0*Bonds + 0*Stocks + 0*Derivatives

G = 8051.382 + 0*102.951 + 60*(-2.587) + 1*291.203 + 0*(-174.537) + 1*265.767 + 0*372.973 + 0*195.012 + 1*12.446 + 0*(-75.486) + 0*9.482 + 1*48.257 + 1*(-16.925) + 2*36.727 + 1*107.910 + 0*(-237.588) + 0*(-80.396) + 0*30.928 + 0*(-101.101) + 0*138.480

G = 8678.274
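The arithmetic of this example can be reproduced with a short script. This is only a sketch: the coefficients are copied from the regression output in Appendix C, while the dictionary layout and the function name `predict_score` are our own hypothetical choices.

```python
# Coefficients copied from the regression output in Appendix C.
coefficients = {
    "GenderMale": 102.951, "Age": -2.587, "RaceW": 291.203,
    "NationalityUSA": -174.537, "HS": 265.767, "UndGrad": 372.973,
    "Grad": 195.012, "Emp_F": 12.446, "Emp_P": -75.486, "Retired": 9.482,
    "HomeNetInc": 48.257, "NoIncMem": -16.925, "Fin_Score": 36.727,
    "Inv_exp": 107.910, "Bank_dep": -237.588, "Pen_plan": -80.396,
    "Bonds": 30.928, "Stocks": -101.101, "Derivatives": 138.480,
}
INTERCEPT = 8051.382

def predict_score(agent):
    """Predicted risk-lovingness score; variables missing from `agent` count as 0."""
    return INTERCEPT + sum(coefficients[name] * value for name, value in agent.items())

# The agent of the example: white, 60-year-old woman from Spain, high school degree,
# employed full time, income category 1, NoIncMem category 1, finance score 2,
# investment experience category 1, no securities held.
example_agent = {
    "Age": 60, "RaceW": 1, "HS": 1, "Emp_F": 1,
    "HomeNetInc": 1, "NoIncMem": 1, "Fin_Score": 2, "Inv_exp": 1,
}

score = predict_score(example_agent)
print(round(score, 3))  # 8678.274
```

Only the non-zero regressors need to be listed for an agent, since every omitted dummy contributes 0 to the sum.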

Linear regression highlights:

Verification of the assumptions of the OLS model:

o Relationship between the response variable and the independent variables is

linear

o Homoscedasticity

o Errors are normally distributed

o Errors are independent

All assumptions are approximately satisfied except for the third one. However, that assumption would also hold were it not for three observations: observations 1, 58 and 86.

However, the fact that our model fails the third test does not mean our model is invalid: because our sample size is reasonably large (88), the Central Limit Theorem implies that the OLS coefficient estimates are approximately normally distributed even when the errors are not, so our inference and predictions remain reasonable.

The four plots used to check whether the assumptions hold can be found in the

Appendix, together with a brief comment.

Internal validity:

o Omitted variable bias: it is never possible to be 100% sure there are no omitted

variables, but we are confident that definitely the most influential ones have been

included. Also, note that, as pointed out earlier in this paper, this bias is a

problem mostly when performing causality analysis, not correlation analysis.

o Incorrect functional form: plot no. 2 in the Appendix (section D) shows that the

relationship between the response variable and the regressors is fairly linear, so

we believe the functional form is approximately right.

o Measurement error: we have taken all the necessary steps to make sure the questions of the survey were asked in such a way that the answers are not biased.

o Sample selection: we have tried our best to ensure our observations are i.i.d.: we administered the survey to individual subjects only, not groups (otherwise the answers of one agent might be biased by the answers of another), and we also tried to make the selection as random as possible, avoiding clusters as much as we could.

External validity: we believe the model can be extrapolated to new agents within the age, race, and nationality spectrum considered in our study. To check this, we tested the model’s accuracy in predicting the gamma of new people, by administering the survey to 5% more individuals (5% with respect to the size of the training data, which is a different proportion from the one used in Tetko et al. 1995, but our sample size makes a 50-50 split inadvisable). Details can be found in the Appendix.

Multiple R² = 0.4: probably an acceptable goodness of fit although obviously not great, which only goes to show how hard it is to model human behavior.

F-statistic = 2.386, with a p-value of 0.004701: at the 95% confidence level, we can reject the hypothesis that all coefficients (other than the intercept) are 0, i.e. there exists at least one coefficient that is not 0, meaning that at least one of the variables considered helps explain the variation of the response variable, so the model’s predictive power is statistically significant at that level.

The only variables that are statistically significant at the 95% confidence level are Race, Nationality, Home net income, Investment experience, and investing in Bank deposits (compared to not investing at all). Note that gender is not statistically significant, and that individuals with higher education levels are not always more risk loving, contrary to what is shown in Outreville 2015 (e.g. graduate students are less risk loving than undergraduates).

Reference category for the categorical / dummy variables considered⁶:

o Gender: Male vs Female

o Race: White vs Black

o Nationality: USA vs Spain

o Education: HS & UndGrad & Grad vs Elem/Mid

o Job status: Emp_F & Emp_P & Retired vs Not_Emp

o Securities: Bank_dep & Pen_plan & Bonds & Stocks & Derivatives vs Nothing

All variables behave as predicted in section 2.2.b⁷ except for Nationality: US citizens seem to be more risk averse than Spaniards, and the difference is statistically significant at the 95% confidence level. Still, we did not have a clear expectation initially, as stated in section 2.2.b.

6 Recall we must omit one of the categories so as to avoid perfect multicollinearity.

7 Note that the sign of the beta has been reversed since our interpretation of gamma has also been reversed.
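The F-statistic and the adjusted R² reported in the regression output can be cross-checked from the Multiple R², the number of observations and the number of regressors. The sketch below uses the standard formulas for a linear regression with an intercept; the variable names are ours, and the Multiple R² is taken as printed (0.4), so small rounding differences are to be expected.

```python
n = 88      # observations in the sample
k = 19      # regressors, excluding the intercept
r2 = 0.4    # Multiple R-squared as printed in the regression output

# Overall F-statistic: explained variation per regressor over
# unexplained variation per residual degree of freedom.
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))

# Adjusted R-squared penalizes the fit for the number of regressors used.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(f_stat, 3), round(adj_r2, 4))  # 2.386 0.2324
```

Both values agree, up to rounding, with the output in Appendix C (F-statistic of 2.386 on 19 and 68 degrees of freedom, adjusted R² of 0.2323).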

5. APPLICATION

In this section we will very briefly hint at a possible application of the results of this study:

providing better investment advice, especially to those who are more vulnerable.

Our registered scores of risk-lovingness range from 7697 to 9133, with a median of 8962, so we can divide the sample into at least two groups: agents with a low score and agents with a high score.

a) Low score agents (risk averse agents)

Those that have a score from 7697 to 8962.

Recommended securities based on our regression analysis:

- Bank deposits

- Pension plans

- Stocks

Stay away from: Derivatives

b) High score agents (risk loving agents)

Those with a score from 8963 to 9133.

Recommended:

- Derivatives

- Bonds
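The grouping above can be sketched as a simple lookup. This is an illustration only: the threshold is the sample median reported in this section, the lists repeat the recommendations given above, and the names `risk_group` and `RECOMMENDATIONS` are hypothetical.

```python
MEDIAN_SCORE = 8962  # sample median of the risk-lovingness scores

RECOMMENDATIONS = {
    "low": ["Bank deposits", "Pension plans", "Stocks"],   # risk averse agents
    "high": ["Derivatives", "Bonds"],                      # risk loving agents
}

def risk_group(score, threshold=MEDIAN_SCORE):
    """Return 'low' for scores up to the threshold and 'high' for scores above it."""
    return "low" if score <= threshold else "high"

# The agent of the example prediction in section 4.2 (predicted score 8678.274).
group = risk_group(8678.274)
print(group, RECOMMENDATIONS[group])
```

For that agent the predicted score falls in the low group, so the securities listed under a) would apply.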

6. CONCLUSION

This study attempts to shed some light on the issue of predicting an agent’s degree of risk

aversion by just knowing some basic information about the agent, like gender, age, income, etc.

We present both an empirical and theoretical model to try to come up with a robust predictive

model.

Although the goodness of fit of our predictive model could be higher (Multiple R² = 0.4), the fact that the F-statistic rejects the hypothesis that all coefficients are 0 gives us both confidence that our model has some true predictive power and some hope that it is indeed possible to model human behavior (at least regarding financial risk aversion), which was a major challenge we faced from the outset.

We find that the variables Race, Nationality, Home net income, Investment experience, and investing in Bank deposits are statistically significant at the 95% confidence level and therefore help predict an agent’s degree of risk aversion. In particular, we find that Black people are more risk averse than white people, US citizens are more risk averse than Spanish citizens, people with lower income are more risk averse, people with less investment experience are more risk averse, and people who typically invest in bank deposits are more risk averse than others, including those who do not invest at all. Notably, we find that gender is not statistically significant at the 95% confidence level, and that the relationship between education level and risk-lovingness might not be positive for all education levels.

Finally, this model could be further extended/improved by exploring nonparametric functional

forms and/or recursive functional forms (to capture dynamic effects), increasing the sample size

and variety of observations, conducting 100% of the surveys in person, as well as potentially

introducing interaction terms in the regression for Gamma.

REFERENCES

Acemoglu, D., 2009. Introduction to Modern Economic Growth. Princeton University Press, p.

156.

Breusch, T., and Pagan, A., 1979. A Simple Test for Heteroscedasticity and Random

Coefficient Variation. Econometrica, 47 (5), pp. 1287-1294.

Chiappori, P., and Paiella, M., 2008. Relative Risk Aversion is Constant: Evidence from Panel

Data. Paper presented at a seminar at Dartmouth College.

Chiappori, P., et al., 2009. Identifying Preferences under Risk from Discrete Choices. The

American Economic Review, 99 (2), pp. 356-362.

Cohn, R., et al., 1975. Individual Investor Risk Aversion and Investment Portfolio Composition.

The Journal of Finance, 30 (2), pp. 605-620.

Comisión Nacional del Mercado de Valores. Asesoramiento de inversiones y gestión de

carteras. Evaluación de la idoneidad. [Online] Available at

<http://www.cnmv.es/Portal/Inversor/Idoneidad.aspx> [Accessed June 7, 2015].

Dohmen, T., et al., 2009. Individual Risk Attitudes: Measurement, Determinants and

Behavioral consequences.

Epstein, L. and Zin, S., 1989. Substitution, Risk Aversion, and the Temporal Behavior of

Consumption and Asset Returns: a Theoretical Framework. Econometrica, 57(4), pp. 937-969.

Gollier, C. et al., 2005. Economic and Financial Decisions under Risk. Princeton University

Press, p. 34.

Hoch, S. and Loewenstein, G., 1991. Time-inconsistent preferences and consumer self-control.

Journal of Consumer Research, 17, pp. 492-507.

Loewenstein, G., et al., 2002. Projection Bias in Predicting Future Utility. CAE working paper

#02-11.

Lusardi, A., 2010. The Importance of Financial Literacy. European Central Bank presentation

to the conference on household finance and consumption. Luxembourg City.

Merton, R., 1971. Optimum Consumption and Portfolio Rules in a Continuous-Time Model.

Journal of Economic Theory, 3, pp. 373-413.

Outreville, F., 2015. The relationship between relative risk aversion and the level of education:

a survey and implications for the demand for life insurance.

Powell, M., and Ansic, D., 1997. Gender differences in risk behavior in financial decision-making: An experimental analysis. Journal of Economic Psychology, 18, pp. 605-628.

Tetko, I., et al., 1995. Neural Network Studies. 1. Comparison of Overfitting and Overtraining. Journal of Chemical Information and Computer Sciences, 35, pp. 826-833.

Vieider, F., et al., 2015. Common components of risk and uncertainty attitudes across contexts

and domains: Evidence from 30 countries. Journal of the European Economic Association,

JUN/2015.

Von Neumann, J., and Morgenstern, O., 1944. Theory of Games and Economic Behavior.

Princeton University Press.

Weber, C., 2013. Cultural Differences in Risk Tolerance. Working paper No. 01-2013.

White, H., 1980. A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct

Test for Heteroskedasticity. Econometrica, 48 (4), pp. 817-838.

APPENDIX

A. Complete survey

[Page 1]

Instructions: Please read the following questions and indicate a response.

What is your gender?

Male

Female

How old are you (in years)?

What is your race / ethnicity? (check all that apply)

American Indian or Alaska Native

Asian

Black / African American

Hispanic / Latino

Middle Eastern

Native Hawaiian or Pacific Islander

White / Caucasian

Other, please specify

Are you a citizen of the United States?

Yes

No (please specify the country where you maintain citizenship in the box below)

[Page 2]

What is the highest degree or level of school you have completed? If currently enrolled, mark

the previous grade or highest degree received.

No schooling completed

Elementary and middle school (ages 6-14)

High school (age 14-18)

Bachelor's degree

Graduate degree (e.g., Master's degree, PhD, JD)

If you are currently enrolled, what is your current education level (i.e., what grade or year are

you currently in)?

I am not currently enrolled in school

I am currently enrolled in school and my current grade/year is:

Are you currently....?

Employed (full time)

Employed (part time)

Not currently employed (e.g., out of work, student, homemaker)

Retired

What is your total household annual net income (i.e. income after paying taxes)?

Less than $25,000

$25,000 to $34,999

$35,000 to $49,999

$50,000 to $74,999

$75,000 to $99,999

$100,000 to $149,999

$150,000 to $199,999

$200,000 or more

Excluding yourself, how many non-income-contributing members are there in your

household?

none

1

2

3

4 or more

Instructions: In this section, you will be asked questions about finances and investing.

Suppose you had $100 in a savings account and the interest rate was 2% per year. After 5

years, how much do you think you would have in the account if you left the money to grow?

More than $102

Exactly $102

Less than $102

Not sure

Imagine that the interest rate on your savings account was 1% per year and inflation was 2% per year. After 1 year, with the money in this account, would you be able to buy…

More than today

Exactly the same as today

Less than today

Not sure

Do you think the following statement is true or false?

In general, buying a single company stock usually provides a safer return than buying several

different stocks.

True

False

Not sure

[Page 4]

How long have you been investing in the financial markets (whether on your own or with the

help of a financial advisor)? This may include any type of security (bank deposit, pension

plan, bonds, stocks, derivatives, etc.)

Never

Less than 1 year

Between 1 and 3 years

More than 3 years

[Page 5: only shown if answer to previous question is not “Never”]

What types of financial instruments do you use or have you used in the past? Check all that

apply.

I have never invested in the financial markets

Bank deposit

Pension plan or Investment fund

Fixed income: Treasury bills, other bonds

Variable income: Stocks

Derivatives: Warrants, Options, Futures, CFD, ETF, etc.

[Page 6]

Instructions: In this next section, you will read about a financial game. As you read about the

game, think carefully about it. Afterwards, you will be asked to describe your opinions about

the game. Keep in mind that there is not a correct answer to the questions about the game.

Example:

Interviewer: Hey Chris & Pat, let me introduce you to a game. In this game, you can make

money in two different ways:

Option 1): Play a game of chance in which two possible outcomes are equally likely. One outcome gives you $20 while the other outcome gives $100. Either way, it is impossible to lose money. However, there is some risk in how much money you receive.

Option 2): Receive a guaranteed amount of money, with no risk at all.

Now, imagine that I don't want you to choose the first option. In other words, I want you to

choose the second option. What's the MINIMUM amount of money I need to offer you to

convince you not to choose the first option? Remember, there is no right or wrong answer to

this question. It depends entirely on your preferences.

Chris: Hmm, I would be willing to choose the second option instead of the first option if you

gave me $70, because even if there's some risk, I have a 50% chance of making $100 if I choose

the first option.

Interviewer: So if I offer you any LESS than $70, you would not accept the second option and

instead would choose the first option?

Chris: Yes, that is correct.

Interviewer: What about you, Pat?

Pat: Well, I am guaranteed at least $20 if I choose the first option, so I would personally be

happy if you give me $30 to choose the second option instead. You don't have to pay me as

much as Chris because I hate risk and there is a 50% chance that I may only get $20 if I choose

option one.

Interviewer: Fair points people, certainly there's no right or wrong answer. It depends

entirely on your preferences.

Now it’s your turn, let’s hear what you think about the following game:

What's the MINIMUM amount of money I need to offer you to NOT play a game of chance

that gives you $0 or maybe $200,000 with 50-50 chance for each outcome?

10,000

[Page 7]

What's the MINIMUM amount of money I need to offer you to NOT play a game of chance

that gives you at least $0 or maybe $10,000 with a 50-50 chance for each outcome?

4,000

[Page 8]

What's the MINIMUM amount of money I need to offer you to NOT play a game of chance

that gives you at least $0 or maybe $4,000 with a 50-50 chance for each outcome?

2,000

[Page 9]

What's the MINIMUM amount of money I need to offer you to NOT play a game of chance

that gives you at least $0 or maybe $2,000 with a 50-50 chance for each outcome?

1,000

[Page 10]

Instructions: You will now be asked a different question about the game.

What's the MINIMUM amount of money I need to offer you to NOT play a game of chance that gives you at least $10,000 or maybe $200,000 with a 50-50 chance for each outcome?

18,000

[Page 11]

What's the MINIMUM amount of money I need to offer you to NOT play a game of chance that gives you at least $18,000 or maybe $200,000 with a 50-50 chance for each outcome?

25,000

[Page 12]

What's the MINIMUM amount of money I need to offer you to NOT play a game of chance that gives you at least $25,000 or maybe $200,000 with a 50-50 chance for each outcome?

40,000

We thank you for your time spent taking this survey.

Your response has been recorded.

B. Summary of the Data using R

summary(data)

     Gamma        Gender        Age          Race    Nationality       USA
 Min.   :7697   Fem :32   Min.   :18.00   B: 5    Spain:50      Min.   :0.0000
 1st Qu.:8779   Male:56   1st Qu.:23.00   W:83    USA  :38      1st Qu.:0.0000
 Median :8962             Median :25.50                         Median :0.0000
 Mean   :8848             Mean   :32.01                         Mean   :0.4318
 3rd Qu.:9035             3rd Qu.:36.25                         3rd Qu.:1.0000
 Max.   :9133             Max.   :72.00                         Max.   :1.0000

     Spain          Elem.Mid_s            HS             UndGrad
 Min.   :0.0000   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000
 1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000
 Median :1.0000   Median :0.00000   Median :0.0000   Median :0.0000
 Mean   :0.5682   Mean   :0.06818   Mean   :0.2955   Mean   :0.4773
 3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:1.0000
 Max.   :1.0000   Max.   :1.00000   Max.   :1.0000   Max.   :1.0000

     Grad            Emp_F            Emp_P            Not_Emp
 Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000
 Median :0.0000   Median :0.0000   Median :0.00000   Median :0.0000
 Mean   :0.1591   Mean   :0.3864   Mean   :0.05682   Mean   :0.4886
 3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:1.0000
 Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.0000

    Retired          HomeNetInc       NoIncMem        Fin_Score        Inv_exp
 Min.   :0.00000   Min.   :1.000   Min.   :1.000   Min.   :0.000   Min.   :1.000
 1st Qu.:0.00000   1st Qu.:1.750   1st Qu.:1.000   1st Qu.:3.000   1st Qu.:1.000
 Median :0.00000   Median :2.000   Median :1.000   Median :3.000   Median :1.000
 Mean   :0.06818   Mean   :2.693   Mean   :1.773   Mean   :2.682   Mean   :2.148
 3rd Qu.:0.00000   3rd Qu.:4.000   3rd Qu.:2.000   3rd Qu.:3.000   3rd Qu.:4.000
 Max.   :1.00000   Max.   :7.000   Max.   :5.000   Max.   :3.000   Max.   :4.000

    Nothing          Bank_dep         Pen_plan          Bonds           Stocks
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000   Min.   :0.0000
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000
 Median :1.0000   Median :0.0000   Median :0.0000   Median :0.000   Median :0.0000
 Mean   :0.5341   Mean   :0.3295   Mean   :0.2159   Mean   :0.125   Mean   :0.3295
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.000   3rd Qu.:1.0000
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.000   Max.   :1.0000

  Derivatives
 Min.   :0.0000
 1st Qu.:0.0000
 Median :0.0000
 Mean   :0.1477
 3rd Qu.:0.0000
 Max.   :1.0000

C. Linear regression to predict the risk-lovingness score using R

Call:
lm(formula = Gamma ~ Gender + Age + Race + Nationality + HS + UndGrad + Grad + Emp_F + Emp_P + Retired + HomeNetInc + NoIncMem + Fin_Score + Inv_exp + Bank_dep + Pen_plan + Bonds + Stocks + Derivatives)

Residuals:
     Min       1Q   Median       3Q      Max
-1124.29   -56.53    37.41   125.89   332.82

Coefficients:
                Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)     8051.382     280.517   28.702    <2e-16 ***
GenderMale       102.951      67.741    1.520    0.1332
Age               -2.587       3.086   -0.838    0.4047
RaceW            291.203     135.124    2.155    0.0347 *
NationalityUSA  -174.537      69.326   -2.518    0.0142 *
HS               265.767     219.570    1.210    0.2303
UndGrad          372.973     224.103    1.664    0.1007
Grad             195.012     238.679    0.817    0.4168
Emp_F             12.446      83.371    0.149    0.8818
Emp_P            -75.486     157.784   -0.478    0.6339
Retired            9.482     242.874    0.039    0.9690
HomeNetInc        48.257      21.620    2.232    0.0289 *
NoIncMem         -16.925      30.069   -0.563    0.5754
Fin_Score         36.727      55.891    0.657    0.5133
Inv_exp          107.910      46.981    2.297    0.0247 *
Bank_dep        -237.588     115.183   -2.063    0.0430 *
Pen_plan         -80.396      96.118   -0.836    0.4058
Bonds             30.928     118.441    0.261    0.7948
Stocks          -101.101     116.707   -0.866    0.3894
Derivatives      138.480     100.026    1.384    0.1707
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 265.6 on 68 degrees of freedom
Multiple R-squared: 0.4, Adjusted R-squared: 0.2323
F-statistic: 2.386 on 19 and 68 DF, p-value: 0.004701

D. Plots used to check the assumptions of the OLS model

It shows that there’s no heteroscedasticity, since there seems to be no clear pattern in the distribution of residuals, i.e. the variance is approximately the same for all magnitudes of predicted values and there’s no trend. Also, the fitted line is fairly flat, indicating that the linearity assumption is met.

Again, the homoscedasticity assumption seems to hold, there’s no clear pattern in the

distribution of the residuals.

Clearly, not all errors are normally distributed so it fails the test.

E. Validity data

Subject 1

Male, 23, white, USA, Graduate, Not working, Income category 5, 0 non-income-contributing

members at home, fin_score=3, 1-3 years of investment experience, stocks and derivatives,

certainty equivalents for lotteries (from smallest to largest): 8000, 15000, 30000, 70000,

110000, 140000, 170000.

Actual Gamma: 8985.125

Predicted Gamma: 9102.159

Error: +117.034

Subject 2

Male, 40, black, USA, Elem/Mid, Employed full time, Income category 2, 2 non-income-

contributing members at home, fin_score=1, no investment experience, certainty equivalents

for lotteries (from smallest to largest): 500, 1000, 3000, 5000, 10000, 20000, 30000.

Actual Gamma: 8351.062

Predicted Gamma: 8116.063

Error: -234.999

Subject 3

Female, 31, white, Spain, HS, Employed part time, Income category 1, 0 non-income-

contributing members at home, fin_score=2, no investment experience, certainty equivalents

for lotteries (from smallest to largest): 3000, 5000, 10000, 20000, 40000, 60000, 80000.

Actual Gamma: 8857.556

Predicted Gamma: 8665.365

Error: -192.191

Subject 4

Female, 47, white, Spain, Undergraduate, Employed full time, Income category 2, 1 non-

income-contributing members at home, fin_score=3, no investment experience, certainty

equivalents for lotteries (from smallest to largest): 5000, 8000, 15000, 30000, 60000, 80000,

100000.

Actual Gamma: 8915.299

Predicted Gamma: 8887.17

Error: -28.129
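The four validation results above can be summarized with a short script. This is a sketch under our own conventions: the error is computed as predicted minus actual, matching the signs reported above, and the mean absolute error is a summary we add here for illustration; it is not a statistic reported elsewhere in this paper.

```python
# (actual Gamma, predicted Gamma) for validation subjects 1 to 4, as listed above.
validation = [
    (8985.125, 9102.159),
    (8351.062, 8116.063),
    (8857.556, 8665.365),
    (8915.299, 8887.17),
]

# Prediction error for each subject: predicted minus actual.
errors = [predicted - actual for actual, predicted in validation]

# Mean absolute error, in score points.
mae = sum(abs(e) for e in errors) / len(errors)

# Absolute error relative to the actual score, in percent.
relative_errors = [100 * abs(e) / actual for e, (actual, _) in zip(errors, validation)]

print([round(e, 3) for e in errors])
print(round(mae, 3))
print([round(r, 2) for r in relative_errors])
```

The mean absolute error is about 143 score points, and every individual error is below 3% of the corresponding actual score.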