Mixed Feelings? The Relationship between Perceived Usability and User Experience in the Wild

Eeva Raita
University of Helsinki and Helsinki Institute for Information Technology HIIT

Antti Oulasvirta
Aalto University

ABSTRACT

Although both user experience and perceived usability have been extensively studied, the relationship between the two is less well understood. Prior empirical research suggests that perceived usability especially influences negative user experiences, but that the effect depends on goals, contexts, and expectations. This paper contributes to the theme with a description of a field study covering the self-reports of 12 subjects using a new smartphone. The findings confirm some earlier views on the relationship but also permit a richer understanding. Unlike prior work, the results show that perceived usability can play an important role in ambivalent experiential episodes. These episodes emerge from a clash between desired uses and either poor perceived usability or lack of appropriateness in the broader social context. We discuss our findings in relation to prior studies.

Author Keywords

Usability; User experience; Field studies; Qualitative studies; Ambivalent experiences; Diary method

ACM Classification Keywords

H.5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

INTRODUCTION

Perceived usability is studied to understand users’ perceptions of task-related efficiency, effectiveness, and satisfaction. However, as computers see increasing use for leisure, not only work, metrics focusing on task achievement have been criticized as insufficient for capturing what users enjoy. User experience has been posited as an alternative and a more holistic perspective that goes “beyond the instrumental” (e.g., [11, 17, 26, 28]). Yet it remains ambiguous how instrumental and non-instrumental qualities are related to each other, or how task-related elements influence the formation of pleasurable experiences.

Currently, the relationship between user experience and perceived usability is relatively unexplored. For example, only 23% of empirical user-experience studies examined in a recent review included some analysis of the relation between user experiences and perceived usability [1]. The ambiguity over the relationship between the two is evident even in the research methodology itself: some maintain that usability criteria can be utilized to measure user experience [35], while others have stressed the need for new research methodologies [8] or developed measurements for capturing perceived usability as a part of user experience [7].

Perceived usability and user experience have been studied in relation to the initial finding that non-instrumental qualities such as aesthetics can influence perceptions of instrumental qualities [34]. Some related papers have assessed the effect of both kinds of qualities on user experience, measured either as an emotional experience (e.g., [11, 25, 33]) or as an overall evaluation of goodness or preference (e.g., [4, 10, 17, 36]). These studies have found that perceived usability influences emotional reactions: good perceived usability results in positive reactions, and poor perceived usability in negative ones [24, 25, 33]. However, it has also been found that perceived usability is not a source of outstanding positive emotions [12]. According to prior studies, perceived usability influences preference and overall appraisals, but only in combination with other factors [10, 17] or depending on user expectations and the context of use [4, 9, 13].

We contribute to this discussion by pointing out how instrumental qualities are present in experiential episodes and influence the emergence of pleasurable user experiences. Our data come from a mixed-methods field study covering 12 participants’ detailed self-reporting of the first weeks of using a new smartphone. Prior studies have focused on user experiences as emotional reactions and overall evaluations and measured them mostly with predetermined rating scales. This paper contributes by studying how user experiences are conveyed in spontaneous reports in the wild.

The novel finding is that perceived usability is related to ambivalent experiential episodes. These are episodes wherein both positive and negative user experiences are present. For example, in the study certain apps were found desirable but not as usable as they could have been. In addition, certain use behaviors (mostly “killing time”) were found to be enjoyable but also pointless or inappropriate. We conclude the paper by discussing the significance of ambivalent experiential episodes and their contribution to the study of perceived usability and user experience.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. NordiCHI '14, October 26 - 30 2014, Helsinki, Finland Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-2542-4/14/10…$15.00. http://dx.doi.org/10.1145/2639189.2639207

Definitions Used In This Paper

Before reviewing related work, we discuss the two constructs used in this study: user experience and perceived usability. In general, user experience is a broad and ambiguous concept [22]. Individual researchers have focused on different aspects and conceptualizations of it. Although perceived usability is considered a more straightforward concept, it too has been approached in multiple ways (for a review of current usability practices, see [15]). For the purposes of this paper, we differentiate between perceived usability and “objective” usability, where the latter involves those performance-related aspects of the interaction that are independent of perception [15]. In practice, usability is often understood as ease of use and learning (e.g., [27]), but other researchers’ definitions also include the usefulness of the system and the satisfaction of the user [2].

In this paper, perceived usability is understood in reference to the ISO 9241-210 definition: perceptions of the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use [16]. This broad understanding covers the usability of both the product as a whole and its individual functions in supporting task achievement.

There are multiple conceptualizations of user experience, and some researchers emphasize that it should be viewed pluralistically [8]. User experience has been defined as a constant stream of self-talk (e.g., [8]), as an emotional response or experience (e.g., [23, 24]), as an experiential episode with a beginning and end (e.g., [8, 13, 18]), as co-experience (jointly created) (e.g., [8]), and as the overall evaluation based on momentary experiences (e.g., [13]). In addition, empirical user-experience studies have assessed user experience in terms of generic user experience, emotion, enjoyment, aesthetics, appeal, hedonic quality, engagement, flow, and a few other constructs (reviewed in [1]). Some studies have focused on only one aspect of user experience, while others utilize multiple perspectives simultaneously, for example by studying both emotional experiences and the overall user experience (e.g., [21, 25]).

In this paper, user experience too is defined in reference to the ISO 9241-210 standard: as a person's perceptions and responses that result from the use or anticipated use of a product, system, or service [16]. We focus on self-reports of user experience, which capture reflexive and articulated experiences as a part of use episodes (see also [18]).

RELATED WORK

In the discussion that follows, we review related works in terms of their focus on user experience, either 1) as emotions or 2) as overall preference or evaluation of goodness. In the last subsection, we summarize prior findings from longitudinal field studies in particular.

Perceived Usability and Emotional Experiences

Studies assessing the effect of perceived usability on emotional experiences often draw their inspiration from the initial finding that a product’s aesthetic appeal influences its perceived usability [34]. Consequently, they often compare the effect of instrumental (i.e., usability and usefulness) and non-instrumental (i.e., aesthetic or hedonic) qualities. These studies suggest that perceived usability influences emotions: good perceived usability results in positive, and poor perceived usability in negative, reactions [24, 25, 33]. However, it has been found that instrumental qualities are not a source of outstanding positive emotional experiences [12].

In experimental studies, it has been shown that both instrumental and non-instrumental quality perceptions (influenced by manipulated system properties) influence emotional experiences as measured via subjective feelings, facial expressions, and physiological responses [24, 25, 33]. These studies have not found an interactive effect between usability and aesthetics; the two factors influence emotions additively. In addition, perceived usability has been found to have a somewhat greater effect than perceived aesthetics [25, 33], especially during goal-driven use [24].

In contrast, in a questionnaire study by Hassenzahl et al. [12], perceived usability was not found to be a cause of positive emotions. In the study, participants were asked to report a recent outstanding positive experience with technology in terms of need-fulfillment, experienced affect, attribution, and product perception and evaluation. Hedonic quality (i.e., stimulation and identification) was found to capture the product’s perceived ability to create positive affect through need-fulfillment. Pragmatic quality (good perceived usability) could eliminate barriers to need-fulfillment but was not a source of positive affect per se.

Perceived Usability and Overall Appraisals

Studies approaching user experience in terms of overall evaluation or preference have found that perceived usability influences overall evaluation, but only together with other factors [10, 17] or depending on the context of use and users’ expectations [4, 9].

One of the first studies to explore this relationship was that of Jordan on pleasure in technology use [17]. According to that interview-based study, perceived usability, along with aesthetics, performance, and reliability, influences how pleasurable a product is. Later, Hassenzahl [10] studied the interplay between perceived usability and hedonic quality in forming overall judgments pertaining to beauty and goodness. He found that judgments of beauty are more influenced by the user’s perception of the hedonic qualities, while judgments of overall goodness are affected by both hedonic quality and perceived usability.

In a study of preferences between two Web sites with the same content but different interface style, it was found that the importance of perceived usability depends on preferred interface style and the context of use [4]. While both perceived usability and expressive aesthetics were good predictors of overall preference, this happened mainly because participants appeared to discount negative attributes of their favorite interface style. In addition, the importance of perceived usability, or expressive aesthetics, changed with the context of use. In a parallel to this, a paper [9] covering three experimental studies found that the importance of usability and aesthetics to the overall preference depends on the context of use, the tasks, and users’ expectations. Good perceived usability is favored especially in a “serious” use context, while perceived aesthetics have a greater impact in more playful contexts. Likewise, a study comparing the effect of the presence or absence of instrumental goals found that users’ evaluations focused on usability issues especially in the presence of instrumental, task-related goals [13].

Longitudinal Studies

The longitudinal field studies on the theme have studied the formation of user experiences [19] and the effect of emotions and memories on overall evaluations [21]. According to Karapanos et al. [19], the overall “goodness” of a product is associated mostly with perceived usability but also with usefulness, stimulation, and identification. While early experiences are related to hedonic aspects of product use, prolonged experience is tied more to how the product becomes meaningful in one’s life. Kujala and Miron-Shatz [21] found that negative emotions are related to perceived usability, and positive emotions to good overall user experience. In addition, they found that in the early stages users focused more on their user experiences, and the importance of usability increased over time.

THE FIELD STUDY

Prior empirical research suggests that perceived usability influences especially negative user experiences [10, 21], but the effect depends on goals [13], contexts, and expectations [4, 9]. Most of these findings have been obtained in the laboratory or with rating scales. In contrast, we studied spontaneous user reports capturing reflexive and articulated user experiences in the wild. Some longitudinal studies [19, 21] have utilized a setting similar to ours, but their focus has been on the longitudinal development of user experiences. In contrast, we are interested in understanding how perceived usability and user experiences are connected to each other in self-reported experiential episodes.

The study covered the first two weeks of participants’ ownership of a new smartphone. During the study period, user experiences were measured with diaries utilizing principles from the Day Reconstruction Method (DRM) [18], referred to as self-reports in the rest of the paper. Perceived usability was studied through identification of perceptions of usability from the self-reports and also via measurement of a global, summary usability score with the System Usability Scale (SUS) [3]. The study began with a kick-off meeting at which participants received the new device, answered demographic questions, and completed a questionnaire addressing their prior uses of mobile devices and expectations of the device’s usability with a future-form System Usability Scale [4]. During the study period, participants reported experiences arising in the use of the device with DRM narratives and filled in the SUS questionnaire for perceived usability daily. After the study period, the users filled in the questionnaire once more. The study was conducted in the native language of the participants (Finnish); the data cited in the paper are reported in translation.

The Smartphone

The data for this study were gathered in 2010. We utilized a device that presented what was new technology for our participants. The device was the Nokia N97, a smartphone launched in 2009 especially for browsing the Web. The operating system of the N97 is Symbian S60, and the phone has both a resistive touchscreen and a QWERTY keyboard. The phone features a five-megapixel camera; a music player; 3G, WiFi, and HSDPA connections; and an FM receiver.

Participants

The participants were 12 university students (three of them female), aged 21–27. The subjects’ previous phones were typically not smartphones, with the exception of participants 3, 5, 7, and 12, of whom subjects 3 and 12 had owned a smartphone for approximately two years and the others for less than a year. Participants received the N97 phones in connection with a wider study. As compensation, they did not have to pay for voice, text, or data use during the study.

Data Collection

SUS questionnaires

The System Usability Scale is a widely used and validated quantitative metric for determining a global, summary usability score [4]. With this scale, participants rated 10 statements about the usability of the device (e.g., “I thought the system was easy to use”) on a five-point Likert scale from “strongly disagree” to “strongly agree.” The SUS yields a single number representing a composite measure of the overall usability, within the range 0–100.
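The paper does not spell out how the ten ratings are turned into the 0–100 score. The sketch below assumes the standard SUS scoring rule: odd-numbered (positively worded) items contribute the rating minus 1, even-numbered (negatively worded) items contribute 5 minus the rating, and the sum is multiplied by 2.5. The function name `sus_score` and the 1-to-5 coding of responses are illustrative assumptions, not taken from the study materials.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for ten ratings coded 1..5.

    Assumes the standard SUS layout: items 1, 3, 5, 7, 9 are positively
    worded and items 2, 4, 6, 8, 10 are negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # positively worded item
        else:
            total += 5 - rating   # negatively worded item
    return total * 2.5            # rescale the 0-40 sum to 0-100


# A respondent who answers "neutral" (3) to every statement lands mid-scale.
neutral_score = sus_score([3] * 10)
```

Under this rule, each item contributes between 0 and 4 points, which is why the final multiplication by 2.5 yields the 0–100 range mentioned above.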

Self-reporting

The Day Reconstruction Method, developed by Kahneman et al. [16], is used for studying how people spend their time and how they experience the activities and settings of their lives. With this method, participants reconstruct their activities and experiences of the preceding day via procedures designed to reduce recall biases (see also [19, 21]). For this study, we formulated self-reporting forms, on which participants filled in all daily activities for five given time bands (morning, lunchtime, afternoon, evening, and night). After having formed an overview of the day, the subject described all episodes of anticipated use, use, or deliberate non-use, along with the related user experiences. Participants filled in the reports every evening for the previous day and sent them to the researcher.

ANALYSIS

We analyzed the self-reports with open-coding content analysis [21]. Open coding is a technique wherein the coding categories are not determined in advance; instead, they are created and modified during the coding process. Our coding was actually semi-open, since user experience was utilized as a sensitizing concept. In particular, the coding categories were formulated through exploration of the data in relation to the definition of user experience [16]: How do participants describe their perceptions and responses that result from or are related to the use, anticipated use, or deliberate non-use of the device?

Classification of Experiential Episodes

We broke the self-reports down into experiential episodes (N=739), which could include multiple user experiences simultaneously but were schematized with an identifiable beginning and end, such as a bus trip, a party, or an evening at home (see also [8]). Neutral episodes (N=288) were descriptions of the use, anticipated use, or deliberate non-use of the device that did not show any positive or negative valence (e.g., “I call my friend”), and we excluded them from further analysis. Positive (N=169) and negative (N=158) episodes involved only positive or only negative valence, respectively, while ambivalent episodes (N=124) featured both.
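The four-way classification can be read as a simple rule over the valences coded within one episode. The following is a minimal sketch of that rule, not the authors' coding procedure (which was done manually on free-text reports); the function name `classify_episode` and the label strings are hypothetical. The counts in `REPORTED_COUNTS` are the ones reported above.

```python
def classify_episode(valences):
    """Classify one episode from the valence labels coded within it.

    `valences` holds zero or more 'positive' / 'negative' labels,
    one per user experience identified in the episode.
    """
    found = set(valences)
    unknown = found - {"positive", "negative"}
    if unknown:
        raise ValueError(f"unexpected valence labels: {sorted(unknown)}")
    if found == {"positive", "negative"}:
        return "ambivalent"   # both valences present in the same episode
    if found == {"positive"}:
        return "positive"
    if found == {"negative"}:
        return "negative"
    return "neutral"          # no valenced user experience at all


# Episode counts as reported in the paper.
REPORTED_COUNTS = {"neutral": 288, "positive": 169, "negative": 158, "ambivalent": 124}

total_episodes = sum(REPORTED_COUNTS.values())
valenced_episodes = total_episodes - REPORTED_COUNTS["neutral"]
```

The two derived totals correspond to the N=739 episodes overall and the N=451 episodes retained for the analysis of user experiences.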

Classification of User Experiences in the Episodes

We analyzed user experiences in the negative, positive, and ambivalent experiential episodes (N=451) by distinguishing among the ways in which user experiences were present in them. This analysis yielded three coding categories for user experiences:

1. Perception of a feature or functioning
2. Perception of an interaction with the device
3. Description of an emotion

User experiences in the first two categories were perceptions about either a feature or functioning of the device (e.g., “The newsfeed is useful”), or interaction with it (e.g., “Browsing the Web feels easy”). In the last category, the user experiences were descriptions of emotions (e.g., “I feel happy about owning this device”).

OVERVIEW OF THE DATA

Before reporting on the findings, we provide an overview of the main tendencies in the dataset. In total, we identified 619 user experiences across the selected experiential episodes. The participants differed in the number of user experiences they reported. The participant-specific variations were: 3–7 negative and 3–18 positive perceptions of feature or functioning, 1–12 negative and 0–9 positive perceptions of interaction with the device, and 2–22 negative and 3–24 positive descriptions of emotions.

Perception of a feature or functioning (Category 1): Positive perceptions (N=97) had to do most often either with usability in terms of something being easy to use, or working well / better than expected (38%), or with the broader goodness of a certain feature (30%). Usefulness (23%) and hedonic qualities such as aesthetics, fun, or “niceness” (9%) were evaluated too. Negative perceptions (N=96) were related mostly to dimensions of usability such as slowness or clumsiness, or to the device not working properly (73%). A few negative evaluations pertained to uselessness (11%) or to something being poor (8%) or ugly/awful (7%).

Perceptions of interaction with the device (Category 2): Positive perceptions (N=53) were related to aspects of usability such as interaction succeeding or feeling easy (39%), usefulness of interaction (36%), or the interaction feeling intuitive/pleasant (25%). Negative perceptions (N=80) mostly had to do with usability in terms of interaction failing or being difficult/slow (90%). A few pertained to uselessness (6%) or unpleasant interaction (4%).

Descriptions of emotions (Category 3): Positive emotions (N=126) included descriptions of feeling content (N=54), happy (N=34), pleased (N=18), delighted (N=8), excited (N=10), and proud (N=2). Positive emotions were related mostly to usefulness (46%), especially in terms of being able to access information efficiently (N=18) and relieve boredom by killing time (N=22). In addition, positive emotions were related to consequences of use, such as cheering up after calling a friend or enjoying the relaxed feeling of not using the device (24%), and to hedonic elements, such as feeling proud of owning the device or enjoying a “cool ringtone” (19%). A few positive emotions had to do with usability or with functionality surpassing expectations (11%).

Negative emotions (N=167) included descriptions of felt annoyance and frustration (N=100), confusion (N=29), disappointment (N=10), negative feelings (N=10), shame (N=11), and nervousness (N=7). Negative emotions were related mostly to usability problems such as the phone becoming unresponsive or being hard to use (73%) and to usefulness in terms of the device or a certain app not having specific expected features (17%). A few were linked to consequences of use such as having an unpleasant phone conversation (8%) and also to poor aesthetics in design (2%).
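As a cross-check of the figures in this overview, the per-category counts can be tallied against the reported total of 619 user experiences. The sketch below uses only the counts stated above; the dictionary layout and variable names are illustrative.

```python
# Counts of user experiences per coding category and valence, as reported above.
COUNTS = {
    "feature_or_functioning": {"positive": 97, "negative": 96},
    "interaction": {"positive": 53, "negative": 80},
    "emotion": {"positive": 126, "negative": 167},
}

# Totals per valence across the three coding categories.
positive_total = sum(c["positive"] for c in COUNTS.values())
negative_total = sum(c["negative"] for c in COUNTS.values())
grand_total = positive_total + negative_total

# Share of negative user experiences within each category, in percent.
negative_share = {
    name: round(100 * c["negative"] / (c["positive"] + c["negative"]))
    for name, c in COUNTS.items()
}
```

The tally reproduces the reported total and shows that negative user experiences outnumber positive ones overall, most clearly in the interaction category.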

Correlations between User Experiences and SUS

For a rough grasp of the relationship between user experiences and perceived usability, we calculated correlations between the number of user experiences and SUS scores. The average of daily SUS scores correlated negatively with the number of negative user experiences per day (r=-.64, p<.05): the more negative user experiences, the lower the SUS score. The expected SUS value correlated negatively with the number of positive user experiences during the study (r=-.58, p<.05): the higher the expected SUS score, the fewer positive user experiences there were during the study. No other statistically significant correlations were found.
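The text reports r values without naming the coefficient. The sketch below assumes Pearson's product-moment correlation, which is the usual meaning of r; this is an assumption, and the function name `pearson_r` is illustrative. The two short lists are invented numbers for illustration only and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Invented illustration: negative experiences per day vs. mean daily SUS score.
negative_per_day = [9, 7, 6, 4, 3]
mean_daily_sus = [55, 58, 57, 63, 66]
r = pearson_r(negative_per_day, mean_daily_sus)  # negative: more complaints, lower SUS
```

A negative r from such paired series corresponds to the pattern described above, where days with more negative user experiences have lower SUS scores.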

Evolution of User Experiences and SUS

Figure 1 provides an overview of how user experiences and perceived usability evolved during the study. It presents the count of positive and negative user experiences calculated by summing participants’ daily numbers of user experiences. The timeline shows an early peak in both negative and positive user experiences: participants reported many user experiences during the first days of use, after which the number of user experiences started to decrease and stabilize.

Figure 2 plots mean SUS scores against time in relation to the expected and end SUS scores. Mean SUS scores have been calculated as averages of participants’ daily scores. Over the first few days, participants’ SUS scores dropped to about 55, after which they started to recover to nearly the expected level. The difference between the expected SUS score (mean=69) and the SUS score after the first day (mean=56) is statistically significant: t(11)=3.610, p<.005.
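The reported t(11) with 12 participants is consistent with a paired-samples t-test, whose degrees of freedom are n - 1; the paper does not name the test, so this is an assumption. The sketch below computes the paired t statistic from per-participant differences. The function name `paired_t` is illustrative, and no study data are reproduced here.

```python
import math


def paired_t(first, second):
    """Return (t, degrees_of_freedom) for a paired-samples t-test.

    `first` and `second` hold one score per participant, in the same order,
    e.g. the expected SUS score and the SUS score after the first day.
    """
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two paired samples with at least 2 participants")
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("t is undefined when all differences are identical")
    standard_error = math.sqrt(var_diff / n)
    return mean_diff / standard_error, n - 1
```

With 12 paired scores the function returns 11 degrees of freedom, matching the form of the statistic reported above; the p-value would then be read from the t distribution with that many degrees of freedom.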

Figure 1. Count of positive and negative user experiences from one day to the next during the study period.

Figure 2. The mean day-to-day SUS scores during the study, with expected (EXP) and end (END) SUS scores.

FINDINGS

We present qualitative findings in relation to four themes: 1) comparisons, 2) justifications, 3) triggers, and 4) ambivalence. Firstly, we discuss how user experiences emerged from comparisons to expectations and in relation to other devices. Secondly, we analyze how expectations and perceptions of instrumental qualities (i.e., perceived usability and usefulness) were utilized to justify the use, anticipated use, or deliberate non-use of the device. Thirdly, we discuss the difference in the triggers of positive vs. negative emotions. Fourthly, we analyze ambivalent experiential episodes in greater depth.

Comparisons with Existing Devices and Expectations

Many user experiences arose from comparison with other devices, such as prior phone models or devices with similar functions.

[P3] I miss my old phone that had a virtual keyboard that I had gotten used to; using the hardware keyboard feels really burdensome, and I haven’t yet found a virtual QWERTY keyboard for the N97.

[P12] I browse the Web a lot, because the new display feels luxurious when compared to my old one. Unfortunately the browser is as undeveloped as the one on my old phone, and actually even my prior phone from 2003 had basically the same browser. This makes me really disappointed.

Contrasts were often made against expectations. These were either specific to the device, a certain feature or app, or its functioning, or involved broader expectations as to the user-friendliness of all similar devices. Participants were disappointed when their expectations were not met and positively surprised when certain aspects exceeded their expectations or eclipsed prior experiences.

[P10] When I write an SMS, I use an “a” instead of the “ä” character, because I once again can’t discover how to get the “ä” letter conveniently. Feels stupid, but I don’t have time to get to know the keyboard better; one would expect that in the design of a cell phone, users are taken better into account.

[P1] During breakfast, I download a few songs to my phone just for testing. The downloading goes quickly and without the hardships I somewhat expected.

Instrumental Qualities As Justifications for Use Behaviors

In the self-reports, participants often explained why they had been doing certain things in a certain way. Expected and perceived instrumental qualities (i.e., usability and usefulness) were often utilized as justification for how participants proceeded toward their goals. Efficiency and ease of use were employed as justifications for use, and uselessness and difficulty of use as justifications for non-use.

[P12] Maybe the biggest change in my life is that I’ve started to use only the phone’s calendar whereas before I used both a phone’s and laptop’s calendar. Using the N97’s calendar is just so convenient, because you can just pinpoint a day and mark a meeting.

[P4] I did not use the phone when I was doing my workout at the gym. A workout, an aerobic one especially, could include music, but it feels too difficult to adjust the phone to play the workout music.

[P5] I still haven’t had the time to update the phone; I haven’t had the motivation, because at worst it can make the whole thing unresponsive.

Reported Triggers for Positive vs. Negative Emotions

In addition to perceptions, user experiences included descriptions of positive and negative emotions. Positive emotions (N=126) were related especially to usefulness connected to quick access to information and an opportunity to relieve boredom with the smartphone.

[P6] I utilized my phone to fill in an Internet form. I was really pleased at being able to do that immediately with the phone, since I was only on my way home, where I would normally have to have filled in the form on my laptop.

[P10] It feels relieving that I can take the Internet with me on the trip, just in case we get lost.

[P12] During a lecture, I learned that one flying game is a really pleasant way to spend time, if the professor starts to talk about something meaningless.

In addition, there were positive emotions related to consequences of use, wherein the focus was not on the act of using (e.g., calling) but on what using “resulted” in (e.g., talking to a friend).

[P10] I study the whole evening in the university’s computer class, and about once an hour or two I call my friends for some gossip so that I can take a break from work. My mind always cheers up after a chat.

Some more hedonic aspects, such as enjoying a cool ringtone or feeling proud of owning a “business” phone, were connected to positive emotions.

[P7] We admire my new phone together, and I told my friend how a large proportion of the functions are totally useless for me but it just feels so good to own a phone with which to amaze another goofy fool.

[P5] I forgot to put the phone on silent, but I don’t have to be ashamed at all to show the phone, because it is the new business phone.

Negative emotions (N=167) were associated especially with usability problems: the device, a certain feature of it, or an app being hard to use, working improperly, or not working at all.

[P3] I receive a phone call, but I don’t know how to answer it—I try to press the answer button many times, but I don’t realize that I need to open the keypad lock. My ringtone rings loudly in the quiet corridor, and I decide to change it to something less obtrusive once I get home. The phone call ends up in voice mail, and I have to call the caller back. I feel annoyed.

[P7] It made me mad that the phone became unresponsive after I had quit one phone call; that is, immediately after hanging up, the display presented those things that are usually present when someone is calling. When I tapped the phone, it vibrated, but the screen stayed frozen with the same view. When I restarted the phone, the problem went away, but the whole thing made me quite upset; on the basis of the first two days of use, I would not buy an N97.

Participants struggled especially with the initialization of the device. Problems in this phase were decisive, because they also caused negative experiences later, when the device could not be used for expected ends.

[P3] We listen to music from the speakers for about 15 minutes, and it irritates me that I was not able to transfer the music in the morning. Now we have to listen to the background music from a LocoRoco game I transferred earlier to be my ringtone, and this really amuses my team.

[P1] I wasn’t able to transfer my calendar, so I try to remember events. I have to take along my old phone. It frustrates me that the synchronization doesn’t work.

Ambivalent Experiential Episodes

While prior studies have differentiated between positive and negative user experiences, we also identified ambivalent experiential episodes (N=124), which included positive and negative user experiences simultaneously. Many of these contradictions emerged when participants were pleased with the possibility of using the device for a certain purpose but at the same time felt that this functionality could have been more usable and/or useful in a certain respect.

[P5] I ate breakfast and browsed through the news in the mobile HS newspaper; the app was quite convenient, though loading takes a really long time in comparison to the iPhone. Everything occurs after a few seconds’ delay.

[P4] I started to look for people’s phone numbers on the Net, so that I could call them and let them know I was late for the seminar preparations (I should have been there already at 10am). This per se was nice; however, the phone does not seem to have a copy-paste function, so I did find the numbers but I could not call them. It annoyed me a bit.

[P11] Later in the evening, I also checked the HSL pages for which bus my friend should take to get home. The pages do not work as nicely as with the computer, and definitely not as I would wish them to work. Finally I get the right directions open and we check the route. Nifty even though it took quite a lot of time.

Ambivalence was evident when participants commented on something not being as usable as it could have been but still enjoyed its use because it was “better than nothing.” Especially when participants felt bored, they claimed that utilizing hard-to-use apps for killing time was still able to relieve boredom.

[P8] I watched some videos on YouTube with my phone, because I couldn’t get to sleep. Watching the videos with the 3G connection is not nearly as quick as with broadband, but the YouTube app is still a nice feature. At least it is really apt for boring moments and killing time.

[P2] (Day 4) On my way, I log in to Facebook, but the phone gets somehow stuck, and I don’t really even like my phone’s Facebook app. On the basis of my experience, I decide to take the Facebook app off my home screen, because at least now I don’t think I will use it. […] (Day 10) On the train, I browse through Facebook, I guess I have gotten used to it or maybe travelling on the train is just really boring and I go through my friends’ status updates to kill time.

Ambivalent experiential episodes also arose when participants enjoyed something that was perceived as pointless or somewhat inappropriate in the broader social context. This source of enjoyment was often time-killing activities. On the one hand, all participants commented that killing time made them feel they were making use of their time at least somehow, and it made the time go more quickly. On the other hand, it was also perceived as unproductive or as something that one should not really be doing. This resulted in ambivalent experiential episodes: enjoying passing the time but simultaneously feeling guilt or annoyance.

[P4] I browsed the Web, feeling ashamed because I was one of the organizers of the seminar. I wanted some amusement to relieve boredom this way.

[P9] I played some games with my phone when using the bathroom. The good side is that I enjoy the bathroom visits much more, but the bad side is that my visits have started to take a really long time because of my phone.

[P7] I feel ashamed to admit it, but today I got into playing that guitarhero game found on my Nokia again. It is surprisingly hooking, and surprisingly stupid. At the same time addictive and annoying. One definitely does not become relaxed when playing those games. Yet they do offer a great extra for moments when it is boring and one does not want to get into unpleasant social situations.

DISCUSSION

In parallel with some prior studies, we found perceived usability connected especially to negative user experiences. In contrast to what prior perspectives suggest, positive user experiences were often connected to instrumental qualities, mostly usefulness. The self-reports showed a large number of ambivalent experiential episodes, including both positive and negative user experiences. These episodes indicate that, even though good perceived usability was rarely the cause of positive emotions, poor usability did make desired uses less enjoyable. Also, a desired use found to be inappropriate in the broader context evoked mixed feelings.

Perceived Usability Connected Especially to Negative User Experiences

We found that perceived usability is connected especially to negative user experiences. In the self-reports, 80% of negative perceptions had to do with poor usability, such as hard-to-use features or the device becoming unresponsive. Usability problems caused 73% of the negative emotions, such as frustration arising from the device not working properly or annoyance with not understanding how the device functions. The number of negative user experiences per day correlated with the average of daily SUS scores (r=-.64, p<.05): the more negative user experiences, the lower the SUS score. Positive user experiences were not connected to perceived usability to the same extent. There was no correlation between perceived usability and positive user experiences. In the self-reports, 39% of positive perceptions and 11% of positive emotions were related to good usability.
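As a minimal sketch of the kind of statistic reported above, the following computes a Pearson correlation coefficient between the number of negative user experiences and mean SUS scores. The function name `pearson_r`, the variable names, and all numeric values are hypothetical illustrations and are not the study's data; the sketch also assumes one pair of values per observation unit, which may differ from the unit of analysis actually used in the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each sample.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical illustrative values only (NOT the study's data):
# count of negative user experiences and mean SUS score per observation.
negative_experiences = [9, 7, 8, 5, 4, 3]
mean_sus_scores = [48.0, 55.0, 50.0, 62.0, 66.0, 70.0]

r = pearson_r(negative_experiences, mean_sus_scores)
print(f"r = {r:.2f}")
```

A negative coefficient here corresponds to the pattern described in the text: more negative user experiences go together with lower SUS scores. The sketch does not compute a significance level; a statistics package would be needed for the p-value.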

The finding that perceived usability is connected more to negative than to positive user experiences is in line with prior empirical studies addressing the theme (e.g., [10, 19, 21]). For example, in a questionnaire study by Hassenzahl et al. [12], good usability could eliminate barriers to need-fulfillment but was not a source of positive affect per se. These findings suggest that perceived usability acts as a necessary precondition to positive user experiences, the main causes of which lie elsewhere (see also [10, 11, 22]).

In the difference between negative and positive user experiences, we see an analogy to Herzberg’s two-factor model of satisfaction (see [14] and also [37]). Originally formulated to explain work satisfaction, Herzberg’s model proposes that dissatisfaction and satisfaction with work are not polar opposites but related to different elements. Dissatisfaction arises from a lack of “hygiene” factors (e.g., good company policies, supervision, and salary) and satisfaction from the presence of motivation factors (e.g., meaningful work content, recognition, and responsibility). Studies on this theme with respect to user satisfaction have found that hygiene factors have more to do with the functional elements and motivation factors with the user-system interaction and user involvement [37]. Consistent with recent propositions (e.g., [6, 30]), usability can be understood as a hygiene factor for minimizing users’ negative user experiences. However, if it is to produce pleasurable user experiences, good usability needs to be paired with motivating factors that bring added value to users’ lives.

If good usability acts mostly as a hygiene factor, what are the motivating factors that cause positive user experiences in the HCI context? Some prior studies have focused on the difference between instrumental “do” goals such as making a phone call and more non-instrumental “be” goals such as feeling related. This research suggests that the latter, “be” goals, produce pleasurable experiences through the fulfillment of basic human needs (e.g., [11, 12, 13]). Before one equates “be” goals with motivating factors, however, it is worth discussing why in this study positive user experiences were connected in large numbers to instrumental qualities.

The Dominance of Instrumental Quality Perceptions

The concept of user experience has been offered as an alternative to usability that focuses on positive user experiences and goes “beyond the instrumental” (e.g., [11, 17, 26, 28]). While some research has emphasized the importance of non-instrumental qualities in the formation of positive user experiences (e.g., [11, 28]), positive user experiences were often related to instrumental qualities in this study. In the self-reports, 66% of positive perceptions and 57% of positive emotions were related to instrumental qualities (i.e., usefulness and usability); while good usability rarely evoked positive emotions (11%), usefulness did so quite often (46%).

In light of existing views, at least three possible explanations emerge for this finding: 1) instrumental qualities are especially important for this device, or this class of devices; 2) non-instrumental qualities influence user experiences unconsciously and are not present in reflective self-reports; and 3) instrumental qualities have become a legitimate discourse for arguing about digital devices and may dominate in self-reports.

Firstly, prior longitudinal studies of smartphones have also found that instrumental qualities act as an important dimension along which users describe their user experiences [19]. What might explain this phenomenon is that smartphones are often used to reach instrumental goals. According to prior studies, the presence of instrumental goals makes users focus on them in their evaluations (e.g., [9, 12]).

Secondly, it is possible that non-instrumental qualities influence user experiences more unconsciously than instrumental qualities do and are not present in spontaneous, reflective self-reports. This explanation calls for more research on the theme, because prior studies indicate that perceived aesthetics influence perceived usability [34]. If this holds, non-instrumental qualities, especially aesthetics, could influence those reflective self-reports that explicitly concern instrumental qualities. In contrast, more recent studies on the theme have not found an interactive effect between the two (e.g., [24, 25, 33]).

Thirdly, it is possible that instrumental qualities have become a prevailing way of arguing for or against, and of perceiving, new devices. Prior research has discussed the dilemma of the hedonic, which arises when people overemphasize the instrumental aspects over the more non-instrumental qualities in a choice situation [6]. That research suggests that non-instrumental qualities are what people would actually enjoy, because they pertain to the reaching of non-instrumental, “be” goals related to basic human needs (such as feeling related) (e.g., [11, 12, 13]). However, in choice situations, users seem to find it easier to utilize justifications related to instrumental task-achievement [6]. Correspondingly, in this study, instrumental qualities were utilized to justify use behaviors: use was justified with effortlessness and usefulness, non-use with expecting a certain use to be burdensome and/or useless.

Broader Contexts of Experiencing

The self-reports bring out the importance of the broader context in the formation of perceived usability and user experience. When forming reflexive user experiences, participants often compared them to what they expected and to experiences with prior devices. Comparisons influenced emotional experiences: participants described feeling disappointed when their expectations were not met and feeling content when certain functionality surpassed what had been possible with prior devices. Prior studies too have acknowledged the importance of expectations in the forming of emotional reactions (e.g., [4, 9]) and overall evaluations (e.g., [11]). However, while short-term laboratory studies have identified a boosting effect of expectations [29], in this study the effect ran the other way: failing to meet expectations led to disappointments.

Expectations also influenced SUS scores. Here we observed a sharp drop during the first days of use relative to the expected SUS score. This drop was consistent with the high number of negative user experiences. However, as users kept using the system, the quantity of negative user experiences started to decrease and SUS scores rebounded to close to the expected level. One explanation for this is that in the longer term the effect of expectations starts to vanish [19]. However, while this might hold for expectations formulated before usage, we speculate on the extent to which new comparisons arise with use; that is, since comparisons seem to be a way of articulating experiences, different comparison points might emerge in tandem with use. This is what happens, for example, when a new appealing device comes to market and leads users to disregard older ones. In addition, in the self-reports we identified expectations that were not specific to the device and, rather, pertained to broader expectations of all similar devices, such as expectations related to user-friendliness. We speculate that expectations derived in relation to experience and word-of-mouth on similar devices have a more lasting effect than more specific expectations. That is, if the device fails to deliver what is already known to take place with similar devices, a transition from fantasy to reality [19] probably does not occur; in contrast, one sees what P7 argued: “on the basis of the first two days of use, I would not buy an N97.”

The broader context was also present in those ambivalent episodes wherein users felt ashamed for enjoying a desired use when it was found somehow inappropriate. Prior studies have discussed the difference between instrumental “do” goals and more non-instrumental “be” goals (e.g., [11, 12, 13]). We too believe that it is analytically important to differentiate between these “do” goals and “be” goals if we are to understand both the what and the why of goal-pursuit (see also [5]). However, while “do” and “be” goals are different in analytical terms, we find that the two often intertwine. Doing something also means being something, and users seem to be quite aware of this. Why else would they feel ashamed for doing things that they enjoy but are not socially appropriate? Supporting this idea, in a review of empirical user experience studies, instrumental and non-instrumental goals were often found interwoven and inseparable [1]. We suggest that, in day-to-day life, different goals are related to each other in multiple ways, and pursuing one aim can either promote or inhibit the reaching of other goals (see also [31]).

Mixed Feelings?

Our novel finding is a sizable proportion of what we call ambivalent experiential episodes (N=124). They are ambivalent because they include both positive and negative user experiences. The existence of this class of experiences calls for further research. We observed ambivalence mostly when a certain use was perceived as desirable but lacking in usability. For example, participants were pleased with the new opportunities to use the Web for fast information access and killing time, but they simultaneously perceived the multiple usability problems (such as hard-to-use apps) negatively. If we interpret this in relation to Herzberg’s two-factor model [14], these episodes comprise motivating factors (desired, potentially pleasure-evoking uses) paired with a lack of hygiene factors (poor perceived usability).

The results pertaining to ambivalent episodes complement findings from other longitudinal studies on the theme. Similarly to what was seen in the study by Karapanos et al. [19], usefulness emerged through the appropriation of the device in specific use contexts (such as when travelling) and in relation to the changes this brought to participants’ lives (such as having fast access to information). Karapanos et al. suggest that, if meaningful mediation emerging from usefulness is to be supported, there is a need for designs that are specific enough to address a specific need but flexible enough to enable artful appropriation. Along with this conclusion, we stress the importance of perceived usability: it is hard to enjoy meaningful mediation if the device is not working properly. Meaningful mediations may even go unexplored on account of poor expected usability, as seen in our data when participants justified non-use with expecting something to be too difficult/burdensome relative to its benefits.

Our study dealt with only one device, and further studies on the theme are needed. Supporting our findings, psychological research has identified the importance of ambivalent attitudes and emotions (e.g., [31]). Consequently, we believe that ambivalent experiential episodes offer an interesting perspective and one complementary to that of user experience research. From the design point of view, ambivalent episodes can help designers to identify users’ desires with respect to a certain device in connection with the obstacles (poor perceived usability or inappropriateness in the social context) hindering complete fulfillment of these desires. For embracing the users’ perspective and the holistic study of experiences, it is important to utilize approaches that leave room for ambivalent experiential episodes. Instead of identifying what people love or hate, we should understand what makes them feel both things at the same time.

Conclusion

We studied the relationship between perceived usability and user experiences, measured as articulated and reflexive experiences. In parallel with a few prior studies, we found perceived usability connected more to negative than to positive user experiences. Unlike some prior studies, we found positive user experiences often connected to instrumental qualities. A novel finding in our work is that perceived usability is also related to ambivalent experiential episodes. These episodes emerged from a clash between desired uses and either poor perceived usability or lack of appropriateness. Desired uses can become less enjoyable via hard-to-use features or if the use is experienced as inappropriate in the broader social context. These findings complement earlier findings and call for more research on the intersection of user experience and perceived usability.

Acknowledgements

Raita acknowledges funding from UCIT Graduate School and from the Doctoral School of University of Helsinki. This work has been supported by the project Theseus funded by Tekes. We thank our colleagues and independent reviewers for their valuable feedback.

REFERENCES

1. Bargas-Avila, J. A. & Hornbæk, K. (2011). Old wine in new bottles or novel challenges? A critical analysis of empirical studies of user experience. In Proc. CHI 2011 (pp. 2689–2698). New York: ACM Press.

2. Bevan, N. (1995). Measuring usability as quality of use. Software Quality Journal, 4, 115–130.

3. Brooke, J. (1996). SUS: A quick and dirty usability scale. In: P. W. Jordan, B. Thomas, B. A. Weerdmeester, & I. L. McClelland (eds.), Usability Evaluation in Industry (pp. 189–194). London: Taylor and Francis.

4. De Angeli, A., Sutcliffe, A., & Hartmann, J. (2006). Interaction, usability and aesthetics: What influences users' preferences? In Proc. DIS 2006 (pp. 271–280). New York: ACM Press.

5. Deci, E. L. & Ryan, R. M. (2000). The “what” and “why” of goal pursuits: Human needs and the self-determination of behavior. Psychological Inquiry, 11, 227–268.

6. Diefenbach, S. & Hassenzahl, M. (2011). The dilemma of the hedonic – appreciated, but hard to justify. Interacting with Computers, 23, 461–472.

7. Finstad, K. (2010). The usability metric for user experience. Interacting with Computers, 22, 323–327.

8. Forlizzi, J. & Battarbee, K. (2004). Understanding experience in interactive systems. In Proc. DIS 2004 (pp. 261–268). ACM Press.

9. Hartmann, J., Sutcliffe, A., & De Angeli, A. (2008). Towards a theory of user judgment of aesthetics and user interface quality. ACM Transactions on Computer–Human Interaction, 15.

10. Hassenzahl, M. (2004). The interplay of beauty, goodness, and usability in interactive products. Human–Computer Interaction, 19, 319–349.

11. Hassenzahl, M. (2008). User experience (UX): Towards an experiential perspective on product quality. In Proc. IHM 2008 (pp. 11–15). New York: ACM Press.

12. Hassenzahl, M., Diefenbach, S., & Göritz, A. (2010). Needs, affect, and interactive products—facets of user experience. Interacting with Computers, 22, 353–362.

13. Hassenzahl, M. & Ullrich, D. (2007). To do or not to do: Differences in user experience and retrospective judgments depending on the presence or absence of instrumental goals. Interacting with Computers, 19, 429–437.

14. Herzberg, F. (1987). One more time: How do you motivate employees? Harvard Business Review, 65, 53–62.

15. Hornbæk, K. (2006). Current practice in measuring usability: Challenges to usability studies and research. International Journal of Human–Computer Studies, 64, 79–102.

16. ISO 9241-210. Ergonomics of human–system interaction – Part 210: Human-centred design for interactive systems.

17. Jordan, P. (1998). Human factors for pleasure in product use. Applied Ergonomics, 29, 25–33.

18. Kahneman, D., Krueger, A. B., Schkade, D. A., Schwarz, N., & Stone, A. A. (2004). A survey method for characterizing daily life experience: The day reconstruction method. Science, 306, 1776–1780.

19. Karapanos, E., Zimmerman, J., Forlizzi, J., & Martens, J. B. (2009). User experience over time: An initial framework. In Proc. CHI 2009 (pp. 729–738). New York: ACM Press.

20. Krippendorff, K. (1980). Content Analysis: An Introduction to Its Methodology. Beverly Hills, California: Sage Publications.

21. Kujala, S. & Miron-Shatz, T. (2013). Emotions, experiences and usability in real-life mobile phone use. In Proc. CHI 2013 (pp. 1061–1070). New York: ACM Press.

22. Lai-Chong Law, E., Roto, V., Hassenzahl, M., Vermeeren, A., & Kort, J. (2009). Understanding, scoping and defining user experience: A survey approach. In Proc. CHI 2009 (pp. 719–728). New York: ACM Press.

23. Lindgaard, G. & Dudek, C. (2003). What is this evasive beast we call user satisfaction? Interacting with Computers, 15, 429–452.

24. Mahlke, S. & Lindgaard, G. (2007). Emotional experiences and quality perceptions of interactive products. In: J. Jacko (ed.), Human–Computer Interaction, Part I, HCII 2007, LNCS 4550 (pp. 164–173). Berlin: Springer.

25. Mahlke, S. & Thüring, M. (2007). Studying antecedents of emotional experiences in interactive contexts. In Proc. CHI 2007 (pp. 915–918). ACM.

26. McCarthy, J. & Wright, P. (2004). Technology as Experience. Massachusetts: MIT Press.

27. Nielsen, J. (1994). Usability inspection methods. In Proc. CHI 1994 (pp. 413–414). ACM Press.

28. Norman, D. A. (2002). Emotion and design: Attractive things work better. Interactions Magazine, IX, 36–42.

29. Raita, E. & Oulasvirta, A. (2011). Too good to be bad: Favorable product expectations boost subjective usability ratings. Interacting with Computers, 23, 363–371.

30. Robert, J.-M. & Lesage, A. (2010). From usability to user experience with interactive systems. In: G. A. Boy (ed.), Handbook of Human–Machine Interaction. UK: Ashgate.

31. Tamminen, S., Oulasvirta, A., Toiskallio, K., & Kankainen, A. (2004). Understanding mobile contexts. Personal and Ubiquitous Computing, 8, 135–143.

32. Thompson, M. M., Zanna, M. P., & Griffin, D. W. (1995). Let’s not be indifferent about (attitudinal) ambivalence. In: R. E. Petty & J. A. Krosnick (eds.), Attitude Strength: Antecedents and Consequences, 4, 361–386.

33. Thüring, M. & Mahlke, S. (2007). Usability, aesthetics and emotions in human–technology interaction. International Journal of Psychology, 42, 253–264.

34. Tractinsky, N., Katz, A. S., & Ikar, D. (2000). What is beautiful is usable. Interacting with Computers, 13, 127–145.

35. Tullis, T. & Albert, W. (2008). Measuring the User Experience: Collecting, Analyzing and Presenting Usability Metrics. Massachusetts: Morgan Kaufmann.

36. van Schaik, P. & Ling, J. (2008). Modelling user experience with web sites: Usability, hedonic value, beauty and goodness. Interacting with Computers, 20, 419–443.

37. Zhang, P. & von Dran, G. (2000). Satisfiers and dissatisfiers: A two-factor model for Website design and evaluation. JASIST, 51, 1253–1268.