A Security Assessment of Tiles: A New Portfolio-Based Graphical Authentication System

Abstract In this paper we propose Tiles, a graphical authentication system in which users are assigned a target image and subsequently asked to select segments of that image. In two laboratory-based studies, we assess the extent to which this system protects against two security threats: observation attacks and the sharing of authentication credentials. We note some of the vulnerabilities of the new system but provide evidence that automated manipulation of the similarity of the decoy images can help mitigate the threat from verbal sharing and observation attacks.

Keywords Graphical authentication systems; graphical passwords; authentication; usable security; shoulder surfing.

ACM Classification Keywords K.6.5 [Security and Protection]: Authentication.

James Nicholson, PaCT Lab, Northumbria University, Northumberland Building, Newcastle upon Tyne, UK, [email protected]

Paul Dunphy, Culture Lab, Newcastle University, Space 2, Kings Road, Newcastle upon Tyne, UK, [email protected]

Lynne Coventry, PaCT Lab, Northumbria University, Northumberland Building, Newcastle upon Tyne, UK, [email protected]

Pam Briggs, PaCT Lab, Northumbria University, Northumberland Building, Newcastle upon Tyne, UK, [email protected]

Patrick Olivier, Culture Lab, Newcastle University, Space 2, Kings Road, Newcastle upon Tyne, UK, [email protected]

Copyright is held by the author/owner(s).

CHI’12, May 5–10, 2012, Austin, Texas, USA.

ACM 978-1-4503-1016-1/12/05.

Introduction The problem of password or PIN overload is becoming severe, given that the average user has approximately 25 accounts that require a password [5]. The resulting cognitive overload can lead to insecure behaviours such as users writing pass-codes down or re-using the same code [6]. Graphical authentication systems have been proposed as alternatives to alphanumeric passwords or PINs and have advantages because of their reliance on recognition or cued-recall processes that exert less cognitive demand on users [2], however such systems can be vulnerable to observation attacks [7,10]. In this paper we introduce a new recognition-based image portfolio system called Tiles and assess its vulnerability to two attacks, but also show that systematic manipulations of the Tiles images can improve its security credentials.

In Tiles, the user is assigned a single base image for any one account and is then presented with a login comprising four sets of nine tiles each (in a 3x3 grid). Each set contains one tile from the target image and eight different decoy tiles. While Tiles intuitively helps alleviate the problem of remembering four semantically different images, it creates two interesting security vulnerabilities, which we focus on in this paper. Firstly, given a description of a base image, how difficult is it to identify its corresponding tiles? Secondly, given one tile, how difficult is it to identify other tiles from the same image? These two questions are analogous to sharing via description [4] and observation attacks respectively. This paper’s contribution is to (i) introduce a new portfolio-based graphical authentication system called Tiles with the aim of improving usability, (ii) empirically evaluate the extent to which the system is vulnerable to attack, and (iii) understand how manipulating the similarity of decoys might improve the security of the system.

Related Work Graphical authentication systems generally require fewer cognitive resources and are memorable over extended periods of time [1,9]. However, a potential paradox of any graphical system is that the attacker may benefit from the very thing that makes it more usable: the relative ease of processing and recalling visual information. In other words, graphical systems can be vulnerable to observation attacks [9], and while some specific systems have been developed to limit such attacks [10], little is known about the way such vulnerabilities affect security.

Tiles The Tiles system introduced here is simple to understand and has multiple potential benefits over traditional graphical systems.

Figure 1: Tiles – Image segments are the stimuli.

Firstly, the user is assigned a base image. At login the user is shown an n x n grid of tiles on each of s screens; one tile on each screen is taken from the user's base image. To log in successfully, the user must identify, on every grid, the tile that comes from their base image. The system operates in a portfolio configuration, i.e. the target images exposed in a login challenge are randomly selected from a larger portfolio at each login. For the purposes of this evaluation we chose a 3x3 grid, although in practice this could be larger according to security requirements (see Figure 1). This protocol potentially has a number of benefits: (i) Target images are related: the cognitive load of a portfolio-based system is reduced; (ii) Scalability: target images are drawn from a single base image, so for a single account a user must remember only a single image; (iii) Description: as tiles are likely to have little semantic meaning, attacks based on tile descriptions will be difficult; (iv) Observation attacks: it is likely to be more difficult to make a spontaneous semantic association with tiles than with semantically meaningful images.

Threat Model We assume an offline brute force attack to be trivial, and that online guessing attacks are resisted by a three-strikes mechanism. Such a mechanism is most suited to local authentication, where authentication does not occur across a network. The probability of a successful random guess under our chosen configuration (four screens of 3x3 grids) is 1/9^4, i.e. 1 in 6,561. There is no requirement that the base image be stored secretly; however, in the worst case an attacker can steal a copy of an image known to be the base image to facilitate arbitrary future logins. Ideally the images (targets and decoys) are not drawn from a personal collection, to offset the threat of guessing attacks from those with knowledge of the user's photograph collection. We assume an intersection attack [3] is not possible due to strategic selection of the sizes of the target image and decoy image portfolios. Tiles presents users with 4 screens of 3x3 grids. In this context 8 images are needed to serve as decoys and 1 as the key image. All 9 images contribute 4 random tiles to the login challenge, one per grid. The probability of any segment being chosen from any image is static at 1/(9 choose 4).
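The challenge construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `build_challenge` and `random_guess_probability` are hypothetical, each image is assumed to be cut into nine tiles, and the four tiles an image contributes are assumed to be distinct.

```python
import random
from fractions import Fraction


def build_challenge(key_image, decoy_images, screens=4, tiles_per_image=9, rng=None):
    """Build one login challenge.

    Returns a list of `screens` grids. Each grid is a shuffled list of
    (image_id, tile_index) pairs: one tile from the key image and one tile
    from every decoy image.
    """
    rng = rng or random.Random()
    images = [key_image] + list(decoy_images)
    # Assumption: every image contributes `screens` distinct tiles, one per grid.
    picks = {img: rng.sample(range(tiles_per_image), screens) for img in images}
    grids = []
    for s in range(screens):
        grid = [(img, picks[img][s]) for img in images]
        rng.shuffle(grid)  # hide the position of the key tile
        grids.append(grid)
    return grids


def random_guess_probability(grid_size=3, screens=4):
    """Chance of picking the key tile on every screen purely by guessing."""
    return Fraction(1, (grid_size * grid_size) ** screens)


if __name__ == "__main__":
    decoys = [f"decoy{i}" for i in range(8)]
    for grid in build_challenge("key", decoys, rng=random.Random(1)):
        print(grid)
    print(random_guess_probability())  # 1/6561
```

With a 3x3 grid and four screens, `random_guess_probability()` reproduces the 1/9^4 figure quoted in the threat model.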

User Studies As noted earlier, a tension exists between security and usability – generally speaking the more secure a system is made the less usable it becomes. During recent years, many authentication systems have focused on measuring usability and made assumptions regarding human-centred threats. We believe there is also an opportunity to empirically evaluate these vulnerabilities as well as usability.

We firstly addressed the question of whether a description of a base image would be a sufficient cue to allow entry into the Tiles system (Study A). The ability to provide descriptions can facilitate consensual or non-consensual password sharing, and while security experts have not explicitly set out to prevent users from sharing passwords, we believed it would be useful to quantify the extent to which graphical systems can be easily shared verbally.

More crucially, we also wished to know whether, given knowledge of a single tile, a user could identify other, related tiles from the same base image (Study B). This question explicitly addresses the system's vulnerability to observation attacks (i.e. shoulder surfing). Finally, we asked how adjusting the similarity of the decoy images might affect ease of attack, with previous research suggesting that hand-selected similar decoys could improve resistance to guessing attacks [8]. In this study we aim to explore the ways in which image similarity could affect system security.

Method Both studies A and B consisted of a computer-based task with a repeated measures design. The independent variable was the similarity of the decoy images to the target image (similar, medium, dissimilar), while the dependent variable was the number of successful attacks given a cue: either a textual description (Study A) or a lone tile (Study B). We recruited 60 people, the majority of whom were first-year undergraduate psychology students (mean age 20). Half the participants took part in Study A and half in Study B. Participants were not paid for their participation.

A collection of 1000 images was obtained from a publicly available image database intended for image processing work (http://wang.ist.psu.edu/docs/related.shtml). From these we selected nine at random to form the base images used in both studies. For each base image we constructed a grid by randomly selecting one of its tiles and then selecting a tile from each of eight decoy images according to the similarity required (similar, medium, dissimilar). The algorithm compared image signatures in the form of 3D image histograms (in the CIELAB colour space) using the Earth Mover's Distance and produced a list ranking images from most similar to least similar. The top eight most similar images (as ranked by the algorithm) were used as decoys for the similar grids, while the bottom eight images were used for dissimilar grids. For the medium grids, the eight images in the middle of the set were used. The positions chosen from each base image were random and each decoy image could only be used once. Images were displayed on a tablet device for display consistency. For Study A, image descriptions were collected in advance by asking 11 independent participants to take no longer than 60 seconds to describe each image, as if describing it to a friend or relative.
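The decoy selection step can be illustrated with a short Python sketch. It assumes the distance from each candidate image to the base image has already been computed (for example with the Earth Mover's Distance over CIELAB histograms) and shows only the ranking and the choice of the top, middle and bottom eight. The function name `pick_decoys` and the exact definition of "middle" are assumptions made for illustration.

```python
def pick_decoys(distances, level, k=8):
    """Choose k decoy images for one grid.

    distances: dict mapping image id -> distance to the base image
               (smaller distance means more similar).
    level:     "similar", "medium" or "dissimilar".
    """
    ranked = sorted(distances, key=distances.get)  # most similar first
    if len(ranked) < k:
        raise ValueError("not enough candidate images")
    if level == "similar":
        return ranked[:k]
    if level == "dissimilar":
        return ranked[-k:]
    if level == "medium":
        # Assumption: "middle of the set" means the k images centred in the ranking.
        start = (len(ranked) - k) // 2
        return ranked[start:start + k]
    raise ValueError(f"unknown similarity level: {level!r}")


if __name__ == "__main__":
    # Toy distances for 20 candidate images; real values would come from the
    # histogram comparison.
    toy = {f"img{i:02d}": float(i) for i in range(1, 21)}
    for level in ("similar", "medium", "dissimilar"):
        print(level, pick_decoys(toy, level))
```

The rule that each decoy image may be used only once would need extra bookkeeping across grids, which this sketch leaves out.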

Participants were presented with a 3x3 grid of 9 tiles on a tablet computer. They were asked to select the tile they believed matched the provided cue. Participants were provided with either a description of the base image on a sheet of paper (Study A) or a tile from the base image on screen next to the grid (Study B) (See Figure 2). Participants were tested with 9 grids – three in each of the three configurations (similar, medium, dissimilar) which were randomly ordered.

Figure 2: Example grid with both cues: A. description of the base image and B. image segment.

Results Participants were scored on the number of successful selections they made for each of the grid compositions (a maximum of three in each of the three conditions).

The data were analysed using a Wilcoxon signed-rank test (two-tailed). Table 1 summarizes the success rates for every image in each grid composition type. The total success rate for the overall grid type is also shown. The analysis for description attacks found that target images with similar decoys were significantly harder to identify from descriptions than targets amongst medium decoys (z=-3.120, p=.002, r=-.33) and dissimilar decoys (z=-3.497, p<.001, r=-.37). There was no significant difference between medium and dissimilar grid compositions.

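Similarity | Image | Description attack: success rate | Description attack: overall success rate | Observer attack: success rate | Observer attack: overall success rate
Similar | 1 | 93% | 78%* | 57% | 29%*
Similar | 2 | 60% | | 7% |
Similar | 3 | 80% | | 23% |
Medium | 4 | 93% | 94% | 97% | 89%
Medium | 5 | 93% | | 90% |
Medium | 6 | 97% | | 80% |
Dissimilar | 7 | 93% | 97% | 90% | 91%
Dissimilar | 8 | 97% | | 97% |
Dissimilar | 9 | 100% | | 87% |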

Table 1: Success rates to correctly identify tiles in each of three grid compositions, for the three levels of image similarity. Percentage obtained by dividing correct selections by total number of possible selections. (*Similar grids significantly more difficult (p<.05) than medium and dissimilar grids).
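As a worked check of the arithmetic behind Table 1, the following Python sketch recomputes the overall success rates from the per-image rates. It assumes that each of the 30 participants in a study attempted every image exactly once, so that each per-image percentage is a count out of 30. That reading is inferred from the Method section rather than stated outright, and the helper names are illustrative.

```python
from fractions import Fraction

PARTICIPANTS = 30  # assumption: 30 attempts per image (half of the 60 recruits per study)

# Per-image success rates (%) transcribed from Table 1.
TABLE_1 = {
    "description": {
        "similar": [93, 60, 80],
        "medium": [93, 93, 97],
        "dissimilar": [93, 97, 100],
    },
    "observer": {
        "similar": [57, 7, 23],
        "medium": [97, 90, 80],
        "dissimilar": [90, 97, 87],
    },
}


def correct_count(percent, n=PARTICIPANTS):
    """Recover the whole number of correct selections behind a rounded percentage."""
    return round(percent * n / 100)


def overall_rate(percents, n=PARTICIPANTS):
    """Correct selections divided by possible selections, as a rounded percentage."""
    total_correct = sum(correct_count(p, n) for p in percents)
    return round(Fraction(100 * total_correct, n * len(percents)))


if __name__ == "__main__":
    for attack, levels in TABLE_1.items():
        for level, percents in levels.items():
            print(attack, level, overall_rate(percents))
```

Under that assumption the recomputed overall rates agree with the values reported in the table for all six cells.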

The analysis for observational attacks found that target images with similar decoys were significantly harder to identify with a segment than targets amongst medium decoys (z=-4.679, p<.001, r=-.49) and dissimilar decoys (z=-4.544, p<.001, r=-.48). There was no significant difference between the medium and dissimilar grid compositions.
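The paper does not say how the effect sizes r were derived. A common convention for the Wilcoxon signed-rank test is r = z / sqrt(N), where N is the number of observations. The short Python check below shows that the reported r values are numerically consistent with that formula when N = 90; that value of N is inferred here from the numbers and is not stated by the authors.

```python
import math


def effect_size_r(z, n_observations):
    """Effect size for a z statistic under the r = z / sqrt(N) convention."""
    return z / math.sqrt(n_observations)


# (z, reported r) pairs quoted in the Results section.
REPORTED = [
    (-3.120, -0.33),  # description attack: similar vs medium
    (-3.497, -0.37),  # description attack: similar vs dissimilar
    (-4.679, -0.49),  # observer attack: similar vs medium
    (-4.544, -0.48),  # observer attack: similar vs dissimilar
]

N_ASSUMED = 90  # assumption: inferred from the reported values, not given in the paper

if __name__ == "__main__":
    for z, reported in REPORTED:
        computed = effect_size_r(z, N_ASSUMED)
        print(f"z={z:.3f}  computed r={computed:.3f}  reported r={reported:.2f}")
```

By the usual rule of thumb, the values for the description attack fall in the medium range and those for the observer attack approach the conventional threshold for a large effect (0.5).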

Similarity is important, but base image selection can also affect system vulnerability. Login 2 (similar composition) had by far the lowest success rate of any login in both studies (see Figure 3). The variance in success rates amongst the base images in the similar grid category suggests that the selection of the base images can play a big part in reducing a system’s vulnerability to attack.

Figure 3: The beach scene (Image 2) was the image least vulnerable to attack in both studies.

Discussion The first point to be made is that this methodology (which we are calling ‘attacker analysis’) has clearly shown up security problems with the system, although manipulating image similarity has helped to ameliorate them. Hence, we would only recommend similar grid compositions for use with this system. That said, the system vulnerabilities are still significant, which means that the system is not yet ready to proceed to usability testing. Note, however, that we have presented here a ‘worst case’ scenario: we allowed our participants (attackers) as much time as they needed to determine which tiles were the targets and which the decoys, and we made both the cue and the tile grid available simultaneously. In a real-world scenario, any attacker is likely to be working under more difficult conditions. Nonetheless, by carrying out this analysis prior to more time-consuming usability testing, we were able to identify a key security problem early in the design process and address it.

Future Work We have identified that image similarity can be used to impact aspects of security. The next step is to increase the difficulty for attackers by disguising the key tiles – possibly by drawing decoys from similar base images rather than from the whole image database. Once the security of the system has been increased, the next step is to validate the usability of Tiles over multiple weeks using similar grids. The system will be tested with specific populations including older adults, who may benefit from the reduction in cognitive load from remembering one base image rather than four or more semantically diverse images.

Conclusion In this paper we have simulated two security threats to our proposed portfolio-based graphical authentication system, Tiles: description and observation attacks. Strategic and automated selection of decoy images significantly affected attacker success in both tasks. This provides preliminary evidence that decoy similarity can be systematically manipulated for purposes of security. Now that we have identified the optimal security makeup, the next step is to test the usability of the system over short and long intervals.

References [1] Angeli, A. De, Coutts, M., Coventry, L., Johnson, G., Cameron, D., and Fischer, M. VIP: a visual approach to user authentication. In Proc. of AVI (2002), 316-323.

[2] Baddeley, A. Human Memory: Theory and Practice. Psychology Press, 1997.

[3] Dunphy, P., Heiner, A., and Asokan, N. A closer look at recognition-based graphical passwords on mobile devices. In Proc. of SOUPS (2010).

[4] Dunphy, P., Nicholson, J., and Olivier, P.L. Securing Passfaces for Description. In Proc. of SOUPS (2008), 24-35.

[5] Florencio, D. and Herley, C. A large-scale study of web password habits. In Proc. of WWW (2007), 657-666.

[6] Sasse, M.A., Brostoff, S., and Weirich, D. Transforming the ‘weakest link’ — a human/computer interaction approach to usable and effective security. BT Technology Journal 19, 3 (2001), 122-131.

[7] Tari, F., Ozok, A., and Holden, S.H. A comparison of perceived and real shoulder-surfing risks between alphanumeric and graphical passwords. In Proc. of SOUPS (2006), 56–66.

[8] Tullis, T.S. and Tedesco, D.P. Using personal photos as pictorial passwords. In Proc. of CHI (2005), 1841–1844.

[9] Tullis, T.S., Tedesco, D.P., and McCaffrey, K.E. Can users remember their pictorial passwords six years later? In Proc. of CHI (2011), 1789–1794.

[10] Wiedenbeck, S., Waters, J., Sobrado, L., and Birget, J. Design and evaluation of a shoulder-surfing resistant graphical password scheme. In Proc. of AVI (2006), 177-184.