
ARENBERG DOCTORAL SCHOOL
Faculty of Engineering Science

Optimising privacy-preserving computations

Charlotte Bonte

Dissertation presented in partial fulfillment of the requirements for the

degree of Doctor of Engineering Science (PhD): Electrical Engineering

June 2021

Supervisors:
Prof. dr. ir. Bart Preneel
Prof. dr. ir. Frederik Vercauteren

Optimising privacy-preserving computations

Charlotte BONTE

Examination committee:
em. Prof. dr. ir. Jean-Pierre Celis, chair
Prof. dr. ir. Bart Preneel, supervisor
Prof. dr. ir. Frederik Vercauteren, supervisor
Prof. dr. Nigel P. Smart
Prof. dr. ir. Frank Piessens
Prof. dr. ir. Luc Van Eycken
Dr. Wouter Castryck
Prof. dr. Jean-Sébastien Coron (University of Luxembourg)

Dr. Joppe Bos (NXP Belgium)

Dr. Rafael Misoczki (Google, USA)

Dissertation presented in partial fulfillment of the requirements for the degree of Doctor of Engineering Science (PhD): Electrical Engineering

June 2021

© 2021 KU Leuven – Faculty of Engineering Science
Uitgegeven in eigen beheer, Charlotte Bonte, Kasteelpark Arenberg 10 box 2452, B-3001 Leuven (Belgium)

Alle rechten voorbehouden. Niets uit deze uitgave mag worden vermenigvuldigd en/of openbaar gemaakt worden door middel van druk, fotokopie, microfilm, elektronisch of op welke andere wijze ook zonder voorafgaande schriftelijke toestemming van de uitgever.

All rights reserved. No part of the publication may be reproduced in any form by print, photoprint, microfilm, electronic or any other means without written permission from the publisher.


Dedication and acknowledgements

“There is and there was nothing like you and there shall be nothing like you.”

–Yogi tea

I want to dedicate this work to the following people, to whom I am sincerely thankful as without them this work would not exist:

To Frederik Vercauteren and Bart Preneel for giving me the opportunity to do research in COSIC. Your passion for research and dedication in teaching others what you know will always be a source of inspiration for me.

To Nigel Smart and once more to Fre for making time for all my questions and organising and leading the public key and coed group meetings with an unstoppable enthusiasm, which resulted not only in high quality research but also in lifelong friends.

To my supervisors and jury for guiding me and providing valuable feedback on my research and this manuscript.

To my coauthors and colleagues, in particular Carl and Ilia, with whom I spent a lot of time figuring out how things worked and who challenged our work over and over again.

To all of COSIC and especially my office mates Eleftheria and Ward for generating an amazing working environment and a great atmosphere that turned the office into a real home for me.

To Martine Van Gastel for teaching me that failure can be a source of inspiration and that perseverance and hard work can result in big achievements.

To my family and friends for supporting me, for all the distracting activities they arranged to take my mind off my work and for putting up with me trying to explain time and again what I am working on.

To Ward Darquennes in particular for calming me down in stressful times and pushing me every day to be a better version of myself while appreciating me for who I am.

I would also like to thank the European Commission, through the H2020 ICT programme and an ERC Advanced Grant, for funding my research over the years.

Abstract

Data has never been as valuable as it is today. Moreover, the modern cloud infrastructure generated a shift from using in-house computational resources to using powerful external commercial tools to achieve one's goals. The combination of the value of data and the shift of moving data to the cloud raises serious privacy concerns. As people start seeing the risks related to the cloud, but at the same time do not want to lose the benefits it brings, a lot of effort has gone into developing efficient techniques to enable privacy-preserving computation.

Two well-established techniques for privacy-preserving computation are discussed in this thesis: homomorphic encryption (HE) and multi-party computation (MPC). Homomorphic encryption enables a third party to perform computations on encrypted data and therefore enables one to outsource computations on sensitive data to an untrusted party without compromising the privacy of the data. Multi-party computation allows several mutually distrusting parties to jointly compute a function over their inputs without revealing those inputs to other parties.

A tremendous amount of research has gradually increased the performance of both HE schemes and MPC protocols, and the work in this thesis falls under this line of research. The applicability of these privacy-preserving computation techniques in contemporary application scenarios can be increased either by improving general operations, which can later be used as building blocks for more complex scenarios, or by designing a tailored solution for the specific problem at hand. This thesis contains work on boosting general operations, like data encoding and homomorphic equality computation, as well as customised solutions for genome-wide association studies, the training of a logistic regression model and the construction of a threshold signature scheme using MPC. As the results of this thesis show, HE, in contrast to MPC, is not yet operational for universal industrial deployment. Nonetheless, interest in privacy-preserving computations keeps increasing, which ensures that, at least for now, the search for practical privacy-preserving computation techniques will continue.


Beknopte samenvatting

Het verzamelen van gegevens is nooit zo waardevol geweest als nu. Wanneer we dit zien in het licht van de ontwikkeling van de cloudinfrastructuur, die een verschuiving veroorzaakt heeft van het werken met een persoonlijke computer naar het gebruiken van het procesvermogen van de cloud, groeit de bezorgdheid over gegevensbescherming. Hoewel de risico's gerelateerd aan het gebruik van de cloud duidelijk worden, wil men tegelijk de voordelen van de cloud niet opgeven. Daarom wordt er uitgebreid gewerkt aan de ontwikkeling van gegevensbeschermende rekentechnieken.

Twee technieken die ondertussen goed gekend zijn in de onderzoeksgemeenschap worden besproken in deze thesis, meer bepaald homomorfe encryptie en een beveiligde berekening met meerdere partijen. Homomorfe encryptie maakt het mogelijk voor een externe partij om berekeningen uit te voeren op versleutelde data; dit stelt een klant in staat om berekeningen op private data uit te besteden aan een gewantrouwde derde partij zonder onderliggende informatie van de invoergegevens vrij te geven. Een beveiligde berekening met meerdere partijen laat toe dat verschillende partijen die elkaar niet vertrouwen samen een berekening uitvoeren op de verzameling van hun invoergegevens zonder deze invoergegevens bekend te maken aan de andere partijen.

Een enorme hoeveelheid research is gedaan om de efficiëntie van homomorfe berekeningen en beveiligde berekeningen met meerdere partijen te verbeteren, waaronder ook het onderzoek van deze thesis. Onderzoek om de toepasbaarheid van deze gegevensbeschermende rekentechnieken uit te breiden verloopt via het verbeteren van algemene rekentechnieken die later als bouwstenen gebruikt kunnen worden voor het opbouwen van meer complexe toepassingen of door het ontwikkelen van op maat gemaakte oplossingen voor specifieke scenario's. Deze thesis bevat onderzoek naar het verbeteren van algemene technieken voor het omvormen van de invoergegevens naar een formaat dat meer aansluit bij homomorfe encryptie en het ontwikkelen van een gerandomiseerde vergelijking; alsook de ontwikkeling van gespecialiseerde oplossingen voor het uitvoeren



van gegevensbeschermende genoomwijde associatie-studies, het trainen van een model voor logistieke regressie en de constructie van ondertekeningsmechanismen gebaseerd op een minimaal aantal deelnemers. Het onderzoek in deze thesis maakt duidelijk dat homomorfe encryptie, in tegenstelling tot de technieken voor beveiligde berekeningen met meerdere partijen, nog niet algemeen inzetbaar is in de hedendaagse industriële toepassingen. Desondanks blijft de interesse in gegevensbeschermende berekeningen groeien en dus zal de zoektocht naar efficiëntere gegevensbeschermende rekentechnieken althans voorlopig verdergezet worden.

List of Abbreviations

AGCD approximate greatest common divisor
BGV Brakerski, Gentry and Vaikuntanathan
CRT Chinese remainder theorem
FE functional encryption
FFT fast Fourier transform
FHE fully homomorphic encryption
FV Fan and Vercauteren
GC garbled circuits
GSW Gentry, Sahai and Waters
GWAS genome-wide association studies
HE homomorphic encryption
HEAAN homomorphic encryption for arithmetic of approximate numbers
LSSS linear secret sharing scheme
LWE learning with errors
MPC multi-party computation
MPS monotone span program
NTT number theoretic transform
OLE oblivious linear function evaluation
OT oblivious transfer
PIR private information retrieval
RLWE ring learning with errors
RNS residue number system
SHE somewhat homomorphic encryption
SIMD single-instruction multiple-data
SIVP shortest independent vector problem
SVP shortest vector problem
TFHE torus fully homomorphic encryption
UC universal composition

Contents

Abstract

Beknopte samenvatting

List of Abbreviations

Contents

I Introduction

1 Introduction
  1.1 Cryptography and the cloud
  1.2 Homomorphic encryption
  1.3 Multi-party computation
  1.4 Contributions

2 Mathematical background and preliminaries
  2.1 Notation
  2.2 Algebraic number theory
  2.3 Lattices
  2.4 Learning with errors problem
  2.5 From access structures to secret sharing
  2.6 Concrete MPC techniques used in this thesis
  2.7 Doubly authenticated bits (daBits)

3 State of the art homomorphic encryption schemes
  3.1 BGV
  3.2 FV
  3.3 Optimisations
  3.4 HEAAN
  3.5 TFHE

4 Developing privacy-preserving applications
  4.1 Privacy-preserving computation
  4.2 Selecting the appropriate scheme
  4.3 Tuning the application
  4.4 Selecting the parameters
  4.5 Usability of homomorphic encryption

5 Conclusion and future research directions

Bibliography

II Publications

6 Faster homomorphic function evaluation using non-integral base encoding

7 Towards practical privacy-preserving genome-wide association study

8 Privacy-preserving logistic regression training

9 Homomorphic string search with constant multiplicative depth

10 Thresholdizing HashEdDSA: MPC to the Rescue

Curriculum vitae

Part I

Introduction

“Beware of false knowledge; it is more dangerous than ignorance.”

–George Bernard Shaw

Chapter 1

Introduction

1.1 Cryptography and the cloud

Originally, cryptography was used to keep information secret during transmission. Accordingly, its goal was to successfully hide the information exchanged between two parties. Encryption schemes were thus designed to keep data safe by transforming the original data, which we will refer to as plaintext, into another format, called ciphertext. The ciphertext appears somewhat random but still contains the original data and as such hides it. The process that transforms the plaintext into a ciphertext is called encryption. Since the ciphertext still contains the original data, the plaintext can be recovered again in reasonable time from the ciphertext in a process that is called decryption. An important requirement of any encryption scheme is therefore that every valid ciphertext should correctly decrypt to the encrypted plaintext. This property is called correctness. Decryption is such that it can only be performed by authorised parties, as it transforms the random-looking data back to meaningful data and the main goal of encryption is to keep this data hidden from others. The encryption and decryption processes require auxiliary information, referred to as keys, that needs to be generated in advance.
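As a minimal illustration of these notions (my own sketch, not taken from the thesis), the one-time pad below encrypts by XORing the plaintext with a random key of the same length; decryption applies the same operation, and the correctness property Dec(k, Enc(k, m)) = m holds by construction.

```python
import os

def keygen(length: int) -> bytes:
    # The key is a fresh uniformly random string as long as the message.
    return os.urandom(length)

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    # XOR every plaintext byte with the corresponding key byte.
    return bytes(p ^ k for p, k in zip(plaintext, key))

def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    # XOR is its own inverse, so decryption reuses the same operation.
    return bytes(c ^ k for c, k in zip(ciphertext, key))

message = b"attack at dawn"
key = keygen(len(message))
assert decrypt(key, encrypt(key, message)) == message   # correctness
```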

The first encryption schemes were all symmetric, which means the sender and receiver use the same key to encrypt and decrypt messages. This implies that both parties need to somehow establish knowledge of this key in advance. This problem was solved in the 1970s with the invention of public key cryptography. In a public key encryption scheme, the receiving party generates two keys that are mathematically related. That party keeps the secret decryption key to himself and shares the public encryption key with everyone else. By making the encryption key public, anyone can send a private message to the receiving party by encrypting the message with the corresponding public key. As only the owner of the secret decryption key should be able to decrypt messages, the public key should not reveal any information on the secret key. This ensures the message stays hidden from any adversary that does not know the secret decryption key. The first concrete public key constructions were introduced by Diffie and Hellman in 1976 [DH76], Rivest, Shamir and Adleman in 1978 [RSA78] and El Gamal in 1985 [ElG85].

Today everyone uses the cloud to store data, as it enables us to access our data on any portable device with an internet connection. In practice, the cloud consists of a large number of servers that permit users to set up a virtual infrastructure while removing the burden of having to maintain the underlying hardware and software. The emergence of this cloud infrastructure led to the generation of enormous amounts of outsourced data, which has proven very valuable. As it became clear how much information the cloud servers can obtain from these large amounts of data, privacy concerns have been raised. In addition, cloud services provide more functionalities than just storage of the data: they allow users to process their data, like searching for specific information or performing computations on the data. Hence the combination of the privacy concerns and the ability of the cloud to compute on data introduced interesting new problems, as cryptography now not only needs to protect data in storage, but also needs to be able to perform computations while maintaining the privacy of the data. If one simply wants to protect the data stored on the cloud servers, the standard symmetric or public key encryption schemes can ensure the privacy of the user. Unfortunately, these schemes come with the serious limitation that any operation on the encrypted data can only be achieved by first decrypting the data and then performing the operation. This means that if a client wants to compute on private data, he needs to download everything, decrypt it, perform the necessary operations himself, re-encrypt the result and store it again on the cloud. The client's only alternative is to trust the cloud with his private data and provide it with the decryption key, so the cloud can perform the required operations on the decrypted data. Preferably, one would have a solution that allows a third party to perform computations on private data without revealing any information about it. Hence the modern cloud environment calls for more powerful tools, and the traditional cryptographic requirement of protecting data in transit and storage has been extended to a requirement of being able to compute on private data. These developments have started the search for techniques that allow us to delegate processing of our private data without giving access to it.

This property of being able to compute on encrypted data was first mentioned by Rivest, Adleman and Dertouzos in [RAD78] and introduced under the name privacy homomorphism. Nowadays, encryption schemes with this property are called homomorphic encryption (HE) schemes. One of the examples of that time supporting the idea of computing on encrypted data is the RSA scheme [RSA78], which allows one to compute a ciphertext encrypting the product of two messages by multiplying two RSA ciphertexts. Later, more schemes able to perform computations on encrypted data by operating on the ciphertexts became known. The first homomorphic encryption schemes were, however, only able to perform one type of operation, which limited their functionality. In 2005, Boneh, Goh and Nissim constructed the first scheme possessing two homomorphic properties [BGN05]. This scheme can perform both homomorphic addition and homomorphic multiplication. Nevertheless, their scheme was restricted in the sense that it only allowed one multiplication. Even though the concept of homomorphic encryption and its potential were already described in 1978, the next 30 years of research resulted only in schemes which allowed restricted computations, meaning either the homomorphic scheme allowed only one operation or there were multiple operations possible but the scheme could only perform a limited number of computations. Only in 2009 did Gentry propose the first scheme able to evaluate any function on encrypted data. Although impractical, this scheme proposed by Gentry [Gen09] demonstrated that the construction of an encryption scheme allowing one to compute any function on encrypted data is possible. This milestone started the quest to find a fully homomorphic encryption scheme that is efficient enough to solve real-life problems.
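As a hedged illustration of the multiplicative property of RSA mentioned above, the toy textbook RSA example below (tiny, insecure parameters chosen purely for readability) shows that the product of two ciphertexts decrypts to the product of the two plaintexts modulo n.

```python
# Textbook RSA with toy parameters (insecure; for illustration only).
p, q = 61, 53
n = p * q                           # 3233
e = 17                              # public exponent, coprime to (p-1)*(q-1)
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

m1, m2 = 7, 11
c1, c2 = enc(m1), enc(m2)
# Multiplying ciphertexts multiplies the underlying plaintexts (mod n).
assert dec((c1 * c2) % n) == (m1 * m2) % n
```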

As homomorphic encryption schemes can perform operations on encrypted data without the need for intermediate decryption, they form a solution for a wide range of applications. Using a homomorphic encryption scheme to encrypt the data a user stores on a server enables the user to later ask the server to compute on his data without giving the server access to his data, as shown in Figure 1.1.

Figure 1.1: The client-server setup of homomorphic encryption.


In theory, the user can even ask the server to perform an operation while hiding from it what kind of operation he wants to perform. As homomorphic schemes prevent the server from obtaining any information from the encrypted data and, if needed, can also hide the performed operations, they form a solution to the privacy concerns in the cloud computing scenario by enabling the user to take advantage of the facilities and computational power of the cloud without having to sacrifice the privacy of his data. However, homomorphic encryption is not the only solution that protects data in use. Two alternative methods have been developed: secure multi-party computation and functional encryption. Even though these three methods all enable privacy-preserving computations, they each have their own properties and handle different scenarios.

Multi-party computation (MPC) allows computations on private data held by different parties in a way that no party can learn more than his own input data, the result of the computation and what can be inferred from this result and his own inputs. To enable MPC, at least two parties need to be involved in the protocol; the setup for six parties is shown in Figure 1.2.

Figure 1.2: The setup of a multi-party computation between six parties.

The input data is secret shared or masked and the results, if revealed, are usually revealed to all parties. MPC protocols are required to be correct and secure. The correctness implies that the output of the protocol needs to equal the result of the computation on the initial inputs. Defining the security of a protocol is more difficult, as there are different security models to choose from. Clearly, MPC also allows computations on private data; however, it is different from homomorphic encryption. In MPC, the data is not protected by encryption but by distributing the data over multiple parties such that all the shares look random. The mechanisms to distribute the data are in addition constructed to require the collaboration of multiple parties in order to perform computations.
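To make the idea of random-looking shares concrete, here is a minimal additive secret-sharing sketch (my own illustration, not a protocol from the thesis): each secret is split into n shares that sum to the secret modulo a public prime, any n − 1 shares are uniformly random, and additions can be performed locally on the shares.

```python
import random

P = 2**61 - 1   # a public prime modulus; all arithmetic is done mod P

def share(secret, n=3):
    # n-1 shares are uniformly random; the last one makes the sum equal the secret.
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

x_shares = share(25)
y_shares = share(17)
# Each party adds its own shares locally; no communication is needed for addition.
sum_shares = [(a + b) % P for a, b in zip(x_shares, y_shares)]
assert reconstruct(sum_shares) == 42
```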

A third option to perform computations on private data is functional encryption (FE). Functional encryption is a public key construction between at least three parties. One party, called the key owner, creates a secret and a public key. The public key will be shared with one or multiple data providers. The secret key will be used to derive decryption keys that allow the evaluation of a fixed function during decryption of the data. This function-specific decryption key will then be shared with the party that wants to perform a specific computation on the data. This party will be called the requester. The setup is shown in Figure 1.3.

Figure 1.3: The setup for functional encryption.

Computing a function with functional encryption is carried out in the following steps. First, the requester sends a request to the owner of the keys specifying the function he wants to compute. The key owner uses this information and the secret key to compute a decryption key that allows the evaluation of that specific function and provides this functional decryption key to the requester. The key owner then sends the corresponding public key to the data provider. The data provider uses this public key to encrypt their data and sends the encrypted data to the requester. After receiving the functional decryption key and the encrypted data, the requester uses this functional decryption key to decrypt the data. As the functional decryption key was constructed to evaluate a specific function, the requested function is evaluated simultaneously with the decryption algorithm, which implies that after running the decryption algorithm the requester ends up with the result of the computation in cleartext. The requester therefore learns the result of the computation on this specific data and nothing other than that. A practical scenario in which functional encryption would be beneficial is the construction of spam filters. In this scenario, an email provider can learn whether or not a specific email is spam without learning the content of the email. Functional encryption was properly formalised by Boneh, Sahai and Waters in 2011 [BSW11]. Research on functional encryption has led to multiple efficient constructions that allow one to evaluate linear and quadratic functions, but so far there are no known functional encryption schemes that allow one to efficiently evaluate general functions. Functional encryption is mentioned here to cover every technique allowing computations on private data known today, but we will not go into further detail as this technique is not part of the research done in this thesis.
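The sketch below is my own toy rendition of a DDH-style inner-product (linear) functional encryption scheme, in the spirit of the efficient linear constructions mentioned above; the group, parameters and function names are illustrative assumptions, the tiny modulus is insecure, and the final discrete logarithm is recovered by brute force, so it only works for small results.

```python
import random

# Toy DDH-style inner-product FE (illustration only; tiny insecure parameters).
q = 1_000_003            # prime modulus of the multiplicative group mod q
g = 5                    # fixed group element used as generator
ORDER = q - 1            # exponents can always be reduced mod q - 1

def setup(n):
    s = [random.randrange(ORDER) for _ in range(n)]    # master secret key
    mpk = [pow(g, si, q) for si in s]                  # h_i = g^{s_i}
    return s, mpk

def encrypt(mpk, x):
    r = random.randrange(1, ORDER)
    ct0 = pow(g, r, q)
    cts = [(pow(h, r, q) * pow(g, xi, q)) % q for h, xi in zip(mpk, x)]
    return ct0, cts

def keyder(s, y):
    # Functional decryption key for the linear function <., y>.
    return sum(si * yi for si, yi in zip(s, y)) % ORDER

def decrypt(sk_y, y, ct, bound=10_000):
    ct0, cts = ct
    num = 1
    for ci, yi in zip(cts, y):
        num = (num * pow(ci, yi, q)) % q
    val = (num * pow(pow(ct0, sk_y, q), -1, q)) % q    # equals g^{<x, y>}
    for v in range(bound):                             # brute-force small discrete log
        if pow(g, v, q) == val:
            return v
    raise ValueError("result out of range")

s, mpk = setup(3)
x, y = [2, 3, 5], [4, 1, 6]                            # data and linear function
sk_y = keyder(s, y)
assert decrypt(sk_y, y, encrypt(mpk, x)) == 41         # 2*4 + 3*1 + 5*6
```

Note how the requester holding sk_y learns only the inner product, matching the workflow described above.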

Table 1.1 shows a brief comparison of the main properties of the three above-mentioned techniques that protect private data during computations.

                                  MPC             HE              FE
Input                             masked/shared   encrypted       encrypted
Output                            cleartext       encrypted       cleartext
Interaction during computation    yes             no              no
Type of computation               any function    any function    fixed function

Table 1.1: Comparison between the three techniques for secure computation

1.2 Homomorphic encryption

Homomorphic encryption is a cryptographic technique that allows computations on encrypted data. It can therefore be seen as an encryption scheme augmented with an algorithm that allows computations on messages through operations on the ciphertexts. Whereas regular encryption schemes consist of three algorithms (a key generation algorithm, an encryption algorithm and a decryption algorithm), a homomorphic encryption scheme consists of four: the three previously mentioned ones and an evaluation algorithm used to compute with the ciphertexts. The security parameter which determines the parameters used in the scheme is denoted λ. We denote the key space for the secret key with Ksk and the key space for the public key with Kpk, the message space with M and the ciphertext space with C. We can represent the functionalities of the different algorithms of a public key¹ homomorphic encryption scheme as follows; a toy instantiation of this interface is sketched after the list.

• KeyGen(1^λ) = (sk, pk)
Given the security parameter λ, this algorithm generates the private key used during decryption and the public key used for encryption.

• Enc(pk, m) = c
Receiving as input the public key pk and a message m, the encryption algorithm generates a ciphertext c encrypting this message.

• Dec(sk, c) = m
On input the secret key sk and a valid ciphertext c, the decryption algorithm retrieves the underlying message m from the ciphertext c.

• Eval(φ, c1, . . . , ck) = cφ
On input a function expressing a computation on messages φ : M^k → M (with k an integer ≥ 1) and the k ciphertexts c1, . . . , ck, which are the encryptions of the messages m1, . . . , mk respectively, it computes a ciphertext cφ that corresponds to an encryption of φ(m1, . . . , mk).
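The following sketch (my own toy example, not code from the thesis) instantiates this four-algorithm interface with textbook Paillier encryption, which is additively homomorphic: Eval is restricted to addition and is realised by multiplying ciphertexts modulo n²; the function names and tiny primes are purely illustrative.

```python
import random
from math import gcd

def keygen(p=251, q=257):
    # Toy Paillier key generation with tiny fixed primes (insecure, illustration only).
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    g = n + 1                                      # standard choice of generator
    mu = pow(lam, -1, n)                           # valid because g = n + 1
    return (lam, mu), (n, g)                       # (sk, pk)

def enc(pk, m):
    n, g = pk
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def dec(sk, pk, c):
    lam, mu = sk
    n, _ = pk
    x = pow(c, lam, n * n)
    return (((x - 1) // n) * mu) % n

def eval_add(pk, c1, c2):
    # Homomorphic addition: multiplying ciphertexts adds the underlying plaintexts mod n.
    n, _ = pk
    return (c1 * c2) % (n * n)

sk, pk = keygen()
c1, c2 = enc(pk, 20), enc(pk, 22)
assert dec(sk, pk, eval_add(pk, c1, c2)) == 42
```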

Schematically, we can represent this as in Figure 1.4, as a diagram that is made commutative by the evaluation algorithm of homomorphic encryption.

Figure 1.4: A schematic representation of the homomorphic evaluation algorithm: encrypting the messages m1, . . . , mk with Enc_pk(.), applying Eval(.) to the ciphertexts c1, . . . , ck and decrypting the result with Dec_sk(.) yields φ(m1, . . . , mk), i.e. the diagram formed by φ(.), Enc_pk(.), Eval(.) and Dec_sk(.) commutes.

¹ For a symmetric encryption scheme, the key generation algorithm only generates a secret key sk, which is used both for encryption and decryption.


Homomorphic encryption schemes are usually grouped depending on the type and number of operations they support. Based on these properties, they are divided into three categories.

• Partially homomorphic encryption schemes
These schemes only allow one type of operation on the ciphertexts. Some famous examples of partially homomorphic encryption schemes are RSA [RSA78], El Gamal [ElG85] and Paillier [Pai99]. The RSA and El Gamal schemes only allow homomorphic multiplication of ciphertexts, whereas the Paillier cryptosystem only allows homomorphic addition. These schemes are interesting on their own, but their functionalities are limited.

• Somewhat homomorphic encryption schemes (SHE)
Homomorphic encryption schemes are called somewhat homomorphic encryption schemes when they enable homomorphic addition as well as homomorphic multiplication, but they are still restricted in the number of times these operations can be performed. A well-known example is the scheme of Boneh, Goh and Nissim [BGN05] that allows one to compute an unlimited number of additions but only one single multiplication. Hence the homomorphic properties of this scheme are limited to computing quadratic polynomials.

• Fully homomorphic encryption schemes (FHE)
This is the name for homomorphic encryption schemes that are capable of computing any function homomorphically. The first fully homomorphic encryption scheme was proposed by Gentry in 2009 [Gen09].

The scheme designed by Gentry works in two steps to support any arithmetic circuit. The first step consists of the construction of a homomorphic encryption scheme based on ideal lattices, which are constructed as described in Section 2.3. This scheme uses ciphertexts that contain a certain amount of noise to ensure security. However, for the decryption to remain correct, this noise needs to be kept under a fixed bound, called the decryption bound. As this noise grows with every homomorphic computation that is performed, the scheme only permits a limited number of operations. Otherwise the noise will exceed the decryption bound and decryption would no longer result in the correct message. To expand this to a scheme that allows arbitrary computations, Gentry introduces a new procedure called bootstrapping, which allows one to reduce the noise contained in a ciphertext. The bootstrapping procedure starts from a ciphertext for which the noise needs to be reduced and operates on this ciphertext by evaluating the decryption function homomorphically. In order to perform this operation, an encryption of the private decryption key under the public encryption key, used to generate the input ciphertexts, is needed. This is called the bootstrapping key. It is assumed that revealing the bootstrapping key leaks no information whatsoever about the secret key; this assumption is called the circular security assumption. The result of this bootstrapping procedure is a ciphertext that contains the same message as the input ciphertext, but as decryption removes all the noise from it, the noise level of the resulting ciphertext equals the noise introduced by the operations performed in the bootstrapping procedure. If the noise introduced by bootstrapping is small enough to allow at least one additional homomorphic operation, then the scheme is called bootstrappable, which means the addition of the bootstrapping algorithm transforms the somewhat homomorphic encryption scheme into a fully homomorphic encryption scheme. Bootstrapping is the most expensive algorithm of a homomorphic encryption scheme. Therefore it is often avoided in applications where the function to be evaluated is known in advance, as in this case one can determine parameters that allow evaluation of this function without causing a decryption error. We call such a scheme with predetermined parameters a levelled homomorphic encryption scheme.
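The toy noise-budget model below (my own abstraction, with made-up constants and no actual encryption) only tracks the bookkeeping described above: every homomorphic operation consumes noise budget, and bootstrapping resets the budget to a fixed level determined by the cost of homomorphically evaluating decryption.

```python
from dataclasses import dataclass

FRESH_BUDGET = 100      # noise budget of a freshly encrypted ciphertext (made-up unit)
BOOTSTRAP_COST = 40     # budget consumed by homomorphically evaluating decryption

@dataclass
class Ciphertext:
    value: int          # stand-in for the encrypted payload (no real encryption here)
    budget: int = FRESH_BUDGET

def hom_add(a: Ciphertext, b: Ciphertext) -> Ciphertext:
    # Addition increases the noise only slightly.
    return Ciphertext(a.value + b.value, min(a.budget, b.budget) - 2)

def hom_mul(a: Ciphertext, b: Ciphertext) -> Ciphertext:
    # Multiplication increases the noise much faster.
    return Ciphertext(a.value * b.value, min(a.budget, b.budget) - 30)

def decrypt(c: Ciphertext) -> int:
    if c.budget <= 0:
        raise ValueError("noise exceeded the decryption bound")
    return c.value

def bootstrap(c: Ciphertext) -> Ciphertext:
    # Refreshing resets the budget to what is left after the (expensive) homomorphic decryption.
    return Ciphertext(c.value, FRESH_BUDGET - BOOTSTRAP_COST)

c = Ciphertext(3)
for _ in range(3):
    c = hom_mul(c, c)       # three squarings leave a budget of 100 - 3*30 = 10
c = bootstrap(c)            # refresh before the budget runs out
assert decrypt(hom_mul(c, c)) == 3 ** 16
```

A levelled scheme corresponds to choosing FRESH_BUDGET large enough for the circuit at hand so that bootstrap is never needed.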

Although the scheme proposed by Gentry is theoretically correct, the parameters required by his scheme made it unusable in practice. Hence, after Gentry's theoretical breakthrough, the cryptographic community tried to develop more practical FHE schemes following his blueprint. One year later, the DGHV scheme of van Dijk et al. [vDGHV10] appeared. The scheme is homomorphic with regard to both addition and multiplication and uses only elementary modular arithmetic. Its security is based on the approximate greatest common divisor (AGCD) problem, which was first described by Howgrave-Graham in [HG01]. Like the original scheme from Gentry, the main drawback of this scheme is the quick noise growth. One multiplication results in a noise magnitude of roughly the square of the original noise. Hence these homomorphic schemes require large parameter sizes, even when evaluating small circuits that do not require bootstrapping. In 2011 [CMNT11], the DGHV scheme was improved by reducing the public key size and the first implementation of this scheme was created. However, the improvements on homomorphic encryption in general were not limited to improvements of the DGHV scheme. Different schemes were constructed, even schemes based on different hard problems, such as the Learning With Errors (LWE) problem, a hard problem introduced by Regev in 2005 [Reg05]. From this point on, research efforts were not only spent on improving the efficiency of one particular scheme but also on transferring the techniques which render one scheme more efficient to other schemes. One example of this is the key compression technique proposed by Coron et al. in 2012 [CNT12], which is based on an adaptation of the techniques from [BGV12], another homomorphic encryption scheme which will be described later. In [CCK+13] in 2013, it was shown how to process multiple messages at the same time in the DGHV context by packing them in a single ciphertext; this technique is known as batching. A generalisation of the DGHV scheme for non-binary messages was created in 2015 [NK15]. The DGHV scheme can be used both as a levelled homomorphic encryption scheme and as a fully homomorphic encryption scheme. In addition, several interesting variants of the DGHV scheme have been proposed. There is the scheme by Coron et al. [CLT14], which is a scale-invariant scheme over the integers with its security still based on the AGCD problem. Cheon and Stehlé proposed in [CS15] a new variant of the DGHV encryption scheme which relies on a new decision variant of the AGCD problem. The authors show by reduction that the new AGCD variant is as hard as the learning with errors problem. Benarroch, Brakerski and Lepoint proposed another scheme in 2017 [BBL17]. In this work, they adapt the techniques of the Gentry, Sahai and Waters scheme [GSW13] to the AGCD setting.

Research on homomorphic encryption has over the years led to several schemes based on different hard problems, as well as to different optimisation strategies. In an attempt to classify the different developments, the evolution of homomorphic encryption schemes is generally divided into three generations. The first generation schemes are those that have exponential noise growth, such as the original scheme of Gentry and the DGHV scheme. The second and third generation schemes rely on the LWE problem. Informally, the LWE problem consists of solving a system of noisy linear equations; a formal introduction of this hard problem can be found in [Reg05]. Many present-day homomorphic encryption schemes rely on LWE or one of its variants. The second generation schemes, such as the schemes by Brakerski-Vaikuntanathan [BV11] and Brakerski [Bra12], managed to reduce the noise growth after multiplication to a logarithmic increase in the polynomial degree of the homomorphic function instead of the linear increase of the first generation schemes. In a later stage, the cryptographic community switched to a variant of the LWE problem based on ideal lattices, called the ring learning with errors (RLWE) problem. This variant over polynomial rings significantly enhanced the performance of homomorphic encryption schemes, as the algebraic structure of the underlying hard problem reduces the key sizes and speeds up the homomorphic operations. Well-known schemes based on RLWE are the Brakerski-Gentry-Vaikuntanathan scheme [BGV12], the Fan-Vercauteren scheme [FV12] and the HEAAN scheme [CKKS17], also referred to as the CKKS scheme. Further research into these schemes led to many optimisations such as packing algorithms [SV14, BGH13, CKKS17] and various bootstrapping improvements [GHS12a, AP13, HS15, CHK+18, CH18, HS18]. These optimisations in the RLWE schemes result in homomorphic operations that only have a polylogarithmic overhead in the security parameter, as shown in [GHS12b]. To this day, this type of scheme is still considered to be the most efficient when arithmetic circuits need to be evaluated homomorphically.

The third generation schemes make use of matrix constructions over LWE or RLWE. Examples are the GSW scheme [GSW13] by Gentry, Sahai and Waters; the FHEW scheme [DM15] by Ducas and Micciancio and the TFHE scheme [CGGI16] by Chillotti et al. These schemes are mostly known due to their very efficient bootstrapping algorithms and their asymmetric noise bound for the multiplication, which makes them suitable for circuits with a high multiplicative depth. Some of these schemes even allow look-up table operations which enable homomorphic evaluation of non-polynomial functions. We will elaborate further on the homomorphic schemes and how they differ from each other in Chapter 3.

Applications

From the setup in Figure 1.1, it should be clear that the most obvious application of homomorphic encryption is outsourced computation. However, homomorphic encryption proves to be useful in many other scenarios and has therefore been called the Swiss army knife of cryptography. Below we list a few examples of the wide range of applications of homomorphic encryption. Many more exist and new applications might come up as research on HE progresses further.

• Circuit privacy: Just as homomorphic encryption preserves the privacy of the data, it can also be used to keep the function one wants to evaluate secret. This can be done by evaluation of universal circuits or by encrypted look-up tables. Keeping the function hidden can be valuable for companies, as the training and creation of models can be a costly process and hence they want to hide the resulting models from their competitors. Combining this with the increasing popularity of machine learning techniques, one can see that this property of HE could be very valuable in the future. In [BdMW16], Bourse et al. show circuit privacy of a variant of the GSW FHE for branching programs.

• Multi-party computation: Homomorphic encryption can be used to compute a function on input data which is distributed over different parties. Gentry's thesis [] already mentions a two-round passively secure FHE-based multiparty computation protocol in which the parties encrypt their inputs and broadcast these ciphertexts. Each party can then homomorphically compute the desired function and the final result is obtained through a distributed decryption. Later, Lopez-Alt et al. [LTV12] introduce multi-key FHE, which relies on similar ideas and enables homomorphic computation over data encrypted under different keys. In [MW16], Mukherjee and Wichs create a two-round multi-party protocol based on multi-key fully homomorphic encryption. In [BGG+18], Boneh et al. construct a threshold fully homomorphic encryption scheme enabling a group of parties bigger than the threshold to use the secret key to perform a certain functionality without reconstructing the key. Homomorphic encryption is not the only technique that allows multi-party computations; as mentioned before, there are many different techniques for MPC. Some of them even use homomorphic encryption to generate randomness later used to enable particular computations on the shares, such as for example [DPSZ12].

• Verifiable computation: This technique allows a client to outsource a computation and, in addition to the result, the client receives a proof that the computation was carried out correctly. In order to make the outsourcing useful, it is required that checking the correctness of the proof is computationally less demanding than performing the outsourced computation yourself. In [GGP10], Gennaro, Gentry and Parno construct a verifiable computation scheme based on Yao's garbled circuits and fully homomorphic encryption. In [LDPW14], Lai et al. propose verifiable homomorphic encryption, a technique that enables verifiable computation on outsourced encrypted data by combining a homomorphic encryption scheme with a homomorphic encryption authenticator based on the fully homomorphic MAC proposed by Gennaro and Wichs [GW13].

• Private information retrieval: Private information retrieval (PIR) schemes enable a user to retrieve information from a database while maintaining the privacy of the queries. This notion was introduced by Chor, Goldreich, Kushilevitz and Sudan [CGKS95]. This goal can also be achieved by using homomorphic encryption. In [DLSW19], Dams et al. construct a PIR protocol based on the Fan-Vercauteren HE scheme and report on its performance. In PIR constructions, there is always a trade-off between the communication costs and the cost of the computations on the server's side. In [GH19], Gentry and Halevi reduce the communication cost by constructing a compression with nearly optimal rate. A deeper study of the trade-off between communication and computation cost in PIR is done by Ali et al. in [ALP+19].

1.3 Multi-party computation

Multi-party computation (MPC) is a method to compute on private data held by different parties, intending to allow the different parties to learn only the result of the computation and whatever they can infer from combining this output and the party's own private input. As MPC allows a party to deduce information based on the received output and its own inputs, one has to think carefully about which functions should be permitted to be computed. If one uses MPC to compute the average of two private inputs owned by the parties involved in the protocol, then the outcome will always enable one party to deduce the other party's input: from the average a and its own input x, a party recovers the other input as 2a − x. In MPC, computations are performed through an interactive protocol between different parties. We call the parties that correctly follow the protocol the honest parties and the parties controlled by the adversary, which can possibly deviate from the protocol, the corrupt parties. The security model indicates the level of misbehaviour the protocol can endure without losing the correctness of the computation and/or the secrecy of the input values. Thus, where HE assumes all the operations are performed correctly, some MPC setups consider the scenario in which the protocol execution may be influenced by an external party or even a subset of participating parties, who aim to learn private information or even render the result of the computation incorrect.

Security models in MPC are constructed based on the expected behaviour of the protocol and assumptions on the power of the adversary. For the expected behaviour of the protocol, we have the following two common options.

• Robustness
This is the setting with guaranteed output delivery, which implies honest parties always obtain the correct output at the end of the protocol, independent of the adversarial behaviour of the corrupt parties.

• Security with abort
This setting has two options: either the correct output is received by the honest parties or the protocol is aborted due to an action of the adversary.

The three main parameters defining the power of the adversary are the allowed adversarial behaviour, the corruption strategy and the computational power of the adversary.

Allowed adversarial behaviour

The allowed adversarial behaviour defines what actions the corrupt parties can take. The two most common types of adversarial behaviour considered in MPC protocols are called passive and active security.

• Passive security
In this setting, the corrupt parties managed by the adversary have to follow the protocol specifications, but the adversary tries to use the internal state of the corrupt parties during the protocol execution to learn private information of the other parties. The passive security model is also called the honest-but-curious or semi-honest security model. This adversarial behaviour type corresponds with the security model of homomorphic encryption.


• Active security
In an actively secure model, the adversary is allowed to instruct the corrupt parties to deviate arbitrarily from the protocol specification. This security model is also called the malicious model and is considered the strongest security model as it considers a very powerful adversary.

Corruption strategy

The corruption strategy defines when and how the parties are corrupted. The two most common strategies considered in MPC are the static and adaptive corruption models.

• Static corruption model
In the static corruption model, the corrupted parties are determined at the beginning of the protocol and are then fixed for the whole protocol execution.

• Adaptive corruption model
An adaptive corruption model gives the adversary the capability to corrupt parties during the computation. Hence the adversary can choose who to corrupt and when to corrupt these parties based on its view of the execution of the protocol. Once a party is corrupted, it remains corrupted for the rest of the protocol.

Computational power of the adversary

Here we also consider two options: the adversary can be computationally unbounded or computationally bounded.

• Computationally unbounded
The adversary has unlimited computation power. The adversary can compute anything, regardless of how expensive the computation is.

• Computationally bounded
The computation power of the adversary is limited and parametrised by a computational security parameter.

The number of corrupted parties that a protocol tolerates is usually given through an access structure, which indicates which sets of parties can be corrupted, or by a fixed corruption threshold. Given a set of parties P, an access structure defines which subsets in 2^P are allowed to reconstruct the secret information based on the shares of each party in this subset and which subsets lack information to reconstruct the secret even if they gather all the shares of the parties in that subset. The subsets able to reconstruct the secret are called the qualified sets and the subsets unable to reconstruct the secret are called the unqualified sets. More details on access structures and how secret sharing schemes are created from them are given in Section 2.5. A threshold access structure is defined by two numbers (n, t) such that for an (n, t)-threshold access structure defined on n parties, any set of t or fewer parties is unqualified and any set of t + 1 or more parties is qualified. The common (n, n − 1)-threshold access structure is referred to as the full-threshold access structure and the (n, t)-threshold access structure for t ≤ ⌊(n − 1)/2⌋ is referred to as the honest majority access structure.
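A hedged sketch of Shamir secret sharing over a prime field, given below as my own illustration of an (n, t)-threshold access structure (parameters and names are illustrative): the secret is the constant term of a random degree-t polynomial, any t + 1 shares reconstruct it by Lagrange interpolation at zero, and any t shares form an unqualified set.

```python
import random

P = 2**61 - 1   # public prime modulus

def share(secret, n, t):
    # Random polynomial of degree t with the secret as constant term; share i is f(i).
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    def f(x):
        return sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P
    return [(i, f(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0; needs at least t + 1 shares.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = share(secret=42, n=5, t=2)    # (5, 2)-threshold access structure
assert reconstruct(shares[:3]) == 42   # any 3 = t + 1 shares are a qualified set
assert reconstruct(shares[:2]) != 42   # 2 shares are unqualified (fails only with negligible probability)
```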

As we expect a multi-party computation protocol to fulfil several security requirements, it is important to have a security definition that can be used to prove the security of the protocol. The standard definition nowadays formulates the security of the MPC protocol through the ideal/real simulation paradigm. It mentally constructs an ideal world in which a trusted (and thus incorruptible) external party performs the computation the parties want to carry out. In this ideal world setting, the parties just send their input values to the trusted party and receive from it the correctly computed output value. As the only action of the parties in this setting is sending their input, the only power of the adversary is to choose the inputs of the corrupted parties. Of course, in the real world there is no trusted party and the parties have to execute the protocol while some of the parties involved in the computation might be corrupt. The real world protocol is considered secure if the adversary can do no more harm in the real world execution than in the ideal world execution.

So one has to prove that when an adversary succeeds in carrying out an attack in the real world, there exists an adversary that can perform an attack with the same effect in the ideal world. This implies that for each attack in the real protocol execution, there exists an adversary in the ideal world attacking the execution of the trusted external party in a similar manner, which means the input and output distributions of the adversarial and honest parties should be the same for both executions. As there are no successful adversarial attacks possible in the ideal world, it should be concluded that all adversarial attacks in the real world also have to fail.

This thought experiment is used to prove the security of the isolated MPC protocol, yet in reality the MPC protocol is often part of a larger system. In [Can00], it is proven that an MPC protocol that is part of a larger system can still be considered to be executed by an incorruptible trusted party and thus the ideal/real simulation paradigm can still be used to prove its security. This property is called modular composition and allows one to split larger systems in a modular way into subprotocols. The real/ideal simulation paradigm covers a stand-alone setting of the MPC protocol. This means that if the MPC protocol is part of a larger system, no messages are sent in parallel with the protocol execution; or, equivalently, that messages sent outside the MPC protocol can only be sent before or after the MPC protocol. It is, however, possible that MPC protocols are run concurrently with other, possibly insecure protocols. This setting is not covered by the previous stand-alone security definition, hence researchers developed new definitions to deal with this setting. The most commonly used definition for this setting is the universal composition (UC) framework introduced by Canetti [Can01]. If the MPC protocol is proven secure in the UC framework, it has the composition property which guarantees the security of the protocol instance independently from other protocols that are executed concurrently.

Over the years, research on MPC techniques has evolved from constructing theoretical frameworks to optimising efficiency through algorithmic improvements. The efficiency of MPC techniques has improved to such a degree that nowadays it is sufficient for practical applications. Giving an overview of all the techniques developed and improvements made over the years is out of the scope of this thesis. To this end, we will restrict this overview to its origin, some important achievements over the years, techniques we will use later in this thesis and references to present-day schemes. We refer the interested reader to a survey by Lindell [Lin20] and a more elaborate survey by Evans, Kolesnikov and Rosulek [EKR17] for a more detailed description of MPC and its related techniques.

Even though Blakley and Shamir showed how to share a secret over multiple parties already in 1979 [Bla79, Sha79], the first attempt to create an MPC protocol to compute a function on the input of both parties while hiding one's input from the other party was by Yao in 1982 [Yao82]. At that time, secret sharing was a research direction on its own, independent of MPC. Some important research results are the possibility to construct a linear secret sharing scheme (LSSS) for any possible access structure by Ito et al. in 1987 [ISN87] and the description of secret sharing schemes through monotone span programs by Karchmer and Wigderson in [KW93]. Several different linear secret sharing schemes were constructed, such as disjunctive normal form based secret sharing in [ISN87] and replicated secret sharing based on the conjunctive normal form in [ISN93]. Replicated secret sharing got its name due to the fact that in these schemes multiple parties hold the same share and hence the shares are replicated.

Research was also directed at investigating the feasibility of performing secure computation when corrupt parties are participating in the computation. Some of the most significant results were achieved in [GMW87, BGW88, CCD88, RB89, Yao86]. Later, the research direction switched from the more theoretical feasibility framework with proof-of-concept protocols to investigations into making the protocols more efficient in order to be able to use them in real-life applications. A game-changing idea to improve the efficiency of the schemes is due to Beaver, who shows in [Bea92] that one can generate random values in a preprocessing phase before the start of the MPC computation, which can then later be used to reduce the communication cost of the MPC protocol. This introduces the paradigm of splitting an MPC protocol in an offline preprocessing phase (parties work together to produce correlated randomness to later use in the online phase) and an online phase (computation of the actual function on the secret inputs). Most current MPC protocols use this two-phase approach and a strong security setting, for example dishonest majority. Two contemporary examples in this setting are [DPSZ12] and [BDOZ11], which are based on secret sharing and have a preprocessing phase based on homomorphic encryption.
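To illustrate how preprocessed correlated randomness helps in the online phase, here is my own minimal sketch (not the thesis's protocol) of a Beaver-triple multiplication on additively shared values: a triple (a, b, c) with c = a·b is produced offline, the parties open the masked differences d = x − a and e = y − b, and each party locally derives a share of x·y.

```python
import random

P = 2**61 - 1   # public prime modulus; all values are additively shared mod P

def share(v, n=2):
    s = [random.randrange(P) for _ in range(n - 1)]
    return s + [(v - sum(s)) % P]

def reveal(shares):
    return sum(shares) % P

# Offline phase: a trusted dealer (stand-in for the preprocessing protocol)
# hands out shares of a random multiplication triple with c = a * b.
a, b = random.randrange(P), random.randrange(P)
a_sh, b_sh, c_sh = share(a), share(b), share((a * b) % P)

# Online phase: multiply secret-shared inputs x and y.
x_sh, y_sh = share(25), share(17)
d = reveal([(x - ai) % P for x, ai in zip(x_sh, a_sh)])   # opens d = x - a
e = reveal([(y - bi) % P for y, bi in zip(y_sh, b_sh)])   # opens e = y - b

# Each party computes its share of x*y locally; the public constant d*e is added by one party only.
z_sh = [(d * bi + e * ai + ci + (d * e if i == 0 else 0)) % P
        for i, (ai, bi, ci) in enumerate(zip(a_sh, b_sh, c_sh))]
assert reveal(z_sh) == 25 * 17
```

Opening d and e reveals nothing about x and y because a and b act as one-time masks.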

Another technique for achieving secure computation by multiple parties originated in 1986, when Yao introduced circuit garbling in [Yao86]. He showed that, in the setting of a passive adversary, two parties can evaluate a boolean circuit on their inputs using circuit garbling. The big advantage the garbled circuit (GC) approach has over the LSSS approach is that only a constant number of communication rounds is needed between the parties, while for LSSS the number of communication rounds typically depends on the multiplicative depth of the circuit (i.e. the number of subsequent multiplications). However, only needing a constant number of communication rounds comes at the cost of having to transfer a lot of data between the parties. Many improvements to the garbled circuit approach were made over the years; a first follow-up work [BMR90] broadened garbled circuits to a setting of n parties and reduced the computational burden of one of the parties significantly. The best-known techniques for garbling arithmetic circuits are [BMR16] and [Ben18], but performing multiplications in this setting is expensive as a lot of data needs to be transferred to achieve this. Multi-party garbling of boolean circuits has, in contrast, evolved enough to be considered practical even for the active security setting, see [WRK17, HSS17, KY18].
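As a rough illustration of circuit garbling (my own simplified sketch, omitting standard optimisations such as point-and-permute and free-XOR), the snippet below garbles a single AND gate: each wire gets two random keys, one per bit value, and the shuffled table lets an evaluator holding exactly one key per input wire recover exactly one output-wire key without learning which bits those keys encode.

```python
import os, random, hashlib

def H(*parts):
    return hashlib.sha256(b"".join(parts)).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def garble_and_gate():
    wa = [os.urandom(16), os.urandom(16)]   # keys for input wire a (bit 0, bit 1)
    wb = [os.urandom(16), os.urandom(16)]   # keys for input wire b
    wc = [os.urandom(16), os.urandom(16)]   # keys for the output wire
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            pad = H(wa[bit_a], wb[bit_b])
            # Encrypt the correct output key together with a zero tag for row detection.
            table.append(xor(pad, wc[bit_a & bit_b] + b"\x00" * 16))
    random.shuffle(table)                   # hide which row corresponds to which inputs
    return wa, wb, wc, table

def evaluate(key_a, key_b, table):
    pad = H(key_a, key_b)
    for row in table:
        candidate = xor(pad, row)
        if candidate[16:] == b"\x00" * 16:  # only the matching row decrypts to a valid tag
            return candidate[:16]
    raise ValueError("no row decrypted correctly")

wa, wb, wc, table = garble_and_gate()
# Evaluating with the keys for bits 1 and 1 yields the output key encoding 1 AND 1 = 1.
assert evaluate(wa[1], wb[1], table) == wc[1]
assert evaluate(wa[1], wb[0], table) == wc[0]
```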

The very first MPC protocol, Yao's protocol [Yao82], uses oblivious transfer (OT), a protocol that allows a sender to transfer one of potentially many inputs to a receiver while remaining oblivious about which input was transferred to the receiver. The first OT protocol was created by Rabin in 1981 [Rab81]; it was followed by a protocol by Even, Goldreich and Lempel [EGL82], which was designed to build MPC from it. These first OT protocols allowed one to send one out of two inputs obliviously; later this technique was extended to a one-out-of-n inputs setting and a k-out-of-n setting in [BCR87]. It is shown by Kilian [Kil88] that OT is complete for secure multi-party computation; hence any polynomial time computable function can be securely evaluated using only OT. Researchers not only took an interest in oblivious transfer because of its completeness for secure multi-party computation but also for its use in the preprocessing phase of linear secret sharing schemes and in garbled circuits. The technique of oblivious transfer was broadened to the setting where one party inputs a polynomial P, the other party inputs a value a and the oblivious transfer protocol results in the party inputting the value a learning P(a). This is called oblivious polynomial evaluation and was introduced by Naor and Pinkas in [NP99], with oblivious linear function evaluation (OLE) as a special case. Oblivious linear function evaluation receives significant interest nowadays because of its potential in the preprocessing phase of MPC protocols for arithmetic circuits. For example, the recent works [HIMV19] and [DGN+17] use OLE in the preprocessing phase to achieve an efficient MPC protocol in the dishonest majority setting.

Applications

There are many application scenarios in which MPC is useful. As MPC distributes the computation, and hence the trust, over multiple parties, it can be used to remove a potential single point of failure or to combine several trusted institutions into one which is more neutral and trustworthy. Since MPC techniques have advanced enough to be applicable in real life, we list here some of the current MPC applications.

• Resolving conflicting interests: The first large-scale deployment of MPC aimed to solve the problem of determining the market price for sugar beets. A collaboration between researchers from Aarhus University and the Danish government [BCD+09] led to a system in which farmers submit bids specifying how many sugar beets they would sell for a specific price and buyers input the amount of sugar beets they want to receive for a certain price. These inputs are secret shared amongst three parties who then run a double auction to find the market price by finding the point where the total supply equals the total demand. This led to the creation of the company Partisia, focussing on commercial activities within market design and data analytics. In addition, Partisia nowadays concentrates on infrastructure for key management and data exchange.

• Private data as a service: The increasing amounts of data have led to a data-driven world in which data is used to make decisions and predict future outcomes. However, as data has become more valuable, privacy concerns have been raised about data being used outside its original context. Nowadays, efforts are made to deploy techniques allowing one to operate on the data in a privacy-preserving manner. The following two well-known systems were created to enable the collection and use of private data in a secure manner. The Jana system [Gal], which Galois Inc. developed with the help of several universities and other companies, integrates MPC, differential privacy and searchable encryption to evaluate user queries on an encrypted database. The Sharemind system developed by Cybernetica [Cyb] provides a shared database. As such, it allows one to share data from multiple sources without revealing any information on the shared records and enables joint analyses on the shared data.

• Securing cryptographic keys: Securing cryptographic keys is a tedious task. Not only is key management challenging, there is also the threat of an attacker obtaining access to one's server and stealing the keys from there. One can avoid this single point of failure by distributing the key over multiple servers using MPC. In that case an attacker needs to have access to all the different servers in order to retrieve the key. An authentication and signing procedure using this distributed secret key information can be realised with specific MPC functionalities. Unbound Security [Sec] offers such a software solution, which distributes the key across different cloud providers of the client's choice and executes authentication and threshold signatures without reconstructing the key from the distributed key material.

• Private set intersection: Private set intersection allows two parties to input a list of elements and only learn the common elements between the two lists. Google designed a specialised two-party protocol performing this operation, which shows a warning in Chrome if the username and password you enter to log in match their giant database of leaked credentials. As such, they are using MPC to warn their users if their username and password are compromised in a data breach and advise them to change their credentials. This work was first announced in the blogpost [Lak19]; later the underlying private set intersection techniques were published in [IKN+20] and [MPR+20].

1.4 Contributions

Optimising privacy-preserving computation techniques can be done with the following two strategies. One is to investigate general mathematical operations or functions and later use the resulting constructions as building blocks in more complex computations. A second strategy is to start from a specific application scenario and design an optimised solution for the task at hand. Research on privacy-preserving computation explores both strategies. Similarly, this thesis comprises works constructing techniques that improve generic homomorphic operations as well as works on creating the most efficient privacy-preserving solution for a particular problem at hand.


A first general problem with homomorphic encryption is that the schemes natively encrypt data from Z[X]/f(X) or even Z_t[X]/f(X) for some monic polynomial f and integer t > 1. This of course does not correspond to the native data type of the input data of applications. Thus one has to find a way to encode the input data into this ring such that operations on the encoded data map to operations on the input data. Investigation of previous encoding techniques to transform real numbers to the plaintext space Z_t[X]/f(X) showed that the input data is concentrated in a small number of polynomial coefficients of the encoding. This concentration causes the polynomial coefficients to grow faster during operations and hence requires a bigger plaintext modulus to accommodate this growth. A big plaintext modulus leads to faster noise growth and as such demands larger parameters for the homomorphic encryption scheme, which results in less efficient homomorphic operations. In the work of Chapter 6, we construct an encoding technique that spreads the data out better over the different polynomial coefficients. This reduces the coefficient growth triggered by homomorphic operations significantly and hence allows one to use a smaller ciphertext modulus without affecting the security of the homomorphic scheme, which gives rise to more efficient homomorphic applications. Looking at the problem from a different angle, this encoding algorithm also enables one to tailor the encoding in order to make the data fit a fixed plaintext space.

Genome-wide association studies (GWAS) identify which genetic variants are associated with diseases in order to define genetic disorders as potential markers for a disease. On the one hand, GWAS require genomic information of a large population to produce reliable results; on the other hand, genomic data is very privacy sensitive, hence the data cannot easily be shared with other parties. Hence, this application scenario of researchers wanting to investigate the data gathered in medical centres and hospitals demands a privacy-preserving solution. Chapter 7 provides two secure and efficient solutions for this application, one based on homomorphic encryption and one based on MPC. At this point the MPC solution is the most performant, but it is important to keep in mind, as we also point out in Chapter 4, that in the future latency might become an important bottleneck, which could render the HE setting competitive with the MPC setting for this scenario.

In the data-driven world of today, there is a lot of interest in privacy-preserving machine learning techniques, as this would allow researchers to retrieve important information without compromising the privacy of their subjects, or companies to turn data into revenue without compromising the privacy of their clients. Finding efficient methods to train a model on encrypted data still remains a big research challenge. We managed to design an efficient privacy-preserving training algorithm for the logistic regression model in Chapter 8, using homomorphic encryption and a fixed Hessian based algorithm. We showcase the performance of our algorithm in two real-life scenarios. Firstly, the algorithm is applied in a medical scenario, where it constructs a model to predict the probability that a patient has cancer, and secondly, it is applied in a financial scenario to predict the probability that a credit card transaction is fraudulent.

It is well known that the homomorphic schemes that are very efficient at performing arithmetic operations, like BGV, FV and HEAAN, perform poorly at evaluating comparisons. However, equality testing is an important part of many applications, such as internet search, text processing and DNA analysis. Previous techniques rely on a bit representation of the inputs or a polynomial approximation of the equality function, which results either in large parameters or in poor results. In Chapter 9, we design a new method to homomorphically compare two numbers and use this method to construct a substring search algorithm. Our design is based on a randomised equality circuit that makes the depth of the circuit independent of the pattern length. This independence not only improves the efficiency of the protocol significantly, but also increases the range of application scenarios, as the same set of system parameters can be used for a wide range of pattern lengths. As such, the user does not have to fix the pattern length at the point in time where he encrypts and uploads the text he wants to search later. Instead, it is sufficient to determine a range for the size of the patterns the user wants to search in the future and to then select system parameters that allow searching for patterns with lengths within this range. In addition, we designed a compression technique that reduces the communication cost of the string search protocol.

Given the current state of the art, the scenario of constructing specific privacy-preserving applications is different for HE and MPC. While the state-of-the-art MPC techniques could be considered mature enough to be used as building blocks with which one can construct a privacy-preserving computation, HE still requires a lot of customisation of both the computations of the application and the HE techniques to end up with decent performance. One does not have the freedom to change the application itself in every scenario. Sometimes companies are bound by standards or compliance issues and have to adhere to certain operations. In those scenarios, it is very hard to achieve performant solutions using homomorphic encryption, unless the operations of the circuit fit the capabilities of a specific scheme perfectly. As recent cryptographic developments such as blockchain created a scenario in which the creation of one valid signature gives the adversary a lot of power, interest in threshold signature schemes renewed. Threshold signature schemes mitigate the risk of an adversary producing a valid signature by distributing the signing power to a qualified set of parties, which implies the adversary has to corrupt multiple parties in order to generate a valid signature. This new threat, coming from an adversary that can generate a valid signature, led to an effort by the standardisation body NIST to create threshold variants of their standardised signature schemes without changing the output [NIS], such that organisations currently working with deterministic signatures can replace these with threshold variants with minimal alterations to their systems. Nowadays, MPC is clearly the best technique to achieve this threshold setup. We investigate in Chapter 10 how to achieve the most efficient threshold variant of the HashEdDSA signature scheme with the current state-of-the-art MPC protocols. Furthermore, we show that if one is willing to change the standard algorithm to use recently developed, more MPC-friendly hash functions, one can create slightly faster threshold signature schemes.

Chapter 2

Mathematical background and preliminaries

2.1 Notation

Given a number a ∈ N, [a] indicates the set of natural numbers {1, . . . , a} and for a, b ∈ Z with a ≤ b, [a, b] denotes the integer set {a, a + 1, . . . , b − 1, b}. For a finite set S, we denote sampling a uniformly random element s from S by s ←$ S, and for a distribution D, we denote sampling an element from D by x ← D.

For an integer q > 1, let Z_q be the ring of integers modulo q, whose elements we by default identify with their representatives in [−q/2, q/2). We denote this representative of an integer a mod q by [a]_q. If we need the reduction in the interval [0, q − 1], we will use the notation r_q(a). In addition, ⌊·⌋ denotes flooring, ⌈·⌉ denotes ceiling and ⌊·⌉ denotes rounding to the nearest integer, rounding upwards in case of a tie. A vector will be indicated with a boldface lower case letter a and matrices with their upper case variant A.

As we will make extensive use of polynomial rings and their elements, we will also define notation for those here. For a polynomial f(X) of degree n, the ring of polynomials with degree strictly smaller than n and integer coefficients is denoted by R = Z[X]/(f(X)), and an element of this ring will be denoted with a bold lower case letter a. We extend ⌊·⌋, ⌈·⌉ and ⌊·⌉ to polynomials by applying them coefficient-wise. For an element a = ∑_{i=0}^{n−1} a_i X^i ∈ Z[X], we define ‖a‖_∞ = max{|a_i| : 0 ≤ i < n} as the maximum absolute value of its coefficients. The expansion factor of the ring R, which defines the maximum growth of the coefficients during a multiplication of two elements of R, is denoted by δ_R = sup{‖a · b‖_∞ / (‖a‖_∞ ‖b‖_∞) : a, b ∈ R \ {0}}.
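
For intuition, the expansion factor can be checked numerically. The following minimal Python sketch (an illustration under the assumption R = Z[X]/(X^n + 1), for which δ_R = n) multiplies random ring elements and verifies that the observed coefficient growth never exceeds n:

```python
import random

def negacyclic_mul(a, b, n):
    """Multiply a and b in Z[X]/(X^n + 1), given as coefficient lists of length n."""
    res = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                res[i + j] += ai * bj
            else:                      # X^n = -1, so the wrap-around flips the sign
                res[i + j - n] -= ai * bj
    return res

def inf_norm(a):
    return max(abs(c) for c in a)

n, worst = 8, 0.0
for _ in range(1000):
    a = [random.randint(-10, 10) for _ in range(n)]
    b = [random.randint(-10, 10) for _ in range(n)]
    if inf_norm(a) and inf_norm(b):
        worst = max(worst, inf_norm(negacyclic_mul(a, b, n)) / (inf_norm(a) * inf_norm(b)))
print(worst, "<=", n)
```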

2.2 Algebraic number theory

Number fields and rings

A field K which contains the rational numbers as a subfield and is finite-dimensional when considered as a Q-vector space is called a number field. The dimension of K is also called the degree of the number field. As the degree of a number field K is finite, one can for any element α ∈ K find a non-zero polynomial f ∈ Q[X] of finite degree for which f(α) = 0. The polynomial f can be selected to be monic and irreducible over Q[X], in which case we call f the minimal polynomial, as this corresponds to the polynomial with minimal degree for which α is a root. This root α is called an algebraic number. If the minimal polynomial of α only has integer coefficients, we call α an algebraic integer. Hence for a monic irreducible polynomial f ∈ Z[X] of degree n, the quotient ring K = Q[X]/〈f〉 is a number field of degree n and we call f the defining polynomial of the number field K. The set of all algebraic integers in the number field K forms a ring with unity and is called the ring of integers of K. This ring of integers of K always has a Z-basis. If the set of all algebraic integers of K = Q[X]/〈f〉, with f the defining polynomial, equals R = Z[X]/〈f〉, we call K a monogenic number field and f a monogenic polynomial.

In this thesis we will work with a class of monogenic number fields called cyclotomic fields. For a natural number m > 0, define ζ_m = e^{2πi/m} to be the primitive complex m-th root of unity. Since ζ_m is a root of X^m − 1, it is an algebraic integer. The minimal polynomial of ζ_m, which we will denote by Φ_m(X), turns out to be

Φ_m(X) = ∏_{1 ≤ k ≤ m, gcd(k,m)=1} (X − e^{2πik/m}),

where gcd(k, m) stands for the greatest common divisor of k and m. We call Φ_m(X) the m-th cyclotomic polynomial and its degree is given by n = ϕ(m) with ϕ(·) the Euler totient function. Common examples of cyclotomic polynomials often used in homomorphic encryption are the cyclotomic polynomials for m a power of two or m prime. For m = 2^k with k ≥ 1, Φ_{2^k}(X) = X^{2^{k−1}} + 1, and for m = p with p a prime number, we have Φ_p(X) = X^{p−1} + X^{p−2} + . . . + X + 1.
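
As a small sanity check on these two special cases (purely illustrative, not part of any construction in this thesis), the following Python sketch builds Φ_m for m a power of two or m prime and verifies that its degree equals ϕ(m):

```python
from math import gcd

def euler_phi(m):
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def cyclotomic_special(m):
    """Coefficient list (constant term first) of Phi_m for the two special cases above."""
    if m > 1 and m & (m - 1) == 0:                                   # m = 2^k: X^(m/2) + 1
        return [1] + [0] * (m // 2 - 1) + [1]
    if m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1)):    # m prime: 1 + X + ... + X^(m-1)
        return [1] * m
    raise ValueError("only powers of two and primes are handled here")

for m in (4, 8, 16, 5, 7, 13):
    coeffs = cyclotomic_special(m)
    assert len(coeffs) - 1 == euler_phi(m)                           # deg Phi_m = phi(m)
```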


The canonical embedding

Assume we have a number field K = Q(α) of degree n, such that the minimal polynomial of α has n roots α_1, α_2, . . . , α_n in C. Then one can define n field homomorphisms σ_i : K → C : α ↦ α_i. If the image of K under σ_i is a subset of R, we call σ_i a real embedding; otherwise, we call it a complex embedding. Complex embeddings always come in pairs, such that for a pair of complex embeddings (σ_i, σ_j) it holds that σ_j(x) = \overline{σ_i(x)} for all x ∈ K, where \overline{x} denotes the complex conjugate of x. Let s_1 be the number of real embeddings and n − s_1 = 2s_2 the number of complex embeddings; then we can define the canonical embedding σ : K → R^{s_1} × C^{2s_2} : α ↦ σ(α) with σ(α) = (σ_1(α), . . . , σ_{s_1}(α), σ_{s_1+1}(α), . . . , σ_{s_1+s_2}(α), \overline{σ_{s_1+1}(α)}, . . . , \overline{σ_{s_1+s_2}(α)}). The image of the canonical embedding is contained in H = {(x_1, . . . , x_n) ∈ R^{s_1} × C^{2s_2} : x_{s_1+s_2+i} = \overline{x_{s_1+i}}, ∀i ∈ [s_2]}. Given the field homomorphisms σ_i, we define the trace map of an element x ∈ K as the sum of its embeddings Tr(x) = ∑_{i=1}^{n} σ_i(x).

Ideals

We assume (R, +, ·) to be a commutative ring with an identity 1 ≠ 0 (i.e. a neutral element for the multiplication).

Definition 1. A subset I is an ideal if (I, +) is a subgroup of the additive group (R, +) and for every element r ∈ R and x ∈ I, the product is an element of the ideal, i.e. rx ∈ I.

Definition 2. An ideal is called a principal ideal if it is generated by one element of the ring R; thus if I = rR for r ∈ R, we denote this as I = 〈r〉.

Definition 3. Two ideals I and J of the commutative ring R are coprime if I + J = R.

Definition 4. If I and J are two ideals of the commutative ring R, then the product I · J is defined as the ideal of R formed by the set of all finite sums of elements of the form xy with x ∈ I and y ∈ J.

The ring of integers of a number field K is a commutative ring with identity; consequently we can consider its ideals. A non-zero ideal of the ring of integers of a field K is called an integral ideal. An additive subgroup I of the field K is called a fractional ideal if there exists a non-zero element r in the ring of integers of K for which rI is an integral ideal of the ring of integers of K. Every non-zero fractional ideal of the ring of integers of K has a Z-basis of size m, with m the degree of the number field K. The dual ideal of a fractional ideal I is defined as

I^∨ = {x ∈ K : Tr(xy) ∈ Z for all y ∈ I},

with Tr(·) the trace map as defined in Section 2.2.

We state the Chinese remainder theorem, which will be used in homomorphic encryption schemes to make the operations more efficient.

Theorem 1. (Chinese remainder theorem (CRT)) Let I_1, . . . , I_k be k pairwise coprime ideals of R and let I = I_1 · . . . · I_k be the product of these ideals; then the following ring isomorphism holds:

CRT : R/I → R/I_1 × . . . × R/I_k : x mod I ↦ (x mod I_1, . . . , x mod I_k).

As shown by its definition, the Chinese remainder theorem defines an isomorphism between the quotient ring R/I and the direct product of the quotient rings R/I_i. This isomorphism allows one to go back and forth between the two equivalent representations, to either work on several elements packed into a single element of R/I via the inverse transformation CRT^{−1}, or to perform independent computations in the smaller rings R/I_i, which can moreover be performed in parallel and are thus more efficient than computations in the big ring R/I.
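
For intuition, the simplest instance of this isomorphism is R = Z with pairwise coprime integer moduli. A minimal Python sketch (illustrative only) of the forward map, its inverse, and the fact that componentwise multiplication in the small rings matches multiplication modulo the product:

```python
from math import prod

def crt_split(x, moduli):
    """Forward CRT map: x mod I -> (x mod I_1, ..., x mod I_k)."""
    return [x % q for q in moduli]

def crt_combine(residues, moduli):
    """Inverse CRT map, assuming the moduli are pairwise coprime."""
    Q = prod(moduli)
    x = 0
    for r, q in zip(residues, moduli):
        Qi = Q // q
        x += r * Qi * pow(Qi, -1, q)      # pow(Qi, -1, q): inverse of Qi modulo q
    return x % Q

moduli = [7, 11, 13]
x, y = 123, 456
assert crt_combine(crt_split(x, moduli), moduli) == x % prod(moduli)

# Independent multiplications in the small rings correspond to one multiplication mod prod(moduli):
prod_residues = [(a * b) % q for a, b, q in zip(crt_split(x, moduli), crt_split(y, moduli), moduli)]
assert crt_combine(prod_residues, moduli) == (x * y) % prod(moduli)
```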

2.3 Lattices

There are two different mathematical structures called lattices; in this thesis we consider the following definition.

Definition 5. For an integer n ≥ 1, a lattice L is a discrete additive subgroup of the real vector space R^n endowed with a norm ‖·‖.

As a lattice L is an additive group, it holds that 0 ∈ L and ∀v_1, v_2 ∈ L : −v_1 ∈ L and v_1 + v_2 ∈ L. The discreteness means that, for the given norm and a lattice element v ∈ L, there exists an n-dimensional ball around v that does not contain any other lattice element. We call n the dimension of the lattice L.

Definition 6. A basis B of an n-dimensional lattice L is a set of d ≤ n linearly independent vectors B = {b_1, . . . , b_d} such that every element of the lattice L can be written as a linear combination of elements of B with coefficients in Z.

We denote with L(B) the lattice spanned by the basis B. We can define a matrix B ∈ R^{n×d} whose columns are the basis vectors b_i of L. A basis is far from unique, as any non-trivial unimodular transformation¹ transforms one basis into another. The number of vectors in a lattice basis is a constant called the rank of the lattice. Following Definition 6, the rank of the lattice L is rank(L) = d. As a consequence of the fact that a non-trivial unimodular transformation transforms one basis into another basis, a lattice of rank d ≥ 2 has an infinite number of bases. If the rank of the lattice equals the dimension, we call the lattice a lattice of full rank.

¹ A unimodular transformation is given by a square matrix with integer coefficients whose determinant equals ±1.

The length of a lattice vector is given by its Euclidean norm.

Definition 7. The minimal distance of a lattice L is the length of a shortest non-zero vector of the lattice. The minimal distance is denoted by λ_1(L) = min_{0 ≠ v ∈ L} ‖v‖.

Note that we always talk about a shortest vector and not the shortest vector, as a shortest vector is never unique: for a vector v ∈ L that satisfies the shortest vector condition, the vector −v is also an element of the lattice L satisfying this condition. Analogously to the minimal distance, we define the i-th minimum of the lattice L as the smallest positive real number λ_i(L) such that L has i linearly independent vectors of norm at most λ_i(L).

As mentioned in the introduction, homomorphic encryption schemes rely on hard problems. With hard problems we refer to mathematical problems which require significant resources to solve, independent of the algorithm used to solve them. To prove that a problem is hard, one often reduces another hard problem to it, which implies its hardness in the following way. If one wants to show that problem A is hard, one constructs an algorithm showing that solving problem A also solves problem B. Then, as problem B is hard and solving A implies solving B, one can deduce that problem A is hard. The currently most popular HE schemes rely on the LWE and RLWE problems, whose hardness is related via reductions to hard lattice problems. Hence, before looking into LWE and RLWE, we define the hard lattice problems used to show the hardness of LWE and RLWE.

Definition 8. Shortest vector problem (SVP): Given an arbitrary basis B of the lattice L, find a shortest non-zero vector of L, thus a vector v ∈ L such that ‖v‖ = λ_1(L).

It was shown that worst-case instances of the shortest vector problem are NP-hard for generic lattices [vEB81, Ajt98, Mic01]. Even though this increases trust in the security of schemes based on this hard problem, one has to keep in mind that cryptographic constructions usually assume the hardness of the average-case problem.

Furthermore, the shortest vector problem has been relaxed to an approximation problem parametrised by an approximation factor γ ≥ 1, called the approximate shortest vector problem (SVP_γ).


Definition 9. Approximate shortest vector problem (SVP_γ): Given an arbitrary basis B of a lattice L and an approximation factor γ ≥ 1, find a non-zero vector v ∈ L such that ‖v‖ ≤ γ · λ_1(L).

There is also a decisional version of the approximate shortest vector problem, called GapSVP_γ.

Definition 10. Decisional approximate shortest vector problem (GapSVP_γ): Given an arbitrary basis B of a lattice L and an approximation factor γ ≥ 1, decide whether λ_1(L) ≤ 1 or λ_1(L) > γ, assuming one of the two holds.

A generalisation of the approximate shortest vector problem to multiple vectors is given as the approximate shortest independent vectors problem (SIVP_γ).

Definition 11. Approximate shortest independent vectors problem (SIVP_γ): Given an arbitrary basis B of a rank-n lattice L and an approximation factor γ ≥ 1, find a set of n linearly independent lattice vectors {v_1, . . . , v_n} ⊂ L such that ∀i ∈ {1, . . . , n} : ‖v_i‖ ≤ γ · λ_i(L).

SIVP_γ was proven to be hard for γ = β · poly(n) and a real number β > 0 by Ajtai [Ajt96]. In the case of RLWE, we will not look at general lattices but at specific types of lattices called ideal lattices, obtained by mapping a fractional ideal of the ring of integers of a field K to a full-rank lattice in the space H defined in Section 2.2, using the canonical embedding.

2.4 Learning with errors problem

The homomorphic encryption schemes that are mainly used today are based on the (ring) learning with errors ((R)LWE) problem. The learning with errors problem was introduced by Regev in 2005 [Reg05] and has two variants: the decision LWE problem and the search LWE problem.

Definition 12. Let λ be a security parameter on which all the following parameters will depend. Determine two integers n and q, respectively the dimension and modulus. Let χ be a probability distribution over Z. Sample a uniformly random secret vector s from Z_q^n. Now we define the LWE distribution D^{LWE}_{s,χ} as the distribution consisting of pairs (a, b) ∈ Z_q^{n+1} with a chosen uniformly at random from Z_q^n and b = a · s + e with e ← χ. Then we can distinguish the following two variants of the problem:

• decision LWE problem: Given arbitrarily many samples (a, b) ∈ Z_q^{n+1}, determine if the samples come from the LWE distribution D^{LWE}_{s,χ} with fixed s from Z_q^n or from a uniform distribution over Z_q^{n+1}.

• search LWE problem: Given arbitrarily many samples (a, b) ← D^{LWE}_{s,χ}, determine the secret vector s from Z_q^n.
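
A minimal Python sketch of how LWE samples are formed (toy parameters far below any real security level; the concrete values are placeholders for illustration):

```python
import random

def lwe_samples(n, q, s, sigma, num):
    """Generate `num` LWE samples (a, b = a.s + e mod q) with rounded Gaussian errors."""
    samples = []
    for _ in range(num):
        a = [random.randrange(q) for _ in range(n)]
        e = round(random.gauss(0, sigma))
        b = (sum(ai * si for ai, si in zip(a, s)) + e) % q
        samples.append((a, b))
    return samples

n, q, sigma = 16, 3329, 3.2                       # toy parameters only
s = [random.randrange(q) for _ in range(n)]       # uniformly random secret in Z_q^n
print(lwe_samples(n, q, s, sigma, 2)[0])
```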

The instances of these problems are defined by random matrices and vectors. As the hardness can vary depending on the random choices, we talk about average-case hardness, by which we mean how hard the problem is on average over all possible choices of the randomness, and worst-case hardness, by which we mean the hardest instance of the problem. It is known that the average-case hardness of the LWE problem can be related to the worst-case hardness of lattice problems, as Regev has shown in [Reg05]. Following that work, it was shown by Peikert in 2009 [Pei09] that the n-dimensional search LWE problem with modulus q exponential in n is at least as hard as the worst-case GapSVP_γ problem via a classical polynomial-time reduction. This work was improved by Brakerski et al. in 2013 [BLP+13] to a modulus q that is polynomial in n.

The hardness of these problems thus depends heavily on the chosen parameters. In order to obtain a high level of security, the matrices and vectors need to have a large dimension. In practice, this leads to a high memory demand and a long running time for the matrix-vector products. The search for a more practical solution led to the introduction of a ring variant of this problem called ring learning with errors (RLWE), which is based on ideal lattices. In [LPR10], RLWE is defined for the ring of integers in any number field, but here we restrict the definition to the special case used in homomorphic encryption.

Definition 13. Let λ be a security parameter on which all other parameters depend. Choose an integer n that defines the size of the ring and set R = Z[X]/(f(X)) with f(X) a monic polynomial of degree n that is irreducible over Z[X]. Let χ be a probability distribution over Z[X]/(f(X)). Choose an integer q and define the ring R_q = Z_q[X]/(f(X)). Sample s ←$ R_q uniformly at random. Now we can define the ring learning with errors distribution D^{RLWE}_{s,χ} as the distribution of pairs (a, b) ∈ R_q^2 consisting of a uniformly at random chosen a ←$ R_q and b computed as b = a · s + e with e ← χ. Then we distinguish the following two variants of the problem:

• decision RLWE problem: Given arbitrarily many samples (a, b) ∈ R_q^2, determine if the samples come from the RLWE distribution D^{RLWE}_{s,χ} with fixed s ←$ R_q or from a uniform distribution over R_q^2.

• search RLWE problem: Given arbitrarily many samples (a, b) ← D^{RLWE}_{s,χ}, determine the secret s from R_q.
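
A minimal Python sketch of sampling from the RLWE distribution in the common special case R_q = Z_q[X]/(X^n + 1) (toy parameters, schoolbook negacyclic multiplication; the concrete values are placeholders for illustration):

```python
import random

def poly_mul_mod(a, b, q, n):
    """Multiply a and b in R_q = Z_q[X]/(X^n + 1) (negacyclic convolution, schoolbook)."""
    res = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                res[i + j] = (res[i + j] + ai * bj) % q
            else:                                   # reduce X^(i+j) using X^n = -1
                res[i + j - n] = (res[i + j - n] - ai * bj) % q
    return res

def rlwe_sample(s, q, n, sigma):
    """One RLWE sample (a, b = a*s + e) over R_q."""
    a = [random.randrange(q) for _ in range(n)]
    e = [round(random.gauss(0, sigma)) % q for _ in range(n)]
    b = [(c + ei) % q for c, ei in zip(poly_mul_mod(a, s, q, n), e)]
    return a, b

n, q, sigma = 16, 3329, 3.2                         # toy parameters only
s = [random.randrange(q) for _ in range(n)]         # uniform secret, as in Definition 13
a, b = rlwe_sample(s, q, n, sigma)
```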

First, Lyubashevsky et al. [LPR10] showed that the RLWE search problem reduces to the RLWE decision problem. Later, Lyubashevsky [Lyu11] proved that both problems are equivalent and can be reduced to the shortest vector problem or the shortest independent vectors problem in ideal lattices. The original definition of RLWE in [LPR10] requires the secret s to be taken from R^∨, the dual fractional ideal of R, which is defined in Section 2.2.² However, it was later shown in [DD12] that it can be chosen uniformly in R_q without making the problem easier.

² For more details we refer the interested reader to the original article [LPR10] or [Con09].

In homomorphic encryption, cyclotomic polynomials are used to define the polynomial ring. The error term should be sampled according to an n-dimensional centered spherical Gaussian distribution in the embedding space H, for which each coordinate is sampled independently, as reported in [LPR10] and [CIV16b]. For a cyclotomic polynomial Φ_m(X) with m a power of two, the error term e can be directly sampled from a scaled centered spherical Gaussian in R. However, for general m this does not hold, and hence the sampled errors need to be transformed from H to R. The two known approaches to perform this transformation are described in [LPR13, DD12]. Research has shown that one has to take special care when defining the error distribution. An overview of insecure RLWE instances due to a bad choice of the error distribution is given by Peikert in [Pei16].

Hence, to achieve a secure instantiation of RLWE, the selection of the parameters is essential. However, for an appropriate set of parameters and error distribution, the reduction of RLWE to SVP or SIVP ensures asymptotic security of the RLWE problem. In order to set appropriate parameters, we need to estimate the hardness of a specific RLWE instance. The cryptanalysis of lattice-based cryptography goes beyond the scope of this thesis, but we want to give the reader some intuition on the relation between the parameters and the security of an RLWE instance. It is important to note that, given suitable parameters for RLWE, researchers have up until now not found an algorithm that exploits the additional structure it has compared to the LWE problem to break or weaken its security. As a consequence, the current security estimations of RLWE are done by transforming the RLWE instance to its corresponding LWE instance and making a security analysis for that LWE instance. Currently, most attacks on LWE are based on lattice reduction algorithms. The most well-known lattice reduction algorithms are the LLL algorithm introduced by Lenstra, Lenstra and Lovász [LLL82] and the BKZ reduction algorithm introduced by Schnorr et al. [SE94] and improved in [CN11]. Based on a distinguishing attack of Lindner and Peikert [LP11], we have an inequality relating the parameters of an RLWE instance to the shortest vectors that can be computed with a lattice reduction algorithm A. This inequality helps us better understand the relation between the different RLWE parameters and the security, and is given by:

α · q / σ_err < 2^{2 √(n · log_2 q · log_2 γ^A_min)},

with α a constant depending on the success rate of the distinguishing attack, q the modulus of the RLWE instance, σ_err the standard deviation of the discrete Gaussian error distribution, n the dimension of the RLWE instance and γ^A_min a value indicating the quality of the reduced basis that one can achieve with algorithm A in time 2^λ for a security level λ. This inequality shows that, for a fixed dimension n, one cannot increase the modulus q beyond a certain bound without also increasing the standard deviation σ_err. This relation gives the intuition that the security depends on a relation between the dimension n, the ciphertext modulus q and the standard deviation of the error distribution σ_err. However, in practice one has to check the efficiency of all the lattice algorithms to get a security estimation for a concrete LWE instance, as there is no consensus on which lattice attack on LWE outperforms the others for all instances of LWE. For this purpose, Albrecht et al. [Alb15] created a tool that allows one to check the efficiency of the known lattice reduction algorithms for a specific set of parameters. As this tool is maintained and updated when new algorithms or improvements to known algorithms are published, it is frequently used by researchers to estimate the security of their schemes.

The ring version of LWE became popular due to its efficiency. The ring structure reduces the memory overhead of LWE, as one RLWE sample contains the same amount of information as n LWE samples but is represented with one ring element a of n coefficients instead of the n² coefficients of the matrix A. As such, the ciphertext size and public key size are reduced. In addition, the expensive matrix-vector products that need to be computed for LWE are in RLWE replaced with products of elements of R_q, which can be computed more efficiently using the fast Fourier transform (FFT). Since the FFT performs best for X^n + 1 with n a power of two, power-of-two cyclotomic polynomials are a popular parameter choice in homomorphic encryption.

2.5 From access structures to secret sharing

Given a set of parties P, an access structure is defined as a pair (Γ, ∆) of disjoint subsets of 2^P, respectively called the qualified and unqualified sets. Each qualified set Q ∈ Γ is able to reconstruct the secret based on the shares of the parties in this set, while each set U ∈ ∆ does not have enough information to reconstruct the secret from the shares of the parties of U. If any superset of a set in Γ forms a qualified set and any subset of a set in ∆ is an unqualified set, we call the access structure a monotone access structure. An access structure is called complete if Γ ∪ ∆ = 2^P. Hence, when we have a monotone and complete access structure, Γ and ∆ form a partition of 2^P. A complete monotone access structure can therefore be described by the maximally-unqualified sets, denoted by ∆⁺, which are the largest subsets that are unqualified, or by the minimally-qualified sets Γ⁻, which are the smallest qualified sets. An access structure is called Q2 if no union of two unqualified sets in ∆ equals the whole set of parties P. Access structures will often be written in terms of the indices of the parties instead of in terms of the parties themselves.

Complete monotone access structures can be mapped to monotone boolean functions through a bijection by associating every party index with a coordinate of the domain of the boolean function. To talk about monotone boolean functions, we need a relation ≺ to be defined on the domain and a relation on the image. Assume we have a boolean function f : {0, 1}^n → {0, 1} with a relation ≺ defined on the domain {0, 1}^n and a relation ≤ defined on the image {0, 1} of f. Then f is monotonic if for every s, s′ ∈ {0, 1}^n it holds that s ≺ s′ ⇒ f(s) ≤ f(s′). Monotone boolean functions can be computed using monotone span programs (MSP), which were introduced by Karchmer and Wigderson [KW93] and consist of a quadruple M = (F, M, t, ρ) containing a field F, an m × k matrix M ∈ F^{m×k} of rank k, a non-zero vector t called the target vector and a surjective map ρ : [m] → [n] called the row map. Given a monotone span program M = (F, M, t, ρ), define the matrix M_s for s ∈ {0, 1}^n as the submatrix of M obtained by gathering the rows indicated by the set {j ∈ [m] : s_{ρ(j)} = 1}; then M defines the boolean function f : {0, 1}^n → {0, 1} as

f(s) := 1 if t ∈ Row(M_s), and f(s) := 0 otherwise,

with Row(M_s) being the row space of the matrix M_s. The relation on the domain of f can be defined as a relation on the row spaces of the matrices, such that for a monotonic f it holds that, given s, s′ ∈ {0, 1}^n with s ≺ s′, Row(M_s) ⊆ Row(M_{s′}); therefore t ∈ Row(M_s) implies that t ∈ Row(M_{s′}). Let the vector s correspond to a subset of parties by having a 1 in position i of the vector s if party P_i belongs to the subset of parties and a 0 in position i if party P_i does not belong to the subset. As such, we can make the qualified sets correspond to vectors s for which the target vector t belongs to the span of the rows of M_s.

A secret sharing of the value x ∈ F can now be constructed from the MSP by sampling a vector x ∈ F^k such that t^T · x = x. The share of party P_i is then created as M_{s_{P_i}} · x, with M_{s_{P_i}} the submatrix of M consisting of the rows determined by the index set {j ∈ [m] : ρ(j) = i}. As a qualified set Q ∈ Γ implies that t ∈ Row(M_{s_Q}), with s_Q the vector corresponding to the index set of the parties in Q, there is a vector of coefficients r_Q ∈ F^m such that M^T_{s_Q} · r_Q = t. The parties in a qualified set Q can thus combine their shares to reconstruct the secret by computing r_Q^T · M_{s_Q} · x = t^T · x = x. For an unqualified set U ∈ ∆, the target vector t will not lie in the span of Row(M_{s_U}), thus the parties cannot reconstruct the secret. It remains to be shown that the combination of the shares of an unqualified set of parties reveals no information on the secret. The target vector t does not lie in the span of the row space of the matrix M_{s_U}, which implies ∃ b ∈ F^k : t^T · b ≠ 0 and M_{s_U} · b = 0. We can scale the vector b such that t^T · b = 1. Now, given the vector x used to share the value x and a second random value x′ ∈ F, we define the vector x′ used to share x′ as x′ = x + (x′ − x) · b. Taking the inner product t^T · x′ then results in t^T · x′ = t^T · x + (x′ − x) · t^T · b = x + (x′ − x) · 1 = x′. Thus the vector x′ is indeed used to hide the secret x′. Now we show that even when an unqualified set U owns the shares of a value x′, they cannot deduce any information on the value x′ from these shares. We have that M_{s_U} · x′ = M_{s_U} · (x + (x′ − x) · b) = M_{s_U} · x + (x′ − x) · M_{s_U} · b = M_{s_U} · x + (x′ − x) · 0 = M_{s_U} · x, which shows that, given a share vector, each element of F is equally likely to be its corresponding secret, and consequently the parties of an unqualified set cannot deduce any information on the underlying secret from combining their shares. Considering the matrix M of the MSP to be the generator matrix of the underlying linear code, the parity check matrix of the code can be determined. This parity check matrix will be denoted by N and can be used to check the correctness of a vector of shares by taking the product of the parity check matrix with the vector of shares. If the vector of shares is correct, this matrix-vector product results in the zero vector. So error detection in the secret sharing scheme can be achieved using the parity check matrix of the underlying code.
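
As a toy illustration of this machinery (not a construction from the thesis), the following Python sketch instantiates a 2-out-of-3 Shamir-style sharing as an MSP over a prime field, with target vector t = (1, 0) and one row M_i = (1, i) per party; the prime and the secret are arbitrary illustrative choices:

```python
import random

p = 2_147_483_647                          # illustrative prime field F = Z_p
rows = {1: (1, 1), 2: (1, 2), 3: (1, 3)}   # row of party P_i is (1, i); target t = (1, 0)

def share(secret):
    """Sample x = (secret, r) so that t^T . x = secret; party P_i receives row_i . x."""
    r = random.randrange(p)
    return {i: (m[0] * secret + m[1] * r) % p for i, m in rows.items()}

def reconstruct(two_shares):
    """Any two parties form a qualified set: find r_Q with M_Q^T . r_Q = t (Lagrange-style)."""
    (i, si), (j, sj) = two_shares
    inv = pow(j - i, -1, p)
    li, lj = (j * inv) % p, (-i * inv) % p      # l_i*(1,i) + l_j*(1,j) = (1,0)
    return (li * si + lj * sj) % p

sh = share(42)
assert reconstruct([(1, sh[1]), (3, sh[3])]) == 42
assert reconstruct([(2, sh[2]), (3, sh[3])]) == 42
```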

2.6 Concrete MPC techniques used in this thesis

In this section, we provide background information on the various MPC techniques used in Chapter 10.

Authentication

The protocols considered in this thesis are actively secure with abort. Hence, when dealing with an active adversary, i.e. parties that can deviate from the protocol, we have to ensure that any alteration of shares will be noticed by the honest parties, so they can invoke an abort. To avoid unnoticed cheating by the corrupt parties, the MPC protocols include a method for authenticating the shares when opening them to one or to all parties. Which authentication method is applied depends on the access structure and secret-sharing scheme used in the protocol. In [SW19], the authors describe an authentication method based on properties of Q2 LSSS. More specifically, when a share is opened to one party P_i, we require each other party to send his share to P_i, who can then use the parity check matrix to check the correctness of the shares. When a share is opened to all parties, a collision-resistant hash function can be used to check the consistency of the opened value by having each party reconstruct and hash the share vector and check that the hash values of all parties are equal. In the full-threshold LSSS protocol based on additive secret sharing, the shares are authenticated using information-theoretic message authentication codes (MACs) to prevent the adversary from performing additive attacks or changing his shares during the protocol. There are two strategies for authenticating the shares in this setting, which are referred to as the SPDZ-style MAC [DPSZ12] and the BDOZ-style MAC [BDOZ11].

The SPDZ-style MAC has a global key α, which is called the MAC key. This MAC key α is a random value hidden from all parties and at the same time held by all parties as an additive sharing, so each party P_i has a share α^(i) such that α = ∑_{i=1}^{n} α^(i). The SPDZ-style MAC mitigates additive attacks on a shared value ⟦x⟧ by producing a sharing of ⟦α · x⟧, often called γ(x), such that γ(x) = ∑_{i=1}^{n} γ^(i)(x). Hence the authenticated additive shares consist of ⟦x⟧ := (x^(1), . . . , x^(n), γ^(1)(x), . . . , γ^(n)(x), α^(1), . . . , α^(n)). With these shares, the following equation can be computed to check the consistency of the shares: ⟦α · x⟧ − ⟦α⟧ · ⟦x⟧ = 0. This procedure is called the MAC check. In practice, the MAC check will validate multiple values at once by letting the parties agree on random values and using these random values to compute a random linear combination of the shares of the values and of the MACs on these values. The MAC check is then performed on these random linear combinations instead of on each value separately. In the original SPDZ protocol, the global MAC key was revealed during the MAC check. However, in [DKL+13], the authors showed how to check the MAC values without revealing the global MAC key, which allows the parties to continue the computation after the MAC check and to use the unopened shares in following computations. To open a value to all parties, the parties use the MAC check to ensure the consistency of the shares and only open the value if the check passes. To open a value to one party P_i, the parties use a shared random value that is only known to P_i to mask the value that needs to be opened to P_i, perform the MAC check on the masked value and open it if the check passes. As P_i knows the random value, he can undo the mask and learn the correct value. For more details we refer the reader to [DPSZ12] and [DKL+13]. This type of MAC is used in Chapter 7 of this thesis.
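
A minimal Python sketch of the idea behind the SPDZ-style MAC check, run honestly in one process (the commitment and broadcast steps of the real protocol are omitted; the modulus and party count are illustrative):

```python
import random

p, n = 2_147_483_647, 3                     # illustrative prime modulus and party count

def additive_share(v):
    shares = [random.randrange(p) for _ in range(n - 1)]
    shares.append((v - sum(shares)) % p)
    return shares

alpha = random.randrange(p)                 # global MAC key, only ever held as shares
alpha_sh = additive_share(alpha)

x = 42
x_sh = additive_share(x)                    # shares of the value x
gamma_sh = additive_share((alpha * x) % p)  # shares of gamma(x) = alpha * x

# After opening x, each party P_i contributes z_i = gamma_i(x) - alpha_i * x_open;
# the MAC check passes when the z_i sum to zero.
x_open = sum(x_sh) % p
z = [(g - a * x_open) % p for g, a in zip(gamma_sh, alpha_sh)]
assert sum(z) % p == 0
```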

In the BDOZ-style MAC, a key K consists of two random numbers (α, β) ∈ Z_p^2 and the authentication on a share a ∈ Z_p is computed as MAC_K(a) = α · a + β mod p. The BDOZ-style MAC is a pairwise MAC, as we assume that one party will hold a share a and the MAC value MAC_K(a) and another party will hold the MAC key K. Keeping the α-part of the MAC key between two parties constant allows generating a MAC on a linear combination of the shares by computing the same linear operation on the MAC values. This ensures that if a party P_i corrupts one of its input values or a linear combination of these, he can only guess a correct MAC for this altered share with probability at most 1/p. Given a value a ∈ Z_p shared between n parties such that each party P_i holds a share a_i and ∑_{i=1}^{n} a_i = a, then in addition to holding the share a_i, party P_i will also hold MAC keys for the shares of the other parties, K^i_{a_j} with j ≠ i, and the MAC values of its own share, MAC_{K^j_{a_i}}(a_i) for j ≠ i. The authenticated sharing in BDOZ style is therefore represented as

⟦a⟧ = ( a_i, {K^i_{a_j}, MAC_{K^j_{a_i}}(a_i)}_{j=1, j≠i}^{n} )_{i=1}^{n}.

This structure of pairwise MACs allows each party to check the shares of the other parties as follows. Assume party P_i shares his share a_i with party P_j; then, in order to prove his share is consistent, he sends a_i and MAC_{K^j_{a_i}}(a_i) to P_j. P_j then uses the corresponding MAC key K^j_{a_i} = (α^j_{a_i}, β^j_{a_i}) to ensure that MAC_{K^j_{a_i}}(a_i) = α^j_{a_i} · a_i + β^j_{a_i} mod p. Hence, if the parties decide to make a secret-shared value public, each party can check the shares of all the other parties using the MAC values and the keys. As such, the consistency of the shares can be checked and one of the parties can initiate an abort if needed. This pairwise BDOZ-style MAC will be used in Chapter 10.
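
A minimal Python sketch of the pairwise BDOZ-style MAC and its linear homomorphism, with P_j holding the keys and P_i holding the shares and MAC values (modulus and coefficients are illustrative):

```python
import random

p = 2_147_483_647                         # illustrative prime modulus

def keygen(alpha):
    """BDOZ key: the alpha-part is fixed per party pair, beta is fresh per share."""
    return (alpha, random.randrange(p))

def mac(key, a):
    alpha, beta = key
    return (alpha * a + beta) % p

alpha_j = random.randrange(p)             # P_j's fixed alpha towards P_i
a1, a2 = random.randrange(p), random.randrange(p)
k1, k2 = keygen(alpha_j), keygen(alpha_j)
m1, m2 = mac(k1, a1), mac(k2, a2)         # held by P_i together with a1, a2

# A linear combination c1*a1 + c2*a2 is authenticated by the same linear
# combination of the MAC values, under the key (alpha_j, c1*beta1 + c2*beta2).
c1, c2 = 3, 5
a_lin = (c1 * a1 + c2 * a2) % p
m_lin = (c1 * m1 + c2 * m2) % p
k_lin = (alpha_j, (c1 * k1[1] + c2 * k2[1]) % p)
assert m_lin == mac(k_lin, a_lin)
```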

Garbled circuit

The garbled circuit protocol used in Chapter 10 is the protocol with fully authenticated bit sharings described in [HSS17], which, as mentioned in the introduction, entails multi-party garbling of boolean circuits. The high-level idea of a garbled circuit for two parties is that, after the two parties agree on the circuit they want to evaluate, one party, called the garbler, provides a randomised version of the circuit which has his own input values hard-wired. The randomised circuit is then passed to the other party, called the evaluator, who evaluates the circuit on his own inputs. Afterwards, the garbler and evaluator together perform a protocol to decode the final output. As can be directly deduced from this description, the number of rounds of communication needed does not depend on the circuit one wants to evaluate but is constant in the garbled circuit approach. In general, a garbled circuit is called secure if the garbler and evaluator only learn the output of the circuit and what they can infer from this output and their own inputs, which coincides with the security notion of LSSS-based MPC.

The generation of the garbled circuit, i.e. the randomisation of the circuit by the garbler, has the following high-level structure. A binary circuit consists of gates and wires, by which we refer to all connections exiting one gate and entering another, as well as all circuit inputs and outputs. Each wire is assigned two keys, corresponding to the zero bit value, k_0, and the one bit value, k_1, and a mask, called the wire mask. For each boolean gate the garbler determines the keys and mask for the output wire based on the keys and masks for the input wires using an encryption scheme. To achieve free XOR evaluation, i.e. to ensure the keys and wire mask of the output wire of an XOR gate can be computed from the keys and wire masks of the input wires without using encryption, a fixed value R is chosen such that k_1 = k_0 + R. The first protocol to achieve multi-party garbling was presented by Beaver et al. in [BMR90]. In this multi-party setting, each party operates as both the garbler and the evaluator. Each party hence creates a circuit for which it knows all the wire keys, but the encryptions determining the gates are performed with the wire keys of all parties. As such, each party has to evaluate all circuits in parallel to learn all the keys needed for a decryption. The first garbling with active security was achieved by Lindell et al. [LPSY15]. After this work, many optimisations and improvements using a similar approach followed [WRK17, HSS17, KY18]. We will focus on [HSS17] as that is the protocol used in Chapter 10.
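
A minimal Python sketch of the free-XOR idea in a two-party flavour, with 128-bit keys and the global offset R such that k_1 = k_0 ⊕ R (the key length is an illustrative choice, and the actual encryption of AND gates is omitted):

```python
import secrets

R = secrets.randbits(128)                 # global offset: k1 = k0 XOR R on every wire

def new_wire():
    k0 = secrets.randbits(128)
    return (k0, k0 ^ R)                   # (zero key, one key)

def xor_gate(wire_a, wire_b):
    """Free XOR: the output zero key is the XOR of the input zero keys, no encryption needed."""
    k0 = wire_a[0] ^ wire_b[0]
    return (k0, k0 ^ R)

a, b = new_wire(), new_wire()
out = xor_gate(a, b)
# The key the evaluator derives for input bits (x, y) is exactly the key encoding x XOR y:
for x in (0, 1):
    for y in (0, 1):
        assert a[x] ^ b[y] == out[x ^ y]
```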

In contrast to [WRK17, KY18], which construct garbled circuits authenticated with MACs, Hazay et al. [HSS17] construct an additively shared, unauthenticated garbled circuit. The active security in [HSS17] is achieved through the observation that an additive error introduced by an adversary would cause the other parties to abort during evaluation of the circuit. The main advantage of this approach is that an AND gate only requires a single actively secure F_2 multiplication, which reduces the preprocessing cost and makes it comparable to the preprocessing cost of non-constant-round MPC. In the general approach, correlated OT is used to compute the multiplication needed for an AND gate, and a consistency check is done to verify that the inputs to the products were correct. The garbled circuit protocol used in Chapter 10 of this thesis is one based on secret sharing with pairwise information-theoretic MACs, for which the pairwise BDOZ-style MACs are used. Choosing the MAC key to be the same as the fixed value R determining the difference between the zero and one key for the garbling reduces the costs of the protocol, as we need neither extra OTs to compute the multiplication nor the consistency check for the inputs. For more details we refer the reader to [HSS17] and SCALE-MAMBA [ACC+20].


Replicated secret sharing

In Chapter 10, we use a replicated secret sharing scheme over F_2 as an alternative to garbled circuits. As linear secret sharing is used, linear operations can be computed directly on the shares by the individual parties, and therefore the number of communication rounds depends only on the non-linear operations that need to be performed. To construct an actively secure LSSS-based MPC protocol over F_2, we combine a linear secret sharing scheme, a passively secure multiplication protocol and an authentication protocol. The access structure of the linear secret sharing scheme used in Chapter 10 depends on the threshold signature setting. We choose the replicated secret sharing scheme over F_2 such that its access structure is the same as the access structure of the secret sharing scheme over F_q.

Multiplication triples in F_2 are generated using the passively secure multiplication protocol of Maurer [Mau06]. Starting from two shared values s = ∑_{i=1}^{n} s_i and t = ∑_{i=1}^{n} t_i, this protocol performs the following steps to compute a sharing of st. First, each of the players computes all the products s_i t_j that he can compute locally and shares them using the secret sharing scheme. Assuming this results in r sharings of the value s_i t_j, the players compute r − 1 differences by fixing one sharing and locally subtracting the other sharings from it. Subsequently, the sharings of the differences are opened and, if they all open to 0, the previously fixed sharing is selected as a valid sharing of s_i t_j. If one of the opened values does not equal zero, then s_i and t_j are reconstructed and an n-out-of-n sharing of s_i t_j is defined as (s_i t_j, 0, . . . , 0). After performing these computations for each i ∈ [n] and j ∈ [n], the parties locally compute the sum of their shares of s_i t_j for all i ∈ [n] and j ∈ [n] and obtain a share of st as required. If the parties correctly execute the steps, a passively secure sharing of the product is constructed.

However, the goal is to obtain actively secure multiplication triples over F_2. This is achieved using the cut-and-bucket as well as the check-bucket steps of Protocol 3.1 of [ABF+17]. In this protocol, the generated triples are divided into subsets. The triples of each of these subsets are again divided into smaller subsets and these smaller subsets are then randomly permuted. The order of the larger subsets is also changed according to a random permutation. In each smaller subset, several triples are opened and checked, and only if all the checks pass does the protocol output the remaining triples of this subset. The remaining triples of the subsets that passed the checks are reordered again into buckets. In each bucket, all but one triple are used to validate that one triple; if this final check passes for all buckets, then the protocol results in a vector of actively secure triples whose length is the number of buckets used in the protocol. This final check passes only if all triples in the bucket are valid or if all triples in the bucket are invalid, and the latter case, due to the sequence of random permutations and reshuffling, can only occur with negligible probability. For more details and the sizes of the subsets and buckets, we refer the reader to [ABF+17].

The actively secure triples obtained as described above are then used in a standard LSSS online phase, which uses one round of communication per multiplication in a passively secure multiplication protocol. This passively secure multiplication protocol is made actively secure by adding checks based on parity check matrices and collision-resistant hash functions as described in [SW19]. The parity check matrix is used to check the consistency of the shares when a value is opened to a single party P_i. In this case, all other parties send their shares to P_i, P_i multiplies the parity check matrix with the vector of shares and checks if the result is zero. If the multiplication of the parity check matrix and the vector of shares does not result in zero, party P_i initiates an abort. The collision-resistant hash function is used to authenticate the shares when a value is opened to all parties. The parties then first engage in a round of communication such that each party receives the shares it minimally needs to reconstruct the secret. Since the sharing is done with a replicated secret sharing scheme, this does not require a party to communicate with all other parties. After receiving the necessary shares, each party computes the secret based on its obtained shares. At this point, there is no guarantee that each party has the same secret value. In order to check that each party ends up with the same secret value, each party computes the full vector of shares based on its shares and updates the hash function with this vector of shares. Every so often the parties broadcast the resulting hash value and check that their hash value corresponds with the hash value of all other parties. If the hash values of all parties are equal, the shares are authenticated and each party computed the same secret value; if not, the parties abort the protocol. For more details and the proof showing that these computations authenticate the shares, we refer the reader to [SW19]. As mentioned before, since a replicated secret sharing scheme is used, a party does not have to communicate with all other parties in order to obtain sufficient shares to reconstruct the full share vector. Hence, to minimise the communication cost in our implementation of the above described protocol, we make use of the communication setup described in [KRSW18].

2.7 Doubly authenticated bits (daBits)

MPC has evolved into two different computation strategies. Linear secret sharing schemes operate most efficiently when used over finite fields F_p or a finite ring, expressing the function to be computed as an arithmetic circuit. Garbled circuits work most efficiently when operating on bits and expressing the function as a boolean combinatorial circuit. The function one wants to evaluate largely determines which of the two strategies will be best suited to achieve the required operations, but of course the following question arises: can we evaluate functions more efficiently by combining both techniques?

The first works to combine LSSS and GC were tailored to specific choices of LSSS and GC protocols. However, in 2019 Rotaru and Wood introduced daBits [RW19], which are bits that are shared and authenticated in both the LSSS and the GC world and allow transferring data from one world to the other. A daBit of the bit b will be denoted as (⟦b⟧_p, ⟦b⟧_{2^l}), where ⟦b⟧_p denotes the sharing in F_p and ⟦b⟧_{2^l} denotes the sharing in F_{2^l}. As the daBits from [RW19] are no longer restricted to specific LSSS or GC protocols, they are applicable in general actively secure n-party computations, and since the MPC protocols are used as a black box, there is no restriction on the underlying access structure of the protocol. DaBits allow the parties to switch between secret sharing and garbled circuits midway through the computation and hence enable the parties to split the total computation into subparts and choose for each of these subparts the most efficient field to evaluate it in. Protocols consisting partially of secret sharing operations and partially of garbled circuits are called mixed protocols. It should be clear that mixed protocols are only efficient if switching computational strategy, performing the operation and switching back is cheaper than performing the operation in the non-optimal computational setting and thereby avoiding the switch, perhaps with the aid of tailored preprocessing.

In order to achieve an actively secure mixed protocol, the secrets must be authenticated at all times. Hence one needs a strategy to keep the secrets authenticated through the conversions. In [RW19], switching between LSSS and GC is achieved through bits, generated in a preprocessing phase, which are authenticated both in the LSSS F_p-world and in the GC F_{2^l}-world. Switching from an element x ∈ F_p to input bits for a garbled circuit is done by constructing a random value r ∈ F_p using daBits such that this value is shared and authenticated in both F_p and F_{2^l}. The parties then compute the value x − r in F_p and open it; subsequently this public value is bit-decomposed and these bits can be used as inputs to a garbled circuit. The first task of the garbled circuit is to subtract r using the shares of r in F_{2^l} and to compute the result of this subtraction modulo p. The garbled circuit approach used in both [RW19] and the follow-up work [AOR+19], which will also be used in Chapter 10, is the technique of Beaver et al. [BMR90], often referred to as BMR. The traditional output of this multi-party garbled circuit consists of one or more keys and the public signal bits, which are the actual boolean outputs XORed with the circuit output wire masks. These circuit output wire masks are secret-shared values that conceal the output; they are traditionally revealed after garbling to allow the parties to compute the final outputs at the end of the garbled circuit protocol. In mixed protocols, the goal is to end up with a sharing of the result of the garbled circuit; hence one simply does not reveal the output wire masks and uses an MPC protocol to compute, for each output bit, the XOR of the secret-shared output mask with the public signal bit so as to obtain a secret share of that output bit. However, we need the output shares to live in F_p, thus we need to compute this XOR operation as an arithmetic circuit. If we assume b to be the public signal bit and ⟦w⟧_p to be the daBit output wire mask, the XOR of these two values in F_p is computed as b + ⟦w⟧_p − 2 · b · ⟦w⟧_p. These shares of the output bit can then be used to reconstruct the shares in F_p or remain as bits if desired. In [RW19], it is shown that even if the garbling procedure cannot be adapted to have daBits as output wire masks, one can construct an additional layer to the garbled circuit that allows converting output wires with masks only in F_{2^l} to output wires with daBit masks in both F_{2^l} and F_p, while keeping the values on the wire intact. This additional layer consists only of XOR operations; hence the whole conversion from GC outputs to LSSS shares can be done through local operations.
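
A minimal Python sketch of this arithmetic XOR on additive shares in F_p, for a public signal bit b and a shared wire mask w (prime, party count and values are illustrative):

```python
import random

p, n = 2_147_483_647, 3

def additive_share(v):
    shares = [random.randrange(p) for _ in range(n - 1)]
    shares.append((v - sum(shares)) % p)
    return shares

def xor_with_public_bit(b, w_shares):
    """Shares of b XOR w computed as b + w - 2*b*w in F_p; only local operations,
       the public constant b is added to a single share."""
    out = [((1 - 2 * b) * wi) % p for wi in w_shares]
    out[0] = (out[0] + b) % p
    return out

w = 1                                     # secret output wire mask, shared in F_p
actual_output = 0
b = actual_output ^ w                     # public signal bit
out_shares = xor_with_public_bit(b, additive_share(w))
assert sum(out_shares) % p == actual_output
```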

DaBits are only used in the conversion of the shares. For the arithmetic operations only authenticated LSSS shares in F_p are used, and all secret values of the circuit except for the input and output wires consist of authenticated shares in F_{2^l} only. This implies one has to create only a limited number of daBits, which is convenient as the original daBit generation of [RW19] is costly. DaBits are generated by sampling a random bit and calling the preprocessing phase of both F_p and F_{2^l} on that same bit. After generating many such bit sharings, an n-party XOR is computed to generate the final daBit. As the setup deals with an active security model, one must ensure all parties input the same bit in both fields. This is checked with cut-and-choose and bucketing procedures. The cut-and-choose technique opens a random subset of secrets to make sure that the unopened bits are correct with a certain probability. Afterwards, the unopened secrets are put into buckets and in each of these buckets one secret is selected as output and all the others are used to check this one secret. The check of a bucket will only pass if either all secrets are correct or all secrets are incorrect. As the adversary does not know which secrets will end up in which bucket and additionally needs to corrupt a whole bucket to pass the test, its chance of success is very small. For more details on the generation of the daBits, we refer the reader to [RW19]. In [AOR+19], the generation algorithm for daBits is significantly improved; the new strategy to create daBits goes as follows. Consider n parties and assume we work in the full-threshold setting; for the adaptations to different access structures we refer the reader to [AOR+19]. As a first step, n/2 random bits shared in F_p are generated by the parties; then for each of these bits the n-party sharing is transformed into a 2-party sharing by having n − 2 parties send their share to one of the two dedicated parties. Assume the two shares of the bit b^{(i)} are denoted by b_1^{(i)} and b_2^{(i)}, such that b_1^{(i)} + b_2^{(i)} = b^{(i)} mod p. The chance of not wrapping around mod p is 1/p, so with high probability it holds that (b_1^{(i)} mod 2) ⊕ (b_2^{(i)} mod 2) ⊕ (p mod 2) = b^{(i)}. The two dedicated parties then input b_1^{(i)} mod 2 and b_2^{(i)} mod 2 to the input generation of F_2, such that all parties obtain a share of an n-party boolean sharing of this bit. After repeating this for the n/2 bits, the parties compute the XOR of all n/2 bits in F_2 and subtract the offset ((n/2)(p mod 2)) mod 2, as for each of the shared bits we have to subtract p mod 2. The parties also compute the XOR of all n/2 bits in F_p, which completes the generation of a daBit. The daBit check from [RW19] is used to ensure the parties input the right bits into the generation of the n-party F_2 shares and hence to achieve active security. Therefore, the generation of one daBit will still require generating multiple daBits in order to do the check and to ensure with a certain probability that the remaining daBit is correct. All the components and adaptations to existing protocols needed to construct a mixed circuit as described in [AOR+19] are implemented in SCALE-MAMBA. This implementation forms the starting point of the work in Chapter 10.
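
The 2-party conversion step at the heart of this generation can be illustrated with a small Python simulation (toy values and plain arithmetic only; this is not the SCALE-MAMBA implementation): a bit shared additively mod p is turned into an XOR-sharing of the same bit, up to the p mod 2 offset, exactly as in the equation above.

import random

p = 2**31 - 1                                  # an odd prime modulus
for _ in range(1000):
    b = random.randrange(2)                    # the secret bit
    b1 = random.randrange(p)                   # share of the first dedicated party
    b2 = (b - b1) % p                          # share of the second dedicated party
    # The sum b1 + b2 wraps modulo p except with probability about 1/p;
    # after a wrap, XORing in (p mod 2) recovers the bit.
    if b1 + b2 >= p:                           # the overwhelmingly likely case
        assert (b1 % 2) ^ (b2 % 2) ^ (p % 2) == b
    else:                                      # rare: no wrap-around
        assert (b1 % 2) ^ (b2 % 2) == b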

Chapter 3

State of the art homomorphic encryption schemes

This chapter is dedicated to the four homomorphic encryption schemes that are most commonly used in homomorphic applications today, namely BGV [BGV12], FV [FV12], CKKS [CKKS17], also called the HEAAN scheme, and TFHE [CGGI20]. All four schemes have LWE, ring-LWE or a variant thereof as the underlying hard problem and therefore include noise during encryption. Each ciphertext thus contains noise, and performing homomorphic operations makes this noise grow. Unfortunately, one can only retrieve the correct underlying plaintext data by decryption if the noise remains smaller than the decryption bound. As TFHE has a different underlying framework than BGV, FV and HEAAN, we first focus on these last three schemes.

BGV, FV and HEAAN are ring-LWE based homomorphic encryption schemes. These schemes have several advantages due to the algebraic structure of RLWE, namely small key sizes and faster homomorphic operations achieved through a number of optimisations. BGV and FV are very similar and can be optimised using the same techniques, which are explained in Section 3.3. HEAAN originated from a different design goal; nevertheless many of the optimisations of BGV and FV are applicable to it, even though they are sometimes achieved differently. Due to these optimisations, the computational overhead of the RLWE schemes is only polylogarithmic in the security parameter, as shown in [GHS12b]. The ciphertext spaces consist of polynomials from R with coefficients reduced modulo an integer q with the operation [a]_q as defined in Section 2.1, which will be denoted R_q. For BGV and FV the plaintext space is a similar space R_t and it should hold that t ≪ q. In HEAAN the plaintext space is R with magnitude bounded by the ciphertext modulus q to avoid overflow of the underlying plaintext data during homomorphic operations. In order to define the schemes, we need several distributions over the polynomial rings. The notation U_q will denote the uniform distribution on R_q, in which all coefficients of the polynomial are sampled uniformly in Z ∩ [−q/2, q/2). The distributions χ_key and χ_err on R denote respectively the distribution of the secret key and the distribution of the error added during encryption of the scheme. These distributions are bounded by B_key and B_err respectively, thus ∀a ← χ_key : ‖a‖_∞ ≤ B_key, and similarly for the errors. For RLWE schemes, the error distribution χ_err corresponds to a discrete centered Gaussian distribution with standard deviation σ_err truncated at B_err, with a large enough B_err (e.g. B_err = 6σ_err). As shown in [LPR10], the LWE problem remains hard if one samples the secret keys from the error distribution χ_err. However, a different key distribution can be used for efficiency reasons. As the noise growth often depends on the distribution of the secret key, choosing the key distribution such that the coefficients of the key are sampled uniformly at random from the set {−1, 0, 1} reduces the noise growth. This choice for χ_key can, however, potentially introduce security weaknesses and is therefore an active research topic discussed in several recent papers [SC19, CHHS19, CP19].

We define the following two functions to control the noise growth of a homomorphic multiplication in BGV and FV. For any polynomial a ∈ R, fixed base ω ≥ 2 and l_{ω,q} = ⌊log_ω(q)⌋ + 1, we define a function D_{ω,q} that decomposes a number with respect to the base ω and a function P_{ω,q} that combines the powers of ω of the decomposition with the corresponding coefficients of the decomposition to recompose the decomposed element.

D_{ω,q}(a) = ( [a]_ω , [⌊a/ω⌉]_ω , . . . , [⌊a/ω^{l_{ω,q}−1}⌉]_ω ) ∈ R_ω^{l_{ω,q}}

P_{ω,q}(a) = ( [a]_q , [aω]_q , . . . , [aω^{l_{ω,q}−1}]_q ) ∈ R_q^{l_{ω,q}}

Given these two functions we have the following lemma based on Lemma 2 of [BGV12].

Lemma 1. ∀(a, b) ∈ R_q^2 : ⟨D_{ω,q}(a), P_{ω,q}(b)⟩ ≡ a · b mod q.
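
The decomposition lemma is easy to check on integers, which stand in here for ring elements. The following minimal Python sketch (using plain base-ω digits instead of the centred digits above; all parameter values are illustrative assumptions) verifies that ⟨D_{ω,q}(a), P_{ω,q}(b)⟩ ≡ a · b mod q.

q, omega = 2**20 + 7, 2**4
l = 1
while omega**l < q:
    l += 1                                   # l = floor(log_omega(q)) + 1

def D(a):
    """Decompose a into l base-omega digits (least significant first)."""
    return [(a // omega**i) % omega for i in range(l)]

def P(b):
    """Powers-of-omega gadget applied to b, reduced modulo q."""
    return [(b * omega**i) % q for i in range(l)]

a, b = 123456, 987654
inner = sum(d * p for d, p in zip(D(a), P(b))) % q
assert inner == (a * b) % q                  # Lemma 1: <D(a), P(b)> = a*b mod q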

3.1 BGV

The scheme presented by Brakerski, Gentry and Vaikuntanathan [BGV12] in 2011 is a levelled homomorphic encryption scheme. It is capable of computing an arbitrary function if the function is known up front, such that one is able to select parameters that support the operations needed to compute this function. This levelled scheme manages to control the noise growth by introducing a modulus switching procedure. Modulus switching results in a change of ciphertext modulus, so instead of having one fixed ciphertext modulus, we need a decreasing chain of ciphertext modulus values. We describe this scheme in the RLWE setting.

In order to describe the scheme we need some parameters, whose values in practice will depend on the chosen security level λ and the targeted application, as explained later in Section 4.4. The parameters needed for BGV are a cyclotomic index m to define the cyclotomic polynomial of the ring, a plaintext modulus t, a decreasing chain of ciphertext moduli q_L > q_{L−1} > . . . > q_1 > q_0 such that t and q_L are coprime and ∀i, j ∈ {0, . . . , L} : q_i ≡ q_j mod t, probability distributions over R for the key and the error, χ_key and χ_err, and a decomposition base ω.

Key generation

BGV.SecretKeyGen(1^λ): Sample sk = s ← χ_key over R and output sk.

BGV.PublicKeyGen(sk): Sample a ← U_{q_L} and e ← χ_err and output the public key

pk = ( [a · s + te]_{q_L} , −a ) ∈ R_{q_L}^2.

BGV.RelinKeyGen(sk): Sample a vector \vec{a} ∈ U_{q_L}^{l_{ω,q_L}} and a vector \vec{e} ∈ χ_err^{l_{ω,q_L}} and output

\vec{rlk} = ( [P_{ω,q_L}(s^2) + \vec{a} · s + t\vec{e}]_{q_L} , −\vec{a} ) ∈ R_{q_L}^{l_{ω,q_L}} × R_{q_L}^{l_{ω,q_L}}.

Encryption and decryption

BGV.Encrypt(pk, pt): To encrypt a plaintext pt = m ∈ R_t, sample u ← χ_key and e_0, e_1 ← χ_err and determine the encryption of m under the public key pk as

ct = ( [[m]_t + u · pk_0 + te_0]_{q_L} , [u · pk_1 + te_1]_{q_L} , L ).

Since we have a chain of ciphertext moduli, each ciphertext will be paired with a number i ∈ [0, L] that indicates the level of the encryption and hence the modulus q_i with respect to which the coefficients of the ciphertext polynomials should be reduced. A level L ciphertext ct = (c_0, c_1, L) ∈ R_{q_L}^2 × [0, L] can be seen as a degree 1 polynomial with coefficients in R_{q_L}, and evaluating this ciphertext in the secret key reveals the message and noise inherent to the encryption. The following equations hold modulo q_L:

c_0 + c_1 · s = [m]_t + u · pk_0 + te_0 + (u · pk_1 + te_1) · s
= [m]_t + u · (a · s + te) + te_0 + (−u · a + te_1) · s
= [m]_t + t(u · e + e_1 · s + e_0)

If we now define v = u · e + e_1 · s + e_0 as the overall noise term of this level L encryption, we have the equation

c_0 + c_1 · s ≡ [m]_t + tv mod q_L.

BGV.Decrypt(sk, ct): To decrypt a ciphertext ct = (c_0, c_1, i) ∈ R_{q_i}^2 × [0, L] at level i, one computes m′ = [c_0 + c_1 · s]_{q_i} and subsequently reduces the coefficients of the result m′ modulo t with the centred reduction, which outputs [m′]_t.

The decryption is correct as long as [m]_t + tv does not cause a wrap-around modulo q_i, thus ‖[m]_t + tv‖_∞ < q_i/2. To get a bound on ‖v‖_∞ we set the bound a bit tighter and assume ‖[m]_t‖_∞ + ‖tv‖_∞ < q_i/2, and thus ‖v‖_∞ < q_i/(2t) − 1/2, which implies ‖v‖_∞ < ⌊q_i/(2t)⌋. As v = u · e + e_1 · s + e_0 with s, u ← χ_key and e, e_0, e_1 ← χ_err, we need to choose the smallest modulus q_0 of the modulus chain large enough to ensure that ‖v‖_∞ ≤ 2δ_R B_key B_err + B_err < ⌊q_0/(2t)⌋, with δ_R the expansion factor of the polynomial ring R.
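
The key generation, encryption and decryption equations above can be exercised with a toy implementation. The following Python sketch is a minimal, insecure illustration (tiny parameters, ternary key and error distributions, a single level and no modulus chain); it only mirrors the textbook equations and is not a deployable BGV implementation.

import numpy as np

n, t, q = 16, 7, 2**20 + 7        # toy ring degree, plaintext and ciphertext moduli

def ring_mul(a, b, modulus):
    """Multiply two polynomials modulo X^n + 1 and the given modulus."""
    full = np.convolve(a, b)
    res = np.zeros(n, dtype=np.int64)
    for i, c in enumerate(full):
        res[i % n] += c if i < n else -c              # X^n = -1
    return res % modulus

def centred(a, modulus):
    """Centred reduction [.]_modulus."""
    a = np.asarray(a, dtype=np.int64) % modulus
    return np.where(a > modulus // 2, a - modulus, a)

def small_poly():
    return np.random.randint(-1, 2, n).astype(np.int64)   # ternary key/error polynomials

def keygen():
    s = small_poly()
    a = np.random.randint(0, q, n, dtype=np.int64)
    pk = ((ring_mul(a, s, q) + t * small_poly()) % q, (-a) % q)   # ([a*s + t*e]_q, -a)
    return s, pk

def encrypt(pk, m):
    u, e0, e1 = small_poly(), small_poly(), small_poly()
    c0 = (centred(m, t) + ring_mul(u, pk[0], q) + t * e0) % q
    c1 = (ring_mul(u, pk[1], q) + t * e1) % q
    return c0, c1

def decrypt(sk, ct):
    # [c0 + c1*s]_q = [m]_t + t*v for small noise v, so reducing mod t recovers m.
    return centred((ct[0] + ring_mul(ct[1], sk, q)) % q, q) % t

sk, pk = keygen()
m = np.random.randint(0, t, n)
assert np.all(decrypt(sk, encrypt(pk, m)) == m)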

Modulus switching The modulus switching technique uses the moduli of the modulus chain to reduce the noise in a level j > 0 ciphertext, when it is growing near the bound ⌊q_j/(2t)⌋, by scaling down to a modulus lower in the modulus chain. Scaling down the ciphertext by q_i/q_j for i < j scales the noise down by roughly the same factor. The goal is to perform the rescaling such that the rescaled ciphertext ct′ equals the closest R-vector to (q_i/q_j) ct such that ct′ ≡ ct mod t.

BGV.ModSwitch(ct, j, i): for i < j, set

c̃t = ( t · [−c_0/t]_{q_j/q_i} , t · [−c_1/t]_{q_j/q_i} , i ),

so that for l = 0, 1 we have c̃t_l ≡ 0 mod t and c̃t_l ≡ −c_l mod q_j/q_i. This enables us to compute and output

ct′ = (q_i/q_j) · ( c_0 + c̃t_0 , c_1 + c̃t_1 ).

The division by q_j/q_i is exact because we determined c̃t such that for l = 0, 1, c_l + c̃t_l ≡ 0 mod q_j/q_i.

As we want ct′ = (c′_0, c′_1, i) to be a level i encryption of m, we need [c′_0 + c′_1 · s]_{q_i} ≡ [c_0 + c_1 · s]_{q_j} mod t. We have

c′_0 + c′_1 · s = (q_i/q_j)(c_0 + c_1 · s + c̃t_0 + c̃t_1 · s).

Take k ∈ R such that [c_0 + c_1 · s]_{q_j} = c_0 + c_1 · s − q_j k, then

c′_0 + c′_1 · s = (q_i/q_j)([c_0 + c_1 · s]_{q_j} + q_j k + c̃t_0 + c̃t_1 · s)
⇔ c′_0 + c′_1 · s − q_i k = (q_i/q_j)([c_0 + c_1 · s]_{q_j} + c̃t_0 + c̃t_1 · s).

By definition of c̃t_l for l = 0, 1, it holds that ‖c̃t_l‖_∞ < t q_j/(2 q_i), which implies ‖c̃t_0 + c̃t_1 · s‖_∞ ≤ (t q_j/(2 q_i))(1 + δ_R B_key) and ‖[c_0 + c_1 · s]_{q_j}‖_∞ < q_j/2. If we now restrict ‖[c_0 + c_1 · s]_{q_j}‖_∞ to be smaller than q_j/2 − (t q_j/(2 q_i))(1 + δ_R B_key), then

‖c′_0 + c′_1 · s − q_i k‖_∞ < (q_i/q_j)( q_j/2 − (t q_j/(2 q_i))(1 + δ_R B_key) + (t q_j/(2 q_i))(1 + δ_R B_key) ) = q_i/2.

Because we defined the modulus chain such that q_j ≡ q_i mod t and c̃t is defined such that for l = 0, 1 : c̃t_l ≡ 0 mod t, we have

[c′_0 + c′_1 · s]_{q_i} ≡ (q_i/q_j)([c_0 + c_1 · s]_{q_j} + c̃t_0 + c̃t_1 · s) mod t
≡ [c_0 + c_1 · s]_{q_j} mod t.

If the noise of the original ciphertext ct is v, the noise v′ of the rescaled ciphertext ct′ equals v′ = (q_i/q_j) v + v_scale with ‖v_scale‖_∞ = ‖(q_i/q_j)(c̃t_0 + c̃t_1 · s)‖_∞ ≤ t(1 + δ_R B_key)/2. Therefore, as long as j > 0 and the ciphertext satisfies ‖[c_0 + c_1 · s]_{q_j}‖_∞ < q_j/2 − (t q_j/(2 q_i))(1 + δ_R B_key), one can reduce the noise of this ciphertext by converting to a lower level i < j. A ciphertext of level 0 cannot be rescaled to reduce the noise. As such, one has to ensure that at this point no more computations need to happen, because performing operations on a ciphertext of level 0 may lead to noise overflow.

Modulus switching does not only reduce the noise, it also reduces the ciphertext modulus by the same amount. Therefore the ratio of the noise to the modulus size has not decreased at all. However, it is not only this noise ratio that decides how many homomorphic operations can be performed; the absolute magnitude of the noise also plays an important role. Assume a modulus q = B^k and two ciphertexts with noise magnitude B; multiplying the ciphertexts makes the noise of the result grow to B^2. If we subsequently use two ciphertexts with noise magnitude B^2 as inputs to a multiplication, the result has noise magnitude B^4, and multiplying two ciphertexts with noise magnitude B^4 results in noise magnitude B^8. Hence the ratio between the modulus q and the noise level of a ciphertext shrinks by an ever larger factor with each subsequent multiplication. Nevertheless, if we choose the chain of moduli such that q_i ≈ q/B^i for i < k and perform a modulus switch after each multiplication, we reduce the absolute error after each multiplication back down to B. As such, the ratio of q to the noise level of the ciphertext decreases by the same factor with each multiplication. Therefore, modulus switching allows one to perform k levels of multiplication instead of log k before reaching the modulus q.
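
The effect can be made concrete with a small numeric simulation; the numbers below are illustrative assumptions only, not a real noise analysis.

B, k = 2**10, 20
q = B**k

# Without modulus switching: the noise squares with every multiplication.
noise, levels = B, 0
while noise < q // 2:
    noise = noise**2
    levels += 1
print("levels without switching:", levels)   # roughly log2(k)

# With modulus switching: after each multiplication the ciphertext is rescaled
# by B, so the noise returns to about B while the modulus drops by one level.
noise, modulus, levels = B, q, 0
while noise < modulus // 2:
    noise = noise**2 // B        # multiply, then switch down by a factor B
    modulus //= B
    levels += 1
print("levels with switching:", levels)      # roughly k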

Addition To be able to add two ciphertexts, we first need to ensure they are at the same level. So, if necessary, one first applies modulus switching to reduce the ciphertext with the highest level to the level of the other ciphertext. The actual addition of two ciphertexts of the same level is done componentwise.

BGV.Add(ct, ct′, i): The addition of ct = (c_0, c_1, i) and ct′ = (c′_0, c′_1, i), ciphertexts of level i, equals ct_add = ( [c_0 + c′_0]_{q_i} , [c_1 + c′_1]_{q_i} , i ).

If ciphertext ct encrypts message m with noise v and ciphertext ct′ encrypts message m′ with noise v′, then the noise of the addition becomes:

(c_0 + c′_0) + (c_1 + c′_1) · s = c_0 + c_1 · s + c′_0 + c′_1 · s
≡ [m]_t + [m′]_t + t(v + v′) mod q_i
≡ [m + m′]_t + t(v + v′ + r_m) mod q_i

with [m]_t + [m′]_t = [m + m′]_t + t r_m and ‖r_m‖_∞ ≤ (‖[m]_t‖_∞ + ‖[m′]_t‖_∞ − ‖[m + m′]_t‖_∞)/t ≤ (t/2 + t/2 − t/2) · (1/t) ≤ 1. Thus the noise of ct_add, the resulting level i encryption of m + m′, equals v_add = v + v′ + r_m with ‖v_add‖_∞ ≤ ‖v‖_∞ + ‖v′‖_∞ + 1.

Multiplication We can only multiply two ciphertexts if they are at the same level i, hence the first step is to use modulus switching to convert the ciphertexts to the same level. Multiplying the ciphertexts (c_0, c_1, i) and (c′_0, c′_1, i) of level i then gives the following result:

(c_0 + c_1 · s) · (c′_0 + c′_1 · s) = c_0 · c′_0 + (c_0 · c′_1 + c_1 · c′_0) · s + c_1 · c′_1 · s^2
≡ [m]_t · [m′]_t + t([m]_t · v′ + v · [m′]_t + t v · v′) mod q_i
≡ [m · m′]_t + t([m]_t · v′ + v · [m′]_t + t v · v′ + r_m) mod q_i

with [m]_t · [m′]_t = [m · m′]_t + t r_m. Thus the result of the multiplication is a degree 2 ciphertext ct_mult = ( [c_0 · c′_0]_{q_i} , [c_0 · c′_1 + c′_0 · c_1]_{q_i} , [c_1 · c′_1]_{q_i} ) ∈ R_{q_i}^3 that encrypts m · m′ with noise v_mult = [m]_t · v′ + v · [m′]_t + t v · v′ + r_m.

BGV.Mul(ct, ct′, i): Multiplying ct = (c_0, c_1, i) and ct′ = (c′_0, c′_1, i), ciphertexts of level i, gives ct_mult = ( [c_0 · c′_0]_{q_i} , [c_0 · c′_1 + c_1 · c′_0]_{q_i} , [c_1 · c′_1]_{q_i} , i ).

We can bound ‖r_m‖_∞ as

‖r_m‖_∞ = ‖([m]_t · [m′]_t − [m · m′]_t)/t‖_∞ ≤ (δ_R t^2 + 2t)/(4t) ≤ (δ_R t + 2)/4 < δ_R t/2,

as t > 1 and thus 2/4 ≤ δ_R t/4. Therefore, the noise of the multiplication is bounded as follows:

‖v_mult‖_∞ < (δ_R t/2)(‖v‖_∞ + ‖v′‖_∞ + 2‖v‖_∞‖v′‖_∞ + 1)

The noise can be reduced using the modulus switching technique described above. As the multiplication results in an element of R_{q_i}^3, we still need to deal with the ciphertext expansion. This is solved with an algorithm called relinearisation, described below.

Relinearisation The multiplication operation leads to an increase in the number of ciphertext components, which results in a larger time and memory complexity for operations following multiplications if we do not first reduce the number of ciphertext components back to two. To avoid this complexity increase, we use a relinearisation technique introduced in [BV11]. If we denote the result of a multiplication, hence a degree 2 ciphertext of level i, by ct = (c_0, c_1, c_2, i) encrypting message m, then the following equation holds:

c_0 + c_1 · s + c_2 · s^2 ≡ [m]_t + tv mod q_i.

Now we want to construct a ciphertext ct′ = (c′_0, c′_1, i) consisting of only two components but still decrypting to the same message as ct. Therefore, we would like to include the term c_2 · s^2 in ct′. However, s^2 cannot be published as it would reveal the secret key s. We solve this problem by encrypting s^2 with the public key pk. As pk is derived from the secret key s, we make the assumption that encrypting a secret key with its corresponding public key does not reveal anything about the secret key and hence does not decrease the security of the scheme. This is called the circular security assumption, which was introduced by Camenisch and Lysyanskaya in [CL01]. Usage of this assumption is not limited to HE; it also occurs in other application settings. In this case the circular security assumption is used to construct the relinearisation key \vec{rlk} = ( P_{ω,q_i}(s^2) + \vec{a} · s + t\vec{e} , −\vec{a} ). With the relinearisation key one can compute:

ct′ = ( [c_0 + ⟨D_{ω,q_i}(c_2), \vec{rlk}_0⟩]_{q_i} , [c_1 + ⟨D_{ω,q_i}(c_2), \vec{rlk}_1⟩]_{q_i} , i ) ∈ R_{q_i}^2 × [0, L],

which gives us a ciphertext of two elements of level i that decrypts to m, as is shown below.

c′_0 + c′_1 · s ≡ c_0 + ⟨D_{ω,q_i}(c_2), \vec{rlk}_0⟩ + ( c_1 + ⟨D_{ω,q_i}(c_2), \vec{rlk}_1⟩ ) · s mod q_i
≡ c_0 + c_1 · s + ⟨D_{ω,q_i}(c_2), \vec{rlk}_0⟩ + ⟨D_{ω,q_i}(c_2), \vec{rlk}_1⟩ · s mod q_i
≡ c_0 + c_1 · s + ⟨D_{ω,q_i}(c_2), P_{ω,q_i}(s^2) + \vec{a} · s + t\vec{e}⟩ − ⟨D_{ω,q_i}(c_2), \vec{a}⟩ · s mod q_i
≡ c_0 + c_1 · s + c_2 · s^2 + ⟨D_{ω,q_i}(c_2), \vec{a} · s + t\vec{e}⟩ − ⟨D_{ω,q_i}(c_2), \vec{a}⟩ · s mod q_i
≡ c_0 + c_1 · s + c_2 · s^2 + ⟨D_{ω,q_i}(c_2), t\vec{e}⟩ mod q_i
≡ [m]_t + t(v + v′_relin) mod q_i

Thus relinearisation of a ciphertext of level i is done by computing

BGV.Relin(ct, \vec{rlk}, i):

ct_relin = ( [c_0 + ⟨D_{ω,q_i}(c_2), \vec{rlk}_0⟩]_{q_i} , [c_1 + ⟨D_{ω,q_i}(c_2), \vec{rlk}_1⟩]_{q_i} , i ).

Now we want to bound the norm of the noise v_relin of the relinearised ciphertext:

‖v_relin‖_∞ ≤ ‖v‖_∞ + ‖v′_relin‖_∞
≤ ‖v‖_∞ + ‖⟨D_{ω,q_i}(c_2), \vec{e}⟩‖_∞
≤ ‖v‖_∞ + ∑_{j=0}^{l_{ω,q_i}−1} ‖[⌊c_2/ω^j⌉]_ω · \vec{e}_j‖_∞
≤ ‖v‖_∞ + δ_R l_{ω,q_i} ω B_err / 2

By decomposing c_2 in base ω, we have a factor l_{ω,q_i} ω (≈ log q_i) instead of q_i in the second term of the noise bound, which keeps the relinearisation noise small.

Parameter selection After defining all the operations and their corresponding noise bounds, it remains to give some information on how to select the chain of ciphertext moduli to allow certain operations. The noise bound of the addition shows that adding two ciphertexts makes the noise grow linearly. As this noise growth is quite small, additions do not require a subsequent modulus switching. The noise growth during multiplication is quadratic, and multiplication is followed by a relinearisation which makes the noise grow even more. Hence a multiplication is usually followed by a modulus switching to keep the error manageable. As the noise grows most with multiplications, the multiplicative depth is often used as the indication that determines the bounds for the parameters. If we want to compute an arithmetic circuit of multiplicative depth L in BGV, we want to modulus switch L times and thus need a modulus chain with L + 1 moduli. In order to be able to perform subsequent multiplications, we need to ensure that the error after modulus switching is smaller than the error of the input ciphertexts of the previous multiplication. If we start from two ciphertexts of level i whose noise has norm bounded by B and that form the inputs to a multiplication, then after modulus switching and relinearisation the noise v_{i−1} of the resulting level i − 1 ciphertext is given by:

‖v_{i−1}‖_∞ ≤ (q_{i−1}/q_i) · (δ_R t/2)(2B^2 + 2B + 1) + ‖v_relin‖_∞ + ‖v_scale‖_∞.

To have a correct decryption of the result, we need that the ciphertext of level 0 obtained after the evaluation of the arithmetic circuit of multiplicative depth L satisfies ‖v_0‖_∞ < ⌊q_0/(2t)⌋. This gives us a bound on the ciphertext modulus q_0, with which we can subsequently determine the other moduli from ‖v_{i−1}‖_∞ ≤ B, with B the bound on the norm of the noise of a ciphertext of level i.


3.2 FV

The BGV scheme has a quadratic noise growth during multiplication, which introduces the need for noise reduction by applying modulus switching. The first scheme with linear noise growth after multiplication was introduced by Brakerski [Bra12] in 2012. This linear noise growth removed the need for modulus switching, and hence instead of the modulus chain (q_i)_{i=0,...,L} of BGV, the scheme requires just one ciphertext modulus q. As the rescaling of the ciphertext is no longer needed, such schemes are called scale-invariant schemes. The idea that provides the basis for this linear noise growth is placing the message in the upper bits instead of in the lower bits. Fan and Vercauteren created in 2012 an RLWE version of the scale-invariant scheme of Brakerski [FV12]. Due to its effective noise control technique, this FV scheme is considered more user-friendly than BGV and HEAAN.

To work with FV, one first needs to select the following parameters based on the security level λ and the targeted application: a cyclotomic index m, a plaintext and ciphertext modulus, respectively t and q, with q ≫ t and t > 1, a decomposition base ω ≥ 2 and probability distributions on R for the key and the errors, respectively χ_key and χ_err. Additionally, we define a constant ∆ = ⌊q/t⌋; given the definition of r_t(q) from Section 2.1, it holds that t∆ = q − r_t(q).

Key generation

FV.SecretKeyGen(1^λ): Sample sk = s ← χ_key over R and output sk.

FV.PublicKeyGen(sk): Sample a ← U_q and e ← χ_err and output the public key

pk = ( [a · s + e]_q , −a ) ∈ R_q^2.

FV.RelinKeyGen(sk): Sample a vector \vec{a} ∈ U_q^{l_{ω,q}} and a vector \vec{e} ∈ χ_err^{l_{ω,q}} and output

\vec{rlk} = ( [P_{ω,q}(s^2) + \vec{a} · s + \vec{e}]_q , −\vec{a} ) ∈ R_q^{l_{ω,q}} × R_q^{l_{ω,q}}.

Encryption and decryption Using the above defined keys, one can encrypt and decrypt messages as follows.

FV.Encrypt(pk, pt): A plaintext pt = m ∈ R_t is encrypted by first sampling u ← χ_key and e_0, e_1 ← χ_err and then computing

ct = ( [∆[m]_t + u · pk_0 + e_0]_q , [u · pk_1 + e_1]_q ).

The message and noise inherent in the ciphertext ct = (c_0, c_1) ∈ R_q^2 can be revealed by considering the ciphertext as a degree 1 polynomial with coefficients in R_q and evaluating this ciphertext in the secret key:

c_0 + c_1 · s ≡ ∆[m]_t + u · pk_0 + e_0 + (u · pk_1 + e_1) · s mod q
≡ ∆[m]_t + u · (a · s + e) + e_0 + (−u · a + e_1) · s mod q
≡ ∆[m]_t + (u · e + e_1 · s + e_0) mod q

By renaming the total noise of this encryption as v = u · e + e_1 · s + e_0, the equation becomes

c_0 + c_1 · s ≡ ∆[m]_t + v mod q.

To decrypt, one evaluates the ciphertext ct = (c_0, c_1) ∈ R_q^2 in the secret key, then multiplies by ∆^{−1}, which is approximated by scaling by t/q and rounding the result. Afterwards the message can be retrieved by a simple reduction of the result modulo t.

FV.Decrypt(sk, ct): The decryption of a ciphertext ct = (c_0, c_1) ∈ R_q^2 is computed as m′ = ⌊(t/q)[c_0 + c_1 · s]_q⌉; subsequently the coefficients of the result m′ are reduced modulo t with the centred reduction, which results in [m′]_t.

Let us now take a closer look at the decryption.

⌊(t/q)[c_0 + c_1 · s]_q⌉ = ⌊(t/q)(∆[m]_t + v + qz)⌉ for some z ∈ R
= ⌊((q − r_t(q))/q)[m]_t + (t/q)v⌉ + tz
= [m]_t + ⌊(tv − r_t(q)[m]_t)/q⌉ + tz

The rounded term becomes zero if

‖(tv − r_t(q)[m]_t)/q‖_∞ < 1/2 ⇔ ‖v − (r_t(q)/t)[m]_t‖_∞ < q/(2t).

Thus, to ensure correct decryption, it is required that ‖v‖_∞ < q/(2t) − r_t(q)/2. If this inequality on v holds, the message can be retrieved by taking the result of the rounding operation modulo t:

[⌊(t/q)[c_0 + c_1 · s]_q⌉]_t = [[m]_t + tz]_t = [m]_t.

Encryption results in an error v = u · e + e_1 · s + e_0 with s, u ← χ_key and e, e_0, e_1 ← χ_err, which implies ‖v‖_∞ ≤ B_err(2δ_R B_key + 1) with δ_R the expansion factor of the polynomial ring R. So for correct decryption this gives the condition B_err(2δ_R B_key + 1) < q/(2t) − r_t(q)/2, which can be used to determine q.

We can therefore conclude that a valid ciphertext ct = (c_0, c_1) ∈ R_q^2 with noise of norm smaller than ∆/2 satisfies the equation:

c_0 + c_1 · s = ∆[m]_t + v + qz for some z ∈ R.
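
As for BGV, the FV equations can be exercised with a toy implementation. The Python sketch below is a minimal, insecure illustration of key generation, encryption and decryption with tiny, purely illustrative parameters; it only mirrors the equations above.

import numpy as np

n, t, q = 16, 7, 2**30 + 3
delta = q // t                                     # Delta = floor(q/t)

def ring_mul(a, b, modulus):
    """Multiply two polynomials modulo X^n + 1 and the given modulus."""
    full = np.convolve(a, b)
    res = np.zeros(n, dtype=np.int64)
    for i, c in enumerate(full):
        res[i % n] += c if i < n else -c           # X^n = -1
    return res % modulus

def centred(a, modulus):
    a = np.asarray(a, dtype=np.int64) % modulus
    return np.where(a > modulus // 2, a - modulus, a)

def small_poly():
    return np.random.randint(-1, 2, n).astype(np.int64)

def keygen():
    s = small_poly()
    a = np.random.randint(0, q, n, dtype=np.int64)
    pk = ((ring_mul(a, s, q) + small_poly()) % q, (-a) % q)   # ([a*s + e]_q, -a)
    return s, pk

def encrypt(pk, m):
    u, e0, e1 = small_poly(), small_poly(), small_poly()
    c0 = (delta * centred(m, t) + ring_mul(u, pk[0], q) + e0) % q
    c1 = (ring_mul(u, pk[1], q) + e1) % q
    return c0, c1

def decrypt(sk, ct):
    x = centred((ct[0] + ring_mul(ct[1], sk, q)) % q, q)      # Delta*[m]_t + v
    return np.round(t * x / q).astype(np.int64) % t           # scale by t/q, round, mod t

sk, pk = keygen()
m = np.random.randint(0, t, n)
assert np.all(decrypt(sk, encrypt(pk, m)) == m)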

Addition As in BGV, the addition of two ciphertexts in FV is done componentwise.

FV.Add(ct, ct′): The addition of the ciphertexts ct = (c_0, c_1) and ct′ = (c′_0, c′_1) is given by ct_add = ( [c_0 + c′_0]_q , [c_1 + c′_1]_q ).

Assume the ciphertext ct encrypts message m with noise v and ciphertext ct′ encrypts message m′ with noise v′; then the noise of the addition is given by:

(c_0 + c′_0) + (c_1 + c′_1) · s ≡ ∆([m]_t + [m′]_t) + v + v′ mod q
≡ ∆[m + m′]_t + v + v′ + (q − r_t(q)) r_m mod q
≡ ∆[m + m′]_t + v + v′ − r_t(q) r_m mod q

with [m]_t + [m′]_t = [m + m′]_t + t r_m and ‖r_m‖_∞ ≤ (‖[m]_t‖_∞ + ‖[m′]_t‖_∞ − ‖[m + m′]_t‖_∞)/t ≤ 1. This implies that ct_add is an encryption of m + m′ with noise roughly equal to the sum of the noises of ct and ct′, v_add = v + v′ − r_t(q) r_m, and this noise is bounded by ‖v_add‖_∞ ≤ ‖v‖_∞ + ‖v′‖_∞ + r_t(q).

Multiplication Multiplying two ciphertexts similarly to the strategy of BGV results in a factor ∆^2 in front of the product of the messages. To reduce this back to a valid ciphertext that only has a factor ∆ in front of the message, the product is rescaled by t/q and rounded afterwards. Thus the multiplication of two ciphertexts ct = (c_0, c_1) and ct′ = (c′_0, c′_1) is computed as follows.

FV.Mul(ct, ct′): Multiplying ct = (c_0, c_1) and ct′ = (c′_0, c′_1) gives

ct_mult = ( [⌊(t/q) c_0 · c′_0⌉]_q , [⌊(t/q)(c_0 · c′_1 + c_1 · c′_0)⌉]_q , [⌊(t/q) c_1 · c′_1⌉]_q ).

Next we want to determine the noise bound after this homomorphic multiplication operation by checking what noise is created by each of the operations that need to be performed. Just multiplying the ciphertexts ct = (c_0, c_1) and ct′ = (c′_0, c′_1) results in:

(ct · ct′)(s) = (c_0 + c_1 · s) · (c′_0 + c′_1 · s)
= (∆[m]_t + v + qz) · (∆[m′]_t + v′ + qz′)
= ∆^2 [m]_t · [m′]_t + ∆([m]_t · v′ + v · [m′]_t) + q(v · z′ + v′ · z) + v · v′ + q∆([m]_t · z′ + [m′]_t · z) + q^2 z · z′
= ∆^2([m · m′]_t + t r_m) + ∆([m]_t · v′ + v · [m′]_t) + q(v · z′ + v′ · z) + v · v′ + q∆([m]_t · z′ + [m′]_t · z) + q^2 z · z′

with [m]_t · [m′]_t = [m · m′]_t + t r_m and, as t > 1,

‖r_m‖_∞ = ‖([m]_t · [m′]_t − [m · m′]_t)/t‖_∞ ≤ δ_R t^2/(4t) + t/(2t) ≤ δ_R t/4 + 2/4 < δ_R t/2.

The next step of the multiplication is scaling by t/q. Using ∆ · t = q − r_t(q), we get the following equation:

(t/q)(ct · ct′)(s) = ∆[m · m′]_t − (r_t(q)/q)∆[m · m′]_t + t∆ r_m − (r_t(q)/q)∆ t r_m
+ ([m]_t · v′ + v · [m′]_t) − (r_t(q)/q)([m]_t · v′ + v · [m′]_t)
+ t(v · z′ + v′ · z) + (t/q) v · v′ + t∆([m]_t · z′ + [m′]_t · z)
+ t q z · z′

= ∆[m · m′]_t + ([m]_t · v′ + v · [m′]_t) + t(v · z′ + v′ · z)
+ (t/q) v · v′ + t∆([m]_t · z′ + [m′]_t · z + r_m) + t q z · z′
− (r_t(q)/q)(∆[m]_t · [m′]_t + [m]_t · v′ + v · [m′]_t)

In the last equality we used again that [m]_t · [m′]_t = [m · m′]_t + t r_m.

Assuming we performed the multiplication on ciphertexts that satisfy the bound for correct decryption, we know that ‖v‖_∞, ‖v′‖_∞ < ∆/2. This implies v · v′ = [v]_∆ · [v′]_∆ = [v · v′]_∆ + ∆ r_v, which results in the equation

(t/q)(ct · ct′)(s) = ∆[m · m′]_t + ([m]_t · v′ + v · [m′]_t) + t(v · z′ + v′ · z)
+ (q − r_t(q))([m]_t · z′ + [m′]_t · z + r_m) + r_v + t q z · z′
+ (t/q)[v · v′]_∆ − (r_t(q)/q)(∆[m]_t · [m′]_t + [m]_t · v′ + v · [m′]_t + r_v)

As only the last line of the above equation is affected by the rounding, we set

r_r = (t/q)[v · v′]_∆ − (r_t(q)/q)(∆[m]_t · [m′]_t + [m]_t · v′ + v · [m′]_t + r_v).

We will now determine bounds for the norms of the different terms. As shown above, ‖r_m‖_∞ < δ_R t/2. Since we know ‖v‖_∞, ‖v′‖_∞ < ∆/2, we can bound max{‖v‖_∞, ‖v′‖_∞} by ∆/2; then the norm of r_v satisfies the following inequality:

‖r_v‖_∞ = ‖([v]_∆ · [v′]_∆ − [v · v′]_∆)/∆‖_∞
≤ (δ_R (∆/2) min{‖v‖_∞, ‖v′‖_∞} − ∆/2)/∆
≤ δ_R min{‖v‖_∞, ‖v′‖_∞}/2 − 1/2
< δ_R min{‖v‖_∞, ‖v′‖_∞}/2 ≤ δ_R ∆/4.

A bound for the norm of r_r is given by

‖r_r‖_∞ < (t/q)(∆/2) + (r_t(q)/q)( ∆ δ_R t^2/4 + 2 δ_R (t/2)(∆/2) + δ_R ∆/4 )
< 1/2 + r_t(q) δ_R ( t/4 + 1/2 + 1/(4t) )        because ∆t/q = 1 − r_t(q)/q ≤ 1
< 1/2 + r_t(q) δ_R t                              given that 1/2 < t/2 and 1/(4t) < t/4.

As a consequence, ‖r_r‖_∞ ≤ r_t(q) δ_R t.

When we denote the error introduced by the rounding by r_a, we get the equation

(t/q)(ct · ct′)(s) = ⌊(t/q) c_0 · c′_0⌉ + ⌊(t/q)(c_0 · c′_1 + c_1 · c′_0)⌉ · s + ⌊(t/q) c_1 · c′_1⌉ · s^2 + r_a

with

‖r_a‖_∞ ≤ (1 + δ_R B_key + δ_R^2 B_key^2)/2,

as the coefficient-wise rounding errors of ⌊(t/q) c_0 · c′_0⌉, ⌊(t/q)(c_0 · c′_1 + c_1 · c′_0)⌉ and ⌊(t/q) c_1 · c′_1⌉ are at most 1/2. We now need to consider the remainder modulo q of this ciphertext, which gives as result

⌊(t/q) c_0 · c′_0⌉ + ⌊(t/q)(c_0 · c′_1 + c_1 · c′_0)⌉ · s + ⌊(t/q) c_1 · c′_1⌉ · s^2
≡ (t/q)(ct · ct′)(s) − r_a mod q
≡ ∆[m · m′]_t + ([m]_t · v′ + v · [m′]_t) + t(v · z′ + v′ · z)
− r_t(q)([m]_t · z′ + [m′]_t · z + r_m) + r_v + r_r − r_a mod q

By gathering the noise in one variable v_mult we get

⌊(t/q) c_0 · c′_0⌉ + ⌊(t/q)(c_0 · c′_1 + c_1 · c′_0)⌉ · s + ⌊(t/q) c_1 · c′_1⌉ · s^2 ≡ ∆[m · m′]_t + v_mult mod q,

with

v_mult = ([m]_t · v′ + v · [m′]_t) + t(v · z′ + v′ · z) − r_t(q)([m]_t · z′ + [m′]_t · z + r_m) + r_v + r_r − r_a.

We already determined bounds on the norms of r_m, r_v, r_r and r_a; thus to bound the norm of v_mult we just need to bound ‖z‖_∞ and ‖z′‖_∞.

‖z‖_∞ = ‖(c_0 + c_1 · s − ∆[m]_t − v)/q‖_∞
≤ ( q/2 + (q/2) δ_R B_key + ∆ t/2 + ∆/2 ) / q
≤ (1 + δ_R B_key)/2 + (∆t/q)( 1/2 + 1/(2t) )
< 1 + (δ_R B_key + 1)/2

Using the bounds ‖z‖_∞, ‖z′‖_∞ ≤ 1 + (δ_R B_key + 1)/2, ‖r_m‖_∞ < δ_R t/2, ‖r_v‖_∞ < δ_R min{‖v‖_∞, ‖v′‖_∞}/2, ‖r_r‖_∞ ≤ r_t(q) δ_R t and ‖r_a‖_∞ ≤ (1 + δ_R B_key + δ_R^2 B_key^2)/2, we get

‖v_mult‖_∞ = ‖([m]_t · v′ + v · [m′]_t) + t(v · z′ + v′ · z) − r_t(q)([m]_t · z′ + [m′]_t · z + r_m) + r_v + r_r − r_a‖_∞
≤ (δ_R t/2)(‖v‖_∞ + ‖v′‖_∞) + δ_R t (1 + (δ_R B_key + 1)/2)(‖v‖_∞ + ‖v′‖_∞)
+ r_t(q)( 2 δ_R (t/2)(1 + (δ_R B_key + 1)/2) + δ_R t/2 ) + δ_R min{‖v‖_∞, ‖v′‖_∞}/2
+ r_t(q) δ_R t + (1 + δ_R B_key + δ_R^2 B_key^2)/2
≤ (δ_R t/2)(4 + δ_R B_key)(‖v‖_∞ + ‖v′‖_∞) + (r_t(q) δ_R t/2)(6 + δ_R B_key)
+ δ_R min{‖v‖_∞, ‖v′‖_∞}/2 + (1 + δ_R B_key + δ_R^2 B_key^2)/2

Relinearisation As multiplication increases the number of ciphertext components, we again use relinearisation to reduce the number of components of the ciphertext back to two. The relinearisation of FV works exactly the same as the relinearisation of BGV. Thus, given the relinearisation key \vec{rlk}, which is as for BGV an encryption of s^2 under the public key pk, the relinearisation algorithm is given by:

FV.Relin(ct, \vec{rlk}):

ct_relin = ( [c_0 + ⟨D_{ω,q}(c_2), \vec{rlk}_0⟩]_q , [c_1 + ⟨D_{ω,q}(c_2), \vec{rlk}_1⟩]_q ).

This operation introduces an extra error v_relin, which is bounded by

‖v_relin‖_∞ ≤ δ_R l_{ω,q} ω B_err / 2.

Parameter selection Given a parameter set for BGV, it is easy to determine the multiplicative depth, as this equals the length of the moduli chain minus one. For FV the multiplicative depth is not so easily deduced from the parameters. However, we can make an analysis similar to the one for YASHE in [BLLN13], based on the previously obtained noise bounds of FV. Assuming the input ciphertexts of a multiplication have noise with norm smaller than B and a fresh ciphertext has noise smaller than B_enc = B_err(2δ_R B_key + 1), the noise bound after one multiplication is given by C_1 B + C_2 with

C_1 = δ_R t (4 + δ_R B_key) + δ_R/2    and
C_2 = (r_t(q) δ_R t/2)(6 + δ_R B_key) + (1 + δ_R B_key + δ_R^2 B_key^2)/2 + δ_R l_{ω,q} ω B_err/2.

A computation of multiplicative depth L, carried out in a binary tree, then results in a noise bound of C_1^L B + ∑_{i=0}^{L−1} C_1^i C_2; as this contains a geometric series, the norm of the noise is bounded by

C_1^L B + C_2 (C_1^L − 1)/(C_1 − 1).

Thus, given FV parameters, the theoretical multiplicative depth that these parameters can correctly evaluate homomorphically is given by the maximal value L that satisfies C_1^L B + C_2 (C_1^L − 1)/(C_1 − 1) < q/(2t) − r_t(q)/2. Given the multiplicative depth one wants to achieve, one can also use this bound to compute the FV parameters needed to perform this computation homomorphically. When using this bound to determine the parameters, one has to take into account that it only considers the multiplications and hence does not account for any additional operations, such as additions. For functions that require additions to be performed in between the layers of multiplications, one should in theory incorporate the noise growth of the homomorphic additions in the bound. However, as the noise growth of an addition is significantly smaller than the noise growth during a multiplication and this bound is an upper bound for the norm of the noise, the additions are often discarded in the analysis of the noise growth and parameters are selected based on a bound expressing the multiplicative depth of the function as defined above.

3.3 Optimisations

As these RLWE schemes work with polynomials in the ring R_q, the efficiency of operations depends largely on the modulus q and the degree n of the ring-defining polynomial. These parameters are determined based on the requirements for security (the corresponding RLWE problem should achieve the desired security level) and correctness (the ciphertext modulus has to accommodate the noise propagation of the desired computations). In practice these conditions result in large values for q and n, with q often several hundreds of bits and n between 2^10 and 2^15. This implies we need a lot of memory to represent one ciphertext, as we need thousands of coefficients of hundreds of bits, and homomorphic operations, which are performed through arithmetic operations on these polynomials, are thus expensive and slow. The Chinese remainder theorem from Section 2.2 can be used to optimise the arithmetic in polynomial rings and can therefore increase the performance of the homomorphic operations of the RLWE based HE schemes.

As RLWE based HE schemes require the plaintext modulus t to be much smaller than q, encryption causes a large expansion of the data. The ratio of the memory needed to represent the encrypted data over the memory needed to represent the same plaintext data is what we call the ciphertext-to-plaintext ratio. The fact that the required parameters for the previously stated RLWE based homomorphic encryption schemes cause the ciphertext-to-plaintext ratio to be large is an important downside of these homomorphic encryption schemes. Research showed one can reduce the ciphertext-to-plaintext ratio using the Chinese remainder theorem to pack several messages into one plaintext, which is then encrypted into one ciphertext. Thus this technique allows us to operate on several messages at the same time while we homomorphically manipulate only one ciphertext. As we perform the same homomorphic operation simultaneously on all the slots, we homomorphically perform single instruction multiple data (SIMD) operations; therefore we refer to this CRT based packing as the SIMD packing.

We will now elaborate on the different optimisations based on the Chinese remainder theorem, which can be split up into four different cases. We have optimisations to make the arithmetic operations on the polynomials more efficient, which are achieved by applying the CRT to the ciphertext space, and optimisations to reduce the ciphertext-to-plaintext ratio, which are achieved by applying the CRT to the plaintext space. Given that the CRT can be used to represent the ring R_q as a direct product of smaller quotient rings, and the fact that the ring R_q is defined based on two parameters, the modulus q and the ring-defining polynomial X^n + 1, we can apply the CRT to the modulus q or to the defining polynomial X^n + 1.

Residue number system We consider a ciphertext modulus q that splits into k coprime factors q = ∏_{i=1}^{k} q_i; then the CRT becomes the ring homomorphism given by

RNS : R_q → R_{q_1} × . . . × R_{q_k} : a ↦ (a mod q_1, . . . , a mod q_k).

This decomposition of numbers modulo q into a product of k smaller coprime moduli is called the residue number system (RNS) [Gar59, SV57]. Arithmetic operations on elements in the ring R_q can be performed by componentwise operations on the elements of the direct product R_{q_1} × . . . × R_{q_k} by using RNS:

a + b mod q = (a + b mod q_1, . . . , a + b mod q_k)
a · b mod q = (a · b mod q_1, . . . , a · b mod q_k)

Note that even though in general we only need pairwise coprime factors q_i, in homomorphic encryption one usually considers ciphertext moduli q for which the factors q_i are prime, as then one can use the NTT to perform multiplications of the polynomials in R_{q_i} efficiently.
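
A minimal Python sketch of RNS arithmetic on integers modulo q (the coefficients of ring elements are handled the same way, coefficient-wise); the moduli below are illustrative assumptions.

from math import prod

moduli = [2**13 - 1, 2**17 - 1, 2**19 - 1]     # pairwise coprime (Mersenne primes)
q = prod(moduli)

def to_rns(x):
    return [x % qi for qi in moduli]

def add_rns(x, y):
    return [(xi + yi) % qi for xi, yi, qi in zip(x, y, moduli)]

def mul_rns(x, y):
    return [(xi * yi) % qi for xi, yi, qi in zip(x, y, moduli)]

def from_rns(residues):
    """CRT reconstruction back to a positional representation modulo q."""
    x = 0
    for ri, qi in zip(residues, moduli):
        Qi = q // qi
        x = (x + ri * Qi * pow(Qi, -1, qi)) % q
    return x

a, b = 123456789, 987654321
assert from_rns(add_rns(to_rns(a), to_rns(b))) == (a + b) % q
assert from_rns(mul_rns(to_rns(a), to_rns(b))) == (a * b) % q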

As RNS arithmetic is performed in rings with moduli q_i much smaller than q, the latency can be decreased by choosing appropriate coprime factors of q, for example by choosing the q_i to fit one machine word. As such, RNS allows more efficient computation of sums and products. In addition, as the computations for the different moduli q_i are completely independent, we can perform them in parallel. However, the RNS representation also has a downside. Because of its non-positional representation, it does not allow one to perform efficient comparisons or roundings for inexact computations. To perform these operations we need to switch back to a positional representation. In spite of this downside, its high speed and low power consumption properties render the RNS representation useful in cryptography, and recently it has also been used to improve the efficiency of the abovementioned RLWE-based encryption schemes in [BEHZ16, HPS19, ABPA+19, CHK+19, BEM+19, TSKJ20].

Number theoretic transform In the previous paragraph, we looked at the decomposition of the modulus q into coprime factors. As computations in R_q are also performed modulo the defining cyclotomic polynomial X^n + 1, we can also look at the factorisation of this polynomial into coprime components modulo q. In the extreme case, X^n + 1 can be factored into linear components; this is only possible if q ≡ 1 mod 2n. We denote the factorisation of X^n + 1 as ∏_{i=1}^{n}(X − ξ_i) with ξ_1, . . . , ξ_n the roots of X^n + 1 modulo q. The ring isomorphism given by applying the CRT in this polynomial setting is:

NTT : R_q → Z_q[X]/(X − ξ_1) × . . . × Z_q[X]/(X − ξ_n) : a ↦ (a(ξ_1), . . . , a(ξ_n)).

This isomorphism is called the number theoretic transform (NTT) and is the finite field equivalent of the fast Fourier transform (FFT). The NTT can be used to speed up the multiplication of two elements of R_q. The polynomial multiplication is then performed in the following three steps. First, we represent each polynomial as n elements of Z_q, computed by evaluating the polynomial in the n roots ξ_i, which are 2n-th roots of unity. Then, the coefficient-wise product of the vectors over Z_q is computed and subsequently the result is interpolated to retrieve an element of R_q. The NTT can be computed using existing FFT algorithms, for instance the Cooley-Tukey method [CT65], and requires asymptotically only O(n log_2(n)) multiplications to evaluate or interpolate a polynomial on n roots of unity. The multiplication can then be performed componentwise and hence requires n multiplications. Thus the overall algorithm requires O(n log_2(n)) multiplications, whereas the naive approach for a multiplication in R_q would take O(n^2) multiplications. Hence using the NTT results in a significant speed-up of the polynomial multiplication in R_q.
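
The isomorphism can be checked directly on a toy example. The following Python sketch evaluates polynomials naively at the roots of X^n + 1 mod q rather than running a fast NTT, which is enough to verify that negacyclic multiplication becomes a componentwise product in the evaluation domain; all parameter values are illustrative assumptions.

import random

n, q = 8, 97                      # q prime with q ≡ 1 mod 2n
g = next(x for x in range(2, q) if pow(x, 2 * n, q) == 1 and pow(x, n, q) != 1)
roots = [pow(g, 2 * i + 1, q) for i in range(n)]        # the n roots of X^n + 1 mod q

def evaluate(a, x):
    return sum(c * pow(x, i, q) for i, c in enumerate(a)) % q

def negacyclic_mul(a, b):
    """Schoolbook multiplication modulo X^n + 1 and q (the O(n^2) baseline)."""
    res = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            sign = 1 if i + j < n else -1               # X^n = -1
            res[(i + j) % n] = (res[(i + j) % n] + sign * ai * bj) % q
    return res

a = [random.randrange(q) for _ in range(n)]
b = [random.randrange(q) for _ in range(n)]
c = negacyclic_mul(a, b)
# In the evaluation domain the product is componentwise; a fast NTT exploits
# exactly this to bring the cost down to O(n log n).
for x in roots:
    assert evaluate(c, x) == evaluate(a, x) * evaluate(b, x) % q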

SIMD packing So far we have used the Chinese remainder theorem to optimise the arithmetic operations in R_q. Even though this increases the efficiency of the schemes, homomorphic operations still remain slow. Smart and Vercauteren showed in [SV14] that the costly homomorphic operations can manipulate several messages at the same time by applying the CRT to the plaintext space R_t. The underlying idea of this technique is to choose the plaintext modulus t prime and the defining cyclotomic polynomial Φ_m(X), with m a positive integer not divisible by t, such that Φ_m(X) splits modulo t into a number of irreducible factors of the same degree. The degree d is the order of t modulo m (thus the smallest integer d such that t^d ≡ 1 mod m). The number of factors is k = ϕ(m)/d and all factors f_1, . . . , f_k are distinct. The plaintext space R_t is then isomorphic to Z_t[X]/f_1 × . . . × Z_t[X]/f_k, which is isomorphic to F_{t^d}^k, as for each i ∈ [1, k] the quotient ring Z_t[X]/f_i is isomorphic to the finite field F_{t^d}. Each copy of F_{t^d} is called a slot and due to the isomorphism every element of F_{t^d}^k can be seen as an array of k slots. Therefore, a user can choose to take one message from R_t or to take k messages from F_{t^d} and use the inverse CRT to pack these into one plaintext. In the latter case, homomorphic operations (addition or multiplication) on encrypted R_t-elements result in componentwise operations on the respective slots. This technique, called batching, enables SIMD operations, as a single operation on an element of R_t gives rise to an operation on multiple F_{t^d} elements in parallel. Evaluating a function homomorphically thus leads to the parallel evaluation of that function on k different inputs, with approximately the same cost it takes to evaluate the function on one input without batching. The batching strategy therefore amortises the cost of ciphertext operations and in addition reduces the ciphertext-to-plaintext ratio. However, this technique comes with its own restriction; more specifically, the modulus t and the parameter m of the cyclotomic polynomial Φ_m(X) have to be chosen such that m is a positive integer not divisible by t. Therefore, this technique cannot be used when t = 2 and m is a power of two. In this case we get Φ_m(X) = X^{m/2} + 1 ≡ (X + 1)^{m/2} mod 2, and we hence do not have a factorization into distinct factors.
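
The batching isomorphism can be illustrated in the simplest setting where every slot is a copy of Z_t. The Python sketch below (illustrative parameters; d = 1 because t ≡ 1 mod m) packs slot vectors into plaintext polynomials by Lagrange interpolation over the roots of X^n + 1 mod t, multiplies the packed polynomials in R_t, and checks that unpacking yields the slot-wise product.

import random

n, t = 8, 97                         # t prime, t ≡ 1 mod 2n, so d = 1 and k = n slots
g = next(x for x in range(2, t) if pow(x, 2 * n, t) == 1 and pow(x, n, t) != 1)
roots = [pow(g, 2 * i + 1, t) for i in range(n)]        # the n roots of X^n + 1 mod t

def poly_mul(a, b):
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] = (res[i + j] + ai * bj) % t
    return res

def reduce_negacyclic(a):            # reduce modulo X^n + 1 and t
    res = [0] * n
    for i, c in enumerate(a):
        res[i % n] = (res[i % n] + (c if i < n else -c)) % t
    return res

def pack(slots):
    """Inverse CRT: the degree < n polynomial taking value slots[i] at roots[i]."""
    poly = [0] * n
    for xi, yi in zip(roots, slots):
        basis, denom = [1], 1        # Lagrange basis polynomial for the root xi
        for xj in roots:
            if xj != xi:
                basis = poly_mul(basis, [(-xj) % t, 1])
                denom = denom * (xi - xj) % t
        coef = yi * pow(denom, -1, t) % t
        for k, c in enumerate(basis):
            poly[k] = (poly[k] + coef * c) % t
    return poly

def unpack(poly):
    return [sum(c * pow(x, i, t) for i, c in enumerate(poly)) % t for x in roots]

u = [random.randrange(t) for _ in range(n)]
v = [random.randrange(t) for _ in range(n)]
w = reduce_negacyclic(poly_mul(pack(u), pack(v)))          # one multiplication in R_t
assert unpack(w) == [ui * vi % t for ui, vi in zip(u, v)]  # k products in parallel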

With batching, the operations are processed independently in the k slots, so it speeds up the amortised cost if the same operation has to be performed on multiple inputs, but it does not by itself reduce the overhead of computing a complicated function on a single input. If one wants to perform operations that combine inputs or intermediate results placed in different slots of the same ciphertext, one needs either to unpack, perform the operation and pack the ciphertexts again, or an extra method enabling one to permute the slots. This implies that we need an extra algorithm, more specifically a homomorphic permutation operation on top of the homomorphic addition and multiplication, to achieve a complete set of operations and be able to evaluate arbitrary circuits on packed ciphertexts. Gentry et al. explained in [GHS12b] how this homomorphic permutation can be achieved using actions of the Galois group Gal(K/Q) of the field K = Q(ξ_m) on the cyclotomic ring R. If we denote the multiplicative group of integers modulo m by (Z/mZ)^×, then under the same conditions on t and m as before (t prime and m a positive integer not divisible by t), (Z/mZ)^× is isomorphic to Gal(K/Q) through the map (Z/mZ)^× → Gal(K/Q) : i ↦ κ_i : X ↦ X^i, and we have exactly ϕ(m) = n of these transformations. Denoting the roots of Φ_m(X) by ξ_m^{(0)}, . . . , ξ_m^{(ϕ(m)−1)}, we know that the Galois group Gal(K/Q) acts transitively on these roots, thus ∀i, j ∈ [0, ϕ(m) − 1] : ∃κ ∈ Gal(K/Q) such that κ(ξ_m^{(i)}) = ξ_m^{(j)}. Remark that the elements of the Galois group κ ∈ Gal(K/Q) commute with the CRT map, thus for an aggregated plaintext a ∈ R_t we have κ(CRT(a)) = CRT(κ(a)) = (κ(a_1), . . . , κ(a_k)). As for our choice of t and m it holds that t^d ≡ 1 mod m, we know that t generates a subgroup of order d in (Z/mZ)^×, and thus the automorphism κ_t : X ↦ X^t, called the Frobenius automorphism, generates a subgroup G_t = {X ↦ X^{t^j} : j ∈ [0, d − 1]} of order d in Gal(K/Q). Thus, the action of G_t on the roots of Φ_m in K partitions the set of roots into k = ϕ(m)/d disjoint sets of d elements, corresponding to the set {ψ_1, . . . , ψ_k}, where ψ_i is a representative of the roots of the respective factor f_i of Φ_m modulo t for i ∈ [1, k]. Now we construct the quotient group H = Gal(K/Q)/G_t, which has order k as Gal(K/Q) has order ϕ(m), G_t has order d and k = ϕ(m)/d. As such, the elements of H act transitively on the set {ψ_1, . . . , ψ_k}, and fixing ψ_i fixes a representation for the field F_t[X]/(f_i(X)); therefore the elements of H act directly as permutations on the plaintext slots.

We now take a closer look at the effect of applying an element κ_i ∈ H to an FV ciphertext (this applies similarly to a BGV ciphertext) in order to permute the plaintext slots. Given the FV ciphertext c = (ct_0(X), ct_1(X)) encrypted under the secret key s(X), which decrypts to the aggregate plaintext polynomial m(X) with inherent noise v(X), i.e. ct_0(X) + ct_1(X) · s(X) = ∆[m(X)]_t + v(X) mod q, the decryption after application of κ_i : X ↦ X^i to c is

ct_0(X^i) + ct_1(X^i) · s(X^i) = ∆[m(X^i)]_t + v(X^i) mod q.

This implies that κ_i(c) decrypts to m(X^i) under s(X^i). Given a key switching algorithm, which is similar to the relinearisation algorithm shown before, we can transform the ciphertext κ_i(c) = (ct_0(X^i), ct_1(X^i)) into a ciphertext ct′ that decrypts under the key s, if we are willing to assume circular security. Assume a key switching key

\vec{ksk}_i = ( [P_{ω,q}(s(X^i)) + \vec{a}_i · s + \vec{e}]_q , −\vec{a}_i ),

with \vec{a}_i ∈ U_q^{l_{ω,q}} and a vector \vec{e} ∈ χ_err^{l_{ω,q}}, is passed to the party performing the homomorphic computations. Then this party can transform the ciphertext (ct_0(X^i), ct_1(X^i)) into a ciphertext that can be decrypted with s(X) by using \vec{ksk}_i to transform (ct_0(X^i), ct_1(X^i)) into

ct′ = ( [ct_0(X^i) + ⟨D_{ω,q}(ct_1(X^i)), \vec{ksk}_{i,0}⟩]_q , [⟨D_{ω,q}(ct_1(X^i)), \vec{ksk}_{i,1}⟩]_q ).

Decrypting ct′ with the secret key s gives ct′_0 + ct′_1 · s = ∆[m(X^i)]_t + v(X^i) + ⟨D_{ω,q}(ct_1(X^i)), \vec{e}_i⟩ mod q, and thus the key switching creates an additive error with norm ‖v_ks‖_∞ = ‖⟨D_{ω,q}(ct_1(X^i)), \vec{e}_i⟩‖_∞ ≤ δ_R l_{ω,q} ω B_err/2, which is of the same size as the additive error of the relinearisation procedure. One can also see that applying κ_i ∈ Gal(K/Q) to c makes κ_i act on both the message m(X) and the noise v(X). In [GHS12b], the growth of the original noise v(X) is investigated through the canonical embedding, which for a ∈ R and ξ_m a primitive m-th root of unity is defined as ‖a‖^can_∞ = max_{k∈Z^*_m} |a(ξ_m^k)| = ‖σ(a)‖_∞, with σ the canonical embedding map. As the elements of Gal(K/Q) permute the roots of Φ_m, the canonical norm remains invariant under the application of one of the elements of Gal(K/Q). Therefore, the canonical norm is the easiest way to investigate this error. One gets the following bound for a ∈ K : ‖a‖_∞ ≤ c_m ‖a‖^can_∞, with c_m a constant that can grow super-polynomially with m. However, c_m = 1 if m is a power of two. For more details on the constant c_m we refer the reader to Appendix C of [DPSZ12]. This implies we can bound the norm of the noise of v(X^i) by ‖v(X^i)‖_∞ ≤ c_m ‖v(X^i)‖^can_∞ = c_m ‖v(X)‖^can_∞.

The SIMD packing is based on an application of the CRT to the cyclotomic polynomial Φ_m that defines the polynomial ring R. As mentioned before, one could also split the plaintext modulus t into coprime factors t_1, . . . , t_l. The two can be combined by first splitting R_t into R_{t_1} × . . . × R_{t_l} and then factoring the defining polynomial Φ_m(X) = f in each of these rings R_{t_i}. Assume f splits into k_i irreducible polynomials modulo t_i, f = ∏_{j=1}^{k_i} f_{ij} mod t_i, which results in R_t ≅ R_{t_1}/⟨f_{11}⟩ × . . . × R_{t_1}/⟨f_{1k_1}⟩ × R_{t_2}/⟨f_{21}⟩ × . . . × R_{t_l}/⟨f_{lk_l}⟩. Each ring R_{t_i}/⟨f_{ij}⟩ is now considered to be a slot. In each of these slots, we can encode a data value into a polynomial of that specific ring, and these polynomials can then be combined into one polynomial pt ∈ R_t using the inverse CRT isomorphisms. Therefore pt contains all the encoded data, and by operating on pt or its encryption ct, one operates on all the input values at once. This results in a more evolved SIMD packing. For more details on this technique, we refer the reader to [CIV18].

3.4 HEAAN

The HEAAN scheme is not explicitly used in the works of this thesis. However, it starts from a different point of view on homomorphic operations which has been proven to be valuable in certain application settings. As a consequence, it is often used to implement homomorphic applications and therefore it should be mentioned in an overview of state of the art homomorphic encryption techniques. HEAAN is a homomorphic encryption scheme for approximate arithmetic introduced in [CKKS17]. It treats the encryption noise as part of the error occurring during approximate computations, which is fundamentally different from the strategy of BGV and FV, which ensure that the noise does not interfere with the message in order to remove it completely during the decryption procedure. The plaintext space of BGV and FV is a cyclotomic polynomial ring of finite characteristic, which we denoted by R_t. To have more efficient homomorphic computations in BGV and FV, the SIMD technique is used to encode a vector of plaintext values into one plaintext polynomial by the ring isomorphism between R_t and the direct product of finite fields Z_t[X]/f_1 × . . . × Z_t[X]/f_k with f_1, . . . , f_k distinct factors of Φ_m modulo t. The plaintext space of HEAAN is the set of polynomials in the cyclotomic ring R with magnitude bounded by the ciphertext modulus q. HEAAN considers a vector of complex numbers as a vector obtained by evaluating an integer polynomial in 2ϕ(m)-th roots of unity. This ensures the reduction modulo (X^{ϕ(m)} + 1) does not affect the values of the encoded complex numbers, and one can compute element-wise on the vector of complex numbers by computing on the corresponding polynomial. Mapping the vector of complex values into one plaintext polynomial is done using the complex canonical embedding. The vectors of complex numbers can only be of size ϕ(m)/2, as HEAAN uses a polynomial with integer coefficients and the primitive roots come in complex conjugate pairs, which means one can only choose values for half of the roots.

Recall the canonical embedding map from Section 2.2 and note that for m > 2 we only have complex roots, so s_1 = 0. This implies the image of the canonical embedding σ is contained in

H = { (z_1, z_2, . . . , z_{ϕ(m)}) ∈ C^{ϕ(m)} : z_{s_2+j} = \overline{z_j}, ∀j ∈ [s_2] }.

The space H is a ring with componentwise addition and multiplication, and since for any i the map σ_i satisfies σ_i(a) + σ_i(b) = σ_i(a + b) and σ_i(a)σ_i(b) = σ_i(ab), the complex canonical embedding σ is a ring homomorphism from Q[X]/(Φ_m(X)) to H. Now take the natural projection π : H → C^{ϕ(m)/2} that discards all the complex conjugate numbers from the list. If we remark that the indices of the roots are given by the elements of Z^*_m, then we can define the subgroup T of the multiplicative group Z^*_m to satisfy Z^*_m/T = {±1}. The projection π is then given by π : H → C^{ϕ(m)/2} : (z_j)_{j∈Z^*_m} ↦ (z_j)_{j∈T}.

Given the maps σ and π, we can describe the decoding algorithm of a HEAAN plaintext polynomial m(X) ∈ R into a vector of ϕ(m)/2 complex numbers (z_j)_{j∈T}. In a first step, the canonical embedding is applied to transform the plaintext into a vector of complex numbers (z_j)_{j∈Z^*_m} ∈ H, and afterwards the projection π is used to send it to a vector of half the size, (z_j)_{j∈T}, as the other half is fixed by the complex conjugates of these values. This implies we can encode ϕ(m)/2 complex numbers into one HEAAN plaintext polynomial using the inverse of this map. However, an additional rounding step is needed to ensure the polynomial has integer coefficients, as the HEAAN plaintext space is Z[X]/(Φ_m(X)). This rounding error can influence the significant bits of the message; therefore a scaling factor is introduced before rounding to preserve the precision of the message. The encoding thus starts from ϕ(m)/2 complex numbers (z_j)_{j∈T} and a scale factor ∆ and maps these complex numbers to a polynomial in R using the following sequence of operations:

C^{ϕ(m)/2} --π^{−1}--> H --⌊·⌉_{σ(R)}--> σ(R) --σ^{−1}--> R

z = (z_i)_{i∈T} ↦ π^{−1}(z) ↦ ⌊∆ · π^{−1}(z)⌉_{σ(R)} ↦ σ^{−1}( ⌊∆ · π^{−1}(z)⌉_{σ(R)} )

where ⌊·⌉_{σ(R)} denotes rounding to the closest element in σ(R). We now also need to insert a multiplication by ∆^{−1} in the decoding algorithm. Decoding now starts from a polynomial m(X) ∈ R and a scaling factor ∆ and outputs the vector (z_j)_{j∈T} = π ∘ σ(∆^{−1} · m(X)). The HEAAN scheme is based on applying this encoding technique to the BGV homomorphic encryption scheme.

As for BGV and FV, HEAAN has a packing technique that allows one to perform the same operation componentwise on multiple inputs. This decreases the amortised cost of operations but does not reduce the overhead of computing a function only once. As for BGV and FV, an additional homomorphic operation is needed to achieve a complete set of homomorphic operations on packed ciphertexts, namely a homomorphic permutation of the slots. In HEAAN, we can apply the same permutation technique as for BGV and FV. The Galois group Gal(K/Q) of the field K = Q(ξ_m), with ξ_m an m-th root of unity, acting on the cyclotomic ring R contains the mappings κ_i : m(X) ↦ m(X^i) mod Φ_m(X) for all i coprime with m. Since the Galois group Gal(K/Q) is isomorphic to Z^*_m, it can be used to permute the vector of plaintext values. Given a vector (m(ξ_m^j))_{j∈T}, with T a subgroup of Z^*_m such that Z^*_m/T = {±1} as defined above, there is an element κ_k ∈ Gal(K/Q) that maps an element of slot i to an element of slot j, for each i, j ∈ T. Therefore, if k = i^{−1} · j mod m, then κ_k(m(ξ_m^i)) = m(ξ_m^{ik}) = m(ξ_m^j), and thus the element in the slot with index j of the result of this permutation is the same as that in the slot with index i of the original plaintext vector. As explained in [GHS12b], given a ciphertext c encrypting m with respect to the key s, κ_k(c) is a valid encryption of κ_k(m) with respect to the secret key κ_k(s). Applying a key switching to the ciphertext κ_k(c) results in an encryption of the same message κ_k(m) with respect to the original secret key s.

Packing for BGV and FV is based on splitting the ring-defining polynomial Φ_m(X) modulo t into distinct irreducible factors of the same degree. The degree of these factors should be large enough to avoid wrapping around of the underlying messages during homomorphic operations, because that would lead to incorrect messages after decryption. The required bit size of the plaintext modulus increases exponentially with the depth of the circuit, which could increase the required degree of the factors of Φ_m(X) modulo t, which in turn possibly reduces the number of factors. For the packing strategy of BGV and FV, there is therefore some dependency between the depth of the circuit one wants to evaluate and the number of slots on which one can evaluate the circuit in parallel. This dependency is absent in the packing strategy of HEAAN, which can result in better amortised evaluation times.

The main idea of the HEAAN scheme is to treat the noise of the RLWE problem as part of the error occurring during approximate computations. Decryption thus leads to an approximate value of the underlying message of the ciphertext. It is therefore important that the deviation from the exact message does not blow up during decoding of the plaintext polynomial into the vector of complex values. This is ensured as the complex canonical embedding is an isometric ring homomorphism. In order to reduce the message bits covered by the noise, the scaling factor with which the message is multiplied is chosen large enough to ensure that only the lower message bits mix with the noise. The output of decryption is therefore seen as a high precision approximate value of the underlying message.


The encryption techniques and plaintext space used in HEAAN have some aspects that need to be monitored during the homomorphic operations. The encryption algorithm introduces some errors that increase with each performed homomorphic operation. Bounds for this error growth thus need to be determined to predict the precision of the result. Also the size of the messages needs to be monitored, as it has to remain smaller than the ciphertext modulus. Like the rounding operation in regular approximate computation using floating point numbers, HEAAN uses a rescaling operation that removes the least significant bits of the message and makes a trade-off between the size of numbers and precision loss. As the error is located in the least significant bits, the rescaling removes part of the error. Another advantage of the technique is that rescaling after each multiplication allows one to keep the size of the message almost constant throughout the whole computation, which relaxes the condition on the size of the ciphertext modulus. Where previous schemes, like BGV and FV, need a huge ciphertext modulus because the bit size of the output value grows exponentially with the multiplicative depth L of the circuit, this strategy of rescaling after each homomorphic multiplication in HEAAN yields a bit-size requirement for the ciphertext modulus that is linear in the multiplicative depth L of the circuit one wants to compute. For the same multiplicative depth, the HEAAN ciphertext modulus will thus be smaller than the one for BGV or FV. As a result, its computations will be more efficient and the performance of the HEAAN scheme will be better. Taking into account the abovementioned aspects that need monitoring, a ciphertext in HEAAN consists of four components (c, l, ν, B): a ciphertext c ∈ R^2_{q_l}, a level 0 ≤ l ≤ L, an upper bound ν ∈ ℝ for the message and an upper bound B ∈ ℝ for the noise.

Since HEAAN performs approximate homomorphic computations, two important aspects of approximate computations need to be investigated before performing a homomorphic function evaluation. In approximate computations, small errors can blow up during subsequent computations, so the convergence of the algorithm has to be checked to ensure a small difference between the calculated result and the expected exact output value. It is thus important to ensure that the algorithm used to compute the value homomorphically with HEAAN converges to the function value one wants to compute. Secondly, approximate computations usually incur some precision loss, hence the precision loss of a homomorphic function evaluation with HEAAN needs to be investigated. For unencrypted floating point arithmetic, a multiplication of d floating point numbers with η bits of precision results in a significand with (η − log d) bits of precision. Based on an investigation of the relative noise bound of the operations of the HEAAN scheme, one can analyse what happens with the precision of the underlying messages during homomorphic operations. For a homomorphic multiplication of encryptions of d messages with η bits of precision, the HEAAN scheme computes the result with (η − log d − 1) bits of precision in a multiplication circuit of depth ⌈log d⌉. The precision loss during homomorphic evaluation is thus at most one bit more than for unencrypted floating point arithmetic. An important implication of the approximate computation blueprint of the HEAAN scheme is that even with bootstrapping this scheme can never achieve the concept of unlimited computation, as each multiplication reduces the precision of the underlying message by one bit. Therefore, if the depth of the circuit is larger than the bit precision of the input data, it is impossible to retrieve any meaningful information from the computation result. However, the strategy of HEAAN has proven to be particularly suited for applications with negative feedback, in which the error on the input is reduced during the computation, as is often the case for machine learning algorithms such as gradient descent.
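
The precision statement above can be checked in the clear with a small simulation: multiply d values that each carry η bits of precision, rounding the result of every multiplication back to η significant bits as rescaling does, and count how many bits of the exact product survive. This is a plaintext sketch only; the values η = 30 and d = 8 are arbitrary examples and no encryption is involved.

```python
import math
import random

def round_to_bits(x, eta):
    """Keep only eta significant bits of x (mimics the precision of rescaling)."""
    if x == 0:
        return 0.0
    e = math.floor(math.log2(abs(x)))
    scale = 2 ** (eta - 1 - e)
    return round(x * scale) / scale

def tree_product(vals, eta):
    """Multiply in a binary tree, rounding to eta bits after each product."""
    while len(vals) > 1:
        nxt = [round_to_bits(vals[i] * vals[i + 1], eta)
               for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:                     # odd element passes through
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]

eta, d = 30, 8
xs = [round_to_bits(random.uniform(0.5, 1.0), eta) for _ in range(d)]
exact = math.prod(xs)
approx = tree_product(xs, eta)
err = abs(approx - exact) / abs(exact)
bits = -math.log2(err) if err > 0 else float("inf")
print(f"surviving precision ~ {bits:.1f} bits, "
      f"bound eta - log d - 1 = {eta - math.log2(d) - 1:.0f} bits")
```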

For more details on the HEAAN scheme we refer the reader to [CKKS17] and subsequent works on improving the HEAAN scheme [CHK+18, CHK+19, KS19, CKY18, HK20, LLL+20, KPP20, BMTPH20]. In [CHK+18], the authors introduce a bootstrapping algorithm, which transforms an input ciphertext into an encryption of almost the same plaintext with a much larger modulus; this gives room for extra homomorphic operations but does not achieve unlimited computations for the reasons mentioned above. Improvements to this bootstrapping technique are given in [HK20], [LLL+20] and [BMTPH20]. The RNS optimisation to speed up the polynomial arithmetic is applied to the HEAAN scheme in [CHK+19]. Recently, a work on reducing the approximation error due to the LWE noise in encryption and key switching operations appeared [KPP20]. In this work, the authors also propose an RNS variant of the HEAAN scheme with smaller approximation error. There are also works on encodings for specific application settings, such as an encoding optimised for working with real numbers [KS19] and an encoding optimised for working with matrices [CKY18].

Recent work by Li and Micciancio [LM20] demonstrates that the security of approximate encryption schemes differs from the security of exact encryption schemes and, more specifically, shows a passive key recovery attack on the HEAAN scheme. The authors explain how a passive adversary observing several ciphertexts of chosen plaintexts and their approximate decryptions can recover the secret key. As the attack requires the approximate decryption results of the ciphertexts, it can be prevented by keeping the decryption result private and making it only accessible to the party owning the secret key. The theoretical description of the key recovery attack is supported by an implementation of the attack that works on several libraries implementing HEAAN. In addition to the attack, the authors describe two strategies to harden the HEAAN scheme against it by modifying the decryption algorithm such that it outputs an approximation of the message that does not depend on the secret key and the encryption randomness. The HEAAN implementations in HElib and PALISADE have already been adapted to mitigate this attack.

3.5 TFHE

Even though it is not used in the papers of this thesis, TFHE is another popular scheme for homomorphic applications nowadays. The TFHE scheme comprises the latest developments in a series of homomorphic schemes based on matrix arithmetic and is therefore quite different from the previously discussed RLWE schemes. The first scheme in this line of homomorphic encryption schemes was proposed in 2013 by Gentry, Sahai and Waters [GSW13] and is called the GSW scheme after its inventors. It is an LWE based scheme with matrices as ciphertexts that uses what the authors define as an approximate eigenvector method to encrypt and decrypt messages. The secret key is defined as an approximate eigenvector of the ciphertext matrix and the message corresponds to its eigenvalue. Homomorphic addition and multiplication are in this scenario mapped directly to the addition and multiplication of matrices. The original scheme works for small messages, but as mentioned in [GSW13], it can be generalised to larger messages using an idea from [MP12] that recovers the message bit by bit, starting from the least significant bit. An interesting feature of the scheme is the asymmetric noise growth of the homomorphic multiplication. For B1, B2 the respective noise bounds of the ciphertexts c1, c2 encrypting the messages µ1, µ2, the noise bound of the ciphertext resulting from the multiplication of c1 and c2 is given by ‖µ2‖B1 + poly(n)B2, with n the dimension of the underlying LWE problem. If we restrict the scheme to work with binary messages, the noise grows only by a factor l · poly(n) after l homomorphic multiplications. This is in contrast with the factor poly(n)^(log2(l)) known from the binary multiplication tree strategy of RLWE based schemes. This scheme, and mainly its bootstrapping, was improved in several works: first by Brakerski and Vaikuntanathan in [BV14], afterwards by Alperin-Sheriff and Peikert in [AP14] and finally by Ducas and Micciancio in [DM15], who achieved a fast bootstrapping that however needs to be performed after each NAND gate evaluation. This last scheme is called the FHEW scheme. In [BR15], Biasse and Ruiz generalised the FHEW scheme to support larger message spaces.

The TFHE scheme, constructed by Chillotti et al. and originally presented in [CGGI16], is a revision of the techniques of the FHEW scheme [DM15], making the scheme more efficient and improving the bootstrapping dramatically. TFHE is defined over the torus T, i.e. the reals modulo 1, T = R/Z. The scheme thus uses generalisations of previous structures and schemes over the torus and indicates this with the prefix T. As the underlying hard problem of TFHE is a torus variant of the LWE and RLWE problem, the authors define, following the generalisation of [BGV12], the TLWE problem, which comprises both the LWE and the RLWE variant of the problem. We will indicate them with TLWE and TRLWE respectively. In line with the generalised TLWE definition, the GSW scheme is also generalised to TGSW samples. This also includes an extension of the GSW scheme to polynomials, which we will indicate with TRGSW. As the torus T is a Z-module, there is no product defined between two torus elements, only an external product between an element of Z and an element of T. Multiplications will therefore not happen between two T(R)LWE encryptions but will be defined between a TRGSW sample and a TRLWE sample. A difference between the original GSW structure and the variant used in TFHE is that the latter uses an approximate gadget decomposition, determined up to a certain precision, whereas the decomposition operation of the original GSW was exact. The approximate decomposition leads to a trade-off that improves the running time and reduces the memory requirements in exchange for a small amount of additional noise. The TFHE scheme switches often between TLWE and TRLWE samples to perform certain operations, but multiplications are always done between TRLWE and TRGSW samples.

The most significant efficiency improvement of TFHE over GSW originates from the observation that a GSW ciphertext contains a lot of redundant information, which is needed to perform a homomorphic product as a matrix product but is not needed to decrypt the ciphertext. As the new strategy is to bootstrap with the evaluation of each gate, there is no need for the redundant information present in the GSW ciphertext. Hence the authors of TFHE avoided computing this unused part of the ciphertext by defining an external product, denoted ·, that maps the product of a TRGSW encryption and a TRLWE encryption to a TRLWE encryption. A controlled selector gate (CMux gate) is then constructed using this external product. The CMux gate has three inputs: one control input, represented by a TRGSW sample encrypting a bit that decides which of the two messages will be output, and two data input slots of type TRLWE containing the possible output messages. The operation CMux(C, d0, d1) homomorphically computes C · (d1 − d0) + d0 and as such outputs either d0 or d1 depending on the decision bit encrypted in C. In addition to being a crucial component in the bootstrapping performed when evaluating a gate, this CMux operation can be used to evaluate arbitrary functions with a lookup table approach, by selecting the right output value through a CMux decision tree. In this setting, we can, with a correct choice of parameters, evaluate several subsequent homomorphic operations without the need for intermediate bootstrapping and hence use TFHE as a levelled homomorphic scheme. However, in levelled mode, the external product poses a limitation on the composability of the gates. Using this external product, one can only build valid circuits as long as the control input (the TRGSW sample) is freshly generated, because the output of a homomorphic operation will always be in the TRLWE format. In order to remove this restriction and make circuits fully composable, the authors of TFHE also propose a transformation from a TRLWE sample to a TRGSW sample, which is first mentioned in [CGGI17]. TFHE thus has two modes of operation: one in which a bootstrapping is performed with each gate operation and one where circuits are composed and bootstrapping takes the form of transforming a TRLWE sample into a TRGSW sample. The algorithm that performs a bootstrapping with each gate evaluation is called gate bootstrapping and is composed of a blind rotation, a sample extraction and a key switching. The bootstrapping of the levelled mode is called circuit bootstrapping and transforms a TRLWE sample into a TRGSW sample to make circuits fully composable. For more details and the actual definitions and algorithms, we refer the reader to [CGGI16, CGGI17, CGGI20, Chi18].
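
The selection logic of the CMux gate and the lookup-table evaluation it enables can be mirrored on clear values; the sketch below is only that plaintext mirror (in TFHE the control bit would be a TRGSW ciphertext and the data TRLWE ciphertexts, and the table of squares is an arbitrary example).

```python
# CMux(C, d0, d1) computes C*(d1 - d0) + d0: it returns d0 when the control
# bit is 0 and d1 when it is 1.  A binary decision tree of 2**u - 1 such
# gates selects one entry of a lookup table with 2**u entries.
def cmux(c, d0, d1):
    return c * (d1 - d0) + d0

def lut_eval(table, bits):
    """Select table[index] where index has the given bits (LSB first)."""
    layer = list(table)
    for b in bits:                              # one tree level per input bit
        layer = [cmux(b, layer[i], layer[i + 1])
                 for i in range(0, len(layer), 2)]
    return layer[0]

table = [x * x for x in range(8)]               # an arbitrary example table
print(lut_eval(table, bits=[1, 0, 1]))          # index 5 (binary 101) -> 25
```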

Packing As TFHE has the torus as plaintext space and therefore does not have a plaintext modulus t, the ring-defining cyclotomic polynomial Φm cannot split into irreducible factors modulo t. Therefore, the SIMD packing is not applicable to the ring of polynomials over the torus used in TFHE. We can however pack multiple torus elements in one ring element in TFHE by using the canonical coefficient embedding, which maps each torus element to a coefficient of the torus polynomial. If we want to evaluate an arbitrary function f : {0,1}^u → T^s, we can express this function as a lookup table with 2^u input values and s·2^u corresponding output values, as there are s subfunctions. We will denote the output values as σ_{j,h}, with j ∈ [0, s − 1] the subfunction index and h ∈ [0, 2^u − 1] the input index. Coefficient packing can then be used to evaluate the lookup table more efficiently than evaluating each of the s subfunctions separately with a binary decision tree of 2^u − 1 CMux gates. In [CGGI17], two packing methods are proposed, called horizontal packing and vertical packing. The horizontal packing method packs the outputs of the s subfunctions for the same input into one ciphertext, which implies the coefficients of the ciphertext will be σ_{0,h}, σ_{1,h}, . . . , σ_{s−1,h}. After evaluating one decision tree of 2^u − 1 CMux gates, one ends up with the ciphertext containing the outputs of the s subfunctions for this input value, which can then be extracted from the polynomial. This horizontal packing is often suboptimal, as the number of subfunctions s is often smaller than the number of coefficients in one ciphertext. The vertical packing method packs the results of one subfunction for all inputs into one ciphertext, thus σ_{j,0}, σ_{j,1}, . . . , σ_{j,2^u−1} for fixed j. As we have 2^u possible output values for one subfunction, which could be more than the number of coefficients available in one ciphertext, we might need several ciphertexts to store all output values. If we need more than one ciphertext to represent all output values, we first use CMux gates to select the correct ciphertext. Afterwards, we select the right output value by rotating the ciphertext such that the output value corresponding to the input value ends up in the constant coefficient of the ciphertext. The horizontal and vertical packing methods can be combined to pack as many values as possible into one ciphertext.
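
As a plain illustration of the two layouts, the snippet below (placeholder values only, no ciphertexts) arranges the lookup-table outputs σ_{j,h} the way horizontal and vertical packing would place them in the coefficients of a packed polynomial.

```python
# Horizontal packing: one packed "ciphertext" per input index h, holding the
# outputs of all s subfunctions for that input.
# Vertical packing: one packed "ciphertext" per subfunction j, holding its
# outputs for all 2**u inputs.
s, u = 3, 4
sigma = [[f"sigma_{j},{h}" for h in range(2 ** u)] for j in range(s)]

horizontal = [[sigma[j][h] for j in range(s)] for h in range(2 ** u)]
vertical = [[sigma[j][h] for h in range(2 ** u)] for j in range(s)]

print(horizontal[5])     # the s outputs for input 5 in one ciphertext
print(vertical[1][:4])   # first coefficients of subfunction 1's ciphertext
```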

Chapter 4

Developing privacy-preserving applications

4.1 Privacy-preserving computation

Privacy-preserving computation is a collective name for all methods allowing computations on data while keeping the data private. One often considers two different scenarios in privacy-preserving computation: outsourced computation and multi-party computation. In outsourced computation, one party owns the data and wants to outsource the computation of some function to another party without revealing the data to this second party. Multi-party computation encompasses the setting in which multiple data owners want to jointly compute a function on the union of their data by engaging in a protocol that allows them to compute the result without any party revealing its own data to the others. Homomorphic encryption naturally fits the outsourced computation scenario. For multi-party computation, different techniques have been developed, such as linear secret sharing schemes and garbled circuits. However, homomorphic encryption schemes can also be used to enable multi-party computation by adapting the HE scheme to use multiple keys, and similarly one can set up dummy parties to achieve outsourced computation with MPC techniques. Nevertheless, HE schemes and MPC techniques are designed for different tasks and should therefore perhaps not be compared directly. In general, MPC requires communication between the different parties. Compared to MPC, HE thus has an asymptotic communication advantage, but this comes at the expense of computational efficiency. In the application settings considered in the literature, state-of-the-art HE techniques are still many times slower than MPC protocols. At present, the relative costs of computation and communication latency therefore largely determine the relative performance. MPC outperforms HE when latency does not create a bottleneck, but as the relative cost of latency over computation might increase, HE techniques could in the future become competitive with MPC protocols for many applications. Even so, research on MPC has led to its own strategy to deal with high latency networks: garbled circuits enable a constant number of communication rounds between the parties. Nonetheless, they demand large batches of data to be transmitted between the parties. One should, however, also not overlook the data expansion caused by the large ciphertext-to-plaintext ratio of homomorphic encryption schemes. Thus, much like garbled circuits, a homomorphic computation scenario will require transferring significant amounts of data between the client and the server. It would be interesting to create an in-depth comparison between HE and GC, as to the best of my knowledge this is currently not available. As the main focus of this thesis is homomorphic encryption, we will now look into the different design choices one needs to make to create a privacy-preserving computation based on HE.

4.2 Selecting the appropriate scheme

These days there are several homomorphic encryption schemes available; however, the schemes most frequently used for applications nowadays are the four schemes discussed in Chapter 3. As these schemes are all based on variants of the same hard problem, they have many similarities. However, while the older schemes like BGV and FV were designed as schemes that allow homomorphic operations, the newer schemes such as HEAAN and TFHE are designed with different goals in mind. This leads to some important differences, which might render one scheme more suitable for a specific application scenario than the others. Hence, considerable effort is spent on thoroughly understanding the different aspects of the schemes and assessing which scheme is most suited for which setting. We list some characteristics of the different schemes here.

Features of the different HE schemes

BGV, FV and HEAAN are RLWE based schemes working with polynomial arithmetic and are therefore very similar. However, the fact that HEAAN is designed for approximate arithmetic creates some significant dissimilarities. The goal of HEAAN to achieve approximate computations implies that the result of its decryption corresponds to an approximation of the result of the homomorphic operation. As a consequence, computations lead to some precision loss, as is the case for regular floating point arithmetic. Therefore, if the depth of the circuit is larger than the bit precision of the input data, it is impossible to retrieve any meaningful information from the computation result. As such, this scheme can never be fully homomorphic, because the number of computations one can perform is limited by the precision of the inputs and bootstrapping can hence not lead to unlimited computations. Other important differences between BGV and FV on the one hand and HEAAN on the other are the plaintext space and the packing strategy. As BGV and FV have a plaintext modulus t, their packing strategy is based on splitting the ring-defining polynomial Φm(X) modulo t into distinct irreducible factors of the same degree. The degree of these factors should be large enough to avoid wrap-around of the underlying plaintext during operations on the ciphertexts, because that would lead to incorrect messages after decryption. Since the required bit size of the plaintext modulus increases exponentially with the depth of the circuit, a deeper circuit increases the required degree of the factors of Φm(X) modulo t, which possibly reduces the number of factors. As a result, the packing strategy of BGV and FV has some dependency between the depth of the circuit one wants to evaluate and the number of slots that can be evaluated in parallel. This dependency is absent in the packing strategy for HEAAN, which can result in better amortised running times.
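
The dependency between the plaintext modulus and the number of slots can be inspected directly by factoring Φm(X) modulo t; a small check of this kind is sketched below, assuming SymPy is available and t is prime (the values m = 31 and t = 2 are arbitrary examples).

```python
# Count the SIMD slots obtained by splitting the cyclotomic polynomial
# Phi_m(X) modulo a prime plaintext modulus t: the number of irreducible
# factors is the number of slots, their common degree the slot size.
from sympy import symbols, cyclotomic_poly, factor_list

x = symbols('x')

def slot_structure(m, t):
    _, factors = factor_list(cyclotomic_poly(m, x), modulus=t)
    degrees = [f.as_poly(x).degree() for f, _ in factors]
    return len(degrees), degrees[0]

# Phi_31(X) has degree 30; modulo t = 2 it splits into 6 factors of degree 5,
# so this ring would give 6 slots, each holding an element of GF(2^5).
print(slot_structure(31, 2))
```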

TFHE originated as an improvement of the FHEW scheme [DM15] and is therefore based on a line of work on LWE based schemes. This leads to design choices that make it fairly different from BGV, FV and HEAAN. One of the biggest differences is the multiplication algorithm. As the torus is not a ring but has a Z-module structure, TFHE uses a GSW-like structure to perform homomorphic multiplications. The original GSW internal multiplication, which multiplies two GSW ciphertexts, has an asymmetric noise growth. When one works with binary messages, this results in a noise growth by a factor of l · poly(n), with l the number of multiplications and n the dimension of the ciphertexts. The noise growth can be minimised by ordering subsequent multiplications such that one of the two multiplicands is always a fresh ciphertext. In BGV, FV and HEAAN the noise in homomorphic multiplications is minimised by using a binary tree to order the multiplications, which leads to a noise growth by a factor of poly(n)^(log2(l)) for l multiplications, with n the dimension of the polynomial ring defining the plaintext and ciphertext space. Minimising the noise growth is very important, as a smaller noise growth enables us to select smaller parameters and hence perform more efficient operations and reduce the memory overhead. TFHE defines an external product between a TRGSW structured ciphertext and a TRLWE ciphertext. In TFHE the external product is used to define a CMux gate, which means the TRGSW ciphertext encrypts the decision bit. As the TRGSW ciphertext of the external product encrypts a bit, TFHE inherits the asymmetric noise growth of the GSW multiplication in the CMux operation. This asymmetric noise growth makes TFHE more suitable than BGV, FV or HEAAN for operations with a larger multiplicative depth. The CMux gate is extensively used in the gate bootstrapping and lookup table techniques, allowing the scheme to efficiently evaluate binary gates and nonlinear functions. These two operations are very costly to perform homomorphically with BGV, FV and HEAAN, as the native operations of these schemes are arithmetic operations.
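
The difference between the two growth shapes is easy to see numerically; the comparison below is purely illustrative (poly(n) is crudely modelled as n itself and the values of n and l are arbitrary), not an exact noise analysis of any scheme.

```python
# Compare l * poly(n) (sequential GSW/CMux-style growth with a fresh operand)
# against poly(n)**log2(l) (binary multiplication tree of RLWE-style products).
import math

n = 2 ** 10                         # illustrative dimension
for l in (4, 16, 64, 256):          # number of multiplications
    sequential = l * n
    tree = n ** math.log2(l)
    print(f"l = {l:3d}:  l*n = {sequential:.1e}   n^log2(l) = {tree:.1e}")
```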

Remember on the other hand that the polynomial structure of RLWE based schemes like BGV, FV and HEAAN reduces the memory overhead, as their ciphertext-to-plaintext ratio is much smaller than the ratio of LWE schemes, and their packing techniques reduce this overhead even further. In addition, the ciphertext modulus q can be split into small prime factors, which led to RNS implementations that showed significant performance improvements for BGV, FV and HEAAN. As TFHE has no explicit ciphertext modulus and not all of its operations are arithmetic operations, neither the RNS optimisation nor the SIMD packing structure can be translated to TFHE. Another packing technique was developed for TFHE, but it is fundamentally different. It is a coefficient-wise packing which reduces the memory overhead, but with regard to parallel computations, this coefficient-wise packing is only compatible with linear operations. Multiplicative SIMD operations for TRGSW ciphertexts are not supported by the TFHE packing technique. Efficient multiplicative SIMD operations are supported by the CRT packing of BGV and FV and the packing based on the canonical embedding of HEAAN. As TFHE evaluates nonlinear functions with a lookup table approach, the coefficient-wise packing technique allows one to compress multiple output values into one ciphertext. As such, one can evaluate different functions on the same input value. The packing of the other HE schemes, on the contrary, allows one to pack several input values into one ciphertext, so as to perform parallel computations in the sense of evaluating one function simultaneously on several independent input values.

It should be clear that, with the current state-of-the-art techniques for homomorphic encryption, there is no best scheme that outperforms the others in all possible situations. The selection of the scheme depends largely on the goals one wants to achieve and the kind of operations one wants to perform. The fact that all currently popular homomorphic encryption schemes are based on variants of the same hard problem and that different schemes are more suitable for specific operations has led to the idea of switching between different schemes, so as to evaluate different subfunctions of an application with the most efficient homomorphic scheme for that particular subfunction. The theoretical framework describing how to transform a ciphertext of one scheme into a ciphertext of another scheme is described in [BGGJ20]. However, this article does not provide any parameter sets that allow one to estimate the computation time or memory usage for applications following this strategy, nor is there a public implementation of these techniques available. Consequently, it remains an open problem to assess how usable these techniques are in practice and what their overhead is, in order to find the trade-off point where switching schemes becomes more advantageous than performing everything with the same HE scheme.

Libraries and practical implementations

After the construction of the first fully homomorphic encryption scheme by Gentry, research on homomorphic encryption focused almost exclusively on creating a scheme that can be used in practice. The efforts to create more performant homomorphic encryption schemes are therefore twofold. On the one hand, research proceeded by exploring mathematical techniques and theoretical breakthroughs. On the other hand, researchers created implementations of the different HE schemes. An overview of the most commonly used libraries and the schemes they implement is given in Table 4.1.

Library      Implemented HE scheme      Link
FV-NFLlib    FV                         github.com/CryptoExperts/FV-NFLlib
HEAAN        HEAAN                      github.com/snucrypto/HEAAN
TFHE         TFHE                       tfhe.github.io/tfhe/
HElib        BGV, HEAAN                 github.com/homenc/HElib
SEAL         FV, HEAAN                  github.com/Microsoft/SEAL
PALISADE     BGV, FV, HEAAN, TFHE       palisade-crypto.org/

Table 4.1: Overview of HE libraries

The libraries from Table 4.1 were created for research purposes, which means that they are not always user friendly and often lack proper documentation. For example, selecting parameters for the schemes, which is a challenging task as it depends on complicated relations and error-prone analyses, is mostly left to the user. This gives the user the possibility to tailor the scheme and implementation to their specific application, but it also implies one needs to know all the details of the scheme and the library in order to create a well-performing and secure implementation of the application. In order to fill this void and make homomorphic encryption available to a broader public, the research community has started efforts on standardising the techniques and parameter sets, first in a self-organised consortium [ACC+18] and now also at the ISO organisation, which plans to elaborate on the previous work [ISO19] with an extra section on fully homomorphic encryption. In addition, different companies and organisations started creating more user-friendly libraries, for example PALISADE [PRR17], which implements BGV, FV, HEAAN and TFHE in a black box manner, and Concrete by Zama [CJP20], which implements a variant of TFHE in a black box manner. In the future, the Zama team aims to make Concrete even more user friendly by extending it with a compiler that will automatically select the correct parameters for one's program. This will enable users with no knowledge of homomorphic encryption to easily transform their programs into homomorphic programs and will hence increase the amount of experimentation done with homomorphic encryption, which could potentially lead to new insights and trigger new techniques to improve homomorphic encryption.

4.3 Tuning the application

If one wants to use homomorphic encryption in a privacy-preserving computation setting, one not only needs to select a scheme and corresponding library, but one also needs to adjust the algorithm one wants to compute to make it more “HE-friendly”. Several operations are hard or even infeasible to compute efficiently using homomorphic encryption. Hence, when designing a homomorphic solution to a certain application, one usually investigates how the algorithm can be adapted to make it easier to compute in the homomorphic domain. Making an algorithm HE-friendly is done with the following steps. First, one considers the different operations in the algorithm and investigates how computationally expensive it is to compute each operation with the selected HE scheme. Secondly, one tries to replace the most expensive operations with a cheaper alternative or approximates the result of such an operation with a different computation technique. To assess the effect of replacing the original operation with an alternative, one investigates in plaintext what effect the replacement has on the accuracy of the overall result of the algorithm. As such, there is often a trade-off between the accuracy of the result and a more performant algorithm. The goal is of course to make the algorithm as efficient as possible with minimal degradation of the accuracy of the result.
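
A typical instance of this replacement step is substituting a low-degree polynomial for a function HE cannot evaluate directly, and checking the damage in plaintext; the sketch below does this for the sigmoid function used in logistic regression (the interval [−8, 8] and degree 3 are illustrative choices, and NumPy is assumed to be available).

```python
# Fit a degree-3 polynomial to the sigmoid over the range where the inputs
# are expected to lie, then measure the plaintext approximation error.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

xs = np.linspace(-8, 8, 1000)
coeffs = np.polyfit(xs, sigmoid(xs), deg=3)    # least-squares polynomial fit
approx = np.polyval(coeffs, xs)

print("maximum absolute error on [-8, 8]:", np.max(np.abs(approx - sigmoid(xs))))
```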

In addition to adapting the operations of the algorithm itself, one also has to consider the order of the operations. The error inherent to a homomorphic ciphertext grows with every operation. As the error growth influences the size of the parameters, and larger parameters lead to less efficient operations, a larger error has a negative effect on the efficiency of the algorithm. Therefore one has to take care to order the operations in a way that minimises the error growth. In general, the noise grows most during homomorphic multiplications. It is thus important to order these in a way that minimises the noise growth. As seen before, this means that we need to order multiplications for BGV, FV and HEAAN as much as possible into binary trees, while the asymmetric noise growth in the external product of TFHE requires ordering the multiplications sequentially such that one of the multiplicands is a fresh ciphertext. In BGV, FV and HEAAN, a multiplication is usually followed by a relinearisation operation to ensure compactness of the scheme. Compactness of the scheme means that the size of the output ciphertext does not depend on the computed algorithm. However, in some situations it can be beneficial to delay relinearisation to a later point in the computation to keep the error growth smaller and subsequently get a more performant algorithm. As the order of the operations affects the efficiency of the algorithm, it can be useful to experiment with the order of the operations to find the most efficient implementation of a certain algorithm.
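
The effect of the ordering is easiest to see through the multiplicative depth it induces; the small helper below only counts depths for a few illustrative values of l and does not touch any ciphertexts.

```python
# Multiplicative depth of l multiplications: a sequential chain needs depth
# l - 1, a balanced binary tree only ceil(log2(l)).
import math

def sequential_depth(l):
    return l - 1

def binary_tree_depth(l):
    return math.ceil(math.log2(l))

for l in (2, 8, 32, 128):
    print(f"{l:3d} multiplications: chain depth {sequential_depth(l):3d}, "
          f"tree depth {binary_tree_depth(l):2d}")
```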

A third important aspect that can have a big influence on the efficiency of a scheme is the encoding of the data. Each scheme has a corresponding plaintext space, and finding a suitable way to embed the input data into the plaintext space can lead to a significant improvement of the efficiency of the operations. In terms of input data, HEAAN natively encodes real numbers or even complex numbers, whereas BGV and FV natively work with integers. However, several researchers have investigated how to encode other data types into the BGV/FV plaintext space Rt. Some examples of articles that consider HE specific data encoding are [DGBL+15], [CSVW16], [CSV17] and [BBB+17] (see also Chapter 6 of this thesis). In addition to optimising how to represent specific data types, there is also the question of how to optimally pack the data in one ciphertext. Which data inputs are packed together in one ciphertext can also affect the efficiency, as the right packing of data can facilitate parallelisation of computations. Consider for example a matrix structure: one can pack the elements of a matrix in different ways, for example column by column, row by row or according to some diagonal direction. Which ordering of the elements is optimal depends on what operations one wants to perform on the matrix. For this reason, one can improve the performance of one's homomorphic algorithm by choosing the way to encode the data into ciphertexts wisely.
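
As an example of why the packing layout matters, the sketch below mimics in plain NumPy the well-known diagonal ordering for a matrix-vector product, which only needs rotations of the packed vector and slot-wise multiplications and additions, exactly the operations that SIMD-packed ciphertexts support cheaply (this is a plaintext illustration, not code for any HE library).

```python
# Matrix-vector product using the diagonal layout: arrays stand in for
# packed plaintexts/ciphertexts, np.roll stands in for a slot rotation.
import numpy as np

def diag_matvec(A, v):
    n = A.shape[0]
    acc = np.zeros(n)
    for k in range(n):
        diag_k = np.array([A[i, (i + k) % n] for i in range(n)])  # k-th generalised diagonal
        acc += diag_k * np.roll(v, -k)                            # rotate, multiply slot-wise, add
    return acc

A = np.arange(16, dtype=float).reshape(4, 4)
v = np.array([1.0, 2.0, 3.0, 4.0])
print(diag_matvec(A, v), A @ v)   # the two results coincide
```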

4.4 Selecting the parameters

After selecting the correct scheme and library for the application and transforming the application to make it HE-friendly, there is still the daunting task of selecting parameters. In general, we know that smaller parameters for the scheme lead to smaller ciphertexts and more efficient computations. Larger parameters, on the contrary, increase the key size, ciphertext size and computation time for the algorithms. However, the parameters cannot be chosen freely. They are subject to several bounds, which can be categorised into two groups: bounds based on the security of the scheme and bounds based on the correctness of the operations.

The security bounds determine minimal values for the parameters needed to achieve a certain security level λ, which indicates that a successful attack against the underlying hard problem of the scheme requires an effort of roughly 2^λ basic operations. Albrecht et al. described how to estimate the security level based on known attacks on LWE [APS15] and created an accompanying tool which gets updated as new attacks become known. This estimation tool, which is available online [Alb15], is currently used by many researchers to estimate the security level of their parameter set while designing a homomorphic application. However, this tool only takes into account attacks on LWE, while homomorphic encryption often uses RLWE. For a long time there were no better attacks on RLWE than on LWE, but meanwhile it was shown in [ELOS15] and [CLS17] that for certain choices of rings and error distributions there are better attacks on RLWE. These attacks were later improved in [CLS16], [CIV16a] and [CIV16b]. In [Pei16], Peikert investigates the rings and corresponding error distributions for which the attacks appeared and shows that any RLWE instantiation which satisfies the reduction from the hardness of lattice problems is not susceptible to a generalisation of the abovementioned attacks. As there is no uniform way to check for these weaknesses yet, there is no tool that can estimate the security of a given set of parameters taking these attacks into account. At this point, there are only recommendations on which rings to use and suitable error distributions for the chosen ring. Currently the recommendation in the standard document [ACC+18] is to use 2-power cyclotomic fields, which is also the only setting for which the standard lists concrete parameter sets, but the hope is that future research will enable parameter recommendations for general cyclotomic rings as well.

With bounds for correctness, we refer to all the conditions we need to satisfy to ensure the homomorphic computation ends with a correct result. Currently used homomorphic encryption schemes have some noise added during encryption. This noise grows with every operation, which implies the noise growth has to be monitored to ensure that a meaningful result can still be obtained after decryption. For BGV and FV, the noise is completely removed during decryption; therefore, we have to ensure that the noise remains smaller than a specific upper bound determined by the parameters of the scheme. In HEAAN, the noise added for security during encryption is considered part of the precision loss that always occurs during floating point operations on a computer. TFHE can work with both strategies: either one selects a discrete set of possible messages and during decryption one rounds to the nearest message, or one considers the result of decryption to be the approximate result of the computation. Choosing the strategy of BGV and FV, one has to choose the parameters such that the accumulated noise bound after performing all the operations remains smaller than the bound for correct decryption. Taking the approximate number strategy of HEAAN, one has to ensure that the parameters are chosen such that the desired precision for the result of the computation can be achieved. In order to ensure a correct result of a homomorphic computation, an error estimation has to be done such that one can select a correct set of parameters for the application. The better we understand the noise behaviour of the different homomorphic operations in the different schemes, the easier it becomes to select the parameters.

Finding parameters that satisfy both the correctness and the security bounds often requires several iterations of checking one bound for the selected parameters and then adapting the parameters to satisfy the other bound. If there is still some freedom to adapt the parameters once one has found parameters that satisfy both the correctness and the security bound, one can tune them to make the homomorphic computation more performant. The most common trade-offs in homomorphic applications are trading off computation time against accuracy or trading off computation time against memory requirements. So depending on the bottleneck of the application some tuning of the parameters can be done, but in order for the result to be meaningful, this tuning can only take place within the range of parameters that satisfy both the correctness and the security bounds.
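
The iteration described above can be summarised schematically as follows; both predicates are placeholders of my own (hypothetical rules of thumb, not a real security estimate or noise analysis) and only serve to show the back-and-forth between the two bounds.

```python
# Alternate between the correctness bound and the security bound until a
# parameter pair (ring dimension n, modulus bit length log_q) satisfies both.
def decrypts_correctly(n, log_q, depth):
    return log_q >= 60 + 30 * depth        # placeholder correctness rule

def is_secure(n, log_q):
    return n >= 32 * log_q                 # placeholder security rule

def select_parameters(depth):
    n, log_q = 1024, 60
    while not (decrypts_correctly(n, log_q, depth) and is_secure(n, log_q)):
        if not decrypts_correctly(n, log_q, depth):
            log_q += 30                    # grow the modulus for correctness
        else:
            n *= 2                         # grow the dimension for security
    return n, log_q

print(select_parameters(depth=4))          # (8192, 180) with these toy rules
```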

4.5 Usability of homomorphic encryption

As the discussion above indicates, a lot of customisation is needed to fully optimise homomorphic applications, both of the application itself and of the homomorphic scheme and its instantiation. This has several important downsides: for one, a lot of work needs to be done in order to design a performant homomorphic application. Besides, in order to optimise one's application, one needs to be an expert to be able to select the appropriate combination of techniques from all available possibilities. An additional limitation can be that in some scenarios the application designers do not have that much freedom. They might need to follow specific standards or cannot make certain design choices for compliance reasons. It is therefore clear that even though homomorphic encryption techniques might have matured enough to be practical in some very specific use cases, the current state-of-the-art homomorphic encryption techniques are not yet applicable in general and it is still an open research question whether they ever will be. Regardless of the fact that HE is not yet applicable as a black box technique for all possible application scenarios, several efforts have recently been made to make HE accessible to a broader public. This is done by creating libraries that offer homomorphic encryption schemes as black box functionalities to their users, such as PALISADE [PRR17], by the standardisation effort [ACC+18] to guide users in selecting a scheme for their application and the parameters to use, and by the planned future work by Zama [Zam19] to construct compilers that automatically select parameters for a homomorphic program in a specific application scenario.

Chapter 5

Conclusion and future research directions

Homomorphic encryption is a very powerful tool in the data-driven world of today, as employing it enables entities to use the resources of an untrusted server without having to disclose their private information. The main hurdle preventing the deployment of homomorphic encryption in real-world applications is efficiency. Over the last few years, research on homomorphic encryption has led to many improvements and optimisations and even to several new homomorphic encryption schemes. Most works in this thesis are in line with this research direction and explore new techniques to improve the efficiency of homomorphic encryption, following two different strategies. On the one hand, we construct more general techniques which can be used as building blocks to improve the efficiency of several applications and, on the other hand, we design optimal solutions for specific well-defined application scenarios. The findings of this thesis can be summarised as follows.

Selecting a suitable encoding for your input data can dramatically improve the efficiency of the homomorphic encryption scheme. At the start of this thesis, encoding real data values for homomorphic encryption was a big challenge. The existing techniques were inefficient, as they only used a very small part of the BGV or FV plaintext space Rt. Therefore, large encryption parameters were needed, which resulted in slow operations and increased memory requirements. A new encoding algorithm transforming real numbers into elements of Rt is proposed in Chapter 6. The main idea of our encoding algorithm is to expand the real number with respect to a non-integral base bw, where w is a parameter indicating the sparseness of the resulting encoding. This expansion is then transformed into a polynomial of the plaintext space Rt. The chosen base ensures that the encoding results in a polynomial with coefficients in the set {−1, 0, 1}, with the additional property that each set of w consecutive coefficients contains at most one non-zero coefficient. Hence this encoding algorithm takes advantage of the available powers of X, which are needed anyway to ensure the security of the encryption scheme. Consequently, the w-NIBNAF encoding manages to keep the plaintext modulus t smaller and to perform the same number of operations with significantly smaller encryption parameters. As such, Chapter 6 shows that a suitable encoding of the data before encryption can lead to a remarkable improvement of the performance of a privacy-preserving application.
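
To give a flavour of such an expansion, the sketch below greedily expands a real number with respect to a non-integral base into a sparse sum of signed powers; it is a simplified illustration of the idea only, not the actual w-NIBNAF algorithm of Chapter 6, and the base 1.3 is an arbitrary choice.

```python
import math

def sparse_expansion(x, b=1.3, eps=1e-6, max_terms=64):
    """Greedy expansion of x as a sum of terms +/- b**k with digits in {-1, 0, 1}."""
    digits = {}
    for _ in range(max_terms):
        if abs(x) < eps:
            break
        k = round(math.log(abs(x), b))       # exponent of the power of b closest to |x|
        sign = 1 if x > 0 else -1
        digits[k] = digits.get(k, 0) + sign
        x -= sign * b ** k                   # subtract the chosen term and continue
    return digits

enc = sparse_expansion(2.718)
print(enc, sum(c * 1.3 ** k for k, c in enc.items()))   # reconstruction is close to 2.718
```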

Efficient homomorphic solutions can be created for some specific application scenarios. Genome-wide association studies aim to identify genetic variants associated with specific traits. To produce reliable results, they require large datasets containing genomic data of patients. Being able to compute these statistics in a privacy-preserving manner makes it possible to perform genome-wide association studies without compromising the privacy of the patients. Two efficient privacy-preserving solutions are given in Chapter 7: one solution based on homomorphic encryption and one solution based on multi-party computation. Next to a thorough comparison of the performance of these two techniques in this specific application scenario, we show that both solutions can perform the task at hand in a reasonable time frame.

The next application investigated in this thesis is logistic regression. A logistic regression model allows one to classify data by computing the probability that a set of inputs belongs to a specific class. Similar to other machine learning techniques, logistic regression consists of two phases: a training phase, in which the classification model is constructed from input data points together with a classification result for each of them, and an inference phase, in which the constructed model is used to classify new data. When the logistic regression model is used to classify sensitive data, it also needs to be trained on sensitive data. The work in Chapter 8 focuses on how to train a logistic regression model without compromising the input data. The training algorithm is adapted to better suit homomorphic encryption and tested in two scenarios which rely on sensitive data. In the first scenario, the model predicts the probability of developing a specific disease based on medical data from patients, and in the second scenario we constructed a model for fraud detection based on a financial dataset containing data of credit card transactions. The resulting models obtained good results in both scenarios and therefore show that accurate logistic regression models can be constructed without compromising the privacy of the data.

In Chapter 9, we succeed in constructing an efficient string search protocol by designing a randomised homomorphic equality circuit. The strength of our randomised equality circuit is that it is independent of the pattern length. This provides more flexibility to the user, as the encryption parameters do not depend on a fixed pattern length, but can be set to accommodate a wide range of different pattern lengths. Therefore, we not only improve the efficiency of homomorphically computing the equality, but we also increase its practical applicability. Furthermore, we construct a compression method which reduces the communication cost of sending the search results to the client. As string search is a general problem occurring in various application scenarios, our techniques can be employed in many settings, such as Internet search, text processing and DNA analysis.

Present-day techniques show that homomorphic encryption is only practical in customised application scenarios. Due to their limitations, they only cover a very small subset of the broad range of application scenarios in which homomorphic encryption could enable privacy-preserving computations using the resources of an untrusted server. As most of the algorithms used in application scenarios do not correspond directly to the optimisations we need for efficient privacy-preserving computations, the scenarios in which one does not have the freedom to adapt the algorithm can often not be covered with the current techniques. The security model of homomorphic encryption corresponds perfectly to a client outsourcing computations to an untrusted server; however, some application scenarios require a different security model. The last chapter of this thesis looks into such a scenario, as it aims to construct a threshold signature scheme. Threshold signature schemes aim to mitigate the risk that an adversary can produce a valid signature by distributing the signing power among multiple parties. Even though it would be possible to build such a construction based on a multi-key homomorphic encryption scheme, using MPC for this application is preferred, as there are well-known MPC protocols that suit this threshold setting perfectly. In Chapter 10, we combine and extend several state-of-the-art MPC protocols to create an efficient threshold signature scheme. This work is motivated by recent developments like blockchain, in which creating a valid signature is extremely valuable, as even a signature on an incorrect message can have catastrophic consequences. As these developments renewed the interest in threshold signature schemes, it became important to find out whether already standardised signature schemes could be turned into efficient threshold signature schemes given the state-of-the-art privacy-preserving computation techniques, which is exactly what we investigated in Chapter 10.

Designing efficient privacy-preserving computations calls for a cumbersome process of finding the optimal techniques and parameters. Regardless of the significant improvements over the last few years, research has not led to one scheme or technique that outperforms all others. As such, crafting an efficient privacy-preserving application demands precise fine-tuning of several aspects of the solution. This is discussed in Chapter 4 of this thesis and shown for concrete applications in Chapters 6 to 10. First, depending on the scenario, the input data and the goal of the application, one has to select the suitable scheme or protocols and corresponding software. Then, if possible, the algorithm that needs to be computed is adapted and the input data is transformed using a proper encoding technique, in order to make the application fit the selected scheme or protocols as well as possible. Finally, the correct parameters need to be selected, taking into consideration the security, the correctness and, if possible, the efficiency of the resulting application. As all these design choices influence each other and a wrong choice can lead to an inferior solution, one really has to have expert knowledge of both the application and the available schemes and techniques in order to construct the optimal solution.

Increased interest in privacy-preserving solutions started the redesign of known structures to create a better fit with the current privacy-preserving computation techniques. The current privacy-preserving computation techniques depend on a specific data format and the operations they can perform efficiently are often restricted. To make computations more performant in the privacy-preserving computation setting, their arithmetic complexity needs to be minimised. Older, now standard cryptographic techniques like hash functions and block ciphers are optimised for efficient hardware and software implementations and hence do not suit the emerging privacy-preserving computation setting. Due to the increasing interest in privacy-preserving techniques, researchers have started to study new designs for these secure cryptographic algorithms in order to minimise their arithmetic complexity. An example is the new hash function mentioned in Chapter 10. In addition, new standards are being developed, both to describe homomorphic encryption, i.e. the privacy-preserving techniques themselves, and to include the newly developed algorithms, optimised to fit the privacy-preserving computational setting, into the existing standards for these algorithms. As companies design their products based on standards, this is an important step in making the world ready for widespread use of privacy-preserving techniques.

The goal of scientific research is to answer open questions and find solutions to open problems, but in this process often new questions pop up. Here we list a number of questions that remain open today and can serve as suggestions for future research directions.

Further optimising polynomial arithmetic: As they are based on hard lattice problems, the currently most used homomorphic encryption schemes work with polynomials with large coefficients or, equivalently, vectors of large integers. The most straightforward way to improve the performance of these schemes is therefore to improve the performance of polynomial arithmetic. Some research on dedicated hardware has already been performed, for example [CCM+18], which reports on different designs for efficient polynomial multipliers. Researchers are investigating methods to speed up these operations by improving the existing algorithms and creating new hardware designs focused on boosting the performance of homomorphic encryption.

Changing the framework used to optimise algorithms: Intrinsic properties of homomorphic encryption pose limitations on the available operations in an algorithm. One can, for example, not branch a program depending on whether or not a certain condition is satisfied. Several algorithms are optimised based on such tricks; think for example of pattern matching algorithms based on suffix trees or iterative processes that run until the error is smaller than a specific threshold. It would be interesting to construct a framework consisting only of the operations that can be performed homomorphically and then try to find alternative optimisations for the naive algorithms within this framework. This might lead to new, more HE-friendly algorithms to efficiently perform the task at hand. Ideally, after finding such algorithms, they can be compared to the existing classical optimisations and lead to new insights for the design of future homomorphic algorithms.
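
A small example of reformulating an algorithm within such a restricted framework is replacing a data-dependent branch by the arithmetic multiplexer c·a + (1 − c)·b, a circuit that can be evaluated on ciphertexts; the snippet below only checks the equivalence on clear values.

```python
# Branching version versus its HE-friendly, branch-free equivalent
# (c is assumed to be an encrypted 0/1 value in the homomorphic setting).
def branching(cond, a, b):
    return a if cond else b

def branch_free(c, a, b):
    return c * a + (1 - c) * b

for c, a, b in [(1, 10, 20), (0, 10, 20)]:
    assert branching(bool(c), a, b) == branch_free(c, a, b)
print("branch-free multiplexer matches the branching version")
```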

Encoding of data types for homomorphic encryption: Investigating how specific data types can be encrypted with homomorphic encryption is also a research direction that can be explored from two perspectives. We can encode the data in order to make it fit the plaintext space of the homomorphic scheme optimally; an example of such an encoding is given in Chapter 6. Another option is to construct a homomorphic encryption scheme that natively supports arithmetic on the specific data type. HEAAN [CKKS17] is an example that natively supports fixed point arithmetic, [CLPX18] supports arithmetic on large integers and [BCIV20] supports arithmetic on complex numbers. An intriguing question is how to deal with other data types like floating point numbers or strings. In addition to developing encoding techniques for these, it would be interesting to see if one can construct new homomorphic schemes that operate natively on these data types.

Homomorphic encryption based on other primitives or mathematical structures: At present, all concrete instances of homomorphic encryption which are considered efficient and secure rely on adding noise from a specific error distribution to the ciphertexts. For most schemes, this noise is removed during decryption. A well-known technique to resolve errors in cryptography is to work with error correcting codes. Hence a natural question that arises is whether one can construct homomorphic encryption schemes based on coding theory. Some attempts to create a homomorphic encryption scheme based on error correcting codes have already appeared in the literature: a proposal by Armknecht et al. [AAPS11], a proposal by Bogdanov and Lee [BL11] that was broken by Gauthier et al. [GOT12] and a recent work by Challa and Gunta [CG16]. Further research in this direction could lead to error correcting codes with homomorphic properties which might be applicable in specific application scenarios. In addition, it remains an open question whether noise is necessary to build secure homomorphic encryption schemes. In [Nui14], Nuida developed a framework to construct noiseless homomorphic encryption; however, up until today, no concrete instantiation of this framework has been created.

Combining different privacy-preserving techniques: The state-of-the-art homomorphic encryption schemes all allow the computation of arbitrary circuits. However, some schemes clearly outperform the others at certain computational tasks. Given this fact, two lines of thought can be followed. One can split complex algorithms into several subfunctions consisting of operations that one can perform efficiently with one scheme and switch between schemes in order to use the optimal scheme for each of these subfunctions. Alternatively, one can look at the constructions of the different schemes, try to extract the reason for their efficiency in performing a certain operation and then strive to transfer this idea to the setting of the other scheme. In the first line of thought mentioned above, Boura et al. [BGGJ20] show how to map the ciphertext and plaintext representations of the different schemes onto each other and describe how to switch between them. Additional research in this direction could more thoroughly investigate the correctness and security bounds of this switching scenario and their effect on the parameter selection, or further optimise the conversion techniques themselves. Another line of research could investigate whether it is possible to transfer the best performing operations of one scheme directly into the other scheme, hence removing the actual conversion of the ciphertexts. One pressing question in this line of work is whether it is possible to design a packing structure that allows SIMD operations while being compatible with the external multiplication of TFHE. In this thesis, we also illustrated the conversion between LSSS with a specific access structure and garbled circuits, which are two strategies to perform computations on private data in the MPC setting. A natural expansion would be to create hybrid systems which combine MPC and HE techniques to perform privacy-preserving computations by designing conversion algorithms between them. As a first step, more work needs to be done on comparing the different techniques. For example, a comparison between garbled circuits and homomorphic encryption, two techniques that have constant communication cost between the parties, can reveal whether there are trade-offs for which one technique starts to outperform the other.

Creation of cryptographic primitives suitable for privacy-preserving computation: The growing interest in privacy-preserving computations inspires optimisation with respect to different metrics than before. Traditional symmetric cryptographic primitives are designed to be efficient in terms of hardware and software implementations. However, new cryptographic techniques such as multi-party computation, homomorphic encryption and zero-knowledge proofs require optimisation with regard to arithmetic complexity. As expressing these older primitives over a large prime field is nowadays often the most expensive part of such cryptographic systems, a new line of research has emerged that aims to redesign the old cryptographic primitives and optimise them to better suit arithmetic computations. Some successful examples of this line of research are the MPC-friendly block ciphers [ARS+15, AGR+16, GRR+16] and hash functions [AAB+19, GKK+19]. More research in this line of work would be useful to identify the principles on which these new designs should rely in order to ensure secure and efficient algorithms for this new cryptographic setting.

Bibliography

[AAB+19] Abdelrahaman Aly, Tomer Ashur, Eli Ben-Sasson, Siemen Dhooghe, and Alan Szepieniec. Design of symmetric-key primitives for advanced cryptographic protocols. Cryptology ePrint Archive, Report 2019/426, 2019. https://eprint.iacr.org/2019/426.

[AAPS11] Frederik Armknecht, Daniel Augot, Ludovic Perret, and Ahmad-Reza Sadeghi. On constructing homomorphic encryption schemes from coding theory. In Liqun Chen, editor, 13th IMA International Conference on Cryptography and Coding, volume 7089 of LNCS, pages 23–40. Springer, Heidelberg, December 2011.

[ABF+17] Toshinori Araki, Assi Barak, Jun Furukawa, Tamar Lichter, Yehuda Lindell, Ariel Nof, Kazuma Ohara, Adi Watzman, and Or Weinstein. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, pages 843–862. IEEE Computer Society Press, May 2017. doi:10.1109/SP.2017.15.

[ABPA+19] Ahmad Qaisar Ahmad Al Badawi, Yuriy Polyakov, Khin Mi Mi Aung, Bharadwaj Veeravalli, and Kurt Rohloff. Implementation and performance evaluation of RNS variants of the BFV homomorphic encryption scheme. IEEE Transactions on Emerging Topics in Computing, 2019.

[ACC+18] Martin Albrecht, Melissa Chase, Hao Chen, Jintai Ding, Shafi Goldwasser, Sergey Gorbunov, Shai Halevi, Jeffrey Hoffstein, Kim Laine, Kristin Lauter, Satya Lokam, Daniele Micciancio, Dustin Moody, Travis Morrison, Amit Sahai, and Vinod Vaikuntanathan. Homomorphic encryption security standard. Technical report, HomomorphicEncryption.org, Toronto, Canada, November 2018.


[ACC+20] Abdelrahaman Aly, Kelong Cong, Daniele Cozzo, Marcel Keller, Emmanuela Orsini, Dragos Rotaru, Oliver Scherer, Peter Scholl, Nigel P. Smart, Titouan Tanguy, and Tim Wood. SCALE and MAMBA documentation, v1.10, 2020. URL: https://homes.esat.kuleuven.be/~nsmart/SCALE/Documentation.pdf.

[AGR+16] Martin R. Albrecht, Lorenzo Grassi, Christian Rechberger, Arnab Roy, and Tyge Tiessen. MiMC: Efficient encryption and cryptographic hashing with minimal multiplicative complexity. In Jung Hee Cheon and Tsuyoshi Takagi, editors, ASIACRYPT 2016, Part I, volume 10031 of LNCS, pages 191–219. Springer, Heidelberg, December 2016. doi:10.1007/978-3-662-53887-6_7.

[Ajt96] Miklós Ajtai. Generating hard instances of lattice problems.In Proceedings of the twenty-eighth annual ACM symposium onTheory of computing, pages 99–108, 1996.

[Ajt98] Miklós Ajtai. The shortest vector problem in L2 is NP-hard forrandomized reductions. In Proceedings of the thirtieth annualACM symposium on Theory of computing, pages 10–19, 1998.

[Alb15] M. Albrecht. Complexity estimator for solving LWE. Technicalreport, 2015. https://bitbucket.org/malb/lwe-estimator/src/master/.

[ALP+19] Asra Ali, Tancrède Lepoint, Sarvar Patel, Mariana Raykova,Phillipp Schoppmann, Karn Seth, and Kevin Yeo. Communication–computation trade-offs in PIR. Cryptology ePrint Archive, Report2019/1483, 2019. https://eprint.iacr.org/2019/1483.

[AOR+19] Abdelrahaman Aly, Emmanuela Orsini, Dragos Rotaru, Nigel P.Smart, and Tim Wood. Zaphod: Efficiently combining LSSSand garbled circuits in SCALE. In Proceedings of the 7th ACMWorkshop on Encrypted Computing & Applied HomomorphicCryptography, WAHC’19, page 33–44, New York, NY, USA, 2019.Association for Computing Machinery. doi:10.1145/3338469.3358943.

[AP13] Jacob Alperin-Sheriff and Chris Peikert. Practical bootstrappingin quasilinear time. In Ran Canetti and Juan A. Garay, editors,CRYPTO 2013, Part I, volume 8042 of LNCS, pages 1–20. Springer,Heidelberg, August 2013. doi:10.1007/978-3-642-40041-4_1.

[AP14] Jacob Alperin-Sheriff and Chris Peikert. Faster bootstrapping with polynomial error. In Juan A. Garay and Rosario Gennaro, editors, CRYPTO 2014, Part I, volume 8616 of LNCS, pages 297–314. Springer, Heidelberg, August 2014. doi:10.1007/978-3-662-44371-2_17.

[APS15] Martin R. Albrecht, Rachel Player, and Sam Scott. On theconcrete hardness of learning with errors. Journal of MathematicalCryptology, 9(3):169 – 203, 01 Oct. 2015. doi:10.1515/jmc-2015-0016.

[ARS+15] Martin R. Albrecht, Christian Rechberger, Thomas Schneider, Tyge Tiessen, and Michael Zohner. Ciphers for MPC and FHE. In Elisabeth Oswald and Marc Fischlin, editors, EUROCRYPT 2015, Part I, volume 9056 of LNCS, pages 430–454. Springer, Heidelberg, April 2015. doi:10.1007/978-3-662-46800-5_17.

[BBB+17] Charlotte Bonte, Carl Bootland, Joppe W. Bos, Wouter Castryck,Ilia Iliashenko, and Frederik Vercauteren. Faster homomorphicfunction evaluation using non-integral base encoding. In WielandFischer and Naofumi Homma, editors, CHES 2017, volume 10529of LNCS, pages 579–600. Springer, Heidelberg, September 2017.doi:10.1007/978-3-319-66787-4_28.

[BBL17] Daniel Benarroch, Zvika Brakerski, and Tancrède Lepoint. FHEover the integers: Decomposed and batched in the post-quantumregime. In Serge Fehr, editor, PKC 2017, Part II, volume 10175of LNCS, pages 271–301. Springer, Heidelberg, March 2017. doi:10.1007/978-3-662-54388-7_10.

[BCD+09] Peter Bogetoft, Dan Lund Christensen, Ivan Damgård, MartinGeisler, Thomas Jakobsen, Mikkel Krøigaard, Janus Dam Nielsen,Jesper Buus Nielsen, Kurt Nielsen, Jakob Pagter, Michael I.Schwartzbach, and Tomas Toft. Secure multiparty computationgoes live. In Roger Dingledine and Philippe Golle, editors, FC2009, volume 5628 of LNCS, pages 325–343. Springer, Heidelberg,February 2009.

[BCIV20] Carl Bootland, Wouter Castryck, Ilia Iliashenko, and FrederikVercauteren. Efficiently processing complex-valued data inhomomorphic encryption. Journal of Mathematical Cryptology,14(1):55 – 65, 01 Jan. 2020. URL: https://www.degruyter.com/view/journals/jmc/14/1/article-p55.xml, doi:https://doi.org/10.1515/jmc-2015-0051.

[BCR87] Gilles Brassard, Claude Crépeau, and Jean-Marc Robert. All-or-nothing disclosure of secrets. In Andrew M. Odlyzko, editor, CRYPTO'86, volume 263 of LNCS, pages 234–238. Springer, Heidelberg, August 1987. doi:10.1007/3-540-47721-7_17.

[BdMW16] Florian Bourse, Rafaël del Pino, Michele Minelli, and HoeteckWee. FHE circuit privacy almost for free. In Matthew Robshawand Jonathan Katz, editors, CRYPTO 2016, Part II, volume9815 of LNCS, pages 62–89. Springer, Heidelberg, August 2016.doi:10.1007/978-3-662-53008-5_3.

[BDOZ11] Rikke Bendlin, Ivan Damgård, Claudio Orlandi, and SarahZakarias. Semi-homomorphic encryption and multipartycomputation. In Kenneth G. Paterson, editor, EUROCRYPT 2011,volume 6632 of LNCS, pages 169–188. Springer, Heidelberg, May2011. doi:10.1007/978-3-642-20465-4_11.

[Bea92] Donald Beaver. Efficient multiparty protocols using circuitrandomization. In Joan Feigenbaum, editor, CRYPTO’91, volume576 of LNCS, pages 420–432. Springer, Heidelberg, August 1992.doi:10.1007/3-540-46766-1_34.

[BEHZ16] Jean-Claude Bajard, Julien Eynard, M. Anwar Hasan, and VincentZucca. A full RNS variant of FV like somewhat homomorphicencryption schemes. In Roberto Avanzi and Howard M. Heys,editors, SAC 2016, volume 10532 of LNCS, pages 423–442.Springer, Heidelberg, August 2016. doi:10.1007/978-3-319-69453-5_23.

[BEM+19] Jean Claude Bajard, Julien Eynard, Paulo Martins, Leonel Sousa,and Vincent Zucca. Note on the noise growth of the RNS variantsof the BFV scheme. Cryptology ePrint Archive, Report 2019/1266,2019. https://eprint.iacr.org/2019/1266.

[Ben18] Aner Ben-Efraim. On multiparty garbling of arithmetic circuits. InThomas Peyrin and Steven Galbraith, editors, ASIACRYPT 2018,Part III, volume 11274 of LNCS, pages 3–33. Springer, Heidelberg,December 2018. doi:10.1007/978-3-030-03332-3_1.

[BGG+18] Dan Boneh, Rosario Gennaro, Steven Goldfeder, Aayush Jain,Sam Kim, Peter M. R. Rasmussen, and Amit Sahai. Thresholdcryptosystems from threshold fully homomorphic encryption. InHovav Shacham and Alexandra Boldyreva, editors, CRYPTO 2018,Part I, volume 10991 of LNCS, pages 565–596. Springer,Heidelberg, August 2018. doi:10.1007/978-3-319-96884-1_19.

[BGGJ20] Christina Boura, Nicolas Gama, Mariya Georgieva, and Dimitar Jetchev. Chimera: Combining ring-LWE-based fully homomorphic encryption schemes. Journal of Mathematical Cryptology, 14(1):316–338, January 2020. URL: https://www.degruyter.com/view/journals/jmc/14/1/article-p316.xml, doi:10.1515/jmc-2019-0026.

[BGH13] Zvika Brakerski, Craig Gentry, and Shai Halevi. Packed ciphertextsin LWE-based homomorphic encryption. In Kaoru Kurosawa andGoichiro Hanaoka, editors, PKC 2013, volume 7778 of LNCS,pages 1–13. Springer, Heidelberg, February / March 2013. doi:10.1007/978-3-642-36362-7_1.

[BGN05] Dan Boneh, Eu-Jin Goh, and Kobbi Nissim. Evaluating 2-DNFformulas on ciphertexts. In Joe Kilian, editor, TCC 2005, volume3378 of LNCS, pages 325–341. Springer, Heidelberg, February2005. doi:10.1007/978-3-540-30576-7_18.

[BGV12] Zvika Brakerski, Craig Gentry, and Vinod Vaikuntanathan.(Leveled) fully homomorphic encryption without bootstrapping.In Shafi Goldwasser, editor, ITCS 2012, pages 309–325. ACM,January 2012. doi:10.1145/2090236.2090262.

[BGW88] Michael Ben-Or, Shafi Goldwasser, and Avi Wigderson.Completeness theorems for non-cryptographic fault-tolerantdistributed computation (extended abstract). In 20th ACM STOC,pages 1–10. ACM Press, May 1988. doi:10.1145/62212.62213.

[BL11] Andrej Bogdanov and Chin Ho Lee. Homomorphic encryptionfrom codes. Cryptology ePrint Archive, Report 2011/622, 2011.http://eprint.iacr.org/2011/622.

[Bla79] G. R. Blakley. Safeguarding cryptographic keys. Proceedings ofAFIPS 1979 National Computer Conference, 48:313–317, 1979.

[BLLN13] Joppe W. Bos, Kristin Lauter, Jake Loftus, and Michael Naehrig.Improved security for a ring-based fully homomorphic encryptionscheme. In Martijn Stam, editor, 14th IMA InternationalConference on Cryptography and Coding, volume 8308 of LNCS,pages 45–64. Springer, Heidelberg, December 2013. doi:10.1007/978-3-642-45239-0_4.

[BLP+13] Zvika Brakerski, Adeline Langlois, Chris Peikert, Oded Regev,and Damien Stehlé. Classical hardness of learning with errors. InDan Boneh, Tim Roughgarden, and Joan Feigenbaum, editors,45th ACM STOC, pages 575–584. ACM Press, June 2013. doi:10.1145/2488608.2488680.


[BMR90] Donald Beaver, Silvio Micali, and Phillip Rogaway. The roundcomplexity of secure protocols (extended abstract). In 22nd ACMSTOC, pages 503–513. ACM Press, May 1990. doi:10.1145/100216.100287.

[BMR16] Marshall Ball, Tal Malkin, and Mike Rosulek. Garbling gadgetsfor Boolean and arithmetic circuits. In Edgar R. Weippl, StefanKatzenbeisser, Christopher Kruegel, Andrew C. Myers, and ShaiHalevi, editors, ACM CCS 2016, pages 565–577. ACM Press,October 2016. doi:10.1145/2976749.2978410.

[BMTPH20] Jean-Philippe Bossuat, Christian Mouchet, Juan Troncoso-Pastoriza, and Jean-Pierre Hubaux. Efficient bootstrappingfor approximate homomorphic encryption with non-sparse keys.Cryptology ePrint Archive, Report 2020/1203, 2020. https://eprint.iacr.org/2020/1203.

[BR15] Jean-François Biasse and Luis Ruiz. FHEW with efficient multibitbootstrapping. In Kristin E. Lauter and Francisco Rodríguez-Henríquez, editors, LATINCRYPT 2015, volume 9230 of LNCS,pages 119–135. Springer, Heidelberg, August 2015. doi:10.1007/978-3-319-22174-8_7.

[Bra12] Zvika Brakerski. Fully homomorphic encryption without modulusswitching from classical GapSVP. In Reihaneh Safavi-Naini andRan Canetti, editors, CRYPTO 2012, volume 7417 of LNCS, pages868–886. Springer, Heidelberg, August 2012. doi:10.1007/978-3-642-32009-5_50.

[BSW11] Dan Boneh, Amit Sahai, and Brent Waters. Functional encryption:Definitions and challenges. In Yuval Ishai, editor, TCC 2011,volume 6597 of LNCS, pages 253–273. Springer, Heidelberg, March2011. doi:10.1007/978-3-642-19571-6_16.

[BV11] Zvika Brakerski and Vinod Vaikuntanathan. Efficient fullyhomomorphic encryption from (standard) LWE. In RafailOstrovsky, editor, 52nd FOCS, pages 97–106. IEEE ComputerSociety Press, October 2011. doi:10.1109/FOCS.2011.12.

[BV14] Zvika Brakerski and Vinod Vaikuntanathan. Lattice-based FHEas secure as PKE. In Moni Naor, editor, ITCS 2014, pages 1–12.ACM, January 2014. doi:10.1145/2554797.2554799.

[Can00] Ran Canetti. Security and composition of multiparty cryptographicprotocols. Journal of Cryptology, 13(1):143–202, January 2000.doi:10.1007/s001459910006.


[Can01] Ran Canetti. Universally composable security: A new paradigmfor cryptographic protocols. In 42nd FOCS, pages 136–145. IEEEComputer Society Press, October 2001. doi:10.1109/SFCS.2001.959888.

[CCD88] David Chaum, Claude Crépeau, and Ivan Damgård. Multipartyunconditionally secure protocols (extended abstract). In 20thACM STOC, pages 11–19. ACM Press, May 1988. doi:10.1145/62212.62214.

[CCK+13] Jung Hee Cheon, Jean-Sébastien Coron, Jinsu Kim, Moon SungLee, Tancrède Lepoint, Mehdi Tibouchi, and Aaram Yun. Batchfully homomorphic encryption over the integers. In ThomasJohansson and Phong Q. Nguyen, editors, EUROCRYPT 2013,volume 7881 of LNCS, pages 315–335. Springer, Heidelberg, May2013. doi:10.1007/978-3-642-38348-9_20.

[CCM+18] Joël Cathébras, Alexandre Carbon, Peter Milder, Renaud Sirdey,and Nicolas Ventroux. Data flow oriented hardware design ofRNS-based polynomial multiplication for SHE acceleration. IACRTCHES, 2018(3):69–88, 2018. https://tches.iacr.org/index.php/TCHES/article/view/7293. doi:10.13154/tches.v2018.i3.69-88.

[CG16] RatnaKumari Challa and VijayaKumari Gunta. Reed-muller codebased symmetric key fully homomorphic encryption scheme. InInternational Conference on Information Systems Security, pages499–508. Springer, 2016.

[CGGI16] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and MalikaIzabachène. Faster fully homomorphic encryption: Bootstrappingin less than 0.1 seconds. In Jung Hee Cheon and Tsuyoshi Takagi,editors, ASIACRYPT 2016, Part I, volume 10031 of LNCS, pages3–33. Springer, Heidelberg, December 2016. doi:10.1007/978-3-662-53887-6_1.

[CGGI17] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and MalikaIzabachène. Improving TFHE: faster packed homomorphicoperations and efficient circuit bootstrapping. Cryptology ePrintArchive, Report 2017/430, 2017. http://eprint.iacr.org/2017/430.

[CGGI20] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and MalikaIzabachène. TFHE: Fast fully homomorphic encryption over thetorus. Journal of Cryptology, 33(1):34–91, January 2020. doi:10.1007/s00145-019-09319-x.


[CGKS95] Benny Chor, Oded Goldreich, Eyal Kushilevitz, and Madhu Sudan.Private information retrieval. In 36th FOCS, pages 41–50. IEEEComputer Society Press, October 1995. doi:10.1109/SFCS.1995.492461.

[CH18] Hao Chen and Kyoohyung Han. Homomorphic lower digits removaland improved FHE bootstrapping. In Jesper Buus Nielsen andVincent Rijmen, editors, EUROCRYPT 2018, Part I, volume10820 of LNCS, pages 315–337. Springer, Heidelberg, April / May2018. doi:10.1007/978-3-319-78381-9_12.

[CHHS19] Jung Hee Cheon, Minki Hhan, Seungwan Hong, and Yongha Son.A hybrid of dual and meet-in-the-middle attack on sparse andternary secret LWE. Cryptology ePrint Archive, Report 2019/1114,2019. https://eprint.iacr.org/2019/1114.

[Chi18] Ilaria Chillotti. Vers l’efficacité et la sécurité du chiffrementhomomorphe et du cloud computing. PhD thesis, École NormaleSupérieure de Lyon, 2018.

[CHK+18] Jung Hee Cheon, Kyoohyung Han, Andrey Kim, Miran Kim,and Yongsoo Song. Bootstrapping for approximate homomorphicencryption. In Jesper Buus Nielsen and Vincent Rijmen, editors,EUROCRYPT 2018, Part I, volume 10820 of LNCS, pages 360–384. Springer, Heidelberg, April / May 2018. doi:10.1007/978-3-319-78381-9_14.

[CHK+19] Jung Hee Cheon, Kyoohyung Han, Andrey Kim, Miran Kim, andYongsoo Song. A full RNS variant of approximate homomorphicencryption. In Carlos Cid and Michael J. Jacobson Jr:, editors,SAC 2018, volume 11349 of LNCS, pages 347–368. Springer,Heidelberg, August 2019. doi:10.1007/978-3-030-10970-7_16.

[CIV16a] Wouter Castryck, Ilia Iliashenko, and Frederik Vercauteren.On error distributions in ring-based LWE. LMS Journal ofComputation and Mathematics, 19(A):130–145, 2016.

[CIV16b] Wouter Castryck, Ilia Iliashenko, and Frederik Vercauteren.Provably weak instances of ring-LWE revisited. In Marc Fischlinand Jean-Sébastien Coron, editors, EUROCRYPT 2016, Part I,volume 9665 of LNCS, pages 147–167. Springer, Heidelberg, May2016. doi:10.1007/978-3-662-49890-3_6.

[CIV18] Wouter Castryck, Ilia Iliashenko, and Frederik Vercauteren. Homomorphic SIM2D operations: Single instruction much more data. In Jesper Buus Nielsen and Vincent Rijmen, editors, EUROCRYPT 2018, Part I, volume 10820 of LNCS, pages 338–359. Springer, Heidelberg, April / May 2018. doi:10.1007/978-3-319-78381-9_13.

[CJP20] Ilaria Chillotti, Marc Joye, and Pascal Paillier. Programmable bootstrapping enables efficient homomorphic inference of deep neural networks. Technical report, Zama, October 2020. https://concrete.zama.ai/.

[CKKS17] Jung Hee Cheon, Andrey Kim, Miran Kim, and Yong SooSong. Homomorphic encryption for arithmetic of approximatenumbers. In Tsuyoshi Takagi and Thomas Peyrin, editors,ASIACRYPT 2017, Part I, volume 10624 of LNCS, pages 409–437.Springer, Heidelberg, December 2017. doi:10.1007/978-3-319-70694-8_15.

[CKY18] Jung Hee Cheon, Andrey Kim, and Donggeon Yhee. Multi-dimensional packing for HEAAN for approximate matrixarithmetics. Cryptology ePrint Archive, Report 2018/1245, 2018.https://eprint.iacr.org/2018/1245.

[CL01] Jan Camenisch and Anna Lysyanskaya. An efficient system fornon-transferable anonymous credentials with optional anonymityrevocation. In Birgit Pfitzmann, editor, EUROCRYPT 2001,volume 2045 of LNCS, pages 93–118. Springer, Heidelberg, May2001. doi:10.1007/3-540-44987-6_7.

[CLPX18] Hao Chen, Kim Laine, Rachel Player, and Yuhou Xia. High-precision arithmetic in homomorphic encryption. In Nigel P.Smart, editor, CT-RSA 2018, volume 10808 of LNCS, pages 116–136. Springer, Heidelberg, April 2018. doi:10.1007/978-3-319-76953-0_7.

[CLS16] Hao Chen, Kristin E. Lauter, and Katherine E. Stange. Securityconsiderations for galois non-dual RLWE families. In RobertoAvanzi and Howard M. Heys, editors, SAC 2016, volume 10532 ofLNCS, pages 443–462. Springer, Heidelberg, August 2016. doi:10.1007/978-3-319-69453-5_24.

[CLS17] Hao Chen, Kristin Lauter, and Katherine E Stange. Attacks onthe search RLWE problem with small errors. SIAM Journal onApplied Algebra and Geometry, 1(1):665–682, 2017.

[CLT14] Jean-Sébastien Coron, Tancrède Lepoint, and Mehdi Tibouchi. Scale-invariant fully homomorphic encryption over the integers. In Hugo Krawczyk, editor, PKC 2014, volume 8383 of LNCS, pages 311–328. Springer, Heidelberg, March 2014. doi:10.1007/978-3-642-54631-0_18.

[CMNT11] Jean-Sébastien Coron, Avradip Mandal, David Naccache, andMehdi Tibouchi. Fully homomorphic encryption over theintegers with shorter public keys. In Phillip Rogaway, editor,CRYPTO 2011, volume 6841 of LNCS, pages 487–504. Springer,Heidelberg, August 2011. doi:10.1007/978-3-642-22792-9_28.

[CN11] Yuanmi Chen and Phong Q. Nguyen. BKZ 2.0: Better latticesecurity estimates. In Dong Hoon Lee and Xiaoyun Wang, editors,ASIACRYPT 2011, volume 7073 of LNCS, pages 1–20. Springer,Heidelberg, December 2011. doi:10.1007/978-3-642-25385-0_1.

[CNT12] Jean-Sébastien Coron, David Naccache, and Mehdi Tibouchi.Public key compression and modulus switching for fullyhomomorphic encryption over the integers. In David Pointchevaland Thomas Johansson, editors, EUROCRYPT 2012, volume7237 of LNCS, pages 446–464. Springer, Heidelberg, April 2012.doi:10.1007/978-3-642-29011-4_27.

[Con09] Keith Conrad. The different ideal. Technical report, 2009.Expository papers/Lecture notes. Available at:http://www.math.uconn.edu/~kconrad/blurbs/gradnumthy/different.pdf.

[CP19] Benjamin R. Curtis and Rachel Player. On the feasibility andimpact of standardising sparse-secret LWE parameter sets forhomomorphic encryption. Cryptology ePrint Archive, Report2019/1148, 2019. https://eprint.iacr.org/2019/1148.

[CS15] Jung Hee Cheon and Damien Stehlé. Fully homomophic encryptionover the integers revisited. In Elisabeth Oswald and Marc Fischlin,editors, EUROCRYPT 2015, Part I, volume 9056 of LNCS, pages513–536. Springer, Heidelberg, April 2015. doi:10.1007/978-3-662-46800-5_20.

[CSV17] Anamaria Costache, Nigel P. Smart, and Srinivas Vivek. Fasterhomomorphic evaluation of discrete Fourier transforms. In AggelosKiayias, editor, FC 2017, volume 10322 of LNCS, pages 517–529.Springer, Heidelberg, April 2017.

[CSVW16] Anamaria Costache, Nigel P. Smart, Srinivas Vivek, and Adrian Waller. Fixed-point arithmetic in SHE schemes. In Roberto Avanzi and Howard M. Heys, editors, SAC 2016, volume 10532 of LNCS, pages 401–422. Springer, Heidelberg, August 2016. doi:10.1007/978-3-319-69453-5_22.

[CT65] James W Cooley and John W Tukey. An algorithm for the machinecalculation of complex fourier series. Mathematics of computation,19(90):297–301, 1965.

[Cyb] Cybernetica. Sharemind. https://sharemind.cyber.ee/.

[DD12] Léo Ducas and Alain Durmus. Ring-LWE in polynomial rings.In Marc Fischlin, Johannes Buchmann, and Mark Manulis,editors, PKC 2012, volume 7293 of LNCS, pages 34–51. Springer,Heidelberg, May 2012. doi:10.1007/978-3-642-30057-8_3.

[DGBL+15] Nathan Dowlin, Ran Gilad-Bachrach, Kim Laine, Kristin Lauter,Michael Naehrig, and John Wernsing. Manual for usinghomomorphic encryption for bioinformatics. Technical report,MSR-TR-2015-87, Microsoft Research, 2015.

[DGN+17] Nico Döttling, Satrajit Ghosh, Jesper Buus Nielsen, TobiasNilges, and Roberto Trifiletti. TinyOLE: Efficient activelysecure two-party computation from oblivious linear functionevaluation. In Bhavani M. Thuraisingham, David Evans, TalMalkin, and Dongyan Xu, editors, ACM CCS 2017, pages 2263–2276. ACM Press, October / November 2017. doi:10.1145/3133956.3134024.

[DH76] Whitfield Diffie and Martin E. Hellman. New directions incryptography. IEEE Transactions on Information Theory,22(6):644–654, 1976.

[DKL+13] Ivan Damgård, Marcel Keller, Enrique Larraia, Valerio Pastro,Peter Scholl, and Nigel P. Smart. Practical covertly secureMPC for dishonest majority - or: Breaking the SPDZ limits.In Jason Crampton, Sushil Jajodia, and Keith Mayes, editors,ESORICS 2013, volume 8134 of LNCS, pages 1–18. Springer,Heidelberg, September 2013. doi:10.1007/978-3-642-40203-6_1.

[DLSW19] Dominic Dams, Jeff Lataille, Rino Sanchez, and John Wade.WIDESEAS: A lattice-based PIR scheme implemented inEncryptedQuery. Cryptology ePrint Archive, Report 2019/855,2019. https://eprint.iacr.org/2019/855.


[DM15] Léo Ducas and Daniele Micciancio. FHEW: Bootstrappinghomomorphic encryption in less than a second. In ElisabethOswald and Marc Fischlin, editors, EUROCRYPT 2015, Part I,volume 9056 of LNCS, pages 617–640. Springer, Heidelberg, April2015. doi:10.1007/978-3-662-46800-5_24.

[DPSZ12] Ivan Damgård, Valerio Pastro, Nigel P. Smart, and Sarah Zakarias.Multiparty computation from somewhat homomorphic encryption.In Reihaneh Safavi-Naini and Ran Canetti, editors, CRYPTO 2012,volume 7417 of LNCS, pages 643–662. Springer, Heidelberg, August2012. doi:10.1007/978-3-642-32009-5_38.

[EGL82] Shimon Even, Oded Goldreich, and Abraham Lempel. Arandomized protocol for signing contracts. In David Chaum,Ronald L. Rivest, and Alan T. Sherman, editors, CRYPTO’82,pages 205–210. Plenum Press, New York, USA, 1982.

[EKR17] David Evans, Vladimir Kolesnikov, and Mike Rosulek. A pragmaticintroduction to secure multi-party computation. Foundations andTrends® in Privacy and Security, 2(2-3), 2017.

[ElG85] Taher ElGamal. A public key cryptosystem and a signature schemebased on discrete logarithms. IEEE Transactions on InformationTheory, 31:469–472, 1985.

[ELOS15] Yara Elias, Kristin E. Lauter, Ekin Ozman, and Katherine E.Stange. Provably weak instances of ring-LWE. In Rosario Gennaroand Matthew J. B. Robshaw, editors, CRYPTO 2015, Part I,volume 9215 of LNCS, pages 63–92. Springer, Heidelberg, August2015. doi:10.1007/978-3-662-47989-6_4.

[FV12] Junfeng Fan and Frederik Vercauteren. Somewhat practical fullyhomomorphic encryption. Cryptology ePrint Archive, Report2012/144, 2012. http://eprint.iacr.org/2012/144.

[Gal] Inc Galois. Jana: Private data as a service. https://galois.com/project/jana-private-data-as-a-service/.

[Gar59] Harvey L Garner. The residue number system. In Papers presentedat the the March 3-5, 1959, western joint computer conference,pages 146–153, 1959.

[Gen09] Craig Gentry. Fully homomorphic encryption using ideal lattices.In Michael Mitzenmacher, editor, 41st ACM STOC, pages 169–178.ACM Press, May / June 2009. doi:10.1145/1536414.1536440.


[GGP10] Rosario Gennaro, Craig Gentry, and Bryan Parno. Non-interactiveverifiable computing: Outsourcing computation to untrustedworkers. In Tal Rabin, editor, CRYPTO 2010, volume 6223of LNCS, pages 465–482. Springer, Heidelberg, August 2010.doi:10.1007/978-3-642-14623-7_25.

[GH19] Craig Gentry and Shai Halevi. Compressible FHE withapplications to PIR. In Dennis Hofheinz and Alon Rosen, editors,TCC 2019, Part II, volume 11892 of LNCS, pages 438–464.Springer, Heidelberg, December 2019. doi:10.1007/978-3-030-36033-7_17.

[GHS12a] Craig Gentry, Shai Halevi, and Nigel P. Smart. Betterbootstrapping in fully homomorphic encryption. In Marc Fischlin,Johannes Buchmann, and Mark Manulis, editors, PKC 2012,volume 7293 of LNCS, pages 1–16. Springer, Heidelberg, May2012. doi:10.1007/978-3-642-30057-8_1.

[GHS12b] Craig Gentry, Shai Halevi, and Nigel P. Smart. Fully homomorphicencryption with polylog overhead. In David Pointcheval andThomas Johansson, editors, EUROCRYPT 2012, volume 7237of LNCS, pages 465–482. Springer, Heidelberg, April 2012. doi:10.1007/978-3-642-29011-4_28.

[GKK+19] Lorenzo Grassi, Daniel Kales, Dmitry Khovratovich, Arnab Roy, Christian Rechberger, and Markus Schofnegger. Starkad and Poseidon: New hash functions for zero knowledge proof systems. Cryptology ePrint Archive, Report 2019/458, 2019. https://eprint.iacr.org/2019/458.

[GMW87] Oded Goldreich, Silvio Micali, and Avi Wigderson. How to playany mental game or A completeness theorem for protocols withhonest majority. In Alfred Aho, editor, 19th ACM STOC, pages218–229. ACM Press, May 1987. doi:10.1145/28395.28420.

[GOT12] Valérie Gauthier, Ayoub Otmani, and Jean-Pierre Tillich. Adistinguisher-based attack of a homomorphic encryption schemerelying on reed-solomon codes. Cryptology ePrint Archive, Report2012/168, 2012. http://eprint.iacr.org/2012/168.

[GRR+16] Lorenzo Grassi, Christian Rechberger, Dragos Rotaru, Peter Scholl, and Nigel P. Smart. MPC-friendly symmetric key primitives. In Edgar R. Weippl, Stefan Katzenbeisser, Christopher Kruegel, Andrew C. Myers, and Shai Halevi, editors, ACM CCS 2016, pages 430–443. ACM Press, October 2016. doi:10.1145/2976749.2978332.


[GSW13] Craig Gentry, Amit Sahai, and Brent Waters. Homomorphicencryption from learning with errors: Conceptually-simpler,asymptotically-faster, attribute-based. In Ran Canetti and Juan A.Garay, editors, CRYPTO 2013, Part I, volume 8042 of LNCS,pages 75–92. Springer, Heidelberg, August 2013. doi:10.1007/978-3-642-40041-4_5.

[GW13] Rosario Gennaro and Daniel Wichs. Fully homomorphic messageauthenticators. In Kazue Sako and Palash Sarkar, editors,ASIACRYPT 2013, Part II, volume 8270 of LNCS, pages 301–320.Springer, Heidelberg, December 2013. doi:10.1007/978-3-642-42045-0_16.

[HG01] Nick Howgrave-Graham. Approximate integer common divisors. InRevised Papers from the International Conference on Cryptographyand Lattices, CaLC ’01, page 51–66, Berlin, Heidelberg, 2001.Springer-Verlag.

[HIMV19] Carmit Hazay, Yuval Ishai, Antonio Marcedone, and Muthura-makrishnan Venkitasubramaniam. LevioSA: Lightweight securearithmetic computation. In Lorenzo Cavallaro, Johannes Kinder,XiaoFeng Wang, and Jonathan Katz, editors, ACM CCS 2019,pages 327–344. ACM Press, November 2019. doi:10.1145/3319535.3354258.

[HK20] Kyoohyung Han and Dohyeong Ki. Better bootstrapping forapproximate homomorphic encryption. In Stanislaw Jarecki, editor,CT-RSA 2020, volume 12006 of LNCS, pages 364–390. Springer,Heidelberg, February 2020. doi:10.1007/978-3-030-40186-3_16.

[HPS19] Shai Halevi, Yuriy Polyakov, and Victor Shoup. An improved RNSvariant of the BFV homomorphic encryption scheme. In MitsuruMatsui, editor, CT-RSA 2019, volume 11405 of LNCS, pages 83–105. Springer, Heidelberg, March 2019. doi:10.1007/978-3-030-12612-4_5.

[HS15] Shai Halevi and Victor Shoup. Bootstrapping for HElib. InElisabeth Oswald and Marc Fischlin, editors, EUROCRYPT 2015,Part I, volume 9056 of LNCS, pages 641–670. Springer, Heidelberg,April 2015. doi:10.1007/978-3-662-46800-5_25.

[HS18] Shai Halevi and Victor Shoup. Faster homomorphic linear transformations in HElib. In Hovav Shacham and Alexandra Boldyreva, editors, CRYPTO 2018, Part I, volume 10991 of LNCS, pages 93–120. Springer, Heidelberg, August 2018. doi:10.1007/978-3-319-96884-1_4.

[HSS17] Carmit Hazay, Peter Scholl, and Eduardo Soria-Vazquez. Low costconstant round MPC combining BMR and oblivious transfer. InTsuyoshi Takagi and Thomas Peyrin, editors, ASIACRYPT 2017,Part I, volume 10624 of LNCS, pages 598–628. Springer,Heidelberg, December 2017. doi:10.1007/978-3-319-70694-8_21.

[IKN+20] Mihaela Ion, Ben Kreuter, Ahmet Erhan Nergiz, Sarvar Patel,Shobhit Saxena, Karn Seth, Mariana Raykova, David Shanahan,and Moti Yung. On deploying secure computing: Privateintersection-sum-with-cardinality. In 2020 IEEE EuropeanSymposium on Security and Privacy (EuroS&P), pages 370–389.IEEE, 2020.

[ISN87] M. Ito, A. Saito, and Takao Nishizeki. Secret sharing schemesrealizing general access structure. In Proc. IEEE GlobalTelecommunication Conf. (Globecom’87), pages 99–102, 1987.

[ISN93] M. Ito, A. Saio, and Takao Nishizeki. Multiple assignment schemefor sharing secret. Journal of Cryptology, 6(1):15–20, March 1993.doi:10.1007/BF02620229.

[ISO19] IT security techniques — Encryption algorithms — Part 6: Homomorphic encryption. Standard, International Organization for Standardization, May 2019.

[Kil88] Joe Kilian. Founding cryptography on oblivious transfer. In20th ACM STOC, pages 20–31. ACM Press, May 1988. doi:10.1145/62212.62215.

[KPP20] Andrey Kim, Antonis Papadimitriou, and Yuriy Polyakov. Ap-proximate homomorphic encryption with reduced approximationerror. Cryptology ePrint Archive, Report 2020/1118, 2020.https://eprint.iacr.org/2020/1118.

[KRSW18] Marcel Keller, Dragos Rotaru, Nigel P. Smart, and Tim Wood.Reducing communication channels in MPC. In Dario Catalanoand Roberto De Prisco, editors, SCN 18, volume 11035 of LNCS,pages 181–199. Springer, Heidelberg, September 2018. doi:10.1007/978-3-319-98113-0_10.

[KS19] Duhyeong Kim and Yongsoo Song. Approximate homomorphic encryption over the conjugate-invariant ring. In Kwangsu Lee, editor, ICISC 18, volume 11396 of LNCS, pages 85–102. Springer, Heidelberg, November 2019. doi:10.1007/978-3-030-12146-4_6.

[KW93] Mauricio Karchmer and Avi Wigderson. On span programs. InProceedings of Structures in Complexity Theory, pages 102–111,1993.

[KY18] Marcel Keller and Avishay Yanai. Efficient maliciously securemultiparty computation for RAM. In Jesper Buus Nielsen andVincent Rijmen, editors, EUROCRYPT 2018, Part III, volume10822 of LNCS, pages 91–124. Springer, Heidelberg, April / May2018. doi:10.1007/978-3-319-78372-7_4.

[Lak19] Ravie Lakshmanan. Google open-sources cryptographic tool tokeep data sets private. https://thenextweb.com/security/2019/06/20/google-open-sources-cryptographic-tool-to-keep-data-sets-private/, 2019.

[LDPW14] Junzuo Lai, Robert H. Deng, HweeHwa Pang, and Jian Weng.Verifiable computation on outsourced encrypted data. In MiroslawKutylowski and Jaideep Vaidya, editors, ESORICS 2014, Part I,volume 8712 of LNCS, pages 273–291. Springer, Heidelberg,September 2014. doi:10.1007/978-3-319-11203-9_16.

[Lin20] Yehuda Lindell. Secure multiparty computation (MPC). IACRCryptol. ePrint Arch., 2020:300, 2020.

[LLL82] Arjen K. Lenstra, Hendrik W. Lenstra, and László Lovász.Factoring polynomials with rational coefficients. MATH. ANN,261:515–534, 1982.

[LLL+20] Joon-Woo Lee, Eunsang Lee, Yongwoo Lee, Young-Sik Kim,and Jong-Seon No. High-precision bootstrapping of RNS-CKKShomomorphic encryption using optimal minimax polynomialapproximation and inverse sine function. Cryptology ePrintArchive, Report 2020/552, 2020. https://eprint.iacr.org/2020/552.

[LM20] Baiyu Li and Daniele Micciancio. On the security of homomorphicencryption on approximate numbers. Cryptology ePrint Archive,Report 2020/1533, 2020. https://eprint.iacr.org/2020/1533.

[LP11] Richard Lindner and Chris Peikert. Better key sizes (and attacks) for LWE-based encryption. In Aggelos Kiayias, editor, CT-RSA 2011, volume 6558 of LNCS, pages 319–339. Springer, Heidelberg, February 2011. doi:10.1007/978-3-642-19074-2_21.

[LPR10] Vadim Lyubashevsky, Chris Peikert, and Oded Regev. On ideallattices and learning with errors over rings. In Henri Gilbert,editor, EUROCRYPT 2010, volume 6110 of LNCS, pages 1–23.Springer, Heidelberg, May / June 2010. doi:10.1007/978-3-642-13190-5_1.

[LPR13] Vadim Lyubashevsky, Chris Peikert, and Oded Regev. A toolkitfor ring-LWE cryptography. In Thomas Johansson and Phong Q.Nguyen, editors, EUROCRYPT 2013, volume 7881 of LNCS, pages35–54. Springer, Heidelberg, May 2013. doi:10.1007/978-3-642-38348-9_3.

[LPSY15] Yehuda Lindell, Benny Pinkas, Nigel P. Smart, and Avishay Yanai.Efficient constant round multi-party computation combining BMRand SPDZ. In Rosario Gennaro and Matthew J. B. Robshaw,editors, CRYPTO 2015, Part II, volume 9216 of LNCS, pages319–338. Springer, Heidelberg, August 2015. doi:10.1007/978-3-662-48000-7_16.

[LTV12] Adriana López-Alt, Eran Tromer, and Vinod Vaikuntanathan.On-the-fly multiparty computation on the cloud via multikey fullyhomomorphic encryption. In Howard J. Karloff and ToniannPitassi, editors, 44th ACM STOC, pages 1219–1234. ACM Press,May 2012. doi:10.1145/2213977.2214086.

[Lyu11] Vadim Lyubashevsky. Search to decision reduction for the learningwith errors over rings problem. In 2011 IEEE Information TheoryWorkshop, ITW 2011, pages 410–414. IEEE, October 2011.

[Mau06] Ueli Maurer. Secure multi-party computation made simple.Discrete Applied Mathematics, 154(2):370 – 381, 2006. Codingand Cryptography. URL: http://www.sciencedirect.com/science/article/pii/S0166218X05002428, doi:https://doi.org/10.1016/j.dam.2005.03.020.

[Mic01] Daniele Micciancio. The shortest vector in a lattice is hard toapproximate to within some constant. SIAM journal on Computing,30(6):2008–2035, 2001.

[MP12] Daniele Micciancio and Chris Peikert. Trapdoors for lattices: Simpler, tighter, faster, smaller. In David Pointcheval and Thomas Johansson, editors, EUROCRYPT 2012, volume 7237 of LNCS, pages 700–718. Springer, Heidelberg, April 2012. doi:10.1007/978-3-642-29011-4_41.

[MPR+20] Peihan Miao, Sarvar Patel, Mariana Raykova, Karn Seth, and MotiYung. Two-sided malicious security for private intersection-sumwith cardinality. In Annual International Cryptology Conference,pages 3–33. Springer, 2020.

[MW16] Pratyay Mukherjee and Daniel Wichs. Two round multipartycomputation via multi-key FHE. In Marc Fischlin and Jean-Sébastien Coron, editors, EUROCRYPT 2016, Part II, volume9666 of LNCS, pages 735–763. Springer, Heidelberg, May 2016.doi:10.1007/978-3-662-49896-5_26.

[NIS] NIST. Threshold cryptography project. https://csrc.nist.gov/projects/threshold-cryptography.

[NK15] Koji Nuida and Kaoru Kurosawa. (Batch) fully homomorphicencryption over integers for non-binary message spaces. InElisabeth Oswald and Marc Fischlin, editors, EUROCRYPT 2015,Part I, volume 9056 of LNCS, pages 537–555. Springer, Heidelberg,April 2015. doi:10.1007/978-3-662-46800-5_21.

[NP99] Moni Naor and Benny Pinkas. Oblivious transfer and polynomialevaluation. In 31st ACM STOC, pages 245–254. ACM Press, May1999. doi:10.1145/301250.301312.

[Nui14] Koji Nuida. Candidate constructions of fully homomorphicencryption on finite simple groups without ciphertext noise.Cryptology ePrint Archive, Report 2014/097, 2014. http://eprint.iacr.org/2014/097.

[Pai99] Pascal Paillier. Public-key cryptosystems based on composite de-gree residuosity classes. In Jacques Stern, editor, EUROCRYPT’99,volume 1592 of LNCS, pages 223–238. Springer, Heidelberg, May1999. doi:10.1007/3-540-48910-X_16.

[Pei09] Chris Peikert. Public-key cryptosystems from the worst-caseshortest vector problem: extended abstract. In MichaelMitzenmacher, editor, 41st ACM STOC, pages 333–342. ACMPress, May / June 2009. doi:10.1145/1536414.1536461.

[Pei16] Chris Peikert. How (not) to instantiate ring-LWE. In VassilisZikas and Roberto De Prisco, editors, SCN 16, volume 9841 ofLNCS, pages 411–430. Springer, Heidelberg, August / September2016. doi:10.1007/978-3-319-44618-9_22.


[PRR17] Yuriy Polyakov, Kurt Rohloff, and Gerard W Ryan. Palisadelattice cryptography library user manual. Cybersecurity ResearchCenter, New Jersey Institute ofTechnology (NJIT), Tech. Rep,2017. https://palisade-crypto.org/.

[Rab81] M. Rabin. How to exchange secrets by oblivious transfer. Technical Report TR-81, Harvard Aiken Computation Lab, 1981.

[RAD78] R L Rivest, L Adleman, and M L Dertouzos. On data banks andprivacy homomorphisms. Foundations of Secure Computation,Academia Press, pages 169–179, 1978.

[RB89] Tal Rabin and Michael Ben-Or. Verifiable secret sharing andmultiparty protocols with honest majority (extended abstract).In 21st ACM STOC, pages 73–85. ACM Press, May 1989. doi:10.1145/73007.73014.

[Reg05] Oded Regev. On lattices, learning with errors, random linearcodes, and cryptography. In Harold N. Gabow and Ronald Fagin,editors, 37th ACM STOC, pages 84–93. ACM Press, May 2005.doi:10.1145/1060590.1060603.

[RSA78] Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman.A method for obtaining digital signatures and public-keycryptosystems. Communications of the Association for ComputingMachinery, 21(2):120–126, 1978.

[RW19] Dragos Rotaru and Tim Wood. MArBled circuits: Mixingarithmetic and Boolean circuits with active security. In Feng Hao,Sushmita Ruj, and Sourav Sen Gupta, editors, INDOCRYPT 2019,volume 11898 of LNCS, pages 227–249. Springer, Heidelberg,December 2019. doi:10.1007/978-3-030-35423-7_12.

[SC19] Yongha Son and Jung Hee Cheon. Revisiting the hybrid attack onsparse and ternary secret LWE. Cryptology ePrint Archive, Report2019/1019, 2019. https://eprint.iacr.org/2019/1019.

[SE94] Claus-Peter Schnorr and Martin Euchner. Lattice basis reduction:Improved practical algorithms and solving subset sum problems.Mathematical programming, 66(1-3):181–199, 1994.

[Sec] Unbound Security. Unbound security. https://www.unboundsecurity.com/.

[Sha79] Adi Shamir. How to share a secret. Communications of theAssociation for Computing Machinery, 22(11):612–613, November1979.


[SV57] A Svoboda and M Valach. Rational system of residue classes.Stroje na Zpraccorani Informaci, Sbornik, Nakl. CSZV, Prague,pages 9–37, 1957.

[SV14] Nigel P. Smart and Frederik Vercauteren. Fully homomorphicSIMD operations. Des. Codes Cryptography, 71(1):57–81, April2014.

[SW19] Nigel P. Smart and Tim Wood. Error detection in monotone spanprograms with application to communication-efficient multi-partycomputation. In Mitsuru Matsui, editor, CT-RSA 2019, volume11405 of LNCS, pages 210–229. Springer, Heidelberg, March 2019.doi:10.1007/978-3-030-12612-4_11.

[TSKJ20] Jonathan Takeshita, Matthew Schoenbauer, Ryan Karl, and TaehoJung. Enabling faster operations for deeper circuits in full RNSvariants of FV-like somewhat homomorphic encryption. IACRCryptol. ePrint Arch., 2020:91, 2020.

[vDGHV10] Marten van Dijk, Craig Gentry, Shai Halevi, and VinodVaikuntanathan. Fully homomorphic encryption over the integers.In Henri Gilbert, editor, Advances in Cryptology – EUROCRYPT2010, pages 24–43, Berlin, Heidelberg, 2010. Springer BerlinHeidelberg.

[vEB81] Peter van Emde Boas. Another NP-complete problem and the complexity of computing short vectors in a lattice. Technical Report, Department of Mathematics, University of Amsterdam, 1981.

[WRK17] Xiao Wang, Samuel Ranellucci, and Jonathan Katz. Global-scalesecure multiparty computation. In Bhavani M. Thuraisingham,David Evans, Tal Malkin, and Dongyan Xu, editors, ACM CCS2017, pages 39–56. ACM Press, October / November 2017. doi:10.1145/3133956.3133979.

[Yao82] Andrew Chi-Chih Yao. Protocols for secure computations(extended abstract). In 23rd FOCS, pages 160–164. IEEEComputer Society Press, November 1982. doi:10.1109/SFCS.1982.38.

[Yao86] Andrew Chi-Chih Yao. How to generate and exchange secrets(extended abstract). In 27th FOCS, pages 162–167. IEEEComputer Society Press, October 1986. doi:10.1109/SFCS.1986.25.

[Zam19] Zama. Concrete. https://zama.ai/, 2019.

Part II

Publications

“Success is not final, failure is not fatal:it is the courage to continue that counts”

–Winston Churchill

Chapter 6

Faster homomorphic function evaluation using non-integral base encoding

Publication data

Bonte C., Bootland C., Bos J. W., Castryck W., Iliashenko I., and Vercauteren F. Faster homomorphic function evaluation using non-integral base encoding. In Cryptographic Hardware and Embedded Systems - CHES 2017 (Sept. 2017), W. Fischer and N. Homma, Eds., vol. 10529 of Lecture Notes in Computer Science, Springer, Heidelberg, pp. 579–600.

Contribution: The contribution of the author of this thesis is to experimentally verify the theoretical bounds on the growth of the ciphertext coefficients during homomorphic computations.

Faster Homomorphic Function Evaluation using Non-Integral Base Encoding

Charlotte Bonte¹, Carl Bootland¹, Joppe W. Bos², Wouter Castryck¹,³, Ilia Iliashenko¹, and Frederik Vercauteren¹,⁴

¹ imec-Cosic, Dept. Electrical Engineering, KU Leuven
² NXP Semiconductors
³ Laboratoire Paul Painlevé, Université de Lille-1
⁴ Open Security Research

Abstract. In this paper we present an encoding method for real numbers tailored for homomorphic function evaluation. The choice of the degree of the polynomial modulus used in all popular somewhat homomorphic encryption schemes is dominated by security considerations, while with the current encoding techniques the correctness requirement allows for much smaller values. We introduce a generic encoding method using expansions with respect to a non-integral base, which exploits this large degree at the benefit of reducing the growth of the coefficients when performing homomorphic operations. This allows one to choose a smaller plaintext coefficient modulus which results in a significant reduction of the running time. We illustrate our approach by applying this encoding in the setting of homomorphic electricity load forecasting for the smart grid which results in a speed-up by a factor 13 compared to previous work, where encoding was done using balanced ternary expansions.

1 Introduction

The cryptographic technique which allows an untrusted entity to perform arbitrary computation on encrypted data is known as fully homomorphic encryption. The first such construction was based on ideal lattices and was presented by Gentry in 2009 [24]. When the algorithm applied to the encrypted data is known in advance one can use a somewhat homomorphic encryption (SHE) scheme which only allows to perform a limited number of computational steps on the encrypted data. Such schemes are significantly more efficient in practice.

In all popular SHE schemes, the plaintext space is a ring of the form $R_t = \mathbb{Z}_t[X]/(f(X))$, where $t \geq 2$ is a small integer called the coefficient modulus, and $f(X) \in \mathbb{Z}[X]$ is a monic irreducible degree $d$ polynomial called the polynomial modulus. Usually one lets $f(X)$ be a cyclotomic polynomial, where for reasons of

This work was supported by the European Commission under the ICT programme with contract H2020-ICT-2014-1 644209 HEAT, and through the European Research Council under the FP7/2007-2013 programme with ERC Grant Agreement 615722 MOTMELSUM. The second author is also supported by a PhD fellowship of the Research Foundation - Flanders (FWO).

performance the most popular choices are the power-of-two cyclotomics $X^d + 1$ where $d = 2^k$ for some positive integer $k$, which are maximally sparse. In this case arithmetic in $R_t$ can be performed efficiently using the fast Fourier transform, which is used in many lattice-based constructions (e.g. [8,9,10,34]) and most implementations (e.g. [3,6,7,25,26,29,32]).

One interesting problem relates to the encoding of the input data of the algorithm such that it can be represented as elements of $R_t$ and such that one obtains a meaningful outcome after the encrypted result is decrypted and decoded. This means that addition and multiplication of the input data must agree with the corresponding operations in $R_t$ up to the depth of the envisaged SHE computation. An active research area investigates different such encoding techniques, which are often application-specific and dependent on the type of the input data. For the sake of exposition we will concentrate on the particularly interesting and popular setting where the input data consists of finite precision real numbers $\theta$, even though our discussion below is fairly generic. The main idea, going back to Dowlin et al. [19] (see also [20,27,31]) and analyzed in more detail by Costache et al. [16], is to expand $\theta$ with respect to a base $b$
\[ \theta = a_r b^r + a_{r-1} b^{r-1} + \cdots + a_1 b + a_0 + a_{-1} b^{-1} + a_{-2} b^{-2} + \cdots + a_{-s} b^{-s} \qquad (1) \]

using integer digits $a_i$, after which one replaces $b$ by $X$ to end up inside the Laurent polynomial ring $\mathbb{Z}[X^{\pm 1}]$. One then reduces the digits $a_i$ modulo $t$ and applies the ring homomorphism to $R_t$ defined by
\[ \iota : \mathbb{Z}_t[X^{\pm 1}] \to R_t : \quad X \mapsto X, \quad X^{-1} \mapsto -g(X) \cdot f(0)^{-1}, \]
where we write $f(X) = Xg(X) + f(0)$ and it is assumed that $f(0)$ is invertible modulo $t$; this is always true for cyclotomic polynomials, or for factors of them. The quantity $r + s$ will sometimes be referred to as the degree of the encoding (where we assume that $a_r a_{-s} \neq 0$). For power-of-two cyclotomics the homomorphism $\iota$ amounts to letting $X^{-1} \mapsto -X^{d-1}$, so that the encoding of (1) is given by⁵
\[ a_r X^r + a_{r-1} X^{r-1} + \cdots + a_1 X + a_0 - a_{-1} X^{d-1} - a_{-2} X^{d-2} - \cdots - a_{-s} X^{d-s}. \]
Decoding is done through the inverse of the restriction $\iota|_{\mathbb{Z}_t[X^{\pm 1}]_{[-\ell,m]}}$ where
\[ \mathbb{Z}_t[X^{\pm 1}]_{[-\ell,m]} = \{\, a_m X^m + a_{m-1} X^{m-1} + \ldots + a_{-\ell} X^{-\ell} \mid a_i \in \mathbb{Z}_t \text{ for all } i \,\} \]
is a subset of Laurent polynomials whose monomials have bounded exponents. If $\ell + m + 1 = d$ then this restriction of $\iota$ is indeed invertible as a $\mathbb{Z}_t$-linear map. The precise choice of $\ell, m$ depends on the data encoded. After applying this inverse, one replaces the coefficients by their representants in $\{-\lfloor (t-1)/2 \rfloor, \ldots, \lceil (t-1)/2 \rceil\}$ to end up with an expression in $\mathbb{Z}[X^{\pm 1}]$, and evaluates the result at $X = b$. Ensuring that decoding is correct to a given computational depth places constraints on the parameters $t$ and $d$, in order to avoid ending up outside the box depicted in Figure 1 if the computation were to be carried out directly in $\mathbb{Z}[X^{\pm 1}]$.

⁵ In fact in [16] it is mentioned that inverting $X$ is only possible in the power-of-two cyclotomic case, but this seems to be overcareful.


Fig. 1. Box in which to stay during computation, where $\ell + m + 1 = d$: in the $\mathbb{Z}[X^{\pm 1}]$-plane the exponents range from $X^{-\ell}$ to $X^m$ and the coefficients from $-\lfloor (t-1)/2 \rfloor$ to $\lceil (t-1)/2 \rceil$.

In terms of $R_t$ we will often refer to this event as the `wrapping around' of the encoded data modulo $t$ or $f(X)$, although we note that this is an abuse of language. In the case of power-of-two cyclotomics, ending up above or below the box does indeed correspond to wrapping around modulo $t$, but ending up at the left or the right of the box corresponds to a mix-up of the high degree terms and the low degree terms.
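To make the map $\iota$ and its inverse concrete, the following minimal Python sketch (our own illustration, not code accompanying the paper) encodes a dictionary of Laurent-polynomial digits into the length-$d$ coefficient vector of $R_t$ for $f(X) = X^d + 1$ and decodes it again by lifting to centred representatives and evaluating at $X = b$; the function names and the toy parameters $d = 16$, $t = 127$, $b = 3$ are ours.

\begin{verbatim}
# Minimal sketch (not the authors' code): encoding Laurent-polynomial digits
# into R_t = Z_t[X]/(X^d + 1) via the map iota, and decoding again.
# For f(X) = X^d + 1 we have iota(X^{-k}) = -X^{d-k}.

def encode(digits, d, t):
    """digits: dict mapping exponent i (possibly negative) to digit a_i."""
    coeffs = [0] * d
    for i, a in digits.items():
        if i >= 0:
            coeffs[i] = (coeffs[i] + a) % t
        else:
            coeffs[d + i] = (coeffs[d + i] - a) % t   # X^i  |->  -X^{d+i}
    return coeffs

def decode(coeffs, d, t, b, m):
    """Invert iota, assuming all exponents lie in [-(d-1-m), m], then
    evaluate the resulting Laurent polynomial at X = b."""
    theta = 0.0
    for j, c in enumerate(coeffs):
        c = c - t if c > (t - 1) // 2 else c          # centred lift to Z
        if j <= m:
            theta += c * (b ** j)
        else:
            theta += -c * (b ** (j - d))              # c*X^j came from -a*X^{j-d}
    return theta

d, t, b = 16, 127, 3                                  # toy parameters (ours)
digits = {2: 1, 0: -1, -1: 1, -3: -1}                 # b^2 - 1 + b^-1 - b^-3
print(decode(encode(digits, d, t), d, t, b, m=7))     # ~ 8.2963
\end{verbatim}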

The precise constraints on $t$ and $d$ not only depend on the complexity of the computation, but also on the type of expansion (1) used in the encoding. Dowlin et al. suggest to use balanced $b$-ary expansions with respect to an odd base $b \in \mathbb{Z}_{\geq 3}$, which means that the digits are taken from $\{-(b-1)/2, \ldots, (b-1)/2\}$. Such expansions have been used for centuries, going back at least to Colson (1726) and Cauchy (1840) in the quest for more efficient arithmetic.
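As a small illustration of such balanced expansions, the following sketch (our own helper, not taken from [19] or [16]) computes the balanced $b$-ary digits of an integer for an odd base $b$; a finite precision real $\theta$ can be reduced to this case by scaling it to an integer first.

\begin{verbatim}
# Illustrative helper (ours): balanced b-ary digits of an integer for an odd
# base b >= 3, least significant digit first.  A finite precision real theta
# can be handled by first scaling, e.g. round(theta * b**s) for s fractional
# digits.

def balanced_digits(n, b=3):
    assert b >= 3 and b % 2 == 1
    digits = []
    while n != 0:
        r = n % b
        if r > (b - 1) // 2:        # map the residue into [-(b-1)/2, (b-1)/2]
            r -= b
        digits.append(r)
        n = (n - r) // b
    return digits                    # n == sum(a * b**i for i, a in enumerate(digits))

print(balanced_digits(46))           # [1, 0, -1, -1, 1]: 1 - 9 - 27 + 81 = 46
\end{verbatim}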

If we fix a precision, then for smaller $b$ the balanced $b$-ary expansions are longer but the coefficients are smaller; this implies the need for a larger $d$ but smaller $t$. Similarly, for larger bases the expansions become shorter but have larger coefficients, leading to smaller $d$ but larger $t$. For the application to somewhat homomorphic encryption considered in [6,16] the security requirements ask for a very large $d$, so that the best choice is to use as small a base as possible, namely $b = 3$, with digits in $\{\pm 1, 0\}$. Even for this smallest choice the resulting lower bound on $t$ is very large and the bound on $d$ is much smaller than that coming from the cryptographic requirements. To illustrate this, we recall the concrete figures from the paper [6], which uses the Fan-Vercauteren (FV) somewhat homomorphic encryption scheme [23] for privacy-friendly prediction of electricity consumption in the setting of the smart grid. Here the authors use $d = 4096$ for cryptographic reasons, which is an optimistic choice that leads to 80-bit security only (and maybe even a few bits less than that [1]). On the other hand, using balanced ternary expansions, correct decoding is guaranteed as soon as $d \geq 368$, which is even a conservative estimate. This eventually leads to the huge bound $t \approx 2^{107}$, which is overcome by decomposing $R_t$ into 13 factors using the Chinese Remainder Theorem (CRT). This is then used to homomorphically forecast the electricity usage for the next half hour for a small apartment complex of 10 households in about half a minute, using a sequential implementation.

The discrepancy between the requirements coming from correct decoding and those coming from security considerations suggests that other possible expansions may be better suited for use with SHE.


Fig. 2. Comparison of the amount of plaintext space which is actually used in the setting of [6], where $d = 4096$ (panels: balanced ternary and 950-NIBNAF). More precise figures are to be found in Section 4.

In this paper we introduce a generic encoding technique, using very sparse expansions having digits in $\{\pm 1, 0\}$ with respect to a non-integral base $b_w > 1$, where $w$ is a sparseness measure. These expansions will be said to be of `non-integral base non-adjacent form' with window size $w$, abbreviated to $w$-NIBNAF. Increasing $w$ makes the degrees of the resulting Laurent polynomial encodings grow and decreases the growth of the coefficients when performing operations, hence lowering the bound on $t$. Our encoding technique is especially useful when using finite precision real numbers, but could also serve in dealing with finite precision complex numbers or even with integers, despite the fact that $b_w$ is non-integral (this would require a careful precision analysis which is avoided here).

We demonstrate that this technique results in significant performance increases by re-doing the experiments from [6]. Along with a more careful precision analysis which is tailored for this specific use case, using 950-NIBNAF expansions we end up with the dramatically reduced bound $t \geq 33$. It is not entirely honest to compare this to $t \approx 2^{107}$ because of our better precision analysis; as explained in Section 4 it makes more sense to compare the new bound to $t \approx 2^{42}$, but the reduction remains huge. As the reader can see in Figure 2 this is explained by the fact that the data is spread more evenly across the plaintext space during computation. As a consequence we avoid the need for CRT decomposition and thus reduce the running time by a factor 13, showing that the same homomorphic forecasting can be done in only 2.5 seconds.

Remark. An alternative recent proposal for encoding using a non-integral base can be found in [15], which targets efficient evaluation of the discrete Fourier transform on encrypted data. Here the authors work exclusively in the power-of-two cyclotomic setting $f(X) = X^d + 1$, and the input data consists of complex numbers $\theta$ which are expanded with respect to the base $b = \zeta$, where $\zeta$ is a primitive $2d$-th root of unity, i.e. a root of $f(X)$; a similar idea was used in [12]. One nice feature of this approach is that the correctness of decoding is not affected by wrapping around modulo $f(X)$. To find a sparse expansion they use the LLL algorithm [28], but for arbitrary complex inputs the digits become rather large when compared to $w$-NIBNAF.


2 Encoding data using w-NIBNAF

Our approach in reducing the lower bound on the plaintext modulus $t$ is to use encodings for which many of the coefficients are zero. In this respect, a first improvement over balanced ternary expansions is obtained by using the non-adjacent form (NAF) representations which were introduced by Reitwiesner in 1960 for speeding up early multiplication algorithms [33]. We note that independent work by Cheon et al. [11] also mentions the advantages of using NAF encodings.

Definition 1. The non-adjacent form (NAF) representation of a real number $\theta$ is an expansion of $\theta$ to the base $b = 2$ with coefficients in $\{-1, 0, 1\}$ such that any two adjacent coefficients are not both non-zero.

The NAF representation has been generalized [13]: for an integer $w \geq 1$ (called the `window size') one can ensure that in any window of $w$ consecutive coefficients at most one of them is non-zero. This is possible to base $b = 2$ but for $w > 2$ one requires larger coefficients.

Definition 2. Let $w \geq 1$ be an integer. A $w$-NAF representation of a real number $\theta$ is an expansion of $\theta$ with base 2 and whose non-zero coefficients are odd and less than $2^{w-1}$ in absolute value such that for every set of $w$ consecutive coefficients at most one of them is non-zero.
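For integer inputs, the classical $w$-NAF recoding of Definitions 1 and 2 can be computed as in the following Python sketch; the helper name and the example value are our own illustrative choices.

\begin{verbatim}
# Sketch (our own helper) of the classical w-NAF recoding of an integer,
# least significant digit first; non-zero digits are odd and < 2^(w-1) in
# absolute value, and any w consecutive digits contain at most one non-zero.

def wnaf(n, w=2):
    digits = []
    while n != 0:
        if n % 2 == 1:
            d = n % (1 << w)
            if d >= (1 << (w - 1)):   # centre the residue modulo 2^w
                d -= 1 << w
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits

print(wnaf(23, w=2))   # [-1, 0, 0, -1, 0, 1], i.e. 23 = 32 - 8 - 1
\end{verbatim}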

We see that NAF is just the special case of $w$-NAF for $w = 2$. Unfortunately, due to the fact that the coefficients are taken from a much larger set, using $w$-NAF encodings in the SHE setting actually gives larger bounds on both $t$ and $d$ for increasing $w$. Therefore this is not useful for our purposes.

Ideally, we want the coefficients in our expansions to be members of $\{\pm 1, 0\}$ with many equal to 0, as this leads to the slowest growth in coefficient sizes, allowing us to use smaller values for $t$. This would come at the expense of using longer encodings, but remember that we have a lot of manoeuvring space on the $d$ side. One way to achieve this goal is to use a non-integral base $b > 1$ when computing a non-adjacent form. We first give the definition of a non-integral base non-adjacent form with window size $w$ ($w$-NIBNAF) representation and then explain where this precise formulation comes from.

Definition 3. A sequence $a_0, a_1, \ldots, a_n, \ldots$ is a $w$-balanced ternary sequence if it has $a_i \in \{-1, 0, 1\}$ for $i \in \mathbb{Z}_{\geq 0}$ and satisfies the property that each set of $w$ consecutive terms has no more than one non-zero term.

Definition 4. Let $\theta \in \mathbb{R}$ and $w \in \mathbb{Z}_{>0}$. Define $b_w$ to be the unique positive real root of the polynomial $F_w(x) = x^{w+1} - x^w - x - 1$. A $w$-balanced ternary sequence $a_r, a_{r-1}, \ldots, a_1, a_0, a_{-1}, \ldots$ is a $w$-NIBNAF representation of $\theta$ if
\[ \theta = a_r b_w^r + a_{r-1} b_w^{r-1} + \cdots + a_1 b_w + a_0 + a_{-1} b_w^{-1} + \cdots. \]


Below we will show that every $\theta \in \mathbb{R}$ has at least one such $w$-NIBNAF representation and provide an algorithm to find such a representation. But let us first state a lemma which shows that $b_w$ is well-defined for $w \geq 1$.

Lemma 1. For an integer $w \geq 1$ the polynomial $F_w(x) = x^{w+1} - x^w - x - 1$ has a unique positive real root $b_w > 1$. The sequence $b_1, b_2, \ldots$ is strictly decreasing and $\lim_{w \to \infty} b_w = 1$. Further, $(x^2 + 1) \mid F_w(x)$ for $w \equiv 3 \bmod 4$.

The proof is straightforward and given in Appendix A. The first few values of $b_w$ are as follows
\[ b_1 = 1 + \sqrt{2} \approx 2.414214, \quad b_2 \approx 1.839287, \quad b_3 = \tfrac{1}{2}(1 + \sqrt{5}) \approx 1.618034, \quad b_4 \approx 1.497094, \]
where we note that $b_3$ is the golden ratio $\varphi$.
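The base $b_w$ is easy to approximate numerically. The following Python sketch (our own helper, not part of the paper) locates the unique positive root of $F_w$ by bisection, comparing in logarithmic form so that large values such as $w = 950$ cause no floating-point overflow.

\begin{verbatim}
import math

# Numerical sketch (ours): b_w is the unique positive real root of
# F_w(x) = x^(w+1) - x^w - x - 1.  For x > 1, F_w(x) > 0 is equivalent to
# w*log(x) + log(x - 1) > log(x + 1), which avoids overflow for large w.

def nibnaf_base(w, tol=1e-12):
    above_root = lambda x: w * math.log(x) + math.log(x - 1.0) > math.log(x + 1.0)
    lo, hi = 1.0 + 1e-12, 3.0       # F_w < 0 just above 1, and F_w(3) > 0 for w >= 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if above_root(mid):
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

for w in (1, 2, 3, 4):
    print(w, round(nibnaf_base(w), 6))   # 2.414214, 1.839287, 1.618034, 1.497094
print(nibnaf_base(950))                  # just above 1, consistent with Lemma 1
\end{verbatim}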

Since we are using a non-integral base, a $w$-NIBNAF representation of a fixed-point number has infinitely many non-zero terms in general. To overcome this one approximates the number by terminating the $w$-NIBNAF representation after some power of the base. We call such a terminated sequence an approximate $w$-NIBNAF representation. There are two straightforward ways of deciding where to terminate: either a fixed power of the base is chosen so that any terms after this are discarded, giving an easy bound on the maximal possible error created, or we choose a maximal allowed error in advance and terminate after the first power which gives error less than or equal to this value.

Algorithm 1 below produces for every $\theta \in \mathbb{R}$ a $w$-NIBNAF representation in the limit as $\varepsilon$ tends to 0, thereby demonstrating its existence. It takes the form of a greedy algorithm which chooses the closest signed power of the base to $\theta$ and then iteratively finds a representation of the difference. Except when $\theta$ can be written as $\theta = h(b_w)/b_w^q$, for some polynomial $h$ with coefficients in $\{\pm 1, 0\}$ and $q \in \mathbb{Z}_{\geq 0}$, any $w$-NIBNAF representation is infinitely long. Hence, we must terminate Algorithm 1 once the iterative input is smaller than some pre-determined precision $\varepsilon > 0$.

We now prove that the algorithm works as required.

Lemma 2. Algorithm 1 produces an approximate $w$-NIBNAF representation of $\theta$ with an error of at most $\varepsilon$.

Proof. Assuming that the algorithm terminates, the output clearly represents $\theta$ to within an error of at most size $\varepsilon$. First we show that the output is $w$-NIBNAF. Suppose that the output, on input $\theta, b_w, \varepsilon$, has at least two non-zero terms, the first being $a_d$. This implies either that $b_w^d \leq |\theta| < b_w^{d+1}$ and $b_w^{d+1} - |\theta| > |\theta| - b_w^d$, or $b_w^{d-1} < |\theta| \leq b_w^d$ and $b_w^d - |\theta| \leq |\theta| - b_w^{d-1}$. These conditions can be written as $b_w^d \leq |\theta| < \tfrac{1}{2} b_w^d (1 + b_w)$ and $\tfrac{1}{2} b_w^{d-1}(1 + b_w) \leq |\theta| \leq b_w^d$ respectively, showing that
\[ \big| |\theta| - b_w^d \big| < \max\left\{ b_w^d - \tfrac{1}{2} b_w^{d-1}(1 + b_w),\; \tfrac{1}{2} b_w^d(1 + b_w) - b_w^d \right\} = \tfrac{1}{2} b_w^d(b_w - 1). \]
The algorithm subsequently chooses the closest power of $b_w$ to this smaller value; suppose it is $b_w^\ell$. By the same argument with $\theta$ replaced by $|\theta| - b_w^d$ we have that


Algorithm 1: GreedyRepresentation

Input: $\theta$, the real number to be represented; $b_w$, the $w$-NIBNAF base to be used in the representation; $\varepsilon$, the precision to which the representation is determined.
Output: An approximate $w$-NIBNAF representation $a_r, a_{r-1}, \ldots$ of $\theta$ with error less than $\varepsilon$, where $a_i = 0$ if not otherwise specified.

  $\sigma \leftarrow \mathrm{sgn}(\theta)$
  $t \leftarrow |\theta|$
  while $t > \varepsilon$ do
    $r \leftarrow \lceil \log_{b_w}(t) \rceil$
    if $b_w^r - t > t - b_w^{r-1}$ then
      $r \leftarrow r - 1$
    $a_r \leftarrow \sigma$
    $\sigma \leftarrow \sigma \cdot \mathrm{sgn}(t - b_w^r)$
    $t \leftarrow |t - b_w^r|$
  Return $(a_i)_i$.

By the same argument with θ replaced by |θ| − b_w^d we have that either b_w^ℓ ≤ ||θ| − b_w^d| or (1/2) b_w^{ℓ−1}(1 + b_w) ≤ ||θ| − b_w^d|, and since b_w^ℓ is larger than (1/2) b_w^{ℓ−1}(1 + b_w) the maximal possible value of ℓ, which we denote by ℓ_w(d), satisfies

    ℓ_w(d) = max{ ℓ ∈ Z : (1/2) b_w^{ℓ−1}(1 + b_w) < (1/2) b_w^d (b_w − 1) }.

The condition on ℓ can be rewritten as b_w^ℓ < b_w^{d+1}(b_w − 1)/(b_w + 1), which implies that ℓ < d + 1 + log_{b_w}((b_w − 1)/(b_w + 1)) and thus

    ℓ_w(d) = d + ⌈ log_{b_w}((b_w − 1)/(b_w + 1)) ⌉,

so that the smallest possible difference is independent of d and equal to

    s(w) := d − ℓ_w(d) = −⌈ log_{b_w}((b_w − 1)/(b_w + 1)) ⌉ = ⌊ log_{b_w}((b_w + 1)/(b_w − 1)) ⌋.

We thus need to show that s(w) ≥ w. As w is an integer this is equivalent to

    log_{b_w}((b_w + 1)/(b_w − 1)) ≥ w  ⟺  b_w^w ≤ (b_w + 1)/(b_w − 1)  ⟺  b_w^{w+1} − b_w^w − b_w − 1 ≤ 0,

which holds for all w since F_w(b_w) = 0. Note that our algorithm works correctly and deterministically because when |θ| is exactly half-way between two powers of b_w we pick the larger power. This shows that the output is of the desired form.

Finally, to show that the algorithm terminates we note that the k-th successive difference is bounded above by (1/2) b_w^{d−(k−1)s(w)} (b_w − 1), and this tends to 0 as k tends to infinity. Therefore after a finite number of steps (at most ⌈(d − log_{b_w}(2ε/(b_w − 1)))/s(w)⌉ + 1) the difference is smaller than or equal to ε and the algorithm terminates. □
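To illustrate the greedy loop of Algorithm 1, here is a minimal Python sketch (hypothetical function names; it reuses the nibnaf_base helper sketched after Lemma 1 and works with floating-point arithmetic, so it is meant for experimentation rather than as a definitive implementation).

    import math

    def greedy_representation(theta: float, b_w: float, eps: float) -> dict[int, int]:
        """Greedy loop of Algorithm 1: approximate w-NIBNAF digits of theta to precision eps.
        Returns a map {exponent r: digit a_r} with a_r in {-1, +1}; absent exponents are 0."""
        digits = {}
        sigma = 1 if theta >= 0 else -1
        t = abs(theta)
        while t > eps:
            r = math.ceil(math.log(t, b_w))
            # choose the closer of b_w^r and b_w^(r-1); ties go to the larger power
            if b_w**r - t > t - b_w**(r - 1):
                r -= 1
            digits[r] = sigma
            sigma *= 1 if t - b_w**r >= 0 else -1
            t = abs(t - b_w**r)
        return digits

Successive non-zero exponents returned this way differ by at least w (up to floating-point rounding), which mirrors the w-balanced property established in the proof above.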


The process of encoding works as described in the introduction, i.e. we follow the approach from [16,19] except we use an approximate w-NIBNAF representation instead of the balanced ternary representation. Thus to encode a real number θ we find an approximate w-NIBNAF representation of θ with small enough error and replace each occurrence of b_w by X, after which we apply the map ι to end up in the plaintext space R_t. Decoding is almost the same as well, only that after inverting ι and lifting the coefficients to Z we evaluate the resulting Laurent polynomial at X = b_w rather than X = 3, computing the value only to the required precision. Rather than evaluating directly it is best to reduce the Laurent polynomial modulo F_w(X) (or modulo F_w(X)/(X^2 + 1) if w ≡ 3 mod 4) so that we only have to compute powers of b_w up to w (respectively w − 2).
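A minimal sketch of this encode/decode pipeline, building on the greedy_representation helper above (the map ι into R_t and the reduction modulo F_w(X) are omitted here; names are hypothetical):

    def encode(theta: float, b_w: float, eps: float) -> dict[int, int]:
        """Exponent -> digit map of the Laurent polynomial obtained by replacing b_w with X."""
        return greedy_representation(theta, b_w, eps)

    def decode(poly: dict[int, int], b_w: float) -> float:
        """Decode by evaluating the Laurent polynomial at X = b_w."""
        return sum(a * b_w**i for i, a in poly.items())

    b3 = nibnaf_base(3)
    enc = encode(2.7, b3, eps=0.01)
    assert abs(decode(enc, b3) - 2.7) <= 0.01   # error bounded by eps, cf. Lemma 2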

Clearly we can also ask Algorithm 1 to return Σ_i a_i X^i ∈ Z_t[X^{±1}]; this gives an encoding of θ with maximal error ε. Since the input θ of the algorithm can get arbitrarily close to but larger than ε, the final term can be ±X^h where h = ⌊log_{b_w}(2ε/(1 + b_w))⌋ + 1. If we are to ensure that the smallest power of the base to appear in any approximate w-NIBNAF representation is b_w^s, then we require that if b_w^{s−1} is the nearest power of b_w to the input θ then |θ| ≤ ε, so that we must have (1/2) b_w^{s−1}(1 + b_w) ≤ ε, which implies the smallest precision we can achieve is ε = b_w^{s−1}(1 + b_w)/2. In particular, if we want no negative powers of b_w then the best precision possible using the greedy algorithm is (1 + b_w^{−1})/2 < 1.

Remark. If one replaces b_w by a smaller base b > 1 then Algorithm 1 still produces a w-NIBNAF expansion to precision ε: this follows from the proof of Lemma 2. The distinguishing feature of b_w is that it is maximal with respect to this property, so that the resulting expansions become as short as possible.

3 Analysis of coefficient growth during computation

After encoding, the input data is ready for homomorphic computations. These computations increase both the number of non-zero coefficients and the size of these coefficients. Since we are working in the ring R_t there is a risk that our data wraps around modulo t as well as modulo f(X), in the sense explained in the introduction, which we should avoid since this leads to erroneous decoding. Therefore we need to understand the coefficient growth more thoroughly. We simplify the analysis in this section by only considering multiplications and what constraint this puts on t; it is then not hard to generalize this to include additions.

Worst case coefficient growth for w-NIBNAF encodings. Here we analyze the maximal possible size of a coefficient which could occur from computing with w-NIBNAF encodings. Because fresh w-NIBNAF encodings are just approximate w-NIBNAF representations written as elements of R_t, we consider finite w-balanced ternary sequences and the multiplication endowed on them from R_t. Equivalently, we consider multiplication in the Z[X^{±1}]-plane depicted in Figure 1. As we ensure in practice that there is no wrap-around modulo f(X), this can be ignored in our analysis.


To start the worst case analysis we have the following lower bound; note that the d we use here is not the degree of f(X).

Lemma 3. The maximal absolute size of a term that can appear in the product of p arbitrary w-balanced ternary sequences of length d + 1 is at least

$$B_w(d, p) := \sum_{k=0}^{\lfloor \lfloor p\lfloor d/w\rfloor/2\rfloor / (\lfloor d/w\rfloor+1) \rfloor} (-1)^k \binom{p}{k} \binom{p-1+\lfloor p\lfloor d/w\rfloor/2\rfloor - k\lfloor d/w\rfloor - k}{p-1}.$$

A full proof of this lemma is given in Appendix A, but the main idea is to look at the largest coefficient of m^p where m has the maximal number of non-zero coefficients, ⌊d/w⌋ + 1, all being equal to 1 and with exactly w − 1 zero coefficients between each pair of adjacent non-zero coefficients. The (non-zero) coefficients of m^p are variously known in the literature as extended (or generalized) binomial coefficients or ordinary multinomials [22,18,35,21]; we denote them here by $\binom{p}{k}_n$, defined via

$$(1 + X + X^2 + \cdots + X^{n-1})^p = \sum_{k=0}^{\infty} \binom{p}{k}_n X^k.$$

In particular the maximal coefficient is the (or a) central one, and we can write B_w(d, p) = $\binom{p}{k}_n$ where k = ⌊p⌊d/w⌋/2⌋ and n = ⌊d/w⌋ + 1.
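The identity between this alternating sum and the central ordinary multinomial is easy to verify numerically; the sketch below (an illustrative check, not part of the paper's implementation) expands (1 + X + ... + X^{n-1})^p with exact integer arithmetic and compares its central coefficient with B_w(d, p).

    from math import comb

    def central_multinomial(n: int, p: int) -> int:
        """Central coefficient of (1 + X + ... + X^(n-1))^p via explicit polynomial expansion."""
        poly = [1]
        for _ in range(p):
            new = [0] * (len(poly) + n - 1)
            for i, a in enumerate(poly):
                for j in range(n):
                    new[i + j] += a
            poly = new
        return poly[p * (n - 1) // 2]

    def B(w: int, d: int, p: int) -> int:
        """Lower bound B_w(d, p) of Lemma 3, written with n = floor(d/w) + 1."""
        n = d // w + 1
        k0 = p * (n - 1) // 2                  # the central exponent floor(p*floor(d/w)/2)
        return sum((-1) ** k * comb(p, k) * comb(p - 1 + k0 - k * n, p - 1)
                   for k in range(k0 // n + 1))

    assert all(B(w, d, p) == central_multinomial(d // w + 1, p)
               for w in (1, 2, 3) for d in (6, 7, 12) for p in (1, 2, 3, 4, 5))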

We note that the w-NIBNAF encoding, using the greedy algorithm with precision 1/2, of b_w^{d+w−(d mod w)} (b_w − 1)/2 is m, so in practice this lower bound is achievable although highly unlikely to occur.

We expect that this lower bound is tight; indeed we were able to prove the following lemma, the proof of which is also given in Appendix A.

Lemma 4. Suppose w divides d; then B_w(d, p) equals the maximal absolute size of a term that can be produced by taking the product of p arbitrary w-balanced ternary sequences of length d + 1.

We thus make the following conjecture, which holds for all small values of p and d we tested and which we assume to be true in general.

Conjecture 1. The lower bound B_w(d, p) given in Lemma 3 is exact for all d; that is, the maximal absolute term size which can occur after multiplying p arbitrary w-balanced ternary sequences of length d + 1 is B_w(d, p).

This conjecture seems very plausible since as soon as one multiplicand does not have non-zero coefficients exactly w places apart, the non-zero coefficients start to spread out and decrease in value.

To determine B_w(d, p) for fixed p define n := ⌊d/w⌋ + 1; then we can expand the expression for B_w(d, p) as a 'polynomial' in n of degree p − 1 whose coefficients depend on the parity of n, see [5] for more details. The first few are:

    B_w(d, 1) = 1,
    B_w(d, 2) = n,
    B_w(d, 3) = (1/8)(6n^2 + 1) − (−1)^n/8,
    B_w(d, 4) = (1/3)(2n^3 + n),
    B_w(d, 5) = (1/384)(230n^4 + 70n^2 + 27) − ((−1)^n/384)(30n^2 + 27),
    B_w(d, 6) = (1/20)(11n^5 + 5n^3 + 4n).


Denoting the coefficient of n^{p−1} in these expressions by ℓ_p, it can be shown (see [2] or [5]) that lim_{p→∞} √p · ℓ_p = √(6/π), and hence we have

    lim_{p→∞} ( log_2(B_w(d, p)) − (p − 1) log_2(n) + (1/2) log_2(πp/6) ) = 0,

or equivalently B_w(d, p) ∼_p √(6/(πp)) · n^{p−1}. Thus we have the approximation

    log_2(B_w(d, p)) ≈ (p − 1) log_2(n) − (1/2) log_2(πp/6),

which for large enough n (experimentally we found for n > 1.825 √(p − 1/2)) is an upper bound for p > 2. For a guaranteed upper bound we refer to Mattner and Roos [30], who state that for n, p ∈ Z_{>0} with n ≥ 2, if p ≠ 2 or n ∈ {2, 3, 4} then B_w(d, p) ≤ √(6/(πp(n^2 − 1))) · n^p. This upper bound is in fact a more precise asymptotic limit than the one above, which only considers the leading coefficient.
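To get a feel for the quality of this approximation, one can compare it with the exact values; the following snippet (illustrative only, reusing the B helper from the previous sketch) prints both for a few parameter choices.

    import math

    def approx_log2_B(n: int, p: int) -> float:
        """(p - 1) * log2(n) - (1/2) * log2(pi * p / 6), the approximation above."""
        return (p - 1) * math.log2(n) - 0.5 * math.log2(math.pi * p / 6)

    for p in (3, 4, 5, 6):
        for w, d in ((2, 40), (2, 100)):
            n = d // w + 1
            print(f"p={p}, n={n}: exact {math.log2(B(w, d, p)):.2f} bits, "
                  f"approx {approx_log2_B(n, p):.2f} bits")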

Statistical analysis of the coefficient growth. Based on the w-NIBNAF encodings of random numbers N ∈ [−2^40, 2^40], we try to get an idea of the number of zero and non-zero coefficients in a fresh encoding without fractional part, obtained by running Algorithm 1 to precision (1 + b_w^{−1})/2. We also analyze how these proportions change when we perform multiplications. We plot this for different values of w to illustrate the positive effects of using sparser encodings. As a preliminary remark note that the w-NIBNAF encodings produced by Algorithm 1 applied to −N and N are obtained from one another by changing all the signs, so the coefficients −1 and 1 are necessarily distributed evenly.^6

We know from the definition of a w-NIBNAF expansion that at least w − 1 among each block of w consecutive coefficients of the expansion will be 0, so we expect for big w that the 0 coefficient occurs a lot more often than ±1. This is clearly visible in Figure 3. In addition we see an increasing number of 0 coefficients and a decreasing number of ±1 coefficients for increasing w. Thus both the absolute and the relative sparseness of our encodings increase as w increases.

Since the balanced ternary encoding of [16,19] and the 2-NAF encoding [33] only have coefficients in {0, ±1}, it is interesting to compare them to 1-NIBNAF and 2-NIBNAF respectively. We compare them by computing the percentage of zero and non-zero coefficients in 10 000 encodings of random integers N in [−2^40, 2^40]. We compute this percentage up to an accuracy of 10^{−2} and consider for our counts all coefficients up to and including the leading coefficient; further zero coefficients are not counted. When we compare the percentages of zero and non-zero coefficients occurring in 1-NIBNAF and balanced ternary in Table 1 we see that for the balanced ternary representation the occurrences of 0, 1 and −1 coefficients are approximately the same, while for 1-NIBNAF the proportion of 0

^6 This is a desirable property leading to the maximal amount of cancellation during computation. While this does not affect our worst case analysis, in practice, where the worst case is extremely unlikely, it accounts for a considerable reduction of the size of the coefficient modulus t. If in some application the input encodings happen to be biased towards 1 or −1, then one can work with respect to the negative base −b_w < −1, by switching the signs of all the digits appearing at an odd index.


[Figure 3: two curves, "coeff 0" and "non-zero coeff".]

Fig. 3. Plot of log2(#coeff) on the vertical axis against w on the horizontal axis, averaged over 10 000 w-NIBNAF encodings of random integers in [−2^40, 2^40].

                          balanced ternary   1-NIBNAF   2-NAF    2-NIBNAF
 zero coefficients             32.25%         48.69%    65.23%    70.46%
 non-zero coefficients         67.76%         51.31%    34.77%    29.54%

Table 1. Comparison between the previous encoding techniques and w-NIBNAF.

[Figure 4: six panels for w = 1, 2, 3, 50, 100, 150; each plots log2(#coeff + 1) against the coefficient value.]

Fig. 4. Plot of log2(#coeff + 1) on the vertical axis against the respective value of the coefficient on the horizontal axis, for the result of 10 000 multiplications of two w-NIBNAF encodings of random numbers in [−2^40, 2^40].

coefficients is larger than that of 1 or −1. Hence we can conclude that 1-NIBNAF encodings will be sparser than the balanced ternary encodings even though the window size is the same. For 2-NIBNAF we also see an improvement in terms of sparseness of the encoding compared to 2-NAF.
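Such percentages can be re-created from the greedy encoder sketched in the previous section; the loop below is an illustrative approximation of the experiment (hypothetical names, reusing nibnaf_base and greedy_representation from the earlier sketches; the exact figures depend on the counting conventions used for Table 1).

    import random

    def zero_ratio(w: int, samples: int = 1000) -> float:
        """Fraction of zero coefficients, up to the leading coefficient, in w-NIBNAF
        encodings of random integers in [-2^40, 2^40]."""
        b_w = nibnaf_base(w)
        eps = (1 + 1 / b_w) / 2                # integer-only precision of the greedy algorithm
        zeros = total = 0
        for _ in range(samples):
            digits = greedy_representation(random.randint(-2**40, 2**40), b_w, eps)
            if not digits:
                continue
            length = max(digits) + 1           # coefficients 0 .. leading exponent
            total += length
            zeros += length - len(digits)
        return zeros / total

    print(zero_ratio(1), zero_ratio(2))        # compare with the 48.69% and 70.46% of Table 1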


[Figure 5: six panels; in each, curves for 1 to 5 multiplications, with the coefficient index running up to a few hundred in the first three panels and up to a few thousand in the last three.]

Fig. 5. log2 of the maximum absolute value of the coefficient of x^i seen during 10 000 products of two w-NIBNAF encodings of random numbers in [−2^40, 2^40], plotted against i.

The next step is to investigate what happens to the coefficients when we multiply two encodings. From Figure 4 we see that when w increases the maximal size of the resulting coefficients becomes smaller. So the plots confirm the expected result that sparser encodings lead to a reduction in the size of the resulting coefficients after one multiplication. Next, we investigate the behaviour for an increasing number of multiplications. In Figure 5 one observes that for a fixed number of multiplications the maximum coefficient, taken over all coefficients of the resulting polynomial, decreases as w increases, while the maximum degree of the polynomial increases as w increases. This confirms that increasing the degree of the polynomial, in order to make it sparser, has the desirable effect of decreasing the size of the coefficients. Figure 5 also shows that, based on the result of one multiplication, we can even estimate the maximum value of the average coefficients of x^i for a specific number of multiplications by scaling the result for one multiplication.

To summarize, we plot the number of bits of the maximum coefficient of the polynomial that results from a certain fixed number of multiplications as a function of w in Figure 6. From this figure we clearly see that the maximal coefficient decreases when w increases and hence the original encoding polynomial is sparser. In addition we see that the effect of the sparseness of the encoding on the size of the resulting maximal coefficient is bigger when the number of multiplications increases. However, the gain of sparser encodings decreases as w becomes bigger. Furthermore, Figure 6 shows that the bound given in Lemma 3 is much bigger than the observed upper bound we get from 10 000 samples.

4 Practical impact

We encounter the following constraints on the plaintext coefficient modulus t while homomorphically computing with polynomial encodings of finite precision real numbers. The first constraint comes from the correctness requirement of the SHE scheme: the noise inside the ciphertext should not exceed a certain level during the computations, otherwise decryption fails. Since an increase of the


[Figure 6: for 1 to 5 multiplications, average and upper-bound curves of log2(max(coefficients)) plotted against w.]

Fig. 6. log2 of the observed and theoretical maximum absolute coefficient of the result of multiplying w-NIBNAF encodings of random numbers in [−2^40, 2^40], against w.

plaintext modulus expands the noise, this places an upper bound on the possible t which can be used. The second constraint does not relate to SHE but to the circuit itself. After any arithmetic operation the polynomial coefficients tend to grow. Given that fact, one should take a big enough plaintext modulus in order to prevent or mitigate possible wrapping around modulo t. This determines a lower bound on the range of possible values of t. In practice, for deep enough circuits these two constraints are incompatible, i.e. there is no interval from which t can be chosen. However, the plaintext space R_t can be split into smaller rings R_{t_1}, . . . , R_{t_k} with t = ∏_{i=1}^{k} t_i using the Chinese Remainder Theorem (CRT). This technique [8] allows us to take the modulus big enough for correct evaluation of the circuit and then perform k threads of the homomorphic algorithm, one over each R_{t_i}. These k output polynomials are then combined into the final output, again by CRT. This approach needs k times more memory and time than the case of a single modulus. Thus the problem is mostly about reducing the number of factors of t needed.
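Coefficient-wise, the CRT splitting and recombination look as follows (a minimal Python sketch with hypothetical names; in the actual scheme this is applied to every coefficient of the plaintext polynomials, and the homomorphic circuit is run once per factor t_i).

    from math import prod

    def crt_split(x: int, moduli: list[int]) -> list[int]:
        """Residues of a plaintext coefficient x modulo the coprime factors t_1, ..., t_k."""
        return [x % t_i for t_i in moduli]

    def crt_combine(residues: list[int], moduli: list[int]) -> int:
        """Recombine the k outputs into a single result modulo t = t_1 * ... * t_k."""
        t = prod(moduli)
        x = 0
        for r_i, t_i in zip(residues, moduli):
            N_i = t // t_i
            x += r_i * N_i * pow(N_i, -1, t_i)   # pow(N_i, -1, t_i) is the inverse of N_i mod t_i
        return x % t

    moduli = [257, 263, 269]                      # example pairwise coprime plaintext moduli
    assert crt_combine(crt_split(12345, moduli), moduli) == 12345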

An a priori lower bound on t can be derived using the worst case scenario in which the final output has the maximal possible coefficient, which was analyzed in Section 3. If we use w-NIBNAF encodings for increasing values of w then this lower bound will decrease, eventually leading to fewer CRT factors; here a concern is not to take w too large, to prevent wrapping around modulo f(X). In practice though, we can take t considerably smaller because the worst case occurs with a negligible probability, which even decreases for circuits having a bigger multiplicative depth. Moreover, we can allow the least significant coefficients of the fractional part to wrap around modulo t with no harm to the final results.

In this section we revisit the homomorphic method for electricity load forecasting described in [6] and demonstrate that by using w-NIBNAF encodings, by ignoring the unlikely worst cases, and by tolerating minor precision losses we can reduce the number of CRT factors from k = 13 to k = 1, thereby enhancing its practical performance by a factor of 13. We recall that [6] uses the Fan-Vercauteren


SHE scheme [23], along with the group method of data handling (GMDH) as a prediction tool; we refer to [6, 3] for a quick introduction to this method. Due to the fact that 80 percent of electricity meter devices in the European Union should be replaced with smart meters by 2020, this application may mitigate some emerging privacy and efficiency issues.

Experimental setup. For comparison's sake we mimic the treatment in [6] as closely as possible. In particular we also use the real-world measurements obtained from the smart meter electricity trials performed in Ireland [14]. This dataset [14] contains the observed electricity consumption of over 5000 residential and commercial buildings during 30-minute intervals. We use aggregated consumption data of 10 buildings. Given previous consumption data with some additional information, the GMDH network has the goal of predicting the electricity demand for the next time period. Concretely, it requires 51 input parameters: the 48 previous measurements plus the day of the week, the month and the temperature. There are three hidden layers with 8, 4 and 2 nodes, respectively. A single output node provides the electricity consumption prediction for the next half hour. Recall that a node is just a bivariate quadratic polynomial evaluation.

The plaintext space is of the form R_t = Z_t[X]/(X^4096 + 1), where the degree d = 4096 is motivated by the security level of 80 bits which is targeted in [6]; recent work by Albrecht [1] implies that the actual level of security is slightly less than that. Inside R_t the terms corresponding to the fractional parts and those corresponding to the integral parts come closer together after each multiplication. Wrapping around modulo X^4096 + 1, i.e. ending up at the left or at the right of the box depicted in Figure 1, means that inside R_t these integer and fractional parts start to overlap. In this case it is no longer possible to decode correctly. We encode the input data using approximate w-NIBNAF representations with a fixed number of integer and fractional digits. When increasing the window size w one should take into account that the precision of the corresponding encodings changes as well. To maintain the same accuracy of the algorithm it is important to keep the precision fixed, hence for bigger w's the smaller base b_w should result in an increase of the number of coefficients used by an encoding. Starting from the balanced ternary expansion (BTE), for any w > 2 the numbers ℓ(w)_i and ℓ(w)_f of integer and fractional digits should be expanded according to ℓ(w)_i = (ℓ(BTE)_i − 1) · log_{b_w} 3 + 1 and ℓ(w)_f = −⌊log_{b_w} e_f⌋, where e_f is the maximal error of an approximate w-NIBNAF representation such that the prediction algorithm preserves the same accuracy. Empirically we found that the GMDH network demonstrates reasonable absolute and relative errors when ℓ(BTE)_i^{inp} = 4 and e_f^{inp} = 1 for the input, and ℓ(BTE)_i^{pol} = 2 and e_f^{pol} = 0.02032 for the coefficients of the nodes (quadratic polynomials).
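As an illustration of these formulas, the sketch below converts BTE digit counts into w-NIBNAF digit counts (hypothetical names, reusing nibnaf_base from the earlier sketches; rounding the integer-part count up to the next integer is an assumption made here, since the paper does not spell out the rounding).

    import math

    def nibnaf_digit_counts(w: int, l_bte_int: int, e_f: float) -> tuple[int, int]:
        """l(w)_i = (l(BTE)_i - 1) * log_{b_w}(3) + 1 and l(w)_f = -floor(log_{b_w}(e_f))."""
        b_w = nibnaf_base(w)
        l_i = math.ceil((l_bte_int - 1) * math.log(3, b_w) + 1)   # rounded up (assumption)
        l_f = -math.floor(math.log(e_f, b_w))
        return l_i, l_f

    # input data: l(BTE)_i = 4, e_f = 1; node coefficients: l(BTE)_i = 2, e_f = 0.02032
    print(nibnaf_digit_counts(950, 4, 1.0), nibnaf_digit_counts(950, 2, 0.02032))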

Results. The results reported in this section are obtained by running the same software and hardware as in [6]: namely, the FV-NFLlib software library [17] running on a laptop equipped with an Intel Core i5-3427U CPU (running at 1.80GHz). We performed 8560 runs of the GMDH algorithm with BTE, NAF and 950-NIBNAF. The last expansion uses the maximal possible w such that the resulting output


[Figure 7: per-coefficient sizes in binary bits against the coefficient index (0 to 4095, split at index 385 into an integer part and a fractional part), showing mean and max for BTE, NAF and 950-NIBNAF.]

Fig. 7. The mean and the maximal size per coefficient of the resulting polynomial.

polynomial still has discernible integer and fractional parts. Correct evaluation of the prediction algorithm requires the plaintext modulus to be bigger than the maximal coefficient of the resulting polynomial. This lower bound for t can be deduced either from the maximal coefficient (in absolute value) appearing after any run or, in case of a known distribution of coefficient values, from the mean and the standard deviation. In both cases increasing window sizes reduce the bound, as depicted in Figure 7. Since negative encoding coefficients are used, 950-NIBNAF demands a plaintext modulus of 7 bits, which is almost 6 times smaller than for BTE and NAF.

As expected, w-NIBNAF encodings have longer expansions for bigger w's, and that disrupts the decoding procedure in [6,16]. Namely, they naively split the resulting polynomial into two parts of equal size. As one can observe in Figure 7, when using 950-NIBNAF, decoding in this manner will not give correct results. Instead, the splitting index i_s should be shifted towards zero, i.e. to 385. To be specific, [6, Lem. 1] states that i_s lies in the interval (d_i + 1, d − d_f) where d_i = 2^{r+1}(ℓ(w)_i^{inp} + ℓ(w)_i^{pol}) − ℓ(w)_i^{pol} and d_f = 2^{r+1}(ℓ(w)_f^{inp} + ℓ(w)_f^{pol}) − ℓ(w)_f^{pol}. Indeed, this is the worst case estimation, which results in the maximal w = 74 for the current network configuration.

However, the impact of the lower coefficients of the fractional part can be much smaller than the precision required by an application. In our use case the prediction value should be precise up to e_f^{inp} = 1. We denote the aggregated sum of lower coefficients multiplied by the corresponding powers of the w-NIBNAF base by L(j) = ∑_{i=j−1}^{i_s} a_i b_w^{−i}. Then the omitted fractional coefficients a_i should satisfy |L(i_c)| < 1, where i_c is the index after which coefficients are ignored.

To find i_c we computed L(j) for every index j of the fractional part and stored those sums for each run of the algorithm. For fixed j the distribution of L(j) is bimodal with mean μ_{L(j)} and standard deviation σ_{L(j)} (see Figure 8). Despite the fact that this unknown distribution is not normal, we naively approximate the prediction interval [μ_{L(j)} − 6σ_{L(j)}, μ_{L(j)} + 6σ_{L(j)}] that will contain a future observation with high probability. It seems to be a plausible guess in


[Figure 8: histogram of L(3500), values between −2 and 1, counts up to about 300.]

Fig. 8. The distribution of L(3500) over 8560 runs of the GMDH algorithm and an approximation of its prediction interval in red.

[Figure 9: τ(j) plotted against j for j between 3,100 and 3,700, with a marker at j = 3,388.]

Fig. 9. The expected precision loss after ignoring fractional coefficients less than j.

                       t            CRT factors   timing for one run
 950-NIBNAF            2^5.044           1              2.57 s
 BTE (this paper)      2^41.627          5             12.95 s
 BTE [6]               2^103.787        13             32.5 s

Table 2. GMDH implementation with 950-NIBNAF and BTE [6].

this application because all observed L(j) fall into that region with a big overestimate according to Figure 8. Therefore i_c is equal to the maximal j that satisfies τ(j) < 1, where τ(j) = max(|μ_{L(j)} − 6σ_{L(j)}|, |μ_{L(j)} + 6σ_{L(j)}|).

As Figure 9 shows, i_c is equal to 3388. Thus, the precision setting allows an overflow in any fractional coefficient a_j for j < 3388. The final goal is to provide a bound on t which is bigger than any a_j for j ≥ 3388. Since the explicit distributions of the coefficients are unknown and seem to vary among different indices, we rely in our analysis on the maximal coefficients occurring among all runs. Hence, the plaintext modulus should be bigger than max_{j≥3388} a_j over all resulting polynomials. Looking back at Figure 7, one can find that t = 33 suffices.

As mentioned above, t is constrained in two ways: by the circuit and by the SHE correctness requirements. In our setup the ciphertext modulus is q ≈ 2^186 and the standard deviation of the noise is σ = 102, which together impose that t ≤ 396 [6]. This is perfectly compatible with t = 33, therefore 950-NIBNAF allows us to omit the CRT trick and work with a single modulus, reducing the sequential timings by a factor of 13. In parallel mode it means that 13 times less memory is needed.

Additionally, these plaintext moduli are much smaller than the worst case estimation from Section 3. For 950-NIBNAF we take d ∈ [542, 821] according to the encoding degrees of the input data and network coefficients. Any such encoding contains only one non-zero coefficient. Consequently, any product of those encodings has only one non-zero coefficient, which is equal to ±1.


When all monomials of the GMDH polynomial result in an encoding with the same index of a non-zero coefficient, the maximal possible coefficient of the output encoding will occur. In this case the maximal coefficient is equal to the evaluation of the GMDH network with all input data and network coefficients being just 1. It leads to t = 2 · 6^{15} ≈ 2^{39.775}.

One further consequence of a smaller t is that one can reconsider the parameters of the underlying SHE scheme. Namely, one can take smaller q and σ that preserve the same security level and require a smaller bound on t instead of the 396 taken above. Given t = 33 from the above experiments, q reduces to 2^154 together with σ ≈ 5, which corresponds to smaller ciphertexts and faster SHE routines; here σ is taken as small as possible while still preventing the Arora-Ge attack [4], as long as each batch of input parameters is encrypted with a different key. Unfortunately, it is not possible to reduce the size of q by 32 bits in our implementation due to constraints of the FV-NFLlib library.

5 Conclusions

We have presented a generic technique to encode real numbers using a non-integral base. This encoding technique is especially suitable for homomorphic function evaluation, since it utilizes the large degree of the defining polynomial imposed by the security requirements. This leads to considerably smaller growth of the coefficients and allows one to reduce the size of the plaintext modulus significantly, resulting in faster implementations. We show that in the setting studied in [6], where somewhat homomorphic function evaluation is used to achieve a privacy-preserving electricity forecast algorithm, the plaintext modulus can be reduced from about 2^103 when using a balanced ternary expansion encoding to 33 ≈ 2^{5.044} when using the encoding method introduced in this paper (non-integral base non-adjacent form with window size w); see Table 2. This smaller plaintext modulus means a factor 13 decrease in the running time of this privacy-preserving forecasting algorithm, closing the gap even further towards making this approach suitable for industrial applications in the smart grid.

References

1. M. R. Albrecht. On dual lattice attacks against small-secret LWE and parameter choices in HElib and SEAL. In J. Coron and J. B. Nielsen, editors, EUROCRYPT 2017, volume 10211 of LNCS, pages 103–129, 2017.
2. I. Aliev. Siegel's lemma and sum-distinct sets. Discrete Comput. Geom., 39(1-3):59–66, 2008.
3. E. Alkim, L. Ducas, T. Pöppelmann, and P. Schwabe. Post-quantum key exchange – a new hope. In USENIX Security Symposium. USENIX Association, 2016.
4. S. Arora and R. Ge. New algorithms for learning in presence of errors. In L. Aceto, M. Henzinger, and J. Sgall, editors, ICALP 2011, Part I, volume 6755 of LNCS, pages 403–415. Springer, Heidelberg, July 2011.
5. C. Bootland. Central Extended Binomial Coefficients and Sums of Powers. In preparation.


6. J. W. Bos, W. Castryck, I. Iliashenko, and F. Vercauteren. Privacy-friendly forecasting for the smart grid using homomorphic encryption and the group method of data handling. In M. Joye and A. Nitaj, editors, AFRICACRYPT 2017, volume 10239 of LNCS, pages 184–201, 2017.
7. J. W. Bos, C. Costello, M. Naehrig, and D. Stebila. Post-quantum key exchange for the TLS protocol from the ring learning with errors problem. In IEEE S&P, pages 553–570. IEEE Computer Society, 2015.
8. J. W. Bos, K. Lauter, J. Loftus, and M. Naehrig. Improved security for a ring-based fully homomorphic encryption scheme. In M. Stam, editor, Cryptography and Coding 2013, volume 8308 of LNCS, pages 45–64. Springer, 2013.
9. Z. Brakerski, C. Gentry, and V. Vaikuntanathan. (Leveled) fully homomorphic encryption without bootstrapping. In S. Goldwasser, editor, ITCS 2012, pages 309–325. ACM, Jan. 2012.
10. Z. Brakerski and V. Vaikuntanathan. Fully homomorphic encryption from ring-LWE and security for key dependent messages. In P. Rogaway, editor, CRYPTO 2011, volume 6841 of LNCS, pages 505–524. Springer, Heidelberg, Aug. 2011.
11. J. H. Cheon, J. Jeong, J. Lee, and K. Lee. Privacy-preserving computations of predictive medical models with minimax approximation and non-adjacent form. In Proceedings of WAHC 2017, LNCS, 2017.
12. J. H. Cheon, A. Kim, M. Kim, and Y. Song. Homomorphic encryption for arithmetic of approximate numbers. Cryptology ePrint Archive, Report 2016/421, 2016. http://eprint.iacr.org/2016/421.
13. H. Cohen, A. Miyaji, and T. Ono. Efficient Elliptic Curve Exponentiation Using Mixed Coordinates. In K. Ohta and D. Pei, editors, Advances in Cryptology – ASIACRYPT '98, volume 1514 of LNCS, pages 51–65. Springer, 1998.
14. Commission for Energy Regulation. Electricity smart metering customer behaviour trials (CBT) findings report. Technical Report CER11080a, 2011. http://www.cer.ie/docs/000340/cer11080(a)(i).pdf.
15. A. Costache, N. P. Smart, and S. Vivek. Faster homomorphic evaluation of Discrete Fourier Transforms. IACR Cryptology ePrint Archive, 2016.
16. A. Costache, N. P. Smart, S. Vivek, and A. Waller. Fixed point arithmetic in SHE schemes. In SAC 2016, LNCS. Springer, 2016.
17. CryptoExperts. FV-NFLlib. https://github.com/CryptoExperts/FV-NFLlib, 2016.
18. A. de Moivre. The Doctrine of Chances. Woodfall, 1738.
19. N. Dowlin, R. Gilad-Bachrach, K. Laine, K. Lauter, M. Naehrig, and J. Wernsing. Manual for using homomorphic encryption for bioinformatics. Technical report, MSR-TR-2015-87, Microsoft Research, 2015.
20. N. Dowlin, R. Gilad-Bachrach, K. Laine, K. E. Lauter, M. Naehrig, and J. Wernsing. CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy. In M. Balcan and K. Q. Weinberger, editors, International Conference on Machine Learning, volume 48, pages 201–210. JMLR.org, 2016.
21. S. Eger. Stirling's Approximation for Central Extended Binomial Coefficients. The American Mathematical Monthly, 121:344–349, 2014.
22. L. Euler. De evolutione potestatis polynomialis cuiuscunque (1 + x + x^2 + x^3 + x^4 + etc.)^n. Nova Acta Academiae Scientarum Imperialis Petropolitinae, 12:47–57, 1801.
23. J. Fan and F. Vercauteren. Somewhat practical fully homomorphic encryption. IACR Cryptology ePrint Archive, 2012:144, 2012.


24. C. Gentry. Fully homomorphic encryption using ideal lattices. In M. Mitzenmacher, editor, 41st ACM STOC, pages 169–178. ACM Press, May/June 2009.
25. N. Göttert, T. Feller, M. Schneider, J. Buchmann, and S. A. Huss. On the design of hardware building blocks for modern lattice-based encryption schemes. In E. Prouff and P. Schaumont, editors, CHES 2012, volume 7428 of LNCS, pages 512–529. Springer, Heidelberg, Sept. 2012.
26. T. Güneysu, T. Oder, T. Pöppelmann, and P. Schwabe. Software speed records for lattice-based signatures. In P. Gaborit, editor, PQCrypto 2013, volume 7932 of LNCS, pages 67–82. Springer, 2013.
27. K. E. Lauter, A. López-Alt, and M. Naehrig. Private computation on encrypted genomic data. In D. F. Aranha and A. Menezes, editors, LATINCRYPT 2014, volume 8895 of LNCS, pages 3–27. Springer, Heidelberg, Sept. 2015.
28. A. K. Lenstra, H. W. Lenstra, and L. Lovász. Factoring polynomials with rational coefficients. Math. Ann., 261:515–534, 1982.
29. V. Lyubashevsky, D. Micciancio, C. Peikert, and A. Rosen. SWIFFT: A modest proposal for FFT hashing. In K. Nyberg, editor, FSE 2008, volume 5086 of LNCS, pages 54–72. Springer, Heidelberg, Feb. 2008.
30. L. Mattner and B. Roos. Maximal probabilities of convolution powers of discrete uniform distributions. Statistics & Probability Letters, 78(17):2992–2996, 2008.
31. M. Naehrig, K. E. Lauter, and V. Vaikuntanathan. Can homomorphic encryption be practical? In C. Cachin and T. Ristenpart, editors, ACM Cloud Computing Security Workshop – CCSW, pages 113–124. ACM, 2011.
32. T. Pöppelmann and T. Güneysu. Towards practical lattice-based public-key encryption on reconfigurable hardware. In T. Lange, K. Lauter, and P. Lisonek, editors, SAC 2013, volume 8282 of LNCS, pages 68–85. Springer, Heidelberg, Aug. 2014.
33. G. W. Reitwiesner. Binary Arithmetic, volume 1 of Advances in Computers, pages 231–308. Academic Press, 1960.
34. D. Stehlé and R. Steinfeld. Making NTRU as secure as worst-case problems over ideal lattices. In K. G. Paterson, editor, EUROCRYPT 2011, volume 6632 of LNCS, pages 27–47. Springer, Heidelberg, May 2011.
35. J. W. Swanepoel. On a generalization of a theorem by Euler. Journal of Number Theory, 149:46–56, 2015.

A Proofs

Lemma 1. For an integer w ≥ 1 the polynomial F_w(x) = x^{w+1} − x^w − x − 1 has a unique positive root b_w > 1. The sequence b_1, b_2, . . . is strictly decreasing and lim_{w→∞} b_w = 1. Further, (x^2 + 1) | F_w(x) for w ≡ 3 mod 4.

Proof. For w ≥ 1, F′_w(x) = (w + 1)x^w − w x^{w−1} − 1 = (x − 1)((w + 1)x^{w−1} + x^{w−2} + · · · + 1), so that for x ≥ 0 there is only one turning point of F_w(x), at x = 1. Further, F″_w(x) = (w + 1)w x^{w−1} − w(w − 1)x^{w−2}, which takes the value 2w > 0 at x = 1, so the turning point is a minimum. Since F_w(0) = −1 and lim_{x→∞} F_w(x) = ∞ we conclude that there is a unique positive root b_w > 1 of F_w(x) for any w ≥ 1. Further, we have that F_{w+1}(x) = x F_w(x) + x^2 − 1, so that F_{w+1}(b_w) = b_w^2 − 1 > 0 and hence b_{w+1} < b_w; the sequence b_w is thus strictly decreasing and bounded below by 1, so it must converge


to some limit, say b_∞ ≥ 1. If b_∞ > 1 then, as b_w is the positive solution to x − 1 = (x + 1)/x^w and, for x ≥ b_∞ > 1, lim_{w→∞}(x + 1)/x^w = 0, we see that b_∞ = lim_{w→∞} b_w = 1, a contradiction. Hence b_∞ = 1 as required. Finally we see that F_w(x) = x(x − 1)(x^{w−1} + 1) − (x^2 + 1) and for w = 4k + 3 that x^{w−1} + 1 = 1 − (−x^2)^{2k+1} = (x^2 + 1) ∑_{i=0}^{2k} (−x^2)^i, and hence (x^2 + 1) | F_{4k+3}(x). □

Recall that to find a lower bound on the maximal absolute coefficient size we consider w-balanced ternary sequences, and to each sequence (a_i) we have the corresponding polynomial ∑_i a_i X^i in R_t. As we only look at the coefficients and their relative distances, we can simply assume that to each w-balanced ternary sequence c_0, c_1, . . . , c_d of length d + 1 we associate the polynomial c_0 + c_1 X + · · · + c_d X^d of degree d. Multiplication of polynomials thus gives us a way of multiplying (finite) w-balanced ternary sequences. In the rest of this appendix we use the polynomial and sequence notation interchangeably.

Lemma 3. The maximal absolute size of a term that can appear in the product of p arbitrary w-balanced ternary sequences of length d + 1 is at least

$$B_w(d, p) := \sum_{k=0}^{\lfloor \lfloor p\lfloor d/w\rfloor/2\rfloor / (\lfloor d/w\rfloor+1) \rfloor} (-1)^k \binom{p}{k} \binom{p-1+\lfloor p\lfloor d/w\rfloor/2\rfloor - k\lfloor d/w\rfloor - k}{p-1}.$$

Proof. Consider the product of p sequences all of which are equal to m = 10···010···010···0 of length d + 1, having n := ⌊d/w⌋ + 1 non-zero terms (all being 1) and exactly w − 1 zero terms between each pair of adjacent non-zero terms. Note that n is the maximal number of non-zero terms possible. As polynomials we have that m = ∑_{i=0}^{n−1} X^{iw} = (1 − X^{nw})/(1 − X^w), and hence we have

$$m^p = \left(\frac{1 - X^{nw}}{1 - X^w}\right)^{p} = (1 - X^{nw})^p (1 - X^w)^{-p} = \left(\sum_{i=0}^{p} (-1)^i \binom{p}{i} X^{inw}\right)\left(\sum_{j=0}^{\infty} \binom{p-1+j}{p-1} X^{jw}\right) = \sum_{\ell=0}^{\infty}\left(\sum_{k=0}^{\lfloor \ell/n\rfloor} (-1)^k \binom{p}{k}\binom{p-1+\ell-kn}{p-1}\right) X^{\ell w},$$

where we have used the substitution (i, j) → (k, ℓ) = (i, in + j). Since we know that m^p has degree p(n − 1)w we can in fact change the infinite sum over ℓ to a finite one from ℓ = 0 to p(n − 1). To give the tightest lower bound we look for the maximal coefficient of m^p. It is well known that this maximal coefficient occurs as the central coefficient, namely the coefficient of X^{ℓw} where ℓ is any nearest integer to p(n − 1)/2, and this gives us B_w(d, p). □

Lemma 4. Suppose w divides d; then B_w(d, p) equals the maximal absolute size of a term that can be produced by taking the product of p arbitrary w-balanced ternary sequences of length d + 1.


Proof. Let S_w(d, p) be the set of all sequences that are the product of p arbitrary w-balanced ternary sequences of length d + 1. To prove the lemma we bound all the terms of any sequence in S_w(d, p). For i = 0, . . . , pd define

    m_w(d, p, i) = max{ |a_i| : a_i is the i-th term of a sequence in S_w(d, p) }.

Define

$$B_w(d, p, \ell) := \sum_{k=0}^{\lfloor \ell/n\rfloor} (-1)^k \binom{p}{k}\binom{p-1+\ell-kn}{p-1},$$

the coefficient of X^{ℓw} in m^p. We will prove by induction on p that m_w(d, p, i) ≤ B_w(d, p, ⌊i/w⌋). We will use the notation C_i(f), for a polynomial f, to denote the coefficient of X^i in f(X); this is defined to be zero if i > deg(f) or i < 0. Thus in this notation B_w(d, p, ℓ) = C_{ℓw}((1 − X^{nw})^p/(1 − X^w)^p). The base case p = 1 is straightforward: all the m_w(d, 1, i) are equal to 1 by the definition of a w-balanced ternary sequence. We therefore suppose that m_w(d, p − 1, i) ≤ B_w(d, p − 1, ⌊i/w⌋) for 0 ≤ i ≤ (p − 1)d. Consider a product of p w-balanced ternary sequences of length d + 1. It can be written as f(X)e(X) where f(X) ∈ S_w(d, p − 1) and e(X) ∈ S_w(d, 1). We know that if f(X) = ∑_{i=0}^{(p−1)d} a_i X^i then |a_i| ≤ m_w(d, p − 1, i), and if e(X) = ∑_{j=0}^{d} α_j X^j then

$$(fe)(X) = f(X)e(X) = \sum_{k=0}^{pd}\left(\sum_{i=\max(0,\,k-d)}^{\min((p-1)d,\,k)} a_i \alpha_{k-i}\right) X^k,$$

and due to the form of e(X) we see that |C_k(fe)| ≤ ∑_{j=1}^{n_k} |a_{i_j}| ≤ ∑_{j=1}^{n_k} m_w(d, p − 1, i_j) for some n_k ≤ n, max(0, k − d) ≤ i_1 < i_2 < · · · < i_{n_k} ≤ min((p − 1)d, k) and i_{j+1} − i_j ≥ w for j = 1, . . . , n_k − 1.

The final condition on the i_j implies that the ⌊i_j/w⌋ are distinct, and since m_w(d, p − 1, i) is bounded above by B_w(d, p − 1, ⌊i/w⌋), which depends only on ⌊i/w⌋, we can recast this as

$$|C_k(fe)| \le \sum_{j=1}^{n_k} B_w(d, p-1, \ell_j) = \sum_{j=1}^{n_k} C_{\ell_j w}\!\left(\left(\frac{1-X^{nw}}{1-X^w}\right)^{p-1}\right),$$

where max(0, ⌊k/w⌋ − (n − 1)) ≤ ℓ_1 < ℓ_2 < · · · < ℓ_{n_k} ≤ min((p − 1)(n − 1), ⌊k/w⌋), and where we have used that d/w = n − 1 is an integer. Since ⌊k/w⌋ − (⌊k/w⌋ − (n − 1)) + 1 = n we see that to make n_k as large as possible the ℓ_j must be the (at most n) consecutive integers in this range, subject also to 0 ≤ ℓ_1 and ℓ_{n_k} ≤ (p − 1)(n − 1). Thus taking a maximum over all possible f and e we have

$$m_w(d, p, k) \le \sum_{\ell=\lfloor k/w\rfloor-(n-1)}^{\lfloor k/w\rfloor} C_{\ell w}\!\left(\left(\frac{1-X^{nw}}{1-X^w}\right)^{p-1}\right) = \sum_{j=0}^{n-1} C_{\lfloor k/w\rfloor w}\!\left(\left(\frac{1-X^{nw}}{1-X^w}\right)^{p-1} X^{w(n-1-j)}\right) = C_{\lfloor k/w\rfloor w}\!\left(\left(\frac{1-X^{nw}}{1-X^w}\right)^{p}\right) = B_w(d, p, \lfloor k/w\rfloor),$$

which proves the inductive step. To finish the proof we note, as before, that the maximal value of B_w(d, p, ⌊k/w⌋) for 0 ≤ k ≤ pd is reached, for example, when ⌊k/w⌋ = ⌊p⌊d/w⌋/2⌋, and in this case we have B_w(d, p) as required. □


Chapter 7

Towards practical privacy-preserving genome-wide association study

Publication data

This is an extended version of Bonte C., Makri E., Ardeshirdavani A., Simm J., Moreau Y. and Vercauteren F. Towards practical privacy-preserving genome-wide association study. In BMC Bioinformatics (Dec. 2018), A. Cuff, D. Talbot, Eds., vol. 19, Springer Nature, article number: 537.

Contribution: The author of this thesis is, together with Eleftheria Makri, a main author of this paper.

Towards practical privacy-preserving genome-wide association study

Charlotte Bonte^{2,4}, Eleftheria Makri^{2,3,4}, Amin Ardeshirdavani^1, Jaak Simm^1, Yves Moreau^{1,5}, Frederik Vercauteren^{2,5}

1 STADIUS, KU Leuven
2 imec-COSIC, Dept. Electrical Engineering, KU Leuven
3 ABRR, Saxion University of Applied Sciences
4 Joint first authors
5 Joint last authors

Abstract. The deployment of Genome-wide association studies (GWASs) requires genomic information of a large population to produce reliable results. This raises significant privacy concerns, making people hesitate to contribute their genetic information to such studies. We propose two provably secure solutions to address this challenge: (1) a somewhat homomorphic encryption (HE) approach, and (2) a secure multiparty computation (MPC) approach. Unlike previous work, our approach does not rely on adding noise to the input data, nor does it reveal any information about the patients. Our protocols aim to prevent data breaches by calculating the χ2 statistic in a privacy-preserving manner, without revealing any information other than whether the statistic is significant or not. Specifically, our protocols compute the χ2 statistic, but only return a yes/no answer, indicating significance. By not revealing the statistic value itself but only the significance, our approach thwarts attacks exploiting statistic values. We significantly increased the efficiency of our HE protocols by introducing a new masking technique to perform the secure comparison that is necessary for determining significance. We show that full-scale privacy-preserving GWAS is practical, as long as the statistics can be computed by low-degree polynomials. Our implementations demonstrated that both approaches are efficient. The secure multiparty computation technique completes its execution in approximately 2 ms for data contributed by one million subjects.

1 Introduction

The goal of a genome-wide association study (GWAS) is to identify genetic variants that are associated with traits. Large-scale sequencing provides reliable information on single nucleotide variants (SNVs). To date, researchers have worked mostly on identifying genetic alterations which lead to a classification of SNVs and SNPs (single nucleotide polymorphisms). Therefore, when we mention SNVs we refer to both frequent SNPs and less frequent SNVs. A common approach is to divide the population into a disease and a healthy group, based on whether the individual has the particular disease. Each individual gives a DNA sample from which millions of genetic variants (i.e., SNVs) are identified. If a variant is more frequent in individuals with the disease, it will likely be associated with the specific genetic disorder and be classified as a potential marker of the disease.

1.1 Motivation for the distributed setup with secure computations

Having a large population size is crucial for GWAS, because it allows improving the accuracy of identified associations, especially for rare genetic disorders. Two recent developments result in a significant increase of the available data for GWAS: first, the development of cheap next generation sequencing (NGS); second, the creation of distributed genomic databases, which enable pooling of data from many hospitals and research centers, further increasing the population sizes of the studies by 10-50 times. Several such distributed databases have recently been proposed, including NGS-Logistics [ASD+14], Elixir, and GA4GH Beacon [The16].

In studies like GWAS, which use personally identifiable genetic markers of the participants as input, the privacy of the patients and the protection of their sensitive data become of great importance. It has been shown by Malin et al. [Mal06] that releasing the raw data, even after removal of explicit identifiers, does not protect an individual from being identified. The classical approach to solve this privacy problem involves a trusted third party who first collects both the SNV and the trait data, then carries out the statistical test, and finally either a) only reveals the very few SNVs that have a statistically significant association or b) reveals all aggregate data on SNVs but masks them with sufficient noise to guarantee differential privacy. For example, previous works by Uhlerop et al. [USF13] and by Simmons and Berger [SB16] have focused on computing a differentially private χ2 test. However, setting up such a trusted third party has significant legal and technical difficulties given the sensitive nature of the underlying data.

The aforementioned privacy concerns make both individuals and medical centers hesitant to share this private data. Hence, centralized (third party) datasets collected for research purposes remain small. Our goal is to address this challenge in a way that the data can be shared without trusting an external third party. In our setup, the medical centers aggregate and encrypt or secret-share the patient data before sending it to a third party for research purposes. This ensures the privacy of the input data, because the only party with access to the raw input data is the medical center which gathers it. Hence, our distributed solution allows combining input data from different medical centers to construct a large dataset for research, while eliminating the privacy implications. Our solution can even scale up to millions of patients, and perform millions or tens of millions of hypothesis tests per day. This enables the first step towards large-scale distributed GWAS, where multiple medical centers contribute data, without relying on a trusted third party. Such large data collections would also allow association studies on rare diseases.


Another reason to opt for our secure computation solution instead of the trusted third party one is that the latter does not provide defense against malicious agents or operating system bugs, which might result in leakage of information. In our case, such a mishap would reveal encrypted values (or shares of a value, resp.), which essentially provides no information to the adversary, as long as the secret key is not compromised (or the adversary has fewer than n shares, resp.).

1.2 Motivation for the yes/no response

Studies related to GWAS raised even more privacy concerns. Research has brought to light that releasing aggregated statistics related to GWASs leaks information in an implicit way. Therefore, it is not enough to protect only the input data; care has to be taken when releasing aggregated results to the public as well. The work of Homer et al. [HSR+08] showed that the presence of an individual in the case group can be determined from the aggregated allele frequencies. One can argue that this attack requires an adversary to have at least 10,000 SNVs from the victim. However, we assume that with the current sequencing techniques this is no longer a challenge, and hence Homer's attack poses a real threat nowadays. By computing with encrypted or secret-shared data, and only revealing a boolean value indicating significance, we prevent adversaries from obtaining the aggregated allele frequencies, thus protecting against Homer's attack.

Shortly following Homer's attack, Wang et al. [WLW+09] reported an attack based on statistical values reported in GWAS papers. Even though the attack of Wang et al. [WLW+09] requires more statistical data than what our solution would reveal, such developments show that we need to be careful with the amount of information we publish. Our solution anticipates future statistical attacks by not publishing any statistic values at all.

1.3 Additional properties of our setup

Our proposal consists of a cryptographic approach, where the trusted third party performing research is replaced by a privacy-preserving system, which receives the input in encrypted (protected) form from a set of distributed parties (e.g., hospitals), performs the χ2 test, and only publicly discloses whether the current test is significant or not. Since nothing except the final answer is revealed during the execution of our protocols, the proposed system enjoys various security guarantees, even against malicious agents who gain access to the servers executing the system.
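For reference, the underlying statistic is the standard Pearson χ2 test on a genotype-phenotype contingency table; the plaintext Python sketch below (illustrative only, this is not the protocol itself, which evaluates the same computation on encrypted or secret-shared aggregate counts) shows the computation and the yes/no decision against a significance threshold.

    def chi2_significant(table: list[list[int]], threshold: float) -> bool:
        """Pearson chi-squared statistic of an observed contingency table
        (rows: case/control, columns: aggregated genotype or allele counts).
        Only the boolean outcome is meant to be disclosed."""
        row_totals = [sum(row) for row in table]
        col_totals = [sum(col) for col in zip(*table)]
        total = sum(row_totals)
        chi2 = 0.0
        for i, row in enumerate(table):
            for j, observed in enumerate(row):
                expected = row_totals[i] * col_totals[j] / total
                chi2 += (observed - expected) ** 2 / expected
        return chi2 > threshold

    # hypothetical aggregated counts; 3.841 is the 5% critical value for 1 degree of freedom
    print(chi2_significant([[120, 80], [90, 110]], threshold=3.841))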

Even though the aforementioned attacks show it is not a good idea to reveal the χ2 value, the value itself would be highly interesting for research purposes. Therefore, it is worth mentioning that our current system can be easily adapted to return the significance value itself. However, since revealing the values can cause privacy issues, we suggest to incorporate an authentication process into the system if the χ2 value should be revealed. This way the access to the actual χ2


values can be restricted to authenticated users. As such, the researchers can have access to the actual result, while it stays hidden from the public and therefore cannot be abused in an attack like the aforementioned ones.

By only revealing the yes/no answer, our system indicates whether the SNV is a possible marker. To determine whether or not this SNV is actually causally linked to the disease, more statistics need to be computed. Therefore, we assume that for the selected SNVs, indicated by our system, the researcher would request specific patient data from the different centers for further analysis. We assume this will happen with the current techniques for requesting data for GWASs. However, we expect patients to be more inclined to share their data for research, even despite the potential privacy concerns, when the researchers explain to them, with the aid of the public tables, that their data is highly relevant for the study of a specific disease.

Additionally, it is common practice in GWASs, and more general bioinformatics studies, to publish only when significant results are found. This means that all the insignificant (yet identified) results are not published, despite the fact that they could also contribute to finding, or eliminating, interesting correlations. In fact, a non-significant correlation between a genotype and a phenotype can serve as a proof that a certain mutation is not related to a disease. Our solution comes to bridge this gap, as we aim to construct a public table, listing all possible mutations versus all possible phenotypes, and indicating whether the initial relationship between them (indicated by the χ2 test) is significant or not. By publishing also the insignificant results in our public table, mutations not related to phenotypes can be immediately shown, allowing the researchers to discard them and focus only on the significant ones.

To allow for a privacy-preserving system addressing our challenges, we propose two secure approaches: one based on homomorphic encryption (HE), and one based on multiparty computation (MPC). We also compare their security guarantees and their efficiency in terms of execution time of practical implementations. Homomorphic encryption refers to a set of cryptographic tools that allow certain computations to take place in the encrypted domain, while the resulting ciphertext, when decrypted, is the expected (correct) result of operations on the plaintext data. Secure multiparty computation aims at allowing a similar functionality amongst several mutually distrusting parties, who wish to compute a function without revealing their private inputs. With the latter approach, communication between the computing parties is required for the execution of the cryptographic protocols.

In the MPC setting, there are two main security models used, offering passive or active security, respectively. Passive security, also known as security in the semi-honest model, assumes that the protocol participants are honest-but-curious. This means that they are trying to collect as much information as possible from the protocol execution, but they do follow the protocol instructions honestly. Active security, also known as malicious security, offers stronger security guarantees, assuming that adversaries or corrupted protocol participants may arbitrarily deviate from the protocol instructions. In both security


models, we can build protocols assuming an honest majority of the protocol participants, or a dishonest majority. Our solution with MPC offers the highest security guarantees, being built in the malicious model with dishonest majority.

Specifically, we make the following contributions:

– We propose the first somewhat homomorphic encryption approach to withstand GWAS attacks such as the ones described by Homer et al. [HSR+08].
– We develop a multiparty computation solution for GWAS that is efficient for realistic sample sizes.
– We propose a new masking technique to allow efficient secure comparisons.
– We compare the security and efficiency of HE and MPC on a real-life application.
– We demonstrate the practicality of our solutions, based on their short running times, which are in the range of 1.9-2.4 ms for the MPC approach.
– We show that our solution scales logarithmically in the number of subjects contributing their genetic information, allowing us to treat current population sizes, and being able to scale to larger (future) GWASs for millions of people.

2 Related Work

2.1 Homomorphic encryption approach

There has already been some work on using homomorphic encryption to preserve the privacy of the patients while performing statistics on genome data. Kim et al. [KL15] present the computation of minor allele frequencies and the χ2 statistic with the use of the homomorphic BGV and YASHE encryption schemes. They use a specific encoding technique to improve on the work of Lauter et al. [LLAN15a]. However, they only compute the allele counts homomorphically, and execute the other operations on the decrypted data. Another work on GWASs using fully homomorphic encryption was published by Lu et al. [LYS15]. They also start from encrypted genotype/phenotype information that is uploaded to a cloud for each person separately. Then they perform the minimal operations necessary to provide someone who has access to the decryption key with the values needed to construct the contingency table for the requested case, based on the data present on the cloud. Hence, when performing a request, the scientist gets three encrypted values, and based on those he can, after decryption, reconstruct the contingency table and compute the χ2 statistic in the clear. These solutions are not resistant to attacks like the one described by Homer et al. [HSR+08]. Our solution improves on these previous works by performing the χ2 computation in the encrypted domain, and revealing only whether or not the χ2 value is significant for this case, which makes the previously mentioned attacks impossible.

Sadat et al. [SAM+17] propose a hybrid system called SAFETY, to compute various statistical values over genomic data. This hybrid system consists of a


combination of the partially homomorphic Paillier scheme with the secure hardware component of Intel Software Guard Extensions (Intel SGX) to ensure both high efficiency and privacy. With this hybrid system they propose a more efficient way to get the total counts of all patients for a specific case. By using the additive property of the homomorphic Paillier scheme, they reduce the computational overhead of decrypting all individual encrypted outputs received from the different servers. Afterwards the system uses the Intel SGX component to perform the χ2 computations. Even though the results of this system scale well for an increasing number of servers that provide data for the computation, the system does not provide the same functionality as our solution. Sadat et al. [SAM+17] mention that the only privacy guarantee for the final computation result against the attack described by Homer et al. [HSR+08] is the assumption that the researcher decrypting the result is semi-honest. This is the main difference with our work: with our solution only the significance of the test will be made public. As mentioned before, the current system can be easily adapted to return the χ2 value itself, but due to known attacks we want to avoid making these values public. Hence, we believe that if our system is adapted to reveal the χ2 values, it should only reveal these values after authentication of the requesting party.

Zhang et al. [ZDJ+15] construct an algorithm which performs the whole χ2 statistic computation in the homomorphic domain. To compute the division, they construct a lookup table in which they link the result of their computation with the numerator and denominator of the corresponding simplified fraction. Therefore, an authenticated user can look up the correct fraction in the lookup table after decrypting the result, and hence recover the result of the χ2 statistic. Even though their strategy performs well, it does not scale enough to treat the large datasets we envision in our application. Increasing the number of patients in the study would increase the circuit depth significantly, which comes with several disadvantages, including increasing the parameter sizes, and hence the key size and ciphertext size, as well as the computation time.

2.2 Secure multiparty computation approach

Kamm et al. [KBLV13] propose a solution to address the privacy challenges in genome-wide association studies. Their application scenarios, much like ours, focus on large data collections from several biobanks, and their solutions are based on the same fundamental techniques as ours. However, the setting of Kamm et al. [KBLV13] requires all raw genotype, phenotype, and clinical data to be entered into the secure shared database. To the contrary, our setting assumes that only the aggregate values necessary to identify the significance of a gene-disease relationship (i.e., the contingency tables recording the counts of genotypes vs. phenotypes) are contributed by each biobank. This is a simpler, and more realistic setting, which not only is likely to be implemented in the near future, but also alleviates the computational cost of the proposed solutions. Unlike the approach of Kamm et al. [KBLV13], and the alternatives that they suggest, our solution achieves active security with dishonest majority (contrary to the semi-honest security they suggest). This means that our protocols tolerate dishonest behavior by the majority of the computing parties, while preserving privacy, and still guarantee the correctness of accepted results. Kamm et al.'s protocols assume that the computing parties –the biobanks– cannot be corrupted, which we consider to be a strong assumption.

Independent and concurrent work by Cho et al. [CWB18] tries to address the same problem as we do in our work, using multiparty computation techniques. They focus on a method that enables the identification and correction for population biases before computing the statistics. However, just like the work of Kamm et al. [KBLV13], they make the strong assumption of semi-honest security. In practice, semi-honest security is not a sufficient security guarantee for GWAS, as attackers who have obtained access to the systems are likely to employ active measures to obtain the data.

Constable et al. [CTW+15] present a garbled-circuit based MPC approach to perform GWAS. Their solution can compute in a privacy-preserving manner the minor allele frequency (MAF), and the χ² statistic. Similarly to the work of Kamm et al. [KBLV13], the framework of Constable et al. [CTW+15] requires the raw genotype, and phenotype data, increasing the workload of the proposed privacy-preserving system. In contrast to our solution, which can scale to hundreds of medical centers contributing data to the GWAS, the solution of Constable et al. [CTW+15] only works for two medical centers. Despite the strong security guarantees that our approach offers, which generally present a tradeoff with efficiency, our proposal is faster than that of Constable et al. [CTW+15]. This is also because we have optimized the computations of the χ² statistic in such a way that expensive computations in the privacy-preserving domain are avoided to the maximum extent possible.

Zhang et al. [ZBA15] propose a secret-sharing based MPC approach to solve the same GWAS problem as Constable et al. [CTW+15]. Although Zhang et al.'s solution can scale to more than two medical centers contributing data to the GWAS, the approach has the same inherent limitations (e.g., requiring raw genomic data as input) that their application scenario incurs. The works of Zhang et al. [ZBA15], Constable et al. [CTW+15], and Cho et al. [CWB18] have not considered protecting the aggregate statistic result of the private computation, which –as Homer et al. [HSR+08] showed– can be used to breach an individual's privacy. We additionally protect the aggregate statistic result, while at the same time allowing for a public list to be created, showing which SNVs are significant for a certain disease.

3 Distributed GWAS Scenario

In this paper we aim at identifying which mutations are linked to which phenotypes, without compromising the privacy of the patients. Specifically, there are K centers (hospitals) who each have genotype (SNV), and phenotype (trait) data. For a single genotype-phenotype pair a center k has a 2 × 2 contingency table⁶ of the counts of patients for all 4 possible combinations of genotype, and phenotype (see Table 1). The goal is to perform a privacy-preserving computation that adds together all contingency tables from individual centers, then computes the Pearson's χ² test statistic [Pea00], and finally reveals a boolean value indicating whether the computed statistic is larger than a predetermined significance threshold t. This threshold is chosen based on the p-value, and the correction for multiple hypothesis testing. For example, using significance level 0.01 with Bonferroni correction for 10 million tests results in t = 37.3, and for 100 million tests t = 41.8.
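These threshold values can be sanity-checked with a few lines of code; the snippet below (assuming SciPy is available) merely illustrates how the quoted numbers arise and plays no role in the protocols themselves.

```python
# Illustrative check of the significance thresholds quoted above (requires SciPy).
# For a 2x2 table the chi-squared statistic has 1 degree of freedom; with a
# Bonferroni correction the per-test significance level is alpha / m.
from scipy.stats import chi2

alpha = 0.01
for m in (10**7, 10**8):           # number of tests
    t = chi2.isf(alpha / m, df=1)  # inverse survival function of chi^2 with 1 df
    print(f"{m:.0e} tests -> threshold t = {t:.1f}")
# Prints approximately: 1e+07 tests -> threshold t = 37.3
#                       1e+08 tests -> threshold t = 41.8
```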

We propose two different methods for carrying out the χ² test without disclosing the input, and intermediate values. The first method performs all computations on homomorphically encrypted data, while the second applies techniques of secure multiparty computation to achieve the same goal. Both methods follow the same general outline, presented below. The first step is to encrypt (or secret share) all the input tables from the centers, and securely compute the aggregate contingency table
\[
O_{ij} = \sum_{k=1}^{K} O^{(k)}_{ij}, \qquad (1)
\]
where $O^{(k)}_{ij}$ is the data from the k-th center. This step is straightforward in both methods.

Next, to determine the significance of the relation between a mutation, and a phenotype, we calculate the Pearson's χ² test statistic [Pea00] on the aggregated contingency table O, and check whether this value is above the threshold t. The Pearson's χ² statistic is given by the following formula:
\[
\chi^2 = \sum_{i,j \in \{1,2\}} \frac{(O_{ij} - \mathrm{model}_{ij})^2}{\mathrm{model}_{ij}}, \qquad (2)
\]
where $\mathrm{model}_{ij} = (RT_i \cdot CT_j)/N$ with $RT_i = O_{i,1} + O_{i,2}$ being the row total, $CT_j = O_{1,j} + O_{2,j}$ being the column total, and N the total number of patients.

             phenotype                    ¬phenotype
genotype     O_{1,1}                      O_{1,2}                      RT_1 = O_{1,1} + O_{1,2}
¬genotype    O_{2,1}                      O_{2,2}                      RT_2 = O_{2,1} + O_{2,2}
             CT_1 = O_{1,1} + O_{2,1}     CT_2 = O_{1,2} + O_{2,2}     N = CT_1 + CT_2 = RT_1 + RT_2

Table 1: Representation of a contingency table containing the number of observed genotypes i per phenotype j, denoted O_{i,j}. In the table we also calculate the Row Totals (RT_i), Column Totals (CT_j), as well as the grand total (N).

⁶ Our method can also be extended to contingency tables of larger size.


Since division is a costly operation in both the homomorphic domain, and the secret shared domain, we rewrite the formula of the χ² statistic as follows:
\[
\chi^2 = \frac{RT_1 \cdot CT_1 \cdot (N \cdot O_{2,2} - RT_2 \cdot CT_2)^2 + RT_1 \cdot CT_2 \cdot (N \cdot O_{2,1} - RT_2 \cdot CT_1)^2 + RT_2 \cdot CT_1 \cdot (N \cdot O_{1,2} - RT_1 \cdot CT_2)^2 + RT_2 \cdot CT_2 \cdot (N \cdot O_{1,1} - RT_1 \cdot CT_1)^2}{N \cdot RT_1 \cdot RT_2 \cdot CT_1 \cdot CT_2}. \qquad (3)
\]

As a final step, we need to compare whether χ² ≥ t. To do that, we calculate the numerator, and denominator of the fraction in Equation (3) separately. Subsequently, we multiply the denominator of the fraction with the threshold value t, and finally check inequality (4), without revealing any of the private inputs in the contingency tables.
\[
\begin{aligned}
&RT_1 \cdot CT_1 \cdot (N \cdot O_{2,2} - RT_2 \cdot CT_2)^2 + RT_1 \cdot CT_2 \cdot (N \cdot O_{2,1} - RT_2 \cdot CT_1)^2 \\
&\quad + RT_2 \cdot CT_1 \cdot (N \cdot O_{1,2} - RT_1 \cdot CT_2)^2 + RT_2 \cdot CT_2 \cdot (N \cdot O_{1,1} - RT_1 \cdot CT_1)^2 \\
&\quad \stackrel{?}{\geq} t \cdot (N \cdot RT_1 \cdot RT_2 \cdot CT_1 \cdot CT_2). \qquad (4)
\end{aligned}
\]
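For reference, inequality (4) uses only additions, multiplications, and one comparison; the sketch below evaluates it in the clear on an aggregated table, with the threshold supplied as a fraction so that everything stays in integer arithmetic (the function name and signature are illustrative, not part of our implementation).

```python
# Plaintext reference for inequality (4): integer-only significance check.
# O is the aggregated 2x2 contingency table [[O11, O12], [O21, O22]];
# t_num / t_den is the threshold t written as a fraction (e.g. 373 / 10 for 37.3).
def chi2_significant(O, t_num, t_den):
    RT1, RT2 = O[0][0] + O[0][1], O[1][0] + O[1][1]   # row totals
    CT1, CT2 = O[0][0] + O[1][0], O[0][1] + O[1][1]   # column totals
    N = RT1 + RT2                                     # grand total
    lhs = (RT1 * CT1 * (N * O[1][1] - RT2 * CT2) ** 2
         + RT1 * CT2 * (N * O[1][0] - RT2 * CT1) ** 2
         + RT2 * CT1 * (N * O[0][1] - RT1 * CT2) ** 2
         + RT2 * CT2 * (N * O[0][0] - RT1 * CT1) ** 2)
    rhs = N * RT1 * RT2 * CT1 * CT2
    # lhs / rhs >= t_num / t_den  <=>  t_den * lhs >= t_num * rhs
    return t_den * lhs >= t_num * rhs
```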

This computation is repeated for every phenotype-genotype pair, and the results are aggregated in a public table indicating whether a mutation is significant for a particular phenotype, or not. Since the price of DNA sequencing has decreased considerably, we assume new data will keep becoming available. Taking this new data into account for the computation of our public table requires running our protocols anew, and it will change the table results. Therefore, we propose to make the table dynamic. There will be a fixed time interval, which allows the centers to gather more data and include this data in their contingency tables. The new table values will then be encrypted/secret shared and the computation of the fresh public table will be executed, after which the new results will be published.

3.1 Efficient Masking-Based Comparison

To the best of our knowledge, the state-of-the-art techniques to perform secure comparisons, both in the homomorphic, and in the secret shared domain, require bitwise operations on the secret inputs, which have a high total cost. To allow for a practically efficient implementation of our solution, we consider a masking technique to perform the comparison instead of the bit-decomposition of our inputs. By masking the values we need to compare, we can later securely reveal the masked result upon decryption, since the mask will hide the original secret value. Hence, masking allows us to perform the comparison without revealing the values we want to keep secret. Comparing two values x and y can be done by comparing their difference with zero. Our mask for the value x − y consists of multiplying this value with a positive random number. We require the multiplier to be positive to preserve the original relation of our difference x − y with zero.


The second step is adding another random number (different from the previous one) to the already multiplied result. We require this random number to be smaller than the first one, again to preserve the original relation with zero. Let us denote the masked difference by $\overline{x-y}$; then for two positive random numbers r and r′, with r′ in the range [1, r), our proposed masking is given by $\overline{x-y} = r \cdot (x - y) + r'$. In our setup we are working with homomorphic (or secret shared) values, so this masking has to be performed on encrypted (or secret shared) values. For an integer x, we denote by [[x]] either the homomorphic encryption of x or its secret shared value. Masking in the homomorphic or secret shared domain will then be computed as $[[\overline{x-y}]] = [[r]] \cdot [[x - y]] + [[r']]$, with r and r′ random numbers satisfying the following condition: r is selected to be a positive integer (bounded properly so as to fit the largest possible input sizes our framework can handle), and then r′ is randomly selected in the range [1, r) (i.e., such that r′ < r). Afterwards the masked value is revealed by respectively decrypting or opening the calculated value. Depending on the sign of $\overline{x-y}$ we can deduce the relationship between x and y (i.e., if $\overline{x-y} > 0$ then x > y, otherwise x < y).
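The following sketch shows the masking idea in the clear (the helper name and the bound parameter are illustrative); in the actual protocols the same arithmetic is performed on encrypted or secret shared values, and only the masked difference is ever opened.

```python
# Plaintext sketch of the masking-based comparison of Section 3.1.
import secrets

def masked_compare(x, y, r_bound):
    # r is a positive random multiplier, r' a smaller additive mask in [1, r)
    r = secrets.randbelow(r_bound - 1) + 2      # r in [2, r_bound]
    r_prime = secrets.randbelow(r - 1) + 1      # r' in [1, r)
    masked = r * (x - y) + r_prime              # the value that would be opened
    return masked > 0                           # same sign as x - y when x != y

# The opened masked value only reveals whether x exceeds y.
assert masked_compare(41, 37, 1 << 20)
assert not masked_compare(30, 37, 1 << 20)
```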

Given properly selected r, and r′, the correctness of this masking-based comparison is straightforward. To ensure preservation of the security and correctness of the masking, we require one of the medical centers to properly select r, and r′ within suitable bounds. Note that this requirement does not increase the level of trust we need to put in the medical centers (nor does it reduce the security of the system). We already trust the medical centers to provide our privacy-preserving system with their correct inputs. Upon selection of the values r, and r′, the medical center in question homomorphically encrypts these values, or secret shares them to the computation servers, along with its own contributed inputs.

The proposed type of masking, which allows us to perform the comparison, could leak information about the secret input to the inequality, which in our case is the difference x − y, when the system is queried multiple times with the same input. However, in our scenario, the proposed system cannot be queried at will. We suggest the calculation of a table listing all possible phenotypes, and all possible mutation positions, which will become public. This table will be computed on updated input data after a fixed time interval. While constructing the table we select r and r′ at random for each contingency table, thus the random values r and r′ will only be used once with certain input values. After the fixed time interval, we expect the input values to have changed, so we repeat the whole setup and select new random values for each contingency table. Hence, by recomputing the table at fixed times and not allowing users to query, we ensure that no information is leaked by our system.

Let us for completeness briefly discuss the leakage that occurs if a party observing the masked result of the inequality check can submit multiple queries on the same inputs and obtain the masked values for these queries. If this were possible, an adversary would be able to approximate the value of x − y from the obtained list of masked values. The maximum of this observed list will be close to the bound set for the randomness r times the difference x − y. Hence, by deduction, if we divide the maximum observed masked value by the upper bound on r, we will get a good approximation of the value x − y.

In the event of a malicious party being able to observe the intermediate values revealed by our approach (i.e., the value of the masked difference), and given that this malicious party can trigger multiple computations of the same table entry, one can prevent the aforementioned leakage by selecting the random values r and r′ once per table entry, and keeping them thereafter fixed, until the actual inputs to the protocol (contributed by each medical center) change.

4 Homomorphic Encryption Approach

4.1 Setup and security assumptions

To solve the problem described in Section 3 with homomorphic encryption, we need multiple parties, as indicated in Figure 1. The steps of the process depicted in Figure 1 are as follows. In the first step, the decryptor will select the secret key, and associated public key for the homomorphic encryption, and make the public key available to all medical centers. Then, all the medical centers will encrypt their contingency tables with the given public key, and send these encryptions to the computation server. Upon receiving all contingency tables, the computation server will first add them to construct the aggregated contingency table, and subsequently perform the operations of the Pearson χ² test. Then, the computation server will send the result, which is masked with the technique described in Section 3.1, to the decryptor, who uses the secret key to decrypt the masked value, and performs the comparison.

It is important to note that in this model we trust the decryptor to decrypt the masked values, and post the corresponding correct yes/no value into the public table. Since the decryptor only decrypts masked values, the decryptor can only deduce the yes/no answer, which will become public anyway. No other information about the χ² value is revealed to the decryptor. If the system were adapted to reveal the actual χ² value, the party receiving the encrypted result would first have to authenticate itself to make sure that it is a trusted entity (like a medical doctor, for example). If this authentication is considered insufficient by the medical centers contributing their data, they could still prevent the authenticated party from being a single point of trust by introducing a multiparty computation to perform the decryption based on a secret shared decryption key. The solution based on homomorphic encryption does rely on the following two security assumptions:

– The computation server is honest but curious: It will follow the stated protocol to provide the desired functionality, and will not deviate, nor fail to return the results. The computation server can however monitor the result of every operation it performs. This assumption is reasonable for an economically motivated cloud service provider. The cloud is motivated to provide excellent service, yet it would take advantage of extra available information.


Fig. 1: A schematic representation of the homomorphic scenario (medical centers, computation server, decryptor, and the resulting public table). Before the execution of the protocol, the decryptor generates a valid public and secret key pair for the homomorphic encryption scheme. Step 1 of the protocol is to send the generated public key pk to all participating medical centers. Then, the medical centers compute their local contingency tables, encrypt them with the received public key, and send them to the computation server in step 2. Step 3 is the actual secure computation of the encrypted, and masked χ² value, which is then sent to the decryptor in step 4. By decrypting the masked χ² value (using the secret key sk), the decryptor can only determine whether the result is significant or not, which is published in a public table in step 5.


– We only need the decryptor to perform the comparison. He should not be allowed to see the input values, since he has the key to decrypt them. Therefore we presume that the communication between the centers, and the computation server is hidden from the decryptor. This can be achieved by performing the communication over authenticated, secure channels. An alternative way to solve this is by introducing multiparty computation for the decryptor. Each party only has a part of the decryption key, and hence will never be able to decrypt the values of the encrypted contingency table.

For the homomorphic evaluation of the χ² statistic we use the FV scheme, introduced by Fan and Vercauteren [FV12]. Moreover, we base our implementation on the FV-NFLlib software library [Cry16], in which the FV homomorphic encryption scheme is implemented using the NFLlib software library developed for performing polynomial arithmetic computations (as described in [AMBG+16], and released in [CIQ16]).

4.2 Preliminaries

The Fan-Vercauteren SHE scheme
The Fan-Vercauteren SHE scheme is a scale-invariant SHE scheme whose hardness is based on the ring learning with errors problem (RLWE) [LPR13]. It operates on polynomials of the ring R = Z[X]/(f(X)) with f(X) = X^d + 1 for d = 2^n.

The FV scheme makes use of a plaintext space R_t, with R the polynomial ring defined above, and t > 1 a small integer modulus. Each coefficient of a plaintext polynomial is computed modulo t. The ciphertext space consists of a pair of elements of R_q, with R the polynomial ring defined above, and q > 1 an integer modulus much larger than the plaintext modulus t. A homomorphic encryption scheme consists of a standard encryption scheme specifically constructed to enable additions, and multiplications in the ciphertext domain. The key generation, and encryption algorithms require elements sampled from two probability distributions defined on R. The secret key of our scheme is sampled from χ_key, and for encryption some error polynomials are sampled from an error distribution χ_err. These probability distributions, in combination with the degree d of the defining polynomial f of R, and the size of the integer q, determine the security of the FV scheme.

Given the parameters d, q, and t, and distributions χ_key, and χ_err, and using bold notation for a vector of two polynomials, we define the encryption, and decryption mechanism of the homomorphic encryption scheme introduced by Fan and Vercauteren:

– Encrypt(pk, m): By multiplying the message m ∈ R_t with Δ = ⌊q/t⌋ we transfer the message m to the ring R_q. To hide the message we sample the error polynomials e_1, e_2 from χ_err, and u from χ_key, and compute the polynomials c_0 = Δ · m + b·u + e_1, and c_1 = a·u + e_2 (with pk = (b, a)). Both the polynomials c_0, and c_1 belong to R_q, and together they form the ciphertext c = (c_0, c_1) of the FV scheme.


– Decrypt(sk, c): First compute m̃ = [c_0 + s · c_1]_q; then by scaling down the coefficients of m̃ by Δ, and rounding the results, we recover the message m.

Given a ciphertext c = (c_0, c_1) we can write the m̃ of the decryption algorithm as [c_0 + c_1 · s]_q = Δ · m + e, with e the noise in the ciphertext. From this equation one can clearly see that if the noise e grows too large, the decryption algorithm will fail to output the original message m correctly. The property of ensuring that decryption results in the original message is called the correctness of the encryption scheme.

Every homomorphic operation will cause the noise in the ciphertexts to increase. Knowing the computations we want to perform in advance enables us to optimize the order of the computations for the sake of minimizing the noise growth. In addition, it enables us to make an estimation of the noise present in the result, and hence allows us to determine parameters that can deal with this noise.
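To make the encryption and decryption equations above concrete, the following is a toy sketch with tiny, insecure parameters; it follows the textbook description of FV (no relinearization, no optimizations) and is not the FV-NFLlib API.

```python
# Toy sketch of the FV encryption/decryption equations above, with tiny,
# insecure parameters and no relinearization; this is NOT the FV-NFLlib API.
import random

d, q, t = 16, 1 << 40, 257        # ring degree, ciphertext modulus, plaintext modulus
delta = q // t                    # Delta = floor(q / t)

def small_poly():
    """Stand-in for the chi_key / chi_err distributions: tiny coefficients."""
    return [random.randint(-1, 1) for _ in range(d)]

def poly_mul(a, b):
    """Multiplication in Z_q[X]/(X^d + 1): negacyclic convolution."""
    res = [0] * d
    for i in range(d):
        for j in range(d):
            k, sign = (i + j) % d, (-1 if i + j >= d else 1)  # X^d = -1
            res[k] = (res[k] + sign * a[i] * b[j]) % q
    return res

def poly_add(a, b):
    return [(x + y) % q for x, y in zip(a, b)]

def keygen():
    s, e = small_poly(), small_poly()
    a = [random.randrange(q) for _ in range(d)]
    b = [(-x) % q for x in poly_add(poly_mul(a, s), e)]       # b = -(a*s + e)
    return s, (b, a)                                          # secret key, public key

def encrypt(pk, m):                                           # m: d coefficients mod t
    b, a = pk
    u, e1, e2 = small_poly(), small_poly(), small_poly()
    c0 = poly_add(poly_add(poly_mul(b, u), e1), [(delta * mi) % q for mi in m])
    c1 = poly_add(poly_mul(a, u), e2)
    return c0, c1

def decrypt(sk, ct):
    c0, c1 = ct
    v = poly_add(c0, poly_mul(c1, sk))                        # [c0 + s*c1]_q
    return [round(vi * t / q) % t for vi in v]                # scale down by Delta and round

sk, pk = keygen()
msg = [5, 0, 1] + [0] * (d - 3)
assert decrypt(sk, encrypt(pk, msg)) == msg
```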

Preprocessing of the data to improve the performance of the homomorphic computations
The plaintext space of the FV scheme consists of the polynomial ring R_t. Hence the first thing that needs to be done is encode the given integer data values into polynomials in the ring R_t. We achieve this by applying the w-NIBNAF encoding introduced in [BBB+17]. The main idea of w-NIBNAF encoding goes back to Dowlin et al. [DGBL+15] (see also [DGL+16, LLAN15b, NLV11]), and was analyzed in more detail by Costache et al. [CSVW16]. It consists of expanding the given number θ with respect to a base b_w, replacing this base with the symbol X, and then mapping this polynomial into the plaintext space R_t. So the encoding of a number θ is given by:
\[
\theta = a_r X^r + a_{r-1} X^{r-1} + \cdots + a_1 X + a_0 - a_{-1} X^{d-1} - a_{-2} X^{d-2} - \cdots - a_{-s} X^{d-s}.
\]

For our encoding we use a well-chosen non-integral base such that the encoding of an input value results in a polynomial with coefficients in the set {−1, 0, 1}, with the property that each set of w consecutive coefficients has no more than one non-zero coefficient. Hence encoding our inputs with w-NIBNAF leads to sparse polynomials.

Starting our computations with sparse polynomials with coefficients in the set {−1, 0, 1} leads to small coefficients in the resulting polynomial. Therefore it enables us to work with a smaller plaintext modulus t, which improves the performance of the homomorphic encryption scheme. Bigger values for w lead to longer, and sparser encodings, which reduce the minimum size of t even further. However, we have to ensure that when all the computations are done, decoding will still provide the correct answer. Therefore, we need to carefully select our base b_w, which also determines the value of w.
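As a simplified illustration of the encoding principle (small balanced digits, decoding by evaluating the polynomial at the base), the sketch below uses plain balanced ternary with the integral base 3 instead of the non-integral base b_w and the sparsity window of the actual w-NIBNAF encoding.

```python
# Simplified illustration of the encoding idea: balanced ternary (base 3,
# digits in {-1, 0, 1}); the real w-NIBNAF encoding uses a non-integral base
# b_w and additionally enforces a sparsity window of width w.
def encode(n):
    """Return digits d with n = sum(d[i] * 3**i) and d[i] in {-1, 0, 1}."""
    digits = []
    while n != 0:
        r = n % 3
        if r == 2:              # use digit -1 and carry
            r = -1
            n += 1
        digits.append(r)
        n //= 3
    return digits or [0]

def decode(digits, base=3):
    return sum(d * base**i for i, d in enumerate(digits))

def poly_mul(a, b):
    """Schoolbook polynomial multiplication: what happens to the encodings
    when ciphertexts are multiplied homomorphically (coefficients grow,
    which is what forces the lower bound on t)."""
    res = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            res[i + j] += x * y
    return res

x, y = 123, 45
# Multiplying the encodings as polynomials and decoding recovers the product.
assert decode(poly_mul(encode(x), encode(y))) == x * y
```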

Selecting the optimal parameters
The three main concepts that will affect the selection of our parameters are:


– the security of the somewhat homomorphic FV scheme
– the correctness of the somewhat homomorphic FV scheme
– the correctness of the w-NIBNAF encoding

To take the security restrictions into account, we rely on the work by Albrecht, Player, and Scott [APS15], and the open source LWE hardness estimator implemented by Albrecht [Alb04]. The latter estimates the hardness of the LWE problem based on three given parameters: the dimension d, the ciphertext modulus q, and a parameter α, which is related to the error distribution χ_err. It takes into account the currently known attacks on the learning with errors problem. The parameters that satisfy the restrictions implied by a security level of 90 bits are q = 2^186, d = 4096, and σ = 2^65.⁷ From the description of the FV scheme in Section 4.2, we know that for a ciphertext to be decrypted correctly the error cannot grow too much. Therefore we make an estimation of the infinity norm of the error in the ciphertext resulting from computing our circuit, and select parameters that keep the error small enough. This results in an upper bound for t. Since this upper bound is constructed specifically for our circuit, it will depend, amongst other things, on the number of centers that deliver data to the computation server.

For correctness of the w-NIBNAF decoding, it is important to estimate the size of the coefficients of the encodings after performing the necessary operations. This leads to a lower bound on the plaintext modulus t. The security, and correctness of the FV scheme on the one hand, and the correctness of the w-NIBNAF encoding on the other hand, set conflicting requirements for t. In order to solve this conflict, we make use of the Chinese Remainder Theorem to decompose the plaintext space, following the idea of [BLLN13, §5.5]. This implies that we choose t to be the product of small prime numbers $t_1, t_2, \ldots, t_n$, with $t_i \le t_{\max}$ for all $i \in \{1, \ldots, n\}$, and $t = \prod_{i=1}^{n} t_i \ge t_{\min}$, where $t_{\max}$ is determined by the security, and correctness of the FV scheme, and $t_{\min}$ by the correctness of the w-NIBNAF decoding.
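The sketch below illustrates this CRT decomposition on plaintext values: a value modulo t is split into one residue per factor t_i before encryption, and recombined after decryption (the factors are taken from Table 2 purely as an example).

```python
# Sketch of the CRT plaintext decomposition: a value modulo t = t1 * t2 * ...
# is split into one residue per factor, and recombined after decryption.
from math import prod

def crt_split(x, moduli):
    return [x % m for m in moduli]

def crt_combine(residues, moduli):
    t = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        n = t // m
        x += r * n * pow(n, -1, m)    # pow(n, -1, m): inverse of n modulo m
    return x % t

moduli = [17431, 17443, 17449]        # example factors from Table 2
value = 123456789
assert crt_combine(crt_split(value, moduli), moduli) == value
```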

4.3 Privacy-Preserving Homomorphic Chi-Squared Thresholding Algorithm and Parameters

In this section, we combine all information given before in order to construct the algorithm needed to perform the privacy-preserving χ² thresholding in the homomorphic setting. Algorithm 1 lists the computations for the algorithm corresponding to the order in which they need to be performed, and mentions the parties that need to perform them. Hence the steps from Figure 1 are described in more detail in Algorithm 1, and the combination of these two gives a clear picture of our homomorphic solution for the privacy-preserving χ² thresholding.

For the homomorphic solution we consider two scenarios: (1) the case in which we compute the numerator and denominator of the χ² statistic separately, and decrypt them both; and (2) the case where we use the masking technique, and decode the masked value to determine the "True/False" answer to the significance question.

7 σ determines the error distribution χerr.


Algorithm 1: Privacy-preserving homomorphic chi-squared test

Medical center:
Input: O[2][2]: the observed values of the 2 × 2 contingency table (mutation vs. disease)
for each medical center C_k do
  1. Encode each value O^k_{i,j} for i = 1, 2; j = 1, 2 using the w-NIBNAF encoding technique.
  2. Transform this encoding to the plaintext rings R_{t_i} for all i such that t = ∏_i t_i.
  3. Encrypt the plaintext polynomials using the Fan-Vercauteren SHE scheme to obtain ciphertexts c^k_{i,j}.
end for

Computation server:
Input: encrypted contingency table values c^k_{i,j}, for k = 1, ..., N_c, with N_c the number of centers contributing data
  4. Compute the four values of the aggregated contingency table c_{i,j} = ∑_{k=1}^{N_c} c^k_{i,j}.
  5. Compute the χ² numerator α and denominator β homomorphically.
  6. Compute the difference of the χ² numerator and the product of the encrypted χ² threshold T and the denominator, and mask the computed difference with the random values r and r′: MR = r · (α − T · β) + r′.

Decryptor:
Input: MR, the encrypted masked output value of the homomorphic circuit
  7. Decrypt the masked result MR.
  8. Use the inverse CRT ring isomorphism to transfer the plaintext polynomials to the ring R_t.
  9. Decode the w-NIBNAF polynomial, and evaluate the result in the correct base b_w to get the value MaskedDifference of the masked result.
  10. Determine the significance based on the sign of MaskedDifference:
      if MaskedDifference > 0 then return 1 else return 0.


As mentioned before, there are many parameters we need to set, and their values depend highly on the homomorphic circuit we perform and on the values of other parameters. We selected parameters for different scenarios in which we fix the number of patients per center to 10000 and vary the number of centers from 20 to 100. The parameters are listed in Tables 2 and 3. We use these parameters in our implementation later to assess the performance by measuring the computation time, and communication cost for each of the three parties mentioned in Algorithm 1.

The value of w and the splitting degree are dependent on the size of the numbers we need to encode, but independent of the number of centers included in the calculation. Therefore the value of w and the splitting degree will be the same for all scenarios. Without comparison, they are respectively w = 271, and splitting degree = 3166. The parameter that differs for different numbers of centers in scenario (1) is the value of t, and its CRT factors. These are listed in Table 2. For scenario (2), in which we perform the masked comparison, the value of w will be different for each number of centers participating in the computations, because we have to encode the random values r and r′, whose sizes depend on the number of centers included in the computations. The parameters for the second scenario are listed in Table 3.

Table 2: Parameter selection for scenario (1)

Centers  Patients   t
20       200000     17431 · 17443 · 17449
40       400000     1718713 · 1718719 · 1718723
60       600000     1558103 · 1558129 · 1558177
80       800000     1453043 · 1453057 · 1453061
100      1000000    1376213 · 1376231 · 1376237

Table 3: Parameter selection for scenario (2)

Centers  Patients   Bit-length random value  w    Splitting degree  t
20       200000     100                      118  3625              12253 · 12263 · 12269
40       400000     106                      113  3629              885127 · 885133 · 885161
60       600000     110                      111  3644              802651 · 802661 · 802667
80       800000     112                      110  3653              747811 · 747827 · 747829
100      1000000    114                      108  3651              707801 · 70813 · 707827


4.4 Implementation and performance analysis

In order to assess the practical performance, and verify the correctness of the selected parameters of the homomorphic scenario, we implemented the privacy-preserving χ² computation using the FV-NFLlib software library [Cry16]. Our presented timings are obtained by running the implementation on a computer equipped with an Intel Core i5-4590 CPU, running at 3.30 GHz. We executed the program 10 times per case, and calculated the average execution time for our timing results. To evaluate the scalability of our protocol we have considered the cases where our system receives data from 20, 40, 60, 80, and 100 medical centers, respectively. We assume each medical center contributes data of 10000 subjects (i.e., the total number of subjects per case is 200000, 400000, 600000, 800000, and 1000000, respectively). In order to measure these timings we used the parameter set corresponding to the respective scenario, so Table 2 for scenario (1) and Table 3 for scenario (2).

For scenario (1) we compute the numerator and denominator separately, and do not perform the masked comparison with the threshold value. The CPU time needed for the hospitals to encrypt the four values of the contingency table is the same for any number of centers considered in the experiment. For our selected parameters this is 15.2 ms. Also the time to decrypt the numerator and denominator of the χ² value is the same for any number of centers contributing to the experiment. The average time we measured during our experiments is 38.8 ms. The timings of the computation server in scenario (1) are the times needed to perform the calculations for computing the numerator and denominator of the χ² statistic. These timings are dependent on the number of centers that participate in the computation. Therefore we list them in Table 4. We see that the timings for an increasing number of centers do not differ significantly. This is consistent with the fact that homomorphic additions are not the most time-consuming part of our computations.

For scenario (2) the encryption time does not depend on the number of centers either, since the centers can perform the encryption in parallel. The measured encryption time for one contingency table in scenario (2) is 17.1 ms. The time to decrypt the result does not depend on the number of centers participating in the computation either. However, it is smaller than for scenario (1), since now we only have to decrypt one value instead of two. The measured decryption time for scenario (2) is 21.1 ms. The timings for the computation server are listed in Table 4, since these timings are dependent on the number of medical centers participating. Here we see the same trend as for scenario (1): the timings do not increase linearly in the number of medical centers.

From Table 4 one sees that the timings for the computation server do not differ much between the two scenarios. This is because addition and multiplication with a constant (which are the extra operations we need to compute the masked value) are not the most time-consuming homomorphic operations. Hence our masked comparison gives an efficient solution for keeping the χ² value private. We can also conclude that, considering CPU time, our solution scales very well for an increasing number of medical centers participating in the computation.


Table 4: CPU time of the computation server for the homomorphic solution using 1 CPU core.

Centers  Patients   Scenario (1)  Scenario (2)
20       200000     1.40 s        1.48 s
40       400000     1.48 s        1.52 s
60       600000     1.44 s        1.53 s
80       800000     1.47 s        1.56 s
100      1000000    1.49 s        1.56 s

For the homomorphic setup, there is no communication cost during the computations. The communication cost comes from sending values from each of the three parties to the next. We have three points of communication: the public key has to be sent from the decryptor to the medical centers; the encrypted values of the contingency tables have to be sent from the medical centers to the computation server; and the result has to be sent from the computation server to the decryptor. The communication cost is similar for both scenarios, since in both scenarios the size of one ciphertext will be the same, and we only need to send ciphertexts from one party to another. The size of the public key that needs to be sent to the different medical centers is 186 kB. The data needed to send one contingency table to the computation server is 2.1 MB. The communication cost between the medical centers, and the computation server is the number of centers participating times the amount of data needed to send one contingency table. So this communication cost increases linearly in the number of centers contributing to the computation. In scenario (1) we send both the numerator and the denominator from the computation server to the decryptor, which results in a communication cost of 1.8 MB. In scenario (2) we only have to send one value, which gives a communication cost of 0.54 MB.

5 Secure Multiparty Computation Approach

To address the challenge of disease gene identification using secure multiparty computation techniques, in the setting described in Section 3, we deploy MASCOT [KOS16]. We selected MASCOT [KOS16] as the most suitable multiparty computation solution, because it is currently the most efficient proposal offering malicious static security with a dishonest majority. This means that any number of the computing parties may deviate from the protocol execution, and this will be detected without leaking information, other than what the correct protocol execution would reveal. Corruption may only occur prior to the beginning of the protocol execution, affecting up to n − 1 (out of the n) computing parties.

5.1 Setup and Security Assumptions

For our multiparty computation approach, we first need to determine the number of computation servers n (n ≥ 2) that we have at our disposal. Given that the underlying protocol offers security against any coalition of n − 1 computation servers, we consider the security of the whole system to increase as the number of computation servers increases. However, the number of computation servers is inversely proportional to the efficiency of the solution. Therefore, we consider three computation servers to be an adequate number of servers, both from an efficiency/plausibility perspective, and from a security perspective. If any two of the three computation servers that we assume get compromised, or otherwise behave dishonestly, or even collude, the solution still guarantees input privacy, and does not accept incorrect results.

We assume a preprocessing phase that can take place offline, at any moment prior to the actual protocol execution. This is to create the necessary randomness for the medical centers to contribute their inputs in a secret shared manner to the computation servers. In addition, the preprocessing phase creates authenticated randomness to be used in the online phase, so as to boost the efficiency of computing multiplications on the shares, which requires interaction amongst the servers.

The medical centers that wish to contribute their private inputs first need to agree on a common format for this data (e.g., what is the order of sending the contingency tables). Then, they need to secret share their contingency tables to the three computation servers, which can also be pushed to an offline, preprocessing phase. Given that all contributing medical centers have shared their private contingency tables to the computation servers, the online phase starts. During the online phase the servers perform both local, and interactive secure computations, and they finally reveal per contingency table whether the relationship between a mutation at a certain DNA position, and a disease is significant or not, without disclosing further information on the underlying data. A schematic representation of this approach is presented in Figure 2.

5.2 Preliminaries

Additive Secret Sharing
A secret sharing scheme is a protocol which allows (some of) the protocol participants to share their secret inputs amongst all other protocol participants, in such a way that nothing is revealed to the individual participants about the secret input. In some instances of secret sharing schemes, a subset of the protocol participants, called the qualified set, can reconstruct the original secret input when they engage in the reconstruction protocol. Other schemes, such as additive secret sharing, require all protocol participants to contribute their shares for the reconstruction protocol to work.

Additive secret sharing is essentially masking the input x of a protocol participant by subtracting a random value r from it. Given that the value of r is only known to the inputting participant, and that the rest of the participants hold shares [[r]] of this value (which can be done in a preprocessing stage), secure shares of x can be created as follows: The inputting party computes ε = x − r, and broadcasts it to the rest of the participants. All other parties can now compute their own shares of the input x as [[x]] = [[r]] + ε. It is easy to see that this scheme enjoys an additively homomorphic property, allowing additions of shares,


Fig. 2: A schematic representation of the multiparty computation scenario (medical centers, three computation servers connected by authenticated communication channels, and the resulting public table). Any time prior to the protocol execution, each of the medical centers computes their local contingency tables, and secret shares them to the three computation servers, as indicated in step 1 of the protocol. In step 2, the computation servers securely compute the χ² value, and perform a secure comparison to determine whether the value is significant or not. This reveals no information about the inputs, or the actual χ² value, to the individual computation servers. In step 3 the computation servers reconstruct the final result, which indicates significance or non-significance, by combining their individual secret shares, and they publish this result in the public table.


and linear functions to be directly computed locally on the shares. Hence, no communication amongst the protocol participants is required to perform additions on these additively secret shared values.
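A minimal sketch of these operations over Z_p is given below (sharing, reconstruction, and local addition); it omits the MACs that MASCOT adds for active security, and the modulus is an arbitrary example.

```python
# Sketch of additive secret sharing over Z_P: sharing, reconstruction and
# local addition (no MACs here; MASCOT adds those for active security).
import secrets

P = 2**127 - 1          # example prime field modulus

def share(x, n):
    """Split x into n additive shares that sum to x modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

def add_shares(a, b):
    """Each party adds its shares locally; no communication needed."""
    return [(ai + bi) % P for ai, bi in zip(a, b)]

x, y = 1234, 5678
assert reconstruct(add_shares(share(x, 3), share(y, 3))) == (x + y) % P
```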

Oblivious Transfer
Oblivious Transfer (OT) is a cryptographic primitive which allows a sender to transfer only one (or none) out of many values to a receiver, while remaining oblivious as to which value has been received (if any). Oblivious transfer was introduced by Rabin [Rab81] in 1981. Basic OT constructions make use of public key encryption primitives to allow the aforementioned functionality. More precisely, a 1-out-of-2 OT requires the sender to be in possession of a public/private key pair, to generate and send two random messages to the receiver, to decrypt two messages, and to send two messages to the receiver. The receiver obtains the public key of the sender, along with the two random messages that the latter has selected, performs one encryption, and one blinding, sends the resulting message to the sender, and finally inverts the blinding to retrieve the desired message. For more information on basic OT, we refer the reader to the work of Even et al. [EGL85].
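The sketch below follows this Even-Goldreich-Lempel style flow with textbook RSA and deliberately tiny, insecure parameters; all function names are illustrative and the code is meant only to make the message flow concrete.

```python
# Toy sketch of a 1-out-of-2 OT in the style described above (textbook RSA,
# tiny insecure parameters; illustration only, not a secure implementation).
import secrets

# Sender's RSA key pair (toy Mersenne primes; real OT needs proper key sizes).
p, q, e = 524287, 2147483647, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def ot_send_setup():
    """Sender picks two random values, one per message slot."""
    return secrets.randbelow(n), secrets.randbelow(n)

def ot_receive_choose(x0, x1, b):
    """Receiver blinds a random k into the slot of its choice bit b."""
    k = secrets.randbelow(n)
    v = ((x0, x1)[b] + pow(k, e, n)) % n
    return k, v

def ot_send_messages(m0, m1, x0, x1, v):
    """Sender 'unblinds' both slots and masks each message with the result."""
    k0 = pow((v - x0) % n, d, n)
    k1 = pow((v - x1) % n, d, n)
    return (m0 + k0) % n, (m1 + k1) % n

def ot_receive_output(c0, c1, b, k):
    return ((c0, c1)[b] - k) % n

m0, m1, b = 42, 99, 1
x0, x1 = ot_send_setup()
k, v = ot_receive_choose(x0, x1, b)
c0, c1 = ot_send_messages(m0, m1, x0, x1, v)
assert ot_receive_output(c0, c1, b, k) == (m0, m1)[b]
```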

Many works have considered extending OT so as to make it practically efficient, most notably the work of Ishai et al. [IKNP03]. Under the assumption that a random oracle H, which can be instantiated by a hash function family, exists, Ishai et al. [IKNP03] show how to perform only a few OTs from scratch, and then be able to perform many additional OTs at the cost of a constant number of invocations of the random oracle. The actively secure version of their protocol comes with an increase in the cost of the protocol by a factor σ, for σ a statistical security parameter. More recently, Keller et al. [KOS15] presented an actively secure OT extension protocol, where malicious security comes at negligible extra cost.

Correlated Oblivious Product Evaluation - COPE
The Correlated Oblivious Product Evaluation (COPE) protocol presented by Keller et al. [KOS16] is essentially a generalization of Ishai et al.'s protocol [IKNP03] to the arithmetic case (instead of the original binary). The protocol is executed between two parties, and allows them to obtain an additive sharing of the product x · ∆, where the sender holds x ∈ F, and the receiver holds ∆ ∈ F. COPE is based on Gilboa's oblivious product evaluation [Gil99], where the parties run k sets of 1-out-of-2 OTs, on k-bit inputs. The proposed product evaluation is correlated in the sense that one party's input ∆ is fixed at the beginning of the protocol for many protocol runs. After a one-time expensive initialization of the COPE protocol, the extension step, generating fresh OTs without the extensive public-key crypto costs, can be repeated several times on new inputs x.

MASCOT Online Phase
The online phase of MASCOT [KOS16] is essentially the same as that of the SPDZ protocol [DPSZ12, DKL+13]. This family of protocols uses additive secret sharing, allowing additions, and linear functions to be computed locally by the protocol participants, without requiring communication. To achieve active security, information-theoretic MACs are being used, which provide authenticity, and integrity of the messages. A secret shared value x, shared amongst n parties, is represented as follows:

\[
[[x]] = (x^{(1)}, \ldots, x^{(n)}, m^{(1)}, \ldots, m^{(n)}, \Delta^{(1)}, \ldots, \Delta^{(n)}), \qquad (5)
\]
where $x^{(i)}$ is the random share, $m^{(i)}$ is the random MAC share, and $\Delta^{(i)}$ is the MAC key share, such that m = x · ∆.

To perform multiplications of secret shared values, multiplication triples à la Beaver [Bea91] from the preprocessing phase are required, with which the parties can compute shares of the required product. The multiplication triples are of the form ([[a]], [[b]], [[c]]), with a, b uniformly random, and c = a · b. More precisely, to compute [[z]] = [[x · y]], on input [[x]], [[y]], and given a multiplication triple ([[a]], [[b]], [[c]]), the parties compute ε = [[x]] − [[a]], and ρ = [[y]] − [[b]], and open these two values. Then, they compute [[z]] = [[c]] + ε · [[b]] + ρ · [[a]] + ε · ρ.
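The triple-based multiplication can be sketched on plain additive shares as follows (MAC shares and checks omitted; the field modulus and helper names are illustrative).

```python
# Sketch of Beaver-triple multiplication on additive shares over Z_P
# (the MAC checks that MASCOT adds for active security are omitted).
import secrets

P = 2**127 - 1

def share(x, n):
    s = [secrets.randbelow(P) for _ in range(n - 1)]
    return s + [(x - sum(s)) % P]

def reconstruct(shares):
    return sum(shares) % P

def sub_shares(u, v):
    return [(ui - vi) % P for ui, vi in zip(u, v)]

def random_triple(n):
    """Preprocessing: shares of (a, b, c) with c = a * b mod P."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a, n), share(b, n), share(a * b % P, n)

def mul_shares(x_sh, y_sh, triple):
    a_sh, b_sh, c_sh = triple
    eps = reconstruct(sub_shares(x_sh, a_sh))   # epsilon = x - a, opened
    rho = reconstruct(sub_shares(y_sh, b_sh))   # rho     = y - b, opened
    z_sh = [(ci + eps * bi + rho * ai) % P for ci, ai, bi in zip(c_sh, a_sh, b_sh)]
    z_sh[0] = (z_sh[0] + eps * rho) % P         # the public term is added once
    return z_sh

x, y, n = 37, 41, 3
assert reconstruct(mul_shares(share(x, n), share(y, n), random_triple(n))) == x * y % P
```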

MASCOT Offline Phase - Preprocessing
The goal of the preprocessing phase is to generate random values for the parties who wish to contribute inputs –so as to allow them to mask their inputs in an authenticated manner– and multiplication triples –so that multiplication of secret shared values can be efficiently implemented in the online phase–.

The preprocessing is based on Oblivious Transfer techniques; more precisely, the authors of MASCOT [KOS16] have generalized the OT extension idea presented by Ishai et al. [IKNP03] to the arithmetic circuit case. For every party that wishes to contribute an input, the COPE protocol is executed, to create an authenticated version of the input, based on the global MAC key ∆. Creation of (authenticated) additive shares is straightforward. Recall that both the shares, and their MACs are linear. Hence, computing linear functions on (authenticated) shared values is also straightforward. To ensure active security, the party that wishes to contribute an input first authenticates a random input x_0 together with the actual inputs m. Then, the party opens a random linear combination of the inputs including x_0, and all other parties check the MAC on this linear combination. This way the party contributing input is committed to these inputs, and the actual inputs are masked by the random input x_0.

For the triple generation, the parties invoke an OT to compute the secret sharing of b ∈ F, and a ∈ F^τ, where τ ≥ 3 is a security related parameter, meaning that they run τ copies of the basic two-party COPE per pair of parties. This ensures that a has enough randomness to produce a triple. To protect against malicious behavior, the parties sample two sets of random coefficients in F^τ, from which they generate two triples with the same b component. Upon authentication of their shares, the parties ensure correctness of one of the triples by sacrificing the other triple.


5.3 Privacy-Preserving Chi-Squared Thresholding Protocol

Our protocol calculates the two sides of inequality (4) using MASCOT [KOS16]. For the inequality check, we apply the masking technique described in Section 3.1. In addition to the masking-based comparison, we have also implemented the protocol using the standard, bit-decomposition based secure comparison, as implemented in SPDZ-2 [Bri16]. We have also made a non-secure implementation that avoids the comparison completely. All these cases are presented in Section 5.4, where we analyze their performance. The online phase of the masking-based version of our protocol is detailed in Algorithm 2.

Algorithm 2: v ← Chi-squaredTest(N_c, N[N_c], [[O[4][N_c]]], t, [[r]], [[r′]])

1:  Input: N_c: number of centers contributing data,
2:  N[N_c]: a table of size N_c containing the total sample size N_i of every center i,
3:  [[O[4][N_c]]]: secret shared observed values of the 2 × 2 contingency table (mutation vs. disease), contributed by each of the N_c centers,
4:  t: χ² threshold value for the significance test,
5:  [[r]], [[r′]]: secret shared random values r and r′
6:  Output: v = 0 or 1; 0 → non-significant relationship between mutation, and disease, 1 → significant relationship between mutation, and disease
7:  for all P_j do
8:      Each party P_j engages in the protocol
9:      for all C_i do
10:         [[O_{k,l}]] ← ∑_{i=1}^{N_c} [[O_{k,l}]]_i,  k = 1, 2; l = 1, 2
11:     end for
12:     N ← ∑_{i=1}^{N_c} N_i
13:     [[RT_i]] ← [[O_{i,1} + O_{i,2}]],  i = 1, 2
14:     [[CT_i]] ← [[O_{1,i} + O_{2,i}]],  i = 1, 2
15:     [[model_{k,l}]] ← [[RT_k · CT_l]],  k = 1, 2; l = 1, 2
16:     [[square]] ← [[(N · O_{1,1} − model_{1,1})²]]
17:     [[U_{i,j}]] ← [[square · model_{i,j}]],  i = 1, 2; j = 1, 2
18:     [[numerator]] ← [[∑_{i=1,j=1}^{2,2} U_{i,j}]]
19:     [[denominator]] ← N · [[model_{1,1} · model_{2,2}]]
20:     [[difference]] ← [[numerator]] − t · [[denominator]]
21:     [[MaskedDifference]] ← [[difference · r + r′]]
22:     MaskedDifference ← Open([[MaskedDifference]])
23:     if MaskedDifference > 0 then
24:         return 1
25:     else
26:         return 0
27:     end if
28: end for

In our setting, we consider the computing parties that actually execute our protocol (i.e., the computation servers) to be different from the parties contributing their inputs (i.e., the medical centers), as shown in Figure 2. For the offline phase, together with the preparation of the triples, and randomness discussed in Section 5.2, we wish to perform the required preprocessing that will allow the medical centers to correctly contribute their inputs, without compromising privacy. To do that we use the protocols proposed by Damgard et al. [DDN+15]. First we use the Output Delivery protocol to reveal a preprocessed random value r only to the inputting party, who can then broadcast his masked input x − r to the computing parties. Based on this value, and the preprocessed randomness r, the servers can locally compute their share of x, as [[x]] = (x − r) + [[r]].

5.4 Implementation and Performance Analysis

We have built a proof-of-concept implementation of our MPC approach using the platform provided by the authors of MASCOT [KOS16] in SPDZ-2 [Bri16]. We ran our experiments for timing the execution of our protocol on a desktop computer equipped with an Intel(R) Core(TM) i5-3570K processor, at 3.40 GHz, with 16.00 GB RAM, and the Ubuntu 17.04 operating system.

We have only considered the online phase of the protocol, as the preprocessing is protocol-independent, and can be executed at any moment, well before the execution of the online phase. We note, however, that the offline phase is also practically efficient, and we refer the reader to MASCOT [KOS16] for more details on the throughput of the offline phase. To give an indication of the cost of the offline phase of the protocol, we estimate the triple generation throughput, based on the experiment results presented by Keller et al. [KOS16]. For three computation servers, equipped with an eight-core i7 3.1 GHz CPU, and 32 GB RAM, in a local network with a 1 Gbit/s link per party, and a field F_p, with p a 128-bit prime, approximately 2200 triples per second can be generated. Every time we recorded timings, before the execution of the online phase, we ran the setup script provided with the SPDZ-2 [Bri16] software. This script simulates the offline phase, and creates all the necessary randomness for the execution of the online phase. The fact that the offline phase is simulated does not affect the performance, or efficiency of the online phase.

Our experiments were conducted on localhost with three computation servers. Hence, we do not take the network latency into account in the timing results we report. We do present the size of the data that each server has to send, as well as the communication rounds, and we consider this information to be sufficient for the reader to calculate the additional communication cost, based on the available network bandwidth.

For our performance analysis we have considered the following three scenarios: (1) the case where we calculate the numerator, and denominator of the χ² statistic in the secret shared domain, and then open these two results; (2) the case where the secure comparison is implemented as described in Section 3; and (3) the case where we perform the secure comparison in the secret shared domain, and then open only the "True/False" answer to this question, as implemented in MASCOT. We selected these three scenarios, as the secure comparison is the most costly operation we need to carry out, and we wish to assess its impact on the performance of our protocol. Thus, with scenario (1) we completely avoid the secure comparison by opening the numerator, and denominator separately; with scenario (2) we perform the secure comparison using our randomization approach; and with scenario (3) we use the most popular method to carry out a secure comparison (see [CDH10]), which is based on bit-decomposition –an inherently inefficient approach–. Note that scenario (1) does not satisfy the security requirements of our application, and is presented only for the sake of performance comparison.

For all our timing results we have executed our protocol 10 times per case, and calculated the average execution time. The communication cost of the protocol is constant. For our experiments we have arranged that all input data is shared by one of the computation servers (namely Server 1), instead of the medical centers that would contribute the data in a real setting. This is reflected in the communication cost of the protocol for Server 1, which has to secret share all input data. Note that, in practice, the secret sharing step can be pushed to an offline preprocessing phase.

In Table 5 we present the execution times of our approach, as well as the data sent by Server 1 (including the sharing of the original inputs), without performing a secure comparison (scenario 1). Server 1 is presented separately, because it has to do some extra tasks, such as sharing the inputs, collecting all the final results, and printing them, which is reflected in its execution times. The other two servers are grouped together, as their execution times are similar. Although we would expect the execution times to grow with the number of medical centers contributing data, this is not the case. This is because the execution times are so small that they can be highly affected by the computing environment. Furthermore, the communication cost of Server 1 also includes the secret sharing of the inputs. This is how all three scenarios have been executed. Recall, however, that the sharing of the inputs can be performed in a preprocessing phase, prior to the actual protocol execution, allowing the online phase to be less communication intensive. The communication cost for the other two servers is constant –1228 bytes–, since they do not share any inputs. The protocol completes its execution in 4 communication rounds, and consumes 9 triples, and 1 square from the preprocessing.

Table 5: Performance with No Secure Comparison

Centers  Patients   Server 1 CPU time  Server 1 data sent  Server i (i ≠ 1) CPU time
20       200000     1.6 ms             4152 bytes          1.3 ms
40       400000     1.6 ms             6712 bytes          1.4 ms
60       600000     1.5 ms             9272 bytes          1.3 ms
80       800000     1.7 ms             11832 bytes         1.4 ms
100      1000000    1.7 ms             14392 bytes         1.4 ms


In Table 6 we provide the execution times of our alternative secure approach, which is based on randomizing the difference of the two numbers to be compared (scenario 2). Note that in the previously presented scenario (1) it is sufficient to work in a prime field of 128 bits, as the numbers we operate with never grow larger than 114 bits. To perform the randomization of the difference of our numbers securely, however, we need to multiply them with a (random) number of roughly the same size. This implies that we need to work in a field of 256 bits to be able to handle the size of our quantities. More precisely, we calculated the largest that our quantities can grow in all cases, and we selected the random numbers to be upper bounded by these sizes. Our numbers can grow up to 100, 106, 110, 112, and 114 bits for 20, 40, 60, 80, and 100 medical centers, respectively. For this scenario, and the case of 20 centers, Server 1 has to send 82 (256-bit) elements (instead of 128-bit elements) to the two other servers. The two additional elements that have to be sent are the random values r, and r′, which are used to mask the difference that facilitates the execution of the secure comparison. The communication cost for Server 1 is analyzed in Table 6, while for the other two servers it is constant and equal to 1632 bytes. The protocol completes its execution in 5 communication rounds, and consumes 10 triples, and 1 square from the preprocessing.

Table 6: Performance with Randomized Secure Comparison

                        Server 1                           Server i, i ≠ 1
Centers   Patients     CPU Time    Data Sent               CPU Time
 20        200000      1.7 ms      7616 bytes              1.4 ms
 40        400000      1.7 ms      12736 bytes             1.5 ms
 60        600000      1.9 ms      17856 bytes             1.7 ms
 80        800000      2.1 ms      22976 bytes             1.8 ms
100       1000000      2.2 ms      28096 bytes             1.9 ms

In Table 7 we display the execution times of our protocol using the standard, bit-decomposition based, secure comparison (scenario 3). Similarly to the second scenario, due to the size of our inputs (up to 114 bits), we need to scale our input representation to 256-bit field elements, so as to achieve adequate statistical security (always ≥ 40 bits). The communication cost of Servers 2 and 3 is constant (4244 bytes), while for Server 1 it varies with the number of inputs it has to share, as shown in Table 7. The protocol completes its execution in 10 communication rounds, and consumes 50 triples, 1 square, and 31 bits from the preprocessing. As expected, this is the least efficient of the three scenarios, both in terms of communication and in terms of computational cost.


Table 7: Performance with Standard Secure Comparison

                        Server 1                           Server i, i ≠ 1
Centers   Patients     CPU Time    Data Sent               CPU Time
 20        200000      2.2 ms      12712 bytes             1.9 ms
 40        400000      2.3 ms      17832 bytes             2.0 ms
 60        600000      2.3 ms      22952 bytes             2.0 ms
 80        800000      2.5 ms      28072 bytes             2.2 ms
100       1000000      2.4 ms      33192 bytes             2.1 ms

6 Conclusion

From the setup description of both suggested techniques, one can identify the first significant difference between them: in the homomorphic setting, the medical centers only have to encrypt and send their data to one party, namely the computation server, while for the multiparty computation they have to secret share their data with two or more computation servers. The execution times resulting from our experiments show that the MPC approach is significantly faster than the homomorphic approach. Even if we assume the encryption of the contingency tables by the medical centers to be part of a preprocessing phase, the homomorphic approach takes more than a second to complete its execution, while the computations in the MPC setup take only a few milliseconds. In terms of communication cost, the homomorphic setup has the advantage that it needs no communication during the computations. However, in terms of the total amount of data that has to be transferred between the different parties, the MPC setup outperforms the homomorphic setup once more. We therefore recommend the MPC approach, as it is the more efficient of the two and does not rely on the strong assumption of semi-honest parties participating in the protocol.

Having compared the HE and MPC approaches in a setting addressing the exact same problem, we have established that MPC can provide more efficient solutions under more relaxed security assumptions. We therefore plan to proceed with future work on computing state-of-the-art statistics used in GWASs (instead of the simpler χ2 test) in a privacy-preserving way using MPC. To this end, an interesting first step is to study how we can express or approximate logistic regression with low-degree polynomials. Then we can deploy MPC to compute them securely, which will yield solutions efficient enough to be used in practice.

Our work shows that, as long as we can express the statistics to be calculated with low-degree polynomials, privacy-preserving GWAS has become practical. We made a first step towards efficient privacy-preserving GWAS with the secure calculation of the χ2 test. Our solutions provide provable security guarantees, while being efficient for realistic sample sizes and numbers of medical centers contributing data to the studies. Interestingly, our solutions scale logarithmically in the number of subjects contributing data to the study, which means that as GWAS population sizes grow, our approach will remain suitable. We also propose a new masking-based comparison method and show that in certain application scenarios, such as the GWAS scenario at hand, comparisons can be executed efficiently even in the HE setting, without leaking useful information about the underlying data.

Acknowledgements

This work was supported by the European Commission under the ICT programme with contract H2020-ICT-2014-1 644209 HEAT. Additionally, Yves Moreau, Jaak Simm, and Amin Ardeshirdavani were funded by Research Council KU Leuven: CoE PFV/10/016 SymBioSys; Flemish Government: IWT: Exaptation; O&O ExaScience Life Pharma; Exaptation, PhD grants Industrial Research Fund (IOF): IOF/KP (Identification and development of new classes of immunosuppressive compounds and discovery of new key proteins involved in the T and B-cell activation); FWO 06260 (Iterative and multi-level methods for Bayesian multi-relational factorization with features); imec strategic funding 2017.



Chapter 8

Privacy-preserving logistic regression training

Publication data

Bonte C. and Vercauteren F. Privacy-Preserving Logistic Regression Training. In BMC Medical Genomics (Oct. 2018), S. Wang, X. Jiang, X. Wang and H. Tang, Eds., vol. 11, Springer Nature, article number: 86.

Contribution: The author of this thesis is a main author of this paper.

Privacy-Preserving Logistic Regression Training

Charlotte Bonte, Frederik Vercauteren

imec-Cosic, Dept. Electrical Engineering, KU Leuven, Belgium
charlotte.bonte, [email protected]

Abstract. Logistic regression is a popular technique used in machine learning to construct classification models. Since the construction of such models is based on computing with large datasets, it is an appealing idea to outsource this computation to a cloud service. The privacy-sensitive nature of the input data requires appropriate privacy preserving measures before outsourcing it. Homomorphic encryption enables one to compute on encrypted data directly, without decryption, and can be used to mitigate the privacy concerns raised by using a cloud service. In this paper, we propose an algorithm (and its implementation) to train a logistic regression model on a homomorphically encrypted dataset. The core of our algorithm consists of a new iterative method that can be seen as a simplified form of the fixed Hessian method, but with a much lower multiplicative complexity. We test the new method on two interesting real life applications: the first application is in medicine and constructs a model to predict the probability for a patient to have cancer, given genomic data as input; the second application is in finance and the model predicts the probability of a credit card transaction being fraudulent. The method produces accurate results for both applications, comparable to running standard algorithms on plaintext data. This article introduces a new simple iterative algorithm to train a logistic regression model that is tailored to be applied on a homomorphically encrypted dataset. This algorithm can be used as a privacy-preserving technique to build a binary classification model and can be applied in a wide range of problems that can be modelled with logistic regression. Our implementation results show that our method can handle the large datasets used in logistic regression training.

1 Background

1.1 Introduction

Logistic regression is a popular technique used in machine learning to solve binary classification problems. It starts with a training phase during which one computes a model for prediction based on previously gathered values for predictor variables (called covariates) and corresponding outcomes. The training phase is followed by a testing phase that assesses the accuracy of the model. To this end, the dataset is split into data for training and data for validation. This validation is done by evaluating the model at the given covariates and comparing the output with the known outcome. When the classification of the model equals the outcome for most of the test data, the model is considered to be valuable and it can be used to predict the probability of an outcome by simply evaluating the model for new measurements of the covariates.

This work was supported by the European Commission under the ICT programme with contract H2020-ICT-2014-1 644209 HEAT.

Logistic regression is popular because it provides a simple and powerful method to solve a wide range of problems. In medicine, logistic regression is used to predict the risk of developing a certain disease based on observed characteristics of the patient. In politics, it is used to predict the voting behaviour of a person based on personal data such as age, income, sex, state of residence, and previous votes. In finance, logistic regression is used to predict the likelihood of a homeowner defaulting on a mortgage or a credit card transaction being fraudulent.

As with all machine learning tools, logistic regression needs sufficient training data to construct a useful model. As the above examples show, the values for the covariates and the corresponding outcomes are typically highly sensitive, which implies that the owners of this data (either people or companies) are reluctant to have their data included in the training set. In this paper, we solve this problem by describing a method for privacy-preserving logistic regression training using somewhat homomorphic encryption. Homomorphic encryption enables computations on encrypted data without needing to decrypt the data first. As such, our method can be used to send encrypted data to a central server, which will then perform logistic regression training on this encrypted input data. This also allows combining data from different data owners, since the server learns nothing about the underlying data.

1.2 Related work

Private logistic regression with the aid of homomorphic encryption has already been considered in [NLV11, BLN14], but in a rather limited form: both papers assume that the logistic model has already been trained and is publicly available. This publicly known model is then evaluated on homomorphically encrypted data in order to perform classification of this data without compromising the privacy of the patients. Our work complements these works by executing the training phase for the logistic regression model in a privacy-preserving manner. This is a much more challenging problem than the classification of new data, since it requires the application of an iterative method and a solution for the non-linearity in the minimization function.

Aono et al. [AHTPW16] also explored secure logistic regression via homomorphic encryption. However, they shift the computations that are challenging to perform homomorphically to trusted data sources and a trusted client. Consequently, in their solution the data sources need to compute some intermediate values, which they subsequently encrypt and send to the computation server. This allows them to only use an additively homomorphic encryption scheme to perform the second, easier, part of the training process. Finally, they require a trusted client to perform a decryption of the computed coefficients and use these coefficients to construct the cost function, for which the trusted client needs to determine the minimum in plaintext space. Their technique is based on a polynomial approximation of the logarithmic function in the cost function, and the trusted client applies the gradient descent algorithm as the iterative method to perform the minimization of the cost function resulting from the homomorphic computations. Our method does not require the data owners to perform any computations (bar the encryption of their data) and determines the model parameters by executing the minimization directly on encrypted data. Again, this is a much more challenging problem.

In [XWBB16], Xie et al. construct PrivLogit, which performs logistic regression in a privacy-preserving but distributed manner. As before, they require the data owners to perform computations on their data before encryption to compute parts of a matrix used in the logistic regression. Our solution starts from the encrypted raw dataset, not from values that were precomputed by the centers that collect the data. In our solution, all computations that are needed to create the model parameters are performed homomorphically.

Independently and in parallel with our research, Kim et al. [KSW+18] investigated the same problem of performing the training phase of logistic regression in the encrypted domain. Their method uses a different approach than ours: firstly, they use a different minimization method (gradient descent) compared to ours (a simplification of the fixed Hessian method), a different approximation of the sigmoid function and a different homomorphic encryption scheme. Their solution is based on a small adaptation of the input values, which reduces the number of homomorphic multiplications needed in the computation of the model. We assumed the dataset would already be encrypted and therefore adaptations to the input would be impossible. Furthermore, they tested their method on datasets that contain a smaller number of covariates than the datasets used in this article.

1.3 Contributions

Our contributions in this paper are as follows: firstly, we develop a method for privacy-preserving logistic regression training using homomorphic encryption that consists of a low-depth version of the fixed Hessian method. We show that consecutive simplifications result in a practical algorithm, called the simplified fixed Hessian (SFH) method, that at the same time is still accurate enough to be useful. We implemented this algorithm and tested its performance and accuracy on two real life use cases: a medical application predicting the probability of having cancer given genomic data, and a financial application predicting the probability that a transaction is fraudulent. Our test results show that in both use cases the computed model is almost as accurate as the model computed by standard logistic regression tools such as the ones present in Matlab.


2 Technical Background

2.1 Logistic regression

Logistic regression can be used to predict the probability that a dependent variable belongs to a class, e.g. healthy or sick, given a set of covariates, e.g. some genomic data. In this article, we consider binary logistic regression, where the dependent variable can belong to only two possible classes, which are labelled ±1. Binary logistic regression is often used for binary classification by setting a threshold for a given class up front and comparing the output of the regression with this threshold. The logistic regression model is given by:

\[ \Pr(y = \pm 1 \mid \mathbf{x}, \boldsymbol{\beta}) = \sigma(y \boldsymbol{\beta}^T \mathbf{x}) = \frac{1}{1 + e^{-y \boldsymbol{\beta}^T \mathbf{x}}}, \tag{1} \]

where the vector β = (β_0, …, β_d) contains the model parameters, y is the class label (in our case ±1) and the vector x = (1, x_1, …, x_d) ∈ R^{d+1} contains the covariates.

Because logistic regression predicts probabilities rather than classes, we can generate the model using the log likelihood function. The training of the model starts with a training dataset (X, y) = [(x_1, y_1), …, (x_N, y_N)], consisting of N training vectors x_i = (1, x_{i,1}, …, x_{i,d}) ∈ R^{d+1} and corresponding observed classes y_i ∈ {−1, 1}. The goal is to find the parameter vector β that maximizes the log likelihood function:

\[ l(\boldsymbol{\beta}) = -\sum_{i=1}^{N} \log\bigl(1 + e^{-y_i \boldsymbol{\beta}^T \mathbf{x}_i}\bigr). \tag{2} \]

When the parameters β are determined, they can be used to classify new data vectors x^{new} = (1, x^{new}_1, …, x^{new}_d) ∈ R^{d+1} by setting

\[ y^{new} = \begin{cases} \phantom{-}1 & \text{if } p(y = 1 \mid \mathbf{x}^{new}, \boldsymbol{\beta}) \ge \tau \\ -1 & \text{if } p(y = 1 \mid \mathbf{x}^{new}, \boldsymbol{\beta}) < \tau \end{cases} \]

in which 0 < τ < 1 is a variable threshold which typically equals 1/2.
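As a quick illustration of Eq. (1) and this classification rule, the Python sketch below evaluates a hypothetical trained model on a new record; all parameter values are made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def classify(beta, x_new, tau=0.5):
    x = np.concatenate(([1.0], x_new))   # prepend the constant covariate
    p = sigmoid(beta @ x)                # Pr(y = 1 | x, beta), Eq. (1) with y = 1
    return 1 if p >= tau else -1

beta = np.array([0.2, -1.3, 0.7])        # toy model parameters (beta_0, beta_1, beta_2)
print(classify(beta, np.array([0.5, 1.0])))
```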

2.2 Datasets

As mentioned before, we will test our method in the context of two real life use cases, one in genomics and the other in finance.

The genomic dataset was provided by the iDASH competition of 2017 and consists of 1581 records (each corresponding to a patient), each with 103 covariates and a class variable indicating whether or not the patient has cancer. The challenge was to devise a logistic regression model to predict the disease given a training data set of at least 200 records and 5 covariates. However, for scalability reasons the solution needed to be able to scale up to 1000 records with 100 covariates. This genomic dataset consists entirely of binary data.


The financial data was provided by an undisclosed bank that supplied anonymized data with the goal of predicting fraudulent transactions. Relevant data fields that were selected are: type of transaction, effective amount of the transaction, currency, origin and destination, fees and interests, etc. This data has been subject to preprocessing by firstly representing the non-numerical values with labels and secondly computing the minimum and maximum for each of the covariates and using these to normalise the data by computing \( \frac{x - x_{\min}}{x_{\max} - x_{\min}} \). The resulting financial dataset consists of 20 000 records with 32 covariates, containing floating point values between 0 and 1.
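A minimal sketch of this min-max normalisation step, assuming the covariates are collected in a NumPy array with one record per row (names are illustrative):

```python
import numpy as np

def normalise(X):
    # Column-wise min-max scaling of the covariates to the interval [0, 1].
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)
```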

2.3 The FV scheme

Our solution is based on the somewhat homomorphic encryption scheme of Fan and Vercauteren [FV12], which can be used to compute a limited number of additions and multiplications on encrypted data. The security of this encryption scheme is based on the hardness of the ring learning with errors (RLWE) problem introduced by Lyubashevsky et al. in [LPR13]. The core objects in the FV scheme are elements of the polynomial ring R = Z[X]/(f(X)), where typically one chooses f(X) = X^d + 1 for d = 2^n (in our case d = 4096). For an integer modulus M ∈ Z we denote with R_M the quotient ring R/(MR).

The plaintext space of the FV scheme is the ring R_t for t > 1 a small integer modulus, and the ciphertext space is R_q × R_q for an integer modulus q ≫ t. For a ∈ R_q, we denote by [a]_q the element in R obtained by applying [·]_q to all its coefficients a_i, with [a_i]_q = a_i mod q given by a representative in (−q/2, q/2]. The FV scheme uses two probability distributions on R_q: one is denoted by χ_key and is used to sample the secret key of the scheme, the other is denoted χ_err and will be used to sample error polynomials during encryption. The exact security level of the FV scheme depends on these probability distributions, the degree d and the ciphertext modulus q, and can be determined using an online tool developed by Albrecht et al. [Alb04].

Given parameters d, q, t and the distributions χ_key and χ_err, the core operations are then as follows:

– KeyGen: the private key consists of an element s ← χ_key, and the public key pk = (b, a) is computed by sampling a ← R_q uniformly at random and setting b = [−(as + e)]_q with e ← χ_err.
– Encrypt(pk, m): given m ∈ R_t, sample error polynomials e_1, e_2 ← χ_err and u ← χ_key and compute c_0 = ∆m + bu + e_1 and c_1 = au + e_2, with ∆ = ⌊q/t⌋ the largest integer not exceeding q/t. The ciphertext is then c = (c_0, c_1).
– Decrypt(sk, c): compute m̃ = [c_0 + c_1 s]_q, divide the coefficients of m̃ by ∆, round, and reduce the result into R_t.

Computing the sum of two ciphertexts simply amounts to adding the corresponding polynomials in the ciphertexts. Multiplication, however, requires a bit more work and we refer to [FV12] for the precise details.
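To make the KeyGen, Encrypt and Decrypt operations above concrete, here is a toy and completely insecure Python sketch of the textbook FV flow. The parameters (d = 16, q = 2^25, t = 257), the small-error distributions and all names are illustrative assumptions, not the parameters or the library used in this chapter.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy, insecure instantiation of the textbook FV key generation, encryption
# and decryption described above; parameters are far too small to be secure.
d, q, t = 16, 1 << 25, 257
Delta = q // t

def ring_mul(a, b):
    """Multiply two polynomials in Z[X]/(X^d + 1) (negacyclic convolution) mod q."""
    res = np.zeros(2 * d, dtype=np.int64)
    for i, ai in enumerate(a):
        res[i:i + d] += ai * b
    return (res[:d] - res[d:]) % q        # X^d = -1 folds the high part with a sign

def small_poly(bound=1):
    return rng.integers(-bound, bound + 1, d)

def keygen():
    s = small_poly()                       # secret key from chi_key
    a = rng.integers(0, q, d)              # uniform element of R_q
    e = small_poly(2)                      # error from chi_err
    b = (-(ring_mul(a, s) + e)) % q
    return s, (b, a)

def encrypt(pk, m):
    b, a = pk
    u, e1, e2 = small_poly(), small_poly(2), small_poly(2)
    c0 = (Delta * np.asarray(m) + ring_mul(b, u) + e1) % q
    c1 = (ring_mul(a, u) + e2) % q
    return c0, c1

def decrypt(sk, ct):
    c0, c1 = ct
    noisy = (c0 + ring_mul(c1, sk)) % q
    noisy = np.where(noisy > q // 2, noisy - q, noisy)   # representative in (-q/2, q/2]
    return np.rint(noisy * t / q).astype(np.int64) % t   # divide by Delta, round, reduce

sk, pk = keygen()
m = rng.integers(0, t, d)
assert np.array_equal(decrypt(sk, encrypt(pk, m)), m)
```

Homomorphic addition would simply add the two ciphertext pairs component-wise; multiplication needs extra machinery (relinearisation) and is omitted from this sketch.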

The relation between a ciphertext and the underlying plaintext can be described as [c_0 + c_1 s]_q = ∆m + e, where e is the noise component present in the ciphertext. This also shows that if the noise e grows too large, decryption will no longer result in the original message, and the scheme will no longer be correct. Since the noise present in the resulting ciphertext grows with each operation we perform homomorphically, it is important to choose parameters that guarantee correctness of the scheme. Knowing the computations that need to be performed up front enables us to estimate the size of the noise in the resulting ciphertext, which permits the selection of suitable parameters.

2.4 w-NIBNAF

In order to use the FV scheme, we need to transform the input data into polynomials of the plaintext space R_t. To achieve this, our solution makes use of the w-NIBNAF encoding, because this encoding improves the performance of the homomorphic scheme. The w-NIBNAF encoding is introduced in [BBB+17] and expands a given number θ with respect to a non-integral base 1 < b_w < 2. By replacing the base b_w by the variable X, the method encodes any real number θ as a Laurent polynomial:

\[ \theta = a_r X^r + a_{r-1} X^{r-1} + \dots + a_1 X + a_0 - a_{-1} X^{-1} - a_{-2} X^{-2} - \dots - a_{-s} X^{-s}. \]

A final step then maps this Laurent polynomial into the plaintext space R_t, and we refer the reader to [BBB+17] for the precise details.

The w-NIBNAF encoding is constructed such that the encoding of a number satisfies two conditions: the encoding has coefficients in the set {−1, 0, 1}, and each set of w consecutive coefficients has no more than one non-zero coefficient. Both conditions ensure that the encoded numbers are represented by very sparse polynomials with coefficients in {−1, 0, 1}, which can be used to bound the size of the coefficients of the result of computations on these representations. In particular, this encoding results in a smaller plaintext modulus t, which improves the performance of the homomorphic encryption scheme. Since larger values of w increase the sparseness of the encodings and hence reduce the size of t even more, one would like to select the value of w as large as possible. However, similar to encryption, one has to consider a correctness requirement for the encoding. More specifically, decoding of the final polynomial should result in the correct answer, hence the base b_w and consequently also the value of w should be chosen with care.
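The two structural conditions can be checked mechanically; the small helper below does exactly that for a list of coefficients (it does not compute the w-NIBNAF expansion itself, and the example sequence is made up).

```python
# Check the two w-NIBNAF conditions on a coefficient list a_i, ordered from
# the highest to the lowest power of X.
def is_w_nibnaf(coeffs, w):
    if any(c not in (-1, 0, 1) for c in coeffs):
        return False
    # every window of w consecutive coefficients has at most one non-zero entry
    return all(sum(c != 0 for c in coeffs[i:i + w]) <= 1
               for i in range(len(coeffs) - w + 1))

print(is_w_nibnaf([1, 0, 0, 0, -1, 0, 0, 0, 0, 1], w=4))  # True
```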

3 Privacy preserving training of the model

3.1 Newton-Raphson method

To estimate the parameters of our logistic regression model, we need to compute the parameter vector β that maximizes Equation (2). Typically, one would differentiate the log likelihood equation with respect to the parameters, set the derivatives equal to zero and solve these equations to find the maximum. The gradient of the log likelihood function l(β), i.e. the vector of its partial derivatives [∂l/∂β_0, ∂l/∂β_1, …, ∂l/∂β_d], is given by:

\[ \nabla_{\beta} l(\boldsymbol{\beta}) = \sum_{i} \bigl(1 - \sigma(y_i \boldsymbol{\beta}^T \mathbf{x}_i)\bigr) y_i \mathbf{x}_i. \]

In order to estimate the parameters β, this equation will be solved numerically by applying the Newton-Raphson method, which is a method to numerically determine the zeros of a function. The iterative formula of the Newton-Raphson method to calculate a root of a univariate function f(x) is given by:

\[ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}, \tag{3} \]

with f′(x) the derivative of f(x). Since we now compute with a multivariate objective function l(β), the (k + 1)-th iteration for the parameter vector β is given by:

\[ \boldsymbol{\beta}_{k+1} = \boldsymbol{\beta}_k - H^{-1}(\boldsymbol{\beta}_k)\, \nabla_{\beta} l(\boldsymbol{\beta}_k), \tag{4} \]

with ∇_β l(β) as defined above and H(β) = ∇²_β l(β) the Hessian of l(β), i.e. the matrix of its second partial derivatives H_{i,j} = ∂²l/∂β_i ∂β_j, given by:

\[ H(\boldsymbol{\beta}) = -\sum_{i} \bigl(1 - \sigma(y_i \boldsymbol{\beta}^T \mathbf{x}_i)\bigr)\, \sigma(y_i \boldsymbol{\beta}^T \mathbf{x}_i)\, (y_i \mathbf{x}_i)^T (y_i \mathbf{x}_i). \]

3.2 Homomorphic logistic regression

The downside of Newton's method is that exact evaluation of the Hessian and its inverse are quite expensive in computational terms. In addition, the goal is to estimate the parameters of the logistic regression model in a privacy-preserving manner using homomorphic encryption, which will further increase the computational challenges. Therefore, we will adapt the method in order to make it possible to compute it efficiently in the encrypted domain.

The first step in the simplification process is to approximate the Hessian matrix with a fixed matrix instead of updating it every iteration. This technique is called the fixed Hessian Newton method. In [BL88], Bohning and Lindsay investigate the convergence of the Newton-Raphson method and show that it converges if the Hessian H(β) is replaced by a fixed symmetric negative definite matrix B (independent of β) such that H(β) ≥ B for all feasible parameter values β, where "≥" denotes the Loewner ordering. The Loewner ordering is defined for symmetric matrices A, B and denoted as A ≥ B iff their difference A − B is non-negative definite. Given such a B, the Newton-Raphson iteration simplifies to

\[ \boldsymbol{\beta}_{k+1} = \boldsymbol{\beta}_k - B^{-1} \nabla_{\beta} l(\boldsymbol{\beta}_k). \]

Furthermore, they suggest a lower bound specifically for the Hessian of the logistic regression problem, defined as \( \bar{H} = -\frac{1}{4} X^T X \), and demonstrate that this is a good bound. This approximation does not depend on β; consequently, it is fixed throughout all iterations and only needs to be computed once, as desired. Since the Hessian is fixed, so is its inverse, which therefore also only needs to be computed once.

In the second step, we need to simplify this approximation even more, since inverting a square matrix whose dimensions equal the number of covariates (and thus can be quite large) is nearly impossible in the encrypted domain. To this end, we replace the matrix H̄ by a diagonal matrix for which the method still converges. The entries of the diagonal matrix are simply the sums of the rows of the matrix H̄, so our new approximation H̃ of the Hessian becomes:

\[ \tilde{H} = \begin{pmatrix} \sum_{i=0}^{d} h_{0,i} & 0 & \dots & 0 \\ 0 & \sum_{i=0}^{d} h_{1,i} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \sum_{i=0}^{d} h_{d,i} \end{pmatrix}. \]

To be able to use this approximation as a lower bound in the above fixed Hessian method, we need to assure ourselves that it satisfies the condition H(β) ≥ H̃. As mentioned before, we already know from [BL88] that \( H(\boldsymbol{\beta}) \ge -\frac{1}{4} X^T X \), so it is sufficient to show that \( -\frac{1}{4} X^T X \ge \tilde{H} \), which we now prove more generally.

Lemma 1. Let A ∈ R^{n×n} be a symmetric matrix with all entries non-positive, and let B be the diagonal matrix with diagonal entries \( B_{k,k} = \sum_{i=1}^{n} A_{k,i} \) for k = 1, …, n. Then A ≥ B.

Proof. By definition of the matrix B, the matrix C = A − B has the following entries: for i ≠ j we have C_{i,j} = A_{i,j}, and \( C_{i,i} = -\sum_{k=1, k \ne i}^{n} A_{i,k} \). In particular, the diagonal element of C on the i-th row equals minus the sum of the off-diagonal elements on that row. We can bound the eigenvalues λ of C by Gerschgorin's circle theorem [Ger31], which states that for every eigenvalue λ of C there exists an index i such that

\[ |\lambda - C_{i,i}| \le \sum_{j \ne i} |C_{i,j}|, \qquad i \in \{1, 2, \dots, n\}. \]

Note that by construction of C we have that \( C_{i,i} = \sum_{j \ne i} |C_{i,j}| \), and so every eigenvalue λ satisfies |λ − C_{i,i}| ≤ C_{i,i} for some i. In particular, since C_{i,i} ≥ 0, we conclude that λ ≥ 0 for all eigenvalues λ and thus that A ≥ B.

Our approximation H̃ of the Hessian also simplifies the computation of the inverse of the matrix, since we simply need to invert each diagonal element separately. Each inverse is again computed using the Newton-Raphson method: to invert a number a, we take \( f(x) = \frac{1}{x} - a \), and the iteration becomes \( x_{k+1} = x_k (2 - a x_k) \). For the Newton-Raphson method to converge, it is important to determine a good start value. Given the value range of the input data and taking into account the dimensions of the training data, we estimate the range of the number we want to invert. This results in an estimate of the order of magnitude of the solution that is expected to be found by the Newton-Raphson algorithm. By choosing the initial value of our Newton-Raphson iteration close to this estimate of the inverse, we can already find an acceptable approximation of the inverse by performing only one iteration of the method.
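A few plaintext iterations of this inversion rule make the convergence visible; the number a and the start value below are made up.

```python
# Newton-Raphson inversion x_{k+1} = x_k * (2 - a * x_k); toy values.
a = 400.0      # number to invert, so the target is 1/a = 0.0025
x = 1e-3       # start value chosen near the expected order of magnitude
for _ in range(3):
    x = x * (2 - a * x)
    print(x)   # 0.0016, 0.002176, 0.002458..., approaching 0.0025
```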

In the third and final step, we simplify the non-linearity coming from the sigmoid function. Here, we simply use the Taylor series: extensive experiments with plaintext data showed that approximating \( \sigma(y_i \boldsymbol{\beta}^T \mathbf{x}_i) \) by \( \frac{1}{2} + \frac{y_i \boldsymbol{\beta}^T \mathbf{x}_i}{4} \) is enough to obtain good results.
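A quick numerical look at how this degree-one approximation tracks the sigmoid near zero (illustrative values only):

```python
import numpy as np

z = np.linspace(-2, 2, 5)
print(np.round(1 / (1 + np.exp(-z)), 3))   # sigmoid: [0.119 0.269 0.5 0.731 0.881]
print(np.round(0.5 + z / 4, 3))            # approximation: [0.  0.25 0.5 0.75 1. ]
```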

The combination of the above techniques finally results in our simplified fixed Hessian (SFH) method given in Algorithm 1.

Algorithm 1 β ← simplified fixed Hessian(X, Y, u0, κ)

Input:  X (N × (d+1)): training data, with in each row the values of the covariates for one record, starting with a column of ones to account for the constant coefficient
        Y (N × 1): labels of the training data
        u0: start value for the Newton-Raphson iteration that computes the inverse
        κ: the required number of iterations
Output: β: the parameters of the logistic regression model

β = 0.001 * ones(d+1, 1)
sum = zeros(N, 1)
for i = 1 : N do
    for j = 1 : d+1 do
        sum(i) += X(i, j)
    end for
end for
for j = 1 : d+1 do
    temp = 0
    for i = 1 : N do
        temp += X(i, j) * sum(i)
    end for
    H̃(j, j) = −(1/4) * temp
    H̃⁻¹(j, j) = 2 * u0 − H̃(j, j) * u0²
end for
for k = 1 : κ do
    g = zeros(d+1, 1)
    for i = 1 : N do
        g += (1/2 − (1/4) * Y(i) * X(i, :) * β) * Y(i) * X(i, :)ᵀ
    end for
    β = β − H̃⁻¹ * g
end for
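As a cross-check, a compact plaintext NumPy version of Algorithm 1 can be written as follows; this is only a sketch for validating the method on unencrypted data (the function name and interface are ours), not the homomorphic implementation.

```python
import numpy as np

def simplified_fixed_hessian(X, Y, u0, kappa):
    # X: (N, d+1) training data with a leading column of ones; Y: labels in {-1, 1}.
    N, d1 = X.shape
    beta = 0.001 * np.ones(d1)
    row_sums = X.sum(axis=1)                          # sum_j X(i, j)
    H_diag = -0.25 * (X * row_sums[:, None]).sum(axis=0)
    H_inv = 2 * u0 - H_diag * u0**2                   # one Newton step towards 1 / H_diag
    for _ in range(kappa):
        g = ((0.5 - 0.25 * Y * (X @ beta)) * Y) @ X   # gradient with the sigmoid approximation
        beta = beta - H_inv * g                       # diagonal inverse, so elementwise product
    return beta
```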

We implemented the SFH algorithm in Matlab and verified the accuracy for a growing number of iterations. One can see from Algorithm 1 that each iteration requires 5 homomorphic multiplications, so performing one iteration is quite expensive. In addition, Table 1 indicates that improving the accuracy significantly requires multiple iterations. We will therefore restrict our experiments to one single iteration.


Table 1: Performance for the financial dataset with 31 covariates, 700 training records and 19 300 testing records.

# iterations   AUC SFH
  1            0.9418
  5            0.9436
 10            0.9448
 20            0.9466
 50            0.9517
100            0.9599

4 Results

4.1 Accuracy of the SFH method

Table 2 shows the confusion matrix of a general binary classifier.

Table 2: Comparing actual and predicted classes

                                   actual class
                            −1                        1
predicted class   −1        true negative (TN)        false negative (FN)
                   1        false positive (FP)       true positive (TP)

From the confusion matrix, we can compute the true positive rate (TPR) and the false positive rate (FPR), which are given by

\[ \mathrm{TPR} = \frac{\#TP}{\#TP + \#FN} \quad \text{and} \quad \mathrm{FPR} = \frac{\#FP}{\#FP + \#TN}. \tag{5} \]

By computing the TPR and FPR for varying thresholds 0 ≤ τ ≤ 1, we can construct the receiver operating characteristic curve or ROC curve. The ROC curve is constructed by plotting the (FPR, TPR) pairs for each possible value of the threshold τ. In the ideal situation there would exist a point with (FPR, TPR) = (0, 1), which would imply that there exists a threshold for which the model classifies all test data correctly.
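This construction can be sketched in a few lines of Python: Eq. (5) is evaluated over a grid of thresholds, and the area under the resulting curve (the AUC value introduced in the next paragraph) is obtained with the trapezoidal rule. The probs and labels arrays are made-up stand-ins for a validation set.

```python
import numpy as np

def roc_auc(probs, labels):
    """(FPR, TPR) points over a grid of thresholds and the area under the curve."""
    pts = []
    for tau in np.linspace(0.0, 1.0, 101):
        pred = np.where(probs >= tau, 1, -1)
        tp = np.sum((pred == 1) & (labels == 1))
        fp = np.sum((pred == 1) & (labels == -1))
        fn = np.sum((pred == -1) & (labels == 1))
        tn = np.sum((pred == -1) & (labels == -1))
        pts.append((fp / max(fp + tn, 1), tp / max(tp + fn, 1)))
    pts.sort()
    xs, ys = zip(*pts)
    # trapezoidal rule over the sorted (FPR, TPR) pairs
    return sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2 for i in range(len(xs) - 1))

probs = np.array([0.9, 0.8, 0.3, 0.2])   # made-up predicted probabilities
labels = np.array([1, 1, -1, 1])         # made-up true classes
print(roc_auc(probs, labels))            # 2 out of 3 positive/negative pairs ordered correctly
```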

The area under the ROC curve, or AUC value, will be used as the main indicator of how well the classifier works. Since our SFH method combines several approximations, we need to verify the accuracy of our model first on unencrypted data and later on encrypted data. For well chosen system parameters, there will be no difference between the accuracy for unencrypted and encrypted data, since all computations on encrypted data are exact.

The first step is performed by comparing our SFH method with the standard logistic regression functionality of Matlab. This is done by applying our method, with all its approximations, to the plaintext data and comparing the result to the result of the "glmfit" function in Matlab. The function b = glmfit(X, y, distr) returns a vector b of coefficient estimates for a generalized linear model of the responses y on the predictors in X, using distribution distr. Generalized linear models unify various statistical models, such as linear regression, logistic regression and Poisson regression, by allowing the linear model to be related to the response variable via a link function. To compute the parameters of the logistic regression model with "glmfit", we use the "binomial" distribution, which corresponds to the "logit" link function, and a binary vector y indicating success or failure.

Fig. 1: ROC curve for the cancer detection scenario of iDASH with 1000 training records and 581 testing records, all with 20 covariates.

From Figures 1 and 2 one can see that the SFH method classifies the data approximately as well as "glmfit" in Matlab, in the sense that one can always select a threshold that gives approximately the same true positive rate and false positive rate. One can thus conclude that the SFH method, with all its approximations, performs well compared to the standard Matlab method, which uses much more precise computations. By computing the TPR and FPR for several thresholds, we found that the approximations of our SFH method shift the model a bit, such that we need a slightly larger threshold to get approximately the same TPR and FPR as for the Matlab model. Since the ideal situation would be to end up with a true positive rate of 1 and a false positive rate of 0, we see from Figure 1 that for the genomics dataset both models perform rather poorly. The financial fraud use case is, however, much more amenable to binary classification, as shown in Figure 2. The main conclusion is that our SFH method performs almost as well as standard methods such as those provided by Matlab.

Fig. 2: ROC curve for the financial fraud detection with 1000 training records and 19 000 testing records, all with 31 covariates.

4.2 Implementation details and performance

Our implementation uses the FV-NFLlib software library [Cry16], which implements the FV homomorphic encryption scheme. The system parameters need to be selected taking into account the following three constraints:

1. the security of the somewhat homomorphic FV scheme,
2. the correctness of the somewhat homomorphic FV scheme,
3. the correctness of the w-NIBNAF encoding.

The security of a given set of system parameters can be estimated using the work of Albrecht, Player and Scott [APS15] and the open source learning with errors (LWE) hardness estimator implemented by Albrecht [Alb04]. This program estimates the security of the LWE problem based on the following three parameters: the degree d of the polynomial ring, the ciphertext modulus q and \( \alpha = \sqrt{2\pi}\,\sigma / q \), where σ is the standard deviation of the error distribution χ_err. The security estimation is based on the best known attacks for the learning with errors problem. Our system parameters are chosen to be q = 2^186, d = 4096 and σ = 20 (and thus α = √(2π)·σ/q), which results in a security level of 78 bits.
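For concreteness, the noise rate fed to the estimator for this parameter set can be reproduced with a few lines (the estimator itself is not invoked here):

```python
from math import sqrt, pi, log2

# Noise rate alpha = sqrt(2*pi) * sigma / q for the parameter set above.
sigma, log2_q = 20, 186
alpha = sqrt(2 * pi) * sigma / 2**log2_q
print(f"log2(alpha) = {log2(alpha):.1f}")   # prints log2(alpha) = -180.4
```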

As explained in the section on the FV scheme, the error in the ciphertext encrypting the result should be small enough to enable correct decryption. By estimating the infinity norm of the noise we can select parameters that keep this noise under the correctness bound; in particular, we obtain an upper bound t_max on the plaintext modulus. Similarly, to ensure correct decoding, the coefficients of the polynomial encoding the result must remain smaller than the plaintext modulus t. This condition results in a lower bound t_min on the plaintext modulus.

It turns out that these bounds are incompatible for the chosen parameters, so we have to rely on the Chinese Remainder Theorem (CRT) to decompose the plaintext space into smaller parts that can be handled correctly. The plaintext modulus t is chosen as a product of small prime numbers t_1, t_2, …, t_n with t_i ≤ t_max for all i ∈ {1, …, n} and \( t = \prod_{i=1}^{n} t_i \ge t_{\min} \), where t_max is determined by the correctness of the FV scheme and t_min by the correctness of the w-NIBNAF decoding. The CRT then gives the following ring isomorphism:

\[ R_t \to R_{t_1} \times \dots \times R_{t_n} : g(X) \mapsto (g(X) \bmod t_1, \dots, g(X) \bmod t_n), \]

and instead of performing the training algorithm directly over R_t, we compute with each of the R_{t_i}'s by reducing the w-NIBNAF encodings modulo t_i. The resulting choices for the plaintext spaces are given in Table 3.

Table 3: The parameters defining the plaintext encoding

                      w      t
genomic data (1)      71     5179 · 5189 · 5197
financial data (2)    150    2237 · 2239
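As an illustration of this CRT splitting, the following sketch splits and recombines a single integer coefficient using the primes from the genomic row of Table 3; the coefficient value is made up.

```python
# CRT split of a coefficient modulo t = t1 * t2 * t3 and Garner-style recombination.
t1, t2, t3 = 5179, 5189, 5197
t = t1 * t2 * t3

def to_crt(c):
    return (c % t1, c % t2, c % t3)

def from_crt(r1, r2, r3):
    x = r1
    x += ((r2 - x) * pow(t1, -1, t2) % t2) * t1
    x += ((r3 - x) * pow(t1 * t2, -1, t3) % t3) * (t1 * t2)
    return x % t

c = 123456789
assert from_crt(*to_crt(c)) == c
```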

Since we are using the Chinese Remainder Theorem, each record will be encrypted using two (for the financial fraud case) or three (for the genomics case) ciphertexts. As such, a time-memory trade-off is possible depending on the requirements of the application. One can choose to save computing time by executing the algorithm for the different ciphertexts in parallel; or one can choose to save memory by computing the result for each plaintext space R_{t_i} consecutively and overwriting the intermediate values of the computations in the process.

The memory required for each ciphertext is easy to estimate: a ciphertext consists of 2 polynomials of R_q = Z_q[X]/(X^d + 1), so its size is given by 2d log_2 q, which is ≈ 186 kB for the chosen parameter set. Due to the use of the CRT, we require T (with T = 2 or T = 3) ciphertexts to encrypt each record, so the general formula for the encrypted dataset size is given by:

\[ T \cdot (d+1) \cdot N \cdot 2d \log_2 q \text{ bits}, \]

with T the number of prime factors used to split the plaintext modulus t and d + 1 (resp. N) the number of covariates (resp. records) used in the training set.
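Plugging concrete values into this formula gives an idea of the scale; the example below assumes the largest genomic experiment of Table 4 (T = 3, d + 1 = 21, i.e. 20 covariates plus the constant term, N = 1000 training records) and is purely illustrative.

```python
# Dataset size formula above with the genomic parameters plugged in.
T, covariates, N = 3, 21, 1000
ring_degree, log2_q = 4096, 186
ciphertext_bytes = 2 * ring_degree * log2_q / 8
total_bytes = T * covariates * N * ciphertext_bytes
print(round(ciphertext_bytes / 1024), "kB per ciphertext")            # ~186 kB
print(round(total_bytes / 2**30, 1), "GB for the encrypted training set")
```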


Table 4: Performance for the genomic dataset with a fixed number of covariates equal to 20. The number of testing records is for each row equal to the total number of input records (1581) minus the number of training records.

# training records   computation time   AUC SFH   AUC glmfit
 500                 22 min             0.6348    0.6287
 600                 26 min             0.6298    0.6362
 800                 35 min             0.6452    0.6360
1000                 44 min             0.6561    0.6446

Table 5: Performance for the genomic dataset with a fixed number of training records equal to 500 and the number of testing records equal to 1081.

# covariates   computation time   AUC SFH   AUC glmfit
 5             7 min              0.65      0.6324
10             12 min             0.6545    0.6131
15             17 min             0.6446    0.6241
20             22 min             0.6348    0.6272

The time complexity of our SFH method is also easy to estimate, but one has to be careful to perform the operations in a specific order. If one would naively compute the matrix H̃ by first computing H̄ and subsequently summing each row, the complexity would be O(Nd^2). However, the formula for the k-th diagonal element of H̃ is given by

\[ -\frac{1}{4} \sum_{j=1}^{d+1} \left( \sum_{i=1}^{N} x_{k,i}\, x_{j,i} \right), \]

which can be rewritten as

\[ -\frac{1}{4} \sum_{i=1}^{N} x_{k,i} \left( \sum_{j=1}^{d+1} x_{j,i} \right). \]

This formula shows that it is more efficient to first sum all the rows of X and then perform a matrix-vector multiplication, with complexity O(Nd).
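A quick numerical sanity check of this reordering, with records stored as the rows of a toy random matrix X:

```python
import numpy as np

# Summing each record first and doing one matrix-vector product gives the same
# diagonal as the naive computation that forms X^T X explicitly.
rng = np.random.default_rng(0)
X = rng.random((8, 5))                        # 8 toy records, 5 columns
naive = -0.25 * (X.T @ X).sum(axis=1)         # O(N d^2): build X^T X, then sum its rows
fast = -0.25 * (X.T @ X.sum(axis=1))          # O(N d): row sums, then one product
assert np.allclose(naive, fast)
```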

Table 6: Performance for the financial dataset with a fixed number of covariates equal to 31. The number of testing records is for each row equal to the total number of input records (20 000) minus the number of training records.

# training records   computation time   AUC SFH   AUC glmfit
 700                 30 min             0.9416    0.9619
 800                 36 min             0.9411    0.9616
 900                 40 min             0.9409    0.9619
1000                 45 min             0.9402    0.9668

This complexity is clearly visible in the tables, more specifically in Table 4 and Table 5 for the genomic use case, and in Table 6 and Table 7 for the financial use case.


Table 7: Performance for the financial dataset with a fixed number of records equal to 500 and the number of testing records equal to 19 500.

# covariates   computation time   AUC SFH   AUC glmfit
 5             5 min              0.8131    0.8447
10             8 min              0.9403    0.9409
15             11 min             0.9327    0.9492
20             15 min             0.9401    0.9629

All these tables show a linear growth of the computation time for a growing number of records or covariates, as expected given the chosen order of the computations in the implementation.

In Table 4 and Table 5 we see that the AUC value of the SFH model is often slightly higher than the AUC value of the glmfit model. However, as mentioned before, both models perform poorly on this dataset. Since our SFH model contains many approximations, we expect it to perform slightly worse than the "glmfit" model, and only slightly worse because Figure 1 and Figure 2 already showed that the SFH model classifies the data almost as well as the "glmfit" model. This is consistent with the results for the financial dataset shown in Table 6 and Table 7, which we consider more relevant than the results for the genomic dataset, because both models perform better on this dataset.

5 Discussion

The experiments of this article show promising results for the simple iterative method we propose as an algorithm to compute the logistic regression model. A first natural question is whether this technique generalizes to other machine learning problems. In [Boh92], Bohning describes how to adapt the lower bound method to make it applicable to multinomial logistic regression; it is likely that this adaptation also applies to our SFH technique, and hence our SFH technique can most likely also be used to construct a multinomial logistic regression model. In the case of neural networks we can refer to [BCIV17]: in order to construct the neural network one needs to rank all the possibilities and only keep the best performing neurons for the next layer. Constructing this ranking homomorphically is not straightforward and not considered at all in our algorithm, hence neural networks will require more complicated algorithms.

When we look purely at the performance of the FV homomorphic encryption scheme, we might consider a residue number system (RNS) variant of the FV scheme as described in [BEHZ16] to further improve the running time of our implementation. One could also consider single instruction multiple data (SIMD) techniques as suggested in [CIV17], or look further into a dynamic rescaling procedure for FV as mentioned in [FV12]. These techniques will presumably further decrease the running time of our implementation, which would render our solution even more valuable.

6 Conclusions

The simple, but effective, iterative method presented in this paper allows one to train a logistic regression model on homomorphically encrypted input data. Our method can be used to outsource the training phase of logistic regression to a cloud service in a privacy-preserving manner. We demonstrated the performance of our logistic regression training algorithm on two real life applications using different numeric data types. In both cases, the accuracy of our method is only slightly worse than that of standard algorithms to train logistic regression models. Finally, the time complexity of our method grows linearly in the number of covariates and the number of training data points.

References

[AHTPW16] Yoshinori Aono, Takuya Hayashi, Le Trieu Phong, and Lihua Wang. Scalable and secure logistic regression via homomorphic encryption. In Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, CODASPY '16, pages 142–144, New York, NY, USA, 2016. ACM.

[Alb04] Martin Albrecht. Complexity estimates for solving LWE, 2000–2004.

[APS15] Martin R. Albrecht, Rachel Player, and Sam Scott. On the concrete hardness of learning with errors. J. Mathematical Cryptology, 9(3):169–203, 2015.

[BBB+17] Charlotte Bonte, Carl Bootland, Joppe W. Bos, Wouter Castryck, Ilia Iliashenko, and Frederik Vercauteren. Faster homomorphic function evaluation using non-integral base encoding. In Wieland Fischer and Naofumi Homma, editors, Cryptographic Hardware and Embedded Systems – CHES 2017, pages 579–600, Cham, 2017. Springer International Publishing.

[BCIV17] Joppe W. Bos, Wouter Castryck, Ilia Iliashenko, and Frederik Vercauteren. Privacy-friendly forecasting for the smart grid using homomorphic encryption and the group method of data handling. In Marc Joye and Abderrahmane Nitaj, editors, Progress in Cryptology – AFRICACRYPT 2017, pages 184–201, Cham, 2017. Springer International Publishing.

[BEHZ16] Jean-Claude Bajard, Julien Eynard, Anwar Hasan, and Vincent Zucca. A full RNS variant of FV-like somewhat homomorphic encryption schemes. IACR Cryptology ePrint Archive, 2017:22, 2016.

[BL88] Dankmar Bohning and Bruce G. Lindsay. Monotonicity of quadratic-approximation algorithms. Annals of the Institute of Statistical Mathematics, 40(4):641–663, 1988.

[BLN14] Joppe W. Bos, Kristin Lauter, and Michael Naehrig. Private predictive analysis on encrypted medical data. Journal of Biomedical Informatics, 50:234–243, 2014.

[Boh92] Dankmar Bohning. Multinomial logistic regression algorithm. Annals of the Institute of Statistical Mathematics, 44(1):197–200, 1992.


[CIV17] Wouter Castryck, Ilia Iliashenko, and Frederik Vercauteren. Homomorphic SIM2D operations: Single instruction much more data. IACR Cryptology ePrint Archive, 2017:22, 2017.

[Cry16] CryptoExperts. FV-NFLlib, 2016.

[FV12] Junfeng Fan and Frederik Vercauteren. Somewhat practical fully homomorphic encryption. IACR Cryptology ePrint Archive, 2012:144, 2012.

[Ger31] Semyon Aranovich Gershgorin. Über die Abgrenzung der Eigenwerte einer Matrix. Bulletin de l'Académie des Sciences de l'URSS, Classe des sciences mathématiques et naturelles, (6):749–754, 1931.

[KSW+18] Miran Kim, Yongsoo Song, Shuang Wang, Yuhou Xia, and Xiaoqian Jiang. Secure logistic regression based on homomorphic encryption. IACR Cryptology ePrint Archive, 2018:14, 2018.

[LPR13] Vadim Lyubashevsky, Chris Peikert, and Oded Regev. On Ideal Lattices and Learning with Errors over Rings. J. ACM, 60(6):Art. 43, 35, 2013.

[NLV11] Michael Naehrig, Kristin Lauter, and Vinod Vaikuntanathan. Can homomorphic encryption be practical? In Proceedings of the 3rd ACM Workshop on Cloud Computing Security Workshop, CCSW '11, pages 113–124, New York, NY, USA, 2011. ACM.

[XWBB16] Wei Xie, Yang Wang, Steven M. Boker, and Donald E. Brown. PrivLogit: Efficient privacy-preserving logistic regression by tailoring numerical optimizers. CoRR, abs/1611.01170, 2016.


Chapter 9

Homomorphic string search with constant multiplicative depth

Publication data

Bonte C. and Iliashenko I. Homomorphic string search with constant multiplicative depth. In 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop (Nov. 2020), Y. Zhang and R. Sion, Eds., Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop, Association for Computing Machinery, New York, pp. 105–117.

Contribution: The author of this thesis is a main author of this paper.

Homomorphic string search with constant multiplicative depth

Charlotte Bonte and Ilia Iliashenko

imec-COSIC, Dept. Electrical Engineering, KU Leuven, Belgium
charlotte.bonte, [email protected]

Abstract. String search finds occurrences of patterns in a larger text. This general problem occurs in various application scenarios, e.g. Internet search, text processing, DNA analysis, etc. Using somewhat homomorphic encryption with SIMD packing, we provide an efficient string search protocol that allows one to perform a private search in outsourced data with minimal preprocessing. At the base of the string search protocol lies a randomized homomorphic equality circuit whose depth is independent of the pattern length. This circuit not only improves the performance but also increases the practicality of our protocol, as it requires the same set of encryption parameters for a wide range of patterns of different lengths. This constant depth algorithm is about 12 times faster than the prior work. It takes about 5 minutes on an average laptop to find the positions of a string with at most 50 UTF-32 characters in a text with 1000 characters. In addition, we provide a method that compresses the search results, thus reducing the communication cost of the protocol. For example, the communication complexity for searching a string with 50 characters in a text of length 10000 is about 347 KB, and 13.9 MB for a text with 1000000 characters.

1 Introduction

The string search problem consists in finding occurrences of a given string (the pattern) in a larger string (the text). This problem arises in various branches of computer science, including text processing, programming, DNA analysis, database search, Internet search, network security, data mining, etc.

In real-life scenarios, string-searching algorithms often deal with private information. For example, the business model of Internet search engines is based on profiling users given their search queries, which is later used for targeted advertising. Another example is the analysis of genomic data. Doctors can outsource the genomic data of their patients to a service provider and query parts of this data remotely. If this information is exposed in the clear to the service provider, it might be exploited in an unauthorized way.

To protect private data, users and service providers can resort to a specialtype of encryption algorithms, called homomorphic encryption (HE) [25]. Inaddition to data hiding, HE allows to perform computations on encrypted datawithout decrypting it. Depending on computational capabilities, HE schemes

are divided into several classes. The most powerful class is fully homomorphicencryption (FHE) that allows to compute any function on encrypted values. Thefirst realization of FHE was presented in [15].

In secure string search, FHE has the following advantages over other privacy-preserving cryptographic tools.

– Low communication complexity. FHE requires only two communication rounds and its communication overhead is proportional to the plaintext size, whereas Yao's garbled circuits [30] have communication complexity proportional to the running time of the string-searching algorithm.

– Non-interactiveness. FHE does not require users and service providers to be present on-line while computing a string-searching algorithm. In contrast, multi-party computation (MPC) [30,18] is based on extensive on-line communication between the parties.

– Universality. Any string-searching algorithm can be implemented with FHE with little or no data preprocessing. This keeps the data in a form that is accessible for other computational tasks. On the contrary, private-information retrieval (PIR) [12], oblivious RAM (ORAM) [17] and private-set intersection (PSI) [8] protocols require data to be converted to a specific format that introduces additional time and memory overhead. In particular, PIR and ORAM retrieve an element with a unique identifier. Thus, substrings with the same sets of characters should be attached additional labels (e.g. their positions in the text) to distinguish them. PSI computes the intersection between the query (pattern) and the data (text). Thus, PSI checks only the presence of the pattern in the text without specifying its positions and the number of its occurrences. It implies that both the pattern and the text must be turned into sets whose intersection contains all the positions of the text substrings matching the pattern.

– No data leakage. Since the semantic security of the existing FHE schemes is based on hard lattice problems, FHE is believed to hide any information about encrypted data except for the maximal data size. In contrast, symmetric searchable encryption (SSE) [28] assumes so-called "minimal leakage" that usually includes whether the same data is accessed on the server side (access pattern) or whether the same query is generated by the client (search pattern).

Nevertheless, the efficiency of FHE schemes in general is far from practical despite numerous optimizations and tricks [4,13,20,11,7]. A more efficient approach is to resort to somewhat homomorphic encryption (SHE) [15], which can compute any function of bounded multiplicative depth. SHE is a better option in practical use cases where the function to be computed is often known in advance.

The most efficient SHE schemes are based on algebraic lattices [5,14]. It was noticed in [26] that the algebraic structure of these lattices yields a way of packing several data values into one homomorphic ciphertext. A homomorphic arithmetic operation applied on such a ciphertext results in an arithmetic operation simultaneously applied on all the packed data values. In other words, a single homomorphic instruction acts on multiple data values. This is why this technique is called SIMD packing. SIMD packing not only reduces the ciphertext-plaintext expansion ratio of SHE/FHE schemes but also significantly reduces the computational overhead of homomorphic circuits even when parallelism is not required [16].

The multiplicative depth of the existing homomorphic string-searching and pattern-matching algorithms with SIMD packing [9,10,29,23,22,1,2] depends on the pattern length, which makes it hard to set encryption parameters for patterns of varying lengths. In this work, we show how, using the SIMD techniques from [16] and randomization, we can efficiently address this problem.

1.1 Contribution

We propose a general framework for the design of homomorphic string search protocols using SHE schemes with SIMD packing.

We consider the setting where a client places encrypted data on a server and at a later point in time wants to search for a specific pattern without revealing the text, the pattern or the result of the string search to the server. In addition, our framework is applicable when the server has plaintext data and the client wants to query it without revealing the pattern and the results of the string search.

This framework includes the following steps.

Preprocessing. We provide a simple algorithm for converting a large text into a set of ciphertexts with reasonably small encryption parameters such that a homomorphic string search algorithm can efficiently operate on them. Since this algorithm preserves the natural representation of the text as an array of characters, the server can easily change the text on the client's request.

Processing. Even though any secure string search algorithm without preprocessing can be applied at this stage, we provide a concrete efficient example. In particular, we design a randomized homomorphic circuit with false-biased probability 1/q that checks the equality relation between pairs of encrypted strings encoded as vectors over a finite field F_q. Combining the homomorphic SIMD techniques from [26,16] and the randomization method of Razborov and Smolensky [24,27], this circuit achieves constant multiplicative depth, which allows us to set a single set of encryption parameters for different pattern lengths. Furthermore, it requires fewer homomorphic multiplications, which leads to a significant improvement in computational time over the prior works [23,22,2].

Postprocessing. We describe a new compression technique that combines the encrypted results of the string search such that the number of ciphertexts transmitted from the server to the client is reduced by a linear factor.

To demonstrate the efficiency of our framework, we provide its concrete running time using an implementation in the HElib library [21] and compare it with the prior works.


1.2 Related works

The first work on homomorphic string search and pattern matching was presented in [31,32]. Despite the efficiency of this algorithm, it has several functional drawbacks.

First, it assumes that data values are packed into plaintext polynomial coefficients. This type of encoding does not admit SIMD operations. Therefore, to manipulate individual data values one should resort to the coefficient-extraction procedure, which is expensive in practice [6].

Secondly, given r ciphertexts encrypting the text, this algorithm returns exactly the same number of ciphertexts encrypting the string search results. Thus, the communication complexity in this case is exactly the same as in the naive protocol where the server sends all r ciphertexts of the text to the client. If the client owns the text and uses the server to outsource their data, this problem makes the above algorithm meaningless.

Every string search algorithm uses the equality function as a subroutine. Thus, the optimization of a homomorphic string search often boils down to the optimization of the homomorphic equality function. The first equality circuit for binary data encoded in the SIMD manner was proposed by Cheon et al. [9]. This paper shows how homomorphic permutations of SIMD slots can be exploited to decrease the complexity of the equality circuit as predicted in [16]. Further, Kim et al. [23] designed an efficient equality circuit over arbitrary finite fields by employing the homomorphic Frobenius map. However, the multiplicative depth of the above circuits depends on the input length. In our work, we remove this dependency.

In [2], a homomorphic string search is based on the classic binary equality circuit and a randomized OR circuit. Both circuits depend on the input length, but the multiplicative depth of the OR circuit is decreased by the randomization method of Razborov and Smolensky [24,27]. In our work, we exploit the extreme version of this method where the failure probability depends only on the plaintext space size. This makes the depth of our matching algorithm constant at the cost of a large plaintext space, which we efficiently use in the preprocessing and postprocessing steps.

Another drawback of [2] is that it deals only with data encrypted bit-wise and exploits the SIMD packing only for parallel search in several texts, while ignoring the techniques from [16]. This data encoding increases the input length and thus introduces a larger computational overhead in comparison to the circuits in [9,23,22], as more homomorphic multiplications are required to compute the equality function. Moreover, the communication complexity is dependent on the bit-size of the pattern. In our work, the characters are encoded into a finite field F_q, which results in a larger number of characters that can be encrypted by one ciphertext. Furthermore, by employing the SIMD techniques from [16], we are able to keep and process characters of the same text in each ciphertext. This means that if the pattern length is always less than the ciphertext capacity, then we need only one ciphertext to encrypt the pattern. This makes the communication complexity from the client to the server independent of the pattern length.

Another advantage of our work is that our string search algorithm can find all the string matches in one round, whereas in [2] only one match is returned to the client.

2 Preliminaries

2.1 Notation

Vectors are written in column form and denoted by boldface lower-case letters. The vector containing only 1's in its coordinates is denoted by 1. We write 0 for the zero vector.

The set of integers ℓ, . . . , k is denoted by [ℓ, k]. For a positive integer t, let wt(t) be the Hamming weight of its binary expansion.

Let t be an integer with |t| > 1. We denote the set of residue classes modulo t by Z_t. The class representatives of Z_t are taken from the half-open interval [−t/2, t/2).

2.2 Cyclotomic fields and Chinese Remainder Theorem

Let m be a positive integer and n = φ(m), where φ is the Euler totient function. Let K be the cyclotomic number field constructed by adjoining a primitive complex m-th root of unity to the field of rational numbers. We denote this root of unity by ζ_m, so K = Q(ζ_m). The ring of integers of K, denoted by R, is isomorphic to Z[X]/⟨Φ_m(X)⟩ where Φ_m(X) is the mth cyclotomic polynomial.

Let R_t be the quotient of R modulo an ideal ⟨t⟩ generated by some element t ∈ R. The ring R_t is isomorphic to the direct product of its factor rings, as stated by the Chinese Remainder Theorem (CRT).

Theorem 1 (The Chinese Remainder Theorem for R_t). Let t be an integer with |t| > 1 and ⟨t⟩ be an ideal of R generated by t. Let ⟨t⟩ be the product of pairwise co-prime ideals I_0, . . . , I_{k−1}, then the following ring isomorphism holds

R_t ≅ R/I_0 × . . . × R/I_{k−1}   (1)

where the ring operations of the right-side direct product are component-wise addition and multiplication.

We can further characterize this isomorphism by using standard facts from number theory. Let t be a prime number not dividing m. The cyclotomic polynomial Φ_m(X) splits into k irreducible degree-d factors f_0(X), . . . , f_{k−1}(X) modulo t, where d is the order of t modulo m, i.e. the smallest positive integer such that t^d ≡ 1 mod m. Note that d = n/k. Correspondingly, the ideal ⟨t⟩ splits into k prime ideals ⟨t, f_0(X)⟩, . . . , ⟨t, f_{k−1}(X)⟩. Hence, for any i ∈ [0, k−1] the quotient ring R/I_i = Z[X]/⟨t, f_i(X)⟩ is isomorphic to the finite field F_{t^d}. As a result, we can rewrite the isomorphism in (1) as R_t ≅ F_{t^d}^k.


We call every copy of F_{t^d} in the above isomorphism a slot. Hence, every element of R_t corresponds to k slots, which implies that an array of k elements of F_{t^d} can be encoded as a unique element of R_t. We enumerate the slots in the same way as the ideals I_i. Namely, the slot isomorphic to R/I_i is referred to as the ith slot.

Addition (multiplication) of R_t-elements results in coefficient-wise addition (multiplication) of their respective slots. In other words, a single R_t operation induces a single operation applied on multiple F_{t^d} elements, which resembles the Single-Instruction Multiple-Data (SIMD) instructions used in parallel computing.

Using multiplication, we can easily define a projection map π_i on R_t that sends a ∈ R_t encoding slots (m_0, . . . , m_{k−1}) to π_i(a) encoding (0, . . . , m_i, . . . , 0). In particular, π_i(a) = a·g_i, where g_i ∈ R_t encodes (0, . . . , 1, . . . , 0). For any I ⊆ {0, . . . , k−1}, we can easily generalize this projection to π_I(a) = a·g_I with g_I ∈ R_t encoding 1 in the SIMD slots indexed by I.

The field K = Q(ζ_m) is a Galois extension and its Galois group Gal(K/Q) contains automorphisms of the form σ_i : X ↦ X^i where i ∈ Z_m^×. The automorphisms that fix every ideal I_i in the above decomposition of ⟨t⟩ form a subgroup G_t of Gal(K/Q) generated by the automorphism σ_t, named the Frobenius automorphism. Since (a(X))^{t^i} = a(X^{t^i}) for every a(X) ∈ F_{t^d}, the elements of G_t map the values of the SIMD slots to their (t^i)-th powers for i ∈ [0, d−1].

The elements of the quotient group H = Gal(K/Q)/G_t act transitively on I_0, . . . , I_{k−1}, thus permuting the corresponding SIMD slots. However, the order of H is n/d = k, which is less than k!, the number of all possible permutations on k slots. Gentry et al. [16] showed that every permutation of SIMD slots can be done via a combination of automorphisms from H, projection maps and additions.

One can define the map χ_0 : a ↦ a^{t^d−1} from F_{t^d} to the binary set {0, 1}. According to Euler's theorem, this map, called the principal character, returns 1 if a is non-zero and 0 otherwise. Since

a^{t^d−1} = a^{(t−1)(t^{d−1}+···+1)} = ∏_{i=0}^{d−1} (a^{t−1})^{t^i},   (2)

χ_0 can be computed with Frobenius maps and multiplications.
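To illustrate how the principal character decomposes into Frobenius powers of a^{t−1}, the following plaintext-level Python sketch (a toy model we add for illustration, with t = 3 and d = 2, representing F_9 as F_3[x]/(x^2+1); it is not HElib code) checks identity (2) against a direct computation of a^{t^d−1}.

# Toy check of the principal character chi_0 over F_9 = F_3[x]/(x^2 + 1);
# a plaintext-level illustration of identity (2), not the homomorphic code.
t, d = 3, 2

def mul(a, b):
    # (a0 + a1*x)(b0 + b1*x) mod (x^2 + 1), coefficients mod t
    a0, a1 = a
    b0, b1 = b
    return ((a0 * b0 - a1 * b1) % t, (a0 * b1 + a1 * b0) % t)

def power(a, e):
    r = (1, 0)
    while e:
        if e & 1:
            r = mul(r, a)
        a = mul(a, a)
        e >>= 1
    return r

def chi0(a):
    # chi_0(a) = prod_{i=0}^{d-1} (a^(t-1))^(t^i), cf. (2)
    b = power(a, t - 1)
    out = (1, 0)
    for i in range(d):
        out = mul(out, power(b, t ** i))
    return out

for a0 in range(t):
    for a1 in range(t):
        a = (a0, a1)
        expected = (0, 0) if a == (0, 0) else (1, 0)
        assert chi0(a) == power(a, t ** d - 1) == expected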

2.3 String search

The goal of string search is to find occurrences of a given string P, called the pattern, in a larger string T, called the text. Formally, let Σ be a finite alphabet, i.e. a finite set of characters. The pattern and the text are arrays of characters P[0 . . . M−1] and T[0 . . . N−1], respectively, where the characters are taken from Σ. Assume that M ≤ N. The string search problem is to find all S ∈ [0, N−M] such that P[i] = T[S+i] for any i ∈ [0, M−1]. In other words, this problem asks to find the positions of all substrings of T that match P.

We assume that there exists an injective map φ : Σ → F_{t^d} that encodes the characters of the alphabet Σ into the finite field F_{t^d}. Thus, the pattern and the text can be considered as vectors over F_{t^d}.


3 Homomorphic operations

In this work, we exploit leveled HE schemes that support the SIMD operations on their plaintexts. Such schemes include FV [14] and BGV [5], whose plaintext space is the ring R_t for some t > 1. The general framework of these schemes is outlined below.

3.1 Basic setup

Let λ be the security level of an HE scheme. Let L be the maximal multiplicative depth of the homomorphic circuits we want to evaluate. Let d be the order of the plaintext modulus t modulo the order m of R. Assume that the plaintext space R_t has k SIMD slots, i.e. R_t ≅ F_{t^d}^k. For a vector a ∈ F_{t^d}^k, we denote the plaintext encoding of a by pt(a). The basic algorithms of any HE scheme are key generation, encryption and decryption.

KeyGen(1^λ, 1^L) → (sk, pk). Given λ and L, this function generates the secret key sk and the public key pk. Note that pk contains key-switching keys that help to transform ciphertexts encrypted under other secret keys to ciphertexts encrypted under sk.

Encrypt(pt ∈ R_t, pk) → ct. The encryption algorithm takes a plaintext pt and the public key pk and outputs a ciphertext ct.

Decrypt(ct, sk) → pt. The decryption algorithm takes a ciphertext ct and the secret key sk and returns a plaintext pt. For freshly encrypted ciphertexts, decryption correctness means that Decrypt(Encrypt(pt, pk), sk) = pt.

3.2 Arithmetic operations

Basic arithmetic operations in SHE are addition and multiplication.

Add(ct_1, ct_2) → ct. The addition algorithm takes two input ciphertexts ct_1 and ct_2 encrypting plaintexts pt_1 and pt_2, respectively. It outputs a ciphertext ct that encrypts the sum of these plaintexts in the ring R_t. It implies that homomorphic addition sums the respective SIMD slots of pt_1 and pt_2.

AddPlain(ct_1, pt_2) → ct. This algorithm takes a ciphertext ct_1 encrypting a plaintext pt_1 and a plaintext pt_2. It outputs a ciphertext ct that encrypts pt_1 + pt_2. As for the Add algorithm, AddPlain sums the respective SIMD slots of pt_1 and pt_2.

Mul(ct_1, ct_2) → ct. Given two input ciphertexts ct_1 and ct_2 encrypting plaintexts pt_1 and pt_2, respectively, the multiplication algorithm outputs a ciphertext ct that encrypts the plaintext product pt_1 · pt_2. As a result, homomorphic multiplication multiplies the respective SIMD slots of pt_1 and pt_2.

MulPlain(ct_1, pt_2) → ct. Given a ciphertext ct_1 encrypting a plaintext pt_1 and a plaintext pt_2, this algorithm outputs a ciphertext ct that encrypts the plaintext product pt_1 · pt_2. As a result, MulPlain multiplies the respective SIMD slots of pt_1 and pt_2.

Using the above operations as building blocks, one can design homomorphic subtraction algorithms.

Sub(ct_1, ct_2) = Add(ct_1, MulPlain(ct_2, pt(−1))) → ct. The subtraction algorithm returns a ciphertext ct that encrypts the difference pt_1 − pt_2 of the two plaintext messages encrypted by ct_1 and ct_2, respectively.

SubPlain(ct_1, pt_2) = AddPlain(ct_1, pt_2 · pt(−1)) → ct. This algorithm returns a ciphertext ct that encrypts pt_1 − pt_2 where pt_1 is encrypted by ct_1. We consider SubPlain(pt_1, ct_2) to be equivalent to SubPlain(ct_1, pt_2).

As shown in Section 2.2, the projection map π_I can select the SIMD slots indexed by a set I ⊆ [0, k−1] and set the rest to zero. This operation is homomorphically realized by the Select function.

Select(ct, I) = MulPlain(ct, pt(1_I)) → ct′, where 1_I is a vector having 1's in the coordinates indexed by the set I and zeros elsewhere. Given a ciphertext ct encrypting SIMD slots m = (m_0, m_1, . . . , m_{k−1}) and a set I, this function returns a ciphertext ct′ that encrypts m′ = (m′_0, . . . , m′_{k−1}) such that m′_i = m_i if i ∈ I and m′_i = 0 otherwise.

3.3 Special operations

One can also homomorphically permute the SIMD slots of a given ciphertext and act on them with the Frobenius automorphism.

Rot(ct, i) → ct′ with i ∈ [0, k−1]. Given a ciphertext ct encrypting SIMD slots m = (m_0, m_1, . . . , m_{k−1}), the rotation algorithm returns a ciphertext ct′ that encrypts the cyclic shift of m by i positions, namely (m_i, m_{(i+1) mod k}, . . . , m_{(i−1) mod k}).

Frob(ct, i) → ct′ with i ∈ [0, d−1]. Given a ciphertext ct encrypting SIMD slots m as above, the Frobenius algorithm returns a ciphertext ct′ that encrypts the action of the Frobenius map on m, namely (m_0^{t^i}, m_1^{t^i}, . . . , m_{k−1}^{t^i}).
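The slot semantics of these operations can be mimicked at the plaintext level by simple list arithmetic. The following Python sketch (a toy mock-up we add for illustration; the names are ours and do not correspond to the HElib API) models a ciphertext as a list of Z_t values:

# Plaintext-level mock of the SIMD slot operations; lists of Z_t values stand
# in for ciphertexts. A toy illustration, not the homomorphic implementation.
t, k = 17, 8

def add(a, b):    return [(x + y) % t for x, y in zip(a, b)]
def mul(a, b):    return [(x * y) % t for x, y in zip(a, b)]
def sub(a, b):    return [(x - y) % t for x, y in zip(a, b)]
def rot(a, i):    return a[i % k:] + a[:i % k]              # cyclic shift by i slots
def select(a, I): return [x if j in I else 0 for j, x in enumerate(a)]

a = [1, 2, 3, 4, 5, 6, 7, 8]
print(rot(a, 2))          # [3, 4, 5, 6, 7, 8, 1, 2]
print(select(a, {0, 3}))  # [1, 0, 0, 4, 0, 0, 0, 0]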

As discussed in Section 2.2, the Frob and Mul operations can be combined to compute the principal character χ_0(x), which tests whether x is non-zero.

IsNonZero(ct) → ct′. Given a ciphertext ct encrypting SIMD slots m = (m_0, m_1, . . . , m_{k−1}), this function returns a ciphertext ct′ that encrypts (χ_0(m_0), χ_0(m_1), . . . , χ_0(m_{k−1})). Recall that χ_0(m) = m^{t^d−1} = ∏_{i=0}^{d−1} (m^{t−1})^{t^i} as shown in (2). The multiplicative depth of x^{t−1} is equal to ⌈log_2(t−1)⌉. The multiplicative depth of x^{t^i} is zero as it can be computed by the Frob operation. In total, d−1 Frob operations are needed to compute χ_0(m). As a result, the total multiplicative depth of IsNonZero is

⌈log_2(t−1)⌉ + ⌈log_2 d⌉.   (3)

Using general exponentiation by squaring, x^{t−1} requires ⌊log_2(t−1)⌋ + wt(t−1) − 1 field multiplications. Since d−1 field multiplications are needed to compute ∏_{i=0}^{d−1} (x^{t−1})^{t^i}, the total number of multiplications to compute χ_0(m) is

⌊log_2(t−1)⌋ + wt(t−1) + d − 2.   (4)


Table 1: The cost of homomorphic operations with relation to running time and noise growth.

Operation  | Time      | Noise
Add        | cheap     | cheap
AddPlain   | cheap     | cheap
Mul        | expensive | expensive
MulPlain   | cheap     | moderate
Sub        | cheap     | cheap
SubPlain   | cheap     | cheap
Select     | cheap     | moderate
Rot        | expensive | moderate
Frob       | expensive | cheap
IsNonZero  | expensive | expensive

3.4 Cost of homomorphic operations

Note that every homomorphic ciphertext contains a special component called noise that is removed during decryption. However, the decryption function can deal only with noise of small enough magnitude; otherwise, this function fails. This noise bound is determined by the encryption parameters in such a way that larger parameters result in a larger bound. The ciphertext noise increases after every homomorphic operation and, therefore, approaches its maximal possible bound. It implies that to reduce the encryption parameters one needs to avoid homomorphic operations that significantly increase the noise. Therefore, while designing homomorphic circuits, we need to take into account not only the running time of homomorphic operations but also their effect on the noise.

Table 1 summarizes the running time and the noise cost of the above homomorphic operations. Similar to [19], we divide the operations into expensive, moderate and cheap. The expensive operations dominate the cost of a homomorphic circuit. The moderate operations are less important, but if there are many of them in a circuit, their total cost can become dominant. The cheap operations are the least important and can be omitted in the cost analysis.

It is worth noting that there are two multiplication functions: Mul (ciphertext-ciphertext multiplication) and MulPlain (ciphertext-plaintext multiplication). Since Mul is much more expensive than MulPlain, the multiplicative depth of a homomorphic circuit is calculated with relation to the number of Mul's.

4 Equality circuits

The equality function tests whether two ℓ-dimensional vectors over some finite field F are equal. It returns 1 when the input strings are equal and 0 otherwise.


4.1 State-of-the-art equality circuits

If the input vectors are binary, the equality function can be computed in any ring Z_t with t > 1.

Definition 1 (Binary equality circuit). Given two ℓ-dimensional binary vectors x = (x_0, . . . , x_{ℓ−1}) and y = (y_0, . . . , y_{ℓ−1}), the equality function can be computed over any Z_t with t > 1 via the following arithmetic circuit

EQ_2(x, y) = ∏_{i=0}^{ℓ−1} (1 − (x_i − y_i)^2).

Representing data in binary form can be far from optimal, especially when the plaintext modulus t is bigger than 2. In this case, R_t is capable of encoding n log_2 t bits of data rather than just n. To use this extra space, we employ finite field arithmetic. Let t be a prime number. Then each SIMD slot is isomorphic to a finite algebraic extension of the finite field F_t of degree d. Hence, data can be encoded into elements of F_{t^d} rather than into elements of F_2. The equality circuit for vectors over F_{t^d} is defined as follows.

Definition 2 (Equality circuit in F_{t^d}). Given two vectors x = (x_0, . . . , x_{ℓ−1}) and y = (y_0, . . . , y_{ℓ−1}) from F_{t^d}^ℓ, the equality function can be computed via the following polynomial function

EQ_{t^d}(x, y) = ∏_{i=0}^{ℓ−1} (1 − (x_i − y_i)^{t^d−1}).   (5)

Using (3), it is easy to see that the total multiplicative depth of (5) is

⌈log_2 ℓ⌉ + ⌈log_2(t−1)⌉ + ⌈log_2 d⌉.

It follows from (4) that the total number of multiplications in (5) is

⌊log_2(t−1)⌋ + wt(t−1) + d + ℓ − 3.

We can also derive (5) from a function with ℓ variables. Let IsZero(x) be a function that outputs 1 when x is the zero vector and 0 otherwise. For x ∈ F_{t^d}^ℓ, it holds for each i ∈ [0, ℓ−1] that 1 − x_i^{t^d−1} is 1 if x_i = 0 and 0 otherwise. This implies that IsZero(x) is given by ∏_{i=0}^{ℓ−1}(1 − x_i^{t^d−1}). Since EQ_{t^d}(x, y) = IsZero(x − y), we indeed obtain (5) as the expression for the equality circuit.

4.2 Our equality circuits

We propose a new randomized equality circuit that makes the multiplicative depth independent of the input length. Our circuit is based on the Razborov-Smolensky method, which helps to represent a high fan-in OR function by a low-degree polynomial. In finite fields, the OR function returns 1 if its input has at least one non-zero coordinate and 0 otherwise. Using the principal character, we can represent OR as the polynomial OR(x) = 1 − ∏_{i=0}^{ℓ−1}(1 − x_i^{t^d−1}) of degree ℓ(t^d − 1) over F_{t^d}^ℓ.

To decrease the polynomial degree, we take some positive integer D < ℓ, sample Dℓ uniformly random elements r_0, . . . , r_{Dℓ−1} and compute OR^r(x) = 1 − ∏_{i=0}^{D−1}(1 − (∑_{j=0}^{ℓ−1} r_{iℓ+j} x_j)^{t^d−1}). The degree of this polynomial, D(t^d − 1), is smaller than that of OR, but its output is randomized and might be wrong. Notice that if x = 0, this polynomial correctly returns 0. If x is a non-zero vector, ∑_{j=0}^{ℓ−1} r_{iℓ+j} x_j = 0 with probability t^{−d}. Thus, OR^r wrongly returns 0 with probability t^{−Dd}. This means that OR^r(x) = OR(x) for x ≠ 0 with probability 1 − t^{−Dd}.

Note that D is chosen to decrease the failure probability. We can simply set it to 1 if the field size t^d is sufficiently large. Following this idea, we randomize the equality function over finite fields.

Definition 3 (Randomized equality circuit over F_{t^d}). Given two ℓ-dimensional vectors x = (x_0, . . . , x_{ℓ−1}) and y = (y_0, . . . , y_{ℓ−1}) with x_i, y_i ∈ F_{t^d} for any i, the randomized equality function can be computed over F_{t^d} via the following polynomial function

EQ^r_{t^d}(x, y) = 1 − (∑_{i=0}^{ℓ−1} r_i(x_i − y_i))^{t^d−1}   (6)

where the r_i's are uniformly random elements of F_{t^d}.

The correctness of EQ^r_{t^d} is established by the following lemma.

Lemma 1. Let x, y ∈ F_{t^d}^ℓ. If x = y, then EQ^r_{t^d}(x, y) = 1. If x ≠ y, then EQ^r_{t^d}(x, y) = 0 with probability 1 − t^{−d}.

Proof. If x = y, the sum ∑_{i=0}^{ℓ−1} r_i(x_i − y_i) always vanishes, which results in EQ^r_{t^d}(x, y) = 1.

If x ≠ y, then there exists a non-empty set of indices I ⊆ [0, ℓ−1] such that x_i − y_i is non-zero for any i ∈ I. Then, the product r_i(x_i − y_i) is a uniformly random element of F_{t^d} if i ∈ I and 0 otherwise. As a result, ∑_{i=0}^{ℓ−1} r_i(x_i − y_i) is a uniformly random element of F_{t^d}. This sum is non-zero with probability 1 − t^{−d}, which leads to (∑_{i=0}^{ℓ−1} r_i(x_i − y_i))^{t^d−1} = 1 by Euler's theorem. Hence, EQ^r_{t^d}(x, y) outputs 0 when x ≠ y with probability 1 − t^{−d}.

Complexity. Following the same reasoning as for the deterministic equality circuit, we obtain that the multiplicative depth of (6) is

⌈log_2(t−1)⌉ + ⌈log_2 d⌉ + 1,

which is independent of the vector length ℓ. It follows from (4) that the total number of multiplications in (6) is

⌊log_2(t−1)⌋ + wt(t−1) + d − 2 + ℓ.
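The behaviour of the randomized circuit can be checked at the plaintext level. The following Python sketch (a toy model we add for illustration, restricted to a prime field, i.e. d = 1; it is not the homomorphic implementation) compares the deterministic and randomized equality circuits and estimates the false-match probability, which should be close to t^{−1}:

# Plaintext-level comparison of the deterministic and randomized equality
# circuits over a prime field F_t (d = 1); a toy sketch, not homomorphic code.
import random

t = 257                                        # a prime plaintext modulus
l = 20                                         # vector length

def eq_det(x, y):
    # prod_i (1 - (x_i - y_i)^(t-1)) mod t, cf. (5) with d = 1
    out = 1
    for xi, yi in zip(x, y):
        out = out * (1 - pow(xi - yi, t - 1, t)) % t
    return out

def eq_rand(x, y):
    # 1 - (sum_i r_i (x_i - y_i))^(t-1) mod t, cf. (6) with d = 1
    s = sum(random.randrange(t) * (xi - yi) for xi, yi in zip(x, y)) % t
    return (1 - pow(s, t - 1, t)) % t

x = [random.randrange(t) for _ in range(l)]
assert eq_det(x, x) == eq_rand(x, x) == 1      # equal vectors always give 1

y = list(x)
y[3] = (y[3] + 1) % t                          # differ in one coordinate
trials = 20000
false_matches = sum(eq_rand(x, y) for _ in range(trials))
print(false_matches / trials, 1 / t)           # empirical vs. predicted error t^{-1}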

In a similar manner, we can define an equality circuit for vectors containing wildcards. Let ∗ be a wildcard character, meaning that it represents any symbol in the alphabet. Assume that ∗ is encoded by an element ω ∈ F_{t^d}. Then the randomized equality circuit with wildcards for F_{t^d}-vectors is defined as follows.

Definition 4 (Randomized equality circuit with wildcards over F_{t^d}). Let x = (x_0, . . . , x_{ℓ−1}) and y = (y_0, . . . , y_{ℓ−1}) be two ℓ-dimensional vectors such that x_i ∈ F_{t^d} and y_i ∈ F_{t^d} \ {ω}. Then the randomized equality function for such vectors can be computed via the following polynomial function

EQ^{r,∗}_{t^d}(x, y) = 1 − (∑_{i=0}^{ℓ−1} r_i(x_i − ω)(x_i − y_i))^{t^d−1}   (7)

where the r_i's are uniformly random elements of F_{t^d}.

Due to the additional multiplication by x_i − ω, this circuit has multiplicative depth

⌈log_2(t−1)⌉ + ⌈log_2 d⌉ + 2,

which is one more than that of EQ^r_{t^d}. This also introduces ℓ additional multiplications, so their total number becomes

⌊log_2(t−1)⌋ + wt(t−1) + d − 2 + 2ℓ.

Example 1. The randomized equality testing of two vectors from F_{3^{16}}^8 with wildcards needs 32 multiplications. The multiplicative depth of this circuit is 7. The output of this circuit is correct with error probability about 3^{−16} ≈ 2^{−25}.
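The counts in Example 1 can be reproduced directly from the formulas above, as in the following small Python sketch (added for illustration; the helper names are ours):

# Depth and Mul counts of the randomized equality circuits, cf. Example 1.
from math import ceil, log2

def wt(n):                                     # Hamming weight of n
    return bin(n).count("1")

def costs(t, d, l, wildcards=False):
    depth = ceil(log2(t - 1)) + ceil(log2(d)) + 1 + (1 if wildcards else 0)
    muls = (t - 1).bit_length() - 1 + wt(t - 1) + d - 2 + (2 * l if wildcards else l)
    return depth, muls

print(costs(3, 16, 8, wildcards=True))         # -> (7, 32), matching Example 1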

5 Homomorphic string search protocol

In this section, we describe a protocol for homomorphic string search. This protocol assumes two parties, the client and the honest-but-curious server. The client wants to upload a text document to the server and then be able to search over it. In particular, she wants to send a pattern to the server and receive the positions of this pattern in the outsourced text. The client wants to hide the text, the patterns and the query results from the server.

The flow of the protocol is depicted in Figure 1. Before the start of the protocol, the client encrypts and sends her text to the server. The protocol begins when the client encrypts and sends a pattern to the server. Upon receiving the encrypted pattern, the server performs a homomorphic string search algorithm and then sends the results back to the client. Our threat model assumes that the computationally-bounded server follows the protocol but tries to extract information from the client's queries. The server should not be able to distinguish two encrypted patterns of the same length. This security requirement is achieved by the semantic security of an SHE scheme, as discussed in Section 3.3 of [2].

Fig. 1: Our string search protocol.

In an alternative scenario, one could assume the server has the text in the clear. Our solution immediately transfers to this scenario by replacing the ciphertexts encrypting the text by plaintexts in the solution below, which would only reduce the complexity of our solution as operations between a plaintext and a ciphertext are less expensive than ciphertext-ciphertext operations. Since a solution for the setting where the text is encrypted immediately yields a solution for the scenario where the server has the text in the clear, we focus on the more general problem where the text is encrypted.

The protocol description starts with the preprocessing step where the client encrypts the text.

5.1 How to encrypt the text into several ciphertexts

Recall that the text T has length N, so it can be represented as an N-dimensional vector over F_{t^d}. Similarly, the pattern P can be represented as an F_{t^d}-vector of length M < N. Let T^{(i)}_M be the substring of T of length M starting at the ith position of T.

In practice, we can assume that M is less than k, the number of SIMD slots, which seems plausible as k can be hundreds or thousands. However, the entire text might not fit into k SIMD slots; thus, we assume N > k. In this case, we need to split T and encrypt it into different ciphertexts such that a homomorphic string search algorithm can find all the occurrences of any length-M pattern in T.

Let r = ⌈N/k⌉. Let us naively split T into substrings T_1, . . . , T_r, where all but the last one are of length k and |T_r| ≤ k. Notice that the substrings T^{(ik)}_M, . . . , T^{((i+1)k−M)}_M are contained in T_{i+1} for any i ∈ [0, r−1]. However, the substrings T^{(ik−M+1)}_M, . . . , T^{(ik−1)}_M for i ∈ [1, r−1] are missing, which means that this naive approach does not work. Moreover, it implies that we might need to duplicate characters of T to encode all its length-M substrings.

We define an (M,k)-cover of T as a set of length-k substrings T_1, T_2, . . . , T_r of T such that every length-M substring of T is contained in exactly one T_i. Therefore, all the occurrences of any pattern P of length at most M in T can be found by matching P against the T_i's. For example, if T = "example", then "exam", "ampl", "ple" is a (3,4)-cover of T. See Figure 2 for an illustration.

Fig. 2: The (3,4)-cover of the text T = "example". Every rectangle represents a length-4 substring of this cover. Every length-3 substring of T is contained in exactly one of these substrings.

We construct an (M,k)-cover of T as follows. Let T_1 be as in the naive approach, i.e. T_1 = T[0] . . . T[k−1]. As pointed out above, T_1 contains all the length-M substrings up to T^{(k−M)}_M. Thus, T_2 should start with T[k−M+1], which yields T_2 = T[k−M+1] . . . T[2k−M]. Following this procedure, we transform T into the set of its length-k substrings T_1, T_2, . . . , T_r such that

T_1 = T[0] . . . T[k−1],
T_2 = T[k−M+1] . . . T[2k−M],
. . .
T_i = T[(i−1)(k−M+1)] . . . T[(i−1)(k−M+1) + k−1],
. . .
T_r = T[(r−1)(k−M+1)] . . . T[N−1].

Thus, r should satisfy N−1 ≤ k−1 + (r−1)(k−M+1). It follows that r ≥ (N−M+1)/(k−M+1) and hence r = ⌈(N−M+1)/(k−M+1)⌉, since r is an integer.
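At the plaintext level, this construction amounts to splitting the text with a stride of k − M + 1, as in the following Python sketch (a small illustration we add; the helper name is ours):

# Sketch of the (M,k)-cover construction: consecutive length-k pieces overlap
# in M-1 characters, so every length-M substring lies in exactly one piece.
def cover(T, M, k):
    step = k - M + 1
    return [T[i:i + k] for i in range(0, len(T) - M + 1, step)]

print(cover("example", 3, 4))   # ['exam', 'ampl', 'ple'], cf. Figure 2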

Note that the T_i's are chosen to fit into one ciphertext with k slots. Hence, T_1, . . . , T_r can be encoded into SIMD slots such that the jth character T_i[j] is mapped to the jth slot of the ith ciphertext. Thus, r ciphertexts are needed to encrypt the N characters of the text T.

Example 2. The extreme examples are

– M = k (the pattern occupies all the SIMD slots of a single ciphertext), then r = N − k + 1;
– M = 1 (the pattern is just one character, so k copies of the pattern can be encrypted into one ciphertext), then r = ⌈N/k⌉, which is optimal.

If M = k/2, then r = ⌈(N − k/2 + 1)/(k/2 + 1)⌉ ≈ 2N/k. Thus, about twice as many ciphertexts are needed as in the optimal case. If M = k/c for some c ∈ (1, k], then r = ⌈(N − k/c + 1)/(k − k/c + 1)⌉ ≈ (c/(c−1)) · N/k. This means that if one ciphertext can contain at most c copies of the pattern, then c/(c−1) times more ciphertexts are needed to encrypt the text than in the optimal case.
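As a quick sanity check of these counts, the following lines (an illustration we add, using the slot count k = 1080 of Set 7.12 from Section 6) evaluate r for a text of 10000 characters; the value for M = 50 matches the first row of Table 4:

# Ciphertext count r = ceil((N - M + 1)/(k - M + 1)) for a few pattern lengths.
from math import ceil

N, k = 10000, 1080
for M in (1, 50, k // 2, k):
    print(M, ceil((N - M + 1) / (k - M + 1)))
# 1 -> 10, 50 -> 10, 540 -> 18 (about 2N/k), 1080 -> 8921 (= N - k + 1)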

Using the procedure above, the client produces r ciphertexts that contain the text. These ciphertexts are then uploaded to the server, which concludes the preprocessing phase.

An (M,k)-cover can also be created at the server side. Let the client naively split T into substrings T_1, . . . , T_{r′} of length at most k. These substrings are then encrypted and sent to the server. As r′ = ⌈N/k⌉ ≤ r, this reduces the communication cost between the client and the server and shifts the workload from the client to the server. Given the substring ciphertexts ct_1, . . . , ct_{r′}, the server can compute an (M,k)-cover using the following steps. To create an encryption of a string of an (M,k)-cover, the server extracts the slots of ct_1, . . . , ct_{r′} containing the characters of that string with Select and glues them into one ciphertext using Rot and Add operations.

The above procedure makes the setup of our protocol independent of the maximal pattern length M. The client shares M with the server, who can then create a correct (M,k)-cover either from the naive encryption of the text or from an earlier formed (M′,k)-cover. This method requires more homomorphic operations and hence increases the noise in the ciphertexts, which might require larger parameters to preserve decryption and computation correctness.

The homomorphic string search protocol starts when the client sends an encrypted pattern to the server. The following section describes how the pattern should be encrypted.

5.2 How to encrypt the pattern?

Since the pattern length M is assumed to be smaller than the number of slots k, one ciphertext is enough to encrypt the pattern P. Note that the characters of P should be encoded into SIMD slots such that the ith character of P is mapped to the ith SIMD slot. In this way, the order of the pattern characters is aligned with the order of the text characters. Furthermore, the pattern length can be several times smaller than the number of SIMD slots, i.e. ⌊k/M⌋ = C > 1. In this case, we encrypt C copies of P by placing them one by one into the SIMD slots. Namely, the character P[j] is encoded into the slots enumerated by j, j+M, . . . , j+(C−1)M. See Figure 3 for an illustration.
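The resulting slot layout can be sketched in a few lines of Python (an illustration we add; the helper name is ours):

# Slot layout of the pattern ciphertext: C = floor(k/M) copies placed back to
# back; any remaining slots are left at 0. A plaintext-level illustration.
def pattern_slots(P, k):
    M = len(P)
    C = k // M
    return list(P) * C + [0] * (k - C * M)

print(pattern_slots("ip", 5))   # ['i', 'p', 'i', 'p', 0], cf. Figure 3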


Fig. 3: The rectangles depict ciphertexts with 5 slots (squares). The top ones contain the text while the bottom one encrypts two copies of the pattern P = "ip".

Algorithm 1: Homomorphic string search algorithm.

Input: ct_P – a ciphertext encrypting C copies of a length-M pattern P;
       ct_1, . . . , ct_r – ciphertexts encrypting the (M,k)-cover T_1, . . . , T_r of a text T.
Output: ct′_1, . . . , ct′_r – ciphertexts such that ct′_i contains 1 in the jth SIMD slot if an occurrence of P starts at the jth position of T_i and 0 otherwise.

1  for i ← 1 to r do
2      ct′_i ← pt(0)
3      for j ← 0 to M−1 do
4          ct_{P,j} ← Rot(ct_P, −j)
5          C_j ← ⌊(k−j)/M⌋
6          I_j ← {j, j+M, . . . , j+(C_j−1)M}
7          ct ← HomEQ(M, ct_i, ct_{P,j}, I_j)
8          ct′_i ← Add(ct′_i, ct)    // AddPlain(ct′_i, ct) when j = 0
9  Return ct′_1, . . . , ct′_r.

5.3 How to compare the text and the pattern?

Given the text encrypted into r ciphertexts ct_1, . . . , ct_r and a ciphertext ct_P containing C copies of the pattern as above, the homomorphic string search follows Algorithm 1. In particular, for every ct_i Algorithm 1 performs a homomorphic equality test between shifted copies of the pattern and the text (see Figure 4). The homomorphic equality test is done by the HomEQ function, which homomorphically realizes any equality circuit described in Section 4. Given the pattern shifted by j positions, HomEQ should output a ciphertext ct containing the equality results in the SIMD slots indexed by j, j+M, j+2M, . . . , j+(C_j−1)M and zeros in the other slots (see Figure 5). In this case, the equality results can be combined into ct′ by the homomorphic addition on line 8 of Algorithm 1.

As shown in Figure 4, Algorithm 1 compares all length-M substrings encrypted by the ciphertexts ct_1, . . . , ct_r to the pattern. For example, when i = 2, the ciphertext ct_2 containing "m ips" is compared to the length-2 pattern "ip". The string "m ips" has 4 substrings of length 2, namely "m ", " i", "ip", "ps".


Fig. 4: Algorithm 1 shifts the position of the pattern SIMD slots such that at each jth iteration the pattern is compared to different substrings of the text encrypted by ct_i.

Fig. 5: The HomEQ function returns the results of the string search in the slots where the pattern copies begin (circled values on the right). The other slots are set to zero, which allows Algorithm 1 to combine the results with addition (line 8).

When j = 0, "ip" is compared to "m " and "ip". When j = 1, the pattern is shifted to the right and then compared to the substrings " i" and "ps".
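This comparison pattern can be simulated at the plaintext level, as in the following Python sketch (an illustration we add, operating on strings instead of ciphertexts; the helper name is ours):

# Plaintext-level simulation of Algorithm 1 on a single cover piece T_i:
# for j = 0..M-1, the aligned length-M substrings starting at the slots in I_j
# are compared to the pattern.
def search_piece(Ti, P):
    k, M = len(Ti), len(P)
    hits = [0] * k
    for j in range(M):
        Cj = (k - j) // M
        for c in range(Cj):
            start = j + c * M
            if Ti[start:start + M] == P:
                hits[start] = 1
    return hits

print(search_piece("m ips", "ip"))   # [0, 0, 1, 0, 0], cf. Figure 5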

A concrete instantiation of HomEQ is provided by Algorithm 2. This algorithm is a homomorphic implementation of the equality circuit EQ^r_{t^d} from (6). In fact, Algorithm 2 homomorphically computes EQ^r_{t^d} on several vectors simultaneously in the SIMD manner. Namely, given a set I ⊆ {0, . . . , k−M}, it outputs a ciphertext ct that contains EQ^r_{t^d}((x_i, x_{i+1}, . . . , x_{i+M−1}), (y_i, y_{i+1}, . . . , y_{i+M−1})) for any i ∈ I. Let us prove this claim.


Algorithm 2: The HomEQ algorithm that homomorphically implements the EQ^r_{t^d} circuit.

Input: M ∈ Z;
       ct_x – a ciphertext encrypting x ∈ F_{t^d}^k;
       ct_y – a ciphertext encrypting y ∈ F_{t^d}^k;
       I ⊆ {0, . . . , k−M}.
Output: ct – a ciphertext encrypting 1 in the ith slot if i ∈ I and x_{i+j} = y_{i+j} for any j ∈ {0, . . . , M−1}; all other slots contain 0.

1   ct_e ← Sub(ct_x, ct_y)
2   pt_r ← uniformly random plaintext
3   ct_e ← MulPlain(ct_e, pt_r)
4   ct_o ← pt(0)
5   r_e ← 1
6   r_o ← M
7   ℓ ← M
8   while ℓ > 1 do
9       if ℓ is even then
10          ℓ ← ℓ/2
11      else
12          r_o ← r_o − r_e
13          ct_o ← Add(ct_o, Rot(ct_e, r_o))
14          ℓ ← (ℓ−1)/2
15      ct_e ← Add(ct_e, Rot(ct_e, r_e))
16      r_e ← 2r_e
17  ct ← Add(ct_e, ct_o)
18  ct ← IsNonZero(ct)
19  ct ← SubPlain(pt(1), ct)
20  ct ← Select(ct, I)
21  Return ct.
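The rotate-and-add loop in lines 8–16 can be checked at the plaintext level with the following Python sketch (an illustration we add; plain integers and list rotations stand in for ciphertexts and Rot), which reproduces, in every slot j, the window sum z_j + · · · + z_{j+M−1}:

# Plaintext-level model of lines 8-16 of Algorithm 2: compute all cyclic
# window sums of length M with about log2(M) rotations.
def window_sums(z, M):
    k = len(z)
    rot = lambda a, i: a[i % k:] + a[:i % k]
    even, odd = list(z), [0] * k          # stand-ins for ct_e and ct_o
    re, ro, l = 1, M, M
    while l > 1:
        if l % 2 == 0:
            l //= 2
        else:
            ro -= re
            odd = [x + y for x, y in zip(odd, rot(even, ro))]
            l = (l - 1) // 2
        even = [x + y for x, y in zip(even, rot(even, re))]
        re *= 2
    return [x + y for x, y in zip(even, odd)]

z, M = list(range(8)), 3
print(window_sums(z, M))
print([sum(z[(j + u) % len(z)] for u in range(M)) for j in range(len(z))])  # same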

Correctness.

Lemma 2. Given an integer M, two vectors x, y ∈ F_{t^d}^k and a set I ⊆ [0, k−M], Algorithm 2 outputs a correct result with probability at least (1 − t^{−d})^{|I|}.

Proof. It is straightforward that lines 1-3 compute a ciphertext containing z_i = r_i(x_i − y_i) with uniformly random r_i ∈ F_{t^d} for any i ∈ [0, k−1]. The next step is to sum, in every slot j, the values z_j, . . . , z_{j+M−1}, which is done by adding the ciphertext containing the z_i's to its shifted copies (lines 8-16). Since homomorphic shifts are circular, we assume that the indices of the z_i's are taken modulo k.

Let M = 2^K + ∑_{i=0}^{K−1} a_i 2^i with a_i ∈ {0, 1} be the bit decomposition of M. Hence, the while loop in lines 8-16 has K iterations, which we count starting from 1. Let us denote by r_e^{(i)} and r_o^{(i)} the values of r_e and r_o at the end of the ith iteration of the while loop. Let us set r_e^{(0)} = 1 and r_o^{(0)} = M. Since r_e doubles at each iteration, we have r_e^{(i)} = 2^i. Since ℓ is set to M before the while loop, the least significant bit of ℓ is equal to a_{i−1} at the start of the ith iteration. Hence, r_o^{(i)} = r_o^{(i−1)} − a_{i−1} r_e^{(i−1)} = r_o^{(i−1)} − a_{i−1} 2^{i−1}, which by induction leads to r_o^{(i)} = M − ∑_{u=0}^{i−1} a_u 2^u = 2^K + ∑_{u=i}^{K−1} a_u 2^u. Note that if a_{i−1} = 1, then r_o^{(i)} + 2^{i−1} = r_o^{(i−1)}.

Let z_e^{(i)}[j] (resp. z_o^{(i)}[j]) be the jth slot of ct_e (resp. ct_o) at the end of the ith iteration. After the first iteration we have z_e^{(1)}[j] = z_j + z_{j+r_e^{(0)}} = z_j + z_{j+1}. By induction, it follows that

z_e^{(i)}[j] = z_e^{(i−1)}[j] + z_e^{(i−1)}[j + r_e^{(i−1)}] = z_e^{(i−1)}[j] + z_e^{(i−1)}[j + 2^{i−1}]
            = ∑_{u=0}^{2^{i−1}−1} z_{j+u} + ∑_{u=2^{i−1}}^{2^i−1} z_{j+u} = ∑_{u=0}^{2^i−1} z_{j+u}.   (8)

The ciphertext ct_o is updated in the ith iteration of the while loop if a_{i−1} = 1. In this case, z_o^{(i)}[j] = z_o^{(i−1)}[j] + z_e^{(i−1)}[j + r_o^{(i)}]. Since r_o^{(i)} + 2^{i−1} = r_o^{(i−1)}, it follows from (8) that z_e^{(i−1)}[j + r_o^{(i)}] = ∑_{u=r_o^{(i)}}^{r_o^{(i−1)}−1} z_{j+u}. Hence, z_o^{(i)}[j] = z_o^{(i−1)}[j] + ∑_{u=r_o^{(i)}}^{r_o^{(i−1)}−1} z_{j+u}. As z_o^{(0)}[j] = 0 (line 4 of the algorithm), it follows by induction that

z_o^{(i)}[j] = ∑_{v∈[1,i]: a_{v−1}=1} ∑_{u=r_o^{(v)}}^{r_o^{(v−1)}−1} z_{j+u}.

Notice that if a_{i−1} = 0, then r_o^{(i)} = r_o^{(i−1)}. Let a_{v−1} = a_{v′−1} = 1 for some v < v′. Thus, a_w = 0 for any w ∈ [v, v′−2]. The previous argument yields r_o^{(v)} = r_o^{(v′−1)}. Hence,

[r_o^{(v′)}, r_o^{(v′−1)} − 1] ∪ [r_o^{(v)}, r_o^{(v−1)} − 1] = [r_o^{(v′)}, r_o^{(v)} − 1] ∪ [r_o^{(v)}, r_o^{(v−1)} − 1] = [r_o^{(v′)}, r_o^{(v−1)} − 1].

Applying this argument for all v with a_{v−1} = 1, we obtain

z_o^{(i)}[j] = ∑_{u=r_o^{(i)}}^{r_o^{(0)}−1} z_{j+u} = ∑_{u=r_o^{(i)}}^{M−1} z_{j+u}.   (9)

Combining (8) and (9), we obtain that after the while loop ct_e and ct_o contain the following values in their slots

z_e[j] = ∑_{u=0}^{2^K−1} z_{j+u},   z_o[j] = ∑_{u=r_o^{(K)}}^{M−1} z_{j+u}.

Since r_o^{(K)} = M − ∑_{u=0}^{K−1} a_u 2^u = 2^K, the output of Add on line 17 encrypts the following value in its jth SIMD slot

z′[j] = ∑_{u=0}^{M−1} z_{j+u} = ∑_{u=0}^{M−1} r_{j+u}(x_{j+u} − y_{j+u}).


The SIMD slots are then changed by IsNonZero and SubPlain, which results in the jth slot containing

1 − (∑_{u=0}^{M−1} r_{j+u}(x_{j+u} − y_{j+u}))^{t^d−1}.

For any j ∈ [0, k−M] this is exactly the output of EQ^r_{t^d} applied on the vectors x_j = (x_j, . . . , x_{j+M−1}) and y_j = (y_j, . . . , y_{j+M−1}). Applying the Select operation on line 20, we zeroize all the slots whose indices are not in I. Thus, the jth SIMD slot of the final output contains 1 only if j ∈ I and EQ^r_{t^d}(x_j, y_j) = 1. According to Lemma 1, EQ^r_{t^d} always outputs 1 if x_j = y_j and returns 1 when x_j ≠ y_j with probability t^{−d}. Thus, Algorithm 2 is correct with probability at least (1 − t^{−d})^{|I|}.

Given Lemma 2, we are ready to prove the correctness of Algorithm 1.

Theorem 2. Let ct_P be a ciphertext encrypting C copies of a length-M pattern P. Let ct_1, . . . , ct_r be ciphertexts encrypting an (M,k)-cover of a text T. Given all the aforementioned ciphertexts, Algorithm 1 outputs a correct result with probability at least (1 − t^{−d})^{r(k−M+1)}.

Proof. Let us consider the for loop in lines 3-8 of Algorithm 1. On line 4, the pattern copies are shifted by j positions to the right such that they start at the jth slot of the ciphertext ct_{P,j} as in Figure 4. It means that ct_{P,j} contains exactly C_j copies of the pattern unbroken by this cyclic shift. The starting positions of these copies correspond to the elements of the set I_j. Next, the HomEQ function compares these pattern copies to the length-M substrings of the text that start at the SIMD slots indexed by I_j. It returns a ciphertext that contains 1 in its jth slot if P = T_i[j] . . . T_i[j+M−1].

For any j ≠ j′ the sets I_j and I_{j′} must be disjoint. Otherwise, there exist integers u, u′ such that j + uM = j′ + u′M, which leads to j − j′ = (u′ − u)M. Since j, j′ ∈ [0, M−1], it follows that |j − j′| ≤ M−1 < M and thus u = u′ and j = j′. Hence, I_j ∩ I_{j′} = ∅. It means that the homomorphic addition on line 8 puts the results of HomEQ on line 7 into distinct SIMD slots. Thus, ct′_i contains 1 in its jth slot if the pattern P matches T_i[j] . . . T_i[j+M−1].

According to Lemma 2, HomEQ outputs a correct result with probability (1 − t^{−d})^{C_j} in the jth iteration of the inner loop. Hence, the probability that ct′_i contains correct results is at least (1 − t^{−d})^{∑_{j=0}^{M−1} C_j}. Note that

∑_{j=0}^{M−1} C_j = ∑_{j=0}^{M−1} ⌊(k−j)/M⌋ = ∑_{j=0}^{M−1} (k − j − ((k−j) mod M))/M
                 = (k + (k−1) + ··· + (k−(M−1)))/M − (0 + 1 + ··· + (M−1))/M
                 = (kM − M(M−1)/2)/M − M(M−1)/(2M) = k − M + 1.


Thus, ∪_{j=0}^{M−1} I_j = [0, k−M]. All the length-M substrings are compared to the pattern in the inner loop by computing EQ^r_{t^d} homomorphically. Hence, ct′_i contains correct results with probability at least (1 − t^{−d})^{k−M+1}. Since there are r iterations of the outer loop, Algorithm 1 returns correct results with probability at least

(1 − t^{−d})^{r(k−M+1)}.

Given the above expression, we see that a larger pattern length M increases the probability of success. Since r = ⌈(N−M+1)/(k−M+1)⌉, we can approximate the success probability by (1 − t^{−d})^{N−M+1}. When M increases, (1 − t^{−d})^{N−M+1} increases as 0 ≤ 1 − t^{−d} ≤ 1. Hence, there will be fewer failures when the patterns are longer.
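For concreteness, the following lines (an illustration we add) evaluate the complementary failure bound 1 − (1 − t^{−d})^{r(k−M+1)} for one parameter choice taken from Section 6 (Set 7.12* with k = 1080, t = 7, d = 12) and a text of N = 10000 characters with M = 50:

# Maximal failure probability of Algorithm 1 for one illustrative parameter set.
from math import ceil, log2

t, d, k, M, N = 7, 12, 1080, 50, 10000
r = ceil((N - M + 1) / (k - M + 1))
p_fail = 1 - (1 - t ** (-d)) ** (r * (k - M + 1))
print(r, log2(p_fail))          # r = 10, log2(p_fail) is about -20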

Complexity.

Complexity of Algorithm 2. The multiplicative depth of Algorithm 2 is fixed and equal to the multiplicative depth of IsNonZero, which is ⌈log_2(t−1)⌉ + ⌈log_2 d⌉ according to (3).

As described in Section 3.4, the most expensive homomorphic operations are Mul, MulPlain, Select, Rot, and Frob. Let us count them in Algorithm 2. There is one MulPlain on line 3 and one Select on line 20. All ciphertext-ciphertext multiplications are executed within the IsNonZero function on line 18. According to Section 3.3, IsNonZero performs ⌊log_2(t−1)⌋ + wt(t−1) + d − 2 Mul operations and d − 1 Frob operations.

Rot operations are only present in the while loop (lines 8-16). Let 2^K + ∑_{i=0}^{K−1} a_i 2^i be the bit decomposition of M. Since the while loop has K iterations, K rotations are performed on line 15. The number of rotations performed on line 13 is equal to the number of non-zero a_i's, which is wt(M) − 1. Since K = ⌊log_2 M⌋, the total number of Rot operations is ⌊log_2 M⌋ + wt(M) − 1.

In summary, the following (expensive and moderate) operations are required to compute Algorithm 2:

– Mul: ⌊log_2(t−1)⌋ + wt(t−1) + d − 2,
– Rot: ⌊log_2 M⌋ + wt(M) − 1,
– Frob: d − 1,
– MulPlain: 1,
– Select: 1.

Similarly to Algorithm 2, we can implement a homomorphic circuit for EQ_{t^d}, which was used in the prior work [23]. This can easily be done by removing the multiplication by a random plaintext (lines 2-3), changing the order of homomorphic operations and replacing homomorphic additions with multiplications in the while loop. As shown in Table 2, our approach is strictly more efficient as it has fewer ciphertext-ciphertext multiplications. Moreover, the multiplicative depth of the prior technique depends on the pattern length M, whereas our approach eliminates this dependency.


Table 2: The number of expensive and moderate homomorphic operations in our paper and in the prior work. Our circuit removes the dependency of the multiplicative depth on the pattern length M.

           | HomEQ (this work)                 | [23,22]
Mul        | ⌊log_2(t−1)⌋ + wt(t−1) + d − 2    | ⌊log_2(t−1)⌋ + wt(t−1) + d − 3 + ⌊log_2 M⌋ + wt(M)
Rot        | ⌊log_2 M⌋ + wt(M) − 1             | ⌊log_2 M⌋ + wt(M) − 1
Frob       | d − 1                             | d − 1
Mul. depth | ⌈log_2(t−1)⌉ + ⌈log_2 d⌉          | ⌈log_2(t−1)⌉ + ⌈log_2 d⌉ + ⌈log_2 M⌉

Complexity of Algorithm 1. Algorithm 1 invokes the HomEQ function exactly rM times. In addition, it performs r(M−1) homomorphic rotations of ct_P (we can ignore one rotation with j = 0 as it does not change the ciphertext). As a result, Algorithm 1 performs the following (expensive and moderate) homomorphic operations.

– Mul: rM(⌊log_2(t−1)⌋ + wt(t−1) + d − 2),
– Rot: rM(⌊log_2 M⌋ + wt(M)) − r,
– Frob: rM(d − 1),
– MulPlain: rM,
– Select: rM.

The multiplicative depth of Algorithm 1 is the same as that of Algorithm 2, namely

⌈log_2(t−1)⌉ + ⌈log_2 d⌉.

String search with wildcards. Algorithm 2 can be modified to support the equality circuit with wildcards, EQ^{r,∗}_{t^d}. For simplicity, we assume that ω = 0 in (7). After the first line of Algorithm 2, we insert ct_e ← Mul(ct_x, ct_e), which outputs ct_e encrypting x_i(x_i − y_i). The correctness of this modified algorithm follows by setting z_i to r_i x_i(x_i − y_i) in the proof of Lemma 2.

Since only a single homomorphic multiplication is added, the modified version of Algorithm 2 requires ⌊log_2(t−1)⌋ + wt(t−1) + d − 1 ciphertext-ciphertext multiplications. Its multiplicative depth also increases by one, to ⌈log_2(t−1)⌉ + ⌈log_2 d⌉ + 1. This implies that Algorithm 1 should perform M more ciphertext-ciphertext multiplications at the cost of one additional multiplicative level.

5.4 Compression of results

The string search algorithm in the previous section outputs r ciphertexts containing the positions of the pattern occurrences in the text. Sending all these ciphertexts to the client makes the entire protocol meaningless, as the server could instead send the r ciphertexts encrypting the text back to the client. To avoid this problem, the encrypted results should be compressed such that significantly fewer than r ciphertexts have to be transmitted to the client.

In the output of Algorithm 1, each ciphertext ct′_i encrypts SIMD slots containing a single bit. However, every SIMD slot can store any element of the finite field F_{t^d}, or d⌊log_2 t⌋ bits. Thus, we can split ct′_1, . . . , ct′_r into groups of size d⌊log_2 t⌋ and then combine the ciphertexts within each group as follows

ct = ∑_{i=0}^{⌊log_2 t⌋−1} MulPlain(∑_{j=0}^{d−1} MulPlain(ct′_{id+j}, X^j), 2^i).

The summation symbol means a homomorphic sum of ciphertexts using Add. The ith slot of ct contains a polynomial ∑_{j=0}^{d−1} a_j X^j such that the ℓth bit of a_j is the value of the ith slot of ct′_{j+(ℓ−1)d}.

This compression method reduces the number of ciphertexts containing string search results from r to r/(d⌊log_2 t⌋). Even though this method returns O(r) ciphertexts, it significantly reduces the communication complexity of the string search protocol in practice. For example, if a SIMD slot is isomorphic to F_{17^{16}}, then r/64 ciphertexts should be transmitted.
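The packing arithmetic behind this compression can be sketched at the plaintext level as follows (an illustration we add with a small toy slot, d = 4 and t = 17, i.e. groups of 16 bit-results per slot; the helper name and the indexing convention g = i·d + j follow the displayed formula):

# Plaintext-level sketch of the result compression: the g-th bit-result of a
# group, with g = i*d + j, becomes bit i of the coefficient a_j of the slot
# polynomial a_0 + a_1*X + ... + a_{d-1}*X^{d-1}.
def compress_group(results, t, d):
    b = t.bit_length() - 1            # floor(log2 t) bits per coefficient
    assert len(results) == b * d and all(v in (0, 1) for v in results)
    coeffs = [0] * d
    for i in range(b):
        for j in range(d):
            coeffs[j] |= results[i * d + j] << i
    return coeffs

print(compress_group([1, 0, 1, 1] + [0] * 12, t=17, d=4))   # [1, 0, 1, 1]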

Our compression method is not optimal as it does not exploit the last M−1 slots. For M close to k, this implies that a significant part of the ciphertext slots is not used.

This problem can be solved by replacing these zero slots with slots extracted from other ciphertexts. Assume that the above compression returns ciphertexts ct_1, ct_2, . . . , ct_{r′}. Each ct_i does not exploit the last M−1 slots. To fill all the slots of ct_1, we extract the first M−1 slots of ct_2 and write them into the last M−1 slots of ct_1 (this is done by Select, Rot and Add). We remove these slots from ct_2 by shifting its slots to the left, thus setting the last 2(M−1) slots to zero. To fill these zero slots, we move the first 2(M−1) slots of ct_3 to ct_2 as above. We continue this procedure until we end up with ciphertexts whose slots are fully occupied. The number of such ciphertexts is the minimal one needed to encrypt all the compressed results.

Since this extra compression introduces more ciphertext noise, larger encryption parameters might be needed to support decryption correctness. This leads to a larger communication overhead that diminishes the gain from the extra compression. Therefore, we recommend assessing the advantages of this technique depending on the use-case scenario.

6 Implementation results

We tested our homomorphic string search algorithm using the implementation of the BGV scheme [5] in the HElib software library [21]. Our experiments were performed on a laptop equipped with an Intel Dual-Core i5-7267U CPU (running at 3.1 GHz) and 8 GB of RAM without multi-threading. The code of our implementation is available at https://github.com/iliailia/he_pattern_matching.


           | t  | m     | d  | ε         | k    | log q | IS, KB | EXP, B | OS, KB | λ
Set 7.12   | 7  | 21177 | 12 | ≈ 2^{−34} | 1080 | 294   | 930    | 882    | 340    | 150
Set 17.16  | 17 | 18913 | 16 | ≈ 2^{−65} | 1182 | 334   | 1542   | 1288   | 513    | 260
Set 7.12*  | 7  | 21177 | 12 | ≈ 2^{−34} | 1080 | 330   | 1041   | 987    | 347    | 135
Set 17.16* | 17 | 18913 | 16 | ≈ 2^{−65} | 1182 | 376   | 1736   | 1504   | 496    | 185

Table 3: The parameter sets used in our experiments, where t is the plaintext modulus, m is the order of the cyclotomic ring R, d is the extension degree of a SIMD slot over F_t, ε is the maximal failure probability of EQ^r_{t^d} (EQ^{r,∗}_{t^d}), k is the number of SIMD slots, q is the ciphertext modulus, IS is the size of one input ciphertext, EXP is the input ciphertext expansion per slot (character) (EXP = IS/k), OS is the size of an output ciphertext, and λ is the security level measured using the LWE estimator [3]. The (*) symbol denotes the parameter sets used for string search with wildcards.

In all experiments we have the following setup. Texts and patterns are strings consisting of 32-bit characters (corresponding to the UTF-32 encoding). To imitate real scenarios, patterns are generated uniformly at random in the experiments without wildcard characters. In the experiments with wildcard characters, every pattern is random but has a non-negligible probability of containing at least one wildcard. Given a pattern, we sample a random text, which is enforced to have substrings matching the pattern with non-negligible probability.

Texts and patterns are encrypted with the encryption parameters presented in Table 3. We empirically found that these parameters yield the best running time while keeping the failure probability of EQ^r_{t^d} (EQ^{r,∗}_{t^d}) lower than 2^{−32} or 2^{−64}. The Hamming weight of the secret keys is not bounded.

For each set of parameters in Table 3, we ran Algorithm 1 on one ciphertext with the text and the encrypted pattern of length M varying over the set {1, 2, . . . , 9, 10, 20, . . . , 100}. Since the iterations of the outer for loop in Algorithm 1 are independent, they can run in parallel. Thus, the above setting with a single ciphertext is a valid benchmark for the parallel implementation of Algorithm 1. In the sequential mode, the timing of this benchmark can be multiplied by the number r of ciphertexts containing the text.

As shown in Table 3, the size of the input ciphertexts varies between 930 KB and 1.7 MB. The amortized memory usage per slot is 882-1504 bytes. Since an encoded character occupies exactly one slot, this number can be regarded as the ciphertext expansion per character. Since the ciphertext size in BGV decreases after every ciphertext-ciphertext multiplication, output ciphertexts are smaller than the input ones, namely between 340 and 513 KB.

The results of the experiments are presented in Table 6. They include the total running time of Algorithm 1 with one input ciphertext and the amortized time per substring of length M. Since the input ciphertext contains k−M+1 substrings of length M, the amortized time is equal to the total time divided by k−M+1.


N       | r   | TS, MB | OS, MB | Time, min | Max. failure probability
10000   | 10  | 10     | 0.35   | 52        | ≈ 2^{−20}
100000  | 97  | 99     | 1.69   | 506       | ≈ 2^{−17}
1000000 | 970 | 986    | 13.9   | 5060      | ≈ 2^{−14}

Table 4: The dependency of the communication cost and the running time (in the sequential mode) of our homomorphic string search protocol on the text length N. The encryption parameters are taken from Set 7.12* and the pattern length is fixed to 50. The rightmost column contains the maximal probability that Algorithm 1 returns at least one wrong position of the pattern (1 − (1 − t^{−d})^{r(k−M+1)}). If the string-searching algorithm is performed with r parallel threads, the running time decreases to 5 minutes.

          | Pattern length | Failure probability per substring | Amortized time per bit, ms
This work | 1−100          | 2^{−34} − 2^{−65}                 | 0.06−0.13
[22]      | 35−55          | 0                                 | 7.07−10.86
[2]       | 64             | 2^{−1} − 2^{−80}                  | 1.58−5.50

Table 5: Comparison of our string search algorithm with the prior works. The second column shows the range of pattern lengths used in the related experiments.

search on one ciphertext, depending on the pattern length, the failure probability and whether wildcards are used. The amortized time per substring varies between 4 and 800 milliseconds. We did not encounter any false positive results in the experiments.
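To make the failure bound concrete, the sketch below evaluates the union bound 1 − (1 − ε)^(r(k−M+1)) from the table captions; the per-substring probability ε and the packing parameters are placeholder values chosen to be consistent with the ranges reported above, not the exact entries of Table 3.

    eps = 2.0 ** -34          # assumed per-substring failure probability (within the Table 5 range)
    r, k, M = 10, 1080, 50    # assumed packing: 10 ciphertexts of 1080 slots, pattern length 50

    substrings = r * (k - M + 1)
    p_false_positive = 1 - (1 - eps) ** substrings
    print(p_false_positive)   # about 6e-7, i.e. on the order of 2^-20 as in Table 4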

To give the reader a sense of how our solution scales with the text length, we consider the following use case. Assume that the client wants to search for patterns of length at most 50, with wildcards. She chooses the parameters from Set 7.12* and a text of length N. As in Section 5.1, she splits the text into r substrings, encrypts them and sends them to the server. We denote the size of these ciphertexts by TS. Then, the client queries the server, which performs Algorithm 1. Next, the server compresses the r outputs using the technique from Section 5.4 and sends them back to the client. The size of the compressed results is denoted by OS. Table 4 illustrates how r, TS, OS and the running time of Algorithm 1 depend on N. All these values grow linearly, but the communication cost (OS) remains quite low even if the text contains one million characters. The running time can be decreased to 5 minutes for any N if the string-searching algorithm is processed in the parallel mode.

Comparison to the prior works. As shown in Table 5, our algorithm has a better running time per bit. Furthermore, our method has several functional advantages. Namely, our algorithm has a constant depth, which allows the use of modest encryption parameters. In comparison to [22], where a ring of dimension 27000 is


required for pattern length 55, our method works even with ring dimension 12960 (Sets 7.12 and 7.12*) for pattern lengths up to 100. In [22], it is also required to transmit additional ciphertexts with the wildcard positions.

Our algorithm hides wildcard positions, in contrast to [2], where wildcard positions are publicly known to the server. In addition, our protocol returns all the pattern matches found in the text, while the protocol in [2] outputs only one occurrence of the pattern.

Unfortunately, we cannot compare the concrete communication costs, as they are not reported in the above works.

7 Conclusion

In this work, we designed a general framework for a homomorphic string search protocol based on SHE schemes with SIMD packing. We provided an efficient instantiation of this framework, including the preprocessing step where the client splits the text into several components, which can be separately encrypted and processed.

We also showed a randomized homomorphic string search algorithm whose multiplicative depth is independent of the pattern length. This allows us to use a single set of encryption parameters for a wide range of patterns with different lengths.

Our string search algorithm can be efficiently executed on an average laptop, as demonstrated by our implementation in the HElib library. The running time of our algorithm is about 12 times faster than the prior work based on the SIMD techniques. Another advantage of our work is the communication cost, which is significantly reduced by our compression technique that allows the result to be sent to the client in r/(d⌊log2 t⌋) instead of r ciphertexts. For example, to transmit all the positions of a given substring in a text with one million UTF-32 characters, our protocol only requires about 13.9 MB instead of 328.7 MB.
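As a rough numerical illustration of this compression claim, the sketch below recomputes the quoted sizes; the per-ciphertext output size and the packing factor d⌊log2 t⌋ are not stated explicitly in the text, so the values used here are assumptions chosen to reproduce the 328.7 MB and 13.9 MB figures.

    from math import ceil

    r = 970                    # result ciphertexts before compression (Table 4, N = 10^6)
    packing = 24               # assumed value of d * floor(log2(t)) for this parameter set
    out_ct_mb = 0.339          # assumed size of one output ciphertext in MB

    uncompressed = r * out_ct_mb                 # ~328.8 MB
    compressed = ceil(r / packing) * out_ct_mb   # 41 ciphertexts, ~13.9 MB
    print(round(uncompressed, 1), round(compressed, 1))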

This work presents a homomorphic realization of the naive string search algorithm with computational complexity Ω(NM), where N is the text length and M is the length of the pattern. There are asymptotically faster string-searching algorithms that exploit special data structures, e.g. suffix trees, tries or finite automata. Given the computational constraints of homomorphic encryption, it is an open question whether it is possible to implement efficient homomorphic counterparts of these algorithms.

Acknowledgements. This work has been supported by CyberSecurity Research Flanders with reference number VR20192203 and in part by ERC Advanced Grant ERC-2015-AdG-IMPaCT. The second author is supported by a Junior Postdoctoral Fellowship from the Research Foundation – Flanders (FWO).

References

1. Adi Akavia, Dan Feldman, and Hayim Shaul. Secure search on encrypted data via multi-ring sketch. In David Lie, Mohammad Mannan, Michael Backes, and


Pattern       Set 7.12                                    Set 7.12*
length        Time, sec   Amortized time, sec/substring   Time, sec   Amortized time, sec/substring
    1              4            0.004                          5            0.005
    2              9            0.008                         10            0.009
    3             14            0.013                         16            0.015
    4             19            0.018                         19            0.018
    5             25            0.023                         25            0.023
    6             30            0.028                         30            0.028
    7             36            0.034                         36            0.034
    8             37            0.034                         40            0.037
    9             44            0.041                         49            0.046
   10             50            0.047                         55            0.051
   20            111            0.105                        118            0.111
   30            184            0.175                        187            0.178
   40            236            0.227                        243            0.233
   50            303            0.294                        301            0.292
   60            385            0.377                        382            0.374
   70            460            0.455                        449            0.444
   80            492            0.492                        491            0.491
   90            619            0.625                        617            0.623
  100            622            0.634                        630            0.642

Pattern       Set 17.16                                   Set 17.16*
length        Time, sec   Amortized time, sec/substring   Time, sec   Amortized time, sec/substring
    1              6            0.005                          7            0.006
    2             12            0.010                         14            0.012
    3             19            0.016                         22            0.019
    4             25            0.021                         28            0.024
    5             32            0.027                         36            0.031
    6             39            0.033                         44            0.037
    7             48            0.041                         53            0.045
    8             52            0.044                         59            0.050
    9             60            0.051                         69            0.059
   10             68            0.058                         77            0.066
   20            129            0.111                        160            0.138
   30            204            0.177                        259            0.225
   40            269            0.235                        317            0.277
   50            351            0.310                        413            0.365
   60            439            0.391                        511            0.455
   70            513            0.461                        585            0.526
   80            571            0.518                        663            0.601
   90            685            0.627                        793            0.726
  100            739            0.682                        850            0.785

Table 6: The running time (averaged out over 100 experiments) and the amortized time per substring of Algorithm 1 (with and without wildcards) with one ciphertext encrypting the text and the encryption parameters presented in Table 3. One ciphertext contains a text of 1080 characters for Set 7.12 (Set 7.12*) and of 1182 characters for Set 17.16 (Set 17.16*).


XiaoFeng Wang, editors, ACM CCS 2018, pages 985–1001. ACM Press, October 2018.

2. Adi Akavia, Craig Gentry, Shai Halevi, and Max Leibovich. Setup-free secure search on encrypted data: faster and post-processing free. PoPETs, 2019(3):87–107, July 2019.

3. Martin R. Albrecht, Rachel Player, and Sam Scott. On the concrete hardness of learning with errors. Journal of Mathematical Cryptology, 9(3):169–203, 2015.

4. Jacob Alperin-Sheriff and Chris Peikert. Faster bootstrapping with polynomial error. In Juan A. Garay and Rosario Gennaro, editors, CRYPTO 2014, Part I, volume 8616 of LNCS, pages 297–314. Springer, Heidelberg, August 2014.

5. Zvika Brakerski, Craig Gentry, and Vinod Vaikuntanathan. (Leveled) fully homomorphic encryption without bootstrapping. In Shafi Goldwasser, editor, ITCS 2012, pages 309–325. ACM, January 2012.

6. Hao Chen, Wei Dai, Miran Kim, and Yongsoo Song. Efficient homomorphic conversion between (Ring) LWE ciphertexts. Cryptology ePrint Archive, Report 2020/015, 2020. https://eprint.iacr.org/2020/015.

7. Hao Chen and Kyoohyung Han. Homomorphic lower digits removal and improved FHE bootstrapping. In Jesper Buus Nielsen and Vincent Rijmen, editors, EUROCRYPT 2018, Part I, volume 10820 of LNCS, pages 315–337. Springer, Heidelberg, April / May 2018.

8. Hao Chen, Zhicong Huang, Kim Laine, and Peter Rindal. Labeled PSI from fully homomorphic encryption with malicious security. In David Lie, Mohammad Mannan, Michael Backes, and XiaoFeng Wang, editors, ACM CCS 2018, pages 1223–1237. ACM Press, October 2018.

9. Jung Hee Cheon, Miran Kim, and Myungsun Kim. Search-and-compute on encrypted data. In Michael Brenner, Nicolas Christin, Benjamin Johnson, and Kurt Rohloff, editors, FC 2015 Workshops, volume 8976 of LNCS, pages 142–159. Springer, Heidelberg, January 2015.

10. Jung Hee Cheon, Miran Kim, and Kristin E. Lauter. Homomorphic computation of edit distance. In Michael Brenner, Nicolas Christin, Benjamin Johnson, and Kurt Rohloff, editors, FC 2015 Workshops, volume 8976 of LNCS, pages 194–212. Springer, Heidelberg, January 2015.

11. Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachene. Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds. In Jung Hee Cheon and Tsuyoshi Takagi, editors, ASIACRYPT 2016, Part I, volume 10031 of LNCS, pages 3–33. Springer, Heidelberg, December 2016.

12. Benny Chor, Oded Goldreich, Eyal Kushilevitz, and Madhu Sudan. Private information retrieval. In 36th FOCS, pages 41–50. IEEE Computer Society Press, October 1995.

13. Leo Ducas and Daniele Micciancio. FHEW: Bootstrapping homomorphic encryption in less than a second. In Elisabeth Oswald and Marc Fischlin, editors, EUROCRYPT 2015, Part I, volume 9056 of LNCS, pages 617–640. Springer, Heidelberg, April 2015.

14. Junfeng Fan and Frederik Vercauteren. Somewhat practical fully homomorphic encryption. Cryptology ePrint Archive, Report 2012/144, 2012. http://eprint.iacr.org/2012/144.

15. Craig Gentry. Fully homomorphic encryption using ideal lattices. In Michael Mitzenmacher, editor, 41st ACM STOC, pages 169–178. ACM Press, May / June 2009.


16. Craig Gentry, Shai Halevi, and Nigel P. Smart. Fully homomorphic encryption with polylog overhead. In David Pointcheval and Thomas Johansson, editors, EUROCRYPT 2012, volume 7237 of LNCS, pages 465–482. Springer, Heidelberg, April 2012.

17. Oded Goldreich. Towards a theory of software protection and simulation by oblivious RAMs. In Alfred Aho, editor, 19th ACM STOC, pages 182–194. ACM Press, May 1987.

18. Oded Goldreich, Silvio Micali, and Avi Wigderson. How to play any mental game or a completeness theorem for protocols with honest majority. In Alfred Aho, editor, 19th ACM STOC, pages 218–229. ACM Press, May 1987.

19. Shai Halevi and Victor Shoup. Algorithms in HElib. In Juan A. Garay and Rosario Gennaro, editors, CRYPTO 2014, Part I, volume 8616 of LNCS, pages 554–571. Springer, Heidelberg, August 2014.

20. Shai Halevi and Victor Shoup. Bootstrapping for HElib. In Elisabeth Oswald and Marc Fischlin, editors, EUROCRYPT 2015, Part I, volume 9056 of LNCS, pages 641–670. Springer, Heidelberg, April 2015.

21. HElib: An implementation of homomorphic encryption (1.0.1). https://github.com/shaih/HElib, April 2020. IBM.

22. Myungsun Kim, Hyung Tae Lee, San Ling, Benjamin Hong Meng Tan, and Huaxiong Wang. Private compound wildcard queries using fully homomorphic encryption. IEEE Transactions on Dependable and Secure Computing, 2017.

23. Myungsun Kim, Hyung Tae Lee, San Ling, and Huaxiong Wang. On the efficiency of FHE-based private queries. IEEE Transactions on Dependable and Secure Computing, 15(2):357–363, 2018.

24. Alexander A. Razborov. Lower bounds on the size of bounded depth circuits over a complete basis with logical addition. Mathematical Notes of the Academy of Sciences of the USSR, 41(4):333–338, 1987.

25. Ronald L. Rivest, Len Adleman, and Michael L. Dertouzos. On data banks and privacy homomorphisms. Foundations of Secure Computation, 4(11):169–180, 1978.

26. Nigel P. Smart and Frederik Vercauteren. Fully homomorphic SIMD operations. Des. Codes Cryptography, 71(1):57–81, April 2014.

27. Roman Smolensky. Algebraic methods in the theory of lower bounds for boolean circuit complexity. In Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, pages 77–82, 1987.

28. Dawn Xiaodong Song, David Wagner, and Adrian Perrig. Practical techniques for searches on encrypted data. In 2000 IEEE Symposium on Security and Privacy, pages 44–55. IEEE Computer Society Press, May 2000.

29. Haixu Tang, Xiaoqian Jiang, Xiaofeng Wang, Shuang Wang, Heidi Sofia, Dov Fox, Kristin Lauter, Bradley Malin, Amalio Telenti, Li Xiong, and Lucila Ohno-Machado. Protecting genomic data analytics in the cloud: state of the art and opportunities. BMC Medical Genomics, 9(1):63, 2016.

30. Andrew Chi-Chih Yao. How to generate and exchange secrets (extended abstract). In 27th FOCS, pages 162–167. IEEE Computer Society Press, October 1986.

31. Masaya Yasuda, Takeshi Shimoyama, Jun Kogure, Kazuhiro Yokoyama, and Takeshi Koshiba. Secure pattern matching using somewhat homomorphic encryption. In Ahmad-Reza Sadeghi, Virgil D. Gligor, and Moti Yung, editors, ACM CCS 2013, pages 65–76. ACM Press, November 2013.

32. Masaya Yasuda, Takeshi Shimoyama, Jun Kogure, Kazuhiro Yokoyama, and Takeshi Koshiba. Privacy-preserving wildcards pattern matching using symmetric somewhat homomorphic encryption. In Willy Susilo and Yi Mu, editors, ACISP 14, volume 8544 of LNCS, pages 338–353. Springer, Heidelberg, July 2014.


Chapter 10

Thresholdizing HashEdDSA: MPC to the Rescue

Publication data

Bonte C., Smart N. P. and Tanguy T. Thresholdizing HashEdDSA: MPC to the Rescue. In International Journal of Information Security (Feb. 2021), D. Gollmann, J. Lopez, M. Mambo, Eds., Springer

Contribution: The author of this thesis is a main author of this paper.

Thresholdizing HashEdDSA: MPC to the Rescue

Abstract. Following recent comments in a NIST document related to threshold cryptographic standards, we examine the case of thresholdizing the HashEdDSA signature scheme. This is a deterministic signature scheme based on Edwards elliptic curves. Unlike DSA, it has a Schnorr-like signature equation, which is an advantage for threshold implementations, but it has the disadvantage of having the ephemeral secret obtained by hashing the secret key and the message. We show that one can obtain relatively efficient implementations of threshold HashEdDSA with no modifications to the behaviour of the signing algorithm; we achieve this using a doubly-authenticated bit (daBit) generation protocol tailored for Q2 access structures that is more efficient than prior work. However, if one was to modify the standard algorithm to use an MPC-friendly hash function, such as Rescue, the performance becomes very fast indeed.

1 Introduction

Recent developments, like blockchain, have produced scenarios where creating valid signatures is extremely valuable. Even a single valid signature on an incorrect message can result in catastrophic consequences. This creates a problem that we can describe as fraudulent key usage. Threshold signature schemes can be used to mitigate the risk that an adversary can produce such a valid signature, by distributing signing power to a qualified set for a given access structure. Threshold signature schemes replace the key generation and signing algorithms of a digital signature scheme with an interactive protocol that requires the participation of a certain number of parties to generate the signatures. Hence an adversary would have to corrupt multiple parties in order to generate a valid signature.

Threshold signature schemes were previously studied in [17,22,35,41]. With the emergence of scenarios where creating valid signatures can cause severe threats to the system, interest in threshold signature schemes was renewed and new methods to generate ECDSA signatures were constructed [11,14–16,19–21,31,33,34]. As a consequence, the standardisation body NIST has initiated a Threshold Cryptography project in which they aim to standardise threshold schemes. By making threshold versions of their earlier standardised cryptographic primitives, they aim to develop the ability to distribute trust in operations of the various real systems that already base their security on NIST-approved cryptographic primitives.

In one of the documents NIST published in this effort [10], threshold schemes for several cryptographic primitives are described. Amongst others, the document mentions the Edwards-curve Digital Signature Algorithm (EdDSA). The Edwards-curve digital signature algorithm consists of a deterministic variant of a Schnorr signature based on Edwards curves. In deterministic Schnorr signatures,

the ephemeral secret key is obtained by hashing the secret key and the message. There are many (secure implementation) reasons for using such a variant of Schnorr signatures, even though a verifier is unable to verify if the correct deterministic procedure was indeed carried out. However, this deterministic structure creates a technical difficulty for achieving a corresponding threshold version, as distributing the computation of a hash over different parties is not straightforward. In [10], NIST therefore raised concerns that this difficulty would lead to a longer path of standardisation. In particular they ask in the document

The concrete (deterministic) EdDSA replaces the randomness by a hash of the concatenation of the secret signing key and the message being signed. This creates a technical difficulty for achieving a corresponding threshold interchangeable mode, which may either imply for it a more complex longer path of standardization, or additional possible considerations about the exact intended threshold mode.

On the NIST mailing list associated with their standardization of threshold schemes they also posted a request for further comments:

The preliminary roadmap briefly acknowledges (lines 600–608) the technical difficulty, with respect to thresholdization, associated with EdDSA requiring a hash of the secret key and other material, which would be expensive to obtain in a threshold manner. Specific comments about this are also welcome.

In private communication with the authors, they also expressed interest in whether choosing different hash functions in the EdDSA algorithm could help mitigate these issues, as the official standardization of EdDSA is not yet fully complete.

Note, the standard is not focused on the situation where a system designer has a choice of algorithm to use in an application where threshold signatures are required. In such a situation, the obvious choice is clearly to use standard Schnorr signatures. The proposed standard is focused on situations where an algorithm has already been selected in an application, and the organization performing the signing operations wants to secure the underlying cryptographic key using threshold cryptography.

NIST as an organization also validates cryptographic modules, and thus any module implementation of the threshold variant of the algorithm must output signatures which are identical to those of the non-threshold variant; i.e. exactly the same deterministic signature needs to be produced by the threshold module as would be produced by a normal cryptographic module with the same key. Thus the threshold variant should produce signatures which are perfectly indistinguishable from non-threshold signatures.

A more detailed description of EdDSA is provided in NIST's Digital Signature Standard document [37]. There are two variants of EdDSA; the first obtains the ephemeral secret key using

r = H( H(d) ‖ m )


where d is the signing key and both d, r ∈ Fq. The second variant, called HashEdDSA, hashes the message first, thus we obtain

r = H( H(d) ‖ H(m) )

To transform a signature scheme into a threshold signature scheme, we need to replace the key generation and signing algorithms with interactive protocols between multiple parties. In this work we consider Q2 access structures; a Q2 access structure is one in which no union of two unqualified sets covers the whole set of parties. These quorum based access structures were introduced by Hirt and Maurer [26] and are now commonly used in the MPC literature. For simplicity one can consider threshold (t, n) access structures where t < n/2 and no subset of fewer than t + 1 parties can recover the underlying secret or, more specifically in this setting, a subset of fewer than t + 1 parties cannot forge a signature. Hence in a threshold protocol at least t + 1 parties have to agree on signing a message before a valid signature of this message can be produced using a protocol. The resulting signature is identical to a signature which is obtained in a non-threshold manner, thus the verification algorithm of the signature remains unchanged.

The required security notion is that an active adversary, controlling an unqualified set of parties, who can arbitrarily deviate from the protocol, is unable to learn any information about the secret key or to produce a valid signature. In this work, we concentrate on active security-with-abort as this often is more relevant in practical situations. Formally we require that the adversary cannot determine when interacting with the protocol whether it is interacting with a real protocol, or with a simulator which has access to an ideal signing functionality. In particular, this means that the output signatures from the protocol need to be the same as those produced by the non-threshold ideal signing functionality.

If we apply this to EdDSA, one sees that, since the ephemeral signing key r needs to be kept secret, one needs to compute the hash via a form of multi-party computation (MPC) in order to create a signature. This is the biggest challenge in transforming the EdDSA signature to a threshold signature. As hashing in an MPC setting will be the most expensive part of our protocol, EdDSA will be slow if it is used on long messages, and hence we focus on the HashEdDSA variant of the protocol.

1.1 Our Contribution

We address the concerns raised by NIST in [10] by showing that, with some (minor) changes to existing MPC techniques, it is possible to create a threshold version of the standardised HashEdDSA signing algorithm. Since the long term and ephemeral secrets of HashEdDSA lie in the finite field Fq (for q a large prime), it makes a lot of sense to utilize a generic MPC framework for Fq to do most of the heavy elliptic curve operations; see [15, 42] where this idea has been used before. However, to evaluate the SHA512/SHAKE256 hash functions, it is more convenient to work with binary numbers. We will apply two different strategies to compute these hash functions.


Our first strategy is to securely evaluate the hash function using a Garbled Circuit (GC) based approach for n parties. In particular, we use a variant of the HSS protocol [25]. This produces an additive authenticated bit-wise sharing of the hash function output between the n parties. Our second approach is to evaluate the hash function circuit using a secret sharing based approach over F2. In particular, we adapted the method of Araki et al. [6], which focuses on the threshold case of (n, t) = (3, 1), to work with any multiplicative Q2 access structure instantiated by a replicated secret sharing scheme. These methods are explained in Section 3, and are to be incorporated in the v1.11 release of SCALE-MAMBA [4].

We then convert these bit-sharings into an Fq-sharing for the desired Q2 access structure using a modification of the daBit procedure from [5, 40]. A daBit is a doubly-authenticated bit, namely a bit b which is secret shared in two different secret sharing schemes with respect to potentially two different moduli, e.g. 〈b〉p1 and 〈b〉p2.

The most efficient daBit generation procedure is given in [39]. But this procedure is focused on producing daBits which are shared with respect to two large primes p1 and p2, with respect to a full threshold secret sharing scheme for both moduli. In our work, we need to generate daBits in the 'classic' setting where p1 = 2 and p2 = q is a large prime, but where the sharing modulo p1 = 2 is full threshold and the sharing modulo p2 = q is with respect to a Q2 access structure. Thus in Section 4, we provide a variant of the method in [39] which works in our situation. This variant works by only requiring a qualified subset of the parties to contribute to the computation of the so-called correction terms in the big field. We also have to pay attention to how each party in this qualified subset contributes to this computation using its share of the bit and the recombination vector of the underlying secret sharing scheme.

In Section 5, we present an actively secure threshold variant of the standard HashEdDSA. Unlike threshold variants of standard (non-deterministic) DSA or Schnorr, we do not use zero-knowledge proofs to ensure correctness of the values produced by the adversary. This is because we rely on the underlying actively UC-secure MPC protocol to enable extraction of adversarial input for our simulation. In particular, it is important that we use an underlying MPC protocol which is UC-secure. We also provide specimen runtimes for our threshold protocols and compare these with timings from other MPC frameworks that allow similar setups in Section 7.

The main problem, and the most expensive part of our threshold protocol, is the need to evaluate a hash function which is designed to operate on binary data and then convert it into data which is represented as elements in Fq. Thus, we also investigate in Section 6 the effect of replacing the standard hash functions SHA512 and SHAKE256 used in HashEdDSA with recently developed, more MPC-friendly hash functions, to estimate the effect of this change on the efficiency of the threshold signing algorithm. We choose this modification as the MPC-friendly hash functions look promising for improving the efficiency of our setup and we consider changing the hash function a non-invasive change to the


standard that can easily be carried out in real-life systems currently using this standardised signing algorithm. In particular, we examine the Rescue hash function given in [3] as this is a hash function particularly suited to operating on Fq data. In Section 7, we also give experimental runtimes for executing Rescue in this context.

Why focus on Q2 and not full threshold? Our focus in this paper is on multiplicative Q2 access structures and not full threshold access structures for the following reason. A full threshold access structure poses additional complexity when utilized for MPC; in particular, the full threshold access structure cannot be associated with a multiplicative secret sharing scheme. This means that it is not possible to compute an additive sharing of the product of two secrets by performing only local computations. Therefore, the input-independent preprocessing phase needs to rely on more complex machinery such as Homomorphic Encryption [18] or Oblivious Transfer [29]. The most efficient method is to use Homomorphic Encryption, but here we require special properties of the underlying prime modulus q. These conditions are not satisfied for the two curve parameters in the NIST standards. On the other hand, using Oblivious Transfer one can design an offline phase for any prime modulus q, but at a relatively large cost. Thus, we focus on the case of Q2 access structures only.

2 Preliminaries

As in [37], we will consider two variants of HashEdDSA, based on the Edwards curves Ed25519 and Ed448. Both are variants of a Schnorr signature based on twisted Edwards curves. They use, however, different curves, hash functions and bitsizes, which means they provide different security levels. We will briefly describe Prehash EdDSA (HashEdDSA) for both Ed25519 and Ed448 signatures here. This is the version of EdDSA where the ephemeral key is generated from the hash of the message rather than from the message itself. For more details on EdDSA, we refer the reader to the original paper [8]. An interested reader can find a generalised version of EdDSA in [9].

First, we define the notation for the parameters needed in these signatures. Let λ be the required security level. Let b be the number of bits of the public HashEdDSA keys. Then the HashEdDSA signatures will consist of exactly 2b bits. This value b is always a multiple of 8 and will hence be considered as a string of octets. We let H denote the hash function used in HashEdDSA; for Ed25519 this is SHA512 and for Ed448 this is SHAKE256. HashEdDSA also relies on the parameters of the Edwards curve. Let G be a base point of prime order on the curve with coordinates (xG, yG). The order of the point G will be denoted by q. The private key of the signature scheme is denoted by sk and the public key by pk. The Edwards curve is itself defined over the prime field Fp.


2.1 The Signature Algorithms

Having defined the parameters and encoding/decoding techniques used in the EdDSA signature schemes, we now clarify the prehashed version of the Ed25519 signature scheme in Figure 1 and the prehashed version of the Ed448 signature scheme in Figure 2. The algorithms make use of octet string encodings of elliptic curve points. This is done using a form of point compression tailored to the case of Edwards curves; for details see [37]. For our purposes, we note that the encoding of a point for Ed25519 requires 32 octets, whilst that for Ed448 requires 57 octets.

Hashed Ed25519 Signature Algorithm

KeyGen(b, λ):

1. Use a random bit generator to obtain a string of b = 256 bits. The private key sk equals this string of 256 bits.
2. Compute the hash H of the private key sk with SHA512, which results in a bitstring of length 512, i.e. H(sk) = SHA512(sk) = (h0, h1, h2, . . . , h2b−1).
3. Use HL(sk), the first half of H(sk), to generate the public key by setting the first three bits of the first octet and the last bit of the last octet to zero and setting the second last bit of the last octet to one. Hence we set h0 = h1 = h2 = hb−1 = 0 and hb−2 = 1. Determine from this new bitstring an integer s ∈ Fq using the little-endian convention.
4. Compute Q = [s]G; the corresponding public key pk is the encoding of Q.

Sign(m, sk, pk):

1. Taking the second half of the hash value computed above, we set HR(sk) = hb || hb+1 || . . . || h2b−1 and compute with it r = SHA512(HR(sk) || SHA512(m)). This r will be 64 octets long, and we treat it as a little-endian integer modulo q.
2. Compute the encoding R of the point [r]G.
3. Define S as the encoding of r + SHA512(R || pk || SHA512(m)) · s (mod q).
4. The signature is constructed as the concatenation of R and S.

Verify(R||S, m, pk):

1. Split the signature into two equal parts and decode the first half R as a point and the second half S as an integer s. Verify that s lies in the half open interval [0, q). Decode the public key pk into a point. Reject the signature if any of the decodings fail.
2. Create a bit string of the concatenation of the octet strings R, pk, m, and HashData = R || pk || SHA512(m).
3. Compute SHA512(HashData) and interpret this bit string as a little-endian integer t.
4. Verify the equation [2^3 · S]G = [2^3]R + [2^3 · t]pk. Reject the signature if this verification fails, otherwise accept the signature.

Figure 1. Hashed Ed25519 Signature Algorithm


Hashed Ed448 Signature Algorithm

KeyGen(b, λ):

1. Use a random bit generator to obtain a string of b = 456 bits. The private key sk equals this string of 456 bits.
2. Compute the hash H of the private key sk with SHAKE256, which results in a bitstring of length 912, i.e. H(sk) = SHAKE256(sk, 912) = (h0, h1, h2, . . . , h2b−1).
3. Use HL(sk), the first half of H(sk), to generate the public key by setting the first two bits of the first octet and all eight bits of the last octet to zero and setting the last bit of the second to last octet to one. Hence we set h0 = h1 = hb−8 = . . . = hb−1 = 0 and hb−9 = 1. Determine from this new bitstring an integer s using the little-endian convention.
4. Compute Q = [s]G; the corresponding public key pk is the encoding of Q.

Sign(m, sk, pk, context): A string context with maximum length of 255 octets is set by the signer and verifier; by default this string context is the empty string.

1. Taking the second half of the hash value of the private key sk computed above, we set HR(sk) = hb || hb+1 || . . . || h2b−1. This value can already be precomputed.
2. We define dom4(f, c) to be the value

   (SigEd448 || octet(f) || octet(octetlength(c)) || c),

   where the string SigEd448 is 8 octets in ASCII, the value octet(f) is the octet with f a value in the range 0 − 255 and octetlength(c) is the number of octets in the string c. We compute r as the value

   r = SHAKE256(dom4(0, context) || HR(sk) || SHAKE256(m, 912), 912).

   This r will be 114 octets long, and is treated as an integer modulo q.
3. Compute the encoding R of the point [r]G.
4. The value S is defined as the encoding of r + SHAKE256(dom4(0, context) || R || pk || SHAKE256(m, 912), 912) · s (mod q).
5. The signature is constructed as the concatenation of R and S.

Verify(R||S, m, pk, context): A string context with maximum length of 255 octets is set by the signer and verifier; by default this string context is the empty string.

1. Split the signature into two equal parts and decode the first half R as a point and the second half S as an integer s. Verify that s lies in the half open interval [0, q). Decode the public key pk into a point. Reject the signature if any of the decodings fail.
2. Create a bit string of the concatenation of the octet strings R, pk, m, and HashData = R || pk || SHAKE256(m, 912).
3. Compute SHAKE256(dom4(0, context) || HashData, 912) and interpret this bit string as a little-endian integer t.
4. Verify the equation [2^2 · S]G = [2^2]R + [2^2 · t]pk. Reject the signature if this verification fails, otherwise accept the signature.

Figure 2. Hashed Ed448 Signature Algorithm


Ed25519 and SHA512. We need to determine the number of SHA512 calls we will need to compute using MPC in the Ed25519 signing algorithm. From the description of SHA512, we know the blocksize is 1024 bits and that some preprocessing is performed on the input of the hash function. In the preprocessing of the input data of SHA512, one bit and a 128-bit string encoding the length of the message are appended to the message.

The description of Ed25519 includes 3 calls to the hash function. The first occurrence computes the hash of the secret key H(sk). Remember that we can precompute this value, so we should not take this instance of the hash function into account. In the signing protocol, we first need to compute the hash of the message H(m). The number of hash calls needed for this computation depends on the size of the message. If l is the length of the message we want to hash, the number of hash calls we need, taking the preprocessing of the message before hashing into account, is ⌈(l + 1 + 128)/1024⌉. This can be computed in the clear since all parties know the message.

Afterwards, we need to compute another hash with as input the second half of the resulting bitstring of H(sk), the hash of the secret key, concatenated with the hash of the message H(m). Since SHA512 outputs a bitstring of 512 bits, the input size of this last hash call is 256 + 512 + 1 + 128 = 897 bits, taking into account the length padding needed in a standard call to SHA512. This is smaller than the blocklength, hence this will only require one call to the underlying compression function.
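The block-count arithmetic above can be checked with a few lines of Python; the helper below simply evaluates ⌈(l + 1 + 128)/1024⌉ and the 897-bit final input, and is only a sanity check of the counting argument, not part of the protocol.

    from math import ceil

    def sha512_blocks(msg_bits):
        # +1 padding bit and +128 length bits, 1024-bit blocks
        return ceil((msg_bits + 1 + 128) / 1024)

    print(sha512_blocks(10_000))               # e.g. a ~10 kbit message needs 10 compression calls
    assert 256 + 512 + 1 + 128 == 897          # the final, secret hash input fits in one block
    assert sha512_blocks(256 + 512) == 1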

Ed448 and SHAKE256. We can carry out a similar analysis for the usage of SHAKE256 in the Ed448 signing algorithm. Note that SHAKE256 always takes a second parameter which defines how many bits of output this application of SHAKE256 should produce; this should be contrasted with SHA512, which always produces a block of 512 bits as output. In SHAKE256, the suffix 1111 is appended to the message before padding; then the padding needs to ensure the message can be divided into blocks of bitsize 1088. Hence zeros are attached to the message in order to make the message size a multiple of 1088. However, the final bit added is a 1 and not a 0. Therefore, we have to add at least 5 ones to the message. If we consider a message of length l, we will have to process ⌈(l + 5)/1088⌉ blocks, which equals the number of calls to the permutation function of SHAKE256.

Just as for the case of Ed25519 above, Ed448 uses three calls to this hash function. The first is to compute the hash of the secret key, which can again be done in preprocessing. The second is to compute the hash of the message; depending on the length l of the message we need ⌈(l + 5)/1088⌉ calls to SHAKE256 to compute this.

The final, crucial for us, hash function call is applied to the concatenation of dom4(0, context), half of the bitstring encoding the hash of the secret key and the hash of the message. For the default setting where context is the empty string, the length of the message we want to sign here is 80 + 456 + 912 + 5 = 1453 bits, which corresponds to the need for 2 blocks. For the maximal size of context,


which corresponds to 255 octets, we need 2120 + 456 + 912 + 5 = 3493 bits, which corresponds to needing 4 blocks. Thus we need to apply the SHAKE256 permutation function between two and four times.
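A similar sanity check for the SHAKE256 counts, using the ⌈(l + 5)/1088⌉ rule from the text (the exact suffix and padding details are abstracted away exactly as they are above):

    from math import ceil

    def shake256_blocks(msg_bits):
        # number of SHAKE256 permutation calls, using the +5 padding allowance above
        return ceil((msg_bits + 5) / 1088)

    assert shake256_blocks(80 + 456 + 912) == 2        # empty context: 1453 bits -> 2 blocks
    assert shake256_blocks(2120 + 456 + 912) == 4      # 255-octet context: 3493 bits -> 4 blocks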

3 MPC Functionalities

Our key MPC functionality needs to process shared values in Fq as well as evaluate binary garbled circuits using the HSS protocol [25]. We therefore adopt the Zaphod framework from [5].

Let P = {P1, . . . , Pn} be a set of parties, and let Γ, ∆ ⊆ 2^P be respectively the monotonically increasing set of qualified sets and the monotonically decreasing set of unqualified sets. Then if Γ ∩ ∆ = ∅, (Γ, ∆) defines a monotone access structure. For our purposes, we only consider complete monotone access structures, that is those for which ∆ = 2^P \ Γ holds. Finally, the access structure is said to be Q2 if no union of two sets in ∆ is the whole set of parties P.

We let 〈·〉q denote a linear secret sharing scheme (LSSS) over the finite field Fq which realizes a Q2 access structure. We restrict the set of supported LSSS to ones which are multiplicative, which means that given 〈x〉q and 〈y〉q the value x · y can be expressed as a linear combination of local products of shares. Since the access structure is Q2, the shares are in some sense automatically authenticated through redundancy among the honest players, thus one can easily define actively secure MPC (with abort) protocols for such a secret sharing scheme, see e.g. [13, 43].

The sharing 〈·〉q is defined via monotone span programs (MSPs), which were first introduced by Karchmer and Wigderson [27]. Let M ∈ Fq^{m×k} be a matrix, choose a non-zero "target" vector t ∈ Fq^k and a surjective index function ι : {1, . . . , m} → {1, . . . , n}. If we consider for example a Shamir based Q2 access structure, the number of shares m is equal to the number of parties n, and if we consider the (3, 1)-threshold replicated secret sharing, the number of shares m equals 6. To share a secret s using this matrix, "target" vector and index function, the dealer samples a random vector k ∈ Fq^k such that t · k^T = s ∈ Fq, sets s = (s1, . . . , sm) = M · k, and for each j ∈ [m] computes i = ι(j) and sends sj to party Pi over a synchronous secure channel^1. The matrix is chosen in such a way that for any qualified set of parties Q ∈ Γ, there is a (public) recombination vector rQ that, given the share vector s (i.e. the concatenation of shares held by the qualified set of parties), can recover the secret by computing s = rQ · s^T (mod q). To authenticate a set of shares there is a public parity check matrix H, which for a valid set of shares s will satisfy H · s^T = 0. Unfamiliar readers are referred to [30] for a more detailed introduction to MSPs.

As mentioned above, we will use two different strategies to compute the hash functions over F2. For the HSS GC-based protocol [25], we use a full threshold authenticated sharing of bits 〈·〉2, according to the pairwise BDOZ-style MAC introduced by Bendlin et al. [7]. We extend the sharing of bits 〈x〉2 to sharings of vectors of bits 〈x〉2 in the obvious manner.

1 Standard TLS satisfies the properties we need for our secure channels.


For the Q2 based protocol for replicated secret sharing we use a generalisation of the method of [6]. Thus, when working with a Q2 structure modulo q for the main sharing 〈·〉q above, we select a replicated secret sharing with the same access structure to obtain a similar sharing over F2. For the protocol to execute MPC over F2, we use Maurer's passively secure multiplication protocol [36] to generate passively secure multiplication triples over F2. These are then turned into actively secure triples using the method of [6, Protocol 3.1]. Finally the triples are consumed in a standard GMW/CCD style online phase, using the methodology of [43] to ensure active security, whilst using the techniques of [30] to reduce the total amount of communication performed. This combination of techniques works for any replicated secret sharing scheme over F2, and produces an online phase requiring the least number of rounds^2.
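To illustrate how such a triple is consumed in the online phase, the sketch below runs the standard Beaver-triple multiplication of two XOR-shared bits in the clear; it uses a simplified full-threshold additive sharing and generates the triple locally, so it is only a toy model of the preprocessing/online split, not the replicated-sharing protocol of [6].

    import secrets

    n = 3  # number of parties in this toy additive XOR-sharing

    def share_bit(b):
        s = [secrets.randbelow(2) for _ in range(n - 1)]
        return s + [b ^ (sum(s) % 2)]

    def open_bit(shares):
        return sum(shares) % 2

    def beaver_mul(xs, ys):
        # one triple (a, b, c) with c = a AND b, standing in for the checked preprocessing
        a, b = secrets.randbelow(2), secrets.randbelow(2)
        a_sh, b_sh, c_sh = share_bit(a), share_bit(b), share_bit(a & b)
        d = open_bit([x ^ ai for x, ai in zip(xs, a_sh)])    # d = x + a is opened
        e = open_bit([y ^ bi for y, bi in zip(ys, b_sh)])    # e = y + b is opened
        z = [ci ^ (d & bi) ^ (e & ai) for ci, ai, bi in zip(c_sh, a_sh, b_sh)]
        z[0] ^= d & e                                        # constant term added by one party
        return z                                             # sharing of x AND y

    for x in (0, 1):
        for y in (0, 1):
            assert open_bit(beaver_mul(share_bit(x), share_bit(y))) == (x & y)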

The different types of shares and their functionalities are combined by applying the Zaphod framework [5] using the functionality in Figure 3. Note that, unlike the paper [5], we are only interested in converting from binary sharings 〈·〉2 to an equivalent shared bit in the 〈·〉q-world. We also allow the circuits Cf to produce multiple outputs. Each value in FMPC is uniquely identified by an identifier varid ∈ I, where I is a set of valid identifiers, and a domain set domain ∈ {Fq, F2}. Note that the functionality is modelled in such a way that it is independent of the details of the authentication technique used. In addition, the functionality captures all the MPC computations we will require from a system such as SCALE-MAMBA. Note that we will be using the reactive nature of the MPC functionality as we will be calling it to perform the key generation, as well as repeatedly calling it for the signing operation.

In expressing algorithms based on this functionality, we shall use the shorthand 〈x〉q to denote an item stored in a varid for domain Fq and 〈x〉2 to denote an item stored in a varid for domain F2. The functionalities and protocols expressed in this work require all parties to participate in order to produce an output. However, a qualified set of parties should suffice to request and generate an output; the formalization of this setup would be an extension of our paper, which is left for future work.

2 Note, other methodologies can reduce the total number of rounds or the total number of multiplications, i.e. when considering the online and offline phases as one.


Functionality FMPC

The functionality runs with parties P1, . . . , Pn and an ideal adversary Adv. Let A be the set of corrupt parties. Given a set I of valid identifiers, all values are stored in the form (varid, domain, x), where varid ∈ I, domain ∈ {F2, Fq} and x ∈ domain.

Initialize: On input (Init) from all parties, the functionality activates. If (Init) was received before, do nothing.

Input: On input (Input, Pi, varid, domain, x) from Pi and (Input, Pi, varid, domain) from all other parties, with varid a fresh identifier, store (varid, domain, x).

Random: On input (Random, varid, domain) from all parties, with varid a fresh identifier, generate a uniformly random value x ∈ Fq or x ∈ F2 depending on the value of domain and store (varid, domain, x).

Evaluate: Upon receiving ((varid_j)_{j∈[nI]}, (varid_i)_{i∈[nO]}, domain, Cf) from all parties, where f : domain^{nI} → domain^{nO} and the varid_i are all fresh identifiers, if the {varid_j}_{j∈[nI]} were previously stored, proceed as follows:
1. Retrieve (varid_j, domain, x_j) for each j ∈ [nI].
2. For each i ∈ [nO] store (varid_i, domain, y_i) where (y_1, . . . , y_{nO}) ← f(x_1, . . . , x_{nI}).

Output: On input (Output, varid, domain, type) from all parties (if varid is present in memory):
1. If type = 0 (Public Output): Retrieve (varid, domain, y) and send y to Adv. If the adversary sends Deliver, send y to all parties.
2. Otherwise type = i (Private Output): Wait for the adversary. If the adversary sends Deliver, retrieve (varid, domain, y) and send y to Pi.

Abort: The adversary can at any time send abort, upon which send abort to all honest parties and halt.

Convert: On input (Convert, varid1, F2, varid2, Fq):
1. Retrieve (varid1, F2, x) and convert x ∈ {0, 1} to an element y ∈ Fq by setting y = x.
2. Store (varid2, Fq, y).

Figure 3. The ideal functionality for MPC with Abort over Fq and F2 - Evaluation


4 Improved daBit Technique for Q2-LSSS

In the following we want to adapt the daBit scheme to generate daBits to switch between binary shares and any multiplicative LSSS with a Q2 access structure, following the method by Rotaru and Wood [40]. This base protocol was improved in [5], and then again in [39]. However, the latter improvement was only in the case of producing daBits between two full threshold LSSS systems, both over large prime moduli. Since our interest lies in Q2 access structures, we need to modify the technique in [39] for the case where q1 is the prime modulus of a Q2 LSSS and q2 = 2.

We note that to realize the GC protocol implemented in the SCALE-MAMBA [4] framework, and described in [25], the sharing in F2 is a full threshold sharing irrespective of the access structure of the LSSS in Fq. As an alternative strategy, as explained earlier, we can utilize a replicated secret sharing scheme to implement the Q2 access structure over F2. In either case the daBit protocols and the conversion scheme are identical; all that differs is how shared bits are represented and authenticated.

We recall that we denote by 〈·〉q a sharing in the multiplicative LSSS and by 〈·〉2 a full threshold sharing (GC) or honest majority sharing (mod 2 sharing) in F2 (with all sharings authenticated). The protocol follows closely the one described in [39, Figure 6] as we want to achieve the same goal. In our case, the smallest modulus we work with is qmin = 2. As is the case in [39], our protocol requires that qmin^γ > 2^sec. We therefore set γ = sec + 1, where sec is the security parameter (typically sec = 128). Informally, the protocol asks the parties to generate a random bit in the LSSS through a call to the GenBit protocol, which extends the FMPC functionality and for which we give the ideal functionality in Figure 4 and its secure realisation in Figure 5. The protocol also makes a call to the ideal functionality F_Rand^{2^sec}, which is described in Figure 6. The ideal functionalities and the protocol are taken from [39] and are given here for completeness.

Functionality FMPC.GenBit()

1. For each corrupt party Pi, the functionality waits for inputs bi ∈ Fq.
2. The functionality waits for a message abort or ok from the adversary. If the message is ok then it continues.
3. The functionality then samples a bit b ∈ {0, 1} and completes the sharing to b = Σ_i bi by selecting shares for the honest parties.
4. The (authenticated) shares are passed to the honest parties.
5. The bit b is stored in the functionality FMPC.

Figure 4. The ideal functionality for single random bits

Then one party is given enough information to compute, with probability at least 1 − 1/q, how many 'wrap arounds' modulo q the sharing of these random


Entering a Random Bit Π^q_MPC.GenBit()

1. For i = 1, . . . , n execute FMPC.Input such that Pi inputs xi and all parties obtain the share of 〈xi〉q, where xi is a random element in Fq.
2. 〈x〉q ← Σ_{i=1}^n 〈xi〉q.
3. 〈y〉q ← 〈x〉q · 〈x〉q.
4. Execute FMPC.Output to publicly reveal y from its sharing 〈y〉q.
5. If y = 0 then restart the process.
6. z ← √y, picking the value z ∈ [0, . . . , q/2).
7. 〈a〉q ← 〈x〉q / z.
8. 〈b〉q ← (〈a〉q + 1)/2.
9. Return 〈b〉q.

Figure 5. 'Standard' method to produce a shared random bit in Π^q_MPC

bits induced. Therefore, this party knows what its input should be modulo 2 to correct for the shares held by the other parties. We therefore end up with a sharing of the same bit modulo q and modulo 2. Finally, to check that the protocol was correctly executed, the parties open γ = sec + 1 linear combinations of the same random bits both modulo q and modulo 2 and make sure that the linear combinations are equal modulo 2.
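The arithmetic behind Figure 5 can be checked in the clear with a toy prime q ≡ 3 (mod 4), for which the square root in step 6 is a single exponentiation; the sketch below is purely illustrative (no MPC, hypothetical modulus) and relies on Python 3.8+ modular inverses.

    import random

    q = 2**127 - 1                      # Mersenne prime, q = 3 (mod 4)
    x = random.randrange(1, q)          # step 2: the (nonzero) sum of the parties' inputs
    y = x * x % q                       # step 3: y = x^2, publicly revealed in step 4
    z = pow(y, (q + 1) // 4, q)         # step 6: a square root of y ...
    if z >= (q + 1) // 2:
        z = q - z                       # ... normalised into [0, q/2)
    a = x * pow(z, -1, q) % q           # step 7: a = x / z, equal to +1 or -1 mod q
    b = (a + 1) * pow(2, -1, q) % q     # step 8: b = (a + 1) / 2, equal to 0 or 1
    assert a in (1, q - 1) and b in (0, 1)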

Functionality F_Rand^B(M)

1. On input (Rand, cnt) from all parties, if the counter value is the same for all parties and has not been used before, the functionality samples ri ← [0, B) for i = 1, . . . , M.
2. The values ri are sent to the adversary, and the functionality waits for an input.
3. If the input is Deliver then the values ri are sent to all parties, otherwise the functionality aborts.

Figure 6. The ideal F_Rand^B(M) functionality

We note that the only differences between our variant of the protocol presented in Figure 7 and the one from [39] lie in step 2. In fact, because in our protocol we consider any multiplicative LSSS with a Q2 access structure, we do not require all parties to take part in the computation of the correction terms ki. All we need is that enough of them, that is any set of parties in Γ, help P1 in doing so. That is, we select a set Q1 which is the smallest qualified set which contains party P1. Moreover, the definition of the bits bi,j slightly changes from the original protocol as we need to take into account the reconstruction vector for the set Q1, and not only the raw shares. We stress here that for the full threshold case the full set of parties is the only subset that can reconstruct a sharing. Therefore, if one considers Q1 = P and rQ1 = 1, then our definition of bi,j is identical to the one given in [39]. This alteration means that the proof of


security from [39], which relies on assuming a variant of the subset sum problem, also applies to our modification.

Protocol ΠdaBit

Let Q1 be the smallest set in Γ that contains P1.

1. Set δ = ⌈q/|Q1|⌉.
2. For i = 1, . . . , m + γ · sec do
   (a) 〈bi〉q ← FMPC.GenBit().
   (b) For j ∈ Q1 let bi,j = 〈rQ1^{Pj}, 〈bi〉q^{Pj}〉 denote party Pj's value such that Σ_{j∈Q1} bi,j = bi (mod q). Such a bi,j always exists by definition of our LSSS.
   (c) For j ∈ Q1, party Pj writes bi,j = li,j + δ · hi,j with 0 ≤ li,j < δ.
   (d) For j ∈ Q1, party Pj sends hi,j to party P1.
   (e) Party P1 sets ki = ⌈ δ · Σ_{j∈Q1} hi,j / q ⌉.
   (f) Parties re-randomize the sharing in Fq:
       - All parties execute FMPC.Input, for j ∈ Q1, such that Pj inputs bi,j (mod q) and all parties obtain the sharing 〈bi^(j)〉q.
       - The parties compute 〈bi〉q = Σ_{j∈Q1} 〈bi^(j)〉q.
   (g) Parties re-share the same bit in F2:
       - All parties execute FMPC.Input such that P1 inputs bi,1 − ki · q (mod 2) and all parties obtain the sharing 〈bi^(1)〉2.
       - All parties execute FMPC.Input, for j ∈ Q1 \ {P1}, such that Pj inputs bi,j (mod 2) and all parties obtain the sharing 〈bi^(j)〉2.
       - The parties compute 〈bi〉2 = Σ_{j∈Q1} 〈bi^(j)〉2.
3. The parties initialize an instance of the functionality F_Rand^{2^sec}. [Implementation note: this needs to be done after the previous step so the parties have no prior knowledge of the output.]
4. For j = 1, . . . , γ do
   (a) For i = 1, . . . , m + γ · sec generate ri,j ← F_Rand^{2^sec}(m + γ · sec).
   (b) Compute the sharing 〈Sj,2〉2 = Σ_i ri,j · 〈bi〉2.
   (c) Compute 〈Sj〉q = Σ_i ri,j · 〈bi〉q.
   (d) Execute FMPC.Output to publicly reveal Sj,2 from its sharing 〈Sj,2〉2.
   (e) Execute FMPC.Output to publicly reveal Sj from its sharing 〈Sj〉q.
   (f) Abort if Sj (mod 2) ≠ Sj,2.
5. Output 〈bi〉q and 〈bi〉2 for i = 1, . . . , m.

Figure 7. Method to produce m shared daBits
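The correction-term computation in steps 2(c)–2(g) can be illustrated in the clear: the sketch below (toy modulus, plain additive shares standing in for the reconstruction-weighted values bi,j) shows that subtracting ki · q from P1's mod-2 input makes the mod-2 shares sum to the shared bit, except with probability about 1/q.

    import random

    q = 2**61 - 1                 # hypothetical odd prime modulus
    parties = 3                   # |Q1|, the qualified set helping P1
    delta = -(-q // parties)      # ceil(q / |Q1|)

    b = random.randint(0, 1)
    shares = [random.randrange(q) for _ in range(parties - 1)]
    shares.append((b - sum(shares)) % q)          # sum(shares) = b (mod q)

    h = [s // delta for s in shares]              # bi,j = li,j + delta * hi,j
    k = -(-(delta * sum(h)) // q)                 # P1's estimate of the number of wraps

    bits = [(shares[0] - k * q) % 2] + [s % 2 for s in shares[1:]]
    assert sum(bits) % 2 == b                     # fails only with probability about 1/q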


5 Threshold Variant of Standard HashEdDSA

We now present our threshold variant of HashEdDSA and show that it is secure when instantiated with a UC-secure instantiation of the MPC functionality from Figure 3. We concentrate on the HashEdDSA signature based on the Ed25519 curve given in Figure 1, with the case of Ed448 being virtually identical (bar using a different hash function and a different elliptic curve). We also focus on the KeyGen and Sign algorithms, as Verify is fixed irrespective of whether one signs with a threshold variant or not. Our goal is to realise the functionality given in Figure 8.

Distributed Signature Functionality: FSign

We let A denote the set of parties controlled by the adversary.

KeyGen: This proceeds as follows:
1. The functionality generates a public/private key pair; s ∈ Fq and pk = [s]G, and the value pk is output to the adversary.
2. The functionality waits for an input from the adversary.
3. The adversary returns with either abort or deliver. If deliver, the functionality returns pk to the honest parties, otherwise it aborts.

Sign: On input of the same message m from all parties the functionality proceeds as follows:
1. The functionality waits for an input from the adversary.
2. If the input is not abort then the functionality generates a signature σ on the message m.
3. The signature is returned to the adversary, and the functionality again waits for input. If the input is again not abort then the functionality returns σ to the honest parties.

Figure 8. Distributed Signature Functionality: FSign

The two relevant thresholdized algorithms are given in Figure 9. They are presented in the FMPC-hybrid model. At two points in the algorithm we need to take a sharing 〈s〉q and output in public the value Q = [〈s〉q]G. This is done using the following process:

- For each element sj in the share vector held by party Pi, the party publishes Qj = [sj]G.
- The parties compute Q = r · (Q1, . . . , Qm)^T for the reconstruction vector r of the underlying MSP.
- The parties also verify the output is correct by checking that H · (Q1, . . . , Qm)^T = 0, and aborting if this does not hold.


This algorithm hides the underlying secret s, assuming the discrete logarithm problem on the elliptic curve is hard, as s and the shares of s come from a high entropy distribution. Note that the abort test on the shares Qj above using the matrix H is the reason why we have a potential abort in our Key Generation functionality in Figure 8.
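A minimal sketch of this opening procedure is given below; the elliptic-curve group is replaced by the multiplicative group modulo a toy prime and a plain additive sharing with recombination vector (1, . . . , 1) is used, so the parameters are illustrative assumptions and not the scheme used in the protocol.

    import random

    p, g = 2**61 - 1, 3          # hypothetical group; g plays the role of the base point G
    q = p - 1                    # exponents are taken modulo the group order here
    n = 3

    s = random.randrange(q)
    shares = [random.randrange(q) for _ in range(n - 1)]
    shares.append((s - sum(shares)) % q)

    Q_parts = [pow(g, sj, p) for sj in shares]   # each party publishes Q_j = [s_j]G
    Q = 1
    for Qj in Q_parts:
        Q = Q * Qj % p                           # Q = prod_j Q_j^{r_j} with r_j = 1
    assert Q == pow(g, s, p)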

Threshold Variant of the Hashed Ed25519 Signature Algorithm

KeyGen(b, λ): This algorithm proceeds as follows:
1. Call FMPC.Random a total of 256 times to generate shared random bits 〈sk〉2 = {〈di〉2}_{i=0}^{255}.
2. For the circuit Cf = SHA512 compute the hash value SHA512(〈sk〉2) = (〈h0〉2, 〈h1〉2, . . . , 〈h511〉2) by calling FMPC.Evaluate.
3. Apply FMPC.Convert to convert 〈hi〉2 for i = 3, . . . , 253 to 〈hi〉q, and then set 〈s〉q = 2^254 + Σ_{i=3}^{253} 2^i · 〈hi〉q.
4. Compute the point [〈s〉q]G using the method described in the text. The corresponding Ed25519 public key pk is the encoding of this point.

Sign(m, sk, pk): Signing proceeds as follows:
1. We compute the hash of the message m in the clear to obtain SHA512(m) = (h'0, h'1, . . . , h'511).
2. We then apply the SHA512 circuit using FMPC.Evaluate to the bitstring (〈h256〉2, 〈h257〉2, . . . , 〈h511〉2, h'0, h'1, . . . , h'511), consisting of 256 unknown bits and 512 known bits. Note, this requires only one iteration of the SHA512 compression function. Let the output be (〈r0〉2, 〈r1〉2, . . . , 〈r511〉2).
3. We apply FMPC.Convert to convert 〈ri〉2 for i = 0, . . . , 511 to 〈ri〉q and set 〈r〉q = Σ_{i=0}^{511} 2^i · 〈ri〉q.
4. Compute the point [〈r〉q]G using the method above; the result is opened to all parties. The resulting point is converted to a public octet string R.
5. The parties compute 〈S〉q = 〈r〉q + e · 〈s〉q where e = SHA512(R || pk || SHA512(m)) can be computed publicly.
6. Finally 〈S〉q is opened to all parties.

Figure 9. Threshold Variant of the Hashed Ed25519 Signature Algorithm
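Step 3 of KeyGen is just the usual Ed25519 clamping of the first half of SHA512(sk), performed on shared bits; the sketch below checks this equivalence in the clear on a randomly chosen (hypothetical) key, outside any MPC.

    import hashlib
    import secrets

    sk = secrets.token_bytes(32)                 # 256-bit private key
    h = hashlib.sha512(sk).digest()

    def bit(i):
        # little-endian bit h_i of H(sk)
        return (h[i // 8] >> (i % 8)) & 1

    s = 2**254 + sum(2**i * bit(i) for i in range(3, 254))

    # equivalent byte-level clamping used by Ed25519 implementations
    a = bytearray(h[:32])
    a[0] &= 248
    a[31] &= 127
    a[31] |= 64
    assert s == int.from_bytes(a, "little")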

Theorem 1. The protocol in Figure 9 securely realises the distributed signing functionality given in Figure 8 in the FMPC-hybrid model, assuming the discrete logarithm problem is hard.

Proof. The simulator has access to the functionality FMPC and thus when simulating the KeyGen procedure it executes steps 1–3 of the threshold KeyGen procedure using the functionality FMPC. The simulator, from the simulation of the MPC functionality, obtains the adversarial shares sj for i = ι(j) ∈ A and computes Qj = [sj]G. The simulator calls the KeyGen functionality on FSign and obtains a public key pk = [s]G for some hidden value of s ∈ Fq. With overwhelming probability there is a choice of random bits di which produce the secret key


corresponding to s, and thus there is a way for this value to have arisen in the protocol. The simulator now needs to generate Qj for i = ι(j) ∉ A which is consistent with the Qj computed above. The unknown Qj define a series of linear equations in elliptic curve points (one equation defined by the reconstruction vector r and a set of equations defined by the parity check matrix H). This set of equations will have a solution since there is an assignment which produces this hidden value s. Thus the values Qj for i = ι(j) ∉ A can be perfectly simulated, and by the hardness of the discrete logarithm problem, the combined set hides the unknown hidden value s. The values Qj for i = ι(j) ∉ A are sent to the adversary, who returns his own set Q*j for i = ι(j) ∈ A. If Q*j ≠ Qj then the simulator passes abort to FSign and exits. Note that this abort will be caught in the real protocol as well by using the error detecting properties of the Q2 secret sharing scheme; see [43, Lemma 2].

For the signing algorithm, the simulator obtains a signature (R, S) from the signing oracle. Steps 1–4 of the threshold signing algorithm are simulated in the same way as the steps to generate pk in the KeyGen algorithm. All that remains is to simulate the opening of 〈S〉q. As before, from the simulation of FMPC, we know the adversarial shares of 〈r〉q and 〈s〉q, thus we know what the adversary should output as their shares of 〈S〉q. Thus we are able, this time by solving linear equations over Fq, to find a set of consistent shares for the honest parties which open to the correct value of S. If the adversary sends incorrect values for his opening of 〈R〉q or 〈S〉q which would cause the real protocol to abort, then the simulator passes abort to FSign.

It is clear that the above simulation perfectly simulates the algorithms KeyGen and Sign. Thus the security of the protocol follows. □

6 Using MPC-Friendly Hash Functions

Up to now, we have concentrated on following the precise definition in NIST's Digital Signature Standard [37]. Thus we used SHA-512 and SHAKE-256 as the underlying hash functions, which gave a potential performance penalty in the deterministic signature environment. However, as part of the NIST Threshold Cryptography initiative, there is some interest in examining variants which are more amenable to a threshold implementation. The obvious 'tweak' which could be applied to the standard HashEdDSA and EdDSA algorithms is to replace the use of SHA-512 and SHAKE-256 with so-called 'MPC-Friendly' hash functions.

In recent years, there has been considerable interest in MPC-Friendly variants of symmetric cryptographic primitives; for example block ciphers [1, 2, 24], modes-of-operation [38], and more recently hash functions [3, 23]. The hash function constructions are sponge-based, and designs have been given which are suitable for MPC over characteristic two fields (StarkAD and Vision), as well as over large prime fields (Poseidon and Rescue). In this paper, we concentrate on the Rescue design from [3], which seems more suited to our application.

Rescue has a state of t = r + c finite field elements of Fq. The initial state of the sponge is defined to be the vector of t zero elements. A message is divided into n = d · r elements in Fq, m0, m1, . . . , mn−1. The elements are absorbed into the sponge in d absorption phases, where r elements are absorbed in each phase. At each phase a permutation f : F_q^t −→ F_q^t is applied; see Figure 10. This results in a state s0, . . . , st−1. At the end of the absorption, the r values sc, . . . , st−1 are output from the state. This process can then be repeated, with more data absorbed and then squeezed out. Thus we are defining a map H : F_q^n −→ F_q^r. To obtain security of the sponge itself, we require that min(r, c) · log2 q ≥ 2 · κ, where κ is the desired security parameter. We will always take c = 2 in our application.³
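As a quick illustration of the absorb/squeeze pattern just described, the following is a minimal sketch of the sponge mode over Fq with an abstract permutation f. The choice of which state positions form the rate is an assumption on our part (we follow the convention in the text that sc, . . . , st−1 are output), and the function name is ours.

```python
# Sketch of the sponge mode over F_q: absorb d blocks of r elements, applying the
# permutation f after each block, then squeeze the r elements s_c, ..., s_{t-1}.
# `f` is an abstract permutation on lists of t field elements (e.g. Rescue's f_t).

def sponge_hash(blocks, f, r, c, q):
    t = r + c
    state = [0] * t                          # initial state: t zero elements
    for block in blocks:                     # one absorption phase per block
        assert len(block) == r
        for i in range(r):                   # add the block into the rate part
            state[c + i] = (state[c + i] + block[i]) % q
        state = f(state)                     # apply f : F_q^t -> F_q^t
    return state[c:]                         # output s_c, ..., s_{t-1}
```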

[Figure: the Rescue sponge. The message blocks m0, . . . , mn−1 are absorbed into the state s0, . . . , st−1, r elements at a time, with the permutation f applied after each absorption phase.]

Fig. 10. The Rescue Sponge Function

Each primitive call f in the Rescue sponge is performed by executing a round function rnds times. The round function is parametrized by a (small prime) value α, an MDS matrix M ∈ F_q^{t×t} and step constants vki ∈ F_q^t, two of which are used in each round. The value α is chosen to be the smallest prime such that gcd(q − 1, α) = 1. The round function is given in Figure 11, where S1 is the S-Box which maps x ∈ Fq to x^{1/α} and S2 is the S-Box which maps x ∈ Fq to x^α. To obtain security of the permutation f we require that the round function is repeated

rnds = max(2 · ⌈κ/(4 · t)⌉, 10)

times. Notice that the product t · rnds is (to a first approximation) fixed by the security parameter. The 'cost' of evaluating f in terms of the number of multiplications will be proportional to t · rnds, but the online evaluation time will depend (in an MPC system) mainly on the value of rnds. Thus having a larger t value will improve performance compared to a smaller t value. To show the dependence of f on t, we will write ft for this function in what follows.

³ We note this is a conservative choice since taking c = 1 is possible due to us having q ≈ 2^{2·κ}.


[Figure: the Rescue round function mapping si−1 to si. The state passes through an S1 layer (x ↦ x^{1/α} on each element), multiplication by the MDS matrix M, addition of the step constant vk2·i−1, an S2 layer (x ↦ x^α), a second multiplication by M, and addition of the step constant vk2·i.]

Fig. 11. The Rescue Round Function
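The round of Figure 11 can be sketched directly from the description above. In the sketch below, the MDS matrix M and the step constants are left as abstract inputs (they are fixed public parameters of a concrete instantiation), the function names are ours, and num_rounds simply evaluates the rnds formula.

```python
# Sketch of one Rescue round (Figure 11) and of the round-count formula.
from math import ceil, gcd

def matvec(M, v, q):
    # multiply the t x t MDS matrix M by the state vector v over F_q
    return [sum(M[i][j] * v[j] for j in range(len(v))) % q for i in range(len(v))]

def rescue_round(state, M, vk_odd, vk_even, alpha, q):
    assert gcd(alpha, q - 1) == 1
    inv_a = pow(alpha, -1, q - 1)                      # exponent realising x -> x^(1/alpha)
    s = [pow(x, inv_a, q) for x in state]              # S-box layer S1: x -> x^(1/alpha)
    s = [(y + k) % q for y, k in zip(matvec(M, s, q), vk_odd)]    # M, then add vk_{2i-1}
    s = [pow(x, alpha, q) for x in s]                  # S-box layer S2: x -> x^alpha
    return [(y + k) % q for y, k in zip(matvec(M, s, q), vk_even)]  # M, then add vk_{2i}

def num_rounds(kappa, t):
    # rnds = max(2 * ceil(kappa / (4*t)), 10); e.g. kappa = 128, t = 6 gives 12
    return max(2 * ceil(kappa / (4 * t)), 10)
```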

We will require different variants of the Rescue hash function for different values of the rate r, and correspondingly t = r + 2. Thus we define the function Rr, which takes an arbitrary-sized input, divided into blocks of r Fq-elements, and produces a final output block of r Fq-elements,

Rr : (F_q^r)∗ −→ F_q^r.

When one applies Rr to messages which always consist of the same number of blocks, there is no need for a padding scheme; however, when Rr is applied to messages of variable length, we need to pad the input. Thus for arbitrary-length messages m, we first pad by adding a single Fq element equal to one, and then we pad by enough zero elements so as to obtain a message which is a multiple of r field elements long.
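The padding rule is short enough to state as code; the following is a minimal sketch, assuming the message is already given as a list of Fq elements, with the function name being ours.

```python
# Pad a list of F_q elements with a single 1 followed by zeros, up to a multiple of r.
def pad_message(m, r):
    padded = list(m) + [1]
    padded += [0] * (-len(padded) % r)
    return padded

assert len(pad_message([5, 7, 11], r=4)) == 4      # 3 elements -> the 'one', no zeros
assert len(pad_message([5, 7, 11, 13], r=4)) == 8  # 4 elements -> the 'one' plus three zeros
```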

If we apply Rr as a hash function where the initial state is not the zero state, but another set of t values s0, . . . , st−1, then we write Rr(m; s0, . . . , st−1). Thus we have Rr(m) = Rr(m; 0, . . . , 0).

To examine the cost of an MPC implementation, we consider only the online round complexity, assuming a 'standard' SPDZ-like offline procedure (which produces multiplication triples, square pairs and random shared values only). We note that in a constant number of rounds we can produce from the standard pre-processing as many pairs of the form (〈r〉q, 〈r−α〉q) as we like. In fact, using existing pre-processed squares, this will require one round for α = 3 and two rounds for α = 5.

Given this pre-processing, we can produce 〈xα〉q given 〈x〉q in two rounds, irrespective of α. This is done by computing 〈y〉q = 〈x〉q · 〈r〉q, for one of our pre-computed pairs (〈r〉q, 〈r−α〉q), which requires one round of communication. We then open 〈y〉q to obtain y, requiring another round of communication. The value 〈xα〉q can then be locally computed as yα · 〈r−α〉q. This will only produce an incorrect result when r = 0, which happens with negligible probability.

To compute 〈x^{1/α}〉q given 〈x〉q we again take a pre-computed pair (〈r〉q, 〈r−α〉q) and this time compute and open 〈y〉q = 〈x〉q · 〈r−α〉q, again requiring two rounds. From the opened value y we obtain 〈x^{1/α}〉q locally as y^{1/α} · 〈r〉q.

Summing up, the total MPC-round complexity of evaluating Rescue is given by 4 · rnds irrespective of the value of α.
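The two masked-exponentiation tricks above amount to simple local algebra once y is opened. The following sketch shows that algebra on clear values only (in the protocol x and the pair (r, r^{-α}) are secret shared and only y is ever opened); the toy prime is chosen so that gcd(α, q − 1) = 1, and the function names are ours.

```python
# Algebra behind the masked exponentiations; only y would be opened in the protocol.

def exp_alpha_via_mask(x, r, r_inv_alpha, alpha, q):
    # parties compute and open y = x*r, then locally x^alpha = y^alpha * r^{-alpha}
    y = (x * r) % q
    return (pow(y, alpha, q) * r_inv_alpha) % q

def root_alpha_via_mask(x, r, r_inv_alpha, alpha, q):
    # parties compute and open y = x * r^{-alpha}, then locally x^{1/alpha} = y^{1/alpha} * r
    inv_a = pow(alpha, -1, q - 1)          # valid since gcd(alpha, q-1) = 1
    y = (x * r_inv_alpha) % q
    return (pow(y, inv_a, q) * r) % q

# toy check with a prime for which gcd(5, q-1) = 1 (not the Ed25519/Ed448 group orders)
q, alpha = 2**127 - 1, 5
x, r = 1234567, 891011
r_inv_alpha = pow(r, -alpha, q)
assert exp_alpha_via_mask(x, r, r_inv_alpha, alpha, q) == pow(x, alpha, q)
assert root_alpha_via_mask(pow(x, alpha, q), r, r_inv_alpha, alpha, q) == x
```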


6.1 Ed25519 with Rescue

The curve Ed25519 aims to obtain roughly 128 bits of security and uses q with log2 q ≈ 252. We concentrate still on HashEdDSA, although now with the use of Rescue it is easy to apply the thresholdizing technique to normal EdDSA, as we shall describe below. We have q ≡ 1 (mod 3) and q ≡ 4 (mod 5), thus we select α = 5 in the Rescue construction. We use r = 4 in our construction of Rescue and so will use the functions R4 and f6 defined above; thus the number of rounds is rnds = 12. The permutation f6 will be used to perform the internal hashes within the HashEd25519 algorithm, with R4 used in the case of producing a thresholdized EdDSA (as opposed to HashEdDSA) algorithm.

We modify the threshold variant of the algorithm in Figure 9 as follows to obtain a Rescue-enabled variant of both HashEd25519 and Ed25519. Deriving non-threshold versions of these Rescue-enabled variants can be done in an obvious manner.

1. In line 1 of KeyGen we select 〈sk〉q ∈ Fq by calling FMPC.Random once.

2. In line 2 of KeyGen we apply the Rescue permutation f6 once to the input state (〈sk〉q, 0, 0, 0, 0, 0) ∈ F_q^6. This produces a value (〈h0〉q, 〈h1〉q, 〈h2〉q, 〈h3〉q, 〈h4〉q, 〈h5〉q) ∈ F_q^6, which is stored for use later.

3. In line 3 of KeyGen we set 〈s〉q = 〈h0〉q. There is no need to expand by applying another hash function, as in the standard HashEdDSA algorithm, since the Rescue function works natively with Fq elements.

4. In line 1 of Sign we apply SHA512 to the message m to obtain a 512-bit output. This is split into two 248-bit chunks and a 16-bit chunk, and then each chunk is treated as an element of Fq. This results in three finite field elements (c0, c1, c2); a sketch of this chunking is given after these modifications.

5. Line 2 of Sign then involves applying the Rescue permutation f6 once to the input state (〈h0〉q + c0, 〈h1〉q + c1, 〈h2〉q + c2, 〈h3〉q, 〈h4〉q, 〈h5〉q). This will produce an output (〈r0〉q, . . . , 〈r5〉q).

6. Finally, the output in Line 2 of Sign is replaced by setting 〈r〉q = 〈r0〉q.

With these changes the rest of our threshold implementation of HashEdDSA follows immediately. To obtain a threshold variant of EdDSA, we replace lines 4 and 5 above by the following operations:

- We split m into 248-bit blocks m0, . . . , ml′, and treat each block as an element of Fq.

- We pad with a single block equal to one, and then enough zero blocks to make the total number of blocks a multiple of r = 4. Thus we obtain the message as (m0, . . . , ml) ∈ F_q^{l+1}.

- We compute

(〈r0〉q, . . . , 〈r3〉q) = R4(m0, . . . , ml; (〈h0〉q, . . . , 〈h5〉q))

and take 〈r〉q = 〈r0〉q as before.

Note, we apply Rescue as the hash function here as this is easier to implement via MPC, and we apply SHA512 to compute the hash of the message for HashEd25519 as this is easier to implement in the clear compared to Rescue.
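A sketch of the chunking in step 4 above, and of the 248-bit blocking used in the plain-EdDSA variant, is given below; the little-endian byte order is an illustrative assumption, and the function names are ours.

```python
# Split a SHA-512 digest into two 248-bit chunks and one 16-bit chunk (step 4),
# and split an arbitrary message into 248-bit (31-byte) blocks for the EdDSA variant.
import hashlib

def digest_to_field_elements(m):
    h = int.from_bytes(hashlib.sha512(m).digest(), "little")   # 512-bit integer
    c0 = h & ((1 << 248) - 1)
    c1 = (h >> 248) & ((1 << 248) - 1)
    c2 = h >> 496                                              # the remaining 16 bits
    return c0, c1, c2          # each fits in F_q since log2 q ~ 252 > 248

def message_to_blocks(m_bytes):
    return [int.from_bytes(m_bytes[i:i + 31], "little")
            for i in range(0, len(m_bytes), 31)]
```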


6.2 Ed448 with Rescue

The curve Ed448 aims to obtain roughly 224 bits of security and we have log2 q ≈ 446. We again have q ≡ 1 (mod 3) and q ≡ 4 (mod 5), thus we select α = 5 in the Rescue construction. Again we utilize the Rescue functions, but this time we select t = 10 and r = 8, i.e. f10 and R8, again with rnds = 12. The modifications to the KeyGen and Sign algorithms follow as above.

7 Experimental Results

We now present some experimental results. To put these into context, we first recap the state of the art for standard ECDSA threshold signing algorithms. Prior experimental work has focused on standard ECDSA, which has the added complexity of needing to divide by the ephemeral secret within the signing equation, and has concentrated on curves at the 128-bit security level, e.g. curves such as secp256k1 or secp256r1.

The paper [34] looks at the case of full-threshold ECDSA signing. The complexity growth of their implementation is roughly linear. For two parties, they obtain a signing time of 304 milliseconds, and for three parties they obtain a signing time of 575 milliseconds. In [19] the full-threshold case is also considered, but this time restricted to two parties, where they obtain a signing time of 81 milliseconds. They also present timings for a protocol in the 2-out-of-n threshold situation, where they again obtain a signing time of 81 milliseconds. In [31] the two-party full-threshold case is considered, and a time of 37 milliseconds for signing is reported. In [20], the authors again present a full-threshold protocol for ECDSA. The complexity scales with the number of parties t which engage in the signing protocol, resulting in a runtime of approximately 29 + 24 · t milliseconds for ECDSA threshold signing, for a curve over a field of 256 bits. Recent work [11] improves on the above and extends the experiments to larger field sizes. The authors give run times (in milliseconds) for distributed signing operations as functions of the number of signing parties t, which we show in Table 1.

Curve         [33]         [20]        [11]
NIST P-256    310 · t      88 · t      237 + 730 · t
NIST P-384    3000 · t     857 · t     903 + 2780 · t
NIST P-521    16741 · t    4783 · t    2608 + 8011 · t

Table 1. Runtime (in milliseconds) as a function of the number of signing parties t for various threshold ECDSA algorithms and curves. Data taken from [11].

The recent work closest to ours is that of [15] and [16], who look at ECDSA in the case of an honest majority. In [15], the authors give an online time of (at best) 2.78 ms for a three-party honest-majority protocol (i.e. threshold (n, t) = (3, 1)); the protocol is an example of a more general methodology to thresholdize 'any' general elliptic curve based protocol which was given in [42]. In [16] the authors present a protocol, implemented in Java, which has an online signing time of 19.9 ms when (n, t) = (3, 1) and 25.0 ms when (n, t) = (5, 2) on a LAN.

In our situation, using the standard HashEdDSA algorithm with standard hash functions, our signing equation is easier (there is no costly division), but we need to evaluate the hash function in MPC and convert the output to a shared value modulo q, as explained in the introduction. In the case of Ed25519 HashEdDSA signing the key time-critical operations are lines 2–3 in Figure 9, with the equivalent lines also being the most costly operations for Ed448. The rest of the computation is relatively marginal (and equivalent to a standard non-threshold ECDSA signing cost); thus we timed only these parts of our algorithm, for different threshold Q2 access structures.

Our implementation was done using a modification of the SCALE-MAMBA system and tested in a LAN setting, with each party running on an Intel i7-7700K CPU (4 cores at 4.20GHz with 2 threads per core) with 32GB of RAM over a 10Gb/s network switch. The results are presented in Table 2; note that in the case of Ed448 signing the cost depends on the size of the context field, which controls the number of times v the underlying SHAKE256 permutation needs to be called; as discussed earlier we are guaranteed that 2 ≤ v ≤ 4. For each case we give two run times, one for an HSS-based evaluation of the underlying hash function and one for an evaluation using our replicated secret sharing based approach described earlier, denoted RSS in the table. Note that to perform the latter efficiently, we need to generate circuit representations of the SHA-256 and Keccak round functions which use the smallest number of rounds possible. These circuits we have made available at https://homes.esat.kuleuven.be/~nsmart/MPC/.

          Ed25519              Ed448 (v = 2)        Ed448 (v = 3)         Ed448 (v = 4)
(n, t)    HSS        RSS       HSS        RSS       HSS         RSS       HSS         RSS
(3, 1)    75.8 ms    23.0 ms   80.9 ms    13.0 ms   123.27 ms   17.4 ms   198.47 ms   21.3 ms
(4, 1)    110.2 ms   23.6 ms   149.6 ms   13.2 ms   209.2 ms    17.5 ms   299.3 ms    22.5 ms
(5, 1)    138.8 ms   25.6 ms   189.2 ms   14.3 ms   290.7 ms    18.5 ms   389.0 ms    24.1 ms
(5, 2)    116.8 ms   38.8 ms   191.6 ms   15.4 ms   288.4 ms    21.3 ms   341.4 ms    27.3 ms

Table 2. Running times for computing the SHA512/SHAKE256 hash function and converting the 〈·〉2 sharings to 〈·〉q sharings. These timings are averaged over 100 experiments and performed with SCALE-MAMBA.

We can see from the timing results that the binary LSSS method (RSS) outperforms the garbled circuit strategy (HSS). The timings for the RSS strategy are of the same order of magnitude as the results from [16], but still one order of magnitude bigger than the results of [15]. However, these timings are over a Local Area Network, in which the downside of the high round complexity of the RSS strategy may not apply. As the ping time between machines increases, it may be that the HSS strategy becomes more useful. The increase in time, compared to [15], is almost entirely down to the need to perform a secure hash within the signing operation itself.

Next, for completeness, we compare our results with the MP-SPDZ framework [28] on the same problem. The MP-SPDZ framework also allows a combination of different MPC protocols in order to achieve the computation of the hash function over a binary sharing and the subsequent elliptic curve operations over Fq. We use the following constructions for MP-SPDZ: for the (3, 1) access structure we use the malicious replicated sharing scheme based protocol of Lindell and Nof [32]. For the other access structures MP-SPDZ only allows the execution of a protocol based on the ancient protocol of [12], which utilizes Shamir sharing to obtain the threshold access structure for the modulo-2 computation, albeit over a large finite field. To generate the best comparison, we also produce timings for the RSS case in SCALE-MAMBA.

          Ed25519                  Ed448 (v = 2)            Ed448 (v = 3)            Ed448 (v = 4)
(n, t)    SCALE-MAMBA  MP-SPDZ    SCALE-MAMBA  MP-SPDZ    SCALE-MAMBA  MP-SPDZ    SCALE-MAMBA  MP-SPDZ
(3, 1)    23.0 ms      17.7 ms    13.0 ms      3.1 ms     17.4 ms      3.7 ms     21.3 ms      4.7 ms
(4, 1)    23.6 ms      58.3 ms    13.2 ms      42.8 ms    17.5 ms      59.3 ms    22.5 ms      82.9 ms
(5, 1)    25.6 ms      49.5 ms    14.3 ms      37.0 ms    18.5 ms      52.3 ms    24.1 ms      71.4 ms
(5, 2)    38.8 ms      80.8 ms    15.4 ms      64.1 ms    21.3 ms      94.5 ms    27.3 ms      123.5 ms

Table 3. Running times for computing the SHA512/SHAKE256 hash function and converting the 〈·〉2 sharings to 〈·〉q sharings, in SCALE-MAMBA (using RSS) and MP-SPDZ. These timings are averaged over 100 experiments.

The results of our experiments are given in Table 3. From Table 3, we see that only for the (3, 1) access structure does MP-SPDZ give better timings than SCALE-MAMBA; in all other cases SCALE-MAMBA outperforms MP-SPDZ. The reason for this is twofold. Firstly, in the case (n, t) = (3, 1), MP-SPDZ uses specialised code for the underlying three-party replicated access structure, whereas SCALE-MAMBA uses code for a generic n-party replicated access structure. Secondly, for the other access structures the protocol in MP-SPDZ requires larger finite fields, which adds complexity compared to the protocol used in SCALE-MAMBA for threshold structures other than (n, t) = (3, 1).

We also examined applying Rescue in the same context. For ease of comparison, we examined simply the cost of applying the Rescue hash function in this setting and for the primes needed in our application. The cost of the full signing algorithm on top of these runtimes is equivalent to the cost of a non-threshold ECDSA operation, and can thus be discounted. Again we present times for different access structures; also note that with Rescue and the Ed448 signing algorithm we can avoid the complication of the various values of v, by passing the context into the hash of the message. For the run times see Table 4.

Here we see that using Rescue enables us to obtain execution times which are now well within an order of magnitude of standard ECDSA signing. Thus one can conclude that NIST, if it wishes to support threshold variants of its standard algorithms, should also consider standardizing MPC-friendly hash and block cipher components such as Rescue.


Access structure   Ed25519   Ed448
(3, 1)             7 ms      14 ms
(4, 1)             9 ms      14 ms
(5, 1)             11 ms     15 ms
(5, 2)             15 ms     17 ms

Table 4. Running times for computing the Rescue hash function. Rescue works on elements of Fq directly, so we do not have to convert the results. These timings are averaged over 100 000 hash function calls.

References

1. Albrecht, M.R., Grassi, L., Rechberger, C., Roy, A., Tiessen, T.: MiMC: Efficient encryption and cryptographic hashing with minimal multiplicative complexity. In: Cheon, J.H., Takagi, T. (eds.) Advances in Cryptology – ASIACRYPT 2016, Part I. Lecture Notes in Computer Science, vol. 10031, pp. 191–219. Springer, Heidelberg, Germany, Hanoi, Vietnam (Dec 4–8, 2016)

2. Albrecht, M.R., Rechberger, C., Schneider, T., Tiessen, T., Zohner, M.: Ciphers for MPC and FHE. In: Oswald, E., Fischlin, M. (eds.) Advances in Cryptology – EUROCRYPT 2015, Part I. Lecture Notes in Computer Science, vol. 9056, pp. 430–454. Springer, Heidelberg, Germany, Sofia, Bulgaria (Apr 26–30, 2015)

3. Aly, A., Ashur, T., Ben-Sasson, E., Dhooghe, S., Szepieniec, A.: Design of symmetric-key primitives for advanced cryptographic protocols. Cryptology ePrint Archive, Report 2019/426 (2019), https://eprint.iacr.org/2019/426

4. Aly, A., Cong, K., Cozzo, D., Keller, M., Orsini, E., Rotaru, D., Scherer, O., Scholl, P., Smart, N.P., Tanguy, T., Wood, T.: SCALE and MAMBA documentation, v1.10 (2020), https://homes.esat.kuleuven.be/~nsmart/SCALE/Documentation.pdf

5. Aly, A., Orsini, E., Rotaru, D., Smart, N.P., Wood, T.: Zaphod: Efficiently combining LSSS and garbled circuits in SCALE. In: Brenner, M., Lepoint, T., Rohloff, K. (eds.) Proceedings of the 7th ACM Workshop on Encrypted Computing & Applied Homomorphic Cryptography, WAHC@CCS 2019, London, UK, November 11-15, 2019. pp. 33–44. ACM (2019), https://doi.org/10.1145/3338469.3358943

6. Araki, T., Barak, A., Furukawa, J., Lichter, T., Lindell, Y., Nof, A., Ohara, K., Watzman, A., Weinstein, O.: Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In: 2017 IEEE Symposium on Security and Privacy. pp. 843–862. IEEE Computer Society Press, San Jose, CA, USA (May 22–26, 2017)

7. Bendlin, R., Damgard, I., Orlandi, C., Zakarias, S.: Semi-homomorphic encryption and multiparty computation. In: Paterson, K.G. (ed.) Advances in Cryptology – EUROCRYPT 2011. Lecture Notes in Computer Science, vol. 6632, pp. 169–188. Springer, Heidelberg, Germany, Tallinn, Estonia (May 15–19, 2011)

8. Bernstein, D.J., Duif, N., Lange, T., Schwabe, P., Yang, B.Y.: High-speed high-security signatures. In: Preneel, B., Takagi, T. (eds.) Cryptographic Hardware and Embedded Systems – CHES 2011. Lecture Notes in Computer Science, vol. 6917, pp. 124–142. Springer, Heidelberg, Germany, Nara, Japan (Sep 28 – Oct 1, 2011)

9. Bernstein, D.J., Josefsson, S., Lange, T., Schwabe, P., Yang, B.Y.: EdDSA for more curves. Cryptology ePrint Archive, Report 2015/677 (2015), http://eprint.iacr.org/2015/677


10. Brandao, L.T.A.N., Davidson, M., Vassilev, A.: NIST 8214A (Draft): Towards NIST standards for threshold schemes for cryptographic primitives: A preliminary roadmap (2019), https://nvlpubs.nist.gov/nistpubs/ir/2019/NIST.IR.8214A-draft.pdf

11. Castagnos, G., Catalano, D., Laguillaumie, F., Savasta, F., Tucker, I.: Bandwidth-efficient threshold EC-DSA. In: Kiayias, A., Kohlweiss, M., Wallden, P., Zikas, V. (eds.) PKC 2020: 23rd International Conference on Theory and Practice of Public Key Cryptography, Part II. Lecture Notes in Computer Science, vol. 12111, pp. 266–296. Springer, Heidelberg, Germany, Edinburgh, UK (May 4–7, 2020)

12. Chaum, D., Crepeau, C., Damgard, I.: Multiparty unconditionally secure protocols (extended abstract). In: 20th Annual ACM Symposium on Theory of Computing. pp. 11–19. ACM Press, Chicago, IL, USA (May 2–4, 1988)

13. Chida, K., Genkin, D., Hamada, K., Ikarashi, D., Kikuchi, R., Lindell, Y., Nof, A.: Fast large-scale honest-majority MPC for malicious adversaries. In: Shacham, H., Boldyreva, A. (eds.) Advances in Cryptology – CRYPTO 2018, Part III. Lecture Notes in Computer Science, vol. 10993, pp. 34–64. Springer, Heidelberg, Germany, Santa Barbara, CA, USA (Aug 19–23, 2018)

14. Cogliati, B., Dodis, Y., Katz, J., Lee, J., Steinberger, J.P., Thiruvengadam, A., Zhang, Z.: Provable security of (tweakable) block ciphers based on substitution-permutation networks. In: Shacham, H., Boldyreva, A. (eds.) Advances in Cryptology – CRYPTO 2018, Part I. Lecture Notes in Computer Science, vol. 10991, pp. 722–753. Springer, Heidelberg, Germany, Santa Barbara, CA, USA (Aug 19–23, 2018)

15. Dalskov, A.P.K., Orlandi, C., Keller, M., Shrishak, K., Shulman, H.: Securing DNSSEC keys via threshold ECDSA from generic MPC. In: Chen, L., Li, N., Liang, K., Schneider, S.A. (eds.) ESORICS 2020: 25th European Symposium on Research in Computer Security, Part II. Lecture Notes in Computer Science, vol. 12309, pp. 654–673. Springer, Heidelberg, Germany, Guildford, UK (Sep 14–18, 2020)

16. Damgard, I., Jakobsen, T.P., Nielsen, J.B., Pagter, J.I., Østergaard, M.B.: Fast threshold ECDSA with honest majority. In: Galdi, C., Kolesnikov, V. (eds.) SCN 20: 12th International Conference on Security in Communication Networks. Lecture Notes in Computer Science, vol. 12238, pp. 382–400. Springer, Heidelberg, Germany, Amalfi, Italy (Sep 14–16, 2020)

17. Damgard, I., Koprowski, M.: Practical threshold RSA signatures without a trusted dealer. In: Pfitzmann, B. (ed.) Advances in Cryptology – EUROCRYPT 2001. Lecture Notes in Computer Science, vol. 2045, pp. 152–165. Springer, Heidelberg, Germany, Innsbruck, Austria (May 6–10, 2001)

18. Damgard, I., Pastro, V., Smart, N.P., Zakarias, S.: Multiparty computation from somewhat homomorphic encryption. In: Safavi-Naini, R., Canetti, R. (eds.) Advances in Cryptology – CRYPTO 2012. Lecture Notes in Computer Science, vol. 7417, pp. 643–662. Springer, Heidelberg, Germany, Santa Barbara, CA, USA (Aug 19–23, 2012)

19. Doerner, J., Kondi, Y., Lee, E., shelat, a.: Secure two-party threshold ECDSA from ECDSA assumptions. In: 2018 IEEE Symposium on Security and Privacy. pp. 980–997. IEEE Computer Society Press, San Francisco, CA, USA (May 21–23, 2018)

20. Gennaro, R., Goldfeder, S.: Fast multiparty threshold ECDSA with fast trustless setup. In: Lie, D., Mannan, M., Backes, M., Wang, X. (eds.) ACM CCS 2018: 25th Conference on Computer and Communications Security. pp. 1179–1194. ACM Press, Toronto, ON, Canada (Oct 15–19, 2018)

21. Gennaro, R., Goldfeder, S., Narayanan, A.: Threshold-optimal DSA/ECDSA signatures and an application to bitcoin wallet security. In: Manulis, M., Sadeghi, A.R., Schneider, S. (eds.) ACNS 16: 14th International Conference on Applied Cryptography and Network Security. Lecture Notes in Computer Science, vol. 9696, pp. 156–174. Springer, Heidelberg, Germany, Guildford, UK (Jun 19–22, 2016)

22. Gennaro, R., Jarecki, S., Krawczyk, H., Rabin, T.: Robust threshold DSS signatures. In: Maurer, U.M. (ed.) Advances in Cryptology – EUROCRYPT'96. Lecture Notes in Computer Science, vol. 1070, pp. 354–371. Springer, Heidelberg, Germany, Saragossa, Spain (May 12–16, 1996)

23. Grassi, L., Kales, D., Khovratovich, D., Roy, A., Rechberger, C., Schofnegger, M.: Starkad and Poseidon: New hash functions for zero knowledge proof systems. Cryptology ePrint Archive, Report 2019/458 (2019), https://eprint.iacr.org/2019/458

24. Grassi, L., Rechberger, C., Rotaru, D., Scholl, P., Smart, N.P.: MPC-friendly symmetric key primitives. In: Weippl, E.R., Katzenbeisser, S., Kruegel, C., Myers, A.C., Halevi, S. (eds.) ACM CCS 2016: 23rd Conference on Computer and Communications Security. pp. 430–443. ACM Press, Vienna, Austria (Oct 24–28, 2016)

25. Hazay, C., Scholl, P., Soria-Vazquez, E.: Low cost constant round MPC combining BMR and oblivious transfer. In: Takagi, T., Peyrin, T. (eds.) Advances in Cryptology – ASIACRYPT 2017, Part I. Lecture Notes in Computer Science, vol. 10624, pp. 598–628. Springer, Heidelberg, Germany, Hong Kong, China (Dec 3–7, 2017)

26. Hirt, M., Maurer, U.M.: Player simulation and general adversary structures in perfect multiparty computation. Journal of Cryptology 13(1), 31–60 (Jan 2000)

27. Karchmer, M., Wigderson, A.: On span programs. In: Proceedings of Structures in Complexity Theory. pp. 102–111 (1993)

28. Keller, M.: MP-SPDZ: A versatile framework for multi-party computation. Cryptology ePrint Archive, Report 2020/521 (2020)

29. Keller, M., Orsini, E., Scholl, P.: MASCOT: Faster malicious arithmetic secure computation with oblivious transfer. In: Weippl, E.R., Katzenbeisser, S., Kruegel, C., Myers, A.C., Halevi, S. (eds.) ACM CCS 2016: 23rd Conference on Computer and Communications Security. pp. 830–842. ACM Press, Vienna, Austria (Oct 24–28, 2016)

30. Keller, M., Rotaru, D., Smart, N.P., Wood, T.: Reducing communication channels in MPC. In: Catalano, D., De Prisco, R. (eds.) SCN 18: 11th International Conference on Security in Communication Networks. Lecture Notes in Computer Science, vol. 11035, pp. 181–199. Springer, Heidelberg, Germany, Amalfi, Italy (Sep 5–7, 2018)

31. Lindell, Y.: Fast secure two-party ECDSA signing. In: Katz, J., Shacham, H. (eds.) Advances in Cryptology – CRYPTO 2017, Part II. Lecture Notes in Computer Science, vol. 10402, pp. 613–644. Springer, Heidelberg, Germany, Santa Barbara, CA, USA (Aug 20–24, 2017)

32. Lindell, Y., Nof, A.: A framework for constructing fast MPC over arithmetic circuits with malicious adversaries and an honest-majority. In: Thuraisingham, B.M., Evans, D., Malkin, T., Xu, D. (eds.) ACM CCS 2017: 24th Conference on Computer and Communications Security. pp. 259–276. ACM Press, Dallas, TX, USA (Oct 31 – Nov 2, 2017)

33. Lindell, Y., Nof, A.: Fast secure multiparty ECDSA with practical distributed key generation and applications to cryptocurrency custody. In: Lie, D., Mannan, M., Backes, M., Wang, X. (eds.) ACM CCS 2018: 25th Conference on Computer and Communications Security. pp. 1837–1854. ACM Press, Toronto, ON, Canada (Oct 15–19, 2018)

34. Lindell, Y., Nof, A., Ranellucci, S.: Fast secure multiparty ECDSA with practical distributed key generation and applications to cryptocurrency custody. Cryptology ePrint Archive, Report 2018/987 (2018), https://eprint.iacr.org/2018/987

35. MacKenzie, P.D., Reiter, M.K.: Two-party generation of DSA signatures. In: Kilian, J. (ed.) Advances in Cryptology – CRYPTO 2001. Lecture Notes in Computer Science, vol. 2139, pp. 137–154. Springer, Heidelberg, Germany, Santa Barbara, CA, USA (Aug 19–23, 2001)

36. Maurer, U.M.: Secure multi-party computation made simple. Discret. Appl. Math. 154(2), 370–381 (2006), https://doi.org/10.1016/j.dam.2005.03.020

37. National Institute of Standards and Technology: FIPS PUB 186-5 (Draft): Digital Signature Standard (DSS) (2019), https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-5-draft.pdf

38. Rotaru, D., Smart, N.P., Stam, M.: Modes of operation suitable for computing on encrypted data. IACR Transactions on Symmetric Cryptology 2017(3), 294–324 (2017)

39. Rotaru, D., Smart, N.P., Tanguy, T., Vercauteren, F., Wood, T.: Actively secure setup for SPDZ. Cryptology ePrint Archive, Report 2019/1300 (2019), https://eprint.iacr.org/2019/1300

40. Rotaru, D., Wood, T.: MArBled circuits: Mixing arithmetic and Boolean circuits with active security. In: Hao, F., Ruj, S., Sen Gupta, S. (eds.) Progress in Cryptology - INDOCRYPT 2019: 20th International Conference in Cryptology in India. Lecture Notes in Computer Science, vol. 11898, pp. 227–249. Springer, Heidelberg, Germany, Hyderabad, India (Dec 15–18, 2019)

41. Shoup, V.: Practical threshold signatures. In: Preneel, B. (ed.) Advances in Cryptology – EUROCRYPT 2000. Lecture Notes in Computer Science, vol. 1807, pp. 207–220. Springer, Heidelberg, Germany, Bruges, Belgium (May 14–18, 2000)

42. Smart, N.P., Talibi Alaoui, Y.: Distributing any elliptic curve based protocol. In: Albrecht, M. (ed.) 17th IMA International Conference on Cryptography and Coding. Lecture Notes in Computer Science, vol. 11929, pp. 342–366. Springer, Heidelberg, Germany, Oxford, UK (Dec 16–18, 2019)

43. Smart, N.P., Wood, T.: Error detection in monotone span programs with application to communication-efficient multi-party computation. In: Matsui, M. (ed.) Topics in Cryptology – CT-RSA 2019. Lecture Notes in Computer Science, vol. 11405, pp. 210–229. Springer, Heidelberg, Germany, San Francisco, CA, USA (Mar 4–8, 2019)


Curriculum vitae

“All we have to decide is what to do with the time that is given us.” – J.R.R. Tolkien

Charlotte Bonte obtained a Bachelor's degree in Mathematics from the University of Antwerp in 2012 and a Master's degree in Mathematical Engineering from KU Leuven in July 2014. After graduation she worked as a software engineer developing software for 3D printers at Materialise. In October 2016, she returned to KU Leuven and joined the COSIC research group as a PhD candidate under the supervision of Prof. Bart Preneel and Prof. Frederik Vercauteren. During her PhD, she was also an intern in the security and privacy research group at Intel Labs from May to September 2020.
