Efficient Designs for Practical Blockchain-IoT Integration



Zurich Open Repository and Archive, University of Zurich, University Library, Strickhofstrasse 39, CH-8057 Zurich, www.zora.uzh.ch

Year: 2022

Efficient Designs for Practical Blockchain-IoT Integration

Rafati Niya, Sina

Posted at the Zurich Open Repository and Archive, University of Zurich
ZORA URL: https://doi.org/10.5167/uzh-218583
Dissertation
Published Version

Originally published at: Rafati Niya, Sina. Efficient Designs for Practical Blockchain-IoT Integration. 2022, University of Zurich, Faculty of Economics.

Department of Informatics

Efficient Designs for Practical Blockchain-IoT Integration

Dissertation submitted to the Faculty of Business, Economics, and Informatics

of the University of Zurich

to obtain the degree of Doktor der Wissenschaften, Dr. sc.

(corresponds to Doctor of Science, PhD)

presented by Sina Rafati Niya

from Iran

approved in February 2022

at the request of
Prof. Dr. Burkhard Stiller
Prof. Dr. Salil Kanhere

The Faculty of Business, Economics and Informatics of the University of Zurich hereby authorizes the printing of this dissertation, without indicating an opinion of the views expressed in the work.

Zürich, February 16, 2022

The Chairman of the Doctoral Board: Prof. Dr. Thomas Fritz

Abstract

Potential advances with Blockchains (BC) have reached various application areas beyond FinTech-oriented use cases. Since Internet-of-Things (IoT)-based use cases are an important part of them, this thesis identifies and tackles key concerns of the interdisciplinary area of BC and IoT integration (BIoT). As many IoT devices interact in BC-IoT integrated applications, it is crucial to provide efficient measurement metrics and mechanisms for IoT data collection, transmission, and persistence within BCs. Due to salient features such as strong trust and decentralization, BIoT shows potential in many use cases (applications), e.g., Supply Chain Tracking (SCT), smart cities, identity management, and data streaming and trading. This thesis outlines further potentials and incentives for BIoT, which lead via suitable use cases to respective challenges. The analysis of existing studies leads to the fact that BIoT faces various efficiency issues, which can be associated with (a) scalability, (b) energy efficiency, and (c) security.

These 3 important BIoT issues have to be addressed proactively within the application layer, i.e., the social and functional aspects of BC-IoT integrated applications, and the technical layer, i.e., the underlying BC, IoT, and the adaptation of these two realms. Hence, to address these 3 issues while considering the application and technical layers, this thesis specifies and pursues 3 goals, namely the Technical, Functional, and Social goals. To reach its goals, the experimental approach taken in this thesis includes 3 main steps.

At first, this thesis explores the utilization of BIoT applications and focuses on defining and determining measures and criteria, to be met proactively for an efficient BIoT, by designing and prototyping 4 BC-IoT integrated decentralized applications (dApps). The first step's outcome is a granular set of metrics and measures considering the specified technical, functional, and social goals for an efficient BIoT.

In the second step, this thesis introduces various BIoT methods that improve the performance of BC-IoT integrated systems, especially using the LoRaWAN access method as the main IoT protocol considered throughout this thesis. In this step, the main focus is put on the performance enhancement of BIoT systems by studying the maximal number of TXs submitted, the reliability of transport schemes, and the energy efficiency. This thesis specifies a reliable data transmission scheme from IoT devices to the connected BCs. Driven by the performed evaluations and the state-of-the-art BIoT architectures, a new architecture, called BIIT, is proposed in this thesis to pave the path toward practical and efficient BIoT architectures.

In the third step, to tackle the BIoT adaptation issues, especially scalability-related concerns such as the low TX validation rates of BCs in comparison to centralized storage systems, a novel sharding mechanism is proposed to enhance the scalability of BCs. Since disconnections and delays of a BC's distributed network can cause concerns for inter-shard and inter-miner synchronization, eventually preventing the BC from reaching a high throughput, this thesis develops an IoT-oriented permissioned BC, which covers, via a scalable Distributed Ledger (DL), the novel sharding mechanism for unstable distributed networks. Therefore, DLIT (Distributed Ledger for IoT Data) offers a novel two-layered TX distribution, validation, and inter-shard synchronization, combined with authentication and verification mechanisms in support of a viable security level. Moreover, to enhance the scalability of DLIT, a TX aggregation mechanism is introduced. Having developed the TX aggregation, efficient prevention and control of the BC's size growth is observed in the evaluated scenarios.

Lastly, BIoT applications face many challenges to comply with the European General Data Protection Regulation (GDPR), i.e., enabling users to hold on to their rights for deleting or modifying their data stored on publicly accessible and immutable BCs. In this regard, to investigate further on the social goal of this thesis, the requirements for BCs to be GDPR compliant in BIoT use cases are identified. Accordingly, an on-chain solution is proposed that allows fine-grained modification (update and erasure) operations on TXs' data fields within a BC. The proposed solution is based on a cryptographic primitive called Chameleon Hashing. This novel approach lets BC users have the authority to update their data, which are addressed at the TX level with no side effects on the block or chain. By performing and storing the data updates, all on-chain, traceability and verifiability of the BC are preserved. Moreover, the compatibility with TX aggregation mechanisms that allow the compression of the BC size is maintained.

The technical, functional, and social BIoT demands collected in step 1 are collectively addressed by the BIIT architecture and DLIT through the set of scientific and practical approaches introduced in steps 2 and 3.

All in all, the contributions of this thesis include (a) the 4 BC-IoT integrated dApps, which offer novel designs, (b) the set of well-studied and practically identified BIoT challenges, (c) the BIIT architecture for efficient BIoT, and (d) DLIT as the scalable and secure DL for IoT data. This thesis reaches its goals as these contributions empower efficient and practical BC-IoT integrated dApps and encompass methodologies, algorithms, and designed processes that have been proven novel via a multitude of peer-reviewed publications.


Contents

1 Introduction 1
1.1 Identifying BIoT Efficiency Affecting Factors . . . . . . . . . . 3

1.1.1 Blockchain-related Efficiency Impacting Factors . . . . . 4

1.1.2 IoT-related Efficiency Impacting Factors . . . . . . . . . 4

1.1.3 BC-IoT Adaptation-related Efficiency Impacting Factors 4

1.2 Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4 Thesis Contribution . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.5 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2 Background 9
2.1 Distributed Storage Systems . . . . . . . . . . . . . . . . 9

2.1.1 Distributed File Systems . . . . . . . . . . . . . . . . . 11

2.1.2 DSS Implementations . . . . . . . . . . . . . . . . . . . 12

2.1.3 Performance Evaluation of DSSes . . . . . . . . . . . . . 15

2.2 Cloud Storage Systems . . . . . . . . . . . . . . . . . . . . . . . 16

2.3 Blockchains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.3.1 Building Elements of Blockchains . . . . . . . . . . . . . 19

2.3.2 Time Stamping . . . . . . . . . . . . . . . . . . . . . . . 21

2.3.3 Blockchain Types . . . . . . . . . . . . . . . . . . . . . . 21

2.4 Mining and Consensus Mechanisms in Blockchains . . . . . . . 22

2.4.1 Proof-of-Work (PoW) . . . . . . . . . . . . . . . . . . . 23

2.4.2 Proof-of-Stake (PoS) . . . . . . . . . . . . . . . . . . . . 24

2.4.3 Byzantine Fault Tolerant (BFT) . . . . . . . . . . . . . 30

2.4.4 Proof-of-X (PoX) . . . . . . . . . . . . . . . . . . . . . . 31

2.5 Blockchain Implementations . . . . . . . . . . . . . . . . . . . . 32

2.5.1 Bitcoin . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

2.5.2 Ethereum . . . . . . . . . . . . . . . . . . . . . . . . . . 33

2.5.3 Hyperledger (HL) . . . . . . . . . . . . . . . . . . . . . 34


2.5.4 IOTA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

2.5.5 Bazo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

2.5.6 Blockchain Performance Analysis Metrics . . . . . . . . 35

2.6 Blockchain Performance Enhancement Approaches . . . . . . . 38

2.6.1 Layer 1 Scalability Enhancement Approaches . . . . . . 39

2.6.2 Layer 2 BC Scalability Enhancement Approaches . . . . 43

2.6.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 43

2.7 Internet of Things (IoT) Protocols . . . . . . . . . . . . . . . . . 45

2.7.1 Low-Power Wide Area Networking (LPWAN) . . . . . . 45

2.7.2 Security of IoT . . . . . . . . . . . . . . . . . . . . . . . 48

2.7.3 Choices of IoT Protocols . . . . . . . . . . . . . . . . . . 52

2.8 Blockchain-IoT Integration (BIoT) . . . . . . . . . . . . . . . . 53

2.8.1 BIoT Incentives . . . . . . . . . . . . . . . . . . . . . . . 53

2.8.2 Overview on BIoT Use Cases . . . . . . . . . . . . . . . 55

2.8.3 BIoT Architecture . . . . . . . . . . . . . . . . . . . . . 61

2.9 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

3 Design and Implementation of BC-IoT Integrated Applications 66
3.1 KYoT: A Blockchain and IoT based Device Identification System 66

3.1.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

3.1.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . 71

3.1.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . 72

3.2 Pollution Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . 74

3.2.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

3.2.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . 76

3.2.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . 77

3.2.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 79

3.3 NUTRIA: A Supply Chain Tracking System for The Swiss Dairy Use Case . . . . . . . . . . . . 80

3.3.1 Requirements Analysis via Interviews . . . . . . . . . . 81

3.3.2 Design and Implementation . . . . . . . . . . . . . . . . 84

3.3.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . 87

3.3.4 Social Study Results . . . . . . . . . . . . . . . . . . . . 87

3.3.5 Technical Evaluation . . . . . . . . . . . . . . . . . . . . 89

3.3.6 Operational, Social, and Technical Risks . . . . . . . . . 90

3.4 ITrade: An IoT Data Streaming System & Marketplace . . . . 92

3.4.1 Requirements Analysis of Data Streaming systems . . . 93

3.4.2 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94


3.4.3 User Interactions and Processes in ITrade . . . . . . . . 96

3.4.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . 101

3.4.5 Test Scenario . . . . . . . . . . . . . . . . . . . . . . . . 108

3.4.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 109

3.5 Blockchain and IoT Integration (BIoT) Risks and Challenges . 111

3.5.1 Discussion: To BC, or Not to BC?! . . . . . . . . . . . 119

3.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

4 BIIT—An Efficient BIoT Architecture 121
4.1 BIIT Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

4.2 BIIT Architecture Components . . . . . . . . . . . . . . . . . . 123

4.3 BIIT Implementation . . . . . . . . . . . . . . . . . . . . . . . . 124

4.3.1 Management of the TTN Networks by BIIT . . . . . . . 125

4.3.2 Management of the Cellular Networks by BIIT . . . . . 128

4.4 Experimental Performance Evaluation . . . . . . . . . . . . . . 128

4.4.1 BIIT Performance in LoRa . . . . . . . . . . . . . . . . 128

4.4.2 BIIT Performance in Cellular Networks . . . . . . . . . 131

4.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

5 DLIT—A Distributed Ledger for Efficient BIoT 136
5.1 DLIT's Design and Implementation . . . . . . . . . . . . . . . . 136

5.1.1 IoT Data Transactions . . . . . . . . . . . . . . . . . . . 137

5.1.2 Aggregated Transaction (AggTX) . . . . . . . . . . . . 137

5.1.3 Double Linked Blockchain . . . . . . . . . . . . . . . . . 140

5.1.4 Consensus and Sharding . . . . . . . . . . . . . . . . . . 141

5.1.5 Consensus Design and Implementation . . . . . . . . . . 148

5.1.6 IoT Data Storage Mechanism . . . . . . . . . . . . . . . 151

5.2 Adhering to GDPR . . . . . . . . . . . . . . . . . . . . . . . . . . 154

5.2.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . 155

5.2.2 Chameleon Hash Function (CHF) . . . . . . . . . . . . 157

5.2.3 Discussion and Requirements Specification . . . . . . . 158

5.2.4 DLIT’s GDPR-Compliant Design . . . . . . . . . . . . . 159

5.2.5 TX Aggregation Compatibility . . . . . . . . . . . . . . 167

5.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

5.3.1 Evaluation of the TX Aggregation . . . . . . . . . . . . 169

5.3.2 Evaluation of DLIT’s Hybrid Consensus Mechanism . . 178

5.3.3 Evaluating GDPR-compliance of DLIT . . . . . . . . . 179

6 Summary and Conclusions 181


6.1 Overview of All Contributions . . . . . . . . . . . . . . . . . . . 183

6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

7 Publications 188

References 191

List of Figures 211

List of Tables 214

List of Listings 216

Appendix A 217
A.1 Bazo Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . 217


Acknowledgments

I would like to thank Prof. Dr. Burkhard Stiller for his support during the many different activities of my Ph.D. studies. These were five unforgettable years which, thanks to him, were filled with numerous experiences and explorations. Thanks, Burkhard, for being an excellently organized, always responsive, and caring leader. I would also like to thank my co-adviser Prof. Dr. Salil Kanhere for providing valuable feedback in the preparation of my thesis.

I would like to thank my always supportive and professional CSG colleagues, especially Dr. Thomas Bocek and Dr. Eryk Schiller, for sharing their experiences with me and guiding me in finding the right path during my Ph.D. studies.

I am grateful to all of the students of UZH who contributed to this thesis with their great work, especially Danijel Dordevic, Raphael Beckmann, Julius Willems, Ile Cepilov, Fabio Maddaloni, Atif Ghulam Nabi, Kürsat Aydinli, Sanjiv Jha, and Benjamin Jeffrey. Indeed there was a lot for me to learn from each of them.

There are no words that could express my gratitude to my love, Narges. My Dear Narges, thanks for standing shoulder-to-shoulder beside me throughout the challenging times.

Last but not least, I would like to thank my parents Laya and Javad, and my brother Safa, who supported me with their kindness, love, and wisdom during the many different steps of my life.


1 Introduction

The Internet-of-Things (IoT) is projected as a key element of the 5th Generation (5G) wireless networking era [265]. IoT allows for the automation of data collection and actuation that does not require continuous human intervention. Thus, collected data volumes, measurement resolution, and coverage may be enhanced compared to current manual and human-dependent data collection processes. In real-world IoT-based use cases, such as in the Industrial Internet of Things (IIoT), end-devices continuously record data about the environment and provide it for further processing. The knowledge derived from collected information allows for optimizations of industrial or manufacturing processes, thus proving of great value.

The variation of IoT technologies created a broad heterogeneous ecosystem composed of distinct communication protocols, technologies, hardware, software, and architectures. The advent of different network techniques for the IoT promises an appropriate Quality-of-Service (QoS) in different domains, especially when the desired Service Level Agreement (SLA) may be contracted with the network operator, e.g., a Mobile Network Operator (MNO) in the case of cellular technologies [207]. However, the management of industrial or manufacturing processes can be seriously impacted by the integration of IoT [243]. Typically, resource-constrained, easily accessible, and heterogeneous IoT devices could be deployed in large numbers over an industrial environment, raising questions about performance, i.e., QoS, authenticity, security, privacy, and trust in the data collection process. These problems can emerge at different levels of IoT-based systems. To this end, recent research has shown interest in integrating Blockchain (BC) with IoT, to enable authenticity, security, and trust in the IoT ecosystem.

BCs are public distributed ledgers with a Peer-to-peer (P2P) architecture that contain data records provided by users or machines, e.g., Machine-to-Machine (M2M) applications. In general, BCs are decentralized and distributed data storage systems, which operate on the basis of a full copy of the distributed ledger (DL) per node (the data storage), and include two main entities, i.e., miners or validators, and BC clients. The immutability of BCs is enabled via consensus mechanisms, which define a decentralized and trusted principle-set providing a global agreement. The content of the ledger is organized as a trusted ordered list of blocks, i.e., the Blockchain, queryable by third parties. The chain in BCs is constructed by a mining process performed among participating peers [164, 177].

There is increasing attention focused on Blockchain and IoT integration (BIoT), such as in supply chain monitoring [160, 215, 197], smart industries [231], and environment monitoring [199]. The primary interest of BIoT lies in data authenticity and immutability used in policy-based control loops involving measurements, data collection, and control [207]. Moreover, BIoT provides data transparency through the introduction of security and consensus mechanisms [266] on top of properly organized IoT and network layers [155], which guarantee data authenticity and integrity before the data is inserted into the BC. For instance, BIoT can be applicable towards Industry 4.0 (I4), providing a fully operational BIoT combined with trustworthiness of the data collection coupled to the specific QoS requirements of a given industry branch, such as health-care [32], wearables [230], assisted living [138], and smart cities [56].

However, IoT-based systems are often restricted in processing power, storage, networking, and energy. Thus, typical resource-hungry BCs cannot be directly ported towards IoT devices. Therefore, a management approach of BIoT for the efficient adaptation of BC-IoT is needed to consider data acquisition via IoT devices, transmission of data, and its storage on BC.

It is expected that the number of installed IoT devices worldwide will exceed 30 billion units in 2025 [7], producing 79.4 Zettabytes of data [6]. The variety of IoT technologies, such as Low Power Wide Area Networks (LPWAN) and Cellular (3G/4G/5G) network-based ones, creates a diverse ecosystem with many different (protocol- and capability-wise) IoT devices. Such diversity leads to the key problem observed in this thesis, i.e., BIoT approaches fail to take into account those differences and potential aspects affecting the efficiency of BC-IoT adaptation [215, 219]. For instance, within most BC-IoT integrated applications, large numbers of data packets with variable sizes of data collected by IoT sensors require scalable and secure streaming toward BCs [203], whereas BCs like Bitcoin or Ethereum cannot store those TXs at the same rate as they are transmitted towards them, leading to an inefficient and suboptimal BIoT.

Enabling an efficient BIoT in practice merits identifying BIoT challenges (e.g., BC and IoT deficits) and determining measurable metrics to be considered carefully in BIoT design and management to address those challenges. While the following Chapters will discuss BIoT challenges and metrics in detail, Sections 1.1-1.3 introduce such efficiency affecting factors and how they are used to formulate the research questions in this thesis to reach the defined goals.


1.1 Identifying BIoT Efficiency Affecting Factors

The research and practical evaluations on a set of BC-IoT integrated applications conducted in this thesis identify Scalability, Energy Efficiency, and Security as the most crucial efficiency impacting factors of any BIoT system. These 3 aspects are affected by various design and technology adaptation related aspects. A few primary aspects are shown in Figure 1.1, which are classified into three groups explained as follows.

Figure 1.1: Identified Challenges of BIoT Systems and Corresponding Elements


1.1.1 Blockchain-related Efficiency Impacting Factors

As shown in Figure 1.1, the BC-related efficiency impacting factors of BIoT are associated with the following aspects [137]. (i) The choice of BC type, e.g., public or private, permissioned or permissionless, directly impacts the accessibility, transparency, and coverage scope of employed BCs. (ii) The consensus mechanism of a BC, handled between miners (validators) during and after transaction (TX) and block validation, affects the overall scalability, trust, and reliability of BIoT systems relying on that BC. (iii) The throughput or TX validation rate of a BC, i.e., Transactions per Second (TPS), is influenced by the consensus mechanism employed and the efficiency of the networking layer over which miners and clients communicate with each other. The TX validation rate is the key scalability indicator of BCs. (iv) The reliability, security, and scalability of BCs are influenced by the number of its users (clients) and the miners who verify the BC clients' TXs. The larger the number of miners, the more secure and reliable that BC becomes. However, an increase in the number of clients leads to a higher number of TXs, which puts the BC network under pressure. (v) Employing lightweight BC clients, to increase the number of clients while not demanding large storage and computational power, influences the adoption of BCs.

1.1.2 IoT-related Efficiency Impacting Factors

As shown in Figure 1.1, the major efficiency impacting factors directly related to the IoT transmission are (i) the number of IoT devices and the corresponding data object size, (ii) the bandwidth and data rate offered by data transport schemes, (iii) the traffic type, i.e., the transport scheme or pattern, (iv) the communication range covered by the underlying IoT infrastructure, which impacts the use cases, and (v) mechanisms to establish automated data collection in order to remove or reduce human interaction and potential data manipulations.

1.1.3 BC-IoT Adaptation-related Efficiency Impacting Factors

As shown in Figure 1.1, this group is related to the conjunction point of IoT technologies and BCs. The efficiency of this layer has been missed the most in state-of-the-art BIoT solutions and is affected by the following aspects. (i) The identification of IoT devices to prevent malicious activities is vitally important. Thus, an Identity and Access Management (IAM) mechanism that takes into account users' BC credentials and their IoT devices' identification is essential. (ii) Data collection approaches established in BC-IoT integrated applications require adapters to collect the BC and IoT infrastructure data and, accordingly, to re-/set data transmission parameters dynamically. An adaptation process is needed such that IoT nodes can perform as BC clients. (iii) BC-specific cryptographic signature functions on IoT nodes need to consider the Maximum Transmission Unit (MTU) of the IoT network with the goal of sending larger data packets (illustrated in the sketch after this paragraph). (iv) An overarching BIoT architecture is needed that considers and encourages dynamic and manageable BC-IoT systems.
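To make point (iii) above concrete, the following Python sketch signs a sensor reading and splits the signed payload into fragments that fit a small LoRaWAN-style MTU. The 51-byte limit, the HMAC-based signature placeholder, and all names are illustrative assumptions, not the signature or fragmentation scheme specified later in this thesis.

    import hashlib
    import hmac
    from typing import List

    MTU_BYTES = 51            # assumed LoRaWAN-like payload limit, for illustration only
    SECRET_KEY = b"demo-key"  # placeholder; a real BC client would use its wallet key

    def sign_payload(data: bytes) -> bytes:
        """Attach a 32-byte HMAC-SHA256 tag as a stand-in for a BC signature."""
        tag = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
        return data + tag

    def fragment(payload: bytes, mtu: int = MTU_BYTES) -> List[bytes]:
        """Split a signed payload into MTU-sized fragments with a 2-byte sequence header."""
        body = mtu - 2  # reserve 2 bytes per fragment for the sequence number
        chunks = [payload[i:i + body] for i in range(0, len(payload), body)]
        return [idx.to_bytes(2, "big") + chunk for idx, chunk in enumerate(chunks)]

    if __name__ == "__main__":
        reading = b'{"sensor":"pm10","value":42.1,"ts":1639222111}'
        frames = fragment(sign_payload(reading))
        print(f"{len(frames)} fragments, sizes: {[len(f) for f in frames]}")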

1.2 Research Questions

This thesis formulates the following three Research Questions (RQs) to propose a practically reliable and efficient BIoT. As illustrated in Figure 1.1, each of these RQs is addressed separately with regard to the elements involved in and impacting the BC, IoT, and BIoT architecture aspects. Thus, by considering a comprehensive list of dependencies affecting the BIoT efficiency, these RQs compel a set of efficient mechanisms to be carefully designed and specified for the collection, transmission, and storage of IoT data.

RQ1: How to achieve scalability in BC-IoT integrated ecosystems?

Answering this question requires a broad study on the scalability affecting metrics, followed by the design and development of BC-IoT integrated applications, BIoT architectures, and underlying protocols.

The scalability of the proposed BC designs can be evaluated against well-known related metrics such as TX validation rate, number of forks, and BC size growth. The scalability performance needs to reach higher TX validation rates than well-known BCs such as Bitcoin and Ethereum, and a level comparable to recent BC proposals such as Monero, Zcash, EOS, or Cardano.

The proposed BIoT architecture's scalability shall be evaluated against the IoT-to-BC communications and transmission scalability efficiency provided. Packet loss, throughput, and maximum transmitted data size over a time unit are some of the metrics to be used for scalability measurements offered by the proposed BIoT architecture in this thesis.

Moreover, the proposed BIoT architecture needs to be evaluated by running real-world scenarios and simulations to ensure the fulfillment of the following parameters: (i) enabling a dynamic adapter layer for IoT protocols, which connects to the dedicated adapters to collect the communication parameters and set the transport scheme based on the inputs from the IoT technology, (ii) enabling software wallets that sign and transmit the data regardless of the data size and the fragmentation requirements of large data, and (iii) the functional capability of installing and running BC lightweight clients.

RQ2: How to facilitate energy-efficient BIoT?

Since, on one hand, the energy consumption of BC miners is a crucial issue for PoW-based BCs, and, on the other hand, the power supply requirements of IoT devices are an inevitable element of any BC-IoT integrated application, this RQ focuses on facilitating the energy efficiency of the BIoT architecture and the BC proposed in this thesis. Thus, the employed IoT devices and protocols, and the designed BC need to be evaluated with respect to the total computational power and energy consumed by each employed technology, especially the energy efficiency of the transmission schemes proposed in this thesis. The result of this evaluation shall be compared with related work and comparable solutions. For instance, in case a transmission mechanism is proposed, the energy consumption of each step in the data transmission flow needs to be analyzed via accurate simulations.

RQ3: How to secure the designed BC and architecture of BIoT, to enable trust, transparency, and user privacy?

During the past years there have been great attempts at enhancing the scalability of BCs; however, some of these approaches trade off PoW security for higher scalability. Thus, in order to make sure that the scalability features proposed by this thesis will not pose potential security threats, the BC-IoT integrated applications, the designed BC, and the BIoT architecture proposed in this thesis all need to consider addressing well-known security attacks such as the "51% attack", "grinding attack", and "Denial of Service (DoS) attack", as well as the security evaluation of the BIoT architecture against "Hardware Replacement", "Hardware tampering", "Man in the middle", and "Data corruption".

1.3 Goals

The analysis of existing studies such as [208, 170, 85, 188, 218] leads to the fact that reaching a practically efficient BIoT requires proactive considerations within the application layer, i.e., the social and functional aspects of BC-IoT integrated applications, and the technical layer, i.e., the underlying BC, IoT, and the adaptation of these two realms. Hence, this thesis pursues 3 goals to tackle the inefficiency of practical BIoT applications by defining Technical, Functional, and Social goals.

The Technical goal focuses on (a) Scalability, (b) Energy Efficiency, and (c) Security of (i) BC-IoT integrated applications, (ii) BCs, and (iii) BIoT architectures. Thus, a selection of methods is proposed to answer the specified RQs, and their consecutive results are reflected in the targeted goals. This thesis proves that proactively answering RQs 1-3 builds the technical foundation of these three aspects to achieve the technical goals. In this regard, this thesis aims at an IoT-oriented BC design and a generic BIoT architecture to empower the BIoT efficiency within a broad range of applications.

The Functional goal is specifically determined to provide transparency, trust, and manageability. Thus, this thesis aims to design and develop BC-IoT integrated applications such that users can access information that was not possible to gain by using centralized systems. The transparency and trust of the provided data, especially those collected automatically by IoT devices, are provably fulfilled by addressing RQ3. Since the manageability of BIoT applications depends on the underlying architecture and the BIoT systems' design, this functionality shall be supported (a) by the BIoT architecture to be designed, and (b) by the design of BIoT applications, e.g., the proper use of Smart Contracts to handle P2P interactions decentrally.

The Social goal is concentrated on the public acceptance and adoption of BC-IoT integrated applications through user privacy fulfillment and decentralized identity management (focused on in RQ3). Therefore, in this thesis, on one hand, the General Data Protection Regulation (GDPR) compliance of the designed BC elements is investigated; for this purpose, data storage and deletion in BCs are analyzed regarding the effects of deleting data in the designed BC. On the other hand, design decisions to meet the identity management and privacy demands of users are determined within Know-Your-Customer (KYC) and Know-Your-Device (KYD) approaches and applied via novel system designs.

1.4 Thesis Contribution

The contributions of this thesis are 4-fold. First, this thesis designs various decentralized BC-IoT integrated dApps, i.e., the NUTRIA SCT, the BPMS pollution monitoring, the ITrade data streaming platform, and the KYoT platform as a KYC and KYD system (cf. Sections 3.3 - 3.1, respectively). The proposed design and implementation for these 4 dApps specifically considers the goals of this thesis explained above. While the proposed designs for these dApps have proven to be novel, they have also been used for the experimental analysis of such BIoT dApps; thus, as an output (i.e., the second contribution), this thesis proposes an extensive categorized set of metrics affecting BIoT systems.

As the third contribution, this thesis introduces a BIoT architecture, i.e., BIIT (Blockchain and IoT Integration archiTecture), as a framework for highlighting the required components and essential metrics of BIoT architectures. BIIT considers BIoT challenges and consists of the key elements for the efficient adaptation of BCs and IoT technologies. Components and features embedded in BIIT were gathered throughout this Ph.D. thesis either via practical experiences or based on simulations conducted.

As the fourth contribution, this thesis proposes the design and implementation of an IoT-oriented Distributed Ledger (DL) called DLIT. DLIT is designed to address the three RQs by employing an efficient set of methods, including the consensus mechanism, TX types, TX aggregation, and novel scalability enhancement solutions based on sharding.

It will be shown that these 4 key contributions lead to achieving the goals, since at least 12 specific improvements are made possible for the scalability, energy efficiency, and security of BIoT, which are reflected in the technical, functional, and social aspects of such dApps.


1.5 Thesis Outline

The remainder of this thesis is structured as follows. As mentioned above, the efficiency of BIoT is affected by the different technologies employed in those systems, i.e., the BC, IoT, and their adaptation. Hence, Chapter 2 presents an overview on the background information of these three elements. Relevant security primitives and concerns, potential state-of-the-art scalability enhancement proposals, and different implementations are surveyed in this Chapter. Chapter 2 will cover the main state-of-the-art BIoT methodologies and architectures, too.

Chapter 3 presents the 4 BIoT platforms designed and implemented in this thesis, which serve as testbeds for extracting practical information on the challenges and potential of BC-IoT integrated applications. The proposed applications not only attest the viability of BC-IoT integrated applications, they offer novel features that were not shown in any other related work. This chapter covers the detailed categorization of BIoT potentials and challenges, as one of the contributions of this thesis.

The identified BIoT challenges are used for the determination of a dynamic, flexible, and software-defined BIoT architecture, i.e., BIIT, presented in Chapter 4. To even further improve the efficiency of BIoT, this thesis proposes several novel scalability, security, and privacy improvements on IoT-oriented DLs, presented in Chapter 5. This Chapter covers the design, development, and evaluation of the proposed IoT-oriented DL, called DLIT. Furthermore, this Chapter includes the detailed specifications of the novel features designed for inter-miner communications, sharding, consensus, TX validation, mining processes, and the GDPR-compliant design of DLIT.

Finally, Chapter 6 summarizes this thesis' main contributions and highlights the main results achieved. This Chapter concludes the thesis, presents the final words, and sheds light on potential future work.


2 Background

This chapter provides the technical foundation of this thesis by presenting an overview of Distributed File Systems, Blockchains (BCs), and Internet-of-Things (IoT) protocols and their enabling technologies. At first, by defining the differences between centralized, decentralized, and distributed file systems, the historical journey of IT from traditionally centralized systems towards distributed systems is presented. Next, by introducing BCs and specifying the differences between BC types and consensus mechanisms, the elements impacting the security, energy efficiency, and scalability of BCs are studied. The elaboration on such a diverse technical area shall help with the explanations and justifications provided further in the proposed methods and techniques. The same applies to the IoT ecosystem, which will be introduced and elaborated on, especially considering 5G wireless communications. Finally, this chapter brings together the two worlds of BCs and IoT by discussing incentives, use cases, and concerns of BC-IoT integrated systems and applications. The information provided here clarifies the BIoT requirements and challenges, which will be referred to many times throughout the next Chapters.

2.1 Distributed Storage Systems

A fundamental expectation from data storage systems is preserving a data object without making any changes to it or tampering with any part of the data. In the centralized data storage world, it is the responsibility of the hardware, operating system, and the application software to keep the data safe and unchanged. For example, a Portable Document Format (PDF) file including ASCII characters, an image, or the data records in databases shall be kept safe without any random changes. Thus, data storage systems need to assure users that no changes have occurred on a particular data object by being tamper-proof [210]. Approaches to guarantee the tamper-proofness of storage systems include (a) hardware-based mechanisms, e.g., Hardware Security Modules (HSM), which assure the device hardware is not manipulated, (b) software-based functions, e.g., hash functions, which assure the integrity of data, and (c) combinatory approaches, e.g., Physically Unclonable Functions (PUF), which ensure a specific hardware and software is used with no unexpected changes [198, 210].

Additionally, in some cases, data storage systems are expected to offer change traceability by time stamping the updates and changes made to data objects. Traceability shares common semantics in many ways with "Git"-based version control for files. Programmers mostly use git to monitor the changes to a program code and, if needed, revert to a state in the past [210].

To offer a reliable ecosystem, centralized approaches, even if efficient, have proven to be a bottleneck and single point of failure. As shown in Figure 2.1, in a centralized storage all nodes are connected to (and depend on) a central entity for storing their data. Hence, a compromised central node could cause essential losses for data owners. Moreover, centralized storage systems hinder the efficiency of processing large files, e.g., for Big Data analysis [210].

To confront the safety and efficiency risks of centralized storage, decentralized storage enables the connection of nodes to more storage hosts. These storage hosts are usually interconnected. Distributed storage networks, however, remove the central storage entities and delegate the storage to all the nodes in the network, which are connected to other peers [210].

Figure 2.1: Storage Network Types (Centralized, Decentralized, Distributed) [210]

Different operating systems on computers and digital devices use their internal clock for time stamping. In decentralized and distributed networks, however, a common notion of time and time stamping has to be integrated explicitly. Otherwise, keeping track of changes would result in non-synchronized decisions [210]. The multiple DSS technologies shown in Figure 2.2 each have a specific underlying architecture and design, with a different level of trust, transparency, decentralization, and security provided for BIoT use cases. The following sections discuss these different technologies in detail based on [210].

Figure 2.2: Scope of Distributed Storage Systems — The Big Picture [210]

2.1.1 Distributed File Systems

Over time, centralized data storage systems have evolved by progressing file system management combined with Operating Systems (OS), such as with the Unix Network File System (NFS) [212]. While OSes enhanced the efficiency of their processing units, distributed approaches optimized along operational dimensions by distributing data storage and processing tasks between multiple computing entities to reduce data transmission delays, especially in decentralized systems [61, 136]. Thus, as shown in Figure 2.2, a range of different types of data storage systems and their scopes exists [210].

A driving factor in decentralizing data storage systems was the growth in data volume, collecting data in/from different applications, leading to "Big Data" and data mining approaches. Big Data analysis demands fast processing on massive data volumes, well exceeding TeraByte (TB) sizes and slowly reaching the scale of PetaBytes (PB). To store and analyze such huge data amounts, distributed file system frameworks lay the only feasible path to follow [210]. Many different Distributed File Systems (DFS), such as the Google File System (GFS) [212], NFS, and the Hadoop Distributed File System (HDFS) [212], have been developed during the last three decades. These DFSes have resulted not only in efficiency and performance of storage and processing, but also in transparency (e.g., location, naming, access, and replication), redundancy, user mobility, ease of access, and high availability [212, 210].

The distribution of data via DFS has extended in different directions, too. The first dimension addresses computational units, which are placed in the same location, e.g., within the same rack(s) in a building. The second dimension refers to files being hosted remotely, i.e., on Cloud storage systems. While HDFS, GFS, and NFS focus mostly on the first dimension, Cloud storage systems, such as Amazon S3 and Google Cloud Storage, have followed the second path. The third dimension refers to data storage that is made possible without any single entity in charge of the data storage. This is the case with Distributed Storage Systems (DSS), such as IPFS [25] and Sia [28, 254, 210].

2.1.2 DSS Implementations

While a wider range of DSS implementations exists, the major players of distributed storage systems as of 2021 include the following five (cf. Table 3.4), described based on [210]:

2.1.2.1 Storj

Storj proposes a distributed data storage framework that scales to exabytes of data storage globally. The Storj network stores, encrypts, shards, and distributes data to nodes for storage. This system is designed to prevent breaches by being modular, i.e., consisting of independent components with task-specific jobs. Storj aims at achieving high security, performance, reliability, and cost-efficiency [235]. Metadata servers, object storage servers, Satellites, and clients are the major actors in the Storj network. Storj defines a set of nodes as "Satellites", which are responsible for discovering the storage server nodes and keeping track of metadata, accounting, and payments. Storj clients and storage server nodes are connected to each other and to the satellite nodes via P2P connections [235].

2.1.2.2 Sia

Sia is an open-source BC-based DSS. Sia ensures data recovery for its users by redundantly dividing each data file into 30 segments. These segments are encrypted, using the Threefish [55] hash function, and stored in a distributed fashion. Using Reed-Solomon erasure coding, Sia facilitates the recovery of files even if 20 out of the 30 segments are lost [28, 254]. Sia requires its users to set up contracts, i.e., "file contracts", between themselves and the data storage hosts. These contracts indicate the service specifications, such as uptime commitment and pricing. File contracts are automatically applied using Smart Contracts, initiated by users and stored on the Sia BC, without the need for any intermediaries. Sia employs these Smart Contracts to set up Service Level Agreements (SLA) [28]. Sia nodes, including users and storage hosts, pay and get paid with Siacoin. The payment is conducted off-chain, analogous to P2P payment channels. An important design element in Sia is that storage hosts are discouraged from leaving the network, since they have to pay a collateral. Storage hosts in Sia prove their commitment to the data storage of files according to file contracts and the submission of "storage proofs", which consist of Merkle tree roots. Using Merkle trees, Sia assures the persistence of the data by a host. Each host has to submit the storage proof to the Sia BC within a specific time span in order to be paid.
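The Merkle tree roots used in such storage proofs can be sketched with a generic pairwise-hashing construction; the following Python fragment is an illustration only and not Sia's exact proof format.

    import hashlib

    def merkle_root(leaves: list) -> bytes:
        """Compute a Merkle root by pairwise hashing; an odd last node is paired with itself."""
        level = [hashlib.sha256(leaf).digest() for leaf in leaves]
        while len(level) > 1:
            nxt = []
            for i in range(0, len(level), 2):
                pair = level[i] + (level[i + 1] if i + 1 < len(level) else level[i])
                nxt.append(hashlib.sha256(pair).digest())
            level = nxt
        return level[0]

    # Sia splits each file into 30 segments; hypothetical segment contents used here.
    segments = [f"segment-{i}".encode() for i in range(30)]
    print(merkle_root(segments).hex())

Changing any segment changes the root, which is why a stored root suffices as a compact commitment to the data a host promises to keep.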

2.1.2.3 Swarm

Swarm is the DSS of the Ethereum BC. Swarm introduces a 4-layer design, starting with a P2P layer as its transport layer. The next layers include an overlay network layer, a data access layer, and the application layer [246, 91]. In Swarm, nodes run a client application and shape the "underlay" network of Swarm. Each node obtains an address and can listen to the network, dial in to other peers directly, and make live connections. Each node in Swarm is identified by its 256-bit overlay network address, which is determined based on that node's Ethereum address. Swarm enables authentication and integrity verification, since Ethereum addresses are generated by hashing the corresponding node's PK. The Swarm topology in the overlay network is shaped with the Kademlia distributed hash tables and routing mechanism [171]. Kademlia guarantees the presence of a path between any two nodes with O(log(n)) hops. Swarm establishes quasi-stable peers over Kademlia relying on TCP live channels.
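The Kademlia routing that Swarm builds on measures closeness between addresses with an XOR distance metric; the following minimal sketch uses tiny 8-bit example addresses (an assumption for readability, since Swarm overlay addresses are 256 bits long).

    def xor_distance(a: int, b: int) -> int:
        """Kademlia measures the distance between two addresses as their bitwise XOR."""
        return a ^ b

    # Tiny 8-bit example addresses; closer nodes share a longer common address prefix.
    node, peer_close, peer_far = 0b10110010, 0b10110100, 0b00010010
    print(xor_distance(node, peer_close))  # small distance: long shared prefix
    print(xor_distance(node, peer_far))    # large distance: short shared prefix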

2.1.2.4 Interplanetary File System (IPFS)

IPFS is a P2P public DSS for storing and accessing data objects, files, websites, and applications. IPFS utilizes content addressing, i.e., assigning addresses and fetching files according to the data object content (the hash of the content), instead of its location in the distributed storage network [25]. IPFS employs the Interplanetary Linked Data (IPLD) to translate between hash-linked data structures, allowing its clients to explore data regardless of the underlying protocol. These links between content are embedded within that content address through a Directed Acyclic Graph (DAG), i.e., Merkle DAGs [25]. If a web site is updated, only updated files receive new content addresses. Thus, transferring large datasets is made efficient, since only the updated parts need to be transferred. IPFS uses a Distributed Hash Table (DHT) as a database of keys to values. A DHT is scattered across all the storage nodes (i.e., peers) of IPFS. To find content, clients need to request these peers. IPFS employs Bitswap to reply to these requests. In IPFS, data transfer is encrypted, but the essential metadata is published publicly. This metadata includes unique node identifiers and addresses of the data blocks. For further privacy provisioning, users can use an IPFS "public gateway" for data transfers.
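To illustrate the idea of content addressing, though not IPFS's actual CID scheme (which uses multihash-encoded identifiers), the following sketch derives an address from the content itself, so that any change to the content yields a new address.

    import hashlib

    def content_address(data: bytes) -> str:
        """Illustrative content address: the SHA-256 digest of the data itself."""
        return hashlib.sha256(data).hexdigest()

    v1 = b"<html>my site v1</html>"
    v2 = b"<html>my site v2</html>"
    print(content_address(v1))  # address of version 1
    print(content_address(v2))  # updating the content yields a different address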

2.1.2.5 Filecoin

Filecoin is a public DSS. It enables a dynamic distributed data storage marketplace for data storage providers (storage miners) and users. The pricing and availability of the storage is designed to be decentralized. Payments in Filecoin are made through the FIL cryptocurrency, the native currency of the Filecoin BC. The BC in the Filecoin system is used for storing the TXs between users and storage miners, and the storage proofs are provided by the storage miners [121, 229].

Filecoin is built on IPFS, and every node in Filecoin can communicate with other IPFS nodes. However, not every IPFS node is a Filecoin node, and they cannot be paid by Filecoin users or vice versa. Filecoin nodes do not join and contribute to the IPFS DHT. In IPFS, a piece of data exists as long as there is a node that stores it. Filecoin attempts to incentivize data storage providers to join the IPFS system and provide storage services to others. Filecoin enables a data retrieval market in which storage providers are paid for caching data for faster retrieval. This type of miner in Filecoin is called a "retrieval miner". In this case, to respond to a data retrieval request, user requests do not necessarily need to be sent into the IPFS network, since the paid retrieval miners can return user data to them faster.

In comparison to IPFS, where data storage is only done by volunteers and there is no control over the centralized pinning process, i.e., storing files, in Filecoin paid members, i.e., storage miners, provide the data storage. Thus, Filecoin offers higher reliability as a DSS to its users. In essence, both IPFS and Filecoin use the same file format, i.e., IPLD, networking protocol, i.e., libp2p, and data transfer mechanisms, i.e., Graphsync and Bitswap. However, Filecoin is designed for higher efficiency for larger files' storage and longer and guaranteed service provision [121, 143].

Table 2.1: Comparison of Selected DSSes [210]

           | Storj                               | Sia                              | Swarm                          | IPFS                         | Filecoin
Security   | TLS, S/Kademlia, PK Hashing         | Threefish Hash                   | Keccak256 Hashing              | Encrypted Data Transmissions | Encrypted Data Transmissions
Consensus  | Byzantine Altruistic Rational (BAR) | Storage Proofs: Merkle Tree Root | Binary Merkle Tree (BMT) Chunk | —                            | Storage Proofs
Payment    | STORJ Token                         | Sia Coin                         | Ether                          | —                            | FIL
Execution  | Storj Software                      | File Contracts                   | EVM and SCs                    | IPFS Client App              | Filecoin Client App


2.1.3 Performance Evaluation of DSSes

DSS performance can be measured from various angles. The processes undertaken within a DSS include many steps, such as transmitting data, which cause (and are affected by) network delays, the consensus mechanism used by miners and storage providers, replications of the data objects, the infrastructural capability of storage hosts, the size of data objects, and the encryption and signing of data objects. Hence, the efficiency of these processes affects the overall efficiency and performance of DSSes [210].

A key performance indicator of DSSes is their throughput, that is, their ability to receive data objects from users and store them reliably within a time unit, calculated as "TX/s". The performance of DLs and BCs regarding their throughput and latency has been evaluated in different studies [117, 137, 83]. As shown in Table 2.2, the latency and throughput of some DLs such as Bitcoin and Ethereum can be quite limiting for a DSS, while other ones like EOS offer a higher number of TXs handled [210].

Table 2.2: Comparison of Throughput and Latency of Selected DLs Based on [142, 210]

DL                | Bitcoin | Ethereum | Litecoin | Monero | Zcash | EOS   | Cardano
Throughput [TX/s] | 7       | 15       | 28       | 30     | 27    | 4,000 | 257
Latency [min]     | 10      | 0.25     | 2.3      | 2      | 2     | 0.5   | 0.33
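Such throughput figures are obtained by counting the TXs a system confirms within a measurement window. A minimal Python sketch of this calculation, with hypothetical confirmation timestamps, could look as follows.

    def throughput_tx_per_s(confirmation_times, window_s: float) -> float:
        """Throughput as confirmed TXs per second within a measurement window."""
        confirmed = [t for t in confirmation_times if t <= window_s]
        return len(confirmed) / window_s

    # Hypothetical confirmation timestamps (seconds since the test started)
    timestamps = [0.4, 0.9, 1.5, 2.2, 2.8, 3.1, 3.9, 4.6]
    print(throughput_tx_per_s(timestamps, window_s=5.0))  # -> 1.6 TX/s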

From the users' perspective, input/output (I/O) speed, i.e., write/read speed, is a key performance indicator of DSSes. The user experience in I/O interactions with DSSes has been evaluated for IPFS in [228] via an empirical approach. [228] sets up a network of 8 IPFS storage nodes hosted on Amazon EC2 machines distributed in different countries. The outcomes of this study show how user requests, in terms of size, protocol, and distance, affect the throughput and latency of IPFS. For instance, these outcomes prove that the I/O latency increases with the request size, since IPFS divides a data object into multiple blocks, which incurs a high disk I/O overhead when storing these blocks to a local storage device. Moreover, IPFS shows almost the same throughput as HTTP for small requests (i.e., request sizes of 1, 4, 16, 64, and 256 KB), whereas for larger requests (i.e., 1, 4, 16, 64 MB), HTTP outperforms IPFS on an exponential growth scale. IPFS suffers from higher latency for write and read operations in comparison to HTTP, up to 100% in some cases. Furthermore, this study shows that the geographical distance between storage hosts and the client also affects (up to 800 times) the latency of I/O, even for small data objects. Table 2.3 summarizes the performance evaluation of IPFS for different data sizes [210].


Table 2.3: Comparison of Throughput and Latency of Remote Read Operations in IPFS Based on [228, 210]

Data Object Size  | 16 KB | 64 KB | 256 KB | 1 MB     | 4 MB     | 16 MB    | 64 MB
Throughput [MB/s] | 0.05  | 0.1   | 0.3    | 0.5      | 1        | 1.2      | 1
Latency [ms]      | 400   | 400   | 600    | 1 * 10^3 | 3 * 10^3 | 8 * 10^3 | 80 * 10^3

2.2 Cloud Storage Systems

Cloud data storage persists data via a cooperating storage service provider, which may use multiple devices in remote locations, i.e., distributed storage with centralized management. Cloud storage providers, such as Amazon Web Services (AWS) S3 [26] and Google Cloud Storage [22, 107], offer a wide range of data storage services: Google Cloud Storage materializes the configuration of user data storage settings and life cycle management features. Thus, an automated transition to lower-cost storage classes happens whenever the state of data usage meets the criteria users specified, such as data reaching a certain age in storage or users having stored a newer version of their data. Moreover, users can store data with automatic redundancy options to optimize response time or to create a robust disaster recovery plan [210].

Cloud storage services offer data object versioning by storing old copies of data, even when they are erased or overwritten. Users of such systems can define minimum retention periods in which data objects must be stored before they may be deleted. Data owners can place a hold on data to prevent deletion [210].

To protect user data, cloud storage services encrypt data with keys created by users and stored with the storage providers' key management services that users manage. Access Control Lists (ACL) configured by data owners prevent uniform access by all users to the shared data. Data owners can automate payment requests via cloud storage services that require data accessors to be charged for network, operation, and data retrieval [210].

User data can be corrupted upon uploading to or downloading from the Cloud storage, e.g., due to noisy network links, memory errors along the path, or software bugs. Cloud storage providers encourage users to employ hash functions before/while transmitting data to the Cloud and after downloading the data to detect corrupted files. For instance, Google Cloud storage recommends its users to employ a CRC32c hash function [24, 210].
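A minimal sketch of this integrity check follows, assuming a SHA-256 digest is recorded before upload and re-computed after download; Google Cloud Storage itself recommends CRC32c (or MD5) checksums, which in Python would require a third-party package, so SHA-256 stands in here for the general pattern. The file name is hypothetical.

    import hashlib

    def file_digest(path: str) -> str:
        """SHA-256 digest of a file, streamed in chunks to avoid loading it fully into memory."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    # Hypothetical local file standing in for the object to be uploaded.
    with open("sensor_archive.bin", "wb") as f:
        f.write(b"IoT sensor readings ..." * 1000)

    digest_before_upload = file_digest("sensor_archive.bin")
    # ... upload, later download the object again, then re-compute and compare:
    digest_after_download = file_digest("sensor_archive.bin")
    assert digest_before_upload == digest_after_download, "object corrupted in transit"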

Cloud storage systems reduce the infrastructure costs and maintenance effort of data storage for enterprises, but they still do not address all concerns experienced with centralized approaches. Users may not know the exact location where their data is stored or whether the storage service provider is abusing their data. In case no Service Level Agreement (SLA) or legal contract is concluded between data storage providers and users, users will be left without legal rights to protect their data from being abused. National and international regulations, such as the European General Data Protection Regulation (GDPR) [235], have been defined and enable a certain enforcement of user data storage rights and precautions to be met by Cloud service providers [210].

2.3 Blockchains

One specific group of DSSes is Blockchain (BC)- or Distributed Ledger (DL)-based approaches. These approaches can be divided into two categories: the first category deploys BCs or DLs as their underlying data storage infrastructure. The second category relies on BCs and DLs, but does not use their consensus mechanism for data validation [208, 210].

BCs are defined as "decentralized, distributed, and oftentimes public, digital ledger consisting of records called blocks that is used to record transactions (TXs) across many computers so that any involved block cannot be altered retroactively, without the alteration of all subsequent blocks. This allows the participants to verify and audit transactions independently and relatively inexpensively. A BC database is managed autonomously using a peer-to-peer network and a distributed timestamping server. They are authenticated by mass collaboration powered by collective self-interests. Such a design facilitates robust workflow where participants' uncertainty regarding data security is marginal" [3].

Several concepts and algorithms lay the foundation of BCs. Considering their core structure and functionality, BCs define overlay networks of P2P nodes that are comparable to DSSes, since data in these networks is stored in a distributed manner. As shown in Figure 2.3, a BC is organized as a chain of blocks storing TXs in backward-linked blocks, created and maintained via a network of distributed entities, in which the nodes actively persisting blocks are called "miners". The data stored in blocks contains TXs, sent by BC users to the network, and specific metadata of that block. BCs, in contrast to other DSS types, store data directly inside the chain after miners verify its validity [208, 210].

Figure 2.3: Chain of Blocks in a Blockchain [210]


Any two consecutive blocks on a chain are linked through pointers based on their content's hash values. Hash functions can be used to map data of arbitrary size to fixed-size values. A hash function h : {0,1}* → {0,1}^k maps an input string a ∈ {0,1}* of arbitrary length to an output string b ∈ {0,1}^k of fixed length k, such that ∀a ∈ {0,1}* ∃b ∈ {0,1}^k ∧ k ∈ N. Thus, the key characteristic of a hash function h : A → B is its resilience to collisions in the co-domain. This means it is very difficult to find two distinct input values with the same hash output, such that for a, b ∈ {0,1}* ∧ a ≠ b the hash outputs are equal, i.e., h(a) = h(b) [210, 211].

BCs and data storage systems depend on the collision-resistance of hash functions. Thus, for a BC the hash function (e.g., SHA2 and SHA3 [139, 86]) receives the content of a block (cf. Figure 2.4), and the hash function’s output will be specific to that block. In turn, any malicious or erroneous change in this block’s content will be detected. Therefore, each block stores two hashes, i.e., its own hash digest and the hash digest of the previous block, which is also referred to as its parent block. Therefore, BCs are tamper-proof, since the chain of blocks, i.e., the Blockchain, serves as the persisted distributed data storage [211].

Let us assume a block to be a tuple B = (p, c, n), defined as follows based on [211, 210].

• p ∈ {0, 1}^k is a pointer of fixed length k ∈ N to the previous block

• c ∈ {0, 1}∗ is a stream of 0s and 1s representing the content of a block

• n ∈ N is a special number associated with the block’s content.

An important property of n is the following: when n is concatenated with the block’s content, c ∥ n, and then hashed, the resulting hash has a prefix of zeros of fixed length i. Due to the collision-resistance of hash functions, a suitable n for a given c can only be found by brute force and is computationally expensive. Whenever such an n is found, the Proof-of-Work puzzle is solved and the block is valid:

valid(B) ⟺ g(c ∥ n) = 0_1 0_2 0_3 … 0_i ∥ {0, 1}^(k−i)    (2.1)

where f, g : {0, 1}∗ → {0, 1}^k are hash functions and ∥ denotes the concatenation operation.

Let C be a chain of blocks with B_n = (p_n, c_n, n_n) being the last block. A new block B_{n+1} = (p_{n+1}, c_{n+1}, n_{n+1}) can be attached to B_n such that p_{n+1} = f(p_n ∥ g(c_n ∥ n_n)). This yields the new chain C′ = B_1 ∥ B_2 ∥ … ∥ B_n ∥ B_{n+1}. It becomes clear that altering the content of block B_i from c_i to c_i′ yields p_{i+1} ≠ f(p_i ∥ g(c_i′ ∥ n_i)). This break propagates onwards and renders all subsequent blocks invalid. It is important to observe here that the collision-resistance property (cf. Equation (2.1)), which ensures g(c_i′ ∥ n_i) ≠ g(c_i ∥ n_i), is responsible for the diverging output and thus the break of the chain [211].
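The linking and PoW mechanics described above can be condensed into a short, illustrative sketch; SHA-256 stands in for both f and g, and a deliberately low difficulty of 8 leading zero bits is assumed. This is a simplified model, not the implementation of any particular BC.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash function (SHA-256), standing in for both f and g."""
    return hashlib.sha256(data).digest()

def leading_zero_bits(digest: bytes) -> int:
    bits = bin(int.from_bytes(digest, "big"))[2:].zfill(len(digest) * 8)
    return len(bits) - len(bits.lstrip("0"))

def mine(prev_pointer: bytes, content: bytes, difficulty: int):
    """Brute-force a nonce n such that g(c || n) starts with `difficulty` zero bits."""
    nonce = 0
    while True:
        digest = h(content + nonce.to_bytes(8, "big"))
        if leading_zero_bits(digest) >= difficulty:
            return (prev_pointer, content, nonce)
        nonce += 1

def next_pointer(block) -> bytes:
    """p_{n+1} = f(p_n || g(c_n || n_n)); here f = g = SHA-256."""
    p, c, n = block
    return h(p + h(c + n.to_bytes(8, "big")))

# Build a tiny chain with a low difficulty of 8 leading zero bits.
genesis = mine(b"\x00" * 32, b"genesis", 8)
block1 = mine(next_pointer(genesis), b"tx data of block 1", 8)
block2 = mine(next_pointer(block1), b"tx data of block 2", 8)

# Tampering with block1's content breaks the link to block2.
tampered = (block1[0], b"altered tx data", block1[2])
assert next_pointer(tampered) != block2[0]
```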


2.3.1 Building Elements of Blockchains

Since BCs operate over a P2P overlay network to facilitate communications between miners and clients, the data shared and distributed throughout the network of miners is part of the ledger of TXs persisted within blocks. While clients hold wallets for initiating TXs only, miners may need to store a full copy of the distributed ledger (DL). Thus, by running the BC’s consensus mechanism across all BC nodes, decentralized rules for the mining processes are adhered to, so that the consensus mechanism forces miners to verify TXs, confirm those, and persist the data in the chain, which is considered to be the major difference to other non-BC-based DSSes [210].

As shown in Figure 2.4, each Bitcoin block consists of a header and a content part. A Bitcoin block header includes the following data, defined here based on [57, 210]:

• “version” is a number that indicates which set of block validation rules is to be followed. Four versions are available as of today, each referring to a specific forked version of Bitcoin.

• “parent block’s hash”. Since a miner mining a new block has already received the previous block in the chain, it knows its parent block’s hash and has to include that hash in the newly mined block to extend the chain.

[Figure 2.4 shows a Bitcoin block split into the block headers (block version, previous block hash, Merkle tree root hash, time stamp, nBits, nonce) and the block content, a list of transactions; each raw transaction (TX) carries a version, TX_in count, TX_in (previous output hash and index, script bytes, signature script, sequence), TX_out count, TX_out (value, Pk_script bytes, Pk_script), and lock_time.]

Figure 2.4: An Example of a Block Content in the Bitcoin Blockchain [210]


• “hash value” of the Merkle tree root. A Merkle tree root is constructed using all TX IDs of the TXs in this block. It is the SHA256(SHA256()) of those TXs paired in a binary tree. The ordered list of TXs constructs the leaves of this tree. The hashes of these TXs are paired, concatenated, and hashed again until only one root value remains (cf. Figure 2.5; a minimal sketch of this construction follows the figure). A Merkle root is used to verify the integrity of data in many different BCs.

• “timestamp” is the time at which this block was mined.

• “nBits” is an encoded representation of the target difficulty threshold of this block. This means that this block’s hash needs to be less than or equal to the value determined via nBits.

• “nonce” is a random number. Miners try to find a nonce for which the hash of the concatenated block content and nonce matches the difficulty level specified by nBits. Finding a valid nonce is what identifies a miner as the creator of the next block in the chain. The process of finding the appropriate nonce and hashing data demands a massive amount of computational power. To win this competition (the “Crypto Puzzle”), Bitcoin miners consume a high volume of energy [172].

[Figure 2.5 shows the leaves A, B, C, and D hashed pairwise into H(A|B) and H(C|D), which are hashed again into the root H(H(A|B)|H(C|D)).]

Figure 2.5: A Simplified Example of the Merkle Tree Construction [210]
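As a complement to Figure 2.5, the following minimal sketch shows the pairing, concatenation, and repeated double hashing up to a single root; it is a simplification under the assumption that the raw TX bytes are hashed into leaves, whereas in Bitcoin the leaves are the TX IDs (themselves double-SHA256 digests of the raw TXs). The last element is duplicated on odd-sized levels, as Bitcoin does.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(txs: list[bytes]) -> bytes:
    """Pair, concatenate, and double-hash the ordered TXs until one root remains."""
    level = [double_sha256(tx) for tx in txs]       # leaves of the binary tree
    while len(level) > 1:
        if len(level) % 2 == 1:                     # duplicate the last hash on odd levels
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"txA", b"txB", b"txC", b"txD"])
print(root.hex())
```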

The block’s content part contains all newly mined TXs. Different BCs employ various types of TXs to enable P2P communications within their ecosystem. Generally, BCs can be labelled as “transactional DSSes”, since they employ TXs for changing the state of the BC, i.e., its length. This means that every single change is accepted only if it is recorded via a dedicated TX. In this regard, for instance, Bitcoin has implemented 25 TX types [194]. From a high-level perspective, a block in Bitcoin contains several fields in each TX to interpret the data stored within that TX. For instance, the “Raw” TX format used in Bitcoin (cf. Figure 2.4) includes the version, input, output, counters, and a lock time field.

BCs provide data integrity and authenticity by enforcing clients as well as miners to sign TXs using predetermined Public Key Cryptography (PKC) [50]. As in asymmetric cryptography, PKC uses a mathematically associated pair of public and secret/private keys (PK and SK, respectively) to avoid distributing encryption keys between different parties in a communication. Thus, the SK has to be kept secret and in the possession of only one entity. The PK is known publicly, i.e., by the other side of the communication, who can encrypt a message for the key owner or verify the owner’s signatures. When a message is encrypted with a user’s PK, it can only be decrypted with the corresponding SK. In this case, there is no need to transmit a “shared key” to decrypt the message, as the receiver already holds the SK. PKC algorithms, such as RSA and ElGamal [50], have proven to be hard to break, i.e., knowing the PK will not lead to disclosing the corresponding SK [233]. BCs employ PKC-based schemes known as the Digital Signature Algorithm (DSA) or the Digital Signature Standard [50]. Elliptic curve variants of DSA, i.e., EC-DSA [50], are used within BCs to enable the address generation (cf. Figure 2.6) and to preserve data integrity.

[Figure 2.6 shows the derivation of a Blockchain address: a private key yields a public key via public-key cryptography, which is then hashed to produce the address, e.g., bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh.]

Figure 2.6: Blockchain Address Generation based on Public Key Cryptography [210]
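A conceptual sketch of the derivation chain in Figure 2.6 follows; the public key is simulated by random bytes instead of a real EC-DSA key, and the final encoding is a plain shortened hex digest, whereas real Bitcoin addresses additionally involve RIPEMD-160, a version byte, a checksum, and Base58Check or Bech32 encoding.

```python
import hashlib
import os

# Stand-in for a real secp256k1 public key (33-byte compressed form); in practice
# this would be derived from the private key via elliptic-curve point multiplication.
public_key = os.urandom(33)

# Conceptual address derivation following Figure 2.6: hash the public key and
# encode a shortened digest.
digest = hashlib.sha256(public_key).digest()
address = digest[:20].hex()
print("address:", address)
```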

2.3.2 Time Stamping

Besides the operations of BCs as discussed, data storage is provided by them via DLs following the concept of time-stamped data objects, introduced in 1991 for the time stamping of electronic documents [135]. [135] introduced the mark-up functionality of a time stamp, which provides a unique “label” for a file, such that no one else can successfully dispute the validity of generating that label. This determines basic functionality BCs inherited, since the sequence of writes is essential to maintain a backward-linked list in a decentralized manner, for which the time stamp serves as a respective proof [210].

2.3.3 Blockchain Types

The Bitcoin network has been criticized for its high power consumption and low scalability. Hence, many different consensus mechanisms have been introduced [177], such as Proof-of-Stake (PoS), Byzantine Fault Tolerance (BFT), Proof-of-Authority (PoAuth), Proof-of-Space-Time (PoST), and hybrid mechanisms, such as in Libra [66]. These consensus mechanisms are intended to overcome PoW’s deficits, especially its scalability and energy efficiency concerns. In this regard, enterprise solutions of Distributed Ledgers (DL), such as Hyperledger [240], introduced a paradigm shift toward “private” BCs with limited access or contribution rights, but higher scalability by removing or reducing the mining dependency.

These newer approaches either improve the consensus mechanism’s efficiency compared to PoW-based BCs or trade off selected attributes of BCs, such as being publicly accessible or being permissionless. Thus, these solutions limit the availability of BCs to a specific group in private settings and restrict respective data storage options, too.

Since BCs are backward-linked blocks of data, regardless of their type (public or private), they all share the same structure and operate as a DL. Recent studies have distinguished these different types of BCs and DLs, which fit overall into four categories (cf. Figure 2.7) [117]. Accordingly, public permissionless DLs are specifically termed “BCs”, and all other types remain “DLs”. From a data storage point of view, all DL types store data identically; however, in permissioned versions only selected contributors are granted access by one authority in the DL, usually the owner of the DL ecosystem [210].

[Figure 2.7 arranges the four DL types along the public/private and permissioned/permissionless axes: public permissionless (public access for reading, writing, and mining; no permission required), public permissioned (public access for reading, writing, and mining; permission required), private permissionless (private access for reading, writing, and mining; no permission required), and private permissioned (private access for reading, writing, and mining; permission required).]

Figure 2.7: Distributed Ledger Types [210]

2.4 Mining and Consensus Mechanisms in Blockchains

Mining refers to the tasks performed by miners, which depend on the consensus mechanism in use. For instance, miners in Bitcoin (a) validate each new TX, i.e., open TX, (b) add them into a new block, and (c) verify the validity of other blocks mined by other miners. Details of these three tasks are determined by the consensus mechanism of Bitcoin.

Reaching a consensus in a BC means establishing a view or understanding of that BC’s state that is accepted and approved by a larger portion (i.e., >50%) of its miners. Consensus is enforced in BCs as the cornerstone logic, enabled by pre-defined algorithms followed by every miner. In the past years, after the introduction of Bitcoin, various consensus mechanisms have been implemented for different BCs, each of which is based on a set of rules defined by their developers to enable security, trust, and transparency. Considering the data-driven nature of BIoT use cases, consensus mechanisms directly impact the efficiency of the underlying BIoT system. Hence, the following sections are dedicated to elaborating on the most renowned consensus mechanisms, especially the IoT-specific ones.

2.4.1 Proof-of-Work (PoW)

PoW was introduced by Bitcoin, where miners have to validate “open TXs” to mine a new block. The validation process is mandatory, and it prevents double-spending of coins or malicious behavior of BC users. As mentioned above, Bitcoin miners are in an ongoing competition to solve these Crypto Puzzles, i.e., finding the nonce by which the block content hash meets the condition of mining a new block according to the current difficulty level. That condition is precisely defined by the number of leading zero bits the block hash output starts with. This range, or “difficulty”, is determined by the PoW consensus algorithm of Bitcoin for that state (height) of the BC. In Bitcoin, all these tasks need to be conducted by miners in less than 10 min within a highly competitive and time-sensitive situation. As a result, a miner who finds the nonce and validates the block faster than the others will add its block to the chain and, consequently, earn two rewards: first the reward for mining that new block and second a reward per validated TX [210, 208].

Each miner’s success is proportional to its computing power relative to the computing power of all miners. A single miner with limited computational power may not be able to mine a single block in years, which is why miners usually gather and create “mining pools”. These miners share their information when mining a block and eventually share the revenues [128]. There are two financial incentives for miners to participate in the consensus algorithm of PoW-based BCs: the first incentive is the fund they receive as a reward for mining a new block, and the second is the fund they get for validating each TX [185]. PoW is the most prominent consensus protocol among the existing consensus mechanisms, as Bitcoin’s success has proved the functionality of such an algorithm in mining and validating blocks.
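The proportionality between a miner's computing power and its expected success can be illustrated with a small back-of-the-envelope computation; the hash rates below are assumed example values, not measurements.

```python
# Probability of winning the next block is proportional to the miner's share of
# the network's total hash rate; expected daily revenue scales accordingly.
own_hashrate = 100e12          # 100 TH/s (assumed example value)
network_hashrate = 200e18      # 200 EH/s (assumed example value)
blocks_per_day = 24 * 60 / 10  # Bitcoin targets one block roughly every 10 minutes

p_next_block = own_hashrate / network_hashrate
expected_blocks_per_day = p_next_block * blocks_per_day
expected_days_per_block = 1 / expected_blocks_per_day

print(f"P(win next block)        = {p_next_block:.2e}")
print(f"expected blocks per day  = {expected_blocks_per_day:.2e}")
print(f"expected days per block  = {expected_days_per_block:,.0f}")
```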

2.4.1.1 PoW Security Risks

Some of the key security risks in PoW BCs are known as (i) 51% attacks, (ii) double spending, (iii) selfish mining, and (iv) Eclipse attacks [209].


(i) 51% Attack
In Bitcoin, a group of collaborating miners may reach more than 50% of the network’s computation power. In that case, those miners will have the ability to control the whole BC by controlling the longest chain [134]. This condition is known as the 51% attack [209].

(ii) Double Spending Attacks
In double spending attacks, an adversary tries to spend the same coins in more than one TX. In this case, the other side of the TX has to wait until he/she receives the TX confirmation. This means that the TX recipient has to wait for six validations, which take almost 1 hour, to make sure that no double spending attack has occurred. Miners in Bitcoin verify the TXs’ validity to avoid such an attack [209].

(iii) Selfish Mining
Selfish mining is the consequence of dividing miners into two groups of trusted and colluding miners. The second group tries to hold back and publish blocks gradually at a controlled frequency to keep the longest chain under their control and increase their share of mining [134]. According to the algorithm proposed in [128], selfish miners work on a private chain, and in each iteration the difference between the private chain and the public chain is calculated and saved into the variable Δ. If a new block is mined in the private chain and Δ equals 2, then the selfish miners will publish their chain in the network. This will cause them to win the competition with the trustful miners, as the newly published chain is ahead of the public chain by 2 blocks. Further, if a new block is mined in the public chain, selfish miners act differently according to the value of Δ. In the latter case, if Δ equals 0, the trustful miners win the competition. If Δ equals 1, the selfish miners will publish their already mined block; in this case there is no predefined winner. If Δ equals 2, all the private blocks will be published into the public chain and the selfish miners will win, as they are ahead of the trusted miners by 2. In those cases where Δ is more than 2, the selfish miners will publish only the first block in their private chain and continue working on top of their chain [209].
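A rough sketch of the decision rules paraphrased above (following the description based on [128], with Δ denoting the private chain's lead over the public chain) could look as follows; it only encodes the published strategy as text, not a full simulation.

```python
def on_private_block_found(delta: int) -> str:
    """Selfish pool mined a block; delta is the pool's current lead over the public chain."""
    if delta == 2:
        return "publish entire private chain (pool wins, 2 blocks ahead)"
    return "keep the block private and continue mining"

def on_public_block_found(delta: int) -> str:
    """Honest miners extended the public chain; delta is the pool's current lead."""
    if delta == 0:
        return "adopt the public chain (trustful miners win)"
    if delta == 1:
        return "publish the withheld block (race, no predefined winner)"
    if delta == 2:
        return "publish all private blocks (pool wins, 2 blocks ahead)"
    return "publish only the first private block and keep mining on the private chain"

print(on_private_block_found(2))
print(on_public_block_found(1))
```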

(iv) Eclipse Attacks
Eclipse attacks are attacks in which a subset ω of miners in a BC is deprived of receiving the correct information about the latest status of the network [134]. In this attack, the attacker monopolizes the incoming and outgoing connections of its victim; thus, the victim miner is isolated from its peers and other nodes in the BC [221].

2.4.2 Proof-of-Stake (PoS)

PoS is based on proof of the investments made by miners. Since PoS miners do not put in mining effort as in PoW, the act of mining is termed “minting” in PoS, and the miners are called “minters” or “Validators (Vs)”. In PoS, a minter is selected to mint a new block through a condition determined, for instance, by combining the hash of an Unspent Transaction Output (UTXO), the TX size (i.e., stake), and the TX age, i.e., coin-age. Usually, peers (i.e., Vs) with the highest coin-age can submit new blocks more often than low-stake peers [209].

If a Validator V dedicates a considerable amount of funds to a PoS-based BC, this V can be assumed to be a trustable member of the BC. Such a V suffers the most if any action against the consensus rules were committed by him/her while minting blocks, since this would render the BC insecure and unstable for its users and thus gradually lose those users, which leads to the devaluation of that BC’s native coins [191].

PoS is associated with being environment-friendly, as it does not require power-intensive calculations by Vs. In contrast to a PoW protocol, where computational resources play a significant role in the next miner’s selection and the market is prone to be monopolized by large enterprises working as united miners, in PoS and with memory-hard mining algorithms like “Scrypt” and “Dagger” this problem is alleviated for the most part [67].

A well-known version of PoS consensus mechanisms is the Delegated Proof-of-Stake (DPoS), in which a set of predefined nodes (minters) is responsible for mining tasks and TX validation. Some of the prominent implementations of DPoS are Slasher [67], Tendermint [238], and Bitshares [59]. Comparable to PoW, DPoS minters have the two basic tasks of (i) building new blocks and (ii) validating the blocks newly created by other minters. The predetermined list of minters in DPoS changes over time according to BC-specific rules.

Another version of PoS, the Leased-PoS (LPoS), tries to address the centralization problem of PoS. LPoS lends coins to miners with a low stake so that they can be part of the Vs. That way, “rich” nodes can lease their funds to interested Vs within a time box. The reward for minting a block is then shared between the two. It has been shown that the engagement of more nodes via LPoS adds to the security of the BC ecosystem. The drawback of employing LPoS in a BC for BIoT is its dependency on the monetary scheme.

During the last years, different algorithms have been proposed for the minter selection in DPoS. For instance, (i) delegates can be selected based on their stake, (ii) users can vote on who can be minters according to their stake, and (iii) delegates’ votes on valid blocks can be a measure for their selection [58]. In Slasher, users’ stake and the BC history determine who can be the next minter. However, there are other implementations, like Tendermint, where any user in the system can sign the blocks [58]. In some implementations of DPoS, which are called “deposit-based” PoS, minters have to lock their funds for a certain period, and in case of malicious activities, the blocked fund is not refunded to the minter, as a penalty [209].


2.4.2.1 PoS Security Risks

Key challenges to be met while developing PoS consensus mechanisms include (a) mechanisms needed to prevent nodes from double spending, (b) manipulation of the random election, (c) DoS attacks, and (d) the difficulty of keeping Vs online at all times [209].

Figure 2.8: Competing Chains in PoS [115]

(i) Nothing-at-Stake (Double Spending)
In the case of BC forks, where multiple side-chains compete simultaneously, a V needs to decide between two or more options for adding the next block. Unlike in PoW, PoS does not “burn” resources in the process of finding a random node in the network. Hence, Vs can append a block on any competing chain, incentivizing a potential V to add blocks on top of every competing chain. This strategy assures that the suggested block will most likely be included in the chain which will finally be accepted by the network [209].

If all Vs act economically and append their blocks on every competing chain, the BC will never reach a consensus, even if only honest Vs exist. Figure 2.8 shows how the expected value EV changes depending on the Vs’ strategy. The value P depends on the percentage of validating nodes that received this chain before any other chain of the same length. The fractional amount of the total stake that votes for a particular chain influences P as well. For simplicity, it is assumed that the block reward and TX fees add up to exactly one coin for each block [209].

Unlike in PoW, where a mining node would need to split its hashing power to work on multiple chains, PoS lacks such a dependency and thus allows for potential misuse. In PoW, the hash of the previous block is included in the calculation of a new block, as shown in Figure 2.9. Thus, a miner will always put all of his/her mining power into the longest chain, which is the chain that the network will most likely accept [209].

In the context of PoW, the probability P of a chain depends on the percentage of mining nodes that received this chain before any other chain of the same length. The fraction of the total hashing power that is put into each chain influences the likelihood P as well. Consequently, miners have to decide which competing chain they want to continue if they have two or more competing chains of the same length. This results in a separation of miners and, therefore, P has to add up to exactly 100% over all competing chains of all miners [209].

Figure 2.9: Competing Chains in PoW [115]

As previously described, rational Vs in a PoS BC append a block on every competing chain. Therefore, the sum of the voting power of all competing chains can be more than 100%. This property is described in the following scenario and shown in Figure 2.10. After block a is added, two Vs, F and G, append a block at the same time, which results in a fork with two competing chains. Every V wants to make sure that his/her block will be included in the finalized BC. Therefore, every following block is added on both chains, too. With the assumption that V F has a staking power of 1% and V G of 2%, all the Vs between block f and y make up 96% of the total stake. At the time of adding block y, the upper chain has a staking weight of 98% and the lower chain of 97%. The sum of the voting power of both competing chains in this example is 1.96, well above 1 [209].

Figure 2.10: Staking Weights in PoS [209]

A malicious user Z can take advantage of such a situation and double spend his/her coins. A TX that is included in block g, but not in f, can easily be reversed by appending a new block z only on that subchain where block g is not included. Therefore, after user Z has added block z, the chain containing blocks f and z has a higher staking weight and will be accepted by the network (cf. Figure 2.11). In this example, the malicious user has double spent his/her coins with only 2% of the total stake. Thus, to ensure a secure implementation of PoS, a PoS consensus protocol has to implement a scheme allowing Vs to be punished for appending new blocks on multiple chains at the same height.

Figure 2.11: Double Spending in PoS [209]
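The arithmetic behind Figures 2.10 and 2.11 can be illustrated with a few lines of code. The staking powers are the illustrative values used in the scenario above (V F with 1%, V G with 2%, the remaining Vs with 96%, and the attacker Z with 2%); the chain weights are simplified to the sum of the stakes voting on each branch, so the exact percentages differ slightly from the scenario.

```python
# Illustrative staking powers loosely following Figures 2.10 and 2.11 (assumed values).
stake_F, stake_G, stake_rest, stake_Z = 0.01, 0.02, 0.96, 0.02

# Because rational Vs vote on *both* competing chains, the per-chain staking
# weights can add up to more than 100% of the total stake.
chain_with_f = stake_F + stake_rest          # fork branch containing block f
chain_with_g = stake_G + stake_rest          # fork branch containing block g
print(chain_with_f + chain_with_g)           # > 1.0, i.e., more than 100%

# Z appends block z only on the branch that excludes g, reversing the TX in g:
chain_with_f_and_z = chain_with_f + stake_Z
print(chain_with_f_and_z > chain_with_g)     # True: the f + z branch is now heavier
```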

(ii) Grinding Attacks
Grinding attacks happen if Vs are able to manipulate the random election process of a consensus algorithm. An example of a poor design is when the election process depends on the previous block hash. The node elected for adding a block can manipulate the block hash by in- or excluding certain TXs or by trying many parameters, resulting in different block hashes. Hence, this node can grind through many different combinations and choose the one that re-elects itself with a high likelihood for the next block. Therefore, an optimal protocol needs to ensure that the random election process parameters cannot be modified at commitment time [209].

(iii) Denial-of-Service (DoS) Attacks
If the random election process is determined publicly, the elected V is known ahead of time, and the consensus protocol is vulnerable to Denial-of-Service (DoS) attacks. Therefore, it is advantageous to run the election process privately. In a PoW protocol, the election process is done such that each V independently tries to find a solution to a mathematical puzzle. These puzzles are different for every miner. Therefore, a malicious user cannot predict which node solves its puzzle first. Thus, the elected node cannot become a victim of a DoS attack unless the malicious user attacks every single node in the network. The same level of unpredictability in the election process as in PoW is desired for a PoS protocol [209].

(iv) Availability
A situation where many different parties compete for owning the next block decreases the possibility of a single V appending multiple consecutive blocks. Hence, the network becomes more secure with every additional V. In some PoS mechanisms, the probability of adding the next block increases proportionally to the time that a V has not been elected. Often this is referred to as coin aging. A protocol that includes coin aging incentivizes Vs to stay offline until they reach a high or even the maximum possible coin age. Coin age is accumulated even if a node is offline. In such a scenario, a significant fraction of the V set would be offline most of the time. Therefore, the actual number of competing nodes in the network is by no means the size of the V set. A system becomes more secure with additional Vs only if these nodes are online and compete for the next block at any time. A PoS protocol should incentivize Vs to be online at any given time [209].

2.4.2.2 PoW and PoS Comparison

Employing PoS-based consensus mechanisms as a PoW replacement can be analyzed from different dimensions. The key ones are discussed as follows, based on [209].

(i) Environmental Harm
Several reports [172] indicate that Bitcoin consumes more than 36 TWh annually. This amount is more than the yearly consumption of the entire country of Bulgaria. In other terms, Bitcoin uses as much electricity as more than 3 million households together. Thus, global mining costs amount to over 5 million USD every day. In contrast, there is no need for an extensive computation task in a distributed system using a PoS consensus mechanism to reach a comparable level of security.

(ii) Risk of Centralization in the Form of Mining Pools
The chance of adding a new valid block for an individual miner in a PoW-based BC is extremely low [158]. Therefore, miners can join mining pools, where their computational power is collaboratively deployed. However, if these mining pools join together, Bitcoin would be vulnerable to a 51% attack, where these mining pools could always provide a longer competing chain than the currently accepted one [72]. Therefore, Bitcoin would be controlled by a single centralized entity. In contrast, a constant revenue stream is not required in a PoS mechanism, since the V does not have to be compensated for resources burned. Therefore, the formation of large V sets is less likely.

(iii) Risk of Centralization in the Form of Cloud Mining
Cloud mining allows people to mine cryptocurrencies without owning mining hardware [250]. Companies providing cloud mining services profit from economies of scale, such as special deals on hardware orders and lower maintenance costs. Therefore, these companies gain an economic advantage over individual miners and force them out of the market. Since no specialized hardware is needed for a PoS mechanism, centralized cloud companies do not benefit significantly.

(iv) Entry Barrier
A miner in the competitive environment of PoW-based BCs has to own specialized hardware to gain an advantage over other miners. In contrast, in a PoS-based BC a V can only generate revenue proportionally to his/her deposit in that BC. Thus, the entry barrier for becoming a V in a PoS-based BC is significantly lower than in a PoW-based BC.


(v) Discrepancy between Miners and Non-Miners
In PoS mechanisms, the discrepancy between mining and non-mining nodes can be considerably lower than in PoW [158]. Practically, in PoS-based BCs any machine with sufficient storage and bandwidth can be part of the V set. Therefore, the community is less split up than it tends to be in PoW-based BCs.

(vi) 51% Attack
In the initial phase of a new BC, the number of Vs/miners is limited. This poses a significant risk of a 51% attack, in which 51% of all miners collude. A malicious user can buy mining hardware such that he/she possesses at least 51% of the total hashing power of the network. This user can produce a longer competing chain than the remainder of the network. Since the BC protocol always follows the longest competing chain, the malicious user controls the entire BC. The developers of a PoS-based BC can hold back 51% of all coins until the network is established to prevent such an attack by an outsider.

(vii) Transaction Fees
Transaction fees determine the incentives to persist a block in a given BC and prevent a BC from being spammed, but they also compensate miners for their computational effort and electricity costs. However, no resources are burned within a PoS consensus mechanism and, therefore, the TX fees can be lowered significantly.

2.4.3 Byzantine Fault Tolerant (BFT)

Unlike PoW, in which nodes and miners do not need to ask for permission and do not need to know (trust) the other peers in the BC network, in BFT protocols each miner has to know all or a portion of the other peers to participate in the consensus. For instance, in Practical BFT (PBFT), at least 2/3 of the miners need to reach a consensus on a block for it to be accepted as the next block. In some BFT implementations, a mechanism for selecting the next miner(s) is developed; for instance, miners can be chosen by a leader. In Delegated BFT (DBFT), which follows the same general rules as PBFT [219], participation of all nodes in adding a block is not needed; thus, higher scalability is achieved. An implementation of DBFT is introduced within NEO, which has a block time of 15 s.

BFT leader(s) and miner(s) in each round, i.e., for each block, may be changed depending on different metrics. However, in any case, identity management and cryptographic certificates are assigned to potential miners in a centralized way. The need for imposing identities in the BC depends on the application. For instance, in the FinTech area, the banks or real estate parties involved may incline toward BFT-based BCs rather than pure PoW due to legal concerns.


2.4.4 Proof-of-X (PoX)

In the last 13 years, in parallel to the explorations made on PoW, PoS, and BFT, many different types of consensus mechanisms were introduced in the BC era. These mechanisms are generalized as PoX, where ‘X’ is replaced by the specific metric measured by the consensus mechanism proposing a new approach. Since this thesis focuses on the scalability of BCs employed for data-driven IoT-integrated systems, different consensus mechanisms are reviewed from the perspective of their suitability for IoT use cases as follows.

2.4.4.1 Proof-of-Activity (PoA)

PoA is a combination of PoW and PoS. In PoA, after miners find a nonce to hash one block, they transmit all the data in the format of a block template to the BC network. Then, a set of other miners is selected via a PoS approach to verify that block. In PoA, there might be more than one template at a time. Moreover, there might be a situation in which one or more miners cannot validate, or intentionally decline to sign, a new block assigned to them. In such cases, PoA advances over PoW, as PoA miners will generate another template with a new set of candidate signers. After some time, there will be blocks that have been signed, and the reward of block creation is divided between the miners and the signers. In comparison to PoS, PoA is known to be a consensus that is safer against attacks. However, scalability and delay are traded off [219].

2.4.4.2 Proof-of-Burn (PoB)

PoB is the proof of transmitting some coins to an irreversible and unspendable address by interested miners. This verifiable action will make those coins permanently unavailable for those miners. The winner will be rewarded for mining a block with that BC’s coins. While these processes are considered energy-efficient, they demand spending cryptocurrencies, which IoT use cases would have to bear. However, not all IoT use cases need to rely on paid schemes, especially with cryptocurrencies. Thus, an enforced dependency on P2P payments via cryptocurrencies to transmit data is a limiting factor for using PoB in IoT use cases [219].

2.4.4.3 Proof-of-Importance (PoI)

As an improvement to basic PoS, PoI considers more metrics than just the stake of nodes. In PoI, the reputation of nodes is calculated with metrics such as the number of valid TXs issued by that node. PoI benefits from the advances of PoS over PoW, thus making it a potential solution for IoT use cases. However, like PoB, PoI depends on the monetary system, i.e., on cryptocurrencies [219].


2.4.4.4 Proof-of-Elapsed Time (PoET)

PoET is designed to reduce the energy consumption of PoW BCs. In a PoET-based BC, miners are still asked to solve a cryptographic puzzle. However, the selection of the next miners is performed via a random wait timer verified by a Trusted Execution Environment (TEE), like Intel’s Software Guard Extensions (SGX). As a result, there is no competition between miners, as the next miner is the one whose timer expires sooner than the others’. PoET could be considered a proper consensus mechanism for IoT-oriented use cases due to its low energy demands. However, a key drawback of PoET is its dependency on Intel’s SGX [219].

2.4.4.5 Proof-of-Capacity (PoC)

PoC and its sibling consensus mechanism, Proof-of-Space (PoSP), measure the hard disk space allocated to store data (which can differ from the BC data). PoC demands much less energy and fewer computational resources. The stored data is interpreted as proof of commitment to the BC, and the allocated storage size is a metric used to identify the next block’s miner. A key challenge in such BCs is to frequently verify the dedication of storage as initially claimed by the miners. PoC and PoSP have shown great potential in data-driven IoT use cases [219].

2.5 Blockchain Implementations

An introduction to five different DLs is presented in this section, covering (i) Bitcoin as a public permissionless BC, (ii) Ethereum as a public permissionless BC which introduced the SC into the BC ecosystem, (iii) Hyperledger as a BFT-based DL, (iv) IOTA as a DAG-based DL, and finally (v) Bazo, a public permissioned PoS-based BC. The first four DLs are selected as examples of the most popular implementations of different BC types, and Bazo is introduced since it is used in this thesis as the underlying DL for DLIT’s design and implementation.

2.5.1 Bitcoin

Bitcoin was initially proposed under the pseudonym “Satoshi Nakamoto” in 2009. This proposal of an experimental cryptocurrency created a fully public and decentralized one, which reached over time a notable position due to the disruptive potential of the technical platform on which it is based, the BC. Bitcoin’s design based on PoW guarantees tamper-proofness and immutability, and it continues to perform today as a secure ecosystem [222].


One of the main factors of Bitcoin’s success is the incentive scheme for miners participating in the BC network. Bitcoin ensures that nodes generate valid blocks through the PoW consensus mechanism, and malicious miners are discouraged from gathering more processing power than all honest miners. Following the success of Bitcoin, several other cryptocurrencies were created based on Bitcoin’s source code, proposing several modifications of its parameters, such as block size or validation time (e.g., Litecoin [163] and Monero [161]). Monero, for instance, offers additional security features based on a ring-signature scheme in which transactions can be entirely covered [222]. The energy inefficiency of the PoW consensus mechanism constitutes a drawback for Bitcoin. Bitcoin mining uses an estimated 61.76 TWh of electricity per year, more than many countries such as Switzerland and the Czech Republic, and ∼0.28% of total global electricity consumption in 2019 [172]. In 2021, if Bitcoin were a country, it would be the 41st most energy-demanding nation in the world. This information on Bitcoin’s energy consumption is a clear indication of how a consensus mechanism affects energy consumption. As discussed above, Bitcoin is not a proper match for IoT use cases due to its high computational demands, latency, and low throughput. Although Bitcoin was the first to represent the role model of a “public permissionless BC”, many other and different DLs have been introduced in the past decade, including, for instance, Ethereum, Hyperledger, Libra, EOS, Litecoin, Monero, NEO, Ripple, Steem, Stellar, Tether, XTZ, Zcash, etc. [177, 137].

2.5.2 Ethereum

The Ethereum BC has distinguished itself as the first BC to promote distributed autonomous computation, thus enabling data storage in a distributed setting, too [91]. Ethereum started with a proprietary PoW-based consensus, but with higher scalability than Bitcoin. Ethereum introduced Smart Contracts (SC) as distributed applications developed like programs, i.e., written in programming languages, but run decentrally and accessing decentrally stored data. The invention of SCs has been a linchpin for the growth of decentralized applications and distributed computation, which led to the emergence of different use cases [64]. Ethereum applies a different version of the mining algorithm in its PoW mechanism, i.e., Ethash, a modified version of the Dagger-Hashimoto algorithm. Ethash is a memory-hard PoW algorithm and is designed to be ASIC-resistant. Ethash operates on a 1 GB data structure stored via a Directed Acyclic Graph (DAG) [111]. The DAG changes every 30,000 blocks, a time window equivalent to 125 hours (almost 5.2 days), known as an epoch. Every 15 seconds, a new block is created in Ethereum, i.e., miners have 15 seconds to mine a new block. In this network, it is almost impossible to conduct double-spend attacks unless the attackers own at least 51% of the mining power [15]. Ethereum has evolved even further to reach higher scalability in its version 2.0 by employing PoS as its consensus mechanism and sharding techniques [27].


2.5.3 Hyperledger (HL)

HL is an open-source project hosted by the Linux Foundation, proposed by [53]. The main goal of HL is to provide a modular infrastructure for BCs to be able to use different consensus mechanisms like PoS, PoW, and Trusted Entities (TE) [240]. This community has several projects running, like the BC explorer, Cello, Fabric, Iroha, and Sawtooth Lake. For instance, HL Fabric is mainly driven by IBM [35]. P2P channels enabling data sharing in HL Fabric lead to data isolation between peers. Each channel is dedicated to private communications, enabling confidential TXs between peers. Thus, HL Fabric empowers multiple levels of privacy. Moreover, private data collections enable P2P TXs between authorized participants, keeping data private to a subset of peers (and potentially regulators/auditors). Private data is only shared P2P, with hashes stored on the BC, offering verifiable evidence to all peers validating TXs. Additionally, an optional Identity Mixer can be deployed to increase the anonymity of TX submitters [222].

2.5.4 IOTA

IOTA made its public appearance in 2015, supported by the IOTA Foundation. It utilizes a data structure to store TXs called the Tangle [193]. In the Tangle, each TX (rather than a block of transactions) references at least two previous transactions, forming a web structure known as a Directed Acyclic Graph (DAG). Thus, all reference pointers point in the same direction and no loops are allowed. Such a DAG structure allows transactions to be issued simultaneously, asynchronously, and continuously, in contrast to the discrete time intervals and linear expansion of other BCs [147, 222].

By parallelizing TX issuance and validation, IOTA promises a high TX throughput of 1,000 TPS [75]. However, large scales expose the Tangle to uncontrolled growth attacks, e.g., parasite chains and splitting [193]. Until such concerns are fully addressed, IOTA (a) introduced the concept of a central “coordinator”, which confirms unreferenced TXs every minute, and (b) requires that new TXs must directly or indirectly reference the TXs confirmed by the “coordinator”. Eventually, securing and controlling the Tangle is traded off at the cost of introducing a central entity [148, 222].

In the Tangle, the incentives of participants are aligned equally [147], since every node sending an IOTA TX needs to participate in the consensus as well. In IOTA, there are no miners. Thus, TX validation is an inherent part of TX issuance, and there are no TX fees. The value sent is always equal to the value received by each peer. This strategy enables E2E micro-payments, e.g., for the sharing economy [222]. While security provision is a crucial point of criticism of IOTA, it is potentially a proper BC for IoT use cases, assuming scalability is offered as intended [142].


2.5.5 Bazo

In its early version as presented in 2017 [226], Bazo was a PoW-based permissioned BC, which was later revised and extended to a PoS-based BC [45]. Bazo’s basic TXs, Accounts, Blocks, PoS condition, and validation are explained as follows [209].

In Bazo, a chain-based PoS protocol works comparably to the PoW consensus mechanism, with key differences. It is designed as a throttled PoW algorithm, as Vs are limited to exactly 1 h/s (hash per second). Bazo defines a semi-synchronous BC, such that every node has its own local time. If a node attempts to speed up its hashing power, the BC detects the malicious node and the suggested blocks are rejected. Bazo chooses Vs proportionally to the number of coins that each V owns. Furthermore, an RSA scheme [62] is employed to enhance the randomization of selecting the next V at every Block Height (BH) (cf. Section A.1.0.7). In Bazo, a V is always elected unless all Vs are offline at the same time, whereas other PoS protocols may need to implement a fall-back mechanism that deals with the deadlock occurring if all members of a chosen subset of Vs are offline. A detailed introduction to the Bazo specifications is provided in Section A.
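A simplified, stake-proportional election is sketched below; the actual Bazo PoS condition additionally throttles every V to 1 h/s and uses an RSA-based randomization (cf. Section A.1.0.7), so this sketch only illustrates the proportionality to the owned coins, with assumed balances.

```python
import random

# Assumed coin balances of three validators; the election probability is
# proportional to the stake, mirroring the "proportional to owned coins" rule.
stakes = {"V1": 500, "V2": 300, "V3": 200}

def elect_validator(stakes: dict[str, int]) -> str:
    validators = list(stakes.keys())
    weights = list(stakes.values())
    return random.choices(validators, weights=weights, k=1)[0]

wins = {v: 0 for v in stakes}
for _ in range(10_000):
    wins[elect_validator(stakes)] += 1
print(wins)   # roughly 50% / 30% / 20% of the elections
```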

2.5.6 Blockchain Performance Analysis Metrics

Different metrics used in the evaluation of BCs are defined as follows based on [205].

2.5.6.1 Transaction Per Second (TPS)

TPS represents the average number of TXs validated per second, calculated over the timespan between the first TX sent to the network and the first block which includes all sent TXs, called the Validation Time Span. The TPS on its own only tells how many TXs are validated in a BC per second. It does not reveal anything about limiting factors or why a specific TPS is reached.

2.5.6.2 Transaction Per Second Calculated (TPScalc.)

The maximal possible calculated number of TXs which can be validated with a set block size and interval is computed from the number of TXs that fit into a block and the block interval (in seconds). The block size unit is byte and provides the upper maximum when blocks are validated within the block interval. Therefore, TPScalc. is mainly used as a benchmark, since it is basically just a theoretical value. TPScalc. should always be handled with care, because the theoretical and not the actual block interval is taken into account. TPScalc. indicates the maximal speed a BC can reach. However, if the TPS is very close or similar to the TPScalc. and the TPSsent is far above the other two, the BC is probably limited either by the block size or by the block interval.


2.5.6.3 Transaction Per Second Sent (TPSsent)

TPSsent is the average number of TXs sent to the network by all BC clients, and it indicates the upper limit for the number of TXs which can be validated per second. The TPSsent indicates how fast TXs are sent to the network. When the TPSsent and the TPS are close to each other, it can be assumed that all TXs get validated shortly after they are issued, because it does not take much longer to validate all TXs than it takes to send them. Thus, if they diverge a lot, it is an indicator that it takes longer until the TXs are validated. Furthermore, the TPSsent is an indicator for the upper TPS limit, since the TPS cannot be higher than the TPSsent.

2.5.6.4 Average Block Interval (ABI)

ABI reveals the actual timespan between two consecutive blocks. It indicates whether the BC network satisfies the defined block interval. Adjusting the block interval needs time to become consistent. ABI is simply measured as the difference between two consecutive blocks, or as the average of the timespan between two selected blocks (endTime - startTime) divided by the number of blocks between them.

2.5.6.5 Blockchain Size (BCS)

BCS is a metric to calculate a BC’s overall size. BCS consists of a fixed part and the size of all TXs. BCS is calculated as the sum of all validated blocks’ sizes, starting from the genesis block up to the last validated block.
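A minimal sketch of how these metrics could be computed from collected measurements is given below; all numbers are assumed sample values, not measurements of any particular BC.

```python
# TPScalc.: theoretical upper bound from block size, average TX size, and interval.
block_size = 1_000_000        # bytes (assumed)
avg_tx_size = 250             # bytes per TX (assumed)
block_interval = 15           # targeted seconds between blocks (assumed)
tps_calc = (block_size // avg_tx_size) / block_interval

# TPS: validated TXs divided by the Validation Time Span (first TX sent until the
# first block containing all sent TXs).
txs_validated, validation_time_span = 12_000, 90   # seconds
tps = txs_validated / validation_time_span

# TPSsent: TXs sent by all clients per second.
txs_sent, sending_time = 12_000, 60
tps_sent = txs_sent / sending_time

# ABI: timespan between two selected blocks divided by the blocks between them.
start_time, end_time, blocks_between = 0, 1_530, 100
abi = (end_time - start_time) / blocks_between

# BCS: sum of all validated block sizes from the genesis block onwards.
block_sizes = [950_000, 990_000, 1_000_000]
bcs = sum(block_sizes)

print(tps_calc, tps, tps_sent, abi, bcs)
```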

2.5.6.6 Blockchain Throughput

Throughput in BCs is determined by the maximum number of TXs that can be confirmed by a BC in a time unit [142]. Even though BCs outperform international banking TXs, which may take several days to take place, a great performance gap is observable when comparing PoW-based BCs like Bitcoin, with a maximum throughput of 7 TX/s, with centralized payment systems like Visa, with a throughput of 20,000 TX/s. In this context, block size and block interval play critical roles in designing a BC.

It may seem that by increasing the block size and reducing the block interval, the overall throughput of BCs can be increased. However, just by accelerating the block production or increasing the size of each block, a higher throughput is not necessarily obtainable. Such changes lead to security risks, e.g., the Bitcoin network becomes vulnerable to double-spending attacks. This is due to the increase in the number of forks when blocks are produced at higher rates by different miners, scattered globally, almost simultaneously. Furthermore, a higher number of forks will help attackers to act with less computational power consumption. By increasing the size of the blocks, the mining network will require more time to broadcast the newly mined blocks, consequently causing delays and further forks.

Tables 2.4 and 2.5 provide a comparison between different consensus mechanisms and related DL implementations, especially regarding their throughput and latency.

Table 2.4: Comparison of Consensus Mechanisms and DL Implementations [142]

Consensus Mechanism | Design Goal | Access⋆ | Miner Election | Energy Efficiency | Vulnerable to 51% Attack | Vulnerable to Double Spending Attack | Implementation Instances (TPS)
PoW | Sybil Attack Proof | PL | PoW | No | Yes | Yes | Bitcoin (7), Ethereum (15), Litecoin (28), Monero (30), Zcash (27)
PoS | Energy Efficiency | PL | Stake | Yes | Yes | Difficult | Waves (100), Qtum (70), Nxt (100), Nano (7,000)
DPoS | Enhancing PoS Efficiency | P/PL | Voting | Yes | Yes | Yes | EOS (4,000), Cardano (275), Tron (2,000), Lisk (3), Bitshares (100,000)
PBFT | Enhancing BFT Efficiency | P/PL | Voting | Yes | Safe | Safe | Ripple (1,500), Stellar (1,000)
PoC | Enhancing PoW Energy Efficiency | PL | PoW | Fair | Yes | Yes | Burst (80)
DAG | Speed and Scalability | PL | N/A | Yes | Safe | Safe | IOTA (1,000), Byteball (10), Travelflex (3,500)
PoA | Merging both PoW and PoS | P | PoW and Voting | No | Safe | Yes | Dash (56), Decred (14), Komodo (100)
dBFT | Faster PBFT | P | Voting | No | Yes | Yes | Neo (1,000)
PoI | Improving PoS | PL | Importance Scores | Yes | Safe | Safe | NEM (XEM) (10,000)
PoB | Avoiding PoW Deficits | P | Burnt coins | No | Yes | Yes | Slimcoin (0.00003)

⋆ PL: Permission-Less, P: Permissioned


Table 2.5: Comparison of Throughput and Latency of Selected DLs Based on [142]

DL | Bitcoin | Ethereum | Litecoin | Monero | Zcash | EOS | Cardano
Throughput (TX/s) | 7 | 15 | 28 | 30 | 27 | 4,000 | 257
Mining a block (min) | 10 | 0.25 | 2.3 | 2 | 2 | 0.5 | 0.33
Validation Time (min) | 60 | 6 | 30 | 30 | 30 | Near-instant | 10

2.5.6.7 Mining and Verification Delays

Several factors affect the total delay users experience in operating with a BC. One of the key factors is the “block verification time”, i.e., the time needed for a TX to be (a) added to a block by miners and, further, (b) validated by other miners such that the TX is stored on the chain as a valid TX, in a final state. For instance, Bitcoin dedicates almost 10 minutes to mining a new block, and it requires six times 10 minutes to validate it through its PoW consensus mechanism. Another example is the Ethereum BC, which requires almost 14 seconds to mine a new block, but the total validation of that block takes up to 6 minutes. A comparison of the block mining and verification delays is presented for a selected set of BCs in Table 2.5.

2.6 Blockchain Performance Enhancement Approaches

A survey on recent empirical performance evaluations of DLs is presented by [83], which identifies the current challenges and achievements of such evaluations. It has been shown in this survey that the size of the DL network, the TXs submitted to the BC, the size of the TXs, and even a DL’s client application version (e.g., Geth or Parity for Ethereum) all impact the performance of the examined DLs, such as Ethereum, Hyperledger, and Libra [210]. More specifically, to evaluate and improve the performance of BCs, they need to be analyzed within the (a) network layer, (b) consensus algorithm, and (c) storage elements [173].

This is because E2E TXs are transmitted via the networking layer between BC nodes (miners and clients); consensus is the collection of processes that need to be followed according to predefined steps; and storage embeds the notion of distributed global memory and data storage in which the TXs and blocks are stored. Thus, improving the scalability of BCs depends on improving scalability in each, or at least one, of the network, consensus, and storage layers. Scalability enhancement approaches have been categorized into two main layers [262], explained as follows.


2.6.1 Layer 1 Scalability Enhancement Approaches

The first layer, i.e., “layer 1” scaling, aims at enhancing the throughput of BCs from a systematic perspective by

• reducing the communication and computation overhead

• empowering nodes by adding more resources to a single node, by increasing the number of TXs in blocks, or by reducing the block period, known as “vertical” scaling.

• adding more nodes into the BC, with dedicated duties, e.g., via “Sharding”, i.e., dividing BCs into multiple shards. This group is known as “horizontal” scaling.

2.6.1.1 Sharding

Since sharding is one of the key approaches in the layer 1 scalability enhancement group, this thesis leverages it in DLIT’s design. The basic idea of sharding originates from databases, where sharding denotes the horizontal partitioning of a database among multiple physical data stores [103] such that they can be processed concurrently [52].

A basic DL sharding design partitions the global TX mempool (a memory to store a set of TXs to be mined) according to the sender of an individual TX. Each shard mines a block for each height and persists it in the DL [43]. There are various ways in which a BC may implement sharding [262], but in most cases inter-shard state transitions need to be performed between all the miners in the shards. This can cause delays even under perfect networking conditions. Sharded BCs and DLs may run in fixed lengths of blocks called epochs. After each epoch, the Vs are re-assigned to the shards. At the end of each epoch, a finalization procedure occurs by which the next epoch block is created and the number of new shards is determined based on the current number of Vs in the network [43].
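The basic sharding rule described above, i.e., partitioning the global TX mempool by the sender of each TX, can be sketched as follows; the shard count and the addresses are assumptions for illustration only.

```python
import hashlib

NUM_SHARDS = 4   # assumed; e.g., derived from the current number of validators

def shard_of(sender_address: str, num_shards: int = NUM_SHARDS) -> int:
    """Basic sharding rule: assign a TX to a shard based on its sender."""
    digest = hashlib.sha256(sender_address.encode()).digest()
    return int.from_bytes(digest, "big") % num_shards

# Partition the global mempool into per-shard mempools.
mempool = [("alice", "tx1"), ("bob", "tx2"), ("carol", "tx3"), ("alice", "tx4")]
shard_mempools = {i: [] for i in range(NUM_SHARDS)}
for sender, tx in mempool:
    shard_mempools[shard_of(sender)].append(tx)
print(shard_mempools)
```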

2.6.1.2 Sharding Concerns

Key concerns experienced by sharding directly relate to the design of sharding mechanisms or are influenced by external factors, such as peer-to-peer networking. Four of the considerable sharding concerns include:

(i) Edge cases denote DLs being in the process of transitioning from one epoch to the next. The problem occurs when the sharded system is not yet prepared for transitioning to the new epoch and some Vs are placed in two different epochs. E.g., for a 3-Validator (V) configuration, consider the case (others exist, too) where all Vs mine the last block before the epoch block. At this moment, V1 to V3 are leaders in Shards S1 to S3, respectively (i.e., V2 is in S2). After mining the last block, and under the assumption that V1 and V2 receive the state transitions from V3, they continue with mining the epoch block, while V3 is waiting for the state transition from V2 due to an unstable connection. V1 fulfills the PoS condition and can insert the epoch block, which is received by all Vs. Thus, V1 is now in S3, V2 in S1, and V3 in S2. This new V assignment confuses V3, since it already processed the transition from S1, which will not be requested again, and with a shard ID of 2, no transitions from S2 are requested. Since shard ID 3 is now associated with V1, if V1 receives a request for a state transition of S3, it will answer the request with its own state transition produced before the epoch block for S1. V3, however, already holds this transition; it will keep requesting it, but will never receive the expected transition. At this moment the sharded DL cannot be recovered anymore, because V3 cannot request the transition from the right miner.

(ii) Epoch block finality defines epoch blocks to be final and unchangeable after creation. Since rollbacks of epoch blocks are by definition not possible, the last epoch block arriving can be accepted as valid in all cases. To ensure that all Vs eventually accept the same epoch block, a timer can be employed by which, after the epoch block’s creation, Vs have to wait for a predefined time (e.g., 5 s) before continuing the mining process. Thus, every V accepts the last epoch block, and when Vs start mining, all hold the same epoch block to which they attach their first block of the new epoch. However, this is not always achievable!

In a case with 2 shards and 2 Vs, consider that V1 (leader in S1) and V2 (leader in S2) mine the last block before the epoch block. Both succeed in mining the block, and V1 receives the state transition from V2. Assuming that the connection between these Vs is now interrupted, V1 can start mining the epoch block, broadcasts it into the network, and continues mining the block after the epoch block. However, V2 still waits for the state transition from V1. If V2 now receives the transition from V1 (be it through a rebroadcast from other nodes or through a short re-instantiation of the connection between V1 and V2), V2 is unaware of V1 having produced the epoch block and mines an epoch block itself. At this moment the BC cannot be recovered anymore, because epoch blocks always have to be final. When the connection is reestablished, V2 will broadcast its epoch block to V1, and the new epoch block along with its V-shard assignment will be taken over by V1 after it has mined the first block (after the previous epoch block). This means that one V can have two different shard IDs within one epoch, which makes the entire transition and request structure fail.

(iii) Networking effects pose a major challenge in operating a DL, since missing rebroadcasts of lost transactions in an unstable network heavily impact inter-miner communications. Thus, the inter-/intra-shard TX transmission throughput may encounter severe performance drops. Each disconnection leads to a new request or rebroadcast, which causes additional network load and reduces the scalability of the sharded DL.

(iv) Inter-shard communication validity is required to ensure that the DL can be trusted. Thus, state transitions used in a sharded DL are key, and the global state of the DL has to be maintained by inter-shard communications to validate TXs reliably. By letting all shards validate all transactions, the purpose of a sharding mechanism would be defeated! Thus, protocols used for inter-shard communication of state transitions may only implement a naive verification mechanism. For each state transition, the shard ID and the block height of the block where it originated from have to be passed to other shards for verification. Therefore, malicious shards or even outsiders could create state transitions with another shard’s name if no identity check were performed.

To overcome the sharding concerns discussed above, sophisticated approaches have to be considered to avoid edge cases while ensuring block finality, laid over secure communications. Moreover, strict and precise verification of previously mined blocks and TXs shall not be neglected. It is crucial, too, to consider unstable networks, where communication delays and disconnections cause synchronization problems between all entities involved in the consensus mechanism. Otherwise, the DL will encounter TX losses with time-consuming or even no proper recovery methods at hand. Above all, the consensus mechanism implemented has a direct impact on the overall scalability of DLs as well.

In order to offer a reliable and secure sharding mechanism, different proposals employ proprietary measures through the combination of sharding and respective security features. For instance, in Bazo, in order to make sure that all miners are updated with all validated TXs, even those TXs which do not belong to their shard, a protocol is used in which, at a certain epoch height EH, an epoch block is inserted, denoted as eb_EH in Figure 2.12, which consolidates the global state of the DL and re-assigns the Vs to the shards in a randomized fashion. Then, the shard with shard ID i produces the block sb_h^i at height h. Using a proprietary synchronization mechanism, at each height h, all shards synchronize the global state with each other in order to have the global state before mining the next block with height h+1 [43].

Figure 2.12: Epoch Block Representation by [43]


2.6.1.3 Sharding RelatedWork

Zilliqa is one of the BCs which have implemented the sharding feature [242]. Zilliqa's performance increases whenever roughly 600 additional nodes join the network, as it assigns approximately this number of nodes to each shard. The performance increase is almost linear in the number of shards, until around 1 million nodes are active in the system. A testbed with 1800 active nodes managed to reach 1218 Transactions per Second (TPS) as of July 2020 [242]. Zilliqa implements a hybrid consensus mechanism. For authentication of the nodes, it implements a Proof-of-Work (PoW) mechanism. Inside the shards, however, Zilliqa employs a protocol based on Byzantine Fault Tolerance (BFT). The entire BC is managed by a committee. At the beginning of each epoch, a set of nodes is elected to participate in the committee, which assigns incoming TXs to shards. Transactions are sharded based on the address of the sender [242].
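The sender-address sharding rule can be illustrated with a short Python sketch: the address is hashed and mapped onto a shard ID, so all TXs from one sender deterministically land in the same shard and can be ordered there. This is a generic illustration of the principle, not Zilliqa's concrete assignment function.

```python
import hashlib

def shard_of_sender(sender_address: str, num_shards: int) -> int:
    """Map a sender address deterministically to a shard ID (illustrative only)."""
    digest = hashlib.sha256(sender_address.lower().encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Example: with 4 shards, every TX from this address goes to the same shard.
print(shard_of_sender("0xa1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0", 4))
```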

RapidChain proposes a sharding-based BC [263]. RapidChain creates multiple committees, which are similar to shards. During the bootstrap phase, a reference committee is elected which is in charge of partitioning the set of nodes in the BC into separate committees. Each node stores 1/k of the BC, where k denotes the total number of committees in the system. The committees have to be reconfigured after each epoch. In order to perform the reconfiguration of the committees as efficiently as possible, and to easily add new nodes to the BC, an adapted version of Harmony's Cuckoo Rule was implemented [263] [157]. It is claimed that, using the methods proposed by RapidChain and running 4000 nodes, a TPS of 7380 could be reached [263].

QuarkChain proposes a two-layered approach. One layer consists of the elastic sharded BC. Elastic in this context means that the number of shards and nodes in the BC can vary. The second layer consists of the root BC, which is in charge of confirming blocks from the shards [82]. The key challenge is to have a computationally strong enough root layer in order not to bottleneck the first layer. The highest TPS that could be reached using Python was 55'039.58 TPS [237].

OmniLedger is a sharding-based BC and uses the UTXO model to process TXs [156]. In order to combat Sybil attacks, it operates an identity BC which runs in parallel to the actual BC where the TXs are processed. It utilizes state blocks as stable checkpoints which summarize the entire state of the BC and serve for the assignment of Vs to shards. The shards in OmniLedger are called Optimistic Vs because they quickly validate TXs and put them into a block, creating a commitment, which with a high probabilistic likelihood won't change. Those blocks will then be checked again by the Core Vs. Once accepted by the Core Vs, the blocks produced by the Optimistic Vs are final. The Core Vs process blocks in parallel in order to maximize the scalability of the system. OmniLedger is able to reach 4000 TPS [156].


Studies on the state-of-the-art sharding-related work show that DL designs offer a subset of entities with dedicated responsibilities. However, it can be noticed that an attempt to reduce or remove the inter-shard communications is missing. This applies to all other entities (used for validation or establishing a consensus) within the introduced DLs as well. These approaches seem to be limited in some facets, such as lacking a proper TX re-validation mechanism as in Bazo [43], or not using any techniques to reduce the amount of data stored in the DLs, especially for IoT data. Moreover, there have been no precautionary techniques to proactively consider networking instability, all of which prevents the DLs from achieving a high scalability. Another missing element is storage optimization, e.g., with TX aggregation, integrated with the sharding approaches to limit the BC size growth, especially for the BIoT use cases where a large number of data TXs flow from IoT devices towards BCs.

2.6.2 Layer 2 BC Scalability Enhancement Approaches

The second group of approaches for scaling BCs is known as "Layer 2" scaling approaches. In this group, off-chain processing of TXs is leveraged, e.g., with side channels as introduced by the "Plasma" approach in the Ethereum BC. These approaches do not improve the throughput of BCs at a systematic level, but they minimize the interaction with the BC to reduce the latency from the users' perspective [262].

A layer 2 proposal for enhancing Bitcoin's scalability is the Bitcoin Lightning network [46]. In this implementation, two sides of a TX communicate in sub-channels called "Micropayment" channels. These channels are not trusted networks; they are used to transmit Bitcoin TXs between two parties to reduce the number of broadcasts in the network. To avoid delays and complexity, instead of setting up channels for each pair of clients, a pre-installed net of channels is used for Micropayment TX transmission. Hence, a practically unlimited number of TXs can be handled by Bitcoin Lightning each day.

In Lightning Bitcoin, instead of broadcasting every single TX in the BC, only the most recent balance, which supersedes all previously deprecated balances, is broadcast. This means two parties can sign the TX with new balances and publish it in the BC, making the two sides aware of the TX and, if required, able to announce the existence of problems. In the case of asynchronous data, the balance is published to the BC when a party disagrees with the TX information. The Lightning Bitcoin network does not provide an anonymous payment protocol, especially if two clients are using the same channel frequently [87].
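The core idea, that only the latest co-signed balance matters while intermediate updates stay off-chain, can be sketched in a few lines of Python. Signing is abstracted with HMAC purely for illustration; real Lightning channels use Bitcoin scripts and ECDSA/Schnorr signatures, and the class and field names below are assumptions.

```python
import hashlib
import hmac

class PaymentChannel:
    """Minimal sketch of a two-party payment channel with off-chain updates."""

    def __init__(self, balance_a: int, balance_b: int, key_a: bytes, key_b: bytes):
        self.state = 0                       # monotonically increasing state number
        self.balances = (balance_a, balance_b)
        self.keys = (key_a, key_b)
        self.latest_signed = self._sign(self.state, self.balances)

    def _sign(self, state, balances):
        msg = f"{state}:{balances[0]}:{balances[1]}".encode()
        return tuple(hmac.new(k, msg, hashlib.sha256).hexdigest() for k in self.keys)

    def pay(self, amount: int, a_to_b: bool = True):
        """Off-chain update: both parties co-sign the newest balance state."""
        a, b = self.balances
        a, b = (a - amount, b + amount) if a_to_b else (a + amount, b - amount)
        assert a >= 0 and b >= 0, "insufficient channel funds"
        self.state += 1
        self.balances = (a, b)
        self.latest_signed = self._sign(self.state, self.balances)

    def close(self):
        """On channel closure, only the latest signed state is published on-chain."""
        return {"state": self.state, "balances": self.balances,
                "signatures": self.latest_signed}

ch = PaymentChannel(10, 10, b"key-A", b"key-B")
ch.pay(3)
ch.pay(1, a_to_b=False)
print(ch.close())   # e.g., {'state': 2, 'balances': (8, 12), ...}
```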

2.6.3 Discussion

DSSes, DFSes, DLs, and Cloud storage technologies introduced in this chapter, mainly represented by Figure 2.2, each have pros and cons and are made for specific use cases. Thus,


Table 2.6: An Overview of BC Scalability Approaches [268]

Scalability Approach | Category              | Implementations
Layer 2: Off-Chain   | Payment Channel       | Lightning Network
Layer 2: Off-Chain   | Side chain            | Plasma, Pegged Sidechain, Liquidity.network
Layer 2: Off-Chain   | Cross-chain           | Cosmos, Polkadot
Layer 2: Off-Chain   | Off-chain Computation | Truebit, Arbitrum
Layer 1: On-Chain    | Blockdata             | SegWit, Bitcoin-Cash, Compact block relay, Txlim, CUB, Jidar
Layer 1: On-Chain    | Consensus             | Bitcoin-NG, Algorand, Snow White, Ouroboros
Layer 1: On-Chain    | Sharding              | Elastico, Omniledger, RapidChain, Monoxide
Layer 1: On-Chain    | DAG                   | IOTA, Inclusive, SPECTRE, PHANTOM, Conflux, Dagcoin, Byteball, Nano

there is no single technology that outperforms all others and could be selected for all BIoT applications. As elaborated in the following Chapters, using DLs and other DSSes for BIoT applications can offer higher efficiency only with a dedicated, adaptive design considering the DL deficits and the limitations of IoT technologies. As mentioned above, one example adaptation of BCs has been improving their consensus mechanism for more efficiency. Such an improvement has been followed by several academic or industry-based DL implementations, as discussed above.

A comparative overview of the main DL implementations has been presented in Table 2.4, and different layer 1 and layer 2 scalability approaches are summarized in Table 2.6. Following the same BC adaptation and improvement trend, specifically by relying on DLs for decentralization, transparency, trust, and immutability of the data stored by them, this thesis proposes a DL for IoT data storage, which will be further elaborated in Chapter 5. As the comprehensive discussion on the sharding technique above implies, this thesis employs sharding to improve the scalability of the designed DL for BIoT. This decision is made to establish the on-chain TX validation of sharding to prevent malicious TX transmissions and to avoid using deposit-oriented layer 1 approaches. Moreover, the proposed DL covers TX aggregation and data modification, which require on-chain processes.

In the following parts of this Chapter, IoT technologies and BIoT potentials and challenges are discussed to highlight the remaining background information needed for the following Chapters.


2.7 Internet of Things (IoT) Protocols

The most relevant and recent IoT protocols and technologies have been analyzed in this thesis. As a result, a final set of protocols was selected based on long-range communication support, high data rates, maximum Message Data Units (MDU), and energy efficiency. The IoT protocol group associated with such characteristics is Low-Power Wide Area Networking (LPWAN).

2.7.1 Low-PowerWide Area Networking (LPWAN)

LPWAN technology offers long-range communication with low power requirements. Due to the deployment of several such technologies, battery-powered devices can run for years [178]. This thesis studies the following LPWAN technologies, followed by a comparative analysis and the choice of the IoT protocols used in this thesis.

2.7.1.1 LoRa

LoRa has become an interesting technology for lightweight smart sensing in IoT [166]. It defines a specific radio layer based on the Chirp Spread Spectrum (CSS) modulation and a simple channel access method called LoRaWAN. LoRaWAN is arguably the most adopted protocol in the LPWAN field. It promises ubiquitous connectivity in outdoor IoT applications while keeping network structures and management simple [208].

The LoRa operation depends on a set of parameters: (i) Bandwidth (BW), a range of spectrum for transmissions, (ii) Spreading Factor (SF), the chirp rate that controls the bit rate and reliability, i.e., a higher SF means a lower bit rate and a lower Bit Error Rate (BER), and (iii) Coding Rate (CR), which defines the ratio of redundant information for Forward Error Correction (FEC). The LoRa CSS modulation results in low sensitivity, enabling transmissions over long distances. It provides a range of several kilometers outdoors and hundreds of meters indoors [245]. For instance outdoors, there is a less than 10% loss rate, i.e., the ratio between dropped packets and packets sent in the network, over a distance of 2 km for SF 9-12, and a more than 60% loss rate over 3.4 km for SF 12. Depending on the duty cycle of LoRa devices (i.e., how often the packets are sent), their lifetimes may become very extended, for instance 17 years for a node sending 100 Byte once a day [208].
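As an illustration of how these parameters interact, the following minimal Python sketch computes the nominal LoRa bit rate from SF, BW, and CR using the standard relation Rb = SF * (BW / 2^SF) * CR; the parameter values in the example are illustrative, not settings prescribed in this thesis.

```python
def lora_bit_rate(sf: int, bw_hz: float, cr_denominator: int) -> float:
    """Nominal LoRa bit rate in bit/s.

    sf: spreading factor (7..12), bw_hz: bandwidth in Hz (e.g., 125 kHz),
    cr_denominator: coding-rate denominator, i.e., CR = 4/(4 + cr_denominator)
    with cr_denominator in 1..4 (CR 4/5 .. 4/8).
    """
    coding_rate = 4 / (4 + cr_denominator)
    return sf * (bw_hz / 2 ** sf) * coding_rate

# Higher SF lowers the bit rate (but improves reliability and range):
for sf in range(7, 13):
    print(sf, round(lora_bit_rate(sf, 125_000, 1)))   # CR 4/5, BW 125 kHz
```

For SF 7 at 125 kHz and CR 4/5 this yields roughly 5.5 kbps, dropping below 0.3 kbps at SF 12, which matches the throughput range quoted in Table 2.7.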

LoRaWAN [166] defines an access method to the radio channel similar to ALOHA: a device wakes up and immediately sends a packet to a gateway. The difference between LoRa and pure ALOHA is the variable packet length in LoRa in comparison to the fixed packet size in ALOHA. European Telecommunications Standards Institute (ETSI) regulations of the 868 MHz ISM band set limits on the maximum duty cycles ranging between 0.1% and 10% in the 863-870 MHz Industrial, Scientific, and Medical


(ISM) band, depending on the selected sub-band. This is a result of the pure ALOHA implementation of LoRa devices, which does not conform to the Listen Before Talk (LBT) scheme required by ETSI. This also limits the throughput of devices and the overall network capacity. Moreover, the LoRaWAN operation similar to ALOHA results in a high level of packet losses due to collisions as the number of devices increases. LoRa behavior for a larger number of devices strictly follows ALOHA, with a maximum channel capacity of 18% and an increasing collision ratio. As an example, for a link load of 0.48, the collision ratio is around 60% [41]. The impact of collisions is, however, significantly mitigated by the capture effect and orthogonal spreading codes, in which some transmissions benefiting from a stronger signal are successful despite collisions. [30] realized that for low duty cycles, the throughput is limited by collisions, whereas for higher duty cycle values, the maximum duty cycle set by the ETSI regulations (i.e., 0.1% - 10%) prevents devices from increasing their packet transmission rates and limits the overall throughput of the network [208].
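The 18% figure corresponds to the textbook pure-ALOHA throughput S = G * exp(-2G), which peaks at 1/(2e), about 0.184, for an offered load of G = 0.5. The short sketch below reproduces this curve and is only an illustration of the channel model, not a LoRa simulator.

```python
import math

def pure_aloha_throughput(offered_load: float) -> float:
    """Normalized throughput S of a pure ALOHA channel for offered load G."""
    return offered_load * math.exp(-2 * offered_load)

# Throughput peaks around G = 0.5 at roughly 18.4% of the channel capacity;
# the collision/loss fraction is 1 - exp(-2G), about 62% at G = 0.48.
for g in (0.1, 0.25, 0.48, 0.5, 1.0):
    s = pure_aloha_throughput(g)
    print(f"G={g:.2f}  S={s:.3f}  collisions~{1 - s / g:.0%}")
```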

[244] improves the performance of the LoRa network, while not impacting energy consumption at the same time. They provide a simple LBT enhancement to LoRaWAN that lowers the collision ratio. Their results show that Carrier Sense Multiple Access (CSMA) considerably lowers the collision ratio, while only slightly increasing energy consumption. Moreover, CSMA is implemented through an LBT mechanism preceding every transmission; therefore, the devices are relieved from the restrictive 0.1% - 10% duty cycle regulations of ETSI, allowing for higher data rates. Furthermore, they observed that CSMA presented lower energy consumption than LoRa for a large number of devices. To summarize, the work of [244] significantly increases data rates and the probability of successful transmission (robustness) for low density networks at the cost of a slightly higher energy consumption, and increases the probability of successful transmission, throughput, and energy efficiency for high density networks [208].

2.7.1.2 SigFox

SigFox [30] is also one of the most adopted LPWAN solutions. It is a proprietary Ultra Narrow Band (UNB) solution that operates in the 869 MHz (Europe) and 915 MHz (North America) bands. Its signal is extremely narrow-band (100 Hz bandwidth). It is based on Random Frequency and Time Division Multiple Access (RFTDMA) and achieves a data rate around 100 bps in the uplink, with a maximum packet payload of 12 Bytes, and a number of packets per device that cannot exceed 14 packets/day. A business model in which SigFox owns the network has shifted the community interest towards other more flexible and open technologies such as LoRa [208].
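To give an impression of what a 12-Byte payload budget means in practice, the following Python sketch packs a hypothetical sensor reading (timestamp, temperature, humidity, battery level) into exactly 12 bytes. The field layout is an illustrative assumption, not part of the SigFox specification.

```python
import struct
import time

def pack_sigfox_payload(temp_c: float, humidity_pct: float, battery_mv: int) -> bytes:
    """Pack one reading into a 12-byte SigFox uplink frame.

    Layout (illustrative): 4 B unix time, 2 B temperature in 0.01 degC,
    2 B humidity in 0.01 %, 2 B battery in mV, 2 B reserved/flags.
    """
    frame = struct.pack(
        ">IhHHH",
        int(time.time()),
        int(round(temp_c * 100)),
        int(round(humidity_pct * 100)),
        battery_mv,
        0,
    )
    assert len(frame) == 12, "SigFox uplink payload must not exceed 12 bytes"
    return frame

print(pack_sigfox_payload(21.37, 48.2, 3291).hex())
```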


2.7.1.3 IEEE 802.15.4g

IEEE 802.15.4g is dedicated to Low-Data-Rate, Wireless, Smart Metering Utility Networks [76]. The IEEE 802.15 WPAN task group 4g (TG4g) proposes the first set of PHY amendments to extend the short range portfolio of the IEEE 802.15.4 base standard [133]. The standard defines three PHY layers, namely Frequency Shift Keying (FSK), Orthogonal Frequency Division Multiple Access (OFDMA), and offset Quaternary Phase Shift Keying (QPSK), which support multiple data rates ranging from 40 kbps to 1 Mbps across different regions. With the exception of a single licensed band in the USA, the PHY predominantly operates in ISM (sub-GHz and 2.4 GHz) bands and thus co-exists with other interfering technologies in the same spectrum range. The PHY is designed to deliver frames of a size up to 1500 Byte so as to avoid fragmenting Internet Protocol (IP) packets [208].

2.7.1.4 DASH7

The DASH7 Alliance protocol (D7A) [213] is an active RFID alliance standard for 433 MHz wireless sensor communication based on the ISO/IEC 18000-7 standard maintained by the DASH7 Alliance. ISO/IEC 18000-7 defines parameters of the active air interface communication at 433 MHz. D7A is built on top of an asynchronous Wireless Sensor Network (WSN) Media Access Control (MAC). DASH7 provides multi-year battery life, a range of up to 2 km, low latency for connecting with moving things, a very small open source protocol stack, AES 128-bit shared key encryption support, and data transfer of up to 167 kbit/s [208].

2.7.1.5 Ingenu Random PhaseMultiple Access (RPMA)

Ingenu RPMA [30, 146] developed a proprietary LPWAN technology in the 2.4 GHz band, based on RPMA, to provide Machine-to-Machine (M2M) industry solutions and private networks. The main asset of Ingenu RPMA in comparison with alternative solutions is its high data rate of up to 624 kbps in the uplink and 156 kbps in the downlink. The range is around 5-6 km due to the high spectrum band used [208].

The technologies explained in this section are summarized in Table 2.7.

2.7.1.6 3GPP 4G and 5G

The 3rd Generation Partnership Project (3GPP) standardized a set of low cost and low complexity devices targeting enhanced Machine-Type-Communications (eMTC) [140, 146]. In particular, 3GPP addresses the IoT market with a threefold approach by standardizing Long Term Evolution (LTE) cat. M/M1, the enhanced Narrow Band IoT (NB-IoT), and the Extended Coverage GSM IoT (EC-GSM-IoT). LTE cat. M reaches up to


Table 2.7: Comparative Overview on Different LP-WAN Networks [208]

Technology           | Range                          | Throughput                  | MAX MTU
LoRa                 | 2-5 km urban, <15 km suburban  | 0.3 to 50 kbps              | 55-222 B (SF 7-12)
SIGFOX               | 10 km urban, 50 km rural       | 100 bps                     | Fixed 12 B
IEEE 802.15.4k LECIM | < 20 km LoS, < 5 km NoLoS      | 1.5 bps to 128 kbps         | 16/24/32 B
IEEE 802.15.4g SUN   | 2-3 km LoS                     | 4.8 kbps to 800 kbps        | 2047 B
NB-IoT               | < 15 km                        | 200 kbps                    | 1600 B
LTE cat. M/M1        | < 12 km                        | up: <1 Mbps, down: <1 Mbps  | 1500 B

1 Mbps in the uplink and downlink, and operates in LTE bands within a 1.4 MHz bandwidth (i.e., 6 Physical Resource Blocks (PRBs) of 180 kHz). NB-IoT is an alternative that, thanks to its reduced complexity, has a lower cost at the expense of a decreased data rate (up to 200 kbps in both directions). Finally, EC-GSM-IoT is an evolution of EGPRS towards IoT, with data rates between 70 and 240 kbps. Cellular IoT operates in licensed bands; this fact implies the involvement of the mobile operator in the process of developing private IoT networks. Table 2.8 compares LTE-M1 and NB-IoT [208].

2.7.2 Security of IoT

In IoT systems, there are several architectures with varying levels of centralization. In the most popular, fully centralized approaches, the application platforms located in the Internet (e.g., in the cloud) retrieve raw data from acquisition networks, process the data, and present the processed information towards third party services [93]. Therefore, a central cloud provider is often used to store significant amounts of data and make it publicly or privately available. Currently, there are many platforms following this paradigm, such as Amazon Web Services (AWS) IoT [34], Azure IoT solution accelerators [175], or The Things Network (TTN) [241].

There are many threats related to IoT architectures, which cause the disruption of IoT-integrated systems and raise questions about system and data reliability [217]. For instance, (i) Denial of Service (DoS) attacks can exhaust the system by overloading various resources (e.g., computing or physical spectrum). As an example, jamming of the communication channel can disrupt communication in a given area. (ii) Destruction of the physical infrastructure is feasible, as IoT devices are often insecure against physical access. Another threat relates to (iii) Eavesdropping, which relates to extracting secret


Table 2.8: Comparative Analysis between LTE-M1 and NB-IoT [208]

Metric               | LTE Cat M1                         | NB-IoT
Deployment           | In-Band LTE                        | In-Band LTE, LTE Guard Bands, Standalone
Downlink Modulation  | OFDMA, 16 QAM                      | OFDMA, QPSK
Downlink Data Rate   | Up to 1 Mbps                       | 250 kbps
Uplink Modulation    | SC-FDMA, 16 QAM                    | SC-FDMA, QPSK
Uplink Data Rate     | Up to 1 Mbps                       | 250 kbps (multi-tone), 20 kbps (single tone)
Bandwidth            | 1.08 MHz (6 PRB)                   | 180 kHz (1 PRB)
Duplexing Technology | Full Duplex, Half Duplex, FDD, TDD | Half Duplex and FDD
Latency              | 10 to 15 milliseconds              | 1.6 to 10 seconds
Link Budget          | 155.7 dB                           | 164 dB
Power Class          | 23 dBm, 20 dBm                     | 23 dBm, 20 dBm

information through the overhearing of communication channels. However, when the infrastructure is compromised through either node capturing (e.g., tampering with the node for key extraction) or (iv) Node controlling, the attacker could benefit from the information circulating in the entire network (e.g., if a single master key were used to protect the communication), including physical damage, eavesdropping, node capture, or node controlling [208].

Typically, various countermeasures are introduced to protect against different attacks. For example, frequency hopping [180] can be used against a jamming-based DoS, while re-keying and revocation of compromised keys shall be used in IoT devices dealing with tampering or controlling situations [109]. To allow for information confidentiality, authenticity, and integrity, different mechanisms are specified through cryptographic means at various levels of the system, including the IoT perception, network, and middleware layers [155], as well as the application platform typically residing in the Internet (e.g., AWS IoT [34], Azure IoT [175], or TTN [241]).

Moreover, different security mechanisms have to be considered in terms of authentication and authorization as well as identity and key management. Increasing focus is also put on privacy issues [29] in IoT, allowing users to remain anonymous and providing control over generated information in terms of selective access rights. The IoT system should allow for granting and revoking data access at any time, depending on the relation among users and their willingness to share the data. This also brings issues with respect to trust and governance. Users should be allowed to base their decisions on trust in future


actions of other actors (e.g., a user considers the information from trusted third parties). However, in general, an IoT system should be trusted, allowing for easy governance, where users are not restricted by hard constraints imposed by the system design [77, 208].

Furthermore, a fully trusted system [217] should be fault tolerant, i.e., composed of robust implementations and usable systems integrating intrusion detection, prevention techniques, and recovery services allowing for high system reliability. Finally, the system should bring open interfaces (i.e., an API to easily integrate with other business applications) offering significant market penetration, a long-term return on investment for involved companies (e.g., a company lock-in), scalability to grow with the number of connected devices, and interoperability among different heterogeneous devices.

2.7.2.1 Physically Unclonable Functions (PUF)

In most IoT use cases, data security is a key issue. Therefore, IoT data transmission security is provisioned with different approaches. A typical solution has been encrypting data with a secret key, which is often preserved in non-volatile storage. However, this solution is susceptible to possible attacks, where an unauthorized retrieval of the secret key could occur [190]. To overcome such vulnerabilities, solutions like [198] exploit IoT devices' hardware characteristics. These hardware characteristics, being physically unrepeatable, represent the unique construction of the hardware and the material used for the individual device. This characteristic is used for identifying IoT devices and ensuring the origin of data [208].

There have been several proposals on methods which utilize hardware's unique characteristics for identification, i.e., Physically Unclonable Functions (PUF) [44]. PUF methods exploit small variations in the characteristics of Integrated Circuits (ICs) of IoT devices to derive unique keys based on the individual characteristics of the IC, thus enabling secret key generation without the need for storage in a non-volatile memory [126]. This is due to the fact that the PUF-generated key can be regenerated on the fly and, whenever regeneration is needed, the PUF output shall be the same. The PUF-generated key acts like a DNA for a hardware device. PUF-based solutions take an IC's characteristics as input and produce a key that is unique to the given IC (cf. Figure 2.13). A valid and reliable PUF must produce the same key when using the same IC with high probability and must, within reason, produce a different key for each new IC.

The concept that initiated PUFs, originally termed Silicon Physical Random Functions, was first proposed by [126]. They described several possible implementations of circuits to identify and authenticate individual ICs, using Field Programmable Gate Arrays (FPGAs). Their work showed that there is sufficient variability in the manufacturing process of ICs to produce a unique set of Challenge and Response Pairs (CRP) for each circuit, thus enabling both identification and authentication of ICs.


Figure 2.13: The Generic Concept of Physically Unclonable Functions (PUF) [149]

Figure 2.14: Fuzzy Extractor Using Power‐up SRAM State as Input Data [198]

A great number of IoT hardware providers exist in the market, such as the Arduino devices (e.g., Arduino Mega2560, Arduino Uno) [2]. There are three pools of memory in the microcontroller used on AVR-based Arduino boards [9], i.e., (i) Flash memory (program space), where the Arduino sketch is stored, (ii) SRAM (Static Random Access Memory), where the sketch creates and manipulates variables when it runs, and (iii) EEPROM, a memory space that programmers can use to store long-term information. Flash memory and EEPROM memory are non-volatile (the information persists after the power is turned off). In contrast, SRAM is volatile and data on it is lost when the power is cycled. The ATmega2560, in the Mega2560 (which is the main hardware used in this work), has 8 kByte of SRAM space [9].

The manufacturing variability of SRAM chips causes small mismatches between N-channel and P-channel transistors, which result in individual SRAM cells having a bias towards 0 or 1 when powering on the chip. Since these cells' values are consistent for each power-on, this is seen as a physical characteristic of the chip. Therefore, the start-up values of SRAM bits are used as the input for a PUF [190].
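The following Python sketch illustrates the idea of deriving a device fingerprint from SRAM power-up values: bits that are stable across several power-ups are kept, and their values are hashed into a key. It is a conceptual illustration only; reading real SRAM start-up content requires device-specific firmware (e.g., on the ATmega2560), and the helper names below are hypothetical.

```python
import hashlib

def stable_bits(powerup_dumps):
    """Return (positions, values) of bit positions identical in every dump."""
    n_bits = len(powerup_dumps[0]) * 8
    positions, values = [], []
    for i in range(n_bits):
        bits = {(dump[i // 8] >> (i % 8)) & 1 for dump in powerup_dumps}
        if len(bits) == 1:                 # this bit never flipped across power-ups
            positions.append(i)
            values.append(bits.pop())
    return positions, values

def sram_fingerprint(powerup_dumps):
    """Hash the stable start-up bits into a device-unique key (no key storage)."""
    _, values = stable_bits(powerup_dumps)
    packed = bytes(int("".join(map(str, values[i:i + 8])).ljust(8, "0"), 2)
                   for i in range(0, len(values), 8))
    return hashlib.sha256(packed).hexdigest()

# Example with fake 4-byte "SRAM dumps" from three power-ups of one device:
dumps = [bytes([0b10110010, 0xF0, 0x0F, 0x55]),
         bytes([0b10110110, 0xF0, 0x0F, 0x55]),   # one noisy bit differs
         bytes([0b10110010, 0xF0, 0x0F, 0x55])]
print(sram_fingerprint(dumps)[:16], "...")
```

Discarding unstable bits is a crude form of noise handling; the fuzzy extractors discussed below handle residual noise in a cryptographically sound way.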


[141] proposed using the power-up state of SRAM as an identifying fingerprint in 2008. [190] investigated using this approach as a PUF for chip identification in 2013 and examined the manufacturing process of SRAM chips. [190] showed that the microcontroller they were testing, the ATMega 1284p, contained enough entropy to uniquely identify each chip based on its power-up state. They also showed, however, that the error rates, meaning the rate of SRAM cells that behave completely randomly, of SRAM PUFs are too high to be used for applications such as key generation and authentication without utilizing some form of error correction [190].

In many cases, the output of PUF algorithms is prone to noise, i.e., it deviates slightly from an expected output. To alleviate such noise, fuzzy extractors were developed. Fuzzy extractors were proposed in 2008 by [92, 198] as a primitive to extract nearly uniform randomness from noisy PUF data in order to securely authenticate biometric data. The primitive can be applied to any information that is not reproducible precisely and is not distributed uniformly. The output of fuzzy extractors is error-tolerant, meaning the same output is produced for input data that is slightly variant, as long as the differences remain within some margin. Two separate scannings of the same fingerprint would, therefore, produce the same key, if the scannings match closely enough to assume they came from the same finger. This can be applied to many different kinds of input data and thus can be used to generate a unique key from the power-up states of SRAM chips if the error rate is low enough [198].

The result of the process behind fuzzy extractors is outlined in Figure 2.14, which is the extraction of a uniform random string R from some input w. Additionally, an input w' that is close to w can reproduce an identical R. This is achieved using two separate procedures, the generation procedure (GEN) and the reproduction procedure (REP). During the generation procedure, the fuzzy extractor outputs a non-secret helper P alongside R. During the reproduction procedure, P is then used together with w' to reproduce R, as long as the distance between w and w' is below a threshold t. This will only work with both w' and P, as neither provides adequate information on R on its own [92, 198].
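As a concrete illustration of GEN and REP, the sketch below implements a toy code-offset fuzzy extractor in Python, using a simple repetition code for error correction and SHA-256 as the extractor. Real designs (e.g., in [92, 198]) use stronger codes such as BCH and formally analyzed randomness extractors, so this is a didactic sketch only.

```python
import hashlib
import secrets

REP = 5  # repetition factor of the toy error-correcting code

def _encode(bits):          # repetition code: repeat each bit REP times
    return [b for bit in bits for b in [bit] * REP]

def _decode(bits):          # majority vote over each block of REP bits
    return [int(sum(bits[i:i + REP]) > REP // 2) for i in range(0, len(bits), REP)]

def gen(w):
    """GEN: from noisy input bits w, derive key R and public helper P."""
    k = [secrets.randbelow(2) for _ in range(len(w) // REP)]
    c = _encode(k)
    p = [wi ^ ci for wi, ci in zip(w, c)]      # helper: offset of w from a codeword
    r = hashlib.sha256(bytes(k)).hexdigest()   # extracted key
    return r, p

def rep(w_prime, p):
    """REP: reproduce R from a close reading w' and the helper P."""
    c_noisy = [wi ^ pi for wi, pi in zip(w_prime, p)]
    k = _decode(c_noisy)                       # corrects up to REP//2 flips per block
    return hashlib.sha256(bytes(k)).hexdigest()

w = [secrets.randbelow(2) for _ in range(40)]  # e.g., 40 SRAM start-up bits
r, p = gen(w)
w_noisy = w[:]
w_noisy[3] ^= 1
w_noisy[22] ^= 1                               # two flipped (noisy) bits
assert rep(w_noisy, p) == r                    # same key despite the noise
print("key reproduced:", r[:16], "...")
```

The helper P is public, yet on its own it reveals nothing about R, since the underlying codeword is chosen at random in GEN.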

2.7.3 Choices of IoT Protocols

Considering the requirements and environmental characteristics of BIoT use cases, regarding the diversity of products and processes, the best set of IoT devices needs to be selected to serve the purpose of long-range communications with low power consumption. Regarding the data packets, since the LP-WAN protocols only support small data packets, an LTE-based technology can be employed as a replacement for the LP-WAN technologies in case large data sizes need to be transmitted per transmission. Based on the comparative analysis presented in Table 2.7 and Table 2.8, this thesis selects LoRaWAN and LTE-M1, which are enabling protocols that can play complementary roles


in the supply chain use case, based on data rate, communication range, and power efficiency. To use LTE-M1, the communication service needs to be subscribed to from a service provider, e.g., Swisscom. LTE-M serves as a complementary technology to LoRa nodes concerning coverage and data transmission rates; however, both LTE-M and LoRa support long-range communication.

2.8 Blockchain-IoT Integration (BIoT)

Studies have shown that a well-designed and pre-configured IoT setup can benefit from BCs in many ways [120], [219], [170], [100], as introduced in the following based on [208].

2.8.1 BIoT Incentives

BIoT potentials encompass the driving force for application developers as well as platform providers to explore BIoT operational advances. Five key incentives and potentials in the BC integration of IoT use cases are explained as follows.

2.8.1.1 Decentralized security provisioning

BCs do not require a TTP, which means that the secure and trusted "decentralization" created by BCs prevents any individual or authority from controlling or tampering with particular data persisted in the BC. Thus, depending on the BC-IoT integrated application requirements, the full range from BCs to DLs can be deployed.

2.8.1.2 Reliability and resiliency

Once a TX is validated and appended to a BC, the content of this TX is immutable and distributed across all BC nodes. The TX stored will be accessible to all BC clients at a high, uninterrupted availability, since full BC nodes store an entire copy of the distributed ledger locally. Thus, the establishment of a reliable and resilient data storage is possible. Data reliability and resiliency are not always achievable with DSSes due to their data accessibility regulations and pricing models. The reliability of Cloud storage systems for IoT applications is also questionable due to their centralized nature.

2.8.1.3 Traceability

Since all TXs are stored within a BC in chronological order, all BC clients can trace any content and the order of these TXs. Thus, data persisted once in a BC will be accessible with respect to their time, possibly geo-location, and content. Note that different data storage


alternatives exist, e.g., storing only a hash of the data on-chain and storing the full data off-chain. This is often adopted in BIoT use cases, such as Supply Chain Tracking (SCT), since private data can be maintained effectively.
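A minimal Python sketch of this on-chain/off-chain split is shown below: the raw IoT record is kept in an off-chain store, while only its hash is placed in a (here simulated) chain, so anyone can later verify the record's integrity. The storage interfaces are hypothetical placeholders, not a specific BC API.

```python
import hashlib
import json

OFF_CHAIN_STORE = {}   # stand-in for a database or DFS (e.g., IPFS-like storage)
ON_CHAIN_LOG = []      # stand-in for TXs persisted in a BC

def anchor_record(record: dict) -> str:
    """Store the full record off-chain and only its hash on-chain."""
    payload = json.dumps(record, sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest()
    OFF_CHAIN_STORE[digest] = payload           # private/full data stays off-chain
    ON_CHAIN_LOG.append({"data_hash": digest})  # immutable proof of integrity
    return digest

def verify_record(digest: str) -> bool:
    """Re-hash the off-chain record and compare it against the on-chain anchor."""
    payload = OFF_CHAIN_STORE[digest]
    return hashlib.sha256(payload).hexdigest() == digest and \
           any(tx["data_hash"] == digest for tx in ON_CHAIN_LOG)

h = anchor_record({"device": "sensor-42", "t": 1650000000, "temp_c": 4.2})
print(verify_record(h))   # True as long as neither copy was tampered with
```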

2.8.1.4 Autonomic Interactions

BCs grant IoT devices the ability to interact with autonomous processing entities, i.e., Smart Contracts (SC), which define immutable programs, e.g., running in the Ethereum BC [36]. Autonomic communications of IoT nodes without a TTP are possible, since contract clauses embedded in SCs are executed in a self-governed fashion when a certain condition is satisfied (e.g., a user breaching a contract will be fined automatically) or when a threshold is violated (e.g., IoT-connected environment monitoring sensors sense Carbon Monoxide (CO) levels reaching a threshold amount; consequently, the corresponding SC will execute an alarm function) [199]. Thus, a BC-IoT integrated application can autonomously process and stimulate actions. Such a decentralized process handling as done by SCs is not offered by centralized Cloud storage systems or DSSes.

2.8.1.5 Identity and AccessManagement (IAM)

IoT devices are uniquely identified; however, only IAM provides authentication and authorization. Since the maintenance of credentials, such as keys and certificates for IoT devices, is important, trusted and distributed authentication and authorization services for IoT devices are required. Authentication refers to the process of verifying the identity of a user or process which, once successfully performed, can lead to authorization by providing access to resources. An Authentication and Authorization Infrastructure (AAI) can be extended to operate with and for IoT device identities, too [183].

Handling digital identities is a broad research subject with much hope being put into BCs due to their immutable nature [144]. On Ethereum specifically, there are many standards (or standardization proposals) to handle the identity of users and devices. These proposals are submitted through Ethereum Requests for Comments (ERCs) or Ethereum Improvement Proposals (EIPs). The ongoing developments for identity standards on Ethereum can be viewed on [113]. The identity standard used in this thesis is ERC 725 v1, later moved into the key-manager standard ERC 734 [253], and is extended with claim holder functionality as described in ERC 735 [252].

ERC 734 describes standard functions for uniquely identifiable proxy SCs that can be used by other accounts or even other SCs. The contracts can describe anything from humans to groups or devices, and act as an identity proxy on the BC. The described identity SC has a key-storage that controls the degree to which other parties can interact with the contract, e.g., upon creation of the contract, a management key is added to the key-storage


based on the creator's account address to ensure certain functionality can only be executed by the SC creator. The contract also features an execute function to run arbitrary contract calls, which enables it to act as a proxy for whatever instance it is representing [33]. The ERC 734 standard is based on having proxy SCs deployed on the BC to represent users, devices, and groups. A trusted party can then verify the identities of these instances. To register a successful identification on the BC, the trusted third party's proxy contract will issue a signed claim stating the verification of a specific identity. This claim can be added to the corresponding proxy contract of the device, which consequently states to any account or contract interacting with the proxy that its identity was verified by the trusted third party.

ERC 735 is an associated standard to ERC 734 that extends its functionality to add and remove claims. The idea is that, similarly to the real world, where a government will vouch for an individual's claim of being a citizen of their country by issuing a passport, signed claims can be issued by a trusted authority on the BC and added to the identity SCs. The claims will be hashed and signed by an identity contract of this trusted third party. In order to issue claims, this contract must first add a claim key to its key-storage, as it will be used to sign the claim and will later be checked by anyone wishing to check if the claim is valid [252].
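The sketch below mirrors this claim flow off-chain in Python: an issuer hashes (subject, topic, data), signs the digest with its claim key, and a verifier checks the signature against the issuer's known key. Signing is abstracted with HMAC purely for illustration; on Ethereum, ERC 735 claims are signed with ECDSA keys held in the issuer's ERC 734 key-storage, and the field names used here are illustrative assumptions.

```python
import hashlib
import hmac

ISSUER_CLAIM_KEY = b"issuer-claim-key"     # stands in for the issuer's signing key

def claim_hash(subject: str, topic: int, data: bytes) -> bytes:
    """Digest over (subject identity, claim topic, claim data)."""
    return hashlib.sha256(subject.encode() + topic.to_bytes(4, "big") + data).digest()

def issue_claim(subject: str, topic: int, data: bytes) -> dict:
    digest = claim_hash(subject, topic, data)
    signature = hmac.new(ISSUER_CLAIM_KEY, digest, hashlib.sha256).hexdigest()
    return {"subject": subject, "topic": topic, "data": data, "signature": signature}

def verify_claim(claim: dict, issuer_key: bytes) -> bool:
    digest = claim_hash(claim["subject"], claim["topic"], claim["data"])
    expected = hmac.new(issuer_key, digest, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, claim["signature"])

# A device's identity proxy would hold this claim; any party can verify it later.
c = issue_claim("0xDeviceProxy01", topic=1, data=b"verified IoT device")
print(verify_claim(c, ISSUER_CLAIM_KEY))   # True
```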

2.8.2 Overview on BIoTUse Cases

Such advances brought by the decentralized nature of BCs have been employed in a number of use cases; some of the main ones are introduced as follows based on [208].

2.8.2.1 Smart Grid

operates as an electricity grid combined with information technology to enable the mutual exchange of control information and to allow for the management and monitoring of the distribution of electricity from various sources. Smart Grids need communications among a large number of devices to be tracked, monitored, analyzed, and controlled within a network. Therefore, distributed automation is essential, and it can be achieved through the application of IoT-based sensors, smart actuators, and meters in production, transmission and distribution, storage, and the management of electricity consumption. With Smart Grids, the role of consumers is reshaped into so-called prosumers, who can generate and consume energy. For instance, prosumers holding unused energy can sell the electricity they produced. The respective trading process defines a P2P energy trading [85]. Such energy trading can benefit from secure and trusted BIoT platforms, since the volumes of consumed/produced or traded energy can be stored in a decentralized ledger.


2.8.2.2 Smart City andHome

shows various interconnected infrastructure components, mobility and traffic management, citizen relations, and environmental resource monitoring [85, 101]. Thus, IoT-based approaches for interoperation are exposed as potential attack targets. To protect against security attacks and related risks, a Smart City ecosystem must support data encryption, anonymization, and pseudonymization. This can be supported by BIoT architectures with the employment of SC-based autonomous interactions and an execution of automated actions on top of immutable data handling and persistence.

A traditional house being equipped with a human-intractable set of IoT sensors and processing units is termed a Smart Home, since respective units of the infrastructure can monitor resources and the environment. Configuration settings of energy, light, or shutters within a given home could be updated, preferably automatically, based on the data sensed. Such an approach can enhance the comfort and safety of residents by achieving a higher energy efficiency, higher sustainability, and surveillance, while BCs offer traceability of the decisions taken based on these data monitored and persisted immutably.

In this sense, there have been theoretical research and proposed solutions in the area of pollution monitoring with different approaches. Some instances of these solutions are the methods used by the Queensland government [195], Semtech [225], and Aeroqual [31]. Also, modern laboratories for pollution monitoring collect the data using sensors placed near or in the examined area. Considering that the monitored areas might be located far from the laboratories, regularly maintaining the sensors and accessing them to read the data is not always feasible.

Along with the distance between the deployed sensors and data centers, the main concerns in developing such a Blockchain-based Pollution Monitoring System (BPMS) can be listed as data accuracy, sensor power consumption (i.e., battery lifetime), the size of the sensor nodes, the need for human interaction for accessing the data recorded by sensors, the communication protocols used for connecting sensors and application servers, and following a centralized architecture.

For small-area communications, employing WiFi-based sensors or Ethernet-connected ones might be a potential solution; however, such communication media are not suitable for long-distance communications due to their high power consumption requirements. There are some methods to reduce the energy consumed by sensor nodes, such as using DARAL [110] (re-charging sensor batteries over a WiFi network), but DARAL restricts the communication range of sensor nodes, too [48]. A more plausible solution is employing Low Power Wide Area Network (LPWAN) protocols that help sensors communicate over long distances with comparatively lower power consumption, reducing the costs and time of IoT infrastructure maintenance.


In some of the recent IoT-based BPMS proposals, BCs are being used for data storage purposes, such as the solution in [78]. Considering that in BPMSes data collection is done frequently every day by sensors, high TX fees and block-time may make the BC incongruous for this use case. Thus, selecting a proper BC plays a critical role in each use case. TX validation time, block-time, scalability, and supporting Smart Contracts (SC) are some of the considerable factors in selecting a BC for such a use case.

2.8.2.3 Healthcare

In general, healthcare platforms interact with multiple stakeholders, such as patients, healthcare providers (e.g., hospitals), insurance companies, government agencies, clinical researchers, and pharmaceutical suppliers. Given their role in the well-being of people, IoT-integrated healthcare systems support the sharing of data generated by IoT sensors. However, IoT actuators need to be stringently managed, since only authorized and correct commands shall be processed. Thus, with a suitable IAM integration for data collection, storage, and control decisions, health platforms need to guarantee transparency, immutability, and decentralization inherently. Therefore, the use of SCs residing within the BC can restrict data accessibility on a "need-to-know" basis, which meets privacy requirements. Privacy preservation challenges and the security of healthcare data can be addressed by storing the proof of data integrity on the BC. By relying on these features, data collected by medical sensors can be automatically sent to the central control by triggering an SC, thus supporting real-time monitoring of patients. BIoT-enabled methods support the privacy of healthcare users and can verify their authenticity and identity. As of today, a BIoT-based healthcare system can monitor a pandemic outbreak, like COVID-19 as of late 2019, such that infected people with IoT sensors could be tracked and countermeasures could be applied, while protecting the patients' privacy [71], [188].

Sharing collected personal data, such as health-related data, could raise several privacy-related concerns for data owners. There are BC-based approaches that provide the means for sharing IoT-collected data while maintaining user privacy and data security aspects. Datapace [104] proposes a distributed and decentralized system for trading IoT data, based on BCs, to provide tokenized assets, data integrity, SC capabilities, network security, and immutability via the Practical Byzantine Fault Tolerance (PBFT) [189] consensus mechanism. The Datapace marketplace uses the Mainflux IoT platform [162] for its back-end. Mainflux is a secure, scalable, and open-source IoT cloud platform. It allows users to connect their devices to it by using standardized network protocols such as Message Queuing Telemetry Transport (MQTT), WebSocket, and the Constrained Application Protocol (CoAP). To ensure data security and validity, Datapace is based on the Proof-of-Verified-Source. Datapace also produces network and sensing equipment to be used by the data providers. This way, Datapace ensures that the data traded is from a valid source.


IDMoB [187] proposes a fully decentralized IoT data marketplace designed to be accessible under all circumstances by running on BCs. The rules are enforced with SCs deployed on the Ethereum network [69], and data is stored on the Swarm distributed storage platform [114]. SCs are designed to support a query mechanism for searching the available vendors, sensor types, and geolocations, a payment mechanism, and a voting system for users to assess available data sellers. IDMoB relies on SCs, and it is neither designed for time-critical data nor for streaming data in real-time. IDMoB is not GDPR compliant [127] due to publishing the data on a publicly accessible BC. The file handles of the data stored on Swarm are publicly visible on the BC. Because of that, each data chunk is encrypted with a symmetric key. A new symmetric key is created for each file upload by a data vendor. This is achieved by utilizing a hierarchically deterministic key mechanism [216]. The SC-based method employed in IDMoB ensures a secure mechanism via which buyers obtain the keys for decryption upon buying the data. This encryption mechanism adds to the complexity and certainly hinders the user experience.

[247] propose Sash, a framework to couple IoT platforms with BCs. They follow a hybrid approach by storing the data off-chain while handling the Policy Decision Point (PDP) via BC SCs. This framework is designed in such a way that data owners encrypt the data being stored on their system. They achieve this by utilizing prefix encryption [261], which allows fine-grained access control and minimizes the overhead required to distribute the keys. They use FIWARE [122] as their IoT platform and Hyperledger Fabric [145] as their BC. Since data is stored off-chain, the storage provider is a node running the SC and performing the access control accordingly.

The main components of Sash are the IoT stakeholder, the router, cloud storage, and the BC network. Sash offers two data sharing schemes: (a) an ACL-based scheme, i.e., a data owner stores the ACL list and enforces the access control decisions. ACL updates are stored on the BC in order to ensure immutability. Since this approach enables the data to be stored unencrypted, the scheme assumes that the storage provider is trustworthy, meaning it will not abuse the ACL rules. (b) A prefix encryption-based scheme. With this scheme, the access control is cryptographically enforced, so the assumption that the storage provider is trustworthy does not need to hold. Also, the scheme introduces a key authority for managing the PKI. The BC only plays the role of the payment provider in this scheme. The key authority is in charge of enforcing the ACL rules by generating the decryption keys for the data buyer upon successful payment for the data. Assessments of the actual performance of these two schemes with different data sizes resulted as follows: (a) Committing the data takes longer than fetching it. The time increases linearly with the data size for both fetching and committing, e.g., for 2.3 MB of data, it takes around 3800 ms to commit and around 200 ms to fetch the data. (b) For a 20 MB file, it takes around 8000 ms to commit and 2000 ms to fetch the data.


[196] intends to use BC and other distributed ledger technologies for implementing a fully decentralized system and a technology-agnostic Streaming Data Payment Protocol (SDPP). They utilize Ethereum as their BC platform and the InterPlanetary File System (IPFS) as a distributed file system. SDPP is used as a streaming data protocol. In [196], the data is stored off-chain, and the BC is used just for storing crucial information about the entities within the system. However, there are no details available about the system performance in terms of data pushing, storing, and streaming.

The assessment of this platform is that it involves overly specific, time-consuming steps that need to be taken manually by users (e.g., installing IPFS clients on their machines just to be able to search the data). Therefore, the mere initiation of data trading is a complex process, potentially deterring users from operating on the marketplace.

2.8.2.4 Supply Chain Tracing

In SCT, not only the sequence of fund TXs is needed, but also the sequence of actions performed on a particular product during its life cycle is valuable. The traceability of funds and actions relies on accurate time stamping, which is enabled by BC-IoT integrated applications. To circumvent boundaries and trust issues between different stakeholders, SCT, like other BIoT use cases, integrates BCs and IoT. In BC-IoT-integrated SCT, IoT devices are used for sensing, collecting, and monitoring data automatically in distributed settings, whereas BCs store the time stamps, geo-location information, and an IoT device identifier. Thus, actions and conditions monitored throughout a product's life-cycle are persisted and traceable afterwards. Using BIoT, quantitative and qualitative data from such processes can be mapped to resources and actions, which control the full life-cycle in a transparent manner. Recent studies elaborate on several aspects of employing BCs in SCT, such as [63], [234], and [47].
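A minimal Python sketch of the kind of record such an SCT dApp would persist per tracked event is shown below: it bundles the IoT device identifier, a time stamp, and geo-location with a hash of the sensed measurements, ready to be written into a TX. The field names and the submission step are illustrative assumptions, not a specific dApp interface from this thesis.

```python
import hashlib
import json
import time

def build_sct_event(device_id: str, lat: float, lon: float, measurements: dict) -> dict:
    """Build one supply-chain tracking event for on-chain persistence.

    Only the hash of the raw measurements goes into the record, so bulky or
    private sensor data can stay off-chain while the event remains verifiable.
    """
    payload = json.dumps(measurements, sort_keys=True).encode()
    return {
        "device_id": device_id,
        "timestamp": int(time.time()),
        "geo": {"lat": lat, "lon": lon},
        "measurement_hash": hashlib.sha256(payload).hexdigest(),
    }

event = build_sct_event("lora-node-17", 47.3769, 8.5417,
                        {"temp_c": 6.1, "humidity_pct": 71.0, "shock_g": 0.2})
print(json.dumps(event, indent=2))   # this record would be submitted as a TX
```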

There are many use cases deploying Supply Chain Tracing (SCT) systems based on BCs, since BC and SCT integration supports the persistence of data, their integrity, and, consequently, the establishment of trusted boundaries between different actors for a product's life cycle. Since the production of a particular product encompasses a large number of actions and conditions during its life cycle, collecting time-stamped quantitative and qualitative data from the process throughout a product's supply chain, even including the geo-location of resources or times when certain steps had been performed, adds value beyond financial incentives such as region labels. To collect the required data automatically, wireless sensors, IoT devices, and surveillance devices, such as cameras and drones, are integrated. Unfortunately, there is no distributed and BC-based dairy SCT system in the market that publishes its implementation details and evaluation publicly to be compared here; however, recent studies such as [47] elaborate on several aspects of employing BCs in SCT systems in general.


One BC-based SCT dApp platform is proposed in [14] to provide better performing value chains by proposing a new food-on-demand business model based on Quality-of-Experience (QoE) food metrics. The research focused on key steps of the food chain, addressing emergent needs of various actors. The system is applied to "grapes" SCT.

[256] presents AgriBlockIoT, a decentralized BC-based traceability solution for SCT to integrate IoT devices producing and consuming digital data along the supply chain. AgriBlockIoT defines the use case from-farm-to-fork, which achieves traceability using two different BCs, i.e., Ethereum and Hyperledger Sawtooth. The proposed system guarantees transparency and traceability of products by directly collecting all relevant data from IoT devices along the supply chain and storing them inside the BC directly.

[17] defines a methodology for traceability along food supply chain production processes, to provide end-users with sufficient knowledge of the product. This BC-integrated methodology implements its SCT application in the organic coffee industry in the Colombian market. "Cold Chain" addresses the sterile safety of products throughout the supply chain, involving storage, production, transportation, and distribution.

Modum.io [63] follows the deployment of a BC-based SCT solution in the pharmaceutical domain by using IoT devices and BCs to reach data immutability and publicly accessible temperature records during transport. This approach proposes reducing operational costs in the pharmaceutical supply chain, since sensor devices examine the temperature of each package of medicine during the shipment for comprehensive affirmation of GDPR (General Data Protection Regulations). Smart Contracts (SC) are used in Modum's approach to assess the temperature information, and offline features, i.e., storing data internally, enable temperature reports at a later point in time.

[248] also represents a solution for the agricultural food supply chain problem of traceability by deploying a BC-based approach to provide information security within the Chinese market. This work compares traditional supply chain systems with PEST, an analytical model evaluating the macro-environment location of the industry. PEST comprises four factors: political, economic, social, and technical. Based on these factors, the study concludes that BCs are well-suited for governments to maintain traceability and security. This helps manufacturers to record transactions with full authenticity.

[16] presents a real-time traceability system for the food supply chain based on HACCP (Hazard Analysis and Critical Control Point), BC, and IoT. It serves as a platform for accessing information across all supply chain members with transparency, immutability, security, reliability, and neutrality. RFID technology is used as a unique digital cryptographic identifier for connecting physical items to their virtual identity in the system. The data in the system is stored within BigchainDB, which can be accessed by any user. Monitoring of data, sharing, and interaction of users are possible, too.


Enterprise Resource Planning (ERP) and Blockchains

Enterprise Resource Planning (ERP) was coined by Gartner Group in the 1990s [165].

It is the integration of an organisation's departments and functions in a single computer system. ERP manages, stores, and traces resources and aims at operation and cost efficiency as an SCT enabler. ERP ensures that all relevant material, in the exact needed amount, for a business or its production is available in the right place and at the right time [60]. There are two different types of ERP systems in the market, the proprietary and the open source types. Both types have reached great success, as the total revenue of the proprietary type was 82 billion $ in 2017 [1]. This indicates the rapidly increasing importance of ERP systems in the industry. Some of the worldwide leading companies are SAP, Oracle, Intuit, FIS Global, Infor, and Microsoft (MS) [165]. There is no real evaluation of the usage of open source ERP systems. However, some research shows the number of downloads, giving an indication of the popularity of each brand. For example, Openbravo registered about 427'203 downloads within a year, while Compiere has been downloaded 135'128 times (Nov 2017 - Nov 2018). Indeed, the number of downloads does not directly indicate that the respective software is deployed to the same extent in organizations [19].

It is envisioned that Industry 4.0 will apply different approaches to combine ERP systems with BCs [181]. Research on the integration of BCs and ERP systems is still in its early days, and several technical and integration challenges need to be addressed. However, the great potential of such an integration is foreseen due to the complementary roles of these two paradigms. In this regard, a brief comparison of ERPs and public BC-based dApps in SCT use cases is summarized in Table 2.9. This comparison indicates a great potential in BC-based dApps and justifies the initial intentions in using such distributed platforms.

In the remainder of this thesis, the design and implementation of a selected set of BC-IoT integrated systems are presented as samples of the use case areas introduced above, which include (i) an IoT device identification platform, (ii) a pollution monitoring system based on BIoT, (iii) a Supply Chain Tracking (SCT) dApp, and (iv) a P2P IoT data streaming and trading platform.

2.8.3 BIoT Architecture

BIoT depends on the communication between the underlying IoT infrastructure and the BC. IoT devices can generate massive amounts of data in real-time, and this can cause network congestion when a large number of IoT devices stream data at the same time [249]. While BIoT can be instantiated in different ways, such as establishing IoT devices via fog computing as a layer between the cloud and the edge (cf. Section 2.8.3.1), the main BIoT architectures rely on the models of [215], differentiated into (i) IoT-IoT models, (ii) IoT-BC models, and (iii) hybrid models (cf. Figure 2.15). BIoT architectures are designed based on


Table 2.9: Comparison of Supply Chain Tracing dApp and ERP Systems [201]

Measurement                        | BC-based dApp | ERP-based App
Data Collection from Supply Chain  | ✓             | ✓
Intra-Organizational Interactions  | ✓             | ✓
Inter-Organizational Interactions  | ✓             | X
Integration with IoT               | ✓             | ✓
Cost Efficiency                    | X             | ✓
Limited Coverage Scope             | X             | ✓
Deployment Simplicity              | ✓             | X
Storage Efficiency                 | ✓             | X
Data Accessibility for End Users   | ✓             | X
Transparency for End Users         | ✓             | X
Trustability for End Users         | ✓             | X

one or a combination of these models, which have to encompass the necessary functionality to provide security, scalability, and energy-efficiency.

The IoT-IoT Model operates off-chain, storing data within databases, while only a proof of data integrity, i.e., data hashes, is stored in the BC. Thus, data storage demands are minimized and IoT interactions occur without involving the BC. This model is employed when a reliable data channel between IoT devices exists and is protected with IoT security measures, and when IoT interactions occur with low latency. The IoT-BC Model stores all interactions in the BC. Thus, the autonomy of IoT devices increases, since IoT devices act as BC clients, where IoT-to-BC transactions are triggered. By collecting all IoT data within immutable records of transactions, IoT-BC offers details of all interactions in the BIoT platform. For the Hybrid Model, a set of IoT transactions will be stored in the BC, and the remainder will be transmitted throughout the IoT network without involving the BC. The challenging part is to optimally categorize transactions, either in advance or on the fly, such that BIoT applications leverage the benefits of BCs and real-time IoT interactions. Hence, the hybrid model employs fog and cloud-based computing architectures to benefit from these processing units.
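A very small Python sketch of such a categorization rule is given below: it decides, per IoT transaction, whether the TX is persisted on-chain or only forwarded within the IoT network, based on illustrative criteria (a criticality flag and a value-change threshold). The criteria and thresholds are assumptions for illustration, not rules defined by the models in [215].

```python
# Toy classifier for the Hybrid Model: decide per IoT TX whether it goes
# on-chain (immutable record) or stays off-chain (low-latency IoT path).
CHANGE_THRESHOLD = 0.5   # illustrative: minimum sensor-value change worth anchoring

def route_tx(tx: dict, last_onchain_value: float) -> str:
    if tx.get("critical"):                              # e.g., alarms, ownership changes
        return "on-chain"
    if abs(tx["value"] - last_onchain_value) >= CHANGE_THRESHOLD:
        return "on-chain"                               # significant state change
    return "off-chain"                                  # routine reading, keep it local

readings = [
    {"value": 4.1, "critical": False},
    {"value": 4.2, "critical": False},
    {"value": 5.0, "critical": False},
    {"value": 5.1, "critical": True},
]
last = 4.1
for tx in readings:
    decision = route_tx(tx, last)
    if decision == "on-chain":
        last = tx["value"]
    print(tx, "->", decision)
```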

2.8.3.1 Cloud, Fog, Edge-based Architectures

In cloud-based architectures, data collected by the IoT device layer (i.e., the edge layer) is forwarded without further processing directly to the cloud through IoT gateways. Thus, access to cloud servers becomes a Single Point of Failure (SPoF). If an IoT device connected


Figure 2.15: BC and IoT Integration Models: (a) IoT–IoT, (b) IoT–Blockchain, and (c) Hybrid Approach [215]

to the cloud or central server is broken [120], IoT devices may be compromised. Moreover, the physical distance between cloud data centers and IoT devices can cause delays in data transfer. Hence, Quality-of-Service (QoS) is negatively impacted for time-critical applications [249]. To support distributed, low latency, and QoS-aware applications in IoT, cloud-based BIoT architectures have evolved with optimizing fog and edge computing approaches.

While fog and edge computing-based architectures support BIoT by shifting parts of the processing tasks from cloud servers to the network edge [249], they still depend on cloud servers and services [120]. Edge computing refers to moving computational resources to the edge of the network, where IoT devices are located. However, resource-constrained devices at the edge do not support strong computational operations [170], [85], [188]. Thus, fog computing emerged as a subset of edge computing [120], offering an intermediate layer between the edge and the cloud. Fog-based architectures facilitate computational, storage, or network-intensive BIoT applications. Fog devices are considered distributed computing instances of the architecture deployed across the edge network.

2.8.3.2 Software-defined Architectures

Deploying Software-defined Networking (SDN) enhances the performance of BIoT architectures. For instance, the SDN-based BIoT architecture of [227] relies on a BC-based, distributed cloud architecture with SDN-enabled controller fog devices at the edge of the network (cf. Figure 2.16). SDN architectures expand across a 3-layered structure consisting of the (i) device or edge layer, (ii) fog layer, and (iii) cloud layer.

The device layer lies at the edge of the network. This layer monitors infrastructure environments and transmits un/filtered data to the fog layer for processing. Hence, the device layer requires listening and transmission services to collect data from a monitored


Figure 2.16: An SDN‐enabled Fog and Cloud‐based BIoT Architecture [208]

environment and passes it to the fog layer. The fog layer covers a community consisting of end points and carries out data analysis and service delivery promptly. If needed, the results of the processed data can be sent to the cloud layer. Here, the fog layer accesses the distributed cloud layer to utilize application services and storage or computational resources. The cloud layer provides monitoring and control on a widespread level, whereas the fog layer provides localization. Cloud and fog layers collaborate to materialize large-scale event detection, long-term pattern recognition, and behavioral analysis through distributed computing and storage. The distributed cloud layer can take over the computing workload of fog nodes, if they become incapable of processing local data due to a lack of sufficient computing resources [227].

[227] defines a distributed cloud-based BC. Moreover, a BC-based distributed SDN controller network operates within the fog layer. Each SDN controller is empowered by a flow-rule analysis function and a packet routing function. Base Stations (BS) provide security in case of security attacks. Moreover, multi-interfaced BSes at the edge of the network enable the adoption of new IoT protocols. A multi-layered BS consists of wireless gateways to collect all raw data coming from local IoT devices. Thus, BSes keep track of the traffic at the data plane and create user sessions.


2.9 Discussion

This Chapter covered the three underlying areas of BIoT, namely BC, IoT, and the adaptation of BC and IoT via different models and architectures. Since the BIoT potential is widely explored in different applications, it faces efficiency enhancement challenges, i.e., scalability, energy efficiency, and security. This thesis always considers all three of these elements and, thus, proposes novel application designs, architectures, and protocol implementations in each of BC, IoT, and the adaptation of BC and IoT.

Technologies introduced in this Chapter were collected to provide an introduction to the existing ecosystem. However, a selected set of these technologies lays the foundation of this thesis, which the reader should keep in mind, namely (a) the Bazo BC, (b) LoRa and LTE-M IoT technologies, (c) BIoT use cases and incentives, especially smart city, SCT, healthcare, and IAM, (d) PUF and Ethereum ERC for identity management, (e) the Ethereum BC and especially smart contracts, (f) sharding for higher scalability of BCs, and (g) GDPR compliance for higher user privacy.

The remainder of this thesis covers its contribution in tackling such efficiency issues via a practical approach.


3 Design and Implementation of BC-IoT Integrated Applications

The emerging BC and IoT domains lead to newly integrated BIoT use cases. Especially the BIoT potentials listed above enable manifold use case areas, of which the four most important ones are discussed here. Furthermore, this thesis explores the challenges of BC-IoT integrated use cases via a practical approach by designing and implementing the four BIoT applications explained in this chapter. The categorization of challenges results in the BIoT architecture design proposed later in Chapter 4.

3.1 KYoT: A Blockchain and IoT based Device Identification System

This thesis, based on [198], applies a PUF to Arduino SRAM and proposes the design and implementation of a Know Your Device (KYD) platform — named KYoT (Know Your IoT) — for the identification of Arduino-based IoT devices using an SRAM PUF. The design, implementation, and evaluation of KYoT presented here are all based on and taken from [198, 204, 149].


3.1.1 Design

KYoT’s design is multidimensional. It maps devices to users; hence, it includes a Know-Your-Customer (KYC) component, a KYD component, a PUF algorithm, and SCs, all integrated into one package. The following presents these dimensions by elaborating on the design details of each.

3.1.1.1 Device Registration and Verification

The process of verifying a device on the KYoT server comprises two stages. The KYD platform registers the device itself first, before the user gets access to the device. Subsequently, the user can register the device on the KYD platform whenever s/he wishes.

It is assumed that the KYD platform is operated by the manufacturer of the devices. As the manufacturer has access to the devices before they are sold, it makes the most sense to trust the manufacturer with verifying the identities of the devices being registered by the device owner. The platform could also be operated by a third-party organization focused solely on device verification, in which case the devices would have to be sent to this third party before verification can occur, as physical access to the device is necessary for the registration in the database.

After manufacturing is completed, the device is connected to a computer in order to provide power and to access the serial bus of the device. The start-up values required for this PUF implementation are extracted from the device’s SRAM chip by the manufacturer. This data is subsequently forwarded to the KYD server, alongside other metadata, through a Web application. On the KYoT server, a fuzzy extractor is then used to generate a unique key for the device based on the SRAM values, as well as the helper that is necessary for the reproduction function of fuzzy extractors. At this point, a non-secret unique identifier is also generated to keep track of the device in the database. All this data is then stored in the database, at which point the device is ready to be sold. The unique identifier has to be included in the shipment of the device, as the device buyer will need it to register and verify the device on the KYD platform. Figure 3.1 illustrates the flow of device registration by a manufacturer. The next step is for the device owner to register her/his device on the KYD platform using the unique identifier s/he received with the purchase of the device. Once successfully registered, the user can proceed to verify the device. Figure 3.2 shows a sequence diagram of this process. Similarly to the registration process of the manufacturer, the user first connects the device to her/his computer, extracts the required PUF data, enters this data in the Web application, and sends it to the KYoT server. The KYoT server first retrieves the stored key and helper from its database. The helper is then used alongside the PUF data in a fuzzy extractor to reproduce the device’s key. This reproduced key is then compared to the key stored in the database by the manufacturer. If the keys match precisely, the device can be considered verified, and a corresponding message is


Figure 3.1: KYoT Design —Device Registration Process [198, 149]

returned to the Web application. If the fuzzy extractor was unable to reproduce the same key from the PUF data, the device is considered invalid, as the physical characteristics of the SRAM chip likely do not match the ones recorded by the manufacturer before the sale.
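The registration and verification flow described above can be summarized by the following TypeScript sketch. It uses a toy code-offset fuzzy extractor based on a simple repetition code; a production implementation would rely on a stronger error-correcting code, and the parameters chosen here (128 secret bits, repetition factor 16) are purely illustrative and differ from KYoT's 32-Byte input with 8 allowed noisy bits.

import { createHash, randomBytes } from "crypto";

// Toy code-offset fuzzy extractor (repetition code): each secret bit is hidden
// R times behind SRAM start-up bits; reproduction tolerates fewer than R/2
// flipped SRAM bits per block. The SRAM dump must provide SECRET_BITS * R bits.
const R = 16;
const SECRET_BITS = 128;

const bitAt = (buf: Uint8Array, i: number): number => (buf[i >> 3] >> (7 - (i & 7))) & 1;

// Manufacturer side: derive the device key and the (non-secret) helper data.
function fuzzyGen(sram: Uint8Array): { key: string; helper: Uint8Array } {
  const secret = randomBytes(SECRET_BITS / 8);
  const helper = new Uint8Array((SECRET_BITS * R) / 8);
  for (let i = 0; i < SECRET_BITS * R; i++) {
    const codeBit = bitAt(secret, Math.floor(i / R));       // repetition encoding
    const maskBit = codeBit ^ bitAt(sram, i);               // offset by the SRAM bit
    helper[i >> 3] |= maskBit << (7 - (i & 7));
  }
  return { key: createHash("sha256").update(secret).digest("hex"), helper };
}

// Owner side: reproduce the key from a fresh, possibly noisy SRAM read-out.
function fuzzyRep(sram: Uint8Array, helper: Uint8Array): string {
  const secret = new Uint8Array(SECRET_BITS / 8);
  for (let b = 0; b < SECRET_BITS; b++) {
    let ones = 0;
    for (let j = 0; j < R; j++) {
      const i = b * R + j;
      ones += bitAt(sram, i) ^ bitAt(helper, i);            // recover the noisy code bit
    }
    if (ones > R / 2) secret[b >> 3] |= 1 << (7 - (b & 7)); // majority vote per block
  }
  return createHash("sha256").update(secret).digest("hex");
}

// Verification succeeds only if the key reproduced by fuzzyRep matches the stored one exactly.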

3.1.1.2 BC Integration

The desired end goal for KYoT is to have a verified presence of an IoT device on the Ethereum BC. The idea is that this presence can then interact with other accounts and SCs on the network and that those parties can trust the device’s identity. The device should further be linked to a device owner who also has a verified presence on the BC. To achieve this, KYoT follows the ERC 734 standard combined with the ERC 735 standard for handling identities on the Ethereum BC.

There are three parties to consider in KYoT from the creation of a user to the successful verification of a device by the KYD platform, and each of them has its separate identity SC. These three are the device owner, the device to be verified, and the KYD platform. It is assumed that the KYD platform already has a deployed identity contract with the relevant keys added. The relevant keys are a management key, derived from the


Figure 3.2: KYoT Design – Device Verification Processes [198, 149]

account address that deployed the contract, as per the ERC 734 standard, and a claim key to sign claims.

Figure 3.3 shows the steps required to get all contracts deployed and set up, including getting the user verified. Firstly, the user deploys an identity SC to the Ethereum BC. The required management key is automatically added upon the creation of the contract. Next, this proxy contract requires the user to verify her/his identity before adding any devices.

KYoT implements a user registration and KYD process. In practice, it is better to dedicate an autonomous entity focused solely on the issuing of KYC claims. Thus, the user would send the data required to get verified to the KYC service provider, which then processes the verification request. This claim then gets added to the user’s deployed identity contract, at which point the user’s identity is considered verified. If the user now chooses to add devices, s/he must first add a claim key to her/his identity contract, as this will be required to sign ownership claims for her/his devices, effectively linking the devices to the device owner. Subsequently, the user deploys an identity SC for the device, signs an


Figure 3.3: The Deployment and Set‐Up of Identity Contracts for a User and One of his/her Devices [198, 149]

ownership claim, and adds it to the newly created proxy contract. At this point, KYoT has successfully added proxy instances of the user and the device to the BC, as well as verified the user’s identity.

To identify the added devices, as shown in Figure 3.4, the device must first go through the verification process. If the KYD platform deems the device to be valid, it has its proxy SC sign a KYD claim. As with other claims, the device’s identity contract can then add this claim and show that it was verified by the KYD platform. If other contracts or accounts want to make sure the identity of this device is verified, they can simply check whether its identity contract includes a signed KYD claim. The claim also provides information on the issuer and other information to follow up on the claim if necessary. Interaction with the Ethereum BC is handled by the KYD Web application. It uses the provider injected by the MetaMask extension to interact with the BC on the user’s behalf, and it has its separate identity contract for the KYD platform to sign the KYC and KYD claims.
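A relying contract or account can thus verify a device with a single read-only lookup on its identity contract. The following sketch, assuming ethers v6 and the ERC-735 draft getters getClaimIdsByTopic and getClaim, checks whether a KYD claim issued by the KYD platform's identity contract is present; the claim topic value and the function name hasValidKydClaim are hypothetical.

import { ethers } from "ethers";

// Minimal slice of the ERC-735 draft interface needed for claim look-ups.
const ERC735_ABI = [
  "function getClaimIdsByTopic(uint256 topic) view returns (bytes32[])",
  "function getClaim(bytes32 claimId) view returns (uint256 topic, uint256 scheme, address issuer, bytes signature, bytes data, string uri)",
];

const KYD_TOPIC = 1001n;   // hypothetical topic number reserved for KYD claims

async function hasValidKydClaim(
  deviceIdentityAddr: string,
  kydPlatformIdentityAddr: string,
  provider: ethers.JsonRpcProvider
): Promise<boolean> {
  const identity = new ethers.Contract(deviceIdentityAddr, ERC735_ABI, provider);
  const claimIds: string[] = await identity.getClaimIdsByTopic(KYD_TOPIC);
  for (const id of claimIds) {
    const claim = await identity.getClaim(id);
    // Accept the device only if the claim was issued by the KYD platform's identity contract.
    if (claim.issuer.toLowerCase() === kydPlatformIdentityAddr.toLowerCase()) return true;
  }
  return false;
}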


Figure 3.4: Sequence Diagram Issuing a KYD Claim [198, 149]

3.1.2 Implementation

The implementation of KYoT uses 32 Byte input strings. It has to be noted that the larger the allowed inputs are, the larger the helper becomes, thus increasing storage costs and key reproduction time, which could negatively impact the efficiency of the system. Hence, the number of allowed noisy bits is set to 8 here, as this was sufficient to reproduce keys reliably in the tests, though this might have to be adjusted to allow for different devices.

If n is the length of the input string and k is the number of matching bits, the probability of getting exactly k matching bits is

P(k) = \binom{n}{k} \, 0.5^{k} \, 0.5^{n-k} = \binom{n}{k} \, 0.5^{n}

Let q be the number of allowed noisy bits; then the probability of getting at least the required n - q matching bits for a successful key reproduction is

P(k \geq n-q) = \sum_{k=n-q}^{n} \binom{n}{k} \, 0.5^{n}


For the set parameters of n = 256 and q = 8, the probability of randomly selecting enough matching bits for a successful key reproduction is

P(k \geq 248) = \sum_{k=248}^{256} \binom{256}{k} \, 0.5^{256} \approx 3.65 \cdot 10^{-63}

This is an adequately low probability to ensure the security of the PUF design in KYoT.
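As a quick numerical sanity check of this bound, the following TypeScript snippet recomputes the sum exactly with big-integer arithmetic (the helper name binom is ours):

// Exact recomputation of P(k >= 248) for n = 256, q = 8 with big integers.
function binom(n: bigint, k: bigint): bigint {
  let num = 1n, den = 1n;
  for (let i = 0n; i < k; i++) { num *= n - i; den *= i + 1n; }
  return num / den;               // exact: the product of k consecutive integers is divisible by k!
}

const n = 256n, q = 8n;
let matching = 0n;
for (let k = n - q; k <= n; k++) matching += binom(n, k);

const p = Number(matching) / Math.pow(2, 256);
console.log(p);                   // prints approximately 3.65e-63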

3.1.3 Evaluation

KYoT explores the plausibility of utilizing SRAM-based PUFs for self-sovereign identity provision of IoT devices using smart contracts. Evaluations of this system with respect to security, privacy, and trust have helped in learning some valuable lessons, summarized as follows.

3.1.3.1 Security and Privacy

It is exceedingly hard to deal with a scenario where the device owner wishes to exploit the system. This is due to the use of an SRAM-based PUF implementation. The problems mainly arise from the device owner having physical access to the device. With the most straightforward implementation of just asking the user to use the Arduino sketch to read out the SRAM data, the device owner could simply return any values he sees fit, regardless of what the sketch emits. In a more sophisticated design where the device owner is blocked out as much as possible, through the establishment of a secure connection from the KYD platform to the device, the same problem may arise. Since it is the user who sets up this connection, it would be entirely possible to insert malware or imposter devices at any point in the communication chain. Unless a stronger PUF design is chosen, one that is based on multiple different kinds of Challenge Response Pairs (CRPs) that the malicious party cannot predict, it is impossible to ensure fully secure authentication if that party has physical access to the device. For instance, the device owner can read out the values of the entire SRAM chip, and no matter what challenge is set, as long as it is based on SRAM, the user will know the correct response even if he is verifying an imposter device. Splitting up the registration and verification process among multiple actors, as done in the design of KYoT, helps to an extent but is not sufficient to guard against a malicious device owner.

The device owners are, of course, verified by a simple KYC provider in KYoT, meaning their BC representations are linked to their real-world identities. Considering this, malicious device owners could be held legally accountable for their actions. However, laws and regulations are often still unclear when it comes to BC technology.


The KYoT design relies on a degree of trust in the device owner and is not impenetrable to tampering given physical access to the device. The same is also true for the KYD platform itself. From the standpoint of a business partner on the BC, if the platform acts maliciously, the claims issued by it are worthless. This means trust in the KYD platform is required as well.

The security of KYoT mostly depends on how secure the MetaMask log-in is. Interactions with the BC that could lose the user money always have to be approved through the extension. Retrieving all the registered devices is achieved by filtering the device list by the user’s account address; the only data stored on the KYD platform that is sensitive and not published on the BC is the PUF key and the user’s registration data. The PUF key never leaves the server environment. As for the user’s data, this part of the process is not intended for a production environment, and a dedicated KYC system should protect the access points used to retrieve a user’s data more securely. Privacy concerns were given special thought throughout this work. One of the goals was to ensure that no sensitive data gets published on the BC, since it is accessible to anyone. The KYC and KYD verification processes are therefore run securely on the server. The data added to the signatures is intentionally kept to a minimum and never includes any data that could be regarded as sensitive. This way, only the platform has access to the proof provided during the verification process of any entity.

3.1.3.2 Blockchain Integration and Trust

The BC integration of the KYD platform is remarkably seamless from a user’s perspective. The user does not need to know any of the details involved in deploying contracts, or even what an SC is, in order to use the system.

BC-based systems generally are not intended to depend on third parties to provide trust. The entire motivation for the creation of BCs was to remove the requirement of a trusted third party in payment transactions. However, removing the requirement for trust in the KYD platform would require the whole verification process to run on the BC. That way, the verification would be handled by the protocol described in an SC and would be guaranteed to be executed identically for each device through the consensus mechanism of the BC. This approach was not feasible for two reasons. Firstly, the verification process outlined in this work is far too computationally complex to be run in an SC and would result, if possible at all, in extortionately high mining fees. Secondly, in one way or another, some sensitive data would have to be sent to the SC in order to verify a device, which would then be instantly accessible to anyone. There is, therefore, no way around needing trust in the third-party KYD platform. As this is a centralized authority, the system proposed in this work is neither completely trustless nor fully decentralized.


3.2 Pollution Monitoring

The introduction, motivation, design, implementation, and evaluation of a BC-based Pollution Monitoring System (BPMS) are presented here, all based on and taken from [199, 151, 152].

Global population growth since 1950 [105] has contributed to the global warming phenomenon and the depletion of natural resources. Drinkable water sources, clean air, wildlife, and human lives are endangered by civilization and current industrial activities. Even though some international authorities, and even public and independent communities, spend considerable effort on protecting the Earth from human-triggered environmental problems, there is still a need for human-independent solutions which automate the processes of water and air quality monitoring, so that the monitoring results can be used as a driver against pollution generators.

However, the majority of common state-of-the-art pollution measuring and monitoring solutions are still operated based on direct human interactions (cf. Section 2.8.2.2). They are designed based on centralized architectures that mandate users to communicate with central entities, known as Trusted Third Parties (TTP), for storing and retrieving data. The main problems with previous solutions can be listed as implementation cost, high space requirements, lack of mobility, human-interaction dependence, centralization, high power consumption, lack of public access to collected data, and limited communication range of IoT sensors [132]. Hence, to address these problems, this work proposes a BC-based solution which automatically integrates the IoT sensors' data into the Ethereum BC.

3.2.1 Design

This work designs and implements a BPMS that incorporates LPWAN protocols and the Ethereum BC to enable automated data collection and storage for the pollution monitoring use case, even for long-range communications. The decision was taken to use LoRa as the communication protocol and Ethereum as the underlying BC. The system — termed GBPMS (Global Pollution Monitoring System) — follows a layered architecture as shown in Figure 3.5. Data traverses from the IoT sensors to a dedicated Web server through the BC or directly through the LoRa network (TTN), keeping the data secure and untampered. IEEE 802.11 b/g, i.e., Wireless Local Area Network (WLAN), can also connect the sensors to the Internet to access the BC from the sensors and the Web server layer. Once the data is retrieved from the BC, it is stored in the local Database (DB) for off-chain data processing.

IoT sensors play an important role in the energy consumption for data collection and transmission. In this work, a set of water and air pollution sensors is connected to an Internet-connected board, i.e., a Raspberry Pi (RPI). Here, the collection of sensors and the Arduino board is called a sensor "module".


Figure 3.5: BPMS Architecture Design [199]

As the size of the module increases, i.e., as more sensors get connected, the power consumption of the data transmitting unit — including a sensor module and an Arduino Uno connected to a LoRa shield — increases as well.

Figure 3.6: Prototypical Implementation of a LoRa Sensor Node Including Four Sensors Attached to it [199, 151]

If sensors are placed at long distances, providing power resources, e.g., by changing the battery frequently, and monitoring the power status of sensors is not always a viable option. With LoRa, a high communication range becomes possible, as it supports only meager data rates, resulting in an elongation of the sensors' battery life.

The Ethereum BC supports SCs, and it provides both light and full clients. One of the significant advantages of the architecture designed here is the integration of the


Ethereum Light Client (ELC) [112]. An ELC is a specific Ethereum node that only stores and synchronizes current (recent) TXs and requires less space than an Ethereum full node. With an ELC, direct transmission of data from the sensors to the application via the BC becomes possible.

3.2.2 Implementation

For collecting the data from the IoT, four sensors are connected to an Arduino Uno board as shown in Figure 3.6.

This sensor module transmits the data, which is then uploaded into the BC using LoRa gateways. As shown in Figure 3.7, the GBPMS provides three approaches for selecting the communication system and BC client, as follows:

1. IoT sensors connected to LoRa nodes, which are connected to the BC with an ELC installed on the LoRa nodes.

2. LoRa nodes connected to the BC with an ELC installed on the LoRa gateway.

3. IoT sensors connected to RPIs as BC clients with WLAN communication.

In the first approach, an ELC is installed on the LoRa node, which acts as the BC entry point for data collected by sensors. Data flow (1) in Figure 3.7 shows this first approach. In the second approach, the ELC is installed on LoRa gateways. Every time a TX is received and decoded by TTN, it is made available to the BC network using the NodeJS & Web3 API installed on the gateway. This way, the data is fully protected, and its accuracy is maintained throughout data flow (2) shown in Figure 3.7. The collected data from the sensors is transmitted to the gateway; once the gateway receives the data from TTN, it sends a TX to store the data in the BC. The SC already deployed on the BC network helps to check for violations. The stored data is then used and presented by the Web application to users for further analysis. In the third approach, an Arduino Uno is accompanied by an RPI with an inbuilt WiFi module for connecting to the Internet. The RPI is used as a platform for the ELC and Arduino serial port communication. A JavaScript file running on the RPI, with the help of NodeJS, collects the data recently received by The Things Network (TTN) [241]. TTN decrypts the encrypted data sent from the LoRa nodes. Once the ELC transacts the data, it is available on the BC, and the pollution data is accessible over the BC network and on the BPMS monitoring Web page.
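A minimal sketch of the glue code for the second and third approaches is given below in TypeScript: it subscribes to TTN uplinks over MQTT and forwards each decoded reading as a transaction through the locally running ELC. The application ID, contract address, account, ABI entry, and the SC function storeReading are placeholders; the MQTT host and topic layout follow TTN's current (v3) public integration and may differ from the deployment described here.

import mqtt from "mqtt";
import Web3 from "web3";

// All identifiers below (application ID, contract address, account, and ABI) are placeholders.
const CONTRACT_ADDRESS = "0x0000000000000000000000000000000000000000";
const SENSOR_ACCOUNT = "0x0000000000000000000000000000000000000001";
const POLLUTION_ABI = [{
  name: "storeReading", type: "function", stateMutability: "nonpayable",
  inputs: [{ name: "pm25", type: "uint256" }, { name: "ph", type: "uint256" }],
  outputs: [],
}] as const;

// The local Ethereum Light Client exposes the standard JSON-RPC interface.
const web3 = new Web3("http://127.0.0.1:8545");
const contract = new web3.eth.Contract(POLLUTION_ABI, CONTRACT_ADDRESS);

// TTN MQTT integration: credentials and topic layout may differ per deployment.
const client = mqtt.connect("mqtts://eu1.cloud.thethings.network:8883", {
  username: "gbpms-app@ttn",
  password: process.env.TTN_API_KEY,
});

client.on("connect", () => client.subscribe("v3/gbpms-app@ttn/devices/+/up"));

client.on("message", async (_topic, payload) => {
  const uplink = JSON.parse(payload.toString());
  const reading = uplink.uplink_message?.decoded_payload;   // e.g., { pm25: 12, ph: 710 }
  if (!reading) return;
  await contract.methods
    .storeReading(reading.pm25, reading.ph)                 // the deployed SC decides on violations
    .send({ from: SENSOR_ACCOUNT, gas: 200_000 });
});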

Regardless of the approach chosen above, an SC is developed to check the data received from the sensors. Thresholds of the variables in the SC are set according to pollution standards (normal outdoor value ranges) from [129, 70, 108, 186, 153].


The ranges of pollution standards used in the SC can be specified, depending on the regional standards, before SC deployment. The developed SC [150] includes functions for each measured pollution factor, with which only violated values in the received data are detected and sent to the BC.

Figure 3.7: Data Flow Using LoRa and ELC [199, 151]

The GBPMS Web application is used for extracting data from the BC for further analysis. The Web server stores all the data received from sensors, not only the violated data (the data which is out of the standard range). This implementation decision was made to provide additional trust and to increase the functionality of the application by having all the data at hand to monitor every country and every city in the world based on both pollution generation and the standard thresholds.

3.2.3 Evaluation

The evaluation of the proposed BPMS needs to address different aspects of the employed communication protocol, the BC, and the overall functionality based on security, reliability, scalability, and accessibility.

Like any other communication protocol, using LoRa and TTN in the proposed BPMS has pros and cons. Compared to WLAN, where users can keep track of sensors using real-time and continuous data transmission,


Figure 3.8: Proposed Pollution Monitoring System Front‐end [151]

LoRa is not appropriate for deep and real-time data analysis (e.g., for monitoring sensors' health, data loss, or changes in pollution amounts during a full day), since the amount of data that can be sent over the LoRa network is restricted by the air-time regulation of the TTN network to a maximum of 30 s per day [30].

The restricted payload size of 50-100 Bytes also restricts the number of sensors added to a LoRa node. The evaluation of the GBPMS setup was conducted with four sensors attached to a LoRa node; only the pollution data is collected and sent over the network with a payload size of 8 Bytes, to which 4 Bytes of LoRa headers are added, resulting in 12 Bytes in total. If many more sensors were needed, the LoRa network with its current specifications would not be an appropriate solution, as the payload size would exceed the maximum allowed payload size given by LoRa. Thus, the data would not be transmitted to the gateway.
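For illustration, an 8-Byte payload can be obtained by scaling each of the four readings to a 16-bit unsigned integer, as in the following TypeScript sketch; the scaling factors and the selection of measured quantities are assumptions, not the exact encoding used in GBPMS.

// Packs four sensor readings into the 8-Byte LoRa payload discussed above.
function encodePayload(pm25: number, ph: number, turbidity: number, tempC: number): Buffer {
  const buf = Buffer.alloc(8);
  buf.writeUInt16BE(Math.round(pm25 * 10), 0);          // 0.1 µg/m3 resolution
  buf.writeUInt16BE(Math.round(ph * 100), 2);           // 0.01 pH resolution
  buf.writeUInt16BE(Math.round(turbidity * 10), 4);     // 0.1 NTU resolution
  buf.writeUInt16BE(Math.round((tempC + 40) * 100), 6); // offset keeps the value unsigned
  return buf;
}

// Example: encodePayload(12.3, 7.05, 1.8, 21.4) yields exactly 8 Bytes, to which
// LoRa adds its 4-Byte header for the 12-Byte frame described above.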

Regarding the overall coverage of LoRa and TTN, it should be mentioned that the gateway installed for this solution is an indoor gateway, and it needed to be placed near a window for better connectivity to the outside world. The outdoor coverage of the setup was better than the indoor coverage, even without many obstacles between the LoRa node and the gateway in indoor locations. Communication between the LoRa node and the gateway was fast when they were placed directly in front of each other. Thus, it can be deduced that the node should be placed in an open area, and the path to the gateway should not have many obstacles in between.

The second approach requires fewer packets to be sent between LoRa nodes and the gateway due to smaller TXs, as they are not signed by a BC client on the LoRa node side. All approaches ensure the origin and integrity of the data made available in the public domain.


Scalability of the proposed BPMS can be divided into three parts: back-end (i.e., BC scalability), front-end (i.e., Web server scalability), and sensors (e.g., number of sensors). The scalability of the BC depends on the Ethereum network. Ethereum, with a TX validation time of at most 10 seconds, addresses the time-related requirements of this use case. The number of sensors can easily be increased up to the point that the LoRa air time offered by TTN allows, depending on the monitored area requirements.

The front-end of GBPMS is based on the Internet and has limits on the number of concurrent users; however, using load-balanced cloud-based architectures, it can be accessed from any place in the world with high scalability in handling many concurrent users. Regarding the accessibility of the BC, users can access the data stored on the BC from every location in the world, unless a country has restrictive regulations against BC systems.

Regarding the power consumption of the proposed BPMS, the whole sensor node setup uses four sensors, powered by ≈ 18 V of battery (9 V batteries in series) to start up and transmit the data over the LoRa network. The total power consumption of the sensor nodes needs to be divided into the power consumed for transmitting data and for sensing (gathering) data. Regarding data transmission, LoRaWAN enables the communication to be power efficient, which narrows down the significant power-consuming parts of the BPMS to the sensors and the computing board (Arduino Uno); the major part of the power consumption directly relates to the number of sensors and their power consumption.

3.2.4 Conclusions

In this work, a power-efficient, long-range communication enabled, automated, and decentralized IoT- and BC-based pollution monitoring system is introduced, which, on the one hand, leverages the unique nature of BCs by providing tamper-resistant, decentralized, and trustable distributed systems and, on the other hand, employs the LoRaWAN communication protocol to provide long-range and low-power communication. LoRa enables communication of IoT sensors far from gateways. This work installed an ELC on the LoRa gateway, which is a great advantage in combining IoT-based applications and BC-based systems. This approach addresses the need for installing BC full nodes on the IoT sensors, which is in most cases not possible considering the small space, computation power, and power resources given to IoT sensor nodes. The proposed BPMS covers another approach in which the ELC is installed on the LoRa nodes, and the Web server receives all the data from sensor nodes and has a full node installed on it. The approaches taken in GBPMS enable users to access the IoT data automatically and send it to the Ethereum BC. This data can be used as evidence that pollution exists in the monitored area. GBPMS is capable of being used in many countries according to the coverage of the TTN network. Evaluations of various factors of GBPMS indicate the advantages of using LoRaWAN and the Ethereum BC while expressing their drawbacks as well.


3.3 NUTRIA: A Supply Chain Tracking System for the Swiss Dairy Use Case

The introduction, motivation, design, implementation, and evaluation of a supply chain tracking system for the Swiss dairy use case presented here are all based on and taken from [202, 201, 96, 182, 98].

The case studied here relates to milk and, in general, dairy SCT within Switzerland, where approximately 20,000 milk producers with 528,000 dairy cows produce 3.4 billion kilograms of milk per year. The milk produced is processed in almost 700 companies [267]. The Swiss milk market has been in a challenging phase since the abolition of milk quotas. Despite border protection, Swiss prices are heavily dependent on European and global prices, which applies to the entire dairy market [40]. Therefore, Swiss dairy producers started to look for dedicated values to be added to their products. Such values can result from internal or external factors provided during a supply chain, which is thus termed a "Value Chain".

The Swiss dairy value chain describes the cooperation between independent companies that have agreed to work together over a longer period of time and to jointly create value, i.e., to act in the sense of consumer satisfaction in such a way that they jointly develop a sequence of value-enhancing steps until the result reaches a correspondingly valuable end product [214]. Within the competitive market of Switzerland, Dairy Supply Chain (DSC) actors, i.e., all stakeholders including farmers, transporters, milk processing companies, and retailers, have focused on deficits of dairy processes to integrate and empower a value chain for the DSC.

In the same direction, the social studies performed in this work indicate the importance of transparency, sustainability, and trust in creating a value chain for dairy products.

Therefore, the use case studied in this thesis is based on recent industry practice performed in the "Foodchain" project [234], supported by the Swiss Federal Office of Agriculture∗ [119]. This project collected a set of data points (e.g., geographical location and time) to offer a traceability solution using the BC in a Swiss DSC. The basic and important knowledge extracted from site visits at the Fuchs und Co. AG "Molkerei Fuchs" [5] and from interviews performed afterwards detailed very clearly the respective data points and application-specific (i.e., dairy SCT) information.

∗This work was funded by the Swiss Federal Office of Agriculture (BLW) in 2018 and 2019 under Grant No. 627000886. The author is sincerely grateful to Th. Meier and M. Tschumi from the BLW, to D. Fuchs from Fuchs und Co. AG, Molkerei Fuchs, Rorschach, Switzerland, the farmers Ph. Hafner and I. Sager, and M. El Bay as well as F. Hess from Dezentrum.


NUTRIA encompasses the following components within the scope of the Swiss dairy value chain:

• Preparing a questionnaire and running interviews to collect users' and actors' viewpoints and expectations

• Developing an application to trace milk and subsequent dairy products from the farm to the shelf

• Tracing data points for actors, i.e., farmers, transporters, processors, and retailers

• Applying a BC as an underlying IT infrastructure and as a decentralized database

• Using QR (Quick Response) codes for traceability and information availability on the product

• Providing an interface to view products’ “history” data

Based on this overview of the state-of-the-art SCT solutions (cf. Section 2.8.2.4), it can be concluded that an industrial analysis combined with a social study is missing, such that an extraction of the requirements of supply chain actors within a value chain can be performed. Therefore, this thesis collects the elements required for empowering a value chain within the Swiss dairy use case and, based on an actor analysis, develops a BC-based SCT system. Even though state-of-the-art SCT systems do not deliver specific insights on dairy product SCT requirements, such as design and implementation details, the analysis of the mentioned state-of-the-art BIoT-based SCT systems leads to the identification and classification of the proposed SCT dApp requirements described in the next sections.

3.3.1 Requirements Analysis via Interviews

The design of an SCT system to empower the dairy value chain requires considering several aspects according to the characteristics of this supply chain. From a generic perspective, these characteristics can be listed as follows.

• Size (e.g., number of employees, number of operational and organizational branches, and distance)

• Actions (e.g., production of raw material, processes performed on the raw product, packaging, and transportation)

• Inventory (e.g., temperature and usability period)

• Producer certificates and licenses (depending on raw materials and product)

In order to reach a clear understanding of user expectations, consumer and producer surveys were conducted throughout the course of the project [234]. Transparency, sustainability, trust, and digitization constituted the main topics of these questions. They were defined as follows.


Transparency is defined as an exchange of purpose-oriented knowledge [68]. There must be a common consensus on what information should be exchanged between which actors and in what form, in order to create a situation that is perceived as transparent by all actors.

Sustainability is paired with development and is defined as the "development that meets the needs of the present without compromising the ability of future generations to meet their own needs" [191].

Trust embeds the mutual reliability of partners based on their behavior, which is expected to be in favor of the entire ecosystem. Trust shall be established by loyal acts of partners in a DSC to achieve common goals and benefits [251]. Thus, the comprehensive exchange of information through technology, which can be practiced without delay, is the prerequisite for economic and production-specific interactions.

Digitization refers to the adoption of digital technologies that can help companies to create connections between their machinery, supply systems, production facilities, final products, and customers in order to gather and share real-time market and operational information [37], thus enabling real-time access to product and production information for participating entities and improving the performance of autonomous work processes along value chains [179].

3.3.1.1 Interviews

In the course of the project [234], interviewees (n = 20) from different sectors were interviewed:

• Feed producers

• Milk producers

• Milk transport companies

• Milk trade and marketing organisations

• Milk processing undertakings

• Wholesaler / Retail trade

• Professional organisations / associations

• Miscellaneous

This survey determined which challenges the interviewees see within the Swiss milk value chain. The questionnaire addressed whether the interviewees thought that there was trust and transparency among the actors in the milk value chain and whether they thought that decentralized solutions (e.g., BCs) could strengthen trust between actors. Key questions asked covered the following four areas:


• Information about the company

– Products/Services
– Customers
– What makes your company stand out?
– What are Unique Selling Propositions (USP) that you place on your products?

• Challenges of the milk value chain and transparency

– What do you consider to be the most important challenges in the milk value chain?

– What do you personally refer to by the term "transparency"?
– Is transparency important to you in relation to milk and dairy products?

• Trust and digitization

– Does trust exist between players in the dairy market?
– Do you think that decentralized IT systems/BCs could increase trust?

• Sustainability

– How do you define sustainability (in relation to the milk value chain)?
– What do you think is the effect on the milk price?
– Who should pay a possible sustainability surcharge?

3.3.1.2 Consumer Survey

The online consumer survey [234] collected information about aspects of dairy products which are important to consumers in terms of transparency, trust, sustainability, digitization, and regionalism. While the consumer survey was shared via social media, the primary channel was the Facebook account of the Fuchs dairy [11]. Secondary channels included the LinkedIn channels of Foodways [123] and Dezentrum [90] as well as private messages to community contacts. Almost 200 people participated in the survey. The survey was run March 5-12, 2019 and was linked to a competition. 60% of the participants (n = 116) had taken part in the competition, with three winners. The average time to answer all questions was 12 min.



Figure 3.9: Server‐less Design of NUTRIA [202, 96]

3.3.2 Design and Implementation

With the goal of adhering to the user requirements collected from the social study, and after analyzing the state-of-the-art solutions in this field, the dairy SCT platform termed "NUTRIA" was proposed to achieve the expected benefits via BC-based SCT systems. As shown in Figure 3.9, NUTRIA connects different actors in the DSC and is designed as a Decentralized Application (dApp) developed on Ethereum. Every step of the dairy production is recorded within an SC, which acts as autonomous storage for data sent from a client's application, i.e., NUTRIA's Web-based front end.

To enable product tracing (cf. Figure 3.10), NUTRIA encompasses two parallel chains, i.e., the DSC establishing a production flow of dairy products and the data chain. The latter is maintained within the BC by storing traceable production-related data of each step within a new SC.

The more transparent and comprehensive the data chain is, the more value is added to the value chain. Hence, it was vital to enable automated data collection features. In this regard, the product- and producer-related data are added to the Ethereum BC via the producers' smartphones at each step and preserved within a new SC. The data stored immutably within an SC at each step includes (a) the producer identity, (b) the producer Website URL, (c) producer certificate(s), (d) product license(s), (e) the producer's Ethereum account, (f) the list of actions, and (g) the previous SC addresses of that particular product.
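As a structural illustration only, the per-step record listed above can be summarized by the following TypeScript interface; the field names are illustrative and do not reflect the actual Solidity storage layout published in [95].

// Shape of the data preserved in each per-step Smart Contract, as listed above.
interface ProductionStepRecord {
  producerIdentity: string;        // (a) producer identity
  producerWebsiteUrl: string;      // (b) producer Website URL
  producerCertificates: string[];  // (c) producer certificate(s)
  productLicenses: string[];       // (d) product license(s)
  producerAccount: string;         // (e) Ethereum address of the producer
  actions: string[];               // (f) actions performed in this step
  previousContracts: string[];     // (g) SC addresses of the product's earlier steps
}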

NUTRIA's user interactions include (a) actor registration, (b) QR code generation or (c) scanning, and (d) tracing products (cf. Figure 3.11). Producers generate a QR code while performing a new action on an existing product or when creating a new product. This two-dimensional code, i.e., a machine-readable optical label,


contains information about the product to which it is attached [257]. When a QR code is generated, it can be printed and attached to products. By generating a new SC and QR code per product at each step, the physical flow of products (the DSC) and the digitized product information (the data chain) are mapped to each other.

Each actor can equally access and add inputs to the DSC data stored in the data chain through the Ethereum BC (cf. Figure 3.11). The collection of these inputs adds valuable information to the value chain, such that end users can trace the actions on one product throughout the full DSC.

Figure 3.10 depicts this mapping over time: Actor 1 produces raw milk and generates QR code 1; Actor 2 processes the milk, scans QR code 1, and generates QR code 2 on top of it; subsequent actors (transporting the dairy product, inventory, and selling to the end consumer) each scan the previous QR code and generate a new one chained to all previous QR codes; consumers finally scan the last QR code.

Figure 3.10: Actors Generating, Scanning, and Adding Data to the BC – Mapping the Supply and Data Chains [202]

NUTRIA's Web client, which can be used by both producers and consumers, provides all required functionality for interacting with the BC-based back-end (cf. Figure 3.11). This type of system design simplifies deployment, reduces deployment costs, and brings transparency into the platform. Moreover, the Web client is developed with the React framework to provide an identical user experience for iOS and Android smartphone users, thus being fully platform-independent.

Users only need to run a MetaMask [10] BC wallet on their device to link NUTRIA to their Ethereum account. End users, i.e., consumers, use NUTRIA without any registration.


Opening the dApp in a smartphone's Web browser, e.g., Google Chrome or Firefox, and scanning the QR code attached to a product navigates NUTRIA to the history of that particular product in the Map View. This view allows a click on each step to access time-stamped and actor information.

Combining the tracing of "mixed" products throughout the DSC constitutes a challenge for the dairy use case, especially for milk being received from different farmers or on different days. NUTRIA allows for the tracing of multiple product combinations, since in each step an SC can be generated based on multiple products, i.e., multiple "previous" SCs being combined. Thus, NUTRIA is not limited to "single predecessor" use cases. The source code of NUTRIA, including the SC and the dApp, is accessible at [95].
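The mapping of the data chain onto chained SC addresses implies that tracing a product is a backwards walk over these addresses. The following TypeScript sketch, assuming ethers v6 and hypothetical getter names (getActions, getPreviousContracts), illustrates such a walk, including the multi-predecessor case for mixed products.

import { ethers } from "ethers";

// Hypothetical getters; the published NUTRIA contract ABI in [95] may differ.
const STEP_ABI = [
  "function getActions() view returns (string[])",
  "function getPreviousContracts() view returns (address[])",
];

// Walks the data chain backwards from the SC address encoded in a scanned QR code,
// covering "mixed" products with several predecessor contracts.
async function traceProduct(
  scAddress: string,
  provider: ethers.JsonRpcProvider,
  visited: Set<string> = new Set()
): Promise<string[]> {
  if (visited.has(scAddress)) return [];               // a predecessor may be shared
  visited.add(scAddress);

  const step = new ethers.Contract(scAddress, STEP_ABI, provider);
  const actions: string[] = await step.getActions();
  const previous: string[] = await step.getPreviousContracts();

  const history: string[] = [];
  for (const prev of previous) {
    history.push(...(await traceProduct(prev, provider, visited)));
  }
  history.push(`${scAddress}: ${actions.join(", ")}`);  // oldest steps end up first
  return history;
}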

Figure 3.11: Swiss Dairy Supply Chain (on the Left), Corresponding Producer Views of the NUTRIA dApp (Center), and the Data Chain (on the Right) [202]


3.3.3 Evaluation

The evaluation of this work is twofold: the Swiss consumer survey results and NUTRIA's technical and social evaluation.

3.3.4 Social Study Results

Based on the results of the interviews and the consumer survey [234], product quality, origin, animal welfare, and sustainability were particularly important for the dairy value chain. Key statistical results of these interviews are presented, with the number of participants answering a question indicated by n, followed by the percentage of the participants who shared the same view.

Transparency: The interviewees' perception of this term was the disclosure of information on processes to all parties interested and involved in the production (the manufacturing of the product). Furthermore, the verifiability and control of information regarding its truthfulness needs to be provided. Transparency should also deliver the disclosure of costs and prices paid per stage of the DSC in the context of the final product price. n = 11, 70% of the interviewees believed that transparency is an important feature in a DSC. However, apart from the milk price and the price of the final product, the remaining cost components are not disclosed to partners and consumers in the DSC. Thus, the interview partners stated that the composition of milk prices and dairy products is not consistently transparent today.

Trust: Almost half of the interviewees believe that, except for a few specific cases, trust does not exist in the DSC. They justified their statement by the fact that trust may be created with the disclosure of added-value information and services or pricing details. Such disclosures in DSCs do not, or only partially, exist yet. Some interviewees mentioned that trust can only be built on a partnership-based cooperation between DSC actors. Trust-building elements rooted in shared data and in people's trust in digitally generated data (e.g., data collected using IoT nodes and evaluated by an algorithm such as within SCs) are higher in contrast to manually recorded or generated data.

Decentralization: The surveyees were asked whether they think that decentralized technologies like BCs have the potential to increase trust among actors in the DSC. The results indicate that a majority of respondents can imagine that decentralized technologies could increase trust (n = 12, 70.59%); a few surveyees were unsure about the trust-building potential of digitization. They claimed that trust could not be strengthened that way, since a human must form the center. From their point of view, the trust-building elements are players in a DSC who are open and honest with each other and who jointly manufacture products from their raw materials and offer them to the market.


Sustainability: Survey participants read the term "sustainability" concerning dairy products both as "animal welfare" (n = 159, 89.31%) and as "environmentally friendly production of the raw milk and animal feeding" (n = 161, 83.23%).

The Region of production is one of the most important aspects for survey participants concerning dairy products (n = 176, 74.43%). When asked about the information that survey participants would like to see declared on the product in addition to the information required by law, both the "origin of the raw material", i.e., milk (n = 150, 90.67%), and the "origin of dairy products" (n = 151, 85.43%) were emphasized.

Actor Name: Participants were asked in two separate questions whether it was important for them to know (a) from which farmer the milk they consume comes and (b) from which region. This information, however, was not too interesting for surveyees (n = 150, 51.33%). This means that the region has a higher importance than the farm/er's name.

Labeling Products for Tracing: n = 148, 50% of the surveyees showed a willingness to access product information via QR codes. Just under 40% wanted to trust the product or brand without an additional declaration of content or labels.

Product Quality: In response to the trend toward a high-performance or high-quality strategy to be applied in the DSC, n = 8, 40% of the respondents were in favor of a quality strategy. n = 4, 20% believe that a high-performance strategy has to be pursued by farmers. However, with low milk prices, a farmer can only secure his living through the quantity of milk produced and cannot afford an individual production.

Challenges: The surveyees named the following challenges with which they see the Swiss DSC being confronted:

Socio-economic Challenges

• Ensuring, in the long term, that farmers can make a living from milk as a raw material, which is linked to the attractiveness of the profession of a milk producer

• Ensuring a fair cooperation between all actors in individual DSCs

• Non-homogeneous distribution of larger and smaller players along the Swiss DSC

Market Economy Challenges

• Finding a long-term strategy for the competitiveness of the dairy market

• Falling demand for dairy products

• Added value with Swiss raw dairy products in the long term

• Increasing sensitivity of consumers to environmental challenges in connection with animal production


• Functioning control system that is immune to abuse, e.g., food fraud

Ecological Challenges

• Negative effects of dairy production methods on the environment

• Shelf life of dairy products and their waste

Nutritional Challenges

• Alternative products, which do not have one-to-one comparable properties

3.3.5 Technical Evaluation

The set of major observations and interviews indicates that the design and implementation of NUTRIA achieve key user requirements within the functional, performance, operational, and application-specific dimensions. NUTRIA enables trust, transparency, sustainability, and digitization in the Swiss DSC.

Trust: NUTRIA provides those features to actors who create trust and strengthen the value chain. Thus, actor-related information is gathered with full trust in these actors. However, concerning the geo-location, date, and time of actions, trust is guaranteed throughout the DSC, since this data is stored within the BC automatically and without human interaction.

Security and Privacy: Data integrity is provided by cryptographic signatures used when generating a QR code, which ensures the origin of the data stored in the BC. User-specific information, such as cryptographic keys, and the privacy of passwords are preserved by NUTRIA, since it does not store any sensitive information in any database. Thus, this information never leaves the user's device, here especially a smartphone. Security risks caused by using QR codes are eliminated by design, since the client application will not react to altered or counterfeit QR codes. Hence, NUTRIA users are protected in the case of a physical QR code being replaced by a malicious intermediary.
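The integrity check sketched below illustrates this mechanism in TypeScript, assuming ethers v6: the producer signs the QR payload (here reduced to the step's SC address) with the key held by MetaMask, and the client recovers the signer address before rendering any history. The payload layout and function names are illustrative.

import { ethers } from "ethers";

// Producer side: sign the QR payload (the step's SC address) with the key in MetaMask.
async function makeQrPayload(scAddress: string, signer: ethers.Signer): Promise<string> {
  const signature = await signer.signMessage(scAddress);
  return JSON.stringify({ scAddress, signature });
}

// Client side: recover the signer before rendering the product history; altered or
// counterfeit QR codes fail this check because the signature no longer matches.
function verifyQrPayload(payload: string, expectedProducer: string): boolean {
  const { scAddress, signature } = JSON.parse(payload);
  const recovered = ethers.verifyMessage(scAddress, signature);    // EIP-191 recovery
  return recovered.toLowerCase() === expectedProducer.toLowerCase();
}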

Scalability of a dApp can be measured by several metrics, such as (a) the transaction rate, which is affected mainly by the scalability of the employed BC, (b) the size of the data stored at the dApp client, and (c) the size of the data stored on the BC. As of June 2020, the average transaction rate in the Ethereum BC is about 20 TPS, which dictates the maximum rate for NUTRIA. The storage capacity demanded of a NUTRIA client is proportional to the size of the data to be stored on the actor's smartphone. Every time a QR code is generated, it is stored on the respective device. Since only the SC address is preserved in one QR code (with a size of 20 Byte), no storage concerns with the user equipment remain.

Deployment Costs of issuing transactions into the BC may vary. As of June 2020, NUTRIA's transaction costs for the Ethereum BC average at 0.0012 ETH (Ether)


(≈ 0.25 USD) per transaction. However, this amount is based on the ETH/USD exchange rate. In case of a deployment on private DLs, these costs are negligible and data storage is free, since reasonable storage limitations are typically hidden in the IT infrastructure. Server deployment costs for NUTRIA's back-end include (a) running and maintaining a server that hosts the back-end and (b) hosting a domain name and a public address for the Web page's operation. However, to minimize these costs even further, NUTRIA follows a fully server-less architecture.

Ease-of-use is provided by NUTRIA by enabling users to:

• employ platform-independent applications, where even without installing the application, users can refer to a Web site for tracing the products' history;

• use the hardware that users are already familiar with, i.e., smartphones;

• set up a QR code generation process, while requiring the least number of interactions between all actors necessary and involved;

• add in a flexible manner user-specific actions, licenses, or certificates;

• enable various end user-related functions for printing QR codes; and

• use further options easily, such as printing, sharing a QR code, sending it via email,or saving it to a file.

3.3.6 Operational, Social, and Technical Risks

Risks determine the likelihood that an assumed threat may be successful under a set of given constraints. Thus, it is important to understand the set of potential risks for a distributed system such as NUTRIA. Therefore, the NUTRIA dApp and its users were analyzed as follows:

• The operational risk of the loss of users' private keys: Due to the security statements made above, no mechanism was designed explicitly for storing that key within NUTRIA. The responsibility of maintaining Secure Keys (SK) fully relies on all participating users. Obviously, losing an SK can lead to losing access to that user's data.

• The social risk of fraudulent actors: Loosely coupled actions in the processes of distributed systems may be perceived as a lack of obligations or incentives for actors to participate. If intermediate actors do not see the need for adding relevant data into the value chain, they can cause a break of a product's full data chain, e.g., if a new QR code is not generated based on the previous ones, an operational problem arises, since the producers' details will not become available. However, such a break of the value chain still provides


sufficient information for identifying the producer of the last correctly generated QR code, which creates a strong incentive for all actors to behave in a value chain-compliant and correct manner and to offer users a tracing of the full product history.

• The set of technical risks: These include many different ones, here especially the break of a connection and, thus, of access to the BC in use. Depending on the specific length of an outage, the resulting delay of the dApp may cause ambiguity for users who are not used to delays in communications and to BC-based applications, since these cannot act in real-time mode. Depending on the BC being deployed, delays of 30 to 120 s before a persisted state becomes visible to all actors have to be adhered to. Finally, purely technical outages of communication systems and IT failures may still occur, as for all other IT in operation. However, since NUTRIA follows a distributed systems approach, only very narrow and small data sets may be affected in such a situation.

• The operational and technical risks attached to a public BC being used instead of a private DL are very closely related to the data privacy demands of the information stored within a BC. Since no data, once persisted, can be deleted — and it becomes public without further mechanisms at hand — all data persisted is in clear text, e.g., for all SCs, and will always be accessible publicly. This may come as a disadvantage if a non-trusted participant joins the active group of contributing actors, since this participant can then trace the activities of all others. In contrast, a private DL can handle data privacy via different access policies to the DL, which can differentiate between data access for participants only and data available to the public. This comes at the cost of different assumptions on the underlying trust model, as outlined in [64].

Overall, the interviews with producers and users of NUTRIA indicated key values such a dApp adds to the Swiss DSC. However, the specific technical differences between centralized and decentralized systems, private DLs and public BCs, and the immutability of BCs are still generally unclear to non-technical users. While this is not surprising per se, the promotion of NUTRIA's dApp — or any BC-related dApp in general — has to abstract away from technology and focus clearly on the usage, operations, and reputation-based advantages a BC-based SCT can reach, such that risks for all actors, users involved, and possibly a regulator in the food domain for a given country can be minimized or superimposed by transparency, immutability of data generation, and value generation as benefits for the Swiss dairy value chain.


3.4 ITrade: An IoT Data Streaming System & Marketplace

The introduction, motivation, design, implementation, and evaluation of an IoT data streaming system presented here are all based on and taken from [203, 97].

With the global Internet connection volume expected to exceed ∼463 TByte of data per day by 2025 [20], a considerable part of it flows via Internet of Things (IoT) devices. The number of IoT devices is projected to increase to 43 billion by 2023 [84]. More than half of the world already participates online in the global digital economy [21]. Moreover, the current Covid-19 pandemic requires even more people to "go online" [23]. Thus, the world is being reshaped by data connectivity, resulting in the creation of new virtual and digital economies with the potential to generate almost 11.1 trillion USD per year by 2025 [169]. Now that the world's most valuable resource is data [239], new business opportunities arise from data monetization, i.e., the process of generating economic benefits from data by trading data.

Up to now, high-value personal data has usually been handled by centralized databases and servers belonging to large institutions, organizations, or companies. Such centralization increases maintenance costs and the risk of cyberattacks. The most famous example of such an incident was the 2018 Facebook data breach, which exposed the accounts of 50 million users [176]. Half-year 2019 statistics indicate that 4 billion records were exposed illegitimately [131]. Since the current number of cyberattacks shows no sign of decreasing [125], the provision of a higher security level has become a central prerequisite of online platforms. Consequently, before society can even attempt to start monetizing data, it has to look for and employ systems providing the highest level of data protection and security for newly designed data processing systems.

To ensure the highest possible control over the handling of personal data and digital assets, data traders need to be able to rely on data sovereignty [81]. Data sovereignty becomes particularly relevant with respect to where data is processed and/or stored [4], since cloud storage services have become increasingly popular as data persistence platforms for many businesses [124].

In order to make IoT data more accessible, IoT data marketplaces have been proposed in recent years. These marketplaces aim at providing technical solutions and business incentives for treating data as a tradeable virtual asset. However, our analysis shows that these marketplaces do not meet user and use case demands with respect to privacy, scalability, and data sovereignty. Based on the study of these concerns, this work presents ITrade, a secure and scalable IoT data marketplace based on BCs. ITrade implements the decentralized management of data trading via Smart Contracts (SC). As a data streaming platform, it brings together individuals, companies, and organizations from the private and public sectors who are interested in IoT-collected data, for instance (a) within studies performed for health-related use cases, which are most needed in the case of the COVID-19 pandemic, and (b) individuals who are interested in selling the data collected by their devices in general. IoT owners can initiate data streaming via ITrade and benefit from its highly scalable architecture.

Our assessment of the data streaming related work (cf. Section 2.8.2.3) confirms the validity of these potentially valuable platforms. However, most of these platforms unfortunately do not disclose enough implementation or evaluation details. Table 3.1 summarizes and compares the analyzed solutions; unavailable information is marked with N/A in this table.

Table 3.1: Comparison of Data Trading Platforms [203]

                     Datapace            IDMoB      Sash                SDPP
Blockchain           Hyperledger Fabric  Ethereum   Hyperledger Fabric  Ethereum
Data Encryption      N/A                 Yes        Yes                 Yes
Access Control List  Yes                 Yes        Yes                 Yes
Realtime Streaming   Yes                 No         Yes                 N/A
GDPR Compliant       N/A                 No         N/A                 N/A
Storage System       Standard            Swarm      Standard            IPFS

3.4.1 Requirements Analysis of Data Streaming Systems

The field study in this work revealed a clear gap between the state of the art and user expectations or use case demands. The main deficits of related work are presented as follows.

User-friendliness: Most of the solutions proposed require manual steps to start trading data, making their use very labor-intensive. In addition, the solutions developed lack clear explanations of implementation details.

Performance and Cost: Most of the related work studied commits each transaction on a BC, meaning that there is a hard cap on the number of transactions stored per second. Therefore, traders incur high storage and transaction costs.

Deployment: Related work provides no or very little information about the steps necessary to deploy the respective platform. These solutions do not follow a cloud-native approach [18]; they have to be deployed manually, which makes it challenging to reproduce steps for testing, staging, quality analysis, and production.

Scalability: Storing large amounts of data via transactions on a BC, as the available systems do, inhibits scalability. In addition, the underlying data storage systems used by the solutions analyzed tend to lose performance and do not scale well when the amount of data being handled exceeds 1 TB. Consequently, they either become too expensive to run or can hardly handle that amount of data and become too slow, thus impacting user-friendliness as well.

Data Sovereignty: Three aspects have to be considered to provide data sovereignty. (a) Storing the data in the same geographical location (geolocation) where the data is being generated; in case of employing cloud services, the cloud servers have to stay in the same location, and the Service Level Agreement (SLA) of the cloud provider has to guarantee that the data does not leave the data center. (b) Serving the traffic based on the geolocation of the users. (c) Preventing users from accessing the data marketplace from restricted locations. In this regard, there is not enough information about how related solutions provide data sovereignty.

3.4.2 Design

To address the deficiencies of existing data marketplaces, ITrade is designed with dynamic elements and scalable modules which enable sophisticated processes for secure device and user registration, data streaming, and end-to-end transactions. While Figure 3.12 illustrates the overall view of the ecosystem, the following introduces the entities and processes designed in ITrade.

Figure 3.12: General Overview of Data Trading Ecosystem with ITrade [203]


3.4.2.1 Entities in ITrade

ITrade contains four main entities, where each entity is represented by an SC deployed on the Ethereum BC. Thus, these entities are uniquely identified by their SC addresses.

The Data Stream Principal (DSP) represents a generic entity within the system that can act both as a Data Stream Seller (DSS) and a Data Stream Buyer (DSB). Each DSP has an Owner Address, i.e., its Ethereum account address, a Name, a URL, and an RSA public key. The RSA key is used for exchanging the symmetric key employed for encryption/decryption of a sensor's data.

A Data Stream Seller (DSS) registers 0–n sensors to the marketplace and sells their data. In addition to the DSP attributes, a DSS has a list of sensors registered by him/her.

A Data Stream Buyer (DSB) subscribes to data streams offered by DSSes. DSBs are individuals or organizations interested in a particular piece of data generated by DSSes.

3.4.2.2 Sensors

Sensors are owned by DSSes. A sensor collects raw data while being attached to a device with an active Internet connection (i.e., an IoT device) to transmit the data into ITrade. As a novel approach and in contrast to the available data marketplaces, in ITrade the data being pushed from IoT devices has to be encrypted on the DSS side beforehand. This prevents the ITrade administrator and any middleman from being able to read or misuse such data. Each sensor has the following attributes.

• The Type defines the data type being collected by a sensor, such as temperature, humidity, air or water pollution index, etc.

• A Status defines for every sensor, upon its registration, the state it is in. The DSS has to activate a sensor before being able to send data from it. The sensor can also be deactivated or blocked by its owner.

• The Geolocation is represented by the longitude and latitude of a sensor.

• The Price per Data Entry defines the price that a DSB has to pay when subscribing to the sensor's data stream. Buyers have to enter the number of entries they want to buy.

• An AES private Key is generated by the DSS for each of his/her sensors. These keys have to be securely stored locally by the DSS.


3.4.2.3 DataMarketplace (DM)

The DM is the central entity of the ITrade design. It embeds multiple micro-services enabling the ingestion, streaming, and persistence of data. A DM has the following attributes:

• The DM Owner Address is the address of the ITrade owner and administrator.

• A DSP Registration Price defines the price each DSP has to pay when registering with the marketplace.

• The Sensor Registration Price defines a flat fee for registering sensors.

• The list of Registered DSPs names all DSPs registered.

• The DSP Commission Rate for each DSP defines a certain percentage of its price, which is transferred to the DM's Ethereum account as a streaming fee.

3.4.2.4 Datastream Subscription (DSSub)

A DSSub represents a subscription to a data stream. Each time a DSB subscribes to a data stream, a new DSSub SC is created. A DSSub operates on the following attributes.

• A DSB ID represents the identifier (ID) of the DSB.

• A Sensor ID defines the ID of the sensor to which the DSB has subscribed.

• The Start Timestamp defines the time from which the DSB has been subscribed to a sensor's data stream.

• The Number of Data Entries determines how many data stream entries the DSB subscribed to.

Figure 3.13 represents the relation between these entities.

3.4.3 User Interactions and Processes in ITrade

Different users interact with ITrade for device registration, stream subscription or purchases, data streaming, and identification purposes.


Figure 3.13: Entity-Relationship Model in ITrade Design [203, 97]

3.4.3.1 Data Stream Principal (DSP) Registration

The first phase of interacting with ITrade is the DSP registration phase, where a DSP registers with the MP by providing basic information about him/her-self together with a public RSA key. The RSA key-pair is automatically generated by ITrade's Web client. The DSP deploys an SC on the BC; this deployed SC represents a DSP instance. The DSP also has to pay the registration fee predefined by ITrade's admin. Once the DSP has been successfully registered, he/she can act both as a DSS and a DSB. Upon a successful registration, the DSP has to securely store the newly generated RSA private key, which will be used for the data streaming process in later steps.

3.4.3.2 Sensor Registration

For a sensor registration, the DSS client generates an AES key used for encrypting the sensor's data. This key has to be securely stored by the DSS. The DSS initiates the sensor registration process by registering the sensor on the BC. This step incurs the sensor registration fee, which is automatically transferred to the ITrade admin's Ethereum account. The registration transaction ID is then returned to the DSS. The next step is to activate the sensor on ITrade. The DSS does so by initiating a call that includes the transaction ID obtained in the previous step. ITrade validates the sensor by validating the transaction against the BC. Next, the result is returned to the Data MP and ITrade generates a sensor token that will be used for DSS authentication when publishing that sensor's data streams. In case the transaction is valid, ITrade will configure the sensor so that the data can be ingested to and streamed from the platform. Finally, the sensor token is returned to the DSS.

3.4.3.3 Publishing Sensor Data

Once a sensor is successfully registered and activated by the DSS, it can start publishing data to the marketplace via the Kafka Service API. The DSS decides how frequently the data collection shall happen and which type of data shall be collected, as he/she has control over his/her IoT sensors. A sensor publishes data to ITrade, where each iteration involves the steps of Figure 3.14; a minimal device-side sketch follows the list below.

1. The sensor captures data from its physical environment.

2. Raw data is encrypted with the sensor’s AES key.

3. The device calls ITrade via an API whose payload corresponds to the encrypted sensor data and a sensor token that indicates the sensor's validity.

3.1. ITrade receives the call and validates the sensor token to detect possible malicious actors.

3.1.1. If the token is valid, ITrade will store the data and make it available for streaming.

3.1.2. In turn, a success message will be returned to the sensor device.

3.2. If the sensor token is invalid, an error message will be returned.
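To make this loop concrete, the following is a minimal device-side sketch in Python of steps 1 to 3, assuming the cryptography and requests packages; the AES-GCM mode, the ingestion endpoint path, and the token header are illustrative assumptions and not prescribed by the ITrade implementation [94].

import os, json, requests
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

SENSOR_AES_KEY = os.urandom(32)   # in ITrade, this key is generated and kept locally by the DSS
SENSOR_TOKEN = "<token returned at sensor activation>"

def publish_reading(value: float) -> bool:
    # Steps 1/2: capture the reading and encrypt it with the sensor's AES key.
    aesgcm = AESGCM(SENSOR_AES_KEY)
    nonce = os.urandom(12)
    ciphertext = aesgcm.encrypt(nonce, json.dumps({"value": value}).encode(), None)
    # Step 3: call ITrade with the encrypted payload and the sensor token (hypothetical endpoint).
    resp = requests.post(
        "https://itrade.example.org/api/sensors/publish",
        json={"nonce": nonce.hex(), "data": ciphertext.hex()},
        headers={"Authorization": f"Bearer {SENSOR_TOKEN}"},
    )
    return resp.status_code == 200   # success message vs. error for an invalid token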

3.4.3.4 Subscribing to a Sensor’s Data Stream

DSBs can search for available data streams at ITrade. If DSBs want to select a certain data stream, they have to subscribe to it, which is possible only by paying for each data entry. Consequently, the subscription amount depends on the number of data entries the DSB wants to buy. Payments at purchase are managed and enforced by Smart Contracts. Once a DSB subscribes to a data stream, he/she can obtain a key from ITrade that enables him/her to decrypt the data stream messages. The key exchange process between DSS and DSB is as follows (cf. Figure 3.16):

1. Once a DSB subscribes to a data stream, it informs ITrade about that by sendingthe transaction ID and the corresponding sensor’s public key.


Figure 3.14: Publishing Sensor Data in ITrade [203, 97]

2. ITrade verifies that transaction in the BC and requests the DSS to encrypt the data stream decryption key with the DSB's public key. The encrypted data stream key is sent to the DM.

3. The DSB can now fetch the encrypted key, decrypt it with its private key, and start streaming the data. Thus, the streamed data is decrypted on the DSB side.

While Figure 3.15 visualises the process described, this introduces an additional complexity, but it prevents ITrade from manipulating the data it stores.
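As an illustration of this key exchange, the sketch below wraps the sensor's AES key with the DSB's RSA public key on the DSS side and unwraps it on the DSB side, using the Python cryptography package; the OAEP padding choice and the function names are assumptions and not ITrade's documented implementation.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# DSB key pair; in ITrade the RSA public key is part of the DSP registration.
dsb_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
dsb_public = dsb_private.public_key()
OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()), algorithm=hashes.SHA256(), label=None)

def wrap_stream_key(aes_key: bytes) -> bytes:
    # DSS side: encrypt the data stream decryption key for one specific DSB.
    return dsb_public.encrypt(aes_key, OAEP)

def unwrap_stream_key(wrapped: bytes) -> bytes:
    # DSB side: recover the AES key and start decrypting the streamed messages.
    return dsb_private.decrypt(wrapped, OAEP)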

3.4.3.5 Streaming Data

A DSB can start streaming the data from the chosen sensor as shown in Figure 3.16:

1. A DSB makes a request against the Data MP API in order to obtain the data stream's decryption key. The request contains the data stream subscription purchase transaction ID.

1.1. The Data MP validates the transaction with the Smart Contract.

1.2. The result is returned from the BC.


Figure 3.15: Subscribing to a Data Stream Process Flow in ITrade [203, 97]

1.3. If the transaction is valid, the Data MP will request the decryption key from the DSS.

1.4. The DSS will encrypt the data stream decryption key with the DSB's public key.

1.5. The key is returned to Data MP.

1.6. The Data MP keeps the key in the cache for subsequent requests.

1.7. The encrypted decryption key is returned to the DSB.

1.8. If the transaction from step 1.1 is not valid, an error message is returned to the DSB.

2. The DSB asks the Data MP for the received data.

2.1. The DSB receives the result.


Figure 3.16: Data Streaming Process Flow in ITrade [203, 97]

3. Data is decrypted with the key obtained in step 1.7.

4. The subscribed data is streamed to the DSB.

3.4.4 Implementation

ITrade is a cross-technology platform. It is established in the cloud and integrated with orchestration tools and load balancers. While the source code of the ITrade implementation is made fully accessible at [94], the following introduces the technologies, tools, architecture, and implementation details.


3.4.4.1 Blockchain (BC) and Smart Contracts (SC)

ITrade uses the Ethereum BC, since it is supported by a large community, offers a great set of libraries, and provides a built-in cryptocurrency which enables a seamless implementation of the P2P payment system. In order to join the Data MP, each user has to have an Ethereum account. Each action (e.g., sign in/up, sensor registration, and subscribing to data streams) is captured by means of deploying the corresponding SCs. Each SC creation incurs a fee for its creator (e.g., when a DSB subscribes to a DS, part of the subscription fee automatically goes to the DSS and part to the platform owner's account). Each registration function carries the payable flag in its method signature. This approach makes the payment system easier to manage and makes ITrade more transparent. Moreover, the SCs enabled by Ethereum offer an Access Control List (ACL)-like autonomous and decentralized system. Since ITrade enables data trading, the payment system is an integral part of this work.

3.4.4.2 DataMarketplace

It defines the method for registering Data Stream Principals (DSP). A new DSP has to pay the registration fee to the DM owner, hence the registerDataStreamPrincipal method is of type payable. Upon a successful call of this method, a new instance of the DSP SC will be deployed (cf. Listing 3.1) [94].

Listing 3.1: Data Marketplace Smart Contract [203, 97]

contract DataMarketplace {
    function registerDataStreamPrincipal(
        string _dataStreamPrincipalName,
        string _dataStreamPrincipalURL,
        string _dataStreamPrincipalEmail,
        string _rsaPublicKey
    ) public payable;
}
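For illustration, a minimal client-side sketch of calling this payable method via web3.py (v6) is given below; the provider URL, contract address, ABI, account, and fee value are placeholders and are not taken from the ITrade repository [94].

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))        # placeholder Ethereum endpoint
DATA_MARKETPLACE_ABI = [...]                                  # ABI produced when compiling Listing 3.1
dm = w3.eth.contract(address="0x...DataMarketplace", abi=DATA_MARKETPLACE_ABI)

# Register a new DSP and pay the (assumed) registration fee in the same transaction.
tx_hash = dm.functions.registerDataStreamPrincipal(
    "Alice", "https://example.org", "alice@example.org", "<RSA public key PEM>"
).transact({"from": w3.eth.accounts[0], "value": w3.to_wei(0.01, "ether")})
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)        # contains the new DSP SC's events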

3.4.4.3 Data Stream Principal (DSP)

The DSP SC has the method registerSensor(), which is called when a new sensor gets registered by a DSS (cf. Listing 3.2). The DSP has to provide information on the sensor type, geolocation data, and the price that DSBs have to pay for each data entry. This method creates a new instance of the Sensor SC [94].


Listing 3.2: Datastream Principal Smart Contract [203, 97]

contract DatastreamPrincipal {
    function registerSensor(
        IoTDataMPLibrary.SensoryType _sensorType,
        string _latitude,
        string _longitude,
        uint _pricePerDataUnit
    ) public payable;
}

3.4.4.4 Sensor

The Sensor SC (cf. Listing 3.3) represents a sensor entity and defines the method subscribe(), which expects the following parameters: (a) the DSB contract address, (b) the start timestamp, i.e., the time since when the DSB is subscribed, and (c) the number of data entries the DSB subscribes to. This method deploys an instance of the data stream subscription SC [94].

Listing 3.3: Sensor Smart Contract [203, 97]

contract Sensor {
    function subscribe(
        address _dataStreamBuyerContractAddress,
        string _startTimestamp,
        uint128 _dataEntries
    ) public payable;
}

3.4.4.5 Data Stream Subscription

It represents a subscription to a specific sensor and has a method that returns true or false depending on whether the subscription is active (cf. Listing 3.4) [94].

Listing 3.4: Datastream Subscription Smart Contract [203, 97]

contract DatastreamSubscription {
    function isDatastreamSubscriptionValid() public view returns (bool);
}

3.4.4.6 Streaming System

There are several aspects that should be considered before choosing a streaming system, such as (a) the message consumption model, with two options: pull-based mechanisms allow consumers to manage their message flow, i.e., users pull only the messages they need; push-based mechanisms, in contrast, put too many responsibilities on a streaming system, since the system would need to manage the message consumption for each consumer, which is not a scalable approach. Thus, a pull-based approach is preferable in a data marketplace. Further aspects are (b) the number of components and (c) the storage architecture.

Among the possible streaming systems, such as Pulsar and Kafka, the ITrade implementation is based on Kafka. This decision is due to the fact that Pulsar uses an index-based storage system that forms a tree structure; that enables fast access to messages but introduces write latency. Both Kafka and Pulsar retain messages indefinitely, meaning both can be used as storage systems. Kafka uses fewer components than Pulsar, which makes Pulsar more difficult to deploy and manage. Kafka uses a commit log as its storage layer: new messages are appended at the end of the log, and reads are sequential, starting from an offset and moving towards the end of the log [8].

In ITrade each sensor has its own Kafka topic to which it publishes its data and from which subscribers can stream. The sensor's SC address is used as the topic name, since these addresses are already globally unique.
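A minimal sketch of this per-sensor topic convention with the kafka-python client is shown below; the broker address, the topic value, and the downstream handler are placeholders.

from kafka import KafkaProducer, KafkaConsumer

TOPIC = "0x...SensorContractAddress"        # the sensor's SC address doubles as the topic name

# DSS/ingestion side: append the (already encrypted) payload to the sensor's commit log.
producer = KafkaProducer(bootstrap_servers="kafka:9092")
producer.send(TOPIC, value=b"<encrypted sensor payload>")
producer.flush()

# Streaming side: sequential, pull-based reads starting from the chosen offset.
def handle(payload: bytes) -> None:         # hypothetical downstream handler (e.g., the Stream API)
    print(len(payload))

consumer = KafkaConsumer(TOPIC, bootstrap_servers="kafka:9092", auto_offset_reset="earliest")
for record in consumer:
    handle(record.value)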

3.4.4.7 Container Orchestration Tool

While Docker Swarm is quicker to get started with and has simple configuration requirements, Kubernetes provides a more sophisticated environment. Kubernetes supports the installation of monitoring and logging components for better observability of a system, which is particularly important for its evaluation. Kubernetes also follows the cloud-native philosophy, meaning that it is easier to migrate from one platform/cloud provider to another. There is also a wide set of tools for packaging, versioning, and deploying applications on Kubernetes platforms [168]. Thus, ITrade services are orchestrated and deployed on a Kubernetes cluster. Each component inside the cluster is deployed as a Kubernetes service (cf. Figure 3.17). This approach provides an internal load balancer in front of a group of running containers, a powerful feature that enables scaling each group of components without users noticing it.

3.4.4.8 ITradeArchitecture

Figure 3.17 represents the components in ITrade and their correlation. An internal Kafka cluster provides the streaming and storage features. The whole cluster is deployed within a private virtual network. The only public-facing component is the load balancer, which has the TLS certificates attached to it for a secure connection. Both DSSes and DSBs access the corresponding services via the load balancer.


Figure 3.17: Component Overview of ITrade Architecture [97]

3.4.4.9 API Gateway

The API Gateway is the front door of ITrade. It exposes the HTTP GraphQL endpoint that the clients (ITrade users) can call. The API Gateway is implemented in the Java programming language with the help of the Spring Boot framework. GraphQL is a query language that enables fetching only the data needed on the client side. It also enables fetching multiple resources with a single API call, whereas REST APIs would require loading the results from multiple URLs.
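The following sketch illustrates such a single GraphQL call from a Python client; the endpoint path and all field names are hypothetical and not taken from the ITrade schema.

import requests

query = """
query {
  sensors(type: TEMPERATURE) {          # hypothetical query: fetch only the fields needed
    id
    pricePerDataEntry
    geolocation { latitude longitude }
  }
}
"""
resp = requests.post("https://itrade.example.org/graphql",
                     json={"query": query},
                     headers={"Authorization": "Bearer <JWT>"})
print(resp.json())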

3.4.4.10 EntityManager

Since the goal of the ITrade implementation is to have a high-performance system while still using a BC, the Entity Manager is implemented as a caching service to achieve this goal. Therefore, each call made to the BC is cached for a certain amount of time in order to improve the performance of the system. The caching is backed by a key-value in-memory database. Figure 3.18 depicts how users get authenticated and how the JSON Web Token (JWT) tokens get generated. Those tokens are stored in the Entity Manager service. Each token has certain permissions assigned to it.

When a DSB subscribes to a data stream, he/she provides the transaction ID as proof of the subscription. The API Gateway will first call the BC client to validate the transaction. The BC client will then return the result to the API Gateway, which will save the transaction information to the Entity Manager in case the transaction is valid; otherwise, it will reject the client call. When the DSB starts streaming the data, he/she has to include the JWT token in the authorization header; that token is checked by the Entity Manager service. A similar process applies when publishing the data from the IoT devices: each activated sensor has a JWT token assigned to it which has to be included in each call, so that the clients can be authenticated.
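The caching idea can be sketched in a few lines; the real Entity Manager is backed by a key-value in-memory database, while the names and the TTL below are purely illustrative.

import time

class TTLCache:
    # Tiny time-to-live cache mimicking the Entity Manager's caching of BC lookups.
    def __init__(self, ttl_seconds: int = 60):
        self.ttl = ttl_seconds
        self.store = {}                      # key -> (expiry time, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        self.store.pop(key, None)            # expired or missing
        return None

    def put(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=120)

def is_subscription_valid(tx_id, bc_client) -> bool:
    # Only hit the BC client when the transaction has not been validated recently.
    cached = cache.get(tx_id)
    if cached is None:
        cached = bc_client.validate_subscription_tx(tx_id)   # hypothetical BC client call
        cache.put(tx_id, cached)
    return cached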

3.4.4.11 Signature Verifier

The Signature Verifier component provides passwordless authentication for users. ITrade utilizes a cryptographically secure authentication flow with the help of the Web3.js library [13]. ITrade relies on the property that it is cryptographically easy to prove the ownership of an account (i.e., an Ethereum account) by signing a piece of data with the user account's private key. ITrade's specific implementation uses a message-signing-based authentication mechanism where users are identified by their Ethereum account addresses. For a user to obtain a JWT token, the steps shown in Figure 3.18 need to be performed. To evaluate and monitor the components involved in the microservice-based architecture of ITrade, the Istio [42] platform is employed. Istio embeds a proxy (i.e., Envoy) in front of each service to capture each request, by which the latency, request/response size, and success rate are measured. The Envoy proxy uses 0.5 virtual CPU (vCPU) and 50 MB of memory per 1000 requests per second going through it. Envoy adds 3.12 ms to the 90th percentile latency. Additionally, this test environment runs the control plane component of Istio with 2 vCPU and 4 GB of memory.
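A server-side sketch of this challenge-response flow is given below, written in Python with the eth-account package instead of Web3.js; the function names and the challenge store are illustrative and not ITrade's actual implementation.

import secrets
from eth_account import Account
from eth_account.messages import encode_defunct

challenges = {}                                   # DSP address -> outstanding challenge (Entity Manager role)

def get_auth_challenge(dsp_address: str) -> str:
    # Issue a random nonce that the client has to sign with its Ethereum private key.
    challenge = secrets.token_hex(32)
    challenges[dsp_address] = challenge
    return challenge

def verify_signed_challenge(dsp_address: str, signature: str) -> bool:
    # Recover the signer from the signed challenge and compare it with the claimed address;
    # only then would a JWT be generated and stored (cf. Figure 3.18).
    message = encode_defunct(text=challenges.pop(dsp_address))
    recovered = Account.recover_message(message, signature=signature)
    return recovered.lower() == dsp_address.lower()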

In order to measure the performance and user experience of ITrade, the "Percentile Latency" is used as the main metric. The Percentile Latency gives the maximum latency for the fastest percentage of all requests. For instance, the P50 Latency gives the maximum latency for the fastest 50% of all requests.
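As a small worked example of this metric, the following computes the Pp latency from a list of measured request latencies; the sample values are illustrative only.

def percentile_latency(latencies_ms, p):
    # Maximum latency (ms) among the fastest p% of all requests, i.e., the Pp latency.
    ordered = sorted(latencies_ms)
    k = max(0, int(round(p / 100.0 * len(ordered))) - 1)
    return ordered[k]

samples = [12, 18, 23, 40, 95, 140, 480, 1700]    # hypothetical request latencies in ms
print(percentile_latency(samples, 50), percentile_latency(samples, 90))   # P50 and P90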

In the evaluation of ITrade, "Distributed Tracing" is employed for monitoring and profiling. Distributed Tracing promotes the idea of distributed context propagation, which means that each request has associated metadata that can be followed across multiple microservices [12].

Considered parameters of this approach include (a) Trace: the sequence of calls through the system that is needed to resolve a request. Although a single call against an entry point of the distributed system may resolve the request right away, oftentimes multiple subcalls need to work together in order to resolve it. (b) Span: when a trace includes multiple subcalls, each subcall represents a span. Each subcall is timed and accepts key-value tags as well as fine-grained, timestamped, structured logs attached to a particular span instance. (c) Span context: metadata attached to each span, used for linking spans to their trace. Such a distributed tracing mechanism offers deep insights into the HTTP requests. Distributed Tracing helps to inspect requests against tail latency [224] and the overall latency, and to detect the particular services that are the biggest contributors to those delays.

Figure 3.18: Passwordless Authentication Process [97]


3.4.5 Test Scenario

In an evaluation of ITrade, 100 IoT devices transmit sensor data at about 2000 messages per second. 100 clients (representing DSBs) simultaneously stream (receive) the data at about the same rate. All requests are made via the public-facing load balancer. The total number of requests steadily increases up to 5000 requests per second. The test ran for almost an hour. In this test, ITrade was deployed on Amazon Web Services (AWS) servers and connected to an Ethereum test network.

Figure 3.19-(a) shows that the CPU utilization reached ∼36 vCPUs and then went down slightly to ∼31 vCPUs. Memory utilization increased gradually from ∼12.32 GB and reached ∼58 GB. At 12:52 the CPU utilization dropped slightly for the Stream API service, while the memory utilization went up for the same service. That was the moment when more replicas of the corresponding service were brought into the system by its autoscaler, i.e., the K8s Horizontal Pod Autoscaler (HPA); the additional replicas then started utilizing more memory. That event did not have any impact on Operations Per Second (OPS) or latency, as can be seen in Figure 3.20.

Table 3.2: Percentile Latency per Service with 4000 OPS [97]

Service             Requests (OPS)   P50 (ms)   P90 (ms)   P99 (ms)
web-client          0.79             210.00     182.00     523.00
api-gateway         4.04K            23.59      480.21     1.7 s
stream-api          4.03K            12.98      462.43     1.67 s
entity-manager      3.98K            3.11       4.79       21.32
bc-client           63.21            175.00     235.00     248.50
signature-verifier  1.21             3.12       5.20       6.45
kafka-rest          4.03K            12.63      420.51     1.65 s

The percentile latency can be seen in Figure 3.20. The P99 latency was as high as 2.3 s for about 60% of the testing time. The P90 latency was much lower, ranging from 50 ms to more than 1 s at the very end of the testing. The P50 latency maintained a very low value, i.e., 23 ms. This means that with ITrade the maximum delay experienced for 50% of the streaming requests is at most 23 ms.

For a more elaborate analysis, Table 3.2 collects the average requests and the percentile latency per service.

P50 latency: maintained a low value of 23.59 ms for the API Gateway, which is satisfactory. The biggest contributor to this latency was the downstream Kafka REST service with 12.63 ms.

Figure 3.19: (a) CPU and (b) Memory Utilization Per Service With 4000 OPS [97]

P90 latency: shows considerable latency. Here the ITrade architecture started reaching the limits of the Kafka REST component, since the latency for the API Gateway was as high as 480.21 ms and about 87% of that latency can be attributed to the Kafka REST component (420.51 ms). The most significant latency was experienced by the Stream API service, which was expected since it is an upstream service of Kafka REST.

P99 latency: showed a latency for the API Gateway of ∼1.7 s; 97% of that latency can be attributed to the Kafka REST downstream service. The Kafka REST components had a few packets dropped, so their overall success rate was 99.99%. Other services did not experience the same issue, meaning their success rate was 100%.

3.4.6 Discussion

Figure 3.20: End-to-end Percentile Latency with 4,000 Operations Per Second [97]

After running the several scenarios, it is concluded that the scalability of ITrade reached certain limitations due to the external Kafka REST component and its stateful consumer model; hence, applying the K8s HPA is not applicable in this case. Changing this approach would require significant engineering effort, meaning that the only way to overcome this issue at the time of writing is to make sure that the Kafka REST workload on network and computing resources is scheduled and optimized upfront. Moreover, it can be concluded that the components employed in the ITrade architecture are reliable and able to scale horizontally. For instance, building a Stream API service that abstracts the streaming layer enables switching to another streaming system in the future without changing the rest of the system. Should that be addressed, higher performance might be achieved.

Since ITrade is deployed in a cloud environment, data sovereignty is of particular importance. ITrade is compliant with European data sovereignty regulations: it is deployed in Europe and ensures data sovereignty for Europe only. ITrade is hosted on AWS in Europe, which guarantees that the data does not leave the data center. Traffic is routed to the right location by utilizing a geolocation routing policy. This means that the Domain Name Service (DNS) server resolves DNS queries to IP addresses according to the geographic location of the users. Access to ITrade from restricted locations is prevented by enabling the Web Application Firewall (WAF) service provided by AWS; ITrade utilizes the WAF service to filter traffic by source IP addresses, meaning that traffic originating outside Europe is blocked. Additionally, data sovereignty is ensured in ITrade by encrypting/decrypting the data on the user's side. This means that even if a malicious actor gets possession of the data, such encrypted data is useless without the decryption keys.


3.5 Blockchain and IoT Integration (BIoT) Risks and Challenges

The BC-IoT integrated systems introduced above are established via interdisciplinary approaches. Due to the very different characteristics of BCs and IoT, several requirements and risks are identified, as discussed in Sections 3.3.4, 3.3.6, 2.8.2.2, 3.2.3, and 3.4.1. Based on the points elaborated above, this thesis categorizes BIoT risks, as shown in Figure 3.21, into the social, operational, performance, technical, functional, and architectural fields, partially based on or taken from [208, 207, 223].

Figure 3.21: Categories of BIoT Risks [208]

Social risks refer to the social acceptance risks of a solution. A purely "solution-oriented" design of an application may offer an output that is not seen with the same importance and usability by end-users, who mainly face a product from a "problem-oriented" perspective. In the BIoT case, a clear understanding of BC benefits by end-users is key for any application to be socially adopted. Otherwise, there is a high risk that even a technically perfect application ends up with no customers nor consumers.

Technical risks can be caused by the algorithms and protocols employed in each technology, i.e., BC and IoT. For instance, using Ethereum as the underlying BC infrastructure of many platforms has shown a reliable security level in its latest versions; however, a few security breaches have been experienced with its SCs in the past. Furthermore, technical risks can be caused by the way programmers develop BC-based applications, including how they use programming languages, e.g., Solidity, when developing Ethereum-based SCs. Technical risks have resulted in data and financial losses several times in the past, both for platform owners and users.

Functional risks exist where an application cannot deliver a specific functionality due to technical problems or updates. For instance, a functional issue can be caused by the incompatibility of the software or hardware deployed.


Operational risks refer to the inability to deliver expected operations, either due to a weak design of an application or due to users lacking knowledge in interacting with the application. Both functional and operational risks potentially cause dissatisfaction of users and their reluctance to use a platform.

Performance and Architecture risks are correlated and, hence, intentionally placed adjacently in Figure 3.21. An efficient BIoT is not achievable unless corresponding components and functions are enabled. Therefore, the set of important metrics directly impacting a BIoT architecture and, consequently, the application performance includes (a) scalability, (b) security, and (c) energy efficiency (cf. Figure 3.22).

Each of these categories has a group of directly relevant and impacting parameters and reasons identified in this thesis.

Figure 3.22: BC-IoT Integration (BIoT) Metrics [208]


3.5.0.1 Scalability

The scalability of a BIoT depends mainly on the scalability of the underlying BC, the IoT protocol, and the architecture design. The BC consensus mechanism impacts the TX rate, which is generally used to evaluate the scalability of BCs. Since the consensus mechanism and the mining process are tightly coupled, these algorithms are affected by the underlying networking layer. Since the networking layer enables P2P communications between all BC nodes, the overall latency and TX rate are impacted (cf. Figure 3.22).

Firstly, BC miners' communications for (a) synchronization and state transmissions, (b) broadcasting a newly mined block, (c) asking for lost TXs, or (d) block/TX validations cause delays and affect scalability in case of networking instability, which can cause packet losses. Thus, a suitable BIoT has to take into account the number of consensus findings and mining-related TXs [255]. For example, un-/successful miners in Bitcoin affect the divergence of the latency of PoW miners in the Bitcoin network, discriminated by the consensus mechanism. As a consequence, miners may be delayed in synchronizing themselves with the BC, and their effort in mining new blocks might be lost due to a weak networking situation, i.e., by exceeding the blocktime or being unable to broadcast their mined block to a large enough portion of the Bitcoin network on time.

Secondly, the BC type impacts the scalability of BIoT applications, since it defines a user's participation level as a client or miner. One of the main reasons for the scalability difference (in terms of the TX rate) between private, permissioned DLs and public BCs is the consensus mechanism and its computational complexity. Since permissioned BCs rely on a centralized authority deciding on mining participation, they do not need computationally expensive consensus mechanisms. Thus, usually a Proof-of-Authority (PoAuth), a Proof-of-Stake (PoS), or a Byzantine Fault Tolerant (BFT)-like consensus mechanism is used. A recent study on the detailed effects of consensus mechanisms on IoT-based use cases can be found in [219].

Thirdly, the BC size and growth need to be considered for a scalable BIoT. The BC size is the accumulation of all data records stored and maintained on the main chain of a BC by its miners. When a new miner intends to join a BC, it has to receive all these records, which means that the new miner has to dedicate storage space equivalent to the size of that BC. For instance, the Bitcoin size reaches almost 300 GByte and the Ethereum BC size is over 1 TByte as of June 2020. Considering that IoT devices can generate gigabytes of data in real-time, corresponding techniques need to be employed to limit BC size growth, while still providing trusted and tamper-proof data storage [205]. Moreover, the design of public BCs does not provide fast and cheap storage of large amounts of data. Hence, different approaches have to be considered to filter, normalize, and compress IoT data to reduce the data size [215]. In this context, the removal of older TXs from the "older" blocks or the aggregation of such TXs has recently been proposed, too [170], [205].


Fourthly, the scalability of BIoT is related to (a) the employed IoT technology, (b) the transmission scheme, and (c) the processing tasks on IoT nodes. The Maximum Transmission Unit (MTU), the available bandwidth, the available air time, the Medium Access Control protocol, the location of in-/outdoor gateways, the computational complexity of cryptographic operations, and en-/decryption determine IoT-related characteristics which affect the scalability, i.e., the data size transmitted from IoT sensors to the BC.

Table 3.3 summarizes different functions of popular existing BCs according to their use of CPU, RAM, storage, and networking. The different functions are classified on a low, moderate, high, and impossible scale, assuming an arbitrary BC of high popularity, when a given function runs on the IoT device, i.e., without offloading, e.g., through Mobile Edge Computing (MEC). We assumed IoT-constrained peers having CPU, RAM, network, and storage equal to 10 MHz, 10 kB, 100 bps – 100 kbps, and 100 kB – 1 MB, respectively, which overlaps with the IoT device Class 0 [65], represented by the TelosB or AVR device families. We are not going to provide the complete analysis; however, we illustrate our reasoning using the PoW Mining scenario, in which BTC is used for evaluations. Having an average block-time of 10 min, a block size of 1–4 MB, and the performance of an ASIC miner expressed in tens of TTPS, we evaluated that "PoW Mining" is not possible due to CPU (cf. Table 3.4) and memory constraints and requires very high network and storage utilization.

Table 3.3: Required Use of Resources for Different Software Functions in BCs from the IoT Resource-Constrained Devices' Perspective [207]

BC Function    CPU         Memory      Storage     Network
BFT Ordering   high        impossible  high        impossible
PoA Ordering   high        impossible  high        impossible
PoW Mining     impossible  impossible  high        high
PoS Mining     high        impossible  impossible  high
Full Node♮     impossible  impossible  impossible  impossible
Light Node⋄    high        high        impossible  high
Submit TX      high        low         low         low
Verify TX      impossible  high        impossible  impossible

♮ A full node providing TX and block verification, which stores the complete chain.
⋄ A light node possessing a portion of the BC; it submits and verifies TXs.

Finally, the BIoT architecture itself affects the overall scalability depending on its design, since edge-, fog-, or cloud-oriented approaches facilitate different scalability levels. BC-related tasks operated by IoT devices, such as the signing of packets, burden computationally weak, resource-constrained IoT devices [223], [207]. Thus, the location of BC clients and the complexity of cryptographic algorithms for hashing or signing TXs are important. In this regard, a BIoT architecture needs to take into account the required computational and storage resources on the different layers.

3.5.0.2 Security

With the increasing number of attacks on IoT networks, security measures and functionality are needed for BIoT architectures; they include (i) Data Integrity, (ii) Trust, (iii) Regulators' and Governments' Influence, (iv) Transparency, (v) Privacy, and (vi) Encryption and Cryptographic Signatures (cf. Figure 3.22).

Several reasons can cause security concerns for IoT protocols, e.g., failure of devices, vandalism, and users (cf. Section 2.7.2). Thus, it is important to perform a health check of IoT devices before and while they are integrated with BCs. Hence, the implementation of automated security alerts based on Hardware Security Modules (HSM) or on Physically Unclonable Functions (PUF)-based periodic IoT device identity verification determines an essential element of secure BIoT architectures.

Trust in and reliability of IoT-generated data determine a vital security aspect of BIoT, since if data had already been corrupted before being persisted into the BC, it will remain secure, but wrong. Thus, data integrity, immutability, and the identification of changes have to be provided by BIoT architectures, such that data reliability can be ensured throughout data collection, transmission, and storage. Furthermore, trusted distributed authentication and authorization services for IoT devices can be provided by BC-preserved, SC-based IAM. For example, storing the hash of a device's firmware and state will create a permanent record on the BC that can be used to verify the identity of the device and its settings, making sure they have not been manipulated [170].

Safe preservation of the private key or Secret Key (SK) of IoT devices acting as BC nodes is crucial, since an SK loss causes a user to be unable to access his/her account and funds. If the SK is stolen, the user can even lose all digital assets in the form of cryptocurrencies or tokens [234]. Thus, for security reasons, special mechanisms are needed for storing the SK within the users' hardware, particularly not in the BIoT's fog or cloud. The full responsibility of maintaining the SK's privacy lies with the IoT owners.

Data privacy in BIoT architectures can be required at the time when data is being stored in the BC (by a careful decision on what shall be stored within the BC, especially in public BCs), but also within the entire IoT-to-BC path involving IoT data collection, communications, and the application domain. Taking into account the General Data Protection Regulation (GDPR), user privacy is crucial to be integrated into BIoT architectures. The risk of using a public BC is related to the data privacy of the information being stored within this BC. Since BCs are immutable by design, data cannot be deleted and, being public, all data is persisted in clear text. Thus, any non-trusted participant may join the BIoT application as a user, e.g., as an SCT stakeholder. Conversely, private distributed ledgers can partially manipulate or delete the data stored. In such a case, privacy is preserved by different policies, nonetheless only at the cost of making very different premises on the underlying trust model, especially the centralization of the consensus mechanism.

3.5.0.3 Energy Efficiency

Energy efficiency and scalability of BIoT architectures are closely intertwined [220]. For instance, networking and communication layer characteristics and configurations, such as the Medium Access Control (MAC) protocol of the IoT infrastructure deployed, affect energy efficiency. Specifically, the IoT transmission scheme plays a crucial role in providing energy-efficient communication; e.g., the lack or presence of Automatic Repeat reQuest in LoRaWAN affects the throughput and packet loss in the IoT network [223]. Energy efficiency is particularly relevant for the transmission of signature packets, since a lost signature affects TX integrity and consequently the reliability of the architecture.

It is recommended to sign IoT packets inside an IoT node with a BC client Private Key (PK), which is the IoT device owner's PK. Also, it is recommended to use for hashing and signing the same cryptographic algorithm that is used by the BC and BC clients to provide data integrity [207]. When data packets are signed by the IoT device owner's PK, the signature indicates the owner and origin of those packets. However, if the IoT MAC protocol requires data fragmentation on the IoT nodes due to small MTU settings (e.g., in LoRaWAN with 55–200 Bytes), such that the collected data can fit into the packets, data aggregation will be needed on the BIoT edge or fog, or at the BC client. Hence, the energy consumption of the encryption and cryptographic functions is a decisive element in the design of BIoT architectures, given that not all IoT devices are capable of performing computationally expensive operations.

Data integrity and transmission reliability are key for BIoT, here comprising the IoT infrastructure in use and the BC deployed. Since IoT devices are the initiators of IoT-to-BC communications, they are in the front line to support data integrity and thus its reliability. Therefore, IoT devices need to operate as BC nodes, either as BC clients or as BC miners. Generally, BC clients using BC wallet applications and BC miners utilize cryptographic functions to issue a TX or mine a block. Since hashing is a major task of BCs, the performance of IoT devices in running cryptographic functions, such as SHA-256 for sealing TXs in a block upon mining, or the light-weight Elliptic Curve Digital Signature Algorithm (ECDSA) Ed25519 with SHA-512 [54] for signing TXs, has to be studied on different computing architectures. The performance levels currently reached [207] are summarized in Table 3.4.

Here, the performance of selected cryptographic functions is evaluated on a TelosB (MSP430), an ATmega 2560 (Arduino Mega) node, a Raspberry Pi 3 (RPI), and a regular PC with an Intel i7 CPU at 2.4 GHz [207]. The evaluation was performed for SHA-256 and SHA-512, measured in Hashes per Second (HPS), and for Ed25519, measured in signatures per second. These determine the cryptographic functions as used by BCs. Consequently, if IoT devices were required to operate as BC nodes, thousands of TelosB and Arduino devices would be needed to perform BC operations as efficiently as a single Application-specific Integrated Circuit (ASIC)-based BC miner could.

Table 3.4: Performance of Selected Cryptographic Functions on Different IoT Hardware in Hashes per Second [HPS] and Signatures per Second [207]

Cryptographic Function    TelosB   Arduino Mega 2560   RPI 3   Intel-i7
SHA-256 [HPS]             34.22    79.8                31989   182216
SHA-512 [HPS]             4.79     10.46               12194   97655
Ed25519 [signatures/s]    0.0036   0.0179              30.1    84.2

It has been proven that even the signing operation for a single TX with Ed25519 can be considered heavy for constrained IoT devices. Thus, IoT-based mining of PoW-based consensus mechanisms using SHA-256 is neither realistic nor practically achievable in general. However, if the computational power of IoT devices is higher, such as with an Arduino or an RPI, they can operate as BC clients. The IoT device signs a TX with its Secret Key (SK) and attaches the corresponding Public Key (PK) to the signed data and/or the hash of the data. Such a signed TX, consisting of the raw data, its hash, and the IoT device's PK, ensures the data integrity and origin of a TX. Therefore, even when operating only as BC clients, IoT devices in BIoT applications have to be powerful enough for TX signing, to enable data integrity along the full BIoT path.
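Such figures can be reproduced on a given device with a short benchmark; the sketch below measures SHA-256/SHA-512 hashes per second and Ed25519 signatures per second using Python's hashlib and the cryptography package, and assembles the signed payload (raw data, hash, PK) described above. The measurement window and payload layout are illustrative.

import hashlib, time
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def rate(fn, seconds=2.0):
    # Rough operations-per-second measurement for a parameterless callable.
    n, end = 0, time.monotonic() + seconds
    while time.monotonic() < end:
        fn(); n += 1
    return n / seconds

data = b"sensor-reading:23.4C"
sk = Ed25519PrivateKey.generate()
pk = sk.public_key().public_bytes(serialization.Encoding.Raw, serialization.PublicFormat.Raw)

print("SHA-256 [HPS]:", rate(lambda: hashlib.sha256(data).digest()))
print("SHA-512 [HPS]:", rate(lambda: hashlib.sha512(data).digest()))
print("Ed25519 [signatures/s]:", rate(lambda: sk.sign(data)))

# A signed "TX" carrying the raw data, its hash, and the device's PK (layout illustrative).
signed_tx = {"data": data.hex(), "hash": hashlib.sha256(data).hexdigest(),
             "pk": pk.hex(), "sig": sk.sign(data).hex()}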

Further requirements, such as scalability and energy-efficiency, are imposed on BIoT applications, too. Hence, it is crucial to determine which functions and configurations can be supportive. For instance, de-/fragmentation, pre-processing, time stamping, event handling, en-/decryption, and compression/decompression of the collected data before any IoT-to-BC transmission are functions that impact the scalability and energy-efficiency of IoT-to-BC communications. To configure and manage IoT devices to perform these functions, an efficient and flexibly managed BIoT architecture is necessary. Thus, BIoT architectures are expected to consider BCs and IoT protocols, and, according to the specific characteristics of the employed IoT protocol, corresponding settings for the IoT configuration and the type of BC or DL selected are needed.


3.5.0.4 Manageability

The flexibility of a BIoT architecture is essential to adapt its settings to changes of the underlying infrastructure. BIoT applications benefit from adapting the communication settings of IoT devices based on network congestion, such that IoT devices can connect to a new gateway, thus providing a more reliable IoT-to-BC path. Such modifications demand a software-defined management of the BIoT architecture. Thus, it is vital to provide controller units and adjustable components to facilitate flexible and configurable BIoT architectures.

Figure 3.23: Blockchain Suitability Diagram for BIoT Use Cases


3.5.1 Discussion: To BC, or Not to BC?!

Having observed the BIoT challenges on the one hand, and considering the characteristics that other centralized, decentralized, and distributed storage mechanisms can offer on the other, the diagram in Figure 3.23 summarizes the suitability of public permission-less DLs, i.e., BCs, as the data storage infrastructure for IoT use cases.

This diagram takes the social interaction of stakeholders and their trust in each other in a BIoT system into account. It specifies that data shall not be stored on BCs if stakeholders do not need such trusted data storage and will not benefit from a DL; in that case, a centralized approach is recommended.

Moreover, the validation of TXs in BIoT applications is considered. If the data in TXs needs to be stored in a decentralized fashion without the need to verify each TX's validity, there is no need to enforce a resource-consuming mining and validation step before storing them; thus, just storing the data on distributed storage systems like IPFS would be sufficient. This is based on the assumption clarified by the first question, i.e., that trust between the different entities does not exist and that they cannot store the data centrally.

In the two final steps, the division between BCs and DLs is made based on the accessibility of and the payments for TXs. Thus, this thesis recommends employing DLs where public accessibility of the data is unnecessary or even prohibited, or where stakeholders cannot justify the costs they need to pay for every single TX stored in a BC.

This diagram restricts unnecessary BIoT by specifying that the only cases where BIoT is justified are when (i) there is no trust between stakeholders, (ii) the IoT-collected data is trusted, (iii) data cannot be stored centrally due to a lack of trust and transparency, (iv) IoT TXs have to be validated to verify their origin and integrity, (v) IoT data must be publicly accessible, and (vi) payments for TX mining and storage are justified for the stakeholders. A minimal sketch of this decision logic is given below.
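The sketch assumes each criterion is answered with a boolean and uses illustrative return labels rather than the exact wording of Figure 3.23.

def storage_recommendation(trust_between_stakeholders: bool,
                           tx_validation_needed: bool,
                           public_access_needed: bool,
                           tx_fees_justified: bool) -> str:
    # Condensed decision flow following the suitability questions of Figure 3.23.
    if trust_between_stakeholders:
        return "centralized storage"                  # no trusted, decentralized store required
    if not tx_validation_needed:
        return "distributed storage (e.g., IPFS)"     # decentralized, but no mining/validation step
    if not public_access_needed or not tx_fees_justified:
        return "private/permissioned DL"
    return "public permission-less BC"                # all BIoT justification criteria hold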

3.6 Conclusions

In this Chapter, the experiences collected with the four developed dApps made the identification of BIoT challenges possible. These challenges are categorized regarding the security, scalability, and energy efficiency of BIoT in Figure 3.22. In practice, depending on the BIoT use case, all of these metrics must be considered to reach high efficiency. For instance, the BC client location directly affects data integrity and, consequently, its reliability. As shown by BPMS in Section 3.2, a BC client can be located at different places and on different devices in a BIoT architecture to sign the data. It may be placed on IoT nodes, on IoT gateways, or even on a standalone BC client which "listens" to data from a server (e.g., LoRa) and then signs and sends the collected data to a BC. However, the chosen location of the BC client determines the processing power demanded there, thus affecting the total costs of establishing a BIoT system.


The same situation exists with the level of transparency needed. As shown in the food chain tracking scenario in Section 3.3, every farmer could embed IoT/BC nodes per cattle, or only per building, or just one BC node could be located at the milk delivery point. Each case leads to a different level of transparency, trading off the infrastructure costs. If the collected data is not reliable due to the misplacement of BC nodes, the data stored in the BC cannot add any value to the BIoT application.

It has been made evident that BCs and/or IoT technologies can become the scalability bottleneck of BIoT, as experienced with ITrade (cf. Section 3.4). The same holds for the security and energy efficiency of the employed technologies, as they directly impact the total security and energy consumption of BIoT. Adding security via the IAM of users or devices through a KYC and KYD approach is essential in most BIoT applications. Thus, multiple considerable aspects exist to determine the impact of the underlying technologies on a BIoT application, before establishing and while maintaining one, as well as for developing and integrating KYC and KYD platforms like KYoT (cf. Section 3.1) into a BIoT application.

The BC client location mentioned here is only one example of all those metrics and challenges identified earlier in Figure 3.22. In the following Chapters, this thesis proposes an efficient BIoT architecture and DL while considering such challenges.


4 BIIT — An Efficient BIoT Architecture

Based on the decentralized and distributed BIoT applications and protocols developed in Chapter 3, and on the practical demands of BIoT use cases identified in Section 3.5, the design and implementation of BIoT architectures are taken into account for further analysis in this Chapter, partially based on or taken from [208, 207, 223].

BIoT requires adaptations of the BC and IoT infrastructures, complemented with a BIoT architecture providing the configurable components required for an efficient ecosystem. In this context, and considering the challenges of BIoT (cf. Section 3.5), the Blockchain IoT Integration archiTecture (BIIT), based on [207], is proposed as a potentially efficient BIoT architecture (cf. Figure 4.1).

4.1 BIIT Objectives

This thesis identifies 3 key objectives to be followed by BIIT, based on [208, 207].

(i) BIIT needs to provide a configurable BIoT transmission scheme. Flexible APIs shall enable an IoT logic compatible with the targeted BC implementation. The BIIT architecture and approaches shall integrate various IoT networking technologies, e.g., Thread, LoRaWAN, and cellular networks. Thus, the management components defined in BIIT need to optimize the transmission on-the-fly (i.e., provide appropriate fragmentation and take into account the underlying network characteristics) to provide an optimized solution that guarantees an appropriate level of TX throughput, packet loss, and energy efficiency [223].

(ii) IoT shall be compliant with the target BC by using similar network primitives (e.g., the TX data format). A TX signed by an IoT device assures the authenticity of the data submitted to the BC. However, BIIT shall preserve the BC-specific TX fields providing an appropriate level of BC security (e.g., the ETH nonce TX field protecting against double spending).

(iii) BIIT needs to provide an appropriate level of security upon data collection as well as configurable APIs to cover authentication, encryption, and data integrity (e.g., Message Digest, ECDSA) upon TX submission. The management-level decisions to be taken can be based on parameters such as (a) the security level to be maintained, (b) computational complexity, (c) storage requirements, and (d) power efficiency. Thus, BIIT shall enable and consider security APIs for hashing, signing, and encrypting messages in a similar way as the Security Service Provider module was integrated into ZigBee to provide the required level of privacy.

Figure 4.1: BIIT 1.0 Architecture [208]


4.2 BIIT Architecture Components

BIIT explicitly defines the following components on IoT devices and uses them for BIoT management. These components are selected as a complete and overarching set of elements required for BIoT management to confront the BIoT challenges identified in Section 3.5.

IoT Devices: are end devices used for data collection that communicate with other system components through the IoT network infrastructure.

BC Wallet: is a software component placed on an IoT device that contains device-specific credentials such as the BC address, private key, destination addresses, balance, TX counter, etc. This information is essential to provide data authenticity, integrity, and trust. Using the Configuration engine, the BC wallet configures the lower-layer components, i.e., the Network Adaptation Layer (NAL), to produce appropriate TXs optimized for the underlying network technology. Moreover, the BC wallet issues data packets that shall be converted through the NAL into fully fledged TXs on the southbound interfaces. The data packets may contain information on the industrial process monitored.

Security Functions: analogously to the Security Service Provider defined in ZigBee [118, 130], a generic BIoT architecture has to consider a security engine (i.e., API) producing a large number of hash functions (e.g., SHA-256) and ECDSA signature types (e.g., Ed25519), which guarantee the compatibility with a large spectrum of BC protocols.

Configuration: this engine manages the configuration of the wallet and the NAL. The wallet is configured by the Enterprise End User (EEU) using the management plane. In turn, the configuration of the NAL is requested by the wallet, which demands the computation of appropriate hash functions (e.g., SHA-256) or ECDSA signatures (e.g., Ed25519). Furthermore, the Configuration engine provides the necessary keys (e.g., the Ed25519 private key) to the NAL and negotiates the data-plane message format used by the NAL on the southbound network interfaces.

Software-Based Network Adaptation Layer: the configuration of the NAL is initiated by the Configuration engine. The NAL is a supporting component of the wallet; it receives data packets from the wallet on the data plane, computes the required hash functions (e.g., SHA-256) or ECDSA signatures (e.g., Ed25519), assembles TXs, and sends them to the underlying network protocol stack. The NAL optimizes the packet transmission towards the underlying network layer (i.e., message fragmentation, Automatic Repeat reQuest (ARQ), TX aggregation [223]) and provides TX compatibility with the targeted BC (e.g., BTC, ETH). The adaptation is required, as in many situations the TXs cannot be sent directly by the network layer, e.g., the regular TX size may already exceed the Maximum Transfer Unit (MTU) of the IoT network.

IoT Network Protocol Stack: the physical, data link, and networking layers enabling the communication between IoT nodes and IoT Gateways (GWs), e.g., a LoRa GW or an LTE MTC evolved Node B (eNB).

IoT Network Gateways: this component relays data packets between the IoT network and other computing infrastructures, e.g., Edge Nodes and cloud infrastructures.

Network Core: provides the data collection point, such as the TTN servers [241], in which the IoT data is stored temporarily and can be accessed by other system components (e.g., the BC Client).

IoT Edge Node: since BIIT tackles heavily constrained devices, IoT nodes are not envisioned to play the role of full BC clients. The edge node functionality can be provided on less resource-constrained devices such as an RPi helping the BC Wallets on IoT devices. In some cases, e.g., in LoRa, the Edge Nodes derive the TXs from TX chunks sent by the IoT device (cf. Section 4.2), as for example in LoRa the regular TX size may exceed the LoRa MTU. It is worth noting that some Edge Nodes can also act as a gateway relaying packets from the IoT network towards other computing infrastructures.

BC Client: typical full BC clients require significant amounts of resources (e.g., storage in the order of GB in the case of BTC or ETH) and cannot be implemented on IoT devices. The BC Client residing on the Edge Node is an auxiliary element helping IoT wallets to submit/verify TXs using the Edge Node resources.
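
The interplay between the BC Wallet, the Configuration engine, and the NAL described above can be summarized in the following minimal Go sketch; the interface and field names are illustrative assumptions and do not reflect the actual BIIT code base.

// NALConfig is an illustrative configuration handed from the Configuration
// engine to the Network Adaptation Layer (NAL).
type NALConfig struct {
    HashAlgo   string // e.g., "SHA-256"
    SigScheme  string // e.g., "Ed25519"
    MTU        int    // MTU of the underlying IoT network interface
    PrivateKey []byte // signing key provisioned by the Configuration engine
}

// NAL converts wallet data packets into BC-compatible TXs, fragmenting them
// when they exceed the interface MTU.
type NAL interface {
    Configure(cfg NALConfig) error
    // AssembleTX signs the data packet and returns one or more fragments
    // ready for the underlying network protocol stack.
    AssembleTX(dataPacket []byte) (fragments [][]byte, err error)
}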

As proof-of-concept, BIIT is partially implemented in [207, 206, 236, 73, 223]. The implementation details of BIIT for different IoT networks are elaborated as follows.

[Figure 4.2 depicts the components involved in the LoRa-based implementation: an IoT device with a Blockchain wallet (LoRa shield + Arduino Mega) with sensors, LoRa Gateways, TTN servers, a Raspberry Pi (incl. TTN Listener and Blockchain light client), data and ACK transmissions, and the Blockchain, e.g., Bazo.]

Figure 4.2: BIIT Implementation with LoRa —Components’ Engagement View [207]

4.3 BIIT Implementation

LoRa networks endure air time and packet size limitations [166]. Thus, aligned with these specific characteristics, the LoRa-to-BC integration management based on BIIT is implemented as follows, based on [223]. Figure 4.2 represents the implementation of the LoRa network including IoT devices and edge nodes.


4.3.1 Management of the TTN Networks by BIIT

IoT setup: IoT sensors are connected directly to Arduino Mega (AT2560) boards [38]. The connection is wire-based and enabled through the General Purpose Input/Output (GPIO) pins. Each Arduino device is equipped with a LoRa shield [102]; the coupled Arduino and LoRa shield is referred to as the LoRa node. Each LoRa node is able to use the LoRaWAN protocol. The TXs originated by the Arduino wallet have to be signed with the Ed25519 ECDSA using a private key and other parameters stored on the IoT device (e.g., the ETH TX nonce). Moreover, in the TTN protocol, the data exchanged over the network is encrypted, therefore the solution meets the privacy requirements of a private BC as well.

[Figure 4.3 shows a sequence diagram between the LoRa Node, the TTN-Plugin, the Bazo Client, and the Bazo Miner, covering the LoRa Node Initialization phase (send PubKey, isAccountCreated(sensor), createNewAccount(sensor), sendAccTxEndpoint, sendAccTx(txHash, Signature), sendFunds(sensor, amount), sendFundsTxEndpoint, sendFundsTx(txHash, Signature)) and the Transaction phase (send Data, send Signature, sendData, sendTxIotEndpoint).]

Figure 4.3: Data Flow and Instantiating of a Blockchain Wallet on IoT Devices [223, 73]


IoT Network Gateways [223]: the nearest LoRa Gateway (GW) receives the data and forwards it to the TTN Network Collector, which is responsible for decrypting the data and presenting it to third-party applications registered with the TTN back-end (e.g., via the Message Queuing Telemetry Transport (MQTT) protocol).

IoT Edge node: as the Edge node, an RPi device is deployed to constantly listen to the TTN server and to receive messages originated by the LoRa nodes. The RPi is responsible for collecting and relaying the TXs between the TTN network and the BC miners. The RPi acknowledges messages received from an IoT device by sending acknowledgement (ACK) packets. For security reasons, the RPi has to send the ACK and not the gateways. When the RPi receives the whole data sent from a node, it verifies the integrity using the signature and the PK of each node.

BC TTN Plugin: the role of the BC TTN Plugin is to connect to the back-end of the TTN network through the MQTT protocol, retrieve and acknowledge data chunks originated by the LoRa nodes, and present the reconstructed TX originated by the LoRa nodes to the BC client (cf. Figure 4.3).

BC client: the RPi also runs a BC client. Its purpose is the verification of TXs received from the BC TTN Plugin and forwarding them to the BC [209]. In this setup, the BC miner also resides on the RPi device, therefore the BC client can directly submit a TX to the BC. Moreover, both the BC TTN Plugin and the BC client are involved in the LoRa node recognition procedure (cf. Figure 4.3): when a new LoRa node appears in the network, it presents its PK, which is in turn registered with the BC. Initial funds are then stored on the LoRa node account to allow for the submission of upcoming TXs (i.e., the LoRa node needs to directly pay the TX fees). This greatly improves the management of new IoT devices joining the network. As indicated in Table 3.3, TX verification is a costly operation for a restricted IoT device. Therefore, BIIT recommends offloading the verification procedure from the IoT wallet to the more powerful Edge Node. The wallet will trust the Edge Node BC client and use it to confirm whether previously issued TXs were successfully included in the BC. Such a mechanism could drastically increase the fault tolerance in the BIoT system.

Moreover, the RPi should save a tuple of device ID and PK (devId, PK) to assure that IoT nodes are not being manipulated. Since anonymity of the devices is not a concern, the (PK, SK) of a node can always remain the same. The current status of a node should be saved as well, so that when the system shuts down it knows the state of each node without needing to ask the network about the current status. To do so, a local database could be implemented.

The main deficit of BIoT systems using LoRa is the transmission performance, i.e., maximizing TX throughput while simultaneously providing data integrity using cryptographic signatures. In [223], it was discovered that the combination of Listen Before Talk (LBT) Medium Access Control (MAC), the ARQ mechanism to handle retransmissions, TX aggregation (i.e., aggregating several data packets into a single BC TX), as well as TX fragmentation into small data chunks (when the TX payload is larger than the interface MTU) can provide a good trade-off between the TX network capacity, packet loss, and energy efficiency. Moreover, the Ed25519 ECDSA is very costly on the IoT device and should be used rarely to limit both the payload (i.e., sending a 64 Byte signature) and computational overheads (cf. Table 3.4). BIIT manages this condition through a LoRa-specific transmission scheme between the LoRa node and the TTN-plugin (cf. Figure 4.4).

Figure 4.4: LoRa to Blockchain Transmission Protocol [206, 207, 208]

The data collected by IoT devices has to be divided into smaller chunks (i.e., 51 B) to fit the LoRa MTU with Spreading Factor 12 (SF 12). The employed TXs shall not create high overhead, therefore signing is employed only upon every Nth data packet, which reduces the computational overhead caused by the Ed25519 ECDSA algorithm (i.e., residing in the Security engine) and the payload overhead caused by the Ed25519 ECDSA signature. By signing the data at the IoT device, authenticity and integrity of the data can be verified. IoT edge nodes collect the TX payloads from several chunks of data exchanged over the TTN, strip the received messages from the LoRa transmission headers, attach the required BC headers, and send the complete TX to the BC client. The protocol uses Segment Numbers (Seg. Nb.), which are used for tracking and re-sending lost chunks of data. To inform the IoT device about the data reception at the GW side, ACKs are employed. Finally, this mechanism assures that BC TXs derived at the BC client comply with the standard of the targeted BC protocol (e.g., ETH, Bazo) and may be successfully submitted for mining.
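
A minimal Go sketch of this chunk-and-sign pattern is shown below; the chunk layout (a 1 Byte segment number prefix) and the function names are simplifying assumptions and not the actual BIIT implementation.

package lorascheme

import "crypto/ed25519"

const chunkSize = 51 // LoRa payload size used with SF 12, as stated above

// Chunk splits the collected sensor data into 51 Byte segments and prefixes
// each with a 1 Byte segment number (a simplification) so that lost chunks
// can be re-requested.
func Chunk(data []byte) [][]byte {
    var segments [][]byte
    for i, seg := 0, 0; i < len(data); i, seg = i+chunkSize, seg+1 {
        end := i + chunkSize
        if end > len(data) {
            end = len(data)
        }
        segments = append(segments, append([]byte{byte(seg)}, data[i:end]...))
    }
    return segments
}

// SignBatch signs the concatenation of N data packets with Ed25519, so that
// one 64 Byte signature covers N packets instead of one per packet.
func SignBatch(priv ed25519.PrivateKey, packets [][]byte) []byte {
    var batch []byte
    for _, p := range packets {
        batch = append(batch, p...)
    }
    return ed25519.Sign(priv, batch)
}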


Figure 4.5: Cellular TX Design [207, 208]

4.3.2 Management of the Cellular Networks by BIIT

The management of BIoT cellular networks is much simpler in comparison to LoRa. First of all, the cellular stack is IP-based. Second, cellular IoT technologies provide large MTUs of 1600 B [223, 264]; therefore, they can send several data chunks in one shot. The Configuration and Security engines play exactly the same role as in LoRa, however, the NAL is configured differently. The NAL accumulates several chunks of data (e.g., measurements) in a cache and then sends one large TX containing multiple data chunks signed with a single Ed25519 signature. The resulting TX is enclosed in a User Datagram Protocol (UDP) data packet, which is confirmed with a single ACK message from the IoT Edge Node. Finally, BIIT uses the UDP Packet Listener on the IoT Edge Node side (cf. Figure 4.5) as the IoT Network Plugin.
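
A minimal sketch of this cellular-side behavior, assuming a hypothetical Edge Node address and a simple payload layout (data chunks followed by one Ed25519 signature), could look as follows in Go:

package cellular

import (
    "crypto/ed25519"
    "net"
)

// SendAggregated concatenates cached data chunks, signs the aggregate once
// with Ed25519, and sends the result as a single UDP datagram to the UDP
// Packet Listener on the IoT Edge Node (edgeAddr is a placeholder).
func SendAggregated(priv ed25519.PrivateKey, chunks [][]byte, edgeAddr string) error {
    var payload []byte
    for _, c := range chunks {
        payload = append(payload, c...)
    }
    sig := ed25519.Sign(priv, payload) // one 64 Byte signature for all chunks

    conn, err := net.Dial("udp", edgeAddr)
    if err != nil {
        return err
    }
    defer conn.Close()
    _, err = conn.Write(append(payload, sig...))
    return err
}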

4.4 Experimental Performance Evaluation

BIIT needs to justify its viability by enhancing the system automation and performance with respect to the considered criteria.

4.4.1 BIIT Performance in LoRa

A BIIT-based implementation of the LoRa transmission scheme is simulated in the Network Simulator 3 (NS-3) [223, 236], which studies (i) the influence of LBT on the MAC layer, (ii) the Automatic Repeat reQuest (ARQ) on the transport layer (i.e., downstream messages in LoRa are initiated by the Network Server; therefore, as an end-to-end scheme, such an ARQ mechanism can be understood as a transport layer instrument), and (iii) TX aggregation on the application layer, to provide an efficient communication scheme for BIoT applications using LoRaWAN.


The simulations conducted in [223] study two MAC configurations, i.e., the basic LoRaWAN class A MAC, abbreviated as Duty Cycle Enforcement (DCE), and the Listen Before Talk LoRaWAN MAC provided by [245], abbreviated as LBT; two Transport Layer configurations with and without the retransmission scheme coupled to ACK messages, abbreviated as ACK and NOACK, respectively; and the Single Data Packet TX scheme followed by the fragmented signature, denoted as N = 1, as well as the Multi-Packet TX scheme, in which N = 10 data packets are followed by the corresponding cryptographic signature. The size of the data packets is equal to 42 Bytes. A BC TX requires a signature, which is generated with the help of the Ed25519 cryptography [54]. The signature is, therefore, of size 64 B and is carried in two separate packets of 32 Byte each, as a 64 Byte signature cannot be sent over the LoRa PHY BW = 125 kHz, SF = 12 configuration, which only allows for MAC MTUs of 55 Byte. Therefore, the signature is fragmented. The number of GWs is fixed to 6, with 1000 end (IoT) devices in the default setup. [223] and [236] experiment with the following end-device densities and inter-TX delays:

• Inter Transaction Delay / Inter Transmission Delay: [120, 95, 65, 35, 14, 9] s,

• Ndevices (Number of Devices): [200, 400, 600, 800, 1,000, 1,200, 1,400, 1,600].

[223] performs experiments with a varying number of inter-TX delays (i.e., inter-data packet arrivals). In this simulation, when the MAC layer is still delivering an old TX (a packet) through retransmissions, a new TX (data packet) is not generated. Even if the BIoT application is configured to deliver a packet per time unit, it will wait until the MAC operation for the preceding packet is finished. The physical layer of LoRa is configured with the appropriate transmission power and SFs so that the transmission air-time is minimized under the assumption that every end-device can reach at least one GW [223]. In terms of the channel model, the LogDistancePropagationLossModel for propagation loss and the ConstantSpeedPropagationDelayModel for propagation delay of NS-3 [184] are used in the simulation. The LoRa module adds a 9 Byte MAC header; therefore, the total size of data packets transmitted over the air increases respectively. The size of ACK messages on the Transport Layer is equal to 9 Byte.

Simulation results in [223] indicate that scenarios simulating transmissions of signed Multi-Packet TXs scored significantly lower success rates than Single Packet TX scenarios (cf. Figure 4.6). This is due to the fact that for Multi-Packet TXs with N = 10, all twelve packets belonging to a TX (i.e., 10 data packets and 2 signature fragments) have to be received intact consecutively for the TX to count as successful, whereas a Single Packet TX increases the success rate, since only 3 packets (1 data packet and 2 signature fragments) have to be reliably delivered to the LoRa Network Server. However, the ARQ scheme on the transport layer coupled to retransmissions (i.e., 8 delivery attempts of every message) can significantly increase the performance of Multi-Packet transmissions.

[Figure 4.6 plots the packet delivery ratio [%] over the requested packets/hour/node rate for the LBT and DCE MAC variants, with and without ACKs, for N = 1 and N = 10.]

Figure 4.6: Packet Loss Experienced by End‐Devices, 1000 End‐devices, 6 GWs [223].

The LBT variant provided by CSMA-x [245] used in LoRaWAN significantly improves the network capacity (cf. Figure 4.7). Throughputs of Multi-Packet transmissions for N = 10 typically exceed the performance of Single Packet TXs by a factor of 2-2.5. This relation can easily be derived from the traffic pattern that has to be sent in both situations, i.e., a data packet followed by two signature fragments in the Single Packet TX scheme, against 10 data packets followed by two signature fragments in the N = 10 Multi-Packet situation. Moreover, the ACK mechanism significantly reduces the throughput by a factor of two in dense deployments, as in such a case a GW has to stop listening to issue a downlink ACK packet.

In [207], BIIT is evaluated using Arduino Mega devices (AT Mega 2560) [38] equipped with the Dragino LoRa communication shield [102]. The Edge Node is provided on an RPi 3 hosting the BC TTN Plugin, the Bazo BC Client, and the Bazo BC miner [74]. The RPi also provides a local TTN GW. The Arduino device and the TTN-plugin receive E2E connectivity through the TTN network. The Arduino device sends traffic using LoRa with a Bandwidth (BW) of 125 kHz, SF 12, and a transmission power equal to 27 dBm. Every segment is retransmitted up to a configurable number of retries when the corresponding ACK does not arrive.

[Figure 4.7 plots the cumulative network-server-delivered packets/hour over the requested packets/hour/node rate for the LBT and DCE MAC variants, with and without ACKs, for N = 1 and N = 10.]

Figure 4.7: Cumulative Throughput of the LoRaWAN Network, 1000 End‐devices, 6 GWs [223].

The evaluation of the employed TX scheme for one device is shown in Figure 4.9 and Figure 4.10. Figure 4.9 (left y axis) presents the amount of traffic exchanged in the network for a TX consisting of several signed packets (x axis) of size 40 B. The traffic types can be divided into three groups (cf. Figure 4.4) for remote and local TTN GWs: (a) Data (i.e., payload), which increases upon sending several data chunks of 40 Byte per TX, (b) signature fragments (32 B each), and (c) ACKs (2 Byte). The packet and traffic overhead, which is measured in relative values displayed on the right y axis, consists of signatures and ACKs. Note that there is a higher number of ACKs than data packets, since the signatures are also acknowledged. The relative overhead decreases in aggregated TXs due to an increased number of data chunks (40 Byte) signed with one heavy signature packet (64 Byte). Figure 4.9 and Figure 4.10 also compare the transmission performance between a local indoor GW and a remote GW, i.e., fewer retransmissions to the local gateway are observed due to better radio conditions. Thus, it is highly recommended to install a local GW, especially in suburban environments with weak TTN coverage.

4.4.2 BIIT Performance in Cellular Networks

In an experiment [207], a similar setup as in Section 4.4.1 is used; however, the Arduino Mega devices (AT Mega 2560) [38] are equipped with the Sixfab Cellular IoT Application Shield [232]. The access to LTE Cat. M networks is provided by Swisscom AG in the Zurich, Switzerland area. TXs are sent in UDP messages. The RPi again plays the role of the IoT Edge Node; however, it receives data from the Swisscom network through the Internet using the UDP Packet Listener (cf. Figure 4.5). In these measurements, the node wakes up, re-attaches to the cellular network, and sends a data packet towards the IoT Edge Node if there are enough collected data chunks in the cache to send. The sending procedure requires the computation of BC headers (also including signatures). The measurements collected on a single device are presented in Figure 4.11. Sending aggregated TXs (composed of multiple 40 B data chunks, e.g., measurements) has a positive impact on both the relative BC data overhead (i.e., data transmitted vs. actual data size) and the energy efficiency, because the resulting Bazo header is heavy (141 Byte) and the Ed25519 ECDSA is computationally expensive. Furthermore, the most dominant energy consumption results from the cellular connection re-initialization after a deep sleep and the computation of an Ed25519 signature, which can take up to several seconds. It was not possible to perform NB-IoT or EC-GSM-IoT experiments; Swisscom did not provide EC-GSM-IoT in the area and connecting to their NB-IoT infrastructure was not possible. Therefore, only LTE Cat. M is considered.

[Figure 4.8 plots the consumed energy per hour per node [J] over the requested packets/hour/node rate for the DCE MAC with and without ACKs, for N = 1 and N = 10.]

Figure 4.8: Energy Consumption, 1000 End‐Devices, 6 GWs [223].

4.5 Conclusions

BIIT fulfills the defined objectives, as shown via the different implementations and evaluations presented above. A summary of the concluded outcomes is presented as follows [208, 223, 207].

To address the scalability concerns of IoT communication protocols such as LoRa, BIIT provides configurable BIoT communication schemes. Moreover, a BIIT-compliant BIoT shall enable APIs that allow for the implementation of logical components compatible with a wide variety of different BCs. BIIT supports a broad range of BC types, all with application-oriented consensus mechanisms. BIIT enriches the edge and fog adaptation by specifying elements to smoothly integrate IoT with BCs. The management components defined in BIIT allow for the specification of efficient transmission schemes.

[Figure 4.9 plots the packets sent and the relative packet overhead [%] over the number of data packets signed, distinguishing Data, Signatures, ACKs, and Overhead for a local and a remote GW.]

Figure 4.9: The Packet Volume Exchanged in The Network Upon One TX Consisting of Several Data Packets [207].

A BIIT BC TX adaptation scheme adapts the transmission scheme on-the-fly to the underlying networking interfaces, e.g., by employing a TX fragmentation level in case network interfaces with low Maximum Transmission Units (MTUs) are used, such as IEEE 802.15.4, to guarantee a high TX throughput, transmission reliability, and energy efficiency. This adaptation is supported by the Software-based Network Adaptation Layer (cf. Figure 4.1). The fragmentation and aggregation of IoT data packets have been simulated in different scenarios [223], which proves the importance of such technical considerations in IoT-to-BC communications. Furthermore, the generic architecture of BIIT allows for the implementation of communication protocols adjusted to particular BC specifications, allowing the programmer to derive BC-compatible protocol data units, e.g., with an appropriate TX data format. Finally, BIIT considers a flexible edge-, fog-, and cloud-based service-and-resource management scheme for BIoT.

For high security, a BIIT-compliant system has its BC TXs signed on the IoT device itself. Hence, the origin and authenticity of the data submitted to the BC can be verified at a later stage. Moreover, to allow for the required level of protocol security, BIIT appropriately handles security-specific TX fields rooted in a given BC specification, such as Ethereum's TX nonce field, which protects against double-spending. BIIT suggests a set of configurable APIs that play a key role in the safety of the IoT-to-BC communication. These APIs shall establish configurable security settings via encryption protocols according to the BIoT use case requirements. Therefore, BIIT offers a high level of security and flexibility at the same time.

BIIT does not define a framework for the implementation of a BIoT application, but it outlines the most critical considerations of a BIoT system's design and development in support of a realistic application. Thus, BIIT cannot fragment or aggregate IoT data automatically, nor can it run BC clients with specified details, but crucial requirements are derived.

[Figure 4.10 plots the Bytes sent and the relative volume overhead [%] over the number of data packets signed, distinguishing Data, Signatures, ACKs, and Overhead for a local and a remote GW.]

Figure 4.10: The Total Traffic Volume Exchanged in the Network Upon One TX Consisting of Several Data Packets [207].

In the security provision context, this generic architecture covers a wide range of functionalities related to authentication, encryption, data integrity, and authenticity. The authentication engine challenges IoT devices by asking for credentials before access to a specific resource may be granted. For example, a permissioned BC is equipped with an authentication engine, Access Control Lists (ACL), and membership registers that allow authorized devices to read/write to the BC, excluding third-party users explicitly. Furthermore, similar authentication engines may be placed at the edge, when only authorized devices may use the edge infrastructure to offload heavy processing towards an edge with high processing capabilities. These authentication functions are embedded in the BC wallet on the IoT device and as light BC clients at the edge.

Data integrity is guaranteed by the message digest computed over chunks of submitted data through hashing functions, such as SHA-2 and SHA-3. Additionally, data authenticity is established through digital signatures via public-key cryptography mechanisms using a pair of cryptographic keys, such as with the Elliptic Curve Digital Signature Algorithm (ECDSA), e.g., Ed25519. Therefore, BIIT components securely submit a TX from an IoT device to the BC. [207] implements BC clients — according to BIIT's specific considerations — on LoRa IoT devices, which transmit data to a BC by first signing them with the SK of the user's BC address. In a BIIT-compliant approach, IoT devices shall connect to the BC client, which may be located on the IoT infrastructure or even in the fog, and collect the user-related information such as the account address, balance, and SK without user interaction, via an automated mechanism. The corresponding data integrity and authenticity functions on the IoT device are delivered by the security module. The node perception layer has to be secured as well against node tampering attacks, to ensure that the data of the environment remains intact before it is eventually sealed in the BC.

[Figure 4.11 plots the energy consumption [mWh/B] and the data overhead [%] for LTE-M over the number of data samples signed.]

Figure 4.11: BIIT Performance Using The Cellular Connectivity [207].

Finally, the energy efficiency in BIIT is handled by the management components, which have to be adjusted based on the users' preferred parameters, such as the (a) maintained security level, (b) computational complexity, (c) storage requirement, and (d) power efficiency. These management decisions are executed in the networking configuration module.


5 DLIT — A Distributed Ledger for Efficient BIoT

To confront the BIoT efficiency issues related to scalability, energy efficiency, and security, this thesis presents DLIT (Distributed Ledger for IoT Data), which enables a novel hybrid and sharded consensus mechanism by applying TX aggregation on validated TXs. The design and implementation of DLIT is based on a PoS-based BC named Bazo, introduced in Section 2.5.5. This Chapter elaborates DLIT's design and features, i.e., (i) sharding and inter-shard communications, (ii) TX aggregation, (iii) the hybrid consensus mechanism, and (iv) GDPR compliance, and finally discusses DLIT's performance evaluation.

5.1 DLIT’s Design and Implementation

Bazo chooses validators proportionally to the amount of stake each validator owns and to the time waited by validators, influenced by a randomization factor. While PoS-based BCs are known to be more scalable than PoW BCs, Bazo's approach did not offer any specific scalability enhancement for BIoT use cases. Thus, throughout the course of this thesis, Bazo's design was modified and improved by introducing new TXs and novel features, resulting in a different public-permissioned version of it, i.e., "DLIT".


5.1.1 IoT Data Transactions

Since the main focus of DLIT is preserving IoT data, a specific data TX is dedicated to this purpose. The data TX (DataTx) can be used to send data from one address to another and store this TX inside the DL. The data sent inside data transactions can have many different forms, e.g., strings or serialized Java objects. By converting these data formats into a unified bytestream format, TXs can be stored in the same way with no extra overhead. When the data is retrieved again, the bytestream can be augmented with the information about the original format, and the original data can be recreated. DLIT dedicates the following fields to DataTxs; the structure of a DataTx is presented below. It is important to mention that this TX can be designed such that it does not include a field for transferring funds, since in private BCs such as DLIT, users may not be willing to pay for every data item sent by their IoT devices in a BIoT use case, e.g., SCT. Thus, DataTxs shall not transfer any funds. Furthermore, the data field does not have a specific length because it can be dynamic, while being limited by the block size. Therefore, the function Size() calculates the data size of a DataTx. This field cannot be static like in other types of TXs, but has to be dynamic and adjusted according to the size of the data transmitted.

Listing 5.1: Structure of a DataTx [51]

// IotTx is the DataTx structure used by DLIT to persist IoT data.
type IotTx struct {
    Header byte
    TxCnt  uint32
    From   [32]byte
    To     [32]byte
    Sig    [64]byte
    Data   []byte
    Fee    uint64
}

// Size returns the dynamic size of the TX, accounting for the variable-length
// Data field (requires the "unsafe" package for Sizeof).
func (tx *IotTx) Size() uint64 {
    size := int(unsafe.Sizeof(*tx)) + len(tx.Data)
    return uint64(size)
}

5.1.2 Aggregated Transaction (AggTx)

DLIT designs a new TX, i.e., the AggTx, based on [167], specifically to aggregate and sum up matching funds TXs, DataTxs, and older aggregated TXs. Instead of these TXs, an AggTx is listed inside a block. The AggTx is added to blocks similarly to other TXs; they are listed in the AggTxData slice. The fields inside an AggTx are as follows.


Amount: the summed-up amount of all TXs aggregated inside this AggTx.
Fee: the fee of this TX. It is set to 0 because, at the time of writing, users should not be charged for this type of TX.
From: a slice storing the addresses of all senders whose TXs are aggregated in this AggTx.
To: the counterpart of the From field, filled with the addresses of the TXs' receivers.
AggregatedTxSlice: this slice is of type [][32]byte and stores all hashes of the TXs aggregated inside this AggTx.
Aggregated: a boolean variable that indicates whether the TX is aggregated.
Block: in this field, the hash of the block in which this TX is aggregated for the first time is stored.
MerkleRoot: the root of the Merkle tree to ensure integrity and the correct order of the TXs aggregated in this AggTx.
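
Following the style of Listing 5.1, these field descriptions could be represented in Go roughly as follows; this is a sketch derived from the descriptions above, and the exact types in the DLIT code base may differ.

// AggTx aggregates matching funds TXs and DataTxs (sketch derived from the
// field descriptions; exact types in DLIT may differ).
type AggTx struct {
    Amount            uint64     // sum of all aggregated TX amounts
    Fee               uint64     // currently always 0
    From              [][32]byte // sender addresses (length 1 if aggregated by sender)
    To                [][32]byte // receiver addresses (length 1 if aggregated by receiver)
    AggregatedTxSlice [][32]byte // hashes of all TXs aggregated in this AggTx
    Aggregated        bool       // true once this AggTx itself has been aggregated
    Block             [32]byte   // hash of the block that first contained this AggTx
    MerkleRoot        [32]byte   // ensures integrity and ordering of the aggregated TXs
}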

Different TXs can be aggregated in two different ways. In any case, it is not possible to combine an already aggregated TX aggregated by the sender with another one aggregated by the receiver address. This results in either the From or the To slice having a length of one. If TX aggregation is done based on the sender address, all TXs sent by that specific wallet are aggregated into one AggTx. Hence, the From slice has a length of one, as only one sender is included. On the other hand, TXs can also be aggregated by the receiver, and all TXs sent to one specific wallet are aggregated into one AggTx. Here, the To slice has a length of one, since all TXs are sent to one specific receiver.

When a miner combs through all open TXs, it tries to aggregate as many open funds TXs as possible according to these two rules. If two or more TXs can be aggregated, their TX hashes are written to the AggTx's AggregatedTxSlice and the TXs' boolean Aggregated is set to true.

In the next step, the miner checks already closed blocks for TXs (either FundsTx or AggTx) which match the chosen pattern (either aggregated by sender or by receiver) and are not aggregated yet. If such historic TXs exist, they are also added to the AggTx. However, they do not influence the BC state during the post-validation of a block anymore.

In Figure 5.1, the aggregation process is visualized schematically. The letters are wallets and the numbers are the amount of coins sent. All open TXs are listed on the right side. In this example, the historic aggregation is omitted for simplicity and only one of various possibilities is shown.

Algorithm 1 is designed to group the TXs in an optimal way, such that the fewest TXs are listed in the block while the most TXs are validated. As shown in Figure 5.1, without TX aggregation all these open TXs compete for a place in the current block 103. When the block size is assumed to be limited to five TXs, with aggregation there is still room for one more TX, whereas without TX aggregation not even all nine TXs can be validated in the current block. This is possible because DLIT only writes the hashes of TXs inside its blocks. Consequently, with aggregation only the hashes of the two aggregated TXs and the two normal funds TXs, which cannot be aggregated in this block, are stored in the block's body.

Figure 5.1: Transaction Aggregation Concept [167]

In this design, miners do not earn a specific fee for aggregating, but they still receive all the fees belonging to the aggregated funds TXs, i.e., FundsTx. This results in miners wanting to validate as many FundsTx as possible and thus earning as much as feasible. The more TXs they can aggregate, the more TXs are in a block and the higher the reward they get. Since the block's size does not grow with every TX, they can add more TXs to one block.

Since only valid FundsTx and already validated AggTx can be aggregated, the aggregation takes place after a FundsTx is characterized as a valid TX. Instead of adding this valid TX directly into the block's body, it gets added to a temporary slice. Transactions in this slice are sorted by the sender's address and then by the TX counter.

All different senders and receivers are stored in two maps. The number of occurrences in all open TXs, which can possibly be aggregated, is used as the value. These maps are called diffS and diffR in Algorithm 1. This approach finds the best combination of how TXs will be aggregated (either by the sender or by the receiver address). The getMaxSenderReceiver(diffS, diffR) function returns the sender and the receiver with the most occurrences.

Algorithm 1 groups TXs in such a way that open TXs either have the same sender or the same receiver. The output is a new slice of TXs which matches the rules defined for TX aggregation. The selected TXs are stored in the txToAggregate slice and aggregated with the function on line 19. Then the algorithm removes this group of TXs from possibleTxToAggregate and recalculates diffS and diffR. A miner repeats these steps until possibleTxToAggregate is empty. This ensures that the maximal number of TXs is validated in a block.
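
A simplified Go sketch of this selection step is given below; it reuses the IotTx type from Listing 5.1 for illustration, while the actual implementation in [205] operates on FundsTx and differs in detail.

// getMaxSenderReceiver returns the sender and receiver addresses occurring
// most often among the open TXs, together with their occurrence counts.
func getMaxSenderReceiver(diffS, diffR map[[32]byte]int) (maxS, maxR [32]byte, cntS, cntR int) {
    for addr, c := range diffS {
        if c > cntS {
            maxS, cntS = addr, c
        }
    }
    for addr, c := range diffR {
        if c > cntR {
            maxR, cntR = addr, c
        }
    }
    return
}

// pickGroup selects the next group of open TXs to aggregate: either all TXs
// from the most frequent sender or all TXs to the most frequent receiver,
// whichever group is larger.
func pickGroup(open []*IotTx, diffS, diffR map[[32]byte]int) (group, rest []*IotTx) {
    maxS, maxR, cntS, cntR := getMaxSenderReceiver(diffS, diffR)
    for _, tx := range open {
        if (cntS >= cntR && tx.From == maxS) || (cntS < cntR && tx.To == maxR) {
            group = append(group, tx)
        } else {
            rest = append(rest, tx)
        }
    }
    return
}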

In order to keep the scalability improvements enabled by TX aggregation, DLIT aggregates any number of DataTxs into an AggDataTx based on the sender or receiver, thus reducing the space occupied by TX headers in DLIT. The number of aggregated TXs in one AggTx is only limited by DLIT's block size. TX aggregation is especially empowering for a BIoT use case where multiple TXs are sent from a constant set of IoT nodes.

5.1.3 Double Linked Blockchain

The concept of a double-linked BC results from the idea based on [167] to remove all TXs from a block once all of them are aggregated in a later validated block (as shown in Figure 5.2). This contradicts the immutability of BCs, where all validated blocks are kept unchanged on the chain.

In DLIT, the block hash is calculated from various block-related input fields. One of these variables is the Merkle root, which ensures TX verification. It ascertains that TXs can neither be added to nor removed from a block, nor can their ordering be changed once a block hash is created. Thus, removing TXs is only possible when an additional new block hash is calculated for every block, because the old hash becomes invalid as soon as some TXs are removed. This new hash, called HashWithoutTransactions, is always used when the normal hash becomes invalid. The normal hash becomes invalid because the Merkle root changes.

The goal, and also the specification, of the double linking is that at least one link to the previous block is valid. In addition to the common variables included in a block, the following fields are added with respect to the double linking of the BC:
Aggregated: indicates whether a block is aggregated and therefore does not contain any TXs anymore. It is of type boolean.
HashWithoutTransactions: this hash is used once all TXs from a specific block are aggregated. It can be calculated when not taking the TXs into account and, as a consequence, assuming an empty block. Thus, it is only possible to empty a block once all TXs are aggregated and removed. It is of type [32]byte.
PrevHashWithoutTransactions: this field links the current block to the previous one once all TXs in the previous block are aggregated.
ConflictingBlockHashWithoutTx1: HashWithoutTransactions of the first conflicting block.
ConflictingBlockHashWithoutTx2: HashWithoutTransactions of the second conflicting block.


In Figure 5.2, the concept of a double-linked BC is illustrated. Every block, except the genesis block, can either be in the storage Blocks With TX or Blocks Without TX. These two versions of a block are indicated with block-name w/ (including TXs) and block-name w/o (without TXs, meaning this block never contained TXs or all of them have been aggregated by now). The increasing block number indicates which block is the ancestor, and the arrows point to them. Every block, except the genesis block, can contain TXs. These TXs are sent from clients to the network, which is indicated on the left side as incoming TXs. Transactions are read as FundsTx senderAddress=>receiverAddress or, when aggregated, as AggTx senderAddress=>receiverAddress. This happens as a historical aggregation in block 103 or while mining a new block, denoted as the incoming AggTx also included in block 104. As FundsTx B=>D reaches the network, the miners search already validated blocks for other FundsTx or AggTx with either the same sender or the same receiver.

In this example, FundsTx B=>C in block 102 can be aggregated, which leads to the case where all TXs in block 102 are aggregated. Once all TXs are aggregated and the block is out of the exclusion zone, it can be transferred to the storage without TXs. The exclusion zone is defined as the current block height minus NO_EMPTYING_LENGTH, which ensures that the user-defined NO_EMPTYING_LENGTH most recent blocks are not moved even though all their TXs are aggregated. That is, only blocks with a block height smaller than currentBlockheight - NO_EMPTYING_LENGTH are moved. Block 105 is not emptied yet, despite the fact that it does not contain any TXs. Once block 105 is out of the exclusion zone, it will be transferred to the Blocks Without TX storage. With a NO_EMPTYING_LENGTH of 2, block 105 can be moved once a block with height 108 is appended to the chain.
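
The exclusion-zone condition can be captured in a short Go sketch; the function and parameter names are illustrative.

// canBeEmptied reports whether a fully aggregated block may be moved to the
// "Blocks Without TX" storage, i.e., whether it has left the exclusion zone.
func canBeEmptied(blockHeight, currentHeight, noEmptyingLength uint64, allTxAggregated bool) bool {
    return allTxAggregated &&
        currentHeight > noEmptyingLength &&
        blockHeight < currentHeight-noEmptyingLength
}

With NO_EMPTYING_LENGTH = 2, block 105 satisfies this condition only from block height 108 onwards (105 < 108 - 2), matching the example above.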

When block 102 is emptied, it is moved to the Blocks Without TX storage, and the hash HashWithoutTransactions becomes valid. Therefore, block 103 is no longer linked to block 102 via the Previous Hash, but via the PrevHashWithoutTransactions. It has to be ensured that one link between two consecutive blocks is always valid. In Figure 5.2, this is indicated with the black arrows between blocks. This results in a valid chain indicated with a grayish background color, whereas all other blocks are only there for visual purposes and therefore slightly faded.

5.1.4 Consensus and Sharding

To develop a scalable mechanism for IoT use cases, DLIT is designed as a DL for small-to-medium-sized networks with the goal of reducing inter-shard communications for higher scalability. The design and implementation of sharding in DLIT started with the work of [43] and was completed later in [51]. Comparable to related work, DLIT is designed in two layers, i.e., (i) the Committee (CM) and (ii) the Validators (Vs). The novelty of the DLIT design includes roles which moderate the power entities can obtain. While reducing the inter-shard communication dependency of the entire DL, DLIT materializes a highly reliable validation process monitoring the integrity of the full DL.

Figure 5.2: Double Linked Blockchain Concept in DLIT [167]

The DLIT CM is an entity whose members do not participate in the validation process performed within shards. For an efficient TX management, the CM assigns TXs to Vs in shards and performs the necessary validation and authentication steps to make DLIT a trustworthy DL. Vs are provided with the TXs which they need to validate at a particular step in the DL. Thus, DLIT Vs do not have their own TX database, and they do not communicate with other shards' Vs. Therefore, less redundancy at the shard level is reached. Moreover, TX rebroadcasts are not necessary anymore, since Vs are already provided with the TXs they need. Thus, a global view of TXs inside shards is obsolete. DLIT Vs no longer have to fetch, store, and delete TXs that they never validate. Two CM entity types exist: CM members and leaders. A leader exists in shards, too, who is the leader for one epoch, elected by the CM in a randomized leader assignment mechanism during the previous epoch. The CM leader is responsible for assigning TXs to shards and also for running a BFT consensus inside the CM to come to an agreement if malicious behavior is detected in DLIT. Meanwhile, all CM members validate the DL. In order to restrict the CM leader's power at a certain height, the CM leader of epoch n is responsible for the BFT of epoch n − 1. If a user joins the V pool, he automatically becomes a V, with the chance of becoming the leader in every following epoch. If a user joins the CM pool, he automatically becomes a CM member, with the chance of becoming the leader in every following epoch, too.

5.1.4.1 Processes on Transactions

Since Vs do not have a global view of all TXs anymore, they rely on the CM to supply them with a portion of the total TX pool containing TXs that have not been validated yet. As shown in Figure 5.3-a, at the beginning of each epoch, the CM evaluates the number of Vs in the system and assigns to each of them a partition of the pool of open TXs according to the public address of the TXs' sender. In order to do that, a new data type, the TX assignment, is introduced in DLIT. It includes all TXs from the global TX pool that have been assigned to a specific V at the time of the creation of this assignment. Shifting the responsibility for the TX sharding to the CM avoids multiple executions of the same task in each shard and effectively makes Vs light-weight.

Figure 5.3: Transaction Assignment and Validation in DLIT [51]

After a block is mined, it contains a set of TXs according to the TX aggregation mechanism. The mined block is broadcast to the network to be validated by the CM members. Moreover, the state transition is sent out to the entire network for synchronization and validation purposes. When the CM receives the mined block, it removes the validated TXs from the open pool (red circles) and re-assigns the remaining TXs (green circles) to the miners in shards. DLIT's TX assignment and management mechanism at a V fills up the local mempool of the V with TXs once the assignment has been performed. The V chooses the optimal combination of TXs to be added to the block (green circles). After the block's creation, both the TXs that were validated and the ones that were not are deleted from the V's mempool (cf. Figure 5.3-b). The CM, upon reception of the block from the Vs, fetches all TXs which were included in the block and writes them to the closed TX pool. The same set of TXs is also deleted from the open TX pool. E.g., in a 3-shard, 1-CM configuration, first the global open TX pool is loaded into the memory of the CM leader. In turn, the entire pool is divided into 3 different partitions, each representing one shard. Those partitions are sent by the CM leader to the Vs assigned to each shard. The TXs validated inside the block are deleted from the mempool. All remaining TXs are sent out again to all assigned Vs. By keeping track of all TXs which were not validated by Vs, the CM does not have to force Vs to validate all TXs that were assigned. TXs not validated will be taken into the TX assignment mechanism at the next block height again.

Figure 5.4: Inter‐shard Synchronization in DLIT [200]

5.1.4.2 Communication and Synchronization

DLIT keeps the communication and networking overhead as low as possible. Thus, a light-weight synchronization mechanism reduces inter-shard communications. Moreover, the communication between the CM and the Vs has been designed to be minimal. DLIT requires the V-CM synchronization to keep a global view of open TXs and to decide which V validates which TXs at a certain height. Without synchronization, malicious behavior, such as double spending, would be possible.

Inter-Shard Communication and Synchronization:
DLIT could distribute the task of aggregating state transitions into a global state to all shards; thus, the same task would be performed by all Vs. Also, all state transitions would have to be sent to all other Vs. If the total number of Vs is denoted as n, the basic sharding mechanism leads to a total of n·(n−1)/2 connections that have to be active. Moreover, a total of n·(n−1) messages have to be sent inside the network at each block height. This leads to a complexity of O(n²) messages in the system at each block height, which DLIT avoids.

DLIT solves this problem by fixing the epoch length to one. Thus, state transitions are created and distributed by all Vs, and they are applied as well as aggregated by the leader before creating the epoch block. The last step of deleting TXs that were validated by other Vs is omitted, because the leader does not see those TXs in the mempool. Note that, since only the leader V has to receive state transitions, the number of messages for each block height is reduced to O(n). This mechanism guarantees that the shard leader is up to date after receiving all state transitions. Exactly for this reason, it is the leader's task to create the epoch block with the updated global state in it, such that the other shards and the CM synchronize the global state as well, along with the shard and CM leader assignment.
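
For illustration (not a measured figure): with n = 100 validators, the all-to-all scheme would require 100 · 99 = 9,900 state-transition messages per block height, whereas the leader-based scheme only requires the 99 messages sent to the shard leader.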

DLIT only requires all shards to communicate with shard 1. In comparison to a regular sharding mechanism, where all miners need to send their state to each other, DLIT requires a reduced amount of communication. Figure 5.4 illustrates the communication between 4 shards in DLIT. Dashed lines indicate inter-shard communications which are not required by DLIT. Clearly, fewer communications result in a decreased probability of packet loss and a faster synchronization of miners. Hence, DLIT is less affected by unstable networks.

Inter-Committee Communication and Synchronization:
Like shards, CM members synchronize with each other to work on identical data and to keep their mempools synchronized. It should be irrelevant for shards how many members the CM has, because the CM acts as a synchronized and unified entity. The inter-CM communication designed enables the CM to decentralize power by running the validation task not only on one node but on all CM nodes in parallel, thus coming to a unified and definitive decision in the end. To achieve that, all CM members receive the TX assignment from the CM leader, such that every member has identical data to work with while validating. During the validation process, each CM node performs identical steps using identical data, which will, in the end, help them find a consensus on which nodes acted maliciously. This is reached since CM members send out a CM check, containing information about which nodes they identified as malicious, and only the leader has to process those messages. As for inter-shard communications, it is a major concern to keep the total amount of required messages for inter-CM communications as low as possible. Thus, only the leader sends out the TX assignment to other members, and only the leader receives CM check messages from other members, leading to a complexity of O(n) sent messages, as opposed to O(n²), with n denoting the number of nodes in the system.

Validator-Committee Communication and Synchronization:
Communication between the CM and the Vs is grouped into two types: (i) sending out TX assignments and (ii) validating blocks and the epoch block. At each block height, Vs have to wait for the CM's TX assignment. The reception of this TX assignment is the starting point for mining a new block. Once the CM has received and validated all blocks, it can build the new global open TX pool with the remaining, not yet validated TXs. After the reception of the epoch block with its V-to-shard assignment, it is ready to partition the global open TX pool according to these Vs again. An epoch block with height h can only arrive at the CM if all blocks in the epoch with height h have been mined and after all TX assignments have been received. The synchronization happening at these communications guarantees that the Vs and the CM are always at the same block height. Figure 5.5 shows a simplified state machine for DLIT's CM-V communications. A transition to the next state is depicted by a solid arrow, while messages are represented by dashed arrows. Upon incoming messages, the machine halts until all required incoming messages have arrived. Effectively, these communications at both state machines keep them synchronized. E.g., by construction it is not possible to send two consecutive TX assignments before a block for the first TX assignment has been mined and sent to the network. Finally, the V-CM communication is used for updating the closed mempool. Since all CM members validate all blocks, they can write the TXs which were contained in blocks to the closed mempool. To synchronize the global state of the Vs with the global state of the CM and to communicate the new shard and CM leader assignments, all CM members validate epoch blocks.

[Figure 5.5 depicts a simplified state machine with the states Transaction Assignment Creation, Block Validation, Epoch Block Validation, Block Creation, State Transition Creation/Aggregation, and Epoch Block Creation/Accepting Validators for the Committee Members and the Validators, connected by the messages sending the transaction assignment, sending the block, sending the state transition, and sending the Epoch Block.]

Figure 5.5: DLIT's Validator‐Committee Communication and Synchronization [51, 200]

5.1.4.3 Authentication and Validation

Authentication and validation were two of the major concerns during the design of the committee in DLIT. In this regard, DLIT needed to guarantee the authenticity of communications and validate the TXs.


Authentication Strategy: DLIT state transitions, just like blocks and epoch blocks, have to be signed using the Vs' private commitment keys via an RSA signature mechanism. Using the public commitment key, which is publicly available in the DL state, the validity can be checked. In a similar fashion, inter-CM communications and the shard-CM communication are equipped with signatures and checks, such that no malicious node inside or outside DLIT can assume a wrong identity. Therefore, the authenticity concerns of communications are settled.
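
A minimal Go sketch of such RSA-based signing of a serialized state transition is given below; the digest (SHA-256) and padding (PKCS#1 v1.5) are assumptions for illustration and may differ from the scheme used in the DLIT code base.

package dlit

import (
    "crypto"
    "crypto/rand"
    "crypto/rsa"
    "crypto/sha256"
)

// signStateTransition signs the serialized state transition with the
// validator's private commitment key.
func signStateTransition(priv *rsa.PrivateKey, stateTransition []byte) ([]byte, error) {
    digest := sha256.Sum256(stateTransition)
    return rsa.SignPKCS1v15(rand.Reader, priv, crypto.SHA256, digest[:])
}

// verifyStateTransition checks the signature against the public commitment
// key stored in the DL state.
func verifyStateTransition(pub *rsa.PublicKey, stateTransition, sig []byte) error {
    digest := sha256.Sum256(stateTransition)
    return rsa.VerifyPKCS1v15(pub, crypto.SHA256, digest[:], sig)
}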

Validation Strategy: the CM checks (i) whether Vs included TXs in their block that were not assigned to them and (ii) whether the state transitions that they produced correspond to actually validated TXs. A DLIT CM member creates its own state transition based on the TXs that were put inside the block and compares it to the state transition which was produced by the shard. The check for validity of this state transition is crucial, since without checks any number of TXs could be included in the relative state, remaining undetected. This is because the block creation and state transition creation processes are decoupled. Therefore, even if the produced block is valid, it is possible that the state transition contains wrong information.

The task of the first DLIT shard, called the shard leader, is to aggregate all state transitions with its own state and to consolidate them in the epoch block. In turn, all other nodes in the network accept the global state from that epoch block. The validity (and authenticity) of the epoch block is, therefore, crucial for a trustworthy system. Since authenticity was already addressed by previous implementations, the validity check in DLIT is performed by the CM via aggregating all relative states from all shards in the system, creating an aggregated relative state, and comparing that relative state to the difference between the epoch block being checked and the epoch block of the previous height.

Edge Cases: it becomes evident that miners need a way to remember the shard IDs that they were assigned to for each block height, and requests should always be made using the correct shard ID for each height in order to avoid any confusion.

Epoch Block Finality: only the shard with shard ID 1 can produce the epoch block. Thereby, there is no race condition in the creation of epoch blocks anymore and there is immediate finality. Furthermore, this condition does not introduce any more vulnerabilities than the previous system, because the sortition mechanism for who can create the epoch block is still randomized in a similar fashion. The only disadvantage is that the PoS mechanism loses its weighting mechanism. That is, nodes with a higher stake are not more likely to mine the epoch block anymore. This problem, however, could be solved by redesigning the validator shard assignment process in the future. The advantages of immediate finality are diverse, but mostly they enable the BC to be more scalable, because rollbacks are time- and resource-consuming. Moreover, it enables processing TXs in quasi real-time, whereas in Bitcoin it takes up to 60 minutes until they can be considered final.


Algorithm 1: Splitting Valid Transactions for Aggregation: splitSortedAggregatableTransactions(block b) [205]

Input: block b
Output: slice of transactions which can be aggregated and block b, into which they belong

[1]  get possibleTxToAggregate from TempList
[2]  get diffS and diffR
[3]  sort possibleTxToAggregate by senderAddress and then TxCnt
[4]  for moreTxToAggregate do
[5]      maxSender, maxReceiver := getMaxSenderReceiver(diffS, diffR)
[6]      if maxSender >= maxReceiver then
[7]          forall tx in possibleTxToAggregate do
[8]              if tx.From == maxSender then append tx to txToAggregate
[9]              else keep tx in possibleTxToAggregate
[10]         end
[11]     else
[12]         forall tx in possibleTxToAggregate do
[13]             if tx.To == maxReceiver then append tx to txToAggregate
[14]             else keep tx in possibleTxToAggregate
[15]         end
[16]     end
[17]     remove txToAggregate from possibleTxToAggregate
[18]     recount diffS and diffR
[19]     AggregateTransactions(txToAggregate, b)
[20]     set txToAggregate to nil
[21]     if len(possibleTxToAggregate) == 0 then moreTxToAggregate = false
[22]     else moreTxToAggregate = true
[23] end

5.1.5 Consensus Design and Implementation

DLIT’s hybrid consensus is implemented in two parts [200], (i) on shards, where Vs areassigned to shards, and (ii) at the CM.

Part 1: DLIT implements the consensus mechanism in such a way that the shard with shard ID = 1 is the leader for the respective epoch. The leader has a special role inside the shard pool, as shown in Figure 5.6. Firstly, before being able to mine a block, the TX assignment has to arrive from the CM. After that, a block is mined. Then, the leader waits for all state transitions from the other shards (2 to N) and requests missing ones if necessary. For each state transition, it updates its local state by applying the relative state. After all state transitions have been processed, the leader has the updated current global state of the BC. As it is the only shard node with this global information, it then produces the epoch block containing the global state. Moreover, it produces a new random V-to-shard assignment and assigns a new random CM leader.

In contrast to the shard leader, the Vs in the other shards only have to wait for the TX assignment to mine a block and send out their state transition; they then wait for the epoch block to arrive so they can take over the new global state, along with the new shard assignment. From a shard's perspective, consensus is reached at this moment.

Figure 5.6: Interaction between the Shards in DLIT [51, 200]

Part 2: On the CM side, the CM leader is elected during epoch block creation in a randomized fashion. Unlike the V shard assignment, the other CM members do not need any additional information such as shard IDs, which the shards require to partition the TX pool. This is because the shards each perform a different task, while the validation process is the same for all CM members. Therefore, the CM leader is stored directly inside the epoch block, with no assignment for the other CM members. A CM leader has more responsibilities than the other CM members, namely running the BFT consensus mechanism among the CM members and assigning TXs to the shards based on PoS. However, it does not hold more power than the other members in the consensus mechanism. Figure 5.7 illustrates the workflow of communications in the CM at a high level.


Figure 5.7: Inter-Committee Communications in DLIT [51, 200]. The depicted workflow comprises the following steps: slashing consensus for the last epoch, sending out/receiving the transaction assignment, fetching blocks and block validation, fetching state transitions, state transition validation, fetching the epoch block and epoch block validation, and construction and distribution of the committee check.

At the beginning of each round, only the CM leader is active. It runs the BFT consensus mechanism, slashes malicious nodes of the previous round, and creates and sends out the TX assignments. While the other CM members have to wait until they receive all TX assignments before being able to continue, the CM leader proceeds directly to the next step because it already has them. From then on, all CM members behave in the same way: they start listening to, requesting, and authenticating blocks and state transitions in parallel.

This parallel execution has the advantage that the time for fetching the state transitions from the network can be saved, since fetching completes much faster than the main algorithm. During block validation, the validity of the PoS and of the contained TXs is checked. Moreover, a relative state is constructed, which is used to validate the state transitions in the next step, which is entered once both of the previous two steps are done.


The state transitions are then validated, followed by the epoch block. In the next step, the epoch block with its new state and assignments is accepted, and all CM members send out their CM checks containing the members who acted maliciously in the round. Those CM checks can then be used by the CM leader in the next round for the BFT consensus mechanism.

The slashing in DLIT is regulated through a BFT consensus mechanism and gives the system a way to combat malicious behaviour. All CM members validate the work of all Vs in parallel. At the end of each round, the CM members prepare a summary of all nodes for which they detected malicious behaviour. To do so, they create a CM check containing a slice of all V addresses that acted maliciously as well as a slice of all CM member addresses that acted maliciously in the epoch. The message is then signed and broadcast to the network. In the next epoch, the new CM leader collects all of those CM checks and, based on the complete set, performs the BFT consensus mechanism. The mechanism is based on voting, which means that, relative to the total number of CM members in the system, a certain share of them has to vote to slash a specific node in the BC. The percentage of CM members needed to slash a node is set to a default value of at least 66.67%.

Algorithm 2 describes the slashing process. First, the number of CM member votes required for slashing is calculated. Then, two maps, fineMapShards and fineMapCommittees, are instantiated for counting the votes. These two maps are kept separate so that the system can handle malicious behaviour differently based on the entity type of the offender. Then, all accounts in the current global state are evaluated in a loop. For each CM check which includes the address of the account currently being evaluated, the counter in the respective map is increased by one. After all accounts have been evaluated, the maps are checked for entries that have enough votes to be slashed. Inside the functions SlashShard(address) and SlashCommittee(address), the measures to slash the malicious nodes are implemented. Slashing in DLIT is implemented with a fine TX.

In an ideal case, where no node in the network acts maliciously, consensus is reached in the shards without active intervention by the CM. For all members of the BC, the epoch block serves as a way to synchronize the global state. If, however, malicious behaviour is detected by the CM, consensus has to be reached taking the slashing mechanism presented above into consideration, which includes issuing fine TXs.

5.1.6 IoT Data Storage Mechanism

Whenever a DataTX gets validated by a V, the CM takes the data from the TX and stores it in the persistent database. For each sender account, the CM maintains a data structure in the database called "data summary". The data summary consists of an Address field,


Algorithm 2: Running the BFT Consensus Mechanism: runBFTMechanism([]CommitteeCheck committeeChecks) [200]

Input: []CommitteeCheck committeeChecks
Output: none

[1]  numOfCommittees := DetNumberOfCommittees()
[2]  requiredVotes := DetNumberOfVotersForSlashing(numOfCommittees)
[3]  create fineMapShards
[4]  create fineMapCommittees
[5]  ...
[6]  forall acc in storage.State do
[7]    if acc.IsStaking then
[8]      forall cc in committeeChecks do
[9]        if containsAddress(cc.SlashedShards, acc) then
[10]         fineMapShards[acc] += 1
[11]       end
[12]     end
[13]   end
[14]   if acc.IsCommittee then
[15]     forall cc in committeeChecks do
[16]       if containsAddress(cc.SlashedCommittees, acc) then
[17]         fineMapCommittees[acc] += 1
[18]       end
[19]     end
[20]   end
[21] end
[22] forall address, votes in fineMapShards do
[23]   if votes >= requiredVotes then
[24]     SlashShard(address)
[25]   end
[26] end
[27] forall address, votes in fineMapCommittees do
[28]   if votes >= requiredVotes then
[29]     SlashCommittee(address)
[30]   end
[31] end
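For illustration, the vote-counting core of Algorithm 2 can be expressed in Go as follows. The types and the two-thirds threshold rule are assumptions made for this sketch; the actual DLIT types and the slashing fine TX are not shown.

package main

import "fmt"

// CommitteeCheck summarizes the nodes a CM member observed acting maliciously.
type CommitteeCheck struct {
	SlashedShards     []string // addresses of misbehaving validators (Vs)
	SlashedCommittees []string // addresses of misbehaving CM members
}

// Account is a simplified view of an account in the global state.
type Account struct {
	Address     string
	IsStaking   bool
	IsCommittee bool
}

// runBFTMechanism counts the votes contained in the CM checks and reports every
// node that at least two thirds of the CM members accused (cf. Algorithm 2).
func runBFTMechanism(checks []CommitteeCheck, state []Account, numOfCommittees int) {
	requiredVotes := (2*numOfCommittees + 2) / 3 // ceil(2/3 * n); assumed threshold rule

	fineMapShards := map[string]int{}
	fineMapCommittees := map[string]int{}

	for _, acc := range state {
		for _, cc := range checks {
			if acc.IsStaking && contains(cc.SlashedShards, acc.Address) {
				fineMapShards[acc.Address]++
			}
			if acc.IsCommittee && contains(cc.SlashedCommittees, acc.Address) {
				fineMapCommittees[acc.Address]++
			}
		}
	}

	for addr, votes := range fineMapShards {
		if votes >= requiredVotes {
			fmt.Println("slash shard validator (issue fine TX):", addr)
		}
	}
	for addr, votes := range fineMapCommittees {
		if votes >= requiredVotes {
			fmt.Println("slash CM member (issue fine TX):", addr)
		}
	}
}

func contains(addresses []string, addr string) bool {
	for _, a := range addresses {
		if a == addr {
			return true
		}
	}
	return false
}

func main() {
	state := []Account{{Address: "v1", IsStaking: true}, {Address: "c1", IsCommittee: true}}
	checks := []CommitteeCheck{{SlashedShards: []string{"v1"}}, {SlashedShards: []string{"v1"}}, {}}
	runBFTMechanism(checks, state, 3) // v1 receives 2 of 3 votes and is slashed
}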


Algorithm 3: Updating the Data Summary: UpdateDataSummary(DataTx DataTxs) [200]

Input: []DataTx DataTxs
Output: error err

[1]  ...
[2]  initialize updateMap
[3]  initialize dataSummary
[4]  ...
[5]  forall dataTx in DataTxs do
[6]    updateMap[dataTx.From] = append(updateMap[dataTx.From], dataTx)
[7]  end
[8]  forall sender, dataTxSlice in updateMap do
[9]    if ∄ dataSummary for sender in Database then
[10]     create dataSummary
[11]     forall dataTx in dataTxSlice do
[12]       dataSummary.Data = append(dataSummary.Data, dataTx.Data)
[13]     end
[14]     ...
[15]     write dataSummary to Database
[16]   else
[17]     fetch dataSummary for sender from Database
[18]     forall dataTx in dataTxSlice do
[19]       dataSummary.Data = append(dataSummary.Data, dataTx.Data)
[20]     end
[21]     write dataSummary to Database
[22]   end
[23]   ...
[24] end
[25] return nil

i.e., the sender's address of the data, and a Data field, i.e., a slice of bytestreams. It is of the type [][]byte and stores the data that the sender sent to the network.

Whenever a block which contains a DataTX is validated at the CM, the function UpdateSummary(DataTxs) is called, as shown in Algorithm 3. The argument DataTxs contains all DataTxs from the respective block. IoT data is stored in the data summary, together


with the address of the sender as a key to access the storage slice. Thus, a global and updated view of the data in the BC is kept at all times.

In order to be as efficient as possible, UpdateSummary(DataTxs) operates as follows. A map is created with the sender address as the key and all of the sender's DataTxs to be added, collected in a slice, as the value. This map is successively filled by iterating through the whole slice of DataTxs to be added. Next, the database is iterated by sender and not by TX, minimizing the number of database calls. In each round, the database is checked for whether there is already a data summary for the sender; if yes, it is fetched, otherwise a new data summary is created. Then, iterating through all DataTxs from the respective sender, the bytestreams contained in the individual DataTxs are appended to the slice in the data summary. After all DataTxs have been handled, the data summary is written back to the database.
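A compact Go sketch of this procedure is given below. The DataSummary struct follows the Address/Data description above; the persistent database is replaced by an in-memory map, and all names are illustrative rather than taken from the DLIT code base.

package main

import "fmt"

// DataTx is a simplified data transaction (only the fields needed here).
type DataTx struct {
	From string
	Data []byte
}

// DataSummary mirrors the structure described in Section 5.1.6: the sender's
// address and a slice of bytestreams ([][]byte).
type DataSummary struct {
	Address string
	Data    [][]byte
}

// updateDataSummary groups the DataTxs of a block by sender and appends their
// payloads to the per-sender summaries, touching each "database" entry only
// once per sender (cf. Algorithm 3). The map db stands in for the persistent store.
func updateDataSummary(db map[string]*DataSummary, dataTxs []DataTx) {
	updateMap := map[string][]DataTx{}
	for _, tx := range dataTxs {
		updateMap[tx.From] = append(updateMap[tx.From], tx)
	}
	for sender, txs := range updateMap {
		summary, ok := db[sender]
		if !ok {
			summary = &DataSummary{Address: sender} // no summary yet: create one
		}
		for _, tx := range txs {
			summary.Data = append(summary.Data, tx.Data)
		}
		db[sender] = summary // write the summary back to the "database"
	}
}

func main() {
	db := map[string]*DataSummary{}
	updateDataSummary(db, []DataTx{
		{From: "sensor-1", Data: []byte("21.5C")},
		{From: "sensor-1", Data: []byte("21.7C")},
		{From: "sensor-2", Data: []byte("48%")},
	})
	fmt.Println(len(db["sensor-1"].Data), "entries for sensor-1") // prints: 2 entries for sensor-1
}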

5.2 Adhering to GDPR

With the increase of user data misuse by third parties, international society has realized the potential threats imposed by the lack of user privacy regulations. In 2018, the European Union (EU) responded to this public concern by enforcing the European General Data Protection Regulation (GDPR), which regulates the relationship between a user (data subject) and an organization (service provider) that stores any form of the user's personal data. The GDPR intends to grant substantial rights to data subjects. Especially, by declaring Article 16 [79] on the "right to rectification" and Article 17 [80] on the "right to be forgotten", the GDPR empowers individuals to request the "immediate deletion" or "immediate rectification" of their personal data. In case of a violation, substantial fines apply to the responsible service provider. The GDPR applies to any service provider storing or processing the personal data of individuals residing in the EU [260].

A BC's data storage is distributed among its users, especially the miners who validate the TXs and mine the blocks. Any data stored within a TX, once added to a BC, is accessible by all BC nodes. These nodes can, and in some cases even have to, store the full chain locally. Hence, a great challenge BIoT applications face is the coexistence of BCs with the GDPR when someone's personal data is inserted into a BC. An example scenario is the storage of health status data collected in a clinical study in a BC. Due to the public access to the data stored on the BC, user privacy is endangered. To address such user-centric privacy issues, BIoT platforms are interested in complying with the GDPR. However, the coexistence of BCs and the GDPR is highly challenging due to the immutability of BCs.

The question that arises is: how can the rights to be forgotten and to rectification coexist with the BC's immutability? To address this question, this work presents a novel approach that adapts DLIT to be an updateable BC. As mentioned, DLIT is a public-permissioned


BC, designed for BIoT use cases. To adhere to the GDPR, this work's approach enables transparent and traceable data modification at the TX level. For that purpose, a particular hash function, i.e., the Chameleon Hash Function (CHF), is employed (cf. Section 5.2.2).

5.2.1 Related Work

Approaches to making BCs comply with the GDPR are very recent and fundamentally different. This work conducted a study on the state-of-the-art approaches, which yields two identified categories, i.e., structural and cryptographic, elaborated as follows.

5.2.1.1 Structural Approaches: Off-chain storage

One approach to harmonizing BCs with the GDPR is the application of off-chain storage [106, 49]. [106] proposes a solution where the storage of data is outsourced to an external, off-chain data storage. To this end, the way TXs are stored in the BC differs from traditional BCs. The TX's data payload is replaced with the hash of the original data, hash = h(dataorig). The dataorig is then stored in a trusted, off-chain data storage. This data storage is a key-value store and takes hash as key and dataorig as value.

Whenever a participant queries the data of a TX, it uses hash, which serves as the key for the external data storage. Upon reception of the external dataext, the client can assume the data is untampered whenever h(dataorig) = h(dataext) holds. When a data subject exercises its right to be forgotten, the individual or organization in charge of the central data storage performs a modification on the data. Due to the fact that the data is stored off-chain, a deletion does not affect the content of a TX and thus does not affect the BC's consistency [106].
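To illustrate the pattern (and not the concrete system of [106]), the following Go sketch stores only the hash on-chain and keeps the original data in a key-value store; deleting the off-chain value leaves the on-chain hash untouched. SHA-256 and all names are chosen for this sketch only.

package main

import (
	"crypto/sha256"
	"fmt"
)

// offChainStore is a stand-in for the trusted external key-value storage,
// keyed by the hash of the original data.
type offChainStore map[[32]byte][]byte

// ingest replaces the TX payload with h(dataOrig) and stores dataOrig off-chain.
func ingest(store offChainStore, dataOrig []byte) [32]byte {
	h := sha256.Sum256(dataOrig)
	store[h] = dataOrig
	return h // only this hash ends up in the TX on the BC
}

// fetchAndVerify retrieves the external data and checks h(dataExt) == hash.
func fetchAndVerify(store offChainStore, hash [32]byte) ([]byte, bool) {
	dataExt, ok := store[hash]
	if !ok {
		return nil, false // e.g., deleted after a "right to be forgotten" request
	}
	return dataExt, sha256.Sum256(dataExt) == hash
}

func main() {
	store := offChainStore{}
	h := ingest(store, []byte("patient record"))
	if data, ok := fetchAndVerify(store, h); ok {
		fmt.Printf("verified off-chain data: %s\n", data)
	}
	delete(store, h) // deletion affects only the off-chain store, not the BC
	_, ok := fetchAndVerify(store, h)
	fmt.Println("data still available:", ok)
}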

The biggest constraint of this approach is the centralized data storage. It compromises one of the defining characteristics of BCs: the distribution of data. While minimizing data storage on the BC is in line with the GDPR, the data hash, even after modification, remains a risk factor for non-compliance. The hash constitutes, when combined with other information, pseudonymous data, which does not meet the standard of anonymous data as required by the GDPR [192].

5.2.1.2 Storing TX Hashes

Omitting the TX content is another approach, for which no off-chain storage is required. For instance, [99] aims to reduce the overall BC size by employing a reward system that incentivizes BC users to compress or delete their data. [99] introduces multiple agent roles assigned to individual miners to handle the different tasks associated with compression, deletion, and rewards distribution. Thus, a centralized entity is responsible for auditing and verifying the agents mentioned above.


To reduce storage requirements and enhance privacy, [99] developed a new type of TX, dTX, that incorporates the deletion of a TX TXi. It is designed in a way that only the creator of the TX TXi can request the deletion of that TX. For this purpose, dTX features an input field holding the TX-id TXi.id. If a TX creator wants to delete it, he or she generates dTX and signs it with the same private key with which TXi was signed. Using a signing scheme, [99] verifies that the originators of TXi and dTX are identical. dTX is then broadcast to the network of miners. Upon receipt, a miner first has to query TXi by the given TXi.id. Querying TXi is performed in O(N), where N is the number of TXs in the BC. Since N typically grows over time, this linear scan yields a potential slowdown of the BC. To avoid performance degradation, miners locate TXs asynchronously in a so-called Cleaning Period (CP). The length and frequency of the CP are based on the application. The main idea behind a CP is, however, that the miners perform the deletions in batches (one after the other) to increase efficiency. The authors also propose that the batch removal can be performed by one miner only, who then broadcasts the updated chain to the network. Nonetheless, other miners can still perform the batch deletion on their own and compare their results to ensure data integrity. In order not to break hash consistency among blocks when deleting TXs, the block and TX structure has to be adjusted. In conventional BCs, the hash of the block includes the Merkle root of those TXs

MerkleRoot = MerkleTree(TX1 ∥ TX2 ∥ ... ∥ TXn)

where TXi is a single TX. To mitigate breaking hashes, [99] proposes a design where the Merkle root is generated over the stored hashes of the TXs

MerkleRoot = MerkleTree(TX1.h ∥ TX2.h ∥ ... ∥ TXn.h)

where TXi.h is the hash of TXi. Note that h is a field on the TX. This means that h is computed once when the TX is created, such that TXi.h = h(TXi). Consequently, during the calculation of the Merkle root, the TX hash is looked up rather than calculated at runtime. A deletion is then performed by clearing all fields on the TX except TX.h, ensuring consistent Merkle roots and thus block-hash consistency.
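The effect of building the root over stored TX hashes can be illustrated with a deliberately simplified sketch: the "root" below is just a hash over the concatenated, stored per-TX hashes instead of a full Merkle tree, which already suffices to show that clearing a TX's fields while keeping TX.h leaves the root unchanged. All names are illustrative and not taken from [99].

package main

import (
	"crypto/sha256"
	"fmt"
)

// TX stores its own hash h once at creation time, as proposed in [99].
type TX struct {
	From, To string
	Amount   uint64
	h        [32]byte
}

func newTX(from, to string, amount uint64) *TX {
	tx := &TX{From: from, To: to, Amount: amount}
	tx.h = sha256.Sum256([]byte(fmt.Sprintf("%s|%s|%d", from, to, amount)))
	return tx
}

// root hashes the concatenation of the stored TX hashes; a simplified stand-in
// for MerkleTree(TX1.h || TX2.h || ... || TXn.h).
func root(txs []*TX) [32]byte {
	var concat []byte
	for _, tx := range txs {
		concat = append(concat, tx.h[:]...)
	}
	return sha256.Sum256(concat)
}

func main() {
	txs := []*TX{newTX("A", "B", 5), newTX("A", "C", 7)}
	before := root(txs)

	// "Delete" the first TX: clear all fields except the stored hash TX.h.
	txs[0].From, txs[0].To, txs[0].Amount = "", "", 0

	after := root(txs)
	fmt.Println("root unchanged after deletion:", before == after) // true
}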

The absence of a centralized database, as opposed to the previous solution, is an improvement in [99]. Additionally, the modification is possible on a TX level, which enables fine-grained control. However, this approach can be criticized for its centralized agents managing different tasks, thus limiting its application to authorized nodes in a permissioned BC only. Further, it is unclear whether the delay of a deletion contradicts the GDPR requirement of an immediate deletion. Lastly, storing the hashes on TXs compromises verifiability, as a malicious participant could modify fields of the TXs without them standing out in the Merkle root computation.


5.2.1.3 Replacing Blocks

Replacing entire blocks goes in a similar direction as omitting the TX content. The protocol developed in [89] accommodates modifications in BCs via democratic voting as a means of mutating the BC state. Every BC client can propose an edit (delete or modification) operation. BC miners then vote on whether to accept or reject the change. [89] stores the Merkle root of the block in two fields, i.e., the old state and the Merkle root. This means a block Bi is connected to its predecessor Bi−1 by 2 links:

1. Bi.PrevOldState = Bi−1.OldState

2. Bi.Prev = Bi−1.MerkleRoot

When TXj stored in block Bi needs to be edited, a new block Bi′ is created such that:

1. Bi′ .PrevOldState = Bi−1.OldState

2. Bi′ .Prev = Bi−1.MerkleRoot

3. Bi′ .OldState = Bi+1.PrevOldState

4. Bi′ .MerkleRoot ̸= Bi+1.Prev

(1) and (2) remain unchanged because changes in a block are propagated to subsequent blocks only. (3) is enabled by the added field old state and ensures that at least one valid link exists between Bi+1 and Bi. (4) is the consequence of the change. Miners can now vote on the change. If a majority is reached, Bi is swapped with Bi′.

This approach relies on storing the hash of the original state to preserve hash consistency. The absence of a centralized agent and the application of democratic voting can be seen as improvements over the previous approach. However, this approach focuses on replacing entire blocks and thus does not allow fine-grained modifications on the TX level. Similar to [99], the hash of the data is stored rather than computed, resulting in compromises in verifiability.

5.2.2 Chameleon Hash Function (CHF)

The CHF belongs to the cryptographic category of approaches. A CHF is a special kind of hash function that makes use of the "collision" concept [159]. A CHF is associated with a public key (pk) and a private key. The private key is also referred to as the trapdoor key (tk). Without tk, the hash function is collision-resistant. Using tk, it is computationally feasible to find an input a′ for an already existing input a where a ̸= a′ and h(a) = h(a′), i.e., to generate a hash collision.

A CHF consists of a set of four algorithms, CHF = {Htrp, H, Hver, Hcol}, covering trapdoor/parameter generation, hashing, verification, and collision finding [39, 154]. When such a CHF is used for block redaction as in [39], one can observe that hi and ni remain the same before and after the redaction; thus, the link in the subsequent block, pi+1 = f(hi ∥ ni), remains valid [39].
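In compact notation, the standard chameleon-hash interface behind these four algorithm names can be summarized as follows; the exact argument lists are a common textbook formulation and an assumption here, not a quotation from [39] or [154].

\begin{align*}
  H_{trp}(1^{\lambda}) &\rightarrow (pk, tk) && \text{parameter and trapdoor generation}\\
  H(pk, m, r) &\rightarrow d && \text{hashing a message } m \text{ with check string } r\\
  H_{ver}(pk, m, r, d) &\rightarrow \{0, 1\} && \text{verification of a digest } d\\
  H_{col}(tk, m, r, m') &\rightarrow r' && \text{collision finding, i.e., } H(pk, m, r) = H(pk, m', r') \text{ for } m \neq m'
\end{align*}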


[39] focuses on the cryptographic aspects of the BC, making a centralized authority that stores data or manages roles unnecessary. Also, the fact that the content can be replaced completely (m and m′) allows a strict realization of the GDPR's requirement of erasure. However, similar to the approach in Section 5.2.1.3, it suffers from a lack of fine-grained control, as the modification is done at the block level. Additionally, the management of the trapdoor key, i.e., deciding which individuals are allowed to perform modifications, imposes new challenges [192].

5.2.2.1 Policy-based Chameleon Hashing

[88] improves the work in [39] by developing a cryptographic permission scheme enabling more fine-grained control over BC modifications. [88] proposes an attribute-based access control to implement policies on whether a user is authorized to perform a modification. To this end, boolean formulas are evaluated over a set of user attributes. If the user is in possession of tk and satisfies the boolean formula, he or she is authorized to perform the modification. Consequently, the notion of policy-based Chameleon Hashing (PCH) is introduced, which extends chameleon hashing by additionally taking an access policy as an input parameter. PCH works in such a way that hash collisions can only be generated by users satisfying the access policy and in possession of tk. The access policy can be defined by the user generating the chameleon hash of a resource.

Moreover, [88] addresses key management for users. In the approach of [39], the key pair (pk, tk) is created and associated on a per-user basis. As a consequence, the user can find hash collisions for all hashes generated with his/her tk. If tk is lost or stolen, a high security risk arises. To mitigate this risk, [88] generates a new pair (pk, tk) for every generated hash. Hence, an advantage over the previous solution is that the BC content can be modified on a TX level, allowing more fine-grained control. However, the improved access control adds an additional layer of cryptographic complexity to the already complex chameleon hashing scheme. Another disadvantage is that using a different key pair for each hash also demands a more sophisticated key storage.

5.2.3 Discussion and Requirements Specification

These different approaches are compared in Table 5.1 based on [260]. Based on these studies, the following seven factors are derived as high-priority requirements for a GDPR-compliant BC development to be considered in BIoT-based systems.

1. Modifications shall enable the modification or deletion of personal/IoT data stored in the BC.

2. A user shall only be able to request the modification of his/her own personal/IoT data.


3. Modifications shall be performed immediately, or in the worst case in less than 24 hours.

4. The BIoT-based system shall not depend on external services/systems (e.g., an external database) for modification.

5. Modifications shall be possible in a fine-grained way.

6. Modifications on the BC shall be transparent, recorded, and traceable.

7. Modifications shall not affect existing BC optimizations such as TX aggregation.

Table 5.1: Comparative Overview on Potential GDPR‐compliant Approaches [260]

Approach | Description | Level of Modification | Advantages | Disadvantages
Off-chain storage [106] | Data is stored off-chain in a centralized database | Transaction | Minimal tension with GDPR | 1. Compromises distribution of data. 2. Dependency on an external system
Storing transaction hashes [99] | Hash of a transaction is looked up rather than computed | Transaction | Simple approach, low complexity | 1. Compromises traceability. 2. Compromises auditability
Replacing blocks [89] | Miners vote on replacing blocks | Block | Respects a voting policy | Only possible to replace entire blocks
Chameleon hashing [39] | Hash collisions can be used to replace content | Block | Maintains traceability and auditability | 1. Complexity of cryptography. 2. Storage of the trapdoor key
Attribute-based chameleon hashing [88] | Adds fine-grained access control to chameleon hashing | Transaction | Enables fine-grained control as to who is able to perform an update | Access scheme adds an additional layer of complexity

5.2.4 DLIT’s GDPR-Compliant Design

The approach in this work adapts the DLIT BC to a GDPR-compliant version based on [260] by considering the seven requirements discussed in Section 5.2.3. While the rest of this section elaborates on the design and implementation of the approach taken in this thesis, Figures 5.8 and 5.9 represent the steps taken at the BC client and miner sides for the deletion and modification processes, i.e., the update processes.


5.2.4.1 Data Ingestion and Storage

As defined by requirement (1), it shall be possible to remove or modify personal data stored in the BC. In the absence of legal precedence as to what exactly qualifies as personal data [116], the scope and implications of a modification are not clear. As a consequence, an assumption is made in this work in order to keep the design flexible and compensate for this uncertainty. Hence, an abstraction layer is introduced in this work to hide the ambiguity of personal vs. non-personal data.

As in most other BCs, data in DLIT is ingested by several types of TXs, each with different functionality (e.g., account TXs and funds TXs). In this work, the TX data structure is chosen to implement the abstraction mentioned above. Hence, TXs initiated by users or IoT devices are extended with a new generic field called "Data" to hold (potentially) personal data subject to GDPR. Other fields of a TX (e.g., Amount or Fee) do not render personal data. In this work, only the content of the Data field is considered personal data. Users are able to pass along data with every TX they emit. Their data is stored in the designated Data field of the TX, broadcast among the network of miners, and eventually added to a block. Thus, requirement (1), i.e., the modification right, is mapped to the Data field of the TX data structure. This means that, to implement a modification request, the approach designed here updates only the Data field.

5.2.4.2 Replacing the Traditional Hash Function

If the Data field of a TX is updated due to a modification request, that TX's hash changes, which propagates further and causes a different Merkle root of the block the TX was included in. As a consequence, the change of a single TX hash would eventually break the hash consistency of the entire chain in a BC. While this is the intended behavior for other fields of the TX (e.g., the Amount field of a funds TX), modifications of the Data field can have a different behavior.

Requirement (2) implies that only the user who created the TX and passed personal data into a BC should be able to alter that data without changing the hash of the TX. Equally important, any other user shall not be able to change the Data field of that TX without changing its hash. To address this requirement, the SHA-3 hash function used previously in DLIT is replaced with a CHF.

With the approach in this work, whenever a new client account is issued, a set of CHF parameters is generated alongside. Thus, DLIT clients are modified to use these CHF parameters to hash the TXs they emit. Generating a hash for an input value requires the public hash key (cf. Section 5.2.2) and returns the digest together with a check string. The digest can then be recomputed with the input value, public hash key, and check string. As the CHF signature and return value differ from the hash function previously used in DLIT, a drop-in


replacement is not possible. For BC participants to be able to hash a certain TX, the public hash key and check string need to be publicly available. To achieve this characteristic, data structure adjustments were made on DLIT. Accordingly, all data-embedded TXs are extended with a field to store their generated check string. Moreover, DLIT client (user/IoT device) accounts are extended to store the CHF parameters generated in the account creation step. This ensures that a DLIT miner is now able to compute the hash of a TX, as it uses the check string stored on the TX and the client's CHF parameters stored in the account. With these updates, each DLIT client has its own set of CHF parameters and uses them to hash his/her/its TXs.

5.2.4.3 CHF Specification

The CHF parameters developed in this work define a set of 5 numbers as follows [259].

Listing 5.2: CHF Parameters

type ChameleonHashParameters struct {
	G  []byte // Prime
	P  []byte // Prime
	Q  []byte // Prime
	HK []byte // Public Hash Key
	TK []byte // Secret Trapdoor Key
}

parameters.TK = []byte{}

G, P, Q are special prime numbers created by the parameter generation algorithm. HK is the public hash key required to compute the CHF of an input value. TK is the secret trapdoor key enabling the generation of hash collisions. Just like the secret key of an RSA key pair, it should be stored safely and never shared with others. The approach designed in this work is implemented as follows. The CHF parameters are created during the account creation process and written to a file on the client's machine. As the CHF parameters also contain the public hash key, they need to be accessible to other entities as well. In particular, miners need to recompute TX hashes and thus need access to the client's public hash key. As the miners in DLIT have access to all BC accounts, the CHF parameters are attached to the client's account and included in the AccountTx. However, before they are attached, the secret tk is sanitized. When the AccountTx is sent to the network and the account is created, all miners have access to the client's public hash key.

Listing 5.3: Account Tx

type AccountTx struct {
	...
	Parameters  *crypto.CHFParameters
	CheckString *crypto.CHFCheckString
	Data        []byte
}


When a client issues a new TX, it creates a TX object, fills the fields of the TX with the given values, signs the TX, and submits it to the network of miners. The signature consists of the TX hash, signed by the client's private key. Before shifting to chameleon hashing, the client application used a conventional SHA-3 hash function to compute the TX hash. The message (input to the hash function) incorporates all fields of the TX. To that end, each TX implements a hash method.

An example is the Hash() method of FundsTx:

Listing 5.4: Hash() Function

func (tx *FundsTx) Hash() [32]byte {
	input := struct {
		Header byte
		Amount uint64
		Fee    uint64
		TxCnt  uint32
		From   [32]byte
		To     [32]byte
		Data   []byte
	}{tx.Header, tx.Amount, tx.Fee, tx.TxCnt, tx.From, tx.To, tx.Data}

	return sha3.Sum256([]byte(fmt.Sprintf("%v", input)))
}

As mentioned in Section 5.2.2, in contrast to traditional hash functions, e.g., SHA-2 and SHA-3, where hash generation and verification are done by (re-)computing the hash, a CHF performs hash generation and verification using two separate procedures. For that, the client needs to generate an initial check string.

Listing 5.5: Chameleon Hash CheckString

type ChameleonHashCheckString struct {
	R []byte
	S []byte
}

The check string is generated using the CHF parameters of the client. R, S are random numbers with an upper bound defined by the CHF's Q parameter.

Listing 5.6: CHF Parameter and CheckString

parameters := crypto.NewCHFParameters()
checkString := crypto.NewCHFCheckString(parameters)


Afterwards, the parameters, check string, and message (hash input) are used to compute the 32-byte chameleon hash.

Listing 5.7: CHF Computation

chameleonHash := crypto.CHF(parameters, checkString, message)

The message in this case is the SHA hash of the TX shown above. To keep check strings organized, they are stored directly on the TX they belong to. To that end, each TX is extended with a CheckString field.

Figure 5.8: Client Interactions with DLIT for Updating a Transaction [260]. The sequence involves the user, the client command line (CLI), the update component, the cryptography component, the consensus protocol, and the network component: the user invokes the update command, the update request is handled, a hash collision (new check string) is generated, the UpdateTx is created and submitted, and a confirmation is returned.

5.2.4.4 Performing a Modification

With the changes described in the previous section, the prerequisites are met for BC clients to perform modifications. Firstly, a DLIT client chooses one of its TXs to be modified. Subsequently, the TX's Data field can be modified (either cleared or overwritten). Lastly, the client needs to generate a hash collision for the contents of the original and modified TX. Generating a hash collision requires the secret tk, the old hash input, the new hash input, and the old check string. CHF parameters are issued on a per-account basis. As the secret tk always stays with its owner, only clients can generate hash collisions for their own TXs. Generating a hash collision returns a new check string associated with the modified TX content. When the TX is hashed with the new data, public hash key, and new check string, the same digest is returned as before the modification with the old data, public hash key, and old check string. At this point, the client has successfully modified the Data field of a TX and generated a hash collision. An attempt to generate a hash collision for a TX that does not belong to the client (e.g., the check string on the TX was not created by the client's CHF parameters) would fail in the sense that the resulting hashes would not match.
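To make the collision step concrete, the following self-contained Go program demonstrates the property with a classic discrete-log chameleon hash (Krawczyk/Rabin style) over toy parameters. This construction and the tiny numbers are purely illustrative; they are not the CHF scheme or parameters used in DLIT.

package main

import (
	"fmt"
	"math/big"
)

// Toy parameters: p = 23, q = 11 (q divides p-1), g = 2 generates the order-q
// subgroup. These values are far too small for real use and only serve to show
// the collision property described above.
var (
	p = big.NewInt(23)
	q = big.NewInt(11)
	g = big.NewInt(2)
)

// chameleonHash computes CH(m, r) = g^m * hk^r mod p, where hk = g^tk mod p.
func chameleonHash(hk, m, r *big.Int) *big.Int {
	gm := new(big.Int).Exp(g, m, p)
	hr := new(big.Int).Exp(hk, r, p)
	return gm.Mul(gm, hr).Mod(gm, p)
}

// findCollision uses the trapdoor tk to compute a new check string r2 such that
// CH(m1, r1) == CH(m2, r2), i.e. r2 = r1 + (m1 - m2) * tk^-1 mod q.
func findCollision(tk, m1, r1, m2 *big.Int) *big.Int {
	inv := new(big.Int).ModInverse(tk, q)
	diff := new(big.Int).Sub(m1, m2)
	r2 := new(big.Int).Mul(diff, inv)
	r2.Add(r2, r1)
	return r2.Mod(r2, q)
}

func main() {
	tk := big.NewInt(7)                    // secret trapdoor key
	hk := new(big.Int).Exp(g, tk, p)       // public hash key
	m1, r1 := big.NewInt(5), big.NewInt(3) // original data and check string
	m2 := big.NewInt(9)                    // modified data

	r2 := findCollision(tk, m1, r1, m2)                    // new check string for the modified data
	fmt.Println("old digest:", chameleonHash(hk, m1, r1))  // 16
	fmt.Println("new digest:", chameleonHash(hk, m2, r2))  // 16, identical digest
}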


5.2.4.5 UpdateTx

Until now, the scope of the modifications described in the previous section was limited to a client's local machine. In order for the modification to come into effect, updated TXs need to be sent to all distributed BC miners and applied to their local copy of the BC. In that context, the notion of UpdateTx is introduced in this work as a new TX type in the DLIT BC. An UpdateTx serves as a carrier for the modification. The motivation for using TXs to request data modification is twofold. Firstly, as identified in Section 5.2.4.1, TXs are the only way to interact with and alter the state of a BC; thus, this concept is used to modify the DLIT BC as well. Secondly, TXs provide auditability and traceability. Requirement (6) (cf. Section 5.2.3) states that BC participants shall be able to trace any modifications performed on the BC. Thus, an UpdateTx contains all the information about the modification on the BC and is mined in a block, just like any other TX. Evidently, UpdateTxs record the modifications in a transparent and traceable way.

The UpdateTx specifies the following fields:

• TxToUpdateHash: The hash of the TX a client wants to modify.

• TxToUpdateCheckString: The new check string the client computed to generate a hash collision.

• TxToUpdateData: The modified data to be set on TxToUpdate.

• Issuer: The account address of the client.

• CheckString: The check string for this UpdateTx.

• Data: The data field of this UpdateTx.

• Signature: The signed hash of this UpdateTx.
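Taken together, these fields suggest a TX structure roughly along the following lines, reusing the ChameleonHashCheckString type from Listing 5.5. The concrete field types are assumptions made for illustration and are not copied from the DLIT source.

// UpdateTx sketches the update transaction described above; the field types
// shown here are assumed for illustration only.
type UpdateTx struct {
	TxToUpdateHash        [32]byte                  // hash of the TX the client wants to modify
	TxToUpdateCheckString *ChameleonHashCheckString // new check string producing the hash collision
	TxToUpdateData        []byte                    // modified data to be set on TxToUpdate
	Issuer                [32]byte                  // account address of the requesting client
	CheckString           *ChameleonHashCheckString // check string of this UpdateTx itself
	Data                  []byte                    // data field of this UpdateTx
	Signature             [64]byte                  // signed hash of this UpdateTx
}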

The client signs the UpdateTx with his/her private key and sends the TX to the network of miners.

5.2.4.6 Validation of an UpdateTx

When a miner receives a new TX, it adds the TX to a set of open TXs (cf. Figure 5.9). When a new mining round starts, the miner validates, processes, and adds the TXs to a new block. The same applies to UpdateTxs as well. In this step, the miner performs checks to ensure the client is authorized to request the modification. In the first step, the miner verifies that TxToUpdate exists. For that matter, the miner queries the TxToUpdate in its local storage. If the TX does not exist, the process is terminated. Next, the miner checks whether the client's account exists in the BC. Upon success, the miner verifies that the signature on the UpdateTx was signed with the private key corresponding to the client's


public key. If the signature is valid, the miner checks whether the TxToUpdate was signed with the same private key as the UpdateTx. This check verifies that clients can only update their own TXs. As pointed out in Section 5.2.4.4, this constraint is already enforced by the distribution of CHF parameters on a per-account basis. Lastly, the miner verifies that the modified TxToUpdate yields the same hash as before the modification. This holds due to the hash collision the client created beforehand and ensures hash consistency. If all these checks pass, the UpdateTx is processed.

5.2.4.7 Processing UpdateTx

After a successful validation of an UpdateTx, the actual update process is performed by the miners. In DLIT, only the TX hash is stored in a block, whereas the TX itself is stored in the miner's local storage. This is due to BC-size optimizations and has yet another beneficial side effect. As the TX hash is not affected by the modification (because of the hash collision), it does not affect the Merkle root of the block in which the TX was originally included. Thus, it does not affect the hash consistency of the BC. This means that miners do not need to modify the original block. The only modification they need to perform is to update the TX in their local storage. To that end, the miner queries the TxToUpdate from the local storage and performs the modifications on it. First, the new check string is copied over from the UpdateTx to the TxToUpdate. Subsequently, the new data is copied over from the TxToUpdateData field of the UpdateTx to the TxToUpdate's data field. Finally, the miner stores and overwrites the modified TX in the local storage.

5.2.4.8 Adding UpdateTx to a New Block

After an UpdateTx is validated and processed, the miner stores the UpdateTx in its local storage and adds it, like any other TX, to a new block. As a consequence, the local storage and the block data structure of the DLIT BC are extended in this work. On the one hand, a miner's local storage in DLIT is based on a file database, segmented into buckets where each bucket provides a key/value storage. To store and query UpdateTxs, a new bucket is added with the UpdateTx hash as key and the TX as value. On the other hand, the block data structure is extended with the following headers to store information about the new UpdateTx.

• NrUpdateTx: Checksum counter storing the number of UpdateTxs contained in this block.

• UpdateTxList: Array storing the hashes of the UpdateTxs contained in this block. The hash can be used to query the update TX from the local storage.

This work's modified data structure allows each update request to be recorded in a transparent and traceable way. From the content of the UpdateTx one can derive (i) the user who requested a modification, (ii) the content of the modification, and (iii) the TX on which the modification was applied.
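As a rough illustration of this block-level bookkeeping, the two new header fields could look as follows in Go; the field types shown are assumptions, and all other block fields are omitted.

// Block sketches only the parts of the block header relevant here; it is not
// the actual DLIT block layout.
type Block struct {
	// ... existing block header fields (previous hash, Merkle root, etc.) ...
	NrUpdateTx   uint16     // checksum counter: number of UpdateTxs contained in this block
	UpdateTxList [][32]byte // hashes of the contained UpdateTxs, used as keys into the local storage bucket
}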


Figure 5.9: Miner Interactions in DLIT for Processing an Update Transaction [260]. The figure spans the network (P2P) component, temporary storage, mining, validation, local storage, and new block creation: an incoming UpdateTx is added to the open TX pool, open UpdateTxs are handled asynchronously, the incoming UpdateTx is validated, and, if valid, the TxToUpdate is fetched from the local storage, the updated TxToUpdate is written back, and the UpdateTx is added to the new block.


5.2.4.9 Distributing UpdateTx

Miners in DLIT use a shared Peer-to-peer (P2P) interface to fetch and distribute TXs and blocks among each other. In order to broadcast an UpdateTx across the miners, the P2P component of the DLIT miner application is extended in this work. First, new request types are defined as follows.

• ReqUpdateTx: Code indicating that the P2P request is about fetching an UpdateTx.

• ResUpdateTx: Code indicating that the P2P response is about an UpdateTx.

Secondly, a new mechanism is implemented for miners to asynchronously fetch UpdateTxs from their peers.

Thus, a new miner can join the BC network by syncing with the other ones and building its own local copy of the BC. To that end, the joining miner requests all previous TXs and blocks from its peer miners and validates them in sequential order. This miner will receive all the previous TXs from the network. Moreover, this miner will receive the already updated TXs, as well as the UpdateTxs that caused the previous updates. Thus, the joining miner is up to date, and no additional modifications need to be performed on this miner's side.

5.2.4.10 Blockchain Explorer

For the convenience of users, a BC explorer is adapted [258] to illustrate update TXs, as shown in Figure 5.10. In the information shown for an example update TX, the "new data" field contains the updated data (the identity of a user, which is updated to "anonymous"). Moreover, the "Reason" for the change is now added by users along with the other fields. In this example, the fee of the update TX is set to zero, because it runs in a test BC network. In a public setting, this amount has to be set to a value larger than zero to avoid update spam.

5.2.5 TX Aggregation Compatibility

In the case of DLIT, TX aggregation of funds TXs is implemented to decrease the overall BC size [205]. In TX aggregation, multiple TXs are summarized and aggregated into one TX by either the sender or receiver, grouped and contained in a designated AggTx. For instance, BC client A sends 5 coins to client B and 7 coins to client C. The aggregation process combines these TXs as follows.

FundsTx : A −→ B : 5 (5.1)

FundsTx : A −→ C : 7 (5.2)

will be aggregated to AggTx : A −→ [B,C] : 12 (5.3)


The AggTx (5.3) contains the hashes of the FundsTxs (5.1) and (5.2) and will be added to a new block. To use less space on the BC, the FundsTxs (5.1) and (5.2) are removed from the blocks they were included in. Removing these FundsTxs has the consequence that the Merkle root of the affected blocks changes, and thus hash consistency would be destroyed. To mitigate this side effect, blocks are connected by an additional fallback link. The fallback link does not include the block's Merkle root and thus is not affected by the TX removal [205]. In this case, an UpdateTx enables authorized clients to alter the data field of their FundsTx without changing its hash. In DLIT, an AggTx only contains the hashes of the aggregated funds TXs; thus, a modification on the TX level has no effect on the AggTx.

However, the aggregation of DataTXs has to be handled differently. In this case, the aggregation shall happen based on the content of the data field and the sender address; otherwise, combining DataTXs with different data would cause data loss. This means that data1 and data2, sent to B and C as follows, could be aggregated only if data1 = data2. Hence, the aggregated data, i.e., data3, has to be equal to the data in all of the aggregated DataTXs.

DataTx : A −→ B : data1 (5.4)

DataTx : A −→ C : data2 (5.5)

will be aggregated to AggTx : A −→ [B,C] : data3 (5.6)

Figure 5.10: Representation of Data Update Transaction in Blockchain Explorer [260]


5.3 Evaluation

The evaluation of the designed features of DLIT is performed in three different steps: (i) TX aggregation, (ii) hybrid consensus, and (iii) GDPR compliance.

5.3.1 Evaluation of the TX Aggregation

An extensive evaluation of TX aggregation is conducted by setting up a test-bed including 9 miners hosted on Amazon Web Services (AWS), located in Oregon, São Paulo, Ohio, London, Paris, Mumbai, Singapore, Tokyo, and Sydney [206]. In this network, more than 2,200 TXs are monitored between the clients and miners. The maximal block size is set to 500 KB, and the block interval is set to 180 s. Related information can be seen in Table 5.2.

Table 5.2: DLIT Simulation Results [167]

Miner Location | # of Mined Blocks | First Block | Last Block | Time-span | Elapsed Time (s) | Blocks per Minute | # of Blocks Added to Chain | Total # of TX Sent | Total # of TX Validated | Average # of TPB | Max TPB
London | 51 | 11:44:05 | 18:58:04 | 07:13:59 | 26039.88 | 0.12 | 43 | 21536 | 21536 | 500.83 | 4403
Ohio | 41 | 11:38:45 | 18:56:00 | 07:17:15 | 26235.14 | 0.1 | 43 | 9952 | 9952 | 231.44 | 2272
Tokyo | 32 | 11:23:02 | 19:09:54 | 07:46:52 | 28012.53 | 0.07 | 43 | 17589 | 17589 | 409.04 | 5204
Sydney | 77 | 11:14:54 | 18:53:50 | 07:38:56 | 27536.54 | 0.17 | 32 | 13159 | 13159 | 411.21 | 2272
Mumbai | 69 | 11:17:11 | 19:06:23 | 07:49:12 | 28152.95 | 0.15 | 43 | 16908 | 16908 | 393.20 | 2272
São Paulo | 51 | 11:33:17 | 19:16:45 | 07:43:28 | 27808.23 | 0.12 | 43 | 16942 | 16942 | 394 | 2174
Singapore | 58 | 11:23:13 | 19:21:46 | 07:58:33 | 28713.40 | 0.13 | 43 | 10104 | 10104 | 234.97 | 3090
Oregon | 42 | 11:18:50 | 19:18:46 | 07:59:55 | 28795.47 | 0.09 | 32 | 13727 | 13727 | 428.96 | 3836
Paris | 54 | 11:29:18 | 19:28:02 | 07:58:44 | 28724.73 | 0.12 | 40 | 15927 | 15927 | 398.17 | 3302
Overall | 475 | 11:14:54 | 19:28:02 | 08:13:08 | 29588.98 | 0.97 | 362 | 135844 | 171230 | n/a | 5204
Average per Miner | 52.78 | 11:26:57 | 19:09:57 | 07:42:59 | 27779.87 | 0.12 | 40.02 | 15093 | 19025 | 473 | 578

5.3.1.1 Performance Analysis Scenarios

The performance analysis of DLIT is performed while considering the effect of the block size, the interval between blocks, and the BC's overall size on the three test cases defined as follows.

1. without TX aggregation,

2. with TX aggregation enabled,

3. with TX aggregation & the emptying of blocks enabled.


Evaluations of DLIT are performed via a local setup or by setting up a test-bed including 11 miners hosted on Amazon Web Services (AWS), located in different countries around the globe. Related information can be seen in Table 5.2.

5.3.1.2 Evaluation on Different Block Sizes

Table 5.3: Table of Different Block Sizes Influencing the TPS with TX Aggregation Enabled [167].

Defined Block Size (byte) | ABS (byte) | # sent | # validated | TPSsent | TPS | TPSmin | TPSmax | TPS/TPScalc.
1'000 | 342 | 179'328 | 178'196 | 33.1 | 28.0 | 14.6 | 32.7 | 39.2
5'000 | 4'342 | 181'933 | 181'933 | 33.3 | 28.5 | 20.9 | 33.2 | 3.1
20'000 | 19'324 | 181'613 | 181'613 | 33.0 | 28.0 | 16.7 | 32.9 | 0.7

Here, the evaluation and comparison of different block sizes and their influence on the TPS, with an initially defined number of TXs sent to the network, is measured and listed. However, it is possible that not the same number of TXs is sent to the network in each test run, because either the miner or the client crashes. This is mainly caused by an unrecoverable SIGBUS error.

Table 5.3 shows the TPS rates for test cases with TX aggregation enabled, whereas Table 5.4 shows the test cases without TX aggregation. In both tables, the block size is given in bytes, and the values belonging to TXs are given in TXs or TXs per second.

The actual block size (ABS, in bytes) is the available space in a block that can be filled with TXs. At the time of writing, it is DefinedBlockSize − 658 byte. The 658 bytes are used for block-related values, and therefore this space cannot be filled with TXs.
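As a quick sanity check of this relation against Table 5.4, a defined block size of 20'000 byte yields

ABS = 20000 − 658 = 19342 byte,

which matches the ABS column there. The theoretical maximum TPScalc. then follows from the number of TXs fitting into the ABS divided by the block interval; with the roughly 600 TXs per 20'000 byte block mentioned later in this subsection and the default 15 s block interval, this gives

TPScalc. ≈ 600 TXs / 15 s = 40 TXs per second,

in line with the value of about 40 stated below. The per-TX space inside a block is not stated explicitly here; the 600-TX figure is taken from the discussion later in this subsection.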

At first glance, it may seem strange that there are also TPSmin and TPSmax values besides the normal TPS. This is due to the fact that some miners need longer to fetch all TXs when they do not receive all TXs during broadcasts. Also, the virtual machines hosted on GCP nearly always had a lower TPS than the ones on AWS, which may be an indicator that they are slightly underpowered in terms of random access memory. The TPSmax is evidence of what speeds are possible with TX aggregation as it is implemented in this thesis. However, it is visible that, also with TX aggregation, TPSmin and TPSmax can be close to each other. Therefore, it is strongly assumed that this difference is caused by connection issues, which result in requesting more TXs.

As presented in Table 5.3, the block size does not have a big influence on the reached TPS value. This can be explained by multiple reasons:

TX Aggregation: Since TXs are aggregated, not every TX needs space in a block, and thus more TXs fit into one block. It does not matter whether there are two TXs from A to B or whether there


Table 5.4: Table of Different Block Sizes Influencing the TPS with TX Aggregation Disabled [167].

Defined Block Size (byte) | ABS (byte) | # sent | # validated | TPSsent | TPS | TPSmin | TPSmax | TPS/TPScalc.
1'000 | 342 | 19'000 | 19'000 | 36.3 | 0.6 | 0.6 | 0.6 | 0.7
5'000 | 4'342 | 74'960 | 74'834 | 34.7 | 7.0 | 7.0 | 7.0 | 0.8
20'000 | 19'342 | 181'078 | 181'078 | 33.5 | 27.3 | 27.2 | 27.4 | 0.7

are one hundred TXs from A to B: with TX aggregation, only one transaction will be written into the block.

Unlimited AggTx Size: If a TX does not fit into one block, it is likely to be validated in the next block, because there is no limit on how many TXs can be written into one AggTx and because of the splitAndSort algorithm's design. This algorithm takes the senders or receivers which have the most TXs awaiting validation. Thus, the more TXs from one sender or to one receiver are in the mempool, the more likely it is that they get validated. Because the TXs from/to one wallet which cannot be validated in a block are still in the mempool, it is likely that there are more TXs matching the selection criteria for the next block. Since the size of an AggTx is unlimited, it is able to aggregate as many TXs as possible.

Test Case: It is likely that the same sender or receiver is found quickly. This helps to keep the TPS at a high level. TX aggregation works better the more TXs share the same sender or receiver.

TX Sending: TXs are sent approximately every 0.5 seconds. If TXs were only sent after the previous TX is validated in the network, aggregating by sender would not work. The current acknowledgment a miner sends to a client only confirms that a certain TX has been sent to the network. A client does not know if and when the sent TX is validated. Thus, aggregating by sender is possible.

The ratio TPS/TPScalc. indicates by which factor the aggregation increases the maximum possible TPScalc. value. Thus, with a block size of 1'000 byte, the version with TX aggregation can theoretically handle roughly 39 times more TXs than without aggregation. It is clear that, with a smaller TPScalc. and a constant TPS, this factor increases. The ratio TPS/TPScalc. for a block size of 20'000 byte is below one because, theoretically, over 600 TXs fit into a 20'000 byte block, which results in a TPScalc. of about 40 TXs. This TPScalc. is higher than the TPSsent and, therefore, the ratio cannot be bigger than one.

As a preliminary conclusion, it can be stated that it is not possible to generalize that with TX aggregation DLIT is exactly 39.2, respectively 3.1, times faster, because these numbers are highly influenced by the actual TPS, which cannot be higher than the TPSsent. Thus, sending more TXs would probably increase the TPS, and therefore also the TPS/TPScalc. ratio, even more. In Table 5.3, the fraction TPS/TPScalc. can be understood as a degree of utilization in terms of the defined and actual block size.

Figure 5.11: Different Block Sizes and their Influence on the TPS with and without Aggregation [167]

Figure 5.11 visualizes the output of Tables 5.3 and 5.4. The scalability improvement is clearly visible. One can say that, with TX aggregation, the block size is not a limiting factor anymore if the TXs share either the same senders or the same receivers. This is visible in Figure 5.11, because the TPS values with aggregation (the blue bars) are nearly constant, whereas the test runs without aggregation (orange bars) increase with bigger block sizes. The grey bars visualize the TPScalc.. Furthermore, the test runs without aggregation do not reach a higher TPS in the test run with a block size of 20,000 byte: the block size is not limiting the TPS anymore, since the TPScalc. is higher than the TPSsent. This may indicate that connection issues are limiting the network throughput.

The TPSsent is indicated by the green line. The shrinking of the TPSsent in Figure 5.11 is not related to the block size, as might be assumed. Sending the TXs, even with a fixed interval, is still not always equally fast. It is related to the timespan in which a client receives an acknowledgment from a miner after sending the TXs. This timespan can differ slightly and thus can have a bigger influence when sending a lot of TXs.

In the test runs with a block size of 1,000 byte, the version with TX aggregation can handle that many more TXs because only about 10 TXs fit into one block. These 10 TXs can


Table 5.5: Table of Different Block Intervals Influencing the TPS with TX Aggregation Enabled [167].

Defined Block Interval (s) | ABI (s) | # sent | # validated | TPSsent | TPS | TPSmin | TPSmax | TPS/TPScalc.
15 | 18.8 | 181'933 | 181'933 | 33.3 | 28.5 | 20.9 | 33.2 | 3.1
60 | 66.8 | 190'000 | 190'000 | 34.8 | 34.4 | 33.9 | 34.7 | 15.3
120 | 118.4 | 181'278 | 181'278 | 33.8 | 32.9 | 30.2 | 33.4 | 28.5

Table 5.6: Table of Different Block Intervals Influencing The TPS with TX Aggregation Disabled [167].

Defined Block Interval (s) | ABI (s) | # sent | # validated | TPSsent | TPS | TPSmin | TPSmax | TPS/TPScalc.
15 | 14.3 | 74'960 | 74'834 | 34.7 | 7.0 | 7.0 | 7.0 | 0.8
60 | 63.5 | 78'375 | 78'375 | 36.3 | 2.0 | 2.0 | 2.0 | 0.9
120 | 111.4 | 18'000 | 18'000 | 34.4 | 1.1 | 1.1 | 1.1 | 1

be aggregation TXs aggregating many more TXs and therefore increasing the TPS. Here, the second point from the list above has an influence, because 19 different wallets are in the network. Aggregating these TXs perfectly would result in 19 AggTxs, which would overflow the block size by nine TXs. If TXs from one sender cannot be validated in block n, all these TXs plus the newly received ones are pending for block n + 1. Thus, during the preparation of block n + 1, this specific sender will have more TXs than a sender whose TXs were validated in block n, and therefore all of its TXs get validated then.

The block size does not limit the DLIT version with TX aggregation, up to the point where only distinct senders and receivers are sending and receiving the TXs. Therefore, with regard to the block size, TX aggregation does increase the TPS, especially for small blocks.

Here, the comparison and evaluation of different block intervals, with a given number of TXs sent to the network, is measured and listed. Table 5.5 shows the TPS rates for test cases with TX aggregation enabled, whereas Table 5.6 shows the test cases without TX aggregation. In both tables, the block interval is given in seconds, and the values belonging to TXs are given in TXs or TXs per second. In Table 5.5, the fraction TPS/TPScalc. can be seen as an improvement factor with respect to the theoretical maximum, and in Table 5.6 as a degree of utilization.

The ABI (Actual Block Interval) indicates whether the BC validates blocks in the user-defined interval. It is given in seconds and is the average timespan between two blocks. Since the validation speed is set with the help of the target, it is not exactly the set interval. This target is adapted every n blocks, where n is user-defined.


The results of the test runs with different block intervals look similar to those with various block sizes. The TPS can be increased when TX aggregation is used. Especially because the block interval and the TPS without aggregation behave inversely proportionally, these test runs show the advantages nicely. The relationship between the TPS and the block interval is inversely proportional because, with a higher block interval, fewer blocks can be validated, and thus fewer TXs as well.

The block size, on the other hand, is proportionally related to the TPS, because bigger blocks can handle more TXs, which results in a higher TPS value. Consequently, the version without aggregation can theoretically handle the most TXs with a tiny block interval and huge blocks. Similar to the different block sizes, the block interval is not a TPS-limiting factor anymore. Furthermore, TX aggregation also allows validating more TXs per second, as already stated before. During one test run with a block interval of 60 seconds, the TPS is close to the TPSsent.

Figure 5.12: Different Block Intervals and their Influence on the TPS with and without Aggregation [167]

In Figure 5.12, the differences in the TPSsent were relatively big, and thus its indicator has different heights. The differences are again caused by small variations between sending a TX and receiving the acknowledgment from a miner. The difference between the TPS with aggregation (blue bars in Figure 5.12) and without (orange bars) is even bigger here, since the block interval and the TPS are inversely proportional.

It is not certain that the TPS can be increased with a higher block interval for DLIT with aggregation enabled. Theoretically, the different block intervals


should not have an effect on the TPS up to a certain number of different senders and/or receivers. This is because, with the two higher block intervals, four to eight times fewer blocks have to be sent through the network compared to the default 15-second interval. Therefore, the miners probably disconnect less often, and they have more time to fetch TXs or blocks which they never received.

5.3.1.3 Blockchain’s Overall Size

As shown in Figure 5.13, the BC size can be reduced with aggregation, and even more with aggregation and the emptying of blocks. The difference between only aggregating and aggregating with emptying is not big because, in this test case, only three different TXs are written into a block with aggregation. When emptying a block, the block's size is reduced by only around three times the size of a TX hash. The more different TXs are listed in a block, the larger this difference will be.

When DLIT with TX aggregation is compared to a version without aggregation, the difference is noticeable because, with TX aggregation enabled, around three TXs are written into each block, whereas without aggregation, roughly 135 TXs are written into each block. These 135 TXs also represent the maximum capacity of a block with the defined size of 5,000 byte. The two major spikes are caused by the start and end of sending TXs, whereas the smaller ones are caused by rollbacks. This graph shows the possibility of having a smaller overall BC size with TX aggregation and the emptying of blocks.

5.3.1.4 Benefits of TX Aggregation

TX aggregation definitely allows more TXs in one block. Thus, the overall throughput increases while, at the same time, the block size and the overall chain size can be kept small. Furthermore, the block interval can be enlarged, which reduces network traffic. However, TX aggregation helps the most if there are similar senders or receivers of TXs: it performs better the more TXs with the same sender or receiver are sent. As DLIT is becoming an IoT BC, where many IoT nodes send their TXs to one receiver, this aggregation approach helps in scaling the BC.

As stated above, the TX aggregation implementation would have no effect if there were no constant or overlapping set of senders or receivers. In such cases, another aggregation technique should be used.

5.3.1.5 Obstacles of TX Aggregation

The intention behind double-linking the BC is to empty all blocks once they are secure enough. A block is secure enough when it is accepted by the majority of miners and thus


Figure 5.13: Differences Regarding the Size of the Blockchain with Aggregation, with Aggregation and Emptying of Blocks, and without Aggregation [167].

will not be included in rollbacks anymore. The emptying helps to save storage, as visible in Section 5.3.1.3. However, the emptying of validated blocks somewhat contradicts the core concept of a BC, where secure blocks are immutable and cannot be changed again. Thus, some impediments occur, especially when a new miner joins the network and wants to start mining.

Joining the DLIT Network as a New Validator

When aggregating TXs in a historic manner, various problems and difficulties can occur. They are described with the help of Figure 5.14.

In Figure 5.14, three FundsTx are incoming to the BC and get validated in blocks 1011, 1012, and 1013. As is visible, the third TX (FundsTx A=>C: 5) can be aggregated with the first TX (FundsTx A=>B: 10) because of the shared sender. This results in the AggTx A=>{B, C}: 15 in block 1013 and the removal of FundsTx A=>B: 10 from block 1011.

The table on the right side of Figure 5.14 shows the balances for the three wallets A, B, and C with and without aggregation, before, between, and after the three blocks are validated. As visible in the tables, the balances for A and B are not the same when aggregating the TXs as when not aggregating them. This can lead to problems, especially when restarting or joining the network after TXs have already been sent. Since DLIT first fetches all blocks from the last validated one back to the genesis block and afterward validates them in the correct order, moving and aggregating TXs is problematic.


Figure 5.14: TX Aggregation and the Balance Issue for Joining Miners [167]

As an example, when the TXs FundsTx A=>B: 10 and FundsTx A=>C: 5 get aggregated to AggTx A=>{B, C}: 15, and TX FundsTx A=>B: 10 thus moves from block 1011 to block 1013, B does not have enough funds for TX FundsTx B=>C: 4 at the point of validating block 1012. This is visible in the middle two sub-tables, where the balance of B is not the same with and without aggregation. When a new miner joins a network that uses TX aggregation, and FundsTx A=>B: 10 is no longer in block 1011 but in 1013, this newly joining miner is not able to validate block 1012, because B does not have enough funds.

One way of avoiding this is credit-like behavior on startup. This concept allows a wallet to have a negative balance during the startup process. At the end, similar to the fourth small table in Figure 5.14, the balances with aggregation enabled are the same as when validating each TX without aggregation. As long as all TXs are validated, the order of validation does not play a substantial role. When a new miner joins, it only validates TXs that have already been validated by other miners in the network. If the bootstrap miner tries to send invalid TXs to the new miner, the joining miner will detect them, either through invalid block hashes or when other miners later refuse its mined blocks. Consequently, with this credit-like behavior during joining, it is possible to validate block 1012 even with aggregation and to join the network.
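The following minimal Go sketch illustrates this credit-like bootstrap behavior under stated assumptions: the tx type, the initial balances, and the replay function are hypothetical simplifications, not DLIT's actual state-validation code.

```go
package main

import "fmt"

type tx struct {
	from, to string
	amount   int64
}

// replay applies already-validated TXs in the order a joining miner sees them.
// During bootstrap, negative intermediate balances are tolerated (credit-like
// behavior); in normal operation they would be rejected.
func replay(initial map[string]int64, txs []tx, bootstrapping bool) (map[string]int64, error) {
	balances := make(map[string]int64)
	for k, v := range initial {
		balances[k] = v
	}
	for _, t := range txs {
		balances[t.from] -= t.amount
		balances[t.to] += t.amount
		if !bootstrapping && balances[t.from] < 0 {
			return nil, fmt.Errorf("insufficient funds for %s", t.from)
		}
	}
	return balances, nil
}

func main() {
	// Order as seen when FundsTx A=>B:10 was moved into the later aggregated
	// block: B appears to spend before its funds "arrive".
	txs := []tx{{"B", "C", 4}, {"A", "C", 5}, {"A", "B", 10}}
	initial := map[string]int64{"A": 20} // assumed starting balance for A

	if _, err := replay(initial, txs, false); err != nil {
		fmt.Println("strict validation fails:", err)
	}
	final, _ := replay(initial, txs, true) // bootstrap: temporary deficit of B accepted
	fmt.Println("final balances:", final)  // identical to validating in the original order
}
```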

Joining as a new miner when blocks have already been emptied is also problematic, since the nonce of a block is calculated with the help of the wallets' balances. Here, similar problems as in the previous subsection can occur, and blocks cannot be validated because the nonce appears incorrect at this point. This should also be preventable by not checking the nonce on startup. It can also be argued that these blocks have already been validated in the network and are therefore secure.

5.3.2 Evaluation of DLIT's Hybrid Consensus Mechanism

DLIT's performance has been tested within a local setup, using a set of 60 artificial wallets partitioned into 8 partitions, each sending out 3,000 TXs. The block size was kept at 800 Byte, and the TXs were sent in batches to ensure that TXs from as many different virtual wallets as possible were present in the open memory pool. The performance of DLIT is evaluated mainly based on the TPS as a sufficient measure of BC scalability.
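As a minimal illustration of how a TPS value of this kind can be obtained (a sketch under assumed conditions; sendAndAwaitAck is a hypothetical stand-in for the client-to-miner round trip and not part of DLIT's code base):

```go
package main

import (
	"fmt"
	"time"
)

// sendAndAwaitAck stands in for sending one TX and waiting for the miner's
// acknowledgment; the fixed sleep is a placeholder for network latency.
func sendAndAwaitAck(i int) {
	time.Sleep(2 * time.Millisecond)
}

// measureTPS reports accepted TXs per second over the whole send phase, i.e.,
// from sending the first TX until the last acknowledgment is received.
func measureTPS(total int) float64 {
	start := time.Now()
	for i := 0; i < total; i++ {
		sendAndAwaitAck(i)
	}
	return float64(total) / time.Since(start).Seconds()
}

func main() {
	fmt.Printf("measured TPS: %.1f\n", measureTPS(1000))
}
```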

Figure 5.15 summarizes the testing results. It can be seen that the test results with different numbers of CM members behave almost in the same way, but they are shifted along the TPS axis. Furthermore, it appears that the slope becomes less steep with more CM members, which means that smaller committees have a higher scaling potential.

Figure 5.15: Evaluation with a Varying Amount of Committee Members and Validators [51]

Also, it can be observed that the sharding mechanism has the potential to give DLIT a considerable scalability benefit. This can mainly be explained by the enhanced TX management, as the most time-intensive work is distributed among the shards. However, once a certain TPS threshold is reached (a threshold that decreases with a growing CM), adding more miners does not improve the scalability anymore, at some point even slowing down the system. The reason for this is the number of CM members: beyond a certain point, the additional inter-CM communication, which provides additional security, trades off against scalability.

If the system shall validate TXs as fast as possible, a small CM might be a good option. Alternatively, if security concerns prevail, a larger CM would be advisable. Judging by the testing, a configuration of 2 CM members and 3 shards might be a good choice, since it does not completely centralize the CM while still benefiting from the performance gains that the sharding mechanism provides.

5.3.3 Evaluating GDPR-compliance of DLIT

The fulfillment of the requirements identified in Section 5.2.3 by the approach designed and developed in this work is elaborated as follows.

Requirement (1): The implementation in this work satisfies the requirement for update and erasure, and BC users are able to request the update or erasure of their data by creating an UpdateTx. However, the fact that UpdateTx carries the TxToUpdateData field, as well as the fact that updates are mined into the BC, yields a corner case where GDPR compliance is hard to assess.

Assume a user (Alice) issues an AccountTx with {name: "Alise"} included in the Data field. Unfortunately, the user made a typo and subsequently issues an UpdateTx to correct the mistake, specifying TxToUpdateData as {name: "Alice"}. After the update is performed, the BC contains the corrected AccountTx as well as the UpdateTx. If Alice later decides to leave the BC and requests the deletion of her personal data, she needs to create another UpdateTx to erase the Data field of the AccountTx. While her personal data is removed from the AccountTx, the TxToUpdateData field of the first UpdateTx still contains {name: "Alice"}. This leaves a personal data record (in this case the name used by the user) in the BC, which cannot be erased, as it is not stored in an editable Data field but in a TX that must be kept for the purpose of tracking changes. Thus, requirement (1) is partially satisfied, but this special case, which relates more to a user's action trail in the BC than to his/her data, needs to be addressed in future work.
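The following minimal Go sketch, with assumed (not DLIT's exact) type definitions, illustrates this corner case: erasing the Data field of the AccountTx leaves the old value untouched inside the earlier UpdateTx's TxToUpdateData field.

```go
package main

import "fmt"

// AccountTx and UpdateTx are simplified, hypothetical representations.
type AccountTx struct {
	Address string
	Data    map[string]string
}

type UpdateTx struct {
	TxToUpdate     string            // hash of the TX being updated
	TxToUpdateData map[string]string // new contents to be written into that TX
}

func main() {
	acc := AccountTx{Address: "alice", Data: map[string]string{"name": "Alise"}}

	// First update: fix the typo; the corrected value is recorded in the
	// UpdateTx and applied to the AccountTx.
	fix := UpdateTx{TxToUpdate: "hash(AccountTx)", TxToUpdateData: map[string]string{"name": "Alice"}}
	acc.Data = fix.TxToUpdateData

	// Second update: erase the personal data from the AccountTx.
	erase := UpdateTx{TxToUpdate: "hash(AccountTx)", TxToUpdateData: map[string]string{}}
	acc.Data = erase.TxToUpdateData

	// The AccountTx is now empty, but the first UpdateTx still carries the name.
	fmt.Println(acc.Data, fix.TxToUpdateData) // map[] map[name:Alice]
}
```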

Requirement (2): The distribution of the CHF parameters on an account basis, as well as the various validation steps performed by miners, ensure that users can only update their own TXs. If a user A tries to update a TX of user B, the CHF parameters of user A will not yield a hash collision for that TX, because the TX was not created with user A's CHF parameters. Thus, requirement (2) is addressed.

Requirement (3): As updates are executed in the form of TXs and are mined into blocks, update performance depends on various configurations of the BC, such as its scalability and the efficiency of its consensus mechanism. For instance, the difficulty level of the Proof-of-Work puzzle and the block interval impact the overall TX throughput of a BC and, hence, the updating process. In the approach developed in this work, updates are performed in less than 1 minute. Thus, requirement (3) is fulfilled.

Requirement (4): The current implementation is self-contained and does not depend on any external systems or services. Thus, requirement (4) is addressed.

Requirement (5): In this work, modifications are performed strictly on a TX level. Thus, requirement (5) is addressed.

Requirement (6): After an update is executed, the UpdateTx is stored in the BC. This design choice means that an UpdateTx is treated like any other TX in the BC and thus profits from a high level of traceability and auditability. Thus, requirement (6) is addressed.

Requirement (7): The fact that hash collisions enable TX modifications complements the existing design of DLIT very well. As TXs are stored and updated in the miners' local databases, modifying the block itself is unnecessary, because blocks only store the TX hash. As a consequence, updates are performed in a non-invasive way, meaning the BC itself is not affected by the update. This also leaves existing BC optimizations, such as TX aggregation, unaffected by updates. Thus, requirement (7) is satisfied.
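A minimal sketch of why such updates are non-invasive, under assumed data structures (these are illustrative types, not DLIT's code): blocks reference TXs only by hash, while TX bodies live in a miner's local store, so replacing a body whose hash is preserved via a chameleon-hash collision never touches the block.

```go
package main

import "fmt"

// Block references TXs only by their hashes; it never stores TX bodies.
type Block struct {
	TxHashes [][32]byte
}

// TxStore models a miner's local database mapping TX hash to TX body.
type TxStore map[[32]byte][]byte

// update replaces the stored body for an existing hash. The block is not
// rewritten, because it never contained the body in the first place; the
// hash itself is assumed to be preserved by a chameleon-hash collision.
func (s TxStore) update(hash [32]byte, newBody []byte) {
	s[hash] = newBody
}

func main() {
	var h [32]byte
	copy(h[:], "illustrative-tx-hash")

	store := TxStore{h: []byte(`{"name":"Alise"}`)}
	block := Block{TxHashes: [][32]byte{h}}

	store.update(h, []byte(`{"name":"Alice"}`))
	fmt.Println(len(block.TxHashes), string(store[h])) // block untouched, body updated
}
```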


6 Summary and Conclusions

This thesis explored many specific aspects of broadly applied BIoT dApps, leading to the collection of deep, experimental knowledge on the BIoT ecosystem, including the impact of different protocols and technologies. As a result of the practical studies, key challenges and potentials of BIoT are identified from an objective point of view and elaborated thoroughly. It became evident that many aspects impact the efficiency of BIoT in practice. Driven by those BIoT dApp examples and the simulations performed, the BIoT challenges are categorized into social, operational, performance, technical, functional, and architectural areas. Since an interdisciplinary approach contains risks, dedicated metrics were introduced for measuring efficiency and for determining the technical components that impact it.

It is concluded that, for an efficient BIoT, flexibility, adaptability, and manageability of BIoT architectures must be facilitated. To reach these features, this thesis presented a BIoT architecture, i.e., BIIT, highlighting a set of management approaches to prepare and enhance the performance of IoT networks in BIoT use cases. BIIT recommends proactively considering many aspects, including (i) adaptive, IoT-technology-specific transmission scheme design, (ii) BC-specific TX formatting on the IoT device side, (iii) BC client security, (iv) flexible network adaptation layer configuration on the IoT device, (v) network plugins and BC client components on Edge nodes, and (vi) the integration of on/off-chain and de-/centralized DSSes to avoid unnecessary storage costs. The efficiency of BIoT systems requires observation of the routes data travels and of the efficiency of the data storage system employed.

The competitive advantages of BCs over other DSSes in addressing the key concerns of centralized storage systems, especially tamper-proofness, time stamping, reliability, and scalability, have made them a viable choice for BIoT. However, the design of BCs shall follow selected notions and approaches taken by DSSes to improve the efficiency of storing and processing large amounts of data. Thus, based on BIIT, moving between BCs and DSSes shall be made possible. The simulations performed to evaluate BIIT's efficiency led to promising results by enhancing the IoT-to-BC transport throughput and energy efficiency as well as the overarching security.

The developed dApps confirmed that the underlying BC highly impacts the scalability of BIoT applications. It is important to note that current PoW-based BCs cannot be utilized for BIoT in the same way as DSSes. This is mainly due to such BCs' high data storage costs, delay, low scalability, and restricted privacy considerations. Thus, it is concluded that either DLs and DHT-based solutions have to be adopted for BIoT, or proprietary approaches need to be developed to replace such BCs. In this regard, to enhance the scalability of BCs, this thesis introduced DLIT, an IoT-oriented DL. Various new TX types were defined for DLIT, each with a specific purpose and a novel design supporting DLIT's consensus mechanism toward a faster and more reliable IoT-oriented DL. The evaluation results indicated that the approaches followed in DLIT's design offer a high TPS due to the sharding mechanism designed.

The experience collected during this thesis indicates that modifying a BC's complex and intertwined elements, e.g., when adding new features or changing an algorithm, directly or indirectly influences many other elements of that BC. This influence must be thoroughly evaluated for any production-ready BC. For instance, the TX aggregation mechanism relies on a simple idea, i.e., collecting TXs based on their sender or receiver address. However, introducing TX aggregation to an already developed BC causes manifold side effects. First, the blocks' content changes, which causes miners, especially recently joined ones, to see a different version of the chain. Moreover, the validation or verification of TXs can become challenging, especially if the modification of TXs is permitted.

BIoT security represents the provision of data security and integrity throughout data collection, transmission, and persistence. The higher the security of a BIoT system, the higher the users' trust in the data and information shown by BIoT applications. In order to offer higher trustability, BIoT systems shall offer sophisticated encryption and cryptographic algorithms while providing user privacy and transparency. In this regard, the two-layer consensus of DLIT provides high security and reliability to the whole DL network. The dedicated roles and expectations of validators/miners were considered carefully. That is, within DLIT, not all miners follow the same steps at all points in time; each miner has its specific roles and its own knowledge of the chain and the open TXs at a given point in time. Moreover, re-verifications and distributed slashing mechanisms are embedded to provide data security and obstruct malicious activities of miners.

As today's world urgently requires sustainable approaches, the energy efficiency of BIoT is gaining high importance. This thesis concludes that, on the one hand, the processes of an IoT device, including all the actions it has to perform for collecting data and those performed prior to or in parallel with data transmission, such as encapsulation and aggregation of data, affect the energy efficiency of a BIoT system. On the other hand, the high energy demand of BC mining processes must be considered before selecting a BC for a BIoT application. The simulations performed to evaluate BIIT's efficiency led to promising results even with highly limited IoT protocols such as LoRaWAN: by enhancing the IoT-to-BC transport throughput, the energy efficiency is improved due to reduced re-transmissions and packet losses.

Usability, i.e., the public acceptance of BC-based dApps, studied with regard to the functional and social aspects of those dApps, was shown to have a high potential according to the interviews conducted during the NUTRIA SCT dApp design and implementation. However, due to the identified risks of BIoT systems, and in contrast to the widespread cryptocurrency trading applications, the daily use of BC-based applications has still not been achieved on a global scale. Even if not applicable in all countries today, the influence of regulations on the public acceptance of BC-based applications and on verifying the validity of the data stored in a BC is significant, and it impacts the trust in that application or system. To confront such cases, the Zero-Knowledge Proof IAM design of the dApps developed in this thesis added a layer of user privacy, which potentially stimulates users' trust via user-controlled identification processes. Moreover, conforming to user privacy demands and the corresponding regulations, this thesis differentiates the concepts of immutability and tamper-proofness from updateability and modifiability. Hence, DLIT is designed to offer GDPR compliance. DLIT enables data modification requests, issued by users and handled by miners, and it maintains the chain after modifying data. Thus, DLIT's implementation is an attestation of the plausibility and usability of deletable BCs.

6.1 Overview of All Contributions

To fulfill the goals set for this thesis, apart from the identification of the metrics affecting BIoT efficiency, this thesis contributed to 12 key areas, which are achieved by addressing the RQs as follows.


Figure 6.1: Correlation Presentation of Goals, Research Questions, and Contributions

Contributions regarding RQ1: How to achieve scalability in BC-IoT integrated ecosystems?

This thesis makes 4 direct contributions (C1 - C4) to enhancing the scalability of BIoT, including:

C1 – Designing a sharding-based DL, i.e., DLIT, for BIoT use cases. DLIT reduces, or in many cases even removes, the need for inter-shard communications and thus offers a high TPS.

C2 – Specifying a TX aggregation approach: the storage requirement of BIoT BCs is reduced, as TX aggregation controls the growth of the BC size.

C3 – Specifying a decision-making tree that clarifies whether a BC, a DSS, or a centralized system is needed, in order to prevent sub-optimal choices of BCs in use cases that do not require them, hence guiding IoT-integrated system designers toward reasonable technology selection. Thus, transparency, immutability, and decentralization of trust are achieved by using BC and DL variants of DSSes while cautiously trading off the costs of such technologies.

C4 – Designing a scalable data streaming platform, i.e., ITrade, to enable P2P communications between IoT data generators and users. Moreover, the decentralized management of interactions between the entities involved addresses the "manageability" requirement of the functional goal. By addressing the scalability issues of BCs used in BIoT, this thesis fulfills the functional and social goals as well as the technical goals. This is due to the effect of the underlying BC's scalability on a BIoT system's usability and public adoption.


Contributions regarding RQ2: How to facilitate the energy efficiency of the designed BIoT?

This thesis presented 4 contributions (C5 - C8) to the energy efficiency of BIoT, including:

C5 – Introducing BIIT, which specifically emphasizes the efficiency of the transport scheme and provides a solution to transmit large data packets over protocols with limiting MTUs, such as LoRaWAN. It was shown that such a transport scheme, by reducing lost data and consequently eliminating packet re-transmissions, improves the total energy consumed by the IoT-to-BC communication.

C6 – Defining the interactions required for IoT nodes to operate as light BC clients and to collect the required parameters from the user's BC client. This is an important adaptation step that is missed most often in related work. By defining these interactions, a more flexible and usable BIoT system is achieved, fulfilling the functional and social goals.

C7 – Specifying the software-defined management modules of BIoT within the BIIT architecture. Thus, the proposed BIIT architecture leans toward the higher flexibility and manageability specified by the functional goal.

C8 – Designing a hybrid consensus mechanism for DLIT, where the combination of PoS and BFT has been selected to replace PoW, thus reducing the energy demand of validation processes on validators.

Contributions regarding RQ3: How to secure the designed BC and BIoT architecture to enable trust, transparency, and user privacy?

This thesis makes 4 direct contributions (C9 - C12) to the security of BIoT, including:

C9 – Design and implementation of a KYD platform, i.e., KYoT, based on the physical attributes of IoT devices. KYoT's multidimensional platform offers a secure and self-sovereign identification of IoT devices such that trusted users on a BC can issue KYD claims themselves. KYoT's approach is among the pioneers in combining PUF-based device identification with the issuance of ERC 734/735-based claims. Such a combination results in a user-managed and decentralized IAM.

C10 – Design of the committee for DLIT's consensus, which adds a robust security layer over all activities conducted by the different entities in DLIT. The committee verifies the TXs added to blocks in shards, and the slashing mechanism penalizes malicious activities of the validators on shards. Moreover, DLIT divides the responsibilities for TX assignment, validation, verification, and storage between committee members and shard validators. The roles assigned to the committee, the shard leaders, and the members are defined carefully to prevent potential security attacks.


C11 – Specifying the security module of the BIIT architecture and highlighting the importance of data integrity provision and IAM, which shall be performed at the IoT devices.

C12 – Enabling data modification via a GDPR-compliant design of DLIT, which provides more privacy and trust to BIoT application users.

Figure 6.1 revisits Figure 1.1 by assigning the contributions (C1 - C12) to the related RQs and goals. The advances provided by each mechanism, protocol, or system are elaborated by the evaluations of these contributions presented in the previous chapters.

6.2 Future Work

In all aspects of this thesis, practical efficiency has been the main overarching goal. The same mindset applies to the identification of potential future work, with an outlook toward new goals categorized into three groups.

As the first group, it is foreseen that a more in-depth evaluation of the developed applications with a more extensive set of real-life users will lead to even more realistic insights into the efficiency and scalability challenges of such dApps in the BIoT sector. Moreover, the security of processes in the developed dApps can be further improved for real-life applicability. For instance, the evaluation results on KYoT indicate that trust in device owners is unavoidable as long as a fully automated and distributed key extraction mechanism is not available. Thus, it is highly recommended for a practical system in operation to utilize (a) a more robust PUF design of a different type than SRAM-based, (b) periodic PUF generations without any user interaction, and (c) a mechanism independent of devices' power cycles.

The second group relates to further practical implementations and evaluations of the BIIT architecture, such as implementing the software-defined adaptation of configurations. As part of an outlook, it is recommended to implement an online data transmission platform that supports the transmission of data and proposes and adjusts the configuration of the IoT-to-BC transport protocols. Such adjustments shall be based on continuous monitoring of the network status and on measuring the achieved throughput.

The third group relates to the consensus efficiency of DLIT, which merits large-scale evaluations and modifications to offer the same or even higher throughput at a large scale. Such modifications may include the introduction of variations of BFT for the committee and of PoS for the validator election processes. Furthermore, the addition of Smart Contracts to DLIT can stimulate this DL's potential in supporting different dApps, especially if the developed Smart Contracts can offer GDPR compliance. It can also be recommended to pursue energy resource proofs as an improvement to the existing consensus mechanisms, to ensure the sustainability of the energy consumed by DLIT's mining network.


These three groups are only examples of the directions this work could take in the future. Over time, and with the invention of new IoT technologies, cryptographic algorithms, and BIoT use cases, as well as changing BIoT requirements, new areas of research will come up that will merit further investigation of the aspects mentioned here, which could not all be predicted or performed in this work due to time constraints.

Looking forward to the evolving 6th Generation of telecommunications (6G), where IoT applications are combined with Artificial Intelligence (AI) for advanced data processing, BIoT will gain higher importance in storing automatically collected and processed IoT data. Such an application could be live surveillance with image processing performed on the IoT devices used in smart city use cases. Thus, the censorship-resistant nature of BCs would enable the persistence of IoT-captured-and-processed data, which could be used to provide trustable evidence of an existing situation.

Finally, it can be suggested to investigate the potential of sophisticated mathematical approaches such as Differential Privacy (DP) to provide users with higher privacy when data is stored on publicly accessible BCs. DP-altered data could be a potential solution for protecting user identity and data from adversaries.


7 Publications

The thesis' proposal and development resulted in the publication of several scientific articles, which are directly or indirectly related to the proposed BIIT architecture and the DLIT DL. Furthermore, master's (M.Sc.) and bachelor's (B.Sc.) theses, Independent Studies (IS), and Master Projects (MAP) were developed in the BIoT context, whose individual outcomes in terms of system prototype development and evaluation contributed toward the developed dApps and the overall BIIT architecture and DLIT DL.

The outlined objectives of this thesis are decomposed into several specific goals, which are achieved by publications across the thesis period. Thus, it is essential to highlight how they are associated with this thesis as a foundation for the elements discussed in each chapter. Own publications are duly cited, as are the relevant state-of-the-art publications. Table 7.1 lists the contributions of own publications to this thesis on a per-chapter basis.

Chapter 1. Introduction

Chapter 1 provides the introduction and motivation regarding the importance and relevance of BIoT, as well as an overview of the thesis goals and research questions. In this sense, [197, 195] were the first works outlining the need for efficient BC and IoT adaptation as a multi-dimensional, challenging task for BIoT applications. Other publications such as [201, 205, 206] are used to complete the provided introductory arguments.


Chapter 2. Background

This chapter collected the background information needed in the following chapters. The three book chapters [206, 208, 220] are used to present the key BC, IoT, and BIoT-related technical aspects. Moreover, the assumptions and methodologies taken from the other publications listed in Table 7.1 contributed partially to the completion of the background details.

Chapter 3. Design and Implementation of BC-IoT Integrated Applications

This chapter covers the design and implementation specifications of the four dApps introduced in this thesis. These systems were initially developed by the thesis work of students [94, 96, 148, 150, 179], which later laid the foundation for the publications representing the 4 dApps in [196, 197, 200, 201]. This thesis derives its overall conclusions from the experience collected by working on those dApps by (a) suggesting a decision tree to identify when IoT use cases should use a BC and (b) identifying the BIoT challenges.

Chapter 4. BIIT — An Efficient BIoT Architecture

The design of the BIIT architecture was initiated by studying the evaluation results on LoRa and LTE-M IoT nodes performed by [71, 234]. The first version of BIIT was presented in [204], which was later improved in many aspects and presented in [205, 221]. Finally, BIIT 1.0 was proposed as a standard BIoT architecture in the recently published book chapter [206].

Chapter 5. DLIT — A Distributed Ledger for Efficient BIoT

DLIT, presented in Chapter 5, is based on Bazo, whose PoS version was developed in [45] and is mostly covered in Appendix 1. In the next steps, this version was improved in many scalability and security aspects by the work started in [41, 204] as the first sharded version of Bazo. Later, this version was adapted to provide TX aggregation in [165] and enhanced consensus mechanism reliability in [49]. While the results of these works were presented in the published papers [198, 203], the GDPR-compliant version of Bazo, which had already been renamed to DLIT, was implemented by [258].


Table 7.1: List of Own Publications and Student Theses Supervised

Chapter | Related Publications (Full Paper, Short Paper, Book Chapter, Demo, Technical Report) | Student Theses (M.Sc., B.Sc., IS, MAP)

1. Introduction | [201, 205], [197], [206], [195] | –

2. Background | [196, 203, 209], [206, 208, 220], [207] | –

3. Design and Implementation of BC-IoT Integrated Applications | [196, 200, 201], [197], [206], [195], [199, 202] | [96, 148, 150], [94, 96, 179]

4. BIIT — An Efficient BIoT Architecture | [205, 221], [206], [204] | [71, 234]

5. DLIT — A Distributed Ledger for Efficient BIoT | [198, 203, 209], [204], [207] | [41, 49, 165, 257]

Publication Details

[41] Design and Implementation of a Scalable IoT-based Blockchain,

[49] Enhancing the Scalability of a Sharded Blockchain

[71] Design and Implementation of an IoT-based Hierarchical Payment System with BAZO Blockchain.

[94] Design and Development of a Platform Agnostic Supply Chain Tracking Application

[95] Data Sovereignty Provision in Cloud-and-Blockchain-Integrated IoT Data Trading

[96] Developing a Blockchain-based Supply Chain Tracking Platform

[147] Design and Prototypical Implementation of an IoT Identification Platform based on Blockchains and Physical Unclonable Functions (PUF)

[149] Design and Implementation of an Integrated Water Quality Monitoring System and Blockchains

[165] Evaluation and Improving Scalability of the Bazo Blockchain

[180] Design and Development of an Android-based Supply Chain Tracking Application

[195] A Platform Independent, Generic-purpose, and Blockchain-based Supply Chain Tracking

[196] KYoT: Self-sovereign IoT Identification with a Physically Unclonable Function

[197] Design and Implementation of an Automated and Decentralized Pollution Monitoring System with Blockchains, Smart Contracts, and LoRaWAN

[198] DLIT: A Scalable Distributed Ledger for IoT Data

[199] A Blockchain-based Supply Chain Tracing for the Swiss Dairy Use Case

[200] A Blockchain-based Supply Chain Tracing for the Swiss Dairy Use Case

[201] ITrade: A Blockchain-based, Self-Sovereign, and Scalable Marketplace for IoT Data Streams

[202] A Blockchain-based Platform for Self-sovereign IoT Identification

[203] Toward Scalable Blockchains with Transaction Aggregation

[204] Adaptation of Proof-of-Stake-based Blockchains for IoT Data Streams

[205] Standardization of Blockchain-based I2oT Systems in the I4 Era

[206] Architectures for Blockchain-IoT Integration

[207] BAZO: A Proof-of-Stake (PoS) based Blockchain

[208] Enabling Technologies and Distributed Storage

[209] On-Chain IoT Data Modification in Blockchains

[220] Blockchains and Distributed Ledgers Uncovered: Clarifications, Achievements, and Open Issues

[221] Scalable Transport Mechanisms for Blockchain IoT Applications

[234] Simulation and Efficiency Improvements of the IoT Communication Protocols Used in the Supply Chain Monitoring Systems

[258] Design and Prototypical Development of a GDPR and Swiss Law Compliant Blockchain


References

[1] “Apps Run The World, Apps Research and Buyer Insight- Top10 ERP Software Vendors and Market Forecast 2017-2022,”https://www.appsruntheworld.com/about-us/, last visit: 6 December, 2019.

[2] “Arduino Boards & Modules,” https://store.arduino.cc/arduino-genuino/boards-modules, last visit: October 15, 2021.

[3] “Blockchain,” https://en.wikipedia.org/wiki/Blockchain, last visit: September 22,2021.

[4] “Data Sovereignty and The Cloud,” https://www.itgovernance.co.uk/data-sovereignty-and-the-cloud, last visit: October 15, 2021.

[5] “Freshness, Regionality, and Quality,” https://www.fuchsmilch.ch, last visit: Oc-tober 15, 2021.

[6] “How Many IoT Devices Are There in 2021?” https://techjury.net/blog/how-many-iot-devices-are-there/, last visit: January 20, 2022.

[7] “Internet of Things (IoT) and non-IoT Active Device Connections World-wide from 2010 to 2025,” https://www.statista.com/statistics/1101442/iot-number-of-connected-devices-worldwide/, last visit: January 20, 2022.

[8] “Kafka vs Pulsar - Performance, Features, and Architecture Compared,” https://www.confluent.io/kafka-vs-pulsar/, last visit: October 15, 2021.

[9] “Memory,” https://www.arduino.cc/en/tutorial/memory, last visit: October 15,2021.

[10] “MetaMask,” https://metamask.io/, last visit: October 15, 2021.

[11] “Molkerei Fuchs,” https://www.facebook.com/MolkereiFuchs/, last visit: Octo-ber 15, 2021.

[12] “Opentracing overview,” https://opentracing.io/docs/overview/, last visit: Octo-ber 15, 2021.

[13] “web3.js - Ethereum JavaScript API,” https://web3js.readthedocs.io/en/v1.3.0/,last visit: October 15, 2021.

[14] “Bitcoin,” https://en.wikipedia.org/wiki/Bitcoin, September 2017, last visit: Oc-tober 15, 2021.


[15] “Ethereum Homestead Documentation,” February 2017, http://www.ethdocs.org/en/latest/.

[16] “NEM, the Smart Asset Blockchain,” https://nem.io/, September 2017, last visit:October 15, 2021.

[17] “Top Crypto Assets,” https://cryptowat.ch/, August 2017, last visit: October 15,2021.

[18] “CNCF Cloud Native Definition v1.0,” https://github.com/cncf/toc/blob/master/DEFINITION.md, June 2018, last visit: October 15, 2021.

[19] “Internet Economics XII,” in IFI-TecReport No. 2018.01, B. Stiller, T. Bocek,P. Poullie, S. Rafati, B. B. Rodrigues, and C. Schmitt, Eds. Zürich, Switzerland:University of Zurich, February 2018.

[20] “How Much Data is Generated Each Day? - Full Size Version,”https://www.visualcapitalist.com/wp-content/uploads/2019/04/data-generated-each-day-full.html, April 2019, last visit: October 15, 2021.

[21] “The State of Broadband 2019: Broadband as a Foundation for Sustainable Devel-opment.” Broadband Commission, 2019, p. 11.

[22] “Cloud Storage,” https://cloud.google.com/storage, December 2020, last visit:October 15, 2021.

[23] “COVID-19 Impact on The Internet of Things (IoT) Market by Components(Software Solutions, Platforms, Services), Vertical (BFSI, Healthcare, Manufac-turing, Retail, Transportation, Utilities, Government & Defense) and Region -Global Forecast 2021,” https://www.researchandmarkets.com/reports/5019998/covid-19-impact-on-the-internet-of-things-iot, April 2020, last visit: October 15,2021.

[24] “Google/crc32c,” https://github.com/google/crc32c, December 2020, last visit:October 15, 2021.

[25] “How IPFS Works,” https://docs.ipfs.io/concepts/how-ipfs-works, December2020, last visit: October 15, 2021.

[26] “Object Storage Built to Store andRetrieve anyAmount ofData fromAnywhere,”https://aws.amazon.com/s3/, December 2020, last visit: October 15, 2021.

[27] “A Digital Future on a Global Scale,” https://ethereum.org/en/eth2/vision/,February 2021, last visit: October 15, 2021.

[28] “Sia Combines A Peer-To-Peer Network With Blockchain Technology To CreateTheWorld’s FirstDecentralized Storage Platform,” January 2021, https://sia.tech/technology.


[29] M. Abomhara and G. M. Køien, “Security and privacy in the Internet of Things:Current status and open issues,” in International Conference on Privacy and Secu-rity inMobile Systems (PRISMS 2014), Aalborg, Denmark, May 2014, pp. 1–8.

[30] F. Adelantado, X. Vilajosana, P. Tuset-Peiro, B. Martinez, J. Melia-Segui, andT. Watteyne, “Understanding the Limits of LoRaWAN,” T. S. El-Bawab, Ed.,Vol. 55, No. 9, 2017, pp. 34–40.

[31] Aeroqual, “Portable Outdoor Air Quality Monitors,” https://www.aeroqual.com/outdoor-air-quality-monitors/outdoor-portable-air-monitors, 2021, lastvisit: 06 June, 2021.

[32] T.Ahram,A. Sargolzaei, S. Sargolzaei, J.Daniels, andB.Amaba, “BlockchainTech-nology Innovations,” in IEEE Technology Engineering Management Conference(TEMSCON 2017), Santa Clara, California, USA, June 2017, pp. 137–141.

[33] E. Alliance, “ERC-725 Ethereum Identity Standard,” https://erc725alliance.org/,last visit: October 15, 2021.

[34] Amazon Web Services AWS-IoT, “IoT Services for Industrial, Consumer, andCommercial Solutions,” https://aws.amazon.com/iot, last visit: April 29, 2021.

[35] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis, A. De Caro,D. Enyeart, C. Ferris, G. Laventman, Y. Manevich, S. Muralidharan, C. Murthy,B. Nguyen, M. Sethi, G. Singh, K. Smith, A. Sorniotti, C. Stathakopoulou,M. Vukolić, S.W. Cocco, and J. Yellick, “Hyperledger Fabric: ADistributedOper-ating System for Permissioned Blockchains,” inACMEuroSys Conference (EuroSys2018), Porto, Portugal, April 2018, pp. 1–15.

[36] A. M. Antonopoulos and G. Wood, Mastering Ethereum: Building Smart Con-tracts andDapps. Sebastopol, California, USA:O’ReillyMedia, December 2018.

[37] L. Ardito, A. M. Petruzzelli, U. Panniello, and A. C. Garavelli, “Towards Indus-try 4.0: Mapping Digital Technologies for Supply Chain Management-MarketingIntegration,” Business Process Management Journal, Vol. 25, No. 2, pp. 323–346,July 2019, doi=10.1108/BPMJ-04-2017-0088.

[38] arduino.cc, “Getting Started with Arduino and Genuino MEGA2560,” https://www.arduino.cc/en/Guide/ArduinoMega2560, May 2018, last visit: June 13,2021.

[39] G. Ateniese, B.Magri, D. Venturi, and E. Andrade, “Redactable Blockchain – or –Rewriting History in Bitcoin and Friends,” in IEEE European Symposium on Secu-rity and Privacy (EuroS&P 2017), Paris, France, April 2017, pp. 111–126.

[40] N. Atzei, M. Bartoletti, and T. Cimoli, “A Survey of Attacks on Ethereum SmartContracts,” Cryptology ePrint Archive: Report 2016/1007, https://eprint. iacr.org/2016/1007, Tech. Rep., 2016.


[41] A. Augustin, J. Yi, T. Clausen, and W. M. Townsley, “A Study of LoRa: LongRange & Low Power Networks for the Internet of Things,” in Sensors, D. Kim,Ed., Vol. 16, No. 9, 2016.

[42] I. Authors, “Istio / Performance and Scalability,” https://istio.io/latest/docs/ops/deployment/performance-and-scalability/#performance-summary-for-istio-hahahugoshortcode-s0-hbhb, September2020, last visit: October 15, 2021.

[43] K. Aydinli, “Design and Implementation of a Scalable IoT-based Blockchain,”Communication Systems Group, Department of Informatics, Zürich, Switzer-land, 2019.

[44] A. Babaei and G. Schiele, “Physical Unclonable Functions in The Internet ofThings: State of The Art and Open Challenges,” Sensors, Vol. 19, No. 14, p. 3208,2019.

[45] S. Bachmann, “Proof of Stake for Bazo.” Zürich, Switzerland: University ofZürich, January 2018.

[46] A. Back, M. Corallo, L. Dashjr, M. Friedenbach, G. Maxwell, A. Miller, A. Poel-stra, J. Timón, and P. Wuille, “Enabling Blockchain Innovations with PeggedSidechains,” Vol. 72, October 2014, http://www.opensciencereview.com/papers/123/enablingblockchain-innovations-with-pegged-sidechains.

[47] A. Banerjee, “Chapter Nine - Blockchain with IoT: Applications and Use Casesfor a New Paradigm of Supply Chain Driving Efficiency and Cost,” in Role ofBlockchain Technology in IoT Applications, ser. Advances in Computers, S. Kim,G. C. Deka, and P. Zhang, Eds., Vol. 115. Elsevier, 2019, pp. 259–292.

[48] B. Barrett, “Wi-Fi That Charges Your Gadgets Is Closer Than You Think,” 2015,https://www.wired.com/2015/06/power-over-wi-fi/.

[49] E. Barriocanal, S. Sánchez-Alonso, and M. Sicilia, “Deploying Metadata onBlockchain Technologies,” ser. 4, E. Garoufallou, S. Virkus, R. Siatri, andD. Kout-somiha, Eds. Tallinn, Estonia: Springer International Publishing, 11 2017, pp.38–49.

[50] I. Bashir, Mastering Blockchain: Distributed Ledger Technology, Decentralization,and Smart Contracts Explained. Birmingham, England: Packt Publishing Ltd,2018.

[51] R. Beckmann, “Enhancing the Scalability of a Sharded Blockchain,” Communi-cation Systems Group, Department of Informatics, Zürich, Switzerland, March2020.

[52] M. Beedham, “Blockchain Sharding Made So Simple Your Dog Would Under-stand,” https://thenextweb.com/news/explainer-blockchain-sharding-beginners,January 2019, last visit: June 18, 2021.


[53] I. Bentov, C. Lee, A.Mizrahi, andM.Rosenfeld, “Proof ofActivity: ExtendingBit-coin’s Proof of Work via Proof of Stake [Extended Abstract]Y,” in SIGMETRICSPerformance Evaluation Review, Vol. 42, No. 3. New York, NY, USA: ACM,December 2014, pp. 34–37.

[54] D. J. Bernstein, N. Duif, T. Lange, P. Schwabe, and B. Yang, “High-speedhigh-security signatures,” in Journal of Cryptographic Engineering, Vol. 2, No. 2.Springer Link, September 2012, pp. 77–89.

[55] R. Bhanot and R. Hans, “A Review and Comparative Analysis of Various Encryp-tion Algorithms,” International Journal of Security and Its Applications, Vol. 9,No. 4, pp. 289–306, April 2015.

[56] K. Biswas and V. Muthukkumarasamy, “Securing Smart Cities Using BlockchainTechnology,” in IEEE 18th International Conference on High Performance Com-puting and Communications (HPCC 2016), Sydney, Australia, December 2016,pp. 1392–1393.

[57] bitcoindeveloper, “Block Chain,” December 2020, https://developer.bitcoin.org/reference/block_chain.html.

[58] BitFury Group, “Proof of Stake Versus Proof of Work,” September 2015, http://bitfury.com/content/5-white-papers-research/pos-vs-pow-1.0.2.pdf.

[59] Bitshares.org, “Bitshare Blockchain,” https://bitshares.org/, May 2021, last visit:12May, 2021.

[60] F. S. Björn Johansson, “Choosing Open Source ERP Systems: What Reasons AreThere ForDoing So?, CopenhagenBusiness School, Center forApplied ICT, Fred-eriksberg, Denmark June 2009,” last visit: 6 December, 2019.

[61] J. Blomer, “A Survey onDistributed File SystemTechnology,” in Journal of Physics:Conference Series, Vol. 608, No. 1. IOP Publishing, April 2015, p. 012039.

[62] R. Blum and T. Bocek, “Superlight – A Permissionless, Light-client OnlyBlockchain with Self-Contained Proofs and BLS Signatures,” in IFIP/IEEE Sym-posium on Integrated Network and Service Management (IM 2019), WashingtonDC, USA, Aptil 2019, pp. 36–41.

[63] T. Bocek, B. B. Rodrigues, T. Strasser, and B. Stiller, “Blockchains Everywhere - AUse-case of Blockchains in the Pharma Supply Chain,” in IFIP/IEEE Symposiumon IntegratedNetwork and ServiceManagement (IM2017), Lisbon, Portugal,May2017, pp. 772–777.

[64] T. Bocek and B. Stiller, “Smart Contracts - Blockchains in the Wings,” in DigitalMarketplaces Unleashed, C. Linnhoff-Popien, R. Schneider, andM. Zaddach, Eds.Tiergartenstr. 17, 69121 Heidelberg, Germany: Springer, January 2017, pp. 169–184.

[65] C.Bormann,M.Ersue, andA.Keranen, “Terminology forConstrained-NodeNet-works,” in Internet Research Task Force, RFC, Vol. 7228, May 2014.


[66] V. Brühl, “Libra—ADifferentiated view on Facebook’s Virtual Currency Project,”in Intereconomics, C. Breuer, E. Sprenger, J. Bourguignon, F.Warmbier, andC. Al-cidi, Eds., Vol. 55, No. 1. Springer, January 2020, pp. 54–61.

[67] V. Buterin, “Slasher: A Punitive Proof-of-Stake Algorithm,” January 2014, https://blog.ethereum.org/2014/01/15/slasher-a-punitive-proof-of-stake-algorithm/.

[68] ——, “On Stake,” August 2015, https://blog.ethereum.org/2014/07/05/stake.

[69] V. Buterin andG.Wood, “Home | ethereum.org,” https://ethereum.org/, last visit:October 15, 2021.

[70] CarbonMonoxideKills.Com, “Permissible levels of Carbon Monoxide - Car-bon Monoxide Kills,” http://www.carbonmonoxidekills.com/are-you-at-risk/carbon-monoxide-levels/, last visit: 06 June, 2021.

[71] A. D. Carli, M. Franco, A. Gassmann, C. Killer, B. Rodrigues, E. Scheid,D. Schoenbaechler, and B. Stiller, “WeTrace – A Privacy-preserving MobileCOVID-19 Tracing Approach and Application.” arxiv, May 2020, last visit: Oc-tober 15, 2021.

[72] D. Cawrey, “Are 51% Attacks a Real Threat to Bitcoin?” https://www.coindesk.com/51-attacks-real-threat-bitcoin, June 2014, ”last visit: May 3, 2021”.

[73] I. Cepilov, “Design and Implementation of an IoT-based Hierarchical PaymentSystem with BAZO Blockchain.” Zürich, Switzerland: Communication Sys-tems Group, Department of Informatics, University of Zürich, April 2019, https://files.ifi.uzh.ch/CSG/staff/Rafati/Ile-Cepilov-BA.pdf.

[74] I. Cepilov, E. Schiller, and S. Rafati Niya, “BIIT TTN Plugin, Bazo-client andminer Implementations,” https://gitlab.ifi.uzh.ch/csg/biot/, last visit: June 13,2021.

[75] S. Chandel, S. Zhang, and H.Wu, “Using Blockchain in IoT: Is it a Smooth RoadAhead for Real?” in Future of Information and Communication Conference (FICC2020), San Francisco, USA, March 2020, pp. 159–171.

[76] K.Chang andB.Mason, “The IEEE802.15.4g Standard for SmartMeteringUtilityNetworks,” in IEEE Third International Conference on Smart Grid Communica-tions (SmartGridComm 2012), Tainan City, Taiwan, November 2012, pp. 476–480.

[77] D. Chen, G. Chang, D. Sun, J. Li, J. Jia, and X. Wang, “TRM-IoT: A Trust Man-agement Model Based on Fuzzy Reputation for Internet of Things,” in ComputerScience and Information Systems, Vol. 8, No. 4. Novi Sad, Serbia: ComSIS Con-sortium, 2011, pp. 1207–1228.

[78] Y. Chen, “How Can Blockchain Technology HelpFight Air Pollution?” https://media.consensys.net/how-can-blockchain-technology-help-fight-air-pollution-3bdcb1e1045f, De-cember 2016, last visit: 06 June, 2021.


[79] I. Consulting, “Art. 16 GDPR Right to Rectification,” December 2020, https://gdpr-info.eu/art-16-gdpr/.

[80] ——, “Art. 17GDPRRight to Erasure (‘Right to be Forgotten’),”December 2020,https://gdpr-info.eu/art-17-gdpr/.

[81] S. Couture and S. Toupin, “What Does The Notion of ’Sovereignty’ MeanWhenReferring to The Digital?” New Media and Society, Vol. 21, No. 10, pp. 2305–2322, 2019.

[82] C. Crunch, “QuarkChain Review — A New Scalable BlockchainLooking to Dethrone Ethereum?” https://hackernoon.com/quarkchain-review-a-new-scalable-blockchain-looking-to-dethrone-ethereum-9ffc0e814772,April 2018, last visit: June 18, 2021.

[83] M. Dabbagh, K.-K. R. Choo, A. Beheshti, M. Tahir, and N. S. Safa, “A surveyof empirical performance evaluation of permissioned blockchain platforms: Chal-lenges and opportunities,” Computers & Security, Vol. 100, p. 102078, 2021.

[84] F. Dahlqvist, M. Patel, A. Rajko, and J. Shulman, “Growing Op-portunities in The Internet of Things,” https://www.mckinsey.com/industries/private-equity-and-principal-investors/our-insights/growing-opportunities-in-the-internet-of-things, July 2019, last visit: Octo-ber 15, 2021.

[85] H. Dai, Z. Zheng, and Y. Zhang, “Blockchain for Internet of Things: A Survey,” inIEEE Internet of Things Journal, H. Wang, Ed., Vol. 6, No. 5, October 2019, pp.8076–8094.

[86] S. Debnath, A. Chattopadhyay, and S.Dutta, “Brief Review on Journey of SecuredHashAlgorithms,” in 4th InternationalConference onOpto-Electronics andAppliedOptics (IEMOPTRONIX 2017), Kolkata, India, 2017, pp. 1–5.

[87] C. Decker and R. Wattenhofer, “A Fast and Scalable Payment Network with Bit-coin Duplex Micropayment Channels,” in 17th International Symposium on Sta-bilization, Safety, and Security of Distributed Systems (SSS 2015), A. Pelc andA. A.Schwarzmann, Eds. Edmonton, AB, Canada: Springer International Publishing,August 2015, pp. 3–18.

[88] D.Derler, K. Samelin, D. Slamanig, andC. Striecks, “Fine-Grained andControlledRewriting in Blockchains: Chameleon-Hashing Gone Attribute-Based,” in 26thAnnual Network andDistributed System Security Symposium (NDSS 2019). SanDiego, California, USA: The Internet Society, February 2019.

[89] D. Deuber, B. Magri, and S. Thyagarajan, “Redactable Blockchain in The Permis-sionless Setting,” in Symposium on Security and Privacy (SP 2019). San Francisco,CA, USA: IEEE, 2019, pp. 124–138.

[90] Dezentrum, “Mittels Zukunftsexperimenten schaffen wir pos-itive Szenarien für eine digitale Gesellschaft von morgen,”https://www.linkedin.com/company/dezentrum, last visit: October 15, 2021.


[91] V. Dhillon, D. Metcalf, and M. Hooper, “Unpacking Ethereum,” in BlockchainEnabled Applications: Understand the Blockchain Ecosystem and How to Make itWork for You. Berkeley, California, USA: Apress, 2021, pp. 37–72.

[92] Y. Dodis, R. Ostrovsky, L. Reyzin, and A. Smith, “Fuzzy Extractors: How to Gen-erate StrongKeys fromBiometrics andOtherNoisyData,” SIAMJournal onCom-puting, Vol. 38, No. 1, p. 97–139, 2008.

[93] A. Dohr, R. Modre-Opsrian, M. Drobics, D. Hayn, and G. Schreier, “The Inter-net of Things for Ambient Assisted Living,” in Seventh International Conferenceon Information Technology: New Generations (ITNG 2010), Las Vegas, NV, USA,April 2010, pp. 804–809.

[94] D. Dordevic, “Data Market Place,” https://github.com/IoT-Data-Marketplace,last visit: October 15, 2021.

[95] ——, “Swiss Dairy Tracking Platform,” https://github.com/ddanijel/nutria-dapp, last visit: October 15, 2021.

[96] ——, “Design and Development of a Platform Agnostic Supply Chain Track-ing Application,” Communication Systems Group, Department of Informatics,Zürich, Switzerland, August 2019.

[97] ——, “Data Sovereignty Provision in Cloud-and-Blockchain-Integrated IoT DataTrading,” Master’s thesis, Zürich, Switzerland, September 2020.

[98] D. Dordevic, A. G. Nabi, and T. Mann, “Developing a Blockchain-based SupplyChain Tracking Platform,” Communication Systems Group, Department of In-formatics, Zürich, Switzerland, September 2019.

[99] A.Dorri, S. S.Kanhere, andR. Jurdak, “MOF-BC:AMemoryOptimized andFlex-ibleBlockchain forLarge ScaleNetworks,” inFutureGenerationComputer Systems,Vol. 92, 2019, pp. 357 – 373.

[100] A. Dorri, S. S. Kanhere, R. Jurdak, and P. Gauravaram, “Blockchain for IoT Secu-rity andPrivacy: TheCase Study of a SmartHome,” in IEEE International Confer-ence on Pervasive Computing and CommunicationsWorkshops (PerComWorkshops2017), Kona, Big Island, HI, USA, 2017, pp. 618–623.

[101] ——, “LSB: A Lightweight Scalable Blockchain for IoT Security andAnonymity,”in Journal of Parallel and Distributed Computing, V. Prasanna, O. Beaumont, andS. Ranka, Eds., Vol. 134, 2019, pp. 180–197.

[102] Dragino.com, “LoRa Shield for Arduino,” http://www.dragino.com/products/lora/item/102-lora-shield.html, August 2020, last visit: April 2, 2021.

[103] M. Drake, “Understanding Database Sharding,” https://www.digitalocean.com/community/tutorials/understanding-database-sharding, February 2019, last visit:June 18, 2021.


[104] D. Draskovic and G. Saleh, “Datapace - Decentralized Data Marketplace Based onBlockchain,” inDatapace, December 2017.

[105] J. M. Dunwell, “Global Population Growth, Food Security and Food and FarmingforThe Future,” in SuccessfulAgricultural Innovation inEmergingEconomies: NewGenetic Technologies for Global Food Production, D. J. Bennett and R. C. Jennings,Eds. Cambridge: Cambridge University Press, 2013, pp. 23–38.

[106] J. Eberhardt and S.Tai, “OnorOff theBlockchain? Insights onOffChainingCom-putation and Data,” in Service-Oriented and Cloud Computing (ESOCC 2017),F. De Paoli, S. Schulte, and E. Broch Johnsen, Eds. Oslo, Norway: Springer In-ternational Publishing, 2017, pp. 3–15.

[107] B. Ekaba, “AnOverview ofGoogle Cloud Platform Services,” inBuildingMachineLearning and Deep Learning Models on Google Cloud Platform. Berkeley, Cali-fornia: Apress, 2019, pp. 7–10, https://doi.org/10.1007/978-1-4842-4470-8_2.

[108] Engineering ToolBox, “Carbon Dioxide Concentration - Comfort Levels,” https://www.engineeringtoolbox.com/co2-comfort-level-d_1024.html, 2008, last visit:06 June, 2021.

[109] L. Eschenauer and V. D. Gligor, “A Key-management Scheme for Distributed Sen-sorNetworks,” in 9thACMConference onComputer andCommunications Security(CSS 2002), ser. CCS ’02. New York, NY, USA: ACM, 2002, pp. 41–47.

[110] F. J. Estévez, P. Glösekötter, and J. González, “DARAL: ADynamic and AdaptiveRouting Algorithm for Wireless Sensor Networks,” I. Bravo, Ed., Vol. 16, No. 7,2016.

[111] Ethereum WiKi, “Ethash DAG,” November 2020, https://eth.wiki/concepts/ethash/dag.

[112] Ethereum Wiki, “Light Client Protocol,” https://github.com/ethereum/wiki/wiki/Light-client-protocol, November 2020, last visit: 06 June, 2021.

[113] EthHub, “Identity Standards,” https://docs.ethhub.io/built-on-ethereum/identity/ERC-EIP/, last visit: October 15, 2021.

[114] “Swarm - Storage and Communication for a Sovereign Digital Society,” https://swarm.ethereum.org/, last visit: October 15, 2021.

[115] Eth.wiki, “Proof of Stake FAQs,” https://eth.wiki/en/concepts/proof-of-stake-faqs, June 2020, last visit: May 1, 2021.

[116] European Parliamentary Research Service, “Blockchain and the General Data Pro-tection Regulation, Can Distributed Ledgers be Squared with European DataProtection law?” July 2019, https://www.europarl.europa.eu/RegData/etudes/STUD/2019/634445/EPRS_STU(2019)634445_EN.pdf.


[117] C. Fan, S. Ghaemi, H. Khazaei, and P. Musilek, “Performance Evaluation ofBlockchain Systems: A Systematic Survey,” IEEE Access, Vol. 8, pp. 126 927–126 950, June 2020.

[118] S. Farahani, Ed., ZigBee Wireless Networks and Transceivers. Burlington, Ver-mont, USA: Newnes, 2008.

[119] Federal Office of Agriculture (FOAG), https://www.blw.admin.ch/blw/en/home.html, last visit: October 15, 2021.

[120] T. M. Fernández-Caramés and P. Fraga-Lamas, “A Review on The Use ofBlockchain for The Internet of Things,” in IEEE Access, Vol. 6, 2018, pp. 32 979–33 001.

[121] Filecoin.io, “The technology Behind IPFS and Filecoin,” January 2021, https://docs.filecoin.io/about-filecoin/.

[122] “The Open Source Platform for Our Smart Digital Future - FIWARE,” https://www.fiware.org/, last visit: October 15, 2021.

[123] Foodways, “UnlockingSustainableValue,” https://www.linkedin.com/company/foodways-consulting-ag//, last visit: October 15, 2021.

[124] Gartner, “Gartner Forecasts Worldwide Public Cloud Revenue to Grow 17% in2020,” shorturl.at/ltE79, November 2019, last visit: October 15, 2021.

[125] ——, “Gartner SaysData andCyber-RelatedRisks Remain TopWorries for AuditExecutives,” shorturl.at/fhqAB, November 2019, (last visit on 08/03/2020).

[126] B. Gassend, D. Clarke, M. V. Dijk, and S. Devadas, “Silicon Physical RandomFunctions,” Proceedings of the 9th ACM conference on Computer and communica-tions security - CCS, 2002.

[127] “General Data Protection Regulation (GDPR) Official Legal Text,” https://gdpr-info.eu/, last visit: October 15, 2021.

[128] A. Gervais, G. O. Karame, K. Wüst, V. Glykantzis, H. Ritzdorf, and S. Cap-kun, “On the Security and Performance of Proof of Work Blockchains,” in ACMSIGSAC Conference on Computer and Communications Security (CCS 2016). Vi-enna, Austria: ACM, October 2016, pp. 3–16.

[129] R. L. GholizadehMH,Melesse AM, “AComprehensive Review onWater QualityParameters Estimation Using Remote Sensing Techniques,” in Sensors, A. M. Me-lesse, E. Kaba Ayana, and G. Senay, Eds., Vol. 16, No. 8. Basel, Switzerland: USNational Library of Medicine (NCBI), August 2016.

[130] D. Gislason, Ed., Zigbee Wireless Networking. Burlington, Vermont, USA:Newnes, 2008.


[131] I. Goddijn and J. Kouns, “Request The 2019 Q1 DataBreach QuickView Report,” https://pages.riskbasedsecurity.com/2019-midyear-data-breach-quickview-report, August 2019, last visit: Octo-ber 15, 2021.

[132] E.González, J. Casanova-Chafer, A.Romero, X.Vilanova, J.Mitrovics, andE. Llo-bet, “LoRa Sensor Network Development for Air Quality Monitoring or Detect-ingGas Leakage Events,” in Sensors, V.M. Passaro, R. Bruno, andR.Ghaffari, Eds.,Vol. 20, No. 21, 2020.

[133] C. Goursaud and J.-M. Gorce, “Dedicated Networks for IoT: PHY / MAC Stateof the Art and Challenges,” in EAI Endorsed Transactions on Internet of Things,D. Jiunn Deng and S. Li, Eds., Vol. 1, No. 1. European Alliance for Innovation,October 2015.

[134] M. Green and I. Miers, “Bolt: Anonymous Payment Channels for DecentralizedCurrencies,” inConference onComputer andCommunications Security (CCS2017),B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds. Dallas, TX, USA:ACM SIGSAC, October 2017, pp. 473–489.

[135] S. Haber andW. S. Stornetta, “How to Time-Stamp aDigital Document,” Journalof Cryptology, Vol. 3, No. 2, pp. 99–111, January 1991, ckecked.

[136] A. Hac, “Distributed File Systems-A Survey,” ACM SIGOPS Operating SystemsReview, Vol. 19, No. 1, pp. 15–18, 1985.

[137] A.Hafid, A. S.Hafid, andM. Samih, “Scaling Blockchains: AComprehensive Sur-vey,” IEEE Access, Vol. 8, pp. 125 244–125 262, July 2020.

[138] D. Han, H. Kim, and J. Jang, “Blockchain Based Smart Door Lock System,” inInternational Conference on Information and Communication Technology Conver-gence (ICTC 2017), Jeju Island, South Korea, October 2017, pp. 1165–1167.

[139] T. Hansen and D. E. Eastlake, “US Secure Hash Algorithms (SHA and HMAC-SHA),” No. 4634, July 2006, https://rfc-editor.org/rfc/rfc4634.txt.

[140] M. Hernandez, “Connectivity Now and Beyond; Exploring Cat-M1, NB-IoT, and LPWAN Connections,” https://ubidots.com/blog/exploring-cat-m1-nb-iot-lpwan-connections/, July 2018, last visit: May 14,2021.

[141] D. Holcomb, W. Burleson, and K. Fu, “Power-Up SRAM State as an IdentifyingFingerprint and Source of True Random Numbers,” IEEE Transactions on Com-puters, Vol. 58, No. 9, p. 1198–1210, 2009.

[142] S. M. Hosseini Bamakan, A. Motavali, and A. Babaei Bondarti, “A Survey ofBlockchain Consensus Algorithms Performance Evaluation Criteria,” in ExpertSystems with Applications, Vol. 154. Elsevier, September 2020, p. 113385.


[143] H. Huang, J. Lin, B. Zheng, Z. Zheng, and J. Bian, "When Blockchain Meets Distributed File Systems: An Overview, Challenges, and Open Issues," IEEE Access, Vol. 8, pp. 50574–50586, 2020.

[144] J. Humenansky, "The Impact of Digital Identity," https://medium.com/blockchain-at-berkeley/the-impact-of-digital-identity-9eed5b0c3016, November 2018, last visit: October 15, 2021.

[145] "Hyperledger Fabric - Hyperledger," https://www.hyperledger.org/projects/fabric, last visit: October 15, 2021.

[146] ingenu.com, "Full Featured Value," https://www.ingenu.com/technology/rpma/value/, 2020, last visit: May 14, 2021.

[147] IOTA Foundation, "IOTA," 2021, https://www.iota.org/, last visit: May 30, 2021.

[148] ——, "The Coordicide," May 2021, https://coordicide.iota.org/, last visit: May 30, 2021.

[149] B. Jeffrey, "Design and Prototypical Implementation of an IoT Identification Platform based on Blockchains and Physical Unclonable Functions (PUF)," Communication Systems Group, Department of Informatics, Zürich, Switzerland, September 2020.

[150] S. Jha, "Pollution Monitoring System Using Smart Contracts," https://github.com/jhasanjiv5/smartlabs, last visit: June 6, 2021.

[151] S. S. Jha, "Design and Implementation of an Integrated Water Quality Monitoring System and Blockchains," Master's thesis, Zürich, Switzerland, January 2018.

[152] ——, "Design and Implementation of an Integrated Water Quality Monitoring System and Blockchains," Communication Systems Group, Department of Informatics. Zürich, Switzerland: University of Zürich, January 2018.

[153] Kane International Ltd, "What Are Safe Levels of CO and CO2 in Rooms?" https://www.kane.co.uk/knowledge-centre/what-are-safe-levels-of-co-and-co2-in-rooms, last visit: June 6, 2021.

[154] M. Khalili, M. Dakhilalian, and W. Susilo, "Efficient Chameleon Hash Functions in The Enhanced Collision Resistant Model," Information Sciences, Vol. 510, pp. 155–164, 2020.

[155] R. Khan, S. U. Khan, R. Zaheer, and S. Khan, "Future Internet: The Internet of Things Architecture, Possible Applications and Key Challenges," in 10th International Conference on Frontiers of Information Technology (FIT 2012), Islamabad, Pakistan, December 2012, pp. 257–260.

[156] E. Kokoris Kogias, P. S. Jovanovic, L. Gasser, N. Gailly, E. Syta, and B. A. Ford, "OmniLedger: A Secure, Scale-Out, Decentralized Ledger via Sharding," pp. 583–598, May 2018.


[157] A. Kothari, "Understanding Harmony's Cuckoo Rule for Resharding," https://medium.com/harmony-one/understanding-harmonys-cuckoo-rule-for-resharding-215766f4ca50, April 2019, last visit: June 18, 2021.

[158] D. Kraft, "Difficulty Control for Blockchain-based Consensus Systems," in Peer-to-Peer Networking and Applications, X. Shen, L. Bedogni, and J. Cao, Eds., Vol. 9, No. 2, March 2016, pp. 397–413.

[159] H. Krawczyk and T. Rabin, "Chameleon Signatures," in Network and Distributed System Security Symposium (NDSS 2000). San Diego, California, USA: The Internet Society, 2000.

[160] N. Kshetri, "Can Blockchain Strengthen the Internet of Things?" IT Professional, Vol. 19, No. 4, pp. 68–72, 2017.

[161] A. M. Kurt, "Zero to Monero - First Edition," 2018, Semantic Scholar, https://bit.ly/3fOpBYC.

[162] M. LABS, "Mainflux Open Source IoT Platform," https://www.mainflux.com/, last visit: October 15, 2021.

[163] C. Lee, "Litecoin - The Cryptocurrency for Payments," 2011, https://litecoin.org/, last visit: May 30, 2021.

[164] I.-C. Lin and T.-C. Liao, "A Survey of Blockchain Security Issues and Challenges," in International Journal of Network Security, Vol. 19, No. 5, September 2017, pp. 653–659.

[165] D. Linke and S. Strahringer, "Integration einer Blockchain in ein ERP-System für den Procure-to-Pay-Prozess: Prototypische Realisierung mit SAP S/4HANA und Hyperledger Fabric am Beispiel der Daimler AG," HMD Praxis der Wirtschaftsinformatik, Vol. 55, No. 6, pp. 1341–1359, December 2018.

[166] LoRa Alliance, "A Technical Overview of LoRa and LoRaWAN," https://lora-alliance.org/resource_hub/lorawan-specification-v1-1/, last visit: April 29, 2021.

[167] F. Maddaloni, "Evaluation and Improving Scalability of the Bazo Blockchain." Zürich, Switzerland: Communication Systems Group, Department of Informatics, April 2019.

[168] M. Mangat, "Kubernetes vs Docker Swarm: What are the Differences?" https://phoenixnap.com/blog/kubernetes-vs-docker-swarm, April 2019, last visit: October 4, 2020.

[169] J. Manyika, M. Chui, P. Bisson, J. Woetzel, R. Dobbs, J. Bughin, and D. Aharon, "The Internet of Things: Mapping the Value Beyond The Hype," https://owncloud.csg.uzh.ch/index.php/s/abaGWX2PzPLnc7W, June 2015, last visit: October 15, 2021.


[170] M. Maroufi, R. Abdolee, and B. Mozaffari Tazehkand, "On the Convergence of Blockchain and Internet-of-Things (IoT) Technologies," in Journal of Strategic Innovation and Sustainability (JSIS), Vol. 14, No. 1, March 2019.

[171] P. Maymounkov and D. Mazières, "Kademlia: A Peer-to-Peer Information System Based on the XOR Metric," in Peer-to-Peer Systems, IPTPS 2002, Lecture Notes in Computer Science, Vol. 2429, Springer, Berlin, Heidelberg, https://doi.org/10.1007/3-540-45748-8_5.

[172] N. McCarthy, "Bitcoin Devours More Electricity than Switzerland," July 2019, https://www.forbes.com/sites/niallmccarthy/2019/07/08/bitcoin-devours-more-electricity-than-switzerland-infographic/#6f2a0a3321c0.

[173] T. McConaghy, R. Marques, A. Müller, D. De Jonghe, T. McConaghy, G. McMullen, R. Henderson, S. Bellemare, and A. Granzotto, "BigchainDB: A Scalable Blockchain Database," White Paper, BigChainDB, June 2016.

[174] E. Meier and M. Steiner, "Integrating Smart Contracts into the Bazo Blockchain." Rapperswil, Switzerland: Department of Computer Science, University of Applied Sciences Rapperswil, https://eprints.ost.ch/682/1/FS%202018-BA-EP-Steiner-Meier-Integrating%20Smart%20Contracts%20into%20the%20Bazo%20Blockchain.pdf, last visit: June 23, 2021.

[175] Microsoft Azure, "Azure IoT Solution Accelerators," https://azure.microsoft.com/en-us/features/iot-accelerators/, last visit: April 29, 2021.

[176] M. Isaac and S. Frenkel, "Facebook Security Breach Exposes Accounts of 50 Million Users - The New York Times," https://www.nytimes.com/2018/09/28/technology/facebook-hack-data-breach.html, September 2018, last visit: October 15, 2021.

[177] A. A. Monrat, O. Schelén, and K. Andersson, "A Survey of Blockchain From the Perspectives of Applications, Challenges, and Opportunities," IEEE Access, Vol. 7, pp. 117134–117151, 2019.

[178] E. Morin, M. Maman, R. Guizzetti, and A. Duda, "Comparison of the Device Lifetime in Wireless Networks for the Internet of Things," IEEE Access, Vol. 5, pp. 7097–7114, 2017.

[179] M. Moufaddal, A. Benghabrit, and I. Bouhaddou, "Industry 4.0: A Roadmap to Digital Supply Chains," in International Conference on Smart Systems and Data Science (ICSSD 2019), Rabat, Morocco, October 2019, pp. 1–9.

[180] A. Mpitziopoulos, D. Gavalas, C. Konstantopoulos, and G. Pantziou, "A Survey on Jamming Attacks and Countermeasures in WSNs," in IEEE Communications Surveys and Tutorials, Vol. 11, No. 4, April 2009, pp. 42–56.

[181] A. Mushtaq and I. U. Haq, "Implications of Blockchain in Industry 4.0," in International Conference on Engineering and Emerging Technologies (ICEET), February 2019, pp. 1–5.


[182] A. G. Nabi, "Design and Development of an Android-based Supply Chain Tracking Application." Zürich, Switzerland: Department of Informatics, January 2019.

[183] O. Novo, "Blockchain Meets IoT: An Architecture for Scalable Access Management in IoT," in IEEE Internet of Things Journal, J. Hu, K. Yang, S. T. Marin, and H. Sharif, Eds., Vol. 5, No. 2, 2018, pp. 1184–1195.

[184] nsnam, "NS-3, A Discrete-Event Network Simulator," https://www.nsnam.org/, last visit: June 13, 2021.

[185] Nxt Community, "Nxt Whitepaper," July 2014, https://www.jelurida.com/sites/default/files/NxtWhitepaper.pdf.

[186] B. Oram, "The pH of Water," http://www.water-research.net/index.php/ph, last visit: June 6, 2021.

[187] K. R. Ozyilmaz, M. Dogan, and A. Yurdakul, "IDMoB: IoT Data Marketplace on Blockchain," in Crypto Valley Conference on Blockchain Technology (CVCBT), 2018, pp. 11–19.

[188] A. Panarello, N. Tapas, G. Merlino, F. Longo, and A. Puliafito, "Blockchain and IoT Integration: A Systematic Survey," V. Passaro, R. Ghaffari, and Y. Hu, Eds., Vol. 18, No. 8, August 2018.

[189] "What is Practical Byzantine Fault Tolerance (PBFT)? - Crush Crypto," https://crushcrypto.com/what-is-practical-byzantine-fault-tolerance/, last visit: October 15, 2021.

[190] M. Platonov, J. Hlavác, and R. Lórencz, "Using Power-Up SRAM State of Atmel ATmega1284P Microcontrollers as Physical Unclonable Function for Key Generation and Chip Identification," Information Security Journal: A Global Perspective, Vol. 22, No. 5-6, pp. 244–250, February 2013.

[191] A. Poelstra, "Distributed Consensus from Proof of Stake is Impossible," May 2014, https://download.wpsoftware.net/bitcoin/old-pos.pdf.

[192] E. Politou, F. Casino, E. Alepis, and C. Patsakis, "Blockchain Mutability: Challenges and Proposed Solutions," in IEEE Transactions on Emerging Topics in Computing, P. Montuschi, Ed., October 2019.

[193] S. Popov, "The Tangle," 2018.

[194] QuantaBytes Technology Ltd, "A Survey of Bitcoin Transaction Types," January 2021, https://www.quantabytes.com/articles/a-survey-of-bitcoin-transaction-types.

[195] Queensland Government, "Air Quality Monitoring," https://www.qld.gov.au/environment/pollution/monitoring/air-monitoring, December 2017, last visit: June 6, 2021.


[196] R. Radhakrishnan, G. S. Ramachandran, and B. Krishnamachari, "SDPP: Streaming Data Payment Protocol for Data Economy," in IEEE International Conference on Blockchain and Cryptocurrency (ICBC), 2019, pp. 17–18.

[197] S. Rafati Niya, D. Dordevic, A. G. Nabi, T. Mann, and B. Stiller, "A Platform-independent, Generic-purpose, and Blockchain-based Supply Chain Tracking," in IEEE International Conference on Blockchain and Cryptocurrency (ICBC 2019), Seoul, South Korea, May 2019, pp. 11–12.

[198] S. Rafati Niya, B. Jeffrey, and B. Stiller, "KYoT: Self-sovereign IoT Identification with a Physically Unclonable Function," in IEEE 45th Conference on Local Computer Networks (LCN 2020), Sydney, Australia, November 2020, pp. 485–490.

[199] S. Rafati Niya, S. S. Jha, T. Bocek, and B. Stiller, "Design and Implementation of an Automated and Decentralized Pollution Monitoring System with Blockchains, Smart Contracts, and LoRaWAN," in IEEE/IFIP Network Operations and Management Symposium (NOMS 2018), Taipei, Taiwan, April 2018, pp. 1–4.

[200] S. Rafati Niya, R. Beckmann, and B. Stiller, "DLIT: A Scalable Distributed Ledger for IoT Data," in Second International Conference on Blockchain Computing and Applications (BCCA 2020), Antalya, Turkey, November 2020, pp. 100–107.

[201] S. Rafati Niya, D. Dordevic, M. Hurschler, S. Grossenbacher, and B. Stiller, "A Blockchain-based Supply Chain Tracing for the Swiss Dairy Use Case," November 2020, https://owncloud.csg.uzh.ch/index.php/s/rH6sA25C9JegEHW.

[202] S. Rafati Niya, D. Dordevic, M. Hurschler, S. Grossenbacher, and B. Stiller, "A Blockchain-based Supply Chain Tracing for the Swiss Dairy Use Case," in 2nd International Conference on Societal Automation (SA 2021), Funchal, Portugal, 2021.

[203] S. Rafati Niya, D. Dordevic, and B. Stiller, "ITrade: A Blockchain-based, Self-Sovereign, and Scalable Marketplace for IoT Data Streams," in IFIP/IEEE International Symposium on Integrated Network Management (IM 2021). Bordeaux, France: IFIP/IEEE, May 2021, pp. 530–536.

[204] S. Rafati Niya, B. Jeffrey, and B. Stiller, "A Blockchain-based Platform for Self-sovereign IoT Identification," in IfI Technical Report No. 2020.04. Zürich, Switzerland: University of Zurich, September 2020.

[205] S. Rafati Niya, F. Maddaloni, T. Bocek, and B. Stiller, "Toward Scalable Blockchains with Transaction Aggregation," in Symposium on Applied Computing (SAC 2020). Brno, Czech Republic: ACM, 2020, pp. 308–315.

[206] S. Rafati Niya, E. Schiller, I. Cepilov, F. Maddaloni, K. Aydinli, T. Surbeck, T. Bocek, and B. Stiller, "Adaptation of Proof-of-Stake-based Blockchains for IoT Data Streams," in International Conference on Blockchain and Cryptocurrency (ICBC 2019), Seoul, South Korea, 2019, pp. 15–16.


[207] S. Rafati Niya, E. Schiller, I. Cepilov, and B. Stiller, "Standardization of Blockchain-based I2oT Systems in the I4 Era," in IEEE/IFIP Network Operations and Management Symposium (NOMS 2020), Budapest, Hungary, April 2020, pp. 1–9.

[208] S. Rafati Niya, E. Schiller, and B. Stiller, "Architectures for Blockchain-IoT Integration," in Communication Networks and Service Management in the Era of Artificial Intelligence and Machine Learning, ser. IEEE Press Series on Networks and Service Management, N. Zincir-Heywood, Y. Diao, and M. Mellia, Eds. New York, NY, USA: Wiley-IEEE Press, October 2021, pp. 100–137.

[209] S. Rafati Niya and B. Stiller, "BAZO: A Proof-of-Stake (PoS) based Blockchain," in IfI Technical Report No. 2019.03. Zürich, Switzerland: University of Zurich, May 2019, https://bit.ly/2G3odoh.

[210] ——, "Enabling Technologies and Distributed Storage," in Blockchains: Empowering Technologies and Industrial Applications, A. Al-Dulaimi, O. Dobre, and C.-L. I, Eds. IEEE/Wiley, May 2021, pp. 1–45.

[211] S. Rafati Niya, J. Willems, and B. Stiller, "On-Chain IoT Data Modification in Blockchains," 2021, https://arxiv.org/abs/2103.10756.

[212] L. S. Rani, K. Sudhakar, and S. V. Kumar, "Distributed File Systems: A Survey," in International Journal of Computer Science and Information Technologies (IJCSIT), J. Zizka, O. Abdul Kadhir, J. Tony, and S. Abdelmageed, Eds., Vol. 5, No. 3. Citeseer, 2014.

[213] U. Raza, P. Kulkarni, and M. Sooriyabandara, "Low Power Wide Area Networks: An Overview," in IEEE Communications Surveys Tutorials, D. Niyato, F. Granelli, and M. H. Rehmani, Eds., Vol. 19, No. 2, June 2017, pp. 855–873.

[214] J. Reese, "Management von Wertschöpfungsketten," February 2016, doi:10.15358/9783800651979-I.

[215] A. Reyna, C. Martín, J. Chen, E. Soler, and M. Díaz, "On Blockchain and its Integration with IoT. Challenges and Opportunities," in Future Generation Computer Systems, M. Taufer, A. Belloum, D. Abramson, and I. Altintas, Eds., Vol. 88, November 2018, pp. 173–190.

[216] K. Robles, "Hierarchical Deterministic Keys," https://www.w3.org/2016/04/blockchain-workshop/interest/robles.html, last visit: October 15, 2021.

[217] R. Roman, J. Zhou, and J. Lopez, "On the Features and Challenges of Security and Privacy in Distributed Internet of Things," Computer Networks, Vol. 57, No. 10, pp. 2266–2279, 2013.

[218] K. Salah, "IoT Access Control and Authentication Management via Blockchain," June 2018.


[219] M. Salimitari, M. Chatterjee, and Y. Fallah, "A Survey on Consensus Methods in Blockchain for Resource-constrained IoT Networks," in Internet of Things, F. Xhafa, Ed., Vol. 11, September 2020, p. 100212.

[220] S. Sanju, S. Sankaran, and K. Achuthan, "Energy Comparison of Blockchain Platforms for Internet of Things," in IEEE International Symposium on Smart Electronic Systems (iSES 2018), Hyderabad, India, December 2018, pp. 235–238.

[221] A. Sapirshtein, Y. Sompolinsky, and A. Zohar, "Optimal Selfish Mining Strategies in Bitcoin," in Financial Cryptography and Data Security, J. Grossklags and B. Preneel, Eds., Vol. 9603. Berlin, Heidelberg: Springer Berlin Heidelberg, 2017, pp. 515–532.

[222] E. Scheid, B. B. Rodrigues, C. Killer, M. Franco, S. Rafati Niya, and B. Stiller, "Blockchains and Distributed Ledgers Uncovered: Clarifications, Achievements, and Open Issues," in Advancing ICT Research: IFIP's 60plus Exciting Years Seen by TCs and WGs, Vol. 600. Cham, Switzerland: Springer International Publishing, February 2021.

[223] E. Schiller, S. Rafati Niya, T. Surbeck, and B. Stiller, "Scalable Transport Mechanisms for Blockchain IoT Applications," in IEEE 44th Conference on Local Computer Networks (LCN 2019), Osnabrück, Germany, October 2019, pp. 34–41.

[224] section.io, "Preventing Long Tail Latency," https://www.section.io/blog/preventing-long-tail-latency/, November 2018, last visit: October 15, 2021.

[225] Semtech, "Air Pollution Monitoring," https://www.semtech.com/uploads/technology/LoRa/app-briefs/Semtech_Enviro_AirPollution_AppBrief-FINAL.pdf, 2016, last visit: June 6, 2021.

[226] L. Sgier, "Bazo - A Cryptocurrency from Scratch." Zürich, Switzerland: Communication Systems Group, Department of Informatics, University of Zürich, August 2017, https://files.ifi.uzh.ch/CSG/staff/bocek/extern/theses/BA-Livio-Sgier.pdf.

[227] P. K. Sharma, M. Chen, and J. H. Park, "A Software Defined Fog Node Based Distributed Blockchain Cloud Architecture for IoT," IEEE Access, Vol. 6, pp. 115–124, 2018.

[228] J. Shen, Y. Li, Y. Zhou, and X. Wang, "Understanding I/O Performance of IPFS Storage: A Client's Perspective," in Proceedings of the International Symposium on Quality of Service, ser. IWQoS '19. New York, NY, USA: Association for Computing Machinery, 2019, https://doi.org/10.1145/3326285.3329052.

[229] A. K. Show, A. Kumar, A. Singhal, S. R. Kumar, and N. Gayathri, "6 Blockchain Storage," Blockchain, Big Data and Machine Learning: Trends and Applications, pp. 141–152, September 2020.


[230] M. Siddiqi, S. T. Ali, and V. Sivaraman, "Secure Lightweight Context-driven Data Logging for Bodyworn Sensing Devices," in 5th International Symposium on Digital Forensic and Security (ISDFS 2017), Tirgu Mures, Romania, April 2017, pp. 1–6.

[231] J. J. Sikorski, J. Haughton, and M. Kraft, "Blockchain Technology in the Chemical Industry: Machine-to-Machine Electricity Market," J. Yan, S.-K. Chou, U. Desideri, and H. Yang, Eds., Vol. 195, 2017, pp. 234–246.

[232] Sixfab, "Cellular IoT Application Shield for Arduino," https://sixfab.com/product/arduino-cellular-iot-application-shield/, last visit: June 13, 2021.

[233] N. P. Smart, Cryptography Made Simple, ser. Information Security and Cryptography. Cham, Switzerland: Springer International Publishing, 2016, No. 1.

[234] B. Stiller, S. Rafati Niya, and S. Grossenbacher, "Application of Blockchain Technology in the Swiss Food Value Chain (Foodchains Project Report)." Zürich, Switzerland: University of Zürich, June 2019.

[235] Storj Labs Inc., "Storj: A Decentralized Cloud Storage Network Framework," October 2018, https://storj.io/whitepaper/.

[236] T. Surbeck, "Simulation and Efficiency Improvements of the IoT Communication Protocols Used in the Supply Chain Monitoring Systems." Zürich, Switzerland: Communication Systems Group, Department of Informatics, University of Zürich, April 2019.

[237] Q. C. Team, "After Scaling, Why Do Public Chains Still Fail to Land?" https://medium.com/quarkchain-official/after-scaling-why-do-public-chains-still-fail-to-land-61eb74bdabe, November 2019, last visit: June 18, 2021.

[238] Tendermint Docs, "What is Tendermint," 2021, https://docs.tendermint.com/master/introduction/what-is-tendermint.html.

[239] The Economist, "Regulating the Internet Giants - The World's Most Valuable Resource Is No Longer Oil, But Data," https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data, June 2017, last visit: October 15, 2021.

[240] The Linux Foundation, "Hyperledger Wiki," 2020, https://wiki.hyperledger.org/.

[241] The Things Network, "Building a Global Open LoRaWAN Network," https://www.thethingsnetwork.org/, last visit: April 29, 2021.

[242] The Zilliqa Team, "The ZILLIQA Technical Whitepaper," August 2017, last visit: May 1, 2021.

[243] B. Tjahjono, C. Esplugues, E. Ares, and G. Pelaez, "What Does Industry 4.0 Mean to Supply Chain?" in Manufacturing Engineering Society International Conference (MESIC 2017), Vol. 13, Vigo, Pontevedra, Spain, June 2017, pp. 1175–1182.


[244] T. To and A. Duda, "Simulation of LoRa in NS-3: Improving LoRa Performance with CSMA," https://github.com/drakkar-lig/lora-ns3-module, last visit: April 29, 2021.

[245] ——, "Simulation of LoRa in NS-3: Improving LoRa Performance with CSMA," in IEEE International Conference on Communications (ICC 2018), Kansas City, MO, USA, May 2018, pp. 1–7.

[246] V. Tron, "The Book of Swarm," January 2021, https://gateway.ethswarm.org/bzz/latest.bookofswarm.eth/.

[247] H. T. T. Truong, M. Almeida, G. Karame, and C. Soriente, "Towards Secure and Decentralized Sharing of IoT Data," in IEEE International Conference on Blockchain (Blockchain), 2019, pp. 176–183.

[248] D. Tse, B. Zhang, Y. Yang, C. Cheng, and H. Mu, "Blockchain Application in Food Supply Information Security," in IEEE International Conference on Industrial Engineering and Engineering Management (IEEM 2017), Suntec, Singapore, December 2017, pp. 1357–1361.

[249] S. Tuli, R. Mahmud, S. Tuli, and R. Buyya, "FogBus: A Blockchain-based Lightweight Framework for Edge and Fog Computing," in Journal of Systems and Software, P. Avgeriou and D. Shepherd, Eds., Vol. 154. Elsevier, 2019, pp. 22–36.

[250] J. Tuwiner, "Bitcoin Cloud Mining," https://www.buybitcoinworldwide.com/mining/, April 2021, last visit: May 3, 2021.

[251] N. Uca, M. Çemberci, M. Civelek, and H. Yılmaz, "The Effect of Trust in Supply Chain on the Firm Performance Through Supply Chain Collaboration and Collaborative Advantage," Journal of Administrative Sciences, Vol. 15, pp. 215–230, October 2017.

[252] F. Vogelsteller, "ERC: ClaimHolder · Issue #735 · ethereum/EIPs," https://github.com/ethereum/EIPs/issues/735, October 2017, last visit: October 15, 2021.

[253] ——, "ERC: Key Manager · Issue #734 · ethereum/EIPs," https://github.com/ethereum/EIPs/issues/734, October 2017, last visit: October 15, 2021.

[254] D. Vorick and L. Champine, "Sia: Simple Decentralized Storage," Nebulous Inc, November 2014.

[255] L. Wan, D. Eyers, and H. Zhang, "Evaluating the Impact of Network Latency on the Safety of Blockchain Transactions," in IEEE International Conference on Blockchain (Blockchain 2019), Atlanta, Georgia, USA, July 2019, pp. 194–201.

[256] L. Watkins, "What was the bitcoin fork and what is bitcoin cash?" August 2017, https://11fs.com/blog/bitcoin-fork-bitcoin-cash/.

[257] Wikipedia, "QR Code," https://en.wikipedia.org/wiki/QR_code, last visit: October 15, 2021.


[258] J. Willems, "Bazo Block Explorer," December 2020, https://github.com/julwil/bazo-block-explorer.

[259] ——, "Chameleon_hash," December 2020, https://github.com/julwil/chameleon_hash.

[260] ——, "Design and Prototypical Development of a GDPR and Swiss Law Compliant Blockchain," Communication Systems Group, Department of Informatics, University of Zürich, Zürich, Switzerland, September 2020.

[261] L. Xiao and I.-L. Yen, "Security Analysis and Enhancement for Prefix-Preserving Encryption Schemes." https://eprint.iacr.org/2012/191.pdf, last visit: October 15, 2021.

[262] G. Yu, X. Wang, K. Yu, W. Ni, J. A. Zhang, and R. P. Liu, "Survey: Sharding in Blockchains," in IEEE Access, Vol. 8, 2020, pp. 14155–14181.

[263] M. Zamani, M. Movahedi, and M. Raykova, "RapidChain," in ACM Conference on Computer and Communications Security (SIGSAC 2018), Toronto, Canada, October 2018, pp. 931–948.

[264] A. D. Zayas and P. Merino, "The 3GPP NB-IoT System Architecture for The Internet of Things," in IEEE International Conference on Communications Workshops (ICC Workshops 2017), Paris, France, May 2017, pp. 277–282.

[265] L. Zhang, G. Zhao, and M. A. Imran, Eds., Internet of Things and Sensors Networks in 5G Wireless Communications. Basel, Switzerland: MDPI, Sensors, 2020.

[266] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang, "An Overview of Blockchain Technology: Architecture, Consensus, and Future Trends," in IEEE International Congress on Big Data (BigData Congress 2017), Honolulu, HI, USA, June 2017, pp. 557–564.

[267] Z. Zheng, S. Xie, H.-N. Dai, and H. Wang, "Blockchain Challenges and Opportunities: A Survey," 2016.

[268] Q. Zhou, H. Huang, Z. Zheng, and J. Bian, "Solutions to Scalability of Blockchain: A Survey," in IEEE Access, Vol. 8, January 2020, pp. 16440–16455.


List of Figures

1.1 Identified Challenges of BIoT Systems and Corresponding Elements . . . 3

2.1 Storage Network Types [210] . . . 10
2.2 Scope of Distributed Storage Systems – The Big Picture [210] . . . 11
2.3 Chain of Blocks in a Blockchain [210] . . . 17
2.4 An Example of a Block Content in the Bitcoin Blockchain [210] . . . 19
2.5 A Simplified Example of the Merkle Tree Construction [210] . . . 20
2.6 Blockchain Address Generation based on Public Key Cryptography [210] . . . 21
2.7 Distributed Ledger Types [210] . . . 22
2.8 Competing Chains in PoS [115] . . . 26
2.9 Competing Chains in PoW [115] . . . 27
2.10 Staking Weights in PoS [209] . . . 27
2.11 Double Spending in PoS [209] . . . 28
2.12 Epoch Block Representation by [43] . . . 41
2.13 The Generic Concept of Physically Unclonable Functions (PUF) [149] . . . 51
2.14 Fuzzy Extractor Using Power-up SRAM State as Input Data [198] . . . 51
2.15 BC and IoT Integration Models: (a) IoT–IoT, (b) IoT–Blockchain, and (c) Hybrid Approach [215] . . . 63
2.16 An SDN-enabled Fog and Cloud-based BIoT Architecture [208] . . . 64

3.1 KYoT Design – Device Registration Process [198, 149] . . . 68
3.2 KYoT Design – Device Verification Processes [198, 149] . . . 69
3.3 The Deployment and Set-Up of Identity Contracts for a User and One of his/her Devices [198, 149] . . . 70
3.4 Sequence Diagram Issuing a KYD Claim [198, 149] . . . 71
3.5 BPMS Architecture Design [199] . . . 75
3.6 Prototypical Implementation of a LoRa Sensor Node Including Four Sensors Attached to it [199, 151] . . . 75
3.7 Data Flow Using LoRa and ELC [199, 151] . . . 77
3.8 Proposed Pollution Monitoring System Front-end [151] . . . 78
3.9 Server-less Design of NUTRIA [202, 96] . . . 84


3.10 Actors Generating, Scanning, and Adding Data to the BC Mapping the Supply and Data Chains [202] . . . 85
3.11 Swiss Dairy Supply Chain (on the Left), Corresponding Producer Views of the NUTRIA dApp (Center), and the Data Chain (on the Right) [202] . . . 86
3.12 General Overview of Data Trading Ecosystem with ITrade [203] . . . 94
3.13 Entity-Relationship Model in ITrade Design [203, 97] . . . 97
3.14 Publishing Sensor Data in ITrade [203, 97] . . . 99
3.15 Subscribing to a Data Stream Process Flow in ITrade [203, 97] . . . 100
3.16 Data Streaming Process Flow in ITrade [203, 97] . . . 101
3.17 Component Overview of ITrade Architecture [97] . . . 105
3.18 Passwordless Authentication Process [97] . . . 107
3.19 (a) CPU and (b) Memory Utilization Per Service With 4000 OPS [97] . . . 109
3.20 End-to-end Percentile Latency with 4,000 Operations Per Second [97] . . . 110
3.21 Categories of BIoT Risks [208] . . . 111
3.22 BC-IoT Integration (BIoT) Metrics [208] . . . 112
3.23 Blockchain Suitability Diagram for BIoT Use Cases . . . 118

4.1 BIIT 1.0 Architecture [208] . . . 122
4.2 BIIT Implementation with LoRa – Components' Engagement View [207] . . . 124
4.3 Data Flow and Instantiating of a Blockchain Wallet on IoT Devices [223, 73] . . . 125
4.4 LoRa to Blockchain Transmission Protocol [206, 207, 208] . . . 127
4.5 Cellular TX Design [207, 208] . . . 128
4.6 Packet Loss Experienced by End-Devices, 1000 End-devices, 6 GWs [223] . . . 130
4.7 Cumulative Throughput of the LoRaWAN Network, 1000 End-devices, 6 GWs [223] . . . 131
4.8 Energy Consumption, 1000 End-Devices, 6 GWs [223] . . . 132
4.9 The Packet Volume Exchanged in The Network Upon One TX Consisting of Several Data Packets [207] . . . 133
4.10 The Total Traffic Volume Exchanged in the Network Upon One TX Consisting of Several Data Packets [207] . . . 134
4.11 BIIT Performance Using The Cellular Connectivity [207] . . . 135

5.1 Transaction Aggregation Concept [167] . . . 139
5.2 Double Linked Blockchain Concept in DLIT [167] . . . 142
5.3 Transaction Assignment and Validation in DLIT [51] . . . 143
5.4 Inter-shard Synchronization in DLIT [200] . . . 144
5.5 DLIT's Validator-Committee Communication and Synchronization [51, 200] . . . 146
5.6 Interaction between the Shards in DLIT [51, 200] . . . 149
5.7 Inter-Committee Communications in DLIT [51, 200] . . . 150
5.8 Client Interactions with DLIT for Updating a Transaction [260] . . . 163


5.9 Miner Interactions in DLIT for Processing an Update Transaction [260] . . . 166
5.10 Representation of Data Update Transaction in Blockchain Explorer [260] . . . 168
5.11 Different Block Sizes and its Influence on the TPS with and Without Aggregation [167] . . . 172
5.12 Different Block Intervals and its Influence on the TPS with and without Aggregation [167] . . . 174
5.13 Differences Regarding The Size of The Blockchain With Aggregation, With Aggregation and Emptying of Blocks, and Without Aggregation [167] . . . 176
5.14 TX Aggregation and the Balance Issue for Joining Miners [167] . . . 177
5.15 Evaluation with a Varying Amount of Committee Members and Validators [51] . . . 178

6.1 Correlation Presentation of Goals, Research Questions, and Contributions . . . 184


List of Tables

2.1 Comparison of Selected DSSes [210] . . . 14
2.2 Comparison of Throughput and Latency of Selected DLs Based on [142, 210] . . . 15
2.3 Comparison of Throughput and Latency of Remote Read Operations in IPFS Based on [228, 210] . . . 16
2.4 Comparison of Consensus Mechanisms and DL Implementations [142] . . . 37
2.5 Comparison of Throughput and Latency of Selected DLs Based on [142] . . . 38
2.6 An Overview of BC Scalability Approaches [268] . . . 44
2.7 Comparative Overview on Different LP-WAN Networks [208] . . . 48
2.8 Comparative Analysis between LTE-M1 and NB-IoT [208] . . . 49
2.9 Comparison of Supply Chain Tracing dApp and ERP Systems [201] . . . 62

3.1 Comparison of Data Trading Platforms [203] . . . 93
3.2 Percentile Latency per Service with 4000 OPS [97] . . . 108
3.3 Required Use of Resources for Different Software Functions in BCs from the IoT Resource-Constrained Devices' Perspective [207] . . . 114
3.4 Performance of Selected Cryptographic Functions on Different IoT Hardware in Hashes per Second [HPS] and Signatures per Second [207] . . . 117

5.1 Comparative Overview on Potential GDPR-compliant Approaches [260] . . . 159
5.2 DLIT Simulation Results [167] . . . 169
5.3 Table of Different Block Sizes Influencing the TPS with TX Aggregation Enabled [167] . . . 170
5.4 Table of Different Block Sizes Influencing the TPS with TX Aggregation Disabled [167] . . . 171
5.5 Table of Different Block Intervals Influencing the TPS with TX Aggregation Enabled [167] . . . 173
5.6 Table of Different Block Intervals Influencing The TPS with TX Aggregation Disabled [167] . . . 173

7.1 List of Own Publications and Student Theses Supervised . . . 190


Listings

3.1 Data Marketplace Smart Contract [203, 97] . . . 102
3.2 Datastream Principal Smart Contract [203, 97] . . . 103
3.3 Sensor Smart Contract [203, 97] . . . 103
3.4 Datastream Subscription Smart Contract [203, 97] . . . 103
5.1 Structure of a DataTx [51] . . . 137
5.2 CHF Parameters . . . 161
5.3 Account Tx . . . 161
5.4 Hash() Function . . . 162
5.5 Chameleon Hash CheckString . . . 162
5.6 CHF Parameter and CheckString . . . 162
5.7 CHF Computation . . . 163


Appendix A

A.1 Bazo Specifications

As mentioned in Chapter 5, Bazo is used as the underlying BC for the proposed DL, i.e., the DLIT. This part covers the implementation specification of the Bazo BC based on [209, 45].

A.1.0.1 Transactions (TXs)

TXs are used to change the state of accounts in Bazo. Three TX types already exist in the original PoW-based Bazo: the stake TX, i.e., StakeTx, the funds transfer TX, i.e., FundsTx, and the BC parameter adjustment TX, i.e., ConfigTx, explained as follows [226].

A.1.0.2 Stake Transaction (StakeTx)

A node in the active V set confirms TXs proportionally to its stake in the form of blocks of TXs. This type of TX allows a node to join or leave the V set. This TX consists of the following parameters [209]:

Fee is a payment as an incentive for the V to include the TX in the next block. The higher the amount of the fee, the more likely the TX will be included in the next block. As in PoW-based protocols, the V that appends the next block is rewarded with the TX fees of all TXs within the appended block.


Is Staking defines a Boolean state indicating whether the node wants to join or leave the set of Vs.

Account is defined by the hash of the public key of the issuer.

Hashed Seed: When a node wants to join the set of Vs, it must create a seed, which is a random 32-byte string. By submitting the hash of this seed, the node commits to this seed without actually revealing it and cannot change the seed at a later step. The same seed is used in the PoS condition (cf. Section A.1.0.7).

Signature serves the purpose of authentication. A node digitally signs the TX with its SK. A TX of this type is accepted by the BC network if the issuer fulfills the minimum staking amount that is needed to become a V. Staking TXs are referred to as StakeTx.

Key Commitment: Each node has to generate a (PK, SK) pair. For this purpose, the RSA algorithm is chosen with a key size of 4096 bits [62].
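To make these parameters more tangible, the following Go sketch groups them into a single structure. It is only an illustration under the assumptions stated in the comments; field names, types, and byte widths are not taken from the Bazo code base.

// Illustrative sketch only: field names and byte widths are assumptions and
// do not reproduce Bazo's actual StakeTx encoding.
package sketch

// StakeTx lets a node join or leave the validator (V) set.
type StakeTx struct {
	Fee        uint64    // incentive for the V that includes this TX in the next block
	IsStaking  bool      // true = join the V set, false = leave it
	Account    [32]byte  // hash of the issuer's public key
	HashedSeed [32]byte  // commitment to a random 32-byte seed used later in the PoS condition
	PubKey     [512]byte // committed RSA-4096 public key used to verify commitment proofs
	Signature  [64]byte  // signature created with the issuer's SK, authenticating the TX
}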

A.1.0.3 Funds TX (FundsTx)

A FundsTx transfers funds from one account to another. If the balance of a node is below the minimum staking amount after sending a FundsTx, the TX will be dropped by the V.

A.1.0.4 BC Parameter Adjustment TX (ConfigTx)

With a ConfigTx, BC parameters can be adjusted without the need for a hard fork. The ConfigTx must be signed by a root account in order to be accepted [209]. The ConfigTx of the PoW-based Bazo BC is extended by the following parameters to reach a PoS-based Bazo.

Minimum Staking Amount is the minimum amount of coins that a potential V must possess. A node cannot join the set of Vs unless it fulfills this requirement. As long as a node is part of the V set, its balance must never fall below this minimum.

Minimum Waiting Time is the minimum number of blocks that a V must initially wait for when joining the V set. After appending a block to the BC, a V must also wait the same number of blocks before another one can be added. The reason for waiting a certain number of blocks is the prevention of stake grinding attacks.

Accepted Time Difference: An important characteristic of a semi-synchronous BC is the different clock speed of every node participating within the network. This parameter defines the acceptable time difference. A malicious V that speeds up his/her clock interval should not be able to append a block to this BC.

Slashing Window Size: In Bazo, Vs must be punished when adding blocks on two competing chains. Otherwise, the network will never reach consensus, which opens the possibility of double spending attacks. Bazo forces Vs to commit to one of the competing chains after a fork. If a V votes on two different chains within a span of a certain block height, he/she will be punished.


Slashing Reward refers to the amount of coins that a V receives for providing a correct slashing proof. This incentivizes Vs to check whether other Vs have built on top of multiple competing chains within the slashing window.
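The PoS-related ConfigTx parameters above can be summarized in a struct like the following Go sketch; names and types are assumptions for illustration only and do not mirror Bazo's actual ConfigTx definition.

// Illustrative sketch only: names and types are assumptions, not Bazo's
// actual ConfigTx definition.
package sketch

// ConfigTx adjusts BC parameters without a hard fork; it must be signed by a
// root account to be accepted.
type ConfigTx struct {
	MinStakingAmount   uint64   // minimum coins a (potential) V must possess
	MinWaitingTime     uint32   // blocks to wait after joining the V set or after appending a block
	AcceptedTimeDiff   uint32   // tolerated clock difference for submitted block times
	SlashingWindowSize uint32   // block span in which votes on two competing chains are punishable
	SlashingReward     uint64   // coins awarded for a correct slashing proof
	Signature          [64]byte // signature of a root account
}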

A.1.0.5 Account

The Bazo BC follows an account-based BC model, which means that the combined union of all accounts makes up the state of the network. Besides the three existing parameters (address, balance, and TX count), the following additional parameters are introduced in [45]:

Is Staking: This Boolean parameter describes whether the account is currently part of the active V set or not.

Staking Block Height (SBH) stores the block height at which an account joined the set of Vs. This parameter is needed for the slashing condition.
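A minimal Go sketch of the resulting account state is given below; the field names are assumptions that merely mirror the parameters listed in this section.

// Illustrative sketch only: field names are assumptions mirroring the
// parameters described in this section.
package sketch

// Account represents the per-account state in Bazo's account-based model.
type Account struct {
	Address            [64]byte // account address
	Balance            uint64   // coin balance
	TxCount            uint32   // number of TXs issued by this account
	IsStaking          bool     // whether the account is currently in the active V set
	StakingBlockHeight uint32   // block height at which the account joined the V set (slashing condition)
}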

A.1.0.6 Block Structure in Bazo

A Bazo block consists of the following parameters [209].

Number of StakeTx corresponds to the number of StakeTxs that are included in a block.

StakeTx Data: The hashes of all StakeTxs that are included in this block, in sequential order.

Time in Seconds (updated Nonce): In the updated protocol, the nonce bears the number of seconds that a V needs in order to fulfill the PoS condition (cf. Section A.1.0.7).

Height (H): The height of a block refers to the number of previously appended blocks to the BC. H is needed for the PoS condition (cf. Section A.1.0.7) as well as the slashing condition (cf. Section A.1.0.8).

Commitment Proof represents the output of RSA(SK, SHA3-512(H)). SK represents the private key that corresponds to the PK that was set in the initial StakeTx of the node. The PK can be used by other Vs to verify the proof. H is the height at which the block was created.

Slashed Address: A Validator (V) can submit a slashing proof when appending a block. To do so, a V checks whether another V has built on two competing chains within the block span of the slashing window size. This parameter is set to 0 by default if no proof is included. Otherwise, it holds the address of the misbehaving node that must be punished.

Two Conflicting Block Hashes: These two parameters hold the block hashes where the same node has appended a block on two competing chains within the slashing window size.
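The block parameters described in this section can be illustrated with the following Go sketch; names and widths are assumptions and do not reproduce Bazo's actual block layout.

// Illustrative sketch only: names and widths are assumptions, not Bazo's
// actual block layout.
package sketch

// Block lists the PoS-related block parameters described in this section.
type Block struct {
	Height                 uint32      // number of previously appended blocks (H)
	NrStakeTx              uint16      // number of StakeTxs included in the block
	StakeTxData            [][32]byte  // hashes of the included StakeTxs, in sequential order
	TimeInSeconds          uint64      // seconds the V needed to fulfill the PoS condition (updated nonce)
	CommitmentProof        [512]byte   // RSA(SK, SHA3-512(H)), verifiable with the PK committed in the StakeTx
	SlashedAddress         [32]byte    // address of a misbehaving V; zero if no slashing proof is included
	ConflictingBlockHashes [2][32]byte // the two block hashes appended by the same node on competing chains
}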


A.1.0.7 PoS Condition

The PoS condition works similarly to the PoW condition, but with different parameters and a regulated hash rate for each V. Each parameter has its own unique function in order to secure the BC. After a node has joined the set of Vs and the minimum waiting time has passed, it is eligible to append blocks to the BC. For that aim, it must provide a valid TimeInSeconds (T) for the PoS condition (A.1). Each of the parameters provides an important property, explained as follows.

SHA-256([P_PrevBlocks] · P_Local · BH · T) / Coins ≤ Target     (A.1)

List of the Previous Proofs (P_PrevBlocks): By having the list of previous proofs at hand, a stake grinding attack becomes infeasible [62].

Local Proof (P_Local): With the local proof, even a V with a low amount of coins can append a block to the BC [62].

Block Height (BH): The height of a block is characterized by the number of previously added blocks in the BC plus one.

Amount of Coins (Coins): By dividing by the number of coins that a V possesses, the election process becomes proportional to the stake. Without this division, every economically acting node would create new accounts that possess exactly the minimum staking amount in order to maximize its staking reward [209].

Difficulty of the PoS Condition (Target): Similar to the PoW condition, the difficulty in the PoS protocol can be adjusted with this global variable in order to determine the speed of the BC. As the number of Vs increases or decreases, the difficulty is adjusted accordingly. The Target is recalculated and adjusted based on the measured average block interval [226].
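The following Go sketch spells out how condition (A.1) can be evaluated: the previous proofs, the local proof, the block height, and the elapsed time are hashed, the digest is divided by the validator's amount of coins, and the result is compared against the Target. Function names, parameter types, and the serialization are assumptions that simplify the actual Bazo implementation.

// Illustrative sketch only: the serialization of the hash input and all names
// are assumptions; the actual Bazo implementation differs in detail.
package sketch

import (
	"crypto/sha256"
	"encoding/binary"
	"math/big"
)

// fulfillsPoSCondition checks SHA-256([P_PrevBlocks] · P_Local · BH · T) / Coins <= Target.
func fulfillsPoSCondition(prevProofs [][]byte, localProof []byte, height uint32,
	seconds uint64, coins uint64, target *big.Int) bool {

	h := sha256.New()
	for _, p := range prevProofs { // [P_PrevBlocks]: list of previous proofs
		h.Write(p)
	}
	h.Write(localProof) // P_Local

	bh := make([]byte, 4)
	binary.BigEndian.PutUint32(bh, height) // BH: block height
	h.Write(bh)

	t := make([]byte, 8)
	binary.BigEndian.PutUint64(t, seconds) // T: TimeInSeconds
	h.Write(t)

	// Interpret the digest as an integer and scale it by the stake (Coins),
	// so that the election probability grows proportionally to the stake.
	digest := new(big.Int).SetBytes(h.Sum(nil))
	scaled := new(big.Int).Div(digest, new(big.Int).SetUint64(coins))
	return scaled.Cmp(target) <= 0
}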

A.1.0.8 Validation

When a new block is received, each V checks the following conditions to determine whether the block is valid or not [209]:

Commitment Proof: Each V checks the origin of the message by verifying the Commitment Proof of the new block against the PK of the issuing V [62].

Minimum Waiting Time is needed to prevent stake grinding attacks. The larger the number of blocks, the more difficult it becomes for an attacker to perform such an attack. Furthermore, it also sets a V offline for the number of blocks that is set as the minimum waiting time after appending a block. This is because, when adding a block, a new seed is submitted, which also opens the possibility of a stake grinding attack. Therefore, it is desirable to adjust the size of the minimum waiting time according to the size of the V set.


PoS Condition: The current state of the BC and the suggested block contain all the needed information to check whether the PoS condition is valid or not.

Slashing Condition: When the suggested block includes a proof for a slashing condition, each V checks the proof's validity by comparing the height of the two blocks and whether they arise in competing chains or not. If the proof is valid, the beneficiary address receives an additional reward, which is determined by a BC parameter. Further, the slashed address loses its position in the V set as well as the minimum amount of coins that is required to be part of the V set.

Clock Speed: When a node tries to manipulate its clock speed, the suggested block will be ignored if the submitted time is too far in the future. This threshold is set with a BC parameter.

A.1.0.9 Competing Chains

Due to latency and the nature of the PoS condition, it is possible that BC forks happen and Vs compete on two or more chains. In such a case, the proposed Bazo protocol follows the same principle as in a PoW BC, which covers the following possibilities [209]:

Same Height: A V takes the first received valid block and rejects all blocks belonging to competing chains that are of the same height or shorter.

Longer Chain: If a block is received that belongs to a longer valid chain, this chain is used as the valid chain and a rollback is necessary [226].
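The two cases above can be condensed into the following Go sketch of the fork-choice decision; the types and names are assumptions used only to make the rule explicit.

// Illustrative sketch only: types and names are assumptions.
package sketch

// Chain summarizes a valid candidate chain by its height.
type Chain struct {
	Height int
}

// resolveFork keeps the first received chain at equal or lower height and
// switches, with a rollback, only when the competing chain is strictly longer.
func resolveFork(current, competing Chain) (adopted Chain, rollback bool) {
	if competing.Height > current.Height {
		return competing, true // longer valid chain wins; a rollback is necessary
	}
	return current, false // same height or shorter: keep the first received valid block
}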

Further specifications of Bazo TXs and different components can be accessed in related work [45, 209, 43, 206, 174] and in this appendix.


Curriculum Vitae

Personal Details

Name: Sina Rafati Niya
Date of Birth: June 12, 1988
Place of Birth: Oroumieh (Urmia), Iran

Education

September 2006 – August 2011

Bachelor of Science (B.Sc.) in Software Engineering
Faculty of Engineering
Urmia University

February 2013 – October 2015

Master of Science (M.Sc.) in Information Technology (IT) Engineering
Department of Computer Science
Urmia University of Technology

September 2016 – February 2022
Doctoral Program at the University of Zurich
Department of Informatics (IfI)
Communication Systems Group (CSG)

Professional Experience

August 2013 – December 2013
Internship as Cisco Network Engineer at OANC
Urmia, Iran

November 2013 – December 2014
Network Engineer at Torkanet
Urmia, Iran

October 2016 – February 2022
Research and Teaching Assistant
University of Zurich
Zürich, Switzerland
