ACCEPTED AT IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING

Secure Computation Offloading in Blockchain based IoT Networks with Deep Reinforcement Learning

Dinh C. Nguyen, Member, IEEE, Pubudu N. Pathirana, Senior Member, IEEE, Ming Ding, Senior Member, IEEE, and Aruna Seneviratne, Senior Member, IEEE

Abstract: For current and future Internet of Things (IoT) networks, mobile edge-cloud computation offloading (MECCO) has been regarded as a promising means to support delay-sensitive IoT applications. However, offloading mobile tasks to the cloud is vulnerable to security issues due to malicious mobile devices (MDs). How to implement offloading to alleviate computation burdens at MDs while guaranteeing high security in the mobile edge cloud is a challenging problem. In this paper, we simultaneously investigate the security and computation offloading problems in a multi-user MECCO system with blockchain. First, to improve the offloading security, we propose a trustworthy access control mechanism using blockchain, which can protect cloud resources against illegal offloading behaviours. Then, to tackle the computation management of authorized MDs, we formulate a computation offloading problem by jointly optimizing the offloading decisions, the allocation of computing resource and radio bandwidth, and smart contract usage. This optimization problem aims to minimize the long-term system costs of latency, energy consumption and smart contract fee among all MDs. To solve the proposed offloading problem, we develop an advanced deep reinforcement learning algorithm using a double-dueling Q-network. Evaluation results from real experiments and numerical simulations demonstrate the significant advantages of our scheme over existing approaches.

Index Terms: Blockchain, computation offloading, deep reinforcement learning, security.

I. INTRODUCTION

Recent years have witnessed the explosion of mobile technologies with the proliferation of mobile devices (MDs) such as smartphones, tablets, and wearable devices, which have been driving the evolution of the Internet of Things (IoT). MDs can be used to run IoT applications such as smart home, smart healthcare, and smart city with high flexibility and efficiency. However, due to the rapid growth of mobile data traffic, executing extensive IoT applications merely on overloaded MDs cannot provide satisfactory quality of service (QoS) to users [1]. Fortunately, with the recent advancement of communication technologies in 5G networks, MDs can now offload the workload of IoT applications (or computation tasks) to a cloud server for execution, which provides an effective alternative to mitigate the data computation pressure on MDs. To further improve the efficiency of mobile computation, mobile edge computing (MEC) has emerged as a promising solution to enable MDs to offload their computation tasks to a nearby edge server [2]. Although the MEC server is less powerful than a remote cloud, it is located at the edge of the network in close proximity to MDs, which enables highly efficient IoT data computation with much lower transmission delay than the remote cloud [3].

*This work was supported by CSIRO Data61, Australia.

Dinh C. Nguyen and Pubudu N. Pathirana are with the School of Engineering, Deakin University, Waurn Ponds, VIC 3216, Australia (e-mails: [email protected], [email protected]).

Ming Ding is with Data61, CSIRO, Australia (email: [email protected]).

Aruna Seneviratne is with the School of Electrical Engineering and Telecommunications, University of New South Wales (UNSW), NSW, Australia (email: [email protected]).

More interestingly, the combination of cloud and edge computing leads to a new paradigm of mobile edge-cloud computation offloading (MECCO) to facilitate offloading computation for IoT networks [4]. The MECCO model can offer application developers highly effective computing services in the mobile edge-cloud by obtaining the advantages of both edge and cloud computing to satisfy diverse users' QoS requirements. Mobile applications without latency requirements (e.g., big data analysis) can be offloaded to the resourceful cloud, while the others (e.g., smart healthcare, smart home) with time-sensitive requirements can be executed at the edge server for fast-response services. Clearly, the MECCO architecture can offer promising solutions with high flexibility to achieve communication and computation objectives for future IoT applications in 5G networks [5].

However, MECCO also poses many challenges for computation offloading, and one of the major challenges is security [6]. As mobile task offloading relies on MDs in a dynamic environment where mobile users are untrusted, the MECCO system is vulnerable to various types of threats. Unauthorized MDs may gain malicious access to exploit cloud computing services without the consent of a central authority. Further, attackers can threaten computation resources on the cloud to obtain mobile data, leading to privacy concerns for cloud-based IoT applications. Therefore, how to guarantee security for mobile offloading is critical to any MECCO system.

Recently, blockchain has emerged as a promising approach to tackle security issues in future IoT networks, including mobile offloading systems [7], [8]. The concept of blockchain is based on a peer-to-peer network architecture in which transaction information is distributed among multiple nodes and not controlled by any single centralized entity. The blockchain utilises public-key cryptography to establish an append-only, immutable chain of blocks which are publicly accessible to all blockchain entities in a verifiable and trustworthy manner [9]. Blockchain with its decentralized and trustworthy nature has been integrated with cloud IoT systems for security guarantees such as secure access control and data management among IoT devices. In particular, the smart contract [10], a self-operating computer program running on the blockchain platform, has demonstrated its feasibility in various IoT security problems. For example, smart contracts were employed to design an access control mechanism capable of tracking data exchanges between untrusted IoT devices, with the ability to detect malicious access while providing provenance and auditing on mobile data [11]. Further, blockchain and smart contracts were adopted to offer access control solutions to manage and protect cloud resources among cloud nodes [12]. With such security capabilities, it is believed that blockchain and smart contracts can be applied in mobile cloud IoT contexts, especially in MECCO systems, to fulfill security objectives for mobile task offloading.

arXiv:1908.07466v2 [eess.SP] 20 Aug 2021

A. Related Works

Many works have investigated computation offloading issues with edge-cloud computing in IoT networks with blockchain. The offloading strategies proposed in [13], [14] utilized edge or cloud services to tackle blockchain-based IoT computation issues with the objective of minimizing offloading costs and cloud resources by leveraging Lyapunov or conventional convex optimization methods. However, such conventional offloading optimization algorithms only work well for low-complexity online models and usually require prior knowledge of system statistics that is difficult to acquire in practical scenarios. To overcome such challenges, Reinforcement Learning (RL) [15] has emerged as an efficient technique which allows a learning agent to adjust its policy and derive an optimal solution via trial and error to achieve the best long-term goal without requiring any prior environment information. Nevertheless, in complex offloading problems with multi-user multi-IoT device scenarios, the dimension of the state and action space can be so high that RL-based solutions become inefficient. Fortunately, Deep Reinforcement Learning (DRL) methods [1] such as the deep Q-network (DQN) have been introduced as a strong alternative to solve such high-dimensional problems in blockchain-based edge cloud offloading tasks [16], [17]. The authors in [18] studied an edge offloading scheme for blockchain mining tasks that can be offloaded to the edge clouds, in order to enhance the quality of service (QoS) and mitigate the mining burden posed on mobile miners. The study in [19] proposed a blockchain-based MEC model for future wireless networks, where computation offloading and resource allocation are jointly optimized with respect to spectrum allocation, block size, and number of consecutive blocks via a double-dueling deep Q-network. The authors in [20] studied a cooperative computation offloading framework for blockchain-based IoT networks. An MA-DRL algorithm is designed which allows IoT devices to collaboratively explore the offloading environments in order to minimize long-term offloading costs.

Moreover, the security of task offloading in edge cloud computing has also been explored recently by using blockchain and smart contracts. For example, the authors in [21] leverage blockchain to create a secure communication protocol with group signatures and covert channel authorization techniques to guarantee the validity of users in edge computing. A blockchain-empowered framework is also proposed in [22] for implementing resource trading and task assignment in edge computing as smart contracts, which can provide reliable resource transactions and immutable records in the blockchain. Smart contracts have also been adopted in [23] for facilitating a vehicular edge consortium blockchain, which enables secure resource sharing while motivating vehicles to share their computation resources with service requesters. To support security and privacy for cognitive edge computing, spatio-temporal smart contracts are provided in [24] with incentive mechanisms to accelerate the sharing economy in smart cities.

B. Main Contributions and Paper Structure

Despite these research efforts, little attention has been given to the design of security, e.g., access control for task offloading, in these existing works [16]–[20]. Moreover, the integration of edge and cloud computing in blockchain has not been fully investigated for IoT computation offloading scenarios [22]–[24]. Motivated by such limitations, in this paper, we propose a novel secure computation offloading model for IoT networks on mobile edge-cloud based on blockchain and DRL techniques. In particular, we provide a comprehensive solution to the access control and efficient computation problems of the MECCO system on blockchain to satisfy both the security and offloading requirements. Also, different from existing works, we evaluate the proposed scheme by conducting both real experiments and numerical simulations. In a nutshell, the main contributions of this paper are highlighted as follows:

1) We consider a new secure computation offloading scheme for mobile edge cloud computing with blockchain where MDs can offload their IoT data tasks to the cloud or edge server for computation under an access control mechanism for security guarantees.

2) We propose a trustworthy access control mechanism specific to edge task offloading by using a smart contract on blockchain. The main purpose of our access control design is to perform user authentication, verify offloading requests, and manage offloaded mobile data, which thus provides high security for the MECCO system.

3) We formulate an optimization problem by taking both computation offloading and smart contracts into account, which has not been considered in the open literature. This is enabled by the joint consideration of the offloading decisions, the allocation of computing resource and radio bandwidth, and smart contract usage. To this end, we propose an advanced DRL algorithm to obtain the optimal offloading policies for all MDs.

4) We investigate the effectiveness of the proposed MECCO framework in terms of access control and offloading performance by conducting both real experiments and numerical simulations. The implementation results demonstrate that the proposed scheme not only provides reliable authentication for task offloading but also reduces offloading costs, compared to the existing schemes.


[Figure: Mobile Devices (MDs), Access Point, Edge Computing (edge server, DM), fiber link, Cloud Computing (cloud server), Blockchain]

Fig. 1: The proposed mobile edge-cloud network architecture with blockchain for secure mobile offloading.

The remainder of this paper is organized as follows. Section II presents the integrated edge and cloud architecture with blockchain for secure computation offloading. In Section III, the system models for both access control and computation offloading are described. Then, we formulate the computation offloading, edge resource allocation, radio bandwidth resource, and smart contract cost as a joint optimization problem. In Section IV, we propose an access control mechanism with smart contracts and computation offloading approaches with an advanced DRL algorithm for our MECCO system. The performance of the proposed MECCO system is investigated in Section V by conducting both real experiments to evaluate the access control performance and numerical simulations. Finally, conclusions are given in Section VI.

II. NETWORK ARCHITECTURE

A. Mobile Edge Cloud for Blockchain-IoT Networks

We propose an integrated edge-cloud architecture for IoT networks with blockchain as shown in Fig. 1, including three main layers, namely the mobile devices layer, the edge computing layer and the cloud computing layer. The key features of each layer are presented as follows.

1) Mobile Devices Layer: This layer consists of a network of MDs, such as smartphones, tablets, and sensor devices. Such MDs are connected together in the IoT network by the blockchain. Each MD has a blockchain account to join the network for various functionalities such as collecting data or performing task offloading to cloud servers. For example, in our MECCO scenario, each MD needs to send requests to the cloud server for task offloading. Once the cloud server verifies the request, a response will be returned to the MD for offloading permission so that the device can start to offload its tasks to the edge or cloud server for computation.

2) Edge Computing Layer: This layer includes a wireless access point (AP) or base station (BS) for wireless communication with local devices, a Decision Maker (DM) for task execution decisions, and a light-weight edge server for instant data processing. This layer can provide low-latency computation services at the edge of the network. However, for complex computation tasks, the edge server needs to forward them to the resourceful cloud server over a wired line to avoid task overload on the edge layer. In addition, the edge server also acts as a blockchain entity to establish trustworthy communication with cloud nodes and MDs on the blockchain network for security guarantees. Any transactions and offloading activities in the offloading system will be recorded by blockchain and also broadcast to the edge server to achieve a common agreement on offloading management. In this paper, for simplicity, we only consider one edge server in our MECCO system. However, our model can easily extend to multi-edge scenarios where each edge server acts as a decentralized blockchain node for system throughput and security improvements.

3) Cloud Computing Layer: This layer includes multiple virtual machines with powerful computation and storage capabilities to solve complex computation tasks from local IoT devices. In our MECCO architecture, the cloud layer also contains a network manager as a blockchain entity to control all user access, an admin for smart contract management and a group of miners for transaction mining. All cloud nodes operate on the blockchain platform in a decentralized and secure manner and link securely to the edge server and MDs via the blockchain network.

B. Motivations of Using Blockchain for Secure Computation Offloading in Edge Cloud Computing

Due to its decentralized, immutable, and traceable features, blockchain is able to provide a higher degree of security to computation offloading in edge cloud computing than traditional security techniques [25]. Indeed, security solutions such as access control based on blockchain not only manage offloading effectively, but also enable reliable authentication of all offloading behaviours to preserve the cloud resources in a decentralized manner. The motivations of using blockchain for security, e.g., access control, in computation offloading stem from its unique advantages over conventional access control solutions [26], [27], which are explained as follows:

• Blockchain can provide a decentralized management solution for the offloading system. IoT data offloaded from MDs can be stored in the peer-to-peer storage in the blockchain network without relying on any central authority, which ensures fast data access and significantly enhances the data security of mobile users [21].

• By incorporating blockchain into the edge-cloud computing network, the offloading system can achieve trustworthy access control by using smart contracts, which automatically authorize devices and distinguish users from adversaries, aiming to prevent malicious offloading behaviours and potential threats to cloud computation resources. Consequently, the data integrity and offloading validity of the system can be significantly improved [28].

• With its decentralized and immutable nature, blockchain can work well in untrusted environments like our considered IoT scenario, where no trust is required between the cloud server, edge server and IoT devices. Particularly, the peer-to-peer network architecture provided by blockchain can achieve robust access control with high data integrity and system security for mobile offloading [22].


[Flowchart with three stages (access control, edge computing, cloud computing): the mobile device sends a request for offloading permission; the cloud server verifies the request through the smart contract on blockchain; if the request is not authorized, the offloading request is rejected; otherwise the user offloads the task to the MEC server, which either allocates resources and performs execution, or forwards the task to the cloud server, which processes the task.]

Fig. 2: Flowchart of computation task offloading with access control on edge-cloud.

C. Description of the Proposed MECCO System

With the network settings above, we describe the proposed MECCO system with a focus on access control and offloading concepts on the blockchain network. To guarantee the QoS requirements of mobile users, the combination of edge-cloud computing and blockchain should be considered carefully. In fact, due to the dynamic and scalable characteristics of the mobile multi-IoT environment, access control and computation offloading require a comprehensive design to achieve both security and offloading goals. To provide a complete computation solution for the proposed MECCO system, we introduce a network workflow as indicated in Fig. 2. The procedure includes two phases, access control and computation offloading, which are described as follows.

First, an MD initializes a request as a blockchain transaction to start the computation offloading process and sends this request to the cloud server via the wireless access point. (Note that for convenient management, access control for all MDs is implemented at the central cloud server.) Then, the cloud server will authorize this request using an access control mechanism enabled by smart contracts. Based on predefined strict control policies, the smart contract will identify, analyse and make decisions to accept or refuse the request. If the request is authorized successfully, a response will be returned to the MD so that the device can offload its tasks. The access control process then finishes, and the transaction is recorded and stored on the blockchain network in a secure manner. In the second phase, the authorized MD will choose to offload its computation tasks to the edge or cloud server for calculation. At each offloading period, based on QoS requirements and current network conditions (task size, available edge resource, channel bandwidth resource), the decision maker (DM) at the edge layer [16] will perform optimization to decide where the mobile task should be executed, i.e., in the MEC server or cloud server, for optimal computation benefits (e.g., minimum offloading costs). If the tasks are executed at the edge layer, the MEC server needs to allocate communication resource as well as computation resource to each MD. However, if the tasks exceed the computing capability of the MEC server, such tasks should be forwarded to the resourceful cloud server for calculation. Note that data offloaded from MDs can be stored securely in a decentralized cloud storage on blockchain [15]. The data storage management is beyond the scope of this paper, and details can be found in our previous work [15].

[Figure: Mobile Device (User Interface, Blockchain Client); Blockchain-empowered Cloud Computing (MD's request pool, MECCO manager with Request Handler, Admin, Smart contracts with Policy storage, Transaction pool with unsigned and signed Tx, Miners); Blockchain (Block i-2 to Block i+2, each with Block ID, Device ID, Signature, Time stamp)]

Fig. 3: Workflow of blockchain-based access control scheme for the MECCO system.
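As a rough illustration of the two-phase procedure of Fig. 2 described in this subsection, the following Python sketch routes a single task. The function name `route_task` and its arguments are hypothetical, and the fixed capacity comparison is only a stand-in for the DM's optimization; it is not the decision rule used in this paper.

```python
def route_task(authorized: bool, required_cycles: float, edge_free_cycles: float) -> str:
    """Return where a task ends up: 'rejected', 'edge', or 'cloud'."""
    # Phase 1 (access control): unauthorized requests never reach the servers.
    if not authorized:
        return "rejected"
    # Phase 2 (offloading): placeholder rule (assumption) that keeps the task
    # at the MEC server when it fits and forwards it to the cloud otherwise.
    if required_cycles <= edge_free_cycles:
        return "edge"
    return "cloud"


if __name__ == "__main__":
    for args in [(True, 2e9, 5e9), (True, 8e9, 5e9), (False, 1e9, 5e9)]:
        print(args, "->", route_task(*args))
```

The sketch only captures the control flow; the resource allocation that accompanies an edge decision is left out.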

III. SYSTEM MODEL AND OFFLOADING PROBLEM FORMULATION

The proposed MECCO system includes two schemes, access control and computation offloading, which are presented in the following.

A. Access Control Model

We propose an access control model on a blockchain network for the MECCO system as shown in Fig. 3. For better offloading management, access control for all MDs is implemented at the cloud server as explained in the previous section, and therefore the MEC server is omitted from the access control scheme.

1) Key Components of the Access Control Scheme: As shown in Fig. 3, the access control scheme consists of four main components: MECCO manager, admin, smart contracts and miners.

• MECCO Manager: The MECCO manager plays a significant role in our access control framework. It is responsible for monitoring all offloading events on the blockchain network, including offloading requests and access authentication for MDs. The management capability of the MECCO manager is enabled by smart contracts through strict user policies.

• Admin: The admin manages transactions and operations on the cloud by adding, changing or revoking access permissions. The admin is responsible for deploying smart contracts and is the only entity able to update or modify policies in smart contracts.

• Smart Contracts: The smart contracts define all operations allowed in the access control system. MDs can interact with smart contracts through the contract address and Application Binary Interface (ABI). Smart contracts can identify and validate requests and grant access permissions to MDs by triggering transactions or messages. The smart contract and its operations are accessible to all blockchain entities. It is considered the core software in our access control scheme.

• Miners: We employ a group of virtual machines on the cloud as miners that perform mining tasks to validate the data blocks consisting of transactions of MDs via a consensus mechanism such as Proof-of-Work (PoW) [29]. Here, the fastest miner to solve the computational puzzle is rewarded for its mining contribution; it also verifies the data block, which is then sent along with the signature to the other miners for validation. If all miners achieve a consensus, the validated block is then appended to the blockchain in chronological order.
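To make the mining step more concrete, the following Python sketch shows a toy hash-based puzzle in the spirit of PoW. It is a minimal illustration under our own assumptions (SHA-256 over a JSON payload, a difficulty expressed as leading hexadecimal zeros, and the hypothetical names `mine_block` and `validate_block`); it is not the consensus implementation of Ethereum or of the system evaluated in this paper.

```python
import hashlib
import json


def _block_digest(transactions, prev_hash, nonce):
    # Deterministic serialization so every miner hashes the same bytes.
    payload = json.dumps(
        {"prev": prev_hash, "txs": transactions, "nonce": nonce}, sort_keys=True
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def mine_block(transactions, prev_hash, difficulty=3):
    """Search for a nonce whose digest starts with `difficulty` hex zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = _block_digest(transactions, prev_hash, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def validate_block(transactions, prev_hash, nonce, digest, difficulty=3):
    """Other miners recompute the digest and check the difficulty target."""
    recomputed = _block_digest(transactions, prev_hash, nonce)
    return recomputed == digest and digest.startswith("0" * difficulty)
```

Finding the nonce takes many hash evaluations, whereas validation needs a single one, which is why the remaining miners can cheaply check the block proposed by the fastest miner.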

2) Access Control Concept for Secure Computation Offloading: The workflow of access control on blockchain is illustrated in Fig. 3. A description of each step is provided as follows.

① An MD initializes a request as an offloading transaction for offloading computation tasks to the edge-cloud server.

② The blockchain client processes the request and sends it to the storage pool so that the MECCO manager and smart contracts can verify it.

③ The MECCO manager collects the requests of MDs in the storage pool in a first-come-first-served manner.

④ The MECCO manager verifies the request by smart contracts with a strict control policy. If the request is accepted, a response will be returned to the MD for offloading data.

⑤ Offloading transactions are grouped into data blocks, which are then inserted into the transaction pool for confirmation by miners.

⑥ The miners validate the data blocks and sign them with a digital signature to append them to the blockchain.

⑦ The offloading transaction is added to the blockchain network and broadcast to all MDs within the MECCO system.

⑧ The offloading transaction is updated at the MD for tracking via the blockchain client.

The access control process is explained in the following steps:

• Step 1 - Initialization (executed by the mobile gateway): The MD needs to create a blockchain account to join the blockchain network and initializes a request as a transaction for offloading computation tasks. The MD uses the blockchain client module to interact with the blockchain on the cloud. In task offloading, this module will create a storage transaction $T_s$ as a request with an information index including request metadata (Device ID), digital signature and timestamp for verification. The request will then be sent to the MECCO manager on the cloud for verification via a wireless network (Steps 1, 2 in Fig. 3).

• Step 2 - Verification of the offloading request (executed by the MECCO manager): After receiving a transaction from the MD, the MECCO manager will issue a signal to the smart contract as a notification of a new task offloading request. By using the policy list in the policy storage, the smart contract can verify the transaction, and if it is accepted, a message will be returned to the MD via the blockchain client to allow it to offload computing tasks (Steps 3, 4 in Fig. 3). Note that since storage is limited in practice, a threshold can be set at the MECCO manager to manage the request flow from the MDs. In this regard, only a certain number of requests under the pre-defined threshold are processed by the MECCO manager for authentication in a time period, while later requests queue to be handled in the next time window; queueing theory may be useful to model this process.

• Step 3 - Adding the offloading transaction to blockchain (executed by miners): After verification, metadata of offloading transactions (Device ID as shown in Fig. 3) is also inserted into the unsigned transaction pool. The miners periodically group the transactions in the pool into blocks for mining (Step 5 in Fig. 3). The fastest miner which verifies the data block will send the signature to the other miners for validation. If all miners reach an agreement, the validated block with its signature is then appended to the blockchain in chronological order. Finally, all network MDs receive this block and synchronize their copy of the blockchain via the blockchain client. The workflow for the above process is shown in Steps 6, 7, 8 in Fig. 3.
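The request handling in Steps 1 and 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class and field names (`OffloadRequest`, `MeccoManager`, `process_window`) are hypothetical, the policy storage is reduced to a set of allowed device IDs, and the signature check is a placeholder rather than real cryptographic verification or an actual smart contract call.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class OffloadRequest:
    # Fields follow the request metadata named in Step 1.
    device_id: str
    signature: str
    timestamp: float


class MeccoManager:
    def __init__(self, allowed_devices, threshold):
        self.allowed_devices = set(allowed_devices)  # stand-in for the policy storage
        self.threshold = threshold                   # max requests per time window
        self.request_pool = deque()                  # first-come-first-served pool

    def submit(self, request):
        self.request_pool.append(request)

    def verify(self, request):
        # Placeholder policy check (assumption): known device, non-empty signature.
        return request.device_id in self.allowed_devices and bool(request.signature)

    def process_window(self):
        """Verify at most `threshold` requests; the rest wait for the next window."""
        decisions = []
        for _ in range(min(self.threshold, len(self.request_pool))):
            request = self.request_pool.popleft()
            decisions.append((request.device_id, self.verify(request)))
        return decisions
```

Requests beyond the threshold simply remain in the pool, which mirrors the queueing behaviour described in Step 2.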

B. Computation Offloading Model

In the computation offloading model, we assume that MDs are authorized by our access control scheme as designed in the previous subsection. We consider realistic IoT applications (e.g., speech recognition or big data analysis) where computation tasks can be very large and thus inefficient to process locally at MDs. Therefore, the authorized MDs will have to offload their tasks to the edge or cloud server for efficient execution. In this subsection, we propose a new offloading scheme for our MECCO scenario. We first formulate the task model and computation model, and then we describe the problem formulation in detail.

1) Task Model: We consider a task model as shown in Fig. 1. We denote a set of MDs as $\mathcal{N} = \{1, 2, \ldots, N\}$. It is assumed that each MD has a computation task to be completed. For each MD $n$, the computation task can be formulated as a variable tuple $R_n = (D_n, X_n, \tau_n)$. Here $D_n$ (in bits) denotes the data size of the computation task of the MD $n$. We also assume that the size of $D_n$ is fixed when it is offloaded to the edge or cloud server. $X_n$ (in CPU cycles) denotes the total number of CPU cycles required to accomplish the computation for the task $R_n$. Moreover, $\tau_n$ (in seconds) reflects the maximum tolerable delay of task $R_n$. Note that the information of $D_n$ and $X_n$ can be obtained by using program profilers [17]. In this paper, we focus on a joint optimization problem of offloading decision, edge computation resource and bandwidth resource, which has not been well studied in the literature.

We define an offloading decision vector as $A = [\alpha_1, \alpha_2, \ldots, \alpha_N]$, where $\alpha_n = \{\alpha_n^e, \alpha_n^c\}$ with $\alpha_n^e, \alpha_n^c \in \{0, 1\}$ and $\alpha_n^e + \alpha_n^c = 1$. If the task is offloaded to the edge or cloud server, the corresponding parameter is 1; otherwise it is 0. Note that each task is executed on only one platform in each offloading period. Further, we also consider the resource allocation problem for computation offloading. For edge computing, the MEC server needs to allocate its computing resources to each MD to perform task execution. For cloud computing, the task needs to be offloaded by the MD to the edge layer and is then forwarded to the remote cloud via a wired link. Due to the powerful computational capacity of the cloud server, the problem of cloud resource allocation is ignored in our paper. Nevertheless, the allocation of limited radio bandwidth should be considered to improve the offloading efficiency of our MECCO system. Assuming that the total radio bandwidth of our MECCO system is $B$ Hz, we allocate part of the bandwidth resource to each MD to avoid interference between them [20]. We normalize the bandwidth assigned to the MD $n$ as $w_n \in [0, 1]$, so that $\sum_{n=1}^{N} w_n = 1$. We also denote the channel gain between the MD $n$ and the MEC server as $h_n$; then the transmission data rate of the MD $n$ can be calculated as

$$r_n = w_n B \log_2\left(1 + \frac{p_n h_n}{w_n N_0 B}\right), \tag{1}$$

where $p_n$ is the transmit power (W) of the MD $n$, and $N_0$ is the additive noise power spectral density (dBm/Hz).
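As a numerical illustration of (1), the following Python sketch evaluates the data rate for one MD. The function names are hypothetical, and two assumptions are made: the noise density is converted from dBm/Hz to W/Hz before it enters the formula, and an MD with zero bandwidth share is assigned a rate of zero.

```python
import math


def dbm_per_hz_to_watt_per_hz(n0_dbm_hz: float) -> float:
    # 0 dBm corresponds to 1 mW, hence the 30 dB offset to reach watts.
    return 10 ** ((n0_dbm_hz - 30) / 10)


def bandwidth_shares_valid(shares) -> bool:
    # Each w_n lies in [0, 1] and the shares sum to one.
    return all(0.0 <= w <= 1.0 for w in shares) and abs(sum(shares) - 1.0) < 1e-9


def data_rate(w_n: float, bandwidth_hz: float, p_n: float, h_n: float,
              n0_w_per_hz: float) -> float:
    """Eq. (1): r_n = w_n * B * log2(1 + p_n * h_n / (w_n * N_0 * B)), in bit/s."""
    if w_n == 0.0:
        return 0.0  # no bandwidth assigned, so nothing can be transmitted
    snr = p_n * h_n / (w_n * n0_w_per_hz * bandwidth_hz)
    return w_n * bandwidth_hz * math.log2(1.0 + snr)
```

For instance, a signal-to-noise ratio of 1 over the full band of 1 MHz gives a rate of 1 Mbit/s, since $\log_2(2) = 1$.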

In the following, we formulate the computation offloading model of edge computing and cloud computing with a focus on computation latency and energy consumption analysis for our MECCO system.

2) Computation Model: The data tasks of MDs can be offloaded to the edge or cloud server for execution.

2.1) Edge Computing: We consider the case when the task $R_n$ of the MD $n$ is offloaded to the MEC server for computation ($\alpha_n^e = 1$). We denote $T_n^e$ as the edge computing latency, which includes the transmission delay for the MD $n$ sending data to the MEC server and the execution time on the MEC server. Similar to [19], the total edge computing latency can be expressed as

$$T_n^e = \frac{D_n}{r_n} + \frac{X_n}{f_n^e}, \tag{2}$$

where $f_n^e$ (in CPU cycles/s) denotes the edge computation resource allocated to the MD $n$. Note that the edge resource allocated to all MDs should not exceed the total computational capacity of the MEC server, i.e., $\sum_{n=1}^{N} f_n^e \le F^e$.

Moreover, the energy cost for offloading to the MEC server consists of energy consumption for transmitting data and execution. Denoting $p_n^i$ as the power consumption (in watts) of the MD $n$ in idle status, the total energy consumption of offloading data to the MEC server is given as

$$E_n^e = \frac{p_n D_n}{r_n} + \frac{p_n^i X_n}{f_n^e}. \tag{3}$$
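The edge costs in (2) and (3), together with the capacity constraint on the MEC server, can be evaluated with a few lines of Python. This is only a sketch: the function and argument names are hypothetical, and all quantities are assumed to be in SI units (bits, bit/s, CPU cycles, cycles/s, watts).

```python
def edge_latency(d_n: float, x_n: float, r_n: float, f_e_n: float) -> float:
    """Eq. (2): transmission delay plus execution time at the MEC server (s)."""
    return d_n / r_n + x_n / f_e_n


def edge_energy(d_n: float, x_n: float, r_n: float, f_e_n: float,
                p_n: float, p_idle_n: float) -> float:
    """Eq. (3): transmit energy plus idle energy while the edge executes (J)."""
    return p_n * d_n / r_n + p_idle_n * x_n / f_e_n


def edge_allocation_feasible(f_e, total_capacity: float) -> bool:
    """Capacity constraint: the allocated cycles/s must not exceed F^e."""
    return sum(f_e) <= total_capacity
```

With a 1 Mbit task, 10^9 CPU cycles, a 2 Mbit/s uplink and 2 x 10^9 cycles/s at the edge, both terms of (2) equal 0.5 s, so the latency is 1 s.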

2.2) Cloud Computing: If the computation task is offloaded to the cloud server ($\alpha_n^c = 1$), the MD $n$ needs to offload its task to the MEC server, which then forwards it to the remote cloud server for computation via a wired link. The cloud server also allocates its computation resources to compute the task efficiently. We denote the data rate of the wired link for transmitting the task of the MD $n$ as $r_n^w$ and the cloud resource allocated to the MD $n$ as $f_n^c$. Then the total cloud computing latency can be expressed as [19]

$$T_n^c = \frac{D_n}{r_n} + \frac{D_n}{r_n^w} + \frac{X_n}{f_n^c}. \tag{4}$$

Besides, offloading data to the cloud server also incurs an energy cost, which can be calculated as

$$E_n^c = \frac{p_n D_n}{r_n} + p_n^i \left( \frac{D_n}{r_n^w} + \frac{X_n}{f_n^c} \right). \tag{5}$$
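For comparison with the edge case, the cloud costs in (4) and (5) can be sketched in Python as follows. The function and argument names are hypothetical, and SI units are assumed throughout.

```python
def cloud_latency(d_n: float, x_n: float, r_n: float,
                  r_w_n: float, f_c_n: float) -> float:
    """Eq. (4): wireless uplink + wired forwarding + cloud execution (s)."""
    return d_n / r_n + d_n / r_w_n + x_n / f_c_n


def cloud_energy(d_n: float, x_n: float, r_n: float, r_w_n: float,
                 f_c_n: float, p_n: float, p_idle_n: float) -> float:
    """Eq. (5): transmit energy plus idle energy during forwarding and execution (J)."""
    return p_n * d_n / r_n + p_idle_n * (d_n / r_w_n + x_n / f_c_n)
```

The extra term $D_n / r_n^w$ is the price of reaching the cloud; it pays off only when the cloud resource $f_n^c$ is large enough to shorten the execution time by more than the forwarding delay.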

According to (2)-(5), the computation latency and energy consumption of the MD $n$ in our MECCO system can be expressed respectively as

$$T_n = T_n^e + T_n^c, \tag{6}$$

$$E_n = E_n^e + E_n^c. \tag{7}$$
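As an illustration of (2)-(5), the following minimal Python sketch evaluates the latency and energy terms for a single MD. The class and function names are hypothetical, the task size is assumed to be expressed in bits so that it matches data rates in bit/s, and the numerical values in the usage note are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A computation task of one MD (names are illustrative assumptions)."""
    data_bits: float   # D_n, assumed to be in bits
    cpu_cycles: float  # X_n, CPU cycles required by the task


def edge_latency(task, rate, f_edge):
    """Eq. (2): uplink transmission time plus execution time on the MEC server."""
    return task.data_bits / rate + task.cpu_cycles / f_edge


def edge_energy(task, rate, f_edge, p_tx, p_idle):
    """Eq. (3): transmit energy plus idle energy while the MEC server executes."""
    return p_tx * task.data_bits / rate + p_idle * task.cpu_cycles / f_edge


def cloud_latency(task, rate, rate_wired, f_cloud):
    """Eq. (4): uplink time, wired forwarding time, and cloud execution time."""
    return (task.data_bits / rate
            + task.data_bits / rate_wired
            + task.cpu_cycles / f_cloud)


def cloud_energy(task, rate, rate_wired, f_cloud, p_tx, p_idle):
    """Eq. (5): transmit energy plus idle energy during forwarding and execution."""
    waiting = task.data_bits / rate_wired + task.cpu_cycles / f_cloud
    return p_tx * task.data_bits / rate + p_idle * waiting
```

For example, with an 8 Mbit task requiring 1 Gcycle, a 10 Mbit/s uplink and 1 GHz of edge resource, `edge_latency` gives 0.8 s of transmission plus 1.0 s of execution.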

3) Smart Contract Cost for Offloading Authentication: We also consider the contract execution cost for offloading authentication, which has not been explored in previous blockchain-based offloading works [14], [21], [22]. In the cloud blockchain, e.g., Ethereum, all transactions incur fees measured in units of gas, which can be regarded as a metric to standardize the contract cost [30]. To execute a smart contract, an MD's account has to pay for a certain amount of gas in Ether, the native cryptocurrency of Ethereum, where the gas amount reflects the computation and storage required by the transaction. Each offloading transaction in cloud Ethereum must specify a gas limit value $\varpi_n$, which is the maximum amount of gas for executing the transaction. A transaction whose gas cost is beyond the current block gas limit will be rejected by the network. Each transaction also specifies a gas price $\xi$, which is the rate paid to cloud miners in Ether per unit of gas. We let $g_n$ denote the amount of gas used when executing the access control contract. Accordingly, the contract fee (in Ether) that the authorized MD $n$ needs to pay for authentication is given as [31]:

$$C_n^{sc} = \xi \cdot \min\{g_n, \varpi_n\}. \tag{8}$$
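A short worked example of (8) is sketched below in Python. The function name and the gas values are assumptions chosen for illustration only.

```python
def contract_fee(gas_price, gas_used, gas_limit):
    """Eq. (8): fee in Ether, charged on the gas used but capped by the gas limit."""
    return gas_price * min(gas_used, gas_limit)


# Illustrative values: gas price of 2e-9 Ether per unit of gas.
fee_within_limit = contract_fee(2e-9, gas_used=50_000, gas_limit=100_000)
fee_capped = contract_fee(2e-9, gas_used=150_000, gas_limit=100_000)
```

In the first call the fee is driven by the gas actually used, whereas in the second call the gas limit caps the fee.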

The key notations used in this paper are listed in Table I.

TABLE I: List of key notations.

| Notation | Description |
|---|---|
| $\mathcal{N}$, $\mathcal{N}'$ | Set of all MDs / authorized MDs in the MECCO system |
| $D_n$ | Data size of the computation task of MD $n$ |
| $X_n$ | Total number of CPU cycles per task |
| $\tau_n$ | Completion deadline for executing a task |
| $\alpha_n^e$, $\alpha_n^c$ | Decision to process the task at the edge/cloud server |
| $p_n$, $p_n^i$ | Transmit power / idle power of MD $n$ |
| $N_0$ | Additive noise power spectral density |
| $B$ | Total radio bandwidth of the MECCO system |
| $w_n$ | Bandwidth allocated to MD $n$ |
| $h_n$ | Channel gain between MD $n$ and the MEC server |
| $r_n$ | Transmission data rate of MD $n$ |
| $f_n^e$, $f_n^c$ | Edge/cloud computation resource allocated to MD $n$ |
| $F^e$ | Total MEC computational capacity |
| $T_n^e$, $T_n^c$ | Computation latency of edge/cloud processing |
| $E_n^e$, $E_n^c$ | Energy consumption of edge/cloud processing |
| $C_n^{offload}$ | Offloading cost of the MECCO system |
| $g_n$ | Gas consumed by smart contract execution |
| $C_n^{sc}$ | Offloading authentication cost |

C. Formulation of the Secure Offloading Problem

In this subsection, we formulate the computation offloading, edge resource allocation, radio bandwidth allocation, and smart contract cost as a joint optimization problem. Our objective is to minimize the sum cost of computation latency, energy consumption, and smart contract cost for all MDs in our blockchain-based MECCO system. We formulate the offloading cost function of the MD $n$ as the weighted sum of computation latency and energy consumption, which is given as

$$C_n^{offload} = \beta_n^t T_n + \beta_n^e E_n, \tag{9}$$

where $\beta_n^t, \beta_n^e \in [0, 1]$ ($n \in \mathcal{N}$) denote the weights of latency and energy consumption, respectively. Thus, the total cost can be defined as the sum of the smart contract cost $C_n^{sc}$ and the offloading cost $C_n^{offload}$. To this end, we formulate the joint optimization of secure task offloading for the multi-user MECCO system over the offloading decisions $\mathbf{A} = [\alpha_1^e, \alpha_1^c, \ldots, \alpha_N^e, \alpha_N^c]$, the edge resource allocation $\mathbf{f} = [f_1, f_2, \ldots, f_N]$, the radio bandwidth allocation $\mathbf{w} = [w_1, w_2, \ldots, w_N]$, and the consumed gas $\mathbf{g} = [g_1, g_2, \ldots, g_N]$ as follows:

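$$
\begin{aligned}
(\text{P1}): \quad \underset{\mathbf{A}, \mathbf{f}, \mathbf{w}, \mathbf{g}}{\text{minimize}} \quad & \sum_{n=1}^{N} \left( C_n^{offload} + C_n^{sc} \right) \\
\text{subject to} \quad & (\text{C1}): \ \alpha_n^e, \alpha_n^c \in \{0, 1\}, \ \forall n \in \mathcal{N}, \\
& (\text{C2}): \ \alpha_n^e + \alpha_n^c = 1, \ \forall n \in \mathcal{N}, \\
& (\text{C3}): \ \sum_{n=1}^{N} f_n^e \le F^e, \\
& (\text{C4}): \ f_n^e \ge 0, \ \forall n \in \mathcal{N}, \\
& (\text{C5}): \ 0 < w_n \le 1, \ \forall n \in \mathcal{N}, \\
& (\text{C6}): \ \sum_{n=1}^{N} w_n \le 1, \\
& (\text{C7}): \ T_n \le \tau_n, \ \forall n \in \mathcal{N}, \\
& (\text{C8}): \ g_n \le \varpi_n.
\end{aligned}
$$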

Here, the constraints (C1) and (C2) represent the binary offloading decision policy of the MD $n$, i.e., offloading to the MEC server or offloading to the cloud server. (C3) and (C4) indicate that the allocated edge resources should be non-negative and should not exceed the total computing capacity of the MEC server, while (C5) and (C6) are the constraints on bandwidth allocation. Further, the execution time to complete a computation task should not exceed a maximum latency value, which is expressed in the constraint (C7). Also, (C8) represents the constraint on the gas execution cost in the offloading authentication. It is worth noting that the optimization problem (P1) is not convex due to the non-convexity of its feasible set and of its objective function with the binary variable $\mathbf{A}$. The size of the problem (P1) can be very large when the number of MDs in the MECCO system increases rapidly. Moreover, traditional optimization methods such as ADMM cannot solve the proposed problem well under high network dynamics. Indeed, works based on such methods assume that network statistics such as computation resource and radio bandwidth are known before making offloading decisions, an assumption that may not hold in highly dynamic blockchain-based MEC environments like our considered scenario. Moreover, the computation offloading policy designs in these works are mostly based on one-shot optimization and fail to characterize the long-term computation offloading performance in real-time task execution. Therefore, we here propose to use a DRL algorithm to adjust dynamically how much edge computation and bandwidth resource should be allocated to a certain MD based on the MD's task size so as to achieve the optimal offloading for the MECCO system, without requiring a priori knowledge of network statistics.
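To make the structure of (P1) concrete, the following Python sketch evaluates the objective and checks the constraints (C1)-(C8) for a candidate solution. It is only a feasibility and cost checker under assumed inputs, not a solver, and all function and argument names are hypothetical.

```python
def offload_cost(latency, energy, beta_t=0.6, beta_e=0.4):
    """Eq. (9): weighted sum of latency and energy for one MD."""
    return beta_t * latency + beta_e * energy


def total_cost(offload_costs, contract_costs):
    """Objective of (P1): sum of offloading and smart contract costs over all MDs."""
    return sum(co + cs for co, cs in zip(offload_costs, contract_costs))


def is_feasible(alpha_e, alpha_c, f_edge, w, latency, deadline,
                gas, gas_limit, edge_capacity):
    """Check constraints (C1)-(C8) for per-MD lists of equal length."""
    n = range(len(alpha_e))
    c1 = all(alpha_e[i] in (0, 1) and alpha_c[i] in (0, 1) for i in n)
    c2 = all(alpha_e[i] + alpha_c[i] == 1 for i in n)
    c3 = sum(f_edge) <= edge_capacity
    c4 = all(f >= 0 for f in f_edge)
    c5 = all(0 < x <= 1 for x in w)
    c6 = sum(w) <= 1
    c7 = all(latency[i] <= deadline[i] for i in n)
    c8 = all(gas[i] <= gas_limit[i] for i in n)
    return all((c1, c2, c3, c4, c5, c6, c7, c8))
```

A candidate with two MDs, one served at the edge and one at the cloud, passes the check as long as the bandwidth shares sum to at most one and both tasks meet their deadlines.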

IV. PROPOSED SOLUTIONS

In this section, we propose access control and computation offloading approaches for our MECCO system.

A. Access Control for MECCO System with Smart Contracts

In this subsection, we design a smart contract to formulate our access control scheme. We also provide an access protocol that presents the workflow of access control for the MECCO system.

1) Smart Contract Design: We first create an AccessControl contract controlled by the admin to monitor transaction operations in our MECCO network on blockchain. Here we use an Ethereum blockchain platform [29] due to its adaptable and flexible features, which allow building blockchain applications such as our IoT scenario. We denote PK as the public key of an MD. The contract mainly provides the following four functions.

• AddMD(PK): (executed by Admin) This function adds a new MD to the smart contract. The MD is identified by its public key and is added into the contract with a corresponding role based on its request. MD information is also kept in cloud storage as part of the system database.

• DeleteMD(PK): (executed by Admin) This function removes MDs from the network based on the corresponding public key. All device information is also deleted from cloud storage.

• PolicyList(PK): (executed by Admin) The policy list contains the public keys of all MDs for identification when the smart contract processes new transactions.

• Penalty(PK, action): (executed by Admin) When detecting an unauthorized offloading request to the cloud, the MECCO manager informs the smart contract to issue a penalty to the requester. In our paper, we give a warning message as a penalty to the unauthorized MDs.


pragma solidity ^0.4.22;

contract AccessControl {
    // [Declare variables, structures and functions]
    // Assumed to be declared here: Admin, MECCOManager, status,
    // numberOfdevices, devices[], deviceId, policies, the onlyAdmin
    // modifier, the events deviceAdded, deviceRemoved and
    // ReturnAccessResult, and the helper stringToBytes32().

    // Initializes the contract; the caller becomes the Admin.
    function smartcontract_creation(string AccessControl_Contract) public {
        Admin = msg.sender;
        status = true;
        adddevice(address(0), "");  // reserve index 0: id == 0 means "not registered"
        adddevice(Admin, "Creator of Smart Contract");
        numberOfdevices = 0;        // the two placeholder entries are not counted as MDs
    }

    // Registers an MD identified by its address (public key).
    function adddevice(address devicePK, string deviceNotes) onlyAdmin public {
        require(status == true);
        uint id = deviceId[devicePK];
        if (id == 0) {
            deviceId[devicePK] = devices.length;
            id = devices.length++;
        }
        devices[id] = device({device: devicePK});
        emit deviceAdded(devicePK, deviceNotes);
        numberOfdevices++;
    }

    // Removes a registered MD and closes the gap in the device array.
    function removedevice(address devicePK) onlyAdmin public {
        require(deviceId[devicePK] != 0);
        for (uint i = deviceId[devicePK]; i < devices.length - 1; i++) {
            devices[i] = devices[i + 1];
            deviceId[devices[i].device] = i;  // keep the index map in sync
        }
        delete devices[devices.length - 1];
        devices.length--;
        delete deviceId[devicePK];
        emit deviceRemoved(devicePK);
        numberOfdevices--;
    }

    // Records an access policy; only the MECCO manager may call this.
    function policyList(string _resource, string _action,
                        string _permission, address devicePK) public {
        bytes32 MDsresource = stringToBytes32(_resource);
        bytes32 action = stringToBytes32(_action);
        if (msg.sender == MECCOManager) {
            policies[MDsresource][action].devicePK = true;
            policies[MDsresource][action].permission = _permission;
            policies[MDsresource][action].result = false;
        } else {
            revert();  // replaces the deprecated `throw`
        }
    }

    // Checks a request against the policy list and reports the result.
    function penalty(address devicePK, string _resource,
                     string _action, uint _time) public {
        bytes32 MDsresource = stringToBytes32(_resource);
        bytes32 action = stringToBytes32(_action);
        uint penaltyValue = 0;
        if (msg.sender == MECCOManager) {
            bool checkID = policies[MDsresource][action].devicePK;
            if (checkID) {
                // Detect an authorized access
                emit ReturnAccessResult(msg.sender, "Successful!", true, _time, penaltyValue);
            } else {
                // Detect an unauthorized access
                emit ReturnAccessResult(msg.sender, "Failed!", false, _time, penaltyValue);
            }
        }
    }
}

Fig. 4: Pseudo-code of smart contract implementation for access control.

The smart contract design for the proposed access control scheme on the Ethereum blockchain can be seen in Fig. 4.

2) Access Control Protocol: To operate the access control for computation offloading, we also develop an access control protocol which is performed when an MD executes a transaction to request offloading data to the edge-cloud. The access control protocol includes two phases: Transaction pre-processing (executed by the MECCO manager) and Verification (executed by the Admin). In the first phase, the MECCO manager receives a new transaction Tx from an MD. The MECCO manager obtains the public key PK of the requester by using the Tx.getSenderPublicKey() function and sends it to the contract for validation. In the next phase, after receiving a transaction with an MD's PK from the MECCO manager (msg.sender = ME), the admin verifies the access rights of the requester based on its PK in the policy list of the smart contract. If the PK is available in the list, the request is accepted and a task offloading permission is granted to the requester. Otherwise, the smart contract issues a penalty for this request through the Penalty() function. In this case, all offloading activities are denied and the request is discarded from the blockchain network. The access control protocol is summarized in Algorithm 1.
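The decision logic of the protocol can be sketched off-chain in a few lines of Python. This is a simplified illustration that assumes the policy list is available as in-memory sets; the `Transaction` class, its field names, and the function name are hypothetical and are not part of the deployed contract.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Offloading request (fields are illustrative assumptions)."""
    sender_public_key: str
    device_id: str


def verify_offloading_request(tx, authorized_keys, authorized_devices):
    """Grant access only if both the public key and the device ID are listed.

    Returns a (granted, message) pair; a rejected request receives the
    warning message that serves as the penalty.
    """
    pk_ok = tx.sender_public_key in authorized_keys
    device_ok = tx.device_id in authorized_devices
    if pk_ok and device_ok:
        return True, "Successful!"
    return False, "Failed"


policy_keys = {"0xA1", "0xB2"}
policy_devices = {"phone-1", "phone-2"}
granted, message = verify_offloading_request(
    Transaction("0xA1", "phone-1"), policy_keys, policy_devices)
```

A request from a key that is not in the policy list, or from an unregistered device, is rejected with the warning message.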

B. Secure Computation Offloading for MECCO System with Advanced DRL

1) Problem Formulation: We focus on formulating the secure computation offloading problem via DRL, aiming to minimize the sum cost of all MDs in terms of computation latency, energy consumption, and smart contract cost in the proposed MECCO. We first introduce the offloading framework using DRL where state space, action space and reward are defined.

Algorithm 1 Access control for computation offloading
1: Input: Tx (the offloading request on blockchain)
2: Output: Result (access result for the offloading request)
3: Initialization: (by the MECCO manager)
4: Receive a new transaction Tx from an MD
5: Get the public key of the requester: PK ← Tx.getSenderPublicKey()
6: Send the public key to Admin (msg.sender = MECCOmanager)
7: Pre-processing the request (by Admin)
8: if PK is available in the policy list then
9:     policyList(PK) ← true
10: end if
11: Decode the transaction: decodedTx ← abiDecoder.decodeMethod(Tx)
12: Specify request information: Addr ← web3.eth.getData(decodedTx([DataIndex]))
13: Specify DeviceID: DID ← Addr(Index[DID]);
14: Verification (by the smart contract)
15: while true do
16:     if policyList(PK) → true then
17:         if policyList(DID) → true then
18:             Result ← Penalty(PK, "Successful!")
19:             break;
20:         else
21:             Result ← Penalty(PK, "Failed")
22:             break;
23:         end if
24:     else
25:         Result ← Penalty(PK, "Failed")
26:         break;
27:     end if
28: end while

• State: The system state is chosen as $s = \{tc, ec, bw\}$, where $tc$ is the total offloading cost of the MECCO system ($tc = C$) and $ec$ is the available computation resource of the MEC server ($ec = F^e - \sum_{n=1}^{N} f_n^e$). Further, $bw$ denotes the available bandwidth resource of the MECCO system, and can be specified as $bw = B - \sum_{n=1}^{N} w_n$. Moreover, we estimate the consumed gas cost $g_n$ as the state of smart contract usage when offloading the task of the MD $n$.

• Action: The action space is formulated as the offloading decision vector $\mathbf{A} = [\alpha_1^e, \alpha_1^c, \ldots, \alpha_N^e, \alpha_N^c]$, the edge resource allocation $\mathbf{f} = [f_1, f_2, \ldots, f_N]$ and the radio bandwidth allocation $\mathbf{w} = [w_1, w_2, \ldots, w_N]$. Thus, the action vector can be expressed as $a = [\alpha_1^e, \alpha_1^c, f_1, w_1, \ldots, \alpha_N^e, \alpha_N^c, f_N, w_N]$.

• Reward: The objective of the RL agent is to find an optimal offloading action $a$ at each state $s$ with the aim of minimizing the sum cost $C(s, a)$ of the offloading cost $C^{offload}(s, a)$ and the smart contract cost $C^{sc}(s, a)$ in the MECCO system. Also, the reward function should be negatively related to the objective function of the optimization problem (P1) in the previous section. Accordingly, we can formulate the system reward as

$$r(s, a) = -C(s, a) = -\left( C^{offload}(s, a) + C^{sc}(s, a) \right). \tag{10}$$
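The state, action and reward definitions above can be sketched in Python as follows. The helper names are hypothetical, and the state is assumed to carry the gas estimate as a fourth component, as in the state used in Algorithm 2.

```python
def build_state(total_cost, edge_capacity, f_alloc, bandwidth, w_alloc, gas):
    """State s = {tc, ec, bw, g}: cost, free edge resource, free bandwidth, gas."""
    ec = edge_capacity - sum(f_alloc)   # ec = F^e - sum_n f_n^e
    bw = bandwidth - sum(w_alloc)       # bw = B - sum_n w_n
    return (total_cost, ec, bw, gas)


def build_action(alpha_e, alpha_c, f_alloc, w_alloc):
    """Action a = [alpha_1^e, alpha_1^c, f_1, w_1, ..., alpha_N^e, alpha_N^c, f_N, w_N]."""
    action = []
    for ae, ac, f, w in zip(alpha_e, alpha_c, f_alloc, w_alloc):
        action.extend([ae, ac, f, w])
    return action


def reward(offload_cost, contract_cost):
    """Eq. (10): the reward is the negative of the total system cost."""
    return -(offload_cost + contract_cost)
```

With this convention, a lower system cost corresponds to a larger (less negative) reward.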

2) Basics of Deep Reinforcement Learning: The principle of Reinforcement Learning (RL) can be described as a Markov Decision Process (MDP) [17]. In the RL model, an agent can learn to take optimal actions by interacting with the environment without an explicit model of the system dynamics. In our MECCO scenario, at the beginning, the agent has no experience of or information about the MECCO environment. Thus, it needs to explore in every time epoch by taking some actions at each offloading state, e.g., the current IoT data size and the available edge resource. Once the agent has gained some experience from actual interactions with the environment, it exploits the known information of states while continuing to explore. As a combination of the Monte Carlo method and dynamic programming, a temporal-difference (TD) approach can be employed to allow the agent to learn offloading policies without requiring the state transition probability, which is difficult to acquire in realistic scenarios such as our dynamic mobile blockchain. Therefore, we can develop a dynamic offloading scheme using model-free RL. Specifically, in this paper, our focus is to find the optimal policy that minimizes the offloading cost $C$. To this end, the state-action function can be updated using the experience tuple of the agent $(s_t, a_t, r_t, s_{t+1})$ at each time step $t$ in our offloading application as

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r(s_t, a_t) + \gamma \min_{a_{t+1}} Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right], \tag{11}$$

which is known as the Q-learning algorithm [15]. Here $\alpha$ is the learning rate, $\gamma \in (0, 1)$ is the discount factor, and $\sigma_t = r(s_t, a_t) + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t)$ is the TD error, which will be zero for the optimal Q-value. Further, under the optimal policy $\pi^*$, which can be obtained from the maximum Q-value ($\pi^*(s) = \arg\max_a Q^*(s, a)$), the Bellman optimality equation [17] for the state-action function can be expressed as

$$Q^*(s_t, a_t) = \mathbb{E}_{s_{t+1} \sim \mathcal{E}} \left[ r(s_t, a_t) + \gamma \min_{a_{t+1}} Q^*(s_{t+1}, a_{t+1}) \right]. \tag{12}$$

It is worth noting that the Q-learning algorithm is proved to converge with probability one over an infinite number of iterations [20] and to achieve the optimal $Q^*$. Although reinforcement learning can solve the offloading problem by obtaining the optimum reward, some problems remain. The state and action values in the Q-learning method are stored in a two-dimensional Q-table, which becomes infeasible for complex problems with a much larger state-action space. This is because if we keep all Q-values in a table, the matrix $Q(s, a)$ can be very large, which makes it difficult for the learning agent to obtain sufficient samples to explore each state, leading to the failure of the learning algorithm. Moreover, the algorithm converges slowly due to the large number of states that the agent has to process.
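The tabular update in (11) can be illustrated with the following Python sketch. It keeps the Q-table in a dictionary and, by default, follows the min operator as printed in (11) and (12); the `greedy` argument is an added assumption that lets a caller pass `max` to obtain the reward-maximizing form that appears in the TD error expression. State and action labels are arbitrary.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.85, greedy=min):
    """One temporal-difference update of the Q-table entry Q[(s, a)].

    Q        : dict mapping (state, action) to a value, defaulting to 0.0
    actions  : iterable of actions available in s_next
    greedy   : min as written in (11); pass max for the reward-maximizing form
    Returns the TD error used in the update.
    """
    bootstrap = greedy(Q[(s_next, b)] for b in actions)
    td_error = r + gamma * bootstrap - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return td_error


Q = defaultdict(float)
actions = ["edge", "cloud"]
first_error = q_update(Q, s=0, a="edge", r=-2.0, s_next=1, actions=actions)
```

Starting from an all-zero table, the first update moves `Q[(0, "edge")]` a fraction `alpha` of the way towards the observed reward.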

To overcome such challenges, we can use deep learning with a Deep Neural Network (DNN) to approximate the Q-values instead of using the conventional Q-table, leading to a new algorithm called DRL. In the DRL-based algorithm, a DNN is used to approximate the target Q-values $Q(s_t, a, \theta)$ with weights $\theta$. Further, to mitigate the instability of the Q-network caused by function approximation, experience replay is employed in the training phase with the buffer $\mathcal{B}$, which stores experiences $e_t = (s_t, a_t, r_t, s_{t+1})$ at each time step $t$. Next, a random mini-batch of transitions $(s_j, a_j, r_j, s_{j+1})$ from the replay memory is selected to train the Q-network. Here the Q-network is trained by iteratively updating the weights $\theta$ to minimize the loss function, which is written as

$$L(\theta) = \mathbb{E}\left[ \left( y_j - Q(s_j, a_j | \theta_j) \right)^2 \right], \tag{13}$$

where $y_j = r_j + \gamma \min_{a_{j+1}} Q(s_{j+1}, a_{j+1} | \theta')$ and $\mathbb{E}[\cdot]$ denotes the expectation function.
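The target and loss in (13) can be computed for a sampled mini-batch as in the Python sketch below. The Q-networks are represented by plain callables that return one value per action, which is an assumption made to keep the example free of any deep learning library; the `greedy` argument defaults to min as printed in (13).

```python
def td_targets(batch, q_target, gamma=0.85, greedy=min):
    """Targets y_j = r_j + gamma * min_a Q(s_{j+1}, a | theta') for each transition."""
    return [r + gamma * greedy(q_target(s_next)) for (_, _, r, s_next) in batch]


def mse_loss(batch, targets, q_online):
    """Mini-batch estimate of (13): mean squared error between y_j and Q(s_j, a_j | theta)."""
    errors = [(y - q_online(s)[a]) ** 2
              for (s, a, _, _), y in zip(batch, targets)]
    return sum(errors) / len(errors)


# Toy stand-ins for the online and target networks (two actions, any state).
def q_online(state):
    return [1.0, 2.0]


def q_target(state):
    return [0.5, 1.5]


batch = [(0, 0, -1.0, 1), (1, 1, -2.0, 0)]   # (s_j, a_j, r_j, s_{j+1})
targets = td_targets(batch, q_target, gamma=0.5)
loss = mse_loss(batch, targets, q_online)
```

In a full implementation the gradient of this loss with respect to $\theta$ would drive the weight update, while $\theta'$ is held fixed between synchronizations.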

3) Advanced DRL-based Computation Offloading: Recent years have witnessed great efforts in deep reinforcement learning to improve the performance of DRL-based algorithms. In this paper, we focus on two recent improvements in DRL research and apply them to our formulated offloading problem, namely double DQN and dueling DQN [19].

• Double DQN: In the conventional DQN, we use the same samples from the replay memory both to specify which action is the best and to estimate this action value, which leads to a large over-estimation of action values. To solve this problem, two Q-functions are used to select and evaluate the action values through a new loss function:

$$L_{dou}(\theta) = \mathbb{E}\left[ \left( y_j^{dou} - Q(s_j, a_j | \theta_j) \right)^2 \right], \tag{14}$$

where

$$y_j^{dou} = r_j + \gamma \, Q\left( s_{j+1}, \arg\min_{a_{j+1}} Q(s_{j+1}, a_{j+1} | \theta_1), \theta_2 \right). \tag{15}$$

It is worth noting that the action choice is still based on the weight $\theta_1$, while the evaluation of the selected action relies on the weight $\theta_2$. This technique mitigates the over-estimation problem and thus improves the training process and, in turn, the efficiency of the DQN algorithm.

• Dueling DQN: During the computation of the Q function in some MDP problems, it is unnecessary to estimate action and state values at the same time, and thus we can estimate the action and state value functions separately. Motivated by this concept, in dueling DQN, the state-action value $Q(s, a)$ can be decomposed into two value functions as follows:

$$Q(s, a) = V(s) + A(a). \tag{16}$$

Here, $V(s)$ is the state-value function, which evaluates the significance of being at a given state $s$, and $A(a)$ is the action-value function, which estimates the significance of choosing an action $a$ compared to other actions. $V(s)$ and $A(a)$ are first computed separately and are then combined to generate the final output $Q(s, a)$. This approach leads to better performance on policy evaluation of MDPs, especially for complex problems with a large action space such as our considered offloading scenario.
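Both improvements can be illustrated with a small Python sketch. The first function forms the double DQN target by selecting the next action with one set of Q-values and evaluating it with the other; the second combines a state value and per-action values as in (16). Function names are hypothetical, Q-values are passed as plain lists, and the selection operator defaults to min following the cost-minimizing convention of the equations above.

```python
def double_dqn_target(r, q_select_next, q_eval_next, gamma=0.85, pick=min):
    """Double DQN target for one transition.

    q_select_next : Q(s_{j+1}, . | theta_1), used only to choose the action
    q_eval_next   : Q(s_{j+1}, . | theta_2), used only to evaluate that action
    """
    best_action = pick(range(len(q_select_next)), key=lambda a: q_select_next[a])
    return r + gamma * q_eval_next[best_action]


def dueling_q(state_value, action_values):
    """Eq. (16): Q(s, a) = V(s) + A(a) for every action a."""
    return [state_value + a for a in action_values]


y = double_dqn_target(-1.0, [3.0, 1.0, 2.0], [10.0, 20.0, 30.0], gamma=0.5)
q = dueling_q(2.0, [0.5, -0.5])
```

In the example, the selection values choose the second action, and the target is then built from the evaluation value of that same action rather than from the smallest evaluation value.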

Motivated by the advantages of the above approaches, we develop a novel DQN algorithm using these two improvements to solve the offloading problem of our MECCO system. The details of the proposed algorithm are shown in Algorithm 2. The ADRLO algorithm can achieve the optimal task offloading strategy in an iterative manner. Here, the procedure generates a task offloading strategy for MDs based on system states and observes the system reward at each time epoch so that the offloading policy can be optimized (lines 8-16). Then the procedure updates the history experience tuple and trains the Q-network (lines 18-22) with loss function minimization. This trial-and-error solution avoids the requirement of prior information about the offloading environment. Over the training period, the trained deep neural network can characterize the environment well, and therefore the proposed offloading algorithm can dynamically adapt to the real MECCO environment.

Algorithm 2 Advanced DRL-based computation offloading (ADRLO) algorithm for MECCO system
1: Initialization:
2: Set replay memory $D$ with capacity $N$
3: Initialize the deep Q network $Q(s, a)$ with random weights $\theta$ and $\theta'$, initialize the exploration probability $\epsilon \in (0, 1)$
4: for episode = 1, ..., M do
5:     Initialize the state sequence $s_0$
6:     for t = 1, 2, ... do
7:         /*** Plan the computation offloading ***/
8:         Estimate the current offloading cost $tc_t$
9:         Estimate the available edge resource $ec_t$
10:        Estimate the available bandwidth resource $bw_t$
11:        Estimate the consumed gas cost $g_t$
12:        Set $s_t = \{tc_t, ec_t, bw_t, g_t\}$
13:        Select a random action $a_t$ with probability $\epsilon$, otherwise $a_t = \arg\min_a Q(s_t, a, \theta)$
14:        Offload the computation task to the edge server ($\alpha_e^t(D^t)$) or to the cloud server ($\alpha_c^t(D^t)$)
15:        Observe the reward $r_t$ and next state $s_{t+1}$
16:        Evaluate the system cost $C(s, a)_t$
17:        /*** Update ***/
18:        Store the experience $(s_t, a_t, r_t, s_{t+1})$ into the memory $D$
19:        Sample a random mini-batch of state transitions $(s_j, a_j, r_j, s_{j+1})$ from $D$
20:        Calculate the target Q-value by $y_j^{dou} = r_j + \gamma \, Q(s_{j+1}, \arg\min_{a_{j+1}} Q(s_{j+1}, a_{j+1} | \theta), \theta')$
21:        Perform a gradient descent step with the weight $\theta_j$ on $(y_j^{dou} - Q(s_j, a_j | \theta_j))^2$ as the loss function
22:        Train the deep Q-network with updated $\theta$ and $\theta'$
23:    end for
24: end for
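Two building blocks of Algorithm 2, the replay memory (lines 2, 18 and 19) and the ε-greedy action selection (line 13), can be sketched in Python as follows. The class and function names are hypothetical, actions are assumed to be indexed by integers, and the greedy step takes the smallest Q-value as written in line 13.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer of (s, a, r, s_next) tuples; old entries are evicted first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        """Draw a random mini-batch without replacement (smaller if the buffer is short)."""
        size = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), size)

    def __len__(self):
        return len(self.buffer)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else argmin of Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return min(range(len(q_values)), key=lambda a: q_values[a])
```

Decaying `epsilon` over episodes is a common way to shift gradually from exploration to exploitation, although the schedule is a design choice left open here.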

To this end, we propose a novel MECCO algorithm by combining access control and computation offloading on a mobile blockchain IoT network. The concept of the integrated scheme is shown in Algorithm 3, which can be explained as follows. We first create a private Ethereum blockchain environment on the cloud, i.e., the Amazon cloud platform, to perform access control and offloading functionalities. With the blockchain set up, we can deploy a smart contract and connect it with a network of MDs, e.g., smartphones, to form a MECCO system (lines 1-3). It is assumed that all MDs need to offload their computation tasks to edge or cloud servers for execution. In the access control phase, each MD sends an offloading request as a blockchain transaction to the cloud server for authentication. The smart contract deployed on the cloud verifies the transaction to accept or refuse the request. If the request is accepted, the process goes to the offloading phase; otherwise, a penalty is given to the requestor (lines 4-14). In the offloading phase, it is assumed that all MDs authorized in the access control phase are grouped into a new set of devices ($\mathcal{N}'$). We perform the offloading algorithm using a DRL network to optimize the offloading cost for our MECCO. The process finishes with the mining of the transaction and its appending to the blockchain (lines 15-20).

Algorithm 3 Joint access control and computation offloading on blockchain for IoT networks
1: Initialize private blockchain, and set up N Ethereum nodes for all MDs
2: Deploy smart contracts on the cloud
3: Blockchain starts mining
4: Access control phase:
5: for each MD $n$ in $\mathcal{N}$ do
6:     Initialize the transaction Tx for offloading requests
7:     Submit the transaction Tx to MECCO manager
8:     Perform access control via Algorithm 1
9:     if verification is successful then
10:        Go to the offloading phase
11:    else
12:        Reject the offloading request
13:    end if
14: end for
15: The offloading phase:
16: for each authorized MD $n$ in $\mathcal{N}'$ do
17:     Perform task offloading via Algorithm 2
18:     Verify the offloading transaction Tx by mining process
19:     Upload the transaction Tx to blockchain and wait for confirmation
20: end for

The computational complexity of our algorithm mostly comes from the complexity of running the DNN. We assume that there are $K$ neurons at the input layer of the DNN and $Z$ neurons at the output layer. Also, the number of hidden layers is $L$, and the number of neurons at each hidden layer is $H$. Accordingly, the computation cost of the DNN is $KH + (L-1)H^2 + HZ = O(H(K + (L-1)H + Z))$. Also, the complexity of applying the activation function is $O(HL)$. Hence, the total complexity is $O(H(K + HL - H + Z + L))$, which can be simplified as $O(H(K + HL + Z))$.
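As a quick numerical check of the expression above, the following Python sketch counts the multiplications of one forward pass for a network with equally sized hidden layers. The function names and the example sizes are illustrative assumptions.

```python
def dnn_multiplications(K, H, L, Z):
    """Weight multiplications in one forward pass: KH + (L-1)H^2 + HZ."""
    return K * H + (L - 1) * H * H + H * Z


def activation_evaluations(H, L):
    """Activation function evaluations in the hidden layers: HL."""
    return H * L


# Example: 4 inputs, 3 hidden layers of 32 neurons, 4 outputs.
mults = dnn_multiplications(K=4, H=32, L=3, Z=4)
acts = activation_evaluations(H=32, L=3)
```

For these sizes the hidden-to-hidden term $(L-1)H^2$ dominates the count, which is why the simplified bound is driven by $H^2 L$.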

V. SIMULATIONS AND PERFORMANCE EVALUATIONS

In this section, we investigate the proposed MECCO system by conducting both real experiments to evaluate the access control performance and numerical simulations to evaluate the efficiency of our proposed secure computation offloading scheme.

A. Implementation Settings

We considered an access control framework for MECCO on a mobile cloud as shown in Fig. 5. We deployed a private Ethereum blockchain on the Amazon cloud computing platform (https://aws.amazon.com/blockchain/), where two AWS EC2 virtual machines were employed as the miners and two Ubuntu 16.04 LTS virtual machines were used as the admin and the MECCO manager, respectively. Our smart contract was written in the Solidity programming language [29] as shown in Fig. 4 and was deployed on AWS Lambda functions. Each function interacts with the cloud blockchain via the web3.js API. MDs can interact with smart contracts through their Android phone, on which a Geth client (a command line interface implemented in the Go language) was installed to transform each smartphone into an Ethereum node. By using the Geth client, an MD can create an Ethereum account to communicate with our blockchain network for accessing data. The web3.js library, a JavaScript library for working with smart contracts and blockchain, was also used for developing the mobile application to connect with the Ethereum blockchain network. In our experiment, we used two Sony mobile phones running Android OS version 8.0 to investigate the access control results.

Fig. 5: Experiment setup for access control implementation. [Figure: blockchain clients/mobile devices (MDs) send offloading requests through an access point to a server running the Ethereum blockchain on Amazon Cloud.]

Moreover, in our simulation for computation offloading, a MECCO system is considered with a cloud server and a MEC server, together with a number of MDs authorized by the access control mechanism. Here we consider $N \in [2, 30]$ MDs, each of which has a computation task to be executed at the edge or cloud server. We assume that the data size of IoT computation tasks $D$ is randomly distributed between 0.1 MB and 12 MB [29]. The transmit power $p_n$ is 0.4 W. The total bandwidth resource $B$ is set to 15 MHz [20]; the additive noise power spectral density $N_0$ is -100 dBm/Hz [19], [20]. Besides, the total computational capacities of the MEC server $F^e$ and the cloud server $F^c$ are set to 2 GHz and 5 GHz, respectively [18]. The required number of CPU cycles $X_n$ is set to [0.8 − 1.5] Gcycles. The completion deadline for executing a task $\tau_n$ is 1000 ms. The channel gain between MDs and the MEC server $h_n$ is generated using the distance-dependent path-loss model $L[\text{dB}] = 140.7 + 36.7 \log_{10} d[\text{km}]$ [20], [29]. The factors $\beta_n^t$, $\beta_n^e$ are set to 0.6 and 0.4, respectively. Also, the gas price $\xi$ at the cloud is set to 2e−9 and the contract gas demands $g_n$ are [104 − 109] gas [14].
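The path-loss model and the task parameters above can be turned into simulation inputs as in the following Python sketch. The function names are hypothetical, and uniform sampling of the task size and CPU cycles within the stated ranges is an assumption, since the distribution is not specified beyond the ranges.

```python
import math
import random


def path_loss_db(distance_km):
    """Distance-dependent path loss: L[dB] = 140.7 + 36.7 * log10(d[km])."""
    return 140.7 + 36.7 * math.log10(distance_km)


def channel_gain(distance_km):
    """Linear channel gain corresponding to the path loss in dB."""
    return 10 ** (-path_loss_db(distance_km) / 10)


def sample_task(rng):
    """Draw one task: data size in MB and required CPU cycles (uniform, assumed)."""
    data_mb = rng.uniform(0.1, 12.0)
    cpu_cycles = rng.uniform(0.8e9, 1.5e9)
    return data_mb, cpu_cycles


rng = random.Random(0)
tasks = [sample_task(rng) for _ in range(5)]
gain_at_100m = channel_gain(0.1)
```

At a distance of 1 km the model gives a path loss of 140.7 dB, and each tenfold reduction in distance lowers the loss by 36.7 dB.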

Regarding the DRL algorithm, the architecture of the DNN-based training model needs to be built carefully. Increasing the number of hidden layers makes the algorithm more complex and lengthens the training time. However, if the numbers of hidden layers and of neurons on each layer are too small, the algorithm may not achieve the desired convergence. In this paper, the parameters are selected by performing multiple tests and numerical simulations. In this particular work, the discount factor $\gamma$ is set to 0.85, and the replay memory capacity and training batch size are $10^5$ and 128, respectively. The DNN structure includes three hidden layers, where the first, second, and third hidden layers have 64, 32 and 32 neurons, respectively. We employ ReLU as the activation function in the hidden layers, while the sigmoid activation function is utilized in the output layer to relax the offloading decision variables. The simulations were implemented in Python with TensorFlow 2.0 on a computer with an Intel Core i7 4.7 GHz CPU and 128 GB of memory.
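The layer structure described above can be sketched as a forward pass in plain Python. This is an illustrative stand-in for the TensorFlow model, not the authors' implementation: the input and output sizes, the weight initialization, and all function names are assumptions.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_layer(n_in, n_out, rng):
    """Random weights (He-style scaling, assumed) and zero biases."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_in, n_out, hidden=(64, 32, 32), seed=0):
    """Three hidden layers with 64, 32 and 32 neurons, followed by the output layer."""
    rng = random.Random(seed)
    sizes = [n_in, *hidden, n_out]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def dense(x, layer, activation):
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def forward(network, x):
    """ReLU in the hidden layers, sigmoid at the output layer."""
    for layer in network[:-1]:
        x = dense(x, layer, relu)
    return dense(x, network[-1], sigmoid)


net = build_network(n_in=4, n_out=8)
outputs = forward(net, [0.1, 0.2, 0.3, 0.4])
```

Because of the sigmoid output layer, every output lies strictly between 0 and 1, which is what allows the binary offloading decisions to be relaxed to continuous values.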

The concept of data training in our DRL algorithm can be explained as follows. Each episode consists of 50 time slots (or steps), and the DRL training is performed over 2000 episodes. At the beginning of each time step in a training episode, the action set is chosen randomly, including offloading decision, edge resource, and bandwidth resource values. Then, the agents (MDs) interact with the offloading environment (established by MDs, MEC and cloud servers) to compute the computation latency, energy consumption, and smart contract fee values of the offloading process based on the predefined functions described in the previous sections. This aims to create a matrix of data for computing the reward later. Then, the current state and action are stored, and the next state and the reward value are updated, ready for the next step of training. The training is iterated within the given episode until the algorithm converges.

Fig. 6: The running Ethereum blockchain on Amazon cloud for access control of MECCO system.

B. Access Control Performance

In this subsection, we implemented access control on blockchain and verified the proposed scheme. We first deployed a private Ethereum blockchain on Amazon cloud as illustrated in Fig. 6. Offloading access and transactions are recorded and shown on the web interface for monitoring. Based on our blockchain settings, we deployed smart contracts, established the network entities of the access control as explained in Fig. 3, and connected them with mobile applications to build our access control framework.

We evaluated the performance of the proposed access control scheme via two use cases with authorized and unauthorized access for offloading requests, as illustrated in Fig. 7. The main objective of the access control is to verify effectively the offloading request from the MDs and prevent any potential threats to our MECCO system on the mobile edge-cloud. The evaluation is presented as follows. First, it is assumed that the owner of a smartphone wants to offload its data task to the cloud or edge server for calculation. The owner should create an Ethereum blockchain account, register an access right for the device by providing the device name and device ID, and make a transaction for an offloading request (Fig. 7(a)). Note that based on the blockchain concept, the transaction is signed by a private key coupled with a public key for device identification. After receiving the transaction from the MD, the MECCO manager will authorize the request using the smart contract. If the MD is verified by the smart contract, the request is confirmed for task offloading. Accordingly, a response is given to the requestor for confirmation so that the MD can offload its data to the edge or cloud server (Fig. 7(b)). Now the access control process finishes, and the offloading transaction is appended to the blockchain by the cloud miner and broadcast to all entities in the network. Therefore, an MD can keep track of the offloading access (Fig. 7(c)), which improves network trustworthiness.

Fig. 7: Illustrations of access control results.

Fig. 8: Comparison of average system rewards with different learning rates. [Plot: average system reward versus training episodes (0 to 2000) for learning rates of 0.01, 0.001, and 0.0001.]

In the case of unauthorized access, the smart contract verifies the request and detects the violation through the access protocol with a predefined policy list. Such an illegal request is blocked and discarded from the blockchain, and a warning message is returned to the requester (Fig. 7(d)). A corresponding transaction for unauthorized access is also issued by the smart contract (Fig. 7(e)). These results indicate that the use of blockchain can effectively address the challenges reported in the literature in controlling access information and monitoring offloading behaviours, which can enhance system reliability and data privacy.

C. Computation Offloading Performance

We first evaluate the training performance of our ADRLO algorithm. Fig. 8 shows the average system reward with different learning rates. It can be seen that the learning rate affects the learning rewards over the training episodes. That is, when the learning rate decreases, the convergence performance of the proposed algorithm degrades due to slow learning speed. Based on our experimental results, a learning rate of 0.01 yields the best reward performance and a good convergence rate, and thus we use it in the following system simulations and evaluations.

We then evaluate the proposed ADRLO scheme on various performance metrics, with extensive analysis of the offloading costs of latency and energy consumption under different parameter settings. To highlight the advantage of the proposed offloading scheme in terms of offloading cost efficiency, we compare our ADRLO algorithm with the following baseline schemes:

• DRL-based offloading scheme (DRLO) [15]: Computation offloading in the MECCO system is performed by an offloading algorithm using regular DRL.

• Edge offloading scheme (EO) [11]: All MDs offload their computation tasks to the MEC server, i.e., the offloading decision vector is set to $(\alpha_n^e = 1, \alpha_n^c = 0), \forall n \in \mathcal{N}$.

• Cloud offloading scheme (CO) [3]: All MDs offload their computation tasks to the cloud server, i.e., the offloading decision vector is set to $(\alpha_n^c = 1, \alpha_n^e = 0), \forall n \in \mathcal{N}$.
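The two fixed baselines above amount to constant decision vectors, which a short sketch makes concrete. The function names are hypothetical, each MD's decision is written as a pair (edge indicator, cloud indicator), and the feasibility rule (binary indicators, at most one remote target per MD) is an assumption made for illustration rather than a constraint quoted from the paper.

```python
def edge_offloading(num_mds):
    """EO baseline: edge indicator 1 and cloud indicator 0 for every MD."""
    return [(1, 0) for _ in range(num_mds)]


def cloud_offloading(num_mds):
    """CO baseline: cloud indicator 1 and edge indicator 0 for every MD."""
    return [(0, 1) for _ in range(num_mds)]


def is_feasible(decisions):
    """Assumed constraint: binary indicators, at most one remote target per MD."""
    return all(
        a_e in (0, 1) and a_c in (0, 1) and a_e + a_c <= 1
        for a_e, a_c in decisions
    )


eo = edge_offloading(3)    # every MD sends its task to the MEC server
co = cloud_offloading(3)   # every MD sends its task to the cloud server
```

A learned policy such as DRLO or ADRLO would instead choose each pair per MD and per time slot, which is what the fixed baselines cannot do.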

Further, to investigate the performance of the different offloading schemes, the simulation results are averaged over 50 runs of numerical simulations.

We first evaluate the total MECCO cost versus the number of MDs and tasks in Fig. 9. More specifically, in Fig. 9(a), we consider a MECCO system with a varying number of MDs, where each MD has a computation task whose size varies between 0.1 MB and 1 MB. As indicated by the simulation results, the costs of all offloading schemes increase with the growing number of MDs. In particular, the CO scheme has the highest offloading cost among all offloading schemes. An explanation is that the computation tasks of MDs in this test case are relatively small, so offloading to the remote cloud incurs a larger transmission latency and thus leads to a higher offloading cost. Meanwhile, thanks


Fig. 9: Total offloading cost versus the number of MDs and tasks. (a) Total cost versus the number of MDs (x-axis: Number of Mobile Devices; y-axis: Total Cost). (b) Total cost versus the number of tasks (x-axis: Task Size (MB); y-axis: Total Cost). Curves: the proposed ADRLO scheme, the DRL scheme [15], the EO scheme [11], and the CO scheme [3].

Fig. 10: Total offloading cost versus resource allocation. (a) Total cost versus edge resource (x-axis: Computational Capacity of MEC server (GHz); y-axis: Total Cost). (b) Total cost versus bandwidth resource (x-axis: Total System Bandwidth (MHz); y-axis: Total Cost).

to the low-latency computation services of the MEC server, the EO scheme shows better offloading efficiency, with lower offloading costs in all user cases. More importantly, the DRLO and ADRLO schemes exhibit much lower offloading costs, and the ADRLO scheme achieves the best performance among all offloading schemes, which can be explained as follows. First, in the proposed advanced DRL algorithm, the double DQN obtains the optimal policy with larger system gains than the regular DQN. Second, the dueling DQN architecture leads to better policy evaluation by evaluating the state-value function and the action-value function separately. These advanced techniques make the ADRLO scheme more efficient in terms of both the evaluation of the optimal offloading policy and the offloading performance.
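The two techniques named above can be illustrated with a minimal, framework-free sketch. It follows the commonly used formulations, which may differ in detail from the paper's network: the dueling head combines a state value V(s) with per-action advantages A(s, a), and the double DQN target lets the online network select the next action while the target network evaluates it. All function names and numbers are illustrative assumptions.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def regular_dqn_target(reward, gamma, q_target_next):
    """Regular DQN: the target network both selects and evaluates the action."""
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next):
    """Double DQN: the online network selects, the target network evaluates."""
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


# Illustrative next-state estimates for three offloading actions.
q_online_next = dueling_q(1.0, [0.5, -0.5, 0.0])   # online network output
q_target_next = [0.8, 2.0, 1.1]                     # target network output

y_regular = regular_dqn_target(1.0, 0.9, q_target_next)
y_double = double_dqn_target(1.0, 0.9, q_online_next, q_target_next)
```

With these toy numbers the regular target uses the largest target-network estimate, while the double DQN target uses the estimate of the action preferred by the online network, which is how double DQN tempers overestimation.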

Next, we analyze the computation offloading performance for the MECCO system with a single MD (N = 1) whose task size varies between 2 MB and 12 MB, as shown in Fig. 9(b). It is observed that when the task size increases, the cost of all four schemes increases due to the growing amount of IoT data to be processed. In particular, when the task size is small (<5 MB), the CO scheme has a higher offloading cost than the EO scheme. The reason is that small tasks can be processed efficiently by the MEC server, which has sufficient computation resources for them. Therefore, offloading small tasks to the remote cloud results in unnecessary transmission latency and consequently incurs a higher total offloading cost for the CO scheme. However, when the task size becomes larger (>5 MB), the computational capacity of the MEC server becomes insufficient to accommodate all tasks, while the resourceful cloud server can compute large tasks effectively. As a result, the CO scheme achieves a much lower offloading cost than the EO scheme.


Fig. 11: Total costs of offloading schemes (x-axis: Task Size (MB); y-axis: Total Cost). Curves: the proposed ADRLO scheme, the ADRLO scheme w/o edge allocation, and the ADRLO scheme w/o bandwidth allocation.

Fig. 12: Smart contract cost for offloading authentication versus the number of MDs (x-axis: Number of Mobile Devices; y-axis: Smart Contract Cost (Ether)). Curves: the proposed ADRLO scheme and the DRL scheme [15].

Note that the ADRLO scheme still achieves the minimum total offloading cost, followed by the DRLO scheme with a small gap, as the task size increases.
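The crossover between edge and cloud offloading around 5 MB can be reproduced qualitatively with a toy latency model. The sketch below is an assumption-laden illustration, not the paper's cost model: it counts latency only (no energy or contract fee), uses a fixed extra backhaul delay for the cloud, and every constant is hypothetical and chosen so that the crossover falls near 5 MB.

```python
UPLINK_MB_PER_S = 10.0    # assumed wireless uplink rate, same for both targets
CYCLES_PER_MB = 5e8       # assumed CPU cycles needed per MB of task data
EDGE_CPU_HZ = 5e9         # assumed MEC server capacity
CLOUD_CPU_HZ = 20e9       # assumed cloud server capacity
BACKHAUL_DELAY_S = 0.375  # assumed extra delay to reach the remote cloud


def offloading_latency(task_mb, cpu_hz, extra_delay_s=0.0):
    """Latency = upload time + optional backhaul delay + execution time."""
    transmission = task_mb / UPLINK_MB_PER_S
    execution = task_mb * CYCLES_PER_MB / cpu_hz
    return transmission + extra_delay_s + execution


def edge_cost(task_mb):
    return offloading_latency(task_mb, EDGE_CPU_HZ)


def cloud_cost(task_mb):
    return offloading_latency(task_mb, CLOUD_CPU_HZ, BACKHAUL_DELAY_S)


small, large = 2.0, 10.0  # task sizes in MB on either side of the crossover
```

Under these assumptions the edge is cheaper for the 2 MB task because the backhaul delay dominates, and the cloud is cheaper for the 10 MB task because its faster CPU outweighs that delay.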

Moreover, we compare the offloading schemes under resource allocation scenarios in Fig. 10. First, we evaluate the impact of edge resource allocation on offloading performance. As shown in Fig. 10(a), the total offloading cost of the schemes based on edge computing decreases significantly as the computational capacity of the MEC server increases. In contrast, the CO scheme has a stable offloading cost as the MEC capacity increases, since all computation tasks in this scheme are processed in the cloud. In particular, when the edge computational capacity is small, the EO scheme has the highest offloading cost. This is because fewer computing resources at the MEC server lead to a higher execution latency, and thus the EO scheme suffers from a much higher computation cost than the other schemes. However, when the edge computational capacity becomes large, the total system cost of the EO scheme decreases significantly and can become lower than that of the CO scheme (i.e., when the edge capacity is 5 GHz). This result provides insights into how to properly allocate resources for the MEC server to enhance the overall offloading performance of the MECCO system. Again, the proposed ADRLO scheme obtains the lowest cost with a downward trend and thus exhibits the best performance when compared to the DRLO scheme and the other baselines.

Meanwhile, Fig. 10(b) shows the total offloading cost versus the system bandwidth allocation. Based on the simulation results, the costs of all offloading schemes decrease gradually as the total system bandwidth increases. The reason behind this observation is that bandwidth allocation plays a significant role in improving the data transmission rate between MDs and the edge-cloud server. As a result, a larger bandwidth reduces the transmission delay and, accordingly, the total offloading cost. Further, by applying the proposed dynamic offloading policy, our approach using reinforcement learning can dynamically adjust how much bandwidth should be allocated to a certain MD based on the MD's task size so as to achieve the optimal offloading for the MECCO system.

We also analyze the efficiency of the proposed ADRLO scheme by comparing it with two other baseline solutions: the proposed scheme without edge allocation and the proposed scheme without bandwidth allocation. Note that for these baselines, the edge computation and bandwidth resources are allocated equally to all MDs in the MECCO system regardless of the task sizes. As indicated in Fig. 11, the proposed scheme, which takes both edge resources and bandwidth resources into account, achieves the best performance in terms of minimum total offloading cost compared to the other benchmarks. By jointly optimizing both the offloading decision and the resource allocation, our proposed algorithm can achieve effective computation cost savings and significantly improve the offloading performance of the MECCO system.
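Why a task-size-aware allocation can beat an equal split is easy to see in a small worked example. The sketch below is not the paper's learned policy: it assumes a Shannon-type rate with the same SNR for every MD, measures only the summed upload delay, and uses a closed-form rule (bandwidth proportional to the square root of the task size, which minimizes the sum of size/bandwidth terms for a fixed total bandwidth) as a stand-in for what an adaptive policy could learn. All names and numbers are hypothetical.

```python
import math


def total_upload_delay(task_mbit, bandwidth_mhz, snr):
    """Sum of per-MD upload delays, with rate = bandwidth * log2(1 + SNR)."""
    spectral_eff = math.log2(1.0 + snr)  # bits/s/Hz, assumed equal for all MDs
    return sum(d / (b * spectral_eff) for d, b in zip(task_mbit, bandwidth_mhz))


def equal_split(task_mbit, total_mhz):
    """Baseline: the same bandwidth share for every MD, ignoring task sizes."""
    return [total_mhz / len(task_mbit)] * len(task_mbit)


def sqrt_split(task_mbit, total_mhz):
    """Task-size-aware rule: share proportional to sqrt(task size)."""
    weights = [math.sqrt(d) for d in task_mbit]
    return [total_mhz * w / sum(weights) for w in weights]


tasks = [1.0, 4.0, 9.0]   # task sizes in Mbit (illustrative)
total_bw = 6.0            # total system bandwidth in MHz (illustrative)
snr = 3.0                 # linear SNR, so log2(1 + snr) = 2 bits/s/Hz

delay_equal = total_upload_delay(tasks, equal_split(tasks, total_bw), snr)
delay_aware = total_upload_delay(tasks, sqrt_split(tasks, total_bw), snr)
```

Here the equal split gives a summed delay of 3.5 s and the task-size-aware split gives 3.0 s, which is the kind of gap the baselines without allocation leave on the table.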

We also evaluate the smart contract cost for offloading authentication, as illustrated in Fig. 12, under our proposed ADRLO scheme and the baseline DRLO scheme. The results indicate that more MDs lead to a higher contract cost, because more offloading transactions need to be verified and executed by the smart contract, which requires more gas to achieve offloading on the edge cloud. However, the contract cost achieved by the proposed ADRLO scheme is lower than that of the DRLO approach for all numbers of MDs, thanks to the optimized learning policy.

Finally, we investigate the execution latency of our DRL algorithm. For a fair comparison, we use an RL approach and also consider a popular exhaustive search approach, where the MEC server collects all global information over the large state-action space and computes the offloading costs for all MDs in each offloading realization. As shown in Table II, our DRL approach yields lower latency than the RL approach, and it has a much lower running time for each offloading realization than the exhaustive search approach. For example, when the number of MDs is 60, our DRL algorithm generates an


TABLE II: Comparison of average execution latency for each offloading realization (in seconds).

No. of MDs    Our DRL approach    RL approach    Exhaustive Search
N = 20        2.241               6.704          27.034
N = 40        3.590               10.501         40.440
N = 60        5.891               16.093         82.063

TABLE III: Comparison between our proposed scheme and the existing works.

Schemes: [13] [16] [19] [20] [21] [23] Ours

Features                                                     Schemes supporting the feature
Intelligent computation offloading                           X X X X X
Advanced DRL design with double-dueling Q learning           X X
Smart contract design                                        X X X
Offloading design in blockchain                              X X X X
Smart contract-based security and offloading services        X

offloading decision in about 5.891 seconds for each realization, while the RL and exhaustive search approaches take about 2.7 times and 14 times longer, respectively (16.093 s and 82.063 s in Table II). This is because the RL algorithm, with its table-search-based Q-learning method, cannot achieve the optimal offloading policy due to the curse of dimensionality that arises when we jointly optimize multiple edge and bandwidth resources. Moreover, the exhaustive search approach needs much time to find the optimal offloading action by evaluating the large state-action space. Overall, our DRL approach with a double-dueling Q-network takes advantage of the DNN to train the Q-learning process for more efficient offloading optimization. The advantages of our scheme over existing approaches are summarized in Table III.
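The latency ratios implied by Table II can be checked with a few lines of arithmetic. The values below are copied from Table II; the variable names are illustrative.

```python
# Average execution latency per offloading realization (seconds), from Table II:
# number of MDs -> (our DRL approach, RL approach, exhaustive search)
latency = {
    20: (2.241, 6.704, 27.034),
    40: (3.590, 10.501, 40.440),
    60: (5.891, 16.093, 82.063),
}

# Slowdown of each alternative relative to the DRL approach.
ratios = {
    n: (rl / drl, exhaustive / drl)
    for n, (drl, rl, exhaustive) in latency.items()
}

for n, (rl_ratio, ex_ratio) in ratios.items():
    print(f"N = {n}: RL is {rl_ratio:.2f}x slower, exhaustive search is {ex_ratio:.2f}x slower")
```

For N = 60 this gives roughly 2.73 for the RL approach and 13.93 for exhaustive search, and the gap to exhaustive search is larger at N = 60 than at N = 20 or N = 40.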

VI. CONCLUSIONS

In this paper, we have jointly studied access control and computation offloading in blockchain-based IoT networks, a combination that has not been fully considered in the literature. First, to improve the security of computation task offloading, we proposed a new access control mechanism enabled by smart contracts and blockchain to manage the access of MDs, with the objective of preventing malicious offloading access and preserving cloud resources. Then, we proposed a novel DRL-based offloading scheme to obtain the optimal offloading policy for all MDs in the IoT network. We formulated the task offloading decision, edge resource allocation, bandwidth allocation, and smart contract usage as a joint optimization problem, which is solved efficiently by an advanced DQN algorithm with a double-dueling Q-network such that the total offloading cost of computation latency, energy consumption, and smart contract fee is minimized. The implementation results from both real-world experiments and simulations showed that our scheme can provide high security to the MECCO system while achieving significant performance improvement with minimum offloading and smart contract costs, compared to the existing schemes.

REFERENCES

[1] Y. He, C. Liang, F. R. Yu, and Z. Han, “Trust-Based Social Networks with Computing, Caching and Communications: A Deep Reinforcement Learning Approach,” IEEE Transactions on Network Science and Engineering, vol. 7, no. 1, pp. 66–79, Jan. 2020.

[2] L. Qian, Y. Wu, F. Jiang, N. Yu, W. Lu, and B. Lin, “NOMA Assisted Multi-Task Multi-Access Mobile Edge Computing via Deep Reinforcement Learning for Industrial Internet of Things,” IEEE Transactions on Industrial Informatics, vol. 17, no. 8, pp. 5688–5698, Aug. 2021.

[3] M. Huang, W. Liu, T. Wang, A. Liu, and S. Zhang, “A Cloud-MEC Collaborative Task Offloading Scheme With Service Orchestration,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 5792–5805, Jul. 2020.

[4] Y. Miao, G. Wu, M. Li, A. Ghoneim, M. Al-Rakhami, and M. S. Hossain, “Intelligent task prediction and computation offloading based on mobile-edge cloud computing,” Future Generation Computer Systems, vol. 102, pp. 925–931, Jan. 2020.

[5] P. Zhou, K. Shen, N. Kumar, Y. Zhang, M. M. Hassan, and K. Hwang, “Communication-efficient Offloading for Mobile Edge Computing in 5G Heterogeneous Networks,” IEEE Internet of Things Journal, pp. 1–1, 2020.

[6] W. Wu, F. Zhou, R. Q. Hu, and B. Wang, “Energy-Efficient Resource Allocation for Secure NOMA-Enabled Mobile Edge Computing Networks,” IEEE Transactions on Communications, vol. 68, no. 1, pp. 493–505, Jan. 2020.

[7] X. Xu, X. Zhang, H. Gao, Y. Xue, L. Qi, and W. Dou, “BeCome: Blockchain-Enabled Computation Offloading for IoT in Mobile Edge Computing,” IEEE Transactions on Industrial Informatics, vol. 16, no. 6, pp. 4187–4195, Jun. 2020.

[8] S. Seng, C. Luo, X. Li, H. Zhang, and H. Ji, “User Matching on Blockchain for Computation Offloading in Ultra-dense Wireless Networks,” IEEE Transactions on Network Science and Engineering, pp. 1–1, 2020.

[9] L. Xiao, Y. Ding, D. Jiang, J. Huang, D. Wang, J. Li, and H. Vincent Poor, “A Reinforcement Learning and Blockchain-Based Trust Mechanism for Edge Networks,” IEEE Transactions on Communications, vol. 68, no. 9, pp. 5460–5470, Sep. 2020.

[10] T. Liu, J. Wu, L. Chen, Y. Wu, and Y. Li, “Smart Contract-Based Long-Term Auction for Mobile Blockchain Computation Offloading,” IEEE Access, vol. 8, pp. 36 029–36 042, 2020.

[11] J. Pan, J. Wang, A. Hester, I. Alqerm, Y. Liu, and Y. Zhao, “EdgeChain: An Edge-IoT Framework and Prototype Based on Blockchain and Smart Contracts,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 4719–4732, Jun. 2019.

[12] M. Debe, K. Salah, M. H. Ur Rehman, and D. Svetinovic, “Monetization of Services Provided by Public Fog Nodes Using Blockchain and Smart Contracts,” IEEE Access, vol. 8, pp. 20 118–20 128, 2020.

[13] X. Xu, C. He, Z. Xu, L. Qi, S. Wan, and M. Z. A. Bhuiyan, “Joint Optimization of Offloading Utility and Privacy for Edge Computing Enabled IoT,” IEEE Internet of Things Journal, vol. 7, no. 4, pp. 2622–2629, Apr. 2020.

[14] Z. Zhang, Z. Hong, W. Chen, Z. Zheng, and X. Chen, “Joint Computation Offloading and Coin Loaning for Blockchain-Empowered Mobile-Edge Computing,” IEEE Internet of Things Journal, vol. 6, no. 6, pp. 9934–9950, Dec. 2019.

[15] J. Wang, J. Hu, G. Min, W. Zhan, Q. Ni, and N. Georgalas, “Computation Offloading in Multi-Access Edge Computing Using a Deep Sequential Model Based on Reinforcement Learning,” IEEE Communications Magazine, vol. 57, no. 5, pp. 64–69, May 2019.

[16] X. Qiu, L. Liu, W. Chen, Z. Hong, and Z. Zheng, “Online Deep Reinforcement Learning for Computation Offloading in Blockchain-Empowered Mobile Edge Computing,” IEEE Transactions on Vehicular Technology, vol. 68, no. 8, pp. 8050–8062, Aug. 2019.

[17] M. Li, F. R. Yu, P. Si, W. Wu, and Y. Zhang, “Resource Optimization for Delay-Tolerant Data in Blockchain-Enabled IoT With Edge Computing: A Deep Reinforcement Learning Approach,” IEEE Internet of Things Journal, vol. 7, no. 10, pp. 9399–9412, Oct. 2020.

[18] S. Guo, Y. Dai, S. Guo, X. Qiu, and F. Qi, “Blockchain Meets Edge Computing: Stackelberg Game and Double Auction Based Task Offloading for Mobile Blockchain,” IEEE Transactions on Vehicular Technology, vol. 69, no. 5, pp. 5549–5561, May 2020.

[19] F. Guo, F. R. Yu, H. Zhang, H. Ji, M. Liu, and V. C. M. Leung, “Adaptive Resource Allocation in Future Wireless Networks With Blockchain and Mobile Edge Computing,” IEEE Transactions on Wireless Communications, vol. 19, no. 3, pp. 1689–1703, Mar. 2020.

[20] Z. Li, M. Xu, J. Nie, J. Kang, W. Chen, and S. Xie, “NOMA-Enabled Cooperative Computation Offloading for Blockchain-Empowered Internet of Things: A Learning Approach,” IEEE Internet of Things Journal, pp. 1–1, 2020.


[21] K. Gai, Y. Wu, L. Zhu, L. Xu, and Y. Zhang, “Permissioned Blockchain and Edge Computing Empowered Privacy-Preserving Smart Grid Networks,” IEEE Internet of Things Journal, vol. 6, no. 5, pp. 7992–8004, Oct. 2019.

[22] G. Qiao, S. Leng, H. Chai, A. Asadi, and Y. Zhang, “Blockchain Empowered Resource Trading in Mobile Edge Computing and Networks,” in ICC 2019 - 2019 IEEE International Conference on Communications (ICC), Shanghai, China, May 2019, pp. 1–6.

[23] S. Wang, D. Ye, X. Huang, R. Yu, Y. Wang, and Y. Zhang, “Consortium Blockchain for Secure Resource Sharing in Vehicular Edge Computing: A Contract-based Approach,” IEEE Transactions on Network Science and Engineering, pp. 1–1, 2020.

[24] M. A. Rahman, M. M. Rashid, M. S. Hossain, E. Hassanain, M. F. Alhamid, and M. Guizani, “Blockchain and IoT-Based Cognitive Edge Framework for Sharing Economy Services in a Smart City,” IEEE Access, vol. 7, pp. 18 611–18 621, 2019.

[25] J. Li, N. Chen, and Y. Zhang, “Extended File Hierarchy Access Control Scheme with Attribute Based Encryption in Cloud Computing,” IEEE Transactions on Emerging Topics in Computing, pp. 1–1, 2019.

[26] K. Riad, R. Hamza, and H. Yan, “Sensitive and Energetic IoT Access Control for Managing Cloud Electronic Health Records,” IEEE Access, vol. 7, pp. 86 384–86 393, 2019.

[27] K. Fan, H. Xu, L. Gao, H. Li, and Y. Yang, “Efficient and privacy preserving access control scheme for fog-enabled IoT,” Future Generation Computer Systems, vol. 99, pp. 134–142, Oct. 2019.

[28] C. Yang, L. Tan, N. Shi, B. Xu, Y. Cao, and K. Yu, “AuthPrivacyChain: A Blockchain-Based Access Control Framework With Privacy Protection in Cloud,” IEEE Access, vol. 8, pp. 70 604–70 615, 2020.

[29] D. C. Nguyen, P. N. Pathirana, M. Ding, and A. Seneviratne, “Privacy-Preserved Task Offloading in Mobile Blockchain with Deep Reinforcement Learning,” IEEE Transactions on Network and Service Management, pp. 1–1, 2020.

[30] S. Wu, Y. Chen, Q. Wang, M. Li, C. Wang, and X. Luo, “CReam: A Smart Contract Enabled Collusion-Resistant e-Auction,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 7, pp. 1687–1701, Jul. 2019.

[31] Y. Hanada, L. Hsiao, and P. Levis, “Smart Contracts for Machine-to-Machine Communication: Possibilities and Limitations,” in 2018 IEEE International Conference on Internet of Things and Intelligence System (IOTAIS), Nov. 2018, pp. 130–136.
