
Review

Data management issues in mobile ad hoc networks

By Takahiro HARA*1,†

(Communicated by Makoto NAGAO, M.J.A.)

Abstract: Research on mobile ad hoc networks (MANETs) has been a hot research topic since the mid-1990s. Over the first decade, most research focused on networking techniques and ignored data management issues. We, however, recognized the importance of data management in MANETs early, and have been conducting studies in this area for 15 years. In this review, we summarize some key technical issues related to data management in MANETs and the studies we have done to address them, which cover placement of data replicas, update management, and query processing with security management. The techniques proposed in our studies were designed with careful consideration of MANET features, including network partitioning, node participation/disappearance, limited network bandwidth, and energy efficiency. Our studies published in the early 2000s opened up data management in MANETs as a new research field, and our recent studies are expected to serve as significant guidelines for new research directions. We conclude the review by discussing some future directions for research.

Keywords: mobile ad hoc networks, data replication, update management, query processing

1 Introduction

1.1 History of MANET research. Mobile ad hoc networks (MANETs)1),2) have their origin in the Packet Radio Network,3) which was studied in the early 1970s. MANETs have been a hot research topic in the computer science and IT research communities since the 1990s. While there are different definitions of a MANET, the most typical and well-known one is a wireless network that is temporarily constructed solely by mobile nodes with wireless communication capabilities. In a MANET, mobile nodes have a limited range of wireless communication, basically restricted to the reachable area (coverage) of their radio signals. Therefore, two mobile nodes located beyond each other’s communication range communicate via intermediate nodes located between them, which relay their messages. Thus, in a MANET, communication between nodes is achieved in a multi-hop manner.

Compared with other existing network infrastructures, such as the Internet and WiFi, MANETs have a significant advantage in that they can be constructed without centralized network controls (i.e., fixed infrastructure). Thus, MANETs are expected to be useful in situations where a fixed network infrastructure is not available, such as excavation work or military affairs, and also in rescue operations where network infrastructures are compromised. A MANET is also useful for car-to-car communication, for safe driving and other services, because it can provide real-time communication and is more energy- and cost-efficient than infrastructure-based communication.

In a MANET, the movement of mobile nodes often causes nodal disconnection (i.e., two nodes that were within communication range of each other move beyond this range and lose the ability to communicate directly). Thus, a number of techniques have been developed to support communication between arbitrary pairs of nodes in a MANET, where the network topology is dynamically and continuously changing. Since mobile nodes are typically small battery-driven devices such as laptop computers, mobile phones, or sensor nodes, they are limited by resource constraints in terms of battery life and communication channel. Therefore, almost all the studies in the early stage of MANET research (in the 1990s) sought networking techniques for achieving efficient communication between nodes.4)–7) In particular, designing routing protocols that find efficient communication paths between source and destination nodes, and relay message packets along such paths, has been a central focus of research for many years.

*1 Department of Multimedia Engineering, Osaka University, Osaka, Japan.

† Correspondence should be addressed: T. Hara, Department of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan (e-mail: [email protected]).

Proc. Jpn. Acad., Ser. B 93 (2017), Vol. 93, p. 270

doi: 10.2183/pjab.93.018 ©2017 The Japan Academy

Figure 1 shows an example in which the source node S tries to find an efficient communication path to the destination node D (for example, to access a data item held by D), where a line between a pair of nodes indicates that there is a radio communication link between them (i.e., they are within communication range of each other). In this example, the routing protocol basically finds the shortest path to the destination node, indicated by the solid arrow, rather than, for example, the path indicated by the dashed arrow.
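The shortest-path behavior described above can be illustrated with a minimal sketch. It assumes that "shortest" means fewest hops and that the full link table is known at one place, which real MANET routing protocols do not assume; the function name and the topology are hypothetical and are not taken from Fig. 1.

```python
from collections import deque

def shortest_hop_path(links, source, dest):
    """Return one minimum-hop path from source to dest, or None if unreachable."""
    parent = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == dest:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in links.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical topology: S reaches D in two hops via B, or three hops via A and C.
links = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
print(shortest_hop_path(links, "S", "D"))  # ['S', 'B', 'D']
```

Breadth-first search is used here only because it yields a minimum-hop path on an unweighted graph; when no path exists the function returns None, which is exactly the partitioned case that routing alone cannot handle.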

1.2 New research direction: data management in MANETs.

1.2.1 Motivation. Thanks to the networking techniques developed in existing studies, MANETs have become available for practical use in some situations, such as military affairs8),9) and disaster sites.10),11) Thus, these studies have clearly been of value. However, the networking techniques are neither ideal nor sufficient, primarily because they only seek efficient communication between two nodes connected to each other via a one-hop (direct connection) or multi-hop path, and thus can do nothing if the two nodes have no communication path between them (i.e., the network is partitioned).

In most MANET application environments, such as rescue operations, users not only communicate directly, for example through voice communication or video chat, but also often share information (e.g., sensor data for monitoring environmental situations, or information on the progress of collaborative work). This fact led us to a new research direction: MANET data management. While the system cannot control the mobility of users (i.e., it cannot avoid network partitioning), it can control data operations, for example through data replication and replica placement (i.e., it can manage the data). Therefore, even when the network is partitioned, we can maintain high information-sharing performance in the MANET by controlling data operations effectively. More specifically, if we replicate a data item held by a particular node (the data owner) on another node, we can continue to access the data item even when the data owner is not accessible (i.e., the owner is disconnected from the network or network partitioning occurs). This notion prompted us to first address data management issues in MANETs.

Figure 2 shows an example. Here, in the MANET shown in Fig. 1, the seven nodes on the left move in the opposite direction from the seven nodes on the right, and the network is partitioned. Now, let us assume that the source node S in Fig. 2 wishes to communicate with the destination node D, because S seeks to access a data item held by D. In this situation, no routing protocol can solve the network partitioning problem, and thus node S simply cannot access the data item held by D. However, if this data item is replicated (i.e., copied) on one of the seven nodes on the left side before the network is partitioned, S (and the other six nodes) can still access it. Here, we assume that at some time before the network partitioning, a node decides to replicate the data held by D according to some strategy, e.g., a strategy that triggers replication upon finding a critical link whose failure may cause network partitioning, and replicates data items across the critical link that was found.
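One way to read the "critical link" strategy mentioned above is sketched below. It assumes a critical link is a link whose removal alone disconnects the network (a bridge, in graph terms) and that the topology is known centrally; both are simplifying assumptions made for illustration, and the function names and topology are hypothetical rather than taken from the studies reviewed here.

```python
def reachable(links, start, skip=None):
    """Nodes reachable from start, optionally ignoring one undirected link."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nb in links[node]:
            if skip is not None and {node, nb} == set(skip):
                continue  # pretend this link has failed
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen

def critical_links(links):
    """Links whose removal would split the currently connected network."""
    result = []
    for u in sorted(links):
        for v in sorted(links[u]):
            # Check each undirected link once (u < v).
            if u < v and v not in reachable(links, u, skip=(u, v)):
                result.append((u, v))
    return result

def nodes_cut_off(links, holder, link):
    """Nodes that would lose access to holder's data if `link` failed."""
    return set(links) - reachable(links, holder, skip=link)

# Hypothetical topology: two triangles joined only by the link B-C.
links = {
    "S": {"A", "B"}, "A": {"S", "B"}, "B": {"S", "A", "C"},
    "C": {"B", "D", "E"}, "D": {"C", "E"}, "E": {"C", "D"},
}
print(critical_links(links))                        # [('B', 'C')]
print(sorted(nodes_cut_off(links, "D", ("B", "C"))))  # ['A', 'B', 'S']
```

In this toy case, placing a replica of D's data item on any of A, B, or S before the link B-C breaks would keep the item accessible on the left side. Removing each link in turn is quadratic in the worst case and is chosen for clarity, not efficiency.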

Here, it should be noted that although MANET research has a history of more than 40 years, its application domains are still limited to, for example, military affairs and disaster sites. One of the main reasons is that a MANET is basically useful in situations where network infrastructures such as the Internet are not available; such situations are rare today, so the usage of MANETs is naturally limited. However, in some application domains, such as interpersonal communication in town and information sharing among vehicles for safety/autonomous driving, applications have recently been required to place less overhead on the Internet. Thus, MANETs have been attracting much attention again as a key technology for off-loading and edge computing in such application domains.

Fig. 1. Example of routing in a MANET.

Fig. 2. Example of data replication in a MANET. (Figure labels: S, D, "Copy!")

However, some technical issues remain before MANETs can be widely used in such application domains, and we believe that the most serious one is the lack of practical middleware, including data management techniques. In other words, high communication performance alone cannot deliver the information-sharing performance that applications require (e.g., data availability, data access latency/throughput, scalability, and security/dependability). In this review, we aim to shed light on such technical issues from the data management perspective.

1.2.2 Application examples. As mentioned above, information sharing based on data management techniques in MANETs is highly beneficial to a variety of real-world applications. Indeed, existing MANET studies, including recent ones, have assumed a variety of application domains. Typical applications are listed below.

• Rescue operations at a disaster site12)–14)

• Real-time information sharing among vehicles15)

• Interpersonal communication in town16)

• PANs (Personal Area Networks)17) and BANs (Body Area Networks)18)

• Mobile sensor networks19),20)

• Information sharing in conferences and classrooms21),22)

• Military affairs23),24)

• Smart home25)

Among the above application examples, we pick the first three and discuss them in detail.

A. Rescue operations at a disaster site. At a disaster site, information is often the most critical element in efficient and effective rescue operations, such as rescue planning, rescuer assignment, and resource allocation. If information on structural damage, injured persons (injury level and location), the status of rescuer activities, etc., is effectively shared among rescuers and management staff, over a wide range and in real time, the efficiency and effectiveness of the rescue operations that depend on such information will be significantly improved. Unfortunately, however, information sharing in real-world rescue operations is still far from adequate, with much critical information not being fully utilized. Instead, in most cases, rescuers must work without sufficient information (sometimes with no real-time information at all), and rescue leaders can obtain such information only after a delay of a few hours or even half a day.

Therefore, if we could develop mechanisms enabling real-time information sharing based on MANET data management techniques, various kinds of critical information could be shared among rescuers and other staff in a real-time, efficient (e.g., with less energy consumption), reliable, and secure manner, resulting in significant improvements in rescue operations.

B. Real-time information sharing among vehicles. In-vehicle systems, such as car navigation systems, typically involve high-end mobile devices which offer various kinds of services to users. Recently, more advanced ITS applications, enabling safe-driving support and autonomous cars, have attracted much attention. In such applications, real-time information sharing among vehicles is essential. For example, for both safe-driving support and autonomous cars, driving information on nearby vehicles and environmental data on traffic jams, accidents, road conditions, etc., must be shared in a real-time, efficient, and accurate manner.

In traditional ITS systems, such information is shared by means of infrastructure such as the Internet and fixed devices on roads (e.g., the Japanese Vehicle Information and Communication System (VICS)). However, since such information includes various types of sensor data, and its data volume is typically large, infrastructure-based approaches suffer from delays and unnecessary communication traffic, which may not satisfy the system’s real-time requirements. In addition to this problem, the data traffic generated by such ITS systems, as well as by other mobile and IoT (Internet of Things)/M2M (Machine-to-Machine) applications, makes massive demands on typically limited network capacity.

MANET-based information sharing is expected to solve these problems,15) since wireless communication among nearby nodes involves very short delays and does not inject any data traffic into an infrastructure network (e.g., the Internet).

C. Interpersonal communication in town. Similarly to the ITS applications described above, mobile applications on smartphones typically generate large data traffic. Among these, location-based services (LBSs), such as nearby restaurant recommendations and shop advertisements, as well as social networking services (SNSs), have become very popular; and recently, location-based social network services (LBSNs), which integrate these two popular services, have become widely available. MANET-based communication is highly effective for offloading data traffic from the Internet in such LBSNs; thus, like the corresponding ITS applications, MANET-based applications can achieve short delays and efficient traffic offloading.

1.2.3 Technical issues. It should be noted that data replication in MANETs is not, in itself, an ideal solution. In fact, it presents two new and serious challenges that must be addressed (replica placement and update management), and there are two additional challenges (query processing and security management).

A. Replica placement. First, due to several factors, such as limited storage space and network bandwidth, the number of replicas (copies) that can be allocated to each node is basically limited. Therefore, effective replica allocation is crucial to system performance, for example in terms of data availability (data access request success rate), network traffic, and response time.

B. Update management. Second, since data items are generally updated by the data owners or other nodes, we must effectively manage different versions of data copies in order to ensure consistency of data access; that is, every data access (read operation) must read a valid version of the target item.

While data access consistency has been actively studied in the database and distributed system communities, MANETs have significant properties that make it difficult to achieve, including dynamic change in network topology, difficulty in recognizing the global view, and limited resources such as network bandwidth, storage, and computational power. Therefore, we need to develop new mechanisms for preserving data access consistency in MANETs.

C. Query processing. In addition to the problems posed by data replication, finding (locating) data items of interest (i.e., query processing) is also a significant issue, because it directly affects the performance of the system or application. There are several different types of queries. The simplest involves accessing data items by specifying their data identifiers. In this type of query, it is important to efficiently transmit the query message to the owner node of the target data item (i.e., location management of data items).26) There are also more complex types of query, which specify certain conditions that define the data items sought by the query-issuing node. k-nearest neighbor (kNN) searches27) and top-k searches28) are typical examples of such complex queries.

kNN and top-k searches have been thoroughly studied in the database communities, where the main focus is how to reduce the computation overhead of finding the search results in a massive volume of data. These two types of queries are also useful in multi-hop wireless networks such as wireless sensor networks (WSNs) and MANETs, but there the main focus is different: how to reduce communication overhead rather than computation overhead. This is because in WSNs and MANETs, the volume of data processed in the network is generally not as large as that in infrastructure-based networks (e.g., data centers). Finding the result from the data is therefore not computationally costly, whereas data traffic is the key bottleneck because of limited resources such as network bandwidth and battery.
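The emphasis on communication overhead can be made concrete with a minimal sketch of top-k processing over a query tree, in which each node merges its own items with its children's replies and forwards only the k best. The tree, the scores, and the function name are hypothetical, and the sketch assumes a static tree and a single numeric score per item; it is not a specific method from the studies cited here.

```python
import heapq

def top_k_in_network(node, children, local_scores, k, sent):
    """Return the k highest (score, item) pairs in node's subtree.

    sent[node] records how many pairs the node passes upward
    (for the root, this is the size of the final answer).
    """
    candidates = list(local_scores.get(node, []))
    for child in children.get(node, []):
        candidates.extend(top_k_in_network(child, children, local_scores, k, sent))
    best = heapq.nlargest(k, candidates)  # keep only the k best pairs
    sent[node] = len(best)
    return best

# Hypothetical query tree rooted at the query-issuing node Q.
children = {"Q": ["A", "B"], "A": ["C"]}
local_scores = {
    "A": [(70, "a1"), (20, "a2")],
    "B": [(90, "b1"), (10, "b2"), (5, "b3")],
    "C": [(80, "c1"), (60, "c2"), (30, "c3")],
}
sent = {}
result = top_k_in_network("Q", children, local_scores, 2, sent)
print(result)  # [(90, 'b1'), (80, 'c1')]
print(sent)    # {'C': 2, 'A': 2, 'B': 2, 'Q': 2}
```

In this toy run, nodes B and C each hold three items but forward only two, and A forwards two pairs rather than the four it has seen, so every link carries at most k pairs regardless of how much data sits below it.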

kNN and top-k searches in WSNs have recently been well studied.29)–32) These existing studies proposed infrastructure-free query processing methods, which are efficient in retrieving search results with low message overhead (traffic). In these methods, nodes must exchange messages that include node information (e.g., one-hop beacon messages containing the location of the sender) in order to recognize their neighbors for query processing.

On the other hand, query processing in MANETs has not yet been thoroughly investigated. This is because MANETs possess notable characteristics, such as limitations on network bandwidth and dynamic topology change due to the movement of mobile nodes. More specifically, due to the high mobility of nodes in MANETs, it is impractical to exchange beacon messages frequently enough to accurately track the changing locations of neighboring nodes. Therefore, existing techniques proposed for WSNs cannot be directly applied to MANETs. Addressing query processing in MANETs is a significant issue from both an academic and a social perspective.

D. Security management. In complex types of queries, such as top-k searches, in-network processing techniques (e.g., data aggregation) are used for efficient query processing. In such situations, security is a significant issue if queries are to be processed accurately. Existing security techniques for MANETs are applicable to this end.33) In addition, however, new types of security issues arise for query processing in MANETs.

For example, in top-k query processing, malicious nodes may circumvent the query processing protocol and replace top-ranked data items with less valuable items, which we call a data replacement attack.34) This kind of attack is difficult to detect because malicious nodes can mount it even while strictly following the underlying network protocol (i.e., we cannot detect it with existing security techniques). It can also significantly damage the system, because it can selectively remove top-ranked data items that should be included in the final top-k result.

Data replacement attacks can take place only in in-network query processing in multi-hop networks such as MANETs, WSNs, and peer-to-peer (P2P) networks, where relaying nodes aggregate the intermediate query results for efficient query processing. Among MANETs, WSNs, and P2P networks, data replacement attacks are very serious only in MANETs, because in the other two types of networks the network topology does not change very frequently, and thus malicious behaviors of adversary nodes are easily detected by their neighbors. In a MANET, on the other hand, since the network topology (i.e., the set of neighbors) changes frequently, new approaches are needed to handle data replacement attacks with low overhead and latency under limited resources such as network bandwidth.
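A toy model can show why a data replacement attack is hard to spot from the message format alone. The sketch below assumes a simple tree-based top-k aggregation and a malicious relay that discards the replies it received and forwards only its own lower-ranked items; the names, topology, and scores are hypothetical, and no detection method from Ref. 34 is reproduced here.

```python
import heapq

def aggregate(node, children, local_scores, k, malicious=frozenset()):
    """Top-k aggregation along a query tree, with optional misbehaving relays."""
    candidates = list(local_scores.get(node, []))
    for child in children.get(node, []):
        candidates.extend(aggregate(child, children, local_scores, k, malicious))
    if node in malicious:
        # Attack: drop everything received from children and reply with
        # the node's own items only. The reply still looks well formed.
        candidates = list(local_scores.get(node, []))
    return heapq.nlargest(k, candidates)

# Hypothetical chain Q <- A <- C, where the best items are held by C.
children = {"Q": ["A"], "A": ["C"]}
local_scores = {
    "Q": [(50, "q1")],
    "A": [(40, "a1"), (30, "a2")],
    "C": [(95, "c1"), (90, "c2")],
}
honest = aggregate("Q", children, local_scores, 2)
attacked = aggregate("Q", children, local_scores, 2, malicious={"A"})
print(honest)    # [(95, 'c1'), (90, 'c2')]
print(attacked)  # [(50, 'q1'), (40, 'a1')]
```

In both runs A sends exactly two pairs to Q, so the reply size gives Q no hint that the two top-ranked items held by C were removed.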

E. Summary. Table 1 summarizes the MANET data management issues described above, and our pioneering studies with regard to each of these issues. In the following sections, we discuss some typical research achievements in this regard. The techniques proposed in our studies were designed with careful consideration of MANET features such as network partitioning, node participation/disappearance, limited network bandwidth, and energy efficiency. Our studies published in the early 2000s opened up data management in MANETs as a new research field, and our recent studies are expected to serve as significant guidelines for new research directions.

1.3 Contributions of this review paper.

1.3.1 Contributions. The survey papers in Refs. 53, 54 presented and categorized early studies on data replication in MANETs well. These survey papers recognized data replication in MANETs as a new research topic, listed fundamental technical issues, and presented a number of typical studies on the topic. Specifically, they categorized the existing studies based on various fundamental technical issues, including:

1. Decentralized (or centralized)
2. Dealing with node disappearance and network partitioning (or not)
3. Considering stability of wireless links (or not)
4. Considering data update (or not, i.e., read-only)
5. Addressing energy efficiency (or not)
6. Considering geographical locations of nodes and/or data items (or not)
7. Addressing data retrieval efficiency, i.e., data access latency (or not)

Starting from the paper in Ref. 35, we published a number of papers addressing the above technical issues at the earliest stage. These papers were discussed at length in the above survey papers53),54) as representative pioneering works. In this review, our works presented in section 2 and section 3.1 cover most of the above technical issues. In addition, the studies by others presented in Refs. 53, 54 and most of the very recent studies (published after those survey papers)55)–62) also focused on those technical issues. For example, recent data/replica allocation schemes in MANETs aim to improve data availability and data access latency based on ideas similar to those of the earlier studies, but take into account some new criteria such as node selfishness,56) information density,57) and social relationships.58),61),62)

On the other hand, our works presented in and after section 3.2 are based on papers published in and after 2009, which were not covered by the survey papers. These works were the first to tackle new technical issues that had not been addressed by anyone before us. Therefore, we believe that we have significantly contributed to the advancement of this research area (i.e., data management in MANETs). The main objective of this review is to publicize the new technical issues in MANETs to the research communities, and to encourage researchers and engineers to address them, contributing to technical advancements and application development in MANETs.

Table 1. Data management issues in MANETs

Issue | Description | Our work
(1) Replica placement | How to effectively place replicas of data objects | 35–39
(2) Update management | How to manage versions of replicas and how to keep consistency of data operations | 40–42
(3) Query processing | How to efficiently retrieve data items of interest | 43–47
(4) Security management | How to achieve secure query processing | 34, 48, 49
(1) + (2) | - | 37, 50, 51
(1) + (3) | - | 45, 47, 52

1.3.2 Scope. It should be noted that the research on data management and replication in MANETs presented in this review is closely related to research on iMANETs (a special type of MANET in which some nodes have a connection to a stable network infrastructure, i.e., the Internet)63),64) and DTNs (Delay Tolerant Networks),65) because all three research areas assume wireless multi-hop networks and share some common technical issues, such as resource constraints and dynamic change in network topology, including network partitioning. Since all of them basically assume a MANET (or iMANET), all of them can be regarded as different types of research on data management in MANETs.

However, the iMANET and DTN research areas have different focuses from MANET data management research. For example, iMANET research mainly focuses on balancing response time (i.e., query delay) and communication cost, because nodes can download data of interest not only from the Internet but also from other nodes through wireless multi-hop communication.

DTN research mainly focuses on efficient data distribution (i.e., push-based data access), where there is a trade-off between speed and communication cost, while data replication in MANETs mainly focuses on pull-based data access. The main advantage of the DTN-based approach is that it can achieve more efficient data distribution (e.g., dissemination of emergency information), and thus it is useful in some disaster situations. On the other hand, the MANET-based approach is more effective for general purposes (not only data dissemination), where various data operations such as real-time data retrieval and advanced query processing (e.g., top-k and kNN queries) are performed. Therefore, some recent works have focused on integrating the MANET and DTN approaches.66)

In this review, in order to make our scope clear, we focus only on data management issues for pull-based data access in pure MANETs, where we do not assume that any nodes are connected to the Internet.

2 Replica placement

The first issue is how to effectively and efficiently allocate replicas to nodes in MANETs, which was the first MANET data management challenge addressed by our research group.35) In this section, we summarize the three approaches to MANET replica placement that we proposed in Ref. 35.

2.1 Three approaches proposed in Ref. 35.

2.1.1 Preliminary. We assume a general MANET model where nodes move freely and a pair of nodes can directly communicate with each other through a wireless radio link when they are within communication range of each other. Thus, the network topology changes dynamically. In the network, if two nodes have a multi-hop link between them, they can also communicate with each other through the intermediate nodes between them, which relay communication packets.

In this subsection, for simplicity, we assume an environment where the original of each data item is held by a particular node and is not updated. Each mobile node has limited storage (or memory) space in which to replicate original data items held by others.

When a mobile node issues an access request for a given data item, the request is successful if either (i) the mobile node holds the original/replica of the data item, or (ii) at least one mobile node that is connected to the requesting node through a one-hop or multi-hop link holds the original/replica. Thus, the requesting node first checks whether it holds the original/replica of the target data item; if it does, the request succeeds on the spot. If it does not, the node sends a request message for the target data item to connected mobile nodes. If it receives a reply from some other node(s) holding the original/replica of the target data item, the request is also successful. Otherwise, the request fails. Note that in this paper, mobile nodes connected to each other by one-hop or multi-hop wireless links are simply called connected mobile nodes.
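The success conditions (i) and (ii) above amount to a reachability test, which can be sketched as follows. The data structures and names (links, holders, request_succeeds) are assumptions made for illustration, and the sketch checks connectivity directly instead of exchanging request and reply messages.

```python
def request_succeeds(links, holders, requester, item):
    """True if requester can obtain item under conditions (i) or (ii).

    links:   node -> set of one-hop neighbors
    holders: item -> set of nodes holding the original or a replica
    """
    targets = holders.get(item, set())
    if requester in targets:
        return True  # condition (i): held locally
    seen = {requester}
    stack = [requester]
    while stack:
        node = stack.pop()
        for nb in links.get(node, ()):
            if nb not in seen:
                if nb in targets:
                    return True  # condition (ii): a connected node holds it
                seen.add(nb)
                stack.append(nb)
    return False  # no holder among the connected mobile nodes

# Hypothetical partitioned network: {M1, M2} and {M3, M4}.
links = {"M1": {"M2"}, "M2": {"M1"}, "M3": {"M4"}, "M4": {"M3"}}
holders = {"D1": {"M4"}, "D2": {"M4", "M2"}}
print(request_succeeds(links, holders, "M1", "D1"))  # False
print(request_succeeds(links, holders, "M1", "D2"))  # True
```

Here D1 is unreachable from M1 because its only holder is in the other partition, while D2 succeeds thanks to the replica on M2.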

In this subsection, we assume the following system model and notations.

• The set of all mobile nodes in the system is denoted by $M = \{M_1, M_2, \ldots, M_m\}$, where $m$ is the total number of mobile nodes, and $M_j$ ($1 \le j \le m$) is a node identifier. Each mobile node moves freely.

• Each pair of nodes can directly communicate with each other when they are within communication range of each other. In addition, if two nodes have a multi-hop link between them, they can also communicate with each other. Due to node mobility, the network topology changes dynamically, and network partitioning may occur.

• Each mobile node, $M_i$, has a storage space of $C_i$ data items for replica allocation, in addition to the space for the original data items held by the node.

• The frequency of access to each data item by mobile nodes is known, and does not change. This assumption is not always realistic, but in some situations (applications), access frequencies change slowly enough that they can be assumed not to change over a short period. In fact, this assumption has often been made in existing studies in the database and distributed system communities.56),67) In a real environment, the access frequency can usually be obtained by maintaining logs of access requests at each node. Many existing studies adopted such log-based access frequency estimation68),69) and showed that it works well. In real systems (e.g., the GlassFish application server), this technique is also often used for load estimation and other purposes.

• Data is handled in the form of data items, which are collections of data. The set of all data items is denoted by $D = \{D_1, D_2, \ldots, D_n\}$, where $n$ is the total number of data items and $D_j$ ($1 \le j \le n$) is a data identifier. All data items are of the same size, and each original data item is held by a particular mobile node.

• The data items are not updated. There are many real situations where data items are not updated, or are updated so infrequently that we can assume that they are not updated, e.g., user-generated documents such as SNS posts and web contents, and results of assigned tasks in collaborative work. Many existing studies56),62) made this assumption. Of course, there are also many applications where data items are updated. We have also addressed the issue of data management in MANETs in such situations (see section 2.3, section 3, and Refs. 37, 40–42, 50, 51).

2.1.2 Three replica placement methods. In terms of data access request success rate, the replica placement problem is a form of resource allocation problem in a distributed networked system, which is well known to be NP-hard. In our study, the replica placement problem is considered as the problem of determining which data items are to be replicated at each mobile node (each of which has limited storage space), in order to maximize the data access request success rate.

In order to determine the optimal assignments among all possible combinations of replica allocation, we must analytically determine the combination which produces the highest data access request success rate. The computational complexity here is very high, and this calculation must be performed every time the network topology changes due to mobile node migration, which is impractical in real-world situations. For these reasons, we took the following heuristic approach in Ref. 35:

• Replicas are periodically relocated with a specific interval (relocation period).
• During each relocation period, replica allocation is determined based on the access frequency to each data item by each mobile node, and (optionally) the network topology at the time.

Based on this approach, we proposed three replica placement methods, which differ in the emphasis put on access frequency and network topology:35)

1. SAF (Static Access Frequency) method: Only the access frequency to each data item by each node is taken into account.
2. DAFN (Dynamic Access Frequency and Neighborhood) method: The access frequency to each data item and the neighborhood of mobile nodes are taken into account.
3. DCG (Dynamic Connectivity based Grouping) method: The access frequency to each data item and the entire network topology are taken into account.

A. Static Access Frequency (SAF) method. In the SAF method, each mobile node $M_i$ allocates replicas of $C_i$ data items, in descending order of its own access frequency to the data items. That is, each node allocates replicas in a selfish manner, and does not take into account which data items are replicated by other nodes.

Now, let us suppose that six mobile nodes (M1, ..., M6) are present and $M_i$ ($i = 1, \ldots, 6$) holds $D_i$ as an original copy. The access frequency to each data item by each mobile node is shown in Table 2. Figure 3 shows the result of executing the SAF method. In this figure, a straight line denotes a wireless link, a gray rectangle denotes an original data item, and a white rectangle denotes an allocated replica.


In the SAF method, mobile nodes do not need to exchange information with each other for replica placement. Moreover, replica relocation basically does not occur after replica placement is completed. As a result, this method allocates replicas with low overhead and low traffic. On the other hand, since each mobile node allocates replicas based solely on its own access frequencies to data items, mobile nodes with the same access characteristics allocate the same replicas; and the resultant volume of replica duplication leads to a low data access request success rate, especially when many mobile nodes have similar access characteristics.
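As a minimal sketch of the SAF rule, the following code ranks data items per node using the access frequencies of Table 2. It assumes a storage space of two replicas per node and that a node does not replicate the item it already holds as an original; both assumptions are inferred from the example rather than stated in the text, and the tie-breaking rule (lower data identifier first) is likewise an assumption. The names `ACCESS_FREQ` and `saf_allocate` are illustrative.

```python
# Access frequencies from Table 2: ACCESS_FREQ[item][k] is the value for node M(k+1).
ACCESS_FREQ = {
    "D1":  [0.50, 0.25, 0.30, 0.35, 0.25, 0.20],
    "D2":  [0.45, 0.50, 0.40, 0.40, 0.40, 0.45],
    "D3":  [0.35, 0.45, 0.50, 0.25, 0.45, 0.35],
    "D4":  [0.30, 0.15, 0.10, 0.60, 0.10, 0.25],
    "D5":  [0.50, 0.20, 0.15, 0.25, 0.70, 0.20],
    "D6":  [0.05, 0.35, 0.45, 0.20, 0.25, 0.60],
    "D7":  [0.40, 0.25, 0.30, 0.30, 0.35, 0.20],
    "D8":  [0.20, 0.30, 0.20, 0.25, 0.30, 0.20],
    "D9":  [0.20, 0.20, 0.20, 0.25, 0.25, 0.20],
    "D10": [0.15, 0.10, 0.05, 0.15, 0.15, 0.10],
}
NODES = ["M1", "M2", "M3", "M4", "M5", "M6"]
ORIGINALS = {"M1": "D1", "M2": "D2", "M3": "D3", "M4": "D4", "M5": "D5", "M6": "D6"}

def saf_allocate(node, capacity):
    """Pick `capacity` replicas for `node` in descending order of its own access frequency."""
    col = NODES.index(node)
    # Assumption: the node's own original is not replicated again.
    candidates = [d for d in ACCESS_FREQ if d != ORIGINALS[node]]
    # sorted() is stable, so ties keep the D1..D10 order (tie-breaking is an assumption).
    ranked = sorted(candidates, key=lambda d: -ACCESS_FREQ[d][col])
    return ranked[:capacity]

allocation = {node: saf_allocate(node, capacity=2) for node in NODES}
for node, replicas in allocation.items():
    print(node, replicas)
```

Under these assumptions every replica chosen is one of D1 to D6, and D2 alone is replicated by five of the six nodes, which illustrates the duplication problem described above.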

B. Dynamic Access Frequency and Neighborhood (DAFN) method. To solve this problem with the SAF method, the DAFN method eliminates the replica duplication among neighboring mobile nodes. Here, each node still basically pursues its own ends, but at the same time at least partially collaborates with others toward the global end.

Initially, this method determines replica allocation in the same manner as the SAF method. However, if there is replica duplication of a data item by two neighboring mobile nodes, the node with the lower access frequency to the data item changes the replica to another replica (a replica of the data item with the next highest access frequency). Since the neighboring status changes as mobile nodes move, the DAFN method is executed during each relocation period. The order of pairs for eliminating replica duplication is determined based on the order of visiting nodes, using a breadth-first search in the MANET, which begins with the mobile node with the lowest node identifier suffix, and ends when all the connected mobile nodes have been traversed.

Figure 4 shows an example of executing the DAFN method in the environment described by Table 2 and Fig. 3. In Fig. 4, a dark gray rectangle denotes a replica allocated to eliminate replica duplication. While six types of replica were allocated network-wide in the SAF method, seven are allocated in the DAFN method; and by eliminating replicas duplicated by neighboring nodes, the data access request success rate is expected to be higher than in the SAF method.

However, the DAFN method does not completely eliminate replica duplication among neighboring nodes, because it only executes the elimination process by scanning the network once, based on the breadth-first search. Thus, in Fig. 4, we can see duplication of replica D7 by M4 and M5, and of D4 by M4 and M6. Moreover, both the overhead and traffic are higher than in the SAF method, because mobile nodes exchange information and relocate replicas during each relocation period.
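The single-pass elimination described above can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure of Ref. 35: the three-node chain, the access frequencies, and the function names are hypothetical; a replica that duplicates a neighbor's original is also replaced; and the replacement is assumed to be the item with the highest access frequency among those held by neither node of the pair.

```python
from collections import deque

def dafn_allocate(freq, originals, links, capacity):
    """freq[node][item] is an access frequency; originals[node] is the item held as original."""
    # Step 1: SAF-style initial allocation (own original excluded).
    replicas = {}
    for node in freq:
        ranked = sorted((d for d in freq[node] if d != originals[node]),
                        key=lambda d: -freq[node][d])
        replicas[node] = ranked[:capacity]

    # Step 2: visit nodes breadth-first from the lowest identifier and
    # process each neighboring pair exactly once.
    start = min(freq)
    visited, queue, done_pairs = {start}, deque([start]), set()
    while queue:
        a = queue.popleft()
        for b in sorted(links.get(a, ())):
            pair = frozenset((a, b))
            if pair not in done_pairs:
                done_pairs.add(pair)
                dedupe_pair(a, b, freq, originals, replicas)
            if b not in visited:
                visited.add(b)
                queue.append(b)
    return replicas

def dedupe_pair(a, b, freq, originals, replicas):
    def held(n):
        return {originals[n], *replicas[n]}

    for item in sorted(held(a) & held(b)):
        # The holder of the original keeps it; otherwise the lower frequency yields.
        if originals[a] == item:
            loser = b
        elif originals[b] == item:
            loser = a
        else:
            loser = a if freq[a][item] < freq[b][item] else b
        free = [d for d in freq[loser] if d not in held(a) | held(b)]
        if free:  # if nothing else is available, the duplication stays
            best = max(free, key=lambda d: freq[loser][d])
            replicas[loser][replicas[loser].index(item)] = best

# Hypothetical chain topology 1 - 2 - 3 with six data items.
freq = {
    1: {"D1": 0.50, "D2": 0.40, "D3": 0.30, "D4": 0.20, "D5": 0.10, "D6": 0.05},
    2: {"D1": 0.30, "D2": 0.50, "D3": 0.40, "D4": 0.10, "D5": 0.20, "D6": 0.15},
    3: {"D1": 0.20, "D2": 0.30, "D3": 0.50, "D4": 0.40, "D5": 0.10, "D6": 0.25},
}
originals = {1: "D1", 2: "D2", 3: "D3"}
links = {1: [2], 2: [1, 3], 3: [2]}
result = dafn_allocate(freq, originals, links, capacity=2)
print(result)
```

In this hypothetical run, node 2 ends up holding a replica of D1 although its neighbor, node 1, holds the original. The duplication reappears while the later pair (2, 3) is processed, which mirrors the limitation of scanning the network only once.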

C. Dynamic Connectivity based Grouping (DCG) method. The DCG method shares replicas among larger groups of mobile nodes than the DAFN method, which shares replicas only among neighboring nodes. In this method, then, nodes behave with the greatest attention to the global end.

In order to share replicas effectively, each group should be stable (i.e., the group is not easily partitioned due to changes in network topology). To this end, the DCG method creates groups of mobile nodes that are biconnected components70) in the given network. A biconnected component is a maximal subgraph which remains connected (not partitioned) if an arbitrary node in it is deleted. When the mobile nodes are grouped as biconnected components, the groups are not partitioned, even if a mobile node disappears from the network or a link is disconnected within a given group, and thus the groups are considered to have high stability. Here, though a node may belong to multiple biconnected components, it belongs only to the group corresponding to the biconnected component which was determined earlier.

Table 2. Access frequencies to data items

| Data | M1 | M2 | M3 | M4 | M5 | M6 |
|---|---|---|---|---|---|---|
| D1 | 0.50 | 0.25 | 0.30 | 0.35 | 0.25 | 0.20 |
| D2 | 0.45 | 0.50 | 0.40 | 0.40 | 0.40 | 0.45 |
| D3 | 0.35 | 0.45 | 0.50 | 0.25 | 0.45 | 0.35 |
| D4 | 0.30 | 0.15 | 0.10 | 0.60 | 0.10 | 0.25 |
| D5 | 0.50 | 0.20 | 0.15 | 0.25 | 0.70 | 0.20 |
| D6 | 0.05 | 0.35 | 0.45 | 0.20 | 0.25 | 0.60 |
| D7 | 0.40 | 0.25 | 0.30 | 0.30 | 0.35 | 0.20 |
| D8 | 0.20 | 0.30 | 0.20 | 0.25 | 0.30 | 0.20 |
| D9 | 0.20 | 0.20 | 0.20 | 0.25 | 0.25 | 0.20 |
| D10 | 0.15 | 0.10 | 0.05 | 0.15 | 0.15 | 0.10 |

Fig. 3. An example of executing the SAF method.

Fig. 4. An example of executing the DAFN method.

After the mobile nodes are grouped, the access frequency of each group to each data item is calculated as the sum of the access frequencies of all the mobile nodes in the group to this item. Then, in descending order of the group's access frequencies, replicas of data items are allocated until the storage space of all the mobile nodes in the group is filled. Here, each replica is allocated to the mobile node with the highest access frequency to the data item.

The above procedures (i.e., grouping and replica allocation) are performed in each relocation period. Figure 5 shows an example of executing the DCG method in the environment described by Table 2 and Fig. 3. In this example, two groups, respectively consisting of M1, M2, M3, M4 (G1) and M5, M6 (G2), are created. Table 3 shows the access frequencies of the two groups, which are calculated from Table 2. In Fig. 5, a dark gray rectangle denotes a replica allocated in the second cycle. By executing the DCG method, a total of 10 replica types are allocated in the network.

Since many types of replica can be shared, the data access request success rate is expected to be higher than in the DAFN and SAF methods. However, both the overhead and traffic are higher here than in the other two methods, because the mobile nodes exchange information and relocate replicas over a wide range during each relocation period.
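The group-level allocation can be sketched as below, taking the two groups G1 and G2 as given (the computation of biconnected components is omitted). Several details are assumptions made for illustration: two replicas per node, a first cycle that skips data items whose original is already held inside the group, a second cycle that fills any remaining space, and placement on the node with the highest access frequency among those that still have free space. The names `group_frequency` and `dcg_allocate` are hypothetical.

```python
# Access frequencies from Table 2: ACCESS_FREQ[item][k] is the value for node M(k+1).
ACCESS_FREQ = {
    "D1":  [0.50, 0.25, 0.30, 0.35, 0.25, 0.20],
    "D2":  [0.45, 0.50, 0.40, 0.40, 0.40, 0.45],
    "D3":  [0.35, 0.45, 0.50, 0.25, 0.45, 0.35],
    "D4":  [0.30, 0.15, 0.10, 0.60, 0.10, 0.25],
    "D5":  [0.50, 0.20, 0.15, 0.25, 0.70, 0.20],
    "D6":  [0.05, 0.35, 0.45, 0.20, 0.25, 0.60],
    "D7":  [0.40, 0.25, 0.30, 0.30, 0.35, 0.20],
    "D8":  [0.20, 0.30, 0.20, 0.25, 0.30, 0.20],
    "D9":  [0.20, 0.20, 0.20, 0.25, 0.25, 0.20],
    "D10": [0.15, 0.10, 0.05, 0.15, 0.15, 0.10],
}
NODES = ["M1", "M2", "M3", "M4", "M5", "M6"]
ORIGINALS = {"M1": "D1", "M2": "D2", "M3": "D3", "M4": "D4", "M5": "D5", "M6": "D6"}

def group_frequency(group):
    """Sum the access frequencies of all nodes in the group, per data item."""
    return {d: sum(f[NODES.index(n)] for n in group) for d, f in ACCESS_FREQ.items()}

def dcg_allocate(group, capacity):
    freq = group_frequency(group)
    ranked = sorted(freq, key=lambda d: -freq[d])
    replicas = {n: [] for n in group}
    in_group = {ORIGINALS[n] for n in group}

    def holds(n, item):
        return item == ORIGINALS[n] or item in replicas[n]

    # Cycle 1 (assumption): skip items whose original is already in the group.
    # Cycle 2 (assumption): fill any space that is still free, in the same order.
    for cycle in ([d for d in ranked if d not in in_group], ranked):
        for item in cycle:
            free = [n for n in group
                    if len(replicas[n]) < capacity and not holds(n, item)]
            if free:
                owner = max(free, key=lambda n: ACCESS_FREQ[item][NODES.index(n)])
                replicas[owner].append(item)
    return replicas

g1 = dcg_allocate(["M1", "M2", "M3", "M4"], capacity=2)
g2 = dcg_allocate(["M5", "M6"], capacity=2)
print(g1)
print(g2)
```

Under these assumptions, G1 shares replicas of D5 to D10 in the first cycle and adds D2 and D3 in the second, so that, together with the six originals, all 10 data items are present in the network.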

2.1.3 Performance study. To show the differences in performance among the three methods, SAF, DAFN, and DCG, we briefly present the result of a performance study,35) which was done through a simulation. In the simulation, 40 mobile nodes exist in a flat area of size 50 × 50 (corresponding to about 500 m × 500 m), each of which holds an original data item and has storage to replicate 10 data items. These nodes are initially placed at random positions and move according to the random walk model (which randomly determines the movement speed and direction at every time step). Due to space limitations, we omit the details of the simulation setting (see Ref. 35 for the details).

Figure 6 (Ref. 35) shows the performances of the three methods when varying the radio communication range of each node, where 1 corresponds to about 10 m. The performance metrics are data accessibility (Fig. 6(a)) and traffic (Fig. 6(b)). The data accessibility is defined as the ratio of the number of successful data requests to the total number of data access requests issued during the simulation time. The traffic is defined as the total hop-count of data transmission for allocating/relocating replicas.

Figure 6(a) shows that as the radio communication range gets longer, the data accessibility increases in every method. When the communication range is very long, every method also gives almost the same data accessibility. This is because most mobile nodes are connected to each other, and thus mobile nodes can access original data items in most cases. The DCG method gives the highest data accessibility and the DAFN method gives the next highest.

From Fig. 6(b), as the radio communication range gets longer, the traffic caused by the DAFN method and the DCG method also gets larger at first, but it gets smaller from a certain point. In most situations, the DCG method produces the highest traffic. When the radio communication range is very small, the traffic produced by these two methods is small. This is because the number of mobile nodes connected to each other is small, and thus replica relocation does not produce large traffic. When the radio communication range is very long, the DCG method produces smaller traffic than the DAFN method. This is because in the DCG method, the number of mobile nodes in a group is very large (40 in most cases) and thus replica relocation rarely occurs.

Fig. 5. An example of executing the DCG method.

Table 3. Access frequencies of groups

| Data | M1 | M2 | M3 | M4 | M5 | M6 | G1 | G2 |
|---|---|---|---|---|---|---|---|---|
| D1 | 0.50 | 0.25 | 0.30 | 0.35 | 0.25 | 0.20 | 1.40 | 0.45 |
| D2 | 0.45 | 0.50 | 0.40 | 0.40 | 0.40 | 0.45 | 1.75 | 0.85 |
| D3 | 0.35 | 0.45 | 0.50 | 0.25 | 0.45 | 0.35 | 1.55 | 0.80 |
| D4 | 0.30 | 0.15 | 0.10 | 0.60 | 0.10 | 0.25 | 1.15 | 0.35 |
| D5 | 0.50 | 0.20 | 0.15 | 0.25 | 0.70 | 0.20 | 1.10 | 0.90 |
| D6 | 0.05 | 0.35 | 0.45 | 0.20 | 0.25 | 0.60 | 1.05 | 0.80 |
| D7 | 0.40 | 0.25 | 0.30 | 0.30 | 0.35 | 0.20 | 1.25 | 0.55 |
| D8 | 0.20 | 0.30 | 0.20 | 0.25 | 0.30 | 0.20 | 0.95 | 0.50 |
| D9 | 0.20 | 0.20 | 0.20 | 0.25 | 0.25 | 0.20 | 0.85 | 0.45 |
| D10 | 0.15 | 0.10 | 0.05 | 0.15 | 0.15 | 0.10 | 0.45 | 0.25 |

From the simulation result, we can confirm that the DCG method gives the highest accessibility, while the SAF method produces the lowest traffic. The DAFN method shows balanced performance (i.e., the second best for both data accessibility and traffic) where the radio communication range is not very long (i.e., the number of neighbors of each node is not very high). Therefore, in a real environment, a proper method among the three should be chosen based on the system requirements, i.e., how critical data accessibility and traffic are. For example, when either network bandwidth or node battery is very limited, the SAF method should be the best. On the other hand, when both the network bandwidth and the battery are plentiful and/or the data accessibility is critical, the DCG method should be the best.

2.2 Extensions. After our first paper on replica placement,35) we extended the methods proposed there in several respects, including remaining battery life and mutual dependency among data items. In Ref. 36, we took into account the fact that in many applications there are dependencies among data items; that is, multiple data items are often accessed at the same time. For example, at a disaster site, two data items, relating respectively to structural damage information and the progress of rescue activity at the same location, are often accessed simultaneously. Thus, in the extended methods, we replaced the data access frequency to each data item with the access frequency to pairs of data items accessed simultaneously.

In order to prolong the lifetime of MANETs, in Refs. 38, 39, we took into account the remaining battery life of mobile nodes, both when allocating data items to mobile nodes, and when determining the data items (replicas) to be accessed for a given data access request. By doing so, the proposed methods can prevent mobile nodes from exhausting their batteries, which is very important in MANET environments, because such battery exhaustion has a negative impact, both in terms of data availability (i.e., data items held by such nodes cannot be accessed) and network connectivity (i.e., these nodes cannot relay communication packets, which may cause network partitioning).

2.3 Replica placement considering data update. The above studies basically assumed a simple environment where data updates do not occur. We extended the methods proposed in Ref. 35 to take data updates into account when determining data replication.37),50),51)

The common idea among these studies37),50),51) is that the remaining time until each data item is next updated is taken into account when deciding which data items to replicate. More specifically, the profit of replicating a data item is defined using this remaining time as well as the data access frequency. In Ref. 50, we assumed the simplest situation that each data item is periodically updated (i.e., the remaining time until the next update is easily calculated from the update interval and the latest update time). In Ref. 37, we assumed that the update schedule (which is not necessarily periodic) is given or predicted (i.e., the remaining time is calculated from the schedule). Finally, in Ref. 51, we assumed that update intervals of each data item follow some probability density function which is known in advance, and the remaining time until the next update is calculated in a probabilistic manner (i.e., the benefit of replicating a data item is also calculated in a probabilistic manner).

Fig. 6. Performance comparison among SAF, DAFN, and DCG.35) Panels: (a) Data accessibility and (b) Traffic, each plotted against the radio communication range (0 to 20) for SAF, DAFN, and DCG.
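For the periodic-update case, the remaining time until the next update follows directly from the update interval and the latest update time. The sketch below shows this calculation; how the remaining time is combined with the access frequency is not specified in this review, so the product used in `replication_profit` is purely an illustrative assumption, as are the function names and the numbers.

```python
def remaining_time_periodic(now, last_update, interval):
    """Time left until the next update of a periodically updated data item."""
    elapsed = (now - last_update) % interval
    return interval - elapsed

def replication_profit(access_freq, remaining_time):
    # Assumption for illustration: profit grows with both the access
    # frequency and the time for which the replica stays valid.
    return access_freq * remaining_time

# Hypothetical candidates: (access frequency, last update time, update interval).
candidates = {"A": (0.5, 100, 40), "B": (0.3, 120, 90)}
now = 130
profits = {
    item: replication_profit(f, remaining_time_periodic(now, last, interval))
    for item, (f, last, interval) in candidates.items()
}
print(max(profits, key=profits.get))
```

In this hypothetical case item B is preferred even though it is accessed less often, because a replica of A would become invalid after only 10 time units.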

3 Update management

In Section 2 (except for Section 2.3), for the purpose of simplicity, we assumed an ideal environment where data items are not updated. However, in many real-world applications, data updates occur; and in MANETs, due mainly to nodal movement, disappearance (disconnection) of nodes and network partitioning frequently occur, and thus it is quite difficult to consistently and completely update all the replicas. If non-updated replicas exist, data access (read) operations will read old replicas, which is invalid in most applications. Reading old replicas is wasteful in terms of system resources (e.g., it consumes network resources and mobile node batteries), and requires the read operations to be re-executed on up-to-date replicas. Moreover, in the case of many real-world applications, reading old data items is simply unacceptable, as it may have a seriously negative impact on real-world activities; for example, rescue operations based on out-of-date information may be significantly impaired.

Therefore, the second issue is how to efficiently manage data updates in MANETs. There are basically two types of approach to update management in traditional database systems, which are also applicable in MANETs: optimistic and pessimistic.

In an optimistic approach, when a mobile node issues an access (read) request for a data item, but is not connected to the node holding the original copy, it tentatively accesses one of the replicas held by mobile nodes to which it is connected. The validity of the tentative access (i.e., whether it accesses an up-to-date replica) is then confirmed after the request-issuing node connects to the original item holder. As this form of update management often results in read operations that access stale (old) replicas, which may require rollbacks of the performed operations, it is typically difficult, if not impossible, to apply it in a MANET, where update (write) operations can be issued by any node. Therefore, in our research, we assume that optimistic approaches are used only in MANETs where the node holding the original copy issues write operations. In such an environment, there are basically two technical challenges: invalidation of old replicas, and dissemination of the latest (up-to-date) data items.

In a pessimistic approach, all access (read) operations must be confirmed to be consistent (valid) on the spot, during the operation period. However, it is very difficult to achieve this form of update management in MANETs, because of the frequency of nodal disconnections; and thus, further advances are needed in this area.

In this section, we summarize our studies on update management based on the two approaches above. Here we assume that all data items are updated at irregular intervals; and after a given data item is updated, all its replicas basically become invalid, meaning that read operations are valid if and only if they are performed on up-to-date (i.e., valid) data items (replicas).

3.1 Optimistic approaches. Since, in optimistic approaches, the tentative read operations on replicas may be invalid, there are several performance metrics.

1. Read operations success rate (Success rate): As in the problem of replica placement, the read operation success rate is of course the most significant performance metric.
2. Rate of dirty read operations (Dirty-read rate): Since read operations on invalid (old) replicas (i.e., dirty reads) are not desirable, as many applications are aborted thereby, the rate of dirty read operations is also a significant metric. Even if the read operation success rate is high, some applications cannot accept too high a dirty-read rate. Furthermore, increasing the success rate is generally accompanied by an increase in the dirty-read rate (i.e., there is a trade-off).
3. Delay until the validation of tentative read operations (Delay): Tentative read operations on replicas are only later validated, and thus the read-operation issuer incurs some delay until the validation. For most applications, a shorter delay is desirable.

4. Extra traffic (Traffic): To improve the results of the performance metrics above, a number of useful techniques have been developed, including invalidation of old replicas, and dissemination of the latest (up-to-date) data items. However, such approaches incur extra overhead in terms of communication traffic; and large volumes of traffic make significant consumption demands on network bandwidth and energy, which are not desirable in MANETs.

In the following, we summarize our studies on optimistic approaches to update management, which aim to improve the values in the above metrics. Again, in optimistic approaches, we assume that only the node holding the original copy issues write operations (i.e., updates data items). We also present the result of a performance study to show the effectiveness of our approaches. Due to space limitations, we only show it for updated-data dissemination.

3.1.1 Invalidation of old replicas. In order to reduce the dirty-read rate and delay, with low traffic overhead, invalidating old replicas is often used in traditional distributed systems and mobile systems. In Ref. 41, we proposed two old-replica invalidation methods, which involve the broadcast of a message (invalidation report) containing the time stamp of the latest version of the original copy, in order to invalidate old replicas. We called these the Update Broadcast (UB) and Connection Rebroadcast (CR) methods. Here, we assume that each mobile node manages a table in which information on the time stamps of all the data items in the entire MANET is recorded. A time stamp is the latest update time of the corresponding data item, known by the mobile node, which may differ from the actual latest update time. This table is called the time stamp table.

By broadcasting an invalidation report more frequently, we can reduce the dirty-read rate and delay more effectively, but the traffic increases. The two methods, UB and CR, adopt different strategies on how to optimize the combination of dirty-read rate, delay, and traffic. We outline the two methods below. It should be noted that the old-replica invalidation methods have no impact on the first metric (i.e., success rate), because the methods do not increase the number of valid replicas, but simply invalidate old replicas.

A. Update Broadcast (UB) method. In the UB method, a mobile node holding an original copy broadcasts an invalidation report to connected mobile nodes each time it updates the data item. The invalidation report includes the following information:

• the data identifier.
• the update time (time stamp).

If a mobile node that receives the invalidation report holds a replica of the corresponding data item, it confirms whether the replica it holds is valid, and discards it if it is stale. It also updates its own time stamp table.

In this method, the traffic caused by sending invalidation reports is small, because mobile nodes broadcast only when they update the original copy. Mobile nodes connected to the owner of the original copy can maintain the most up-to-date time stamp of the corresponding data item. However, since the network topology frequently changes due to nodal movement, connected mobile nodes may have different time stamps and different versions of the replicas for the same data item.
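The receiver-side handling of an invalidation report in the UB method can be sketched as follows. The class and field names are hypothetical, and time stamps are modeled as plain numbers; the sketch only illustrates the rule of discarding a stale replica and updating the time stamp table.

```python
from dataclasses import dataclass, field

@dataclass
class InvalidationReport:
    data_id: str
    timestamp: float  # update time of the latest version of the original copy

@dataclass
class MobileNode:
    replicas: dict = field(default_factory=dict)         # data_id -> time stamp of held version
    timestamp_table: dict = field(default_factory=dict)  # data_id -> latest known update time

    def on_invalidation_report(self, report):
        """Update the time stamp table; discard the replica if it is stale.

        Returns True if a replica was discarded."""
        known = self.timestamp_table.get(report.data_id, float("-inf"))
        latest = max(known, report.timestamp)
        self.timestamp_table[report.data_id] = latest
        held = self.replicas.get(report.data_id)
        if held is not None and held < latest:
            del self.replicas[report.data_id]
            return True
        return False

node = MobileNode(replicas={"D1": 10.0}, timestamp_table={"D1": 10.0})
print(node.on_invalidation_report(InvalidationReport("D1", 20.0)))  # stale replica
print(node.on_invalidation_report(InvalidationReport("D2", 5.0)))   # no replica held
```

A node that holds no replica of the reported item still records the time stamp, so that it can later recognize an old replica offered by another node.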

B. Connection Rebroadcast (CR) method. In the CR method, similarly to the UB method, a mobile node holding an original copy broadcasts an invalidation report to connected mobile nodes each time it updates the data item. But in addition, whenever two mobile nodes are newly connected with each other, they re-broadcast the invalidation reports they have previously received. Figure 7 shows an example of executing the CR method.

More specifically, the two newly connected mobile nodes share their respective time stamp tables, and each compares the respective entries for each data item. If the time stamp for a data item in the received time stamp table is more recent (i.e., the other node has more recent information on that data item), the node updates the entry.

Then, each of the two nodes re-broadcasts, to the previously connected mobile nodes (before the current new connection), invalidation reports for those data items whose received time stamps were greater (i.e., newer) than its own. Mobile nodes that receive these invalidation reports discard their replicas in the same manner as in the UB method.
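The table exchange performed by two newly connected nodes can be sketched as below. The function name and the representation of a time stamp table as a dictionary are assumptions for illustration; the returned lists correspond to the invalidation reports that each node would re-broadcast to its previously connected nodes.

```python
def exchange_timestamp_tables(table_a, table_b):
    """Merge two time stamp tables in place.

    Returns (reports_for_a_side, reports_for_b_side): the (data_id, timestamp)
    entries that were newer in the other node's table, i.e., what each node
    re-broadcasts to the nodes it was connected to before the new connection."""
    reports_for_a_side, reports_for_b_side = [], []
    for data_id in sorted(set(table_a) | set(table_b)):
        ts_a = table_a.get(data_id, float("-inf"))
        ts_b = table_b.get(data_id, float("-inf"))
        if ts_b > ts_a:
            table_a[data_id] = ts_b
            reports_for_a_side.append((data_id, ts_b))
        elif ts_a > ts_b:
            table_b[data_id] = ts_a
            reports_for_b_side.append((data_id, ts_a))
    return reports_for_a_side, reports_for_b_side

# Hypothetical tables of two nodes that have just come into range of each other.
table_a = {"D1": 10, "D2": 30}
table_b = {"D1": 20, "D2": 25, "D3": 5}
for_a, for_b = exchange_timestamp_tables(table_a, table_b)
print(for_a, for_b)
```

After the exchange both tables hold the same entries, which is why connected nodes end up with identical time stamp tables in this method.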

Thus, in the CR method, connected mobile nodes can maintain the same time stamp tables, because invalidation reports are re-broadcast whenever two mobile nodes are newly connected. Moreover, invalidation reports are disseminated among a larger number of mobile nodes than in the UB method, and old replicas are effectively discarded even if the replica owners are not connected to the mobile nodes holding the original copies. Thus, this method can further reduce the number of dirty-read operations, in comparison with the UB method. However, when the network topology frequently changes, the traffic caused by sending invalidation reports is much higher in the CR method, due to the frequent broadcasts of the reports.

Fig. 7. CR method.

C. Data access. When a node issuing a read operation (access request) connects with the node holding the original copy of the target data item, it performs the operation on the original copy. If, on the other hand, it connects with a replica holder, it performs a tentative operation on the replica with the latest time stamp; and when it later connects with the mobile node holding the original copy, the success (or failure) of the tentative data read is confirmed.
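The selection rule for a read operation can be sketched as follows. The tuple layout and the function name `choose_read_target` are hypothetical; the later validation at the original holder is not modeled.

```python
def choose_read_target(holders):
    """Select where to read from among connected nodes holding the data item.

    `holders` is a list of (node_id, is_original, timestamp) tuples.
    Returns (node_id, tentative), or None if no connected node holds the item."""
    for node_id, is_original, _ in holders:
        if is_original:
            return node_id, False      # a read on the original copy is valid at once
    if not holders:
        return None                    # the request fails
    node_id, _, _ = max(holders, key=lambda h: h[2])
    return node_id, True               # tentative read on the newest replica

print(choose_read_target([("M2", False, 10), ("M3", True, 30)]))
print(choose_read_target([("M2", False, 10), ("M4", False, 20)]))
```

A tentative result would be confirmed, or rolled back, once the requesting node connects with the holder of the original copy.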

3.1.2 Updated-data dissemination. As mentioned above, the old-replica invalidation methods can reduce the dirty-read rate and delay, with low traffic overhead, but cannot increase the success rate, because they simply invalidate old replicas. Thus, in Refs. 42, 71, we proposed a number of updated-data dissemination methods, to increase the success rate by efficiently refreshing old replicas. Updated data items are disseminated after completing the old-replica invalidation procedures proposed in Ref. 41. The basic idea is very simple: updated data items are disseminated in addition to invalidation reports.

To this end, our proposed methods follow similar strategies to the old-replica invalidation methods described in Ref. 41, regarding data dissemination (i.e., disseminating data after data updates and new connections), with one important difference: here we proposed two alternative approaches to data dissemination after new connections, because data items are typically much larger than invalidation reports, and this may lead to unacceptable traffic volume if we simply broadcast the updated data items to all connected mobile nodes. Let us summarize the methods proposed in Refs. 42, 71.

A. Dissemination on Update (DU) method. In the DU method, old-replica invalidation is first performed, as in the UB method. But here, each mobile node that discards one of its replicas refreshes it by requesting the updated (up-to-date) data item from the mobile node holding the original copy.

This produces only low traffic, because the dissemination of an updated data item is performed only when the data item is updated. However, since only mobile nodes connected to the original copy holder can receive the latest version, the success rate is not appreciably improved.

B. Dissemination on Connection (DC) method. In the DC method, the procedure followed when the original copy holder updates its own original copy is the same as in the DU method. But in addition, whenever two mobile nodes newly connect with each other, replica invalidation is performed as in the CR method, and these two nodes then disseminate the updated data items. As mentioned above, we proposed two variations, which differ in terms of the range over which updated data items are disseminated.

DC/OO (DC/One-to-One) method: In the DC/OO method, only the two newly connected mobile nodes share updated data items with each other, after they have broadcast their invalidation reports.

DC/GG (DC/Group-to-Group) method: In the DC/GG method, two groups of mobile nodes, which were previously connected to the two newly connected mobile nodes, disseminate their own updated data items after broadcasting their invalidation reports (i.e., the dissemination range is the same as that of the invalidation reports). Obviously, this method generates much more traffic than the DC/OO method, but can refresh many more old replicas, and thus can further increase the success rate.

C. Performance study. We briefly present the result of a simulation study.71) In the simulation, 40 mobile nodes exist in a flat area of size 500 m × 500 m, each of which holds an original data item. These nodes are initially placed at random positions and move according to the random waypoint model (in which each node randomly determines a destination position and moves toward it with a randomly determined speed, and repeats this behavior). Due to space limitations, we omit the details of the simulation setting (see Ref. 71 for the details).

Figure 8 shows the performances of the proposed methods when varying the average update period (U), where we assumed that updates of each data item occur at intervals based on an exponential distribution with mean U [s]. For the purpose of comparison, the performances when the flooding of invalidation reports and the dissemination of updated data items are not performed are shown as “NO.”

The performance metrics are data accessibility (Fig. 8(a)), rate of accessing invalid replicas (Fig. 8(b)), and traffic (Fig. 8(c)). The data accessibility (i.e., success rate) is the same as that in section 2.1.3. The rate of accessing invalid replicas (i.e., dirty-read rate) is defined as the ratio of the number of invalid data requests to old replicas to the total number of data access requests issued during the simulation time. The traffic is defined as the total data volume for transmitting updated data items and control packets.

Figure 8(a) shows that the data accessibility in the DC/OO and DC/GG methods is higher than that in the DU method. The DC/GG method provides the highest data accessibility, since updated data items are disseminated to all mobile nodes that were originally connected to the two newly connected mobile nodes. As the average update period increases, in all of the proposed methods, the data accessibility improves because the replicas held by each mobile node are valid for a longer time.

Figure 8(b) shows that the rates of accessing invalid replicas in the DC/OO and DC/GG methods are lower than that in the DU method. This is because the DC/OO and DC/GG methods can invalidate more old replicas. The DC/OO method gives a lower rate of accessing invalid replicas than the DC/GG method. This is because the DC/GG method not only allocates more valid replicas, but also allocates more invalid replicas than the DC/OO method.

Figure 8(c) shows that the DC/GG method produces the highest traffic and the DC/OO method produces the next highest. This result is expected, because the DC method disseminates updated data items more frequently than the DU method, and the DC/GG method disseminates them over wider ranges than the DC/OO method.

In summary, the simulation result shows that the DC method reduces the rate of accessing invalid replicas, but produces higher traffic than the DU method. The result also shows that the DC/GG method gives the highest data accessibility, but it also gives a higher rate of accessing invalid replicas than the DC/OO method and the highest traffic. In real environments, the most appropriate method should be chosen among the proposed methods according to the update frequencies of data items and system requirements.

3.2 Pessimistic approaches. As mentioned above, some applications require that the validity of data operations (read and write) be confirmed immediately (i.e., tentative operations are not acceptable). Therefore, in a pessimistic approach, the validity (consistency) of each performed data operation is confirmed on the spot, during the operation period. Meanwhile, strict global consistency of data operations on replicas is not desirable in many applications, as it is too costly and difficult to achieve. Thus, new forms of consistency maintenance, based on local conditions such as location and time, must be investigated. In Ref. 40, we attempted to classify different consistency levels, according to specific application requirements, and provide protocols to achieve these levels.

Here, we briefly describe these consistency levels and protocols. First, we outline the system model assumed, and then describe the proposed approaches.

Fig. 8. Performance comparison among DU, DC/OO, and DC/GG.71) Panels: (a) data accessibility, (b) ratio of accessing old (invalid) replicas, and (c) traffic for disseminating updated data, each plotted against the average update period.

3.2.1 System model. In Ref. 40, we assumed that the entire area in which MANET mobile nodes are present is divided into several regions (e.g., grid-based square regions). This assumption is based on the fact that it is usually difficult to centrally maintain consistency over the entire network. We also assumed that the MANET consists of two kinds of mobile nodes: proxies and peers, where proxies are specially designated peers that manage other peers in a specific MANET region, and every node (including proxies) knows all the proxies in the network.

3.2.2 Global Consistency (GC). Data operation consistency is required over the entire MANET. A good example is a situation in which the members of a rescue service are divided into several groups, each of which is responsible for a certain region, and information on the progress of the tasks assigned to each group is shared over the entire network. To achieve GC, we adopt dynamic quorums similar to Ref. 72. The consistency is hierarchically managed at two levels: among the peers in a given region, and among proxies. First, the quorum size for write operations, $|Q_W|$, and for read operations, $|Q_R|$, over the entire MANET, are determined so that the condition $|Q_W| + |Q_R| > l$ is satisfied, where $l$ is the total number of regions (proxies) in the entire MANET. In addition, in each region $R_i$ ($i = 1, \ldots, l$), the quorum size for write operations, $|QLW_i|$, and for read operations, $|QLR_i|$, in the region are determined so that the condition $|QLW_i| + |QLR_i| > P_i$ is satisfied, where $P_i$ is the total number of peers in the region.

The node which issues an operation (write or read) first sends a request message to the proxy of its region, which we call the coordinator. Then, the coordinator attempts to set the necessary number of local locks, $|QLW_i|$ ($|QLR_i|$), for replicas held by peers in its region of responsibility. If it succeeds, the global lock is set in the region. At the same time, the coordinator successively forwards the request message to other proxies, until it successfully sets the requisite number of global locks, $|Q_W|$ ($|Q_R|$), with each proxy that receives the request attempting to set the necessary number of local locks (i.e., set the global lock). Finally, if the coordinator succeeds in setting the necessary numbers, $|Q_W|$ ($|Q_R|$) and $|QLW_i|$ ($|QLR_i|$), of global and local locks, among proxies and peers, the write (read) operation is performed on the replicas for which the global and local write (read) locks have been set. In read operations, the operation is performed on the most recent version of a given replica, among those with locks. In write operations, the operation is performed on all the replicas with locks. Based on the above-mentioned conditions, $|Q_W| + |Q_R| > l$ and $|QLW_i| + |QLR_i| > P_i$, the consistency of data operations can be maintained among both proxies and peers in each region; that is, a peer which issues a read operation can always read the most recent version of a given replica.

Figure 9 shows an example of executing this GC protocol, where a gray node denotes the proxy in each region ($R_1, \ldots, R_9$) and arrows denote message flows among proxies. Let us assume that $|Q_W| = 6$, $|Q_R| = 4$ (i.e., $|Q_W| + |Q_R| = 10$ ($> 9$)), $|QLR_i| = \lfloor P_i \rfloor$, and $|QLW_i| = P_i - |QLR_i| + 1$. In this example, the proxy in the region where the operation (write)-issuing node (the upper right node in $R_i$) is located successfully sets $|Q_W|$ global locks with the six regions $R_1, \ldots, R_6$, where the proxy in each region successfully sets $|QLW_i|$ local locks. Therefore, in this example, the write operation is successfully performed.
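The two-level quorum check described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the protocol implementation of Ref. 40: the names `Region`, `try_operation`, and `local_write_quorum` are hypothetical, each region is reduced to a count of peers whose replicas can currently be locked, and the local read quorum in the example is assumed to be half of the peers (rounded down).

```python
from dataclasses import dataclass


def quorum_sizes_ok(write_q: int, read_q: int, total: int) -> bool:
    """Read and write quorums must overlap: |QW| + |QR| > total."""
    return write_q + read_q > total


@dataclass
class Region:
    peers: int      # P_i: number of peers in the region
    lockable: int   # peers whose replicas can be locked now (hypothetical input)

    def set_global_lock(self, local_quorum: int) -> bool:
        # The proxy sets the global lock only if it can set the
        # required number of local locks inside its own region.
        return self.lockable >= local_quorum


def local_write_quorum(region: Region) -> int:
    read_q = region.peers // 2            # assumed local read quorum
    return region.peers - read_q + 1      # |QLW_i| = P_i - |QLR_i| + 1


def try_operation(regions, coordinator, global_quorum, local_quorum_of):
    """Try the coordinator's region first, then the other proxies in turn,
    until `global_quorum` regions hold a global lock."""
    order = [coordinator] + [i for i in range(len(regions)) if i != coordinator]
    locked = []
    for i in order:
        if regions[i].set_global_lock(local_quorum_of(regions[i])):
            locked.append(i)
        if len(locked) == global_quorum:
            return locked   # the operation is performed on these regions
    return None             # too few global locks: the operation fails
```

With nine regions, a global write quorum of six, and one region that cannot gather enough local locks, the coordinator simply continues to the next proxy until six global locks are set.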

3.2.3 Local Consistency (LC). Data operation consistency is required only in each region of interest, and this consistency level weakens the strictness of consistency from a spatial perspective. An example of an application environment requiring LC is a situation in which the members of a rescue service share damage information, such as the number of injured persons and damaged buildings, which consists of distinct data items reflecting the varying extent of the damage. This information is used locally by the leaders in each region, to decide on resource allocation and task schedules in their respective groups.

To achieve LC, consistency is maintained only among peers in each region, in a manner similar to GC.

Fig. 9. An example of executing the GC protocol. The figure shows regions R1 to R9 arranged in a 3 × 3 grid.

3.2.4 Time-based Consistency (TC). In TC, replicas are valid even if their versions are different, as long as a predetermined time (validity period T) has not yet passed since they were last updated. This consistency level weakens the strictness of consistency from a temporal perspective.

In TC, typically, read and write operations are performed locally on the operation-issuing peers. A read operation is performed locally if the operation-issuing peer holds a valid replica. Otherwise, the node searches for a mobile node that holds a valid replica.
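The validity rule of TC can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Replica` record, the function names, and the idea of passing the reachable replicas as a list are hypothetical and stand in for the search procedure, which is not detailed here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Replica:
    holder: str         # node that holds the replica
    version: int
    updated_at: float   # time of the last update, in seconds


def is_valid(replica: Replica, now: float, validity_period: float) -> bool:
    """A replica is valid while less than T has passed since its last update."""
    return now - replica.updated_at < validity_period


def tc_read(local: Optional[Replica], reachable: Iterable[Replica],
            now: float, validity_period: float) -> Optional[Replica]:
    """Read locally if the local replica is valid; otherwise look for a
    valid replica among the reachable nodes."""
    if local is not None and is_valid(local, now, validity_period):
        return local
    for replica in reachable:
        if is_valid(replica, now, validity_period):
            return replica
    return None  # no valid replica can be reached: the read fails
```

Note that two valid replicas may carry different versions; TC accepts either, which is exactly the temporal relaxation described above.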

3.2.5 Peer-based Consistency (PC). Here, data operation consistency is required only in each peer. Thus, this consistency level further weakens the strictness of consistency, in comparison with LC, and is the weakest from a spatial perspective.

In PC, read and write operations are performed locally on the operation-issuing peers.

3.2.6 Performance study. We briefly present the result of a simulation study.40) In the simulation, 240 mobile nodes exist in a flatland of size $X \times 4X/3$ [m] (a region is a rectangle of $X/3 \times X/3$), each of which holds an original data item and has unlimited storage (i.e., it can replicate all data items). These nodes are initially located at random positions in their assigned region (20 nodes are assigned to each region), and move according to the random waypoint model within the region. Due to space limitations, we omit the details of the simulation setting (see Ref. 40 for details).

Figure 10 shows the performance of the four consistency management protocols (GC, LC, TC, and PC) when varying the area size (X) [m]. The performance metrics are success ratio (Fig. 10(a) and (b)), message traffic (Fig. 10(c) and (d)), and data traffic (Fig. 10(e) and (f)). We measured these metrics for both read and write operations. The success ratio is the same as data accessibility in Section 2.1.3. The message traffic is defined as the average of the total hop-count for message exchanges to process a read/write operation, excluding transmissions of data items. The data traffic is defined as the average of the total hop-count to transmit a data item to perform a (successful) read/write operation.

From Fig. 10(a) and (b), the success ratios of both read and write operations in GC and LC decrease as the area size increases. This is because the connectivity among mobile nodes becomes lower. Interestingly, when the area size is larger than 450, the success ratio in GC drops sharply while that in LC remains high. This shows that even when the connectivity among mobile nodes is still high in each region, the connectivity among proxies becomes low, i.e., we should not choose an unnecessarily strong consistency level if we wish to achieve a high success ratio.

The success ratio of write operations in TC and those of write and read operations in PC are always 1 because every peer can perform these operations locally. The success ratio of read operations in TC decreases as the area size increases. This is because when the connectivity is low, mobile nodes are likely to be unable to access valid replicas held by connected mobile nodes.

From Fig. 10(c) and (d), the message traffic of write and read operations in GC and that of read operations in TC first increase and then decrease from a certain point ($X = 450$) as the area size increases. This is because the traffic first increases due to the increase in hop-count for communication, and then decreases due to the decrease in the number of neighbors (i.e., the network becomes sparse). Of the four protocols, GC produces the highest message traffic, and LC produces much lower traffic than GC and TC (for read operations). Obviously, the message traffic of write operations in TC and those of write and read operations in PC are always 0.

From Fig. 10(e) and (f), the data traffic is much lower than the message traffic for both write and read operations, although the values do not represent actual traffic because the data size is not considered. Which of the two is dominant is determined once the sizes of messages and data items are given. The data traffic of GC is much higher than that of LC for both write and read operations. The data traffic for read operations in TC is much higher than in the other protocols, because there are fewer valid replicas in TC, and request-issuing peers have to obtain valid replicas from faraway peers.

In summary, we can confirm from the simulation result that the four consistency management methods have quite different performance. Of course, the consistency level should be chosen based on the system requirements. The simulation result also suggests that we should not choose an unnecessarily strong consistency level, because a higher consistency level may cause significant degradation of the success ratio and an increase in traffic.

4 Query processing

As noted in Section 1.2.3, the process whereby data items of interest are found (i.e., query processing) is also a significant issue, because it directly affects the performance of the system or application. Here, focusing on complex types of queries, which specify certain query conditions that define the data items sought by the query-issuing node, we describe our studies on top-k searches and k-nearest neighbor (kNN) searches, which are typical examples of such queries. We also present the result of a performance study to show the effectiveness of our approaches. Due to space limitations, we only show it for kNN query processing.

Fig. 10. Performance comparison (unlimited memory and limited movement).40) Panels: (a) success ratio (write), (b) success ratio (read), (c) message traffic (write), (d) message traffic (read), (e) data traffic (write), and (f) data traffic (read), each plotted against the area size (300 to 600) for GC, LC, TC, and PC.

4.1 Top-k searches. As it is important to efficiently acquire only necessary data items in MANETs, top-k queries offer a promising approach. In a top-k query, data items are ordered according to their score, calculated based on a specific set of attribute values using some scoring function, and the query-issuing node acquires the data items with the k highest scores. In Ref. 43, we proposed a message-processing method for top-k queries, which guarantees an accurate query result (i.e., acquiring the data items with the k highest scores in the entire MANET), while reducing the traffic as much as possible.

The basic idea is to attach a small but critical piece of information to each query message, which effectively narrows down the data item candidates to those included in the final top-k result. To this end, each mobile node roughly identifies data items with the k highest scores, and designates some of these scores as Standard Scores (SSs); then, as mobile nodes transmit query and reply messages, they reduce the number of candidates included in the top-k result, by referring to these SSs. Furthermore, if a mobile node detects the disconnection of a necessary radio link during the transmission of a reply message, it searches for an alternate path along which to transmit the reply message to the query-issuing node.

4.1.1 Determining the Standard Scores (SSs). Each SS, $B(i, j)$ ($1 \le j \le N$), calculated by mobile node $M_i$, guarantees that the query-issuing node acquires more than $\frac{k}{N} j$ ($1 \le j \le N$) data items.

4.1.2 Query processing. The procedures for the query-issuing node, $M_p$, and for the mobile nodes receiving a query message, are briefly explained below. First, $M_p$ specifies the number of requested data items, $k$, and the query conditions. Then, $M_p$ calculates the scores of its data items based on the query conditions, using a scoring function, and initializes its SSs as follows:

$$B(p, j) = S\left(p, \frac{k}{N} j\right) \quad (1 \le j \le N). \tag{1}$$

Here, $S(i, h)$ denotes the $h$-th highest score among the scores calculated by $M_i$. Specifically, the SSs of $M_p$ are every $\frac{k}{N}$-th score calculated by $M_p$.

Then, $M_p$ transmits a query message, with the attached SSs, to its neighboring mobile nodes. Each mobile node that receives the query message updates the SSs based on the scores of its retained data items, and forwards the revised query message to its neighbors. The reply messages are sent back to $M_p$ along the same routes through which the query was disseminated. Based on the information in the reply message, each mobile node that relays the reply sets its own threshold, ensuring that this is equal to or less than the k-th highest score in the network (i.e., a lower bound of the k-th score), and sends to $M_p$ its retained data items with scores equal to or greater than the threshold (Fig. 11).

In this way, mobile nodes can reduce the number of data item candidates included in the top-k result, which helps to reduce the traffic required for query processing.
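The initialization of the SSs at the query-issuing node, $B(p, j) = S(p, \frac{k}{N} j)$, can be illustrated as follows. This is a minimal Python sketch under assumptions: the function names are hypothetical, $N$ is assumed to divide $k$, and the node is assumed to hold at least $k$ items. How relaying nodes update the SSs and choose their thresholds is not detailed here, so the reply filter simply takes the threshold as given.

```python
def standard_scores(scores, k, n):
    """Initial SSs of the query-issuing node: the (k/N * j)-th highest
    local score for j = 1..N, where S(p, h) is the h-th highest score."""
    if k % n != 0:
        raise ValueError("this sketch assumes that N divides k")
    ranked = sorted(scores, reverse=True)
    if len(ranked) < k:
        raise ValueError("this sketch assumes at least k local items")
    step = k // n
    return [ranked[step * j - 1] for j in range(1, n + 1)]


def items_to_reply(scores, threshold):
    """Scores a relaying node sends back: those at or above its threshold."""
    return sorted((s for s in scores if s >= threshold), reverse=True)
```

For k = 6 and N = 3, the three SSs are the second, fourth, and sixth highest local scores, which matches the "top 2, top 4, top 6" labeling used in Fig. 11.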

4.1.3 Top-k query processing with replica placement. After Ref. 43, we have studied more efficient approaches for top-k query processing in MANETs. For example, in Ref. 47, we proposed some new top-k query processing methods which estimate the distribution of scores in the entire MANET and predict the k-th highest score more precisely than the method in Ref. 43, which roughly calculates the lower bound of the k-th score using Standard Scores. For estimation of the score distribution, these methods utilize either a histogram of scores or an approximation to a normal distribution based on the scores of data items obtained during query processing. In Ref. 47, we assumed an environment where data items are replicated to improve the query processing performance. Therefore, the proposed methods also take replication into account in query processing. Specifically, these methods try to avoid duplicate transmissions of replicas of the same data items, and also minimize the length of the paths along which the data items are sent back. By doing so, we can reduce both the traffic and the delay for query processing.

In Ref. 52, we addressed a data replication technique specially designed for top-k query processing, and proposed a simple replication scheme to avoid heavy replica duplication among nodes, which can occur if we simply replicate data items based on the scores. To this end, our scheme not only takes into account the scores of data items, but also adopts a randomized approach to achieve diversity of replicas in the entire MANET.

Fig. 11. Query message transmission for top-k search. The figure shows query and reply messages exchanged among nodes M1 to M5, annotated with their SSs and thresholds, for an example with k = 6 and N = 3, in which the three SSs are the scores of the top 2, top 4, and top 6 data items.

4.2 k-Nearest neighbor search. Since MANETs are generally constructed by collaborating mobile users who are geographically distributed in the working area, location-based queries, such as those finding users near a specific location and those finding data associated with a specific location (e.g., sensor data observed at the target location), are often used. However, there had been no studies addressing such queries in MANETs. Therefore, since 2010, we have addressed the issues of k-nearest neighbor (kNN) search in MANETs. A kNN query is the most typical location-based query, which retrieves the kNNs from a query-designated location.

In Ref. 44, we addressed the issue of searching for kNN nodes in MANETs. A naive approach for searching for kNN nodes is to flood the entire MANET with a query message and receive a reply from each query-receiving node, which includes the location of the node. Obviously, this produces too much unnecessary traffic, which not only consumes a large amount of energy but also reduces the accuracy of the query result due to packet losses caused by the heavy traffic. Therefore, in Ref. 44, we proposed kNN query processing methods for reducing traffic and maintaining high accuracy of the query result in MANETs. These methods are based on the following key policy.
• As far as possible, only mobile nodes located near the query-specified location (i.e., ideally, only kNNs) participate in the query processing.

To this end, in the proposed methods, the query-issuing node first forwards a kNN query using geo-routing to the node nearest to the location specified by the query (query point). Then, the node nearest to the query point forwards the query to other nodes close to the query point, and each node receiving the query replies with information on itself. A possible way to achieve this is for every node to continuously track the locations of neighbor nodes by exchanging beacons (in other words, hello or heartbeat messages). However, in MANETs, since mobile nodes move freely, exchanging beacons frequently enough to precisely track neighbors’ locations produces an unacceptably large amount of traffic. Thus, our methods were designed as beacon-less approaches. How this is achieved is explained later in this subsection.

4.2.1 Proposed methods. To search only nodes close to the query point, we proposed two approaches: the Explosion (EXP) method and the Spiral (SPI) method. In the EXP method, the node nearest to the query point broadcasts the query to nodes within a specific circular region, and each node receiving the query replies with information on itself (Fig. 12(a)). The circular region is determined based on the density of nodes in the MANET. If the node density around the query point is significantly different from that in the entire MANET, the EXP method either cannot collect the kNNs (low estimation) or collects information on an unnecessarily large number of nodes (high estimation).
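One simple way to derive a circular region from the node density is sketched below in Python. This is an assumption made for illustration, not necessarily the estimate used in Ref. 44: nodes are taken to be uniformly distributed, so a circle of radius $r$ is expected to contain $\rho \pi r^2$ nodes for density $\rho$, and the hypothetical `margin` parameter enlarges the circle for safety.

```python
import math


def exp_search_radius(k, node_count, area, margin=1.0):
    """Radius of a circle expected to contain k nodes under a uniform
    density assumption, optionally enlarged by a safety margin."""
    density = node_count / area                      # nodes per unit area
    return margin * math.sqrt(k / (math.pi * density))
```

If the local density around the query point is lower than the global one, this radius is too small to cover the kNNs (low estimation); if it is higher, too many nodes fall inside the circle (high estimation), which is the sensitivity noted above.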

In the SPI method, the node nearest to the query point forwards the query to other nodes in a spiral manner, and the node that collects a satisfactory kNN result transmits the result to the query-issuing node (Fig. 12(b)). Thus, this method does not require a circular region to be specified. To achieve this, the entire area is dynamically partitioned into a set of hexagonal cells whose size is determined based on the communication range of the mobile nodes (so that the information on nodes in a cell can be obtained through one-hop communication), with the query point at the center of a hexagonal cell.

In both methods, a designated node aggregates the information in the received replies, and thus unnecessary information is not sent in reply through a long path to the query-issuing node. In these ways, unnecessary transmissions of queries and replies can be reduced. Comparing the EXP and SPI methods, the EXP method generally achieves a shorter query processing delay, while the SPI method generally achieves lower traffic in query processing.

Fig. 12. kNN search methods: (a) EXP method, (b) SPI method.

4.2.2 Idea for making our methods beacon-less. The main idea for making the above methods beacon-less (i.e., processing queries without knowing neighbors’ locations) is that each node that has received a query appropriately sets the waiting time until replying to the query, based on the locations of the query-issuing node, query-relaying node, and query receiver. For example, in the EXP method, after a query is broadcast within the circular region, query replies are made to start from farther nodes and proceed to closer nodes by setting the waiting time $RD$ as follows.

$$RD = Max\_delay \cdot \left(\frac{r - d}{r}\right) \tag{2}$$

where $Max\_delay$ is a positive constant specifying the maximum waiting time before sending a reply, $r$ is the radius of the searching range, and $d$ is the distance between the source node’s location and the foot of the receiving node’s perpendicular to the line from the source node to the query point.
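The waiting-time rule can be illustrated with a short Python sketch. It assumes that the waiting time decreases linearly with $d$, i.e., $RD = Max\_delay \cdot (r - d)/r$ with $r$ denoting the radius of the searching range, so that farther nodes reply first; this linear form, the clamping of $d$ to $[0, r]$, and the function names are assumptions made for illustration.

```python
import math


def projection_distance(source, query_point, receiver):
    """Distance d from the source to the foot of the perpendicular dropped
    from the receiver onto the line from the source to the query point."""
    sx, sy = source
    qx, qy = query_point
    rx, ry = receiver
    line_len = math.hypot(qx - sx, qy - sy)
    if line_len == 0.0:
        return 0.0  # degenerate case: the source is at the query point
    # Scalar projection of (receiver - source) onto the line direction.
    dot = (rx - sx) * (qx - sx) + (ry - sy) * (qy - sy)
    return abs(dot) / line_len


def reply_delay(max_delay, radius, d):
    """Waiting time before replying: longest for d = 0, zero for d = radius."""
    d = min(max(d, 0.0), radius)  # keep d within the searching range
    return max_delay * (radius - d) / radius
```

Because each node computes its own delay from positions carried in the query, no neighbor table (and hence no beacon exchange) is needed to order the replies.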

4.2.3 Performance study. We briefly present the result of a simulation study.44) In the simulation, 500 mobile nodes exist in a flatland of size 1000 [m] × 1000 [m]. These nodes are initially located at random positions and move according to the random waypoint model. Due to space limitations, we omit the details of the simulation setting (see Ref. 44 for details).

Figure 13 shows the performance of the EXP and SPI methods when varying the number of requested kNNs (k). For the purpose of comparison, we also show the performance of the naive method and the DIKNN method, as well as beacon-based versions of our methods. The naive method does not use beacons; the query-issuing node first floods a query within the range, which is determined in the same way as in the EXP method, and then each node receiving the query individually replies to the query-issuing node. The DIKNN method is the state-of-the-art kNN query processing method for WSNs, and is based on beacons. In the EXP and SPI methods using beacons, the node behavior is basically the same as in the beacon-less EXP and SPI methods; however, messages are transmitted (unicast) based on the neighbor information obtained by beacons.

The performance metrics are traffic, response time, and accuracy of query result. The traffic is defined as the average total volume of query messages and replies exchanged in processing a query. The response time is defined as the average time from the transmission of a query message by the query-issuing node, to the reception of the kNN result. The accuracy of query result is defined as the average of the weighted ratio (MAP value73)) of the number of kNNs whose information is included in the kNN result acquired by the query-issuing node, to the requested number of kNNs, k.

From Fig. 13(a), as k increases, the traffic increases in all methods, because both the search area for processing a kNN query and the data volume of the reply increase. Our proposed methods generate far less traffic than the methods using beacons, as the periodic beacon exchanges involved in the latter cause a great deal of traffic. Our proposed methods also generate far less traffic than the naive method, because in our methods, replies are sent back to the query-issuing node in a more efficient manner. The EXP method produces more traffic than the SPI method, because in this simulation the estimated kNN circle is set large enough for safety, and thus many non-kNN nodes reply. In the SPI method, the traffic depends on the number of laps required for collecting the information on kNNs, and thus the traffic increases in a stepwise manner as k increases.

From Fig. 13(b), the response time in our proposed methods is greater than in the methods using beacons. In our methods, a node sets a waiting time before transmitting a message (which is a disadvantage of not using beacons), and thus the response time is increased. In the EXP method in particular, every node must wait for the calculated waiting time before sending back a reply, which results in an increase in response time. In the naive method, such waiting times do not occur; however, in this method retransmissions of replies often occur, due to packet losses caused by the increased traffic. Overall, the response time in the naive method is roughly similar to that of the SPI method.

Fig. 13. Performance comparison (500 nodes).44) Panels: (a) traffic [KB], (b) response time [s], and (c) accuracy of query result, each plotted against the requested number of kNNs, k (0 to 50), for the EXP, SPI, EXP (beacon), SPI (beacon), DIKNN, and naive methods.

From Fig. 13(c), the accuracy of query result is very high (nearly 1) in our methods, in contrast to the lower accuracy of query result in the DIKNN and naive methods. This is because, in the latter, packet losses often occur due to individual replies from a large number of nodes.

In summary, our proposed beacon-less kNN query processing methods, which were specially designed for MANETs, perform significantly better than the naive method and the state-of-the-art approach for WSNs. Of our proposed methods, the SPI method achieves high performance when the node density is high (500 nodes). However, although not presented in this review, the result in Ref. 44 showed that the performance of the SPI method significantly degrades when the node density is low. Thus, the EXP method also has the advantage that it can stably achieve both a reduction in traffic and high accuracy of the query result.

4.2.4 kNN query processing with replica placement. We extended the EXP method to retrieve location-dependent data items, which are associated with some locations.45) Since a location-dependent data item is held by a particular mobile node, which also keeps moving (i.e., the data item’s current location is generally different from its associated location), some new techniques are needed for searching for kNN data items using the EXP method. Therefore, the method proposed in Ref. 45 extended the original EXP method as follows.
• The data items are managed so that they are located near their associated locations. Specifically, when the mobile node holding the original copy of a data item moves beyond d from the data item’s associated location, the node passes it to the node closest to this location. By doing so, we can apply the EXP method by extending the radius of the circular region by d.
• Since data items can be replicated, which is effective for improving the search performance, each mobile node replicates the C data items whose associated locations are closest to itself, and thus replica maintenance is needed as the node moves. Specifically, each node relocates replicas every time it moves distance d.
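The hand-off rule in the first item and the resulting enlargement of the search circle can be sketched as follows. This Python sketch rests on assumptions: the function names are hypothetical, and the closest node is found from a dictionary of all node positions, whereas in a MANET a node would have to discover it by communication.

```python
import math


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def next_holder(item_location, holder, positions, d):
    """Keep the original copy at `holder` while it stays within distance d
    of the item's associated location; otherwise hand it to the node
    closest to that location."""
    if dist(positions[holder], item_location) <= d:
        return holder
    return min(positions, key=lambda node: dist(positions[node], item_location))


def extended_radius(knn_radius, d):
    """An item may sit up to d away from its associated location, so the
    EXP search circle is enlarged by d."""
    return knn_radius + d
```

The bound d thus trades off hand-off frequency against search cost: a small d keeps items close to their locations but triggers more hand-offs, while a large d enlarges every search circle.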

5 Security management

In our studies described in the above sections, we did not assume the presence of malicious nodes. If malicious nodes are present, the accuracy of query processing is expected to decrease. In Refs. 34, 48, 49, we defined a new type of attack for top-k queries, called the data replacement attack (DRA), in which malicious nodes attempt to replace necessary data items with unnecessary data items, and proposed some novel techniques against DRAs. In top-k query processing, a query-issuing node does not know the global top-k result beforehand. Therefore, even if a malicious node has performed a DRA, the query-issuing node considers the received data items with the k highest scores to be the global top-k result, rendering the DRA effectively undetectable, i.e., a DRA is a stronger attack than other traditional forms of attack.

In this section, we present our approaches against DRAs, and also present the result of a performance study to show the effectiveness of our approaches.

5.1 Top-k query processing and malicious node identification. When malicious nodes performing DRAs are present, we need to take the following actions to make the entire MANET robust against DRAs:
• keeping high accuracy of the query result,
• identifying malicious nodes.

The first action is needed to reduce the impact of DRAs from the application perspective. The second action is needed to remove the malicious nodes and keep the MANET healthy from the system perspective.

In Ref. 34, we proposed a top-k query processing method (first action) against DRAs and a local malicious node identification method (second action). Moreover, in Ref. 49, we proposed a global malicious node identification method (second action) in which normal nodes that have detected malicious nodes share the information on the malicious nodes to detect malicious nodes more widely. Here, for simplicity, we assumed a naive manner of top-k query processing, where the query-issuing node first broadcasts a query over the entire MANET and each node receiving the query sends back a reply with the data items with the k highest scores (local top-k result) among its own data items and all data items received from its child nodes on the query propagation routes, i.e., simple data aggregation is performed.
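The naive aggregation assumed here can be sketched in Python as a recursion over the query propagation tree. The data layout (dictionaries keyed by node name, items as `(score, item_id)` pairs) and the function name are hypothetical choices made for illustration; no malicious behavior is modeled.

```python
import heapq


def local_top_k(node, own_items, children, k):
    """Local top-k result of `node`: the k highest-scoring items among its
    own items and the local top-k results replied by its child nodes."""
    candidates = list(own_items.get(node, []))
    for child in children.get(node, []):
        candidates.extend(local_top_k(child, own_items, children, k))
    return heapq.nlargest(k, candidates)
```

Since every intermediate node forwards only k items, a single node on the reply path that swaps high-score items for its own low-score items changes the final result without the query-issuing node noticing, which is the weakness a DRA exploits.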

5.1.1 Top-k query processing.34) The idea for keeping high accuracy of the query result is simple but effective: each node receiving a query replies with the data items with the k highest scores along multiple (two) routes. By doing so, even if a malicious node is present along a query reply path, the query-issuing node can acquire the top-k result. Our experimental results confirmed that even if more than two malicious nodes are present, just replying along two routes works well in most cases.

5.1.2 Local malicious node identification.34) To enable DRA and malicious node detection, we make each reply message include information on the route along which the message is forwarded. By doing so, the query-issuing node can know which data items should be sent back along the route; in other words, it can recognize a DRA. When detecting a DRA, the query-issuing node narrows down the malicious node candidates from the information attached to the received reply messages, and asks non-candidate nodes (neighbors of the candidates) for information on the data items sent by these candidates, allowing it to identify the malicious nodes.

5.1.3 Global malicious node identification.49) Through simulation experiments, we confirmed that when there are multiple malicious nodes in a MANET, it is difficult for the method proposed in Ref. 34 to identify all the malicious nodes in a single query. This is partly because nodes, in this method, are more likely to identify only malicious nodes near their own location than those farther away. Thus, in order to rapidly identify a greater number of malicious nodes, the global malicious node identification method proposed in Ref. 49 makes nodes share information about identified malicious nodes with other nodes.

In this method, after receiving a predetermined number of queries, each node divides all nodes into groups based on the similarity of the information on identified malicious nodes that they sent. Then, the node performs malicious node identification separately for each group, based on the information in the group, and makes the final judgment of malicious nodes by combining the identification results of all the groups. In this method, even if malicious nodes claim that some normal nodes are malicious (which we call a false notification attack (FNA)), there is a decisive difference in the nature of the information possessed by normal and malicious nodes concerning the identified malicious nodes, and therefore the malicious nodes can be easily identified.

5.2 Signature-based top-k query processing against DRAs. With the method in Ref. 49, each node can identify a large number of malicious nodes more quickly than with the local identification method in Ref. 34. However, the global malicious node identification is performed only after each node receives a predetermined number of queries, so it still takes a relatively long time to identify all malicious nodes. In addition, some normal nodes are sometimes mistakenly judged to be malicious.

To solve these problems and identify all malicious nodes more quickly, we proposed a signature-based top-k query processing method in Ref. 48. In this method, each node receiving a query message sends back a reply message that contains the local top-k result (i.e., tentative top-k data items), to which it attaches, as digital signatures, encrypted information about the reply forwarding route (i.e., the list of nodes along which the reply message has been sent) and the sent data items (i.e., the list of data items newly added by this node and those deleted from the previous local top-k data items). By doing so, the query-issuing node can know the data items sent by each node in the MANET, and can thereby identify malicious nodes using the received signatures. After identifying the malicious nodes, the query-issuing node floods the MANET with a notification message including the signatures that show the identified malicious nodes replacing higher-score data items with their own lower-score items.

Figure 14 illustrates an example of forwarding reply messages with digital signatures, then detecting a DRA and identifying a malicious node, where SIG_Mi denotes the digital signature by mobile node Mi. Here, we assume that malicious node M2 replaces the 88-score data item sent by M4 with its own 65-score data item. The query-issuing node, M1, examines the reply message from M2, and it is clear from SIG_M2 that M2 has replaced the 88-score data item with its own low-score item. M1 thus detects a DRA and identifies M2 as a malicious node.
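The check performed by M1 in this example can be sketched as follows. The sketch is a simplification under stated assumptions: a keyed hash (HMAC) with hypothetical per-node keys stands in for the digital signatures of Ref. 48, the record layout (`node`, `route`, `added`, `deleted`) is invented for illustration, and the rule "a deleted item outscores an added item" is an assumed reading of how a replacement becomes visible in a signature.

```python
import hashlib
import hmac
import json

# Hypothetical per-node keys; a real deployment would use public-key signatures.
KEYS = {"M2": b"key-M2", "M3": b"key-M3", "M4": b"key-M4"}

def sign(node, route, added, deleted):
    """Produce a record of what the node added and deleted, plus an authentication tag."""
    record = {"node": node, "route": route, "added": added, "deleted": deleted}
    payload = json.dumps(record, sort_keys=True).encode()
    return record, hmac.new(KEYS[node], payload, hashlib.sha256).hexdigest()

def verify(record, tag):
    """Recompute the tag with the claimed signer's key and compare in constant time."""
    payload = json.dumps(record, sort_keys=True).encode()
    expected = hmac.new(KEYS[record["node"]], payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

def is_replacement(record):
    """Assumed rule: the node deleted an item that outscores one of the items it added."""
    if not record["added"] or not record["deleted"]:
        return False
    return max(s for _, s in record["deleted"]) > min(s for _, s in record["added"])

signatures = [
    sign("M4", ["M4"], [("M4", 88), ("M4", 80), ("M4", 71)], []),
    sign("M2", ["M4", "M2"], [("M2", 65)], [("M4", 88)]),  # replaces 88 with 65
    sign("M3", ["M4", "M3"], [("M3", 90)], [("M4", 71)]),  # legitimate update
]
malicious = [rec["node"] for rec, tag in signatures if verify(rec, tag) and is_replacement(rec)]
print(malicious)
```

M3 also deletes an item, but the item it adds has a higher score, so only M2 is flagged.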

5.3 Performance study. We briefly present the results of a simulation study.48) In the simulation, 50 mobile nodes exist in a 500 m × 500 m flat area, and each node holds 50 data items whose scores are set randomly. The nodes are initially placed at random positions and move according to the random waypoint model. Due to space limitations, we omit the details of the simulation settings (see Ref. 48).
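For readers unfamiliar with the random waypoint model, a minimal sketch of such a setup is given below. The node count, area size, and number of data items follow the text; the speed, the time step, the zero pause time, the number of steps, and all names are assumptions, since the actual settings are given only in Ref. 48.

```python
import math
import random

AREA = 500.0      # side of the square area in metres (from the text)
NUM_NODES = 50    # from the text
NUM_ITEMS = 50    # data items per node (from the text)
SPEED = 1.0       # metres per second (assumed)
DT = 1.0          # seconds per simulation step (assumed)

def random_point(rng):
    return (rng.uniform(0, AREA), rng.uniform(0, AREA))

def step(pos, dest, rng):
    """Move toward the current waypoint; on arrival, pick a new one (no pause time)."""
    dx, dy = dest[0] - pos[0], dest[1] - pos[1]
    dist = math.hypot(dx, dy)
    travel = SPEED * DT
    if dist <= travel:
        return dest, random_point(rng)
    return (pos[0] + dx / dist * travel, pos[1] + dy / dist * travel), dest

rng = random.Random(0)
nodes = [
    {
        "pos": random_point(rng),
        "dest": random_point(rng),
        "scores": [rng.random() for _ in range(NUM_ITEMS)],
    }
    for _ in range(NUM_NODES)
]

for _ in range(100):  # simulate 100 steps
    for node in nodes:
        node["pos"], node["dest"] = step(node["pos"], node["dest"], rng)
```

Because every waypoint lies inside the square and nodes move along straight segments between waypoints, nodes never leave the simulation area.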

Figures 15 and 16 show the performance of our signature-based method in Ref. 48 (denoted by MP-SIG), our non-signature-based method in Ref. 49 (denoted by MP-noSIG), a modified version of the method in Ref. 48 (denoted by SP-SIG) in which each node replies only to its parent (i.e., a single-path-based method: SP), and the naive method (denoted by SP-noSIG), which neither uses multi-path replies nor detects DRAs and malicious nodes.


In Fig. 15, we measured the number of queries necessary for detecting all malicious nodes while varying the number of malicious nodes (the number of requested top-k data items (k) was fixed at 30). Since the naive method (SP-noSIG) cannot detect malicious nodes, we omit its result. In Fig. 16, we fixed the number of malicious nodes at 5, varied the number of requested top-k data items (k), and measured three performance metrics: accuracy of the query result (the average ratio of the number of top-k data items included in the acquired top-k result to k), traffic for top-k query processing (the average of the total traffic volume required for processing a top-k query), and traffic for notification (the average of the total traffic volume required for notifying nodes of the identified malicious nodes).
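The accuracy metric for a single query can be written as a short function. The sketch below uses toy (owner, score) data and assumes distinct scores, so that the true top-k set is unambiguous; the function name `accuracy` is chosen for illustration, and the reported metric would be the average of this ratio over all queries.

```python
import heapq

def accuracy(acquired, all_items, k):
    """Fraction of the true top-k items that appear in the acquired result."""
    true_top_k = set(heapq.nlargest(k, all_items, key=lambda item: item[1]))
    return len(true_top_k & set(acquired)) / k

# Toy data: the 88-score item was lost to a data replacement attack.
all_items = [("M3", 90), ("M4", 88), ("M4", 80), ("M4", 71), ("M2", 65)]
acquired = [("M3", 90), ("M4", 80), ("M4", 71)]
ratio = accuracy(acquired, all_items, k=3)
print(ratio)
```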

Figure 15 shows that the signature-based method (MP-SIG) in Ref. 48 can detect all malicious nodes significantly faster than the non-signature-based method (MP-noSIG) in Ref. 49, which confirms the effectiveness of using the signatures.

From Fig. 16(a), the accuracy of the query result in MP-SIG and MP-noSIG is greater than in SP-SIG and SP-noSIG, which shows the effectiveness of our multiple replies against DRAs. However, as k increases, the accuracy of the query result decreases in MP-SIG and MP-noSIG, because the chance of packet loss increases with the size of the replies. In particular, MP-SIG achieves lower accuracy than MP-noSIG, and the difference between MP-SIG and MP-noSIG increases as k increases, because the size of the signatures in MP-SIG increases significantly and thus packet losses occur more often.

From Fig. 16(b), in all methods, as k increases, the traffic required for top-k query processing increases because of the increase in the reply message size. From Fig. 16(c), as k increases, the traffic for notification increases in MP-SIG and SP-SIG because the size of the notification message increases with the size of the signatures. On the other hand, in MP-noSIG, even if k increases, the traffic required for notification is very small, as nodes send notification messages containing only the information about identified malicious nodes (i.e., without signatures).

In summary, the simulation results show a trade-off between quick detection of malicious nodes (achieved by our signature-based method in Ref. 48) and high accuracy with low traffic (achieved by our non-signature-based method in Ref. 49). In real situations, we should carefully choose an appropriate method based on the system requirements (i.e., which performance metric is more critical in the system).

Fig. 14. Signature-based top-k query processing. [Figure: reply messages forwarded among M4, M2 (malicious node), M3, and M1 (query-issuing node); each reply message carries a local top-k result (node, score), and each signature records the forwarding route and the data items.]

Fig. 15. The number of queries necessary for identifying all malicious nodes.48)


6 Concluding remarks

In this paper, we outlined our studies on data management in MANETs. In this section, we first summarize the academic and social contributions of these studies. Then, we discuss some future directions of data management research in MANETs.

6.1 Contributions of our studies.

6.1.1 Academic contributions. A number of the studies summarized above have been acknowledged as pioneering works, which together have established a new field of research on data management in MANETs. The study described in Ref. 35, in fact, was the first to address data replication techniques in MANETs; following its publication, many similar studies have been done, most of which cite this study.

After that initial study in Ref. 35, as summarized here, we addressed a variety of research issues related to data management in MANETs and, in most cases, published the first related papers in the research community.

Several survey papers on data management in MANETs have recently been published in leading journals and conferences, such as the VLDB Journal54) and the IEEE Communications Surveys & Tutorials;53) in these surveys, our studies were evaluated highly as pioneering work.

Thus, our studies have contributed significantly to the advancement of academic research in this area.

6.1.2 Social contributions. Our studies also have significant social value, because technical advancement in MANET data management will significantly improve data availability and data access performance in important applications, such as those described in Section 1.2.2. In particular, the techniques proposed in our studies will significantly contribute to applications involving mobile-user collaboration, such as rescue operations at disaster sites and real-time information sharing among vehicles for safe/autonomous driving.

6.2 Future directions. As mentioned above, the techniques proposed in our studies have made significant contributions from both the academic and social perspectives. However, to deal with more advanced MANET applications, further efforts are needed. In this last part of the paper, we discuss some future directions. Although the performance of each of the techniques we proposed of course needs to be improved, we omit that discussion and focus on other research directions.

6.2.1 Mobile crowdsensing/crowdsourcing. Recently, there has been a new trend of mobile crowdsensing/crowdsourcing, in which a crowd of ordinary people carrying mobile devices is strongly involved in application missions. In mobile crowdsensing, mobile devices with sensing capabilities act as sensor nodes and help with sensing operations. In mobile crowdsourcing, mobile users act as workers who conduct tasks that are part of a large mission.

In MANETs, there are many applications in which mobile crowdsensing and crowdsourcing are useful, e.g., rescue operations. Therefore, addressing issues for mobile crowdsensing/crowdsourcing in MANETs, such as task assignment, task scheduling, resource allocation, and incentive mechanisms, will be an interesting and significant research direction.

6.2.2 Integration with clouds. The data management techniques in MANETs are particularly useful in situations where no network infrastructure such as the Internet is available. However, even if the Internet is available, MANET technologies are still very useful, for example, for enlarging the coverage of the Internet and for off-loading Internet and server overhead, which have become hot research topics.

Fig. 16. Performance comparison of our methods,48) plotted against k: (a) accuracy of the query result; (b) traffic for top-k query processing; (c) traffic for notification.


On the other hand, as MANET applications become more complex and advanced, it is easy to expect that the computation overhead needed to meet the application requirements will increase significantly and exceed the computation capability of mobile nodes. In such a situation, if an Internet connection is available at some MANET nodes (i.e., iMANET environments), off-loading the computation to clouds (and edge computers) in the Internet is promising. Therefore, addressing issues for seamlessly integrating operations in MANETs and clouds, such as task assignment/scheduling, consistency management, and fault recovery, will be a challenging research direction.

6.2.3 Real deployment. We are now at the stage of deploying real data management applications in MANETs and verifying the effectiveness of existing techniques in practical use. Rescue operations at a disaster site are a typical and the most significant target, especially in Japan.

References

1) Baker, D.J., Wieselthier, J. and Ephremides, A. (1982) A distributed algorithm for scheduling the activation of links in a self-organizing, mobile, radio network, Proc. IEEE ICC’82, 2F6.1–2F6.5.

2) Broch, J., Maltz, D.A., Johnson, D.B., Hu, Y.C. and Jetcheva, J. (1998) A performance comparison of multi-hop wireless ad hoc network routing protocols, Proc. Mobicom’98, 159–164.

3) Kahn, R.E. (1975) The organization of computer resources into a packet radio network, Proc. National Computer Conference, 177–186.

4) Johnson, D.B. (1994) Routing in ad hoc networks of mobile hosts, Proc. IEEE Workshop on Mobile Computing Systems and Applications, 158–163.

5) Pearlman, M.R. and Haas, Z.J. (1999) Determining the optimal configuration for the zone routing protocol. IEEE J. Sel. Areas Commun. 17, 1395–1414.

6) Perkins, C.E. and Bhagwat, P. (1994) Highly dynamic destination-sequenced distance-vector routing (DSDV) for mobile computers, Proc. ACM SIGCOMM’94, 234–244.

7) Perkins, C.E. and Royer, E.M. (1999) Ad hoc on demand distance vector routing, Proc. IEEE Workshop on Mobile Computing Systems and Applications, 90–100.

8) ASA (ALT) Public Affairs (2011) Army networking radios improve communications at tactical edge, U.S. Army.

9) Tactical Scalable MANET (TSM), TrellisWare Technologies, https://www.trellisware.com/manet-products/tsm/.

10) Anjum, S.S., Noor, R.M. and Anisi, M.H. (2016) Review on MANET based communication for search and rescue operations. Wirel. Pers. Commun.

11) Bluetronix, http://www.bluetronix.net/.

12) Aldunate, R., Ochoa, S.F., Pena-Mora, F. and Nussbaum, M. (2006) Robust mobile ad hoc space for collaboration to support disaster relief efforts involving critical physical infrastructure. J. Comput. Civ. Eng. 20.

13) Ochoa, S.F. and Santos, R. (2015) Human-centric wireless sensor networks to improve information availability during urban search and rescue activities. Inf. Fusion 22, 71–84.

14) Sakano, T., Kotabe, S., Komukai, T., Kumagai, T., Shimizu, Y., Takahara, A., Ngo, T., Fadlullah, Z.M., Nishiyama, H. and Kato, N. (2016) Bringing movable and deployable networks to disaster areas: development and field test of MDRU. IEEE Netw. 30, 86–91.

15) Willke, T.L., Tientrakool, P. and Maxemchuk, N.F. (2009) A survey of inter-vehicle communication protocols and their applications. IEEE Commun. Surv. Tutor. 11 (2), 3–10.

16) Zhang, D., Zhang, D., Xiong, H., Hsu, C.H. and Vasilakos, A.V. (2014) BASA: building mobile Ad-Hoc social networks on top of android. IEEE Netw. 28 (1), 4–9.

17) Wu, S.L. and Tseng, Y.C. (2007) Wireless Ad Hoc Networking: Personal-Area, Local-Area, and the Sensory-Area Networks, Auerbach Publications.

18) Ikehara, C.S., Biagioni, E. and Crosby, M.E. (2007) Ad-hoc wireless body area network for augmented cognition sensors, Proc. Int’l Conf. on Foundations of Augmented Cognition (FAC 2007), 38–46.

19) Lambrou, T.P. and Panayiotou, C.G. (2009) A survey on routing techniques supporting mobility in sensor networks, Proc. Int’l Conf. on Mobile Ad-hoc and Sensor Networks (MSN 2009), 78–85.

20) Zhu, C., Shu, L., Hara, T., Wang, L., Nishio, S. and Yang, L.T. (2014) A survey on communication and data management issues in mobile sensor networks. Wirel. Commun. Mob. Comput. 14, 19–36.

21) Hui, P., Chaintreau, A., Scott, J., Gass, R., Crowcroft, J. and Diot, C. (2005) Pocket switched networks and human mobility in conference environments, Proc. ACM SIGCOMM Workshop on Delay-Tolerant Networking, 244–251.

22) Yau, S.S., Gupta, S.K.S., Gupta, E.K.S., Karim, F., Ahamed, S.I., Wang, Y. and Wang, B. (2003) Smart classroom: Enhancing collaborative learning using pervasive computing technology, Proc. ASEE Annual Conf. and Expo., 13633–13642.

23) Barrere, L., Chaumette, S. and Turbert, J. (2006) A tactical active information sharing system for military MANETs, Proc. IEEE Military Communications Conf. (MILCOM 2006), 3621–3627.

24) Burbank, J.L., Chimento, P.F., Haberman, B.K. and Kasch, W.T. (2006) Key challenges of military tactical networking and the elusive promise of MANET technology. IEEE Commun. Mag. 44 (11), 39–45.

25) Mohamad, O.A., Hameed, R.T. and Tapus, N. (2015) Smart home system based on comparative analysis among AODV and DSDV protocols in MANET, Proc. Int’l Conf. on System Theory, Control and Computing (ICSTCC 2015).

26) Hara, T. (2006) Replica location management for data sharing in mobile ad hoc networks. Journal of Interconnection Networks (JOIN) 7, 75–89.

27) Cover, T. and Hart, P. (1967) Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13, 21–27.

28) Ilyas, I.F., Beskales, G. and Soliman, M.A. (2008) A survey of top-k query processing techniques in relational database systems. ACM Comput. Surv. 40 (4), article 11.

29) Fu, T.-Y., Peng, W.-C. and Lee, W.-C. (2010) Parallelizing itinerary-based KNN query processing in wireless sensor networks. IEEE Trans. Knowl. Data Eng. 22, 711–729.

30) Silberstein, A., Munagala, K. and Yang, J. (2006) Energy-efficient monitoring of extreme values in sensor networks, Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD 2006), 169–180.

31) Wu, M., Xu, J., Tang, X. and Lee, W.-C. (2006) Top-k monitoring in wireless sensor networks. IEEE Trans. Knowl. Data Eng. 19, 962–976.

32) Wu, S.-H., Chuang, K.-T., Chen, C.-M. and Chen, M.-S. (2008) Toward the optimal itinerary-based KNN query processing in mobile sensor networks. IEEE Trans. Knowl. Data Eng. 20, 1655–1668.

33) Kannhavong, B., Nakayama, H., Nemoto, Y., Kato, N. and Jamalipour, A. (2007) A survey of routing attacks in mobile ad hoc networks. IEEE Wirel. Commun. 14 (5), 85–91.

34) Tsuda, T., Komai, Y., Sasaki, Y., Hara, T. and Nishio, S. (2014) Top-k query processing and malicious node identification against data replacement attack in MANETs, Proc. Int’l Conf. on Mobile Data Management (MDM 2014), 279–288.

35) Hara, T. (2001) Effective replica allocation in ad hoc networks for improving data accessibility. Proc. IEEE INFOCOM 2001, 1568–1576.

36) Hara, T., Murakami, N. and Nishio, S. (2004) Replica allocation for correlated data items in ad-hoc sensor networks. ACM SIGMOD Rec. 33, 38–43.

37) Hara, T. and Madria, S.K. (2006) Data replication for improving data accessibility in ad hoc networks. IEEE Trans. Mobile Comput. 5, 1515–1532.

38) Shinohara, M., Hayashi, H., Hara, T. and Nishio, S. (2006) Replica allocation considering power consumption in mobile ad hoc networks, Proc. Int’l Workshop on Pervasive Wireless Networking (PWN 2006), 463–467.

39) Shinohara, M., Hara, T. and Nishio, S. (2007) Data replication considering power consumption in ad hoc networks, Proc. Int’l Conf. on Mobile Data Management (MDM 2007), 118–125.

40) Hara, T. and Madria, S.K. (2009) Consistency management strategies for data replication in mobile ad hoc networks. IEEE Trans. Mobile Comput. 8, 950–967.

41) Hayashi, H., Hara, T. and Nishio, S. (2003) Cache invalidation for updated data in ad hoc networks, Proc. Int’l Conf. on Cooperative Information Systems (CoopIS 2003), 516–535.

42) Hayashi, H., Hara, T. and Nishio, S. (2005) Updated data dissemination methods for updating old replicas in ad hoc networks. Pers. Ubiquitous Comput. 9, 273–283.

43) Hagihara, R., Shinohara, M., Hara, T. and Nishio, S. (2009) A message processing method for top-k query for traffic reduction in ad hoc networks, Proc. Int’l Conf. on Mobile Data Management (MDM 2009), 11–20.

44) Komai, Y., Sasaki, Y., Hara, T. and Nishio, S. (2014) KNN query processing methods in mobile ad hoc networks. IEEE Trans. Mobile Comput. 13, 1090–1103.

45) Komai, Y., Sasaki, Y., Hara, T. and Nishio, S. (2015) K nearest neighbor search for location-dependent sensor data in MANETs. IEEE Access 3, 942–954.

46) Komai, Y., Hara, T. and Nishio, S. (2015) Processing convex hull queries in MANETs, Proc. Int’l Conf. on Mobile Data Management (MDM 2015), 64–73.

47) Sasaki, Y., Hara, T. and Nishio, S. (2014) Top-k query processing for replicated data in mobile peer to peer networks. J. Syst. Softw. 92, 45–58.

48) Tsuda, T., Komai, Y., Hara, T. and Nishio, S. (2015) Signature-based top-k query processing against data replacement attacks in MANETs, Proc. IEEE Int’l Symp. on Reliable Distributed Systems (IEEE SRDS 2015), 130–139.

49) Tsuda, T., Komai, Y., Hara, T. and Nishio, S. (2016) Top-k query processing and malicious node identification based on node grouping in MANETs. IEEE Access 4, 993–1007.

50) Hara, T. (2003) Replica allocation methods in ad hoc networks with data update. ACM-Kluwer Journal on Mobile Networks and Applications 8, 343–354.

51) Hayashi, H., Hara, T. and Nishio, S. (2004) Replica allocation considering data update intervals in ad hoc networks, Proc. IFIP/IEEE Int’l Conf. on Mobile and Wireless Communication Networks (MWCN 2004), 131–142.

52) Hara, T., Hagihara, R. and Nishio, S. (2010) Data replication for top-k query processing in mobile wireless sensor networks, Proc. IEEE Int’l Conf. on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC 2010), 115–122.

53) Derhab, A. and Badache, N. (2009) Data replication protocols for mobile ad-hoc networks: A survey and taxonomy. IEEE Commun. Surv. Tutor. 11 (2), 33–51.

54) Padmanabhan, P., Gruenwald, L., Vallur, A. and Atiquzzaman, M. (2008) A survey of data replication techniques for mobile ad hoc network databases. VLDB J. 17, 1143–1164.

55) Chen, K. and Shen, H. (2015) Maximizing P2P file access availability in mobile ad hoc networks through replication for efficient file sharing. IEEE Trans. Comput. 64, 1029–1042.

56) Choi, J.-H., Shim, K.-S., Lee, S.K. and Wu, K.-L. (2012) Handling selfishness in replica allocation over a mobile ad hoc network. IEEE Trans. Mobile Comput. 11, 278–291.


57) Fiore, M., Casetti, C. and Chiasserini, C.-F. (2011) Caching strategies based on information density estimation in wireless ad hoc networks. IEEE Trans. Vehicular Technol. 60, 2194–2208.

58) Liu, Y., Han, Y., Yang, Z. and Wu, H. (2015) Efficient data query in intermittently-connected mobile ad hoc social networks. IEEE Trans. Parallel Distrib. Syst. 26, 1301–1312.

59) Ting, I.-W. and Chang, Y.-K. (2013) Improved group-based cooperative caching scheme for mobile ad hoc networks. J. Parallel Distrib. Comput. 73, 595–607.

60) Wu, W., Cao, J. and Fan, X. (2013) Design and performance evaluation of overhearing-aided data caching in wireless ad hoc networks. IEEE Trans. Parallel Distrib. Syst. 24, 450–463.

61) Xia, F., Ahmed, A.M., Yang, L.T., Ma, J. and Rodrigues, J.J.P.C. (2014) Exploiting social relationship to enable efficient replica allocation in ad-hoc social networks. IEEE Trans. Parallel Distrib. Syst. 25, 3167–3176.

62) Zhang, Y., Yin, L., Zhao, J. and Cao, G. (2012) Balancing the trade-offs between query delay and data availability in MANETs. IEEE Trans. Parallel Distrib. Syst. 23, 643–650.

63) Lim, S., Lee, W.-C., Cao, G. and Das, C.R. (2006) A novel caching scheme for improving Internet-based mobile ad hoc networks performance. Ad Hoc Netw. 4, 225–239.

64) Lim, S., Lee, W.-C., Cao, G. and Das, C.R. (2004) Performance comparison of cache invalidation strategies for Internet-based mobile ad hoc networks, Proc. of Int’l Conf. on Mobile Ad-hoc and Sensor Systems (MASS 2004), 104–113.

65) Tornell, S.M., Calafate, C.T., Cano, J.-C. and Manzoni, P. (2015) DTN protocols for vehicular networks: An application oriented overview. IEEE Commun. Surv. Tutor. 17, 868–887.

66) Nishiyama, H., Ito, M. and Kato, N. (2014) Relay-by-smartphone: realizing multihop device-to-device communications. IEEE Commun. Mag. 52 (4), 56–65.

67) Zaman, S. and Grosu, D. (2011) A distributed algorithm for the replica placement problem. IEEE Trans. Parallel Distrib. Syst. 22, 1455–1468.

68) Chen, K., Shah, S.H. and Nahrstedt, K. (2002) Cross-layer design for data accessibility in mobile ad hoc networks. Wirel. Pers. Commun. 21, 49–76.

69) Huang, Y., Sistla, P. and Wolfson, O. (1994) Data replication for mobile computers, Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD 1994), 13–24.

70) Aho, A.V., Hopcroft, J.E. and Ullman, J.D. (1974) The Design and Analysis of Computer Algorithms, Addison-Wesley.

71) Hayashi, H., Hara, T. and Nishio, S. (2004) Updated data dissemination in ad hoc networks, Proc. of Int’l Workshop on Ubiquitous Mobile Information and Collaboration Systems (UMICS 2004), 29–43.

72) Luo, J., Hubaux, J.P. and Eugster, P. (2003) PAN: Providing reliable storage in mobile ad hoc networks with probabilistic quorum systems, Proc. ACM MobiHoc’03, 1–12.

73) Manning, C.D., Raghavan, P. and Schutze, H. (2008) Introduction to Information Retrieval, Cambridge University Press.

(Received Apr. 25, 2016; accepted Feb. 27, 2017)

Profile

Takahiro Hara received the B.E., M.E., and Dr.E. degrees in Information Systems Engineering from Osaka University, Osaka, Japan, in 1995, 1997, and 2000, respectively. Currently, he is a Distinguished Professor of the Department of Multimedia Engineering, Osaka University. He has published more than 430 journal and conference papers in the areas of databases, mobile computing, peer-to-peer systems, WWW, and wireless networking. He served as general chair of IEEE SRDS 2014 and Mobiquitous 2016, and program chair of IEEE MDM’06/10, IEEE AINA’09/14, and IEEE SRDS’12. His research interests include distributed databases, peer-to-peer systems, mobile networks, and mobile computing systems. He has received more than 70 research awards, including the JSPS Prize from the Japan Society for the Promotion of Science. He is an ACM Distinguished Scientist, a senior member of IEEE, and a member of three other learned societies.
