
[INTERNATIONAL JOURNAL FOR RESEARCH & DEVELOPMENT IN TECHNOLOGY] Volume-4, Issue-4, Oct - 2015

ISSN (O) :- 2349-3585

www.ijrdt.org | copyright © 2014, All Rights Reserved.

DYNAMIC QUERY FORM BUCKETIZATION:- ORDERED/UNORDERED

Ashwini Ann Varghese 1

1 Dept. of Computer Science and Engineering, St. Joseph's College of Engineering and Technology, Palai

Abstract:- Concerns about security and about fast access to data in databases are rising in every sector, including banking, industry, healthcare, the military, government applications and educational institutions, because the number of attackers is growing day by day along with the number of internet users. These varied industries face several data availability and security issues. This paper proposes a scheme that addresses both the security of data and its fast retrieval by combining encryption, bucketization and dynamic query forms, so that data is exchanged securely and without faults. Encryption protects confidential information by converting the data into an unintelligible form. Ordered and unordered bucketization both provide security without giving direct access to the data residing in the database. A dynamic query form provides an efficient user interface based on the admin's choice, without a separate form having to be created for each enterprise or organization. In the proposed system, the data is encrypted using the Blowfish algorithm, and the resulting cipher text is then tagged with a bucket number derived from the bucket width that the admin provides for each column, which preserves privacy when database content is edited. The encryption algorithm used is secure and fast, as each step in the process depends fully on the key. A Dynamic Query Form is a user interface that captures a user's preference for form components and assists the user in making decisions when retrieving data.

Keywords: Encryption, Dynamic Query Form, Security, Bucketization.

1. INTRODUCTION

1.1 Overview

Interest in, and concern about, big data is growing in the modern world. Data has always been a part of every enterprise, application and business, whether big or small. It has always been present, it has become a part of people's lives, and its volume keeps growing. Research and practical experimentation are carried out in order to use data efficiently and with enhanced security. Information security is of the utmost importance in today's fast-developing era. People enjoy the convenient information exchange facilities provided through the internet, but these carry certain risks. Sensitive information that is transmitted might be intercepted or distorted by unintended observers or attackers who exploit weaknesses, whether for destruction or for entertainment. It is therefore of great importance to secure information that is in transit. Different mechanisms can be used to secure data in transit, including cryptography and bucketization.

1.2 Dynamic Query Form

Modern databases and internet applications contain very large amounts of data, and this data is heterogeneous. The amount of data produced by society increases every year. Real-world databases, such as web databases and other modern databases, use hundreds or even thousands of relations and attributes. There is therefore a need to access data more securely, without direct access to the data, and to access the data in a database dynamically using dynamic query forms. Earlier, data in real-world databases was accessed through predefined query forms, which cannot satisfy the varied dynamic queries that users issue against databases. The user interface used for querying a database is called a query form, and query forms are very widely used today. Much research has aimed at retrieving query results efficiently without altering database content and at improving system performance. The amount of data stored in databases on the internet is also increasing rapidly due to advances in information technology. These databases contain a wealth of valuable information that should be kept secure.

Earlier query forms, used in various information management systems, were designed and predefined by developers. With the rapid development of internet information, scientific and other modern databases became very large and complex. As the information in web databases grew, the number of entities used to store the data grew rapidly as well, so a set of static query forms that satisfies all database queries became difficult to design and implement. Existing database development and management tools provide various techniques that let users create queries on databases. The main drawback is that these queries depend on manual editing by the user: if the user is not familiar with the database schema beforehand, the hundreds or thousands of data entities and attributes are confusing. A novel database query form interface that generates query forms dynamically is therefore needed. The essence of a Dynamic Query Form is to capture a user's preference for form components and to assist the user in making decisions. Query form generation is an iterative process guided by the user, who adds components to the query form and can then obtain results for the query efficiently and without faults [2]. Sequential discovery of query results is a very important part of a dynamic query form system. Data mining is the non-trivial process of identifying valid, interesting, novel, useful and ultimately understandable data for the user.

1.3 Cryptography

Cryptography is the art of hiding information by transforming it into an unintelligible form, so that only someone who possesses the key and the algorithm can access the data.

Cryptography scrambles a message to convert it into an unintelligible form. The process of transforming data into an unintelligible (non-readable) form is termed encryption. In encryption, the information content, termed the plaintext, is transformed into a non-readable form, termed the cipher text, using an encryption algorithm and a key. The key used for encryption determines the way in which the message is encoded. The cipher text can be converted back to its original plain text, a process known as decryption. Anyone with the corresponding decryption algorithm and key can retrieve and read the information.

Encryption schemes can be classified by the keys they use into two types: symmetric encryption and asymmetric encryption. Symmetric encryption uses the same key for both encryption and decryption. Well-known symmetric-key algorithms include DES, AES and Blowfish. Asymmetric encryption uses different keys for encryption and decryption. The public key is the receiver's encryption key, which is published for anyone to use when encrypting messages. Only the receiving party has access to the decryption key, known as the private key, so only the receiver can read the encrypted messages. A well-known example of asymmetric encryption is RSA.
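
The public/private key split described above can be illustrated with a minimal Java sketch (Java is the front end named in Section 3.2.2). It relies on the standard KeyPairGenerator and Cipher classes. The class name AsymmetricRoundTrip, the 2048-bit key size and the OAEP padding are assumptions chosen for illustration; the paper does not use RSA in the proposed system.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Illustrative only: class name, key size and padding are assumptions.
class AsymmetricRoundTrip {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Generates the receiver's key pair: the public key is published, the private key is kept. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    /** Anyone holding the public key can encrypt a short message. */
    static byte[] encrypt(PublicKey publicKey, String message) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, publicKey);
            return cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Only the holder of the private key can recover the message. */
    static String decrypt(PrivateKey privateKey, byte[] cipherText) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, privateKey);
            return new String(cipher.doFinal(cipherText), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    public static void main(String[] args) {
        KeyPair receiver = newKeyPair();
        byte[] cipherText = encrypt(receiver.getPublic(), "confidential record");
        System.out.println(decrypt(receiver.getPrivate(), cipherText));
    }
}
```

In a symmetric scheme the same secret key would be passed to both the encrypting and the decrypting call, which is why the key itself must be shared securely beforehand.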


Figure 1.1: Symmetric Encryption

Figure 1.2: Asymmetric Encryption

1.4 Bucketization

Bucketization is a common technique for handling data. It is used to perform analysis on every value of an entity, for example the values in table columns such as product, store and date, and their attributes. Bucketization is used mainly because implementing an attribute based on the primary key of the table can have serious performance consequences. The secure bucketization system in this paper uses both ordered bucketization (OB) and unordered bucketization (UB), each as a cryptographic object.

The subsequent chapters are organized as follows:

Chapter 2 is a study of existing systems that covers the basics of dynamic query forms, bucketization and the different types of encryption mechanisms. It gives an overview of the techniques covered in the literature survey. Each section briefly describes the papers studied, the advantages and disadvantages of each method, and its relevance to the proposed method.

Chapter 3 describes in detail the steps of the proposed method, covering dynamic query form generation and the bucketization technique that the admin uses for efficient retrieval of data. The chapter ends with the experimental results and the performance of the system built from the model.

Chapter 4 concludes the project and proposes future work. It is followed by the references, which list the books and journals consulted for the study.

2. LITERATURE SURVEY

The literature survey gives a comprehensive review of previous work on dynamic query form generation and secure bucketization techniques. It discusses the advantages and disadvantages of each method, gives a basic idea of the concepts used in earlier systems, and explains the relevance of the steps and techniques involved in these approaches. The chapter also surveys the application areas in which dynamic query forms and secure bucketization are used. The survey covers most of the previous work on secure data mining for handling big data. The methods used in the existing systems and in the proposed system are compared and evaluated.

2.1 Secure Ordered Bucketization

The cryptographic object used in this study is ordered bucketization (OB). In ordered bucketization, the plaintext space is divided into p disjoint buckets, numbered from 1 to p according to the order of their ranges. Ordered bucketization is useful for range queries performed over encrypted data: because a bucket number is attached to each cipher text, there is no need to decrypt the entire encrypted data set. The paper proposes an encryption scheme with ordered bucketization (EOB) which has reasonable power. In ordered bucketization, p-1 points are selected from the uniform distribution over the plaintext space, and the plaintext space is divided at the selected points. A bucket number is assigned to each resulting range in ascending order. OB is efficient for range queries.
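
The division of the plaintext space described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the surveyed paper's implementation: the class name OrderedBucketizer, the integer plaintext space, the rule that a value equal to a cut point falls into the higher bucket, and the use of java.util.Random for sampling are all hypothetical choices.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.TreeSet;

// Illustrative sketch; names and the tie-breaking rule are assumptions.
class OrderedBucketizer {
    private final long[] cutPoints; // the p-1 division points, ascending

    OrderedBucketizer(long[] cutPoints) {
        this.cutPoints = cutPoints.clone();
        Arrays.sort(this.cutPoints);
    }

    /** Picks p-1 distinct cut points at random from the open interval (min, max). */
    static OrderedBucketizer random(long min, long max, int p, Random rng) {
        if (p < 1 || max - min - 1 < p - 1) {
            throw new IllegalArgumentException("plaintext space too small for p buckets");
        }
        TreeSet<Long> points = new TreeSet<>();
        while (points.size() < p - 1) {
            points.add(min + 1 + (long) (rng.nextDouble() * (max - min - 1)));
        }
        long[] cuts = new long[points.size()];
        int i = 0;
        for (long point : points) {
            cuts[i++] = point;
        }
        return new OrderedBucketizer(cuts);
    }

    /** Returns a bucket number in 1..p; larger values never get a smaller number. */
    int bucketOf(long value) {
        int index = Arrays.binarySearch(cutPoints, value);
        // binarySearch returns -(insertionPoint) - 1 when the value is not a cut point.
        int cutPointsAtOrBelow = index >= 0 ? index + 1 : -index - 1;
        return cutPointsAtOrBelow + 1;
    }

    /** Maps a plaintext range [low, high] to the inclusive range of bucket numbers covering it. */
    int[] bucketsForRange(long low, long high) {
        return new int[] {bucketOf(low), bucketOf(high)};
    }

    public static void main(String[] args) {
        OrderedBucketizer buckets = new OrderedBucketizer(new long[] {100, 200, 300});
        int[] range = buckets.bucketsForRange(120, 260);
        System.out.println("query touches buckets " + range[0] + " to " + range[1]);
    }
}
```

A range query then only needs the rows whose attached bucket number lies in the returned interval; rows from the two boundary buckets may still fall outside the requested range and have to be filtered after decryption.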

Since its first cryptographic treatment, order-preserving encryption (OPE) has attracted attention in the applied community. Applications use OPE to realize three main functions.

First, OPE is normally used to support range queries over encrypted data. In reference [3], the value in the date field of every eHR (electronic Health Record) was encrypted so that, with the eHRs stored in encrypted form, the user could still search for the eHRs generated in a specific period. For such applications, replacing OPE with the proposed scheme does not cause any problems, although the communication overhead is slightly higher.

The second function is to search for the record with the maximum or the minimum value in a set of encrypted records in a database. This function was used in a secure text document retrieval application [4]. In reference [4], a number of text documents are kept in encrypted form in a database, and the DBMS returns the encrypted document when it is given an encrypted query keyword. The score of a text document is proportional to the ratio of the number of occurrences of the queried keyword to the total number of words in the document. The word frequencies in a text document reveal a great deal about the document and should therefore be hidden from the DBMS. When a query keyword is received, the DBMS can determine in which encrypted text document the keyword occurs most frequently.

The third function is to count, given a large number of pairs of encrypted numbers, the pairs in which the first element is greater than the second. This function was used to search an encrypted image database in reference [5]. In this application, the features extracted from two images are compared to check whether the queried encrypted image is close to an encrypted image in a multimedia database. The features carry a large amount of information about the image, so they should be hidden from the multimedia DBMS if image privacy is required. This third type of function appears difficult to support with the proposed EOB, because EOB performs poorly when the two features cannot be compared or are similar within a specific range.

The main disadvantage of that system is that it lacks dynamic query forms, which would increase the performance and efficiency of the system.

2.2 Dynamic Query Forms for Database Queries

With the rapid development of web information and scientific databases, modern databases have become large and complex. Scientific databases and web databases maintain large, heterogeneous data and contain many relations and attributes. The query form is one of the most widely used user interfaces for querying databases.

It is difficult to design a set of static query forms that satisfies the various ad-hoc database queries on such complex databases, and the query forms used earlier cannot satisfy the varied queries that users issue. The paper proposes the Dynamic Query Form (DQF), a novel database query form interface that generates query forms dynamically. The essence of DQF is to capture a user's preference and to rank query form components. The generation of a query form is an iterative process guided by the user. In each iteration, the system automatically generates ranked lists of form components and the user adds the desired components to the query form. The user can fill in the query form and submit queries to view the query result at each iteration. In this way, a query form can be refined dynamically until the user is satisfied with the query results. In existing tools, by contrast, the creation of customized queries depends entirely on users' manual editing [6]. If a user is not familiar with the database schema in advance, the hundreds or thousands of data attributes are confusing.

The paper proposes a Dynamic Query Form system (DQF) [2], a query interface capable of dynamically generating query forms for users. The purpose of DQF is to capture user interests during user interactions and to adapt the query form iteratively. Each iteration consists of two types of user interaction: Query Form Enrichment and Query Execution. Figure 2.1 shows the flowchart of DQF. It starts with a basic query form that contains very few primary attributes of the database. The basic query form is then enriched iteratively through interactions between the user and the system until the user is satisfied with the query results.

The paper proposes a dynamic query form system that generates query forms according to the user's wishes at run time, providing a query interface for large and complex databases. The F-measure, a typical metric for evaluating query results, is applied to estimate the goodness of a query form. The metric is appropriate because query forms are designed to help users query the database, so the quality of a query form is determined by the query results it generates. Based on this estimate, the system ranks and recommends potential query form components so that users can refine the query form easily. Efficient algorithms are used to estimate the goodness of the projection and selection form components; efficiency matters because DQF is an online system whose users expect quick responses. The main disadvantage is that it has no secure ordered bucketization technique for storing the large, complex data.
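
For reference, the standard F-measure combines precision P and recall R of a query result as F = (1 + β²)·P·R / (β²·P + R), where β weights recall relative to precision. The sketch below computes this standard definition for a single query result. The class and method names are hypothetical, and the way DQF estimates the expected value of the metric for a candidate form component is not reproduced here.

```java
// Illustrative sketch of the standard F-measure; names are assumptions.
class FMeasure {
    /** F = (1 + beta^2) * P * R / (beta^2 * P + R); defined as 0 when P and R are both 0. */
    static double fMeasure(double precision, double recall, double beta) {
        if (precision < 0 || recall < 0 || beta <= 0) {
            throw new IllegalArgumentException("precision and recall must be >= 0, beta > 0");
        }
        double betaSquared = beta * beta;
        double denominator = betaSquared * precision + recall;
        return denominator == 0.0 ? 0.0 : (1 + betaSquared) * precision * recall / denominator;
    }

    /**
     * Computes the F-measure from result counts.
     * relevantReturned: returned rows the user wanted
     * returned:         all rows returned by the query form
     * relevant:         all rows the user wanted
     */
    static double fromCounts(int relevantReturned, int returned, int relevant, double beta) {
        double precision = returned == 0 ? 0.0 : (double) relevantReturned / returned;
        double recall = relevant == 0 ? 0.0 : (double) relevantReturned / relevant;
        return fMeasure(precision, recall, beta);
    }

    public static void main(String[] args) {
        // 4 rows returned, 3 of them wanted, 6 wanted rows exist in total.
        System.out.println(fromCounts(3, 4, 6, 1.0));
    }
}
```

With β = 1 precision and recall are weighted equally; a larger β favours forms whose results miss fewer of the rows the user wants.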

Figure 2.1: Flowchart of dynamic query form

Extracting information from large databases is a time-consuming activity. The paper [20] presents DynaCet, a domain-independent system that provides effective minimum-effort-based dynamic faceted search over enterprise databases. At every step, DynaCet suggests facets depending on the user's response in the previous step. Facets are selected based on their ability to rapidly select the most promising tuples, as well as on the ability of the user to provide desired values for them. The benefits of this method include faster access to information stored in databases while taking into account the variance in user knowledge and preferences. DynaCet's contributions include an efficient approach for generating facets for minimum-effort navigation over enterprise databases.

Recent research includes collaborative approaches that recommend database query components for database exploration [21]. Database management systems (DBMSs) provide various data management capabilities, yet tools for managing queries over the data have remained relatively primitive. Queries are typically issued through applications; they are debugged once and re-used repeatedly. This mode of interaction is changing. As scientists store and share large volumes of data in data centres, they analyse the data by issuing exploratory queries. In that paper, data management systems provide powerful query management capabilities, ranging from query browsing to automatic query recommendation. In a collaborative query management system, SQL queries are treated as items and similar queries are recommended to related users. However, that work does not consider the goodness of the query results.

The paper on query by output [22] proposes a method that recommends an alternative database query based on the results of a query. It differs from the work above in that its recommendation is a complete query, whereas the recommendation here is a query component for each iteration.

The paper Usher: Improving data quality with dynamic forms [23] develops an adaptive forms system for data entry, in which the form changes dynamically according to the data previously entered by the user. The present work differs in that it deals with database query forms rather than data-entry forms. Data quality is a critical problem in modern databases. The paper [23] proposes USHER, an end-to-end system for form design, entry and data quality assurance. USHER applies its model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible and reduces the complexity of error-prone questions. During entry, it dynamically adapts the form to the values being entered, providing real-time interface feedback, re-asking questions and simplifying questions by reformulating them. After entry, it re-asks the questions whose responses it deems likely to have been entered incorrectly.

3. PROPOSED METHOD

The method proposed in this study generates dynamic query forms and secures data using bucketization, authentication and encryption. The system proposes a new secure scheme that generates query forms dynamically and applies bucketization to the data in the database, thereby increasing data privacy and security. It is used for secure retrieval of data from the database without any alteration, which keeps the information exchanged between organizations confidential. The main feature of the system is that it works with any type of organisation, so it can be extended to any type of real-world application. For any application, the admin simply connects the database before the client connects. While connecting the database, the admin gives the bucket width of each column, and it is saved. The data in the table is converted into cipher text through encryption; the encryption technique used here is Blowfish. The cipher text, along with the bucket number, is added to the database in a newly created table. When a client logs in to the system, the database attached by the admin is loaded dynamically, and data can be added to the database based on the client's choice. Through this process a high degree of information security is achieved.
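
The encrypt-then-tag step described above might look like the following Java sketch. The paper states only that Blowfish is used and that the bucket number is based on the bucket width supplied by the admin; everything else here is an assumption made for illustration: the class name BucketTaggedCipher, CBC mode with a random IV, the rule bucket = floor(value / width) for numeric columns, and the "bucket:Base64" storage format.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch; cipher mode, bucket rule and storage format are assumptions.
class BucketTaggedCipher {
    private static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding";
    private static final int IV_BYTES = 8; // Blowfish has a 64-bit block

    /** Ordered bucket number for a numeric column value, given the column's bucket width. */
    static long bucketNumber(long value, long bucketWidth) {
        if (bucketWidth <= 0) {
            throw new IllegalArgumentException("bucket width must be positive");
        }
        return Math.floorDiv(value, bucketWidth);
    }

    /** Encrypts the value and returns "bucket:Base64(iv || cipherText)". */
    static String encryptAndTag(SecretKey key, long value, long bucketWidth) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] cipherText = cipher.doFinal(Long.toString(value).getBytes(StandardCharsets.UTF_8));
            byte[] ivAndCipherText = new byte[iv.length + cipherText.length];
            System.arraycopy(iv, 0, ivAndCipherText, 0, iv.length);
            System.arraycopy(cipherText, 0, ivAndCipherText, iv.length, cipherText.length);
            return bucketNumber(value, bucketWidth) + ":"
                    + Base64.getEncoder().encodeToString(ivAndCipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reads the bucket number without touching the cipher text. */
    static long bucketOfTagged(String tagged) {
        return Long.parseLong(tagged.substring(0, tagged.indexOf(':')));
    }

    /** Recovers the original value from a tagged entry. */
    static long decryptTagged(SecretKey key, String tagged) {
        try {
            byte[] ivAndCipherText = Base64.getDecoder().decode(tagged.substring(tagged.indexOf(':') + 1));
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(ivAndCipherText, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(ivAndCipherText, IV_BYTES, ivAndCipherText.length - IV_BYTES);
            return Long.parseLong(new String(plain, StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    public static void main(String[] args) {
        // Hard-coded demo key for illustration only; a real key must be generated and stored securely.
        SecretKey key = new SecretKeySpec("demo-key-16bytes".getBytes(StandardCharsets.UTF_8), "Blowfish");
        String stored = encryptAndTag(key, 4250L, 1000L);
        System.out.println(bucketOfTagged(stored) + " / " + decryptTagged(key, stored));
    }
}
```

Because the bucket number is stored in the clear next to the cipher text, the server can select candidate rows by bucket without decrypting them; a smaller bucket width gives more precise selection but reveals more about the underlying value.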

The main techniques used by the system are:

1. Authentication: Authentication, or the setting of credentials, acts as the first line of defence. Authentication is a process in which the credentials provided (username and password) are checked against a file or database of authorized users' information held in the system or on a server. If the username and password match, the process is complete and the user is granted access. The user's permissions in the system are based on the authentication. Credentials of this kind are cheap to deploy. In this system, authentication separates the admin's choices from the users' choices.

2. Bucketization: Bucketization is the technique used for handling the data. It is used to perform analysis on every value of an entity, for example the values in table columns such as product, store and date, and their attributes. Bucketization is used mainly because implementing an attribute based on the primary key of the table can have serious performance consequences. There are two types of bucketization technique: ordered bucketization and unordered bucketization.

3. Cryptography: Cryptography is the science of secret communication. It converts a secret into an unintelligible form so that an unauthorized person cannot read it. The main component of that science used in this system is encryption, an algorithm or process that hides information by making its content secret. The hidden information is known as cipher text. To make the hidden information accessible the user needs a key, so the admin hides the information from everybody except those who possess the key. Encryption schemes are classified by the keys they use into two types: symmetric encryption and asymmetric encryption. Symmetric encryption uses the same key for both encryption and decryption; DES, AES and Blowfish are well-known symmetric-key algorithms. Asymmetric encryption uses different keys for encryption and decryption; RSA is a well-known asymmetric-key algorithm. The public key is the receiver's encryption key, which is published for anyone to use to encrypt messages. Only the receiving party has access to the decryption key, known as the private key, so only the receiver can read the encrypted messages.

4. Dynamic Query Form Generation: The user interface used for querying a database is called a query form. Query forms are widely used to retrieve query results from databases efficiently. The Dynamic Query Form captures a user's preference for form components and assists the user in making decisions. Query form generation is an iterative process guided by the user, who adds components to the query form and can then obtain results for the query efficiently and without faults [2].

These steps will be explained later in this chapter.

Because of the large amount of information held in databases and other sources, the valid data that is to be transmitted to an authorized user must be handled correctly, and considerable risks are involved. Data that is to be retrieved should be gathered carefully. An organization's data may include many types of information serving a wide variety of applications, all of it stored in a database. Dynamic Query Form Bucketization is therefore used to retrieve the data in the database efficiently while providing security.

The system is designed to provide a secure scheme for sharing information among users without any alteration of the database, because users are restricted from accessing the data directly. The application can be used in many potential scenarios, such as a help desk application in an enterprise or other business settings. The bucketization technique and the encryption mechanism are the other main factors that determine the performance of the system. An admin-side component is responsible for loading the database for users so that data can be transferred among registered users efficiently.

The system is implemented as three main modules: the admin module, the users module and the dynamic query form bucketization server module. The admin module deals with the confidentiality and security of the data in the database that is to be sent to the user. The users module supports interaction with the database content efficiently, without faults and without altering the content. The dynamic query form bucketization module secures the database by maintaining the bucket number for each data item, and it is responsible for encrypting each item of content in the database.

Figure 3.1 shows the basic interaction between the admin, the users and the dynamic query form bucketization server module.


Figure 3.1: Interaction between admin, users and DQFB server

3.1 System Architecture

Figure 3.2 shows the system architecture of the proposed method for dynamic query form bucketization. The admin can add any database to the system, depending on the organisation in which the application is deployed, but has to supply the corresponding username and password of the relational database management system (RDBMS). The RDBMS used by this application is MySQL, and the databases used by the application reside in it. The columns of the selected database are added to the application, and the admin sets the bucket width of each column. When the bucket widths are saved, two main steps take place: the database contents are encrypted, and the bucket number is attached to the encrypted data. The admin has the privilege to choose ordered or unordered bucketization. The user has the privilege to access the data without altering the content of the database. A plain query entered in the dynamic query form returns the matching database contents from the system.
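The paper does not specify how the bucket number is attached to the encrypted data. As a minimal sketch, assuming a stored value of the form `bucket:Base64(ciphertext)`, the pairing could look like the following. The class name `StoredCell` and the `:` separator are illustrative assumptions, not the paper's format.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical storage format: "<bucket number>:<Base64 of the cipher text>".
final class StoredCell {
    final int bucket;
    final byte[] cipherText;

    StoredCell(int bucket, byte[] cipherText) {
        if (bucket < 1) {
            throw new IllegalArgumentException("bucket numbers start at 1");
        }
        this.bucket = bucket;
        this.cipherText = Arrays.copyOf(cipherText, cipherText.length);
    }

    // Value written to the database column in place of the plaintext.
    String encode() {
        return bucket + ":" + Base64.getEncoder().encodeToString(cipherText);
    }

    // Splits a stored value back into bucket number and cipher text.
    // The Base64 alphabet has no ':', so the first ':' is the separator.
    static StoredCell decode(String stored) {
        int sep = stored.indexOf(':');
        if (sep < 1) {
            throw new IllegalArgumentException("missing bucket prefix");
        }
        int bucket = Integer.parseInt(stored.substring(0, sep));
        byte[] body = Base64.getDecoder().decode(stored.substring(sep + 1));
        return new StoredCell(bucket, body);
    }
}
```

With this layout the server can compare bucket numbers without ever holding the decryption key.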

Figure 3.2: Dynamic Query Form Bucketization

3.2 System Requirement

The main hardware and software requirements of the proposed system are given below:-

3.2.1 Hardware Specification

Processor : Pentium 1 GHz or Above

Hard Disk : 40 GB

Monitor : 1024 x 768 VGA Color Monitor

Memory : 512 MB RAM

Keyboard : 101/102 Natural Keyboard

Mouse : PS/2 Compatible

3.2.2 Software Specification

Operating System : Windows

Front End : Java

Back End : MySQL

Application Development Software : NetBeans IDE

3.3 Modules

The modules of dynamic query form bucketization (ordered/unordered) are described in detail below:-

3.3.1 Authentication

Authentication, or the setting of credentials, acts as the first line of defence.


Authentication is a process in which the credentials provided (username and password) are checked against the records of authorized users held in a database on the system or on a server. If the username and password match, the process is completed and the user is granted access. The permissions of the user in the system are based on this authentication. Password-based credentials are cheap to deploy. In this system, authentication separates the choices available to the admin from those available to users. It plays a major role in separating the folders returned according to privileges that are set beforehand, and thereby separates the environment that each user sees. It also determines how the user interacts with the system, including access and other rights such as the amount of allocated storage space.

The authentication process in the proposed system distinguishes the administrator from ordinary users, so their access rights differ. The process of checking an account's permissions for accessing resources is referred to as authorization. The preferences and privileges granted to an authorized account depend on the user's permissions, which are stored either locally or on the server. In the proposed system, the user's permissions are derived from authentication and are resolved by the application at runtime. New users can sign up by clicking the corresponding button in the application, but they cannot obtain the privileges of the administrator. There is only one administrator in this application, and the admin has every privilege.
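The login and role separation described above can be sketched as follows. This is only an illustration under stated assumptions: the class `LoginSketch`, the `Role` enum and the in-memory account map are hypothetical, and the salted SHA-256 hash stands in for whatever credential check the application actually performs (a production system would normally use a slow password hash such as PBKDF2).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: one fixed administrator, any number of signed-up users.
final class LoginSketch {
    enum Role { ADMIN, USER }

    private static final class Account {
        final byte[] passwordHash;
        final Role role;

        Account(byte[] passwordHash, Role role) {
            this.passwordHash = passwordHash;
            this.role = role;
        }
    }

    private final Map<String, Account> accounts = new HashMap<>();

    LoginSketch(String adminName, String adminPassword) {
        accounts.put(adminName, new Account(hash(adminName, adminPassword), Role.ADMIN));
    }

    // Sign-up always creates an ordinary user, never an administrator.
    boolean signUp(String name, String password) {
        if (accounts.containsKey(name)) {
            return false;
        }
        accounts.put(name, new Account(hash(name, password), Role.USER));
        return true;
    }

    // Authentication: returns the role on success, empty on any mismatch.
    Optional<Role> authenticate(String name, String password) {
        Account account = accounts.get(name);
        if (account == null
                || !MessageDigest.isEqual(account.passwordHash, hash(name, password))) {
            return Optional.empty();
        }
        return Optional.of(account.role);
    }

    // Authorization: only the administrator may open the admin module.
    static boolean mayOpenAdminModule(Optional<Role> role) {
        return role.isPresent() && role.get() == Role.ADMIN;
    }

    // Assumption: the user name doubles as the salt to keep the sketch short.
    private static byte[] hash(String salt, String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(salt.getBytes(StandardCharsets.UTF_8));
            md.update((byte) 0);
            return md.digest(password.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```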

Authentication :-

Authentication checks whether the person accessing the data is authorised and identifies the type of user who is accessing it. It gives the user confidence that the data is secure and originates from a specific known source. The authentication tests were completed successfully. The login was tested with an invalid user name and with a valid user name combined with an invalid password. The software was also tested to see whether it allows access without any identification.

Authorization :-

Authorization is the process of determining whether a user is allowed to receive a particular service or perform an operation. It also determines whether access to a particular location or resource is restricted. The authorization test was executed successfully. The software identifies the user as a normal user or the admin and allows a normal user to access only the user (client) module. The test checked that the system prevents unauthorized users from accessing privileged modules, that is, that ordinary users are kept out of the modules reserved for the administrator.

Availability :-

Availability assures that the information stored in the database is ready for use when it is expected, so that it is available to people when they need it. The software was tested for availability, and the data were readily available to users while the security of the system was preserved.

Confidentiality:-

Confidentiality is a security property that protects against the disclosure of information to parties other than the intended recipient. The system was tested for confidentiality, which it ensures through encryption and the bucketization technique. The decryption key is available only to the authorized user. The admin has the privilege to access the buckets and can choose whether ordered or unordered bucketization is used for a particular application. This confidential information is hidden from the normal users who access data from the database.

3.3.2 Bucketization Technique

Bucketization is a common technique for handling data. It is used to perform analysis on every value of an entity, for example the values in table columns such as product, store and date, and their attributes. It is used mainly because serious performance problems can arise when an attribute is implemented directly on the primary key of a table. The secure bucketization system described here supports both ordered bucketization (OB) and unordered bucketization (UB), and both are used as cryptographic objects.

In ordered bucketization, the plaintext space (the original data) is divided into a pre-defined number of buckets, which the admin assigns while loading the database. Let p be the number of buckets assigned to a particular column. Each bucket is given a number from 1 to p, and the numbers follow the order of the plaintext ranges. With bucketization, various types of SQL queries over encrypted data are possible, provided that the bucket number corresponding to the original plaintext is stored with each encrypted item and that the client holds the decryption key. Range queries in particular can be used with ordered bucketization. For example, if a program running on the client side wants to retrieve the data in the range between 100,000 and 200,000, it first calculates the numbers of the buckets whose union is the smallest set that covers the queried range. The client program sends these bucket numbers to the database server. The database server finds all the encrypted data whose bucket number is one of the received numbers and sends the data back to the client. The client obtains the correct result by decrypting the data and filtering out the items that are not in the range. In this case, a larger amount of data is transmitted between the client and the server than when the database stores unencrypted data items, because of the false positives that occur when a bucket holds both data the client wants and data it does not want. On the other hand, this approach is very efficient compared with the case where the client receives all the encrypted data from the server and decrypts every item to obtain the correct query result. This method is therefore very useful when users cannot store their data without encryption, for example in a cloud computing environment.
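The range-query steps for ordered bucketization can be sketched as follows, assuming equal-width buckets over a numeric column. The class `OrderedBuckets` and its parameters (`min`, `width`, `count`) are illustrative assumptions; the paper only states that the admin sets a bucket width per column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of ordered, equal-width bucketization for one numeric column.
final class OrderedBuckets {
    private final long min;    // smallest plaintext value covered
    private final long width;  // bucket width chosen by the admin
    private final int count;   // number of buckets p

    OrderedBuckets(long min, long width, int count) {
        if (width <= 0 || count <= 0) {
            throw new IllegalArgumentException("width and count must be positive");
        }
        this.min = min;
        this.width = width;
        this.count = count;
    }

    // Buckets are numbered 1..p in increasing order of the values they cover.
    int bucketOf(long value) {
        if (value < min || value >= min + width * count) {
            throw new IllegalArgumentException("value outside the plaintext space");
        }
        return (int) ((value - min) / width) + 1;
    }

    // Smallest set of buckets whose union covers the closed range [low, high].
    List<Integer> bucketsCovering(long low, long high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        List<Integer> result = new ArrayList<>();
        int last = bucketOf(high);
        for (int b = bucketOf(low); b <= last; b++) {
            result.add(b);
        }
        return result;
    }

    // Client-side step after decryption: drop the false positives.
    static List<Long> filter(List<Long> decrypted, long low, long high) {
        List<Long> kept = new ArrayList<>();
        for (long v : decrypted) {
            if (v >= low && v <= high) {
                kept.add(v);
            }
        }
        return kept;
    }
}
```

With `min = 0`, `width = 50,000` and `count = 10`, the query range 100,000 to 200,000 maps to buckets 3, 4 and 5. Bucket 5 also holds values above 200,000, which are the false positives the client filters out after decryption.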

In unordered bucketization, the plaintext space (the original data) is likewise divided into a pre-defined number of buckets, which the admin assigns while loading the database. Let p be the number of buckets assigned to a particular column. Each bucket is given a number from 1 to p, but the numbers do not follow the order of the plaintext ranges. A pseudorandom algorithm is called to generate the numbers, which are stored in an array. Queries can use the unordered buckets to access the secure data. For example, if a program running on the client side wants to retrieve the data in the range between 100,000 and 200,000, it first divides the plaintext space into equal intervals and maps each interval to the array where the random numbers are stored. The client program sends the resulting bucket numbers to the database server. The database server uses the pseudorandom bucket numbers to find all the encrypted data whose bucket number is one of the received numbers, and then sends the data back to the client. The client obtains the correct result by decrypting the data and filtering out the items that are not in the range. In this case, the scheme is stated to transmit a smaller amount of data between the client and the server than when the database stores unencrypted data items, and the false positives that occur can be avoided by efficiently using a bucket that holds the data the client wants to retrieve. This approach is very efficient compared with the case where the client receives all the encrypted data from the server and decrypts every item to obtain the correct query result.
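One way to read the unordered scheme is as a pseudorandom permutation of the bucket labels 1..p stored in an array. The sketch below makes that assumption explicit. The class `UnorderedBuckets` and the `seed` parameter are hypothetical, and `java.util.Random` is used only so that the example is reproducible; it is not a cryptographically secure generator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: equal-width intervals whose labels are a shuffled 1..p.
final class UnorderedBuckets {
    private final long min;
    private final long width;
    private final int[] label; // label[i] is the number stored for the i-th interval

    UnorderedBuckets(long min, long width, int count, long seed) {
        if (width <= 0 || count <= 0) {
            throw new IllegalArgumentException("width and count must be positive");
        }
        this.min = min;
        this.width = width;
        List<Integer> labels = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            labels.add(i);
        }
        // Assumption: a seeded shuffle stands in for the paper's pseudorandom algorithm.
        Collections.shuffle(labels, new Random(seed));
        this.label = new int[count];
        for (int i = 0; i < count; i++) {
            this.label[i] = labels.get(i);
        }
    }

    // Position of the interval that contains the value, then its shuffled label.
    int bucketOf(long value) {
        if (value < min || value >= min + width * label.length) {
            throw new IllegalArgumentException("value outside the plaintext space");
        }
        return label[(int) ((value - min) / width)];
    }

    // Labels of all intervals that intersect [low, high]; the labels reveal no order.
    List<Integer> bucketsCovering(long low, long high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        if (low < min || high >= min + width * label.length) {
            throw new IllegalArgumentException("range outside the plaintext space");
        }
        List<Integer> result = new ArrayList<>();
        int first = (int) ((low - min) / width);
        int last = (int) ((high - min) / width);
        for (int i = first; i <= last; i++) {
            result.add(label[i]);
        }
        return result;
    }
}
```

Only a party that knows the seed (or the stored array) can translate a plaintext range into bucket numbers, which is the point of removing the order.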

Bucketization makes the indexing of encrypted data faster, so that content in the system can be searched efficiently.


3.3.3 Cryptography

Security is one of the most challenging aspects of day-to-day real-world applications. Cryptography is one of the main categories of computer security; it converts information from its normal form into an unreadable form. The two main characteristics that identify and differentiate one encryption algorithm from another are its ability to secure the protected data against attacks and its speed and efficiency in doing so. Cryptography is usually referred to as “the study of secret writing”. Encryption is the process of converting normal text into an unreadable form, and decryption is the process of converting encrypted text back into readable text. In other words, encryption is an algorithm that hides information by keeping its content secret. The hidden information is known as cipher text. To make the hidden information accessible, the user needs a key, so the admin hides the information from everybody except those who possess the key. Encryption is a direct application of cryptography to protecting information. Encryption schemes fall into two types according to the keys used: symmetric encryption and asymmetric encryption. Symmetric encryption uses the same key for both encryption and decryption; DES, AES and Blowfish are well-known symmetric key algorithms. Asymmetric encryption uses different keys for encryption and decryption; RSA is a well-known asymmetric key algorithm. The public key is the receiver's encryption key, which is published so that anyone can use it to encrypt messages. Only the receiving party has access to the decryption key, known as the private key, so only the receiver can read the encrypted messages.

Cryptography is the art of hiding information by transforming it into an unintelligible form, so that only someone in possession of the key and the algorithm can access the data.

The process of transforming data into an unintelligible (non-readable) form is termed encryption. In encryption, the information content, termed the plaintext, is transformed into a non-readable form using the encryption algorithm and the key. The resulting unintelligible content is termed the cipher text. The key used for encryption determines the way in which a message is encoded. The cipher text can be converted back to its original form, the plain text; this process is known as decryption. Anyone with the corresponding decryption algorithm and the key can retrieve and read the information.

The encryption process can be classified according to the keys used into two types: symmetric encryption and asymmetric encryption. Symmetric encryption uses the same key for both encryption and decryption. Well-known examples of symmetric key encryption are DES, AES and Blowfish.
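Symmetric encryption and decryption with Blowfish can be sketched with the standard `javax.crypto` API. The paper does not state the cipher mode, padding or key handling it uses, so the CBC mode, PKCS5 padding, random IV and 128-bit key below are assumptions, as is the class name `BlowfishSketch`.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical sketch: the same secret key encrypts and decrypts (symmetric encryption).
final class BlowfishSketch {
    private static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding"; // assumed mode
    private static final int BLOCK_BYTES = 8; // Blowfish works on 64-bit blocks

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("Blowfish");
            generator.init(128); // key size as listed in Table 1
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns a fresh random IV followed by the cipher text.
    static byte[] encrypt(SecretKey key, String plainText) {
        try {
            byte[] iv = new byte[BLOCK_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] body = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            byte[] out = Arrays.copyOf(iv, iv.length + body.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the first block and decrypts the rest with the same key.
    static String decrypt(SecretKey key, byte[] ivAndBody) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(ivAndBody, 0, BLOCK_BYTES));
            byte[] plain = cipher.doFinal(ivAndBody, BLOCK_BYTES, ivAndBody.length - BLOCK_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A round trip such as `decrypt(key, encrypt(key, "salary=150000"))` returns the original text, while a different key fails to decrypt or yields garbage.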

3.3.4 Dynamic Query Form Generation

Dynamic query form generation is a very important task in this work, because the quality and reliability of the available information directly affect the results obtained. The user interface used for querying a database is called a query form, and query forms are now very widely used. Much research has examined how to retrieve query results from databases efficiently, without altering the database content, and how to improve system performance. The amount of data stored in databases on the internet is also increasing rapidly with advances in information technology. These databases contain a wealth of valuable information that should be kept secure.


Earlier query forms, used in various information management systems, were designed and predefined by developers. With the rapid development of internet information, scientific and other modern databases have become very large and complex. As the information in web databases increased, the number of entities for storing data grew rapidly, so it became difficult to design a set of static query forms that satisfies the variety of database queries. Existing database development and management tools provide various techniques that let users create queries on databases. The main drawback is that these queries depend on manual editing by the user: if the user is not familiar with the database schema beforehand, the hundreds or thousands of data entities and attributes are confusing. There is therefore a need for a database query interface that generates query forms dynamically. The essence of a Dynamic Query Form is to capture a user's preference for components and to assist the user in making decisions. Query form generation is an iterative process guided by the user, who adds components to the form. The user can then obtain the results of the query efficiently and without faults [2]. Sequential discovery of query results is a very important part of a dynamic query form system. Data mining is a non-trivial process of identifying valid, interesting, novel, useful and ultimately understandable data for the user.
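As a rough illustration of how user-selected form components can become a query, the sketch below collects result columns and range conditions and emits a parameterized SQL string. The class `QueryFormSketch` and its methods are hypothetical and are not the form generator of [2]. The sketch shows only the plain query; in the proposed system a range would additionally be translated into bucket numbers before reaching the encrypted database.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a form is a set of components chosen by the user at run time.
final class QueryFormSketch {
    private final String table;                 // assumed to be supplied by the admin
    private final Set<String> schemaColumns;    // columns loaded by the admin
    private final List<String> shown = new ArrayList<>();
    private final Map<String, long[]> ranges = new LinkedHashMap<>();

    QueryFormSketch(String table, Collection<String> schemaColumns) {
        this.table = table;
        this.schemaColumns = new LinkedHashSet<>(schemaColumns);
    }

    // Adds a result column component to the form.
    QueryFormSketch show(String column) {
        shown.add(checked(column));
        return this;
    }

    // Adds a range condition component to the form.
    QueryFormSketch between(String column, long low, long high) {
        ranges.put(checked(column), new long[] {low, high});
        return this;
    }

    // Builds the SQL text with placeholders; the values are bound separately.
    String toSql() {
        StringBuilder sql = new StringBuilder("SELECT ");
        sql.append(shown.isEmpty() ? "*" : String.join(", ", shown));
        sql.append(" FROM ").append(table);
        String glue = " WHERE ";
        for (String column : ranges.keySet()) {
            sql.append(glue).append(column).append(" BETWEEN ? AND ?");
            glue = " AND ";
        }
        return sql.toString();
    }

    // Values for the placeholders, in the order they appear in toSql().
    List<Long> parameters() {
        List<Long> values = new ArrayList<>();
        for (long[] range : ranges.values()) {
            values.add(range[0]);
            values.add(range[1]);
        }
        return values;
    }

    // Only column names from the loaded schema may reach the SQL text.
    private String checked(String column) {
        if (!schemaColumns.contains(column)) {
            throw new IllegalArgumentException("unknown column: " + column);
        }
        return column;
    }
}
```

Because the user only picks from columns the admin has loaded, the form can be extended step by step without the user typing SQL or knowing the schema in advance.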


3.4 Performance Evaluation

This section evaluates the advantages of the proposed system compared with existing systems. The main advantages of the proposed system are the following:-

1) Database Connectivity:-

Existing applications connect to one fixed database for the entire lifecycle of the system. The proposed system overcomes this drawback by connecting to any relational database. The admin simply has to give the name of the database that is to be queried. The system works according to how the admin loads the content of the database, so any organization, business or enterprise application can use the proposed system. Users can query the databases with appropriate content without knowing the table names or column names used by the organization, which in turn improves the performance of the system. Relational databases with any number of columns can be added to the system.
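Loading the columns of an arbitrary database can be sketched with standard JDBC metadata calls. The class `SchemaLoader`, its method names and the host and port parameters are illustrative assumptions; the sketch also assumes a MySQL JDBC driver on the classpath and that the driver exposes the database name as the JDBC catalog.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: discover tables and columns of whichever database the admin names.
final class SchemaLoader {
    // Builds a JDBC URL of the usual MySQL form for the named database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Reads every table and column name through the standard JDBC metadata API.
    static Map<String, List<String>> columnsByTable(Connection connection, String database)
            throws SQLException {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        DatabaseMetaData meta = connection.getMetaData();
        try (ResultSet rs = meta.getColumns(database, null, "%", "%")) {
            while (rs.next()) {
                columns.computeIfAbsent(rs.getString("TABLE_NAME"), t -> new ArrayList<>())
                       .add(rs.getString("COLUMN_NAME"));
            }
        }
        return columns;
    }

    // Connects with the RDBMS credentials supplied by the admin and loads the schema.
    static Map<String, List<String>> load(String host, int port, String database,
                                          String user, String password) throws SQLException {
        try (Connection connection =
                 DriverManager.getConnection(jdbcUrl(host, port, database), user, password)) {
            return columnsByTable(connection, database);
        }
    }
}
```

The returned map is what a form generator would need in order to offer columns to the user without the user knowing table or column names.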

2) Use of Blowfish Algorithm :-

Among the symmetric key cryptographic algorithms AES, DES and Blowfish, Blowfish outperforms the others by taking less time to encrypt and decrypt the contents of the database. Blowfish outperforms the others even where the block sizes differ.

The plaintext is divided into blocks according to the algorithm settings given in Table 1.

Algorithm    Key Size (Bits)    Block Size (Bits)
DES          64                 64
AES          128                128
Blowfish     128                64

Table 1: Key size and block size of the DES, AES and Blowfish algorithms


Figure 3.7: Data Block Size vs Execution Time (Sec) of AES, DES and Blowfish algorithms

Figure 3.7 shows the superiority of the Blowfish algorithm over AES and DES in terms of processing time. It also shows that AES consumes more resources when the data block size is relatively large. The figure indicates that Blowfish performs better than the other common encryption algorithms considered, which makes it the better choice for faster computing. Since Blowfish had no known practical security weakness at the time of writing, it can be considered a suitable encryption algorithm for this system. AES showed poorer performance than the other algorithms because it requires more processing power, but the extra time is not significant for many real-world applications, especially those that require more secure encryption of relatively large data blocks.
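A comparison of this kind can be repeated with a small timing harness such as the one below. It is a sketch, not the measurement setup behind Figure 3.7: the class `CipherTiming`, the ECB mode (used here only to time the raw cipher, not to protect data) and the data sizes are assumptions. Results depend heavily on the JVM and hardware; on processors with AES instructions, AES is often the fastest of the three, so the ranking in the figure may not reproduce.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical timing harness for DES, AES and Blowfish encryption.
final class CipherTiming {
    // Returns the elapsed nanoseconds for encrypting `data` `rounds` times.
    static long time(String algorithm, int keyBits, byte[] data, int rounds) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance(algorithm);
            generator.init(keyBits);
            SecretKey key = generator.generateKey();
            // ECB is used only to time the block cipher itself; it is not a safe mode for real data.
            Cipher cipher = Cipher.getInstance(algorithm + "/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                cipher.doFinal(data);
            }
            return System.nanoTime() - start;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20]; // 1 MiB of zero bytes as sample input
        int rounds = 10;
        // The JCE key generator for DES takes the effective key size of 56 bits.
        System.out.println("DES      ns: " + time("DES", 56, data, rounds));
        System.out.println("AES      ns: " + time("AES", 128, data, rounds));
        System.out.println("Blowfish ns: " + time("Blowfish", 128, data, rounds));
    }
}
```

For meaningful numbers the measurement should be repeated after a warm-up phase, since the first runs include JIT compilation time.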

3) Faster indexing for accessing encrypted databases:-

The bucketization technique makes the querying process easy, and range queries in particular perform better than other query types. Bucketization provides faster access to data: only the content that the user needs is taken, and the remaining data stays secured without even being decrypted. Only the data corresponding to the requested bucket numbers are decrypted, which also contributes to the security of the system. Users never access the stored data directly; only the data a user needs is retrieved.

The running time for searching and retrieving data is lower in the proposed system than in the existing system, because only the data that the user needs are taken and decrypted. In the earlier system, when a user queried for a particular result, the entire database had to be decrypted to send accurate results to the user. With the bucketization technique, only the data needed by the end user are decrypted at runtime, because the bucket numbers are checked when retrieving the results, which makes the system run much faster.
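The effect of checking bucket numbers before decrypting can be illustrated with a small in-memory index. This is a sketch under assumptions: the class `BucketIndexSketch` is hypothetical, and a real deployment would keep the bucket number in an indexed database column rather than in a map.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical server-side index: encrypted rows grouped by bucket number.
final class BucketIndexSketch {
    private final Map<Integer, List<String>> rowsByBucket = new HashMap<>();
    private int total;

    // Stores an encrypted row under its bucket number.
    void add(int bucket, String encryptedRow) {
        rowsByBucket.computeIfAbsent(bucket, b -> new ArrayList<>()).add(encryptedRow);
        total++;
    }

    // Returns only the rows whose bucket was requested; nothing else is touched.
    List<String> candidates(Collection<Integer> buckets) {
        List<String> out = new ArrayList<>();
        for (int bucket : buckets) {
            out.addAll(rowsByBucket.getOrDefault(bucket, Collections.emptyList()));
        }
        return out;
    }

    int totalRows() {
        return total;
    }
}
```

If ten buckets hold two rows each and a query maps to three buckets, six of the twenty rows are returned and decrypted; the other fourteen stay encrypted on the server.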

4) Use of Dynamic Query Form:-

The proposed system allows end-users to customize the existing query form at run time, even when they are not familiar with the database. If the database connected at run time is very large, it would otherwise be difficult for them to find database entities and attributes and to create the desired query forms. The dynamic query form lets users obtain query results faster.

These are the main performance differences between the existing systems and the proposed system.

CONCLUSION

The proposed system provides a new, secure and efficient method of dynamic query form bucketization that uses both ordered and unordered bucketization. The method gives particular importance to range queries issued through application interfaces, and it supports them efficiently. The study shows that this approach selects the appropriate data from the database efficiently, thereby improving the accuracy of the system, and that it preserves a high level of security compared with existing methods. Bucketization, encryption and dynamic query form generation were the important tasks carried out in this work. Since the quality and reliability of the system directly affect the results obtained, retrieving useful and correct information was an important task. There is no need to write a query to retrieve data, because queries are obtained dynamically through an efficient interface. A dynamic query form generation approach was implemented that helps users generate query forms dynamically. It aims to capture user preferences at runtime and generate the accurate query required by the user. The dynamic approach can lead to a higher success rate and can use simpler query forms than a static approach. It can also add a text box in which users enter keyword queries. A bucketization technique was constructed efficiently that works on top of any secure symmetric encryption scheme [15].

Encryption scrambles the data in the database into an unintelligible form and hides the actual information so that unauthorized users cannot access it. The system combines cryptography, bucketization and dynamic query forms to achieve data privacy while generating query forms dynamically, thereby increasing the security of the system. The cipher text produced by encryption is embedded with the bucket number, which prevents modification of the actual data while querying. Dynamic query form bucketization uses the Blowfish encryption technique.

In general, the main conclusions regarding the Dynamic Query Form Bucketization (Ordered/Unordered) technique are as follows:

1) Bucketization can be used successfully to achieve higher security and accuracy when retrieving range queries, without direct access to the data that resides in the database.

2) Dynamic query form generation gives faster access for range queries when the database holds a large amount of data. In this case, the number of attributes used was reduced, giving fewer rules and conditions without losing classification performance.

3) A faster symmetric encryption method is used to encrypt and decrypt the data in the entire database, so even when the block size is larger, encryption and decryption are faster than with other algorithms such as AES and DES.

Finally, as the next step in this research, more experiments will be carried out on dynamic query form generation in which user preferences are given based on ranking concepts.

REFERENCES

[1] Y. Lee, “Secure ordered bucketization,” IEEE Transactions on Dependable and Secure Computing, vol. 11, no. 3, May-June 2014.

[2] L. Tang, T. Li, Y. Jiang, and Z. Chen, “Dynamic query forms for database queries,” IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 9, Sept. 2014.

[3] Y. Ding and K. Klein, “Model-driven application-level

encryption for the privacy of E-health data,” in Proc.

IEEE 10th International Conference Availability Rel.

Security, 2010, pp. 341–346.

[4] C. Wang, N. Cao, J. Li, K. Ren, and W. Lou, “Secure

ranked keyword search over encrypted cloud data,” in

Proc. IEEE 30th International Conference Distributed

Computing System., 2010, pp. 253–262.

[5] W. Lu, A. Varna, and M. Wu, “Security analysis for privacy preserving search of multimedia,” in Proc. 17th IEEE International Conference Image Processing, 2010, pp. 26–29.


[6] M. Jayapandian and H. V. Jagadish, “Automated

creation of a forms-based database query interface,”

Proc. VLDB, vol. 1, no. 1, pp. 695–709, Aug. 2008.

[7] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra,

“Executing SQL over encrypted data in the database-

service-provider model,” in Proc. ACM SIGMOD

International Conference Manage. data, 2002, pp.

216–227.

[8] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, “Order

preserving encryption for numeric data,” in Proc.

ACM SIGMOD Int. Conf. Manage. Data, 2004, pp.

563–574.

[9] A. Boldyreva, N. Chenette, Y. Lee, and A. O'Neill,

“Order-preserving symmetric encryption,” in Proc.

31st Annu. Int. Conf. Adv. Cryptology, 2009, vol.

5479, pp. 224–241.

[10] G. Das and H. Mannila, “Context-based similarity

measures for categorical databases,” in Proc. PKDD,

Lyon, France, Sept. 2000, pp. 201–210.

[11] M. Jayapandian and H. V. Jagadish, “Expressive

query specification through form customization,” in

Proc. Int. Conf. EDBT, Nantes, France, Mar. 2008, pp.

416–427.

[12] M. Jayapandian and H. V. Jagadish, “Automating the design and construction of query forms,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 10, pp. 1389–1402, Oct. 2009.

T. Joachims and F. Radlinski, “Search engines that learn from implicit feedback,” IEEE Comput., vol. 40, no. 8, pp. 34–40, Aug. 2007.

[13] E. Chu, A. Baid, X. Chai, A. Doan, and J. F.

Naughton, “Combining keyword search and forms for

ad hoc querying of databases,” in Proc. ACM

SIGMOD, Providence, RI, USA, Jun. 2009, pp. 349–

360.

[14] N. Khoussainova, Y. Kwon, M. Balazinska, and D. Suciu, “SnipSuggest: Context-aware autocompletion for SQL,” Proc. VLDB, vol. 4, no. 1, pp. 22–33, 2010.

[15] A. Nandi and H. V. Jagadish, “Assisted querying using instant-response interfaces,” in Proc. ACM SIGMOD, Beijing, China, 2007, pp. 1156–1158.

[16] W. B. Frakes and R. A. Baeza-Yates, Information

Retrieval: Data Structures and Algorithms. Englewood Cliffs,

NJ, USA: Prentice-Hall, 1992.

[17] C. Li, N. Yan, S. B. Roy, L. Lisham, and G. Das,

“Facetedpedia: Dynamic generation of query-dependent

faceted interfaces for wikipedia,” in Proc. WWW, Raleigh,

NC, USA, Apr. 2010, pp. 651–660

[18] B. Hore, S. Mehrotra, M. Canim, and M. Kantarcioglu,

“Secure multidimensional range queries over outsourced

data,” The Very Large Data Bases J., vol. 21, pp. 333–358,

2012.

[19] B. Hore, S. Mehrotra, and G. Tsudik, “A privacy-

preserving index for range queries,” in Proc. 30th Int. Conf.

Very Large Data Bases,2004, pp. 720–731.

[20] S. B. Roy, H. Wang, U. Nambiar, G. Das, and M. K.

Mohania, “Dynacet: Building dynamic faceted search systems

over databases,” in Proc. ICDE, Shanghai, China, Mar. 2009,

pp. 1463–1466.

[21] N. Khoussainova, M. Balazinska, W. Gatterbauer, Y.

Kwon, and D. Suciu, “A case for a collaborative query

management system,” in Proc. CIDR, Asilomar, CA, USA,

Jan. 2009.

[22] Q. T. Tran, C.-Y. Chan, and S. Parthasarathy, “Query by

output,” in Proc. SIGMOD, Providence, RI, USA, Sept. 2009,

pp. 535–548.


[23] K. Chen, H. Chen, N. Conway, J. M. Hellerstein, and T.

S. Parikh, “Usher: Improving data quality with dynamic

forms,” in Proc. ICDE, Long Beach, CA, USA, Mar. 2010, pp.

321–332.

[24] S. Zhu, T. Li, Z. Chen, D. Wang, and Y. Gong, “Dynamic

active probing of helpdesk databases,” Proc. VLDB, vol. 1,

no. 1, pp. 748–760, Aug. 2008.

[25] Y. Ding and K. Klein, “Model-driven application-level encryption for the privacy of E-health data,” in Proc. 2010 International Conference on Availability, Reliability and Security.