[INTERNATIONAL JOURNAL FOR RESEARCH &
DEVELOPMENT IN TECHNOLOGY] Volume-4,Issue-4, Oct - 2015
ISSN (O) :- 2349-3585
www.ijrdt.org | copyright © 2014, All Rights Reserved. 77
DYNAMIC QUERY FORM BUCKETIZATION:-
ORDERED/UNORDERED
Ashwini Ann Varghese 1
1 Dept. Computer Science and Engineering
St Joseph's College of Engineering and Technology, Palai
Abstract:- Security concerns and the demand for faster database
access are rising in areas such as banking, industry,
healthcare, the military, government applications and
educational institutions, as the number of attackers grows
along with the number of internet users. These varied
industries face several issues of data availability and
security. This paper proposes a scheme that addresses both the
security and the faster retrieval of data by combining
encryption, bucketization and dynamic query forms, with the
aim of secure data exchange without faults. Encryption hides
confidential information by converting the data into an
unintelligible form; ordered and unordered bucketization
provide security without giving direct access to the data
residing in the database; and the dynamic query form provides
an efficient user interface based on the admin's choices,
without a separate form having to be created for each
enterprise or organization. In the proposed system, the data is
encrypted using the Blowfish algorithm, and the resulting
cipher text is tagged with a bucket number derived from the
bucket width that the admin provides for each column, which
protects the privacy of the database content. The encryption
algorithm used is secure and fast, as each step in the process
depends on the key. A Dynamic Query Form is a user interface
which captures a user's preference for components and assists
the user in making decisions when retrieving data.
Keywords— Encryption, Dynamic Query Form, Security,
Bucketization.
1. INTRODUCTION
1.1 Overview
There is growing interest in, and concern about, big data in
the modern world. Data has always been a part of every
enterprise, application and business, whether big or small. It
has been there from the beginning, it has become a part of
people's lives, and it keeps on growing. Research and practical
experimentation aim to use data efficiently and with enhanced
security. Information security is of utmost importance in
today's fast developing era. People enjoy the convenient
information exchange facilities provided through the internet,
but these carry certain risks. Sensitive information in transit
might be intercepted or distorted by unintended observers or
hackers who exploit weaknesses, whether for destruction or for
entertainment. It is therefore of great importance to secure
information in transit. Mechanisms for doing so include
cryptography and bucketization.
1.2 Dynamic Query Form
Modern databases and internet applications contain very large
amounts of heterogeneous data, and the volume of data grows
every year. Real-world databases such as web databases use
hundreds or thousands of relations and attributes. There is
therefore a need to access data securely, without direct access
to the stored data, and to access it dynamically through
dynamic query forms. Earlier, in order to access data from the real
world databases, predefined query forms were used, which
cannot satisfy the varied ad-hoc queries that users pose on
databases. The user interface used for querying a database is
called a query form, and query forms are widely used today.
Much research has aimed at retrieving query results
efficiently, without altering database content, and at
improving system performance. The amount of data stored in
databases on the internet is also increasing rapidly with
advances in information technology. These databases are a gold
mine of valuable information which should be kept secure.
Earlier, the query forms of information management systems
were designed and predefined by developers. With the rapid
growth of information on the internet, scientific and modern
databases became very large and complex. As the information in
web databases increased, the number of entities used to store
it rose rapidly, so a set of static query forms that satisfies
all database queries became difficult to design. Existing
database development and management tools provide various
techniques that let users create queries on databases. Their
main drawback is that the queries depend on manual editing by
the user: if the user is not familiar with the database schema
beforehand, the hundreds or thousands of data entities and
attributes are confusing. A novel database query form interface
that generates query forms dynamically is therefore needed.
The essence of a Dynamic Query Form is to capture a user's
preference for components and to assist the user in making
decisions. Query form generation can be an iterative process
guided by the user, who adds components to the form and obtains
the results of the resulting query without faults [2].
Sequential discovery of query results is an important part of a
dynamic query form system. Data mining is the non-trivial
process of identifying valid, interesting, novel, useful and
ultimately understandable patterns in data for the user.
1.3 Cryptography
Cryptography is the art of hiding information by transforming
it into an unintelligible form, so that only someone in
possession of the key and the algorithm can access the data.
The process of transforming data into an unintelligible
(non-readable) form is termed encryption. In encryption, the
information content, termed the plaintext, is transformed into
a non-readable form, termed the cipher text, using an
encryption algorithm and a key. The key determines the way in
which a message is encoded. The process of converting the
cipher text back to the plain text (original form) is known as
decryption. Anyone with the corresponding decryption algorithm
and key can retrieve and read the information.
Encryption schemes are classified, according to the keys used,
into two types: symmetric encryption and asymmetric
encryption. Symmetric encryption uses the same key for both
encryption and decryption. Well-known symmetric-key algorithms
include DES, AES and Blowfish. Asymmetric encryption uses
different keys for encryption and decryption. The public key is
the receiver's encryption key, which is published so that
anyone can use it to encrypt messages. Only the receiving party
has access to the decryption key, known as the private key, so
only the receiver can read the encrypted messages. A well-known
example of asymmetric encryption is the RSA encryption
technique.
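The symmetric case described above can be illustrated with a
minimal Java sketch that encrypts and decrypts a value with the
same Blowfish key through the standard javax.crypto API. The
class name SymmetricDemo, the 128-bit key size, the CBC mode
with a random IV and the sample plaintext are assumptions made
for illustration; the paper does not state which mode or key
length the system uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical demo class; not part of the system described in the paper.
class SymmetricDemo {
    // Blowfish works on 64-bit blocks, so the CBC initialisation vector is 8 bytes.
    static final int IV_BYTES = 8;
    // Assumed mode and padding; the paper does not specify them.
    static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding";

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("Blowfish");
            generator.init(128); // key size in bits, an illustrative choice
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the IV followed by the cipher text, so the receiver can recover the IV.
    static byte[] encrypt(SecretKey key, String plainText) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] body = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            byte[] message = new byte[IV_BYTES + body.length];
            System.arraycopy(iv, 0, message, 0, IV_BYTES);
            System.arraycopy(body, 0, message, IV_BYTES, body.length);
            return message;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Uses the same key as encrypt(), which is what makes the scheme symmetric.
    static String decrypt(SecretKey key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new IvParameterSpec(message, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] cipherText = encrypt(key, "account balance: 1200");
        System.out.println(decrypt(key, cipherText));
    }
}
```

In an asymmetric scheme such as RSA, encrypt() and decrypt()
would instead take the public and the private key respectively.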
Figure 1.1: Symmetric Encryption
Figure 1.2: Asymmetric Encryption
1.4 Bucketization
Bucketization is a common technique for handling data. It is
used to perform analysis on every value of an entity, for
example the values of table columns such as product, store and
date, and their attributes. Bucketization is mainly used
because serious performance consequences can occur when an
attribute is implemented based on the primary key of the table.
The Secure Bucketization system uses both ordered
bucketization (OB) and unordered bucketization (UB), and each
is used as a cryptographic object.
The subsequent chapters deal with the following:
Chapter 2 is a study of existing systems covering the basics of
dynamic query forms, bucketization and the different types of
encryption mechanisms. The chapter gives an overview of the
techniques covered in the literature survey. Each section
briefly describes the papers studied, the advantages and
disadvantages of each method, and its relevance to the proposed
method.
Chapter 3 describes in detail the steps of the proposed method
for dynamic query form generation and the bucketization
technique used by the admin for efficient retrieval of data.
The chapter ends with the experimental results and the
performance of the system from the model constructed.
Chapter 4 concludes the project and proposes future work. It is
followed by the references, which list the books and journals
consulted for the study.
2. LITERATURE SURVEY
The literature survey gives a comprehensive review of previous
work on dynamic query form generation and secure bucketization
techniques. The advantages and disadvantages of the methods are
discussed, together with the concepts used in earlier systems
and the relevance of the steps and techniques of those
approaches. This chapter also surveys the application areas in
which dynamic query forms and secure bucketization are used.
The survey covers previous work in the area of secure data
mining for handling big data, and the methods used in the
existing systems and in the proposed system are compared and
evaluated.
2.1 Secure Ordered Bucketization
The cryptographic object that is used in this study is the
ordered bucketization (OB). In Ordered Bucketization, the
plain text space is divided into p disjoint buckets. The disjoint
buckets are numbered from 1 to p. This will be based on the
order of the ranges. Ordered Bucketization is useful for range
queries over encrypted data: because a bucket number is
attached to each cipher text, there is no need to decrypt the
entire encrypted data. The paper proposes an encryption scheme
with Ordered Bucketization (EOB) which has reasonable power. In
Ordered Bucketization, p-1 points are selected from the uniform
distribution over the plaintext space, and the plaintext space
is divided at the selected points. A bucket number is assigned
to each resulting range in ascending order. OB is efficient for
range queries.
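The partitioning step described above can be sketched in Java
as follows. This is only an illustration of the idea, not the
EOB construction itself: the class name OrderedBuckets, the use
of long values as the plaintext space and the rule that a value
equal to a cut point falls into the upper bucket are all
assumptions.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of ordered bucketization over a numeric plaintext space.
class OrderedBuckets {
    private final long[] cuts; // the p-1 sorted cut points

    // Draws p-1 cut points uniformly from [min, max) and sorts them.
    OrderedBuckets(long min, long max, int p, Random rng) {
        cuts = new long[p - 1];
        for (int i = 0; i < cuts.length; i++) {
            cuts[i] = min + (long) (rng.nextDouble() * (max - min));
        }
        Arrays.sort(cuts);
    }

    // Fixed cut points, convenient for reproducible examples.
    OrderedBuckets(long[] sortedCuts) {
        cuts = sortedCuts.clone();
    }

    // Bucket numbers run from 1 to p in ascending order of the ranges.
    int bucketOf(long value) {
        int lo = 0;
        int hi = cuts.length;
        while (lo < hi) { // binary search for the number of cuts <= value
            int mid = (lo + hi) >>> 1;
            if (cuts[mid] <= value) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo + 1;
    }

    // A range query [low, high] only needs the rows whose bucket number lies
    // between the two returned bounds; the rest need not be decrypted.
    int[] bucketsFor(long low, long high) {
        return new int[] {bucketOf(low), bucketOf(high)};
    }

    public static void main(String[] args) {
        OrderedBuckets ob = new OrderedBuckets(new long[] {10, 20, 30});
        System.out.println(Arrays.toString(ob.bucketsFor(12, 25)));
    }
}
```

Rows returned from the matching buckets may still include
values outside the queried range, so the client would decrypt
them and discard those false positives.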
Since its first cryptographic treatment, Order Preserving
Encryption (OPE) has attracted attention in the applied
community. Applications use OPE to realize three main
functions. First, OPE is normally used to support range queries
over encrypted data. In reference [3], the value of the date
field was encrypted in every eHR (electronic Health Record) so
that, with the eHRs stored in encrypted form, a user can search
for those generated in a specific period. For such
applications, replacing OPE with the proposed scheme does not
cause any problems apart from a slightly higher communication
overhead.
The second function is to search for the record with the
maximum or the minimum value in a set of encrypted records in a
database. This function was used in a secure text document
retrieval application [4]. In reference [4], a number of text
documents are maintained in encrypted form in a database, and
the DBMS returns the encrypted document when it is given an
encrypted query keyword. The score of a text document is
proportional to the ratio of the number of occurrences of the
queried keyword to the total number of words in the document.
The word frequencies of a text document reveal much information
about the document and should therefore be hidden from the
DBMS. When a query keyword is received, the DBMS can determine
in which encrypted text document the keyword occurs most
frequently. The third function is to count, given a large
number of pairs of encrypted numbers, the pairs in which the
first element is greater than the second. This function was
used to search an encrypted image database in reference [5]. In
this application, the features extracted from both images were
compared to check whether the queried encrypted image is close
to an encrypted image in a multimedia database. Because the
features carry a large amount of information about the image,
they should be hidden from the multimedia DBMS if image privacy
is required. The third type of function appears difficult to
support with the proposed EOB, because two features cannot be
compared when they are similar and fall within the same range.
The main disadvantage of that system, however, is that it does
not have dynamic query forms, which would increase the
performance and efficiency of the system.
2.2 Dynamic Query Forms for Database Queries
With the rapid development of web information and scientific
databases, modern databases have become large and complex.
Scientific databases and web databases maintain large,
heterogeneous data with many relations and attributes. The
query form is one of the most widely used user interfaces for
querying databases.
It is difficult to design a set of static query forms that
satisfies the various ad-hoc database queries on such complex
databases, and the query forms used earlier cannot satisfy the
variety of user queries. The paper proposes the Dynamic Query
Form (DQF), a novel database query form interface which is able
to generate query forms dynamically. The essence of DQF is to
capture a user's preference and rank query form components. The
generation of a query form is an iterative process guided by
the user. In each iteration the system automatically generates
ranked lists of form components, and the user adds the desired
components to the query form. The user can fill in the query
form and submit queries to view the query result at each
iteration. In this way, a query form can be refined dynamically
until the user is satisfied with the query results. With
existing database tools, the creation of customized queries
depends entirely on users' manual editing [6]. If a user is not familiar
with the database schema in advance, the hundreds or
thousands of data attributes would be confusing.
The paper proposes the Dynamic Query Form system (DQF) [2], a
query interface capable of dynamically generating query forms
for users. The essence of DQF is to capture user interests
during user interactions and to adapt the query form
iteratively. Each iteration consists of two types of user
interaction: Query Form Enrichment and Query Execution. Figure
2.1 shows the flowchart of DQF. It starts with a basic query
form which contains very few primary attributes of the
database. The basic query form is then enriched iteratively via
interactions between the user and the system until the user is
satisfied with the query results.
The paper proposes a dynamic query form system which generates
query forms according to the user's wishes at run time,
providing a solution for the query interface of large and
complex databases. F-measure, a typical metric for evaluating
query results, is applied to estimate the goodness of a query
form. This metric is appropriate because query forms are
designed to help users query, so the quality of a query form is
determined by the query results generated from it. On this
basis, the potential query form components are ranked and
recommended so that users can refine the query form easily.
Efficient algorithms are used to estimate the goodness of the
projection and selection form components. Efficiency matters
here because DQF is an online system where users expect a quick
response. The main disadvantage is that it has no secure
ordered bucketization technique for storing large, complex
data.
Figure 2.1: Flowchart of dynamic query form
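For reference, the standard F-measure combines precision P and
recall R as F = (1 + b^2) * P * R / (b^2 * P + R), where b
weights recall against precision. The Java sketch below
computes it for a query result given as a set of row
identifiers. The class name FMeasure and the representation of
results as sets of integers are assumptions; the surveyed DQF
system estimates an expected F-measure from user behaviour
rather than from a known set of desired rows.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for scoring one query result against the desired rows.
class FMeasure {
    // returned: row ids produced by a query form; desired: row ids the user wants.
    static double score(Set<Integer> returned, Set<Integer> desired, double beta) {
        Set<Integer> hits = new HashSet<>(returned);
        hits.retainAll(desired); // rows that are both returned and desired
        if (hits.isEmpty()) {
            return 0.0; // also covers an empty result or an empty target set
        }
        double precision = (double) hits.size() / returned.size();
        double recall = (double) hits.size() / desired.size();
        double b2 = beta * beta;
        return (1 + b2) * precision * recall / (b2 * precision + recall);
    }

    public static void main(String[] args) {
        Set<Integer> returned = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> desired = new HashSet<>(Arrays.asList(3, 4, 5));
        System.out.println(score(returned, desired, 1.0));
    }
}
```

A candidate form component could then be ranked by the score of
the result obtained when that component is added to the form.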
Extracting information from large databases is a time-consuming
activity. The paper [20] presents DynaCet, a domain independent
system that provides effective minimum-effort based dynamic
faceted search solutions over enterprise databases. At every
step, DynaCet suggests facets depending on the user's response
in the previous step. Facets are selected based on their
ability to rapidly select the most promising tuples, as well as
on the ability of the user to provide desired values for them.
The benefits of this method include faster access to
information stored in databases while taking into consideration
the variance in user knowledge and preferences. DynaCet's
contributions include an efficient approach for generating
facets for minimum-effort navigation over enterprise
databases. Recent research also includes collaborative
approaches that recommend database query components for
database exploration [21]. Database management systems (DBMSs)
provide a variety of data management capabilities, yet tools
for managing queries over the data have remained relatively
primitive. Queries are typically issued through applications;
they are debugged once and re-used repeatedly. This mode of
interaction is changing. As
scientists store and share large volumes of data in data
centres, they analyse the data by issuing exploratory queries.
The paper argues that data management systems should provide
powerful query management capabilities, from query browsing to
automatic query recommendations. In a collaborative query
management system, SQL queries are treated as items, and
similar queries are recommended to related users. However, that
work does not consider the goodness of the query results. The
paper Query by output [22] proposes a method to recommend an
alternative database query based on the results of a query. The
difference is that its recommendation is a complete query,
whereas the dynamic query form approach recommends a query
component at each iteration.
The paper Usher: Improving data quality with dynamic forms [23]
develops an adaptive forms system for data entry, in which the
form changes dynamically according to the data previously
entered by the user. The dynamic query form work differs in
that it deals with database query forms rather than data-entry
forms. The quality of data is a critical problem in modern
databases. The paper [23] proposes USHER, an end-to-end system
for form design, entry and data quality assurance. USHER learns
a probabilistic model over the questions of the form from
previous submissions and applies this model at every step of
the data entry process to improve data quality. Before entry,
it induces a form layout that captures the most important data
values of a form instance as quickly as possible and reduces
the complexity of error-prone questions. During entry, it
dynamically adapts the form to the values being entered by
providing real-time interface feedback. After entry, it
revisits question responses that it deems likely to have been
entered incorrectly by re-asking the question or by
reformulating it.
3. PROPOSED METHOD
The method proposed in this study generates dynamic query forms
and secures data using bucketization, authentication and
encryption. The new scheme generates query forms dynamically
and applies bucketization to the data in the database for data
privacy, thereby increasing security. It is used for secure
retrieval of data from the database without any alteration,
which keeps information exchanged between organizations
confidential. The main feature of the system is that it works
with any type of organisation, so its use can be extended to
any type of real-world application: the admin simply has to
connect the database before the client connects. While
connecting the database, the admin gives the bucket width of
each column, which is saved. The data of the table is converted
into cipher text using Blowfish encryption. The cipher text,
along with the bucket number, is added to the database in a
newly created table. When a client logs in to the system, the
database attached by the admin is loaded dynamically, and data
can be added to the database based on the client's choice.
Through this process a high degree of information security is
achieved.
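The tagging step described above can be sketched in Java as
follows, assuming non-negative numeric column values and a
fixed bucket width per column. The class WidthBuckets, the
"bucket:ciphertext" storage format and, in particular, the
reading of unordered bucketization as a secret relabelling of
the ordered bucket numbers are assumptions made for
illustration; the text does not define the unordered variant.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of attaching a bucket number to an encrypted value.
class WidthBuckets {
    // Ordered bucket: values 0..width-1 map to 1, width..2*width-1 map to 2, etc.
    // Assumes value >= 0 and width > 0.
    static int orderedBucket(long value, long width) {
        return (int) Math.floorDiv(value, width) + 1;
    }

    // Assumed unordered variant: relabel buckets 1..count through a permutation
    // derived from a secret seed, so the labels no longer reveal the order.
    // java.util.Random is used only to keep the example repeatable; a real
    // system would need a keyed cryptographic permutation.
    static int[] secretLabels(int count, long secretSeed) {
        List<Integer> labels = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            labels.add(i);
        }
        Collections.shuffle(labels, new Random(secretSeed));
        int[] out = new int[count + 1]; // index 0 is unused
        for (int i = 1; i <= count; i++) {
            out[i] = labels.get(i - 1);
        }
        return out;
    }

    static int unorderedBucket(long value, long width, int[] labels) {
        return labels[orderedBucket(value, width)];
    }

    // Stored form of one cell: the cipher text tagged with its bucket number.
    static String tag(String cipherTextBase64, int bucket) {
        return bucket + ":" + cipherTextBase64;
    }

    public static void main(String[] args) {
        long width = 10; // bucket width the admin chose for this column
        System.out.println(tag("q83vEjRWeJA=", orderedBucket(25, width)));
    }
}
```

The Base64 string in main() is a placeholder, not the output of
a real Blowfish encryption.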
The main techniques used by the system are:-
1. Authentication: Authentication, or the setting of
credentials, acts as the first line of defence. It is the
process in which the credentials provided (username and
password) are checked against a file or database of authorized
users' information in a system or on a server. If the username
and password match, the process is completed and the user is
granted authorization for access. The permissions of the user
in the system are based on the authentication. Such credentials
are cheap to deploy. In the system, authentication separates
the admin's and the users' choices.
2. Bucketization: Bucketization is the technique used for
handling the data. It is used to perform analysis on every
value of an entity, for example the values of table columns
such as product, store and date, and their attributes.
Bucketization is mainly used because serious performance
consequences can occur when an attribute is implemented based
on the primary key of the table. There are two types of bucketization
techniques. They are ordered bucketization and unordered
bucketization.
3. Cryptography: Cryptography is the science of secret
communication: the art of converting a secret into an
unintelligible form so that an unauthorized person cannot read
it. The main component of that science used in this system is
encryption, the process of hiding information by making its
content secret. The hidden information is known as cipher text.
To make the hidden information accessible, the user needs a
key; the admin therefore hides the information from everybody
except those who possess the key. Encryption schemes are
classified, according to the keys used, into two types:
symmetric encryption and asymmetric encryption. Symmetric
encryption uses the same key for both encryption and
decryption. DES, AES and Blowfish are well-known algorithms for
symmetric key encryption. Asymmetric encryption uses different
keys for encryption and decryption. RSA is a well-known
algorithm for asymmetric key encryption. The public key is the
encryption key of the receiver, which is published for anyone
to use to encrypt messages. Only the receiving party has access
to the decryption key, known as the private key, and so only
the receiver can read the encrypted messages.
4. Dynamic Query Form Generation: The user interface used for
querying a database is called a query form. Query forms are
widely used to examine databases and retrieve query results
efficiently. The Dynamic Query Form captures a user's
preference for components and assists the user in making
decisions. Query form generation can be an iterative process
guided by the user, who adds components to the form and obtains
the results of the resulting query without faults [2].
These steps will be explained later in this chapter.
Because of the large amount of information in databases and
various other sources, the valid data to be transmitted to an
authorized user must be handled correctly, as considerable
risks are involved. Data to be retrieved should be gathered
carefully. An organization's data may include various types of
information serving a wide variety of applications. This
information is stored in a database, so Dynamic Query Form
Bucketization is used to retrieve it efficiently and securely.
The system is designed to provide a secure scheme for sharing
information among users without any alteration of the database,
since users are restricted from accessing the data directly.
The application can be used in many scenarios, including help
desk applications in an enterprise or business. The
bucketization technique and the encryption mechanism are the
other main factors that determine the performance of the
system. There is also an admin side component which is
responsible for loading the database for users so that data
transfer among registered users can be done efficiently.
The system is implemented as three main modules: the admin
module, the users module and the dynamic query form
bucketization server module. The admin module deals with the
confidentiality and security of the data in the database that
is to be sent to the user. The users module supports
interaction with the database content without altering it,
efficiently and without faults. The dynamic query form
bucketization module secures the database by maintaining the
bucket number for each data item, and it is responsible for
encrypting each item in the database.
Figure 3.1 shows the basic interaction between the admin,
users and the dynamic query form bucketization server
module.
Figure 3.1: Interaction between admin, users and
DQFB server
3.1 System Architecture
Figure 3.2 shows the system architecture of the proposed
method for dynamic query form bucketization. The admin can add
any type of database to the system, depending on the
organisation in which the application is deployed, but must
supply the corresponding username and password of the
relational database management system (RDBMS). The RDBMS used
by this application is MySQL, and the databases used by the
application reside in it. The columns of the corresponding
database are added to the application, and the admin sets the
bucket width of each column. When the bucket width is saved,
two main processes take place: the database contents are
encrypted, and a bucket number is attached to the encrypted
data. The admin has the privilege of choosing ordered or
unordered bucketization. The user has the privilege of
accessing the data without altering the content of the
database. A plain query entered in the dynamic query form loads
the matching database contents from the system.
Figure 3.2: Dynamic Query Form Bucketization
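One way a plain range condition chosen in the dynamic query
form could be turned into a query over the bucket numbers is
sketched below in Java. The naming scheme (a table suffixed
_enc with columns suffixed _cipher and _bucket), the class
BucketQuery and the fixed-width ordered bucket rule are all
hypothetical; the paper does not give the schema of the table
that holds the cipher text and bucket numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rewrite "column BETWEEN low AND high" as a bucket predicate.
class BucketQuery {
    final String sql;          // parameterised SQL text
    final List<Object> params; // values to bind to the ? placeholders, in order

    BucketQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    static BucketQuery range(String table, String column,
                             long low, long high, long width) {
        // Identifiers cannot be bound as parameters, so they are validated instead.
        String identifier = "[A-Za-z_][A-Za-z0-9_]*";
        if (!table.matches(identifier) || !column.matches(identifier)) {
            throw new IllegalArgumentException("invalid identifier");
        }
        if (width <= 0) {
            throw new IllegalArgumentException("bucket width must be positive");
        }
        long first = Math.floorDiv(low, width) + 1; // bucket holding the lower bound
        long last = Math.floorDiv(high, width) + 1; // bucket holding the upper bound
        String sql = "SELECT " + column + "_cipher FROM " + table + "_enc"
                + " WHERE " + column + "_bucket BETWEEN ? AND ?";
        List<Object> params = new ArrayList<>();
        params.add(first);
        params.add(last);
        return new BucketQuery(sql, params);
    }

    public static void main(String[] args) {
        BucketQuery q = range("patients", "age", 25, 47, 10);
        System.out.println(q.sql + " " + q.params);
    }
}
```

With JDBC, the SQL text and parameters would be passed to a
PreparedStatement; the returned cipher texts would then be
decrypted and filtered to the exact range, since a bucket may
contain values outside it.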
3.2 System Requirement
The main hardware and software requirements of the proposed
system are given below:-
3.2.1 Hardware Specification
Processor : Pentium 1 GHz or Above
Hard Disk : 40 GB
Monitor : 1024 x 768 VGA Color Monitor
Memory : 512 MB RAM
Keyboard : 101/102 Natural Keyboard
Mouse : PS/2 Compatible
3.2.2 Software Specification
Operating System : Windows
Front End : Java
Back End : MySQL
Application Development Software : NetBeans IDE
3.3 Modules
The detailed description of each module in dynamic query form
bucketization (ordered/unordered) is as follows:-
3.3.1 Authentication
Authentication or setting of credentials acts as the first line
of defence.
Authentication is the process in which the credentials
provided (username and password) are checked against a file or
database of authorized users' information in a system or on a
server. If the username and password match, the process is
completed and the user is granted authorization for access. The
permissions of the user in the system are based on the
authentication. Such credentials are cheap to deploy. In the
system, authentication separates the admin's and the users'
choices. Authentication plays a major role in separating the
folders returned according to the privileges set beforehand,
thereby separating the environment that each user sees. It also
plays a vital role in the way the user interacts with the
system, including access and other rights such as the amount of
allocated storage space.
The authentication process in the proposed system
distinguishes the administrator from ordinary users, so their
access rights differ. The process of checking an account's
permissions for accessing resources is referred to as
authorization. The preferences and privileges granted to an
authorized account depend on the user's permissions, which are
stored either locally or on the server. In the proposed system,
the user's permissions are based on authentication, which
depends on the application and is performed at runtime. New
users can sign up by clicking the corresponding button in the
application, but they cannot obtain the privileges of the
administrator. There is only one administrator in this
application, and the admin has every privilege.
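The rules above (credentials checked against stored user
information, a single administrator, and sign-up that only
ever creates ordinary users) can be sketched in Java as
follows. The class AuthDemo, the in-memory map standing in for
the user table, and the use of salted PBKDF2 password hashes
are assumptions; the paper does not say how passwords are
stored.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical sketch of the login and sign-up rules; not the paper's code.
class AuthDemo {
    enum Role { ADMIN, USER }

    private static final class Account {
        final byte[] salt;
        final byte[] hash;
        final Role role;

        Account(byte[] salt, byte[] hash, Role role) {
            this.salt = salt;
            this.hash = hash;
            this.role = role;
        }
    }

    // In-memory stand-in for the table of authorized users.
    private final Map<String, Account> accounts = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    // The single administrator is created together with the store.
    AuthDemo(String adminName, char[] adminPassword) {
        accounts.put(adminName, newAccount(adminPassword, Role.ADMIN));
    }

    // Sign-up always yields an ordinary user; returns false if the name is taken.
    boolean signUp(String name, char[] password) {
        if (accounts.containsKey(name)) {
            return false;
        }
        accounts.put(name, newAccount(password, Role.USER));
        return true;
    }

    // Returns the role on success, or null when the credentials do not match.
    Role login(String name, char[] password) {
        Account account = accounts.get(name);
        if (account == null) {
            return null;
        }
        byte[] candidate = hash(password, account.salt);
        return MessageDigest.isEqual(candidate, account.hash) ? account.role : null;
    }

    private Account newAccount(char[] password, Role role) {
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        return new Account(salt, hash(password, salt), role);
    }

    // Salted PBKDF2 hash; iteration count and output length are illustrative.
    private static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, 65_536, 256);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        AuthDemo auth = new AuthDemo("admin", "s3cret".toCharArray());
        auth.signUp("alice", "pa55word".toCharArray());
        System.out.println(auth.login("alice", "pa55word".toCharArray()));
    }
}
```

The role returned by login() is what the application would use
to decide whether to show the admin module or only the user
module.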
Authentication :-
Authentication checks whether the person accessing the
data is authorised and identifies the type of user. It gives
a user confidence that the data originates from a specific
known source. The authentication tests were completed
successfully. The login was tested with an invalid user name,
and with a valid user name and an invalid password. The
software was also tested to see whether it allows access
without any identification.
Authorization :-
Authorization is the process of determining whether a
user is allowed to receive a particular service or perform an
operation. It also determines whether access to a particular
location or resource is restricted. The authorization test was
executed successfully. The software identifies the user as a
normal user or as the admin and allows a normal user to access
only the client (user) module. The test checked that the system
prevents unauthorized users from accessing privileged modules,
that is, that normal users cannot reach the modules reserved
for the administrator.
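The login and role checks described above can be sketched as follows. This is only an illustration in Python, not the implementation used in the proposed system: the in-memory USERS store, the helper names sign_up, authenticate and authorize, the module names, and the use of PBKDF2 for password hashing are all assumptions introduced for the example.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory user store: username -> (salt, password hash, role).
USERS = {}

# Hypothetical mapping from role to the modules that role may open.
MODULES = {"admin": {"admin_module", "user_module"}, "user": {"user_module"}}


def _hash(password, salt):
    # PBKDF2 is an assumption; the paper does not say how passwords are stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def sign_up(username, password, role="user"):
    """Register an account. Only one administrator may exist."""
    if username in USERS:
        raise ValueError("username already taken")
    if role == "admin" and any(r == "admin" for _, _, r in USERS.values()):
        raise PermissionError("an administrator already exists")
    salt = os.urandom(16)
    USERS[username] = (salt, _hash(password, salt), role)


def authenticate(username, password):
    """Return the role on success, or None for bad or missing credentials."""
    record = USERS.get(username)
    if record is None or not password:
        return None
    salt, stored, role = record
    # Constant-time comparison avoids leaking how many bytes matched.
    return role if hmac.compare_digest(stored, _hash(password, salt)) else None


def authorize(role, module):
    """Check whether an authenticated role may access a module."""
    return role is not None and module in MODULES.get(role, set())


sign_up("root", "s3cret", role="admin")
sign_up("alice", "pa55word")
```

A normal user such as alice authenticates with the role "user" and is authorized for user_module only, while a failed login returns None and is refused every module. This mirrors the test cases listed above (invalid user name, invalid password, no identification).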
Availability :-
Availability assures that the information or data stored
in the database is ready for use when it is expected, and is
kept available to the people who need it. The software was
tested for availability, and the data was easily available to
users while the security of the system was preserved.
Confidentiality:-
Confidentiality is a security measure that protects
against the disclosure of information or data to parties
other than the intended recipient. The system was tested for
confidentiality, and it ensures the confidentiality of
messages through encryption and the bucketization technique.
The decryption key is available only to the authorized user.
The admin has the privilege to access the buckets and can
choose whether ordered or unordered bucketization should be
used for a particular application. This confidential information will be
hidden from the other normal users who access the data from
the database.
3.3.2 Bucketization Technique
Bucketization is one of the most common techniques for
handling data. It is used to perform analysis over the values
of an entity, for example the values in table columns such as
product, store and date, and over their attributes. It is
used mainly because implementing an attribute based on the
primary key of the table can have serious performance
consequences. This Secure Bucketization system uses both
ordered bucketization (OB) and unordered bucketization (UB),
and treats both as cryptographic objects.
In ordered bucketization, the plaintext space (the
original data) is divided into a pre-defined number of buckets,
which the admin assigns while loading the database. Let p be
the number of buckets assigned to a particular column. Each
bucket is given a number in the range 1 to p, and the numbers
are ordered. With bucketization, various types of SQL queries
over encrypted data are possible when each cipher text carries
the bucket number of its original plaintext and the client
holds the key for encrypting and decrypting the data. Range
queries in particular can be used with ordered bucketization.
For example, if a program running on the client side wants to
retrieve the data in the range between 100,000 and 200,000, it
first computes the bucket numbers whose union is the smallest
set that covers the queried range. The client program sends
these bucket numbers to the database server. The server
searches for all the encrypted data whose bucket number is one
of the received numbers and sends that data back to the
client. The client obtains the correct result by decrypting
the data and filtering out the items that are not in the
range. In this case, more data is transmitted between the
client and the server than when the database stores
unencrypted data items, because of false positives: a bucket
may hold both data the client wants to retrieve and data it
does not want. On the other hand, this approach is far more
efficient than having the client receive all the encrypted
data from the server and decrypt every item to obtain the
correct query result. The method is therefore very useful
when users cannot store their data without encryption, as in
a cloud computing environment.
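The ordered range query above can be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the paper: a column domain of 0 to 1,000,000, p = 10 equal-width buckets, and hypothetical helper names. The toy_cipher function is only a stand-in for Blowfish so that the example stays self-contained; it is not secure.

```python
import hashlib

LOW, HIGH, P = 0, 1_000_000, 10   # assumed column domain [LOW, HIGH) and bucket count p
WIDTH = (HIGH - LOW) // P          # bucket width, here 100,000


def bucket_of(value):
    """Ordered bucket number in 1..P; larger values get larger numbers."""
    return (value - LOW) // WIDTH + 1


def toy_cipher(key, nonce, data):
    """Placeholder for Blowfish: XOR with a SHA-256 keystream. Not secure."""
    stream = hashlib.sha256(key + nonce).digest()
    return bytes(b ^ s for b, s in zip(data, stream))


def encrypt_row(key, row_id, value):
    """Return (bucket number, cipher text) as stored on the server."""
    nonce = row_id.to_bytes(8, "big")
    return bucket_of(value), toy_cipher(key, nonce, value.to_bytes(8, "big"))


def decrypt_row(key, row_id, cipher_text):
    nonce = row_id.to_bytes(8, "big")
    return int.from_bytes(toy_cipher(key, nonce, cipher_text), "big")


def buckets_for_range(a, b):
    """Smallest set of bucket numbers whose union covers [a, b]."""
    return set(range(bucket_of(a), bucket_of(b) + 1))


def server_search(table, buckets):
    """Server side: match on bucket numbers only, never on plaintext."""
    return [(rid, ct) for rid, (bkt, ct) in table.items() if bkt in buckets]


def client_query(key, table, a, b):
    candidates = server_search(table, buckets_for_range(a, b))
    values = [decrypt_row(key, rid, ct) for rid, ct in candidates]
    # Drop false positives that share a bucket but fall outside [a, b].
    return sorted(v for v in values if a <= v <= b), len(candidates)


KEY = b"demo key"
salaries = [50_000, 100_000, 150_000, 199_999, 200_000, 250_000, 900_000]
table = {rid: encrypt_row(KEY, rid, v) for rid, v in enumerate(salaries)}
result, fetched = client_query(KEY, table, 100_000, 200_000)
```

With this sample data the query covers buckets 2 and 3, so the server returns five rows and the client keeps four of them. The fifth row (250,000) is a false positive from bucket 3, which is the extra transfer cost discussed above; the rows in buckets 1 and 10 are never sent or decrypted.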
In unordered bucketization, the plaintext space (the
original data) is likewise divided into a pre-defined number
of buckets, which the admin assigns while loading the
database. Let p be the number of buckets assigned to a
particular column. Each bucket is given a number in the range
1 to p, but the numbers are unordered. A pseudorandom
algorithm is called to generate the random numbers, which are
stored in an array. Queries can use the unordered buckets to
access the secured data. For example, if a program running on
the client side wants to retrieve the data in the range
between 100,000 and 200,000, it first divides the plaintext
space equally and maps each division to the array where the
random numbers are stored. The client program sends the
resulting bucket numbers to the database server. The server
searches for all the encrypted data whose bucket number is
one of the received numbers, which were produced by the
pseudorandom algorithm, and sends that data back to the
client. The client obtains the correct result by decrypting
the data and filtering out the items that are not in the
range. In this case, the amount of data transmitted between
the client and the server can be kept small, because false
positives can be avoided by efficiently using the buckets
that hold the data the client wants to retrieve. This
approach is far more efficient than having the client receive
all the encrypted data from the server and decrypt every item
to obtain the correct query result.
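The mapping from equal-width divisions to unordered bucket numbers can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the domain of 0 to 1,000,000, p = 10, the helper names and the seed value are hypothetical, and Python's random module stands in for the pseudorandom algorithm (it is not cryptographically secure).

```python
import random

LOW, HIGH, P = 0, 1_000_000, 10   # assumed column domain [LOW, HIGH) and bucket count p
WIDTH = (HIGH - LOW) // P          # width of each equal division


def make_labels(seed):
    """Array of bucket numbers 1..P in pseudorandom order, one per division."""
    labels = list(range(1, P + 1))
    random.Random(seed).shuffle(labels)   # the seed acts as the client's secret
    return labels


def label_of(labels, value):
    """Unordered bucket number stored next to the cipher text of value."""
    return labels[(value - LOW) // WIDTH]


def labels_for_range(labels, a, b):
    """Bucket numbers the client sends to the server for the range [a, b]."""
    first, last = (a - LOW) // WIDTH, (b - LOW) // WIDTH
    return {labels[i] for i in range(first, last + 1)}


labels = make_labels(seed=2015)
wanted = labels_for_range(labels, 100_000, 200_000)
```

Because the numbers in labels no longer follow the order of the plaintext divisions, the bucket numbers seen by the server do not reveal which division holds larger values. The client still covers the queried range by looking up the same two divisions in its private array.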
Bucketization makes the indexing of encrypted data
faster, so that content in the system can be searched
efficiently.
3.3.3 Cryptography
Security is one of the most challenging aspects of day-to-day
real-world applications. Cryptography is one of the main
branches of computer security; it converts information from
its normal form into an unreadable form. The two main
characteristics that distinguish one encryption algorithm
from another are its ability to secure the protected data
against attacks and its speed and efficiency in doing so.
Cryptography is usually referred to as “the study of secrets”.
Encryption is the process of converting normal text into an
unreadable form, and decryption is the process of converting
encrypted text back into readable text. In other words,
encryption is an algorithm or process that hides information
by making its content secret. The hidden information is known
as cipher text. To make the hidden information accessible,
the user needs a key. The admin therefore hides the
information from everybody except those who possess the key.
Encryption is a direct application of cryptography for
protecting information. Encryption schemes are classified by
the keys they use into two types: symmetric encryption and
asymmetric encryption. Symmetric encryption uses the same key
for both encryption and decryption; DES, AES and Blowfish are
well-known symmetric-key algorithms. Asymmetric encryption
uses different keys for encryption and decryption; RSA is a
well-known asymmetric-key algorithm. The public key is the
encryption key of the receiver, which is published for anyone
to use to encrypt messages. Only the receiving party has
access to the decryption key, so only the receiver can read
the encrypted messages. The key owned by the receiving party
is known as the private key.
Cryptography is the art of hiding information by transforming
it into an unintelligible form, so that only someone in
possession of the key and the algorithm can access the data.
Cryptography is the process of scrambling a message to
convert it into an unintelligible form. Transforming the data
into an unintelligible (non-readable) form is termed
encryption. In encryption, the information content, termed
the plaintext, is transformed into a non-readable form termed
the cipher text. Encryption is possible only with the
encryption algorithm and the key, and the key determines the
way in which a message is encoded. The cipher text can be
converted back to its plain text (original form); this
process is known as decryption. Only someone with the
corresponding decryption algorithm and the key can retrieve
and read the information.
The encryption process can be classified by the keys used
into two types: symmetric encryption and asymmetric
encryption. Symmetric encryption uses the same key for both
encryption and decryption. DES, AES and Blowfish are
well-known examples of symmetric-key encryption algorithms.
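The difference between the two classes can be illustrated with a short Python sketch. Neither part is the cipher used in the proposed system: keystream_xor is a hypothetical toy symmetric cipher standing in for Blowfish, and the asymmetric part is textbook RSA with tiny primes chosen only so the arithmetic is visible. Both are insecure and are shown purely to contrast one shared key with a public/private key pair.

```python
import hashlib


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256 keystream. Not secure."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out += bytes(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


# Symmetric: the same secret key both encrypts and decrypts.
secret = b"shared secret"
cipher_text = keystream_xor(secret, b"confidential record")
plain_text = keystream_xor(secret, cipher_text)

# Asymmetric: textbook RSA with tiny primes, for illustration only.
p, q, e = 61, 53, 17
n = p * q                              # public modulus
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (needs Python 3.8+)
message = 65
rsa_cipher = pow(message, e, n)        # anyone can encrypt with the public key (n, e)
rsa_plain = pow(rsa_cipher, d, n)      # only the holder of d can decrypt
```

In the symmetric case both parties must already share secret, whereas in the asymmetric case the pair (n, e) can be published and only d has to stay private.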
3.3.4 Dynamic Query Form Generation
Dynamic Query Form generation is a very important task in
this work, because the quality and reliability of the
available information directly affect the results attained.
The user interface used for querying a database is called a
query form, and query forms are now very widely used. Much
research has aimed at examining databases for efficient
retrieval of query results without altering the database
content, and at improving the performance of the system. The
amount of data stored in databases on the internet is also
increasing rapidly with advances in information technology.
These databases contain a wealth of data and are a gold mine
of valuable information, which should be kept secure.
Earlier query forms, used in various information management
systems, were designed and predefined by developers. With the
rapid development of internet information, scientific and
other modern databases have become very large and complex.
As the information in web databases increased, the number of
entities for storing data grew rapidly. Designing a set of
static query forms to satisfy the various database queries
therefore became difficult. Existing database development and
management tools provide various techniques that let users
create queries on databases. The main drawback is that these
queries depend on manual editing by the user: if the user is
not familiar with the database schema beforehand, the
hundreds or thousands of data entities and attributes are
confusing. A novel query form interface that generates query
forms dynamically is therefore needed. The essence of a
Dynamic Query Form is to capture a user's preference for
components and to assist the user in making decisions. Query
form generation is an iterative process that any user can
guide efficiently and successfully. The user adds components
to the form to build the query, and can efficiently get
results based on the query without faults [2]. Sequential
discovery of query results is a very important part of a
dynamic query form system. Data mining is a non-trivial
process of identifying valid, interesting, novel, useful and
ultimately understandable data for the user.
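One way to picture dynamic query form generation is the following Python sketch, which reads the column names from the connected database and builds a parameterized query from the components a user selects. It is not the system described in the paper: it assumes an SQLite database, and the employee table, its rows and the helper names form_fields and build_query are hypothetical.

```python
import sqlite3


def form_fields(conn, table):
    """List column names so the user need not know the schema in advance."""
    known = {r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table}")
    # PRAGMA cannot take bound parameters, hence the check above.
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def build_query(conn, table, show, ranges):
    """Build a parameterized SELECT from the components the user picked."""
    fields = form_fields(conn, table)
    for col in list(show) + list(ranges):
        if col not in fields:
            raise ValueError(f"unknown column: {col}")
    sql = "SELECT " + ", ".join(f'"{c}"' for c in show) + f' FROM "{table}"'
    params = []
    if ranges:
        clauses = []
        for col, (low, high) in ranges.items():
            clauses.append(f'"{col}" BETWEEN ? AND ?')
            params += [low, high]
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [("Asha", "HR", 90_000), ("Binu", "IT", 150_000),
                  ("Chitra", "IT", 210_000)])

fields = form_fields(conn, "employee")          # components offered on the form
sql, params = build_query(conn, "employee", ["name", "salary"],
                          {"salary": (100_000, 200_000)})
rows = conn.execute(sql, params).fetchall()
```

Each time the user adds or removes a component, build_query can be called again, which corresponds to the iterative refinement described above. Column names are checked against the schema and values are passed as bound parameters, so the user never edits SQL text by hand.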
Because of the large amount of information in the database
and other sources, the valid data to be transmitted to the
authorized user must be handled correctly, as there are
considerable risks involved. Data to be retrieved should be
gathered carefully. An organization's data may include
various types of information serving a wide variety of
applications. This information is stored in the database, so
Dynamic Query Form Bucketization is used to retrieve the data
efficiently while providing security.
3.4 Performance Evaluation
This section evaluates the performance advantages of the
proposed system compared with existing systems. The main
advantages of the proposed system are as follows:-
1) Database Connectivity:-
The existing applications connect to one fixed database for
the entire lifecycle of the system. The proposed system
overcomes this drawback by connecting to any type of
relational database. The admin simply has to give the name of
the database to be queried. The system works according to how
the admin loads the content into the database, so any
organization, business or enterprise application can use the
proposed system efficiently. Users can query the databases
for the relevant content without even knowing the table or
column names used by the organization, which improves the
performance of the system. Relational databases with any
number of columns can be added to the system.
2) Use of Blowfish Algorithm :-
When the symmetric-key cryptographic algorithms AES, DES and
Blowfish are compared, Blowfish outperforms the others by
taking less time to encrypt and decrypt the contents of the
database. Blowfish outperforms the others even where the
block sizes differ.
The plaintext is divided into smaller blocks according to the
algorithm settings given in Table 1.
Algorithm    Key Size (Bits)    Block Size (Bits)
DES          64                 64
AES          128                128
Blowfish     128                64
Table 1: Key size and block size of the DES, AES and Blowfish
algorithms
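The effect of the block sizes in Table 1 on the number of cipher blocks can be worked out with a few lines of Python. The padding rule is an assumption, since the paper does not state one: the sketch assumes PKCS#7-style padding, which always appends between one byte and one full block. The function name blocks_needed is hypothetical.

```python
# Block sizes in bits, as listed in Table 1.
BLOCK_BITS = {"DES": 64, "AES": 128, "Blowfish": 64}


def blocks_needed(n_bytes, algorithm):
    """Number of cipher blocks for n_bytes of plaintext after padding."""
    block_bytes = BLOCK_BITS[algorithm] // 8
    # Assumed PKCS#7-style padding adds 1..block_bytes bytes, so the padded
    # length is the next multiple of block_bytes strictly above n_bytes.
    return n_bytes // block_bytes + 1


blowfish_blocks = blocks_needed(1000, "Blowfish")   # 8-byte blocks
aes_blocks = blocks_needed(1000, "AES")             # 16-byte blocks
```

For a 1000-byte value, Blowfish and DES process 126 blocks of 8 bytes while AES processes 63 blocks of 16 bytes, so a timing comparison between the algorithms reflects the cost per block as well as the number of blocks.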
Figure 3.7: Data Block Size vs Execution Time (Sec) of AES,
DES and Blowfish algorithms
Figure 3.7 shows the superiority of the Blowfish algorithm
over AES and DES in terms of processing time. It also shows
that AES consumes more resources when the data block size is
relatively large. The figure indicates that the extra time
added is not significant for many real-world applications, so
Blowfish is the better choice for faster computing. The
figure also shows that Blowfish performs better than the
other common encryption algorithms considered. Since Blowfish
has no known security weak point so far, it can be considered
a standard encryption algorithm. AES showed poorer
performance than the other algorithms because it requires
more processing power, but overall the difference was
relatively negligible, especially for applications that
require more secure encryption of relatively large data
blocks.
3) Faster indexing for accessing encrypted databases:-
By using the bucketization technique, the querying process is
made easier. Range queries in particular perform better than
with other approaches. Bucketization provides faster access
to data. Only the content needed by the user is retrieved;
the other data remains secured and is not even decrypted.
Only the data corresponding to the requested bucket numbers
is taken for decryption, which strengthens the security of
the system. Users do not access the stored data directly;
only the data the user needs is taken.
The running time of the proposed system for searching and
retrieving data is reduced compared with the existing system,
since only the data needed by the user is fetched and
decrypted. In the earlier system, when a user queried for a
particular result, the entire database had to be decrypted to
send accurate results to the user. With the bucketization
technique, only the data needed by the end user is decrypted
at runtime, because the bucket numbers are checked when
retrieving the results, which makes the system run much
faster.
4) Use of Dynamic Query Form:-
The proposed system allows end-users to customize the
existing query form at run time, even though the end-user may
not be familiar with the database. If the database connected
at run time is very large, it would otherwise be difficult
for users to find database entities and attributes and to
create the desired query forms. The dynamic query form lets
users reach the query results faster.
These are the main performance differences between the
existing systems and the proposed system.
CONCLUSION
As shown above, the proposed system provides a new secure
and efficient way of performing dynamic query form
bucketization, using both ordered and unordered
bucketization. The method gives most importance to range
queries issued through application interfaces, and such range
queries are supported efficiently. The study shows that this
approach selects appropriate data from the database
efficiently, thereby improving the accuracy of the system. It
also preserves a high level of security compared with
existing methods. Bucketization, encryption and Dynamic Query
Form generation were the important tasks carried out in this
work. Since the quality and reliability of the system directly
affect the results obtained, it was an important task to
retrieve the useful
and correct information. There is no need to write a query by
hand to retrieve data, since queries are produced dynamically
through an efficient interface. A dynamic query form
generation approach was implemented that helps users generate
query forms dynamically. It aims to capture user preferences
at runtime and generate the accurate query the user requires.
The dynamic approach can lead to a higher success rate and
can use simpler query forms than a static approach. It can
also add a text box in which users enter keyword queries.
A bucketization technique was constructed efficiently, such
that the encrypted bucketization works on top of any secure
symmetric encryption scheme [15].
Encryption scrambles the data in the database into an
unintelligible form and hides the actual information, so that
an unauthorized user cannot access it. The system combines
cryptography, bucketization and dynamic query forms to
achieve data privacy while generating query forms
dynamically, thereby increasing the security of the system.
The cipher text produced by encryption is embedded with the
bucket number, which prevents modification of the actual data
while querying. Dynamic query form bucketization uses the
Blowfish encryption technique.
In general, the main conclusions regarding the Dynamic Query
Form Bucketization (Ordered/Unordered) technique are as
follows:
1) It was shown that the bucketization technique can be
used successfully to obtain higher security and
accuracy when retrieving range queries, without direct
access to the data that resides in the database.
2) It was shown that dynamic query form generation
techniques help to give faster access for range
queries when the database holds a large amount of
data. In this case, the number of attributes used was
reduced to obtain fewer rules and conditions without
losing classification performance.
3) A better symmetric encryption method is used for faster
encryption and decryption of the data in the entire
database. Even if the block size is larger, encryption
and decryption of the data will be faster than with
other algorithms such as AES and DES.
Finally, as the next step in this research, more experiments
will be carried out on Dynamic Query Form generation in which
user preferences are given based on certain ranking concepts.
REFERENCES
[1] Younho Lee, “Secure Ordered Bucketization,” IEEE
Transactions On Dependable And Secure Computing,
Vol. 11, No. 3, May-June 2014
[2] Liang Tang, Tao Li, Yexi Jiang, and Zhiyuan Chen,
“Dynamic Query Forms for Database Queries” IEEE
Transactions on Knowledge and Data Engineering,
Vol. 26, No. 9, September 2014.
[3] Y. Ding and K. Klein, “Model-driven application-level
encryption for the privacy of E-health data,” in Proc.
IEEE 10th International Conference Availability Rel.
Security, 2010, pp. 341–346.
[4] C. Wang, N. Cao, J. Li, K. Ren, and W. Lou, “Secure
ranked keyword search over encrypted cloud data,” in
Proc. IEEE 30th International Conference Distributed
Computing System., 2010, pp. 253–262.
[5] W. Lu, A. Varna, and M. Wu, “Security analysis for
privacy preserving search of multimedia,” in
Proceeding. 17th IEEE International Conference
Image Processing, 2010, pp. 26–29.
[6] M. Jayapandian and H. V. Jagadish, “Automated
creation of a forms-based database query interface,”
Proc. VLDB, vol. 1, no. 1, pp. 695–709, Aug. 2008.
[7] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra,
“Executing SQL over encrypted data in the database-
service-provider model,” in Proc. ACM SIGMOD
International Conference Manage. data, 2002, pp.
216–227.
[8] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, “Order
preserving encryption for numeric data,” in Proc.
ACM SIGMOD Int. Conf. Manage. Data, 2004, pp.
563–574.
[9] A. Boldyreva, N. Chenette, Y. Lee, and A. O'Neill,
“Order-preserving symmetric encryption,” in Proc.
31st Annu. Int. Conf. Adv. Cryptology, 2009, vol.
5479, pp. 224–241.
[10] G. Das and H. Mannila, “Context-based similarity
measures for categorical databases,” in Proc. PKDD,
Lyon, France, Sept. 2000, pp. 201–210.
[11] M. Jayapandian and H. V. Jagadish, “Expressive
query specification through form customization,” in
Proc. Int. Conf. EDBT, Nantes, France, Mar. 2008, pp.
416–427.
[12] M. Jayapandian and H. V. Jagadish, “Automating the
design and construction of query forms,” IEEE Trans.
Knowl. Data Eng., vol. 21, no. 10, pp. 1389–1402,
Oct. 2009.
T. Joachims and F. Radlinski, “Search
engines that learn from implicit feedback,” IEEE
Comput., vol. 40, no. 8, pp. 34–40, Aug. 2007.
[13] E. Chu, A. Baid, X. Chai, A. Doan, and J. F.
Naughton, “Combining keyword search and forms for
ad hoc querying of databases,” in Proc. ACM
SIGMOD, Providence, RI, USA, Jun. 2009, pp. 349–
360.
[14] N. Khoussainova, Y. Kwon, M. Balazinska, and D.
Suciu, “Snipsuggest: Context-aware autocompletion for SQL,”
Proc. VLDB, vol. 4, no. 1, pp. 22–33, 2010.
[15] A. Nandi and H. V. Jagadish, “Assisted querying using
instant-response interfaces,” in Proc. ACM SIGMOD, Beijing,
China, 2007, pp. 1156–1158.
[16] W. B. Frakes and R. A. Baeza-Yates, Information
Retrieval: Data Structures and Algorithms. Englewood Cliffs,
NJ, USA: Prentice-Hall, 1992.
[17] C. Li, N. Yan, S. B. Roy, L. Lisham, and G. Das,
“Facetedpedia: Dynamic generation of query-dependent
faceted interfaces for wikipedia,” in Proc. WWW, Raleigh,
NC, USA, Apr. 2010, pp. 651–660
[18] B. Hore, S. Mehrotra, M. Canim, and M. Kantarcioglu,
“Secure multidimensional range queries over outsourced
data,” The Very Large Data Bases J., vol. 21, pp. 333–358,
2012.
[19] B. Hore, S. Mehrotra, and G. Tsudik, “A privacy-
preserving index for range queries,” in Proc. 30th Int. Conf.
Very Large Data Bases,2004, pp. 720–731.
[20] S. B. Roy, H. Wang, U. Nambiar, G. Das, and M. K.
Mohania, “Dynacet: Building dynamic faceted search systems
over databases,” in Proc. ICDE, Shanghai, China, Mar. 2009,
pp. 1463–1466.
[21] N. Khoussainova, M. Balazinska, W. Gatterbauer, Y.
Kwon, and D. Suciu, “A case for a collaborative query
management system,” in Proc. CIDR, Asilomar, CA, USA,
Jan. 2009.
[22] Q. T. Tran, C.-Y. Chan, and S. Parthasarathy, “Query by
output,” in Proc. SIGMOD, Providence, RI, USA, Sept. 2009,
pp. 535–548.
[23] K. Chen, H. Chen, N. Conway, J. M. Hellerstein, and T.
S. Parikh, “Usher: Improving data quality with dynamic
forms,” in Proc. ICDE, Long Beach, CA, USA, Mar. 2010, pp.
321–332.
[24] S. Zhu, T. Li, Z. Chen, D. Wang, and Y. Gong, “Dynamic
active probing of helpdesk databases,” Proc. VLDB, vol. 1,
no. 1, pp. 748–760, Aug. 2008.
[25] Yun Ding and Karsten Klein, “Model-Driven Application-Level
Encryption for the Privacy of E-Health Data,” 2010
International Conference on Availability, Reliability and
Security.