Efficient Data Hiding Techniques for Digital Rights Management of Multimedia Archives

BY Hafiz Muhammad Aslam Malik

B.S. (University of Engineering and Technology Lahore, Pakistan) 1999

Preliminary Proposal

Submitted in partial fulfillment of the requirements for the degree of Ph.D.

in the Graduate College of the University of Illinois at Chicago, 2004

Chicago, Illinois

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
  MOTIVATION
  PROBLEM STATEMENT

CHAPTER 2 RELATED WORK
  2.1 DATA HIDING SYSTEMS: APPLICATIONS AND REQUIREMENTS
    2.1.1 REQUIREMENTS OF A DATA HIDING SYSTEM
      I. Robustness
      II. Effectiveness
      III. Fidelity
      IV. Capacity
      V. Blind or Informed Detection
      VI. False Positive Rate
      VII. Multiple Watermarks Capability
      VIII. Cost
    2.1.2 APPLICATIONS OF DATA HIDING FOR DIGITAL RIGHTS MANAGEMENT
      I. Ownership Protection
      II. Content Authentication
      III. Fingerprinting
      IV. Copy Protection
      V. Broadcast Monitoring
  2.2 CLASSIFICATION OF DATA HIDING TECHNIQUES
    2.2.1 CLASSIFICATION BASED ON HOST MEDIA TYPE
      I. Data Hiding in Images
      II. Data Hiding in Video
      III. Data Hiding in Audio
      IV. Data Hiding in Text
    2.2.2 CLASSIFICATION BASED ON DATA HIDING APPLICATIONS
      I. Robust Data Hiding
      II. Fragile Data Hiding
      III. Semi-Fragile Data Hiding
    2.2.3 CLASSIFICATION BASED ON PERCEPTIBILITY
      I. Imperceptible Data Embedding
      II. Visible Data Embedding
    2.2.4 CLASSIFICATION BASED ON DATA EMBEDDING DOMAIN
      I. Data Hiding in Spatial/Time Domain (Direct Domain)
      II. Data Hiding in Transformed Domain
    2.2.5 CLASSIFICATION BASED ON DATA EMBEDDING METHOD
      I. Additive Spread Spectrum or Host-Interference-Non-rejecting Methods
      II. Host Interference Rejecting Methods
    2.2.6 CLASSIFICATION BASED ON DATA EXTRACTION METHOD
      I. Private or Informed Data Hiding
      II. Semi-Private Data Hiding
      III. Public or Blind Data Hiding
  3.3 DIGITAL RIGHTS MANAGEMENT: A BRIEF OVERVIEW
  CHAPTER SUMMARY

CHAPTER 3 DATA HIDING MODELS
  3.1 NOTATION
  3.2 TRANSMISSION CHANNELS
    3.2.1 BOUNDED DISTORTION CHANNELS
    3.2.2 BOUNDED HOST-DISTORTION CHANNELS
    3.2.3 ADDITIVE NOISE CHANNELS
  3.3 DATA HIDING IN COMMUNICATION FRAMEWORK
    3.3.1 CLASSICAL MODEL OF COMMUNICATIONS SYSTEM
    3.3.2 SECURE TRANSMISSION
    3.3.3 DATA HIDING MODEL BASED ON COMMUNICATION
  3.4 DATA HIDING AS COMMUNICATION WITH SIDE INFORMATION AT THE TRANSMITTER
  3.5 GEOMETRIC MODEL OF DATA HIDING
  CHAPTER SUMMARY

CHAPTER 4 BLIND DATA EMBEDDING
  4.1 DATA HIDING BASED ON ADDITIVE EMBEDDING
  4.2 WORK IN PROGRESS: ROBUST AND HIGH RATE DATA EMBEDDING
    4.2.1 DATA HIDING USING FREQUENCY SELECTIVE BASED SPREAD SPECTRUM
      4.2.1.1 WATERMARKING USING PERCEPTUAL AUDITORY MODEL
      4.2.1.2 SALIENT POINT EXTRACTION
      4.2.1.3 WATERMARK EMBEDDING
        4.2.1.3.1 Watermark Generation
        4.2.1.3.2 Watermark Embedding
      4.2.1.4 WATERMARK DETECTION
      4.2.1.5 EXPERIMENTAL RESULTS
  4.3 FUTURE DIRECTIONS
    4.3.1 PROPOSED DATA HIDING SCHEME FOR IMAGES
    4.3.2 PROPOSED DATA HIDING SCHEME FOR VIDEO

CHAPTER 5 INFORMED DATA EMBEDDING
  5.1 INFORMED EMBEDDING
    5.1.1 COSTA’S WORK
  5.2 QUANTIZATION INDEX MODULATION (QIM)
    5.2.1 BINARY DITHER MODULATION
  5.3 WORK IN PROGRESS: HIGH RATE DATA EMBEDDING USING INFORMED ENCODING
    5.3.1 DATA HIDING USING FREQUENCY SELECTIVE DITHERING (OUR CONTRIBUTION)
      5.3.1.1 FIR APPROXIMATION OF APF
      5.3.1.2 DATA EMBEDDING
      5.3.1.3 DATA DETECTION USING SIGNAL MODELING
        5.3.1.3.1 Spectrum Estimation
        5.3.1.3.2 Allpass Filter Parameter Estimation
        5.3.1.3.3 Simulation Results
      5.3.1.4 DATA DETECTION USING MATCHED FILTER
        5.3.1.4.1 Simulation Results
  5.4 FUTURE DIRECTION
    5.4.1 EXTENSION AUDIO FINGERPRINTING AND AUTHENTICATION

CHAPTER 6 CONCLUSION & FUTURE DIRECTIONS
REFERENCES

TABLE OF FIGURES

FIGURE 2.1: GENERAL CLASSIFICATION OF DATA HIDING
FIGURE 2.2: ANATOMY OF A DRM TRANSACTION
FIGURE 3.1: STANDARD COMMUNICATION MODEL
FIGURE 3.2: STANDARD SECURE COMMUNICATION MODEL
FIGURE 3.3: GENERAL MODEL FOR DATA HIDING
FIGURE 3.4: DATA HIDING SYSTEM WITH INFORMED DETECTOR ANALOGOUS TO STANDARD SECURE COMMUNICATION MODEL
FIGURE 3.5: DATA HIDING SYSTEM WITH BLIND DETECTOR ANALOGOUS TO STANDARD SECURE COMMUNICATION MODEL
FIGURE 3.6: DATA HIDING AS COMMUNICATION WITH SIDE INFORMATION AT THE ENCODER
FIGURE 4.1: 5-LEVEL MODIFIED DISCRETE WAVELET ANALYSIS FILTER BANK
FIGURE 4.2: BLOCK DIAGRAM OF WATERMARK EMBEDDING PROCESS
FIGURE 4.3: NORMALIZED CORRELATION FOR WATERMARKED SUBBAND (LEFT) AND UNWATERMARKED SUBBAND (RIGHT)
FIGURE 4.4: BLOCK DIAGRAM FOR WATERMARK DETECTION
FIGURE 4.5: DPM FOR DIFFERENT VALUES OF NOISE POWER (PN)
FIGURE 5.1: INFORMED DATA EMBEDDING FOLLOWED BY AWGN ATTACK
FIGURE 5.2: BINARY DITHERED MODULATION SCHEME BASED ON DITHERED UNIFORM SCALAR QUANTIZATION
FIGURE 5.3: MAGNITUDE RESPONSE OF APF H(EJW) APPROXIMATION FOR DIFFERENT VALUES OF LENGTH (L)
FIGURE 5.4: POLE-ZERO LAYOUT OF HAPI(Z) FOR BINARY ENCODING
FIGURE 5.5: POLE-ZERO LAYOUT OF HAPI(Z) FOR 4-ARY ENCODING
FIGURE 5.6: BLOCK DIAGRAM OF THE DATA EMBEDDING SCHEME
FIGURE 5.7: BLOCK DIAGRAM OF THE DATA DETECTION PROCESS
FIGURE 5.8: PROBABILITY OF ERROR (PE) VS. SNR PLOT FOR BOTH ENCODING SCHEMES
FIGURE 5.9: MAGNITUDE SPECTRUM OF CZT OF THE SUBBAND SEQUENCE X4,6(N) BEFORE AND AFTER PASSING THROUGH H0(Z I), I.E. Y4,6(N), AT R = 0.9 (RIGHT) AND AT R = 1/0.9 (RIGHT)
FIGURE 5.10: BLOCK DIAGRAM OF THE DATA DETECTION USING MATCHED FILTER
FIGURE 5.11: PROBABILITY OF ERROR FOR DIFFERENT SNR VALUES

CHAPTER 1

Introduction

The revolution in digital information has visibly impacted our society and everyday life [171, 172]. Some of the blessings of this digital revolution include the evolution of the Internet into a global village, the availability of low-cost, large-capacity storage devices, the deployment of seamless long-distance networks at Gbps (gigabits per second) data rates, and the popular use of state-of-the-art multimedia production equipment (such as palmtops, digital cameras, camcorders, high-tech scanners and printers, and digital audio recorders). Furthermore, developments in digital media production, manipulation, and distribution have added new dimensions to the technical challenges related to digital data security and integrity. Along with its countless advantages, the cutting-edge technologies of this digital information revolution have generated serious concerns about digital content protection, ownership protection, unauthorized copy prevention, etc. Today’s entertainment industry (music and film) alone claims a multimillion-dollar annual revenue loss due to piracy [171], which is likely to increase in the coming years given the fast-growing trend of exchanging digital media (music, images, movies, software, e-books, etc.) over peer-to-peer networks. There is an urgent need to develop robust technologies to support digital rights management (DRM) systems capable of providing diverse services such as secure media streaming between user and content server, ownership protection, prevention of unauthorized copying and content usage, content authentication, and content usage tracing.

Generally digital rights management (DRM) systems consist of a set of rights models (business

models) and technologies to support the above-mentioned services. However, the research

proposed in this dissertation deals only with some of the technological issues of DRM systems.

These technological issues define the reliability of a DRM system.

Most existing DRM systems use traditional content protection schemes such as encryption for digital content protection, secure content delivery, and usage tracking [170]. However, encryption and scrambling alone cannot adequately protect ownership rights or prevent unauthorized content usage and copying. Encrypted or scrambled data remain protected as long as the decryption or unscrambling key is unknown, but once the data are decrypted or unscrambled there is no way to stop their reproduction or sharing [168]. Thus there is a strong need to complement cryptography. Data hiding and watermarking (a special case of data hiding) are promising technologies for addressing the shortcomings of traditional content-protection technologies.

In general, information hiding or data hiding means imperceptibly embedding information (a message or metadata) into a host signal (images, video, audio, text, etc.) for a variety of applications such as secret communication (steganography), content protection, ownership protection, and illegal copy prevention. Salient characteristics of any data hiding scheme include embedding capacity or payload, minimal embedding distortion, robustness to attacks, a low false positive rate, and a low error probability of the received data. Among these, embedding capacity, embedding distortion (fidelity), and robustness are three interdependent features that are also used to evaluate the performance of data embedding schemes. Embedding capacity refers to the amount of data that can be embedded in a given multimedia clip. Embedding distortion or fidelity measures the perceptibility of the embedded information. Robustness refers to the capability of a data hiding scheme to withstand intentional and unintentional attacks. Here, intentional attacks include filtering, chopping, scaling, Gaussian or uniform noise addition, resampling, etc., whereas lossy compression, digital-to-analog conversion, and requantization are generally treated as unintentional attacks.
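The interplay between capacity, fidelity, and robustness can be made concrete with a small experiment. The following is a minimal sketch, not the scheme proposed in this document: it uses a generic textbook-style additive spread-spectrum embedder with blind correlation detection, a synthetic white Gaussian host, and an additive Gaussian noise attack. The function names (embed, detect, snr_db, trial) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def embed(host, bits, key, alpha):
    """Add +/-alpha times a key-seeded +/-1 chip sequence, one block of chips per bit."""
    rng = random.Random(key)
    chips_per_bit = len(host) // len(bits)
    marked = list(host)
    for b, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for i in range(b * chips_per_bit, (b + 1) * chips_per_bit):
            marked[i] += alpha * sign * rng.choice((-1.0, 1.0))
    return marked


def detect(received, n_bits, key):
    """Blind detection: correlate each block with the regenerated chip sequence."""
    rng = random.Random(key)
    chips_per_bit = len(received) // n_bits
    decoded = []
    for b in range(n_bits):
        corr = 0.0
        for i in range(b * chips_per_bit, (b + 1) * chips_per_bit):
            corr += received[i] * rng.choice((-1.0, 1.0))
        decoded.append(1 if corr > 0 else 0)
    return decoded


def snr_db(reference, test):
    """Signal-to-distortion ratio in dB between a reference and a modified signal."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, test))
    return 10.0 * math.log10(signal / noise)


def trial(n_bits, alpha, noise_std, n_samples=4096, seed=1):
    """Return (payload in bits per sample, embedding SNR in dB, bit error rate)."""
    rng = random.Random(seed)
    # Assumption: white Gaussian host, used only to keep the sketch short.
    host = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    bits = [rng.randint(0, 1) for _ in range(n_bits)]
    marked = embed(host, bits, key=42, alpha=alpha)
    attacked = [s + rng.gauss(0.0, noise_std) for s in marked]  # noise attack
    decoded = detect(attacked, n_bits, key=42)
    errors = sum(a != b for a, b in zip(bits, decoded))
    return n_bits / n_samples, snr_db(host, marked), errors / n_bits


if __name__ == "__main__":
    for n_bits, alpha in [(4, 0.5), (4, 0.1), (64, 0.1)]:
        print(n_bits, alpha, trial(n_bits, alpha, noise_std=0.5))
```

In this sketch, raising alpha makes detection more reliable but lowers the embedding SNR (worse fidelity), while raising n_bits increases payload but leaves fewer chips per bit, so the host itself interferes more strongly with detection. This is one way to see why the three criteria cannot be optimized independently.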

Digital watermarking, a special case of data hiding, is the process of embedding information into the host data (cover data) for content protection, integrity, and security. Robustness of the embedded information against data hiding attacks is the most desirable feature of watermarking schemes. A watermark is an imperceptible signal that carries information about the data in which it is embedded, is inseparable from that data, and undergoes the same transformations as the host data. These attributes distinguish watermarking from traditional digital content protection techniques [7] such as cryptography and scrambling, and they make watermarking an attractive tool for digital media protection, traitor tracing, content usage monitoring, broadcast monitoring, and communication with side information to improve the quality of service (QoS) of multimedia transmission over lossy channels.

The growing availability of digital information in different formats, together with its increasing illegal sharing and distribution, has led to the proliferation of DRM technologies, including data hiding schemes designed for applications such as copyright protection, media authentication, and broadcast monitoring [5 – 33], steganographic techniques for covert communication [8, 127 – 137], and fingerprinting methods for traitor-tracing applications [90 – 99]. It has also renewed the interest of information theorists in the data hiding problem, e.g. Moulin et al. [39 – 41, 47 – 54], Cox et al. [7, 59, 69, 71, 139, 130, 144, 145, 150], Chen et al. [80 – 86], Girod et al. [57, 58, 70, 104 – 112], P-Gonzalez et al. [74 – 76, 115, 135, 146 – 149], Cohen et al. [43 – 46, 55, 56], and others [6, 25 – 33]. Most of the theoretical advances in the area of data hiding are attributed to the following classical works:

“Writing on Dirty Paper” by M. Costa [37],

“Coding for Channel with Random Parameters” by Gel’fand and Pinsker [35],

“Channels with Side Information at the Transmitter” by Shannon [34].

Inspired by these papers, many researchers have modeled the data hiding problem using signal processing, communication theory, coding theory, and information theory frameworks. From an application perspective, most existing data hiding research [9 – 24] focuses on digital image data. Relatively little attention has been given to data hiding in digital video and audio data. Audio data models (perceptual as well as real data models) are quite different from image and video data models. Therefore, a data hiding scheme that performs well for images or video may not perform equally well for audio, and vice versa. In the following we outline various shortcomings of the existing data hiding schemes. A more detailed analysis of the related work is provided in Chapter 3.

First, the host data is generally modeled as an independent and identically distributed (i.i.d.) Gaussian random sequence, and the attack channel is modeled as an independent discrete memoryless (DM) Gaussian channel [39 – 56]. These models do not agree with real host data (audio, video, and images) because, in general, multimedia data does not exhibit an i.i.d. Gaussian distribution [9]. Similarly, in practice, active adversary attacks are host data dependent [104 – 126], especially when the adversary has knowledge of the host data. Therefore, there is a need for more realistic and appropriate modeling of the host data and the attack channel for performance analysis of a given data hiding scheme.
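The two modeling assumptions questioned here can be written compactly. The notation below (x for the host samples, s for the marked signal, n for the attack noise, y for the received signal, and the two variances) is assumed purely for illustration and is not taken from this proposal's own notation.

```latex
\[
x_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma_x^2), \qquad
y_i = s_i + n_i, \qquad
n_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma_n^2), \quad
n_i \text{ independent of } s_i .
\]
```

Under these assumptions neighboring host samples are uncorrelated and the attack ignores the host entirely. Real audio, image, and video samples are correlated, and a host-aware adversary can shape the attack to the content, which is why both assumptions are considered unrealistic.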

Secondly, almost all existing data hiding schemes measure perceptual quality (fidelity) using the mean squared error metric [39 – 41, 80 – 86, 115, 135, 146 – 149], which often does not agree with the human perceptual model [2, 7, 60]. An appropriate perceptual distortion measure is therefore needed to evaluate performance in terms of both robustness and the perceptual distortion due to information embedding.
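One way to see the limitation of mean squared error is that it depends only on the energy of the distortion, not on where that energy sits relative to the host. The following minimal sketch uses hypothetical tone frequencies, amplitudes, and helper names (mse, tone) chosen only for illustration. It adds the same small sinusoidal distortion to a 1 kHz host tone once close to the host frequency and once far from it.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def tone(freq_hz, amplitude, n, rate_hz):
    """Sampled sinusoid with the given frequency and amplitude."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / rate_hz)
            for i in range(n)]


RATE = 8000  # samples per second (illustrative value)
N = 8000     # one second, so every tone below completes whole cycles

host = tone(1000, 1.0, N, RATE)
# Same distortion amplitude, placed near the host frequency and far from it.
near = [h + d for h, d in zip(host, tone(1100, 0.05, N, RATE))]
far = [h + d for h, d in zip(host, tone(3500, 0.05, N, RATE))]

mse_near = mse(host, near)
mse_far = mse(host, far)

if __name__ == "__main__":
    print(mse_near, mse_far)
```

Both distortions yield the same mean squared error, although a perceptual model that accounts for frequency masking would generally not rate them as equally audible. A fidelity measure based on such a model can therefore rank embedding distortions differently from mean squared error.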

Thirdly, low data rate is a common limitation of existing data hiding schemes [5 – 24]. Since data hiding applications such as broadcast monitoring require a relatively high data rate, it is desirable to develop high-capacity data hiding schemes for such applications. Several researchers [71 – 86] have proposed high-capacity data hiding schemes, but their work is based on improving coder performance by using efficient coding schemes and/or host signal interference cancellation [77 – 89]. Little attention has been given to exploiting host data characteristics in combination with an efficient coder and data hiding strategy to achieve a high data embedding rate.

Moreover, most existing DRM systems rely on encryption for content protection and content tracking, which alone cannot provide sufficient protection against active adversary attacks.

Finally, most research in the data hiding community focuses on traditional copy control issues such as copyright protection, content authentication, tamper detection, unauthorized copy prevention, content usage monitoring, and broadcast monitoring. Very little attention has been given to broadening the data hiding application domain beyond copy control. For example, data hiding can be used in multimedia transmission over lossy and bursty channels to improve the QoS.

Problem Statement:

Most existing data hiding schemes [4 – 33] are based on i.i.d. Gaussian modeling of the host data and independent Gaussian discrete memoryless channel (DMC) modeling of adversary attacks or attack channels [39 – 56]. These assumptions do not hold in general. Similarly, fidelity-based performance measures for existing data embedding schemes generally use the mean squared error distortion criterion, which does not agree with the human audio-visual perceptual model. It is therefore desirable to develop more realistic host data, attack channel, and embedding distortion models for performance analysis of existing data embedding schemes. The main goal of this research is to advance the theory underlying data hiding, develop new data hiding techniques, and extend data hiding applications to digital rights management systems. In this dissertation we propose to analyze the limitations of existing data hiding schemes and their applications to different types of host data. Based on this analysis we will propose efficient data hiding schemes for a reliable DRM system. We intend to analyze the performance of the proposed schemes on the triad of data hiding performance criteria (capacity, perceptibility, and robustness) using more realistic data and channel models. We also plan to develop a more realistic measure of embedding distortion in order to evaluate the perceptual performance of the proposed data hiding schemes. In particular, we propose to investigate the following research tasks:

Develop high-capacity data hiding schemes based on host signal features along with efficient codec and data hiding strategies.

Devise realistic host data models (stochastic models) for each type of host media, i.e. audio, images, and video, separately, for information-theoretic analysis of the proposed data hiding schemes.

Design efficient source coding schemes based on the developed host data models.

Develop appropriate channel models for intentional and unintentional attacks and analyze their performance against existing attack channel models.

Devise a realistic distortion metric to evaluate performance based on perceptibility and robustness.

Develop an appropriate protocol for online multimedia authentication, copy control, and copyright protection applications.

Devise a suitable data hiding scheme for multimedia indexing and retrieval applications.

Develop a realistic data hiding strategy to improve the QoS of multimedia transmission over lossy and bursty channels.

The remainder of the dissertation proposal is organized as follows. Chapter 2 discusses the requirements and application domains of data hiding schemes, along with a general classification of existing data hiding schemes; it also provides a brief overview of DRM systems. Related work and data hiding models are given in Chapter 3. Our contribution to blind data embedding, or the additive spread spectrum class of data hiding, is discussed in Chapter 4. Chapter 5 gives the details of our proposed work in the informed embedding class of data hiding. Our proposed data hiding schemes for both classes use audio data as the host media; their extensions to image and video host data are also proposed. Future directions of our proposed research are outlined in Chapter 6.

CHAPTER 2

Preliminaries

This chapter presents an overview of the generic characteristics and requirements of the data hiding problem, briefly describes the related application domains, and provides a general classification of existing data hiding schemes. The challenges, shortcomings, and promises of data hiding schemes are outlined in Section 2.1. Section 2.2 gives a general classification of existing data hiding schemes. The transmission channel model is an important ingredient in the theoretical analysis of the data hiding problem. Common transmission channel models that have been used for modeling attacks against data hiding and watermarking schemes, such as bounded distortion channels, bounded host-distortion channels, and additive noise channels, are discussed in Section 2.3. A brief overview of DRM systems is provided in Section 2.4.

2.1 Data Hiding Systems: Requirements and Applications

Digital multimedia (throughout this document, digital multimedia or host media refers to digital audio, digital video, and digital images, unless otherwise specified) has many advantages over analog multimedia. For example, contemporary digital media storage devices such as CDs and memory sticks show insignificant aging effects, and reproduction of digital media is very simple: a copy of a digital media clip is identical to the original. Owing to recent advances in techniques for digital data production, distribution, and manipulation, research in data hiding and watermarking has expanded rapidly, with the goal of complementing the deficiencies of conventional content protection methods such as cryptography and scrambling [7, 8].

2.1.1 Requirements of a Data Hiding System:

A data hiding scheme is characterized by a number of defining properties [5 – 8]. In general, a

data hiding scheme is supposed to withstand common data manipulations, such as lossy

compression, digital-to-analog conversion, rescaling, requantization, resampling, filtering, data

format conversion, encryption, decryption, and scrambling. It is also supposed to withstand

active adversary attacks, such as noise addition, as long as the attack channel distortion is below a

certain masking threshold. However, the relative importance of each property depends on the

requirements of the application and the role of data embedding in the application. For example,

if we are evaluating the performance of an audio watermarking system for a copy control

application, we may need to check the robustness of the short-time energy ratio that an adversary might

use for attack. However, such robustness might be irrelevant for broadcast monitoring

applications. Therefore, the performance of any data hiding scheme should be evaluated based

on the underlying application. The desirable properties of a generic data hiding

scheme are as follows:

I. Robustness:

Robustness measures the ability of the embedded data or watermark to withstand intentional

and unintentional attacks. Unintentional attacks generally include common data processing

operations, e.g., compression, digital-to-analog conversion, resampling, and requantization, whereas

intentional attacks cover a broad range of degradations [104 – 126], for example, white and

colored noise addition, scaling, rotation (for image and video watermarking schemes), chopping,

low-pass filtering, etc. Details of these intentional attacks in the area of data hiding and their

countermeasures can be found in [8, 131, 132].

II. Effectiveness:

The probability that the output of the embedder will be watermarked for a randomly selected

input is generally referred to as the effectiveness of a data hiding scheme.

III. Fidelity:

This is an important property of all perceptually based data hiding schemes [5 – 24]. Fidelity

measures the perceptual similarity between the host media and its data-embedded version. To

meet this constraint, the perceptual distortion introduced by embedding is kept below the

masking threshold of the human auditory system (HAS) for audio data hiding schemes and of the human

visual system (HVS) for video and image data hiding schemes.

IV. Capacity:

This property refers to the amount of information that a data hiding scheme can successfully

embed without introducing perceptual distortion. The need for this property is application

dependent. For example, a data hiding scheme designed for copyright protection or copy control

does not require high data embedding capacity because only a few bits of information

are sufficient for this application. In contrast, a data embedding scheme for broadcast monitoring

applications needs to embed a relatively large amount of data [6, 7].

V. Blind or Informed Detection:

This property relates to the availability of the host data at the detector during watermark

detection. If the host data is available at the detector, the

scheme is categorized as an informed-detector or private data hiding scheme.

These schemes are required for fingerprinting and data authentication [5 – 7]. If the host data is

not available at the detector, the

scheme is categorized as a blind-detector or public data hiding scheme. Blind-detector based

data hiding schemes are commonly used for copy control applications.

VI. False Positive Rate:

This property corresponds to the frequency of detecting a mark in an unmarked portion of the host

data. It is an important property for content protection applications such as ownership rights and

copy control.
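
As a rough illustration of this property, the sketch below estimates the false positive rate of a correlation detector empirically by running it on unmarked host data only. The +/-1 reference pattern, the Gaussian host model, the threshold values, and all function names are assumptions made for this example; they are not taken from the schemes cited above.

```python
import random


def pn_pattern(key, n):
    """Key-dependent +/-1 reference pattern (hypothetical helper)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def correlation(signal, pattern):
    """Normalized linear correlation between a signal and the reference pattern."""
    return sum(s * p for s, p in zip(signal, pattern)) / len(signal)


def false_positive_rate(threshold, trials=1000, n=256, key=7, seed=1):
    """Fraction of unmarked Gaussian hosts whose correlation exceeds the threshold."""
    rng = random.Random(seed)
    pattern = pn_pattern(key, n)
    hits = 0
    for _ in range(trials):
        unmarked = [rng.gauss(0.0, 1.0) for _ in range(n)]  # no mark embedded
        if correlation(unmarked, pattern) > threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # A higher detection threshold lowers the false positive rate.
    for t in (0.0, 0.1, 0.5):
        print(t, false_positive_rate(t))
```

Raising the threshold reduces false positives but, for a fixed embedding strength, also makes it more likely that a genuinely marked signal is missed, so the threshold has to be chosen per application.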

VII. Multiple Watermarks Capability:

The ability of a data hiding scheme to embed more than one mark in the same host data is

desirable in some applications, such as fingerprinting. For example, consider a situation where the

owner and the chain of distributors of a multimedia product want to embed their marks (serial

numbers or tags) to track content usage and to trace a traitor. For such applications, a

multiple watermark embedding feature is desirable.

VIII. Cost:

The computational cost of the embedding and detection algorithms is another evaluation criterion for

data hiding schemes; it is critical for real-time applications, such as broadcast monitoring and

online content authentication. On the other hand, for ownership proof applications this

property is not as critical.

2.1.2 Applications of Data Hiding:

The application domain of data hiding techniques is growing rapidly. Recently, several research

efforts [5, 9, 10, 150 – 158] have aimed beyond the classic applications of data hiding, which include

ownership protection, content usage tracking, content authentication, copy control,

fingerprinting, broadcast monitoring, indexing, and medical safety [5 – 24]. A brief overview of a

few of these applications and their design requirements is given in the following:

I. Ownership Protection:

A watermark carrying the ownership information is embedded into the host data. The

watermarking scheme used for ownership protection is expected to be resilient to common data

processing operations (unintentional attacks) and to intentional attacks. In the case of a dispute over

ownership of the host data, the embedded watermark can be used as proof to identify the true

owner of the host data. Watermarking schemes intended for ownership protection must have a low

probability of error and of false alarm. In general, the capacity (payload) requirement of a

watermarking scheme designed for ownership protection applications does not need to be high.

II. Content Authentication:

Robustness and undetectability are not the main concerns for the content authentication application

of data hiding. Therefore, fragile watermarking is generally used for such applications. A

watermark is embedded in the host data and is later used to detect tampering with the

host media. Recent content authentication schemes are also capable of identifying the locations

of tampering in the host media [9 – 24, 100 – 103]. These applications generally require an

informed detector, i.e., the original host data is available to the detector for content authentication.

Data hiding schemes for content authentication must have high embedding capacity to meet the

requirements of the content authentication applications [6].

III. Fingerprinting:

The owner or distributor of multimedia content uses fingerprinting or labeling to trace

illegal copies or a traitor. For such applications, the content owner or distributor embeds a unique

fingerprint, label, or serial number in each copy of the data before distributing it to each

customer. A fingerprinting scheme is required to survive intentional and unintentional

attacks, more specifically collusion attacks [90 – 99]. Fingerprinting does not require high

embedding capacity but does require robustness in general.

IV. Copy Protection:

Information embedded in the host multimedia data can be used to control the copying device for

unauthorized copy prevention [7]. For this purpose, a watermark detector is generally integrated

in the recording or playback system, as in the DVD copy control scheme proposed in [150] or

the proposed SDMI player [159]. Data hiding schemes for such applications should be robust against

all intentional or unintentional attacks that tamper with the watermark in the watermarked

data. Moreover, data hiding techniques designed for copy control generally use a blind detector

and require low data embedding capacity.

V. Broadcast Monitoring:

An automated (active) broadcast monitoring system can be used to detect the watermark

embedded in broadcast commercial advertisements [5, 7, 158]. In addition, an active

broadcast surveillance system can also be used for other TV products (news, talk shows, etc.)

protected by broadcast monitoring watermarking systems. For such applications, the watermarking

scheme should be robust against watermarking attacks and requires a blind detector for the

watermark detection process. Furthermore, such applications require low watermark embedding

capacity.

2.2 Classification of Data Hiding Techniques

This section provides a general classification of existing data hiding techniques based on the

following six criteria:

host media type (images, video, audio, and text),

areas of applications (robust, fragile, and semi-fragile),

perceptibility (visible and invisible),

embedding domain (spatial and transform),

data embedding schemes (known-host-state and known-host-statistics), and

data extraction techniques (private, semi-private, and public).

This classification hierarchy of data hiding techniques is illustrated in Figure 2.1.

[Figure 2.1: tree diagram. Data hiding is divided by host media type (image, video, audio, text), by application (robust, semi-fragile, fragile), by perceptibility (imperceptible, visible), by embedding domain (direct domain, transformed domain), by embedding scheme (additive spread spectrum techniques or blind data embedding; host interference cancellation techniques or informed data embedding), and by extraction scheme (private, semi-private, public).]

Figure 2.1: General Classification of Data Hiding

Each category of data hiding schemes is discussed briefly as follows.

2.2.1 Classification Based on Host Media Type

Most data hiding research has focused on digital images rather than the other host media

types, i.e., video, audio, and text. This is because the performance evaluation of a data

hiding scheme for digital images is relatively easier than for digital audio and video, since the

performance evaluation of a data embedding scheme for audio or video generally requires

subjective testing. Data hiding techniques based on host media type can be divided into four

subgroups [127 – 150]:

1. Data Hiding in Images

2. Data Hiding in Video

3. Data Hiding in Audio

4. Data Hiding in Text

2.2.2 Classification Based on Data Hiding Applications

The required robustness, capacity, and fidelity of a data hiding scheme depend on the

application of interest. For example, copyright protection applications require robust

watermarking [57 – 89], whereas content verification applications need fragile watermarking

[7, 9 – 24]. Similarly, fingerprinting needs semi-fragile watermarking [90 – 103]. Therefore,

existing data hiding schemes can be classified into three sub-groups based on the application of

interest:

1. Robust Data Hiding

2. Fragile Data Hiding

3. Semi-Fragile Data Hiding

2.2.3 Classification Based on Perceptibility

Existing data hiding schemes can be divided into two main categories based on the perceptibility

(fidelity) of embedded data [5 – 24], that is,

1. Imperceptible Data Embedding

2. Visible Data Embedding

Imperceptible data embedding implies that the embedded data is invisible (in the case of image, video,

and text host media) or inaudible (for audio host media). Imperceptible data embedding

schemes are more common than visible data embedding schemes [60 – 68]. Imperceptible

data embedding schemes exploit the HVS and HAS characteristics to ensure imperceptibility of

the embedded data. Visible data embedding schemes are generally used to imprint a visible logo in

digital images or video.

2.2.4 Classification Based on Data Embedding Domain

Existing data hiding schemes can be classified into two major categories based on embedding

domain of the host media, that is,

1. Data Hiding in Spatial/Time Domain (Direct Domain)

2. Data Hiding in Transformed Domain

Least significant bit (LSB) encoding, patchwork, and echo hiding are a few common

direct domain data embedding schemes [127, 128, 131, 136, 160]. Direct domain

data hiding schemes are very popular in the data hiding community. The discrete cosine transform

(DCT), discrete wavelet transform (DWT), and discrete Fourier transform (DFT) are the most

commonly used transforms for data embedding. Most DCT-based image data embedding

methods transform the host image in blocks of 8x8 pixels; the watermark is then

embedded by modifying the DCT coefficients according to the HVM [9 – 24]. In DWT-based data

embedding algorithms, the host data is first decomposed into subbands using the DWT; then, for data

embedding, the wavelet coefficients in the selected subbands are modified based on a human

perceptual model. Robust data hiding schemes for images and video that are resilient to rotation,

scaling, and translation (RST) distortion generally use DFT-based embedding [6, 7,

127 – 141, 143, 144]. DFT-based algorithms are also common for audio data hiding schemes.
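
As a concrete instance of direct domain embedding, the following minimal sketch implements LSB encoding on integer samples (for example, 8-bit pixel values or 16-bit audio samples). The function names and the one-bit-per-sample layout are assumptions made for illustration; the cited schemes differ in their details.

```python
def lsb_embed(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    if len(bits) > len(samples):
        raise ValueError("message is longer than the host")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        marked[i] = (marked[i] & ~1) | (bit & 1)
    return marked


def lsb_extract(samples, n_bits):
    """Read back the least significant bit of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]


if __name__ == "__main__":
    host = [120, 37, 255, 0, 64, 199]
    message = [1, 0, 0, 1]
    marked = lsb_embed(host, message)
    print(marked)                  # [121, 36, 254, 1, 64, 199]
    print(lsb_extract(marked, 4))  # [1, 0, 0, 1]
```

Each sample changes by at most one quantization level, which keeps the embedding distortion small, but any operation that requantizes the samples can destroy the message.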

2.2.5 Classification Based on Data Embedding Method

Existing data hiding schemes based on the data embedding methods can be classified into two

major categories [75 – 86, 115, 139], that is,

1. Additive Spread Spectrum or Host-Interference-Non-Rejecting Methods

2. Host Interference Rejecting Methods or Informed Embedding

In the case of additive spread spectrum based data hiding, a pseudorandom sequence w(m_i)

generated using a secret key or message m_i is added to the host signal, i.e.,

$$x(m_i) = C_o + \alpha \, w(m_i), \qquad 0 < \alpha \le 1 \qquad (2.1)$$

where α is called the scaling factor; its value sets the tradeoff between robustness and fidelity of

the embedded data.

From Eq. 2.1 it is clear that, for this class of data hiding, the host signal C_o acts as additive

interference if a blind detector is used for watermark detection, which ultimately limits the

performance of the detector; even in the absence of an attack channel, zero error probability is hard

to achieve. However, these methods outperform the host interference rejecting methods under severe

attack conditions. Most of the existing data hiding methods [9 – 24] fall into the additive spread

spectrum class.
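
The additive embedding rule of Eq. 2.1 and the resulting host interference at a blind detector can be sketched as follows. This is a minimal illustration under stated assumptions: the +/-1 pseudorandom sequence, the antipodal mapping of one message bit onto the sign of that sequence, the Gaussian host, and the function names are all hypothetical choices, not the author's scheme.

```python
import random


def pn_sequence(key, n):
    """Pseudorandom +/-1 sequence generated from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(host, bit, key, alpha):
    """x = C_o + alpha * w, with w = +pn for bit 1 and -pn for bit 0 (assumed mapping)."""
    w = pn_sequence(key, len(host))
    sign = 1.0 if bit else -1.0
    return [c + alpha * sign * wi for c, wi in zip(host, w)]


def blind_detect(received, key):
    """Correlate with the key's sequence; the host C_o is not available here."""
    w = pn_sequence(key, len(received))
    corr = sum(r * wi for r, wi in zip(received, w)) / len(received)
    # corr = +/-alpha plus a host interference term that shrinks as 1/sqrt(N).
    return (1 if corr > 0 else 0), corr


if __name__ == "__main__":
    rng = random.Random(0)
    host = [rng.gauss(0.0, 1.0) for _ in range(4096)]
    for bit in (0, 1):
        marked = embed(host, bit, key=1234, alpha=0.5)
        print(bit, blind_detect(marked, key=1234))
```

Even with no attack, the correlation deviates from the ideal value of +/-α because of the host term, which is the interference effect described above; a longer sequence or a larger α reduces its influence at the cost of rate or fidelity.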

The inherent limitations of the host interference non-rejecting methods can be mitigated by

exploiting knowledge of the host signal at the encoder; such methods are generally known as host

interference rejecting methods. Quantization index modulation (QIM) [77 – 89] based data

hiding methods are a subclass of host interference rejecting methods. This class of data hiding

methods provides easy control over the tradeoff between data rate, embedding distortion, and

robustness. These methods generally achieve a higher data rate than the spread spectrum based data

hiding class, at the cost of increased complexity of the data hiding system.
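
To make the host interference rejecting idea concrete, the sketch below shows a basic scalar form of QIM with two uniform quantizers offset by half a step. The step size `delta`, the one-bit-per-sample layout, and the function names are assumptions for illustration; practical QIM variants in the cited literature add dithering, distortion compensation, or vector quantizers.

```python
def qim_embed(sample, bit, delta):
    """Quantize the sample to the lattice selected by the bit.

    Bit 0 uses multiples of delta; bit 1 uses the same lattice shifted by delta / 2.
    """
    offset = bit * delta / 2.0
    return delta * round((sample - offset) / delta) + offset


def qim_detect(received, delta):
    """Return the bit whose lattice lies closest to the received sample."""
    d0 = abs(received - qim_embed(received, 0, delta))
    d1 = abs(received - qim_embed(received, 1, delta))
    return 0 if d0 <= d1 else 1


if __name__ == "__main__":
    delta = 1.0
    host = [0.37, -2.8, 5.11, 100.0]
    bits = [1, 0, 1, 0]
    marked = [qim_embed(s, b, delta) for s, b in zip(host, bits)]
    attacked = [y + 0.2 for y in marked]  # perturbation smaller than delta / 4
    print([qim_detect(r, delta) for r in attacked])
```

Because the detector only looks at which lattice the received value is nearest to, the host value itself causes no decoding error; the step size `delta` trades embedding distortion (at most `delta / 2` per sample) against tolerance to perturbations (below `delta / 4`).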

2.2.6 Classification Based on Data Extraction Method

Data hiding systems can be classified, based on the information available at the detector, into the following

categories:

1. Private or Informed Data Hiding

2. Semi-Private Data Hiding

3. Public or Blind Data Hiding

Private data hiding systems require the original copy of the host media along with the secret embedding

key for data extraction. These systems are generally used for data hiding applications such as

content authentication and ownership verification. Semi-private systems generally require

only the secret embedding key for information extraction, whereas public data hiding systems need

only a marked copy of the host media at the detector for data extraction [5 – 33].

2.3 Transmission Channels

The transmission channel model plays an important role in analyzing the performance of a given

communications system. In general, a fixed transmission channel is assumed for design and

analysis of a communication system i.e. we cannot modify or design the noise function that

occurs during transmission. A channel is generally characterized by means of a conditional

probability distribution, $P_{r|x}(r \mid x)$, which gives the probability of obtaining r at the output of the

channel when x is the input of the channel. Transmission channels are modeled based on the

noise function they apply to the signal and how the noise is applied.

In the data hiding scenario, adversary attacks (the attack channel) are generally treated as a

transmission channel for the performance analysis of a data hiding scheme. Commonly known

attacks in the data hiding community can be modeled as follows [30, 115]:

2.3.1 Bounded Distortion Channels

In this case we consider the largest distortion energy per dimension $\sigma_n^2$ that ensures $\hat{m} = m$

(zero-error) for any distortion (noise) vector n that satisfies

$$\|n\|^2 \le N \sigma_n^2 \qquad (2.2)$$

This channel model describes the minimum signal-to-noise ratio (SNR) constraint between the

attack channel input and output. The bounded distortion channel model is more appropriate for

unintentional attacks such as compression, or for active adversary attacks that attempt to remove the

watermark from the watermarked media.
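
The bounded distortion constraint and the associated SNR can be checked numerically with a few lines of code. In this sketch N is taken to be the length of the noise vector, and the function names are hypothetical; it only illustrates the inequality of Eq. 2.2, not any particular attack.

```python
import math


def distortion_energy(n):
    """Squared Euclidean norm ||n||^2 of the distortion vector."""
    return sum(v * v for v in n)


def is_admissible(n, sigma2_n):
    """True if the distortion satisfies ||n||^2 <= N * sigma_n^2."""
    return distortion_energy(n) <= len(n) * sigma2_n


def snr_db(x, r):
    """SNR in dB between the attack channel input x and output r."""
    signal = sum(v * v for v in x)
    noise = sum((a - b) ** 2 for a, b in zip(r, x))
    return 10.0 * math.log10(signal / noise)


if __name__ == "__main__":
    n = [0.1, -0.2, 0.1, 0.0]
    print(is_admissible(n, 0.02))  # energy 0.06 against the bound 0.08
    print(is_admissible(n, 0.01))  # energy 0.06 against the bound 0.04
    print(snr_db([1.0] * 4, [1.1, 0.9, 1.1, 0.9]))
```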

2.3.2 Bounded Host-Distortion Channels

Some active adversaries may use a distortion measure defined with respect to the host signal rather than the distortion

introduced by the channel, since this is a direct measure of degradation of the host signal. This

model is appropriate when an attacker has partial knowledge of the host signal; this knowledge might be

probabilistic, i.e., the probability distribution of the host signal is known, or of some other kind.

An active adversary can calculate the distortion between a watermarked copy of the media and the host

media; this distortion is bounded by the expected distortion, given as

$$D_r = E[D(r, x)] \qquad (2.3)$$

where the expectation is taken over the conditional probability density of r given the channel input x.

2.3.3 Additive Noise Channels

In this case the noise vector n is modeled as random and statistically independent of the host data D_o

[39 – 46]. The additive white Gaussian noise (AWGN) channel is an example of such a channel. The

robustness measure in this case is the maximum noise variance $\sigma_n^2$ that ensures a sufficiently low

probability of error in the received data. Many researchers in the area of data hiding use the AWGN

channel assumption to model the attack channel for performance analysis of a given data hiding

scheme [5 – 7, 25 – 33].
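
The role of the noise variance as a robustness measure can be illustrated with a small Monte Carlo simulation of an AWGN attack channel. The antipodal signalling (+/- amplitude per bit), the sign detector, the sample sizes, and the function names are assumptions made for this sketch.

```python
import random


def awgn_channel(x, sigma_n, rng):
    """Add independent zero-mean Gaussian noise with standard deviation sigma_n."""
    return [v + rng.gauss(0.0, sigma_n) for v in x]


def bit_error_rate(sigma_n, n_bits=20000, amplitude=1.0, seed=0):
    """Empirical error rate of a sign detector for antipodal symbols in AWGN."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n_bits)]
    x = [amplitude if b else -amplitude for b in bits]
    r = awgn_channel(x, sigma_n, rng)
    errors = sum((v > 0) != bool(b) for v, b in zip(r, bits))
    return errors / n_bits


if __name__ == "__main__":
    # The error rate grows as the noise standard deviation increases.
    for sigma in (0.1, 0.5, 1.0):
        print(sigma, bit_error_rate(sigma))
```

Sweeping `sigma_n` upward until the error rate exceeds a target value gives an empirical estimate of the maximum tolerable noise variance for this toy signalling scheme.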

The first two channel models impose distortion constraints and are more appropriate for modeling

intentional attacks [5, 25 – 33, 115, 139], whereas the AWGN channel is appropriate for

unintentional attacks.

2.4 Digital Rights Management: A Brief Overview

Digital rights management (DRM), i.e., the technologies, tools, and processes that protect

intellectual property during the life cycle of digital content, is a vital ingredient of the emerging

electronic multimedia (emedia) market. DRM creates an essential foundation of trust between

authors and consumers that is a prerequisite for robust market development.

At its simplest level, digital rights management (DRM) technology is all about controlling access

to information. Customers want convenient access to their purchased products, while companies

seek to protect their intellectual property from unauthorized use or duplication. DRM sits

squarely between these two parties, trying to present an amicable compromise between the

customers and the vendor.

Hardware keys, software licenses, and serial numbers all fall under the DRM umbrella [169].

Although there are several approaches to providing digital rights management, the most common

one, the "Anatomy of a DRM Transaction", is outlined in Figure 2.2.

[Figure 2.2: block diagram with the blocks content author/creator, media converter, content manager, web storefront and media host, license manager, play management system, client web browser, client viewer, and consumer; the steps of the transaction are numbered 1 to 6.]

Figure 2.2 Anatomy of a DRM Transaction

In Figure 2.2, at its most basic level, a DRM transaction starts with the content creator (1), who

generates a piece of media (2), be it audio, video, text, or some other format. Once in digital

form, the media file is encrypted or watermarked to protect it from unauthorized use and stored

on a content server. Access to the file is managed by the license server, possibly in conjunction

with a pay management system (3). Decrypted/unwatermarked media might be delivered directly

to a browser (4), or it could be decoded by the appropriate DRM-enabled software application

(5). Either way, a fully licensed, digital-quality media file or stream reaches the customer (6).

Key features of an effective DRM system generally include:

Data protection, so files are not easily viewed without proper privileges (Content

Protection).

Unique identification of each customer to ensure that rights are applied appropriately

(Fingerprinting).

Central management of rights to allow for free distribution, anti-fraud measures, and

revocation (Content Authentication and legal action).

Flexibility, so the system can be tailored to various business models (rental, ownership,

and read-only) (Copy Control).

The rights model is the core of any content rights management system. A rights model is a

specification of the types of rights that the system can keep track of, what the system can do with

those rights, and the attributes of those rights, such as how many times the content can be used, for

how long the user can access the content, how many times the user can copy the content, and how much

it costs.
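
A rights model of this kind can be pictured as a small data structure plus the checks that enforce it. The sketch below is purely illustrative: the field names, the functions, and the choice of a per-user record are assumptions, not the design of any particular DRM product.

```python
from dataclasses import dataclass


@dataclass
class Rights:
    """Hypothetical rights record for one user and one content item."""
    plays_left: int     # how many times the content may still be used
    copies_left: int    # how many copies the user may still make
    expires_at: float   # end of the access period, in seconds since the epoch


def may_play(rights, now):
    """A play is allowed while plays remain and the access period has not ended."""
    return rights.plays_left > 0 and now < rights.expires_at


def play(rights, now):
    """Consume one play right, or refuse if none is available."""
    if not may_play(rights, now):
        raise PermissionError("no play right available")
    rights.plays_left -= 1


def make_copy(rights, now):
    """Consume one copy right, or refuse if none is available."""
    if rights.copies_left <= 0 or now >= rights.expires_at:
        raise PermissionError("no copy right available")
    rights.copies_left -= 1


if __name__ == "__main__":
    r = Rights(plays_left=2, copies_left=1, expires_at=100.0)
    play(r, now=10.0)
    make_copy(r, now=20.0)
    print(r, may_play(r, now=30.0), may_play(r, now=150.0))
```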

DRM systems are used to define rights to content, according to some rights

model, and to enforce the granting of those rights. There are three ways to enforce content

rights:

1) Legally through registration forms, license agreements, and copyright laws.

2) Legally with an audit trail, such as copyright notices or watermarks (identifiers embedded

permanently in the content).

3) Technologically, using encryption and user authentication to protect content and only make it

accessible under strictly specified conditions.

Content protection and tracking are the basic building blocks of every DRM system. Most of the

existing DRM systems use encryption for content protection, content usage tracking, content and

user authentication, etc., which cannot provide a sufficient safeguard against piracy because of its

limitations. On the other hand, watermarking combined with encryption can ensure content

protection and usage tracking. A content protection scheme that incorporates both encryption and

watermarking is not foolproof but provides sufficient protection against active adversary attacks.

It is likely that the most successful DRM solutions in the years to come will combine encryption

and watermarking for content protection and related issues. In this dissertation we

intend to develop content protection techniques using both encryption and watermarking.

CHAPTER 3

Related Work

This chapter studies the theoretical aspects of the data hiding problem. Different conceptual

models of data hiding problem are explored here. These models will help to comprehend the

theoretical aspects of the data hiding problem. These models can be classified into two main

categories: 1) the data hiding models based on communications theory, and 2) the data hiding

models based on geometrical framework. Based on embedding methods the existing data hiding

schemes can be divided into two classes (as discussed in Chapter 2): 1) spread spectrum based

data hiding, and 2) informed data hiding. Related work in these directions is briefly discussed in

Section 3.4. The goal of this chapter is to lay the foundation for the design and analysis of the

data hiding systems discussed in the later chapters.

3.1 Data hiding in Communication Framework

In recent years several researchers in the data hiding community [5, 7, 25 – 33, 39 – 71] have

used the traditional communications framework to analyze the theoretical aspects (such as data hiding

capacity, error probability, and performance limits) of data hiding and watermarking. A brief

overview of the classical model of a communications system would be helpful to understand the

similarities and differences between a conventional communications system and a data hiding

system.

3.1.1 Classical Model of Communications System

The channel encoder, channel decoder, and communication channel are three basic elements of

the traditional communications model, as illustrated in Figure 3.1. Here a message, m, is to be

transmitted across the communications channel.

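[Figure 3.1: block diagram. At the transmitter, the input information sequence m enters the channel encoder, which outputs x; channel distortions/noise n are added to give r; at the receiver, the channel decoder produces the output information sequence.]

Figure 3.1: Standard Communication Model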

The channel encoder is a function that maps each possible message m_i to a code word x, selected

from a set of signals suitable for transmission over the channel. For digital communication, the

channel encoder is generally divided into a source encoder and a modulator. The source encoder

maps a message into a sequence of binary symbols, whereas the modulator maps a sequence of

binary symbols into a physical signal x, suitable for transmission over the channel.

In general the channel encoder output depends on the transmission channel, but in our case x is

a finite-precision real sequence of length N, i.e., x = x_0, x_1, …, x_{N-1}. We also assume that these

signals are bounded, i.e., power constrained:

$$\sum_i (x[i])^2 \le p, \qquad p < \infty \qquad (3.1)$$

The transmission channel is generally assumed to be noisy, which means

that the output of the channel, r, is not identical to the input of the channel, x. The change from x to r

is due to the additive noise of the channel, i.e., the transmission channel adds random noise n to x.

The output of the communication channel, r, enters the channel decoder. The channel decoder

inverts the channel encoding process, that is, it maps the received signal into a message $\hat{m}$. The

channel decoder is typically a many-to-one function, so that even in the presence of noise the

received signal is decoded correctly. The probability of error $p_e$ in the decoded message

is very small if the channel decoder is designed using the channel parameters.
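
The encoder, noisy channel, and many-to-one decoder described in this subsection can be sketched with a simple repetition code. The repetition factor, the antipodal amplitudes, the additive Gaussian noise model, and the function names are assumptions chosen for brevity; they stand in for whatever coding and modulation a real system would use.

```python
import random


def channel_encode(message_bits, reps=5, amplitude=1.0):
    """Map each message bit to a code word of reps antipodal samples."""
    x = []
    for b in message_bits:
        x.extend([amplitude if b else -amplitude] * reps)
    return x


def channel(x, sigma_n, rng):
    """Noisy transmission channel: r = x + n with Gaussian noise n."""
    return [v + rng.gauss(0.0, sigma_n) for v in x]


def channel_decode(r, reps=5):
    """Many-to-one decoder: every block with a positive sum decodes to 1."""
    bits = []
    for i in range(0, len(r), reps):
        bits.append(1 if sum(r[i:i + reps]) > 0 else 0)
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    m = [1, 0, 1, 1, 0]
    x = channel_encode(m)
    print(sum(v * v for v in x))  # signal energy, finite as required by Eq. 3.1
    r = channel(x, sigma_n=0.1, rng=rng)
    print(channel_decode(r) == m)
```

The decoder maps many different received blocks to the same bit, which is what allows it to absorb moderate noise without a decoding error.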

3.1.2 Secure Transmission

Some communications applications impose security of the

transmitted information as an additional requirement. A secure communication system is

generally used for such applications. The main difference between a conventional communication

system and a secure communication system is that the latter uses a pair of secret keys

(encryption and decryption keys): the encryption key is used at the channel encoder to encrypt the message sequence, and the

decryption key is used at the channel decoder to decrypt the received message.

Such a secure communication system is depicted in Figure 3.2.

[Figure 3.2: block diagram. Same structure as Figure 3.1, with an encryption key K_e supplied to the channel encoder at the transmitter and a decryption key K_d supplied to the channel decoder at the receiver.]

Figure 3.2: Standard Secure Communication Model

Encryption provides an extra security layer and helps to prevent passive as well as active attacks

on the secure delivery of content over such systems, provided that the adversary does not have access to the

secret key; that is, cryptography prevents a passive adversary from unauthorized reading of the

message and similarly prevents an active adversary from unauthorized writing. The secure

communication system described in Figure 3.2 is known as a symmetric secure system if K_e = K_d,

i.e., the encryption key is the same as the decryption key; otherwise it is an asymmetric secure system.
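
The symmetric case K_e = K_d can be illustrated with a toy stream cipher in which the same key regenerates the same keystream at both ends. This is only a sketch of the key symmetry: the keystream here comes from Python's general-purpose pseudorandom generator, which is not cryptographically secure, and the function names are hypothetical.

```python
import random


def keystream(key, n):
    """n pseudorandom bytes derived from the key (toy generator, NOT secure)."""
    rng = random.Random(key)
    return bytes(rng.randrange(256) for _ in range(n))


def encrypt(message, k_e):
    """XOR the message bytes with the keystream generated from K_e."""
    return bytes(m ^ s for m, s in zip(message, keystream(k_e, len(message))))


def decrypt(ciphertext, k_d):
    """XOR is its own inverse, so decryption repeats the operation with K_d."""
    return bytes(c ^ s for c, s in zip(ciphertext, keystream(k_d, len(ciphertext))))


if __name__ == "__main__":
    message = b"watermark payload 0123456789"
    ciphertext = encrypt(message, k_e=42)
    print(decrypt(ciphertext, k_d=42) == message)  # same key recovers the message
    print(decrypt(ciphertext, k_d=43) == message)  # a different key does not
```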

Cryptography has been a popular technology for content protection for many years and is still

commonly in use for a number of applications in the areas of content protection, secure network

communication, and secure content delivery [7]. However, cryptography is unable to provide a sufficient

safeguard against jamming attacks or to protect content after decryption. Jamming attacks can

be handled by using spread spectrum communication [38] schemes.

In the case of data hiding, both active and passive adversaries have access to the watermarked

media. Therefore, a secure key based data embedding and data extraction system is required to

ensure the security of the embedded message and content protection. In the remainder of this document

we will assume a symmetric secure data hiding system unless otherwise specified.

3.1.3 Data Hiding Model Based on Communication Framework

A data hiding system has a strong analogy with a communication system [5 – 8]. In data hiding

we want to communicate information from the data embedder to the data detector. Therefore,

it is natural to use the conventional communication model for the design and analysis of data

hiding systems.

Figure 3.3 shows the standard data hiding model. The system including the dotted line is an

informed or private data hiding system, whereas the system without the dotted line is a

blind or public data hiding system, i.e., a data hiding system with a blind detector.

[Figure 3.3: block diagram with embedding, attack channel, and extraction blocks; labels: input message m, host media C_O, embedding key K, signals x and r, extracted message.]

Figure 3.3: General Model for Data Hiding

A more detailed description of the above data hiding model with informed and blind detectors is given

in Figures 3.4 and 3.5, respectively.

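[Figure 3.4: block diagram. In the data embedder, the message encoder maps the input message (watermark) m to the pattern m_e using the embedding key K, and m_e is added to the host media data D_o to give D_m; the adversary attack adds n to give D_mn; in the data detector, D_o is subtracted and the message decoder, using the data detection key, produces the output message (watermark).]

Figure 3.4: Data Hiding System with Informed Detector Analogous to the Standard Secure Communication Model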

[Figure 3.5: block diagram. Same structure as Figure 3.4, except that the host media data D_o is not available at the data detector; the message decoder operates directly on D_mn.]

Figure 3.5: Data Hiding System with Blind Detector Analogous to the Standard Secure Communication Model

It is clear from Figures 3.4 and 3.5 that data embedding consists of two basic steps: 1) message

mapping, in which the message encoder maps the input message into a suitable embedding pattern, m_e, of the

same dimension and type as the host media, D_o (a secret key, K, can be used for this mapping

during the data embedding process); and 2) addition of the embedding pattern to the original host media, D_o,

to produce the data-embedded (marked) host media, D_m. This type of embedding is

known as blind embedding in the literature [5 – 8, 57 – 68] because the encoder completely ignores the

host media information during the data embedding process.

The marked media then undergoes intentional or unintentional attacks; for simplicity these attacks

are modeled as an AWGN channel. The output of the AWGN channel (attack channel) is called the

processed or distorted marked host media, D_mn.

Finally, to recover the embedded message, the processed marked media, Dmn, is passed through the watermark detector. For the informed detector (Figure 3.4), detection is a two-step process: 1) the original host media, Do, is subtracted from the received data, and 2) the residual noisy pattern, Dn, is used to estimate the embedded message. For blind watermark detection (Figure 3.5), the original copy of the host media is not available at the detector, so we cannot subtract the host data, Do, from Dmn. In this situation we can regard the received signal, Dmn, as the embedding pattern corrupted by noise formed by the combination of the host media and the attack channel. The performance criterion for the watermark detector depends on the application of interest; for example, for high-robustness applications such as ownership identification or copy control, minimizing the error probability of the estimated message is the main criterion.
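The embedding, attack, and detection steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any cited scheme: the host is a synthetic Gaussian sequence, one bit is embedded, the key simply seeds a pseudo-random antipodal pattern, and all function names are hypothetical.

```python
import random

def embedding_pattern(bit, key, n):
    """Map one bit (0 or 1) to a key-dependent antipodal pattern me of length n."""
    rng = random.Random(key)            # the secret key K seeds the pattern
    sign = 1.0 if bit == 1 else -1.0
    return [sign * rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(host, bit, key, strength=1.0):
    """Blind embedding: Dm = Do + me, ignoring the host content."""
    me = embedding_pattern(bit, key, len(host))
    return [d + strength * w for d, w in zip(host, me)]

def attack(marked, noise_std, seed):
    """AWGN attack channel: Dmn = Dm + n."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_std) for x in marked]

def correlate(signal, key):
    """Normalised correlation with the reference pattern for bit 1."""
    ref = embedding_pattern(1, key, len(signal))
    return sum(s * r for s, r in zip(signal, ref)) / len(signal)

def detect_informed(received, host, key):
    """Informed detector: subtract Do, then decide from the residue Dn."""
    residue = [r - d for r, d in zip(received, host)]
    return 1 if correlate(residue, key) > 0 else 0

def detect_blind(received, key):
    """Blind detector: the host stays in the signal and acts as interference."""
    return 1 if correlate(received, key) > 0 else 0

host_rng = random.Random(7)
host = [host_rng.gauss(0.0, 10.0) for _ in range(4096)]   # stand-in for Do
received = attack(embed(host, 1, key=42), noise_std=1.0, seed=3)
informed_bit = detect_informed(received, host, key=42)
blind_bit = detect_blind(received, key=42)
```

In this sketch the informed detector sees only the pattern plus channel noise, whereas the blind detector's correlation also contains a host term whose spread shrinks only as the number of samples grows.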

In blind data embedding schemes the watermark embedder completely ignores the information about the host media, which directly affects the overall performance of the data hiding scheme. How can the host media information be utilized at the encoder, and how can this improve the performance of the data hiding system? These issues are addressed next.

3.2 Data Hiding as Communication with Side Information at the Transmitter

As pointed out in the previous section, the communication-based data hiding model with a blind detector (Figure 3.5) cannot fulfill the requirements of fidelity, robustness, and high embedding capacity. This is because, in this model, the embedding pattern, me, is restricted to be independent of the host media, and at the blind detector the host media acts as random noise or interference. Since the original copy of the host media, Do, is available at the encoder, it is reasonable to exploit knowledge of the host media at the encoder to develop a robust, high-capacity data hiding system with minimal perceptual embedding distortion. All existing perceptual-based data hiding schemes exploit host media information at the data embedder [5 – 24].

Figure 3.6 depicts the data hiding model in which the embedding pattern, me, is host media dependent. The only difference between this model and the one described in Figure 3.5 is that this model has an informed encoder, that is, the encoder uses information about the host media when mapping the input message into the embedding pattern.

[Block diagram: input message sequence, informed encoder, data embedding key K, host media Do, embedder, channel distortions n, detector, channel decoder, output message sequence.]

Figure 3.6: Data Hiding as Communication with Side Information at the Encoder

If we consider the combination of the host media and the channel distortion as the noise process in an AWGN transmission channel, then this model is an example of a communication system with side information at the transmitter, first studied by Shannon [34] and then by others [35 – 37]. In recent years a few researchers have modeled the data hiding problem as communication with side information at the encoder. Data hiding schemes based on the informed encoder generally exhibit higher data rate, better perceptual quality, and better robustness compared to blind data embedding schemes [57 – 71, 77 – 89]. For theoretical analysis, many researchers use the idea proposed by Costa in [37] because of the strong analogy between his “Writing on Dirty Paper” problem and the “Robust Data Hiding in Digital Media” problem of the data hiding community. We explore informed embedding further in Section 3.4.2.

3.3 Geometric Model of Data Hiding

There is yet another way of modeling the data hiding problem, namely geometric modeling in an n-dimensional space. In this framework the host media is considered as a point in an n-dimensional space, which is generally divided into two major regions:

1) Acceptable Fidelity Region: the region around the host media point within which the distortion between the host media point and any other point is imperceptible, i.e. below the masking threshold.

2) Detection Region: the region of the n-dimensional space in which the detector can decode the embedded message based on knowledge of the embedding key.

The embedding process moves the original host media point into the intersection of a predefined detection region and the region of acceptable fidelity, to ensure robustness as well as imperceptibility. Cox et al [7] have used this geometric model to interpret the data hiding problem in n-dimensional space.

3.4 Related Work

Work on digital watermarking became popular around the mid-nineties, and since then the number of research efforts in this area has surged significantly. However, most of the research has focused on watermarking image content [4 – 24]. Recently, watermarking of audio and video content has also gained significant research interest [127 – 142]. There are two main communities in the area of information/data hiding: data hiding using spread spectrum theory or additive embedding [57 – 68], and data hiding based on host-interference rejection or informed embedding [77 – 89]. In general, a spread spectrum watermarking scheme embeds data (the message) into the host data by adding a pseudo-random sequence, and a correlation based detector is commonly used for watermark detection. In blind spread spectrum watermarking schemes, the host data acts as interference at the watermark detector, which ultimately limits the detection performance. Moreover, in order to meet the imperceptibility requirement, the power of the watermark signal is kept much lower than that of the host signal. Thus, the host signal interference significantly reduces the rate of reliable communication between the watermark embedder and detector.

Cox et al [59] proposed a spread spectrum based watermarking system in which one information bit (watermark bit) is spread over as many samples as the host media, using a modulated pseudorandom spreading sequence to generate the embedding sequence. Different variations of spread spectrum based data hiding have been proposed in the past [57 – 68] for all types of media. Low data hiding capacity and a non-zero probability of error, Pe, even in the absence of channel degradation, are the main limitations of this class of data hiding. Relatively invariant robustness performance from the no-distortion scenario to severe degradation is an attractive feature of this class of data embedding. We discuss our contributions to spread spectrum based data hiding for digital audio in Chapter 4, along with possible extensions of our proposed scheme to image and video data.

3.4.1 Data Hiding Based on Additive Embedding

Most of the existing data embedding algorithms treat the host signal as additive noise or interference [57 – 67, 127 – 141]. The simplest members of this data embedding class have a purely additive embedding function, that is,

$$D_m(D_o, k) = D_o + m_e(k) \qquad (3.2)$$

where me(k) is generally a pseudorandom sequence that is statistically independent of the host media, Do, and is generated using a secret key k.

This fact is quite evident from Figures 3.3, 3.4 and 3.5. Data embedding methods based on the embedding function described in Eq. 3.2 are termed differently in the literature, for example, “spread spectrum methods” or “additive spread spectrum methods” [7], “host interference non-rejecting methods” [30], “Type I embedding” [6, 28], and “known host statistics methods” [115], but spread spectrum is the term most commonly used in the data hiding community. In communication theory, the term “spread spectrum” means that the transmitted message signal occupies a much larger bandwidth than the bandwidth required for the message signal (baseband signal) [38]. In recent years many researchers [9 – 24] have used spread spectrum theory for data hiding applications. The term “spread spectrum data hiding” has become established for simple additive embedding of a mark signal, me, chosen independently of the host signal, Do, as described in Eq. 3.2. Cox et al [59] proposed a spread spectrum based watermarking system in which one information bit (watermark bit) is spread over as many samples as the host media, using a modulated pseudorandom spreading sequence to generate the embedding sequence me. This embedding sequence is then added to the original host data, Do, to produce the watermarked copy of the host data, Dm. This class of data embedding methods is limited to low data embedding capacity; for example, the spread spectrum watermarking scheme of Cox et al [59] can embed only one bit in each host media.

A common variation of purely additive spread spectrum methods is weighted-additive embedding, that is,

$$D_m(D_o, k) = D_o + \alpha(D_o)\, m_e(k) \qquad (3.3)$$

Here the embedding pattern is weighted with a scaling factor, α. This scaling factor generally accounts for human perceptual characteristics, to ensure imperceptibility of the embedded message. For example, in the embedding function proposed by Podilchuk et al [141], the amplitude scaling factor, α, is host data dependent, that is, it depends on the just noticeable difference (JND) level. Similarly, in the weighted embedding function proposed in [59], the amplitude scaling factor is set proportional to the host data, Do, such that

$$\alpha(D_o) = \lambda D_o$$

where λ is a constant, 0 < λ ≤ 1.

This means that the embedding function distorts larger-magnitude host signal samples (or transform-domain coefficients of the host data) more than smaller ones. This proportional weighted-additive embedding is still additive embedding in the log-domain, i.e.

$$\begin{aligned} D_m(D_o, k) &= D_o + \alpha(D_o)\, m_e(k) \\ &= D_o + \lambda D_o\, m_e(k) \\ &= D_o \left(1 + \lambda\, m_e(k)\right) \end{aligned} \qquad (3.4)$$

Taking the logarithm of both sides of Eq. 3.4,

$$\log D_m(D_o, k) = \log D_o + \log\left(1 + \lambda\, m_e(k)\right) \qquad (3.5)$$

Eq. 3.5 shows that weighted-additive embedding is still additive embedding in the log-domain.
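The log-domain relation can be checked numerically. The following is a small sketch assuming strictly positive host samples (so that the logarithm is defined) and an antipodal pattern; the function name and the test values are illustrative only.

```python
import math
import random

def embed_proportional(host, pattern, lam):
    """Proportional weighted-additive embedding: Dm = Do * (1 + lam * me)."""
    return [d * (1.0 + lam * w) for d, w in zip(host, pattern)]

rng = random.Random(0)
host = [rng.uniform(1.0, 100.0) for _ in range(8)]      # positive host samples
pattern = [rng.choice((-1.0, 1.0)) for _ in host]       # antipodal me
lam = 0.1
marked = embed_proportional(host, pattern, lam)

# The offset added in the log domain depends only on lam * me, not on the host value.
log_offsets = [math.log(m) - math.log(d) for m, d in zip(marked, host)]
expected = [math.log(1.0 + lam * w) for w in pattern]
```

Every entry of `log_offsets` takes one of only two values, log(1 + λ) or log(1 − λ), regardless of the host magnitude, which is the sense in which the embedding is additive in the log-domain.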

For watermark detection, these methods rely on the statistical properties of the host data, which are used to develop an optimal information decoder, generally optimal in the maximum likelihood sense. The statistical characterization of the host data is available in the direct domain as well as in transformed domains such as the DCT, DWT, or DFT [9 – 24]. For simplicity we consider digital image host data, $D_o$, in the DCT domain, where $D_o^{DCT}$ can be modeled by a Laplacian probability distribution function (pdf) [9, 147]. Thus the pdf of each coefficient, d[i], can be written as,

$$f(d[i]) = \frac{\beta[i]}{2}\, e^{-\beta[i]\,|d[i]|} \qquad (3.6)$$

where

$$\beta[i] = \frac{\sqrt{2}}{\sigma_{d[i]}}$$

The robustness performance of the weighted-additive embedding function described in Eq. 3.3, in the absence of channel noise (or adversary attack), can be calculated as follows. For simplicity the embedding pattern, $m_e$, is assumed to be a pseudo-random sequence of antipodal binary samples, i.e. $m_e \in \{-1, +1\}$, so that $E(m_e^2) = 1$. If the magnitude scaling factor, $\alpha(D_o)$, is a non-negative real constant, i.e. $\alpha(D_o) = \alpha$; $\alpha > 0$, then the data embedding distortion in this case is $\sigma_e^2 = \alpha^2 E(m_e^2)\, b^2 = \alpha^2$, where b is the one bit of information to be embedded into the host data and b = ±1.

It can be shown [115, 147] that the probability of error Pe at the receiver using an ML detector in the absence of noise or adversary attack is given as,

21P 3

2e e λ−= .7

where eλ σ α= .

Hence it is clear from Eq. 3.7 that even in the absence of channel noise or attack Pe is not zero, i.e. zero probability of decoding error is not attainable; this fact ($P_e \neq 0$) also holds for other probabilistic models of the host data, e.g. the Gaussian or generalized Gaussian host models. Therefore, the additive embedding class of data hiding is not provably robust, and this is due to the host signal interference. An informed detector improves the performance of this class of data hiding methods considerably, because the host signal available at the detector can be used to cancel the host signal interference. Moreover, variations of this class that minimize the effect of host signal interference can also improve the performance.
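The non-zero error probability can be illustrated with a small Monte Carlo experiment. The sketch below makes simplifying assumptions of its own: a single Laplacian host coefficient carries one antipodal bit of amplitude α, there is no channel noise, and the detector decides on the sign of the received value (the ML rule for a symmetric, unimodal host pdf). Under these assumptions, integrating the Laplacian tail gives an error rate of ½·exp(−√2·α/σ_d); this closed form belongs to this simplified setup and is not claimed to be the exact parametrization of Eq. 3.7.

```python
import math
import random

def laplacian_sample(rng, sigma):
    """Zero-mean Laplacian sample with standard deviation sigma (beta = sqrt(2)/sigma)."""
    beta = math.sqrt(2.0) / sigma
    # The difference of two i.i.d. exponentials with rate beta is Laplacian.
    return rng.expovariate(beta) - rng.expovariate(beta)

def simulate_error_rate(alpha, sigma, trials, seed=1):
    """Blind detection of one bit from one coefficient, with no channel noise."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        b = rng.choice((-1, 1))
        received = laplacian_sample(rng, sigma) + alpha * b
        decided = 1 if received > 0 else -1   # sign rule
        errors += decided != b
    return errors / trials

def predicted_error_rate(alpha, sigma):
    """Laplacian tail probability beyond alpha (assumption of this sketch)."""
    return 0.5 * math.exp(-math.sqrt(2.0) * alpha / sigma)

alpha, sigma = 1.0, 2.0
simulated = simulate_error_rate(alpha, sigma, trials=100_000)
predicted = predicted_error_rate(alpha, sigma)
```

Even though no attack is applied, roughly a quarter of the bits are decoded incorrectly for these example values, because the host itself acts as interference.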

One of the most important advantages of additive embedding based data hiding methods is that, for a power-constrained transmission channel, it is extremely difficult to severely degrade the host signal’s underlying pdf. Because the statistical properties of the host signal are relatively invariant, an attack will not cause a noticeable degradation of robustness performance. This is because the decoder optimization criterion depends on the statistical characterization of the host signal. Therefore, Pe does not degrade abruptly from the attack-free scenario to growing channel distortion.

Additive embedding, such as spread spectrum watermarking, was one of the first methods used for data embedding [59, 164] and is still the most popular one because of its simplicity and its ability to withstand severe distortion. Many variations of this method are possible depending on the nature of the host signal and the application of interest [67 – 68, 139, 140, 162, 163]. Malik et al [68] proposed a frequency selective spread spectrum watermarking scheme for digital audio. In this method only a selected frequency range of the host (audio) signal is used for data embedding instead of the complete frequency range of the host media. Thus the host signal interference at the detector is limited to that of the selected subband signal, which in turn improves the robustness. The proposed method introduces low embedding distortion because the watermark is embedded only in the selected frequency range. Moreover, this method is capable of embedding 5 – 8 times more data compared to the existing data hiding methods of this class. A detailed overview of this method is provided in Chapter 4.

3.4.2 Informed Embedding

Recently Chen et al [30, 81] and Cox et al [71] have explored the idea of informed embedding. Data hiding methods under the informed embedding umbrella generally have higher data rate, better robustness, and better fidelity for bounded perturbation attack channels. These methods are capable of achieving zero error probability as long as the channel distortion is below a certain threshold [5, 6, 27, 28, 115]. In general this class of data embedding methods uses a blind detector, i.e. the detector has no information about the host signal Do during detection, but the encoder exploits information about the host signal to reduce the host signal interference.

The work of Cox et al [71] is based on the general concept of Shannon’s paper “Channels with Side Information at the Transmitter” [34], whereas the work of Chen et al [30, 80 – 86] is based on Costa’s work “Writing on Dirty Paper” [37]. Costa considered communication with side information at the encoder over an AWGN channel, as described in Figure 3.6.

3.4.2.1 Costa’s Work

The main requirement of Costa’s solution to the communication problem described in Figure 5.1 is to design an Nd-dimensional codebook $U^{N_d}$ and an appropriate encoding process; here Nd is the cardinality of the host data vector. In the limiting case, i.e. as $N_d \to \infty$, Costa’s codebook $U^{N_d}$ achieves the capacity of communication with IID Gaussian side information Do at the encoder and an AWGN channel.

Costa’s codebook can be defined as,

$$U^{N_d} = \left\{ u[i] = m_e[i] + \eta\, D_o[i] \;\middle|\; i \in \{1, 2, \ldots, N_u\} \right\} \qquad (3.8)$$

where Nu is the cardinality of the codebook and η : 0 ≤ η ≤ 1 is a codebook parameter. Moreover,

$$m_e \sim \mathcal{N}(0, \sigma_e^2 I_{N_d}), \qquad D_o \sim \mathcal{N}(0, \sigma_d^2 I_{N_d}), \qquad n \sim \mathcal{N}(0, \sigma_n^2 I_{N_d})$$

are the realizations of the embedding pattern, host data, and channel noise, which are Nd-dimensional, mutually independent, zero-mean Gaussian random processes with covariance matrices $\sigma_e^2 I_{N_d}$, $\sigma_d^2 I_{N_d}$, and $\sigma_n^2 I_{N_d}$ respectively, where $I_{N_d}$ is the Nd-dimensional identity matrix. Costa showed [37] that the capacity of such a communication system is independent of the host signal interference, that is,

$$C_{AWGN} = \frac{1}{2} \log_2\left(1 + \frac{\sigma_e^2}{\sigma_n^2}\right) \qquad (3.9)$$

which is equal to the capacity of an additive spread spectrum system with an informed detector [5].
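As a numerical illustration of Eq. 3.9, the sketch below evaluates the capacity for example variances and compares it with the rate obtained when the host is simply treated as additional Gaussian noise. The second formula is an assumption of this sketch (the standard AWGN capacity with the host variance added to the noise variance), included only to show the size of the gap; the function names and numbers are illustrative.

```python
import math

def awgn_capacity(sigma_e2, sigma_n2):
    """Eq. 3.9: capacity in bits per host sample, independent of the host variance."""
    return 0.5 * math.log2(1.0 + sigma_e2 / sigma_n2)

def host_as_noise_capacity(sigma_e2, sigma_d2, sigma_n2):
    """Assumed comparison case: the host is treated as extra Gaussian noise."""
    return 0.5 * math.log2(1.0 + sigma_e2 / (sigma_d2 + sigma_n2))

# Example: watermark and channel noise of equal power, host 20 dB stronger.
c_costa = awgn_capacity(sigma_e2=1.0, sigma_n2=1.0)
c_blind = host_as_noise_capacity(sigma_e2=1.0, sigma_d2=100.0, sigma_n2=1.0)
```

For these values the side-informed capacity is 0.5 bit per sample, while the host-as-noise rate is well below 0.01 bit per sample.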

3.4.2.1.1 Quantization Index Modulation (QIM)

Costa’s scheme is purely theoretical; therefore, several practical approaches to implementing Costa’s scheme have been proposed [28, 30, 57, 72 – 74, 84]. In the data hiding scheme proposed by Chen et al [80 – 86], the host signal Do is quantized depending on the information to be embedded; this scheme is commonly referred to as “quantization index modulation” (QIM). For analysis and implementation, Chen et al [80 – 86] gave a low complexity practical implementation of their theoretical QIM scheme, namely binary dither modulation (BDM). QIM and its variations belong to the informed embedding class of data hiding [77 – 89].

The QIM information embedding process involves modulating an index or sequence of indices with the embedding information and then quantizing the host signal with the associated quantizer or sequence of quantizers. A quantizer is approximately an identity function, i.e. q(x) ≈ x, and can be uniquely described by a set Q of reconstruction points in Nd-dimensional space along with a rule for mapping an input signal of length Nd to a point in the set Q. The minimum distance rule is generally used for selecting a suitable point from Q for an input signal; therefore, different quantizers can be characterized by their reconstruction points Q only. Since the QIM scheme belongs to the host interference rejecting class of data embedding, QIM schemes offer high data rates for power-constrained attack channels [30].

The basic steps of the QIM scheme can be outlined as:

1) A set of different quantizers Q1, Q2, …, QM is defined, where M is the cardinality of the set of possible embedding messages.

2) To embed message m, the host signal is quantized using quantizer Qm.

3) The detector quantizes the received signal Dmn using the set of all quantizers Q1, Q2, …, QM. The detector then determines the index of the quantizer with the reconstruction point closest to the received signal; this estimated index corresponds to the received message, $\hat{m}$.
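The three steps can be sketched with scalar uniform quantizers. This is a minimal illustration, assuming that the M quantizers are shifted copies of one uniform lattice (offsets of step/M) and that messages are indexed 0 … M−1 rather than 1 … M; the function names are hypothetical.

```python
def quantize(x, step, offset):
    """Nearest reconstruction point of the lattice {offset + j * step}."""
    return offset + step * round((x - offset) / step)

def qim_embed(host, message, num_messages, step):
    """Step 2: quantize every host sample with the quantizer chosen by the message."""
    offset = message * step / num_messages
    return [quantize(x, step, offset) for x in host]

def qim_detect(received, num_messages, step):
    """Step 3: pick the quantizer whose reconstruction points lie closest."""
    def distance(message):
        offset = message * step / num_messages
        return sum((x - quantize(x, step, offset)) ** 2 for x in received)
    return min(range(num_messages), key=distance)

host = [0.37, -1.92, 4.05, 2.71, -0.44]
marked = qim_embed(host, message=2, num_messages=4, step=1.0)
perturbation = [0.05, -0.08, 0.10, -0.03, 0.06]       # bounded channel distortion
noisy = [x + e for x, e in zip(marked, perturbation)]
decoded = qim_detect(noisy, num_messages=4, step=1.0)
```

With four quantizers and a unit step, neighbouring lattices are 0.25 apart, so any perturbation smaller than 0.125 per sample leaves the decoded index unchanged in this sketch.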

3.4.2.1.2 Binary Dither Modulation

Binary dither modulation is a low complexity implementation of QIM in which the ensemble of embedding functions consists of dither quantizers [3]. For these dither quantizers, the quantization cells and reconstruction points of any given quantizer are shifted versions of the quantization cells and reconstruction points of any other quantizer in the ensemble. The shifts are generally achieved using a pseudorandom vector called the dither vector, d; for information embedding, this dither vector is modulated according to the embedding message m. Let the modulated dither vector corresponding to the message m be denoted by d(m). The embedding function for dither modulation is defined as [30, 81]:

$$D_m(D_o, m) = q(D_o + \mathbf{d}(m)) - \mathbf{d}(m) \qquad (3.10)$$

where q(·) is a uniform scalar quantizer with step size ∆.

For binary dither modulation, the mapping from the range of host signal values Do[n] onto the watermarked signal values Dm[n] using a uniform scalar quantizer with step size ∆ is illustrated in Figure 3.7. Here, the set Q1 (circles ‘O’) is defined by a uniform scalar quantizer with step size ∆. Similarly, the set Q2 (crosses ‘X’) is another uniform scalar quantizer with the same step size but with a ∆/2 offset.

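[Diagram: mapping of host values Do[n] onto watermarked values Dm[n], showing alternating reconstruction points O and X, with the spacings ∆ and ∆/4 marked.]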

Figure 3.7: Binary Dithered Modulation Scheme Based on Dithered Uniform Scalar Quantization

The dither vector construction and the zero-error watermark detection condition are derived as follows. We assume that the data embedding rate Rm satisfies $1/N_d \le R_m \le 1$, and that $m = b_1, b_2, \ldots, b_{N_d R_m}$ is the binary representation of the embedding message m, where $b_i \in \{0, 1\}$ for $i = 1, 2, \ldots, N_d R_m$. If ku/kc is the rate of the error correcting code used for channel encoding, then the channel-encoded binary representation of m is $z_1, z_2, \ldots, z_{N_d/L}$, where

$$L = \frac{1}{R_m} \left( \frac{k_u}{k_c} \right)$$

Now two dither subvectors of length L are constructed as,

$$d_i(2) = \begin{cases} d_i(1) + \Delta/2, & \text{if } d_i(1) < 0 \\ d_i(1) - \Delta/2, & \text{if } d_i(1) \ge 0 \end{cases} \qquad i = 1, 2, \ldots, L \qquad (3.11)$$

Eq. 3.11 ensures that the two L-dimensional dither quantizers are at the maximum possible distance from each other. Here, one dither subvector, d(1), is associated with binary information ‘0’, whereas the second dither subvector, d(2), is associated with binary information ‘1’. Finally, the Nd/L dither subvectors associated with the channel-encoded bits z1, z2, …, zNd/L are concatenated to form the dither vector $\mathbf{d}(m) \in \Re^{N_d}$.
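The dither construction and the embedding function of Eq. 3.10 can be sketched as follows. This is an illustrative sketch under stated assumptions: the first dither subvector is drawn uniformly from [−∆/2, ∆/2] with a seeded generator standing in for the key, the coded bits are supplied directly (no particular error correcting code is implied), and the detector uses per-block minimum-distance decoding. All names are hypothetical.

```python
import random

def make_dithers(length, step, key):
    """d(1) is pseudorandom in [-step/2, step/2]; d(2) follows Eq. 3.11."""
    rng = random.Random(key)
    d1 = [rng.uniform(-step / 2, step / 2) for _ in range(length)]
    d2 = [d + step / 2 if d < 0 else d - step / 2 for d in d1]
    return d1, d2

def dither_vector(coded_bits, d1, d2):
    """Concatenate one subvector per coded bit: d(1) for '0', d(2) for '1'."""
    out = []
    for z in coded_bits:
        out.extend(d2 if z else d1)
    return out

def quantize(x, step):
    """Uniform scalar quantizer q(.) with the given step size."""
    return step * round(x / step)

def dm_embed(host, coded_bits, d1, d2, step):
    """Eq. 3.10: Dm = q(Do + d(m)) - d(m)."""
    d = dither_vector(coded_bits, d1, d2)
    return [quantize(x + di, step) - di for x, di in zip(host, d)]

def dm_detect(received, d1, d2, step):
    """Minimum-distance decoding, one coded bit per block of len(d1) samples."""
    block_len = len(d1)
    bits = []
    for start in range(0, len(received), block_len):
        block = received[start:start + block_len]
        def err(d):
            return sum((x - (quantize(x + di, step) - di)) ** 2
                       for x, di in zip(block, d))
        bits.append(1 if err(d2) < err(d1) else 0)
    return bits

step, L = 1.0, 4
d1, d2 = make_dithers(L, step, key=11)
coded = [1, 0, 1, 1]
rng = random.Random(5)
host = [rng.gauss(0.0, 20.0) for _ in range(L * len(coded))]
marked = dm_embed(host, coded, d1, d2, step)
noisy = [x + rng.uniform(-0.1, 0.1) for x in marked]     # bounded distortion
recovered = dm_detect(noisy, d1, d2, step)
```

Because the two dithers differ by exactly ∆/2 in every coordinate, a per-sample perturbation below ∆/4 cannot move a marked sample closer to the wrong quantizer, so the coded bits are recovered without error in this example even though the host is much stronger than the embedding distortion.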

Finally, the minimum distance between the reconstruction points of the two quantizers Q1 and Q2 can be shown [30, 84] to be,

$$d_{min}^2 = \frac{d_H}{4 R_m} \left( \frac{k_u}{k_c} \right) \Delta^2 \qquad (3.12)$$

where dH is the minimum Hamming distance, a feature of the error correcting code used for channel encoding. For very small quantization cells, the mean squared distortion introduced per dimension by embedding with the uniform scalar quantizer of step size ∆ is:

$$E(D_E(D_o, D_m)) = \frac{\Delta^2}{12} \qquad (3.13)$$

For bounded distortion channels and minimum distance decoding, the zero-error decoding condition can be shown to be [82],

$$\frac{3}{4} \, \frac{d_H k_u}{k_c N_d R_m} \, \frac{E(D_E(D_o, D_m))}{\sigma_n^2} > 1 \qquad (3.14)$$

Eq. 3.14 shows that, for a fixed rate Rm, more channel distortion energy $\sigma_n^2$ requires more embedding mean squared distortion, i.e. $E(D_E(D_o, D_m))$. Moreover, for a fixed rate and channel distortion energy, Eq. 3.14 gives the minimum perceptual distortion introduced by data embedding. Therefore, the QIM scheme gives a trade-off between rate, robustness, and fidelity of the data embedding process.
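The trade-off can be made concrete with a few helper functions. The sketch below assumes the readings d_min² = d_H (k_u/k_c) ∆² / (4 R_m) for Eq. 3.12, ∆²/12 for Eq. 3.13, and (3/4) d_H k_u E(D_E) / (k_c N_d R_m σ_n²) > 1 for Eq. 3.14; the function names and the example numbers are illustrative only.

```python
def min_distance_sq(d_h, k_u, k_c, rate, step):
    """Assumed Eq. 3.12: squared minimum distance between the two quantizers."""
    return d_h * (k_u / k_c) * step ** 2 / (4.0 * rate)

def embedding_distortion(step):
    """Eq. 3.13: per-sample mean squared distortion of a uniform quantizer."""
    return step ** 2 / 12.0

def zero_error_margin(d_h, k_u, k_c, n_d, rate, step, sigma_n2):
    """Left-hand side of the assumed Eq. 3.14; values above 1 satisfy it."""
    return (0.75 * d_h * k_u * embedding_distortion(step)
            / (k_c * n_d * rate * sigma_n2))

def min_step_for_zero_error(d_h, k_u, k_c, n_d, rate, sigma_n2):
    """Step size at which the margin equals 1 (boundary of the condition)."""
    return (16.0 * k_c * n_d * rate * sigma_n2 / (d_h * k_u)) ** 0.5

# Uncoded example: d_H = 1, k_u = k_c = 1, one sample per bit.
boundary_step = min_step_for_zero_error(1, 1, 1, n_d=1, rate=1.0, sigma_n2=0.01)
margin_at_boundary = zero_error_margin(1, 1, 1, 1, 1.0, boundary_step, 0.01)
distortion_at_boundary = embedding_distortion(boundary_step)
```

For the uncoded example, the boundary step is 0.4, so a channel distortion energy of 0.01 already costs an embedding distortion of about 0.013 per sample; quadrupling the distortion energy doubles the required step and quadruples the embedding distortion.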

Since the informed embedding class of data embedding methods uses a blind detector, the detector can be treated as deterministic, and hence their performance is limited by bounded power channel distortion. For example, if the channel distortion $\sigma_n^2 > \Delta/2$ then the performance of the data embedding system deteriorates and zero-error decoding is not guaranteed.

We discuss our contributions to informed embedding for digital audio data in Chapter 5. Our proposed data hiding schemes [75, 76], which use phase alteration of audio data, are capable of embedding more data than the existing schemes while keeping the embedding distortion below the masking threshold.

Considering the level of research activity related to data hiding in the past decade, it is evident that there has been significant improvement in the design of data embedding and detection schemes, but at the same time the sophistication of attacks against data hiding has shown similar improvement. These parallel improvements have motivated theoretical analysis of the performance limits of digital data hiding techniques. The first work in this direction is by Moulin et al [39, 40, 47 – 54], who consider digital watermarking as a game between the watermark embedder and an active attacker. The watermark embedder attempts to maximize the amount of embedded information whereas the attacker attempts to minimize it. Other theoretical work considers robustness against estimation attacks [5] or the influence of quantization on correlation based watermark detection.


CHAPTER 4

Blind Data Embedding

This chapter presents our initial contributions to additive (blind) embedding techniques and outlines the future work. In general, additive embedding or spread spectrum based watermarking techniques embed information by adding a pseudo-random sequence to the host data, and a correlation based detector is used for watermark detection.

4.1 Robust and High Capacity Data Embedding: Our Work

Currently we are working on the design of high capacity, robust data hiding algorithms using

spread spectrum theory. Initial results of our proposed algorithm in this direction are promising

[68]. Main features and performance analysis of our work are given next.

4.1.1 Data Hiding Using Frequency Selective Based Spread Spectrum

As pointed out in the previous section, host signal interference at the detector limits the performance of the additive embedding class of data hiding; therefore, we can improve the performance of these methods by either rejecting or minimizing the host signal interference. Complete rejection of the host signal interference requires an informed detector, which is not feasible for many data hiding applications such as copy control, device control, etc. We therefore consider minimizing the host signal interference, and one possible way to do this is to embed data in a selected subband of the host signal instead of its whole frequency band, which is the main idea of our work [68]. This frequency selective embedding also reduces the embedding distortion, which ultimately improves the fidelity of the data hiding scheme. The frequency selective data hiding algorithm is outlined next.

This method [68] is designed to overcome common shortcomings of existing DSSS based audio data hiding/watermarking systems [59 – 67], such as vulnerability to desynchronization attacks, poor detection performance, poor fidelity (inaudibility), and limited embedding capacity. Robustness to desynchronization attacks and reliability of detection are improved using content-adaptive features of the input audio called salient points [65]. These salient points are frame level features of the input audio signal that are invariant to common audio processing operations. Only a small fraction of the audible frequency range is used for data embedding in order to reduce the amount of audible distortion. The method exploits the frequency masking characteristics of the human auditory system (HAS) and inserts the mark into a randomly selected frequency band of the input audio signal. A secret key is used for randomly selecting the frequency band for watermark embedding. The proposed watermarking scheme induces low perceptual as well as mean squared distortion and, therefore, has high embedding capacity (P. Moulin et al [39 – 41]). The detection performance of the system was investigated for a variety of signal manipulations and attacks on a watermarked audio clip. These attacks include addition of noise, resampling, requantization, filtering, and random chopping. Results show the robustness of the method, with a low detection error rate and a low bit error rate. Moreover, the proposed watermarking scheme is capable of embedding multiple watermarks in the unused frequency bands with the use of separate secret keys.

4.1.1.1 WATERMARKING USING PERCEPTUAL AUDITORY MODEL

The basic idea underlying perception-based watermarking schemes is to incorporate the watermark into the perceptually insignificant region of an audio signal in order to ensure transparency. The perceptually insignificant region is determined using the human perceptual auditory model. Extensive work has been done over the years on understanding the properties of the HAS and applying this knowledge to audio applications [2]. An important application of perceptual models is in the area of perception-based compression [165]. An important characteristic of the HAS is auditory masking, which has been exploited in audio coding for lossy compression. We consider its use in watermarking.

The human ear performs a frequency analysis that maps a frequency to a location along the basilar membrane. The HAS is generally modeled as a non-uniform bandpass filter bank with logarithmically widening bandwidth at higher frequencies [165]. The bandwidth of each bandpass filter is set according to the critical band, which is defined as “the bandwidth in which subjective response changes abruptly” [2]. The critical band rate (CBR) is a measure of location on the basilar membrane, just as frequency gives a measure of location in a spectrum. The unit of critical band rate is the Bark. The mapping between CBR and frequency is defined as:

$$z(f) = 13 \arctan(0.76 f) + 3.5 \arctan\left((f/7.5)^2\right) \qquad (4.1)$$

where z is the CBR in Barks and f is the frequency in kHz.
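The frequency-to-Bark mapping is straightforward to evaluate. The sketch below uses the widely cited Zwicker form of the formula, with the constant 0.76 and the squared second argument, and takes the frequency in kHz; the function name is illustrative.

```python
import math

def bark(f_khz):
    """Critical band rate in Bark for a frequency given in kHz (Zwicker form)."""
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

# Critical band rate at a few frequencies across the audible range.
frequencies_khz = [0.1, 0.5, 1.0, 4.0, 10.0, 20.0]
rates = [bark(f) for f in frequencies_khz]
```

The mapping is monotonic and compressive: 1 kHz falls near 8.5 Bark, while the whole range up to 20 kHz spans roughly 25 Bark.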

Masking is a fundamental property of HAS and is a basic element of perceptual audio coding

systems. It is a phenomenon by which a stronger audible signal makes a weaker audible signal

inaudible [2], and this occurs both in frequency as well as time domain [2].

4.1.1.2 SALIENT POINT EXTRACTION

Spread spectrum techniques have been applied in digital watermarking [59 – 67] because of their potential for high fidelity, high capacity, robustness, and security. In the proposed scheme, the process of generating a watermark and embedding it into an audio signal is treated in the framework of spread spectrum theory. The original audio signal is treated as noise, whereas the message information used to generate the watermark sequence is considered as data. The spreading sequence, also called the pseudo-random noise sequence or PN-sequence, is treated as the key. This watermarking strategy can be treated in the framework of the communication models discussed in [7].

A critical aspect of designing a spread spectrum system is ensuring fast and reliable synchronization at the detector. Synchronization impacts performance because it reduces the overall capacity of the watermarking system, and an active adversary can use explicit synchronization information for de-synchronization attacks. To overcome these problems, synchronization is tied to attack-sensitive locations, or salient points, for watermark embedding and detection. Salient points are extracted based on audio features to which the HAS is sensitive [65], e.g. fast energy transition points, zero crossing rate, and spectral flatness measure. If these features are altered, noticeable distortion is introduced. A good salient point extraction method is one that extracts approximately the same salient points before and after common signal manipulations or watermark embedding [65]. The fast energy transition feature is used in our method for salient point extraction.

For an audio signal Do(n): n = 0, 1, 2, …, N−1, the short time energy ratio at each point is calculated as:

$$E_r(n) = \frac{E_{after}(n)}{E_{before}(n)} \qquad (4.2)$$

where Eafter(n) and Ebefore(n) are defined as:

$$E_{before}(n) = \sum_{i=-r}^{-1} D_o^2(n+i) \qquad (4.3)$$

$$E_{after}(n) = \sum_{i=0}^{r-1} D_o^2(n+i) \qquad (4.4)$$

Here r is the number of samples before and after Do(n). High energy transition points are defined as the points for which:

Er(n) > Th1 and Eafter(n) > Th2

Finally, salient points are decided as follows:

1: If two high energy transition points are separated by less than Th3, they are merged together to form a group.

2: Within each group, the strongest transition point is marked as a salient point.

Here Th1, Th2 and Th3 are thresholds; these thresholds are set adaptively to ensure 3 - 4 salient points per second.

4.1.1.3 WATERMARK EMBEDDING

To generate and embed a watermark, the host data (audio) is first analyzed to determine the list of salient points. A block of P samples around each salient point is selected. The block is applied to an l-level modified wavelet analysis filter bank to generate (2l − 1) subband signals of unequal bandwidths, as illustrated in Figure 4.1.
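The salient point analysis that precedes the filter bank can be sketched as follows. This is a minimal illustration with fixed thresholds (the adaptive threshold selection is not modeled), a toy signal that alternates between quiet and loud segments, and a simple chaining rule for grouping nearby transition points; the function names and numeric values are assumptions of this sketch.

```python
def energy_ratio(signal, n, r):
    """Short time energy ratio at n, plus the energy of the r samples from n on."""
    before = sum(x * x for x in signal[n - r:n])
    after = sum(x * x for x in signal[n:n + r])
    ratio = after / before if before > 0 else float("inf")
    return ratio, after

def salient_points(signal, r, th1, th2, th3):
    """Return one salient point per group of nearby high energy transitions."""
    # High energy transition points: Er(n) > Th1 and Eafter(n) > Th2.
    candidates = []
    for n in range(r, len(signal) - r + 1):
        ratio, after = energy_ratio(signal, n, r)
        if ratio > th1 and after > th2:
            candidates.append((n, ratio))
    # Group candidates closer than Th3 and keep the strongest of each group.
    points, group = [], []
    for cand in candidates:
        if group and cand[0] - group[-1][0] >= th3:
            points.append(max(group, key=lambda c: c[1])[0])
            group = []
        group.append(cand)
    if group:
        points.append(max(group, key=lambda c: c[1])[0])
    return points

def blocks_around(signal, points, p):
    """Select a block of p samples centred on each salient point."""
    half = p // 2
    return [signal[n - half:n - half + p] for n in points
            if n - half >= 0 and n - half + p <= len(signal)]

# Toy signal: quiet, loud, quiet, loud (100 samples each).
signal = ([0.01] * 100 + [1.0] * 100) * 2
points = salient_points(signal, r=10, th1=10.0, th2=1.0, th3=50)
blocks = blocks_around(signal, points, p=64)
```

For the toy signal the two quiet-to-loud transitions at samples 100 and 300 are returned, and a 64-sample block is cut around each of them for the subsequent subband analysis.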

[Filter bank diagram: the input X(n), f = 0~fs/2, passes through cascaded low pass filters h_lp(n) and high pass filters h_hp(n), each with cutoff frequency pi/2, with downsampling by a factor of 2, producing nine subbands:

Sb 1: f = 0~fs/64
Sb 2: f = fs/64~fs/32
Sb 3: f = fs/32~3fs/64
Sb 4: f = 3fs/64~fs/16
Sb 5: f = fs/16~3fs/32
Sb 6: f = 3fs/32~fs/8
Sb 7: f = fs/8~3fs/16
Sb 8: f = 3fs/16~fs/4
Sb 9: f = fs/4~fs/2]

Figure 4.1: 5 –Level Modified Discrete Wavelet Analysis Filter Bank

The number of subbands is chosen as a compromise between allowing a large choice in random selection and ensuring that each subband bandwidth covers at least three critical bands. A subband from the lower eight bands (for l = 5) is selected using the three-bit sub-key k1i for the ith salient point, whereas the complete secret key K1 for subband selection is given as:

K1 = k11 | k12 | … | k1i | … | k1M

where M is the cardinality of the salient point set.
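Key-driven subband selection can be sketched as follows. The sketch assumes that K1 is given as a bit string, that each three-bit sub-key is read as a binary number selecting Sb 1 to Sb 8 (value plus one), and that the subband edges are those of the 5-level bank of Figure 4.1; the bit-to-subband mapping and the function names are assumptions made for illustration.

```python
def split_subkeys(key_bits, bits_per_point=3):
    """Split K1 = k11|k12|...|k1M into one sub-key per salient point."""
    return [key_bits[i:i + bits_per_point]
            for i in range(0, len(key_bits), bits_per_point)]

def select_subbands(key_bits):
    """Map each 3-bit sub-key to one of the lower eight subbands (1..8)."""
    return [int(sub, 2) + 1 for sub in split_subkeys(key_bits)]

# Subband edges of the 5-level bank as fractions of fs, for Sb 1 .. Sb 9.
EDGES = [0, 1/64, 1/32, 3/64, 1/16, 3/32, 1/8, 3/16, 1/4, 1/2]

def subband_range_hz(index, fs):
    """Lower and upper frequency (Hz) of subband `index` at sampling rate fs."""
    return EDGES[index - 1] * fs, EDGES[index] * fs

# Three salient points, hence a 9-bit key K1.
chosen = select_subbands("000111010")
ranges = [subband_range_hz(sb, fs=44100.0) for sb in chosen]
```

With a 44.1 kHz sampling rate the selected subbands range from about 689 Hz wide (Sb 1 to Sb 4) to about 2.76 kHz wide (Sb 7 and Sb 8), while Sb 9 is never selected because only the lower eight bands are addressable with three bits.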

The selected subband is used to estimate the masking threshold Tm(k), which is calculated as follows. Let sbi,j(n) for n = 0, 1, 2, …, L−1 be the jth subband of the ith frame of the audio data, selected using sub-key k1i. Its power spectrum is defined as,

$$P_{sb}(k) = |Sb_{i,j}(k)|^2 \qquad (4.5)$$

where Sbi,j(k) is the discrete Fourier transform (DFT) of sbi,j(n). Now k is warped onto the Bark scale using Eq. 4.1. The energy in each critical band is calculated as,

$$E(z) = \frac{1}{P_z} \sum_{k=LB_z}^{UB_z} P_{sb}(k), \qquad \text{for } z = 1, 2, \ldots, z_t \qquad (4.6)$$

where zt is the total number of critical bands in the selected subband, LBz and UBz are the lower and upper boundaries of critical band z, and Pz is the total number of points in that critical band. The energy per critical band is used to calculate the masking threshold Tm(z) using MPEG layer III psychoacoustic model 1 [165].

4.1.1.3.1 Watermark Generation

For each salient point a watermark W of length L is generated. To generate W, the binary message m is first passed through a channel encoder. The channel encoded data is applied to a binary phase shift keying (BPSK) modulator, whose output is Wm(n), n = 0, 1, …, q-1, where q = L/(spreading factor). A maximum length PN-sequence p of length (L/q) is generated using the log2(L/q) bit secret key K2. Finally, the modulated signal Wm is spread using the PN-sequence p to generate the final watermark W. The system key is K = K1|K2.
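The generation chain described above (BPSK modulation followed by PN spreading) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scheme's actual implementation: channel coding is omitted, the PN-sequence comes from a 4-stage shift register whose taps and seed stand in for the key K2, and all function names are hypothetical.

```python
def lfsr_pn_sequence(taps, seed_bits):
    """One period of a +/-1 maximum length sequence from a Fibonacci LFSR.

    taps and seed_bits play the role of the secret key K2 (an assumption).
    """
    state = list(seed_bits)
    chips = []
    for _ in range(2 ** len(state) - 1):
        chips.append(1 if state[-1] else -1)   # output the last register stage
        feedback = 0
        for t in taps:                         # XOR of the tapped stages
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]        # shift the register
    return chips


def bpsk(bits):
    """Map bit 1 to +1 and bit 0 to -1."""
    return [1 if b else -1 for b in bits]


def spread(symbols, pn):
    """Repeat the PN-sequence once per symbol, scaled by that symbol."""
    return [s * c for s in symbols for c in pn]


def despread(chips, pn):
    """Correlate each PN-length block with the PN-sequence and take the sign."""
    n = len(pn)
    bits = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        corr = sum(x * c for x, c in zip(block, pn))
        bits.append(1 if corr > 0 else 0)
    return bits


pn = lfsr_pn_sequence(taps=(4, 3), seed_bits=(1, 0, 0, 0))   # 15 chips
message = [1, 0, 0, 1, 0, 1, 1, 0, 1]
watermark = spread(bpsk(message), pn)
```

Here the spreading factor equals the PN length (15 chips per symbol), so nine message bits give a 135-chip watermark. A longer register would be used in practice to obtain a larger processing gain.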

4.1.1.3.2 Watermark Embedding

Spectral shaping of W based on Tm(k) is required to ensure inaudibility of the embedded watermark. For this purpose the DFT W(k) and the power spectrum Pw(k) of W are calculated. Using Tm(z), the inaudible DFT coefficients of the selected subband sb_{i,j} are removed, i.e.

\[ Sb_n(k) = \begin{cases} Sb_{i,j}(k) & \text{if } P_{sb}(k) \ge T_m(z) \\ 0 & \text{if } P_{sb}(k) < T_m(z) \end{cases} \tag{4.7} \]

Similarly, the unwanted DFT coefficients of W(k) are also removed, i.e.

\[ W_n(k) = \begin{cases} 0 & \text{if } P_{sb}(k) \ge T_m(z) \\ W(k) & \text{if } P_{sb}(k) < T_m(z) \end{cases} \tag{4.8} \]

The final watermark before embedding is given by

\[ W_f(k) = F_z \cdot W_n(k) \tag{4.9} \]

where F_z is the shaping factor, defined as

\[ F_z = \frac{A \, T_m(k)}{\max(|W_n(k)|)} \tag{4.10} \]

where 0 < A < 1 is the noise gain factor. Finally, the watermarked output in the frequency domain is

\[ Wsb_{i,j}(k) = Sb_n(k) + W_f(k) \tag{4.11} \]

The corresponding time domain watermarked subband signal is obtained by calculating the inverse discrete Fourier transform (IDFT),

\[ Wsb_{i,j}(n) = \mathrm{IDFT}\left( Wsb_{i,j}(k) \right) \tag{4.12} \]

This watermarked subband is then used to reconstruct the watermarked audio block using the modified wavelet synthesis filter bank. This process is repeated for the remaining salient points in the salient point list.

The watermark generation and embedding process is illustrated in Figure 4.2.
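The frequency domain shaping and embedding steps (Eq. 4.7 to Eq. 4.11) can be illustrated with a small sketch. It assumes the masking threshold has already been expanded to one value per DFT bin (the list `tm`), uses a naive DFT for self-containment, and leaves out the final IDFT; the function names and the toy signals are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def shape_and_embed(subband, watermark, tm, gain):
    """Return the watermarked spectrum Sb_n(k) + W_f(k).

    tm holds one masking threshold per DFT bin (assumed precomputed);
    gain is the noise gain factor A with 0 < A < 1.
    """
    sb = dft(subband)
    w = dft(watermark)
    audible = [abs(s) ** 2 >= t for s, t in zip(sb, tm)]
    sb_n = [s if keep else 0j for s, keep in zip(sb, audible)]   # Eq. 4.7
    w_n = [0j if keep else v for v, keep in zip(w, audible)]     # Eq. 4.8
    peak = max(abs(v) for v in w_n)
    if peak == 0.0:
        return sb_n                                              # nothing to embed
    w_f = [gain * t / peak * v for t, v in zip(tm, w_n)]         # Eq. 4.9, 4.10
    return [s + v for s, v in zip(sb_n, w_f)]                    # Eq. 4.11


n = 16
host = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]    # tone in bin 3
mark = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1]
tm = [1.0] * n                                                   # flat toy threshold
marked_spectrum = shape_and_embed(host, mark, tm, gain=0.5)
```

In this toy case the host tone occupies bins 3 and 13, which pass through unchanged, while the watermark fills the remaining bins with magnitude no larger than A times the threshold.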

[Figure 4.2 (block diagram not reproduced): the original audio Do passes through audio content analysis and attack-sensitive region extraction (salient points sp, i = 1~M; blocks x(l, i), l = 1~P), subband analysis using the modified wavelet analysis filter bank (Sb1(l) … Sbp(l)), subband selection with key k1i (Sbj(l)), and masking threshold extraction using the psychoacoustic model (Tma(z)). The watermark sequence (100101101...) passes through channel encoding, the BPSK modulator, and watermark spreading driven by the PN-sequence generator with key k2, giving w(l), l = 1~L. Watermark shaping using the masking threshold gives wn(l); watermark embedding gives wsbj(l); subband synthesis using the modified wavelet synthesis filter bank gives wx(i, l); data merging produces the watermarked audio Dm.]

Figure 4.2: Block Diagram of Watermark Embedding Process

4.1.1.4 WATERMARK DETECTION

To be effective, a watermarking system should be able to detect/extract the embedded watermark even after the watermarked audio undergoes common signal manipulations and psychoacoustic auditory model based audio processing. An attractive feature of the proposed scheme is that a blind detector can be used for watermark detection/extraction, i.e. the detector does not require the original copy of the audio signal to detect the watermark in the received audio signal. The secret key is the only information that the detector has about the embedding. The detector uses salient points to synchronize with the embedded information, so the received audio is first analyzed for salient point extraction (as discussed in Section 3). For each point in the salient point list, a block of P samples is passed through the modified analysis wavelet filter bank; then, using the secret key k1i, the jth subband sbi,j is selected for watermark detection/extraction.

The selected subband is analyzed to extract its masking threshold, say Tmr(z). This masking threshold is used to extract the "residual" audio signal Rr(k), defined as

\[ R_r(k) = \begin{cases} 0 & \text{if } P_{sr}(k) > T_{mr}(z) \\ S_r(k) & \text{if } P_{sr}(k) \le T_{mr}(z) \end{cases} \tag{4.13} \]

where Sr(k) is the DFT of sr(n) the selected subband of the received audio and Psr(k) is the

corresponding power spectrum.

The residual is transformed into time domain for watermark detection/extraction using IDFT i.e.

rr(n)=IDFT(Rr(k)) 4.14

The residual rr(n) is now used for watermark detection, by using normalized correlation test. The

normalized correlation between real sequences rr(n) and PN-sequence p(n) at the detector

generated using key K2 is defined as,

\[ cor(n) = \frac{\sum_{l=-M}^{M} r_r(l)\, p(n+l)}{\sum_{l=0}^{M} r_r^2(l) \cdot \sum_{l=0}^{M} p^2(l)} \tag{4.15} \]

where L is the length of the residual signal. High correlation implies the presence of watermark

as illustrated in Figure 4.3.

[Figure 4.3 (plots not reproduced): two panels showing normalized correlation. Left panel, "Normalized Correlation: Watermark Present", vertical axis from 0 to 1. Right panel, "Normalized Correlation: Watermark Absent", vertical axis from -0.02 to 0.05. The horizontal axis runs to 2000 in both panels.]

Figure 4.3: Normalized Correlation for watermarked subband (left) and unwatermarked subband (right).

The normalized correlation is compared with a threshold to determine the presence of a

watermark. Let hypothesis H1 denote the presence of a watermark in a selected subband and H0

denote the absence of a watermark. The decision criterion is

\[ \begin{aligned} H_1 &: \text{if } \max_n cor(n) \ge Th_7 : \text{watermark present} \\ H_0 &: \text{if } \max_n cor(n) < Th_7 : \text{watermark absent} \end{aligned} \tag{4.16} \]

If H1 is true then the embedded information is recovered by despreading rr(n) using the PN-

sequence generated using same key K2, then demodulating the resulting sequence using BPSK

demodulator followed by channel decoding. The detection process is illustrated in Figure 4.4.
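The correlation test and threshold decision can be sketched as follows. This is an illustration under assumptions, not the detector used in the experiments: the PN chips come from a seeded generator standing in for key K2, the correlation is normalized by the square root of the product of the two energies (the conventional choice, assumed here), and the threshold value 0.5 is arbitrary.

```python
import math
import random


def pn_from_key(key, length):
    """Stand-in PN generator: +/-1 chips from a seeded PRNG (hypothetical)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def peak_normalized_correlation(residual, pn):
    """Largest absolute normalized correlation over all alignments of pn."""
    n = len(pn)
    pn_energy = sum(c * c for c in pn)
    peak = 0.0
    for shift in range(len(residual) - n + 1):
        segment = residual[shift:shift + n]
        seg_energy = sum(v * v for v in segment)
        if seg_energy == 0.0:
            continue
        corr = sum(v * c for v, c in zip(segment, pn))
        peak = max(peak, abs(corr) / math.sqrt(seg_energy * pn_energy))
    return peak


def watermark_present(residual, pn, threshold=0.5):
    """Decide H1 (present) when the correlation peak reaches the threshold."""
    return peak_normalized_correlation(residual, pn) >= threshold


pn = pn_from_key(key=1234, length=255)
noise = random.Random(99)
clean = [noise.gauss(0.0, 0.5) for _ in range(300)]   # residual without watermark
marked = list(clean)
for i, chip in enumerate(pn):                          # watermark starts at sample 20
    marked[20 + i] += chip
```

With these toy signals the marked residual produces a correlation peak close to 0.9 at the true offset, while the unmarked residual stays far below the threshold at every alignment.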

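[Figure 4.4 (block diagram not reproduced): the watermarked audio Dm passes through audio content analysis and attack-sensitive region extraction (salient points and blocks x(l, i)), subband analysis using the modified wavelet analysis filter (Sb1(l) … Sbp(l)), subband selection with key k1i (Sbj(l)), masking threshold extraction using the psychoacoustic model (Tma(z)), and residual extraction using the masking threshold. The correlator detector, driven by the PN-sequence generator with key k2, compares the correlation peak with the threshold Th: if peak ≥ Th, the signal is passed through the block labeled "watermark spreading", BPSK demodulation, and channel decoding to give the output watermark sequence (100101101...); if peak < Th, the result is "No watermark".]

Figure 4.4: Block Diagram for Watermark Detection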

4.1.1.5 EXPERIMENTAL RESULTS

The robustness of the proposed scheme was tested on speech signals and music. The tests

included several degradations and distortions, i.e. addition of noise, lossy compression, low pass

filtering, resampling, random chopping, and multiple watermarks. The detection performance in

each case depends on the following measures, 1) watermark detection rate (WDR) which is a

measure of watermark detection, and 2) the bit accuracy rate (BAR) which is a measure of data

recovery. The bit accuracy rate is defined as,

\[ BAR = \frac{\text{Number of Bits Correctly Detected}}{\text{Number of Bits Embedded}} \tag{4.17} \]

and the watermark detection rate as

\[ WDR = \frac{\text{Number of Watermarked Frames Correctly Detected}}{\text{Number of Watermarked Frames Embedded}} \tag{4.18} \]

The overall performance of the system is defined as

\[ DPM = BAR \times WDR \tag{4.19} \]

where DPM stands for detection performance measure.
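As a worked example of Eq. 4.17 to Eq. 4.19, the sketch below computes the three measures for made-up counts. The bit strings and frame counts are illustrative only and are not results from the experiments; the function names are hypothetical.

```python
def bit_accuracy_rate(embedded, detected):
    """BAR: fraction of embedded bits that were detected correctly (Eq. 4.17)."""
    correct = sum(1 for e, d in zip(embedded, detected) if e == d)
    return correct / len(embedded)


def watermark_detection_rate(frames_detected, frames_embedded):
    """WDR: fraction of watermarked frames that were detected (Eq. 4.18)."""
    return frames_detected / frames_embedded


def detection_performance(bar, wdr):
    """DPM: product of BAR and WDR (Eq. 4.19)."""
    return bar * wdr


embedded_bits = [1, 0, 1, 1, 0, 0, 1, 0]
detected_bits = [1, 0, 1, 0, 0, 0, 1, 0]      # one bit error out of eight
bar = bit_accuracy_rate(embedded_bits, detected_bits)
wdr = watermark_detection_rate(frames_detected=9, frames_embedded=10)
dpm = detection_performance(bar, wdr)
```

A DPM of 1 therefore requires both that every watermarked frame is detected and that every embedded bit is recovered.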

Detection results for degraded watermarked audio based on DPM for a variety of conditions are

described below.

White Gaussian noise is added to the watermarked audio; the DPM values (as defined in Eq. 4.19) in the presence of white Gaussian noise with power from 0 to 50% of the signal power are shown in Figure 4.5.

[Figure 4.5 (plot not reproduced): "Detection Performance Measure vs Noise Power". Horizontal axis: noise power as a percentage of the audio power, from 0 to 50. Vertical axis: detection performance measure, from 0.975 to 1.005.]

Figure 4.5: DPM for different values of noise power (Pn)

Watermarked audio is down-sampled to 22.05 kHz and then interpolated to 44.1 kHz. The DPM value for this test is 1.

Watermarked audio undergoes ISO/MPEG-1 Audio Layer III encoding/decoding [165] at a bit rate of 128 kbps. The DPM value for the compression test is 1.

The watermarked audio signal is lowpass filtered with a 4 kHz cutoff frequency. Detection on the resulting audio gives a DPM of 0.995; detection performance is still acceptable despite severe audible distortion.

To investigate desynchronization attacks, one out of every 100 samples of the watermarked signal was randomly dropped. Detection applied to this signal gave a DPM of 1.

Three watermarks were simultaneously embedded in the audio, with a unique secret key assigned to, and a unique subband selected for, each watermark. The DPM was 1 as long as the number of watermarks was less than the number of analysis subbands.


4.2 Future Directions

A novel watermarking scheme for audio based on FS-DSSS is proposed. The technique introduces lower mean squared and perceptual distortion than existing spread spectrum schemes [59 – 67], because the watermark is embedded in a small frequency band of the complete audible frequency range. The watermarking capacity theory presented in [39 – 41] suggests that the proposed scheme can embed more information. The proposed method is also robust to standard data manipulations, i.e. noise addition, compression, random chopping and re-sampling.

4.2.1 Proposed Data Hiding Scheme for Images

We are currently investigating the extension of our frequency selective watermarking scheme [68] to digital image watermarking, image authentication, and image fingerprinting. Existing image watermarking schemes [5 – 24] generally use a blind detector for watermark detection, and the performance of blind or additive data embedding schemes is limited by host signal interference. This interference can be reduced if the watermark is selected from the null space of the host data, which improves the detection performance of the data hiding scheme because the host signal interference is then minimal. A Bayesian source separation approach can be used for blind watermark detection. Because the embedded watermark is orthogonal to the host, separation matrix estimation would also be computationally efficient.

For image decomposition we are investigating the use of an l-level discrete wavelet analysis filter bank. There are several reasons for using the DWT for watermark embedding: 1) the wavelet transform provides good space-frequency localization for analyzing image features such as edges or textured areas; 2) the multiresolution representation of the image allows hierarchical processing, which can be used for progressive watermark embedding/decoding; 3) the wavelet transform is flexible enough to adapt to a given set of images or to the application at hand; 4) wavelet coefficients can generally be characterized by a Gaussian distribution [166, 167], which will improve the computational efficiency of separation matrix estimation using the Bayesian approach; and 5) the wavelet transform is compatible with JPEG2000, the most recent still image compression standard. For watermark embedding all subbands except the lth-level approximation subband are selected, because embedding in the approximation subband would degrade the visual quality of the watermarked image. Moreover, because the diagonal subbands are generally less sensitive to quantization noise, they are generally discarded during lossy compression; the diagonal subbands at the 1st and 2nd levels are therefore not suitable for watermark embedding. The subbands along the horizontal and vertical orientations at the 2nd and 3rd levels are suitable for watermark embedding. Figure 4.6 illustrates the 3-level wavelet decomposition.

[Figure 4.6 (diagram not reproduced): wavelet decomposition with subbands labeled LL4, LH4, HL4, HH4, LH3, HL3, HH3, LH2, HL2, HH2, LH1, HL1, HH1.]

Figure 4.6: Three Level Wavelet Decomposition

The proposed frequency selective image watermarking scheme using the DWT is outlined as follows:

o Decompose the image I using an l-level discrete wavelet analysis filter bank.

o Randomly select a subband from the decomposed image using the secret key such that

\[ x = sb_i^{\theta_j} : i \in \{1, 2, 3, \ldots, l\} \ \forall\ \theta \in \{2, 4\}, \quad \text{where } j = 1, 2. \]

o Generate a watermark w with a non-Gaussian distribution using the secret key such that the watermark lies in the null space of the selected subband.

o Calculate the masking threshold for the selected subband using the noise visibility function [167].

o Embed the watermark in the selected subband as

\[ x_e = x_s + \alpha\, x\, w \tag{4.20} \]

where x_e is the watermark embedded subband, x_s are the wavelet coefficients above the masking threshold, and α is the level adaptive scaling factor.

o Reconstruct the watermarked image using the discrete wavelet synthesis filter bank.

To detect the watermark with a blind detector, some variation of a blind source separation scheme can be used; we plan to use a Bayesian source separation framework. Because the embedded watermark is uncorrelated with the subband in which it is embedded, separation matrix estimation is possible.
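The benefit of choosing the watermark from the null space of the host can be seen in a small numerical sketch. It makes several simplifying assumptions: the subband is treated as a flat vector of coefficients, the watermark is obtained by projecting a key-seeded binary sequence onto the orthogonal complement of the host, and a plain additive rule (host plus α times watermark) replaces the masking-aware embedding. All names are hypothetical, and the sketch does not implement the Bayesian source separation detector.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def null_space_watermark(host, key):
    """Key-seeded +/-1 sequence with its component along the host removed."""
    rng = random.Random(key)
    candidate = [rng.choice((-1.0, 1.0)) for _ in host]
    scale = dot(candidate, host) / dot(host, host)
    return [c - scale * h for c, h in zip(candidate, host)]


def embed(host, watermark, alpha):
    """Simple additive embedding (an assumption, not Eq. 4.20)."""
    return [h + alpha * w for h, w in zip(host, watermark)]


rng = random.Random(7)
host = [rng.gauss(0.0, 10.0) for _ in range(64)]   # toy subband coefficients
w = null_space_watermark(host, key=42)
marked = embed(host, w, alpha=0.5)

host_interference = dot(host, w)                   # numerically zero
detector_output = dot(marked, w)                   # equals alpha * dot(w, w)
```

Because the host contributes nothing to the correlation, the detector output depends only on the watermark energy and the scaling factor, which is the interference-free behavior the scheme aims for.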

For image authentication and fingerprinting applications, original image is available at the

detector therefore detection performance will definitely improve. The proposed scheme is

flexible enough that we can modify it depending on the application of interest. For example, for

image fingerprinting application, to detect the fingerprint we can use normalized correlation

function because the host data is available at the detector. Moreover, multiple fingerprints can be

embedded simultaneously.

4.2.2 Proposed Data Hiding Scheme for Video

Video and audio data deserve more attention from the data hiding research community because they can carry more data than digital images. Moreover, researchers have explored these data types very little in the past, and the entertainment industry faces greater losses from digital audio and video than from digital images. Audio and video data hiding can also be used in multimedia wireless communication to improve the QoS, providing enhanced performance for intended recipients and normal service for a regular user.

The image watermarking technique proposed in the previous section can be extended to video data hiding with certain modifications. For video data hiding, I-frames and motion compensation vectors can be used for data hiding, whereas P-frames can be used as a backup or for 2nd level data hiding. To counter frame replacement or frame shuffling attacks, frame pairing can be used for data embedding, i.e. information about frame f is embedded in frame f', where the index number of f' is greater than the index number of f. We are also investigating a data hiding scheme for efficient error concealment of multimedia transmission over lossy and bursty channels.

Chapter Summary

This chapter gives an overview of the blind embedding schemes. In Section 4.1 we discuss the existing additive data hiding schemes, their advantages, and their limitations. Our contribution in this class of data hiding schemes is provided in Section 4.2. Low host signal interference, reduced embedding distortion, high data rate, and robustness against adversary attacks are the attractive features of the proposed scheme. The simulation results show the robustness of the proposed scheme against common data manipulation attacks. Possible extensions of our frequency selective watermark embedding scheme to digital images and video are proposed in Section 4.3. For watermark detection we plan to use a Bayesian source separation approach. We are also investigating the use of data hiding for error concealment of multimedia transmission over a bursty channel.


CHAPTER 5

Informed Data Embedding

This chapter provides our initial contributions in the informed embedding or host interference rejection based embedding techniques and outlines the future work. In general, informed embedding based watermarking techniques embed information using an informed encoder, i.e. by exploiting knowledge of the host signal at the encoder, and use a blind detector for watermark extraction.

5.1 High Rate Data Embedding Using Informed Encoding: Our Work

It is clear from Eq. 3.14 that QIM based data embedding techniques introduce large embedding distortion when more robust data embedding is required at a fixed data rate. Moreover, QIM schemes do not take the human perceptual system into account. Therefore, for low perceptual distortion and better robustness, data embedding schemes have to incorporate the human perceptual system. Malik et al [75, 76] proposed a data hiding scheme using deterministic dithering in a selected frequency range of the audio signal. The frequency range selection for dithering is based on the human perceptual model.

5.1.1 Data Hiding Using Frequency Selective Dithering (Our Contribution)

In [75, 76] we propose a novel perception-based high capacity data hiding method. In this scheme we exploit the following properties of the HAS: the magnitude distortion at a specific frequency in an audio signal is inaudible if it is below the masking threshold; and human perception is less sensitive to absolute phase changes in a certain frequency range [2, 154]. Not all of the customary full range of audible frequencies, i.e. 20 Hz ~ 20 kHz, is suitable for data embedding. In the higher frequency range (f > 10 kHz, approximately) detection of small magnitude changes is unreliable due to insignificant signal energy. On the other hand, human perception is more sensitive to phase distortion in the lower frequency range (f < 4.0 kHz, approximately). The frequency range 4.0 < f < 10.0 kHz is therefore suitable for making embedded data imperceptible and robust to the standard manipulations.

The signal content in the above range is partitioned into subband signals using discrete wavelet

packet analysis filter bank (DWPA-FB). Data is embedded in selected subband signals by

introducing inaudible magnitude and phase distortion using finite-length impulse response (FIR)

approximations of allpass filters (APFs). Data is detected by estimating the parameters (pole-

zero) of the APF by estimating the power spectrum of the audio. In our method the power

spectrum is estimated using parametric signal models, i.e. moving average and autoregressive

models. The performance of this method is evaluated for detecting data that is embedded in an

audio clip using binary and 4 –ary encoding and is subjected to signal manipulations such as

addition of noise, lossy compression, resampling, and random chopping. Compared with existing

methods [62 – 64, 136, 137] the proposed technique is shown to embed 5-8 times more data for

binary encoding and twice for 4 –ary encoding.

5.1.1.1 FIR APPROXIMATION OF APF

An APF is suitable for data embedding because the phase distortion it introduces in the chosen frequency range is largely inaudible. Let the frequency response of the APF be H(e^{jω}) = k e^{jφ(ω)}. Estimation of the APF parameters from the processed audio consists of finding local extrema in the magnitude spectrum of the processed audio along radial lines through the pole-zero locations of the APF.

The transfer function H_AP(z) of a stable and causal first-order allpass filter can be expressed as

\[ H_{AP}(z) = \frac{z^{-1} - \alpha^*}{1 - \alpha z^{-1}} \tag{5.1} \]

where α ∈ ℂ, |α| < 1, and the region of convergence is |α| < |z|. The transfer function of a higher order APF can be expressed as a product of first-order allpass sections of the form specified in Eq. 5.1.

An allpass filter has an infinite-duration impulse response (IIR). Data is embedded by introducing controlled phase distortion using a fixed set of pole-zero locations. We use an FIR approximation of length L of an nth order APF. This introduces both magnitude and phase distortion. The magnitude distortion tends to zero as L → ∞. This is shown below.

Consider the stable, causal first order APF defined in Eq. 5.1:

\[ H_{AP}(z) = \frac{z^{-1} - \alpha^*}{1 - \alpha z^{-1}} = (z^{-1} - \alpha^*) \times \frac{1}{1 - \alpha z^{-1}} \]

\[ = (z^{-1} - \alpha^*) \left[ \sum_{k=0}^{\infty} (\alpha z^{-1})^k \right] \]

\[ = (z^{-1} - \alpha^*) \left[ \sum_{k=0}^{L} (\alpha z^{-1})^k + \sum_{k=L+1}^{\infty} (\alpha z^{-1})^k \right] \]

\[ = (z^{-1} - \alpha^*) \left[ \sum_{k=0}^{L} (\alpha z^{-1})^k \right] + (z^{-1} - \alpha^*) \left[ \sum_{k=L+1}^{\infty} (\alpha z^{-1})^k \right] \tag{5.2} \]

\[ = (z^{-1} - \alpha^*) \left( \frac{1 - (\alpha z^{-1})^{L+1}}{1 - \alpha z^{-1}} \right) + (z^{-1} - \alpha^*) \left( \frac{(\alpha z^{-1})^{L+1}}{1 - \alpha z^{-1}} \right) \tag{5.3} \]

The first term on the right hand side of Eq. 5.3 is the FIR approximation of an APF, referred to as H_FIR-AP(z). H_FIR-AP(z) can be expressed as

\[ H_{FIR\text{-}AP}(z) = H_{AP}(z) \left( 1 - (\alpha z^{-1})^{L+1} \right) \tag{5.4} \]

The factor (1 - (α z^{-1})^{L+1}) introduces L + 1 zeros at z_i = α e^{j2πi/(L+1)}, i = 0, 1, …, L. These L + 1 zeros are uniformly distributed on the circle |z| = |α|. The zero for i = 0, i.e. z_0 = α, cancels the pole at the same location; therefore H_FIR-AP(z) has L + 1 zeros altogether, of which L zeros are located at z_i = α e^{j2πi/(L+1)}, i = 1, 2, …, L, and the remaining zero lies at |z| = 1/|α|. The transfer function H_AP(z) of a single pole-zero pair is obtained from H_FIR-AP(z) as L goes to infinity. The nature of the distortion for different L is illustrated in Figure 5.1.
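The relationship between the truncated filter and the ideal APF can be checked numerically. The sketch below evaluates the first-order APF and its length-L truncation on the unit circle and measures how far the truncated magnitude response departs from unity; the pole value and the function names are illustrative assumptions.

```python
import cmath


def h_ap(z, alpha):
    """First-order allpass transfer function (z^-1 - alpha*) / (1 - alpha z^-1)."""
    zi = 1 / z
    return (zi - alpha.conjugate()) / (1 - alpha * zi)


def h_fir_ap(z, alpha, length):
    """Truncated series (z^-1 - alpha*) * sum_{k=0}^{L} (alpha z^-1)^k."""
    zi = 1 / z
    return (zi - alpha.conjugate()) * sum((alpha * zi) ** k for k in range(length + 1))


def max_magnitude_error(alpha, length, points=256):
    """Largest deviation of |H_FIR-AP| from 1 on a grid of the unit circle."""
    worst = 0.0
    for m in range(points):
        z = cmath.exp(2j * cmath.pi * m / points)
        worst = max(worst, abs(abs(h_fir_ap(z, alpha, length)) - 1.0))
    return worst


alpha = cmath.rect(0.9, 0.25 * cmath.pi)   # example pole, r = 0.9, omega = 0.25*pi
error_short = max_magnitude_error(alpha, length=10)
error_long = max_magnitude_error(alpha, length=50)
```

Since the truncated response equals the allpass response multiplied by (1 - (α z^{-1})^{L+1}), the magnitude error on the unit circle is bounded by |α|^{L+1}, which is why the error for L = 50 is far smaller than for L = 10.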


Figure 5.1: Magnitude Response of APF H(ejw) Approximation for Different values of Length (L).

An nth order APF is used in our method for embedding data. The nth order APF is realized with a

cascade of n/2 second order allpass filters. The use of cascaded form realization reduces the

effect of quantization of APF coefficients. Parameter α_i of the transfer function H_APi(z) used for data embedding is defined as α_i = r e^{jω_i}, where 0 < r < 1 and 0 < ω_i < π, with i = 0, 1 in binary encoding and i = 0, 1, 2, 3 in 4-ary encoding. The transfer function H_APi(z) of the APF used for data embedding is expressed as:

\[ H_{AP_i}(z) = \left( \frac{(z^{-1} - \alpha_i^*)(z^{-1} - \alpha_i)}{(1 - \alpha_i z^{-1})(1 - \alpha_i^* z^{-1})} \right)^{n/2}, \quad n = 2, 4, \ldots \tag{5.5} \]

Note that H_APi(z) has n/2 poles at each of α_i and α_i*, and n/2 zeros at each of the 1/α_i and 1/α_i* locations.

The parameter αi of the transfer function HAPi(z) for binary and 4-ary encoding used in data

embedding are given in Table1.

Table 1: APF parameters for binary and 4-ary schemes

Binary Encoding Scheme          4-ary Encoding Scheme
        r       ω                       r       ω
α0      0.9     0.25π           α0      0.95    0.2π
α1      0.9     0.75π           α1      0.95    0.4π
                                α2      0.95    0.6π
                                α3      0.95    0.8π

The pole-zero layouts of the 2nd order allpass filters used in binary encoding and 4-ary encoding

are illustrated in Figure 5.2 and 5.3 respectively.

Figure 5.2: Pole-Zero Layout of HAPi(z) for Binary Encoding

Figure 5.3: Pole-Zero Layout of HAPi(z) for 4-ary Encoding

5.1.1.2 DATA EMBEDDING

The data embedding process begins by dividing the input audio signal into non-overlapping blocks of N samples. Each block is then decomposed into 2^l subbands using an l-level DWPA-FB, where l and N are positive integers. According to the human auditory perceptual model, subbands corresponding to the 4 to 10 kHz range are relatively less sensitive to phase distortion and are robust to data manipulations. These subbands are selected for data embedding. To this end, ten subbands (from subband # 6 to subband # 15 at 44.1 kHz sampling rate with l = 5) are selected for data embedding in each block. One bit of data is embedded in each subband for the binary encoding scheme and two bits of data for the 4-ary scheme. For example, in binary encoding, bit 'm', m = 0, 1, is embedded by passing a selected subband through an APF with transfer function Hm(z).
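As a quick check of the subband selection, the sketch below computes the band edges of subbands 6 to 15 for a 5-level decomposition at 44.1 kHz. It assumes that the 32 subbands are numbered from 0 in order of increasing frequency and have equal width; the function name is hypothetical.

```python
def subband_edges_hz(index, levels, fs):
    """Lower and upper edge of an equal-width subband, numbered from 0."""
    width = (fs / 2) / (2 ** levels)
    return index * width, (index + 1) * width


fs = 44100
levels = 5
band_low, _ = subband_edges_hz(6, levels, fs)
_, band_high = subband_edges_hz(15, levels, fs)

bits_per_frame_binary = 10 * 1    # ten subbands, one bit each
bits_per_frame_4ary = 10 * 2      # ten subbands, two bits each
```

Under these assumptions each subband is about 689 Hz wide and the selected group spans roughly 4.1 kHz to 11.0 kHz, which is close to the 4 to 10 kHz range named in the text.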

To cope with desynchronization attacks, such as signal chopping, synchronization locations called salient points are identified for inserting data. Salient points are attack-sensitive locations in the input audio that can be used for synchronization. The salient points correspond to audio features to which the HAS is sensitive, such as fast energy climbing points. If an adversary alters these features, audible distortion is introduced [65]. In our method we adopt the salient point extraction method described by C-P. Wu et al [65]. In our implementation the thresholds Th1, Th2 and Th3 of [65] are set to ensure 1 - 2 salient points per second. We set r = 4000 samples, Th1 = 2, Th2 = mean energy of a one second audio window around n, and Th3 = 200 samples.

The following steps outline the data-embedding scheme:

o A list of salient points is extracted for a given audio signal using the method described

in [65].

o Starting with the first salient point, the audio signal is segmented into non-overlapping

frames of N –samples.

o Each frame is decomposed using a 5 –level DWPA-FB and ten subbands (from

subband 6 to 15) are selected for data embedding.

o One bit of channel-encoded data is embedded in each selected subband in the case of

binary encoding, and two bits for 4 –ary encoding.

o All frames that contain a salient point are embedded with synchronization code (using

a suitable bit sequence).

o Finally each frame is re-synthesized using discrete wavelet packet synthesis filter bank

(DWPS-FB).

A detailed block diagram of the data embedding process is given in Figure 5.4.
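The core embedding step, passing a subband through the FIR approximation of a second-order APF chosen by the data bit, can be sketched as follows. The binary parameters are read from Table 1 as r = 0.9 with ω = 0.25π for bit 0 and ω = 0.75π for bit 1, which is an interpretation of the table; the FIR length of 128 taps and all function names are assumptions made for illustration.

```python
import math


def apf_coefficients(r, omega):
    """Real coefficients of a second-order allpass with poles r*exp(+/-j*omega)."""
    c = 2 * r * math.cos(omega)
    num = [r * r, -c, 1.0]     # (z^-1 - a*)(z^-1 - a), in powers z^0, z^-1, z^-2
    den = [1.0, -c, r * r]     # (1 - a z^-1)(1 - a* z^-1)
    return num, den


def fir_approximation(num, den, length):
    """First `length` samples of the allpass impulse response."""
    h = []
    for n in range(length):
        acc = num[n] if n < 3 else 0.0
        if n >= 1:
            acc -= den[1] * h[n - 1]
        if n >= 2:
            acc -= den[2] * h[n - 2]
        h.append(acc)
    return h


def convolve(x, h):
    """Direct-form linear convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


APF_BINARY = {0: (0.9, 0.25 * math.pi), 1: (0.9, 0.75 * math.pi)}


def embed_bit(subband, bit, length=128):
    """Filter one subband with the FIR-approximated APF selected by `bit`."""
    num, den = apf_coefficients(*APF_BINARY[bit])
    return convolve(subband, fir_approximation(num, den, length))


h0 = fir_approximation(*apf_coefficients(*APF_BINARY[0]), 128)
energy = sum(v * v for v in h0)                    # close to 1 for an allpass
marked = embed_bit([1.0, 0.5, -0.25, 0.0], bit=0)
```

The impulse response energy stays very close to one because the truncated tail is small for r = 0.9, which is consistent with the magnitude distortion vanishing as the FIR length grows.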

[Figure 5.4 (block diagram not reproduced): the input audio Do passes through audio segmentation and subband decomposition using the DWPA-FB, giving subbands Sb0 to Sb31. The message (100101101...) is channel encoded and drives the APF selection algorithm, which selects the APF h_i(n) (i = 0, 1 for the binary scheme; i = 0, 1, 2, 3 for the 4-ary scheme) applied to each of the subbands Sb6 to Sb15, giving the data embedded subbands DSb6 to DSb15. Subband recomposition using the DWPS-FB and data merging produce the data embedded audio Dm. Here APF: Allpass Filter; DWPA-FB: Discrete Wavelet Packet Analysis Filter Bank; DWPS-FB: Discrete Wavelet Packet Synthesis Filter Bank; Sb: Subband; DSb: Data Embedded Subband.]

Figure 5.4: Block Diagram of the Data Embedding Scheme

5.1.1.3 DATA DETECTION USING SIGNAL MODELING

The detector first analyzes the data-embedded input audio to extract the list of salient points using the method described in [65]. Then, starting from the first salient point, the input audio signal is segmented into non-overlapping frames of N samples. Each frame is decomposed into subband signals using a 5-level DWPA-FB (as in the case of data embedding), after which ten subbands (from # 6 to 15) are selected for data recovery. Data recovery consists of power spectrum estimation of the selected subband signals using a priori knowledge of the signal model, followed by APF parameter estimation to recover the embedded information.

5.1.1.3.1 Spectrum Estimation

Our parametric spectrum estimation approach assumes an appropriate model of the process based on a priori knowledge of the signal. We know that the subband signals have been processed by an FIR approximation of an APF. Therefore, both autoregressive (AR) and moving average (MA) signal models of sufficient order can be used [1] for spectrum estimation.

An autoregressive process, x(n), can be represented as the output of an all-pole filter excited by

unit variance white noise. The estimated power spectrum of a pth order AR process is

\[ \hat{P}_{AR}(e^{j\omega}) = \frac{|\hat{b}(0)|^2}{\left| 1 + \sum_{k=1}^{p} \hat{a}_p(k)\, e^{-jk\omega} \right|^2} \tag{5.6} \]

where â_p(k) and b̂(0) are the estimates of the process model parameters. These p+1 estimates can be obtained from the data using methods such as the autocorrelation method, the covariance method, the modified covariance method, the Burg algorithm, etc. [1]. We use the Burg algorithm for pth order AR model parameter estimation.

A moving average process, x(n), can be generated by exciting a qth order FIR filter by unit

variance white noise. The estimated power spectrum of qth order MA process is,

\[ \hat{P}_{MA}(e^{j\omega}) = \left| \sum_{k=1}^{q} \hat{b}_q(k)\, e^{-jk\omega} \right|^2 \tag{5.7} \]

where b̂_q(k) are the estimates of the process model parameters. We use Durbin's method [1] for qth order MA model parameter estimation.

The next step for data recovery is to estimate the APF parameter α̂(r, ω) from the estimated spectrum P̂(e^{jω}).

5.1.1.3.2 Allpass Filter Parameter Estimation

To estimate the APF parameter α̂(r, ω) we need to estimate r and ω from the estimated spectrum P̂(e^{jω}), since α is a function of r and ω. In our method r is fixed for all APF parameters and only the frequency ω is varied for the information encoding schemes given in Table 1. Therefore, only the frequency ω needs to be estimated. It is estimated from the estimated spectrum P̂(e^{jω}) based on the results of the FIR approximation of the APF (discussed in Section 5.1.1.1).

We know, from Section 5.1.1.1, that the FIR approximation introduces magnitude distortion that manifests as a local extremum at z = e^{jω_i}. The extremum becomes more pronounced as we traverse from r e^{jω_i} → (1/r) e^{jω_i}. Moreover, this extremum is stronger and more evident for small values of the duration L of the FIR approximation to the APF (as illustrated in Figure 5.1). Therefore, to estimate the frequency ω we need to find consistent local extrema in the estimated spectrum P̂(e^{jω}). For a more accurate estimate of ω, spectra based on both the AR signal model and the MA signal model are used. Finally, nearest neighborhood hypothesis testing is applied to decode the embedded information.

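Hypothesis Testing: binary decoding scheme,

\[ \begin{aligned} H_0 &: |\hat{\omega} - \omega_0| < Th \ \Rightarrow\ 0, \qquad Th = 0.15\pi \\ H_1 &: |\hat{\omega} - \omega_1| < Th \ \Rightarrow\ 1 \end{aligned} \tag{5.8} \]

Hypothesis Testing: 4-ary decoding scheme

\[ \begin{aligned} H_0 &: |\hat{\omega} - \omega_0| < Th \ \Rightarrow\ 00, \qquad Th = 0.05\pi \\ H_1 &: |\hat{\omega} - \omega_1| < Th \ \Rightarrow\ 01 \\ H_2 &: |\hat{\omega} - \omega_2| < Th \ \Rightarrow\ 10 \\ H_3 &: |\hat{\omega} - \omega_3| < Th \ \Rightarrow\ 11 \end{aligned} \tag{5.9} \]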
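The nearest neighborhood decision can be sketched as a lookup against the set of embedding frequencies. The codebooks below read the binary frequencies as 0.25π and 0.75π and the 4-ary frequencies as 0.2π, 0.4π, 0.6π and 0.8π, which is an interpretation of Table 1; returning None for an estimate outside every decision region is also an assumption, since the text does not say how such cases are handled.

```python
import math

BINARY_CODEBOOK = {"0": 0.25 * math.pi, "1": 0.75 * math.pi}
QUATERNARY_CODEBOOK = {
    "00": 0.2 * math.pi,
    "01": 0.4 * math.pi,
    "10": 0.6 * math.pi,
    "11": 0.8 * math.pi,
}


def decode_symbol(omega_hat, codebook, threshold):
    """Return the bits of the nearest codebook frequency, or None if too far."""
    bits, omega = min(codebook.items(), key=lambda item: abs(omega_hat - item[1]))
    return bits if abs(omega_hat - omega) < threshold else None


binary_bit = decode_symbol(0.27 * math.pi, BINARY_CODEBOOK, 0.15 * math.pi)
quaternary_bits = decode_symbol(0.62 * math.pi, QUATERNARY_CODEBOOK, 0.05 * math.pi)
undecided = decode_symbol(0.5 * math.pi, QUATERNARY_CODEBOOK, 0.05 * math.pi)
```

The tighter 4-ary threshold reflects the closer spacing of its frequencies: a frequency estimate that is off by 0.1π is still decoded correctly in the binary scheme but falls outside every decision region in the 4-ary scheme.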


After estimating the coded bit sequence, channel decoding is applied to recover the original

message. The block diagram in Figure 5.5 illustrates the data detection in detail.

[Figure 5.5 (block diagram not reproduced): the data embedded audio x(n) passes through audio segmentation and subband decomposition using the DWPA-FB. For each of the subbands Sb6 to Sb15, parametric spectrum estimation gives P_AR(e^{jw}) and P_MA(e^{jw}), followed by APF parameter estimation (E_w) and hypothesis testing H_i: |E_w - w_i| < Th (i = 0, 1 for the binary case; i = 0, 1, 2, 3 for the 4-ary case). The synchronization code is then removed and channel decoding gives the recovered message (100101101...). Here APF: Allpass Filter; DWPA-FB: Discrete Wavelet Packet Analysis Filter Bank; Sb: Subband; E_w: Estimated Frequency; Th: Threshold.]

Figure 5.5: Block Diagram of the Data Detection Process

5.1.1.3.3 Simulation Results

Imperceptibility and robustness are the two benchmarks used for performance evaluation of the proposed data hiding scheme. Robustness is measured based on the probability of error in the received data under different degradations: a) noise addition, b) lossy compression, c) random chopping and d) resampling; for both encoding schemes. The probability of error Pe is defined as

\[ P_e = \left( 1 - \frac{\text{Number of Bits Correctly Detected}}{\text{Number of Bits Embedded}} \right) \times 100 \tag{5.10} \]

For robustness test we have following observations:

o White Gaussian noise with 0% to 100% of the audio power is added to the data-

embedded audio signal. The probability of error Pe of the recovered data for different

values of signal to noise ratio (SNR in dB) and for both encoding schemes is given in

Figure 5.6.

Figure 5.6: Probability of error (Pe) vs. SNR Plot for Both Encoding Schemes

o Data embedded audio signal is compressed using MPEG layer III coder [165]. Despite

the lossy compression the Pe value of the recovered data was below 1% for binary

encoding scheme. The Pe value for 4-ary encoding was higher.

o To test the robustness against desynchronization attacks, 2- 4 samples out of every 100

samples of the data-embedded audio are dropped randomly, probability of error Pe for

the resulting audio was error for both encoding schemes.

o In this test the data-embedded audio is first down-sampled to 22.05 kHz and then interpolated back to 44.1 kHz. The probability of error Pe of the recovered data after resampling was 0% for the binary encoding case and 0.5% for the 4-ary encoding case.
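The noise-addition test can be sketched as follows. This is only an illustration under stated assumptions: the noise level is expressed as a fraction of the measured signal power, a 440 Hz tone stands in for the data-embedded audio, and the helper names `snr_db` and `add_white_gaussian_noise` are hypothetical.

```python
import math
import random


def snr_db(noise_fraction):
    """SNR in dB when the noise power equals noise_fraction times the signal power."""
    if noise_fraction <= 0:
        return math.inf
    return 10.0 * math.log10(1.0 / noise_fraction)


def add_white_gaussian_noise(signal, noise_fraction, rng):
    """Add zero-mean Gaussian noise whose power is a fraction of the signal power."""
    power = sum(s * s for s in signal) / len(signal)
    sigma = math.sqrt(noise_fraction * power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Stand-in for one second of data-embedded audio at 44.1 kHz.
audio = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
noisy = add_white_gaussian_noise(audio, 0.30, rng)
```

Under this convention, noise at 100% of the audio power corresponds to 0 dB SNR, and noise at 30% corresponds to about 5.2 dB.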

5.1.1.4 DATA DETECTION USING MATCH FILTER

The recovery of embedded data requires the detection of the APF parameter pi used for data embedding in each subband. The first step in data detection is the analysis of the data-embedded audio to extract the set of salient points (as discussed in Section 2). Starting from the first salient point in the set, the audio signal is segmented into non-overlapping frames of P samples. Each frame is decomposed using a 5-level DWPA-FB, and twelve subbands, sb4 to sb15, are selected for data detection. The detector evaluates the finite-length Z-transform of each selected subband at all possible values of the APF parameter (i.e. at p0 and p1 in our implementation). The detector does not know in advance which APF was used for data embedding in the received subband sequence; however, it does know the parameters of the APFs used for data embedding, i.e. p0 and p1. The decision for bit ‘0’ or bit ‘1’ is made by calculating the Z-transform of the subband sequence at 1/p0 and 1/p1 (zero-tracking) and then estimating the local minima of the magnitude spectrum.
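The decision rule described above can be illustrated with a small sketch. Several things here are assumptions rather than details taken from the text: the APF is modelled as a second-order real all-pass section with poles at r·exp(±jω_i), the value ω1 = 0.5π is hypothetical, the 128-sample random sequence merely stands in for a subband, and the detector simply picks the candidate zero location at which the finite-length Z-transform has the smaller magnitude.

```python
import cmath
import math
import random


def allpass(x, r, omega):
    """Second-order real all-pass section (assumed form), poles at r*exp(+/-j*omega)."""
    c = 2.0 * r * math.cos(omega)
    y = []
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        # y[n] - c*y[n-1] + r^2*y[n-2] = r^2*x[n] - c*x[n-1] + x[n-2]
        y.append(r * r * x[n] - c * x1 + x2 + c * y1 - r * r * y2)
    return y


def z_transform(seq, z):
    """Finite-length Z-transform: sum over n of seq[n] * z**(-n)."""
    z_inv, term, total = 1.0 / z, 1.0 + 0j, 0j
    for s in seq:
        total += s * term
        term *= z_inv
    return total


def detect_bit(y, r, omegas):
    """Return the index i whose zero location exp(j*omega_i)/r gives the smallest |Y(z)|."""
    mags = [abs(z_transform(y, cmath.exp(1j * w) / r)) for w in omegas]
    return mags.index(min(mags))


R = 0.9
OMEGAS = (0.25 * math.pi, 0.5 * math.pi)   # omega_1 is a hypothetical value
rng = random.Random(1)
x = [rng.uniform(-1.0, 1.0) for _ in range(128)]   # stand-in subband sequence
decoded = [detect_bit(allpass(x, R, OMEGAS[b]), R, OMEGAS) for b in (0, 1)]
```

In this toy setting each embedded bit is expected to be recovered, because the truncated output still has a deep null at the zero of the filter that produced it.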

In practice, zero-tracking is used for APF parameter estimation because the output of a stable and causal APF is theoretically an infinite sequence: yk,j(n), the output of the APF in Eq. 3, is the convolution of a finite-length input sequence xk,j(n) and an infinite sequence hi(n) (as Hi(z) is a rational function of z). For APF parameter estimation, however, only a finite-length sequence is available at the detector input.

Let ỹk,j(n) be the finite-length approximation of the infinite sequence yk,j(n) available at the detector input, obtained by dropping the higher-indexed terms. This approximation is valid only if Yk,j(z) converges, which is true at z = 1/p0 (the zeros of the APF). This is illustrated in Figure 5.7. The right-hand panels show the chirp z-transform (CZT) of a finite-length subband sequence x4,6(n) before (top) and after (bottom) passing through the APF H0(z), calculated at r = 1/0.9; a clear minimum occurs in the bottom-right panel at ω = ω0 ≈ 0.25π. Similarly, the left-hand panels show the CZT of the same sequence calculated at r = 0.9 before (top) and after (bottom) passing through H0(z), but there is no maximum at ω = ω0 ≈ 0.25π. This is presumably because the finite-length approximation of an infinite-length sequence is not accurate at z = p0.
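The difference between the two evaluation radii can be made explicit with a short bound. The following is a sketch under assumptions not stated in the text: the APF poles have radius r < 1, the output satisfies |y_{k,j}(n)| ≤ C ρ^n for some constant C and some ρ with r < ρ < 1, and L denotes the number of samples retained at the detector.

```latex
\tilde{Y}_{k,j}(z) = \sum_{n=0}^{L-1} y_{k,j}(n)\, z^{-n}
                   = Y_{k,j}(z) - \sum_{n=L}^{\infty} y_{k,j}(n)\, z^{-n}

% On the circle |z| = 1/r, which contains the APF zero 1/p_0:
\left| Y_{k,j}(z) - \tilde{Y}_{k,j}(z) \right|
  \le C \sum_{n=L}^{\infty} (\rho r)^{n}
  = \frac{C\,(\rho r)^{L}}{1 - \rho r}

% On the circle |z| = r, which contains the APF pole p_0:
\sum_{n=0}^{\infty} \left| y_{k,j}(n) \right| r^{-n}
  \ \text{diverges in general, since } Y_{k,j}(z) \text{ converges only for } |z| > r
```

With r = 0.9 the truncation error on |z| = 1/r decays at least as fast as (ρ r)^L, so the null at 1/p0 survives truncation, whereas the finite sum evaluated on |z| = r stays bounded and cannot reproduce the pole at p0.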

[Figure 5.7 panels: CZT of x4,6(n) at r = 0.9; CZT of y4,6(n) at r = 0.9; CZT of x4,6(n) at r = 1/0.9; CZT of y4,6(n) at r = 1/0.9. Horizontal axis: normalized frequency f; vertical axes: |X4,6(f)| and |Y4,6(f)| in dB.]

Figure 5.7: Magnitude spectrum of the CZT of the subband sequence x4,6(n) before and after passing through H0(z), i.e. y4,6(n), at r = 0.9 (left) and at r = 1/0.9 (right).

The detector therefore uses zero-tracking for APF parameter estimation. Since r = 0.9 is fixed in our implementation, only ωi needs to be estimated. This is done by locating the local minimum of the magnitude response of the CZT of the selected subband calculated at r = 1/0.9, after which bit ‘0’ or bit ‘1’ is decided by the nearest-neighbor rule. Finally, the received data is channel decoded to recover the embedded information. The block diagram in Figure 5.8 illustrates the data detection process in detail.
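The minimum search and nearest-neighbor decision can be sketched as below. This is an illustration only: the Z-transform is evaluated directly on a uniform grid over the circle |z| = 1/r rather than with a fast CZT routine, the deepest grid minimum is used in place of a local-minimum search, the tolerance `tol` and ω1 = 0.5π are hypothetical, and the test input is produced by an FIR stand-in (`zeros_at`) that only reproduces the zero locations an APF numerator would introduce.

```python
import cmath
import math


def spectrum_on_circle(seq, radius, n_points):
    """|Z-transform| of seq at z = radius*exp(j*w) for w on a uniform grid over [0, pi]."""
    grid = [math.pi * k / (n_points - 1) for k in range(n_points)]
    mags = []
    for w in grid:
        z_inv = cmath.exp(-1j * w) / radius
        term, total = 1.0 + 0j, 0j
        for s in seq:
            total += s * term
            term *= z_inv
        mags.append(abs(total))
    return grid, mags


def decide_bit(seq, r, omegas, tol, n_points=513):
    """Find the deepest minimum on |z| = 1/r and map it to the nearest omega_i.

    Returns the bit index, or None when no candidate lies within tol.
    """
    grid, mags = spectrum_on_circle(seq, 1.0 / r, n_points)
    w_hat = grid[mags.index(min(mags))]
    best = min(range(len(omegas)), key=lambda i: abs(w_hat - omegas[i]))
    return best if abs(w_hat - omegas[best]) < tol else None


def zeros_at(x, r, omega):
    """FIR stand-in that places zeros at (1/r)*exp(+/-j*omega)."""
    taps = (r * r, -2.0 * r * math.cos(omega), 1.0)
    return [sum(t * x[n - k] for k, t in enumerate(taps) if 0 <= n - k < len(x))
            for n in range(len(x) + 2)]


R = 0.9
OMEGAS = (0.25 * math.pi, 0.5 * math.pi)   # omega_1 is a hypothetical value
TOL = 0.05 * math.pi                        # hypothetical decision tolerance
x = [math.sin(0.37 * n) + 0.5 * math.cos(1.3 * n) for n in range(32)]
decisions = [decide_bit(zeros_at(x, R, w), R, OMEGAS, TOL) for w in OMEGAS]
```

Returning `None` when the estimated frequency is far from every candidate is a design choice of this sketch; it lets a later stage, such as channel decoding, treat the symbol as an erasure.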

[Figure 5.8 blocks: data-embedded audio; audio content analysis for salient point extraction (SPk, k = 1~m); audio segmentation from SP1 into x(k,n), n = 1~L; sub-band decomposition (sb0 to sb31, with sb4 to sb15 selected); chirp Z-transform at 1/r; estimate minima w and decide 0 if |w - w0| < e, 1 if |w - w1| < e; channel decoding; removing synchronization code; recovered message 101101001...]

Figure 5.8: Block Diagram of the Data Detection Using Match Filter

5.1.1.4.1 Simulation Results

The proposed data hiding scheme divides the input audio into 10 ms frames. Twelve subbands (as discussed in Section 3) are selected after subband decomposition for data hiding. Hence the embedding rate is 12 × 100 = 1200 bps, which is 10 to 15 times higher than that of existing data embedding methods [62 – 65, 136]. We applied the proposed data hiding scheme to a variety of music clips having diverse frequency characteristics. Imperceptibility and robustness are the two benchmarks used for performance evaluation of the proposed data hiding scheme. Robustness is measured by the probability of error in the received data under different degradations: a) noise addition, b) lossy compression, c) random chopping and d) resampling.
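The embedding-rate arithmetic above can be written out as a small sketch. It assumes one bit per selected subband per frame, which is what the figure 12 × 100 implies, and a 44.1 kHz sampling rate, taken from the resampling test; the function names are illustrative.

```python
def embedding_rate_bps(frame_ms, subbands, bits_per_subband=1):
    """Bits per second when every selected subband of every frame carries bits_per_subband bits."""
    frames_per_second = 1000 // frame_ms
    return frames_per_second * subbands * bits_per_subband


def frame_length_samples(sample_rate_hz, frame_ms):
    """Number of samples P in one frame."""
    return sample_rate_hz * frame_ms // 1000


rate = embedding_rate_bps(frame_ms=10, subbands=12)   # 100 frames/s times 12 bits per frame
frame_len = frame_length_samples(44100, 10)           # samples in one 10 ms frame
```

Under these assumptions the rate is 1200 bps and each frame holds 441 samples.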

For the robustness tests we have the following observations:

o White Gaussian noise with 0% to 30% of the audio power is added to the data-embedded audio signal. The probability of error of the recovered data for different values of signal-to-noise ratio (SNR in dB) is plotted in Figure 5.9.

[Figure 5.9 plot: “Probability of Error vs SNR”; vertical axis: probability of error, 0 to 0.08; horizontal axis: SNR, 5.23 to 15.23 and Inf.]

Figure 5.9: Probability of Error for different SNR values.

o The data-embedded audio signal is compressed using an MPEG layer III coder. Despite the lossy compression, the Pe value of the recovered data was below 2%.

o To test robustness against desynchronization attacks, one sample out of every 100 samples of the data-embedded audio is dropped. The probability of error was 1.69%.

o In this test the data-embedded audio is down-sampled to 22.05 kHz and then interpolated back to 44.1 kHz. The probability of error of the recovered data after resampling was 2.01%.
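The chopping and resampling degradations listed above can be approximated with the following sketch. It is an assumption-laden illustration: the dropped sample is chosen uniformly at random within each block of 100, the resampler keeps every second sample and restores the rate by linear interpolation with no anti-aliasing filter, and the helper names are hypothetical.

```python
import math
import random


def drop_one_per_block(signal, block=100, rng=None):
    """Random chopping: remove one randomly chosen sample from every full block."""
    rng = rng or random.Random(0)
    out = []
    for start in range(0, len(signal), block):
        chunk = list(signal[start:start + block])
        if len(chunk) == block:
            del chunk[rng.randrange(block)]
        out.extend(chunk)
    return out


def down_up_resample(signal):
    """Halve the rate by keeping even-indexed samples, then restore it by linear interpolation.

    No anti-aliasing filter is applied; a practical resampler would include one.
    """
    low = signal[::2]
    out = []
    for i, s in enumerate(low):
        out.append(s)
        nxt = low[i + 1] if i + 1 < len(low) else s
        out.append(0.5 * (s + nxt))
    return out[:len(signal)]


# Stand-in for one second of data-embedded audio at 44.1 kHz.
audio = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
chopped = drop_one_per_block(audio)      # one sample removed per 100
resampled = down_up_resample(audio)      # 44.1 kHz -> 22.05 kHz -> 44.1 kHz
```

Chopping shortens the signal and shifts every later frame boundary, which is why it acts as a desynchronization attack, while the resampled signal keeps its length but loses detail above the reduced Nyquist frequency.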

5.2 Future Direction

We propose a novel method of high-capacity data hiding based on controlled inaudible distortion introduced in selected subbands of an audio signal using FIR approximations of an nth-order APF. The proposed technique is robust to standard data manipulations, yielding low error probability. The error probability can be further reduced by using channel coding with higher error-correction capability. Performance has so far been evaluated with informal listening tests; we are seeking approval to conduct formal subjective listening tests. We are currently investigating the extension of the proposed scheme to copy-control and digital watermarking applications for audio as well as for other multimedia data types such as images and video. A few intended extensions of the proposed scheme are discussed next.


5.2.1 Extension to Audio Fingerprinting and Authentication

We are currently investigating the extension of our frequency-selective phase alteration based data hiding schemes [75, 76] to a collusion-resistant audio fingerprinting system. Fingerprinting applications use an informed detector, i.e. the host data is available at the detector, which will further improve detector performance. The proposed schemes [75, 76] can withstand noise addition, compression, random chopping, filtering, etc.; the extended scheme is therefore expected to resist common data hiding attacks [90 – 103]. Traitor tracing and robustness against collusion attacks are the main requirements of a good fingerprinting scheme. To extend the existing data hiding schemes [75, 76] with these features, we are investigating an efficient metadata (fingerprint) embedding strategy in the available 2^l subbands using multiple-level embedding [6], and we are considering the Viterbi algorithm or a similar algorithm for data decoding. Moreover, audio fingerprinting and authentication applications will require only a lower-order APF for information embedding: because these applications generally use an informed detector, a small embedding distortion, achievable with a lower-order APF, would be sufficient for the detector to decode the data.

Chapter Summary

This chapter gives an insight into informed embedding, a class of data hiding schemes. We briefly discuss the existing data hiding schemes that fall into this category. Performance analysis of host-interference-rejection based data hiding schemes such as QIM and its practical extension, binary dither modulation, is provided in Section 5.2. In Section 5.3, our contribution to this class of data embedding is discussed in detail. Robustness, fidelity and high data rate are the attractive features of the proposed scheme. Simulation results for the robustness of the proposed scheme against common signal manipulations are provided for detection 1) using signal modeling and 2) using match filter. Extensions of the proposed schemes to audio fingerprinting and authentication are proposed as future directions.

CHAPTER 6

Conclusion & Future Directions

In this dissertation we propose to design an analytical framework to support a data hiding system for digital rights management of multimedia archives. We study some salient features of the two main data hiding classes based on the embedding method, that is, blind (additive) embedding and informed (host signal interference rejection based) embedding. We highlighted the limitations of the existing additive data hiding schemes, such as low embedding capacity and limited robustness against attacks. We then discuss the theoretical performance of informed embedding schemes [69 – 76] to compare the performance of this class with blind embedding. The fidelity constraint of data embedding schemes limits the embedding capacity for both data embedding classes. We have already developed preliminary data hiding systems for both data hiding categories. High data embedding capacity and low perceptual distortion are the attractive features of the proposed systems [68, 75, 76].

In future, we intend to extend our proposed data hiding schemes to develop a data hiding system

for digital rights management of multimedia data. To meet the technological challenges of a

reliable DRM system we want to expand our research in the following directions:

o Develop attack channel models and associated performance for the proposed data

hiding schemes.

o Devise theoretical framework for the proposed data hiding schemes and compare its

performance with the existing data hiding schemes.

o Develop a data hiding system for enhanced QoS of multimedia communication over lossy and bursty communication channels, then analyze the associated performance of the proposed system for real-world communication channels.

o Develop and analyze a data hiding scheme for data embedding in compressed domain

for each data type (audio, images, and video).

o Develop a robust fingerprinting scheme for secure music distribution on the Web and

analyze its performance against different types of collusion attacks.

o Develop a data hiding scheme for music transmission over a wireless channel in which the intended recipients (listeners or subscribers) can enjoy hi-fi music while the general public hears ordinary-quality music at the same time.

o Explore the possibility of extending data hiding schemes for multimedia indexing and

retrieval application.


References:

Books:

[1] M. H. Hayes, “Statistical Digital Signal Processing and Modeling,” John Wiley & Sons, Inc., NY, 1996.

[2] E. Zwicker, and H. Fastl, “Psychoacoustics: Facts and Models,” Springer-Verlag, Berlin, 1999.

[3] N. S. Jayant and P. Noll, “Digital Coding of Waveforms: Principles and Applications to Speech and Video”, Englewood Cliffs, NJ: Prentice-Hall, 1984.

[4] T. M. Cover, and J. A. Thomas, “Elements of Information Theory”, John Wiley & Sons, New York, 1991.

[5] J. Eggers and B. Girod, “Informed Watermarking”, Kluwer Academic Publishers, 2002.

[6] M. Wu, and B. Liu, “Multimedia Data Hiding”, Springer Verlag, Oct 2002.


[7] I. J. Cox, M. L. Miller, and J. A. Bloom, “Digital Watermarking”, Morgan Kaufmann, 2001.

[8] Neil F. Johnson, Zoran Duric, and Sushil Jajodia, “Information Hiding: Steganography and Watermarking - Attacks and Countermeasures”, Kluwer Academic Publishers, 2000.

Data Hiding Journals & Special Issues:

[9] F. A. P. Petitcolas, and H. J. Kim, editors, Digital Watermarking, Proceedings of the 1st international workshop on Digital Watermarking, Lecture Notes in Computer Science, Vol. 2613, Seoul, Korea, Nov. 2002.

[10] J. Feigenbaum, editor, “Digital Rights Management”, Proceedings of the ACM CSS-9 workshop on Digital Right Management, Lecture Notes in Computer Science, Vol. 2696, Washington, DC, USA, Nov. 2002.

[11] F. A. P. Petitcolas, editor, Information hiding. Proceedings of the 5th international workshop on information hiding, Lecture Notes in Computer Science, Vol. 2578, Noordwijkerhout, The Netherlands, Oct 2002.

[12] I. S. Moskowitz, editor, Information hiding. Proceedings of the 4th international workshop on information hiding, Lecture Notes in Computer Science, Vol. 2137, Pittsburgh, PA, April 2001.

[13] A. Pfitzmann, editor, Information hiding. Proceedings of the 3rd international workshop on information hiding, Lecture Notes in Computer Science, Vol. 1768, Dresden, Germany, Sep/Oct 1999.

[14] D. Aucsmith, editor, Information hiding. Proceedings of the 2nd international workshop on information hiding, Lecture Notes in Computer Science, Vol. 1525, Portland, OR, April 1998.

[15] R. Anderson, editor, Information hiding. Proceedings of the 1st international workshop on information hiding, Lecture Notes in Computer Science, Vol. 1174, Cambridge, UK, May/April 1996.

[16] Proceedings of the SPIE/IS&T Inter. Conf. on Security and Watermarking of Multimedia Contents III, Vol. 4314, San Jose, CA, Jan 2001.

[17] Proceedings of the SPIE/IS&T Inter. Conf. on Security and Watermarking of Multimedia Contents II, Vol. 3971, San Jose, CA, Jan 2000.

[18] Proceedings of the SPIE/IS&T Inter. Conf. on Security and Watermarking of Multimedia Contents, Vol. 3657 San Jose, CA, Jan 1999.

[19] IEEE Trans. on Signal Processing, Special issue on Signal Processing for Data Hiding in Digital Media & Secure Content Delivery, vol. 51(4), April 2003.

[20] IEEE Communication Magazine, Aug 2001.

[21] IEEE Signal Processing Magazine, Sep 2000.

[22] Proceedings of the IEEE, Special Issue on Identification and Protection of Multimedia Information, 87(7), July 1999.

[23] Signal Processing, Special issue on Watermarking, Vol. 66(3), May 1998.

[24] IEEE J. Select. Areas Communications, Special Issue on Copyright and Privacy Protection, 16(4), May 1998.

PhD Dissertations:

[25] D. Karakos, “Digital Watermarking, Fingerprinting, and Compression: An Information-Theoretic Perspective”, Ph. D. Dissertation, University of Maryland, College Park, June 2002.


[26] J. Song, “Optimal Rate Allocation and Security Schemes for Image and Video Transmission over Wireless Channels”, Ph. D. Dissertation, University of Maryland, College Park, June 2002.

[27] A. Cohen, “Information Theoretic Analysis of Watermarking Systems”, Ph. D. Dissertation, MIT, Sept 2001.

[28] M. Wu, “Multimedia Data Hiding”, Ph. D. Dissertation, Princeton University, April 2001.

[29] C-Y. Lin, “Watermarking and Digital Signature Techniques for Multimedia Authentication and Copyright Protection”, Ph. D. Dissertation, Columbia University, Dec. 2000.

[30] B. Chen, “Design and Analysis of Digital Watermarking, Information Embedding and Data Hiding Systems”, Ph. D. Dissertation, MIT, June 2000.

[31] D. Kundur, “Multiresolution Digital Watermarking: Algorithms and Implications for Multimedia Signals”, Ph. D. Dissertation, University of Toronto, 1999.

[32] M. Ramkumar, “Data Hiding in Multimedia –Theory and Applications”, Ph. D. Dissertation, New Jersey Institute of Technology, Nov. 1999.

[33] L. Qiao, “Multimedia Security and Copyright Protection”, Ph. D. Dissertation, University of Illinois at Urbana-Champaign, 1998.

Communication Theory:

[34] C. E. Shannon, “Channels with Side Information at the Transmitter”, IBM J. Res. Develop., 2: 289-293, 1958.

[35] G. I. Gel’fand, and M. S. Pinsker, “Coding for Channels with Random Parameters”, Problems of Control and Information Theory, 9(1):19-31, 1980.

[36] C. Heegard, and A. A. El Gamal, “On the Capacity of the Memory with Defects”, IEEE Trans. Info. Theory, 29(5): 731-739, Sept 1983.

[37] M. H. M. Costa, “Writing on Dirty Paper”, IEEE Trans. Info. Theory, 29(3): 439-441, May, 1983.

[38] R. L. Pickholtz, D. L. Schilling, and L. B. Milstein, “Theory of Spread Spectrum Communications-A Tutorial,” IEEE Trans. on Communications, vol. COM-30, pp. 855-884, May, 1982.

Information-Theoretic Analysis:

[39] P. Moulin, M. K. Mihcak, and G.-I. Lin, “An Information-Theoretic Model for Image Watermarking and Data Hiding,” Proc. IEEE Inter. Conf. on Image Proc., Vancouver, B.C., Sep 2000.

[40] P. Moulin and J. A. O'Sullivan, “Information-Theoretic Analysis of Information Hiding,” IEEE Trans. on Information Theory, Vol. 49, No. 3, pp. 563-593, March 2003.

[41] P. Moulin, “The Role of Information Theory in Watermarking and Its Application to Image Watermarking,” Signal Processing, Vol. 81, No. 6, pp. 1121-1139, June 2001.

[42] A. S. Cohen, and R. Zamir, “Writing on Dirty Paper in the Presence of Difference Set Noise”, 41st Annual Allerton Conf. on Comm. Control and Computing, October, 2003.

[43] R. Zamir, and A. S. Cohen, “The Rate Loss in Writing on Dirty Paper”, DIMACS Workshop on Network Information Theory, Rutgers University, March 2003.

[44] A. S. Cohen, and A. Lapidoth, “Generalized Writing on Dirty Paper”, International Symposium on Information Theory (ISIT), p. 227, Lausanne, Switzerland, July 2002.

[45] A. S. Cohen, and A. Lapidoth, “Watermarking Capacity for Gaussian Sources”, 39th Annual Allerton Conference on Communication, Control and Computing, October, 2001.


[46] A. S. Cohen, and A. Lapidoth, “The Capacity of the Vector Gaussian Watermarking Game”, Inter. Symposium on Info. Theory (ISIT), p. 4, Washington, DC, June 2001.

Data Hiding Game:

[47] P. Moulin, “Information-Hiding Games,” 1st Workshop on Digital Watermarking, Lecture Notes in Computer Sciences, Vol. 2613, Seoul, Korea, Nov 2003.

[48] P. Moulin, and A. Ivanovic, “The Zero-Rate Spread-Spectrum Watermarking Game,” IEEE Trans. on Signal Processing, Vol. 51, No. 4, pp. 1098-1117, April 2003.

[49] T. Liu, and P. Moulin, “Error Exponents for Watermarking Game with Squared-Error Constraints,” IEEE Proc. Int. Symp. on Info. Theory, Yokohama, Japan, July 2003.

[50] T. Liu, and P. Moulin, “Error Exponents for One-Bit Watermarking,” IEEE Proc. ICASSP, Hong Kong, April 2003.

[51] P. Moulin, “A Mathematical Approach to Watermarking and Data Hiding,” ICASSP Tutorial, Orlando, FL, May 2002.

[52] P. Moulin and M. K. Mihcak, “A Framework for Evaluating the Data-Hiding Capacity of Image Sources,” IEEE Trans. on Image Processing, Vol. 11, No. 9, pp. 1029-1042, Sep. 2002.

[53] P. Moulin, and M. K. Mihcak, “The Parallel-Gaussian Watermarking Game,” UIUC Tech. Rep. UIUC-ENG-01-2214, IEEE Trans. on Information Theory, Feb. 2004.

[54] P. Moulin and A. Ivanovic, “The Watermark Selection Game,” Proc. Conf. on Info. Sciences and Systems, Baltimore, MD, March 2001.

[55] A. S. Cohen, and A. Lapidoth, “The Gaussian Watermarking Game”, IEEE Trans. on Info. Theory, Vol. 48(6), pp. 1639-1667, June 2002.

[56] A. Cohen, and A. Lapidoth, “On the Gaussian Watermarking Game", Inter. Symposium on Info. Theory (ISIT), p. 48, Sorrento, Italy, June 2000.

Blind Embedding:

[57] J. J. Eggers, J. K. Su, and B. Girod, “A Blind Watermarking Scheme Based on Structured Code Books,” Proc. IEE Conference on Secure Images and Image Authentication, London, U.K., April 2000.

[58] J. J. Eggers, J. K. Su, and B. Girod, “Robustness of a Blind Watermarking Scheme,” Proc. IEEE International Conference on Image Processing, ICIP-2000, Vancouver, Canada, Sept 2000.

[59] I.J. Cox, J. Kilian, T. Leighton, and T. Shamoon, “Secure Spread Spectrum Watermarking for Multimedia”, IEEE Trans. on Image Processing, 6, 12, 1673-1687, 1997.

[60] R. B. Wolfgang, C. I. Podilchuk, and E. J. Delp, “Perceptual Watermarks for Digital Images and Video”, Proce. IEEE, Vol. 87(7), pp. 1108-1126, July 1999.

[61] M. D. Swanson, B. Zhu, A. H. Tewfik, and L. Boney, “Robust audio watermarking using perceptual masking,” Signal Processing, vol. 66, pp. 337-355, 1998.

[62] P. Bassia and I. Pitas, “Robust audio watermarking in the time domain,” Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP99), 1999.

[63] M. F. Mansour, and A. H. Tewfik, “Time-scale invariant audio data embedding,” Proc. IEEE International Conference on Multimedia and Expo, ICME, Japan, August 2001.

[64] Y. Yardimci, A. E. Cetin, and R. Ansari, “Data hiding in speech using phase coding,” Proc. Eurospeech Conference,1997.


[65] C.-P. Wu, P.-C. Su, and C.-C. J. Kuo, “Robust Audio Watermarking for Copyright Protection,” SPIE's 44th Annual Meeting Advanced Signal Processing Algorithms, Architectures, and Implementations, July 1999.

[66] D. Kirovski, and H. S. Malvar, “Spread Spectrum watermarking of Audio Signals,” IEEE Trans. Signal Proc. Vol. 51, no. 4, pp. 1020-1033, April, 2003.

[67] R. A. Garcia, “Digital Watermarking of Audio Signals using Psychoacoustic Auditory Model and Spread Spectrum Theory,” 107th Convention, AES, New York, September, 1999.

[68] H. Malik, A. Khokhar, and R. Ansari, “Robust Audio Watermarking using Frequency Selective Spread Spectrum Theory,” accepted for Proc. ICASSP’04, Montreal, Quebec, Canada, May 17-21 2004.

Informed Embedding:

[69] M.L. Miller, I.J. Cox, and J.A. Bloom, “Informed Embedding: Exploiting Image and Detector Information During Watermark Insertion,” Proc. IEEE Inter. Conf. on Image Processing - ICIP (2000).

[70] J. K. Su, J. J. Eggers, and B. Girod, “Illustration of the Duality Between Channel Coding and Rate Distortion with Side Information,” Proc. 2000 Asilomar Conference on Signals and Systems, Pacific Grove, CA, USA, Oct 2000.

[71] I.J. Cox, M.L. Miller, and A.L. McKellips, “Watermarking as Communications with Side Information,” Proc. of IEEE, 87(7), pp. 1127-1141, (1999).

[72] M. Ramkumar, and A.N. Akansu, “FFT-based Signaling for Multimedia Steganography,” IEEE ICASSP’00, pp 1979-1982, Istanbul, Turkey, June 2000.

[73] M. Ramkumar, and A.N. Akansu, “Self-Noise Suppression Schemes for Blind Image Steganography,” Proc. SPIE, Vol 3845, pp 55-65, Boston, MA, Sep 99.

[74] M. Kesal, M. K. Mihcak, R. Koetter, and P. Moulin, “Iteratively Decodable Codes for Watermarking Applications,” Proc. 2nd Inter. Symp. on Turbo Codes and Related Topics, Brest, France, Sep. 2000.

[75] R. Ansari, H. Malik, and A. Khokhar, “Data-Hiding in Audio using Frequency-Selective Phase Alteration,” accepted for Proc. ICASSP’04, Montreal, Quebec, Canada, May 17-21 2004.

[76] H. Malik, A. Khokhar, and R. Ansari, “Robust Data Hiding in Audio,” submitted to ICME’04, Taipei, Taiwan, June27-30, 2004.

Quantization Based Embedding:

[77] F. Pérez-González, and F. Balado, “Quantized projection data hiding,” In Proc. of IEEE Inter. Conf. on Image Processing, Rochester, NY, USA, Sep 2002.

[78] F. Pérez-González and F. Balado, “Improving data hiding performance by using quantization in a projected domain,” In Proc. of IEEE Inter. Conf. on Multimedia and Expo, Lausanne, Switzerland, Aug 2002.

[79] F. Pérez-González, P. Comesaña, and F. Balado, “Dither-Modulation Data hiding with distortion-compensation: exact performance analysis and an improved detector for JPEG attacks,” In Inter. Conf. on Image Processing, Barcelona, Spain, Sep 2003.

[80] R. J. Barron, B. Chen, and G. W. Wornell, “The duality between information embedding and source coding with side information and some applications,” IEEE Trans. on Information Theory, vol. 49, no. 5, pp. 1159-1180, May 2003.


[81] B. Chen and G. W. Wornell, “Quantization index modulation: A class of provably good methods for digital watermarking and information embedding,” IEEE Trans. on Information Theory, vol. 47(4), pp. 1423-1443, May 2001.

[82] B. Chen and G. W. Wornell, “Quantization index modulation methods for digital watermarking and information embedding of multimedia,” Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, Vol. 27(1-2), pp. 7-33, Feb 2001.

[83] B. Chen and G. W. Wornell, “Preprocessed and postprocessed quantization index modulation methods for digital watermarking,” Proc. of SPIE vol. 3971, San Jose, CA, pp. 48-59, Jan 2000.

[84] B. Chen and G. W. Wornell, “Dither modulation: A new approach to digital watermarking and information embedding,” Proc. of SPIE, vol. 3657, San Jose, CA, pp. 342-353, Jan. 1999.

[85] B. Chen and G. W. Wornell, “Digital watermarking and information embedding using dither modulation,” Proc. of 1998 IEEE Second Workshop on Multimedia Signal Processing (MMSP-98) , Redondo Beach, CA, pp. 273-278, Dec 1998.

[86] J. J. Eggers and B. Girod, “Quantization Watermarking,” Proc. Security and Watermarking of Multimedia Contents, Electronic Imaging 2000, San Jose, CA, USA, Jan. 2000.

[87] M. Wu, “Joint Security and Robustness Enhancement for Quantization Based Embedding,” IEEE Trans. on Circuits and Systems for Video Technology, May 2003.

[88] F. Balado, and F. Pérez-González. “Hexagonal quantizers are not optimal for 2-D data hiding”, Proc. SPIE, Vol. 5020, Santa Clara, USA, January 2003.

[89] P. Comesaña, F. Pérez-González, and F. Balado, “Optimal Strategies for Spread-Spectrum and Quantized-Projection image data hiding games with BER Payoffs,” Proc. ICIP’03, Barcelona, Spain, Sep 2003.

Fingerprinting:

[90] W. Trappe, M. Wu, Z. Wang, K.J.R. Liu, “Anti-collusion Fingerprinting for Multimedia,” IEEE Trans. on Signal Processing, Vol. 51(4), pp. 1069-1087, April 2003.

[91] H. Zhao, M. Wu, Z.J. Wang, and K.J.R. Liu, “Performance of Detection Statistics Under Collusion Attacks on Independent Multimedia Fingerprints,” Proc. ICME'03, Baltimore, MD, July 2003.

[92] Z.J. Wang, M. Wu, W. Trappe, and K.J.R. Liu, “Anti-Collusion of Group-Oriented Fingerprinting,” Proc. ICME'03, Baltimore, MD, July 2003.

[93] H. Zhao, M. Wu, Z.J. Wang, and K.J.R. Liu, “Nonlinear Collusion Attacks On Independent Fingerprints For Multimedia,” Proc. ICASSP'03, Hong Kong, April 2003.

[94] Z.J. Wang, M. Wu, H. Zhao, W. Trappe, and K.J.R. Liu, “Resistance of Orthogonal Gaussian Fingerprints to Collusion Attacks,” Proc. ICASSP'03, Hong Kong, April 2003.

[95] W. Trappe, M. Wu, and K.J.R. Liu, “Anti-Collusion Codes: Multi-User and Multimedia Perspectives,” Proc. ICIP'02, Rochester, NY, Sept. 2002.

[96] W. Trappe, M. Wu, and K.J.R. Liu, “Joint Coding and Embedding for Collusion-Resistant Fingerprinting,” EUSIPCO 2002, Sept. 2002.

[97] W. Trappe, M. Wu, and K.J.R. Liu, “Collusion-Resistant Fingerprinting for Multimedia,” Proc.ICASSP'02, Orlando, FL, May 2002.

[98] A. Briassouli, and P. Moulin, “Detection-Theoretic Analysis of Warping Attacks in Spread-Spectrum Watermarking,” Proc. ICASSP’03, Hong Kong, April 2003.


[99] P. Moulin, A. Briassouli, and H. Malvar, “Detection-Theoretic Analysis of Desynchronization Attacks in Watermarking,” Proc. DSP'02, Santorini, Greece, July 2002.

Authentication:

[100] J. J. Eggers and B. Girod, “Blind Watermarking Applied to Image Authentication,” Proc. ICASSP’01, Vol. 3, pp. 1977-1980, Salt Lake City, UT, May 2001.

[101] M. Wu, and B. Liu, “Watermarking for Image Authentication,” Proc. ICIP'98, Chicago, IL, 1998.

[102] E. T. Lin, C. I. Podilchuk, and E. J. Delp, “Detection of Image Alterations Using Semi-Fragile Watermarks”, Proc. SPIE, Vol. 3971, San Jose, CA, Jan 2000.

[103] D. Kundur, and D. Hatzinakos, “Digital Watermarking for Telltale Tamper-Proofing and Authentication”, Proc. IEEE, Vol. 87(7), pp. 1167-1180, July 1999.

Attacks, Performance Evaluation, & Benchmarks:

[104] J. K. Su, J. J. Eggers, and B. Girod, “Analysis of Digital Watermarks Subjected to Optimum Linear Filtering and Additive Noise,” Signal Processing, vol. 81(6), June 2001.

[105] J. J. Eggers and B. Girod, “Quantization Effects on Digital Watermarks,” Signal Processing, vol. 81(2), pp. 239-263, February 2001.

[106] J. J. Eggers, R. Bäuml, and B. Girod, “Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise,” Proc. 4th Intl. ITG Conference on Source and Channel Coding, Berlin, Germany, Jan. 2001.

[107] J. K. Su, J. J. Eggers, and B. Girod, “Capacity of Digital Watermarks Subjected to an Optimal Collusion Attack,” X. European Signal Processing Conference EUSIPCO-2000, Tampere, Finland, Sept 2000.

[108] J. Su and B. Girod, “Fundamental Performance Limits of Power-Spectrum Condition-Compliant Watermarks,” Proc. Security and Watermarking of Multimedia Contents, Electronic Imaging 2000, San Jose, CA, Jan. 2000.

[109] J. K. Su, J. J. Eggers, and B. Girod, “Optimum Attack on Digital Watermarks and its Defense,” Proc. 2000 Asilomar Conference on Signals and Systems, Pacific Grove, CA, Oct 2000.

[110] F. Hartung, J. Su, B. Girod, “Spread Spectrum Watermarking: Malicious Attacks and Counter-Attacks,” Proc. SPIE, Vol. 3657, pp. 147-158, San Jose, CA, Jan 1999.

[111] J. Su, F. Hartung, B. Girod, “Channel Model for a Watermark Attack,” Proc. SPIE, Vol. 3657, pp. 159-170, San Jose, CA, Jan 1999.

[112] J. J. Eggers and B. Girod, “Watermark Detection after Quantization Attacks,” Proc. Workshop on Information Hiding, Dresden, Germany, Sept./Oct. 1999.

[113] M. Wu, and B. Liu, “Attacks on Digital Watermarks,” 33th Asilomar Conference on Signals, Systems, and Computers, 1999.

[114] J. A. O'Sullivan and P. Moulin, “Some Properties of Optimal Information Hiding and Information Attacks,” Proc. 39th Allerton conference , Monticello, IL, Oct 2001.

[115] F. Pérez-González, F. Balado, and J. R. Hernández, “Performance analysis of existing and new methods for data hiding with known-host information in additive channels,” IEEE Trans. on Signal Processing, 51(4):960-980, April 2003.

[116] D. Kirovski, and F. A. P. Petitcolas, “Blind pattern matching attack on watermarking systems”, IEEE Tran. Signal processing, vol. 51(4), pp. 1045–1053, April 2003.

[117] F. A. P. Petitcolas, “Watermarking schemes evaluation”, IEEE Signal Processing, vol. 17(5), pp. 58–64, Sep 2000.


[118] M. Kutter and F. A. P. Petitcolas, “Fair evaluation methods for image watermarking systems”, Journal of Electronic Imaging, vol. 9, no. 4, pp. 445–455, Oct. 2000.

[119] Darko Kirovski & Fabien A. P. Petitcolas. “Replacement attack on arbitrary watermarking systems” Proc. ACM CCS-9 workshop, DRM 2002, digital rights management, Lecture notes in computer science, Vol. 2696, Washington, D.C., Nov. 2002.

[120] F. A. P. Petitcolas & D. Kirovski, “Blind pattern matching attack on audio watermarking systems”, Proc. ICASSP 2002, Orlando, Florida, May 2002.

[121] S. Katzenbeisser, and F. A. P. Petitcolas, “Defining security in steganographic systems” Proc. SPIE, Vol. 4675, pp. 50–56, San Jose, CA, Jan 2002.

[122] M. Steinebach, A. Lang, J. Dittmann, and F. A. P. Petitcolas. “StirMark Benchmark: audio watermarking attacks based on lossy compression” Proc. SPIE, Vol. 4675, pp. 79–90, San Jose, CA, Jan 2002.

[123] F. A. P. Petitcolas, R. J. Anderson, and M. G. Kuhn, “Attacks on copyright marking systems”, Proc. 2nd workshop on information hiding, in the Lecture Notes in Computer Science Vol. 1525, pp 218–238, Portland, OR, April 1998.

[124] D. Kundur, “Improved Digital Watermarking through Diversity and Attack Characterization”, Proc. Workshop on Multimedia Security at ACM Multimedia ‘99, pp. 53-58, Orlando, Florida, Oct 1999.

[125] D. Kundur, and D. Hatzinakos, “Attack Characterization for Effective Watermarking," Proc. ICIP’99, pp. 240-244, Oct 1999.

[126] D. Kundur, and D. Hatzinakos, “Improved Robust Watermarking through Attack Characterization”, Optics Express focus issue on Digital Watermarking, vol. 3(12), pp. 485-490, Dec 1998.

Surveys and Tutorials:

[127] M. Wu and B. Liu, “Data Hiding in Image and Video: Part-I -- Fundamental Issues and Solutions,” IEEE Trans. on Image Proc., Vol. 12(6), pp. 685-695, June 2003.

[128] M. Wu, H. Yu, and B. Liu, “Data Hiding in Image and Video: Part-II -- Designs and Applications,” IEEE Trans. on Image Proc., Vol. 12(6), pp. 696-705, June 2003.

[129] I.J. Cox, M.L. Miller, and J.A. Bloom, “Watermarking Applications and Their Properties,” Proc. of Int. Conf. on Information Technology: Coding and Computing - ITCC2000, pp. 6-10, 2000.

[130] I.J. Cox, M.L. Miller, J. M. G. Linnartz, and T. Kalker, “A Review of Watermarking Principles and Practices,” DSP for Multimedia Systems, K. K. Parhi, T. Nishitani (eds.), Marcel Dekker, Inc., NY, pp. 461-485, 1999.

[131] F. A. P. Petitcolas, R. J. Anderson, and M. G. Kuhn, “Information hiding – a survey”, Proc. IEEE, 87(7):1062–1078, July 1999.

[132] R. J. Anderson, and F. A. P. Petitcolas. “On the limits of steganography”, IEEE Journal of Selected Areas in Communications, 16(4):474-481, May 1998.

[133] M. Eskicioglu, and E. J. Delp, “An Overview of Multimedia Content Protection in Consumer Electronics Devices”, Signal Processing: Image Communication, Vol. 16, pp. 681-699, 2000.

[134] E. T. Lin, and E. J. Delp, “A Review of Fragile Image Watermarks”, Proc. Multimedia and Security Workshop (ACM Multimedia '99) Multimedia Contents, pp. 25-29, Oct 1999.

[135] A. Sequeira, and D. Kundur, “Communication and Information Theory in Watermarking: A Survey”, Proc. SPIE Vol. 4518, pp. 216-227, Denver, Colorado, August 2001.

[136] W. Bender, D. Gruhl, N. Morimoto, and A. Lu, “Techniques for data hiding,” IBM Systems Journal, vol. 35, no. 3/4, 1996.

[137] C. I. Podilchuk and E. J. Delp, “Digital watermarking algorithms and applications,” IEEE Signal Processing Magazine, pp. 33-45 July, 2001.

[138] F. Pérez-González, and J. R. Hernández, “A tutorial on digital watermarking,” Proc. of 33rd IEEE Ann. Carnahan Conf. on Security Technology, Madrid, Spain, Oct 1999.

[139] J. R. Hernández and F. Pérez-González, “Statistical analysis of watermarking schemes for copyright protection of images,” Proc. of IEEE, 87(7):1142-1166, July 1999.

[140] J. R. Hernández, F. Pérez-González, J. M. Rodríguez, and G. Nieto, “Performance analysis of a 2D-multipulse amplitude modulation scheme for data hiding and watermarking of still images,” IEEE J. Select. Areas Comm., 16(4):510-524, May 1998.

[141] C. I. Podilchuk, and W. Zeng, “Image Adaptive Watermarking using visual models,” IEEE J. on Selected Areas in Comm., 16(4):525-539, May 1998.

Miscellaneous:

[142] J. J. Eggers, J. K. Su, and B. Girod, “Public Key Watermarking Using Linear Transforms,” X. European Signal Proc. Conf. EUSIPCO’00, Tampere, Finland, Sept 2000.

[143] C-Y. Lin, M. Wu, J.A. Bloom, M.L. Miller, I.J. Cox, and Y-M. Lui, “Rotation, Scale, and Translation Resilient Public Watermarking for Images,” IEEE Trans. on Image Processing, vol. 10, no. 5, pp. 767-782, May 2001.

[144] C-Y. Lin, M. Wu, J.A. Bloom, M.L. Miller, I.J. Cox, and Y-M. Lui, “Rotation, Scale, and Translation Resilient Public Watermarking for Images,” Proc. SPIE, Vol. 3971, San Jose, CA, Jan 2000.

[145] M. Wu, and B. Liu, “Digital Watermarking Using Shuffling,” Proc. ICIP'99, Kobe, Japan, 1999.

[146] F. Pérez-González, J. R. Hernández, and F. Balado, “Approaching the capacity limit in image watermarking: A perspective on coding techniques for data hiding applications,” Signal Processing, Elsevier, 81(6):1215-1238, June 2001.

[147] J. R. Hernández, M. Amado, and F. Pérez-González, “DCT-domain watermarking techniques for still images: Detector performance analysis and a new structure,” IEEE Trans. on Image Processing, 9(1):55-68, January 2000.

[148] J. R. Hernández, J. M. Rodríguez, and F. Pérez-González, “Improving the performance of spatial watermarking of images using channel coding,” Signal Processing, Elsevier, 80:1261-1279, July 2000.

[149] J. R. Hernández, F. Pérez-González, and M. Amado, “Improving DCT-domain watermark extraction using generalized gaussian models,” In Proc. of the COST #254 Int. Workshop on Intelligent Communications and Multimedia Terminals, pp. 23-26, Ljubljana, Slovenia, November 1998.

[150] J. A. Bloom, I. J. Cox, T. Kalker, J-P Linnartz, M. L. Miller, and B. Traw, “Copy Protection for DVD Video”, Proc. of IEEE, 87 (7), pp 1267-1276, 1999.

[151] M. Peinado, F. A. P. Petitcolas, and D. Kirovski, “Digital rights management for digital cinema”, Multimedia Systems Journal, vol. 9, no. 3, pp 228–238, 2003.

[152] D. Kundur, “Implications for High Capacity Data Hiding in the Presence of Lossy Compression”, Proc. IEEE Int. Conf. On Information Technology: Coding and Computing, pp. 16-21, Las Vegas, Nevada, March 2000.

[153] D. Kundur and D. Hatzinakos, “A Robust Digital Image Watermarking Scheme using Wavelet-Based fusion”, Proc. ICIP’97, pp. 544-547, Santa Barbara, CA, Oct 1997.

[154] D. A. Nelson, and R.C. Bilger, “Pure-Tone Octave Masking in Normal-Hearing Listeners,” J. of Speech and Hearing Research, Vol. 17 No. 2, June 1974.

[155] K. Karthik, D. Kundur and D. Hatzinakos, “Joint Fingerprinting and Decryption for Multimedia Content Tracing in Wireless Networks,” Proc. SPIE, vol. 5403, Orlando, Florida, April 2004.

[156] D. Kundur and K. Ahsan, “Practical Internet Steganography: Data Hiding in IP,” Proc. Texas Workshop on Security of Information Systems, College Station, Texas, April 2003.

[157] K. Ahsan and D. Kundur, “Practical Data Hiding in TCP/IP,” Proc. Workshop on Multimedia Security at ACM Multimedia '02, French Riviera, Dec 2002.

[158] B. Chen and C.-E. W. Sundberg, “Digital audio broadcasting in the FM band by means of contiguous band insertion and precancelling techniques,” IEEE Trans. on Communications, vol. 48, no. 10, pp. 1634-1637, Oct. 2000.

[159] Secure Digital Music Initiative (SDMI), http://www.smdi.org.

[160] D. Gruhl, A. Lu, and W. Bender, “Echo Hiding,” Proc. 1st Workshop on Information Hiding, LNCS, Vol. 1174, pp. 295-351, Cambridge, UK, May/April 1996.

[161] F. Hartung, P. Eisert, and B. Girod, “Digital Watermarking of MPEG-4 Facial Animation Parameters,” Computers & Graphics, Vol. 22(4), pp. 425-435, August 1998.

[162] F. Hartung and B. Girod, “Watermarking of Uncompressed and Compressed Video,” Signal Processing, Vol. 66(3), pp. 283-302, May 1998.

[163] F. Hartung, P. Eisert, and B. Girod, “Digital Watermarking of MPEG-4 Facial Animation Parameters,” Computers & Graphics, Vol. 22(4), pp. 425-435, August 1998.

[164] F. Hartung and B. Girod, “Digital Watermarking of Raw and Compressed Video,” Proc. European EOS/SPIE Symp. Adv. Image & Network Tech., Berlin, Germany, Oct 1996.

[165] P. Noll, “MPEG Digital Audio Coding,” IEEE Sig. Proc. Mag., vol. 14(5), pp. 59-81, Sep 1997.

[166] S. Mallat, “Multifrequency Channel Decomposition of Images and Wavelet Models,” IEEE Trans. Acoust., Speech, Signal Processing, Vol. 37(12), pp. 2091–2110, Dec 1989.

[167] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, “Image Coding using Wavelet Transform,” IEEE Trans. Image Processing, Vol. 1(2), pp. 205–220, April 1992.

[168] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, “Handbook of Applied Cryptography,” CRC Press, 5th print, Aug 2001.

[169] “Digital Rights Management and Privacy,” Electronic Privacy Information Center, http://www.epic.org/privacy/drm/.

[170] M. Karagosian, “Digital Rights Management: Friend or Foe?,” http://www.mkpe.com/articles/2001/DRM_2001/drm_2001.htm.

[171] “Are Music Companies Blinded by Fright?,” http://www.businessweek.com/1999/99_26/b3635140.htm?scriptFramed.

[172] http://history.acusd.edu/gen/recording/digital.html

[173] http://www.kirtland.cc.mi.us/honors/digrev.htm

Appendix: A

Notations

Some preliminary notational conventions are defined here. The notation used for each type of variable is as follows:

o Scalar: Uppercase or lowercase italic letters in Arial font represent scalar values and individual members of sets, e.g. N, x, r. The magnitude of a scalar value n is denoted |n|.

o Sets: Sets are represented in calligraphic font; for example, the set of real numbers is R and the set of messages is M. The cardinality of a set M is denoted |M|.

o Vectors: n-dimensional vectors (where n is a positive integer, $n \in \mathbb{Z}^{+}$) are represented as boldface lowercase italic letters in Arial font: c, r, and w. Indices into these vectors are specified in square brackets. For example, the pixel of an image c at location (i, j) is specified by c[i,j]. Vectors in a transformed domain, e.g. the discrete cosine transform or the discrete wavelet transform, are represented by boldface uppercase letters; for example, the DCT of vector c is C.

Subscripts are used to indicate different versions of the same vector; for example, $c_o$ indicates the vector of the original host data, and the watermarked copy of this host media is $c_w$.

The Euclidean norm of a vector c, that is, $\|c\|_2$, is denoted |c|. The sample mean and sample variance of a vector c are denoted $\bar{c}$ and $s_c^2$ respectively.
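The vector quantities above can be illustrated with a minimal Python sketch. The function names, the sample values, and the additive embedding used to form the watermarked vector are hypothetical and chosen only for illustration; the appendix does not state whether the sample variance divides by n or n - 1, so the n - 1 (unbiased) form is assumed here.

```python
import math

def euclidean_norm(c):
    """Return |c|, the Euclidean (L2) norm of the vector c."""
    return math.sqrt(sum(v * v for v in c))

def sample_mean(c):
    """Return the sample mean of the vector c."""
    return sum(c) / len(c)

def sample_variance(c):
    """Return the sample variance of c (assumption: n - 1 denominator)."""
    m = sample_mean(c)
    return sum((v - m) ** 2 for v in c) / (len(c) - 1)

# Hypothetical host vector c_o and watermark w (illustrative values only).
c_o = [3.0, 4.0, 0.0, 5.0]
w = [0.5, -0.5, 0.5, -0.5]

# Watermarked copy c_w; simple additive embedding is an assumption made
# for this example, not a method taken from the proposal.
c_w = [a + b for a, b in zip(c_o, w)]

print("norm:", euclidean_norm(c_o))
print("sample mean:", sample_mean(c_o))
print("sample variance:", sample_variance(c_o))
print("c_w:", c_w)
```

Indexing follows the bracket convention of the text: `c_o[0]` is the first element of the host vector, and a two-dimensional image stored as a nested list would be read as `c[i][j]`.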

o Random Scalar Variables: Random scalar variables are indicated by italic letters in Times New Roman font: x, r, and y. Each random variable is associated with a probability distribution or density function; the probability that the value x will be drawn from the distribution of x is written $f_x(x)$. The statistical mean and variance (or expected value and second central moment) of a random variable x are indicated by $\mu_x$ and $\sigma_x^2$ respectively.

o Random Vectors: Random vectors are represented by boldface letters in the same font as random scalars, e.g. x, r, and y. Similarly, the probability distribution function associated with a random vector x is written $f_{\mathbf{x}}(\mathbf{x})$. The statistical mean and variance of a random vector x are represented by $\mu_{\mathbf{x}}$ and $\sigma_{\mathbf{x}}^2$ respectively.
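
For reference, the statistical mean and variance named in the last two items are usually defined as below. This is a sketch based on the standard textbook definitions for a continuous density, which the appendix does not spell out. Treating the variance of a random vector component by component, rather than as a full covariance matrix, is an assumption.

```latex
\begin{align}
\mu_x &= E[x] = \int_{-\infty}^{\infty} x \, f_x(x)\, dx, \\
\sigma_x^2 &= E\big[(x - \mu_x)^2\big]
            = \int_{-\infty}^{\infty} (x - \mu_x)^2 \, f_x(x)\, dx, \\
\mu_{\mathbf{x}}[i] &= E\big[\mathbf{x}[i]\big], \qquad
\sigma_{\mathbf{x}}^2[i] = E\big[(\mathbf{x}[i] - \mu_{\mathbf{x}}[i])^2\big],
\quad i = 1, \dots, n.
\end{align}
```

These are properties of the underlying distribution, whereas the sample mean and sample variance defined for vectors are computed from one observed realization.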
