Post on 09-Feb-2023
Efficient Data Hiding Techniques for Digital Rights Management of Multimedia Archives
BY Hafiz Muhammad Aslam Malik
B.S. (University of Engineering and Technology Lahore, Pakistan) 1999
Preliminary Proposal
Submitted in partial fulfillment of the requirements for the degree of Ph.D.
in the Graduate College of the University of Illinois at Chicago, 2004
Chicago, Illinois
TABLE OF CONTENTS
CHAPTER 1 INTRODUCTION
    MOTIVATION:
    PROBLEM STATEMENT:
CHAPTER 2 RELATED WORK
    2.1 DATA HIDING SYSTEMS: APPLICATIONS AND REQUIREMENTS
        2.1.1 REQUIREMENTS OF A DATA HIDING SYSTEM:
            I. Robustness:
            II. Effectiveness:
            III. Fidelity:
            IV. Capacity:
            V. Blind or Informed Detection:
            VI. False Positive Rate:
            VII. Multiple Watermarks Capability:
            VIII. Cost:
        2.1.2 APPLICATIONS OF DATA HIDING FOR DIGITAL RIGHTS MANAGEMENT:
            I. Ownership Protection:
            II. Content Authentication:
            III. Fingerprinting:
            IV. Copy Protection:
            V. Broadcast Monitoring:
    2.2 CLASSIFICATION OF DATA HIDING TECHNIQUES
        2.2.1 CLASSIFICATION BASED ON HOST MEDIA TYPE
            I. Data Hiding in Images
            II. Data Hiding in Video
            III. Data Hiding in Audio
            IV. Data Hiding in Text
        2.2.2 CLASSIFICATION BASED ON DATA HIDING APPLICATIONS
            I. Robust Data Hiding
            II. Fragile Data Hiding
            III. Semi-Fragile Data Hiding
        2.2.3 CLASSIFICATION BASED ON PERCEPTIBILITY
            I. Imperceptible Data Embedding
            II. Visible Data Embedding
        2.2.4 CLASSIFICATION BASED ON DATA EMBEDDING DOMAIN
            I. Data Hiding in Spatial/Time Domain (Direct Domain)
            II. Data Hiding in Transformed Domain
        2.2.5 CLASSIFICATION BASED ON DATA EMBEDDING METHOD
            I. Additive Spread Spectrum or Host-Interference-Non-rejecting Methods
            II. Host Interference Rejecting Methods
        2.2.6 CLASSIFICATION BASED ON DATA EXTRACTION METHOD
            I. Private or Informed Data Hiding
            II. Semi-Private Data Hiding
            III. Public or Blind Data Hiding
    3.3 DIGITAL RIGHTS MANAGEMENT: A BRIEF OVERVIEW
    CHAPTER SUMMARY
CHAPTER 3 DATA HIDING MODELS
    3.1 NOTATION
    3.2 TRANSMISSION CHANNELS
        3.2.1 BOUNDED DISTORTION CHANNELS
        3.2.2 BOUNDED HOST-DISTORTION CHANNELS
        3.2.3 ADDITIVE NOISE CHANNELS
    3.3 DATA HIDING IN COMMUNICATION FRAMEWORK
        3.3.1 CLASSICAL MODEL OF COMMUNICATIONS SYSTEM
        3.3.2 SECURE TRANSMISSION
        3.3.3 DATA HIDING MODEL BASED ON COMMUNICATION
    3.4 DATA HIDING AS COMMUNICATION WITH SIDE INFORMATION AT THE TRANSMITTER
    3.5 GEOMETRIC MODEL OF DATA HIDING
    CHAPTER SUMMARY
CHAPTER 4 BLIND DATA EMBEDDING
    4.1 DATA HIDING BASED ON ADDITIVE EMBEDDING
    4.2 WORK IN PROGRESS: ROBUST AND HIGH RATE DATA EMBEDDING
        4.2.1 DATA HIDING USING FREQUENCY SELECTIVE BASED SPREAD SPECTRUM
            4.2.1.1 WATERMARKING USING PERCEPTUAL AUDITORY MODEL
            4.2.1.2 SALIENT POINT EXTRACTION
            4.2.1.3 WATERMARK EMBEDDING
                4.2.1.3.1 Watermark Generation
                4.2.1.3.2 Watermark Embedding
            4.2.1.4 WATERMARK DETECTION
            4.2.1.5 EXPERIMENTAL RESULTS
    4.3 FUTURE DIRECTIONS
        4.3.1 PROPOSED DATA HIDING SCHEME FOR IMAGES
        4.3.2 PROPOSED DATA HIDING SCHEME FOR VIDEO
CHAPTER 5 INFORMED DATA EMBEDDING
    5.1 INFORMED EMBEDDING
        5.1.1 COSTA’S WORK
    5.2 QUANTIZATION INDEX MODULATION (QIM)
        5.2.1 BINARY DITHER MODULATION
    5.3 WORK IN PROGRESS: HIGH RATE DATA EMBEDDING USING INFORMED ENCODING
        5.3.1 DATA HIDING USING FREQUENCY SELECTIVE DITHERING (OUR CONTRIBUTION)
            5.3.1.1 FIR APPROXIMATION OF APF
            5.3.1.2 DATA EMBEDDING
            5.3.1.3 DATA DETECTION USING SIGNAL MODELING
                5.3.1.3.1 Spectrum Estimation
                5.3.1.3.2 Allpass Filter Parameter Estimation
                5.3.1.3.3 Simulation Results
            5.3.1.4 DATA DETECTION USING MATCHED FILTER
                5.3.1.4.1 Simulation Results
    5.4 FUTURE DIRECTION
        5.4.1 EXTENSION AUDIO FINGERPRINTING AND AUTHENTICATION
CHAPTER 6 CONCLUSION & FUTURE DIRECTIONS
REFERENCES:
TABLE OF FIGURES
FIGURE 2.1: GENERAL CLASSIFICATION OF DATA HIDING
FIGURE 2.2: ANATOMY OF A DRM TRANSACTION
FIGURE 3.1: STANDARD COMMUNICATION MODEL
FIGURE 3.2: STANDARD SECURE COMMUNICATION MODEL
FIGURE 3.3: GENERAL MODEL FOR DATA HIDING
FIGURE 3.4: DATA HIDING SYSTEM WITH INFORMED DETECTOR ANALOGOUS TO STANDARD SECURE COMMUNICATION MODEL
FIGURE 3.5: DATA HIDING SYSTEM WITH BLIND DETECTOR ANALOGOUS TO STANDARD SECURE COMMUNICATION MODEL
FIGURE 3.6: DATA HIDING AS COMMUNICATION WITH SIDE INFORMATION AT THE ENCODER
FIGURE 4.1: 5-LEVEL MODIFIED DISCRETE WAVELET ANALYSIS FILTER BANK
FIGURE 4.2: BLOCK DIAGRAM OF WATERMARK EMBEDDING PROCESS
FIGURE 4.3: NORMALIZED CORRELATION FOR WATERMARKED SUBBAND (LEFT) AND UNWATERMARKED SUBBAND (RIGHT)
FIGURE 4.4: BLOCK DIAGRAM FOR WATERMARK DETECTION
FIGURE 4.5: DPM FOR DIFFERENT VALUES OF NOISE POWER (PN)
FIGURE 5.1: INFORMED DATA EMBEDDING FOLLOWED BY AWGN ATTACK
FIGURE 5.2: BINARY DITHERED MODULATION SCHEME BASED ON DITHERED UNIFORM SCALAR QUANTIZATION
FIGURE 5.3: MAGNITUDE RESPONSE OF APF H(EJW) APPROXIMATION FOR DIFFERENT VALUES OF LENGTH (L)
FIGURE 5.4: POLE-ZERO LAYOUT OF HAPI(Z) FOR BINARY ENCODING
FIGURE 5.5: POLE-ZERO LAYOUT OF HAPI(Z) FOR 4-ARY ENCODING
FIGURE 5.6: BLOCK DIAGRAM OF THE DATA EMBEDDING SCHEME
FIGURE 5.7: BLOCK DIAGRAM OF THE DATA DETECTION PROCESS
FIGURE 5.8: PROBABILITY OF ERROR (PE) VS. SNR PLOT FOR BOTH ENCODING SCHEMES
FIGURE 5.9: MAGNITUDE SPECTRUM OF CZT OF THE SUBBAND SEQUENCE X4,6(N) BEFORE AND AFTER PASSING THROUGH H0(Z I), I.E. Y4,6(N), AT R = 0.9 (LEFT) AND AT R = 1/0.9 (RIGHT)
FIGURE 5.10: BLOCK DIAGRAM OF THE DATA DETECTION USING MATCHED FILTER
FIGURE 5.11: PROBABILITY OF ERROR FOR DIFFERENT SNR VALUES
CHAPTER 1
Introduction
The revolution in digital information has visibly changed our society and everyday life
[171, 172]. Some of the benefits of this digital revolution include the evolution of the
Internet into a global village, the availability of low-cost, large-capacity storage devices,
the deployment of seamless long-distance networks at Gbps (gigabits per second) data rates, and
the popular use of state-of-the-art multimedia production equipment (such as palmtops, digital
cameras, camcorders, high-end scanners and printers, and digital audio recorders). Furthermore,
developments in digital media production, manipulation, and distribution have added new
dimensions to the technical challenges of digital data security and integrity. Along with
their countless advantages, the cutting-edge technologies of this digital information
revolution have raised serious concerns about digital content protection, ownership
protection, unauthorized copy prevention, etc. Today’s entertainment industry (music and film)
alone claims a multimillion-dollar annual revenue loss due to piracy [171], a loss that is
likely to increase in the coming years with the fast-growing trend of exchanging digital media
(music, images, movies, software, e-books, etc.) over peer-to-peer networks. There is an urgent
need for robust technologies to support digital rights management (DRM) systems capable of
providing diverse services such as secure media streaming between user and content server,
ownership protection, prevention of unauthorized copying and unauthorized content usage,
content authentication, and content usage tracing.
Generally, digital rights management (DRM) systems consist of a set of rights models (business
models) and technologies to support the above-mentioned services. The research proposed in
this dissertation, however, deals only with some of the technological issues of DRM systems.
These technological issues determine the reliability of a DRM system.
Most existing DRM systems use traditional content protection schemes such as encryption for
digital content protection, secure content delivery, and usage tracking [170]. However,
encryption and scrambling alone cannot adequately protect ownership rights or prevent
unauthorized content usage, unauthorized copying, etc. Encrypted or scrambled data remain
protected as long as the decryption or unscrambling key is unknown, but once the data are
decrypted or unscrambled there is no way to stop their reproduction or sharing [168]. Thus
there is a strong need to complement cryptography. Data hiding and watermarking (a special case
of data hiding) are promising technologies for addressing the shortcomings of traditional
content-protection technologies.
In general, information hiding or data hiding means imperceptibly embedding information
(a message or metadata) into a host signal (images, video, audio, text, etc.) for a variety of
applications such as secret communication or steganography, content protection, ownership
protection, illegal copy prevention, etc. Salient characteristics of any data hiding scheme
include embedding capacity or payload, minimal embedding distortion, robustness to attacks, a
low false positive rate, a low error probability of the received data, etc. Among these,
embedding capacity, embedding distortion or fidelity, and robustness are three interdependent
features that are also used to evaluate the performance of data embedding schemes. Embedding
capacity refers to the amount of data that can be embedded in a given multimedia clip.
Embedding distortion or fidelity measures the perceptibility of the embedded information.
Robustness refers to the capability of a data hiding scheme to withstand intentional and
unintentional attacks. Here, intentional attacks include filtering, chopping, scaling,
Gaussian or uniform noise addition, resampling, etc., whereas lossy compression,
digital-to-analog conversion, and requantization are generally treated as unintentional
attacks.
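The interplay between capacity, distortion, and robustness can be illustrated with a toy
additive spread-spectrum embedder. The sketch below is illustrative only and is not the scheme
proposed in this dissertation: the function names, the key-seeded ±1 pseudo-noise carrier, the
synthetic Gaussian host, and all parameter values are assumptions. It hides one bit in n host
samples (a capacity of 1/n bits per sample), the embedding distortion is alpha squared in the
mean squared error sense, and a larger alpha should lower the bit error rate under additive
Gaussian noise.

```python
import random


def embed_bit(host, bit, key, alpha):
    """Add a key-seeded +/-1 pseudo-noise carrier, scaled by alpha, to the host."""
    rng = random.Random(key)
    carrier = [rng.choice((-1.0, 1.0)) for _ in host]
    sign = 1.0 if bit else -1.0
    return [x + sign * alpha * c for x, c in zip(host, carrier)]


def detect_bit(received, key):
    """Blind detection: correlate with the regenerated carrier and take the sign."""
    rng = random.Random(key)
    carrier = [rng.choice((-1.0, 1.0)) for _ in received]
    corr = sum(y * c for y, c in zip(received, carrier))
    return corr > 0


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bit_error_rate(alpha, noise_std, n=1024, trials=100, seed=1):
    """Empirical error rate for a synthetic Gaussian host and an additive noise attack."""
    rng = random.Random(seed)
    errors = 0
    for t in range(trials):
        host = [rng.gauss(0.0, 1.0) for _ in range(n)]  # hypothetical host signal
        bit = rng.random() < 0.5
        marked = embed_bit(host, bit, key=t, alpha=alpha)
        attacked = [y + rng.gauss(0.0, noise_std) for y in marked]
        errors += detect_bit(attacked, key=t) != bit
    return errors / trials


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.2):
        print("alpha", alpha, "distortion", alpha ** 2,
              "bit error rate", bit_error_rate(alpha, noise_std=1.0))
```

In this toy setting the host itself acts as interference at the detector, which is why a weak
carrier fails even before any attack noise is added.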
Digital watermarking, a special case of data hiding, is the process of embedding information
into the host data (cover data) for content protection, integrity, and security. Robustness of
the embedded information against data hiding attacks is the most desirable feature of
watermarking schemes. A watermark is an imperceptible signal that carries information about
the data in which it is embedded, is inseparable from that data, and undergoes the same
transformations as the host data. These attributes distinguish watermarking from traditional
digital content protection techniques [7] such as cryptography and scrambling, and they make
watermarking an attractive tool for digital media protection, traitor tracing, content usage
monitoring, broadcast monitoring, and communication with side information to improve the
quality of service (QoS) of multimedia transmission over lossy channels.
The growing availability of digital information in different formats, together with its
increasing illegal sharing and distribution, has led to a proliferation of DRM technologies,
including data hiding schemes designed for applications such as copyright protection, media
authentication, and broadcast monitoring [5 – 33], steganographic techniques for covert
communication [8, 127 – 137], and fingerprinting methods for traitor-tracing applications
[90 – 99]. It has also renewed the interest of information theorists in the data hiding
problem, e.g. Moulin et al. [39 – 41, 47 – 54], Cox et al. [7, 59, 69, 71, 139, 130, 144, 145,
150], Chen et al. [80 – 86], Girod et al. [57, 58, 70, 104 – 112], Pérez-González et al. [74 –
76, 115, 135, 146 – 149], Cohen et al. [43 – 46, 55, 56], and others [6, 25 – 33]. Most of the
theoretical advances in the area of data hiding are attributed to the following classical
works:
“Writing on Dirty Paper” by M. Costa [37],
“Coding for Channel with Random Parameters” by Gel’fand and Pinsker [35],
“Channels with Side Information at the Transmitter” by Shannon [34].
Inspired by these papers, many researchers have modeled the data hiding problem using signal
processing, communication theory, coding theory, and information theory frameworks.
From an application perspective, most existing data hiding research [9 – 24] focuses on
digital image data. Relatively little attention has been given to data hiding in digital video
and audio data. Audio data models (perceptual as well as real data models) are quite different
from image and video data models. Therefore, a data hiding scheme that performs well for images
or video may not perform equally well for audio, and vice versa. In the following we outline
various shortcomings of the existing data hiding schemes. A more detailed analysis of the
related work will be provided in Chapter 3.
First of all, the host data is generally modeled as an independent and identically distributed
(i.i.d.) Gaussian random sequence, and the attack channel is modeled as an independent discrete
memoryless (DM) Gaussian channel [39 – 56]. These models do not agree with real host data
(audio, video, and images), because multimedia data does not, in general, follow an i.i.d.
Gaussian distribution [9]. Similarly, in practice the attacks of an active adversary depend on
the host data [104 – 126], especially when the adversary has knowledge of the host data.
Therefore, more realistic and appropriate models of the host data and the attack channel are
needed for the performance analysis of a given data hiding scheme.
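One simple way to see why the i.i.d. assumption is restrictive is to measure the correlation
between neighboring samples. The sketch below is only an illustration under stated
assumptions: it uses a synthetic first-order autoregressive sequence as a hypothetical
stand-in for correlated media samples (no real audio or image data is involved), and the
function names are ours. An i.i.d. sequence has a lag-1 autocorrelation near zero, whereas a
correlated sequence does not.

```python
import random


def lag1_autocorr(x):
    """Sample autocorrelation coefficient between x[k] and x[k+1]."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


def ar1(n, rho, seed=0):
    """Synthetic sequence x[k] = rho * x[k-1] + w[k] with white Gaussian w[k].

    rho = 0 gives an i.i.d. Gaussian sequence; rho close to 1 gives strongly
    correlated samples (a hypothetical proxy for smooth media content).
    """
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


if __name__ == "__main__":
    print("i.i.d. Gaussian :", lag1_autocorr(ar1(20000, 0.0)))
    print("correlated AR(1):", lag1_autocorr(ar1(20000, 0.9)))
```

A capacity or robustness analysis that assumes the first kind of sequence says little about a
host that behaves like the second.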
Secondly, almost all existing data hiding schemes measure perceptual quality (fidelity) using
the mean squared error metric [39 – 41, 80 – 86, 115, 135, 146 – 149], which often does not
agree with the human perceptual model [2, 7, 60]. An appropriate perceptual distortion measure
is therefore needed to evaluate performance in terms of both the perceptual distortion caused
by information embedding and robustness.
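A small numerical example shows one limitation of the mean squared error: it depends only on
the total error energy, not on how that energy is distributed. In the sketch below (our own
illustration, with a hypothetical sinusoidal host and arbitrary amplitudes), a tiny offset on
every sample and a single large impulse have the same mean squared error, although their peak
errors differ by a factor of about 30. The code makes no claim about which of the two is more
audible; it only shows that the metric cannot tell them apart.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def peak_error(a, b):
    """Largest absolute sample difference."""
    return max(abs(x - y) for x, y in zip(a, b))


n = 1000
eps = 0.01  # arbitrary per-sample error amplitude (assumption)

# Hypothetical host: five cycles of a sinusoid.
host = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]

# Distortion A: the same small error on every sample.
spread = [x + eps for x in host]

# Distortion B: all of the error energy concentrated in one sample.
click = list(host)
click[n // 2] += eps * math.sqrt(n)

if __name__ == "__main__":
    print("MSE  spread / click:", mse(host, spread), mse(host, click))
    print("peak spread / click:", peak_error(host, spread), peak_error(host, click))
```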
Thirdly, low data rate is a common limitation of the existing data hiding schemes [5 – 24].
Because data hiding applications such as broadcast monitoring require relatively high data
rates, it is desirable to develop high-capacity data hiding schemes for such applications.
Several researchers [71 – 86] have proposed high-capacity data hiding schemes, but their work
improves coder performance by using efficient coding schemes and/or host signal interference
cancellation [77 – 89]. Little attention has been paid to exploiting the host data
characteristics in combination with an efficient coder and data hiding strategy to achieve a
high data embedding rate.
Moreover, most existing DRM systems use encryption for content protection and content
tracking, which alone cannot provide sufficient protection against active adversary attacks.
Finally, most research in the data hiding community focuses on traditional copy control issues
such as copyright protection, content authentication, tamper detection, unauthorized copy
prevention, content usage monitoring, broadcast monitoring, etc. Very little attention has
been given to broadening the data hiding application domain beyond copy control. For example,
data hiding can be used in multimedia transmission over lossy and bursty channels to improve
the QoS.
Problem Statement:
Most of the existing data hiding schemes [4 – 33] are based on i.i.d. Gaussian modeling of the
host data and independent Gaussian discrete memoryless channel (DMC) modeling of adversary
attacks or attack channels [39 – 56]. These assumptions do not hold in general. Similarly,
fidelity-based performance measures of existing data embedding schemes generally use the mean
squared error distortion criterion, which does not agree with the human audio-visual
perceptual model. It is therefore desirable to develop more realistic host data, attack
channel, and embedding distortion models for the performance analysis of data embedding
schemes. The main goal of this research is to advance the theory underlying data hiding, to
develop new techniques for data hiding, and to extend data hiding applications to digital
rights management systems. In this dissertation we propose to analyze the limitations of
existing data hiding schemes and their applications to different types of host data. Based on
this analysis we will propose efficient data hiding schemes for a reliable DRM system. We
intend to analyze the performance of the proposed schemes based on the triad of data hiding
performance criteria, that is, capacity, perceptibility, and robustness, using more realistic
data and channel models. We also plan to develop a more realistic measure of embedding
distortion in order to evaluate the perceptual performance of the proposed data hiding
schemes. In particular, we propose to investigate the following research tasks:
Develop high-capacity data hiding schemes based on host signal features along with efficient codec and data hiding strategies.
Devise realistic host data models (stochastic models) for each type of host media, i.e. audio, images, and video separately, for information-theoretic analysis of the proposed data hiding schemes.
Design efficient source coding schemes based on the developed host data models.
Develop appropriate channel models for intentional and unintentional attacks and analyze their performance against existing attack channel models.
Devise a realistic distortion metric to evaluate the performance based on perceptibility and robustness.
Develop an appropriate protocol for online multimedia authentication, copy control, and copyright protection applications.
Devise a suitable data hiding scheme for multimedia indexing and retrieval applications.
Develop a realistic data hiding strategy to improve the QoS of multimedia transmission over lossy and bursty channels.
The remainder of this dissertation proposal is organized as follows. Chapter 2 discusses the
requirements and application domain of data hiding schemes along with a general classification
of existing data hiding schemes; it also provides a brief overview of DRM systems. Related
work and data hiding models are presented in Chapter 3. Our contribution to blind data
embedding, or the additive spread spectrum class of data hiding, is discussed in Chapter 4.
Chapter 5 gives the details of our proposed work in the informed embedding class of data
hiding. Our proposed data hiding schemes for both classes use audio as the host medium;
their extensions to image and video host data are also proposed. Future directions of our
proposed research are outlined in Chapter 6.
CHAPTER 2
Preliminaries
This chapter presents an overview of the generic characteristics and requirements of the data
hiding problem, briefly describes the related application domains, and provides a general
classification of the existing data hiding schemes. The challenges, shortcomings, and
promises of data hiding schemes are outlined in Section 2.1. Section 2.2 gives a general
classification of existing data hiding schemes. The transmission channel model is an important
ingredient in the theoretical analysis of the data hiding problem. Common transmission channel
models that have been used to model attacks against data hiding and watermarking schemes, such
as bounded distortion channels, bounded host-distortion channels, and additive noise channels,
are discussed in Section 2.3. A brief overview of DRM systems is provided in Section 2.4.
2.1 Data Hiding Systems: Requirements and Applications
Digital multimedia (throughout this document, digital multimedia or the host media refers to
digital audio, digital video, and digital images, unless otherwise specified) has many
advantages over analog multimedia. For example, contemporary digital storage media such as
CDs and memory sticks show insignificant aging effects, and reproduction of digital media is
very simple: a copy of a digital media clip is identical to the original. Also, owing to recent
advances in techniques for digital data production, distribution, and manipulation, research
in data hiding and watermarking has grown rapidly, with the goal of compensating for the
deficiencies of conventional content protection methods such as cryptography and scrambling
[7, 8].
2.1.1 Requirements of a Data Hiding System:
A data hiding scheme is characterized by a number of defining properties [5 – 8]. In general, a
data hiding scheme is supposed to withstand common data manipulations, such as lossy
compression, digital-to-analog conversion, rescaling, requantization, resampling, filtering,
data format conversion, encryption, decryption, and scrambling. It is also supposed to
withstand active adversary attacks, such as noise addition, as long as the attack channel
distortion is below a certain masking threshold. However, the relative importance of each
property depends on the requirements of the application and the role of data embedding in it.
For example, when evaluating an audio watermarking system for a copy control application, we
may need to check robustness with respect to the short-time energy ratio, which an adversary
might use for an attack. Such robustness might, however, be irrelevant for broadcast
monitoring applications. Therefore, the performance of any data hiding scheme should be
evaluated with respect to the underlying application. The desirable properties of a generic
data hiding scheme are the following:
I. Robustness:
Robustness measures the ability of the embedded data or watermark to withstand intentional
and unintentional attacks. Unintentional attacks generally include common data processing
operations such as compression, digital-to-analog conversion, resampling, and requantization,
whereas intentional attacks cover a broad range of degradations [104 – 126], for example, white
and colored noise addition, scaling, rotation (for image and video watermarking schemes),
chopping, and low-pass filtering. Details of these intentional attacks and their
countermeasures can be found in [8, 131, 132].
II. Effectiveness:
The probability that the output of the embedder is watermarked for a randomly selected
input is generally referred to as the effectiveness of a data hiding scheme.
III. Fidelity:
This is an important property of all perceptual based data hiding schemes [5 – 24]. Fidelity
measures the perceptual similarity between the host media and its data-embedded version. To
meet this constraint, the perceptual distortion introduced by embedding is kept below the
masking threshold of the human auditory system (HAS) for audio data hiding schemes and of the
human visual system (HVS) for video and image data hiding schemes.
IV. Capacity:
This property refers to the amount of information that a data hiding scheme can successfully
embed without introducing perceptual distortion. The need for this property is application
dependent. For example, a data hiding scheme designed for copyright protection or copy control
does not require high embedding capacity, because a few bits of information are sufficient
for these applications, whereas a data embedding scheme for broadcast monitoring needs to
embed a relatively large amount of data [6, 7].
V. Blind or Informed Detection:
This property relates to the availability of the host data at the detector during watermark
detection. If the host data is available at the detector, the data hiding scheme is
categorized as an informed detector or private data hiding scheme. Such schemes are required
for fingerprinting and data authentication [5 – 7]. If the host data is not available at the
detector, the scheme is categorized as a blind detector or public data hiding scheme. Blind
detector based data hiding schemes are commonly used for copy control applications.
VI. False Positive Rate:
This property corresponds to the frequency with which a mark is detected in an unmarked
portion of the host data. It is an important property for content protection applications
such as ownership rights and copy control.
VII. Multiple Watermarks Capability:
The ability of a data hiding scheme to embed more than one mark in the same host data is
desirable in some applications, such as fingerprinting. For example, consider a situation
where the owner and the chain of distributors of a multimedia product each want to embed
their own marks (serial numbers or tags) to keep track of content usage and to trace a
traitor. For such applications, a multiple watermark embedding feature is desirable.
VIII. Cost:
The computational cost of the embedding and detection algorithms is another evaluation
criterion for data hiding schemes. It is critical for real-time applications such as
broadcast monitoring and online content authentication. On the other hand, for ownership
proof applications this property is not that critical.
2.1.2 Applications of Data Hiding:
The application domain of data hiding techniques is growing rapidly. Recently, several
research efforts [5, 9, 10, 150 – 158] have aimed beyond the classic applications of data
hiding, which include ownership protection, content usage tracking, content authentication,
copy control, fingerprinting, broadcast monitoring, indexing, and medical safety [5 – 24]. A
brief overview of a few of these applications and their design requirements follows:
I. Ownership Protection:
A watermark carrying the ownership information is embedded into the host data. The
watermarking scheme used for ownership protection is expected to be resilient to common data
processing operations (unintentional attacks) and to intentional attacks. In the case of a
dispute over ownership of the host data, the embedded watermark can be used as proof to
identify the true owner. Watermarking schemes intended for ownership protection must have low
probabilities of error and false alarm. In general, the capacity (payload) of a watermarking
scheme designed for ownership protection does not need to be high.
II. Content Authentication:
Robustness and undetectability are not the main concerns for content authentication
applications of data hiding. Therefore, fragile watermarking is generally used for such
applications. A watermark is embedded in the host data and later used to detect tampering
with the host media. Recent content authentication schemes are also capable of identifying
the locations of tampering in the host media [9 – 24, 100 – 103]. These applications generally
require an informed detector, i.e., the original host data is available to the detector for
content authentication. Data hiding schemes for content authentication must have high
embedding capacity to meet the requirements of these applications [6].
III. Fingerprinting:
The owner or distributor of multimedia content uses fingerprinting or labeling to trace
illegal copies or a traitor. For such applications, the content owner or distributor embeds a
unique fingerprint, label, or serial number in each copy of the data before distributing it
to each customer. A fingerprinting scheme is required to survive intentional and
unintentional attacks, and more specifically collusion attacks [90 – 99]. Fingerprinting does
not require high embedding capacity but does, in general, require robustness.
IV. Copy Protection:
Information embedded in the host multimedia data can be used to control the copying device
and so prevent unauthorized copying [7]. For this purpose, a watermark detector is generally
integrated into the recording or playback system, as in the DVD copy control scheme proposed
in [150] or the proposed SDMI player [159]. Data hiding schemes for such applications should
be robust against all intentional and unintentional attacks that tamper with the watermark
in the watermarked data. Moreover, data hiding techniques designed for copy control are
intended to use a blind detector and generally require low embedding capacity.
V. Broadcast Monitoring:
An automated (active) broadcast monitoring system can be used to detect the watermark
embedded in a broadcast commercial advertisement [5, 7, 158]. In addition, an active
broadcast surveillance system can also be used for other TV products (news, talk shows, etc.)
protected by broadcast monitoring watermarking systems. For such applications, the
watermarking scheme should be robust against watermarking attacks and requires a blind
detector for watermark detection. Furthermore, such applications require low watermark
embedding capacity.
2.2 Classification of Data Hiding Techniques
This section provides a general classification of existing data hiding techniques based on the
following six criteria:
host media type (images, video, audio, and text),
areas of applications (robust, fragile, and semi-fragile),
perceptibility (visible and invisible),
embedding domain (spatial and transform),
data embedding schemes (known-host-state and known-host-statistics), and
data extraction techniques (private, semi-private, and public).
This classification hierarchy of data hiding techniques is illustrated in Figure 2.1.
[Figure: classification tree rooted at DATA HIDING. Based on host media type: image, video,
audio, and text data hiding. Based on applications: robust, semi-fragile, and fragile data
hiding. Based on perceptibility: imperceptible and visible data hiding. Based on embedding
domain: direct domain embedding and transformed domain embedding. Based on embedding scheme:
additive spread spectrum techniques (blind data embedding) and host interference
cancellation techniques (informed data embedding). Based on extraction scheme: private,
semi-private, and public data hiding.]
Figure 2.1: General Classification of Data Hiding
Each category of data hiding schemes is discussed briefly as follows.
2.2.1 Classification Based on Host Media Type
Most data hiding research has focused on digital images rather than the other host media
types, i.e., video, audio, and text. This is because the performance evaluation of a data
hiding scheme for digital images is relatively easier than for digital audio and video, where
it generally requires subjective testing. Data hiding techniques based on host media type can
be divided into four sub-groups [127 – 150]:
1. Data Hiding in Images
2. Data Hiding in Video
3. Data Hiding in Audio
4. Data Hiding in Text
2.2.2 Classification Based on Data Hiding Applications
The performance of a data hiding scheme in terms of robustness, capacity, and fidelity
depends on the application of interest. For example, copyright protection applications
require robust watermarking [57 – 89], whereas content verification applications need fragile
watermarking [7, 9 – 24]. Similarly, fingerprinting needs semi-fragile watermarking
[90 – 103]. Therefore, existing data hiding schemes can be classified into three sub-groups
based on the application of interest:
1. Robust Data Hiding
2. Fragile Data Hiding
3. Semi-Fragile Data Hiding
2.2.3 Classification Based on Perceptibility
Existing data hiding schemes can be divided into two main categories based on the perceptibility
(fidelity) of embedded data [5 – 24], that is,
1. Imperceptible Data Embedding
2. Visible Data Embedding
Imperceptible data embedding implies that the embedded data is invisible (in the case of
image, video, and text host media) or inaudible (for audio host media). Imperceptible data
embedding schemes are more common than visible data embedding schemes [60 – 68]. Imperceptible
data embedding schemes exploit the characteristics of the HVS and HAS to ensure
imperceptibility of the embedded data. Visible data embedding schemes are generally used to
imprint a visible logo in digital images or video.
2.2.4 Classification Based on Data Embedding Domain
Existing data hiding schemes can be classified into two major categories based on the
embedding domain of the host media, that is,
1. Data Hiding in Spatial/Time Domain (Direct Domain)
2. Data Hiding in Transformed Domain
Least significant bit (LSB) encoding, patchwork, and echo hiding are a few common direct
domain data embedding schemes [127, 128, 131, 136, 160]. Direct domain data hiding schemes are
very popular in the data hiding community. The discrete cosine transform (DCT), discrete
wavelet transform (DWT), and discrete Fourier transform (DFT) are the transforms most
commonly used for data embedding. Most DCT-based image data embedding methods transform the
host data in 8x8 image blocks and then embed the watermark by modifying the DCT coefficients
according to the HVM [9 – 24]. In DWT-based data embedding algorithms, the host data is first
decomposed into subbands using the DWT; the wavelet coefficients in the selected subbands are
then modified based on a human perceptual model. Robust data hiding schemes for images and
video that are resilient to rotation, scaling, and translation (RST) distortions generally
use DFT-based embedding [6, 7, 127 – 141, 143, 144]. DFT-based algorithms are also common in
audio data hiding schemes.
2.2.5 Classification Based on Data Embedding Method
Existing data hiding schemes based on the data embedding methods can be classified into two
major categories [75 – 86, 115, 139], that is,
1. Additive Spread Spectrum or Host-Interference-Non-Rejecting Methods
2. Host-Interference-Rejecting Methods or Informed Embedding
In the case of additive spread spectrum based data hiding, a pseudorandom sequence w(mi),
generated using a secret key or the message mi, is added to the host signal Co, i.e.,
$$ x(m_i) = C_o + \alpha \times w(m_i), \qquad 0 < \alpha \le 1 \tag{2.1} $$
where α is called the scaling factor; its value sets the tradeoff between robustness and
fidelity of the embedded data.
From Eq. 2.1 it is clear that, for this class of data hiding, the host signal Co acts as
additive interference when a blind detector is used for watermark detection. This ultimately
limits the performance of the detector: even in the absence of an attack channel, zero error
probability is hard to achieve. However, these methods outperform the host interference
rejecting methods under severe attacks. Most of the existing data hiding methods [9 – 24]
fall into the additive spread spectrum class.
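As a minimal sketch of Eq. 2.1 and of the host interference seen by a blind detector, the following Python fragment embeds one bit with a key-seeded ±1 sequence and computes the correlation statistic a blind detector might use. The helper names (`pn_sequence`, `embed`, `blind_statistic`), the antipodal mapping of the bit to ±w, and the Gaussian stand-in for the host Co are assumptions made for illustration; they are not taken from the schemes cited above.

```python
import random

def pn_sequence(key, length):
    """Key-seeded pseudorandom sequence of +1/-1 values (illustrative w)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(host, bit, key, alpha):
    """Eq. 2.1: x = Co + alpha * w(m); here w(m) = +w for bit 1, -w for bit 0."""
    w = pn_sequence(key, len(host))
    sign = 1.0 if bit == 1 else -1.0
    return [c + alpha * sign * wi for c, wi in zip(host, w)]

def blind_statistic(received, key):
    """Correlate the received signal with w without access to the host."""
    w = pn_sequence(key, len(received))
    return sum(r * wi for r, wi in zip(received, w)) / len(received)

def blind_detect(received, key):
    """Decide the bit from the sign of the correlation statistic."""
    return 1 if blind_statistic(received, key) >= 0.0 else 0

# Stand-in host: zero-mean Gaussian samples (an assumption, not real audio).
host_rng = random.Random(2004)
host = [host_rng.gauss(0.0, 1.0) for _ in range(4096)]
key, alpha = 12345, 0.1

marked = embed(host, 1, key, alpha)
host_term = blind_statistic(host, key)       # interference contributed by Co alone
marked_stat = blind_statistic(marked, key)   # equals host_term + alpha
```

The statistic of the marked signal is the host term plus α, so blind detection is reliable only when α is large relative to the host term. Raising α improves robustness at the price of a larger embedding distortion (here α per sample).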
The inherent limitations of the host interference non-rejecting methods can be mitigated by
exploiting knowledge of the host signal at the encoder; such methods are generally known as
host interference rejecting methods. Quantization index modulation (QIM) [77 – 89] based data
hiding methods are a sub-class of host interference rejecting methods. This class of data
hiding methods provides easy control over the tradeoff between data rate, embedding
distortion, and robustness. These methods generally have a higher data rate than the spread
spectrum based data hiding class, at the cost of increased complexity of the data hiding
system.
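A minimal sketch of scalar QIM may make the host-rejection idea concrete. It assumes the common uniform-quantizer form in which each bit selects one of two quantizers with step size `delta`, offset from each other by `delta / 2`; the function names and the parameter `delta` are illustrative and are not taken from the cited schemes [77 – 89].

```python
def qim_embed(sample, bit, delta):
    """Quantize the host sample with the quantizer selected by the bit.

    Bit 0 uses reconstruction points k*delta; bit 1 uses k*delta + delta/2.
    """
    offset = bit * delta / 2.0
    return delta * round((sample - offset) / delta) + offset

def qim_detect(received, delta):
    """Pick the bit whose quantizer has a reconstruction point nearest the sample."""
    distances = [abs(received - qim_embed(received, b, delta)) for b in (0, 1)]
    return 0 if distances[0] <= distances[1] else 1

delta = 1.0
hosts = [-3.7, 0.0, 0.26, 12.9]   # arbitrary host samples
marked = {(c, b): qim_embed(c, b, delta) for c in hosts for b in (0, 1)}
```

With no attack the bit is recovered exactly whatever the host value, which is the sense in which the host interference is rejected. In this scalar form the step size trades embedding distortion (at most `delta / 2` per sample) against robustness (perturbations smaller than `delta / 4` are tolerated).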
2.2.6 Classification Based on Data Extraction Method
Data hiding systems can be classified into the following categories based on the information
available at the detector:
1. Private or Informed Data Hiding
2. Semi-Private Data Hiding
3. Public or Blind Data Hiding
Private data hiding systems require the original copy of the host media along with the secret
embedding key for data extraction. These systems are generally used for applications such as
content authentication and ownership verification. Semi-private systems generally require
only the secret embedding key for information extraction, whereas public data hiding systems
need only a marked copy of the host media at the detector for data extraction [5 – 33].
2.3 Transmission Channels
The transmission channel model plays an important role in analyzing the performance of a given
communications system. In general, a fixed transmission channel is assumed for the design and
analysis of a communication system, i.e., we cannot modify or design the noise function that
acts during transmission. A channel is generally characterized by a conditional probability
distribution, $P_{r|x}(r \mid x)$, which gives the probability of obtaining r at the output
of the channel when x is its input. Transmission channels are modeled according to the noise
function they apply to the signal and how the noise is applied.
In the data hiding scenario, adversary attacks (the attack channel) are generally treated as
a transmission channel for the performance analysis of a data hiding scheme. Commonly known
attacks in the data hiding community can be modeled as follows [30, 115]:
2.3.1 Bounded Distortion Channels
In this case we consider the largest distortion energy per dimension, $\sigma_n^2$, that
ensures $\hat{m} = m$ (zero error) for any distortion (noise) vector n that satisfies
$$ \|n\|^2 \le N \sigma_n^2 \tag{2.2} $$
This channel model describes a minimum signal-to-noise ratio (SNR) constraint between the
attack channel input and output. The bounded distortion channel model is more appropriate
for unintentional attacks such as compression, or for active adversary attacks that attempt
to remove the watermark from the watermarked media.
2.3.2 Bounded Host-Distortion Channels
An active adversary may instead use a distortion measure defined with respect to the host
signal rather than the distortion introduced by the channel, since this is a direct measure
of the degradation of the host signal. This model is appropriate when the attacker has
partial knowledge of the host signal, either in a probabilistic sense (i.e., the probability
distribution of the host signal is known) or in some other sense. The adversary can calculate
the distortion between a watermarked copy of the media and the host media; this distortion is
bounded by the expected distortion, given as
$$ D_r = E[D(r, x)] \tag{2.3} $$
where the expectation is taken over the conditional probability density of r given the
channel input x.
2.3.3 Additive Noise Channels
In this case the noise vector n is modeled as random and statistically independent of the
host data Do [39 – 46]. The additive white Gaussian noise (AWGN) channel is an example. The
robustness measure in this case is the maximum noise variance $\sigma_n^2$ that still ensures
a sufficiently low probability of error in the received data. Many researchers in the area of
data hiding use the AWGN assumption to model the attack channel when analyzing the
performance of a given data hiding scheme [5 – 7, 25 – 33].
The first two channel models are distortion constraints which are more appropriate to model
intentional attacks [5, 25 – 33, 115, 139] whereas AWGN channel is appropriate for
unintentional attacks.
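The channel models above can be illustrated with a short sketch. It reads the bounded distortion constraint of Eq. 2.2 as requiring the noise energy to be at most N times the per-dimension bound, and models the additive noise channel with Gaussian samples from Python's standard `random` module; the function names and numeric values are illustrative assumptions.

```python
import random

def noise_energy(noise):
    """Total energy ||n||^2 of a noise vector."""
    return sum(v * v for v in noise)

def within_bounded_distortion(noise, sigma2):
    """Bounded distortion test: ||n||^2 <= N * sigma_n^2."""
    return noise_energy(noise) <= len(noise) * sigma2

def awgn_channel(signal, sigma, rng):
    """Additive noise channel: r = x + n, with n drawn independently of x."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

# A deterministic perturbation: spread evenly, or concentrated on one sample.
spread = [0.1, -0.1, 0.1, -0.1]
concentrated = [1.0, 0.0, 0.0, 0.0]

# An AWGN attack on an all-zero signal, so the output is the noise itself.
rng = random.Random(7)
noise = awgn_channel([0.0] * 20000, 0.5, rng)
sample_variance = noise_energy(noise) / len(noise)
```

The bounded distortion test constrains every admissible noise vector, whereas the AWGN model only fixes the noise statistics, so an individual AWGN realization may exceed any given energy bound.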
2.4 Digital Rights Management: A Brief Overview
Digital rights management (DRM), i.e., the technologies, tools, and processes that protect
intellectual property during the life cycle of digital content, is a vital ingredient of the
emerging electronic multimedia (emedia) market. DRM creates an essential foundation of trust
between authors and consumers, which is a prerequisite for robust market development.
At its simplest level, DRM technology is all about controlling access to information.
Customers want convenient access to their purchased products, while companies seek to protect
their intellectual property from unauthorized use or duplication. DRM sits squarely between
these two parties, trying to present an amicable compromise between the customers and the
vendor.
Hardware keys, software licenses, and serial numbers all fall under the DRM umbrella [169].
Although there are several approaches to providing digital rights management, the "Anatomy
of a DRM Transaction" outlined in Figure 2.2 is the most common one.
[Figure: block diagram with the blocks content author/creator, media converter, content
manager, web storefront and media host, license manager, play management system, client web
browser, client viewer, and consumer; the transaction steps are numbered 1 to 6.]
Figure 2.2 Anatomy of a DRM Transaction
In Figure 2.2, at its most basic level, a DRM transaction starts with the content creator (1),
who generates a piece of media (2), be it audio, video, text, or some other format. Once in
digital form, the media file is encrypted or watermarked to protect it from unauthorized use
and stored on a content server. Access to the file is managed by the license server, possibly
in conjunction with a pay management system (3). Decrypted/unwatermarked media might be
delivered directly to a browser (4), or it could be decoded by the appropriate DRM-enabled
software application (5). Either way, a fully licensed, digital-quality media file or stream
reaches the customer (6).
Key features of an effective DRM system generally include:
Data protection, so files are not easily viewed without proper privileges (Content
Protection).
Unique identification of each customer to ensure that rights are applied appropriately
(Fingerprinting).
Central management of rights to allow for free distribution, anti-fraud measures, and
revocation (Content Authentication and legal action).
Flexibility, so the system can be tailored to various business models (rental, ownership,
and read-only) (Copy Control).
The rights model is the core of any content rights management system. A rights model is a
specification of the types of rights that the system can keep track of, what the system can
do with those rights, and the attributes of those rights, such as how many times the content
can be used, for how long a user can access it, how many times a user can copy it, and how
much money is charged.
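One way to picture such a rights model is as a small record per right. The sketch below is purely illustrative: the class name `UsageRight` and its fields are hypothetical and do not correspond to any particular DRM product or rights expression language.

```python
from dataclasses import dataclass

@dataclass
class UsageRight:
    """One entry of a hypothetical rights model."""
    right_type: str      # kind of right, e.g. "play" or "copy"
    max_uses: int        # how many times the right may be exercised
    valid_days: int      # for how long after purchase the right holds
    price: float         # how much is charged for the right
    uses: int = 0        # how many times the right has been exercised so far

    def exercise(self, days_since_purchase):
        """Grant one use if the right is still valid, and record it."""
        if days_since_purchase > self.valid_days or self.uses >= self.max_uses:
            return False
        self.uses += 1
        return True

copy_right = UsageRight("copy", max_uses=2, valid_days=30, price=0.99)
granted = [copy_right.exercise(day) for day in (1, 5, 9)]
```

Here the third copy request is refused because the two permitted uses have been consumed; a request made after the validity period would be refused in the same way.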
DRM systems are used to define rights to content, according to some rights model, and to
enforce the granting of those rights. There are three ways to enforce content rights:
1) Legally through registration forms, license agreements, and copyright laws.
2) Legally with an audit trail, such as copyright notices or watermarks (identifiers embedded
permanently in the content).
3) Technologically, using encryption and user authentication to protect content and only make it
accessible under strictly specified conditions.
Content protection and tracking are the basic building blocks of every DRM system. Most
existing DRM systems use encryption for content protection, content usage tracking, and
content and user authentication; because of its limitations, encryption alone cannot provide
a sufficient safeguard against piracy. On the other hand, watermarking together with
encryption can ensure content protection and usage tracking. A content protection scheme that
incorporates both encryption and watermarking is not foolproof, but it provides sufficient
protection against active adversary attacks. It is likely that the most successful DRM
solutions in the years to come will combine encryption and watermarking for content
protection and related issues. In this dissertation we intend to develop content protection
techniques using both encryption and watermarking.
CHAPTER 3
Related Work
This chapter studies the theoretical aspects of the data hiding problem. Different conceptual
models of the data hiding problem are explored here to help in understanding those aspects.
These models can be classified into two main categories: 1) data hiding models based on
communications theory, and 2) data hiding models based on a geometrical framework. Based on
the embedding method, existing data hiding schemes can be divided into two classes (as
discussed in Chapter 2): 1) spread spectrum based data hiding, and 2) informed data hiding.
Related work in these directions is briefly discussed in Section 3.4. The goal of this chapter
is to lay the foundation for the design and analysis of the data hiding systems discussed in
the later chapters.
3.1 Data Hiding in Communication Framework
In recent years, several researchers in the data hiding community [5, 7, 25 – 33, 39 – 71] have
used the traditional communications framework to analyze the theoretical aspects of data
hiding and watermarking, such as data hiding capacity, error probability, and performance
limits. A brief overview of the classical model of a communications system helps in
understanding the similarities and differences between a conventional communications system
and a data hiding system.
3.1.1 Classical Model of Communications System
The channel encoder, the channel decoder, and the communication channel are the three basic
elements of the traditional communications model, as illustrated in Figure 3.1. Here a
message, m, is to be transmitted across the communications channel.
[Figure: block diagram. Transmitter: the channel encoder maps the input information sequence
m to x. Channel distortions/noise n are added to x to give r. Receiver: the channel decoder
maps r to the output information sequence.]
Figure 3.1: Standard Communication Model
The channel encoder is a function that maps each possible message mi to a code word x,
selected from a set of signals suitable for transmission over the channel. For digital
communication, the channel encoder is generally divided into a source encoder and a
modulator. The source encoder maps a message into a sequence of binary symbols, whereas the
modulator maps a sequence of binary symbols into a physical signal x suitable for
transmission over the channel.
In general the channel encoder output depends on the transmission channel, but in our case x
is a finite-precision real sequence of length N, i.e., x = x0, x1,…, xN-1. We also assume
that these signals are bounded, i.e., power constrained:
$$ \sum_i (x[i])^2 \le p, \qquad p < \infty \tag{3.1} $$
The transmission channel is generally assumed to be noisy, which means that the output of the
channel, r, is not identical to its input, x. The change from x to r is due to the additive
noise of the channel, i.e., the transmission channel adds random noise n to x.
The output of the communication channel, r, enters the channel decoder. The channel decoder
inverts the channel encoding process, that is, it maps the received signal into a message
$\hat{m}$. The channel decoder is typically a many-to-one function, so that the received
signal can be decoded correctly even in the presence of noise. The probability of error, p_e,
in the decoded message is very small if the channel decoder is designed using the channel
parameters.
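The encoder, channel, and decoder chain can be sketched as follows. The sketch assumes an antipodal repetition code (each bit becomes N samples of +1 or −1) and Gaussian channel noise; these choices and all function names are illustrative and are not part of the model described above.

```python
import random

def channel_encode(bit, length):
    """Map a message bit to a code word x of +1/-1 samples (repetition code)."""
    symbol = 1.0 if bit == 1 else -1.0
    return [symbol] * length

def signal_energy(x):
    """Left-hand side of the power constraint in Eq. 3.1."""
    return sum(v * v for v in x)

def noisy_channel(x, sigma, rng):
    """r = x + n, where n is random additive noise."""
    return [v + rng.gauss(0.0, sigma) for v in x]

def channel_decode(r):
    """Many-to-one mapping: every r with a non-negative sum decodes to bit 1."""
    return 1 if sum(r) >= 0.0 else 0

rng = random.Random(31)
decoded = []
for bit in (0, 1, 1, 0):
    x = channel_encode(bit, 64)
    r = noisy_channel(x, 1.0, rng)
    decoded.append(channel_decode(r))
```

Because the decoder maps a whole half-space of received vectors to each bit, moderate noise leaves the decision unchanged; lengthening the code word lowers the error probability at the cost of more signal energy.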
3.1.2 Secure Transmission
In some communications applications, security of the transmitted information is an additional
requirement, and a secure communication system is generally used. The main difference between
a conventional and a secure communication system is that the latter uses a pair of secret
keys (encryption and decryption keys): the encryption key is used at the channel encoder to
encrypt the message sequence, and the decryption key is used at the channel decoder to
decrypt the received message.
Such a secure communication system is depicted in Figure 3.2.
[Figure: block diagram. Transmitter: the channel encoder, with encryption key Ke, maps the
input information sequence m to x. Channel distortions/noise n are added to x to give r.
Receiver: the channel decoder, with decryption key Kd, maps r to the output information
sequence.]
Figure 3.2: Standard Secure Communication Model
Encryption provides an extra security layer and helps to prevent passive as well as active
attacks on the secure delivery of content, provided that the adversary does not have access
to the secret key. That is, cryptography prevents a passive adversary from unauthorized
reading of the message and, similarly, prevents an active adversary from unauthorized
writing. The secure communication system described in Figure 3.2 is known as a symmetric
secure system if Ke = Kd, i.e., the encryption key is the same as the decryption key;
otherwise it is an asymmetric secure system.
Cryptography has been a popular technology for content protection for many years and is still
commonly used in a number of applications in the areas of content protection, secure network
communication, and secure content delivery [7]. However, cryptography cannot provide a
sufficient safeguard against jamming attacks, nor can it secure content after decryption.
Jamming attacks can be handled using spread spectrum communication schemes [38].
In the case of data hiding, both active and passive adversaries have access to the
watermarked media. Therefore, a secret key based data embedding and extraction system is
required to ensure the security of the embedded message and content protection. In the
remainder of this document we assume a symmetric secure data hiding system unless otherwise
specified.
3.1.3 Data Hiding Model Based on Communication Framework
A data hiding system has a strong analogy with a communication system [5 – 8]. In data hiding
we want to communicate information from the data embedder to the data detector. Therefore,
it is natural to use the conventional communication model for the design and analysis of
data hiding systems.
Figure 3.3 shows the standard data hiding model. The system including the dotted line is an
informed or private data hiding system, whereas the system without the dotted line is a data
hiding system with a blind or public detector.
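[Figure: block diagram with the blocks embedding, attack channel, and extraction. Inputs:
input message m, host media CO, embedding key K. Signals: x (embedder output), r (attack
channel output). Output: extracted message.]
Figure 3.3: General Model for Data Hiding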
A more detailed description of the above data hiding model is given in Figure 3.4 (informed
detector) and Figure 3.5 (blind detector).
[Figure: block diagram. Data embedder: message encoder with inputs input message (watermark)
m, embedding key K, and host media data Do; signals me and Dm. Adversary attack: noise n,
giving Dmn. Data detector: message decoder with data detection key Ko, producing the output
message (watermark).]
Figure 3.4: Data Hiding System with Informed Detector Analogous to the Standard Secure Communication Model
[Figure: block diagram. Data embedder: message encoder with inputs input message (watermark)
m, embedding key K, and host media data Do; signals me and Dm. Adversary attack: noise n,
giving Dmn. Data detector: message decoder with data detection key Ko, producing the output
message (watermark).]
Figure 3.5: Data Hiding System with Blind Detector Analogous to the Standard Secure Communication Model
It is clear from Figures 3.4 and 3.5 that data embedding consists of two basic steps: 1)
message mapping, in which the message encoder maps the input message into a suitable
embedding pattern, me, of the same dimension and type as the host media, Do (a secret key, K,
can be used for this mapping); and 2) addition of the embedding pattern to the original host
media, Do, to produce the data-embedded (marked) host media, Dm. This type of embedding is
known as blind embedding in the literature [5 – 8, 57 – 68], because the encoder completely
ignores the host media information during data embedding.
The marked media then undergoes intentional or unintentional attacks; for simplicity these
attacks are modeled as an AWGN channel. The output of the AWGN channel (attack channel) is
called the processed or distorted marked host media, Dmn.
Finally, to recover the embedded message, the processed marked media, Dmn, is passed through
the watermark detector. With an informed detector (Figure 3.4), detection is a two-step
process: 1) the original host media, Do, is first subtracted from the received data, and 2)
the residual noisy pattern, Dn, is used to estimate the embedded message. In the blind
watermark detection case (Figure 3.5), the original copy of the host media is not available
at the watermark detector, so we cannot subtract the host data, Do, from Dmn. In this
situation, we can regard the received signal, Dmn, as the embedding pattern corrupted by
noise formed by the combination of the host media and the attack channel. The performance
criterion for the watermark detector depends on the application of interest; for example,
for high robustness applications, such as ownership identification or copy control,
minimization of the error probability of the estimated message is the main criterion.
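The two detection paths can be compared in a small simulation. The sketch below follows the notation of Figures 3.4 and 3.5 (Do, me, Dm, Dmn, Dn) but otherwise rests on assumptions: a key-seeded ±1 pattern scaled by a hypothetical `strength`, a Gaussian stand-in for the host, an AWGN attack, and a correlation detector.

```python
import random

def message_encoder(bit, key, length, strength):
    """Step 1: map the message bit to an embedding pattern me using key K."""
    rng = random.Random(key)
    sign = 1.0 if bit == 1 else -1.0
    return [sign * strength * rng.choice((-1.0, 1.0)) for _ in range(length)]

def correlate(signal, key):
    """Correlation of a signal with the unit-strength pattern for bit 1."""
    ref = message_encoder(1, key, len(signal), 1.0)
    return sum(s * p for s, p in zip(signal, ref)) / len(signal)

def informed_statistic(d_mn, d_o, key):
    """Informed detector: subtract the host first, then correlate Dn."""
    d_n = [r - h for r, h in zip(d_mn, d_o)]
    return correlate(d_n, key)

def blind_statistic(d_mn, key):
    """Blind detector: the host stays in the signal and acts as noise."""
    return correlate(d_mn, key)

rng = random.Random(99)
n, key, strength = 1024, 4242, 0.05
d_o = [rng.gauss(0.0, 10.0) for _ in range(n)]      # host media
m_e = message_encoder(1, key, n, strength)          # embedding pattern
d_m = [h + p for h, p in zip(d_o, m_e)]             # step 2: blind embedding
d_mn = [v + rng.gauss(0.0, 0.1) for v in d_m]       # AWGN attack channel

informed = informed_statistic(d_mn, d_o, key)
blind = blind_statistic(d_mn, key)
```

The informed statistic stays close to `strength`, while the blind statistic additionally carries the host term `correlate(d_o, key)`. For this host model that term has a standard deviation of about 10/√1024 ≈ 0.31, far larger than `strength`, so a blind decision at this embedding strength would be unreliable.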
In blind data embedding schemes, the watermark embedder completely ignores the information
about the host media, which directly affects the overall performance of the data hiding
scheme. How can we utilize the host media information at the encoder, and how can this
improve the performance of the data hiding system? These issues are addressed next.
3.2 Data Hiding as Communication with Side Information at the Transmitter
As pointed out in the previous section, the communication based data hiding model with a
blind detector (Figure 3.5) cannot fulfill the requirements of fidelity, robustness, and high
embedding capacity. This is because, in this model, the embedding pattern, me, is restricted
to be independent of the host media, and at the blind detector the host media acts as random
noise or interference. Since the original copy of the host media, Do, is available at the
encoder, it is reasonable to exploit knowledge about the host media at the encoder to develop
a robust and high capacity data hiding system with minimal perceptual embedding distortion.
All existing perceptual based data hiding schemes exploit host media information at the data
embedder [5 – 24].
Figure 3.6 depicts the data hiding model where the embedding pattern, me, is host media
dependent. The only difference between this model and the one described in Figure 3.5 is that
this model has an informed encoder, that is, the encoder uses the information about the host media
for mapping the input message into the embedding pattern.
[Figure 3.6 (block diagram). Blocks: informed encoder (embedder), channel decoder (detector). Signals: input message sequence m, data embedding key K, host media Do, embedding pattern me, marked media Dm, channel distortions n, received media Dmn, output message sequence.]
Figure 3.6: Data Hiding as Communication with Side Information at the Encoder
Here, if we consider the combination of the host media and the channel distortion as the noise
process in an AWGN transmission channel, then this model is an example of a communication
system with side information at the transmitter, first studied by Shannon [34] and then by [35 –
37]. In recent years a few researchers have modeled the data hiding problem as communication
with side information at the encoder. Data hiding schemes based on the informed encoder
generally exhibit higher data rate, better perceptual quality, and better robustness compared to blind
data embedding schemes [57 – 71, 77 – 89]. For theoretical analysis, many researchers use the
idea proposed by Costa in [37] due to the strong analogy between his “Writing on Dirty Paper”
problem and the “Robust Data Hiding in Digital Media” problem of the data hiding community. We
will explore this issue, i.e. informed embedding, in Section 3.4.2.
3.3 Geometric Model of Data Hiding
There is yet another way of modeling the data hiding problem, that is, geometric modeling using an
n-dimensional space. In this framework the host media is considered as a point in an n-dimensional
space. This n-dimensional space is generally divided into two major regions:
1) Acceptable Fidelity Region: this is a region around the host media where the
distortion between the host media point and any other point in this region is imperceptible or below
the masking threshold.
2) Detection Region: this is the region in n-dimensional space where the detector can decode the
embedded message based on the knowledge about the embedding key.
The embedding process moves the original host media point to a point that lies in both a predefined
detection region and the region of acceptable fidelity, to ensure robustness as well as
imperceptibility. Cox et al [7] have used this geometric model to interpret the data hiding problem in
n-dimensional space.
3.4 Related Work
The work on digital watermarking became popular around the mid-nineties and since then the
number of research efforts in this area has surged significantly. However, most of the research
has focused on watermarking image content [4 – 24]. Recently watermarking of audio and video
content has also gained significant research interest [127 – 142]. There are two main
communities in the area of information/data hiding: data hiding using spread spectrum theory or
additive embedding [57 – 68], and data hiding based on host-interference rejection or informed
embedding [77 – 89]. In general, a spread spectrum watermarking scheme embeds data (a message)
into the host data by adding a pseudo-random sequence, and a correlation based detector is
commonly used for the watermark detection process. In blind spread spectrum
watermarking schemes, the host data acts as interference at the watermark detector, which
ultimately limits the detection performance of spread spectrum watermarking schemes.
Moreover, in order to meet the imperceptibility requirement of the watermark, the power of the
watermark signal is kept much lower than that of the host signal. Thus, the host signal
interference significantly reduces the amount of reliable communication between the watermark
embedder and detector.
Cox et al [59] proposed a spread spectrum based watermarking system in which one information
bit (watermark bit) is spread over as many samples as the host media contains, using a modulated
pseudorandom spreading sequence to generate the embedding sequence. Different variations of
spread spectrum based data hiding have been proposed in the past [57 – 68] for all types of
media. Low data hiding capacity and a non-zero probability of error, Pe, even in the absence of
channel degradation are the main limitations of this class of data hiding. Relatively invariant
robustness performance from the no-distortion scenario to severe degradation is an attractive feature
of this class of data embedding. We will discuss our contributions in spread spectrum based data
hiding for digital audio in Chapter 4, along with possible extensions of our proposed scheme to image
and video data.
3.4.1 Data Hiding Based on Additive Embedding
Most of the existing data embedding algorithms treat the host signal as additive noise or
interference [57 – 67, 127 – 141]. The simplest members of this data embedding class have a purely
additive embedding function, that is,
$$D_m(D_o, k) = D_o + m_e(k) \tag{3.2}$$
where me(k) is generally a pseudorandom sequence which is statistically independent of the host
media, Do, and is generated using a secret key k.
This fact is quite evident from Figures 3.3, 3.4 and 3.5. Data embedding methods based on the
embedding function described in Eq. 3.2 are termed differently in the literature, for example,
“spread spectrum methods” or “additive spread spectrum methods” [7], “host interference non-
rejecting methods” [30], “Type I embedding” [6, 28], and “known host statistics methods” [115],
but spread spectrum is the most commonly used term in the data hiding community. In
communication theory, the term “spread spectrum” means that the transmitted message signal
occupies a much larger bandwidth than the bandwidth required for the message signal (baseband
signal) [38]. In recent years many researchers [9 – 24] have been using spread spectrum
theory for data hiding applications. The term “spread spectrum data hiding” has been established
for simple additive embedding of a mark signal, me, chosen independently of the host signal, Do, as
described in Eq. 3.2. Cox et al [59] proposed a spread spectrum based watermarking system in
which one information bit (watermark bit) is spread over as many samples as the host media contains,
using a modulated pseudorandom spreading sequence to generate the embedding sequence me. This
embedding sequence is then added to the original host data, Do, to produce the watermarked copy of
the host data, Dm. This class of data embedding methods is limited to low data embedding
capacity; for example, Cox et al’s [59] spread spectrum watermarking scheme can embed only
one bit in each host media.
A common variation of purely additive spread spectrum methods is weighted-additive
embedding, that is,
$$D_m(D_o, k) = D_o + \alpha(D_o)\, m_e(k) \tag{3.3}$$
here the embedding pattern is weighted with a scaling factor, α. This scaling factor generally
accounts for human perceptual characteristics, to ensure imperceptibility of the embedded
message. For example, in the embedding function proposed by Podilchuk et al [141], the amplitude
scaling factor, α, is host data dependent, that is, it depends on the just noticeable difference (JND)
level. Similarly, in the weighted embedding function proposed in [59], the amplitude scaling factor
is set proportional to the host data, Do, such that
$$\alpha(D_o) = \lambda D_o$$
where λ is a constant, 0 < λ ≤ 1.
This means that the embedding function distorts larger magnitude host signal samples more than
smaller samples or coefficients of the host data in the transformed domain. This proportional
weighted-additive embedding class of data hiding is still additive embedding in the log-domain, i.e.
$$D_m(D_o, k) = D_o + \alpha(D_o)\, m_e(k) = D_o + \lambda D_o\, m_e(k) = D_o \left(1 + \lambda\, m_e(k)\right) \tag{3.4}$$
Now taking the log of both sides of Eq. 3.4,
$$\log D_m(D_o, k) = \log D_o + \log\left(1 + \lambda\, m_e(k)\right) \tag{3.5}$$
Eq. 3.5 shows that weighted-additive embedding is still additive embedding in the log-domain.
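The log-domain identity of Eq. 3.5 can be checked numerically. The short Python sketch below uses hypothetical sample values and assumes a positive host coefficient and 1 + λ me > 0, so that both logarithms are defined.

```python
import math

def weighted_additive(d_o, m_e, lam):
    """Proportional weighted-additive embedding with alpha(Do) = lam * Do (Eq. 3.4)."""
    return d_o + lam * d_o * m_e

# Hypothetical values: one positive host coefficient, one antipodal pattern sample.
d_o, m_e, lam = 12.5, -1.0, 0.1
d_m = weighted_additive(d_o, m_e, lam)

lhs = math.log(d_m)                              # log Dm
rhs = math.log(d_o) + math.log(1.0 + lam * m_e)  # log Do + log(1 + lam * me), Eq. 3.5
print(d_m, abs(lhs - rhs) < 1e-12)
```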
For watermark detection, these methods rely on the statistical properties of the host data, which
are used to develop an optimal information decoder. This decoder is generally optimal in the
maximum likelihood (ML) sense. The statistical characterization of the host data is
available in the direct domain as well as in transformed domains such as the DCT, DWT, or DFT [9 –
24]. For simplicity we consider digital image host data, Do, in the DCT domain; here $D_o^{DCT}$ can be
modeled by a Laplacian probability density function (pdf) [9, 147]. Thus the pdf of each
coefficient, d[i], can be written as,
$$f_{d[i]}(d[i]) = \frac{\beta[i]}{2}\, e^{-\beta[i]\,|d[i]|} \tag{3.6}$$
where
$$\beta[i] = \frac{\sqrt{2}}{\sigma_{d[i]}}$$
Now the robustness performance of the weighted-additive embedding function described in Eq.
3.3, in the absence of channel noise (or adversary attack), can be calculated as follows.
For simplicity the embedding pattern, me, is assumed to be a pseudo-random sequence of antipodal
binary samples, i.e. $m_e \in \{-1, +1\}$. Therefore, $E(m_e^2) = 1$, and if the magnitude scaling factor,
$\alpha(D_o)$, is a non-negative real constant, i.e. $\alpha(D_o) = \alpha;\ \alpha > 0$, then the data embedding
distortion in this case is $\sigma_e^2 = E\left((\alpha\, b\, m_e)^2\right) = \alpha^2$, where b is the one bit of information to be
embedded into the host data and b = ±1.
It can be shown [115, 147] that the probability of error Pe at the receiver using ML detector in
the absence of noise or adversary attack is given as,
21P 3
2e e λ−= .7
where eλ σ α= .
Hence it is clear from Eq. 3.7 that even in the absence of channel noise or attack Pe is not zero,
i.e. zero probability of decoding error is not attainable; this fact (Pe ≠ 0) also holds for other
probabilistic models of the host data, e.g. the Gaussian or generalized Gaussian host model.
Therefore, the additive embedding class of data hiding is not provably robust, and this is due to the
host signal interference. An informed detector improves the performance of this class of data hiding
methods considerably, because the presence of the host signal at the detector can be used to
cancel the host signal interference. Moreover, variations of this class that minimize the effect of
host signal interference can also improve the performance.
One of the most important advantages of additive embedding based data hiding methods is
that, for a power-constrained transmission channel, it is extremely difficult to severely degrade
the host signal’s underlying pdf. Because the decoder optimization criterion depends on the
statistical characterization of the host signal, and these statistical properties are relatively
invariant, an attack does not cause an abrupt degradation of robustness performance.
Therefore, Pe does not degrade abruptly from the attack-free
scenario to growing channel distortion.
Additive embedding, such as spread spectrum watermarking, is one of the first methods
used for data embedding [59, 164] and is still the most popular one due to its simplicity and its
ability to withstand severe distortion. Many variations of this method are possible depending
on the nature of the host signal and the application of interest [67 – 68, 139, 140, 162, 163].
Malik et al [68] proposed a frequency selective spread spectrum watermarking scheme for digital
audio. In this method we use only a selected frequency range of the host signal (audio signal) for
data embedding instead of the complete frequency range of the host media. Thus the host signal
interference at the detector is limited to that of the selected subband signal, which in turn
improves the robustness. The proposed method introduces low embedding distortion because the
watermark is embedded only in the selected frequency range. Moreover, this method is capable of
embedding 5 – 8 times more data compared to the existing data hiding methods of this class.
A detailed overview of this method is provided in Chapter 4.
3.4.2 Informed Embedding
Recently Chen et al [30, 81] and Cox et al [71] have explored the idea of informed embedding.
Data hiding methods under the informed embedding umbrella generally have higher data rate,
better robustness and better fidelity for bounded perturbation attack channels. These methods are
capable of achieving zero-error probability as long as the channel distortion is below a certain
threshold [5, 6, 27, 28, and 115]. In general this class of data embedding methods uses a blind
detector, i.e. the detector has no information about the host signal Do for the detection process, but the
encoder exploits information about the host signal to reduce the host signal interference.
Cox et al’s work [71] is based on the general concept of Shannon’s paper “Channels with Side
Information at the Transmitter” [34], whereas Chen et al’s [30, 80 – 86] work is based on
Costa’s work “Writing on Dirty Paper” [37]. Costa considered communication with side
information at the encoder over an AWGN channel, as described in Figure 3.6.
3.4.2.1 Costa’s Work
The main requirement of Costa’s solution to the communication problem described in Figure 5.1
is to design an Nd-dimensional codebook $\mathcal{U}^{N_d}$ and an appropriate encoding process; here Nd is
the cardinality of the host data vector. In the limiting case, i.e. as $N_d \to \infty$, Costa’s codebook $\mathcal{U}^{N_d}$
achieves the capacity of communication with IID Gaussian side information Do at the encoder
and an AWGN channel.
Costa’s codebook can be defined as,
$$\mathcal{U}^{N_d} = \left\{\, u[i] = m_e[i] + \eta\, D_o[i] \;\middle|\; i \in \{1, 2, \ldots, N_u\} \,\right\} \tag{3.8}$$
where Nu is the cardinality of the codebook and η : 0 ≤ η ≤ 1 is a codebook parameter. Moreover,
$$m_e \sim \mathcal{N}(0, \sigma_e^2 I_{N_d}), \qquad D_o \sim \mathcal{N}(0, \sigma_d^2 I_{N_d}), \qquad n \sim \mathcal{N}(0, \sigma_n^2 I_{N_d})$$
are the realizations of the embedding pattern, host data, and channel noise, which are Nd-
dimensional mutually independent Gaussian random processes with zero mean and covariance
matrices $\sigma_e^2 I_{N_d}$, $\sigma_d^2 I_{N_d}$, and $\sigma_n^2 I_{N_d}$ respectively, where $I_{N_d}$ is an Nd-dimensional
identity matrix. Costa showed [37] that the capacity of such a communication
system is independent of the host signal interference, that is,
$$C_{AWGN} = \frac{1}{2} \log_2\left(1 + \frac{\sigma_e^2}{\sigma_n^2}\right) \tag{3.9}$$
which is equal to the capacity of an additive spread spectrum system with an informed detector [5].
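As a small numerical illustration of Eq. 3.9, the Python sketch below evaluates the capacity for hypothetical variances. The second function is an assumption added for comparison only: it treats the host as additional Gaussian noise of variance σd², which corresponds to the blind additive setting discussed earlier rather than to a formula stated in the text.

```python
import math

def awgn_capacity(sigma_e2, sigma_n2):
    """Eq. 3.9: capacity in bits per sample, independent of the host variance."""
    return 0.5 * math.log2(1.0 + sigma_e2 / sigma_n2)

def host_as_noise_capacity(sigma_e2, sigma_d2, sigma_n2):
    """Comparison case (assumption): the host is treated as extra Gaussian noise."""
    return 0.5 * math.log2(1.0 + sigma_e2 / (sigma_d2 + sigma_n2))

if __name__ == "__main__":
    # Hypothetical variances: embedding power 1, host power 100, noise power 1.
    print(awgn_capacity(1.0, 1.0))
    print(host_as_noise_capacity(1.0, 100.0, 1.0))
```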
3.4.2.1.1 Quantization Index Modulation (QIM)
Costa’s scheme is purely theoretical; therefore, several practical approaches to implement
Costa’s scheme have been proposed [28, 30, 57, 72 – 74, 84]. In the data hiding scheme proposed by
Chen et al [80 – 86], the host signal Do is quantized depending on the information to be
embedded; this scheme is commonly referred to as “quantization index modulation” (QIM). For
analysis and implementation, Chen et al [80 – 86] gave a low complexity practical
implementation of their theoretical QIM scheme, i.e. binary dither modulation (BDM). QIM
and its variations belong to the informed embedding class of data hiding [77 – 89].
The QIM information embedding process involves modulating an index or sequence of indices
with the embedding information and then quantizing the host signal with the associated quantizer
or sequence of quantizers. A quantizer is approximately an identity function, i.e. q(x) ≈ x, and can
be uniquely described by a set Q of reconstruction points in Nd-dimensional space along with a
rule for mapping an input signal of length Nd to a point in the set Q. The minimum distance rule is
generally used for selecting a suitable point from Q for an input signal; therefore, different
quantizers can be characterized by their reconstruction points Q only. As the QIM scheme belongs to
the host interference rejecting class of data embedding, QIM schemes offer high data
rates for power-constrained attack channels [30].
The basic steps of the QIM scheme can be outlined as:
1) A set of different quantizers Q1, Q2, …, QM is defined, where M is the cardinality of the
set of possible embedding messages.
2) To embed message m, the host signal is quantized using quantizer Qm.
3) The detector quantizes the received signal Dmn using the set of all quantizers Q1, Q2, …,
QM. Then the detector determines the index of the quantizer with the reconstruction point closest
to the received signal; this estimated index corresponds to the estimate of the embedded message m.
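The three steps above can be sketched for a single host sample as follows. This is a minimal illustration, not the implementation of [30]: it assumes M uniform scalar quantizers with a common step size whose reconstruction points are offset by m·step/M, and the function names are hypothetical.

```python
def quantize(x, step, offset):
    """Uniform scalar quantizer whose reconstruction points are offset + k * step."""
    return round((x - offset) / step) * step + offset

def qim_embed(host_sample, message, num_messages, step):
    """Step 2: quantize the host sample with the quantizer indexed by the message."""
    offset = message * step / num_messages
    return quantize(host_sample, step, offset)

def qim_detect(received, num_messages, step):
    """Step 3: return the index of the quantizer with the closest reconstruction point."""
    best_m, best_dist = 0, float("inf")
    for m in range(num_messages):
        offset = m * step / num_messages
        dist = abs(received - quantize(received, step, offset))
        if dist < best_dist:
            best_m, best_dist = m, dist
    return best_m

if __name__ == "__main__":
    marked = qim_embed(3.37, message=2, num_messages=4, step=1.0)
    print(marked, qim_detect(marked + 0.1, num_messages=4, step=1.0))
```

With these assumed quantizers, neighbouring reconstruction points of different quantizers are step/M apart, so a perturbation smaller than step/(2M) leaves the decoded index unchanged.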
3.4.2.1.2 Binary Dither Modulation
Binary dither modulation is a low complexity implementation of QIM in which the ensemble
of embedding functions consists of dither quantizers [3]. For these dither quantizers the quantization
cells and reconstruction points of any given quantizer are shifted versions of the quantization
cells and reconstruction points of any other quantizer in the ensemble. The shifts are generally
achieved using a pseudorandom vector called the dither vector, d; for information embedding
this dither vector is modulated according to the embedding message m. Let the modulated
dither vector corresponding to the message m be denoted by d(m). The embedding function
for dither modulation is defined as [30, 81]:
$$D_m(D_o, m) = q\left(D_o + \mathbf{d}(m)\right) - \mathbf{d}(m) \tag{3.10}$$
here q(·) is a uniform scalar quantizer with step size ∆.
For binary dither modulation, the mapping from the range of the host signal values Do[n] onto
the watermarked signal values Dm[n] using a uniform scalar quantizer with step size ∆ is
illustrated in Figure 3.7. Here, the set Q1 (circles ‘O’) is defined by a uniform scalar quantizer
with step size ∆. Similarly, the set Q2 (crosses ‘X’) is another uniform scalar quantizer with the
same step size but with a ∆/2 offset.
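[Figure 3.7 (diagram). Labels: reconstruction points of Q1 (‘O’) and Q2 (‘X’), host values Do[n], watermarked values Dm[n], distances ∆/4, ∆/4 and ∆.]
Figure 3.7: Binary Dithered Modulation Scheme Based on Dithered Uniform Scalar Quantization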
The dither vector construction and the zero-error watermark detection condition are derived as follows.
We assume that the data embedding rate Rm satisfies $1/N_d \le R_m \le 1$, and that $m = b_1, b_2, \ldots, b_{N_d R_m}$ is the
binary representation of the embedding message m, where $b_i \in \{0, 1\}$ for i : 1, 2, …, NdRm. If ku/kc is
the rate of the error correcting code used for channel encoding, then the channel encoded binary
representation of m is $z_1, z_2, \ldots, z_{N_d/L}$, where
$$L = \frac{1}{R_m}\,\frac{k_u}{k_c}$$
Now two dither subvectors of length L are constructed as,
$$d_i(2) = \begin{cases} d_i(1) + \Delta/2, & \text{if } d_i(1) < 0 \\ d_i(1) - \Delta/2, & \text{if } d_i(1) \ge 0 \end{cases} \qquad i = 1, 2, \ldots, L \tag{3.11}$$
Eq. 3.11 ensures that the two L-dimensional dither quantizers are at the maximum possible distance
from each other. Here, one dither subvector, d(1), is associated with binary information ‘0’,
whereas the second dither subvector, d(2), is associated with binary information ‘1’. Finally the Nd/L
dither subvectors associated with the channel encoded bits z1, z2, …, zNd/L are concatenated to form
the dither vector $\mathbf{d}(m) \in \Re^{N_d}$.
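A minimal Python sketch of binary dither modulation for a single coded bit is given below, combining the embedding function of Eq. 3.10 with the dither pair of Eq. 3.11. Drawing the first dither subvector uniformly from [−∆/2, ∆/2] with a key-seeded generator, and omitting the channel code, are assumptions made for illustration.

```python
import random

def make_dither_pair(length, step, key):
    """Return (d(1), d(2)): d(1) is key-seeded, d(2) follows Eq. 3.11."""
    rng = random.Random(key)
    d1 = [rng.uniform(-step / 2, step / 2) for _ in range(length)]
    d2 = [d + step / 2 if d < 0 else d - step / 2 for d in d1]
    return d1, d2

def uniform_quantize(x, step):
    """Uniform scalar quantizer q(.) with the given step size."""
    return step * round(x / step)

def dm_embed(block, bit, dithers, step):
    """Eq. 3.10 applied per sample: Dm = q(Do + d(m)) - d(m); bit 0 uses d(1), bit 1 uses d(2)."""
    d = dithers[bit]
    return [uniform_quantize(x + di, step) - di for x, di in zip(block, d)]

def dm_detect(block, dithers, step):
    """Minimum distance decoding: pick the dither quantizer closest to the received block."""
    def sq_dist(d):
        return sum((y - (uniform_quantize(y + di, step) - di)) ** 2
                   for y, di in zip(block, d))
    return 0 if sq_dist(dithers[0]) <= sq_dist(dithers[1]) else 1

if __name__ == "__main__":
    step = 1.0
    dithers = make_dither_pair(8, step, key=3)
    host = [0.3, -4.1, 7.9, 2.2, -0.6, 5.5, 1.1, -3.3]      # hypothetical host block
    marked = dm_embed(host, 1, dithers, step)
    print(dm_detect([y + 0.1 for y in marked], dithers, step))
```

In this sketch the two quantizers are offset by ∆/2 in every dimension, so any per-sample perturbation smaller than ∆/4 is decoded correctly.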
Finally, the minimum distance between the reconstruction points of the two quantizers Q1 and Q2 can be
shown [30, 84] to be,
$$d_{min}^2 = \frac{d_H\, k_u}{4\, R_m\, k_c}\, \Delta^2 \tag{3.12}$$
where dH is the minimum Hamming distance, a feature of the error correcting code used for
channel encoding. For very small quantization cells, the mean squared distortion introduced per
dimension due to embedding by the uniform scalar quantizer with step size ∆ is:
$$E\left(D_E(D_o, D_m)\right) = \frac{\Delta^2}{12} \tag{3.13}$$
Now for bounded distortion channels and minimum distance decoding, the zero-error decoding
condition can be shown to be [82],
$$\frac{3}{4}\,\frac{d_H\, k_u}{k_c\, N_d\, R_m}\,\frac{E\left(D_E(D_o, D_m)\right)}{\sigma_n^2} > 1 \tag{3.14}$$
Eq. 3.14 shows that, for a fixed rate Rm, a larger channel distortion energy $\sigma_n^2$ requires a larger
embedding mean squared distortion, $E(D_E(D_o, D_m))$. Moreover, for a fixed rate and channel
distortion energy, Eq. 3.14 gives the minimum perceptual distortion that must be introduced by data
embedding. Therefore, the QIM scheme gives a trade-off between rate, robustness and fidelity of the
data embedding process.
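The trade-off can be explored numerically with a short Python sketch of Eqs. 3.12 to 3.14. The parameter values in the example are hypothetical, and the function names are introduced here for illustration only.

```python
def min_distance_sq(d_h, k_u, k_c, rate, step):
    """Eq. 3.12: squared minimum distance between the two dither quantizers."""
    return d_h * (k_u / k_c) * step ** 2 / (4.0 * rate)

def embedding_distortion(step):
    """Eq. 3.13: mean squared embedding distortion per dimension."""
    return step ** 2 / 12.0

def zero_error_margin(d_h, k_u, k_c, n_d, rate, step, sigma_n2):
    """Left-hand side of Eq. 3.14; zero-error decoding is guaranteed when it exceeds 1."""
    return 0.75 * d_h * (k_u / k_c) * embedding_distortion(step) / (n_d * rate * sigma_n2)

def zero_error_guaranteed(d_h, k_u, k_c, n_d, rate, step, sigma_n2):
    return zero_error_margin(d_h, k_u, k_c, n_d, rate, step, sigma_n2) > 1.0

if __name__ == "__main__":
    # Hypothetical uncoded case: d_H = 1, k_u = k_c = 1, one bit per sample, step 1.
    for sigma_n2 in (0.05, 0.1):
        print(sigma_n2, zero_error_guaranteed(1, 1, 1, 1, 1.0, 1.0, sigma_n2))
```

Substituting Eq. 3.13 into Eq. 3.14 shows that the margin equals the squared minimum distance of Eq. 3.12 divided by 4 Nd σn², which is how the sketch can be cross-checked.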
As the informed embedding class of data embedding methods uses a blind detector, the detector
can be treated as deterministic, and hence their performance is limited by bounded-power channel
distortion. For example, if the channel distortion $\sigma_n^2$ > ∆/2 then the performance of the data embedding
system deteriorates and zero-error decoding is not guaranteed.
We will discuss our contributions in informed embedding for digital audio data in Chapter 5. Our
proposed data hiding schemes [75, 76], which use phase alteration of audio data, are capable of embedding
more data than the existing schemes while keeping the embedding distortion below the masking
threshold.
Considering the level of research activity related to data hiding in the past decade, it is evident
that there has been a significant improvement in the design of data embedding and detection
schemes, but at the same time the sophistication of attacks against data hiding has shown similar
improvements. These parallel improvements have motivated theoretical analysis of the performance
limits of digital data hiding techniques. The first work in this direction is by Moulin et al [39, 40, 47
– 54], who consider digital watermarking as a game between the watermark embedder and an
active attacker. The watermark embedder attempts to maximize the amount of embedded
information whereas the attacker attempts to minimize it. Other theoretical work considers
robustness against estimation attacks [5] or the influence of quantization on correlation based
watermark detection.
CHAPTER 4
Blind Data Embedding
This chapter presents our initial contributions in additive embedding or blind embedding
techniques and outlines the future work. In general, additive embedding or spread spectrum
based watermarking techniques embed information by adding a pseudo-random sequence to the
host data, and a correlation based detector is used for watermark detection.
4.1 Robust and High Capacity Data Embedding: Our Work
Currently we are working on the design of high capacity, robust data hiding algorithms using
spread spectrum theory. Initial results of our proposed algorithm in this direction are promising
[68]. The main features and a performance analysis of our work are given next.
4.1.1 Data Hiding Using Frequency Selective Based Spread Spectrum
As pointed out in the previous section, the host signal interference at the detector limits the
performance of the additive embedding class of data hiding; therefore, we can improve the
performance of these methods by either rejecting or minimizing the host signal interference.
Complete rejection of the host signal interference requires an informed detector, which is not feasible
for many data hiding applications such as copy control, device control, etc. We can therefore aim at
minimizing the host signal interference, and one possible way to do this is by embedding data in a
selected subband of the host signal instead of the whole frequency band of the host signal,
which is the main idea of our work [68]. This frequency selective embedding also
reduces the embedding distortion, which ultimately improves the fidelity of the data hiding scheme.
The frequency selective data hiding algorithm is outlined next.
This method [68] is designed to overcome common shortcomings of existing DSSS based audio
data hiding/watermarking systems [59 – 67], such as vulnerability to desynchronization attacks,
poor detection performance, poor fidelity (audible distortion), and limited embedding capacity.
Robustness to desynchronization attacks and reliability of detection performance are improved
using content-adaptive features of the input audio called salient points [65]. These salient points
are frame level features of the input audio signal that are invariant to common audio processing
operations. Only a small fraction of the audible frequency range is used for data embedding in
order to reduce the amount of audible distortion. The method exploits the frequency masking
characteristics of the human auditory system (HAS) and inserts the mark into a randomly
selected frequency band of the input audio signal. A secret key is used for randomly selecting the
frequency band for watermark embedding. The proposed watermarking scheme induces low
perceptual as well as mean squared distortion; therefore, following the analysis of
Moulin et al [39 – 41], the proposed scheme has high embedding capacity. The detection
performance of the system was
investigated for a variety of signal manipulations and attacks on a watermarked audio clip. These
attacks include addition of noise, resampling, requantization, filtering, and random chopping.
Results show the robustness of the method, with a low detection error rate and a low bit error
rate. Moreover, the proposed watermarking scheme is capable of embedding multiple
watermarks in the unused frequency bands with the use of separate secret keys.
4.1.1.1 WATERMARKING USING PERCEPTUAL AUDITORY MODEL
The basic idea underlying perception-based watermarking schemes is to incorporate the
watermark into the perceptually insignificant region of an audio signal in order to ensure
transparency. The perceptually insignificant region is determined using the human perceptual
auditory model. Extensive work has been done over the years on understanding the properties of the HAS
and applying this knowledge to audio applications [2]. An important application of perceptual
models is in the area of perception-based compression [165]. An important characteristic of the HAS
is auditory masking, which has been exploited in audio coding for lossy compression.
We consider its use in watermarking.
The human ear performs a frequency analysis that maps a frequency to a location along the basilar
membrane. The HAS is generally modeled as a non-uniform bandpass filter bank with
logarithmically widening bandwidth for higher frequencies [165]. The bandwidth of each
bandpass filter is set according to the critical band, which is defined as “the bandwidth in which
subjective response changes abruptly” [2]. The critical band rate (CBR) is a measure of location
on the basilar membrane, just as frequency gives a measure of location in a spectrum. The
unit of critical band rate is the Bark. The mapping between CBR and frequency is defined as:
$$z = 13 \arctan(0.76 f) + 3.5 \arctan\left((f / 7.5)^2\right) \tag{4.1}$$
where z is CBR in Barks and f is frequency in kHz.
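The frequency-to-Bark mapping can be sketched in Python as follows. The sketch assumes the commonly cited Zwicker and Terhardt coefficients (0.76 and 7.5, with frequency in kHz); the function name and the choice of taking the input in Hz are illustrative.

```python
import math

def hz_to_bark(freq_hz):
    """Critical band rate z (Bark) for a frequency in Hz, following the form of Eq. 4.1."""
    f_khz = freq_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

if __name__ == "__main__":
    for f in (100.0, 1000.0, 4000.0, 16000.0):
        print(f, round(hz_to_bark(f), 2))
```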
Masking is a fundamental property of HAS and is a basic element of perceptual audio coding
systems. It is a phenomenon by which a stronger audible signal makes a weaker audible signal
inaudible [2], and this occurs both in frequency as well as time domain [2].
4.1.1.2 SALIENT POINT EXTRACTION
Spread spectrum techniques have been applied in digital watermarking [59 – 67] due to their
potential for high fidelity, high capacity, robustness, and security. In the proposed scheme, the
process of generating a watermark and embedding it into an audio signal is treated in the
framework of spread spectrum theory. The original audio signal is treated as noise whereas the
message information used to generate a watermark sequence is considered as data. The spreading
sequence, also called the pseudo-random noise sequence or PN-sequence, is treated as the key. This
watermarking strategy can be treated in the framework of the communication models discussed in
[7].
A critical aspect of designing a spread spectrum system is ensuring fast and reliable
synchronization at the detector. Explicit synchronization information impacts performance because it
reduces the overall capacity of the watermarking system, and an active adversary can use it
for de-synchronization attacks. To overcome these problems, synchronization is tied
to attack-sensitive locations, or salient points, for watermark embedding and detection. Salient
points are extracted based on audio features to which the HAS is sensitive [65], e.g. fast energy
transition points, zero crossing rate and spectral flatness measure. If these features are altered
then noticeable distortion is introduced. A good salient point extraction method is one that
extracts approximately the same salient points before and after common signal manipulations or
watermark embedding [65]. The fast energy transition feature is used in our method for salient
point extraction.
For an audio signal Do(n): n = 0,1,2,…N-1, the short time energy ratio at each point is calculated
as:
$$E_r(n) = \frac{E_{after}(n)}{E_{before}(n)} \tag{4.2}$$
where Eafter(n) and Ebefore(n) are defined as:
$$E_{before}(n) = \sum_{i=-r}^{-1} D_o^2(n+i) \tag{4.3}$$
$$E_{after}(n) = \sum_{i=0}^{r-1} D_o^2(n+i) \tag{4.4}$$
Here r is the number of samples before and after Do(n). A high energy transition point is
defined as a point n for which:
Er(n) > Th1 and Eafter(n) > Th2
Finally, the salient points are decided as follows:
1: If two high energy transition points are separated by less than Th3, then they are merged
together to form a group.
2: Within each group, the strongest transition point is marked as a salient point.
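The extraction rules can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation of [68]: the thresholds are fixed rather than adaptive, Th3 is measured in samples, and the "strongest" point of a group is taken to be the one with the largest energy ratio.

```python
def energy_ratio(signal, n, r):
    """Return (Er(n), Eafter(n)) from Eqs. 4.2 to 4.4 for r samples on each side of n."""
    before = sum(x * x for x in signal[n - r:n])
    after = sum(x * x for x in signal[n:n + r])
    ratio = after / before if before > 0 else float("inf")
    return ratio, after

def salient_points(signal, r, th1, th2, th3):
    """Find high energy transition points, group those closer than th3, keep the strongest."""
    candidates = []
    for n in range(r, len(signal) - r + 1):
        ratio, after = energy_ratio(signal, n, r)
        if ratio > th1 and after > th2:
            candidates.append((n, ratio))
    points, group = [], []
    for n, ratio in candidates:
        if group and n - group[-1][0] >= th3:
            # Gap to the previous candidate is large enough: close the current group.
            points.append(max(group, key=lambda c: c[1])[0])
            group = []
        group.append((n, ratio))
    if group:
        points.append(max(group, key=lambda c: c[1])[0])
    return points

if __name__ == "__main__":
    # Hypothetical signal: a quiet segment followed by a loud one.
    signal = [0.01] * 100 + [1.0] * 100
    print(salient_points(signal, r=10, th1=50.0, th2=1.0, th3=20))
```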
The thresholds Th1, Th2 and Th3 are set adaptively to ensure 3 - 4 salient points per second.
4.1.1.3 WATERMARK EMBEDDING
To generate and embed a watermark, the host data (audio) is first analyzed to determine the salient
point list. A block of P samples around each salient point is selected. The block is applied to an
l-level modified wavelet analysis filter bank to generate (2 × l − 1) subband signals of unequal
bandwidths, as illustrated in Figure 4.1.
[Figure 4.1 (block diagram). The input X(n), f = 0~fs/2, passes through a tree of low pass filters h_lp(n) and high pass filters h_hp(n) (cutoff freq. pi/2) with downsampling by a factor of 2, producing nine subbands:
Sb 1: f = 0~fs/64
Sb 2: f = fs/64~fs/32
Sb 3: f = fs/32~3fs/64
Sb 4: f = 3fs/64~fs/16
Sb 5: f = fs/16~3fs/32
Sb 6: f = 3fs/32~fs/8
Sb 7: f = fs/8~3fs/16
Sb 8: f = 3fs/16~fs/4
Sb 9: f = fs/4~fs/2]
Figure 4.1: 5-Level Modified Discrete Wavelet Analysis Filter Bank
The number of subbands is chosen as a compromise between allowing a large
choice in random selection and ensuring that the subband bandwidth covers at least three critical
bands. A subband from the lower eight bands (for l = 5) is selected using the three bit sub-key k1i
for the ith salient point, whereas the complete secret key K1 for subband selection is given as:
K1 = k11| k12|…| k1i|…| k1M
where M is the cardinality of the salient point set.
The selected subband is used to estimate the masking threshold Tm(k), which is calculated as
follows:
Let sbi,j(n) for n = 0,1,2…L-1 be the jth subband of the ith frame of the audio data that is selected
using sub-key k1i. Its power spectrum is defined as,
$$P_{sb}(k) = |Sb_{i,j}(k)|^2 \tag{4.5}$$
here Sbi,j(k) is the discrete Fourier transform (DFT) of sbi,j(n). Now k is warped onto the Bark
scale using Eq. 4.1. The energy in each critical band is calculated as,
$$E(z) = \frac{1}{P_z} \sum_{k=LB_z}^{UB_z} P_{sb}(k), \qquad \text{for } z = 1, 2, \ldots, z_t \tag{4.6}$$
where zt is the total number of critical bands in the selected subband, LBz and UBz are the lower
and upper boundaries of the zth critical band, and Pz is the total number of points in that critical
band. The energy per critical band is used to calculate the masking threshold Tm(z) using MPEG
layer III psychoacoustic model 1 [165].
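The per-band energy computation can be sketched in Python as follows. Several simplifications are assumed: the power spectrum covers the bins from 0 to fs/2 of a full-band signal rather than a selected subband, critical bands are indexed from 0 by the integer part of the Bark value, and the commonly cited coefficients 0.76 and 7.5 are used for the Bark mapping. The psychoacoustic model that converts these energies into Tm(z) is not sketched.

```python
import math

def hz_to_bark(freq_hz):
    """Frequency (Hz) to critical band rate (Bark), following the form of Eq. 4.1."""
    f_khz = freq_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

def critical_band_energy(power_spectrum, sample_rate):
    """Average power per critical band: E(z) = sum of Psb(k) in band z divided by Pz."""
    n_bins = len(power_spectrum)
    sums, counts = {}, {}
    for k, p in enumerate(power_spectrum):
        freq = k * (sample_rate / 2.0) / (n_bins - 1)   # bin k mapped to 0..fs/2
        z = int(hz_to_bark(freq))                       # critical band index (assumption)
        sums[z] = sums.get(z, 0.0) + p
        counts[z] = counts.get(z, 0) + 1
    return {z: sums[z] / counts[z] for z in sorted(sums)}

if __name__ == "__main__":
    flat = [1.0] * 257                                  # hypothetical flat power spectrum
    print(critical_band_energy(flat, 44100.0))
```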
4.1.1.3.1 Watermark Generation
For each salient point a watermark W of length L is generated. To generate a watermark W, the
binary message m is first mapped onto a channel-coded message using a channel encoder. The channel
encoded data is applied to a binary phase shift keying (BPSK) modulator. The output of the BPSK
modulator is Wm(n) : n = 0,1…q-1, where q = L/(spreading factor). A maximum length PN-sequence p of
length (L/q) is generated using a log2(L/q) bit secret key K2. Finally the modulated signal Wm is
spread using the PN-sequence p to generate the final watermark W. The system key is K = K1|K2.
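The generation chain can be sketched in Python as follows. The sketch omits the channel encoder and uses a key-seeded random ±1 sequence as a stand-in for a true maximum length PN-sequence; both are simplifying assumptions, and the despreading function is included only to show that correlation with the same sequence recovers the bits.

```python
import random

def bpsk(bits):
    """BPSK mapping: bit 1 -> +1, bit 0 -> -1."""
    return [1.0 if b else -1.0 for b in bits]

def pn_sequence(key, length):
    """Key-seeded +1/-1 chips; a stand-in for a maximum length PN-sequence."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def spread(symbols, pn):
    """Spread each BPSK symbol over len(pn) chips to form the watermark W."""
    return [s * c for s in symbols for c in pn]

def despread(chips, pn):
    """Correlate each chip block with the PN-sequence and decide the bit from the sign."""
    n = len(pn)
    bits = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], pn))
        bits.append(1 if corr > 0 else 0)
    return bits

if __name__ == "__main__":
    message = [1, 0, 1, 1, 0]                 # hypothetical (already coded) bits
    pn = pn_sequence(key=1234, length=31)     # spreading factor 31
    watermark = spread(bpsk(message), pn)
    print(len(watermark), despread(watermark, pn))
```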
4.1.1.3.2 Watermark Embedding
Spectral shaping of W based on Tm(k) is required to ensure inaudibility of the embedded
watermark. For this purpose the DFT W(k) and the power spectrum Pw(k) of W are calculated.
Using Tm(z), the inaudible DFT coefficients of the selected subband sbi,j are then removed, i.e.
$$Sb_n(k) = \begin{cases} Sb_{i,j}(k) & \text{if } P_{sb}(k) \ge T_m(z) \\ 0 & \text{if } P_{sb}(k) < T_m(z) \end{cases} \tag{4.7}$$
Similarly, the unwanted DFT coefficients of W(k) are also removed, i.e.
$$W_n(k) = \begin{cases} 0 & \text{if } P_{sb}(k) \ge T_m(z) \\ W(k) & \text{if } P_{sb}(k) < T_m(z) \end{cases} \tag{4.8}$$
The final watermark before embedding is given by
$$W_f(k) = F_z \cdot W_n(k) \tag{4.9}$$
where Fz is the shaping factor, defined as
$$F_z = \frac{A \, T_m(k)}{\max(|W_n(k)|)} \tag{4.10}$$
where 0 < A < 1 is the noise gain factor. Finally, the watermarked output in the frequency domain is
$$Wsb_{i,j}(k) = Sb_n(k) + W_f(k) \tag{4.11}$$
The corresponding time-domain watermarked subband signal is obtained by calculating the inverse
discrete Fourier transform (IDFT),
$$Wsb_{i,j}(n) = \mathrm{IDFT}\{Wsb_{i,j}(k)\} \tag{4.12}$$
This watermarked subband is then used to reconstruct the watermarked audio block using the
modified wavelet synthesis filter bank. This process is repeated for the remaining salient points
in the salient point list.
The watermark generation and embedding process is illustrated in Figure 4.2.
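The shaping and embedding steps of Eq. 4.7 to Eq. 4.12 can be sketched in Python as below. This is a simplified illustration, not the implementation used for the reported results: a single scalar threshold `tm` stands in for the per-band masking threshold Tm(z), the gain and threshold values are hypothetical, and the shaping factor follows Eq. 4.10 literally.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Direct O(N^2) DFT, or inverse DFT when inverse=True."""
    n_pts = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / n_pts)
               for n in range(n_pts))
           for k in range(n_pts)]
    return [v / n_pts for v in out] if inverse else out

def embed(sb, w, tm, gain):
    """Shape watermark w against subband sb using one threshold tm."""
    sb_f, w_f = dft(sb), dft(w)
    audible = [abs(c) ** 2 >= tm for c in sb_f]            # Psb(k) >= Tm
    sbn = [c if a else 0 for c, a in zip(sb_f, audible)]   # Eq. 4.7
    wn = [0 if a else c for c, a in zip(w_f, audible)]     # Eq. 4.8
    peak = max(abs(c) for c in wn)
    fz = gain * tm / peak if peak > 0 else 0.0             # Eq. 4.10
    wf = [fz * c for c in wn]                              # Eq. 4.9
    wsb_f = [s + v for s, v in zip(sbn, wf)]               # Eq. 4.11
    wsb = [v.real for v in dft(wsb_f, inverse=True)]       # Eq. 4.12
    return wsb, audible, wf

random.seed(0)
subband = [math.cos(2 * math.pi * 4 * n / 32) for n in range(32)]  # toy host
mark = [random.choice((-1.0, 1.0)) for _ in range(32)]             # toy watermark
marked, audible, wf = embed(subband, mark, tm=1.0, gain=0.5)
```

In this toy case only the two bins carrying the host tone are kept from the host, and the watermark occupies the remaining bins with a peak magnitude of A·Tm.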
[Figure 4.2 block diagram. Blocks: audio content analysis; attack-sensitive region extraction; subband analysis (using modified wavelet analysis filter bank); subband selection (key k1i); masking threshold extraction (using psychoacoustic model); watermark shaping (using masking threshold); watermark sequence (100101101...); channel encoding; BPSK modulator; PN-sequence generator (key k2); watermark spreading; watermark embedding; subband synthesis (using modified wavelet synthesis filter bank); data merging. Input: original audio Do. Output: watermarked audio Dm.]
Figure 4.2: Block Diagram of Watermark Embedding Process
4.1.1.4 WATERMARK DETECTION
To be effective, a watermarking system should be able to detect and extract the embedded
watermark even after the watermarked audio undergoes common signal manipulations and
audio processing based on psychoacoustic auditory models. An attractive feature of the proposed
scheme is that a blind detector can be used for watermark detection/extraction, i.e. the detector does
not require the original copy of the audio signal to detect the watermark in the received audio signal.
The secret key is the only information the detector has about the embedding. The detector uses
salient points to synchronize with the embedded information, so the audio is first analyzed for
salient point extraction (as discussed in Section 3). For each point in the salient point list, a block
of P samples is passed through the modified wavelet analysis filter bank; then, using the secret
sub-key k1i, the jth subband sbi,j is selected for watermark detection/extraction.
The selected subband is analyzed to extract its masking threshold, say Tmr(z). This masking
threshold is used to extract the “residual” audio signal Rr(k), which is defined as
$$R_r(k) = \begin{cases} 0 & \text{if } P_{sr}(k) > T_{mr}(z) \\ S_r(k) & \text{if } P_{sr}(k) \le T_{mr}(z) \end{cases} \tag{4.13}$$
where Sr(k) is the DFT of sr(n), the selected subband of the received audio, and Psr(k) is the
corresponding power spectrum.
The residual is transformed into the time domain for watermark detection/extraction using the IDFT, i.e.
$$r_r(n) = \mathrm{IDFT}(R_r(k)) \tag{4.14}$$
The residual rr(n) is then used for watermark detection by means of a normalized correlation test. The
normalized correlation between the real sequence rr(n) and the PN-sequence p(n), generated at the
detector using key K2, is defined as
$$cor_n = \frac{\sum_{l=-M}^{M} r_r(l) \, p(n+l)}{\sqrt{\sum_{l=0}^{M} r_r^2(l) \cdot \sum_{l=0}^{M} p^2(l)}} \tag{4.15}$$
where the summation limit M is set by the length L of the residual signal. High correlation implies the
presence of a watermark, as illustrated in Figure 4.3.
[Plots omitted. Left panel: "Normalized Correlation: Watermark Present"; right panel: "Normalized Correlation: Watermark Absent". Vertical axis in both panels: normalized correlation.]
Figure 4.3: Normalized Correlation for watermarked subband (left) and unwatermarked subband (right).
The normalized correlation is compared with a threshold to determine the presence of a
watermark. Let hypothesis H1 denote the presence of a watermark in a selected subband and H0
denote the absence of a watermark. The decision criterion is
$$\begin{aligned} H_1 &: \max_n(cor_n) \ge Th \quad \text{(watermark present)} \\ H_0 &: \max_n(cor_n) < Th \quad \text{(watermark absent)} \end{aligned} \tag{4.16}$$
If H1 is true then the embedded information is recovered by despreading rr(n) using the PN-
sequence generated using same key K2, then demodulating the resulting sequence using BPSK
demodulator followed by channel decoding. The detection process is illustrated in Figure 4.4.
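The correlation test of Eq. 4.15 and Eq. 4.16 can be sketched in Python as follows. This is an illustration under assumptions: the normalization uses the square root of the product of the two energies, the threshold value 0.5 is hypothetical, a random ±1 sequence stands in for the PN-sequence, and synthetic noise stands in for the residual rr(n).

```python
import math
import random

def cor(r, p, n):
    """Eq. 4.15 at lag n; samples outside either sequence count as zero."""
    num = sum(r[l] * p[n + l] for l in range(len(r)) if 0 <= n + l < len(p))
    den = math.sqrt(sum(v * v for v in r) * sum(v * v for v in p))
    return num / den if den > 0 else 0.0

def detect(r, p, threshold):
    """Eq. 4.16: return (decision, peak); True means H1, False means H0."""
    peak = max(cor(r, p, n) for n in range(-len(r) + 1, len(p)))
    return peak >= threshold, peak

random.seed(1)
pn = [random.choice((-1.0, 1.0)) for _ in range(255)]     # stand-in PN sequence
noise = [random.gauss(0.0, 0.5) for _ in range(255)]
marked = [c + e for c, e in zip(pn, noise)]               # residual with watermark
unmarked = [random.gauss(0.0, 1.0) for _ in range(255)]   # residual without one
found_in_marked, peak_marked = detect(marked, pn, threshold=0.5)
found_in_unmarked, peak_unmarked = detect(unmarked, pn, threshold=0.5)
```

In a full detector the decision H1 would be followed by despreading, BPSK demodulation and channel decoding.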
[Figure 4.4 block diagram. Blocks: audio content analysis; attack-sensitive region extraction; subband analysis (using modified wavelet analysis filter); subband selection (key k1i); masking threshold extraction (using psychoacoustic model); residual extraction (using masking threshold); PN-sequence generator (key k2); correlator detector; if peak ≥ Th: watermark spreading, BPSK demodulation, channel decoding, output watermark sequence (100101101...); if peak < Th: no watermark. Input: watermarked audio Dm.]
Figure 4.4: Block Diagram for Watermark Detection
4.1.1.5 EXPERIMENTAL RESULTS
The robustness of the proposed scheme was tested on speech and music signals. The tests
included several degradations and distortions: addition of noise, lossy compression, lowpass
filtering, resampling, random chopping, and multiple watermarks. The detection performance in
each case is assessed with two measures: 1) the watermark detection rate (WDR), which measures
watermark detection, and 2) the bit accuracy rate (BAR), which measures data
recovery. The bit accuracy rate is defined as
$$BAR = \frac{\text{Number of Bits Correctly Detected}}{\text{Number of Bits Embedded}} \tag{4.17}$$
and the watermark detection rate as
$$WDR = \frac{\text{Number of Watermarked Frames Correctly Detected}}{\text{Number of Watermarked Frames Embedded}} \tag{4.18}$$
The overall performance of the system is defined as
$$DPM = BAR \times WDR \tag{4.19}$$
where DPM stands for detection performance measure.
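As a small worked example of Eq. 4.17 to Eq. 4.19, the Python sketch below computes the three measures. The bit patterns and frame counts are made-up values for illustration only, not experimental data.

```python
def bit_accuracy_rate(embedded, detected):
    """Eq. 4.17: fraction of embedded bits that were detected correctly."""
    correct = sum(1 for e, d in zip(embedded, detected) if e == d)
    return correct / len(embedded)

def watermark_detection_rate(frames_embedded, frames_detected):
    """Eq. 4.18: fraction of watermarked frames that were detected."""
    return frames_detected / frames_embedded

def detection_performance(bar, wdr):
    """Eq. 4.19: DPM = BAR x WDR."""
    return bar * wdr

# Made-up example: 7 of 8 bits correct, 19 of 20 frames detected.
bar = bit_accuracy_rate([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 1, 0, 0, 0, 1, 0])
wdr = watermark_detection_rate(20, 19)
dpm = detection_performance(bar, wdr)
```

With these numbers BAR = 0.875, WDR = 0.95 and DPM = 0.83125; a DPM of 1 requires every frame to be detected and every bit to be recovered.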
Detection results for degraded watermarked audio based on DPM for a variety of conditions are
described below.
White Gaussian noise is added to the watermarked audio. The DPM values (as defined in Eq.
4.19) in the presence of white Gaussian noise with power from 0 to 50% of the
signal power are shown in Figure 4.5.
[Plot omitted: "Detection Performance Measure vs Noise Power". Horizontal axis: noise power as a percentage of the audio power; vertical axis: detection performance measure.]
Figure 4.5: DPM for different values of noise power (Pn)
Watermarked audio is down-sampled to 22.05 kHz and then interpolated back to 44.1 kHz.
The DPM value for this test is 1.
Watermarked audio undergoes ISO/MPEG-1 Audio Layer III encoding/decoding
[165] at a bit rate of 128 kbps. The DPM value for the compression test is 1.
The watermarked audio signal is lowpass filtered with a 4 kHz cutoff frequency. Detection
on the resulting audio gives a DPM of 0.995, so detection performance is still acceptable
despite severe audible distortion.
To investigate desynchronization attacks, one out of every 100 samples of the
watermarked signal was randomly dropped. Detection applied to this signal gave a
DPM of 1.
Three watermarks are simultaneously embedded in the audio, with a unique secret key
assigned to, and a unique subband selected for, each watermark. The DPM was 1 as
long as the number of watermarks was less than the number of analysis subbands.
4.2 Future Directions
A novel watermarking scheme for audio based on FS-DSSS is proposed. The technique
introduces lower mean squared and perceptual distortion than existing spread
spectrum schemes [59 – 67], because the watermark is embedded in a small
frequency band rather than across the complete audible frequency range. The watermarking capacity theory
presented in [39 – 41] suggests that the proposed scheme can embed more information. The
proposed method is also robust to standard data manipulations, i.e. noise addition, compression,
random chopping, and resampling.
4.2.1 Proposed Data Hiding Scheme for Images
We are currently investigating the extension of our frequency selective watermarking scheme [68] to
digital image watermarking, image authentication, and image fingerprinting. Existing
image watermarking schemes [5 – 24] generally use a blind detector for watermark detection, and
the performance of such blind or additive data embedding schemes is limited by host signal
interference. This interference can be reduced if the watermark is selected from the null space of the
host data: because such a watermark is orthogonal to the host signal, host signal interference at
the detector is minimal, which improves the detection performance of the data hiding scheme.
A Bayesian source separation approach can be used for blind watermark detection. Because the
embedded watermark is orthogonal to the host, estimation of the separation matrix would be
computationally efficient as well.
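The idea of choosing a watermark that is orthogonal to the host can be illustrated with a short Python sketch. It is a simplification under assumptions: the host is treated as a single coefficient vector, so "null space" reduces to the orthogonal complement of that vector, and the embedding strength 0.5 and the random data are hypothetical.

```python
import random

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def project_out(v, host):
    """Remove from v its component along host, leaving a vector orthogonal to host."""
    scale = dot(v, host) / dot(host, host)
    return [vi - scale * hi for vi, hi in zip(v, host)]

random.seed(7)
host = [random.gauss(0.0, 10.0) for _ in range(64)]           # stand-in coefficients
candidate = [random.choice((-1.0, 1.0)) for _ in range(64)]   # key-driven sequence
w = project_out(candidate, host)
marked = [h + 0.5 * wi for h, wi in zip(host, w)]

# The correlator output contains no host term: <marked, w> = 0.5 * <w, w>.
host_term = dot(host, w)
detector_output = dot(marked, w)
```

Because the host term vanishes (up to rounding), a correlation detector sees only the watermark energy, which is the reduction in host interference described above.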
For image decomposition we are investigating the use of an l-level discrete wavelet analysis filter
bank. The reasons for using the DWT for watermark embedding are manifold: 1) the wavelet
transform provides good space-frequency localization for analyzing image features such as
edges or textured areas; 2) owing to the multiresolution representation of the image, hierarchical
processing is possible, which can be used for progressive watermark
embedding/decoding; 3) the wavelet transform is flexible enough to adapt to a given set of images or
the application at hand; 4) wavelet coefficients can generally be characterized by a Gaussian
distribution [166, 167], which will improve the computational efficiency of separation matrix
estimation using the Bayesian approach; and 5) the wavelet transform is compatible with JPEG2000, the
most recent still image compression standard. For watermark embedding all subbands except the lth-level
approximation subband are candidates, because embedding in the approximation subband would degrade the
visual quality of the watermarked image. Moreover, although diagonal subbands are generally less
sensitive to quantization noise, these subbands at the 1st and 2nd levels are not suitable
for watermark embedding because they are generally discarded during lossy
compression. The subbands along the horizontal and vertical orientations at the 2nd and 3rd levels are
suitable for watermark embedding. Figure 4.6 illustrates a 3-level wavelet decomposition.
[Figure 4.6 subband layout. Labels in the extracted figure: LL4, HL4, LH4, HH4, HL3, LH3, HH3, HL2, LH2, HH2, HL1, LH1, HH1.]
Figure 4.6 Three Level Wavelet Decomposition
The proposed frequency selective image watermarking scheme using the DWT is outlined as follows:
Decompose the image I using an l-level discrete wavelet analysis filter bank.
Randomly select a subband x from the decomposed image using the secret key, such that
$$x = sb_i^{\theta_j} : \theta_j \in \{2, 4\},\ \forall\, i \in \{1, 2, 3, \ldots, l\}$$
where j = 1, 2.
Generate a watermark w with a non-Gaussian distribution using the secret key, such that the
watermark lies in the null space of the selected subband.
Calculate the masking threshold for the selected subband using the noise visibility
function [167].
Embed the watermark in the selected subband as
$$x_e = x + \alpha \, x_s \, w \tag{4.20}$$
where xe is the watermark-embedded subband, xs are the wavelet coefficients above the
masking threshold, and α is the level-adaptive scaling factor.
Reconstruct the watermarked image using the discrete wavelet synthesis filter bank.
To detect the watermark with a blind detector, several variations of blind source separation
could be used; we plan to use the Bayesian source separation framework. Because
the embedded watermark is uncorrelated with the subband in which it is embedded,
estimation of the separation matrix is possible.
For image authentication and fingerprinting applications, the original image is available at the
detector, so detection performance is expected to improve. The proposed scheme is
flexible enough to be modified for the application of interest. For example, in an
image fingerprinting application the fingerprint can be detected with the normalized correlation
function because the host data is available at the detector. Moreover, multiple fingerprints can be
embedded simultaneously.
4.2.2 Proposed Data Hiding Scheme for Video
Video and audio data deserve more attention from the data hiding research
community because they can carry more data than digital images and have
been explored relatively little in the past. The entertainment industry is also facing greater
losses from digital audio and video than from digital images. Audio and video data hiding
can be used in multimedia wireless communication to improve QoS, providing enhanced performance
for intended recipients and normal service for regular users.
The image watermarking technique proposed in the previous section can be extended to video data
hiding with certain modifications. I-frames and motion compensation
vectors can be used for data hiding, whereas P-frames can be used as a backup or second-level data
hiding channel. To counter frame replacement and frame shuffling attacks, frame
pairing can be used for data embedding, i.e. information about frame f is embedded in frame f’,
where the index of f’ is greater than the index of f. We are also investigating
data hiding schemes for efficient error concealment in multimedia transmission over
lossy and bursty channels.
Chapter Summary
This chapter gives an overview of the blind embedding schemes. In Section 4.1 we discuss the
existing additive data hiding schemes, their advantages, and their limitations. Our contribution to
this class of data hiding schemes is provided in Section 4.2. Low host signal interference,
reduced embedding distortion, high data rate, and robustness against adversarial attacks are the
attractive features of the proposed scheme. The simulation results show the robustness of the
proposed scheme against common data manipulation attacks. Possible extensions of our
frequency selective watermark embedding scheme to digital images and video are proposed in
Section 4.3. For watermark detection we plan to use a Bayesian source separation
approach. We are also investigating the use of data hiding for error concealment in multimedia
transmission over a bursty channel.
CHAPTER 5
Informed Data Embedding
This chapter provides our initial contributions to informed embedding, i.e. embedding techniques
based on host interference rejection, and outlines the future work. In general, informed
embedding based watermarking techniques embed information using an informed encoder, i.e. by
exploiting knowledge of the host signal at the encoder, and use a blind detector for watermark extraction.
5.1 High Rate Data Embedding Using Informed Encoding: Our Work
It is clear from Eq. 3.14 that QIM-based data embedding techniques introduce large
embedding distortion when more robust embedding is required at a fixed data rate. Moreover, QIM
schemes do not take the human perceptual system into account. Therefore, for low perceptual
distortion and better robustness, data embedding schemes have to incorporate the
human perceptual system. Malik et al. [75, 76] proposed a data hiding scheme that uses deterministic
dithering in a selected frequency range of the audio signal. The frequency
range for dithering is selected based on the human perceptual model.
5.1.1 Data Hiding Using Frequency Selective Dithering (Our Contribution)
In [75, 76] we propose a novel perception-based, high-capacity data hiding method. This
scheme exploits the following properties of the human auditory system (HAS): magnitude distortion at a specific
frequency in an audio signal is inaudible if it is below the masking threshold, and human perception
is less sensitive to absolute phase changes in a certain frequency range [2, 154]. Not all of the
customary range of audible frequencies, i.e. 20 Hz to 20 kHz, is suitable for data embedding.
In the higher frequency range (approximately f > 10 kHz) detection of small magnitude changes is unreliable
because the signal energy is insignificant. On the other hand, human perception is more sensitive to phase
distortion in the lower frequency range (approximately f < 4.0 kHz). The frequency range 4.0 < f < 10.0
kHz is therefore suitable for making embedded data imperceptible and robust to standard
manipulations.
The signal content in the above range is partitioned into subband signals using a discrete wavelet
packet analysis filter bank (DWPA-FB). Data is embedded in selected subband signals by
introducing inaudible magnitude and phase distortion using finite-length impulse response (FIR)
approximations of allpass filters (APFs). Data is detected by estimating the pole-zero
parameters of the APF from an estimate of the power spectrum of the audio. In our method the power
spectrum is estimated using parametric signal models, i.e. moving average and autoregressive
models. The performance of this method is evaluated by detecting data that is embedded in an
audio clip using binary and 4-ary encoding and is then subjected to signal manipulations such as
addition of noise, lossy compression, resampling, and random chopping. Compared with existing
methods [62 – 64, 136, 137], the proposed technique is shown to embed 5-8 times more data for
binary encoding and twice as much for 4-ary encoding.
5.1.1.1 FIR APPROXIMATION OF APF
An APF is suitable for data embedding because the phase distortion it introduces in the chosen
frequency range is largely inaudible. Let the frequency response of the APF be H(e^{jω}) = k e^{jφ(ω)}.
Estimation of the APF parameters from the processed audio consists of finding local extrema in the
magnitude spectrum of the processed audio along radial lines through the pole-zero locations of the APF.
The transfer function HAP(z) of a stable and causal first-order allpass filter can be expressed as
$$H_{AP}(z) = \frac{z^{-1} - \alpha^*}{1 - \alpha z^{-1}} \tag{5.1}$$
where α is complex with |α| < 1, and the region of convergence is |α| < |z|. The transfer function of a
higher-order APF can be expressed as a product of first-order allpass sections of the form given in Eq. 5.1.
An allpass filter has an infinite-duration impulse response (IIR). Data is embedded by
introducing controlled phase distortion using a fixed set of pole-zero locations. We use an FIR
approximation of length L of an nth order APF. This introduces both magnitude and phase
distortion. The magnitude distortion tends to zero as L → ∞, as shown below.
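Consider a stable, causal first order APF defined in Eq. 5.1:
$$\begin{aligned}
H_{AP}(z) &= \frac{z^{-1} - \alpha^*}{1 - \alpha z^{-1}} = (z^{-1} - \alpha^*) \times \frac{1}{1 - \alpha z^{-1}} \\
&= (z^{-1} - \alpha^*) \left[ \sum_{k=0}^{\infty} (\alpha z^{-1})^k \right] \\
&= (z^{-1} - \alpha^*) \left[ \sum_{k=0}^{L} (\alpha z^{-1})^k + \sum_{k=L+1}^{\infty} (\alpha z^{-1})^k \right] \\
&= (z^{-1} - \alpha^*) \left[ \sum_{k=0}^{L} (\alpha z^{-1})^k \right] + (z^{-1} - \alpha^*) \left[ \sum_{k=L+1}^{\infty} (\alpha z^{-1})^k \right] \qquad (5.2) \\
&= (z^{-1} - \alpha^*) \left( \frac{1 - (\alpha z^{-1})^{L+1}}{1 - \alpha z^{-1}} \right) + (\alpha z^{-1})^{L+1} \left( \frac{z^{-1} - \alpha^*}{1 - \alpha z^{-1}} \right) \qquad (5.3)
\end{aligned}$$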
The first term on the right-hand side of Eq. 5.3 is the FIR approximation of the APF, referred to as HFIR-AP(z).
HFIR-AP(z) can be expressed as
$$H_{FIR\text{-}AP}(z) = H_{AP}(z) \left( 1 - (\alpha z^{-1})^{L+1} \right) \tag{5.4}$$
The factor $(1 - (\alpha z^{-1})^{L+1})$ introduces L + 1 zeros at $z_i = \alpha e^{j 2\pi i/(L+1)}$, i = 0, 1, …, L. These L + 1 zeros are
uniformly distributed on the circle |z| = |α|. The zero for i = 0, i.e. z0 = α, cancels the pole at the
same location; therefore HFIR-AP(z) has L + 1 zeros altogether, of which L zeros are located at
$z_i = \alpha e^{j 2\pi i/(L+1)}$, i = 1, 2, …, L, and the remaining zero lies at |z| = 1/|α|. The transfer function HAP(z) of a
single pole-zero pair is obtained from HFIR-AP(z) as L goes to infinity. The nature of the distortion
for different L is illustrated in Figure 5.1.
Figure 5.1: Magnitude Response of APF H(ejw) Approximation for Different values of Length (L).
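The behaviour described by Eq. 5.4 can be checked numerically with the Python sketch below. It is an illustration under assumptions: α = 0.9 e^{j0.25π} and the lengths 8 to 64 are example values, and the helper names are hypothetical. Since |HAP| = 1 on the unit circle, the magnitude ripple of the FIR approximation is bounded by |α|^(L+1).

```python
import cmath

def h_ap(z, alpha):
    """First-order allpass of Eq. 5.1."""
    return (1 / z - alpha.conjugate()) / (1 - alpha / z)

def h_fir_ap(z, alpha, L):
    """FIR approximation of Eq. 5.4."""
    return h_ap(z, alpha) * (1 - (alpha / z) ** (L + 1))

def fir_taps(alpha, L):
    """Coefficients of z^0 .. z^-(L+1) of (z^-1 - a*) * sum_{k=0}^{L} (a z^-1)^k."""
    taps = [0j] * (L + 2)
    for k in range(L + 1):
        taps[k] += -alpha.conjugate() * alpha ** k
        taps[k + 1] += alpha ** k
    return taps

def max_ripple(alpha, L, grid=512):
    """Largest deviation of |H_FIR-AP| from 1 on a unit-circle grid."""
    worst = 0.0
    for i in range(grid):
        z = cmath.exp(2j * cmath.pi * i / grid)
        worst = max(worst, abs(abs(h_fir_ap(z, alpha, L)) - 1.0))
    return worst

alpha = 0.9 * cmath.exp(0.25j * cmath.pi)   # example pole location
ripples = {L: max_ripple(alpha, L) for L in (8, 16, 32, 64)}
```

The ripple shrinks as L grows, which is the magnitude distortion tending to zero in the limit L → ∞; `fir_taps` gives the corresponding finite impulse response.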
An nth order APF is used in our method for embedding data. The nth order APF is realized as a
cascade of n/2 second-order allpass filters. The cascade realization reduces the
effect of quantization of the APF coefficients. The parameter αi of the transfer function HAPi(z) used for
data embedding is defined as $\alpha_i = r e^{j\omega_i}$, where 0 < r < 1 and 0 < ωi < π, with i = 0, 1 in binary
encoding and i = 0, 1, 2, 3 in 4-ary encoding. The transfer function HAPi(z) of the APF used for
data embedding is expressed as
$$H_{APi}(z) = \left( \frac{(z^{-1} - \alpha_i^*)(z^{-1} - \alpha_i)}{(1 - \alpha_i z^{-1})(1 - \alpha_i^* z^{-1})} \right)^{n/2}, \quad n = 2, 4, \ldots \tag{5.5}$$
Note that HAPi(z) has n/2 poles at each of αi and αi*, and n/2 zeros at each of 1/αi and 1/αi*.
The parameters αi of the transfer function HAPi(z) used for binary and 4-ary encoding in data
embedding are given in Table 1.
Table 1: APF parameters for binary and 4-ary schemes
Binary Encoding Scheme        4-ary Encoding Scheme
      r      ω                      r      ω
α0    0.9    0.25π            α0    0.95   0.2π
α1    0.9    0.75π            α1    0.95   0.4π
                              α2    0.95   0.6π
                              α3    0.95   0.8π
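For n = 2, Eq. 5.5 reduces to a second-order section with real coefficients, which the Python sketch below computes and checks. It is an illustration only: the binary-scheme values (r = 0.9, ω = 0.25π and 0.75π) are as read from Table 1, and the function names are hypothetical.

```python
import cmath
import math

# Binary-scheme parameters (r, omega) as read from Table 1.
BINARY = {0: (0.9, 0.25 * math.pi), 1: (0.9, 0.75 * math.pi)}

def section_coefficients(r, omega):
    """Numerator b and denominator a (in powers of z^-1) of Eq. 5.5 for n = 2."""
    c1 = 2 * r * math.cos(omega)      # alpha + alpha*
    c2 = r * r                        # |alpha|^2
    return [c2, -c1, 1.0], [1.0, -c1, c2]

def response(b, a, w):
    """Frequency response of the section at angular frequency w."""
    z1 = cmath.exp(-1j * w)           # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den

magnitudes = []
for bit, (r, omega) in BINARY.items():
    b, a = section_coefficients(r, omega)
    magnitudes += [abs(response(b, a, k * math.pi / 8)) for k in range(1, 8)]
```

The numerator is the denominator with its coefficients reversed, so the magnitude response equals 1 at every frequency and only the phase depends on the embedded bit.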
The pole-zero layouts of the 2nd order allpass filters used in binary encoding and 4-ary encoding
are illustrated in Figures 5.2 and 5.3, respectively.
Figure 5.2: Pole-Zero Layout of HAPi(z) for Binary Encoding
Figure 5.3: Pole-Zero Layout of HAPi(z) for 4-ary Encoding
5.1.1.2 DATA EMBEDDING
The data embedding process begins by dividing the input audio signal into non-overlapping
blocks of N samples. Each block is then decomposed into 2^l subbands using an l-level DWPA-FB,
where l and N are positive integers. According to the human auditory perceptual model,
subbands in the 4 to 10 kHz range are relatively insensitive to phase distortion
and robust to data manipulations, so these subbands are selected for data embedding. To this end,
ten subbands (subband #6 to subband #15 at a 44.1 kHz sampling rate with l = 5) are
selected in each block. One bit of data is embedded in each subband for the
binary encoding scheme and two bits for the 4-ary scheme. For example, in binary
encoding, bit ‘m’, m = 0, 1, is embedded by passing a selected subband through an APF with
transfer function Hm(z).
To cope with desynchronization attacks such as signal chopping, synchronization
locations called salient points are identified for inserting data. Salient points are attack-sensitive
locations in the input audio that can be used for synchronization. They correspond to
audio features to which the HAS is sensitive, such as fast energy climbing points. If an adversary
alters these features, audible distortion is introduced [65]. In our method we adopt the salient
point extraction method described by C-P. Wu et al. [65]. In our implementation the thresholds Th1,
Th2 and Th3 of [65] are set so as to ensure 1-2 salient points per second. We set r =
4000 samples, Th1 = 2, Th2 = mean energy of a one-second audio window around n, and
Th3 = 200 samples.
The following steps outline the data-embedding scheme:
o A list of salient points is extracted for a given audio signal using the method described
in [65].
o Starting with the first salient point, the audio signal is segmented into non-overlapping
frames of N samples.
o Each frame is decomposed using a 5-level DWPA-FB and ten subbands (from
subband 6 to 15) are selected for data embedding.
o One bit of channel-encoded data is embedded in each selected subband in the case of
binary encoding, and two bits for 4-ary encoding.
o All frames that contain a salient point are embedded with a synchronization code (using
a suitable bit sequence).
o Finally, each frame is re-synthesized using a discrete wavelet packet synthesis filter bank
(DWPS-FB).
A detailed block diagram of the data embedding process is given in Figure 5.4.
[Figure 5.4 block diagram. Blocks: input audio Do; audio segmentation; subband decomposition using DWPA-FB (Sb0 … Sb31); message (100101101...); channel encoding; APF selection algorithm; APF hi(n) applied to Sb6 … Sb15 (i = 0, 1 for the binary scheme; i = 0, 1, 2, 3 for the 4-ary scheme), giving DSb6 … DSb15; subband recomposition using DWPS-FB; data merging; data-embedded audio Dm. Legend: APF: Allpass Filter; DWPA-FB: Discrete Wavelet Packet Analysis Filter Bank; DWPS-FB: Discrete Wavelet Packet Synthesis Filter Bank; Sb: Subband; DSb: Data Embedded Subband.]
Figure 5.4: Block Diagram of the Data Embedding Scheme
5.1.1.3 DATA DETECTION USING SIGNAL MODELING
The detector first analyzes the data-embedded input audio to extract the list of salient points
using the method described in [65]. Then, starting from the first salient point, the input audio
signal is segmented into non-overlapping frames of N samples. Each frame is decomposed into
subband signals using a 5-level DWPA-FB (as in data embedding), after which ten
subbands (#6 to #15) are selected for data recovery. Data recovery consists of power
spectrum estimation of the selected subband signals using a priori knowledge of the signal model,
followed by APF parameter estimation to recover the embedded information.
5.1.1.3.1 Spectrum Estimation
Our parametric spectrum estimation approach assumes an appropriate model of the process based
on a priori knowledge of the signal. We know that the subband signals have been processed by
an FIR approximation of an APF. Therefore, both autoregressive (AR) and moving average
(MA) signal models of sufficient order can be used [1] for spectrum estimation.
An autoregressive process x(n) can be represented as the output of an all-pole filter excited by
unit-variance white noise. The estimated power spectrum of a pth order AR process is
$$\hat{P}_{AR}(e^{j\omega}) = \frac{|\hat{b}(0)|^2}{\left| 1 + \sum_{k=1}^{p} \hat{a}_p(k) \, e^{-jk\omega} \right|^2} \tag{5.6}$$
where âp(k) and b̂(0) are the estimates of the process model parameters. These p+1 estimates can
be obtained from the data using methods such as the autocorrelation method, the covariance method, the modified
covariance method, or the Burg algorithm [1]. We use the Burg algorithm for pth order AR model
parameter estimation.
A moving average process x(n) can be generated by exciting a qth order FIR filter with unit-variance
white noise. The estimated power spectrum of a qth order MA process is
$$\hat{P}_{MA}(e^{j\omega}) = \left| \sum_{k=0}^{q} \hat{b}_q(k) \, e^{-jk\omega} \right|^2 \tag{5.7}$$
where b̂q(k) are the estimates of the process model parameters. We use Durbin’s method [1] for
qth order MA model parameter estimation.
The next step in data recovery is to estimate the APF parameter α̂(r, ω) from the estimated
spectrum P̂(e^{jω}).
5.1.1.3.2 Allpass Filter Parameter Estimation
To estimate the APF parameter α̂(r, ω) we need to estimate r and ω from the estimated spectrum
P̂(e^{jω}), since α is a function of r and ω. In our method r is fixed for all APF parameters and only the
frequency ω is varied for the information encoding schemes given in Table 1. Therefore, only the
frequency ω needs to be estimated. It is estimated from the
spectrum P̂(e^{jω}) based on the properties of the FIR approximation of the APF (discussed in
Section 5.1.1.1).
As shown in Section 5.1.1.1, the FIR approximation introduces magnitude distortion that
manifests as a local extremum at z = e^{jωi}. The extremum becomes more pronounced as we
traverse from r e^{jωi} to (1/r) e^{jωi}. Moreover, this extremum is stronger and more evident for small values
of the length L of the FIR approximation to the APF (as illustrated in Figure 5.1). Therefore, to
estimate the frequency ω we need to find consistent local extrema in the estimated
spectrum P̂(e^{jω}). For a more accurate estimate of ω, spectra based on both the AR and
the MA signal models are used. Finally, nearest neighborhood hypothesis testing is applied to
decode the embedded information.
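Hypothesis testing for the binary decoding scheme, with Th = 0.15π:
$$\begin{aligned} H_0 &: |\hat{\omega} - \omega_0| < Th \;\Rightarrow\; 0 \\ H_1 &: |\hat{\omega} - \omega_1| < Th \;\Rightarrow\; 1 \end{aligned} \tag{5.8}$$
Hypothesis testing for the 4-ary decoding scheme, with Th = 0.05π:
$$\begin{aligned} H_0 &: |\hat{\omega} - \omega_0| < Th \;\Rightarrow\; 00 \\ H_1 &: |\hat{\omega} - \omega_1| < Th \;\Rightarrow\; 01 \\ H_2 &: |\hat{\omega} - \omega_2| < Th \;\Rightarrow\; 10 \\ H_3 &: |\hat{\omega} - \omega_3| < Th \;\Rightarrow\; 11 \end{aligned} \tag{5.9}$$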
After estimating the coded bit sequence, channel decoding is applied to recover the original
message. The block diagram in Figure 5.5 illustrates the data detection in detail.
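The nearest-neighbour decision of Eq. 5.8 and Eq. 5.9 can be sketched in Python as follows. The frequency sets and thresholds are those of Table 1 and of the two equations; the function name `decode` and the convention of returning None when no hypothesis is satisfied are assumptions made for this illustration.

```python
import math

# Candidate frequencies per symbol (Table 1) and thresholds (Eq. 5.8, Eq. 5.9).
BINARY = {"0": 0.25 * math.pi, "1": 0.75 * math.pi}
BINARY_TH = 0.15 * math.pi
QUATERNARY = {"00": 0.2 * math.pi, "01": 0.4 * math.pi,
              "10": 0.6 * math.pi, "11": 0.8 * math.pi}
QUATERNARY_TH = 0.05 * math.pi

def decode(omega_hat, table, threshold):
    """Return the symbol whose frequency is nearest to omega_hat,
    or None if no candidate lies within the threshold."""
    symbol, omega = min(table.items(), key=lambda kv: abs(omega_hat - kv[1]))
    return symbol if abs(omega_hat - omega) < threshold else None

decisions = [
    decode(0.27 * math.pi, BINARY, BINARY_TH),          # close to omega_0
    decode(0.58 * math.pi, QUATERNARY, QUATERNARY_TH),  # close to omega_2
    decode(0.50 * math.pi, QUATERNARY, QUATERNARY_TH),  # between candidates
]
```

An estimate that falls outside every threshold is left undecided here; how such cases are handled before channel decoding is not specified in the text.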
[Figure 5.5 block diagram. Blocks: data-embedded audio x(n); audio segmentation; subband decomposition using DWPA-FB (Sb6 … Sb15); parametric spectrum estimation (PAR(ejw), PMA(ejw)); APF parameter estimation (E_w); hypothesis testing Hi: |E_w - wi| < Th (i = 0, 1 for the binary case; i = 0, 1, 2, 3 for the 4-ary case); removing synchronization code; channel decoding; recovered message (100101101...). Legend: APF: Allpass Filter; DWPA-FB: Discrete Wavelet Packet Analysis Filter Bank; Sb: Subband; E_w: Estimated Frequency; Th: Threshold.]
Figure 5.5: Block Diagram of the Data Detection Process
5.1.1.3.3 Simulation Results
Imperceptibility and robustness are the two benchmarks used for performance evaluation of the
proposed data hiding scheme. Robustness is measured by the probability of error in the
received data under different degradations, a) noise addition, b) lossy compression, c) random
chopping, and d) resampling, for both encoding schemes. The probability of error Pe (in percent) is defined as
$$P_e = \left( 1 - \frac{\text{Number of Bits Correctly Detected}}{\text{Number of Bits Embedded}} \right) \times 100 \tag{5.10}$$
The robustness tests gave the following observations:
o White Gaussian noise with 0% to 100% of the audio power is added to the data-embedded
audio signal. The probability of error Pe of the recovered data for different
values of signal-to-noise ratio (SNR in dB) and for both encoding schemes is given in
Figure 5.6.
Figure 5.6: Probability of error (Pe) vs. SNR Plot for Both Encoding Schemes
o Data embedded audio signal is compressed using MPEG layer III coder [165]. Despite
the lossy compression the Pe value of the recovered data was below 1% for binary
encoding scheme. The Pe value for 4-ary encoding was higher.
o To test the robustness against desynchronization attacks, 2- 4 samples out of every 100
samples of the data-embedded audio are dropped randomly, probability of error Pe for
the resulting audio was error for both encoding schemes.
o In this test the data-embedded audio is first down-sampled to 22.05 kHz and then
interpolated to 44.1 kHz. The probability of error Pe of the recovered data after resampling
was 0% for the binary encoding case and 0.5% for the 4-ary encoding case.
5.1.1.4 DATA DETECTION USING MATCHED FILTER
The recovery of embedded data requires detection of the APF parameter pi used for data
embedding in each subband. The first step in data detection is the analysis of the data-embedded
audio to extract the set of salient points (as discussed in Section 2). Starting from the first salient
point in the set, the audio signal is segmented into non-overlapping frames of P samples. Each
frame is decomposed using a 5-level DWPA-FB. Then twelve subbands, i.e. sb4 to sb15, are
selected for data detection. The detector evaluates the finite-length Z-transform of the selected
subband at all possible values of the APF parameter (i.e. at p0 and p1 in our implementation). The
detector does not know in advance which APF was used for data embedding in the received
subband sequence; however, it knows the parameters of the APFs used for data
embedding, i.e. p0 and p1. The decision for bit ‘0’ or bit ‘1’ is made by calculating the Z-transform
of the subband sequence at 1/p0 and 1/p1 (zero-tracking) and then estimating the local
minima of the magnitude spectrum.
In practice zero-tracking is generally used for APF parameter estimation because theoretically
output of a stable and causal APF is an infinite sequence, i.e. yk,j(n) output of the APF in Eq. 3,
which is the convolution of a finite length input sequence xk,j(n) and an infinite sequence h0(n)
(as Hi(z) a rational function of z). But for APF parameter estimation only a finite length sequence
is available at the detector input.
Let ўk,j(n) be the finite-length approximation of the infinite sequence yk,j(n) that is available
at the detector input, obtained by dropping the higher-indexed terms. This approximation is
valid only if Yk,j(z) converges, which is true for z ∈ C with z = 1/p0 (the zeros of the APF).
This fact is illustrated in Figure 5.7. Figure 5.7 (right) shows the chirp Z-transform (CZT) of a
finite-length subband sequence x4,6(n) before (top) and after (bottom) passing through the APF
H0(z), calculated at r = 1/0.9; a clear minimum occurs at ω = ω0 ≈ 0.25π in the bottom plot.
Similarly, Figure 5.7 (left) shows the CZT of the same sequence before (top) and after (bottom)
passing through H0(z), calculated at r = 0.9, but there is no maximum at ω = ω0 ≈ 0.25π. A likely
reason is that the finite-length approximation of an infinite-length sequence is not accurate at
z = p0.
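One way to see why evaluation at the zero is well behaved while evaluation at the pole is not is sketched below. The derivation rests on assumptions that are not stated in the text: the input has N samples, the APF poles are distinct with radius r < 1, and C is a constant that depends on the filter state at n = N. The truncated sequence is written with a tilde.

```latex
\begin{align}
Y_{k,j}(z) &= H_0(z)\,X_{k,j}(z) = \sum_{n=0}^{\infty} y_{k,j}(n)\,z^{-n}, \qquad |z| > r, \\
\tilde{Y}_{k,j}(z) &= \sum_{n=0}^{N-1} y_{k,j}(n)\,z^{-n}
                    = Y_{k,j}(z) - \sum_{n=N}^{\infty} y_{k,j}(n)\,z^{-n}.
\end{align}
% For n >= N the input is zero, so y_{k,j}(n) is a free response of the APF:
%   |y_{k,j}(n)| <= C r^n.
% At the zero z = 1/p_0 we have |z^{-n}| = r^n and H_0(1/p_0) = 0, hence
\begin{equation}
\bigl|\tilde{Y}_{k,j}(1/p_0)\bigr|
  = \Bigl|\sum_{n=N}^{\infty} y_{k,j}(n)\,p_0^{\,n}\Bigr|
  \le C \sum_{n=N}^{\infty} r^{2n}
  = \frac{C\,r^{2N}}{1-r^{2}} .
\end{equation}
% At the pole z = p_0 we have |z^{-n}| = r^{-n}, so |y_{k,j}(n) z^{-n}| is bounded by C
% but does not tend to zero in general: the infinite sum does not converge, and the
% truncated sum does not approximate any finite value Y_{k,j}(p_0).
```

Under these assumptions the truncation error at the zero shrinks geometrically with the frame length; for instance, with r = 0.9 and a hypothetical N = 64, the factor r^{2N} is about 1.4 × 10^-6.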
[Figure 5.7 panels: CZT of x4,6(n) at r = 0.9 (left, top); CZT of y4,6(n) at r = 0.9 (left, bottom); CZT of x4,6(n) at r = 1/0.9 (right, top); CZT of y4,6(n) at r = 1/0.9 (right, bottom). Horizontal axis: normalized frequency f; vertical axes: |X4,6(f)| and |Y4,6(f)| in dB.]
Figure 5.7: Magnitude spectrum of the CZT of the subband sequence x4,6(n) before and after passing through H0(z), i.e. y4,6(n), at r = 0.9 (left) and at r = 1/0.9 (right).
The detector therefore uses zero-tracking for APF parameter estimation in data detection. Since
r = 0.9 is fixed in our implementation, only ωi needs to be estimated. This is done by locating
the local minimum of the magnitude response of the CZT of the selected subband, calculated at
r = 1/0.9; bit ‘0’ or bit ‘1’ is then decided using the nearest-neighbor rule. Finally, the received
data is channel decoded to recover the embedded information. The block diagram in Figure 5.8
illustrates the data detection process in detail.
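The estimation step can be sketched as a scan of the magnitude spectrum on the circle |z| = 1/r, followed by a threshold test on the angle of the minimum. The sketch below is a minimal illustration under stated assumptions, not the detector used for the reported results: the second-order all-pass section, the value of ω1, the tolerance `EPS`, the 513-point angular grid, and all function names are hypothetical.

```python
import cmath
import math
import random

R = 0.9                  # pole radius (fixed at 0.9 in the text)
W0 = 0.25 * math.pi      # angle associated with bit 0 (text: w0 ~ 0.25*pi)
W1 = 0.75 * math.pi      # angle associated with bit 1 (hypothetical value)
EPS = 0.05 * math.pi     # decision tolerance (hypothetical value)


def apf_filter(x, r, w):
    """Second-order all-pass section with poles at r*exp(+/-jw), output truncated."""
    b, c = 2.0 * r * math.cos(w), r * r
    y = []
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(c * x[n] - b * x1 + x2 + b * y1 - c * y2)
    return y


def czt_magnitude(seq, radius, n_points=513):
    """Return (angle, |Y(z)|) pairs for z = radius*exp(jw), w on a grid over [0, pi]."""
    out = []
    for k in range(n_points):
        w = math.pi * k / (n_points - 1)
        zinv = 1.0 / cmath.rect(radius, w)
        acc, power = 0j, 1 + 0j
        for v in seq:
            acc += v * power
            power *= zinv
        out.append((w, abs(acc)))
    return out


def detect_bit_czt(y, r=R, w0=W0, w1=W1, eps=EPS):
    """Locate the spectral minimum on |z| = 1/r and compare its angle with w0 and w1."""
    w_hat = min(czt_magnitude(y, 1.0 / r), key=lambda pair: pair[1])[0]
    if abs(w_hat - w0) < eps:
        return 0, w_hat
    if abs(w_hat - w1) < eps:
        return 1, w_hat
    return None, w_hat       # no decision: minimum is far from both candidates


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(64)]   # stand-in subband sequence
    print(detect_bit_czt(apf_filter(x, R, W0)))
```

Returning `None` when the minimum is far from both candidate angles leaves the erasure handling to the channel decoder; whether the original detector does this is not stated in the text.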
[Figure 5.8 block diagram. Blocks: Data-Embedded Audio; Audio Content Analysis for Salient Point Extraction (SPk, k = 1~m); Audio Segmentation from SP1 (x(k,n), n = 1~L); Sub-Band Decomposition (sb0 to sb31, with sb4 to sb15 passed on); Chirp Z-Transform at 1/r and Estimate Minima ω for each selected subband (0: |ω − ω0| < ε; 1: |ω − ω1| < ε); Removing Synchronization Code; Channel Decoding; Recovered Message 101101001...]
Figure 5.8: Block Diagram of the Data Detection Using Match Filter
5.1.1.4.1 Simulation Results
The proposed data hiding scheme divides the input audio into 10 msec frames. Twelve subbands
(as discussed in Section 3) are selected after subband decomposition for data hiding. Hence the
embedding rate is 12 × 100 = 1200 bps, which is 10 to 15 times higher than that of existing data
embedding methods [62 – 65, 136]. We applied the proposed data hiding scheme to a variety of
music clips with diverse frequency characteristics. Imperceptibility and robustness are the two
benchmarks used for performance evaluation of the proposed data hiding scheme. Robustness is
measured by the probability of error in the received data under different constraints: a) noise
addition, b) lossy compression, c) random chopping, and d) resampling.
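The embedding-rate figure follows from simple arithmetic, spelled out in the short sketch below. It assumes one bit per selected subband per frame, which is what 12 × 100 implies for the binary case, and a 44.1 kHz sampling rate as used in the resampling tests. The function names are hypothetical.

```python
def embedding_rate_bps(frame_ms, n_subbands, bits_per_subband=1):
    """Bits per second when every frame carries bits_per_subband bits in each subband."""
    frames_per_second = 1000.0 / frame_ms
    return frames_per_second * n_subbands * bits_per_subband


def samples_per_frame(sample_rate_hz, frame_ms):
    """Number of audio samples in one frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms / 1000.0


if __name__ == "__main__":
    print(embedding_rate_bps(frame_ms=10, n_subbands=12))        # 10 msec frames, 12 subbands
    print(samples_per_frame(sample_rate_hz=44100, frame_ms=10))  # assumed 44.1 kHz audio
```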
For the robustness tests we have the following observations:
o White Gaussian noise with 0% to 30% of the audio power is added to the data-embedded
audio signal. The probability of error of the recovered data for different
values of signal-to-noise ratio (SNR in dB) is plotted in Figure 5.9.
[Plot: Probability of Error vs SNR. Horizontal axis: SNR, with tick labels 5.23, 6.78, 7.45, 8.24, 9.20, 10.46, 12.22, 15.23, Inf; vertical axis: probability of error, from 0 to 0.08.]
Figure 5.9: Probability of Error for different SNR values.
o The data-embedded audio signal is compressed using an MPEG layer III coder. Despite the
lossy compression, the Pe value of the recovered data was below 2%.
o To test robustness against desynchronization attacks, one sample out of every 100
samples of the data-embedded audio is dropped. The probability of error was 1.69%.
o In this test, data-embedded audio is down-sampled to 22.05 kHz and then interpolated back
to 44.1 kHz. The probability of error of the recovered data after resampling was 2.01%.
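Two of the manipulations listed above, noise addition and sample dropping, are simple enough to sketch, together with the error measure. The code below is a hypothetical test harness, not the one used to produce the reported numbers, and all function names are assumptions. The relation SNR = 10·log10(1/q) for noise at a fraction q of the audio power is standard; it gives about 5.23 dB for q = 0.30, which is consistent with the first tick label of Figure 5.9. MPEG layer III compression and resampling are not sketched here.

```python
import math
import random


def snr_db(noise_fraction):
    """SNR in dB when the noise power equals noise_fraction times the signal power."""
    if noise_fraction == 0:
        return math.inf
    return 10.0 * math.log10(1.0 / noise_fraction)


def add_white_noise(x, noise_fraction, rng):
    """Add white Gaussian noise whose power is noise_fraction times the power of x."""
    power = sum(v * v for v in x) / len(x)
    sigma = math.sqrt(noise_fraction * power)
    return [v + rng.gauss(0.0, sigma) for v in x]


def drop_samples(x, period=100):
    """Desynchronization attack: drop one sample out of every `period` samples."""
    return [v for i, v in enumerate(x) if (i + 1) % period != 0]


def probability_of_error(sent_bits, received_bits):
    """Fraction of bit positions in which the received bits differ from the sent bits."""
    errors = sum(1 for a, b in zip(sent_bits, received_bits) if a != b)
    return errors / len(sent_bits)


if __name__ == "__main__":
    rng = random.Random(0)
    audio = [math.sin(0.01 * n) for n in range(20000)]   # stand-in for an audio clip
    noisy = add_white_noise(audio, 0.30, rng)
    chopped = drop_samples(audio, period=100)
    print(round(snr_db(0.30), 2), len(noisy), len(chopped))
```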
5.2 Future Direction
We propose a novel method of high-capacity data hiding based on controlled inaudible
distortion introduced in selected subbands of an audio signal using FIR approximations of an
nth-order APF. The proposed technique is robust to standard data manipulations, yielding a low
error probability. The error probability can be reduced further by using channel coding
with higher error-correction capability. Performance has so far been evaluated with informal
listening tests, and we are seeking approval to conduct formal subjective listening tests. We are
currently investigating the potential of the proposed scheme for copy-control and digital
watermarking applications, for audio as well as for other multimedia data types such as images
and video. A few intended extensions of the proposed scheme are discussed next.
5.2.1 Extension to Audio Fingerprinting and Authentication
We are currently investigating how to extend our frequency-selective phase alteration based data
hiding schemes [75, 76] to a collusion-resistant audio fingerprinting system. A fingerprinting
application uses an informed detector, i.e. the host data is available at the detector, which will
further improve detector performance. The proposed schemes [75, 76] can withstand noise
addition, compression, random chopping, filtering, etc.; the extended scheme can therefore
resist common data hiding attacks [90 – 103]. Traitor tracing and robustness against
collusion attacks are the main requirements of a good fingerprinting scheme. To extend the
existing data hiding schemes [75, 76] with these features, we are investigating an efficient
metadata (fingerprint) embedding strategy in the available 2^l subbands using multiple-level
embedding [6], and we are considering the Viterbi algorithm or a similar algorithm for data
decoding. Moreover, audio fingerprinting and authentication applications will require only a
lower-order APF for information embedding: these applications generally use an informed
detector, so a small embedding distortion, achievable with a lower-order APF, is sufficient for
the detector to decode the data.
Chapter Summary
This chapter gives an insight into informed embedding, a class of data hiding schemes. We briefly
discuss the existing data hiding schemes that fall in this category. Performance analysis of
host interference rejection based data hiding schemes such as QIM and its practical extension,
binary dither modulation, is provided in Section 5.2. In Section 5.3, our contribution to this class
of data embedding is discussed in detail. Robustness, fidelity and high data rate are the
attractive features of the proposed scheme. Simulation results for the robustness of the proposed
scheme against common signal manipulations, with detection based on 1) signal modeling and
2) match filter, are also provided. Extensions of the proposed schemes to audio fingerprinting
and authentication are proposed as future directions.
CHAPTER 6
Conclusion & Future Directions
In this dissertation we propose to design an analytical framework to support a data hiding system
for digital rights management of multimedia archives. We study some salient features of the two
main data hiding classes based on the embedding method, that is, blind (additive) embedding
and informed (host signal interference rejection based) embedding. We highlight the
limitations of the existing additive data hiding schemes, such as low embedding capacity and
limited robustness against attacks. We then discuss the theoretical performance of informed
embedding schemes [69 – 76] to compare the performance of this class with blind embedding.
The fidelity constraint of data embedding schemes limits the embedding capacity for both data
embedding classes. We have already developed preliminary data hiding systems for both data
hiding categories. High data embedding capacity and low perceptual distortion are the attractive
features of the proposed systems [68, 75, 76].
In the future, we intend to extend our proposed data hiding schemes to develop a data hiding
system for digital rights management of multimedia data. To meet the technological challenges
of a reliable DRM system we want to expand our research in the following directions:
o Develop attack channel models and analyze the associated performance of the proposed
data hiding schemes.
o Devise a theoretical framework for the proposed data hiding schemes and compare their
performance with that of the existing data hiding schemes.
o Develop a data hiding system for enhanced QoS of multimedia communication over
lossy and bursty communication channels, and then analyze the associated performance
of the proposed system for real-world communication channels.
o Develop and analyze a data hiding scheme for data embedding in the compressed domain
for each data type (audio, images, and video).
o Develop a robust fingerprinting scheme for secure music distribution on the Web and
analyze its performance against different types of collusion attacks.
o Develop a data hiding scheme for music transmission over a wireless channel
where the intended recipients (listeners or subscribers) can enjoy hi-fi music
while the general public hears ordinary-quality music at the same time.
o Explore the possibility of extending data hiding schemes to multimedia indexing and
retrieval applications.
References:
Books:
[1] M. H. Hayes, “Statistical Digital Signal Processing and Modeling,” John Wiley & Sons, Inc., NY, 1996.
[2] E. Zwicker, and H. Fastl, “Psychoacoustics: Facts and Models,” Springer-Verlag, Berlin, 1999.
[3] N. S. Jayant and P. Noll, “Digital Coding of Waveforms: Principles and Applications to Speech and Video,” Englewood Cliffs, NJ: Prentice-Hall, 1984.
[4] T. M. Cover, and J. A. Thomas, “Elements of Information Theory,” John Wiley & Sons, New York, 1991.
[5] J. Eggers and B. Girod, “Informed Watermarking,” Kluwer Academic Publishers, 2002.
[6] M. Wu, and B. Liu, “Multimedia Data Hiding,” Springer Verlag, Oct 2002.
[7] I. J. Cox, M. L. Miller, and J. A. Bloom, “Digital Watermarking,” Morgan Kaufmann, 2001.
[8] Neil F. Johnson, Zoran Duric, and Sushil Jajodia, “Information Hiding: Steganography and Watermarking - Attacks and Countermeasures,” Kluwer Academic Publishers, 2000.
Data Hiding Journals & Special Issues:
[9] F. A. P. Petitcolas, and H. J. Kim, editors, Digital Watermarking, Proceedings of the 1st International Workshop on Digital Watermarking, Lecture Notes in Computer Science, Vol. 2613, Seoul, Korea, Nov. 2002.
[10] J. Feigenbaum, editor, “Digital Rights Management,” Proceedings of the ACM CCS-9 Workshop on Digital Rights Management, Lecture Notes in Computer Science, Vol. 2696, Washington, DC, USA, Nov. 2002.
[11] F. A. P. Petitcolas, editor, Information Hiding, Proceedings of the 5th International Workshop on Information Hiding, Lecture Notes in Computer Science, Vol. 2578, Noordwijkerhout, The Netherlands, Oct 2002.
[12] I. S. Moskowitz, editor, Information Hiding, Proceedings of the 4th International Workshop on Information Hiding, Lecture Notes in Computer Science, Vol. 2137, Pittsburgh, PA, April 2001.
[13] A. Pfitzmann, editor, Information Hiding, Proceedings of the 3rd International Workshop on Information Hiding, Lecture Notes in Computer Science, Vol. 1768, Dresden, Germany, Sep/Oct 1999.
[14] D. Aucsmith, editor, Information Hiding, Proceedings of the 2nd International Workshop on Information Hiding, Lecture Notes in Computer Science, Vol. 1525, Portland, OR, April 1998.
[15] R. Anderson, editor, Information Hiding, Proceedings of the 1st International Workshop on Information Hiding, Lecture Notes in Computer Science, Vol. 1174, Cambridge, UK, May/April 1996.
[16] Proceedings of the SPIE/IS&T Inter. Conf. on Security and Watermarking of Multimedia Contents III, Vol. 4314, San Jose, CA, Jan 2001.
[17] Proceedings of the SPIE/IS&T Inter. Conf. on Security and Watermarking of Multimedia Contents II, Vol. 3971, San Jose, CA, Jan 2000.
[18] Proceedings of the SPIE/IS&T Inter. Conf. on Security and Watermarking of Multimedia Contents, Vol. 3657, San Jose, CA, Jan 1999.
[19] IEEE Trans. on Signal Processing, Special Issue on Signal Processing for Data Hiding in Digital Media & Secure Content Delivery, Vol. 51(4), April 2003.
[20] IEEE Communications Magazine, Aug 2001.
[21] IEEE Signal Processing Magazine, Sep 2000.
[22] Proceedings of the IEEE, Special Issue on Identification and Protection of Multimedia Information, 87(7), July 1999.
[23] Signal Processing, Special Issue on Watermarking, Vol. 66(3), May 1998.
[24] IEEE J. Select. Areas Communications, Special Issue on Copyright and Privacy Protection, 16(4), May 1998.
PhD Dissertations:
[25] D. Karakos, “Digital Watermarking, Fingerprinting, and Compression: An Information-Theoretic Perspective,” Ph. D. Dissertation, University of Maryland, College Park, June 2002.
[26] J. Song, “Optimal Rate Allocation and Security Schemes for Image and Video Transmission over Wireless Channels,” Ph. D. Dissertation, University of Maryland, College Park, June 2002.
[27] A. Cohen, “Information Theoretic Analysis of Watermarking Systems,” Ph. D. Dissertation, MIT, Sept 2001.
[28] M. Wu, “Multimedia Data Hiding,” Ph. D. Dissertation, Princeton University, April 2001.
[29] C-Y. Lin, “Watermarking and Digital Signature Techniques for Multimedia Authentication and Copyright Protection,” Ph. D. Dissertation, Columbia University, Dec. 2000.
[30] B. Chen, “Design and Analysis of Digital Watermarking, Information Embedding and Data Hiding Systems,” Ph. D. Dissertation, MIT, June 2000.
[31] D. Kundur, “Multiresolution Digital Watermarking: Algorithms and Implications for Multimedia Signals,” Ph. D. Dissertation, University of Toronto, 1999.
[32] M. Ramkumar, “Data Hiding in Multimedia – Theory and Applications,” Ph. D. Dissertation, New Jersey Institute of Technology, Nov. 1999.
[33] L. Qiao, “Multimedia Security and Copyright Protection,” Ph. D. Dissertation, University of Illinois at Urbana-Champaign, 1998.
Communication Theory:
[34] C. E. Shannon, “Channels with Side Information at the Transmitter,” IBM J. Res. Develop., 2: 289-293, 1958.
[35] G. I. Gel’fand, and M. S. Pinsker, “Coding for Channels with Random Parameters,” Problems of Control and Information Theory, 9(1):19-31, 1980.
[36] C. Heegard, and A. A. El Gamal, “On the Capacity of the Memory with Defects,” IEEE Trans. Info. Theory, 29(5): 731-739, Sept 1983.
[37] M. H. M. Costa, “Writing on Dirty Paper,” IEEE Trans. Info. Theory, 29(3): 439-441, May 1983.
[38] R. L. Pickholtz, D. L. Schilling, and L. B. Milstein, “Theory of Spread Spectrum Communications-A Tutorial,” IEEE Trans. on Communications, vol. COM-30, pp. 855-884, May 1982.
Information-Theoretic Analysis:
[39] P. Moulin, M. K. Mihcak, and G.-I. Lin, “An Information-Theoretic Model for Image Watermarking and Data Hiding,” Proc. IEEE Inter. Conf. on Image Proc., Vancouver, B.C., Sep 2000.
[40] P. Moulin and J. A. O'Sullivan, “Information-Theoretic Analysis of Information Hiding,” IEEE Trans. on Information Theory, Vol. 49, No. 3, pp. 563-593, March 2003.
[41] P. Moulin, “The Role of Information Theory in Watermarking and Its Application to Image Watermarking,” Signal Processing, Vol. 81, No. 6, pp. 1121-1139, June 2001.
[42] A. S. Cohen, and R. Zamir, “Writing on Dirty Paper in the Presence of Difference Set Noise,” 41st Annual Allerton Conf. on Comm. Control and Computing, October 2003.
[43] R. Zamir, and A. S. Cohen, “The Rate Loss in Writing on Dirty Paper,” DIMACS Workshop on Network Information Theory, Rutgers University, March 2003.
[44] A. S. Cohen, and A. Lapidoth, “Generalized Writing on Dirty Paper,” International Symposium on Information Theory (ISIT), p. 227, Lausanne, Switzerland, July 2002.
[45] A. S. Cohen, and A. Lapidoth, “Watermarking Capacity for Gaussian Sources,” 39th Annual Allerton Conference on Communication, Control and Computing, October 2001.
[46] A. S. Cohen, and A. Lapidoth, “The Capacity of the Vector Gaussian Watermarking Game,” Inter. Symposium on Info. Theory (ISIT), p. 4, Washington, DC, June 2001.
Data Hiding Game:
[47] P. Moulin, “Information-Hiding Games,” 1st Workshop on Digital Watermarking, Lecture Notes in Computer Sciences, Vol. 2613, Seoul, Korea, Nov 2003.
[48] P. Moulin, and A. Ivanovic, “The Zero-Rate Spread-Spectrum Watermarking Game,” IEEE Trans. on Signal Processing, Vol. 51, No. 4, pp. 1098-1117, April 2003.
[49] T. Liu, and P. Moulin, “Error Exponents for Watermarking Game with Squared-Error Constraints,” IEEE Proc. Int. Symp. on Info. Theory, Yokohama, Japan, July 2003.
[50] T. Liu, and P. Moulin, “Error Exponents for One-Bit Watermarking,” IEEE Proc. ICASSP, Hong Kong, April 2003.
[51] P. Moulin, “A Mathematical Approach to Watermarking and Data Hiding,” ICASSP Tutorial, Orlando, FL, May 2002.
[52] P. Moulin and M. K. Mihcak, “A Framework for Evaluating the Data-Hiding Capacity of Image Sources,” IEEE Trans. on Image Processing, Vol. 11, No. 9, pp. 1029-1042, Sep. 2002.
[53] P. Moulin, and M. K. Mihcak, “The Parallel-Gaussian Watermarking Game,” UIUC Tech. Rep. UIUC-ENG-01-2214, IEEE Trans. on Information Theory, Feb. 2004.
[54] P. Moulin and A. Ivanovic, “The Watermark Selection Game,” Proc. Conf. on Info. Sciences and Systems, Baltimore, MD, March 2001.
[55] A. S. Cohen, and A. Lapidoth, “The Gaussian Watermarking Game,” IEEE Trans. on Info. Theory, Vol. 48(6), pp. 1639-1667, June 2002.
[56] A. Cohen, and A. Lapidoth, “On the Gaussian Watermarking Game,” Inter. Symposium on Info. Theory (ISIT), p. 48, Sorrento, Italy, June 2000.
Blind Embedding:
[57] J. J. Eggers, J. K. Su, and B. Girod, “A Blind Watermarking Scheme Based on Structured Code Books,” Proc. IEE Conference on Secure Images and Image Authentication, London, U.K., April 2000.
[58] J. J. Eggers, J. K. Su, and B. Girod, “Robustness of a Blind Watermarking Scheme,” Proc. IEEE International Conference on Image Processing, ICIP-2000, Vancouver, Canada, Sept 2000.
[59] I. J. Cox, J. Kilian, T. Leighton, and T. Shamoon, “Secure Spread Spectrum Watermarking for Multimedia,” IEEE Trans. on Image Processing, Vol. 6(12), pp. 1673-1687, 1997.
[60] R. B. Wolfgang, C. I. Podilchuk, and E. J. Delp, “Perceptual Watermarks for Digital Images and Video,” Proc. IEEE, Vol. 87(7), pp. 1108-1126, July 1999.
[61] M. D. Swanson, B. Zhu, A. H. Tewfik, and L. Boney, “Robust audio watermarking using perceptual masking,” Signal Processing, vol. 66, pp. 337-355, 1998.
[62] P. Bassia and I. Pitas, “Robust audio watermarking in the time domain,” Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP99), 1999.
[63] M. F. Mansour, and A. H. Tewfik, “Time-scale invariant audio data embedding,” Proc. IEEE International Conference on Multimedia and Expo, ICME, Japan, August 2001.
[64] Y. Yardimci, A. E. Cetin, and R. Ansari, “Data hiding in speech using phase coding,” Proc. Eurospeech Conference, 1997.
[65] C.-P. Wu, P.-C. Su, and C.-C. J. Kuo, “Robust Audio Watermarking for Copyright Protection,” SPIE's 44th Annual Meeting Advanced Signal Processing Algorithms, Architectures, and Implementations, July 1999.
[66] D. Kirovski, and H. S. Malvar, “Spread Spectrum Watermarking of Audio Signals,” IEEE Trans. Signal Proc., Vol. 51, no. 4, pp. 1020-1033, April 2003.
[67] R. A. Garcia, “Digital Watermarking of Audio Signals using Psychoacoustic Auditory Model and Spread Spectrum Theory,” 107th Convention, AES, New York, September 1999.
[68] H. Malik, A. Khokhar, and R. Ansari, “Robust Audio Watermarking using Frequency Selective Spread Spectrum Theory,” accepted for Proc. ICASSP’04, Montreal, Quebec, Canada, May 17-21, 2004.
Informed Embedding:
[69] M. L. Miller, I. J. Cox, and J. A. Bloom, “Informed Embedding: Exploiting Image and Detector Information During Watermark Insertion,” Proc. IEEE Inter. Conf. on Image Processing - ICIP, 2000.
[70] J. K. Su, J. J. Eggers, and B. Girod, “Illustration of the Duality Between Channel Coding and Rate Distortion with Side Information,” Proc. 2000 Asilomar Conference on Signals and Systems, Pacific Grove, CA, USA, Oct 2000.
[71] I. J. Cox, M. L. Miller, and A. L. McKellips, “Watermarking as Communications with Side Information,” Proc. of IEEE, 87(7), pp. 1127-1141, 1999.
[72] M. Ramkumar, and A. N. Akansu, “FFT-based Signaling for Multimedia Steganography,” IEEE ICASSP’00, pp. 1979-1982, Istanbul, Turkey, June 2000.
[73] M. Ramkumar, and A. N. Akansu, “Self-Noise Suppression Schemes for Blind Image Steganography,” Proc. SPIE, Vol. 3845, pp. 55-65, Boston, MA, Sep 1999.
[74] M. Kesal, M. K. Mihcak, R. Koetter, and P. Moulin, “Iteratively Decodable Codes for Watermarking Applications,” Proc. 2nd Inter. Symp. on Turbo Codes and Related Topics, Brest, France, Sep. 2000.
[75] R. Ansari, H. Malik, and A. Khokhar, “Data-Hiding in Audio using Frequency-Selective Phase Alteration,” accepted for Proc. ICASSP’04, Montreal, Quebec, Canada, May 17-21, 2004.
[76] H. Malik, A. Khokhar, and R. Ansari, “Robust Data Hiding in Audio,” submitted to ICME’04, Taipei, Taiwan, June 27-30, 2004.
Quantization Based Embedding:
[77] F. Pérez-González, and F. Balado, “Quantized projection data hiding,” Proc. of IEEE Inter. Conf. on Image Processing, Rochester, NY, USA, Sep 2002.
[78] F. Pérez-González and F. Balado, “Improving data hiding performance by using quantization in a projected domain,” Proc. of IEEE Inter. Conf. on Multimedia and Expo, Lausanne, Switzerland, Aug 2002.
[79] F. Pérez-González, P. Comesaña, and F. Balado, “Dither-Modulation Data hiding with distortion-compensation: exact performance analysis and an improved detector for JPEG attacks,” Inter. Conf. on Image Processing, Barcelona, Spain, Sep 2003.
[80] R. J. Barron, B. Chen, and G. W. Wornell, “The duality between information embedding and source coding with side information and some applications,” IEEE Trans. on Information Theory, vol. 49, no. 5, pp. 1159-1180, May 2003.
[81] B. Chen and G. W. Wornell, “Quantization index modulation: A class of provably good methods for digital watermarking and information embedding,” IEEE Trans. on Information Theory, vol. 47(4), pp. 1423-1443, May 2001.
[82] B. Chen and G. W. Wornell, “Quantization index modulation methods for digital watermarking and information embedding of multimedia,” Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, Vol. 27(1-2), pp. 7-33, Feb 2001.
[83] B. Chen and G. W. Wornell, “Preprocessed and postprocessed quantization index modulation methods for digital watermarking,” Proc. of SPIE, vol. 3971, San Jose, CA, pp. 48-59, Jan 2000.
[84] B. Chen and G. W. Wornell, “Dither modulation: A new approach to digital watermarking and information embedding,” Proc. of SPIE, vol. 3657, San Jose, CA, pp. 342-353, Jan. 1999.
[85] B. Chen and G. W. Wornell, “Digital watermarking and information embedding using dither modulation,” Proc. of 1998 IEEE Second Workshop on Multimedia Signal Processing (MMSP-98), Redondo Beach, CA, pp. 273-278, Dec 1998.
[86] J. J. Eggers and B. Girod, “Quantization Watermarking,” Proc. Security and Watermarking of Multimedia Contents, Electronic Imaging 2000, San Jose, CA, USA, Jan. 2000.
[87] M. Wu, “Joint Security and Robustness Enhancement for Quantization Based Embedding,” IEEE Trans. on Circuits and Systems for Video Technology, May 2003.
[88] F. Balado, and F. Pérez-González, “Hexagonal quantizers are not optimal for 2-D data hiding,” Proc. SPIE, Vol. 5020, Santa Clara, USA, January 2003.
[89] P. Comesaña, F. Pérez-González, and F. Balado, “Optimal Strategies for Spread-Spectrum and Quantized-Projection image data hiding games with BER Payoffs,” Proc. ICIP’03, Barcelona, Spain, Sep 2003.
Fingerprinting:
[90] W. Trappe, M. Wu, Z. Wang, and K. J. R. Liu, “Anti-collusion Fingerprinting for Multimedia,” IEEE Trans. on Signal Processing, Vol. 51(4), pp. 1069-1087, April 2003.
[91] H. Zhao, M. Wu, Z. J. Wang, and K. J. R. Liu, “Performance of Detection Statistics Under Collusion Attacks on Independent Multimedia Fingerprints,” Proc. ICME'03, Baltimore, MD, July 2003.
[92] Z. J. Wang, M. Wu, W. Trappe, and K. J. R. Liu, “Anti-Collusion of Group-Oriented Fingerprinting,” Proc. ICME'03, Baltimore, MD, July 2003.
[93] H. Zhao, M. Wu, Z. J. Wang, and K. J. R. Liu, “Nonlinear Collusion Attacks On Independent Fingerprints For Multimedia,” Proc. ICASSP'03, Hong Kong, April 2003.
[94] Z. J. Wang, M. Wu, H. Zhao, W. Trappe, and K. J. R. Liu, “Resistance of Orthogonal Gaussian Fingerprints to Collusion Attacks,” Proc. ICASSP'03, Hong Kong, April 2003.
[95] W. Trappe, M. Wu, and K. J. R. Liu, “Anti-Collusion Codes: Multi-User and Multimedia Perspectives,” Proc. ICIP'02, Rochester, NY, Sept. 2002.
[96] W. Trappe, M. Wu, and K. J. R. Liu, “Joint Coding and Embedding for Collusion-Resistant Fingerprinting,” EUSIPCO 2002, Sept. 2002.
[97] W. Trappe, M. Wu, and K. J. R. Liu, “Collusion-Resistant Fingerprinting for Multimedia,” Proc. ICASSP'02, Orlando, FL, May 2002.
[98] A. Briassouli, and P. Moulin, “Detection-Theoretic Analysis of Warping Attacks in Spread-Spectrum Watermarking,” Proc. ICASSP’03, Hong Kong, April 2003.
[99] P. Moulin, A. Briassouli, and H. Malvar, “Detection-Theoretic Analysis of Desynchronization Attacks in Watermarking,” Proc. DSP'02, Santorini, Greece, July 2002.
Authentication:
[100] J. J. Eggers and B. Girod, “Blind Watermarking Applied to Image Authentication,” Proc. ICASSP’01, Vol. 3, pp. 1977-1980, Salt Lake City, UT, May 2001.
[101] M. Wu, and B. Liu, “Watermarking for Image Authentication,” Proc. ICIP'98, Chicago, IL, 1998.
[102] E. T. Lin, C. I. Podilchuk, and E. J. Delp, “Detection of Image Alterations Using Semi-Fragile Watermarks,” Proc. SPIE, Vol. 3971, San Jose, CA, Jan 2000.
[103] D. Kundur, and D. Hatzinakos, “Digital Watermarking for Telltale Tamper-Proofing and Authentication,” Proc. IEEE, Vol. 87(7), pp. 1167-1180, July 1999.
Attacks, Performance Evaluation, & Benchmarks:
[104] J. K. Su, J. J. Eggers, and B. Girod, “Analysis of Digital Watermarks Subjected to Optimum Linear Filtering and Additive Noise,” Signal Processing, vol. 81(6), June 2001.
[105] J. J. Eggers and B. Girod, “Quantization Effects on Digital Watermarks,” Signal Processing, vol. 8(2), pp. 239-263, February 2001.
[106] J. J. Eggers, R. Bäuml, and B. Girod, “Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise,” Proc. 4th Intl. ITG Conference on Source and Channel Coding, Berlin, Germany, Jan. 2001.
[107] J. K. Su, J. J. Eggers, and B. Girod, “Capacity of Digital Watermarks Subjected to an Optimal Collusion Attack,” X. European Signal Processing Conference EUSIPCO-2000, Tampere, Finland, Sept 2000.
[108] J. Su and B. Girod, “Fundamental Performance Limits of Power-Spectrum Condition-Compliant Watermarks,” Proc. Security and Watermarking of Multimedia Contents, Electronic Imaging 2000, San Jose, CA, Jan. 2000.
[109] J. K. Su, J. J. Eggers, and B. Girod, “Optimum Attack on Digital Watermarks and its Defense,” Proc. 2000 Asilomar Conference on Signals and Systems, Pacific Grove, CA, Oct 2000.
[110] F. Hartung, J. Su, and B. Girod, “Spread Spectrum Watermarking: Malicious Attacks and Counter-Attacks,” Proc. SPIE, Vol. 3657, pp. 147-158, San Jose, CA, Jan 1999.
[111] J. Su, F. Hartung, and B. Girod, “Channel Model for a Watermark Attack,” Proc. SPIE, Vol. 3657, pp. 159-170, San Jose, CA, Jan 1999.
[112] J. J. Eggers and B. Girod, “Watermark Detection after Quantization Attacks,” Proc. Workshop on Information Hiding, Dresden, Germany, Sept./Oct. 1999.
[113] M. Wu, and B. Liu, “Attacks on Digital Watermarks,” 33rd Asilomar Conference on Signals, Systems, and Computers, 1999.
[114] J. A. O'Sullivan and P. Moulin, “Some Properties of Optimal Information Hiding and Information Attacks,” Proc. 39th Allerton Conference, Monticello, IL, Oct 2001.
[115] F. Pérez-González, F. Balado, and J. R. Hernández, “Performance analysis of existing and new methods for data hiding with known-host information in additive channels,” IEEE Trans. on Signal Processing, 51(4):960-980, April 2003.
[116] D. Kirovski, and F. A. P. Petitcolas, “Blind pattern matching attack on watermarking systems,” IEEE Trans. on Signal Processing, vol. 51(4), pp. 1045–1053, April 2003.
[117] F. A. P. Petitcolas, “Watermarking schemes evaluation,” IEEE Signal Processing Magazine, vol. 17(5), pp. 58–64, Sep 2000.
[118] M. Kutter and F. A. P. Petitcolas, “Fair evaluation methods for image watermarking systems,” Journal of Electronic Imaging, vol. 9, no. 4, pp. 445–455, Oct. 2000.
[119] D. Kirovski and F. A. P. Petitcolas, “Replacement attack on arbitrary watermarking systems,” Proc. ACM CCS-9 Workshop, DRM 2002, Digital Rights Management, Lecture Notes in Computer Science, Vol. 2696, Washington, D.C., Nov. 2002.
[120] F. A. P. Petitcolas and D. Kirovski, “Blind pattern matching attack on audio watermarking systems,” Proc. ICASSP 2002, Orlando, Florida, May 2002.
[121] S. Katzenbeisser, and F. A. P. Petitcolas, “Defining security in steganographic systems,” Proc. SPIE, Vol. 4675, pp. 50–56, San Jose, CA, Jan 2002.
[122] M. Steinebach, A. Lang, J. Dittmann, and F. A. P. Petitcolas, “StirMark Benchmark: audio watermarking attacks based on lossy compression,” Proc. SPIE, Vol. 4675, pp. 79–90, San Jose, CA, Jan 2002.
[123] F. A. P. Petitcolas, R. J. Anderson, and M. G. Kuhn, “Attacks on copyright marking systems,” Proc. 2nd Workshop on Information Hiding, Lecture Notes in Computer Science, Vol. 1525, pp. 218–238, Portland, OR, April 1998.
[124] D. Kundur, “Improved Digital Watermarking through Diversity and Attack Characterization,” Proc. Workshop on Multimedia Security at ACM Multimedia ‘99, pp. 53-58, Orlando, Florida, Oct 1999.
[125] D. Kundur, and D. Hatzinakos, “Attack Characterization for Effective Watermarking,” Proc. ICIP’99, pp. 240-244, Oct 1999.
[126] D. Kundur, and D. Hatzinakos, “Improved Robust Watermarking through Attack Characterization,” Optics Express focus issue on Digital Watermarking, vol. 3(12), pp. 485-490, Dec 1998.
Surveys and Tutorials:
[127] M. Wu, and B. Liu, “Data Hiding in Image and Video: Part-I -- Fundamental Issues and Solutions,” IEEE Trans. on Image Proc., Vol. 12(6), pp. 685-695, June 2003.
[128] M. Wu, H. Yu, and B. Liu, “Data Hiding in Image and Video: Part-II -- Designs and Applications,” IEEE Trans. on Image Proc., Vol. 12(6), pp. 696-705, June 2003.
[129] I. J. Cox, M. L. Miller, and J. A. Bloom, “Watermarking Applications and Their Properties,” Proc. of Inter. Conf. on Information Technology: Coding and Computing - ITCC2000, pp. 6-10, 2000.
[130] I. J. Cox, M. L. Miller, J. M. G. Linnartz, and T. Kalker, “A Review of Watermarking Principles and Practices,” DSP for Multimedia Systems, K. K. Parhi, T. Nishitani (eds.), Marcel Dekker, Inc., NY, pp. 461-485, 1999.
[131] F. A. P. Petitcolas, R. J. Anderson, and M. G. Kuhn, “Information hiding – a survey,” Proc. IEEE, 87(7):1062–1078, July 1999.
[132] R. J. Anderson, and F. A. P. Petitcolas, “On the limits of steganography,” IEEE Journal of Selected Areas in Communications, 16(4):474-481, May 1998.
[133] M. Eskicioglu, and E. J. Delp, “An Overview of Multimedia Content Protection in Consumer Electronics Devices,” Signal Processing: Image Communication, Vol. 16, pp. 681-699, 2000.
[134] E. T. Lin, and E. J. Delp, “A Review of Fragile Image Watermarks,” Proc. Multimedia and Security Workshop (ACM Multimedia '99) Multimedia Contents, pp. 25-29, Oct 1999.
[135] A. Sequeira, and D. Kundur, “Communication and Information Theory in Watermarking: A Survey,” Proc. SPIE, Vol. 4518, pp. 216-227, Denver, Colorado, August 2001.
[136] W. Bender, D. Gruhl, N. Morimoto, and A. Lu, “Techniques for data hiding,” IBM Systems Journal, vol. 35, no. 3/4, 1996.
[137] C. I. Podilchuk and E. J. Delp, “Digital watermarking algorithms and applications,” IEEE Signal Processing Magazine, pp. 33-45, July 2001.
[138] F. Pérez-González, and J. R. Hernández, “A tutorial on digital watermarking,” Proc. of 33rd IEEE Ann. Carnahan Conf. on Security Technology, Madrid, Spain, Oct 1999.
[139] J. R. Hernández and F. Pérez-González, “Statistical analysis of watermarking schemes for copyright protection of images,” Proc. of IEEE, 87(7):1142-1166, July 1999.
[140] J. R. Hernández, F. Pérez-González, J. M. Rodríguez, and G. Nieto, “Performance analysis of a 2D-multipulse amplitude modulation scheme for data hiding and watermarking of still images,” IEEE J. Select. Areas Comm., 16(4):510-524, May 1998.
[141] C. I. Podilchuk, and W. Zeng, “Image Adaptive Watermarking using visual models,” IEEE J. on Selected Areas in Comm., 16(4):525-539, May 1998.
Miscellaneous:
[142] J. J. Eggers, J. K. Su, and B. Girod, “Public Key Watermarking Using Linear Transforms,” X. European Signal Proc. Conf. EUSIPCO’00, Tampere, Finland, Sept 2000.
[143] C-Y. Lin, M. Wu, J.A. Bloom, M.L. Miller, I.J. Cox, and Y-M. Lui, “Rotation, Scale, and Translation Resilient Public Watermarking for Images,” IEEE Trans. on Image Processing, vol. 10, no. 5, pp. 767-782, May 2001.
[144] C-Y. Lin, M. Wu, J.A. Bloom, M.L. Miller, I.J. Cox, and Y-M. Lui, "Rotation, Scale, and Translation Resilient Public Watermarking for Images,” Proc. SPIE, Vol. 3971, San Jose, CA, Jan 2000.
[145] M. Wu, and B. Liu, “Digital Watermarking Using Shuffling,” Proc. ICIP'99, Kobe, Japan, 1999.
[146] F. Pérez-González, J. R. Hernández, and F. Balado, “Approaching the capacity limit in image watermarking: A perspective on coding techniques for data hiding applications,” Signal Processing, Elsevier, 81(6):1215-1238, June 2001.
[147] J. R. Hernández, M. Amado, and F. Pérez-González, “DCT-domain watermarking techniques for still images: Detector performance analysis and a new structure,” IEEE Trans. on Image Processing, 9(1):55-68, January 2000.
[148] J. R. Hernández, J. M. Rodríguez, and F. Pérez-González, “Improving the performance of spatial watermarking of images using channel coding,” Signal Processing, Elsevier, 80:1261-1279, July 2000.
[149] J. R. Hernández, F. Pérez-González, and M. Amado, “Improving DCT-domain watermark extraction using generalized gaussian models,” In Proc. of the COST #254 Int. Workshop on Intelligent Communications and Multimedia Terminals, pp. 23-26, Ljubljana, Slovenia, November 1998.
[150] J. A. Bloom, I. J. Cox, T. Kalker, J-P Linnartz, M. L. Miller, and B. Traw, “Copy Protection for DVD Video”, Proc. of IEEE, 87 (7), pp 1267-1276, 1999.
[151] M. Peinado, F. A. P. Petitcolas, and D. Kirovski, “Digital rights management for digital cinema”, Multimedia Systems Journal, vol. 9, no. 3, pp 228–238, 2003.
[152] D. Kundur, “Implications for High Capacity Data Hiding in the Presence of Lossy Compression”, Proc. IEEE Int. Conf. On Information Technology: Coding and Computing, pp. 16-21, Las Vegas, Nevada, March 2000.
[153] D. Kundur and D. Hatzinakos, “A Robust Digital Image Watermarking Scheme using Wavelet-Based fusion”, Proc. ICIP’97, pp. 544-547, Santa Barbara, CA, Oct 1997.
[154] D. A. Nelson, and R.C. Bilger, “Pure-Tone Octave Masking in Normal-Hearing Listeners,” J. of Speech and Hearing Research, Vol. 17 No. 2, June 1974.
[155] K. Karthik, D. Kundur and D. Hatzinakos, “Joint Fingerprinting and Decryption for Multimedia Content Tracing in Wireless Networks,” Proc. SPIE, vol. 5403, Orlando, Florida, April 2004.
[156] D. Kundur and K. Ahsan, “Practical Internet Steganography: Data Hiding in IP,” Proc. Texas Workshop on Security of Information Systems, College Station, Texas, April 2003.
[157] K. Ahsan and D. Kundur, “Practical Data Hiding in TCP/IP,” Proc. Workshop on Multimedia Security at ACM Multimedia '02, French Riviera, Dec 2002.
[158] B. Chen and C.-E. W. Sundberg, “Digital audio broadcasting in the FM band by means of contiguous band insertion and precancelling techniques,” IEEE Trans. on Communications, vol. 48, no. 10, pp. 1634-1637, Oct. 2000.
[159] Secure Digital Music Initiative (SDMI), http://www.smdi.org.
[160] D. Gruhl, A. Lu, and W. Bender, “Echo Hiding,” Proc. 1st Workshop on Information Hiding, LNCS, Vol. 1174, pp. 295-351, Cambridge, UK, May/April 1996.
[161] F. Hartung, P. Eisert, and B. Girod, “Digital Watermarking of MPEG-4 Facial Animation Parameters,” Computers & Graphics, Vol. 22(4), pp. 425-435, August 1998.
[162] F. Hartung and B. Girod, “Watermarking of Uncompressed and Compressed Video,” Signal Processing, Vol. 66(3), pp. 283-302, May 1998.
[163] F. Hartung, P. Eisert, and B. Girod, “Digital Watermarking of MPEG-4 Facial Animation Parameters,” Computers & Graphics, vol. 22, no. 4, pp. 425-435, August 1998.
[164] F. Hartung, and B. Girod, “Digital Watermarking of Raw and Compressed Video,” Proc. European EOS/SPIE Symp. Adv. Image & Network Tech., Berlin, Germany, Oct 1996.
[165] P. Noll, “MPEG Digital Audio Coding,” IEEE Sig. Proc. Mag. vol. 14(5), pp. 59-81, Sep 1997.
[166] S. Mallat, “Multifrequency Channel Decomposition of Images and Wavelet Models,” IEEE Trans. Acoust., Speech, Signal Processing, Vol. 37(12), pp. 2091–2110, Dec 1989.
[167] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, “Image Coding using Wavelet Transform,” IEEE Trans. Image Processing, Vol. 1(2), pp. 205–220, April 1992.
[168] A. J. Menezes, P. C. V. Oorschot, and S. A. Vanstone, “Handbook of Applied Cryptography,” CRC Press, 5th print, Aug 2001.
[169] “Digital Rights Management and Privacy,” Electronic Privacy Information Center, http://www.epic.org/privacy/drm/.
[170] M. Karagosian, “Digital Rights Management: Friend or Foe?,” http://www.mkpe.com/articles/2001/DRM_2001/drm_2001.htm.
[171] “Are Music Companies Blinded by Fright?,” http://www.businessweek.com/1999/99_26/b3635140.htm?scriptFramed.
[172] http://history.acusd.edu/gen/recording/digital.html
[173] http://www.kirtland.cc.mi.us/honors/digrev.htm
Appendix: A
Notations
This appendix defines the notational conventions used throughout. The notation for each type
of variable is as follows:
o Scalar: Uppercase or lowercase italic letters in Arial font represent scalar values
and individual members of sets, e.g. N, x, and r. The magnitude of a scalar value n is
denoted |n|.
o Sets: Sets are represented in calligraphic font; for example, the set of real numbers is
R and the set of messages is M. The cardinality of a set M is denoted |M|.
o Vectors: n-dimensional vectors (where n is a positive integer, $n \in \mathbb{Z}^{+}$) are
represented as boldface lowercase italic letters in Arial font: c, r, and w. Indices into
these vectors are specified in square brackets. For example, the pixel of an image c at
location (i, j) is specified by c[i,j]. Moreover, vectors in a transformed domain (e.g. the
discrete cosine transform or the discrete wavelet transform) are represented by boldface
uppercase letters; for example, the DCT of vector c is C.
Subscripts indicate different versions of the same vector; for example, $\mathbf{c}_o$ denotes the
vector of the original host data, and $\mathbf{c}_w$ denotes the watermarked copy of this host media.
The Euclidean norm of a vector c, that is, $\|\mathbf{c}\|_2$, is denoted |c|. The sample mean and sample
variance of a vector c are denoted $\bar{c}$ and $s_c^2$ respectively.
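As a sketch, the three quantities just named can be written out for an n-dimensional vector c with entries c[i], writing the sample mean as $\bar{c}$ and the sample variance as $s_c^2$. The index i and the 1/n normalization of the sample variance are assumptions made here for illustration; the text does not state which normalization is intended, and 1/(n-1) is the other common convention.

```latex
\begin{align}
|\mathbf{c}| &= \|\mathbf{c}\|_2 = \sqrt{\sum_{i=1}^{n} c[i]^2}, \\
\bar{c} &= \frac{1}{n}\sum_{i=1}^{n} c[i], \\
s_c^2 &= \frac{1}{n}\sum_{i=1}^{n} \bigl(c[i]-\bar{c}\bigr)^2 .
\end{align}
```

Under the 1/n convention these are linked by $|\mathbf{c}|^2 = n\,(s_c^2 + \bar{c}^2)$, which gives a quick consistency check when all three are computed for the same vector.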
o Random Scalar Variables: Random scalar variables are indicated by italic letters in
Times New Roman font: x, r, and y. Each random variable is associated with a
probability distribution or density function; the probability that the value x will
be drawn from the distribution of x is written $f_x(x)$.
The statistical mean and variance (or expected value and second central moment) of a
random variable x are indicated by $\mu_x$ and $\sigma_x^2$ respectively.
o Random Vectors: Random vectors are represented by boldface letters in the same
font as random scalars, e.g. x, r, and y. Similarly, the probability distribution
function associated with a random vector x is written $f_{\mathbf{x}}(\mathbf{x})$. The statistical mean and
variance of a random vector x are represented by $\mu_{\mathbf{x}}$ and $\sigma_{\mathbf{x}}^2$ respectively.
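For reference, the statistical mean and variance of a random scalar variable can be written out as below, assuming a continuous random variable with a density function. To keep the two roles of the letter apart, the sketch sets the random variable as $x$ and a particular value as sans-serif $\mathsf{x}$, mirroring the Times New Roman versus Arial convention; this typesetting choice and the operator $\mathrm{E}[\cdot]$ are illustrative assumptions rather than notation taken from the text.

```latex
\begin{align}
\mu_x &= \mathrm{E}[x] = \int_{-\infty}^{\infty} \mathsf{x}\, f_x(\mathsf{x})\, d\mathsf{x}, \\
\sigma_x^2 &= \mathrm{E}\bigl[(x-\mu_x)^2\bigr]
            = \int_{-\infty}^{\infty} (\mathsf{x}-\mu_x)^2 f_x(\mathsf{x})\, d\mathsf{x} .
\end{align}
```

For a discrete random variable the integrals become sums over the possible values. Whether the mean and variance of a random vector are meant element by element is not stated here, so the vector case is left as written.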