Access to Electronic Thesis

Author: Deepayan Bhowmik

Thesis title: Robust watermarking techniques for scalable coded image and video

Qualification: PhD

This electronic thesis is protected by the Copyright, Designs and Patents Act 1988. No reproduction is permitted without consent of the author. It is also protected by the Creative Commons Licence allowing Attributions-Non-commercial-No derivatives. If this electronic thesis has been edited by the author it will be indicated as such on the title page and in the text.

ROBUST WATERMARKING TECHNIQUES FOR SCALABLE CODED IMAGE AND VIDEO

submitted by

Deepayan Bhowmik

for the degree of

Doctor of Philosophy

of the

Department of Electronic and Electrical Engineering

The University of Sheffield

December, 2010

COPYRIGHT

Attention is drawn to the fact that copyright of this thesis rests with its author. This copy of the thesis has been supplied on the condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the thesis and no information derived from it may be published without the prior written consent of the author.

This thesis may be made available for consultation within the University Library and may be photocopied or lent to other libraries for the purposes of consultation.

Signature of Author:

Deepayan Bhowmik

ABSTRACT

In scalable image/video coding, high-resolution content is encoded at the highest visual quality and the bit streams are adapted to cater for various communication channels, display devices and usage requirements. These content adaptations, which include quality, resolution and frame rate scaling, may also affect content protection data such as watermarks, and are therefore considered a potential watermark attack. In this thesis, robust watermarking techniques for scalable coded image and video are proposed, and improvements in robustness against various content adaptation attacks, such as JPEG 2000 for images and Motion JPEG 2000, MC-EZBC and H.264/SVC for video, are reported. Spread spectrum domain schemes, particularly wavelet-based image watermarking schemes, often provide better robustness to compression attacks owing to their multi-resolution decomposition, and are hence chosen for this work. A comprehensive comparative analysis of the available wavelet-based watermarking schemes is performed by developing a new modular framework, the Watermark Evaluation Bench for Content Adaptation Modes (WEBCAM). This analysis is used to derive a watermark embedding distortion model that establishes a directly proportional relationship between the sum of energy of the selected wavelet coefficients and the distortion performance, i.e., the mean square error (MSE) in the spatial domain. The improvement in robustness, in turn, is achieved by modeling bit plane discarding: the model analyzes the effect of quantization and de-quantization within the image coder and ranks the wavelet coefficients and other parameters according to their ability to retain the watermark data intact under quality scalable coding-based content adaptation. The work then extends these image watermarking models to video watermarking. However, a direct extension of the image watermarking methods to frame-by-frame video watermarking, without considering motion, results in flicker and other motion mismatch artifacts in the watermarked video. Motion compensated temporal filtering (MCTF) provides a good framework for accounting for motion. A generalized MCTF-based spatio-temporal decomposition domain (2D+t+2D) video watermarking framework is developed to address such issues. Improvements in imperceptibility and robustness are achieved by embedding the watermark in 2D+t compared to traditional t+2D MCTF-based watermarking schemes. Finally, the research outcomes discussed above are combined to propose a novel concept of a scalable watermarking scheme that generates a distortion-constrained, robustness-scalable watermarked media code stream, which can be truncated at various points to generate the watermarked image or video with the desired distortion-robustness requirements.
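The proportionality between coefficient energy and spatial-domain MSE stated above can be checked numerically for an orthonormal transform. The following is a minimal sketch, not the thesis's implementation: it assumes a one-level 1D orthonormal Haar transform and a hypothetical multiplicative embedding rule C' = C(1 + alpha) applied to a chosen set of low-pass coefficients. The function names, the value of alpha and the random test signal are all illustrative assumptions.

```python
import math
import random

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length signal."""
    low = [S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high


def haar_inverse(low, high):
    """Inverse of haar_forward."""
    x = []
    for a, d in zip(low, high):
        x.extend([S * (a + d), S * (a - d)])
    return x


def embedding_mse(signal, alpha, selected):
    """Scale the selected low-pass coefficients by (1 + alpha) (an assumed,
    illustrative embedding rule) and return the spatial-domain MSE together
    with the energy (sum of squares) of the selected coefficients."""
    low, high = haar_forward(signal)
    marked_low = list(low)
    for k in selected:
        marked_low[k] = low[k] * (1.0 + alpha)
    marked = haar_inverse(marked_low, high)
    mse = sum((a - b) ** 2 for a, b in zip(signal, marked)) / len(signal)
    energy = sum(low[k] ** 2 for k in selected)
    return mse, energy


random.seed(0)
signal = [random.uniform(0, 255) for _ in range(64)]  # synthetic test data
alpha = 0.05
mse, energy = embedding_mse(signal, alpha, range(0, 32, 4))
# For an orthonormal basis, energy is preserved across domains, so the MSE
# is alpha^2 / N times the energy of the modified coefficients.
predicted = alpha ** 2 * energy / len(signal)
print(mse, predicted)
```

Under these assumptions the measured and predicted values agree to floating-point precision, because an orthonormal transform preserves energy. For non-orthonormal kernels this equality no longer holds directly and per-subband weighting would be needed.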

Dedicated to my parents.

ACKNOWLEDGEMENTS

I am grateful to my parents for motivating and encouraging me throughout this long and enduring journey called the PhD. I take this opportunity to express my sincere gratitude to Dr. Charith Abhayaratne for guiding and steering me through the entire process. I feel fortunate to have had him as my supervisor; he helped me to learn not only the technical aspects but also the integrity of this degree. I wish to thank the UK Engineering and Physical Sciences Research Council (EPSRC) for funding this work through an EPSRC-BP Dorothy Hodgkin Postgraduate Award (DHPA). I am especially thankful to Dr. Sanchita Bandyopadhyay, Dr. Subrata B. Ghosh, Ms. Ritu Sengupta, Mr. Subrato Chatterjee and Dr. Bala Amavasai for their encouragement and support. Finally, I would like to thank Mr. James Screaton for the technical support, Mr. Mathew Oakes for helping me with proofreading, my colleagues in the Visual and Information Engineering (VIE) lab and, last but not least, my friends in Sheffield.

Contents

List of Figures
List of Tables
List of Symbols and Acronyms
Statement of Originality

1 Introduction
1.1 Scalable coded image watermarking
1.2 Scalable coded video watermarking
1.3 Scalable watermarking for image and video
1.4 Thesis organization
1.5 Publications and software releases

2 Background Overview
2.1 Scalable coding-based content adaptation
2.1.1 Scalable coding modules
2.1.2 Scalable coding technique
2.2 Digital watermarking
2.2.1 Definition, properties, applications and attacks
2.2.2 Watermarking process
2.2.2.1 Embedding
2.2.2.2 Extraction and authentication
2.2.3 Wavelet-based watermarking
2.2.4 Wavelet transform
2.2.4.1 Filter bank approach
2.2.4.2 Lifting based approach
2.2.5 2D wavelet
2.2.6 Motion compensated temporal filtering
2.3 Conclusions

3 State-of-the-art
3.1 Image watermarking
3.1.1 Wavelet-based image watermarking
3.1.1.1 Uncompressed domain watermarking algorithms
3.1.1.2 Joint compression-watermarking algorithms
3.1.2 Dissection of wavelet-based image watermarking algorithms
3.1.2.1 Wavelet kernel
3.1.2.2 Subband
3.1.2.3 Hosting coefficient
3.1.2.4 Embedding method
3.2 Video watermarking
3.2.1 Uncompressed and compressed domain video watermarking
3.2.1.1 Uncompressed domain algorithms
3.2.1.2 Compressed domain algorithms
3.2.2 Dissection of the video watermarking algorithms
3.2.2.1 Frame-by-frame
3.2.2.2 3D decomposed
3.2.2.3 Motion compensated
3.2.2.4 Bit stream domain
3.2.2.5 Motion vector based
3.3 Conclusions

4 Watermarking Evaluation Bench for Content Adaptation Modes
4.1 Introduction
4.2 WEBCAM system architecture
4.2.1 Watermark embedding tools
4.2.2 Content adaptation tools
4.2.3 Watermark extraction and authentication tools
4.2.3.1 Watermark extraction
4.2.3.2 Postprocessing
4.2.3.3 Watermark authentication
4.3 Experimental simulations and comparative study
4.3.1 Different wavelet-based watermarking algorithm realization
4.3.2 Robustness to content adaptation attacks
4.3.2.1 The experimental setup
4.3.2.2 The effect of wavelet kernel choice on robustness
4.3.2.3 The effect of subband choice
4.3.2.4 The effect of the choice of embedding method and host coefficient selection
4.4 Conclusions

5 Embedding distortion analysis and modeling
5.1 Introduction
5.2 Embedding distortion model for orthonormal wavelet bases
5.2.1 Preliminaries
5.2.2 The model
5.2.2.1 An example of non-blind model
5.2.2.2 An example of blind embedding model
5.2.3 Experimental simulations and result discussion
5.2.3.1 Non-blind model
5.2.3.2 Blind model
5.3 Embedding distortion model for non-orthonormal wavelet bases
5.3.1 Preliminaries
5.3.2 Experimental simulations and discussion
5.3.2.1 Calculation of the weighting parameters
5.3.2.2 Simulations of the propositions
5.4 Conclusions

6 Robustness analysis and modeling
6.1 Introduction
6.2 Quality scalability in content adaptation
6.3 Robustness model for non-blind extraction using magnitude alteration
6.3.1 Preliminaries
6.3.2 The model
6.3.3 Examples
6.4 Robustness model for blind extraction using re-quantization-based modifications
6.4.1 Preliminaries
6.4.2 The model
6.4.3 Examples
6.5 Performance evaluation
6.5.1 Evaluation of the model for non-blind watermarking
6.5.1.1 Simulations with bit plane discarding
6.5.1.2 Experiments with JPEG 2000 quality scalability
6.5.2 Evaluation of the model for blind watermarking
6.5.2.1 Simulations with bit plane discarding
6.5.2.2 Experiments with JPEG 2000 quality scalability
6.6 Conclusions

7 Motion Compensated Video Watermarking Techniques
7.1 Introduction
7.2 Motion compensated 2D+t+2D filtering
7.2.1 MMCTF
7.2.2 2D+t+2D framework
7.3 Video watermarking in 2D+t+2D spatio-temporal decomposition
7.3.1 Proposed video watermarking scheme
7.3.1.1 Embedding
7.3.1.2 Extraction and authentication
7.3.2 The framework analysis in video watermarking context
7.3.2.1 On improving imperceptibility
7.3.2.2 On motion retrieval
7.4 Experimental results and discussion
7.4.1 Embedding distortion analysis
7.4.2 Robustness performance evaluation
7.5 Adopting robustness model in video watermarking
7.5.1 Robust video watermarking
7.5.2 Experimental results
7.6 Conclusions

8 Distortion Constrained Robustness Scalable Watermarking
8.1 Introduction
8.2 Scalable watermarking
8.2.1 Proposed algorithm
8.2.1.1 Tree formation
8.2.1.2 Embedding
8.2.1.3 Extraction and Authentication
8.2.2 Scalable watermark system design
8.2.2.1 Encoding module
8.2.2.2 Embedded watermarking module
8.2.2.3 Extractor module
8.2.3 Effect of bit plane discarding
8.3 Experimental results and discussion
8.3.1 Scalable watermarking for images
8.3.1.1 Proof of the concept
8.3.1.2 Verification of the scheme against bit plane discarding
8.3.1.3 Robustness performance against JPEG 2000
8.3.1.4 Robustness performance comparison with existing method
8.3.1.5 Application scenario of scalable watermarking
8.3.2 Scalable watermarking for video
8.4 Conclusions

9 Conclusions and future work
9.1 Conclusions
9.2 Future work

10 Appendix A

References

List of Figures

2.1 Universal multimedia usage scenarios using scalable coded content.
2.2 The scalable coding-decoding block diagram.
2.3 Quality scalable encoding process.
2.4 Spatial resolution scalable encoding process.
2.5 Temporal scalable encoding process.
2.6 Watermarking applications.
2.7 Watermarking properties and associated applications.
2.8 Types of watermarking techniques.
2.9 Watermark types.
2.10 Attack characterization.
2.11 Watermark embedding process.
2.12 Watermark extraction and authentication process.
2.13 The filter bank approach for DWT.
2.14 The lifting approach for DWT.
2.15 2D wavelet transform operation.
2.16 The block based motion estimation.

3.1 Uncompressed domain image watermarking and content adaptation attack.
3.2 Joint compression-watermarking and content adaptation attack.
3.3 Re-quantization-based modification.
3.4 Uncompressed domain video watermarking and compression / content adaptation attack.
3.5 Generic scheme for joint compression domain video watermarking.

4.1 WEBCAM modules and input/output parameter blocks.
4.2 Flow diagram of the watermark embedding module in WEBCAM.
4.3 The FDWT submodule with choices of wavelet kernels.
4.4 Flow diagram of the content adaptation tools in WEBCAM.
4.5 Content adaptation at nodes.
4.6 Flow diagram of watermark extraction and authentication in WEBCAM.
4.7 The test image set.
4.8 The test logo set.
4.9 An example comparing the effect of the choice of logo, with the same bit count (8192) embedded using intra re-quantization-based embedding, on robustness to: Row 1: Quality scalability attack on full resolution; and Row 2: Joint resolution-quality scalability attack (half resolution).
4.10 Capacity-distortion plots. Numbers 1 to 5 represent the five images from the test image set. Two different categories of algorithms, 1) non-blind (non-HVS based <1,0,0,0> (τ = 1)) and 2) blind (intra re-quantization based), are shown in each row for six different wavelet kernels: HR, D-4, 5/3, 9/7, MH and MQ.
4.11 Original and extracted watermark logos corresponding to different Hamming distances (HD).
4.12 An example of evaluating the effect of the wavelet kernel for <1,0,0,0> (τ = 1) direct modification-based embedding on robustness to: Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).
4.13 An example of evaluating the effect of the wavelet kernel for intra re-quantization-based embedding on robustness to: Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).
4.14 An example of evaluating the effect of the subband choice for <1,0,0,0> (τ = 1) direct modification-based embedding on robustness to: Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).
4.15 An example of evaluating the effect of the subband choice for intra re-quantization-based embedding on robustness to: Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).
4.16 An example of evaluating the effect of different embedding methods on robustness to: Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).

5.1 Watermark embedding (non-blind) performance graph for different subbands. Four different wavelet kernels used here: 1. HR, 2. D4, 3. D8 and 4. D16, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.
5.2 Watermark embedding (non-blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: HR, D4, D8 and D16, respectively.
5.3 Watermark embedding (blind) performance graph for different subbands. Four different wavelet kernels used here: 1. HR, 2. D4, 3. D8 and 4. D16, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.
5.4 Watermark embedding (blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: HR, D4, D8 and D16, respectively.
5.5 Watermark embedding (non-blind) performance graph for different subbands. Four different wavelet kernels used here: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.
5.6 Watermark embedding (non-blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively.
5.7 Watermark embedding (blind) performance graph for different subbands. Four different wavelet kernels used here: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.
5.8 Watermark embedding (blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively.

6.1 The effect of quantization and de-quantization processes in wavelet domain considering discarding of N bit planes.
6.2 The range of C capable of robust extraction of b = 1. Row 1: C′ ≥ C′; Row 2: C′ < C′; Row 3: The total range.
6.3 The range of C capable of robust extraction of b = 0. Row 1: C′ < C′; Row 2: C′ ≥ C′; Row 3: The total range.
6.4 The combined range of C capable of robust extraction of both b = 1 and b = 0.
6.5 Coefficients' robustness rank maps for discarding up to N bit planes shown using 7 gray scales corresponding to N = 0, ..., 6. Left: LL subband; Right: HL subband; Row 1: Embedding b = 1; Row 2: Embedding b = 0; Row 3: Embedding any value of b.
6.6 Mapping of coefficients after quantization and de-quantization processes considering the discarding of N bit planes.
6.7 Embedding performance of the model for non-blind watermarking considering different values of N at embedding. Column 1: Image 1; and Column 2: Image 2.
6.8 Non-blind model evaluation: Robustness performance against discarding of p bit planes for the embedding models that consider N = 1, 3, 5 (Column 1) and N = 2, 4, 6 (Column 2) bit planes to be discarded. N = 0 corresponds to the algorithm without the model. Row 1: Image 1; Row 2: Image 2; and Row 3: The entire image set.
6.9 Non-blind model evaluation. a) and b) represent the difference images |C′ − C| when using the embedding model with N = 0 and N = 5, respectively. c) and d) show the corresponding difference images |C′ − C| at the decoder after discarding p = 5 bit planes.
6.10 Non-blind model evaluation: Robustness performance against JPEG 2000 quality scalability for the embedding models that consider N = 1, 3, 5 (Column 1) and N = 2, 4, 6 (Column 2) bit planes to be discarded. N = 0 corresponds to the algorithm without the model. Row 1: Image 1; Row 2: Image 2; and Row 3: The entire image set.
6.11 Embedding performance of the model for blind watermarking considering different values of N at embedding for image 3 and image 4.
6.12 Blind model evaluation: Robustness performance against discarding of p bit planes for the embedding models that consider N = 0, 3, 4, 5 bit planes to be discarded. Row 1, Column 1: Image 3; Row 1, Column 2: Image 4; and Row 2: The entire image set.
6.13 Blind model evaluation: Robustness performance against JPEG 2000 quality scalability for the embedding models that consider N = 0, 3, 4, 5 bit planes to be discarded. Row 1, Column 1: Image 3; Row 1, Column 2: Image 4; and Row 2: The entire image set.

7.1 Pixel connectivity in I_{2t} and I_{2t+1} frames.
7.2 Realization of 3-2 temporal schemes using the 2D+t+2D framework with different parameters: (032).
7.5 Realization of spatial 2D frame-by-frame scheme using the 2D+t+2D framework with different parameters: (002).
7.6 System blocks for watermark embedding scheme in 2D+t+2D spatio-temporal decomposition.
7.7 System blocks for non-blind watermark extraction scheme in 2D+t+2D spatio-temporal decomposition.
7.8 System blocks for blind watermark extraction scheme in 2D+t+2D spatio-temporal decomposition.
7.9 Histogram of coefficients at LLs for 3rd level temporal low and high frequency frames (GOP 1) for Foreman sequence. Column 1) & 2) represent LLL and LLH temporal frames, respectively, and Row 1), 2) & 3) show 032, 131 and 230 combinations of the 2D+t+2D framework.
7.10 Histogram of coefficients at LLs for 3rd level temporal low and high frequency frames (GOP 1) for Crew sequence. Column 1) & 2) represent LLL and LLH temporal frames, respectively, and Row 1), 2) & 3) show 032, 131 and 230 combinations of the 2D+t+2D framework.
7.11 The test video sequence set.
7.12 Embedding distortion performance for non-blind watermarking on LLL and LLH temporal subbands for News sequence. a) and c) represent MSE and b) and d) represent Flicker metric for LLL and LLH, respectively.
7.13 Embedding distortion performance for non-blind watermarking on LLL and LLH temporal subbands for Foreman sequence. a) and c) represent MSE and b) and d) represent Flicker metric for LLL and LLH, respectively.
7.14 Embedding distortion performance for non-blind watermarking on LLL and LLH temporal subbands for Crew sequence. a) and c) represent MSE and b) and d) represent Flicker metric for LLL and LLH, respectively.
7.15 Embedding distortion performance for blind watermarking on LLL and LLH temporal subbands for News sequence. a) and c) represent MSE and b) and d) represent Flicker metric for LLL and LLH, respectively.
7.16 Embedding distortion performance for blind watermarking on LLL and LLH temporal subbands for Foreman sequence. a) and c) represent MSE and b) and d) represent Flicker metric for LLL and LLH, respectively.
7.17 Embedding distortion performance for blind watermarking on LLL and LLH temporal subbands for Crew sequence. a) and c) represent MSE and b) and d) represent Flicker metric for LLL and LLH, respectively.
7.18 Robustness performance of non-blind watermarking scheme for Crew sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
7.19 Robustness performance of non-blind watermarking scheme for Foreman sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
7.20 Robustness performance of non-blind watermarking scheme for News sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
7.21 Robustness performance of blind watermarking scheme for Crew sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
7.22 Robustness performance of blind watermarking scheme for Foreman sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
7.23 Robustness performance of blind watermarking scheme for News sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
7.24 Robustness performance enhancement using bit plane discarding model (N = 5) of non-blind watermarking scheme for LLH subband. Column 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Row 1), 2) & 3) represent the test sequences Crew, Foreman & News, respectively.
7.25 Robustness performance enhancement using bit plane discarding model (N = 5) of blind watermarking scheme for LLL subband. Column 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Row 1), 2) & 3) represent the test sequences Crew, Foreman & News, respectively.

8.1 Non-uniform hierarchical quantizer in formation of binary tree.
8.2 Example binary tree.
8.3 State machine diagram of watermark embedding based on tree-symbol-association model.
8.4 Proposed scalable watermarking layer creation.
8.5 Code-stream generation.
8.6 Effect of bit plane discarding in watermark extraction; λ = 2M and N is the number of bit planes being discarded.
8.7 Effect of bit plane discarding in watermark extraction for special case of EZ and EO; λ = 2M and N is the number of bit planes being discarded.
8.8 Visual representation of watermarked images at various rate points for Boat image.
8.9 Visual representation of watermarked images at various rate points for Barbara image.
8.10 Visual representation of watermarked images at various rate points for Blackboard image.
8.11 Visual representation of watermarked images at various rate points for Light House image.
8.12 PSNR and robustness vs Φ graph. Row 1: Embedding distortion vs. Φ, Row 2: Hamming distance vs. Φ.
8.13 PSNR and robustness vs Φ graph. Row 1: Embedding distortion vs. Φ, Row 2: Hamming distance vs. Φ.
8.14 Robustness against discarding of p bit planes for various d at minimum and maximum Φ.
8.15 Robustness against JPEG 2000 compression for various d at minimum and maximum Φ.
8.16 Robustness against JPEG 2000 compression for various Φ at d = 6.
8.17 Robustness performance comparison between existing and proposed method against JPEG 2000 compression with same Φ.
8.18 Application example using different Φ for various JPEG 2000 compression ratios to maintain embedding distortion and robustness.
8.19 Embedding distortion performance for proposed watermarking on LLL temporal subbands for various Φ (d = 6). Row 1), 2) & 3) represent embedding performances for Crew, Foreman and News sequences, respectively.
8.20 Embedding distortion performance for proposed watermarking on LLH temporal subbands for various Φ (d = 6). Row 1), 2) & 3) represent embedding performances for Crew, Foreman and News sequences, respectively.
8.21 Robustness performance of proposed watermarking scheme at different Φ (d = 6) for Crew sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
8.22 Robustness performance of proposed watermarking scheme at different Φ (d = 6) for Foreman sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.
8.23 Robustness performance of proposed watermarking scheme at different Φ (d = 6) for News sequence. Row 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.

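10.1 Robustness performance of non-blind watermarking scheme against H.264-SVC for Crew sequence. Column 1) & 2) represents the embedding on temporal subbands LLL & LLH, respectively.

10.2 Robustness performance of non-blind watermarking scheme against H.264-SVC for Foreman sequence. Column 1) & 2) represents the embedding on temporal subbands LLL & LLH, respectively.

10.3 Robustness performance of non-blind watermarking scheme against H.264-SVC for News sequence. Column 1) & 2) represents the embedding on temporal subbands LLL & LLH, respectively.

10.4 Robustness performance of blind watermarking scheme against H.264-SVC for Crew sequence. Column 1) & 2) represents the embedding on temporal subbands LLL & LLH, respectively.

10.5 Robustness performance of blind watermarking scheme against H.264-SVC for Foreman sequence. Column 1) & 2) represents the embedding on temporal subbands LLL & LLH, respectively.

10.6 Robustness performance of blind watermarking scheme against H.264-SVC for News sequence. Column 1) & 2) represents the embedding on temporal subbands LLL & LLH, respectively.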

List of Tables

4.1 Realization of major wavelet-based watermarking algorithms using combinations of options for submodules in WEBCAM.

5.1 Correlation coefficient values between sum of energy and the MSE for different wavelet kernel in various subbands.

5.2 Weighting parameter values of each subband at each decomposition level for various non-orthonormal wavelets.

5.3 Correlation coefficient values between sum of energy and the MSE for different wavelet kernel in various subbands.

6.1 Data value (C) ranges for retaining the watermark data, b = 1 and b = 0 for discarding N = 7 bit planes.

6.2 Values of m and corresponding b for different modifications of C′2 for k = 1, k + n = 6 and N = 5.

6.3 Ranges of C′2 to embed watermark bits, b = 1 and b = 0, for different N.

7.1 Sum of energy of coefficients at LLs for first two GOP each with 8 temporal low and high frequency frames of Foreman sequence.

7.2 Sum of energy of coefficients at LLs for first two GOP each with 8 temporal low and high frequency frames of Crew sequence.

7.3 Hamming distance for blind watermarking by estimating motion from watermarked video using different macro block size (MB) and search range (SR). Embedding at LLs on frame: a) LLL and b) LLH on Foreman sequence (average of first 64 frames).

7.4 Hamming distance for blind watermarking by estimating motion from watermarked video using different macro block size (MB) and search range (SR). Embedding at LLs on frame: a) LLL and b) LLH on Crew sequence (average of first 64 frames).

8.1 Tree-based watermarking rules table.

8.2 Embedding distortion performance comparison between existing and proposed watermarking method.

List of Symbols and Acronyms

Symbols

Symbol Description

Q Quality scalability

S Spatial scalability

T Temporal scalability

ζ() Embedding function

$() Extraction function

I Original host image

I ′ Watermarked image

X × Y Image dimension

W Watermark

W ′ Extracted watermark

Ψ Mother wavelet

C Original wavelet coefficient

C ′ Modified wavelet coefficient

∆ Watermark modification

α Watermarking weight factor for magnitude alteration based method

τ Watermarking strength

v HVS based weighting parameter

β Fusion strength parameter

δ Quantization step

δ Reconstructed quantization step

γ Watermark weighting parameter for re-quantization based method

H Hamming Distance

S Similarity Measure


L Length of watermark sequence

x Input signal

y Transformed domain signal

h′(z) Low pass filter coefficients

g′(z) High pass filter coefficients

WΘΥt Wavelet weighting parameter at Θ subband at Υ decomposition level

Cq Quantized coefficient

C De-quantized coefficient

Q Quantization factor

b Binary bit b ∈ {0, 1}

b′ Recovered binary bit b′ ∈ {0, 1}

T Threshold parameter in magnitude alteration based watermark extraction

N Number of bit planes assumed to be discarded

p Number of bit plane actually being discarded

V Vertical displacement of motion block

H Horizontal displacement of motion block

λ Binary tree quantizer

d Depth of binary tree

Φ Embedding distortion rate


Acronyms

Acronym Description

AVC Advanced Video Coding

DCT Discrete Cosine Transform

DFT Discrete Fourier Transform

DIA Digital Item Adaptation

DWT Discrete Wavelet Transform

FDWT Forward Discrete Wavelet Transform

HVS Human Visual System

IDWT Inverse Discrete Wavelet Transform

JND Just Noticeable Difference

JPEG Joint Photographic Experts Group

MB Macro-Block

MC-EZBC Motion Compensated Embedded Zero Block Coding

MCTF Motion Compensated Temporal Filtering

MMCTF Modified Motion Compensated Temporal Filtering

MPEG Moving Picture Experts Group

MSE Mean Square Error

MV Motion Vector

PSNR Peak Signal to Noise Ratio

RMSE Root Mean Square Error

SR Search Range

SSIM Structural Similarity Measure

SVC Scalable Video Coding

UMA Universal Media Access

WEBCAM Watermarking Evaluation Bench for Content Adaptation Modes

WET Watermark Evaluation Test bed

WO Weak One

WZ Weak Zero

CO Cumulative One

CZ Cumulative Zero

EO Embedded One

EZ Embedded Zero


Statement of Originality

The research conducted within the scope of this thesis produced the following novel

and unique contributions towards robust watermarking techniques for scalable coded

image and video:

Chapter 3

– State of the art analysis.

Chapter 4

– Generalization and dissection of wavelet based image watermarking schemes and

the related parameters, i.e., wavelet kernel, subband, host coefficients and embed-

ding methods.

– Design and implementation of the modular and reconfigurable tool repository to

develop WEBCAM framework.

– Providing evaluation platform for comparison of robustness performances against

content adaptation attacks, including JPEG 2000.

– Comprehensive analysis and the comparison of various parametric inputs within

WEBCAM framework.

Chapter 5

– Development of the watermark embedding distortion model.

– Relationships between the mean square error (MSE) and the wavelet coefficients

to be embedded.

– Proof of concept of the model for non-blind and blind watermarking algorithms.

– Proof of concept of the model for orthonormal wavelet transforms.

– Proof of concept of the model for non-orthonormal wavelets by using weighting

parameters for various wavelet kernels in different subbands.


Chapter 6

– Modeling of bit plane discarding, used in quality scalability, into the wavelet based

watermarking.

– Establishing the relationship between watermark input parameters and bit plane

discarding model.

– Design and implementation of enhanced robust watermarking algorithm for non-

blind watermarking by coefficient ranking using the above model.

– Proof of concept to enhance the robustness in blind watermarking schemes, based

on the bit plane discarding model.

Chapter 7

– Development of modified MCTF for video decomposition using lifting Haar.

– Design and implementation of generalized 2D+t+2D framework in wavelet do-

main.

– Comparative performance analysis of t+2D and 2D+t based watermarking.

– Robust video watermarking techniques against Motion JPEG 2000, MC-EZBC

and H.264/SVC.

Chapter 8

– Defining common performance metric to represent data-capacity and embedding

distortion.

– Design and implementation of scalable watermarking.

– Development of new binary tree-guided rules-based blind watermarking scheme.

– Scalable code stream generation using hierarchically nested joint

distortion-robustness coding atoms.

– Proof of concept of scalable watermarking by truncating the code-stream at any distortion-robustness atom level.

– Compliance of scalable watermarking with robust watermarking techniques for

scalable coded image and video.


Chapter 1

Introduction

Recent years have seen the emergence of scalable coding standards for multimedia content coding: JPEG 2000 for images [1]; the MPEG advanced video coding (AVC)/H.264 scalable video coding (SVC) extension for video [2]; and the MPEG-4 scalable profile for audio [3]. Scalable coders produce scalable bit streams that represent content in hierarchical layers according to audiovisual quality, spatio-temporal resolution and regions of interest. These bit streams may be truncated to satisfy variable network data rates, display resolutions, display device resources and usage preferences. The resulting bit streams may be transmitted, adapted further, or decoded using a universal decoder, which can decode any original or adapted bit stream to display or play a version of the original content adapted in terms of quality or resolution. The multimedia usage framework standard, MPEG-21, standardizes the operation of a content-agnostic content adaptation engine as Part 7 of the standard: Digital Item Adaptation (DIA) [4]. Such bit stream truncation-based content adaptations also affect any content protection data, such as watermarks, embedded in the original content. This thesis considers scalable coding-based content adaptation as a potential watermark attack and presents novel watermarking techniques that are robust to such attacks, particularly quality scalability. Within this scope, the work focuses on the watermarking robustness of scalable coded images and extends those methods to scalable coded video watermarking.


1.1 Scalable coded image watermarking

Influenced by its success in scalable image coding and its multi-resolution decomposition capability, the DWT has been widely used in image watermarking [5–26]. Based on the embedding methodology, wavelet-based image watermarking can be categorized into two main classes: uncompressed domain algorithms [5–18] and joint compression-watermarking algorithms [19–26]. One of the main objectives of the latter class is to accommodate watermarking within JPEG 2000 based scalable image coding, as suggested by the JPEG 2000 Part 8 (ISO/IEC 15444-8, T.807) Secure JPEG 2000 (JPSEC) [20] specification for securing JPEG 2000 bit streams. However, the major drawbacks of the compression domain algorithms are their dependency on the specific coding scheme and the complexity of accommodating the algorithms within the coding pipeline. Therefore, uncompressed domain watermarking approaches, which are independent of the coding scheme, are considered here.

In order to propose robust watermarking techniques for scalable coded images, the

objectives are broadly categorized as:

O.1 To analyze the existing schemes: The wavelet based watermarking schemes of-

ten share a common model. A comprehensive analysis of the existing schemes is pre-

sented by dissecting commonly used wavelet based watermarking algorithms into modu-

lar tool blocks and fitting them into a common wavelet-based watermarking framework,

Watermark Evaluation Bench for Content Adaptation Modes (WEBCAM). Such analy-

sis helps to develop models for embedding distortion and robust watermarking schemes.

O.2 To model watermarking robustness against scalable compression: To enhance the robustness against scalable compression such as JPEG 2000, the quantization process is analyzed in the context of watermarking. The embedding distortion performance is also taken into account in the analysis, to balance imperceptibility and robustness. The findings are used to propose a new scalable watermarking scheme later in this thesis. Objective O.2 is therefore divided into the following two sub-objectives:

O.2.1 To model embedding distortion: The embedding distortion and the robustness to scalable image coding are two competing watermarking requirements: increasing the robustness often compromises imperceptibility. The aim of this objective is to derive a model relating the wavelet coefficients to the watermarking distortion in the pixel domain.


O.2.2 To model quantization vs. watermarking robustness: The robustness performance deteriorates due to the scalable compression in content adaptation. The quantization process block within scalable coding is often responsible for the quality scaling. Here we aim to improve watermarking robustness by modeling the effect of quantization on the wavelet coefficients and ranking them accordingly to embed the watermark.

1.2 Scalable coded video watermarking

Following the success of wavelet-based image watermarking, several attempts have been made to extend these image watermarking algorithms to video, by applying them either frame by frame [27–30] or on 3D wavelet decompositions [9,31,32]. However, embedding a video watermark without considering motion results in flicker and other motion mismatch artifacts in the watermarked video. Motion compensated temporal filtering (MCTF) provides a better framework for video watermarking by accounting for object motion. Depending on the motion and texture characteristics of the video and the choice of spatio-temporal subband for watermark embedding, MCTF has to be performed either in the spatial domain (t+2D) or in the wavelet domain (2D+t). In this thesis, improved video watermarking schemes are proposed by offering a generalized motion compensated 2D+t+2D framework for watermark embedding. The watermarking algorithms derived for scalable coded images, as proposed in O.1 and O.2, are then extended to offer robust video watermarking schemes. An improved MCTF is also used, in which the update step is modified to follow the motion trajectory in the hierarchical temporal decomposition by using direct motion vector fields in the update step and implied motion vectors in the prediction step. In summary, the main objectives of the robust video watermarking schemes are:

O.3 To prepare 2D+t+2D framework: We aim to prepare a generalized, modified MCTF based 2D+t+2D framework in order to analyze the motion and texture characteristics suitable for video watermarking.

O.4 To model the video watermarking schemes: The watermarking algorithms derived from the robustness model for images are extended to video watermarking within the generalized 2D+t+2D framework to offer a unique model for video watermarking that is robust to content adaptation attacks, such as scalable compression in Motion JPEG 2000, the scalable video coder MC-EZBC [33] and H.264/SVC.
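As a point of reference for the MCTF discussion above, the temporal Haar lifting steps can be sketched for the simplest case, in which no motion compensation is applied (all motion vectors are assumed to be zero). This is an illustrative sketch only and not the modified MCTF proposed in this thesis; the function names, the unnormalized lifting form and the representation of a frame as a flat list of pixel values are assumptions made for the example.

```python
def haar_lift_forward(frame_a, frame_b):
    """One temporal Haar lifting step on a pair of frames (flat pixel lists)."""
    # Prediction step: the high-pass frame is the residual of B predicted from A.
    high = [b - a for a, b in zip(frame_a, frame_b)]
    # Update step: the low-pass frame is A corrected by half the residual,
    # which equals the average of A and B.
    low = [a + h / 2 for a, h in zip(frame_a, high)]
    return low, high


def haar_lift_inverse(low, high):
    """Undo the lifting steps in reverse order to recover both frames."""
    frame_a = [l - h / 2 for l, h in zip(low, high)]
    frame_b = [h + a for a, h in zip(frame_a, high)]
    return frame_a, frame_b


def temporal_decompose(frames, levels):
    """Hierarchical decomposition: re-apply the lifting step to the low-pass frames."""
    lows, highs = list(frames), []
    for _ in range(levels):
        next_lows, level_highs = [], []
        for i in range(0, len(lows) - 1, 2):
            low, high = haar_lift_forward(lows[i], lows[i + 1])
            next_lows.append(low)
            level_highs.append(high)
        highs.append(level_highs)
        lows = next_lows
    return lows, highs


# Four tiny hypothetical "frames" of three pixels each, decomposed over two levels.
frames = [[10, 20, 30], [12, 18, 30], [14, 16, 30], [16, 14, 30]]
lows, highs = temporal_decompose(frames, levels=2)
print(lows, highs)
```

In a motion compensated version, the prediction and update steps would operate along motion trajectories rather than on co-located pixels; the lifting structure itself stays invertible in the same way.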


1.3 Scalable watermarking for image and video

Although a wide variety of watermarking schemes have been proposed to date, a fundamentally traditional concept is still followed in almost all of them, including the robust watermarking techniques proposed in the previous objectives. With the increased use of scalable coded media, there is a need for scalable watermarking. However, little work has been proposed so far towards scalable watermarking [34–38]. The final part of this thesis proposes a novel concept of scalable watermarking, as opposed to traditional watermarking schemes, by creating hierarchically nested joint distortion-robustness coding atoms. The main objective of this part is:

O.5 To propose scalable watermarking: The research outcomes of the objectives above are combined to model a novel scalable watermarking scheme that generates a distortion constrained, robustness scalable watermarked media code stream, which can be truncated at various points to generate the watermarked image or video with the desired distortion-robustness requirements.

1.4 Thesis organization

The rest of the thesis is structured in eight chapters, the contents of which are summarized below:

Chapter 2 provides a background overview of content adaptation and digital watermarking. The scalable coding structure, compression and application scenarios are outlined within the overview of content adaptation, followed by a general discussion of digital watermarking, including its properties, applications and attacks. Various wavelet transforms related to the proposed watermarking schemes are then discussed briefly to provide sufficient background for the work.

Chapter 3 presents the state-of-the-art analysis of the current literature on watermark-

ing techniques for content adaptation attacks, which includes wavelet domain image

and video watermarking, MCTF based video watermarking, compressed and uncom-

pressed domain watermarking algorithms etc.

Chapter 4 offers a content adaptation test bed framework (WEBCAM) for evaluating the robustness of wavelet based watermarking. Overall, the framework facilitates and presents a parametric study of various variables in wavelet based watermarking and proposes a watermark tweezing tool to balance the embedding distortion and the robustness to scalable coding-based content adaptation using the tools repository.

Chapter 5 presents a model for embedding distortion performance for wavelet based watermarking. The model derives the relationship between distortion performance metrics and the watermark embedding parameters, i.e., the wavelet coefficients. The related propositions are made separately for orthonormal and non-orthonormal wavelet bases.
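For orthonormal wavelet bases, one standard property that an embedding distortion model of this kind can build on is energy preservation (Parseval's relation): the sum of squared modifications made to the wavelet coefficients equals the sum of squared errors in the pixel domain. The sketch below illustrates this with a one-level orthonormal Haar transform on a 1D signal. It is an illustration only and not the model derived in Chapter 5; the signal values, the coefficient modifications in `delta` and all function names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)  # orthonormal Haar scaling factor


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length signal."""
    low = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return low, high


def haar_inverse(low, high):
    """Inverse of haar_forward."""
    x = []
    for l, h in zip(low, high):
        x.extend([(l + h) * S, (l - h) * S])
    return x


def mse(a, b):
    """Mean square error between two equal-length signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


signal = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
low, high = haar_forward(signal)

# Hypothetical watermark modifications applied to the low-pass coefficients.
delta = [1.5, -2.0, 0.5, 3.0]
marked_low = [c + d for c, d in zip(low, delta)]
marked = haar_inverse(marked_low, high)

pixel_mse = mse(signal, marked)
coeff_energy = sum(d * d for d in delta) / len(signal)
print(pixel_mse, coeff_energy)  # the two values agree up to rounding error
```

For non-orthonormal wavelets this equality no longer holds directly, which is consistent with the use of subband weighting parameters mentioned in the Statement of Originality.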

Chapter 6 addresses the issues related to quality scalable content adaptation and pro-

poses a new embedding criterion to ensure the robustness of the wavelet based image

watermarking schemes for such adaptations. The quality scalable image coding is mod-

eled using wavelet domain bit plane discarding to identify the effect of the quantization

and de-quantization on wavelet coefficients and the data embedded within such coeffi-

cients.

Chapter 7 proposes improved video watermarking schemes by offering a generalized motion compensated 2D+t+2D framework for watermark embedding. An improved MCTF is used, in which the update step is modified to follow the motion trajectory in the hierarchical temporal decomposition by using direct motion vector fields in the update step and implied motion vectors in the prediction step. The robust image watermarking schemes described in the previous chapter are then extended within this framework to propose video watermarking that is robust to content adaptations.

Chapter 8 proposes a novel concept of scalable blind watermarking to generate a dis-

tortion constrained robustness scalable watermarked code stream which consists of

hierarchically nested joint distortion robustness coding atoms. The code stream is gen-

erated using a new wavelet domain binary tree guided rules-based blind watermarking

algorithm. The code stream can be truncated at any distortion-robustness atom level to

generate the watermarked image with the desired distortion-robustness requirements.

Chapter 9 concludes this thesis by summarizing the research outcomes, i.e., analysis,

proposed models and new algorithms on robust watermarking to content adaptation

attacks. Novel contributions of this work are also highlighted here along with the

suggestions on new ideas for future research in this domain.


1.5 Publications and software releases

During various stages of the work, some of the research outcomes of this thesis have

been published or are currently under review in the form of software and refereed

publications, which are listed below:

Software Releases

S1. D. Bhowmik and C. Abhayaratne, Watermark Evaluation Bench for Content Adaptation Modes (WEBCAM) v2.0, http://svc.group.shef.ac.uk/webcam.html

Book Chapter

B1. D. Bhowmik and C. Abhayaratne, A generalised model for distortion perfor-

mance analysis of wavelet based watermarking, Lecture Notes in Computer Science,

Springer-Verlag, editor, Proceedings of International Workshop on Digital Watermark-

ing (IWDW ’08), vol. 5450, November 2008, Busan, South Korea, pp. 363-378.

Conference Proceedings

C9. D. Bhowmik and C. Abhayaratne, Distortion constrained robustness scalable image

watermarking. (In preparation)

C8. D. Bhowmik and C. Abhayaratne, Video watermarking using motion compensated

2D+t+2D filtering, in Proceedings of ACM Workshop on Multimedia and Security

(ACM MM&Sec 2010), September 2010, Rome, Italy, pp. 127-136.

C7. D. Bhowmik, C. Abhayaratne and M. Oakes, Robustness analysis of blind watermarking for quality scalable image compression, in Proceedings of 18th European Signal Processing Conference (EUSIPCO 2010), August 2010, Denmark, pp. 810-814.

C6. D. Bhowmik and C. Abhayaratne, The effect of quality scalable image compression

on robust watermarking, in Proceedings of Digital Signal Processing (DSP 2009), July

2009, Santorini, Greece, pp. 1-8.


C5. D. Bhowmik and C. Abhayaratne, Embedding distortion modeling for wavelet

based watermarking schemes, in Proceedings of Wavelet Applications in Industrial Pro-

cessing VI , SPIE Electronic Imaging 2009, vol. 7248, San Jose, CA, USA, January

2009, pp. 72480K (12 pages).

C4. D. Bhowmik and C. Abhayaratne, A framework for evaluating wavelet-based

watermarking for scalable coded digital item adaptation attacks, in Proceedings of

Wavelet Applications in Industrial Processing VI , SPIE Electronic Imaging 2009, vol.

7248, San Jose, CA, USA, January 2009, pp. 72480M (10 pages).

C3. D. Bhowmik and C. Abhayaratne, Evaluation of watermark robustness to JPEG2000

based content adaptation Attacks, in Proceedings of IET 5th International Conference

on Visual Information Engineering (VIE ’08), July 2008, Xian, China, pp. 789-794.

C2. D. Bhowmik and C. Abhayaratne, A watermark evaluation bench for content

adaptation modes, in Proceedings of IET 4th European Conference on Visual Media

Production (CVMP ’07), November 2007, London, UK, pp. 1.

C1. D. Bhowmik and G. C. K. Abhayaratne, Morphological wavelet domain image wa-

termarking, in Proceedings of 15th European Signal Processing Conference (EUSIPCO

2007), September 2007, Poznan, Poland, pp. 2539-2543.


Chapter 2

Background Overview

Scalable coding-based content adaptation and wavelet-based image and video water-

marking are two main components of this thesis. This chapter presents an overview of

scalable coding-based content adaptation, digital watermarking and their applications

and wavelet-based watermarking, of relevance to this thesis.

2.1 Scalable coding-based content adaptation

Universal media access (UMA) is an important requirement in modern multimedia usage chains. The UMA concept envisages seamless delivery of multimedia across heterogeneous networks and devices. This requires catering for different network bandwidths, transmission media, device capabilities, memory and power availability and, most importantly, usage preferences. This can only be achieved by intelligent content-agnostic adaptations based on scalable coded content representations. An example of scalable coding-based multimedia usage is shown in Figure 2.1.

2.1.1 Scalable coding modules

In scalable coding, the input media is coded in such a way that the main host server keeps bit streams that can be decoded to high quality, full resolution content. When the content needs to be delivered to a less capable display or via a lower bandwidth network, the bit stream is adapted at different nodes (N1, N2, ... , Nx, as shown in Figure 2.1) using different scaling parameters to match those requirements. At each node the adaptation parameters may be different and a new bit stream may be generated. Finally, the adapted bit streams are decoded using a universal decoder. The scalable coding-decoding process consists of three main modules [39]: encoder, extractor and decoder, as shown in Figure 2.2.

Figure 2.1: Universal multimedia usage scenarios using scalable coded content.

Figure 2.2: The scalable coding-decoding block diagram.

Figure 2.3: Quality scalable encoding process. [Diagram: the wavelet subbands (LL2, HL2, LH2, HH2, HL1, LH1, HH1) are arranged by bit plane, from the most significant (bit plane N) to the least significant (bit plane 0).]

The encoder module is responsible for producing a full resolution, highest quality

compressed bit stream from the original content. The bit stream generation normally

focuses on three main functionalities: quality scalability (Qi), spatial resolution scal-

ability (Si) and temporal resolution scalability (Ti : for video), where Qi, Si and Ti

represent the scaling parameters for different quality-spatio-temporal layers with the

layer index i. A bit stream descriptor is also generated along with the bit stream

describing the location of these layers in the scalable bit stream.

The extractor module is part of a cross media engine that adapts the bit streams following the MPEG-21 Part 7 DIA specifications. It truncates the scalable bit stream according to the usage context and produces the adapted bit stream together with its new description. The adapted bit stream is itself scalable and can be re-adapted at any subsequent network node by another extractor.

The decoder module provides a universal decoder that can decode any adapted bit stream to display the adapted content, which may have full spatial, quality and temporal resolution or any combination of lower resolutions.
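The interaction of the three modules can be sketched with a toy layered stream. The sketch below is an assumption-laden illustration and not the MPEG-21 DIA mechanism or any real bit stream syntax: the `Layer` class, the `extract` function and the convention that index 0 is the base layer are all hypothetical, and a Python list stands in for the bit stream and its descriptor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One quality-spatio-temporal layer of a hypothetical scalable stream."""
    q: int  # quality layer index (0 = base layer, by assumption)
    s: int  # spatial resolution layer index
    t: int  # temporal resolution layer index
    payload: bytes


def extract(stream, max_q, max_s, max_t):
    """Extractor: keep only the layers at or below the requested scaling parameters."""
    return [layer for layer in stream
            if layer.q <= max_q and layer.s <= max_s and layer.t <= max_t]


# Encoder output: the full resolution, highest quality stream (3 x 3 x 2 layers).
full = [Layer(q, s, t, bytes([q, s, t]))
        for q in range(3) for s in range(3) for t in range(2)]

# Adaptation at a first node, then re-adaptation of the result at a second node.
node1 = extract(full, max_q=2, max_s=1, max_t=1)
node2 = extract(node1, max_q=1, max_s=1, max_t=0)

# Re-adapting an adapted stream matches adapting the full stream directly.
print(node2 == extract(full, max_q=1, max_s=1, max_t=0))
```

The point of the design is that the extractor never decodes the payloads: it only selects layers, so the adapted stream remains scalable for later nodes.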

2.1.2 Scalable coding technique

The fundamental concept of spatio-temporal bit stream generation is shown in Figure 2.3, Figure 2.4 and Figure 2.5 for quality, spatial and temporal scalability, respectively, with respect to wavelet decomposition. For the quality scalable encoding process, the images or video frames are first wavelet decomposed and organized according to bit plane significance, as shown in Figure 2.3, where L and H correspond to low pass and high pass frequency decompositions, respectively. During quality scaling, bit values from the selected bit planes are considered in hierarchical order, from the most significant bit plane to the least significant bit plane, until the target bit rate is achieved. Similarly, for resolution scaling, low frequency subbands are selected hierarchically according to the scaling requirements (Figure 2.4). For temporal scaling, the video frames are temporally decomposed and then organized in a hierarchical order, as shown in Figure 2.5. The encoded bit streams are generated by combining these three scalable coding schemes and placing them in individual concatenated packets in such a way that the extractor can truncate the bit stream at any point to fulfill the scaling requirements. Finally, the decoder decodes the truncated bit stream and performs the inverse transform to reconstruct the scaled media.

Figure 2.4: Spatial resolution scalable encoding process. [Diagram: subbands LL2, HL2, LH2, HH2, LL1, HL1, LH1 and HH1, with quarter, half and full resolution levels.]

Figure 2.5: Temporal scalable encoding process. [Diagram: frames 1 to 4 are temporally decomposed into low (L1, L2) and high (H1, H2) frequency frames and ordered by temporal frame significance, from high to low.]
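The quality scaling described above can be sketched for a handful of integer wavelet coefficients: each magnitude is split into bit planes, and a coefficient is rebuilt from only the most significant planes that survive truncation. This is a minimal sketch under stated assumptions, not a JPEG 2000 implementation: the coefficient values, the sign-magnitude representation with 8 planes and the function names are hypothetical, and a real decoder may reconstruct at a different point within the quantization interval.

```python
def bit_planes(coeff, num_planes):
    """Bits of |coeff|, ordered from the most to the least significant plane."""
    magnitude = abs(coeff)
    return [(magnitude >> p) & 1 for p in range(num_planes - 1, -1, -1)]


def decode_top_planes(coeff, num_planes, kept):
    """Rebuild a coefficient from only its `kept` most significant bit planes."""
    bits = bit_planes(coeff, num_planes)
    magnitude = 0
    for i, bit in enumerate(bits[:kept]):
        magnitude |= bit << (num_planes - 1 - i)
    return -magnitude if coeff < 0 else magnitude


# Hypothetical wavelet coefficients stored with 8 magnitude bit planes.
coeffs = [93, -37, 12, -5]
for kept in (8, 6, 4):
    decoded = [decode_top_planes(c, 8, kept) for c in coeffs]
    print(f"planes kept: {kept}, decoded: {decoded}")
```

Keeping fewer planes lowers the bit rate but maps small coefficients to zero, which is why data embedded in the less significant planes is vulnerable to this kind of content adaptation.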


Watermarking Applications

Sl No. Name: Description

1 Broadcast monitoring: Passive monitoring by the automatic watermark detection of broadcasted watermarked media.

2 Copyright identification: Resolving copyright issues of digital media by using watermark information as copyright data.

3 Content authentication: Authentication of original art work, performance and protection against digital forgery.

4 Access control: Access control applications, such as Pay-TV.

5 Copy control: Disabling copy of CD / DVD etc. by watermarked permission.

6 Packaging and tracking: Transaction tracking and protection against forged consumable items, including pharmaceutical products, by embedding watermark on packaging.

7 Medical record authentication: Authentication of digitally preserved patient's medical record, including blood sample, X-ray, ECG etc.

8 Insurance / Banking document authentication: Digital authentication of insurance claim, banking, financial, mortgage and corporate documents.

9 Media piracy control: Tracking of the source of media piracy.

10 Ownership identification: Supporting legitimate claim, such as royalty, by the media owner.

11 Transaction tracking: Tracking of media ownership in a buyer-seller scenario.

12 Meta-data hiding: Hiding meta-data within the media instead of a big header.

13 Video summary creation: Instant retrieval of video summary by embedding the summary within the host video.

14 Video hosting authentication: Piracy control by video authentication at video hosting servers, including YouTube, Megavideo etc.

Figure 2.6: Watermarking applications.

2.2 Digital watermarking

2.2.1 Definition, properties, applications and attacks

By definition, a digital watermark is copyright or author identification information that is embedded directly in the digital media in such a way that it is imperceptible, robust and secure. Watermarking research has matured considerably since its inception in the mid-nineties and offers digital protection to a wide spectrum of applications, as shown in Figure 2.6. It comprises elements from a variety of disciplines, including image processing, video processing, telecommunication, computer science, cryptography, remote sensing and geographical information systems. Watermarking systems are often characterized by a set of common properties, and the importance of each property depends on the application requirements. A list of such properties and corresponding example applications [40,41] is shown in Figure 2.7, where the last column of the figure gives the associated application numbers from Figure 2.6.

Based on the embedding method, watermarking techniques can be categorized [42] as shown in Figure 2.8. The watermark embedding can be done in the spatial domain or in the frequency domain. The latter has been the more popular choice, as frequency decomposition characterizes the host media in a way that reflects the characteristics of the human eye and its perception of the media. Therefore, frequency domain watermarking can provide better insight into how to reduce the embedding distortion or increase the robustness [30].

General Properties of Digital Watermarking

Properties: Description. Applications

Imperceptibility: The watermark should not noticeably distort or degrade the host data in order to preserve the quality of the marked document. Applications: 1, 2, 3, 4, 14.

Robustness: To measure robustness the watermark must be reliably detectable against signal processing schemes including data compression. Applications: 1, 2, 3, 4, 14.

Fragility: These kinds of watermark are embedded in host data in such a way that they do not survive in the case of any modification, even copying. Applications: 7, 8.

Tamper-resistance: The tamper-resistance property is focused on the intentional attacks in contrast to robustness. Applications: 3, 5, 9, 10, 11, 14.

False positive rate: The probability of identifying an un-watermarked piece of data as containing a watermark by a detector is called the false positive rate. Applications: 6.

Data payload: The amount of information present in watermarked media is called data payload. Applications: 12, 13.

Figure 2.7: Watermarking properties and associated applications.

Now, depending on the type of host media, watermarking can be divided into four dif-

ferent categories: audio, image and video watermarking. Again, based on the human

perception the watermarking schemes can be categorized as visible or imperceptible

(invisible) watermarking and the latter can also be categorized as robust, fragile or

semi-fragile watermarking. In case of a robust watermarking scheme, the watermark

is expected to be sustained even after a compression or any other intentional attack,

whereas in the case of a fragile scheme [43] the watermark information is usually de-

stroyed to any alteration or attack to the media, in order to authenticate the image

integrity. A semi-fragile scheme [44] represents properties from both the above men-

tioned categories and the watermark information is robust to certain type of attacks

while fragile to other type of attacks.

The watermark represents the owner’s identity. The selection of the watermark is therefore considered important and varies according to the application requirements. Early watermarking schemes often used a pseudo-random number sequence as the watermark, and the authenticity of the media was examined by the presence or absence of the watermark. In recent literature, a message or logo based watermark [15] has been preferred, in which case authentication is done by extracting the hidden message or logo to identify the legitimate owner. Figure 2.9 shows the different types of watermark used in this field.

[Tree diagram: digital watermarking classified by embedding domain (spatial domain, frequency domain), type of host media (text, image, audio, video) and human perception (visible watermarking; imperceptible watermarking, subdivided into robust, fragile and semi-fragile).]

Figure 2.8: Types of watermarking techniques.

[Tree diagram: watermark selection divided into pseudo-random sequence (natural number sequence, binary sequence) and text / logo / image (binary logo, gray scale logo, colour logo).]

Figure 2.9: Watermark types.

The main requirements of watermarking schemes are either 1) to retain the watermark information after any intentional attack or natural image/video processing operation, or 2) to identify any tampering with the target media (fragile watermarking). Any process that modifies the host media in a way that affects the watermark information is called an attack on the watermark. The various types of attack can be grouped as follows: 1) signal processing, 2) geometric, 3) enhancement, 4) printing-scanning-capturing, 5) oracle, 6) chrominance and 7) transcoding attacks. The attack characterization with respect to image and video watermarking and the related applications is shown in Figure 2.10.

[Chart: watermarking attacks set against applications. Attacks listed for image: signal processing (JPEG, JPEG 2000); geometric (horizontal flip, rotation, cropping, scaling, row / column removal); low pass filtering, sharpening, histogram modification, gamma correction, colour quantisation, restoration and noise addition; enhancement; printing-scanning and printing-capturing; oracle attack; chrominance attack; transcoding. Attacks listed for video: signal processing (Motion JPEG 2000, MPEG-2, MPEG-4, MC-EZBC, H.264/AVC, H.264/SVC, H.264/MVC, linear / non-linear adaptive filtering); desynchronisation (cropping, row / column removal); geometric. Applications listed: video hosting authentication, meta-data hiding, video editing, video summary, insurance / banking document, intentional attacks, packaging / tracking, broadcast monitoring, copy control, communication network adaptation, display device adaptation, image editing and medical record. Fragile and semi-fragile watermarking are also marked.]

Figure 2.10: Attack characterization.

Watermarking schemes are in general evaluated in terms of imperceptibility, robustness or capacity, while little research has been reported in the literature [45, 46] on the security of watermarks. The security of a watermark can be defined as the ability to conceal the watermark information in such a way that it remains secret to unauthorized users. The security of watermarking schemes is usually implemented using two different approaches [47]:

• Asymmetric watermarking, which uses two different keys for watermark embedding and detection, and

• Zero-knowledge watermark detection using cryptographic techniques, where the watermark detection process is substituted by a cryptographic protocol.

Cryptographic scrambling of the watermark logo is also used to secure the watermark [15], in addition to other security measures such as key based coefficient selection and random filter parameter selection. These are particularly useful when the attacker has access to the watermark detector. The security of the watermark is intended to make the scheme robust against intentional attacks, whereas this thesis considers watermarking robustness to natural signal processing attacks such as compression. Therefore the security aspects of the watermarking techniques are not analyzed further.

Scalable content adaptation, which compresses image and video during the scaling operation, is considered a type of signal processing attack on watermarking. Considering the nature of the UMA application scenario, watermarking schemes for scalable coded media focus on two main properties: 1) imperceptibility and 2) robustness, which are complementary to each other. This thesis provides an insight into the effects of content adaptation on watermarking within scalable coded media and suggests robust watermarking techniques accordingly.

2.2.2 Watermarking process

The watermarking procedure, in its basic form, consists of two main processes: 1) embedding and 2) extraction and authentication. For simplicity, these processes are described here with reference to image watermarking.

2.2.2.1 Embedding

This process inserts or embeds the watermark information within the host image by modifying all or selected pixel values (spatial domain) or coefficients (frequency domain) in such a way that the watermark is imperceptible to the human eye, which is achieved by minimizing the embedding distortion to the host image. The system block for the embedding process is shown in Figure 2.11 and can be expressed as:

\[ I' = \zeta(I, W), \tag{2.1} \]

where I′ is the watermarked image, I is the original host image, W is the watermark information and ζ() is the embedding function. The embedding function can be further divided into sub-processes: 1) forward transform (for the frequency domain), 2) pixel / coefficient selection, 3) embedding method (additive, multiplicative, quantization etc.) and 4) inverse transform.

[Block diagram: the original image (I) and the watermark (W) enter the embedding block ζ(), which outputs the watermarked image (I′).]

Figure 2.11: Watermark embedding process.

Finally, the performance of the watermark embedding is measured by comparing the watermarked image (I′) with the original unmarked image (I) using various metrics: 1) peak signal to noise ratio (PSNR), 2) weighted PSNR (wPSNR) [48], 3) structural similarity measure (SSIM) [49], 4) just noticeable difference (JND) [50] and 5) subjective quality measurement [51].

PSNR: This is one of the most commonly used visual quality metrics. It is based on the root mean square error (RMSE) between the two images of dimension X × Y, as in Eq. (2.2) and Eq. (2.3).

\[ \mathrm{PSNR} = 20 \log_{10}\left(\frac{255}{\mathrm{RMSE}}\right) \text{ dB}, \tag{2.2} \]

\[ \mathrm{RMSE} = \sqrt{\frac{1}{X \times Y} \sum_{m=0}^{X-1} \sum_{n=0}^{Y-1} \left(I(m,n) - I'(m,n)\right)^{2}}. \tag{2.3} \]
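Eq. (2.2) and Eq. (2.3) can be sketched in a few lines of plain Python. This is a minimal illustration only: it assumes 8-bit grayscale images stored as nested lists, and the function names `rmse` and `psnr` are chosen here for illustration rather than taken from the thesis.

```python
import math

def rmse(original, marked):
    """Root mean square error between two equally sized 2D images, Eq. (2.3)."""
    rows, cols = len(original), len(original[0])
    total = 0.0
    for m in range(rows):
        for n in range(cols):
            diff = original[m][n] - marked[m][n]
            total += diff * diff
    return math.sqrt(total / (rows * cols))

def psnr(original, marked, peak=255.0):
    """PSNR in dB, Eq. (2.2); identical images give infinity."""
    error = rmse(original, marked)
    if error == 0.0:
        return math.inf
    return 20.0 * math.log10(peak / error)

# Toy 2x2 host image and a marked copy in which every pixel moved by one level.
host = [[100, 102], [98, 101]]
marked = [[101, 101], [99, 100]]
print(psnr(host, marked))  # RMSE is 1, so this is 20*log10(255), about 48.13 dB
```

The peak value of 255 follows Eq. (2.2); other bit depths would need a different `peak`.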

wPSNR: In contrast to the spatial domain error measurement of the previous metric, this metric measures PSNR in the wavelet transform domain, with weighting factors at the different frequency decomposition levels. The host and processed images are first wavelet decomposed, and the squared error is then computed in every subband. Finally, wPSNR is calculated from the cumulative squared error using a weighting parameter for each subband. The weights of the various subbands are adjusted in such a way that wPSNR has the highest correlation with the subjective score.

SSIM: This quality metric assumes that the human visual system is highly adapted to extracting structural information from a scene. Unlike PSNR, where the average error between two images is taken into consideration, SSIM bases the quality assessment on the degradation of structural information. The structural information in the scene is calculated using local luminance and contrast rather than average luminance and contrast.

JND: In this metric the host and test images are DCT transformed and just noticeable differences are measured using thresholds. The thresholds are decided based on 1) luminance masking and 2) contrast masking of the transformed images. The threshold for luminance masking relies on the mean luminance of the local image region, whereas the contrast masking is calculated within a block for a particular DCT coefficient using a visual masking algorithm.

Subjective: Although various objective metrics have been proposed to measure visual quality, often by modeling the human visual system or subjective visual tests, the subjective test offers the best visual quality measurement. Subjective test procedures are recommended by the ITU [51], which defines the specification of the screen, the luminance of the test room, the distance of the observer from the screen, the scoring techniques and the test types, such as the double stimulus continuous quality test (DSCQT) or the double stimulus impairment scale test (DSIST). The tests are carried out with multiple viewers, and the mean opinion score (MOS) represents the visual quality of the test image. However, subjective tests are often time consuming and difficult to perform, and hence researchers prefer objective metrics to measure visual quality.

Among these metrics, due to its simplicity, PSNR is the most common method of evaluating the embedding performance in watermarking research. It is also observed that most of the metrics behave in a similar fashion when the embedding distortion, measured by PSNR, is 35 dB or above. Therefore, this thesis also uses PSNR as the visual quality metric in the following chapters.

2.2.2.2 Extraction and authentication

As the name suggests, this process consists of two sub-processes: 1) extraction of the watermark and 2) authentication of the extracted watermark. The watermark extraction follows the reverse of the embedding algorithm, with a similar input parameter set. Based on the extraction criteria, any watermarking method can be categorized as 1) non-blind or 2) blind. In the first category, a copy of the original un-watermarked image is required during extraction, whereas in the latter the watermark is extracted from the test image itself. The extraction process can be written in simplified form as:

\[ W' = \varpi(I', I), \tag{2.4} \]

where W′ is the extracted watermark, I′ is the test image, I is the original image and ϖ() is the extraction function.

Once the watermark is extracted from the test image, authentication is performed by comparing it with the original input watermark information. Common authentication methods measure the closeness between the two in a vector space, by calculating the similarity correlation or the Hamming distance. A complete system diagram of the extraction and authentication process is shown in Figure 2.12.

[Block diagram: the test image (I′) and, for the non-blind type, the original image (I) enter the extraction block ϖ(), which outputs the extracted watermark (W′); the authentication block compares it with the original watermark (W) to produce the watermark detection decision.]

Figure 2.12: Watermark extraction and authentication process.
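The two closeness measures named above can be sketched as follows. This is an illustrative sketch under stated assumptions: the watermark is taken to be a binary sequence, the Hamming distance is normalised by the watermark length, and the decision threshold of 0.2 in `is_authentic` is a hypothetical value, not one given in the text.

```python
import math

def hamming_distance(original_bits, extracted_bits):
    """Fraction of watermark bits that differ (0.0 means identical)."""
    if len(original_bits) != len(extracted_bits):
        raise ValueError("watermarks must have the same length")
    mismatches = sum(a != b for a, b in zip(original_bits, extracted_bits))
    return mismatches / len(original_bits)

def similarity_correlation(original, extracted):
    """Normalised correlation between two watermark sequences."""
    dot = sum(a * b for a, b in zip(original, extracted))
    norm = math.sqrt(sum(a * a for a in original) * sum(b * b for b in extracted))
    return dot / norm if norm else 0.0

def is_authentic(original_bits, extracted_bits, threshold=0.2):
    """Detection decision; the threshold is an assumed, application-specific value."""
    return hamming_distance(original_bits, extracted_bits) <= threshold

w_original = [1, 0, 1, 1, 0, 0, 1, 0]
w_extracted = [1, 0, 1, 0, 0, 0, 1, 0]  # one bit flipped by an attack
print(hamming_distance(w_original, w_extracted), is_authentic(w_original, w_extracted))
```

A lower Hamming distance or a higher correlation indicates a closer match between the extracted and original watermarks.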

2.2.3 Wavelet-based watermarking

Embedding watermark information by directly modifying pixel values can be regarded as the simplest form of watermarking. Over time, however, frequency domain watermarking schemes have received more attention due to their ability to decompose the media information into various frequency bands. Research conducted by Cox et al. [30] indicates that, in order to increase robustness and reliability, the watermark should be embedded in the significant part of the host media, such as its low frequency components. Frequency domain watermarking schemes have been built on different frequency domain analyses, namely the discrete Fourier transform (DFT) [52], the fractal transform [53], the discrete cosine transform (DCT) [54–56] and the discrete wavelet transform (DWT) [5–11]. Due to its efficient multi-resolution spatio-frequency representation of signals, the DWT has become the main transform used in image watermarking.

Extending image watermarking to video is the simplest option for any video watermarking scheme. Frame-by-frame video watermarking [27, 30] and 3D wavelet based video watermarking schemes [9, 32] are available in the literature. However, a direct extension of image watermarking schemes without consideration of motion produces flicker and other motion related mismatches. Watermarking algorithms combined with MCTF, which decomposes the motion information, provide a better solution to this problem [57].

Therefore, this thesis focuses on wavelet based image watermarking research and extends the outcomes to the video watermarking scenario using a motion compensated video decomposition. Before continuing with the wavelet based watermarking schemes and MCTF based video watermarking algorithms, the background of the wavelet transform, various wavelet implementations and finally the MCTF concept are discussed.

2.2.4 Wavelet transform

Wavelets can be described as a class of functions used to localize a time domain input signal in both space and scale. A family of wavelets can be developed by defining the mother wavelet, Ψ(t), which is confined to a finite interval. The family members, often referred to as daughter wavelets, can be defined as:

\[ \Psi_{(a,b)}(t) = \frac{1}{\sqrt{a}} \Psi\left(\frac{t - b}{a}\right), \tag{2.5} \]

where a > 1 is the change of scale and b ∈ R is the translation in time [58,59]. A continuous input signal f(t) can therefore be represented in the wavelet transform as a linear combination of daughter wavelets Ψ(a,b)(t), and the corresponding wavelet coefficients f(a,b) can be defined as:

\[ f_{(a,b)} = \int_{-\alpha}^{\alpha} \Psi_{(a,b)} f(t)\, dt, \tag{2.6} \]

\[ \phantom{f_{(a,b)}} = \langle \Psi_{(a,b)}, f(t) \rangle. \tag{2.7} \]

The Continuous Wavelet Transform (CWT) defined above can be extended to the Discrete Wavelet Transform (DWT), which is used in image and video coding applications, including watermarking. In the case of the DWT, the wavelet function (Ψ(a,b) : a, b ∈ Z) usually follows dyadic translation (by powers of 2) and dilation in Hilbert space, and Eq. (2.5) is modified to:

\[ \Psi_{(a,b)}(t) = 2^{a/2}\, \Psi(2^{a} t - b). \tag{2.8} \]
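As a small worked instance of Eq. (2.8), assume the Haar mother wavelet. This choice is made here purely for illustration and is not fixed by the text at this point. For a = 1, the two daughter wavelets with b = 0 and b = 1 are half-width copies of Ψ(t) scaled by √2, supported on [0, 1/2) and [1/2, 1) respectively:

```latex
\Psi(t) =
\begin{cases}
 1,  & 0 \le t < \tfrac{1}{2}, \\
 -1, & \tfrac{1}{2} \le t < 1, \\
 0,  & \text{otherwise},
\end{cases}
\qquad
\Psi_{(1,0)}(t) = \sqrt{2}\,\Psi(2t),
\qquad
\Psi_{(1,1)}(t) = \sqrt{2}\,\Psi(2t - 1).
```

The factor 2^{a/2} keeps the energy of each daughter wavelet equal to that of the mother wavelet, since halving the support is compensated by doubling the squared amplitude.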

The DWT is implemented primarily by two different methods: 1) the filter bank approach and 2) the lifting based approach. The first is more widely used across many applications, whereas the latter is more popular in recent image and video coding schemes.

2.2.4.1 Filter bank approach

This approach consists of two filter banks, one each for the analysis (forward transform) and the synthesis (inverse transform), as shown in Figure 2.13. During the analysis, the input signal is passed through two separate channels, using a high pass filter (g) and a low pass filter (h), followed by a down sampling operation by a factor of 2 in each channel. The low pass filtered data contains the coarse grained information, while the high pass filtered data retains the fine grained or detailed information of the input signal. To reconstruct the signal, the transformed coefficients are first interpolated by an up sampling operation with a factor of 2 and then convolved with the synthesis filters g′ and h′. The filter coefficients of g′ and h′ are obtained from the corresponding analysis filters g and h, respectively, so as to eliminate aliasing.

[Block diagram: in the analysis stage the input X is filtered by h and g and down sampled by 2 to give the LP and HP outputs; in the synthesis stage these are up sampled by 2, filtered by h′ and g′ and summed to give the reconstructed signal X~.]

Figure 2.13: The filter bank approach for DWT.

[Block diagram: in the analysis stage the input X is split into even (Xe) and odd (Xo) samples, and the predict (P) and update (U) steps produce the HP and LP outputs; the synthesis stage applies U′ and P′ and merges the samples to give the reconstructed signal X~.]

Figure 2.14: The lifting approach for DWT.
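The analysis and synthesis stages of the filter bank approach can be sketched for the two-tap Haar filters. This is a minimal sketch, assuming the orthonormal Haar kernel and an even-length input; the function names `analyse` and `synthesise` are illustrative only. Longer kernels would need proper convolution and boundary handling, which is omitted here.

```python
import math

S = 1.0 / math.sqrt(2.0)
H_LOW = (S, S)     # analysis low pass filter h (Haar)
G_HIGH = (S, -S)   # analysis high pass filter g (Haar)

def analyse(signal):
    """Filter with h and g, keeping every second output (down sampling by 2)."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    low = [H_LOW[0] * x0 + H_LOW[1] * x1 for x0, x1 in pairs]
    high = [G_HIGH[0] * x0 + G_HIGH[1] * x1 for x0, x1 in pairs]
    return low, high

def synthesise(low, high):
    """Up sample by 2 and filter with h' and g'; for Haar these mirror h and g."""
    out = []
    for l, h in zip(low, high):
        out.append(S * l + S * h)   # even output sample
        out.append(S * l - S * h)   # odd output sample
    return out

x = [4.0, 6.0, 10.0, 12.0]
low, high = analyse(x)          # coarse and detail coefficients
x_rec = synthesise(low, high)   # matches x up to floating point rounding
```

The low pass output holds scaled pairwise sums (coarse information) and the high pass output holds scaled pairwise differences (detail), and the synthesis stage recovers the input.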

2.2.4.2 Lifting based approach

In this alternative DWT implementation, the input signal is first split into odd (o) and even (e) samples. The predict (P) and update (U) lifting functions are then applied sequentially to the odd and even samples, respectively, to obtain the wavelet coefficients [60], as shown in Figure 2.14. The predict function approximates the odd samples, and the difference between the approximation and the odd samples forms the detailed information of the input signal (equivalent to the high pass subset in the filter bank). The update step then modifies the even samples using the output of the prediction step and generates the average of the input signal (low pass subset). The inverse transform (synthesis) of the lifting scheme mirrors the forward transform and is followed by a merging step to reconstruct the input signal.
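The split, predict, update and merge steps can be sketched with an integer-to-integer Haar lifting transform. This is an illustrative sketch, assuming the simplest predictor (each odd sample is predicted by its even neighbour) and an even-length integer input; the function names are hypothetical.

```python
def lifting_forward(signal):
    """One level of integer Haar lifting: returns (low pass, high pass)."""
    even, odd = signal[0::2], signal[1::2]              # split
    high = [o - e for o, e in zip(odd, even)]           # predict: detail = odd - P(even)
    low = [e + (d >> 1) for e, d in zip(even, high)]    # update: even + floor(detail / 2)
    return low, high

def lifting_inverse(low, high):
    """Mirror of the forward transform followed by the merging step."""
    even = [s - (d >> 1) for s, d in zip(low, high)]    # undo update
    odd = [d + e for d, e in zip(high, even)]           # undo predict
    merged = []
    for e, o in zip(even, odd):                         # merge even and odd samples
        merged.extend((e, o))
    return merged

samples = [4, 6, 10, 13]
low, high = lifting_forward(samples)   # low = [5, 11], high = [2, 3]
restored = lifting_inverse(low, high)  # equals samples exactly
```

Because every step is undone exactly by its mirror, the integer version reconstructs the input without rounding error, which is one reason lifting suits lossless coding.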


2.2.5 2D wavelet

The wavelet decomposition of an image requires a 2D transform, which is achieved by performing the 1D DWT separately on the rows and columns of the 2D signal. At each stage of the transform, one low pass (L) and one high pass (H) coefficient subset are generated; as a result, a one level 2D wavelet transform creates four subbands, namely LL, LH, HL and HH, as shown in Figure 2.15 a). The LL subband represents the original image at half resolution and contains smooth spatial data with high spatial correlation. The HH subband contains the noise and edge information, while the HL and LH subbands consist of vertically and horizontally oriented high frequency details, respectively. The 2D wavelet transform can be applied repeatedly to the LL subband of the previous decomposition to create a hierarchy of wavelet coefficients. An example of a 2 level 2D wavelet decomposition is shown in Figure 2.15 b).
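The separable row and column processing can be sketched with an averaging Haar kernel. This is a minimal sketch under stated assumptions: image dimensions are even, the kernel is the unnormalised average/difference Haar pair, and the subband naming (first letter for the row pass, second for the column pass) is an assumed convention, since conventions differ between implementations.

```python
def haar_1d(values):
    """One level 1D Haar: pairwise averages (low) and half differences (high)."""
    low = [(values[i] + values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    return low, high

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def column_pass(block):
    """Apply the 1D transform down every column of a 2D block."""
    lows, highs = [], []
    for column in transpose(block):
        low, high = haar_1d(column)
        lows.append(low)
        highs.append(high)
    return transpose(lows), transpose(highs)

def haar_2d(image):
    """One level 2D decomposition: rows first, then columns."""
    row_low, row_high = [], []
    for row in image:
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)
    ll, lh = column_pass(row_low)    # assumed naming: row filter, then column filter
    hl, hh = column_pass(row_high)
    return ll, lh, hl, hh

flat = [[8, 8, 8, 8] for _ in range(4)]   # constant image: no detail anywhere
ll, lh, hl, hh = haar_2d(flat)
```

Calling `haar_2d` again on `ll` gives the second decomposition level described above.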

Based on the orthogonality property, wavelets are often designed as 1) orthogonal kernels (Haar, Daubechies etc.) or 2) bi-orthogonal kernels (9/7, 5/3 etc.). These wavelets retain the linearity property, while non-linear wavelets, such as Morphological Haar (M-Haar) and Median lifting on quincunx sampling (M-QC), are obtained by replacing the linear operations in the lifting steps, such as weighted averaging, with non-linear operations. The replacement can be applied to only the lifting step(s) affecting the low pass subband (known as the update step) [61], to only the lifting step(s) affecting the high pass subbands (known as the prediction step) [62], or to both types of lifting step [63].

[Block diagram: a) filter bank based 2D wavelet decomposition, in which the image I is filtered by G0 and G1 and down sampled by 2 along rows and columns to give the LL, LH, HL and HH subbands; b) 2 level decomposition showing the subbands LL2, HL2, LH2, HH2, HL1, LH1 and HH1.]

Figure 2.15: 2D wavelet transform operation.

[Diagram: a reference frame and a current frame, each divided into 16 numbered blocks, illustrating block matching between the two frames.]

Figure 2.16: The block based motion estimation.

2.2.6 Motion compensated temporal filtering

In the case of video decomposition, a temporal dimension has to be added, which can be achieved by extending the previously discussed 2D wavelet transform to a 3D transform. For video, however, object motion between frames is important, and translational motion information in the temporal direction therefore has to be incorporated during the temporal decomposition. Motion Compensated Temporal Filtering (MCTF) [64] provides such a temporal decomposition using block based motion estimation, as shown in Figure 2.16. The 1D lifting based wavelet transform can also incorporate the motion model within its prediction and update steps to provide the MCTF decomposition. MCTF and the related wavelet decomposition are discussed further in Chapter 7.
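The block based motion estimation step can be sketched as an exhaustive block matching search. This is an illustrative sketch only: the sum of absolute differences (SAD) cost, the full search strategy and the function names are assumptions made here, since the text does not specify the matching criterion or the search method.

```python
def sad(reference, current, ref_y, ref_x, cur_y, cur_x, size):
    """Sum of absolute differences between a current block and a reference block."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(current[cur_y + dy][cur_x + dx] - reference[ref_y + dy][ref_x + dx])
    return total

def best_motion_vector(reference, current, cur_y, cur_x, size, search):
    """Full search over displacements within +/- search pixels; returns ((dy, dx), cost)."""
    height, width = len(reference), len(reference[0])
    best = (0, 0)
    best_cost = sad(reference, current, cur_y, cur_x, cur_y, cur_x, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ref_y, ref_x = cur_y + dy, cur_x + dx
            if 0 <= ref_y <= height - size and 0 <= ref_x <= width - size:
                cost = sad(reference, current, ref_y, ref_x, cur_y, cur_x, size)
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
    return best, best_cost

def frame_with_block(top, left, size=6, value=200):
    """Black frame containing one bright 2x2 object at (top, left)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = value
    return frame

reference = frame_with_block(1, 1)
current = frame_with_block(2, 3)   # the object moved down by 1 and right by 2
vector, cost = best_motion_vector(reference, current, 2, 3, size=2, search=2)
```

The returned vector points from the block in the current frame back to its best match in the reference frame, which is the displacement a motion compensated prediction step would use.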

2.3 Conclusions

In this chapter, an overview and the background of scalable coding based content adaptation and digital watermarking are presented. Firstly, content adaptation is described according to the MPEG-21 Part 7 DIA specification. Then the basic properties and applications of digital watermarking are outlined, along with the probable attacks on watermarking, including scalable compression in content adaptation. Wavelet-domain watermarking schemes are selected as the basis for proposing watermarking techniques that are robust against such content adaptation. In the next chapter, the state of the art in wavelet based image and video watermarking schemes is discussed and analyzed.

Chapter 3

State-of-the-art

Image and video watermarking can be performed either in the pixel domain or in the frequency domain. Frequency domain watermarking schemes, and more recently wavelet domain schemes, are preferred due to the increased robustness and reliability gained by choosing important frequency components. At the same time, scalable image and video coding schemes, such as JPEG 2000, Motion JPEG 2000, MC-EZBC and H.264/SVC, use wavelets and related transforms for the progressive transmission of media data with low bit-rate, resolution, quality or temporal scalability. An increased interest in wavelet domain watermarking is therefore noticeable in the recent literature. Due to its multi-resolution analysis and compliance with image and video coding schemes, the DWT is a natural choice for the research in this thesis on robust watermarking for scalable coded image and video. This chapter discusses the state-of-the-art research published in this domain.

3.1 Image watermarking

3.1.1 Wavelet-based image watermarking

Due to its ability to provide an efficient multi-resolution spatio-frequency representation of signals, the DWT has become the major transform for spread spectrum watermarking. Wavelet domain watermarking algorithms often share a common model. Based on the embedding methodology, wavelet-based image watermarking can be categorized into two main classes: uncompressed domain algorithms and joint compression-watermarking algorithms.

[Block diagram with the blocks: host image, forward DWT, watermark embedding (with the watermark as input), inverse DWT, scalable coding, content adaptation, decoding, forward DWT, watermark extraction and authentication.]

Figure 3.1: Uncompressed domain image watermarking and content adaptation attack.

3.1.1.1 Uncompressed domain watermarking algorithms

Watermark embedding is performed independently of, and prior to, compression. Many algorithms of this type are presented in the literature [5–8,10–18, 65, 66]. A system block diagram in the context of scalable coding-based content adaptation is shown in Figure 3.1. The major steps for embedding are the forward DWT (FDWT) and coefficient modification, followed by the inverse DWT (IDWT). The content is then scalable coded and may be adapted during usage. Watermark authentication comprises the FDWT, recovery of the watermark by blind or non-blind extraction, and comparison with the original watermark.

3.1.1.2 Joint compression-watermarking algorithms

As scalable image coding is mainly based on the DWT, joint compression-watermarking algorithms [19–25] incorporated into JPEG 2000 are also becoming a more efficient way of image watermarking. A general system block diagram is shown in Figure 3.2. In most cases the watermark is embedded by modifying the quantized wavelet coefficients, and the watermark extraction is done during the decoding operation. The main difference between this type of algorithm and the previous type is that here the embedding DWT kernel and the compression DWT kernel are the same. The use of the JPEG 2000 lossless mode in a joint watermarking-compression scheme results in an uncompressed-domain watermarking algorithm that uses the same DWT kernel for both compression and watermark embedding.

[Block diagram with the blocks: host image, forward DWT, quantisation, watermark embedding, entropy coding, bit-stream generator, content adaptation, bit-stream analyser, entropy decoding, watermark extraction, authentication, de-quantisation, inverse DWT and decoded image.]

Figure 3.2: Joint compression-watermarking and content adaptation attack.

3.1.2 Dissection of wavelet-based image watermarking algorithms

In both algorithm types, the watermark embedding algorithm considers different options for the choice of embedding subbands, the selection of embedding coefficients and the modification methodology. For the uncompressed domain algorithms, the choice of wavelet kernel is regarded as a design parameter in addition to these three. In this section, well-known wavelet-based algorithms are dissected in terms of these four parameters to collect the different options used so far in current watermarking algorithms. The options currently in use are listed as follows:

3.1.2.1 Wavelet kernel

Early work on wavelet-based watermarking mainly used Haar or other orthogonal wavelets of the Daubechies family [5–15]. With the success of biorthogonal wavelets in image coding, these have also been used in watermarking algorithms [16–18]. Further, joint compression-watermarking algorithms can also be regarded as biorthogonal wavelet domain watermarking. With the introduction of lifting-based wavelet design, the lifting-based integer-to-integer Haar transform [10] and lifting-based non-linear wavelets [16] have been used in watermarking algorithms. A recent work [67] presents the effect of different wavelet kernels on the embedding distortion performance and on the robustness to scalable coding-based content adaptation attacks.

3.1.2.2 Subband

Wavelet-based watermarking algorithms also vary in the number of wavelet decomposition levels used and the subbands chosen for watermark embedding. Algorithms have used two [5, 6, 10–12], three [16–18] and four [8, 14, 15] levels of wavelet decomposition. Joint compression-watermarking algorithms use the same number of decomposition levels as the compression algorithm. The choice of subbands for watermark embedding is often driven by the imperceptibility and robustness criteria. Algorithms intended to meet low embedding distortion and imperceptibility requirements use high frequency subbands for embedding [6,7,10–15,20]. On the other hand, algorithms designed to achieve high robustness against compression use low frequency subbands [5, 8, 16]. Finally, algorithms aiming to balance these two criteria use all subbands, resulting in spread spectrum embedding [17–19,22].

3.1.2.3 Hosting coefficient

The selection of wavelet coefficients to host the watermark can be classified into three methods: choosing all coefficients in a subband [11–15,19,21]; using a threshold based on their magnitude significance [10,17,18,20] or the just noticeable difference (JND) [16]; and selecting the median of a 3×1 non-overlapping window, which can lie within the same subband (intra-band) [5, 22] or span the three high frequency subbands at the same decomposition level (inter-band) [6, 7]. Some of the all-coefficients-based algorithms use a Human Visual System (HVS)-based mask [13, 14] or a fusion rule-based mask [15] to refine the selection of host coefficients, or a key-based random sequence to order the host coefficients [8].
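Two of the selection methods above, magnitude thresholding and non-overlapping 3×1 windows, can be sketched as follows. This is a simplified illustration with hypothetical function names and an arbitrary threshold value; it does not reproduce the selection rule of any particular cited algorithm.

```python
def select_by_threshold(subband, threshold):
    """Positions (m, n) of coefficients whose magnitude reaches the threshold."""
    return [(m, n)
            for m, row in enumerate(subband)
            for n, coefficient in enumerate(row)
            if abs(coefficient) >= threshold]

def non_overlapping_windows(coefficients, size=3):
    """Split a run of coefficients into consecutive windows, dropping any remainder."""
    return [coefficients[i:i + size]
            for i in range(0, len(coefficients) - size + 1, size)]

def median_position(window):
    """Index of the median coefficient inside a 3x1 window."""
    return sorted(range(len(window)), key=lambda i: window[i])[len(window) // 2]

subband = [[5, -12], [30, -2]]
hosts = select_by_threshold(subband, threshold=10)         # [(0, 1), (1, 0)]
windows = non_overlapping_windows([1, 2, 3, 4, 5, 6, 7])   # [[1, 2, 3], [4, 5, 6]]
```

For the inter-band variant, each window would instead gather the co-located coefficients of the three high frequency subbands at one decomposition level.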

3.1.2.4 Embedding method

The host wavelet coefficient modification methods used in wavelet-based watermarking algorithms can be generalized as follows:

\[ C'_{m,n} = C_{m,n} + \Delta_{m,n}, \tag{3.1} \]

where C′m,n is the modified coefficient at position (m,n), Cm,n is the original value of the host coefficient and ∆m,n is the amount of modification due to watermark embedding. The modification methods can be categorized into two classes: modification based on magnitude alteration [10–21]; and re-quantization of a coefficient with respect to a group of coefficients within a given window [5–8,22].

[Diagram: an axis from Cmin to Cmax divided into quantisation steps indexed 0, 1, ..., k−1, k, k+1, ..., l, with regions labelled alternately W=1 and W=0 and the position of Cmedian marked.]

Figure 3.3: Re-quantisation-based modification.

Further, for magnitude alteration algorithms, the way ∆m,n in Eq. (3.1) is formed can be mapped into a generalized form consisting of four sub-classes of methods as follows:

\[ \Delta_{m,n} = a_1 A_1 + a_2 A_2 + a_3 A_3 + a_4 A_4, \tag{3.2} \]

where

\[ A_1 = \alpha\, C_{m,n}^{\tau}\, W_{m,n}, \quad A_2 = v_{m,n} W_{m,n}, \quad A_3 = \beta_{m,n} w_{m,n} \quad \text{and} \quad A_4 = f(C_{m,n}, W_{m,n}). \]

A1 corresponds to direct modification of the host coefficient Cm,n with a watermark value Wm,n according to the user specified parameters α and τ = 1, 2, ..., which vary the watermark weight and strength, respectively [11, 16–19]. A2 corresponds to HVS driven modification using a weighting parameter vm,n, which is a function of Cm,n and the pixel masking process in the HVS model [13, 14, 20]. A3 corresponds to fusion-based methods, where the host wavelet coefficients are fused with the watermark wavelet coefficients wm,n using an HVS-based fusion strength parameter βm,n [15]. A4 represents all other magnitude alteration algorithms, based on any function f(Cm,n,Wm,n). Concluding this analysis, in the proposed WEBCAM framework a binary vector < a1, a2, a3, a4 > is used to represent the different magnitude alterations for watermark embedding by setting the corresponding vector element to one.
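The A1 sub-class, corresponding to the binary vector < 1, 0, 0, 0 >, can be sketched by combining Eq. (3.1) and Eq. (3.2). This is a minimal sketch: the function names, the default values α = 0.1 and τ = 1, and the non-blind extraction rule are assumptions for illustration, and host coefficients are assumed to be non-zero so that the extraction can divide by them.

```python
def embed_a1(host, watermark, alpha=0.1, tau=1):
    """C' = C + alpha * C**tau * W for each host coefficient (A1 term only)."""
    return [c + alpha * (c ** tau) * w for c, w in zip(host, watermark)]

def extract_a1(marked, host, alpha=0.1, tau=1):
    """Non-blind recovery of W from the marked and original coefficients."""
    return [(c_marked - c) / (alpha * (c ** tau)) for c_marked, c in zip(marked, host)]

host_coefficients = [10.0, -20.0, 40.0]
watermark_values = [1, -1, 1]
marked_coefficients = embed_a1(host_coefficients, watermark_values)
recovered = extract_a1(marked_coefficients, host_coefficients)
```

With τ = 1 the change is proportional to the coefficient magnitude, so large coefficients carry a proportionally larger modification.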

Similarly, the re-quantization-based modifications are mapped into Eq. (3.1) as follows. Such algorithms move the median coefficient of a group of coefficients to the kth quantisation step position by a modification value ∆m,n, where |∆m,n| ≤ δ and δ is the new quantization step, as shown in Figure 3.3. Different functions are suggested in the literature to find the value of δ; such functions normally use the minimum (Cmin) and the maximum (Cmax) coefficient values in the group of coefficients. They can be generalized into the following form:

\[ \delta = f(\gamma, C_{min}, C_{max}), \tag{3.3} \]

where γ is the user defined weighting factor. As ∆m,n depends on the step size δ and the user defined γ, the modification value ∆m,n is typically a function of Cmin and Cmax for each group of coefficients. Details of the embedding procedures can be found in [5] and [7].
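The re-quantization idea can be sketched for a single 3×1 window. This is a hypothetical illustration, not the procedure of [5] or [7]: it assumes δ = γ(Cmax − Cmin) as one possible form of Eq. (3.3), with 1/γ an integer, and it assumes that even quantisation positions encode bit 0 and odd positions encode bit 1.

```python
def quantisation_positions(c_min, c_max, delta):
    """Positions c_min, c_min + delta, ... that do not exceed c_max."""
    positions, k = [], 0
    while c_min + k * delta <= c_max + 1e-12:
        positions.append(c_min + k * delta)
        k += 1
    return positions

def embed_bit(window, bit, gamma=0.25):
    """Move the median of a 3x1 window to the nearest position whose parity equals bit."""
    median_index = sorted(range(3), key=lambda i: window[i])[1]
    c_min, c_median, c_max = sorted(window)
    delta = gamma * (c_max - c_min)        # assumed form of Eq. (3.3)
    if delta == 0:
        return list(window)                # flat window: nothing to quantise
    candidates = [q for k, q in enumerate(quantisation_positions(c_min, c_max, delta))
                  if k % 2 == bit]
    marked = list(window)
    marked[median_index] = min(candidates, key=lambda q: abs(q - c_median))
    return marked

def extract_bit(window, gamma=0.25):
    """Blind extraction: parity of the position nearest to the median."""
    c_min, c_median, c_max = sorted(window)
    delta = gamma * (c_max - c_min)
    if delta == 0:
        return 0
    return int(round((c_median - c_min) / delta)) % 2

window = [10.0, 2.0, 6.5]        # Cmin = 2, Cmedian = 6.5, Cmax = 10, so delta = 2
marked_one = embed_bit(window, 1)
marked_zero = embed_bit(window, 0)
```

Because Cmin and Cmax are left untouched, the extractor can rebuild the same quantisation positions from the marked window alone, which is what makes this kind of scheme blind.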

The above parametric dissection of state-of-the-art wavelet-based watermarking algorithms, in terms of wavelet kernel, subband, host coefficient and embedding method, is used to design and implement a tools repository; the resulting modular and reconfigurable wavelet-based watermarking implementation is presented as the WEBCAM framework in Chapter 4. A comparative study of the various input parameters mentioned above is also performed in Chapter 4 by experimental simulations.

3.2 Video watermarking

3.2.1 Uncompressed and compressed domain video watermarking

Video watermarking, in its simplest form, can be realized by applying image watermarking algorithms to the individual video frames, treating the video as a collection of frames. However, frame-by-frame video watermarking that does not consider the temporal correlation in video often suffers from flickering and poor robustness against various video processing attacks, such as collusion, de-synchronization and compression. Solutions proposed in the literature include 3D wavelet decomposition and watermarking in the compressed domain. Similar to the state-of-the-art image watermarking techniques, these video watermarking algorithms can be categorized into 1) uncompressed domain and 2) compressed domain algorithms.

3.2.1.1 Uncompressed domain algorithms

Similar to uncompressed domain image watermarking, the algorithms in this category embed
the watermark in the video before video encoding, and the embedding algorithms are
independent of the video coding algorithms. Many such algorithms offered in the
literature [9,27,28,30–32,57,68–81] extend the image watermarking


Figure 3.4: Uncompressed domain video watermarking and compression / content adaptation attack.

schemes to video and also consider temporal decomposition and motion information of the
host video sequence. A system block diagram of uncompressed domain video watermarking
schemes is shown in Figure 3.4. First, the video frames are either individually
decomposed in 2D or 3D space using wavelet, DCT or other spread spectrum algorithms, or
temporally filtered using motion information and then decomposed. Then the watermark is
embedded using various algorithms similar to the image watermarking algorithms described
in Section 3.1. An inverse transform generates the watermarked video, which is then
content adapted by various encoding schemes, such as MPEG-2, Motion JPEG 2000, MC-EZBC
or H.264-AVC/SVC. Watermark extraction and authentication are performed by applying the
same forward transform used during embedding, followed by a blind or non-blind recovery
of the watermark.

3.2.1.2 Compressed domain algorithms

With the evolution of hybrid video coders from MPEG-2 to H.264-AVC/SVC, various joint
compression domain schemes are proposed in the literature [82–98]. A generic schematic
block diagram of such a system is shown in Figure 3.5. In these algorithms, the watermark
embedding is usually performed either on motion compensated intra frames or residual
frames (Option 1), on motion vectors (Option 2) or by modifying the encoded bit streams
(Option 3 in the figure). Joint compression domain watermarking schemes are realized by
modifying the video coding pipeline and inserting the embedding modules within it.
Therefore, these schemes are always tied to and dependent on a given video coding
algorithm and offer less flexibility.


Figure 3.5: Generic scheme for joint compression domain video watermarking.

3.2.2 Dissection of the video watermarking algorithms

In Section 3.1.2, the image watermarking algorithms were dissected in terms of wavelet
kernels, subbands, hosting coefficients and embedding algorithms. Video watermarking
schemes, on the other hand, also exploit the temporal dimension and motion information
of host video sequences. In most cases, the image watermarking embedding methods are
adopted within the video decomposition schemes. In video watermarking, more focus has
been given to the various motion and temporal dimension related decompositions of the
host video, and the video watermarking algorithms available in the literature can be
categorized as follows: 1) Frame-by-frame, 2) 3D decomposed, 3) Motion compensated,
4) Bit stream domain and 5) Motion vector based watermarking.

3.2.2.1 Frame-by-frame

Frame-by-frame video watermarking can be defined as the extension of image watermarking
algorithms to individual video frames. The initial attempts at video watermarking used
this approach because it is simple to implement with comparatively mature image
watermarking algorithms. Many such algorithms are available in the literature
[27, 28, 30, 68–73]. However, frame-by-frame watermarking schemes often suffer from
flickering and from robustness issues, including compression in hybrid video coding,
collusion, frame dropping and de-synchronization.


3.2.2.2 3D decomposed

To overcome the weaknesses of frame-by-frame video watermarking indicated above, new
algorithms are proposed that consider the temporal dimension in video sequences. These
algorithms decompose the video by performing a spatial 2D transform on individual frames
followed by a 1D transform in the temporal domain. Various transforms are proposed in 3D
decomposed watermarking schemes, such as 3D DFT domain watermarking [74], 3D DCT domain
watermarking [75] and the more popular multi-resolution 3D DWT domain watermarking
[9,31,32,76–78]. A multi-level 3D DWT is performed by recursively applying the above
procedure on the low frequency spatio-temporal subband. Various watermarking methods
similar to image watermarking are then applied to suitable subbands to balance
imperceptibility and robustness. Although 3D decomposition based methods overcome issues
such as temporal de-synchronization, video format conversion and video collusion, 3D
decomposition without considering motion often creates flickering and is fragile to
video compression attacks that consider the motion trajectory during encoding.
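The spatial-then-temporal decomposition described above can be sketched in a few lines of Python. This is a minimal one-level example that assumes an orthonormal Haar kernel, an even number of frames and even frame dimensions; the function names are illustrative and it does not reproduce any particular cited scheme.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_1d(x):
    """One-level orthonormal Haar step: low-pass half followed by high-pass half."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low + high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def haar_2d(frame):
    """Spatial 2D transform: filter the rows, then the columns."""
    rows_done = [haar_1d(row) for row in frame]
    cols_done = [haar_1d(col) for col in transpose(rows_done)]
    return transpose(cols_done)


def haar_3d(frames):
    """2D transform on every frame, then a 1D transform along time per pixel position."""
    spatial = [haar_2d(frame) for frame in frames]
    height, width = len(spatial[0]), len(spatial[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in spatial]
    for r in range(height):
        for c in range(width):
            temporal = haar_1d([frame[r][c] for frame in spatial])
            for t, value in enumerate(temporal):
                out[t][r][c] = value
    return out


# Two constant 2x2 frames: all energy ends up in the low frequency
# spatio-temporal coefficient at index [0][0][0].
frames = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
coeffs = haar_3d(frames)
print(coeffs[0][0][0])
```

A multi-level decomposition would repeat `haar_3d` on the low frequency spatio-temporal block only. Note that the temporal filter here follows fixed pixel positions rather than motion trajectories, which is exactly the limitation raised at the end of the paragraph above.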

3.2.2.3 Motion compensated

Motion compensated decomposition is one of the primary features of the hybrid video
coding schemes, i.e., MPEG-x and H.26x. To offer watermarking schemes that are robust
against collusion and compression attacks in hybrid video coding, motion compensated
video watermarking algorithms are proposed in the literature. Accounting for motion in
these schemes also helps to remove the flicker problem indicated in the previous
subsections. Various such schemes are proposed in the literature, either in the
uncompressed domain [57,79–81] or in the joint compression domain within the MPEG-2
encoder [82–84] and the H.264/AVC encoder [85–87]. In these schemes, object motion
within the frames is tracked by motion estimation and motion compensation, followed by
the transform. The watermark embedding is usually done on transform coefficients, before
or after the quantization process, on intra frames or prediction frames.

3.2.2.4 Bit stream domain

In this category the watermark embedding is done on partially decoded bit streams. Many
such algorithms are proposed for MPEG-2 bit streams [88–91] and the more recent
H.264/AVC bit streams [92,93]. The major advantage of bit stream domain watermarking is
its much lower computational complexity, which leads to faster watermark detection for
real-time applications. It also avoids the data loss caused by decoding and re-encoding,
when compared to joint compression domain watermarking schemes. In bit stream domain
schemes, the bit stream is usually partially decoded by entropy decoding followed by
de-quantization, and the watermark embedding is performed on the transform coefficients.
However, any error due to the embedding modification in the bit stream propagates and
causes distortion. Various drift compensation algorithms are suggested in these schemes
to counter such error propagation.

3.2.2.5 Motion vector based

Video watermarking within motion vectors can be defined as a special case of bit stream
domain watermarking. One of the major motivations of these schemes is that, for any
video encoder, the motion vectors are always preserved and encoded with higher priority
and are hence less affected by compression. Therefore, a watermark embedded within the
motion vectors is robust to compression and other attacks. At the same time, any small
change in a motion vector can cause a significant distortion in the host video. However,
a careful choice of the motion vectors used to embed the watermark can reduce the
embedding distortion while keeping high robustness. A few such algorithms are proposed
in the literature for MPEG-2 motion vectors [94–96] and H.264/AVC motion vectors
[97, 98]. To avoid significant embedding distortion, the watermark capacity is kept
small in all cases, and the algorithms adopt different methods to select the motion
vectors to be embedded, such as higher magnitude based [96, 97], motion estimation mode
selection based [98], texture based [95] and phase angle based [94].

3.3 Conclusions

In this chapter various image and video watermarking schemes are discussed and broadly
categorized into uncompressed domain and compressed domain schemes. Uncompressed domain
schemes are generally independent of any image and video encoding scheme and are hence
more flexible than compressed domain algorithms. Therefore, in this thesis we choose to
analyze and propose new uncompressed domain image watermarking schemes, and video
watermarking schemes based on a motion compensated framework. From the state-of-the-art
analysis, it is evident that although many image watermarking schemes are proposed
towards robustness against JPEG 2000, little work has been done on video watermarking
that is robust against scalable coded video and related attacks.


Again, within image watermarking schemes a gap is identified in modeling scalable
compression within the watermarking algorithm. Hence, within the scope of this thesis, we
aim to propose improved image watermarking schemes with enhanced robustness, and video
watermarking techniques that are robust to quality scalable content adaptation.


Chapter 4

Watermarking Evaluation Bench for Content Adaptation Modes

In the previous chapters, the content adaptation scenario and the state-of-the-art
digital watermarking schemes in such a scenario were discussed. In the state-of-the-art
analysis, different wavelet-based image and video watermarking schemes were dissected
and categorized into common system blocks. Based on this dissection, this chapter
presents a novel modular framework, Watermarking Evaluation Bench for Content Adaptation
Modes (WEBCAM), for evaluating image watermark robustness against scalable coding based
content adaptation attacks. The various stages of the development of the proposed
framework have been presented previously in different publications [99–101]. The
framework is also available for download from the framework's web pages [102].

4.1 Introduction

Currently, a few good watermarking evaluation tools are available in the watermarking

research community. These evaluation tools have proven very useful to measure the

performance of the watermarking algorithms against different intentional attacks (in-

cluding cropping, average filtering, scanning etc.) and unintentional attacks (natural

image processing tasks, such as compression, rotation, scaling etc.). Examples of such

evaluation tools include Stirmark [103], Checkmark [104], Optimark [105] and Water-

mark Evaluation Test bed (WET) [106]. Stirmark, for a given watermarked image,

applies different attacks including cropping, filtering, rotation, JPEG compression to


generate a number of modified images which are used to verify the existence of the

watermark. In addition to the attacks considered in Stirmark, Checkmark includes

wavelet-based compression and helps to evaluate and rate the watermarking schemes.

The Optimark evaluation bench provides performance metrics such as receiver operat-

ing characteristics curve, equal error rate, probability of false detection and rejection

etc. WET provides a facility to test the robustness performance of different algorithms

against usual attacks.

In the proposed test bed, Watermarking Evaluation Bench for Content Adaptation Modes
(WEBCAM), the main aim is to address the evaluation of watermarking schemes for
robustness against scalable coding-based content adaptation attacks. However, another
important requirement of watermarking, imperceptibility, often trades off against
robustness to content adaptation and is therefore also evaluated in this framework. For
example, in order to lower the embedding distortion, one may choose less significant
frequencies or less significant bit planes, which often form the less significant
portions of the scalable bit streams and may be discarded during content adaptation.

As stated before, due to the use of the discrete wavelet transform (DWT) as the
underlying technology of the JPEG 2000 compression standard and its success in image
coding, recent years have seen wide use of wavelet-based techniques for image
watermarking [5–25,66]. These algorithms differ from each other in terms of the wavelet
kernel, the number of wavelet decomposition levels, the wavelet subband choices for
embedding, the wavelet coefficient choices for embedding and the coefficient
modification method for embedding. Therefore it is important to study the effect of the
above parametric choices on the balance between embedding distortion and robustness to
content adaptation attacks.

Overall, the proposed WEBCAM framework provides a tool repository for wavelet-

based watermarking, facilitates a parametric study of various design choices in wavelet-

based watermarking and proposes a watermark tweezing tool to balance the embedding

distortion and the robustness to scalable coding-based content adaptation. The cur-

rent version of the framework provides tools for wavelet-based image watermarking,

JPEG 2000-based scalability attacks and emulation of multiple node multimedia con-

tent adaptation chains covering various networks and devices. In summary, the main

objectives of this chapter are:

1. To provide tools to emulate multiple node multimedia content adaptation chains

and to perform scalable coding-based content adaptation (MPEG-21 Part-7) at-

tacks for evaluation of robustness of image watermarks to such attacks.


Figure 4.1: WEBCAM modules and input/output parameter blocks. [Block diagram. Modules: Embedding Process, Content Adaptation, Extraction and Authentication. Parameter blocks: Embedding Algorithm, Wavelet Kernel, Subband Selection, Coefficient Selection, Quality Scalability, Resolution Scalability, Channel Parameters, Extraction Algorithm, Authentication.]

2. To provide a tool repository for wavelet-based watermarking enabling controlled

experimentation for their parametric evaluation, in terms of both embedding

distortion and robustness to content adaptation attacks. This is achieved by dis-

secting commonly used wavelet based watermarking algorithms into modular tool

blocks and fitting them into a common wavelet-based watermarking framework.

3. To facilitate tools for developing new watermarking schemes by choosing various

modules and parameters from this common framework which can also be used as

a research and learning tool for wavelet-based watermarking.

4. To provide a comparative parametric study of wavelet based watermarking algo-

rithms using the framework.

4.2 WEBCAM system architecture

The WEBCAM architecture for images consists of three main functional modules:
1) Watermark embedding; 2) Scalable coding-based content adaptation; and 3) Watermark
extraction and authentication. The high-level block diagram of WEBCAM with the main
modular input/output parameters is shown in Figure 4.1. WEBCAM can operate in two modes:
as a full system using all three modules, or as a scalable coding-based content
adaptation attack emulator for any watermarked image using module 2 alone.


Figure 4.2: Flow diagram of the watermark embedding module in WEBCAM. [Flow diagram. Inputs: image to be watermarked (from the image / video store), choice of wavelet kernel, number of decomposition levels, choice of subband, choice of threshold, embedding parameters, watermark logo (from the watermark logo store). Processes: forward wavelet transformation, embedding process, inverse wavelet transformation, image viewing, embedding performance evaluation. Outputs: watermarked image (to the watermarked image / video store), PSNR / RMSE, difference map.]

4.2.1 Watermark embedding tools

Following the dissection of wavelet-based watermarking shown in Chapter 3, the watermark
embedding module of WEBCAM provides a common framework consisting of a tool repository
for implementing those wavelet-based watermarking algorithms, as well as a research
platform for designing new algorithms. The block diagram of the watermark embedding
module, comprising all input parameters, the submodule functional blocks, embedding
performance evaluation, output parameters and their interconnected flow, is shown in
Figure 4.2. The submodules include the FDWT, watermark embedding, the IDWT, image
display and embedding performance evaluation.

The input parameters to this module are threefold: operational, systems-related and
user-defined. Operational inputs are the host image and the watermark logo. The
systems-related input parameters relate to the tools repository and consist of the
wavelet kernel choice, the number of wavelet decomposition levels, the host subband
choice, the host coefficient selection method and the embedding procedure choice. The
user-defined input parameters include embedding parameters, such as thresholds and
watermark strengths. The output parameters include the watermarked image and embedding
performance evaluation metrics, such as the Peak Signal to Noise Ratio (PSNR) and the
data hiding capacity.

Figure 4.3: The FDWT submodule with choices of wavelet kernels. [Block diagram: the forward DWT initialisation of the watermark embedder takes the image / video to be watermarked and offers orthogonal, bi-orthogonal, lifting-based integer, non-linear 2D and non-linear Quincunx wavelets before the embedding process.]

The FDWT submodule with its choices for the wavelet kernel is shown in Figure 4.3.
Currently available choices include orthogonal wavelets (Haar and Daubechies
orthogonal), biorthogonal wavelets (9/7 and 5/3), lifting-based integer wavelets [60],
separable non-linear wavelets [62,63] and Quincunx sampling-based non-linear wavelets
[61,63]. WEBCAM allows any number of wavelet decomposition levels that the image
dimensions permit. It also allows any single subband or group of subbands to be chosen
as the host subbands, followed by coefficient selection based on the realization of the
embedding methods discussed in Chapter 3. For the embedding methods, WEBCAM provides
both magnitude alteration and re-quantization schemes, with the binary input vector
<a1, a2, a3, a4> selecting among the options for the former and a choice of inter or
intra band coefficient selection for the latter.


4.2.2 Content adaptation tools

We aim to emulate a heterogeneous communication system, where the content is encoded
using scalable coders to produce scalable bit streams, followed by channel coding, and
transmitted along various types of networks, such as optical, wired or wireless
networks, to reach the final user, who displays the content on devices with various
display resolutions and resource availability. Such content may be adapted at various
nodes of the network to address the varying network bandwidths, quality of service,
display resolutions and usage requirements. The bit streams are adapted by reducing
quality, spatial resolution and frame rate simply by truncating various layers of the
bit stream, resulting in lower data rates to be streamed.

The flow diagram depicting the functionality of the content adaptation tools of WEBCAM,
emulating a heterogeneous communication system, is shown in Figure 4.4. The content
adaptation tools in WEBCAM are twofold: channel-related tools and MPEG-21 DIA-related
content adaptation tools. The channel-related tools consist of channel coding and
channel models. The content adaptation module consists of a media adaptation engine. The
adaptation engine is fed with the quality reduction, resolution reduction and frame rate
reduction parameters translated from the network, device and usage requirements. The
adaptation engine first adapts the bit stream description (if available) and then, based
on the new description, adapts the scalable bit stream to produce a new bit stream,
which is also scalable. This process may be carried out repeatedly at successive nodes.
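The repeated node adaptation described above can be pictured with a toy model. The following Python sketch only tracks how many quality layers, resolution levels and temporal levels survive at each node; the class, field and function names are hypothetical, and a real adaptation engine would truncate an actual scalable bit stream (for example a JPEG 2000 code stream) rather than counters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalableStream:
    # Hypothetical summary of a scalable bit stream: number of layers still present.
    quality_layers: int
    resolution_levels: int
    temporal_levels: int


def adapt(stream, q=0, s=0, t=0):
    """Drop q quality layers, s resolution levels and t temporal levels,
    always keeping at least the base layer in each dimension."""
    return ScalableStream(
        quality_layers=max(1, stream.quality_layers - q),
        resolution_levels=max(1, stream.resolution_levels - s),
        temporal_levels=max(1, stream.temporal_levels - t),
    )


def run_chain(stream, node_parameters):
    """Apply the (q, s, t) reduction of each node in turn; return every intermediate stream."""
    history = [stream]
    for q, s, t in node_parameters:
        stream = adapt(stream, q, s, t)
        history.append(stream)
    return history


source = ScalableStream(quality_layers=8, resolution_levels=4, temporal_levels=3)
for node, state in enumerate(run_chain(source, [(2, 1, 0), (3, 0, 1)])):
    print(node, state)
```

The output of each node is again a scalable stream, so a later node can only remove further layers and never restore them. This is why a watermark that survives at one node still has to be checked at every subsequent node.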

WEBCAM also provides the facility to decode a bit stream and extract the watermark at
any node. An example of repeated node adaptations is shown in Figure 4.5. In this
thesis, WEBCAM provides JPEG 2000-based content adaptation and excludes the
channel-related modules, as the present focus is on image watermarking schemes and their
evaluation against content adaptation attacks. Although wavelet-based watermarking is
considered in this framework, the content adaptation module in WEBCAM can be used as a
stand-alone tool for emulating the scalable coding-based content adaptation attack on
any image watermarking scheme.


Figure 4.4: The flow diagram of the content adaptation tools in WEBCAM. [Flow diagram. Inputs: scalable coded image / video (from the scalable coded image / video store), quality reduction parameter (Q), spatial reduction parameter (S), frame reduction parameter (T), channel coding parameter, channel model parameter, channel decoding parameter. Processes: node processing (content adaptation and view), channel coding, channel model, channel decoding and view, source decoding, viewing, watermark extractor. Output: decoded image / video (to the decoded image / video store at the terminal).]

Figure 4.5: Content adaptation at nodes. [Diagram: Node 1, Node 2 and Node 3 connected by a transmission channel, carrying the full resolution bit stream, the quality and spatially scaled bit stream, and the further scaled bit stream from Node 2, with decoding and watermark extraction available at the nodes.]


Figure 4.6: Flow diagram of watermark extraction and authentication in WEBCAM. [Flow diagram. Inputs: watermarked image (from the watermarked image / video store), choice of subband, choice of threshold, embedding parameters, original watermark logo (from the watermark logo store), original image for non-blind detection (from the original image / video store). Processes: post processing (forward wavelet transformation and resizing scheme), watermark extraction process, authentication. Outputs: extracted watermark, authentication decision.]

4.2.3 Watermark extraction and authentication tools

4.2.3.1 Watermark extraction

The watermark extraction process can be either blind or non-blind depending on the

coefficient selection and modification process used in the embedding algorithm. In this

test bed, the schemes associated with magnitude alteration algorithms are non-blind,

whereas, re-quantization-based modifications are blind. In general, watermark extrac-

tion (as shown in Figure 4.6) includes the FDWT followed by the finding of ∆m,n either

as C ′m,n − Cm,n from Eq. (3.1) or as f(γ,Cmin, Cmax) from Eq. (3.3) to find the wa-

termark information Wm,n. In addition to watermark extraction, WEBCAM includes

tools for postprocessing of the decoded image and the watermark authentication.

4.2.3.2 Postprocessing

WEBCAM addresses the situation where the resolution of the decoded image is smaller than
that of the original image due to resolution scalability-based content adaptation. The
resizing scheme used in WEBCAM follows three steps. Firstly, the decoded image is
decomposed into (M1 − M2) levels using the FDWT employed in the


compression algorithm, where M1 is the number of wavelet decomposition levels used in
the embedding algorithm and M2 is the number of levels discarded due to content
adaptation. Secondly, the normalization of all coefficients is adjusted by multiplying
them by 2^{M2}. Finally, the dimensions are extended to those of the original by zero
padding the current matrix, and the IDWT is applied to obtain the full resolution image.
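The second and third steps of the resizing scheme can be sketched as follows. This Python fragment is only an illustration: it assumes the coefficient matrix is stored with the low frequency subband in the top-left corner, the function name is hypothetical, and the forward and inverse wavelet transforms themselves are left to whichever DWT implementation is in use.

```python
def scale_and_pad(coeffs, m2, full_height, full_width):
    """Multiply every coefficient by 2**m2 and zero pad the matrix to the
    dimensions of the original image (missing subbands are filled with zeros)."""
    factor = 2 ** m2
    padded = [[0.0] * full_width for _ in range(full_height)]
    for r, row in enumerate(coeffs):
        for c, value in enumerate(row):
            padded[r][c] = factor * value
    return padded


# A 2x2 coefficient matrix from an image decoded at half resolution (M2 = 1),
# extended to the 4x4 layout expected by the inverse transform.
for row in scale_and_pad([[1.0, 2.0], [3.0, 4.0]], m2=1, full_height=4, full_width=4):
    print(row)
```

The padded matrix is then passed to the IDWT to obtain the full resolution image on which watermark extraction is performed.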

4.2.3.3 Watermark authentication

The authentication process verifies the extracted watermark against the original
watermark. Two commonly used authentication metrics are the Hamming Distance (H) (often
referred to as the Bit Error Rate (BER) in communication systems) and the correlation
similarity measure (S). The former is widely used for binary watermark detection, while
the latter is commonly used for pseudo-random sequence-based watermark data or for a
gray scale logo [15, 17]. Using these metrics, a watermark is said to be detected if the
Hamming Distance is lower than a specific threshold value or the correlation similarity
measure is higher than a given threshold. WEBCAM is equipped with both of these metrics,
which are computed as follows:

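\[
H(W, W') = \frac{1}{L} \sum_{i=0}^{L-1} W_i \oplus W'_i, \tag{4.1}
\]

\[
S(W, W') = \frac{W \cdot W'}{\sqrt{W' \cdot W'}} \Big/ \frac{W \cdot W}{\sqrt{W \cdot W}} \times 100
         = \frac{W \cdot W'}{\sqrt{W' \cdot W'}\,\sqrt{W \cdot W}} \times 100, \tag{4.2}
\]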

where W and W ′ are the original and the extracted watermarks, respectively. L is the

length of the sequence and ⊕ represents the XOR operation between the respective

bits.
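Both metrics are straightforward to compute. The following Python sketch follows Eq. (4.1) and Eq. (4.2) directly, assuming the watermarks are given as equal-length sequences (bits for the Hamming distance, numeric values for the correlation); the function names are illustrative.

```python
import math


def hamming_distance(w, w_ext):
    """Eq. (4.1): fraction of bit positions where the two binary watermarks differ."""
    if len(w) != len(w_ext):
        raise ValueError("watermarks must have the same length")
    return sum(a ^ b for a, b in zip(w, w_ext)) / len(w)


def correlation_similarity(w, w_ext):
    """Eq. (4.2): normalized correlation between the two watermarks, in percent."""
    dot = sum(a * b for a, b in zip(w, w_ext))
    norm = math.sqrt(sum(a * a for a in w)) * math.sqrt(sum(b * b for b in w_ext))
    return 100.0 * dot / norm


original = [1, 0, 1, 1]
extracted = [1, 1, 1, 0]
print(hamming_distance(original, extracted))
print(correlation_similarity(original, extracted))
```

A Hamming distance of 0 and a correlation similarity of 100 both indicate a perfectly recovered watermark; detection then reduces to comparing either value with its threshold.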

4.3 Experimental simulations and comparative study

WEBCAM provides a tools repository for evaluating scalable coding-based content

adaptation attacks, implementing wavelet-based image watermarking schemes by dis-

secting major algorithms into design tools and designing new tools. This section demon-

strates the achievement of objectives with experimental simulations and corresponding

comparative study.


Table 4.1: Realization of major wavelet-based watermarking algorithms using combinations of options for submodules in WEBCAM.

Wavelet       Decomposition  Subband  Host Coefficient           Embedding              Resulting
Kernel        Levels         Choice   Selection                  Method                 Algorithm
Orthogonal    2              Low      Median                     Intra Re-quantization  [5]
Orthogonal    2              High     Median                     Inter Re-quantization  [6]
Orthogonal    2              High     Threshold                  <1,0,0,0> (τ = 1)      [10]
Haar          2              High     All                        <1,0,0,0> (τ = 2)      [11]
Orthogonal    2              High     All                        <0,0,0,1>              [12]
Orthogonal    3              High     All                        <0,1,0,0>              [13]
Orthogonal    4              High     HVS                        <0,1,0,0>              [14]
Orthogonal    4              High     Fusion rule                <0,0,1,0>              [15]
Haar          1              High     Median                     Inter Re-quantization  [7]
Orthogonal    4              Low      Key based random sequence  Intra Re-quantization  [8]
Biorthogonal  3              Low      JND                        <1,0,0,0> (τ = 1)      [16]
Biorthogonal  3              All      Threshold                  <1,0,0,0> (τ = 1)      [17]
Biorthogonal  3              All      Threshold                  <1,0,0,0> (τ = 1)      [18]
Biorthogonal  5              All      All                        <1,0,0,0> (τ = 1)      [19]
Biorthogonal  5              All      Median                     Intra Re-quantization  [22]
Biorthogonal  5              High     Threshold                  <0,1,0,0>              [20]

4.3.1 Different wavelet-based watermarking algorithm realization

With the provided tools repository, different wavelet-based watermarking algorithms can
be realized by combining various options for the WEBCAM submodules, namely, the wavelet
kernel, wavelet decomposition, subband choice, host coefficient choice and embedding
method, together with a set of user-defined parameters. A few examples of the
realization of major wavelet-based watermarking algorithms in WEBCAM are shown in
Table 4.1. In addition to these existing algorithms, one can mix and match different
parameters to design new algorithms that suit particular application requirements.
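One way to picture such a realization is as a small configuration record per algorithm. The Python sketch below encodes a few rows of Table 4.1; the class and field names are hypothetical and do not reflect the actual WEBCAM interface.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class AlgorithmConfig:
    # Hypothetical record mirroring the columns of Table 4.1.
    kernel: str
    levels: int
    subband: str
    coefficient_selection: str
    embedding: str
    vector: Optional[Tuple[int, int, int, int]] = None  # <a1, a2, a3, a4>, magnitude alteration only
    tau: Optional[int] = None


TABLE_4_1 = {
    "[5]": AlgorithmConfig("orthogonal", 2, "low", "median", "intra re-quantization"),
    "[10]": AlgorithmConfig("orthogonal", 2, "high", "threshold", "magnitude alteration", (1, 0, 0, 0), 1),
    "[15]": AlgorithmConfig("orthogonal", 4, "high", "fusion rule", "magnitude alteration", (0, 0, 1, 0)),
    "[22]": AlgorithmConfig("biorthogonal", 5, "all", "median", "intra re-quantization"),
}


def is_blind(config):
    # In this test bed, re-quantization schemes are blind and magnitude alteration schemes are non-blind.
    return config.embedding.endswith("re-quantization")


# Derive a new variant from an existing row by swapping individual options.
custom = replace(TABLE_4_1["[5]"], kernel="biorthogonal", levels=3)
print(custom, is_blind(custom))
```

Keeping every option in one record makes the controlled experiments of this chapter easy to express: all fields are held fixed and a single field is varied.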

4.3.2 Robustness to content adaptation attacks

Next, the use of the content adaptation tools in WEBCAM is demonstrated by evaluating
the robustness of watermarking against MPEG-21 DIA attacks, such as JPEG 2000-based
quality scalable adaptations and JPEG 2000-based resolution scalable adaptations.


Figure 4.7: The test image set. [Image 1 (704x576), Image 2 (768x512), Image 3 (704x576), Image 4 (768x512), Image 5 (512x512).]

4.3.2.1 The experimental setup

For these experiments, the Kodak test image set and other popular test images are used,
as shown in Figure 4.7, with a binary logo as the watermark data. In the robustness
evaluation experiments, the PSNR is used to set the host image distortion to an
acceptable level for embedding a given amount of watermark data. The Hamming distance is
used as the authentication measure. The results show the mean value of the Hamming
distance over the test image set, with error bars corresponding to the 95% confidence
level. Robustness is evaluated against different compression ratios for quality
scalability attacks on the full resolution and for joint resolution-quality scalability
attacks (on half resolution).
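The text does not state how the 95% error bars are computed. A common choice for a small image set, assumed here purely for illustration, is a Student t confidence interval on the mean; the Hamming distance values in the example are made up and are not results from this thesis.

```python
import math
import statistics

# Two-sided 95% critical values of the Student t distribution, keyed by degrees of freedom.
T_CRITICAL_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def mean_with_error_bar(values):
    """Return (mean, half_width) of an assumed t-based 95% confidence interval."""
    df = len(values) - 1
    if df not in T_CRITICAL_95:
        raise ValueError("this sketch only tabulates 2 to 6 samples")
    mean = statistics.mean(values)
    half_width = T_CRITICAL_95[df] * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width


# Hypothetical Hamming distances of one algorithm on five test images
# at a single compression ratio.
mean, half_width = mean_with_error_bar([0.1, 0.2, 0.3, 0.2, 0.2])
print(f"{mean:.3f} +/- {half_width:.3f}")
```

One such mean and half-width would be computed per compression ratio and plotted as a point with an error bar.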

1) The choice of logo

The experimental observations show that the choice of logo has no effect on the ro-

bustness performance of a given watermarking algorithm. As an example, Figure 4.9

shows the robustness performance for a watermarking algorithm, when used for five

different logos as shown in Figure 4.8. Irrespective of the used watermark logo, the


Figure 4.8: The test logo set. [Logo 1 (40x40), Logo 2 (70x74), Logo 3 (64x64), Logo 4 (64x64), Logo 5 (76x77).]

Figure 4.9: An example of comparing the choice of logo, with the same bit count (8192) embedded using the intra re-quantization-based embedding, on robustness to - Row 1: Quality scalability attack on full resolution; and Row 2: Joint resolution-quality scalability attack (half resolution). [Plots of Hamming distance against compression ratio index for Logos 1 to 5. Panel titles: "Image 1: Full resolution || Embedded Bit Count = 8192" and "Image 1: Half resolution || Embedded Bit Count = 8192". Compression ratio: 1=1, 2=2, 3=4, 4=8, 5=10.67, 6=16, 7=24.24, 8=32, 9=44.44, 10=64.]

trend of robustness under different resolution-quality scalability attacks remains the

same. Therefore, in this work results are shown using one logo (Logo 3 ).

2) On the use of PSNR in embedding distortion evaluations

In these experiments, the embedding performance is measured using the PSNR against data
capacity. In the robustness evaluation tests, this measure is used to ensure that either
the distortion or the data capacity is kept constant across different watermarking
algorithms, so that a fair comparison of robustness can be made under different
embedding scenarios. Initial experiments suggest that, for most host images, a PSNR
greater than 35 dB provides acceptable image quality to the naked eye.
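For reference, the PSNR between a host image and its watermarked version can be computed as in the following Python sketch, which assumes 8-bit grayscale images stored as nested lists (peak value 255); the function name is illustrative.

```python
import math


def psnr(original, watermarked, peak=255.0):
    """Peak Signal to Noise Ratio in dB between two equally sized grayscale images."""
    squared_errors = [
        (a - b) ** 2
        for row_o, row_w in zip(original, watermarked)
        for a, b in zip(row_o, row_w)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


host = [[120, 121], [119, 122]]
marked = [[121, 121], [119, 121]]
value = psnr(host, marked)
print(value, value > 35.0)
```

In the experiments described here, the embedding strength would be tuned until this value reaches the target level, so that the robustness of different algorithms is compared at equal distortion.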

As an example of the embedding distortion, Figure 4.10 shows the performance of three

types of wavelet kernels: orthogonal, bi-orthogonal and non-linear, using the Haar (HR)


Figure 4.10: Capacity-distortion plots. Numbers 1 to 5 represent the five images from the test image set. Two different categories of algorithms: 1) non-blind (non-HVS based <1,0,0,0>(τ=1)) and 2) blind (intra re-quantization based), are shown in each row for six different wavelet kernels: HR, D-4, 5/3, 9/7, MH and MQ. [Plots of PSNR against image number. Panel titles: "Non-HVS based <1,0,0,0>(τ=1) || Embedded Bit Count = 6144" and "Intra re-quantisation based || Embedded Bit Count = 2048".]

and Daubechies-4 (D-4) for orthogonal, 5/3 and 9/7 for bi-orthogonal and Morpholog-

ical Haar (MH) and Quincunx domain Morphological (MQ) wavelets for non-linear

types, respectively. The experiment considered two different categories of embedding

algorithms, namely, the non-blind algorithm (non-HVS-based (<1,0,0,0>(τ=1))) and

the blind algorithm (intra re-quantization-based). A comprehensive study on the ef-

fect of wavelet kernel and other parameters on embedding performance is discussed in

Chapter 5.

3) Hamming distance interpretation

The Hamming distance is used as the authentication measure for robustness evaluation.

Figure 4.11 shows the visual quality corresponding to different Hamming distance values.
It is evident from these figures that beyond a Hamming distance of about 0.25, the
visual quality of the logos becomes poor and they are difficult to compare with the
original logo. Based on this visual significance, one can define a threshold value of
the Hamming distance to ensure that the extracted watermark is visually comparable with
the original logo. Based on the experiments, a generalized Hamming distance threshold of
0.20±0.02 can be set.

[Figure 4.11: extracted watermark logos at Hamming distances HD=0.026, 0.087, 0.183 and 0.278 (first row) and HD=0.025, 0.091, 0.193 and 0.284 (second row).]

Figure 4.11: Original and extracted watermark logos corresponding to different Hamming distances (HD).

Using the experimental set up discussed above, three different scenarios are considered to compare and evaluate the robustness against content adaptation. This set of experiments also demonstrates how the full features of WEBCAM can be used for evaluating the effect of the different options chosen for the submodules in wavelet-based watermarking algorithms. This is carried out by varying one submodule at a time while keeping all other submodule settings common and fixed. The scenarios considered here are as follows:

4.3.2.2 The effect of wavelet kernel choice on robustness

The contribution of the choice of wavelet kernel to the robustness against content adaptation is evaluated by considering non-blind and blind extraction algorithms. The embedding subband is set to low frequency, and the host coefficient selection is set to threshold-based (<1,0,0,0>(τ=1)) for the non-blind case and intra re-quantization-based for the blind algorithm. In all cases, a three-level decomposition has been performed as a trade-off between complexity, data capacity and robustness. A smaller number of decomposition levels often lacks the required robustness, especially in the case of resolution scalability, whereas a higher number of decomposition levels adds complexity and reduces the watermarking data capacity. The watermark strength parameters, α and γ, have been set to 0.08 for all wavelet kernel choices. A set of six different wavelet kernels representing three different wavelet classes, namely, orthonormal (HR and D-4), bi-orthogonal (5/3 and 9/7) and non-linear (MH and MQ), has been used for the comparisons. The results are shown in Figure 4.12 (for the non-blind algorithm) and Figure 4.13 (for the example blind algorithm). For the full resolution quality scalability attack as well as the joint resolution-quality scalability attack, the longer bi-orthogonal wavelets performed better than the other wavelet kernels. In particular, the bi-orthogonal 9/7 wavelet, which is also used here in the JPEG 2000 compression, provides the best result because the watermarking and compression wavelet kernels closely match. Therefore, the 9/7 kernel is used as the watermarking wavelet transform in the further experiments.

[Figure 4.12: two panels plotting Hamming distance against compression ratio (0 to 70) for the wavelet kernels HR, D-4, 5/3, 9/7, MH and MQ. Panel 1: quality scalability attack on full resolution. Panel 2: joint resolution-quality scalability attack (half resolution).]

Figure 4.12: An example of evaluating the effect of the wavelet kernel for <1,0,0,0>(τ=1) direct modification-based embedding on robustness to - Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).

[Figure 4.13: two panels plotting Hamming distance against compression ratio (0 to 70) for the wavelet kernels HR, D-4, 5/3, 9/7, MH and MQ.]

Figure 4.13: An example of evaluating the effect of the wavelet kernel for intra re-quantization-based embedding on robustness to - Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).

4.3.2.3 The effect of subband choice

The contribution of the choice of subbands to the robustness of a watermarking algorithm is compared by keeping all other choices fixed. In this set of experiments, the wavelet kernel and the number of decomposition levels are set to 9/7 and three, respectively. Figure 4.14 shows the robustness performance for non-blind extraction that uses the threshold-based (<1,0,0,0>(τ=1)) embedding method, while Figure 4.15 shows the robustness performance for blind extraction that uses intra re-quantization-based embedding. In the plots, low, high and all frequency subband selection refer to the lowest frequency subband, the three high frequency subbands in the third decomposition level and all four frequency subbands in the third decomposition level, respectively. For the non-blind and blind embedding cases, an average PSNR range of 32.75 to 33.75 dB and 39 to 40 dB, respectively, is maintained for all three subband selection modes by tuning the watermark weight parameters α and γ. In both cases, embedding in the low frequency subband results in the highest robustness compared to the other two choices. This is mainly due to the high energy concentration in the low frequency subband of the host image and the content scalability treatments used in JPEG 2000 quality scalability and resolution scalability.

[Figure 4.14: two panels plotting Hamming distance against compression ratio (0 to 70) for low frequency, high frequency and all frequency subband selection. Panel 1: quality scalability attack on full resolution. Panel 2: joint resolution-quality scalability attack (half resolution).]

Figure 4.14: An example of evaluating the effect of the subband choice for <1,0,0,0>(τ=1) direct modification-based embedding on robustness to - Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).

[Figure 4.15: two panels plotting Hamming distance against compression ratio (0 to 70). Panel 1: quality scalability attack on full resolution. Panel 2: joint resolution-quality scalability attack (half resolution).]

Figure 4.15: An example of evaluating the effect of the subband choice for intra re-quantization-based embedding on robustness to - Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).

[Figure 4.16: two panels plotting Hamming distance against compression ratio (0 to 35) for four embedding methods: magnitude alteration (non-HVS based), magnitude alteration (HVS based), intra re-quantisation and inter re-quantisation.]

Figure 4.16: An example of evaluating the effect of different embedding methods on robustness to - Column 1: Quality scalability attack on full resolution; and Column 2: Joint resolution-quality scalability attack (half resolution).

4.3.2.4 The effect of the choice of embedding method and host coefficient selection

In this experiment set, two different embedding methods, namely, magnitude alteration and re-quantization, are considered. For magnitude alteration, two cases are considered: HVS-based and non-HVS-based (all coefficient) selection. For the re-quantization-based methods, two cases are also considered: inter and intra subband coefficient selection. The other parameters, namely the wavelet kernel, the number of decomposition levels and the embedding subband, are set to 9/7, three and the high frequency subbands in the third decomposition level, respectively. For a fair comparison, the average PSNR is adjusted to within the range of 38 to 40 dB for each algorithm by tuning the watermark weight parameters α and γ. It is evident from Figure 4.16 that, for all the content adaptation scenarios, the HVS-based direct modification combination shows the highest robustness. This is mainly due to the efficiency of the coefficient selection method, which allows a higher value to be chosen for the watermark strength parameter while keeping the distortion performance within the specified range. However, the HVS-based model [14] and the inter re-quantization-based model [7] are only intended for high frequency subband embedding, whereas the non-HVS-based and intra re-quantization-based algorithms are independent of the subbands and are hence used in the thesis for further research.

4.4 Conclusions

Although a few watermarking toolboxes are available in the literature, most of them rely on various attacks to evaluate the watermarking algorithms. In contrast, this chapter proposed the WEBCAM framework to provide a common, controlled experimental environment for comparing the effect of various input parameters on various wavelet-based algorithms. A modular test bed consisting of a repository of tools is presented for emulating MPEG-21 DIA content adaptation attacks, wavelet-based watermarking, extraction and authentication. The parametric dissections of the wavelet-based watermarking algorithms from the previous chapter are used to design and implement the tools repository and its modular and reconfigurable wavelet-based watermarking implementation within the WEBCAM framework. WEBCAM provides a formal evaluation platform to compare the performances of different schemes under a controlled experimental environment for various combinations of choices for those functional submodules, and a comparative study of the same is provided here. It also facilitates the development of new algorithms and can be used as an educational tool for wavelet-based watermarking algorithm design. The content adaptation tools repository provides a new set of attacks that are emerging in modern multimedia usage within heterogeneous networks.

Chapter 5

Embedding distortion analysis and modeling

5.1 Introduction

Embedding performance and robustness are the two main but complementary properties of robust watermarking applications. The various parameters contributing to the imperceptibility and robustness performances are studied in the previous chapters. The state-of-the-art review and a generalization of the wavelet-based watermarking schemes are presented in Chapter 3 and Chapter 4. Although many independent algorithms are available in the literature, a gap has been identified: a generalized mathematical analysis is required to identify the relationship between the distortion performance and the various input parameters responsible for embedding distortion. Little or no research has been done to establish a relationship between an objective metric for embedding distortion and the watermarking input parameters. A few attempts [107] have been made towards this problem, but they mainly focused on their own algorithms. In this chapter a mathematical model is derived to establish the relationship between an embedding distortion performance metric, such as the mean square error (MSE), and the watermarking input parameters, including wavelet kernels, subband selection and coefficient selection. Although many objective metrics are presented in the literature (as discussed in Chapter 2), most of the watermarking algorithms to date have used PSNR to measure the embedding distortion, due to its simplicity. In this chapter MSE has been used instead of PSNR to represent the embedding distortion on a linear scale. However, similar mathematical modeling can be developed using other objective metrics, and this is considered as future work in this thesis.

The main objective of the work in this chapter is to derive a generalized model for the distortion performance analysis of wavelet-based watermarking. To achieve this, a proposition is first made to show the relationship between the noise power in the transform domain and in the input signal domain. Then, using this proposition, a relationship is established between the distortion performance metrics and the input parameters of a given wavelet-based watermarking scheme. The proposed model is derived in two parts: 1) initial propositions are made using orthonormal wavelet bases, which conserve energy between the signal domain and the transform domain; 2) the same propositions are extended to non-orthonormal bases, including bi-orthogonal and non-linear wavelet kernels, to make the model universally applicable.

5.2 Embedding distortion model for orthonormal wavelet bases

5.2.1 Preliminaries

The embedding distortion performance is measured by MSE, which can be defined as

follows:

Definition 5.2.1 The Mean Square Error (MSE), or average noise power in the pixel domain, between the original image I and the watermarked image I′ is defined by:

$$\mathrm{MSE} = \frac{1}{X \times Y} \sum_{m=0}^{X-1} \sum_{n=0}^{Y-1} \left( I(m,n) - I'(m,n) \right)^2, \tag{5.1}$$

where X and Y are the image dimensions and m and n indicate each pixel position.
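As a quick illustration of Eq. (5.1), the following sketch computes the MSE of a toy image pair and converts it to PSNR. The PSNR formula with a peak value of 255 is the usual convention for 8-bit images and is an assumption here, as are the function names and the made-up 4x4 data.

```python
import numpy as np

def mse(original, watermarked):
    """Mean square error of Eq. (5.1): average squared pixel difference."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(watermarked, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same dimensions")
    return float(np.mean((a - b) ** 2))

def psnr(original, watermarked, peak=255.0):
    """PSNR in dB derived from the MSE; `peak` = 255 assumes 8-bit images."""
    err = mse(original, watermarked)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

# Toy data: one pixel of a flat 4x4 image is changed by 4 grey levels,
# giving a squared error of 16 spread over 16 pixels.
host = np.full((4, 4), 100.0)
marked = host.copy()
marked[0, 0] += 4.0

mse_value = mse(host, marked)
psnr_value = psnr(host, marked)
```

Because the MSE is a plain average of squared differences, it scales linearly with the embedded noise energy, which is the property the model in this chapter relies on.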

In order to formulate the model, the transformation of noise energy from the frequency domain to the signal domain is shown using Parseval’s equality.

Definition 5.2.2 According to Parseval’s equality, the energy is conserved between an input signal and its transform domain coefficients in the case of an orthonormal filter bank wavelet base [59]. Assuming an input signal x[n], n ∈ Z, and the corresponding transform domain coefficients y[k], k ∈ Z, the energy conservation theorem gives

$$\|x\|^2 = \|y\|^2. \tag{5.2}$$
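Energy conservation can be checked numerically with the simplest orthonormal wavelet. The sketch below uses a hand-written one-level orthonormal Haar transform (an illustrative choice, not the thesis implementation) on a random test signal, and also perturbs the coefficients to show that the noise energy is preserved by the inverse transform.

```python
import numpy as np

def haar_forward(x):
    """One-level orthonormal Haar analysis: [low-pass half, high-pass half]."""
    x = np.asarray(x, dtype=np.float64)
    if x.size % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.concatenate([low, high])

def haar_inverse(y):
    """Inverse of haar_forward."""
    y = np.asarray(y, dtype=np.float64)
    half = y.size // 2
    low, high = y[:half], y[half:]
    x = np.empty(y.size)
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=64)                # arbitrary test signal
y = haar_forward(x)

delta_y = 0.1 * rng.normal(size=64)    # noise added in the transform domain
delta_x = haar_inverse(y + delta_y) - x

signal_energy = np.sum(x ** 2)
coeff_energy = np.sum(y ** 2)
noise_energy_signal = np.sum(delta_x ** 2)
noise_energy_coeff = np.sum(delta_y ** 2)
```

Both pairs of energies agree up to floating-point rounding, which is what Eq. (5.2) states for the signal and what the model later requires for the embedding noise.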

5.2.2 The model

Based on these primary definitions, the model is built; it consists of the following propositions and their proofs.

Proposition 5.2.1 The sum of the noise power in the transform domain is equal to the sum of the noise power in the input signal for orthonormal transforms. If the input signal noise is denoted by ∆x[n] and the noise in the transform domain by ∆y[k], then

$$\sum_{n} |\Delta x[n]|^2 = \sum_{k} |\Delta y[k]|^2, \tag{5.3}$$

where n ∈ Z and k ∈ Z index the samples of the input signal and the coefficients in the transform domain, respectively.

Proof: As discussed in Chapter 2, the DWT can be realized with a filter bank or with lifting scheme based factoring. In both cases the wavelet decomposition and the reconstruction can be represented by a polyphase matrix [60]. The inverse DWT can be defined by a synthesis filter bank using the polyphase matrix

$$M'(z) = \begin{pmatrix} h'_e(z) & g'_e(z) \\ h'_o(z) & g'_o(z) \end{pmatrix},$$

where h′(z) represents the low pass filter coefficients, g′(z) the high pass filter coefficients, and the subscripts e and o denote even and odd indexed terms, respectively. The transform domain coefficients y can now be re-mapped into the input signal x as below:

$$\begin{pmatrix} x_e(z) \\ x_o(z) \end{pmatrix} = \begin{pmatrix} h'_e(z) & g'_e(z) \\ h'_o(z) & g'_o(z) \end{pmatrix} \begin{pmatrix} y_e(z) \\ y_o(z) \end{pmatrix}. \tag{5.4}$$

Assuming ∆y is the noise introduced in the wavelet domain and ∆x is the corresponding change in the signal after the inverse transform, the relationship between the noise in the wavelet coefficients and the noise in the modified signal can be defined using the following equations. From Eq. (5.4) we can write

$$\begin{pmatrix} x_e(z) + \Delta x_e(z) \\ x_o(z) + \Delta x_o(z) \end{pmatrix} = \begin{pmatrix} h'_e(z) & g'_e(z) \\ h'_o(z) & g'_o(z) \end{pmatrix} \begin{pmatrix} y_e(z) + \Delta y_e(z) \\ y_o(z) + \Delta y_o(z) \end{pmatrix}. \tag{5.5}$$

From Eq. (5.4) and Eq. (5.5), using the linearity property of the Z-transform of the filter coefficients and signals in the polyphase matrix, we get

$$\begin{aligned}
x_e(z) + \Delta x_e(z) &= h'_e(z)\,\bigl(y_e(z) + \Delta y_e(z)\bigr) + g'_e(z)\,\bigl(y_o(z) + \Delta y_o(z)\bigr), \\
h'_e(z)\,y_e(z) + g'_e(z)\,y_o(z) + \Delta x_e(z) &= h'_e(z)\,y_e(z) + h'_e(z)\,\Delta y_e(z) + g'_e(z)\,y_o(z) + g'_e(z)\,\Delta y_o(z), \\
\Delta x_e(z) &= h'_e(z)\,\Delta y_e(z) + g'_e(z)\,\Delta y_o(z).
\end{aligned} \tag{5.6}$$

Similarly, ∆xo(z) can be obtained and written as

$$\Delta x_o(z) = h'_o(z)\,\Delta y_e(z) + g'_o(z)\,\Delta y_o(z). \tag{5.7}$$

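Combining Eq. (5.6) and Eq. (5.7), we can finally write the polyphase matrix form of the noise in the output signal:

$$\begin{pmatrix} \Delta x_e(z) \\ \Delta x_o(z) \end{pmatrix} = \begin{pmatrix} h'_e(z) & g'_e(z) \\ h'_o(z) & g'_o(z) \end{pmatrix} \begin{pmatrix} \Delta y_e(z) \\ \Delta y_o(z) \end{pmatrix}. \tag{5.8}$$

Recalling Parseval’s energy conservation theorem as stated in Definition 5.2.2, from Eq. (5.8) it can be concluded that

$$\begin{aligned}
\sum |\Delta x_e|^2 + \sum |\Delta x_o|^2 &= \sum |\Delta y_e|^2 + \sum |\Delta y_o|^2, \\
\sum_{n} |\Delta x[n]|^2 &= \sum_{k} |\Delta y[k]|^2.
\end{aligned} \tag{5.9}$$

□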

Using the generalized framework, the Proposition 5.2.1 can be applied to build the

relationship between the modification energy in the coefficient domain to embed the

watermark and the distortion performance metrics. In this model propositions are

made for two different categories of embedding schemes, discussed in Chapter 3.

Proposition 5.2.2 In a wavelet-based watermarking scheme, the mean square error (MSE) of the watermarked image is directly proportional to the sum of the energy of the modification values of the selected wavelet coefficients. The modification value itself is a function of the wavelet coefficients, and therefore two different cases are proposed based on the categorization.

Case A: Non-blind model. For the magnitude alteration based embedding method (non-blind algorithm), the modification is a function of the selected coefficient to be watermarked, and the relationship between the MSE and the selected coefficient (Cm,n) is expressed as:

$$\mathrm{MSE} \propto \sum |f(C_{m,n})|^2. \tag{5.10}$$

Case B: Blind model. For the re-quantization based method (blind algorithm), the modification is a function of the neighboring wavelet coefficients of the selected median coefficient to be watermarked, and the relationship between the MSE and the wavelet coefficients Cmin and Cmax is expressed as:

$$\mathrm{MSE} \propto \sum |f(C_{\min}, C_{\max})|^2. \tag{5.11}$$

Proof: In a wavelet-based watermark embedding scheme the watermark information is inserted by modifying the wavelet coefficients. This watermark insertion can be considered as introducing noise in the transform domain. Hence the sum of the energy of the modification values due to watermark embedding in the wavelet domain is equal to the sum of the noise energy in the transform domain, as stated in Proposition 5.2.1. From Eq. (3.1) and Eq. (5.3), the energy sum of the modification values ∆m,n can be defined as:

$$\sum_{m,n} |\Delta_{m,n}|^2 = \sum_{k} |\Delta y[k]|^2. \tag{5.12}$$

Similarly, the pixel domain distortion performance metric, represented by the MSE, is considered as the noise error created in the signal due to the noise in the wavelet domain. Therefore, the sum of the noise energy in the input signal is equal to the total noise error energy in the pixel domain, that is, the MSE multiplied by the number of pixels:

$$\mathrm{MSE} \cdot (X \times Y) = \sum_{n} |\Delta x[n]|^2, \tag{5.13}$$

where X and Y are the image dimensions. The relationship between the distortion performance metric MSE of the watermarked image and the coefficient modification value, which is normally a function of the selected wavelet coefficients, can now be determined using Proposition 5.2.1. Thus, from Eq. (5.12) and Eq. (5.13), we can write:

$$\mathrm{MSE} \cdot (X \times Y) = \sum_{m,n} |\Delta_{m,n}|^2. \tag{5.14}$$

Hence, for any watermarked image, the average noise power MSE is proportional to the sum of the energy of the modification values of the selected wavelet coefficients:

$$\mathrm{MSE} \propto \sum_{m,n} |\Delta_{m,n}|^2. \tag{5.15}$$

Now, with the help of the categorization in the generalized form of the popular wavelet-based watermarking schemes discussed in Chapter 3, a relationship is established between the error energy of the watermarked image and the selected wavelet coefficient energy of the host image. For a magnitude alteration based algorithm, which is a category of non-blind watermarking algorithm, the mean square error MSE is directly proportional to the sum of the energy of the modification value ∆, which is a function of the wavelet coefficient value, as stated below:

$$\mathrm{MSE} \propto \sum |f(C_{m,n})|^2. \tag{5.16}$$

Similarly, for the re-quantization based method (blind watermarking) the mean square error depends on the neighboring wavelet coefficient values. In this case the modification energy |∆m,n|² satisfies an inequality due to the modification range −δ ≤ ∆m,n ≤ δ:

$$|\Delta_{m,n}|^2 \le |\delta|^2. \tag{5.17}$$

Therefore the upper bound of the mean square error MSE is defined by:

$$\mathrm{MSE} \propto \sum |f(C_{\min}, C_{\max})|^2. \tag{5.18}$$

□

5.2.2.1 An example of non-blind model

Considering a specific case of the non-blind algorithm in [17], the modification value ∆ is a direct function of the wavelet coefficient (∆m,n = αCm,nWm,n). Hence Eq. (5.16) can be modified and the MSE can be expressed as:

$$\mathrm{MSE} \propto \sum_{k=1}^{l} |C(k)|^2, \tag{5.19}$$

where C(k) are the selected coefficients to be watermarked and l is the number of such selected coefficients.
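The non-blind relationship can be checked on synthetic data. The sketch below is an illustration under several assumptions that are not taken from the thesis: a hand-written one-level 2-D orthonormal Haar transform stands in for the wavelet kernel, the embedding is additive (C′ = C + ∆ with ∆ = αCW), the watermark W takes values in {−1, +1}, and pixel values are not rounded or clipped. Under these assumptions MSE · (X × Y) equals α² times the energy of the selected coefficients.

```python
import numpy as np

S = np.sqrt(2.0)

def haar2d(img):
    """One-level 2-D orthonormal Haar transform (columns paired, then rows)."""
    a = np.asarray(img, dtype=np.float64)
    a = np.hstack([(a[:, 0::2] + a[:, 1::2]) / S, (a[:, 0::2] - a[:, 1::2]) / S])
    return np.vstack([(a[0::2, :] + a[1::2, :]) / S, (a[0::2, :] - a[1::2, :]) / S])

def ihaar2d(coeffs):
    """Inverse of haar2d."""
    c = np.asarray(coeffs, dtype=np.float64)
    h = c.shape[0] // 2
    a = np.empty_like(c)
    a[0::2, :] = (c[:h, :] + c[h:, :]) / S
    a[1::2, :] = (c[:h, :] - c[h:, :]) / S
    w = a.shape[1] // 2
    out = np.empty_like(a)
    out[:, 0::2] = (a[:, :w] + a[:, w:]) / S
    out[:, 1::2] = (a[:, :w] - a[:, w:]) / S
    return out

rng = np.random.default_rng(1)
host = rng.integers(0, 256, size=(16, 16)).astype(np.float64)  # synthetic host
alpha = 0.08

coeffs = haar2d(host)
band = (slice(8, 16), slice(8, 16))        # diagonal detail subband
C = coeffs[band].copy()                    # selected coefficients
W = rng.choice([-1.0, 1.0], size=C.shape)  # assumed bipolar watermark

marked_coeffs = coeffs.copy()
marked_coeffs[band] = C + alpha * C * W    # additive magnitude alteration
marked = ihaar2d(marked_coeffs)

total_error = np.mean((host - marked) ** 2) * host.size  # MSE * (X * Y)
scaled_energy = alpha ** 2 * np.sum(C ** 2)              # alpha^2 * sum |C(k)|^2
```

The two quantities match up to rounding, so for a fixed α and image size the MSE is proportional to the energy sum of the selected coefficients, as in Eq. (5.19).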

Table 5.1: Correlation coefficient values between the sum of energy and the MSE for different wavelet kernels in various subbands.

          Non-blind model           Blind model
Subband   HR    D4    D8    D16     HR    D4    D8    D16
LL3       0.81  0.81  0.81  0.81    0.66  0.68  0.68  0.73
LH3       0.93  0.94  0.96  0.97    0.78  0.68  0.61  0.58
HL3       0.98  0.99  0.99  0.99    0.78  0.92  0.94  0.97
HH3       0.96  0.97  0.98  0.98    0.82  0.81  0.73  0.72
LH2       0.98  0.98  0.99  0.99    0.80  0.82  0.75  0.81
HL2       0.99  0.99  0.99  0.99    0.92  0.92  0.94  0.97
HH2       0.99  0.99  0.99  0.99    0.83  0.80  0.85  0.89
LH1       0.99  0.99  0.99  0.99    0.89  0.90  0.89  0.90
HL1       0.99  0.99  0.99  0.99    0.84  0.90  0.96  0.94
HH1       0.99  0.99  0.99  0.99    0.90  0.91  0.93  0.96

5.2.2.2 An example of blind embedding model

In a blind embedding algorithm suggested in [5], the quantization step δ is defined as:

$$\delta = \gamma\,\frac{C_{\max} + C_{\min}}{2}, \tag{5.20}$$

where γ is the user-defined watermark weighting factor. As the modification value ∆ depends on δ, with reference to Eq. (5.18), the relationship between the maximum limit of the MSE and the wavelet energy is defined by the following equation:

$$\mathrm{MSE} \propto \sum_{k} \left( C(k)_{\max} + C(k)_{\min} \right)^2, \tag{5.21}$$

where C(k)max and C(k)min are the neighborhood coefficients of the k-th selected median value.
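The upper-bound character of the blind model can be illustrated with a simplified re-quantization rule. The sketch below computes δ from Eq. (5.20) and moves the median to the nearest grid point Cmin + kδ whose index k has the parity of the embedded bit. This parity rule is a hypothetical stand-in chosen for illustration, not the exact algorithm of [5]; it assumes a positive step and does not constrain the new median to stay below Cmax. The triples and bits are synthetic.

```python
import numpy as np

def quant_step(c_min, c_max, gamma):
    """Quantization step of Eq. (5.20)."""
    return gamma * (c_max + c_min) / 2.0

def embed_bit(c_min, c_med, c_max, bit, gamma):
    """Illustrative rule: snap the median to c_min + k*delta with k % 2 == bit.

    Returns (new median, modification value, delta).
    """
    delta = quant_step(c_min, c_max, gamma)
    if delta <= 0:
        raise ValueError("this sketch assumes a positive quantization step")
    k = int(np.floor((c_med - c_min) / delta))
    if k % 2 != bit:
        k += 1                      # next grid point has the required parity
    new_med = c_min + k * delta
    return new_med, new_med - c_med, delta

# Worked case: delta = 0.2 * (20 + 10) / 2 = 3, median 13.2 moves to 16 for bit 0.
new_med, mod, delta = embed_bit(10.0, 13.2, 20.0, 0, 0.2)

# Synthetic sorted triples (c_min, c_med, c_max) and random bits.
rng = np.random.default_rng(3)
triples = np.sort(rng.uniform(50.0, 200.0, size=(100, 3)), axis=1)
bits = rng.integers(0, 2, size=100)
gamma = 0.04

mods, steps = [], []
for (c_min, c_med, c_max), bit in zip(triples, bits):
    _, d, step = embed_bit(c_min, c_med, c_max, int(bit), gamma)
    mods.append(d)
    steps.append(step)

mod_energy = float(np.sum(np.square(mods)))    # sum |Delta|^2 actually spent
bound = float(np.sum(np.square(steps)))        # sum delta^2, cf. Eq. (5.17)
```

The modification energy stays below the bound but is not a fixed fraction of it, which is consistent with Eq. (5.21) describing only the maximum limit of the MSE.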

5.2.3 Experimental simulations and result discussion

The propositions made in the previous section are verified through experimental simulations. The sum of the energy of the selected wavelet coefficients and the MSE of the watermarked image have been calculated for the test images with a combination of different input parameters. As the wavelet coefficients vary greatly between subbands, the performances of all subbands are considered separately after a three-level wavelet decomposition. Three levels of wavelet decomposition create ten subbands: LL3, HL3, LH3 and HH3 at the 3rd decomposition level, HL2, LH2 and HH2 at the 2nd decomposition level and HL1, LH1 and HH1 at the 1st decomposition level. A set of different wavelet kernels with various filter lengths is also selected to perform the simulations. The performance of the Haar (HR), Daubechies-4 (D4), Daubechies-8 (D8) and Daubechies-16 (D16) wavelet kernels is simulated and studied in order to verify the proposed model. A set of 20 images has been considered, as shown in Figure 4.7. Two different sets of results are obtained for each of the non-blind and blind models, and displayed to verify the effects of the different input parameters that are responsible for the embedding distortion performance. These two sets of experimental arrangements and the resulting plots are discussed separately as follows:

5.2.3.1 Non-blind model

In experiment Set 1, the non-blind watermark embedding model is considered as described in Section 5.2.2.1. The sum of the energy of the selected wavelet coefficients to be modified and the MSE of the watermarked image have been calculated using α = 0.5 and the binary watermark logo for each selected method. Various wavelet kernels are used and the results are observed for each selected subband. The correlation coefficients are also calculated and presented in Table 5.1.
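The correlation coefficient used for this comparison can be computed as sketched below, assuming the usual Pearson definition. The per-image energy sums, the 512x512 image size and the perturbation factors are made-up numbers for illustration only; they are not measurements from the thesis.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical energy sums of the selected coefficients for five images.
energy_sums = np.array([1.2e6, 3.4e6, 0.8e6, 2.1e6, 5.0e6])

# If the model held exactly, MSE would be alpha^2 * energy / (X * Y).
alpha, n_pixels = 0.5, 512 * 512
mse_ideal = alpha ** 2 * energy_sums / n_pixels

# A perturbed version mimics deviations from exact proportionality.
mse_perturbed = mse_ideal * np.array([1.10, 0.90, 1.05, 0.95, 1.00])

r_ideal = pearson(energy_sums, mse_ideal)
r_perturbed = pearson(energy_sums, mse_perturbed)
```

Exact proportionality gives a coefficient of 1, and deviations from it lower the value, which is how the entries of Table 5.1 can be read.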

In another representation, a set of graphs is plotted in Figure 5.1 to present the average values of the MSE and the sum of energy over the test image set for four different wavelet kernels. The error bars denote the 95% confidence interval. For display purposes the sum of energy values were scaled so that they can be shown on the same plot for comparing the trend.

In experiment Set 2, the performance for the ten different subbands is plotted for each wavelet kernel, in a similar fashion to experiment Set 1, in order to observe the trend. The results are shown in Figure 5.2. As earlier, a 95% confidence interval is denoted by the error bars, and the LL3 values are scaled suitably in all cases to observe the trends.

5.2.3.2 Blind model

The experimental simulations for the blind model are conducted as described in Section 5.2.2.2. An experimental set up similar to that of the non-blind model is followed, with γ = 0.04 for the LL3 subband and γ = 0.2 for the other (high frequency) subbands. The correlation coefficients, the average pattern graphs for the various wavelet kernels and those for the ten different subbands are presented in Table 5.1, Figure 5.3 and Figure 5.4, respectively.

[Figure 5.1: four panels (Non-blind model: LL3, HL3, LH3 and HH3 subbands) plotting the MSE and the scaled energy sum against the wavelet kernel (1. HR, 2. D4, 3. D8, 4. D16).]

Figure 5.1: Watermark embedding (non-blind) performance graph for different subbands. Four different wavelet kernels are used here: 1. HR, 2. D4, 3. D8 and 4. D16, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.

The simulation results show a strong correlation between the MSE of the watermarked image and the energy sum of the selected wavelet coefficients to be modified. As listed in Table 5.1, the correlation coefficient is more than 0.80 for the non-blind model and no less than 0.58 for the blind model, across the different wavelet kernels and the various selected subbands. In addition, similar graph patterns are observed in Figure 5.1, Figure 5.2, Figure 5.3 and Figure 5.4, which show the proportionality trend between the MSE and the energy sum as proposed in the model. Lower correlation coefficients are observed for the blind model because the proportionality relationship only defines the upper bound in Eq. (5.18) and Eq. (5.21).

[Figure 5.2: four panels (Non-blind model: HR, D4, D8 and D16 wavelets) plotting the MSE and the scaled energy sum against the subband (1. LL3, 2. LH3, 3. HL3, 4. HH3, 5. LH2, 6. HL2, 7. HH2, 8. LH1, 9. HL1, 10. HH1).]

Figure 5.2: Watermark embedding (non-blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: HR, D4, D8 and D16, respectively.

[Figure 5.3: four panels (Blind model: LL3, HL3, LH3 and HH3 subbands) plotting the MSE and the scaled energy sum against the wavelet kernel (1. HR, 2. D4, 3. D8, 4. D16).]

Figure 5.3: Watermark embedding (blind) performance graph for different subbands. Four different wavelet kernels are used here: 1. HR, 2. D4, 3. D8 and 4. D16, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.

[Figure 5.4: four panels (Blind model: HR, D4, D8 and D16 wavelets) plotting the MSE and the scaled energy sum against the subband (1. LL3, 2. LH3, 3. HL3, 4. HH3, 5. LH2, 6. HL2, 7. HH2, 8. LH1, 9. HL1, 10. HH1).]

Figure 5.4: Watermark embedding (blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: HR, D4, D8 and D16, respectively.

5.3 Embedding distortion model for non-orthonormal wavelet bases

5.3.1 Preliminaries

Recalling Parseval’s equality in Definition 5.2.2, Eq. (5.2) is true for orthonormal transforms, where energy is conserved between the signal and transform domains. In contrast, non-orthonormal wavelets such as bi-orthogonal wavelets do not conserve energy. However, for a stable expansion, the transform domain coefficients have to satisfy Eq. (5.22) [59]:

$$A \sum_{k} |y[k]|^2 \le \|x\|^2 \le B \sum_{k} |y[k]|^2, \tag{5.22}$$

where A and B are the orthonormality correction factors.

Based on the propositions and definitions discussed above, we shall build the extended model and make the new propositions. As suggested by Eq. (5.22), a non-orthonormal wavelet base requires an orthonormality correction factor; we shall call this the weighting factor Wt, which is defined as follows:

$$W_t = \frac{\|x\|^2}{\sum_{k} |y[k]|^2}, \tag{5.23}$$

where x and y are the input signal and the transform domain coefficients, respectively.
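The weighting factor of Eq. (5.23) is easy to evaluate for a toy non-orthonormal transform. The sketch below uses an unnormalized average/difference (Haar-like) transform as a stand-in for a non-orthonormal kernel; this choice, the function names and the random test signal are illustrative assumptions, not one of the kernels evaluated in this chapter.

```python
import numpy as np

def avg_diff_forward(x):
    """Unnormalized Haar-like transform: pairwise averages and half-differences."""
    x = np.asarray(x, dtype=np.float64)
    if x.size % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    return np.concatenate([(even + odd) / 2.0, (even - odd) / 2.0])

def weighting_factor(x, y):
    """W_t of Eq. (5.23): signal energy divided by coefficient energy."""
    return float(np.sum(np.square(x)) / np.sum(np.square(y)))

rng = np.random.default_rng(2)
x = rng.normal(size=32)

y_nonorth = avg_diff_forward(x)
w_nonorth = weighting_factor(x, y_nonorth)

# Rescaling by sqrt(2) gives the orthonormal Haar coefficients.
y_orth = np.sqrt(2.0) * y_nonorth
w_orth = weighting_factor(x, y_orth)
```

For this transform each sample pair satisfies a² + b² = 2(s² + d²), where s and d are the average and half-difference, so the factor is 2 for any signal, while the orthonormal version gives 1.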

Therefore, at this point Proposition 5.2.1 can be extended to a more generalized form. In a polyphase decomposition we use different low pass and high pass filter banks. Hence, at each of the different transform points, we obtain different weighting factors W_t^g and W_t^h, corresponding to the high pass and low pass filters, respectively. Proposition 5.2.1 can now be extended as follows, accommodating the weighting factors for non-orthonormal transforms:

$$\begin{aligned}
\sum \left( |\Delta x_e|^2 + |\Delta x_o|^2 \right) &= W_t^{g} \sum \left( |\Delta y_e|^2 + |\Delta y_o|^2 \right) + W_t^{h} \sum \left( |\Delta y_e|^2 + |\Delta y_o|^2 \right), \\
\sum_{n} |\Delta x[n]|^2 &= W_t^{g} \sum \left( |\Delta y_e|^2 + |\Delta y_o|^2 \right) + W_t^{h} \sum \left( |\Delta y_e|^2 + |\Delta y_o|^2 \right).
\end{aligned} \tag{5.24}$$

Now using the generalized framework, Eq. (5.24) can be applied to build the relationship

between the modification energy in the coefficient domain to embed the watermark and

the distortion performance metrics for orthonormal as well as non-orthonormal wavelet

bases.

Proposition 5.3.1 In a wavelet-based watermarking scheme, the mean square error (MSE) of the watermarked image is directly proportional to the weighted sum of the energy of the modification values of the selected wavelet coefficients:

$$\mathrm{MSE} \propto W_t^{\Theta\Upsilon} \sum |\Delta_{m,n}|^2, \tag{5.25}$$

where W_t^{ΘΥ} is the weighting parameter of each subband, Θ represents the subband number and Υ the decomposition level.

Proof: In order to prove this proposition, we recall Eq. (5.12) and Eq. (5.13) and combine them with Eq. (5.24). The combined form can be written as:

$$\begin{aligned}
\mathrm{MSE} \cdot (X \times Y) &= \sum_{n} |\Delta x[n]|^2 \\
&= W_t^{g} \sum_{n} |\Delta y[n]|^2 + W_t^{h} \sum_{n} |\Delta y[n]|^2 \\
&= W_t^{g} \sum_{m,n} |\Delta_{m,n}|^2 + W_t^{h} \sum_{m,n} |\Delta_{m,n}|^2.
\end{aligned} \tag{5.26}$$

Hence, for any watermarked image, the average noise power MSE is proportional to the sum of the weighted energy of the modification values of the selected wavelet coefficients:

$$\mathrm{MSE} \propto W_t^{g} \sum_{m,n} |\Delta_{m,n}|^2 + W_t^{h} \sum_{m,n} |\Delta_{m,n}|^2. \tag{5.27}$$

In the case of 2-D wavelet decompositions, the wavelet kernel transfer functions differ for each subband at each decomposition level, and so do the weighting factors. Hence the ∆ values in Eq. (5.27) are associated with a corresponding weighting parameter for each subband at each decomposition level. We define the weighting parameter of each subband as W_t^{ΘΥ}, where Θ represents the subband number and Υ the decomposition level, and therefore Eq. (5.27) can be re-written as:

$$\mathrm{MSE} \propto W_t^{\Theta\Upsilon} \sum |\Delta_{m,n}|^2. \tag{5.28}$$

□

Therefore, using Eq. (5.28), Eq. (5.10) and Eq. (5.11) can be extended for the non-blind and blind models to Eq. (5.29) and Eq. (5.30), respectively, as follows:

$$\mathrm{MSE} \propto \sum W_t^{\Theta\Upsilon} |f(C_{m,n})|^2. \tag{5.29}$$

$$\mathrm{MSE} \propto \sum W_t^{\Theta\Upsilon} |f(C_{\min}, C_{\max})|^2. \tag{5.30}$$

Hence the above equations can be used universally for various wavelet kernels; for orthonormal wavelet kernels the weighting parameters are equal to unity. For non-orthonormal wavelet kernels, different weighting parameter values are suggested in the next section for the different subbands at each decomposition level.

5.3.2 Experimental simulations and discussion

Experimental simulations have been carried out to verify the propositions made in the

previous section. There are two different parts of the experiment conducted: calculation

of the weighting parameters and simulation of the propositions.

Table 5.2: Weighting parameter values of each subband at each decomposition level for various non-orthonormal wavelets.

Subband   9/7           5/3           MH            MQ
LL1       1.00 ± 0.00   0.99 ± 0.00   1.00 ± 0.00   0.99 ± 0.00
LH1       1.22 ± 0.04   1.31 ± 0.03   1.00 ± 0.00   0.94 ± 0.02
HL1       1.09 ± 0.02   1.31 ± 0.03   1.00 ± 0.00   1.97 ± 0.03
HH1       1.34 ± 0.04   2.43 ± 0.08   1.02 ± 0.00   1.64 ± 0.05
LL2       1.00 ± 0.00   0.99 ± 0.00   1.00 ± 0.00   0.98 ± 0.00
LH2       1.22 ± 0.06   0.69 ± 0.03   1.00 ± 0.00   0.31 ± 0.01
HL2       1.07 ± 0.03   0.74 ± 0.04   1.00 ± 0.00   0.52 ± 0.00
HH2       1.17 ± 0.05   0.81 ± 0.03   1.01 ± 0.00   0.41 ± 0.01
LL3       1.00 ± 0.00   0.98 ± 0.00   1.00 ± 0.00   0.98 ± 0.01
LH3       1.37 ± 0.08   0.57 ± 0.03   1.00 ± 0.00   0.15 ± 0.01
HL3       1.13 ± 0.02   0.57 ± 0.03   1.00 ± 0.00   0.17 ± 0.00
HH3       1.31 ± 0.06   0.53 ± 0.02   1.00 ± 0.00   0.12 ± 0.00

5.3.2.1 Calculation of the weighting parameters

The weighting parameters are calculated for each subband at each decomposition level for various wavelet kernels. A three level decomposition is performed and the weighting parameter values are calculated for each of the ten subbands. A set of different non-orthonormal wavelet kernels, including bi-orthogonal 5/3 and 9/7, is chosen for the experimental simulations. Although the propositions made here assume linearity of the wavelet kernels, we have also simulated non-linear wavelets, such as Morphological Haar (MH) and Quincunx domain Morphological wavelets (MQ), and observed similar behavior. While calculating the weighting parameters, the energy ratios are considered for each subband one at a time, while keeping the other subband values at zero, in Eq. (5.31):

$$W_t^{\Theta\Upsilon} = \frac{\|x\|^2}{\sum |y^{\Theta\Upsilon}|^2}, \tag{5.31}$$

where $W_t^{\Theta\Upsilon}$ is the weighting parameter at subband $\Theta$ at decomposition level $\Upsilon$, $y^{\Theta\Upsilon}$ is the coefficient value at subband $\Theta$ at decomposition level $\Upsilon$ and $x$ is the output pixel values after the inverse wavelet transform. The weighting parameters are calculated for the experimental image set and generalized by averaging them. It is observed that these parameters are image independent. The corresponding weighting parameters for different subbands at each decomposition level are shown in Table 5.2 along with the error. The errors presented correspond to the 95% confidence interval.
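As an illustration of Eq. (5.31), the energy ratio can be sketched on a toy 1-D, single-level Haar synthesis step. This is not the thesis implementation: the helper names `haar_synthesis` and `weighting_parameter`, the 1-D setting and the choice of kernel are assumptions made only to show how one subband is isolated by zeroing the other.

```python
import math


def haar_synthesis(approx, detail, scale):
    """Inverse Haar step: x[2i] = scale*(a+d), x[2i+1] = scale*(a-d)."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append(scale * (a + d))
        signal.append(scale * (a - d))
    return signal


def weighting_parameter(detail, scale):
    """Eq. (5.31) for the detail band: ||x||^2 / sum |y|^2, other band zeroed."""
    zeros = [0.0] * len(detail)
    x = haar_synthesis(zeros, detail, scale)
    return sum(v * v for v in x) / sum(v * v for v in detail)


detail_band = [3.0, -1.0, 4.0, 2.0]  # arbitrary illustrative coefficients
w_orthonormal = weighting_parameter(detail_band, 1 / math.sqrt(2))
w_unnormalised = weighting_parameter(detail_band, 1.0)
print(w_orthonormal, w_unnormalised)
```

With the orthonormal scaling the ratio is unity up to floating-point error, in line with the statement that orthonormal kernels need no weighting, whereas the unnormalised variant of the same step gives a ratio of 2.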


5.3.2.2 Simulations of the propositions

The simulations of the proposed embedding distortion model for non-orthonormal

wavelet kernels are performed using a similar set up as used in the simulations of

the embedding models for orthonormal wavelets. The same test image set is used,

with three level wavelet decomposition. Four different non-orthonormal wavelet ker-

nels, namely, bi-orthogonal 9/7 and 5/3 and non-linear Morphological Haar (MH) and

Quincunx domain Morphological wavelets (MQ), are simulated and studied here. For each simulation, results are shown first without considering the weighting parameters ($W_t^{\Theta\Upsilon}$) and then using the corresponding weighting parameters from Table 5.2:

Non-blind model: The experimental simulations for the non-blind model described in Eq. (5.29) are performed, and the correlation coefficients are calculated and presented in Table 5.3. The average values of the MSE and the sum of energy are shown in Figure 5.5. Column 1 and Column 2 show the results without and with the weighting parameter, respectively, when calculating the energy sum. The error bars denote the 95% confidence interval. For display purposes the sum of energy values is scaled so that they can be shown on the same plot for comparing the trend.

In the other experiment set the subbands are compared and the results are shown in

Figure 5.6. Here Column 1, Column 2 and Column 3 represent the MSE, energy sum

without and with weighting parameters, respectively. As earlier the LL3 values are

scaled suitably in all cases to observe the trends.

Blind model: A similar experimental set, as in non-blind model, is used for the blind

model for non-orthonormal wavelet kernels as described in Eq. (5.30). The correlation

coefficients, average pattern graphs for various wavelet kernels and ten different sub-

bands are presented in Table 5.3, Figure 5.7 and Figure 5.8, respectively, without and

with consideration of the weighting parameters.

It is observed that bi-orthogonal wavelets strongly support the propositions, whereas an occasional deviation is noticed for the MH and MQ wavelet kernels due to the non-linear activity within the transform. However, the general behavioral pattern is maintained in all four non-orthonormal wavelets, which supports the use of the propositions to describe the embedding distortion performance of the generalized watermarking schemes.

Table 5.3: Correlation coefficient values between sum of energy and the MSE for different wavelet kernels in various subbands.

          Non-blind model              Blind model
Subband   9/7    5/3    MH     MQ      9/7    5/3    MH     MQ
LL3       0.80   0.81   0.81   0.81    0.77   0.82   0.43   0.78
LH3       0.95   0.90   0.93   0.97    0.78   0.51   0.73   0.86
HL3       0.99   0.97   0.98   0.95    0.96   0.94   0.73   0.92
HH3       0.95   0.94   0.95   0.96    0.80   0.84   0.69   0.83
LH2       0.97   0.97   0.98   0.99    0.81   0.81   0.70   0.94
HL2       0.99   0.99   0.99   0.99    0.96   0.97   0.90   0.86
HH2       0.99   0.99   0.99   0.98    0.89   0.88   0.84   0.93
LH1       0.99   0.99   0.99   0.97    0.88   0.87   0.90   0.89
HL1       0.97   0.97   0.98   0.99    0.75   0.91   0.91   0.95
HH1       0.99   0.99   0.99   0.99    0.95   0.89   0.88   0.94

5.4 Conclusions

A gap is identified in the literature in mathematically relating the embedding distortion performance metric to the watermarking input parameters. A universal embedding distortion performance model for wavelet based watermarking schemes is presented in this chapter to address this gap. First we have proposed models for orthonormal wavelet bases, which are then extended to non-orthonormal wavelet kernels such as biorthogonal and non-linear wavelets. The current model suggests that the MSE of the watermarked

image is directly proportional to the weighted sum of energy of the modification val-

ues of the selected wavelet coefficients and this proposition is valid for orthonormal as

well as non-orthonormal wavelet kernels. In the case of the non-orthonormal wavelet

bases a weighting parameter is introduced and it is computed experimentally for differ-

ent non-orthonormal wavelet bases whereas in the case of orthonormal wavelets, these

weighting parameters are set to unity. This universal model is verified by extensive

experimental simulations with a wide range of wavelet kernels. Such a model is useful

to optimize the input parameters, i.e., wavelet kernel or subband selection or the host

coefficient selection in wavelet based watermarking schemes.

[Figure 5.5 plots omitted. Panel titles: Non-blind model: LL3, HL3, LH3 and HH3 subbands, each shown without and with the weighted energy sum. x-axis: Wavelets (1. 9/7, 2. 5/3, 3. MH, 4. MQ); y-axis: MSE and scaled energy value; legend: MSE, Energy Sum (value scaled).]

Figure 5.5: Watermark embedding (non-blind) performance graph for different subbands. Four different wavelet kernels used here: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.

[Figure 5.6 plots omitted. Panel titles: Non-blind model: 9/7, 5/3, MH and MQ wavelets, each shown as three panels: MSE, Energy Sum, and Energy Sum (Weighted energy sum). y-axes: MSE and Energy value.]

Figure 5.6: Watermark embedding (non-blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively.

[Figure 5.7 plots omitted. Panel titles: Blind model: LL3, HL3, LH3 and HH3 subbands, each shown without and with the weighted energy sum. x-axis: Wavelets (1. 9/7, 2. 5/3, 3. MH, 4. MQ); y-axis: MSE and scaled energy value.]

Figure 5.7: Watermark embedding (blind) performance graph for different subbands. Four different wavelet kernels used here: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively. Subbands are shown left to right and top to bottom: LL3, HL3, LH3, HH3, respectively.

[Figure 5.8 plots omitted. Panel titles: Blind model: 9/7, 5/3, MH and MQ wavelets, each shown as three panels: MSE, Energy Sum, and Energy Sum (Weighted energy sum). y-axes: MSE and Energy value.]

Figure 5.8: Watermark embedding (blind) performance graph for various wavelets in different subbands. Wavelet kernels are shown left to right and top to bottom: 1. 9/7, 2. 5/3, 3. MH and 4. MQ, respectively.


Chapter 6

Robustness analysis and modeling

6.1 Introduction

Scalable image coding consists of multi-resolution decomposition of images, such as,

the discrete wavelet transform (DWT), followed by hierarchical layered representation

considering the scalability requirements. For the quality scalability driven content

adaptations, the corresponding insignificant quality layers are discarded. Such content

adaptations result in loss of watermark data embedded within the affected coefficients,

thus, diminishing the robustness of the watermarking schemes. The watermarking

literature often focuses only on the robustness to attacks, such as, image processing,

JPEG-based compression and geometric adaptations [107]. This chapter proposes a

novel approach enhancing the robustness of wavelet-based image watermarking for

scalable coding-based content adaptation attacks.

As discussed in Chapter 3, based on the embedding methodology, wavelet-based image

watermarking can be categorized into two main classes: uncompressed domain algo-

rithms [5–18] and joint compression-watermarking algorithms [19–26]. One of the main

objectives of the latter class of algorithms is to enhance the robustness to JPEG 2000-

based compression, although in most cases the effect of full capabilities of JPEG 2000

compression and associated content adaptations has not been considered in the pro-

posed solutions. However, JPEG 2000 Part 8 (ISO/IEC 15444-8, T.807) Secure JPEG

2000 (JPSEC) [20] specifies the framework, concepts, and methodology for securing


JPEG 2000 bit streams considering the full capabilities of JPEG 2000. One of the un-

derlying techniques in JPSEC is watermarking [26,108], proposed as joint compression

domain embedding within the coding pipeline.

The work, here, focuses on the uncompressed domain watermarking algorithms con-

sidering the sequence of events as watermarking, JPEG 2000 compression, content

adaptation and watermark authentication for the decoded image. We model quality

scalability by bit plane discarding and propose the choice of embedding coefficients

and watermarking parameters to minimize the effect of bit plane discarding on the wa-

termarked data. The proposed model addresses both non-blind and blind watermark

extraction scenarios.

6.2 Quality scalability in content adaptation

In a universal media access (UMA) application scenario for images, the resolution-quality layers in the scalable bit stream lead to two types of content adaptation: quality scalability and resolution scalability. The present work focuses only on the quality scalability available in JPEG 2000 scalable image coding. The simplest form of quality layers used in JPEG 2000 coding corresponds to bit plane-based coding of wavelet coefficients. Choosing certain quality layers up to some bit planes corresponds to quantization of the wavelet coefficients. In general, the coefficient quantization due to bit plane discarding, in its simplest form, can be formulated as follows:

$$C_q = \frac{C}{|C|}\left\lfloor \frac{|C|}{Q} \right\rfloor, \tag{6.1}$$

where $C_q$ is the quantized coefficient, $C$ is the non-zero original coefficient, $Q$ is the quantization factor and $\lfloor x \rfloor$ denotes rounding of $x$ to the largest integer not greater than $x$ (called downward rounding). Embedded quantizers often use $Q = 2^N$, where $N$ is a non-negative integer that corresponds to the number of bit planes being discarded.

At the decoder side, the reverse process of the quantization (de-quantization) is carried out by multiplying by the quantization factor $Q$ and allowing for the uncertainty due to downward rounding as follows:

$$\hat{C} = C_q Q + \frac{C}{|C|}\left(\frac{Q-1}{2}\right), \tag{6.2}$$

where $\hat{C}$ is the de-quantized coefficient. The outcome of the combined quantization and de-quantization processes is

$$\hat{C} = \frac{C}{|C|}\left(\left\lfloor \frac{|C|}{Q} \right\rfloor Q + \frac{Q-1}{2}\right). \tag{6.3}$$

[Figure 6.1 diagram omitted. It marks the points $(k-1)2^N$, $k2^N$ and $(k+1)2^N$ on the $C$ axis, together with the center values $C_{(k-1)} = (k-1)2^N + \frac{2^N-1}{2}$ and $C_k = k2^N + \frac{2^N-1}{2}$.]

Figure 6.1: The effect of quantization and de-quantization processes in wavelet domain considering discarding of N bit planes.

Thereby, one can show that the original coefficient values in the range $kQ \leq C < (k+1)Q$, where $k \in \{0, \pm 1, \pm 2, ...\}$, when quantized using $N$ bit plane discarding, i.e., $Q = 2^N$, are mapped to $\hat{C} = C_k$, which is the center value of the region marked by $kQ$ and $(k+1)Q$ as shown in Figure 6.1. Thus, the center value, $C_k$, is given by

$$C_k = k2^N + \frac{k}{|k|}\left(\frac{2^N - 1}{2}\right). \tag{6.4}$$

This relationship is further exploited in Section 6.3 and Section 6.4 in order to model the watermark robustness to bit-plane discarding driven quality scalable decoding in content adaptations.
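The combined quantization and de-quantization step of Eq. (6.3) can be sketched as follows. The function name `reconstruct` is hypothetical and the sketch assumes non-zero coefficients, as in Eq. (6.1); it is an illustration rather than the codec's actual implementation.

```python
import math


def reconstruct(c, n_planes):
    """Eq. (6.3): non-zero coefficient c after discarding n_planes bit planes."""
    q = 2 ** n_planes                      # Q = 2^N
    sign = 1 if c > 0 else -1              # C / |C|
    c_q = math.floor(abs(c) / q)           # magnitude part of Eq. (6.1)
    return sign * (c_q * q + (q - 1) / 2)  # Eq. (6.2) applied to C_q


# Every value in [32, 64) collapses to the same center value when N = 5.
centers = {reconstruct(c, 5) for c in range(32, 64)}
print(centers)  # {47.5}
```

The single value 47.5 is the center $C_1 = 2^5 + (2^5 - 1)/2$ given by Eq. (6.4) for $k = 1$ and $N = 5$.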

6.3 Robustness model for non-blind extraction using magnitude alteration

6.3.1 Preliminaries

For magnitude alteration algorithms Eq. (3.1) and Eq. (3.2) are combined considering < 1, 0, 0, 0 > (τ = 1), ignoring the index subscripts (m, n), to get

$$\begin{aligned}
C' &= C + \alpha C w_b, \\
&= C(1 + \alpha w_b),
\end{aligned} \tag{6.5}$$

where the watermark $W$ in Eq. (3.2) is replaced with $w_b$ ($b \in \{0, 1\}$) for a binary watermark logo. The two values, $w_0$ and $w_1$, are usually chosen as $w_1 > w_0 > 0$. From Eq. (6.5), the relationship between $C'$ and $C$ is

$$C = \frac{C'}{1 + \alpha w_b}. \tag{6.6}$$

Since $(1 + \alpha w_b) > 0$, both $C$ and $C'$ share the same sign. The corresponding modification $\Delta$ is

$$\Delta = C' - C = \alpha C w_b. \tag{6.7}$$

Thus, the extracted watermark value, $w'_b$, is computed as

$$w'_b = \frac{C' - C}{\alpha C}. \tag{6.8}$$

Then the recovered watermark value, $b'$, is

$$b' = \begin{cases} 1 : & w'_b \geq T, \\ 0 : & w'_b < T, \end{cases} \tag{6.9}$$

where the threshold $T = \frac{w_0 + w_1}{2}$.
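A minimal sketch of Eqs. (6.5), (6.8) and (6.9), without any compression in between, is given below. The function names `embed` and `extract` and the numeric values are illustrative assumptions, not taken from the control algorithm.

```python
def embed(c, bit, alpha, w0, w1):
    """Eq. (6.5): C' = C * (1 + alpha * w_b)."""
    w_b = w1 if bit == 1 else w0
    return c * (1 + alpha * w_b)


def extract(c_marked, c, alpha, w0, w1):
    """Eqs. (6.8) and (6.9): estimate w_b, then threshold at T = (w0 + w1) / 2."""
    w_est = (c_marked - c) / (alpha * c)
    threshold = (w0 + w1) / 2
    return 1 if w_est >= threshold else 0


# Illustrative values only (non-blind: the original coefficient is available).
ALPHA, W0, W1 = 0.5, 0.3, 0.8
for original in (100.0, -100.0):
    for bit in (0, 1):
        marked = embed(original, bit, ALPHA, W0, W1)
        print(original, bit, extract(marked, original, ALPHA, W0, W1))
```

Because the estimate in Eq. (6.8) divides by $\alpha C$, the same code handles positive and negative coefficients, but it requires a non-zero original coefficient.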

6.3.2 The model

Now considering the quantization and de-quantization processes in the compression and decompression, let $\hat{C}'$ be the reconstructed watermarked coefficient after decompression. As shown in Eq. (6.4) in Section 6.2, for discarding $N$ bit planes, $\hat{C}'$ represents re-mapping of the original watermarked coefficients, $C'$, to the center points, $C_k$, of the corresponding coefficient cluster, $[k2^N, (k+1)2^N)$, i.e.,

$$\hat{C}' = C_k, \quad \forall\; k2^N \leq C' < (k+1)2^N. \tag{6.10}$$

The proposed model aims to identify coefficients with magnitude values that fall into regions where the accurate watermark extraction is possible after the quantization and de-quantization processes as follows:

Proposition 1 The original wavelet coefficients, $C$, for which an embedded bit with value $b = 1$ remains intact when $N$ bit planes are discarded are in the range

$$\frac{k2^N}{1 + \alpha w_1} \leq C \leq \frac{C_k}{1 + \alpha T},$$

with $k \in \{0, \pm 1, \pm 2, \pm 3, ...\}$.

Proof : To extract $b = 1$ accurately, we need $w'_b \geq T$. That means

$$\frac{C' - C}{\alpha C} \geq T. \tag{6.11}$$

Since both $C'$ and $C$ share the same sign and $|C'| > |C|$,

$$C' \geq C(1 + \alpha T). \tag{6.12}$$

If there is no compression, the value of $C'$ is given by Eq. (6.5). But due to compression, only the reconstructed coefficients, $\hat{C}'$, are available. The correct extraction of $b = 1$ is possible if

$$\hat{C}' \geq C'. \tag{6.13}$$

Considering the values in the region, $k2^N \leq C' < (k+1)2^N$,

$$\begin{aligned}
\forall\; k2^N \leq C' \leq C_k, &\quad \hat{C}' = C_k \Rightarrow \hat{C}' \geq C', \\
\forall\; C_k < C' < (k+1)2^N, &\quad \hat{C}' = C_k \Rightarrow \hat{C}' < C'.
\end{aligned} \tag{6.14}$$

Therefore, the condition in Eq. (6.13) is true when

$$k2^N \leq C' \leq C_k, \tag{6.15}$$

which in terms of the original coefficients, $C$, is

$$\begin{aligned}
k2^N \leq C(1 + \alpha w_1) &\leq C_k, \\
\frac{k2^N}{1 + \alpha w_1} \leq C &\leq \frac{C_k}{1 + \alpha w_1}.
\end{aligned} \tag{6.16}$$

However, even if $\hat{C}' < C'$, the correct extraction of $b = 1$ is still possible if (by considering Eq. (6.12))

$$C_k - C \geq \alpha C T. \tag{6.17}$$

This means,

$$\begin{aligned}
C_k &\geq C(1 + \alpha T), \\
C &\leq \frac{C_k}{1 + \alpha T}.
\end{aligned} \tag{6.18}$$

We know that $w_1 > T$. Therefore, $\frac{1}{1 + \alpha T} > \frac{1}{1 + \alpha w_1}$ and thus we can merge the ranges in Eq. (6.16) and Eq. (6.18), as summarized in Figure 6.2, to get the range of original coefficients capable of robust extraction of $b = 1$ as

$$\frac{k2^N}{1 + \alpha w_1} \leq C \leq \frac{C_k}{1 + \alpha T}. \tag{6.19}$$

□

Figure 6.2: The range of C capable of robust extraction of b = 1. Row 1: $\hat{C}' \geq C'$; Row 2: $\hat{C}' < C'$; Row 3: The total range.

Proposition 2 The original wavelet coefficients, $C$, for which an embedded bit with value $b = 0$ remains intact when $N$ bit planes are discarded are in the range

$$\frac{C_{(k-1)}}{1 + \alpha T} < C < \frac{k2^N}{1 + \alpha w_0},$$

with $k \in \{0, \pm 1, \pm 2, \pm 3, ...\}$.

Proof : To extract $b = 0$ accurately, we need $w'_b < T$. That means

$$\frac{C' - C}{\alpha C} < T, \tag{6.20}$$

$$C' < C(1 + \alpha T). \tag{6.21}$$

The correct extraction of $b = 0$ from the reconstructed coefficients, $\hat{C}'$, is possible if

$$\hat{C}' < C'. \tag{6.22}$$

Therefore, considering the values in the region, $(k-1)2^N \leq C' < k2^N$,

$$\begin{aligned}
\forall\; (k-1)2^N \leq C' \leq C_{k-1}, &\quad \hat{C}' = C_{k-1} \Rightarrow \hat{C}' \geq C', \\
\forall\; C_{k-1} < C' < k2^N, &\quad \hat{C}' = C_{k-1} \Rightarrow \hat{C}' < C'.
\end{aligned} \tag{6.23}$$

Therefore, the condition in Eq. (6.22) is true when

$$C_{k-1} < C' < k2^N, \tag{6.24}$$

which in terms of the original coefficients, $C$, is

$$\begin{aligned}
C_{k-1} < C(1 + \alpha w_0) &< k2^N, \\
\frac{C_{k-1}}{1 + \alpha w_0} < C &< \frac{k2^N}{1 + \alpha w_0}.
\end{aligned} \tag{6.25}$$

However, even if $\hat{C}' \geq C'$, the correct extraction of $b = 0$ is still possible if

$$C_{k-1} - C < \alpha C T, \tag{6.26}$$

as suggested by Eq. (6.21). This means,

$$\begin{aligned}
C_{k-1} &< C(1 + \alpha T), \\
C &> \frac{C_{k-1}}{1 + \alpha T}.
\end{aligned} \tag{6.27}$$

Since $w_0 < T$, we can write $\frac{1}{1 + \alpha T} < \frac{1}{1 + \alpha w_0}$. Thus we can merge the ranges in Eq. (6.25) and Eq. (6.27), as summarized in Figure 6.3, to get the range of original coefficients capable of robust extraction of $b = 0$ as

$$\frac{C_{(k-1)}}{1 + \alpha T} < C < \frac{k2^N}{1 + \alpha w_0}. \tag{6.28}$$

□

Figure 6.3: The range of C capable of robust extraction of b = 0. Row 1: $\hat{C}' < C'$; Row 2: $\hat{C}' \geq C'$; Row 3: The total range.

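Finally, as shown in Figure 6.4, the above two results in Eq. (6.19) and Eq. (6.28) are combined to derive the region of coefficient magnitudes that are capable of retaining both $b = 1$ and $b = 0$ when $N$ bit planes are discarded as follows:

$$\frac{k2^N}{1 + \alpha w_1} \leq C < \frac{k2^N}{1 + \alpha w_0}. \tag{6.29}$$

[Figure 6.4 diagram omitted. It marks the points $(k-1)2^N$, $k2^N$ and $(k+1)2^N$ on the $C$ axis, the center values $C_{(k-1)}$ and $C_k$, and the range limits $\frac{C_{(k-1)}}{1+\alpha T}$, $\frac{k2^N}{1+\alpha w_1}$, $\frac{k2^N}{1+\alpha w_0}$ and $\frac{C_k}{1+\alpha T}$.]

Figure 6.4: The combined range of C capable of robust extraction of both b = 1 and b = 0.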

6.3.3 Examples

As an example, we choose w1 = 0.8, w0 = 0.3, the threshold T = 0.55 and a data set

containing coefficient values, C from −512 to 512, and show the ranges of coefficient

values that can robustly retain the embedded watermark bits after discarding N = 7

bit planes in Table 6.1. Two scenarios of α = 0.5 and α = 0.05 are shown. First, the coefficient selection for embedding b = 1 using Eq. (6.19) is shown, followed by the coefficient selection for embedding b = 0 using Eq. (6.28). Finally, the common region is found for embedding any value of b as shown in Eq. (6.29).
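The combined range of Eq. (6.29) can be checked numerically with a short sketch. The function names `robust_range`, `reconstruct` and `survives` are hypothetical; the sketch restricts itself to positive $k$ and positive coefficients and simulates bit plane discarding with Eq. (6.3) rather than a real JPEG 2000 codec.

```python
import math


def robust_range(k, n_planes, alpha, w0, w1):
    """Eq. (6.29) for k > 0: k*2^N/(1+alpha*w1) <= C < k*2^N/(1+alpha*w0)."""
    edge = k * 2 ** n_planes
    return edge / (1 + alpha * w1), edge / (1 + alpha * w0)


def reconstruct(c, n_planes):
    """Eq. (6.3): non-zero coefficient after discarding n_planes bit planes."""
    q = 2 ** n_planes
    sign = 1 if c > 0 else -1
    return sign * (math.floor(abs(c) / q) * q + (q - 1) / 2)


def survives(c, bit, n_planes, alpha, w0, w1):
    """Embed, discard bit planes, extract; True if the bit is recovered."""
    marked = c * (1 + alpha * (w1 if bit else w0))      # Eq. (6.5)
    decoded = reconstruct(marked, n_planes)             # Eq. (6.3)
    w_est = (decoded - c) / (alpha * c)                 # Eq. (6.8)
    return (1 if w_est >= (w0 + w1) / 2 else 0) == bit  # Eq. (6.9)


low, high = robust_range(1, 7, 0.5, 0.3, 0.8)
print(round(low), round(high))  # 91 111
print(survives(100, 1, 7, 0.5, 0.3, 0.8), survives(100, 0, 7, 0.5, 0.3, 0.8))
print(survives(120, 0, 7, 0.5, 0.3, 0.8))  # 120 lies outside the range
```

The rounded limits agree with the k = 1 column of the combined rows of Table 6.1(a). A coefficient inside the range (C = 100) retains both bit values, while one just above it (C = 120) fails for b = 0 in this simulation.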

Figure 6.5 shows the robustness ability of wavelet coefficients for two different subbands

(LL and HL after a single level of decomposition) for given numbers of bit plane dis-

carding. In this figure the gray scale is quantized into 7 levels with black representing

robustness to N = 0 bit plane discarding (i.e., the least robust) and white representing

robustness up to N = 6 bit plane discarding (i.e., the most robust), with other inter-

mediate grey levels corresponding to coefficients robust up to discarding of N = 1, ..., 5

bit planes, respectively.

6.4 Robustness model for blind extraction using re-quantization-based modifications

6.4.1 Preliminaries

Table 6.1: Data value (C) ranges for retaining the watermark data, b = 1 and b = 0, for discarding N = 7 bit planes.

(a) α = 0.5

k                     -5     -4     -3     -2     -1     0      1      2      3      4      5
b = 1            min  -512   -460   -358   -256   -153   0      91     183    274    366    457
                 max  -457   -366   -274   -183   -91    51     153    256    358    460    512
b = 0            min  -512   -445   -334   -223   -111   -51    51     153    256    358    460
                 max  -460   -358   -256   -153   -51    0      111    223    334    445    512
b = 1 and b = 0  min  -512   -445   -334   -223   -111   -      91     183    274    366    460
                 max  -460   -366   -274   -183   -91    -      111    223    334    445    512

(b) α = 0.05

k                     -4     -3     -2     -1     0      1      2      3      4
b = 1            min  -512   -437   -312   -187   0      123    246    369    492
                 max  -492   -369   -246   -123   62     187    312    437    512
b = 0            min  -504   -378   -252   -126   -62    62     187    312    437
                 max  -437   -312   -187   -62    0      126    252    378    504
b = 1 and b = 0  min  -504   -378   -252   -126   -      123    246    369    492
                 max  -492   -369   -246   -123   -      126    252    378    504

Recalling Section 3.1.2.4, in re-quantization-based modification (e.g., [5,6,22]), a group of coefficients (usually three coefficients) are rank ordered to identify the minimum ($C_1$), the maximum ($C_3$) and the median ($C_2$) coefficients. Then $C_2$ is modified to obtain $C'_2$ as follows:

$$C'_2 = f(\gamma, C_1, C_3, b), \tag{6.30}$$

where $b$ is the binary watermark bit, $b \in \{0, 1\}$, $\gamma$ is a parameter corresponding to the watermark strength and $f()$ is a non-linear transformation process which is described as follows. This process first partitions the coefficient range, $r$, where

$$r = C_3 - C_1, \tag{6.31}$$

by the quantization bin size, $\delta$, defined by

$$\delta = \gamma \frac{|C_1| + |C_3|}{2}, \tag{6.32}$$

into quantization bins with indexes, $i = 0, 1, ..., \frac{r}{\delta} - 1$. Then in order to embed a watermark bit $b$, the original value, $C_2$, is modified to $C'_2$ by choosing any value that comes from the quantization bin index, $i$, where $b = i \% 2$, i.e.,

$$C'_2 \in \left\{ C : \frac{C - C_1}{\delta} \% 2 = b \right\}, \tag{6.33}$$

where % denotes the modulo operator as shown in Figure 3.3. To extract the watermark bit, $b$, back from $C_1$, $C'_2$ and $C_3$,

$$b = \left( \frac{C'_2 - C_1}{\delta} \right) \% 2. \tag{6.34}$$

Figure 6.5: Coefficients' robustness rank maps for discarding up to N bit planes shown using 7 gray scales corresponding to N = 0, ..., 6. Left: LL subband; Right: HL subband; Row 1: Embedding b = 1; Row 2: Embedding b = 0; Row 3: Embedding any value of b.
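The re-quantization step of Eqs. (6.31) to (6.34) can be sketched as below. Two points are assumptions of this sketch and not stated in the text: the bin index is taken as the floor of $(C - C_1)/\delta$, and the embedder moves $C_2$ to the center of the nearest bin with the required parity (the text allows any value in such a bin). The function names are hypothetical.

```python
import math


def bin_size(c1, c3, gamma):
    """Eq. (6.32): delta = gamma * (|C1| + |C3|) / 2."""
    return gamma * (abs(c1) + abs(c3)) / 2


def extract_bit(c1, c2_marked, c3, gamma):
    """Eq. (6.34), with the bin index taken as a floor (assumption)."""
    delta = bin_size(c1, c3, gamma)
    return math.floor((c2_marked - c1) / delta) % 2


def embed_bit(c1, c2, c3, bit, gamma):
    """Move C2 to the center of the nearest bin whose index parity equals bit."""
    delta = bin_size(c1, c3, gamma)
    n_bins = int((c3 - c1) / delta)  # bins i = 0 .. r/delta - 1, Eq. (6.31)
    centers = [c1 + (i + 0.5) * delta for i in range(n_bins) if i % 2 == bit]
    if not centers:
        raise ValueError("coefficient range too small to embed this bit")
    return min(centers, key=lambda value: abs(value - c2))


# Illustrative coefficient triple with gamma = 0.1.
for bit in (0, 1):
    marked = embed_bit(35, 181, 203, bit, 0.1)
    print(bit, marked, extract_bit(35, marked, 203, 0.1))
```

Choosing bin centers keeps the modified value away from bin boundaries, which is a design choice of the sketch aimed at avoiding ambiguous extraction at the edges.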

6.4.2 The model

After compression and decompression, only the reconstructed coefficients, $\hat{C}_1$, $\hat{C}'_2$ and $\hat{C}_3$, are available to the watermark extraction process. For successful extraction, i.e., to maintain the robustness to quality scalable compression, the relationship $\hat{C}_1 \leq \hat{C}'_2 \leq \hat{C}_3$ must remain intact while $\hat{C}_1 \neq \hat{C}'_2 \neq \hat{C}_3$ and

$$b = \left( \frac{\hat{C}'_2 - \hat{C}_1}{\hat{\delta}} \right) \% 2, \tag{6.35}$$

where

$$\hat{\delta} = \gamma \frac{|\hat{C}_1| + |\hat{C}_3|}{2}. \tag{6.36}$$

[Figure 6.6 diagram omitted. It marks the points $k2^N$, $(k+1)2^N$, $(k+2)2^N$, $(k+m)2^N$, $(k+m+1)2^N$, $(k+n)2^N$ and $(k+n+1)2^N$ on the $C$ axis, with $\hat{C}_1 = C_k$, $C_{k+1}$, $\hat{C}_2 = C_{k+m}$ and $\hat{C}_3 = C_{k+n}$.]

Figure 6.6: Mapping of coefficients after quantization and de-quantization processes considering the discarding of N bit planes.

As discussed earlier, the original coefficient values in the range $kQ \leq C < (k+1)Q$, where $k \in \{0, \pm 1, \pm 2, \pm 3, ...\}$, when quantized using $N$ bit plane discarding, i.e., $Q = 2^N$, are mapped to $\hat{C} = C_k$, which is the center value of the region marked by $kQ$ and $(k+1)Q$ as shown in Figure 6.1. The center value, $C_k$, of the clusters is given by Eq. (6.4). In line with this definition, we assume that the three mapped values, $\hat{C}_1$, $\hat{C}'_2$ and $\hat{C}_3$, are $C_k$, $C_{k+m}$ and $C_{k+n}$, respectively, where $m, n \in \{0, 1, 2, ...\}$ and $0 \leq m \leq n$, as shown in Figure 6.6. Therefore, the robustness model needs to estimate the extracted watermark bit, $\hat{b}$, as a function of $m$, with respect to discarding $N$ bit planes at the time of embedding the watermark.

Proposition 3 The estimated extracted watermark bit, $\hat{b}$, with respect to discarding $N$ bit planes, is given by

$$\hat{b} = \left( \frac{2m + z}{\gamma(|k| + |k+n| + 1)} \right) \% 2,$$

where $z = 0$ if $C_k$ and $C_{k+m}$ have the same sign, and $z = 2 - 2^{1-N}$ otherwise.

Proof : $C_k$ in Eq. (6.4) can be represented in the sign magnitude form as follows:

$$C_k = \frac{k}{|k|}\left( |k|2^N + \frac{2^N - 1}{2} \right). \tag{6.37}$$

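With reference to Eq. (6.36), the reconstructed watermark quantization step value, $\hat{\delta}$, after discarding $N$ bit planes can now be defined as:

$$\begin{aligned}
\hat{\delta} &= \gamma \frac{|\hat{C}_1| + |\hat{C}_3|}{2}, \\
&= \gamma \frac{|C_k| + |C_{k+n}|}{2}, \\
&= \frac{\gamma}{2}\left( |k|2^N + \frac{2^N - 1}{2} + |k+n|2^N + \frac{2^N - 1}{2} \right), \\
&= \gamma 2^{N-1}(|k| + |k+n| + 1) - \frac{\gamma}{2}.
\end{aligned} \tag{6.38}$$

The usual values of $\gamma$ are in the range $0.05 \leq \gamma \leq 0.1$. Therefore, $\frac{\gamma}{2} \ll \gamma 2^{N-1}(|k| + |k+n| + 1)$. Thus, Eq. (6.38) can be re-written as

$$\hat{\delta} = \gamma 2^{N-1}(|k| + |k+n| + 1). \tag{6.39}$$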

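Using Eq. (6.37) and Eq. (6.39) in Eq. (6.35), the estimated extracted watermark bit, $\hat{b}$, with respect to discarding $N$ bit planes, can be formulated as,

$$\begin{aligned}
\hat{b} &= \left( \frac{\hat{C}'_2 - \hat{C}_1}{\hat{\delta}} \right) \% 2, \\
&= \left( \frac{C_{k+m} - C_k}{\hat{\delta}} \right) \% 2, \\
&= \left( \frac{(k+m)2^N + \frac{(k+m)}{|k+m|}\frac{(2^N - 1)}{2} - k2^N - \frac{k}{|k|}\frac{(2^N - 1)}{2}}{\gamma 2^{N-1}(|k| + |k+n| + 1)} \right) \% 2, \\
&= \left( \frac{m2^N + \left( \frac{(k+m)}{|k+m|} - \frac{k}{|k|} \right)\left( \frac{2^N - 1}{2} \right)}{\gamma 2^{N-1}(|k| + |k+n| + 1)} \right) \% 2, \\
&= \left( \frac{2m + \left( \frac{(k+m)}{|k+m|} - \frac{k}{|k|} \right)\left( 1 - 2^{-N} \right)}{\gamma(|k| + |k+n| + 1)} \right) \% 2.
\end{aligned} \tag{6.40}$$

Now considering the two cases, $k$ and $k+m$ have the same sign (Case 1) and $k$ and $k+m$ have different signs (Case 2),

$$\hat{b} = \begin{cases}
\left( \dfrac{2m}{\gamma(|k| + |k+n| + 1)} \right) \% 2 & : \text{Case 1}, \\[2ex]
\left( \dfrac{2m + 2 - 2^{1-N}}{\gamma(|k| + |k+n| + 1)} \right) \% 2 & : \text{Case 2}.
\end{cases} \tag{6.41}$$

□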

Thus, using Eq. (6.41), it is possible to predict $\hat{b}$ for a given number of discarded bit planes, $N$, for particular modifications of $C_2$ to $C'_2$ during embedding. This relationship is used for identifying the ranges of values for $C'_2$, i.e., the value of $C_2$ after embedding the watermark bit, $b$, by considering the value of $m$ for given $k$, $n$ and $N$. Similarly, the optimal values of $C'_2$ for $N - u$ lower bit planes being discarded, where $u \in \{1, 2, 3, ..., N - 1\}$, are calculated to maintain the robustness for discarding of any bit plane up to the $N$th bit plane.

Table 6.2: Values of m and corresponding $\hat{b}$ for different modifications of $C'_2$ for k = 1, k + n = 6 and N = 5.

C'_2 values range   \hat{C}_2   m   \hat{b}
35-63               47.5        0   0
64-95               79.5        1   1
96-127              111.5       2   1
128-159             143.5       3   0
160-191             175.5       4   0
192-203             207.5       5   1

6.4.3 Examples

Let $C_1 = 35$, $C_2 = 181$ and $C_3 = 203$ be the three coefficients concerned. Set $\gamma = 0.1$ and consider $N = 5$ bit planes being discarded. Then $k = \lfloor 35/32 \rfloor = 1$ and $k + n = \lfloor 203/32 \rfloor = 6$. Since $\gamma(|k| + |k+n| + 1) = 0.1 \times 8 = 0.8$, Eq. (6.41) is simplified to $\hat{b} = (2.5m) \% 2$. A look-up table of $\hat{b}$ for different $\hat{C}_2$ and corresponding $m$ is derived, as shown in Table 6.2. Thus, in this example, for robustly embedding a watermark bit, $b = 0$, $C_2$ can be modified to any value in the regions $35 \leq C'_2 \leq 63$ and $128 \leq C'_2 \leq 191$. Similarly, for robustly embedding a watermark bit, $b = 1$, $C_2$ can be modified to any value in the regions $64 \leq C'_2 \leq 127$ and $192 \leq C'_2 \leq 203$. However, a value close to the original value, $C_2$, within these ranges is chosen in order to minimize the amount of distortion.
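The look-up of Table 6.2 can be reproduced with a short sketch of Eq. (6.41). The text does not state how the real-valued quotient is converted to a bit; the sketch assumes round-half-up before the modulo, which is the convention that matches the tabulated values. Exact rational arithmetic is used to avoid floating-point ties, zero is treated as non-negative in the sign test, and the function name `predicted_bit` is hypothetical.

```python
import math
from fractions import Fraction


def predicted_bit(m, k, n, n_planes, gamma):
    """Eq. (6.41) with an assumed round-half-up before the modulo."""
    same_sign = (k >= 0) == (k + m >= 0)           # zero counted as non-negative
    z = Fraction(0) if same_sign else 2 - Fraction(2, 2 ** n_planes)  # 2 - 2^(1-N)
    value = (2 * m + z) / (Fraction(gamma) * (abs(k) + abs(k + n) + 1))
    return math.floor(value + Fraction(1, 2)) % 2  # rounding rule is an assumption


# Example of Section 6.4.3: k = 1, k + n = 6, N = 5, gamma = 0.1.
bits = [predicted_bit(m, 1, 5, 5, "0.1") for m in range(6)]
print(bits)  # [0, 1, 1, 0, 0, 1]
```

Here `gamma` is passed as a string so that `Fraction` represents 0.1 exactly; the printed list matches the last column of Table 6.2 for m = 0, ..., 5.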

Similar computations are carried out for N = 1, 2, 3, 4, ... to obtain the corresponding robust ranges for $C'_2$. The common range for all N values ensures correct watermark extraction when N or any lower number of bit planes are discarded. The extension of the previous example for N = 1, 2, 3, 4 to find the value ranges of $C'_2$ to embed the watermark bits, b = 1 or b = 0, is shown in Table 6.3.

Table 6.3: Ranges of $C'_2$ to embed watermark bits, b = 1 and b = 0, for different N

Embedding b = 0
N       Robustness for discarding N bit planes   Robustness for discarding up to N bit planes
N = 1   172-184 & 196-203                        -
N = 2   168-180 & 192-203                        172-180 & 196-203
N = 3   176-184 & 200-203                        176-180 & 200-203
N = 4   176-192 & -                              176-180
N = 5   128-192                                  176-180

Embedding b = 1
N       Robustness for discarding N bit planes   Robustness for discarding up to N bit planes
N = 1   160-172 & 184-196                        -
N = 2   156-168 & 180-192                        160-168 & 184-192
N = 3   160-176 & 184-200                        160-176 & 184-192
N = 4   144-176 & 192-203                        160-176
N = 5   192-203                                  -

6.5 Performance evaluation

This section presents the experimental validation of the two proposed models, first by simulating wavelet domain bit plane discarding and then by evaluating their performance against JPEG 2000 quality scalability-based content adaptation. The experimental set up uses the test image set shown in Figure 4.7.

In the experiments, the watermark data is first embedded by considering different values of N, i.e., the maximum number of bit planes that can be discarded without affecting the robustness. The case N = 0 corresponds to not using the model. Then, for each value of N, the robustness is evaluated at different compression ratios obtained with different quantization factors, Q, where Q = 2^p and p is the number of bit planes actually being discarded. Finally, JPEG 2000 quality scalability-based content adaptation experiments are performed using the WEBCAM framework to evaluate the performance of the proposed models in actual quality scalability scenarios. The extracted watermark data is compared with the original watermark data using the Hamming distance. The lower the Hamming distance, the higher the robustness.
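As an illustration of the robustness measure, the sketch below computes a Hamming distance between two watermark bit sequences. Normalising by the watermark length is an assumption, made because the reported plots use fractional values; the thesis defines its own measure in Eq. (4.1), which is not reproduced here. The function name is illustrative.

```python
def hamming_distance(original, extracted):
    """Fraction of positions where two equal-length bit sequences differ."""
    if len(original) != len(extracted):
        raise ValueError("watermarks must have the same length")
    if not original:
        raise ValueError("watermarks must be non-empty")
    mismatches = sum(1 for a, b in zip(original, extracted) if a != b)
    return mismatches / len(original)
```

Under this normalisation a value of 0 means perfect extraction, while a value near 0.5 is what a random guess of each bit would be expected to produce.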

[Figure 6.7 plots, "Embedding performance: PSNR vs Data capacity" for image 1 and image 2: PSNR (dB) versus watermarking data capacity (number of bits), with one curve for each of N = 0 to N = 6.]

Figure 6.7: Embedding performance of the model for non-blind watermarking considering different values of N at embedding. Column 1: Image 1; and Column 2: Image 2.

6.5.1 Evaluation of the model for non-blind watermarking

The proposed model for non-blind watermarking is evaluated using one of the magnitude alteration algorithms [17] as the control algorithm. Similar algorithms are discussed in the literature and categorized in Chapter 3 and Chapter 4, as shown in Table 4.1. The proposed robustness model is incorporated into the algorithm for choosing the coefficients in which the watermark is embedded, and the robustness against quality scalability content adaptation attacks is compared with that of the original algorithm, which does not use the proposed model. The experimental set up includes the 9/7 wavelet, 3 levels of decomposition and embedding within the low-low (LL) frequency subband using α = 0.01, and considers different values for N, i.e., the maximum number of bit planes that can be discarded without affecting the robustness. The case N = 0 corresponds to the control algorithm that does not use the model. For completeness, the embedding performance is first shown in Figure 6.7 for different values of N. The watermark embedding performance is measured by the watermark data count and the corresponding peak signal to noise ratio (PSNR) of the watermarked image with respect to the original image. Since fewer coefficients are able to robustly retain the watermark data for the larger N values considered in the model, the amount of watermark data embedded is smaller for larger N, which in turn results in higher PSNR values for such cases.
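For reference, the PSNR used as the embedding distortion measure can be sketched as follows. This uses the standard definition, 10·log10(peak² / MSE), and assumes 8-bit images (peak value 255) supplied as equally sized lists of rows; the function name and the input layout are illustrative choices.

```python
import math


def psnr(original, watermarked, peak=255.0):
    """PSNR in dB between two equally sized images given as lists of rows."""
    squared_error = 0.0
    count = 0
    for row_o, row_w in zip(original, watermarked):
        for o, w in zip(row_o, row_w):
            squared_error += (o - w) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must be non-empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)
```

Modifying fewer coefficients lowers the mean squared error, which is consistent with the higher PSNR values observed for larger N.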


6.5.1.1 Simulations with bit plane discarding

Figure 6.8 shows the robustness performance for various compression steps achieved by a quantization factor Q = 2^p, where p is the number of bit planes actually being discarded, for embedding scenarios that use different embedding model values of N. The first two rows correspond to Image 1 and Image 2 from the test image set, respectively. The third row shows the average performance, with error bars corresponding to 95% confidence intervals, for the entire image set. For better visualization, we have grouped the plots into two sets: viz., Column 1 with N = 1, 3, 5 and Column 2 with N = 2, 4, 6. In the plots, the x-axis represents the number of bit planes being discarded (p) during compression while the y-axis shows the robustness performance in terms of Hamming distance.

The simulations verify the high robustness performance of the proposed model. It is also evident that the robustness remains high whenever the number of bit planes being discarded satisfies p ≤ N.
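The bit plane discarding used in these simulations can be illustrated with a small sketch. It assumes integer coefficient magnitudes and a reconstruction that simply zeroes the discarded planes (no mid-point reconstruction), which may differ from the exact quantizer used in the experiments; the function name is illustrative.

```python
def discard_bit_planes(coefficient, p):
    """Zero the p least significant bit planes of an integer coefficient magnitude."""
    if p < 0:
        raise ValueError("p must be non-negative")
    q = 1 << p                                # quantization factor Q = 2^p
    magnitude = (abs(coefficient) // q) * q   # drop the p lowest bit planes
    return -magnitude if coefficient < 0 else magnitude
```

With p = 5, for example, the values 35 and 203 map to 32 and 192, i.e., ⌊35/32⌋·32 and ⌊203/32⌋·32.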

For further clarification of the proposed model's behavior, Figure 6.9 shows the difference images |C′ − C|, after embedding and after compression, for the LL subband coefficients chosen using two embedding model scenarios, N = 0 and N = 5, and two bit plane discarding scenarios, p = 0 and p = 5. For N = 0 all coefficients are chosen for watermark embedding, while for N = 5 only some of the coefficients are chosen by the model. The non-chosen coefficients are shown in black in Figure 6.9.b and Figure 6.9.d. For the chosen coefficients, the absolute magnitude differences due to embedding and compression are added to a bias value and displayed in gray scale, to give a clear distinction between the chosen and non-chosen coefficients. The results show a greater similarity between Figure 6.9.b and Figure 6.9.d (where p ≤ N) than between Figure 6.9.a and Figure 6.9.c (where p > N). Accounting for more bit plane discarding during embedding (N = 5 here) helps to select the optimum set of coefficients, i.e., those that can retain the watermark after the said number of bit planes (p = 5 here) is discarded during compression, hence the high robustness.

6.5.1.2 Experiments with JPEG 2000 quality scalability

Figure 6.10 shows the robustness performance of the proposed embedding model when the JPEG 2000 quality scalability-based content adaptations are used. For better visualization, we have grouped the plots into two sets: viz., Column 1 with N = 1, 3, 5

[Figure 6.8 plots: Hamming distance versus number of bit planes discarded (p), with curves for the algorithm without the model and for the listed values of N. Panel titles: "Robustness to bit plane discarding: image 1" and "Robustness to bit plane discarding: image set".]

Figure 6.8: Non-blind model evaluation: Robustness performance against discarding of p bit planes for the embedding models that consider N = 1, 3, 5 (Column 1) and N = 2, 4, 6 (Column 2) bit planes to be discarded. N = 0 corresponds to the algorithm without the model. Row 1: Image 1; Row 2: Image 2; and Row 3: the entire image set.

and Column 2 with N = 2, 4, 6. In the plots, the x-axis shows the compression ratio

while the y-axis shows the robustness performance in terms of Hamming distance. The

first two rows show the results for two of the test images while the third row shows

the average performance with error bars corresponding to 95% confidence intervals

for the entire image set. It is evident from these plots, that the higher the value of N

considered in the embedding model, the higher the watermarking robustness. With the

[Figure 6.9 panels: a) |C′ − C|, N = 0; b) |C′ − C|, N = 5; c) |C′ − C|, N = 0, p = 5; d) |C′ − C|, N = 5, p = 5.]

Figure 6.9: Non-blind model evaluation. a) and b) represent the difference images |C′ − C| for the embedding model with N = 0 and N = 5, respectively. c) and d) show the corresponding difference images |C′ − C| at the decoder after discarding p = 5 bit planes.

higher order models, coefficients for embedding watermark data are chosen accurately according to their ability to retain the correct watermark data under compression. Using the non-blind model at N = 6, the robustness improves by more than 30% over the original algorithm [17], i.e., without the model (N = 0), at various compression ratios.

6.5.2 Evaluation of the model for blind watermarking

Here, the re-quantization based blind watermarking schemes discussed in Chapter 3 and Chapter 4 are evaluated. As an example, the algorithm presented in [5] is used as the control algorithm to verify and evaluate the model proposed for blind watermarking algorithms. In this model, unlike the non-blind one, the number of coefficients used for embedding is constant for any value of N, as no reference image is available at the decoder. Due to the nature of the model, for a given N, not all selected coefficients may satisfy the conditions in Eq. (6.41), particularly when m is small and cannot provide any suitable value to embed 0 or 1. This situation is possible when C1 and C2 are very close to each other. Therefore, the algorithmic implementation first attempts

[Figure 6.10 plots: Hamming distance versus compression ratio. Panel titles: "Robustness to JPEG 2000 quality scalability: image 1", "Robustness to JPEG 2000 quality scalability: image 2" and "Robustness to JPEG 2000 quality scalability: image set".]

Figure 6.10: Non-blind model evaluation: Robustness performance against JPEG 2000 quality scalability for the embedding models that consider N = 1, 3, 5 (Column 1) and N = 2, 4, 6 (Column 2) bit planes to be discarded. N = 0 corresponds to the algorithm without the model. Row 1: Image 1; Row 2: Image 2; and Row 3: the entire image set.

to calculate and modify the coefficient C2 according to the given N using Eq. (6.41). If the required m is unavailable, it attempts to calculate and modify C2 considering N − 1. This process continues until a suitable m is available, subject to the condition N > 0. Here, the experimental set up includes the 9/7 wavelet, 3 levels of decomposition and embedding within the low-low (LL) frequency subband using γ = 0.02 and considering

[Figure 6.11 plot, "Embedding performance of blind model: Image 3 and Image 4": PSNR (dB) versus N, for Image 3 (watermark count = 2112) and Image 4 (watermark count = 2048).]

Figure 6.11: Embedding performance of the model for blind watermarking considering different values of N at embedding for Image 3 and Image 4.

different values for N, i.e., the maximum number of bit planes that can be discarded without affecting the robustness. The case N = 0 corresponds to the control algorithm that does not use the model. Again for completeness, the embedding performance is first shown in Figure 6.11 for different values of N. In this type of blind watermarking, the modification value often increases with higher N values while the watermark bit count stays the same. As a result, more embedding distortion is introduced, as shown in Figure 6.11. Hence the optimal N can be decided depending on the tolerable PSNR value for a given application.
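The fallback over N described earlier in this subsection, where a coefficient that admits no suitable m for the given N is retried with N − 1, can be sketched as a simple loop. The solver for Eq. (6.41) is not reproduced here: `find_value` is a hypothetical stand-in that returns a modified coefficient value or `None`, and returning N = 0 when nothing is found is an assumption about how the implementation falls back.

```python
def embed_with_fallback(n_max, find_value):
    """Try N = n_max, n_max - 1, ..., 1 and return the first (N, value) that works.

    find_value(n) is a hypothetical stand-in for solving Eq. (6.41): it returns
    a modified coefficient value for the given n, or None if no suitable m exists.
    """
    for n in range(n_max, 0, -1):   # the search only covers N > 0
        value = find_value(n)
        if value is not None:
            return n, value
    return 0, None                  # assumption: no model-based value was found


# Toy stand-in: pretend only N <= 3 admits a suitable value for this coefficient.
def toy_find_value(n):
    return 176 if n <= 3 else None


RESULT = embed_with_fallback(5, toy_find_value)
```

In this toy run the requested N = 5 and then N = 4 fail, so the coefficient is embedded with N = 3.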

6.5.2.1 Simulations with bit plane discarding

Figure 6.12 shows the robustness performance for various compression steps achieved by a quantization factor Q = 2^p, where p is the number of bit planes actually being discarded, for each embedding model value of N. The first two graphs correspond to Image 3 and Image 4 from the test image set, respectively. The third graph shows the average performance, with error bars corresponding to 95% confidence intervals, for the entire image set. The simulations verify the enhanced robustness performance of the proposed model. It is also evident that the robustness remains high whenever the number of bit planes being discarded satisfies p ≤ N.

[Figure 6.12 plots: Hamming distance for Image 3, Image 4 and the entire image set. Panel title: "Robustness to bit plane discarding: image set".]

Figure 6.12: Blind model evaluation: Robustness performance against discarding of p bit planes for the embedding models that consider N = 0, 3, 4, 5 bit planes to be discarded. N = 0 corresponds to the algorithm without the model. Row 1, Column 1: Image 3; Row 1, Column 2: Image 4; and Row 2: the entire image set.

6.5.2.2 Experiments with JPEG 2000 quality scalability

Similarly, Figure 6.13 shows the robustness performance of the model for actual JPEG 2000 quality scalability-based content adaptations for Image 3, Image 4, and the entire image set (Figure 4.7). The plots show the robustness at different compression ratio points for different embedding model values of N and compare it with the case N = 0, where the model is not used. It is evident from these plots that high values of N considered in the embedding model lead to enhanced robustness of the blind watermarking scheme. As with the non-blind model, using the blind model at N = 5 improves the robustness by more than 15% over the original algorithm [5] at various compression ratios.

[Figure 6.13 plots: Hamming distance versus compression ratio. Panel title: "Robustness to JPEG 2000 quality scalability: image set".]

Figure 6.13: Blind model evaluation: Robustness performance against JPEG 2000 quality scalability for the embedding models that consider N = 0, 3, 4, 5 bit planes to be discarded. N = 0 corresponds to the algorithm without the model. Row 1, Column 1: Image 3; Row 1, Column 2: Image 4; and Row 2: the entire image set.

6.6 Conclusions

In this chapter, models have been presented for enhancing the robustness of non-blind and blind watermarking algorithms against quality scalability-based content adaptation. The proposed model for non-blind watermarking specifies the range of coefficient magnitudes from which the embedded watermark bit can be correctly extracted under compression, by considering wavelet domain bit plane discarding, and ranks the coefficients accordingly. Similarly, for blind algorithms, the proposed model specifies the range of magnitudes for the modified coefficient that allows the watermark data to be extracted under compression. The simulations show that the proposed models outperform the existing watermarking methods, in which the models are not used, in terms of robustness. The proposed models result in scalable robustness performance, where the robustness remains high when any number of bit planes p ≤ N is discarded. The high robustness of the models was also experimentally verified for JPEG 2000 quality scalability.


Chapter 7

Motion Compensated Video Watermarking Techniques

So far in this thesis, we have discussed, analyzed and proposed models to enhance the watermarking robustness for images and evaluated their performance against scalable compression, i.e., JPEG 2000. In this chapter the research findings are extended to the video watermarking scenario. In line with the thesis objectives, this chapter focuses on robust watermarking techniques for scalable coded video compression, such as Motion JPEG 2000, wavelet based MC-EZBC [33] and the more recent H.264/SVC. In order to account for the motion information, Motion Compensated Temporal Filtering (MCTF) based video watermarking algorithms are proposed here. Firstly, the generalized MCTF based video watermarking schemes are proposed and evaluated; then the robustness models derived in Chapter 6 are applied within the MCTF framework for enhanced robustness in video watermarking.

7.1 Introduction

MCTF has been successfully used in wavelet based scalable video coding research [33, 109]. The idea of MCTF evolved from 3D subband wavelet decomposition, which is merely an extension of the spatial domain transform into the temporal domain [110]. However, 3D wavelet decomposition alone does not decouple motion information; this is addressed by applying temporal filtering along the motion trajectories. This MCTF based video decomposition technique opens a new avenue in transform domain video watermarking.

Video watermarking schemes are often developed by extending image watermarking algorithms. Transform domain, and especially wavelet based, image watermarking has been very successful in terms of imperceptibility as well as robustness against various image processing attacks. Consequently, several attempts have been made to extend these image watermarking algorithms to video watermarking by using them either on a frame-by-frame basis [27–30] or based on 3D wavelet decompositions [9, 31, 32].

Frame-by-frame video watermarking embeds on selected frames located at fixed intervals to make the watermark robust against frame dropping based temporal adaptations of video. In this case each frame is treated separately as an individual image, hence any image watermarking algorithm can be adopted to achieve the intended robustness. However, frame-by-frame watermarking schemes often perform poorly in terms of robustness against various video processing attacks, including temporal desynchronization, video collusion and video compression attacks. In order to address these issues, the temporal dimension of the video is exploited using spread spectrum domain, i.e., DCT, and more recently wavelet based 3D decomposition of the host video. In 3D wavelet-based watermarking approaches [9, 31, 32], video is decomposed into 3D subbands by using a separable 3D wavelet transform with shorter mother wavelets, such as Haar. Unfortunately, such naive subband decomposition-based embedding strategies, which do not consider the motion element of the sequence when embedding the watermark, often result in unpleasant flickering visual artifacts. The amount of flickering in watermarked sequences varies according to the texture, color and motion characteristics of the video content as well as the watermark strength and the choice of frequency subband used for watermark embedding. At the same time, these schemes are also fragile to video compression attacks which consider the motion trajectory during compression coding.

The aim of this chapter is to address the consideration of motion and texture characteristics of the video sequence when extending image watermarking techniques to video. The proposed approach evolved from the MCTF based wavelet domain video decomposition concept, as outlined at the beginning of the chapter. A few attempts have already been made to incorporate motion compensation into video watermarking [57, 79, 80]. In these investigations the sequence is first temporally decomposed into Haar wavelet subbands using MCTF and then spatially decomposed using the 2D DCT transform, resulting in the decomposition scheme widely known as t+2D. Here we aim to advance further by investigating along the line of MCTF based wavelet coding to improve the robustness while maintaining the imperceptibility, or vice versa. The apparent problems with the direct use of MCTF and t+2D decompositions in watermarking are three-fold, and alternative solutions are offered to address them.

1. In scalable video coding research it has been evident that videos with different texture and motion characteristics, which determine their spatial and temporal features, perform differently in the t+2D domain [33] and in its alternative, the 2D+t domain [111], where MCTF is performed in the 2D wavelet decomposition domain. Further, in 3D subband decomposition for video watermarking, the consideration of motion, and thus the use of MCTF, is only required for the subbands in which the watermarks are embedded. Therefore fixed architectures, such as t+2D or 2D+t, add unnecessary complexity to the watermarking algorithm in terms of motion estimation and compensation.

2. Conventional MCTF is focused on achieving higher compression and thus gives more attention to the prediction lifting step in MCTF. However, for watermarking it is necessary to follow the motion trajectory of the content into the low frequency temporal subband frames, in order to avoid motion mismatch in the update step of MCTF when these frames are modified due to watermark embedding.

3. The t+2D structure offers better energy compaction in the low frequency temporal subband, while keeping the majority of coefficient values very small or nearly zero in the high frequency temporal subbands. This is very useful during compression but leaves very little room for watermark embedding in the high frequency temporal subbands. Therefore, for robustness, most of the MCTF domain watermarking schemes mentioned before embed the watermark in the low-pass temporal frames. On the other hand, 2D+t provides more energy in the high frequency subbands, which makes it possible to embed and recover the watermark robustly using high-pass temporal frames, which in turn improves the overall imperceptibility of the watermarked video.

To overcome these shortcomings, MCTF based 3D wavelet decomposition schemes are proposed for video sequences, and a flexible 2D+t+2D generalized motion compensated temporal-spatial subband decomposition scheme is offered using a modified motion compensated temporal filtering (MMCTF) scheme for video watermarking. Using the proposed decompositions within the framework, we study and analyze the merits and demerits of watermark embedding using various combinations of the 2D+t+2D structure, and propose new video watermarking schemes to improve the imperceptibility and the robustness against scalable coded video attacks, such as Motion JPEG 2000, MC-EZBC and H.264/SVC. The issues related to motion estimation from the watermarked video without any prior knowledge of the original motion information are also addressed for the case of a blind watermarking method.

7.2 Motion compensated 2D+t+2D filtering

The generalised spatio-temporal decomposition scheme consists of two modules: 1)

MCTF and 2) 2D spatial frequency decomposition. To capture the motion information

accurately, the commonly used lifting based MCTF is modified by tracking inter-frame

pixel connectivity and the 2D wavelet transform is used for spatial decomposition. In

this section the modified motion compensated temporal filtering (MMCTF) is described

first and then the 2D+t+2D general framework is proposed based on MMCTF.

7.2.1 MMCTF

The MMCTF scheme is formulated by placing more emphasis on the motion trajectory-based update step, as follows. Let $I_t$ be the video sequence, where t is the time index in display order. We consider two consecutive frames $I_{2t}$ and $I_{2t+1}$ as the current frame (c) and the reference frame (r), respectively, following the video coding terminology. The $I_{2t}$ frame is partitioned into non-overlapping blocks and, for each block, vertical and horizontal displacements are quantified and represented as motion vector fields $V_{c \to r}$ and $H_{c \to r}$, respectively. In the $I_{2t}$ frame, each block can be one of two types, namely inter and intra blocks, where motion is estimated only for the former. Similarly, any pixel in the $I_{2t+1}$ frame can be one of three types, namely one-to-one connected, one-to-many connected and unconnected (as shown in Figure 7.1), depending on its connectivity to pixels in the $I_{2t}$ frame following the implied motion vector fields $V_{c \leftarrow r}$ and $H_{c \leftarrow r}$, which are simply the directional inverses of the original motion vector fields $V_{c \to r}$ and $H_{c \to r}$.

Considering these block and pixel classifications, the lifting steps for pixels at positions [m, n] in frames $I_{2t}$ and $I_{2t+1}$ (i.e., $I_{2t}[m,n]$ and $I_{2t+1}[m,n]$) performing the temporal motion compensated Haar wavelet transform are defined as follows:

[Figure 7.1 diagram: pixel connections between frames I2t and I2t+1, labelled unconnected, one-to-one connected and one-to-many connected.]

Figure 7.1: Pixel connectivity in I2t and I2t+1 frames.

The prediction step:

For one-to-one connected pixels:

$$I'_{2t+1}[m,n] = I_{2t+1}[m,n] - I_{2t}[m + H_{c \to r},\, n + V_{c \to r}]. \tag{7.1}$$

For one-to-many connected pixels:

$$I'_{2t+1}[m,n] = I_{2t+1}[m,n] - \frac{1}{J} \sum_{i=0}^{J-1} I_{2t}[m + H^{i}_{c \to r},\, n + V^{i}_{c \to r}], \tag{7.2}$$

where J is the total number of connections. For unconnected pixels:

$$I'_{2t+1}[m,n] = I_{2t+1}[m,n]. \tag{7.3}$$

The above case is similar to the no prediction case as in intra blocks used in conventional MCTF.

The update step:

For inter blocks: Every pixel in an inter block is one-to-one connected with a unique pixel in $I_{2t+1}$. Then the update step is computed as

$$I'_{2t}[m,n] = I_{2t}[m,n] + \frac{1}{2} I'_{2t+1}[m - H_{c \leftarrow r},\, n - V_{c \leftarrow r}]. \tag{7.4}$$

For intra blocks: As there are no motion compensated connections with $I_{2t+1}$,

$$I'_{2t}[m,n] = I_{2t}[m,n]. \tag{7.5}$$

Finally these lifting steps are followed by the normalization step.

$$I''_{2t}[m,n] = \sqrt{2}\, I'_{2t}[m,n], \tag{7.6}$$

$$I''_{2t+1}[m,n] = \frac{1}{\sqrt{2}}\, I'_{2t+1}[m,n]. \tag{7.7}$$

The temporally decomposed frames $I''_{2t}$ and $I''_{2t+1}$ are the first level low and high pass frames and are denoted as the L and H temporal subbands. These steps are repeated for all frames in L to obtain the LL and LH subbands, and continued to obtain the desired number of temporal decomposition levels. For the inverse transform, the order of the steps is reversed and each lifting equation is rearranged so that its first operand becomes the subject.
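The lifting structure and its inverse can be sketched for the simplest case. The sketch below assumes zero motion, so that every pixel of the odd frame is one-to-one connected to the co-located pixel of the even frame and every block is an inter block. It therefore illustrates only the one-to-one prediction, the inter-block update and the normalization, not the full MMCTF with motion estimation. Frames are plain lists of rows and the function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)


def forward_pair(frame_even, frame_odd):
    """One temporal Haar lifting level on a frame pair; returns (L, H).

    Assumption: zero motion, so pixel [m][n] of the odd frame is one-to-one
    connected to pixel [m][n] of the even frame.
    """
    low, high = [], []
    for row_e, row_o in zip(frame_even, frame_odd):
        h_row = [o - e for e, o in zip(row_e, row_o)]        # prediction step
        l_row = [e + 0.5 * h for e, h in zip(row_e, h_row)]  # update step
        low.append([SQRT2 * v for v in l_row])               # normalize low pass
        high.append([v / SQRT2 for v in h_row])              # normalize high pass
    return low, high


def inverse_pair(low, high):
    """Invert forward_pair: undo normalization, then update, then prediction."""
    frame_even, frame_odd = [], []
    for l_row, h_row in zip(low, high):
        h_un = [v * SQRT2 for v in h_row]
        l_un = [v / SQRT2 for v in l_row]
        e_row = [l - 0.5 * h for l, h in zip(l_un, h_un)]    # undo update
        o_row = [h + e for h, e in zip(h_un, e_row)]         # undo prediction
        frame_even.append(e_row)
        frame_odd.append(o_row)
    return frame_even, frame_odd
```

Because each lifting step is undone exactly in reverse order, the pair is reconstructed up to floating point rounding, which is the property that allows watermarked subband frames to be transformed back into video frames.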

7.2.2 2D+t+2D framework

As discussed earlier in Section 7.1, in a 3D video decomposition scheme, t+2D is achieved by performing temporal decomposition followed by a spatial transform, whereas in the case of 2D+t, the temporal filtering is done after the spatial 2D transform. Since each has its own merits and demerits, both combinations need to be analysed in order to enhance the video watermarking performance. A common, flexible and reconfigurable framework that allows such combinations to be created is particularly useful for applications like video watermarking. Here the 2D+t+2D framework is proposed by combining the modified motion compensated temporal filtering with the spatial 2D wavelet transform.

Let (s1 t s2) be the number of decomposition levels used in the 2D+t+2D subband decomposition to obtain a 3D subband decomposition with t motion compensated temporal levels and s spatial levels, where s = s1 + s2. In such a scheme, first the 2D Discrete Wavelet Transform (DWT) is applied for an s1 level decomposition. As a result, a new sequence is formed by the low frequency spatial LL subbands of all frames. Then this sequence of spatial LL subbands is temporally decomposed using the MMCTF into t temporal levels. Finally, each of the temporally transformed spatial LL subbands is further spatially decomposed into s2 wavelet levels.

For a t-s motion compensated temporal subband decomposition, the values of s1 and s2 are determined by the choice of temporal-spatial subbands used for watermark embedding. For example, the (032) and (230) parameter combinations result in t+2D and 2D+t motion compensated 3D subband decompositions, respectively. The same number of subband decomposition levels can also be obtained by using the parameter combination (131) in the proposed generalized scheme. The combination (002) allows 2D decomposition of all frames for frame-by-frame watermark embedding. The realizations of these examples are shown in Figure 7.2, Figure 7.3, Figure 7.4 and Figure 7.5. We use the notation (LLL, LLH, LH, H) to denote the temporal subbands after a 3 level decomposition. The use of this framework in combination with watermarking algorithms is described in the next section.

[Figure 7.2 diagram, (032): the video sequence (frames 0 to 7) is temporally decomposed over the 1st, 2nd and 3rd temporal levels into the subbands H, LH, LLH and LLL.]

Figure 7.2: Realization of 3-2 temporal schemes using the 2D+t+2D framework with different parameters: (032).
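The way an (s1 t s2) parameter combination selects the order of the processing stages can be made concrete with a short sketch. It only enumerates stages and subband names and performs no filtering; the function names and stage labels are illustrative.

```python
def decomposition_plan(s1, t, s2):
    """List the processing stages for an (s1 t s2) configuration, in order."""
    if min(s1, t, s2) < 0:
        raise ValueError("levels must be non-negative")
    plan = []
    if s1 > 0:
        plan.append(("spatial 2D-DWT", s1))   # first spatial stage
    if t > 0:
        plan.append(("temporal MMCTF", t))    # motion compensated temporal stage
    if s2 > 0:
        plan.append(("spatial 2D-DWT", s2))   # second spatial stage
    return plan


def temporal_subbands(t):
    """Temporal subband names after t levels, e.g. t = 3 gives LLL, LLH, LH, H."""
    if t == 0:
        return []
    return ["L" * t] + ["L" * k + "H" for k in range(t - 1, -1, -1)]
```

For example, `decomposition_plan(0, 3, 2)` lists the temporal stage first (t+2D), `decomposition_plan(2, 3, 0)` lists the spatial stage first (2D+t), and `decomposition_plan(0, 0, 2)` contains no temporal stage, matching the frame-by-frame case.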

7.3 Video watermarking in 2D+t+2D spatio-temporal decomposition

We propose a new video watermarking scheme by extending the wavelet based image watermarking algorithms into the 2D+t+2D framework. At this point we recall the generalized wavelet based image watermarking schemes described in Chapter 3 and Chapter 4. In this section those watermarking algorithms are extended into the MCTF based framework to propose new video watermarking schemes. Then various combinations in the proposed video decomposition framework are analyzed to decide on embedding parameters that are unique to video, such as 1) the choice of temporal subband and 2) the motion estimation parameters used to retrieve the motion information from the watermarked video.

[Figure diagram, (230): the video sequence (frames 0 to 7) is temporally decomposed over the 1st, 2nd and 3rd temporal levels into the subbands H, LH, LLH and LLL, with the temporal decomposition applied only on the LL subband.]

[Figure diagram, (131): the video sequence (frames 0 to 7) is temporally decomposed over the 1st, 2nd and 3rd temporal levels into the subbands H, LH, LLH and LLL, with the temporal decomposition applied only on the LL subband.]


7.3.1 Proposed video watermarking scheme

The new video watermarking scheme uses the image watermarking algorithms on the spatio-temporally decomposed video. The system block diagrams for watermark embedding, non-blind extraction and blind extraction are shown in Figure 7.6, Figure 7.7 and Figure 7.8, respectively.

[Figure 7.5 diagram, (002): the frames of the video sequence (0 to 7) are kept as they are; no temporal decomposition.]

Figure 7.5: Realization of spatial 2D frame-by-frame scheme using the 2D+t+2D framework with different parameters: (002).

7.3.1.1 Embedding

To embed the watermark, a spatio-temporal decomposition is first performed on the host video sequence, by applying the spatial 2D-DWT followed by the temporal MMCTF for 2D+t (230), or the temporal decomposition followed by the spatial transform for t+2D (032). In both cases, motion estimation (ME) is performed to create the motion vectors (MV), either in the spatial domain (t+2D) or in the frequency domain (2D+t), as described in Section 7.2.2. Other combinations, such as (131) and (002), are achieved in a similar fashion. After obtaining the decomposed coefficients, the watermark is embedded using either magnitude alteration (Eq. (3.2)) or a re-quantisation based modification algorithm (Eq. (3.3)), by selecting various temporal low or high pass frames (i.e., LLL or LLH etc.) and a spatial subband within the selected frame. Once embedded, the coefficients follow the inverse process of the spatio-temporal decomposition in order to reconstruct the watermarked video.

7.3.1.2 Extraction and authentication

The extraction procedure follows the same decomposition scheme as the embedding; the system diagrams are shown in Figure 7.7 and Figure 7.8. The watermark coefficients are retrieved by applying the 2D+t+2D decomposition to the watermarked test video. The retrieval of the motion information needs specific mention here. For a non-blind algorithm, the original video sequence is available at the decoder and hence the motion vectors are obtained from the original video. After spatio-temporal filtering of the test and original videos, the coefficients are compared to extract the watermark. In a blind watermarking scheme, on the other hand, motion estimation is performed on the test video itself without any prior knowledge of the original motion information. The temporal filtering is then done using the newly estimated motion vectors, and the resulting spatio-temporal coefficients are used for detection. The authentication is then done by measuring the Hamming distance (H) between the original and the extracted watermark using Eq. (4.1).
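The authentication measure can be illustrated with a short sketch. It assumes that Eq. (4.1) is the normalised Hamming distance, i.e., the fraction of watermark bits that differ between the original and the extracted watermark; this assumption is consistent with the values between 0 and 0.5 reported later in this chapter, but the equation itself is not reproduced here. The function name is hypothetical.

```python
def hamming_distance(original_bits, extracted_bits):
    """Fraction of positions at which two equal-length bit sequences differ."""
    if len(original_bits) != len(extracted_bits):
        raise ValueError("watermarks must have the same length")
    if not original_bits:
        raise ValueError("watermarks must not be empty")
    differing = sum(a ^ b for a, b in zip(original_bits, extracted_bits))
    return differing / len(original_bits)


original = [1, 0, 1, 1, 0, 0, 1, 0]
extracted = [1, 0, 0, 1, 0, 1, 1, 0]  # two bits flipped by an attack
h = hamming_distance(original, extracted)
print(h)  # 0.25
```

Under this definition a value of 0 means perfect extraction, while a value near 0.5 means the extracted watermark is no better than a random guess.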

Figure 7.6: System blocks for watermark embedding scheme in 2D+t+2D spatio-temporal decomposition. Block labels: Video Sequence, Spatial 2D-DWT, ME, MV, MMCTF, Spatial 2D-DWT, Watermark, Embedding Algorithm, Spatial inverse 2D-DWT, Inverse MMCTF, Spatial inverse 2D-DWT, Watermarked Sequence.


Figure 7.7: System blocks for non-blind watermark extraction scheme in 2D+t+2D spatio-temporal decomposition. Block labels: Spatial 2D-DWT, MMCTF, Spatial 2D-DWT, Original Watermark, Authentication, Original Video, Spatial 2D-DWT, MMCTF, ME, Spatial 2D-DWT, MV, MV from original sequence.


Figure 7.8: System blocks for blind watermark extraction scheme in 2D+t+2D spatio-temporal decomposition. Block labels: Spatial 2D-DWT, MMCTF, ME, MV, Spatial 2D-DWT, Original Watermark, Authentication.

7.3.2 The framework analysis in video watermarking context

Before presenting the experimental results, this subsection addresses the issues related to MMCTF based video watermarking in the proposed framework. Firstly, to improve the imperceptibility, the energy distribution of the host video in the different temporal subbands is investigated, which is useful for selecting the temporally decomposed frames during embedding. Then, an insight is given into motion retrieval for a blind watermarking scheme, where no prior motion information is available during watermark extraction; this is crucial for the robustness performance.

7.3.2.1 On improving imperceptibility

In wavelet domain watermarking research, it is a well-known fact that embedding in high frequency subbands offers better imperceptibility, whereas low frequency embedding provides better robustness. Wavelet decompositions often compact most of the energy into the low frequency subbands and leave less energy in the high frequencies; for this reason, high frequency watermarking schemes are less robust to compression. Therefore, an increase in the energy distribution in the high frequency subbands can offer a better watermarking algorithm.

In analyzing the framework, the research findings show that different 2D+t+2D combinations can vary the energy distribution in the high frequency temporal subbands, and that this is independent of the video content. As an example, the Foreman and Crew sequences are decomposed using the 032, 131, 230 and 002 combinations in the framework, and the sum of energy is calculated for the first two GOPs, each with 8 temporal frequency frames, namely LLL, LLH, LH1, LH2, H1, H2, H3 and H4. In all cases the energy is calculated for the low frequency (LLs) subband of the spatial decomposition. The other input parameters are set to an 8 × 8 macro block and fixed size block matching (FSBM) motion estimation with a ±16 search window. The results are shown in Table 7.1 and Table 7.2, and the histograms of the coefficients for 032, 131 and 230 of LLL and LLH are shown in Figure 7.9 and Figure 7.10 for the Foreman and Crew sequences, respectively. The inner graphs in Figure 7.9 and Figure 7.10 are zoomed versions that clip the y-axis to show the local variations of the coefficient distribution more effectively. From the results, the energy distribution in the high frequency temporal subbands can be ranked as: (002) > (230) > (131) > (032). This analysis guides the selection of the optimum spatio-temporal parameters in the framework to improve the robustness while keeping better imperceptibility.

Table 7.1: Sum of energy of coefficients at LLs for the first two GOPs, each with 8 temporal low and high frequency frames, of the Foreman sequence.

Temporal   Sum of energy (GOP 1)
frames     032           131           230           002
LLL        1.82 × 10^8   1.82 × 10^8   1.82 × 10^8   1.83 × 10^8
LLH        4.66 × 10^7   6.45 × 10^7   8.26 × 10^7   1.82 × 10^8
LH1        3.68 × 10^7   5.54 × 10^7   7.54 × 10^7   1.82 × 10^8
LH2        3.03 × 10^7   4.32 × 10^7   6.55 × 10^7   1.83 × 10^8
H1         3.15 × 10^7   4.32 × 10^7   5.58 × 10^7   1.82 × 10^8
H2         2.51 × 10^7   3.69 × 10^7   5.17 × 10^7   1.82 × 10^8
H3         2.85 × 10^7   3.84 × 10^7   5.69 × 10^7   1.83 × 10^8
H4         3.48 × 10^7   4.89 × 10^7   5.47 × 10^7   1.83 × 10^8

Temporal   Sum of energy (GOP 2)
frames     032           131           230           002
LLL        1.84 × 10^8   1.84 × 10^8   1.84 × 10^8   1.84 × 10^8
LLH        4.93 × 10^7   6.34 × 10^7   9.08 × 10^7   1.85 × 10^8
LH1        3.32 × 10^7   4.79 × 10^7   8.65 × 10^7   1.85 × 10^8
LH2        4.01 × 10^7   7.06 × 10^7   1.06 × 10^8   1.85 × 10^8
H1         2.51 × 10^7   5.08 × 10^7   6.49 × 10^7   1.84 × 10^8
H2         2.82 × 10^7   5.45 × 10^7   6.50 × 10^7   1.85 × 10^8
H3         3.78 × 10^7   5.62 × 10^7   7.62 × 10^7   1.85 × 10^8
H4         3.80 × 10^7   4.34 × 10^7   8.18 × 10^7   1.85 × 10^8
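The energy measurement over the eight temporal frames of a GOP can be sketched as follows. This is a simplified illustration under stated assumptions: the temporal filter is a plain, unnormalised Haar average/difference pair without motion compensation (so it is not the MMCTF itself), the spatial 2D-DWT is omitted, and each frame is a flat list of values. The function names and the toy data are hypothetical.

```python
def haar_pair(frame_a, frame_b):
    """Unnormalised temporal Haar step on two frames given as flat value lists."""
    low = [(a + b) / 2.0 for a, b in zip(frame_a, frame_b)]
    high = [a - b for a, b in zip(frame_a, frame_b)]
    return low, high


def temporal_decompose(gop):
    """Three-level dyadic temporal decomposition of an 8-frame GOP.

    Assumption: no motion compensation is applied.
    Returns the eight temporal frames keyed by the names used in the text.
    """
    if len(gop) != 8:
        raise ValueError("this sketch expects a GOP of exactly 8 frames")
    names_per_level = [["H1", "H2", "H3", "H4"], ["LH1", "LH2"], ["LLH"]]
    subbands = {}
    current = gop
    for names in names_per_level:
        lows = []
        for k, name in enumerate(names):
            low, high = haar_pair(current[2 * k], current[2 * k + 1])
            lows.append(low)
            subbands[name] = high
        current = lows
    subbands["LLL"] = current[0]
    return subbands


def energy(frame):
    """Sum of squared coefficients of one temporal frame."""
    return sum(c * c for c in frame)


# Toy GOP: six identical 4-value frames, then two frames with one changed value.
base = [10.0, 20.0, 30.0, 40.0]
changed = [12.0, 20.0, 30.0, 40.0]
gop = [list(base) for _ in range(6)] + [list(changed), list(changed)]

energies = {name: energy(f) for name, f in temporal_decompose(gop).items()}
for name in ["LLL", "LLH", "LH1", "LH2", "H1", "H2", "H3", "H4"]:
    print(name, energies[name])
```

Even in this toy case almost all of the energy ends up in LLL and very little in the high pass frames, which mirrors the compaction seen for 032 in the tables. With no temporal decomposition at all (002), every frame keeps its full energy.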

Table 7.2: Sum of energy of coefficients at LLs for the first two GOPs, each with 8 temporal low and high frequency frames, of the Crew sequence.

Temporal   Sum of energy (GOP 1)
frames     032           131           230           002
LLL        6.47 × 10^7   6.46 × 10^7   6.45 × 10^7   6.54 × 10^7
LLH        2.70 × 10^7   2.67 × 10^7   3.86 × 10^7   6.32 × 10^7
LH1        1.04 × 10^7   2.49 × 10^7   3.01 × 10^7   6.57 × 10^7
LH2        6.75 × 10^7   7.20 × 10^7   7.79 × 10^7   8.46 × 10^7
H1         6.44 × 10^7   6.85 × 10^7   7.78 × 10^7   8.33 × 10^7
H2         1.50 × 10^7   1.19 × 10^7   1.88 × 10^7   6.53 × 10^7
H3         1.49 × 10^7   1.45 × 10^7   1.66 × 10^7   6.22 × 10^7
H4         4.38 × 10^7   5.38 × 10^7   5.99 × 10^7   6.24 × 10^7

Temporal   Sum of energy (GOP 2)
frames     032           131           230           002
LLL        6.06 × 10^7   6.04 × 10^7   6.06 × 10^7   6.24 × 10^7
LLH        1.94 × 10^7   2.23 × 10^7   2.40 × 10^7   6.00 × 10^7
LH1        1.79 × 10^7   1.67 × 10^7   2.34 × 10^7   5.97 × 10^7
LH2        3.62 × 10^7   3.70 × 10^7   4.24 × 10^7   6.60 × 10^7
H1         1.39 × 10^7   1.38 × 10^7   1.67 × 10^7   6.04 × 10^7
H2         1.13 × 10^7   1.05 × 10^7   1.50 × 10^7   5.97 × 10^7
H3         1.36 × 10^7   1.37 × 10^7   1.39 × 10^7   6.10 × 10^7
H4         2.86 × 10^7   3.09 × 10^7   3.91 × 10^7   6.80 × 10^7

7.3.2.2 On motion retrieval

In an MCTF based video watermarking scheme, the motion information contributes greatly to the temporal decomposition along the motion trajectory. The modification made in the temporal domain by watermark embedding causes a motion mismatch, which affects the decoder performance. While the original motion information is available for a non-blind watermarking scheme, motion estimation must be carried out in the case of a blind video watermarking scheme. In this case, the motion vectors have to be retrieved from the watermarked video without any prior knowledge of the original motion vectors (MV). Our study shows that, in such a case, a more accurate motion estimation is possible by choosing the right 2D+t+2D combination along with an optimum choice of macro block (MB) size. We also investigate the performance as a function of the motion search range (SR) and find that the SR contributes less towards motion retrieval.

The experiment studies the watermark detection performance, measured by the Hamming distance, of a blind watermark embedded in the LLs spatial subband of the LLL and LLH temporal frames. The watermark extraction is done using various combinations of MB and SR to find the best motion retrieval parameters.

Figure 7.9: Histogram of coefficients at LLs for 3rd level temporal low and high frequency frames (GOP 1) for Foreman sequence. Columns 1) and 2) represent the LLL and LLH temporal frames, respectively, and rows 1), 2) and 3) show the 032, 131 and 230 combinations of the 2D+t+2D framework.

The results are shown in Table 7.3 and Table 7.4, using the average of the first 64 frames of the Foreman and Crew CIF size video sequences, respectively, for the 032, 131 and 230 spatio-temporal decompositions. The motion is estimated using a fixed size block motion algorithm. Due to the limitations in macro block size and integer pixel motion search, the 32 × 32 MB search is excluded for the 131 decomposition, and the 32 × 32 and 16 × 16 MB searches are excluded for the 230 decomposition.
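For readers unfamiliar with fixed size block matching, a minimal full-search sketch with integer pixel vectors is given below. It is an illustration only, not the implementation used for the experiments: the sum of absolute differences (SAD) cost, the tie-breaking rule, the handling of frame borders and all names are assumptions. The parameters `mb` and `sr` correspond to the macro block size (MB) and search range (SR) discussed above.

```python
import random


def sad(cur, ref, bx, by, dx, dy, mb):
    """Sum of absolute differences between a block of `cur` and a displaced block of `ref`."""
    total = 0
    for y in range(mb):
        for x in range(mb):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def block_match(cur, ref, mb=8, sr=16):
    """Full-search block matching with integer-pixel vectors.

    Returns {(bx, by): (dx, dy)}, where (dx, dy) points from the block in
    `cur` to its best match in `ref`. Candidates leaving the frame are skipped.
    """
    height, width = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, height - mb + 1, mb):
        for bx in range(0, width - mb + 1, mb):
            best_key, best_mv = None, (0, 0)
            for dy in range(-sr, sr + 1):
                for dx in range(-sr, sr + 1):
                    inside = (0 <= by + dy and by + dy + mb <= height
                              and 0 <= bx + dx and bx + dx + mb <= width)
                    if not inside:
                        continue
                    # Ties on SAD are broken in favour of the shorter vector.
                    key = (sad(cur, ref, bx, by, dx, dy, mb), abs(dx) + abs(dy))
                    if best_key is None or key < best_key:
                        best_key, best_mv = key, (dx, dy)
            vectors[(bx, by)] = best_mv
    return vectors


# Toy check: the current frame is the reference displaced by (dx, dy) = (2, 1).
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(16)] for _ in range(16)]
cur = [[ref[(y + 1) % 16][(x + 2) % 16] for x in range(16)] for y in range(16)]
mv = block_match(cur, ref, mb=4, sr=2)
print(mv[(4, 4)])
```

For the interior block at (4, 4) the search should recover the displacement (2, 1). In a blind scheme the same search runs on the watermarked video, so any change that embedding makes to the block contents can shift the minimum of the cost and produce the motion mismatch discussed above.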

Figure 7.10: Histogram of coefficients at LLs for 3rd level temporal low and high frequency frames (GOP 1) for Crew sequence. Columns 1) and 2) represent the LLL and LLH temporal frames, respectively, and rows 1), 2) and 3) show the 032, 131 and 230 combinations of the 2D+t+2D framework.

The results show that for an MB size of 8 × 8 or more, 2D+t outperforms t+2D. In this context the spatio-temporal decompositions can be ranked as (230) > (131) > (032). In the case of 131 or 230, the motion is estimated in a hierarchically downsampled low frequency subband. The number of motion vectors therefore reduces accordingly for a given macro block size. As a result, for blind motion estimation, fewer motion vectors need to be estimated at the decoder, resulting in more accurate motion estimation and better robustness. It is evident from Table 7.3 and Table 7.4 that if the same number of motion vectors is considered, i.e., 32 × 32 MB for 032, 16 × 16 MB for 131 and 8 × 8 MB for 230, the robustness performances are comparable for all three combinations. However, in the LLL subband of 2D+t, more motion mismatch is observed for a smaller MB, such as 4 × 4, as the motion estimation is done in a spatially decomposed region.

Table 7.3: Hamming distance for blind watermarking by estimating motion from the watermarked video using different macro block sizes (MB) and search ranges (SR). Embedding at LLs on frame: a) LLL and b) LLH of the Foreman sequence (average of first 64 frames).

(a) LLL
MV from watermarked video: MB/SR
       32×32/±64   16×16/±64   16×16/±32   8×8/±32   8×8/±16   4×4/±16   4×4/±8
032    0.02        0.03        0.02        0.03      0.03      0.04      0.04
131    -           0.02        0.03        0.03      0.03      0.08      0.07
230    -           -           -           0.03      0.03      0.08      0.07

(b) LLH
MV from watermarked video: MB/SR
       32×32/±64   16×16/±64   16×16/±32   8×8/±32   8×8/±16   4×4/±16   4×4/±8
032    0.15        0.29        0.29        0.40      0.39      0.49      0.49
131    -           0.22        0.21        0.29      0.28      0.44      0.44
230    -           -           -           0.23      0.22      0.30      0.30

Using the analysis above, the experiments are now designed to verify the proposed video watermarking schemes for improved imperceptibility as well as robustness against scalable video compression.
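The "same number of motion vectors" comparison made in the motion retrieval analysis can be checked with simple arithmetic. The sketch below assumes a CIF frame of 352 × 288 pixels, assumes that the first digit of a combination such as 230 is the number of spatial levels applied before motion estimation, and assumes that each spatial level halves both dimensions of the low frequency subband. The function name is hypothetical.

```python
CIF_WIDTH, CIF_HEIGHT = 352, 288


def motion_vector_count(spatial_levels_before_me, mb):
    """Number of mb x mb macro blocks in the low frequency subband used for ME.

    Each spatial decomposition level is assumed to halve width and height.
    Only whole macro blocks are counted.
    """
    width = CIF_WIDTH >> spatial_levels_before_me
    height = CIF_HEIGHT >> spatial_levels_before_me
    return (width // mb) * (height // mb)


counts = {
    "032": motion_vector_count(0, 32),
    "131": motion_vector_count(1, 16),
    "230": motion_vector_count(2, 8),
}
print(counts)  # {'032': 99, '131': 99, '230': 99}

# For a fixed 8 x 8 macro block the count shrinks with each spatial level.
fixed_mb = [motion_vector_count(levels, 8) for levels in (0, 1, 2)]
print(fixed_mb)  # [1584, 396, 99]
```

Under these assumptions all three pairings yield 99 motion vectors per frame, whereas a fixed 8 × 8 block gives 1584, 396 and 99 vectors for 032, 131 and 230, respectively. The same arithmetic shows that 32 × 32 blocks do not tile the 176 × 144 subband and that 16 × 16 blocks do not tile the 88 × 72 subband, which is at least consistent with the entries excluded from the tables.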

7.4 Experimental results and discussion

The following experimental setup is used for the simulation of watermark embedding using the proposed generalized 2D+t+2D motion compensated temporal-spatial subband scheme. In order to keep the watermarking strength constant across subbands, the normalization steps in the MCTF and the 2D DWT were omitted.

Two different sets of results are obtained for the luma component of 8 test video sequences (4:2:0 YUV sequences), shown in Figure 7.11, to show the embedding distortion and the robustness performance. One non-blind and one blind watermarking scheme, described in Section 7.3.1, are used as example cases. For the simulations shown in this work, the four combinations (032), (230), (131) and (002) were used. In each case, the watermark embedding is performed on the low frequency subband (LLs) of the 2D spatial decomposition, due to its improved robustness against compression attacks in image watermarking. In these simulations the 9/7 bi-orthogonal wavelet transform was used for the 2D decompositions.

Table 7.4: Hamming distance for blind watermarking by estimating motion from the watermarked video using different macro block sizes (MB) and search ranges (SR). Embedding at LLs on frame: a) LLL and b) LLH of the Crew sequence (average of first 64 frames).

(a) LLL
MV from watermarked video: MB/SR
       32×32/±64   16×16/±64   16×16/±32   8×8/±32   8×8/±16   4×4/±16   4×4/±8
032    0.03        0.06        0.05        0.09      0.09      0.09      0.09
131    -           0.03        0.03        0.07      0.07      0.14      0.13
230    -           -           -           0.03      0.03      0.15      0.12

(b) LLH
MV from watermarked video: MB/SR
       32×32/±64   16×16/±64   16×16/±32   8×8/±32   8×8/±16   4×4/±16   4×4/±8
032    0.17        0.24        0.23        0.36      0.36      0.48      0.47
131    -           0.16        0.16        0.23      0.23      0.41      0.38
230    -           -           -           0.17      0.17      0.28      0.27

Based on the analysis in the previous section, we explore the possibility of watermark embedding in a high frequency temporal subband and investigate the robustness performance against compression attacks, as high frequency subbands can offer improved imperceptibility. In the experiment sets, the 3rd temporal level high pass (LLH) and low pass (LLL) frames are chosen to embed the watermark. The other video decomposition parameters are set to: 1) eight groups of pictures (GOPs) of 8 frames each, 2) an 8 × 8 macro block size and 3) a search window of ±16. The macro block size and search window are chosen with reference to the motion retrieval analysis in Section 7.3.2.2.

To measure the embedding distortion, the Mean Square Error (MSE) is used along with the amount of flicker introduced by watermark embedding, measured using the flicker metric in the MSU Quality Measurement Tool [112]. The flicker metric compares the flicker content in the watermarked video with that of the original video. For both metrics, lower values correspond to better distortion performance. The watermarking robustness, on the other hand, is represented by the Hamming distance as given in Eq. (4.1), and a lower Hamming distance corresponds to a better detection performance. Various wavelet based scalable coded quality compression attacks are considered, such as Motion JPEG 2000 (using the OpenJPEG software code) and MC-EZBC scalable video coding (an RWTH Aachen University implementation). We also report the preliminary robustness performance against H.264-SVC (scalable extension) using the JSVM software (Release 9.15). The results show the mean Hamming distance over the first 64 frames of the test video set.

Figure 7.11: The test video sequence set: Foreman, Crew, News, Stefan, Mobile, City, Football and Flower garden.
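The per-frame MSE used as the embedding distortion measure can be sketched as follows. Frames are represented as flat lists of luma samples, and the function names and sample values are hypothetical. The flicker metric of the MSU tool is not reproduced here.

```python
def mse(original_frame, watermarked_frame):
    """Mean square error between two frames given as flat lists of luma samples."""
    if len(original_frame) != len(watermarked_frame) or not original_frame:
        raise ValueError("frames must be non-empty and of equal size")
    squared = sum((a - b) ** 2 for a, b in zip(original_frame, watermarked_frame))
    return squared / len(original_frame)


def mse_per_frame(original_video, watermarked_video):
    """One MSE value per frame, as plotted against frame number."""
    return [mse(o, w) for o, w in zip(original_video, watermarked_video)]


original_video = [[100, 102, 98, 101], [100, 103, 97, 101]]
watermarked_video = [[101, 102, 96, 101], [100, 103, 97, 101]]
curve = mse_per_frame(original_video, watermarked_video)
print(curve)  # [1.25, 0.0]
```

Plotting such a curve over the first 64 frames gives the kind of per-frame distortion trace reported in the embedding distortion figures.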

The experiments are divided into two sets, one for embedding distortion analysis and the other for robustness evaluation. In all the experimental setups, two example watermarking algorithms are considered, one each from the non-blind [17] and blind [5] categories. The weighting parameters α and γ are set to 0.1. In the case of the non-blind algorithm, a level adaptive threshold selection method [17] is used to choose the coefficients in which to embed the watermark. The watermarking data capacity is set to 2000 bits and 2112 bits, using a binary logo, for all combinations and all sequences, for the non-blind and blind watermarking methods, respectively.
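To illustrate why a blind method needs no reference to the original coefficients, the sketch below uses a generic parity-based re-quantisation rule. It is a stand-in chosen for illustration and is not claimed to be the algorithm of [5] or the rule of Eq. (3.3): each coefficient is moved to the centre of the nearest quantisation bin whose index parity equals the watermark bit, and the bit is read back from the parity alone. The step size `step` and all function names are hypothetical.

```python
import math


def embed_requantise(coeff, bit, step=8.0):
    """Move `coeff` to the centre of the nearest bin whose index parity equals `bit`."""
    index = math.floor(coeff / step)
    if index % 2 != bit:
        lower_centre = (index - 0.5) * step   # centre of bin index - 1
        upper_centre = (index + 1.5) * step   # centre of bin index + 1
        if abs(coeff - lower_centre) <= abs(coeff - upper_centre):
            index -= 1
        else:
            index += 1
    return (index + 0.5) * step


def extract_blind(coeff, step=8.0):
    """Blind extraction: the parity of the bin index is the bit."""
    return math.floor(coeff / step) % 2


host = [37.2, -12.9, 5.0, 100.4]
bits = [1, 0, 1, 0]
marked = [embed_requantise(c, b) for c, b in zip(host, bits)]
print(marked)                               # [44.0, -12.0, 12.0, 100.0]
print([extract_blind(c) for c in marked])   # [1, 0, 1, 0]

# A perturbation smaller than half a step leaves the extracted bits unchanged.
attacked = [c + 3.0 for c in marked]
print([extract_blind(c) for c in attacked])  # [1, 0, 1, 0]
```

In this sketch the step size plays the role of the embedding strength: a larger step tolerates larger coefficient changes from compression but introduces more distortion.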

7.4.1 Embedding distortion analysis

The embedding distortion results for the non-blind watermarking method are shown in Figure 7.12 for the News sequence, Figure 7.13 for the Foreman sequence and Figure 7.14 for the Crew sequence. The results for the blind watermarking method are shown in Figure 7.15 for the News sequence, Figure 7.16 for the Foreman sequence and Figure 7.17 for the Crew sequence. Every figure gives the results for both the LLL and LLH frames. In each figure, the y-axis in a) and c) represents the MSE, and in b) and d) the flicker metric, for the LLL and LLH subbands, respectively. The x-axis represents the first 64 frames of the test sequences, with a GOP size of 8 frames.

From the results for the LLL subband, it is evident that although the MSE performances are comparable, the proposed MCTF based methods ((032), (131) and (230)) outperform the frame-by-frame embedding (002) in terms of flicker. In all four combinations the sums of energy in the LLL subband are similar, resulting in comparable MSE. However, in the proposed methods the error (i.e., MSE) is propagated along the GOP due to the hierarchical temporal decomposition along the motion trajectory, and this error propagation along the motion trajectory addresses the issues related to flickering artifacts.

On the other hand, for the LLH subband, the sum of energy is lower due to temporal filtering, and the four combinations can be ranked as 032 < 131 < 230 < 002. Hence the MSE and flickering performance for this temporal subband can be ranked as 032 > 131 > 230 > 002. Therefore, when a temporally filtered high frequency subband such as LLH, LH or H is chosen, the proposed MCTF approach also outperforms the frame-by-frame embedding in terms of MSE while addressing the flickering issues.

To evaluate the embedding performance we chose three video sequences with different motion activity: very low motion (News), medium motion (Foreman) and high motion (Crew). It is evident that the flickering due to frame-by-frame embedding is increasingly prominent in sequences with lower motion, and that it is successfully addressed by the proposed MCTF based watermarking approach.

Figure 7.12: Embedding distortion performance for non-blind watermarking on LLL and LLH temporal subbands for News sequence. a) and c) represent the MSE and b) and d) represent the flicker metric for LLL and LLH, respectively. Each panel plots the metric against frame number for the 002, 032, 131 and 230 combinations.

Figure 7.13: Embedding distortion performance for non-blind watermarking on LLL and LLH temporal subbands for Foreman sequence. a) and c) represent the MSE and b) and d) represent the flicker metric for LLL and LLH, respectively. Each panel plots the metric against frame number for the 002, 032, 131 and 230 combinations.

Figure 7.14: Embedding distortion performance for non-blind watermarking on LLL and LLH temporal subbands for Crew sequence. a) and c) represent the MSE and b) and d) represent the flicker metric for LLL and LLH, respectively. Each panel plots the metric against frame number for the 002, 032, 131 and 230 combinations.

Figure 7.15: Embedding distortion performance for blind watermarking on LLL and LLH temporal subbands for News sequence. a) and c) represent the MSE and b) and d) represent the flicker metric for LLL and LLH, respectively. Each panel plots the metric against frame number for the 002, 032, 131 and 230 combinations.

Figure 7.16: Embedding distortion performance for blind watermarking on LLL and LLH temporal subbands for Foreman sequence. a) and c) represent the MSE and b) and d) represent the flicker metric for LLL and LLH, respectively. Each panel plots the metric against frame number for the 002, 032, 131 and 230 combinations.

Figure 7.17: Embedding distortion performance for blind watermarking on LLL and LLH temporal subbands for Crew sequence. a) and c) represent the MSE and b) and d) represent the flicker metric for LLL and LLH, respectively. Each panel plots the metric against frame number for the 002, 032, 131 and 230 combinations.

7.4.2 Robustness performance evaluation

This experimental setup reports the robustness of the proposed scheme against various scalable content adaptations of video. The robustness results for the non-blind watermarking method are shown in Figure 7.18, Figure 7.19 and Figure 7.20 for the Crew, Foreman and News sequences, respectively. The x-axis represents the compression ratio (Motion JPEG 2000) or the video bit rate (MC-EZBC) and the y-axis shows the corresponding Hamming distance. Columns 1) and 2) show the results for the LLL and LLH frame selections, respectively. The preliminary robustness results against H.264-SVC are shown in Appendix A. The robustness performance shows that any combination of temporal filtering on spatial decomposition (i.e., (131) and (230)) outperforms a conventional t+2D based scheme.

The experimental robustness results for the blind watermarking method are shown in Figure 7.21, Figure 7.22 and Figure 7.23 for the Crew, Foreman and News sequences, respectively. The left hand column shows the results for the LLL temporal subband, while the results for LLH are shown in the right hand column. The rows represent the scalability attacks, Motion JPEG 2000 and MC-EZBC, respectively. In this case the motion information is obtained from the watermarked test video and, based on the motion retrieval analysis in Section 7.3.2.2, the motion parameters are set to a macro block size of 8 × 8 with a ±16 search window. As with the non-blind watermarking, any combination of temporal filtering on spatial decomposition (i.e., (131) and (230)) outperforms a conventional t+2D based scheme.

We now analyze the results by grouping them by selection of temporal subband, i.e., LLL and LLH; by embedding method, i.e., non-blind and blind; and by compression scheme, i.e., Motion JPEG 2000, MC-EZBC and H.264/SVC.

Selection of temporal subband:

The low frequency temporal subband (LLL) offers better robustness than the high frequency LLH subband. This is due to the higher energy concentration in the LLL subband after temporal filtering. Within the LLL subband, the various spatio-temporal combinations perform equally well, as the energy levels are nearly equal for 032, 131 and 230. However, 230 performs slightly better due to less motion related error in the spatially scaled subband. For the LLH subband, on the other hand, the robustness performance can be ranked as 230 > 131 > 032, as a result of the energy distribution ranking of these combinations in Section 7.3.2.1.

Embedding method:

In the experimental setup we have used two different watermarking schemes: 1) non-blind and 2) blind. In the non-blind case, the watermark extraction is performed using the original host video, and hence the original motion vectors are available at the extractor, which makes this scheme more robust to various scalable content adaptations. On the other hand, as explained before, the blind watermarking scheme has neither a reference to the original sequence nor a reference motion vector. The motion vectors are estimated from the watermarked test video itself, which results in comparatively poor robustness. The effect of motion related error is more visible in the LLH subband, as the motion compensated temporal high pass frame, and hence the robustness performance, is highly sensitive to the motion estimation accuracy. As discussed in Section 7.3.2.2, in the case of 2D+t (i.e., 230) the error due to the motion vectors is smaller than in a t+2D scheme, and hence 2D+t offers better robustness (230 > 131 > 032).

Compression scheme:

We have evaluated the proposed algorithm against various scalable video compression schemes, i.e., Motion JPEG 2000, MC-EZBC and H.264/SVC. The first two video compression schemes are based on wavelet technology, whereas the more recent H.264/SVC uses layered scalability with H.264/AVC base layer coding.

In the Motion JPEG 2000 scheme, the coding is performed by applying a 2D wavelet transform to each frame separately, without considering any temporal correlation between frames. In the proposed watermarking scheme, the use of the 2D wavelet transform offers a better association with the Motion JPEG 2000 scheme and hence provides better robustness for the 2D+t combination for LLL and LLH. Also, in the case of the LLH subband, a better energy concentration offers better robustness to Motion JPEG 2000 attacks. The robustness performance against Motion JPEG 2000 can be ranked as 230 > 131 > 032.

The MC-EZBC video coder uses a motion compensated 1D wavelet transform for temporal filtering and a 2D wavelet transform for spatial decomposition. From the compression point of view, MC-EZBC usually encodes video sequences in the t+2D combination due to the better energy compaction in the low frequency temporal frames. From the watermarking perspective, however, higher energy in the high frequency subband can offer better robustness. This argument is supported by the robustness results: the results for the LLL subbands are comparable, but a distinct improvement is observed in the LLH subband. Based on the results, the robustness ranking for MC-EZBC is 230 > 131 > 032.

Finally, the robustness of the proposed scheme is evaluated against H.264/SVC, which uses inter/intra motion compensated prediction followed by an integer transform with properties similar to those of the DCT. Although the watermarking and video coding schemes do not share any common technology or transform, the results show acceptable robustness. However, for a blind watermarking scheme in the LLH subband, the proposed schemes perform poorly due to blind motion estimation. As in the previous robustness results, based on the energy distribution and motion retrieval arguments, the spatio-temporal combinations can be ranked as 230 > 131 > 032. As a specific example, H.264/SVC usually gives preference to intra prediction for sequences with low global or local motion, such as the News sequence, and hence an exception in the robustness performance against H.264/SVC is noticed for the proposed scheme.

Based on the above discussion, due to the close association between the proposed scheme and MC-EZBC, the proposed scheme offers its best robustness against MC-EZBC based content adaptation. To conclude this discussion, we suggest that the choice of a 2D+t based watermarking scheme improves the imperceptibility and the robustness performance in a video watermarking scenario, for a non-blind as well as a blind watermarking algorithm. In the next section we extend the image watermarking robustness model to the video watermarking framework to propose watermarking with enhanced robustness against scalable compression.

Figure 7.18: Robustness performance of non-blind watermarking scheme for Crew sequence. Rows 1) and 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Columns 1) and 2) represent the embedding on temporal subbands LLL and LLH, respectively. Each panel plots the Hamming distance against the compression ratio (Motion JPEG 2000) or the bit rate in kbps (MC-EZBC) for the 032, 131 and 230 combinations.

Figure 7.19: Robustness performance of non-blind watermarking scheme for Foreman sequence. Rows 1) and 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Columns 1) and 2) represent the embedding on temporal subbands LLL and LLH, respectively. Each panel plots the Hamming distance against the compression ratio (Motion JPEG 2000) or the bit rate in kbps (MC-EZBC) for the 032, 131 and 230 combinations.

Figure 7.20: Robustness performance of non-blind watermarking scheme for News sequence. Rows 1) and 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Columns 1) and 2) represent the embedding on temporal subbands LLL and LLH, respectively. Each panel plots the Hamming distance against the compression ratio (Motion JPEG 2000) or the bit rate in kbps (MC-EZBC) for the 032, 131 and 230 combinations.

129

[Figure 7.21 plots, four panels for the Crew sequence, each showing Hamming distance for the legend entries 032, 131 and 230. Row 1: Robustness (blind) against Motion JPEG 2000 for LLL and LLH, plotted against compression ratio. Row 2: Robustness (blind) against MC-EZBC for LLL and LLH, plotted against bit rate (kbps).]

Figure 7.21: Robustness performance of the blind watermarking scheme for the Crew sequence. Rows 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Columns 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.

[Figure 7.22 plots, four panels for the Foreman sequence, each showing Hamming distance for the legend entries 032, 131 and 230. Row 1: Robustness (blind) against Motion JPEG 2000 for LLL and LLH, plotted against compression ratio. Row 2: Robustness (blind) against MC-EZBC for LLL and LLH, plotted against bit rate (kbps).]

Figure 7.22: Robustness performance of the blind watermarking scheme for the Foreman sequence. Rows 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Columns 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.

[Figure 7.23 plots, four panels for the News sequence, each showing Hamming distance for the legend entries 032, 131 and 230. Row 1: Robustness (blind) against Motion JPEG 2000 for LLL and LLH, plotted against compression ratio. Row 2: Robustness (blind) against MC-EZBC for LLL and LLH, plotted against bit rate (kbps).]

Figure 7.23: Robustness performance of the blind watermarking scheme for the News sequence. Rows 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Columns 1) & 2) represent the embedding on temporal subbands LLL & LLH, respectively.

7.5 Adopting robustness model in video watermarking

7.5.1 Robust video watermarking

So far in this chapter we have proposed and comprehensively evaluated the performance of video watermarking schemes using a generic MCTF based 2D+t+2D framework. The 2D+t spatio-temporal decomposition based watermarking outperformed traditional t+2D decomposition based watermarking schemes. We now aim to enhance the watermarking robustness further by adopting the image robustness models, proposed in Chapter 6, into the 2D+t+2D framework. One of the major reasons for this adaptation is that, similar to JPEG 2000, the quality scalable compression within Motion JPEG 2000 and MC-EZBC can also be modeled by the bit-plane discarding based quantization described in Section 6.2. Therefore, the proposed combined video watermarking scheme can offer enhanced robustness against quality scalable content adaptation for video.

7.5.2 Experimental results

In the experimental setup the robustness model is adopted during the watermark embedding after the 2D+t+2D decomposition. For comparison we have chosen the 2D+t (230) subband for both the non-blind and the blind case. The comparison is made between embedding without the model and embedding with the model assuming 5 bit plane discarding (N = 5). Other experimental parameters are kept the same as in the previous section. In these cases the normalization of the spatio-temporal decompositions is included. The weighting parameter is set to 0.1 in both cases. For the non-blind method, coefficients are selected in the LLH subband in raster scanning order with a data capacity of 2000 bits, while for the blind method one in every three coefficients of the selected subband (LLL here) is considered for embedding the watermark, with a data capacity of 2112 bits. The robustness results against various scalable compressions are shown in Figure 7.24 and Figure 7.25 for the non-blind and blind methods, respectively. In both figures three test sequences, Crew, Foreman and News, are used and the robustness against Motion JPEG 2000 and MC-EZBC is compared.

It is evident from the results that the image robustness model works successfully for video watermarking techniques by improving the robustness of the 2D+t embedding scheme. The robustness model can also be applied to other combinations in the 2D+t+2D framework. In the non-blind method, due to the availability of the original sequence, the motion vectors were unaffected by the robustness model, and hence the scheme using the model outperformed the cases which did not consider the model. On the other hand, in the blind case, the use of the robustness model increases the embedding distortion (as described in Chapter 6), resulting in distortion in the motion vectors too, which in turn reduces the robustness improvement. Ideally, the robustness model assumes the same spatio-temporal decomposition parameters in the embedding as well as in the compression algorithm, whereas in this experimental set the embedding scheme is used independently of the compression algorithms. Therefore we can conclude that the robustness enhancement model can perform more efficiently when used within compression algorithms which use the bit plane discarding model and preserve motion vector information.

7.6 Conclusions

In this chapter, a flexible generalized motion compensated temporal-spatial subband decomposition scheme, based on the MMCTF, is presented for video watermarking. The MCTF was modified by taking the motion trajectory into account to obtain an efficient update step. The embedding distortion performance, evaluated using both MSE and the flicker difference metric, shows superior performance for the MMCTF driven 2D+t+2D subband domain watermarking as opposed to frame-by-frame 2D wavelet domain watermarking, which does not take motion into account. The proposed subband decomposition also provides low complexity as MCTF is performed only on subbands where the watermark is embedded. The robustness performance against scalable coding based compression attacks, including Motion JPEG 2000, MC-EZBC and preliminary results for H.264-SVC (scalable extension), is also evaluated. The proposed 2D+t based video watermarking scheme within the 2D+t+2D filtering framework outperforms conventional t+2D watermarking schemes in a non-blind as well as a blind watermarking scenario. This video watermarking technique is further extended using the image robustness model to enhance the robustness against scalable compression.

[Figure 7.24 plots, six panels, each showing Hamming distance for two curves: "Without model" and "With robustness model". Column 1: Robustness (non-blind) against Motion JPEG 2000 (LLH), plotted against compression ratio. Column 2: Robustness (non-blind) against MC-EZBC (LLH), plotted against bit rate (kbps). Rows: Crew, Foreman, News.]

Figure 7.24: Robustness performance enhancement using the bit plane discarding model (N = 5) of the non-blind watermarking scheme for the LLH subband. Columns 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Rows 1), 2) & 3) represent the test sequences Crew, Foreman & News, respectively.

[Figure 7.25 plots, six panels, each showing Hamming distance for two curves: "Without model" and "With robustness model (N=5)". Column 1: Robustness (blind) against Motion JPEG 2000 (LLL), plotted against compression ratio. Column 2: Robustness (blind) against MC-EZBC (LLL), plotted against bit rate (kbps). Rows: Crew, Foreman, News.]

Figure 7.25: Robustness performance enhancement using the bit plane discarding model (N = 5) of the blind watermarking scheme for the LLL subband. Columns 1) & 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Rows 1), 2) & 3) represent the test sequences Crew, Foreman & News, respectively.

Chapter 8

Distortion Constrained Robustness Scalable Watermarking

In the previous chapters of this thesis we discussed the embedding distortion and the robustness to quality scalable image and video coding, which are two complementary watermarking requirements. In this chapter we introduce a novel concept of scalable blind watermarking to generate a distortion-constrained robustness scalable watermarked image code-stream which consists of hierarchically nested joint distortion-robustness coding atoms.

8.1 Introduction

Inspired by the image coding standard JPEG2000 and the video coding scheme MC-EZBC, this work addresses two long-overdue watermarking questions: 1) Can we define a watermark embedding-rate vs. overall embedding-distortion curve? and 2) Can we formulate a scalable embedding-robustness relationship graph which can provide hierarchically improved robustness against an image or video processing or compression scheme? In answering these questions, we recall the state of the art analysis in Chapter 3, where various traditional wavelet based image and video watermarking schemes are generalized and dissected into a set of processes. The performance metrics are divided into two categories: 1) embedding performance (measured by data capacity and embedding distortion, such as PSNR) and 2) robustness (measured by similarity correlation or Hamming distance (Bit Error Rate)). In traditional watermarking methods these metrics are often treated as mutually exclusive and are therefore measured and represented separately. In fact, these metrics influence each other, and in this chapter we aim to combine them to propose a distortion constrained robustness scalable watermarking scheme. It is therefore also important to formulate a common, mutually inclusive space for the performance measures.

The concept of scalable watermarking is particularly useful for watermarking scalable coded image and video, where the watermark can also be scaled according to the heterogeneous network capacity and the end user's requirements for a target application. For example, for a high bandwidth network and a high resolution display, a highly imperceptible but less robust watermarked image or video can be transmitted: in this scenario highly imperceptible media is desirable, and the watermark can be extracted reliably because of the lower compression. For a low capacity network and a low resolution display, on the other hand, the distribution server can choose a highly robust watermarking stream: because of the higher compression the watermark imperceptibility is less important, but high robustness is required for reliable watermark extraction. Similarly, for any other combination of the network's capability and the user's requirements, the scalable watermarked media code-stream can be truncated and distributed accordingly.

With the increased use of scalable coded media, such a scalable watermarking concept is very important, but little or no work has been reported so far in the literature. The most common existing algorithms are proposed either as joint progressive scalable watermarking and coding schemes [34,35] or as efficient coefficient selection methods which are robust against resolution or quality scalable attacks [36,37]. These algorithms primarily focus on two main robustness issues [38]: 1) detection of the watermark after acceptable scalable compression and 2) graceful improvement of the extracted watermark as more quality or resolution layers are received at the image decoder.

In contrast, we propose a novel scalable watermarking concept, based on a distortion constrained watermarked code-stream, to generate a watermarked image / video with the desired distortion-robustness requirements. This work addresses the two-fold problem of 1) obtaining the least distortion at a given watermark embedding rate and 2) achieving the best robustness in a scalable fashion by hierarchically encoding lower and higher embedded code-atoms, respectively. In designing the algorithm, we have considered the propositions for embedding distortion in Chapter 5, i.e., in order to minimize the distortion, the coefficient modification must be minimized; and the concept of quality scalability using the bit plane discarding model (as discussed in Chapter 6) in order to improve the robustness against scalable content adaptation. The objectives of the proposed scheme are:

• Creating a common performance metric to represent data-capacity and embed-

ding distortion.

• Proposing a new watermarking algorithm which incorporates the bit plane dis-

carding model, used in quality scaling based content adaptation in scalable coded

image and video.

• Obtaining best robustness at a given embedding distortion rate.

• Scalable embedded code-stream generation using hierarchically nested joint distortion-

robustness coding atoms.

• The code-stream should be allowed to be truncated at any distortion-robustness

atom level to generate the watermarked image with the desired distortion-robustness

requirements.

To fulfill the requirements of the said objectives, we introduced a new wavelet domain

binary tree guided rules-based blind watermarking algorithm. The universal blind ex-

tractor of this algorithm is capable of extracting watermark data from the watermarked

images, created using any truncated code-stream. The code-stream mentioned in this

chapter refers to the joint watermarked wavelet coefficient stream.

As no such idea has been explored yet in the literature, in order to quantify the work

in this chapter we have introduced a new embedding distortion metric and reported

the robustness results in a hierarchical fashion to support the claim. However, further

evaluation can be done in future according to the potential applications such as in

authentication of multimedia streaming.

8.2 Scalable watermarking

In this section, the design and development of the proposed scalable watermarking

scheme is discussed. Firstly, the new wavelet domain binary tree guided watermarking

algorithm is discussed and then it is used in developing the scalable watermarking

system.

8.2.1 Proposed algorithm

In proposing the new algorithm we aim to address two issues related to the main theme

of the thesis, robust watermarking techniques for scalable coded image and video: 1) a

robust watermarking technique which considers bit plane discarding model in scalable

compression and 2) scalability of the watermarking. As discussed earlier the traditional

watermarking algorithms fail to comply with the design requirements of the proposed

scalable watermarking scheme. Therefore we introduce a new watermarking algorithm

which satisfies the above mentioned requirements by creating watermarked image /

video code-stream atoms and allows quantitative embedding-distortion measurement

at individual atom level.

The proposed wavelet-based algorithm follows a system block diagram similar to the one shown in Figure 3.1. The watermark embedding is performed on the coefficients generated after the FDWT. The embedding algorithm follows a non-uniform quantization based index modulation. The embedding process is divided into two parts: 1) quantized binary tree formation and 2) watermark embedding by index modulation.

8.2.1.1 Tree formation

In this step, all selected coefficients are recursively quantized to form a binary tree. Firstly the selected wavelet coefficient ($C$) is indexed ($b_i$) as 0 or 1 using an initial quantizer $\lambda$:

$$b_i = \left\lfloor \frac{|C|}{\lambda / 2^i} \right\rfloor \,\%\, 2, \quad i \in \{0, 1, 2, 3, \ldots\}, \tag{8.1}$$

where % denotes the modulo operation.

Assuming $n = \lfloor |C| / \lambda \rfloor$, we can identify the position of $C$ between the quantized clusters $(n)$ and $(n + 1)$, which can alternatively be described as bit plane clusters, as shown in Figure 8.1. The selected coefficient $C$ is then quantized more precisely within a smaller cluster using a smaller quantizer $\lambda/2$, and the corresponding index is calculated as $b_1 = \lfloor |C| / (\lambda/2) \rfloor \,\%\, 2$. The index tree formation is continued recursively by halving the quantizer for as long as $\lambda/2^i \geq 1$. During this tree formation process the signs of the coefficients are preserved separately.

Figure 8.1: Non-uniform hierarchical quantizer in formation of binary tree.

Based on the index values calculated at the various quantization steps, a binary tree ($b(C)$) of each selected coefficient is formed as follows:

$$b(C) = (b_0)(b_1)\ldots(b_{i-1})(b_i), \tag{8.2}$$

where $(b_0), (b_1), \ldots, (b_i)$ are the binary bits from the most significant bit (MSB) to the least significant bit (LSB) position, respectively, with tree depth $i + 1$. For example, if $C = 135$ and the initial $\lambda = 30$, the binary tree will be $b(C) = 01000$. The tree formation scenario is shown in Figure 8.2. The number of tree nodes, i.e. the number of bits in any binary tree, is decided by the initial quantizer $\lambda$ and is defined as the depth of the tree.
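The recursion of Eq. (8.1) and Eq. (8.2) can be sketched in a few lines of Python. This is only an illustration: the function name `binary_tree` is hypothetical and not taken from the thesis. The sign is dropped here because the text states that it is preserved separately.

```python
from math import floor


def binary_tree(coefficient, lam):
    """Return the bits b_0 ... b_i of Eq. (8.1), MSB first, as a string."""
    magnitude = abs(coefficient)      # the sign is kept separately
    bits = []
    step = float(lam)                 # quantizer lambda / 2^i, starting at i = 0
    while step >= 1:                  # continue while lambda / 2^i >= 1
        bits.append(str(floor(magnitude / step) % 2))
        step /= 2                     # halve the quantizer for the next bit
    return "".join(bits)


print(binary_tree(135, 30))  # 01000, the example given in the text
```

The number of characters in the returned string is the tree depth, so it depends only on the initial quantizer.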

8.2.1.2 Embedding

The above mentioned binary tree is used to embed binary watermark information using symbol based embedding rules. To introduce the watermarking scalability, we choose the 3 most significant bits, which represent 8 different states corresponding to 6 different symbols. Although any other number of bits (> 1) can be chosen, the use of more bits (> 3) results in more states and thus increases the complexity, while fewer bits (< 3) reduce the watermark scalability. The 3 most significant bits of any binary tree represent 6 symbols (EZ = Embedded Zero, CZ = Cumulative Zero, WZ = Weak Zero, EO = Embedded One, CO = Cumulative One and WO = Weak One) that identify the original coefficient's association with a 0 or 1. The bits in the binary trees, the symbols and the corresponding associations are shown in Table 8.1 for a tree depth of 7.

[Figure 8.2 shows the tree for C with b(C) = 01000 and tree depth 5.]

Figure 8.2: Example binary tree.

Table 8.1: Tree-based watermarking rules table

Binary tree    Symbol    Watermark association
000xxxx        EZ        0
001xxxx        EZ        0
010xxxx        CZ        0
011xxxx        WO        1
100xxxx        WZ        0
101xxxx        CO        1
110xxxx        EO        1
111xxxx        EO        1

Based on the input watermark stream, a new association is made, if required, by altering the chosen 3 most significant bits in the tree to reach the nearest symbol, as shown in the state machine diagram in Figure 8.3. Assuming the current state of the binary tree is EZ, no change in state is required to embed watermark bit 0, while to embed watermark bit 1 a new value of the binary tree must be assigned that is associated with either WO, CO or EO. To minimize the distortion, however, the nearest state change must occur, as shown in the state machine diagram. Other state changes in the binary tree follow a similar argument. Finally the watermarked image / video is obtained by de-quantizing the modified binary tree followed by an inverse transform.

[Figure 8.3 is a state diagram over the symbols EZ, CZ, WZ, EO, CO and WO; its edges are labelled with the watermark bit (0 or 1), with separate refinement-pass edges for 0 and for 1.]

Figure 8.3: State machine diagram of watermark embedding based on tree-symbol-association model.
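As a rough illustration of the rules in Table 8.1, the sketch below maps a coefficient to its symbol and association and reports whether a state change would be needed for a given watermark bit. The names `RULES`, `msb_prefix`, `symbol_of` and `needs_state_change` are hypothetical. The actual transitions of the state machine in Figure 8.3 are not reproduced, since they cannot be read from the text alone.

```python
from math import floor

# Table 8.1: three most significant tree bits -> (symbol, associated bit)
RULES = {
    "000": ("EZ", 0), "001": ("EZ", 0), "010": ("CZ", 0), "011": ("WO", 1),
    "100": ("WZ", 0), "101": ("CO", 1), "110": ("EO", 1), "111": ("EO", 1),
}


def msb_prefix(coefficient, lam):
    """First three tree bits b0 b1 b2, computed with Eq. (8.1)."""
    magnitude = abs(coefficient)
    return "".join(str(floor(magnitude / (lam / 2 ** i)) % 2) for i in range(3))


def symbol_of(coefficient, lam):
    """Symbol and watermark association of a coefficient."""
    return RULES[msb_prefix(coefficient, lam)]


def needs_state_change(coefficient, lam, watermark_bit):
    """True when the current association differs from the bit to embed."""
    return symbol_of(coefficient, lam)[1] != watermark_bit


print(symbol_of(135, 30))              # ('CZ', 0)
print(needs_state_change(135, 30, 1))  # True
```

For C = 135 and λ = 30 the tree starts with 010, so the coefficient is already associated with 0 and only an input bit of 1 would trigger a state change.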

At this point we recall the issues related to embedding performance measurement and propose a new metric that combines the data capacity and the embedding distortion:

$$\Phi = \frac{\sum_{m=0}^{X-1} \sum_{n=0}^{Y-1} \left( I(m,n) - I'(m,n) \right)^2}{L}, \tag{8.3}$$

where $\Phi$ represents the embedding distortion rate, $I$ and $I'$ are the original and watermarked image, respectively, with dimension $X \times Y$, and $L$ is the number of watermark bits embedded, i.e. the data capacity. Graphs of the traditional distortion metric PSNR against the newly proposed $\Phi$ are shown in Section 8.3.
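A minimal sketch of Eq. (8.3) follows, assuming the images are given as equally sized nested lists of pixel values. The function name `embedding_distortion_rate` is chosen here for illustration only.

```python
def embedding_distortion_rate(original, watermarked, num_bits):
    """Sum of squared pixel differences divided by the number of embedded bits."""
    if num_bits <= 0:
        raise ValueError("at least one watermark bit must be embedded")
    squared_error = sum(
        (a - b) ** 2
        for row_o, row_w in zip(original, watermarked)
        for a, b in zip(row_o, row_w)
    )
    return squared_error / num_bits


original = [[10, 20], [30, 40]]
watermarked = [[12, 20], [30, 37]]
print(embedding_distortion_rate(original, watermarked, 2))  # 6.5
```

In this toy example two bits are embedded at a total squared error of 4 + 9 = 13, giving a distortion rate of 6.5 per embedded bit.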

8.2.1.3 Extraction and Authentication

A universal blind extractor is proposed for watermark extraction and authentication

process. The wavelet coefficients are generated after forward transform on the test

image / video followed by the tree formation process as in embedding. Based on the

recovered tree structure, symbols are re-generated to decide on a 0 or 1 watermark

extraction. The extracted watermark is then authenticated by comparing with the

original watermark. The authentication is done by measuring the similarity correlation

or the Hamming distance (Bit Error Rate) as described in Eq. (4.2) and Eq. (4.1),

respectively.
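The extraction and authentication steps can be sketched as follows. Eq. (4.1) is not reproduced in this section, so the Hamming distance below is assumed to be the fraction of differing bits (a bit error rate). The names `ASSOCIATION`, `extract_bit` and `hamming_distance` are hypothetical, and the forward transform is omitted: the inputs are taken to be wavelet coefficients already.

```python
from math import floor

# Watermark association of the three most significant tree bits (Table 8.1)
ASSOCIATION = {"000": 0, "001": 0, "010": 0, "011": 1,
               "100": 0, "101": 1, "110": 1, "111": 1}


def extract_bit(coefficient, lam):
    """Rebuild the first three tree bits and look up the associated bit."""
    magnitude = abs(coefficient)
    prefix = "".join(str(floor(magnitude / (lam / 2 ** i)) % 2) for i in range(3))
    return ASSOCIATION[prefix]


def hamming_distance(original_bits, extracted_bits):
    """Assumed normalised form: fraction of positions where the bits differ."""
    if not original_bits or len(original_bits) != len(extracted_bits):
        raise ValueError("bit sequences must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(original_bits, extracted_bits))
    return differing / len(original_bits)


coefficients = [135, 100, 20, 250]      # illustrative values only
extracted = [extract_bit(c, 32) for c in coefficients]
print(extracted)                                    # [0, 0, 0, 1]
print(hamming_distance([0, 1, 0, 1], extracted))    # 0.25
```

Because the extractor needs only the coefficient and the initial quantizer, it does not depend on where the code-stream was truncated.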

8.2.2 Scalable watermark system design

At this point we define the watermarking scalability, independent of the host image / video coding-decoding schemes. Scalability here refers to embedding the watermark in a hierarchical fashion, in such a way that more embedding information leads to better robustness. In the proposed algorithm, the symbols in Table 8.1 can be ranked by their associated robustness. The MSB in the binary tree corresponds to a coarse-grained quantization index whereas the LSB represents a fine-grained quantization index. To extract the watermark bit successfully, all three MSBs of a binary tree must be unaltered in the case of WO, CO or WZ, CZ, whereas only the two most significant bits are required to be preserved for EO or EZ. Therefore two consecutive 0s (EZ) or 1s (EO) provide the strongest association with 0 or 1, respectively, and hence the highest robustness. On the other hand, WO, CO and WZ, CZ offer the same level of robustness, and hence the robustness rank of the symbols can be defined as EO > CO, WO and EZ > CZ, WZ. At the same time we can measure the collective embedding distortion rate as in Eq. (8.3). In this section we exploit these two properties of the algorithm to design the scalable watermarking concept. The complete process is divided into three separate modules: 1) Encoding module, 2) Embedded watermarking module and 3) Extractor module.

8.2.2.1 Encoding module

The main functionality of this module is to generate a hierarchical embedded code-

stream. The example scalable system model is shown in Figure 8.4. The sequential

activities within the encoding module can be described in the following steps:

Step 1 (Tree formation): Binary trees are formed using the proposed algorithm for

each selected coefficient to be watermarked. Every tree is now assigned a symbol

according to Table 8.1.

Step 2 (Main pass): In step 2, based on the input watermark stream, we alter the trees to create the right association as described in Figure 8.3. All selected coefficients are then rightly associated at least with the basic WZ/WO symbol, and we therefore name this the base layer. The embedding distortion is calculated progressively at individual tree level.

[Figure 8.4 shows the input watermarking stream driving scalable watermarking layer creation over the symbols EZ, CZ, WZ, EO, CO and WO in three stages: Step 1, tree formation; Step 2, main pass; Step 3, refinement pass.]

Figure 8.4: Proposed scalable watermarking layer creation.

Step 3 (Refinement pass): The main aim of the refinement pass is to increase the watermarking strength progressively and thereby the robustness. The base layer provides the basic minimum association with the watermark bits, and in this refinement pass the watermarking strength is increased by modifying the symbols and the corresponding trees to the next available level, i.e. WZ → EZ, CZ → EZ, WO → EO and CO → EO, as shown in the state machine diagram in Figure 8.3. At the end of this pass, all trees are modified and associated with the strongest watermark embedding, EZ/EO. As in the previous step, the distortion is calculated as the refinement level progresses.

Step 4 (Hierarchical atom and code-stream generation): During the 2 different passes, the binary trees are modified according to the input watermark association and the progressive embedding distortion is calculated at each individual tree. Here we define these individual trees, or a group of trees, as an atom. Each atom contains two pieces of information: 1) the embedding distortion rate and 2) the modified tree values. A code-stream is then generated by concatenating these atoms as shown in Figure 8.5. One set of header information is also included at the beginning of the stream to identify the input parameters such as the wavelet kernel, the number of decomposition levels, the depth of the binary tree etc.

[Figure 8.5 layout: Header | Main pass: Atom 1, Atom 2, ..., Atom n | Refinement pass: Atom n+1, Atom n+2, ..., Atom 2n. Each atom holds the progressive embedding distortion rate at the atom and the group of binary tree data within the atom.]

Figure 8.5: Code-stream generation.

8.2.2.2 Embedded watermarking module

The embedded watermarking module truncates the code-stream at any distortion-robustness atom level to generate the watermarked image with the desired distortion-robustness requirements. Including more atoms before truncation increases the robustness of the watermarked image but consumes a greater embedding-distortion rate. The code-stream truncation at atom level provides flexibility towards watermarking scalability. The truncated code-stream is then de-quantized to reconstruct the watermarked coefficients. An inverse transform on these coefficients generates the required watermarked image / video.
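One possible in-memory layout for the code-stream and its truncation is sketched below. The class names `Atom` and `CodeStream`, the header keys and all numeric values are hypothetical. The sketch also assumes that each atom stores a progressive (non-decreasing) distortion rate, so that truncation simply keeps a prefix of the atoms.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Atom:
    distortion_rate: float   # progressive embedding distortion rate at this atom
    tree_values: List[str]   # modified binary trees carried by this atom


@dataclass
class CodeStream:
    header: dict             # input parameters, e.g. kernel, levels, tree depth
    atoms: List[Atom] = field(default_factory=list)  # main pass first, then refinement

    def truncate(self, max_distortion_rate):
        """Keep the longest prefix of atoms that stays within the budget."""
        kept = []
        for atom in self.atoms:
            if atom.distortion_rate > max_distortion_rate:
                break
            kept.append(atom)
        return CodeStream(self.header, kept)


# Example values are made up for illustration only.
stream = CodeStream(
    header={"levels": 3, "tree_depth": 7},
    atoms=[Atom(0.5, ["0110000"]), Atom(1.0, ["1010000"]),
           Atom(1.8, ["1100000"]), Atom(2.5, ["0000000"])],
)
print(len(stream.truncate(2.0).atoms))  # 3
```

Truncating at a larger budget keeps more refinement-pass atoms, which corresponds to the trade-off described above: higher robustness at a higher embedding-distortion rate.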

8.2.2.3 Extractor module

The extractor module consists of a universal blind extractor similar to the one described

in Section 8.2.1.3. Any attacked or compressed test image / video is passed to this

module for watermark extraction and authentication. During the extraction, forward

transform is applied on the test image / video and the coefficients are used to form

the binary tree. Based on the rules, stated in Table 8.1, each tree is then assigned

to a symbol and corresponding association. The association of 0 or 1 indicates the

extracted watermark value. The extracted watermark bits are then authenticated using

a similarity correlation or Hamming distance.

The feasibility of this scalable watermarking concept is verified in the experimental results section.

8.2.3 Effect of bit plane discarding

To improve the robustness against quality scalable compression, at this point we incorporate the bit plane discarding model within the proposed algorithm by restricting the initial quantizer (λ) value to an integer power of two. The quantization clusters in tree formation (Section 8.2.1) can therefore alternatively be described as bit plane clusters. Because of this bit plane based clustering in binary tree formation, every value in the binary tree corresponds to a bit plane of the selected coefficient. Therefore, depending on the depth parameter in the embedding algorithm, the selected coefficient can retain the watermark even after bit plane discarding. In this subsection we discuss the effect of bit plane discarding on extracting the watermark information.

Assuming $C'$ and $C'_k$ as the watermarked coefficient before and after bit plane discarding, respectively, we examine the effect of discarding $N$ bit planes on every bit in the binary tree during the watermark extraction. Considering the initial $\lambda = 2^M$, where $M$ corresponds to the depth of the tree, at the extractor, using Eq. (8.1), the bit ($b_i$) in the binary tree can be calculated as:

$$b_i = \left\lfloor \frac{|C'|}{2^M} \right\rfloor \,\%\, 2 = k_1 \,\%\, 2, \tag{8.4}$$

where $k_1$ is the cluster index as shown in Figure 8.6. Now, using the bit plane discarding model in Section 6.2, the watermarked coefficients $C'$ are quantized and mapped to the center value $C'_k$ within a bit plane cluster with an index value of $k_2$, as shown in Figure 8.6. At this point we consider the following three cases to investigate the effect of this quantization and de-quantization process:

Figure 8.6: Effect of bit plane discarding in watermark extraction; $\lambda = 2^M$ and $N$ is the number of bit planes being discarded. Panels: a) Case 1: M > N; b) Case 2: M = N; c) Case 3: M < N.

Case 1 (M > N): In this case the binary tree cluster ($\lambda = 2^M$) is bigger than the bit plane discarding cluster. Hence for any bit plane discarding where $M > N$, the $C'_k$ value remains within the binary tree cluster, $k_1 \cdot 2^M \leq C'_k < (k_1 + 1) \cdot 2^M$, as shown in Figure 8.6.a), and

$$b_i = \left\lfloor \frac{|C'|}{2^M} \right\rfloor \,\%\, 2 = \left\lfloor \frac{|C'_k|}{2^M} \right\rfloor \,\%\, 2 = b'_i, \tag{8.5}$$

where $b_i$ and $b'_i$ represent the bit in the binary tree without bit plane discarding and after bit plane discarding, respectively.

Case 2 (M = N): This case considers the same cluster size in the binary tree and in the bit plane discarding. Therefore $C'_k$ remains in the same cluster of the binary tree during watermark extraction, as shown in Figure 8.6.b), and hence $b_i = b'_i$.

Figure 8.7: Effect of bit plane discarding in watermark extraction for the special case of EZ and EO; $\lambda = 2^M$ and $N$ is the number of bit planes being discarded. Panels: a) Case: EZ; b) Case: EO.

Case 3 (M < N): In this scenario the number of bit planes being discarded is greater than the depth of the binary tree. Due to bit plane discarding, any watermarked coefficient ($C'$) in the cluster $k_2 \cdot 2^N \leq C' < (k_2 + 1) \cdot 2^N$ is mapped to the center value $C'_k$. In terms of the binary tree clustering this range can be defined as $k_1 \cdot 2^M \leq C' < (k_1 + 2^{(N-M)}) \cdot 2^M$, where $(N - M)$ is a positive integer. Hence during watermark extraction the index of the binary tree cluster can change, and effectively $b_i = b'_i$ is not guaranteed.
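The three cases can be checked numerically with a small sketch. It assumes non-negative integer coefficient magnitudes and uses the cluster centre $k_2 \cdot 2^N + (2^N - 1)/2$ that appears later in this section. The function names are hypothetical.

```python
from math import floor


def discard_bit_planes(magnitude, n):
    """Map an integer magnitude to the centre of its 2^n bit plane cluster."""
    k2 = magnitude // (2 ** n)
    return k2 * 2 ** n + (2 ** n - 1) / 2


def tree_bit(magnitude, m):
    """Tree bit for a cluster of size 2^m, as in Eq. (8.4)."""
    return floor(magnitude / 2 ** m) % 2


def bit_survives(m, n, max_magnitude=1024):
    """True if the bit is unchanged by discarding for every tested magnitude."""
    return all(
        tree_bit(c, m) == tree_bit(discard_bit_planes(c, n), m)
        for c in range(max_magnitude)
    )


print(bit_survives(5, 3))  # True  (Case 1: M > N)
print(bit_survives(3, 3))  # True  (Case 2: M = N)
print(bit_survives(2, 3))  # False (Case 3: M < N)
```

In the third call a magnitude of 5 has tree bit 1 for a cluster size of 4, but after discarding three bit planes it is mapped to 3.5, whose tree bit is 0.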

So far we have explained the effect of bit plane discarding on individual bits of the binary tree. As the algorithm generates the watermark association symbols using the three most significant bits of the binary tree (Table 8.1), we can define the necessary condition for the coefficients to retain the watermark as follows:

$$d \geq N + 3, \tag{8.6}$$

where $d$ is the depth of the binary tree and $N$ is the number of bit planes assumed to be discarded.

However, after the 2nd refinement pass in the code-stream all modified coefficients are associated with either EZ or EO. In that case only the two most significant bits are required to be preserved, and hence, when the embedding considers the highest robustness criterion, Eq. (8.6) becomes:

$$d \geq N + 2. \tag{8.7}$$

Moreover, in this case the 2nd most significant bit in the binary tree need not be preserved, provided that the MSB is preserved along with a supporting decision from the 3rd MSB, i.e., EZ and EO are allowed to be extracted as CZ and CO, respectively. We now examine the effect of bit plane discarding in such cases when $d = N + 1$.

Case EZ:

Considering $\lambda = 2^M$, after the 2nd refinement pass the coefficients ($C'$) are associated with embedded zero (EZ→00x), i.e., $k_1 \cdot 2^M \le C' < \left(k_1 \cdot 2^M + \frac{2^M}{2}\right)$ where $k_1 \% 2 = 0$, as shown in Figure 8.7.a). After N bit plane discarding, $C'$ is modified to the center value $C_k = \left(k_2 \cdot 2^N + \frac{2^N - 1}{2}\right)$. For M = N (i.e., d = N + 1), $k_2$ becomes $k_1$ and therefore:

\[
C_k = \left(k_2 \cdot 2^N + \frac{2^N - 1}{2}\right) < \left(k_1 \cdot 2^M + \frac{2^M}{2}\right)
\;\Rightarrow\; C_k < \left(k_1 \cdot 2^M + \frac{2^M}{2}\right),
\quad \forall\; k_1 \cdot 2^M < C_k < \left(k_1 \cdot 2^M + \frac{2^M}{2}\right), \tag{8.8}
\]

which results in the 2nd MSB remaining 0 in the binary tree. Hence, for d = N + 1, the coefficient association with EZ remains the same after bit plane discarding and the watermark information can be successfully recovered.

Case EO:

Referring to Figure 8.7.b), for embedded one (EO→11x), the condition for coefficient association becomes $\left(k_1 \cdot 2^M + \frac{2^M}{2}\right) \le C' < (k_1 + 1) \cdot 2^M$ where $k_1 \% 2 = 1$. Similar to the previous case, after N bit plane discarding $C'$ is modified to the center value of the corresponding cluster, $C_k = \left(k_2 \cdot 2^N + \frac{2^N - 1}{2}\right)$. Considering M = N, similar to Eq. (8.8) we can write:

\[
k_1 \cdot 2^M < C_k < \left(k_1 \cdot 2^M + \frac{2^M}{2}\right). \tag{8.9}
\]

Therefore the first two MSBs of the binary tree are now changed as 11x→10x. At this point we aim to extract the 3rd MSB ($b'$), which can be retrieved as:

\[
b' =
\begin{cases}
0 & \text{if } k_1 \cdot 2^M \le C_k < \left(k_1 \cdot 2^M + \frac{2^M}{4}\right), \\
1 & \text{if } \left(k_1 \cdot 2^M + \frac{2^M}{4}\right) \le C_k < \left(k_1 \cdot 2^M + \frac{2^M}{2}\right).
\end{cases} \tag{8.10}
\]

Now considering M = N, we have $\frac{2^N - 1}{2} > \frac{2^M}{4}$ and Eq. (8.9) becomes

\[
\left(k_1 \cdot 2^M + \frac{2^M}{4}\right) < C_k < \left(k_1 \cdot 2^M + \frac{2^M}{2}\right). \tag{8.11}
\]

Combining Eq. (8.10) and Eq. (8.11), the extracted 3rd MSB becomes $b' = 1$ and hence 11x→101. Therefore, for d = N + 1, the coefficient association with EO becomes CO after bit plane discarding and the watermark information can still be successfully extracted.

Combining the above mentioned cases, we can modify Eq. (8.7) and conclude that for EZ or EO the relationship between the embedding depth d and the maximum number of discarded bit planes N is as follows:

\[
d \ge N + 1. \tag{8.12}
\]

Therefore, using the above mentioned conditions, the proposed new algorithm ensures reliable detection of the watermark against quality scalable compression that follows the bit plane discarding model. We have verified these conditions using experimental simulations in Section 8.3.
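The EZ and EO cases can also be checked numerically. The following is a minimal sketch rather than the thesis implementation: it assumes non-negative integer coefficients, models N bit plane discarding as mapping a coefficient to the center value $k \cdot 2^N + (2^N - 1)/2$ of its bin, and reads the three most significant tree bits from clusters of width $2^M$, $2^{M-1}$ and $2^{M-2}$. All function names are hypothetical.

```python
def discard_bit_planes(c, n):
    """Map a non-negative integer coefficient to the centre of its 2**n-wide bin."""
    k = c >> n                      # bin index after dropping n bit planes
    return k * 2**n + (2**n - 1) / 2


def top_three_bits(c, m):
    """Return (MSB, 2nd MSB, 3rd MSB) when the top-level cluster width is 2**m (m >= 2)."""
    return tuple(int(c // 2 ** (m - i)) % 2 for i in range(3))


def check_embedded_symbols(m, n, num_clusters=8):
    """Collect the bit patterns read back after discarding n planes from EZ (00x) and EO (11x)."""
    seen = {"EZ": set(), "EO": set()}
    for c in range(num_clusters * 2**m):
        msb, second, _ = top_three_bits(c, m)
        if msb != second:
            continue                # only 00x (EZ) and 11x (EO) coefficients are considered
        label = "EZ" if msb == 0 else "EO"
        seen[label].add(top_three_bits(discard_bit_planes(c, n), m))
    return seen


print(check_embedded_symbols(m=5, n=5))  # N = M, i.e. d = N + 1
print(check_embedded_symbols(m=5, n=6))  # N = M + 1, i.e. d = N
```

Under these assumptions, with M = N = 5 every EZ coefficient is read back as 001 and every EO coefficient as 101, in line with Eq. (8.8) to Eq. (8.11). With N = M + 1 the MSB of the EO coefficients is no longer preserved, which is why Eq. (8.12) cannot be relaxed further.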

8.3 Experimental results and discussion

This section provides the experimental verification of the proposed scalable watermarking scheme for images as well as video, and evaluates its robustness to scalable content adaptation attacks. As a proof of concept of the proposed scheme, we first simulated various experimental sets on image watermarking and later extended the scheme to MCTF based video watermarking.

8.3.1 Scalable watermarking for images

The experimental simulations are grouped into four sets: 1) proof of the concept, 2) verification of the scheme for the bit plane discarding model, 3) robustness performance against JPEG 2000 and 4) robustness comparison with the existing blind watermarking scheme. In all the experimental sets, a 3 level 9/7 wavelet decomposition is performed and the low frequency subband is selected to embed a binary logo based watermark. The initial quantization value λ is set to 32, resulting in a tree depth of d = 6. In generating the code-stream, atoms are defined by grouping every 16 consecutive binary trees. The code-stream is generated by organizing hierarchically nested atoms, generated in 2 individual passes.

8.3.1.1 Proof of the concept

Once the code-stream is generated, a set of watermarked images is produced by truncating the code-stream at different embedding-distortion rate points Φ (refer to Eq. (8.3)). The results for the four test images are shown in Figure 8.8, Figure 8.9, Figure 8.10 and Figure 8.11 for the Boat, Barbara, Blackboard and Light House images, respectively. As the embedding process creates a hierarchical code-stream, the watermark strength varies with Φ, i.e., a higher Φ corresponds to a higher watermarking strength for a given data capacity. As a result, with an increased value of Φ a higher embedding distortion is introduced in the watermarked images and hence the visual image quality degrades, as shown in the above mentioned figures. However, with higher watermarking strength the robustness performance improves hierarchically. The overall embedding distortion performance, measured by PSNR, and the robustness performance, measured by Hamming distance, at various Φ are shown in Figure 8.12 and Figure 8.13 for the four test images. The x-axis of the graphs represents the embedding-distortion rate (Φ). The y-axis shows the related PSNR in Row 1 and the Hamming distance in Row 2.

It is evident from the results that an increasing embedding-distortion rate, i.e., more watermarking strength, results in a poorer PSNR but offers better robustness. However, a trade-off can be made based on the application scenario by selecting an optimum embedding-distortion rate to balance imperceptibility and robustness.
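The two measures used in these plots can be sketched as follows. This is an illustrative sketch only: it assumes the usual definition of PSNR with a peak value of 255 (8-bit images) and a Hamming distance normalised by the watermark length. The function names are hypothetical and not taken from the thesis.

```python
import math


def psnr(original, watermarked, peak=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def hamming_distance(embedded_bits, extracted_bits):
    """Fraction of watermark bits that differ (0 means perfect extraction)."""
    diff = sum(a != b for a, b in zip(embedded_bits, extracted_bits))
    return diff / len(embedded_bits)


# One of four bits is extracted incorrectly.
print(hamming_distance([0, 1, 1, 0], [0, 1, 0, 0]))
```

With these definitions a stronger embedding lowers the PSNR, while a more robust watermark gives a smaller Hamming distance after an attack.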

Figure 8.8: Visual representation of watermarked images at various rate points for the Boat image. Panels: Boat (original), Φ = 124, Φ = 867, Φ = 1362.

8.3.1.2 Verification of the scheme against bit plane discarding

The proposed watermarking scheme incorporates the bit plane discarding model, and the experimental verifications of this are shown in Figure 8.14. The y-axis shows the robustness in terms of Hamming distance against the number of bit planes discarded (p) on the x-axis. Here different depth (d) values with minimum and maximum embedding-distortion rate Φ are chosen to verify our arguments in Eq. (8.6) and Eq. (8.12). At the minimum embedding rate, the condition for correct watermark extraction is given in Eq. (8.6), and this is evident from the results shown in Figure 8.14. At maximum Φ, all coefficients are associated with EZ or EO and the necessary condition to extract the watermark is given in Eq. (8.12), which is also supported by the simulation results in Figure 8.14. For example, at d = 6, correct watermark extraction is possible up to p = 3 for Φmin and up to p = 5 for Φmax.
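The two bounds can be written as a small helper, shown here as a sketch with a hypothetical function name. It simply solves Eq. (8.6) and Eq. (8.12) for the largest admissible number of discarded bit planes.

```python
def max_discardable_planes(d, all_embedded):
    """Largest N with d >= N + 3 (Eq. 8.6) or, if every coefficient is EZ/EO, d >= N + 1 (Eq. 8.12)."""
    margin = 1 if all_embedded else 3
    return max(d - margin, 0)


for d in (5, 6, 7):
    # Columns: depth, bound at minimum Φ, bound at maximum Φ.
    print(d, max_discardable_planes(d, False), max_discardable_planes(d, True))
```

For d = 6 this gives 3 and 5 bit planes, matching the values of p quoted above.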

Figure 8.9: Visual representation of watermarked images at various rate points for the Barbara image. Panels: Barbara (original), Φ = 120, Φ = 846, Φ = 1330.

8.3.1.3 Robustness performance against JPEG 2000

Figure 8.15 and Figure 8.16 show the robustness performance of the proposed watermarking scheme against JPEG 2000 scalable compression. Similar to the previous section, we first verified the robustness of the proposed scheme against JPEG 2000 compression using different depth parameters d at minimum and maximum Φ (Figure 8.15), whereas the watermark scalability at a given depth is shown in Figure 8.16. The latter results compare the robustness for various Φ at a given d. In all the figures the x-axis represents the JPEG 2000 quality compression ratio while the y-axis shows the corresponding Hamming distances.

It is evident from the plots that a higher depth, and a higher Φ at a given depth, offer higher robustness to such scalable content adaptation attacks. The watermark scalability is achieved by truncating the distortion-constrained code stream at various rate points (Φ). With increased Φ more coefficients are associated with EZ/EO, which improves the robustness by successfully retaining the watermark information at higher compression rates. The results show that more than 35% improvement in robustness is achieved between two consecutive values of the depth parameter d, whereas more than 60% improvement is reported between minimum Φ and maximum Φ at a given depth.

Figure 8.10: Visual representation of watermarked images at various rate points for the Blackboard image. Panels: Blackboard (original), Φ = 113, Φ = 816, Φ = 1285.

8.3.1.4 Robustness performance comparison with existing method

Now we compare our proposed method with the existing blind watermarking method

used in this thesis. For a fair comparison, we first calculated Φ for the existing water-

marking algorithm and then set the same Φ for the proposed method. The embedding

performance is reported in Table 8.2 and the robustness against JPEG 2000 compres-

sion is shown in Figure 8.17.

Figure 8.11: Visual representation of watermarked images at various rate points for the Light House image. Panels: Light House (original), Φ = 121, Φ = 852, Φ = 1340.

In the embedding distortion performance comparison, at a similar embedding-distortion rate Φ, the existing method shows a better overall embedding performance in terms of PSNR. However, the data capacity of the proposed algorithm is 3 times higher than that of the existing one. Therefore, using the new embedding-distortion metric Φ, which combines embedding distortion and data capacity into a single metric, we can fairly compare the robustness performance of these two schemes. The results show that, despite 3 times more data capacity, the proposed algorithm outperforms the existing blind algorithm by an average of 20% improvement in robustness at higher compression ratios. The results also confirm that the new algorithm, based on the bit plane discarding model, offers improvements in robustness against scalable compression over the existing algorithm, which does not use the model.

Figure 8.12: PSNR and robustness vs. Φ graphs for the Boat and Barbara images. Row 1: embedding distortion (PSNR) vs. Φ; Row 2: robustness (Hamming distance) vs. Φ.

Figure 8.13: PSNR and robustness vs. Φ graphs for the Blackboard and Light House images. Row 1: embedding distortion (PSNR) vs. Φ; Row 2: robustness (Hamming distance) vs. Φ.

Table 8.2: Embedding distortion performance comparison between the existing and the proposed watermarking methods.

              Existing algorithm               Proposed method
Image         Φ       PSNR    Data capacity    Φ       PSNR    Data capacity
Boat          86.40   53.74   2112             84.13   47.43   6336
Barbara       80.64   55.12   2112             81.71   49.13   6336
Blackboard    69.12   56.45   2112             69.12   50.51   6336
Light House   84.48   55.36   2048             82.43   48.78   6144

Figure 8.14: Robustness against discarding of p bit planes for various d at minimum and maximum Φ. Panels: Boat, Barbara, Blackboard and Light House; x-axis: p; y-axis: Hamming distance. Legend (minimum, maximum Φ):
Boat: d = 5 (30, 327), d = 6 (119, 1323), d = 7 (497, 5313)
Barbara: d = 5 (29, 326), d = 6 (119, 1348), d = 7 (506, 5393)
Blackboard: d = 5 (31, 329), d = 6 (112, 1278), d = 7 (466, 4980)
Light House: d = 5 (31, 335), d = 6 (125, 1358), d = 7 (482, 5129)

8.3.1.5 Application scenario of scalable watermarking

From the various experimental results we can conclude that the proposed watermarking method is highly robust to scalable image compression attacks and outperforms existing methods in terms of robustness performance. At the same time it adds a new avenue to watermarking strategies by offering a flexible scalable watermarking approach. For example, to achieve higher robustness at a high compression ratio (CR), one can choose a higher Φ, and the effect on embedding distortion is neutralized by compression quantization. An example is shown in Figure 8.18 for the Barbara image, where we compare the embedding distortion of the watermarked image after compression. The PSNR values of the watermarked and the un-watermarked images are similar at various compression points, while the watermarked image offers authenticity of the image with the desired robustness, i.e., Hamming distance (HD).

Figure 8.15: Robustness against JPEG 2000 compression for various d at minimum and maximum Φ. Panels: Boat, Barbara, Blackboard and Light House; x-axis: JPEG 2000 compression ratio; y-axis: Hamming distance. Legend (minimum, maximum Φ):
Boat: d = 5 (30, 327), d = 6 (119, 1323), d = 7 (497, 5313)
Barbara: d = 5 (29, 326), d = 6 (119, 1348), d = 7 (506, 5393)
Blackboard: d = 5 (31, 329), d = 6 (112, 1278), d = 7 (466, 4980)
Light House: d = 5 (31, 335), d = 6 (125, 1358), d = 7 (482, 5129)

8.3.2 Scalable watermarking for video

In this section we have used the proposed scalable watermarking scheme for video

watermarking. Here the watermarking code-stream is generated using the 2D+t+2D

decomposed host video, as described in Chapter 7. In this case the binary tree is formed

using the motion compensated filtered coefficients. Similar to the image watermarking

of the proposed algorithm, the watermarked video is generated at a given embedding

distortion rate (Φ) either at individual frame level or in every GOP. For the experimen-

tal set here we have calculated Φ for every GOP with a size of 8 frames per GOP. The

watermark extraction procedure is similar to the image section. First the test video is decomposed using the 2D+t+2D framework with a blind motion estimation, without any reference to the original video or motion vectors, and then the binary tree is formed for the selected coefficients. The watermark extraction decision is made using the association rules described in Table 8.1.

Figure 8.16: Robustness against JPEG 2000 compression for various Φ at d = 6. Four panels; y-axis: Hamming distance. Legend Φ values per panel:
Panel 1: 125, 372, 620, 867, 1115, 1362
Panel 2: 120, 362, 604, 846, 1088, 1330
Panel 3: 113, 348, 582, 816, 1051, 1285
Panel 4: 121, 364, 608, 852, 1096, 1340

The experimental simulations here are performed using 230 spatio-temporal subband decomposition, where a 2 level 9/7 spatial decomposition is performed, followed by a 3 level MMCTF based temporal decomposition. For subband selection we used the LLs spatial subband and considered two different scenarios for temporal selection, namely LLL and LLH. In all cases normalization is used during the spatio-temporal decomposition. In the embedding procedure, the depth parameter d is set to 6 with a data capacity of 6336. The performance of the algorithm is evaluated for various Φ by comparing the embedding distortion and the robustness against scalable compressions.

The embedding distortion is measured using MSE and the results are shown in Figure 8.19 and Figure 8.20 for the LLL and LLH temporal subbands, respectively, for the test sequences Crew, Foreman and News. The x-axis represents the frame number while the y-axis shows the corresponding MSE. The robustness performance is evaluated by comparing the Hamming distance against scalable compressions, i.e., Motion JPEG 2000, MC-EZBC and H.264/SVC. The results are shown in Figure 8.21, Figure 8.22 and Figure 8.23 for the Crew, Foreman and News sequences, respectively. The left hand column shows the performance for the LLL subband while the right hand column shows the robustness for LLH. Row 1, 2 and 3 represent the robustness performance against Motion JPEG 2000, MC-EZBC and H.264/SVC, respectively. In all cases the x-axis shows the compression ratio or bit rate and the corresponding Hamming distances are shown on the y-axis. The Hamming distances are calculated by averaging the individual frame level Hamming distances of each test sequence.

Figure 8.17: Robustness performance comparison between the existing and the proposed method against JPEG 2000 compression with the same Φ. Four panels; y-axis: Hamming distance; curves: proposed algorithm and existing algorithm.
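The sequence-level measure described above can be sketched as follows. This is an illustration only: it assumes that the same watermark bits are expected in every frame and that the frame level Hamming distance is normalised by the watermark length. The function name is hypothetical.

```python
def sequence_hamming_distance(embedded_bits, extracted_per_frame):
    """Average of the normalised frame level Hamming distances over a sequence."""
    per_frame = [
        sum(a != b for a, b in zip(embedded_bits, frame)) / len(embedded_bits)
        for frame in extracted_per_frame
    ]
    return sum(per_frame) / len(per_frame)


# Two frames: the first is extracted perfectly, the second has one wrong bit out of four.
print(sequence_hamming_distance([1, 0, 1, 1], [[1, 0, 1, 1], [0, 0, 1, 1]]))
```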

From the results it is evident that the concept of scalable watermarking is successfully realized within the video watermarking framework. With the increase in embedding-distortion rate Φ, the robustness performance improved by 30% to 70% between low and high Φ, while the embedding distortion also increased with increasing Φ. Conceptually, as described before, based on the end user's need, a high Φ can be chosen where high compression is expected and a low Φ can be chosen for high resolution video distribution. Therefore a combined scalable watermarking and video encoding scheme can ensure secure multimedia distribution within a scalable content adaptation scenario.

Figure 8.18: Application example of using different Φ for various JPEG 2000 compression ratios to maintain embedding distortion and robustness. Panels: un-watermarked, CR = 2, PSNR = 41.96; watermarked, CR = 2, PSNR = 39.93, Φ = 120, HD = 0.08; un-watermarked, CR = 50, PSNR = 21.61; watermarked, CR = 50, PSNR = 21.47, Φ = 1330, HD = 0.08.

To conclude the discussion, we would like to note a limitation of this scheme. The proposed scheme does not perform well against H.264/SVC, mainly for the following reasons: 1) the proposed scheme does not follow the same filtering and decomposition steps as the H.264/SVC coder and 2) the proposed scheme is developed on the basis of the bit plane discarding model, which is not followed in H.264/SVC. However, for completeness of the results we have compared the robustness performance against H.264/SVC.

Figure 8.19: Embedding distortion performance for the proposed watermarking on LLL temporal subbands for various Φ (d = 6). Row 1), 2) and 3) represent the embedding performance for the Crew, Foreman and News sequences, respectively. x-axis: frame number; y-axis: mean square error. Legend Φ values: Crew 43, 90, 126, 236; Foreman 53, 113, 149, 288; News 101, 193, 281, 575.

8.4 Conclusions

In this chapter, we proposed a novel concept of scalable watermarking. Firstly a distortion-constrained code-stream is generated by concatenating hierarchically nested joint distortion-robustness coding atoms. The code-stream is then truncated at various embedding-distortion rate points to create watermarked images, based on the distortion-robustness requirements. The extraction and authentication are done using a blind universal extractor. The algorithm is developed based on the bit plane discarding model and outperformed the existing blind watermarking method. The concept is experimentally verified for images and the robustness against JPEG 2000 quality scalability is tested. Finally this scheme is extended to an MCTF based video watermarking scheme and the robustness is evaluated against Motion JPEG 2000, MC-EZBC and H.264/SVC. Such a scheme adds a new direction in watermarking research and has many potential watermarking applications, particularly in security enabled scalable content coding.

Figure 8.20: Embedding distortion performance for the proposed watermarking on LLH temporal subbands for various Φ (d = 6). Row 1), 2) and 3) represent the embedding performance for the Crew, Foreman and News sequences, respectively. x-axis: frame number; y-axis: mean square error. Legend Φ values: Crew 135, 269, 392, 700; Foreman 168, 318, 456, 740; News 311, 523, 727, 1242.

Figure 8.21: Robustness performance of the proposed watermarking scheme at different Φ (d = 6) for the Crew sequence. Row 1) and 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) and 2) represent the embedding on temporal subbands LLL and LLH, respectively. x-axis: compression ratio (Motion JPEG 2000) or bit rate in kbps (MC-EZBC); y-axis: Hamming distance. Legend Φ values: LLL 43, 90, 126, 236; LLH 135, 269, 392, 700.

Figure 8.22: Robustness performance of the proposed watermarking scheme at different Φ (d = 6) for the Foreman sequence. Row 1) and 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) and 2) represent the embedding on temporal subbands LLL and LLH, respectively. x-axis: compression ratio (Motion JPEG 2000) or bit rate in kbps (MC-EZBC); y-axis: Hamming distance. Legend Φ values: LLL 53, 113, 149, 288; LLH 168, 318, 456, 740.

Figure 8.23: Robustness performance of the proposed watermarking scheme at different Φ (d = 6) for the News sequence. Row 1) and 2) show robustness against Motion JPEG 2000 and MC-EZBC, respectively. Column 1) and 2) represent the embedding on temporal subbands LLL and LLH, respectively. x-axis: compression ratio (Motion JPEG 2000) or bit rate in kbps (MC-EZBC); y-axis: Hamming distance. Legend Φ values: LLL 101, 193, 281, 575; LLH 311, 523, 727, 1242.

Chapter 9

Conclusions and future work

The aim of this thesis was to present robust watermarking techniques for scalable coded

image and video. In Section 9.1, we conclude our contributions and in Section 9.2, we

suggest future research directions in this domain.

9.1 Conclusions

In order to achieve the final goal first we have generalized the image watermarking

schemes related to scalable coding, i.e., wavelet based algorithms. The scalable-coding

based content adaptations were considered as a potential watermark attack and a con-

tent adaptation test bed framework, for evaluating the robustness of wavelet based

watermarking, was presented in Chapter 4. The modular framework, Watermark Eval-

uation Bench for Content Adaption Modes (WEBCAM), consists of a repository of tools

for emulating MPEG-21 DIA content adaptation attacks, wavelet-based watermarking,

extraction and authentication. In this framework we used the parametric dissections

of the wavelet based watermarking algorithms to implement the tools repository and

its modular and reconfigurable wavelet-based watermarking implementation within the

framework. WEBCAM provides a formal evaluation platform to compare the perfor-

mances of different schemes under a controlled experimental environment for various

combinations of choices for those functional submodules. It also facilitates the develop-

ment of new algorithms and can also be used as an educational tool for wavelet-based

watermarking algorithm design. The content adaptation tools repository provides a new set of attacks that are emerging in modern multimedia usage within heterogeneous networks. With the use of this proposed framework, a comprehensive study was carried out based on various parametric inputs such as wavelet kernel selection, subband selection, embedding methods, coefficient selection etc. The robustness against scalable content adaptation was evaluated in order to identify and understand the effect of the responsible parameters.

The imperceptibility and the robustness performance are two main properties of any

watermarking scheme and are complementary to each other. While focusing on ro-

bustness issues in this thesis, firstly we characterized the embedding distortion perfor-

mances and categorized the responsible input parameters. In Chapter 5, a universal

embedding distortion performance model was presented for wavelet based watermark-

ing schemes. Models were proposed for orthonormal wavelet bases, which is extended to

non-orthonormal wavelet kernels such as biorthogonal and non-linear wavelets. These

models suggested that the MSE of the watermarked image is directly proportional to

the weighted sum of energy of the modification values of the selected wavelet coeffi-

cients and this proposition is valid for orthonormal as well as non-orthonormal wavelet

kernels. In the case of the non-orthonormal wavelet bases a weighting parameter is

introduced and it is computed experimentally for different non-orthonormal wavelet

bases whereas in the case of orthonormal wavelets, these weighting parameters are set

to unity. The claims of the models were verified by extensive experimental simulations

for non-blind and blind type of watermarking schemes for a wide range of wavelet

kernels.
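For the orthonormal case, where the weighting parameters are unity, the proportionality follows from energy preservation of the transform. The following minimal sketch illustrates this with a one-level, one-dimensional orthonormal Haar transform. The choice of transform, the test signal and the function names are assumptions made for illustration and are not the experimental setup of Chapter 5.

```python
import math


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length list."""
    s = 1 / math.sqrt(2)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx + detail


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    half = len(coeffs) // 2
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        out += [(a + d) * s, (a - d) * s]
    return out


def embedding_mse(signal, deltas):
    """MSE in the sample domain after adding deltas to the Haar coefficients."""
    coeffs = haar_forward(signal)
    marked = haar_inverse([c + d for c, d in zip(coeffs, deltas)])
    return sum((m - x) ** 2 for m, x in zip(marked, signal)) / len(signal)


signal = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
deltas = [0.0, 2.0, 0.0, -1.5, 0.0, 0.0, 3.0, 0.0]    # hypothetical coefficient modifications
predicted = sum(d ** 2 for d in deltas) / len(signal)  # energy of the modifications / signal length
print(embedding_mse(signal, deltas), predicted)
```

The measured and predicted values agree up to floating point error. For non-orthonormal kernels the equality no longer holds exactly, which is where the experimentally computed weighting parameters come in.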

In order to propose robust watermarking techniques, in Chapter 6 we have investigated

the compression process of scalable coding schemes. However within the scope of this

thesis we have focused only on the quality scalability attacks. The quality scalable

image coding (i.e., JPEG 2000) is modeled using wavelet domain bit plane discarding

to identify the effect of the quantization and de-quantization on wavelet coefficients

and the data embedded within such coefficients. The relationship is then established

between the watermark extraction rule, using the reconstructed coefficients, and the

embedding rule, using the original coefficients, to rank the wavelet coefficients and other

parameters according to their ability to retain the watermark data intact under quality

scalable coding-based content adaptation. Using such relationships we have presented

models for enhancing the robustness of non-blind and blind watermarking algorithms

against quality scalability-based content adaptation. The proposed model for non-

blind watermarking specifies the range of coefficient magnitudes that are capable of

correctly extracting the embedded watermark bit under compression by considering

wavelet domain bit plane discarding and ranks the coefficients accordingly. Similarly for

blind algorithms, the proposed model specifies the range of magnitudes for the modified

coefficient in order to extract the watermark data under compression. The simulations

showed that the proposed models outperform the robustness performance of the existing

watermarking methods, where the model was not used. The high robustness of the

models was experimentally verified for the JPEG 2000 quality scalability.

In the next phase of the thesis, research on video watermarking techniques was carried out. In Chapter 7, we have investigated various video watermarking schemes and proposed a novel MCTF based video decomposition architecture suitable for video watermarking techniques. The proposed scheme overcame the weaknesses (motion related

flickers) of frame-by-frame video watermarking and offers improved spatio-temporal

decomposition considering object motion into it. Depending on motion and texture

characteristics of the video and the choice of spatial-temporal sub band for watermark

embedding, MCTF has to be performed either on the spatial domain (t+2D) or in the

wavelet domain (2D+t). In this work we proposed an improved video watermarking scheme by offering a generalized motion compensated 2D+t+2D framework for watermark embedding. An improved MCTF is used by modifying the MCTF update step

to follow the motion trajectory in hierarchical temporal decomposition by using direct

motion vector fields in the update step and implied motion vectors in the prediction

step. The embedding distortion performance evaluated using both MSE and flicker dif-

ference metric showed superior performance for the MMCTF driven 2D+t+2D subband

domain watermarking as opposed to frame-by-frame 2D wavelet domain watermark-

ing which does not take motion into account. The proposed subband decomposition

also provides low complexity as MCTF is performed only on subbands where the wa-

termark is embedded. In terms of watermarking methods, we have comprehensively

evaluated the performances of both non-blind and blind watermarking methods. The

robustness performance against scalable coding based compressions attacks, including

Motion JPEG 2000, MC-EZBC and H.264-SVC (scalable extension) were evaluated.

In conclusion within the proposed 2D+t+2D filtering framework, 2D+t based video

watermarking scheme outperformed conventional t+2D based watermarking schemes

in a non-blind as well as a blind watermarking scenario. To offer further improvements,

we have extended our robustness models for image watermarking, into the proposed

video watermarking scheme, resulting in better robustness performance against various

scalable compressions.

Finally, we proposed a novel concept of scalable blind image watermarking in Chapter 8.

We first established the concept for image watermarking and then extended it

to video watermarking. The proposed scheme generates a distortion-constrained,

robustness scalable watermarked media (i.e., image or video) code stream which consists

of hierarchically nested joint distortion-robustness coding atoms. The code stream is

generated using a new wavelet domain binary tree guided rules-based blind watermark-

ing algorithm. The code stream is then truncated at any distortion-robustness atom

level to generate the watermarked image / video with the desired distortion-robustness

requirements. A universal blind extractor enables the extraction of watermark data

from the watermarked media created using any truncated code stream. The algorithm

was developed based on the bit plane discarding model and outperformed the existing

blind watermarking method. The concept was experimentally verified for images, and

the robustness against JPEG 2000 quality scalability was tested. The scheme was further

extended to an MCTF based video watermarking scheme, and its robustness was evaluated

against Motion JPEG 2000, MC-EZBC and H.264/SVC. Such a scheme allows

watermarking to be incorporated within scalable content coding and adds a new direction to

watermarking research, with many potential applications, particularly

in security enabled scalable media production and distribution.
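The bit plane discarding model referred to above can be sketched as follows. The sketch assumes integer wavelet coefficients in sign-magnitude form and models quality scaling as zeroing a given number of least significant bit planes. The function name and the sample coefficients are hypothetical, and the sketch does not reproduce the binary tree guided embedding rules of the proposed algorithm.

```python
def discard_bit_planes(coefficients, planes):
    """Zero the `planes` least significant bit planes of each coefficient.

    Assumption: coefficients are integers and the sign is kept separately
    from the magnitude (sign-magnitude representation).
    """
    if planes < 0:
        raise ValueError("planes must be non-negative")
    result = []
    for c in coefficients:
        magnitude = (abs(c) >> planes) << planes  # drop the low-order bits
        result.append(-magnitude if c < 0 else magnitude)
    return result


coefficients = [37, -22, 5, -3, 128]
for planes in range(4):
    print(planes, discard_bit_planes(coefficients, planes))
```

Each additional discarded plane coarsens the coefficients further, so a watermark intended to survive the discarding of p planes has to be carried by bits above plane p.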

9.2 Future work

The research discussed in this thesis indicates many directions for further research

in this domain. Some of them are summarized as follows:

– Modeling transmission channel related errors and their effect on watermarking robustness

for scalable coded media. Combining such research with the research outcomes of this

thesis can provide a complete solution to digital rights management in live streaming

or multimedia content sharing.

– Further improvement of the WEBCAM framework to propose an optimized parameter set

and embedding algorithm, based on the input image or application. This can be

done by comparing the parameter sets for the given input image in order to offer the best

embedding performance or the highest robustness.

– Mathematical modeling, similar to that of Chapter 5, of the relationship between various embedding

performance metrics, such as JND, SSIM or wPSNR, and the watermarking input parameters, in order to

obtain the parameter set that offers improved visual quality and better robustness.

– Robust watermarking techniques for Region Of Interest (ROI) based image and video

coding can provide the right balance between imperceptibility and robustness. A visual

attention model based watermarking technique is one possible way to achieve this.

– Developing watermarking based authentication applications in JPEG 2000 streaming,

e.g., controlled distribution of copyrighted images to mobile and portable devices, computers, etc.

– Research on compression domain watermarking techniques for H.264/SVC. Using an

approach similar to the one presented in this thesis, a robustness model can be proposed in order

to enhance the robustness against H.264/SVC based content adaptation.

– Developing real time watermarking based authentication schemes using bit stream domain

watermarking for H.264/SVC etc. Such schemes are useful in multimedia content

distribution, including user authentication for pay-TV.

– Developing joint compression domain scalable watermarking based image and video

coding schemes that offer scalability in media distribution while resolving digital rights

management (DRM) issues. Such application development is possible using the

scalable watermarking scheme suggested in this thesis.

Chapter 10

Appendix A

Preliminary robustness results of MCTF based video watermarking schemes against H.264-SVC scalable compression

[Plots: Hamming distance versus bit rate (kbps), 200 to 2000 kbps; legend entries 032, 131 and 230. Panel 1: Robustness (non-blind) against H.264/SVC (LLL): Crew. Panel 2: Robustness (non-blind) against H.264/SVC (LLH): Crew.]

Figure 10.1: Robustness performance of the non-blind watermarking scheme against H.264-SVC for the Crew sequence. Columns 1) and 2) represent embedding on the temporal subbands LLL and LLH, respectively.

[Plots: Hamming distance versus bit rate (kbps), 200 to 2000 kbps; legend entries 032, 131 and 230. Panel 1: Robustness (non-blind) against H.264/SVC (LLL): Foreman. Panel 2: Robustness (non-blind) against H.264/SVC (LLH): Foreman.]

Figure 10.2: Robustness performance of the non-blind watermarking scheme against H.264-SVC for the Foreman sequence. Columns 1) and 2) represent embedding on the temporal subbands LLL and LLH, respectively.

[Plots: Hamming distance versus bit rate (kbps), 200 to 2000 kbps; legend entries 032, 131 and 230. Panel 1: Robustness (non-blind) against H.264/SVC (LLL): News. Panel 2: Robustness (non-blind) against H.264/SVC (LLH): News.]

Figure 10.3: Robustness performance of the non-blind watermarking scheme against H.264-SVC for the News sequence. Columns 1) and 2) represent embedding on the temporal subbands LLL and LLH, respectively.

[Plots: Hamming distance versus bit rate (kbps), 200 to 2000 kbps; legend entries 032, 131 and 230. Panel 1: Robustness (blind) against H.264/SVC (LLL): Crew. Panel 2: Robustness (blind) against H.264/SVC (LLH): Crew.]

Figure 10.4: Robustness performance of the blind watermarking scheme against H.264-SVC for the Crew sequence. Columns 1) and 2) represent embedding on the temporal subbands LLL and LLH, respectively.

[Plots: Hamming distance versus bit rate (kbps), 200 to 2000 kbps; legend entries 032, 131 and 230. Panel 1: Robustness (blind) against H.264/SVC (LLL): Foreman. Panel 2: Robustness (blind) against H.264/SVC (LLH): Foreman.]

Figure 10.5: Robustness performance of the blind watermarking scheme against H.264-SVC for the Foreman sequence. Columns 1) and 2) represent embedding on the temporal subbands LLL and LLH, respectively.

[Plots: Hamming distance versus bit rate (kbps), 200 to 2000 kbps; legend entries 032, 131 and 230. Panel 1: Robustness (blind) against H.264/SVC (LLL): News. Panel 2: Robustness (blind) against H.264/SVC (LLH): News.]

Figure 10.6: Robustness performance of the blind watermarking scheme against H.264-SVC for the News sequence. Columns 1) and 2) represent embedding on the temporal subbands LLL and LLH, respectively.
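The Hamming distance plotted in the figures of this appendix can be illustrated with a short sketch. It assumes the normalized form, i.e. the fraction of watermark bits that differ between the original and the extracted sequence, which is consistent with the plotted range of 0 to about 0.5. The function name and the bit sequences are hypothetical.

```python
def normalized_hamming_distance(original_bits, extracted_bits):
    """Fraction of positions at which two equal-length bit sequences differ."""
    if len(original_bits) != len(extracted_bits):
        raise ValueError("bit sequences must have the same length")
    if not original_bits:
        raise ValueError("bit sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(original_bits, extracted_bits))
    return mismatches / len(original_bits)


original = [1, 0, 1, 1, 0, 0, 1, 0]
extracted = [1, 0, 0, 1, 0, 1, 1, 0]  # two bits flipped by a hypothetical attack
distance = normalized_hamming_distance(original, extracted)
```

Under this definition a value of 0 means perfect extraction, while a value close to 0.5 means the extracted bits are no better than random guesses for a balanced binary watermark.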

References

[1] D. S. Taubman and M. W. Marcellin, JPEG2000 Image Compression Fundamen-

tals, Standards and Practice. USA: Springer, 2002.

[2] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding

extension of the H.264/AVC standard,” IEEE Trans. Circ. and Syst. for Video

Tech, vol. 17, no. 9, pp. 1103–1120, Sept. 2007.

[3] S. Kandadai and C. D. Creusere, “Scalable audio compression at low bitrates,”

IEEE Trans. Audio, Speech, and Language Processing, vol. 16, no. 5, pp. 969–979,

July 2008.

[4] A. Vetro, “MPEG-21 digital item adaptation: enabling universal multimedia

access,” IEEE Multimedia, vol. 11, no. 1, pp. 84–87, Jan.-March 2004.

[5] L. Xie and G. R. Arce, “Joint wavelet compression and authentication water-

marking,” in Proc. IEEE ICIP, vol. 2, 1998, pp. 427–431.

[6] F. Huo and X. Gao, “A wavelet based image watermarking scheme,” in Proc.

IEEE ICIP, 2006, pp. 2573–2576.

[7] D. Kundur and D. Hatzinakos, “Digital watermarking using multiresolution

wavelet decomposition,” in Proc. IEEE ICASSP, vol. 5, 1998, pp. 2969–2972.

[8] C. Jin and J. Peng, “A robust wavelet-based blind digital watermarking algo-

rithm,” Information Technology Journal, vol. 5, no. 2, pp. 358–363, 2006.

[9] P. Campisi, “Video watermarking in the 3D-DWT domain using quantization-

based methods,” in Proc. IEEE MMSP, 2005, pp. 1–4.

[10] H. Tao, J. Liu, and J. Tian, “Digital watermarking technique based on integer

Harr transforms and visual properties,” in Proc. SPIE Image Compression and

Encryption Tech., vol. 4551, no. 1, 2001, pp. 239–244.

[11] X. Xia, C. G. Boncelet, and G. R. Arce, “Wavelet transform based watermark

for digital images,” Optics Express, vol. 3, no. 12, pp. 497–511, Dec. 1998.

[12] X. C. Feng and Y. Yang, “A new watermarking method based on DWT,” in Proc.

Int’l Conf. on Computational Intelligence and Security, Lect. Notes in Comp. Sci.

(LNCS), vol. 3802, 2005, pp. 1122–1126.

[13] Q. Gong and H. Shen, “Toward blind logo watermarking in JPEG-compressed

images,” in Proc. Int’l Conf. on Parallel and Distributed Comp., Appl. and Tech.,

(PDCAT), 2005, pp. 1058–1062.

[14] M. Barni, F. Bartolini, and A. Piva, “Improved wavelet-based watermarking

through pixel-wise masking,” IEEE Trans. Image Processing, vol. 10, no. 5, pp.

783–791, May 2001.

[15] D. Kundur and D. Hatzinakos, “Toward robust logo watermarking using mul-

tiresolution image fusion principles,” IEEE Trans. Multimedia, vol. 6, no. 1, pp.

185–198, Feb. 2004.

[16] Z. Zhang and Y. L. Mo, “Embedding strategy of image watermarking in wavelet

transform domain,” in Proc. SPIE Image Compression and Encryption Tech.,

vol. 4551, no. 1, 2001, pp. 127–131.

[17] J. R. Kim and Y. S. Moon, “A robust wavelet-based digital watermarking using

level-adaptive thresholding,” in Proc. IEEE ICIP, vol. 2, 1999, pp. 226–230.

[18] S. Marusic, D. B. H. Tay, G. Deng, and P. Marimuthu, “A study of biorthogonal

wavelets in digital watermarking,” in Proc. IEEE ICIP, vol. 3, Sept. 2003, pp.

II–463–6.

[19] T.-S. Chen, J. Chen, and J.-G. Chen, “A simple and efficient watermarking tech-

nique based on JPEG2000 codec,” in Proc. Int’l Symp. on Multimedia Software

Eng., 2003, pp. 80–87.

[20] F. Dufaux, S. J. Wee, J. G. Apostolopoulos, and T. Ebrahimi, “JPSEC for secure

imaging in JPEG2000,” in Proc. SPIE Appl. of Digital Image Processing XXVII,

vol. 5558, no. 1, 2004, pp. 319–330.

[21] Y.-S. Seo, M.-S. Kim, H.-J. Park, H.-Y. Jung, H.-Y. Chung, Y. Huh, and J.-D.

Lee, “A secure watermarking for JPEG2000,” in Proc. IEEE ICIP, vol. 2, 2001,

pp. 530–533.

[22] P. Meerwald, “Quantization watermarking in the JPEG2000 coding pipeline,” in

Proc. Int’l Working Conf. on Comms. and Multimedia Security, 2001, pp. 69–79.

[23] Q. Sun and S. Chang, “A secure and robust digital signature scheme for

JPEG2000 image authentication,” IEEE Trans. Multimedia, vol. 7, no. 3, pp.

480–494, June 2005.

[24] R. Grosbois, P. Gerbelot, and T. Ebrahimi, “Authentication and access control

in the JPEG2000 compressed domain,” in Proc. SPIE Appl. of Digital Image

Processing XXIV, vol. 4472, no. 1, 2001, pp. 95–104.

[25] M. A. Suhail, M. S. Obaidat, S. S. Ipson, and B. Sadoun, “A comparative study

of digital watermarking in JPEG and JPEG2000 environments,” Information

Sciences, vol. 151, pp. 93–105, 2003.

[26] R. Grosbois and T. Ebrahimi, “Watermarking in the JPEG 2000 domain,” in

Proc. IEEE MMSP, 2001, pp. 339 –344.

[27] F. Hartung and B. Girod, “Watermarking of uncompressed and compressed

video,” Signal Processing, vol. 66, no. 3, pp. 283–301, 1998.

[28] G. Doërr and J.-L. Dugelay, “A guide tour of video watermarking,” Signal Processing:

Image Communication, vol. 18, no. 4, pp. 263–282, 2003.

[29] W. Zhu, Z. Xiong, and Y.-Q. Zhang, “Multiresolution watermarking for images

and video,” IEEE Trans. Circ. and Syst. for Video Tech, vol. 9, no. 4, pp. 545–

550, Jun 1999.

[30] I. J. Cox, J. Kilian, F. T. Leighton, and T. Shamoon, “Secure spread spectrum

watermarking for multimedia,” IEEE Trans. Image Processing, vol. 6, no. 12, pp.

1673–1687, Dec. 1997.

[31] Y. Li, X. Gao, and J. Hongbing, “A 3D wavelet based spatial-temporal approach

for video watermarking,” in Proc. IEEE Int’l Conf. on Comput. Intelligence and

Multimedia App. (ICCIMA), 2003, pp. 260–265.

[32] S.-J. Kim, S.-H. Lee, K.-S. Moon, W.-H. Cho, I.-T. Lim, K.-R. Kwon, and K.-I.

Lee, “A new digital video watermarking using the dual watermark images and

3D DWT,” in Proc. IEEE Region 10 TENCON, vol. 1, 2004, pp. 291–294.

[33] S. Choi and J. W. Woods, “Motion-compensated 3-D subband coding of video,”

IEEE Trans. Image Processing, vol. 8, no. 2, pp. 155–167, Feb. 1999.

[34] T. P.-C. Chen and T. Chen, “Progressive image watermarking,” in Proc. IEEE

ICME, vol. 2, 2000, pp. 1025 –1028.

[35] P.-C. Su, H.-J. M. Wang, and C.-C. J. Kuo, “An integrated approach to im-

age watermarking and JPEG-2000 compression,” The Journal of VLSI Signal

Processing, vol. 27, no. 1, pp. 35–53, Feb. 2001.

[36] D. Bhowmik and C. Abhayaratne, “The effect of quality scalable image com-

pression on robust watermarking,” in Proc. Int’l Workshop on Digital Signal

Processing, 2009, pp. 1–8.

[37] P. Meerwald and A. Uhl, “Scalability evaluation of blind spread-spectrum image

watermarking,” in Proc. Int’l Workshop on Digital Watermarking (IWDW ’08),

Lect. Notes in Comp. Sci. (LNCS), vol. 5450, 2008, pp. 61–75.

[38] A. Piper, R. Safavi-Naini, and A. Mertins, “Resolution and quality scalable

spread spectrum image watermarking,” in Proc. 7th workshop on Multimedia

and Security: MM&Sec’05, 2005, pp. 79–90.

[39] N. Sprljan, M. Mrak, G. C. K. Abhayaratne, and E. Izquierdo, “A scalable cod-

ing framework for efficient video adaptation,” in Proc. Int’l Workshop on Image

Analysis for Multimedia Interactive Services (WIAMIS), 2005.

[40] I. J. Cox, M. L. Miller, and J. A. Bloom, Digital watermarking. San Francisco,

CA, USA: Morgan Kaufmann Publishers Inc., 2002.

[41] M. Barni and F. Bartolini, Watermarking Systems Engineering (Signal Processing

and Communications, 21). Boca Raton, FL, USA: CRC Press, Inc., 2004.

[42] S. P. Mohanty, “Digital watermarking: A tutorial review,” University of North Texas,

Texas, USA, Tech. Rep., 1999. Available:

http://www.cse.unt.edu/ smohanty/research/OtherPublications/MohantyWatermarkingSurvey1999.pdf

[Accessed: Apr. 2010].

[43] J. Fridrich, M. Goljan, and A. C. Baldoza, “New fragile authentication watermark

for images,” in Proc. IEEE ICIP, vol. 1, 2000, pp. 446 –449.

[44] C. Y. Lin and S. F. Chang, “Semifragile watermarking for authenticating JPEG

visual content,” in Proc. SPIE Security, Steganography, and Watermarking of

Multimedia Contents, vol. 3971, no. 1, 2000, pp. 140–151.

[45] M. Barni, F. Bartolini, and T. Furon, “A general framework for robust water-

marking security,” Signal Processing, vol. 83, pp. 2069–2084, Oct. 2003.

[46] F. Cayre, C. Fontaine, and T. Furon, “Watermarking security: theory and prac-

tice,” IEEE Trans. Signal Processing, vol. 53, no. 10, pp. 3976 – 3987, Oct. 2005.

[47] A. Adelsbach, S. Katzenbeisser, and A. R. Sadeghi, “Cryptography meets wa-

termarking: Detecting watermarks with minimal or zero knowledge disclosure,”

in Proc. European Signal Processing Conference (EUSIPCO), vol. 1, 2004, pp.

446–449.

[48] O. Kwon and C. Lee, “Objective method for assessment of video quality using

wavelets,” in Proc. IEEE Int’l Symp. on Industrial Electronics (ISIE 2001).,

vol. 1, 2001, pp. 292–295.

[49] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assess-

ment: from error visibility to structural similarity,” IEEE Trans. Image Process-

ing, vol. 13, no. 4, pp. 600–612, April 2004.

[50] A. B. Watson, “Visual optimization of DCT quantization matrices for individual

images,” in Proc. American Institute of Aeronautics and Astronautics (AIAA)

Computing in Aerospace, vol. 9, 1993, pp. 286–291.

[51] H. G. Koumaras, “Subjective video quality assessment methods for multimedia

applications,” Geneva, Switzerland, Tech. Rep. ITU-R BT.500-11, Apr. 2008.

[52] M. Ramkumar, A. N. Akansu, and A. A. Alatan, “A robust data hiding scheme

for image using DFT,” in Proc. IEEE ICIP, 1999, pp. 211–215.

[53] J. L. Dugelay and S. Roche, “Fractal transform based large digital watermark

embedding and robust full blind extraction,” in Proc. IEEE int’l conf. on Multi-

media & Computing Systems (ICMCS), vol. 2, 1999, pp. 1003–1004.

[54] A. Bors and I. Pitas, “Image watermarking using DCT domain constraints,” in

Proc. IEEE ICIP, 1996, pp. 231–234.

[55] M. A. Suhail and M. S. Obaidat, “Digital watermarking-based DCT and JPEG

model,” IEEE transactions on instrumentation and measurement, vol. 52, no. 5,

pp. 1640–1647, Oct 2003.

[56] J. R. Hernandez, M. Amado, and F. Perez-Gonzalez, “DCT-domain watermarking

techniques for still images: detector performance analysis and a new structure,”

IEEE Trans. Image Processing, vol. 9, no. 1, pp. 55–68, Jan. 2000.

[57] P. Vinod and P. K. Bora, “Motion-compensated inter-frame collusion attack on

video watermarking and a countermeasure,” IEE Proceedings on Information

Security, vol. 153, no. 2, pp. 61 – 73, June 2006.

[58] G. Strang and T. Nguyen, Wavelets and Filter Banks, 2nd ed. USA: Wellesley-Cambridge

Press, 1997.

[59] M. Vetterli and J. Kovacevic, Wavelets and subband coding. Upper Saddle River,

NJ, USA: Prentice-Hall, Inc., 1995.

[60] I. Daubechies and W. Sweldens, “Factoring wavelet transforms into lifting steps,”

Journal of Fourier Anal. Appl., vol. 4, no. 3, pp. 245–267, 1998.

[61] H. Heijmans and J. Goutsias, “Nonlinear multiresolution signal decomposition

schemes: Part II: Morphological wavelets,” IEEE Trans. Image Processing, vol. 9,

no. 11, pp. 1897–1913, Nov. 2000.

[62] F. J. Hampson and J.-C. Pesquet, “A nonlinear subband decomposition with

perfect reconstruction,” in Proc. IEEE ICASSP, vol. 3, 1996, pp. 1523–1526.

[63] G. C. K. Abhayaratne and H. Heijmans, “A novel morphological subband decom-

position scheme for 2D+t wavelet video coding,” in Proc. Int’l Symp. on Image

and Signal Processing and Analysis, vol. 1, 2003, pp. 239–244.

[64] J.-R. Ohm, “Three-dimensional subband coding with motion compensation,”

IEEE Trans. Image Processing, vol. 3, no. 5, pp. 559 –571, Sep. 1994.

[65] P. Campisi, A. Neri, and M. Visconti, “Wavelet-based method for high-frequency

subband watermark embedding,” in Proc. SPIE Multimedia Sys. and Appl. III,

vol. 4209, 2001, pp. 344–353.

[66] P. Meerwald and A. Uhl, “A survey of wavelet-domain watermarking algorithms,”

in Proc. SPIE Security and Watermarking of Multimedia Contents III, vol. 4314,

2001, pp. 505–516.

[67] D. Bhowmik and C. Abhayaratne, “Morphological wavelet domain image water-

marking,” in Proc. European Signal Processing Conference (EUSIPCO), 2007,

pp. 2539–2543.

[68] F. Hartung and M. Kutter, “Multimedia watermarking techniques,” Proceedings

of the IEEE, vol. 87, no. 7, pp. 1079 –1107, Jul 1999.

[69] T. Kalker, G. Depovere, J. Haitsma, and M. J. Maes, “Video watermarking

system for broadcast monitoring,” in SPIE Conference Series, vol. 3657, 1999,

pp. 103–112.

[70] H. Inoue, A. Miyazaki, T. Araki, and T. Katsura, “A digital watermark method

using the wavelet transform for video data,” in Proc. IEEE ISCAS, vol. 4, Jul

1999, pp. 247–250.

[71] G. Depovere, T. Kalker, J. Haitsma, M. Maes, L. de Strycker, P. Termont, J. Van-

dewege, A. Langell, C. Alm, P. Norman, G. O’Reilly, B. Howes, H. Vaanholt,

R. Hintzen, P. Donnelly, and A. Hudson, “The VIVA project: digital watermark-

ing for broadcast monitoring,” in Proc. IEEE ICIP, vol. 2, 1999, pp. 202–205.

[72] M. P. Mitrea, T. B. Zaharia, F. J. Preteux, and A. Vlad, “Video watermarking

based on spread spectrum and wavelet decomposition,” in Wavelet Applications

in Industrial Processing II, vol. 5607, no. 1. SPIE, 2004, pp. 156–164.

[73] S. N. Merchant, A. Harchandani, S. Dua, H. Donde, and I. Sunesara, “Water-

marking of video data using integer-to-integer discrete wavelet transform,” in

Proc. IEEE TENCON, vol. 3, 2003, pp. 939 – 943.

[74] F. Deguillaume, G. Csurka, J. J. O’Ruanaidh, and T. Pun, “Robust 3D DFT

video watermarking,” in Proc. Security and Watermarking of Multimedia Con-

tents, SPIE, vol. 3657, no. 1, 1999, pp. 113–124.

[75] J. H. Lim, D. J. Kim, H. T. Kim, and C. S. Won, “Digital video watermarking

using 3D-DCT and intracubic correlation,” in Proc. SPIE Security and Water-

marking of Multimedia Contents III, vol. 4314, no. 1, 2001, pp. 64–72.

[76] D.-W. Xu, “A blind video watermarking algorithm based on 3D wavelet trans-

form,” in Proc. Int’l Conf. on Computational Intelligence and Security, vol. 0,

2007, pp. 945–949.

[77] Z. Huai-yu, L. Ying, and W. Cheng-ke, “A blind spatial-temporal algorithm based

on 3D wavelet for video watermarking,” in Proc. IEEE ICME, vol. 3, 2004, pp.

1727 – 1730.

[78] P. Campisi and A. Neri, “Video watermarking in the 3D-DWT domain using

perceptual masking,” in Proc. IEEE ICIP, vol. 1, 2005, pp. 997–1000.

[79] P. Vinod, G. Doerr, and P. K. Bora, “Assessing motion-coherency in video wa-

termarking,” in Proc. ACM Multimedia and Security, 2006, pp. 114–119.

[80] P. Meerwald and A. Uhl, “Blind motion-compensated video watermarking,” in

Proc. IEEE ICME, 2008, pp. 357–360.

[81] K. Su, D. Kundur, and D. Hatzinakos, “Statistical invisibility for collusion-

resistant digital video watermarking,” IEEE Trans. Multimedia, vol. 7, no. 1,

pp. 43 – 51, Feb 2005.

[82] S. J. Weng, T. T. Lu, and P. C. Chang, “Key-based video watermarking system

on MPEG-2,” in Proc. SPIE Security and Watermarking of Multimedia Contents

V, vol. 5020, no. 1, 2003, pp. 516–525.

[83] E. Hauer and M. Steinebach, “Robust digital watermark solution for intercoded

frames of MPEG video data,” in Proc. SPIE Security, Steganography, and Wa-

termarking of Multimedia Contents VII, vol. 5681, no. 1, 2005, pp. 381–390.

[84] Y. Y. Chung and F. F. Xu, “A secure digital watermarking scheme for MPEG-2

video copyright protection,” in Proc. IEEE Int’l Conf. on Video and Signal Based

Surveillance, AVSS, 2006, pp. 84 –84.

[85] J. Zhang, A. T. S. Ho, G. Qiu, and P. Marziliano, “Robust video watermarking

of H.264/AVC,” IEEE Trans. Circuits and Systems II: Express Briefs, vol. 54,

no. 2, pp. 205–209, Feb 2007.

[86] M. Noorkami and R. M. Mersereau, “Compressed-domain video watermarking

for H.264,” in Proc. IEEE ICIP, vol. 2, 2005, pp. 890–893.

[87] G. Z. Wu, Y. J. Wang, and W. H. Hsu, “Robust watermark embedding/detection

algorithm for H.264 video,” SPIE Journal of Electronic Imaging, vol. 14, no. 1,

p. 013013, 2005.

[88] F. Hartung and B. Girod, “Digital watermarking of MPEG-2 coded video in the

bitstream domain,” in Proc. IEEE ICASSP, vol. 4, 1997, pp. 2621 –2624.

[89] H. Liu, F. Shao, and J. Huang, “A MPEG-2 video watermarking algorithm with

compensation in bit stream,” in Digital Rights Management. Technologies, Is-

sues, Challenges and Systems, ser. Lect. Notes in Comp. Sc. Springer Berlin /

Heidelberg, 2006, vol. 3919, pp. 123–134.

[90] S. Biswas, S. R. Das, and E. M. Petriu, “An adaptive compressed MPEG-2 video

watermarking scheme,” IEEE Trans. Instrumentation and Measurement, vol. 54,

no. 5, pp. 1853 – 1861, 2005.

[91] B. G. Mobasseri and M. P. Marcinak, “Watermarking of MPEG-2 video in com-

pressed domain using VLC mapping,” in Proc. 7th workshop on Multimedia and

Security: MM&Sec’05, 2005, pp. 91–94.

[92] S. Sakazawa, Y. Takishima, and Y. Nakajima, “H.264 native video watermarking

method,” in Proc. IEEE ISCAS, 2006, p. 4 pp.

[93] L. Zhang, Y. Zhu, and L. M. Po, “A novel watermarking scheme with compen-

sation in bit-stream domain for H.264/AVC,” in Proc. IEEE ICASSP, 2010, pp.

1758 –1761.

[94] J. Zhang, J. Li, and L. Zhang, “Video watermark technique in motion vector,” in

Proc. XIV Brazilian Symposium on Computer Graphics and Image Processing,

2001, pp. 179 –182.

[95] Z. Liu, H. Liang, X. Niu, and Y. Yang, “A robust video watermarking in motion

vectors,” in Proc. 7th Int’l Conf. on Signal Processing, ICSP, vol. 3, 2004, pp.

2358 – 2361.

[96] K.-W. Kang, K. S. Moon, G. S. Jung, and J. N. Kim, “An efficient video wa-

termarking scheme using adaptive threshold and minimum modification on mo-

tion vectors,” in Image Analysis and Recognition, ser. Lect. Notes in Comp. Sc.

Springer Berlin / Heidelberg, 2005, vol. 3656, pp. 294–301.

[97] N. Mohaghegh and O. Fatemi, “H.264 copyright protection with motion vector

watermarking,” in Proc. Int’l Conf. on Audio, Language and Image Processing,

ICALIP, 2008, pp. 1384 –1389.

[98] W. Pei, Z. Zhendong, and L. Li, “A video watermarking scheme based on mo-

tion vectors and mode selection,” in Proc. Int’l Conf. on Computer Science and

Software Engineering, vol. 5, 2008, pp. 233 –237.

[99] D. Bhowmik and C. Abhayaratne, “A watermark evaluation bench for content

adaptation modes,” in Proc. IET Int’l Conf. on Visual Media Production, 2007,

pp. 1–1.

[100] ——, “Evaluation of watermark robustness to JPEG2000 based content adapta-

tion attacks,” in Proc. IET Int’l Conf. on Visual Info. Eng. (VIE ’08), 2008, pp.

789–794.

[101] ——, “A framework for evaluating wavelet based watermarking for scalable coded

digital item adaptation attacks,” in Proc. SPIE Wavelet Appl. in Industrial Pro-

cessing VI, vol. 7248, no. 1, 2009, p. 72480M (10 pages).

[102] ——. Watermarking Evaluation Bench for Content Adaptation Modes (WE-

BCAM). Available: http://svc.group.shef.ac.uk/webcam.html [Accessed: Jan.

2010].

[103] F. A. Petitcolas, M. Steinebach, F. Raynal, J. Dittmann, C. Fontaine, and

N. Fates, “Public automated web-based evaluation service for watermarking

schemes: StirMark benchmark,” in Proc. IEEE ICIP, vol. 4314, 2001, pp. 575–

584.

[104] S. Pereira, S. Voloshynovskiy, M. Madueno, S. M.-Maillet, and T. Pun, “Second

generation benchmarking and application oriented evaluation,” in Proc. Int’l.

Information Hiding Workshop, Lect. Notes in Comp. Sci. (LNCS), vol. 2137,

2001, pp. 340–353.

[105] V. Solachidis, A. Tefas, N. Nikolaidis, S. Tsekeridou, A. Nikolaidis, and I. Pitas,

“A benchmarking protocol for watermarking methods,” in Proc. IEEE ICIP,

vol. 3, 2001, pp. 1023–1026.

[106] O. Guitart, H. C. Kim, and E. J. Delp-III, “Watermark evaluation testbed,”

SPIE Journal of Electronic Imaging, vol. 15, p. 041106 (13 pages), 2006.

[107] M. Ejima and A. Miyazaki, “On the evaluation of performance of digital wa-

termarking in the frequency domain,” in Proc. IEEE ICIP, vol. 2, 2001, pp.

546–549.

[108] T. Ebrahimi and R. Grosbois, “Secure JPEG 2000-JPSEC,” in Proc. IEEE

ICASSP, vol. 4, 2003, pp. 716–719.

[109] S.-T. Hsiang and J. W. Woods, “Embedded video coding using invertible mo-

tion compensated 3-D subband/wavelet filter bank,” Signal Processing: Image

Communication, vol. 16, pp. 705–724, May 2001.

[110] C. I. Podilchuk, N. S. Jayant, and N. Farvardin, “Three-dimensional subband

coding of video,” IEEE Trans. Image Processing, vol. 4, no. 2, pp. 125 –139, Feb.

1995.

[111] Y. Andreopoulos, A. Munteanu, J. Barbarien, M. van der Schaar, J. Cornelis,

and P. Schelkens, “In-band motion compensated temporal filtering,” Signal Pro-

cessing: Image Communication, vol. 19, no. 7, pp. 653–673, Aug. 2004.

[112] V. G. MSU Graphics & Media Lab. MSU quality measurement tool. Available:

http://www.compression.ru/video/ [Accessed: Jan. 15, 2010].
